Genetic differences in host infectivity affect disease spread and survival in epidemics

Anacleto, O., Cabaleiro, S., Saura, M., Villanueva, B., Houston, R.D., Woolliams, J.A., Doeschl-Wilson, A.B.


Survival during an epidemic is partly determined by host genetics. While quantitative genetic studies typically consider survival as an indicator for disease resistance, mortality rates of populations undergoing an epidemic are also affected by tolerance and infectivity (i.e. the propensity of an infected individual to transmit disease). Few studies have demonstrated genetic variation in disease tolerance, and no study has demonstrated genetic variation in host infectivity, despite strong evidence for considerable phenotypic variation in this trait. Here we propose an experimental design and statistical models for estimating genetic diversity in all three host traits. Using an infection model in fish we provide, for the first time, direct evidence for genetic variation in host infectivity, in addition to variation in resistance and tolerance. We also demonstrate how genetic differences in these three traits contribute to survival. Our results imply that animals can evolve different disease response types affecting epidemic survival rates, with important implications for understanding and controlling epidemics.

Enhancing genetic disease control by selecting for low host infectivity and susceptibility

Tsairidou, S., Anacleto, O., Woolliams, J. A., Doeschl-Wilson, A. B.


Infectious diseases have a huge impact on animal health, production and welfare, and human health. Understanding the role of host genetics in disease spread is essential for developing enhanced genetic disease control strategies that will efficiently reduce disease prevalence and risk. While some genetic selection schemes already exploit heritable variation in susceptibility, increasing evidence suggests that there is also genetic variation in host infectivity, and super-spreaders have been documented in several disease outbreaks. We used a genetic epidemiological model to investigate how disease dynamics are influenced by different levels of polygenic genetic variation in host susceptibility and infectivity. Response to genetic selection was calculated over 20 generations, exploring a variety of selection schemes differing in accuracy and intensity, to evaluate the benefit from combined selection for lower infectivity and susceptibility in reducing epidemic risk and severity. For example, assuming moderate genetic variation in both traits, 50% selection for susceptibility required 7 generations for reducing the basic reproductive number R0 from 7.64 to the critical threshold of <1, below which diseases die out. Adding infectivity in the selection goal accelerated the rate of decline towards R0<1, to 3 generations. Our results show that although genetic selection for susceptibility reduces disease risk and prevalence, combined selection for both susceptibility and infectivity can significantly accelerate this decline, requiring fewer generations for eradication, and can alleviate the delay potentially generated by unfavourable correlations. Future disease control strategies will benefit from estimating and utilising genetic effects for both infectivity and susceptibility.

Disentangling genetic variation for resistance and tolerance to scuticociliatosis in turbot using pedigree and genomic information

Saura, M., Carabano, M. J., Fernandez, A., Cabaleiro, S., Doeschl-Wilson, A. B., Anacleto, O., Maroso, F., Millan, A., Hermida, M., Fernandez, C., Martinez, P., Villanueva, B.


Impact of genetic selection for increased cattle resistance to bovine tuberculosis on the disease transmission dynamics

Raphaka, K., Sánchez-Molano, E., Tsairidou, S., Anacleto, O., Glass, E., Woolliams, J.A., Doeschl-Wilson, A. B., Banos, G.
Journal Paper Frontiers in Veterinary Science, 5:237 (2018)


Bovine tuberculosis (bTB) poses a challenge to animal health and welfare worldwide. Presence of genetic variation in host resistance to Mycobacterium bovis infection makes the trait amenable to improvement with genetic selection. Genetic evaluations for resistance to infection in dairy cattle are currently available in the United Kingdom (UK), enabling genetic selection of more resistant animals. However, the extent to which genetic selection could contribute to bTB eradication is unknown. The objective of this study was to quantify the impact of genetic selection for bTB resistance on cattle-to-cattle disease transmission dynamics and prevalence by developing a stochastic genetic epidemiological model. The model was used to implement genetic selection in a simulated cattle population. The model considered various levels of selection intensity over 20 generations assuming genetic heterogeneity in host resistance to infection. Our model attempted to represent the dairy cattle population structure and current bTB control strategies in the UK, and was informed by genetic and epidemiological parameters inferred from data collected from UK bTB infected dairy herds. The risk of a bTB breakdown was modeled as the percentage of herds where initially infected cows (index cases) generated secondary cases by infecting herd-mates. The model predicted that this risk would be reduced by half after 4, 6, 9, and 15 generations for selection intensities corresponding to genetic selection of the 10, 25, 50, and 70% most resistant sires, respectively. In herds undergoing bTB breakdowns, genetic selection reduced the severity of breakdowns over generations by reducing both the percentage of secondary cases and the duration over which new secondary cases were detected. Selection of the 10, 25, 50, and 70% most resistant sires reduced the percentage of secondary cases to <1% in 4, 5, 7, and 11 generations, respectively. 
Similarly, the proportion of long breakdowns (breakdowns in which secondary cases were detected for more than 365 days) was reduced by half in 2, 2, 3, and 4 generations, respectively. Collectively, results suggest that genetic selection could be a viable tool that can complement existing management and surveillance methods to control and ultimately eradicate bTB.

Dynamic chain graph models for time series network data

Anacleto, O., Queen, C.M.
Journal Paper Bayesian Analysis, Volume 12, Number 2 (2017), 491-509


This paper introduces a new class of Bayesian dynamic models for inference and forecasting in high-dimensional time series observed on networks. The new model, called the dynamic chain graph model, is suitable for multivariate time series which exhibit symmetries within subsets of series and a causal drive mechanism between these subsets. The model can accommodate high-dimensional, non-linear and non-normal time series and enables local and parallel computation by decomposing the multivariate problem into separate, simpler sub-problems of lower dimensions. The advantages of the new model are illustrated by forecasting traffic network flows and also modelling gene expression data from transcriptional networks.

A novel statistical model to estimate host genetic effects affecting disease transmission

Anacleto, O., Garcia-Cortez, L., Lipschutz-Powell, D., Woolliams, J.A., Doeschl-Wilson, A.B.
Journal Paper Genetics, Volume 201, Number 3 (2015), 871-884


There is increasing recognition that genetic diversity can affect the spread of diseases, potentially affecting plant and livestock disease control as well as the emergence of human disease outbreaks. Nevertheless, even though computational tools can guide the control of infectious diseases, few epidemiological models can simultaneously accommodate the inherent individual heterogeneity in multiple infectious disease traits influencing disease transmission, such as the frequently modeled propensity to become infected and infectivity, which describes the host ability to transmit the infection to susceptible individuals. Furthermore, current quantitative genetic models fail to fully capture the heritable variation in host infectivity, mainly because they cannot accommodate the nonlinear infection dynamics underlying epidemiological data. We present in this article a novel statistical model and an inference method to estimate genetic parameters associated with both host susceptibility and infectivity. Our methodology combines quantitative genetic models of social interactions with stochastic processes to model the random, nonlinear, and dynamic nature of infections and uses adaptive Bayesian computational techniques to estimate the model parameters. Results using simulated epidemic data show that our model can accurately estimate heritabilities and genetic risks not only of susceptibility but also of infectivity, therefore exploring a trait whose heritable variation is currently ignored in disease genetics and can greatly influence the spread of infectious diseases. Our proposed methodology offers potential impacts in areas such as livestock disease control through selective breeding and also in predicting and controlling the emergence of disease outbreaks in human populations.
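The kind of epidemic this abstract refers to can be illustrated with a minimal sketch of a stochastic SIR model in which each host has its own susceptibility and infectivity. This is a generic heterogeneous SIR simulation, not the paper's model or its Bayesian inference method; the rate form, the parameters `beta` and `gamma`, and the lognormal trait values are all assumptions made for illustration.

```python
import math
import random

def simulate_epidemic(suscept, infect, beta, gamma, index_case, rng):
    """Gillespie simulation of an SIR epidemic in one closed group.

    Assumed rates (illustrative): susceptible j is infected at rate
    beta * suscept[j] * (sum of infect[i] over infected i) / n,
    and each infected host recovers at rate gamma.
    Returns each host's infection time (math.inf if never infected).
    """
    n = len(suscept)
    infected = {index_case}
    susceptible = set(range(n)) - infected
    t_inf = [math.inf] * n
    t_inf[index_case] = 0.0
    t = 0.0
    while infected:
        force = beta * sum(infect[i] for i in infected) / n
        inf_rates = {j: force * suscept[j] for j in susceptible}
        total = sum(inf_rates.values()) + gamma * len(infected)
        t += rng.expovariate(total)          # waiting time to the next event
        u = rng.random() * total             # pick which event happens
        for j, rate in inf_rates.items():
            u -= rate
            if u < 0:                        # host j becomes infected
                susceptible.remove(j)
                infected.add(j)
                t_inf[j] = t
                break
        else:                                # otherwise one host recovers
            infected.remove(rng.choice(sorted(infected)))
    return t_inf

# Hypothetical group of 30 hosts with lognormal trait values.
rng = random.Random(42)
n = 30
suscept = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
infect = [rng.lognormvariate(0.0, 0.5) for _ in range(n)]
times = simulate_epidemic(suscept, infect, beta=2.0, gamma=0.5, index_case=0, rng=rng)
final_size = sum(t < math.inf for t in times)
```

Because a host's infection rate depends on the infectivity of its infected group mates, infection times carry information about both traits, which is the nonlinearity that linear quantitative genetic models cannot accommodate.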

Multivariate forecasting of road traffic flows in the presence of heteroscedasticity and measurement errors

Anacleto, O., Queen, C.M., Albers, C.J.
Journal Paper Journal of the Royal Statistical Society: Series C (Applied Statistics), Volume 62, Number 2 (2013), 251–270


Linear multiregression dynamic models, which combine a graphical representation of a multivariate time series with a state space model, have been shown to be a promising class of models for forecasting traffic flow data. Analysis of flows at a busy motorway intersection near Manchester, UK, highlights two important modelling issues: accommodating different levels of traffic variability depending on the time of day and accommodating measurement errors due to data collection errors. This paper extends linear multiregression dynamic models to address these issues. Additionally, the paper investigates how close the approximate forecast limits that are usually used with the linear multiregression dynamic model are to the true, but not so readily available, forecast limits.

Forecasting multivariate road traffic flows using Bayesian dynamic graphical models, splines and other traffic variables

Anacleto, O., Queen, C.M., Albers, C.J.
Journal Paper Australian & New Zealand Journal of Statistics, Volume 55, Number 2 (2013), 69–86


Traffic flow data are routinely collected for many networks worldwide. These invariably large data sets can be used as part of a traffic management system, for which good traffic flow forecasting models are crucial. The linear multiregression dynamic model (LMDM) has been shown to be promising for forecasting flows, accommodating multivariate flow time series, while being a computationally simple model to use. While statistical flow forecasting models usually base their forecasts on flow data alone, data for other traffic variables are also routinely collected. This paper shows how cubic splines can be used to incorporate extra variables into the LMDM in order to enhance flow forecasts. Cubic splines are also introduced into the LMDM to parsimoniously accommodate the daily cycle exhibited by traffic flows.

The proposed methodology allows the LMDM to provide more accurate forecasts when forecasting flows in a real high-dimensional traffic data set. The resulting extended LMDM can deal with some important traffic modelling issues not usually considered in flow forecasting models. Additionally, the model can be implemented in a real-time environment, a crucial requirement for traffic management systems designed to support decisions and actions to alleviate congestion and keep traffic flowing.

An introduction to nonparametric statistical methods for recurrent event data
(in Portuguese)

Louzada, F., Faria, R., Anacleto, O., Benzé, B. G., Pacífico, A. M. L.
Journal Paper Revista Brasileira de Estatística, Volume 55, Number 2 (2013), 69–86


Bootstrap confidence intervals for industrial recurrent event data

Anacleto, O., Louzada, F.
Journal Paper Pesquisa Operacional, Volume 31, Number 1 (2011), 103-119


Industrial recurrent event data, in which an event of interest can be observed more than once in a single sample unit, arise in several areas, such as engineering, manufacturing and industrial reliability. Such data provide information about the number of events, the time to their occurrence and their costs. Nelson (1995) presents a methodology to obtain asymptotic confidence intervals for the cost and the number of cumulative recurrent events. Although this is a standard procedure, it may not perform well in some situations, in particular when the available sample size is small. In this context, computer-intensive methods such as the bootstrap can be used to construct confidence intervals. In this paper, we propose a technique based on the bootstrap method to obtain interval estimates for the cost and the number of cumulative events. Advantages of the proposed methodology include its applicability in several areas and its easy computational implementation. In addition, according to some Monte Carlo simulations, it can be a better alternative than asymptotic-based methods for calculating confidence intervals. An example from the engineering area illustrates the methodology.
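As a rough illustration of the idea, the sketch below builds a percentile bootstrap interval for the mean cumulative number of events per unit by a given time. The abstract does not say which bootstrap variant the paper uses, so resampling whole units and taking percentiles are assumptions, and the event times and function names are hypothetical.

```python
import random

def mean_cumulative_count(units, t):
    """Average number of events per unit that occurred at or before time t."""
    return sum(sum(1 for e in events if e <= t) for events in units) / len(units)

def bootstrap_ci(units, t, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling whole units with replacement."""
    rng = random.Random(seed)
    n = len(units)
    stats = sorted(
        mean_cumulative_count([units[rng.randrange(n)] for _ in range(n)], t)
        for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lo = stats[int(alpha * n_boot)]
    hi = stats[min(n_boot - 1, int((1.0 - alpha) * n_boot))]
    return lo, hi

# Hypothetical data: event times (e.g. months to each repair) for six units.
units = [[2.0, 7.5], [4.1], [], [1.2, 3.3, 9.0], [6.4], [5.0, 8.8]]
estimate = mean_cumulative_count(units, t=8.0)
lower, upper = bootstrap_ci(units, t=8.0)
```

Resampling units rather than individual events keeps each unit's event history intact, which matters because events within a unit are not independent. Replacing the event indicator with a per-event cost would give the analogous interval for the mean cumulative cost.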

Poly-bagging predictors for classification modelling in credit scoring

Louzada, F., Anacleto, O., Candolo, C., Mazucheli, J.
Journal Paper Expert Systems with Applications, Volume 38, Number 10 (2011), 12717-12720


Credit scoring modelling is one of the leading formal tools for supporting the granting of credit. Its core objective is to generate a score by which potential clients can be ranked by their probability of default. A critical factor is whether a credit scoring model is accurate enough to correctly classify a client as a good or bad payer. In this context the concept of bootstrap aggregating (bagging) arises. The basic idea is to generate multiple classifiers by obtaining the predicted values from models fitted to several replicated datasets and then combining them into a single predictive classification in order to improve the classification accuracy. In this paper we propose a new bagging-type variant procedure, which we call poly-bagging, consisting of combining predictors over a succession of resamplings. The study is motivated by credit scoring modelling. The proposed poly-bagging procedure was applied to several artificial datasets and to a real credit-granting dataset, using up to three successions of resamplings. We observed better classification accuracy for the two-bagged and the three-bagged models for all considered setups. These results strongly indicate that the poly-bagging approach may improve modelling performance measures, while keeping a flexible and straightforward bagging-type structure that is easy to implement.
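The bagging idea described above can be sketched in a few lines. One possible reading of "combining predictors over a succession of resamplings" is to bag predictors that are themselves bagged; the sketch takes that reading as an assumption and is not the paper's procedure. The threshold base classifier, the majority vote, and the simulated scores are likewise hypothetical.

```python
import random

def fit_stump(data):
    """Base classifier: choose the threshold on x with the fewest training errors.

    data is a list of (x, label) pairs, label in {0, 1}; predicts 1 when x > threshold.
    """
    best_thr, best_err = None, None
    for thr in sorted({x for x, _ in data}):
        err = sum((x > thr) != bool(y) for x, y in data)
        if best_err is None or err < best_err:
            best_thr, best_err = thr, err
    return lambda x: int(x > best_thr)

def bag(fit, data, n_models, rng):
    """Fit `fit` to n_models bootstrap replicates and combine by majority vote."""
    n = len(data)
    models = [fit([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_models)]
    return lambda x: int(2 * sum(m(x) for m in models) > n_models)

# Simulated risk scores: label 0 = good payer, label 1 = bad payer.
rng = random.Random(1)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(50)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(50)]

bagged = bag(fit_stump, data, 25, rng)
# "Two-bagged" under the assumed reading: bagging applied to bagged predictors.
two_bagged = bag(lambda d: bag(fit_stump, d, 25, rng), data, 11, rng)
accuracy = sum(two_bagged(x) == y for x, y in data) / len(data)
```

Because `bag` accepts any fitting function and returns a predictor, a further level of resampling is just another nested call. Odd values of `n_models` avoid ties in the vote.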

Reject inference: using combined models to boost predictive power

Louzada, F., Diniz, C.A.R., Rocha, R.F., Anacleto, O.
Journal Paper Credit Technology (A Serasa Experian publication), Volume 77 (2011), 7-17