Fitting pedigree based mixed models in BUGS software Gregor Gorjanc University of Ljubljana, Biotechnical Faculty, Department of Animal Science, Slovenia
5th SGD, Otočec, Slovenia 20th September 2009
Introduction I
Pedigree based mixed model = animal model y = Xb + Za + e
y|b, a, σe2 ∼ N Xb + Za, Iσe2 a|A, σa2 ∼ N 0, Aσa2 parameters: b, a, σa2 , σe2
derived quantities: h2 = σa2/(σa2 +σe2 )
Introduction I
Pedigree based mixed model = animal model y = Xb + Za + e
y|b, a, σe2 ∼ N Xb + Za, Iσe2 a|A, σa2 ∼ N 0, Aσa2 parameters: b, a, σa2 , σe2 I I
derived quantities: h2 = σa2/(σa2 +σe2 )
Simplistic, powerful, & robust model Frequently used in: I I I
animal and plant breeding - research & INDUSTRY evolutionary biology - research human genetics - research
Bayesian approach I
Recent rise of Bayesian approach to infer parameters
p b, a, σa2 , σe2 |y
=
´ ´ ´
´
b a σa2
ˆ ˆ p (b|y)
= a
p (a|y)
ˆ
=
σa2
σe2
p y|b, a, σe2 p (b) p a, Aσa2 p σa2 p σe2 2 2 2 2 2 2 σe2 p (y|b, a, σe ) p (b) p (a, Aσa ) p (σa ) p (σe ) dbdadσa dσe
p b, a, σa2 , σe2 |y dadσa2 dσe2
...
...
I
Computationally extremely demanding!!!
Bayesian approach I
Recent rise of Bayesian approach to infer parameters
p b, a, σa2 , σe2 |y
=
´ ´ ´
´
b a σa2
ˆ ˆ p (b|y)
= a
p (a|y)
ˆ
=
σa2
σe2
p y|b, a, σe2 p (b) p a, Aσa2 p σa2 p σe2 2 2 2 2 2 2 σe2 p (y|b, a, σe ) p (b) p (a, Aσa ) p (σa ) p (σe ) dbdadσa dσe
p b, a, σa2 , σe2 |y dadσa2 dσe2
...
...
I
Computationally extremely demanding!!!
I
Solution - Markov chain Monte Carlo (McMC) methods - sample from full conditional distributions - “easy” p (bi |y, b−i , a, σa2 , σe2 ) p (ai |y, b, a−i , σa2 , σe2 ) ...
Available software Animal breeding and genetics community I I I I I I I I
DMU GIBBSF90 MCMCglmm R package MTGSAM SIR-BAYES TM VCE ...
Available software Animal breeding and genetics community I I I I I I I I
DMU GIBBSF90 MCMCglmm R package MTGSAM SIR-BAYES TM VCE ...
General purpose statistical programs I
BUGS I I I
Classic BUGS WinBUGS OpenBUGS µ
τ
Y1
Y6
θ
Y2 Y3
Y5 Y4
Aim Fit animal model in BUGS in a general manner I
Previous work I
I
Damgaard (2007) Technical note: How to use Winbugs to draw inferences in animal models. J. Anim. Sci., 85(6): 1363-1368. http://jas.fass.org/cgi/reprint/85/6/1363.pdf Waldmann (2009) Easy and flexible Bayesian inference of quantitative genetic parameters. Evolution, 63(6): 1640-1643. http://www3.interscience.wiley.com/ journal/121675188/abstract
How?
Describe animal model as a graphical model using Directed Acyclic Graph (DAG) µ
τ
Y1
Y6
θ
Y2 Y3
Y5 Y4
DAG for the “alien” example
Figure by Jouke
DAG for the “alien” example σa2
98
σe2
a1
a2
a3
a4 a5
a6 a7 109
Figure by Jouke
a8 a9
DAG for the “alien” example σa2
a1 98
σe2
105
a3
a2 a4
a5
106
101 a6
a7 109 Figure by Jouke
93 a8
a9
DAG for the “alien” example σa2
σe2
b1
b2 a1
98
105
a3
a2 a4
a5
106
101 a6
a7 109 Figure by Jouke
93 a8
a9
DAG for the “alien” example σa2
σe2
b1
b2 a1
98
105
a3
a2 a4
a5
106
101 a6
a7 109 Figure by Jouke
93 a8
a9
DAG using plate notation σa2
2
y|b, a, σe2 ∼ N Xb + Za, Iσe a|A, σa2 ∼ N 0, Aσa2
Wk,k af (k)
am(k)
1/2
1/2
ak A A−1
= TWTT −1 = (I − 1/2P)−1 W I − 1/2PT T = (T−1 ) W−1 T−1 = (I − 1/2P)T W−1 (I − 1/2P)
Zi,k
bj j = 1 : nB
Xi,j µi
k = 1 : nI
σe2
yi i = 1 : nY
DAG description with BUGS model language ## Additive genetic values for(k in 1:nI) { a[id[k]] ∼ dnorm(pa[id[k]], Xtau2a[id[k]]) pa[id[k]] <- 0.5 * (a[fid[k]] + a[mid[k]]) Xtau2a[id[k]] <- winv[id[k]] * tau2a } a[nU] <- 0 # NULL (zero) holder ## Phenotypes for(i in 1:nY) { y[i] ∼ dnorm(mu[i], tau2e) mu[i] <- b[x[i]] + a[idy[i]] }
DAG description with BUGS model language cont.
## Variance priors tau2e ∼ dgamma(0.001, 0.001) tau2a ∼ dgamma(0.001, 0.001) sigma2e <- 1 / tau2e sigma2a <- 1 / tau2a ## Location priors for(j in 1:nB) { b[j] ∼ dnorm(0, 1.0E-6) }
Data list(## Constants nI=9, nU=10, nY=6, nB=2, ## Pedigree id=c( 1, 2, fid=c(10, 10, mid=c(10, 10, winv=c( 1, 1,
)
3, 4, 2, 2, 1, 10, 2, 1.3,
5, 4, 3, 2,
6, 2, 3, 2,
7, 8, 9), 5, 1, 7), 6, 10, 8), 2.5, 1, 2.3),
## Phenotypes & model variables y=c(105, 98, 101, 106, 93, 109), idy=c( 2, 3, 4, 5, 6, 9), x=c( 1, 1, 2, 2, 2, 1)
Initial values
I
Not strictly needed, but its good to provide them to avoid extreme initial values
list(## Means b=c(1, -1), a=c(0, -0.8, 1, 0.6, 1.2, 5, 0, -1, 2, NA),
)
## Variances tau2a=1, tau2e=1
How-to use BUGS? I
Check “Welcome to WinBUGS - the movie” to see the point&click work-flow http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/ winbugsthemovie.html
I
There are also “automatic” interfaces from: I I I I
R & S-PLUS (packages R2WinBUGS & BRugs) SAS MATLAB Excel
Heritability for the "alien” example - prior effect1
1
For mathematical details see Sorensen & Gianola (2002) - Example 2.21 (p. 109-111)
Heritability for the "alien” example - prior effect1 Non-informative prior tau2e ∼ dgamma(1, 1) tau2a ∼ dgamma(1, 1)
0.0 0.2 0.4 0.6 0.8 1.0 1
For mathematical details see Sorensen & Gianola (2002) - Example 2.21 (p. 109-111)
Heritability for the "alien” example - prior effect1 Non-informative prior tau2e ∼ dgamma(1, 1) tau2a ∼ dgamma(1, 1)
Informative prior tau2e ∼ dgamma(5, 120) tau2a ∼ dgamma(15, 240)
0.0 0.2 0.4 0.6 0.8 1.0
0.0 0.2 0.4 0.6 0.8 1.0
1
For mathematical details see Sorensen & Gianola (2002) - Example 2.21 (p. 109-111)
Conclusions I
Graphical model formulation enabled fitting animal model in BUGS in a general manner - this provides an easy way to use Bayesian approach to fit animal models
Conclusions I
Graphical model formulation enabled fitting animal model in BUGS in a general manner - this provides an easy way to use Bayesian approach to fit animal models
I
Several animal model extensions are already developed! I I I I I I I I
environmental effects genetic groups reduced animal model maternal model multiple trait model repeatability and random regression model uncertain parentage non-Gaussian traits
Any questions?
Figure by Jouke
Abstract Pedigree based mixed model (commonly called animal model) is an important class of statistical models for inference of quantitative genetic parameters in various fields such as animal and plant breeding, evolutionary biology and human genetics. In last years Bayesian statistics has been introduced to a set of standard statistical procedures of a quantitative geneticists’ toolbox, due to the ever increasing complexity of fitted models. While several specific programs can be used to fit animal model using Bayesian approach, none of them provide and easy to use and flexible environment for the development and testing of models. It is common to use favourite programming language to develop the needed programs, but this requires a considerable amount of programming and statistical skills. A viable alternative is to use general purpose statistical packages. BUGS (Bayesian Using Gibbs Sampling) is a popular and fairly flexible program for the Bayesian analysis of complex statistical models using Markov Chain Monte Carlo methods. Recently two reports of fitting animal model in BUGS were given, but both failed to provide a generic procedure that can be used independently of the collected data. Here, a generic description of animal model is presented using the concept of graphical models. This description was translated to BUGS language and fitted to a small example. Comparison with other programs revealed the validity of a new procedure. Tests with other data sets showed that BUGS can be used to efficiently fit animal model for medium sized data sets. Using these results quantitative geneticists can now easily use Bayesian approach to fit animal model in BUGS.