Rapid advances in genome sequencing techniques allow the dissection of complex traits in non-model organisms that are important in evolutionary biology, conservation genetics, breeding and medicine. These advances call for new statistical analysis tools that can handle large amounts of sequencing data efficiently.
We propose an analytic Bayesian implementation of the mixed linear model which allows rapid and robust inference of heritability. The two main features of the method are that (i) the breeding values and the residual variance component are integrated out of the model analytically and (ii) the parameter space of the variance ratio parameter is discretized so that a Gibbs sampling distribution can be utilized. We further propose two separate methods for inferring breeding values that account for the uncertainty in the estimated heritability. The benefit of the method compared to a standard Markov chain Monte Carlo (MCMC) method is illustrated on public data sets: two simulated data sets and one wheat (Triticum aestivum L.) pedigree.
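The two features can be illustrated with a minimal sketch. It is not the authors' implementation and rests on assumptions that are not stated in the text: no fixed effects (the phenotype vector y is taken to have zero mean), a prior p(sigma_e^2) proportional to 1/sigma_e^2 on the residual variance, and a uniform prior over an evenly spaced heritability grid. The function name heritability_grid_posterior, the grid and the toy relationship matrix A are hypothetical. Writing the variance ratio as lambda = h2 / (1 - h2), integrating out the breeding values and the residual variance gives p(y | lambda) proportional to |lambda*A + I|^(-1/2) * (y' (lambda*A + I)^(-1) y)^(-n/2), which an eigendecomposition of A makes cheap to evaluate at every grid point.

```python
import numpy as np


def heritability_grid_posterior(y, A, h2_grid):
    """Posterior over a discrete heritability grid (illustrative sketch).

    Assumes y ~ N(0, sigma_e^2 * (lam * A + I)) with lam = h2 / (1 - h2),
    p(sigma_e^2) proportional to 1 / sigma_e^2, and a uniform prior on h2_grid.
    """
    y = np.asarray(y, dtype=float)
    h2_grid = np.asarray(h2_grid, dtype=float)
    n = y.size

    # Eigendecomposition A = U diag(d) U'. A itself is never inverted,
    # so a singular relationship matrix is acceptable.
    d, U = np.linalg.eigh(A)
    d = np.clip(d, 0.0, None)  # remove tiny negative round-off eigenvalues
    y_rot = U.T @ y

    lam = h2_grid / (1.0 - h2_grid)            # variance ratio per grid point
    w = lam[:, None] * d[None, :] + 1.0        # eigenvalues of lam * A + I
    log_det = np.log(w).sum(axis=1)            # log |lam * A + I|
    quad = (y_rot ** 2 / w).sum(axis=1)        # y' (lam * A + I)^(-1) y

    # Log marginal likelihood up to a constant shared by all grid points.
    log_ml = -0.5 * log_det - 0.5 * n * np.log(quad)

    post = np.exp(log_ml - log_ml.max())       # stabilise before normalising
    return post / post.sum()


# Toy data: a rank-deficient (non-invertible) relationship matrix.
rng = np.random.default_rng(0)
n, m = 50, 20
G = rng.standard_normal((n, m))
A = G @ G.T / m                                # rank at most m < n
u = G @ rng.standard_normal(m) / np.sqrt(m)    # breeding values, u ~ N(0, A)
y = np.sqrt(0.5) * u + np.sqrt(0.5) * rng.standard_normal(n)

h2_grid = np.linspace(0.01, 0.99, 99)
post = heritability_grid_posterior(y, A, h2_grid)
post_mean = float(post @ h2_grid)

# Exact draws from the discretized posterior; no Markov chain is involved.
draws = rng.choice(h2_grid, size=1000, p=post)
print(f"posterior mean of h2: {post_mean:.3f}")
```

Because every grid point is evaluated independently, the loop over the grid (vectorised here) could also be distributed across processors, and draws from the discrete posterior are exact, so no burn-in or convergence diagnostics are needed in this simplified setting.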
Results show that the heritability estimates obtained by the proposed and standard methods are almost equally accurate, while the proposed method is computationally much faster, with up to a hundredfold speed-up. The method also lends itself to parallel implementation, which may speed up computations further.
The method allows analysis with a non-invertible relationship matrix, so that ad hoc manipulation of the matrix is avoided, which our results suggest can be important. The method completely avoids the convergence and mixing problems that are well known in MCMC simulation and that can sometimes severely reduce inferential power. Bayes factors for model comparison can be conveniently calculated as a by-product of the inference procedure. The source code will be available for download at http://www.rni.helsinki.fi/~mjs.