An assessment of linkage disequilibrium in Holstein cattle using a Bayesian network

Authors


Correspondence

G. Morota, Department of Animal Sciences, University of Wisconsin, Madison, WI 53706, USA. Tel: +1 (608) 263 3499; Fax: +1 (608) 263 5157; E-mail: morota@ansci.wisc.edu

Summary

Linkage disequilibrium (LD) is defined as a non-random association of the distributions of alleles at different loci within a population. This association between loci is valuable in prediction of quantitative traits in animals and plants and in genome-wide association studies. A question that arises is whether standard metrics such as D′ and r² reflect complex associations in a genetic system properly. It seems reasonable to take the view that loci associate and interact together as a system or network, as opposed to in a simple pairwise manner. We used a Bayesian network (BN) as a representation of choice for an LD network. A BN is a graphical depiction of a probability distribution and can represent sets of conditional independencies. Moreover, it provides a visual display of the joint distribution of the set of random variables in question. The usefulness of BN for linkage disequilibrium was explored and illustrated using genetic marker loci found to have the strongest effects on milk protein in Holstein cattle based on three strategies for ranking marker effect estimates: posterior means, standardized posterior means and additive genetic variance. Two different algorithms, Tabu search (a local score–based algorithm) and incremental association Markov blanket (a constraint-based algorithm), coupled with the chi-square test, were used for learning the structure of the BN and were compared with the reference r² metric represented as an LD heat map. The BN captured several genetic markers associated as clusters, implying that markers are inter-related in a complicated manner. Further, the BN detected conditionally dependent markers. The results confirm that LD relationships are of a multivariate nature and that r² gives an incomplete description and understanding of LD. Use of an LD Bayesian network enables inferring associations between loci in a systems framework and provides a more accurate picture of LD than that resulting from the use of pairwise metrics.
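
To make the contrast between pairwise metrics and conditional independence concrete, the following is a minimal Python sketch, not the authors' analysis. It assumes a hypothetical toy system of three biallelic loci A, B and C whose alleles follow a chain A → B → C, with made-up transition probabilities. The function names `r2_and_dprime` and `pair_ld` are illustrative only. The sketch computes r² and |D′| from exact haplotype frequencies and shows that A and C have a clearly non-zero pairwise r² even though they are independent once the allele at B is fixed, which is the kind of structure a BN can encode and a pairwise heat map cannot.

```python
from itertools import product


def r2_and_dprime(p_ab, p_a, p_b):
    """Return (r^2, |D'|) for two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele 1 at both loci;
    p_a and p_b are the allele-1 frequencies at each locus.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


def pair_ld(joint, i, j, given=None):
    """LD between loci i and j, optionally within haplotypes where
    locus given[0] carries allele given[1]."""
    kept = {h: p for h, p in joint.items()
            if given is None or h[given[0]] == given[1]}
    total = sum(kept.values())
    p_i = sum(p for h, p in kept.items() if h[i] == 1) / total
    p_j = sum(p for h, p in kept.items() if h[j] == 1) / total
    p_ij = sum(p for h, p in kept.items() if h[i] == 1 and h[j] == 1) / total
    return r2_and_dprime(p_ij, p_i, p_j)


# Hypothetical chain A -> B -> C (assumed values, not from the paper):
# each locus copies the previous locus's allele with probability 0.9.
P_A1 = 0.5
COPY = 0.9

joint = {}
for a, b, c in product((0, 1), repeat=3):
    p = P_A1 if a == 1 else 1 - P_A1
    p *= COPY if b == a else 1 - COPY
    p *= COPY if c == b else 1 - COPY
    joint[(a, b, c)] = p

r2_ab, _ = pair_ld(joint, 0, 1)
r2_ac, _ = pair_ld(joint, 0, 2)
r2_ac_given_b1, _ = pair_ld(joint, 0, 2, given=(1, 1))

print(f"r2(A,B)       = {r2_ab:.4f}")
print(f"r2(A,C)       = {r2_ac:.4f}")
print(f"r2(A,C | B=1) = {r2_ac_given_b1:.4f}")
```

In this toy system the pairwise view reports A and C as being in LD, while conditioning on B removes the association entirely. A structure-learning algorithm applied to real genotype data would have to infer such relationships with statistical tests (for example, the chi-square test mentioned above) rather than from known probabilities.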
