• clustering
• linkage analysis
• locus heterogeneity
• model selection
• reversible jump MCMC
• SNP
• stochastic EM


Complex genetic traits are inherently heterogeneous: they may be caused by different genes, or by non-genetic factors, in different individuals. Mapping the genes responsible for such traits by linkage analysis therefore requires a model that accounts for heterogeneity. Heterogeneity across families can be modeled with a mixture distribution in which each family has its own heterogeneity parameter, the probability that its disease-causing gene is linked to the marker map under consideration. A substantial gain in power is expected if covariates that can discriminate between linked and unlinked families are incorporated into this framework. To this end, we propose a hierarchical Bayesian model in which families are grouped according to (categorized) levels of one or more covariates. The heterogeneity parameters of the families within each group are assigned a common prior, whose parameters are in turn assigned hyper-priors. The hyper-parameters are obtained from empirical Bayes estimates. We also address related issues, such as how to group families and how to evaluate whether the covariates under consideration are informative. Across various simulation scenarios, we compare the proposed approach with one that does not use covariates and show that it yields considerable gains in power to detect linkage and in the precision of interval estimates. An application to the asthma datasets of Genetic Analysis Workshop 12 illustrates this gain in a real data analysis. Additionally, we compare the performance of microsatellite markers and single nucleotide polymorphisms under our approach and find that the latter clearly outperform the former. Genet. Epidemiol. 2007. © 2007 Wiley-Liss, Inc.
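To make the mixture idea concrete, the sketch below evaluates the standard admixture form of a family's likelihood ratio, alpha * LR + (1 - alpha), where alpha is the family's heterogeneity parameter and LR is its likelihood ratio for linkage at a given map position. This is only an illustration under stated assumptions, not the hierarchical Bayesian procedure described above: the function names, the covariate labels, the likelihood ratios, and the plugged-in group-level values of alpha are all hypothetical, and fixed values stand in for the priors and hyper-priors.

```python
import math


def family_mixture_lr(alpha, lr_linked):
    """Admixture likelihood ratio for one family.

    alpha      -- probability that the family is of the linked type
    lr_linked  -- the family's likelihood ratio (linked vs. unlinked)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * lr_linked + (1.0 - alpha)


def log_mixture_lr(alphas, lrs):
    """Sum of per-family log mixture likelihood ratios."""
    return sum(math.log(family_mixture_lr(a, lr)) for a, lr in zip(alphas, lrs))


# Hypothetical per-family likelihood ratios at one map position.
lrs = [8.0, 0.5, 4.0, 0.25]

# Hypothetical categorized covariate used to group the families.
groups = ["early_onset", "late_onset", "early_onset", "late_onset"]

# Assumed group-level heterogeneity values (stand-ins for group priors).
group_alpha = {"early_onset": 0.8, "late_onset": 0.1}

# Covariate-informed: each family takes the alpha of its group.
alphas_cov = [group_alpha[g] for g in groups]

# Covariate-free: every family shares one common alpha.
alphas_common = [0.45] * len(lrs)

with_cov = log_mixture_lr(alphas_cov, lrs)
without_cov = log_mixture_lr(alphas_common, lrs)
print(f"log mixture LR with covariate grouping:    {with_cov:.3f}")
print(f"log mixture LR without covariate grouping: {without_cov:.3f}")
```

In this toy setup the covariate separates families with strong linkage evidence from those without, so letting alpha vary by group gives a larger combined log likelihood ratio than a single shared alpha. With an uninformative covariate, no such gain would be expected.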