• Open Access

Link Functions in Multi-Locus Genetic Models: Implications for Testing, Prediction, and Interpretation

Authors

  • David Clayton*

    Corresponding author
    • Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, Cambridge University, United Kingdom

Correspondence to: David Clayton, Juvenile Diabetes Research Foundation/Wellcome Trust Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, Cambridge University, UK.

Abstract

“Complex” diseases are, by definition, influenced by multiple causes, both genetic and environmental, and statistical work on the joint action of multiple risk factors has, for more than 40 years, been dominated by the generalized linear model (GLM). In genetics, models for dichotomous traits have traditionally been approached via the model of an underlying, normally distributed liability. This corresponds to the GLM with binomial errors and a probit link function. Elsewhere in epidemiology, however, the logistic regression model, a GLM with logit link function, has been the tool of choice, largely because of its convenient properties in case-control studies. The choice of link function has usually been dictated by mathematical convenience, but it has important implications for (a) the choice of association test statistic in the presence of existing strong risk factors, (b) the ability to predict disease from genotype given its heritability, and (c) the definition and interpretation of epistasis (or epistacy). These issues are reviewed, and a new association test is proposed. Genet. Epidemiol. 36:409–418, 2012. © 2012 Wiley Periodicals, Inc.
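
As a rough illustration of why the link function matters for the definition of epistasis, the following minimal Python sketch builds a hypothetical two-locus risk table in which effects are exactly additive on the probit (liability) scale and then measures the interaction contrast on both the probit and logit scales. The baseline and effect sizes are illustrative assumptions, not values from the paper, and the sketch is not the association test the paper proposes.

```python
from math import log
from statistics import NormalDist

# Standard normal distribution for the underlying liability (probit link).
std_normal = NormalDist()


def logit(p):
    """Log-odds of a probability p."""
    return log(p / (1.0 - p))


# Hypothetical values, chosen only for illustration (not from the paper).
baseline, effect_a, effect_b = -2.0, 0.5, 0.8

# Disease risk for each genotype combination (a, b) in {0, 1} x {0, 1},
# with effects that add on the probit (liability) scale.
risk = {
    (a, b): std_normal.cdf(baseline + a * effect_a + b * effect_b)
    for a in (0, 1)
    for b in (0, 1)
}


def interaction_contrast(transform):
    """Two-by-two interaction contrast of the risks on a transformed scale.

    A value of zero means the two loci act additively on that scale.
    """
    return (
        transform(risk[(1, 1)])
        - transform(risk[(1, 0)])
        - transform(risk[(0, 1)])
        + transform(risk[(0, 0)])
    )


probit_interaction = interaction_contrast(std_normal.inv_cdf)
logit_interaction = interaction_contrast(logit)

print(f"interaction on probit scale: {probit_interaction:.6f}")
print(f"interaction on logit scale:  {logit_interaction:.6f}")
```

Under these assumed values the contrast is zero (up to rounding) on the probit scale but clearly nonzero on the logit scale, so the same risks would be described as showing "no epistasis" under one link and "epistasis" under the other.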
