Variable Selection in Canonical Discriminant Analysis for Family Studies
Article first published online: 30 MAR 2010
© 2010, The International Biometric Society
Volume 67, Issue 1, pages 124–132, March 2011
How to Cite
Jin, M. and Fang, Y. (2011), Variable Selection in Canonical Discriminant Analysis for Family Studies. Biometrics, 67: 124–132. doi: 10.1111/j.1541-0420.2010.01414.x
- Issue published online: 30 MAR 2010
- Received June 2009. Revised January 2010. Accepted January 2010.
Keywords: Model selection
Summary: In family studies, canonical discriminant analysis can be used to find linear combinations of phenotypes that exhibit high ratios of between-family to within-family variability. With a large number of phenotypes, however, canonical discriminant analysis may overfit. To estimate the predicted ratios associated with the coefficients obtained from canonical discriminant analysis, two methods are developed: one based on bias correction and the other on cross-validation. Because cross-validation is computationally intensive, an approximation to it is also developed. Furthermore, these methods can be applied to perform variable selection in canonical discriminant analysis. The proposed methods are illustrated with simulation studies and applications to two real examples.
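
To make the idea in the summary concrete, the following is a minimal sketch, not the authors' implementation. It assumes the between-family and within-family variabilities are measured by the usual one-way mean-square matrices, finds the leading canonical direction from the corresponding generalized eigenproblem, and estimates a predicted ratio with a plain family-wise K-fold scheme. The function names (`mean_square_matrices`, `variability_ratio`, `leading_direction`, `cross_validated_ratio`) and the fold scheme are assumptions introduced for illustration; the paper's bias correction and its approximation to cross-validation are not reproduced here.

```python
import numpy as np


def mean_square_matrices(X, family):
    """Between-family and within-family mean-square matrices (one-way layout)."""
    X = np.asarray(X, dtype=float)
    family = np.asarray(family)
    labels = np.unique(family)
    n, p = X.shape
    k = labels.size
    grand_mean = X.mean(axis=0)
    B = np.zeros((p, p))
    W = np.zeros((p, p))
    for f in labels:
        Xf = X[family == f]
        mean_f = Xf.mean(axis=0)
        d = mean_f - grand_mean
        B += Xf.shape[0] * np.outer(d, d)   # between-family sum of squares
        R = Xf - mean_f
        W += R.T @ R                        # within-family sum of squares
    # Assumes at least two families and more individuals than families.
    return B / (k - 1), W / (n - k)


def variability_ratio(a, X, family):
    """Ratio a'Ba / a'Wa for the linear combination with coefficients a."""
    B, W = mean_square_matrices(X, family)
    return float(a @ B @ a) / float(a @ W @ a)


def leading_direction(X, family):
    """Coefficients maximizing the ratio, plus the in-sample maximum ratio."""
    B, W = mean_square_matrices(X, family)
    L = np.linalg.cholesky(W)               # W = L L^T, requires W positive definite
    M = np.linalg.solve(L, np.linalg.solve(L, B).T)   # L^{-1} B L^{-T}
    M = (M + M.T) / 2.0                     # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(M)          # eigenvalues in ascending order
    a = np.linalg.solve(L.T, vecs[:, -1])   # map back: a = L^{-T} v
    return a, float(vals[-1])


def cross_validated_ratio(X, family, n_folds=5, seed=0):
    """Family-wise K-fold estimate of the predicted ratio (illustrative scheme)."""
    X = np.asarray(X, dtype=float)
    family = np.asarray(family)
    rng = np.random.default_rng(seed)
    labels = rng.permutation(np.unique(family))
    ratios = []
    for held_out in np.array_split(labels, n_folds):
        test = np.isin(family, held_out)
        # Fit the coefficients on the training families only ...
        a, _ = leading_direction(X[~test], family[~test])
        # ... and evaluate the ratio on families the fit never saw.
        ratios.append(variability_ratio(a, X[test], family[test]))
    return float(np.mean(ratios))
```

The in-sample ratio returned by `leading_direction` is the maximum over all coefficient vectors, so it tends to be optimistic when the number of phenotypes is large relative to the number of families; comparing it with the held-out estimate from `cross_validated_ratio` gives a rough sense of that overfitting. Each held-out fold must contain at least two families and more individuals than families for the ratio to be defined.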