
A Multiclass Likelihood Ratio Approach for Genetic Risk Prediction Allowing for Phenotypic Heterogeneity

Authors

  • Yalu Wen

    1. Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan
  • Qing Lu

    Corresponding author
    1. Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan
    • Correspondence to: Dr. Qing Lu, Department of Epidemiology and Biostatistics, Michigan State University, B601 West Fee Hall, East Lansing, Michigan 48824. E-mail: qlu@epi.msu.edu


ABSTRACT

The translation of human genome discoveries into health practice is one of the major challenges of the coming decades. The use of emerging genetic knowledge for early disease prediction, prevention, and pharmacogenetics will advance genome medicine and lead to more effective prevention and treatment strategies. For this reason, studies that assess the combined role of genetic and environmental discoveries in early disease prediction are high-priority research projects, as reflected in the multiple risk prediction studies now underway. However, the risk prediction models formed to date lack sufficient accuracy for clinical use. Converging evidence suggests that diseases with the same or similar clinical manifestations can have different pathophysiological and etiological processes. When heterogeneous subphenotypes are treated as a single entity, the effect size of predictors can be reduced substantially, leading to a low-accuracy risk prediction model. The use of more refined subphenotypes facilitates the identification of new predictors and leads to improved risk prediction models. To account for phenotypic heterogeneity, we have developed a multiclass likelihood-ratio approach, which simultaneously determines the optimum number of subphenotype groups and builds a risk prediction model for each group. Simulation results demonstrated that the new approach was more accurate and more robust than existing approaches under various underlying disease models. An empirical study of type II diabetes (T2D), using data from the Genes and Environment Initiatives, suggested heterogeneous etiology in obese and nonobese T2D patients. Considering phenotypic heterogeneity in the analysis leads to improved risk prediction models for both obese and nonobese T2D subjects.
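
The claim that pooling heterogeneous subphenotypes shrinks a predictor's effect size can be illustrated with a small arithmetic sketch. This is not the multiclass likelihood-ratio method itself; it is a toy calculation under assumed numbers (a binary risk factor with 30% frequency in controls, two equally common case subtypes, and an odds ratio of 2.0 in one subtype and 1.0 in the other). All function names and parameter values are hypothetical and chosen only for illustration.

```python
def exposure_freq_in_cases(control_freq, odds_ratio):
    """Risk-factor frequency among cases implied by a given odds ratio."""
    case_odds = odds_ratio * control_freq / (1.0 - control_freq)
    return case_odds / (1.0 + case_odds)


def odds_ratio(case_freq, control_freq):
    """Odds ratio comparing risk-factor frequency in cases and controls."""
    case_odds = case_freq / (1.0 - case_freq)
    control_odds = control_freq / (1.0 - control_freq)
    return case_odds / control_odds


def pooled_odds_ratio(control_freq, subtype_ors, subtype_weights):
    """Odds ratio observed when all case subtypes are analyzed as one group."""
    pooled_case_freq = sum(
        weight * exposure_freq_in_cases(control_freq, subtype_or)
        for subtype_or, weight in zip(subtype_ors, subtype_weights)
    )
    return odds_ratio(pooled_case_freq, control_freq)


# Assumed (illustrative) values, not taken from the study.
control_freq = 0.30            # risk-factor frequency in controls
subtype_ors = [2.0, 1.0]       # the factor matters only for the first subtype
subtype_weights = [0.5, 0.5]   # each subtype makes up half of the cases

pooled = pooled_odds_ratio(control_freq, subtype_ors, subtype_weights)
print(f"Subtype-specific odds ratios: {subtype_ors}")
print(f"Pooled odds ratio: {pooled:.2f}")
```

Under these assumptions the pooled odds ratio is about 1.43, well below the subtype-specific value of 2.0, which shows how a predictor that is informative for one subphenotype can look weak when the subphenotypes are treated as a single entity.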

