Using the Optimal Robust Receiver Operating Characteristic (ROC) Curve for Predictive Genetic Tests




Summary

Ongoing genome-wide association (GWA) studies are a powerful approach for uncovering previously unknown common genetic variants that contribute to common complex diseases. The discovery of these variants offers an important opportunity for early disease prediction, prevention, and individualized treatment. We describe a method for combining multiple genetic variants for early disease prediction, based on the optimality theory of the likelihood ratio (LR). This theory shows that the receiver operating characteristic (ROC) curve based on the LR achieves the maximum performance at each cutoff point, and that the area under the resulting ROC curve is the highest among all approaches. Through simulations and a real data application, we compared the LR-based method with the commonly used logistic regression and classification tree approaches. The three approaches perform similarly when the underlying disease model is known. For most common diseases, however, we have little prior knowledge of the disease model, and in this situation the new method has an advantage over the logistic regression and classification tree approaches. We applied the new method to the type 1 diabetes GWA data from the Wellcome Trust Case Control Consortium. Based on five single nucleotide polymorphisms, the test reaches a medium level of classification accuracy. As more genetic findings are discovered, we believe a predictive genetic test for type 1 diabetes can be successfully constructed and eventually implemented for clinical use.
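
To make the LR-based construction concrete, the following is a minimal sketch and not the authors' implementation. It assumes the LR of each multi-locus genotype is estimated by the ratio of its empirical frequencies in cases and controls, and that the ROC curve is traced by adding genotypes in order of decreasing LR. The function names `lr_roc` and `auc`, and the toy counts for two hypothetical biallelic loci, are invented for illustration only.

```python
from typing import Dict, List, Tuple

Genotype = Tuple[int, ...]          # one code per locus, e.g. (1, 0)
Point = Tuple[float, float]         # (false positive rate, true positive rate)


def lr_roc(case_counts: Dict[Genotype, int],
           control_counts: Dict[Genotype, int]) -> List[Point]:
    """Build ROC points by ranking genotypes on their estimated LR."""
    n_case = sum(case_counts.values())
    n_ctrl = sum(control_counts.values())
    scored = []
    for g in set(case_counts) | set(control_counts):
        p_case = case_counts.get(g, 0) / n_case      # estimate of P(G | diseased)
        p_ctrl = control_counts.get(g, 0) / n_ctrl   # estimate of P(G | non-diseased)
        if p_case == 0 and p_ctrl == 0:
            continue                                 # genotype never observed
        lr = p_case / p_ctrl if p_ctrl > 0 else float("inf")
        scored.append((lr, p_case, p_ctrl))
    # Highest LR first: each step adds the genotype with the steepest slope.
    scored.sort(key=lambda t: t[0], reverse=True)
    points = [(0.0, 0.0)]
    tpr = fpr = 0.0
    for _, p_case, p_ctrl in scored:
        tpr += p_case
        fpr += p_ctrl
        points.append((fpr, tpr))
    return points


def auc(points: List[Point]) -> float:
    """Area under a piecewise-linear ROC curve (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical counts (not from the paper): carrier status at two loci.
cases = {(0, 0): 10, (0, 1): 20, (1, 0): 30, (1, 1): 40}
controls = {(0, 0): 40, (0, 1): 30, (1, 0): 20, (1, 1): 10}

roc_points = lr_roc(cases, controls)
area = auc(roc_points)
print(round(area, 3))  # 0.75
```

Because the genotypes are added in order of decreasing LR, the slopes of successive ROC segments never increase, so the curve is concave. This is the property behind the optimality claim: any other ordering of the same genotypes yields a curve that lies on or below this one. In practice the frequencies would be estimated on training data and the curve evaluated on independent data, since in-sample estimates tend to be optimistic.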