Weighted area under the receiver operating characteristic curve and its application to gene selection


Jialiang Li, Department of Statistics and Applied Probability, National University of Singapore, 117546 Singapore.
E-mail: stalj@nus.edu.sg


Summary.  The partial area under the receiver operating characteristic curve (PAUC) has been proposed for gene selection by Pepe and co-workers and thereafter applied in real data analysis. Empirical studies have revealed several key weaknesses of this measure, such as an inability to reflect non-uniform weighting of different decision thresholds and a tendency to produce large numbers of ties. We propose the weighted area under the receiver operating characteristic curve (WAUC) to address the problems associated with PAUC. The proposed measure offers greater flexibility in describing the discrimination accuracy of genes. Non-parametric and parametric estimation methods are introduced for WAUC, which includes PAUC as a special case, along with theoretical properties of the estimators. We also provide a simple variance formula, yielding a novel variance estimator for non-parametric estimation of PAUC, a task that has proven challenging in previous work. The proposed methods permit sensitivity analyses, whereby the effect of differing weight functions on gene rankings may be assessed and results may be synthesized across weights. Simulations and reanalysis of a well-known microarray data set illustrate the practical utility of WAUC.
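
As a rough illustration of the idea of a weighted area, the sketch below computes a plug-in estimate for a single gene. It assumes, as is not stated in this summary, that the measure has the form of the integral of ROC(u) w(u) over false positive rates u in [0, 1], with no particular normalization of the weight w. The function name `empirical_wauc` and the argument `weight_cdf` are hypothetical names chosen for this example, and the estimator shown is a generic empirical plug-in, not necessarily the estimator developed in the paper. Under this assumed form, a constant weight recovers the ordinary empirical area under the curve, and an indicator weight on [0, u0] gives an unnormalized partial area.

```python
import numpy as np

def empirical_wauc(controls, cases, weight_cdf):
    """Plug-in estimate of the integral of ROC(u) * w(u) over u in [0, 1].

    Assumed form (not taken from the summary): the weight enters through
    weight_cdf(s) = integral of w(u) from 0 to s, which must accept arrays.
    """
    controls = np.sort(np.asarray(controls, dtype=float))
    cases = np.asarray(cases, dtype=float)
    # Number of controls strictly below each case value.
    n_below = np.searchsorted(controls, cases, side="left")
    # Empirical false positive rate at each case value: the fraction of
    # controls at or above it (ties are counted against the marker).
    fpr_at_case = 1.0 - n_below / controls.size
    # Each case contributes the weight mass on [fpr_at_case, 1].
    return float(np.mean(weight_cdf(1.0) - weight_cdf(fpr_at_case)))

# Toy expression values for one hypothetical gene.
controls = [0.0, 1.0, 2.0, 3.0]
cases = [1.5, 3.5]

# Constant weight w(u) = 1: reduces to the Mann-Whitney form of the AUC.
auc = empirical_wauc(controls, cases, lambda s: s)

# Indicator weight on [0, 0.25]: an unnormalized partial area.
pauc = empirical_wauc(controls, cases, lambda s: np.minimum(s, 0.25))
```

Passing the cumulative weight rather than the weight itself keeps the computation exact for the step-shaped empirical curve, so no numerical integration grid is needed. Repeating the call with several choices of `weight_cdf` is one simple way to see how a gene's score changes with the weighting.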