AAPL: assessing association between p-value lists



Joint analyses of high-throughput datasets create a need to assess the association between two long lists of p-values. In such p-value lists, the vast majority of features are not significant. Ideally, the contributions of features that are null in both tests should be minimized. However, by random chance their p-values are uniformly distributed between 0 and 1, and weak correlations between the p-values may exist due to inherent biases in the high-throughput technology used to generate the multiple datasets. Rank-based agreement tests may capture such unwanted effects. Tests on contingency tables generated using hard cutoffs may be sensitive to the arbitrary choice of threshold. We develop a novel method based on feature-level concordance using the local false discovery rate. The association score has a straightforward interpretation. The method shows higher statistical power to detect the association between p-value lists in simulations. We demonstrate its utility through real data analysis. The R implementation of the method is available at http://userwww.service.emory.edu/~tyu8/AAPL/. © 2013 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 6: 144–155, 2013