Empirical extensions of the lasso penalty to reduce the false discovery rate in high‐dimensional Cox regression models
Abstract
Correct selection of prognostic biomarkers among multiple candidates is becoming increasingly challenging as the dimensionality of biological data grows. Therefore, minimizing the false discovery rate (FDR) is of primary importance, while a low false negative rate (FNR) is a complementary measure. The lasso is a popular selection method in Cox regression, but its results depend heavily on the penalty parameter λ. Usually, λ is chosen by maximizing the cross‐validated log‐likelihood (max‐cvl). However, this method often has a very high FDR. We review methods for a more conservative choice of λ. We propose an empirical extension of the cvl that adds a penalization term, which trades off the goodness‐of‐fit against the parsimony of the model, leading to the selection of fewer biomarkers and, as we show, to a reduction of the FDR without a large increase in the FNR. We conducted a simulation study considering null and moderately sparse alternative scenarios and compared our approach with the standard lasso and 10 other competitors: Akaike information criterion (AIC), corrected AIC, Bayesian information criterion (BIC), extended BIC, Hannan and Quinn information criterion (HQIC), risk information criterion (RIC), one‐standard‐error rule, adaptive lasso, stability selection, and percentile lasso. Across all scenarios, our extension achieved the best compromise between a reduction of the FDR and a limited increase in the FNR, followed by the AIC, the RIC, and the adaptive lasso, which performed well in some settings. We illustrate the methods using gene expression data of 523 breast cancer patients. In conclusion, we propose to apply our extension to the lasso whenever a stringent FDR with a limited FNR is targeted. Copyright © 2016 John Wiley & Sons, Ltd.
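To make the two error rates and the idea of a penalized choice of λ concrete, the following Python sketch may help. It is an illustration under stated assumptions, not the formula used in the paper: the abstract does not give the penalization term, so the criterion below uses a generic form, cvl(λ) minus a weight times the number of selected biomarkers. The function names `fdr_fnr` and `pick_lambda`, the `weight` parameter, and the definitions of FDR and FNR written in the docstring are all hypothetical choices made for this example.

```python
from typing import Sequence, Set, Tuple


def fdr_fnr(selected: Set[int], active: Set[int]) -> Tuple[float, float]:
    """Return (FDR, FNR) for one selection of biomarkers.

    Assumed definitions (not quoted from the paper):
      FDR = false positives / number selected (0 if nothing is selected)
      FNR = false negatives / number truly active (0 if none are active)
    """
    false_pos = len(selected - active)  # selected but not truly active
    false_neg = len(active - selected)  # truly active but missed
    fdr = false_pos / len(selected) if selected else 0.0
    fnr = false_neg / len(active) if active else 0.0
    return fdr, fnr


def pick_lambda(
    cvl: Sequence[float], n_selected: Sequence[int], weight: float
) -> int:
    """Index of the λ that maximizes cvl - weight * n_selected.

    This is a generic penalized criterion (an assumption for illustration);
    weight = 0 reduces to the usual max-cvl choice.
    """
    scores = [c - weight * k for c, k in zip(cvl, n_selected)]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical cvl values on a grid of three λ values, from the largest λ
# (fewest biomarkers selected) to the smallest λ (most biomarkers selected).
cvl_grid = [-100.0, -98.0, -97.5]
k_grid = [2, 5, 12]

best_plain = pick_lambda(cvl_grid, k_grid, weight=0.0)      # max-cvl
best_penalized = pick_lambda(cvl_grid, k_grid, weight=0.5)  # more parsimonious
```

With these made-up numbers, the unpenalized rule picks the λ that selects 12 biomarkers, whereas the penalized rule gives up a small amount of cvl and picks the λ that selects 5. This is the trade-off between goodness‐of‐fit and parsimony described above, in its simplest form.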