Volume 43, Issue 3
RESEARCH ARTICLE

Robust network‐based regularization and variable selection for high‐dimensional genomic data in cancer prognosis

Jie Ren

Department of Statistics, Kansas State University, Manhattan, Kansas

Yinhao Du

Department of Statistics, Kansas State University, Manhattan, Kansas

Shaoyu Li

Department of Mathematics and Statistics, University of North Carolina at Charlotte, Charlotte, North Carolina

Shuangge Ma

Department of Biostatistics, Yale University, New Haven, Connecticut

Yu Jiang

Division of Epidemiology, Biostatistics and Environmental Health, School of Public Health, University of Memphis, Memphis, Tennessee

Cen Wu

Corresponding Author

E-mail address: wucen@ksu.edu

Department of Statistics, Kansas State University, Manhattan, Kansas

Correspondence: Cen Wu, Department of Statistics, Kansas State University, Manhattan, KS. Email: wucen@ksu.edu

First published: 11 February 2019
Citations: 3

Abstract

In cancer genomic studies, an important objective is to identify prognostic markers associated with patients' survival. Network‐based regularization has been successful in variable selection for high‐dimensional cancer genomic data because it can incorporate the correlations among genomic features. However, survival times usually follow skewed distributions and are often contaminated by outliers, so network‐constrained regularization that does not account for robustness leads to false identification of network structure and biased estimation of patients' survival. In this study, we develop a novel robust network‐based variable selection method under the accelerated failure time model. Extensive simulation studies show the advantage of the proposed method over alternative methods. Two case studies of lung cancer datasets with high‐dimensional gene expression measurements demonstrate that the proposed approach identifies markers with important implications.
