Clustering of significant genes in prognostic studies with microarrays: Application to a clinical study for multiple myeloma

Authors

  • Shigeyuki Matsui,

    Corresponding author
    1. Department of Pharmacoepidemiology, School of Public Health, Kyoto University, Yoshida Konoe-cho, Sakyo-ku, Kyoto, Japan
    2. Translational Research Informatics Center, Foundation for Biomedical Research and Innovation, Minatojima-minami-machi, Chuo-ku, Kobe, Japan
    Correspondence: Department of Pharmacoepidemiology, School of Public Health, Kyoto University, Yoshida Konoe-cho, Sakyo-ku, Kyoto 606-8501, Japan
  • Takeharu Yamanaka,

    1. Cancer Statistics Laboratory, Institute for Clinical Research, National Kyushu Cancer Center, Fukuoka, Japan
  • Bart Barlogie,

    1. The Myeloma Institute for Research and Therapy, University of Arkansas for Medical Sciences, Little Rock, AR, U.S.A.
  • John D. Shaughnessy Jr,

    1. The Myeloma Institute for Research and Therapy, University of Arkansas for Medical Sciences, Little Rock, AR, U.S.A.
  • John Crowley

    1. Cancer Research and Biostatistics, Southwest Oncology Group Statistical Center, Seattle, WA, U.S.A.

Abstract

When a large number of genes are significant in correlating microarray gene expression data with patient prognosis, clustering of the significant genes may be effective not only for further dimension reduction but also for identifying co-regulated genes that belong to the same molecular pathway related to disease biology and aggressiveness. Moreover, a reduced feature, such as the average expression across samples for a cluster of significant genes, can play an important role in reducing variance in prediction analysis. We propose a simple procedure to select gene clusters that have strong marginal association with survival outcome from a large pool of candidate hierarchical clusters of significant genes. The selected gene clusters can have better predictive capability than other gene clusters and singleton genes. We apply the procedure to a data set from a clinical study of patients with multiple myeloma with associated microarray data. Copyright © 2007 John Wiley & Sons, Ltd.
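
The general idea in the abstract can be illustrated with a small sketch. It is not the authors' procedure, whose details are not given here. Everything specific in it is an assumption made for illustration: the toy data, the function names (candidate_clusters, cluster_average, cox_score_z), average linkage on 1 minus Pearson correlation, treating every merged node of the dendrogram as a candidate cluster, taking the cluster feature to be the mean over the cluster's genes within each sample, and measuring marginal association with a univariate Cox score statistic at beta = 0 (Breslow handling of ties).

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def candidate_clusters(expr):
    """Naive average-linkage clustering of genes (rows of expr).

    Returns every merged node of the dendrogram as a tuple of gene indices.
    Assumed distance: 1 - Pearson correlation.
    """
    n = len(expr)
    dist = {(i, j): 1 - pearson(expr[i], expr[j])
            for i, j in combinations(range(n), 2)}

    def linkage(a, b):
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    active = [(i,) for i in range(n)]
    merged = []
    while len(active) > 1:
        a, b = min(combinations(active, 2), key=lambda pair: linkage(*pair))
        active.remove(a)
        active.remove(b)
        node = tuple(sorted(a + b))
        active.append(node)
        merged.append(node)
    return merged


def cluster_average(expr, cluster):
    """Reduced feature: mean expression over the cluster's genes, per sample."""
    n_samples = len(expr[0])
    return [sum(expr[g][s] for g in cluster) / len(cluster)
            for s in range(n_samples)]


def cox_score_z(x, time, event):
    """Univariate Cox score statistic at beta = 0 (Breslow ties)."""
    u = v = 0.0
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:
            continue  # censored samples contribute only through risk sets
        risk = [x[j] for j, t_j in enumerate(time) if t_j >= t_i]
        m = sum(risk) / len(risk)
        u += x[i] - m
        v += sum((r - m) ** 2 for r in risk) / len(risk)
    return u / sqrt(v) if v > 0 else 0.0


# Hypothetical toy data: 4 "significant" genes (rows) measured in 6 patients.
expr = [
    [6.0, 5.1, 4.2, 2.9, 2.0, 1.1],
    [5.8, 5.3, 3.9, 3.2, 1.8, 0.9],
    [1.0, 3.0, 1.2, 2.8, 0.9, 3.1],
    [1.1, 2.9, 1.0, 3.2, 1.1, 2.8],
]
time = [1, 2, 3, 4, 5, 6]    # follow-up times
event = [1, 1, 1, 1, 1, 0]   # 1 = death observed, 0 = censored

# Rank candidate clusters by the absolute score statistic of their average.
ranked = sorted(
    ((abs(cox_score_z(cluster_average(expr, c), time, event)), c)
     for c in candidate_clusters(expr)),
    reverse=True,
)
for z, cluster in ranked:
    print(cluster, round(z, 2))
```

In this toy example the cluster of genes 0 and 1, whose expression falls steadily with survival time, ranks first. Singleton genes could be scored with the same statistic for comparison; a real analysis would more likely rely on an established clustering and survival-analysis library than on this naive implementation.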
