Large-scale integration of microarray data reveals genes and pathways common to multiple cancer types

Authors

  • Noor B. Dawany,

    1. Center for Integrated Bioinformatics, School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104
  • Will N. Dampier,

    1. Center for Integrated Bioinformatics, School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104
  • Aydin Tozeren

    Corresponding author
    1. Center for Integrated Bioinformatics, School of Biomedical Engineering, Science and Health Systems, Drexel University, Philadelphia, PA 19104
    • Center for Integrated Bioinformatics, Drexel University, Bossone Research Building 711, 3102 Market Street, Philadelphia, PA 19104, USA

Abstract

Global gene expression analysis of cancer and healthy tissues typically yields large numbers of genes that are significantly altered in cancer. Such data, however, have been difficult to interpret because gene lists vary widely across laboratories and individual studies use small sample sizes. In this investigation, we compiled microarray data obtained from the same platform family from 84 laboratories, resulting in a database containing 1,043 healthy tissue samples and 4,900 cancer samples for 13 different tissue types. The primary cancers considered included adrenal gland, brain, breast, cervix, colon, kidney, liver, lung, ovary, pancreas, prostate and skin tissues. We normalized the data together and analyzed subsets to discover genes involved in normal-to-cancer transformation. Our integrated significance analysis of microarrays approach produced a list of the top 400 genes for each of the 13 cancer types. These lists were highly statistically enriched with genes already associated with cancer in research publications excluding microarray studies (p < 1.31E-12). The genes MT1M and RRM2 each appeared in nine lists of genes significantly altered in cancer, and TOP2A appeared in eight. In total, 132 genes were present in at least four gene lists, 11 of which had not previously been associated with cancer. These 132 genes include 17 metal ion binding proteins, 15 adenyl ribonucleotide binding proteins, six kinases and six transcription factors. Our results point to the value of integrating microarray data in the study of combination drug therapies targeting metastasis.
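
The cross-list comparison described in the abstract (finding genes present in at least four of the per-cancer gene lists) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `recurrent_genes`, the tissue keys and the placeholder gene identifiers (`G1`, `G2`, ...) are all hypothetical, and the toy lists are made up solely to show the counting step.

```python
from collections import Counter


def recurrent_genes(gene_lists, min_lists=4):
    """Return {gene: number of lists containing it} for genes found in
    at least `min_lists` of the supplied per-cancer gene lists.

    `gene_lists` maps a cancer type to its list of significant genes.
    (Hypothetical helper for illustration only.)
    """
    counts = Counter()
    for genes in gene_lists.values():
        # set() ensures a gene is counted once per list, even if duplicated
        counts.update(set(genes))
    return {gene: n for gene, n in counts.items() if n >= min_lists}


# Made-up toy input with placeholder gene identifiers (not real results)
toy_lists = {
    "breast": ["G1", "G2", "G3"],
    "colon": ["G1", "G2", "G4"],
    "lung": ["G1", "G3", "G5"],
    "liver": ["G1", "G2", "G2"],
}

shared = recurrent_genes(toy_lists, min_lists=3)
print(shared)
```

With real data, each value would be one of the 13 top-400 lists and `min_lists=4` would match the threshold quoted in the abstract.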
