Keywords:

  • cancer;
  • microarray;
  • large-scale;
  • meta-analysis

Abstract

  1. Top of page
  2. Abstract
  3. Material and Methods
  4. Results
  5. Discussion
  6. References
  7. Supporting Information

The global gene expression analysis of cancer and healthy tissues typically results in large numbers of genes that are significantly altered in cancer. Such data, however, have been difficult to interpret due to the high level of variation of gene lists across laboratories and the small sample sizes used in individual studies. In this investigation, we compiled microarray data obtained from the same platform family from 84 laboratories, resulting in a database containing 1,043 healthy tissue samples and 4,900 cancer samples for 13 different tissue types. The primary cancers considered included adrenal gland, brain, breast, cervix, colon, kidney, liver, lung, ovary, pancreas, prostate, skin and stomach tissues. We normalized the data together and analyzed subsets for the discovery of genes involved in normal to cancer transformation. Our integrated significance analysis of microarrays approach produced top 400 gene lists for each of the 13 cancer types. These lists were highly statistically enriched with genes already associated with cancer in research publications excluding microarray studies (p < 1.31E-12). The genes MT1M and RRM2 appeared in nine and TOP2A in eight lists of significantly altered genes in cancer. In total, there were 132 genes present in at least four gene lists, 11 of which were not previously associated with cancer. The list contains 17 metal ion and 15 adenyl ribonucleotide binding proteins, six kinases and six transcription factors. Our results point to the value of integrating microarray data in the study of combination drug therapies targeting metastasis.

Tens of thousands of microarray samples have accumulated in public access databases in the last decade.1–3 A large portion of such data is cancer-specific and therefore holds the promise of cancer-associated gene discovery based on thousands of samples (not tens or hundreds). Much of the cancer-associated microarray data in public domains comes without control samples. In fact, the data in Gene Expression Omnibus (GEO) is highly asymmetric, containing datasets with cancer microarray samples only and other datasets containing samples for healthy tissue but not cancer tissues. Conventional meta-scale approaches of integrating data, where laboratory results are combined after the datasets were analyzed, would not be useful in drastically increasing the sample sizes in the microarray analysis of cancer. Such analyses require the presence of both cancer and normal tissue samples in the same microarray dataset.

In our study, we used a large-scale approach to integrate microarray data from multiple laboratories by normalizing them together and then using the significance analysis of microarrays (SAM) method4 to identify the genes that are significantly altered in cancer compared to normal tissue (SAM genes) in each of 13 distinct tissues. Our methodology builds on our previous study, which revealed the predictive potential of integrated microarray data.5 Large-scale meta-analysis techniques applied to cancer have already been adopted by a few groups,6–9 each focusing on a single tissue type. Other studies merged all cancer microarray data regardless of tissue type into one group and controls into another10, 11 to identify gene sets associated with common cancer mechanisms. Our approach is unusual when compared to typical meta-analysis methods, but it allows for the integration of asymmetric microarray data for global gene expression.5 We asked to what extent the currently available microarray data can replicate the research literature on the molecular mechanisms of cancer. The automated text search algorithms we used point to a high level of agreement between our gene lists and the cancer-associated genes determined from the non-microarray research literature.

Using nearly 6,000 microarray samples, our study identifies 132 genes that are significantly altered in at least four distinct cancer types. Our study also presents a set of 271 genes that appear to be highly significant in comparisons of datasets consisting of cancer and normal tissues independent of tissue type. These sets have 74 genes in common and will potentially contribute to more detailed annotation of the genes in the cancer bioinformatics databases. Our study shows the value of large-scale compilation of microarray data in cancer research, as the inclusion of large amounts of microarray data from different labs helps eliminate the effects of lab-specific noise, increasing the reliability of the results.12

Material and Methods

Microarray dataset selection and normalization

An Affymetrix microarray database was constructed for normal and cancer samples obtained from 13 different solid tissues. The tissues considered were adrenal gland, brain, breast, cervix, colon, kidney, liver, lung, ovary, pancreas, prostate, skin and stomach. The microarray data contained a total of 4,900 cancer and 1,043 normal tissue samples acquired from 84 labs (Supporting Information Additional File 1). All the data were obtained from the publicly accessible GEO1, 2 and ArrayExpress3 online repositories. The inclusion criteria restricted the analysis to datasets hybridized on one of three comparable Affymetrix platforms (HG-U133A, HG-U133A 2.0 and HG-U133 Plus 2.0), for which raw cell intensity (CEL) files were available, and which contained at least 20 usable microarray samples. In addition, the results from each dataset had to have been previously published in a peer-reviewed study. No differentiation was made with respect to the different malignancies obtained from the same tissue.

The data were normalized using the refRMA algorithm13 with the platform-compatible custom ENTREZG chip description files (CDFs; version 12)14 to obtain Entrez gene intensities. Using refRMA, background adjustment was applied, and quantile normalization was performed on a training set used recently by Dawany and Tozeren.5 A training set containing a large number of samples hybridized to the newer Affymetrix HG-U133 Plus 2.0 platform was chosen. The training set was then used to compute the probe-level quantiles for the remaining data. Median polishing of the training set was finally used to adjust the normalized probe intensities of the remaining data. Data were then filtered to remove the genes not shared by the three platforms. In control experiments, the quantile–quantile (Q–Q) plots obtained by dividing the data into two groups, each composed of separate experiments, indicated that the data were properly normalized using refRMA.5 Finally, the gene intensities of replicate samples obtained from the same source were averaged across replicates. All data preprocessing was performed in MATLAB.15
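The idea of normalizing new arrays against quantiles learned from a training set can be illustrated with a minimal Python sketch. This is not the refRMA implementation and omits background adjustment and median polish; the function names `reference_quantiles` and `normalize_to_reference` and the toy intensities are hypothetical, and all arrays are assumed to contain the same number of probes.

```python
def reference_quantiles(training_arrays):
    """Rank-wise mean of the sorted intensities across the training arrays."""
    sorted_arrays = [sorted(a) for a in training_arrays]
    n = len(sorted_arrays)
    return [sum(vals) / n for vals in zip(*sorted_arrays)]


def normalize_to_reference(array, ref):
    """Replace each intensity by the reference value that has the same rank."""
    order = sorted(range(len(array)), key=lambda i: array[i])
    out = [0.0] * len(array)
    for rank, idx in enumerate(order):
        out[idx] = ref[rank]
    return out


# Toy data (invented): two training arrays and one new array, three probes each.
training = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
ref = reference_quantiles(training)
new_array = [10.0, 30.0, 20.0]
normalized = normalize_to_reference(new_array, ref)
```

Because the reference distribution is fixed by the training set, arrays added later can be normalized without reprocessing the arrays already in the database.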

Differential gene expression

The differential expression of genes between cancer tissues and the corresponding controls was investigated using the significance analysis of microarrays (SAM)4 as implemented in the samr package16 in R.17 The SAM test was applied individually to the microarray datasets specific to each of the 13 tissues under consideration. During each SAM test, 100 random iterations were performed to determine the false discovery rate (FDR) for each gene. To identify significant genes, the FDR was constrained to zero.
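The logic of a SAM-style test can be sketched in Python as follows. This is a simplified illustration and not the samr algorithm: it computes the relative difference d for each gene and estimates the FDR at a fixed cutoff by shuffling the class labels, whereas samr works with ordered statistics and a delta table. The function names, the fudge constant `s0 = 0.1`, the cutoff and the toy intensities are all assumptions made for the example.

```python
import math
import random


def sam_d(x, y, s0=0.1):
    """Relative difference d = (mean(y) - mean(x)) / (s + s0) for one gene."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss = sum((v - m1) ** 2 for v in x) + sum((v - m2) ** 2 for v in y)
    s = math.sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (m2 - m1) / (s + s0)


def permutation_fdr(genes, labels, cutoff, n_perm=100, s0=0.1, seed=0):
    """Count genes with |d| >= cutoff and estimate their FDR by label shuffling."""
    rng = random.Random(seed)

    def count_called(lab):
        called = 0
        for values in genes:
            x = [v for v, l in zip(values, lab) if l == 0]
            y = [v for v, l in zip(values, lab) if l == 1]
            if abs(sam_d(x, y, s0)) >= cutoff:
                called += 1
        return called

    observed = count_called(labels)
    if observed == 0:
        return 0, 0.0
    false_calls = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        false_calls += count_called(shuffled)
    # Average number of calls under shuffled labels, relative to observed calls.
    return observed, min(1.0, false_calls / n_perm / observed)


# Toy data (invented): rows are genes, columns are samples; 0 = normal, 1 = cancer.
genes = [
    [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1],  # clearly altered gene
    [2.0, 2.1, 1.9, 2.0, 2.0, 1.9, 2.1, 2.0],  # unchanged gene
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
observed, fdr = permutation_fdr(genes, labels, cutoff=5.0)
```

In this simplified setting, requiring an FDR of zero corresponds to choosing a cutoff that no shuffled labeling reaches.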

In addition, a general normal versus cancer test was conducted. To avoid over-representation and dominance of certain tissues, 10 arrays were randomly chosen from both the normal and tumor samples of each tissue to produce two datasets (cancer and control) for SAM analysis. The number 10 was determined by the smallest sample size available for any tissue (adrenal normal tissue). SAM genes were then identified following the aforementioned criteria. The random selection and differential expression process was repeated 100 times.
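The balanced random selection described above can be sketched in a few lines of Python. The function name `balanced_sample`, the sample identifiers and the tissue sizes below are hypothetical and serve only to illustrate the sampling scheme.

```python
import random


def balanced_sample(samples_by_tissue, per_tissue=10, seed=None):
    """Draw the same number of sample IDs from every tissue, without replacement."""
    rng = random.Random(seed)
    chosen = []
    for tissue in sorted(samples_by_tissue):
        chosen.extend(rng.sample(samples_by_tissue[tissue], per_tissue))
    return chosen


# Toy input (invented): sample IDs for three tissues of unequal size.
cancer = {
    "breast": [f"breast_{i}" for i in range(50)],
    "colon": [f"colon_{i}" for i in range(30)],
    "adrenal": [f"adrenal_{i}" for i in range(12)],
}
subset = balanced_sample(cancer, per_tissue=10, seed=1)
```

Applied to 13 tissues with `per_tissue=10`, this yields 130 samples per class; repeating the draw with different seeds gives the repeated iterations.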

Functional annotation of top ranked and conserved genes

The lists of top 400 SAM genes were obtained for each of the 13 tissues. The cutoff (top 400) was chosen to optimize the match between our predicted SAM lists and lists of cancer-associated genes obtained by us via automated text search of non-microarray PubMed (PM) abstracts.5 An enriched Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway profile was produced for each of the 13 tissues individually, at a p value ≤ 0.05 using DAVID Bioinformatics resources.

Consistent differential expression across tissues

Among the top 400 gene lists provided for each tissue, a subset of genes that were consistently differentially expressed was determined. These genes were selected provided they appeared to be significantly altered in at least four of the 13 tissues. Moreover, the top 400 genes from the general normal versus cancer comparisons were obtained for each of the 100 iterations. The frequency of occurrence of each of the genes appearing in any of the lists was calculated to determine those genes whose changes in expression were most concordant. The results of the two approaches were then compared.
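Both selection rules reduce to counting how often a gene occurs across lists. The following Python sketch shows one way to do this; the function names, the `min_fraction` parameter and the single-letter gene labels are hypothetical.

```python
from collections import Counter


def recurrent_genes(gene_lists, min_lists=4):
    """Return genes (with counts) present in at least `min_lists` tissue lists."""
    counts = Counter(g for genes in gene_lists.values() for g in set(genes))
    return {g: n for g, n in counts.items() if n >= min_lists}


def frequent_genes(iteration_lists, min_fraction=0.7):
    """Return genes present in at least `min_fraction` of the iterations."""
    counts = Counter(g for genes in iteration_lists for g in set(genes))
    needed = min_fraction * len(iteration_lists)
    return {g for g, n in counts.items() if n >= needed}


# Toy input (invented): per-tissue lists and per-iteration lists of gene labels.
top_lists = {
    "t1": ["A", "B"],
    "t2": ["A", "C"],
    "t3": ["A", "B"],
    "t4": ["A", "B"],
    "t5": ["B", "C"],
}
common = recurrent_genes(top_lists, min_lists=4)
iterations = [["A", "B"], ["A"], ["A", "C"], ["A", "B"]]
stable = frequent_genes(iterations, min_fraction=0.7)
```

Comparing the two approaches then amounts to intersecting the key set of `common` with `stable`.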

Cancer literature annotation of identified significant SAM genes

To determine which genes from our SAM lists were known to be associated with cancer, an automated text search was performed. For all the genes in the microarray platform, a Boolean text search using the official gene symbol and the term “cancer,” or its synonyms, was conducted using the PM interface. The results were limited to non-microarray literature by excluding all abstracts that contained the word “microarray.” In addition, the literature articles associated with each of these genes, as provided by the NCBI ftp site, were obtained.18 A list of PM IDs of all cancer, non-microarray literature was then acquired using the PM search “cancer NOT microarray” and was used to further determine which genes had been previously associated with cancer. Results from the two approaches were combined to provide comprehensive coverage of the known cancer literature. Our SAM lists were then annotated with these results, distinguishing those genes that were cited in relation to cancer at least once from those that had no cancer association. As a control, 100 random gene lists of equal size to the SAM lists under consideration were drawn from the same platform. The number of cancer-related genes was determined in each iteration. The mean and standard deviation were calculated from these values to obtain the parameters of a normal distribution. The expected value and the standard deviation were then used to compute the p values for the significant association of each of our cancer gene lists with the known non-microarray literature. Increasing the 100 iterations by 10 more iterations did not alter the aforementioned p values.
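The random-list control can be sketched in Python as follows. The sketch assumes a one-sided (upper-tail) normal p value, which the text does not state explicitly; the function name `enrichment_p_value`, the gene identifiers and the sizes of the toy platform and literature set are invented for illustration.

```python
import math
import random
import statistics


def enrichment_p_value(gene_list, known_cancer_genes, platform_genes,
                       n_random=100, seed=0):
    """Upper-tail normal p value for the overlap with literature-annotated genes."""
    rng = random.Random(seed)
    observed = len(set(gene_list) & known_cancer_genes)
    null_counts = []
    for _ in range(n_random):
        random_list = rng.sample(platform_genes, len(gene_list))
        null_counts.append(len(set(random_list) & known_cancer_genes))
    mu = statistics.mean(null_counts)
    sigma = statistics.stdev(null_counts)
    if sigma == 0:
        raise ValueError("random lists show no variation; cannot fit a normal")
    z = (observed - mu) / sigma
    # P(Z >= z) for a standard normal variable
    return 0.5 * math.erfc(z / math.sqrt(2))


# Toy data (invented): 1,000 platform genes, 300 with a cancer citation.
platform = [f"g{i}" for i in range(1000)]
known = set(platform[:300])
sam_list = platform[:36] + platform[900:904]  # 36 of 40 genes are annotated
p = enrichment_p_value(sam_list, known, platform)
```

A normal approximation makes it possible to report p values far smaller than 1/100, which an empirical p value from 100 random lists alone could not resolve.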

Results

Dataset

We used nearly 6,000 microarray samples to identify significant gene lists involving normal to cancer transformations in 13 distinct human tissue types. The distribution of samples across tissues is shown in Supporting Information Additional File 2 Table S1. Overall, we had 4,900 cancer samples and 1,043 normal tissue samples. The largest sample sets in the database belong to the breast, brain, colon and kidney tissues. Sample distributions were asymmetric, with many more cancer samples than normal tissue samples. Moreover, to increase sample sizes, we added to our large-scale database those datasets with only cancer or only normal tissue samples. This approach precluded the use of microarray meta-analysis techniques in which each dataset is normalized and analyzed separately. On the other hand, the merged SAM analysis used here best fits the recent trend of asymmetric growth in cancer samples in public-access microarray data. Restriction of analysis to comparable microarray chips allowed us to normalize and analyze samples in an integrated fashion without significantly reducing the number of samples that could be used in the analysis.

SAM genes and their match with research literature

The SAM gene lists obtained for the 13 distinct human tissues by setting the FDR to zero varied in length depending on the tissue. However, the top 400 genes in each gene list matched well with the cancer-associated gene literature obtained from experiments excluding microarrays (Table 1). Our automated text search algorithm described in the Material and Methods section showed that nearly 80% of the genes in these lists were previously associated with cancer in non-microarray studies. The p values for occurrence of these matches by chance were estimated by generating randomly chosen gene lists from the microarray chip and ranged from 2.86E-33 for adrenal tissue to 6.99E-12 for brain tissue. Next, we looked at those genes that occurred in multiple tissue-specific lists; their match with the literature was similarly difficult to explain by chance events. These results indicate the potential of microarray studies based on large sample sizes to recover many of the genes known to be associated with cancer. The choice of top 400 as a cutoff is somewhat arbitrary. Our data (not shown) indicated that the match between microarray predictions and literature was nearly optimal at this particular cutoff value.

Table 1. Overview of results: number of significant genes among the top 400 genes for the 13 tissues appearing in at least one (T 400), two (T2 400) or three (T3 400) tissues

Cellular pathways enriched for top 400 SAM genes

We projected the top 400 SAM gene lists for each tissue type onto KEGG19 cellular pathways and evaluated their statistical enrichment using DAVID.20, 21 We also generated random gene lists of the same size chosen from genes on the microarray chip and considered their enrichment as controls. Figure 1 shows the statistically enriched cellular pathways in the tissue-specific top 400 gene lists. The list includes pathways previously associated with cancer such as glycine, serine and threonine metabolism, the PPAR signaling pathway, DNA replication and extracellular matrix (ECM)–receptor interaction. The variation in the list of enriched pathways from tissue to tissue, as shown in Figure 1, is a reflection of the tissue-specific dimensions of cancer. Even when a pathway was enriched in other tissues but not in a given tissue, several genes in the pathway were still differentially expressed in that tissue, although the pathway as a whole did not reach statistical significance.

Figure 1. Pathway profiles of different cancer tissues. Heat map showing the significant pathway profiles for each of the 13 cancer tissues considered. The color scale represents the -log of the p value for the pathway enrichment using a p value cutoff of 0.05.

SAM genes in multiple gene lists

A total of 132 genes appeared in at least four of the top 400 SAM gene lists for the 13 tissue types considered (Fig. 2a). All but 11 of these genes were previously affiliated with cancer in the non-microarray research literature. Comparisons of tissues within the reproductive and digestive systems were conducted based on the commonly altered 132 genes. The number of similarly altered genes between each pair of tissues was calculated and represented as a percentage overlap (Fig. 2b). The results point to the large intersection between genes affected in breast cancer and ovarian cancer patients. Overall, the average intersection between significant gene lists of organs belonging to the same organ system (13.4%) was greater than the corresponding average for organs belonging to different organ systems (9.3%).
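One possible reading of the pairwise overlap calculation is sketched below in Python. It assumes that two tissues share a gene only when the gene is altered in the same direction in both, and that the percentage is taken relative to the full list of common genes; both choices are assumptions, as are the function name `percent_overlap` and the toy gene labels.

```python
from itertools import combinations


def percent_overlap(direction_by_tissue, common_genes):
    """Percent of the common genes altered in the same direction in each tissue pair."""
    result = {}
    for a, b in combinations(sorted(direction_by_tissue), 2):
        da, db = direction_by_tissue[a], direction_by_tissue[b]
        shared = sum(1 for g in common_genes
                     if g in da and g in db and da[g] == db[g])
        result[(a, b)] = 100.0 * shared / len(common_genes)
    return result


# Toy input (invented): direction of change per gene in each tissue.
common_genes = ["G1", "G2", "G3", "G4"]
directions = {
    "breast": {"G1": "up", "G2": "down", "G3": "up"},
    "ovary": {"G1": "up", "G2": "down", "G4": "up"},
    "colon": {"G1": "down", "G3": "up"},
}
overlap = percent_overlap(directions, common_genes)
```

Averaging the resulting values separately over tissue pairs within the same organ system and pairs from different organ systems gives the kind of comparison reported above.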

Figure 2. Commonly significant cancer genes. (a) Histogram depicting the distribution of the 132 common genes that occur in at least four out of the 13 tissues considered and (b) heat map illustrating the commonalities among the changes associated with different cancers emerging in the reproductive and digestive systems. The intensities indicate the percent of similarly altered genes that occur between two tissues among the list of 132 common genes that were significantly enriched in at least four tissues.

The list of 132 genes is presented in Table 2 along with the affiliated tissue types in which they appeared among the top 400 SAM genes. The table also identifies the upregulation or downregulation of the genes in each cancer tissue and annotates approved and experimental drugs targeting some of these genes as obtained from DrugBank.22, 23 The genes MT1M and RRM2 appear in nine and TOP2A appears in eight out of 13 tissues. These genes are followed in the list by genes that appear in at least seven cancer types: ADH1B, CDC20, CFD, GSTM5, CLEC3B, PRC1, MELK, ABCA8, UBE2C, KIF4A and RACGAP1. Among this list, TOP2A is currently targeted by seven approved drugs (Table 3). The gene EPHX2 is targeted by tamoxifen in the treatment of breast cancer and ESRRG by diethylstilbestrol for prostate cancer. Meanwhile, experimental drugs targeting CDC2 and TUBA1B are undergoing approval processes.

Table 2. Annotation of commonly altered genes: list of genes that are differentially expressed in at least four tissues and have been previously associated with cancer in the non-microarray literature
Table 3. Annotation of commonly altered genes: list of approved and experimental cancer drugs targeting commonly altered genes

The top 400 SAM genes that were found in at least four lists but have not been previously associated with cancer are shown in Table 4. The gene LPCAT1 appears in the lists for cervix, colon, kidney, pancreas and stomach. This enzyme mediates conversion of lysophosphatidylcholine (LPC) to phosphatidylcholine (PC), thereby playing a pivotal role in respiratory physiology. Other genes in Tables 4 and 5 are found in four SAM gene lists out of the 13 tissue types under study. Only BBOX1 was associated with an approved drug. Further studies are needed to annotate the potential roles of these genes in the progression of cancer.

Table 4. Annotation of new cancer genes: list of genes that are differentially expressed in at least four tissues and have not been previously associated with cancer in the non-microarray literature
Table 5. Annotation of new cancer genes: list of approved drugs targeting commonly altered genes that have not been previously associated with cancer

We used a second, alternative method to identify those genes that are common in the general pathway of cancer. We generated a cancer microarray database and a control database by randomly selecting 10 samples from each tissue type, resulting in a set of 130 cancer and 130 control samples. We then used SAM analysis to identify the top 400 significant genes and repeated this operation 100 times. The union of these 100 lists, each containing 400 genes, produced 1,411 genes, of which 44 are in KEGG's “Pathways in cancer.” The union of genes from the first 50 iterations produces a list of 1,235 genes, so additional iterations produce few new SAM genes. The p value associated with the intersection of the 1,411 genes with KEGG's “Pathways in cancer,” using the hypergeometric test with the platform genes as the background, was 0.0196. Of the 1,411 genes, 271 are found in at least 70% of the iterations, of which 12 are found in the pathways in cancer with a p value of 0.0208. Moreover, 74 genes out of the 271 appeared among the 132 genes listed in Tables 2 and 4. The p value for this overlap is 9.0763E-82. The list of 271 genes is provided as Supporting Information Additional File 2 Table S2. Taken together with these 132 genes, they can be used to extend and further annotate the general pathways of cancer.
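The hypergeometric test used for these overlaps can be written compactly in Python. The sketch below computes the upper-tail probability of observing at least k pathway genes in a list of N genes drawn from a background of M genes containing n pathway genes. The function name `hypergeom_upper_tail` and the toy numbers are invented; the background size of the platform is not given here, so the reported p values are not recomputed.

```python
from math import comb


def hypergeom_upper_tail(k, M, n, N):
    """P(X >= k) when N genes are drawn from M genes, n of which are in the pathway."""
    total = comb(M, N)
    upper = sum(comb(n, i) * comb(M - n, N - i)
                for i in range(k, min(n, N) + 1))
    return upper / total


# Toy example (invented): background of 10 genes, 4 in the pathway, list of 3 genes,
# at least 2 of which are in the pathway.
p_toy = hypergeom_upper_tail(k=2, M=10, n=4, N=3)
```

Exact integer arithmetic with `math.comb` avoids the rounding problems that can arise when very small tail probabilities are accumulated in floating point.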

Discussion

In our study, nearly 6,000 microarray samples were obtained from comparable Affymetrix platforms to investigate the commonalities as well as the tissue-specific components of normal to cancer transformations in 13 distinct tissue types. Obtaining such a large sample size was accomplished by adding highly asymmetric datasets into our microarray sample pool. In addition to symmetric data, following our recent method evaluation study,5 we also considered those datasets with large numbers of cancer samples and a small number (including zero) of control samples, and vice versa. Out of the 13 tissue types under study, only the breast, colon, kidney and pancreas tissues had three or more different datasets that included at least 10 cancer and 10 control samples.

Our approach is unusual in the sense that it does not fit typical meta-analysis techniques,7, 11, 24–27 in which each dataset needs to have both disease and control samples in sufficient numbers, and datasets are normalized and analyzed separately for significant genes. However, we have recently shown in a number of test cases that our approach yields gene lists better matching the cancer literature than meta-analysis techniques.5 Using the meta-analysis approach, Ramasamy et al.25 analyzed 21 distinct microarray datasets from 14 different cancer types comprising 419 control and 973 cancer samples. The minimum sample size for cancer and control in their study was seven, and some tissue types, such as renal tissue, appeared in only one dataset in their collection. The advantage of their method is its flexibility with respect to platforms: accepting several platforms increases the sample size. Because we focused on a set of comparable platforms, our results are not directly comparable. Nevertheless, Ramasamy et al.25 published five upregulated and five downregulated genes as most significantly associated with cancer. Among this list of 10 genes, four (TMEM136, RBM15, FGD4 and KIAA1881) are not part of the minimal platform considered in our study, suggesting that as the data in public-access microarray repositories grow, datasets used in our approach will be restricted to the latest versions of platforms containing many more probes. Of the remaining six genes, our top 400 lists confirmed the downregulation of PRKAR2B and GPM6B in four different tissues. Genes MYOM2 and RBCK1 in their 10-gene list were SAM genes in multiple lists in our study but were in the top 400 only in the liver gene list. Similarly, ALG3 did not appear in any of our top 400 gene lists but was significantly upregulated in six of the 13 tissues in our complete SAM lists.

The last gene in their list, IRAK1, was a top 10 ranking gene in our pancreas SAM gene list. However, this gene was downregulated in the pancreas as well as in five more tissues in our study, as opposed to the upregulation reported for the gene by Ramasamy et al.25 Note that our study contained 106 cancer and 71 normal pancreatic tissue microarray samples as opposed to the 12 tumor and seven normal microarray samples in Ref. 25. It is not feasible to summarize the comparison with a p value because the gene list presented in Ref. 25 contains only 10 genes whereas our various gene lists contain hundreds of genes. Nevertheless, it is clear that the two approaches could potentially produce gene lists whose intersection is highly unlikely to be a random event.

Our approach takes advantage of the rapid increase of asymmetric datasets in public-access microarray repositories. Moreover, gene lists predicted using these large asymmetric data reproduce much of the research literature on cancer-associated genes obtained by experimental methods other than microarrays. Our analysis predicts 132 genes as significantly altered in normal to cancer transformation in at least four tissue types, and out of this list, 121 were previously annotated in the literature as cancer associated. The remaining 11 genes are potential targets for further studies in cancer research. Note also that 74 out of the 132 genes in the list also appear in at least 70% of the SAM gene lists generated by comparing normal and cancer datasets comprising 10 randomly chosen samples from each tissue type. The two gene lists presented in our study for cancer-associated genes with multiple tissue specificity will further contribute to the annotation of pathways of cancer. Recently emerging annotation-based microarray data tools such as A-MADMAN28 will help in the compilation of large-scale microarray data for studying complex diseases and for biomarker discovery and drug development.

In conclusion, we used nearly 6,000 microarray samples and identified a total of 329 genes that appeared as highly significant in normal to cancer transformation with regard to multiple cancer types. The gene list consists largely of genes that have already been associated with cancer in the research literature excluding microarray studies. The list can be used in detailed annotation of cancer pathways. In addition, due to the inclusion of numerous subtypes and cancer grades, the genes in this list can serve as potential targets for new drugs against metastasis.

References

  1. Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R. NCBI GEO: mining tens of millions of expression profiles—database and tools update. Nucleic Acids Res 2007; 35: D760–5.
  2. Edgar R, Domrachev M, Lash AE. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 2002; 30: 207–10.
  3. Brazma A, Parkinson H, Sarkans U, Shojatalab M, Vilo J, Abeygunawardena N, Holloway E, Kapushesky M, Kemmeren P, Lara GG, Oezcimen A, Rocca-Serra P, et al. ArrayExpress—a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 2003; 31: 68–71.
  4. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci USA 2001; 98: 5116–21.
  5. Dawany NB, Tozeren A. Asymmetric microarray data produces gene lists highly predictive of research literature on multiple cancer types. BMC Bioinformatics 2010; 11: 483.
  6. Choi JK, Choi JY, Kim DG, Choi DW, Kim BY, Lee KH, Yeom YI, Yoo HS, Yoo OJ, Kim S. Integrative analysis of multiple gene expression profiles applied to liver cancer study. FEBS Lett 2004; 565: 93–100.
  7. Rhodes DR, Barrette TR, Rubin MA, Ghosh D, Chinnaiyan AM. Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Res 2002; 62: 4427–33.
  8. Sanga S, Broom BM, Cristini V, Edgerton ME. Gene expression meta-analysis supports existence of molecular apocrine breast cancer with a role for androgen receptor and implies interactions with ErbB family. BMC Med Genomics 2009; 2: 59.
  9. Gorlov IP, Byun J, Gorlova OY, Aparicio AM, Efstathiou E, Logothetis CJ. Candidate pathways and genes for prostate cancer: a meta-analysis of gene expression data. BMC Med Genomics 2009; 2: 48.
  10. Xu L, Geman D, Winslow RL. Large-scale integration of cancer microarray data identifies a robust common cancer signature. BMC Bioinformatics 2007; 8: 275.
  11. Rhodes DR, Yu J, Shanker K, Deshpande N, Varambally R, Ghosh D, Barrette T, Pandey A, Chinnaiyan AM. Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proc Natl Acad Sci USA 2004; 101: 9309–14.
  12. Warnat P, Eils R, Brors B. Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes. BMC Bioinformatics 2005; 6: 265.
  13. Katz S, Irizarry RA, Lin X, Tripputi M, Porter MW. A summarization approach for Affymetrix GeneChip data using a reference training set from a large, biologically diverse database. BMC Bioinformatics 2006; 7: 464.
  14. Dai MH, Wang PL, Boyd AD, Kostov G, Athey B, Jones EG, Bunney WE, Myers RM, Speed TP, Akil H, Watson SJ, Meng F. Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res 2005; 33: e175.
  15. Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003; 4: 249–64.
  16. Tibshirani R, Chu G, Hastie T, Narasimhan B. samr: SAM: Significance Analysis of Microarrays. R package version 1.26. 2008. http://www-stat.stanford.edu/~tibs/SAM.
  17. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2008.
  18. Maglott D, Ostell J, Pruitt KD, Tatusova T. Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 2007; 35: D26–D31.
  19. Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M. From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res 2006; 34: D354–7.
  20. Dennis G, Jr, Sherman BT, Hosack DA, Yang J, Gao W, Lane HC, Lempicki RA. DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol 2003; 4: P3.
  21. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 2009; 4: 44–57.
  22. Wishart DS, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, Gautam B, Hassanali M. DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Res 2008; 36: D901–6.
  23. Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey J. DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 2006; 34: D668–72.
  24. DeConde RP, Hawley S, Falcon S, Clegg N, Knudsen B, Etzioni R. Combining results of microarray experiments: a rank aggregation approach. Stat Appl Genet Mol Biol 2006; 5: Article 15.
  25. Ramasamy A, Mondry A, Holmes CC, Altman DG. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS Med 2008; 5: e184.
  26. Pihur V, Datta S. RankAggreg, an R package for weighted rank aggregation. BMC Bioinformatics 2009; 10: 62.
  27. Hong F, Breitling R, McEntee CW, Wittner BS, Nemhauser JL, Chory J. RankProd: a bioconductor package for detecting differentially expressed genes in meta-analysis. Bioinformatics 2006; 22: 2825–7.
  28. Bisognin A, Coppe A, Ferrari F, Risso D, Romualdi C, Bicciato S, Bortoluzzi S. A-MADMAN: annotation-based microarray data meta-analysis tool. BMC Bioinformatics 2009; 10: 201.

Supporting Information

Additional Supporting Information may be found in the online version of this article.

Filename                      Format  Size  Description
IJC_25854_sm_suppinfo1.xlsx           695K  Supporting Information 1
IJC_25854_sm_suppinfo2.docx           494K  Supporting Information 2

Please note: Wiley Blackwell is not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing content) should be directed to the corresponding author for the article.