Gastrointestinal stromal tumors (GISTs) historically were grouped with leiomyosarcomas (LMSs) based on their morphologic similarities; however, recently, GIST was established unequivocally as a distinct type of sarcoma based on its molecular features and response to imatinib treatment.
To gain further insight into the genomic differences between GISTs and LMSs, the authors mapped gene copy number aberrations (CNAs) in 42 GISTs and 30 LMSs and integrated the results with gene expression profiles.
Distinct patterns of CNAs were revealed between GISTs and LMSs. Losses in 1p, 14q, 15q, and 22q were significantly more frequent in GISTs than in LMSs (P < .001); whereas losses in chromosomes 10 and 16 and gains in 1q, 14q, and 15q (P < .001) were more common in LMSs. By integrating CNAs with gene expression data and clinical information, the authors identified several clinically relevant CNAs that were prognostic of survival in patients with GIST. Furthermore, GISTs were categorized into 4 groups according to an accumulating pattern of genetic alterations. Many key cellular pathways were expressed differently in the 4 groups, and the patients in each group had increasingly worse prognoses as the extent of genomic alterations increased.
Gastrointestinal stromal tumors (GISTs) previously were grouped with spindle cell and other soft-tissue sarcomas, including leiomyosarcoma (LMS).1 However, in recent years, GIST has emerged as a distinct mesenchymal tumor type that frequently is associated with a gain-of-function mutation in the v-kit Hardy-Zuckerman 4 feline sarcoma viral oncogene homolog (KIT) gene (80%-85% of GISTs) or the platelet-derived growth factor receptor-alpha (PDGFRA) gene (5%-7% of GISTs).1-3 The presence of these mutations allows for targeted therapy using imatinib (Gleevec, STI-571; Novartis Pharmaceuticals, Basel, Switzerland), which has demonstrated efficacy in 60% to 80% of patients with GISTs.4 Conversely, LMSs are not associated with KIT gene mutations or overexpression and do not benefit from imatinib therapy. The treatment of patients with LMS using contemporary cytotoxic chemotherapy has resulted in a 53% objective response rate, whereas patients with GIST who received traditional cytotoxic chemotherapy have not had a measurable response.4, 5 Although mutations in KIT and PDGFRA explain why 60% to 80% of patients with GIST initially benefit from imatinib, the duration of benefit that patients receive from this therapy remains considerably variable. Furthermore, although it is rare, some patients with KIT exon 11 mutations are resistant to imatinib; and secondary mutations of KIT reportedly have occurred in patients who initially responded to imatinib therapy.6 Thus, robust and biologically relevant prognostic factors, especially those for predicting the survival of patients with GIST, still are needed.
Growing evidence indicates that the accumulation of specific genetic alterations ultimately leads to a highly unstable underlying genome in cancer development and progression.7 Although some recurrent changes in GIST and LMS genomes have been investigated before, the deficiencies of early measurement technologies and the small sample sizes that were used in early studies make it necessary to accumulate genomic information from additional samples and to create a more refined map of the recurrent aberrations. Toward this objective, we conducted a comprehensive, high-resolution, whole-genome array comparative genomic hybridization (aCGH) analysis to map the recurrent copy number aberrations (CNAs) in GISTs and LMSs. We also investigated the clinical relevance of our results in an integrative analysis of the CNAs, gene expression profiles, and patient survival information. The results from this study led us to propose a new tumor-progression genetic staging system termed genomic instability stage (GIS) to complement the current GIST staging system, which is based on tumor size, mitotic index (MI), and KIT mutation.
MATERIALS AND METHODS
Primary Tumors and Pathologic Evaluation
In total, 72 primary tumors, including 42 GISTs and 30 LMSs, were acquired from surgical specimens from 1989 through 2005 at The University of Texas M. D. Anderson Cancer Center under an institutional review board–approved protocol. For transcriptome analysis, high-quality RNA was acquired from 32 GISTs and 25 LMSs. For genomic profiling, we used these samples as well as 15 additional samples (10 GISTs and 5 LMSs). The diagnoses were made on the basis of clinicopathologic evaluation and molecular marker studies. The clinical information is summarized in Table 1.
Table 1. Clinical Information on Patients With Gastrointestinal Stromal Tumors and Leiomyosarcomas
Genomic DNA from tumors and pooled normal tissue was isolated according to standard procedure. Labeled genomic DNA was hybridized to the Agilent Human Genome CGH Microarray 4x44 Kit according to the manufacturer's instructions (Agilent Technologies, Palo Alto, Calif). The data were extracted from microarrays with Agilent Feature Extraction software version 9.5 using the default settings and were analyzed further with MATLAB version R2007b (The MathWorks, Inc., Natick, Mass) and R statistical software (version 2.6.2; R Development Core Team, available at http://cran.r-project.org/; accessed August 26, 2010). Intensity values were lowess-normalized to compensate for common nonlinear biases. Ratios of normalized intensity values from tumor tissues and normal tissue were transformed to log2-space. Then, log-ratio data were subjected to a circular binary segmentation algorithm8 (R implementation DNAcopy; version 1.6.0) to reduce the effect of noise. The CGHcall algorithm9 (version 1.2.2 in R) was used to label the segments as lost, normal, or gained.
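The log2-ratio transform and the labeling of segments described above can be illustrated with a minimal sketch. It is not the pipeline used in the study: lowess normalization and circular binary segmentation are omitted, and the fixed cutoffs of -0.2 and 0.2 are illustrative assumptions that stand in for the CGHcall model. The function names and the example intensities are hypothetical.

```python
import math


def log2_ratios(tumor, normal):
    """Per-probe log2(tumor / normal) from already normalized intensities."""
    return [math.log2(t / n) for t, n in zip(tumor, normal)]


def call_segment(mean_log_ratio, loss_cutoff=-0.2, gain_cutoff=0.2):
    """Label a segment mean as lost, normal, or gained.

    The cutoffs are assumptions chosen for illustration only; they are a
    fixed-threshold stand-in and not the CGHcall method.
    """
    if mean_log_ratio <= loss_cutoff:
        return "lost"
    if mean_log_ratio >= gain_cutoff:
        return "gained"
    return "normal"


# Hypothetical normalized intensities for three probes.
tumor = [500.0, 1000.0, 2000.0]
normal = [1000.0, 1000.0, 1000.0]

ratios = log2_ratios(tumor, normal)
calls = [call_segment(r) for r in ratios]
print(ratios, calls)
```

In this toy input, the first probe has half the reference intensity and the third has twice the reference intensity, so they fall below and above the assumed cutoffs, respectively.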
Gene expression data were measured by using whole human genome oligo arrays with 44-K 60-mer probes (Agilent Technologies) with 500 ng of total RNA starting material according to the manufacturer's protocol. Arrays were scanned with the Agilent dual laser-based scanner. Features were extracted from arrays with Agilent Feature Extraction software (version 8.0). The expression data were quantile normalized. Both aCGH and gene expression data are available at http://www.cs.tut.fi/sgn/csb/GISTLMS/ accessed August 26, 2010.
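Quantile normalization, as mentioned above, forces every array to share the same empirical distribution. The following is a minimal sketch in plain Python, assuming one list of expression values per array and breaking ties by probe order; the function name and the toy values are hypothetical and do not reproduce the software used in the study.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equal-length arrays (one list per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by probe order (an assumption made for
    simplicity).
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)
    ]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Two hypothetical arrays measured on the same three probes.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
print(normalized)
```

After normalization, both arrays contain the same set of values, and only the order of the values (the rank of each probe) differs between arrays.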
DNA sequences were classified as recurrently aberrant if the number of samples that carried the aberration exceeded a threshold of statistical significance, as estimated using a permutation test. The 95th percentile values were chosen as the threshold of significance. By using this procedure, we estimated that similar aberrations in at least 14 samples (33%) for GISTs and 12 samples (40%) for LMSs were required for a sequence to be called recurrently aberrant. Probe average recurrence (PAR) was used to quantify the aberration rate of a recurrently aberrant DNA segment. The PAR is calculated by averaging the aberration rate over the probes in a contiguous, recurrently aberrant DNA segment. Differences in aberration frequencies between GIST and LMS were tested independently with the Fisher exact test for each probe. To account for the resulting multiple comparisons problem, the level of significance in these tests was set to .001. Differential expression between sample sets was determined with the Wilcoxon rank-sum test with a threshold of .05. In finding the subgroups within the GIST samples, hierarchical clustering with inner squared distance linkage (the Ward minimum variance method) was applied. The most informative genes for clustering were selected by using a 2-tailed t test. In estimating patient survival curves, Kaplan-Meier survival estimators were applied. A Mantel-Cox test also was used to determine the statistical significance of the difference of these survival estimators. A significance threshold of .05 was selected for all survival tests. A hypergeometric distribution with a significance threshold of .05 was used in computing gene set enrichments.
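The recurrence threshold and the PAR can be sketched as follows. The permutation scheme is not specified in the text, so this sketch assumes that probe calls are shuffled independently within each sample and that the threshold is the 95th percentile of the resulting per-probe counts. All function names, parameters, and the toy call matrix are hypothetical.

```python
import random


def recurrence_counts(calls, state):
    """calls[s][p] is the call of sample s at probe p.

    Returns, for each probe, the number of samples whose call equals `state`.
    """
    n_probes = len(calls[0])
    return [sum(1 for sample in calls if sample[p] == state)
            for p in range(n_probes)]


def permutation_threshold(calls, state, n_perm=200, percentile=0.95, seed=0):
    """Null-distribution percentile of per-probe recurrence counts.

    Assumption: the null is built by shuffling probe positions within each
    sample, which keeps each sample's aberration burden fixed.
    """
    rng = random.Random(seed)
    null_counts = []
    for _ in range(n_perm):
        shuffled = []
        for sample in calls:
            permuted = list(sample)
            rng.shuffle(permuted)
            shuffled.append(permuted)
        null_counts.extend(recurrence_counts(shuffled, state))
    null_counts.sort()
    index = min(len(null_counts) - 1, int(percentile * len(null_counts)))
    return null_counts[index]


def probe_average_recurrence(counts, start, end, n_samples):
    """Mean aberration rate over the probes start..end-1 of one segment."""
    segment = counts[start:end]
    return sum(c / n_samples for c in segment) / len(segment)


# Hypothetical calls for 4 samples at 5 probes.
calls = [
    ["lost", "lost", "normal", "normal", "normal"],
    ["lost", "lost", "lost", "normal", "normal"],
    ["lost", "lost", "normal", "normal", "gained"],
    ["normal", "lost", "normal", "normal", "normal"],
]
loss_counts = recurrence_counts(calls, "lost")
threshold = permutation_threshold(calls, "lost")
par = probe_average_recurrence(loss_counts, 0, 2, len(calls))
print(loss_counts, threshold, par)
```

In the toy matrix, the first two probes are lost in 3 and 4 of 4 samples, so the PAR of that two-probe segment is the mean of 0.75 and 1.0.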
Gastrointestinal Stromal Tumors and Leiomyosarcomas Have Distinct Differences in Their Genomes
After performing comprehensive aCGH profiling experiments with 42 primary GISTs and 30 primary LMSs, we analyzed the recurrent CNAs in these tumors. Our analysis revealed several distinct loci throughout the genome that frequently were aberrant in GISTs (Fig. 1A) and in LMSs (Fig. 1B), similar to the reported data.10 Statistical comparisons of the GIST and LMS genomes revealed that losses in chromosomes 1p, 14q, 15q, and 22q were significantly more frequent in GISTs than in LMSs (P < .001), whereas losses in chromosomes 10 and 16 were more common in LMSs (P < .001). We not only confirmed previous CNAs, such as loss of 1p, 14q, 15q, and 22q,10-21 but also demonstrated that the deletion of 22q was the most common recurrent deletion in GISTs (84% PAR): parts of 22q were deleted in >95% of GIST samples, a rate that was significantly higher than previously reported data.12, 21 In addition, although losses in 1p were common in both sarcoma types, many more and much larger deletions in 1p were observed in GISTs than in LMSs. In comparison, tumors from patients with LMS more frequently had gains in chromosomes 1q, 14q, and 15q (P < .001).
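The per-probe comparison of aberration frequencies between the 2 tumor types relies on the Fisher exact test. A minimal 2-sided implementation in plain Python is sketched below; the counts in the example (a loss in 35 of 42 GISTs and in 5 of 30 LMSs) are hypothetical and are not taken from the study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    low = max(0, col1 - row2)
    high = min(row1, col1)
    # The small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical probe: lost in 35 of 42 GISTs and in 5 of 30 LMSs.
p_value = fisher_exact_two_sided(35, 42 - 35, 5, 30 - 5)
print(p_value, p_value < 0.001)
```

With the significance level of .001 stated in the methods, a probe with counts like these would be reported as differentially aberrant between the 2 tumor types.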
From the aberration profiles illustrated in Figure 1, we created a gene-level map of the recurrent CNAs in GISTs and LMSs. In total, 328 recurrently aberrant segments of DNA were identified in GISTs (202 gains and 126 losses), and 373 were identified in LMSs (194 gains and 179 losses) based on the PAR, which was defined as the average recurrence rate of the probes that were included in a segment. We matched CNAs with corresponding gene expression profiles and identified which genes had expression that was correlated significantly with gene dosage. Next, we investigated the effect of each dosage-sensitive gene on patient survival and identified which recurrent CNAs harbored at least 1 dosage-sensitive gene that was correlated significantly with patient survival (Table 2). These clinically relevant CNA segments and the putative target genes offer a promising starting point from which functional validations can be carried out in future studies.
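The idea of a dosage-sensitive gene can be illustrated by correlating copy number calls with expression across samples. The text does not state which correlation statistic or significance test was used, so the Pearson coefficient below is an assumption made only for illustration; the function name and the toy values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical gene: copy number calls (-1 = lost, 0 = normal, 1 = gained)
# and expression values for the same six samples.
copy_number = [-1, -1, 0, 0, 1, 1]
expression = [1.0, 1.2, 2.0, 2.1, 3.0, 3.2]

r = pearson(copy_number, expression)
print(round(r, 3))
```

A gene whose expression rises and falls with its copy number, as in this toy example, gives a coefficient close to 1 and would be a candidate dosage-sensitive gene under this assumed criterion.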
Table 2. Copy Number Aberrations That Harbored at Least 1 Dosage-Sensitive Gene Associated With a Poor Prognosis in Patients With Gastrointestinal Stromal Tumors and Leiomyosarcomas
Genomic Instability Stage May Be a Valuable Prognostic System for Gastrointestinal Stromal Tumors
We performed a cluster analysis in an attempt to identify clinically relevant subgroups that were defined by chromosome-level CNAs. In contrast to LMS, which did not cluster well into clear genomic subtypes, GIST aberration profiles revealed 4 distinct groups with various degrees of genetic alterations (n1 = 12, n2 = 8, n3 = 12, n4 = 10) (Fig. 2A). A survival analysis of these groups revealed that patient survival is increasingly worse with the presence of more and more genomic aberrations (Fig. 2B). All 4 groups feature partial losses in distal 1p, 19, and 22q, which suggests that these deletions may be early events in GIST development. The defining chromosome-scale difference between Group 1 (with the fewest aberrations) and Group 2 (with slightly more aberrations than Group 1) is the added deletion of chromosome 14q in Group 2. Patients with tumors classified into Group 1 or 2 have significantly longer survival (Fig. 2C) than patients with Group 3 or Group 4 tumors (which feature more aberrations than the tumors in Groups 1 and 2). Group 3 harbors the same aberrations that characterize Groups 1 and 2 but also has additional deletions of chromosome 15q and the proximal part of chromosome 1p. Tumors in Group 4 are distinguished from tumors in Group 3 by the additional loss of chromosome 10. Although Group 4 retains the characteristics of the first 3 groups, it also contains a more diverse set of tumors, which is apparent in the more heterogeneous pattern of CNAs compared with the other 3 groups. This also is reflected in the survival estimate, which falls between the first 2 groups and the third group.
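The survival curves compared between groups are Kaplan-Meier estimates. A minimal sketch of the estimator in plain Python follows, assuming follow-up times with an event indicator (1 = death observed, 0 = censored). The function name and the follow-up data are hypothetical, and the Mantel-Cox comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up time of each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for tt, e in zip(times, events) if tt == t and e == 1)
        leaving = sum(1 for tt in times if tt == t)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up (in months) for five patients in one group.
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

Censored patients reduce the number at risk without lowering the curve, which is why the step at month 3 is computed from 3 patients at risk rather than 4.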
These results lead us to propose a new tumor-progression genetic staging system termed genomic instability stage (GIS) to complement the current prognostic staging system for GIST based on tumor size, MI, and KIT mutation. Although we did not have MI information for all patients in the study, we did have sequencing data on KIT and PDGFRA gene mutations (Fig. 2A). This allowed us to investigate the relation between the mutation of these genes, especially KIT, and genomic instability manifested by the accumulation of CNAs. The high mutation rate of KIT exon 11 in Groups 1 and 2 suggested that KIT mutation is an early event in GIST, which is consistent with its role as a driver oncogene. Increased KIT mutation frequency was observed in Groups 3 and 4, and this observation was consistent with reports that secondary mutations of KIT occur at later stages in GIST progression.6 Imatinib-treated patients who had mutations in KIT exon 11 survived significantly longer (in a group-independent manner) than patients who had the same mutation but did not receive imatinib (P = .002) (Fig. 2D), suggesting that the differences in genomic survival estimators were not affected significantly by either imatinib treatment or KIT mutations. Furthermore, other common risk-assessment and clinical parameters, such as patient age, tumor size, sex, primary site, and the presence of metastases, were not correlated significantly with the groups. Because of the lack of data on MI, we were unable to fully compare the existing risk-assessment system4 with the genomic stages. Further prospective characterization of these genomic profiles, coupled with full risk assessment (based on 2007 National Comprehensive Cancer Network guidelines) and clinical outcome, is needed to validate this proposed model.
The incremental occurrence of the observed CNA patterns, increasing KIT mutations, and independence from clinical parameters other than survival suggested that the 4 GIS stages reflect the progressive accumulation of chromosome-scale genetic abnormalities during GIST progression. We further confirmed the sequential nature of the chromosome-scale events by determining which aberrations are the most prominent in each stage. That analysis clearly revealed that losses of 1p, 14q, 15q, 19, and 22q are the most distinct events in the 4 groups, although many smaller scale events also may play a critical role in GIST progression (Fig. 2E). Notably, the prominence of less aberrated chromosomes also increased from the first stage to the third, as observed in the amplification of chromosomes 3, 4, 5, and 6. In addition, the analysis revealed that dosage-sensitive genes in critical segments also changed their expression in a corresponding manner between the hypothesized GIS stages (Fig. 3A-D). From the gene expression rates, we could clearly observe copy number changes in the chromosomes that harbored the genes, such as loss of 15q for the A-kinase anchor protein 13 (AKAP13) gene and the chromosome 15 open reading frame 5 (C15orf5) gene, loss of 14q for the oxidase (cytochrome c) assembly 1-like (OXA1L) gene, and gains in 3q for the switch/sucrose nonfermentable-related, matrix-associated, actin-dependent regulator of chromatin subfamily A member 3 (SMARCA3) gene. Because these are genes that significantly affect survival, these possible target genes also ultimately may be responsible for the worse outcome observed for patients in Groups 3 and 4.
Different Cellular Pathways Are Altered in Gastrointestinal Stromal Tumors With Different Genomic Instability Stages
The genes that were expressed differently between 2 adjacent GIS groups were used in an enrichment analysis with the objective of finding the biologic processes that were altered significantly during the progression from 1 stage to the next. We used the list of biologic processes in the Gene Ontology database22 as our reference. Differences in genome level translate into several distinct cancer-related processes at the transcriptome level (Fig. 4). The changes from GIS1 to GIS2 impaired mainly the apoptotic, DNA-repair, and damage-response pathways; whereas the progression from GIS2 to GIS3 affected the mitotic, cell cycle, and growth pathways. The final transition from GIS3 to GIS4 had substantially more differences in gene expression, most notably in the cell-cell adhesion and chromosomal organization pathways.
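Enrichment of a biologic process among differentially expressed genes can be scored with the upper tail of a hypergeometric distribution, as named in the methods. The sketch below is a minimal illustration in plain Python; the function name, parameter names, and the gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric P value, P(X >= n_overlap).

    n_universe : number of genes measured
    n_in_set   : genes annotated to the biologic process
    n_selected : differentially expressed genes
    n_overlap  : differentially expressed genes annotated to the process
    """
    denom = comb(n_universe, n_selected)
    upper = min(n_in_set, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


# Hypothetical example: 10 measured genes, 5 annotated to a process,
# 5 differentially expressed, and all 5 of those annotated to the process.
p_enriched = hypergeom_enrichment_p(10, 5, 5, 5)
p_trivial = hypergeom_enrichment_p(10, 5, 5, 0)
print(p_enriched, p_trivial)
```

Under the .05 threshold stated in the methods, the process in this toy example would be called enriched, because only 1 of the 252 possible selections of 5 genes contains all 5 annotated genes.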
Recent progress in cancer genomics, highlighted by the advancement of the Cancer Genome Atlas program, has demonstrated that comprehensive genomic characterization of a large number of cancer samples is highly valuable for fully understanding the molecular basis of human cancer and for classifying cancer into clinically meaningful subtypes.23, 24 In the current study, we conducted an integrated analysis of high-resolution genomic maps, gene expression data, and clinical information on GISTs and LMSs. Our genomic analysis provided further evidence that GISTs are distinct from LMSs at the genomic level and pointed out the exact chromosomal locations of the greatest difference and similarity. However, it is most noteworthy that our analysis provides a genomic view of GIST progression and demonstrates that staging by using a specific genomic alteration may offer a clinically meaningful system for predicting the prognosis for patients with GIST, even in those who receive imatinib therapy.
Although several previous studies have profiled genomic alterations in GISTs and LMSs using different generations of technologies and relatively small sample cohorts, a key aspect of the current analysis is the correlation of genomic alterations with gene expression data and clinical information. By using this integrative approach, we were able to pinpoint clinically relevant CNAs from the vast number of biologically irrelevant aberrations. Whereas simple mapping of recurrently aberrant genes can yield hundreds or thousands of clinically irrelevant passenger genes, the clinically relevant genomic segments (critical segments) that we have uncovered provide a reasonable number of putative targets for future validation studies.
Our integrated analysis also led us to a new appreciation for the genetic basis of the progression of GISTs. Pattern recognition analysis of the genomic alterations revealed that there is an obvious incremental accumulation of gene copy number alterations in GIST. Consequently, we have proposed a new tumor-progression genetic staging system (Genomic Instability Staging or GIS) to complement the standard tumor site, size, and proliferation risk-assessment system.4 According to the GIS staging system, deletions of distal 1p, 19, and 22q are the likely keys to early chromosome-scale events that may have triggered the transformation from normal tissue to GIS1 tumor. Whether these events occur before or after KIT mutation is not apparent from our data, because KIT mutation is a high-frequency event in every GIS stage. The most distinct event that follows these deletions is the deletion of 14q, which can be observed clearly as the defining feature in GIS2. Further key deletions of proximal 1p and 15q mark GIS3 disease. Loss of chromosome 10, which also has been associated with late stage in many solid tumors,25 defines the final stage, GIS4. Our GIS groups are consistent with previously reported data21, 26 but provide more specific information on the key aberrant events. The lack of significant differences in KIT and PDGFRA mutation status and in the response to imatinib for different GIS groups indicates that the GIS system may have independent prognostic value for patients with GISTs.
Our pathway analysis provides additional insight into the process of tumorigenesis in which early stage GISTs (GIS1 and GIS2) evade apoptosis, intermediate-stage GISTs (GIS2 and GIS3) undergo accelerated proliferation, and late-stage GISTs (GIS3 and GIS4) lose their dependence on cell adhesion, allowing invasion and metastasis. These different key pathway aberrations in different GIS groups support the accumulative progressive character of GISTs. We believe that these findings are compelling; however, functionally confirming them would require a much larger study. We also must point out that, although we did not observe similar findings for LMSs, this may mean only that LMS is a more heterogeneous disease, and a larger sample size would be needed to reveal key signatures that underlie disease progression and prognosis in patients with LMS.
We thank Drs. Bogdan Czerniak, Jean-Pierre Issa, and Janet Bruner for their critical review of this article and valuable comments. In addition, we thank David Cogdell and Limei Hu for performing the microarray experiments and Drs. Robert Benjamin, Olli Yli-Harja, and Ilya Shmulevich for their significant contribution to the experimental design and interpretation of the results. We also thank Ms. Tamara Locke of the Department of Scientific Publications at The University of Texas M. D. Anderson Cancer Center for editing this article.
CONFLICT OF INTEREST DISCLOSURES
Supported by National Institutes of Health (NIH) grant R01 CA098570 (to W.Z.), an NIH Career Development Award (to J.T.), a Commonwealth Foundation for Cancer Research grant (to W.Z. and J.T.), Academy of Finland Projects 213462 and 122973 (to A.Y. and M.N.), and the National Natural Science Foundation of China (30901715/C171002; to J.Y.). This research is supported in part by the NIH through The University of Texas M. D. Anderson Cancer Center Support Grant CA016672.