Chromosomal instability and a deregulated cell cycle are intrinsic features of high‐risk gastrointestinal stromal tumours with a metastatic potential

Patients with localised, high‐risk gastrointestinal stromal tumours (GIST) benefit from adjuvant imatinib treatment. Still, approximately 40% of patients relapse within 3 years after adjuvant therapy and the clinical and histopathological features currently used for risk classification cannot precisely predict poor outcomes after standard treatment. This study aimed to identify genomic and transcriptomic profiles that could be associated with disease relapse and thus a more aggressive phenotype. Using a multi‐omics approach, we analysed a cohort of primary tumours from patients with untreated, resectable high‐risk GISTs. We compared patients who developed metastatic disease within 3 years after finishing adjuvant imatinib treatment and patients without disease relapse after more than 5 years of follow‐up. Combining genomics and transcriptomics data, we identified somatic mutations and deregulated mRNA and miRNA genes intrinsic to each group. Our study shows that increased chromosomal instability (CIN), including chromothripsis and deregulated kinetochore and cell cycle signalling, separates high‐risk samples according to metastatic potential. The increased CIN seems to be an intrinsic feature for tumours that metastasise and should be further validated as a novel prognostic biomarker for high‐risk GIST.


Introduction
Gastrointestinal stromal tumour (GIST) is the most frequent soft tissue sarcoma and is characterised by activating mutations in the receptor tyrosine kinases KIT or PDGFRA.GIST represents a paradigm for the development of precision cancer medicine, where a successful long-time response to targeted treatment by tyrosine kinase inhibitors is achieved for many patients.Adjuvant treatment with imatinib for three years is recommended for patients with localised GIST who have a significant risk of relapse [1].The prognosis and treatment decisions after curatively intended resection rely on the classification of recurrence risk according to defined criteria including tumour size, mitotic index, tumour rupture and anatomical site [2,3].Several risk classifications have been proposed, of which the most commonly used are the modified NIH (mNIH) classification and the Armed Forces Institute of Pathology (AFIP) criteria [2].Patients classified as high-risk benefit from adjuvant imatinib treatment, but around 40% of the patients relapse within three years after stopping the recommended three years of treatment [4].Furthermore, historical data have shown that almost half of the high-risk patients could be cured by surgery alone [5].Thus, the high-risk group includes both tumours with a good prognosis without imatinib and tumours that recur despite adjuvant treatment, and there is a need for an improved understanding of the biological differences.
The gain of oncogenic mutations in KIT or PDGFRA is the initial event in GIST [6,7].Most GISTs then progress through a stepwise accumulation of large-scale genomic gains, losses and rearrangements, known as chromosomal instability (CIN).Studies have shown that the number of genomic changes can predict the clinical outcome of intermediate-risk patients as defined by AFIP [8,9].We have recently shown for the first time that genomic complexity is a prognostic biomarker in high-risk GIST and that tumour recurrences are infrequent for patients with a simple tumour karyotype [10].Models introducing CIN as a prognostic biomarker may provide a more accurate estimation of the risk of metastasis, but a more detailed understanding of the molecular differences between genomically simple and complex tumours is needed.
The accumulation of chromosomal aberrations is associated with the inactivation of cell cycle control genes such as TP53, RB1 and CDKN2A [11].Mutation of the KIT and PDGF receptors leads to a constitutional activation and ligand-independent downstream signalling through the RAS/RAF/MAPK, PI3K/AKT/ mTOR and STAT3 pathways [12,13].These pathways further upregulate important transcriptional activators, stimulating the cell cycle and having anti-apoptotic effects [14][15][16][17][18][19][20][21].The cell cycle pathways are part of complex interacting networks with miRNAs as central regulatory functions [22].The integration of the various sources of genomic data may help in understanding the intrinsic biology of GIST.Thus, the observed biological and clinical behaviour differences can be better understood.
This study aimed to characterise the genomic and transcriptomic landscape of high-risk GIST to increase the understanding of the differences between tumours that metastasize and those that do not recur.We investigated somatic mutations, DNA copy number changes, and expression of mRNA and miRNA genes and pathways.This showed that CIN and deregulation of cell cycle pathways were the most prominent features associated with increased metastatic capacity in high-risk GIST.

Tumour patient characteristics
The "GIST Risk" patient cohort consisted of 21 patients with resectable, gastric, high-risk GIST, where primary, untreated tumours were collected.Clinicopathological characteristics are summarised in Table 1.None of the patients received neoadjuvant therapy before tumour resection.Sixteen patients received adjuvant imatinib treatment for a median duration of 32 months (range 2-60 months).Disease recurrence was recorded for 10 patients.All developed metastases to the liver and/or peritoneum without locoregional recurrences.The median time from primary tumour surgery to metastasis was 27 months (range 6-63 months).Median follow-up for patients without metastasis was 112 months (range 85-134 months).
The patient cohort was divided into two groups: (i) 10 high-risk GIST patients who developed metastatic disease within three years after the end of imatinib treatment or surgery for patients without adjuvant imatinib, and (ii) 11 high-risk GIST patients without disease relapse after at least five years of follow-up after the end of imatinib treatment or surgery for patients without adjuvant imatinib.Fresh frozen tumour tissues and blood samples for germline control were available for all patients.The tumours were diagnosed using the current World Health Organization classification and risk classified using the modified mNIH criteria [2].The samples were collected between 2002 and 2017 at Oslo University Hospital (OUH).
Written informed consent was obtained from all patients enrolled in the study.The study was approved by the Regional Committees for Medical Research Ethics Southern and Eastern Norway (Project S-06132).The study was performed according to the Declaration of Helsinki.

Isolation of nuclei acids
Genomic DNA and total RNA, including miRNAs, were extracted from fresh frozen tumour samples using the Qiagen AllPrep DNA/RNA/miRNA Universal kit according to the manufacturer's instructions (Qiagen, GmBH, Hilden, Germany).As a germline control, peripheral blood mononuclear cells' genomic DNA was isolated using the Qiagen QIAmp DNA blood kit.

Whole exome sequencing
The whole genome sequencing libraries were generated using the Agilent SureSelectXT reagent kit according to the manufacturer's instructions.Furthermore, exome sequences were captured using the Agilent Human All Exon V5 capturing probe set (Agilent, Santa Clara, CA, USA).Exome libraries were sequenced paired-end 2x150 bp on the Illumina High-Seq 4000 instrument.Library preparation and sequencing were performed by the Oslo University Hospital [Genomics Core Facility (Oslo.genomics.no)].The mean coverage of the target regions for tumour and peripheral white blood cell control samples were 4499 and 1969, respectively.

Somatic and germline variant detection
A benchmarked in-house bioinformatics pipeline [23] was used to process the sequencing reads and identify somatic changes.Sequence alignment was performed using BWA MEM [24] towards the b37 genome assembly with added decoy contigs.Further pre-processing was performed with PICARD (http://broadinstitute. github.io/picard/)and GATK [25] before performing the calling of somatic single nucleotide variants (SNVs) and insertions/deletions (INDELs) using MuTect [26] and Strelka [27].Variant calling files were further processed and annotated using PCGR [28].The somatic variants were further filtered according to the following criteria: tumour sample coverage ≥ 509, control sample coverage ≥ 309, tumour sample mutant allele frequency (MAF) ≥ 0.03, and number of reads supporting MAF in the tumour sample ≥ 5. Protein-coding variants, including non-synonymous, frameshift, and splice-site mutations, were classified according to a four-tiered structure of clinical significance [29].Mutated genes were annotated as tumour suppressors and/or proto-oncogenes using the Network of Cancer Genes (NCG) database [30].The long KIT duplications were manually identified using IGV [31].The Mann-Whitney test was used to determine significant differences in the number of SNVs.Tumour mutational burden (TMB) was estimated using PCGR, following the approach outlined in [32].Low TMB was defined as fewer than 5 mutations per Mb [32].Germline variants (SNVs/INDELs) were identified from the blood samples through the Illumina DRA-GEN germline pipeline v.3.9 (07.021.595.3.7.5).Variant calling files were further processed and annotated using the Cancer Predisposition Sequencing Reporter (CPSR v0.6.1)[33] to identify potential cancerpredisposition variants.CPSR was configured to report variants of clinical significance in an exploratory panel of 335 cancer predisposition genes.

Copy number analysis and chromosomal instability
Allele-specific DNA copy-number variation and frequency analysis were performed using FACETS [34].Identification of significant recurrent copy-number changes at the arm and focal level was performed by GISTIC v2.0 [35], where regions of alteration with a qvalue less than 0.25 were reported as significant.Genes corresponding to the significantly recurrent changed regions were reported.CIN was scored using the CINmetrics package (https://github.com/lasseignelab/CINmetrics).The scoring was based on the number of aberrant chromosomes containing regions with gain and loss of alleles and/or containing regions with copy-neutral loss of heterozygosity (LOH).Identification of chromothripsis was performed per sample by counting the number of switches between copy-number states for each chromosome based on the FACETS results.Chromosomes containing 10 or more such switches within a 50 Mb interval were classified as chromothripsis-positive with high confidence, as described in [36].

mRNA sequencing analysis
RNA sequencing libraries were constructed using the KAPA RNA Hyper kit (Roche, Bassel, Switzerland) to generate total RNA libraries and the Twist Core Exome probe set (Twist Biosciences, San Francisco, CA, USA) to capture assay targets.RNA-exome libraries were sequenced paired-end 2x75 bp on an Illumina HiSeq 4000 instrument.The library preparation and sequencing were performed at the Oslo University Hospital Genomics Core Facility.
Sequencing reads were aligned to the GRCh38 human genome assembly using STAR [37] and gene expression quantification was performed using the featureCounts function of the subread package [38].Genes with a median number of transcripts per million (TPM) < 2 in both metastatic and non-metastatic groups were considered as being expressed below the background level and were filtered out.Differentially gene expression (DE) analysis was performed using the DESeq2 R package [39].Genes with an adjusted P-value < 0.05 and absolute values of log 2 fold change (logFC) > 1 were considered differentially expressed between the groups.
A gene set enrichment analysis (GSEA) was performed using the fgsea R package [41].The lfcShrink function from DeSeq2 was used for ranking.The GSEA was run using the cell proliferation gene set (cell_proliferation_GO_0008283) from the Human Molecular Signatures Database (MSigDB) [42].

Allele-specific expression at mutated loci
Allele-specific expression analysis of protein-coding SNV was performed using the ASEReadCounter from GATK [25], which resulted in sample-wise RNA read counts for mutated and wild-type alleles for each input SNV.Before running the tool, the RNA sequencing reads were mapped to the GRCh37 human genome assembly with an added decoy using STAR and post-processed using Picard.ASEReadCounter was run with the additional parameters "min-mapping-quality" of 10 and "min-base-quality" of 2. The expression counts for mutated and wild-type alleles for protein-coding INDELs were identified manually using IGV.The mutations were further classified into three categories: mutated allele expressed, mutated allele not expressed or mutated gene not expressed, based on TPM values of corresponding genes and variant allelic fractions (VAF) in the tumour.The SNVs not present in the covered RNA data were filtered out.A mutation was called expressed if the corresponding gene was expressed with a TPM ≥ 2 and VAF given by mRNA data was greater than 3%.If the corresponding gene was expressed, but VAF was ≤ 3%, then the mutated allele was classified as not expressed.If the corresponding TPM value was less than 2, the gene was considered as not expressed.

miRNA expression analysis
miRNA expression profiling was performed using the Human V3 miRNA assay from nanoString (nano-String Technologies Inc., Seattle, Washington, USA).The assay was performed by the Oslo University Hospital Genomics Core Facility.Quality control and preprocessing steps were performed using nSolver software from nanoString.Normalisation and differentially expressed gene analysis were performed using the NanostringDiff R package [43].Genes with median read count expression values before and after normalisation less than 45 in both metastatic and nonmetastatic groups were filtered out.Genes with q-value ≤ 0.05 and an absolute value of logFC ≥ 1 were considered differentially expressed.miRNA target predictions were performed using the miRNA and mRNA interaction databases in IPA.miRNAs and putative mRNA targets were selected to have opposite expression fold changes, and the confidence level of the target predictions was set to experimentally observed and highly predicted interactions.

The expressed mutational landscape of GIST
By whole-exome sequencing, a total of 738 somatic protein-changing mutations (668 SNVs and 70 INDELs, Data S1) within the coding regions of 585 genes were found across the 21 tumours analysed.To explore the differences in behaviour and clinical outcome within the high-risk group, we compared tumours from patients who developed metastasis to those without after longterm follow-up (> 5 years).Three tumours (14%) had oncogenic PDGFRA exon 18 mutations, and 18 tumours (86%) had KIT exon 11 mutations (Fig. 1A).Except for the KIT and PDGFRA mutations, no other mutations with clinical or potential clinical significance were observed among the samples.All tumours had a low tumour mutational burden (TMB < 3), but tumours from patients who developed a metastatic disease had a higher number of variants compared to non-metastatic patients (39 vs 32 variants; P = 0.04).No pathogenic or likely pathogenic mutations associated with predisposition to GIST were observed.
Combined mutation and allele-specific expression data were available for 719 mutated protein-coding variants, where an average of 36% of the variants across the samples was found not to be expressed, and for 32%, only the wild-type allele was expressed.Thus, only 230 of the somatic mutations (185 mutated SNVs and 45 INDELs) were found to be expressed within the GIST cohort.The average proportion of variants with expressed mutated alleles was 31% in metastatic and 33% in nonmetastatic samples (Fig. 1, Data S1).Among the total 585 genes, 68 were annotated as tumour suppressor genes and/or oncogenes.Of these, 19 genes were expressed; 10 tumour suppressors and 9 oncogenes were distributed evenly across the groups (Fig. 1).

High level of chromosomal instability is associated with the development of metastatic disease
The genomic copy number patterns were analysed using two different approaches, focusing on large-scale or more segmental patterns.DNA copy number variation (CNV) analysis showed that ≥ 40% of metastatic samples had large-scale amplification in chromosome 8 and deletions in chromosomes 1, 2, 3, 9, 10, 12, 14, 15 and 22 (Fig. 2A).Meanwhile, the non-metastatic group only showed deletions in chromosomes 1 and 14 (Fig. 2B), providing different profiles for the two groups.Three of the metastatic and one of the non-metastatic tumours did not have a deletion of chromosome 14.Recurrent focal and arm-level copy number changes as determined by GISTIC were amplification of chromosome 1q and deletions in 1p, 9p and 12p in the metastatic group and deletions of 1p and 14q in the non-metastatic group (Fig. 2C-E).The deleted regions contained the tumour suppressors CDKN2C, SDHB, and SFPQ in 1p and CDKN2A in 9p in metastatic samples and the tumour suppressors BCL10 and NOTCH2 and the oncogenes JAK1 and NRAS in 1p in the non-metastatic sample group (Data S2).The dystrophin (DMD) gene, previously reported to be frequently deleted in metastatic GISTs [44], was only found to be deleted in chromosome X in three samples.
The number of chromosomes affected by CNV and/ or copy number neutral LOH was investigated and compared between the two groups.The median number of chromosomes affected was 15 (range 7-23) for the metastatic samples and 6 (range 1-20) for the nonmetastatic group (P = 0.01; Fig. 3A).Based on this observation, we further investigated if CIN could be used as a classifier separating the groups.Analysis was

Oncogene
Tumor suppressor

Oncogene status
KIT exon 11 del_557/558    performed using the CINmetrics package (https:// github.com/lasseignelab/CINmetrics),including the CIN scoring algorithms based on the total number of copy number abnormalities (CNAs or segment breakpoints; [45,46]), the fraction of the genome altered (fga; [47]), and total aberrations measuring the abundance and genomic size of CNV (tai; [48]) (Data S3).In addition, a CIN score was generated based on the number of chromosomes affected by CNV and LOH.For the various algorithms, individual separation values were determined based on an observed change in the distribution of CIN values.For the algorithm based on the total number of copy number abnormalities, no clear distribution change could be observed.For the total aberration algorithm (tai), most samples clustered on a very narrow distribution with limited possibility to distinguish the samples.When ranking the GIST samples according to the fraction of genome altered (fga) or total number of chromosomes affected (CNV and LOH), the samples clustered in two or more distributions.When considering the clinical behaviour of the patients, the separation obtained from the fraction of the genome altered algorithm could only classify a few metastatic samples as highly complex.However, the complexity score based on the number of chromosomes affected by CNV and/or copy number neutral LOH more clearly separated the metastatic from the nonmetastatic samples according to clinical behaviour (Data S3).Based on the distribution of the number of affected chromosomes (Data S3), we classified samples with 10 or more affected chromosomes as complex, and those with nine or fewer as simple (Table 2).All the samples in the cohort had CNVs, but a large diversity of simple and complex CNV patterns was observed both within and between the GIST groups (Fig. 3B-F, Data S4 and S5).Among the simple samples, the diversity reflected the typical evolution in GIST with an increasing loss of chromosomes (Fig 3B,  C).Interestingly, all the 11 complex samples had CNV which caused LOH in several chromosome arms or whole chromosomes (Fig. 3D-F and Table 2).One of the complex metastatic samples showed a loss of one allele in almost all chromosomes (Fig. 3E).An intricate pattern of genome duplication combined with CNV was observed for three complex samples (Fig. 3F).Five of the complex samples (45%) did not have a deletion of chromosome 14, a feature regarded as an early event in GIST development and observed to be present in all simple samples.

Percentage of mutations
The presence of high-confidence, non-canonical chromothripsis was observed in four out of 10 samples in the metastatic group and only one out of 11 samples in the non-metastatic group (Table 3).Four out of the five samples with chromothripsis had highly complex genomes.Chromosome 1 and chromosome 19 were the chromosomes most frequently showing a chromotriptic pattern, with three and two of the samples affected, respectively (Table 3).Gene expression analysis of the GIST cohort was performed to identify differentially expressed genes between tumours that metastasised and those that did not.Among the 487 differentially expressed genes, 180 were downregulated and 307 were upregulated in metastatic cases (Data S6).Unsupervised hierarchical clustering based on the differentially expressed genes showed good separation of the sample groups (Data S7).Canonical pathway analysis was performed on the list of differentially expressed genes.We identified 10 canonical pathways that were significantly differentially expressed in metastatic samples compared to non-metastatic samples.All the pathways were associated with cell cycle regulation, specifically with cell cycle signalling or the kinetochore complex (Table 4).The pathways contained a total of 49 genes (Table 4), of which 47 were upregulated, and only two (TGFB2 and CBX72) were downregulated in metastatic patient samples.
The observed downregulation of TGFB2 can promote the progression into the late G1 and S phase (reviewed in [49]).Reduced susceptibility to senescence could be promoted by the observed reduced level of the tumour suppressor CBX7 [50].Loss of CBX7 is associated with highly malignant phenotypes and poor prognosis in cancer [51] and promotes resistance to TKIs [52].Among the upregulated genes, there were a A ratio of the number of genes from the gene list that maps to the pathway divided by the total number of genes in IPA that map to the same pathway.
Several of these genes are also central players in cell cycle regulation together with other upregulated genes like the cyclins CCNB1, CCNB2, CCNA2 and CCNE2, the E2F transcription factors E2F1, E2F2, E2F7 and E2F8 and the crucial cell cycle phase regulators CDC25A and CDC25B.Furthermore, upregulation of the chromosomal replication pathway was observed in the metastatic group.This was evidenced by increased expression of central genes encoding for components of the replication initiation complex, such as ORC1, CDT1 and the MCM genes MCM2 and MCM4.
An upstream regulator analysis identified potential mechanisms regulating the differentially expressed genes (Data S8).The central upstream regulators FOXM1, MYBL2 (Fig. 4A), CHAP2L, E2F1 and ERBB2 were differentially upregulated in metastatic samples, and the downstream target genes were observed to be significantly activated.The upstream regulator miR34a was significantly downregulated, and the predicted target genes were upregulated (Fig. 4B).In addition, a set of transcriptional regulators was identified without differential gene expression changes, but the protein was assumed to have changed activity due to the observed activation or inhibition of the downstream target genes (Data S8).
To identify the possible biological effect provided by the differentially expressed genes, we performed a functional analysis in IPA to categorise the differentially expressed genes.Six functional categories were identified; Cell cycle, Cellular assembly and organisation, DNA replication, recombination and repair, Cellular development, cellular growth and proliferation, Cell death and survival and Cellular movement (Table 5).The categories contained a total of 168 unique genes (Data S9).Most of the genes were upregulated (130 genes), with only 36 genes found downregulated in metastatic samples.Interestingly, the strongest downregulated genes were associated with the hormone signalling pathway like somatostatin receptor 1 (SSTR1), progesterone receptor (PGR), LDL Receptor Related Protein 1B (LRP1B) and Insulin-like growth factor 1 (IGF1) where the first three genes are putative tumour suppressors.
A regulatory effect analysis provided a larger combined overview of the signalling networks suggested to be deregulated when comparing the high-risk GIST groups.The networks illustrate how the activated or inhibited upstream regulators can affect the downstream target molecules with the given impact on downstream molecular functions.One of the larger networks is shown in Fig. 4B, highlighting the upstream regulators CDKN1A, E2F3, ERBB2, FOXM1, HSPB1, KDM5B, KRAS, miR-34a-5p, MYBL2, MYC, RABL6, SPI1, STUB1, TP53 and YAP1.Of these, MYC was found to be recurrently gained among metastatic samples.The upstream regulators in Fig. 4B were predicted to affect target molecules involved in cell cycle regulation, like cyclins and kinases, which are central players in DNA replication and repair, cell death and proliferation.To further investigate the observed increase in proliferation genes, GSEA was run using a cell proliferation gene set.A gene expression enrichment for cell proliferation was seen in the metastatic group (P-adjusted = 0.003, ES = 0.345).This is further supported by the observed differences in the number of mitoses per 50 high-power fields of view, with a median of 28 for the metastatic group and 7 for the non-metastatic (Table 1).

Integration of miRNA and mRNA data identifies interaction networks associated with the cell cycle
MicroRNA gene expression analysis identified 25 miR-NAs to be differentially expressed between the two groups of high-risk tumours.The only upregulated miRNAs in metastatic samples were miR-196b-5p, miR-194-5p and miR-424-5p, and the most downregulated miRNAs were miR-514a-3p, miR-497-5p and miR-218-5p (Data S10).Hierarchical clustering using the differentially expressed miRNAs separated the GIST samples into two main clusters, one dominated by metastatic and the other by non-metastatic samples (Data S11).
The interaction between the differentially expressed miRNAs and mRNAs was predicted using the micro-RNA target filter in IPA with a confidence level set to experimentally observed and/or highly predicted interactions.Of the 25 differentially expressed miRNAs, 20 were predicted to directly target 63 of the differentially expressed mRNAs (Data S10).As specified, all the miRNA and mRNA pairs had opposite fold changes; Table 5. Overview of enriched functional categories identified by IPA analysis of differentially expressed genes between metastatic and non-metastatic samples.The six main categories are divided further into molecular functions which are predicted to increase or decrease.Cut-off activation |z-score| > 2, B-H corrected P-value<0.05.all the miRNAs were observed to be downregulated in metastatic samples, and the target mRNAs were upregulated.The individual miRNAs had up to 10 predicted mRNA targets, of which the largest networks are shown in Fig. 5.Of the 63 mRNAs predicted to be regulated by the miRNAs, 20 were associated with cell cycle regulation and the kinetochore, including the genes AURKB, CCNA2, CDC25A, CDK1, E2F2, E2F7, H2AX, KIF23 and PLK1 (Data S10), supporting a post-transcriptional regulation.

Discussion
Here, we have used a multi-omics approach to analyse a cohort of patients with high-risk gastric GISTs.We have compared genomic and transcriptomic tumour profiles from patients who developed metastasis to patients without metastasis after long-term follow-up.
Our results show intrinsic differences in gene expression and DNA copy number changes, providing evidence of distinct biology among high-risk GISTs with different metastatic potential.The mutational burden was low in GISTs, even for the most aggressive subgroup of patients, which is consistent with previously published data [16].No recurrent actionable somatic mutations were expressed except for mutations in KIT and PDGFRA.Interestingly, only a low fraction (~30%) of the somatic mutations were found to be expressed, which is in line with the observations previously seen in colorectal cancer [53], and lower when compared to prostate cancer [54].This has a big impact on the interpretation of the actionability of somatic mutations in cancer without knowledge regarding the expression of the mutated allele.
Chromosomal instability was the most pronounced genomic feature, and we observed both CNVs and copy number neutral LOH.Most metastasizing highrisk GISTs showed complex CIN profiles, while the less aggressive, non-metastatic tumours showed a Fig. 5. Interaction networks between miRNAs and mRNAs.Shown networks where both the miRNAs and mRNAs are differentially expressed, and the mRNAs are predicted target genes with opposite expression levels of the miRNAs.Green downregulated miRNA, and red upregulated mRNA.
simpler CIN pattern.Loss of chromosome 14 is the earliest and most frequent aberration in GIST, followed by loss of additional chromosomes in higherrisk tumours [55][56][57].Based on the pattern of CNVs, the groups in our cohort seem to represent different stages of the genomic evolution of GIST, as the simplest samples contain only a deletion of one copy of chromosome 14, in addition to a KIT or PDGFRA mutation.Our finding is in line with the general assumption that CIN fosters much of the intratumoural heterogeneity observed in cancers and drives phenotypic variation facilitating adaptation during tumour evolution.
Genomic complexity involving copy-neutral LOH of whole chromosomes or chromosome arms has been reported as a frequent event in GIST [58].Copyneutral LOH can be due to a loss of chromosome material during abnormal mitosis, followed by duplication of the remaining material.This may contribute to the multi-hit inactivation of tumour suppressors.The presence of chromothripsis was observed more frequently in high-risk GIST samples with a more aggressive phenotype.A few cases of chromothripsis in GIST have been observed in a pan-cancer study describing the landscape of chromothripsis [36].
Chibon and co-workers have demonstrated that a high number of CNVs is associated with inferior outcomes in AFIP intermediate-risk GISTs [8,9], indicating that such patients might benefit from adjuvant imatinib treatment.On the other hand, approximately half of the patients classified as mNIH high-risk may be cured by surgery alone [5].Thus, identifying prognostic biomarkers within the mNIH high-risk group could lead to an improved selection of patients for adjuvant therapy and ultimately spare high-risk patients with a good prognosis for the toxicity and cost of three years of imatinib treatment.In a recent study by Boye et al., [10] karyotyping was used to detect chromosomal aberrations in 206 tumours, of which 76 had a complex karyotype with > 5 chromosomal aberrations.High-risk patients with a simple tumour karyotype had an estimated 5-year RFS of 94%, while patients with a complex karyotype had 51%.These studies strongly support further investigation of CIN as a prognostic biomarker to improve patient selection for adjuvant treatment, and the present study provides a detailed genomic and transcriptomic landscape to increase our understanding of high-risk GISTs.
When comparing expression differences between the two groups of high-risk tumours, we see a deregulated gene expression signature of proteins involved in all phases of the cell cycle.This was inhibition of G1/S and G2/M checkpoint pathways, reduced senescence and increased chromosomal replication in aggressive high-risk tumours.The upregulated CDK1 is a central player with a coordinating role in both cell cycle regulation and DNA replication [59].The pathway inhibition of the G2/M checkpoints, through increased levels of AURKA, CDK1, and CCNB1/2, can facilitate cells with damaged DNA to enter into the M phase, causing increased genomic instability.As an upstream regulator, the transcription factor MYBL2 is involved in cell survival, proliferation and differentiation, and transactivates late cell cycle genes in the G2/M phase.The increased MYBL2 gene expression in metastatic samples is in line with an observed overexpression and poor patient outcomes in numerous cancer entities (reviewed in [60]).ERBB2 is another upstream regulator observed to be upregulated in the metastatic samples and is a major driver of cell proliferation and angiogenesis.In a previous study, ERBB2 protein (Her2/neu) expression has been significantly correlated with risk grade, tumour size, mitotic count, and increased risk of relapse in primary GIST [61].Overexpression of the upstream regulator MYC has previously been shown to reversibly induce and maintain CIN, contributing to aneuploidy, tumorigenesis, and tumour evolution [62].
In addition to a deregulated cell cycle, an enrichment in cell proliferation was seen for the metastatic samples.The enhanced proliferation capacity can be seen as a result of adaptation to elevated CIN.However, it is difficult to conclude if the deregulated gene expression is a consequence of CIN or facilitates CIN or both.The possible mechanisms driving CIN are diverse, including mitotic defects, defects in DNA replication and aneuploidy-driven CIN (reviewed in [63]).In particular, the dysfunctional control of the kinetochore, which can induce mistakes in the alignment and separation of the sister chromatids during the cell cycle, can represent a possible source for chromosome instability in GIST, as seen for other cancers (reviewed in [64]).A strong enrichment for the kinetochore complex and mitotic spindle organisation genes in the metastatic samples was observed.Several of the upregulated core kinetochore genes identified in our study have previously been seen to be periodically and coordinately expressed during the cell cycle [65], and their proteins temporarily localise to kinetochores only during mitosis.In addition to this broad cell division program, additional genes are essential for cell cycle progression and DNA replication was upregulated.Of these, the aurora kinases AURKA and AURKB have been shown to enhance the generation of aneuploid cells, facilitating genomic instability and malignant transformation (reviewed in [66]).Overexpression of AURKA has been suggested to be an independent unfavourable prognostic factor in both treatment-na€ ıve [21,67] and imatinib-treated advanced GIST patients [68].AURKA overexpression has been shown to enhance the resistance of GIST cells to imatinib [21], thus an AURKA inhibitor may have potential as a therapeutic agent for both imatinib-sensitive and imatinib-resistant GIST [68].
The upregulation of the kinetochore genes, including AURKA and AURKB, as well as other cell cycle genes can be facilitated by the transcription factor FoxM1 [65,69,70], which shows a strong upregulation in our GIST samples that developed metastasis.FoxM1 binding is significantly enriched at the G2/M phase of the cell cycle, and strong control is required for the proper execution of the mitotic program and to maintain chromosome stability [71].Increased expression of FOXM1 is observed in a variety of cancer types, and elevated expression has been related to poor overall survival [72].
Different transcriptional miRNA profiles between the two groups of high-risk patients were identified, with a strong association with the cell cycle.The miRNA miR-34a-5p, seen to be downregulated in the metastatic sample group in our study, was also predicted to be an important upstream regulator.miR-34a-5p was inversely expressed to a high number of target mRNAs in our datasets, including the G1/Sspecific cyclin CCNE2.miR-34a is located in the chromosome 1p region, which is recurrently deleted in the metastatic group.In a study comparing miRNA and mRNA expression profiles of imatinib-na€ ıve and imatinib-resistant primary GIST, analyses revealed a deregulation of cell cycle progression and cellular proliferation with an increased expression of miR-34a-5p in the imatinib-resistant group [15].The miR-34 family is a direct target of TP53, and when up-regulated, it induces apoptosis, cell cycle arrest, and senescence in cancer (reviewed in [73]).The observed downregulation of miR-218-5p was in line with previous studies observing miR-218-5p to be the most decreased miRNA in GIST compared to normal gastric tissue [74] and also separating high-risk GIST from low and intermediate-risk samples [75].miR-218 has been shown to regulate KIT protein expression and inhibit GIST cell proliferation and invasion in vitro [76].miR-218 overexpression might improve the sensitivity of GIST cells to imatinib through PI3K/AKT signalling pathway [77].
Several studies have revealed the enrichment of deregulated cell cycle processes in GIST [14][15][16][17][18][19][20][21].Using a cohort comparable to ours, including only primary, treatment-na€ ıve, high-risk GIST, samples with higher mitotic counts were shown to have a significantly increased protein expression of cyclin D1, p53 and E2F1 [17].This associated cell cycle deregulation with a more aggressive phenotype, in line with our results.However, most studies include heterogeneous groups both when it comes to risk classification and treatment status.By comparing gene expression of primary, imatinib-na€ ıve GIST with imatinib-resistant, an enrichment for regulators of oestrogen-mediated S-phase entry, cyclins and cell cycle regulators, as well as G2/ M checkpoint regulation pathways, was seen [15].Enrichment of the cell cycle has also been identified comparing metastatic and high-risk GIST to less aggressive GIST [16].However, these previous studies have consisted of more heterogeneous patient cohorts, and it is difficult to separate intrinsic tumour biology from the effects of TKI treatment.

Conclusion
Patients with high-risk GIST are clinically challenging as our biological understanding of the clinical heterogeneity is limited.And there are no available biomarkers able to reliably predict which patients will later develop metastatic disease.Our multi-omics approach shows that the presence of complex CIN and deregulated cell cycle processes were the most prominent features in this group.Our findings indicate that the increased CIN in the tumours with higher metastatic capacity could be a phenotype of the deregulated cell cycle, including dysfunctional control of the kinetochore machinery.We show that complex CIN and cell cycle control are intrinsically deregulated processes associated with metastasis in gastric, high-risk GIST.Our results suggest that CIN should be further investigated as a prognostic biomarker in high-risk GIST.We further show that an improved understanding of the genomic and transcriptomic changes in tumours from a carefully selected patient cohort may inform the development of clinically useful biomarkers.
on data integration and results discussions.The study was funded by grants 218241 and 221580 from the Norwegian Research Council (OM), 144385 (OM) and 198034-2019 (LAMZ) from the Norwegian Cancer Society, and 172011 from the Radium Hospital Foundation (OM), South-eastern Norway Regional Health Authorities grant 2019064 (KB) and 2019011 (LAMZ).

3 MutatedFig. 1 .
Fig. 1.The expressed tumour mutational burden of GIST.(A) Topography of expressed mutations in tumour suppressor and oncogenes in GIST.(B) Sample-wise proportion of the mutated loci with the expression of the mutated allele.Met: metastatic.NoMet: non-metastatic.

Fig. 2 .
Fig. 2. Identification of copy number changes in GIST.Frequency plots with an overview of deletions and amplifications for (A) metastatic samples (B) non-metastatic samples.Blue and grey represent alternating chromosomes, and the genomic locations of chromosomal peaks are shown with the chromosome number along the y-axis.Significant, recurrent arm-level and focal genomic changes were seen across the GIST samples e.g. as shown in (C) amplification in metastatic samples (D) deletion in metastatic samples, and (E) deletion in non-metastatic samples.The statistical significance of the aberrations is displayed as FDR q values along the x-axis (cut-off 0.25).The dotted lines show the centromere location.Met: metastatic.Non-met: non-metastatic.

Fig. 3 .
Fig. 3. Chromosomal instability in GIST.(A) Level of CIN in the groups of metastatic and non-metastatic samples, *P-value = 0.01, Mann-Whitney test.Examples of various patterns of genomic copy number changes and LOH in simple (B and C) and complex (D-F) genomes are shown.

Fig. 4 .
Fig. 4. Mechanistic and regulatory networks in GIST.(A) Mechanistic network with FOXM1 and MYBL2 as master upstream regulators.The master regulators mediate the regulation of the target genes through direct regulation or intermediate regulators.(B) Regulatory effects network of upstream regulators, target molecules and the impact on downstream functional mechanisms.Pink: increased gene expression in the dataset.Green: decreased gene expression in the dataset.Blue: predicted inhibition.Orange line: leads to activation.Blue line: leads to inhibition.Grey line: effect not predicted.

Table 1 .
Clinical and histopathological characteristics based on whether patients developed metastases during follow-up.Risk classification was performed at the time of primary tumour surgery or diagnosis.HPF, high-power field of the microscope.
a Values are median (range).

Table 2 .
Overview of CIN.The number of affected chromosomes per sample includes CNV and/or copy-neutral LOH.Chr, chromosome.

Table 3 .
Chromothriptic pattern of high-confidence scored GIST samples.Shown chromosomes with a chromothripsis pattern and the number of copy number switches within 50 Mb for each of the chromosomes.

Table 4 .
Significantly activated or inhibited IPA Canonical pathways.Analysis was performed on 487 genes differentially expressed between metastatic and non-metastatic GIST samples.A negative z value connotates an overall pathway's inhibition and a positive z value connotates an overall activation, with a cut-off at |z-score| > 1.6 and Benjamini-Hochberg adjusted P < 0.05.