Identification of recurrent FGFR3 fusion genes in lung cancer through kinome‐centred RNA sequencing

Oncogenic fusion genes that involve kinases have proven to be effective targets for therapy in a wide range of cancers. Unfortunately, the diagnostic approaches required to identify these events are struggling to keep pace with the diverse array of genetic alterations that occur in cancer. Diagnostic screening in solid tumours is particularly challenging, as many fusion genes occur with a low frequency. To overcome these limitations, we developed a capture enrichment strategy to enable high‐throughput transcript sequencing of the human kinome. This approach provides a global overview of kinase fusion events, irrespective of the identity of the fusion partner. To demonstrate the utility of this system, we profiled 100 non‐small cell lung cancers and identified numerous genetic alterations impacting fibroblast growth factor receptor 3 (FGFR3) in lung squamous cell carcinoma and a novel ALK fusion partner in lung adenocarcinoma. © 2013 The Authors. Journal of Pathology published by John Wiley & Sons Ltd on behalf of Pathological Society of Great Britain and Ireland.


Introduction
Recurrent translocations have been studied in leukaemia for over half a century [1], but in the past decade it has become clear that structural rearrangements and fusion genes also contribute to the development of solid tumours. Some of these rearrangements are very common, such as fusions involving ETS-family members in prostate cancer [2], but many occur at a low frequency and often involve multiple fusion partners, which presents a significant challenge for discovery and for subsequent diagnostic screening.

Concerted and systematic efforts have been applied to define the key genetic alterations that drive lung cancer [3][4][5][6][7][8]. Numerous technologies have been employed, including exome sequencing, whole genome sequencing, and transcriptome sequencing. These studies have shown that the genomic landscape of lung cancer is highly complex, due to high rates of somatic mutations, copy number alterations, and genetic rearrangements. Although this work has highlighted many new driver genes, our knowledge of the genetic rearrangements and fusion genes that occur in lung cancer remains limited because only a handful of genome sequences have been completed.
The best-characterized fusion gene in lung cancer is EML4-ALK, which was discovered using a cell-based transformation assay [9]. ALK has since been found to be involved in a variety of fusions, all of which preserve its kinase domain. Recent clinical trials have demonstrated that tumours that carry ALK fusions respond to the small molecule ALK inhibitor crizotinib [10]. It took a little over 5 years between the identification of ALK as a therapeutic target in lung cancer and its validation in a genotype-driven clinical trial. The results obtained with ALK, and other fusion kinases such as BCR-ABL, serve as an important reminder that cancer cells become addicted to signalling through oncogenic kinases and that there is tremendous value in identifying these events and developing therapeutic strategies to target them.
While candidate-based approaches have been applied successfully to define new fusion kinases in lung cancer, such as the identification of RET and ROS1 fusions in adenocarcinoma [11,12], a global method for detection is needed to fully understand the diversity of kinase alterations driving the disease. We developed a high-throughput platform for systematically profiling kinase fusions that relies on specific enrichment of kinase transcripts. Using this approach, we screened a panel of non-small cell lung cancer (NSCLC) samples and identified a number of activating mutations, amplifications, and novel fusion transcripts.

Patient material and sequencing
Our cohort included 95 patients, 80 of whom were previously described in a study that developed a prognostic classifier for early-stage lung cancer [13]. The ethical review panel of the NKI-AVL approved the use of patient material in this study. Patient material was available from frozen or formalin-fixed, paraffin-embedded (FFPE) tissue blocks. Quantification and quality assessment for RNA were performed with a Bioanalyzer (Agilent, Santa Clara, CA, USA). Sequencing libraries were constructed from frozen tissue with a TruSeq mRNA Library Preparation Kit using poly-A-enriched RNA (Illumina, San Diego, CA, USA). Capture enrichment was performed with the human kinome DNA capture baits (Agilent Technologies, Santa Clara, CA, USA). Six libraries were pooled for each capture reaction, with 100 ng of each library, and custom blockers were added to prevent hybridization to adapter sequences: B1: 5′-AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG/3′ddC; B2: 5′-CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT/3′ddC.
Captured libraries were sequenced on an Illumina HiSeq2000 platform with a paired-end 51-base protocol. Sequences were aligned to the human genome (Hg19) with TopHat [14]. HTSeq was used to assess the number of uniquely assigned reads for each gene; expression values were then normalized to 10^7 total reads and log2-transformed (results are provided as Supplementary Dataset 1). Sequence variants were detected with SAMtools and were annotated using the ENSEMBL variant effect predictor and the NHLBI GO Exome Variant Server (results are provided as Supplementary Dataset 2) [15].
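The expression normalization described above can be illustrated with a minimal Python sketch. It scales per-gene read counts to 10^7 total reads and applies a log2 transform. The function name, the example counts, and in particular the pseudocount of 1 are assumptions made for illustration; the text does not state how genes with zero reads were handled.

```python
import math


def normalize_counts(counts, scale=1e7, pseudocount=1.0):
    """Scale raw per-gene read counts to `scale` total reads, then log2-transform.

    The pseudocount is an assumption added so that genes with zero reads
    remain finite; the source does not describe its treatment of zeros.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no assigned reads")
    return {
        gene: math.log2(n * scale / total + pseudocount)
        for gene, n in counts.items()
    }


# Hypothetical read counts for one sample (illustrative numbers only).
example = {"ALK": 1500, "FGFR3": 500, "EGFR": 0}
normalized = normalize_counts(example)
```

Scaling to a fixed library size makes samples with different sequencing depths comparable before expression values are ranked across the cohort.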

Fusion detection
Three pipelines were used to identify and rank candidate fusion genes: TopHat-fusion [16]; deFUSE [17]; and de novo transcript assembly with Trinity [18]. Detection parameters differed for each platform. For the de novo assembly approach, sequence reads were assembled using Trinity (version r2012-10-05) and the resulting transcripts were used to identify fusion candidates. The filtering pipeline consisted of a number of steps: (1) the de novo transcripts were aligned to the human genome (Hg19) refSeq open reading frame sequences; (2) transcripts were identified that had multiple refSeq gene alignments; (3) sequence reads were mapped back to the candidate fusion transcripts using bowtie2 to determine the number of spanning reads (reads aligning to the breakpoint with at least 15 bases on either side) and spanning pairs (pairs with a read on each side of the breakpoint); (4) erroneous fusions were removed, for example, where the fusion partners shared sequence at the breakpoint, or no spanning read pairs were detected, or if the candidate fusion was identified in a normal sample. Full results of the de novo assembly pipeline are provided as supplementary material (Supplementary Dataset 3).
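Steps (3) and (4) of the filtering pipeline can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Candidate` record, its field names, and the example candidates are hypothetical, and read alignments are assumed to be available as 0-based, end-exclusive coordinates on the assembled transcript. Only the 15-base anchor requirement and the three removal criteria come from the text.

```python
from dataclasses import dataclass

MIN_ANCHOR = 15  # minimum bases on each side of the breakpoint, as stated in the text


@dataclass
class Candidate:
    name: str
    breakpoint: int        # index of the first base contributed by the 3' partner
    read_spans: list       # (start, end) per aligned read, 0-based, end-exclusive
    spanning_pairs: int    # pairs with one mate on each side of the breakpoint
    shared_sequence: bool  # partners share sequence at the breakpoint
    seen_in_normal: bool   # candidate was also found in a normal sample


def count_spanning_reads(c):
    """Count reads covering the breakpoint with >= MIN_ANCHOR bases on both sides."""
    return sum(
        1
        for start, end in c.read_spans
        if c.breakpoint - start >= MIN_ANCHOR and end - c.breakpoint >= MIN_ANCHOR
    )


def keep_candidate(c):
    """Apply the three removal criteria listed in step (4)."""
    return not (c.shared_sequence or c.spanning_pairs == 0 or c.seen_in_normal)


# Hypothetical candidates with 51-base reads (names and coordinates are made up).
candidates = [
    Candidate("GENEA-GENEB", 100, [(80, 131), (90, 141), (60, 111)], 4, False, False),
    Candidate("GENEC-GENED", 200, [(170, 221)], 0, False, False),
    Candidate("GENEE-GENEF", 300, [(280, 331)], 3, False, True),
]
kept = [c.name for c in candidates if keep_candidate(c)]
```

In this toy input only the first candidate survives: the second has no spanning pairs and the third also appears in a normal sample. Of the three reads on the first candidate, only one has at least 15 bases on both sides of the breakpoint.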

Validation
The Maxima First Strand cDNA Synthesis Kit was used to produce input cDNA for RT-PCR (Thermo Scientific, Newington, NH, USA). PCR primers were designed to amplify across the fusion breakpoints:

Fluorescence in situ hybridization (FISH)
ALK translocations were assessed using the Vysis ALK Break Apart FISH Probe Kit. Samples were processed according to the manufacturer's instructions (Abbott Molecular, Des Plaines, IL, USA). In short, unstained FFPE sections (4 µm) were deparaffinized, treated with protease, and washed in preparation for hybridization. FISH probes were hybridized for 14-24 h at 37 °C, after which the slides were washed thoroughly. Mounting medium with DAPI was added (Vector Laboratories, Burlingame, CA, USA) and coverslips were attached to facilitate imaging.

Immunohistochemistry
Immunohistochemistry was performed on the BenchMark Ultra automated staining instrument (Ventana Medical Systems, Oro Valley, AZ, USA). Paraffin sections (4 µm) were heated at 75 °C for 28 min and then deparaffinized in the instrument. Sections were treated with CC1 buffer for 64 min before incubation with the primary antibody (Ventana Medical Systems). For ALK staining, sections were incubated in a 1:50 dilution of the primary antibody (NCL-ALK, clone 5A4; Leica Biosystems, Wetzlar, Germany) for 2 h at 37 °C. For FGFR3 staining, sections were incubated in a 1:50 dilution of the primary antibody (FGFR3, clone B-9; Santa Cruz Biotechnology, Santa Cruz, CA, USA) for 1 h at room temperature, followed by a Ventana amplification step (Ventana Medical Systems). Bound primary antibody was detected using the Universal DAB Detection Kit (Ventana Medical Systems) and slides were counterstained with haematoxylin.

Results
Kinome-centred RNA sequencing identifies STRN as a novel ALK fusion partner
A kinome-centred RNA sequencing method was developed in which biotinylated RNA probes are used to selectively capture kinase transcripts prior to sequencing. The capture increases the coverage of target transcripts and provides a more sensitive way to detect mutations. We began by looking for kinases that were involved in fusion genes in a panel of 95 NSCLCs, which included 36 adenocarcinomas, 48 squamous cell carcinomas (SCCs), and 11 others (Supplementary Table 1). Hybridization to the probes targeting the human kinome resulted in an 18-fold enrichment in coverage for these transcripts (Supplementary Figure 1). Three analysis pipelines were employed to detect fusion transcripts, resulting in a list of 20 candidates (Supplementary Table 2). Of these, four were also present in normal tissue and were not considered further.
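One common way to express capture enrichment is the ratio of the on-target read fraction in a captured library to that in an uncaptured library. The Python sketch below assumes that definition, which may differ from the calculation behind Supplementary Figure 1; the function names and read counts are hypothetical and are not taken from the study.

```python
def on_target_fraction(kinase_reads, total_reads):
    """Fraction of aligned reads assigned to kinase transcripts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return kinase_reads / total_reads


def fold_enrichment(captured, uncaptured):
    """Ratio of on-target fractions; each argument is (kinase_reads, total_reads)."""
    baseline = on_target_fraction(*uncaptured)
    if baseline == 0:
        raise ValueError("uncaptured library has no kinase reads")
    return on_target_fraction(*captured) / baseline


# Hypothetical read counts (illustrative only, not data from the study).
fold = fold_enrichment(captured=(450_000, 1_000_000), uncaptured=(30_000, 1_000_000))
```

Working with fractions rather than raw counts keeps the comparison independent of the total sequencing depth of each library.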
The EML4-ALK fusion was identified in one adenocarcinoma and in the H3122 cell line, which was included as a positive control. ALK was also found in another fusion, which joined exon 3 of striatin (STRN) to exon 20 of ALK. The STRN-ALK fusion produces an in-frame protein that contains the first 137 amino acids from STRN joined to the last 339 amino acids of ALK, a region that includes the kinase domain (Figure 1). The EML4-ALK and STRN-ALK fusions were confirmed by RT-PCR with primers that spanned the breakpoint. STRN and ALK are both located on chromosome 2 but are separated by approximately 7 Mb; as the genes share the same transcriptional orientation, it is most likely that the fusion results from a large intra-chromosomal deletion. Rearrangement of the ALK locus was confirmed using FISH (Figure 1C). The rearranged STRN-ALK gene produced two distinct signals in each nucleus, suggesting that the rearranged locus had also been amplified (Figure 1C). Tumours that were positive for ALK fusions had the highest levels of ALK expression across the cohort (ranked 1 and 2 from a total of 95 samples). Staining with an antibody confirmed expression of ALK in the sample carrying the STRN-ALK fusion (Figure 1D).

FGFR3 is recurrently mutated in squamous NSCLC
We also detected two SCC samples that carried a candidate fusion involving FGFR3 and transforming acidic coiled-coil containing protein 3 (TACC3). FGFR3-TACC3 fusions were recently identified in glioblastoma and bladder cancer [19][20][21]. The rearrangement places the first 18 exons of FGFR3, including almost the entire open reading frame, upstream of the last seven exons of TACC3. The resulting fusion transcript is in frame, such that the last 226 amino acids of TACC3 are added directly to the truncated FGFR3 protein (amino acids 1-760). The C-terminus of the fusion protein includes a complete TACC domain (Figure 2A). RT-PCR and capillary sequencing were used to confirm the fusion between exon 18 of FGFR3 and exon 10 of TACC3 (Figure 2B). The tumour from one patient had very low levels of the fusion transcript. To ensure that this result was not due to contamination, we confirmed the presence of the fusion using independent material derived from FFPE blocks. A diagnostic FISH assay used to detect FGFR3 translocations did not detect rearrangement at the locus, which reflects the fact that the two genes are separated by only 48 kilobases (data not shown). Expression of the FGFR3 protein was markedly elevated in the samples that carried the FGFR3-TACC3 translocation (Figure 2C). As well as detecting the gene fusion, we also identified two SCCs that carried activating mutations in FGFR3. The mutation causes a serine-to-cysteine substitution at position 249 and is the most common FGFR3 activating mutation identified in bladder cancer (Figure 2D).
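The in-frame property of the FGFR3-TACC3 transcript can be expressed as a simple reading-frame check, sketched below in Python. The function names are hypothetical, the sketch assumes the junction falls on a codon boundary, and the TACC3 coding offset used in the example is an illustrative placeholder rather than a value from the text. Only the residue counts (760 from FGFR3, 226 from TACC3) are taken from the description above.

```python
def is_in_frame(upstream_cds_bases, downstream_cds_offset):
    """Return True when the 3' partner is read in its native frame.

    upstream_cds_bases: coding bases retained from the 5' partner.
    downstream_cds_offset: 0-based position, within the 3' partner's own
    coding sequence, of the first base retained in the fusion.
    The frame is preserved when both values are congruent modulo 3.
    """
    return upstream_cds_bases % 3 == downstream_cds_offset % 3


def fusion_length(upstream_residues, downstream_residues):
    """Predicted protein length, assuming a junction on a codon boundary."""
    return upstream_residues + downstream_residues


# Residue counts from the text; 1836 is a hypothetical TACC3 coding offset.
fgfr3_tacc3_length = fusion_length(760, 226)
in_frame = is_in_frame(760 * 3, 1836)
shifted = is_in_frame(760 * 3, 1837)
```

Under these assumptions the predicted fusion protein is 986 residues long. Shifting the junction by a single base breaks the frame, which is why out-of-frame candidates are unlikely to encode a functional kinase fusion.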

FGFR3 expression defines a subset of squamous NSCLCs
Expression of FGFR3 was assessed across a panel of 280 NSCLCs that included 136 squamous cell carcinomas and 144 adenocarcinomas. Strong FGFR3 expression was detected in ten SCCs (7.4%), whereas the adenocarcinomas were uniformly negative (Supplementary Figure 2). Tumours that had high FGFR3 levels were screened by RT-PCR, which revealed two additional cases in which FGFR3 was fused to TACC3 .
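The reported frequency follows directly from the counts given above (10 of 136 SCCs). The Python sketch below reproduces that percentage and adds a Wilson score interval to show how much uncertainty a cohort of this size carries. The interval is not part of the original analysis; it is included only as an illustration, and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


positive, total = 10, 136  # SCCs with strong FGFR3 expression, from the text
percent = round(100 * positive / total, 1)
low, high = wilson_interval(positive, total)
```

The Wilson interval is used here in place of the simpler normal approximation because it behaves better when the proportion is small.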

Discussion
In this report, kinome-centred transcriptome sequencing was applied to profile a series of primary lung cancers. We identified STRN as a novel fusion partner for ALK, adding to a growing list of oncogenic fusions found in adenocarcinoma. A number of different mutations were identified in FGFR3, including a recurrent fusion with TACC3, which provides much-needed insight into the oncogenic pathways operating in SCC and makes a strong case for applying FGFR inhibitors selectively in this group of lung cancers.
Systematic screening has identified fusion genes involving ALK, RET, and ROS1 in adenocarcinoma. We identified two patients who carried fusions involving ALK in our cohort, but no fusions involving RET or ROS1. This is likely due to the size of the patient cohort, which included only 36 adenocarcinomas, and the fact that smoking status and ethnicity are known to influence the frequency of these events [11,12,22]. Fusion to EML4 or STRN resulted in high-level expression of ALK, which was confirmed at both the level of mRNA and the level of protein. STRN joins a host of genes that serve as fusion partners for ALK in lung cancer, which emphasizes the importance of adopting screening platforms that do not depend on prior knowledge of the partner.
All of the fusion genes previously identified in lung cancer occur predominantly in adenocarcinoma, but our work draws attention to the role of FGFR3 specifically in squamous cancers. FGFR3 has been implicated in the pathogenesis of a range of cancers, but it is most commonly altered in bladder cancer and myeloma. Activation of FGFR3 in bladder cancer typically occurs through point mutations that stimulate dimerization and induce kinase activity [23]; these mutations are rarely seen in myeloma, where the gene is instead involved in a translocation that elevates expression [24]. Recently, FGFR3 was found fused to TACC3 in glioblastoma and also in bladder cancer [19][20][21]. The transforming activity of FGFR3-TACC3 is dependent on its kinase activity, but expression of FGFR3 itself is not sufficient to transform cells, indicating that the fusion protein has acquired a unique functional activity. It has been suggested that the transforming activity of FGFR3-TACC3 results from the loss of microRNA regulation [19], from changes in canonical signalling activity [21], or from relocation to the mitotic spindle mediated by the TACC domain [20]. FGFR3 was also identified in bladder cancer in a fusion with brain-specific angiogenesis inhibitor 1-associated protein 2-like 1 (BAIAP2L1). Because BAIAP2L1 does not contain a TACC domain, this suggests that, rather than specifically targeting FGFR3 to the spindle, the fusion partner stimulates activation through a conformational change or by providing a dimerization interface. The fact that both point mutations and fusions are found in bladder cancer and SCC supports the view that these events are functionally equivalent.
Recent genomic profiling in SCC has highlighted a number of new molecular targets, including the FGFR family receptors FGFR1, FGFR2, and FGFR3. It has been demonstrated that FGFR1 amplification occurs in approximately 10% of SCCs [25], but more extensive profiling identified low-frequency activating mutations and copy number alterations in all three receptors [4]. Importantly, it has been demonstrated that FGFR1-amplified lung cancer cell lines are highly sensitive to treatment with small molecule inhibitors [25], which formed the basis for clinical trials with FGFR inhibitors in SCCs with FGFR mutations (ClinicalTrials.gov Identifier: NCT01761747), or specifically in cases with amplified FGFR1 or FGFR2 (ClinicalTrials.gov Identifier: NCT01795768). FGFR inhibitors also show strong activity in bladder cancer cell lines that carry FGFR3 fusions [21,26] and prolong survival in mice that carry glioblastomas initiated by FGFR3-TACC3 [20]. Using expression data from The Cancer Genome Atlas (http://cancergenome.nih.gov/), Wu et al. recently identified many different FGFR3 fusions in solid cancers, as well as fusions that involved FGFR1 and FGFR2 [27]. Consistent with our findings, FGFR3 fusions were detected in SCC but not in lung adenocarcinoma. These findings provide a strong rationale for broadening the inclusion criteria of current clinical trials to capture SCC patients who carry FGFR fusion genes.
Our work has identified a subset of SCCs that express high levels of FGFR3 and has defined a range of activating mutations that target the receptor, including a recurrent translocation that fuses FGFR3 with TACC3. Larger patient cohorts will be required to define the biology of cancers that have activated FGFR3, both to understand their clinical course and to gauge the influence of ethnicity and smoking status on the frequency of these mutations. Detailed pharmacological studies will be required to assess the utility of FGFR inhibitors for treating lung cancers that carry the FGFR3-TACC3 fusion; in this respect, however, preclinical work with glioblastoma and bladder cancer models, and with lung cancer patients who carry other FGFR mutations, is very encouraging. It will also be important to extend our findings to look for similar activating mutations in other squamous cancers, particularly head and neck cancer, which shares many features with lung SCC [28,29]. Although we have learned a great deal about the mutational events driving lung cancer [3][4][5][6][7][8], our results demonstrate that targeted profiling of the transcriptome has the potential to expand our view of the mutational landscape, particularly for the detection of fusion genes, alternative splicing, and complex gene rearrangements.
Diagnostic screening for mutations in kinases such as ALK and FGFR-family members is complicated by the fact that they participate in a diverse array of gene fusion events and are also activated by other mechanisms, including gene amplification and single amino-acid substitutions. Our results suggest that current screening methods are missing patients who would benefit from inhibitors that target these kinases. Kinome-centred profiling has the advantage of being able to detect alterations in any expressed kinase and it can be used to identify a wide variety of oncogenic mutations. Focusing our sequencing efforts on known therapeutic targets will improve our ability to detect clinically actionable mutations and help to ensure that patients receive the most appropriate therapy.