FLT3‐ITD mutations in acute myeloid leukaemia – molecular characteristics, distribution and numerical variation

Recurrent somatic internal tandem duplications (ITD) in the FMS‐like tyrosine kinase 3 (FLT3) gene characterise approximately one third of patients with acute myeloid leukaemia (AML), and FLT3‐ITD mutation status guides risk‐adapted treatment strategies. The aim of this work was to characterise FLT3‐ITD variant distribution in relation to molecular and clinical features, and overall survival in adult AML patients. We performed two parallel retrospective cohort studies investigating FLT3‐ITD length and expression by cDNA fragment analysis, followed by Sanger sequencing in a subset of samples. In the two cohorts, a total of 139 and 172 mutant alleles were identified in 111 and 123 patients, respectively, with 22% and 28% of patients presenting with more than one mutated allele. Further, 15% and 32% of samples had a FLT3‐ITD total variant allele frequency (VAF) < 0.3, while 24% and 16% had a total VAF ≥ 0.7. Most of the assessed clinical features did not correlate significantly with FLT3‐ITD numerical variation or VAF. Low VAF was, however, associated with lower white blood cell count, while increasing VAF correlated with inferior overall survival in one of the cohorts. In the other cohort, ITD length above 50 bp correlated with inferior overall survival. Our report corroborates the poor prognostic association with high FLT3‐ITD disease burden, as well as extensive inter‐ and intrapatient heterogeneity in the molecular features of FLT3‐ITD. We suggest that future use of FLT3‐targeted therapy could be accompanied by thorough molecular diagnostics and follow‐up to better predict optimal therapy responders.

Despite two decades of accumulating data, FLT3-targeting therapeutics have provided limited benefit both as monotherapy and in combination therapy [23,25]. Results from the RATIFY trial, a large international multicentre phase III trial, recently demonstrated a 7.1% increase in 4-year overall survival and a 21% relative risk reduction in patients treated with the broadly acting kinase inhibitor midostaurin as maintenance therapy [25]. QuANTUM-R, a randomised controlled phase III trial, demonstrated a modest increase in overall survival in refractory or relapsed AML treated with the FLT3-specific inhibitor quizartinib as monotherapy, where median survival was 6.2 months in the exploratory arm compared to 4.7 months in the control arm [23]. In the phase III ADMIRAL trial, comparing gilteritinib treatment to salvage chemotherapy in FLT3-ITD mutated relapsed/refractory AML patients, the median overall survival was moderately improved from 5.3 months in the control arm to 9.3 months in the experimental arm [26]. FLT3-targeted therapy has also been shown to improve long-term outcome when administered as maintenance therapy of FLT3 mutated AML after allogeneic stem cell transplantation [27,28], a disease state characterised by low tumour burden and anti-leukemic immunological mechanisms.
Although the poor risk association of FLT3-ITD mutations in AML is well established, the impact of the significant inter- and intrapatient heterogeneity in various molecular features of ITDs is still unclear. Numerical variation of FLT3-ITD mutations, duplication length, duplication sequence and the insertion/duplication integration site are all characteristics of FLT3-ITD mutated AML that have been shown to influence disease outcome [29,30,15,31,10,32,33]. However, no clear consensus currently exists regarding the significance of these features. Furthermore, the molecular mechanisms underlying this diversity are unknown, and reports on the strength and direction of these various associations are conflicting [14,5,34,15,35,16,17,36,32,37,20,38]. Thus, understanding more about the heterogeneity and complexity of FLT3 mutations in AML may reveal relationships that could inform future efforts directed at improving FLT3-targeted approaches. In this report, we present results from retrospective molecular profiling of FLT3-ITD mutations in a total of 263 AML patients. We provide a comprehensive overview of the heterogeneity and impact of FLT3-ITD mutations in AML by assessing the numerical variation, variant allele distribution and the relationship with clinical features as well as with FLT3-ITD molecular characteristics like length, sequence and integration site correlated to overall survival in two independent AML cohorts.

Ethics
Clinical trials were approved by local ethics committees and performed in accordance with the Declaration of Helsinki. All participants signed and submitted written informed consent at trial inclusion. The consent covered use of biological material for research not directly related to the clinical trial.

Sample processing
Sampling and data gathering were performed as previously described [45,48,49]. In short, bone marrow (BM) and/or peripheral blood samples were collected at time of study inclusion. The mononuclear cell fraction was isolated by Ficoll-Hypaque centrifugation and the cells were cryopreserved and stored at the Erasmus University Medical Centre, Rotterdam, until further processing.

DNA fragment analysis by capillary electrophoresis
Length mutations in the juxtamembrane region of the FLT3 gene were validated and characterised by DNA fragment analysis by capillary electrophoresis. The procedure was performed independently for the C1 and C2 cohorts at two separate centres. Samples in C2 were analysed as previously described [45]. For samples in C1, the concentration of complementary DNA (cDNA) and genomic DNA (gDNA) was quantified by NanoDrop 1000 (ThermoFisher) and normalised to approximately 20 ng·µL⁻¹ with ddH2O. 1 µL of each sample was subsequently amplified by polymerase chain reaction (PCR), according to standard protocols. We used AmpliTaq Gold 360 Master Mix (Applied Biosystems, Waltham, MA, USA. Cat nr 4398881) and two discrete primer sets, one for the cDNA reactions and one for the gDNA reactions, respectively: 11F [6FAM]GCAATTTAGGTATGAAAGCCAGC and e15R1 CATAAGCTGTTGCGTTCATCAC, and i13F [6FAM]GCAGAACTGCCTATTCCTAACTG and e15R1 CATAAGCTGTTGCGTTCATCAC (desalted, Sigma-Aldrich, St. Louis, MO, USA). The PCRs, performed on a thermal cycler (GeneAmp PCR System 9700, Applied Biosystems/S1000 Thermal Cycler, Bio-Rad, Hercules, CA, USA), were run according to the following profile: initialisation at 95°C for 10 min, permitting enzyme activation, followed by 29 thermal cycles of 94°C for 30 s, 62°C for 30 s and 72°C for 1 min, allowing denaturation, annealing and elongation. The reaction was completed by a final elongation step at 72°C for 7 min, before the samples were cooled to 4°C until further processing. The amplified fragments were mixed with a size marker (GeneScan-500 ROX, Applied Biosystems) and Hi-Di Formamide (Applied Biosystems) according to the manufacturer's instructions and separated by size using capillary electrophoresis (ABI 3100 Genetic Analyzer (POP4 polymer), Applied Biosystems, Thermo Fisher Scientific, Waltham, MA, USA).
Data were analysed in Peak Scanner (Applied Biosystems) in accordance with the developer's guidelines, determining fragment size relative to an internal control. All analyses were performed in triplicate.

Cloning
cDNA was amplified by PCR as described above, but run for a total of 35 cycles and using a forward primer without a 6FAM label. The amplified PCR products were subsequently cloned using TOP10 chemically competent cells according to the TOPO TA cloning manual (Invitrogen, Thermo Fisher Scientific, Waltham, MA, USA). Positive colonies were directly PCR-amplified and the fragments were analysed by capillary electrophoresis, as described above.

Sanger sequencing
Positive clones (defined as PCR fragments larger than the estimated length of the wild-type FLT3 fragment) were re-amplified using a forward primer without the 6FAM label, purified using ExoSAP-IT (Applied Biosystems) or illustra ExoProStar 1-Step (VWR, Radnor, PA, USA) and PCR-amplified for sequencing. The BigDye Terminator v1.1 cycle sequencing kit (Applied Biosystems) was used to perform direct sequencing and the products were analysed on an ABI 3730 Genetic Analyzer (POP7 polymer, Applied Biosystems) according to the manual. The sequences were analysed using FinchTV (Geospiza Inc., Seattle, WA, USA).

Statistical methods
Peaks larger than the peak representing the FLT3 wild-type product, identified in all three technical replicates, were considered to represent probable individual FLT3-ITD mutations. Fragment length of the PCR product was calculated as the mean value of three replicates. The relationship between the wild-type peak and additional peaks in the sample was calculated as variant allele frequencies (VAF) (individual fragment/sum of fragments in the sample). A total VAF (t-VAF) was calculated for each sample, representing the load of FLT3-ITD mutants (sum aberrant fragments/sum of all fragments). A VAF of 0 indicates no detected mutation, whereas a VAF of 1 indicates loss of the wild-type allele in all cells.
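The VAF and t-VAF definitions above can be illustrated with a minimal sketch. It assumes that each peak is summarised by a single signal value (for example peak area; the text does not specify area or height), and the function name and example values are hypothetical.

```python
def variant_allele_frequencies(wild_type_signal, itd_signals):
    """Return (per-ITD VAFs, total VAF) for one sample.

    wild_type_signal: signal of the wild-type FLT3 peak (assumed: peak area).
    itd_signals: signals of each peak longer than the wild-type peak.
    """
    total = wild_type_signal + sum(itd_signals)
    if total <= 0:
        raise ValueError("sample contains no signal")
    # Individual VAF: individual fragment / sum of fragments in the sample
    vafs = [signal / total for signal in itd_signals]
    # Total VAF: sum of aberrant fragments / sum of all fragments
    t_vaf = sum(itd_signals) / total
    return vafs, t_vaf


# Hypothetical sample with one wild-type peak and two ITD peaks
vafs, t_vaf = variant_allele_frequencies(6000.0, [3000.0, 1000.0])
print(vafs, t_vaf)
```

With these definitions, a sample without ITD peaks has a t-VAF of 0, and a sample without a wild-type peak has a t-VAF of 1.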
We performed descriptive and univariate analyses to characterise the cohorts based on disease-related variables. The Wilcoxon signed-rank test/Mann-Whitney (non-parametric) test was applied for pairwise comparison of continuous variables. For comparison of categorical variables, we constructed 2 × 2 tables and applied the 2-sided Fisher exact test. Pearson correlation was used to test relationships between continuous variables. For comparison of patient, disease and survival differences with respect to FLT3-ITD t-VAF, FLT3-ITD length and FLT3-ITD insertion site, the variables were dichotomised in agreement with optimally selected cutpoints calculated by maximally selected rank statistics. All statistical tests comparing clinico-pathological features across groups are summarised in the supplementary tables (Tables S1-S6). Notably, not all tests comprised the full sample set due to incomplete data. Median follow-up was estimated by the reverse Kaplan-Meier method. Overall survival was calculated by the Kaplan-Meier method and visualised by Kaplan-Meier plots. The 2-sided log-rank test was applied to compare the Kaplan-Meier estimates. Logistic regression analysis was applied to identify factors most closely associated with overall survival. The multivariate Cox proportional hazards regression model included age, sex, and WBC count in addition to FLT3-specific variables. Statistical significance was defined as P-value ≤ 0.05. All statistical calculations and graphical representations were performed in RStudio (version 1.1.453) and R (version 3.5.0) [50]. Supplementary tables include P-values adjusted for multiple testing calculated by the Benjamini and Hochberg method [51].
Of the 432 and 625 patients included in the initial screen for FLT3-ITD mutations, a total of 117 (27.1%) and 146 (23.4%) FLT3-ITD-positive samples were identified in C1 and C2, respectively. There was no significant difference in the fraction of FLT3-ITD-positive samples between the two cohorts.
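For readers unfamiliar with the Benjamini and Hochberg adjustment mentioned above, the following is a minimal sketch of the step-up procedure. It is not the code used in the study (the analyses were run in R); the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw P-values from four independent comparisons
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Each raw P-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values non-decreasing in the rank order.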
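The comparison of FLT3-ITD-positive fractions between the cohorts can be reproduced in outline with a 2-sided Fisher exact test. The sketch below is an illustration in plain Python rather than the R code used in the study; the function name is hypothetical, and the 2 × 2 table is derived from the counts stated above (117 of 432 in C1, 146 of 625 in C2).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of observing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-7)
    )


# Rows: C1, C2; columns: FLT3-ITD positive, FLT3-ITD negative
p_value = fisher_exact_two_sided(117, 432 - 117, 146, 625 - 146)
print(f"C1: {117 / 432:.1%}, C2: {146 / 625:.1%}, P = {p_value:.3f}")
```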

FLT3-ITD variant allele distribution
The relationship between FLT3-ITD variant alleles and wild-type alleles assessed in gDNA is a direct function of the cellular distribution of FLT3-ITD mutated cells in the sample, while the same relationship in cDNA is a function of the expression of the various FLT3 alleles. It is not clear whether the wild-type and mutated alleles are equally expressed. We therefore correlated the VAF estimated from cDNA and gDNA in 84/116 samples from C1 and identified 95 corresponding ITDs in 82 patients (Fig. S1A). The overall correlation of VAF in cDNA and gDNA was very strong (n = 95, R = 0.96, P < 2.2e-16) (Fig. 1B). Based on this relationship and the availability of cDNA for most samples, we proceeded with molecular assessment of FLT3-ITDs with respect to length and the relationship with the wild-type allele in cDNA for 116/117 cases in C1 and 117/146 cases in C2. In C2, we included six cases where cDNA was missing, and the analysis was performed using gDNA.
In samples exhibiting plural ITDs, we observed no clear pattern in the size (as assessed by VAF) of LM1 in relation to the remaining mutant alleles of lower VAF; some patients were characterised by one dominating ITD, while other patients displayed co-existence of multiple similarly sized leukemic cell populations harbouring distinct FLT3-ITDs (Fig. 2A). In C1, the ITD length distribution of LM1 across patients did not significantly differ from the ITD length distribution of mutant alleles with lower VAF (24 LM1: 42 bp vs 28 LM-non-LM1: 51 bp, P = 1). However, we observed a tendency towards shorter length of the mutant alleles with lower VAF in C2 (35 LM1: 60 bp vs 49 LM-non-LM1: 42 bp, P = 0.053). In C1, the second largest mutant allele (LM2) exceeded the length of LM1 in 58% of cases (14/24), while the same was true in 28% of cases in C2 (10/35) (Fig. 2B).
duplicated tyrosine residues as well as integration site (illustrated in Fig. 3A). A total of 74 ITDs were characterised (Fig. 3B). We identified one sequence in each of 58 samples and two sequences in eight samples. In one sample (6366), we identified one ITD, one 6 bp insertion and one 12 bp deletion. 54% (40/74) of ITDs were preceded by insertions of varying length, with a median of 3 bp and a range up to 24 bp, all expected to result in altered amino acid sequence. Previous studies have demonstrated heterogeneity of the duplicated motif, with hardly two identical ITDs within a study population [14,34,20]. Further, the duplicated sequence can cover several functionally distinct entities of the gene. Despite this heterogeneity, there seem to be some highly conserved elements, which we confirm in our cohorts. All ITDs span at least one tyrosine residue from the tyrosine rich stretch Y591-Y599 (YVDREYEY). We identified six ITDs that span a single tyrosine residue, 36 ITDs that span two tyrosine residues, eight that span three and 24 that span four tyrosine residues.
The number of duplicated tyrosine residues correlated with ITD length (Fig. 4A), but no association was found with WBC counts, BM blast percentage or t-VAF (Fig. 4B-D). We further assessed the ITD integration site in accordance with the functional structure of the FLT3 protein in line with previous reports [35,20]. The position of the integration site strongly correlated with FLT3-ITD length (R = 0.6, P < 0.001) (Fig. 4E, F). Analogously, integration region correlated with FLT3-ITD length (Fig. 4G). We found that 28% (21/72) of sequences integrated within the tyrosine kinase domain 1, with 19 sequences located in beta sheets 1 and 2 in the nucleotide-binding loop. The remaining sequences were located in the juxtamembrane domain, with two in the Switch motif, 40 in the Zipper motif and 11 in the hinge region. No association was found with WBC counts or BM blast percentage (Fig. 4H,I). Integration in the hinge region was associated with higher t-VAF (Fig. 4J).
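The paired comparison of cDNA and gDNA VAF described above rests on the Pearson correlation coefficient. A minimal sketch of that calculation follows; the function name and the paired values are hypothetical and are not data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient for two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for constant input")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical paired VAF estimates for the same ITDs in cDNA and gDNA
cdna_vaf = [0.10, 0.25, 0.40, 0.55, 0.80]
gdna_vaf = [0.12, 0.22, 0.43, 0.50, 0.78]
print(round(pearson_r(cdna_vaf, gdna_vaf), 3))
```

A coefficient close to 1 would indicate that the VAF measured in cDNA tracks the VAF measured in gDNA, which is the pattern the text reports for C1.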

Outcome of FLT3-ITD mutated AML patients
Next, we assessed the outcome of the FLT3-ITD mutated patients in the two cohorts. The median follow-up time was 113.7 months (95% CI 102.5-122.9) and 42.3 months (95% CI 40.9-43.7) in C1 and C2, respectively. During the treatment course, 24% (27/111) and 67% (82/123) of patients in C1 and C2 underwent allo-HSCT, respectively. An additional 15 patients in C1 and 13 patients in C2 received an auto-HSCT. In C1, FLT3 mutation status was retrospectively assessed and did not influence treatment decisions, while in C2, individuals characterised as FLT3-ITD mutated were usually recommended an allo-HSCT in first complete remission if considered eligible. For survival analysis, however, no patients were censored at time of HSCT. As expected from the composition of the two cohorts and their temporal separation, C2 had significantly superior survival, with a median survival of 32.0 months (95% CI 27.4-41.8 months) as compared to 18.8 months (95% CI 16.0-24.7 months, P = 0.031) for C1. This was also true for the FLT3 mutated patients, who had a median overall survival of 9.2 months (95% CI 8.11-1-5-5) in C1 (n = 111) and 17.5 months (95% CI 13.7-40.7) in C2 (n = 123) (P = 0.022).
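The median survival figures above are Kaplan-Meier estimates. As an illustration of how such an estimate is obtained, the sketch below implements the product-limit estimator in plain Python. The function names and follow-up data are hypothetical, and the median is taken as the first time at which estimated survival is at or below 0.5, which is one common convention and an assumption here.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    events[i] is True if the patient died at times[i], False if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == current_time:
            deaths += int(data[i][1])
            leaving += 1
            i += 1
        if deaths:
            # Product-limit step: multiply by the fraction surviving this time
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which estimated survival is <= 0.5, or None if not reached."""
    for time, survival in curve:
        if survival <= 0.5:
            return time
    return None


# Hypothetical follow-up in months; False marks a censored patient
months = [2, 4, 4, 6, 9, 12]
died = [True, True, False, True, False, True]
curve = kaplan_meier(months, died)
print(curve, median_survival(curve))
```

Applying the same estimator with the event indicator inverted (censoring treated as the event) gives the reverse Kaplan-Meier estimate of median follow-up.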
Median survival was significantly shorter in the group characterised by a long ITD sequence (length of LM1 ≥ 50 bp) in C1. Patients with LM1 < 50 bp (n = 55) had a median survival of 15.21 months (95%

Discussion
Here, we have described the distribution of the FLT3-ITD mutated alleles in two separate cohorts of treatment-naïve AML patients and related it to clinical and molecular characteristics as well as outcome. The associations related to FLT3-ITD mutations, including younger age, female sex, higher WBC counts and higher BM blast percentage as well as cytomorphology, cytogenetics and molecular genetics were mostly consistent with previous reports [14,52,15,53,18,54,11]. Among these observations, the coherent associations between FLT3-ITD status and clinical features not attributable to downstream effects of the mutation are particularly interesting. This includes the association between FLT3-ITD and female sex and younger age [14,54] (although not statistically significant in our study). The multivariate Cox regression model further identified females as a subgroup that had inferior survival within the FLT3-ITD positive population in C1. Cytogenetic as well as molecular genetic variation in relation to age has previously been described in AML [55,56], raising questions regarding age-specific aetiology or whether downstream effects of a mutated gene product may be influenced by age. Such a mechanism has been experimentally substantiated by Porter and colleagues, demonstrating that the phenotypic transition following FLT3-ITD mutations in murine models varied between foetal or neonatal mice and adult mice, where only adult mice developed leukemic phenotypes [57]. Sex-specific mutational patterns could indicate a similar mechanism. Positive and negative associations between FLT3-ITD mutation status and co-occurrence or mutual exclusivity with cytogenetic and mutational aberrations like DNMT3A and NPM1 are also recurring findings. Both DNMT3A and NPM1 mutations frequently precede FLT3-ITDs, as inferred by recurring VAF patterns [58] and single cell sequencing data [59,60].
Experimentally, ectopic FLT3-ITD expression in FLT3 wild-type background is known to be detrimental, even in a genetic background that frequently co-occurs with FLT3-ITDs [61]. In sum, this suggests that the 'driver' qualities attributed to FLT3-ITD mutations may at least in part be determined by the gene-context, including systemic conditions (by the association with age and sex) as well as intracellular gene-context (as supported by the relationship with cytogenetic and molecular genetic features).
Our results corroborate that intra-tumour plurality of FLT3-ITD mutations at time of diagnosis is a frequent characteristic of FLT3-ITD mutated AML [29,30,15,10,32,20,33]. Considering the low mutation rate of haematopoietic stem and progenitor cells, estimated to comprise as little as one acquired exonic mutation per decade [62], as well as the relative stability of somatic variants through single AML disease courses [63,58], it seems implausible that multiple FLT3-ITD mutations are acquired synchronously. Indeed, reports determining FLT3-ITD numerical variation with higher sensitivity assays suggest that plurality of FLT3-ITDs is strongly underestimated [29,33], which could account for the absence of significant associations observed when comparing patients with one or several FLT3-ITD mutations when assessed by a low sensitivity assay as we have done. A recent report identified up to 16 discrete FLT3-ITD mutations in an individual patient sample, with an average of 3.8 FLT3-ITDs identified per sample when assessed by a deep next-generation sequencing approach [29]. Longitudinal assessment of FLT3-ITD mutated AML patients [29,64,65] as well as single-cell sequencing studies [59,60] suggest that discrete FLT3-ITDs are indicative of separate cell populations derived from multiple individual cells that independently acquired FLT3-ITD mutations. Recent work in healthy individuals has demonstrated the ubiquity of mutations known to co-occur in FLT3-ITD mutated AML, as well as age-dependent expansion of cell populations characterised by such mutations, including DNMT3A and TET2 [66][67][68]. Interpreted along with the high frequency of plural FLT3-ITDs in FLT3-ITD mutated AML, this could suggest that FLT3-ITD mutations are more prevalent than commonly suggested, implying that FLT3-ITD mutations alone are insufficient to trigger AML disease eruption.
As has been well established [20,11,69], we demonstrate heterogeneity of FLT3-ITD variant allele distribution at the time of diagnosis. Molecular characteristics of the dominating FLT3-ITD mutation, including length, number of tyrosine residues or insertion integration site, did not correlate with mutational load. However, FLT3-ITD length exceeding 50 bp was associated with inferior overall survival in C1. No clear consensus currently exists regarding the prognostic impact of ITD length, although some studies have suggested that longer ITDs could be detrimental [70,38]. However, we did not observe a similar association in C2, and the significance of this finding is therefore uncertain. Conversely, high mutational load of FLT3-ITD has been well established to correlate with poor prognosis in AML [12,13,71,19]. This was also evident from our analyses, although only significant in C2. Of note, differences in cohort composition and disparity of treatment, mainly resulting from the temporal separation of the two cohorts, are important confounders for these results. In C2, FLT3 mutation status was prospectively determined and used to guide risk-adapted treatment decisions. This is evident from the significantly higher proportion of FLT3-ITD mutated patients in C2 receiving allogeneic haematopoietic stem cell transplantation (67% vs 24%), which is further reflected in the significantly superior overall survival of FLT3-ITD mutated patients in this cohort. Furthermore, C1 is confined by available sample material and is therefore enriched for patients with high disease burden (i.e. high WBC counts). This could explain the relatively weak impact of t-VAF on overall survival in this cohort as compared to C2, as WBC count was found to correlate with VAF in both cohorts.
Of note, it was recently reported that risk-adapted treatment strategies in AML appear to have eliminated the poor risk association with FLT3-ITD [72]. We have confirmed this observation in the HOVON cohorts in a separate study; FLT3-ITD mutated patients in C1 had a significantly worse outcome compared to patients with FLT3wt, while there was no difference in overall survival between FLT3wt and FLT3-ITD in C2 [73]. Interestingly, in the present study we still identified a poor risk association with high mutational burden within the FLT3-ITD mutated subgroup of this cohort. This relationship between FLT3-ITD VAF and outcome suggests that it is primarily the expansion of FLT3-ITD mutated cell populations, rather than the existence of FLT3-ITD mutated cells, that correlates with inferior overall survival. Based on the observed co-existence and often parallel expansion of multiple FLT3-ITD mutations, one cannot exclude that the poor outcome of this subgroup may in part be a function of an underlying systemic mechanism permitting emergence of leukemic properties in multiple FLT3-ITD mutated haematopoietic stem or progenitor cells synchronously. Furthermore, elevated expression of the DNA polymerase terminal deoxynucleotidyl transferase (TdT) is suggested to be related to formation of FLT3-ITDs in AML [74,75], which may represent a mechanism for the repeated generation of unique length mutations in AML progenitor cells. Such conditions could perhaps account for the complexity of FLT3-ITD mutation distribution and dynamics as well as the persistent finding of FLT3-ITDs as biomarkers of inferior outcome.

Conclusions
It is clear that FLT3-ITD mutations in AML are characterised by significant inter- and intra-patient heterogeneity. The biological significance of this heterogeneity is not clear, although the poor prognostic impact associated with high VAF and length of the duplicated sequence suggests that these observations are not trivial. It is reasonable to suggest that this heterogeneity could also pose a significant challenge with regard to FLT3-targeting therapy. Perhaps this could provide a partial explanation for the very limited progress made to date using FLT3-targeting inhibitors. We suggest that a thorough molecular characterisation of FLT3-ITDs in AML patients undergoing FLT3-targeting therapy could provide novel biological insight that could ultimately increase predictive and therapeutic precision in FLT3-ITD mutated AML.
biological and clinical data and contributed to experimental design. AB, LW, RH and SLB performed experiments and TG and AAH were involved in data analysis. All authors contributed to preparation of the manuscript.

Data accessibility
Individual participant data from the HOVON1 and HOVON2 cohorts are not publicly available. Information about the individual HOVON trials including study protocols is available at http://www.hovon.nl.