Genomic tools in acute myeloid leukemia: From the bench to the bedside


  • Brian S. White PhD,

    1. Department of Internal Medicine, Division of Oncology, Washington University School of Medicine, St. Louis, Missouri
    2. The Genome Institute, Washington University, St. Louis, Missouri
  • John F. DiPersio MD, PhD

    Corresponding author
    1. Department of Internal Medicine, Division of Oncology, Washington University School of Medicine, St. Louis, Missouri
    2. Siteman Cancer Center, Barnes-Jewish Hospital, Washington University School of Medicine, St. Louis, Missouri
    • Corresponding author: John F. DiPersio, MD, PhD, Division of Oncology, Campus Box 8007, Washington University Medical School, 660 South Euclid Avenue, St. Louis, MO 63110; Fax: (314) 454-7551;


  • We thank Joshua McMichael for assistance with figure design and Nicole Maher for critical review of the manuscript.


Since its use in the initial characterization of an acute myeloid leukemia (AML) genome, next-generation sequencing (NGS) has continued to molecularly refine the disease. Here, the authors review the spectrum of NGS applications that have subsequently delineated the prognostic significance and biologic consequences of these mutations. Furthermore, the role of this technology in providing a high-resolution glimpse of AML clonal heterogeneity, which may inform future choice of targeted therapy, is discussed. Although obstacles remain in applying these techniques clinically, they have already had an impact on patient care. Cancer 2014;120:1134–1144. © 2014 American Cancer Society.


Most patients with acute myeloid leukemia (AML) are cytogenetically normal (CN-AML). Their variable overall survival suggests clinical and biologic heterogeneity and a need for additional biomarkers. Because many of these patients have no detectable copy number alterations, mutations in the diploid genome are likely pathogenic. Hence, lesions driving disease progression in patients without known recurrent mutations[1] may be discovered by whole-genome sequencing (WGS) and whole-exome sequencing (WES). Indeed, the decade after the sequencing of the first cancer genome[2] (that of an individual presenting with CN-AML) has witnessed an explosion of this technology[3] that has advanced our understanding of AML through the discovery of mutations in DNA(cytosine-5-)-methyltransferase 3α (DNMT3A)[4] and isocitrate dehydrogenase 1 (nicotinamide adenine dinucleotide phosphate+), soluble (IDH1)[5] and the recent finding[6] that nearly all patients harbor at least 1 likely pathogenic mutation.

The prognostic significance and functional consequences of these and other recently discovered mutations are being elucidated. Translating these results to the clinic and assessing the impact of “targeted therapies” may further require determining whether mutations reside in subclones and how the latter evolve over time or after treatment. Just as WGS and WES have facilitated mutation discovery, these subsequent challenges are being addressed by a diverse array of next-generation sequencing (NGS) technologies, including transcriptome, methylome, and targeted, custom capture sequencing.

Mutation Discovery

The capillary electrophoresis of individual, fluorescently labeled Sanger reaction products, which was used to originally sequence the human genome over 10 years at an expense of several billion dollars, has since given way to “NGS” platforms. These can resequence the human genome at a fraction of the original cost (approximately $10,000 per tumor/germline pair) in 4 to 6 weeks. These platforms require a library preparation phase, in which synthetic DNAs (adapters) are ligated onto the ends of the fragmented DNA to be sequenced. The fragments are then subjected to a polymerase-mediated reaction, the intermediate products of which can be monitored in real time by the platform's optical instruments. For example, the Illumina platform (Illumina, San Diego, Calif) detects the incorporation of fluorescently labeled nucleotides during an amplification reaction, with noninterference between contiguous nucleotides ensured by a step-wise reaction in which an incorporated nucleotide's 3′ blocking group prevents further extension until its dye has been detected and its blocking group removed by chemical cleavage. These platforms provide a nucleotide resolution view of the genome that, for example, yields precise breakpoints of chromosomal rearrangements coarsely detected by less sensitive approaches, such as spectral karyotyping. The reads have characteristic lengths (generally from 25 to 100 base pairs [bp]) and error profiles (often reflecting the error-prone polymerase) that are platform dependent. These impact a platform's suitability for a particular application (eg, the short reads and greater depth of the Illumina platform are suited for quantitating single nucleotide variants [SNVs], whereas the longer reads of the Pacific Biosciences RS platform [Pacific Biosciences of California, Menlo Park, Calif] facilitate the discovery of structural variants [SVs]).

Whole-Genome Sequencing

WGS provides comprehensive DNA sequencing of the entire genome. Thus, Ley and colleagues chose[7] it to elucidate the unknown initiating event in tumors from 2 patients with CN-AML through a series of studies.[2, 4, 5] These led to the discovery of mutations in IDH1 and DNMT3A and have been extensively reviewed elsewhere.[7-11] In addition, they established several paradigms that have guided subsequent genomics studies in hematologic[12, 13] and solid tumors, including validation of sequencing results using an orthogonal platform and comparison of tumor and matched normal samples from the same patient to discover acquired, somatic variants. When matched normal samples are not available, putative variants instead may be filtered if they occur in a cohort of (unmatched) normal samples or are annotated in single nucleotide polymorphism (SNP) databases.
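The tumor/normal comparison and the fallback filtering described above amount to set subtraction over variant calls. The Python sketch below is illustrative only: the `Variant` record, the function name `somatic_candidates`, and the toy coordinates are assumptions introduced here and do not come from any cited pipeline. Real variant callers weigh read-level evidence rather than relying on exact matches.

```python
from typing import Iterable, NamedTuple, Optional, Set


class Variant(NamedTuple):
    """Hypothetical minimal variant record (chromosome, position, alleles)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(
    tumor: Iterable[Variant],
    matched_normal: Optional[Iterable[Variant]] = None,
    normal_panel: Iterable[Iterable[Variant]] = (),
    known_snps: Iterable[Variant] = (),
) -> Set[Variant]:
    """Return tumor variants that are not explained by germline evidence.

    If a matched normal is available, subtract its variants. Otherwise,
    subtract anything seen in a panel of unmatched normals or annotated
    in a SNP database.
    """
    tumor_set = set(tumor)
    if matched_normal is not None:
        return tumor_set - set(matched_normal)
    germline = set(known_snps)
    for sample in normal_panel:
        germline |= set(sample)
    return tumor_set - germline


# Toy data; the coordinates are placeholders, not real loci.
shared = Variant("chr1", 100, "A", "G")
acquired = Variant("chr2", 200, "C", "T")
print(somatic_candidates([shared, acquired], matched_normal=[shared]))
```

Without a matched normal, a rare germline variant absent from both the panel and the database would survive this filter, which is one reason paired samples are preferred.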

In the latest of the trio of studies,[5] paired-end sequencing from both ends of a DNA fragment, as opposed to single reads, provided greater genomic context and facilitated alignment of reads to the reference genome. This, coupled with maturing variant-calling algorithms[14] that analyze mapped reads to infer SNVs in the presence of sequencing errors, (mis)alignment artifacts, and tumor/normal contamination, dramatically reduced variant-calling false-positive rates. These advances accentuate the innate capability of WGS to characterize the full range of mutations, including intronic and exonic SNVs, insertions and deletions (indels), copy number alterations, and SVs (including fusions/translocations).[15]

Whole-Exome Sequencing

Because mutations affecting protein function are likely within coding exons, sequencing the exome (ie, the coding exons of annotated genes) using WES is a cost-effective alternative to WGS. WES targets exons through a capture-based library preparation phase using probes whose length, number, and exonic targets vary across platforms.[16] This approach captures regions flanking the probes (eg, approximately 100 bp, depending on fragment length), including those in introns and untranslated regions. However, promoters, enhancers, and intronic splicing silencers or enhancers far outside these targeted regions will not be sequenced. Furthermore, some regions, particularly those with extreme GC content,[6, 16, 17] are difficult to capture and hence are under-represented. Nevertheless, WES is attractive in limiting analysis to the approximately 1% to 2% of the genome most likely to be of pathogenic interest.

Several groups have applied WES to rationally selected genotypes to minimize intersample heterogeneity and to enrich for subtype-specific mutations. This approach has identified mutations in DNMT3A in acute monocytic leukemia[18]; in B-cell leukemia 6 corepressor (BCOR) in a molecularly screened CN-AML that was free of known oncogenic mutations in NPM1, CCAAT/enhancer binding protein α (CEBPA), fms-related tyrosine kinase 3 (FLT3), or myeloid/lymphoid or mixed-lineage leukemia (MLL)[19]; and in GATA2 in patients who had CN-AML with biallelic CEBPA mutations.[20] WES and WGS also have identified splicing factor mutations in splicing factor 3b subunit 1 (SF3B1), SRSF2, and U2 small nuclear RNA auxiliary factor 1 (U2AF1) in myelodysplastic syndromes (MDS) (reviewed by Ogawa[21]). Recurrent mutations of the spliceosome, including in U2AF1, have subsequently been discovered in AML using both platforms.[6]

Transcriptome Sequencing (RNA-seq)

Unbiased sequencing of the transcriptome (RNA-seq) offers several advantages[22] with respect to WGS and WES: it facilitates the discovery of novel transcripts and of alternative splicing events and trans-splicing[23] or read-through[15] fusion events that cannot be detected from genomic DNA; and it reduces false-positive results by enriching for expressed transcripts and their variants, which are more likely to be pathogenic, and quantitates this expression digitally. However, it is ineffective in discovering mutations that destabilize transcripts (eg, by inducing nonsense-mediated decay [NMD][24]) or are rarely expressed. Furthermore, RNA-seq involves additional complexity during the preparation of complementary DNA (cDNA) sequencing libraries from RNA,[25] including the need to cope with potential degradation of unstable RNA and the sequence and structural dependence of cDNA synthesis and hybridization.[22] Variations in this step accommodate different downstream analyses. For example, polyadenylation fractionation enriches for expressed mRNA relative to noncoding RNA, whereas size selection of unfractionated total RNA enriches for microRNA (miRNA).

Several groups[26-30] have used RNA-seq to discover somatic mutations in AML at considerably reduced cost and effort relative to WGS. Greif et al[26] identified mutations in RUNX1, TLE4, and SHKBP1; McNerney et al[27] discovered that cut-like homeobox 1 (CUX1) on chromosome 7q was expressed at haploinsufficient levels in monosomy 7/del(7q) de novo and therapy-related AML samples; Wen et al[28] identified 7 novel fusions specific to CN-AML and an additional CIITA-DEXI fusion that occurred in 14 of 29 (48%) CN-AML samples; Masetti et al[29] discovered a recurrent CBFA2T3-GLIS2 fusion in 3 of 7 patients with childhood CN-AML; and Walter et al[30] identified a novel ITGA5 splice variant as a potential relapse risk factor by RNA sequencing of relapsed patients who had been classified as low risk based on known cytogenetic and molecular markers.

Ramsingh et al[31] characterized miRNA expressed in a patient with CN-AML by sequencing size-selected cDNA using the Sequencing by Oligonucleotide Ligation and Detection (SOLiD) platform (Life Technologies, Carlsbad, Calif) and discovered outlying expression of miR-233 beyond the dynamic range of miRNA microarrays and reverse transcriptase-polymerase chain reaction. Subsequent miRNA-seq studies have detected 6 miRNA biomarkers in circulating blood that differentiate AML patients from controls[32] and have uncovered 2 miRNAs whose loss leads to leukemia-related diseases in mice.[33]

Frequency and Prognostic Significance of Mutations

Discovered somatic variants are frequently validated on an orthogonal platform, eg, mutations detected on the Illumina platform may be validated using custom primer amplification followed by direct sequencing[2] or NGS on the Roche GS FLX system (Roche Diagnostics, Basel, Switzerland).[5] Alternatively, deep read counts can provide validation. Deep amplicon sequencing, ie, amplification using custom-designed polymerase chain reaction primers followed by deep NGS, is 1 such approach. Another involves liquid hybridization capture using custom sequence probes designed to cover the region of interest (eg, spanning an SNV, an indel, or all exons within a gene) and subsequent deep NGS.[34, 35] The scalability of custom probe approaches is attractive when validating many variants.

Both amplicon-based and capture-based strategies are useful in defining mutation frequencies across a large cohort and in clinical correlation studies.[1] Amplicon pyrosequencing has been extensively used[36-38] to determine the clinical significance of tet methylcytosine dioxygenase 2 (TET2) mutations,[39] although the findings are inconsistent.[40] In 1 study, little correlation was observed between outcomes and TET2 mutations in patients with MDS,[36] whereas another study revealed their correlation with inferior event-free survival in patients with de novo CN-AML,[37, 38] and particularly in the European LeukemiaNet favorable-risk subgroup.[38] Furthermore, amplicon sequencing using the Illumina[41] and Roche[42-44] platforms has associated SF3B1 mutations with MDS characterized by ring sideroblasts and a good clinical outcome.

Many groups have demonstrated frequent recurrent mutations in multiple genes that impact prognosis. These include mutations and deletions of the tumor protein P53 gene (TP53) in cytogenetically complex AML[45]; MLL fusions[46]; and FLT3-ITD (internal tandem duplication), FLT3-TKD (SNVs in the tyrosine kinase domain),[47] DNMT3A,[48-53] and additional sex combs like 1 (ASXL1)[54, 55] mutations in intermediate-risk and/or CN-AML, all of which have been associated with poor outcomes in these cytogenetic subsets. However, even when large-scale patient populations are studied by different investigators, consistent correlations are not always observed. For example, not all retrospective studies have associated either the most common arginine codon 882 (R882) and/or non-R882 DNMT3A mutations with poor outcome.[53]

Frequently conflicting results between correlation studies may be attributed to several factors. For example, secondary mutations may modulate the effect of another mutation, as frequently observed in patients with mutations in both SF3B1 and DNMT3A who have improved survival relative to patients with DNMT3A mutations alone.[56] Furthermore, Damm et al[57] note that their finding that SF3B1 mutations had no effect on overall survival or leukemic progression, in contrast to other studies, may reflect the low-risk cohort in their study. Finally, additional heterogeneity between or within studies may be introduced by different treatment regimens.[57] Reliable clinical associations thus require multivariate analyses incorporating diverse mutations and other prognostic indicators or analyses restricted to molecularly homogeneous populations. Such studies could be conveniently accommodated by custom capture-based approaches.

Functional and Biologic Consequences of Mutations

Bioinformatic analysis is useful in narrowing the sea of mutations discovered from large-scale genomic surveys to “driver” mutations that likely are significant in disease onset and progression. This winnowing out of irrelevant “passenger” mutations is partially achieved by the etiology of AML. Sequencing studies across cancer types have revealed that AML has a relatively low mutation burden (Fig. 1), which increases the likelihood of discovering pathogenic mutations in AML. Because “cancer genes” are often mutated across multiple cancer types, their facilitated discovery has relevance beyond AML.

Figure 1.

Acute myeloid leukemia has a reduced mutation burden relative to other cancer types. Labels indicate cancer codes from The Cancer Genome Atlas (TCGA) (available at:; accessed December 23, 2013). Mbp indicates million base pairs.

Genes[58, 59] and pathways[60] mutated at a statistically significant rate above background mutation rates may be detected from large-scale studies. Other approaches search for recurrently mutated subnetworks within protein-protein interaction networks[61, 62] or integrate genome-wide expression and mutation data to probabilistically infer perturbations in annotated pathways.[63] Candidate mutations for subsequent experimental study also include those that occur at high frequency in other cancers,[64] are within annotated functional domains[65] or conserved regions,[66] or are predicted to disrupt protein function.[67] These may be prioritized further through integrated exploration of multidimensional cancer genomics data, including those describing SNVs; copy number alterations; DNA methylation; and mRNA, protein, and phosphoprotein expression.[68]
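The idea of a gene being mutated "above background" can be illustrated with a one-sided binomial test. The sketch below is a deliberate simplification and is not the method of any cited tool: it assumes a single uniform background rate per base, and the gene length, cohort size, rate, and mutation count are hypothetical numbers chosen only for illustration. Dedicated methods typically model additional factors.

```python
import math


def binom_log_pmf(k: int, n: int, p: float) -> float:
    """Log of P(X = k) for X ~ Binomial(n, p), computed via log-gamma."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binom_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p), summed directly over the upper tail."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    total = 0.0
    for i in range(k, n + 1):
        term = math.exp(binom_log_pmf(i, n, p))
        total += term
        # Past the mean, terms shrink; stop once they no longer matter.
        if i > n * p and term <= total * 1e-17:
            break
    return min(total, 1.0)


# Hypothetical example: a 3000-bp coding region sequenced in 200 samples,
# an assumed background rate of 1e-6 mutations per base, 12 mutations seen.
coding_bases = 3000
samples = 200
background_rate = 1e-6
observed = 12

p_value = binom_upper_tail(observed, coding_bases * samples, background_rate)
print(f"expected {coding_bases * samples * background_rate:.2f} mutations, "
      f"observed {observed}, one-sided p = {p_value:.3g}")
```

When many genes are tested this way, the resulting p-values would also need correction for multiple testing.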

Whether a mutated gene should be overexpressed or knocked down, for example, to assess its functionality, depends on whether the lesion is likely a gain-of-function mutation in an oncogene or a loss-of-function mutation in a tumor suppressor. This distinction may be inferred from the type and distribution of mutations within the gene. Several types of mutations introduce premature termination codons, including nonsense point mutations and inopportune frame-shift indels, which are likely to destabilize the transcript. The scattering of these and/or consensus splice site mutations throughout a gene's coding region, as has been detected in BCOR,[19] TET2,[40] and the splicing factor ZRSR2,[41, 69] is a likely indicator that it acts as a tumor suppressor. Homozygous deletions or the detection of the mutation within a recurrently deleted chromosomal region, as in the CUX1 study,[27] provide further evidence of a tumor suppressor role. In contrast, the clustering of missense mutations within hotspots, particularly within known functional domains or conserved regions, is more suggestive of a gain-of-function mutation. These characteristics are shared by mutations in U2AF1 (preferentially targeting residues within zinc finger domains) and SF3B1 (preferentially targeting residues within conserved[70] HEAT domains). An oncogenic role is further supported for a gene that has a missense mutation within a recurrently amplified region. For example, to enrich for potential oncogenes and tumor suppressors, Dolnik et al[71] designed custom DNA capture probes targeting the coding exons of 1000 genes that were detected within minimally deleted/gained regions through SNP analysis and used subsequent NGS to reveal mutations in RAD21 within amplified regions.
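The reasoning above (scattered truncating mutations suggest a tumor suppressor, clustered missense mutations suggest an oncogene) can be written as a rough heuristic. The Python sketch below is only an illustration: the function name `suggest_role`, the consequence labels, and both cutoffs are assumptions made here rather than values taken from the cited studies, and the positions in the example are toy data.

```python
from collections import Counter
from typing import List, Tuple

# Consequence labels treated as truncating in this sketch (hypothetical vocabulary).
TRUNCATING = {"nonsense", "frameshift", "splice_site"}


def suggest_role(
    mutations: List[Tuple[int, str]],
    truncating_cutoff: float = 0.2,
    hotspot_cutoff: float = 0.2,
) -> str:
    """Classify a gene from (protein_position, consequence) pairs.

    A large share of truncating mutations points toward loss of function;
    a large share of missense mutations at recurrently hit positions
    points toward gain of function. Both cutoffs are arbitrary.
    """
    if not mutations:
        return "insufficient data"
    total = len(mutations)
    truncating = sum(1 for _, consequence in mutations if consequence in TRUNCATING)
    missense_sites = Counter(
        position for position, consequence in mutations if consequence == "missense"
    )
    # Count missense mutations that fall on a position hit at least twice.
    hotspot_missense = sum(count for count in missense_sites.values() if count >= 2)
    if truncating / total >= truncating_cutoff:
        return "tumor suppressor-like"
    if hotspot_missense / total >= hotspot_cutoff:
        return "oncogene-like"
    return "ambiguous"


# Toy examples with made-up positions.
scattered = [(12, "nonsense"), (88, "frameshift"), (301, "splice_site"), (450, "missense")]
clustered = [(10, "missense")] * 5 + [(20, "missense")] * 3 + [(70, "nonsense")]
print(suggest_role(scattered))
print(suggest_role(clustered))
```

Such a rule only ranks candidates for follow-up; copy number context, as in the deleted or amplified regions discussed above, would be additional evidence that this sketch ignores.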

The approaches described above are particularly useful in planning experimental studies for a gene with unknown function. However, several classes of AML-relevant mutations occur in genes with ascribed function, including epigenetic regulators and splicing factors. For these, NGS approaches can conveniently assay downstream effects of the mutations.

Effects of Mutations in Epigenetic Modifiers

The epigenome is frequently dysregulated in cancer.[72] Indeed, epigenetic modifiers are recurrently mutated in AML and MDS,[73] including the DNA methyltransferase DNMT3A, the methylcytosine dioxygenase TET2, and the isocitrate dehydrogenases IDH1 and IDH2 discussed above, as well as the polycomb-associated ASXL1 and the methyltransferases EZH2 and MLL.[74] Gain-of-function IDH1/IDH2 mutations[75] antagonize TET2 function[76]; and mutations in TET2, IDH1, and IDH2 are mutually exclusive.[40] These mutations have clinical significance,[74] and the DNA methyltransferase inhibitors azacytidine and decitabine induce clinically significant complete and partial responses in patients with MDS and low blast counts in patients with AML.[77] Therefore, array-based strategies have been used to characterize their downstream effects genome-wide and have detected a global, predominantly hypermethylation pattern induced by IDH1/IDH2[6, 76] and TET2[76] mutations and largely shared[76] between them.

Recent methods replace the array-based detection of earlier techniques with NGS for nucleotide resolution and reduced bias[78] and are based on 1 of 3 approaches: methylcytosine-sensitive restriction digestion, bisulfite conversion (possibly preceded by methyl-insensitive digestion), or immunoprecipitation (IP).[78] Digital restriction enzyme analysis of methylation (DREAM),[79] an example of the first technique, performs serial digestion with methylation-sensitive SmaI and methylation-insensitive XmaI restriction enzymes. Both target the same CCCGGG sequence, although the former is blocked by CpG methylation and, where it does cut, leaves 5′-GGG blunt ends, whereas the latter cuts any remaining target sequences and leaves a 5′-CCGGG overhang, which acts as a unique signature for methylated sites. The second approach, sodium bisulfite treatment, converts unmethylated cytosine to uracil, but it leaves methylated cytosine intact. To enrich for DNA that can be (differentially) methylated, reduced representation bisulfite sequencing (RRBS)[80] cuts DNA at CCGG sequences using the methylation-insensitive MspI restriction enzyme before bisulfite treatment, and size selection ensures the presence of at least 2 such sites within a defined (eg, 300-bp) sequence span. Finally, IP-based approaches are typified by methylated DNA immunoprecipitation sequencing (MeDIP-seq),[81] which uses an antibody directed against 5-methylcytosine (5mC) to immunoprecipitate methylated genomic regions. MethylCap-seq is a related approach that enriches for methylated DNA through capture with methyl-binding protein (MBD2).[82] In all 4 techniques, the resulting DNA libraries are characterized by NGS.
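The logic of the bisulfite approach can be shown on a toy sequence: unmethylated cytosines read as thymine after conversion and amplification, so comparing a read with the reference reveals which cytosines were protected by methylation. The Python sketch below is a simplified illustration. The function names are hypothetical, and it assumes a single forward-strand read, complete conversion, and no sequencing error.

```python
from typing import Dict, Set


def bisulfite_convert(sequence: str, methylated_positions: Set[int]) -> str:
    """Simulate bisulfite treatment followed by amplification.

    Unmethylated C is converted to U and is read as T after amplification;
    methylated C (0-based positions given) is left intact.
    """
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(sequence)
    )


def call_methylation(reference: str, converted_read: str) -> Dict[int, bool]:
    """Call each reference cytosine as methylated (True) or unmethylated (False)."""
    calls: Dict[int, bool] = {}
    for index, (ref_base, read_base) in enumerate(zip(reference, converted_read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[index] = True    # protected from conversion
        elif read_base == "T":
            calls[index] = False   # converted, hence unmethylated
        # Any other base is treated as uninformative and skipped.
    return calls


reference = "ACGTCCGGA"                      # toy sequence
read = bisulfite_convert(reference, {1})     # only the C at index 1 is methylated
print(read)                                  # ACGTTTGGA
print(call_methylation(reference, read))     # {1: True, 4: False, 5: False}
```

In real data the same comparison is made across many reads per site, and the fraction of reads retaining C estimates the methylation level at that cytosine.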

Leukemic subtypes segregate according to differentially methylated regions, detected by MeDIP-seq, not only in promoters, as might be anticipated, but also in gene bodies, CpG islands (CGIs) (inside and outside promoters), and CGI shores.[83] These subtypes also are clustered according to differential methylation in satellites, long terminal repeats and in short interspersed nuclear elements (SINEs) and long interspersed nuclear elements (LINEs). A toggling of methylation status in these repetitive regions between normal and leukemic blood cells was observed using DREAM analysis: sites with significant hypermethylation in normal cells tended to exhibit significant hypomethylation in leukemia samples and vice versa.[79] Mutation- and drug-specific effects have also been described. Loss of DNMT3A in hematopoietic stem cells induced hypermethylation and hypomethylation in CpG dinucleotides, as detected by RRBS.[84] This loss instead leads to predominant hypomethylation after differentiation to B cells as assayed by DREAM.[82] IDH1/IDH2 mutations lead to a marked increase in hypermethylated sites relative to hypomethylated sites in mutants relative to controls, with an enrichment in promoter regions and CpG islands near transcription start sites (TSSs) detected by an enhanced RRBS.[85] Finally, using MethylCap-seq, it was observed that decitabine treatment significantly reduced global methylation, particularly in chromosome subtelomeric regions, possibly suggesting a region-specific mechanism of drug action.[82]

Mutations to epigenetic modifiers are anticipated to affect transcription and chromatin state, consequences that have been investigated using chromatin IP followed by NGS (ChIP-seq). ChIP-seq using antibodies targeting trimethylated histone H3 on lysine 27 (H3K27me3) revealed a significant reduction in genome-wide H3K27me3 TSS occupancy after ASXL1 knockdown.[86] Specific loss of H3K27me3 at the posterior HOXA cluster, which is known to contribute to myeloid transformation, suggests that ASXL1 mutations promote transformation by relieving gene repression. Saeed et al,[87] also using ChIP-seq, identified accessible genome regions to which the oncofusion proteins AML1-ETO and PML-RARA bind.

Effects of Mutations in Splicing Factors

Minigene reporter assays of splicing mutations[88] have been extended genome-wide using gene expression arrays in expectation that the mutations' impaired ability to recognize the 3′ splice site will perturb gene expression. For example, splicing-sensitive arrays indicated that exons were significantly down-regulated and that introns were significantly up-regulated (ie, unspliced) after U2AF1 mutant expression.[41] RNA-seq provides a more comprehensive and quantitative alternative to gene and exon arrays. Yoshida et al[41] validated the above finding by observing increased read counts in likely intronic regions within U2AF1-mutant samples, whereas Makishima et al[89] observed that these mutations perturbed TET2 splicing. To determine the role of modulated splicing in AML, Przychodzen et al[90] analyzed publicly available RNA-seq data from The Cancer Genome Atlas (TCGA; accessed December 23, 2013) in their comparison of 6 samples of therapy-related AML or secondary AML that had U2AF1 mutations and 14 wild-type samples. In that study, 35 differentially expressed exons were predominantly skipped, including those in genes involved in mitosis and RNA processing. In contrast, SF3B1 mutations assayed by RNA-seq induced a high percentage of exon retention in patients who had refractory anemia with ring sideroblasts (RARS).[91]

Clonal Evolution

NGS has revealed that most AML tumors are oligoclonal. Variants discovered by WES and their subsequent capture-based, targeted deep sequencing in hematopoietic stem/progenitor cells (HSPCs) from healthy donors indicate that the cells accumulate random mutations during aging.[35] If an HSPC transforms to a leukemic blast, these passenger mutations are “captured” in its progeny as it clonally expands and serve as a genetic signature identifying the clone. In particular, the variant allele frequency (VAF) (or ratio of reads supporting the variant to total reads at the locus) acts as a molecular clock indicating when in the clonal hierarchy the mutation was acquired: heterozygous, clonal mutations within a pure sample have a VAF of approximately 50%, whereas subclonal mutations, acquired later, are present in fewer cells and have lower VAFs. Aggregations of VAFs, thus, reflect clones (Fig. 2). Because both passenger mutations and driver mutations are valuable clonal markers, the most comprehensive perspective on clonal architecture is provided by first discovering variants using WGS and then quantifying VAFs using deep, targeted sequencing.[34, 92, 93] Discovering clonal markers with WES[35, 94, 95] or even through limited candidate gene resequencing[96] also has revealed subclonal architecture, although their more limited number relative to WGS almost certainly (further) underestimates clonal heterogeneity. Hence, WGS-based discovery is particularly important for diseases that have few protein-coding mutations, such as AML; whereas WES may suffice to assess clones in diseases with high mutation burden, such as melanoma and chronic lymphocytic leukemia.[95] In addition, sensitivity in detecting low-frequency clones is improved with increased sequencing depth. For example, a sequencing depth of 100× is likely to detect VAFs as low as approximately 4% (95% binomial confidence interval).
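The relationships in this paragraph can be made concrete with a short calculation. The Python sketch below computes a VAF from read counts, the VAF expected for a heterozygous mutation present in a given fraction of cells, and the binomial probability of sampling a minimum number of variant-supporting reads at a given depth. The function names are hypothetical. The text does not spell out the exact criterion behind the 4% figure, so the detection rule used here (a minimum read count, a diploid copy-neutral locus, and no sequencing error) is an assumption.

```python
from math import comb


def variant_allele_frequency(variant_reads: int, total_reads: int) -> float:
    """Ratio of reads supporting the variant to total reads at the locus."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return variant_reads / total_reads


def expected_het_vaf(cell_fraction: float, purity: float = 1.0) -> float:
    """Expected VAF of a heterozygous mutation at a diploid, copy-neutral locus.

    cell_fraction is the share of tumor cells carrying the mutation;
    purity is the share of tumor cells in the sample.
    """
    return 0.5 * cell_fraction * purity


def prob_at_least(min_variant_reads: int, depth: int, vaf: float) -> float:
    """P(X >= min_variant_reads) for X ~ Binomial(depth, vaf)."""
    below = sum(
        comb(depth, k) * vaf ** k * (1.0 - vaf) ** (depth - k)
        for k in range(min_variant_reads)
    )
    return 1.0 - below


print(variant_allele_frequency(50, 100))   # 0.5: clonal, heterozygous, pure sample
print(expected_het_vaf(0.2))               # subclone in 20% of cells: VAF near 0.1
for min_reads in (1, 2, 3):
    # Chance of seeing at least min_reads variant reads at 100x for a 4% VAF.
    print(min_reads, round(prob_at_least(min_reads, 100, 0.04), 3))
```

Under these assumptions, a 4% VAF at 100× yields at least 1 supporting read in roughly 98% of draws, but the probability falls as more supporting reads are required, which is why deeper targeted sequencing is used to quantify low-frequency subclones.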

Figure 2.

A subclone escapes therapy to expand into a relapsed clone. (a) Simultaneous (SciClone clustering[93]) analysis of tumor and post-treatment relapse samples[34] reveals a founding clone (having mutations in cluster 1) and subclones that evolve from it by acquiring additional mutations. These include a subclone (with additional mutations in cluster 3) that was eradicated by therapy and one (having mutations in cluster 2) that escaped therapy and acquired additional mutations (cluster 4) to become dominant during relapse. VAF indicates variant allele frequency. (b) Analysis of relapse-only VAFs cannot disambiguate subclones having mutations in clusters 2 and 4 from the founding clone. (c) An analysis of tumor-only VAFs is illustrated. (d) Clonal evolution inferred from panel a is illustrated.

A study of tumor/relapse pairs has demonstrated 2 patterns of clonal evolution during AML relapse: 1) the founding clone in the primary tumor gained additional mutations and evolved into the relapse clone; or 2) a subclone escaped therapy, gained additional mutations, and expanded into the relapse clone (Fig. 2).[34] A likely model describing this subclonal evolution, in which AML develops through serial acquisition of mutations in HSPCs, has been supported by isolating and sequencing preleukemic HSPCs in which a subset of mutations of their leukemic progeny was identified.[94] A similar persistence of mutations from an antecedent disease accompanied by acquisition of additional mutations during progression to AML has been observed in serial studies tracking evolution from severe congenital neutropenia[97] or MDS[92] to AML. However, recurrently mutated genes display a wide range of VAFs across secondary AML samples, indicating that none are consistently associated with the founding clone and that the disease progresses through a variety of acquired mutations across patients.[93]

Clonal heterogeneity is frequently discernible from a single tumor sample. However, multiple samples derived from a single patient are often required to disambiguate subclones that overlap in the original sample and to appreciate the tumor's complexity (Fig. 2). These may be relapse samples, physically isolated biopsies taken at the same time point, or tumor cells exposed to some manipulation (eg, passage in culture or through immunodeficient mice) that could induce a distinct (fitness) phenotype in a subpopulation of cells. Characterizing[98] this intratumor heterogeneity and clonal architecture is important, because they may have clinical implications[99] and could contribute to therapy resistance.[100] For example, subclonal mutations may be correlated with a poor clinical outcome in chronic lymphocytic leukemia.[95]

Clinical Application of NGS

Current genetic testing in AML is inadequate to detect the clinically relevant mutations of this heterogeneous disease.[101] Metaphase cytogenetics and fluorescence in situ hybridization lack resolution, whereas Sanger-based sequencing is cost and time prohibitive. Mass spectrometry genotyping identifies mutations at specific residues (eg, in N/KRAS and IDH1/IDH2), but it is unable to detect mutations scattered throughout the gene body (eg, in tumor suppressors TP53, TET2, and ASXL1). Furthermore, genes such as N/KRAS[102] and DNMT3A,[4] which are expected to be targeted at well characterized residues, have exhibited noncanonical, but oncogenic, mutations that would have been missed by SNP-directed approaches.[102] In the latter case, up to 40% of patients with DNMT3A mutations harbor non-R882 mutations, which also are associated with poor outcome.[4] Comprehensive WGS is an attractive diagnostic platform, but it suffers from high cost and analysis time and from moderate to low-coverage insensitivity to low-frequency, subclonal mutations. Deep-coverage, targeted NGS for panels of candidate genes ameliorates these concerns and leverages the community's investment in large-scale sequencing efforts, particularly through TCGA, in cataloging somatic cancer mutations. For example, a “pan-cancer” panel assays the entire coding sequence and selected introns of 236 cancer-related genes (www.founda; accessed December 23, 2013). A single hybrid-capture NGS platform improves efficiency and scalability over a variety of disparate methods otherwise required to discover the full spectrum of mutations active in AML, including translocations, SNVs, and indels.[103] Furthermore, the deep read counts of targeted NGS are well suited to the monitoring of minimal residual disease.[104]

Technical challenges remain, however, including the inefficient capture of GC-imbalanced targets such as CEBPA[103, 105] and potentially limited genomic DNA, although whole-genome amplification may address the latter without introducing appreciable bias.[106] Bioinformatic analysis[14] is a bottleneck in the clinical sequencing pipeline and software tools need to be validated.[107] For example, alignment is computationally demanding, potentially sensitive to noisy reads, and problematic in repetitive regions.[108] Fully leveraging NGS clinically will require carefully considered informed consent. For example, WGS of a patient with multiple primary tumors revealed the presence of a cancer susceptibility germline mutation in TP53 with clinical implications for the patient's children.[109] Because the informed-consent document included a provision to communicate clinically relevant information to family members, the treating physician was able to inform the next of kin of the mutation and to encourage genetic counseling. In a second case, WGS detected a cryptic fusion within a clinically relevant timeframe, which impacted the patient's treatment plan.[110] Diagnosis was facilitated by a “movable firewall” within the institutional review board-approved protocol that maintained the patient's anonymity yet allowed the research team to communicate relevant findings to the treating physician.

Despite the remaining obstacles, the 2 cases described above demonstrate the clinical impact of NGS. These successes and earlier studies have significantly advanced the state of the art in clinical diagnosis, with at least 10 open clinical trials using NGS (; keywords: “next generation sequencing” and “cancer”). Furthermore, large, clinically annotated data sets have already been accumulated by prior studies and offer a rich opportunity for retrospective analysis. Thus, as technological trends continue to reduce sequencing cost, integrated[6] clinical analysis and sequencing of the genome, transcriptome, and methylome, coupled with queries to drug-gene interaction databases (; accessed December 23, 2013)[111], will become a practical approach to comprehensively interrogating and treating leukemia.


This work was supported by grants from the National Institutes of Health/National Cancer Institute (grants NIH/NCI P01 CA101937 [J.F.D.] and NIH/NCI R01 CA152329 [J.F.D.]) and by Barnes-Jewish Hospital Foundation Award 7603-55 (B.S.W.).


The authors made no disclosures.