Integration of cancer genomics with treatment selection

From the genome to predictive biomarkers


  • Thomas J. Ow MD, MS,

    1. Department of Otorhinolaryngology-Head and Neck Surgery, Montefiore Medical Center/Albert Einstein College of Medicine, Bronx, New York
    2. Department of Pathology, Montefiore Medical Center/Albert Einstein College of Medicine, Bronx, New York
  • Vlad C. Sandulache MD, PhD,

    1. Bobby R. Alford Department of Otolaryngology-Head and Neck Surgery, Baylor College of Medicine, Houston, Texas
  • Heath D. Skinner MD, PhD,

    1. Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas
  • Jeffrey N. Myers MD, PhD

    Corresponding author
    1. Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, Houston, Texas
    • Corresponding author: Jeffrey N. Myers, MD, PhD, Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Unit 1445, Room FCT10.6028, Houston, TX 77030-4009; Fax: (713) 794-4662;



The field of cancer genomics is rapidly advancing as new technology provides detailed genetic and epigenetic profiling of human cancers. The amount of new data available describing the genetic make-up of tumors is paralleled by rapid advances in drug discovery and molecular therapy currently under investigation to treat these diseases. This review summarizes the challenges and approaches associated with the integration of genomic data into the development of new biomarkers in the management of cancer. Cancer 2013;119:3914–3928. © 2013 American Cancer Society.


We are currently in a new era of both cancer research and cancer therapy. It is truly remarkable to compare cancer treatment a half century ago with the scientific and therapeutic innovations under consideration today. In 1953, the crystal structure of DNA was first described and postulated to be the carrier of heritable information.[1] Only 50 years later, in 2003, the Human Genome Project, a 13-year research endeavor, reported the base-pair sequence encoding an entire human genome. Today, technology is available that can interrogate the entire human genome in a matter of weeks, at a cost that is orders of magnitude lower. High-throughput techniques can also detail complementary epigenetic events, such as gene expression, micro-RNA expression, and DNA methylation, providing hundreds of thousands of data points that can be used to evaluate the inner workings of human cells and tissues.

Advances in the science of genomics have been paralleled by an increased understanding of human cancer. In the last century, it has become clear that cancer is a disease of the genome. A growing understanding of the fundamental molecular biology of cancer has led to our current model in which genetic mutation leads to hallmarks of cellular dysregulation that allow cells to become cancerous.[2] Improved insight into the role of genomic alterations in cellular transformation has led to the development of several new classes of drugs targeting the molecular drivers that have been discovered in human cancers. The progress made in this new frontier of cancer research rests heavily on the ability of translational scientists to integrate new technology and increasing data with the vast array of new therapeutics available. In this article, we review these challenges and present several strategies for improving the development of prognostic and predictive biomarkers.

Deciphering the Cancer Genome With Modern Technology

Over the past decade, several high-throughput techniques have been developed that allow rapid and comprehensive assessment of the cancer genome and epigenetic events. The next section summarizes several of these techniques and provides contemporary examples of how these technologies have recently been applied to cancer research.

Array-based comparative genomic hybridization

Array-based comparative genomic hybridization (array-CGH) allows the evaluation of copy number variations (CNVs), such as microdeletions, unbalanced translocations, and amplifications, across the whole genome in a high-throughput manner.[3] With this technique, a test DNA sample (usually from a tumor) and a reference sample (often from adjacent normal tissue or from blood cells) are digested and differentially labeled with fluorophores (usually red vs green), which are hybridized to an array that contains thousands of probes corresponding to regions of the human genome.[4] The intrinsic resolution of the assay is based on the number and size of DNA probes on the arrays. Advances in array-CGH technology, such as representational oligonucleotide microarray (ROMA) analysis,[5] have sought to increase the resolution of the genomic regions assessed. Newer systems using very small and overlapping probes allow CGH microarray resolution down to a range from 10 kilobases to 200 base pairs.[6] Analysis of global CNVs can be used to classify tumors based on the spectrum or number of CNVs observed, or individual CNVs can be explored. Identified unbalanced translocations or amplifications may yield targetable oncogenic events.[7] For example, in a recent report, Morris and colleagues evaluated CNVs using array-CGH in head and neck squamous cell carcinoma (HNSCC), which led to their identification of common amplifications in phosphatidylinositol-4,5-bisphosphate 3-kinase, catalytic subunit α (PIK3CA) as well as novel deletion events in the protein tyrosine phosphatase, receptor type, S (PTPRS) gene.[8] Several agents targeting the PI3K pathway exist,[9] and the authors demonstrated that PTPRS loss can diminish the efficacy of epidermal growth factor receptor (EGFR/Her1) inhibition. CNVs appear to be a driving force in the promotion of carcinogenesis, and identification of these changes with array-CGH can inform drug selection for targeted molecular therapy.
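The comparison of test and reference signal described above is commonly summarized per probe as a log2 ratio. The following is a minimal sketch in Python, not the method of any cited study: the probe names, the intensities, and the ±0.3 thresholds are hypothetical values chosen only for illustration, and real analyses normalize the data and segment neighboring probes before calling CNVs.

```python
import math

# Hypothetical thresholds on log2(test/reference); real pipelines tune
# these per platform and segment neighbouring probes before calling.
GAIN_THRESHOLD = 0.3
LOSS_THRESHOLD = -0.3


def log2_ratio(test_intensity, reference_intensity):
    """Return log2(test/reference) for a single probe."""
    return math.log2(test_intensity / reference_intensity)


def call_probe(test_intensity, reference_intensity):
    """Classify one probe as 'gain', 'loss', or 'neutral'."""
    ratio = log2_ratio(test_intensity, reference_intensity)
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "neutral"


# Made-up intensities: (probe name, tumor signal, reference signal).
probes = [
    ("probe_A", 2000.0, 1000.0),  # tumor signal doubled
    ("probe_B", 500.0, 1000.0),   # tumor signal halved
    ("probe_C", 1050.0, 1000.0),  # roughly equal signal
]

calls = {name: call_probe(test, ref) for name, test, ref in probes}
print(calls)
```

A ratio near 0 indicates equal copy number in the 2 samples, whereas positive and negative values suggest amplification and deletion, respectively.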

Single-nucleotide polymorphism arrays

The human genome carries approximately 20 million conserved single-nucleotide variations that occur with a defined regularity in the population, termed single-nucleotide polymorphisms (SNPs).[10, 11] These variations can be used to genotype individuals, to do linkage analysis, to perform genome-wide association studies (GWAS), and to detect CNVs and sites of loss of heterozygosity (LOH).[6, 12] Using current DNA microarray platforms,[13] up to approximately 1 × 10^6 SNPs can be analyzed concurrently. This technology allows evaluation of the genome at high resolution and, unlike array-CGH, does not require a reference sample. Copy number evaluation in HNSCCs using a multitude of techniques has been reviewed in great detail by Chen and Chen,[14] and recent studies have integrated copy number analysis using SNP arrays with gene expression data to develop prognostic signatures in oral cavity squamous cell carcinoma.[15, 16] SNP genotyping also has been used to identify polymorphisms in germline DNA associated with the risk of second primary tumors and recurrence among patients treated for early stage HNSCC,[17] demonstrating another application of SNP evaluation in patients: germline risk profiling.

Next-generation sequencing

Until very recently, DNA sequencing depended on variations in the methods originally developed by Frederick Sanger and colleagues in the late 1970s.[18, 19] With these techniques, fragments of DNA from the sample of interest are amplified using polymerase chain reaction (PCR), and each reaction is randomly terminated with a chemically altered nucleotide. The size of each pool of fragments generated is then measured (eg, with gel electrophoresis), and the last base added can be determined, depending on which base terminated the reaction (eg, each nucleotide can be radio-labeled, or color-coded with a fluorescent marker). When the fragments are evaluated in aggregate, the entire sequence of the sample can be constructed. Sanger “base-by-base” sequencing techniques remained the state-of-the-art for genetic sequencing for 3 decades, and these methods were largely used to complete the Human Genome Project.
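The logic of reading a sequence from terminated fragments can be illustrated with a short sketch. It assumes an idealized experiment in which exactly 1 fragment is recovered for every position and the terminal base of each fragment is known; the function name and the example fragments are hypothetical.

```python
def read_sequence(terminated_fragments):
    """Rebuild a sequence from (fragment_length, terminal_base) pairs.

    Sorting by length mimics size separation on a gel: the fragment of
    length n reveals which base occupies position n.
    """
    ordered = sorted(terminated_fragments, key=lambda frag: frag[0])
    lengths = [length for length, _ in ordered]
    if lengths != list(range(1, len(ordered) + 1)):
        raise ValueError("need exactly one fragment per position")
    return "".join(base for _, base in ordered)


# Fragments arrive in no particular order, as they would from a mixed pool.
fragments = [(3, "T"), (1, "G"), (4, "C"), (2, "A")]
sequence = read_sequence(fragments)
print(sequence)
```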

The term “next-generation” sequencing has been ascribed to a variety of techniques that parallelize sequencing reactions, which means that large numbers of sequences (thousands to millions) from the DNA of interest are generated simultaneously and then aligned to compose the final sequencing result. Rapid advances in technology and computing power have produced a multitude of next-generation sequencing techniques (eg, 454 pyrosequencing, ion semiconductor sequencing, sequencing by synthesis, sequencing by ligation),[20, 21] and a detailed review of these technologies is beyond the scope of the current article. These sequencing techniques allow for gene or multi-gene sequencing in a very-high-throughput and rapid manner. Next-generation sequencing has also led to the ability to sequence the entire coding region or the entire genome of an individual or tumor in a matter of weeks.
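The idea of generating many short reads at once and then aligning them can be sketched with a toy example. This is an illustration only: it places each read at its first exact match in a known reference, whereas real aligners tolerate mismatches and use indexed data structures. The reference string and reads are invented.

```python
def align_reads(reference, reads):
    """Place each read at its first exact match; return per-base coverage.

    Reads that do not match the reference exactly are returned separately.
    """
    coverage = [0] * len(reference)
    unplaced = []
    for read in reads:
        start = reference.find(read)
        if start == -1:
            unplaced.append(read)
            continue
        for position in range(start, start + len(read)):
            coverage[position] += 1
    return coverage, unplaced


reference = "ACGTTGCAAG"
reads = ["ACGTT", "GTTGC", "TGCAA", "GCAAG", "CCCCC"]
coverage, unplaced = align_reads(reference, reads)
print(coverage)
print(unplaced)
```

Positions covered by several overlapping reads are supported by more independent observations, which is why sequencing depth is usually reported alongside variant calls.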

Whole-exome sequencing and whole-genome sequencing

Exons are regions of the genome that are transcribed into protein-coding RNAs. There are approximately 180,000 exons in the human genome, comprising approximately 30 megabases, which yield roughly 20,000 protein-coding genes. Remarkably, this represents only approximately 1% of the entire human genome. The premise of whole-exome sequencing relies on a method for enriching exomic DNA followed by next-generation sequencing of these enriched targets. There are several enrichment methods, including PCR-based targeted amplification, the use of molecular inversion probes, hybrid capture, and in-solution capture.[22] Whole-exome sequencing can be used to identify mutations in coding genes, to perform SNP genotyping from known SNPs in the exome, and to identify translocations and determine CNVs that involve exomic DNA.[23, 24] Whole-exome sequencing of HNSCC was recently reported in 2 large studies.[25, 26] Those findings confirmed mutations in the genes known to be common players in this disease (eg, tumor protein 53 [TP53], HRAS, CDKN2A), and novel mutations also were identified in genes not previously implicated in HNSCC (eg, NOTCH1, FBXW7, FAT1).[25, 26]
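The proportions quoted above can be checked with simple arithmetic. In the sketch below, the exon count and exome size are taken from the text, whereas the genome size of 3.1 gigabases is an assumption that is not stated in this article.

```python
EXOME_BASES = 30e6      # approximately 30 megabases, from the text
EXON_COUNT = 180_000    # approximate number of exons, from the text
GENOME_BASES = 3.1e9    # assumed haploid genome size; not stated in the text

exome_fraction = EXOME_BASES / GENOME_BASES
mean_exon_length = EXOME_BASES / EXON_COUNT

print(f"exome fraction of genome: {exome_fraction:.2%}")
print(f"mean exon length: {mean_exon_length:.0f} bp")
```

Under these assumptions, the exome accounts for roughly 1% of the genome, which is consistent with the figure given in the text.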

Whole-exome sequencing is proving to be a valuable tool in cancer research and discovery; however, nonprotein-coding DNA, previously even referred to as “junk” DNA, is proving to be more important than once surmised. The Encyclopedia of DNA Elements (ENCODE) project has demonstrated that much of the genome is involved in the regulation of gene expression through 3-dimensional conformation effects, interaction with coding elements, and the production of noncoding RNAs, such as micro-RNA and long noncoding RNA (lnc-RNA).[27] Therefore, whole-genome sequencing approaches also are proving to be important in cancer research. Whole-genome analysis provides sequencing for all (or most) DNA in an organism (eg, chromosomal and mitochondrial or chloroplast DNA). Older sequencing techniques based on Sanger methods were used to complete the Human Genome Project; however, newer techniques, such as nanopore, nanoball, fluorophore, and pyrosequencing technologies, combined with parallelization, as described above, have significantly reduced the time and cost of whole-genome sequencing.

Recent studies have just begun to apply whole-genome sequencing to analyze human cancers, such as a study of 39 pediatric patients with low-grade glioma,[28] a study that included a subset of 15 patients with esophageal adenocarcinoma,[29] and 2 patients who were included in 1 of the HNSCC next-generation sequencing reports.[26] Those studies, however, all focused largely on events that were identified within protein-coding DNA. Our understanding of noncoding DNA is increasing, and the importance of these elements in human cancer is just being realized, as evidenced by the important lnc-RNA HOTAIR, which plays a role in cancer cell behavior.[30] In another study, mutations in the TERT promoter region, which has the potential to increase telomerase expression, were commonly identified in a subset of cancers, including gliomas, melanoma, and oral squamous cell cancer.[31] Thus, as more of the human genome is understood, more layers of complexity in gene sequence and regulation are uncovered, all of which have implications in human cancer.

High-throughput epigenetics and transcriptional analysis

The current article focuses on recent advances and applications of genomics in cancer research and biomarker development; however, new technologies and applications in epigenetics, gene expression, and transcriptome analysis are inherently related and deserve mention here. The newest technologies provide very comprehensive evaluation of gene expression, DNA methylation, and micro-RNA expression. High-throughput gene expression analysis has now been available for approximately 2 decades, and several groups have identified gene expression signatures as potential biomarkers in cancer. Advances in breast cancer signatures have been notable,[32] including the development of a 21-gene signature[33] that has led to the clinical diagnostic, Oncotype DX (developed by Genomic Health, Inc., Redwood City, Calif), which is used to prognosticate patients with estrogen receptor-positive, early-stage breast cancer. Gene expression evaluation has also been used to profile HNSCC; for example, Chung and colleagues used gene expression signatures to define 4 distinct groups of patients with HNSCC and demonstrated that those signatures could be used to predict outcome.[34]

Until recently, complementary DNA (cDNA) microarray technology was the standard in gene expression analysis. The most common techniques involve purification of messenger RNA, reverse transcription to cDNA, and hybridization to an array carrying tens of thousands of probes used to determine the relative quantity of each target. In the last few years, next-generation sequencing techniques have rivaled microarray technology for gene expression analysis. RNA-seq is a technique that uses next-generation sequencing to evaluate the entire transcriptome. In this technology, RNA is purified according to the level of analysis that is desired (eg, messenger RNA can be isolated or only ribosomal RNA can be eliminated, allowing the evaluation of nongene transcripts, such as micro-RNA and lnc-RNA), reverse transcription is carried out, and sequencing commences.[35] The entire transcriptome is reassembled from sequence reads, and coverage can be used to estimate expression levels.[35]
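One way that read coverage can be converted into an expression estimate is by normalizing read counts for transcript length and sequencing depth. The sketch below uses reads per kilobase per million mapped reads (RPKM), a widely used normalization that is named here as an illustration and is not specified in the text; the gene names, read counts, lengths, and total read number are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical sequencing depth and per-gene (read count, length in bp).
TOTAL_MAPPED_READS = 20_000_000
genes = {
    "GENE_A": (500, 2000),
    "GENE_B": (500, 1000),   # same reads as GENE_A, half the length
    "GENE_C": (1000, 2000),  # same length as GENE_A, twice the reads
}

expression = {
    name: rpkm(count, length, TOTAL_MAPPED_READS)
    for name, (count, length) in genes.items()
}
print(expression)
```

In this example, GENE_B and GENE_C receive the same estimate, because a shorter transcript with the same number of reads is covered more densely.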

Microarray technology also has been developed to evaluate genome-wide methylation events. In these techniques, 1 DNA sample is treated with bisulfite conversion, which only changes unmethylated DNA; then, the bisulfite-converted sample is compared with an untreated sample on the array to determine the degree of methylation for tens of thousands of probes across the genome. Alterations of DNA methylation are common in human cancers, and individual methylation events or methylation signatures can be used as potential prognostic or predictive biomarkers.[36] Microarrays also have been used to profile micro-RNAs in cancer, and several biomarker approaches are under development.[37] Technology that allows high-throughput evaluation of the gene targets of specific transcription factors also has been developed: chromatin immunoprecipitation (ChIP) allows profiling of the potential gene targets of specific transcription factors. With this technique, a transcription factor of interest is bound to genomic DNA, which is then isolated, fragmented, and subsequently immunoprecipitated to “pull-down” the DNA associated with the transcription factor. Then, this DNA is either hybridized to a microarray to assess and quantify transcription targets, or fragments are sequenced (ChIP-seq).[38] Experiments examining protein-DNA interactions have been an important component of the ENCODE project[39] and have led to a greater understanding of gene regulation and transcription factor activity.
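The principle behind the bisulfite-based methylation assays described earlier in this section can be sketched at the level of a single sequence. The example assumes the standard chemistry, in which unmethylated cytosine is converted and subsequently read as thymine while methylated cytosine is protected; the sequence and the methylated positions are invented, and array-based assays measure this signal per probe rather than per base.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Convert unmethylated C to T; methylated C is left unchanged."""
    return "".join(
        "T" if base == "C" and index not in methylated_positions else base
        for index, base in enumerate(sequence)
    )


def infer_methylation(untreated, converted):
    """Compare untreated and converted sequences at every cytosine.

    Returns (methylated, unmethylated) position lists: a C that survives
    conversion is inferred to be methylated.
    """
    methylated, unmethylated = [], []
    for index, (before, after) in enumerate(zip(untreated, converted)):
        if before == "C":
            (methylated if after == "C" else unmethylated).append(index)
    return methylated, unmethylated


untreated = "ACGTCCGA"
converted = bisulfite_convert(untreated, methylated_positions={1, 5})
methylated, unmethylated = infer_methylation(untreated, converted)
print(converted, methylated, unmethylated)
```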

The Cancer Genome Atlas

All of the technologies detailed above have been developed and improved over the last 2 decades. Table 1 summarizes these research techniques. The application of these methods has yielded a tremendous amount of currently available data describing several human cancers. Perhaps the most comprehensive program applying modern, high-throughput technology and integrating the data generated is The Cancer Genome Atlas (TCGA) project. The TCGA was initiated in 2006 and is supported jointly by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). The goal of the program is to organize and support several expert centers with the common aim to characterize the genomes of more than 20 types of human cancers using the most up-to-date, high-throughput technology available. To date, reports on glioblastoma, ovarian cancer, colorectal cancer, squamous cell lung cancer, breast cancer, and endometrial cancer have been published[40-44]; and analyses of several other tumor types, including HNSCC, are underway. Data from hundreds of tumor samples submitted to the TCGA are available to the public and include information on CNVs, somatic mutations, SNPs, as well as epigenetic and transcriptome data, including methylation, gene expression, and micro-RNA expression. Modern technology and computing power have now allowed comprehensive genomic indexing of human cancers and have made individualized tumor characterization a reality. The challenge at this point is to determine the optimal way to analyze these results and use these data clinically.

Table 1. High-Throughput Molecular Biology Techniques Currently Available for Cancer Research

DNA evaluation
CGH arrays | DNA from a test sample and a reference sample are labeled differentially using different fluorophores; then, these are hybridized to an array that contains several thousand probes; used to detect copy number changes at a resolution ranging from 5 kilobases down to 200 base pairs
SNP arrays | DNA microarrays that contain probes representing approximately 1 × 10^6 SNPs are used to explore SNPs present in a test sample; applications include genetic linkage analysis, examination of CNVs, and evaluation of LOH
Next-generation sequencing | Sequencing techniques that parallelize the process of DNA sequencing, producing massive numbers of sequences at once; several techniques exist, including pyrosequencing (454), sequencing by synthesis, sequencing by ligation, and ion-torrent and single-molecule real-time sequencing; these sequencing techniques have shortened read time and lowered cost substantially
Whole exome sequencing | Technique used to explore the entire coding region of human DNA for genetic variations; typically, an enrichment strategy is used to extract DNA coding for gene exons; these targets are then sequenced using next-generation techniques
Whole genome sequencing | Technique used to explore the nucleotide base-pair sequence of an entire genome; next-generation sequencing techniques have allowed whole genome analysis to occur over a period of days to weeks and at a cost of approximately $1000 to $5000

RNA evaluation
cDNA microarray | Messenger RNA (mRNA) from a test sample is reverse-transcribed to cDNA; labeled (eg, with fluorophore) cDNAs are hybridized to DNA probes (typically approximately 28,000-44,000 probes) on the microarray that correspond to thousands of known genes; the intensity of the signal from hybridized cDNAs is used to estimate gene expression level
RNA-seq | RNA from a test sample is reverse-transcribed to cDNA, which is then evaluated with next-generation sequencing; studies examining cDNA derived from noncoding RNA in addition to mRNA have yielded important regulatory RNAs, such as lnc-RNAs; RNA-seq analysis allows the identification of genetic variations, such as mutations or fusion genes, if these are transcribed to RNA; gene expression can be estimated from sequence coverage analysis

Epigenetic evaluation
ChIP-on-chip | A DNA-binding protein (eg, a transcription factor) is cross-linked to DNA, and the DNA is fragmented; the protein and linked DNA are immunoprecipitated, and the DNA fragments are hybridized to a microarray for identification and quantification; allows high-throughput analysis to identify potential genetic targets of a protein of interest
DNA methylation microarrays | Unmethylated DNA is segregated from methylated DNA with bisulfite-conversion; a bisulfite-converted sample and a control DNA sample are differentially labeled, and the relative abundance of methylated and unmethylated DNA for specific genes is evaluated using DNA microarrays
Methylated DNA immunoprecipitation | Methylated regions of the genome are immunoprecipitated using an antibody directed toward 5-methylcytosine; methylated genes are then evaluated using either microarray-based methods or next-generation sequencing

Abbreviations: cDNA, complementary deoxyribonucleic acid; CGH, comparative genomic hybridization; ChIP, chromatin immunoprecipitation; CNV, copy number variation; DNA, deoxyribonucleic acid; lnc-RNA, long, noncoding ribonucleic acid; LOH, loss of heterozygosity; RNA, ribonucleic acid; SNP, single nucleotide polymorphism.

From Molecular Biology to Therapeutic Targets

Advances in molecular pharmacology, in parallel with an increased understanding of cancer genomics, have led to rapid expansion in new therapeutics available to treat cancer. During the mid-20th century, the first effective cancer therapeutics were realized with the use of aminopterin to treat childhood leukemia pioneered by Farber and Diamond.[45] Subsequently, several cytotoxic therapies were developed and applied over the ensuing decades, many of which remain the cornerstone of most chemotherapeutic regimens today. These drugs, such as nucleotide analogues, DNA-damaging agents, and antifolates, largely act upon rapidly dividing cells by disrupting the cellular machinery or molecular building blocks that drive cell division. A greater understanding of the molecular biology of cancer over the last 3 decades has led to the development of “targeted therapeutics” that act upon specific cancer cell proteins, such as surface receptors or intracellular signaling molecules. This vast array of new, currently available therapeutics can be categorized based on their structure and mechanism of action.

Some of the earliest examples of “targeted” therapies were aimed at the inhibition of hormones. Examples include the use of tamoxifen to selectively block the estrogen receptor in breast cancer[46] and androgen deprivation using leuprolide or flutamide in prostate cancer.[47] These approaches have proven remarkably effective for these diseases. Because most cancers do not depend on a hormonal driver, the term “targeted therapy” is more commonly applied to the targeting of cancer-specific cell receptors or intracellular signaling molecules with either monoclonal antibodies or synthetic small molecules. A multitude of strategies have been developed, and several of these are highlighted in the sections below.

Monoclonal antibodies that target cell surface receptors

The concept of targeting cancer-specific proteins with antibodies was established during the middle to late 20th century, and the first therapeutics became a reality after monoclonal antibody (MoAb) production became possible.[48-50] The delivery of MoAbs to patients without an immune reaction became feasible as recombinant techniques led to chimerization and then to humanization of antibody products.[50, 51] Several cell surface receptors that have been identified as overexpressed in certain human cancers can be targeted with MoAb therapy. EGFR/Her1 has been targeted with the MoAb cetuximab, which is approved to treat colorectal cancer[52] and HNSCC.[53] In breast cancer, overexpression of human EGFR-2 (Her2/neu) in approximately 25% to 30% of patients led to the development and application of trastuzumab in these tumors, which has proven to be remarkably effective in Her2-expressing breast cancers.[54] Angiogenesis, which is crucial to tumor growth and metastasis, has been targeted with bevacizumab, an MoAb against vascular endothelial growth factor A (VEGF-A)[55] that was initially approved for the treatment of patients with metastatic colorectal cancer.[56]

Kinase inhibition with small molecules

Self-sufficiency in growth signaling and insensitivity to antigrowth signaling have been described as hallmarks of cancer cells.[2] It has been established that many cell-signaling proteins are activated or deactivated through cellular kinase activity or phosphatase activity, respectively, and these proteins can form cascades that mediate signals from the extracellular space to the nucleus, leading to transcription factor activation. Many of the kinases involved (tyrosine kinases [TKs]) phosphorylate tyrosine residues on downstream target molecules, but protein kinases that target other amino acids are not uncommon (eg, serine-threonine kinases). TKs can be divided into receptor TKs (RTKs) and nonreceptor TKs (nRTKs). RTKs are membrane-bound and carry an extracellular ligand-binding domain and an intracellular kinase domain, which is normally activated when the receptor domain is stimulated with a ligand. nRTKs are intracellular kinases located in the cytoplasm. They are often activated through phosphorylation by RTKs or other nRTKs, and they propagate signal by phosphorylating downstream nRTKs or transcription factors.

Abnormal activation of kinase signaling pathways is ubiquitous in cancer, and this has been exploited with modern therapeutics through the development of small molecule inhibitors.[57] These drugs usually act in the intracellular space and often can be taken orally. One of the first TKs to be targeted was the bcr-abl fusion protein, a constitutively active kinase created by the t(9;22) translocation common in chronic myelogenous leukemia (CML).[58-60] Imatinib, a small molecule TK inhibitor (SMTKI) that was designed to target bcr-abl, has demonstrated effectiveness in treating this disease.[61] As is common to many SMTKIs, imatinib also inhibits other TKs, including c-Kit. c-Kit was identified as a major driver in gastrointestinal stromal tumors (GISTs), and imatinib has demonstrated substantial efficacy in this disease, leading to long-term durable responses in the majority of patients. More recently, vemurafenib, a small molecule kinase inhibitor that targets v-raf murine sarcoma viral oncogene homolog B (BRAF), has been approved to treat advanced cutaneous melanoma in patients who have the common BRAF valine to glutamic acid substitution at residue 600 (V600E) activating mutation.[62, 63] Small molecule inhibitors that target key cancer pathways, such as the RAF-MEK-ERK cascade, EGFR signaling, and the PI3K/mTOR pathway, are arguably the fastest growing class of cancer therapeutics, and a multitude of new drugs have been approved or are currently under development.

Targeted agents that facilitate immune-mediated cell death

The immune system is constantly warding off cancers by attacking preneoplastic and neoplastic cells, as evidenced by an increase in cancer rates among immune-compromised individuals. At the same time, tumors commonly arise in the setting of chronic inflammation and often promote an inflammatory response.[64] Several newer therapeutics use the immune system to destroy cancer cells. There are many approaches to accomplish this.[64] One approach is to use MoAbs to directly target and bind to tumor cell proteins, thus triggering a cytotoxic immune response. An example is rituximab, an MoAb that binds CD20, a surface protein commonly present on B-cells. Rituximab has demonstrated effectiveness in treating chronic lymphocytic leukemia and certain types of non-Hodgkin lymphoma.[65] Another approach to enhance immune-mediated tumor killing is to potentiate established responses that become suppressed by normal immune regulation or tumor modulation. This is the premise behind the use of cytotoxic T-lymphocyte antigen 4 (CTLA-4) inhibitors, such as ipilimumab. T-cells require costimulation of several receptors to potentiate activation by antigen-presenting cells (APCs). CD80 and CD86 presented by APCs bind CD28 on T-cells to provide costimulation. CTLA-4 expressed on T-regulatory cells can block costimulation through CD28, thus suppressing the immune response. Ipilimumab inhibits CTLA-4, thereby allowing costimulation and continued activation of the T-cell immune response.[66] This approach has proven effective in the treatment of cutaneous melanoma,[67] and its therapeutic efficacy is now being explored in several other cancers. Several agents and approaches that use or amplify the immune response to tumor antigens are either on the market already or are currently being studied; and, in the near future, genomic and epigenetic profiling of tumors and the tumor microenvironment may inform which cancers will be best treated with these approaches.

Other new molecular approaches

There are several additional therapeutic approaches based on molecular tumor biology that are either approved or under investigation. These include agents that target the cell proteasome[68] or inhibit histone deacetylases[69] and antibodies that recognize tumor antigens and deliver toxins[70] or radioactive particles[71] to cancer cells. Full descriptions of contemporary innovations in experimental therapeutics are beyond the scope of this article, but the plethora of current options both known to be effective in cancer and under active investigation can be appreciated from the list of approved targeted agents posted on the information web page by the NCI (accessed August 4, 2013); these agents are also summarized here in Table 2. Several agents under active investigation are identified in a spreadsheet available on the Cancer Therapy Evaluation Program (CTEP) website (accessed August 4, 2013).

Table 2. Molecularly Targeted Agents Approved by the US Food and Drug Administration for Treating Human Cancer
Agent | Class | Primary Molecular Targets | Utility

Signal transduction inhibitors
Imatinib mesylate (Gleevec) | SMTKI | c-Kit, bcr-abl, PDGFR, others | GIST, leukemia, dermatofibrosarcoma protuberans, and myelodysplastic/myeloproliferative disorders, systemic
Dasatinib (Sprycel) | SMTKI | bcr-abl, Src | CML and ALL
Nilotinib (Tasigna) | SMTKI | bcr-abl, Kit, LCK, EPHA3, EPHA8, DDR1, DDR2, PDGFRB, MAPK11, ZAK | CML
Bosutinib (Bosulif) | SMTKI | bcr-abl, Src, HDAC inhibitor, also | CML
Trastuzumab (Herceptin) | MoAb | HER-2 | Breast cancer and gastric/GE junction adenocarcinoma
Pertuzumab (Perjeta) | MoAb | HER-2 | Metastatic breast cancer with trastuzumab and docetaxel
Lapatinib (Tykerb) | SMTKI | | Breast cancer
Gefitinib (Iressa) | SMTKI | EGFR | Nonsmall cell lung cancer
Erlotinib (Tarceva) | SMTKI | EGFR | Nonsmall cell lung cancer and pancreatic cancer (unresectable/metastatic)
Cetuximab (Erbitux) | MoAb | EGFR | Head and neck squamous cell carcinoma and colorectal cancer
Panitumumab (Vectibix) | MoAb | EGFR | Metastatic colon cancer
Temsirolimus (Torisel) | SMSTKI | mTOR | Advanced renal cell carcinoma
Everolimus (Afinitor) | SMSTKI | Immunophilin FK binding protein-12: Binds and inhibits mTOR | Advanced, progressive kidney cancer; subependymal giant cell astrocytoma in patients with tuberous sclerosis; advanced breast cancer; and pancreatic neuroendocrine tumors
Vandetanib (Caprelsa) | SMI | EGFR, VEGF, RET | Metastatic medullary thyroid cancer
Vemurafenib (Zelboraf) | SMSTKI | BRAF V600E | Inoperable/metastatic melanoma
Crizotinib (Xalkori) | SMTKI | EML4-ALK fusion protein | Locally advanced/metastatic nonsmall cell lung cancer

Target proteins that regulate key cell functions or gene expression
Vorinostat (Zolinza) | SMI | HDAC inhibitor | CTCL
Romidepsin (Istodax) | SMI | HDAC inhibitor | CTCL
Bexarotene (Targretin) | Retinoid | Retinoid X receptor agonist | CTCL
Alitretinoin (Panretin) | Retinoid | Retinoic acid and retinoid X receptor agonist | AIDS-related Kaposi sarcoma
Tretinoin (Vesanoid) | Retinoid | Retinoic acid receptor | Acute promyelocytic leukemia

Agents that induce apoptosis
Bortezomib (Velcade) | PI | Proteasome | Multiple myeloma and mantle cell lymphoma
Carfilzomib (Kyprolis) | PI | Proteasome | Multiple myeloma
Pralatrexate (Folotyn) | Antifolate | Selectively accumulates in RFC-1–expressing cells | Peripheral T-cell lymphoma

Target angiogenesis
Bevacizumab (Avastin) | MoAb | Binds VEGF | Glioblastoma, nonsmall cell lung cancer, metastatic colon cancer, and kidney cancer
Ziv-aflibercept (Zaltrap) | VEGFR-mimic/immune protein | Binds VEGF | Metastatic colon cancer
Sorafenib (Nexavar) | SMTKI | VEGFR, PDGFR, C-Raf, B-Raf | Advanced renal cell carcinoma and hepatocellular carcinoma
Sunitinib (Sutent) | SMTKI | PDGFRs, VEGFRs, KIT, RET, CSF-1R, flt3 | Metastatic renal cell carcinoma, imatinib-resistant GIST, and pancreatic neuroendocrine tumors
Pazopanib (Votrient) | SMTKI | VEGFRs, c-KIT, PDGFR | Renal cell carcinoma, advanced soft tissue sarcoma
Regorafenib (Stivarga) | SMTKI | VEGFR, angiopoietin-1 receptor (TIE2), PDGFR, RET, c-KIT, RAF | Metastatic colorectal cancer
Cabozantinib (Cometriq) | SMTKI | VEGF, RET, MET, TRKB, TIE2 | Metastatic medullary thyroid cancer

Agents that facilitate immune-mediated cell death
Rituximab (Rituxan) | MoAb | CD20 | CLL, B-cell lymphomas
Alemtuzumab (Campath) | MoAb | CD52 | B-cell CLL
Ofatumumab (Arzerra) | MoAb | CD20 | Fludarabine-resistant and alemtuzumab-resistant CLL
Ipilimumab (Yervoy) | MoAb | CTLA-4 inhibitor | Unresectable or metastatic melanoma

MoAbs that deliver toxic molecules to cancer cells
Tositumomab and 131I-tositumomab (Bexxar) | MoAb linked to 131I | CD20-expressing B-cells | Non-Hodgkin B-cell lymphoma
Ibritumomab tiuxetan (Zevalin) | MoAb linked to radioisotopes | CD20-expressing cells | Non-Hodgkin B-cell lymphoma
Denileukin diftitox (Ontak) | IL-2 + diphtheria toxin proteins | IL-2 receptors | Cutaneous T-cell lymphoma
Brentuximab vedotin (Adcetris) | MoAb + monomethyl auristatin E | CD30 | Anaplastic lymphoma, Hodgkin lymphoma after chemotherapy, and stem cell transplantation

Abbreviations: AIDS, acquired immune deficiency syndrome; ALL, acute lymphoblastic leukemia; CLL, chronic lymphocytic leukemia; CML, chronic myelogenous leukemia; CTCL, cutaneous T-cell lymphoma; GE, gastroesophageal; GIST, gastrointestinal stromal tumor; HDAC, histone deacetylase; IL-2, interleukin 2; MoAb, monoclonal antibody; PI, proteasome inhibitor; SMI, small molecule inhibitor; SMSTKI, small molecule serine-threonine kinase inhibitor; SMTKI, small molecule tyrosine kinase inhibitor.

Integration of Genomic Data and Cancer Treatment to Develop Effective Biomarkers

The challenge we face is to incorporate an enormous amount of genomic information from human cancers into treatment strategies using the large number of molecular targeted therapeutic agents that are either currently available or under development. Cancer genomics must be translated into clinical biomarkers that can be used to prognosticate (ie, prognostic biomarkers) or to predict response to therapy (ie, predictive biomarkers). Genomic biomarkers have several applications for patients who have cancer or who are at risk of developing cancer. Several of these applications are presented, with examples, in Table 3. Many obstacles must be surmounted if improvements in cancer treatment are to continue efficiently and effectively, and the development of new genomic biomarkers requires several important steps.

Table 3. Applications of Genomic Biomarkers in Cancer
Biomarker Utility | Example

Predict risk of cancer | BRCA mutation as a risk factor for development of breast cancer
Provide prognostic information | HPV and p16 positivity in oropharyngeal squamous cell carcinoma
Determine PK/PD of specific chemotherapeutics (pharmacogenomics) | CYP2D6 enzyme polymorphisms and metabolism of tamoxifen
Predict response to specific therapy | BRAF V600E mutation in cutaneous melanoma
Tumor surveillance | Circulating tumor DNA in breast cancer

Abbreviations: DNA, deoxyribonucleic acid; HPV, human papillomavirus; PD, pharmacodynamics; PK, pharmacokinetics.

Development and application of bioinformatic approaches to identify candidate biomarkers

The first step in incorporating genomic data into clinical practice is to identify the genomic events most relevant to a given cancer type or subset of patients. The first-pass analysis of whole-exome or whole-genome data presents several challenges from the outset. Quality control and evaluation with appropriate normalization are essential in this process. Variant calls are highly dependent on sequence coverage and wild-type tissue contamination.[72] Several alignment tools are available to build a bioinformatic pipeline, and analysis requires alignment to a specific reference genome. Variant characterization then depends on comparing identified variations with known polymorphisms to make bona fide mutation calls.[73] Data can also be used to identify copy number changes and perform SNP genotyping.[74]
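The filtering logic described above can be illustrated with a minimal sketch. It assumes that alignment has already been performed and that read counts are available for each candidate site. The record fields, the thresholds (a minimum depth of 20 reads and a minimum variant allele fraction of 0.05), and the polymorphism set are illustrative assumptions and are not taken from the cited pipelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int
    ref: str
    alt: str
    tumor_depth: int       # total reads covering the site in the tumor
    tumor_alt_reads: int   # tumor reads supporting the variant allele
    normal_alt_reads: int  # matched-normal reads supporting the variant allele


def allele_fraction(v):
    """Fraction of tumor reads at the site that support the variant allele."""
    return v.tumor_alt_reads / v.tumor_depth if v.tumor_depth else 0.0


def call_somatic(variants, known_polymorphisms, min_depth=20, min_vaf=0.05):
    """Keep candidates that pass coverage, allele-fraction, and germline filters.

    Thresholds are hypothetical defaults chosen for illustration only.
    """
    calls = []
    for v in variants:
        if v.tumor_depth < min_depth:
            continue  # insufficient sequence coverage
        if allele_fraction(v) < min_vaf:
            continue  # signal too weak, e.g. diluted by wild-type tissue
        if v.normal_alt_reads > 0:
            continue  # also seen in matched normal: likely germline
        if (v.chrom, v.pos, v.ref, v.alt) in known_polymorphisms:
            continue  # matches a catalogued polymorphism
        calls.append(v)
    return calls
```

In practice, both thresholds would need to be tuned to the sequencing depth and estimated tumor purity of each sample.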

The next step in translating genomic information to clinical biomarkers is to identify which genomic events are relevant: which are important to tumor biology, and which are relevant clinically. Several approaches exist and depend on the scientific and translational questions being asked. Now that whole-exome sequencing has been applied to a large number of tumors, it appears that the majority of mutations are “passengers”—seemingly irrelevant to a tumor's development and behavior—whereas a select few appear to “drive” tumor biology. Several statistical methods and functional genomic approaches are being put forth to identify which genomic alterations in tumors are key oncogenic or tumor suppressive events crucial to cancer development, progression, and behavior.[72]
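One simple statistical idea behind separating candidate drivers from passengers is to ask whether a gene is mutated in more tumors than a background mutation rate would predict. The sketch below uses a one-sided binomial test for this purpose. It is a deliberate simplification: a single uniform background rate is assumed, whereas published methods model gene-specific and sample-specific rates. The function names and the example numbers are hypothetical.

```python
from math import comb


def gene_mutation_probability(background_rate_per_base, coding_length):
    """Probability that one tumor carries at least 1 passenger mutation in a gene,
    assuming a uniform per-base background rate (an illustrative simplification)."""
    return 1.0 - (1.0 - background_rate_per_base) ** coding_length


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing k or more
    mutated tumors among n if only background mutation were operating."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical example: a 1500-base coding region, a background rate of
# 1 mutation per million bases, and 10 mutated tumors among 100 sequenced.
p_background = gene_mutation_probability(1e-6, 1500)
p_value = binomial_tail(10, 100, p_background)
```

A very small tail probability suggests that the gene is mutated more often than chance alone would explain, which flags it for functional follow-up rather than proving a driver role.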

Another approach to identifying the most relevant genomic events is to integrate genomic data with other platforms, such as epigenetic, proteomic, and metabolomic data sets. This systems biology approach can evaluate cellular signaling pathways and processes to identify common aberrations that are caused by multiple insults along a specific pathway or network.[75] Integrated analysis can clarify the picture of the molecular biology driving human cancer cells. For example, the pilot TCGA project integrated nucleotide sequencing, CNV, gene expression, and methylation data from glioblastoma, uncovered a mutator phenotype linked to MGMT promoter methylation, and described recurrent aberrations among key pathways involving TP53, Rb, and a network of RTKs.[40] Since that initial TCGA report, integrated analyses from other cancers have continued to shed light on several tumor types. In a recent project studying oral squamous cell cancer published by the senior author of this review (J.N.M.), an integrated analysis of mutation, copy number, DNA methylation, gene expression, and micro-RNA expression[76] yielded insight not previously appreciated with whole-exome evaluation alone.[25, 26] These findings included 4 distinct pathways that were commonly altered by several events at multiple levels. In addition, despite the finding that loss of tumor suppressor activity dominated the genomic landscape of these tumors, targetable oncogenic events were identified in most samples after an integrated analysis was performed. Figure 1 provides examples of these findings after the integrated analysis.

Figure 1.

An integrated analysis of mutation, copy number, methylation, and gene expression identified recurrent aberrations in 4 dominant pathways: (A) the Notch pathway, (B) cell cycle pathways, (C) pathways in mitogenic signaling, and (D) the TP53 pathway. (E) Several potentially actionable oncogenes in oral cavity squamous cell carcinoma also were identified after an integrated analysis. Freq. indicates frequency. (Adapted with permission from the American Association for Cancer Research: Pickering CR, Zhang J, Yoo SY et al Integrative genomic characterization of oral squamous cell carcinoma identifies frequent somatic drivers. Cancer Discov. 2013;3:770-781.)
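The principle of the integrated analysis, in which a pathway is counted as altered when any of its member genes is hit on any platform, can be sketched as follows. The data layout, the function name, and the gene-to-pathway assignments are hypothetical and serve only to illustrate the aggregation step; they do not reproduce the published analysis.

```python
def pathway_alteration_frequency(alterations, pathways):
    """Fraction of samples with at least 1 altered gene in each pathway.

    alterations: {sample_id: {data_type: set_of_altered_genes}}
    pathways:    {pathway_name: set_of_member_genes}
    A sample counts once per pathway, whichever platform detected the event.
    """
    n_samples = len(alterations)
    frequencies = {}
    for name, genes in pathways.items():
        hit = sum(
            1
            for per_type in alterations.values()
            if any(genes & altered for altered in per_type.values())
        )
        frequencies[name] = hit / n_samples if n_samples else 0.0
    return frequencies


# Hypothetical toy cohort of 4 tumors profiled on up to 3 platforms.
cohort = {
    "T1": {"mutation": {"TP53"}, "copy_number": {"CDKN2A"}},
    "T2": {"mutation": {"NOTCH1"}, "methylation": set()},
    "T3": {"mutation": set(), "copy_number": {"CDKN2A"}},
    "T4": {"mutation": set()},
}
gene_sets = {
    "Notch": {"NOTCH1"},
    "Cell cycle": {"CDKN2A", "CCND1"},
    "TP53": {"TP53"},
}
frequencies = pathway_alteration_frequency(cohort, gene_sets)
```

In this toy cohort, the cell cycle pathway is altered in half of the tumors even though no single platform detects an event in more than 2 samples per pathway, which is the kind of signal that single-platform analysis can miss.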

Bioinformatics and integration through systems biology approaches are helping us discover relevant genomic events to better understand human cancer. The next step in biomarker development is to determine which of these events are clinically relevant and have prognostic or predictive applications.

Preclinical approaches to selecting clinically relevant targets

With the bewildering amount of data now available and expected from future reports profiling cancers, it becomes difficult to focus on clinically actionable events. Although there are already several examples of genomic events that have been successfully developed as predictive biomarkers for treatment selection, the majority of cancers still lack clinically significant biomarkers. The history of the drug imatinib is a perfect example of how success depends on elements of directed and persistent development (ie, the proven efficacy of imatinib against bcr-abl activity in CML[61]) and an element of intelligence-driven serendipity, as in the case of Dr. Joensuu's discovery of imatinib's profound activity in GIST.[77] Realistically, both will drive the identification and implementation of new and effective cancer treatments, but we must focus on improving the former rather than relying on the latter, because sound study design and investigation are in our control.

After stringent selection from bioinformatic and systems biology analyses, the functional effects of manipulation of specific genomic targets can be evaluated in several ways. In the current era, data from in vitro studies of tumor-derived cell lines and from tumor xenograft animal models remain important. Preclinical studies to date have largely focused on single genes or pathways and have analyzed small numbers of cell lines. For findings to be translatable to the clinical setting, it is becoming increasingly important to perform global molecular assessments and/or to use large numbers of cell lines in an attempt to recapitulate the heterogeneity observed in human cancer. For example, large-scale short interfering RNA screens applied in the context of a specific mutation or in conjunction with a targeted agent can identify aberrant pathways that might be synthetically lethal with each condition. Turner and colleagues used this method to identify factors related to PARP inhibitor sensitivity in BRCA-mutant breast cancer cell lines,[78] and Berns et al used similar methods to demonstrate that alterations in the PI3K pathway mediated resistance to trastuzumab in breast cancers.[79] The authors of this review (T.J.O., V.C.S., H.D.S., and J.N.M.) performed genomic and phenotypic analyses on a large panel of immortalized head and neck cancer cell lines to demonstrate that TP53-disruptive mutations were associated with aggressive tumor growth and metastasis in an orthotopic xenograft model[80] as well as with in vitro radiation resistance,[81] providing preclinical corroboration with results observed clinically in this disease in 2 large patient studies.[82, 83] In another very notable example, preclinical analysis of 602 cell lines demonstrated that amplification or translocation of the anaplastic lymphoma kinase (ALK) gene leading to activation was strongly associated with sensitivity to ALK inhibitors.[84] This preclinical work led to a successful clinical trial of crizotinib, an ALK inhibitor, in non-small cell lung cancers with ALK rearrangements.[85]

Large-scale genome projects such as the TCGA will likely continue to produce lists of novel genomic events that have not yet been functionally characterized. These events will need to be rigorously studied using preclinical functional evaluation to determine their potential as predictive biomarkers. Several platforms exist for this evaluation, including systematic mutagenesis using transposons or retroviruses, RNA-interference library screens (as described above), and high-throughput overexpression systems (eg, cDNA or open-reading frame libraries). These techniques can be used for comprehensive evaluation of the function and consequences of manipulating target genes.[86] Rigorous preclinical evaluation can optimize the selection of candidate biomarkers that will be most successful in the clinical arena.

Clinical Trial Design in the Current Era of Genomics

The typical phase 3 clinical trial compares a standard therapy against a new regimen (often the standard of care plus a new drug); and, typically, hundreds of patients are evaluated to demonstrate a small but statistically significant improvement in outcome. The new era of cancer genomics and targeted therapy is changing approaches to the design of effective trials. Several targeted molecular therapies depend on the presence of a specific predictive biomarker. Examples include wild-type Kirsten rat sarcoma viral oncogene homolog (KRAS) as a marker for cetuximab sensitivity in colon cancer[87] and the BRAF V600E mutation as a marker for vemurafenib activity in cutaneous melanoma.[62] As more and more effective therapeutics are developed, clinical trials must be designed to deliver these drugs to their intended patient population and to demonstrate efficacy as expeditiously as possible. Several recommendations have been put forth to accomplish this goal.

First, molecular prescreening should be applied wherever possible to select patients for clinical trials evaluating targeted agents.[88] Several barriers exist that are hindering the translation of scientific findings into Clinical Laboratory Improvement Amendments (CLIA)-certified diagnostics, including prioritizing the appropriate markers and drugs from preclinical data, logistical considerations such as the availability and accessibility of tissue for biomarker evaluation, and regulatory barriers to the establishment of certified diagnostics.[89] Such barriers must not prohibit the evaluation of these markers in the clinical trial setting. Molecular analysis also should be incorporated into the design of the clinical trial to evaluate whether molecular therapy is acting on the proposed target, to determine whether action on a target indeed correlates with clinical efficacy, and to identify secondary molecular events that correlate with sensitivity or resistance.

Second, the general consensus is that clinical trials assessing targeted therapy should become smaller and shorter.[89-91] Genomic and molecular evaluations of cancer are identifying small cohorts of patients who have tumors with molecular characteristics that potentially may be affected by a specific drug, in some ways segregating cancer into a heterogeneous array of “orphan” diseases.[90] Therefore, our approaches to trials should concentrate efforts into properly matching these small cohorts with the appropriate drugs. This philosophy lends itself best to phase 1 or phase 2 clinical trial designs that target selected patients, in which large numbers of patients are screened to identify a small cohort of patients who will likely benefit from receiving the drug in the study. The endpoints should also focus on dramatic tumor responses to identify drugs that have the best efficacy. Studies in the neoadjuvant or unresectable/metastatic settings represent the best opportunities to test markers/drugs under this paradigm. A notable example of this approach is the previously referenced trial examining crizotinib in non-small cell lung cancer with ALK rearrangements.[85] In that study, approximately 1500 patients with advanced non-small cell lung cancer were screened for this genomic alteration. Eighty-two patients were enrolled in the study, and the majority had a response to crizotinib monotherapy with a very favorable side-effect profile.[85]

Another innovative approach was used in the Biomarker-Integrated Approaches of Targeted Therapy for Lung Cancer Elimination (BATTLE) trial, which studied patients with chemorefractory non-small cell lung cancer and was published in 2011.[92] In that trial, mandated biopsies were obtained to evaluate 4 biomarker profiles: EGFR mutation/copy number, KRAS/BRAF mutation, VEGF/VEGFR2 expression, and retinoid X receptors (RXRs)/cyclin D1 (CCND1) expression/CCND1 copy number. The initial cohort of patients was randomized to 1 of 4 treatments: erlotinib, vandetanib, erlotinib/bexarotene, or sorafenib. Disease control rates were determined for each biomarker group for each drug to create pretest probabilities of response for each group. In the second phase of the study, a Bayesian adaptive randomization method was used to randomize patients to the 4 regimens. This study design allowed testing of prespecified hypotheses regarding selected biomarkers and treatment efficacy and demonstrated, for example, that the disease control rate in the KRAS/BRAF marker group that received sorafenib was 79% versus only 14% in the group that received erlotinib. That study was a landmark trial in its strategy for studying targeted therapies in the context of relevant biomarkers.
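The general idea of Bayesian adaptive randomization can be sketched with a simple beta-binomial model. This is an assumption made for illustration and is not the statistical model used in the BATTLE trial itself. Within one biomarker group, each treatment's disease control rate receives a Beta prior, the prior is updated with the outcomes observed so far, and subsequent patients are allocated with probability proportional to the posterior mean. All names and counts below are hypothetical.

```python
def posterior_mean(controlled, progressed, prior_a=1.0, prior_b=1.0):
    """Posterior mean disease control rate under a Beta(prior_a, prior_b) prior."""
    return (controlled + prior_a) / (controlled + progressed + prior_a + prior_b)


def randomization_probabilities(outcomes, prior_a=1.0, prior_b=1.0):
    """Allocation probabilities for the next patient in one biomarker group.

    outcomes: {treatment: (n_disease_control, n_progression)} observed so far.
    Treatments with better observed control rates are favored, but every
    treatment keeps a nonzero chance of assignment.
    """
    means = {
        treatment: posterior_mean(c, p, prior_a, prior_b)
        for treatment, (c, p) in outcomes.items()
    }
    total = sum(means.values())
    return {treatment: m / total for treatment, m in means.items()}


# Hypothetical interim data for a single biomarker group.
interim = {"drug_A": (8, 2), "drug_B": (2, 8)}
allocation = randomization_probabilities(interim)
```

With these hypothetical counts, three-quarters of subsequent patients in the group would be steered toward drug_A, while drug_B continues to accrue enough patients for its estimate to be revised if early results were misleading.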

With comprehensive genomic analysis of human cancers well underway, and with hundreds of new drugs to test either alone or in a near infinite number of potential combinations, clinical trial strategies should focus on matching biomarkers and appropriate drugs, with the ultimate goal of dramatic efficacy. To accomplish these goals, the prevailing clinical trial paradigm will likely need to change. We can envision a comprehensive network of smaller scale trials offering several drugs or combinations of drugs to patients selected by genomic or molecular criteria rather than by disease site and histology. Efficacy would then be confirmed in smaller phase 2 trials and, ultimately, in phase 3 studies if deemed necessary. If we continue to accept marginal improvements after accruing hundreds of patients over several years of evaluation in large-scale, phase 3 trials of unselected patients, then we will never keep pace with genomic and therapeutic advances that are now available and that offer potential cancer cures.

Several recent endeavors have been initiated and funded by the NCI with aims that parallel those highlighted in this review. The Cancer Target Discovery and Development (CTDD) program is made up of a network of 11 centers across the United States that are currently working both individually and collaboratively to drive 13 specific projects. The collective goal of the program is to translate the enormous wealth of data in cancer genomics and biology into improved clinical outcomes through biomarker development and improved treatments. The program can be reviewed on its website (accessed August 4, 2013). A similar program, Therapeutically Applicable Research to Generate Effective Treatments (TARGET), is aimed at childhood cancers and also can be reviewed on the National Institutes of Health website (accessed August 4, 2013). Planned for the near future is the NCI-MATCH (Molecular Analysis for Therapy Choice) program, which aims to enroll patients with solid tumors or lymphoma who have progressed after at least 1 standard therapy. This program proposes to require a biopsy and sequencing of tumor samples from enrolled patients, who will then be referred to a clinical center within the NCI-MATCH network to participate in 1 of several single-arm, phase 2 studies. The appropriate trial will be selected based on an agent that targets a genomic event profiled in the participant's biopsy screen. This program is currently in the design phase and hopefully will lead to significant advances in the endeavor to translate cancer genomics into predictive biomarkers for several cancers.
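The matching step proposed for such a program, in which a sequenced biopsy is compared against the genomic events targeted by the available study arms, can be sketched as a simple lookup. The arm table below is hypothetical: it reuses 2 marker and drug pairings discussed in this review and does not represent the actual design of NCI-MATCH, which had not been finalized.

```python
# Hypothetical arm table: genomic event -> agent studied in that arm.
TRIAL_ARMS = {
    "BRAF V600E": "vemurafenib",
    "ALK rearrangement": "crizotinib",
}


def eligible_arms(patient_alterations, arms=TRIAL_ARMS):
    """Return the agents whose target event was found in the patient's biopsy.

    An empty list means that no arm matches and the patient would not be
    referred to a targeted study under this simplified scheme.
    """
    return sorted({arms[event] for event in patient_alterations if event in arms})


matched = eligible_arms({"BRAF V600E", "TP53 mutation"})
unmatched = eligible_arms({"TP53 mutation"})
```

A real matching algorithm would also need rules for patients whose tumors carry more than 1 actionable event and for prioritizing among competing arms.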

Other Applications of Modern Genomics in Cancer Treatment and Biomarkers

Finally, the application of high-throughput genomic and epigenetic analysis is not limited to matching genomic biomarkers with the most efficacious drugs. Pharmacogenomics evaluates how genomic variation alters the pharmacokinetics and pharmacodynamics of drugs, which can help optimize drug and dose selection for specific patients. An example is the identification of cytochrome P450, family 2, subfamily D, polypeptide 6 (CYP2D6) polymorphisms that alter the metabolism of tamoxifen, for which a significant association with outcome was reported among patients with estrogen receptor-positive breast cancer who received this drug.[93] Another application of modern genomics in cancer biomarker development is the use of circulating tumor DNA as a method of assessing tumor burden, response, and surveillance for recurrence. One recent study identified specific genomic alterations among biopsy samples taken from patients with metastatic breast cancer and used next-generation deep sequencing to identify and quantify circulating tumor DNA during treatment. Levels of circulating tumor DNA were correlated with tumor burden and response to treatment.[94] Thus, as new genomic information and techniques are developed, novel applications of these data also will continue to evolve.
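The quantification underlying circulating tumor DNA monitoring can be sketched in 2 steps: computing the fraction of sequencing reads that carry a tumor-specific alteration, and correlating that fraction with a measure of tumor burden over time. The function names and all numbers below are hypothetical and are not data from the cited study.

```python
from math import sqrt


def ctdna_fraction(mutant_reads, total_reads):
    """Fraction of reads at a locus that carry the tumor-specific alteration."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mutant_reads / total_reads


def pearson(xs, ys):
    """Pearson correlation coefficient between 2 equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need 2 series of equal length with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical serial samples: (mutant reads, total reads) and a
# hypothetical imaging-based tumor burden measured at the same visits.
reads = [(300, 10000), (150, 10000), (40, 10000), (90, 10000)]
burden = [62.0, 41.0, 18.0, 30.0]
fractions = [ctdna_fraction(m, t) for m, t in reads]
r = pearson(fractions, burden)
```

A strong positive correlation in a series like this would be consistent with the reported relation between circulating tumor DNA levels and tumor burden, although very low fractions approach the error rate of sequencing and require deep coverage to measure reliably.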

One final consideration is how recent studies have further characterized the level of intratumoral genomic diversity. A recent review by Murugaesu, Chew, and Swanton[95] focuses on the current understanding of genomic diversity in individual tumors, clonal evolution, and the implications of these characteristics for cancer treatment. Pleomorphism among tumor cells is a hallmark of cancer, and tumor cell phenotypic heterogeneity and the degree of differentiation of cancer cells often are graded subjectively to aid in prognosis for many cancer types. Recent studies have characterized tumor cell heterogeneity at the genomic level. A study by Yachida and colleagues used whole-exome sequencing and copy number analysis to examine different regions within primary pancreatic cancers as well as among associated metastatic lesions. Those investigators demonstrated that the metastases evolved from subclones from the primary tumor; however, the metastases themselves also were genetically evolved.[96] Martinez et al recently reported on CNVs from 48 biopsies among 8 advanced renal cell carcinomas.[97] Unsupervised clustering of those biopsies, along with comparative evaluation of 440 tumors in the TCGA, revealed that there was significant heterogeneity within individual tumors, with clonal populations within each tumor that recapitulated clusters segregated among tumors in the TCGA. Such studies have significant implications for future approaches to targeted treatment.
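A simple way to summarize multiregion sequencing data of this kind is to sort each alteration by how many biopsies from the same tumor carry it. The sketch below is an illustrative assumption rather than the method of the cited studies, which used clustering of copy number profiles and phylogenetic reconstruction. Alterations found in every region are labeled ubiquitous, those found in some but not all regions are labeled shared, and those confined to a single region are labeled private. The mutation identifiers are placeholders.

```python
def classify_alterations(regions):
    """Group alterations by their distribution across biopsies of one tumor.

    regions: {biopsy_id: set_of_alteration_ids}
    Returns a dict with 'ubiquitous', 'shared', and 'private' sets.
    """
    n_regions = len(regions)
    counts = {}
    for alterations in regions.values():
        for alteration in alterations:
            counts[alteration] = counts.get(alteration, 0) + 1

    classes = {"ubiquitous": set(), "shared": set(), "private": set()}
    for alteration, count in counts.items():
        if count == n_regions:
            classes["ubiquitous"].add(alteration)  # present in every biopsy
        elif count > 1:
            classes["shared"].add(alteration)      # present in a subset
        else:
            classes["private"].add(alteration)     # present in 1 biopsy only
    return classes


# Hypothetical tumor sampled at 3 sites; identifiers are placeholders.
biopsies = {
    "primary_region_1": {"m1", "m2", "m3"},
    "primary_region_2": {"m1", "m2", "m4"},
    "metastasis": {"m1", "m5"},
}
summary = classify_alterations(biopsies)
```

Only the ubiquitous class would be detected reliably by a single biopsy, which illustrates why a target chosen from one sample may be absent from much of the disease.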

Intratumoral heterogeneity and genomic heterogeneity between metastatic foci and primary disease pose obvious challenges to individualized therapy based on genomic evaluation. Future strategies will have to account for these hurdles, and several possible approaches exist. Intratumoral heterogeneity suggests that multiple biopsies separated spatially and temporally may be necessary to optimize treatment; however, there are obvious issues with practicality and morbidity. Treatment options will likely require tailored combinations of drugs to target dominant clonal populations of tumor cells. Strategies may attempt to selectively target clonal populations to “prune” tumors into a state that is genomically “manageable.” Perhaps new strategies will selectively allow a clonal unit with more benign features to dominate while first attacking smaller, yet more aggressive tumor populations before definitive intervention. The current era of genomics has certainly improved our understanding of the cancer genome; however, along with this knowledge there are new challenges facing cancer treatment.

The Future of Genomic Biomarkers

This is a very exciting time to be studying cancer genomics and to carry out translational cancer research. Modern genomics and contemporary therapeutics have already led to treatment strategies that were in the realm of science fiction 3 decades ago. However, there are still many challenges ahead. In several tumor sites, such as HNSCC, no predictive biomarkers exist. Therefore, the scientific and medical communities need to work collectively to use the strategies proposed in this review (summarized in Fig. 2) to develop more clinically relevant biomarkers. With these efforts, new biomarkers and better treatment strategies will lead to improved outcomes for patients afflicted with cancer.

Figure 2.

This graphic summarizes the steps necessary to translate cancer genomic information into successful biomarkers. SNP indicates single-nucleotide polymorphism; CGH, comparative genomic hybridization; RNAseq, RNA sequencing; ChIP, chromatin immunoprecipitation; HDAC, histone deacetylase; si-RNA, short interfering RNA.


Supported by National Institutes of Health/National Cancer Institute grant RC2DE020958 (Comprehensive Analysis of Genetic Alterations in Oral Cancer) and Cancer Prevention and Research Institute of Texas (CPRIT) grant RP100233 (Comprehensive Analysis of Genetic and Epigenetic Changes in Oral Cancer).


The authors made no disclosures.