Santorini mutation detection meeting 2011: Rapid advance in sequencing technology poses challenges for interpretation of genetic variations
The 11th International Symposium on Mutations in the Genome was held on 6–10 June, 2011, in Santorini, Greece. Meeting participants described novel detection technologies, rapid advances in whole genome and whole-exome sequencing, but also highlighted the urgent need for the development of sequence variation databases and the clinical interpretation of the genomic data. This report summarizes some of the major themes presented during the meeting. Hum Mutat 33:1497–1500, 2012. © 2012 Wiley Periodicals, Inc.
After touring the world for the last 20 years (in Oxford, 1991; Lago d'Orta, 1993; Visby, 1995; Brno, 1997; Vicoforte, 1999; Bled, 2001; Palm Cove, 2003; Santorini, 2005; Xiamen, 2007; and Paphos, 2009), the 11th International Symposium on Mutations in the Genome returned to the Greek volcanic island of Santorini, where over 140 scientists from 27 countries gathered on 6–10 June, 2011. The beautiful scenery of the Aegean Sea, created 3,600 years ago by the largest recorded volcanic eruption in history, provided a unique setting for the meeting on mutation detection organized by Richard Cotton (Australia), Aglaia Athanassiadou (Greece), Johan den Dunnen (the Netherlands), Ivo Gut (Spain), Mats Nilsson (Sweden), and Ann-Christine Syvänen (Sweden) and supported by Agilent Technologies (Santa Clara, CA), RainDance Technologies (Lexington, MA), BGI (Shenzhen, China), Complete Genomics (Mountain View, CA), Roche (Basel, Switzerland), Idaho Technology Inc. (Salt Lake City, UT), the Human Variome Project (HVP), and the Municipality of Thera. To celebrate the return of this meeting to Santorini, the vice mayor of Thera (Santorini) offered a generous financial award for the best poster of the Symposium, which was won by Sewon Kim (KAIST, Daejeon, South Korea).
In this meeting report, we describe some of the highlights and try to provide a general overview of the key discussion points.
Ever-Expanding Volume of Data Output and Increasing Speed Achieved by Next-Generation Sequencing: Is More Always Better?
The Symposium began with a plenary lecture by Thomas Caskey (University of Texas, Houston, TX), who gave an overview of the key historic discoveries and technological innovations that have led to the discovery of genes associated with acquired and heritable diseases. Professor Caskey described how recent technological advances, such as whole-genome sequencing (WGS) and whole-exome sequencing, have allowed access to molecular diagnosis of individual clinical cases without requiring analysis of large pedigrees, even for complex disorders. This presentation put the rapid technological progress of recent years in a useful historical context and highlighted the remarkable speed, reduction in cost, and ever-increasing amount of molecular data produced by next-generation sequencing (NGS) techniques, but also commented on the difficulty of interpreting the resulting data. This was to become a recurrent theme of the meeting. Indeed, Charles Strom (Quest Diagnostics, San Juan Capistrano, CA) described the rapid progress allowed by the development of new technologies in the diagnostic laboratory (including microarrays, NGS, qPCR, MLPA, and molecular combing, a technique developed in association with the Pasteur Institute, Paris, France), but warned of the limitations of these techniques for data interpretation in clinical medicine. In particular, he introduced the concept of “VUCS” (variants of unknown clinical significance) obtained during genomic sequence analysis and explained the problems and frustrations clinicians face in interpreting sequencing data of individual patients in a clinical setting. He proposed that clinical panels should include only well-validated disease-causing variants, suggested the use of bioinformatics filters to prevent indiscriminate reporting of VUCS in databases, and advocated the establishment of interpretive guidelines for variants. 
Finally, he recommended that WGS should be performed for research purposes only or once a firm clinical diagnosis has been achieved. Similar comments were also made by several other speakers, including Paul Gissen (Great Ormond Street Hospital, London, United Kingdom), Joseph Thakuria (Harvard, Cambridge, MA), and Johan den Dunnen (Leiden University Medical Center, the Netherlands), who described the incredible speed with which DNA sequencing technology has developed over the last couple of years and how it has led to an exponential rise in the number of genomic databases, resulting in an immediate benefit to society. However, they insisted on the importance of the clinical interpretation of the genomic data generated by DNA sequencing technology and highlighted the need for formal approaches to the analysis of genomic datasets, especially for reproductive counseling and molecular diagnosis of Mendelian disorders.
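The kind of reporting filter proposed by Charles Strom can be illustrated with a minimal sketch. The classification labels, the Variant record, the placeholder gene names, and the reportable function below are all hypothetical and serve only to illustrate the idea of reporting well-validated disease-causing variants from a defined clinical panel while holding back VUCS; they do not correspond to any tool presented at the meeting.

```python
from dataclasses import dataclass

# Hypothetical classification labels, used for illustration only.
PATHOGENIC = "pathogenic"
VUCS = "vucs"  # variant of unknown clinical significance


@dataclass(frozen=True)
class Variant:
    gene: str            # placeholder gene symbol
    description: str     # variant description, e.g. "c.35G>A"
    classification: str  # one of the labels above


def reportable(variants, panel_genes):
    """Keep only validated pathogenic variants in genes on the clinical panel."""
    return [
        v for v in variants
        if v.gene in panel_genes and v.classification == PATHOGENIC
    ]


# Made-up example calls; none of these are real findings.
calls = [
    Variant("GENE_A", "c.35G>A", PATHOGENIC),
    Variant("GENE_A", "c.100A>G", VUCS),        # withheld: unknown significance
    Variant("GENE_B", "c.5C>T", PATHOGENIC),    # withheld: gene not on the panel
]
report = reportable(calls, panel_genes={"GENE_A"})
print([v.description for v in report])
```

In practice, the classification itself is the difficult step and would rest on curated databases and interpretive guidelines of the kind called for by the speakers; the sketch only shows where such a filter sits between variant calling and reporting.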
The benefits of the large volume of data generated by recent technological improvements in NGS were illustrated during Ivo Gut's presentation (Centro Nacional de Analisis Genomico, Barcelona, Spain), which described the efforts of the International Cancer Genome Consortium to sequence and characterize the genetic landscape of 50 different cancer types, among which he reported on chronic lymphocytic leukemia and kidney cancer. Wang Jun (BGI, Shenzhen, China) illustrated the remarkable yield achieved by the BGI sequencing powerhouse that has the capacity to produce approximately 10,000 human genome sequences per year. He reported the progress made since the sequence of the first Asian Genome in 2008 with more than 100 individual genomes completed within the remit of the Yanhuang project. He further described the BGI's efforts to produce a “tree of life” by building accurate, complete, diploid genome sequences as well as their advances in WGS of single cells.
Although NGS allows rapid and relatively cheap data output, several speakers insisted that “more is not always better” and that choosing the right platform for a given disease (and within a given budget) is essential in clinical practice. Paul Gissen (Great Ormond Street Hospital) presented a practical evaluation (considering speed and cost) of an Affymetrix-based microarray resequencing chip (BRUM1: Birmingham ReseqUencing Microarray version 1). When compared with capillary sequencing, the microarray chip (with a capacity of 250 kb per array) is cost-effective as it allows screening of multiple genes. Further improvements are still necessary in order to reduce the running costs and the processing times. Olga Jarinova (University of Ottawa, Ottawa, Canada) presented her experience of detection of mutations involved in hereditary neuropathies using a custom microarray (“Neuropathy Chip”) based on the Affymetrix resequencing platform that enables rapid analysis of multiple genes as part of a single test. She concluded that the cost of the Neuropathy Chip (allowing simultaneous testing of 42 genes associated with all known hereditary neuropathies) would be very close to the fee currently charged for analysis of a single gene. Michael Mindrinos (Stanford Genome Technology Center, Palo Alto, CA) showed that combining long-range PCR with NGS (Illumina) can provide a high-resolution, high-throughput, and cost-effective methodology for accurate typing of the human leukocyte antigen (HLA) genes. Further improvements to the technology (including better design of gene-specific primers, increased sample processing through multiplexing, and improvements in bioinformatics tools) should reduce the cost to the point where HLA typing will become a routine clinical assay that can be integrated into individual treatment plans. 
Claire Chauveau (University Pierre Marie Curie, France) used NGS for identification of pathogenic variants associated with core myopathies and concluded that NGS is a highly valuable tool especially for very large and complex target genes. These are currently impossible to analyze using classical approaches.
Focus on Rare Mendelian Diseases and Complex Disorders
Recent advances in NGS techniques have had a considerable impact on gene discovery for rare genetic diseases as well as complex disorders. Joris Veltman (Nijmegen, the Netherlands) highlighted the limitations of classical approaches relying on positional cloning for gene identification in the case of rare or sporadic disorders and shared his experience of using whole-exome sequencing. The approach, which consists of capturing and sequencing all exons of parent–offspring trios, is particularly suitable for identification of de novo mutations. However, it has also allowed direct and unbiased identification of several genes involved in rare recessive diseases, such as Sensenbrenner syndrome and osteogenesis imperfecta, as well as genes involved in common genetic disorders, such as mental retardation. Johan den Dunnen (Leiden University Medical Center) advocated the benefits of implementing NGS in an academic hospital setting, especially in the case of rare disorders that have eluded genetic diagnosis in the past. As the cost of a human exome is currently nearing 1,000 Euros, this technology is now systematically applied to unresolved genetic cases with excellent results. However, he emphasized the value of using a range of different techniques in molecular diagnosis, including not only whole-exome sequencing and WGS but also long-range PCR, array-on-demand technology, and single-molecule sequencing technology (Helicos). Thomas Caskey (University of Texas) highlighted the problems encountered in the study of complex disorders, such as schizophrenia (SZ), for which analysis is hampered by extreme genetic heterogeneity, resulting in an overwhelming number of candidate genes. 
In an attempt to identify genes involved in familial SZ, both WGS (using the Complete Genomics platform) and whole-exome sequencing were performed on a panel of patients and results from both approaches were compared: while exome sequencing was fast and cost-effective, it consistently missed some exons, did not cover variants in splicing or regulatory regions, and did not provide information on structural or copy-number variations. In the case of SZ, the use of WGS and cosegregation analysis of mutations and disease status led to identification of 42 new candidate genes that still await validation by a third method (Sanger sequencing or genotyping) and analysis in a larger cohort of SZ patients and normal controls. Rita Cacace and Aline Verstraeten (both from University of Antwerp, Belgium) used WGS (Complete Genomics platform) to find new genes involved in neurodegenerative disorders (early-onset familial Alzheimer's disease [AD] and early-onset familial Parkinson's disease [PD]). They explained the pipeline they followed, from patient selection to variant prioritization, analysis, and validation, and how the identification of novel AD and PD genes will lead to a better understanding of the underlying pathological disease mechanisms and to the development of diagnostic tools and preventive therapies. David Goldgar (University of Utah, Salt Lake City, UT) highlighted the benefits (and the reasonable cost) of NGS for mutation detection in the case of complex disorders and suggested that simulation-based analysis of different strategies should be carefully considered prior to undertaking NGS projects, as it provides valuable guidelines for study design. Finally, Kerry Miller (Royal Children's Hospital, Melbourne, Australia) used NGS to identify causative mutations in ENU mouse models of human skeletal dysmorphology.
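The logic behind the parent–offspring trio approach described in this section can be illustrated with a short sketch: a de novo candidate is a variant called in the child but in neither parent. The function name, the representation of a variant as a (chromosome, position, reference allele, alternative allele) tuple, and the example calls are assumptions made for illustration; they are not taken from any pipeline presented at the meeting.

```python
def de_novo_candidates(child, mother, father):
    """Return variants called in the child but in neither parent.

    Each argument is a set of (chromosome, position, ref, alt) tuples.
    """
    return child - (mother | father)


# Made-up variant calls for one trio.
child = {
    ("chr1", 1000, "A", "G"),  # inherited from the mother
    ("chr2", 500, "C", "T"),   # inherited from the father
    ("chr3", 42, "G", "A"),    # absent from both parents
}
mother = {("chr1", 1000, "A", "G")}
father = {("chr2", 500, "C", "T")}

print(sorted(de_novo_candidates(child, mother, father)))
```

A set difference of this kind only yields candidates. A variant can appear to be absent from a parent simply because that position was poorly covered, so a real analysis would also need to check parental coverage and validate the candidates by an independent method.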
Novel Technological Advances: Developments Inside and Outside the Laboratory
Companies sponsoring the meeting presented their recent technological advances in a series of short talks, including Radoje Drmanac (Complete Genomics), who explained that Complete Genomics anticipates being able to provide WGS (of clinical quality) at the rate of 800–1,200 genomes per month by the end of 2011. As new sequencing instruments are being developed, the company plans to further increase this pace 10-fold during 2012. Petr Zak (Roche Diagnostics), Adam Corner (RainDance Technologies), and Bram Herman (Agilent Technologies) reported on different targeted resequencing approaches, while Carl Wittwer (Idaho Technology Inc.) described new developments and benefits provided by high-resolution DNA melting in mutation detection.
The presentation of novel (“fresh”) mutation detection technologies, which are not currently commercially available, has traditionally been an important and highly anticipated feature of the Mutation Detection meetings. Murali Venkatesan (University of Illinois, Urbana, IL) reported on the development of solid-state nanopore sensors with enhanced surface properties for real-time detection and analysis of single DNA molecules and explained how SiO2 nanopores, Al2O3 nanopores, and graphene/Al2O3 hybrid nanopores can be used to detect single-base mutations and DNA–protein complexes. Harold Craighead (Cornell University, Ithaca, NY) introduced the audience to the use of microfluidics for DNA analysis and showed how arrays of fluidic nanostructures are used to manipulate and isolate DNA molecules. Although one obvious application of these devices is DNA extraction, the ability to immobilize and stretch DNA molecules within these nano-chambers also allows single-cell sequencing and epigenetic (methylation and chromatin) modification analysis. Achilles Kapanidis (University of Oxford, Oxford, United Kingdom) described the development of a FRET-based single-molecule DNA sequencing technique using dark quenchers that will enable real-time sequencing over very long reads and with minimal sequence error, while Natalie Vanden Bon (Hasselt University, Belgium) reported on the development of a handheld device containing a nanocrystalline diamond-based DNA sensor, allowing label-free and real-time detection of point mutations.
Simple approaches intended to allow mutation detection outside the traditional setting of clinical laboratories were also discussed. Theodore Christopoulos (University of Patras, Patras, Greece) showed how disposable dipstick-type DNA biosensors can be used for simple genotyping. This technique, which is fast, cheap, and entirely portable, combines PCR amplification, allele discrimination by capillary reaction, and simple visual detection in the form of colored lines and can be scaled up to multi-allelic polymorphisms. John Burn (Quantum Dx and Newcastle University, Newcastle upon Tyne, United Kingdom) presented a new instrument aiming at providing “DNA testing while you wait.” Although a benchtop Si nanowire-based device (QPoc) capable of DNA extraction and PCR is already available for genotyping, the company is working toward a handheld device for DNA sequencing of fragments of up to 10 kb that can be used in remote clinics, with the aim of reducing testing time and costs.
Analysis of Minority Mutations
Analysis of low-level mutations can be of crucial importance (e.g., in the case of tumors to determine heterogeneity, estimate recurrence risks, or predict metastasis) and new approaches are being devised to detect rare mutations more reliably. Mike Makrigiorgos (Harvard Medical School) described the improvements recently made to the simple and elegant technique of co-amplification at lower denaturation temperature PCR (COLD PCR), developed a few years ago to detect low-level mutations (up to 1%) in biological samples; improved and complete enrichment COLD PCR (ice-COLD PCR) can be applied to all mutation types and allows identification of mutations present at levels as low as 0.1%. To use the full benefit of deep sequencing provided by NGS, his group is also developing an emulsion-based high-throughput version of the technique (eCOLD PCR). Thessalia Papasavva (The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus) described the benefits of NGS in noninvasive prenatal diagnosis in the case of β-thalassaemia, an autosomal recessive single-gene disorder prevalent in Cyprus. The team used a combination of high-throughput sequencing and haplotyping for detection of paternally inherited fetal alleles in maternal plasma. Ivo Gut (Centro Nacional de Analisis Genomico) warned that traditional PCR-based methods combined with direct sequencing often fail to detect somatic mutations in heterogeneous tumor samples and explained the challenges (and importance) of detecting oncogenic mutations in circulation. He described a highly specific amplification method (ribo-pyrophosphorolysis-activated polymerization PCR), combined with MALDI-TOF mass spectrometry or NGS detection, that was successfully used for the analysis of specific KRAS somatic mutations present at very low levels. 
Mats Nilsson (Uppsala University, Sweden) and members of his group presented elegant work on in situ mRNA detection of low abundance oncogenic mutations in biological samples through the use of a robust protocol combining padlock probes, rolling circle amplification, and immunofluorescent detection. Practical applications of the technique were presented by Rongqin Ke (Uppsala University) who demonstrated that the method can be used for direct sequencing of individual molecules in fixed mouse and human cocultured cells, while Ida Grundberg (Uppsala University) elegantly demonstrated the application of the technique for in situ detection of seven different oncogenic KRAS mutations from colorectal and lung cancer tissue imprints as well as on fresh and FFPE tissue sections.
Databases and Bioinformatics
The urgent need and importance of accurately reporting, cataloguing, and interpreting the ever-increasing number of genetic variants generated during genome sequencing studies was a key theme of the meeting. The last session of the meeting addressed these concerns through a series of presentations describing different initiatives and variant databases. Johan den Dunnen (Leiden University Medical Center) highlighted the remarkable achievements of the last 30 years in discovering and cataloguing variants associated with over 1,500 genetic diseases but emphasized the need for powerful computational tools to sift through the increasing numbers of variants being discovered through NGS techniques. He emphasized the benefits of fully Web-based gene sequence variant or locus-specific databases that can provide different levels of data access (i.e., submitter vs. visitor, curator…) and stressed that, to be useful, the information needs to be regularly curated and easily accessible to clinical users. George Patrinos (University of Patras) presented the benefits of a database containing variants specific to hemoglobinopathies and thalassemias. He also introduced the principle of “microattribution” as an incentive to encourage submission of genetic variants to databases (FINDbase [Georgitsi et al., 2011]), and suggested that similar strategies could increase the rate of submission to other databases, for example, for some common and/or complex human genetic diseases. Finlay Macrae (The Royal Melbourne Hospital, Melbourne, Australia) described the progress of InSiGHT (International Society for Gastrointestinal Hereditary Tumors [www.insight-group.org]), a database that focuses on mismatch repair genes associated with hereditary cancer syndromes. 
For the last 5 years, InSiGHT has worked in close collaboration with the HVP, allowing significant improvements in database organization (including IT technologies [LOVD]), DNA variant curation, description of genotype–phenotype association, and variant interpretation (achieved through regular consultations of an “interpretation committee” of experts). Peter Taschner (Leiden University Medical Center) explained the improvements in sequence variant descriptions recently made in the Human Genome Variation Society (HGVS) nomenclature (www.hgvs.org/mutnomen), in an effort to accommodate NGS data input. He then introduced a novel version of Mutalyzer (Mutalyzer2 [www.mutalyzer.nl]) that is highly reliable, accepts GenBank or locus region genomic inputs, and integrates an HGVS nomenclature check before submission. Andrew Devereau (National Genetics Reference Laboratory [NGRL], Manchester, United Kingdom) presented the Diagnostic Mutation Database, a key database used in diagnostic laboratories across the United Kingdom (www.dmudb.net). He reported that the database contains nearly 13,000 independent records of genetic tests, corresponding to >39,000 variants in 51 genes (mainly breast and colon cancer genes) and currently has 238 active users. As the main issue of the project appears to be data collection and sustainability, an international partnership with the European Molecular Quality Network has been initiated to extend access to the database. Justin Paschall (National Institutes of Health [NIH], Bethesda, MD) described the ClinVar database, available through the NCBI website (http://www.ncbi.nlm.nih.gov/clinvar/), as a free online resource that aims to extend the benefits of dbSNP (which catalogues SNPs associated with “natural” variation) to clinical genetics and provide supporting evidence linking genetic variations and disease phenotypes. 
The NIH, recognizing the importance of recording and making information freely available, has also begun to develop the Genetic Testing Registry, an online resource that provides a centralized location for test developers and manufacturers to voluntarily submit test information such as indications of use, validity data, and evidence of the test's usefulness. Richard Cotton (Florey Neurosciences Institute, Melbourne, Australia) emphasized the importance of implementing national and international databases and the need for data sharing, and talked about his experience and the new developments of the HVP (www.humanvariomeproject.org), set up as a global community effort to collect human variation information into gene/disease- and country-specific collection databases [Cotton et al., 2008]. In the HVP setting, data are directly collected from diagnostic laboratories within a given country and then shared through international “nodes” in order to help global dissemination of the information. The data are also sent to central variation depositories, allowing better capture, validation, and curation, which will help clinicians to interpret variations and support translation into the clinic.
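The kind of syntax check that a nomenclature tool performs before submission, as described above for Mutalyzer, can be illustrated with a deliberately small sketch. The regular expression and the parse_substitution function below are assumptions made for illustration: they cover only simple coding-DNA substitutions of the form c.76A>T, they are not Mutalyzer's implementation, and they ignore the many other variant types and the reference-sequence prefix that HGVS nomenclature describes.

```python
import re

# Simplified pattern for a coding-DNA substitution such as "c.76A>T".
# Assumption: exonic positions only, no reference-sequence prefix.
SUBSTITUTION = re.compile(r"c\.([1-9]\d*)([ACGT])>([ACGT])")


def parse_substitution(description):
    """Return (position, ref, alt) for a simple substitution, else None."""
    match = SUBSTITUTION.fullmatch(description)
    if match is None:
        return None
    position, ref, alt = match.groups()
    if ref == alt:
        # A substitution must change the base.
        return None
    return int(position), ref, alt


for text in ["c.76A>T", "c.76A>A", "76A>T", "c.76a>t"]:
    print(text, parse_substitution(text))
```

A syntax check of this kind only establishes that a description is well formed. Confirming that the stated reference base actually occurs at that position requires comparison against the reference sequence, which is beyond this sketch.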
The 11th International Symposium on Mutations in the Genome clearly highlighted the benefits provided by the recent technological advances in mutation detection achieved through the implementation of NGS technology in both research and clinical settings. DNA sequencing technologies have been (and are being) developed at speed, delivering a high volume of good quality data at lower cost. As such, NGS provides a very useful tool to identify pathogenic variants in samples that until recently were inaccessible to research or clinical diagnosis. However, concerns were repeatedly expressed during the meeting about the main bottleneck of this revolution, which appears to lie in the analysis, validation, and interpretation of the data. The large number of “VUCS” being reported, rather than simplifying diagnosis, can complicate the clinical interpretation of the data. Given these issues, national and international efforts are being initiated to allow development of comprehensive databases and disease- and/or locus-specific repositories as well as “interpretation committees.” No one will deny that these are very exciting times for mutation detection and genetic analysis, but we need to ensure that we build the necessary infrastructure that will make translating molecular datasets to the clinic a reality.