ENIGMA—Evidence-based network for the interpretation of germline mutant alleles: An international initiative to evaluate risk and clinical significance associated with sequence variation in BRCA1 and BRCA2 genes†
Communicated by Richard G. H. Cotton
As genetic testing for predisposition to human diseases has become an increasingly common practice in medicine, the need for clear interpretation of the test results is apparent. However, for many disease genes, including the breast cancer susceptibility genes BRCA1 and BRCA2, a significant fraction of tests results in the detection of a genetic variant for which disease association is not known. The finding of an “unclassified” variant (UV)/variant of uncertain significance (VUS) complicates genetic test reporting and counseling. As these variants are individually rare, a large collaboration of researchers and clinicians will facilitate studies to assess their association with cancer predisposition. It was with this in mind that the ENIGMA consortium (www.enigmaconsortium.org) was initiated in 2009. The membership is both international and interdisciplinary, and currently includes more than 100 research scientists and clinicians from 19 countries. Within ENIGMA, there are presently six working groups focused on the following topics: analysis, clinical, database, functional, tumor histopathology, and mRNA splicing. ENIGMA provides a mechanism to pool resources, exchange methods and data, and coordinately develop and apply algorithms for classification of variants in BRCA1 and BRCA2. It is envisaged that the research and clinical application of models developed by ENIGMA will be relevant to the interpretation of sequence variants in other disease genes. Hum Mutat 33:2–7, 2012. © 2011 Wiley Periodicals, Inc.
Evaluating the Clinical Relevance of Sequence Variants in BRCA1 and BRCA2
BRCA1 (MIM# 113705) and BRCA2 (MIM# 600185) were identified as “high-risk” breast cancer predisposition genes in 1994 and 1995, respectively, by investigation of families with multiple cases of breast cancer. It is now well established that monoallelic germline mutations in BRCA1 and BRCA2 also confer markedly increased risk of ovarian cancer, as well as modestly increased risks of prostate, pancreatic, and male breast cancer, although the strength of these associations varies somewhat by gene. Sequencing of the BRCA1 and BRCA2 genes may thus be offered to cancer-affected probands as a diagnostic tool to identify an underlying genetic cause for their disease. In addition, healthy women may be offered testing if they have a strong family history and no living affected members available for testing, or if they are more likely to carry a founder mutation in BRCA1 or BRCA2 as in the case, for example, of Ashkenazi Jewish women. In the first 5–10 years in which such testing was available, it was largely confined to women with very strong family histories of disease. Evidence-based inclusion criteria are still used to identify women for testing in Australia and many European countries. However, testing has become considerably more widespread in the USA, where it has been marketed directly to consumers and to physician practices, such as those of community oncologists and gynecologists, despite current recommendations of the American Society of Clinical Oncology that all genetic testing and genomic risk assessment be conducted in the setting of pre- and posttest counseling [Robson et al., 2010].
Two factors may contribute to an even broader use of genetic testing in the future. The first factor is the slow but progressive development of options for specific preventive strategies and for clinical management available to mutation carriers. Discovery of a deleterious or pathogenic mutation in an individual delineates the course of his/her clinical management. In addition, identification of a pathogenic mutation in a family proband has direct implications for the management of family members, including presymptomatic testing of at-risk relatives for the causative mutation. For many years, mutation carriers have been offered the option of prophylactic surgery to prevent development of (further) cancers, and/or increased surveillance to assist early detection of cancers. The amount of BRCA testing is likely to increase further if therapies specifically targeting BRCA1 and BRCA2 mutated tumors, such as poly(ADP-ribose) polymerase (PARP) inhibitors, move out of clinical trials and into practice. Based on the therapeutic implications, some experts have advocated such genetic testing of newly diagnosed premenopausal breast cancer cases, irrespective of family history of cancer [Trainer et al., 2010]. The second factor is the advent of next-generation massively parallel sequencing. As the costs of whole-genome sequencing experiments fall, it is likely that a larger number of individuals will be screened systematically for variation in the complete genomic region spanning BRCA1 and BRCA2, capturing genetic variation beyond the exons and intron–exon boundaries routinely covered by current clinical tests [Walsh et al., 2010].
Many sequence variants identified through sequencing of BRCA1 and BRCA2 can be inferred to be pathogenic since they result in predicted truncating or null proteins and/or are frequent enough in breast–ovarian cancer families that their risk of disease can be estimated directly. However, there are a substantial number of cases in which sequencing identifies rare BRCA1 or BRCA2 sequence variants of uncertain clinical significance. Almost 1,800 distinct sequence variants in BRCA1/2 are listed as having unknown clinical significance on the Breast Cancer Information Core (BIC) database (http://research.nhgri.nih.gov/bic/). Such unclassified variants (UVs) can include missense changes, small in-frame insertions or deletions, and potential splice site alterations, which are problematic for cancer risk estimation and clinical management as their functional implications are not immediately apparent [Schwartz et al., 2008]. There is evidence to indicate that the problems associated with UVs in the clinic are reflected not only as uncertainty in clinical decision making, but also as more intensive management than recommended, with consequential increases in economic costs [Plon et al., 2011]. Defining the subset of these variants that can be treated in the same manner as known pathogenic (e.g., truncating) mutations would thus have obvious clinical benefit for the patients and family members who carry them. In addition, knowing that a given variant is of little clinical significance would avoid the anxiety caused by uncertainty about the meaning of the test result. Moreover, as indicated above, the application of next-generation sequencing will markedly increase the scope of testing to cover all noncoding intronic regions and large tracts of genomic sequence flanking BRCA1 and BRCA2 [Walsh et al., 2010], and will identify many novel variants in possible regulatory regions that are currently poorly researched with respect to the functional effects of genetic variation.
Epidemiological approaches such as association studies are impractical to assess the clinical significance of individually rare sequence variants. A multifactorial model was thus developed as a method to integrate data from several different independent sources to compare the likelihood that a given genetic variant is a pathogenic mutation to the likelihood that the variant is neutral with respect to risk as a function of the various data elements [Goldgar et al., 2004]. The model has been revised to incorporate likelihoods from additional data sources, and refine information that was originally included [Chenevix-Trench et al., 2006; Easton et al., 2007; Goldgar et al., 2008; Gomez Garcia et al., 2009; Osorio et al., 2007; Spearman et al., 2008; Spurdle et al., 2008b; Tavtigian et al., 2008; Walker et al., 2010; Whiley et al., 2011]. Further, a large dataset was used to derive likelihood ratios from family history, cosegregation with disease in families, and co-occurrence of a variant in trans with a pathogenic mutation [Easton et al., 2007], and these likelihood ratios were then used to estimate the prior probability of pathogenicity of a variant based on in silico properties (sequence conservation, severity of the amino acid substitution, domain)—information that can be used together with other likelihood ratios to calculate the posterior probability that a given variant is pathogenic [Tavtigian et al., 2008]. A recognized value of this model-based approach is that it provides a quantitative output that can be used to categorize variants into defined classification categories and so minimizes subjectivity [Plon et al., 2008]. To date, there have been 111 BRCA1 and 102 BRCA2 variants classified according to this method (http://brca.iarc.fr/LOVD/home.php). 
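As an illustration of how the multifactorial calculation works, the following minimal Python sketch combines a prior probability of pathogenicity with independent likelihood ratios on the odds scale. The function name, the prior of 0.30, and the likelihood ratio values are hypothetical and chosen only for illustration; they are not taken from the cited studies, and the published models derive each likelihood ratio with considerably more care.

```python
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability of pathogenicity with independent
    likelihood ratios (pathogenic vs. neutral) via Bayes' rule on the odds scale."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    # Independence of the data sources is assumed, so the ratios multiply.
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical inputs for a single variant (illustrative values only).
prior = 0.30  # e.g., an in silico prior based on conservation and domain
likelihood_ratios = {
    "cosegregation": 12.0,
    "family_history": 2.5,
    "co_occurrence": 1.1,
}

posterior = posterior_probability(prior, likelihood_ratios.values())
print(f"Posterior probability of pathogenicity: {posterior:.3f}")  # 0.934
```

Working on the odds scale keeps each data source as a separate multiplicative factor, so a new source of evidence can be added without changing how the existing ones are handled, provided it is independent of them.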
These multifactorial approaches are most useful when there are a sufficient number of occurrences of a given variant in unrelated families to allow genetic approaches such as cosegregation and family history analyses to inform classification. Therefore, it is important to devise strategies for classifying variants that may only occur once or twice in any given national or regional laboratory. This classification can be facilitated by pooling genetic, clinical, and histopathological information from a worldwide network of laboratories to obtain sufficient data for empirical approaches to be used, and by developing algorithms that allow inclusion of a likelihood of clinical pathogenicity from functional assay results [Farrugia et al., 2008; Iversen et al., 2011], or from mRNA expression and splicing assay results [Spurdle et al., 2008a]. ENIGMA was conceived to address both of these issues.
Importantly, the approaches and strategies used to optimize classification of BRCA1/2 variants are applicable to the assessment of sequence variants in any gene. The studies of BRCA1/2 have seeded research on other cancer susceptibility genes, ranging from application of existing bioinformatic approaches to assessment of sequence variants of ATM, TP53, and CHEK2 (http://agvgd.iarc.fr/) [Tavtigian et al., 2009], to development of multifactorial classification approaches for other cancer syndromes [Arnold et al., 2009; Murphy et al., 2004; Spurdle 2010]. It is also recognized that the interpretation of rare sequence variants will become a major issue with the advent of massively parallel sequencing projects [Heinen 2010; Ng et al., 2009, 2010], and bioinformatic strategies will be particularly valuable to prioritize sequence variation relevant to disease phenotype [Eliseos et al., 2011; Ng et al., 2010].
ENIGMA—Evidence-Based Network for the Interpretation of Germline Mutant Alleles
By late 2008, several different research groups had established formal and informal collaborations to investigate approaches to classify rare BRCA1/2 sequence variants. It was recognized that a more formal collaborative network would facilitate pooling of resources to address the problem of classifying these variants. In May 2009, a group of researchers met in Amsterdam to discuss the purpose and scope of a consortium focused on the issue of sequence variant classification, and founded ENIGMA, the Evidence-based Network for the Interpretation of Germline Mutant Alleles.
The following principles were agreed upon:
- (1) The focus of the Consortium will initially be limited to breast cancer susceptibility genes, and the research projects to BRCA1/2 sequence variants, with potential to expand the effort in the future.
- (2) The Consortium is a research-based collaboration to encourage and improve research efforts and methods in the field.
- (3) The purpose of the Consortium is to share data, methods, and resources to facilitate classification of variants.
- (4) The intent of the Consortium is to promote large-scale projects, including those that assess the validity of assumptions of the current prediction models.
- (5) Classifications of variants that arise from research conducted by the Consortium would be relayed to the BIC locus-specific database, providing a freely accessible repository of variants and their classification.
The ENIGMA membership criteria and current procedures have been established and refined over three subsequent ENIGMA meetings, held at roughly 6- to 9-month intervals. An ENIGMA member is currently defined as a researcher or research group (consortium) who is willing to work collaboratively toward classification of variants and contribute data from families with unclassified sequence variants, as required to aid in the variant classification projects of ENIGMA and/or conduct statistical analysis or laboratory-based assays aimed at classification of variants within a working-group framework. This last requirement is vital to the functioning of ENIGMA, as the working groups form the basis of the consortium's overall approach to the problem of unclassified sequence variants. Current members are listed in Supp. Table S1.
Table 1 describes the six ENIGMA working groups that are charged with conceiving and undertaking projects principally in their field of interest, the purpose/scope of each working group, and one project actively underway by each working group. The working groups interact with each other to access complementary expertise and information, as required to enhance design and implementation of their projects. Although proposals for projects led by individual ENIGMA members are also considered, the intent is to encourage collaboration between groups with shared interests, and the pooling of existing data for the same topic/variant.
Table 1. ENIGMA Working Groups
|Working group|Purpose/scope|Project underway|
|---|---|---|
|Analysis|To inform the interpretation of genetic variants by developing and applying statistical analyses and design of studies to improve the assessment of cancer risk and clinical relevance of BRCA1 and BRCA2 variants.|Generation of reference sets of family history scores from known neutral and pathogenic variants at each center.|
|Clinical|To translate information classifying uncertain variants into the clinical arena.|Assessment of practices of reporting and counseling of unclassified variants across a diverse set of countries.|
|Database|To provide input and guidance to the development and maintenance of databases for the ENIGMA project.|Development of a comprehensive relational database to capture and integrate the wide range of data types being generated by ENIGMA activities.|
|Functional|To develop and use functional assays to contribute to the classification (interpretation) of variants in BRCA1 and BRCA2 for the ENIGMA project.|Standardization of functional assays on a reference set of unclassified variants.|
|Pathology|To identify tumor markers to be integrated into the multifactorial likelihood model for interpretation of variants of uncertain significance in the BRCA1 and BRCA2 genes.|Comparison of tumor histopathological features and loss-of-heterozygosity patterns in carriers of pathogenic missense variants versus truncating mutations.|
|Splicing|To pool the expertise of different active research groups to conduct large-scale studies that improve the clinical classification of likely spliceogenic variants.|Comparison of protocols in use by different ENIGMA sites, assessing results of assays on biological material from a set of reference wild-type controls and carriers of spliceogenic variants causing major or minor mRNA aberrations.|
In addition to specific working-group projects, there are projects that span interests of the entire consortium. The initial project in this category proposes to use existing multifactorial likelihood methodology to assess the clinical significance of a set of BRCA1 and BRCA2 sequence variants that have been submitted to ENIGMA, with the intent to rapidly provide information of clinical relevance back to the submitting clinical sites. Now that data collection and cleaning are complete, we have selected 25 variants and are gathering information to evaluate the components currently included in the model: namely, family history, cosegregation in families, and tumor pathology. Each of these components is designed to be expressed as a simple likelihood ratio comparing the pathogenic and neutral hypotheses; as methods for estimating such ratios for functional assays and splicing assays develop, these too can be integrated into the same multifactorial model. Another such project is to assess cancer risk associated with ∼70 BRCA1/2 sequence variants selected from submissions to the ENIGMA database using case–control analysis, by screening these variants in ∼50,000 breast cancer cases, ∼15,000 ovarian cancer cases, ∼15,000 prostate cancer cases, and country-matched controls for each cancer type through collaboration with the Collaborative Oncological Gene-environment Study (COGS; www.cogseu.org). Given that many of the sequence variants studied are quite rare, it is only through such large-scale collaborative studies that sufficient sample size would be available to provide adequate power to test the association of these variants with cancer risk.
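The sample-size argument can be made concrete with a back-of-the-envelope calculation. The Python sketch below estimates how many carriers of a rare variant would be expected among cases; the carrier frequency of 1 in 10,000 among controls, the odds ratios, and the single-center series of 500 cases are assumptions chosen purely for illustration, whereas the figure of 50,000 breast cancer cases is the approximate number given above.

```python
def carrier_frequency_in_cases(control_frequency, odds_ratio):
    """Carrier frequency among cases implied by a carrier frequency among
    controls and an odds ratio for disease in carriers vs. noncarriers."""
    control_odds = control_frequency / (1.0 - control_frequency)
    case_odds = odds_ratio * control_odds
    return case_odds / (1.0 + case_odds)


def expected_carriers(n_individuals, carrier_frequency):
    """Expected number of carriers in a series of n_individuals."""
    return n_individuals * carrier_frequency


CONTROL_FREQUENCY = 1.0 / 10_000  # assumed carrier frequency in controls
SINGLE_CENTER_CASES = 500         # hypothetical single-center series
POOLED_CASES = 50_000             # approximate pooled breast cancer cases

for odds_ratio in (1.0, 4.0):     # assumed neutral and risk-associated scenarios
    freq = carrier_frequency_in_cases(CONTROL_FREQUENCY, odds_ratio)
    single = expected_carriers(SINGLE_CENTER_CASES, freq)
    pooled = expected_carriers(POOLED_CASES, freq)
    print(f"OR={odds_ratio}: single center {single:.2f}, pooled {pooled:.1f}")
```

Under these assumptions a single center would expect well under one carrier among its cases in either scenario, whereas the pooled series would expect roughly 5 carriers if the variant is neutral and roughly 20 if it confers a fourfold increase in odds, which is the kind of difference a case–control comparison can begin to detect.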
Another example of the power of large-scale collaborations would be to collect data on variants of uncertain significance (VUS) in the large Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA), which includes >27,000 BRCA1 and BRCA2 carriers of known pathogenic mutations, as knowledge of co-occurrence of a particular VUS in trans with a pathogenic mutation provides evidence against the VUS being pathogenic. Although not routinely collected as part of the CIMBA database, the data on any VUS observed in these individuals are likely present at the testing laboratories in which the mutations were identified, and a joint CIMBA/ENIGMA effort could be launched to obtain this information. Because there are often multiple carriers per family in CIMBA, this would allow the phase of the VUS with respect to the pathogenic mutation to be easily determined.
In terms of organizational structure, a steering committee provides direction to ENIGMA, with delegated tasks including:
- (1) managing communication with members,
- (2) meeting organization,
- (3) review and brokering of project proposals submitted by ENIGMA working groups/members,
- (4) arbitration of authorship of manuscripts arising from projects,
- (5) coordinating management of the ENIGMA database,
- (6) coordinating relationships with other classification databases,
- (7) establishing and maintaining communication with existing databases and consortia, for example, BIC, http://research.nhgri.nih.gov/bic/; InSIGHT, http://www.insight-group.org/; Human Variome Project, http://www.humanvariomeproject.org/; CIMBA, http://www.srl.cam.ac.uk/consortia/cimba/; BCAC, http://www.srl.cam.ac.uk/consortia/bcac/; OCAC, http://www.srl.cam.ac.uk/consortia/ocac/,
- (8) maintenance of the ENIGMA Web site,
- (9) seeking funding to support the Network.
The current steering committee structure comprises: ENIGMA working-group chairs, elected by members of the relevant working group; representatives of large multicenter national-level consortia, defined by the existence of a centralized country-specific database of testing results with nation-wide coverage and access to pedigree and family clinical data, selected by the relevant country consortium; and two at-large members proposed by any ENIGMA member, elected by the general membership. The Chair of the steering committee is nominated by the steering committee, and approved by a majority of the voting membership.
With the exception of the designated steering committee chair, steering committee members will serve for 2 years but can be renominated/reelected if so desired. The steering committee chair will serve as an additional at-large member for the year following his/her term as chair, to ensure continuity.
Current working-group chairs and additional steering committee members are listed on the ENIGMA Web site (http://www.enigmaconsortium.org). It is noteworthy that seven of the 11 ENIGMA Steering Committee members are currently also members of the Steering Committee of the BIC, allowing for a direct channel of communication between ENIGMA and the central locus-specific database for BRCA1 and BRCA2. Although, as described above, the Steering Committee of ENIGMA will undergo periodic changes, we expect that this cross-membership with BIC will continue well into the future.
The Value of the Consortium Approach to Evaluating Rare Sequence Variants
The obvious and immediate value of establishing a consortium to assess the clinical significance of rare unclassified sequence variants is the increased power for statistical analyses obtained through pooling of many families with data provided in a common format, since it is unlikely that a definitive classification can be provided from the analysis of a rare variant identified in a single family by a single clinical testing center. Even where a single family does yield strong evidence, additional confirmatory evidence would be important. The multiplicative combination of genetic and pathology information from several families in the multifactorial likelihood model provides a more robust estimate of pathogenicity that is also more likely to cross thresholds currently defined for classification as pathogenic or neutral/low clinical significance [Plon et al., 2008].
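The effect of pooling families can be sketched in a few lines of Python. The example below assumes a prior probability of 0.10 and a cosegregation likelihood ratio of 4 per family, both hypothetical values. The class boundaries follow the five-class posterior probability scheme as we understand it from Plon et al. [2008] and should be checked against that publication before any practical use.

```python
from math import prod


def pooled_posterior(prior, family_likelihood_ratios):
    """Posterior probability of pathogenicity after multiplying per-family
    likelihood ratios (assumed independent) into the prior odds."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * prod(family_likelihood_ratios)
    return posterior_odds / (1.0 + posterior_odds)


def classify(posterior):
    """Map a posterior probability to a five-class scheme.
    Boundaries as understood from Plon et al. [2008]; verify before use."""
    if posterior > 0.99:
        return "Class 5 (pathogenic)"
    if posterior >= 0.95:
        return "Class 4 (likely pathogenic)"
    if posterior >= 0.05:
        return "Class 3 (uncertain)"
    if posterior >= 0.001:
        return "Class 2 (likely not pathogenic)"
    return "Class 1 (not pathogenic)"


PRIOR = 0.10          # hypothetical prior probability of pathogenicity
LR_PER_FAMILY = 4.0   # hypothetical cosegregation likelihood ratio per family

for n_families in (1, 3, 5):
    posterior = pooled_posterior(PRIOR, [LR_PER_FAMILY] * n_families)
    print(f"{n_families} families: posterior {posterior:.3f} -> {classify(posterior)}")
```

With these assumed inputs, one or three families leave the variant in the uncertain class, whereas five families carrying the same variant push the posterior probability above 0.99. No single testing center is likely to hold five informative families for a rare variant, which is the practical case for pooling.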
With 5,960 submissions of 1,286 BRCA1 and 2,040 BRCA2 unique variants, identified in 13,860 families tested across 43 groups in 17 countries, ENIGMA is well placed to begin investigation of UVs. Table 2 shows the summary counts of variant submissions to ENIGMA as at September 2010, and demonstrates the value of a collaborative network for pooling information from families carrying the same rare variant.
Table 2. Summary of Variants Submitted to ENIGMA as at September 2010*
| 1–20 bp from boundary||173||422||174||454|
| >20 bp from boundary/UTR||180||372||300||788|
An immediate benefit of the data collection exercise to date has been to ensure that all variants submitted to ENIGMA are recorded using standardized HGVS nomenclature that allows accurate assessment of the number of unique variants present in the database and facilitates the search for information on variants in the scientific literature. Another important aspect of data collation will be standardized classification of individual variants at different ENIGMA laboratories worldwide. An additional value of pooling resources is that study design and implementation of studies conducted within a working-group framework benefit from the amalgamation of laboratory materials and expertise from multiple active research groups, also facilitating combined analysis of results across laboratories where appropriate. Moreover, the interdisciplinary membership of the collaborative network results in synergistic research studies that promote innovative use of a variety of laboratory and statistical approaches, and importantly, allow for rapid translation of findings to the cancer genetics clinics via the ENIGMA clinical working group.
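The bookkeeping benefit of a common nomenclature can be illustrated with a short Python sketch that groups submissions by gene and variant description and counts the distinct families carrying each variant. The records, center and family identifiers, and variant descriptions below are invented placeholders, and the `normalize` helper performs only trivial cleanup (whitespace and letter case); genuine HGVS normalization requires dedicated tools and curation.

```python
from collections import defaultdict


def normalize(gene, hgvs):
    """Trivial cleanup only: upper-case the gene symbol and strip all
    whitespace from the variant description. Not a full HGVS normalizer."""
    return gene.strip().upper(), "".join(hgvs.split())


# Invented placeholder submissions from three hypothetical centers.
submissions = [
    {"center": "A", "family": "A-001", "gene": "BRCA1", "hgvs": "c.1000A>G"},
    {"center": "B", "family": "B-117", "gene": "brca1", "hgvs": " c.1000A>G "},
    {"center": "B", "family": "B-204", "gene": "BRCA2", "hgvs": "c.2000C>T"},
    {"center": "C", "family": "C-033", "gene": "BRCA1", "hgvs": "c.1000 A>G"},
]

# Map each unique (gene, variant) pair to the set of families carrying it.
families_by_variant = defaultdict(set)
for record in submissions:
    key = normalize(record["gene"], record["hgvs"])
    families_by_variant[key].add(record["family"])

print(f"Unique variants: {len(families_by_variant)}")
for (gene, hgvs), families in sorted(families_by_variant.items()):
    print(f"{gene} {hgvs}: {len(families)} families")
```

Without the cleanup step, the three BRCA1 records would be counted as three different variants, each seen in a single family, and the opportunity to pool their evidence would be missed.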
Extension to Other Breast Cancer Susceptibility Genes
It is anticipated that the progression to include research on genes other than BRCA1 and BRCA2 will be gradual, and will be initiated by current members in an effort to better address research questions that arise from their parallel research on other breast cancer predisposition genes. In addition, the Consortium will seek interest/participation from potential new members with known active research interest in the additional genes selected for study. The main impetus for adding genes will be their becoming relevant in clinical practice, likely as part of a multigene panel. Since many concepts and variant classification processes will be similar for additional breast cancer predisposition genes, it is likely that subgroups dedicated to gene-specific projects will fall under relevant existing working groups. Given the known interests of current members, it is likely that ATM and PALB2 will be the next genes to be included in the ENIGMA scope of work.
The advantage of considering multiple genes in a single consortium is that we will maximize experience gained from research already done on the BRCA1 and BRCA2 genes, and from the infrastructure established within the existing membership to access clinical and molecular information from groups with an active research interest in classification of gene sequence variants for clinical application. While expanding the research base to include additional genes has the potential to fragment research focus, this is less likely to be a problem if the decision to expand is motivated by existing group members and by a need to solve a clinical problem arising for members working at the clinical interface.
The ENIGMA consortium was established to evaluate the clinical significance of sequence variants in high-risk breast cancer genes. ENIGMA is a multidisciplinary international network that provides a mechanism to pool resources, exchange methods and data at many different levels, and coordinately develop and apply algorithms for variant classification. It is envisaged that the research and clinical application of models developed by ENIGMA will be relevant to the interpretation of sequence variants not only in the BRCA genes that are the current focus of ENIGMA, but also in other disease susceptibility genes. The framework established by ENIGMA to assess pathogenicity of genetic variants can also potentially contribute to the global effort to identify genetic determinants of diseases. In an ideal world, a more systematic approach would be employed, encompassing many different genes related to many different clinical phenotypes, both cancer and noncancer; indeed, it could even be extended to assessment of variants identified through whole-exome and whole-genome sequencing studies. A large network of collaborators could be set up to exchange expertise that transcends individual genes and combines skill sets that might be lacking in individual gene consortia. For example, the whole field of in silico predictions of pathogenicity for both missense and splice mutations could be developed across all disease genes. One can also envision within this structure subgroups working on areas that may be applicable to certain classes of disease such as cancer, in which features such as tumor pathology and loss of heterozygosity (LOH) apply. Such an integrated approach would take large-scale funding as well as large-scale coordination but should be considered in the long term, perhaps starting with the ENIGMA model presented here.
Coordination of ENIGMA is funded by the National Institutes of Health Recovery Act supplement award (CA116167Z).