Both these authors contributed equally to this work.
A novel classification system to predict the pathogenic effects of CHD7 missense variants in CHARGE syndrome†
Article first published online: 11 MAY 2012
© 2012 Wiley Periodicals, Inc.
Volume 33, Issue 8, pages 1251–1260, August 2012
How to Cite
Bergman, J. E.H., Janssen, N., van der Sloot, A. M., de Walle, H. E.K., Schoots, J., Rendtorff, N. D., Tranebjærg, L., Hoefsloot, L. H., van Ravenswaaij-Arts, C. M.A. and Hofstra, R. M.W. (2012), A novel classification system to predict the pathogenic effects of CHD7 missense variants in CHARGE syndrome. Hum. Mutat., 33: 1251–1260. doi: 10.1002/humu.22106
Communicated by Sean V. Tavtigian
- Issue published online: 13 JUL 2012
- Accepted manuscript online: 26 APR 2012 03:23PM EST
- Manuscript Accepted: 10 APR 2012
- Manuscript Received: 23 SEP 2011
- Netherlands Organization for Health Research and Development. Grant Number: 92003460 to JEHB
- Nuts Ohra Fund. Grant Number: (project 0901-80 to NJ)
- CHARGE syndrome;
- CHD7, missense mutation;
- classification system;
- prediction pathogenicity;
- genotype-phenotype correlation
CHARGE syndrome is characterized by the variable occurrence of multisensory impairment, congenital anomalies, and developmental delay, and is caused by heterozygous mutations in the CHD7 gene. Correct interpretation of CHD7 variants is essential for genetic counseling. This is particularly difficult for missense variants because most variants in the CHD7 gene are private and a functional assay is not yet available. We have therefore developed a novel classification system to predict the pathogenic effects of CHD7 missense variants that can be used in a diagnostic setting. Our classification system combines the results from two computational algorithms (PolyPhen-2 and Align-GVGD) and the prediction of a newly developed structural model of the chromo- and helicase domains of CHD7 with segregation and phenotypic data. The combination of different variables will lead to a more confident prediction of pathogenicity than was previously possible. We have used our system to classify 145 CHD7 missense variants. Our data show that pathogenic missense mutations are mainly present in the middle of the CHD7 gene, whereas benign variants are mainly clustered in the 5′ and 3′ regions. Finally, we show that CHD7 missense mutations are, in general, associated with a milder phenotype than truncating mutations. Hum Mutat 33:1251–1260, 2012. © 2012 Wiley Periodicals, Inc.
CHARGE syndrome (MIM# 214800) is a clinically heterogeneous syndrome that is characterized by the occurrence of ocular coloboma, heart defects, atresia of choanae, retardation of growth and/or development, genital anomalies, and ear anomalies often combined with deafness [Bergman et al., 2011b; Pagon et al., 1981; Sanlaville and Verloes, 2007; Zentner et al., 2010]. It is inherited in an autosomal dominant fashion. Most cases are sporadic due to de novo mutations but familial recurrence has also been described [Bergman et al., 2011b]. CHARGE syndrome has an estimated incidence of 1 in 16,000 newborns [Janssen et al., 2012]. The major gene involved in CHARGE syndrome is CHD7 (MIM# 608892) and heterozygous CHD7 mutations are found in more than 90% of the patients with typical CHARGE syndrome based on the clinical diagnostic criteria [Blake et al., 1998; Jongmans et al., 2006; Verloes, 2005; Vissers et al., 2004]. Nonsense and frameshift mutations are most prevalent with a frequency of 44% and 34%, respectively [Janssen et al., 2012]. Splice site mutations are found in 11% of patients, while missense mutations are present in 8% of patients. Deletions and genomic rearrangements occur in 3% of cases. Although missense mutations in the CHD7 gene are found in only a minority of patients, their interpretation may be problematic, thus resulting in difficulties in genetic counseling.
Variants that are expected to lead to a truncated protein (nonsense and frameshift mutations and deletions) are considered to be pathogenic (disease causing) because they are highly likely to result in haploinsufficiency. The interpretation of the consequences of a missense variant is more difficult, especially in a rare disease like CHARGE syndrome, in which most mutations are private. A functional assay would be very helpful in the classification of missense variants, but is not available for CHD7. To analyze the consequences of missense variants, computational algorithms have been developed [Tavtigian et al., 2008]. These are mostly based on multiple sequence alignments of a protein across species, with mutations at conserved positions being more likely to disrupt protein function, and/or on the nature of the specific amino acids involved. Each algorithm has its unique strengths and weaknesses. Therefore, it is worthwhile to combine different algorithms to increase the accuracy of the prediction [Frederic et al., 2009; McGee et al., 2010]. In addition, structural models, when available, can help in predicting the effect of a certain variant on the structural and binding properties of the protein [Alibes et al., 2010; Kim et al., 2008; Pey et al., 2007; Rakoczy et al., 2011; van der Sloot et al., 2006]. Apart from these tools, segregation analysis can supply crucial information for classifying missense variants [Bell et al., 2007; Zanetti et al., 2009].
In this study, we present a novel classification system for CHD7 missense variants that combines the results of two computational algorithms—PolyPhen-2 [Adzhubei et al., 2010; Ramensky et al., 2002] and Align-GVGD [Mathe et al., 2006; Tavtigian et al., 2006]—and the prediction of a newly developed structural model of the CHD7 chromo- and helicase domains, with segregation and phenotypic data. We have classified all the CHD7 missense variants known to us (n = 145). Furthermore, we compared the clinical features of patients with a missense variant that we classified as “probably pathogenic” with the features of patients with a truncating mutation to test our hypothesis that missense mutations are associated with a less severe phenotype than truncating mutations.
Patients and Methods
Inclusion of CHD7 Missense Variants
In this article, we give an overview of all CHD7 missense variants reported in the literature before 15 June 2011 [Asakura et al., 2008; Bartels et al., 2010; Bergman et al., 2011a, b; Dauber et al., 2010; De Arriba Munoz et al., 2011; Delahaye et al., 2007; Felix et al., 2006; Feret et al., 2010; Fujita et al., 2009; Gao et al., 2007; Holak et al., 2008; Jongmans et al., 2006, 2008, 2009; Kim et al., 2008; Lalani et al., 2006; Pauli et al., 2012; Vissers et al., 2004; Vuorela et al., 2007; Wessels et al., 2010; Wincent et al., 2008] and the variants that were reported in the NCBI Single Nucleotide Polymorphism database (http://www.ncbi.nlm.nih.gov/SNP, dbSNP build 132) with frequency data (n = 104, Supp. Table S1). In addition, we show all unpublished missense variants that were found in the DNA diagnostic laboratories of the Radboud University Nijmegen Medical Center (RUNMC), Nijmegen, the Netherlands and the Department of Cellular and Molecular Medicine (ICMM), the Panum Institute, University of Copenhagen, Denmark (n = 41).
CHD7 analysis was performed as previously described [Jongmans et al., 2006] and multiplex ligation-dependent probe amplification (MLPA) was performed if CHD7 sequence analysis did not identify a mutation [Bergman et al., 2008]. The GenBank accession number NM_017780.2 was used as reference sequence for the CHD7 gene. The A of ATG was designated number 1. The intron sequences of the CHD7 gene can be found in NG_007009.1. Segregation of the CHD7 variant was studied whenever possible.
Development of a Classification System for CHD7 Missense Variants
We first screened all CHD7 missense variants for possible splice effects using the splicing module of Alamut version 1.5 (http://www.interactive-biosoftware.com/alamut.html). This module contains four splice prediction programs: SpliceSiteFinder-Like [Zhang, 1998], MaxEntScan [Yeo and Burge, 2004], NNSPLICE [Reese et al., 1997], and GeneSplicer [Pertea et al., 2001]. In addition, all missense variants in the CHD7 gene were analyzed with three computational algorithms that predict whether a variant is deleterious: SIFT, PolyPhen-2, and Align-GVGD. The protein multiple sequence alignments used with the computational algorithms were provided by Alamut version 1.5.
The SIFT algorithm, Sorting Intolerant From Tolerant, is available at http://sift.jcvi.org. SIFT uses the PSI-BLAST algorithm to find functionally related protein sequences and then creates a protein sequence alignment of multiple species [Kumar et al., 2009; Ng and Henikoff, 2006]. Prediction is based on the evolutionary conservation of the affected residue and the type of amino acid substitution. The SIFT score is calculated with position-specific scoring matrices with Dirichlet priors and ranges between 0 and 1. SIFT scores less than 0.05 are predicted to be deleterious (probably pathogenic) and scores greater than or equal to 0.05 are predicted to be tolerated (benign).
PolyPhen-2, Polymorphism Phenotyping program version 2, is available at http://genetics.bwh.harvard.edu/pph2/. PolyPhen-2 is an update of PolyPhen [Ramensky et al., 2002] and relies on sequence-based and structure-based features [Adzhubei et al., 2010]. For this study, version 2.2.0 (r364) of PolyPhen-2 was used. The sources of the sequence and structure information were the UniProtKB/UniRef100 release of 5 April 2011 and the PDB/DSSP snapshot of 6 April 2011, respectively. HumVar-trained PolyPhen-2 was developed for diagnostic work in Mendelian diseases. PolyPhen-2 calculates the Naive Bayes posterior probability that a certain variant is damaging and gives estimations of the false-positive and true-positive rates. Based on the model's false-positive rate, a qualitative classification (benign, possibly damaging, or probably damaging) is given. If data are lacking, the PolyPhen-2 outcome is reported as “unknown.” None of the CHD7 missense variants that we entered in PolyPhen-2 had “unknown” as an outcome.
Align-GVGD is available at http://agvgd.iarc.fr/agvgd_input.php. Align-GVGD combines protein sequence alignments of multiple species with the biophysical characteristics of amino acids to calculate the range of biochemical variation among amino acids found at a given position in the alignment (Grantham variation). In addition, the biochemical distance of the mutant amino acid from the observed amino acids at a particular position in different species is calculated (Grantham deviation) [Mathe et al., 2006; Tavtigian et al., 2006]. A grade, varying from C0 to C65, is given to estimate the probability that a certain variant is pathogenic. We interpreted C0 as “probably benign,” C15, C25, and C35 as “possibly pathogenic” and C45, C55, and C65 as “probably pathogenic” in agreement with McGee et al. [McGee et al., 2010].
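As a minimal sketch, the cut-offs described above for SIFT and Align-GVGD can be written as two small lookup helpers. The function and constant names are hypothetical and chosen only for illustration; the thresholds and grade bins are the ones stated in the text, and the helpers do not call the SIFT or Align-GVGD services themselves.

```python
# Illustrative helpers only: the names are hypothetical, the cut-offs come from the text.

# Align-GVGD grades and the interpretation used in this study
ALIGN_GVGD_INTERPRETATION = {
    "C0": "probably benign",
    "C15": "possibly pathogenic",
    "C25": "possibly pathogenic",
    "C35": "possibly pathogenic",
    "C45": "probably pathogenic",
    "C55": "probably pathogenic",
    "C65": "probably pathogenic",
}


def interpret_sift(score: float) -> str:
    """Return the SIFT call for a score in the range 0..1."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"SIFT score out of range: {score}")
    # Scores below 0.05 are deleterious; 0.05 and above are tolerated.
    return "deleterious" if score < 0.05 else "tolerated"


def interpret_align_gvgd(grade: str) -> str:
    """Return the interpretation of an Align-GVGD grade (C0 to C65)."""
    key = grade.strip().upper()
    if key not in ALIGN_GVGD_INTERPRETATION:
        raise ValueError(f"Unknown Align-GVGD grade: {grade!r}")
    return ALIGN_GVGD_INTERPRETATION[key]
```

Note that a SIFT score of exactly 0.05 falls on the tolerated side of the threshold.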
Structural model of the CHD7 chromo- and helicase domains
No experimentally derived structures of the CHD7 chromo- and helicase domains are available as yet and we therefore constructed a structural model for these domains. We did not perform structural analysis of the SANT and BRK domains of CHD7, because only three of the 145 missense variants were identified in these domains. Template structures for the homology modeling of the CHD7 chromo- and helicase domains were selected from the protein database using BLAST (Supp. Table S2) [Durr et al., 2005; Flanagan et al., 2005, 2007; Hauk et al., 2010; Okuda et al., 2007; Thoma et al., 2005]. We used the X-ray structure of the yeast chromatin remodeler Chd1 (3MWY) as a basis for our structural model and for all subsequent analyses, because it shows the chromo- and helicase domains in a single structure [Hauk et al., 2010]. The sequence identity between the target sequence and most of the template sequences was low (approximately 30%). This increases the risk of alignment errors, which can result in faulty structural models. However, a structural superposition of several chromo- or helicase domain structures derived from distantly related organisms showed that many structural features of these domains are particularly well conserved despite remote ancestry and divergent functionality. This indicates that sufficiently accurate models can be constructed of the conserved regions and that the location of many of the CHD7 variants can be predicted with reasonable accuracy. Multiple sequence alignments and structural alignments of the CHD7 target structure and the template structures were performed using Expresso/T-Coffee [Armougom et al., 2006; Notredame et al., 2000]. The homology models of the CHD7 protein were constructed using YASARA Structure version 11.4.18 with standard settings.
A short combined steepest descent and simulated annealing minimization using constraints on aligned backbone atoms was performed, followed by a full unrestrained simulated annealing minimization for the entire model using the YASARA2 force field [Krieger et al., 2002; Krieger et al., 2004; Krieger et al., 2009]. Modeling of the CHD7 variants and the assessment of the effect on CHD7 stability was performed using the FoldX protein design algorithm [Guerois et al., 2002; Schymkowitz et al., 2005; Reis et al., 2010] as described previously [Alibes et al., 2010; Pey et al., 2007; Rakoczy et al., 2011; van der Sloot et al., 2006].
Performance of the computational algorithms and our structural model
To test whether SIFT, PolyPhen-2, Align-GVGD, and our structural model gave correct predictions, we examined whether their predictions were correct for 12 surely benign and 9 surely pathogenic CHD7 missense variants (Table 1; Supp. Table S1). The surely benign variants had been found in two or more controls. The surely pathogenic variants had occurred at least twice de novo in a patient with CHARGE syndrome, or had occurred de novo once and were found in at least two patients with CHARGE syndrome (Table 1; Supp. Table S1). Furthermore, none of the surely pathogenic CHD7 missense variants was predicted to influence splicing.
For the phenotypic comparison of patients with a CHD7 missense mutation with those carrying a truncating mutation, we only included the patients who were analyzed at the RUNMC and the ICMM. In total, we compared the clinical features of 35 patients with a missense variant that we had classified as “probably pathogenic” with the features of 315 patients with a truncating mutation (5 patients with a deletion, 145 patients with a frameshift mutation, and 165 patients with a nonsense mutation). Clinical data were gathered through questionnaires and/or retrospective chart review. Fisher's exact test was performed to identify significant differences between the two groups of patients (significance level P < 0.05).
Development of Our Classification System
The performance of the computational algorithms and our structural model is shown in Table 1. SIFT gave a correct prediction for only 2 of 12 benign variants, whereas PolyPhen-2 and especially Align-GVGD performed much better for the benign variants. PolyPhen-2 was better than Align-GVGD in correctly predicting the pathogenic variants. Therefore, we included PolyPhen-2 and Align-GVGD in our classification system, but did not include SIFT (Table 2). The output of PolyPhen-2 and Align-GVGD was scored as 0 (benign), +0.5 (possibly pathogenic), or +1 (probably pathogenic) and was then summed as previously suggested by McGee et al. [McGee et al., 2010].
|Computational algorithms (summed score between 0 and +2)|
|PolyPhen-2: benign = 0, possibly damaging = +0.5, and probably damaging = +1|
|Align-GVGD: C0 = 0, C15/C25/C35 = +0.5, and C45/C55/C65 = +1|
|Structural model (summed score between −1 and +1)|
|Minor effect = −1, undetermined effect = 0, detrimental effect or located close to the ATP binding site = +1|
|Segregation analysis (summed score between −10 and +4)|
|Variant occurred de novo in one patient with features of CHARGE syndrome = +3|
|Variant occurred at least twice de novo in patients with features of CHARGE syndrome = +4|
|Asymptomatic carrier of the varianta = −2|
|Variant found in a homozygous state = −5|
|Variant found in combination with a pathogenic CHD7 mutationb = −3|
|Prediction based on total summed score (total score between −11 and +7)|
|Probably benign: total score 0 or less|
|Unknown: total score greater than 0 and less than +4|
|Probably pathogenic: total score +4 or more|
Two surely benign and five surely pathogenic missense variants were located in the chromo- or helicase domain and could therefore be modeled. Our structural model gave a correct prediction for all of these variants except one, for which the effect was undetermined (Table 1). Because the model correctly predicted almost all variants, we decided to integrate it into our classification system. Variants that were predicted to have a minor effect on the stability of the CHD7 protein were scored as −1, variants that had an undetermined effect received a score of 0, and variants that were predicted to have a detrimental effect or were located close to the ATP binding site were scored as +1 (Table 2).
In addition to the scores of the algorithms and our structural model, we integrated data from segregation analysis in our classification system (Table 2). If the variant of interest had occurred de novo in one patient with features of CHARGE syndrome, three points were added. If a certain variant had occurred at least twice de novo in patients with features of CHARGE syndrome, four points were added. In contrast, two points were subtracted if the variant was found in at least one clinically well-characterized person without features of CHARGE syndrome, or if the variant was found in at least two persons reported to be normal, but for whom no detailed clinical information was available (e.g., controls reported in the NCBI SNP database or not thoroughly investigated family members). Five points were subtracted if the variant was found in a homozygous state (because homozygous CHD7 mutations are presumed to be lethal). Three points were subtracted if the missense variant was found in combination with a clearly pathogenic CHD7 mutation, that is, a truncating, missense, or splice site mutation.
Total scores could vary between −11 and +7. Variants with a score of 0 or lower were classified as “probably benign,” those with a score greater than 0 but lower than +4 were classified as “unclassified variants (UV),” and those with a score of +4 or higher were classified as “probably pathogenic” (Table 2; Supp. Table S1).
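The scoring scheme of Table 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the class, field, and function names are hypothetical, and a variant outside the modeled chromo- and helicase domains is assumed to contribute 0 from the structural model.

```python
from dataclasses import dataclass
from typing import Optional

# Point values taken from Table 2; all names below are hypothetical.
POLYPHEN2_SCORES = {"benign": 0.0, "possibly damaging": 0.5, "probably damaging": 1.0}
ALIGN_GVGD_SCORES = {"C0": 0.0, "C15": 0.5, "C25": 0.5, "C35": 0.5,
                     "C45": 1.0, "C55": 1.0, "C65": 1.0}
STRUCTURAL_SCORES = {"minor": -1.0, "undetermined": 0.0, "detrimental": 1.0}


@dataclass
class VariantEvidence:
    polyphen2: str                            # PolyPhen-2 call
    align_gvgd: str                           # Align-GVGD grade, C0 to C65
    structural_effect: Optional[str] = None   # None: variant not modeled (assumed 0 points)
    de_novo_count: int = 0                    # de novo occurrences in patients with CHARGE features
    asymptomatic_carrier: bool = False        # carrier without features of CHARGE syndrome
    homozygous: bool = False                  # variant seen in a homozygous state
    with_pathogenic_mutation: bool = False    # seen together with a clearly pathogenic mutation


def total_score(ev: VariantEvidence) -> float:
    """Sum the computational, structural, and segregation scores."""
    score = POLYPHEN2_SCORES[ev.polyphen2] + ALIGN_GVGD_SCORES[ev.align_gvgd]
    if ev.structural_effect is not None:
        score += STRUCTURAL_SCORES[ev.structural_effect]
    if ev.de_novo_count >= 2:
        score += 4
    elif ev.de_novo_count == 1:
        score += 3
    if ev.asymptomatic_carrier:
        score -= 2
    if ev.homozygous:
        score -= 5
    if ev.with_pathogenic_mutation:
        score -= 3
    return score


def classify(score: float) -> str:
    """Map a total score (-11 to +7) onto the three classes."""
    if score >= 4:
        return "probably pathogenic"
    if score <= 0:
        return "probably benign"
    return "unclassified variant (UV)"
```

For example, `classify(total_score(VariantEvidence("probably damaging", "C0", de_novo_count=1)))` reaches the +4 threshold through one de novo occurrence (+3) plus one "probably damaging" prediction (+1).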
Classification of CHD7 Missense Variants
A complete overview of all 145 missense variants in the CHD7 gene is supplied in Supp. Table S1. As a first screen, we ran the splice prediction programs and determined that 12 of the 145 missense variants might have an effect on splicing. However, as we were unable to confirm the splice prediction with RNA studies, we classified these variants with our scoring system to see whether the amino acid substitution had a pathogenic effect. Using our classification system (see Table 2), 40 variants had a score ≥+4 and were classified as “probably pathogenic” (27%, with four variants possibly affecting splicing), 46 variants had a score greater than 0 and lower than +4 and were classified as “UV” (32%, with six variants possibly affecting splicing), and 59 variants had a score ≤0 and were classified as “probably benign” (41%, with two variants possibly affecting splicing) (Supp. Table S1). Our classification agreed well with most of the classifications of the 104 previously reported variants (Table 3). However, it was discordant for five variants: p.His55Arg, p.Pro732Ala, p.Val2102Ile, p.Ala2789Thr, and p.Lys2948Glu. We had classified these five variants as “probably benign” based solely on in silico data, but they had previously been reported as pathogenic [Felix et al., 2006; Jongmans et al., 2006; Kim et al., 2008] (Supp. Table S1).
|Classification as reported in the literature|
Most CHD7 missense variants were identified in only one person, but 38 variants were recurrent (26%). Of these recurrent variants, 13/38 were classified as “probably pathogenic” (including three variants possibly affecting splicing), 5/38 as “UV” (including two variants possibly affecting splicing), and 20/38 as “probably benign” (including one variant possibly affecting splicing). Four benign variants (p.Ser103Thr, p.Met340Val, p.Gly522Val, and p.Phe2750Leu) were found in more than 15 index persons. Two of these variants (p.Ser103Thr and p.Gly522Val) were found in homozygous states and are, therefore, surely benign (mice with homozygous CHD7 mutations die in early embryogenesis [Bosman et al., 2005] and homozygous CHD7 mutations have never been found in a patient with CHARGE syndrome). The three most frequently occurring pathogenic missense mutations were each found in more than three index patients (p.Ile1028Val, p.Gln1214Arg, and p.Gly2108Arg). Three pathogenic missense mutations were possibly implicated in familial CHARGE syndrome: p.Ser834Phe [Delahaye et al., 2007], p.His2096Arg [Feret et al., 2010], and p.Gly2108Arg [Jongmans et al., 2008].
The 145 CHD7 missense variants were distributed throughout the entire coding region of the CHD7 gene, as shown in Figure 1. The variants that we classified as “probably benign” were predominantly located in the 5′ and 3′ regions of the CHD7 gene and those classified as “probably pathogenic” were found in the middle of the gene. Forty-five variants were located in, or were in very close proximity to, functional domains of CHD7: 11 in the chromodomains, 31 in the helicase domain, and only three in the SANT and BRK domains.
Structural Model of CHD7
Our structural model shows the two chromodomains capping the DNA-binding cleft between the N-terminal and C-terminal lobes of the helicase domain and has the two helicase lobes spaced relatively far apart, with residues of the C-terminal lobe not making any direct contact with ATP (Fig. 2). Like the yeast Chd1 structure, the CHD7 model shows an acidic helix connecting chromodomains 1 and 2 that interacts with a basic patch on the C-terminal helicase lobe. This suggests that CHD7 might employ a similar mechanism to discriminate between nucleosome-DNA substrates as that proposed for the yeast Chd1 structure [Hauk et al., 2010].
Based on the “wild-type” model described above, FoldX was used to create structural models of the different CHD7 variants and estimate their effect on the structural stability of the CHD7 chromo- and helicase domains. The effect of the different missense variants was classified either as likely to have a minor effect, or likely to have a detrimental effect on the structural stability of the protein. Mutations increasing the calculated Gibbs free energy by more than 1 kcal/mol were considered to be potentially “detrimental.” For the calculation, the main focus was on the terms that describe increases in energy due to van der Waals clashes (mutation to a larger residue in the protein core), or to a loss in van der Waals energy (mutation to a smaller residue in the protein core), or unfavorable solvation (mutation from a hydrophobic residue to a hydrophilic residue in the protein core). An increase in energy due to the loss of a hydrogen bond was ignored, because this term depends strongly on accurate atom positions of the hydrogen-bond donor and acceptor. Apart from effects on structural stability, the position of the mutation in the structure was also taken into account: was it in close proximity to the ATP binding site or was a known interaction motif altered?
Of the 42 variants that were located in the chromo- or helicase domain, 11 variants were predicted to have a minor effect (26%), 28 variants were predicted to have a detrimental effect (67%), and for three variants the effect could not be determined (7%). The variants that were predicted to decrease the stability of the CHD7 protein were frequently located in the core of the protein (24/28), whereas the variants with a predicted minor effect were often located at the surface or in linker/loop regions (9/11). Two variants were considered to have a detrimental effect due to their close location to the ATP binding site: p.Asn1030Ser in the helicase N-lobe and p.Gln1395His in the helicase C-lobe (Fig. 2). Although not in direct contact with ATP, the latter variant is located next to p.Arg1399, which in homologous structures is considered to function as one of the arginine finger residues involved in stabilizing the transition state of ATP. Therefore, these two variants could influence the ATPase activity of the helicase domain and are predicted to be “detrimental.” No direct effects on phosphorylation or interaction motifs were found for any of the variants.
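The decision rule described above can be summarized in a short sketch. The function name and parameters are hypothetical, and the sketch assumes that a FoldX stability change (in kcal/mol) has already been computed elsewhere; it does not run FoldX. Representing an undeterminable effect as a missing value is likewise an assumption made for illustration.

```python
from typing import Optional

# Threshold stated in the text: an increase of more than 1 kcal/mol is "detrimental".
DDG_THRESHOLD_KCAL_PER_MOL = 1.0


def structural_effect(ddg_kcal_per_mol: Optional[float],
                      near_atp_binding_site: bool = False) -> str:
    """Classify a modeled variant as 'minor', 'detrimental', or 'undetermined'.

    ddg_kcal_per_mol: change in calculated free energy relative to wild type,
        or None when no reliable estimate is available (assumption).
    near_atp_binding_site: True when the residue lies close to the ATP binding site.
    """
    # Position is considered in addition to stability.
    if near_atp_binding_site:
        return "detrimental"
    if ddg_kcal_per_mol is None:
        return "undetermined"
    if ddg_kcal_per_mol > DDG_THRESHOLD_KCAL_PER_MOL:
        return "detrimental"
    return "minor"
```

A variant with a small calculated energy change is still reported as "detrimental" here when it lies close to the ATP binding site, since position is considered alongside stability.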
The clinical data of the patients who had CHD7 analysis done at the RUNMC or ICMM were used to compare the phenotype of patients with a missense variant that we classified as “probably pathogenic” (n = 35) with that of patients with a truncating mutation in the CHD7 gene (n = 315) (Table 4). The patients with a truncating mutation more often fulfilled the clinical criteria of Blake et al. [Blake et al., 1998] and Verloes [Verloes, 2005] (P = 0.017 and P = 0.031, respectively). In addition, cleft lip and/or palate (P = 0.042), choanal anomalies (P = 0.015), and congenital heart defects (P < 0.001) were present significantly more often in patients with a truncating mutation compared to those with a missense mutation. The other clinical features were not significantly different between the two groups. In conclusion, missense mutations were, in general, found to be associated with a milder phenotype compared to truncating mutations.
|Clinical feature||Patients with a CHD7 missense mutation (n = 35)a||Patients with a CHD7 truncating mutation (n = 315)b||Comparison P value|
|Blake criteria||57.1% (8/14)c||85.5% (106/124)c||0.017|
|Verloes criteria||71.4% (10/14)||92.5% (111/120)||0.031|
|Cleft lip and/or palate||31.8% (7/22)||55.6% (80/144)||0.042|
|Choanal anomaly||30.0% (6/20)||60.4% (110/182)||0.015|
|Heart defect||46.7% (14/30)||82.5% (212/257)||<0.001|
|Tracheoesophageal anomaly||13.6% (3/22)||33.6% (43/128)||0.080|
|Coloboma and/or microphthalmia||75.9% (22/29)||86.9% (199/229)||0.154|
|Cranial nerve dysfunction||78.9% (15/19)||90.8% (119/131)||0.123|
|Semicircular canal anomaly and/or balance disturbance||95.0% (19/20)||100% (121/121)||0.142|
|External ear anomaly||96.4% (27/28)||98.2% (217/221)||0.452|
|Kidney anomaly||30.0% (6/20)||37.6% (44/117)||0.620|
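The group comparisons in Table 4 rely on Fisher's exact test. As a hedged illustration, the sketch below implements a two-sided version in pure Python for a 2×2 table of counts. The function name is hypothetical, and the two-sided definition used here (summing the probabilities of all tables that are no more likely than the observed one) is an assumption; the text does not state which variant or software was used, so results may differ slightly from the published P values.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two patient groups; columns are feature present / absent.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def probability(x: int) -> float:
        # Hypergeometric probability of x feature-positive patients in row 1
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is as likely as, or less likely than, the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(
        p for x in range(lowest, highest + 1)
        if (p := probability(x)) <= observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Counts taken from Table 4 (present, absent) for missense vs. truncating patients
p_heart = fisher_exact_two_sided(14, 16, 212, 45)   # heart defect: 14/30 vs. 212/257
p_kidney = fisher_exact_two_sided(6, 14, 44, 73)    # kidney anomaly: 6/20 vs. 44/117
```

With the significance level of P < 0.05 used in the study, `p_heart` falls below the threshold and `p_kidney` does not, in line with Table 4.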
A Novel Classification System for CHD7 Missense Variants
It is important to classify missense variants accurately as either benign or pathogenic because of the clinical implications. Patients harboring a pathogenic CHD7 mutation should be screened for additional features of CHARGE syndrome, for example, hypogonadotropic hypogonadism and heart and kidney defects (see surveillance scheme in [Bergman et al., 2011b]). Early detection and treatment of hypogonadotropic hypogonadism is important, because this will reduce the risk of osteoporosis [Bergman et al., 2011a]. In addition, genetic counseling is indicated to inform the patient and the parents about reproductive options. Furthermore, correct classification of CHD7 missense variants can contribute to the knowledge about CHD7 function. Reliable classification of CHD7 missense variants, however, is difficult. The absence of a particular missense variant in the control population is often used for confirmation of pathogenicity. However, most CHD7 missense variants are found only once, even in cohorts of more than 1,000 patients, and, therefore, very large numbers of controls should be screened before the missense variant can be classified as “probably pathogenic” [Bell et al., 2007]. A control group of that size is currently not available, but the 1000 Genomes Project and the Dutch genome project will likely supply useful data. In addition, a validated functional model can be very helpful in the classification, but such a model is currently not available, due to the complexity of CHD7 function. To overcome these problems, we have developed a novel classification system that can be used in a diagnostic setting. Our system combines the results of PolyPhen-2, Align-GVGD, and our structural model with segregation and phenotypic data (Table 2). It will, therefore, increase the reliability of predictions.
We used our system to classify all known missense variants (n = 145) and were able to classify 40 variants as “probably pathogenic,” 46 as “unclassified variant,” and 59 as “probably benign.” The specificity and sensitivity of our classification are not known, because a gold standard does not exist. However, combining the output of different algorithms is known to increase the predictive value [Chan et al., 2007] and segregation data are widely accepted as a valuable source of information for the classification of missense variants [Bell et al., 2007]. In addition, our classification was frequently in agreement with predictions from the literature (Table 3).
According to our classification system, a variant is considered “probably pathogenic” when it has occurred at least twice de novo in patients with features of CHARGE syndrome, or when the variant has occurred de novo in one patient with features of CHARGE syndrome and is predicted to have a deleterious effect according to either PolyPhen-2, Align-GVGD, or our structural model. Considering that many diagnostic laboratories conclude that every de novo variant is pathogenic, we are confident that a variant classified as “probably pathogenic” with our more conservative approach is very likely to be a true pathogenic mutation. We feel that one de novo occurrence is not enough for classifying a variant as surely pathogenic, because a benign polymorphism can also by chance occur de novo in a sporadic patient with CHARGE syndrome (e.g., c.6103+8C>T, which had occurred de novo in a CHARGE patient [Lalani et al., 2006], but was later conclusively proven to be benign [Kim et al., 2008]). We classify a variant as “probably benign” when there are no clues to suggest pathogenicity from either PolyPhen-2, Align-GVGD, our structural model or segregation data or when the segregation data suggest that the variant is probably benign. This means that there is a chance that a variant that we classified as “probably benign,” might later receive the label “probably pathogenic,” if future studies show that the variant has occurred at least twice de novo in patients with features of CHARGE syndrome. Five variants that we classified as “probably benign,” p.His55Arg, p.Pro732Ala, p.Val2102Ile, p.Ala2789Thr, and p.Lys2948Glu, were previously reported as pathogenic missense mutations [Felix et al., 2006; Jongmans et al., 2006; Kim et al., 2008]. 
Because all five variants were predicted to be “benign” by PolyPhen-2 and Align-GVGD and segregation data (including phenotypic data) were not available, the total summed score was 0, leading to our classification of “probably benign.” None of these variants was located in the chromo- or helicase domain and we therefore did not perform structural modeling. However, p.Ala2789Thr was predicted to be deleterious according to a structural model of CHD7 that was constructed by Kim et al. [Kim et al., 2008]. Hopefully, segregation data concerning these five variants will become available in the future, leading to a more reliable classification of pathogenicity.
Unfortunately, we had to classify one third of the CHD7 variants as “UV” (46/145), due to a lack of segregation data (n = 25) or of phenotypic data of the carrier parent (n = 17). For only four variants, these data were available, but we still had to classify the variant as “UV.” Three of these variants had been identified in an affected family member, but unfortunately it was unknown whether the variant had occurred de novo in the affected parent. Additional segregation and phenotypic data from patients and/or controls can ultimately lead to a correct classification of all missense variants. The locus-specific CHD7 mutation database (available at www.CHD7.org; [Swertz et al., 2010]) provides a valuable source of information, as it contains both segregation and clinical data. Clinical data are important, because the phenotype of patients who undergo CHD7 analysis in a clinical diagnostic laboratory is not always highly suggestive of CHARGE syndrome [Bartels et al., 2010]. On the contrary, many patients have only a few features of CHARGE syndrome and CHD7 analysis is performed to exclude a diagnosis of CHARGE syndrome. The prior chance of finding a pathogenic CHD7 mutation in this group is therefore much lower than in the group of patients with typical CHARGE syndrome.
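As a quick consistency check on the counts given above, the three reasons for a “UV” label can be added up and compared with the reported total. The variable names are chosen for this example only; the numbers are those stated in the text.

```python
# Counts as reported in the text.
total_variants = 145
uv_no_segregation_data = 25      # no segregation data available
uv_no_parent_phenotype = 17      # no phenotypic data of the carrier parent
uv_despite_available_data = 4    # data available, still unclassified

uv_total = (
    uv_no_segregation_data
    + uv_no_parent_phenotype
    + uv_despite_available_data
)
uv_fraction = uv_total / total_variants

print(uv_total)                      # 46
print(round(100 * uv_fraction, 1))   # 31.7
```

The three subgroups sum to the 46 variants reported, which is 31.7% of 145, in line with the “one third” stated above.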
Segregation data in combination with phenotypic data are reasonably reliable, but one should be aware that a variant that segregates with the disease is not always pathogenic, because the missense variant may be in linkage disequilibrium with an unidentified pathogenic mutation. When interpreting segregation data, the possibility of phenocopies, variable expressivity, and nonpaternity should be considered. The presence of a CHD7 variant in the NCBI SNP database does not necessarily mean that the variant is benign, because there is always a chance that a mildly affected patient with CHARGE syndrome could have been included in the NCBI SNP cohorts [Bell et al., 2007].
Our system mainly classifies missense variants according to the predicted effect of the amino acid substitution. However, missense variants, as well as synonymous changes, can also have a deleterious effect on splicing, because the variant can be located in, or close to, a splice site, or it can create a novel splice site. Of the 145 missense variants that were assessed in this study, 12 were predicted to have a possible effect on splicing according to the splice prediction programs (12/145 = 8%). RNA studies should be performed to confirm the splice effects.
Distribution of CHD7 Missense Variants
The CHD7 missense variants were present in the entire coding region of the CHD7 gene (Fig. 1, Supp. Table S1). The variants that we classified as “probably pathogenic” were all located in the middle of the CHD7 gene. Those that we classified as “probably benign” were predominantly located at the 5′ and 3′ ends of the CHD7 gene: 47/59 “probably benign” variants were found in amino acids 1–820 and 2320–2997 (Fig. 1). The 5′ end of the CHD7 gene is only weakly to moderately conserved among species, and neither the N-terminal nor the C-terminal region of the CHD7 protein contains functionally important domains.
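The terminal regions named above can be turned into a simple position check. The following Python sketch is illustrative: the helper names `residue_position` and `in_terminal_region` are assumptions made for this example, the protein length of 2997 residues is taken from the upper bound of the range quoted above, and the parser handles only simple three-letter substitutions of the form `p.Xxx123Yyy`.

```python
import re

PROTEIN_LENGTH = 2997                          # upper bound of the range quoted in the text
TERMINAL_REGIONS = [(1, 820), (2320, 2997)]    # amino acid ranges quoted in the text

# Matches simple three-letter missense notation such as "p.His55Arg".
_MISSENSE = re.compile(r"^p\.[A-Za-z]{3}(\d+)[A-Za-z]{3}$")


def residue_position(variant: str) -> int:
    """Extract the residue number from a simple missense description."""
    match = _MISSENSE.match(variant)
    if match is None:
        raise ValueError(f"unsupported variant description: {variant!r}")
    return int(match.group(1))


def in_terminal_region(position: int) -> bool:
    """Return True if the residue lies in one of the two terminal regions."""
    if not 1 <= position <= PROTEIN_LENGTH:
        raise ValueError(f"position {position} is outside the protein")
    return any(start <= position <= end for start, end in TERMINAL_REGIONS)


# Share of "probably benign" variants in the terminal regions, as reported.
terminal_share = 47 / 59
print(round(100 * terminal_share, 1))  # 79.7
```

Applied to the five previously reported variants discussed earlier, this check places p.His55Arg, p.Pro732Ala, p.Ala2789Thr, and p.Lys2948Glu in the terminal regions and p.Val2102Ile outside them.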
Structural Model of Chromo- and Helicase Domains
We constructed a structural model of the chromo- and helicase domains of the CHD7 protein, based on different template structures (Fig. 2, Supp. Table S2). FoldX was used to create structural models of the different CHD7 variants and estimate their effect on the structural stability of the CHD7 chromo- and helicase domains. Because the accuracy of the energy prediction by FoldX depends on the exact position of the amino acid atoms in a structure, the accuracy of our prediction is more limited than in previous works [Alibes et al., 2010; Pey et al., 2007; Rakoczy et al., 2011], due to the use of models based on low sequence identity between target and template, and the low resolution of the available template structures.
A previous study constructed a structural model of the C-terminal part of the CHD7 protein and concluded that variants in the loop regions were likely detrimental, because of their possible effects on the structural and binding properties of the CHD7 protein [Kim et al., 2008]. This is in contrast to our model, where five CHD7 missense variants located in loop regions were all predicted to have a likely minor effect on protein stability (Supp. Table S1). For every new CHD7 variant that is submitted to the CHD7 database, we will provide the prediction of our structural model.
CHARGE syndrome is extremely variable and the phenotype cannot be predicted from the genotype. However, when comparing the clinical features of patients with a CHD7 missense mutation with patients with a truncating mutation, we have shown that missense mutations are, in general, associated with a milder phenotype (Table 4). This association is also seen in other syndromes, for example, Rett syndrome [Cheadle et al., 2000]. Three features were found significantly more often in the patients with a CHD7 truncating mutation: cleft lip/palate, choanal anomalies, and congenital heart defects. This is consistent with a previous study that showed that 10 severely affected fetuses with CHARGE syndrome were all carrying a CHD7 truncating mutation [Sanlaville et al., 2006]. The features that are almost always present in CHARGE syndrome (external ear anomalies, cranial nerve dysfunction and balance disturbance caused by semicircular canal anomalies [Bergman et al., 2011b]), do not occur significantly more often in patients with a truncating mutation. This was to be expected, because these features are frequently seen in very mildly affected patients [Bergman et al., 2011b; Delahaye et al., 2007; Jongmans et al., 2008; Lalani et al., 2006; Vuorela et al., 2008].
We have developed a novel classification system to predict the pathogenic effects of CHD7 missense variants that can be used in a diagnostic setting. In our classification system we have combined the outcome of PolyPhen-2 and Align-GVGD and the prediction of our structural model with segregation and phenotypic data of carriers of a CHD7 missense variant. The combination of different variables will lead to a more confident prediction of pathogenicity than was previously possible. We have used our system to classify 145 CHD7 missense variants and have made our data available in the locus-specific CHD7 mutation database (www.CHD7.org). Ongoing submission of new segregation and phenotypic data will contribute to a better classification, in particular, for those CHD7 missense variants that we have classified as UV or those that were classified as “probably benign” solely based on in silico data. CHD7 missense variants were found scattered throughout the entire coding region of the CHD7 gene, with pathogenic mutations found in the middle of the CHD7 gene and the benign variants mainly clustered in the 5′ and 3′ regions. Finally, we showed that CHD7 missense mutations are, in general, associated with a milder phenotype than truncating mutations.
We thank the Netherlands Organization for Health Research and Development (grant no. 92003460 to JEHB) and the Nuts Ohra Fund (project 0901-80 to NJ) for financial support and Jackie Senior for editing the manuscript. AMS was partially supported by a Juan de la Cierva fellowship of the Spanish Ministry of Science. The Danish part of the project was hosted by the Wilhelm Johannsen Centre for Functional Genome Research, established by the Danish National Research Foundation. We thank M. A. Swertz and T. Rengaw of the Genomics Coordination Center, University Medical Centre Groningen for the work on the CHD7 mutation database (www.CHD7.org).
- 2010. A method and server for predicting damaging missense mutations. Nat Methods 7:248–249.
- 2010. Using protein design algorithms to understand the molecular basis of disease caused by protein-DNA interactions: the Pax6 example. Nucleic Acids Res 38:7422–7431.
- 2006. Expresso: automatic incorporation of structural information in multiple sequence alignments using 3D-coffee. Nucleic Acids Res 34:W604–W608.
- 2008. Endocrine and radiological studies in patients with molecularly confirmed CHARGE syndrome. J Clin Endocrinol Metab 93:920–924.
- 2010. Mutations in the CHD7 gene: the experience of a commercial laboratory. Genet Test Mol Biomarkers 14:881–891.
- 2007. Practice guidelines for the interpretation and reporting of unclassified variants (UVs) in clinical molecular genetics. CMGS/VKGL 6 p. http://www.cmgs.org/BPGs/pdfs%20current%20bpgs/UV%20GUIDELINES%20ratified.pdf.
- 2008. Exon copy number alterations of the CHD7 gene are not a major cause of CHARGE and CHARGE-like syndrome. Eur J Med Genet 51:417–425.
- 2011a. Anosmia predicts hypogonadotropic hypogonadism in CHARGE syndrome. J Pediatr 158:474–479.
- 2011b. CHD7 mutations and CHARGE syndrome: the clinical implications of an expanding phenotype. J Med Genet 48:334–342.
- 1998. CHARGE association: an update and review for the primary pediatrician. Clin Pediatr (Phila) 37:159–173.
- 2005. Multiple mutations in mouse Chd7 provide models for CHARGE syndrome. Hum Mol Genet 14:3463–3476.
- 2007. Interpreting missense variants: comparing computational methods in human disease genes CDKN2A, MLH1, MSH2, MECP2, and tyrosinase (TYR). Hum Mutat 28:683–693.
- 2000. Long-read sequence analysis of the MECP2 gene in Rett syndrome patients: correlation of disease severity with mutation type and location. Hum Mol Genet 9:1119–1129.
- 2010. Delayed puberty due to a novel mutation in CHD7 causing CHARGE syndrome. Pediatrics 126:e1594–e1598.
- 2011. CHARGE syndrome and CHD7 gene mutation. Neurologia 26:255.
- 2007. Familial CHARGE syndrome because of CHD7 mutation: clinical intra- and interfamilial variability. Clin Genet 72:112–121.
- 2005. X-ray structures of the Sulfolobus solfataricus SWI2/SNF2 ATPase core and its complex with DNA. Cell 121:363–373.
- 2006. CHD7 gene and non-syndromic cleft lip and palate. Am J Med Genet A 140:2110–2114.
- 2010. Expanding the phenotypic overlap between CHARGE and Kallmann syndromes due to CHD7 mutations. No. 1671. Abstract of poster presented at 60th Annual ASHG Meeting, Washington DC, 2–6 Nov. 2010. www.ashg.org/2010meeting/abstracts/fulltext/f22415.htm.
- 2007. Molecular implications of evolutionary differences in CHD double chromodomains. J Mol Biol 369:334–342.
- 2005. Double chromodomains cooperate to recognize the methylated histone H3 tail. Nature 438:1181–1185.
- 2009. UMD-predictor, a new prediction tool for nucleotide substitution pathogenicity—application to four genes: FBN1, FBN2, TGFBR1, and TGFBR2. Hum Mutat 30:952–959.
- 2009. Abnormal basiocciput development in CHARGE syndrome. Am J Neuroradiol 30:629–634.
- 2007. CHD7 gene polymorphisms are associated with susceptibility to idiopathic scoliosis. Am J Hum Genet 80:957–965.
- 2002. Predicting changes in the stability of proteins and protein complexes: a study of more than 1000 mutations. J Mol Biol 320:369–387.
- 2010. The chromodomains of the Chd1 chromatin remodeler regulate DNA access to the ATPase motor. Mol Cell 39:711–723.
- 2008. New recognized ophthalmic morphologic anomalies in CHARGE syndrome caused by the R2319C mutation in the CHD7 gene. Ophthalmic Genet 29:79–84.
- 2012. Mutation update on the CHD7 gene involved in CHARGE syndrome. Hum Mutat doi: 10.1002/humu.22086
- 2006. CHARGE syndrome: the phenotypic spectrum of mutations in the CHD7 gene. J Med Genet 43:306–314.
- 2008. Familial CHARGE syndrome and the CHD7 gene: a recurrent missense mutation, intrafamilial recurrence and variability. Am J Med Genet A 146:43–50.
- 2009. CHD7 mutations in patients initially diagnosed with Kallmann syndrome—the clinical overlap with CHARGE syndrome. Clin Genet 75:65–71.
- 2008. Mutations in CHD7, encoding a chromatin-remodeling protein, cause idiopathic hypogonadotropic hypogonadism and Kallmann syndrome. Am J Hum Genet 83:511–519.
- 2004. Making optimal use of empirical energy functions: force-field parameterization in crystal space. Proteins 57:678–683.
- 2009. Improving physical realism, stereochemistry, and side-chain accuracy in homology modeling: four approaches that performed well in CASP8. Proteins 77(Suppl 9):114–122.
- 2002. Increasing the precision of comparative models with YASARA NOVA—a self-parameterizing force field. Proteins 47:393–402.
- 2009. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc 4:1073–1081.
- 2006. Spectrum of CHD7 mutations in 110 individuals with CHARGE syndrome and genotype-phenotype correlation. Am J Hum Genet 78:303–314.
- 1993. Kallmann syndrome in two sisters with other developmental anomalies also affecting their father. Clin Genet 43:51–53.
- 2006. Computational approaches for predicting the biological effect of p53 missense mutations: a comparison of three sequence analysis based methods. Nucleic Acids Res 34:1317–1325.
- 2010. Novel mutations in the long isoform of the USH2A gene in patients with Usher syndrome type II or non-syndromic retinitis pigmentosa. J Med Genet 47:499–506.
- 2006. Predicting the effects of amino acid substitutions on protein function. Annu Rev Genomics Hum Genet 7:61–80.
- 2000. T-coffee: a novel method for fast and accurate multiple sequence alignment. J Mol Biol 302:205–217.
- 2007. Structural polymorphism of chromodomains in Chd1. J Mol Biol 365:1047–1062.
- 1981. Coloboma, congenital heart disease, and choanal atresia with multiple anomalies: CHARGE association. J Pediatr 99:223–227.
- 2012. CHD7 mutations causing CHARGE syndrome are predominantly of paternal origin. Clin Genet 81:234–239.
- 2001. GeneSplicer: a new computational method for splice site prediction. Nucleic Acids Res 29:1185–1190.
- 2007. Predicted effects of missense mutations on native-state stability account for phenotypic outcome in phenylketonuria, a paradigm of misfolding diseases. Am J Hum Genet 81:1006–1024.
- 2011. Analysis of disease-linked rhodopsin mutations based on structure, function, and protein stability calculations. J Mol Biol 405:584–606.
- 2002. Human non-synonymous SNPs: server and survey. Nucleic Acids Res 30:3894–3900.
- 1997. Improved splice site detection in genie. J Comput Biol 4:311–323.
- 2010. Rapid and efficient cancer cell killing mediated by high-affinity death receptor homotrimerizing TRAIL variants. Cell Death Dis 1:e83.
- 2006. Phenotypic spectrum of CHARGE syndrome in fetuses with CHD7 truncating mutations correlates with expression during human development. J Med Genet 43:211–217.
- 2007. CHARGE syndrome: an update. Eur J Hum Genet 15:389–399.
- 2005. Prediction of water and metal binding sites and their affinities by using the fold-X force field. Proc Natl Acad Sci USA 102:10147–10152.
- 2010. The MOLGENIS toolkit: rapid prototyping of biosoftware at the push of a button. BMC Bioinformatics 11(Suppl 12):S12.
- 2006. Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral. J Med Genet 43:295–305.
- IARC Unclassified Genetic Variants Working Group. 2008. In silico analysis of missense substitutions using sequence-alignment based methods. Hum Mutat 29:1327–1336.
- 2005. Structure of the SWI2/SNF2 chromatin-remodeling domain of eukaryotic Rad54. Nat Struct Mol Biol 12:350–356.
- 2006. Designed tumor necrosis factor-related apoptosis-inducing ligand variants initiating apoptosis exclusively via the DR5 receptor. Proc Natl Acad Sci USA 103:8634–8639.
- 2005. Updated diagnostic criteria for CHARGE syndrome: a proposal. Am J Med Genet A 133:306–308.
- 2004. Mutations in a new member of the chromodomain gene family cause CHARGE syndrome. Nat Genet 36:955–957.
- 2007. Molecular analysis of the CHD7 gene in CHARGE syndrome: identification of 22 novel mutations and evidence for a low contribution of large CHD7 deletions. Genet Med 9:690–694.
- 2008. A familial CHARGE syndrome with a CHD7 nonsense mutation and new clinical features. Clin Dysmorphol 17:249–253.
- 2010. Novel CHD7 mutations contributing to the mutation spectrum in patients with CHARGE syndrome. Eur J Med Genet 53:280–285.
- 2008. CHD7 mutation spectrum in 28 Swedish patients diagnosed with CHARGE syndrome. Clin Genet 74:31–38.
- 2004. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J Comput Biol 11:377–394.
- 2009. Segregation analysis in a family at risk for the Maroteaux–Lamy syndrome conclusively reveals c.1151G>A (p.S384N) as to be a polymorphism. Eur J Hum Genet 17:1160–1164.
- 2010. Molecular and phenotypic aspects of CHD7 mutation in CHARGE syndrome. Am J Med Genet A 152A:674–686.
- 1998. Statistical features of human exons and their flanking regions. Hum Mol Genet 7:919–932.
Additional supporting information may be found in the online version of this article.