A Novel Classification System for CHD7 Missense Variants
It is important to classify missense variants accurately as either benign or pathogenic because of the clinical implications. Patients harboring a pathogenic CHD7 mutation should be screened for additional features of CHARGE syndrome, for example, hypogonadotropic hypogonadism and heart and kidney defects (see surveillance scheme in [Bergman et al., 2011b]). Early detection and treatment of hypogonadotropic hypogonadism are important, because they reduce the risk of osteoporosis [Bergman et al., 2011a]. In addition, genetic counseling is indicated to inform the patient and the parents about reproductive options. Furthermore, correct classification of CHD7 missense variants can contribute to the knowledge about CHD7 function. Reliable classification of CHD7 missense variants, however, is difficult. The absence of a particular missense variant in the control population is often used for confirmation of pathogenicity. However, most CHD7 missense variants are found only once, even in cohorts of more than 1,000 patients, and, therefore, very large numbers of controls should be screened before the missense variant can be classified as “probably pathogenic” [Bell et al., 2007]. A control group of that size is currently not available, but the 1000 Genomes Project and the Dutch genome project will likely supply useful data. In addition, a validated functional model can be very helpful in the classification, but such a model is currently not available, due to the complexity of CHD7 function. To overcome these problems, we have developed a novel classification system that can be used in a diagnostic setting. Our system combines the results of PolyPhen-2, Align-GVGD, and our structural model with segregation and phenotypic data (Table 2). Combining these sources should, therefore, increase the reliability of predictions.
We used our system to classify all known missense variants (n = 145) and were able to classify 40 variants as “probably pathogenic,” 46 as “unclassified variant,” and 59 as “probably benign.” The specificity and sensitivity of our classification are not known, because a gold standard does not exist. However, combining the output of different algorithms is known to increase the predictive value [Chan et al., 2007], and segregation data are widely accepted as a valuable source of information for the classification of missense variants [Bell et al., 2007]. In addition, our classification was frequently in agreement with predictions from the literature (Table 3).
According to our classification system, a variant is considered “probably pathogenic” when it has occurred at least twice de novo in patients with features of CHARGE syndrome, or when the variant has occurred de novo in one patient with features of CHARGE syndrome and is predicted to have a deleterious effect by PolyPhen-2, Align-GVGD, or our structural model. Considering that many diagnostic laboratories conclude that every de novo variant is pathogenic, we are confident that a variant classified as “probably pathogenic” with our more conservative approach is very likely to be a true pathogenic mutation. We feel that one de novo occurrence is not enough to classify a variant as definitely pathogenic, because a benign polymorphism can also occur de novo by chance in a sporadic patient with CHARGE syndrome (e.g., c.6103+8C>T, which had occurred de novo in a CHARGE patient [Lalani et al., 2006], but was later conclusively proven to be benign [Kim et al., 2008]). We classify a variant as “probably benign” when PolyPhen-2, Align-GVGD, our structural model, and segregation data provide no clues to suggest pathogenicity, or when the segregation data suggest that the variant is probably benign. This means that a variant that we classified as “probably benign” might later receive the label “probably pathogenic” if future studies show that the variant has occurred at least twice de novo in patients with features of CHARGE syndrome. Five variants that we classified as “probably benign,” p.His55Arg, p.Pro732Ala, p.Val2102Ile, p.Ala2789Thr, and p.Lys2948Glu, were previously reported as pathogenic missense mutations [Felix et al., 2006; Jongmans et al., 2006; Kim et al., 2008].
Because all five variants were predicted to be “benign” by PolyPhen-2 and Align-GVGD, and segregation data (including phenotypic data) were not available, the total summed score was 0, leading to our classification of “probably benign.” None of the five variants was located in the chromo- or helicase domain, and we therefore did not perform structural modeling. However, p.Ala2789Thr was predicted to be deleterious according to a structural model of CHD7 constructed by Kim et al. [Kim et al., 2008]. Hopefully, segregation data concerning these five variants will become available in the future, leading to a more reliable classification of pathogenicity.
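The decision rules described above can be sketched in code. The following Python fragment is a minimal illustration only: it encodes the rules as stated in the text, not the summed score of Table 2, and the field names, the order in which the rules are checked, and the fall-through to “unclassified variant” are assumptions made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class VariantEvidence:
    # Number of de novo occurrences in patients with features of CHARGE syndrome.
    de_novo_in_charge_patients: int = 0
    # Deleterious calls from the three predictors named in the text.
    polyphen2_deleterious: bool = False
    align_gvgd_deleterious: bool = False
    structural_model_deleterious: bool = False
    # Hypothetical flag: segregation data point towards a benign variant.
    segregation_suggests_benign: bool = False


def classify(ev: VariantEvidence) -> str:
    """Apply the prose rules; the rule order is an assumption of this sketch."""
    predicted_deleterious = (
        ev.polyphen2_deleterious
        or ev.align_gvgd_deleterious
        or ev.structural_model_deleterious
    )
    # Rule 1: at least two de novo occurrences in patients with CHARGE features.
    if ev.de_novo_in_charge_patients >= 2:
        return "probably pathogenic"
    # Rule 2: one de novo occurrence plus at least one deleterious prediction.
    if ev.de_novo_in_charge_patients == 1 and predicted_deleterious:
        return "probably pathogenic"
    # Rule 3: segregation data suggest that the variant is benign.
    if ev.segregation_suggests_benign:
        return "probably benign"
    # Rule 4: no clue for pathogenicity from any source.
    if not predicted_deleterious and ev.de_novo_in_charge_patients == 0:
        return "probably benign"
    # Assumed fall-through: some evidence, but not enough for either label.
    return "unclassified variant"
```

Under these assumptions, a variant without any supporting evidence, such as the five previously reported variants discussed above, is returned as “probably benign,” whereas a single de novo occurrence without a deleterious prediction remains an “unclassified variant.”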
Unfortunately, we had to classify one third of the CHD7 variants as “unclassified variant” (UV; 46/145), due to a lack of segregation data (n = 25) or of phenotypic data of the carrier parent (n = 17). For only four variants were these data available, yet we still had to classify them as “UV.” Three of these variants had been identified in an affected family member, but unfortunately it was unknown whether the variant had occurred de novo in the affected parent. Additional segregation and phenotypic data from patients and/or controls can ultimately lead to a correct classification of all missense variants. The locus-specific CHD7 mutation database (available at www.CHD7.org; [Swertz et al., 2010]) provides a valuable source of information, as it contains both segregation and clinical data. Clinical data are important, because the phenotype of patients who undergo CHD7 analysis in a clinical diagnostic laboratory is not always highly suggestive of CHARGE syndrome [Bartels et al., 2010]. On the contrary, many patients have only a few features of CHARGE syndrome, and CHD7 analysis is performed to exclude a diagnosis of CHARGE syndrome. The prior chance of finding a pathogenic CHD7 mutation in this group is therefore much lower than in the group of patients with typical CHARGE syndrome.
Segregation data in combination with phenotypic data are reasonably reliable, but one should be aware that a variant that segregates with the disease is not always pathogenic, because the missense variant may be in linkage disequilibrium with an unidentified pathogenic mutation. When interpreting segregation data, the possibility of phenocopies, variable expressivity, and nonpaternity should be considered. The presence of a CHD7 variant in the NCBI SNP database does not necessarily mean that the variant is benign, because there is always a chance that a mildly affected patient with CHARGE syndrome could have been included in the NCBI SNP cohorts [Bell et al., 2007].
Our system mainly classifies missense variants according to the predicted effect of the amino acid substitution. However, missense variants, as well as synonymous changes, can also have a deleterious effect on splicing, because the variant can be located in, or close to, a splice site, or it can create a novel splice site. Of the 145 missense variants that were assessed in this study, 12 were predicted to have a possible effect on splicing according to the splice prediction programs (12/145 = 8%). RNA studies should be performed to confirm the splice effects.
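The counts reported in this section can be cross-checked with simple arithmetic. The short Python check below uses only numbers given in the text; the variable names are illustrative.

```python
total_variants = 145

# Classification outcome reported for all known missense variants.
classes = {
    "probably pathogenic": 40,
    "unclassified variant": 46,
    "probably benign": 59,
}
assert sum(classes.values()) == total_variants

# Reasons given for the "unclassified variant" label.
uv_reasons = {
    "no segregation data": 25,
    "no phenotypic data of the carrier parent": 17,
    "data available but inconclusive": 4,
}
assert sum(uv_reasons.values()) == classes["unclassified variant"]

# Proportions quoted in the text: about one third unclassified, 8% with a
# predicted possible effect on splicing.
uv_fraction = classes["unclassified variant"] / total_variants
splice_fraction = 12 / total_variants

print(f"unclassified: {uv_fraction:.1%}")
print(f"possible splice effect: {splice_fraction:.1%}")
```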
Distribution of CHD7 Missense Variants
The CHD7 missense variants were distributed over the entire coding region of the CHD7 gene (Fig. 1, Supp. Table S1). The variants that we classified as “probably pathogenic” were all located in the middle of the CHD7 gene. Those that we classified as “probably benign” were predominantly located at the 5′ and 3′ ends of the CHD7 gene: 47/59 “probably benign” variants were found in amino acids 1–820 and 2320–2997 (Fig. 1). The 5′ end of the CHD7 gene is only weakly to moderately conserved among species, and neither the N- nor the C-terminus of the CHD7 protein contains functionally important domains.
Structural Model of Chromo- and Helicase Domains
We constructed a structural model of the chromo- and helicase domains of the CHD7 protein, based on different template structures (Fig. 2, Supp. Table S2). FoldX was used to create structural models of the different CHD7 variants and to estimate their effect on the structural stability of the CHD7 chromo- and helicase domains. Because the accuracy of the energy prediction by FoldX depends on the exact position of the amino acid atoms in a structure, the accuracy of our prediction is more limited than in previous work [Alibes et al., 2010; Pey et al., 2007; Rakoczy et al., 2011], due to the use of models based on low sequence identity between target and template and the low resolution of the available template structures.
A previous study constructed a structural model of the C-terminal part of the CHD7 protein and concluded that variants in the loop regions were likely detrimental, because of their possible effects on the structural and binding properties of the CHD7 protein [Kim et al., 2008]. This is in contrast to our model, where five CHD7 missense variants located in loop regions were all predicted to have a likely minor effect on protein stability (Supp. Table S1). For every new CHD7 variant that is submitted to the CHD7 database, we will provide the prediction of our structural model.
CHARGE syndrome is extremely variable and the phenotype cannot be predicted from the genotype. However, when comparing the clinical features of patients with a CHD7 missense mutation with those of patients with a truncating mutation, we have shown that missense mutations are, in general, associated with a milder phenotype (Table 4). This association is also seen in other syndromes, for example, Rett syndrome [Cheadle et al., 2000]. Three features were found significantly more often in the patients with a CHD7 truncating mutation: cleft lip/palate, choanal anomalies, and congenital heart defects. This is consistent with a previous study that showed that 10 severely affected fetuses with CHARGE syndrome all carried a CHD7 truncating mutation [Sanlaville et al., 2006]. The features that are almost always present in CHARGE syndrome (external ear anomalies, cranial nerve dysfunction, and balance disturbance caused by semicircular canal anomalies [Bergman et al., 2011b]) do not occur significantly more often in patients with a truncating mutation. This was to be expected, because these features are frequently seen in very mildly affected patients [Bergman et al., 2011b; Delahaye et al., 2007; Jongmans et al., 2008; Lalani et al., 2006; Vuorela et al., 2008].