All 14 nonsense variants reported here are considered to be pathogenic (Supporting Information Table S2). Synonymous variants are generally viewed as “silent”; however, depending on their position in the gene, they may disrupt normal splicing, codon usage, or mRNA folding and stability, which may adversely affect normal peptide synthesis (Sauna & Kimchi-Sarfaty, 2011). Analysis of the 11 synonymous variants reported here suggests that 9 would have no effect on splicing, as the affinity scores for the variant and wild-type sequences were not markedly different (Table 1). However, the variant c.1845G>A destroys the exon 12 donor splice site and is therefore likely to be pathogenic. Although the affinity scores for c.1216C>A at the native exon 9 acceptor site were only minimally changed, all three predictive programs identified a novel acceptor site with affinity levels equivalent to the wild-type. If this site were used, it would result in a frame-shift and a truncated peptide (p.(His407Thrfs*7)); this variant could therefore be considered pathogenic.
Using PolyPhen 1 & 2 (version 2.1.0), SIFT, and Refined SIFT analyses, 73 missense substitutions were predicted to be pathogenic and 10 nonpathogenic (Table S3), although nine variants gave discordant results, with three or four “nonpathogenic” predictions (from a combination of PolyPhen and SIFT programs). These nine variants were analysed further (Table 2). Conservation scores were calculated for each position in the LDLR peptide with ScoreCons (Valdar, 2002), using the alignment of functionally similar proteins with the same domain types, number and order (Fig. 1B). These scores were used for analysing the degree of conservation of residues involved in missense variants and small rearrangements, and were applied to the three-dimensional (3D) model of the extracellular portion of LDLR to allow visualization of the conservation patterns in the peptide. In addition to the ScoreCons analysis, the nine missense variants with discordant results were also analysed with SAAPdb and Mutation Taster.
p.(Ser123Pro): SAAPdb analysis of p.(Ser123Pro) suggests that removal of the serine would destroy hydrogen bonds, and that introduction of a proline residue, which is larger and more rigid than serine, would further disrupt the LDL class 3A ligand-binding domain where this residue is located. These results are in agreement with those from Mutation Taster and lead us to suggest that this variant is probably pathogenic (Table 2).
p.(Phe200Cys): Located in the LDL-receptor class A5 repeat, p.(Phe200Cys) was designated as nonpathogenic by Mutation Taster. However, conservation analysis shows that this variant occurs in an environment of highly conserved residues with disulphide bonds around it (Fig. 2Ai). Therefore, introduction of an additional cysteine residue at this position could potentially result in the formation of disulphide bonds with cysteine residues in the region (positions 197, 204, or 209) (Fig. 2Ai). Formation of novel disulphide bonds would disrupt the wild-type configuration of this domain and would probably be pathogenic.
Figure 2. 3D diagrammatic representation of conservation score (residues 22–720), coloured to represent conservation scores (red: high, purple: moderate, blue: poor and black: no conservation). The residue of interest in each model is shown in green, which does not reflect its level of conservation; the orientation was chosen to best show the residue of interest. (A) Phenylalanine200 is poorly conserved (inset i) in a highly conserved environment; local disulphide bridges are shown as green lines. (B) Histidine211 is highly conserved. (C) Glycine314 is on the surface of the protein, moderately conserved (inset i) and close to a calcium cation (inset ii). (D) Leucine339 is on the surface of the protein and is poorly conserved (insets i and ii). (E) Ramachandran plot for p.(Leu339Pro); dark green areas show the favoured regions for prolines, whereas pale green shows acceptable areas. Triangles represent prolines present in the native structure (black in favoured regions, orange in acceptable regions, red in disfavoured regions). There are many prolines in disfavoured regions due to a poor-quality structure. Leucine339, which is mutated to proline, is shown as a blue square in a disfavoured region for proline. Unlike any of the disfavoured prolines in the native structure, it has a more negative phi angle. (F) Alanine612 is moderately conserved; it is buried within the structure and so cannot be visualised in the diagrammatic representation of conservation score.
p.(His211Leu): Histidine211, which is also located in the LDL-receptor class A5 repeat (Fig. 2B), is highly conserved (ScoreCons 0.847). It is positively charged and polar, whereas leucine is neutral and nonpolar. The difference in polarity and charge may affect ligand binding, and this variant could therefore be pathogenic (Table 2).
p.(Gly314Arg): Glycine314 in the epidermal growth factor (EGF)-like 1 domain is not highly conserved (ScoreCons 0.656) (Fig. 2Ci), and SAAPdb analysis provides no evidence for the potential pathogenicity of p.(Gly314Arg). The 3D structure of the region revealed that glycine314 is next to a threonine residue that is involved in binding a calcium cation (Figs 2Ci and ii). It is therefore possible that substitution of the glycine with a positively charged arginine could affect the binding affinity of the calcium cation, although arginine is found at this position in other species (lizard and Chinese hamster). Examination of this variant at the DNA level reveals that the guanine at position 940 is the last nucleotide in exon 6, and changing this to a thymine is predicted to reduce the affinity of the splice donor site to 71, 52, and 97% of wild-type (SplicePort, NNSSP and NetGene2, respectively) (Table 2). There are no other splice donor sites in the region with equal or greater binding affinities, and it is possible that normal splicing of exons 6 and 7 may be disrupted. This variant could therefore be pathogenic either as a result of the structural change or by disrupting splicing.
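The percentages quoted for the splice donor site express each program's score for the variant sequence relative to its score for the wild-type sequence. A minimal Python sketch of that ratio is given here for illustration; the function name is hypothetical, the raw scores in the usage line are invented rather than taken from Table 2, and the sketch assumes positive scores, which may not hold for every prediction program.

```python
def percent_of_wild_type(variant_score: float, wild_type_score: float) -> float:
    """Return the variant splice-site score as a percentage of the wild-type score.

    Assumes both scores come from the same program and that the
    wild-type score is positive (an assumption of this sketch).
    """
    if wild_type_score <= 0:
        raise ValueError("wild-type score must be positive")
    return 100.0 * variant_score / wild_type_score


# Hypothetical raw scores, for illustration only (not from Table 2).
relative_affinity = percent_of_wild_type(variant_score=0.71, wild_type_score=1.0)
print(f"{relative_affinity:.0f}% of wild-type")
```

Reporting the ratio rather than the raw score allows results from programs with different scoring scales to be placed side by side.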
p.(Leu339Pro): Leucine339 in the EGF-like 1 domain is located on the surface of the peptide and is poorly conserved across species (ScoreCons 0.403) (Figs 2Di and ii). However, SAAPdb revealed that substitution of leucine with proline causes a small clash, and a Ramachandran plot (Fig. 2E) indicates that introduction of proline would be unfavourable for folding. Project Hope analysis of p.(Leu339Pro) suggested that Leu339 is also involved in several multimer contacts, and introduction of the smaller proline at this position may prevent these contacts being made. Overall, we predict that this variant is probably pathogenic.
p.(Ala612Ser): ScoreCons analysis (ScoreCons 0.710) (Fig. 1B) indicates that the β-propeller domain, which includes Ala612, shows a generally high degree of conservation (Fig. 2F), with an average of 0.72 compared to the overall average of 0.64. Of note, the core region shows a very high degree of conservation, and many of the most conserved regions are buried, whereas the surface contains fewer highly conserved regions. Conservation analysis of the 3D structure shows that Ala612 is buried within the LDLR structure and lies within a highly conserved core structural element of the β-propeller domain (residues 611–614 have ScoreCons scores of 0.708, 0.708, 0.780, and 0.951, respectively). Changes in this region are highly unlikely to be easily tolerated and may well affect the folding rate of the domain. Analysis of c.1834G>T, underlying p.(Ala612Ser), with SplicePort suggests that it reduces the affinity score at the exon 12 splice donor site to 34% of normal (Table 2). This site already has a low affinity, and a further reduction may interfere with normal splicing. However, it should be noted that NNSSP and NetGene2 did not identify this reduction. Overall, this variant is most likely to be pathogenic because of its influence on the structure of the peptide.
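As a worked check of the conservation comparison, the short Python sketch below averages the four ScoreCons values listed for residues 611–614 and compares the result with the two averages quoted in the text. The variable names are illustrative only, and a simple arithmetic mean is an assumption of this sketch rather than a documented step of the analysis.

```python
from statistics import mean

# ScoreCons values listed in the text for residues 611-614.
core_scores = {611: 0.708, 612: 0.708, 613: 0.780, 614: 0.951}

DOMAIN_MEAN = 0.72   # beta-propeller domain average quoted in the text
OVERALL_MEAN = 0.64  # whole-peptide average quoted in the text

# Simple arithmetic mean over the four-residue window (sketch assumption).
window_mean = mean(core_scores.values())

print(f"window mean: {window_mean:.3f}")
print("above domain average:", window_mean > DOMAIN_MEAN)
print("above overall average:", window_mean > OVERALL_MEAN)
```

On these figures the four-residue window averages about 0.79, above both the domain and the overall averages, which is consistent with the description of this element as highly conserved.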
p.(Thr766Ala): Substitution of threonine766 in the O-linked glycosylation region will remove a sugar residue from the mature peptide. Furthermore, this substitution replaces a highly hydrophobic residue with the less hydrophobic alanine (hhHydrophobicity 0.52 and 0.11, respectively). However, this variant may not be pathogenic, as deletion of this region has been shown to have no effect on receptor function (Davis et al., 1986), although it is thought to maintain the LDL-binding domains at an appropriate distance from the cell surface (Hussain et al., 1999).
p.(Val797Met): Replacement of the strongly hydrophobic (hhHydrophobicity = –0.31) valine797 in the core of the transmembrane domain with the nominally hydrophobic (hhHydrophobicity = –0.1) methionine is likely to affect insertion of the peptide into the membrane and helix formation, thereby inhibiting anchoring of LDLR in the cell membrane. The substitution c.2389G>A, responsible for p.(Val797Met), affects the last nucleotide of exon 16; analysis of this variant with the splicing programs revealed that the affinity scores were reduced at the exon 16 donor site (0% SplicePort, 89% NNSSP and NetGene2) (Table 2). This is in agreement with experimental findings from the reporting group (Bourbon et al., 2009). Therefore, this variant is probably pathogenic because of its effect on splicing. Similarly, although the substitution c.2389G>T (p.(Val797Leu)) was predicted by the PolyPhen and SIFT programs to be nonpathogenic (Table S3), it is also predicted to disrupt normal splicing (0% SplicePort, 82% NNSSP, 63% NetGene2) and therefore to be pathogenic.
p.(Val800Asp): The third transmembrane domain variant, p.(Val800Asp), is also predicted to be pathogenic, as valine800 (hhHydrophobicity = –0.31) is replaced with the hydrophilic aspartic acid, thereby potentially disrupting LDLR insertion into the membrane.
Sixteen of the 38 intronic variants reported in this study are likely to be nonpathogenic, as in silico analysis reveals that their affinity scores either differed only slightly from wild-type (n = 14) or were the same as wild-type (n = 2) (Table S5). Conversely, 18 intronic variants are predicted to be pathogenic, as they resulted in complete destruction of the wild-type splice site (Table S5). Cryptic splice sites could be activated by two of these variants. The variant c.1358 + 1G>T destroys the exon 9 splice donor site; were the cryptic splice site to be used, it would result in a frame-shift, p.(Ser453fs*2). Similarly, variant c.1846–2A>C destroys the wild-type exon 13 splice acceptor site and may activate a cryptic splice acceptor site at c.1907; use of this cryptic site would also cause a frame-shift, p.(Glu615fs*30). The potential use of cryptic splice sites in these variants would have pathogenic effects.
Four variants gave weaker evidence for pathogenicity with the splice prediction programs. The affinity scores for variant c.313 + 6T>C were reduced at the exon 3 donor site (Table S5). Bourbon et al. (2009), who reported this variant, showed experimentally that it resulted in the skipping of exon 3, and so it is probably pathogenic. In silico analysis of the variant c.941–13T>A showed reduced affinity scores at the exon 7 splice acceptor site and the creation of a novel splice acceptor (Table S5); if this site were used, exon 7 would start at c.941–11, resulting in a reading frame-shift and termination of translation after 60 codons (p.(Gly314fs*60)). Similarly, c.2547 + 3G>C results in reduced affinity scores, and there is a cryptic site nearby which, if used, would also result in a reading frame-shift (p.(Ser849fs*6)) (Table S5). However, this variant has only been reported in the presence of c.798T>A p.D266E, which is itself pathogenic (Chmara et al., 2010), and so c.2547 + 3G>C may not be pathogenic.
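The frame-shift reasoning for a novel acceptor site rests on simple arithmetic: if exon 7 were to start at c.941–11, the 11 intronic nucleotides at positions c.941–11 to c.941–1 would be added to the transcript, and 11 is not a multiple of three. A minimal Python sketch of that check follows; the function name is hypothetical and the sketch considers only the net change in transcript length, not the resulting codon sequence or the position of the new stop codon.

```python
def preserves_reading_frame(net_nucleotide_change: int) -> bool:
    """Return True when the net number of nucleotides gained (positive)
    or lost (negative) in the transcript is a multiple of three."""
    return net_nucleotide_change % 3 == 0


# c.941-13T>A: a novel acceptor at c.941-11 would add 11 intronic nucleotides.
extra_nucleotides = 11
print("c.941-13T>A in frame:", preserves_reading_frame(extra_nucleotides))

# For contrast, an insertion of 27 amino acids corresponds to 27 * 3 = 81
# nucleotides, which leaves the downstream reading frame intact.
print("27-codon insertion in frame:", preserves_reading_frame(27 * 3))
```

An in-frame insertion or deletion can still be pathogenic, so this check only separates frame-shifting changes from those that need further structural analysis.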
The affinity scores at the exon 10 acceptor site for the variant c.1359–5C>G were reduced to 74% (NNSSP and SplicePort) and 99% (NetGene2) of normal, which is inconclusive (Table S5). However, the reporting group (Bourbon et al., 2009) demonstrated in vitro that intron 9 is retained in the transcript from this variant; it is therefore predicted to be pathogenic.
Although the affinity scores at the exon 14 donor site remained unchanged in the variant c.2140 + 86C>G, a novel site was created. According to Kulseth et al. (2010), this site is used and inserts 27 amino acids into the LDLR peptide; the variant would therefore be pathogenic.
Small DNA rearrangements
Thirty-seven of the small rearrangements are considered to be pathogenic as they result in a reading frame-shift (Table S6). The eight in-frame small rearrangements reported in this study were subjected to in silico analysis in an attempt to predict whether they affect LDLR function (Table 3).
p.(Gly219_Pro220del), p.(Asp221_Asp227dup), p.(Asp224_Lys225delinsPhe) and p.(Asp227del): These variants affect residues within one of two hairpin loops in the class 5A repeat of the LDLR ligand-binding domain, which is stabilised by three disulphide bonds (cysteine residues 197 and 209, 214 and 231, 204 and 222). This domain includes the highly conserved D-x-S-D-E motif, spanning positions 224–228 (Kurniawan et al., 2000; Jeon et al., 2001; Jeon & Blacklow, 2005). Furthermore, residues 221 and 227 are among the four most highly conserved acidic residues directly involved in calcium coordination in LDLR (Kurniawan et al., 2000; Jeon et al., 2001; Jeon & Blacklow, 2005). Although the glycine and proline residues in p.(Gly219_Pro220del) are moderately and poorly conserved, respectively (ScoreCons 0.685, 0.416) (Fig. 3Ai), their deletion would alter the structure, disrupt H-bonding and prevent formation of the hairpin loop, which requires a glycine (Fig. 3Aii). This would be likely to result in disruption of the local structure and opening up of the ends of the beta sheet strands linked by the loop. These residues also lie approximately 6 Å from a calcium binding site and are likely to be required for its correct formation (Figs 3Ai and ii). Variants p.(Asp221_Asp227dup) and p.(Asp227del) will both destroy the D-x-S-D-E motif and are likely to distort the hairpin loop (Figs 3B and D). In the variant p.(Asp224_Lys225delinsPhe), not only are the highly conserved aspartic acid and the moderately conserved lysine deleted (Fig. 3Ci), but they are also replaced with the neutral and hydrophobic phenylalanine. As with the other variants in this region, their replacement is likely to disrupt calcium binding, which will in turn affect protein folding, break the hairpin loop and thereby alter the conformation (Fig. 3Dii). We suggest that these four variants are probably pathogenic.
Figure 3. 3D diagrammatic representation of conservation score (residues 22–720), coloured to represent conservation scores (red: high, purple: moderate, blue: poor and black: no conservation). The residue of interest in each model is shown in green, which does not reflect its level of conservation; the orientation was chosen to best show the residue of interest. (A) Glycine219 and proline220 are located on the surface of the LDLR protein. Glycine219 is moderately conserved and proline220 poorly conserved (inset i); both residues surround a calcium cation and form part of a hairpin (inset ii). (B) Residues from aspartate221 to aspartate227 are located on the surface of the protein, are mostly highly conserved (inset i, within the oval) and surround a calcium cation (inset ii). (C) Aspartic acid 224 is highly conserved and lysine225 moderately conserved (inset i); both are located on the surface of the protein and surround a calcium cation (inset ii). (D) Aspartic acid 227 is located on the surface of the protein, is highly conserved and binds the calcium cation (inset i).
p.(Trp562dup): Tryptophan562, buried within the peptide (Fig. 4A), is highly conserved and close to a calcium cation (Figs 4Ai–iii). Duplication of this residue in the variant p.(Trp562dup) is likely to disrupt coordination of the calcium cation (Figs 4Ai–iii) and is therefore probably or possibly pathogenic.
Figure 4. 3D diagrammatic representation of conservation score (residues 22–720), coloured to represent conservation scores (red: high, purple: moderate, blue: poor and black: no conservation). The residue of interest in each model is shown in green, which does not reflect its level of conservation; the orientation was chosen to best show the residue of interest. (A) Tryptophan562 is located deep in the protein but is visible. It is highly conserved (inset i) and is located in the EGF domain (inset ii). (B) Glycine593 is located on the surface of the protein and is moderately conserved (inset i); it is at the end of a strand (inset ii). (C) Isoleucine624 is located on the surface of the protein, is poorly conserved (inset i) and forms part of a hairpin loop (inset ii).
p.(Gly593del) and p.(Ile624del): Glycine593 and isoleucine624 are both in the β-propeller domain and form part of a hairpin loop in the fifth and sixth propeller blades, respectively (Figs 4B and C). These blades pack tightly against the C-terminal EGF module (Jeon & Blacklow, 2005), and deletion of these residues could disrupt the packing of these propeller blades, ultimately affecting displacement of the ligand from the ligand-binding region. Overall, we suggest that both of these variants are probably pathogenic.
p.(Leu799del): Leucine799 lies within the transmembrane domain of LDLR, and deletion of this residue would shorten the transmembrane domain and may alter its structural conformation. This variant may therefore affect insertion of the peptide into the membrane and is probably pathogenic.