Screening of known disease genes in congenital scoliosis

Abstract Background Congenital scoliosis (CS) is defined as a lateral curvature of the spine due to the vertebral malformations and has an incidence of 0.5–1/1,000 births. We previously examined TBX6 in Japanese CS patients and revealed that approximately 10% of CS was caused by TBX6 mutations. However, the genetic cause of remaining CS is unknown. Methods We recruited 78 CS patients without TBX6 mutations and major comorbidities, and investigated the genes previously reported to be associated with CS and congenital vertebral malformations by whole‐exome sequencing. Results We identified the compound heterozygous missense variants in LFNG in one patient. No likely disease‐causing variants were identified in other patients, however. LFNG encodes a GlcNAc‐transferase. The LFNG variants showed loss of their enzyme function. Conclusions A LFNG mutation is reported in a case of spondylocostal dysostosis (SCD), a skeletal dysplasia with severe malformations of vertebra and rib. The CS patient with LFNG mutations had multiple vertebral malformations including hemivertebrae, butterfly vertebrae, and block vertebrae, and rib malformations. LFNG mutations may cause a spectrum of phenotypes including CS and SCD. The current list of known disease genes could explain only a small fraction of genetic cause of CS.


Funding information
Japan Agency for Medical Research and Development (AMED); Japan Orthopaedics and Traumatology Foundation; Japan Society for the Promotion of Science, Japan, Grant/Award Number: 16K08251 Results: We identified the compound heterozygous missense variants in LFNG in one patient. No likely disease-causing variants were identified in other patients, however. LFNG encodes a GlcNAc-transferase. The LFNG variants showed loss of their enzyme function. Conclusions: A LFNG mutation is reported in a case of spondylocostal dysostosis (SCD), a skeletal dysplasia with severe malformations of vertebra and rib. The CS patient with LFNG mutations had multiple vertebral malformations including hemivertebrae, butterfly vertebrae, and block vertebrae, and rib malformations. LFNG mutations may cause a spectrum of phenotypes including CS and SCD. The current list of known disease genes could explain only a small fraction of genetic cause of CS.

K E Y W O R D S
congenital scoliosis, LFNG, spondylocostal dysostosis, whole-exome sequencing

| INTRODUCTION
Congenital scoliosis (CS) is defined as a lateral curvature of the spine due to vertebral malformations; its incidence is 0.5-1/1,000 births worldwide (Giampietro, 2012;Giampietro et al., 2013;Liljenqvist, 2004;McMaster & Ohtsuka, 1982). Segmentation of the vertebrae occurs between 20 and 35 days after conception in human embryonic development (Turnpenny et al., 2007). Somites, the precursors to the ribs, striated muscle, and dermis of the back, are also formed during this period. This process is called somitogenesis. Disruptive perturbation of somitogenesis contributes to congenital vertebral malformations leading to CS. CS is well known to be a heterogeneous disorder ranging from nonsyndromic to possibly syndromic CS who has various major comorbidities (Turnpenny et al., 2007).
Perturbation of somitogenesis causes not only vertebral malformations but also rib malformations. Therefore, many CS patients also have rib malformations. Such a severe form of CS associated with rib malformations is referred to spondylocostal dysostosis (SCD) (Bulman et al., 2000;Giampietro et al., 2006;Sparrow et al., 2006;Sparrow, Guillen-Navarro, Fatkin, & Dunwoodie, 2008;Whittock et al., 2004). While CS is the view from the orthopedic aspect of the disorder, SCD considers developmental abnormalities of the skeleton (Supporting Information Figure S1). Therefore, some SCD have only mild scoliosis due to the balanced abnormality of the spine even though they have severe vertebral abnormalities. SCD is included in the nosology and classification of genetic skeletal disorder: 2015 revision as the group 35, "Dysostoses with predominant vertebral with and without costal involvement" (Bonafe et al., 2015). Severe spine and rib malformations of CS and SCD usually affect the thoracic growth and function adversely, and lead to thoracic insufficiency syndrome (Campbell et al., 2003;Flynn et al., 2013). Early diagnosis and intervention are mandatory to keep the deformity under control.
We previously examined TBX6 in undiagnosed Japanese CS patients and found that approximately 10% of CS was caused by TBX6 mutations (Takeda et al., 2017). The incidence of TBX6-associated CS was similar worldwide (Lefebvre et al., 2017;Takeda et al., 2017;Wu et al., 2015). However, the genetic cause of CS other than TBX6 remains unknown. Therefore, we investigated the mutations in genes associated with the development of spine in Japanese CS by WES. We identified the compound heterozygous missense mutations in LFNG in one CS patient. These mutations in LFNG lead to loss of its enzyme function.

| Subjects
We recruited 196 Japanese CS patients who were seen at participating hospitals. They received clinical and radiological examinations by expert spinal surgeons and pediatricians. Patients who have known syndromic scoliosis, including Alagille syndrome, Goldenhar's syndrome, Jarcho-Levin syndrome, Klippel-Feil syndrome, SOTOS syndrome, and VACTERL association, were excluded. Genomic DNA was extracted from peripheral blood leukocytes using standard methods or from saliva using Oragene DNA collection kit (DNA Genotek, Ottawa) according to the manufacturer's protocol. Informed consent was obtained from the patients and their parents. The study was approved by the ethical committee of all participating hospitals and RIKEN.

| WES
TBX6 mutation was examined as previously described (Takeda et al., 2017). We conducted WES as previously described Wang et al., 2017). Briefly, approximately 3 μg DNA was sheared and used for a SureSelect Human All Exon V5 library (Agilent Technologies, Santa Clara, CA) according to the manufacturer's instructions. Patients' DNA samples were sequenced on a HiSeq2000 (Illumina, San Diego, CA) with 101-bp pairedend reads.

| DNA sequencing of LFNG
The genomic region of the LFNG gene including two missense variants (c.467T>G and c.856C>T) was PCR-amplified using KOD Fx and primers 5′-CCAGTGCGGCC-GCCTGGGTGCTGTGTCAATAAGGGT-3′ (forward) and 5′-CCAGTGAATTCATACCGTAGCTCAGCGTCACCTG-3′ (reverse). The PCR amplicons were cloned into the NotI and EcoRI sites of the pBluescript SK(-) cloning vector (Agilent Technologies). After ligation, the vectors and insert DNA fragments were transformed to E.coli competent cells. The bacterial colonies were collected and verified by Sanger sequencing. LFNG variants were detected by direct sequencing of PCR products using KOD Fx (Toyobo, Tokyo) and primer sets using a 3730xl DNA analyzer (Applied Biosystems, Foster City, CA). Nucleotide numbering uses +1 as the A of the ATG translation initiation codon in the reference sequence, with the initiation codon as codon 1.

| Expression of the soluble form of the LFNG proteins and enzyme assay of their GlcNAc activity
The expression vector of human LFNG was constructed as described previously with slight modifications (Rampal et al., 2005). A truncated form of LFNG (wild-type) was amplified from human placenta cDNA (BioChain, Newark, CA) by two-round PCR. The first PCR was performed with a forward primer, 5′-GCTGCT CGCCTGCCTG CTGGTGCTCA-3′, and a reverse primer, 5′-CTCCACA-GAGCA GAACTGCCCAGCAGCCT-3′, followed by nested PCR with nested primers: a forward primer containing an in-frame EcoRI site located 259-bp downstream from the initial codon, 5′-CCGGAATTCA-GATGCGGGCCCGCCGCCCGG-3′, and a reverse primer containing a KpnI site located 3-bp downstream from the stop codon, 5′-CGGGGTACCGATTGGGTCTCAGC-CATGGC-3′. Each PCR was conducted with KOD-Plus DNA polymerase (Toyobo). A DNA fragment, which encoded the human LFNG lacking the first 86 amino acids including the predicted transmembrane domain, was subcloned into a p3xFLAG-CMV8 (Sigma, St. Louis, MO) vector, resulting in the fusion of LFNG (wild-type) to the preprotrypsin leader sequence and the 3xFLAG tag sequence at the N-terminus present in the vector.
Site-specific mutagenesis of LFNG to produce the p.Leu156Arg and p.Arg286Trp mutants was performed using two rounds of PCR. The first PCR was performed with a 5′-primer containing an in-frame EcoRI site described above and a 3′-internal mutagenic oligonucleotide primers (5′-GCCCGTGTGCCTGGCCCG GGCCT CATCTTCCCC-3′ or 5′-GCAGTCATCAGGCAGCCAGA TCCGCTC AGCCGT-3′ for p.Leu156Arg or p.Arg286Trp, respectively), or 5′-internal mutagenic oligonucleotide primer (5′-GGGGAAGATGAGGCCCGGGCCAGGCACACG GGC-3′ or 5′-ACGGCTGAGCGGATCTGGCTGCCT-GATGACTGC-3′ for p.Leu156Arg or p.Arg286Trp, respectively) and 3′-primer containing an EcoRV site described above, and p3xFLAG-CMV8/LFNG (wild-type) as a template. The second PCR was performed with a 5′and 3′-primers containing a HindIII and an EcoRV sites described above and KOD-Plus polymerase, and the first PCR products as a template. The amplified fragments were digested with EcoRI and KpnI, inserted into p3xFLAG-CMV8, and sequenced using a 3730xl DNA Analyzer.
The expression plasmid was transfected into HEK293T cells on a T-75 flask using FuGENE 6 HD (Promega, Madison, WI). Three days after transfection, 3 ml each of the culture medium and the cells was collected. The cells were lysed with phosphate-buffered saline containing 0.5% Tri-tonX-100 and protease inhibitor cocktail (Roche, Basel). Those conditioned media and cell lysates were incubated with 10 µl of anti-FLAG M2 agarose resin (Sigma) for 2 hr and overnight, respectively, at 4°C. The beads were washed with 1 ml of 50 mM Tris-HCl, pH 7.5 containing 150 mM NaCl, 0.02% Tween-20, and then analyzed on a 10% sodium dodecyl sulfate (SDS)-polyacrylamide gel, transferred to a nitrocellulose membrane (Bio-Rad), and incubated for 1 hr with anti-FLAG M2 antibody (Sigma). The bound antibody was detected with anti-mouse IgG conjugated with a fluorescent dye, IRDye 680RD (Li-Cor, Lincoln, NA) using ODYS-SEY CLx (Li-Cor). The amount of recombinant LFNG protein was estimated by standard curve from fluorescent intensity of 3xFLAG-tagged bovine alkaline phosphatase (Sigma) using Image Studio (ver 4.0, Li-Cor).
GlcNAc-transferase activity was examined as described previously (Rampal et al., 2005). Briefly, the GlcNAc-transferase assay mixture contained 10 µl of enzyme-bound anti-FLAG affinity resins, 50 mM 2-(N-morpholino) ethanesulfonic acid-NaOH (pH 6.5), 10 mM MnCl 2 , 0.1 mM UDP-GlcNAc (ultrapure grade, Promega) as the sugar donor substrate, and 1 mM p-nitrophenyl-α-L-fucose (pNP-Fuc, Sigma) as the sugar acceptor in a total volume of 50 µl. The reaction mixture was incubated at 37°C for 2-4 hr. The reaction product, UDP moiety released from UDP-GlcNAc, was mixed with UDP detection reagent, which contains an enzyme converting UDP to ATP, in UDP-Glo ™ Glycosyltransferase Assay kit (Promega). The newly synthesized ATP to be measured using a luciferase/luciferin reaction and the luminescent signals were detected using a luminometer, Victor X4 (PerkinElmer, Waltham, MA).

| Clinical features of the CS patients
We found TBX6 mutations in 15 out of 196 CS patients. From the patients without TBX6 mutations, we selected 78 nonsyndromic CS patients (Supporting Information  Table S1) and examined them by using WES. Their mean age was 12.0 ± 5.4 years old. Ten patients had family histories of scoliosis; two had CS in their families. The mean Cobb angle of the main curve was 43.0°± 13.4°. Eight patients had various comorbidities. Various types of rib malformations were found in 23 patients; one irregular shaped, 11 fused, six missing, and five additional ribs.

| WES
The mean depth of coverage for reads was 74.3×, and 94.3% of the targeted bases had more than 20 reads on average (Supporting Information Table S2). By evaluation of the variants for their pathogenicity, we identified a CS patient (S1289) who had two likely disease-causing variants in LFNG: c.467T>G (p.Leu156Arg) and c.856C>T (p.Arg286Trp) ( Table 1). The sequences of the variants were confirmed by Sanger sequencing (Figure 1a,b). We did not identify any likely disease-causing variants in other genes (PAX1, SLC35A3, T, DYNC1H1, DLL3, HES7, DLL1, MESP2, and RIPPLY2). The coverages of the exons of these genes which had more than 10 reads on average were over 90% (Supporting Information Table S3).
Because the parents' DNAs were unavailable, we examined the haplotype of the two missense variants in S1289 by a long-range PCR followed by sequencing the subcloned amplicons. We found that S1289 was compound heterozygotes for the two missense variants.

| Functional analysis of LFNG variants in vitro
To examine whether the LFNG missense variants, p.Leu156Arg and p.Arg286Trp cause a reduction of the GlcNAc-transferase activity, the LFNG proteins in soluble TAKEDA ET AL.

| 969
forms with a 3xFLAG epitope were generated by replacing the first 86 amino acids of LFNG with a cleavable pre-protrypsin signal sequence. The soluble 3xFLAG-tagged LFNG proteins were expressed in HEK293T cells. The recombinant enzymes secreted in the medium were adsorbed onto anti-FLAG antibody-bound beads to eliminate endogenous LFNG and to facilitate the purification of the enzyme. The expression levels of wild-type and a mutant enzyme (p.Arg286Trp-and p.Leu156Arg-LFNG) in the cell fraction were comparable between the wild-type and mutants (Figure 2a). GlcNAc-transferase activities of both mutants from cell lysate were significantly reduced compared to that of the wild-type LFNG (Figure 2b). These data indicate that the two LFNG variants lead to loss of the enzyme function. By contrast, the expression of wild-type and a mutant enzyme, p.Arg286Trp-LFNG, but not p.Leu156Arg-LFNG, was detected in the conditioned medium (Supporting Information Figure S2a). These observations suggest that the secretion pathway of p.Leu156Arg-LFNG might be impaired. The GlcNAc-transferase activity of p.Arg286Trp-LFNG from the conditioned medium was also drastically decreased compared to that of the wild-type LFNG (Supporting Information Figure S2b).

| Clinical features of the patient with
LFNG mutations S1289 was a 16 years old male who was born as a child of nonconsanguineous healthy parents. He presented with mild short stature (−2.1 SD) and had no comorbidity (Supporting Information Table S1). He had severe scoliosis and was diagnosed with CS. The main Cobb angle was 70°at T9-L3 (Figure 3a). He had multiple vertebral malformations: hemivertebrae at T5, L1, and L4 (one segmented type and two nonsegmented type), block vertebra at T8-T9, and butterfly vertebrae (nonsegmented type) at T7, T12, and L5 (Figure 3b). Multiple rib malformations, including hypoplasia of bilateral 2nd ribs and left 6-7th, 8-9th, right 5-6th, 7-8th, and 9-10th fused ribs, were also found.

| DISCUSSION
Our WES identified a CS patient who is compound heterozygous for two novel missense variants of LFNG. Only one LFNG mutation (c.564C>A, p.Phe188Leu) has previously been reported, which causes SCD3 (MIM# 609813) in the homozygous state (Sparrow et al., 2006). Both of the variants we found were rare and were predicted deleterious in all bioinformatic tools examined. LFNG encodes an O-fucosylpeptide 3-β-N-acetylglucosaminyltransferase that adds Nacetylglucosamine (GlcNAc) residues to O-fucose on the EGF-like repeats of Notch receptors (Bruckner, Perez, Clausen, & Cohen, 2000;Moloney et al., 2000). Our variants and the previously reported mutation are all located in highly conserved region that encodes a glucosaminyl-transferase; particularly, the Arg286 residue where we found the mutation was conserved in all known fringe proteins from Drosophila melanogaster to human (Correia et al., 2003). Although these in silico evaluation data strongly suggested that the two variants are disease-causing mutations, it has been reported that to predict the deleterious effect of missense variants in silico is sometimes difficult (Majithia et al., 2016;Takeda et al., 2017). Therefore, we examined the effect of the variants in vitro and revealed that both lead to T A B L E 1 In silico evaluation of pathogenicity of the missense variants identified in S1289   Figure S2a). It is reported that p.Phe188Leu mutation was also mislocalized and enzymatically inactive by a functional assay (Sparrow et al., 2006).

| 971
The CS patient who carried the LFNG mutations (c.467T>G; p.Leu156Arg and c.856C>T; p.Arg286Trp) had severe scoliosis due to interspersed vertebral malformations from thoracic to lumbar spine together with multiple rib malformations, while the SCD patient who carried the LFNG mutation (c.564C>A; p.Phe188Leu) had nonprogressive scoliosis but more extensive and a large number of congenital vertebral malformations like "pebble beach" from cervical to lumbar spine as well as severe short trunk and finger abnormalities (Sparrow et al., 2006;Turnpenny et al., 2003). Thus, LFNG mutations are considered to cause a spectrum of vertebral malformation recognized as CS and/or SCD (Supporting Information Figure S1). Such phenotypic variation including CS and SCD is also known in vertebral malformation caused by TBX6 mutations (Lefebvre et al., 2017;Takeda et al., 2017). Further, accumulation of the patients with LFNG mutations is mandatory to conclude the phenotypic range of the LFNG mutation and genotype-phenotype association.
In the screenings of SCD genes, known gene mutations have been found in 20%-25% of the patients examined; 60% of SCD is caused by DLL3 mutations (Bonafe, Giunta, Gassner, Steinmann, & Superti-Furga, 2003;Turnpenny et al., 2003;Turnpenny, Sloman, & Dunwoodie, 1993;Whittock et al., 2004). In contrast, our screening of known disease genes for CS and vertebral malformation could only find the causal genes in <10% of nonsyndromic CS patients who had no major comorbidities; CS in our cohort did not have the mutations in known genes causing CS and SCD other than TBX6 and LNFG. CS is a very heterogeneous disorder; the type, location, number of vertebral malformations, and comorbidities are different between the patients (Giampietro et al., 2013). On the other hand, CS caused by the TBX6 mutation has been reported to have specific abnormality of the vertebra (Takeda et al., 2017;Wu et al., 2015). The vertebral anomaly along the whole length of the spine with the vertebrae of rounded shape ("pebble beach" sign) is a feature of SCD1 (MIM# 277300), which is caused by DLL3 mutations (Turnpenny et al., 2003). Their clinical differences suggest considerable genetic heterogeneity of CS. In the current study, we investigated the limited number of genes that are considered to be associated with the development of spine. Most of them are related to the Notch signal. Somitogenesis of vertebrate species relies on the intersecting gradients and cross-regulatory activities of various signal transduction pathways, not only Notch signaling but also Fgf, Wnt, and retinoic acid signaling pathways (Gibb, Maroto, & Dale, 2010;Wahi, Bochter, & Cole, 2016). It was reported that mutations in Wnt3a identified as a negative regulator on Notch signaling and somitogenesis associated with congenital vertebral malformations (Aulehla et al., 2003;Hayward, Kalmar, & Arias, 2008). Therefore, further comprehensive approach might identify the causal genes for CS. However, our preliminary screening of unknown CS gene(s) for the exome data did not present a possible causal gene that had likely deleterious mutation in more than two cases. To discover the unknown causal genes for the rare diseases by WES is still challenging and tends to come up against the N = 1 problem (Akawi et al., 2015). It would be mandatory to increase the sample size and evaluate the phenotype well.