Application of targeted panel sequencing and whole exome sequencing for 76 Chinese families with retinitis pigmentosa

Abstract
Background: This study aimed to identify the gene variants and molecular etiologies in 76 unrelated Chinese families with retinitis pigmentosa (RP).
Methods: In total, 76 families with syndromic or nonsyndromic RP, diagnosed on the basis of clinical manifestations, were recruited for this study. Genomic DNA samples from probands were analyzed by targeted panels or whole exome sequencing. Bioinformatics analysis, Sanger sequencing, and available family member segregation were used to validate sequencing data and confirm the identities of disease‐causing genes.
Results: The participants enrolled in the study included 62 families that exhibited nonsyndromic RP, 13 that exhibited Usher syndrome, and one that exhibited Bardet–Biedl syndrome. We found that 43 families (56.6%) had disease‐causing variants in 15 genes, including RHO, PRPF31, USH2A, CLRN1, BBS2, CYP4V2, EYS, RPE65, CNGA1, CNGB1, PDE6B, MERTK, RP1, RP2, and RPGR; moreover, 12 families (15.8%) had only one heterozygous variant in seven autosomal recessive RP genes, including USH2A, EYS, CLRN1, CERKL, RP1, CRB1, and SLC7A14. We did not detect any variants in the remaining 21 families (27.6%). We also identified 67 potential pathogenic gene variants, of which 24 were novel.
Conclusion: The gene variants identified in this study expand the variant frequency and spectrum of RP genes; moreover, the identification of these variants supplies foundational clues for future RP diagnosis and therapy.

Thus far, 98 genes (33 for syndromic RP and 65 for nonsyndromic RP) and 9 loci (3 for syndromic RP and 6 for nonsyndromic RP) are known to cause RP. More than 3,000 gene variants are responsible for nonsyndromic RP (Guadagni, Novelli, Piano, Gargini, & Strettoi, 2015). The underlying molecular etiologies involve the phototransduction cascade and retinal transcription factors associated with the phototransduction cascade, as well as ribonucleic acid splicing machinery, retinal metabolism, retinal cell structure, ciliary structure, and ciliary function (Veleri et al., 2015). Most genes associated with RP are expressed in rod photoreceptors, whereas a small number are expressed in the retinal pigment epithelium (Koch et al., 2012). Next-generation sequencing (NGS) technology, together with bioinformatics and computing technologies, has undergone rapid development; accordingly, low-cost, high-throughput, highly efficient DNA sequencing has enabled accurate diagnosis and precise assessment of patient prognosis. Inherited genetic diseases are increasingly diagnosed accurately using NGS technology (Bamshad et al., 2011; Bell et al., 2011; Neuhaus et al., 2017; Yang et al., 2013). However, it remains a considerable challenge to identify disease-causing genes with NGS technology (Bainbridge et al., 2008). Inherited gene variants are reportedly responsible for only 60% of known cases of RP (Huang et al., 2017; Xu et al., 2014; Zhang, 2016); thus, the disease-causing gene is unknown in a substantial proportion of affected individuals. It is imperative to determine the genetic etiology of RP and provide guidance for efficient molecular diagnosis.
In this study, we enrolled 76 families with syndromic or nonsyndromic RP. All probands were evaluated using NGS technology. Through functional prediction, Sanger sequencing, and segregation analysis, we found that 43 families (56.6%) had disease-causing variants in 15 genes, while 12 families (15.8%) had only 1 heterozygous variant in 7 autosomal recessive RP (arRP) genes. We also identified 67 potential pathogenic gene variants, of which 24 have not been previously described.

| Ethical compliance
The research protocol was approved by the medical ethics committee of Renmin Hospital of Wuhan University and carried out in accordance with the tenets of the Declaration of Helsinki. Written informed consent was obtained from each participant or their guardian (for participants who were children) prior to the study. All participants were consecutively recruited at Renmin Hospital of Wuhan University (Hubei, China), which is located in central China.

| Clinical testing
A detailed family history was obtained from the proband or the proband's family members. All participants received comprehensive ophthalmological examinations, including best-corrected visual acuity, refractive error measurement, slit lamp examination, intraocular pressure measurement, and funduscopy. Participants who agreed to additional ophthalmological examinations underwent fundus photography, visual field assessment, optical coherence tomography (OCT), and full-field electroretinography (ERG). High-resolution fundus photographs were obtained with a VISUCAM 200 digital fundus camera (Carl Zeiss Meditec AG, Jena, Thuringia, Germany). Visual field assessment was performed using a Humphrey HFA II-750 (Carl Zeiss Meditec AG). OCT was performed using an AngioVue® Imaging System (Optovue). ERG was recorded using an Espion system (Diagnosys) in accordance with the standards and methodology of the International Society for Clinical Electrophysiology of Vision (McCulloch et al., 2015). Participants who exhibited hearing loss or carried gene variants indicative of Usher syndrome underwent hearing examinations using an ITERA sonometer (Otometrics, DK-2630).

| Targeted panel sequencing and whole exome sequencing
Genomic DNA was analyzed with targeted panel sequencing (six panels containing 70, 316, 78, 370, 429, and 386 genes, respectively) or whole exome sequencing (WES). Genes included in the panels are listed in Text S1; these genes are primarily responsible for inherited retinal dystrophy. Genomic DNA was isolated from leukocytes of venous blood samples using the QIAamp DNA Blood Midi Kit (Qiagen) or TIANamp Blood DNA Midi Kit (TIANGEN Biotech), in accordance with the manufacturer's standard protocol. Library preparation was performed using the Ion AmpliSeq™ Library Kit 2 or SureSelect Exome V5 Capture library, in accordance with the manufacturer's instructions (Biswas et al., 2017; Chen et al., 2013; Javadiyan et al., 2018). Sequencing was performed on an Ion Torrent PGM (Life Technologies) or HiSeq (Illumina) platform.

| Sanger sequencing and segregation analysis
Raw reads were filtered and the selected variants were subjected to validation and segregation analyses. Polymerase chain reaction was used to amplify gene fragments that included the variants. Primers were designed with Primer3 (http://primer3.ut.ee/); primers used for Sanger sequencing are listed in Table S2. The amplicons were sequenced using a 3500xL Dx Genetic Analyzer (Applied Biosystems, Foster City, CA, USA) with the ABI BigDye Terminator v3.1 Cycle Sequencing Kit. The proband sequences and corresponding consensus sequences (obtained from the NCBI Human Genome Database, https://www.ncbi.nlm.nih.gov/) were analyzed using the SeqMan II software of the Lasergene software package (DNASTAR). DNA samples of all probands and their available family members were subjected to Sanger sequencing and segregation analysis based on the inheritance pattern.
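The logic of segregation analysis by inheritance pattern can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Member` record, the model labels, and the example family are hypothetical. The sketch assumes full penetrance, assumes that two candidate alleles in an autosomal recessive gene lie in trans, and treats female carriers of an X-linked variant as unaffected. It checks only whether genotype and phenotype agree across family members, not whether alleles were transmitted from the parents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """One genotyped family member (hypothetical record layout)."""
    name: str
    sex: str          # "M" or "F"
    affected: bool
    alt_alleles: int  # candidate variant alleles carried in the gene (0, 1, or 2)


def expected_affected(model: str, member: Member) -> bool:
    """Predict affection status from genotype, assuming full penetrance."""
    if model == "AD":   # autosomal dominant: one allele is sufficient
        return member.alt_alleles >= 1
    if model == "AR":   # autosomal recessive: two alleles, assumed in trans
        return member.alt_alleles >= 2
    if model == "XL":   # X-linked: hemizygous males, homozygous females
        needed = 1 if member.sex == "M" else 2
        return member.alt_alleles >= needed
    raise ValueError(f"unknown inheritance model: {model}")


def segregates(model: str, family: list[Member]) -> bool:
    """True if every member's phenotype matches the genotype-based prediction."""
    return all(expected_affected(model, m) == m.affected for m in family)


# Hypothetical family: affected proband with two alleles, carrier relatives.
family = [
    Member("proband", "M", affected=True, alt_alleles=2),
    Member("father", "M", affected=False, alt_alleles=1),
    Member("mother", "F", affected=False, alt_alleles=1),
    Member("sister", "F", affected=False, alt_alleles=1),
]

print(segregates("AR", family))  # consistent with recessive inheritance
print(segregates("AD", family))  # unaffected carriers contradict a dominant model
```

In this hypothetical family the recessive model is consistent with every member, whereas the dominant model is rejected because the carrier parents are unaffected.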

| Clinical manifestations
In total, 76 Chinese families of Han ethnicity were consecutively enrolled in the study. All probands complained of night blindness, constricted vision field, and impaired vision, with the exception of proband 12, who was very young. Four probands who exhibited RP beginning in childhood had complained of strabismus and nystagmus. Most probands exhibited fundus signs typical of RP, including bone spicule pigmentation, retinal vascular stenosis, and waxy-pale optic disc. The fundus photographs of probands with novel variants are shown in Figure S1. Visual field analyses showed that probands had a constricted visual field with increased mean deviation. OCT revealed severe thinning of the retinal nerve fiber layer, outer nuclear layer, and epiretinal membranes. Full-field ERG demonstrated extinguished or severely reduced dark-adapted and light-adapted responses, with significant reductions of a and b waves. Typical visual field, OCT, and ERG are shown in Figure S2. Clinical features of the 43 probands with disease-causing genes are listed in Table 1. In total, 15 probands harbored USH2A (OMIM * 608400) compound heterozygous or homozygous variants, while 1 proband harbored CLRN1 (OMIM * 606397) homozygous variants and 3 probands harbored USH2A heterozygous variants. Thirteen probands (11 probands with compound heterozygous or homozygous variants and two probands with USH2A heterozygous variants) were diagnosed with Usher syndrome. Six probands (five probands with USH2A compound heterozygous or homozygous variants and one proband with USH2A heterozygous variants) did not complain of hearing loss and did not exhibit hearing impairment in hearing examinations; they were diagnosed with nonsyndromic RP. Proband 28 had a compound heterozygous BBS2 (OMIM * 606151) variant and was diagnosed with Bardet-Biedl syndrome; he exhibited fourth toe brachydactyly in both feet, which was more severe in the right foot. 
The proband exhibited obesity, with a body mass index of 28.2 kg/m²; he refused further examinations (e.g., sperm or genital gland). Notably, he did not exhibit obvious bone spicule pigmentation in the fundus and showed no mental retardation.

| NGS results
Based on bioinformatics analysis, Sanger sequencing validation, and segregation analysis, we found that 43 families (56.6%) had disease-causing variants in 15 genes. Segregation analysis was available for 24 of the 43 families, and the variants segregated with the disease, except in Family 15 and Family 176. Two genes were associated with autosomal dominant RP (adRP) in three families with heterozygous variants; 11 genes were associated with autosomal recessive RP (arRP) in 35 families with homozygous variants (10 families) or compound heterozygous variants (25 families); and 2 genes were associated with X-linked RP (xlRP) in 5 families with hemizygous variants. The gene most frequently identified in the study was USH2A (19.7%), followed by CYP4V2 (6.6%). The gene variants of these probands are described in Table 2. The genomic information is shown in Table S3. In addition, we found that 12 families (15.8%) had only one heterozygous variant in seven arRP genes, including USH2A, EYS, CLRN1, CERKL (OMIM * 608381), RP1, CRB1 (OMIM * 604210), and SLC7A14 (OMIM * 615720); these heterozygous variants are described in Table 3. We did not detect any variants in the remaining 21 families (27.6%). The proportions of genes associated with RP in this cohort are shown in Figure 1a.
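The cohort breakdown reported above can be cross-checked with a short calculation. The sketch below uses only the family counts stated in the text; the helper `percent` and the dictionary labels are illustrative names, not part of the study's analysis.

```python
def percent(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


TOTAL_FAMILIES = 76

# Family counts as reported in the text.
outcomes = {
    "disease-causing variants": 43,
    "single heterozygous arRP variant": 12,
    "no variant detected": 21,
}
# The three groups should partition the whole cohort.
assert sum(outcomes.values()) == TOTAL_FAMILIES

for label, n in outcomes.items():
    print(f"{label}: {n} families ({percent(n, TOTAL_FAMILIES)}%)")

# Solved families by inheritance pattern; arRP = homozygous + compound heterozygous.
solved_by_inheritance = {"adRP": 3, "arRP": 10 + 25, "xlRP": 5}
assert sum(solved_by_inheritance.values()) == outcomes["disease-causing variants"]
```

The three outcome groups sum to the 76 enrolled families, and the three inheritance groups sum to the 43 solved families.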
In total, we identified 67 potential pathogenic gene variants; these included 38 missense variants (52.2%), 10 nonsense variants (16.4%), 1 small indel variant (1.5%), 10 small deletion variants (14.9%), 2 small insertion variants (3.0%), and 6 splice variants (9.0%). The proportions of all types of variants are shown in Figure 1b. Of these 67 potential pathogenic variants, 24 were novel. The pedigrees of the probands with novel variants are shown in Figure S3; the sequencing chromatographs of novel variants and corresponding wild-type alleles are shown in Figure S4. Schematic representations of the genomic structures of genes with novel variants are shown in Figure 2a. The eight novel USH2A variants were distributed irregularly among the exons of USH2A; these variants presumably affect specific domains of the USH2A protein (Figure 2b). The topology and molecular models of seven novel variants showed molecular alterations in proteins (Figure 3).

| DISCUSSION
Despite the advent of the personalized medicine era, traditional sequencing has not been able to achieve precise genetic diagnosis (Neveling et al., 2013). NGS technology is regarded as a powerful and effective tool for the detection of pathogenic gene variants underlying genetic RP (Gilissen, Hoischen, Brunner, & Veltman, 2011; Lovric et al., 2014; Riera et al., 2017; Wang et al., 2019). In this study, we used NGS technology, bioinformatics prediction, Sanger sequencing validation, and segregation analysis of available family members; we identified 43 families (56.6%) with disease-causing gene variants, compared with detection rates of 63.5%, 50%, and 58% in previous studies (Huang et al., 2018; Neveling et al., 2012; Xu et al., 2015). The detection rate of gene variants in patients with RP was higher with targeted panel sequencing and whole exome sequencing than with microarray genotyping (Avila-Fernandez et al., 2010; Blanco-Kelly et al., 2012), targeted-capture sequencing (Fu et al., 2013; Wang et al., 2014), or individual gene sequencing (Sweeney, McGee, Berson, & Dryja, 2007). In the present study, the detection rates of Usher syndrome, Bardet-Biedl syndrome, and Bietti crystalline corneoretinal dystrophy were 17.1% (13 probands), 1.3% (1 proband), and 6.6% (5 probands), respectively. Among the targeted panels, panel 5 was the most informative in Chinese patients with RP because of its relatively high detection rate (71.4%). The detection rate of novel variants among all identified variants was 35.8%, whereas the detection rates were 72.7% and 67% in previous studies (Huang et al., 2018; Xu et al., 2014). The higher novel detection rate observed in the prior studies was possibly because probands without identified gene variants were enrolled in those studies. The detection rate of variants in USH2A, the causative gene most frequently identified in this study, was 19.7% (15 probands). Among families with nonsyndromic RP, variants in USH2A were identified in 8.1% (five probands), which was higher than the rate in a study of North American families (7%) (Seyedahmadi, Rivolta, Keene, Berson, & Dryja, 2004) and the rate in a study of Spanish families (7%) (Avila-Fernandez et al., 2010). Variants c.8559-2A>G and c.11156G>A in USH2A were recurrent, as they were found in five and four probands, respectively. We presume that these variants are founder variants.
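The rates quoted in this paragraph use different denominators, which the following worked arithmetic makes explicit. The denominators are inferred from counts given elsewhere in the text (76 enrolled families, 62 families with nonsyndromic RP, and 67 identified variants), so this is a consistency check rather than a restatement of the authors' calculation.

```latex
\begin{align*}
\text{Usher syndrome:} \quad & 13/76 \approx 17.1\% \\
\text{Bardet-Biedl syndrome:} \quad & 1/76 \approx 1.3\% \\
\text{Bietti crystalline corneoretinal dystrophy:} \quad & 5/76 \approx 6.6\% \\
\text{USH2A, all families:} \quad & 15/76 \approx 19.7\% \\
\text{USH2A, nonsyndromic RP families:} \quad & 5/62 \approx 8.1\% \\
\text{Novel variants among identified variants:} \quad & 24/67 \approx 35.8\%
\end{align*}
```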
In the study, we did not find a disease-causing variant in 21 families (27.6%), whereas we found only one heterozygous variant of arRP genes in 12 families (15.8%). Possible reasons for these results are as follows. First, targeted panel sequencing and WES cannot capture variants in the noncoding regions of corresponding genes, nor can they detect variants comprising gross deletions, gross insertions, or complex rearrangements (Broadgate, Yu, Downes, & Halford, 2017). Second, the sequencing depth of coverage was insufficient to accurately call all variants, especially those located in regions with high GC content. Third, variants of novel genes in patients with RP may have been filtered out in raw data analysis (Daiger, Sullivan, & Bowne, 2013). Fourth, other mild and moderate systemic clinical manifestations of syndromic RP may have been neglected (Xu et al., 2014). Fifth, small indel, large structural, copy number, or duplication variants in patients with Usher syndrome are not readily identified with NGS technology (Bonnet et al., 2016; O'Donnell-Luria & Miller, 2016). Whole genome sequencing may be a comprehensive alternative strategy because it partially resolves these problems (Carrigan et al., 2016).
FIGURE 3 Topology and molecular models of seven novel variants. (a) CYP4V2 protein molecular alteration caused by CYP4V2 variant c.413G>A, p.(Ser138Asn). These models were predicted using 6c94.1. Compared to the wild-type model, serine is replaced by asparagine, which creates H-bonds (green dashed line) between residues in the mutant model. (b) RPE65 protein molecular alteration caused by RPE65 variant c.1403C>T, p.(Ser468Leu). These models were predicted using 4f30.1. Compared to the wild-type model, the number of H-bonds (green dashed line) between residues in the mutant model markedly decreased. (c) CNGB1 protein molecular alteration caused by CNGB1 variant c.2921T>G, p.(Met974Arg). These models were predicted using 5h3o.1. Compared to the wild-type model, the number of H-bonds (green dashed line) between residues in the mutant model markedly decreased. (d) PDE6B protein molecular alteration caused by PDE6B variant c.622G>A, p.(Val208Met). These models were predicted using 6mzb.1. There was no major difference between the wild-type and mutant models. (e) PDE6B protein molecular alteration caused by PDE6B variant c.2435A>T, p.(Asp812Val). These models were predicted using 6mzb.1. Compared to the wild-type model, the last helix is divided in the mutant model. (f) RPGR protein molecular alteration caused by RPGR variant c.818A>G, p.(Gln273Arg). These models were predicted using 4jhn.1. Compared to the wild-type model, the number of H-bonds (green dashed line) between residues in the mutant model markedly decreased. (g) SLC7A14 protein molecular alteration caused by SLC7A14 variant c.524G>A, p.(Gly175Glu). These models were predicted using 6f34.1. Compared to the wild-type model, glycine is replaced by glutamic acid, which changes the direction of beta strand folding in the mutant model.
In this study, we also detected two novel hemizygous RPGR variants, c.2006G>A, p.(Trp669*) and c.818A>G, p.(Gln273Arg). These variants did not segregate with the disease in Family 15 and Family 176. The biological parents of both probands exhibited wild-type genotypes and had no history of bone marrow transplant surgery. The lack of segregation was possibly because the variants were de novo or because the probands' mothers exhibited chimerism. Other examinations (e.g., high-depth DNA sequencing of oral mucosa and urinary sediment for somatic cell chimerism, or of an ovum for gonad chimerism) are needed to definitively determine the statuses of the probands' mothers.
This study identified the gene variants in a cohort of Chinese probands with RP; however, there were some limitations. Some panels did not allow analysis of all RP genes. Furthermore, some families could not undergo segregation analysis. We plan to perform WES or whole genome sequencing to capture more genes and to include more patients in future research.
In conclusion, we enrolled a cohort of 76 families who exhibited RP. We identified 43 families (56.58%) with disease-causing variants in 15 genes and 12 families (15.79%) with only one heterozygous variant in arRP genes. We also detected 67 potential pathogenic gene variants, of which 24 have not been previously described. These results will provide useful data for clinicians to make accurate genetic diagnoses, estimate prognosis, and provide genetic counseling; moreover, they will provide further support for researchers to explore RP pathogenesis.