Comprehensive strategy improves the genetic diagnosis of different polycystic kidney diseases

Abstract Polycystic kidney disease (PKD) occurs in three main forms, namely autosomal dominant PKD (ADPKD), autosomal recessive PKD (ARPKD) and syndromic PKD (SPKD), which are distinguished by their clinical manifestations and genetic causes and are diagnosable from the embryo stage to the later stages of life. Selecting a genetic test for individuals whose diagnostic imaging reports show cystic kidneys but who have no family history of the disease continues to be a challenge in clinical practice. To limit the time and medical cost of the procedure, a practical strategy for genotyping and targeted validation to resolve cystogene variations was developed in our clinical laboratory. It combines whole‐exome sequencing (WES), long‐range PCR (LR‐PCR), Sanger sequencing and multiplex ligation–dependent probe amplification (MLPA) in a stepwise approach. Twenty‐six families with renal polycystic disorders were enrolled in the present study. Thirty‐two variants involving four ciliary genes (PKD1, PKHD1, TMEM67 and TMEM107) were identified and verified in 23 families (88.5%, 23/26), which expanded the variant spectrum by 16 novel variants. Pathogenic variations were identified in five of the six foetuses diagnosed with PKD by prenatal ultrasound imaging. Constitutional biallelic and digenic variations constituted the pathogenic patterns in these foetuses. The preliminary clinical data highlighted that the WES + LR‐PCR‐based workflow followed in the present study is efficient in detecting divergent variations in PKD. The biallelic and digenic mutations were revealed as the main pathogenic patterns in the foetuses with PKD.


| INTRODUCTION
Hereditary polycystic kidney diseases (PKDs) exhibit clinical similarity and genetic diversity 1 and may affect organ development and growth in patients ranging from embryos to adults. 2 Polycystic kidney diseases originate from cellular dysfunctions of chemo- and mechanosensation and of fluid transport in renal tubules, 3 which involve numerous molecules. Recent studies have delineated several genes, associated with multiple signalling pathways, that are involved in renal cystic disorders. 4 Pathogenic variants in these genes can cause autosomal dominant/recessive polycystic kidney disease (ADPKD/ARPKD) 5 and rare syndromes manifesting as renal polycysts.
Autosomal dominant polycystic kidney disease (ADPKD), which affects ~1 in 1000 individuals, is the most common form of hereditary PKD and an underlying cause of end-stage renal disease (ESRD). 6 Pathogenic variants in the PKD1 and PKD2 genes account for over 99% of all ADPKD cases, half of which gradually progress to ESRD. 7 The age of onset and the phenotypic severity of ADPKD depend on the nature of the specific variants. 8 The onset of ADPKD occurs in adulthood in the majority of cases, while 2%-5% of cases exhibit early onset, even foetal onset, and the mechanism underlying early-onset PKD has attracted great attention. 3,9 Recent in vivo and in vitro studies on biallelic and digenic mutations have proposed a 'two-hit' model for cystogenesis in PKD, in which both copies of a polycystic kidney disease gene are inactivated by germline and somatic mutations, thereby leading to cyst formation. 10 The PKD1 gene comprises 46 exons within a 52-kb locus. Six highly homologous pseudogenes match its exons, 11 and the gene has a high guanine-cytosine (GC) content and simple repeats, 12 all of which make it challenging to analyse genetic variations in PKD1.
Autosomal recessive polycystic kidney disease (ARPKD) mainly results from biallelic pathogenic variants in the PKHD1 (PKD type 4) and DZIP1L (PKD type 5) genes, in addition to numerous syndromic PKD genes. 13,14 These syndromic disorders often exhibit severe early-onset and multi-systemic manifestations. 15,16 Advances in next-generation sequencing (NGS) technology have provided an opportunity to optimize the genetic diagnosis of PKDs and to establish their genotype-phenotype correlations. 8 The availability of adequate genetic tests would give patients the benefit of precise counselling and management. 17 In terms of feasibility, gene panel tests promise better sequencing depth, 18 while whole-exome sequencing (WES) and whole-genome sequencing (WGS) cover the exome and the full genome, respectively, and allow clinicians to identify novel pathogenic variants. 8 In the present study, six cases of foetal-onset PKD diagnosed using ultrasonography and 20 cases of adult-onset PKD were recruited and registered for comprehensive genetic analyses using our new workflow. The overall detection rate reached 88.5%.
The preliminary data showed that the WES + targeted PKD1-seq-based workflow used in the present study was efficient in detecting divergent mutations in different forms of PKD, particularly in foetuses with polycysts.

| Participants and clinical analyses
The present study was approved by the Ethics Committee of Shijiazhuang Obstetrics and Gynecology Hospital (approval no.: 20200042). All participants were Chinese and were recruited from four medical centres (affiliate 2-5) in the northern region of China.
Informed consent for participation in the study was obtained from all recruits. A total of 46 individuals from 26 unrelated families with no consanguineous relationship were enrolled, among which 21 families were associated with adult-onset PKDs, six families were associated with early-onset PKDs, and one family was associated with both. The diagnostic criteria for adult-onset PKD followed the KDIGO guideline. 19

| Workflow set-up for genetic analyses
In view of the clinical and genetic heterogeneity of PKDs, the experimental cost and the analytical complexity of the PKD1 gene, a comprehensive workflow combining WES, targeted LR-PCR plus sequencing, MLPA (multiplex ligation-dependent probe amplification), QF-PCR (quantitative fluorescence PCR) and in silico analysis was established, as illustrated in Figure 1. WES and targeted PKD1-seq were applied simultaneously as the first tier of tests. The targeted sequencing aimed to detect variants in PKD1 exons 1-34, while WES aimed to detect variants in the genome-wide coding region.
Sanger sequencing, MLPA, QF-PCR and in silico analysis formed the second tier of tests, which aimed at mutation validation and functional prediction. All WES-detected variants suspected to be pathogenic were verified using Sanger sequencing. Patients with no sequence variant were screened for possible rearrangements and/or CNVs in PKD1 using MLPA and QF-PCR. In this manner, the maximum turnaround time could be limited to 2 weeks.
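The second-tier routing described above can be summarised as a small decision function. The following Python sketch is illustrative only: the function name `second_tier_tests` and its two boolean parameters are hypothetical, and attaching in silico analysis specifically to missense variants is an assumption made for the example rather than a rule stated for this tier.

```python
from typing import List


def second_tier_tests(has_suspected_variant: bool,
                      is_missense: bool = False) -> List[str]:
    """Return the follow-up tests for one proband after the first tier."""
    if has_suspected_variant:
        # Variants suspected to be pathogenic are verified by Sanger sequencing.
        tests = ["Sanger sequencing"]
        if is_missense:
            # Assumption: functional prediction is added for missense variants.
            tests.append("in silico analysis")
        return tests
    # No sequence variant found: screen PKD1 for rearrangements and CNVs.
    return ["MLPA", "QF-PCR"]


print(second_tier_tests(True, is_missense=True))
print(second_tier_tests(False))
```

Keeping the routing in a single function makes the two branches (validation of a detected variant versus a search for rearrangements) explicit.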

| DNA extraction
Genomic DNA was extracted from the peripheral blood (for adults) and cord blood (for foetuses) samples (200 µL each) by using the DNA Blood Midi/Minikit (QIAGEN, Hilden, Germany) in accordance with the manufacturer's protocol.

| Whole-exome sequencing (WES)
Whole-exome sequencing was performed on the probands in each family as described in a previous study. 20 The details of the analysis procedures are provided in Figure 1 (left dashed block). The obtained DNA samples (1 µg each) were subjected to quality evaluation using agarose gel electrophoresis and UV spectrophotometry, after which the DNA fragments were hybridized and captured using IDT's xGen Exome Research Panel 2.0 (Integrated DNA Technologies, San Diego) according to the manufacturer's protocol. The libraries were screened for enrichment using qPCR, and the size distribution and concentration were determined using an Agilent Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA). The Novaseq6000 platform (Illumina, San Diego) was employed for the genomic sequencing of ~300 pmol·L⁻¹ of DNA per sample using the Novaseq Reagent Kit. 21 The raw reads (quality level Q30 > 89%) were aligned to the human reference genome (hg19/GRCh37) using the Burrows-Wheeler Aligner tool, and the PCR duplicates were removed using Picard v1.57. 22 Variant calling was performed using the Verita Trekker ® Variants Detection system v 2.0 (Berry Genomics, Inc, Beijing) and the Genome Analysis Toolkit. 23 The SNPs and InDels with a frequency of >0.01 in 1000 genomes, ExAC and gnomAD_exomes were removed. The non-synonymous variants were evaluated using  (Table S1).

FIGURE 1 The diagnostic workflow for mutation analysis in PKD patients. The proposed complete strategy comprised three stepwise analyses. The first tier of tests was administered simultaneously and included whole-exome sequencing (WES) and targeted PKD1 (exons 1-34) sequencing. Targeted Sanger sequencing was performed after long-range PCR (LR-PCR), which amplified the repetitive sequences in the four fragments spanning exons 1-34 and introns, as indicated in the right dashed box. WES was applied to screen for single nucleotide polymorphisms (SNPs), insertions/deletions (InDels) and copy number variations (CNVs). When the first tier of tests identified a variant of unknown significance (VUS), in silico analysis was conducted to predict its functional influence. When WES detected a microdeletion/microduplication or no point mutation, QF-PCR and MLPA were performed, respectively. The 'pathogenic' and 'likely pathogenic' point mutations detected using WES were validated using Sanger sequencing as the second-tier test. ACMG, American College of Medical Genetics and Genomics; HGMD, Human Gene Mutation Database; NL pearl™, Natural Language Pearl processing and analysis system by Berry Genomics, Inc
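The population-frequency filter in the whole-exome sequencing procedure (removal of SNPs and InDels with a frequency of >0.01) can be illustrated with a minimal Python sketch. The variant records, field names and identifiers below are hypothetical. The sketch also assumes that a variant is removed when its frequency exceeds the threshold in any one of the three databases, and that a missing database entry counts as a frequency of zero; neither rule is specified in the text.

```python
MAF_THRESHOLD = 0.01
DATABASES = ("1000genomes", "ExAC", "gnomAD_exomes")


def is_rare(variant, threshold=MAF_THRESHOLD):
    """True if the variant does not exceed the threshold in any database."""
    freqs = variant.get("frequencies", {})
    # Assumption: a variant absent from a database has frequency 0.0 there.
    return all(freqs.get(db, 0.0) <= threshold for db in DATABASES)


def filter_rare(variants, threshold=MAF_THRESHOLD):
    """Keep only the variants that pass the population-frequency filter."""
    return [v for v in variants if is_rare(v, threshold)]


# Hypothetical records, for illustration only.
example_variants = [
    {"id": "var_A", "frequencies": {"1000genomes": 0.0002, "ExAC": 0.0001}},
    {"id": "var_B", "frequencies": {"gnomAD_exomes": 0.05}},
    {"id": "var_C", "frequencies": {}},
]

kept = [v["id"] for v in filter_rare(example_variants)]
print(kept)
```

Here `var_B` is discarded because its gnomAD_exomes frequency exceeds 0.01, while the other two records are retained for downstream evaluation.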

| LR-PCR for targeted PKD1-seq
Long-range PCR was performed to isolate the sequence of the PKD1  Table S2, and the PCR reaction conditions are provided in Table S3.

| Multiplex ligation-dependent probe amplification (MLPA)
In order to directly identify the CNVs and validate the findings of WES suggesting an exonic CNV of PKD1, MLPA was performed using the SALSA kit P351 (MRC-Holland, Inc, Amsterdam) according to the manufacturer's protocol.

| In silico analysis of conservatism and molecular modelling
All the missense variants identified in the present study were analysed for evolutionary conservatism of the affected amino acid residues using MEGA7 with default parameters. 27 In the case of missense variants located in the well-defined peptide structure models available in the PDB database, 28 the Modeller 9V17 software was employed to predict the functional influence of the structural anomaly. 29 In the case of missense variants not matching with any well-defined peptide structure models available in the PDB database, the UniProt database 30 was searched for a functional prediction using the Rosetta CM program as described in a previous study. 31 The confidence threshold was set at ≥0.6.
In the molecular dynamics (MD) analysis, CHARMM22 was employed to add hydrogen atoms and N- and C-terminal patches to the models. 32 The generated models were solvated and neutralized with TIP3P water within a box at a minimum distance of 13 Å between the model and the wall of the box. All simulations were run using NAMD 2.9 with periodic boundary conditions (PBC). The temperature was maintained at 300 K, and the pressure was maintained at 1 atm. The time step was set to 2 fs, the particle-mesh Ewald method was applied to model the electrostatics, and the cutoff for van der Waals interactions was set at 12 Å. Both models underwent a three-step pre-equilibration totalling 600 ps, the last snapshots of which were selected as the starting structures for 20-ns production simulations without constraints.

Table 1. To better illustrate the data from the PKD families, pedigree diagrams, imaging graphs, genetic variants and pathological results were assembled into composite figures (Figures S1 and S2; Figure 3). Among the identified variants, 17 variants (53.12%) were novel and 15 variants (46.88%) had been described previously. According to the ACMG evidence, 6.25% of the variants were pathogenic (P), 59.38% were likely pathogenic (LP), and 34.37% were variants of uncertain significance (VUS). In resolved families, 75% of the variants were autosomal dominant (AD) and 25% were autosomal recessive (AR). All the variants detected in the present study are listed in Figure 2 and Table 2.
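As a quick arithmetic check of the proportions reported above, the percentages can be converted back into variant counts. This Python sketch assumes that every percentage refers to the 32 variants identified in the study; the variable names are illustrative.

```python
TOTAL_VARIANTS = 32  # assumption: all percentages refer to these 32 variants

acmg_pct = {"P": 6.25, "LP": 59.38, "VUS": 34.37}
novelty_pct = {"novel": 53.12, "known": 46.88}


def to_counts(pct_by_class, total=TOTAL_VARIANTS):
    """Convert reported percentages into the nearest whole-number counts."""
    return {name: round(total * pct / 100) for name, pct in pct_by_class.items()}


acmg_counts = to_counts(acmg_pct)
novelty_counts = to_counts(novelty_pct)

# Detection rate: families with verified variants over all enrolled families.
detection_rate = round(23 / 26 * 100, 1)

print(acmg_counts, novelty_counts, detection_rate)
```

Under this assumption, each set of counts sums to 32, and the novel/known split reproduces the 17 and 15 variants stated in the text.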

| Identification of gene variants in PKD families
The detection rate for the patients with phenotypically mani-

| Conservatism and molecular modelling analysis
In the present study, 12 missense variants were detected. It was demonstrated that all the affected amino acid residues were evolutionarily conserved across species (Figure S3). As depicted in Figure 4A-2, the side chain of Ser1448 formed hydrogen bonds with Arg1441, and p.S1448F, which caused the polar serine to be replaced with a large hydrophobic phenylalanine residue, eliminated these hydrogen bonds (Figure 4B-2 polar cysteine, could destroy this hydrophobic core, thereby causing the associated structural changes and ultimately influencing the overall protein folding and stability (Figure 4A-3). Asp2095 formed hydrogen bonds with Gly2097, and the replacement of Asp2095 with a tyrosine could eliminate the hydrogen bonds and potentially alter the protein distribution (Figure 4A-4).
Moreover, the differences among the secondary structures of p.S1448F, p.W2006C, p.D2095Y and WT PKD1 were quite obvious

| DISCUSSION
In these conditions, polycystic deformities may develop at different ages. Autosomal dominant PKDs usually occur in adults, although early development and even in utero onset have been reported. 11,12 The variants of co-factorial genes drastically influence the phenotypes of PKD patients. 35,36 When there is no family history of the disease, the decision to proceed directly to targeted gene analysis becomes difficult.
Ten years ago, Rossetti and colleagues proposed a plausible protocol based on HPLC, LR-PCR and direct sequencing to overcome the spurious amplification of the PKD1 pseudogenes, which achieved a detection rate of 63%. 37 Next-generation sequencing techniques, either in a targeted sequencing approach 8,11 or in a more comprehensive manner, 38
The cases of early-onset PKDs have been a concern for decades. The six cases of in utero-onset PKD included in the present study demonstrated a scenario of clinical similarity and genetic diversity. variant resulting in severe phenotypes and embryonic lethality. 45,46 Therefore, we hypothesized that the combined effects of the ADPKD and ARPKD traits produce more severe and earlier-onset phenotypes. Figure S1). Further detailed CNV analysis and whole-genome

| CONCLUSION
In the present study, a comprehensive genetic strategy was developed to identify genomic variants in 26 PKD families. A total of 32 variants were identified in 23 families, 16 of which were novel.
The present study expanded the variation spectrum of cystogenes and, in particular, provided further evidence that foetal-onset PKDs usually occur with biallelic and digenic variants. The biophysical analysis of the missense variants in the PKD1 gene points to a potential tool to assist in the determination of pathogenicity.

ACKNOWLEDGEMENT
We thank all subjects for participating in this study.

CONFLICT OF INTEREST
The authors declare no conflict of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.