Association between copy number variations in parkin (PRKN) and schizophrenia and autism spectrum disorder: A case–control study

Abstract Aim The present study aimed to examine the association between copy number variations (CNVs) in parkin (PRKN) and schizophrenia (SCZ) and autism spectrum disorder (ASD) in a large case–control sample. Method Array comparative genomic hybridization was performed on 3111 cases with SCZ, 1236 cases with ASD, and 2713 controls. We systematically prioritized likely pathogenic CNVs (LP‐CNVs) in PRKN and examined their association with SCZ and ASD. Results In total, 3014 SCZ cases (96.9%), 1205 ASD cases (97.5%), and 2671 controls (98.5%) passed quality control. We found that monoallelic carriers of LP‐CNVs in PRKN were common (70/6890, 1.02%) and were not at higher risk of SCZ (p = 0.29) or ASD (p = 0.72). We observed that the distribution pattern of LP‐CNVs in the Japanese population was consistent with those in other populations. We also identified a patient diagnosed with SCZ and early‐onset Parkinson's disease carrying biallelic pathogenic CNVs in PRKN. The absence of Parkinson's symptoms in 10 other monoallelic carriers of the same pathogenic CNV further reflects the lack of effect of monoallelic pathogenic variants in PRKN in the absence of a second hit. Conclusion The present findings suggest that monoallelic CNVs in PRKN do not confer a significant risk for SCZ or ASD. However, further studies to investigate the association between biallelic CNVs in PRKN and SCZ and ASD are warranted.


| INTRODUCTION
The gene encoding parkin, PRKN (OMIM *602544), formerly known as PARK2, was originally discovered in 1998 by Kitada et al. 1 and named after its role in the pathogenesis of autosomal recessive juvenile Parkinson disease-2 (PARK2, MIM 600116).
Soon thereafter, further work established genetic variants of PRKN as a common cause of early-onset Parkinson's disease (EOPD). 2,3 Since the original discovery of exonic deletions in PRKN, copy number variations (CNVs) have been the main focus of genetic studies of PRKN because they likely lead to functional loss of Parkin. 1][12] Thereafter, several reports of patients with neuropsychiatric disorders (NPDs) carrying CNVs in PRKN were published, 13-15 and case-control association studies attempted to clarify the role of heterozygous CNVs in PRKN in NPDs, including attention-deficit/hyperactivity disorder (ADHD) 16,17 and ASD. 18,19
While these early approaches implicated a possible association of CNVs in PRKN with NPDs, the genotyping methods used in these studies may have missed some of the CNVs, and the results were inconsistent or lacked sufficient power because of their limited sample sizes (ranging from about 340 to 880 16-19). Accompanying the limited sample sizes was a lack of proper filtering. Because CNVs are rare events, studies tend to adopt loose filtering to preserve more CNVs, which results in the inclusion of not only likely pathogenic CNVs (e.g., CNVs at internal exons) but also less likely pathogenic CNVs 20 (e.g., intronic CNVs, 8,16 duplications spanning terminal [first/last] exons) in the same study. These defects in study design may have led to inaccurate interpretations of the role of CNVs in PRKN in the pathogenesis of NPDs. Therefore, a case-control association study with an adequate sample size and appropriate prioritization using a sensitive and reliable detection method is needed to clarify the role of CNVs in PRKN in NPDs.
To address this issue, we performed array comparative genomic hybridization (aCGH) on 4347 NPD cases (3111 SCZ and 1236 ASD) and 2713 healthy controls. We adopted a systematic prioritization for likely pathogenic CNVs (LP-CNVs) (Figure 1) and examined the association between LP-CNVs in PRKN and NPDs. Additionally, we analyzed the frequency and distribution of LP-CNVs in PRKN in the Japanese population and compared our findings with previous reports in other populations. We also report a biallelic carrier of pathogenic CNVs in PRKN diagnosed with SCZ and EOPD, along with other monoallelic carriers of the same pathogenic CNV, to investigate the effect of monoallelic pathogenic variants in PRKN in the absence of a second hit.

| Participants
All participants recruited into the present study, including 3111 SCZ cases, 1236 ASD cases, and 2713 healthy controls, were of Japanese ancestry. SCZ and ASD were diagnosed based on the criteria set forth in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition. The healthy controls were selected from the general population and confirmed to have no history of mental disorders based on self-reported information and questionnaires. This study was approved by the ethics committee of Nagoya University and affiliated institutes. Written informed consent was obtained from all participants or accompanying family members before the study began.
To systematically correct for artifacts caused by GC content and fragment length, a noise-reduction algorithm for aCGH data was adopted. 21 The accuracy of CNVs identified by aCGH was confirmed in our previous study using quantitative real-time polymerase chain reaction (TaqMan copy number assays; Applied Biosystems). 22 For quality control (QC), a score was calculated for each sample based on the statistical variance of the probe-to-probe log ratios. Lower QC scores indicated higher-quality results with less variation; samples with scores >0.2 were excluded from the analysis, as were samples with gender mismatches or excessive autosomal CNV calls. We then filtered out CNV calls that were <10 kb in length or had low probe density (<1 probe/30 kb). Finally, we removed common CNVs (those that appeared in ≥1% of the total sample).
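The sample-level and call-level filters described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the `Sample` and `CnvCall` records and all field names are hypothetical, and the steps for excessive autosomal CNV calls and for common CNVs (≥1% of the sample) are omitted because their exact criteria are not given in the text.

```python
from dataclasses import dataclass

MAX_SAMPLE_QC = 0.2          # samples with a QC score > 0.2 are excluded
MIN_CALL_LENGTH_BP = 10_000  # CNV calls shorter than 10 kb are dropped
BP_PER_PROBE_LIMIT = 30_000  # fewer than 1 probe per 30 kb is "low density"


@dataclass
class Sample:  # hypothetical record layout
    sample_id: str
    qc_score: float
    sex_mismatch: bool


@dataclass
class CnvCall:  # hypothetical record layout
    sample_id: str
    length_bp: int
    n_probes: int


def sample_passes_qc(sample: Sample) -> bool:
    """Keep samples with QC score <= 0.2 and no sex mismatch."""
    return sample.qc_score <= MAX_SAMPLE_QC and not sample.sex_mismatch


def call_passes_filters(call: CnvCall) -> bool:
    """Keep calls >= 10 kb with at least 1 probe per 30 kb."""
    long_enough = call.length_bp >= MIN_CALL_LENGTH_BP
    # Integer form of n_probes / length_bp >= 1 / 30_000.
    dense_enough = call.n_probes * BP_PER_PROBE_LIMIT >= call.length_bp
    return long_enough and dense_enough


def filter_calls(samples, calls):
    """Return the calls that belong to QC-passing samples and pass call filters."""
    kept_ids = {s.sample_id for s in samples if sample_passes_qc(s)}
    return [c for c in calls
            if c.sample_id in kept_ids and call_passes_filters(c)]
```

Treating a score of exactly 0.2 as passing follows the wording "scores >0.2 were excluded"; the handling of the other boundary values is an assumption.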

| Prioritization of likely pathogenic CNVs in PRKN
We started by considering all CNVs in the PRKN locus (chr6:161691121-163068690, NCBI36) as potential candidates. To identify CNVs that were likely pathogenic, we performed a systematic prioritization process (Figure 1A). First, we excluded all intronic CNVs because they were deemed less likely to be pathogenic. Next, to prioritize the pathogenic CNVs that cause functional loss of one copy of PRKN, we followed the guidelines for interpreting CNVs provided by the American College of Medical Genetics and Genomics and the Clinical Genome Resource (Figure 1B), 23 under the assumption that PRKN is a haploinsufficient gene, although this is still under debate. We excluded CNVs falling in categories 2C-2, 2D-1, 2H, and 2J, which have insufficient evidence to support their pathogenicity, and retained CNVs falling in categories 2A, 2C-1, 2D-2, 2D-3, 2D-4, 2E, and 2I. We further evaluated the CNVs in categories 2E and 2I based on PVS1 specifications. 24 Category 2E and 2I CNVs disrupting reading frames and predicted to undergo nonsense-mediated decay (NMD) were classified as PVS1. Category 2E CNVs preserving reading frames were classified as PVS1_Strong (if removing >10% of protein) or PVS1_Moderate (if removing <10% of protein). Category 2I CNVs preserving reading frames were classified as N/A. CNVs classified as PVS1, PVS1_Strong, or PVS1_Moderate were retained, whereas those classified as N/A were excluded. After this prioritization process, we retained the CNVs considered to have moderate to very strong evidence supporting their pathogenicity and labeled these as likely pathogenic CNVs (LP-CNVs; Table S1).
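The decision rules in this prioritization can be written as a small classifier. The sketch below is only an illustration of the logic as described in the text; the function names and arguments are hypothetical. Two assumptions are made where the text is silent: a frame-disrupting 2E or 2I CNV is taken to be predicted to undergo NMD, and a CNV removing exactly 10% of the protein is treated as PVS1_Moderate.

```python
DIRECTLY_RETAINED = {"2A", "2C-1", "2D-2", "2D-3", "2D-4"}
EXCLUDED = {"2C-2", "2D-1", "2H", "2J"}


def pvs1_strength(category, frame_preserved, protein_fraction_removed=None):
    """Return the PVS1 label for a category 2E or 2I CNV."""
    if category not in {"2E", "2I"}:
        raise ValueError("PVS1 evaluation applies to categories 2E and 2I only")
    if not frame_preserved:
        # Assumption: frame disruption implies predicted NMD.
        return "PVS1"
    if category == "2I":
        return "N/A"
    if protein_fraction_removed is None:
        raise ValueError("in-frame 2E CNVs need the fraction of protein removed")
    # Assumption: exactly 10% falls in the Moderate class.
    return "PVS1_Strong" if protein_fraction_removed > 0.10 else "PVS1_Moderate"


def is_likely_pathogenic(category, intronic=False, frame_preserved=None,
                         protein_fraction_removed=None):
    """Apply the prioritization: True if the CNV is kept as an LP-CNV."""
    if intronic or category in EXCLUDED:
        return False
    if category in DIRECTLY_RETAINED:
        return True
    label = pvs1_strength(category, frame_preserved, protein_fraction_removed)
    return label != "N/A"
```

For example, `is_likely_pathogenic("2I", frame_preserved=True)` returns `False`, which corresponds to in-frame intragenic duplications being excluded.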

| Association analysis
We conducted association analysis on carriers of monoallelic (heterozygous) and biallelic (homozygous or compound heterozygous) CNVs separately (Figure 1A). To investigate the associations between LP-CNVs in PRKN and SCZ or ASD, we used Fisher's exact test (two-tailed) to calculate odds ratios (ORs) and P-values for the contingency table. In a secondary analysis, we investigated the association between LP-CNVs that spanned specific exon(s) and SCZ or ASD using Fisher's exact test (two-tailed).
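A two-tailed Fisher's exact test on a 2 × 2 carrier table can be sketched in pure Python as follows. This is an illustrative implementation, not the software used in the study. The carrier counts in the usage example are assumptions chosen to be consistent with the frequencies and ORs reported in this paper; they are not values taken from Table 1. The OR returned is the sample odds ratio (ad/bc).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-tailed Fisher's exact test for the table [[a, b], [c, d]].

    a, b: carriers and non-carriers among cases
    c, d: carriers and non-carriers among controls
    Returns (sample odds ratio, two-sided p-value).
    """
    n_total = a + b + c + d
    n_cases = a + b
    n_carriers = a + c
    denom = comb(n_total, n_carriers)

    def pmf(x):
        # Hypergeometric probability of x carriers falling among the cases.
        return comb(n_cases, x) * comb(n_total - n_cases, n_carriers - x) / denom

    lo = max(0, n_carriers - (n_total - n_cases))
    hi = min(n_cases, n_carriers)
    p_obs = pmf(a)
    # Sum all tables that are no more probable than the observed one.
    p_value = sum(p for p in (pmf(x) for x in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Illustrative (assumed) counts: 12 carriers among 1205 ASD cases,
# 23 carriers among 2671 controls.
or_asd, p_asd = fisher_exact_two_sided(12, 1193, 23, 2648)
print(f"OR = {or_asd:.2f}, p = {p_asd:.2f}")
```

The small relative tolerance in the comparison guards against floating-point ties between tables of equal probability.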
CNV23 is a duplication in exon 2, while CNV56 is a deletion in exon 6 of PRKN, both resulting in a premature stop codon and potential degradation of transcripts by NMD (Figure 2C). Ten other individuals were found to carry similar duplications in exon 2 of the same length as CNV23 (CNV18-20, CNV24-26, and CNV65-68), but none exhibited Parkinson's symptoms.

| Association analysis
With adequate identification of CNVs and a systematic prioritization, we revisited the question of whether CNVs in PRKN increase the risk of developing NPDs, specifically SCZ or ASD. Our initial association analysis conducted on 6889 participants with zero or one LP-CNV revealed that monoallelic carriers of LP-CNVs in PRKN were not at a higher risk of developing SCZ (OR = 1.35, 95% confidence interval [CI] = 0.80-2.30, p = 0.29) or ASD (OR = 1.16, 95% CI = 0.57-2.34, p = 0.72) (Table 1). We also conducted a secondary analysis to examine whether monoallelic CNVs spanning specific exon(s) (e.g., exon 2, exons 5-12) of PRKN confer an increased risk for NPDs, but no statistically significant results (p < 0.05) were found (Table 1). Furthermore, we conducted an association analysis on biallelic carriers and non-carriers, but because only one biallelic carrier (an SCZ case) was found among all participants, no statistically significant result was obtained (Table 1).
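For readers who want to see how an OR and its 95% CI relate to a 2 × 2 table, the sketch below uses the standard log (Woolf) interval. The text does not state which interval method was used, so this is an assumption, and the carrier counts are illustrative values chosen to be consistent with the reported frequencies rather than values taken from Table 1.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_with_ci(a, b, c, d, level=0.95):
    """Sample odds ratio and Woolf (log-normal) confidence interval.

    a, b: carriers and non-carriers among cases
    c, d: carriers and non-carriers among controls
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Illustrative (assumed) counts: 35 carriers among 3013 SCZ cases,
# 23 carriers among 2671 controls.
or_scz, ci_low, ci_high = odds_ratio_with_ci(35, 2978, 23, 2648)
print(f"OR = {or_scz:.2f}, 95% CI = {ci_low:.2f}-{ci_high:.2f}")
```

Because the interval spans 1, a table like this one gives no evidence of increased risk, in line with the non-significant result reported above.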

| Distribution pattern of CNVs in PRKN
Among the LP-CNVs identified in 6890 Japanese individuals, we observed a clustering toward exons 1-4, which encode the Ubl and RING0 domains of Parkin (Figure 2B). Overall, no obvious difference in the distribution of CNVs in PRKN was found between NPD cases and controls. CNVs involving exon 2, 3, or 4 accounted for the majority (62/72, 86%) of all LP-CNVs. Exon-wise, exon 2 was involved in the largest number of LP-CNVs, including 33 CNVs spanning exon 2 and 26 single-exon CNVs. Notably, 11 of the 13 single-exon duplications in exon 2 (CNVs 18-20, 23-26, and 65-68) were uniform in width (~300 kb), accounting for 85% of the duplications in exon 2 of PRKN. This finding was consistent with a recent large-scale study in the European population showing that CNVs in exons 2-4 of PRKN are common and that duplications found in exon 2 were uniform in size. 25 Intriguingly, the duplications identified in our sample set of the Japanese population were larger (~300 kb) than those identified in that study (~200 kb). Given the high resolution of the CNV detection methods in both our study (NimbleGen or Agilent) and the previous study (NeuroX 26 or NeuroChip 27), the 100-kb difference in length suggests that the two duplications are likely distinct variants. However, whether the 300-kb duplication discovered in our study is population-specific requires further investigation.
In the present study, we identified a greater number of LP-CNVs in PRKN in NPD cases compared with previous studies. 16,18,19 7][18][19] Yin et al. 18 reported a higher frequency of PRKN CNVs in ASD cases compared with controls (0.94% vs. 0.14%, respectively; p = 0.014) by screening 1428 participants (335 ASD cases and 1093 controls) for CNVs in PRKN and genotyping designated regions for replication in 301 ASD cases and 301 controls. Jarick et al. 16 reported a similar result in ADHD cases (1.04% vs. 0.13%, respectively; p = 0.043) in replication samples of 386 ADHD cases and 781 controls. However, the frequency of CNVs in PRKN in their control samples was much lower than that in our study (0.86%) and other large-scale studies (0.55-0.95% 25,28,29). This omission of CNVs may have been caused by limitations in the detection methods employed, and may have influenced their results. Moreover, we were unable to replicate the excess of CNVs among NPD cases in the risk regions suggested by their studies (i.e., exon 2, exons 3-4, exon 5, and exons 6-7) (Table 1). Instead, our observations were more consistent with the results of Conceição et al., 19 who reported that CNVs in exons 1-4 of PRKN, which encode the Ubl and RING0 domains of Parkin, were frequently observed in both controls and NPD cases. We found that LP-CNVs in exons 2-4 were the most common, whereas LP-CNVs in exons 5-12 were infrequent. Notably, exons 5-12 of PRKN encode the RBR domain, which is the main functional domain of Parkin, and an excess of CNVs in this region has been reported in NPD cases. 19 However, our association analysis did not yield any statistically significant evidence to suggest that monoallelic LP-CNVs in exons 5-12 of PRKN are associated with an increased risk of NPDs (Table 1). Overall, while the possibility that monoallelic CNVs in PRKN confer a higher risk of SCZ or ASD remains, given that low penetrance of monoallelic CNVs in PRKN for Parkinson's disease (PD) has been suggested, 30 our data indicate that any such risk is likely to be limited.
Only one carrier of biallelic pathogenic CNVs in PRKN was identified among the NPD cases in our study (none in the control group); however, because of the limited number of cases, statistical significance was not achieved. In fact, the low prevalence of biallelic CNVs in PRKN (1/6890) makes achieving statistical significance challenging.
To achieve a statistical power of 0.8, a minimum of 200 000 participants is estimated to be required, even when combining other pathogenic variants such as disruptive single-nucleotide variations (SNVs).Future studies will require more complex study designs to investigate the role of biallelic pathogenic variants in PRKN in NPDs, rather than relying solely on large sample sizes in a one-stage genetic association study.
Nevertheless, the discovery of this biallelic carrier is valuable, as this carrier was also diagnosed with EOPD, a recessive form of PD commonly caused by pathogenic variants in PRKN. 31 This diagnosis supports the pathogenicity of the two CNVs (CNV23 and CNV56) carried by the individual, suggesting that they may lead to the functional loss of both copies of PRKN in a compound heterozygous manner. This finding is significant because 10 other monoallelic carriers of the same pathogenic CNV in exon 2 did not exhibit PD symptoms, further highlighting the lack of effect of monoallelic pathogenic CNVs in PRKN in the absence of a second hit.
Here, we report the frequency and distribution of LP-CNVs in PRKN in a Japanese population of 6890 participants by systematically screening for CNVs using high-resolution arrays. We found that monoallelic carriers of LP-CNVs in PRKN were relatively common (1.02%), whereas biallelic carriers were rare (1/6890). The distribution of CNVs in NPD cases and controls showed no obvious differences, and the clustering toward exons 2-4 was similar to that observed in other populations. 25 Notably, we identified 300-kb duplications in exon 2, which were larger than those previously reported in the European population, 25 but were also uniform in length and likely had identical consequences, as both involved only exon 2. Whether this 300-kb duplication is population-specific requires further investigation.
To our knowledge, our case-control study is the first to systematically examine the association of CNVs in PRKN with SCZ since the original discovery, 9 and the largest to examine the association of CNVs in PRKN with NPDs. 16,18,19 Therefore, we have provided a better perspective for interpreting CNVs in PRKN in patients with NPDs by highlighting the importance of reassessing results from previous genetic studies and reporting negative results, which was also suggested by recent large-scale studies on PRKN. 25,28,32
This study does have a few limitations worth mentioning. First, while aCGH is a sensitive and reliable method for detecting CNVs, it cannot identify SNVs. Therefore, there may be biallelic carriers whose second pathogenic variant is an SNV that was not detected.
However, because pathogenic SNVs in PRKN are rare, 25 this limitation is unlikely to affect the main conclusion of the present study.
Second, CNV detection in this study was performed using two platforms, NimbleGen and Agilent. While some may question the validity of this approach, our previous research has demonstrated that the CNV detection resolution of these two platforms is comparable under the settings used in this study. 33
In conclusion, the findings of the present study indicate that monoallelic CNVs in PRKN do not confer a significant risk for SCZ or ASD. However, further studies are warranted to investigate the association between biallelic CNVs in PRKN and NPDs. We also found that CNVs in PRKN were relatively common in the Japanese population and that their distribution did not differ from that in other populations.

2.2 | Array comparative genomic hybridization (aCGH)
Two aCGH platforms were utilized for CNV detection in this study: the NimbleGen 720 K Whole-Genome Tiling array (Roche NimbleGen) and the Agilent SurePrint G3 human CGH 400 K (Agilent). CNV calls were generated using the Fast Adaptive States Segmentation Technique 2 algorithm implemented in Nexus Copy Number software v9.0 (BioDiscovery). The log2 ratio thresholds for CNV calls on both platforms were set as follows: (i) 10-500 kb: −0.6 (deletion) and 0.4 (duplication); and (ii) >500 kb: −0.4 (deletion) and 0.3 (duplication). A significance threshold of 1 × 10⁻⁶ was applied to adjust the sensitivity of the segmentation algorithm, and at least three contiguous probes were required for a CNV call in both the NimbleGen and Agilent arrays.
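The size-dependent log2 ratio thresholds can be illustrated with a short function. This sketch covers only the final thresholding step applied to an already segmented region; it does not reproduce the segmentation algorithm itself. The function name, its arguments, the inclusive comparisons, and the choice to return no call for segments under 10 kb are assumptions made for illustration.

```python
MIN_PROBES = 3  # at least three contiguous probes are required for a call


def call_cnv(length_bp, mean_log2_ratio, n_probes):
    """Return "deletion", "duplication", or None for one segment."""
    if n_probes < MIN_PROBES:
        return None
    if length_bp < 10_000:
        # Assumption: no thresholds are given below 10 kb, so no call is made.
        return None
    if length_bp <= 500_000:
        deletion_cutoff, duplication_cutoff = -0.6, 0.4
    else:
        deletion_cutoff, duplication_cutoff = -0.4, 0.3
    if mean_log2_ratio <= deletion_cutoff:
        return "deletion"
    if mean_log2_ratio >= duplication_cutoff:
        return "duplication"
    return None
```

Under these rules a 600-kb segment with a mean log2 ratio of −0.5 is called a deletion, whereas a 300-kb segment with the same ratio is not, which shows how the looser cutoffs apply only to larger segments.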

FIGURE 1
Workflow and classifications of CNVs in the present study. (A) Workflow of the present study. (B) Schematic illustration of classifications based on overlapping patterns following the guidelines for the interpretation of CNVs provided by ACMG and ClinGen. Red bars and dark blue bars represent deletions and duplications, respectively. Bars framed by the dotted line represent classifications prioritized as LP-CNVs. 2A: Deletions completely overlapping an established HI gene/genomic region; 2C-1: Deletions partially overlapping with the 5′ end (3′ end not involved) and the coding sequence is involved; 2C-2: Deletions partially overlapping with the 5′ end (3′ end not involved) and only the 5′ untranslated region is involved; 2D-1: Deletions partially overlapping with the 3′ end (5′ end not involved) and only the 3′ untranslated region is involved; 2D-2/3: Deletions partially overlapping with the 3′ end (5′ end not involved) and only the last exon is involved; 2D-4: Deletions partially overlapping with the 3′ end (5′ end not involved) that include other exons in addition to the last exon, where nonsense-mediated decay is expected to occur; 2E: Deletions with both breakpoints within the same gene; 2H: Duplications fully containing an HI gene; 2I: Duplications with both breakpoints within the same gene; 2J: Duplications with one breakpoint within the established HI gene. ACMG, American College of Medical Genetics and Genomics; ClinGen, Clinical Genome Resource; HI, haploinsufficient; LP-CNV, likely pathogenic copy number variation.

FIGURE 2
Genomic locations, distribution, and consequences of LP-CNVs in PRKN. (A) Genomic locations of LP-CNVs in PRKN. Red bars and dark blue bars represent deletions and duplications identified in the monoallelic carriers, respectively. The yellow bar and light blue bar represent the deletion (CNV56) and the duplication (CNV23) identified in the biallelic carrier. The genomic coordinates correspond to the NCBI36/hg18 build of the human genome assembly. (B) Distribution of LP-CNVs in the 12 exons of PRKN in NPD cases (n = 4219) and controls (n = 2671). (C) Predicted consequences of the mRNA structure due to CNVs in PRKN. Duplication or deletion disrupting reading frames and resulting in a premature stop codon that leads to NMD (e.g., CNV23 and CNV56). Duplication spanning one or more exons preserving reading frames and leading to mRNA with exon repetition (e.g., CNV53 and CNV80). ASD, autism spectrum disorder; CON, healthy controls; LP-CNV, likely pathogenic copy number variation; del, deletion; dup, duplication; NMD, nonsense-mediated mRNA decay; NPD, neuropsychiatric disorder; nt, nucleotide; SCZ, schizophrenia.

TABLE 1
Association analysis of LP-CNVs in PRKN with risk of SCZ or ASD.