An increased burden of rare exonic variants in NRXN1 microdeletion carriers is likely to enhance the penetrance for autism spectrum disorder

Abstract Autism spectrum disorder (ASD) is characterized by a complex polygenic background, but with the unique feature of a subset of cases (~15%‐30%) presenting a rare large‐effect variant. However, clinical interpretation in these cases is often complicated by incomplete penetrance, variable expressivity and different neurodevelopmental trajectories. NRXN1 intragenic deletions represent the prototype of such ASD‐associated susceptibility variants. From chromosomal microarray analysis of 104 ASD individuals, we identified an inherited NRXN1 deletion in a trio family. We carried out whole‐exome sequencing and deep sequencing of mitochondrial DNA (mtDNA) in this family, to evaluate the burden of rare variants which may contribute to the phenotypic outcome in NRXN1 deletion carriers. We identified an increased burden of exonic rare variants in the ASD child compared to the unaffected NRXN1 deletion‐transmitting mother, which remains significant if we restrict the analysis to potentially deleterious rare variants only (P = 6.07 × 10⁻⁵). We also detected significant interaction enrichment among genes with damaging variants in the proband, suggesting that additional rare variants in interacting genes collectively contribute to crossing the liability threshold for ASD. Finally, the proband's mtDNA presented five low‐level heteroplasmic mtDNA variants that were absent in the mother, and two maternally inherited variants with increased heteroplasmic load. This study underlines the importance of a comprehensive assessment of the genomic background in carriers of large‐effect variants, as penetrance modulation by additional interacting rare variants might represent a widespread mechanism in neurodevelopmental disorders.

| INTRODUCTION
Autism Spectrum Disorder (ASD) is a heterogeneous neurodevelopmental disorder with a high prevalence (>1%) and a remarkable social burden, with no effective pharmacological treatments. 1 Despite the high heritability, the vast majority of genetic risk factors still remain unknown and only about 15%-30% of ASD individuals have an identifiable genetic cause. 2 Several years of investigation have led to considerable progress in the identification of a large number of risk genes and the delineation of a heterogeneous and complex genetic architecture. It is now evident that ASD, like other neuropsychiatric disorders, has a polygenic basis, but with the peculiarity of a subset of cases where a large-effect variant is present. Advances in microarrays and whole-exome sequencing (WES) enabled the discovery of many rare variants of large effect, both structural and single-nucleotide variants, which pinpointed more than 100 high-confidence specific genes and genetic loci. [3][4][5] Even if the identification of rare genic likely pathogenic mutations can be extremely informative, their translation into clinical settings is not straightforward, as many of them are characterized by incomplete penetrance and variable expressivity. NRXN1 is transcribed in neurons from two independent promoters which generate a longer alpha (α) and a shorter beta (β) isoform, composed of distinct extracellular domains but with an identical intracellular sequence. Moreover, through extensive alternative splicing, thousands of isoforms are produced and differentially expressed throughout the brain, with a likely role as surface recognition molecules that specify synapses. 9 The NRXN1 locus is extremely prone to non-recurrent deletions of different size and breakpoint location, resulting from chromosomal rearrangements due to genomic instability.
Rare intragenic deletions spanning NRXN1 have been described in individuals with ASD, Attention-Deficit/Hyperactivity Disorder (ADHD), intellectual disability (ID), epilepsy, schizophrenia and bipolar disorder, but also in unaffected parents, siblings and healthy controls, suggesting reduced penetrance and the contribution of other interacting genetic and/or environmental factors that influence clinical features and severity. 7,8 The enhanced frequency of additional pathogenic CNVs in cohorts of patients carrying NRXN1 deletions has added support to the role of secondary and independently segregating genetic risk factors in the definition of the final phenotype in children with inherited deletions. Previous studies suggested that deletion extent and exon content may also play a role in this clinical heterogeneity; however, there is no consensus on the different penetrance of 5′ NRXN1 deletions (exons 1-6, NM_001135659.2) versus 3′ NRXN1 deletions (exons 7-24, NM_001135659.2). Specifically, one study proposed a lower penetrance of 3′ NRXN1 deletions as they are more frequently co-occurring with another rare and often pathogenic CNV, 7 while a more recent study reported a higher penetrance of 3′ deletions, given the much higher frequencies in cases versus controls and the higher de novo occurrence. 8 In order to identify pathogenic CNVs in ASD probands, we performed array-based comparative genomic hybridization (aCGH) on a cohort of 104 ASD individuals from 89 Italian families. A 3′ NRXN1 deletion was identified in a trio family in which a girl with ASD inherited the deletion from the unaffected mother. 
To further explore the impact of genetic background on the penetrance of NRXN1 deletions, a detailed clinical evaluation of the girl and her parents was combined with genetic analysis including (a) WES, in order to characterize the background of rare variants and (b) deep sequencing of the entire mitochondrial genome, with the aim of detecting rare pathogenic mutations and evaluating the burden of low-heteroplasmy variants. Mitochondrial DNA (mtDNA) defects could, indeed, represent an overlooked contributing factor to ASD susceptibility, by reducing mitochondrial function sufficiently to fall below the brain's bioenergetics threshold. 10 Moreover, a recent study has provided evidence for a coordinated down-regulation of synaptic and mitochondrial function genes in post-mortem brain of ASD subjects, 11 suggesting that a mitochondrial dysfunction might enhance the clinical outcome of NRXN1 deletions.

| Participants
A total of 89 Italian families with one or more children with an ASD diagnosis were recruited at the UOSI Disturbi dello Spettro Autistico, The ASD sample includes 78 males and 26 females, from 18 multiplex and 71 simplex families. DNA samples from both parents of the proband were available for 87 families, and from a single parent for the remaining two. All DNA samples were extracted from whole blood.
All participants provided written informed consent to participate in this study. This study was approved by the local Ethical Committee (Comitato Etico di Area Vasta Emilia Centro (CE-AVEC); code CE 14060). All research was performed in accordance with the relevant guidelines and regulations.

| Copy number variant analysis
Array-based comparative genomic hybridization (aCGH) was per- for CNV interpretation and reporting. 12 According to these criteria, the only identified pathogenic CNV is a deletion involving the NRXN1 gene in a girl with ASD from a simplex family. Parental inheritance and validation of the NRXN1 deletion were assessed by quantitative PCR (qPCR) using SsoAdvanced™ Universal SYBR® Green Supermix (Bio-Rad). The assay was performed in triplicate, with four sets of primers corresponding to the region of interest and another mapping to a control region in the FOXP2 gene at 7q31.1. The number of copies of each amplified fragment was calculated using the ΔΔCt (ddCt) method.
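The ddCt calculation used for qPCR copy-number estimation can be illustrated with a minimal sketch. It assumes 100% amplification efficiency and a two-copy control sample; the function name and all Ct values are hypothetical and are not taken from this study.

```python
from statistics import mean

def ddct_copy_number(target_ct_sample, ref_ct_sample,
                     target_ct_control, ref_ct_control,
                     control_copies=2):
    """Estimate copy number of a target amplicon with the ddCt method.

    Each argument is a list of replicate Ct values (e.g. triplicates).
    Assumes ~100% PCR efficiency, so one cycle equals a two-fold change.
    """
    # dCt: target amplicon normalised to the reference region (e.g. FOXP2)
    d_ct_sample = mean(target_ct_sample) - mean(ref_ct_sample)
    d_ct_control = mean(target_ct_control) - mean(ref_ct_control)
    # ddCt: tested individual relative to a control with a known copy number
    dd_ct = d_ct_sample - d_ct_control
    return control_copies * 2 ** (-dd_ct)

# Hypothetical Ct values: the target amplifies one cycle later in the
# tested individual than in the control, as expected for a heterozygous deletion.
copies = ddct_copy_number(
    target_ct_sample=[26.0, 26.1, 25.9],
    ref_ct_sample=[25.0, 25.0, 25.0],
    target_ct_control=[25.0, 25.1, 24.9],
    ref_ct_control=[25.0, 25.0, 25.0],
)
print(f"Estimated copies: {copies:.2f}")
```

A value close to 1 is compatible with a heterozygous deletion, whereas a value close to 2 indicates a normal diploid copy number.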

| Whole-exome analysis
DNA from the ASD proband and parents was subjected to exome capture using the NimbleGen SeqCap EZ MedExome enrichment kit (Roche), followed by paired-end sequencing on an Illumina NextSeq550 (Illumina Inc, San Diego, CA, USA). Exomes had a read depth (DP) of 10× or more for 90% of the total exome coverage and 20× or more for 80%. Coverage statistics and comparison of coverage between samples were computed using QualiMap, 13 showing a mean coverage depth of 120-122× for the three samples and a mean mapping quality of 58. Data analysis was performed using CoVaCS, 14 a pipeline exploiting a consensus call-set approach from three different algorithms (GATK, Varscan and Freebayes) to generate a final set of high-confidence variants. All variants were annotated with ANNOVAR, using RefSeq for gene-based annotation (position, nomenclature, gene name, gene function). In order to remove low-quality variants, genotypes were required to have DP ≥ 10 and Genotype Quality (GQ) ≥ 20. A minor allele frequency (MAF) threshold of ≤0.5% in gnomAD exome (https://gnomad.broadinstitute.org/) 15 (and <1% in gnomAD genome and the 1000 Genomes Project 16 ) was chosen. We selected exonic and splicing variants, excluding synonymous variants. Likely deleterious variants were prioritized to capture Likely Gene Disrupting (LGD) and damaging missense mutations.
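The quality and frequency filters described above can be sketched as follows. This is an illustrative reconstruction, not the pipeline used in the study: the `Variant` record, its field names and the example variants are hypothetical, and a variant absent from a population database is assumed to have a frequency of zero.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Variant:
    gene: str                           # placeholder gene label
    region: str                         # e.g. "exonic", "splicing", "intronic"
    effect: str                         # e.g. "missense", "synonymous", "stopgain"
    dp: int                             # genotype read depth
    gq: int                             # genotype quality
    maf_gnomad_exome: Optional[float]   # None = not observed in the database
    maf_gnomad_genome: Optional[float]
    maf_1000g: Optional[float]

def _maf(value: Optional[float]) -> float:
    # Assumption: a variant missing from a database is treated as MAF = 0
    return 0.0 if value is None else value

def passes_filters(v: Variant) -> bool:
    good_quality = v.dp >= 10 and v.gq >= 20
    rare = (_maf(v.maf_gnomad_exome) <= 0.005      # <= 0.5% in gnomAD exome
            and _maf(v.maf_gnomad_genome) < 0.01   # < 1% in gnomAD genome
            and _maf(v.maf_1000g) < 0.01)          # < 1% in 1000 Genomes
    coding = v.region in {"exonic", "splicing"} and v.effect != "synonymous"
    return good_quality and rare and coding

# Hypothetical variants, for illustration only
variants = [
    Variant("GENE_A", "exonic", "missense", 45, 99, 0.0001, None, None),
    Variant("GENE_B", "exonic", "synonymous", 60, 99, 0.0002, 0.0003, None),
    Variant("GENE_C", "exonic", "stopgain", 8, 99, None, None, None),
    Variant("GENE_D", "splicing", "splicing", 30, 50, 0.02, 0.02, 0.01),
]
kept = [v.gene for v in variants if passes_filters(v)]
print(kept)
```

In this toy set only the first variant is retained: the second is synonymous, the third has insufficient depth and the fourth is too common.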

| Deep sequencing of mitochondrial genome
Direct sequence analysis of the entire mtDNA molecule was performed on total DNA extracted from blood, using a next generation sequencing (NGS) approach. 23 Briefly, the mitochondrial genome was amplified in two long-range PCR reactions, the NGS library was constructed with Nextera XT (Illumina) and paired-end sequenced on a NextSeq instrument (Illumina), using a High Output Kit (300 cycles). Fastq files were analysed with an in-house pipeline, integrating three different callers (MToolBox, Unified Genotyper of GATK and DetermineVariants) [24][25][26] to detect low-level heteroplasmy.

| Mitochondrial DNA quantification
MtDNA content was assessed on total DNA extracted from blood, using a multiplex probe-based real-time PCR method, 27 co-amplifying a mitochondrial gene (MT-ND2) and a nuclear gene (FASLG). All three individuals were compared with age-matched control groups of healthy individuals.

| Array comparative genomic hybridization (aCGH)
Array CGH was performed in 104 individuals with ASD.
The only pathogenic CNV identified was a deletion of ~811 kb at 2p16.3 (NC_000002.11:g.50170766_50982172del) (Figure 1A) involving exons 7 to 23 of the NRXN1 gene (NM_001135659.2) in a trio family. qPCR in all family members confirmed the presence of the NRXN1 intragenic deletion in the proband and showed that the CNV is inherited from the unaffected mother (Figure S1). We identified three other rare CNVs in this family (Table S1), but none of them is considered to be clinically relevant.
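As a quick arithmetic check, the reported size of ~811 kb follows directly from the genomic coordinates in the HGVS description, since both breakpoint positions are included in the deleted interval. The helper below is a hypothetical sketch that handles only this simple `g.start_enddel` form.

```python
import re

def hgvs_deletion_length(hgvs: str) -> int:
    """Length in bp of a simple genomic deletion such as 'NC_x:g.100_200del'."""
    match = re.fullmatch(r"[^:]+:g\.(?P<start>\d+)_(?P<end>\d+)del", hgvs)
    if match is None:
        raise ValueError(f"Unsupported HGVS description: {hgvs}")
    start, end = int(match["start"]), int(match["end"])
    # HGVS ranges are inclusive at both ends
    return end - start + 1

size_bp = hgvs_deletion_length("NC_000002.11:g.50170766_50982172del")
print(f"{size_bp} bp (~{size_bp / 1000:.0f} kb)")
```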

| Clinical characterization of the family with the NRXN1 deletion
No family history of ASD, congenital malformations or intellectual disability was reported. The female proband was born without relevant pre-, peri- or post-natal findings. Birth weight was 3690 g.
Development of socio-communicative and motor abilities was reported to be slightly delayed until 18 months of age with acquisition of some words. At 18-19 months of age, parents reported a regression of the acquired socio-communicative abilities: the girl stopped responding to simple commands and to her name. Eye-contact was lacking, while social isolation started to be more evident. Imitation skills, communicative gestures and language stopped. She started to show hyperactivity, short attention span and motor stereotypies such as hand flapping when excited. She manifested sensory interests manipulating materials mostly to get visual, acoustic and tactile stimulation (ie passing a hair or thread upon her lips, thread waving and ripping paper in thin stripes) and a restricted interest for hair and threads. At 4.6 years old, language expression was limited to 4 single words, while language comprehension seemed to be relatively better. Hyperactivity appeared to be slightly reduced. No epileptic seizures were reported. Diagnosis of ASD was made at 35 months old through clinical observation, the Childhood Autism Rating Scale-Second edition (CARS2-ST) 28 31,32 ). To contextualize their phenotypic profiles beyond questionnaires, we also investigated their education status: they both reached a good education level (high school diploma and bachelor's degree, respectively) with no need for special education or services. No family history of psychiatric disorders was present, except for the paternal grandfather who was reported to take depression medication.

| Whole-exome sequencing
A trio-based whole-exome sequencing (WES) approach was undertaken for this family. We focused our analysis on rare variants (MAF ≤ 0.5%), and more specifically on those predicted to have a functional effect, including LGD variants and missense variants defined as damaging according to a combination of prediction algorithms. 17 We compared the load of rare variants between the proband and her mother: the proband has a higher number of rare variants than the NRXN1 deletion-transmitting mother (1036 versus 573). This difference remains significant when considering the putative damaging variants only (303 in the proband vs 212 in the mother, χ² = 16.08, P = 6.07 × 10⁻⁵). We then tested for transmission disequilibrium of damaging variants from the parents to the proband and we detected a preferential transmission of damaging variants from the father (203 transmitted vs 165 untransmitted variants, TDT P = 0.048) but not from the mother.
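The two reported statistics can be reproduced with a one-degree-of-freedom chi-square test on the paired counts. The sketch below assumes that the proband/mother comparison tests the two counts against an equal expected split (an assumption that recovers the reported value of 16.08) and uses the standard TDT form (b − c)²/(b + c) for transmitted versus untransmitted variants; the function name is illustrative.

```python
import math

def chi2_two_counts(a: int, b: int):
    """1-df chi-square statistic and P value for two counts vs an equal split.

    With expected counts (a + b) / 2 each, the statistic simplifies to
    (a - b)^2 / (a + b), which is also the form of the TDT statistic.
    """
    chi2 = (a - b) ** 2 / (a + b)
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Putative damaging rare variants: proband vs deletion-transmitting mother
burden_chi2, burden_p = chi2_two_counts(303, 212)
# Paternal damaging variants: transmitted vs untransmitted
tdt_chi2, tdt_p = chi2_two_counts(203, 165)

print(f"Burden: chi2 = {burden_chi2:.2f}, P = {burden_p:.2e}")
print(f"Paternal TDT: chi2 = {tdt_chi2:.2f}, P = {tdt_p:.3f}")
```

Using `math.erfc` avoids an external dependency; for one degree of freedom it gives the same upper-tail probability as a chi-square survival function.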
Genes with at least one LGD or putative damaging missense variant or CNV were analysed using the STRING database 33 Table 2). Among them, the missense variant in CNTNAP5 is likely to exert a significant role, given that CNTNAP5 is itself a previously known ASD candidate gene, likely to be intolerant to mutations (RVIS percentile = 8.4) (Figure 2). Finally, we investigated the mother's burden of rare variants in these risk categories and, although we are not able to identify de novo and compound heterozygous variants, there is still a higher number of variants in the proband compared with the mother (11 vs 6 variants) (Table S2).

| Mitochondrial DNA analyses
We carried out deep sequencing of the entire mtDNA in the ASD proband and her parents (Table S3 and S4). Both the proband and her mother showed all defining variants of haplogroup H13a1a1 of European ancestry, whereas the father's variants identified the haplogroup L2c2b1b background of African ancestry. None of the rare variants were previously reported (private or unique to an individual) or predicted as pathogenic.
Taking advantage of the high mean coverage in sequencing (16721X in the proband, 12090X in the mother and 19027X in the father), we were able to detect variants with a low-level heteroplasmy (between 0.2% and 15%), in all three individuals (Table S3 and S4).
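To give a sense of why high coverage matters for low-level heteroplasmy, the sketch below computes how many variant-supporting reads would be expected at the 0.2% lower bound for each family member. It uses the mean coverage values reported above as a stand-in for per-site depth, which is a simplifying assumption (actual depth varies by position), and the helper names are hypothetical.

```python
MEAN_COVERAGE = {"proband": 16721, "mother": 12090, "father": 19027}
LOW_LEVEL_RANGE = (0.002, 0.15)  # 0.2% to 15% heteroplasmy

def heteroplasmy(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a site that carry the variant allele."""
    return alt_reads / total_reads

def is_low_level(fraction: float) -> bool:
    low, high = LOW_LEVEL_RANGE
    return low <= fraction <= high

# Expected number of variant-supporting reads at the 0.2% lower bound
expected_reads = {who: depth * LOW_LEVEL_RANGE[0]
                  for who, depth in MEAN_COVERAGE.items()}
for who, reads in expected_reads.items():
    print(f"{who}: ~{reads:.0f} supporting reads at 0.2% heteroplasmy")

# Hypothetical site: 50 variant reads out of 10 000 gives 0.5% heteroplasmy
example = heteroplasmy(50, 10_000)
print(f"{example:.3%} low-level: {is_low_level(example)}")
```

At these depths a 0.2% variant is still supported by a few dozen reads, whereas at a typical exome depth of about 120× it would be expected in less than one read.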
Two of these were present in the proband and inherited from the To verify the remote possibility of biparental inheritance, as recently reported, 34 we also compared the mtDNA sequences of the proband and her father. With the exception of the reference sequence private variants, 35 the proband and her father did not share any other variants, not even at low-level heteroplasmy (Table S3 and S4).
Last, the mtDNA content of the proband and her parents was comparable to the range of age-matched healthy individuals (Figure S2).

| DISCUSSION
In this study, we identified an intragenic NRXN1 deletion in a female proband with ASD, who has inherited it from the unaffected mother.
We have thus characterized rare variants in the nuclear and mitochondrial genome of this family, in order to investigate their contribution towards the manifestation of the ASD phenotype in NRXN1 deletion carriers.
The NRXN1 deletion identified in this family can be classified as a 3′ deletion, 8 as it overlaps exons 7 to 23 (NM_001135659.2). The deletion gives rise to a putative in-frame transcript, lacking the majority of NRXN1 protein domains (from Gly311 to Leu1445), specifically all α-neurexin LNS-domains (laminin/neurexin/sex hormone-binding globulin domains) except the first one, and the two intercalated epidermal growth factor (EGF)-like domains, while maintaining the transmembrane and intracellular C-terminal domain (Figure 1B). Moreover, the deletion impacts the canonical splice sites (SS2 to SS6), including SS4, thought to represent a key mechanism for the regulation of NRXN-ligand interactions at synapses. 9 As the 3′ deletion identified in our proband is in-frame, it is possible that the phenotypic effect of this deletion may arise by two concurrent mechanisms: haploinsufficiency due to lack of wild-type NRXN1α isoforms, and a dominant-negative activity of the mutant splice isoform, as suggested by a recent study using human induced pluripotent stem cell (hiPSC)-derived neurons from subjects with heterozygous intragenic deletions. 36 The proband's phenotype is mainly compatible with clinical features of 3′ deletion carriers, as the girl with ASD has macrocephaly Finally, the proband did not carry any pathogenic mutation in her mtDNA. However, we found five variants with low-level heteroplasmy (ranging from 0.2% to 0.7%) that were absent in the mother, and two maternally inherited variants whose heteroplasmic load was higher in the proband than in the mother. The burden of low-level heteroplasmic mtDNA variants, either inherited or de novo, also known as universal heteroplasmy, 56 might contribute to the risk of developing ASD, but further analyses on large cohorts are needed to validate this hypothesis. The mtDNA copy number was also uninformative.

TABLE 1 Summary of clinical data
In conclusion, we have characterized a trio family in which a large 3′ exonic NRXN1 deletion is transmitted from an unaffected mother to a child with ASD. Exonic NRXN1 deletions represent the prototype of incompletely penetrant ASD-associated susceptibility variants, as they are often inherited from unaffected or mildly affected parents, but they are still considered pathogenic.
The key finding is the presence of an increased burden of exonic rare variants in the affected proband compared to the unaffected deletion-transmitting mother, supporting the hypothesis that the NRXN1 deletion sensitizes the genome to a clinical manifestation, but other genetic contributors are necessary to cross the threshold for a phenotypic manifestation (Figure 3). 57 Moreover, the reduced penetrance of the NRXN1 deletion in the unaffected mother is consistent with a female protective effect: females would require an excess burden of deleterious CNVs and SNVs to reach the ASD diagnostic threshold. 58  shown that also common polygenic variation contributes additively to ASD risk, even in cases that carry a strongly acting variant. 60,61 A limitation to our study is therefore that we were not able to test the potential contribution of common variation in modulating the penetrance of the NRXN1 deletion in this family.
Further investigation in a large dataset will be necessary to properly evaluate the cumulative effects of rare deleterious and common variants on ASD risk in NRXN1 deletion carriers; family-based samples will be particularly informative, as they allow intrafamilial comparison of phenotypic features and of the inheritance pattern of specific variants.
This study underlines the importance of a comprehensive assessment of the genomic landscape of ASD individuals even when a 'likely pathogenic' variant has been already identified, as it is apparent that multiple rare variants contribute in conjunction to the overall genetic risk and the final clinical outcome. It is time to move from a genetic to a genomic perspective, shifting from a single variant analysis to an integrated view of many variants of different origin (nuclear and mitochondrial), types (CNVs and SNVs), inheritance pattern (de novo and inherited), frequency (rare and common, heteroplasmic and homoplasmic) and effect sizes, considering the role of protein-protein interactions.

ACKNOWLEDGMENTS
We gratefully acknowledge all the subjects who have participated in the study. We thank the Centro Interdipartimentale di Ricerche sul Cancro 'Giorgio Prodi' (CIRC), University of Bologna for Illumina Sequencing Service and CINECA for computational resources.

FIGURE 3 Multi-factorial threshold model for ASD in this family. Each family member has an ASD risk cup with balls representing risk variants that contribute to ASD with variable degrees of impact. In both parents, the burden of risk variants is not enough to develop ASD, while in the child the ASD threshold is reached as a combination of strong and weak, inherited and de novo genetic variants. The NRXN1 deletion is depicted as a strong, primary contributing factor to reaching the ASD threshold in the ASD child, but not sufficient alone to develop ASD in the deletion carrier mother. 57

CONFLICT OF INTEREST
The authors confirm that there are no conflicts of interest.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available in the supplementary material of this article.