Segregating patterns of copy number variations in extended autism spectrum disorder (ASD) pedigrees

Autism spectrum disorder (ASD) is a relatively common childhood-onset neurodevelopmental disorder with a complex genetic etiology. While progress has been made in identifying the de novo mutational landscape of ASD, the genetic factors that underpin ASD's tendency to run in families are not well understood. In this study, nine extended pedigrees, each with three or more individuals with ASD and others with a lesser autism phenotype, were phenotyped and genotyped in an attempt to identify heritable copy number variants (CNVs). Although these families have previously generated linkage signals, no rare CNV segregated with these signals in any family. A small number of clinically relevant CNVs were identified. Only one CNV was identified that segregated with ASD phenotype: a duplication overlapping DLGAP2 in three male offspring, each with an ASD diagnosis. This gene encodes a synaptic scaffolding protein, part of a group of proteins known to be pathologically implicated in ASD. On the whole, however, the heritable nature of ASD in the families studied remains poorly understood.

Autism spectrum disorder (ASD) is one of the more common childhood-onset developmental disorders, occurring with a population prevalence of more than 1% (Centers for Disease Control & Prevention, 2012). It is characterized by core socio-communicative and behavioral symptoms, and associated with major psychiatric and medical disorders (Anagnostou et al., 2014). Variants of different allele frequencies are now known to play an etiological role, with most progress having been made in the identification of rare, often de novo, copy number variants (CNVs) and single nucleotide variants (Woodbury-Smith & Scherer, 2018). As a result of ascertainment and study design, many studies focus only on nuclear families, including singletons, discordant sibpairs, and affected sibpairs, and so the wider segregation of variants has not been studied. What is particularly striking about ASD, however, is that it tends to segregate in families, with first- and second-degree relatives of ASD probands often diagnosed with ASD or classified with a lesser variant termed the broader autism phenotype (BAP; Sasson et al., 2013).
Heritability estimates as high as 90% support a major role for genetic factors rather than common environment in explaining this familial nature of ASD and BAP (Tick, Bolton, Happe, Rutter, & Rijsdijk, 2016). There is also evidence that genetic predisposition may extend to other neuropsychiatric disorders in families (De Rubeis & Buxbaum, 2015).
The fact that ASD often shows familial segregation has been investigated in a series of genome-wide linkage studies (Abrahams & Geschwind, 2008; Allen-Brady et al., 2010; Szatmari, 2007). These have identified a number of genome-wide significant signals for ASD and related phenotypes, earmarking genomic loci that could potentially harbor ASD-associated variants. Some of these loci overlap genes of potential significance to ASD, while other loci overlap regions linked and/or associated with other neurodevelopmental disorders. Only two loci, at 7q35 (Abrahams & Geschwind, 2008) and 20p13 (Werling, Lowe, Luo, Cantor, & Geschwind, 2014), have been replicated at genome-wide significance.
More recent studies have also investigated the segregation of CNVs across families (Woodbury-Smith et al., 2015). We reported previously on the segregation of CNVs in 19 US and Canadian extended pedigrees (Woodbury-Smith et al., 2015). Such families are likely to be enriched for heritable forms of the disorder, and provide the opportunity to track variants more widely (Wijsman, 2012). Although we did not identify any rare variants that were shared widely among family members with ASD and/or BAP, there were examples of CNVs that segregated with ASD phenotype in a subset of family members. None coincided with identified linkage signals in these same families, pointing to a potential source of complexity in ASD's heritable genetic architecture.
There is also evidence that CNVs of potential etiological significance may segregate with other psychiatric disorders in such families (Sato et al., 2012), consistent with the evidence of shared genetic risk, and earmarking loci that cause brain vulnerability rather than disorder per se.
In the current study, we focused on an examination of CNVs in nine newly recruited Canadian extended pedigrees. Having already undertaken a genome-wide linkage study that included these same families, we were interested in whether any rare ASD and/or BAP segregating CNVs overlapped with any of the identified linkage signals. Additionally, and based on our previous study, we were also interested in identifying rare CNVs that were shared among ASD/BAP family members.

| Recruitment
We recruited extended pedigrees with at least three ASD cases spread across at least two nuclear families. All families were either known to the authors through previous studies or identified through advertising. Initial screening was by way of a telephone-based assessment in conjunction with the Autism Family History Interview (Piven et al., 1990). All families who met the inclusion criteria were then followed for further clinical assessment. To minimize etiologic heterogeneity, families were to be excluded from the study if there was evidence in one of the index probands of any of the following co-occurring medical conditions thought to be etiologically related to autism: tuberous sclerosis, neurofibromatosis, phenylketonuria, Fragile X syndrome, or significant CNS injury. No such families were identified. All individuals were of northern European ancestry. Data collection took place under Institutional Review Board approval, and the research was conducted in accordance with the World Medical Association Declaration of Helsinki. Written informed consent was obtained from subjects or their proxy decision maker after the study had been fully explained.

| Clinical diagnosis
ASD diagnosis was confirmed by expert clinical judgment incorporating information from the Autism Diagnostic Interview-Revised (Lord, Rutter, & Le Couteur, 1994) and Autism Diagnostic Observation Schedule-Revised (Lord et al., 2000), which were administered by trained and reliable clinicians. All participants classified as ASD met DSM-IV criteria for either Autistic Disorder, Asperger syndrome, or Pervasive Developmental Disorder Not Otherwise Specified according to the criteria in Risi et al. (2006). Although DSM-IV criteria were used at the time of assessment, all diagnosed participants also meet the DSM-5 criteria for ASD.
Among non-ASD family members, assessment for BAP was undertaken. The BAP Questionnaire was used for diagnosis of BAP in individuals older than 15 years (Hurley, Losh, Parlier, Reznick, & Piven, 2007). The measure was completed by the participant about him/herself (the self version) and by someone close to the participant, such as a parent or spouse (the informant version), and the self and informant scores were averaged. Whenever available, the average scores were utilized. A BAP diagnosis was assigned if an individual met gender-specific criteria in any domain.
Additional clinical evaluation was undertaken using the Wechsler Abbreviated Scale of Intelligence (Pearson Clinical), the Vineland Adaptive Behavior Scale (Pearson Clinical), the Repetitive Behavior Scale (Lam & Aman, 2007), and the Oral and Written Language Scales (Pearson Clinical). Clinical information pertaining to diagnosis of other neurodevelopmental disorders, including epilepsy, along with known neonatal complications was collected using a structured parent-completed questionnaire.

| Genotyping
[...] Partek Genomics Suite (Grayson & Aune, 2011). A stringent set of variants was defined for further analyses. For the Affymetrix chip, this set included variants detected by one or both of ChAS or iPattern, and if detected by only one of these, then also by one of Nexus or Partek. For stringent calls on the X chromosome, we required calling by both ChAS and iPattern. For Illumina generated genotypes, stringent calls included variants detected by iPattern, and one of PennCNV (Wang et al., 2007) or QuantiSNP (Colella et al., 2007). Stringent calls on the X chromosome required calling by all three algorithms.
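The stringent calling rules for the two platforms can be sketched as a small predicate. This is only an illustration of the logic as worded in the text, not the pipeline that was actually run; the function name, the platform labels, and the representation of a call as a set of caller names are all assumptions made for the example.

```python
from typing import Set


def is_stringent_call(platform: str, callers: Set[str], on_chr_x: bool = False) -> bool:
    """Return True if a CNV detected by `callers` meets the stringent rule.

    `platform` is "affymetrix" or "illumina" (hypothetical labels);
    `callers` holds the names of the algorithms that detected the CNV.
    """
    if platform == "affymetrix":
        primary = callers & {"ChAS", "iPattern"}
        secondary = callers & {"Nexus", "Partek"}
        if on_chr_x:
            # X chromosome: both ChAS and iPattern are required.
            return len(primary) == 2
        # Autosomes: both primary callers, or one primary plus one secondary.
        return len(primary) == 2 or (len(primary) == 1 and len(secondary) >= 1)
    if platform == "illumina":
        others = callers & {"PennCNV", "QuantiSNP"}
        if on_chr_x:
            # X chromosome: all three algorithms are required.
            return "iPattern" in callers and len(others) == 2
        # Autosomes: iPattern plus at least one of PennCNV or QuantiSNP.
        return "iPattern" in callers and len(others) >= 1
    raise ValueError(f"unknown platform: {platform}")
```

For example, an autosomal Affymetrix call seen by ChAS and Nexus would pass, whereas the same call on the X chromosome would not.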
We analyzed CNVs covered by at least five consecutive probes and with a minimum length of 15 kb. CNVs were filtered to prioritize rare variants that occurred with a frequency of <0.1% in control samples (N = 10,851; refer to Zarrei et al., 2018, for list of control samples). For the purpose of filtering, CNVs with >50% reciprocal overlap were deemed identical. We also removed all CNVs that had >70% overlap with a known segmental duplication. We further restricted our list to those with more than 75% overlap with copy number stable regions, according to the stringent CNV map of the human genome (Zarrei, MacDonald, Merico, & Scherer, 2015). All clinically relevant CNVs described in the index family have been validated using the SYBR green-based quantitative PCR method, TaqMan Copy Number Assays, and/or visualization of WGS generated BAM files (see below) using the Integrated Genomics Browser (IGV; Nicol, Helt, Blanchard, Raja, & Loraine, 2009).
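The probe, length, segmental duplication, and control-frequency filters can be illustrated with a short sketch. The `CNV` class, the half-open coordinate convention, and the requirement that two CNVs be of the same type to count as identical are assumptions for this example; the copy number stable region step is omitted for brevity, and segmental duplication intervals are assumed to be merged (non-overlapping).

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class CNV:
    chrom: str
    start: int      # 0-based start (assumed convention)
    end: int        # exclusive end
    kind: str       # "DEL" or "DUP"
    n_probes: int   # consecutive probes supporting the call

    @property
    def length(self) -> int:
        return self.end - self.start


def overlap_bp(chrom_a, start_a, end_a, chrom_b, start_b, end_b) -> int:
    """Number of base pairs shared by two intervals (0 if on different chromosomes)."""
    if chrom_a != chrom_b:
        return 0
    return max(0, min(end_a, end_b) - max(start_a, start_b))


def same_cnv(a: CNV, b: CNV, min_reciprocal: float = 0.5) -> bool:
    """Treat two CNVs as identical if each covers >50% of the other."""
    if a.kind != b.kind:  # assumption: type must match
        return False
    shared = overlap_bp(a.chrom, a.start, a.end, b.chrom, b.start, b.end)
    return shared > min_reciprocal * a.length and shared > min_reciprocal * b.length


def control_frequency(cnv: CNV, controls: Sequence[Sequence[CNV]]) -> float:
    """Fraction of control samples carrying a CNV deemed identical to `cnv`.

    `controls` must be non-empty; each element is one sample's CNV list.
    """
    carriers = sum(1 for sample in controls if any(same_cnv(cnv, c) for c in sample))
    return carriers / len(controls)


def passes_filters(cnv: CNV, controls: Sequence[Sequence[CNV]],
                   segdups: Sequence[Tuple[str, int, int]]) -> bool:
    """Apply the probe, length, segmental duplication, and rarity filters."""
    if cnv.n_probes < 5 or cnv.length < 15_000:
        return False
    segdup_bp = sum(overlap_bp(cnv.chrom, cnv.start, cnv.end, *sd) for sd in segdups)
    if segdup_bp > 0.70 * cnv.length:
        return False
    return control_frequency(cnv, controls) < 0.001
```

With 10,851 controls, the <0.1% threshold means a CNV is retained only if it is seen in at most 10 control samples.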

| CNV prioritization
We first examined whether any of the rare CNVs overlapped linkage signals in these same families. Linkage signals were defined as those with a PPL ≥ 0.1, whose margins were at PPL ≥ 0.05. We then prioritized deletions and duplications ≥3 Mb in size. Next, variants were filtered to extract only those that overlapped one or more coding exons, and then the following CNVs were extracted: (a) variants overlapping a known ASD gene and (b) variants congruent with known genomic syndromes such as those curated in DECIPHER (Firth et al., 2009) or ClinGen (Rehm et al., 2015). Only those DECIPHER/ClinGen loci that earmarked developmental brain syndromes were retained in our final list.
Further, variants were clinically classified as "likely benign," "of unknown significance" (VOUS), or "likely pathogenic" according to the American College of Medical Genetics guidelines (Kearney et al., 2011). CNVs in cases with whole genome data from the MSSNG project were also visualized in IGV to further investigate the structure of duplications and to refine the breakpoints of variants.
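One way to read the linkage signal definition and the subsequent prioritization steps is sketched below. It is an interpretation rather than the analysis code used in the study: it assumes PPL values sampled at ordered marker positions on a single chromosome, treats a signal as a contiguous run of markers at PPL ≥ 0.05 containing at least one marker at PPL ≥ 0.1, and uses hypothetical function names and labels.

```python
from typing import List, Sequence, Set, Tuple


def linkage_regions(positions: Sequence[int], ppl: Sequence[float],
                    peak: float = 0.10, margin: float = 0.05) -> List[Tuple[int, int]]:
    """Runs of consecutive markers with PPL >= margin that include a PPL >= peak."""
    regions = []
    start = end = None
    has_peak = False
    for pos, value in zip(positions, ppl):
        if value >= margin:
            if start is None:
                start, has_peak = pos, False
            end = pos
            has_peak = has_peak or value >= peak
        else:
            if start is not None and has_peak:
                regions.append((start, end))
            start = None
    if start is not None and has_peak:
        regions.append((start, end))
    return regions


def prioritize(cnv_start: int, cnv_end: int,
               regions: Sequence[Tuple[int, int]],
               exonic_genes: Set[str], asd_genes: Set[str],
               matches_syndrome: bool) -> List[str]:
    """Return the prioritization labels that apply to one CNV.

    `exonic_genes` are genes with a coding exon overlapped by the CNV;
    `asd_genes` and `matches_syndrome` stand in for curated lists.
    """
    labels = []
    if any(cnv_start <= r_end and cnv_end >= r_start for r_start, r_end in regions):
        labels.append("overlaps_linkage_signal")
    if cnv_end - cnv_start >= 3_000_000:
        labels.append("large")
    if exonic_genes:  # gene-based labels require a coding exon overlap
        if exonic_genes & asd_genes:
            labels.append("known_asd_gene")
        if matches_syndrome:
            labels.append("known_genomic_syndrome")
    return labels
```

A CNV can carry several labels at once; an empty list means it met none of the prioritization criteria.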

| Whole genome sequencing
Twenty-six individuals also underwent whole genome sequencing as part of the MSSNG initiative. Variant prioritization followed the approach described previously. Briefly, identified variants were annotated, and likely deleterious mutations prioritized to capture those that were rare (MAF ≤ 1%), LoF (nonsense, splice site, frameshift), and damaging de novo missense mutations (damaging as evidenced by four of the following algorithms: SIFT ≤ 0.05, PolyPhen ≥ 0.95, CADD ≥ 15, Mutation Assessor score ≥ 2, PhyloP ≥ 2.4). MAF was based upon filtering according to frequencies in the following control samples: 1000 Genomes Project (1000 Genomes Project Consortium et al., 2015), NHLBI-ESP (Fu et al., 2013), ExAC (v.0.3.1) (Lek et al., 2016), and Complete Genomics internal control data. Variants described in this article have been validated by Sanger sequencing.
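The sequence variant prioritization can be summarized as a simple rule. The sketch below is a hedged reading of the criteria, not the MSSNG pipeline: it assumes that a variant is retained if it is rare and either LoF or a damaging de novo missense change, and that "four of" the five predictors means at least four. The `Variant` fields and consequence labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

LOF_CONSEQUENCES = {"nonsense", "splice_site", "frameshift"}


@dataclass
class Variant:
    consequence: str                 # e.g. "missense", "nonsense" (hypothetical labels)
    max_control_maf: float           # highest MAF across the control databases
    de_novo: bool = False
    sift: Optional[float] = None
    polyphen: Optional[float] = None
    cadd: Optional[float] = None
    mutation_assessor: Optional[float] = None
    phylop: Optional[float] = None


def damaging_votes(v: Variant) -> int:
    """Count how many of the five predictors call the variant damaging."""
    checks = [
        v.sift is not None and v.sift <= 0.05,
        v.polyphen is not None and v.polyphen >= 0.95,
        v.cadd is not None and v.cadd >= 15,
        v.mutation_assessor is not None and v.mutation_assessor >= 2,
        v.phylop is not None and v.phylop >= 2.4,
    ]
    return sum(checks)


def is_prioritized(v: Variant, max_maf: float = 0.01, min_votes: int = 4) -> bool:
    """Rare AND (LoF OR damaging de novo missense); an assumed reading of the text."""
    if v.max_control_maf > max_maf:
        return False
    if v.consequence in LOF_CONSEQUENCES:
        return True
    return v.consequence == "missense" and v.de_novo and damaging_votes(v) >= min_votes
```

Missing predictor scores are counted as non-damaging here, which is a conservative choice made for the example.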

| RESULTS
A total of nine extended pedigrees were recruited. The total sample includes 318 family members, among whom 170 were genotyped on one or more platforms and 132 had phenotype data (122 with both genotype and phenotype data). The sample comprised 46 individuals diagnosed with ASD, 22 with BAP, and 68 with neither ASD nor BAP.
The clinical characteristics of these pedigrees are described in Supplementary Table S1. Identified CNVs are detailed in Supplementary Table S2.

| CNVs segregating with linkage signals within and across families
We first examined linkage signals generated by all nine pedigrees combined. A number of CNVs overlapped these signals, but none were inherited. As such, there were no segregating rare CNVs that would theoretically explain any of the signals observed. The same was true when examining pedigree-specific linkage signals. Only two families showed peaks above a 10% PPL threshold (Ped 4, ASD, X chromosome; Ped N1, BAP, chromosomes 9 and 20; Supplementary Figure S1), but in neither case was there evidence of CNVs aligning with these signals.

| Inherited CNVs segregating with ASD/BAP phenotypes
A small number of rare CNVs overlapped known ASD genes and/or recurrent known genomic syndromes (Table 1). One father of an ASD male child, who himself met the criteria for BAP, had a 2.5 Mb clinically pathogenic duplication at 22q11.21 (Wenger et al., 2016), impacting many genes, several of which, including COMT and SLC25A1, are implicated in ASD and other neurodevelopmental disorders. The male offspring of this individual has a diagnosis of ASD and did not inherit this mutation.
At five other loci, CNVs of unknown clinical significance were identified. These included one duplication, shared by an ASD male and his unaffected female sibling, impacting the ASD-implicated gene NEDD9. This CNV tandemly duplicates three shorter isoforms of NEDD9, but is intragenic for the two larger isoforms of this gene. This gene belongs to a family of molecules that mediate protein-protein interactions in signal transduction pathways. This same ASD male also had a tandem duplication that overlapped TUBB2A, a gene associated with complex cortical dysplasia and other brain malformations (Romaniello et al., 2018). In Family 3 (Table 1), three members had a hemizygous deletion impacting the ASD-implicated gene AGMO. One ASD female from this same family had a large de novo 1.6 Mb duplication at Xp22.31, impacting seven genes including STS.
The most striking finding was evidence of a paternally inherited rare tandem duplication impacting the ASD-implicated gene DLGAP2 in three ASD-affected male offspring (Figure 1, individuals 003 range. DLGAP2 is translated into a membrane bound protein that is brain expressed and believed to play a role in synapse organization and signaling in neuronal cells.

| DISCUSSION
Our examination of large, multigenerational pedigrees, each with at least three members with ASD, has found no evidence of widely segregating CNVs in these families. While a small number of CNVs potentially implicated in ASD were identified, these did not explain the heritable nature of the disorder in these families. Importantly, these CNVs did not overlap any of the linkage signals previously identified in these families. As such, other heritable genetic risk remains to be identified. Moreover, the exact nature of how heritable risk and identified CNVs interact with each other and additional factors needs to be more fully evaluated. The possibility of an interaction between two or more genes in raising susceptibility to ASD has previously been argued, but such a mechanism remains to be demonstrated.
One family with four male offspring all diagnosed with ASD was characterized by the presence of shared mutations, none of which neatly segregated with phenotype. One tandem duplication impacting DLGAP2 was identified in three male offspring who inherited this CNV from their father, who is of unknown phenotype. Two of the three males with this CNV have been diagnosed with ADHD, and for two, social functioning, as measured by the Vineland Adaptive Behavior Scale, is in the impaired range. DLGAP2 is a strong candidate for ASD, as it is strongly brain expressed (https://gtexportal.org/home/gene/DLGAP2) and is identified as important in synapse development (Rasmussen, Rasmussen, & Silahtaroglu, 2017). Indeed, as a synaptic scaffolding protein, it interacts structurally at the domain level with SHANK1-3 (Rasmussen et al., 2017; Takeuchi et al., 1997), themselves implicated in ASD (Sato et al., 2012). Mouse models have shown that knockout of DLGAP2 impacts social behavior, as well as synaptic structural integrity (Jiang-Xie et al., 2014). The potential role of DLGAP2 in ASD is unclear, however, as one study failed to identify a statistically significant difference in the frequency of rare missense and nonsense mutations in ASD cases versus controls, although the sample size was small and all cases were Asian, having been recruited in Taiwan (Chien et al., 2013). However, one other study has identified ASD and other neurodevelopmental phenotypes in nine nonoverlapping families with DLGAP2 duplications (Poquet et al., 2017).
FIGURE 1 Pedigree diagram for Ped 19
Four individuals with ASD in a single pedigree also had the same stop-gain point mutation in MYO1A. Although IQ information was unavailable for one individual, the other three all had intellectual ability in the average range. Despite this, their social functioning was in the impaired to borderline impaired range. Of additional relevance, the two offspring with ADHD both had DLGAP2 and MYO1A mutations. However, other aspects of phenotype, including language skills and medical history, did not clearly differentiate between individuals in this family with different mutations. The one individual with both a MYO1A mutation and a large 16q23.3-16q24.1 deletion was floppy at birth but otherwise unremarkable in terms of perinatal complications and medical history.
We have previously highlighted the complexity of making genotype-phenotype correlations in ASD, particularly in view of the phenotypic heterogeneity and infrequency of individual mutations (Woodbury-Smith et al., 2017). This is compounded by the variable penetrance of these mutations. Although we are not able to confidently make causal statements, overcoming this formidable obstacle will require the accumulation of phenotype and genotype information afforded by in-depth studies such as ours. Such variable penetrance is widely recognized in ASD and other neurodevelopmental disorders (Bateman et al., 2018; Le, Williams, Alaimo, & Elsea, 2019; Ropers & Wienker, 2015), even at "high risk" neurodevelopmental loci. For example, the well-described 2q37 deletion syndrome is associated with a wide range of neurodevelopmental and medical phenotypes, and it is only by considering the size and position of the CNV, along with downstream expression of underlying genes and their interacting partners, that the relationship of genotype to phenotype is more fully appreciated (Le et al., 2019). Moreover, similar to ours, other studies have also found variable penetrance for identical mutations in family members of identified probands (Bateman et al., 2018).
We did identify a number of other CNVs in ASD and/or BAP cases that have been previously implicated in ASD (COMT, SLC25A1, NEDD9, and AGMO) and cortical dysplasia (TUBB2A; Yuen et al., 2017). The likely variable penetrance of these mutations, however, is evidenced by their occurrence in both BAP cases as well as typically developing siblings of ASD probands. Further work, including functional studies, is therefore required to improve our understanding of their penetrance in relation to ASD. Indeed, it is notable that several of the identified CNVs are interpreted clinically as of unknown significance. This level of uncertainty presents major problems for genetic counselors tasked with explaining causality and risk to families. A further CNV impacting STS is associated with X-linked ichthyosis, itself described in association with neurodevelopmental phenotypes including ASD and ADHD (Kent et al., 2008).
TABLE 2 Phenotypes of Ped 19 (see Figure 1 for pedigree)
Extended pedigrees are a valuable resource for the identification of heritable genetic causes of ASD. Many of the genetic variants identified so far as ASD implicated are de novo and therefore will not explain ASD's heritability of ~90% (Tick et al., 2016). Families such as the ones studied here often contain members with milder autism phenotypes, termed BAP. We have shown previously that these pedigrees can individually generate reasonably sized linkage signals, with strong evidence that these signals are consistent with segregating uncommon variants.
Moreover, such families offer the opportunity to tease out the relative roles of environmental risk factors that may be shared among affected members. While we do not expect variants to be shared across different families, it is likely that within individual families some shared variants will explain the propensity for clustering of ASD and other mental disorders even in the presence of intrafamilial genetic heterogeneity. Therefore, although the findings are modest, families such as those studied here are a unique resource for future studies tackling the complex genetic etiology of ASD.