Copy number variation burden does not predict severity of neurodevelopmental phenotype in children with a sex chromosome trisomy

Sex chromosome trisomies (SCTs) (XXX, XXY, and XYY karyotypes) are associated with an elevated risk of neurodevelopmental disorders. The range of severity of the phenotype is substantial. We considered whether this variable outcome was related to the presence of copy number variants (CNVs)—stretches of duplicated or deleted DNA. A sample of 125 children with an SCT were compared with 181 children of normal karyotype who had been given the same assessments. First, we compared the groups on measures of overall CNV burden: number of CNVs, total span of CNVs, and likely functional impact (probability of loss‐of‐function intolerance, pLI, summed over CNVs). Differences between groups were small relative to within‐group variance and not statistically significant on overall test. Next, we considered whether a measure of general neurodevelopmental impairment was predicted by pLI summed score, SCT versus comparison group, or the interaction between them. There was a substantial effect of SCT/comparison status but the pLI score was not predictive of outcomes in either group. We conclude that variable presence of CNVs is not a likely explanation for the wide phenotypic variation in children with SCTs. We discuss methodological challenges of testing whether CNVs are implicated in causing neurodevelopmental problems.

(ASD), while others have little or no evidence of neurodevelopmental problems.
Such heterogeneity is a common feature of genetic neurodevelopmental syndromes, even between individuals who carry the same causative genetic variant. A range of explanations have been proposed for this high level of phenotypic variability. Veltman and Brunner (2010) suggested the "two-hit" hypothesis in which genetic rearrangements can combine with a secondary variant to amplify the impact of a microdeletion syndrome. Bishop and Scerif (2011) extended this hypothesis to SCTs to develop a "double hit" model that refers specifically to SCTs. In this particular case, the secondary variant is carried in a region that is common to both the X and Y chromosomes and maintains expression of both gene copies (gametologues).
They focused on a specific pathway which includes neurexin and neuroligin genes, as this network has previously been associated with synaptic formation and language disorders. Newbury, Simpson, Thompson and Bishop (2018) tested the "double hit" hypothesis, which maintains that the presence of an extra dose of neuroligin associated with overexpression of NLGN4 on X and Y chromosomes could amplify the impact of genetic variants (both on the sex chromosomes and autosomes) that normally create only a minor risk for neurodevelopmental abnormalities. They did not, however, find any support for that hypothesis from investigation of common variants in either of the two candidate genes CNTNAP2 and NRXN1.
In the current article, we consider a possible "double hit" mechanism in which an extra sex chromosome could amplify genetic risk, by interacting with copy number variants (CNVs). The history of recognition of CNVs is documented by Beckmann et al. (2007): these are deletions or insertions affecting chunks of DNA 1 kb in length or larger, which were first described in the 1960s and 1970s. It was a few decades later before it was recognized that this kind of large-scale submicroscopic variation is common across the genome and not necessarily pathological. Nevertheless, where CNVs are large and/or affect the function of key genes, they are likely to be associated with neurodevelopmental disorders, notably intellectual disability (Coe et al., 2014) and ASD (Sanders et al., 2015).
One study also found an association with severe developmental language disorder (DLD) (Kalnak et al., 2018).
Because they are more likely to have pathogenic effects, large, disruptive CNVs tend to be relatively rare in populations. However, the constraints upon smaller CNVs may be less pronounced and, as such, these are more commonly observed. Simpson et al. (2015) found that there was a slight increase in total CNV "burden" (i.e., cumulative size of all CNVs across the genome) in cases of DLD and their relatives, compared to a comparison sample. The fact that unaffected relatives showed the increase as well as affected individuals suggested that an increased number of CNVs may play a cumulative role in mediating an increased risk of language disorder, but the precise impact may depend on the location and extent of the CNV, and whether it disrupts gene function.
The specific combination of inherited events may also be important.
Here, we consider the role of CNVs in moderating neurodevelopmental outcomes in children carrying SCTs. We explore two hypotheses. The epistasis hypothesis predicts that the risk of neurodevelopmental disorder associated with a CNV will be increased when there is a trisomy, because of interactions between CNVs and the overexpression of genes on the sex chromosomes. This relates to the idea of a two-hit model (Veltman & Brunner, 2010), whereby the effect of a microdeletion is not deterministic, but rather acts as a risk factor that can increase the impact of deletions or duplications elsewhere on the genome. This kind of mechanism is supported by Girirajan et al. (2010) who found that individuals with severe neurodevelopmental disorders associated with a deletion on chromosome 16p12.1 often had a second autosomal CNV. A genomic alteration that may have little or no effect in an unaffected relative appeared to have a particularly detrimental effect in combination with a second "hit." We extend this idea to encompass the notion that the impact of a third copy of a sex chromosome may be amplified by a CNV that might have little effect in a child with a normal complement of chromosomes.
Alternatively, a quite different hypothesis, the burden hypothesis, maintains that there is an increased number of CNVs in individuals with SCTs, across the entire genome. According to this hypothesis, the high rate of neurodevelopmental problems could be a direct consequence of an increased number of CNVs, perhaps because whatever mechanism leads to a trisomy also disrupts CNV checkpoints.
The best source of evidence for an increased number of CNVs in SCT cases comes from Rocca et al. (2016), who presented evidence that men with Klinefelter syndrome (47,XXY) had an unusually high number of X-chromosome CNVs. They compared CNVs on the X chromosome in 94 men with Klinefelter syndrome (47,XXY) with those in 85 controls (43 males and 42 females), and reported a higher number of CNVs, especially duplications, in the Klinefelter group. Thirty-nine of the men with Klinefelter syndrome (41.5%) carried CNVs, compared to 12/42 (28.6%) of control females and 8/43 (18.6%) of control males. As the authors noted, the presence of additional CNVs (either in terms of burden, or specific CNVs that "hit" complementary genetic pathways) in some individuals could provide an explanation for the variable phenotype, but they did not test for associations with phenotype in their sample. Their study raised the further question of whether an increased rate of CNVs might be seen in other SCTs (XXX and XYY), and whether these might be found on the autosomes as well as the X chromosome.
Further circumstantial evidence for the burden hypothesis and the impact of CNVs in cases of SCT comes from Le Gall et al. (2017), who focused on a group of 14 patients with SCTs in whom an additional causative autosomal copy number event was suspected because of an unusually severe phenotype involving intellectual disability or other severe developmental disorder. They found seven patients carried a pathogenic CNV (one with Williams-Beuren syndrome, one with 7q11.23 duplication, one with 17q12 duplication, three with 16p11.2 duplication and one with a 15q11.3 deletion). It is important to note that the breakpoints of individual CNVs vary between cases, and that for example the three 16p11.2 duplications will differ in their start and stop point in the genome. They additionally reported that two further cases carried likely pathogenic CNVs and five carried a CNV of uncertain significance. Because their report focused only on cases with a known additional micro-deletion or -duplication, the authors were not, however, able to estimate the prevalence of additional pathogenic CNVs in cases of SCT, or to show that a CNV was specifically related to the severity of the phenotype.

| Measurement of CNVs
The simplest approach to measuring CNV burden is to count the number of such events. However, each individual CNV event can vary substantially in size, and we would expect the total extent of coverage of all CNVs to also be important. Within the "burden" model, large and rare CNVs have been shown to be related to the incidence of neurodevelopmental disorder, presumably because larger events have a higher likelihood of affecting important genes. Moreover, larger events in the wider CNV literature are correlated with more severe clinical presentation (Girirajan et al., 2010). Furthermore, the impact of a CNV will depend on whether or not it affects the function of a dosage sensitive gene. More recently, large population sequence data have allowed the development of gene dosage-sensitivity measures such as the pLI score (probability of being loss-of-function intolerant) reported in the Exome Aggregation Consortium database (ExAC: http://exac.broadinstitute.org/) (Lek et al., 2016). This metric is based upon the observed frequency of loss-of-function variants in population control data compared to that expected given the gene size. A small event that disrupts a dosage sensitive gene could have a much higher burden than a large event that affects a large number of dosage-insensitive genes. A pLI score of ≥0.9 is indicative of haploinsufficiency and, as such, the metric shows constraint of the genetic sequence which, in turn, indicates that loss of function of that gene will affect development.
Given all these considerations, it can be difficult to settle on a single measure for testing the hypotheses outlined above. Given the functional validity of the pLI measure, we decided a priori that our primary measure of burden should be the total pLI score of all the genes affected by an individual's CNV events.

| Selection of a study sample
Because many CNVs are found in asymptomatic individuals, we can only interpret potential relevance of CNV burden for neurodevelopmental disorders if we have a comparison sample without any disorder.
Although there are databases of medically relevant CNVs (DECIPHER, ClinVar), these usually only represent extremely rare and pathogenic changes. More recently, large (N > 14,000) population samples have become available through gnomAD v3.0 (Collins et al., 2019). However, neither SNP arrays nor genomic sequencing methods directly measure the number of copies of a given genetic region. Information regarding CNVs therefore has to be inferred from such data. It is often problematic to compare samples assessed using whole genome sequencing methods to the array-based methods applied here. Ideally, we need a target group and a comparison sample processed together using the same methods, to help establish whether the rate, extent, or functional impact of CNVs is unusually high in the target group.
In the current study, a sample of twin children tested on the same psychometric battery and genotyped in the same experimental data set as the children with SCTs acted as a comparison sample. However, this comparison was complicated by the fact that the twin children had been selected to over-represent cases with DLD. Furthermore, the SCT sample included some children whose trisomy was only discovered when they were investigated for neurodevelopmental problems. Thus, the sampling method might have biased both samples in the direction of finding high rates of CNVs. In practice, this proved not to have an effect, but we present data for the subset with relatively low bias (see below for definition), as well as the full sample, to demonstrate that this potential confound did not affect our results.

| Measure of the phenotype
The "double hit" hypothesis predicts that CNV burden will have a disproportionate impact on the phenotype in children with SCTs. In testing this hypothesis, we focused on a measure of global neurodevelopmental impairment, as this would be sensitive to the conditions that are usually associated with CNVs: intellectual impairment and autism.

| Aims of the current study
The aims of this article were twofold. First, we aimed to test the "burden" hypothesis. In doing so, we extend previous analyses by Rocca et al. (2016) by including three subtypes of trisomy: 47,XXX, 47,XXY, and 47,XYY. We investigated CNVs on the autosomes as well as the X chromosome. Second, we consider the "epistasis" hypothesis, testing the prediction that the severity of neurodevelopmental problems will be related to an increased burden of CNV events affecting gene function.

| MATERIALS (SUBJECTS) AND METHODS
All methods were registered on Open Science Framework Preregistration (https://osf.io/u2j97). The participant phenotyping and genomewide SNP array data were generated for the companion study and are described in detail in Newbury, Simpson, Thompson, and Bishop (2018).

| SCT group recruitment
SCT cases aged from 5 to 16 years were recruited from among participants in a previous study  who had agreed to be re-contacted. Additional participants were recruited via support groups (Unique: the Rare Chromosome Support Group, and the Klinefelter Syndrome Association), National Health Service Clinical Genetics Centers, and self-referred through the research project Facebook page or website. In order to be eligible for the study, SCT group participants had to have a genetic diagnosis of either XXX, XXY, or XYY, and be fully aware of their genetic status. Table 1 shows the numbers of children with SCTs in relation to the type of trisomy and the reason for diagnosis. We distinguish here between those diagnosed in the course of investigations for neurodevelopmental disorder, who are referred to as "High bias", and the remainder, the "Low bias" group, who were diagnosed either in the course of pre-natal screening, or during investigations for medical conditions. Both groups combined can be used to test predictions about CNV/phenotype associations, but the Low bias group is more suitable for estimating the typical CNV burden associated with SCTs.

| Comparison group
The comparison group of twin children had been previously recruited for a study of DLD and laterality (Wilson & Bishop, 2018) and had undergone the same test battery as the SCT group. We aimed to recruit a sample where around 75% of pairs would include at least one twin with DLD. This was achieved by selecting cases for inclusion on the basis of parental response on a telephone interview: any mention of language delay, history of speech and language therapy, current language problems or dyslexia was coded as "parental concern." This sample therefore represents a DLD-enriched group rather than a typically developing group. Some twin children had evidence of ASD (N = 15) or intellectual disability (N = 3), and 12 failed a hearing screen on the day of testing, although none of them had any known sensorineural hearing loss. For the current study, because we were interested in a broader phenotype than pure DLD, these cases were retained in the sample. One twin from each pair was randomly selected to ensure independent observations. When comparing rates of CNV burden with SCT cases, we distinguished between twins selected for having language problems and those whose parents had not expressed any concern about language development.
The number of cases that passed both genotyping and CNV calling quality control (see below) are shown in Table 1, subdivided by karyotype and whether or not they were recruited from a biased source (i.e., either trisomy cases whose trisomy was discovered in the course of investigations for neurodevelopmental/behavioral problems, or twins whose parents volunteered for the study because of concerns about language development in one or both twins).

| Test battery
The test battery is described in detail by Newbury et al. (2018). The battery was designed to provide a quantitative estimate of language, literacy and communication ability in children aged 5-16 years. In addition, parents completed a telephone interview, and were invited to complete two questionnaires and an online diagnostic interview. Our primary phenotype outcome measure is a scale devised for the study by Newbury et al. (2018), the global index of neurodevelopmental impairment (GNI). GNI is an ad hoc measure that combines all available information about a range of neurodevelopmental disorders affecting language, attention, social communication and overall functioning (see Table 2). Scores ranged from 0 (no impairment) to 6 (high impairment).
This was judged to be the most appropriate measure, given that CNVs have previously been associated most strongly with severe problems affecting behavior as well as language and cognitive functioning.
In response to reviewer request, we also report exploratory analyses of two more specific phenotypic measures in relation to CNV burden: language factor scores and Performance IQ (PIQ). The language factor is derived from tests of Oromotor Skills, Verbal Comprehension, Vocabulary, and Sentence Repetition (see Newbury et al., 2018, for details), where a lower score indicates lower language abilities. The PIQ measure is based on the Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999).

| Genetic measures
Genomic DNA was collected and extracted from saliva samples (OG-500, DNA Genotek) using the manufacturer recommended protocol.
Samples were genotyped using the genome-wide SNP array Infinium

T A B L E 1 Numbers of children by karyotype and ascertainment bias

Beta allele frequencies and log2 ratio for each of the remaining SNPs were exported to a .csv file. Autosomal CNV calling was performed using two separate methods: QuantiSNP (Colella et al., 2007) and PennCNV (Wang et al., 2007). PennCNV analysis required generation of an array-specific PFB file, which was built using the gnomAD whole genome Non-Finnish European frequency file (hg19) built for use in Annovar (Wang, Li, & Hakonarson, 2010). CNV calls on the chromosome X were not included in the association analyses due to inability to call duplications in XXX participants, and deletions (CN state = 2) rendering them effectively wildtype at these sites.
Resulting CNV regions containing at least five consecutive SNPs were annotated using ANNOVAR (2018Apr16 release) (Wang et al., 2010). CNVs annotated as overlapping at least one gene, inclusive of exons, introns, untranslated regions and non-coding genes, were retained for further analysis. Sum of pLI (sum_pLI) scores from each gene within a CNV region were calculated for each individual in R (code available at https://osf.io/rgqwp/).
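The per-individual summation described above can be illustrated with a minimal sketch. This is not the authors' R script (which is available at the OSF link above); the gene names, pLI values, and the `sum_pli_per_individual` helper are all hypothetical, and the sketch assumes that a gene without a pLI score contributes zero.

```python
from collections import defaultdict

# Hypothetical pLI lookup (gene symbol -> pLI); real values would come from ExAC.
PLI = {"GENE_A": 0.98, "GENE_B": 0.10, "GENE_C": 0.55}

# Hypothetical annotated CNV calls: (individual ID, genes overlapped by one CNV).
cnv_calls = [
    ("child_01", ["GENE_A", "GENE_B"]),
    ("child_01", ["GENE_C"]),
    ("child_02", ["GENE_B", "GENE_X"]),  # GENE_X has no pLI score
]


def sum_pli_per_individual(calls, pli):
    """Sum pLI over every gene overlapped by each individual's CNVs."""
    totals = defaultdict(float)
    for individual, genes in calls:
        # Assumption: genes without a pLI score contribute 0 to the sum.
        totals[individual] += sum(pli.get(gene, 0.0) for gene in genes)
    return dict(totals)


totals = sum_pli_per_individual(cnv_calls, PLI)
```

In this sketch a gene overlapped by two separate CNVs in the same individual would be counted twice; whether that matches the original analysis is not stated in the text.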

| Power analysis
Our study is constrained by the numbers we were able to recruit, with further loss of cases which did not pass genotyping quality control. As explained in our preregistration, the power to detect given effect sizes in the current study was computed by simulation of correlated datasets based on our existing sample size. With this sample we are adequately powered to detect a correlation around .35 between CNV burden and phenotype in the SCT group.

| Statistical methods and visualization
Burden metrics, in terms of the number of CNVs, cumulative CNV span, and a pLI index were calculated for each individual and analyzed for group differences (SCTs against comparison individuals). We had pre-registered a Wilcoxon test for group comparisons, but this did not allow computation of exact probabilities because there were a very large number of ties. We therefore also calculated exact probabilities using a permutation test. Empirical p-values were calculated using 10,000 permutations and adjustments for multiple testing used the Benjamini-Yekutieli procedure. The analysis script is available on Open Science Framework (https://osf.io/rgqwp/).
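As a rough illustration of the two steps named above, the following sketch computes an empirical two-sided p-value by permuting group labels and then applies the Benjamini-Yekutieli adjustment. It is not the authors' analysis script. The choice of test statistic (absolute difference in group means) and the add-one correction are assumptions, and both function names are hypothetical.

```python
import random


def permutation_p(group_a, group_b, n_perm=10_000, seed=1):
    """Two-sided empirical p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        a, b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(a) / n_a - sum(b) / len(b))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction (an assumption here) avoids reporting p = 0.
    return (hits + 1) / (n_perm + 1)


def benjamini_yekutieli(p_values):
    """Benjamini-Yekutieli adjusted p-values, returned in input order."""
    m = len(p_values)
    c_m = sum(1.0 / k for k in range(1, m + 1))  # harmonic correction term
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # also caps adjusted values at 1
    for rank in range(m, 0, -1):  # walk from the largest p to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted
```

Because the permutation approach only reshuffles the observed values, it is unaffected by the large number of ties that prevented exact Wilcoxon probabilities.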

| Reporting of pathogenic events
T A B L E 2 Index of global neurodevelopmental impairment

All available information was used to create a single scale reflecting global level of neurodevelopmental impairment ranging from 0 (no impairment) to 6 (severe problems). Data from initial parental telephone interview were available for all children. Data from language testing were available for all but two very low-functioning children, who were unable to attempt our tasks. Data from the Social Responsiveness Scale (SRS) were available for 127 of 143 children with SCTs, and 316 of 388 comparison children. DSM5 diagnoses from the online Development and Well-being Assessment (DAWBA; Goodman et al., 2000) were available for 89 children with SCTs and 276 comparison children. We used all available data for each child to create a scale by adding points as specified below, with maximum score of 6 (note a). Note that some categories are mutually exclusive (e.g., dyslexia and low PIQ).
• History of speech problems (assessed or treated by speech-language therapist at preschool age) = 1
• Schooling: Current help in mainstream school (support or special class or speech-language therapy) = 1; OR attends special school = 2
• Dyslexia (at least two reading tests >1 SD below mean, PIQ > 70) = 1
• DLD (Woodcock-Johnson comprehension + at least one other oral language test >1 SD below mean, PIQ > 70) = 1
• ADHD (parental report or DAWBA diagnosis) = 1
• Behavior problems (DAWBA diagnosis of conduct disorder or clear description on interview) = 1
• Autism: Report from interview of definite diagnosis, or SRS = 90, or DAWBA diagnosis = 2
• Low IQ (PIQ < 70 or refusal/inability to do battery) = 1
Note a: In our previous report by Newbury et al. (2018), this scale was inverted so a low score corresponded to impairment.

As a side-product of the calling of CNVs, necessary for the calculation of burden metrics, we were able to identify putative pathogenic events that occurred in SCT probands and comparison individuals.
These events represent CNVs that have been reported previously in the literature to directly cause a micro-deletion/-duplication syndrome and therefore may directly explain any neurodevelopmental difficulties experienced by individuals, irrespective of their SCT status.
Individual autosomal CNVs with a sum_pLI score of ≥0.9, regardless of size, were reported as pathogenic if they either overlapped by at least 50% (and in the same CN state) with known CNV Syndrome region as reported in DECIPHER (https://decipher.sanger.ac.uk/disorders#syndromes/overview), or with similar sized events reported in DECIPHER and/or ClinVar as either pathogenic or likely pathogenic, in line with current American College of Medical Genetics (ACMG) guidelines (Nowakowska, 2017; South et al., 2013). While the DECIPHER database is not specific to neurological phenotypes, it is enriched for neurodevelopmental difficulties as these are common features of CNV syndromes.
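The first reporting criterion (sum_pLI ≥ 0.9, same CN state, at least 50% overlap with a known syndrome region) can be sketched as a simple filter. The text does not say whether the 50% refers to the called CNV or to the reference region, so this sketch assumes it is the fraction of the reference region covered by the call. The coordinates, dictionary fields, and function names are hypothetical and are not taken from DECIPHER.

```python
def covered_fraction(cnv, region):
    """Fraction of the reference region covered by the CNV call (assumed definition)."""
    start = max(cnv["start"], region["start"])
    end = min(cnv["end"], region["end"])
    overlap = max(0, end - start)  # 0 when the intervals do not intersect
    return overlap / (region["end"] - region["start"])


def is_candidate_pathogenic(cnv, region, min_fraction=0.5, min_sum_pli=0.9):
    """Apply the three screening conditions described in the text."""
    return (
        cnv["chrom"] == region["chrom"]
        and cnv["state"] == region["state"]  # deletion vs duplication must match
        and cnv["sum_pli"] >= min_sum_pli
        and covered_fraction(cnv, region) >= min_fraction
    )


# Hypothetical syndrome region and CNV calls (illustrative coordinates only).
region = {"chrom": "16", "start": 1_000_000, "end": 2_000_000, "state": "dup"}
dup_call = {"chrom": "16", "start": 1_400_000, "end": 2_300_000,
            "state": "dup", "sum_pli": 1.2}
del_call = dict(dup_call, state="del")

dup_flag = is_candidate_pathogenic(dup_call, region)
del_flag = is_candidate_pathogenic(del_call, region)
```

A filter of this kind only shortlists events; in the study, the final classification also relied on DECIPHER and ClinVar pathogenicity annotations.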

| X chromosome CNV burden
We found no evidence of an increase in the number of X chromosome CNVs in the SCT group compared to the comparison group. In total, 13 high-quality X chromosome events were detected. Three deletions and five duplications occurred in eight comparison individuals (N = 8/181, 3.97%), while two duplications and a deletion were found in three XYY individuals and two duplications in two XXY individuals (N = 5/85, 5.9%) (OR = 1.35, 95% CI = 0.43-4.26, p = 0.61, z = 0.514).
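The odds ratio for the X chromosome comparison can be checked from the carrier counts given above (5 of 85 XXY/XYY cases, 8 of 181 comparison children). The sketch below assumes a Wald interval on the log odds ratio and a two-sided normal p-value; the text does not name the method used, and `odds_ratio_wald` is a hypothetical helper. Under that assumption the results are close to the reported values.

```python
import math


def odds_ratio_wald(a, n1, c, n2, z_crit=1.96):
    """Odds ratio for a/n1 versus c/n2 with a Wald 95% CI, z, and two-sided p."""
    b, d = n1 - a, n2 - c  # non-carriers in each group
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of the log odds ratio
    log_or = math.log(odds_ratio)
    z = log_or / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal p-value
    ci = (math.exp(log_or - z_crit * se), math.exp(log_or + z_crit * se))
    return odds_ratio, ci, z, p


# Carrier counts taken from the text: SCT (XXY + XYY) versus comparison group.
odds_ratio, ci, z, p = odds_ratio_wald(5, 85, 8, 181)
print(f"OR = {odds_ratio:.2f}, 95% CI = {ci[0]:.2f}-{ci[1]:.2f}, z = {z:.3f}, p = {p:.2f}")
```

The wide interval reflects how few carriers there are in either group, which is why a modest difference in carrier rates cannot be distinguished from chance here.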
The XXX individuals (N = 40) were excluded from this analysis. Table 3 shows comparative data for SCT versus comparison individuals on CNV burden for three measures: total number of autosomal CNVs, cumulative span of autosomal CNVs (in kb), and total pLI scores across all autosomal CNVs per individual. Individual data points for CNV span and number of CNVs are shown in Figure 1. A suggestion of a marginal excess in number of CNVs in the SCT group was not statistically reliable when correction was made for multiple tests. Notably, the comparison on our primary measure, total pLI score (see Figure 2), did not reveal a reliable group difference, either with or without the High Bias cases included.

| Enrichment of pathogenic CNVs
We next sought to investigate whether the identified CNVs incorporated known pathogenic copy number events, as suggested by Le Gall et al. (2017). This analysis was not pre-registered but was suggested by reviewers. All CNV events overlapping ≥50% with previously reported pathogenic microdeletions and duplications in the DECIPHER database are listed in Table 4. CNVs were described if previously reported as either pathogenic or likely pathogenic, and therefore clinically relevant according to ACMG criteria (Nowakowska, 2017; South et al., 2013). Pathogenic CNVs were identified in 15 individuals in total. This consisted of six identified in the Comparison group (3 XX and 3 XY, N = 6/181, 3.31%) and nine in the SCTs (4 XYY, 4 XXY, and 1 XXX; N = 9/85, 10.59%; OR = 2.26, 95% CI = 0.7845-6.53, p = .1308). While there appears to be a modest enrichment of pathogenic CNVs in the SCT cases, 10.59% compared to 3.31% in the comparison group, this was not statistically robust due to small numbers.
To investigate if carrying a pathogenic CNV resulted in a more severe phenotype, individuals in whom a pathogenic or likely pathogenic CNV was identified are indicated (P) in Figure 2. There is no visible trend indicative of a clear association between increased GNI and carrying a pathogenic CNV.

T A B L E 3
Mean (SD) measures for autosomal CNV events for SCT and Comparison groups, both for whole sample, and for subset with low ascertainment bias

| Predicting global neurodevelopmental impairment (GNI) from pLI scores
We conducted a pre-registered analysis to test the prediction that SCT versus comparison status would moderate the relationship between CNV burden and phenotype. The relevant data are shown in Figure 2. The analysis was implemented in a Poisson regression, using the whole sample, with global neurodevelopmental index (GNI) as the dependent variable, and SCT versus comparison status and total pLI score as predictors. Results are shown in Table 5. As expected, there was a substantial effect of SCT/Comparison status on GNI. However, the pLI measure of CNV burden did not predict outcome, and there was no interaction with SCT/Comparison status. We observed a number of individuals with a high GNI and a low total pLI score (indicating that the genes hit by the CNVs were not dosage sensitive) or a low GNI score with CNVs that hit genes with a high pLI (see Figure 2).
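For readers who prefer to see the model written out, one plausible form of the regression described above is given below. The notation is ours rather than the authors': the log link is assumed because it is the canonical link for Poisson regression, SCT_i is a 0/1 indicator of group membership, and pLI_i is the total pLI score for child i.

```latex
\mathrm{GNI}_i \sim \mathrm{Poisson}(\mu_i), \qquad
\log \mu_i = \beta_0 + \beta_1\,\mathrm{SCT}_i + \beta_2\,\mathrm{pLI}_i
           + \beta_3\,(\mathrm{SCT}_i \times \mathrm{pLI}_i)
```

In this form, the burden effect within the comparison group corresponds to \beta_2, and the epistasis hypothesis corresponds to a positive interaction term \beta_3, that is, a steeper relationship between pLI and GNI in the SCT group.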

| Additional exploratory analyses
In response to reviewer suggestions for more analyses, we report in Supplementary Material data similar to Table 3 (Table S1) but for each trisomy separately, and data similar to Table 5 for two further phenotypes, Language Factor, and PIQ (Table S2).

| DISCUSSION
In this article, we aimed to test two alternative models of heterogeneity with regard to CNVs in individuals carrying SCT. The "burden" hypothesis suggests that individuals with SCTs may be at an increased risk of neurodevelopmental disorder due to an increased burden of CNVs across the genome. This is supported by previous investigations by Rocca et al. (2016) and Le Gall et al. (2017). Alternatively, the "epistasis" hypothesis suggests that the severity of neurodevelopmental problems in SCT cases will be related to an increased burden of CNV events affecting gene function. We find that neither of these hypotheses could account for the variation in neurodevelopmental phenotypes seen across SCT cases. Although we observed a slight increase in the number of rare pathogenic CNVs in the SCT cases, this was not significant and did not predict the severity of neurodevelopmental disorder. Similarly, we did not observe an excess of CNV burden in SCT cases, nor did we find a relationship between CNV load and neurodevelopmental outcomes. We did not find an enrichment of CNVs on the X chromosome in SCT (XXY and XYY) individuals, failing to replicate the findings of Rocca et al. (2016). Interestingly, our method identified considerably fewer CNVs on the X chromosomes than the autosomes; 3.97% of comparison individuals and 5.9% of SCTs carried a CNV on chromosome X, in stark contrast to rates of 41% (XXY), 28.6% (XX), and 18.6% (XY) reported by Rocca et al. (2016). This striking discrepancy may be explained by the use of two different algorithms (PennCNV and GenomeStudio) to call X chromosome CNVs in this study, compared to just PennCNV in the Rocca et al. study. This double analysis approach decreases false positives and increases confidence in the calls made but may increase Type II error. Our method was extremely conservative, and we therefore may have missed genuine CNVs. This emphasizes that cross-study comparisons are made difficult by methodological differences.
We did not find a reliable increase in overall burden of mean recurrent micro-deletion is considered to be susceptibility locus for ASD and developmental delay often accompanied by language difficulties, but the phenotype is highly variable and unaffected carriers have been reported (Hannes et al., 2009).
Given these results, and absence of statistically robust findings, we do not find evidence to support the role of the burden model in relation to "double-hits" in SCTs. Furthermore, we did not detect any association between total pLI score and severity of neurodevelopmental impairment, indicating that the two-hit model of CNV action does not explain the range in severity of neurodevelopmental and language disorders seen in SCT individuals as would be expected under an "epistasis" model.
One strength of this study is that it includes SCT cases across clinical categories (XXX, XXY, and XYY) that were both prenatally and postnatally diagnosed as well as individuals with a wide range of neurodevelopmental function (as described in . This range extends the focus of previous studies (Le Gall et al., 2017; Rocca et al., 2016) and provides scope to detect genetic differences that may underlie neurodevelopmental outcomes. However, this study design also results in limitations and methodological constraints, as presented in the introduction and the discussion above.

Note: The table indicates the region covered by the CNV (Region), the genes contained within the CNV region (Genes), the size of the region (Length), number of SNPs contained within the CNV (No. SNPs), whether the CNV was a deletion or duplication (CN State), and the sum of pLI scores for the reported CNV (sum_pLI). The pathogenicity and syndrome columns contain the class and associated syndrome of the CNV from the DECIPHER database. Phenotyping test scores are reported. Abbreviations: LF, language function, where low score indicates impairment; GNI, global neurodevelopmental index, where high score indicates impairment; PIQ, performance IQ.

especially when genetic effects are expected to be heterogeneous, a characteristic of CNVs (Veltman & Brunner, 2010). This heterogeneity was apparent within individuals who carried pathogenic CNVs, not all of whom presented with neurodevelopmental syndromes (see Table 4). Similarly, it means that we cannot rule out the presence of epistatic effects at the individual level. Although no single copy number event occurred in all SCT cases, it remains possible that specific CNVs, and genetic variants in the wider sense, may act in an epistatic manner. The characterization of these effects would require a genome wide approach that affords power to consider interactions between multiple genetic factors and environmental effects and was beyond the scope of the current study.
In summary, this analysis does not support the view that there is an increased burden of CNVs in individuals with SCTs, nor that CNVs have disproportionate impact on neurodevelopmental phenotypes in this population. Rare, pathogenic CNVs may contribute to the phenotype in some individuals with severe neurodevelopmental problems, as was observed by Le Gall et al. (2017), but these do not account for all neurodevelopmental difficulties within this cohort. Our data, which include cases detected prenatally and with a wide range of phenotypic presentations, suggest that secondary pathogenic events are not a common occurrence in cases of SCT. Similarly, the wide variation in phenotypes seen in this population cannot be explained by either the "burden" hypothesis associated with excess CNV burden or the "epistasis" model where a disproportionate impact is observed when CNVs co-occur with a trisomy of the sex chromosomes.