De novo variants in neurodevelopmental disorders—experiences from a tertiary care center

Up to 40% of neurodevelopmental disorders (NDDs) such as intellectual disability, developmental delay, autism spectrum disorder, and developmental motor abnormalities have a documented underlying monogenic defect, primarily due to de novo variants. Still, the overall burden of de novo variants in NDDs has not been fully characterized, and novel disease genes await discovery. We performed parent-offspring trio exome sequencing in 231 individuals with NDDs. Phenotypes were compiled using human phenotype ontology terms. The overall diagnostic yield was 49.8% (n = 115/231), with de novo variants contributing to more than 80% (n = 93/115) of all solved cases. De novo variants affected 72 different, mostly constrained, genes. In addition, we identified putative pathogenic variants in 16 genes not linked to NDDs to date. Reanalysis performed in 80 initially unsolved cases revealed a definitive diagnosis in two additional cases. Our study consolidates the contribution and genetic heterogeneity of de novo variants in NDDs, highlighting trio exome sequencing as an effective diagnostic tool for NDDs. In addition, we illustrate the potential of a trio approach for candidate gene discovery and the power of systematic reanalysis of unsolved cases.


| INTRODUCTION
Neurodevelopmental disorders (NDDs) comprise a heterogeneous group of conditions affecting brain development and function and can manifest in impaired cognition, behavior, language, and motor functioning. 1 According to the "Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition" (DSM-5), NDDs encompass intellectual developmental disorders, communication disorders, autism spectrum disorders, attention-deficit/hyperactivity disorders, specific learning disorders, and motor disorders. 2 Furthermore, patients with NDDs often demonstrate additional neurological and non-neurological comorbidities. 3 While NDDs can have numerous causes such as fetal exposure to toxicants, perinatal asphyxia, and environmental contaminants, monogenic conditions make an essential contribution to the etiology of NDDs. 1 The genetic etiology underlying NDDs is extremely heterogeneous, ranging from large chromosomal aberrations to single-nucleotide variants (SNVs) in >1000 genes. 4 Nevertheless, theoretical calculations indicate that over 500 novel NDD genes remain to be discovered. 5 Large-scale sequencing studies have widely acknowledged that variants in protein-coding genes that have arisen de novo are enriched in individuals with NDDs and constitute the major cause of NDDs in outbred populations. 6-14 An estimated 42%-48% of individuals with an NDD are thought to harbor a causative de novo variant in known as well as yet-undiscovered disease genes. 13 However, the burden of de novo variants in NDDs has not yet been fully illuminated. 14 To better elucidate the genetic spectrum of (de novo) variants underlying rare NDDs, we describe detailed clinical and genetic findings in 231 individuals with NDDs who underwent trio exome sequencing in a single tertiary care genetic center.

| Study design
We retrospectively analyzed 231 individuals with NDDs in whom trio exome sequencing was performed in our institute. The families were recruited over a period of 3 years (August 2017 until July 2020) from different centers for human genetics, neuropediatrics, and neurology in Germany, Switzerland, Slovakia, and the Czech Republic. Of these 231 trios, 177 (76.6%) have not been published previously. Individuals were eligible for this study if they had (1) a symptom or a constellation of symptoms consistent with an NDD (in accordance with the diagnostic criteria of the "Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition" 2) and (2) no prior genetic diagnosis. We obtained and thoroughly reviewed clinical records of all individuals and applied the human phenotype ontology (HPO) to systematically characterize the individuals' phenotype. 15 As previously published, individuals were assigned to one of two categories based on their clinical presentation: (1) isolated NDD or (2) NDD plus associated conditions, defined as any additional neurological, systemic, syndromic, or other clinical characteristic, for example, microcephaly or neutropenia. 16 Family history was collected by the referring clinician where applicable, and a family history was considered positive when a first-degree relative had an NDD.
All participants or their guardians gave written informed consent for exome sequencing and the publication of relevant findings. The study was performed in agreement with the ethical standards of the responsible committee on human experimentation (institutional and national) and with the Declaration of Helsinki, and was approved by the respective local ethics committees. 18,19 Mitochondrial DNA (mtDNA) variants were assessed using off-target reads as previously described. 20 Variants were analyzed in the in-house exome variant analysis database (EVAdb) using I) a recessive filter for homozygous and compound heterozygous variants with a minor allele frequency (MAF, according to the in-house database with over 20 000 exomes) < 1%, II) a filter for X chromosomal variants with a MAF < 0.1%, and III) a filter for de novo variants with a MAF < 0.01%. IV) A phenotype-based search was conducted by performing an OMIM full term search using the three most characteristic phenotypic traits to establish a gene list; this filter queries variants with a MAF < 0.1%. In addition, CNVs with a MAF < 0.01 and mtDNA variants with a MAF < 1% were assessed. Identified variants were classified according to the American College of Medical Genetics and Genomics (ACMG) guidelines. 21-23 Only cases with likely pathogenic or pathogenic variants as per ACMG (in the following designated "disease-causing") in established disease genes for NDDs were considered solved and were reflected in the overall diagnostic yield. All genes with "strong" or "definitive" evidence for a gene-disease relationship as defined by the Clinical Genome Resource (ClinGen) were considered established disease genes. 24 For all established disease genes containing causative de novo variants, constraint metrics (pLIs and Z-scores) were extracted from the Genome Aggregation Database (gnomAD) v2.1.1 to evaluate gene tolerance to loss-of-function or missense variants. 25 As recommended by gnomAD, we used pLI > 0.9 for loss-of-function variants and Z-score > 3.09 for missense variants as constraint threshold values. 26

| RESULTS
Clinical characteristics were captured using HPO terms (Table S1). 15

[Extracted gene list, source table or figure not recoverable: CTCF, EFTUD2, HECW2, HK1, HNRNPH2, HNRNPU, IFIH1, IMPDH2, ITPR1, KCNH1, KCNT1, KDM3B, KIF11, KIF1A, KIF5C, KMT2A, KMT2C, KMT2D, MORC2, NONO, NSD1, NUS1, PAK1, PDHA1, PPP2R5D, PUF60, SET, SETD1B, SETD5, SHANK3, SLC2A1, SLC6A1, SMARCA4, SOX11, SPTBN2, STAG2, STX1B, STXBP1, TBL1XR1, TLK2, TRIO, TUBB, WAC, YWHAG. Footnote: The variant was identified as low-level mosaicism in the mother (in 1/216 reads, maternal DNA derived from blood).]

[...] majority of them occurring only once (n = 58/72, 79.2%). The most commonly affected gene was ZEB2 (n = 4/72, 5.6%), associated with "Mowat-Wilson syndrome", followed by ARID1B (n = 3/72, 4.2%), GNAO1 (n = 3/72, 4.2%), KMT2B (n = 3/72, 4.2%), and PURA (n = 3/72, 4.2%). Disease-causing variants were detected in nine different X-linked genes, comprising DDX3X (n = 2), MSL3 (n = 2), SMC1A (n = 2), CDKL5 (n = 1), HNRNPH2 (n = 1), NONO (n = 1), PDHA1 (n = 1), STAG2 (n = 1), and ZC4H2 (n = 1). The spectrum of genes containing disease-causing de novo variants is visualized in Figure 2. Table 1 gives an overview of all disease-causing de novo variants identified in this study, including the associated disorder.
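The frequency-based filter tiers described in the methods (recessive, X chromosomal, de novo, and phenotype-based) can be illustrated with a minimal Python sketch. This is not the EVAdb implementation, which is not described in the text; the `Variant` class, the `segregation` labels, the `matching_filters` function, and the placeholder gene names are hypothetical and serve only to show how the stated MAF cut-offs combine.

```python
from dataclasses import dataclass

# MAF cut-offs in percent, as stated in the filter description.
RECESSIVE_MAX_MAF = 1.0    # homozygous / compound heterozygous variants
X_LINKED_MAX_MAF = 0.1     # X chromosomal variants
DE_NOVO_MAX_MAF = 0.01     # de novo variants
PHENOTYPE_MAX_MAF = 0.1    # variants in the phenotype-derived gene list


@dataclass(frozen=True)
class Variant:
    """Hypothetical record; field names are assumptions for illustration."""
    gene: str
    chromosome: str
    maf_percent: float   # in-house minor allele frequency, in percent
    segregation: str     # "de_novo", "homozygous", "compound_het", "inherited_het"


def matching_filters(variant, phenotype_genes=frozenset()):
    """Return the names of all filter tiers that the variant passes."""
    hits = []
    if (variant.segregation in {"homozygous", "compound_het"}
            and variant.maf_percent < RECESSIVE_MAX_MAF):
        hits.append("recessive")
    if variant.chromosome == "X" and variant.maf_percent < X_LINKED_MAX_MAF:
        hits.append("x_linked")
    if variant.segregation == "de_novo" and variant.maf_percent < DE_NOVO_MAX_MAF:
        hits.append("de_novo")
    if variant.gene in phenotype_genes and variant.maf_percent < PHENOTYPE_MAX_MAF:
        hits.append("phenotype")
    return hits


if __name__ == "__main__":
    # Placeholder variant, not taken from the cohort.
    print(matching_filters(Variant("GENE_A", "2", 0.0, "de_novo")))
```

A variant can pass more than one tier (for example, a rare de novo variant on the X chromosome), which is why the sketch returns a list rather than a single label.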

| Diagnostic yield
We systematically evaluated constraint metrics (pLIs and Z-scores) for all genes containing (likely) pathogenic de novo variants (excluding CNVs spanning more than one gene). We observed that the majority of genes (n = 58/67, 86.6%) showed a pLI score > 0.9, indicating a high intolerance toward loss-of-function variants. In addition, 46/67 genes (68.7%) had a Z-score > 3.09, indicating a high constraint toward missense variants (Figure 2(C), Figure 2(D)). We further evaluated the six genes (RHOBTB2, SPTBN2, KCNT1, IMPDH2, IFIH1, SOX11) that did not show an overall constraint toward either missense or loss-of-function variants (Z-scores ≤3.09 and pLIs ≤0.9). Apart from SOX11, whose pLI is most likely low due to the small gene size, we observed that pathogenic variants reported in those genes are all missense variants that cluster within or around a specific domain, in line with a region-specific high constraint (Table S2, Figure S2).
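The threshold logic used in this evaluation can be sketched in a few lines of Python. The thresholds (pLI > 0.9, Z-score > 3.09) and the reported counts are taken from the text; the function names and the example scores are hypothetical placeholders, not gnomAD values.

```python
# Constraint thresholds stated in the methods.
PLI_THRESHOLD = 0.9
MISSENSE_Z_THRESHOLD = 3.09


def constraint_class(pli, missense_z):
    """Classify a gene by gene-wide constraint using strict '>' comparisons."""
    lof_constrained = pli > PLI_THRESHOLD
    missense_constrained = missense_z > MISSENSE_Z_THRESHOLD
    if lof_constrained and missense_constrained:
        return "LoF and missense constrained"
    if lof_constrained:
        return "LoF constrained"
    if missense_constrained:
        return "missense constrained"
    return "no gene-wide constraint"


def percent(count, total):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * count / total, 1)


if __name__ == "__main__":
    # Placeholder scores for illustration only (not real gnomAD metrics).
    example_scores = {"GENE_A": (1.0, 4.2), "GENE_B": (0.2, 1.1)}
    for gene, (pli, z) in example_scores.items():
        print(gene, constraint_class(pli, z))
    # Proportions reported above: 58/67 and 46/67 genes.
    print(percent(58, 67), percent(46, 67))
```

Genes falling into the last class are the ones for which a region-specific constraint, rather than a gene-wide one, would need to be examined separately.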

| Identification of novel candidate and disease genes
In cases without a definite molecular diagnosis, we sought to uncover (novel) candidate genes for NDDs. In summary, 22 different candidate genes were prioritized in 23 individuals. In the majority of individuals (n = 16), de novo variants in candidate genes for autosomal dominant inherited NDDs were found. Seven individuals harbored biallelic variants in candidate genes for autosomal recessive inherited NDDs. All nominated candidate genes were submitted to GeneMatcher. Six individuals were subsequently published within large collaborations connected through GeneMatcher, and one individual was published as a case report following two previous case descriptions, altogether establishing six novel disease-associated genes for NDDs, namely CYFIP2, KDM3B, IMPDH2, FITM2, RALGAPA1, and VARS. 28-33 Those seven individuals were considered solved and assigned to the overall yield (Supplemental Figure 1A). Furthermore, we published another three individuals from this study as single case reports proposing three novel candidate genes for NDDs (CAMK4, POU3F2, RBL2). 34-36 A number of the nominated candidate genes from this study are included in ongoing studies with manuscripts in process and are therefore not listed in detail.

| Systematic reanalysis of unsolved cases
We reanalyzed existing exome data from all cases with negative results that were at least 1 year old (August 2017-September 2019). In summary, we performed reanalysis of 80 initially negative cases using updated variant annotation and newly discovered disease-associated genes. We achieved a diagnosis in two additional individuals, increasing the overall yield from n = 113/231 (48.9%) to n = 115/231 (49.8%). Both individuals harbored variants in genes associated with autosomal recessive disorders (SMPD4, UGDH) 37,38 that had not been described as disease-associated genes at the time of data interpretation and were therefore not prioritized as potentially relevant variants.
Furthermore, two candidate genes that had not previously been prioritized were identified (Supplemental Figure 1B).
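The yield figures reported for the cohort and for the reanalysis can be checked with a short calculation. The counts below are taken from the text; the constant and function names are illustrative, and the 2.5% reanalysis yield is derived here from 2/80 rather than stated in the source.

```python
# Counts reported in the text.
COHORT_SIZE = 231
SOLVED_BEFORE_REANALYSIS = 113
NEWLY_SOLVED = 2
REANALYZED_CASES = 80
SOLVED_BY_DE_NOVO = 93


def percent(count, total):
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


solved_after = SOLVED_BEFORE_REANALYSIS + NEWLY_SOLVED        # 115

yield_before = percent(SOLVED_BEFORE_REANALYSIS, COHORT_SIZE)  # 48.9
yield_after = percent(solved_after, COHORT_SIZE)               # 49.8
reanalysis_yield = percent(NEWLY_SOLVED, REANALYZED_CASES)     # 2.5 (derived)
de_novo_share = percent(SOLVED_BY_DE_NOVO, solved_after)       # 80.9
de_novo_frequency = percent(SOLVED_BY_DE_NOVO, COHORT_SIZE)    # 40.3

if __name__ == "__main__":
    print(yield_before, yield_after, reanalysis_yield)
    print(de_novo_share, de_novo_frequency)
```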

| DISCUSSION
In this study, we present 231 individuals with different NDDs who underwent trio exome sequencing. We further delineate the associated genetic spectrum of NDDs and corroborate the burden of de novo variants in NDDs.
Performing trio exome sequencing in 231 individuals with NDDs and their parents, we achieved an overall yield of 49.8%. The diagnostic yield was significantly higher in individuals with NDD plus associated conditions in comparison to individuals with isolated NDD. Our results are in accordance with a recent meta-analysis (assessing 30 articles with data on molecular diagnostic yield of exome sequencing in individuals with NDDs) that reported a diagnostic yield of 31% for isolated NDD and 53% for NDD plus associated conditions. 16 One possible reason for this difference in diagnostic yields might be that a subgroup of those cases with isolated NDD has a multifactorial basis rather than a monogenic explanation.
With regard to the disease burden of CNVs in NDDs, the observed proportion of 3% in our cohort was smaller than previous estimates ranging from 10% to 15%. 24,39 This discrepancy most likely originates from a depletion of our cohort for cases with CNVs due to prior genetic workup, including chromosomal microarray analysis, in some cases. From a phenotype perspective, the vast majority of individuals in our study displayed additional, often predominant neurological features such as dystonia or seizures, further underlining the convergence in the genetics of NDDs and other neurological comorbidities. 1,30,40 Even though it is widely recognized that de novo variants in protein-coding genes constitute the major genetic cause of NDDs in outbred populations, the burden as well as the genetic spectrum of de novo variants in NDDs have not been fully elucidated yet. 14 In terms of de novo variants, we made several key observations in our study: First, the frequency of disease-causing de novo variants of 40.3% (n = 93/231) aligns with the prevalence of 42% recently presented in a large sequencing study of individuals with NDDs, 13 emphasizing the utility of trio sequencing as a first-line strategy, in particular in sporadic cases. 41,42 Second, with the identification of 72 distinct molecular diagnoses in our cohort, we replicate the enormous genetic heterogeneity underlying NDDs, which challenges diagnostic determinations based on clinical examination alone, even in disorders considered highly recognizable such as Mowat-Wilson syndrome. 16,43 Those findings illustrate the advantage of exome sequencing over a targeted panel sequencing approach and further support exome sequencing as a first-tier test for unexplained NDD in clinical practice. 16,44 Third, we expand the list of disease-causing variants in NDD-associated genes with 50 previously unreported (likely) pathogenic variants, facilitating variant classification in other cases.
Last, we observed that in the majority of genes containing de novo variants the predicted constraint metrics indicated an overall high intolerance toward loss-of-function (pLI > 0.9) and/or missense variants (Z-score > 3.09) or a region-specific constraint, illustrating the importance of constraint metrics for disease gene discovery and the understanding of disease mechanisms. 25 The percentage of autosomal recessive disorders in our NDD cohort (16%), which did not derive from a significant proportion of cases with a consanguineous background, was surprisingly high in comparison to a previous study showing a low contribution (4%) of autosomal recessive disorders to NDD in patients with European ancestry. 45 The proportion of cases with syndromal NDD was higher in the subgroup with autosomal recessive inheritance (n = 19/19, 100%) in comparison with those with de novo variants (n = 89/93, 95.7%), raising the question whether inclusion criteria were different in our study in comparison with previously published cohorts.
As hundreds of novel causal genes for rare NDDs still await discovery, 5 we also aimed to elucidate novel disease-associated genes for NDDs, leading to the prioritization of more than 20 different candidate genes in our cohort of 231 individuals. A number of the nominated candidate genes have already resulted in publication as novel disease-associated genes, 28,29,31 once more emphasizing the potential of international data sharing and cooperation. 46,47 Most importantly, we illustrate that a parent-offspring trio approach is also a powerful tool for the discovery of novel disease-associated genes, as it facilitates the prompt identification of de novo variants and assignment of zygosity for inherited variants. 42 Given that our overall diagnostic yield did not include individuals with findings in new candidate genes, some of which are currently in preparation for publication, we furthermore anticipate that the actual number of molecular diagnoses in our cohort is going to increase.
The number of known gene-disease and variant-disease associations is continually growing, necessitating regular reevaluation of unsolved exomes. 48,49 In line with previous studies demonstrating an improved diagnostic yield by systematic reanalysis of existing data, 48,50 we achieved a definitive diagnosis in two additional individuals (among 80 reanalyzed individuals with initial negative results). Beyond this, reanalysis in our cohort led to the identification of two novel candidate genes for NDDs, highlighting the potential of subsequent reanalysis also for disease gene discovery. 41,51 In summary, we consolidate the contribution and genetic heterogeneity of de novo variants in NDDs, highlighting trio exome sequencing as an excellent diagnostic tool for rare NDDs. In addition, we illustrate the potential of a trio approach for candidate gene discovery and the power of systematic reanalysis of unsolved cases.

SUPPORTING INFORMATION
Additional supporting information may be found online in the Supporting Information section at the end of this article.