The role of the gut microbiota in patients with Kleefstra syndrome

Kleefstra Syndrome (KS) is a rare monogenetic syndrome, caused by haploinsufficiency of the euchromatic histone methyl transferase 1 (EHMT1) gene, an important regulator of neurodevelopment. The clinical features of KS include intellectual disability, autistic behavior and gastrointestinal problems. The gut microbiota, an important modifier of the gut-brain-axis, may constitute an unexplored mechanism underlying clinical KS variation. We investigated the gut microbiota composition of 23 individuals with KS (patients) and 40 of their family members, to test whether (1) variation in the gut microbiota associates with KS diagnosis and (2) variation within the gut microbiota relates with KS syndrome symptoms. Both alpha and beta diversity of patients were different from their family members. Genus Coprococcus 3 was lower in abundance in patients compared to family members. Moreover, abundance of genus Merdibacter was lower in patients versus family members, but only in participants reporting intestinal complaints. Within the patient group, behavioral problems explained 7% of beta diversity variance. Also, within this group, we detected higher levels of Atopobiaceae – uncultured and Ruminococcaceae Subdoligra-nulum associated with higher symptom severity. These significant signatures in the gut microbiota composition in patients with KS suggest that microbiota differences are part of the KS phenotype.

. Co-morbid autism spectrum disorders (ASDs) are highly prevalent in this syndrome (Vermeulen, de Boer, et al., 2017).Over the lifespan of these children, a major concern is the occurrence of regression during adolescence and young adulthood.This is often thought to be due to a (manic) psychotic disorder, causing loss of functioning, disorganization and sleep problems.These episodes of regression typically start during puberty or young adulthood (Vermeulen, Staal, et al., 2017).Besides the effect on neurodevelopment, many patients or their guardians (such as family members) report gastrointestinal problems co-morbid with KS, such as severe constipation which, in some cases, requires surgery.Personal communication from Kleefstra et al. (unpublished data) indicates digestive problems in 59.6% (56/94) of patients, of which 21 report severe and 18 mild constipation.Repeated respiratory infections at pediatric ages are also prevalent (Kleefstra et al., 2009).Hence, both mental (intellectual disability, ASD) as well as somatic disorders (gastrointestinal and proneness to infections) are part of the KS phenotype.Treatment options are limited to medication aimed at reducing problematic behavior and relieving somatic complaints.The genotype-phenotype correlation in KS is low as there is still high (unexplained) variability in functioning between patients with a similar size or variant of the EHMT1 haploinsufficiency.
An unexplored mechanism underlying the mental and somatic symptoms in KS and explaining part of the variability in phenotype may be impairments in the bi-directional communication that exists between the gastrointestinal tract and the central nervous system (CNS), called the gut-brain-axis (GBA) (Cryan & Dinan, 2012).The GBA encompasses several routes, including endocrine signaling through hormones and neuro-active metabolites as well as signaling through the immune system and vagal nerve.The trillion of bacteria commensally living in our gastrointestinal system (also known as the gut microbiota) are an essential hub of the GBA, that modulate the bidirectional communication routes with the CNS (Cryan et al., 2019).
Directly after birth and during early life, the gut microbiota play a crucial role in building a healthy gut, educating immune processes and shaping neurodevelopment (Borre et al., 2014;Sordillo et al., 2019).
GBA alterations and differences in gut microbial composition (compared to family members or healthy control subjects) are observed in many neurodevelopmental and intellectual disability disorders (NDDs and IDDs) with similar symptoms to those also observed in KS (Dam et al., 2019;Q. Li, Han, Dy, & Hagerman, 2017;Van Ameringen et al., 2019).As there are, to our knowledge, hardly any studies reporting on the gut microbiota in homogeneous monogenetic (or Mendelian) syndromes such as KS, these disorders are the only relevant reference we currently have.Of these, cohorts of patients with ASDs are particularly relevant for KS and have been quite extensively studied, at least in comparison to other NDDs.Similar to KS, gastrointestinal problems occur more frequently in patients with ASDs compared to family members and the general population (Holingue, Newill, Lee, Pasricha, & Daniele Fallin, 2018) and bowel problem severity correlates with ASD-symptoms severity in a study by Adams, Johansen, Powell, Quig, and Rubin (2011).Most studies investigating the gut microbiota in ASD show less diverse microbiota in the ASD group compared to controls, as reviewed by Bundgaard-Nielsen et al. (2020).The most often observed differences include an increased abundance of bacterial genera Bacteriodes and Clostridium as well as a decreased abundance of Prevotella, Bifidobacterium and Streptococcus (Bundgaard-Nielsen et al., 2020).
A few (smaller) studies exist investigating the role the gut microbiota play in monogenetic IDDs, such as Down Syndrome (trisomy 21) and Rett syndrome, which are also characterized by similar symptoms as seen in KS such as ASD behaviors and gastrointestinal problems (Bermudez, de Oliveira, de Lima Cat, Magdalena, & Celli, 2019;Smeets, Pelc, & Dan, 2012).In Down Syndrome, increased relative abundance of (amongst others) genus Sutterella was observed, positively correlating with the Aberrant Behavior Checklist score (Biagi et al., 2014).In Rett syndrome both alpha and beta diversity were altered and several genera showed increased abundance (Borghi et al., 2017;Strati et al., 2016).
These above-mentioned findings in disorders similar and/or comorbid with KS suggest that KS may similarly be characterized by alterations in gut microbial composition.These alterations may contribute to gastrointestinal symptoms but also immune functioning and behavioral impairments.Hence, we hypothesize that the gut microbiota is an integral component of the clinical development and presentation of patients with KS.
In this exploratory study we aimed to test whether (1) variation in the gut microbiota associates with KS diagnosis and (2) variation in the gut microbiota associates with the clinical presentation, that is, syndrome symptoms of KS.We were able to collect data from 23 patients diagnosed with KS and 40 of their family member (n = 14 fathers, n = 16 mothers and n = 10 siblings).To this aim, we assessed (1) differences in global gut microbial community (alpha and beta diversity) as well as microbial composition at the taxonomic genus level between patients with KS and their family members.Moreover, (2) we tested the association between the abundance of bacterial genera with intestinal complaints (in both patients and family members), the EHMT1 genotype (of the patients) and two instruments measuring behavioral problems in the patients; the Autism Diagnostic Observation Schedule (ADOS) and Child Behavior CheckList (CBCL).

| Participants
In total 63 subjects participated in this study: 23 patients with KS and 40 of their family members consisting of 16 mothers, 14 fathers and 10 siblings (see This age difference limits our ability to compare Body Mass Index values (Cole, Bellizzi, Flegal, & Dietz, 2000).Therefore, we first grouped our participants by age and then grouped each subjects' BMI into their age and sex appropriate category (healthy, overweight or obese) (Kist-van Holthe et al., 2012).Analyses were done using these categories rather than crude BMI values.For seven samples (one patient, one father and five siblings) BMI was imputed using the median of their family group used in the analyses (patient or family members), see Table S1 for samples with missing values.The variable "dietary setting" was used to indicate shared or separate dietary intake between patients and family members.For example, when patients live at home they likely have a largely similar dietary intake across that whole family.In case of a patient living in a community or a sibling or parent living in a different household (e.g., in case of divorced parents) we assumed no shared dietary intake.No information about dietary setting (of family members) was available for two siblings so we grouped them as living at home.We used the "dietary setting" variable as a covariate in our models (coded as" living at home" and" not living at home").The variable "familyID" was used to correct for family relatedness in all analyses when including all subjects.To further characterize the patient sample and the relation amongst their phenotypic features, being behavioral and somatic symptoms and their genetic variants, correlations were performed between these 4 measures.

| Experimental procedure
All known families of patients with KS at our expert center (EC) at time of initiation of the study were approached for participation (via the Radboudumc EC for rare genetic neurodevelopmental syndromes).Patients were included if they had a molecular proven diagnoses of KS with a genetic description and if they had a biological age of at least 3 years old.The 3 year old cut-off was used as the behavior problems are not yet classified.Moreover, from that age children eat food similar to their parents, making their gut microbiota more comparable.The involved medical specialist informed them about the study and the families received an information letter.Written informed consent was obtained by the (adult) family members and/or their legal guardian.After the written informed consent was signed, the families received questionnaires to complete as well as the material for fecal collection.Data were collected between July and November 2017.

| Autism diagnostic observation schedule
The Autism diagnostic observation schedule (ADOS) (C Lord et al., 2012) is a standardized diagnostic instrument, which is frequently used worldwide (Catherine Lord, Elsabbagh, Baird, & Veenstra-Vanderweele, 2018) to assess autism features.It provides a semi-structured observation, which is performed by a certified psychologist or psychiatrist (in our case the involved medical specialist) and consists of four modules based on the language capacity of the participant.It runs from module 1, preverbal to minimal verbal capacity and includes activities, to module 4, fluently verbal and includes T A B L E 1 Descriptive statistics of patients with Kleefstra syndrome and their family members For more detailed information on intestinal complaints see Table S2.
adult topics to discuss.A comparison score is available to compare the different modules.The score for clinical suspicion of an autism spectrum disorder ranges between 5 and 10 (moderate to severe suspicion) (Vermeulen, de Boer, et al., 2017).
Our previous studies in KS patients showed a high prevalence of ASD, in which the ADOS-2 instrument best discriminated the ASDsymptoms from the developmental delay (Vermeulen, Egger, et al., 2017).All patients in this study scored 5 or higher, see Table 1.

| Child behavior checkList
The child behavior checkList (CBCL) is a widely used diagnostic instrument to quantify problem behavior and skills of children and adolescents, completed by the parents or guardian (T.M. Achenbach & Rescorla, 2001a;Thomas M Achenbach, Dumenci, & Rescorla, 2001b).The version for children from 1,5 years to 5 years old was used, corresponding to the developmental age of the patients (Borthwick-Duffy, Lane, & Widaman, 1997;Vermeulen, Egger, et al., 2017).Parents or guardians answer 99 questions about emotional and behavioral problems they observe in the child, answers ranging from 0 "not true" to 2 "very true".A total t-score (for a broad range of problematic behavior, internalizing as well as externalizing) was calculated and used in this study.In patients with NDDs, like KS, the total t-score is the best reflection of behavioral problems, because there is often a mix of ID, comorbid NDDs (like ASD) and they are prone to develop major psychiatric disorders as well, like psychosis or mood disorders (Vermeulen, de Boer, et al., 2017).A total score of 59 or below indicates nonclinical symptoms, a score between 60-64 indicates the child is at risk for problem behaviors and a score of 65 or higher indicates clinical symptoms (T.Achenbach & Rescorla, 2001).Three patients scored below the nonclinical symptom cut-off score, five were in the at risk category and all others (n = 15) scored within the clinical symptom range.

| Lifestyle questionnaire-intestinal complaints
A semiquantitative lifestyle questionnaire was used to assess dietary habits, medication, smoking, allergies, dietary setting, incontinence, physical and mental health.
A range of physical and mental health questions and dietary habits was included in this lifestyle questionnaire, for an overview see Supplementary methods and Table S2.Of specific clinical interest are "intestinal complaints" which were indicated for patients and family members on a 7 point scale ranging from "never"," < 1 times per month", "1 times per month", "2-3 times per month", "1 day per week"," >1 day per week" or "every day".As these 7 points results in very small sample sizes per category we coded these into two groups: "No", containing the "never" group and "Yes", containing the other groups.Detailed reported medication prescriptions by patients and family members are listed separately in Table S3.

| Fecal sample collection
Fecal samples were collected by the subjects at home using a validated protocol by OMNIgene•GUT kit (DNAGenotek, Ottawa, CA).
Caregivers were instructed to send back the kits within 2 weeks after collection.Samples were on average received and frozen 3.4 days after collection with two exceptions beyond the 14 days instruction of 18 and 22 days.This is still within the 60 days period determined by DNAGenotek.The material was aliquoted into 1.5 ml Eppendorf tubes and stored in À80 C until further laboratory processing (Szopinska et al., 2018).

| Bacterial DNA isolation and sequencing
For bacterial DNA extraction, 150ul or mg of feces was aliquoted from the Eppendorf tube.Microbial DNA was isolated including a bead-beating procedure and purified using a customized protocol by Baseclear B.V. (Leiden, the Netherlands).The V4 region of 16 S ribosomal RNA (rRNA) gene was amplified in duplicate PCR reactions for each sample.The V4 region was targeted using primers 515F (5 0 GTGYCAGCMGCCGCGGTAA) and 806R (5 0 GGACTACNVGGG TWTCTAAT) (Apprill, Mcnally, Parsons, & Weber, 2015;Caporaso et al., 2011;Parada, Needham, & Fuhrman, 2016).Both a PCR negative was included sample to assess contamination introduced during the PCR step as well as two positive control samples to assess correct detection of a sample with a known microbial distribution.Libraries were pooled for sequencing on the Illumina NovaSeq 6,000 platform (paired-end, 250 bp) (Baseclear B.V., Leiden, the Netherlands).
Sequence reads were demultiplexed and filtered using Illumina Chastity.Reads containing PhiX control signal were removed and adapter sequences were removed from all reads.An average sequencing quality score "Q "of 35.98 (range 35.17-36.25)was obtained indicating the sequencing error probability for a given base, where Q = 30 indicates 99.9% accuracy.We observed a total number of raw read pairs of 57.749.319 with an average of 916.655 and a range of 752.140-1.209.564.

| Bioinformatics
Reads were processed for statistical analysis in QIIME2.First, reads were denoised using DADA2 (Callahan et al., 2016): primers were trimmed off, sequencing errors (chimeras) were removed, combined read pairs and erroneous read combinations were removed.Read pairs were aligned into Amplicon Sequence Variants (ASVs) using mafft (Katoh, Misawa, Kuma, & Miyata, 2002).Combined reads deviating from the expected combined length of 250 bases were filtered out, allowing variation of 10 bases both directions (i.e., 240-260 range), which was 0.1% of the total of 39.777.830reads.This resulted in a total of 39.739.397reads corresponding to 5,452 unique ASVs over the 63 samples.An average read count per sample of 635.858 was obtained, ranging between 479.606-896.730,see Figure S1.
These sequences represented a total of 6.771 ASVs with an average of 477 ASVs per sample, ranging between 176-914 ASVs.Taxonomic assignment was done in q2-feature-classifier using the Naive Bayes classifier (Bokulich et al., 2018) trained on the SILVA reference database (version 132 with 99% OTU from 515/806R region of sequences) (Yarza et al., 2014).A phylogenetic tree was built using fasttree2 (Price, Dehal, & Arkin, 2010) and combined in a biom file including ASV and taxonomy information.Nonbacterial and unassigned ASVs were removed from the biom file.Data was made compositional; that is, abundance of each ASV was expressed relative to the total amount and prevalence of ASVs.All statistical analyses were performed in R (version 3.6.2) using packages phyloseq (McMurdie & Holmes, 2013) and microbiome (Lathi & Shetty, 2017), other packages are cited when used.

| Analyses on gut microbiota
The analyses were divided in two parts: analysis on the whole dataset (n = 63) and within the patient group only (n = 23), see Table 2 for a schematic of all analyses performed.Analysis on the whole dataset consisted of community and composition analysis on disease status (patient versus family members) and on the interaction between disease status with intestinal complaints.The patient only analysis consisted of association analyses of community and composition with phenotypic information on behavioral symptom severity (CBCL total problems score and ADOS comparative score) and on the genetic variants (mutation and deletion in the EHMT1 gene, see above).

| Community analyses
Global community analyses were performed without any prevalence filtering on genus and sample level to prevent inducing a bias against less prevalent genera.

Alpha-diversity
Shannon alpha-diversity was calculated in this study.After assessing normality using Shapiro-Wilk test, parametric ANOVAs were used.
For the analyses of the complete dataset, two separate ANOVA models were performed, one testing for effect of the factor disease status (patient vs. family) and another including the factor intestinal complaints (yes/no), allowing us to test for an interaction between disease status and intestinal complaints.Both models were corrected for age, BMI, sex, familyID and dietary setting.
In the patient only analyses, alpha diversity was correlated with the symptom severity measurements CBCL total score and ADOS comparative score (using Pearson's correlation where both measures are normally distributed, Spearman's where not).Moreover, using ANOVA we tested for differences in Shannon diversity between (1) two groups of genetic variants (mutation + deletion <1 MB, deletion >1 MB) and (2) three groups (mutation, deletion <1 MB, deletion >1 MB) both controlled for age, BMI and sex (see Table 2 for a an overview of analyses performed).

Beta-diversity
As a measure of beta-diversity, we calculated Bray-Curtis dissimilarity using the Principal Coordinate Analysis (PCoA) method (via the ordinate function in the phyloseq package).Bray-Curtis is a distance metric explaining differences in the relative abundances between microbial communities (Lozupone & Knight, 2009).Permutation testing was performed using the adonis function in the vegan package (Oksanen et al., 2019).
In the whole dataset, permutation testing was done in two models: the first one tested for the effect of disease status (Patient, Family) on beta diversity and the second one assessed an interaction between disease status and intestinal complaints (yes/no).Both models were corrected for age, BMI, sex, dietary setting and accounted for family relatedness by using familyID in the "strata" argument.This ensures permutations are constrained between and not within families.
For the patient only analyses, we assessed the effect of behavioral symptom severity (CBCL total problems score and ADOS comparative score) and genetic variants on beta diversity, correcting for age, BMI and sex.
Besides statistical testing, beta diversity was visualized using Canonical Analysis of Principal Coordinates (CAP) (Anderson & Willis, 2003).This supervised ordination method allows visualization of variance explained by a variable of interest, rather than visualization of the largest source of variation in the data as is the case in unsupervised ordination methods such as Principle Coordinates Analyses (PCoA).

| Compositional analyses on the taxonomic data
Compositional analyses were performed at genus level; the 6.771 ASV's obtained after denoising were aggregated to 417 genera.Two filtering steps removed genera containing excess zeros in the composition data, see (Szopinska-Tokov et al., 2021).First, at genus level, genera with a prevalence >50% across all samples were selected; consisting of 138 genera (these are genera with at last 50% of nonzero values across all samples).This threshold removes noise and rare genera and includes the most abundant microbiota, see Risely et al., 2021;Weiss et al., 2016, therefore referring to this selection as the "core" microbiota.We acknowledge that true signals present in taxa with frequencies lower than the threshold cannot be detected this way.Second, at sample level, samples with less than 10% of genera with nonzero values are removed, which resulted in zero removed samples.
For the whole dataset, per genus we used two regression models to test for (1) a main effect of disease status and (2) an interaction effect between disease status and intestinal complaints (yes/no) on relative abundance (see Table 2).To abide by the skewed nature of the data, nonparametric regressions were performed using the aligned.rank.transformfunction of the ART package (Villacorta, 2015).
In short, this function includes a preprocessing step aligning data before applying averaged ranks, and then runs a classic ANOVA on each of the aligned ranks (F-test) (Wobbrock, Findlater, Gergle, & Higgins, 2011).Rank-based statistics are very commonly used in nonparametric testing, making the statistics less vulnerable for extreme values in the distribution (Feys, 2016).Moreover, this method allows inclusion of covariates as well as interaction testing, which is not pos-

| Analyses on the relation between diseaserelated symptoms and EHMT1 genetic variation
To assist the interpretation of potential associations between diseaserelated characteristics (such as behavioral symptoms, intestinal complaints and genetic variants) and gut microbiota, we assessed the relation amongst these symptom characteristics in the patients.We tested the relation between behavioral symptoms (CBCL and ADOS) using a Spearman correlation.CBCL and ADOS were compared between patients that did versus those that did not experience intestinal complaints using a t-test for the CBCL and a Wilcoxon test for the ADOS.
For genetic variants, loss of function in EHMT1 gene accounts for most features in Kleefstra syndrome, which can be caused by several types of mutations on the EHMT1 gene.Two types of groupings can be made based on the clinical presentation: patients carrying pathogenic mutations or small deletions have a similar clinical severity spectrum (Kleefstra & de Leeuw, 2019).However, patients with larger deletions usually have more severe intellectual disability and more severe medical/somatic problems such as congenital anomalies, feeding problems and respiratory problems (Kleefstra & de Leeuw, 2019).
Hence, we split the genetic variants firstly into two groups according to the type of genetic variants they were carrying: firstly, pathogenic mutations + deletions <1 MB (n = 18) and large deletions >1 MB (n = 4) and secondly, using three dissociable groups: presence of a pathogenic mutation (n = 12), small deletion (<1 Mb) (n = 6) and large deletion (>1 Mb) (n = 4) (see Table 2 for an overview on the analyses).
Comparison between CBCL and the two groups of genetic variants was done using a t-test and for the three groups of genetic variants using an ANOVA.Comparison between ADOS and two groups of genetic variants was done using a Wilcoxon test and a Kruskal-Wallis test for the three groups of genetic variants.To test the association between intestinal complaints and the two and three groups of genetic variants, a chi-square test was used.

| Community results
We observed a main effect of disease status on Shannon diversity; F  S4. Disease status explained 3% variance explained in adonis permutation testing (R2 in Table S4) in the model.The CAP visualization (Figure 2) shows separate grouping of bacterial communities of patients and their family members.

| Compositional results
Out of 138 genera, one genus, Coprococcus 3, was significantly associated with disease status; relative abundance of this genus was higher in the family group compared to the patient group (F = 17.079, p uncorrected = 0.0002, p corrected = 0.033, mean difference = À0.135,SD difference = À0.033),see Figure 3.
For 15 genera we observed an FDR-uncorrected effect of disease status, see Supplementary Results for more information on these, amongst others Table S6 on the statistical results per genus and We observed an FDR un-corrected interaction effect of disease status*intestinal complaints for 13 genera.In the Supplementary Results more information is provided, amongst others Table S6 for the statistical results per genus and Figure S5 for plots per genus.

| Community results
No significant correlations were found between Shannon diversity and the CBCL (r = À0.114,p = .614)or the ADOS (r = 0.018, p = .936).We did not observe significant differences on Shannon  The CBCL total problems score showed a significant difference in the permanova test, see Table S5.
Covariates age, bmi and sex were included in the model Relative abundance of nine genera showed a nonsignificant (FDR-uncorrected) association with CBCL total score.See the Supplementary Results, Table S6 and Figure S7 for more information.

| ADOS comparative score
We did not detect any significant associations between any of the genera with the ADOS.However, one genus, Incertae_Sedis_uncultured, showed suggestive evidence for association with the ADOS comparative score.See Supplementary Results, Table S6 and Figure S8 for more information.

| Relation between disease-related symptoms and genetic variants
Scores on ADOS and CBCL scales did not correlate significantly (r = À0.113,p = 0.625).Regarding the relation with intestinal complaints, we observed a significant difference in CBCL total score (t = À3,171, p = .018),showing that high behavioral symptom severity was associated with frequent intestinal complaints, see Figure S9.
ADOS scores did not differ between patients experiencing and those not experiencing intestinal complaints (W = 28, p = .732).
None of the genetic variant groups showed differences in CBCL or ADOS scores (CBCL: two groups [t = À0.181,p = .872],three We detected significant taxonomic differences (at genus level) in relative abundance between patients with KS and their family members; relative abundance of genus Coprococcus 3 was significantly lower in patients compared to their family members.Moreover, an interaction between disease status and intestinal complaints was observed for genus Merdibacter.That is, for those participants reporting intestinal complaints, Merdibacter relative abundance was lower in patients (versus family members), while the opposite pattern was seen in participants without intestinal complaints.Within the patient group, CBCL scores associated significantly with beta diversity, explaining 7% of variance in the microbial community.Moreover, we found higher levels of two genera associated with more behavioral symptoms on the CBCL.

| Gut microbial community differences between patients with Kleefstra syndrome and family members
The gut microbial community of KS patients is less diverse and clusters separately from that of their family members, indicating the microbial community consists of different types and lower abundance of bacteria.Such findings are found in several types of NDDs, both monogenic as well as multifactorial.For example, ASD, ADHD as well as Down Syndrome, Rett Syndrome are associated with reduced alpha and altered beta diversity compared to controls or family members (Biagi et al., 2014;Borghi & Vignoli, 2019;Bundgaard-Nielsen et al., 2020), although few experimental studies are yet available.Comparison to these related NDDs is relevant as there are no studies on the gut microbiota in KS.Moreover, patients with KS often also display ASD-like symptoms and meet ASD criteria, hence gut microbial differences and underlying mechanisms may partially overlap.
When zooming into compositional level, abundance of the genus Coprococcus 3 was lower in KS patients versus their family members.
Interestingly, also in children with ASD Coprococcus was lowered (Kang et al., 2013).Moreover, Coprococcus abundance associated with altered gut-brain networks in patients with irritable bowel syndrome (Labus et al., 2019)

| Patient only dataset: Associations between gut microbiota and behavioral symptoms
Within the patient group, we showed that the total t-score on the CBCL explains a significant part of the variance in the microbial community (7% variance explained in beta diversity).At taxonomic level, the abundance of the uncultured genus Atopobiaceae correlated positively with behavioral symptom severity.Note that this result is build up by several zero-observations (n = 12 zero versus n = 7 nonzero observations) of which we cannot verify whether these zero observations are true biological zeros or reflect under-sampling.As this is still an uncultured bacterium, not much is known about its function in the human intestinal tract and GBA.Blasting of the most frequent observed sequence revealed one study where this genus was associated with schizophrenia (S.Li et al., 2020).Genus Ruminococcaceae Subdoligranulum levels are lower in subjects with antibiotics-related diarrhea (Gu et al., 2022) and depression (Radjabzadeh et al., 2021).
Its abundance is associated with metabolic syndrome markers (Van Hul et al., 2020).Another positive (uncorrected) correlation with CBCL was observed for Coprococcus 3, which was also observed to be lower in the patients versus family member comparison (see above).
This combination of results suggests this genus may be a relevant microbial signature belonging to KS.
Within the patient group, no significant effect of genetic variants on the gut microbial community was observed.Despite this and a small sample size, visually separate clustering was seen between these different genetic variants (Figure 7).This suggests a potential "dose effect" of EHMT1 genetic variants on the microbiota diversity.Clinical phenotypes are known to vary based on genetic variant; e.g., larger deletions are associated with worse behavioral and somatic symptoms.Furthermore, KS patients carrying a mutation rather than a deletion are more likely to have high birth weight and be overweight later in life (BMI > 25) (Kleefstra et al. 2009, Willemsen et al., 2012).Along these lines, genetic variants within KS may be reflected also at gut microbial level.

| ADOS versus CBCL associations with gut microbiota
In the associations with ADOS, none were FDR-significant and correlation plots with relative abundance do not show convincing patterns, possibly partly due to the lower sample size available for this analysis.
This difference compared to CBCL is a relevant point of attention for future studies, as ADOS is a specific measure for autistic symptoms assessed by a professional in a 30-60 minutes during observation.It is hence a more independent measure focused on autistic behaviors compared to CBCL, in which the occurrence of a wider range of behavioral symptoms (cognitive, emotional, physical and social) during the past 6 months is rated by a representative of the patient.Consequently, ADOS and CBCL measure different aspects of behavior and daily functioning, which is reflected in the absence of correlation between these symptoms scores.This also seems to be reflected in the nonoverlapping relations with the gut microbiota of KS patients.
Possibly, despite the fact many patients meet the diagnostic criteria of ASD, in the context of the gut microbiota, the wider range of behavioral symptoms assessed by the CBCL is a more relevant measure.impacts the gut microbiota and can contribute to differences between patients (with KS) and comparison groups (Jackson et al., 2018;Maier et al., 2018;McGovern, Hamlin, & Winter, 2019).Nevertheless, medication use was not controlled for in the analyses, as all such strategies in this dataset will be flawed especially given our sample size.For example, excluding patients with medication use will only leave six samples in this group to compare with 36 family members.Moreover, medication use in KS patients often include psychotropics such as anti-psychotics and sedatives, which are therefore in themselves an indication of disease and symptom severity.This means that when testing for an association with behavioral symptom severity, medication use is colinear with disease severity and cannot be properly controlled for.When testing for effects of disease status, depending on the type of medication, patients may never go without and hence one can argue that the effect of medication on the gut microbiota is part of the KS patient phenotype.

| FUTURE DIRECTIONS
The current findings should be tested for replication in a larger sample.For example, including more siblings will form a separate, better age-matched, comparison group.Moreover, higher sample size per genetic variant will extend analyses on microbial profile differences between genetic variants.Including a broader age range of patients will allow testing of the stability of gut microbial signatures of KS across age or with regression stage.Furthermore, intestinal complaints may be more extensively characterized with validated (observational) questionnaires, moving beyond the rough semi-quantitative inventory that was currently done for feasibility reasons.It is noteworthy that a study by Strati et al. (2016) in patients with Rett Syndrome did define constipation according to the official Rome III criteria and also found no effect on the global community measures (alpha and beta diversity).Similarly, dietary patterns may be studied more in depth, even though a proper assessment requires an extensive time and effort both from the participant as well as a certified nutritionist scoring the dietary diary or Food Frequency Questionnaire.Illustrating the relevance of proper dietary control in gut microbial case control comparisons is the recent article on the bidirectional relation between diet and gut microbial differences associated with ASD (Yap et al., 2021).
Tapping more closely into mechanisms driving gut-brain alterations, in future studies we would like to assess metabolic characteristics of KS patients.The EHMT gene affects metabolic processes; such as (brown) fat metabolism (Nagano et al., 2015) and energy metabolism under stress conditions, as observed in EHMT/G9a knockout fruit flies (Riahi et al., 2019).Mitochondrial metabolism is disturbed in ASD and other NDDs (Panneman, Smeitink, & Rodenburg, 2018;Rossignol & Frye, 2012).These defects in mitochondrial fat/energy metabolism in KS and related NDDs could also interact with the gut microbiota, whose metabolites modulate metabolic health (Franco-Obreg on & Gilbert, 2017).By studying such host-microbiota interactions in KS patients we may identify suitable pathways for microbial interventions such as probiotics.
In relevant initiatives, overlapping microbial signatures across various types of NDDs (monogenetic such as Rett and Down Syndrome and multifactorial such as ASD) are reviewed (Borghi & Vignoli, 2019;Orozco, Hertz-Picciotto, Abbeduto, & Slupsky, 2019).With more studies on microbiota in NDDs emerging, these initiatives can be up-

| CONCLUSION
In conclusion, our results suggest that gut microbial alterations are an integral part of the KS phenotype.Understanding potential alterations in the gut microbiota of individuals with KS may bring improved therapeutic guidance for this patient group.For example, novel treatment strategies, such early life probiotics supplementation may support better gut microbial health with the ultimate aim to relive their GI symptoms and, through the GBA, their cognitive abilities and behavioral symptoms.
sible using other nonparametric methods such as Wilcoxson's signed rank test.The two models included correction for age, BMI, sex, dietary setting and familyID.In case of significant interactions between disease state and intestinal complaints, simple effects of disease state per intestinal complaints group (yes/no) were tested using Mann-Whitney U-tests.For the patient dataset, we ran three models to test for a main effect of CBCL total score and ADOS comparative score on the relative abundance of any of the bacterial genera.These models were corrected for age, BMI, sex.Due to the low sample size per genetic variant sub-group, no compositional analyses were performed on genetic variants.All the taxonomic analyses were False Discovery Rate (FDR)-corrected for the amount of tests performed (138 genera per model).For significant results, we visually inspected the relative abundance data distribution: identifying the number of nonzero and zero observations per group.The nature of sequencing data does not allow dissociation between "true" biological zero's from sub-threshold detection or sequencing artifacts(Kaul, Mandal, Davidov, & Peddada, 2017).Hence, interpretation of statistical results based on many zero observations should be done with caution.Working with a core microbiota already limits the chance of finding such results.The interpretation of statistical results should be done in the context of how many nonzero observations this result was based on, which is reported per group (patient and family members) in each data plot on the horizontal axis.

3. 3
| Compositional results 3.3.1 | CBCL total score Higher relative abundance of genera Atopobiaceaeuncultured (F = 26,427, p uncorrected = 0.0001, p corrected = 0.014) and Ruminococcaceae Subdoligranulum (F = 21,273, p uncorrected = 0.00029, p corrected = 0.012) were found to be significantly associated with a higher CBCL total score, see Figure 7. F I G U R E 4 Boxplot showing the interaction effect of disease status*intestinal complaints on relative abundance of genus Merdibacter.The model included covariates age, BMI, sex, family relatedness and dietary setting.The boxplot indicates the median (black horizontal line) and 25th and 75th quartiles as the outside of the box.The lines represent the largest and smallest values within the 1.5 interquartile range and the dots represents the outliers which is >1.5 times and <3 times the interquartile range.The asterisks show the results of the within test using a Mann-Whitney U-test: ns = p > .05,* = p < .05 and **p < .01F I G U R E 5 CAP plots of Bray-Curtis dissimilarity metric for CBCL total problems score (CAP1, left panel) and ADOS comparative score (CAP1, right panel).The coloring represents the CBCL total problems or ADOS comparative score where a lighter gray represents a higher score and a darker gray represents a lower score.

F
I G U R E 7 Scatter plot showing the relation between relative abundance of genera Atopobiaceaeuncultured, Ruminococcaceae Subdoligranulum and CBCL total score.The models testing this association included covariates age, bmi and sex F I G U R E 6 CAP plots of Bray-Curtis dissimilarity metric for genetic variants.The coloring represents the different genetic variants in three groups (pathogenic mutation, deletion <1 MB or deletion >1 MB) or in two groups (pathogenic mutation + deletion <1 MB or deletion >1 MB).In the right panel, the shapes are based on the three groups (pathogenic mutation, deletion <1 MB or deletion >1 MB) to see how the pathogenic mutation and deletion <1 MB variants cluster groups [F[2,17] = 0.114, p = .893],ADOS comparative score: two groups [W = 29.5, p = .712],three groups [Χ 2 = 5.002, p = .082]).No significant relation was found between intestinal complaints and two groups (Χ 2 = 1.523 e-30, p = 1) or three groups of genetic variants (Χ 2 = 1.593, p = .451). 4 | DISCUSSION 4.1 | Summary In this first exploratory study in a sample of 23 patients and 40 of their family members we found evidence for both (1) a microbial signature of KS diagnosis in the gut microbiota and (2) variation within the gut microbiota relating with syndrome symptoms.Alpha diversity was lower in KS patients compared to their family members and beta diversity of the patients clustered separately from family members.
scaled and extended to other NDDs including KS, characterizing shared as well as unique microbial alterations in NDDs.The ultimate goal is this research is to flag gastro-intestinal complaints in an early stage and design interventions targeted at the gut microbiota such as probiotics to improve developmental trajectories and quality of life of KS patients and their caregivers.
Table 1 for their descriptive statistics).All participants lived in the Netherlands and Belgium.As shown in Table 1, our sample includes both adults and children (both patients and family members).
. Specifically, in healthy controls Coprococcus abundance associated with both gastrointestinal pain experienced during a nutrient and lactulose challenge test and resting state activity in the left caudate nucleus.These relations were absent in irritable bowel syndrome patients.In this context, genus Coprococ-