Atypicalities in sleep and semantic consolidation in autism.

Abstract Sleep is known to support the neocortical consolidation of declarative memory, including the acquisition of new language. Autism spectrum disorder (ASD) is often characterized by both sleep and language learning difficulties, but few studies have explored a potential connection between the two. Here, 54 children with and without ASD (matched on age, nonverbal ability and vocabulary) were taught nine rare animal names (e.g., pipa). Memory was assessed via definitions, naming and speeded semantic decision tasks immediately after learning (pre‐sleep), the next day (post‐sleep, with a night of polysomnography between pre‐ and post‐sleep tests) and roughly 1 month later (follow‐up). Both groups showed comparable performance at pre‐test and similar levels of overnight change on all tasks; but at follow‐up children with ASD showed significantly greater forgetting of the unique features of the new animals (e.g., pipa is a flat frog). Children with ASD had significantly lower central non‐rapid eye movement (NREM) sigma power. Associations between spindle properties and overnight changes in speeded semantic decisions differed by group. For the TD group, spindle duration predicted overnight changes in responses to novel animals but not familiar animals, reinforcing a role for sleep in the stabilization of new semantic knowledge. For the ASD group, sigma power and spindle duration were associated with improvements in responses to novel and particularly familiar animals, perhaps reflecting more general sleep‐associated improvements in task performance. Plausibly, microstructural sleep atypicalities in children with ASD and differences in how information is prioritized for consolidation may lead to cumulative consolidation difficulties, compromising the quality of newly formed semantic representations in long‐term memory.

Studies of neurodevelopmental disorders have the potential, therefore, to offer valuable theoretical insight into individual differences in consolidation processes (see Smith & Henderson, 2016, for discussion of this in the context of dyslexia).
Importantly, dialogue between the hippocampus and neocortex is thought to be orchestrated by sleep spindles: distinct trains of sinusoidal EEG activity at 10-15 Hz, lasting approximately 0.5-3 s. Sleep spindles are thalamically generated during non-rapid eye movement (NREM) sleep, and are proposed to support consolidation via their temporal synchrony with hippocampal sharp-wave ripples and neocortical slow oscillations (Antony, Schönauer, Staresina, & Cairney, 2018; Diekelmann & Born, 2010; Genzel et al., 2017; Latchoumane, Ngo, Born, & Shin, 2017; Staresina et al., 2015). It has been hypothesized that spindle-orchestrated 'replaying' patterns of hippocampal and neocortical activity following learning are key to the 'whole-brain reorganization' required for cellular consolidation across distributed neocortical connections (i.e., systems consolidation; Genzel et al., 2017; see Runyan, Moore, & Dash, 2019, for a review). Sleep spindles have been shown to occur more frequently after learning, and have been associated with synaptic plasticity and improved retention (Muller et al., 2016; Rosanova & Ulrich, 2005). Within the domain of word learning, it has been demonstrated that overnight improvements in lexical stabilization and integration are associated with spindle characteristics measured via polysomnography in adults (Tamminen, Payne, Stickgold, Wamsley, & Gaskell, 2010; Weighall, Henderson, Barr, Cairney, & Gaskell, 2017) and children (Smith et al., 2018). Sleep spindle density has also been linked to the integration of new knowledge into a previously learned memory schema, and to increasing independence from the hippocampus during recall the subsequent day (Hennies, Ralph, Kempkes, Cousins, & Lewis, 2016).
While the role of sleep is well-established in phonological aspects of word learning (see James, Gaskell, Weighall & Henderson, 2017), there is less evidence relating to semantic aspects of vocabulary consolidation. Tham, Lindsay, and Gaskell (2015) provided evidence for a role for sleep in consolidating novel form-meaning mappings. Adult participants learnt Malay translations for nine English animal names and were later tested using a size judgement paradigm, after a period of either sleep or wake. Participants were presented with two English or Malay animal names written on screen (e.g., BEE-COW) and had to decide which animal was larger. Font size congruent trials (the name of the larger animal printed in the larger font) and incongruent trials (the name of the larger animal printed in the smaller font) were included, such that if meaning was automatically retrieved upon presentation of the written words, then response times (RTs) should be faster for congruent trials (Rubinsten & Henik, 2002). Two key findings emerged. First, evidencing semantic stabilization, overall task RTs were quicker in the sleep than the wake group for Malay, but not English trials, regardless of congruency. This pattern suggested that sleep led to more efficient semantic processing of the newly learned items. Additionally, the sleep group demonstrated a size congruency effect for the Malay trials (signalling automatic semantic retrieval, owing to semantic integration). However, this effect was weak and only evident in trials for which there was a larger size difference between animals. Nevertheless, larger congruency effects for these trials were associated with greater spindle density in the sleep group. The authors argued that this supports a systems consolidation account of declarative learning, with sleep playing an active role in the integration of novel semantic information into existing networks.
These findings resonate with a recent infant study, in which state-dependent changes in spindle density predicted generalization of novel object labels 1 day after learning (Friedrich, Mölle, Friederici, & Born, 2018). Furthermore, a developmental MEG study in children aged 8-12 years found that activity in inferior frontal gyri and medial prefrontal cortex was associated with recall of novel object associations (i.e., semantic learning) following sleep but not wake; whereas the wake group showed significantly greater hippocampal activation (Urbain et al., 2016). Thus, there is emerging evidence that sleep, particularly spindle activity, supports the consolidation of new semantic material in development, as well as in adulthood.
The above findings have potential implications for individuals with neurodevelopmental disorders characterized by atypical sleep.
ASD is a pervasive neurodevelopmental disorder, with prevalence between 1/34 and 1/76 (Baio et al., 2018). Sleep disorders are claimed to be present in up to 80% of ASD children, most often characterized by longer sleep onset latency and reduced sleep efficiency (Diaz-Roman, Shang, Delorme, Zhang, Delorme, Beggiato, & Cortese, 2018; Fletcher et al., 2017; Souders et al., 2009). However, few studies have utilized polysomnography to objectively explore sleep in children with ASD. There is some evidence to suggest that sleep spindles may differ in ASD (Gruber & Wise, 2016), with reduced N2 central spindle density (spindles per minute) in adults (Godbout, Bergeron, Limoges, Stip, & Mottron, 2000; Limoges, Mottron, Bolduc, Berthiaume, & Godbout, 2005). In a sample of 13 children with ASD, no differences in N2 central spindle density were observed compared with controls (Lambert et al., 2016; see also Maski et al., 2015), but there was significantly reduced central spindle duration and central sigma power in the same sample (i.e., power spectral density across the spindle frequency range, Tessier et al., 2015). It seems highly relevant, then, to explore the extent to which sleep supports memory consolidation in ASD.

Research Highlights
• Initial learning and overnight consolidation of the names and meanings of novel animals were comparable in children with autism and typical peers.
• A month after learning, children with autism were more likely to forget the unique features of the new animals than typical peers.
• Children with autism showed lower sigma power on the night after learning than typical peers.
• Associations between spindle parameters and overnight changes in semantic decision speed were specific to novel animals in the typical (but not the autism) group.

In one exception, Norbury, Griffiths, and Nation (2010) taught children with and without ASD (matched on receptive vocabulary) four novel object names and assessed memory (via definitions and naming tasks, to tap semantic and phonological knowledge respectively) immediately and a month later. Children with ASD showed poorer overall performance when asked to define the features of the novel objects than TD peers. Furthermore, whilst the typical peers showed further improvements in accuracy at the 1-month follow-up (+11%), the ASD group showed weaker feature recall (-5%). The ASD group outperformed TD peers at the immediate naming test, but this difference diminished at the 1-month follow-up because the TD peers (but not the ASD group) improved over time. Therefore, across tasks only the TD group showed results consistent with long-term consolidation. The enhanced initial phonological performance in the ASD group immediately after learning aligns with Henderson, Powell, Gaskell, and Norbury (2014), where children with ASD showed evidence of immediate lexical integration of novel phonological forms, which was not maintained 24 hr later. Interestingly, explicit measures of phonological recall and word form recognition identified intact overnight consolidation mechanisms in children with ASD relative to their TD peers.
Collectively then, previous data suggest that initial encoding of words may be spared or even enhanced in ASD, and overnight consolidation of word form information may be intact; however, difficulties may lie in longer term consolidation processes. Whilst it is plausible that such difficulties could be linked to atypical sleep architecture, this is yet to be established, particularly in relation to semantic aspects of word learning.
This study examined semantic aspects of rare word learning in school-aged children with and without ASD, matched on age, vocabulary and nonverbal ability. We utilized polysomnography to investigate associations between key sleep parameters and behavioural changes in memory. Participants learnt the names of previously unfamiliar animals over a series of explicit training trials (e.g., reading aloud; word-picture matching). Explicit memory was assessed via a naming task (to assess the accuracy and speed of phonological retrieval in response to the pictures) and a definitions task (to assess the depth of semantic knowledge). Furthermore, a size judgement task, based on Tham et al. (2015), assessed both semantic stabilization (speed of animal size judgement for novel and familiar animals) and semantic integration (size congruency for novel and familiar animals). The use of novel and familiar trials allowed us to examine whether children prioritized novel information for consolidation over already familiar information, similar to the adult findings from Tham et al. (2015). In addition to the familiar and novel trials used by Tham et al. (2015), we introduced 'mixed' trials, comprising one novel and one familiar animal. This formed a midway condition between familiar and novel trials to explore the way in which prior knowledge may scaffold semantic decision speed, whereby the difference between familiar and novel trials should be greater than the difference between familiar and mixed trials.
The following hypotheses were made: (a) The TD and ASD groups would demonstrate comparable performance immediately after learning when defining and naming the newly learned animals, but consolidation (particularly at a delayed follow-up) may be stronger in TD than ASD groups (Henderson et al., 2014;Norbury et al., 2010); (b) For the size judgement task, RTs would reduce overnight for trials including novel animals (relative to trials containing already familiar animals, for which no sleep-dependent consolidation would be required) and this consolidation benefit would be larger in TD than ASD, representing greater stabilization of novel semantic information in TD children. Further, if semantic integration occurred, then congruency effects (faster RTs for congruent than incongruent trials) would be evident after sleep in trials containing novel animals (particularly for trials with a large semantic distance, as in Tham et al., 2015); (c) children with ASD would show differences in sleep microstructure, including reduced NREM sleep duration, sigma power (i.e., power within the sleep spindle frequency range), spindle duration and/or spindle density; (d) Sleep spindle parameters would be associated with overnight changes in the semantic stabilization and integration of novel (but not familiar) animals.

| Participants
Children aged 8-12 years (n = 59), with and without autism, were recruited as part of the SleepSmart project at the University of York.
The research team carried out the recruitment and selection of participants.

| Inclusion-exclusion criteria
Children were invited to participate following an initial screening interview administered over the phone to ensure that they (a) were native monolingual English speakers, (b) had no diagnosis of epilepsy or genetic syndromes, (c) had normal or corrected-to-normal vision and hearing and (d) had no diagnosis of sleep-disordered breathing.
Twenty-five children were initially recruited for the ASD group.
We excluded any children with diagnoses of co-occurring conditions, which led to the exclusion of two children with ASD who had dyslexia. Due to the high verbal demands of the experimental tasks, three further children were excluded for scoring <75 on the British Picture Vocabulary Scale 3rd Edition (BPVS; Dunn, Dunn, & Styles, 2009). The remaining 20 children all met our inclusion criterion of either a formal diagnosis of autism (n = 14) or an ongoing formal diagnostic assessment (n = 6), which has an average duration for this age range of 3.5 years (Crane, Chester, Goddard, Henry, & Hill, 2016). In one large-scale study, 70% of children referred for an autism diagnosis went on to receive a diagnosis, and for children without any co-occurring conditions (as was the case for the present sample) this figure rose to 89% (Lo, Klopper, Barnes, & Williams, 2017). Importantly, all parents completed the Gilliam
An additional 34 children met inclusion criteria for the TD group: (a) not a sibling of a child with ASD, (b) GARS-AI < 55 (i.e., below cut-off for 'probable' parent-report autism profiles), (c) no diagnosed psychological disorder.

| Group characteristics
As shown in Table 1 Bishop, 2003) than their TD peers (all p < .001). The DSM-orientated scales were also applied to the CBCL to derive the percentage of children above the clinical cut-off for affective, anxiety and ADHD problems. See Table 1 for descriptive statistics and statistical tests of group differences. It

TABLE 1 columns: ASD (n = 20), TD (n = 34), t/χ², d/w

Three participants in the ASD group were reported to be taking melatonin at the time of study intake (tablet: 4 mg and 9 mg and liquid: 4 ml). Regarding educational setting, one child in the TD group and three children in the ASD group were home schooled. One child in the ASD group attended a school for children with social, emotional and behavioural difficulties (SEBD). The remaining 91% of children attended mainstream schools and attended classes with their typically developing peers.

| Stimuli
Nine mono- or bi-syllabic rare words with 3 to 4 letters were selected (asp, goby, pipa, mata, uda, saki, gir, topi, paso). These were names of extant species/breeds of familiar animals (e.g., gir is a breed of cow), were judged to be unfamiliar to children aged 8-12 years, and were characterized by at least one unique physical feature (e.g., a gir is a humped cow). They were allocated to a size category (small, medium or large) according to the rated size of their respective 'base' animal (e.g., cow) in existing norms (Paivio, 1975). Size categories were confirmed by data from 62 adults, with animals rated on a scale from 1 (smallest) to 9 (largest). One 3-letter and two 4-letter words were chosen for each size category. A photograph of each novel animal was selected from Google images. In each, the animal took up approximately three-quarters of the total photograph and all backgrounds were of a natural habitat.
Nine familiar animal names were also selected for use in the size judgement task (worm, slug, rat, duck, goat, pig, cow, lion, bear).
All were 3 or 4 letters in length with an Age of Acquisition (AoA) below 6 years (Kuperman, Stadthagen-Gonzalez, & Brysbaert, 2012). The familiar animals were also allocated to a size group based on Paivio (1975) size norms. One 3-letter and two 4-letter words were chosen for each size category, based on those identified to be the most familiar to children aged 8-12 years (see Table S1 for stimuli lists).

| Procedure
Participation In between pre- and post-sleep sessions, participants underwent overnight home polysomnography. In a preliminary meeting, participants completed a battery of cognitive assessments including BAS word definitions and matrices subscales and the BPVS.
Parents also completed the CSHQ, the CCC-2, the CBCL and the GARS-3.
Training and test sessions were delivered on DMDX (Forster & Forster, 2003) and the PVT task was administered using E-prime experimental software (Psychology Software Tools).

| Training
Participants were told: 'Today you are going to learn some new words. All of the words are names for different types of animals. Some of the animals might look a bit like animals you already know.' Participants were then asked if they had heard of any of the animals before. Each novel animal name was presented via headphones and participants gave a yes/no verbal response. Yes responses were probed by the experimenter ('Please describe a ____ to me?').
Training consisted of 12 exposures to each novel word. In the first two exposures, participants heard the animal name and were asked to repeat it, after which the associated picture was presented onscreen for 3,000 ms. In the following two exposures, participants saw the uppercase rare word onscreen and were asked to read the name out loud, after which the picture was presented for 3,000 ms.
Participants then completed a series of 2AFC trials with feedback.
In image-matching 2AFC trials, participants saw two images onscreen (one target and one distractor), to the left and right of the centre point. A novel written word was simultaneously presented centred underneath the images. Participants were asked to select, using a keypress, which image matched the word. Orthography-matching trials were similar but with two words and one picture.
The distractor was always another item from the stimulus set, with all items appearing an equal number of times throughout training.
There was no timeout for this task and feedback was provided in the form of the target, which remained on screen for 2,000 ms.
Participants completed four image-matching, and four orthography-matching trials, for each item in alternating blocks, with a different distractor for each exposure. Trial order was randomised within each block.
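The 2AFC constraints described above (eight exposures per item in alternating image-matching and orthography-matching blocks, a different distractor on each exposure, and every item serving as a distractor equally often) can be illustrated with a short sketch. This is not the authors' DMDX script: the function name, the one-trial-per-item block structure and the cyclic-offset method of choosing distractors are all assumptions made for illustration.

```python
import random

# Novel items as listed in the Stimuli section.
ITEMS = ["asp", "goby", "pipa", "mata", "uda", "saki", "gir", "topi", "paso"]


def build_2afc_blocks(items, n_blocks=8, seed=0):
    """Build alternating image/orthography blocks (hypothetical helper).

    Assumption: each block contains one trial per item. In block b the
    distractor for item i is the item b + 1 positions further along the
    list (cyclically), so every exposure of a target has a different
    distractor and every item is a distractor exactly once per block.
    """
    n = len(items)
    if not 0 < n_blocks < n:
        raise ValueError("need 0 < n_blocks < number of items")
    rng = random.Random(seed)
    blocks = []
    for b in range(n_blocks):
        trial_type = "image" if b % 2 == 0 else "orthography"
        offset = b + 1  # never a multiple of n, so target != distractor
        trials = [
            {
                "type": trial_type,
                "target": items[i],
                "distractor": items[(i + offset) % n],
            }
            for i in range(n)
        ]
        rng.shuffle(trials)  # trial order randomised within each block
        blocks.append(trials)
    return blocks


blocks = build_2afc_blocks(ITEMS)
```

With nine items the eight cyclic offsets use up every possible distractor for each target once, which is one simple way (among many) to satisfy the balancing described in the text.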

Size judgement task
This task consisted of three blocks: familiar, mixed and novel. Familiar trials involved two familiar animals (e.g., BEE-COW), mixed trials had one familiar and one novel animal (e.g., ASP-COW or COW-ASP) and novel trials contained only novel animals (e.g., GIR-ASP). Twelve word pairs were selected for each block, six with a large semantic distance (large vs. small animal), and six with a small semantic distance.
At the end of the follow-up session only (to avoid influencing performance on other tests), participants also completed a size-ordering task. The purpose of this task was to check that participants' perception of size aligned with the allocated small, medium and large categories. For the novel animals, participants were provided with nine cards, each with a picture of one novel animal. Participants were asked to order the animals from smallest (left) to largest (right).
This therefore assessed participants' perception of the size of the animals based solely on the trained image, as required for the size judgement task. For the familiar animals, participants were provided with the orthographic form (rather than an image) and again asked to order them from smallest (left) to largest (right). This part of the task therefore also served as a check for semantic knowledge of the familiar animals.

Definitions task
Each novel animal name was presented via headphones and participants were asked to describe the animal to the experimenter.
Any responses which made reference only to the base animal (e.g., 'a Gir is a cow') were probed with a standard response of 'can you tell me more about a ___?' Separate scores were allocated for correctly recalling the 'base animal' and the feature. There was no timeout for this task.

Naming speed
Each novel animal picture was shown on screen and participants were asked to name the animal as quickly as possible. Timeout was set to 5,000 ms. Responses were recorded from picture onset via DMDX (Forster & Forster, 2003) and scored using CheckVocal (Protopapas, 2007) software. Accuracies and RTs were double-scored and all discrepant accuracies and any RT differences >10 ms were checked and agreement was reached. One hundred per cent phonetic accuracy was required for each item to be scored as correct.

| Sleep recordings
Home polysomnography recordings were completed using an am-

| Psychomotor Vigilance Task
To capture between-group or between-session baseline differences in alertness, participants completed a bespoke 90-item psychomotor vigilance task (PVT) based on one developed by Basner and colleagues (Basner, Mollicone & Dinges, 2011). The task took approximately 4 min to complete. Participants were informed that a star would pop up on the screen intermittently and they were to click the mouse button as fast as they could. RT and frequency of lapses (RT > 500 ms) were recorded. ISIs ranged from 1,000 to 4,000 ms.
There were no practice trials for this task.
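The two PVT outcome measures described above (RT and the frequency of lapses, defined as RT > 500 ms) can be summarised per participant along the following lines. This is a minimal sketch; the function name and the choice of the mean as the RT summary are assumptions, as the text does not say how RT was aggregated.

```python
from statistics import mean

LAPSE_THRESHOLD_MS = 500  # lapse criterion stated in the text: RT > 500 ms


def summarise_pvt(rts_ms):
    """Return mean RT and lapse count for one participant's PVT trials.

    Hypothetical helper: `rts_ms` is a list of response times in ms.
    """
    if not rts_ms:
        raise ValueError("no response times supplied")
    n_lapses = sum(1 for rt in rts_ms if rt > LAPSE_THRESHOLD_MS)
    return {"mean_rt_ms": mean(rts_ms), "n_lapses": n_lapses}


# Made-up example values, not study data.
summary = summarise_pvt([300, 450, 500, 501, 800])
```

Note that a response of exactly 500 ms is not counted as a lapse under the strict inequality used here.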

| RESULTS
The following analysis presents a series of mixed effects regression models. For reference, unadjusted and untransformed participant-level descriptive statistics for all tasks are shown in Table 2. Binomial GLMMs were used for accuracy data with response (0/1) as the DV and linear mixed effects models were used for RT data, with log-transformed RT as the DV.
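Because the RT models use log-transformed RT as the DV, their coefficients are expressed on a log scale. The sketch below shows how such a coefficient can be read as a proportional change in RT. It assumes natural logarithms (the text does not state the base), the function names are hypothetical, and the exact interpretation of any given coefficient also depends on the contrast coding used in the model.

```python
import math


def log_coef_to_rt_ratio(b):
    """Convert a coefficient on the natural-log RT scale to an RT ratio."""
    return math.exp(b)


def log_coef_to_percent_change(b):
    """Express the same coefficient as a percentage change in RT."""
    return (math.exp(b) - 1.0) * 100.0


# Example: a log-scale coefficient of -0.25 corresponds to RTs multiplied
# by roughly 0.78, i.e. about 22% shorter, under the natural-log assumption.
ratio = log_coef_to_rt_ratio(-0.25)
change = log_coef_to_percent_change(-0.25)
```

The multiplicative reading is one reason log-transformed RTs are convenient: equal coefficients imply equal proportional speed-ups regardless of baseline RT.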

| Explicit memory (naming speed and definitions)
There was a significant overnight increase in naming accuracy, with item-level responses more than three times (Session:

TABLE 2 Unadjusted group means for the language learning tasks, reported as M (SD)

Semantic stabilization
Stabilization effects were explored first by fitting a model with the interaction terms for session, block and group. To recap, a significant session*type interaction indicates a pre-sleep to post-sleep RT change for novel/mixed trials that is distinct from familiar trials (i.e., overnight change controlling for practice effects). A session*type*group interaction indicates that these stabilization effects were different between groups. As previously stated, the role of age was explored in all models and was found to contribute significantly for this model, predicting overall RT; age was therefore retained as a fixed effect in the model. Significant session*type interactions (session*mixed: B = −0.16, t = 3.72, p < .001, Session*novel: B = −0.25, t = 7.90, p < .001) were identified and explored using emmeans. As shown in Figure 1 and Table 2, the pre-sleep to post-sleep decrease in RT was significantly greater for mixed (z = 6.68, p < .001) and novel (z = 13.37, p < .001) trials, relative to familiar trials (z = 3.00, p = .003). As such, post-sleep task performance was characterized by more efficient semantic processing for items containing novel animals; suggesting overnight stabilization of the novel semantic information. Crucially, given that the model contrasts compared mixed and novel trials to familiar trials, these consolidation effects are unlikely to be a consequence of repeat test (i.e., practice) or circadian effects.
That is, if practice or circadian confounds were responsible for these effects then they should also be influencing RTs to familiar trials. As with the definitions and naming tasks, this overnight consolidation was comparable between groups for novel (Session*novel*group: B = −0.03, t = 0.43, p = .67) and mixed (Session*mixed*group: B < 0.01, t = 0.006, p = .99) trials. There was also a significant group*type interaction for novel trials relative to familiar. As shown in Figure 1, the ASD group showed less of an RT benefit for familiar animals relative to novel animals (z = 6.93, p < .001), compared to the TD group (z = 11.74, p < .001), perhaps as a consequence of less efficient processing of familiar animals.
See Table S4 for full model output.

Semantic integration
Semantic integration effects were explored by assessing the roles of congruency and semantic distance. An overall semantic distance effect was observed (B = 0.11, t = 7.15, p < .001) with quicker RTs for trials with a large semantic distance than a small semantic distance; however, this effect was not consolidation-dependent, with no relationship with session or block type

| Month follow-up
Follow-up data were available for 14 ASD and 32 TD participants.
These subgroups were also matched on age (t = 1.41, p = .17), sex (χ² = 0.72, p = .39), receptive vocabulary (t = 1.68, p = .10), expressive vocabulary (t = 1.0, p = .32) and nonverbal ability (t = 1.04, p = .32). For the definitions task, the ASD group was comparable to TD children at recalling the base animals that were associated with the novel animals (e.g., asp is like a caterpillar; OR = 1.62, z = 1.22, p = .23), but they recalled significantly fewer unique features (e.g., asp is a hairy caterpillar; OR = 2.24, z = 2.13, p = .033; Figure 2). Given

FIGURE 1 Estimated marginal means (adjusting for age) for size congruency RT as a function of block type, session and group. Error bars represent standard errors

| Sleep characteristics
Polysomnography data were available for 83.6% (17 ASD, 28 TD) of participants (see Table 3). Missing data were due to: (a) child opt-out  Table 3 and Figure 3). Notably, this group difference in sigma power survives Bonferroni correction for multiple comparisons. Further exploration tentatively suggests that this group difference is driven primarily by power within the slow (10-12.5 Hz; t = 2.87, p = .006) but not fast (

| The relationship between spindle characteristics and semantic stabilization
Given the lack of evidence for overnight changes in semantic integration, only the role of sleep in semantic stabilization was explored.
As such, three sleep models were created, one for each sleep variable (sigma power, spindle duration, spindle density
To accompany each model, Table 4 presents the z ratios for pre-sleep to post-sleep, for each block type and group. These z ratios are comparable to the more traditional correlations between sleep and overnight change. A positive z ratio indicates that the sleep variable predicted an overnight reduction in task RT (i.e., task improvement and support for our hypothesis) and a negative z ratio indicates that the sleep variable predicted an overnight increase in task RT.
Task improvement is therefore synonymous with a positive z ratio.
FIGURE 3 Mean log-transformed sigma power for the TD and autism spectrum disorder groups. Error bars represent ±1 SE and points represent individual participants
TABLE 4 Z ratios for sleep characteristics predicting semantic judgement speed; pre-sleep compared to post-sleep
FIGURE 4 Spindle duration as a predictor of participant-level overnight change in task RT. Points represent individual participants and shaded area represents 95% confidence interval for participant-level regression line

For spindle duration, the highest order four-way interaction was significant, specifically for the novel:familiar type contrast (t = 2.53, p = .011). As shown in Table 4 and Figure 4, this was accounted for by a direct dissociation in the role of spindle duration; predicting task improvement in novel trials for the TD group, but in familiar and mixed trials for the ASD group. Spindle density showed a significant three-way interaction with session and group (t = 4.79, p < .001), as did sigma power (t = 5.21, p < .001). Namely, spindle density and sigma power predicted task improvement (collapsed across type) for the ASD group (density: t = 5.17, p < .001; power: t = 5.18, p < .001) but not the TD group (density: t = −0.61, p = .54; power: t = 1.25, p = .21). As shown in Table 4, this was characterized by these spindle properties predicting overnight semantic stabilization (i.e., task improvement) only for completely novel animal trials for the TD group. In fact, higher sigma power was associated with an overnight reduction in performance (i.e., slowing down) in familiar trials. In contrast, spindle density and sigma power predicted overnight task improvements more globally for children with ASD, working across familiar trials as well as trials containing novel animals. Notably, the associations were numerically strongest for the familiar words in the ASD group, where there was less to learn, at least semantically.
To recall, though, the overall slower task speed in the ASD group for familiar trials relative to novel and mixed trials (supported by the group*type interaction shown in Figure 1) perhaps offered more opportunity for sleep to play a role in enhancing task performance for familiar trials. It is also important to note that when controlling for multiple comparisons using Bonferroni correction, the only z ratios to remain significant were for sigma power and spindle density predicting overnight change for familiar trials in the ASD group, and for spindle duration predicting overnight change for novel trials in the TD group. This bolsters our interpretation that the TD group are biased towards consolidating novel information, in contrast to the ASD group.
To summarize, associations between spindle properties and overnight semantic stabilization differed by group. For the TD group, spindle duration predicted overnight changes in responses to novel animals, but not changes in responses to familiar animals. In contrast, for children with ASD, sigma power and duration had a more holistic association with improvements in response speed for all types of trial, but particularly when trials contained familiar animals.
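The Bonferroni check applied to the z ratios can be sketched as follows: each p value is compared against the nominal alpha divided by the number of comparisons. This is a generic illustration only. The labels and p values below are made up, and the number of comparisons actually corrected for in the analysis is not stated in the text.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which tests survive Bonferroni correction.

    `p_values` maps a (hypothetical) test label to its uncorrected p value.
    Each p value is compared with alpha / m, where m is the number of tests.
    """
    m = len(p_values)
    if m == 0:
        raise ValueError("no p values supplied")
    threshold = alpha / m
    return {label: p < threshold for label, p in p_values.items()}


# Made-up p values for illustration; not values from Table 4.
example = {"test_a": 0.001, "test_b": 0.02, "test_c": 0.04}
survivors = bonferroni_significant(example)
```

With three tests the corrected threshold is .05 / 3 ≈ .0167, so only the first of these illustrative tests would survive. Bonferroni is conservative when tests are correlated, which is worth bearing in mind when many related sleep parameters are examined together.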

| DISCUSSION
Sleep difficulties are commonly reported in childhood, particularly in neurodevelopmental disorders such as ASD. Despite this, little progress has been made in examining the impact of sleep on learning and development in these populations. In this endeavour, we examined the sleep-associated consolidation of novel vocabulary in relatively high-ability, verbally able school-aged children with ASD compared to TD peers matched on age, vocabulary and nonverbal ability. An assessment of sleep microstructure identified significantly lower NREM central sigma activity in children with ASD, relative to TD peers. There was also some evidence of significantly reduced time in NREM sleep (mainly driven by reduced slow wave sleep duration in children with ASD). Nevertheless, children with and without ASD showed striking similarity in the extent to which they consolidated novel semantic knowledge overnight.
More specifically, they showed equivalent performance when asked to define novel animals, name pictures of them, and make speeded semantic decisions about them immediately after learning, and both groups showed similar improvements after a single night of sleep.
Spindle parameters predicted overnight improvements in speeded semantic judgements. Importantly, however, the nature of this relationship differed between groups. For the TD group, spindle parameters were specifically associated with performance on trials containing novel animals. Conversely, the associations were more general in the ASD group and strongest for trials containing already familiar animals, reflecting sleep-associated improvements in task performance rather than specific stabilization of new semantic knowledge. One month later, there was clear evidence that children with ASD were less likely to retain the unique (and defining) features of the novel animals. It is plausible, therefore, that the impact of sleep atypicalities and/or a lack of prioritization towards sleep-dependent consolidation of new information in children with ASD may leave new semantic representations more vulnerable to the effects of long-term forgetting.

| Sleep characteristics in children with ASD
Mirroring numerous previous studies (e.g., Fletcher et al., 2017), parents of children with autism reported a higher rate of sleep problems than parents of typical peers. The CSHQ total scores were on average ~10 points higher in the autism group than in the TD group.
The current work also demonstrates the feasibility of administering objective home-based polysomnography in children with ASD.
Whilst recruitment bias for this type of study is highly likely (and the present data do not reflect children with ASD who have more severe sensory, language and cognitive issues, for example), only one child with ASD chose not to wear the equipment, a number far lower than anticipated. Consistent with previous findings (Lambert et al., 2016), our data demonstrate that a sample of children with ASD who have language abilities within the normal range nevertheless have almost half an hour per night less of NREM sleep, mainly as a consequence of reduced SWS duration. Childhood is typically characterized by SWS-rich sleep, with up to three times as long spent in SWS compared to adults. The reduction of this in children with ASD may therefore have important implications for correlates of SWS, including declarative memory consolidation. It is important to note, however, that the group difference in NREM sleep did not survive correction for multiple comparisons, so should be interpreted with caution.
Regarding spindle characteristics, in line with Maski et al. (2015) and Lambert et al. (2016), we did not find evidence for reduced central spindle density in children with ASD (although note that Lambert et al. observed reduced frontal spindle density). Since previous studies more consistently report reduced spindle density in adults with ASD (Godbout et al., 2000; Limoges et al., 2005), and given findings that spindle density peaks around adolescence and reduces over adulthood (Purcell et al., 2017), it is possible that these findings reflect atypical maturation of spindle density in ASD.
Despite being well matched with TD peers on the prevalence of spindles in sleep and the typical duration of a spindle, the children with ASD showed significantly lower central NREM sigma power (i.e., the average power spectral density (PSD) within the spindle range, 10-15 Hz), supporting recent findings from Tessier et al. (2015). Thus, sleep spindles were lower in amplitude for children with ASD, compared to the amplitude of equivalent spindles in TD children. Although traditionally examined less often than spindle density, sigma power is gaining support as a robust predictor of general cognitive ability (e.g., Hoedlmoser et al., 2014; Tessier et al., 2015) and may be key to memory consolidation in developmental populations, as evidenced from studies of vocabulary consolidation (Smith et al., 2018) and nonverbal declarative memory (Maski et al., 2015). Evidence from neurotypical adults suggests that 'spindle power' (i.e., the average power of individually detected spindles, as opposed to the power within the sigma (spindle) frequency band, as measured here) reflects the structural integrity of an extensive network of white matter tracts including the forceps minor, parts of the uncinate fascicle and the anterior corpus callosum, as well as subcortical regions (including tracts within and around the thalamus; Piantoni et al., 2013). Thus, spindles reflect both the dynamics of network connectivity at the synaptic level (Poe, Walsh, & Bjorness, 2010; Tononi & Cirelli, 2006) and the state-like network parameters that are governed by the structure of white matter tracts (Piantoni et al., 2013). Interestingly, reductions in white matter integrity, reflecting neocortical underconnectivity and local overconnectivity, are well documented from late childhood to adulthood in individuals with autism (e.g., Karahanoğlu et al., 2018).
Further, it has been hypothesized that such disruptions to long-range axonal projections in autism, crucial for the coordination of distributed neocortical activity, may impede cellular and systems consolidation (Runyan et al., 2019). Clearly, data are needed to fully characterize these differences to illuminate implications for learning and development.
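As a rough illustration of the sigma power definition given above (the average PSD within the 10-15 Hz spindle range), the following Python sketch estimates band power from a single-channel signal using a plain periodogram. It is not the analysis pipeline used in this study: the function names `band_power` and `synthetic_eeg`, the 100 Hz sampling rate and the sine-wave signals are all assumptions made for illustration, and a real analysis would typically operate on artifact-free NREM epochs with a Welch-type estimator.

```python
import cmath
import math


def band_power(signal, fs, f_lo=10.0, f_hi=15.0):
    """Average one-sided periodogram PSD (units^2 per Hz) within [f_lo, f_hi]."""
    n = len(signal)
    mean = sum(signal) / n
    x = [v - mean for v in signal]  # remove the DC offset
    psd_in_band = []
    for k in range(1, n // 2):  # positive frequencies, excluding DC and Nyquist
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            # Discrete Fourier coefficient at frequency bin k.
            coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n))
            # One-sided periodogram estimate: 2 * |X_k|^2 / (fs * n).
            psd_in_band.append(2.0 * abs(coeff) ** 2 / (fs * n))
    return sum(psd_in_band) / len(psd_in_band)


def synthetic_eeg(amplitude, fs=100, seconds=4):
    """Hypothetical signal: a 12 Hz sigma-band sine plus a 2 Hz slow component."""
    n = fs * seconds
    return [amplitude * math.sin(2 * math.pi * 12 * t / fs)
            + 5.0 * math.sin(2 * math.pi * 2 * t / fs)
            for t in range(n)]


fs = 100  # assumed sampling rate in Hz
higher_amplitude = band_power(synthetic_eeg(2.0, fs), fs)
lower_amplitude = band_power(synthetic_eeg(1.0, fs), fs)
print(higher_amplitude > lower_amplitude)
```

Because power scales with the square of amplitude, halving the amplitude of the 12 Hz component in this toy signal lowers the sigma-band estimate even though the slow 2 Hz component is unchanged. This illustrates how sigma power can differ between groups matched on spindle density and duration.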

| Semantic learning and consolidation
Children with ASD and typical peers showed similar performance on the explicit measures of novel animal knowledge (i.e., naming speed and definitions accuracy) immediately after training, consistent with previous research (Henderson et al., 2014; Norbury et al., 2010). Furthermore, similar improvements were observed for these measures the following morning in both groups, again similar to Henderson et al.'s (2014) findings of intact overnight consolidation of novel word form knowledge in ASD. Thus, it appears that when learning via direct explicit instruction, children with ASD are akin to TD peers at encoding novel semantic information and consolidating these new memory traces overnight. Significant improvements in task performance, rather than maintenance, could signal an active role for sleep in supporting the consolidation of semantic knowledge in childhood, with sleep working to stabilize and integrate novel memories into existing semantic networks (Urbain et al., 2016). It is important to note, however, that the testing phase incorporated additional presentations of the novel animals which could have contributed to these offline improvements. For example, additional presentations could provide feedback for novel animals not accurately remembered in the initial tests (e.g., Krishnan, Sellars, Wood, Bishop, & Watkins, 2018). Nonetheless, overnight improvements in performance in similar word learning paradigms have been reported in the absence of repeat testing (e.g., Henderson, Weighall, & Gaskell, 2013). Furthermore, in studies where children are trained in the morning or evening and retested immediately, 12 hours and 24 hours later, improvements in recall are only observed at the 12 hour test for children trained in the evening (Henderson, Weighall, Brown, & Gaskell, 2012), implying that repeat testing cannot be solely responsible for overnight gains in performance.
There was, however, clear evidence that simple practice effects did not account for overnight changes in semantic stabilization.
Namely, both groups of children showed a significantly greater overnight reduction in overall semantic judgement RT for novel animals than for familiar animals. This is consistent with a consolidation effect that is specific to novel memory traces, as opposed to general practice effects on the task (similar to Tham et al., 2015). It should be noted, however, that children with ASD showed less of a differential consolidation benefit for novel versus familiar trials, with more of a tendency to also show a slight overnight improvement for familiar trials (Figure 1). This provides an initial suggestion that consolidation may be less strongly prioritized towards novel information in children with ASD (discussed further below, see Section 4.3).
The semantic judgement task was also included to capture semantic integration (i.e., as indexed by a size congruency effect). Counter to Tham et al. (2015), no clear post-sleep congruency effects were observed for novel trials for typical peers or children with ASD. It is possible that the acquisition of novel semantic information may involve a more prolonged consolidation process in childhood in order to elicit congruency effects, which rely on automatic access to meaning upon written presentation of the word. Alternatively, the absence of these effects in children may more simply reflect increased variability in RTs relative to adults, rendering the congruency effect a less reliable marker of automatic semantic access in child populations.
Consistent with this, there was only very weak evidence of a congruency effect even for familiar trials with a large semantic distance (e.g., COW-BEE), which was confined to the typical peers.
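The two RT-based indices discussed above (overnight change by trial type and the congruency effect) amount to simple difference scores, sketched below in Python. The function names, the sign conventions and all RT values are hypothetical illustrations, not the scoring code or data from this study.

```python
from statistics import mean


def overnight_change(pre_rts, post_rts):
    """Mean pre-sleep RT minus mean post-sleep RT (ms); positive = faster after sleep."""
    return mean(pre_rts) - mean(post_rts)


def congruency_effect(congruent_rts, incongruent_rts):
    """Mean incongruent RT minus mean congruent RT (ms); positive = congruent faster."""
    return mean(incongruent_rts) - mean(congruent_rts)


# Hypothetical RTs in ms for one child (not data from this study).
novel_gain = overnight_change(pre_rts=[1400, 1500, 1600], post_rts=[1200, 1300, 1400])
familiar_gain = overnight_change(pre_rts=[1000, 1100, 1200], post_rts=[980, 1050, 1150])

# A novelty-specific benefit is the extra overnight speed-up on novel trials.
print("novel minus familiar gain (ms):", novel_gain - familiar_gain)
print("congruency effect (ms):",
      congruency_effect(congruent_rts=[900, 950], incongruent_rts=[940, 1010]))
```

Under these assumed conventions, a consolidation effect specific to novel material appears as a larger overnight gain for novel than for familiar trials, whereas a general practice effect would produce similar gains for both.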
Importantly, despite showing similar initial performance across all measures and evidence of overnight consolidation, children with ASD showed significantly greater rates of forgetting for the features of the novel animals roughly 1 month after training. This pattern of increased memory loss is strikingly similar to that of Norbury et al. (2010), where despite comparable performance on a definitions task shortly after learning, children with ASD recalled fewer semantic features of novel objects 1 month later in contrast to typical peers. This supports the notion of a prolonged consolidation process, whereby semantic information is gradually consolidated over a long period of time (McClelland et al., 1995). The fact that groups did not differ immediately after training or the next day suggests that the increased forgetting 1 month later cannot be a consequence of the pragmatic demands of this task (i.e., conversational strategy, prioritising relevance etc.). Instead, these data imply a more rapid decay of the integrity of semantic representations over time in ASD. This is consistent with previous reports that, in contrast to intact item memory, the 'wheres' and 'whens' of episodic memories are atypical in ASD, with generally poorer recall and reduced hippocampal connectivity during recall for such associations (Cooper et al., 2017; Cooper & Simons, 2018).

| A role for sleep in semantic stabilization?
Spindles have been targeted as key to consolidation (e.g., Antony et al., 2018), and one previous study, to our knowledge, reports an association between overnight improvements in novel phonological knowledge and NREM spindle parameters in school-aged children (Smith et al., 2018). Such data lend a developmental perspective to the predictions of the Complementary Learning Systems account of word learning (Davis & Gaskell, 2009), which proposes that this process engages two neural systems: the hippocampal system required for the rapid acquisition of a new word, and a slower learning neocortical system that enables strengthening of explicit knowledge as well as integration with existing vocabulary knowledge (Davis & Gaskell, 2009). The present data add to this evidence, showing that spindle parameters captured on the night after learning are also associated with overnight changes in the stabilization of novel semantic information. Specifically, we observed significant associations between sigma power and spindle duration and overnight change in semantic decision speed to novel animals, relative to familiar trials, in the typical children. The fact that these associations were specific to novel trials is crucial: This suggests that sleep is specifically targeting new memory traces, as opposed to general aspects of task performance.
Strikingly though, this same specificity of consolidation towards

| Conclusions and implications
The current data add to an important body of evidence suggesting that sleep plays a role in language development, and that atypicalities of sleep may partly account for variability in language learning in neurodevelopmental disorders. Whilst the reasons for difficulties in the initiation of sleep are relatively well understood in ASD, the causal underpinnings of microstructural sleep atypicalities remain largely understudied. With clear evidence here that sigma power (and to a lesser extent NREM duration) are atypical even in a sample of children with ASD without co-occurring language learning difficulties, causal factors for such profiles need to be explored in future research. Here, we have demonstrated that sleep spindles work to stabilize novel semantic memory traces in school-aged children, with spindle characteristics specifically associated with overnight changes in novel (vs. familiar) material. In contrast, children with ASD showed reduced sigma power, more general associations between spindle characteristics and overnight changes in memory that were not prioritized towards the novel semantic information, and they showed greater forgetting of novel semantic features over the longer term. Thus, the behavioural consequences of reduced sigma power and/or general (vs. novel-specific) consolidation processes may be most apparent after many iterations of the process (e.g., 1 month later), as opposed to just one (i.e., the following day).
Of course, the present findings apply only to one particular kind of semantic learning (i.e., the learning of rare but real animals) and only to a fraction of the autism population (i.e., highly verbal individuals without intellectual impairment). Future research should aim to assess the generalizability of these findings across the spectrum and to the learning of other material. For instance, studies could address whether long-term consolidation differs according to whether material is associated with a special interest. Notwithstanding these limitations, these data open up numerous theoretical and pedagogical questions, including how we might optimize consolidation in the autism population. For instance, repeated learning opportunities may be particularly beneficial for children with ASD, as may modifying training regimes to encourage prioritisation of the novel to-be-learned information.

ACKNOWLEDGEMENTS
This research was conducted at the University of York and supported by an Economic and Social Research Council (ESRC) research grant awarded to LMH, MGG and CN (grant number: ES/N009924/1). We wish to thank the families who generously gave their time to participate in this research.

CONFLICT OF INTEREST
No conflicts of interest exist for any of the authors.

DATA AVAILABILITY STATEMENT
The datasets generated during and/or analysed during the current study are available on the OSF: https://osf.io/bd9qy/?view_only=2e357aa59284476bb01860e94c15247f.