Shared and unique features of mammalian sleep spindles – insights from new and old animal models

Sleep spindles are phasic events observed in mammalian non‐rapid eye movement sleep. They are relevant today in the study of memory consolidation, sleep quality, mental health and ageing. We argue that our advanced understanding of their mechanisms has not exhausted the utility and need for animal model work. This is both because some topics, like cognitive ageing, have not yet been addressed sufficiently in comparative efforts and because the evolutionary history of this oscillation is still poorly understood. Comparisons across species often are either limited to referencing the classical cat and rodent models, or are over‐inclusive, uncritically including reports of sleep spindles in rarely studied animals. In this review, we discuss the emergence of new (dog and sheep) models for sleep spindles and compare the strengths and shortcomings of new and old models based on the three validation criteria for animal models – face, predictive, and construct validity. We conclude that an emphasis on cognitive ageing might dictate the future of comparative sleep spindle studies, a development that is already becoming visible in studies on dogs. Moreover, reconstructing the evolutionary history of sleep spindles will require more stringent criteria for their identification, across more species. In particular, a stronger emphasis on construct and predictive validity can help verify if spindle‐like events in other species are actual sleep spindles. Work in accordance with such stricter validation suggests that sleep spindles display more universally shared features, like defining frequency, than previously thought.


I. INTRODUCTION
There is no shortage of interest in the evolution of the sleeping brain. Four chapters are dedicated to the evolution of sleep in the 5th edition of Principles and practice of sleep medicine (Kryger, Roth & Dement, 2011) and six chapters in the Encyclopedia of sleep and dreams (Barrett & McNamara, 2012). In their chapter on 'Mammalian Sleep', Zepelin, Siegel & Tobler (2005) open with a discussion of the sleep spindle, which was among the earliest patterns of electroencephalogram (EEG) activity observed and described in sleep research (Berger, 1933; Loomis, Harvey & Hobart, 1935). A long list of mammalian species is compared by the authors with regard to the characteristic frequency (waves/s) of sleep spindles, which appear as short (maximum 5 or 6 s long) trains of monomorphic and symmetric waves on surface EEG recordings (Rechtschaffen & Kales, 1968; Dutertre, 1977). Their list, unfortunately, is misleading. In many species, such as the echidna, sloth, rabbit, baboon and chimpanzee, our knowledge of sleep spindles is based upon a few reports of spindle-like oscillations (Petersen, Di Perri & Himwich, 1964; Bert, Kripke & Rhodes, 1970a; Bert et al., 1970b; de Moura Filho, Huggins & Lines, 1983; Nicol et al., 2000; Voirin et al., 2014). These must be taken with caution due to the existence of pseudo-spindles (Gottesmann, 1996) and other deceptively similar patterns such as alpha band activity [as cautioned for example by de Gennaro & Ferrara (2003); see also Section I.2]. In stark contrast with this over-inclusive approach when sleep is discussed from an evolutionary perspective, the human literature on sleep spindles rarely makes references to comparative work other than the well-known cat and rodent models [see e.g. Genzel et al. (2014), Fernandez & Lüthi (2020) and de Gennaro & Ferrara (2003)]. One notable exception is an older review by Jankel & Niedermeyer (1985) which briefly mentions dogs and includes now rarely discussed investigations in several primate species.
(1) What exactly are sleep spindles, and why are they important?
Research on sleep spindles intersects with various topics of clinical relevance in humans, including learning and memory, development, ageing, psychiatric conditions, and hormonal modulation of sleep and brain functions (see Table 1). While most of this work is correlational, there have been notable efforts to understand how the putative functions of sleep spindles could derive from the underlying mechanisms (Siapas & Wilson, 1998; Rosanova & Ulrich, 2005; Latchoumane et al., 2017). Taken together, the wide range of associations sleep spindles display with other functions and brain states, as well as the known dynamics supporting the observed correlations, warrant the attention that these EEG transients have received in the human literature.
To further our understanding of the sleep spindle in an evolutionary, cross-species comparative context, it is crucial to establish criteria that allow spindle-like bursts to be defined as actual spindles. With regard to this, the literature on validating animal models is of particular interest. The utility of a model depends on the assumption of similarities to humans that could imply an underlying homology. Across different authors and lines of research, three criteria for the validation of animal models are a recurring theme (Overall, 2000; Coenen & van Luijtelaar, 2003; van der Staay, 2006; van der Staay, Arndt & Nordquist, 2009; Nestler & Hyman, 2010; Vervliet & Raes, 2013; Topál, Román & Turcsán, 2019): face, predictive and construct validity. These criteria can be traced back to Willner (1984), but have undergone several reinterpretations since then (Belzung & Lemoine, 2011). A fourth type, known as external validity (van der Staay et al., 2009; Belzung & Lemoine, 2011), will also be considered here. External validity refers to how well findings in the model generalize across experimental contexts and species. It can be viewed as an additional criterion (van der Staay et al., 2009) which covers observations outside the strictest, classical definition using the three criteria (Willner, 1984) or as an overarching construct to which the three criteria and other observations contribute (Belzung & Lemoine, 2011).
(1) Face validity concerns the outward, phenomenological similarity between a model and the modelled entity. Here we consider the defining features of spindles used in visual and automatic detection, i.e. their intrinsic frequency, duration and the topography of frequency variation (presence of fast and slow subtypes), as evidence for face validity.
(2) Predictive validity is, in the narrow context of disease modelling, only satisfied by similarities in the outcome of treatments; in its first formulation only by pharmacological interventions (Willner, 1984). For this reason, it is also referred to by some authors as pharmacological validity (Nestler & Hyman, 2010). van der Staay et al. (2009) and Geyer & Markou (1995) define predictive validity more broadly, in line with the psychometric definition, to include all relationships of the type 'A predicts B'. We apply this broader definition here because sleep spindles predict, and are predicted by, a wide range of variables. In our opinion, this 'network' of predictive relationships constitutes a stronger argument for analogies among species than any morphological similarities in the EEG signal. However, because a broader definition might face opposition, we point out that some of the evidence we discuss in favour of predictive validity might fall under the external validity category encompassing all analogies among species and/or experimental contexts (van der Staay et al., 2009; Belzung & Lemoine, 2011). In Section II.6, we will discuss specific analogies among species as evidence for external validity, but otherwise, we operate only within the classical three-validities framework. Specifically, correlative relationships between sleep spindle features (density, amplitude, frequency) and other variables, such as learning performance and changes induced by sex hormones and age (see also Table 1), will be considered in addition to classical criteria of predictive validity (Willner, 1984), such as changes in sleep spindle features caused by pharmacological interventions.
(3) Construct validity concerns the similarity in underlying mechanisms between a model and the modelled entity. In the context of disease modelling, this usually concerns a similar underlying pathology, for instance, a mutation or lesion, but it is also used broadly to denote any shared mechanisms in non-pathological contexts (van der Staay et al., 2009). For sleep spindles, construct validity will be deemed satisfied if dependence upon the thalamus, and more specifically the reticular thalamic nucleus (RTN), can be demonstrated. Other arguments for involvement of the same mechanisms derive indirectly from strict predictive validity (pharmacological responses) and modulation by sex hormones.
In Section II, we review the feline, rodent, canine, sheep, and macaque models in comparative sleep spindle research using the three validities defined above as a point of reference. Importantly, we use these models to draw conclusions about the evolution of sleep spindles and their universal versus species-specific features. On a more practical note, we also reflect on the likely future avenues for comparative sleep spindle work (see Section II.6) for which a discussion of ecological validity will be offered.
(2) Mechanisms of real and pseudo-spindles
The RTN is a web of GABAergic cells projecting onto thalamocortical relays. In wakefulness, these projections are 'inactive' as by default the cells of the RTN inhibit each other. During NREM sleep, increased inhibitory input to the RTN activates low-threshold calcium channels (also found in the thalamus) (Bazhenov et al., 1999), causing a release of GABA onto thalamic targets (Scheibel & Scheibel, 1966). As a result, the thalamic relay cells respond less to peripheral sensory input (Steriade et al., 1969, 1971; Glenn & Steriade, 1982; Steriade & Timofeev, 2001), but send their own rebound signals to the cortex, as the hyperpolarization activates their low-threshold calcium channels (Destexhe & Sejnowski, 2002). According to Steriade & Llinás (1988), this sequence of hyperpolarizations and rebound bursts underlies the 7-14 Hz rhythmicity that characterises feline spindles. Importantly, 7-14 Hz synchronization in the cat begins in the RTN itself and is sufficiently explained by the intrinsic properties of this structure (Bazhenov et al., 1999), whereas a cortical origin has been described for spike-wave discharges (SWDs) (Meeren et al., 2009) and alpha waves (Bollimunta et al., 2011). However, thalamic suppression by the RTN is part of all three of these EEG signatures (Steriade & Timofeev, 2001; Roux & Uhlhaas, 2014; Wimmer et al., 2015; Chen et al., 2016) and might explain reduced consciousness in sleep and absence epilepsy, as well as attentional filtering associated with alpha waves. Still, crucial differences exist between alpha waves, SWDs and spindles regarding underlying mechanisms at the thalamic level.
Increased RTN firing is phase-locked with different parts of the alpha wave and spindle cycles (Chen et al., 2016), while the generation of SWDs is more localized within the RTN-thalamic network and is responsible for bilateral expression (thalamo-cortical spindles are exclusively ipsilateral) (Meeren et al., 2009). In rats, SWDs and sleep spindles share a similar overlap in defining frequency, as alpha waves and spindles do in other mammals (Coenen & van Luijtelaar, 2003). Failing to adjust for this fact during automatic detection (e.g. see discussion in van Hese et al., 2009) can turn SWDs into a source of pseudo-spindles in rodent studies.
A pseudo-spindle of entirely different origin was observed by Terrier & Gottesmann (1978) to occur at points of transition between NREM and REM sleep. Gottesmann (1996) deduced that it originates in the hippocampus and described it as being of larger amplitude and with sharper waves than other spindles. Importantly, Gottesmann (1996) distinguished this transitional (hippocampal) spindle, characterised by a relatively low frequency, from the actual slow spindle (Terrier & Gottesmann, 1978; Gottesmann, 1996). Recent propositions that slow spindles are not thalamic (Timofeev & Chauvette, 2013) might not take into account that pseudo-spindles are more similar to actual slow spindles than to fast spindles in terms of frequency and that therefore the hippocampal spindle presents a risk for the correct identification and detection of actual slow spindles.

II. RESEARCH ON ANIMAL MODELS
(1) The cat
Early on, cats became the predominant model in sleep research (Pampiglione, 1971) and were widely used to study various topics connected to sleep physiology from the beginning of the 20th century. Jouvet (1979) demonstrated behaviour suggestive of dreaming in the cat, when he relieved the muscle tone inhibition characterizing the REM sleep phase, leading to the animals acting out various behaviours without waking up. Meanwhile, large-scale synchronization in the feline cortex was confirmed to underlie the slow rhythmic activity predominating during NREM sleep (Amzica & Steriade, 1995).
The study of sleep spindles and that of thalamic function are strongly intertwined in the cat (Steriade & Llinás, 1988). Our knowledge of the mechanisms behind feline spindling seems so complete that it might be more appropriate to say that, as far as construct validity is concerned, humans are a model for cats and not vice versa. Extensive work has shown that irregular fast bursts in the RTN underlie spindle rhythms in the feline thalamus (Steriade et al., 1985, 1987), and that these bursts are associated with lower thalamic responsivity (Steriade et al., 1969; Glenn & Steriade, 1982; Steriade & Timofeev, 2001), but also that this inhibition causes rebound activation in thalamic relay cells (Bazhenov et al., 1999) contributing to cortical spindles (Steriade & Timofeev, 2001).
Because work in cats was strongly focused on uncovering mechanisms, little evidence was produced for the face or predictive validity of the cat model. Only the narrow, pharmacological definition of predictive validity is satisfied by the observation that in cats (Ganes & Andersen, 1975; Ganes, 1976) and humans (Soldatos et al., 1977) barbiturates enhance sleep spindles. However, no evidence supporting the broader definition of predictive validity or in favour of external validity is available. In other words, we do not know how ageing or learning modifies the feline spindle.
In conclusion, cats as model animals in sleep spindle research are characterised by (i) strong construct validity, (ii) predictive validity in accordance with the strict, pharmacological definition (Nestler & Hyman, 2010), and (iii) weak face validity which, however, is not critically challenged. A precise overlap between the spindling frequency of different mammals is generally neither assumed nor reported (Zepelin et al., 2005). Based on visual inspection, the spindling frequency is believed to diverge even more dramatically from that of humans in three species of sloth (Bradypus tridactylus, Bradypus variegatus and Bradypus pygmaeus) and the echidna (Tachyglossus aculeatus) (6-7/8 Hz) (de Moura Filho et al., 1983; Nicol et al., 2000; Voirin et al., 2014). Even oscillations that show high face, predictive, and construct validity among species, for example SWDs, can be characterised by strikingly different intrinsic frequencies (Coenen & van Luijtelaar, 2003). Therefore, such differences do not necessarily constitute a decisive argument against inter-species comparability. That cat and human spindles reflect the same process is supported by studies that suggest similar underlying mechanisms in rats (Meeren et al., 2009) and humans (Schabus et al., 2007) (see Section I.2). A potentially more concerning weakness of the cat model's face validity is the absence of the two spindle types (slow and fast) that are characteristic of humans (Gibbs & Gibbs, 1950), rats (Terrier & Gottesmann, 1978), and dogs (Iotchev et al., 2019), but which also appear to be absent in sheep (Schneider et al., 2020) and mice (Kim et al., 2015).
(2) The rat and mouse
Rats and mice are the most studied mammals in neuroscience (Ellenbroek & Youn, 2016). Here, we mainly focus on the rat, but also note differences with the mouse model. Much of what we know about rat sleep spindles comes from research primarily aiming to understand SWDs, the EEG correlates of absence epileptic seizures (Kandel & Buzsáki, 1997; van Luijtelaar, 1997; van Luijtelaar & Bikbaev, 2007; Meeren et al., 2009; Sitnikova, 2010; Sitnikova et al., 2014a,b). Mutual interest of researchers focussing on absence epilepsy and those studying sleep spindles stems partly from a now-outdated theory that SWDs constitute pathological sleep spindles (Kostopoulos, 2000). Although evidence against this interpretation has accumulated (van Luijtelaar, 1997; Meeren et al., 2009), both oscillations are characterised by reduced thalamic transmission due to RTN-mediated inhibition (Steriade & Timofeev, 2001); thus, there is overlap in the underlying mechanisms at the thalamic level. This overlap has been argued to imply that reduced consciousness in sleep and absence epilepsy share the same origin, even though sleep spindles and SWDs do not (Meeren et al., 2004).
The face validity of the rat model is better than that of the cat. Frequency ranges used to detect rat spindles [12-15 Hz in Eschenko et al. (2006), 10-15 Hz in Sitnikova et al. (2012)] show a greater overlap with those used in human studies, and a distinction between frontal-slow and posterior-fast spindles was demonstrated in rats (Terrier & Gottesmann, 1978), but not in cats (see discussion by van Luijtelaar, 1997), sheep (Schneider et al., 2020) or mice (Kim et al., 2015).
Biological Reviews 96 (2021)
Rats also display high predictive validity, both in the narrow, pharmacological sense (van Luijtelaar, 1997) and more broadly with regard to relationships between spindles and other variables (Siapas & Wilson, 1998; Eschenko et al., 2006; Mölle et al., 2009). Specifically, sleep spindle occurrence is enhanced with benzodiazepines and barbiturates in both rats (van Luijtelaar, 1997) and humans (Soldatos et al., 1977; Hirshkowitz, Thornby & Karacan, 1982), increases with exposure to novel learned information (Eschenko et al., 2006; Mölle et al., 2009), and is time-locked to the occurrence of hippocampal ripples (high frequency bursts associated with sequential memory replay), as in humans (Siapas & Wilson, 1998; Clemens et al., 2011). In early development, both humans and rats display an increase in sleep spindle density (van Luijtelaar & Bikbaev, 2007; Bódizs et al., 2014; Hahn et al., 2018), but it is not clear how comparable these changes are. In rats, sleep spindles are primarily distinguished as anterior and occipital, but in humans they are delineated as fast versus slow. In both species, spindles with faster intrinsic frequency (≥13 Hz) dominate in the central and posterior cortices, but while in humans fast spindle density increases from childhood to adolescence (Bódizs et al., 2014; Hahn et al., 2018), in rats it is the anterior, and thus slow, spindles (Terrier & Gottesmann, 1978) that increase in density (occurrence/min) until the animals reach 6 months (age of social maturity according to Sengupta, 2013). Effects of ageing have only been studied in epileptic rats (Sitnikova et al., 2014a) up to 9 months of age (reproductive senescence in female rats occurs between 15 and 20 months of age; Sengupta, 2013), and thus there is no external validity for age-related changes in the rat.
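The two descriptors used above, density (occurrence/min) and the fast versus slow split at 13 Hz, can be made concrete with a minimal sketch. The `Spindle` container, the function names and the example events are hypothetical and serve only as illustration; the 13 Hz boundary follows the ≥13 Hz criterion mentioned in the text, and individual studies may apply other cut-offs.

```python
from dataclasses import dataclass
from typing import Dict, List

FAST_CUTOFF_HZ = 13.0  # fast spindles: intrinsic frequency >= 13 Hz (criterion from the text)


@dataclass
class Spindle:
    """One detected event (hypothetical container, for illustration only)."""
    onset_s: float       # onset time in seconds from the start of the recording
    frequency_hz: float  # intrinsic (dominant) frequency of the event


def split_by_subtype(events: List[Spindle]) -> Dict[str, List[Spindle]]:
    """Assign each event to the 'fast' or 'slow' subtype by intrinsic frequency."""
    groups: Dict[str, List[Spindle]] = {"slow": [], "fast": []}
    for ev in events:
        key = "fast" if ev.frequency_hz >= FAST_CUTOFF_HZ else "slow"
        groups[key].append(ev)
    return groups


def density_per_min(events: List[Spindle], nrem_minutes: float) -> float:
    """Spindle density: number of events per minute of NREM sleep."""
    if nrem_minutes <= 0:
        raise ValueError("nrem_minutes must be positive")
    return len(events) / nrem_minutes


# Made-up example: five events detected in 2 minutes of NREM sleep.
events = [
    Spindle(3.2, 11.5),
    Spindle(20.1, 13.4),
    Spindle(47.8, 12.9),
    Spindle(75.0, 14.2),
    Spindle(110.6, 13.0),
]
groups = split_by_subtype(events)
slow_density = density_per_min(groups["slow"], nrem_minutes=2.0)
fast_density = density_per_min(groups["fast"], nrem_minutes=2.0)
print(f"slow: {slow_density:.1f}/min, fast: {fast_density:.1f}/min")
```

Keeping the cut-off in a single named constant makes it easy to check how sensitive a developmental comparison is to where the fast versus slow boundary is drawn.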
Another important difference with humans is that in rats learning prior to sleep does not synchronize sleep spindles and ripples with the up-states of slow oscillations (episodes of increased synchronous firing of cortical cells, part of ongoing <1 Hz NREM activity; see also Fig. 1), but in humans and mice, this is the case (Mölle et al., 2009; Latchoumane et al., 2017). Moreover, in rats the oestrous cycle does not appear to affect spindle expression (Schwierin, Borbély & Tobler, 1998) as the menstrual cycle does in humans (Driver et al., 1996; Baker et al., 2007; de Zambotti et al., 2015). Generally, sex differences in the expression of rat sleep spindles have been studied only indirectly, mostly in the context of how sleep in male rats was affected by female hormones (Manber & Armitage, 1999). These studies provided indirect arguments for an enhancing effect of female steroids on EEG activity in frequencies ≥13 Hz during NREM sleep (including, but not limited to, fast spindles). Importantly, some of these differences with humans also carry implications for predictive and construct validity, implying that the otherwise similar core mechanisms (RTN and thalamo-cortical relays) are embedded in different interactions with other systems (like the oestrous cycle and slow oscillations).
In conclusion, the rat model satisfies all three broad validities and predictive validity is satisfied according to the narrow definition as well. The rat provides important arguments for the comparability of human and cat sleep spindles because rat spindles depend on RTN inhibition of thalamic relay cells, as in cats (Meeren et al., 2009), while also displaying a wider range of associations that satisfy face, predictive, and external validity (Terrier & Gottesmann, 1978; van Luijtelaar, 1997; Siapas & Wilson, 1998; Eschenko et al., 2006; Mölle et al., 2009) in comparison with humans. This is crucial, as there is only indirect evidence for thalamic involvement in humans (Schabus et al., 2007) and the ability to isolate the relevant mechanisms directly is limited. Work unique to the rat model has helped elucidate how cortical dynamics contribute to the expression of spindle properties as they appear on surface-recorded EEGs (Kandel & Buzsáki, 1997; Rosanova & Ulrich, 2005; Sitnikova, 2010; Peyrache, Battaglia & Destexhe, 2011). Rebound bursts of thalamic relay cells during spindling both interrupt communication between the frontal cortex and hippocampus (Peyrache et al., 2011) and induce long-term potentiation in cortical synapses (Rosanova & Ulrich, 2005). Computational modelling based on rat studies also predicts that spindle amplitudes increase with inhibitory activity from within the cortex (Sitnikova, 2010). Complementary experiments in mice have helped discern differences in the mechanisms underlying spindles and alpha waves (Chen et al., 2016) and also demonstrated a causal relationship underlying spindle-learning correlations (Latchoumane et al., 2017), indicating that these involve thalamic spindles. On the other hand, the absence of fast versus slow spindle subtypes suggests the face validity of the mouse model is weaker in comparison with rats (Kim et al., 2015).
The high face, predictive, and construct validity of the rat model suggests that these observations might be generalized to other species expressing sleep spindles, but as a note of caution, the external validity of the rat model is challenged by some findings (Schwierin et al., 1998; van Luijtelaar & Bikbaev, 2007; Mölle et al., 2009). These were discussed above in more detail and may reflect species differences in development and fertility cycles. Mice display a weaker face validity in comparison (Kim et al., 2015), but otherwise show similar strengths to the rat model. The advance of optogenetics, combined with a more complete understanding of the mouse genome, might indeed lead to mice taking a greater role in sleep spindle studies, as observed in other lines of research (Ellenbroek & Youn, 2016). A description of ageing effects in older, non-epileptic animals is currently lacking for both rats and mice (van Luijtelaar & Bikbaev, 2007; Sitnikova et al., 2014a). This is a significant shortcoming because the human literature suggests that sleep spindles may be relevant to pathological aspects of ageing such as dementia (Smirne et al., 1977; Ktonas et al., 2007; Latreille et al., 2015; Gorgoni et al., 2016).
(3) The dog
Although sleep spindles in dogs have been reported both in natural sleep (Petersen et al., 1964; Kis et al., 2014) and under anaesthesia (Charles & Fuller, 1956; Jeserevics et al., 2007; Pákozdy et al., 2012), characterization of their relationship to variables such as age, sex, and sleep-dependent learning is very recent (Iotchev et al., 2017, 2019, 2020a). These new developments form part of a larger renaissance exploring the sleep physiology of dogs (Kis et al., 2017a; Bunford et al., 2018; Varga et al., 2018; Bódizs et al., 2020; Reicher et al., 2020).
These recent findings show that dog sleep spindles display high face validity. Applying automatic detection with a 9-16 Hz target frequency and using search criteria validated for human sleep spindles (Nonclercq et al., 2013) confirmed several predictions based on the human literature (Table 1) concerning age, learning ability and sex (Iotchev et al., 2017, 2019, 2020a). The 9-16 Hz frequency range overlaps entirely with one of the definitions used in the human literature (Bódizs et al., 2009) and strongly with one of those used for spindle detection in mice (8-16 Hz) (Kim et al., 2015). Moreover, dogs display more fast spindles over central recording sites (Iotchev et al., 2019) as in humans (Gibbs & Gibbs, 1950) and rats (Terrier & Gottesmann, 1978), but, unlike in rats and similar to humans, the occurrence of fast spindles does not appear to be exclusive to signals from central/posterior electrodes. One potential shortcoming for the model's face validity, discussed in Section II.1 with regard to the cat, is that it is not clear from the human literature whether spindling frequency should extend to as low as 8 or 9 Hz. The predictive and external validity of the dog model have been strengthened through a series of studies on ageing, sleep-dependent learning and sex differences.
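Statements such as 'overlaps entirely' or 'overlaps strongly' can be given a number. The sketch below expresses overlap as shared bandwidth divided by combined bandwidth (the Jaccard index of two intervals). This measure and the function name `band_overlap` are our assumptions for illustration and are not taken from the cited studies; the band limits are those quoted in the text.

```python
from typing import Tuple

Band = Tuple[float, float]  # (lower, upper) bounds of a detection band in Hz


def band_overlap(a: Band, b: Band) -> float:
    """Shared bandwidth divided by combined bandwidth (0 = disjoint, 1 = identical)."""
    shared = min(a[1], b[1]) - max(a[0], b[0])
    if shared <= 0:
        return 0.0  # the two bands do not overlap
    combined = max(a[1], b[1]) - min(a[0], b[0])
    return shared / combined


# Frequency ranges quoted in the text.
dog = (9.0, 16.0)    # target frequency used for dog spindles
human = (9.0, 16.0)  # one human definition (Bódizs et al., 2009)
mouse = (8.0, 16.0)  # one mouse definition (Kim et al., 2015)
cat = (7.0, 14.0)    # feline spindle rhythmicity (Steriade & Llinás, 1988)

for name, band in (("human", human), ("mouse", mouse), ("cat", cat)):
    print(f"dog vs {name}: {band_overlap(dog, band):.3f}")
```

A single index of this kind ignores where within the band most events actually fall, so it describes detection criteria rather than the spindles themselves.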
Dogs are the first animal species in which differences in sleep spindle expression have been compared over a wide age range (1-16 years; Iotchev et al., 2019), an important comparison given the potential role of sleep spindles in cognitive ageing. As in humans (Landolt & Borbély, 2001; Martin et al., 2012), the amplitude of frontal, slow spindles was lower in older dogs, which also displayed the higher frequencies associated with ageing in the human literature (Principe & Smith, 1982; Crowley et al., 2002; Ktonas et al., 2007). However, unlike healthy ageing humans (Martin et al., 2012), but similar to patients with Alzheimer's disease (AD) (Gorgoni et al., 2016), healthy older dogs displayed a lower occurrence of centrally detected sleep spindles, although unlike AD patients these were specifically of the slow type. A higher occurrence of frontal, fast spindles in older dogs seems to be unique to this species. It is currently unclear if this should be viewed as analogous to the transition between childhood and adolescence in humans (Bódizs et al., 2014; Hahn et al., 2018). These findings suggest that mechanisms underlying the expression of frontal, fast spindles follow a different developmental trajectory in the dog, with a continuous increase throughout the animals' lifespan. Alternatively, this observation may be explained by the fact that sleep spindles display higher frequencies when there is less overall inhibition in the brain, as suggested by the fact that barbiturate anaesthesia-induced spindles are exclusively slow (Steriade & Llinás, 1988). A study investigating the association of age-related changes with cognitive changes in dogs showed that higher intrinsic spindle frequencies are associated with more trials needed in a reversal-learning task (Iotchev et al., 2020b). In humans, higher spindling frequencies are similarly associated with perseverance errors (Guadagni et al., 2020) and risk of developing dementia (Ktonas et al., 2007).
In dogs, post-sleep recall on a novel command-learning task was positively correlated with sleep spindle occurrence (Iotchev et al., 2017). This finding is important to this model's validation, since spindle-learning associations were confirmed through optogenetic experiments in the mouse specifically to concern thalamic spindles (Latchoumane et al., 2017). A failure to replicate this in the largest human sample to date, however, has raised concerns about the retest-reliability of this association [Ackermann et al. (2015) with >900 subjects]. Follow-up work in the dog (Iotchev et al., 2020a) suggests that positive associations between sleep spindle occurrence and learning are not artefacts caused by measurement error (a potential concern with small samples). Importantly, all demonstrations of this effect in dogs (Iotchev et al., 2017, 2020a) utilized the same spindle-detection regime, which is crucial given that different automated detectors diverge strongly in their estimates (Warby et al., 2014).
Indirect arguments for construct validity come from the observation of sexually dimorphic features in dog sleep spindles (Iotchev et al., 2017, 2019). In the dog, similar to humans (Bódizs, 2017), the density, frequency and amplitude of fast spindles are higher in females (Iotchev et al., 2019). In particular, fast spindle density and frequency were found to be higher in intact female dogs compared to intact males (Iotchev et al., 2017, 2019) or neutered females (Iotchev et al., 2019). This suggests an enhancing effect of female sex hormones and is consistent with the finding that in humans fast spindle density and frequency peak in the high progesterone luteal phase (Driver et al., 1996; Baker et al., 2007; de Zambotti et al., 2015).
The observed associations of spindle variables with age, learning, and sex emphasize the predictive validity of the dog model, in accordance with the broad definition (van der Staay et al., 2009). The narrow, pharmacological validity definition (Nestler & Hyman, 2010) and construct validity are supported indirectly by effects of neutering on the observed sex differences (Iotchev et al., 2017, 2019), suggesting a similar involvement of sex hormones in sleep spindle modulation. However, more work is needed to specify the exact hormones responsible for these changes in the dog. The extent to which the construct validity of the dog model could improve in the future is currently not clear. Invasive studies, crucial for our understanding of mechanisms, may become increasingly controversial in dogs, as opposition increases (Bailey & Pereira, 2018). The strength of the dog model may lie not in its ability to provide mechanistic insights, but rather in supporting faster and easier longitudinal and genetic studies (Sándor & Kubinyi, 2019), owing to the relatively short lifespan of dogs and the relative genetic homogeneity of dog breeds (Kubinyi, Sasvári-Székely & Miklósi, 2011). It is then perhaps not surprising that arguments have emerged in favour of using dogs as models in cognitive ageing studies (Adams et al., 2000; Head, 2013; Sándor & Kubinyi, 2019). As the only nonhuman species in which age-related differences in sleep spindles have been described (Iotchev et al., 2019), ageing is likely to be the focus of studies with the dog model in future comparative sleep spindle research. Another strength of the model is that thus far all findings relating dog sleep spindles to other behavioural and physiological variables (Iotchev et al., 2017, 2019, 2020a, 2020b) have used the same detection algorithm, modelled on the work of Nonclercq et al. (2013).
This is important given that automated detectors diverge in their estimates for spindle occurrence in humans (Warby et al., 2014), suggesting that caution is necessary when making comparisons across different automated methods.
(4) The sheep
Sheep have recently emerged as a new model in comparative sleep spindle research (Schneider et al., 2020). While so far the results in sheep satisfy only face validity, the work of Schneider et al. (2020) is significantly different from other face-value reports of spindles (Bert et al., 1970a; de Moura Filho et al., 1983; Nicol et al., 2000; Voirin et al., 2014). The spindling properties satisfying face validity in sheep (frequency, duration) were not a priori criteria used for detection, but were observed after selecting for other criteria (the search was based on a power-threshold criterion applied within a broader frequency range, from 5-16 Hz). In the 10-16 Hz range, alleged sheep spindles strongly resemble those of humans, rats, mice, and dogs. Of particular importance was the spindle duration (maximally 3 s in six animals), as this sets them apart from the similar, but longer-lasting alpha waves (Steriade & Llinás, 1988). As in cats and mice, however, a distinction between a slow and fast subtype was not confirmed.
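The logic of detecting on power alone and inspecting duration only afterwards can be illustrated with a minimal sketch. This is not the algorithm of Schneider et al. (2020): the window length, step, threshold factor, function names and the synthetic trace (noise plus a 1.5 s burst at 13 Hz and a 5 s train at 10 Hz) are all assumptions chosen for illustration. Only the 5-16 Hz search band and the 3 s duration come from the text.

```python
import math
import random
import statistics
from typing import List, Tuple


def band_power(window: List[float], fs: float, f_lo: float, f_hi: float) -> float:
    """Mean DFT power of the bins whose frequency lies within [f_lo, f_hi] Hz."""
    n = len(window)
    powers = []
    for k in range(1, n // 2):
        if f_lo <= k * fs / n <= f_hi:
            re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(window))
            im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(window))
            powers.append((re * re + im * im) / (n * n))
    return sum(powers) / len(powers) if powers else 0.0


def detect_events(signal: List[float], fs: float, f_lo: float = 5.0, f_hi: float = 16.0,
                  win_s: float = 0.5, step_s: float = 0.25,
                  factor: float = 5.0) -> List[Tuple[float, float]]:
    """Return (start_s, end_s) of runs of windows whose band power exceeds the threshold."""
    win, step = int(win_s * fs), int(step_s * fs)
    starts = list(range(0, len(signal) - win + 1, step))
    powers = [band_power(signal[s:s + win], fs, f_lo, f_hi) for s in starts]
    threshold = factor * statistics.median(powers)  # assumed threshold rule
    events: List[Tuple[float, float]] = []
    run = None  # [first_sample, last_sample] of the current supra-threshold run
    for s, p in zip(starts, powers):
        if p > threshold:
            run = [s, s + win] if run is None else [run[0], s + win]
        elif run is not None:
            events.append((run[0] / fs, run[1] / fs))
            run = None
    if run is not None:
        events.append((run[0] / fs, run[1] / fs))
    return events


def synthetic_trace(fs: float = 100.0, seconds: float = 20.0) -> List[float]:
    """Gaussian noise plus a short 13 Hz burst and a long 10 Hz train (made-up data)."""
    rng = random.Random(0)
    trace = []
    for i in range(int(fs * seconds)):
        t = i / fs
        x = rng.gauss(0.0, 1.0)
        if 8.0 <= t < 9.5:      # spindle-like burst, 1.5 s
            x += 5.0 * math.sin(2 * math.pi * 13.0 * t)
        if 13.0 <= t < 18.0:    # alpha-like train, 5 s
            x += 5.0 * math.sin(2 * math.pi * 10.0 * t)
        trace.append(x)
    return trace


FS = 100.0
events = detect_events(synthetic_trace(FS), FS)
durations = [end - start for start, end in events]
# Duration is inspected only after detection, not used as a search criterion.
short_events = [ev for ev, d in zip(events, durations) if d <= 3.0]
for (start, end), d in zip(events, durations):
    print(f"{start:.2f}-{end:.2f} s (duration {d:.2f} s)")
```

Because events are built from 0.5 s windows, estimated durations are somewhat longer than the true burst lengths; a real analysis would need finer temporal resolution before comparing durations across species.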

(5) The macaque
Surprisingly little is known about sleep spindles in other primates. In the human literature, references to other species are mostly limited to rodent work (Fernandez & Lüthi, 2020) or the historically prominent cat model (de Gennaro & Ferrara, 2003; Fernandez & Lüthi, 2020). Early work on chimpanzees, baboons, and macaques did not confirm the presence of sleep spindles in adult specimens (Bert et al., 1970a,b), possibly explaining the subsequent lack of primate research. Two recent studies have investigated sleep spindles in macaques (Takeuchi et al., 2016; Sritharan et al., 2020), although these used two different sub-species of the genus Macaca and thus integration of the findings should be done with caution. Although both report sleep spindles with humanlike duration, average frequency, and prevalence throughout the sleep-wake cycle (i.e. mostly in NREM), a twofold violation of face validity is seen in Takeuchi et al. (2016) concerning both topography and frequency range. Intrinsic frequencies of up to 20 Hz were reported, which are not only unusually high, but likely to overlap with artefacts related to epileptiform or respiration-linked activity (discussed in Jankel & Niedermeyer, 1985). Moreover, they observed the highest intrinsic frequency for 'sleep spindles' in the dorsolateral prefrontal cortex (DLPFC), suggesting a reversal of the usual pattern of frontal-slow, posterior-fast topography. However, Takeuchi et al. (2016) did demonstrate a humanlike temporal coupling of spindle-like events with other oscillations, such as gamma rhythms and K-complexes (Ayoub et al., 2012; Staresina et al., 2015), although some authors maintain there is no causal connection between sleep spindles and gamma rhythms (Mackenzie, Pope & Willoughby, 2005). Sritharan et al. (2020) focused on thalamo-cortical synchronization during spontaneous and evoked spindles. 
Comparability between the spontaneous and evoked oscillations was a higher priority in this endeavour than comparability to humans or other animals. Therefore, only a few of their observations relate to external validity, satisfying only face validity criteria. Crucially, their work does not confirm the existence of 20 Hz spindles alleged by Takeuchi et al. (2016), but the use of different sub-species and methods limits the integration of these results. Earlier work in macaques, while focusing on spindle-relevant structures in thalamo-cortical organization, did not directly address sleep spindles (Paré & Steriade, 1993; Carden et al., 2006).

(6) General overview and future directions
We provide two tabular overviews of the six animal models. Table 2 summarizes the current situation concerning face, predictive, and construct validity for each of the six species. Table 3 summarizes their external validity.
The future utility of these animal models will depend strongly on the future of sleep spindle research in general. In human studies, sleep spindles have received much recent attention with regard to ageing (Principe & Smith, 1982; Guazzelli et al., 1986; Landolt & Borbély, 2001; Crowley et al., 2002; Martin et al., 2012), specifically as predictors of pathologies such as AD and dementia (Smirne et al., 1977; Ktonas et al., 2007; Latreille et al., 2015; Gorgoni et al., 2016). A stronger focus on modelling age-related changes will likely shift focus to the dog, because characterization of age-related sleep spindle changes in this species has already begun (Iotchev et al., 2019) and corresponding changes in spindle features and cognitive performance have been detected (Iotchev et al., 2020b). Finally, unlike rats, dogs develop dementia-like symptoms naturally (Landsberg & Malamed, 2017; Sándor & Kubinyi, 2019). The historical focus on cats in sleep science might also undergo a renaissance since cats also develop dementia naturally (Landsberg & Malamed, 2017), although this will require additional evaluation of the predictive and external validity of the cat model. To date, studies on cats have almost exclusively investigated the mechanistic aspects of sleep spindles. Meanwhile, mice are beginning to displace rats as the preferred rodent model for studies of human physiology and cognition, due to advances in optogenetics combined with a better understanding of the mouse genome (Ellenbroek & Youn, 2016). This has already resulted in studies that actively manipulated spindle-generating mechanisms in the mouse (Astori et al., 2013; Latchoumane et al., 2017), thus opening up new directions in sleep spindle research.
A strong argument for more primate studies is that the subdivision of NREM sleep into different stages seems to be unique to primates [discussed in Bert et al. (1970b) and Genzel et al. (2014)]. In light of this, human findings showing that the precise stage of NREM sleep was relevant to associations between spindle occurrence and recall performance (Cox et al., 2012) present a specific argument for studies of our closest phylogenetic relatives. The more recent attempts to record sleep spindles in non-human primates have produced mixed results regarding face validity (Takeuchi et al., 2016; Sritharan et al., 2020).
One crucial criterion to consider with regard to practical choices in comparative work is the ecological validity of a model (Schmuckler, 2001), i.e. how similar are the conditions under which the phenomenon is expressed within a study to how it occurs in nature? Studying mechanisms often involves invasive methodology, and even isolated structures or tissue preparations [for examples in the sleep spindle literature, see Rosanova & Ulrich (2005) and Bazhenov et al. (1999)]. For this reason, experimental methodology and ecological validity are somewhat mutually exclusive, or involve at least a considerable trade-off.
We consider that studies with dogs and sheep currently score highest in terms of ecological validity. In sheep this is because the experimental animals were kept under relatively natural conditions [socially, outdoors, natural lighting (Schneider et al., 2020)], while neuroscience studies on dogs (Bunford et al., 2017) benefit from the establishment of a non-invasive recording method. High ecological validity might be most important when studying effects associated with the animals' genetic background, as any phenotypic effects introduced by the experimental situation will be reduced. The genetic determinants of sleep spindle features and their association with cognitive functions have received attention in both humans and mice (Franken, Malafosse & Tafti, 1998; de Gennaro et al., 2008; Purcell, 2017).

III. THE MAMMALIAN SLEEP SPINDLE REVISITED
Reviewing the literature on non-human sleep spindles through the lens of the three criteria used in animal model validation first proposed by Willner (1984) allows us to break down the uncertainties surrounding comparative work into two specific questions. Firstly, which descriptions of sleep spindles can be trusted and to what extent? Secondly, which features of the sleep spindle are universal, and are therefore likely to be conserved, and which features differ and were therefore likely shaped by different selection pressures?
The first question is a critical re-evaluation of the claim that sleep spindles have been observed through visual inspection of EEGs in a wide range of mammals (Zepelin et al., 2005). The morphology of an EEG transient can be deceiving; for example, slow spindles and alpha waves are of the same frequency and shape (de Gennaro & Ferrara, 2003), yet do not share the same mechanisms (Chen et al., 2016). Some spindle-like events described during the transition from and into REM sleep are entirely non-thalamic in origin (Gottesmann, 1996) and instead seem to arise in the hippocampus. Since recent consensus mandates that 'sleep spindles' should refer to thalamo-cortical bursts (Fernandez & Lüthi, 2020), as described in the classical cat studies (Steriade & Llinás, 1988), cross-species comparisons face the challenge of having to verify that the sleep spindles they observed were indeed thalamo-cortical in origin. Considering, furthermore, that sleep spindles do not appear to be part of bird and reptilian sleep physiology (Rattenborg et al., 2011; Shein-Idelson et al., 2016; van Der Meij et al., 2019), the existence of true sleep spindles in any species cannot be taken for granted.
We argue that the three most common criteria applied in animal model validation [face, predictive, and construct validity (Willner, 1984; Coenen & van Luijtelaar, 2003; van der Staay, 2006; van der Staay et al., 2009; Nestler & Hyman, 2010; Belzung & Lemoine, 2011; Topál et al., 2019)] provide a useful method for addressing this question. However, a potential concern with using this framework for sleep spindles is that two alternative paths can be taken to evaluate whether the criteria are met. From a clinical perspective, it makes sense to take humans as a point of reference and focus on the human sleep spindle literature (Table 1) in order to evaluate face and predictive/external validity for a given model. From an evolutionary perspective, however, construct validity is more important and since the mechanisms of spindling are currently best understood in the cat (Steriade & Llinás, 1988) it is justified to ask if the two perspectives are not at odds with each other. In Section II.2 we saw that rats and mice can be a valuable bridge between the cat model and humans, in that humanlike findings for rodent spindles relate to catlike mechanisms (i.e. the RTN-mediated interplay between the thalamus and cortex described in more detail in Section I.2). This is illustrated most directly by a recent optogenetic mouse experiment (Latchoumane et al., 2017) which demonstrated a humanlike temporal coupling to hippocampal ripples, cortical up-states, and a positive association with learning performance for the murine thalamic spindle. Because of the presence of oscillations that are spindle-like in appearance in the hippocampus (Gottesmann, 1996), rodent 'bridge' work like this mouse study is a crucial argument for viewing human and cat work as complementary. Although our evaluation of these models likewise adopts this assumption, we caution against rejecting the possibility that some conclusions are based on pseudo-spindles. 
In particular, recent claims that slow spindles might not be thalamic in origin (Timofeev & Chauvette, 2013) need to be investigated further. Pseudo-spindles are more similar to the slow than to the fast subtype (Terrier & Gottesmann, 1978; Gottesmann, 1996), while in cats and mice a thalamic dependence of sleep spindles is demonstrated across a broad frequency range (7-15 Hz) (Steriade et al., 1987; Kim et al., 2015; Latchoumane et al., 2017) which would include both the slow and fast subtypes seen in humans.
Our first conclusion is that we are only justified to speak of sleep spindles with certainty in humans, cats, rodents, and dogs since the evidence exceeds morphological similarities and validity criteria other than face validity are satisfied to a large extent (Table 2). Promising methods of validation are also emerging for sheep (Schneider et al., 2020), but these only apply so far to face validity. In non-human primates face validity was historically weak (discussed in Jankel & Niedermeyer, 1985) and has only recently received some support in two macaque sub-species (Takeuchi et al., 2016; Sritharan et al., 2020).
To address the second question requires re-evaluation of earlier claims about species differences by taking as a point of reference the animals for which extensive validation attempts exist (Tables 2 and 3). One common assumption has been that sleep spindles show species-specific intrinsic frequencies (Zepelin et al., 2005). For example, spindles of surprisingly low frequency ranges, 6-8 Hz (with no distinction into slow/fast subtypes), were observed in sloths and the echidna (de Moura Filho et al., 1983; Nicol et al., 2000; Voirin et al., 2014). By contrast, while the frequency signature of the same brain processes can diverge strongly among species (Coenen & van Luijtelaar, 2003), work in the human, cat, rat, mouse, dog, and sheep suggests that for sleep spindles, the defining frequency range, while not identical, mostly overlaps. In these species, the lower boundary of the spindle frequency-band is 7-10 Hz and the upper boundary 15-16 Hz. Importantly, there is considerable support that in these species spindle-like bursts indeed reflect the same neural processes. Our second conclusion, therefore, is that the frequency range defining a sleep spindle is a shared, universal feature of mammalian sleep physiology.
While intrinsic frequency seems to present a universal feature when well-validated models are considered, the same evidence suggests that frequency differences between frontal and posterior sleep spindles are a crucial point of divergence among these species. In particular, a division into frontal-slow and posterior-fast spindle types is not reported for sheep and cats, while the evidence for mice is mixed [compare Kim et al. (2015) with Fernandez et al. (2018)] and a more complex partially reversed topography was reported for the Japanese macaque (Takeuchi et al., 2016). What can we deduce about the possible underlying physiology from these observations? Steriade & Llinás (1988) hypothesized that spindling frequency is slowed down with increased suppression of the thalamo-cortical network, since barbiturate anaesthesia is associated with lower spindling frequency and reduced spontaneous thalamo-cortical activity, while ketamine-xylazine anaesthesia is associated with higher sleep spindle frequencies and thalamo-cortical activity (Contreras & Steriade, 1995; Contreras, Destexhe & Steriade, 1997). The emergence of local differences in sleep spindle frequencies could therefore reflect local differences in thalamo-cortical interactions, which can also be expected on the basis of how different thalamic nuclei connect to each other and the cortex (Crabtree & Isaac, 2002) via the RTN.
A more direct argument for localized thalamo-cortical dynamics underlying localized spindle features comes from mice, where, although anterior and posterior sleep spindles do not display different frequencies, the relative involvement of the cortex, thalamus and RTN (measured using cross-correlation-derived coefficients) did differ between the anterior and posterior sites (Kim et al., 2015). What we see in the mouse also suggests that the mechanisms underlying local differences in sleep spindle frequency might be present, but expressed less, in species that do not display measurable frequency differences between anterior and posterior spindles.
Another non-mutually exclusive explanation is the observation that different recording methods and anatomical differences can influence the recorded spindling frequency. In mice, some authors have reported faster spindles within very localized (barrel cortex) areas (Fernandez et al., 2018), while in cats different cortical depths, rather than an anterior-posterior division, are associated with different sleep spindle frequencies (Timofeev & Chauvette, 2013). Specifically, faster sleep spindles in the cat are seen in the suprasylvian cortex (a structure extending across the entire anterior-posterior axis) than in the medial prefrontal cortex. In the Japanese macaque both the slowest and fastest sleep spindles were observed in the prefrontal cortex, in the medial and dorsolateral regions respectively (Takeuchi et al., 2016). In humans and dogs the observed anterior-posterior differences are based on extra-cranial recordings (Gibbs & Gibbs, 1950; Iotchev et al., 2019), whereas in rats results are from subdurally implanted electrodes which did not extend deeper than the dorsal (outer) cortex (Terrier & Gottesmann, 1978). Importantly, in sheep and mice, recordings across the anterior-posterior cortical surface (Kim et al., 2015; Schneider et al., 2020) did not reproduce a slow versus fast distinction, thus recording depth alone could not explain the observed species differences regarding localized spindle frequency. Therefore, our third conclusion is that localized differences in sleep spindle frequency are a promising indicator of species differences that likely reflect evolutionary adaptations. Observations in the mouse (Kim et al., 2015) further suggest that local differences in how the thalamic and cortical areas are connected could represent the mechanisms upon which selection can act. 
Whether a particular arrangement was selected because of its effects on behaviour when awake or on sleep will not be easy to deduce, since similar thalamocortical interactions support learning in both states in mice (Chen et al., 2016).
Mapping which features of the sleep spindle are universal versus species-specific will be crucial for reconstructing its evolution. One complication towards making the next step is the different results obtained using different methodologies for determining phylogenies in mammals (Amrine-Madsen et al., 2003; Reyes et al., 2004; Cannarozzi, Schneider & Gonnet, 2007; Helgen, 2011). For now, by using well-validated model species as a limited source of inference, we can demonstrate that some observations (Nicol et al., 2000; Zepelin et al., 2005; Voirin et al., 2014) about how sleep spindles are expressed across the animal kingdom are likely erroneous. The defining frequency range may be an evolutionarily conserved feature of the sleep spindle, whereas local variations of frequency and the localization of these differences may be more labile among species, likely resulting from topographic differences in thalamo-cortical connectivity.

IV. CONCLUSIONS
(1) Comparative work on the sleep spindle has been of varying quality and neglect of the three validities for animal models (face, predictive and construct validity) unfortunately has been common. This leads to uncertainty about the validity of earlier sleep spindle descriptions, since the influence of pseudo-spindles cannot be ruled out. We reviewed evidence that the sleep spindles observed in humans, macaques, cats, dogs, rodents and sheep are likely manifestations of the same underlying neural processes. The same cannot be claimed for reports of similar oscillations observed in other animals.
(2) We find little support for the widely held belief that sleep spindle frequency differs among species. The available results do not support this notion, with any exceptions exclusively concerning rarely studied animals or face-value criteria for spindle detection.
(3) The presence and localization of different sleep spindle subtypes as defined by their intrinsic frequency may represent a species-specific feature of the sleep spindle. A distinction into frontal-slow and posterior-fast is observed in humans, rats and dogs. The underlying reasons for these differences might be related to differences in anatomy and recording depth.

V. ACKNOWLEDGEMENTS
We would like to thank Anna Kis, Ádám Miklósi, Kauê Machado Costa and Péter Pongrácz for useful comments and suggestions. Fruitful discussions with Nóra Bunford, Péter Ujma, and Gilles van Luijtelaar shaped some of the thoughts expressed herein. This project received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (Grant Agreement No. 680040). Open access funding enabled and organized by Projekt DEAL.