Improving actigraphic sleep estimates in insomnia and dementia: how many nights?

Authors

  • EUS J. W. VAN SOMEREN

    1. Netherlands Institute for Neuroscience, an Institute of the Royal Netherlands Academy of Arts and Sciences, Amsterdam, The Netherlands and Departments of Neurology, Clinical Neurophysiology and Medical Psychology, VU University Medical Center, Amsterdam, The Netherlands

Eus J. W. Van Someren, PhD, Department of Sleep and Cognition, Netherlands Institute for Neuroscience and VU University Medical Center, p/a Meibergdreef 47, 1107 BA Amsterdam, The Netherlands. Tel.: +31 20 566 5497; fax: +31 20 696 1006; e-mail: e.van.someren@nin.knaw.nl

Summary

To investigate how the duration of actigraphic recordings affects the reliability of actigraphic estimates of sleep and 24-h activity rhythm variables, 2–3 weeks of actigraphy were recorded, from which pairs of variables derived from two periods of increasing length (1–10 days) were compared. Two groups were studied: (1) 10 subjects suffering from primary insomnia; and (2) 12 demented elderly subjects living semi-independently in group care facilities of homes for the elderly. Actigraphic estimates of primary measures of sleep (duration and efficiency) and of the 24-h activity pattern (interdaily stability, intradaily variability and amplitude) were calculated on variable lengths of the actigraphic recordings. The average absolute difference of two estimates decreased – and reliability increased – strongly with an increasing number of days analysed. An acceptable reliability of the interdaily stability estimate required more than 7 days of recording. It can be concluded that a valuable improvement in the reliability of actigraphic sleep estimates can be obtained by simply increasing the number of recording nights. The results support the importance of day-to-day variability in insomnia and dementia that has previously been noted by others, and even suggest the presence of ‘week-to-week’ variability. This variability may have been involved in the equivocal results of treatment studies in insomnia and dementia where outcome measures were based on a limited number of nights. Such studies could profit from extension of the recording duration to, e.g. 2 weeks, and from the inclusion of variability measures as measures of clinical interest.

Introduction

An advantage of actigraphy over polysomnography is that multiday recordings are easily accomplished. This makes it feasible to improve sleep estimates by averaging over multiple nights. It also allows one to investigate night-to-night variability in sleep variables and activity rhythm variables, which may be of considerable importance themselves (Chambers, 1994; Edinger et al., 1991; Huang et al., 2002; Scherder et al., 1999; Vallieres et al., 2005). Sleep variability across consecutive nights has been noticed in many previous reports (e.g. Kronholm et al., 1987; Edinger et al., 1991; Van Hilten et al., 1993; Sadeh et al., 1991; Acebo et al., 1999). Because this variability can affect the reliability of the sleep variables, it is a source of concern if one wants to estimate differences in ‘typical’ sleep variables in relation to patient groups or treatments. Especially for studies of limited sample size, even small improvements of the reliability of outcome measures may help to ensure enough statistical power to discriminate groups or treatment effects. Reliability of the sleep estimates increases with the number of days recorded. Practice parameters for the use of actigraphy in the clinical assessment of sleep disorders have been published by the Board of Directors of the American Academy of Sleep Medicine (Ancoli-Israel et al., 2003; Littner et al., 2003; Sadeh et al., 1995), and in one of them it has been advocated to use no less than three consecutive 24-h periods (Littner et al., 2003). Acebo et al. (1999) have advocated the use of at least 5 days for actigraphic sleep estimates in children, and recognized that some sleep variables in certain subgroups of children hardly reached acceptable reliability even if the estimate would be based on seven nights of recording. 
To our knowledge, no previous studies have evaluated the actual, empirically determined reliability and absolute difference between sleep variables based on two subsequent recording periods varying between 1 and 10 days in subjects with insomnia and in demented elderly. Also, the effect of recording duration on 24-h activity rhythm variables (Van Someren et al., 1999) has not previously been investigated. The aim of the present study is to investigate recording duration-related changes in reliability and in absolute differences of sleep and activity rhythm outcome measures, in sample sizes typical of smaller studies (n ≈ 10) in insomnia and dementia.

Methods

Participants

The first sample consisted of 10 subjects suffering from primary insomnia (Diagnostic Classification Steering Committee, 1990), mostly female (seven), aged 58 ± 2 years (mean ± SEM), actigraphically recorded for 14 days continuously. The diagnosis of primary insomnia was given according to the qualitative criteria of the International Classification of Sleep Disorders (ICSD) (Diagnostic Classification Steering Committee, 1990) and the Research Diagnostic Criteria for Primary Insomnia (Edinger et al., 2004), as well as according to the quantitative criteria proposed by Lichstein et al. (2003), i.e. sleep onset latency or wake time after sleep onset of more than 30 min, occurring at least three times a week for at least half a year. The study complied with the majority of the recommendations of the recently published Recommendations for a Standard Research Assessment of Insomnia (Buysse et al., 2006). Diagnosis was performed under supervision of accredited sleep specialists (RS – see Acknowledgements – and EVS) and included interviews, questionnaires, sleep diaries, wrist and foot actigraphy and polysomnography.

The second sample consisted of 12 residents of group-care facilities of homes for the elderly, mostly female (10), aged 85 ± 2 years (mean ± SEM), actigraphically recorded for 20 days continuously. The average Mini Mental State Examination (MMSE) (Folstein et al., 1975) score of the dementia group was 15 ± 1, and clinical diagnoses according to consensus criteria (American Psychiatric Association, 1994; McKeith et al., 1996; McKhann et al., 1984) were probable Alzheimer’s disease (six), vascular dementia (four), frontal type dementia (one) and Wernicke–Korsakoff (one). The clinical diagnoses were performed by a specialized physician (RR – see Acknowledgements). The Medical Ethical Committees of Hospital De Gelderse Vallei, Ede and the VU University Medical Center, Amsterdam, the Netherlands, approved the studies, and subjects only participated after informed consent by themselves or their relatives.

Procedure

Estimates of sleep and sleep–wake rhythm parameters were obtained using the Actiwatch and Sleep 5 software (Cambridge Neurotechnology, Cambridge, UK), logged at 1-min intervals and analysed at high sensitivity (Colling et al., 2000; Kushida et al., 2001). It should be noted that the Actiwatch is only one of several available actigraphs, with an appropriate filter bandwidth for recording small wrist movements (Van Someren et al., 1996), and that sleep scoring algorithms other than those used in the Sleep 5 software are available as well. Thus, the present results may differ slightly for other software and hardware. It should also be noted that there is no one-to-one relation between actigraphic sleep estimates, polysomnographic (PSG) sleep estimates and subjective sleep estimates – each likely contributing in its own right to the quantification of sleep and sleep problems. Insomniacs may be awake without moving, resulting in inaccurate sleep estimates. In elderly insomniacs, a reasonable cross-validity between PSG and actigraphic estimates has been demonstrated by some, but not other studies (Brooks et al., 1993; Friedman et al., 2000; Sivertsen et al., 2006). Although acceptable cross-validity between PSG and actigraphic estimates in dementia has been demonstrated (Ancoli-Israel et al., 1997) – albeit using different actigraphs and software – even the validity of PSG itself to estimate sleep in demented elderly is questionable, due to the intrusion of slow waves in the wake EEG. Thus, actigraphic estimates should best be regarded as separate measures, not as measures to replace either subjective sleep estimates or PSG-based sleep estimates. Of note, there is also considerable night-to-night variability in PSG-based sleep estimates. The possible increased convergence of PSG and actigraphy averages over prolonged (weeks) recording durations of each remains to be investigated.

To set lights-out and rise times in the software, sleep logs were used for the primary insomniac subjects, and bedtimes and get-up times provided by the nursing staff were used for the demented subjects, who were put to bed and taken out of bed at fixed times every day. Previous work showed that especially the information about bedtimes provided by the nursing staff was highly reliable in the present sample, possibly due to the fixed times of bringing subjects to bed and turning off their lights (Hoekert et al., 2006).

Analysis and statistics

Here we illustrate the effect of increasing the number of analysed nights on two of the most often used sleep parameters, i.e. the sleep duration (total sleep time, TST) and the sleep efficiency (SE%), the latter reflecting the percentage of time in bed spent asleep (0–100%); and on three sleep–wake rhythm parameters: (1) the interdaily stability (IS) reflecting the predictability of the 24-h rest–activity pattern; (2) the intradaily variability (IV) reflecting the fragmentation of the activity profile into brief periods of rest and activity; and (3) the amplitude (AMP) of the rhythm, calculated non-parametrically from the major periods of activity and of rest (Van Someren et al., 1999). The parameters were repeatedly calculated for periods either 1 week apart (primary insomniac subjects) or 10 days apart (demented elderly). Calculations were made for two separate single days, for two periods of 2 days, for two periods of 3 days, and so on, up to two separate periods of 7 days for primary insomniac subjects (days 1–7 and 8–14) and up to two separate periods of 10 days (days 1–10 and 11–20) in demented elderly. For the multiple-day assessments the resulting sleep estimates were averaged over the number of days, whereas the activity rhythm parameters are intrinsically calculated over multiple days. Repeated measures t-tests were used to test whether the average absolute estimate difference of longer registrations differed from the absolute estimate difference between two periods of 3 days, which is the minimum number of days recommended by the AASM Consensus report on the use of actigraphy (Littner et al., 2003). One-sided values of P < 0.05 were considered statistically significant. Intraclass correlation coefficients (ICC) were also calculated between the pairs of 1- to 10-day estimates to provide insight into the increase in reliability with increasing number of days included in the analysis.
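As a rough illustration of the two variance-ratio rhythm variables named above, the following Python sketch computes IS and IV from hourly activity totals, using the formulas as they are commonly given for these non-parametric measures (Van Someren et al., 1999). The function names, the hourly binning and the requirement of whole 24-h days are assumptions made for this illustration; this is not the code of the Sleep 5 software or of the present study.

```python
def _mean_and_total_ss(hourly):
    """Return the overall mean and the total sum of squares around it."""
    n = len(hourly)
    mean = sum(hourly) / n
    total_ss = sum((x - mean) ** 2 for x in hourly)
    if total_ss == 0:
        raise ValueError("activity is constant; IS and IV are undefined")
    return mean, total_ss


def interdaily_stability(hourly):
    """IS: variance of the average 24-h profile relative to the total variance.

    hourly: activity totals per hour; length must be a multiple of 24
    (assumption of this sketch).
    """
    n = len(hourly)
    if n == 0 or n % 24 != 0:
        raise ValueError("expected a whole number of 24-h days")
    mean, total_ss = _mean_and_total_ss(hourly)
    days = n // 24
    # Average activity for each clock hour across all recorded days.
    profile = [sum(hourly[d * 24 + h] for d in range(days)) / days
               for h in range(24)]
    profile_ss = sum((m - mean) ** 2 for m in profile)
    return (n * profile_ss) / (24 * total_ss)


def intradaily_variability(hourly):
    """IV: mean squared hour-to-hour difference relative to the total variance."""
    n = len(hourly)
    if n < 2:
        raise ValueError("need at least two hourly values")
    _, total_ss = _mean_and_total_ss(hourly)
    diff_ss = sum((hourly[i] - hourly[i - 1]) ** 2 for i in range(1, n))
    return (n * diff_ss) / ((n - 1) * total_ss)


# Hypothetical data: 8 h of rest and 16 h of activity, repeated identically for 3 days.
repeating = ([0] * 8 + [10] * 16) * 3
print(interdaily_stability(repeating))    # identical days give the maximum IS
print(intradaily_variability(repeating))  # few transitions give a low IV
```

Because IS depends on how well the days resemble one another, it can only be computed over a period of several days, which is why no 1-day or 2-day IS values appear in the analysis.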

Results

Figure 1 shows the means (±SEM) of the absolute differences between two estimates of the sleep and circadian variables. The horizontal axes show the number of days included in the two estimates. It is clear that the absolute difference between most estimates decreases, i.e. the reliability increases, with the number of days included in the analysis.

Figure 1.

 Absolute differences (vertical axes) between two actigraphic estimates based on an increasing number of days (horizontal axes) included in the estimate. The left panels show results for subjects suffering from primary insomnia, the right panels for demented elderly subjects. From the upper to the lower panels average absolute estimate differences (±SEM) are shown for: (a) total sleep time, (b) sleep efficiency, (c) interdaily stability, (d) intradaily variability and (e) 24-h rhythm amplitude. Note the gain in reliability with an increasing number of days included in the parameter estimates.

For insomniac subjects, the absolute difference between two estimates decreased significantly from estimates based on 3 days to estimates based on 7 days for TST (49% decrease, P = 0.01), IV (56% decrease, P = 0.01) and AMP (50% decrease, P = 0.02). For demented elderly, the absolute difference between two estimates decreased significantly from estimates based on 3 days to estimates based on 10 days for TST (55% decrease, P = 0.002), SE% (55% decrease, P = 0.002), IS (67% decrease, P = 0.02) and IV (59% decrease, P = 0.001).
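The absolute-difference measure and the percentage decreases reported above can be sketched in Python as follows, for a nightly sleep variable such as total sleep time. This is a minimal sketch under stated assumptions: the function names, the data layout (one list of nightly values per subject) and the example numbers are hypothetical, and the paired t-tests are not reproduced here.

```python
import math
import statistics


def abs_diff_summary(nightly, k, offset):
    """Mean and SEM across subjects of |mean of first k nights - mean of k nights starting at offset|.

    nightly: dict mapping a subject label to a list of nightly values.
    k:       number of nights averaged in each of the two periods.
    offset:  index of the first night of the second period
             (7 for periods 1 week apart, 10 for periods 10 days apart).
    """
    diffs = []
    for values in nightly.values():
        if len(values) < offset + k:
            raise ValueError("recording too short for the requested periods")
        first = statistics.mean(values[:k])
        second = statistics.mean(values[offset:offset + k])
        diffs.append(abs(first - second))
    mean_diff = statistics.mean(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, sem


def percent_decrease(reference, value):
    """Percentage by which value is lower than reference."""
    return 100.0 * (reference - value) / reference


# Hypothetical minutes of sleep for two subjects over four nights.
example = {"s1": [400, 420, 380, 410], "s2": [300, 340, 320, 330]}
mean_diff, sem = abs_diff_summary(example, k=2, offset=2)
print(mean_diff, sem)
```

Calling `abs_diff_summary` for each k and passing the 3-day result as `reference` to `percent_decrease` gives the kind of comparison reported in this section.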

Table 1 summarizes the ICCs calculated between the pairs of 1- to 10-day estimates. Estimates based on 7 days of recording were reliable according to consensus (ICC > 0.70), except for the variable IS, which was marginally reliable in dementia and not reliable in insomnia. This indicates that the activity rhythm shows a high intrasubject variability even when comparing two subsequent weeks. Extension of the number of days included in the analysis to more than 7 days could be evaluated in the dementia recordings and demonstrated a continued increase in the reliability of the variables, most pronounced for IS (Table 1).

Table 1. Intraclass correlation coefficients, as measure of reliability, calculated between the pairs of estimates of sleep and circadian parameters based on 1–7 days (insomnia, upper block) or 1–10 days (dementia, lower block)

Insomnia
Variable   Number of nights
           1     2     3     4     5     6     7
TST        0.70  0.65  0.78  0.79  0.88  0.91  0.93
SE         0.72  0.82  0.88  0.86  0.89  0.90  0.90
IS         -     -     0.67  0.24  0.47  0.49  0.53
IV         0.27  0.13  0.33  0.48  0.59  0.76  0.77
AMP        0.85  0.83  0.84  0.85  0.89  0.94  0.94

Dementia
Variable   Number of nights
           1     2     3     4     5     6     7     8     9     10
TST        0.59  0.72  0.89  0.93  0.94  0.94  0.95  0.95  0.96  0.97
SE         0.68  0.76  0.89  0.93  0.93  0.93  0.95  0.95  0.95  0.96
IS         -     -     0.10  0.29  0.49  0.59  0.71  0.73  0.86  0.95
IV         0.16  0.36  0.71  0.68  0.74  0.69  0.89  0.86  0.89  0.93
AMP        0.89  0.80  0.96  0.92  0.94  0.93  0.96  0.97  0.97  0.97

Note the omission of 1-day and 2-day outcomes for interdaily stability (IS), the calculation of which makes sense only on a minimum number of days. Note that the reliability of IS increases only slowly, and does not even appear to saturate at a minimally acceptable level (>0.70) after seven days of recording in insomnia patients, indicating strong week-to-week variability.
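An intraclass correlation between two estimates per subject, of the kind summarized in Table 1, could be computed along the following lines. The text does not state which ICC form was used, so this Python sketch assumes a one-way random-effects, single-measure ICC; the function name and the example values are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC for two measurements per subject.

    pairs: list of (first_estimate, second_estimate) tuples, one per subject.
    The choice of the one-way form is an assumption of this sketch.
    """
    n = len(pairs)
    k = 2  # two estimates per subject
    if n < 2:
        raise ValueError("need at least two subjects")
    subject_means = [(a + b) / 2 for a, b in pairs]
    grand_mean = sum(subject_means) / n
    # Between-subject mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square.
    ms_within = sum((a - m) ** 2 + (b - m) ** 2
                    for (a, b), m in zip(pairs, subject_means)) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("no variance in the data; ICC is undefined")
    return (ms_between - ms_within) / denominator


# Hypothetical pairs of estimates for three subjects.
print(icc_oneway([(1, 2), (3, 4), (5, 6)]))
```

With this form, identical first and second estimates for every subject give an ICC of 1, and values above 0.70 would count as reliable under the consensus threshold used in the text.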

Discussion

It was evaluated whether increasing the number of recorded days would affect the reliability of actigraphic estimates of sleep and sleep–wake rhythm parameters in insomnia and demented elderly. As illustrated by commonly used actigraphic estimates of the sleep variables, TST and SE%, and of the 24-h activity rhythm variables IS, IV and AMP, an increase in the number of days used for the actigraphic estimate pays off well in terms of an increase in reliability and a reduction in the absolute difference between two estimates. This finding is of importance, as it indicates that extending the number of recording days strongly improves the statistical power of studies investigating treatment effects or group differences with actigraphic estimates of sleep. Improvement of reliability was found for both primary insomnia and dementia, and it is likely that similar improvements are feasible in studies on other subjects, unless it is known beforehand that subjects adhere to a very strict and stable sleep pattern and sleep–wake rhythm. The actigraph has been the first really small and well-integrated recorder for human behaviour, and other promising and truly small recorders for both movement disorders (e.g. tremor) and human physiology (e.g. temperature) have recently followed (Van Marken Lichtenbelt et al., 2006; Van Someren et al., 1993, 1998, 2006). Because multiday recording with such a small device is much more feasible than multiday polysomnography, actigraphy can have added value in case of variable sleep and sleep–wake rhythms. The present findings also confirm that, although there may not be a systematic ‘first night effect’ (Agnew et al., 1966) in actigraphic sleep studies, there is a considerable night-to-night variability (e.g. Kronholm et al., 1987; Edinger et al., 1991; Van Hilten et al., 1993). 
Although actigraphy should not be considered an alternative for polysomnography in many cases (Littner et al., 2003), its use can at least provide additional information, including estimates of night-to-night variability of sleep parameters, which would be costly and time consuming using polysomnography, if at all feasible for the patients under study. This variability might be of use as an outcome measure in itself, as previously suggested for sleep variables (Chambers, 1994; Edinger et al., 1991; Vallieres et al., 2005) and the 24-h activity rhythm (Van Someren et al., 1999).

Previous studies have addressed the issues of the night-to-night variability of actigraphic activity measures and of the changes in their reliability with an increasing number of nights included in their estimation. In an earlier study, Aubert-Tulkens et al. (1987) reported an intrasubject variability of only about 3% in actigraphic nocturnal mobility measures in healthy subjects and subjects suffering from sleep apnoea. Van Hilten et al. (1993) similarly used actigraphy to examine the variability in nocturnal activity and immobility measures over six nights in healthy elderly subjects. No systematic differences were found over the six nights, indicating the absence of a first night effect. However, the night-to-night intrasubject variability was very high, as indicated by coefficients of variation ranging from 19% to 64%, depending on the variable of interest. Sadeh et al. (1991) reported several studies on intrasubject variability in children with normal and disturbed sleep. They evaluated two nights of actigraphic sleep estimates and confirmed the absence of a systematic difference over the two nights. Intrasubject variability was acceptable for some variables, yet considerable for the number of awakenings, sleep duration and the longest sleep period. Acebo et al. (1999) investigated reliability based on seven nights of actigraphic sleep estimates in children and adolescents, and concluded that adequate reliability (defined as >0.70) required at least five nights, and was not reached even with seven nights for some variables and subject groups. Acebo et al. used the Spearman–Brown prophecy formula, well known from test item analyses, to predict reliability of sleep variables based on 2- to 7-day averages from a single-night ICC obtained over seven recording nights. In their application, the formula assumed each subject to have the same mean and SD on any two periods of 2–7 days.
Our results, however, indicate that this assumption may not hold in insomniac and demented elderly. For this reason, and because the Spearman–Brown prophecy formula is regarded to make valid reliability estimates especially for a large number of items, read ‘nights’ (e.g. >15, see Bodner, 1980), we preferred in our study to empirically determine the changes in reliability and absolute errors when increasing the number of days, rather than to calculate estimates. Indeed, a post hoc calculation of estimated reliabilities on our data set according to the methods described in Acebo et al. gave results that differed considerably from the empirically determined reliabilities, especially for short recording durations. For example, in our demented subjects data set, the reliability of TST based on two nights was estimated to be 0.82, but only 0.72 if empirically determined from two actual two-night averages.
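For reference, the Spearman–Brown prophecy formula predicts the reliability of a k-night average from a single-night reliability r as k·r / (1 + (k − 1)·r). The Python sketch below implements the formula and its inverse; the function names are hypothetical, and the single-night value derived in the usage note is a back-calculation for illustration, not a figure reported in the paper.

```python
def spearman_brown(r_single, k):
    """Predicted reliability of a k-night average from a single-night reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


def implied_single_reliability(r_k, k):
    """Single-night reliability that the formula would need to yield r_k for k nights."""
    return r_k / (k - (k - 1) * r_k)


# Single-night reliability implied by a predicted two-night reliability of 0.82.
print(round(implied_single_reliability(0.82, 2), 2))
```

Under the formula, a predicted two-night reliability of 0.82 for total sleep time corresponds to a single-night reliability of about 0.69. The formula's prediction rests on the assumption of a stable mean and SD across periods, which is the assumption questioned above.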

Night-to-night variability is observed not only in actigraphic studies, but also in subjective sleep reports. A recent interesting finding is that sequential analyses of this variability suggest different subtypes of chronic insomnia (Vallieres et al., 2005). The long recording period recommended in the present paper is also necessary for the type of sequential analysis described by Vallieres et al.

In conclusion, our study indicates that the reliability of both sleep and 24-h activity rhythm variables obtained with actigraphy in insomniac or demented elderly can improve considerably with extended recording duration. A sequence of assessed nights has often been considered as a sequence of assessments of a variable with a stable underlying mean and SD. Under such assumptions, statistical considerations would predict hardly any increase in reliability beyond seven assessments (Vickers, 2003). One notable exception to this rule is if the variable under study changes episodically: in that case more repeated measures continue to increase the reliability of the estimate. Our results suggest indeed that actigraphically estimated sleep in insomniac elderly and demented elderly includes a component of such episodic waxing and waning of disturbed sleep, not limited to ‘good nights’ and ‘bad nights’ but possibly even extending to ‘good weeks’ and ‘bad weeks’. Such variability may have been involved in the equivocal actigraphic and polysomnographic findings of treatment studies in insomniacs and demented elderly. The statistical power of actigraphic sleep studies can thus be enhanced not only by including more subjects, but also by extending the recording duration, for which we recommend 2 weeks. Moreover, quantification of the variability as a clinical outcome measure is advocated for such studies, especially because variability is most strongly related to variables describing well-being, at least in demented elderly subjects (Carvalho-Bos et al., 2007).

Acknowledgements

Research funded by ZON-MW, The Hague, the Netherlands (projects SOW 014-90-001 and 28-3003); the Netherlands Organization for Scientific Research, The Hague, the Netherlands (VIDI Innovation Grant 016.025.041); and EU FP6 Sensation Integrated Project (FP6-507231). RS: Rob Strijers, MD, PhD, VU University Medical Center, Department of Clinical Neurophysiology, Amsterdam, the Netherlands is acknowledged for help with diagnosis of participants with insomnia. RR: Rixt Riemersma, MD; Ellemarije Altena, MSc; Ysbrand Van Der Werf, PhD and Rebecca Schutte, MSc, Netherlands Institute for Neuroscience, are acknowledged for help with patient recruitment, diagnosis, data-collection and analysis.
