Insomnia Severity Index: A reliability generalisation meta‐analysis

The aim of the current study was to conduct a reliability generalisation (RG) meta‐analysis of Cronbach's alpha for the Insomnia Severity Index (ISI). A systematic search of three databases (PubMed, Scopus, and Web of Science) from inception to 12 March 2021 was performed. Publications that reported Cronbach's alpha for the total ISI score were included. Only psychometric‐focussed studies were considered. Meta‐analysis was carried out using a random‐effects model to derive a pooled estimate of Cronbach's alpha. Data were extracted from 33 of the 706 publications found through the database search. The number of participants in the included publications ranged from 25 to 12,056, with 33 studies (42 estimates) reporting internal consistency coefficients, and a combined sample size of N = 29,688. The mean age of the samples in the included publications ranged from 13.4 to 74.3 years. Cronbach's alphas ranged from 0.65 to 0.92. The majority of the reported coefficients were ≥0.7 and presented a low risk of bias (n = 32). The pooled alpha coefficient was 0.83 (CI [0.81–0.85]; SE = 0.009) with high heterogeneity among the included publications (I² = 97%). Subgroup analyses including moderators such as continent, setting, risk of bias, and age did not significantly affect the overall result. In general, the cumulative estimate of Cronbach's alpha for the ISI is good. However, this finding should be interpreted with caution, since the level of heterogeneity is high and some of the studies might not have checked the assumptions underlying Cronbach's alpha.

To quantify the severity, intensity, and clinical significance of insomnia, clinicians often use indicators such as time to fall asleep, duration of awakenings, and total sleep time, as well as both the duration and frequency of sleep difficulties. Insomnia is a highly prevalent clinical condition: approximately one-third of the adult population is estimated to have at least one symptom of insomnia (Ali et al., 2020).
Insomnia is associated with a high rate of comorbidity, including increased risks of major depressive disorder, hypertension, and myocardial infarction (Dieck et al., 2018). Therefore, its assessment must be as accurate as possible. It is important to note that the clinical interview is the gold standard method to establish a diagnosis of insomnia. Even so, self-report measures may play an important role as a complementary strategy (Marques, 2020; Moul et al., 2004).
According to Buysse et al. (2006), the most recommended measures for evaluating global sleep and insomnia symptoms are the Pittsburgh Sleep Quality Index (PSQI) and the Insomnia Severity Index (ISI). However, it is important to outline that the PSQI was designed to measure general sleep disturbances, whereas the ISI was specifically developed to assess insomnia severity and the patientʼs perception of it (Bastien et al., 2001).
The Insomnia Severity Index is a generic scale designed to assess insomnia symptoms, though it is also used in fields beyond sleep medicine and insomnia-related issues. For instance, researchers may use the ISI to relate sleep quality to the patient's current condition, such as in oncology (Sharpley et al., 2021), in mild traumatic brain injury (Luethcke et al., 2011), in pain medicine, such as in patients suffering from fibromyalgia (Ware et al., 2010), and in the regulation of sleep-related side effects of menopause (Sternfeld et al., 2014).
According to Morin et al. (2011), the ISI is a 7-item scale that assesses nighttime and daytime insomnia symptoms and is frequently used as a follow-up measure in clinical studies. The scale assesses sleep onset, sleep maintenance, early morning awakening, interference with daily functioning, perceived prominence of impairment attributed to the sleep problem, concerns about sleep problems, and satisfaction with sleep patterns. Each item is rated on a 5-point Likert scale (0 to 4), yielding a total score ranging from 0 to 28. A higher score denotes more severe insomnia, and the administration time is generally less than 5 min (Bastien et al., 2001).
To use a measure such as the ISI it is crucial that internal consistency is established. Cronbach's alpha is the most common measure of internal consistency. It rates the accuracy of the instruments and the degree of item agreement within a measure (Adamson & Prion, 2013). Rodriguez and Maeda (2006) explain that "various other estimates of reliability capture specific sources of random variance (measurement error) and result in different sampling distributions" (p.309). One should note that important works such as the one by Guttman (1945) constitute an important background for reliability in psychological measures, particularly, Cronbach's alpha. As mentioned by Trizano-Hermosilla and Alvarado (2016), two of the alternatives to Cronbach's alpha are the composite reliability and McDonald's omega.
These take into account the relative weights of the items and do not have the "lower bound" problem (Zinbarg et al., 2005). It is important to note that researchers are often unaware that Cronbach's alpha rests on assumptions such as unidimensionality and tau-equivalence. These assumptions are seldom fulfilled in empirical research, but researchers often either do not know this or do not consider it important (Tavakol & Dennick, 2011).
Generally, Cronbach's alpha should be ≥0.70 for a measure to be considered reliable (Coaley, 2010). Nevertheless, to ensure the validity of a test, the internal consistency should be determined before it is employed for research or medical use (Tavakol & Dennick, 2011). As mentioned by Tavakol and Dennick (2011), Chmielewski and Watson (2009), and John and Benet-Martinez (2000), and cited by Agbo (2010), reliability pertains to the "consistency or reproducibility of a measurement procedure" (p.233).
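As an illustration of the coefficient discussed above, the following minimal sketch computes Cronbach's alpha from raw item scores using the usual variance-based formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of the total score). The function name `cronbach_alpha` and the toy data are hypothetical and are not taken from any of the included studies.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Illustrative sketch only: assumes complete data, at least two respondents
    and at least two items.
    """
    k = len(scores[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (summed) score
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: three respondents answering two items identically
toy = [[0, 0], [1, 1], [2, 2]]
alpha = cronbach_alpha(toy)
```

Note that the formula itself does not test unidimensionality or tau-equivalence; it returns a value whether or not those assumptions hold.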
When an instrument is translated and validated in many languages and different settings, a broader study that synthesises all the various versions of the same tool becomes germane. A meta-analysis is the appropriate technique to integrate and summarise the results obtained with a given instrument. The main goal is to synthesise evidence accurately and reliably, combining information from different studies. Meta-analysis is essentially used to promote theoretical advances, to resolve conflicts within a scientific domain, and to identify new directions for future research (Cooper et al., 2009). When collecting data, researchers can focus on a single measure, Cronbach's alpha for instance, to derive a single value that other studies can use as a reference, as it reflects the quality of the assessment tool.
As described by Field and Gillett (2010), the first step in conducting a meta-analysis is to systematically review the existing literature. The systematic review consists of structured steps that synthesise all the existing studies. It begins by defining the focus of the study (either a broad theme, such as all insomnia treatments, or a narrow theme, such as Cronbach's alpha in validation studies), then the specific keywords and queries to search in the databases, and lastly the inclusion and exclusion criteria for the papers. A systematic review can be published on the basis of this information alone but, depending on the data, it is possible to go a step further and calculate effect sizes, which is called a meta-analysis.
According to Vacha-Haase (1998), reliability generalisation (RG) is a method used to synthesise estimates of the reliability of an instrument. Reliability generalisation also characterises the amount of variability and its source over the reliability coefficients given for each study, by psychometric measures related to reliability. Reliability is influenced by the instrument itself and by the sample (i.e., its size, homogeneity, heterogeneity). Thus, and as stated by Vassar and Hale (2007), "the measure itself is not reliable, but rather it is the scores produced by the measure that are reliable" (p. 490). The results of the obtained variability allow the construction of confidence intervals around the average reliability of provided scores. Hence, they allow an analysis of how reliable are the scores produced by the measure concerning the different samples and the different characteristics (Graham et al., 2011).
When conducting a reliability generalisation meta-analysis, the notion of reliability induction and its implications must be considered.
Reliability induction is the heuristic by which researchers adopt another researcher's measure because it has shown acceptable levels of reliability, instead of developing their own. It occurs when researchers "commonly reference reliability coefficients from past studies or normative samples and induce that the data in hand are equally reliable" (Shields & Caruso, 2004, p.256). This practice is controversial because the reliability coefficient reported in the original study depends on that study's sample, whose composition and variability differ from those of the sample under study (Vassar & Hale, 2007). As reported by Deditius-Island and Caruso (2002), there are two types of reliability induction: the first is the one described above, and the second refers to cases in which researchers omit the value.

| Objectives
To the best of our knowledge, there is no reliability generalisation study that has focussed on the ISI. Therefore, the goal of the present systematic review and meta-analysis is to obtain an estimate of the average reliability of the ISI from published psychometric-focussed studies. In this paper, Cronbach's alpha is the only reliability measure explored, since it is the one most often reported in the literature (Trizano-Hermosilla & Alvarado, 2016). Another reason is the inability to combine different estimates.
Recently, Manzar et al. (2021) published results from a systematic review that included a meta-analysis of the ISIʼs Cronbach alphas.
However, some important limitations should be outlined: the analysis left aside some important studies that were published thereafter (from 2019); Scopus and Web of Science databases were not considered in the searches; REGEMA guidelines were not followed or at least mentioned (which is the most appropriate and updated guideline to conduct reliability generalisation studies); and only published articles in the English language were considered.

| METHOD
The current systematic review and meta-analysis was previously registered in PROSPERO (International prospective register of systematic reviews) with the following reference: CRD42021261889. The recommended 30-item checklist for reporting a reliability generalisation study (i.e., REGEMA, REliability GEneralization Meta-Analysis) was followed whenever it was applicable (Sánchez-Meca et al., 2021).

| Selection criteria
Validation studies that reported Cronbachʼs alpha pertaining to the ISIʼs total score were included. Conference proceedings/abstracts, editorials, or letters and studies that did not report global Cronbachʼs alpha were excluded. There were no geographical and/or cultural

| Search strategies
A systematic search of three databases (i.e., PubMed, Scopus, and Web of Science) from inception to 12 March 2021 was performed using keywords and free text words ("ISI" or "Insomnia Severity Index", "validation", "Cronbach's alpha", "Psychometric", "Insomnia", and "Sleep"). In the databases, the authors combined the keywords in multiple ways, such as: ("ISI" OR "insomnia severity index") AND "validation" AND ("Cronbach's alpha" OR "Cronbach").

| Data extraction
Two authors (LQC and MCJ) independently selected papers for full-text screening and extracted the data. In general, the results showed high agreement (κ = 0.92). Discrepancies between the two researchers were discussed, clarified, and resolved through review by a third author (DRM).
The extracted demographic variables included the mean and standard deviation of the participants' age (in years), the year of publication, and the country and continent in which the study was performed. Regarding the study information, the type of sample (i.e., adults, elderly, and adolescents), the sample size, and the type of population (i.e., community and/or clinical, with insomnia or another condition, specifying which condition) were extracted. Along with these characteristics, the reliability coefficient (Cronbach's alpha) reported for the global ISI scale in each selected study was retrieved as the effect size measure.
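For readers less familiar with the agreement statistic reported above, the following sketch shows how Cohen's kappa can be computed for two raters' screening decisions: observed agreement is corrected for the agreement expected by chance from each rater's marginal proportions. The function name and the example decisions are hypothetical; they do not reproduce the screening data of this review.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical decisions.

    Illustrative sketch: undefined (division by zero) when chance agreement
    equals 1, i.e. when both raters use a single identical category throughout.
    """
    n = len(rater_a)
    # Observed proportion of agreement
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the raters' marginal category counts
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in categories) / n ** 2
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical include/exclude decisions for four records
a = ["in", "in", "out", "out"]
b = ["in", "out", "out", "out"]
kappa = cohens_kappa(a, b)
```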

| Quality assessment (risk of bias)
Evaluation of the risk of bias (RoB) was performed on all estimates from each selected article. Two independent authors (LQC and MCJ) evaluated the risk of bias (ICC = 0.96) and any conflict was clarified by a third reviewer (DRM). The RoB was assessed according to the nine-item questionnaire developed by Hoy et al. (2012), and each item (e.g., true or close representation of sampling frame, random selection, etc.) was rated with either a "0" (low risk) or a "1" (high risk). After the item ratings were summed, the study was rated as low risk (0-3), moderate risk (4-6) or high risk of bias (7-9).
[Figure 1, study selection flow: empirical references that reported some reliability coefficient (n = 48); records not recovered by interlibrary loan (n = 0); empirical references that induced the reliability (n = 0); empirical references excluded: wrong publication type (n = 7), missing outcome, authors did not respond (n = 4), not an ISI validation article (n = 2), preliminary validation (n = 2); empirical references included in the meta-analysis (n = 33).]
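The scoring rule for the risk of bias described in this subsection can be sketched as follows. This is a minimal illustration of the summation and the cut-offs stated in the text (0-3 low, 4-6 moderate, 7-9 high); the function name and the example ratings are hypothetical.

```python
def risk_of_bias_category(item_ratings):
    """Classify a study from nine item ratings (0 = low risk, 1 = high risk)."""
    if len(item_ratings) != 9 or any(r not in (0, 1) for r in item_ratings):
        raise ValueError("expected nine ratings, each 0 or 1")
    total = sum(item_ratings)
    if total <= 3:
        return "low"
    if total <= 6:
        return "moderate"
    return "high"


# Hypothetical ratings for one estimate: two items rated as high risk
category = risk_of_bias_category([0, 1, 0, 0, 0, 1, 0, 0, 0])
```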

| Reported reliability
In the meta-analysis, one type of reliability coefficient related to internal consistency was considered: Cronbach's alpha. Therefore, the latter was extracted from all the included studies.

| Transformation method
The reliability coefficients were not transformed for the meta-analytic integration because only one type of reliability coefficient was analysed.

| Statistical model
All Cronbach's alphas presented in each publication were included in the meta-analysis. To assess heterogeneity, the I² and Q statistics were calculated. Untransformed estimates of alpha and inverse variance weighting were used. Meta-analysis was carried out using a random-effects model [...] (Ouzzani et al., 2016). All statistical analyses, including descriptive statistics (e.g., mean age of the selected sample), were performed in jamovi, version 2.3.18 (The jamovi project, 2022). For the organisation of this article, we followed the structure and data analysis strategies presented in Pentapati et al. (2020).
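To make the quantities named in this subsection concrete, the sketch below pools untransformed estimates with inverse variance weights and computes Q and I². It is an illustration under stated assumptions, not the analysis run for this review: the results reported here were obtained with restricted maximum likelihood, whereas the sketch uses the closed-form DerSimonian-Laird estimator of the between-study variance as a simpler stand-in. The function name, the input sampling variances, and the example values are hypothetical.

```python
import math


def random_effects_pool(estimates, variances):
    """Random-effects pooling with a DerSimonian-Laird tau^2 (illustrative)."""
    # Fixed-effect (inverse variance) weights and mean
    w = [1.0 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q and its degrees of freedom
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    # Between-study variance, truncated at zero
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)
    # I^2: percentage of total variability attributed to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Random-effects weights, pooled estimate and its standard error
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {"pooled": pooled, "se": se, "q": q, "tau2": tau2, "i2": i2}


# Hypothetical example: two very precise but discrepant alpha estimates
result = random_effects_pool([0.70, 0.90], [0.0001, 0.0001])
```

In this hypothetical example the two estimates disagree far more than their sampling variances would allow, so almost all of the observed variability is attributed to heterogeneity.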

| RESULTS
We identified 706 publications from Web of Science (n = 461), PubMed (n = 177), Scopus (n = 68), and other sources (n = 3). After removing the duplicates, 570 publications were screened according to the title and abstract on Rayyan. A total of 48 publications were included for full-text screening, of which 15 were excluded for the following reasons: wrong publication type (n = 7), missing outcome (authors did not respond to the sent emails within 15 days; n = 4), not being an ISI validation (n = 2), and preliminary validation (n = 2). Data extraction encompassed 33 publications (cf. Figure 1).
Some of the publications (n = 8) reported more than one Cronbach's alpha, as they administered the ISI to more than one group (e.g., insomnia and community groups). In total, 42 estimates of Cronbach's alpha were identified. The mean age among the included publications was 38.5 (SD = 16.4) years, with a range of 13.4 to 74.3 years. One should note that one publication did not report the age for the total sample (insomnia group and control groups together), although it did report Cronbach's alpha for each group separately. Cronbach's alphas ranged from 0.65 to 0.92 (cf. Table 1).
A random-effects model was fitted using the restricted maximum-likelihood method to calculate the cumulative estimates [...] (p < 0.001) (cf. Figure 2).
The funnel plot for the assessment of publication bias showed asymmetry (cf. Figure 3), which is in accordance with Egger's regression test (−3.981; p < 0.001). It is important to note that the interpretation of the funnel plot is rather subjective and has to be supported by more objective tests such as Egger's test.
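The logic behind a regression test of funnel plot asymmetry can be sketched as follows. The sketch implements the classical form of Egger's test, an ordinary least squares regression of the standardised estimate (estimate divided by its standard error) on precision (the reciprocal of the standard error), in which an intercept far from zero suggests asymmetry. This is an assumption made for illustration: statistical packages may implement the test differently (for example, within a meta-regression model), so the sketch is not expected to reproduce the value reported above. The function name and the data are hypothetical.

```python
import math


def egger_regression(effects, std_errors):
    """Classical Egger regression (illustrative); needs at least 3 studies.

    Returns the intercept, the slope and the standard error of the intercept.
    """
    n = len(effects)
    x = [1.0 / s for s in std_errors]                  # precision
    z = [y / s for y, s in zip(effects, std_errors)]   # standardised estimate
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    # Residual variance and standard error of the intercept
    ssr = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    se_intercept = math.sqrt(ssr / (n - 2) * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, se_intercept


# Hypothetical data in which the estimates grow with their standard errors
ses = [0.1, 0.2, 0.25, 0.5]
effects = [3 + 2 * s for s in ses]
intercept, slope, se_intercept = egger_regression(effects, ses)
```

The test statistic is the intercept divided by its standard error, compared against a t distribution with n − 2 degrees of freedom.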

| Study setting
The selected articles concerned either populations that were extracted from the community (did not mention any medical conditions) or from a clinical background. Within those from a clinical background, some were diagnosed with only insomnia and some had insomnia and another medical condition (which we refer to in this paper as "Clinical [Insomnia and +]"). Besides those, some presented a medical diagnosis other than insomnia (referred in this paper as

| Geographic location
Most of the reported estimates were from Asia (n = 18), followed by eleven European and ten North American estimates, and three from African countries. The highest cumulative alpha was found for North American studies (α = 0.84) and the lowest for studies from African countries (α = 0.79).

| Risk of bias assessment
The majority of the estimates presented a low risk of bias (n = 32), and those presented a higher cumulative alpha (α = 0.83) than those with a moderate risk of bias (α = 0.79) (cf. Table 2).

| DISCUSSION
The alpha coefficient is one of the most reported measures for examining internal consistency. Thus, the present reliability generalisation meta-analysis study of the ISI assesses how its reliability estimates vary through different administrations and validations with the main goal of obtaining a cumulative reliability estimate.
To calculate the cumulative Cronbach's alpha estimates, the reliability generalisation method yields a reliable measure, and it circumvents the reliability induction issue. Moreover, it is also a "powerful method with which to characterise and explore variance in score reliability" (Vacha-Haase, 1998, p.16).
The key objective of this study was to obtain an estimate of [...] Regarding the moderation analyses, the explored variables were "setting", "continent", "risk of bias", and age; however, the results yielded non-significant effects. It is important to note that the findings reported in our study show high heterogeneity; thus, the results must be interpreted with caution. This high level of heterogeneity seems to be unaffected by the examined moderator variables. Although the majority of the subgroup analyses we performed reached the minimum recommended number of studies (at least 10), we cannot rule out the hypothesis that, with a greater number of studies, these moderators might to some extent account for heterogeneity (Harrer et al., 2021).
To our knowledge, there is only one publication that has presented a meta-analytic estimate of the ISI's Cronbach's alpha (Manzar et al., 2021). In that study, the coefficient was α = 0.82, which is close to the value we found. However, one should note that the study by [...] Cronbach's alpha; consequently, it is also the most common and best-known coefficient (Padilla & Divers, 2016).
Nonetheless, there are some limitations about Cronbach's alpha.
One of them is that Cronbach's alpha is influenced by the number of items on the scale, which means the higher the number of items on the scale, the greater the coefficient. In other words, the intercorrelation between the items can be small even though the alpha can be large. By contrast, the omega coefficient is not affected by the number of items on the scale, making the measure more reliable than Cronbachʼs alpha.
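The dependence of alpha on scale length can be illustrated with the standardised form of the coefficient, alpha = k × r̄ / (1 + (k − 1) × r̄), where k is the number of items and r̄ the mean inter-item correlation. This form assumes items with equal variances and is used here only as an illustration; the function names and the numerical values are hypothetical.

```python
def standardized_alpha(k, mean_r):
    """Standardised alpha for k items with mean inter-item correlation mean_r."""
    return k * mean_r / (1 + (k - 1) * mean_r)


def mean_r_for_alpha(k, alpha):
    """Mean inter-item correlation implied by a given standardised alpha."""
    return alpha / (k - (k - 1) * alpha)


# With the same modest mean correlation, a longer scale yields a larger alpha
alpha_7_items = standardized_alpha(7, 0.2)    # about 0.64
alpha_20_items = standardized_alpha(20, 0.2)  # about 0.83
```

Under these assumptions, a mean inter-item correlation of 0.2 gives an alpha of about 0.64 for a 7-item scale but about 0.83 for a 20-item scale, which is why a large alpha does not by itself imply strongly intercorrelated items.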
A frequent misconception is that Cronbach's alpha indicates unidimensionality (i.e., it is assumed that a high value of alpha indicates the unidimensionality of the item set), although it is unwise to link the concept of unidimensionality with Cronbach's alpha without knowing whether the set of items measures a single construct. Additionally, regarding the notion of internal consistency, alpha depends on the average correlation between item responses, and even when alpha is high, that average may rest on some correlations that are zero or close to zero (Adamson & Prion, 2013; Hayes & Coutts, 2020).
In conclusion, this study provides one of the first meta-analytic studies involving the ISI, suggesting that the measure presents a reliable internal consistency across clinical and community samples.
Future research concerning internal consistency and reliability measures should integrate more robust coefficients, such as composite reliability, and explore other moderator variables that help to account for the high heterogeneity observed. In addition, since Cronbach's alpha rests on two major assumptions, as already outlined, it is important that future studies take this information into account and include it in meta-regression analyses (i.e., studies that actually check these assumptions vs. the ones that do not).

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.