Reliability and validity of three quality rating instruments for systematic reviews of observational studies
Version of Record online: 15 SEP 2011
Copyright © 2011 John Wiley & Sons, Ltd.
Research Synthesis Methods
Volume 2, Issue 2, pages 110–118, June 2011
How to Cite
Hootman, J. M., Driban, J. B., Sitler, M. R., Harris, K. P. and Cattano, N. M. (2011), Reliability and validity of three quality rating instruments for systematic reviews of observational studies. Res. Synth. Method, 2: 110–118. doi: 10.1002/jrsm.41
- Issue online: 29 SEP 2011
- Manuscript Accepted: 18 AUG 2011
- Manuscript Revised: 1 JUN 2011
- Manuscript Received: 2 NOV 2010
Keywords: instrument psychometrics; research methods; systematic review
Abstract
This study assessed the inter-rater reliability, validity, and inter-instrument agreement of three quality rating instruments for observational studies. Inter-rater reliability, criterion validity, and inter-instrument agreement were assessed for three quality rating scales, the Downs and Black (D&B), Newcastle–Ottawa (NOS), and Scottish Intercollegiate Guidelines Network (SIGN), using a sample of 23 observational studies of musculoskeletal health outcomes. Inter-rater reliability was moderate to good for the D&B (intraclass correlation coefficient [ICC] = 0.73; CI = 0.47 to 0.88) and the NOS (ICC = 0.52; CI = 0.14 to 0.76), and poor for the SIGN (κ = 0.09; CI = −0.22 to 0.40). The NOS was not statistically valid (p = 0.35), whereas the SIGN was statistically valid (p < 0.05), with medium to large effect sizes (f² = 0.29–0.47). Inter-instrument agreement estimates were κ = 0.34 (CI = 0.05 to 0.62) for D&B versus SIGN, κ = 0.26 (CI = 0.00 to 0.52) for SIGN versus NOS, and κ = 0.43 (CI = 0.09 to 0.78) for D&B versus NOS. Reliability and validity vary considerably across the quality rating scales used to assess observational studies in systematic reviews.
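As context for the κ (Cohen's kappa) agreement statistics reported in the abstract, below is a minimal sketch of how kappa is computed for two raters assigning categorical quality ratings. The ratings shown are hypothetical illustration data, not taken from the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters.

    kappa = (p_observed - p_expected) / (1 - p_expected), where
    p_expected is the agreement expected from each rater's marginal
    category frequencies.
    """
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed proportion of items on which the raters agree
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the two raters' marginal distributions
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_exp = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical quality ratings ("high"/"low") for 10 studies
a = ["high", "high", "low", "high", "low", "low", "high", "low", "high", "low"]
b = ["high", "low", "low", "high", "low", "high", "high", "low", "low", "low"]
print(round(cohens_kappa(a, b), 2))  # → 0.4
```

Here the raters agree on 7 of 10 studies (p_obs = 0.7) while chance alone predicts 0.5, giving κ = 0.4, which would fall in the same "fair to moderate" band as the inter-instrument estimates reported above.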