Reliability and validity of three quality rating instruments for systematic reviews of observational studies

Authors

  • Jennifer M. Hootman,

    Corresponding author
    • Division of Adult and Community Health, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, GA, USA
  • Jeffrey B. Driban,

    1. Biokinetics Research Laboratory, Athletic Training Division, Department of Kinesiology, Temple University, Philadelphia, PA, USA
    2. Division of Rheumatology, Tufts Medical Center, Boston, MA, USA
  • Michael R. Sitler,

    1. Biokinetics Research Laboratory, Athletic Training Division, Department of Kinesiology, Temple University, Philadelphia, PA, USA
  • Kyle P. Harris,

    1. Biokinetics Research Laboratory, Athletic Training Division, Department of Kinesiology, Temple University, Philadelphia, PA, USA
  • Nicole M. Cattano

    1. Department of Sports Medicine, West Chester University of Pennsylvania, West Chester, PA, USA

Jennifer M. Hootman PhD, ATC, FACSM, FNATA, Division of Adult and Community Health, Centers for Disease Control and Prevention, 4770 Buford Highway NE, Mailstop K-51, Atlanta, GA 30341, USA.

E-mail: jhootman@cdc.gov

Abstract

To assess the inter-rater reliability, validity, and inter-instrument agreement of three quality rating instruments for observational studies. Inter-rater reliability, criterion validity, and inter-instrument agreement were assessed for three quality rating scales, the Downs and Black (D&B), Newcastle–Ottawa (NOS), and Scottish Intercollegiate Guidelines Network (SIGN), using a sample of 23 observational studies of musculoskeletal health outcomes. Inter-rater reliability was moderate to good for the D&B (intraclass correlation [ICC] = 0.73; CI = 0.47–0.88) and NOS (ICC = 0.52; CI = 0.14–0.76), and poor for the SIGN (κ = 0.09; CI = −0.22 to 0.40). The NOS was not statistically valid (p = 0.35), whereas the SIGN was statistically valid (p < 0.05) with medium to large effect sizes (f2 = 0.29–0.47). Inter-instrument agreement estimates were κ = 0.34, CI = 0.05–0.62 (D&B versus SIGN), κ = 0.26, CI = 0.00–0.52 (SIGN versus NOS), and κ = 0.43, CI = 0.09–0.78 (D&B versus NOS). Reliability and validity vary considerably across quality rating scales used to assess observational studies in systematic reviews. Copyright © 2011 John Wiley & Sons, Ltd.
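The κ statistics reported in the abstract measure chance-corrected agreement between two raters. As a minimal sketch of how such a statistic is computed, the following Python function implements Cohen's kappa for categorical ratings; the rating data and category labels are hypothetical, not taken from the study.

```python
from collections import Counter

def cohens_kappa(rater1, rater2):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    assert len(rater1) == len(rater2) and rater1, "need paired, non-empty ratings"
    n = len(rater1)
    # Proportion of items on which the two raters agree
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Agreement expected by chance, from each rater's marginal category frequencies
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[k] * c2.get(k, 0) for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical quality ratings from two independent reviewers
r1 = ["high", "low", "high", "acceptable", "low", "high"]
r2 = ["high", "low", "acceptable", "acceptable", "low", "low"]
print(round(cohens_kappa(r1, r2), 2))  # → 0.52
```

Values near 0 (such as the SIGN's κ = 0.09) indicate agreement little better than chance, while values above roughly 0.4 are conventionally read as moderate agreement.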
