
Reliability and construct validity of the compatible MRI scoring system for evaluation of elbows in haemophilic children

Authors


Andrea S Doria, Department of Diagnostic Imaging, Hospital for Sick Children, 555 University Avenue, Toronto, Ontario M5G 1X8.
Tel.: 416-813-6079; fax: 416-813-5638;
e-mail: andrea.doria@sickkids.ca

Abstract

Summary. We assessed the reliability and construct validity of the Compatible MRI scale for evaluation of elbows, and compared the diagnostic performance of MRI and radiographs for assessment of these joints. Twenty-nine MR examinations of elbows from 27 boys with haemophilia A and B [age range, 5–17 years (mean, 11.5)] were independently read by four blinded radiologists on two occasions. Three centres participated in the study (Toronto, n = 24 examinations; Atlanta, n = 3; Cuiaba, n = 2). The number of previous joint bleeds and severity of haemophilia were the reference standard measures. The inter-reader reliability of MRI scores, measured by the intraclass correlation coefficient (ICC), was substantial (ICC = 0.73) for the additive (A)-scale and excellent (ICC = 0.83) for the progressive (P)-scale. The intra-reader reliability was excellent for both P-scores (ICC = 0.91) and A-scores (ICC = 0.93). The total P- and A-scores correlated poorly (r = 0.36) or moderately (r = 0.54), but positively, with clinical-laboratory measurements. The total MRI scores demonstrated high accuracy for discrimination of presence or absence of arthropathy [P-scale, area-under-the-curve (AUC) = 0.94 ± 0.05; A-scale, AUC = 0.89 ± 0.06], as did the soft tissue scores of both scales (P-scale, AUC = 0.90 ± 0.06; A-scale, AUC = 0.86 ± 0.06). For discrimination of severe disease, both P-MRI scores (AUC = 0.83 ± 0.09) and A-MRI scores (AUC = 0.87 ± 0.09) demonstrated high accuracy, but neither showed diagnostic ability to discriminate mild disease. Similar results were noted for radiographic scales. In conclusion, both MRI scales demonstrated substantial to excellent reliability and accuracy for discrimination of presence/absence of arthropathy and of severe/non-severe disease, but poor to moderate convergent validity for total scores and non-diagnostic discriminant validity for mild/non-mild disease. Compared with radiographic scores, MRI scales did not perform better for discrimination of severity of arthropathy.
