The only established system to grade subchondral bone attrition in knee osteoarthritis (OA) has low interobserver reliability. In this study, our aim was to convert this system into a reliable tool for the assessment of subchondral bone loss in knee OA.
Templates that were designed to outline the normal contours of the knee were overlaid onto conventional radiographs of a random sample of 100 knees of OA patients who were awaiting total knee replacement (TKR). Seventy-five films from individuals with chronic knee pain who were not awaiting TKR and 75 films from asymptomatic control subjects were also assessed. Bone loss was graded from 0 (no attrition) to 3 (severe attrition of >10 mm); other established radiologic features were also graded. Spearman's rho was used to determine the correlation of attrition scores with other features, and logistic regression was used to explore whether definite bone attrition was associated with night pain.
The inter- and intraobserver reliability values were high for attrition scores and for the presence of definite attrition (score ≥2). Bone attrition was evident in 62% of films from patients awaiting TKR, in 9% of films from individuals with chronic knee pain who were not awaiting TKR, and in 1% of films from controls. In all groups, the correlation between attrition and other features was weak to moderate. There was a nonsignificant association between definite bone attrition and night pain.
Bone attrition is an additional dimension of knee OA that can be measured reliably. Definite attrition may be associated with night pain.
Recent evidence suggests that the subchondral bone is important for both generation of pain and progression of osteoarthritis (OA) (1–3). Scintigraphic and magnetic resonance imaging studies have emphasized the importance of subchondral bone changes (2, 4, 5), but these methods cannot be used for routine assessments. Pathology studies of surgical specimens have shown that subchondral bone changes, including attrition or loss of bone, are common in persons with advanced OA (6). Attempts have been made to assess subchondral cysts and sclerosis as potential pathologic measures of the bone in conventional radiographs, but these are hampered by low intra- and interobserver reliability (7).
In 1968, the Swedish radiologist Ahlbäck introduced a scoring system (8) that could be used to grade the loss of subchondral bone in OA on conventional radiographs. This scoring system is based on the following criterion: the presence or absence of bone defects that give an impression of having been caused by attrition of the articular surfaces, with defects classified as <5 mm, 5–10 mm, or >10 mm. Later, Ahlbäck and Rydberg recommended a 5-stage grading system (9), in which grade 1 = joint space narrowing, grade 2 = loss of joint space, grade 3 = loss of joint space with bone defects of <5 mm, grade 4 = loss of joint space with bone defects of 5–10 mm, and grade 5 = loss of joint space with bone defects of >10 mm. This system was never fully described or validated but has been widely used, particularly by orthopedic surgeons (10–12). In 2001, the International Society of Arthroscopy, Knee Surgery and Orthopaedic Sports Medicine recommended that this system be used for selection of cases for total knee replacement (TKR) (13). A subsequent study, however, found the interobserver reliability of the scoring system to be low and concluded that the Ahlbäck 5-stage grading system cannot be relied on for TKR case selection (14).
Bone loss may be difficult to detect in the absence of clear defects of cortical integrity. We hypothesized that we could optimize intra- and interobserver reliability of the original Ahlbäck 3-grade approach through the use of templates designed to outline the hypothetical normal contours of the knee joint overlaid onto conventional knee radiographs (Figure 1). Using a random sample of radiographs from consecutive patients awaiting TKR for OA, we set out to determine the intra- and interobserver reliability of this approach, to explore whether bone attrition may be an additional dimension of severity in advanced OA, to determine whether scores for bone attrition are less likely to show a ceiling effect in advanced OA than would other scores of radiologic features, and to explore whether bone loss may be associated with the severity of pain and the occurrence of night pain. In subsequent random samples of radiographs from individuals with chronic knee pain who were not awaiting TKR and from asymptomatic control subjects, we explored distributions of scores and correlations between different features of OA.
PATIENTS AND METHODS
Selection of samples and data collection.
We assessed a random sample of 100 weight-bearing anteroposterior (AP) knee radiographs from consecutive patients awaiting TKR for OA at the Glenfield Hospital National Health Service Trust in Leicester, UK, between August 1987 and December 1996. During this period, patients awaiting TKR routinely underwent standardized radiography of the index knee and responded to interviewer-administered questions on the intensity of global knee pain (rated on a Likert scale of 0–4) and the occurrence of night pain. Films were selected using computer-generated random numbers by an individual unrelated to the study, who was blinded to the characteristics of the patients. Subsequently, we selected 2 random samples, each comprising 75 films, from participants of a cross-sectional study that had taken place in Somerset and Avon, UK, between November 1993 and December 1995 (15, 16). One sample comprised 75 films from individuals who had undergone clinical examination for chronic knee pain but were not awaiting TKR, and the other comprised 75 films from individuals who had undergone clinical examination because of hip pain but had not reported pain in or around either knee during the previous 12 months.
Assessment of radiographs.
All films were processed and assessed in a blinded manner. We developed templates of the outline of the contours of the knee joint from standing AP knee radiographs that were considered to be normal. These could be overlaid onto the knee radiographs of the study subjects to determine the presence of bone attrition, defined as a vertical loss of bone volume in the affected condyle (Figure 1). Alignment of the normal contours of the femur and tibia allowed measurement of the extent of bone attrition separately for the femoral condyles and tibial plateaus. Three different sizes of the template were used for knees of small, medium, or large dimension. Based on Ahlbäck's original suggestion, we graded attrition on a scale of 0–3 (0 = no attrition; 1 = attrition of doubtful significance [<5 mm]; 2 = definite attrition of a moderate degree [5–10 mm]; 3 = severe attrition [>10 mm]). Using a standard atlas (17), we then rated the worst osteophytes and joint space narrowing on a grading scale of 0–3 (0 = none; 1 = minute, of doubtful significance; 2 = definite, of a moderate degree; 3 = severe), and the presence or absence of cysts and sclerosis for both parts of the tibiofemoral joint and the overall tibiofemoral joint. Finally, we assigned Kellgren/Lawrence (K/L) grades of global radiologic severity on a scale of 0–4 (0 = no features of OA; 1 = minute osteophytes of doubtful significance; 2 = definite osteophytes, no definite joint space narrowing; 3 = definite joint space narrowing of a moderate degree; 4 = severe joint space impairment) (18).
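The attrition scale described above can be written as a small lookup, shown here as a minimal Python sketch. The function names are hypothetical, and treating a measured loss of 0 mm as "no attrition" is an assumption: in the study, grades were assigned by observers reading the template overlay, not computed from a recorded measurement.

```python
def attrition_grade(bone_loss_mm: float) -> int:
    """Map vertical bone loss in mm to the 0-3 attrition grade.

    Cutoffs follow the text: <5 mm = 1, 5-10 mm = 2, >10 mm = 3.
    Assumption: no measurable loss (0 mm or less) is graded 0.
    """
    if bone_loss_mm <= 0:
        return 0  # no attrition
    if bone_loss_mm < 5:
        return 1  # attrition of doubtful significance
    if bone_loss_mm <= 10:
        return 2  # definite attrition of a moderate degree
    return 3      # severe attrition


def is_definite_attrition(grade: int) -> bool:
    """Definite attrition is a grade of 2 or higher (bone loss of 5 mm or more)."""
    return grade >= 2


# Hypothetical measurements, used only to exercise the cutoffs.
example_grades = [attrition_grade(mm) for mm in (0, 3, 5, 10, 12)]
```

With these cutoffs, a loss of exactly 5 mm or exactly 10 mm falls into grade 2, which matches the "5–10 mm" band given for definite attrition of a moderate degree.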
One investigator (SW), who was blinded to each patient's clinical information, performed a single assessment of all radiographs from patients awaiting TKR. A random sample of 20 of these films was also assessed twice in random order by 2 independent observers (SW and PAD). Another observer (SR) then read all radiographs from the participants of the cross-sectional study. These films were mixed with 20 radiographs from patients about to undergo TKR. The observer was blinded to the characteristics of each individual and was unaware of the numbers of films selected from different sources that contributed to the total sample of 170 films being assessed.
We used intraclass correlation coefficients to determine the inter- and intraobserver reliability of the K/L scores and the scores of attrition, osteophytes, and joint space narrowing (19), and kappa statistics for the inter- and intraobserver reliability of scoring for the presence or absence of definite attrition, osteophytes, and joint space narrowing (defined as a score ≥2) as well as sclerosis and cysts (20). We then determined the correlation between scores of different radiologic features using Spearman's rank correlation coefficient. Finally, we used linear and logistic regression models to explore whether definite bone attrition (score ≥2, corresponding to a bone loss of ≥5 mm) was associated with global pain intensity and the occurrence of night pain in patients awaiting TKR, adjusted for the presence of definite osteophytes and joint space narrowing (scores ≥2). For these calculations, we used robust standard errors, which allowed for correlation within participants who had undergone bilateral knee replacement. In exploratory analyses, we determined the distribution and correlation of scores for individual radiologic features in individuals with chronic knee pain who were not awaiting TKR and in asymptomatic control subjects. All analyses were performed with Stata, version 8.2 (Stata, College Station, TX).
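As an illustration of the agreement statistic for a dichotomized feature, the following Python sketch computes an unweighted Cohen's kappa for the presence of definite attrition (score ≥2) as rated by two observers. The scores are hypothetical and the helper name `cohen_kappa` is ours; the analyses in the study were run in Stata, and the exact kappa variant used there is not specified beyond the cited reference.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Observed agreement: share of items on which the raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions, summed.
    expected = sum(
        (ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical attrition scores (0-3) from two observers on 10 films.
scores_observer_1 = [0, 1, 2, 3, 1, 0, 2, 1, 0, 2]
scores_observer_2 = [0, 1, 2, 2, 1, 0, 1, 1, 0, 2]

# Dichotomize at the "definite attrition" threshold (score >= 2).
definite_1 = [s >= 2 for s in scores_observer_1]
definite_2 = [s >= 2 for s in scores_observer_2]

kappa_definite = cohen_kappa(definite_1, definite_2)
```

In this made-up example the observers disagree on one film out of ten, so observed agreement is 0.90, chance agreement is 0.54, and kappa is about 0.78.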
RESULTS
All selected radiographs were included in the analysis. The primary sample from patients awaiting TKR included 53 right knees and 47 left knees from 97 patients (1 man and 2 women had undergone bilateral knee replacements). The 97 patients, of whom 61 (63%) were female, had a mean ± SD age of 72 ± 8 years. Five radiographs had missing attrition scores: 3 because of poor technical quality and 2 because gross varus deformities made it impossible to use the templates. Scores of joint space narrowing and K/L grades were unavailable for 1 film because the radiograph was not obtained with the knee in the weight-bearing position, and information on the intensity of global knee pain and the occurrence of night pain was missing for another knee. The sample from individuals with chronic knee pain who were not awaiting TKR included 40 right knees and 35 left knees from 75 participants who had a mean ± SD age of 67 ± 12 years, 42 (56%) of whom were women. Scores were available for all features in all participants. The sample from asymptomatic controls included 39 right knees and 36 left knees from 75 individuals who had a mean ± SD age of 60 ± 13 years, 51 (68%) of whom were women. One film could not be assigned scores because the knee was severely rotated.
Table 1 presents the intraclass correlation coefficients and kappa values for the intra- and interobserver reliabilities of the scores for individual radiologic features in patients about to undergo TKR. The reliability values for the presence or absence of attrition and for the degree of bone loss were high and comparable with those yielded for other scores of radiologic features. Interobserver reliabilities for determination of the presence of sclerosis and cysts were generally low.
Table 1. Intra- and interobserver reliability for scores of radiologic features on a random sample of 20 films from patients with knee osteoarthritis awaiting total knee replacement*
Columns: Medial tibiofemoral joint; Lateral tibiofemoral joint; Tibiofemoral joint overall (table values not preserved).
* Values are intraclass correlation coefficients for scores of severity or kappa values for presence or absence of feature (95% confidence interval). Range of scores 0–3, except for Kellgren/Lawrence (K/L) grades, which had a range of 0–4. Attrition, osteophytes, and joint space narrowing were considered to be present if scores were higher than 1.
Some bone attrition (score ≥1) was present in 59 (62%) of 95 assessable radiographs from patients awaiting TKR, and definite attrition (score ≥2, corresponding to ≥5 mm bone loss) was present in 22 (23%) of the radiographs. The medial tibial plateau was affected most frequently (45%), followed by the medial femoral condyle (22%), the lateral tibial plateau (10%), and the lateral femoral condyle (7%). Twenty-five radiographs were scored as showing subchondral sclerosis, but only 2 radiographs were considered to show subchondral cysts.
In participants with chronic knee pain who were not awaiting TKR, some attrition was present on 7 films (9%) and definite attrition on 1 film (1%). Five films were scored as showing sclerosis (7%) and none was found to show subchondral cysts. Among asymptomatic controls, only 1 film had evidence of some attrition (1%) and 1 film had evidence of sclerosis (1%), but none of the control films were reported to show subchondral cysts.
In the analysis of trends in scores of individual features across groups, the median bone attrition score was 1 in patients awaiting TKR (interquartile range [IQR] 0–3), 0 in participants with chronic knee pain who were not awaiting TKR (IQR 0–1), and 0 in asymptomatic controls (IQR 0–0) (P for trend < 0.001). Similarly, median scores for osteophytes were 3 (IQR 1–3), 0 (IQR 0–2), and 0 (IQR 0–1); median scores for joint space narrowing were 3 (IQR 1–3), 0 (IQR 0–3), and 0 (IQR 0–1); and median K/L grades were 4 (IQR 2–4), 0 (IQR 0–4), and 0 (IQR 0–2) among patients awaiting TKR, individuals with chronic knee pain who were not awaiting TKR, and asymptomatic controls, respectively (P < 0.001 for each).
In patients awaiting TKR, scores for osteophytes, scores for joint space narrowing, and K/L grades showed a ceiling effect: osteophytes were graded 2 in 36% of radiographs and 3 in 54%, joint space narrowing was graded 2 in 41% of films and 3 in 52%, and K/L grades of 3 and 4 were assigned to 41% and 52% of radiographs, respectively. No ceiling effect was found for scoring of attrition in this group of patients (39% of radiographs were scored 0, 38% were scored 1, 19% were scored 2, and only 4% were scored 3). Table 2 presents the correlation between scoring of the different radiographic features in the 3 samples. The correlation between attrition scores and scores of other features was consistently weak to moderate. Correlations were strongest between scores for joint space narrowing and K/L grades in patients awaiting TKR, and between scores for the presence of osteophytes and K/L grades in the other 2 samples of films.
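One simple way to make the ceiling effect concrete is to add up the share of films that fall into the two highest grades of each scale. The Python sketch below does this with the percentages reported in the paragraph above. Using the "top two grades" as the summary is our assumption for illustration, not a definition given in the study.

```python
# Percentage of films from patients awaiting TKR in the two highest grades
# of each scale, as reported in the text (second-highest, highest).
top_two_grades_pct = {
    "osteophytes": (36, 54),            # grades 2 and 3
    "joint space narrowing": (41, 52),  # grades 2 and 3
    "K/L grade": (41, 52),              # grades 3 and 4
    "attrition": (19, 4),               # grades 2 and 3
}

# Combined share of films in the top two grades of each feature.
share_in_top_two = {
    feature: sum(percentages)
    for feature, percentages in top_two_grades_pct.items()
}

for feature, share in share_in_top_two.items():
    print(f"{feature}: {share}% of films in the top two grades")
```

By this summary, 90% to 93% of films sit in the top two grades for osteophytes, joint space narrowing, and K/L grade, compared with 23% for attrition, which leaves the attrition scale room to discriminate among advanced cases.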
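Table 2. Correlation between scores of individual radiologic features in tibiofemoral joints on 100 films from patients about to undergo total knee replacement (TKR), 75 from individuals with chronic knee pain not awaiting TKR, and 75 from asymptomatic controls*

Patients about to undergo TKR
Attrition vs. osteophytes: 0.33 (0.14, 0.50)
Attrition vs. joint space narrowing: 0.37 (0.18, 0.54)
Attrition vs. K/L grade: 0.37 (0.18, 0.54)
Osteophytes vs. joint space narrowing: 0.33 (0.15, 0.50)
Osteophytes vs. K/L grade: 0.33 (0.15, 0.50)
Joint space narrowing vs. K/L grade: 1.00 (0.99, 1.00)

Individuals with chronic knee pain
Attrition vs. osteophytes: 0.33 (0.11, 0.52)
Attrition vs. joint space narrowing: 0.37 (0.15, 0.55)
Attrition vs. K/L grade: 0.35 (0.14, 0.54)
Osteophytes vs. joint space narrowing: 0.61 (0.44, 0.73)
Osteophytes vs. K/L grade: 0.92 (0.88, 0.95)
Joint space narrowing vs. K/L grade: 0.80 (0.69, 0.87)

Asymptomatic controls
Attrition vs. osteophytes: 0.26 (0.03, 0.46)
Attrition vs. joint space narrowing: 0.39 (0.18, 0.57)
Attrition vs. K/L grade: 0.25 (0.02, 0.45)
Osteophytes vs. joint space narrowing: 0.39 (0.18, 0.57)
Osteophytes vs. K/L grade: 0.93 (0.88, 0.95)
Joint space narrowing vs. K/L grade: 0.39 (0.18, 0.57)

* Values are Spearman's rank correlation coefficients (95% confidence intervals).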
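The text does not state how the 95% confidence intervals for the rank correlations were obtained. A common approximation is the Fisher z-transformation, sketched below in Python. The function name `fisher_ci` is hypothetical, and applying the standard error 1/sqrt(n − 3) to Spearman's coefficient is an assumption, because it is strictly derived for Pearson's correlation.

```python
import math


def fisher_ci(r: float, n: int, z_crit: float = 1.959964):
    """Approximate confidence interval for a correlation via Fisher's z.

    r      : correlation coefficient, strictly between -1 and 1
    n      : number of paired observations (must exceed 3)
    z_crit : standard normal quantile (default gives a 95% interval)
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                     # transform to the z scale
    half_width = z_crit / math.sqrt(n - 3)
    # Back-transform both limits to the correlation scale.
    return math.tanh(z - half_width), math.tanh(z + half_width)


# A coefficient of 0.33 with the 95 assessable radiographs from patients
# awaiting TKR gives limits that round to 0.14 and 0.50.
low, high = fisher_ci(0.33, 95)
print(f"0.33 ({low:.2f}, {high:.2f})")
```

For a coefficient of 0.33 and 95 films, this approximation gives limits that round to 0.14 and 0.50, in line with the interval reported for the first correlation in patients about to undergo TKR. That agreement is suggestive only and does not confirm the method used.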
The reported intensity of global pain in patients about to undergo TKR was high, with a mean ± SD rating of 2.9 ± 0.6 on the 0–4 scale. In linear regression, definite attrition was not associated with pain intensity; the difference in global pain intensity scores between knees with and those without definite attrition was 0.1 (95% confidence interval [95% CI] −0.1, 0.4; P = 0.33). Night pain was reported to be present in 87 (88%) of 99 knees with available data. In univariable logistic regression, there was a nonsignificant but potentially relevant association of night pain with the presence of definite bone attrition (odds ratio 4.2, 95% CI 0.5, 34.9; P = 0.18). We found no evidence of an association of night pain with definite osteophytes (odds ratio 0.7, 95% CI 0.1, 6.2) or definite joint space narrowing (odds ratio 0.7, 95% CI 0.1, 3.4). The association of night pain with definite attrition remained identical after adjusting for osteophytes and joint space narrowing in a multivariable model.
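To show why an odds ratio of about 4 can still have a confidence interval that spans 1, the Python sketch below computes an odds ratio and a Wald interval from a 2×2 table. The counts are hypothetical and are not taken from the study, which does not report the underlying table. They were chosen only so that the odds ratio equals 4.2. The study itself used logistic regression with robust standard errors, so its interval differs somewhat.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z_crit: float = 1.959964):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z_crit * se_log_or)
    upper = math.exp(log_or + z_crit * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: knees with definite attrition (21 with night pain,
# 1 without) versus knees without definite attrition (60 with, 12 without).
or_est, ci_low, ci_high = odds_ratio_wald_ci(21, 1, 60, 12)
print(f"OR {or_est:.1f} (95% CI {ci_low:.1f}, {ci_high:.1f})")
```

When one cell holds a single knee, its reciprocal dominates the standard error, which is why the interval in this illustration runs from roughly 0.5 to 34 despite the sizeable point estimate.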
DISCUSSION
Our novel approach for assessing bone attrition yielded substantial intra- and interobserver reliability for assessment of knee radiographs from patients awaiting TKR. A consistently low to moderate correlation of attrition with other features indicated that bone attrition is a distinct dimension of OA severity, different from osteophyte formation and joint space narrowing. Unlike other features of OA, bone attrition did not show a ceiling effect in advanced disease. Finally, exploratory analyses suggested that definite attrition may be associated with the occurrence of night pain.
Our study has several limitations. First, intraobserver reliability may have been overestimated due to the recall of certain OA patterns. Second, our study was based on the readings of 3 assessors only, and therefore our estimates of agreement may not be generalizable to other studies. Third, the analysis of the association of radiologic features with global pain intensity or night pain was based on individuals awaiting TKR only. Larger studies in unselected groups of OA patients are required to reliably determine the association of bone attrition with pain.
Several reports in orthopedic and rheumatology journals have described the use of Ahlbäck's system without making clear how the method was applied or attempting to assess its reliability (10–12). One study compared Ahlbäck and Rydberg's 5-point system with K/L grades in a community-based sample of subjects with knee pain (21), and found a high concordance. In contrast to the 5-point scores (9), our approach allows the assessment of bone defects as a distinct entity, without assuming that they occur only when the joint space has been obliterated. Unsurprisingly, the correlation between K/L grades and our attrition scores was low to moderate.
Radiographic assessment of knee OA is widely used in clinical research and practice. Until recently, the K/L system was frequently used, particularly in epidemiologic studies. Over the last decade, more emphasis has been given to individual radiographic features (17, 22). However, subchondral bone changes remained difficult to assess because of low interobserver reliabilities (7) and low rates of recording of definite changes.
Ahlbäck (8) suggested that bone attrition is an important radiographic feature that could be measured relatively easily. We took up that suggestion and used templates to assess the extent of bone loss. As predicted by Ahlbäck, we found that bone attrition was common in advanced knee OA, but uncommon in individuals with less advanced disease and in asymptomatic controls. Our approach may therefore allow further classification of severe radiographic OA, when scores for other features cannot discriminate because of a ceiling effect. Since the clinical survival of prostheses may be related to the surrounding bone environment, this classification may be used, for example, to determine whether preoperative subchondral bone attrition is associated with an increased likelihood of prosthesis loosening. Subchondral bone changes may also be important in symptom generation and may correlate better with symptoms than would other radiographic features (2, 23). Thirty-seven years after the original concept was presented by Ahlbäck, we confirm that bone attrition is an additional radiographic dimension of knee OA, which can be measured reliably using templates that outline the hypothetical normal contours of the joint. The assessment of this dimension will be useful primarily in monitoring the advanced stages of knee OA.
We are indebted to Sarah Brookes for obtaining the random sample, and to Pete Shiarly for database programming. The Department of Social Medicine at the University of Bristol is the lead center for the British Medical Research Council Health Services Research Collaboration.