The natural history of radiographic knee osteoarthritis: A fourteen-year population-based cohort study

Abstract

Objective

To establish the natural history of radiographic knee osteoarthritis (OA) over 14 years in a community-based cohort.

Methods

We examined women from the Chingford Women's Study, a community-based cohort followed up for more than 14 years. We selected women for whom bilateral radiographs of the knees (with the legs in full extension) were obtained at approximately 5-year intervals. Radiographs were scored for OA in a blinded manner, using Kellgren/Lawrence (K/L) grades. Descriptive statistics and odds ratios (ORs) were used to compare the incidence, worsening, and progression of radiographic knee OA.

Results

A complete radiography series was available for 561 of the original 1,003 subjects enrolled in the study. The median age of these subjects at baseline was 53 years (interquartile range 48–58 years). At baseline, 13.7% of the subjects had radiographic knee OA (K/L grade ≥2) in at least one knee, and the prevalence increased to 47.8% by year 15. The annual cumulative incidence of radiographic knee OA was 2.3% between baseline and year 15. The annual rates of disease progression and worsening between baseline and year 15 were 2.8% and 3.0%, respectively. Subjects with a K/L grade of 1 at baseline were more likely to experience worsening by year 15 compared with subjects with a baseline grade of 0 (OR 4.5, 95% confidence interval 2.7–7.4).

Conclusion

This is the longest natural history study of radiographic knee OA to date. The results showed relatively low rates for the incidence and progression of radiographic knee OA; more than half of all subjects had no radiographic evidence of knee OA over a 15-year period of time. Subjects with a baseline K/L grade of 1 were more likely than subjects with other baseline K/L grades to experience worsening of knee OA.

Knee osteoarthritis (OA) is one of the leading health burdens; in 2004 alone, the cost for knee replacement in the US was $14.6 billion (1). This dollar amount does not address the additional expenses associated with pain management, loss of work due to disability, and various treatment options such as physiotherapy and revision surgery. The economic burden of OA is increasing; 54% more knee replacements were performed in 2004 compared with 4 years earlier, and this number is estimated to increase to 1.4 million by 2015 (1). The trend has been further substantiated in a long-term study based in the UK, where the rate of knee replacements tripled between 1991 and 2006 (2). Because of the increasing health burden due to the aging population and a projected 45% lifetime risk of symptomatic knee OA developing, there is an urgent need to understand the natural course of knee OA in order to target preventative therapies and reduce known risk factors for both the incidence and progression of knee OA (3).

Plain film radiography is the diagnostic imaging technique used most commonly to evaluate knee OA. Although other imaging modalities such as magnetic resonance imaging (MRI) are being assessed within the research community, their advantage over radiographic assessment in clinical practice remains uncertain (4).

Research into the natural history of radiographic OA (ROA) of the knee has focused primarily on the incidence and progression in symptomatic subjects (5–8) and the progression of disease in older cohorts (9–11). Community-based studies have been limited by followup times that varied from 3 years to 12 years (9–14), and few data are available for followup times extending beyond this period. Previous studies have also focused only on baseline and followup radiography data. In those studies, the reported annual incidence rates ranged from 2% to 4% (9, 11, 12), the rates of ROA were significantly higher in women than in men, and the rates of incident ROA were twice as high as the rates of incident symptomatic OA (9).

The aim of this study was to assess the long-term prevalence, incidence, and progression of mild and moderate ROA of the knee in a well-described population-based cohort of middle-aged women; to compare incident unilateral and bilateral disease and progressive unilateral ROA using data for 5-year intervals; and to assess the changes in individual Kellgren/Lawrence (K/L) grades (15) over 14 years. This research will establish a long-term natural history of ROA of the knee.

SUBJECTS AND METHODS

Subjects.

The participants were selected from the Chingford Women's Study, a prospective population-based longitudinal study of osteoporosis and OA. All women derived from the register of a large general practice in Chingford, North London, UK, who were between the ages of 45 years and 64 years were contacted in 1988–1989 and asked to participate. Of the 1,353 women contacted, 1,003 (78% response rate) attended the baseline visit; due to the 2-year recruitment period, the actual age range of these women was 44–67 years.

Clinic visits included administration of detailed questionnaires regarding musculoskeletal symptoms, physical evaluations, and knee radiography; pertinent risk factors for OA such as physical activity, smoking, and age were ascertained using a nurse-administered questionnaire. Height and weight measurements were obtained by staff members, and these data were used to calculate the body mass index (BMI) of the subjects. Groupings according to BMI were based on the World Health Organization categories, with a normal BMI defined as <25 kg/m², overweight defined as 25 kg/m² to <30 kg/m², and obese defined as ≥30 kg/m². Pain was evaluated using data from the baseline clinic visit assessing the presence/absence of current pain in each knee. Much of this information was evaluated repeatedly over the course of the prospective study. Information about comorbidities such as rheumatoid arthritis (RA) and fractures was collected, and total knee replacements (TKRs) were confirmed by general practice records in addition to self-report. Subjects identified as having RA (n = 6), including those who underwent TKRs due to RA (as determined from self-report), were excluded from the final analysis.

Subjects in the Chingford Study are a well-described predominantly white cohort who have been shown to be representative of women in the general UK population in terms of height, weight, and rates of hysterectomy but with a lower percentage of current smokers (16). By the year 15 clinic visit, 98 women had died, 76 had moved away, 22 were unable to be contacted, and 149 declined to attend. The study was approved by the Outer North East London Research Ethics Committee, and written consent was obtained from each woman.

Radiography protocols.

Weight-bearing anteroposterior radiographs of the knees with the legs in full extension were obtained at baseline (year 1), year 5, year 10, and year 15. Both knees of each subject who was present were radiographed by experienced radiographers, using the same equipment each year. Standardized protocols were established at baseline and used for all subsequent visits. According to these protocols, views were standardized, with the back of the knees kept in contact with the cassette and the patella centralized over the lower end of the femur. A tube-to-film distance of 100 cm was used, with the beam centered 2.5 cm below the apex of the patella (12).

Radiographic grading.

Radiographs were scored using a K/L global score (0 = normal; 1 = possible osteophyte, no joint space narrowing [JSN]; 2 = definite osteophyte, possible JSN; 3 = multiple osteophytes, definite JSN, sclerosis, and possible deformity of bone ends; 4 = large osteophytes, marked JSN, severe sclerosis, and definite deformity of bone ends) (15, 17). TKRs and partial knee replacements were identified using a combination of self-report and general practice records and were further confirmed by reviewing the original radiographs. Subjects with knee replacements were included in the final analysis and were coded separately. Radiographs were read individually by year (unpaired), with blinding regarding subject identity and symptoms. The baseline and year 5 radiographs were read by the same 2 observers (TS and DH), and a single reader (DH) read the year 10 and year 15 radiographs. As previously reported, reproducibility of the radiographic grading system was confirmed using films from 50 women (100 knees), with observers reading the films 3 weeks apart. Kappa values for intraobserver reproducibility were 0.88 (95% confidence interval [95% CI] 0.87–0.89) and 0.79 (95% CI 0.78–0.80). Interobserver reproducibility was high, with a kappa value of 0.80 (95% CI 0.79–0.81) (18).

Statistical analysis.

The present analysis was conducted using a subset of the Chingford cohort, which included only subjects for whom complete knee radiographs obtained at baseline, year 5, year 10, and year 15 were available as well as data for all pertinent baseline characteristics. Due to the inherent limitations of complete case analysis, a post hoc available-case analysis was performed when possible to check for dropout bias. Subjects with knee replacements were included in the analysis and placed in the groups “K/L grade 2 or above” and “ROA” unless they were explicitly listed separately. The baseline age, BMI, and K/L grade of subjects lost to followup were compared with those of subjects selected for this study. Because none of the continuous variables had normal distributions, Mann-Whitney U tests were used. For categorical data, Pearson's chi-square test was used except when the expected cell counts were ≤5, in which case Fisher's exact test was used.
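The rule for choosing between Pearson's chi-square test and Fisher's exact test can be sketched as follows. This is a minimal illustration only: the function names and the example counts are hypothetical and are not taken from the study, and the threshold of 5 follows the wording above.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def choose_test(table, threshold=5):
    """Name the test for this table: Fisher's exact test when any expected
    cell count is at or below the threshold, Pearson's chi-square otherwise."""
    smallest = min(min(row) for row in expected_counts(table))
    return "fisher_exact" if smallest <= threshold else "pearson_chi_square"


# Hypothetical 2 x 2 tables (rows = groups, columns = characteristic present/absent)
common = [[100, 400], [120, 300]]
rare = [[1, 500], [3, 400]]
print(choose_test(common))
print(choose_test(rare))
```

The sketch only selects the test; the test statistics themselves would come from a statistics package.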

Prevalence was calculated at both the subject level (using the “worse knee” of each subject) and the knee level (with each subject supplying 2 knees to the analysis) and was defined using a K/L grade of ≥2 as the indication of disease presence and K/L grades of 0 and 1 as the lack of disease. The “worse knee” of each subject was determined by the knee with the higher K/L grade and was used as the index knee in the analysis.
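The two prevalence definitions can be illustrated with a short sketch. The data layout (one pair of K/L grades per subject) and the example values are hypothetical; a replaced knee could be coded as any value of 2 or above, in line with the grouping described in the statistical analysis.

```python
def has_roa(grade):
    """ROA is defined as a K/L grade of 2 or above."""
    return grade >= 2


def prevalence(subjects):
    """subjects: list of (left_grade, right_grade) pairs.
    Returns (subject-level %, knee-level %)."""
    # Subject level: the knee with the higher grade is the index ("worse") knee
    worse = [max(left, right) for left, right in subjects]
    subject_level = 100 * sum(has_roa(g) for g in worse) / len(subjects)
    # Knee level: every subject contributes both knees
    knees = [g for pair in subjects for g in pair]
    knee_level = 100 * sum(has_roa(g) for g in knees) / len(knees)
    return subject_level, knee_level


# Hypothetical grades for 4 subjects
example = [(0, 0), (1, 2), (3, 2), (0, 1)]
subject_pct, knee_pct = prevalence(example)
print(subject_pct, knee_pct)
```

Subject-level prevalence is never lower than knee-level prevalence, because one affected knee is enough to classify the subject as having ROA.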

Incidence was calculated at both the subject level (worse knee) and the knee level and was defined by having a K/L grade of 0 or 1 at the first period of observation and a grade of ≥2 at the second period of observation. The annual cumulative incidence was calculated by dividing the incidence by the number of years under observation. Incident unilateral and bilateral disease was defined as having a K/L grade of 0 or 1 in both knees at the first observation and having a grade of ≥2 in one or both knees at the next observation point, respectively.
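A minimal sketch of these incidence definitions is given below. The function names are hypothetical. The worked figure uses the subject-level counts given in the Results (484 subjects without ROA at baseline, of whom 293 remained free of disease at year 15) and assumes 14 years of observation, because baseline is labeled year 1.

```python
def incident_status(first, second):
    """first, second: (left, right) K/L grades at two consecutive observations.
    Returns None when the subject already had ROA and so was not at risk."""
    if max(first) >= 2:
        return None
    affected = sum(grade >= 2 for grade in second)
    return ("none", "unilateral", "bilateral")[affected]


def annual_cumulative_incidence(n_incident, n_at_risk, years):
    """Cumulative incidence (%) divided by the number of years observed."""
    return 100 * n_incident / n_at_risk / years


print(incident_status((0, 1), (2, 1)))
print(incident_status((0, 0), (2, 3)))

# 484 at risk, 293 still free of ROA at year 15, 14 years assumed
rate = annual_cumulative_incidence(484 - 293, 484, 14)
print(round(rate, 1))
```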

Progression was calculated at the knee level and was defined as having a K/L grade of ≥2 at the first period of observation and showing an increase of at least one K/L grade by the second period of observation. At the subject level, progression was defined as unilateral disease at the first period of observation and bilateral disease at the second period of observation.

Worsening was calculated at the knee level and was defined as an increase of at least one K/L grade from any starting grade (including grades 0 and 1). The group with worsening therefore includes incident cases, knees with disease progression, and knees with mild worsening that moved from a K/L grade of 0 to a K/L grade of 1.
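The knee-level definitions of incidence, progression, and worsening overlap, which the following sketch makes explicit. It is an illustration under stated assumptions: the function name is hypothetical, and worsening is read as an increase of at least one grade, as in the Results.

```python
def classify_change(first, second):
    """Labels that apply to one knee between two observations,
    given its K/L grade at each observation."""
    labels = set()
    if second > first:
        labels.add("worsening")            # any increase, from any grade
        if first <= 1 and second >= 2:
            labels.add("incident")         # crossed the ROA threshold
        if first >= 2:
            labels.add("progression")      # increase in a knee that already had ROA
    return labels


for pair in [(0, 1), (1, 3), (2, 3), (3, 3)]:
    print(pair, sorted(classify_change(*pair)))
```

A knee moving from grade 0 to grade 1 is counted as worsening only, which is why the worsening group is larger than the incident and progression groups combined.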

The development of incident ROA at each time point (year 5, year 10, and year 15) from a baseline K/L grade of 1 was compared with baseline K/L grades of 0 by calculating percentages (with 95% CIs) for each. This was stratified by quartiles of age (<50 years, 50–54 years, 55–60 years, and >60 years). Differences between groups were assessed by chi-square tests.

Odds ratios (ORs) were used to compare incident ROA at year 15 among subjects with a baseline K/L grade of 0 with that among subjects with a baseline K/L grade of 1 and to assess the odds of subjects with each baseline K/L grade progressing to TKR by year 15. These ORs were calculated by generalized estimating equation logistic regression models in order to account for clustering due to each subject contributing 2 knees to the analysis. The baseline characteristics (age, BMI, pain, and smoking status) of subjects in whom unilateral or bilateral disease developed were compared using logistic regression models. Finally, cross-tabulation was used to assess individual K/L grades at baseline and year 15. Statistical analysis was carried out using Stata version 10 (19) and SPSS version 17.0 (20).
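For orientation, a crude odds ratio with a Wald-type 95% CI on the log scale can be computed from a 2 × 2 table as sketched below. This is not the generalized estimating equation approach used in the study: it ignores the clustering of 2 knees within each subject, so its results would not be expected to match the reported ORs. The function name and the counts are hypothetical.

```python
import math


def odds_ratio(exposed_events, exposed_total, ref_events, ref_total, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2 x 2 table.
    Assumes all four cells are nonzero."""
    a, b = exposed_events, exposed_total - exposed_events
    c, d = ref_events, ref_total - ref_events
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: 20 of 50 exposed knees and 100 of 500 reference knees with the outcome
estimate, lower, upper = odds_ratio(20, 50, 100, 500)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```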

RESULTS

The baseline median age of the subjects with complete followup was 53 years (interquartile range [IQR] 48–58 years). Of the original 1,003 women who were seen at baseline, 970 women underwent radiography at baseline, 831 had radiographs obtained at year 5, 819 had radiographs obtained at year 10, and 613 had radiographs obtained at year 15. Five hundred sixty-one women underwent radiography at all 4 visits and had complete demographic data recorded at baseline. Four hundred forty-two women had incomplete followup data and were excluded from the complete case analysis. Subjects lost to followup were slightly older than those with complete followup (P < 0.0001) and were more likely to be current smokers. In addition, the percentage of subjects with knee pain was slightly higher in the group lost to followup compared with the group with complete followup (33.5% versus 28.5%) (Table 1).

Table 1. Baseline characteristics of the entire cohort, subjects with complete followup, and subjects lost to followup*

| Characteristic | Entire cohort (n = 1,003) | Complete followup (n = 561) | Lost to followup (n = 442) |
|---|---|---|---|
| Age, median (IQR) years | 54.0 (49.0–60.0) | 53.0 (48.0–58.0) | 56.0 (50.0–61.0)† |
| BMI, median (IQR) kg/m² | 24.8 (22.6–27.6) | 24.7 (22.7–27.3) | 25.1 (22.6–28.2) |
| BMI <25 kg/m² | 51.4 | 53.7 | 48.4 |
| Current smoker | 22.8 | 19.3 | 27.4‡ |
| K/L grade in worse knee | | | |
| Grade 0 | 79.2 | 80.9 | 76.8 |
| Grade 1 | 6.1 | 5.4 | 7.1 |
| Grade 2 | 9.6 | 9.8 | 9.3 |
| Grade 3 | 4.9 | 3.9 | 6.1 |
| Grade 4 | 0.1 | 0.0 | 0.2 |
| TKR of worse knee | 0.2 | 0.0 | 0.5 |
| Knee pain | 30.7 | 28.5 | 33.5§ |

* Except where indicated otherwise, values are the percent. A body mass index (BMI) of <25 kg/m² was considered normal. IQR = interquartile range; K/L = Kellgren/Lawrence; TKR = total knee replacement.
† P < 0.0001 versus subjects with complete followup.
‡ P < 0.007 versus subjects with complete followup.
§ P < 0.052 versus subjects with complete followup.

The prevalence of ROA (worse knee having a K/L grade of ≥2) was 13.7% at baseline, 23.9% at year 5, 36.4% at year 10, and 47.8% at year 15. Among all knees (n = 1,122), the prevalence of ROA was 9.5% at baseline, 17.5% at year 5, 27.5% at year 10, and 38.6% at year 15. Interval rates of the annual cumulative incidence at the subject level (worse knee) were 3.0% between baseline and year 5, 3.4% between year 5 and year 10, and 3.9% between year 10 and year 15, with 39.5% of subjects developing incident ROA in at least one knee between baseline and year 15. At the knee level, the annual cumulative incidence also increased steadily between each 5-year period, with increases of 2.3% between baseline and year 5, 2.6% between year 5 and year 10, and 3.3% between year 10 and year 15, and incident ROA developed in 32.5% of knees between baseline and year 15.

The annual cumulative incidence between baseline and year 15 was 2.3% at the knee level and 2.8% per year for the worse-knee subject-level analysis. Among the 106 knees with a K/L grade of ≥2 at baseline, 38.7% had disease progression between baseline and year 15. Approximately 12% of all knees showed worsening (increase of at least one K/L grade) between baseline and year 5, 23.4% showed worsening between year 5 and year 10, and 23.8% showed worsening between year 10 and year 15. When only baseline and year 15 data were analyzed, 41.5% of knees showed worsening by at least one K/L grade. A sensitivity analysis using all available knees at each time point did not show any significant differences compared with the complete case analysis (data not shown).

When the 5-year cumulative incidence of ROA was examined at each time point and age was stratified into quartiles (<50 years, 50–54 years, 55–60 years, and >60 years), a linear trend (P < 0.002) was evident, with the oldest age group having the highest percentage of incident ROA (Figure 1). By year 15, incident ROA had developed in 26.0% of subjects who were younger than age 50 years at baseline, 34.1% of subjects who were 50–54 years of age at baseline, 31.7% of those ages 55–60 years at baseline, and 42.2% of subjects who were older than age 60 years at baseline. The difference between the youngest and oldest age groups was significant (P < 0.01), although the difference between the 2 middle-aged groups was not (P = 0.584). When age was stratified into 2 age bands (<55 years and ≥55 years), the difference between incident ROA in the 2 groups was statistically significant (P = 0.017). When the 5-year cumulative incidence was analyzed according to BMI category, the percentage of knees with incident ROA at year 5 was roughly similar between the groups. By year 10 and year 15, however, the cumulative incidence among obese subjects was almost 20% higher than that among subjects in both the normal and overweight categories (Figure 2). No difference in the incidence of ROA between subjects who were premenopausal and those who were postmenopausal at baseline was observed (P = 0.193).

Figure 1.

Point estimates and 95% confidence intervals (95% CIs) for the cumulative percentage of women with incident radiographic osteoarthritis (ROA) at each visit, stratified by baseline age group (n = 1,016 knees).

Figure 2.

Point estimates and 95% confidence intervals (95% CIs) for the cumulative percentage of women with incident radiographic osteoarthritis (ROA) at each visit, stratified by baseline body mass index (BMI) category (n = 1,016 knees).

Cross-tabulation of individual K/L grades and TKRs at baseline and year 15 (Table 2) demonstrated that 51.3% of 1,122 knees had a K/L grade of 0 throughout the study period, while 41.5% of knees worsened by at least one grade. Among the knees with a K/L grade of ≥1 at baseline (n = 167), 37.1% remained at the same grade, and 51.5% worsened (including progression to TKR) by year 15. Knees with a baseline K/L grade of 1 (n = 61) had a higher percentage of worsening (73.8%) compared with knees with any other K/L grade at baseline. Knees with a baseline K/L grade of 2 (n = 76) were the next most likely to worsen, with 47.4% increasing by at least one K/L grade (including progression to TKR) over 15 years; 1.7% of all knees were scored as having regressed to a lower K/L grade by year 15. Ten (1.1%) of 955 knees with a baseline K/L grade of 0 progressed to TKR by year 15, compared with 3 (4.9%) of 61 knees with a K/L grade of 1 at baseline, 4 (5.3%) of 76 knees with a K/L grade of 2 at baseline, and 2 (6.7%) of 30 knees with a K/L grade of 3 at baseline. Among knees with baseline K/L grades of 0, 1, 2, and 3, the respective percentages with pain at baseline were 21.5%, 19.7%, 39.5%, and 26.7%.

Table 2. Cross-tabulation of baseline and year 15 K/L grades in 1,122 knees*

| Baseline K/L grade | Year 15: grade 0 | Year 15: grade 1 | Year 15: grade 2 | Year 15: grade 3 | Year 15: grade 4 | TKR at year 15 |
|---|---|---|---|---|---|---|
| 0 (n = 955) | 575 (60.2) | 95 (10.0) | 157 (16.4) | 116 (12.2) | 2 (0.2) | 10 (1.1) |
| 1 (n = 61) | 12 (19.7) | 4 (6.6) | 24 (39.3) | 18 (29.5) | 0 (0.0) | 3 (4.9) |
| 2 (n = 76) | 0 (0.0) | 1 (1.3) | 39 (51.3) | 32 (42.1) | 0 (0.0) | 4 (5.3) |
| 3 (n = 30) | 1 (3.3) | 1 (3.3) | 4 (13.3) | 19 (63.3) | 3 (10.0) | 2 (6.7) |

* Values are the number (%), with percentages calculated by row. K/L = Kellgren/Lawrence; TKR = total knee replacement.
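The knee-level summary percentages quoted in the text can be recovered from the counts in Table 2. The sketch below is an illustration rather than the authors' code: the variable and function names are hypothetical, and TKR is treated as a state worse than grade 4, in line with the grouping of replaced knees with ROA.

```python
# Rows: baseline K/L grade 0-3. Columns: year 15 grade 0, 1, 2, 3, 4, then TKR.
COUNTS = {
    0: [575, 95, 157, 116, 2, 10],
    1: [12, 4, 24, 18, 0, 3],
    2: [0, 1, 39, 32, 0, 4],
    3: [1, 1, 4, 19, 3, 2],
}


def count(baseline_grades, keep):
    """Sum the cells (baseline grade b, year 15 column j) for which keep(b, j) is true."""
    return sum(n for b in baseline_grades
               for j, n in enumerate(COUNTS[b]) if keep(b, j))


def pct(part, whole):
    return round(100 * part / whole, 1)


all_knees = count(COUNTS, lambda b, j: True)
worsened = count(COUNTS, lambda b, j: j > b)        # TKR column (j = 5) counts as worse
regressed = count(COUNTS, lambda b, j: j < b)
at_risk = count([0, 1], lambda b, j: True)          # no ROA at baseline
incident = count([0, 1], lambda b, j: j >= 2)
with_roa = count([2, 3], lambda b, j: True)         # ROA at baseline
progressed = count([2, 3], lambda b, j: j > b)

print("worsening  ", pct(worsened, all_knees))
print("regression ", pct(regressed, all_knees))
print("incidence  ", pct(incident, at_risk))
print("progression", pct(progressed, with_roa))
```

Dividing the worsening, incidence, and progression percentages by 14 years gives approximately 3.0%, 2.3%, and 2.8% per year.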

When rates of worsening for all knees (n = 1,122) were analyzed, the largest number of TKRs by year 15 occurred in knees with a baseline K/L grade of 0. The odds of having a TKR by year 15 were nevertheless higher for knees with baseline K/L grades of 1–3 than for those with a baseline grade of 0 (for grade 1, OR 4.7 [95% CI 1.0–22.2]; for grade 2, OR 5.9 [95% CI 1.9–18.2]; for grade 3, OR 4.6 [95% CI 0.3–65.3]). Among subjects who underwent a TKR by year 15, 52.4% reported having pain at the baseline visit.

Although a K/L grade of 1 is not considered diagnostic of ROA, when the data were stratified by an initial baseline K/L grade of 0 or 1 (Figure 3), the difference between groups in the cumulative incidence was significant at each visit (P < 0.001). The odds of a subject with a baseline K/L grade of 1 developing incident ROA by year 15 were 4.5-fold (95% CI 2.7–7.4) the odds for a subject with a baseline K/L grade of 0. When the data were stratified by age groups, these differences remained significant (P < 0.05) at all visits and in all age groups except at year 5 in the group of subjects >60 years of age (OR 2.8, 95% CI 0.9–9.0 [P = 0.085]).

Figure 3.

Point estimates and 95% confidence intervals (95% CIs) for the cumulative percentage of women with incident radiographic osteoarthritis (ROA) at each visit, stratified by baseline Kellgren/Lawrence (K/L) grades of 0 or 1 (n = 1,016 knees).

The prevalences of unilateral and bilateral ROA at baseline were 8.6% and 5.2%, respectively (Table 3), and at year 15 were 18.4% and 29.4%, respectively. Among the 484 subjects without ROA at baseline, 293 (60.5%) remained free of disease at year 15. By comparison, 14 (29.2%) of 48 subjects with unilateral disease at baseline and 28 (96.6%) of 29 subjects with bilateral disease at baseline had the same disease status at year 15. Among the subjects in whom either incident unilateral or bilateral disease developed within any 5-year followup period (n = 200), bilateral disease developed in 32.0%, and unilateral disease developed in 68.0%. Among the subjects in whom unilateral disease developed between baseline and year 10 (n = 143), 56.6% had progressed to bilateral disease by year 15.

Table 3. Cross-tabulation of radiographic knee OA status at baseline versus year 5, year 5 versus year 10, and year 10 versus year 15*

| Radiographic knee OA | None | Unilateral | Bilateral |
|---|---|---|---|
| Baseline versus year 5 | | | |
| None (n = 484) | 426 (88.0) | 42 (8.7) | 16 (3.3) |
| Unilateral (n = 48) | 1 (2.1) | 30 (62.5) | 17 (35.4) |
| Bilateral (n = 29) | 0 (0.0) | 0 (0.0) | 29 (100.0) |
| Year 5 versus year 10 | | | |
| None (n = 427) | 355 (83.1) | 51 (11.9) | 21 (4.9) |
| Unilateral (n = 72) | 2 (2.8) | 44 (61.1) | 26 (36.1) |
| Bilateral (n = 62) | 0 (0.0) | 5 (8.1) | 57 (91.9) |
| Year 10 versus year 15 | | | |
| None (n = 357) | 287 (80.4) | 43 (12.0) | 27 (7.7) |
| Unilateral (n = 100) | 6 (6.0) | 56 (56.0) | 38 (38.0) |
| Bilateral (n = 104) | 0 (0.0) | 4 (3.8) | 100 (96.2) |

* Values are the number (%), with percentages calculated by row. Rows show status at the earlier visit and columns show status at the later visit. OA = osteoarthritis.
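The split between incident unilateral and incident bilateral disease, and the interval rates of progression from unilateral to bilateral disease, follow directly from the counts in Table 3. The sketch below is illustrative; the variable names are hypothetical.

```python
# For each interval: counts of (none, unilateral, bilateral) at the later visit,
# keyed by status at the earlier visit (Table 3).
INTERVALS = {
    "baseline to year 5": {"none": (426, 42, 16), "unilateral": (1, 30, 17), "bilateral": (0, 0, 29)},
    "year 5 to year 10": {"none": (355, 51, 21), "unilateral": (2, 44, 26), "bilateral": (0, 5, 57)},
    "year 10 to year 15": {"none": (287, 43, 27), "unilateral": (6, 56, 38), "bilateral": (0, 4, 100)},
}

# Incident disease: no ROA at the earlier visit, ROA in one or both knees at the later visit
incident_unilateral = sum(t["none"][1] for t in INTERVALS.values())
incident_bilateral = sum(t["none"][2] for t in INTERVALS.values())
total_incident = incident_unilateral + incident_bilateral
share_bilateral = round(100 * incident_bilateral / total_incident, 1)

# Progression: unilateral at the earlier visit, bilateral at the later visit
unilateral_to_bilateral = {
    name: round(100 * t["unilateral"][2] / sum(t["unilateral"]), 1)
    for name, t in INTERVALS.items()
}

print(total_incident, share_bilateral)
print(unilateral_to_bilateral)
```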

When the cumulative incidence was stratified by age quartiles for baseline and year 15 data, a significant difference was observed between the youngest (age <55 years) and oldest (age >60 years) age groups (P = 0.003), although adjacent age groups were not significantly different from one another (P = 0.642). Baseline characteristics were compared between subjects who remained free of disease over the 15-year study period and subjects who experienced progression to unilateral or bilateral disease.

All subjects who experienced disease progression were more likely to have a higher BMI, and those in whom bilateral disease developed (from no disease or unilateral disease) were more likely to be older. Subjects in whom unilateral disease developed and those who experienced progression from unilateral to bilateral disease were more likely to have pain at baseline compared with subjects in whom ROA did not develop.

DISCUSSION

The novel findings of this research were as follows: the annual rates of disease incidence, progression, and worsening between baseline and year 15 were 2.3%, 2.8%, and 3.0%, respectively; there are 3 potential symmetry-based phenotypes for knee ROA (incident unilateral, incident bilateral, and progressive unilateral to bilateral disease); and although the risk of TKR was associated with an increasing baseline K/L grade, the majority of knees that underwent a total replacement by year 15 had a baseline K/L grade of 0.

More than half of the subjects (52.2%) remained free of radiographic knee OA over the course of the study. At the year 15 visit, 38.6% of knees had prevalent ROA, compared with 9.5% of knees at baseline. Annual rates of knee progression (2.8%) and worsening (3.0%) between baseline and year 15 were slightly lower than those observed in other community-based cohorts, which were 3.5–8.0% for progression (8, 9, 11) and 4.4% for worsening (11). Rates in established symptomatic cohorts were similar, varying between 3.3% and 7.7% for worsening (6, 21) and from 4.0% to 8.8% for progression (7, 8). The slightly lower rates observed in the Chingford Study are likely a consequence of both the relatively young age of the cohort at the start of the study and the length of the study.

This study demonstrated an annual cumulative incidence of radiographic knee OA of 2.3% between baseline and year 15. When the data were broken down by 5-year intervals, the annual rate between baseline and year 5 was in the lower range (2.3%), with a high of 3.3% between year 10 and year 15. This is likely a result of the increasing age of the sample, with a median age of 53 years (IQR 48–58 years) at baseline and 68 years (IQR 63.5–72.5 years) by year 15. When the cumulative incidence was stratified by age quartiles, a significant difference was observed between the youngest (<55 years) and oldest (>60 years) age groups, although adjacent age groups were not significantly different from one another. Analyses conducted in primarily symptomatic cohorts have shown much higher rates of annual cumulative incidence (up to 4.0%) (8). Cohort studies and case–control studies that included both symptomatic and asymptomatic subjects were more comparable with these results, with percentages ranging between 2.0% and 2.5% (9–11). As would be expected, however, the cumulative incidence among obese subjects was almost 20% higher by year 15 than that among subjects in both the normal and overweight categories.

Assessing a cross-tabulation of individual K/L grades over 15 years, rather than using the K/L grade as a binary variable, demonstrated that knees with specific K/L grades of 1 and 3 are more likely to progress to a higher grade or to remain stable, respectively, even over a long period of time. Fewer than half of all knees (41.5%) worsened by at least one K/L grade over the 14 years of the study. The majority of subjects (68.4%) who underwent a knee replacement by year 15 did not have evidence of conventional ROA (K/L grade ≥2) at baseline, and more than half had current knee pain. This suggests that radiographs are not necessarily the optimal tool for predicting TKR as the long-term outcome in younger subjects (median age 53.0 years at baseline).

The comparison of incident ROA between subjects with baseline K/L grades of 0 and 1 extends the time line of an earlier nested case–control study within this cohort (22) and other recent research (23) that emphasizes the importance of subjects with a K/L grade of 1 being treated distinctly from those with a K/L grade of 0. The higher risk of a subject with a baseline K/L grade of 1 progressing to incident ROA (4.5-fold the odds associated with a K/L grade of 0) suggests that grades of 1 are an important indicator of longitudinal incidence. Lachance et al examined the incidence and progression of mild K/L grades over 3 years and reported that women with a K/L grade of 1 were 6.4-fold more likely than those with a K/L grade of 0 to progress to a grade of ≥2 (13).

Among the subjects in whom incident ROA developed, approximately one-third developed bilateral ROA between each clinic visit, while the remaining two-thirds developed unilateral ROA. More than one-third of the subjects with unilateral disease progressed to bilateral disease between each clinic visit, while most of the rest remained stable. The baseline characteristics of these groups (age, BMI, and pain) were different from those of subjects who remained free of ROA over the study period. These data are possibly describing 3 distinct subsets of ROA, in which some subjects have slow progression from no disease to unilateral and then bilateral ROA, while others have more rapid progression to bilateral disease within a 5-year period. This could reflect a difference between environmental factors (i.e., functional effect of having contralateral knee OA) and genetic factors (i.e., genetic predisposition to ROA) (24), although further work is required to validate these findings.

Limitations of this study include the effect of radiographic views, scoring methods, radiographic blinding, inclusion criteria, and loss to followup (which is the most common limitation of studies of this length). The standard view used for radiography of knees at the start of the Chingford Study was anteroposterior, fully extended, and weight-bearing. Although the prevailing opinion is to use standard semiflexed views (25, 26) and the more rigorous fluoroscopy-assisted positioning (27) due to underestimation of joint space narrowing in fully extended views (28), long-term studies often continue using the same radiographic protocol as that used at the baseline visit in order to more accurately evaluate change. The patellofemoral compartment is known to be an important component of “whole organ” knee OA, and the presence of patellofemoral ROA is highly associated with pain and disability (29, 30). The lack of additional views that allow imaging of the patellofemoral compartment, such as skyline and/or lateral, is a limitation of this study that should be addressed in future natural history studies of ROA.

The K/L scoring system has been the primary method used to evaluate radiographic knee OA in similar studies (5, 7–11, 13, 14) but is commonly criticized for several known limitations. The K/L system assumes a nonvalidated natural disease progression that is extremely osteophyte-centric, and also has several different “official” and modified versions, all of which are in use (15, 17, 31). Despite these negative attributes, K/L scores have a high level of reproducibility (8, 11, 32), there is a strong correlation between pain and increasing K/L grades (33, 34), and K/L grade–defined ROA is present in the majority of subjects who present with knee pain (35). Although the relationship between K/L grades and pain is by no means perfect, other imaging modalities, such as MRI, have not yet demonstrated a better specificity than plain film radiography (36).

A potential limitation of reading radiographs with the reader blinded to order is that grades may decrease over time, because they are not being read in a method that allows for the evaluation of change. The percentage of “regressive” grades in this study (1.7%) was much lower than that in studies using similar blinding methods (5.5–7.5%) (13, 22). Interval censoring was used due to the lack of exact dates for radiographic incidence, progression, and worsening for each knee. Scores were evaluated only on the date that each radiograph was obtained, so that knees in which disease progressed immediately after a visit would be recorded as not having worsened until the next clinic visit 4–5 years later. This may have contributed to an overestimation of time interval until the development of incident or progressive ROA.

Due to the original study design, the results of this study are also restricted to the natural history of ROA in white women. Although it is possible that the results could be loosely applied generally, there are known differences in prevalence, incidence, and progression between sexes and between subjects of different ancestry (23, 29, 34, 37). The high projected lifetime risk of symptomatic knee OA (45%) emphasizes the importance of using symptoms when defining OA (3). The lack of information regarding pain in this analysis limits its clinical application and should be addressed in future work.

Subjects lost to followup represent a major limitation of all long-term cohort studies, such as the Chingford Study. There is a potential for study bias due to deaths, subjects withdrawing because of disability and illness, and generally having a healthier cohort attending the followup visits. Although the baseline characteristics of subjects included in the analysis and those of subjects lost to followup were similar enough not to imply a severe bias (i.e., slightly older, more likely to smoke, and slightly more knee pain), the possibility remains that subjects lost to followup, for whatever reason, would have had significantly worse ROA than those included in this analysis. There is no way to know the potential effect of this type of bias on any study with this design. The use of complete case analysis can add further bias due to the need to exclude subjects who were not present for intermediate visits; however, only a small percentage of subjects who were present at both baseline and year 15 missed any other visits. Among the subjects for whom both baseline and year 15 data were available, 29 were excluded due to missing year 5 data, 8 were excluded due to missing year 10 data, and 3 were excluded due to missing data for both the year 5 and year 10 visits.

This analysis represents the longest natural history study of radiographic knee OA to date and is intended to provide novel data regarding the trend of individual K/L grades during 5-year intervals over 14 years. No other natural history study of this type has included intermediate radiography scores (i.e., scores obtained between the baseline visit and the followup visit). The inclusion criteria were deliberately nonrestrictive in order to gain a more accurate picture of ROA in a normal population, including analysis of the progression of mild ROA in a relatively young cohort. The Chingford Study has an extremely high response rate for a study of this length, with >50% of the original 1,003 women attending all clinic visits involving knee radiography over the 14 years of followup.

In conclusion, this study showed that the annual rates of disease incidence, progression, and worsening between baseline and year 15 were 2.3%, 2.8%, and 3.0%, respectively; that more than half of the subjects (52.2%) remained free of radiographic knee OA over 14 years; that 3 potential phenotypes exist for knee ROA based on symmetry; that knees with a baseline K/L grade of 1 had 4.5-fold higher odds of developing incident ROA compared with knees with a baseline K/L grade of 0; and that the majority of knees that had undergone total replacement by the time of the followup visit did not have ROA at baseline.

AUTHOR CONTRIBUTIONS

All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Arden had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Leyland, Hart, Javaid, Spector, Arden.

Acquisition of data. Leyland, Hart, Javaid, Goulston, Spector, Arden.

Analysis and interpretation of data. Leyland, Hart, Javaid, Judge, Kiran, Soni, Goulston, Cooper, Arden.

Acknowledgements

We would like to thank all of the participants in the Chingford Women's Study and Maxine Daniels and Dr. Alan Hakim for their time and dedication.
