Natural history of radiographic hip osteoarthritis: A retrospective cohort study with 11–28 years of followup




To evaluate the association between radiographic hip osteoarthritis (OA) and future total hip replacement (THR) due to OA or hip fracture.


We studied a cohort of individuals who had colon radiography from 1980–1997. Minimal joint space (MJS) was measured and each hip was graded for radiographic OA according to the Kellgren/Lawrence scale. Subjects were followed until the end of 2008. A Cox proportional hazards model, adjusted for age and sex, was used to evaluate factors associated with THR and hip fracture.


A total of 2,953 hips were studied (57% women). The cumulative incidence of THR was 2.5% and the cumulative incidence of hip fracture was 2.6%. For hips with radiographic hip OA (MJS of 2.5 mm or less), the cumulative incidence of THR was 16.9% and the hazard ratio (HR) for THR was 13.2 (95% confidence interval [95% CI] 8.1–21). Using Kellgren/Lawrence grading, the HR for THR was 12.9 (95% CI 7.9–21) for hips with radiographic OA compared to those without. The HR for all types of hip fracture for hips with radiographic OA (MJS of 2.5 mm or less) was 0.47 (95% CI 0.15–1.5), for intracapsular fractures was 0.29 (95% CI 0.04–2.1), and for extracapsular fractures was 0.67 (95% CI 0.16–2.8).


The risk of THR due to OA is substantially increased in patients with radiographic hip OA, regardless of symptoms, and increases with decreasing MJS. However, 11–28 years after having had radiographic hip OA, more than 4 of 5 of those having radiographic signs of hip OA had not had a THR for OA.


Risk factors for progression of hip osteoarthritis (OA) and the natural history of the disease remain controversial (1). Most of the studies published on progression of hip OA have been based on cohorts of patients with already established hip OA with clinical symptoms at study entry (2). There are only a few studies on the natural history of hip OA and progression in a community setting (3, 4), especially with clinical outcome as an end point. Many of the studies done so far have a short followup time (5, 6). Studies also vary in their end point, with some using radiographic joint space narrowing (7, 8) and others using total hip replacement (THR) (9–11) or a combination of both (3, 12) as an end point. In a recently published review of the literature, 18 studies on hip OA progression were deemed to be of acceptable quality, 15 of which came from the same 3 cohorts, and only 4 had more than 1,000 subjects (1).

Results from epidemiologic studies are often used to estimate the individual risk of disease progression and to provide estimates of future health care needs on a society level. For OA, this need is frequently expressed as a need for total joint replacement. Furthermore, the association of individual radiographic features with disease progression and THR may provide insights into OA disease mechanisms. This study was therefore undertaken to evaluate the natural history of radiographic hip OA registered at colon radiography with regard to the association with later THR due to OA or hip fracture.


Study cohort.

Our cohort consisted of patients who had colon radiographs taken in Iceland during the years 1980–1997. Only 3 radiographic departments performed colon radiography in Iceland during this period. From two of these departments we collected all examinations done between 1980 and 1997. From the third department we retrieved all available examinations, but examinations in this unit were not systematically saved and stored, resulting in incomplete retrieval. Patients were ages 35 years or older at the time of the examination. The patients were referred for radiography from 4 different hospitals, as well as from primary health care providers. They were from both rural and urban areas.

Radiographic techniques.

The double-contrast (barium enema) colon radiographs included at least two supine anteroposterior (AP) and several oblique exposures. The hip joints in this study were assessed from an AP control radiograph, which was taken with the same tube-to-film distance of 100 cm that is used in a standard AP view of the pelvis. The x-ray beam was centered on the umbilicus. Both hips were examined, and if both hips were not clearly visualized, the patient was excluded. The date of the colon examination, the age of the patient at that time, and any signs of secondary OA and hip operations were registered. Hips with congenital dislocation or dysplasia, Perthes disease, or slipped epiphysis were excluded from further analysis.

Radiographic classification.

Minimal joint space (MJS) was measured on the AP film with a ruler divided in millimeters (13). MJS less than or equal to 2.5 mm was used as a definition of hip OA unless otherwise stated (14). Global assessment of radiographic signs of OA was done according to the Kellgren/Lawrence scale (15, 16). Hips classified as Kellgren/Lawrence grade 2 (definite narrowing in the presence of definite osteophytes) or higher were defined as having OA. The presence of osteophytes, cysts, and sclerosis was also recorded.
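The two definitions of radiographic hip OA given above can be written as simple threshold rules. The following is a minimal sketch only; the function and constant names are hypothetical and chosen for illustration, while the cutoffs (MJS of 2.5 mm or less, Kellgren/Lawrence grade 2 or higher) are taken from the text.

```python
# Cutoffs taken from the text; all names are hypothetical.
MJS_OA_CUTOFF_MM = 2.5  # MJS at or below this value is classified as OA
KL_OA_MIN_GRADE = 2     # Kellgren/Lawrence grade at or above this value is classified as OA


def oa_by_mjs(mjs_mm: float) -> bool:
    """Radiographic hip OA according to the minimal joint space criterion."""
    if mjs_mm < 0:
        raise ValueError("MJS cannot be negative")
    return mjs_mm <= MJS_OA_CUTOFF_MM


def oa_by_kl(kl_grade: int) -> bool:
    """Radiographic hip OA according to the Kellgren/Lawrence criterion."""
    if kl_grade not in (0, 1, 2, 3, 4):
        raise ValueError("Kellgren/Lawrence grade must be an integer from 0 to 4")
    return kl_grade >= KL_OA_MIN_GRADE
```

Under these rules a hip with MJS of 2.5 mm is classified as OA and a hip with MJS of 3.0 mm is not, which matches the grouping used in the results tables.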

Exclusion process and followup of the cohort.

After exclusions, we had 2,953 hips that were available for followup (Figure 1). This comprised our hip cohort.

Figure 1.

The exclusion process of the study. The number within the parentheses is the number of women or female hips in each group. n = the total number in each group; OA = osteoarthritis; THR = total hip replacement.

In Iceland as in other Scandinavian countries, all persons have a unique personal identification number. This combined with a highly computerized national health system makes it possible to identify with a high degree of certainty all patients operated on for a given diagnosis or by a certain procedure in Iceland, with errors in the computer registry being less than 2% (17). The National Census and death register makes it possible to locate all Icelanders, i.e., if they are alive and where they live.

All hospitals in Iceland that have done THR surgery have their discharge registries computerized from the year 1973, and we checked these registries for all hip fractures or THR. For each participant, we registered if they had undergone THR due to OA, had a hip fracture, had THR due to other reasons, or had died during the study period. For each hip, we registered the first event that occurred. All participants were followed until the end of the year 2008. The medical records of all patients who had a registered event were checked to confirm the diagnosis. THR for OA was verified as a diagnosis of hip OA based on clinical and radiographic observations, concomitant with THR. Hip fractures were further divided as being intracapsular or extracapsular. THR due to primary hip fracture was registered as hip fracture, and in the case of THR for other reasons, the indication for operation was registered.

Statistical methods.

Age differences between groups and differences in MJS were tested with the t-test for independent samples. Spearman's rho was used to test correlations between covariates. Cox regression, adjusted for age and sex, was used to calculate the hazard ratio (HR). A frailty test was used to examine the effects of bilaterality. We considered a P value of less than or equal to 0.05 to be significant, and all tests were 2-tailed. Calculations were done using SPSS, version 16.0. For frailty calculations, we used R, version 10.2.

To further evaluate the effects of bilaterality, we constructed a cohort where each subject only supplied one hip. Individuals who already had a THR or hip fracture in either hip at the start of the study were excluded from this cohort. From each remaining individual (n = 1,455, 57.5% women), we then used the hip that first had an event; the other hip in that individual was excluded. For subjects who died before any event or had no event in either hip until the end of the study, we randomly chose the right or left hip. Calculations presented were based on the hip cohort, and the subject cohort was used for comparison (see below).
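The construction of the one-hip-per-subject cohort can be illustrated with a short sketch. This is not the code used in the study; the `Hip` record, its field names, and the fixed random seed are assumptions made for illustration, and the sketch assumes that individuals with a THR or hip fracture at baseline have already been removed.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hip:
    """Hypothetical record for one hip; field names are illustrative."""
    subject_id: int
    side: str                        # "left" or "right"
    years_to_event: Optional[float]  # None if the hip had no event during followup


def one_hip_per_subject(hips: List[Hip], seed: int = 0) -> List[Hip]:
    """Keep the hip with the earliest event for each subject, or a randomly
    chosen side when neither hip had an event."""
    rng = random.Random(seed)  # fixed seed so the random choice is reproducible
    by_subject = {}
    for hip in hips:
        by_subject.setdefault(hip.subject_id, []).append(hip)

    chosen = []
    for subject_id in sorted(by_subject):
        pair = by_subject[subject_id]
        with_event = [h for h in pair if h.years_to_event is not None]
        if with_event:
            chosen.append(min(with_event, key=lambda h: h.years_to_event))
        else:
            chosen.append(rng.choice(pair))
    return chosen
```

Selecting the hip with the first event keeps every subject-level event in the comparison cohort, while the random choice avoids favoring one side among subjects without an event.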

Observer reliability for radiographic MJS measurements was assessed by measuring 294 randomly selected hips twice by the same observer, and 174 randomly selected hips by two independent observers. Data reported in this study are for MJS measurements by a single observer (TI). The intraclass correlation coefficient for interobserver variability of assessment of MJS was 0.94 and for the Kellgren/Lawrence grading, the kappa statistic was 0.76. The corresponding intrarater correlation coefficient was 0.96 and the kappa statistic was 0.65.
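For readers unfamiliar with the kappa statistic reported for Kellgren/Lawrence grading, the sketch below computes an unweighted Cohen's kappa from two sets of grades for the same hips. Whether the study used an unweighted or weighted kappa is not stated, so the unweighted form is an assumption, and the example grades are invented for illustration.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: proportion of hips given the same grade twice.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: expected from the marginal grade frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Invented Kellgren/Lawrence grades for 10 hips read on two occasions.
first_reading = [0, 0, 1, 1, 2, 2, 3, 4, 0, 1]
second_reading = [0, 0, 1, 2, 2, 2, 3, 4, 0, 0]
kappa = cohen_kappa(first_reading, second_reading)
```

In this invented example the readings agree on 8 of 10 hips, chance agreement is 0.23, and kappa is therefore 0.57/0.77, or about 0.74.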

The study was approved by the Ethics Committee of Akureyri Central Hospital, Iceland.


Cohort characteristics.

The possible outcomes were THR due to OA, hip fracture, THR due to other reasons, death, or no event during the study period. The hip fracture group included all hip fractures, regardless of treatment. The characteristics of each outcome group are shown in Table 1. The 7 THRs done for reasons other than OA or hip fracture were due to rheumatoid arthritis (n = 3) or necrosis of the femoral head.

Table 1. Characteristics of the hip cohort according to outcome*

| Outcome | No. (%) | Age at colon radiography, mean ± SD years | Women, no. (%) | MJS, mean ± SD mm | Years to event, mean ± SD |
|---|---|---|---|---|---|
| No event | 1,900 (64.3) | 55.0 ± 10.5 | 1,152 (60.6) | 3.9 ± 0.7 | 15.5 ± 3.4 |
| Died | 894 (30.1) | 69.7 ± 10.5 | 441 (49.3) | 3.8 ± 1.0 | 8.5 ± 5.4 |
| THR due to OA | 75 (2.5) | 62.1 ± 9.3 | 46 (61.3) | 2.8 ± 1.6 | 9.9 ± 5.9 |
| Hip fracture (all) | 77 (2.6) | 70.9 ± 9.7 | 58 (75.3) | 3.8 ± 0.7 | 8.3 ± 4.7 |
| Intracapsular | 41 (1.4) | 70.3 ± 9.3 | 33 (80.5) | 4.0 ± 0.7 | 8.0 ± 4.8 |
| Extracapsular | 36 (1.2) | 71.7 ± 10.1 | 25 (69.4) | 3.6 ± 0.7 | 8.8 ± 4.6 |
| THR due to other | 7 (0.2) | 62.7 ± 11.6 | 4 (57.1) | 2.5 ± 2.4 | 7.1 ± 5.9 |
| All groups | 2,953 (100) | 60.1 ± 12.6 | 1,701 (57.6) | 3.9 ± 0.9 | 13.0 ± 5.4 |

* MJS = minimal joint space; THR = total hip replacement; OA = osteoarthritis. For years to event, in the case of no event this denotes time until the end of the study.

One hundred seventy-eight hips (6.0%) had MJS of 2.5 mm or less, and 160 hips (5.4%) had Kellgren/Lawrence grade 2 or higher (Table 2). There was no significant sex difference in OA prevalence within the cohort (P = 0.4). At the time of colon radiography, individuals with radiographic OA were on average 7.4 years older than those without (P < 0.001).

Table 2. Hip event according to radiographic OA status at baseline*

| Radiographic status | No event | THR due to OA | Hip fracture | THR due to other | Died |
|---|---|---|---|---|---|
| Not OA (MJS ≥3.0 mm) | 1,831 (66.0) | 45 (1.6) | 74 (2.7) | 4 (0.1) | 821 (29.6) |
| OA (MJS ≤2.5 mm) | 69 (38.8) | 30 (16.9) | 3 (1.7) | 3 (1.7) | 73 (41.0) |
| MJS 2.5 mm | 30 (50.8) | 3 (5.1) | 1 (1.7) | 0 (0) | 25 (42.4) |
| MJS 1.5–2.0 mm | 30 (45.5) | 7 (10.6) | 2 (3.0) | 0 (0) | 27 (40.9) |
| MJS 0–1.0 mm | 9 (17.0) | 20 (37.7) | 0 (0) | 3 (5.7) | 21 (39.6) |

* Values are the number (percentage). OA = osteoarthritis; THR = total hip replacement; MJS = minimal joint space.
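The percentages in Table 2 can be recomputed from the counts as a consistency check. The sketch below is illustrative only; the dictionary layout and function name are hypothetical, while the counts are copied from Table 2.

```python
# Counts copied from Table 2, in the column order:
# (no event, THR due to OA, hip fracture, THR due to other, died)
TABLE_2 = {
    "Not OA (MJS >= 3.0 mm)": (1831, 45, 74, 4, 821),
    "OA (MJS <= 2.5 mm)": (69, 30, 3, 3, 73),
    "MJS 2.5 mm": (30, 3, 1, 0, 25),
    "MJS 1.5-2.0 mm": (30, 7, 2, 0, 27),
    "MJS 0-1.0 mm": (9, 20, 0, 3, 21),
}
THR_OA = 1  # index of the "THR due to OA" column


def thr_cumulative_incidence(counts):
    """Percentage of hips in a row that received a THR due to OA."""
    return 100.0 * counts[THR_OA] / sum(counts)


for label, counts in TABLE_2.items():
    pct = thr_cumulative_incidence(counts)
    print(f"{label}: n = {sum(counts)}, THR due to OA = {pct:.1f}%")
```

The three MJS subgroups (59, 66, and 53 hips) add up to the 178 hips with radiographic OA, and the crude cumulative incidence of THR rises from 5.1% at MJS 2.5 mm to 37.7% at MJS 0–1.0 mm.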

THR due to OA.

Comparing hips that got THR to hips of subjects who had no event or died before the end of the study period, we found that the patients with THR had lower MJS (mean difference 1.1 mm; P < 0.0001) and were older at study entry (mean difference 2.4 years; P = 0.03), but were of similar age at the time of the event or at the end of the study in the case of no event (mean difference 0.9 years; P = 0.3). The mean ± SD age at THR in our cohort was 72.1 ± 7.5 years. The mean ± SD age at THR surgery in Iceland during the last decade is 67.7 ± 10.7 years (unpublished observations).

A total of 17% of those with radiographic hip OA had undergone THR for OA at the end of the study. The cumulative incidence of THR increased with decreasing hip MJS (Table 2).

We compared those persons with radiographic hip OA that got a THR with those that were without hip OA at baseline but nevertheless had a THR before the study end, and found that the former had a mean age at colon radiography of 65.1 years and the latter, 60.1 years (P = 0.02). The mean time to THR for the former was 7.4 years, and for the latter was 11.6 years (mean difference 4.3 years; P = 0.002). Therefore, the mean age at the time of THR was 72.8 years for the former and 71.6 years for the latter (P = 0.5). The proportion deceased in the OA group during the study period (Table 2) might indicate that OA patients have higher mortality. We therefore did a Cox regression with death as the outcome and found that the difference was not significant (HR 1.1, 95% confidence interval [95% CI] 0.88–1.4).

A receiver operator characteristic (ROC) curve analysis showed that both MJS and Kellgren/Lawrence grading were significant in predicting THR (area under the curve 0.674 and 0.664, respectively; P < 0.0001 for each). The two radiographic criteria were not significantly different from each other, as their 95% CIs overlapped (0.601–0.748 and 0.590–0.738 for MJS and Kellgren/Lawrence, respectively).
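The area under the ROC curve can be read as the probability that a randomly chosen hip that went on to THR had a more abnormal radiographic score than a randomly chosen hip that did not. The sketch below computes it directly from that definition and also checks whether the two reported 95% CIs overlap. The MJS values are invented for illustration, the function names are hypothetical, and this is not necessarily how the statistical software used in the study computes the area.

```python
def roc_auc(scores_cases, scores_controls):
    """Area under the ROC curve as the proportion of case/control pairs in
    which the case has the higher score (ties count as one half)."""
    if not scores_cases or not scores_controls:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in scores_cases:
        for control in scores_controls:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(scores_cases) * len(scores_controls))


def intervals_overlap(interval_a, interval_b):
    """True if two (lower, upper) intervals share at least one point."""
    return interval_a[0] <= interval_b[1] and interval_b[0] <= interval_a[1]


# Invented MJS values in mm. A smaller MJS indicates more severe disease,
# so the negated MJS is used as the score.
mjs_thr = [1.0, 2.5, 4.0]
mjs_no_thr = [3.0, 3.5, 4.0, 4.5]
auc = roc_auc([-m for m in mjs_thr], [-m for m in mjs_no_thr])

# 95% CIs reported in the text for MJS and Kellgren/Lawrence grading.
overlap = intervals_overlap((0.601, 0.748), (0.590, 0.738))
```

With the invented values, 9.5 of the 12 case/control pairs are ordered correctly, giving an area of about 0.79; the reported intervals for the two criteria do overlap.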

Hip fractures.

Women were overrepresented in the fracture group (relative risk 2.2, 95% CI 1.3–3.8). The mean ± SD age at hip fracture in our cohort was 79.3 ± 9.3 years, or the same as in Iceland as a whole, 79.3 ± 11.5 years (unpublished observations). A total of 2.7% (n = 74) of hips without OA and 1.7% (n = 3) of hips with radiographic OA sustained a hip fracture during followup (Table 2). Using Kellgren/Lawrence grading for the definition of OA, only one hip fracture had radiographic OA. Those with intracapsular fractures had on average 0.4 mm greater MJS than those with extracapsular hip fractures (P = 0.01). There was no difference in the mean age of subjects between these two fracture types at the time of colon radiography (P = 0.5) or at the time of fracture (P = 0.3). ROC curve analysis showed that neither MJS nor the Kellgren/Lawrence scale was significant in predicting hip fractures, whether testing all hip fractures together or intra- and extracapsular fractures separately (data not shown).
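The relative risk for women in the fracture group can be approximated from the counts in Table 1. The sketch below is a crude calculation under stated assumptions: it uses the standard log-normal approximation for the confidence interval and treats each hip as an independent observation, which ignores bilaterality. The function name is hypothetical, and the text does not state the exact method used.

```python
import math


def relative_risk(events_a, total_a, events_b, total_b, z=1.96):
    """Relative risk of group A versus group B with an approximate 95% CI
    based on the standard error of the log relative risk."""
    rr = (events_a / total_a) / (events_b / total_b)
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# From Table 1: 58 of the 77 fractured hips and 1,701 of the 2,953 hips
# belonged to women, leaving 19 fractures among 1,252 hips from men.
rr, lower, upper = relative_risk(58, 1701, 77 - 58, 2953 - 1701)
print(f"RR = {rr:.1f} (95% CI {lower:.1f}-{upper:.1f})")
```

This prints RR = 2.2 (95% CI 1.3-3.8), in agreement with the values reported above.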

Hip survival.

Age and sex were included as covariates when calculating HRs using a multivariate Cox regression model. The individual features of hip OA (femoral osteophytes, acetabular osteophytes, sclerosis, and cysts) were all significant predictors of THR by themselves. Using a composite definition of OA with MJS and any of the other individual features did not yield significantly different results than using MJS alone (data not shown). All individual features except acetabular osteophytes showed a significant correlation with the other individual features with a relatively high correlation coefficient (ρ ≥ 0.3). To avoid the problem of multicollinearity, these were not included in the final model.

For cases with radiographic OA compared to cases without radiographic OA, the HR for THR due to OA was ∼13 and highly significant, regardless of whether OA was defined as MJS 2.5 mm or less or Kellgren/Lawrence grade 2 or greater (Table 3). The more severe cases of hip OA had greater HRs (Table 4). For hip fracture, 95% CIs for HRs included 1, but were suggestive of an inverse relationship between radiographic hip OA and hip fracture (Table 3).

Table 3. HRs for cases with radiographic OA compared to cases without radiographic OA for receiving a total hip replacement or getting a hip fracture according to radiographic OA classification (Cox multivariate regression model, adjusted for age and sex)*

| Outcome | Minimal joint space ≤2.5 mm, HR (95% CI) | Kellgren/Lawrence grade ≥2, HR (95% CI) |
|---|---|---|
| Total hip replacement | 13.2 (8.1–21) | 12.9 (7.9–21) |
| Hip fracture (all) | 0.47 (0.15–1.5) | 0.17 (0.02–1.2) |
| Intracapsular | 0.29 (0.04–2.1) | No cases |
| Extracapsular | 0.67 (0.16–2.8) | 0.35 (0.05–2.6) |

* HR = hazard ratio; OA = osteoarthritis; 95% CI = 95% confidence interval. There were no cases with intracapsular hip fracture that had hip OA according to Kellgren/Lawrence classification.

Table 4. Hazard ratios for getting a total hip replacement according to radiographic osteoarthritis severity (Cox multivariate regression model, adjusted for age and sex)*

| Radiographic severity | Hazard ratio (95% confidence interval) |
|---|---|
| Minimal joint space, mm | |
| 3.0 | 1.7 (0.87–3.3) |
| 2.5 | 3.7 (1.1–12) |
| 1.5–2.0 | 9.5 (4.1–22) |
| 0–1.0 | 51 (28–93) |
| Kellgren/Lawrence grade | |
| 1 | 1.8 (0.81–3.8) |
| 2 | 8.5 (4.4–16) |
| 3 | 33 (16–68) |
| 4 | 49 (17–141) |

* For minimal joint space, 3.5 mm or greater was used as the reference group. For Kellgren/Lawrence grading, grade 0 was used as the reference group.

To explore the effect of radiographic OA over time, we excluded all hips that had an event during the first 10 years (n = 658) and did an age- and sex-adjusted Cox regression with the same parameters as before. We found that the presence of hip OA on a radiograph taken more than 10 years earlier was a significant risk factor for THR (HR 8.5, 95% CI 3.8–19).


A patient with disease in one hip is more predisposed to also having disease in the other hip, which may affect the results when both hips of a patient are evaluated. To evaluate whether bilaterality affected our findings, we did a frailty test and found that although there was an effect of bilaterality, it did not alter the results.

To further evaluate the effects of bilaterality, we constructed a model where each subject only supplied one hip. We chose the hip that first had an event and in the case of no event or death, we randomly chose the right or left hip. Using this “subject” model, the age- and sex-adjusted HR for cases with radiographic OA (MJS less than or equal to 2.5 mm) compared to cases without radiographic OA for getting a THR due to OA was 14.3 (95% CI 8.2–25).


The primary objective of this study was to ascertain the association between radiographic changes of hip OA present in the colon radiographs and the subsequent incidence of THR for OA and hip fracture during an 11–28-year followup. The cumulative incidence of THR and of hip fracture in the total cohort was approximately equal (2.5% versus 2.6%). When comparing hips with radiographic signs of OA at the index examination to those without, the cumulative incidence of THR for OA was greatly increased in the group with radiographic signs of hip OA at index examination (Figure 2), regardless of the radiographic classification system. The cumulative incidence of THR increased with radiographic severity at the index examination.

Figure 2.

The crude cumulative incidence of total hip replacement (THR) related to radiographic osteoarthritis (OA) status at index examination. The crude cumulative incidence of THR for hips with radiographic hip OA at index examination and hips without radiographic OA was calculated with Kaplan-Meier. Hip OA here is defined as minimal joint space less than or equal to 2.5 mm. The shaded areas show the 95% confidence intervals. Curves were cut off when 40 patients remained for analysis.
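The crude cumulative incidence shown in the figure is 1 minus the Kaplan-Meier estimate of remaining free of THR. The sketch below is a minimal pure-Python version of that estimator, assuming that all followup ending without THR (death, other events, or end of study) is treated as censoring. The function name and the example followup times are hypothetical, and confidence intervals are omitted.

```python
def kaplan_meier_cumulative_incidence(times, events):
    """Return [(time, 1 - S(time))] at each distinct event time.

    times  -- followup in years for each hip
    events -- True if the hip received a THR at that time, False if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all hips whose followup ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_removed += 1
            i += 1
        if n_events:
            # Hips censored at t still count as at risk at t.
            survival *= 1.0 - n_events / at_risk
            curve.append((t, 1.0 - survival))
        at_risk -= n_removed
    return curve


# Invented followup data for five hips.
curve = kaplan_meier_cumulative_incidence(
    [2, 3, 3, 5, 8], [True, False, True, True, False]
)
```

For the invented data the estimated cumulative incidence is 0.2 at 2 years, 0.4 at 3 years, and 0.7 at 5 years. Because deaths are treated as censoring in this crude approach, the estimate describes hips that would have remained under observation rather than the whole cohort.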

There were 178 hips from 129 patients with radiographic OA at baseline. Of these, 24 patients (30 hips, 19% of patients, and 17% of hips) had received a THR for OA at the end of the study. Another 58 patients (73 hips) had died, 3 patients (3 hips) had sustained a hip fracture, and 3 patients (3 hips) had received THR due to other reasons. This left 69 hips (39%) with radiographic OA at the index examination that had no event until the end of the study 11–28 years later (Table 2). These hips had been followed for mean ± SD 15 ± 3.6 years. One can only speculate why these hips have not received a THR after such a long time. We know that there is poor correlation between radiographic signs of OA and pain (3, 18–20), and that several factors affect patients' willingness to undergo joint replacement. Even with clinically severe OA, many patients are not willing to consider THR as treatment (21). Other studies indicate that there is underuse of arthroplasty for severe arthritis in both sexes, more so in women (22), and that some patients overestimate the pain and disability needed to warrant a joint replacement (23). Our cohort was based on individuals ages 35 years and older. Since surgeons might be more hesitant to offer a young person THR, we tested our calculations after excluding all individuals under the age of 50 years. This did not significantly impact our results. It is therefore possible that these individuals with radiographic OA who did not receive a THR were without pain, had pain but were not willing to undergo surgery, or were not offered surgery.

After excluding those that had an event during the first 10 years after the colon radiography, we found that the presence of radiographic hip OA was still a risk factor for THR. This suggests that some of the study participants developed their clinical disease significantly later than the radiographic changes. There were 1,900 hips without an event at the end of the study. The mean ± SD age of these subjects was 70.5 ± 10.5 years. It is therefore likely that additional subjects from this cohort will sustain a hip fracture or receive a THR during their remaining lifetime.

There were only 3 hips with radiographic OA that had a hip fracture during followup, the effect being that all results regarding radiographic hip OA and fracture in this study must be interpreted with caution. The trend shown might indicate that there is an inverse association between hip OA and hip fracture, especially intracapsular hip fracture.

Different studies have used different definitions of radiographic hip OA. Of those that use MJS, the cutoff point for OA varies. Some studies have used 2.5 mm or less (13), others 2.0 mm or less (19), or even 1.5 mm or less. The fact that having MJS of 2.5 mm yields a significant HR for THR (Table 4) is supportive of using 2.5 mm or less as a definition of radiographic hip OA.

There are limitations to this study. Possible confounders not taken into account are body mass index (BMI) and occupation. BMI has been associated with worse hip pain (3) and function, but not necessarily with greater structural progression (18). No relationship has been shown between BMI and radiographic features of hip OA (24), but there is evidence supporting BMI as a risk factor for THR (25–27). Hip fractures require a fall and this can be influenced by comorbidity and medications. OA has been found to increase the risk of falls (28, 29), and this would reduce a possible protective effect of OA gained through higher bone mass. The use of medications, such as hormone replacement therapy and bisphosphonates, affects osteoporosis, and this is unaccounted for in the present study. Furthermore, there might be other unknown confounders.

Patients who undergo colon radiography are not a random sample of the population. Subjects with symptoms of hip OA who are seen within health care more often may be more likely to be referred to colon radiography than the background population, introducing some selection bias. Obesity is linked to colon cancer (30) and these subjects may more commonly undergo colon radiography, but we are not aware of any studies on the BMI of the average patient undergoing colon radiography. The sum and direction of the aforementioned biases are difficult to ascertain.

The AP exposure from the colon radiography that we used to evaluate the hips was not done according to the same protocol as a standard pelvic radiograph used routinely to evaluate hip OA, but the difference is not great and there is good agreement between colon and hip joint radiographs in regard to both the prevalence and degree of hip OA (31, 32). Therefore, we believe that the results presented here also apply to standard pelvic radiographs.

In the present study, the prevalence of radiographic hip OA at the index examination was 6.0%, but cases that already had a THR were excluded because they had already had an end point. Therefore, this prevalence cannot be interpreted to represent the prevalence of radiographic hip OA in the population.

The cumulative overall incidence of THR was 2.5% in this cohort. There are only a few similar studies published, but in a large Norwegian cohort study where the mean age at baseline was 47 years (range 34–59 years) and the mean followup time was 9 years (range 0.1–9.5 years), the cumulative incidence was 1.3% (25). The difference in incidence between these studies may in part be explained by a lower mean age at baseline and a shorter followup period in the Norwegian study, but it is also a fact that the prevalence of radiographic hip OA (14) and THR for OA incidence (17, 33) is higher in Iceland than in other Scandinavian countries.

The cumulative THR incidence in patients with radiographic hip OA at the index examination was 17% in the present study. In a study of elderly white women recruited from the community, the cumulative THR incidence was found to be 10% after 8 years of followup (mean ± SD 8.3 ± 0.4 years) (3). However, that study is not fully comparable to the present one, as it included only women ages 55 years or older at baseline, used a different definition of OA (modified Croft grade) than the present study, and had shorter followup.

Apart from those mentioned, there are few published studies that are comparable to the present study, due to population or outcome selection. Most published studies on OA progression use radiographic progression only (6, 7) or a combination of radiographic progression and THR as a definition of progression (3, 5, 12). The rate of progression of hip OA differs between individuals and there is limited concordance between radiographic change and the amount of pain and disability (3, 19, 34–36), and even with severe radiographic change young men are often pain free (34). Therefore, THR for OA is a more relevant outcome for the practicing clinician and health care planners. However, previous studies that have used THR only as an outcome have had short followup (11, 37, 38) or few participants (10).

In studies on hip OA progression, the study participants have been either hospital based, with verified OA and pain (5–7, 9–11, 36–39), or drawn from the population (3, 12, 20, 34). The hospital-based studies have had a much higher rate of THR than the present study, which is understandable, taking into account that those patients have already presented with hip pain.

There are few longitudinal studies on the association between hip fracture and hip OA. Some have been based on self-reported hip OA and their results have been conflicting (40–42). It was shown that self-reported OA has a significantly higher prevalence than radiographic OA (43). In a study of elderly white women, no reduced risk of hip fracture could be found in individuals with radiographic hip OA or even severe radiographic hip OA (43).

In this cohort of subjects with colon radiographs, we found the cumulative incidence of THR for OA to be 2.5% and the cumulative incidence of hip fracture to be 2.6% after 11–28 years of followup. In individuals with radiographic hip OA at the index examination, the cumulative incidence of THR for OA was 17% and the cumulative incidence of hip fracture was 1.7%. The HR of a hip with radiographic OA compared to a hip without radiographic OA for getting a THR during the study period was 13.2 (95% CI 8.1–21). However, more than 4 of 5 of those with radiographic signs of hip OA at the index examination had not had a THR for OA at the end of the study 11–28 years later. The implications of these results are 2-fold. First, they enable health care planners to use studies of radiographic hip OA to more precisely predict the future need of THR. Second, they help the clinician and patient to understand that an incidental finding of radiographic hip OA entails a relatively small risk for future THR.


All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Franklin had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study conception and design. Franklin, Ingvarsson, Lohmander.

Acquisition of data. Franklin, Ingvarsson, Ingimarsson.

Analysis and interpretation of data. Franklin, Ingvarsson, Englund, Ingimarsson, Robertsson, Lohmander.