When Should the Doctor Order a Spine X-Ray? Identifying Vertebral Fractures for Osteoporosis Care: Results From the European Prospective Osteoporosis Study (EPOS)


  • Dr Reeve served as a consultant for Eli Lilly and Company and Procter & Gamble. All other authors have no conflict of interest.


Vertebral fractures are common but usually remain unrecognized in primary care. Data from 2908 women and 2653 men in the EPOS study were used to derive algorithms to indicate the need for a spine X-ray to identify a fracture using easily elicited determinants. At a sensitivity of 50% for identifying cases, the specificity was increased from 50% to 78% in women and from 50% to 72% in men compared with a random allocation of X-rays. Use of X-rays can be optimized by selecting patients at high risk using a short screening procedure.

Introduction: Previous osteoporotic fracture is an independent risk factor for further fractures and an indication for treatment. Vertebral fractures are the most common osteoporotic fractures before age 75, accounting for 48% of all fractures in men and 39% in women over 50. They usually remain unrecognized, so many patients requiring treatment are denied it, doubling their risk of a further fracture. Our objective was to develop an efficient algorithm indicating the need for an X-ray.

Materials and Methods: Data from 2908 women and 2653 men ≥50 years of age in the European Prospective Osteoporosis Study (EPOS) were analyzed. Lateral thoracic and lumbar spine radiographs were taken at baseline and at an average of 3.8 years later. Prevalent fractures were qualitatively diagnosed by an experienced radiologist. Fracture risk was modeled as a function of age, statural height loss since age 25, gender, and fracture history including limb fractures in the last 3 years using negative binomial regression. Receiver operating characteristic (ROC) curves were used to summarize a model's predictive ability, and a prediction algorithm was devised to identify those most likely to have a fracture.

Results: In a multivariate model for women, the risk of prevalent vertebral fracture significantly increased with age (RR, 1.67 [95% CI, 1.46, 1.93] per decade), statural height loss (1.06, [1.03, 1.10] per centimeter decrease), self-reported history of spine fracture (7.52 [5.52, 10.23]), and history of other major fracture (1.83 [1.46, 2.28]). Higher body weight reduced risk (0.86 [0.79, 0.95] per 10-kg increase). In men, the respective RR estimates were as follows: age (1.32 [1.18, 1.49]); height loss (1.06 [1.04, 1.09]); self-reported spine fracture (5.05 [3.69, 6.90]); other major fracture (1.42 [1.12, 1.81]); and weight (0.86 [0.79, 0.94]). Using algorithms based on these easily elicited determinants, specificity was increased from 50% to 78% in women and from 50% to 72% in men at a sensitivity of 50% compared with a random allocation of X-rays. At a sensitivity of 75%, the specificity was 50% in women and 40% in men. Inclusion of hip BMD (femoral neck or trochanter), measured in 1360 women and 1046 men, significantly improved the area under the ROC curves by 4% in women (p < 0.002) but not in men (p > 0.350). Spine BMD, measured in 982 women and 847 men, produced a significant 5% AUC improvement in women (p = 0.007) but not in men (p = 0.554).

Conclusion: A woman 65 years of age with one vertebral fracture has a one in four chance of another fracture over 5 years, which can be reduced to one in eight by treatment. Positive treatment decisions are often contingent on identifying a vertebral fracture. Selective use of lateral vertebral X-rays can be optimized using a 2-minute screening procedure administered by a nurse.


Fractures of the spine are the most common type of osteoporotic fracture; however, reviews of data from medical care surveys have indicated that only 2–12% of people with radiologically evident spine fracture(s) are identified in British primary care health services.(1–3) In the largest British survey, with data from 5,000,000 participants registered in the UK general practice research database (GPRD), van Staa et al.(1) found that the annual incidence of clinically diagnosed vertebral fracture varied from 3 per 10,000 in both men and women at 50 years of age to 23 per 10,000 in women and 10 per 10,000 in men at 80 years of age. In contrast, population-based studies that ascertained vertebral fracture in radiological surveys have found incidence rates more than 10-fold higher than these,(4) suggesting that the great majority of vertebral fractures do not come to clinical attention. In the European Prospective Osteoporosis Study of men and women >50 years of age, 39% and 48% of incident fractures confirmed radiologically were of the vertebral bodies in women and men, respectively.(4,5)

The low identification rate of these vertebral fractures should be of concern because they are associated with increased risk of future vertebral and hip fractures even after adjusting for the effects of known major risk factors for fracture such as age, weight, and BMD.(6) In addition, patients with vertebral fracture have poor scores on health-related quality-of-life measures,(7–9) which, in turn, may predispose to other co-morbid conditions and increase death rates(10–12) and the health and economic burden associated with vertebral fracture.

The standard methodology for diagnosing vertebral fracture is a radiologist's qualitative evaluation of vertebral X-ray films. Over the years, advances have been made in imaging technology and there are new methods for evaluating X-rays, including the use of semiquantitative grading scales and morphometry.(13) A more recent advance has been the introduction of morphometric X-ray absorptiometry (MXA),(14) which has the advantage that it can be undertaken at the same time as bone densitometry and involves a lower radiation dose to the patient.(15) However, possible disadvantages have been MXA's poorer precision in the mid-thoracic region and in any vertebra diagnosed as deformed on a conventional radiograph.(14,16) Thus, conventional radiography still remains the gold standard for diagnosing vertebral fracture.

Prospective population-based studies have identified potential clinical risk factors associated with increased risk of prevalent vertebral fracture. These include advancing age, height loss, gender, previous fracture history,(6) low bone mass,(17) weight, hormonal factors, and physical inactivity.(18) Despite the knowledge of these risk factors, little progress is evident in translating these findings into an effective diagnostic tool that can be useful in primary care for making objective decisions on who should have an X-ray to confirm (or exclude) the presence of prevalent vertebral fracture. Currently, this is frequently a mandatory first step in making treatment decisions.

Currently, there are wide variations in national strategies for identifying cases of osteoporosis for treatment, varying from BMD screening in the >65-year group (United States)(19) to case finding (United Kingdom) and densitometry offered to those who already have suffered an identified fracture (some other European countries). These strategies all partially fail to identify patients deserving of treatment, particularly if they have vertebral fractures, although in the United States, it seems that the more severe case whose first fracture occurs before age 65 is at most risk.

Our objective was to improve the prevention of osteoporotic fractures by developing algorithms that would indicate a high likelihood of finding a prevalent spine fracture with an X-ray. Such cases are likely to benefit from modern treatments such as the bisphosphonates, selective estrogen receptor modulators, or, in the more severe cases, teriparatide.



The study participants were recruited to the European Vertebral Osteoporosis Study (EVOS) and its successor, the European Prospective Osteoporosis Study (EPOS). Detailed descriptions of the two studies have been provided elsewhere.(17,20,21) In summary, the EVOS study aimed to quantify the prevalence of vertebral fracture across Europe, and at baseline, had recruited ∼17,342 men and women >50 years of age from 36 centers in 19 European countries. Each center had recruited a random sample of up to 300 men and 300 women from population registers stratified into six 5-year age bands: 50–54, 55–59, 60–64, 65–69, 70–74, and 75+ years. A response rate of 54% was achieved in the whole EVOS study, and 7273 participants from 31 EVOS centers took part in the follow-up vertebral fracture study, the EPOS, which aimed at ascertaining the occurrence of incident fractures. We have analyzed data from the participants who took part in the follow-up EPOS study.

Radiology and ascertainment of fractures

Details of radiology and ascertainment of fractures are provided in the Appendix. For this analysis, we have used the clinical vertebral fracture definitions by the radiologist on the second X-ray film, which models the clinical paradigm used in practice. We chose to use fractures on the second X-ray film because it allowed us to use knowledge of validated limb fractures that occurred in the 3 years preceding the second visit in addition to variables measured at baseline in our prediction models.


The EVOS questionnaire administered at baseline contained questions on demographics, medical history, fracture history, gynecological information, physical activity, and lifestyle variables. Participants were asked about their height at 25 years of age and minimum weight after 25 years of age; then, their current height and weight were measured and recorded on the questionnaire. Height loss was calculated as reported height at 25 years of age minus measured height at study entry. To assess fracture history, participants were asked if they had ever suffered from a broken bone, and if so, to give details on which bone, age at first fracture, and level of trauma experienced. The fracture type choices given were vertebral, hip, rib, forearm, and other. Trauma level was divided into spontaneous, minor, and major trauma. The reproducibility of the questionnaire has been tested and found to be acceptable.(22) In total, 7222 men and women had paired spine radiographs and completed the baseline questionnaire.

Causes of deformity other than fracture (e.g., osteoarthritis, congenital malformations) were identified using conventional radiological criteria. There were 494 participants diagnosed with such other medical conditions who were excluded, leaving 6728 men and women eligible for inclusion in the analysis. Because of incomplete questionnaires and lack of follow-up data on recent peripheral fractures in two centers, only 83% of the eligible participants furnished all the information required for this analysis.


Twenty-one of the EVOS/EPOS centers were able to measure BMD at the hip and/or the spine at baseline or during follow-up in subsamples of between 20% and 100% of their available participants using DXA. For the participants considered in this analysis, 2863 (43%) from 19 centers had hip BMD (femoral neck and/or trochanter) measurements and 2086 (31%) from 13 centers had spine BMD measurements.

Statistical analysis

Negative binomial regression (see Appendix) was used to model the expected number of prevalent vertebral fractures on the second X-ray as a function of the predictor variables. These were broadly classified as demographics (age, sex, height loss, weight); past fracture history by questionnaire (any, vertebral, hip, rib, forearm, other); validated recent peripheral fracture (any upper or lower limb, upper limb, lower limb); and BMD (femoral neck, trochanter, spine). The predictors were entered into the negative binomial regression model using a forward stepwise approach if the likelihood ratio test was significant at the 5% level. Men and women were analyzed separately. Because fewer than one-half of the participants had BMD data, the effects of demographic variables, fracture history, and recent peripheral fracture were first assessed using the larger data set, and then significant predictors from this model were forced into a model with BMD to assess if measuring BMD would be of any additional value.

Predictive ability of the models was summarized by means of receiver operating characteristic (ROC) curves. ROC curves were drawn using the linear predictor score from the variables in the optimal model to assess discrimination between different fracture risk groups, namely 0 versus at least 1 (1+), 0 versus at least 2 (2+), and 0 versus at least 3 (3+) prevalent fractures. To develop a decision criterion for ordering an X-ray that could be applied in clinical practice, the expected population scores were grouped into deciles, and measures for assessing diagnostic accuracy (sensitivity, specificity, positive predictive values, and likelihood ratios) were calculated. Two kinds of likelihood ratios were considered: (1) LR(x,y), which is the likelihood ratio comparing the probability of finding a person with a fracture in a score interval (x,y) versus the probability of finding a person without a fracture in the same score interval, and (2) LR+, which is the positive likelihood ratio associated with using a specific score category as the cut-off for defining a prevalent vertebral fracture. All analyses were performed using Stata version 8 statistical software (StataCorp).
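The two likelihood ratios defined above can be illustrated with a minimal sketch. The category counts below are hypothetical and are not EPOS data; the function names are introduced here for illustration only.

```python
from typing import List, Tuple


def interval_likelihood_ratios(fx: List[int], no_fx: List[int]) -> List[float]:
    """LR(x,y) for each score category.

    Ratio of the probability that a person WITH a fracture falls in the
    category to the probability that a person WITHOUT a fracture does.
    Assumes every category contains at least one person without fracture.
    """
    total_fx, total_no_fx = sum(fx), sum(no_fx)
    return [(f / total_fx) / (n / total_no_fx) for f, n in zip(fx, no_fx)]


def cutoff_accuracy(fx: List[int], no_fx: List[int], cutoff: int) -> Tuple[float, float, float]:
    """Sensitivity, specificity, and LR+ when categories with index >= cutoff
    are treated as test positive (i.e., referred for X-ray)."""
    true_pos, false_neg = sum(fx[cutoff:]), sum(fx[:cutoff])
    false_pos, true_neg = sum(no_fx[cutoff:]), sum(no_fx[:cutoff])
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    lr_positive = sensitivity / (1.0 - specificity)
    return sensitivity, specificity, lr_positive


# Hypothetical counts for five ordered score categories (lowest to highest).
fx_counts = [5, 10, 15, 30, 40]            # people with a prevalent fracture
no_fx_counts = [300, 250, 200, 150, 100]   # people without a fracture

lr_by_category = interval_likelihood_ratios(fx_counts, no_fx_counts)
sens, spec, lr_pos = cutoff_accuracy(fx_counts, no_fx_counts, cutoff=3)
```

LR(x,y) describes how informative a single score category is, whereas LR+ describes the whole group at or above a chosen cut-off, which is the quantity relevant to a referral rule.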


Participant characteristics

Table 1 shows the characteristics, including the number of prevalent fractures, observed in the sample of 5561 (83%) participants (2653 men and 2908 women) who had complete data used in the statistical modeling. The numbers and proportions of participants who reported a history of different types of fractures on the baseline questionnaire, and of those known to have suffered a limb fracture in the 3 years preceding the second visit, are also shown in Table 1. Comparison of characteristics of the 17% of subjects who did not have complete data for one or more of the variables used in the statistical modeling indicated that they were slightly older (1.7 years difference in both genders, p < 0.0001), weighed less (2.0 kg lighter for males only, p = 0.002), and had smaller height loss (difference = −0.39 cm in females and −0.46 cm in males, p < 0.019) than those who provided complete data. There were no significant differences in BMD at the femoral neck, trochanter, or the spine between the two groups in either gender (p > 0.184).

Table 1. Participant Characteristics

Determinants of prevalent vertebral fracture

Table 2 shows the relative risk (RR) estimates and the 95% CIs for the predictors that were statistically significant determinants of prevalent vertebral fracture in the negative binomial regression model for women and men. In both genders, the risk of prevalent vertebral fracture significantly increased with increasing values of age, height loss since 25 years of age, reported history of vertebral fracture on questionnaire, and history of other major fracture—defined as history of forearm, rib, or recent lower limb fracture. These latter fracture types, which were collapsed into “other major fracture,” were each independently predictive of prevalent vertebral fracture risk, but it was thought to be more parsimonious to use a general fracture history in the algorithm because the area under the ROC curves did not significantly differ after this simplification. Higher body weight was associated with less risk of prevalent vertebral fracture.
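Because negative binomial regression is conventionally fitted with a log link, the RR estimates combine multiplicatively. The following is a minimal sketch of that arithmetic, assuming a log-linear model without interaction terms within each gender and using the RR estimates for women reported in the abstract. The dictionary keys and the function name are hypothetical, and because the model intercept is not given here, only the risk of one profile relative to another can be computed.

```python
import math
from typing import Dict

# RR estimates for women as reported in the abstract, per stated unit.
RR_WOMEN: Dict[str, float] = {
    "age_decades": 1.67,      # per decade of age
    "height_loss_cm": 1.06,   # per centimeter of height loss since age 25
    "weight_10kg": 0.86,      # per 10-kg increase in body weight
    "hx_vertebral": 7.52,     # self-reported history of spine fracture (0/1)
    "hx_other_major": 1.83,   # history of other major fracture (0/1)
}


def relative_risk(rr: Dict[str, float], differences: Dict[str, float]) -> float:
    """Risk of one profile relative to a reference profile.

    `differences` holds the difference in each predictor between the two
    profiles, in the units of the RR table. Under a log link, the log RRs
    add, so the RRs multiply.
    """
    log_rr = sum(math.log(rr[name]) * delta for name, delta in differences.items())
    return math.exp(log_rr)


# A woman one decade older with 4 cm more height loss than the reference,
# same weight, neither with a fracture history.
example_rr = relative_risk(RR_WOMEN, {"age_decades": 1.0, "height_loss_cm": 4.0})
```

In this illustration the combined relative risk is 1.67 × 1.06⁴, that is, roughly a doubling of the expected number of prevalent vertebral fractures.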

Table 2. Determinants of Prevalent Vertebral Fracture: Model Without BMD

When we tested for interactions with gender, we found significant gender interactions with age (p = 0.007) and self-reported history of vertebral (p = 0.032) fracture on the questionnaire. The effect of age on risk of prevalent vertebral fracture was greater in women than men, and reporting of vertebral fracture on the questionnaire was associated with greater chance of prevalent vertebral fracture being found on X-ray in women than in men.

Fracture prediction

For an individual with known values of age (years), height loss in centimeters since 25 years of age (htloss), weight (kg), sex, history of vertebral fracture (hvert), and history of other major fracture (hotherfx; i.e., forearm, rib, or recent lower limb fracture), the linear predictor score was calculated from the estimated model coefficients as shown by the formulae in the Appendix. In the interest of clarity and practical ease of use, we derived a simplified score with integer coefficients for females and males as:

equation image
equation image

The Pearson correlation between the simplified linear predictor score with the score estimated from the model coefficients was 0.99 in both genders, almost completely preserving the rank ordering of individuals. Participants who had one or more prevalent fractures had higher scores on the scale. To improve applicability in clinical practice, a 10-point grading scale was derived by categorizing the expected population scores in each gender into deciles.
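The decile grading can be sketched as follows. The scores used here are hypothetical placeholders rather than values of the simplified score, and the function names are introduced for illustration; the sketch assumes that a score equal to a cut point falls in the higher category.

```python
import bisect
import statistics
from typing import List


def decile_cut_points(population_scores: List[float]) -> List[float]:
    """Nine cut points that split the population scores into ten groups."""
    return statistics.quantiles(population_scores, n=10)


def score_category(score: float, cut_points: List[float]) -> int:
    """Map a score to a category from 1 (lowest decile) to 10 (highest).

    bisect_right counts the cut points that are <= score, so a score equal
    to a cut point is placed in the higher category.
    """
    return bisect.bisect_right(cut_points, score) + 1


# Hypothetical population of scores, for illustration only.
hypothetical_scores = [float(s) for s in range(1, 101)]
cuts = decile_cut_points(hypothetical_scores)

lowest = score_category(5.0, cuts)
highest = score_category(95.0, cuts)
```

Once the cut points have been fixed in a reference population, grading a new patient needs only a lookup, which is what makes the scale practical for use in a short screening visit.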

Figure 1 shows several ROC curves derived from categories of the simplified linear predictor score in women and men. The smooth ROC curves and their 95% confidence bands were estimated by maximum likelihood,(23) assuming a binormal distribution (mixture of two normal distributions) for the latent variable. The first curve shows the discrimination of those with 1 or more (1+) prevalent fractures (n = 354 females and 411 males) from those without prevalent fracture (n = 2554 females and 2242 males). The others show discrimination of those with 2+ prevalent fractures (n = 115 females and 131 males), 3+ prevalent fractures (n = 54 females and 53 males), and 4+ prevalent fractures (n = 33 females and 21 males) from those without prevalent vertebral fracture. In women, the areas under the curves (AUCs) ranged from 0.69 (95% CI, 0.66, 0.72) for identifying those with 1+ prevalent fracture up to 0.86 (95% CI, 0.79, 0.92) for identifying those with 4+ prevalent fractures. In men, the AUCs were slightly smaller, ranging from 0.64 (95% CI, 0.61, 0.66) for identifying those with 1+ prevalent fracture to 0.82 (95% CI, 0.74, 0.91) for identifying those with 4+ prevalent fractures (Fig. 1). All the AUCs were significantly different from 0.50, indicating that the discrimination was better than could be achieved by random guessing. In predicting 1+ vertebral fracture in women, specificity was increased from 50% to 78% at a sensitivity of 50% compared with a random allocation of X-rays. At a sensitivity of 75%, the specificity was 50% in women. In men, at a sensitivity of 50%, the specificity for predicting 1+ prevalent fracture was increased from 50% to 72%. At a sensitivity of 75%, the specificity was 40% in men.
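The AUC can be read as the probability that a randomly chosen person with a fracture has a higher score than a randomly chosen person without one. The sketch below computes this empirical (nonparametric) concordance estimate directly; it is not the binormal maximum likelihood method used for the smooth curves in Fig. 1, and the scores are hypothetical.

```python
from typing import Sequence


def auc_concordance(scores_fx: Sequence[float], scores_no_fx: Sequence[float]) -> float:
    """Empirical AUC: proportion of (fracture, no-fracture) pairs in which
    the person with a fracture has the higher score; ties count one half."""
    concordant = 0.0
    for with_fx in scores_fx:
        for without_fx in scores_no_fx:
            if with_fx > without_fx:
                concordant += 1.0
            elif with_fx == without_fx:
                concordant += 0.5
    return concordant / (len(scores_fx) * len(scores_no_fx))


# Hypothetical score categories, for illustration only.
auc = auc_concordance([7, 9, 6], [3, 6, 8, 2])
```

An AUC of 0.50 corresponds to a score that ranks people no better than chance, which is the reference against which the reported AUCs were tested.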

FIG. 1. ROC curves from using deciles of the linear predictor score in each gender to discriminate between different prevalent vertebral fracture risk groups, that is, one or more (1+, n = 354 females and 411 males), two or more (2+, n = 115 females and 131 males), three or more (3+, n = 54 females and 53 males), and four or more (4+, n = 33 females and 21 males) prevalent vertebral fractures from those without prevalent vertebral fracture (0, n = 2554 females and 2242 males).

Table 3 shows the details for predicting 1+ prevalent vertebral fracture in women and men. Measures of diagnostic accuracy (sensitivity, specificity, likelihood ratios, and predictive values) were calculated to help determine scores at which the test was most informative for fracture prediction. In women, the likelihood ratio LR(x,y) comparing the probability of finding a person with prevalent fracture in a score interval (x,y) versus the probability of finding a person without prevalent fracture in the same score interval was >1 for score categories above 6, suggesting that a decision criterion based on this cut-off score category or above would be informative for predicting fracture. A cut-off of score category 6 (i.e., score ≥ 23.1) achieved a sensitivity of 65% and a specificity of 63%. The positive likelihood ratio (LR+) was 1.77, so that, irrespective of the prior odds, the posterior odds of prevalent fracture would change by a factor of 1.77 if score category 6 is adopted as a cut-off for fracture prediction in women. Based on the observed median prevalence of vertebral fracture of 14% in this population, a decision criterion based on this cut-off score of 6 would give a positive predictive value (PPV) of 22% and a negative predictive value (NPV) of 92% for predicting 1+ prevalent fractures in women. Similar calculations for the male model are shown in the bottom half of Table 3. At score category 6, where 40% of men would be eligible for X-ray, the sensitivity was 55% and the specificity was 63%, with a positive likelihood ratio of 1.46. At a higher cut-off score of category 7 (i.e., score ≥ 27.9), the positive likelihood ratio increased to 1.76 and the specificity increased to 73% at the expense of a decrease in sensitivity to 47%.

Table 3. Sensitivities, Specificities, Likelihood Ratios, and Predictive Values From Using Different Cut Points of the Simplified Linear Predictor Score to Predict One or More Prevalent Vertebral Fractures in Each Gender

Because positive and negative predictive values are dependent on disease prevalence and given that the prevalence of vertebral deformities significantly varied in Europe,(20) we also calculated the PPVs and NPVs for different values of prevalence. At the proposed cut-off score category of 6 in women, the PPV varied from 16% when the prevalence was 10% to a PPV of 38% when the prevalence was 26%. The NPV was 94% at 10% prevalence and was 84% at 26% prevalence. At the highest score category of 9, with 92% specificity, the PPV varied from 26% when the prevalence was 10% to 53% when the prevalence was 26%. The NPVs were 92% and 78% for 10% and 26% prevalence, respectively.
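The dependence of the predictive values on prevalence follows from Bayes' theorem. As a minimal sketch, the function below (its name is introduced here for illustration) recomputes the PPV and NPV at the female cut-off score category of 6 from the rounded sensitivity (65%) and specificity (63%) reported for that cut-off; small discrepancies from published values could arise from that rounding.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Female model, cut-off score category 6: sensitivity 65%, specificity 63%.
ppv_low, npv_low = predictive_values(0.65, 0.63, prevalence=0.10)
ppv_high, npv_high = predictive_values(0.65, 0.63, prevalence=0.26)
ppv_median, npv_median = predictive_values(0.65, 0.63, prevalence=0.14)
```

Rounded to whole percentages, these inputs give PPVs of 16% and 38% and NPVs of 94% and 84% at 10% and 26% prevalence, respectively, and a PPV of 22% with an NPV of 92% at the median prevalence of 14%.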

Figure 2 shows the cumulative proportions by age group in each gender that would be X-rayed in the population >50 years of age if those judged to be at greatest risk were recommended for X-ray in increments of 10%. Overall, smaller proportions of middle-aged participants would be referred for X-ray compared with older participants, with the differences in proportions being more pronounced at higher cut-off scores. Because women over 65 years of age have already been recommended for densitometry screening in the United States,(19) we evaluated the diagnostic performance of the algorithm in subjects <65 versus ≥65 years of age. As expected, because age was adjusted for in the model, there was no significant difference in the areas under the ROC curves (p = 0.767 in women, p = 0.392 in men). At the proposed cut-off score category of 6 in women, 164 of 1465 (11%) female participants <65 years of age were indicated for X-ray, and these X-rays identified 31 of the 112 (28%) women in this age group who had 1+ prevalent vertebral fractures, with a specificity of 90%. This corresponds to five X-rays per fracture case identified in women <65 years of age, the same as in women >65 years of age. In men, at score category 7, four X-rays were indicated per fracture case identified in those <65 years of age compared with five X-rays per fracture case identified in those >65 years of age. Costs could be reduced somewhat if old written records of fractures previously identified radiologically were available for reference.
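The figures for women <65 years of age can be checked from the reported counts. This is a minimal sketch that reads 112 as the number of women <65 years of age with 1+ prevalent vertebral fractures and 31 as the number of those who were indicated for X-ray, a reading that reproduces the stated 90% specificity; the function name and dictionary keys are introduced here for illustration.

```python
from typing import Dict


def screening_yield(n_screened: int, n_xrayed: int, n_with_fracture: int, n_detected: int) -> Dict[str, float]:
    """Summarize a referral rule from four counts.

    n_screened:      everyone assessed with the score
    n_xrayed:        those at or above the cut-off (indicated for X-ray)
    n_with_fracture: those in the group who truly have a fracture
    n_detected:      those with a fracture who were indicated for X-ray
    """
    n_without_fracture = n_screened - n_with_fracture
    false_positives = n_xrayed - n_detected
    return {
        "proportion_xrayed": n_xrayed / n_screened,
        "sensitivity": n_detected / n_with_fracture,
        "specificity": (n_without_fracture - false_positives) / n_without_fracture,
        "xrays_per_case": n_xrayed / n_detected,
    }


# Women <65 years of age at cut-off score category 6 (counts from the text).
women_under_65 = screening_yield(n_screened=1465, n_xrayed=164, n_with_fracture=112, n_detected=31)
```

With these counts, about 11% of the women are X-rayed, sensitivity is about 28%, specificity is about 90%, and roughly five X-rays are taken per fracture case identified.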

FIG. 2. Cumulative proportions by age group in each gender that would be X-rayed in the population >50 years of age if those judged to be at greatest risk of fracture according to the linear predictor score were to be recommended for X-ray in increments of 10% (shown as “cut-off score”).

Models with hip and spine BMD

Table 4 shows the RR estimates when hip and spine BMD measured in a smaller sample of participants was added into the model already containing the determinants that were significant using the larger data set (Table 2). BMD was entered into the model as T scores calculated using the NHANES III young normal reference data for hip BMD(24) and using normative data from a European community concerted action for spine BMD.(25) RRs were estimated for T scores calculated using young normal sex-specific reference data and also with young normal female reference data. There was a significant increase in risk of prevalent vertebral fracture per unit T score decline in femoral neck BMD after adjusting for the variables that were significant in the larger data set. The RR was 1.7 in women and ∼1.3 in men per 1 SD decrease in femoral neck BMD, with little difference seen in risk estimates using T scores calculated from sex-specific young normal reference data or using female young normal reference data. The addition of femoral neck BMD into the female model significantly improved the area under the ROC curve from 0.67 (95% CI, 0.62, 0.72) to 0.71 (95% CI, 0.67, 0.76; p = 0.002) in this smaller sample with hip BMD. In men, despite statistical significance of the risk estimate for femoral neck BMD, it did not translate to a significant improvement in the area under the ROC curve (p = 0.450) compared with prediction using age, height loss, weight, and fracture history variables. When the femoral trochanter BMD or spine BMD T scores were substituted for femoral neck BMD T scores, the results were similar, with the AUC improvements in women being 4% with trochanter BMD (p = 0.001) and 5% with spine BMD (p = 0.007). Neither trochanter nor spine BMD significantly improved the AUC in men (p > 0.350).

Table 4. Determinants of Prevalent Vertebral Fracture: Models With Hip and Spine BMD

Vertebral fracture prediction from BMD data

The usual current approach to using BMD data in clinical practice is to classify patients according to their BMD T score as osteoporotic (OP, T score ≤ −2.5) versus not osteoporotic (NotOP, T score > −2.5). In women, the AUC (95% CI) in a univariate model with femoral neck BMD T score classified as OP versus NotOP was 0.55 (95% CI, 0.46, 0.63). When age and sex were entered into this model as additional variables, the AUC increased substantially to 0.63 (95% CI, 0.58, 0.68). In combination with age, sex, weight, height loss, and fracture history, the AUC increased further to 0.67 (95% CI, 0.63, 0.72). This AUC did not significantly change when the OP versus NotOP classification of femoral neck BMD T-score was dropped from the model (p = 0.372). Thus, in contrast with the earlier female model where femoral neck BMD T score significantly improved the AUC when entered as a continuous variable, it seems that the collapse of the BMD T score into osteoporotic versus not osteoporotic categories led to loss of information that was useful for greater statistical efficiency in fracture prediction. When the femoral trochanter BMD or spine BMD T score classifications were substituted for femoral neck BMD T score classification, the findings were similar.


There has been longstanding controversy concerning the need for many of the spine X-rays ordered for back pain in primary care.(26–28) This study aimed to explore ways that a general medical practitioner (primary care physician) might improve the selection of patients for X-ray to identify previous vertebral fractures. The need for improved selection of patients—and a corresponding acceptance of referrals by radiologists—is urgent because of the newly found importance of vertebral X-rays in managing patients with osteoporosis. In the last 10 years, effective anti-osteoporosis treatments have been developed and made available,(29,30) but their use will be suboptimal without proper objective criteria for identifying patients at high risk of fracture likely to benefit from treatment. Radiologically evident vertebral fractures can occur relatively early compared with some other osteoporotic fractures and are common. They confer a high risk of future vertebral and other osteoporotic fractures. However, most of these vertebral fractures are clinically undetected, with only 2–12% of cases with prevalent vertebral fracture being currently identified in British general practices,(1–3) which impedes the rational use of effective treatments.

We considered the information available in primary care for managing patients with osteoporosis. Over 90% of osteoporosis management is performed in primary care, much of it without onward specialist referral. In many centers, bone densitometry is of limited availability because of inadequate facilities.(31) In others, availability under insurance is limited to patients who are already known to have fractured—which, because a fracture is an independent risk factor, has the effect of impairing fracture prevention. Therefore, this analysis was based on easily obtained information from clinical history and examination, which could be nurse-led. At the proposed cut-off score category of 6 in women, the PPV varied from 16% when the prevalence was 10% to a PPV of 38% when the prevalence was 26%. The NPVs were 94% and 84%, respectively. Thus, in a primary care practice in a population with low prevalence of vertebral fracture (e.g., 10% prevalence), the chance of a vertebral fracture being confirmed by X-ray would increase from 10% to 16%. In a population with 26% prevalence of vertebral fracture, a score in category 6 or above would increase to 38% the likelihood of a vertebral fracture. The practitioner obtaining a score below 6 would be 94% or 84% assured, respectively, that a female patient did not have a vertebral fracture.

We found that we could identify those with multiple fractures more efficiently than those with a single fracture. It is those with multiple fractures who are more likely to have already suffered impaired quality of life and reduced activities of daily living (ADLs).(7,32) Multiple (and larger(33)) fractures are associated with higher risk of an incident fracture at a later date. Subsequent fracture(s) are likely to be within three vertebrae of the index fracture(34) and to create a more severe deformity(33) in those more severe cases. Those with clustered multiple fractures in the lumbar spine especially tend to be the most disabled group of patients with spinal osteoporosis.(35) Vertebral fractures are also associated with higher mortality,(10–12) but in terms of future disease prevention, particularly important is their strong association with hip fractures,(6,36,37) which can be prevented with at least two of the currently licensed treatments for vertebral osteoporosis (alendronate and risedronate).(30)

The current “state of the art” in estimating fracture risk is to use DXA bone densitometry (usually of the hip and spine), with the result expressed as the number of population SDs by which the patient's BMD falls below the young normal mean (T score). When the patient has a T score of −2.5 or less, she is diagnosed as having (densitometric) osteoporosis. For financial reasons, our study did not include densitometric measurements on every participant; in that respect, it modeled health care provision in Britain and other European countries. In the United States, on the other hand, densitometry is recommended for women >65 years of age.(19) Nevertheless, even without densitometry, our prediction model in females suggested that a randomly selected woman with one or more prevalent fractures could be distinguished from a randomly selected woman >50 years of age without a fracture with 69% probability (95% CI, 0.66, 0.72) using the clinical history, age, and anthropometry (see Fig. 1). Similarly, using our male model, a man with 1+ prevalent fractures could be distinguished from a randomly selected man ≥50 years of age without a fracture with 64% probability (95% CI, 0.61, 0.68). Furthermore, the number of X-rays required per fracture case identified was similar in the <65 and >65 age groups, suggesting that selective case-finding with X-rays should be worthwhile for those <65 years of age. At a cut-off score of 6 in women, 28% of female patients <65 years of age with 1+ prevalent vertebral fractures would be identified at a similar cost to those >65 years of age. They would otherwise be missed should fracture prevention strategies be restricted to those >65 years of age only.

We have shown in this study that many currently undiagnosed fractures could be identified at modest cost. It is not envisaged that, after a negative X-ray, repeated X-rays would be necessary except in response to a significant clinical change.(38) A woman 65 years of age with one vertebral fracture has a one in four chance of another fracture over the next 5 years, which can be reduced to one in eight by treatment.(4,30) Thus, it is expected that eight women with a vertebral fracture would have to be treated to avoid one of them having another vertebral fracture (NNT). Together with our estimate of needing five X-rays on average to identify one vertebral fracture case in the 65+ age group at the proposed cut-off score of 6, it would be necessary to X-ray 40 women ≥65 years of age to allow prevention of one additional vertebral or other fracture in the next 5 years. Because the cost of lateral views of the dorsal and lumbar spine in the United Kingdom amounts to about one-fifth of the yearly cost of a weekly oral bisphosphonate, it may be calculated that incorporating a strategy for identifying fracture cases such as we describe here into practice will add about 20% to the overall cost of treating a patient for 5 years. This could be quite cost-effective as well as leading to substantial health benefits. Selective prevention of osteoporotic fractures in those at sufficiently high risk(39) can be of net benefit to health care insurance systems.
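The arithmetic behind the figures of 8 treated, 40 X-rayed, and about 20% added cost can be retraced as follows. This is a sketch using only the quantities stated in the paragraph above; the variable names are illustrative, and costs are expressed in units of one year of drug treatment rather than in currency.

```python
from fractions import Fraction


def number_needed_to_treat(risk_untreated, risk_treated):
    """NNT is the reciprocal of the absolute risk reduction."""
    absolute_risk_reduction = risk_untreated - risk_treated
    if absolute_risk_reduction <= 0:
        raise ValueError("treatment must lower the risk")
    return 1 / absolute_risk_reduction


# Five-year risk of a further fracture: 1 in 4 untreated, 1 in 8 treated.
nnt = number_needed_to_treat(Fraction(1, 4), Fraction(1, 8))

# Five X-rays are needed on average to find one fracture case (65+ group).
xrays_per_case = 5
xrays_per_fracture_prevented = nnt * xrays_per_case

# One spine X-ray costs about one-fifth of a year of weekly bisphosphonate.
xray_cost = Fraction(1, 5)                  # in drug-years
case_finding_cost = xrays_per_case * xray_cost
treatment_years = 5
added_cost_fraction = case_finding_cost / treatment_years

print(f"NNT: {nnt}")
print(f"X-rays per fracture prevented: {xrays_per_fracture_prevented}")
print(f"Added cost: {float(added_cost_fraction):.0%}")
```

Exact fractions are used so that the intermediate steps are not blurred by floating-point rounding.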

In current practice, the use of BMD to confirm the diagnosis of osteoporosis before assigning a patient to treatment with a drug that reduces bone resorption is now quite usual. In our subsample that had a BMD measurement, the improvement in vertebral fracture prediction through the incorporation of axial BMD measurement into the prediction algorithm was modest, being 4% in women and not statistically significant in men. Thus, in many cases, BMD could, without great loss to decision-making, be reserved for later confirmation of osteoporosis. Alternatively, BMD could be used to gauge the level of individual risk of a vertebral fracture in the next 5–10 years in those already categorized by their current vertebral fracture status.(40,41) A large number of fragility fractures occur in individuals with BMD T scores in the range of −1 to −2.5, and therefore, the finding of a prior silent compression fracture in these individuals would be of particular importance for a treatment decision.

Compared with other risk factors, the patient's age was a particularly strong predictor of vertebral fractures, which indicates that reliance on the BMD T score for assessing risk without considering the age of the patient is unwise. Thus, an “osteoporotic” patient with a T score of −2.6 at the age of 55 is 1.8 times less likely to have a vertebral fracture than a “non-osteoporotic” patient with a T score of −2.0 at the age of 75. The choice of treatment strategy in those >75 years of age, many of whom would be indicated for radiographs and who comprise about 15% of our sample, or in those with BMD T scores above the range tested in the bisphosphonate trials, is outside the scope of this paper. Worldwide, many national and regional guidelines allow bisphosphonate treatment for fractured patients in either or both of these groups, generally in the former case when their calcium and vitamin D status is assured. Subclinical vitamin D deficiency is remediable at low cost and is often only revealed at the time a fracture is identified in the elderly.

While most studies that have attempted to produce algorithms for fracture prediction have concluded that historical risk factors are important for vertebral or hip fracture prediction,(42–44) others have suggested that these risk factors have insufficient sensitivity and specificity for fracture prediction to warrant widespread use.(45) However, for the specific case of vertebral fracture where it is important to identify those with multiple fractures, our results showed that the algorithm has higher sensitivity and specificity in identifying this high-risk group (Fig. 1). The AUCs and sensitivities/specificities for this multiple-fracture group were approaching the range of those that have been reported for diagnosis of cardiovascular disease risk using age, sex, blood pressure, cholesterol, and smoking,(46) or diagnosis of diabetes using fasting plasma glucose.(47) Furthermore, the effective radiation dose from radiography, which has been of concern, has been reduced, and therefore, it is likely that the future benefits associated with early identification of prevalent vertebral fractures will outweigh the small risk of radiation exposure. We could not model the potentially beneficial effect of allowing the radiologist additional radiographs to refine diagnosis, because this was outside the EPOS protocol. Finally, because newer low-radiation dose techniques for identifying vertebral fractures are improving rapidly and because they can be accomplished at the same time as bone densitometry, these may play a major role in the future provision of clinical osteoporosis services,(14,16) including their use as prescreening tools for conventional radiography.(48)

Patients are often denied treatment for osteoporosis because the clinician is unaware of their pre-existing spine fractures. Who should have a spine X-ray to identify osteoporotic spine fractures in primary care and when that should occur (in the sixth decade or later) can be usefully guided by algorithms based on the clinical history of non-spine fracture, gender, age, height, and weight. A discriminating and proactive approach to spine radiology combined with assessment of osteoporosis in those who suffer any sort of low or moderate trauma fracture will help deliver anti-osteoporosis treatment to many of those who need it and improve the effectiveness of current case-finding strategies.


This study was financially supported by a European Union Concerted Action Grant under Biomed-1 BMH1CT920182 and EU Grants C1PDCT925102, ERBC1PDCT 930105, and 940229. The central coordination was also supported by the UK Arthritis Research Campaign, the Medical Research Council (G9321536), and the European Foundation for Osteoporosis and Bone Disease. The EU's PECO program linked to BIOMED 1 funded in part the participation of the Budapest, Warsaw, Prague, Piestany, Szczecin, and Moscow centers. Data collection from Zagreb was supported by a grant from the Wellcome Trust. The central X-ray evaluation was generously sponsored by the Bundesministerium fur Forschung and Technologie, Germany. Local or national research funds supported the participation of the following: Austria, University Hospital, Graz; Belgium, University Hospital, Leuven; Croatia, Clinical Hospital, Zagreb; Czech Republic, Charles University, Prague; Germany, Behring Hospital, Berlin; Humboldt University, Berlin; Ruhr University, Bochum; Medical Academy, Erfurt; University of Heidelberg; Clinic for Internal Medicine, Jena; HH Raspe, Institute of Social Medicine, Lubeck; Greece, University of Athens; Hungary, National Institute of Rheumatology and Physiotherapy, Budapest; Italy, University of Siena; Netherlands, Erasmus University, Rotterdam; Portugal, Hospital de Angra do Herismo, Azores; Hospital de San Joao, Oporto; Poland, PKP Hospital, Warsaw; Academy of Medicine, Szczecin; Russia, Institute of Rheumatology, Moscow; Medical Institute, Yaroslavl; Slovakia, Institute of Rheumatic Diseases, Piestany; Spain, Asturia General Hospital, Oviedo; Sweden, Lund University, Malmö; United Kingdom, University of Aberdeen; Royal National Hospital for Rheumatic Diseases, Bath; University of Sheffield; University of Southampton; Royal Cornwall Hospital, Truro.



Radiology and ascertainment of fractures

At baseline, each participant answered an interviewer-administered questionnaire and agreed to a lateral thoracic and lumbar spine radiograph (T4-L4). These were taken according to standard protocol that specified details on positioning of participants and radiographic technique. The thoracic film was centered at T7 and the lumbar film at L2, with the patient lying in left lateral position. For the thoracic film, the breathing technique was used to allow blurring of overlying ribs and lung detail. Repeat lateral thoracic and lumbar spine radiographs were taken at a mean of 3.8 years after the baseline film according to the same protocol. Additionally, during follow-up, annual postal questionnaires were used to assess occurrence of incident peripheral fractures, and these were validated(49) by review of radiographs, medical records, or interview.

All spine radiographs were evaluated morphometrically in a central radiology coordinating facility in Berlin by one of three observers as described previously.(20,50) Using a translucent digitizer and cursor, six points were marked on each vertebral body from T4 to L4 to describe vertebral shape. From the six points, the anterior (Ha), mid (Hm), posterior (Hp), and predicted posterior (Hpred) heights were assessed, and the vertebral height ratios (Ha/Hp, Ha/Hm, and Ha/Hpred) were calculated. A senior radiologist reviewed all films in which any of the measured vertebral height ratios in a single film was <0.75 and identified prevalent fractures. In addition, for three centers, all films were qualitatively reviewed by the radiologist without exception, which showed that this selection procedure did not miss any clinical fractures.
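The selection rule for radiologist review (any height ratio on a film below 0.75) can be sketched as follows. This is an illustrative reading of the rule as stated above, not the software used at the coordinating facility. The class, the function name, and the measurements in millimeters are hypothetical.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.75  # ratio below which a film is sent for review


@dataclass
class VertebralHeights:
    level: str      # e.g. "T12"
    ha: float       # anterior height
    hm: float       # mid height
    hp: float       # posterior height
    hpred: float    # predicted posterior height

    def ratios(self):
        """The three height ratios named in the text."""
        return {
            "Ha/Hp": self.ha / self.hp,
            "Ha/Hm": self.ha / self.hm,
            "Ha/Hpred": self.ha / self.hpred,
        }


def film_needs_review(vertebrae, threshold=REVIEW_THRESHOLD):
    """True if any ratio of any vertebra on the film is below threshold."""
    return any(
        ratio < threshold
        for vertebra in vertebrae
        for ratio in vertebra.ratios().values()
    )


# Hypothetical measurements (mm), invented for illustration.
normal_film = [VertebralHeights("T12", ha=24.0, hm=23.0, hp=25.0, hpred=25.0)]
wedged_film = normal_film + [
    VertebralHeights("L1", ha=17.0, hm=20.0, hp=26.0, hpred=26.0)
]

print(film_needs_review(normal_film))
print(film_needs_review(wedged_film))
```

In the study, the rule only selected films for qualitative review; the fracture diagnosis itself was made by the senior radiologist.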

Negative binomial regression

Negative binomial regression is useful for modeling count data when there is evidence of overdispersion. Other popular methods for modeling count data, such as Poisson regression, require the assumption that the mean is the same as the variance (equidispersion) and would therefore lead to incorrect inferences if applied to data with significant overdispersion. The negative binomial model, on the other hand, estimates the variance as a function of the mean (a variance function) and therefore provides more conservative estimates of SEs and CIs compared with using Poisson regression if the data are overdispersed. In general, we observe outcome variables (y1, y2, …, yn) for n participants, each a count yi = 0, 1, 2, …, and for the ith participant, we are interested in measuring the effect of k explanatory variables xi = (xi1, xi2, …, xik). Using the exponential mean function, the conditional mean μi is given by:

$$\mu_i = E(y_i \mid x_i) = \exp(\beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik})$$

and the unknown vector of coefficients β = (β0, β1, …, βk) is estimated from the data for all n participants by maximum likelihood. The value of exp(βj) represents the proportionate change in the expected number of prevalent fractures when the jth explanatory variable increases by 1 unit; hence, it quantifies the magnitude of risk associated with the explanatory variable. The variance in the Poisson model is μi, whereas in the negative binomial model, the variance is given by:

$$\omega_i = \operatorname{Var}(y_i \mid x_i) = \mu_i (1 + \alpha)$$

where the term (1 + α) is referred to as the dispersion, and the negative binomial model reduces to the Poisson model if α = 0. A likelihood ratio test for the hypothesis that α = 0 is available in the nbreg function in Stata version 8 that was used for analysis.
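The roles of the exponential mean function and the variance function can be illustrated with a short sketch. It assumes the variance function stated in this appendix (variance equal to the mean times 1 + α). The coefficients are hypothetical, and nothing here reproduces the Stata estimation itself.

```python
import math


def conditional_mean(beta, x):
    """Exponential mean function: exp(beta_0 + sum_j beta_j * x_j)."""
    if len(beta) != len(x) + 1:
        raise ValueError("beta needs one intercept plus one term per x")
    linear_predictor = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return math.exp(linear_predictor)


def variance(mu, alpha=0.0):
    """Variance function mu * (1 + alpha); alpha = 0 gives Poisson."""
    return mu * (1.0 + alpha)


# Hypothetical coefficients: intercept and one explanatory variable.
beta = [-2.0, 0.5]

mu_at_0 = conditional_mean(beta, [0.0])
mu_at_1 = conditional_mean(beta, [1.0])

# A one-unit increase multiplies the expected count by exp(beta_1).
print(mu_at_1 / mu_at_0, math.exp(beta[1]))

# Same mean, larger variance once alpha > 0 (overdispersion).
print(variance(mu_at_1), variance(mu_at_1, alpha=1.5))
```

The mean is unchanged by α; only the variance, and hence the SEs and CIs, widen as α grows.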

Tests for overdispersion

Using the variance function ωi = μi(1 + α), the value of α was estimated in a model not having any regressors to be 1.63 with 95% CI (1.30, 2.03) p < 0.0001 in women and 0.97 (0.77, 1.23) p < 0.0001 in men, giving a dispersion of 2.63 in women and 1.97 in men. Therefore, there was evidence of considerable overdispersion in the data before including any predictor variables. In a model with predictor variables (Table 2), the value of α was estimated to be 1.22 with 95% CI (0.96, 1.55) p < 0.0001 in women and 0.82 (0.65, 1.05) in men, suggesting there was still a moderate overdispersion of 2.22 in women and 1.82 in men after accounting for the effects of the predictors. Thus, direct application of Poisson regression would not have been appropriate for making inferences.
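The dispersion values quoted above follow directly from the estimated α values as 1 + α. A minimal check, using only the estimates reported in this section (the dictionary layout and function name are illustrative):

```python
def dispersion(alpha):
    """Dispersion term (1 + alpha) of the variance function."""
    return 1.0 + alpha


# Estimated alpha values as reported in the text.
alpha_estimates = {
    ("women", "no predictors"): 1.63,
    ("men", "no predictors"): 0.97,
    ("women", "with predictors"): 1.22,
    ("men", "with predictors"): 0.82,
}

dispersions = {
    key: round(dispersion(alpha), 2) for key, alpha in alpha_estimates.items()
}

for (sex, model), value in dispersions.items():
    print(f"{sex:5s} {model:16s} dispersion = {value:.2f}")
```

A dispersion of 1 would correspond to the Poisson model; all four values are well above it.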

Formulae for calculating linear predictor score from model coefficients

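For women, using the coefficients applied in the worked example that follows:

$$\text{score} = -4.487 + 0.052 \times \text{age} + 0.063 \times \text{height loss} - 0.015 \times \text{weight} + 2.017 \times \text{hvert} + 0.602 \times \text{hotherfx}$$

with age in years, height loss in cm, weight in kg, and hvert and hotherfx coded 1 for Yes and 0 for No. The expected number of prevalent fractures is exp(score). The formula for men has the same form with the coefficients of the male model.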

Example of how the estimated coefficients relate to relative risk

Consider a female subject 1 with the following: age = 70 years, height loss = 2 cm, weight = 70 kg, hvert = No, hotherfx = No. Now consider subject 2 with similar characteristics except that she reported other fracture history (i.e., hotherfx = Yes). The expected number of prevalent fractures would be:

Subject 1: exp(−4.487 + 0.052 × 70 + 0.063 × 2 − 0.015 × 70 + 2.017 × 0 + 0.602 × 0) = 0.17016

Subject 2: exp(−4.487 + 0.052 × 70 + 0.063 × 2 − 0.015 × 70 + 2.017 × 0 + 0.602 × 1) = 0.31068

The relative increase in the expected number of prevalent fractures in subject 2 attributable to other fracture history (because all other covariates are the same) is given by the ratio 0.31068/0.17016 = 1.826, which is the same as exp(0.602) and is the value reported in Table 2 as the RR for “other major fracture” in females.
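The worked example can be reproduced in a few lines. The coefficients are those quoted in the example for the female model; the function and dictionary names are illustrative and not taken from the original analysis.

```python
import math

# Female model coefficients as quoted in the worked example.
FEMALE_COEFFICIENTS = {
    "intercept": -4.487,
    "age": 0.052,          # per year
    "height_loss": 0.063,  # per cm
    "weight": -0.015,      # per kg
    "hvert": 2.017,        # 1 = Yes, 0 = No
    "hotherfx": 0.602,     # other fracture history, 1 = Yes, 0 = No
}


def expected_fractures(age, height_loss, weight, hvert, hotherfx,
                       coef=FEMALE_COEFFICIENTS):
    """Expected number of prevalent fractures: exp(linear predictor)."""
    linear_predictor = (
        coef["intercept"]
        + coef["age"] * age
        + coef["height_loss"] * height_loss
        + coef["weight"] * weight
        + coef["hvert"] * int(hvert)
        + coef["hotherfx"] * int(hotherfx)
    )
    return math.exp(linear_predictor)


subject_1 = expected_fractures(70, 2, 70, hvert=False, hotherfx=False)
subject_2 = expected_fractures(70, 2, 70, hvert=False, hotherfx=True)
relative_risk = subject_2 / subject_1

print(round(subject_1, 5), round(subject_2, 5), round(relative_risk, 3))
```

Because the two subjects differ only in hotherfx, every other term cancels in the ratio, leaving exp(0.602).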