Determinants of the Size of Incident Vertebral Deformities in European Men and Women in the Sixth to Ninth Decades of Age: The European Prospective Osteoporosis Study (EPOS)


  • J Reeve

    Corresponding author
    1. Department of Medicine, Addenbrooke's Hospital, Cambridge, United Kingdom
    • Address reprint requests to: J Reeve, DM, DSc Department of Medicine Box 157 Addenbrooke's Hospital Hills Road Cambridge CB2 2QQ, UK

  • The authors have no conflict of interest


More severe vertebral fractures have more personal impact. In the European Prospective Osteoporosis Study, more severe vertebral collapse was predictable from prior fracture characteristics. Subjects with bi-concave or crush fractures at baseline had a 2-fold increase in incident fracture size and thus increased risk of a disabling future fracture.

Introduction: According to Euler's buckling theory, loss of horizontal trabeculae in vertebrae increases the risk of fracture and suggests that the extent of vertebral collapse will be increased in proportion. We tested the hypothesis that the characteristics of a baseline deformity would influence the size of a subsequent deformity.

Methods: In 207 subjects participating in the European Prospective Osteoporosis Study who suffered an incident spine fracture in a previously normal vertebra, we estimated loss of volume (fracture size) from plain film images of all vertebral bodies that were classified as having a new fracture. The sum of the three vertebral heights (anterior, mid-body, and posterior) obtained at follow-up was subtracted from the sum of the same measures at baseline. For each vertebra with a McCloskey-Kanis deformity on the second film, this summed height loss was expressed as a percentage.

Results and Conclusions: In univariate models, the numbers of baseline deformities and the clinical category of the most severe baseline deformity were each significantly associated with the size of the most severe incident fracture and with the cumulated sum of all vertebral height losses. In multivariate modeling, age and the clinical category of the baseline deformity (crush > bi-concave > uni-concave > wedge) were the strongest determinants of both the most severe and the cumulative height loss. Baseline biconcave and crush fractures were associated at follow-up with new fractures that were approximately twice as large as those seen in subjects with other types of deformity or with previously undeformed spines. In conclusion, the characteristics of a baseline vertebral deformity determine statistically the magnitude of vertebral body volume lost when a subsequent fracture occurs. Because severity of fracture and number of fractures are determinants of impact, the results should improve prediction of the future personal impact of osteoporosis once a baseline prevalent deformity has been identified.


The impact of osteoporotic vertebral fractures has become the subject of increasing attention.(1–4) Because nonspecific or mechanical back pain is common, it has proved sometimes difficult to show the impact of osteoporotic fractures in the vertebral bodies. These are relatively uncommon in populations but were thought to have a large impact on affected individuals. In more recent work, better instruments have been used to measure impact in terms of activities of daily living, bodily function, and pain.(3,5,6) It has become clear that the impact of vertebral fractures is considerable, whether measured with disease-specific instruments such as the QUALEFFO or with generic instruments such as the short form (SF)-36.(5)

It has also become clear that some vertebral fractures have a larger impact than others. In both the Multiple Outcomes of Raloxifene (MORE) trial quality of life study and the European Prospective Osteoporosis Study (EPOS) cohort study, it was shown that thoracic fractures have less impact than lumbar fractures and that the overall impact of osteoporotic spine fractures on the patient depends on the cumulative number of fractures.(4,7) Previously, it was shown that larger fractures are more common in clinically referred patients than expected from their prevalence in the general population, suggesting that they have a greater impact than smaller ones.(8) In both the United States and the European arms of the MORE study(6,9) and in the Fracture Intervention Trial (FIT) study,(10) it was shown that incident fractures had an impact on quality of life, whether or not they were reported to the patient's personal physician. Ettinger et al.(1) found that the more severe (>4 SD) deformities had significant personal impact, whereas the less severe (<4 SD) did not.

At fracture, vertebral bodies collapse to a variable extent; therefore, it is common to categorize fractures according to loss of estimated vertebral volume in a semiquantitative manner.(11) According to Euler's buckling theory, loss of horizontal trabeculae not only increases the risk of vertebral fracture but also suggests that the degree of vertebral collapse will be increased when a fracture occurs. This is because of the greater unsupported length of the vertebral trabeculae that remain. We hypothesized that one or more of the determinants of the risk of occurrence of an incident vertebral fracture would also be a predictor of the size of that fracture. Therefore, in the present analysis of data from the EPOS, we have explored the statistical determinants of the size of a fracture in those subjects judged to have suffered an incident fracture.
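As a reminder of the mechanical argument, the classical Euler expression for the critical load of an idealized slender strut can be written as below. This is a textbook idealization rather than a model fitted in this study; treating a single vertical trabecula as such a strut, with E its elastic modulus, I the second moment of area of its cross-section, L its unsupported length, and K an effective-length factor set by the end conditions, is an assumption made here purely for illustration.

```latex
P_{\mathrm{cr}} = \frac{\pi^{2} E I}{(K L)^{2}}
```

Under this idealization, losing a horizontal trabecula so that the unsupported length L doubles would reduce the critical load to one quarter of its previous value.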

We took advantage of the fact that all the X-rays in the EPOS study were subjected to digital morphometric analysis in a single center. We devised quantitative variables relating to vertebral volume loss that are arithmetically continuous, which has advantages for statistical modeling purposes.



The subjects followed up were those who had participated in the European Vertebral Osteoporosis Study (EVOS) prevalence survey, and this is described in detail elsewhere.(12–16) In brief, each of 36 European centers aimed to recruit 600 subjects aged 50 to 79 years (50 males and females in each 5-year age band) from population-based registers who were invited to have a spinal radiograph. Participating centers undertook postal follow-up to determine the interval occurrence of non-spine fractures. At the second follow-up, 29 of the centers invited the responders to attend for a follow-up radiograph.

Resource constraints in some centers resulted in only a proportion of the follow-up questionnaire responders being invited for a second radiograph. In addition, as expected, not all subjects invited for a second film attended. In a previous study, we found no convincing evidence that this incompleteness of follow-up caused bias.(17) Lateral spinal radiographs of dorsal and lumbar spine were scheduled to be taken according to a structured protocol, including measurement of film-focus distance, using the breathing technique, and film centering as previously used for the baseline film.(14,18)

In total, 7340 subjects had a follow-up film, of which 67 (0.9%) were unreadable because of poor image quality. In the remaining 7273 subjects, the X-rays were taken from 1.4 to 8 years apart (mean, 3.8 years), and sent in batches to Berlin where they were scanned and digitized. Then, each of the 94,549 vertebral bodies from T4 to L4 was evaluated to determine whether it could be adequately quantitated. When this was the case, the vertebral body heights were measured digitally anteriorly, posteriorly, and at the midpoint in the vertebral body. In 49 subjects, however, fewer than five vertebrae were visualized adequately for quantitative analysis.

Ascertainment of fracture/deformity

For the purposes of this paper and to conform to previous terminology, a baseline deformity is referred to as a deformity even if it was clinically certain that it was an osteoporotic fracture that occurred before the study.

All X-rays were captured as digital images and all assessable vertebral bodies were quantitated by placing six points per vertebra to define anterior, mid-, and posterior vertebral body heights. The results were compared with the results of the quantitative evaluation of the first film,(14) and all vertebral bodies showing an absolute change in the anterior:posterior or mid-body:posterior ratio greater than 15% were identified. Also, all vertebral bodies with one of these ratios less than 75% were identified. All film pairs scoring positive for one or more vertebrae by either of these criteria were set aside for review (50% of the subjects). In addition, for three centers, all films were reviewed without exception, which demonstrated that this selection procedure did not miss any clinical fractures.
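The screening rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the `Heights` container and the function names are hypothetical, the "absolute change greater than 15%" is assumed to mean a difference of more than 0.15 in the ratio itself, and the "less than 75%" test is assumed to apply to the follow-up film.

```python
from dataclasses import dataclass


@dataclass
class Heights:
    """Vertebral body heights in mm (hypothetical container, not from the study)."""
    anterior: float
    mid: float
    posterior: float


def height_ratios(h):
    """Return the anterior:posterior and mid-body:posterior ratios."""
    return h.anterior / h.posterior, h.mid / h.posterior


def flag_for_review(baseline, followup, change_cut=0.15, ratio_cut=0.75):
    """Flag a vertebra for radiological review.

    Assumptions: a ratio change is the absolute difference between films,
    and the low-ratio test is applied to the follow-up film only.
    """
    ap1, mp1 = height_ratios(baseline)
    ap2, mp2 = height_ratios(followup)
    ratio_changed = abs(ap2 - ap1) > change_cut or abs(mp2 - mp1) > change_cut
    ratio_low = ap2 < ratio_cut or mp2 < ratio_cut
    return ratio_changed or ratio_low
```

A film pair would then be set aside for review if any of its vertebrae is flagged, for example `any(flag_for_review(b, f) for b, f in pairs)`.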

All film pairs were reviewed side-by-side on the same screen with image enhancement (magnification, contrast) by a single experienced radiologist who made a clinical judgment as to whether there had been a fracture event in any vertebral body between the two films.(17) In the event that an incident fracture was diagnosed, it was assigned to one of four clinical categories based on shape: wedged, concave (one endplate fractured), bi-concave, or crushed. When necessary, for example, because of a mistake in identifying the vertebral body outline, the points defining the vertebral body's dimensions were adjusted. The morphometric readings were analyzed after adjusting the images to the same magnification.(19) Our approach was modeled in part on our successful pilot study.(20) Deformed vertebrae on the second film were identified by the McCloskey-Kanis criteria.(14,21) To qualify as an incident fracture, a deformity on the second film had to have lost 4 mm in height in at least one of its measured dimensions and to have at least one dimension reduced by 20% or more.
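The two quantitative conditions for an incident fracture can be expressed as a short check. The sketch below is illustrative only: the function name and argument layout are hypothetical, the McCloskey-Kanis classification is not reimplemented and is passed in as a flag, a loss of exactly 4 mm is assumed to qualify, and the two conditions are assumed to be allowed on different heights of the same vertebra.

```python
def qualifies_as_incident(baseline, followup, is_mk_deformity,
                          min_loss_mm=4.0, min_loss_fraction=0.20):
    """Check the height-loss conditions for an incident fracture.

    baseline, followup: (anterior, mid, posterior) heights in mm, already
    adjusted to the same magnification.
    is_mk_deformity: result of the McCloskey-Kanis criteria on the second
    film (computed elsewhere; not reproduced here).
    """
    losses_mm = [b - f for b, f in zip(baseline, followup)]
    losses_fraction = [(b - f) / b for b, f in zip(baseline, followup)]
    lost_enough_mm = max(losses_mm) >= min_loss_mm            # assumed inclusive
    lost_enough_fraction = max(losses_fraction) >= min_loss_fraction
    return is_mk_deformity and lost_enough_mm and lost_enough_fraction
```

For instance, a vertebra whose anterior height falls from 25 mm to 19 mm (a 6 mm, 24% loss) would pass both height conditions.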

As described previously by Lunt et al.,(22) we also subclassified the incident fractures identified by the McCloskey-Kanis algorithm. Thus, wedge fractures according to the McCloskey classification were subdivided according to whether they had anterior height loss only or additional loss of mid-body height (some had posterior plus mid-body height loss without anterior height loss). Concave fractures were those with only loss of mid-body height. Crush fractures were associated with loss of all three heights.

In 21 centers, bone mineral density (BMD) using DXA was measured in at least some subjects, according to resources available. The total number of subjects was 3527. The machines used were Lunar (Madison, WI, USA), Hologic (Waltham, MA, USA), or Norland (Minster, OH, USA) pencil beam machines, or in one case, a Sopha fan-beam machine. The results were cross-calibrated with the European Spine Phantom.(23) We updated our calibrations with the definitive phantom compared with our previous calibrations(24) with the prototype. These measurements were made either at the time of the first X-ray (13 centers) or at the time of the second X-ray (remaining centers). Two centers measured the lumbar spine only and eight centers measured the femoral neck only, with the remainder measuring both regions of interest. Altogether, we had 3047 cross-calibrated measurements of the femoral neck and femoral trochanter and 2247 measurements of the lumbar spine.

Statistical analysis

Calculation of incident vertebral fracture size variables:

These analyses were undertaken on the subset of subjects who had at least one incident vertebral fracture in a previously unfractured vertebra. We did not include the very small number of subjects who only had an incident fracture in a vertebra that was classified at baseline as a McCloskey-Kanis deformity. After identifying all incident fractures that qualified according to the McCloskey-Kanis +20% height reduction/4-mm criteria, the relative loss of vertebral volume attributable to each was estimated as follows, using the vertebral height data after magnification adjustment(19) to place the first and second films on the same dimensional scale.

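$$\%\mathrm{SR} = 100 \times \frac{(h_{a1} + h_{m1} + h_{p1}) - (h_{a2} + h_{m2} + h_{p2})}{h_{a1} + h_{m1} + h_{p1}} \qquad (1)$$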

where h = an adjusted vertebral height, a = anterior, m = mid-body, p = posterior, and 1 and 2 refer to the first and second X-ray, respectively. The largest fracture was defined as the largest %size reduction in each subject (Max%SR).

If more than one vertebra qualified as having an incident size reduction, the qualifying %size reductions were summed to generate a cumulated vertebral size reduction (Cum%SR). A third outcome variable was calculated: the residual sum of the %size reductions attributable to the incident fractures other than the largest (Res%SR).

$$\mathrm{Res\%SR} = \mathrm{Cum\%SR} - \mathrm{Max\%SR}$$
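The three outcome variables follow directly from the definitions above and can be computed as in the sketch below. The function names and the tuple layout are hypothetical, and the percentage is assumed to be taken relative to the baseline sum of the three heights.

```python
def pct_size_reduction(baseline, followup):
    """Percentage size reduction of one vertebra.

    baseline, followup: (anterior, mid, posterior) magnification-adjusted
    heights. Assumption: the denominator is the baseline sum of heights.
    """
    sum_baseline = sum(baseline)
    sum_followup = sum(followup)
    return 100.0 * (sum_baseline - sum_followup) / sum_baseline


def size_variables(fractures):
    """Compute Max%SR, Cum%SR and Res%SR for one subject.

    fractures: non-empty list of (baseline, followup) height tuples, one per
    qualifying incident fracture.
    """
    reductions = [pct_size_reduction(b, f) for b, f in fractures]
    max_sr = max(reductions)   # largest single fracture
    cum_sr = sum(reductions)   # all qualifying fractures combined
    return {"Max%SR": max_sr, "Cum%SR": cum_sr, "Res%SR": cum_sr - max_sr}
```

For a subject with a single qualifying fracture, Max%SR equals Cum%SR and Res%SR is zero.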

In silico simulation of semiquantitative grading:

We were also interested to determine whether the semiquantitative (SQ) method of assessment of baseline fractures of Genant et al.(11) would predict Max%SR, Cum%SR, or Res%SR. This is because, as a method for assessing the size of baseline fractures, in principle, it might perform better in predicting the size of subsequent fractures than the qualitative clinical assessment method of Felsenberg et al.(25) or the grading according to which vertebral heights were reduced used in the McCloskey-Kanis approach.(21) Unfortunately, we did not have the resources to commission a second, independent, SQ analysis of the baseline X-rays. We therefore performed, in silico, a simulation study of the Genant SQ grading on the baseline film data as follows.

The magnification-adjusted vertebral heights (first the anterior, then the mid-body heights, and finally the posterior heights) for each subject were statistically weighted to exclude all those that had a McCloskey deformity criterion outside the range ±2 SD units (−3 SD units was the cut-point for accepting a baseline McCloskey deformity). Those within this range were accepted as normal heights and fitted to a subject-specific cubic regression with vertebral level (T4-L4) as the independent variable. The heights not used for fitting were then compared with their predicted heights after trimming all predicted heights that fell outside a 3 SD criterion. All observed heights that were more than 20% below the predicted value were given a pseudo-Genant grade (higher than −25%, grade 1, mild; −25% to −40%, grade 2, moderate; worse than −40%, grade 3, severe). The worst of the three pseudo-Genant grades (one for each vertebral height) was accepted as that vertebra's grade for modeling purposes, and the worst grade in each individual was used to simulate the worst grade that would have been allocated by an X-ray reader. The pseudo-Genant grades were also summed from T4 to L4 for each subject to generate a score combining the effects of vertebral height reductions in individual vertebrae with the numbers of affected vertebrae.
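The grading step of this simulation can be illustrated as follows. This is only a sketch: the subject-specific cubic regression and the SD-based trimming are not reproduced, so the predicted heights are taken as given inputs; the function names are hypothetical; and the handling of values falling exactly on a cut-point is an assumption.

```python
def pseudo_genant_grade(observed, predicted):
    """Grade one height from its percentage deficit relative to prediction."""
    deficit_pct = 100.0 * (observed - predicted) / predicted
    if deficit_pct >= -20.0:
        return 0  # not more than 20% below prediction: no grade
    if deficit_pct > -25.0:
        return 1  # mild
    if deficit_pct >= -40.0:
        return 2  # moderate
    return 3      # severe


def vertebra_grade(observed_heights, predicted_heights):
    """Worst grade across the anterior, mid-body and posterior heights."""
    return max(pseudo_genant_grade(o, p)
               for o, p in zip(observed_heights, predicted_heights))


def subject_scores(vertebrae):
    """Return (worst grade, summed grade) over a subject's vertebrae.

    vertebrae: list of (observed_heights, predicted_heights) pairs, T4 to L4.
    """
    grades = [vertebra_grade(o, p) for o, p in vertebrae]
    return max(grades), sum(grades)
```

The first element of the returned pair corresponds to the simulated worst grade for the subject and the second to the summed score across T4 to L4.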

Statistical modeling:

Simple regression models were used to investigate the dependence of vertebral size reduction at the level of the individual subject on the continuous variables age and cross-calibrated BMD (g/cm²). Nonparametric Kruskal-Wallis tests were used to examine the dependence of the %size reduction variables on the categorical variables gender, study center, number of baseline deformities (0, 1, 2, or 3+), radiological category of the most severe baseline clinical deformity (in ascending order: wedge; concave involving only one end-plate; bi-concave; complete crush), and the most severe McCloskey-Kanis fracture (in ascending order: concave [loss of mid-body height]; wedge; crush). When the McCloskey-Kanis algorithm was used to categorize deformities in its disaggregated form as introduced by Lunt et al.,(22) deformities were ranked in the following order: concavity, anterior-only wedge, anterior plus mid-body wedge, posterior plus mid-body wedge, crush (anterior plus posterior height loss), and crush (all three heights reduced).
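The disaggregated ranking can be represented as a lookup from the pattern of reduced heights to an ordered category, which makes it straightforward to pick the most severe deformity per subject. The sketch below is illustrative: the names are hypothetical, whether a given height counts as "reduced" is assumed to have been decided beforehand by the McCloskey-Kanis criteria, and patterns not named in the text (such as isolated posterior loss) are left unclassified.

```python
# Categories in ascending order of rank, as listed in the text.
LUNT_ORDER = [
    "concave",
    "anterior-only wedge",
    "anterior plus mid-body wedge",
    "posterior plus mid-body wedge",
    "crush (anterior plus posterior)",
    "crush (all three heights)",
]

# Keys are (anterior_reduced, mid_reduced, posterior_reduced).
CATEGORY_BY_PATTERN = {
    (False, True, False): "concave",
    (True, False, False): "anterior-only wedge",
    (True, True, False): "anterior plus mid-body wedge",
    (False, True, True): "posterior plus mid-body wedge",
    (True, False, True): "crush (anterior plus posterior)",
    (True, True, True): "crush (all three heights)",
}


def classify(anterior_reduced, mid_reduced, posterior_reduced):
    """Return the category for a pattern, or None if it is not listed."""
    pattern = (anterior_reduced, mid_reduced, posterior_reduced)
    return CATEGORY_BY_PATTERN.get(pattern)


def most_severe(categories):
    """Return the highest-ranked category, ignoring unclassified entries."""
    ranked = [c for c in categories if c is not None]
    if not ranked:
        return None
    return max(ranked, key=LUNT_ORDER.index)
```

Encoding the order once in `LUNT_ORDER` keeps the ranking used for "most severe" in one place, so the same list could also supply ordinal codes for a rank-based test.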

Initially, multiple regression models were generated by including all significant non-DXA determinants, including center and the number of accurately readable vertebrae. Then those that were nonsignificant were eliminated by a backward step-wise approach (criterion to leave p > 0.05). To evaluate the contribution of BMD measurements to prediction of fracture size, in the subsets of subjects with these data, BMD of the hip or spine was added to the optimized models for each outcome variable to determine whether it added significantly to the prediction of fracture size.


For the study as a whole, 3.1% of subjects had an incident fracture in a previously unfractured vertebra. Nine incident fracture cases were excluded from the analysis because they were judged radiologically to have other diagnoses (e.g., osteoarthrosis, Forestier's, or a traumatic fracture). There were 207 other cases of incident fracture in the study whose fractures were attributed to osteoporosis and the present analysis was based on these cases (Table 1). Seventy-seven of these had a femur BMD measurement and 59 had a spine BMD measurement.

Table 1. Distributions of Baseline Deformities According to Classification of Worst Deformity at the Subject Level in 207 Subjects With at Least One Incident Fracture

Of these 207 subjects, 144 had only one qualifying incident fracture with at least a 20% reduction in one vertebral height that was >4 mm. There were 37 subjects with two incident fractures, 14 with three, and 12 with four or more. Table 2 shows the mean sizes of incident fractures in these 207 cases, according to the clinical categorization of the worst baseline deformity (none, wedge, concave, bi-concave, or crush), and Table 3 shows the same data classified by the worst McCloskey-Kanis baseline morphometric deformity (none, wedge, concave, crush).

Table 2. Mean Values for the Percentage Reduction in Size of the Most Severely Affected Vertebra With an Incident Fracture (Max%SR) and the Cumulative Sum of All Vertebral Percentage Size Reductions in Vertebrae That Were Normal at Baseline and Qualified as Having a McCloskey-Kanis Deformity at Follow-Up (Cum%SR)

Table 3. Max%SR and Cum%SR Data According to the McCloskey-Kanis Classification

At the individual level, the calculated worst pseudo-Genant grade was associated with other measures of baseline deformity as expected: with the Felsenberg clinical grade of the worst baseline fracture, χ² was 101.5; with the number of baseline fractures, χ² was 91.9; and with the McCloskey-Kanis deformity type, χ² was 122.7 (p < 0.0001). In logistic regression, the pseudo-Genant grade was also inversely associated with femoral neck (p < 0.0001), trochanteric (p < 0.004), and lumbar spinal (p = 0.024) BMD values.

In univariate models, the following variables were not significantly associated with Max%SR, Cum%SR, or Res%SR: geographical region within Europe (N, S, E, W) or the sex of the subject. Max%SR was not associated with McCloskey grade (p = 0.34), number of baseline deformities (p = 0.52), the worst pseudo-Genant fracture grade at baseline (p = 0.11), or femoral neck BMD (p = 0.10). It was inversely associated with spine BMD (p < 0.024), and it was positively associated with the clinical grade of baseline deformity as read by the radiologist (p = 0.038). Thus, prevalent baseline biconcave and crush deformities predicted the loss of an additional 9.4 ± 2.9% of vertebral body size with the largest incident fracture compared with other baseline states. Age also had a significant effect, with Max%SR increasing by 2.5 ± 0.9% (p < 0.01) per decade of age. Finally, Max%SR was associated with the sum of the pseudo-Genant grades in the spine at baseline (p = 0.044).

Cum%SR and Res%SR were associated with all the variables associated with Max%SR and also with number of baseline deformities (p < 0.001 for both) and the McCloskey-Kanis grade of the worst baseline deformity (p < 0.0002). Both were associated with the worst baseline pseudo-Genant grade and the sum of the baseline pseudo-Genant grades (p < 0.001 in each case). Cum%SR (but not Res%SR) was associated with femoral neck (p < 0.03) and trochanteric (p < 0.02) BMD, in addition to spine BMD (p < 0.04). Res%SR was positively correlated with Max%SR (Spearman's ρ 0.24, p < 0.001).

In multivariate modeling of the fracture size variables, we entered sex and geographical region into backward stepwise models alongside the variables related to the outcome variables in univariate modeling. The bone density variables were initially excluded because we only had bone density measurements in a subset. The simulated Genant grades were also excluded because they were not real data. The resulting models with their regression coefficients and 95% CIs are shown in Table 4. It can be seen that both Max%SR and Cum%SR were associated independently with the clinical grade of the worst baseline fracture and age of the subject (Fig. 1), but gender, geographical region, and the McCloskey-Kanis morphometric grade dropped out of these models. In contrast, in addition to being associated with Max%SR, Res%SR was found to be associated with the McCloskey-Kanis grade; there was a significant positive effect of having a wedge deformity (defined as having two of three vertebral heights reduced; Fig. 2).

Table 4. Multivariate Regression Models* for the Three Outcome Variables Studied
FIG. 1. Effects of the shape of the worst affected prevalent fracture on (left) the size of the largest incident fracture Max%SR (least squares mean ± 95% CI) adjusted for the age of the subject and (right) the cumulated sum of sizes of all incident fractures newly qualifying as McCloskey-Kanis deformities on the second X-ray (Cum%SR) after adjusting for the age of the subject. Key to clinical grade of worst shape baseline deformity: 1 = wedge, 2 = concave, 3 = biconcave, 4 = crush.

FIG. 2. Effect of McCloskey-Kanis deformity type, as disaggregated by Lunt et al.(22) (least squares means ± 95% CI), after adjusting for the size of the largest incident fracture, on the cumulated volume loss caused by the remaining incident vertebral fractures (Res%SR).

In examining the data from the subjects with BMD measurements, there was no significant effect of having versus not having BMD measurements in these models (p = 0.48). Although the effect of adding trochanteric BMD to the models for Max%SR and Cum%SR approached statistical significance (0.05 < p < 0.06), there were no other significant effects of BMD (Table 5).

Table 5. Regression Models* With One of the Three BMD Variables Included in the Subset With BMD Measurements (n = 77 for the Proximal Femur and n = 59 for the Lumbar Spine)

When the pseudo-Genant grade data were included in the backward stepwise multivariate models (without BMD), both the worst baseline grade and the sum of the baseline grades had no independent predictive effects on Max%SR or Res%SR. There was, however, an additional and independent effect of the worst baseline pseudo-Genant grade 2 or 3 to increase Cum%SR by a mean of 13%.


The EPOS study has shown that the incidence of new fractures in men and women participants aged 50–80 years was approximately 1% per annum in women and one-half that in men.(17) These figures were very similar whether a clinical or morphometric approach to case definition was used. There was an increasing incidence with age in both genders, and using the morphometric approach, approximately 1 in 30 women aged over 75 was found to develop a new fracture each year. With the new results presented here, we have shown that the careful reading of spine X-rays by an experienced clinical radiologist can provide invaluable additional prognostic information concerning the severity, as well as the probability,(26) of future osteoporotic fractures.

Intuitively, as well as on theoretical grounds, it seems likely that a structure that is less well optimized to resist mechanical overload will collapse more completely when overloaded than a structure with more sound mechanical properties. The trabecular lattice structure of the cancellous bone in the spine is highly anisotropic, with the principal compressive trabeculae arranged vertically, connected by horizontal trabeculae so as to keep them in column under load. Characteristically, in osteoporosis, these horizontal trabeculae disappear first; the consequence according to Euler's theory of buckling is that the load a strut can bear before buckling falls in inverse proportion to the square of its unsupported length. Nevertheless, it is unusual for a vertebral body to collapse completely; usually at some terminating point during the fracture event, the residue of the undamaged structure becomes capable of supporting the subject's weight without further collapse, even if pain is still experienced. This terminating point in a fracture event may be influenced by the accumulation of fracture debris between the vertebral end-plates as the internal structure disintegrates. Thus, a collapsing vertebra that had a relatively high volume of trabecular bone as a proportion of bone plus marrow may collapse less completely than a vertebra with an initially more degraded and porous structure.

Assuming that there is variability between individuals in the degree to which cancellous bone structure is degraded in the spine and that loss of horizontal struts is imprecisely reflected in BMD measurements, it would be expected that prevalent spine fractures would predict subsequent (incident) spine fractures independently of BMD. This has been confirmed in several previous studies,(27) as well as in the present cohort.(26) A further likely consequence is that large or more severe fractures would predict larger fractures subsequently, as we have now shown to be the case in this paper.

These data should be of considerable importance in clinical practice and potentially also in public health. The identification of an osteoporotic vertebral fracture in a patient has profound implications for how that patient should be treated. As well as a previous vertebral fracture increasing the risk of a subsequent vertebral fracture by about 5-fold,(27) the risk of a hip fracture is increased by 2.3- to 4-fold (Study of Osteoporotic Fractures [SOF](28) and EPOS studies(29)). Both these classes of fractures can be prevented with a range of different treatments already marketed or likely to become so. The personal impact of vertebral fractures increases directly with the number of vertebrae involved, the location of the vertebra(e) affected (lumbar fractures have more impact),(4,7) and with the size of the incident deformity.(1)

Our results show clearly how careful radiological assessment of the patient with a possible prevalent vertebral fracture can help the clinician and patient make choices regarding future treatment. Clearly, bone density measurements remain important in a patient's prognosis. However, in the future, in patients who already have possible or suspected spinal fractures, the existence of such a fracture should be subjected to careful scrutiny against objective criteria, such as those we and others have used in our cohort studies and trials. The further classification of a prevalent deformity (e.g., wedge, concave, biconcave, or crush) adds considerable further value to the radiological assessment. A prevalent deformity of any sort increases risk at least 3.5-fold and more with some classes of deformity.(26) Also, Lunt et al. have shown that the location of a prevalent deformity has prognostic value for the location of an incident fracture.(26) This new analysis shows in addition that, with the two most serious forms of prevalent fracture (biconcave and complete crush fractures), the overall loss of height in the vertebral body affected by the next incident fracture will be about double what would otherwise be expected. There was a suggestion from our simulation of the Genant SQ method of assessment that this form of grading for prevalent fracture size would add further value. This might be important because it is quickly performed in well-trained hands. Further work using the SQ method done by expert radiologists on an existing prospective data set is justified to further explore its value in this context. Nonetheless, in the present work, it was the clinical (as distinct from morphometric) reading of fracture shape, with the separation of fractures at baseline into wedged, concave, bi-concave, and crush fractures, by expert radiologists that provided the most predictive determinant of future fracture size.

Ongoing, there is likely to be increased clinical attention devoted to identifying the woman (or man) with one or more vertebral fractures who is suitable for treatment. The benefits of treatment depend directly on the absolute risk of fracture, and the calculation of absolute risk depends on knowing the age-specific population absolute risk. Furthermore, as treatments become less expensive with expiry of patent protection, the controversy over “screening” populations for risk of osteoporotic fracture will shift toward showing adequate discrimination by screening technologies such as bone densitometry and ultrasound. Again the cut-point for assigning treatment will depend on the absolute level of risk. Finally, these data contribute to the development of a benchmark for diagnosis that it is necessary to establish if ever population-level interventions against osteoporosis are to be tested. These data also provide a study design tool for interested triallists needing to do statistical power calculations for interventions designed to prevent fractures with a large impact on the patient.

This study had some limitations. It is difficult to obtain population samples to attend for radiographic surveys. In the first round of our study we achieved a response rate of about 50%. In the second round of X-rays, about 50% of those eligible to be X-rayed again attended, but we could not distinguish in the study as a whole between those who did not attend from choice, those whom the principal investigators were not able to make contact with (e.g., had moved away), and those who had died or were too ill to attend. In addition, we were told clearly by some centers that their continued activity in X-raying participants was dependent on further local funding, which was not achieved within the deadline set by the project coordinators. The data available from the centers did not permit a reliable separation between the different causes for non-participation, and we therefore adopted a conservative approach that there may be bias in the group as a whole.(17)

Using a Poisson modeling approach, we identified only a few variables thought to be associated with fracture risk that were significantly different between those with and without a follow-up film and that were also predictive of an incident deformity in the former.(17) Of these, age was the most important. In estimating incidence in the full cohort, based on the modeling approach, our assumption was that the relationship between such factors and the incidence of deformity would be similar between participants and non-participants. The results of the analysis suggested that the overall impact of non-participation on incidence was small. The impact was greater in the oldest age group, although this might be because of greater mortality, that is, those most at risk may have died and thus were not eligible to attend follow-up.

A second source of bias is the influence of unmeasured interventions. Subjects participating in EVOS were aware they were taking part in a study of osteoporosis and may, during the follow-up, have changed their lifestyle in an attempt to improve their bone health. It is intrinsically difficult in such circumstances to truly measure natural history. Furthermore, although EVOS was not an intervention study, some of those with a prevalent deformity identified diagnostically as likely caused by osteoporosis are known to have received treatment, possibly reducing their risk of an event in another vertebra.

We used two approaches to define a case: a clinical definition based on a radiologist's opinion and a morphometric definition based on measurements. Both were subject to error for reasons stated above. To reduce the error associated with a clinical reading, we chose to have every candidate deformed vertebra identified in the study read by the same experienced radiologist on both films, taking advantage of optimal image processing technology to enhance the detection of the edges of each vertebral body in question. In pilot work, we found a disturbing amount of disagreement between members of a panel of expert radiologists and clinicians who each evaluated a selection of normal and abnormal films to explore the possibility of decentralizing the reading of X-rays in the study.(30) This pilot clearly showed that, without adequate safeguards to ensure that identical criteria are used to score osteoporotic vertebral fractures, clinical reading of X-rays differs between centers.(30) Other problems, however, can make it difficult to achieve accuracy and reproducibility when X-rays are read for incident fractures using vertebral morphometry. These include the variability in quality of X-rays taken in different centers, changes in personnel at the central morphometry center, and the increasing discrimination of the same personnel as they gain experience in point placement on film images between the first and second films. Changes in magnification between films, which affect algorithms that compare measured heights or even height ratios on adjacent vertebrae (because the thoracic and lumbar films are typically not of the same degree of magnification),(19) can also lead to problems. However, chief among the sources of error is clinical error in correctly placing the six points per vertebra, due almost always to the poor quality of the image, and for which the only remedy is clinical experience allied to the best technology for image enhancement.
In our pilot work, we showed that for a study of fractures in which the incidence was expected to be low, a conservative approach would give the best agreement between methods. Furthermore, we were able, by choosing a 20% height reduction criterion for an incident fracture, to conform to the practice and recommendations of others,(31) and by insisting that each incident fracture also fulfilled the criterion for a McCloskey-Kanis deformity, we preserved the comparability between our current incidence and our earlier prevalence study. Therefore, we do not believe that these limitations have seriously compromised our main conclusions.
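The morphometric quantities discussed above can be illustrated with a minimal sketch. It computes the summed height loss described in the Methods (the sum of the anterior, mid-body, and posterior heights at follow-up subtracted from the same sum at baseline) and checks a 20% height reduction criterion. Several points are assumptions made for illustration rather than details confirmed by the study: the names `VertebralHeights`, `summed_height_loss_percent`, and `meets_height_reduction_criterion` are hypothetical, the percentage is taken relative to the baseline sum, and the 20% criterion is applied to any one of the three heights. The additional requirement that the vertebra fulfil the McCloskey-Kanis definition is not implemented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VertebralHeights:
    """Three heights of one vertebral body, all in the same unit (e.g. mm)."""
    anterior: float
    middle: float
    posterior: float

    def total(self) -> float:
        # Sum of the three measured heights.
        return self.anterior + self.middle + self.posterior


def summed_height_loss_percent(baseline: VertebralHeights,
                               follow_up: VertebralHeights) -> float:
    """Summed height loss between films, as a percentage.

    Assumption: the percentage is expressed relative to the baseline sum.
    """
    base_total = baseline.total()
    if base_total <= 0:
        raise ValueError("baseline heights must sum to a positive value")
    return 100.0 * (base_total - follow_up.total()) / base_total


def meets_height_reduction_criterion(baseline: VertebralHeights,
                                     follow_up: VertebralHeights,
                                     threshold: float = 0.20) -> bool:
    """True if any single height fell by at least `threshold` of its baseline.

    Assumption: the criterion is applied to each height separately.
    """
    pairs = [
        (baseline.anterior, follow_up.anterior),
        (baseline.middle, follow_up.middle),
        (baseline.posterior, follow_up.posterior),
    ]
    return any((b - f) / b >= threshold for b, f in pairs)


# Invented measurements for illustration only, not study data.
baseline_film = VertebralHeights(anterior=25.0, middle=24.0, posterior=26.0)
second_film = VertebralHeights(anterior=18.0, middle=20.0, posterior=25.0)
loss_percent = summed_height_loss_percent(baseline_film, second_film)
is_candidate = meets_height_reduction_criterion(baseline_film, second_film)
```

Using relative rather than absolute height loss means the result does not depend on the measurement unit, but it does not correct for differences in magnification between the first and second films, which the text identifies as a separate source of error.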


We have shown that the volumetric size of a new (incident) spine fracture depends, most importantly, not only on the presence but also on the clinical category of any previous spine fracture. Age increased incident fracture size, but gender was apparently without influence. Low trochanteric bone density had a borderline independent effect in increasing the size of an incident spine fracture. Of greater significance was the finding that if the initial fracture was a crush or bi-concave fracture, the second and subsequent new fractures were substantially larger. These results have important implications for predicting those spine fractures that will be more severe, painful, or disabling. They also imply that the completeness or extent of collapse of a vertebral body depends not just on the measured amount of bone mineral that it contains but also on independent factors relating to bone “quality.” These are likely to be related, independently of BMD, to the presence and clinical categorization of any previous deformity. All radiologists who read lateral chest radiographs and/or spine radiographs should report the presence of a vertebral fracture (or other deformity) and identify those characteristics that mark its importance for diagnosis and prognosis of osteoporosis. This is not routinely done, at least in the United Kingdom, and many women (and men) with osteoporotic vertebral fractures go undetected.


This study was financially supported by a European Union Concerted Action Grant under Biomed-1 (BMH1CT920182), and EU Grants C1PDCT925102, ERBC1PDCT 930105, and 940229. The central coordination was also supported by the UK Arthritis Research Campaign, the Medical Research Council (G9321536), and the European Foundation for Osteoporosis and Bone Disease. The EU's PECO program linked to BIOMED 1 funded in part the participation of the Budapest, Warsaw, Prague, Piestany, Szczecin, and Moscow centers. Data collection from Zagreb was supported by a grant from the Wellcome Trust. The central X-ray evaluation was generously sponsored by the Bundesministerium für Forschung und Technologie, Germany. The remaining funding was provided by or through the following centers: Radiological Evaluation Center: Department of Radiology and Nuclear Medicine, Free University, Berlin, Germany (DF, WG, GA); Co-ordination and Data Evaluation Centers: University Institute of Public Health, Cambridge, UK (JR, CJT, ML) and ARC Epidemiology Unit, University of Manchester, UK (AJS, TWO'N, ML, AAI, JDF, WCC); Participating Investigative Centers: Behring Hospital, Berlin, Germany (DB); Institute of Rheumatology, Moscow, Russia (LIB); Royal National Hospital for Rheumatic Diseases, Bath, UK (AB); Hospital de Angra do Heroísmo, Azores, Portugal (JBA); Asturias General Hospital, Oviedo, Spain (JBC); University of Southampton, UK (CC); University Hospital, Leuven, Belgium (JD); University of Sheffield, UK (RE, JAK); Clinic for Internal Medicine, Jena, Germany (BF); Charles University, Prague, Czech Republic (SH); PKP Hospital, Warsaw, Poland (KH); Clinical Hospital, Zagreb, Croatia (IJ); Ruhr University, Bochum, Germany (JJ); Lund University, Malmö, Sweden (OJ); Medical Academy, Erfurt, Germany (GK); Hospital de San Joao, Oporto, Portugal (ALV); University of Athens, Greece (GL); Institute of Rheumatic Diseases, Piestany, Slovakia (PM); Institute of Social Medicine, Lubeck, Germany (CM, HHR); Academy of Medicine, Szczecin, Poland (TM); University of Siena, Italy (GP); Erasmus University, Rotterdam, Netherlands (HAPP); National Institute of Rheumatology and Physiotherapy, Budapest, Hungary (GP); University of Aberdeen, UK (DMR); Humboldt University, Berlin, Germany (WR); University of Heidelberg, Germany (CS-N); University Hospital, Graz, Austria (KW); Royal Cornwall Hospital, Truro, UK (ADW); Medical Institute, Yaroslavl, Russia (OBY).