Underdiagnosis of Vertebral Fractures Is a Worldwide Problem: The IMPACT Study

Abstract

Accurate radiographic diagnosis of vertebral fractures is important. This multicenter, multinational study assessed radiographic diagnoses of vertebral fracture in 2451 postmenopausal women with osteoporosis. Comparison between local and central readings yielded a false-negative rate of 34%. Underdiagnosis of vertebral fracture is a worldwide problem.

Introduction: Vertebral fractures are the most common complication of osteoporosis. Although they are associated with significant morbidity, they frequently do not come to clinical attention. Accurate radiographic diagnosis is important.

Materials and Methods: In a multicenter, multinational prospective study (the IMPACT trial), the accuracy of radiographic diagnosis of vertebral fracture was evaluated in postmenopausal women 65–80 years of age newly diagnosed with osteoporosis (based on BMD measurement). Lateral radiographs of the thoracolumbar spine were evaluated for identification of vertebral fractures, first locally and subsequently at a central reading center, using a validated semiquantitative method. False-positive and false-negative rates were calculated based on adjudicated discrepancies between the initial interpretation at the local site and the subsequent central reading, considered the “reference standard.”

Results: Of 2451 women with an evaluable radiograph both centrally and locally, 789 (32%) had at least one vertebral fracture. Adjudicated discrepancies (n = 350 patients) between local and central readings because of undetected vertebral fracture (68%) or equivocal terminology in the local radiology report (32%) yielded a false-negative rate of 34%.

Conclusions: Underdiagnosis of vertebral fractures was observed in all geographic regions (false-negative rates: North America, 45.2%; Latin America, 46.5%; Europe/South Africa/Australia, 29.5%). The false-positive rate was 5% globally. Underdiagnosis of vertebral fracture is a worldwide problem attributable in part to a lack of radiographic detection, use of ambiguous terminology in the radiology report, or both. Efforts to improve accuracy and reduce variability in terminology and interpretation may increase the effectiveness of spinal radiography for detecting vertebral fractures in patients with osteoporosis.

INTRODUCTION

VERTEBRAL FRACTURES ARE the most common type of osteoporotic fracture, occurring in ∼20% of postmenopausal women.(1) However, three-fourths of vertebral fractures do not come to immediate clinical attention(2); these are the so-called “clinically silent” vertebral fractures. Because such fractures are frequently undetected, osteoporosis may remain untreated and progress rapidly. Indeed, postmenopausal women with at least one vertebral fracture have a 5-fold increased risk of sustaining another vertebral fracture within the coming year(2) and a 2-fold increased risk of other fragility fractures, including hip fractures.(3) Height loss, kyphosis, chronic back pain, and back-related functional disability result when vertebral fractures remain untreated.(4–6)

Both symptomatic and asymptomatic vertebral fractures are associated with increased morbidity(7) and mortality.(4,6,8) The clinical significance of fractures of all severities was highlighted by a recent analysis of vertebral fracture incidence in the placebo arm of a large postmenopausal osteoporosis trial (MORE).(9) The findings from this study showed that the occurrence of any vertebral fracture, even one classified as mild or moderate, results in an increased risk of subsequent fractures and associated health consequences. With the number of aged people at risk for osteoporosis expected to increase dramatically in the next decades, accurate identification and treatment of these patients are necessary to reduce the enormous potential impact of this disease on patients and health care systems.

Because vertebral fractures are often unsuspected clinically, diagnosis depends on accurate radiographic detection and an unambiguous radiographic report of fracture. However, in a single-center retrospective study of hospitalized elderly women who had a lateral chest radiograph, 50% of radiographic reports failed to report the presence of moderate or severe vertebral fractures, and many patients remained untreated.(10)

The objective of this study was to assess, prospectively and globally, the accuracy of the spinal radiographic diagnosis of vertebral fractures by comparing results of local radiographic reports with those of subsequent central readings. Radiographs were collected as baseline studies in a trial that also assessed the impact of physicians' reinforcement of bone marker-based monitoring on patient compliance and persistence with risedronate treatment (the IMPACT study).

MATERIALS AND METHODS

Study participants

The study was conducted at 172 clinical sites in 21 countries on five continents and recruited ambulatory women between 65 and 80 years of age who had not been previously screened and diagnosed with osteoporosis. Subjects were prescreened at random by general practitioners or primary care physicians, and in countries where this was not possible, recruitment was performed by mailing or advertisement.

Diagnosis of osteoporosis was based on a BMD T score of ≤−2.5 at the hip or spine or a BMD T score between −2.5 and −1.0 plus a clinically documented low-trauma fracture sustained at age 45 years or later. All study participants gave written informed consent. The study protocol was approved by the local institutional ethics review board at each study site.
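
The diagnostic rule above can be written as a small predicate. The following Python sketch is illustrative only and is not part of the study protocol: the function and argument names are hypothetical, and the handling of the range boundaries (a T score of exactly −1.0 counts as low bone mass) is an assumption, because the text does not state how boundary values were treated.

```python
from typing import Optional


def meets_osteoporosis_criteria(
    hip_t_score: float,
    spine_t_score: float,
    low_trauma_fracture_age: Optional[int] = None,
) -> bool:
    """Return True if the BMD-based entry definition described in the text is met.

    low_trauma_fracture_age is the age at which a clinically documented
    low-trauma fracture occurred, or None if there was no such fracture.
    """
    lowest = min(hip_t_score, spine_t_score)

    # Criterion 1: T score of -2.5 or lower at the hip or spine.
    if lowest <= -2.5:
        return True

    # Criterion 2: T score between -2.5 and -1.0 (upper bound assumed inclusive)
    # plus a low-trauma fracture sustained at age 45 years or later.
    low_bone_mass = -2.5 < lowest <= -1.0
    qualifying_fracture = (
        low_trauma_fracture_age is not None and low_trauma_fracture_age >= 45
    )
    return low_bone_mass and qualifying_fracture
```

For example, `meets_osteoporosis_criteria(-2.0, -0.5, low_trauma_fracture_age=50)` is satisfied through the second criterion, whereas the same T scores without a documented fracture are not.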

Women who had previously taken any medication specifically for the treatment of osteoporosis and black women (because of their lower incidence of postmenopausal osteoporosis) were excluded. Other exclusion criteria included use of oral or parenteral glucocorticoid therapy (≥5 mg prednisone or equivalent/day) within 3 months or for >1 month within 6 months before study entry; a diagnosis of hypocalcemia, hypercalcemia, hyperparathyroidism, hyperthyroidism, or osteomalacia within 1 year before study entry; depot injection of >10,000 IU of vitamin D; use of progestogen within 3 months or for >1 month within the 6 months before study entry; or treatment with any investigational drug in the 30 days before study entry (16 weeks in Ireland).

Radiographic assessments

A radiographic procedure manual was used by all radiologists and was the principal tool for standardization of the image acquisition and evaluation procedures.(11)

Radiographic procedures were based on specified technical parameters such as imaging screen technique (Bucky table), film size, exposure time, kilovolt peak, collimation of the X-ray beam, patient positioning, focus-film distance (40 in or 100 cm), patients' breathing technique, inclusion of T12 in both thoracic and lumbar films, and centering at T7 and L2, respectively. All radiologists and technologists performing examinations for this study were licensed as required by local regulations and read and understood the requirements set forth in the quality assurance manual. No additional training regarding the criteria for vertebral fracture was provided. At each study site, the local radiologist was instructed to evaluate each radiograph for proper technique, subject positioning, clear depiction of all vertebrae, and the presence of vertebral fracture from T4 to L4. Local radiologists were instructed to use the same semiquantitative technique for fracture assessment that was used by the central radiologists.(12)

Criteria for good image quality included superimposition of vertebral endplates, complete superimposition of the posterior elements and posterior edges of the vertebral bodies, blurred rib contours and lung parenchyma on lateral thoracic radiographs, unobscured image of target vertebrae, and appropriate exposure enabling clear visibility of vertebral contours and trabeculae across the entire spine. All study participants with evaluable radiographs were included in the analysis.

All spinal radiographs were sent to a single central reading facility (Synarc, San Francisco, CA, USA), for confirmation of radiographic quality and evaluation of vertebral fracture by one of two radiologists, both having standardized training and using prespecified criteria and validated methods.(12–15) Prevalent vertebral fractures were assessed using the semiquantitative method, in which all vertebrae were graded by visual inspection from normal (grade 0) to severely deformed (grade 3).(12) Vertebral deformities unrelated to fracture, such as those associated with Scheuermann's disease and severe osteoarthritis, were excluded from the analysis.

Our goal was to identify the true errors that might occur at the local level in a typical clinical setting. Local and central radiographic interpretations were compared, and radiographs for which there were discrepancies were sent to an expert (HKG) for blinded review. Twenty nondiscrepant radiographs were also reviewed at the same time for quality control. The expert's opinion prevailed, and his findings were used to determine the true discrepancy rates. False-negative rates (%) and false-positive rates (%) were calculated based on the adjudicated discrepancies between the local and central readings, with the latter as the “reference standard.” We then calculated the κ values for the agreement between the fracture assessment at the central and local levels. Discrepancy rates were calculated for the entire study group and by prospectively defined geographic regions: (1) Europe/South Africa/Australia; (2) North America; and (3) Latin America. These regions consisted of countries that were part of the stratified randomization design of the IMPACT study. For this secondary analysis, the regions were consolidated based on similar prescribing practices within the countries representing those regions.

In addition to the comparative evaluation of the local and central radiographic assessment, we wanted to identify the source of those true errors. Local radiographic reports for this study were reviewed and translated into English by osteoporosis experts who were native speakers of the language in which the original report was written. Local reports describing vertebral fracture were considered positive, and those with no diagnosis of fracture were considered negative. Equivocal reports containing no diagnosis of fracture but with descriptive terminology such as vertebral deformity, biconcavity, compression, or mild collapse or loss of height of the vertebral body were considered negative if the investigator reported “no vertebral fracture” on the case report form. The local osteoporosis experts also reviewed the local radiologists' reports to confirm the accuracy of the investigators' interpretation of these reports.

RESULTS

The disposition of patients for this subanalysis of the IMPACT study is shown in Fig. 1. A total of 7153 postmenopausal women (≥65 to ≤80 years) who had not been previously diagnosed with osteoporosis were screened, of whom 2939 were eligible based on inclusion criteria. Spinal radiographs were performed on 2500 patients; both central and local evaluable radiographic reports were available for 2451 patients. The baseline characteristics of the participants included in this analysis are summarized in Table 1.

Table 1. Baseline Characteristics of Study Participants

FIG. 1. Patient disposition.

Of the 2451 patients evaluated both locally and centrally, 496 discrepancies were noted (336 false-negative, 160 false-positive). On analysis by the blinded expert, 76 of the false-negative radiographs were deemed accurately assessed at the local level, as were 70 of the false-positive radiographs. Thus, after the expert's validation, there were 789 patients (32%) with at least one vertebral fracture; the distribution by geographical region is shown in Table 1. There was no local identification of vertebral fracture for 266 patients, yielding a false-negative rate of 34% globally (Fig. 1). Regionally, the false-negative rate was highest in Latin America (46.5%) and North America (45.2%) compared with Europe/South Africa/Australia (29.5%). Among 1844 patients categorized as not having a vertebral fracture by the local radiologist, a vertebral fracture was confirmed by central reading in 266 (i.e., in 14% of them). Seventy-three percent (n = 194) of false-negative local radiographic reports provided no description of fracture or deformity. However, 27% (n = 72) of the false-negative local radiographic reports were considered equivocal in terms of fracture diagnosis because of the use of ambiguous terminology, such as “biconcavity,” “end plate compression,” “wedge deformity,” and “slight reduction in vertebral height.” In these cases, the local study site checked the box labeled “no vertebral fracture” on the case report form, at least in part because of these ambiguities. Furthermore, when assessing the accuracy of the investigator's interpretation of the local radiologist's report, the local osteoporosis experts identified 75 cases that were misinterpreted. Of these, 61 radiographs were false-negative and 14 were false-positive based on the comparison of the local and central radiographic reports.
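
The global false-negative figures quoted above follow directly from the reported counts. The short Python sketch below reproduces the arithmetic; it uses only numbers stated in the text, and the helper name `percent` is introduced here purely for illustration.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


# Counts reported in the text (after adjudication by the blinded expert).
confirmed_fracture_patients = 789   # at least one fracture on the reference reading
missed_locally = 266                # no local identification of fracture
local_negative_reports = 1844       # categorized locally as having no fracture
no_description = 194                # false-negative reports with no mention of deformity
equivocal_wording = 72              # false-negative reports with ambiguous terminology

false_negative_rate = percent(missed_locally, confirmed_fracture_patients)
missed_among_local_negatives = percent(missed_locally, local_negative_reports)
share_no_description = percent(no_description, missed_locally)
share_equivocal = percent(equivocal_wording, missed_locally)

print(f"False-negative rate: {false_negative_rate:.0f}%")                         # 34%
print(f"Fracture among local negatives: {missed_among_local_negatives:.0f}%")     # 14%
print(f"No description: {share_no_description:.0f}%, equivocal: {share_equivocal:.0f}%")  # 73%, 27%
```

Note that the two percentages answer different questions: 34% is the share of confirmed fractures that were missed locally, whereas 14% is the share of locally negative reports that turned out to have a fracture.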

For the 266 patients with false-negative radiographs, 134 (50.4%) were diagnosed with one prevalent vertebral fracture and 132 (49.6%) had more than one prevalent vertebral fracture. A total of 537 vertebral fractures were detected in the false-negative radiographs; these fractures were classified according to severity as follows: 299 (55.7%) grade 1, 191 (35.6%) grade 2, and 47 (8.8%) grade 3 fractures. For the 523 patients with true-positive radiographs, 272 (52.0%) were diagnosed with one prevalent fracture and 251 (48.0%) had more than one prevalent vertebral fracture. The distribution of severity for the 1105 fractures identified among these patients was 568 (51.4%) grade 1, 399 (36.1%) grade 2, and 138 (12.5%) grade 3 fractures. For all women with diagnosed fractures based on the validation of the local radiologists' interpretation (n = 789 false-negative and true-positive radiographs), 1642 vertebral fractures were classified according to severity as follows: 867 (52.8%) grade 1, 590 (35.9%) grade 2, and 185 (11.3%) grade 3 fractures.
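
As a consistency check, the severity distributions above can be tallied programmatically. The Python sketch below uses only the counts given in the text; the dictionary layout and variable names are illustrative choices, not part of the study's analysis.

```python
# Fracture counts by semiquantitative grade, as reported in the text.
false_negative = {1: 299, 2: 191, 3: 47}    # fractures on radiographs missed locally
true_positive = {1: 568, 2: 399, 3: 138}    # fractures on radiographs detected locally

# Combine the two groups grade by grade.
combined = {grade: false_negative[grade] + true_positive[grade] for grade in (1, 2, 3)}


def distribution(counts: dict) -> dict:
    """Return the percentage of fractures in each grade, rounded to one decimal."""
    total = sum(counts.values())
    return {grade: round(100.0 * n / total, 1) for grade, n in counts.items()}


for label, counts in [
    ("false-negative", false_negative),
    ("true-positive", true_positive),
    ("all fractures", combined),
]:
    print(label, sum(counts.values()), distribution(counts))
# false-negative 537 {1: 55.7, 2: 35.6, 3: 8.8}
# true-positive 1105 {1: 51.4, 2: 36.1, 3: 12.5}
# all fractures 1642 {1: 52.8, 2: 35.9, 3: 11.3}
```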

Figure 2 displays actual radiographs of grade 1 fractures that were not reported locally but were confirmed centrally and adjudicated. These false-negative local radiographic readings were located throughout the spine. The thoracolumbar distribution of all detected vertebral fractures is shown in Fig. 3. The proportion of underdiagnosed fractures in the thoracic spine ranged from 6% to 29% (at T4 and T9, respectively) and in the lumbar spine ranged from 10% to 16% (at L4 and L1, respectively).

FIG. 2. Actual radiographs of grade 1 fractures that were not reported locally. (A) False-negative local reading, with central reading detecting a mild wedge and endplate fracture at T8. (B) False-negative local reading, with central reading detecting a mild endplate fracture at T12. (C) False-negative local reading, with central reading detecting a mild endplate and wedge fracture at L2.

FIG. 3. Distribution of fractures by vertebral level at the thoracic (T) and lumbar (L) spine. The solid bars are the number of vertebral fractures not detected at the local level (false-negative); the striped bars are the total number of fractures identified by the central radiologists. FNR, false-negative rate.

Conversely, of 1662 women without vertebral fracture according to the adjudicated central reading, 84 were diagnosed locally with vertebral fracture, yielding a false-positive rate of 5% globally. The false-positive rates were similar across all regions, except in Latin America, where none were recorded. It should be noted that this region had the smallest number of patients evaluated (n = 189) compared with the other regions. Overall, 53% of the false-positive diagnoses were rendered unequivocally by the local radiologist, whereas in 47% ambiguous radiographic terminology in the report led to a “check” in the vertebral fracture box on the case report form.

For the 2451 patients available for the adjudicated comparative assessment, the percentage agreement and κ value for the local and central radiologists were 86% and 0.65 (95% CI, 0.62–0.68), respectively.
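
The agreement statistics can be reconstructed from the 2 × 2 table implied by the counts reported above. The Python sketch below assumes the standard (unweighted) Cohen's κ; the text does not state which κ variant or which confidence interval method was used, so the interval is not reproduced here. The true-negative count is derived (1662 − 84) rather than quoted from the text.

```python
# 2 x 2 table of local reading versus adjudicated central reading.
true_positive = 523               # fracture reported locally and confirmed centrally
false_negative = 266              # fracture confirmed centrally but not reported locally
false_positive = 84               # fracture reported locally but not confirmed centrally
true_negative = 1662 - false_positive   # derived: central negatives minus false positives

total = true_positive + false_negative + false_positive + true_negative

# Observed agreement: proportion of patients on whom both readings concur.
observed = (true_positive + true_negative) / total

# Agreement expected by chance, from the marginal totals of each reader.
local_positive = true_positive + false_positive
local_negative = false_negative + true_negative
central_positive = true_positive + false_negative
central_negative = false_positive + true_negative
expected = (
    local_positive * central_positive + local_negative * central_negative
) / total**2

# Cohen's kappa: agreement beyond chance, scaled by the maximum achievable.
kappa = (observed - expected) / (1 - expected)

false_positive_rate = 100.0 * false_positive / central_negative

print(f"N = {total}")                                     # N = 2451
print(f"Agreement = {100 * observed:.0f}%")               # Agreement = 86%
print(f"Kappa = {kappa:.2f}")                             # Kappa = 0.65
print(f"False-positive rate = {false_positive_rate:.0f}%")  # False-positive rate = 5%
```

Because about two-thirds of the women had no fracture on either reading, chance agreement alone is roughly 59%, which is why an observed agreement of 86% corresponds to a κ of only 0.65.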

DISCUSSION

In this study, vertebral fractures were frequently underdiagnosed worldwide in radiographic reports of postmenopausal women with osteoporosis. Regionally, false-negative rates ranged from 29.5% to 46.5%, despite a strict protocol that provided an unambiguous vertebral fracture definition and minimized or eliminated underdiagnosis because of inadequate film quality. Indeed, the radiologists in the study had specific procedures to follow, albeit no additional training in vertebral fracture assessment, to obtain high-quality radiographs and perform high-quality readings in patients considered to be eligible for an osteoporosis study. This could impose a bias toward aggressive identification of fractures; however, the high false-negative rate suggests the opposite. If the same study had been done with routine radiographic procedures, the false-negative rates may have been higher because of possible inclusion of inadequate-quality radiographs. This high rate of failure to diagnose vertebral fracture radiographically suggests that many patients who require treatment to reduce their fracture risk are not being properly identified.

There are other possible explanations for the high rate of failure to identify vertebral fractures in the local assessments. Even though the radiologists were required to report all fractures according to the standardized manual,(11) “old” fractures associated with degenerative changes are sometimes not reported. Because vertebral fracture was not a prerequisite for study enrollment, no incentive to identify fractures existed at the study sites. Nonetheless, local reviewers knew that the radiograph was part of a baseline study of a patient enrolled in an osteoporosis trial, so their radiographic review could have been more diligent than a routine review; however, many vertebral fractures were undetected.

There was a relatively low proportion of false-positive reports (5%). These included ambiguous or equivocal findings (47%) or decreases of vertebral height of <20% or degenerative remodeling changes (53%).

The κ score (0.65) and percentage agreement (86%) between the local and central readers, after adjudication by the blinded expert, were slightly lower in this study than those generally reported in postmenopausal women (>0.75 and >95%, respectively).(12–14) This study evaluated women without previously diagnosed osteoporosis, which may account for some of the differences between studies.

Although there are alternative approaches to the identification of vertebral fracture, there is no generally agreed “gold standard.” Quantitative assessments of vertebral fractures are moderately sensitive and specific but complicated and tedious to perform and generally are not applicable to clinical practice.(12,16–18) Semiquantitative assessment of vertebral fractures, as applied in this study, can be performed quickly on a routine basis(12,14,19,20) and correlates moderately well with quantitative morphometry(15,21) but also has limitations. The semiquantitative method detects more fractures, particularly midthoracic grade 1 fractures,(12,14) than quantitative morphometry. Thus, the false-negative rate in the midthoracic region could have been slightly overestimated in this study.

The majority of fractures that were missed were grade 1 (mild) fractures. Whereas the definition and clinical significance of mild fractures have been a subject of research and controversy, these fractures undoubtedly have clinical implications.(9,16,22–24) In one study, height ratios of mildly deformed vertebrae were an independent predictor of vertebral fracture risk in pre- and postmenopausal women.(25) Another study showed progressive deterioration of vertebral bodies in 89 postmenopausal women 55–68 years of age experiencing acute back pain who had radiologically confirmed mild or no vertebral fracture.(26) Lateral radiographs of the thoracic and lumbar spine assessed every 6 months revealed a progression of deformity from vertebral wedging to full collapse of vertebral bodies at variable time intervals over the 18-month study. A recent analysis of the placebo arm of a large osteoporosis trial (MORE) showed that the presence of at least one mild vertebral fracture at baseline was associated with a 2-fold increase in subsequent vertebral fractures over 3 years, including more serious ones.(9) Thus, the predictive value of grade 1 fractures for subsequently more severe fractures and their clinical correlation with acute back pain emphasize their clinical significance and the need for accuracy in their radiographic identification.

The use of ambiguous terminology in the radiologist's narrative contributed to false-negative readings in this study, and it is likely that the same is true in the clinical setting. For example, a retrospective analysis reported that, where fractures were mentioned in the narrative summary of the report, <50% contained a firm diagnosis of fracture in the summary impressions.(10) Thus, it is recommended that a consensus be reached regarding specific terminology for the radiographic diagnosis of vertebral fracture. The use of inconclusive, imprecise terms such as “biconcavity,” “wedge deformity,” and “slight reduction in height” should be avoided. The term “fracture” should be used consistently whenever radiographic deformities indicating fracture are identified, and the fracture grade or severity should be provided if possible. A standardized radiology report form would also help to eliminate the potential for investigators to miss a documented fracture that may have been obscured in the report. However, it should be noted that because fracture cannot be assessed according to a stringent dichotomous variable, there will always be some discrepancies between individuals with no fractures and those with mild fractures.(12,14,27)

Another potential limitation of this study is in the prospective grouping of geographic regions and the unequal sample sizes that result. This unequal distribution of patients by geographic region limits the precision of the regional point estimates. Interestingly, our study revealed higher false-negative rates in the Americas compared with the other regions (Europe/South Africa/Australia). Because there were no differences in clinical characteristics of patients between these regions, the reasons for this difference remain unclear. In addition, the clinical trial setting may have introduced a sampling bias; nevertheless, the trial was designed to reflect a real-life setting of undiagnosed osteoporosis by recruiting only subjects who had previously undiagnosed disease.

Because of the magnitude of missed fractures, a second approach to identifying vertebral fractures, such as quantitative morphometry, might have provided insight into the reasons for the discrepancies between the local and central readings. However, such a second method is not typically performed in clinical practice and therefore was not considered necessary in the context of this study. Nonetheless, this limitation needs to be considered when evaluating the results from this analysis.

In conclusion, prospective data from spinal radiographs of postmenopausal women with osteoporosis, interpreted locally and reviewed centrally with adjudication by a blinded expert, show that vertebral fractures are underdiagnosed worldwide. Underdiagnosis of vertebral fractures may lead to decreased rates of diagnosis and treatment of osteoporosis in postmenopausal women.

Acknowledgements

Funding for this study was provided by the Alliance for Better Bone Health (Procter & Gamble Pharmaceuticals, Cincinnati, OH, USA and Aventis, Bridgewater, NJ, USA). The authors gratefully acknowledge Anne Le-Moigne-Amrani for critical review of the study methodology and contributions to the statistical analysis performed in this study, Drs Chun Wu, G Von Ingersleben, and Guirong Jiang for interpretation of the radiographs, and Dr Karen Mittleman for editorial assistance.
