Vertebral fractures are associated with higher morbidity and mortality. Since 70% of vertebral fractures are clinically silent, a radiologic image of the spine has to be acquired for the diagnosis. The aim of this study was to compare the performance of Vertebral Fracture Assessment (VFA) by dual x-ray absorptiometry (DXA) with radiographs to identify vertebral fractures in community-dwelling older adults.
A total of 429 older adults (ages ≥65 years) were enrolled in this cohort. VFA images obtained by DXA were evaluated by consensus by 2 expert rheumatologists, and spine radiographs were analyzed according to the semiquantitative method by an expert radiologist. The agreement between VFA and spine radiographs in identifying vertebral fractures was analyzed by kappa scores.
The prevalence of vertebral fractures in VFA and radiographs was 29.1% and 29.4%, respectively (P = 0.99). The frequency of unavailable vertebrae was significantly lower in spinal radiographs than in VFA (0.9% and 5.6%, respectively; P < 0.001), particularly in T4–T6. According to VFA, 5,013 vertebrae (96%) were identified as normal and 144 (2.7%) had grade 1, 58 (1.1%) had grade 2, and 12 (0.2%) had grade 3 fractures. For identifying vertebral fractures, the sensitivity of VFA was 72.9% and the specificity was 99.1%. The sensitivity increased to 92% and the specificity increased to 99.9% when excluding grade 1 deformities. Good agreement between VFA and radiographs (κ = 0.74) was observed, and the exclusion of grade 1 deformities resulted in even better agreement (κ = 0.84).
In community-dwelling older adults, VFA and radiographs had comparable performances in identifying vertebral fractures, particularly if mild deformities are excluded. Therefore, this methodology is a feasible and promising alternative to improve the management of patients with a high risk of osteoporotic fractures.
Vertebral fractures have been reported in approximately one-third of the older population (1), a prevalence comparable to that observed in female (27.5%) and male (31.8%) community-dwelling older Brazilian subjects (2).
Vertebral fractures are associated with higher morbidity and mortality in the general population. Indeed, the presence of a vertebral fracture increases the risk for new vertebral fractures from 3- to 12.6-fold and increases the risk for nonvertebral fractures from 1.6- to 4-fold (3–6). Moreover, the survival rate of patients with vertebral fractures is lower compared to those without fractures, with a risk of death increased by 2.5-fold in men and women (7–9).
Since 70% of vertebral fractures are clinically silent (6), radiologic images of the spine have to be acquired for the diagnosis, and their detection is an indication for osteoporosis treatment (10). Spine radiographs remain the gold standard method, but lateral spine imaging obtained by dual x-ray absorptiometry (DXA) devices has emerged as an alternative assessment for the diagnosis of vertebral fractures (11–16). Most available studies demonstrating the effectiveness of Vertebral Fracture Assessment (VFA) by DXA, however, have included only women with a specific indication for spine radiographs (height loss, risk factors for osteoporosis and back pain, and long-term corticosteroid therapy) and postmenopausal women participating in osteoporosis treatment trials, precluding a definitive conclusion about the overall specificity and sensitivity of this method. Moreover, in general, the mean age in these studies has been <70 years and the studies have included smaller samples. There are few studies with male subjects and community-dwelling people (16–21).
Therefore, the aim of this study was to compare VFA by DXA and radiograph performances to identify vertebral fractures in healthy community-dwelling older men and women.
Significance & Innovations
This article provides clinically important results because it shows that the Vertebral Fracture Assessment (VFA) by dual x-ray absorptiometry (DXA) is a valid method and has excellent reproducibility for the diagnosis of vertebral fractures. Being a tool that might be used concomitantly with the measurement of bone mineral density by DXA, it has an undoubted practical advantage. In addition, compared to spine radiographs, the patients are exposed to a lower load of radiation with VFA.
This study is relevant to clinical practice for rheumatologists, who often treat patients with a high probability of having vertebral fractures, 70% of which are asymptomatic.
SUBJECTS AND METHODS
This study is part of a previous epidemiologic project (2), including a population-based survey (São Paulo Ageing & Health [SPAH] study). This study was conducted from June 2005 to July 2009 and involved individuals ages ≥65 years living in the Butanta district, located in the western area of São Paulo. All individuals were apparently healthy and showed no evidence of malabsorption, chronic diarrhea, hepatic disease, severe chronic diseases, or cancer.
Four hundred twenty-nine subjects (259 women and 170 men) participating in the SPAH study were included in this study. For each individual, age, height, weight, and body mass index (BMI; calculated as kg/m²) were obtained. A specific questionnaire, including the assessment of current thoracic/lumbar pain, was completed.
Assessment of vertebral fractures.
All study participants who agreed to participate and who provided signed informed consent had bone densitometry of the hip and spine and a VFA scan of the thoracolumbar spine performed using the same DXA device (Discovery model, Hologic), with subjects in the supine position. All DXA measurements were performed by the same experienced technologist (LT). On that same day, standard lateral thoracic and lumbar spine radiographs were taken using a 40-inch tube-to-film distance centered at T7 and L2. These radiographs were not digitized.
All VFA and radiographic images were independently evaluated by 2 experienced rheumatologists (DSD, JBL) and 2 expert skeletal radiologists (MEK), respectively. For the identification of vertebral fractures, in both methods, the readers evaluated the image of each vertebra from T4–L4 to decide whether it contained a fracture. Differences of interpretation between the readers were resolved by consensus. Nonvisible vertebrae were excluded. Only adequately visualized vertebrae were analyzed for deformity using the Genant semiquantitative approach (22). Each identified fractured vertebra was classified by grade based on the Genant semiquantitative scale, where mild (grade 1) = a reduction of 20–25% of anterior, middle, and/or posterior height relative to the adjacent vertebral bodies; moderate (grade 2) = a reduction of 26–40% in any height; and severe (grade 3) = a reduction of >40% in any height.
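The grading thresholds above can be written as a short lookup. The following Python sketch is illustrative only: the readers in this study graded vertebrae visually rather than from a measured percentage, and the function name `genant_grade`, the treatment of reductions below 20% as normal (grade 0), and the handling of values between 25% and 26% are assumptions introduced here.

```python
def genant_grade(height_reduction_pct: float) -> int:
    """Map a percentage height reduction to a Genant grade (0-3).

    Assumptions (not stated in the text): reductions below 20% count as
    normal (grade 0), and values between 25% and 26% fall into grade 2.
    """
    if height_reduction_pct < 0:
        raise ValueError("height reduction cannot be negative")
    if height_reduction_pct < 20:
        return 0  # below the grade 1 threshold
    if height_reduction_pct <= 25:
        return 1  # mild: 20-25%
    if height_reduction_pct <= 40:
        return 2  # moderate: 26-40%
    return 3      # severe: >40%


if __name__ == "__main__":
    for pct in (10, 22, 30, 45):
        print(pct, genant_grade(pct))
```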
The performance of VFA in the diagnostic evaluation of vertebral fractures was compared to that of radiographs (gold standard). All of the analyses were performed for the entire group, but men and women were also analyzed separately.
The number of unreadable vertebrae was compared between the 2 methods using McNemar's test. The vertebral visibility between different spine segments was compared by the chi-square test. Factors affecting vertebral visibility (i.e., sex, height, and BMI) were tested using Fisher's exact tests for the categorical variable and Spearman's correlations for continuous variables.
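As an illustration of the paired comparison of unreadable vertebrae, the sketch below computes a two-sided exact (binomial) McNemar P value from the two discordant counts. The text does not state which variant of McNemar's test was used, so the exact form is an assumption, and the counts in the usage lines are hypothetical rather than study data.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar P value.

    b: pairs unreadable on method A only; c: pairs unreadable on method B only.
    Under the null hypothesis the discordant pairs split 50/50, so the
    smaller count follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical discordant counts, for illustration only.
    print(mcnemar_exact_p(1, 9))
```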
To verify the reliability between the readers for each method, a random subsample of 60 VFA and radiographic images was independently evaluated by the 2 rheumatologists and the 2 radiologists, respectively. Interrater reliability between the readers for VFA and radiographs was expressed using the kappa statistic. Agreement between the readers was high, with interobserver kappa coefficients of 0.78 for VFA scans and 0.82 for radiographs.
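For readers less familiar with the kappa statistic, the following sketch shows how an unweighted Cohen's kappa is obtained from a 2 × 2 agreement table for two readers. It is a minimal illustration, not the authors' Stata procedure; the function name and the counts in the usage lines are hypothetical.

```python
def cohen_kappa(both_pos: int, a_only: int, b_only: int, both_neg: int) -> float:
    """Unweighted Cohen's kappa for two readers and a binary rating."""
    n = both_pos + a_only + b_only + both_neg
    observed = (both_pos + both_neg) / n  # proportion of images rated the same
    a_pos = both_pos + a_only             # images reader A called fractured
    b_pos = both_pos + b_only             # images reader B called fractured
    # Agreement expected by chance from the two readers' marginal rates.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical counts for a subsample of 60 images, for illustration only.
    print(round(cohen_kappa(15, 3, 2, 40), 2))
```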
At the vertebral level, the overall agreement between VFA and radiographs for evaluable vertebrae, taking into account the presence or absence of vertebral fracture and the grade of fracture, was expressed using the kappa score. The sensitivity and specificity of VFA with exact 95% confidence intervals (95% CIs), diagnostic accuracy of VFA, positive and negative likelihood ratios (LRs), and positive and negative predictive values were calculated. All of these analyses were also performed excluding mild (grade 1) fractures to verify if this would result in an improvement of performance.
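An exact binomial confidence interval of the kind mentioned above is commonly computed with the Clopper-Pearson method. The sketch below is one way to obtain it by bisection using only the Python standard library. That the exact intervals in this study were Clopper-Pearson intervals is an assumption, and the counts in the usage lines are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact two-sided (1 - alpha) confidence interval for a proportion x/n."""

    def solve(func, increasing: bool) -> float:
        # Bisection for the p in [0, 1] at which func(p) equals alpha / 2.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if (func(mid) < alpha / 2) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha / 2, which increases with p.
    lower = 0.0 if x == 0 else solve(lambda p: 1 - binom_cdf(x - 1, n, p), True)
    # Upper bound: P(X <= x | p) = alpha / 2, which decreases with p.
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), False)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical example: 45 true positives among 50 fractured vertebrae.
    print(clopper_pearson(45, 50))
```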
At the patient level, subjects were classified as fractured (at least 1 fracture in visible vertebrae) or nonfractured (all visible vertebrae from T4–L4 with no fracture). The overall agreement between the methods (considering individuals with fractured and nonfractured vertebrae) was expressed using the kappa score. Again, the sensitivity and specificity with exact 95% CIs, diagnostic accuracy of VFA, positive and negative LRs, and positive and negative predictive values were calculated. All analyses were also performed excluding mild (grade 1) fractures to verify if this would result in an improvement of performance.
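The performance measures listed above all derive from the same 2 × 2 table of VFA results against radiographs. The sketch below shows the standard definitions; the function name, the dictionary keys, and the counts in the usage lines are hypothetical and are not taken from the study tables.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard diagnostic measures from a 2 x 2 table.

    tp, fn: fractured on the reference test, positive/negative on the index test.
    fp, tn: nonfractured on the reference test, positive/negative on the index test.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # The positive LR is unbounded when there are no false positives.
        "lr_pos": sensitivity / (1 - specificity) if fp else float("inf"),
        "lr_neg": (1 - sensitivity) / specificity,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in diagnostic_metrics(tp=80, fp=10, fn=20, tn=890).items():
        print(f"{name}: {value:.3f}")
```

When there are no false-positive results, the positive LR cannot be computed as a finite number, which is why the sketch returns infinity in that case.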
Moreover, a random subsample of 100 spine radiographic images was evaluated regarding the adjacent disc spaces for osteophyte formation and overall osteoarthritis severity, according to a modification of the Kellgren/Lawrence scale, as normal, mild, or moderate/severe (23). To describe the influence of disc space osteoarthritis on fracture assessment at the vertebral level, the agreement (kappa score) between VFA and radiographs for detecting fractures was calculated before and after eliminating vertebrae with moderate or severe osteoarthritis in both the superior and inferior disc spaces. The sensitivity and specificity of VFA compared to radiographs were also calculated. Significance was set at P values less than 0.05. All analyses were performed using Stata software, version 9.0.
RESULTS
Two hundred fifty-nine women with a mean ± SD age of 73.5 ± 5.4 years and 170 men with a mean ± SD age of 72.3 ± 4.5 years were studied. The mean ± SD height and weight for women were 150 ± 6.3 cm and 64.5 ± 14.1 kg, respectively, and for men were 163.1 ± 7.4 cm and 70.4 ± 12.5 kg, respectively. The mean ± SD BMI was 28.5 ± 5.6 kg/m² for women and 26.4 ± 3.8 kg/m² for men.
Visibility of the spine.
Of 5,577 potentially readable vertebrae from T4–L4, the frequency of unavailable vertebrae was significantly lower in spine radiographs than in VFA (0.9% and 5.6%, respectively; P < 0.001). On radiographic films, 92.3% of the unreadable vertebrae (48 of 52) were located in the segment T4–T6 and 99.9% of the vertebral bodies (4,286 of 4,290) from T7–L4 were effectively visualized (P < 0.001). On VFA scans, nonvisible vertebrae were also mostly located in T4–T6 (170 [54.8%] of 310) and, in the segment T7–L4, 96.7% of the vertebral bodies (4,150 of 4,290) had good visibility to assess the presence of fractures (P < 0.001). The visibility of the spine by VFA is shown in Figure 1. There was a significant difference in the frequency of visible vertebrae between segments T4–T6 and T7–L4 (P < 0.001). Sex (P = 0.33), height (P = 0.05), and BMI (P = 0.84) did not affect the visibility of the vertebrae by VFA (data not shown).
Identification of vertebral fractures.
At the vertebral level.
On spine radiographs, 4,999 vertebrae (95.6%) were normal and 134 (2.5%) had grade 1, 74 (1.3%) had grade 2, and 20 (0.4%) had grade 3 fractures. According to VFA, 5,013 vertebrae (96%) were identified as normal and 144 (2.7%) had grade 1, 58 (1.1%) had grade 2, and 12 (0.2%) had grade 3 fractures. Comparison between VFA and radiographs is shown in Table 1. Concerning the concordance between VFA and radiographs, 4,952 vertebrae (94.8%) were normal and 86 (1.6%) had grade 1, 43 (0.8%) had grade 2, and 12 (0.2%) had grade 3 fractures on both methods.
Table 1. VFA versus radiographic fracture interpretation (at the vertebral level)*
VFA = Vertebral Fracture Assessment.
There was good overall agreement between VFA and radiographs at the vertebral level (κ = 0.74) and by grade of vertebral fracture (κ = 0.68). Both results were even better when grade 1 deformities were excluded (κ = 0.84 and κ = 0.80, respectively). Further analysis by vertebral levels revealed good to very good agreement for all levels (κ = 0.61–0.83). Analysis of solely grades 2 and 3 fractures also resulted in a better agreement, except for the T4–T6 levels (Table 2).
Table 2. Agreement between Vertebral Fracture Assessment and radiographs*
Column headings: All grades of fractures; Including only grades 2 and 3 fractures.
Values are the kappa statistic. NA = not applicable (no vertebral fracture was detected on the T4 and T10 levels).
The overall sensitivity of VFA for fracture diagnosis was 72.9% (95% CI 66.7–78.6%) and the specificity was 99.1% (95% CI 98.8–99.3%). Therefore, the diagnostic accuracy of VFA compared to radiographs was 97.9% (95% CI 97.8–98%). The positive and negative predictive values for fractures using VFA were 78.0% (95% CI 71.9–83.4%) and 98.8% (95% CI 98.4–99.1%), respectively. Furthermore, when only grades 2 and 3 fractures were considered, the sensitivity of VFA increased to 92.0% (95% CI 86.0–99.0%), whereas the other performance parameters remained equivalent to those of the analysis that included grade 1 deformities, with specificity of 99.9% (95% CI 99.8–100%), diagnostic accuracy of 99.5% (95% CI 99.4–99.6%), positive predictive value of 77.0% (95% CI 68.0–86.0%), and negative predictive value of 99.6% (95% CI 99.4–99.8%). The positive and negative LRs of VFA were 77.5 (95% CI 57.7–104) and 0.27 (95% CI 0.22–0.34), respectively. The concordance between the methods was higher in the case of grade 2 or 3 fractures, with a positive LR of 255.4 (95% CI 160.1–406.7) and a negative LR of 0.08 (95% CI 0.03–0.18). The positive LR was lower for grade 1 fractures: 42 vertebrae that were nonfractured on radiographs were classified as having grade 1 fractures on VFA, and 44 grade 1 fractures on radiographs were classified as normal on VFA (Table 1).
Analyzing by vertebral levels, the sensitivity and positive predictive value decreased from the lumbar spine to the upper thoracic spine. In contrast, the specificity and negative predictive value remained above 95%. Of note, exclusion of grade 1 fractures improved the diagnostic value of VFA to detect fractures in each vertebral level (Table 3).
Table 3. Diagnostic performance of Vertebral Fracture Assessment to detect grades 2 and 3 fractures by vertebral level*
Values are the percentage. PPV = positive predictive value; NPV = negative predictive value; NA = not applicable (no vertebral fracture was detected on the T4 level).
Analyzing female and male subjects separately, the kappa scores between VFA and radiographs, with all grades of fractures and excluding grade 1 deformities, were 0.75 and 0.86 for female subjects and 0.73 and 0.82 for male subjects, respectively. The diagnostic performance of VFA compared to radiographs also showed similar results to those found for the entire population (Table 4).
Table 4. Diagnostic value of VFA to detect vertebral fractures by sex (at the vertebral level and the individual level)*
Row headings (listed once for each sex): All grades of fractures, vertebral level; All grades of fractures, individual level; Grade 1 fractures excluded, vertebral level; Grade 1 fractures excluded, individual level.
VFA = Vertebral Fracture Assessment; PPV = positive predictive value; NPV = negative predictive value; LR = likelihood ratio; NA = not applicable (no disagreement between VFA and radiographs concerning absence of grades 2 and 3 vertebral fractures in female subjects).
Concerning the osteoarthritis analysis, the kappa score between VFA and radiographs for all grades of fractures improved slightly after eliminating the vertebrae with osteoarthritis (0.77 before and 0.83 after). However, no improvement in the sensitivity (83% [95% CI 66–99%] before and 84% [95% CI 68–99%] after) or specificity (82% [95% CI 69–94%] before and 82% [95% CI 70–94%] after) was observed after eliminating the vertebrae with osteoarthritis.
At the individual level.
On VFA scan and spine radiographs, the prevalence of vertebral fractures was 29.1% (125 of 429 subjects) and 29.4% (126 of 429 subjects), respectively (P = 0.99). On VFA, the frequencies of vertebral fractures in women and men were 29.7% (77 of 259) and 28.2% (48 of 170), respectively (P = 0.74). On spine radiographs, 28.6% of women (74 of 259) and 30.6% of men (52 of 170) had vertebral fractures (P = 0.65). Separating by sex, there was no difference between VFA and radiographs related to the prevalence of vertebral fractures (P = 0.69 for women and P = 0.5 for men).
Using radiographs, among 126 individuals with fractures, 61 subjects (48.4%) presented only grade 1 deformities, 49 (38.9%) had at least one grade 2 fracture, and 16 (12.7%) had at least one grade 3 fracture. There was a mean ± SD of 1.9 ± 1.1 fractured vertebrae among subjects with fractures. On VFA scan, among 125 individuals with fractures, 76 subjects (60.8%) presented only grade 1 fractures, 39 (31.2%) had at least one grade 2 fracture, and 10 (8%) had at least one grade 3 fracture. There was a mean ± SD of 1.7 ± 1.1 fractured vertebrae among subjects with fractures. The overall agreement between VFA and radiographs was good (κ = 0.75), and it was higher after excluding grade 1 fractures (κ = 0.91).
The sensitivity of VFA was 81.7% (95% CI 73.9–88.1%) and the specificity was 92.7% (95% CI 89.2–95.4%). Therefore, the diagnostic accuracy of VFA compared to radiographs was 90.0% (95% CI 89.1–90.8%). The positive and negative predictive values for fractures using VFA were 82.4% (95% CI 74.6–88.6%) and 92.4% (95% CI 88.9–95.1%), respectively. The positive and negative LRs of diagnosis of vertebral fracture by VFA were 11.3 (95% CI 7.5–17.0) and 0.20 (95% CI 0.14–0.28), respectively. Furthermore, the diagnostic performance of VFA was also better when only grades 2 and 3 fractures were considered, with sensitivity of 88.2% (95% CI 76.1–95.6%), specificity of 99.6% (95% CI 98–99.9%), diagnostic accuracy of 97.9% (95% CI 97.7–98.1%), positive predictive value of 97.8% (95% CI 88.5–99.9%), negative predictive value of 97.9% (95% CI 95.5–99.2%), positive LR of 248 (95% CI 35–1,759), and negative LR of 0.10 (95% CI 0.05–0.2).
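To show what these likelihood ratios mean for an individual result, the sketch below converts a pretest probability into a posttest probability through the odds form of Bayes' theorem. Taking the radiographic prevalence in this cohort (126 of 429) as the pretest probability is an assumption made for illustration, and the function name is hypothetical; the LRs of 11.3 and 0.20 are the values reported above.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Posttest probability from a pretest probability and a likelihood ratio."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Assumed pretest probability: the radiographic prevalence in this cohort.
prevalence = 126 / 429
after_positive = post_test_probability(prevalence, 11.3)  # positive VFA
after_negative = post_test_probability(prevalence, 0.20)  # negative VFA

if __name__ == "__main__":
    print(f"after a positive VFA: {after_positive:.3f}")
    print(f"after a negative VFA: {after_negative:.3f}")
```

Under this assumption, the probability after a positive VFA lands close to the reported positive predictive value, and the probability after a negative VFA lands close to the complement of the reported negative predictive value, as expected when the pretest probability equals the cohort prevalence.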
Separating by sex, the kappa scores between VFA and radiographs with all grades of fractures and excluding grade 1 deformities were 0.77 and 0.94 for female subjects and 0.72 and 0.87 for male subjects, respectively. The diagnostic value of VFA to identify individuals with at least one vertebral fracture was similar, considering the entire population and separating by sex (Table 4).
None of the evaluated subjects had previous knowledge of having vertebral fractures. Moreover, concerning the history of back pain, 53% of older individuals reported this symptom. Only 17% of these subjects with back pain had radiographic vertebral fractures compared to 12% of individuals without back pain who had radiographic vertebral fractures (P = 0.13).
DISCUSSION
Our study showed a very good performance of VFA to detect prevalent vertebral fractures, with a high positive predictive value for fracture diagnosis and a high negative predictive value for fracture exclusion.
The biggest advantage of the present study design is the evaluation of a large population composed solely of individuals ages ≥65 years, since the highest risk of vertebral fractures occurs after this age (3–6). Furthermore, only community-dwelling subjects were included. Another advantage of this study is the individualized analysis by sex. VFA has previously received limited evaluation in men. Our results confirmed the excellent diagnostic value of VFA in men and women. Vallarta-Ast et al also demonstrated the utility of VFA, but in a population with specific indications for DXA (24).
The VFA evaluation strictly follows the International Society for Clinical Densitometry (ISCD) (25) recommendation that the diagnosis of vertebral fractures should be done by visual evaluation according to the semiquantitative method (22), which has excellent inter- and intraobserver agreement. Although the semiquantitative approach depends mostly on the examiner's experience and training, it has the great advantage of being easily applicable in clinical practice, contrasting with morphometry, which is not reliable for diagnosis. The semiquantitative method demonstrated high accuracy for identifying grades 2 and 3 fractures, but demonstrated lower accuracy for identifying grade 1 deformities. Previous studies have suggested that grade 1 deformities do not, in fact, represent clinically significant fractures and do not confer increased risk of new fractures or higher mortality in these individuals (2, 26, 27). These changes would be related more to the aging process, including kyphosis and degenerative changes in the spine (28, 29).
VFA by DXA has other clear benefits compared to conventional radiography. Besides the possibility of a simultaneous evaluation with lumbar and hip bone mineral density acquisition, it has the advantage of reducing the impact of parallax effects on fracture visualization. In addition, VFA delivers a much lower dose of radiation to the patient (entrance dose 30–50 μSv) than thoracic/lumbar spine radiographs (1,800–2,000 μSv) (30, 31). VFA also has a lower cost than radiographs ($44 versus $92 in the US) (32). Good cost-effectiveness was demonstrated for this method (33, 34).
We have confirmed previous studies (12, 19, 28, 35, 36) indicating that vertebral visibility in the upper thoracic spine (above T7) is the main limitation of VFA due to the restricted image resolution of the DXA device in this region. This was also true for radiographic assessment, although to a lesser extent. This limitation does not seem to be a significant disadvantage, since most vertebral fractures were located at the mid-lower thoracic and lumbar spine (37), where there is good visibility. This reinforces the notion that VFA by DXA remains an important tool for clinical practice based on the high positive predictive value.
Importantly, the specificity and the negative predictive value were high (above 90%) at all vertebral levels. These findings have relevant clinical implications: they emphasize that false-positive results are rare on VFA and suggest that it is not necessary to perform spine radiographs if there is no vertebral fracture on VFA. On the other hand, we showed that the agreement (kappa statistic) between VFA and radiographs was considerably lower in the T4–T6 segment, with a lower sensitivity and positive predictive value of VFA in this segment. This means that more false-negative results can occur at the upper thoracic spine.
The positive and negative LRs were better for grades 2 and 3 fractures than grade 1 deformities, which is consistent with previous reports (19–21). These results indicate greater disagreement between the methods for the presence of mild fractures. We also extended a prior observation that sex, height, and BMI did not seem to influence spinal visibility (18, 21).
The presence of an osteoporotic vertebral fracture on an image, even a subclinical image, is a formal indication for osteoporosis therapy, independently of bone mineral density (10). Therefore, knowledge of prevalent vertebral fractures can alter patient management decisions and result in initiation of therapy to reduce fracture risk in some patients who would not otherwise be treated. Cost–benefit analysis demonstrates that identifying and treating patients with vertebral fractures, even those with a densitometric classification of osteopenia, is cost effective (33, 34). Therefore, the inclusion of VFA, concomitant with routine bone density testing, seems to prompt incorporation of knowledge into fracture risk estimation, which ultimately would optimize the management of patients with osteoporosis.
Nevertheless, vertebral deformities do not always correspond to vertebral fractures, and conventional radiography is the best way to identify and confirm these abnormalities in clinical practice. There are other potential differential diagnoses for vertebral deformities, such as osteoarthritis, diffuse idiopathic skeletal hyperostosis, Scheuermann's disease, congenital malformation, Paget's disease, and neoplastic and inflammatory changes. All of these causes can influence the morphology of the vertebrae and can therefore mimic fractures, leading to false-positive diagnoses of fractures (38). According to the 2007 ISCD official positions (25), VFA is designed to detect vertebral fractures and not other deformities. The presence of sclerotic or lytic changes or findings suggestive of conditions other than osteoporosis is an indication for following VFA with another imaging modality.
In this study, we did not see any clear improvement in the sensitivity and specificity of VFA relative to radiographs by excluding those vertebrae with moderate or severe osteoarthritis in the adjacent disc spaces, but this analysis was done in a smaller subsample. Our study also has other limitations: first, we did not evaluate the influence of scoliosis on vertebral visibility, and second, because our study has a cross-sectional design, we cannot establish the prevalence of clinical vertebral fractures (acute back pain). However, it is known that most vertebral fractures are diagnosed by radiologic images. Gehlbach et al, evaluating chest radiographs of older women who had been hospitalized for several causes, found that few vertebral fractures had been identified by clinicians previously (39). Indeed, in our study, one-third of older subjects had this condition, although no one had prior knowledge of having vertebral fractures.
In conclusion, in community-dwelling older adults, VFA had a performance comparable to radiographs to identify clinically relevant grades 2 and 3 vertebral fractures. This methodology is easily applicable and seems to be a promising alternative to spine radiographs, improving the management of patients with a high risk of osteoporotic fractures.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be published. Dr. Pereira had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.