Finite element analysis of computed tomography (CT) scans provides noninvasive estimates of bone strength at the spine and hip. To further validate such estimates clinically, we performed a 5-year case-control study of 1110 women and men over age 65 years from the AGES-Reykjavik cohort (case = incident spine or hip fracture; control = no incident spine or hip fracture). From the baseline CT scans, we measured femoral and vertebral strength, as well as bone mineral density (BMD) at the hip (areal BMD only) and lumbar spine (trabecular volumetric BMD only). We found that for incident radiographically confirmed spine fractures (n = 167), the age-adjusted odds ratio for vertebral strength was significant for women (2.8, 95% confidence interval [CI] 1.8 to 4.3) and men (2.2, 95% CI 1.5 to 3.2) and for men remained significant (p = 0.01) independent of vertebral trabecular volumetric BMD. For incident hip fractures (n = 171), the age-adjusted odds ratio for femoral strength was significant for women (4.2, 95% CI 2.6 to 6.9) and men (3.5, 95% CI 2.3 to 5.3) and remained significant after adjusting for femoral neck areal BMD in women and for total hip areal BMD in both sexes; fracture classification improved for women by combining femoral strength with femoral neck areal BMD (p = 0.002). For both sexes, the probabilities of spine and hip fractures were similarly high at the BMD-based interventional thresholds for osteoporosis and at corresponding preestablished thresholds for “fragile bone strength” (spine: women ≤ 4500 N, men ≤ 6500 N; hip: women ≤ 3000 N, men ≤ 3500 N). Because it is well established that individuals over age 65 years who have osteoporosis at the hip or spine by BMD criteria should be considered at high risk of fracture, these results indicate that individuals who have fragile bone strength at the hip or spine should also be considered at high risk of fracture. © 2014 American Society for Bone and Mineral Research.
The increasing size of the elderly population, coupled with insufficient rates of screening,[1, 2] has renewed calls to increase the number of people who are tested for osteoporosis and assessed for fracture risk. Although dual-energy X-ray absorptiometry (DXA) is the clinical standard for such testing, finite element analysis (FEA) of computed tomography (CT) scans can now be used clinically to assess fracture risk. FEA-derived strength estimates have been validated in cadaver studies by numerous groups for both the spine[4-6] and hip.[7-12] These strength estimates have been clinically validated for prediction of incident clinical spine and hip[14, 15] fractures in men and incident hip fractures in women. Associations have also been shown in women between FEA and both prevalent spine fractures[5, 16-19] and any prevalent osteoporotic fracture. However, validation studies have not yet been reported for prediction of incident spine fractures in women, and no single FEA methodology has yet been validated in a single study for both incident spine and hip fractures in both sexes. Further, although clinical guidelines for interpreting bone mineral density (BMD) are well established (T ≤ –2.5 for defining osteoporosis, for example), such guidelines remain to be validated for FEA-derived measures of strength.
Addressing these limitations, we sought to further validate FEA for predicting incident spine and hip fractures in both women and men, and with clinical translation in mind, to prospectively evaluate preestablished FEA strength-based interventional thresholds for identifying those at high risk of fracture. Because the same CT scans used for FEA can also be used to measure clinically established BMD measures at the spine (volumetric BMD of the trabecular bone) and hip (DXA-equivalent total hip and femoral neck areal BMD),[15, 22] we also assessed these BMD measures to facilitate clinical interpretation of the FEA results.
Materials and Methods
For our study design, we used a case-control approach with separate spine and hip arms (Fig. 1). For the spine and hip arms, the cases had an incident spine or hip fracture, respectively, during a 5-year observation period, whereas the controls did not have an incident spine or hip fracture, respectively. The study participants were drawn from community-dwelling women and men in Reykjavik, Iceland, who were enrolled in the ongoing Age, Gene/Environment Susceptibility Reykjavik observational study (AGES-Reykjavik). All participants in that larger study were scheduled to have a CT exam at a baseline time point and again after 5 years, all scans taken on the same CT machine. We evaluated the finite element-estimated bone strength (in Newtons) and load-to-strength ratio measured from the baseline CT scans of the spine and hip for prediction of spine and hip fractures, respectively. We also investigated BMD measures that are currently used clinically, specifically volumetric vertebral trabecular BMD (in mg/cm³) in the spine arm, and femoral neck and total hip areal BMD (in g/cm²) in the hip arm. Because DXA was not available, we used the baseline CT scans to measure a DXA-equivalent hip areal BMD (Supplemental Data). All CT analyses, including finite element analyses, were performed blinded to fracture status, and the interventional strength thresholds were developed on an independent cohort of 1459 women and men before this analysis.
For the spine arm, incident vertebral fracture cases included both clinical (identified from patient medical records) and morphologic (identified from the study CT scans) fractures. Based on a search of medical records for appropriate ICD-10 codes, we identified 69 AGES-Reykjavik participants who had a potential clinical incident spine fracture, 30 of whom were subsequently excluded because the fracture could not be confirmed radiographically via CT scans. Based on reading the baseline and 5-year follow-up CT scans for a random selection of 897 AGES-Reykjavik participants who had both such CT exams, we identified an additional 129 participants who had a morphologic incident spine fracture between baseline and follow-up. One of these cases was subsequently excluded because the baseline CT scan was unsuitable for FEA, resulting in 167 spine fracture cases (39 clinical plus 128 morphologic) for statistical analysis. From that same random selection, we identified 676 controls, namely participants who displayed a radiographic absence of incident vertebral fracture between baseline and 5-year follow-up CT (hip fracture status was not considered in the spine arm and vice versa). The remaining 93 subjects were excluded from further analysis in the spine arm because of lack of a readable CT scout view (at baseline or follow-up) needed to adjudicate vertebral fracture. The final spine arm comprised 497 women and 346 men, all white, the age at baseline spanning 66 to 93 years.
For the hip arm, again based on a medical record search for appropriate ICD-10 codes, we identified 180 participants in the AGES-Reykjavik study who had a clinical diagnosis of a hip fracture between the baseline and 5-year follow-up CT exams. We excluded 9 of these participants because their baseline CT scan was unsuitable for FEA, resulting in a total of 171 hip-fracture cases for statistical analysis. A total of 877 controls without hip fracture were drawn from the aforementioned random selection of 897 AGES-Reykjavik participants who had both baseline and follow-up CT scans; the remaining 20 subjects were excluded because their baseline hip CT scan was unsuitable for FEA or because we could not ascertain their hip fracture status during the 5-year follow-up period. The final hip arm comprised 608 women and 440 men, all white, the age at baseline spanning 66 to 93 years.
Vertebral fracture adjudication
Incident vertebral fractures were adjudicated by analyzing the lateral scout of the baseline and 5-year follow-up CT exams. First, the 5-year follow-up CT scouts for all participants were examined, using Genant's semiquantitative technique to grade fractures as mild (SQ1), moderate (SQ2), or severe (SQ3). Then, if a fracture was found in any vertebra from T6 to L4, the baseline CT scout was read to classify the fracture as prevalent or incident. A fracture was classified as incident if the grade of the vertebra increased from baseline to follow-up and was otherwise classified as prevalent. The fracture grade for a participant was defined as the highest grade of any incident fracture at any vertebral level at the end of the 5-year observation period. The interobserver agreement (Kappa score) between the study reader and both a radiologist at the Icelandic Heart Association (IHA) (κ = 0.64) and a physician at Synarc, Inc. (κ = 0.54) for 38 participants was deemed to be acceptable.
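The adjudication rule described above can be written as a short sketch. The integer grade encoding (0 for no fracture, 1 to 3 for SQ1 to SQ3), the function names, and the dictionary layout are assumptions made for illustration; they are not taken from the study's software.

```python
from typing import Dict, Optional

# Assumed encoding: 0 = no fracture, 1..3 = Genant grades SQ1..SQ3.
Grades = Dict[str, int]


def classify_vertebra(baseline: int, followup: int) -> Optional[str]:
    """Classify one vertebra from its baseline and follow-up grades."""
    if followup == 0:
        return None  # no fracture seen at follow-up
    # Incident if the grade increased over the observation period.
    return "incident" if followup > baseline else "prevalent"


def participant_grade(baseline: Grades, followup: Grades) -> int:
    """Highest follow-up grade among vertebrae with an incident fracture."""
    incident = [
        grade
        for level, grade in followup.items()
        if classify_vertebra(baseline.get(level, 0), grade) == "incident"
    ]
    return max(incident, default=0)


# Hypothetical participant: T12 worsens from 0 to SQ2, L1 stays at SQ1.
baseline = {"T12": 0, "L1": 1, "L2": 0}
followup = {"T12": 2, "L1": 1, "L2": 0}
print(participant_grade(baseline, followup))
```

In this hypothetical case the unchanged SQ1 grade at L1 counts as prevalent, so only T12 contributes to the participant's incident grade.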
All CT scans of the lumbar spine and proximal femur were acquired at IHA on a single scanner (Siemens Sensation 4) using 120 kVp, a pitch of 1.0, and a modulated tube current with a reference exposure of 150 mAs. Images were reconstructed using a 1.0-mm slice thickness, a B30s soft-tissue kernel, and a 50-cm field of view. The same external bone mineral phantom (Image Analysis, Columbia, KY, USA) was imaged in every scan.
Finite element analysis
All image processing and finite element analyses of the baseline CT scans were performed at IHA, by a single analyst, blinded to fracture status, using the VirtuOst software application (O.N. Diagnostics, Berkeley, CA, USA). Construction of the finite element models (Fig. 2) is described elsewhere.[4, 14] In short, the CT scans were calibrated to convert voxel values to BMD, the bone was segmented, and then resampled into isotropic voxels (1.0 mm for the spine; 1.5 mm for the hip). Each voxel was then converted into a finite element and assigned elastic and failure material properties based on empirical relations to BMD.[4, 26]
In the spine, displacement boundary conditions were applied to simulate a uniform axial compression on the vertebra, applied through a virtual layer of bone cement, and the vertebral strength was defined as the force at 2% deformation. To calculate a load-to-strength ratio, we used body-weight and height data for each participant to estimate a lumbar compressive force for forward bending at the waist while holding a 10-kg weight. The assumed moment arm of the paraspinal muscles opposing forward bending was larger for men (5.86 cm) than for women (5.48 cm). Vertebral measurements were taken as the average from analysis of L1 and L2 (84% of analyses), or, if L1 and L2 were not both suitable for analysis, as the average of two other vertebrae (14%) or from just a single vertebra (2%) from T12 through L4.
In the femur, displacement boundary conditions were applied to simulate a sideways fall with the diaphysis angled at 15° to the ground and at 15° of internal rotation. Femoral strength was defined as the force at 4% deformation. To calculate a load-to-strength ratio, we used body-weight and height data for each participant to estimate the impact force for a sideways fall, and we assumed a constant trochanteric soft-tissue thickness of 25 mm for all participants. The left femur was analyzed except in three participants because of image artifact or abnormal morphology of the left femur.
In addition to these finite element computations, we also used the VirtuOst software application to measure BMD from the same bones for which strength was measured. For the spine, volumetric BMD of the trabecular bone (vBMD, in mg/cm³), compatible with the UCSF reference database,[21, 30] was measured using an anteriorly placed elliptical region of interest (Fig. 3). Because DXA was not acquired for this study, we used the CT scans to measure a Hologic-equivalent areal BMD (in g/cm²) at the femoral neck and total hip regions and associated T-scores, using a similar approach as reported elsewhere (Fig. 3).[15, 22] Prospective analysis of an independent cohort of 75 women and men confirmed that our CT-measured areal BMD values were highly correlated with DXA-measured values (R² = 0.90 femoral neck areal BMD, R² = 0.93 total hip areal BMD; Supplemental Data).
Interventional thresholds for bone strength
With clinical translation in mind, interventional thresholds for bone strength were established on an independent cohort before this analysis and were designed to coincide generally with the BMD-based interventional thresholds for osteoporosis and low bone mass (aka osteopenia). These strength thresholds were developed by analyzing 1459 predominantly white women and men in a variety of prior research studies, excluding the AGES-Reykjavik study. The spine analysis included 892 women aged 21 to 89 years and 286 men aged 22 to 89 years, and the hip analysis included 856 women aged 21 to 92 years and 336 men aged 22 to 89 years. To minimize any bias from the use of a particular CT machine, data were pooled from 77 different CT scanners, including GE Healthcare (Fairfield, CT, USA), Siemens (Erlangen, Germany), Philips (Best, The Netherlands), and Toshiba (Otawara, Japan) models. Strength and BMD measures were obtained from these scans as described above. For the spine, linear regression between vertebral strength and vertebral trabecular BMD was then used to find the sex-dependent strength values corresponding to 80 and 120 mg/cm³, the recommended vBMD thresholds for osteoporosis and low bone mass, respectively.[31, 32] Similarly for the hip, linear regression between femoral strength and femoral neck areal BMD T-score (using the NHANES III database for T-scores) was used to find the sex-dependent strength values corresponding to T = –2.5 and T = –1.0, the T-score thresholds for defining osteoporosis and low bone mass, respectively.[34, 35] The strength interventional thresholds (“fragile bone strength” corresponding to osteoporosis and “low bone strength” corresponding to low bone mass) were defined as the strength values resulting from these regression analyses, rounded up to the nearest 500 N (Table 1). As per WHO guidelines, we used female young-reference values in our BMD T-score calculations for both sexes.
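The threshold derivation described above (regress strength on BMD, evaluate the fitted line at the BMD cutoff, round up to the nearest 500 N) can be sketched as follows. The data points are made-up values chosen only so the arithmetic is easy to follow, and the function names are hypothetical; this is not the study data or the study code.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def round_up_to(value, step=500):
    """Round up to the next multiple of `step` (for example 500 N)."""
    return math.ceil(value / step) * step


def strength_threshold(bmd, strength, bmd_cutoff, step=500):
    """Strength on the fitted line at a BMD cutoff, rounded up."""
    slope, intercept = fit_line(bmd, strength)
    return round_up_to(intercept + slope * bmd_cutoff, step)


# Made-up illustration data (NOT study data): vBMD in mg/cm^3, strength in N.
vbmd = [40, 60, 80, 100, 120, 140]
strength = [2300, 3200, 4100, 5000, 5900, 6800]

fragile = strength_threshold(vbmd, strength, bmd_cutoff=80)   # osteoporosis cutoff
low = strength_threshold(vbmd, strength, bmd_cutoff=120)      # low bone mass cutoff
print(fragile, low)
```

Rounding up rather than to the nearest value means the strength threshold sits at or slightly above the strength that corresponds exactly to the BMD cutoff.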
All of these reported outcomes, including the interventional thresholds, have been cleared by the FDA for clinical use.
Statistical analyses were performed using SAS (SAS Institute, Inc., Cary, NC, USA) and STATA (StataCorp LP, College Station, TX, USA). Baseline characteristics were compared between fracture cases and controls using an age-adjusted F test. To quantify the association between each predictor variable and fracture status, odds ratios and 95% Wald confidence intervals were calculated from logistic regression models with parameters estimated using the maximum likelihood approach (LOGISTIC procedure in SAS). To provide insight into how this association might compare across the sexes in the general population, the odds ratio for each predictor was expressed per standard deviation of the controls pooled across the sexes. Prediction capacity was assessed by the area under the receiver operating characteristic curve (AUC), adjusting for age. The spine statistics were further adjusted for body mass index (BMI) and prevalent fracture. Because the clinical utility in identifying mild vertebral fractures is unclear,[18, 37] we performed two separate analyses for incident vertebral fractures: one for all SQ1, SQ2, and SQ3 fractures (n = 167 cases), and one only for moderate/severe (SQ2/SQ3) fractures (n = 96 cases). In the latter, SQ1 fractures were excluded from the analysis, not reassigned as controls. Multivariable logistic regression models were constructed to determine if strength and the load-to-strength ratio remained associated with fracture independent of age, BMI, and BMD. We also calculated the net reclassification improvement (NRI) to compare a fracture-risk classification using BMD alone versus a classification using BMD combined with FEA measures of strength and load-to-strength ratio.
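As a minimal sketch of how an odds ratio per standard deviation and its Wald interval follow from a fitted logistic coefficient, the snippet below assumes a coefficient expressed per unit of the predictor and reports the odds ratio for a one-SD decrease (appropriate for strength or BMD, where lower values mean higher risk; for the load-to-strength ratio the sign would flip). The coefficient, standard error, and SD are hypothetical numbers, not study estimates.

```python
import math


def odds_ratio_per_sd(beta, se, sd, z=1.96):
    """Odds ratio and Wald CI for a one-SD decrease in the predictor.

    beta: logistic coefficient per unit of the predictor (e.g. per newton)
    se:   standard error of beta
    sd:   standard deviation used for scaling (here, pooled controls)
    """
    # A one-SD decrease changes the log-odds by -beta * sd.
    log_or = -beta * sd
    half_width = z * se * sd
    return (
        math.exp(log_or),
        math.exp(log_or - half_width),
        math.exp(log_or + half_width),
    )


# Hypothetical inputs for illustration only.
or_, lower, upper = odds_ratio_per_sd(beta=-0.0008, se=0.0002, sd=1300.0)
print(f"OR per SD = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the Wald interval is symmetric on the log scale, the reported odds ratio is the geometric mean of its confidence limits.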
Finally, to assess if individuals with fragile bone strength were indeed at high risk of fracture, we compared the probability of fracture for strength and BMD, using age-adjusted logistic regression models for an age of 75 years (average for the sample); confidence intervals for the resulting probability curve were calculated using the 95% confidence intervals of the regression-model coefficients.
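The probability comparison described above amounts to evaluating a fitted age-adjusted logistic model at a fixed age of 75 years and at a chosen strength or BMD value. A minimal sketch, with hypothetical coefficients that are not the study's estimates:

```python
import math


def fracture_probability(intercept, beta_age, beta_x, x, age=75.0):
    """Predicted probability from a logistic model with age and one predictor."""
    linear = intercept + beta_age * age + beta_x * x
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients for illustration only (predictor in newtons).
coef = {"intercept": -2.0, "beta_age": 0.05, "beta_x": -0.0007}

p_at_threshold = fracture_probability(x=4500.0, **coef)
p_at_higher_strength = fracture_probability(x=6000.0, **coef)
print(p_at_threshold, p_at_higher_strength)
```

With a negative coefficient on strength, the predicted probability falls as strength rises, so the value at a threshold is the highest probability among participants at or above that threshold for a given age.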
For both the spine and hip arms, fracture cases were older than controls (Tables 2 and 3). After accounting for this age difference, the fracture cases still had lower strength and BMD compared with the controls, and higher (worse) values for load-to-strength ratio; vertebral fracture cases were also more likely to have a prior vertebral fracture (p < 0.001).
Table 2. Baseline characteristics, spine arm
|Measure||No-fracture controls, mean (SD)||SQ1, SQ2, SQ3 fracture cases, mean (SD)||p valueᵃ||SQ2, SQ3 fracture cases, mean (SD)||p valueᵃ|
|Women|
|No. with prior vertebral fracture||48||52|| ||22|| |
|Age (years)||74.3 (5.2)||76.4 (5.4)||<0.001||76.7 (5.0)||<0.001|
|BMI (kg/m²)||27.5 (4.4)||26.8 (4.8)||0.30||26.7 (4.3)||0.31|
|Strength (N)||4880 (1670)||3900 (1320)||<0.0001||3650 (1130)||<0.0001|
|Load-to-strength ratio||0.43 (0.15)||0.51 (0.16)||<0.0001||0.55 (0.17)||<0.0001|
|Vertebral trabecular BMD (mg/cm³)||82.0 (32.8)||60.3 (26.5)||<0.0001||55.8 (22.6)||<0.0001|
|Men|
|No. with prior vertebral fracture||54||27|| ||5|| |
|Age (years)||74.8 (5.1)||76.7 (5.3)||<0.05||76.5 (5.5)||0.14|
|BMI (kg/m²)||27.0 (3.7)||26.0 (3.6)||0.18||25.0 (3.2)||<0.05|
|Strength (N)||7540 (2470)||5950 (1880)||<0.0001||5840 (2030)||<0.01|
|Load-to-strength ratio||0.33 (0.11)||0.41 (0.13)||<0.0001||0.40 (0.14)||<0.01|
|Vertebral trabecular BMD (mg/cm³)||91.6 (32.7)||73.9 (30.5)||<0.01||79.0 (38.9)||0.16|
Table 3. Baseline characteristics, hip arm
|Measure||No-fracture controls, mean (SD)||Hip fracture cases, mean (SD)||p valueᵃ|
|Women|
|Age (years)||74.7 (5.3)||79.4 (5.7)||<0.0001|
|BMI (kg/m²)||27.2 (4.5)||25.7 (5.2)||0.054|
|Strength (N)||3600 (910)||2800 (670)||<0.0001|
|Load-to-strength ratio||0.95 (0.22)||1.12 (0.25)||<0.0001|
|Femoral neck areal BMD (g/cm²)||0.68 (0.11)||0.59 (0.08)||<0.0001|
|Total hip areal BMD (g/cm²)||0.82 (0.14)||0.70 (0.11)||<0.0001|
|Men|
|Age (years)||75.2 (5.4)||80.2 (5.6)||<0.0001|
|BMI (kg/m²)||26.8 (3.7)||25.9 (3.9)||0.35|
|Strength (N)||5140 (1200)||3860 (940)||<0.0001|
|Load-to-strength ratio||0.82 (0.20)||1.04 (0.24)||<0.0001|
|Femoral neck areal BMD (g/cm²)||0.79 (0.13)||0.65 (0.09)||<0.0001|
|Total hip areal BMD (g/cm²)||0.97 (0.16)||0.81 (0.13)||<0.0001|
Vertebral fracture predictors
In the spine arm, strength, load-to-strength ratio, and vBMD were highly significantly associated with any incident vertebral fracture (SQ1 to SQ3) in both women and men both before and after adjusting for age, BMI, and prior vertebral fracture (Table 4). Odds ratios (per unit standard deviation) were generally numerically higher for strength than for vBMD for both women and men. Logistic regression results showed that for men only, both strength (p = 0.01) and load-to-strength ratio (p = 0.03) remained associated with fracture independent of vBMD. The age-adjusted AUC values showed trends that were similar to those for the odds ratios, values ranging from 0.67 to 0.70 for women and 0.68 to 0.71 for men.
Table 4. Odds ratios (95% confidence interval) per standard deviation for incident vertebral fracture
|Predictor||No adjustment, women||No adjustment, men||Adjusted for age, women||Adjusted for age, men||Adjusted for age, BMI, and prior vertebral fracture, women||Adjusted for age, BMI, and prior vertebral fracture, men|
|SQ1, SQ2, SQ3 vertebral fracture|
|Strength||3.1 (2.1–4.7)||2.3 (1.6–3.4)||2.8 (1.8–4.3)||2.2 (1.5–3.2)||2.3 (1.5–3.6)||2.0 (1.3–3.1)|
|Load-to-strength ratio||1.6 (1.3–1.9)||2.0 (1.4–2.7)||1.5 (1.2–1.8)||1.9 (1.4–2.7)||1.4 (1.2–1.8)||1.9 (1.4–2.8)|
|Vertebral trabecular BMD||2.4 (1.8–3.2)||1.9 (1.3–2.7)||2.3 (1.7–3.2)||1.7 (1.2–2.5)||1.8 (1.3–2.6)||1.5 (1.0–2.2)|
|SQ2, SQ3 vertebral fracture|
|Strength||4.9 (2.8–8.6)||2.5 (1.4–4.4)||4.3 (2.4–7.6)||2.4 (1.3–4.3)||3.3 (1.8–5.9)||1.8 (1.0–3.4)ᵃ|
|Load-to-strength ratio||1.8 (1.5–2.3)||1.9 (1.2–3.1)||1.7 (1.4–2.2)||1.8 (1.1–3.0)||1.7 (1.3–2.3)||1.6 (0.9–2.8)ᵃ|
|Vertebral trabecular BMD||3.3 (2.2–4.7)||1.5 (0.9–2.5)ᵃ||3.1 (2.1–4.7)||1.4 (0.9–2.3)ᵃ||2.5 (1.6–3.7)||1.0 (0.6–1.6)ᵃ|
After excluding mild (SQ1) fractures from analysis, the age-adjusted odds ratios of all predictor variables increased for women (Table 4). For men, the odds ratios for strength and the load-to-strength ratio were generally unchanged, whereas the odds ratio for vBMD was no longer significant. After further adjusting for BMI and prior vertebral fracture, all three predictors remained significant for women, but none remained significant for men, although the small number of cases for men (n = 21) compromised statistical power for this analysis.
Hip fracture predictors
In the hip arm, strength, load-to-strength ratio, femoral neck areal BMD, and total hip areal BMD were all highly significantly associated with incident hip fractures in both women and men, both before and after adjusting for age and BMI (Table 5). Further, after adjusting for age, BMI, and femoral neck areal BMD using logistic regression models, both strength (p = 0.01) and the load-to-strength ratio (p = 0.005) remained associated with fracture for women but not for men. When total hip areal BMD was placed in the multivariate model instead of femoral neck areal BMD, each of strength (women p = 0.0006, men p = 0.0001) and load-to-strength ratio (women p = 0.015, men p = 0.010) remained associated with fracture for both sexes. The age-adjusted AUC values tended to be higher for men (0.84 to 0.86) than women (0.78 to 0.80), but otherwise showed trends that were similar to those for the odds ratios.
Table 5. Odds ratios (95% confidence interval) per standard deviation for incident hip fracture
|Predictor||No adjustment, women||No adjustment, men||Adjusted for age, women||Adjusted for age, men||Adjusted for age and BMI, women||Adjusted for age and BMI, men|
|Strength||6.3 (4.0–10)||4.1 (2.8–6.2)||4.2 (2.6–6.9)||3.5 (2.3–5.3)||4.3 (2.6–7.4)||3.7 (2.4–5.7)|
|Load-to-strength ratio||1.9 (1.6–2.4)||2.4 (1.8–3.2)||1.7 (1.4–2.2)||2.3 (1.7–3.1)||2.3 (1.8–3.0)||2.6 (1.9–3.5)|
|Femoral neck areal BMD||3.7 (2.6–5.3)||4.3 (2.9–6.3)||2.7 (1.9–3.9)||3.7 (2.5–5.6)||2.7 (1.8–4.0)||4.0 (2.6–6.1)|
|Total hip areal BMD||3.5 (2.5–5.0)||3.1 (2.2–4.4)||2.6 (1.8–3.7)||2.6 (1.8–3.7)||2.6 (1.8–3.8)||2.8 (1.9–4.1)|
Consistent with the trends for odds ratios, the reclassification analysis revealed an added benefit of combining strength and BMD compared with the use of BMD alone. For the spine arm, in men, the combination of strength and vBMD (NRI = 62%, p = 0.006), or strength, load-to-strength ratio, and vBMD (NRI = 63%, p = 0.005) significantly improved fracture classification for moderate/severe (SQ2/3) vertebral fractures; this effect was not significant when all fractures (SQ1/2/3) were considered. No significant improvement was seen for women (p > 0.5). For the hip arm, combining femoral strength and femoral neck areal BMD improved classification of hip fractures (NRI = 33%, p = 0.002) in women, as did combining strength, the load-to-strength ratio, and femoral neck areal BMD (NRI = 37%, p = 0.001). No significant improvement was seen for men (p > 0.4).
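For readers unfamiliar with the NRI, the sketch below computes a category-free (continuous) version from predicted probabilities under a baseline model and an extended model. The text does not state which NRI variant was used, so the continuous form is an assumption here, and the probabilities are invented for illustration.

```python
def continuous_nri(old_prob, new_prob, is_case):
    """Category-free NRI: net upward movement in cases minus that in controls."""

    def net_up(pairs):
        # Fraction moved up minus fraction moved down under the new model.
        up = sum(1 for old, new in pairs if new > old)
        down = sum(1 for old, new in pairs if new < old)
        return (up - down) / len(pairs)

    rows = list(zip(old_prob, new_prob, is_case))
    cases = [(old, new) for old, new, case in rows if case]
    controls = [(old, new) for old, new, case in rows if not case]
    return net_up(cases) - net_up(controls)


# Invented probabilities: BMD-only model versus BMD plus strength.
old_prob = [0.2, 0.3, 0.5, 0.3, 0.4]
new_prob = [0.4, 0.5, 0.4, 0.1, 0.5]
is_case = [True, True, True, False, False]

print(continuous_nri(old_prob, new_prob, is_case))
```

A positive value means the extended model moves predicted risk in the right direction (up for cases, down for controls) more often than in the wrong direction; the continuous form ranges from -2 to 2.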
The age-adjusted logistic regression analysis indicated that, at an average age of 75 years, the calculated probability of fracture for this analysis sample was similarly high at the BMD and bone strength interventional thresholds, both for women and men and at the spine (Fig. 4) and the hip (Fig. 5). For example, the probabilities of vertebral fracture at the thresholds for fragile bone strength and osteoporosis were 22.2% (95% CI 18.5 to 26.4%) and 14.6% (11.1 to 19.0%), respectively, for the women, and 14.8% (11.2 to 19.3%) and 14.6% (11.1 to 19.0%), respectively, for the men. At the hip, the probabilities of hip fracture associated with the thresholds for fragile bone strength and osteoporosis were 17.1% (13.5 to 21.3%) and 21.8% (17.0 to 27.5%), respectively, for the women, and 24.0% (17.0 to 32.7%) and 33.4% (23.3 to 45.4%), respectively, for the men.
The finite element analysis technique, which uses computational biomechanical principles coupled with patient-specific information in clinical CT scans to mechanistically simulate bone failure, has been validated in cadaver studies by numerous groups for both the spine[4-6, 39] and hip[7-12] and clinically has been shown to be significantly associated with incident and prevalent fracture in multiple cohorts.[13-18, 20, 40] Our new data provide further clinical validation. For the spine, vertebral strength was associated with fracture in both women and men; consistently had the (numerically) highest odds ratios compared with the other predictors; was associated with fracture independently of vBMD in men; and for men with more severe fracture (SQ2/3) remained significant, whereas vBMD lost significance. For the hip, femoral strength was associated with fracture independently of femoral neck areal BMD in women and total hip areal BMD in both sexes. With clinical translation in mind, we introduced and prospectively evaluated interventional thresholds for bone strength and confirmed that these thresholds for fragile bone strength were associated with fracture probability levels equivalent to those for well-established thresholds for osteoporosis, both at the hip and spine and in women and men. The probability of fracture in this study depends on the nature of the case-control design and does not represent actual clinical fracture risk. However, because it is well established that individuals over age 65 years who have osteoporosis at the hip or spine by BMD criteria should be considered at high clinical risk of fracture, and because strength was associated with fracture at least as well as was BMD at both the hip and spine in the present study, these results indicate that individuals who have fragile bone strength at the hip or spine should also be considered at high clinical risk of fracture.
One novel aspect of this study is the use of FEA-based vertebral strength assessment and vertebral trabecular BMD for prediction of incident vertebral fractures in women, the first study of its kind. Our findings are consistent with those from the Osteoporotic Fractures in Men (MrOS) study of men over age 65 years, which showed that vertebral compressive strength (and the load-to-strength ratio) were highly significant predictors of incident clinical vertebral fracture in men. Similar to our reported odds ratios, hazard ratios for strength in that study were numerically higher for strength compared with vBMD, although that study evaluated integral vBMD (trabecular and cortical) and not trabecular vBMD as in this study. We are aware of no studies for women reporting on incident spine fracture and CT-based BMD or strength. In the current study, the association between vertebral strength and vertebral fracture was uniformly stronger in women than men, both before and after adjusting for age and prevalent fracture, and in particular when restricted to more severe fractures (SQ2/3). Spine DXA was not used in this study. However, in the MrOS study of elderly men, the age-adjusted hazard ratio was twofold higher (p < 0.01) for strength than for DXA-measured lumbar spine areal BMD, and strength was associated with fracture independent of DXA areal BMD.
Although no prior study has investigated FEA predictors of incident vertebral fracture in women, several cross-sectional studies in the USA and Japan have shown statistically significant associations between FEA and prevalent vertebral fracture, supporting the generality of our spine results. In 1991, Faulkner found that an FEA-estimated vertebral yield stress better distinguished women aged 20 to 79 years with a prevalent vertebral fracture than did a QCT-based measure of bone mineral content. More recently, two studies of women over age 50 years in the Rochester, MN, area found that vertebral strength, load-to-strength ratio, and vBMD were all highly associated with prevalent vertebral fractures.[17, 18] Imai and colleagues also found a trend for the association of prevalent vertebral fracture in Japanese women to be greater for strength than for CT-measured vBMD or DXA-measured spine areal BMD. In all these more recent studies, vertebral strength consistently performed statistically better than lumbar spine areal BMD as measured by DXA. DXA at the spine is limited by its two-dimensional nature and its inclusion of the posterior elements and any aortic calcification in the BMD measure. Given these limitations, and because our findings show consistent agreement with these prior FEA studies, the collective literature suggests that vertebral strength can reasonably be expected to perform better than DXA for assessing risk of incident spine fractures in both women and men.
Another novel aspect of this study was our prospective validation of previously established interventional thresholds for bone strength, which enables FEA to be used clinically to identify women and men at high risk of fracture. We found that the women and men in this study who had fragile bone strength were at an equivalently high probability of fracture as were the women and men who had BMD-defined osteoporosis, the latter criterion placing them clinically in a high-risk category for fracture. Because the thresholds for bone strength were derived from a previously measured strength-BMD relationship for whites, these thresholds should remain valid for any population with a similar strength-BMD relationship. The AGES-Reykjavik cohort shows such a correspondence (Fig. 6), the value of strength at the threshold for fragile bone strength being—as per design—just slightly higher than the value of strength directly corresponding to the osteoporosis threshold. In general, the relation between whole-bone strength and BMD by FEA analysis depends on such morphological characteristics as the size and shape of the bone and the spatial distribution of bone density, including the trabecular-cortical characteristics. As such, the strength thresholds reported here might not be directly applicable to bones in nonwhite populations having different morphological characteristics that would alter the relation between BMD and whole-bone strength.
For the hip, our findings for women that femoral strength was associated with fracture independently of areal BMD and that reclassification improved when using a combination of femoral strength and areal BMD, together suggest that more individuals at high risk of fracture can be identified by using measures of both femoral strength and hip BMD than by using measures of hip BMD alone. Part of this effect is that some individuals with low bone mass (aka osteopenia) who fractured also had fragile bone strength, as shown in a plot of femoral strength and femoral neck areal BMD from this study (Fig. 6, see shaded region) and as observed also in the MrOS study of elderly men. The underlying biophysical mechanisms for this effect are not yet clear, perhaps related to geometry, or relatively low trabecular to cortical mass, or locally weak regions within the bone, any of which might go undetected by an areal BMD measure because of its projectional nature. Regardless, these findings illustrate that women and men who have low bone mass can be at as high a risk of fracture as the risk associated with having osteoporosis if they also have fragile bone strength. Whether such osteopenic high-risk individuals would correspond with those identified using an absolute-risk approach that incorporates various clinical risk factors[42, 43] is unclear and remains a topic for future research.
The generality of our hip strength results is supported by reports of similar findings from the only other two incident hip-fracture studies that analyzed both areal BMD and FEA-estimated hip strength. Femoral neck areal BMD was not reported in either of these two studies but total hip areal BMD was. In the first prior study, an analysis of incident hip fractures in men over age 65 years in the MrOS cohort that used the same software as in the current analysis, age-adjusted hazard ratios were numerically higher for femoral strength (6.5, 95% CI 2.3 to 18.3) than for DXA-measured total hip areal BMD (4.4, 95% CI 2.1 to 9.1). That finding is consistent with ours of a higher odds ratio for femoral strength compared with total hip areal BMD. In the second prior study, an age- and sex-matched nested case-control analysis of a subset of the AGES-Reykjavik participants performed using different image-processing and FEA software by Keyak and colleagues, femoral strength in a stance loading configuration remained a significant predictor of hip fracture after accounting for total hip areal BMD in men (p = 0.01) and just missed statistical significance for women (p = 0.06). Similar trends (p = 0.06) were seen in the fall loading condition for both women and men, and it is likely these trends would have reached statistical significance had the number of fractures in that analysis (71 women and 45 men) been greater. These results are, therefore, also consistent with our findings.
Despite this consistency between these past studies and the current study, Keyak and colleagues concluded that femoral strength may be a more important fracture risk predictor for men than for women, in apparent contradiction to our finding that the odds ratio for femoral strength was higher for women than for men. Their conclusion was based on their finding that the mean difference in femoral strength between their age-matched cases and controls, divided by the sex-specific standard deviation, was larger for men (ratio = 0.72 for men versus 0.32 for women). We used a sex-pooled standard deviation in our odds ratio calculations. However, to compare against Keyak and colleagues, we also normalized our strength differences by sex-specific standard deviations and then adjusted our strength results to age 80 years (the mean age in the Keyak and colleagues study) and found a similar trend of a higher ratio for men than for women (ratio = 0.83 for men versus 0.59 for women), indicating congruence between the two studies. Further, when we also normalized our logistic regression parameters by sex-specific standard deviations, we found that the age-adjusted odds ratio (95% CI) for strength was numerically higher for men (3.2, 2.1 to 4.7) than for women (2.8, 2.0 to 3.9), again consistent with the Keyak and colleagues findings. This apparent reversal in our odds ratios occurred because the women had a lower standard deviation relative to the sex-pooled standard deviation than did the men, which in turn reflected the lower mean value of femoral strength for women. For our primary analysis, we normalized by the sex-pooled standard deviation to provide insight into risk differences between women and men. Our finding of a higher odds ratio for women than for men, when using a sex-pooled standard deviation, indicates that a fixed decrement of bone strength elevates risk more for women than for men.
It is well established that women lose femoral strength at a greater absolute rate with aging than do men.[44, 45] Thus, our results help explain the known higher rate of hip fracture in women than in men. Further, evaluated in this way, our results suggest that the association between femoral strength and hip fracture is at least as important for women as for men.
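The effect of the normalization choice described above can be sketched numerically. The following is a minimal illustration, not the authors' statistical code: it assumes that the sex-pooled and sex-specific analyses share the same logistic slope per newton, so that the odds ratio per standard deviation changes only because the size of "one standard deviation" changes. The function names are hypothetical, and the implied standard deviation ratios are back-calculated here from the odds ratios quoted in the text (4.2 and 3.5 with sex-pooled normalization, 2.8 and 3.2 with sex-specific normalization); they are not values reported in the study.

```python
import math


def rescale_odds_ratio(or_per_sd_a, sd_b_over_sd_a):
    """Re-express an odds ratio per SD_a decrement as one per SD_b decrement.

    Assumption: the same logistic model underlies both, so the log-odds
    slope per newton is unchanged and only the step size differs.
    """
    return math.exp(math.log(or_per_sd_a) * sd_b_over_sd_a)


def implied_sd_ratio(or_pooled_sd, or_sex_sd):
    """SD_sex / SD_pooled implied by two odds ratios sharing one slope."""
    return math.log(or_sex_sd) / math.log(or_pooled_sd)


# Age-adjusted odds ratios for femoral strength quoted in the text:
# (per sex-pooled SD, per sex-specific SD).
reported = {"women": (4.2, 2.8), "men": (3.5, 3.2)}

for sex, (or_pooled, or_sex) in reported.items():
    ratio = implied_sd_ratio(or_pooled, or_sex)
    # Round trip: rescaling the pooled-SD odds ratio recovers the other one.
    recovered = rescale_odds_ratio(or_pooled, ratio)
    print(f"{sex}: implied SD_sex/SD_pooled = {ratio:.2f}, "
          f"rescaled OR = {recovered:.1f}")
```

Under this simplifying assumption, the implied ratio is smaller for women than for men, which is the direction needed for the ordering of the odds ratios to reverse when the normalization changes.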
This study has a number of limitations. Most important, the analysis sample lacked racial variation and, as noted above, potential differences in the general relation between areal BMD and whole-bone strength may affect the generality of the strength thresholds in certain nonwhite populations. However, because hip strength and hip areal BMD are quite well correlated (Fig. 6, for example) and because hip areal BMD is a robust predictor of fracture across races, strength should also be associated with fracture risk across races. This is consistent with results from prevalent and incident fracture-outcome studies from multiple different cohorts from the USA,[13, 14, 17, 18] Japan, and Iceland. A second limitation is that we did not use DXA. However, as shown by others[15, 22] and by our own data (Supplemental Data), the hip areal BMD T-score as measured from CT is highly correlated with, and numerically equivalent to, that measured by DXA, and thus our results should have remained substantially unchanged had DXA itself been used. Even so, future studies are required to confirm the expected advantage of vertebral strength over DXA for predicting incident spine fractures in women and, more generally, to confirm our various results in other cohorts.
An additional limitation is that we used a case-control approach rather than a case-cohort approach that would have allowed direct estimation of prevalence and absolute risk. This choice was a trade-off in the spine arm to increase statistical power by including morphologic vertebral fractures as (incident) cases, which required follow-up CT scans for adjudication. Excluding those without a follow-up exam meant that the random sample, although useful for selecting controls, underrepresented cases because of an association between incident fracture and dropout. Even so, the case-control design still allowed us to evaluate odds ratios for strength and to show that the probabilities of fracture associated with fragile bone strength and with osteoporosis were similarly high. We note also that despite the general trend for larger odds ratios for strength compared with BMD, there were only small differences in the AUC values between the various predictors. This is not surprising because large differences in an odds ratio typically need to occur before the AUC changes appreciably. However, because the AUC represents the performance of a predictor across the entire range of sensitivities and specificities, a finding of only a small difference in AUC values between predictors may not represent potential benefits in clinical practice and decision-making, which depends more on where an individual patient falls with respect to any relevant interventional threshold. For example, we found improved reclassification when both strength and BMD were used instead of just BMD, presumably because a statistically significant subset of individuals had lower than expected bone strength for their BMD and were therefore at higher than expected risk of fracture. Clinically, that should translate to a subset of individuals with low bone mass, but fragile bone strength, who are indeed at higher risk of fracture.
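The remark above about odds ratios and AUC can be illustrated with a standard textbook idealization. The sketch below is not part of the study's analysis: it assumes the predictor is normally distributed in cases and in controls with equal variance, in which case the log odds ratio per within-group standard deviation equals the standardized mean difference d, and the AUC equals Φ(d/√2). The function names are hypothetical, and real predictors (and odds ratios expressed per population standard deviation, as in this study) will depart from this idealization.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def auc_from_odds_ratio(or_per_sd):
    """AUC implied by an odds ratio per SD under an equal-variance
    binormal model (an assumption made only for this illustration)."""
    d = math.log(or_per_sd)  # standardized case-control mean difference
    return normal_cdf(d / math.sqrt(2.0))


# Show how slowly the idealized AUC grows as the odds ratio per SD increases.
for odds_ratio in (1.5, 2.0, 3.0, 4.0, 6.0):
    print(f"OR per SD = {odds_ratio:.1f} -> AUC = "
          f"{auc_from_odds_ratio(odds_ratio):.3f}")
```

In this idealized setting the AUC depends on the logarithm of the odds ratio passed through a saturating function, so sizeable changes in the odds ratio correspond to comparatively modest changes in AUC.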
There are also inherent limitations with the finite element technique. Although our technique has shown good agreement in strength values compared with cadaver testing,[4, 8] our current implementation did not include some potentially important features, such as microscale effects, fine resolution of the thin cortex, and multiple loading conditions. There is also potential to improve the load-to-strength ratio formulation, which in this study did not include patient-specific modeling of the intervertebral discs, spinal curvature, or muscle morphology. In the hip, a CT-based measure of patient-specific soft-tissue thickness over the greater trochanter was not included, although our preliminary analyses indicated that including such detail did not improve the age-adjusted odds ratio after adjustment for BMI. Development of methods to better utilize such patient-specific model inputs remains a topic for future research.
When viewed in the context of the available literature that has now accumulated on FEA of CT scans, these new results suggest that FEA (which clinically would also include a CT-based BMD analysis) can provide an alternative clinical tool to DXA for meeting the increasing need for additional osteoporosis and fracture risk assessment. Although the use of a dedicated CT scan would introduce additional radiation compared with the use of DXA, the use of an ancillary approach, in which a previously acquired CT is utilized,[48-50] would circumvent this limitation, and tens of millions of such CT scans are taken annually in women and men over age 50 years. As noted above, such an approach may identify not only individuals with osteoporosis and fragile bone strength but also a subset of individuals with low bone mass who have fragile bone strength and are thus at high risk of fracture.
DLK and PFH are employees of O.N. Diagnostics. DLK, PFH, and TMK each have a financial interest in O.N. Diagnostics, and they and the company may benefit from the results of this research. All other authors state that they have no conflicts of interest.
Funding for this research was provided by NIH AR052234, NIH AR057616, and the Intramural Research Program of the NIH, National Institute on Aging. The CT scans used to establish equivalence between CT and DXA-measured areal BMD were acquired as part of the Rochester Epidemiology Project at Mayo Clinic Rochester (PI: Dr Sundeep Khosla) and used here with permission. Verification of the vertebral fracture adjudication procedure at Synarc, Inc., was kindly arranged by Dr Thomas Fuerst.
Authors' roles: Study conception and design: DLK, TA, TH, VG, and TMK. Acquisition of data: PFH, SS, and KS. Statistical analysis: TA. Interpretation of data: DLK, TA, TH, VG, and TMK. Drafting manuscript: DLK and TMK. Manuscript revision and approval: DLK, TA, PFH, SS, KS, TH, VG, and TMK. DLK takes responsibility for the integrity of the data analysis.