Hip fracture is a major health problem, which is likely to become exacerbated by the rising numbers of older people.1 It has been estimated that if the age-adjusted rates of hip fracture increase by just 1% per year, then the number of hip fractures worldwide could rise from 1.7 million in 1990 to 8.2 million in 2050.2 Up to one-third of patients with hip fracture die within a year, and survivors often suffer pain, reduced mobility, loss of independence, and reduced quality of life.3 Recent advances in drug development have resulted in effective treatments for osteoporosis, which maintain or increase bone density and reduce the risk of fracture.4–6 Early identification of people who are at greatest risk of fracture would enable targeted treatments and fracture prevention strategies to be used. This would enable a more effective management of patients and potentially improve fracture prevention. Bone mineral density (BMD) measured by dual-energy X-ray absorptiometry (DXA) is the basis of current clinical practice guidelines for the diagnosis of osteoporosis and to identify patients at risk of fracture.7 However, the assessment of fracture risk by BMD lacks sensitivity, and up to half of patients who suffer nonvertebral fracture do not have low BMD (BMD T-score ≤ −2.5).8, 9 This lack of sensitivity may be owing to the modest relationship between BMD and bone strength. The measurement of bone strength in cadaveric specimens has shown that around 56% of femoral strength is explained by BMD.10 Bone mass measured by DXA provides a two-dimensional representation of the bone, which does not incorporate other factors that contribute to bone strength such as the geometry, microarchitecture, and material properties of bone tissue.11 It also does not account for the loading conditions that are placed on a bone as a consequence of falling.
Finite element (FE) analysis is an engineering method that uses mathematical models to define how a structure or material will react to stress when loaded. To estimate bone strength, the FE analysis model incorporates information about the geometry and density distribution embedded in DXA bone density scans and the loading conditions known to cause fractures. These DXA-derived FE models have been shown to enhance the prediction of hip fracture compared with BMD alone, but the effect is modest and further development of these methods is required.12–15
We have previously shown in a cross-sectional study that DXA-based FE analysis of the proximal femur is potentially useful in discriminating hip fracture cases from controls.14 Finite element analysis of quantitative computed tomography (QCT) scans is also used to provide an estimate of bone strength. This method has been shown to prospectively predict hip fractures in men.16 However, QCT scans are not routinely performed in the clinical management of patients, and BMD measured by DXA is more widely available to clinicians. Femoral strength estimated from QCT scans decreases more steeply with age than BMD does, and the prevalence of low femoral strength is higher than the prevalence of osteoporosis.17 To our knowledge, there are no published data comparing bone strength determined from FE analysis of DXA scans of the proximal femur with osteoporosis defined by BMD in a large clinical study of postmenopausal women with hip fractures.
The aim of this study was to determine whether bone strength, derived from FE analysis of DXA scans of the proximal femur, is able to discriminate hip fracture independently of BMD in a longitudinal, nested case-control study of elderly community-dwelling women.18
Materials and Methods
We performed FE analysis on DXA scans of the proximal femur in a cohort of 728 women who had participated in a large, single-center study to determine the effect of clodronate on the rate of fractures (n = 5592 participants). The women were over 75 years of age, living in the community. The study details have been published previously.18, 19 Clodronate treatment was associated with a reduction in clinical fractures during the 3-year study but with no efficacy on the incidence of hip fractures. The baseline visit included assessment of general health and fracture history. Bone mineral density was measured at the hip by DXA using a Hologic QDR4500 Acclaim densitometer (Hologic Inc., Bedford, MA, USA). Lateral spine scans were acquired for vertebral fracture assessment (VFA).20 Baseline characteristics were entered into the FRAX model to estimate an individual's 10-year probability of hip fracture with FN BMD not included in the model.19 The participants were not informed of the results of the baseline assessments of fracture risk or BMD measurements. All reported incident fractures were confirmed by hospital notes, discharge letters, radiographic reports, or radiograph review. Only verified incident hip fractures were included in statistical analysis.
This was a nested case-control study design; the study cohort (n = 728) had a mean age of 82 years (range 75 to 95 years). The fracture group included 182 women with incident hip fracture during a median of 4 years of study follow-up. Of these, 127 were classified as femoral neck fractures and 55 as intertrochanteric fractures. The mean time to fracture was 2.6 years (range 2 weeks to 5.5 years). Nine women fractured both hips during the study. Each fracture case was age-, height-, and weight-matched (age ±3 years, height ±3 cm, weight ±3 kg) with three women who remained free of incident hip fracture during follow-up (n = 546).
Fracture history before entry into the study was recorded at baseline by questionnaire. We defined prior fragility fractures as those that occurred after the age of 35 years at the hip (n = 23), spine (n = 8), and wrist (n = 148).
Finite element analysis
DXA scans of the proximal femur from the baseline visit were reanalyzed using a special version of Hologic software under a collaborative research agreement. This version was designed to extract a pixel-by-pixel BMD map of the hip DXA scan (Fig. 1A). Based on the BMD map, the femoral head was segmented first, with the assumption that it was a circle, then the rest of the proximal femur was segmented using an image-processing algorithm that combined edge detection and thresholding followed by manual addition and/or removal (Fig. 1A).
We generated the patient-specific FE models directly from the segmented BMD map. The details of the geometry, material properties, and boundary conditions of the FE model are described in the Supplemental Materials and Methods. Briefly, a subject-specific thickness of the proximal femur was first derived from the width of the femoral neck on the DXA scan. This thickness was then used to convert areal BMD to volumetric BMD (vBMD), which effectively assumes that the femur is a plate of constant thickness. Material properties were then derived from vBMD using the empirical equations of Morgan and colleagues.21, 22 The FE boundary conditions simulated a fall on the greater trochanter: the peak impact force (Fpeak) was applied to the greater trochanter, with medial displacement of the femoral head and displacement of the distal femoral shaft prevented (Fig. 1B). The peak impact force depends on body weight and height.23 It is termed the peak impact force because it does not account for the attenuating effect of the soft tissue over the greater trochanter. We performed linear-elastic analysis without considering post-yield behavior because the proximal femur behaves in a linear-elastic manner until failure.24
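The areal-to-volumetric conversion implied by the constant-thickness plate assumption can be written compactly. This is a sketch only; the symbols are introduced here for illustration and are not taken from the Supplemental Materials and Methods. Here \rho_a is the areal BMD at pixel (x, y) in g/cm², t is the subject-specific plate thickness in cm, and \rho_v is the resulting volumetric BMD in g/cm³.

```latex
\[
\rho_{v}(x, y) = \frac{\rho_{a}(x, y)}{t}
\]
```

Because t is a single value per subject, the volumetric density map is simply a rescaled copy of the areal map, and any variation in true bone depth along the anteroposterior direction is not represented.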
We defined femoral strength using the von Mises failure criterion, which predicts yielding of materials under multiaxial stress from the results of simple uniaxial tension/compression tests. We calculated the von Mises stress (Fig. 1C) and divided it by an apparent yield stress, the average of the compressive and tensile yield stresses, to obtain a stress ratio at each pixel (Fig. 1D). If the stress ratio is greater than one, the element is considered to have failed under the peak impact force. Because voxel-based FE models have a jagged surface, stress and strain from such models are not accurate at the surface, so the surface data were not used in postprocessing. Considering that a single-element failure does not indicate gross fracture of the whole bone, we identified a contiguous area of 25 mm², comprising about 100 elements, that contained the highest stress ratios and thus had the greatest likelihood of failing first. This was achieved by gradually decreasing the stress ratio threshold by a factor β from the maximum stress ratio until a contiguous area of at least 25 mm² was segmented in the stress ratio image. We defined the FE-derived femoral strength as the onset impact force that causes the stress ratio in that area to exceed one. This was calculated by scaling the peak impact force by the factor β, ie:
because the FE analysis was linear elastic and the stress level was proportional to the applied force. This approach to define failure has been successfully applied by Keyak and colleagues.25 It should be pointed out that the femoral strength as derived here can be obtained by applying a unit force rather than a patient-specific peak impact force.
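The threshold-lowering search and the linear scaling of force can be sketched as follows. This is an illustration under stated assumptions, not the authors' Matlab implementation: the function names are hypothetical, the stress ratio map is a plain nested list, contiguity is taken as 4-connectivity, surface elements are not excluded, and the force is rescaled by the threshold at which the region first reaches the required size (whether that scale factor is exactly the β of the text is an assumption).

```python
from collections import deque


def largest_region_size(ratio, threshold):
    """Number of elements in the largest 4-connected region with ratio >= threshold."""
    rows, cols = len(ratio), len(ratio[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or ratio[r][c] < threshold:
                continue
            # Breadth-first flood fill starting from this seed element.
            size = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and ratio[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


def onset_force(ratio, peak_force, min_elements=100):
    """Lower the threshold through the observed stress ratios until a contiguous
    region of at least min_elements exists, then rescale the applied force so the
    stress ratio at that threshold equals one (valid only for a linear model)."""
    values = sorted({v for row in ratio for v in row if v > 0}, reverse=True)
    for threshold in values:
        if largest_region_size(ratio, threshold) >= min_elements:
            return peak_force / threshold
    return None  # no contiguous region of the required size was found


# Hypothetical 3 x 3 stress ratio map; a region of 4 elements is required here.
demo = [[2.0, 2.0, 0.1],
        [2.0, 1.25, 0.1],
        [0.1, 0.1, 4.0]]
print(onset_force(demo, peak_force=1000.0, min_elements=4))
```

In the toy map the isolated element with ratio 4.0 does not determine the result, which mirrors the reasoning that a single-element failure does not indicate gross fracture.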
Hip fracture occurs only if the impact force is greater than the femoral strength. We calculated load-to-strength ratio (LSR), defined as the attenuated impact force, that takes into account the attenuating effect of trochanteric soft tissue, divided by strength. Because thickness of trochanteric soft tissue was not measured in the original study, a constant value of 25 mm was assumed.16, 26 According to Robinovitch and colleagues, this corresponds to 1775 N reduction of the peak impact force.27 To account for the very high rate of loading during a fall and its associated viscoelastic strengthening effect,28 the femoral strength was increased by a factor 1.3.16 Load-to-strength ratio greater than 1 indicates high risk of fracture.
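The load-to-strength ratio described above can be sketched in a few lines. This is one reading of the text, not the authors' code: the attenuated force is taken as the peak impact force minus the 1775 N soft-tissue reduction, and the FE-derived strength is multiplied by 1.3 before division. The function name and the input values in the example are hypothetical.

```python
SOFT_TISSUE_REDUCTION_N = 1775.0  # reduction for an assumed 25 mm of trochanteric soft tissue
RATE_FACTOR = 1.3                 # viscoelastic strengthening at high loading rates


def load_to_strength_ratio(peak_force_n, fe_strength_n):
    """LSR = attenuated impact force / rate-adjusted femoral strength."""
    attenuated_force = peak_force_n - SOFT_TISSUE_REDUCTION_N
    return attenuated_force / (RATE_FACTOR * fe_strength_n)


# Hypothetical subject: 5000 N peak impact force, 2000 N FE-derived strength.
lsr = load_to_strength_ratio(5000.0, 2000.0)
print(round(lsr, 2), "high risk" if lsr > 1 else "lower risk")
```

Because both adjustments are constants, the LSR in this form differs between subjects only through the peak impact force (body weight and height) and the FE-derived strength.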
A suite of programs was developed in Matlab (The Mathworks Inc., Natick, MA, USA) to process DXA images and generate the FE model in ANSYS Parametric Design Language (ANSYS, Canonsburg, PA, USA). On a modern PC, it took less than 3 minutes to segment the femur and generate the FE model. The models were read into ANSYS and solved, and the resulting element stress, strain, and displacement were saved in files for further postprocessing with Matlab programs. It took less than 1 minute to solve the model in ANSYS and another 4 minutes to postprocess the results for strength.
The FE model was validated on 56 cadaveric femoral specimens subjected to destructive experiments that simulated a sideways fall. The cadaver experiment has been described in detail by Roberts and colleagues.29 We performed the FE analysis on the cadaver hip DXA scans blinded to the experiment results and sent FE results away for independent statistical analysis. Pearson's coefficient of correlation was found to be 0.797 and the linear regression as (Fig. 2):
All parameters were tested for normality using a normal probability plot; BMD, strength, and LSR were not normally distributed. Differences between the hip fracture group (n = 182) and the control group (n = 546) at baseline were compared by Mann-Whitney test. The odds ratios (OR) of fracture for a one standard deviation (SD) change in parameter value were derived from the univariate conditional logistic regression for matched case-control groups. The OR adjusted for FN BMD was also calculated to determine whether FE-derived femoral strength was an independent discriminator for fracture risk. To identify the best regression model for hip fracture, we performed forward multivariate conditional logistic regression including femoral strength, FN BMD, prevalent fragility fracture, treatment, and FRAX score as covariates. The best model was compared with FN BMD alone. We performed a comparison based on the area under the curve (AUC) for receiver operating characteristics (ROC) analysis. A p value <0.05 was considered to be significant.
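For readers unfamiliar with the ROC comparison, the AUC for a single parameter can be computed directly from its rank interpretation. The sketch below is illustrative only: it ignores the matched design and the confidence intervals handled by the statistical packages used in the study, the function name is hypothetical, and the example values are made up.

```python
def roc_auc(case_scores, control_scores):
    """Probability that a randomly chosen case scores higher than a randomly
    chosen control, counting ties as one half (the Mann-Whitney form of the AUC)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical LSR values, where higher values indicate higher risk.
cases = [1.1, 1.4, 0.7]
controls = [0.7, 0.6, 1.1, 0.5]
print(round(roc_auc(cases, controls), 3))
```

For parameters where lower values indicate higher risk, such as FN BMD or femoral strength, the scores would be negated before calling the function so that an AUC above 0.5 still corresponds to useful discrimination.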
Low femoral strength assessed by FE analysis of QCT scans in men and women has been defined as less than 3000 N.16, 17 This is the threshold that we used to determine the prevalence of low bone strength. T-scores for FN BMD were calculated using the published data from the third National Health and Nutrition Examination Survey (NHANES).30
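The two thresholds can be illustrated with a short sketch. The reference mean and SD below (0.858 and 0.120 g/cm²) are the commonly quoted NHANES III femoral neck values for non-Hispanic white women aged 20 to 29 years; that these are the exact values used in the study is an assumption, and the function names are hypothetical.

```python
# Assumed NHANES III femoral neck reference values (young adult white women).
REF_MEAN_G_CM2 = 0.858
REF_SD_G_CM2 = 0.120
LOW_STRENGTH_THRESHOLD_N = 3000.0


def fn_t_score(fn_bmd_g_cm2):
    """Femoral neck T-score: SDs above or below the young adult reference mean."""
    return (fn_bmd_g_cm2 - REF_MEAN_G_CM2) / REF_SD_G_CM2


def has_low_strength(fe_strength_n):
    """Low femoral strength as defined in the text: less than 3000 N."""
    return fe_strength_n < LOW_STRENGTH_THRESHOLD_N


print(round(fn_t_score(0.557), 1), has_low_strength(1820.0))
print(round(fn_t_score(0.618), 1), has_low_strength(2614.0))
```

With these assumed reference values, an FN BMD of 0.557 g/cm² corresponds to a T-score of about −2.5 and 0.618 g/cm² to about −2.0.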
One-way analysis of variance with Kruskal-Wallis statistic and Dunn's multiple comparison tests were used to compare variables according to hip fracture type (neck of femur or trochanter) with controls. Mann-Whitney comparison of medians was used to compare the fracture and nonfracture groups. Data analysis was performed using STATA11 (StataCorp LP, College Station, TX, USA), Statgraphics Plus 5.0 (STSC Inc., Rockville, MD, USA), and GraphPad Prism v5.0 (GraphPad Software Inc., San Diego, CA, USA).
The baseline bone density and FE measurements for cases and controls are shown in Table 1. Bone density at the femoral neck and FE-derived bone strength were significantly lower in the women with incident hip fracture than in the controls (p < 0.001), and the load-to-strength ratio was significantly higher. The differences were similar for femoral neck (n = 127) and trochanteric (n = 55) fractures: in both fracture types, FN BMD and FE-derived strength were significantly lower and LSR significantly higher than in the controls, but the two fracture types did not differ significantly from each other.
Table 1. Baseline Bone Density and FE Measurements for Cases and Controls
Measurement | All hip fractures: cases | All hip fractures: controls | Femoral neck fracture: cases | Femoral neck fracture: controls | Trochanteric fracture: cases | Trochanteric fracture: controls
FN BMD (g/cm²) | 0.557 (0.498, 0.624) | 0.618 (0.547, 0.683) | 0.563 (0.494, 0.634) | 0.621 (0.547, 0.688) | 0.548 (0.499, 0.623) | 0.605 (0.548, 0.679)
FN BMD T-score | −2.5 (−3.0, −2.0) | −2.0 (−2.6, −1.5) | −2.5 (−3.0, −1.9) | −2.0 (−2.6, −1.4) | −2.6 (−3.0, −2.0) | −2.1 (−2.6, −1.5)
FE-derived strength (N) | 1820 (1265, 2648) | 2614 (1793, 3435) | 1828 (1256, 2766) | 2693 (1854, 3550) | 1697 (1265, 2471) | 2484 (1780, 3329)
LSR | 1.1 (0.7, 1.4) | 0.7 (0.6, 1.1) | 1.1 (0.7, 1.4) | 0.7 (0.6, 1.1) | 1.1 (0.8, 1.5) | 0.8 (0.6, 1.1)
Data are presented as medians (interquartile ranges).
p < 0.0001 for all comparisons of fracture group medians to controls (Kruskal-Wallis test).
The relationships between FE-derived bone strength, LSR, and FN BMD are shown in Fig. 3. There was a positive correlation (p < 0.0001) between FN BMD and strength (Fig. 3A) in the hip fracture group (Spearman correlation ρ = 0.73) and the controls (ρ = 0.81). We calculated the prevalence of low bone strength (<3000 N) and the prevalence of osteoporosis (FNBMD T-score ≤ −2.5). In the whole study cohort (n = 728), a total of 481 women had low bone strength, and 243 had FN BMD T-score ≤ −2.5 (n = 239 had both). In the hip fracture group, 153 women had bone strength less than 3000 N (mean 1717 N, SD 638 N), and 93 were classified as osteoporotic with a FN BMD T-score ≤ −2.5. Of the 93 women in the hip fracture group with a FN BMD T-score ≤ −2.5, 91 also had bone strength <3000 N.
There was a significant negative correlation (p < 0.001) between FN BMD and LSR (Fig. 3B) in the hip fracture group (Spearman correlation ρ = −0.62) and the controls (Spearman correlation ρ = −0.72). Of the 182 women in the hip fracture group, 98 had LSR >1; in the control group, 153 of the 546 women had LSR >1. A total of 169 women had both FN BMD T-score ≤ −2.5 and LSR >1 (n = 71 hip fracture group; n = 98 control group).
A total of 177 women had prevalent fragility fracture (spine, hip, and wrist). Femoral strength of women who had a fragility fracture before entry into the study (n = 177, median 2035 N, interquartile range (IQ) 1288 to 3033 N) was significantly lower than for women who had not fractured before the study (n = 551, median 2504 N, IQ 1735 to 3341 N, p = 0.0002, Mann-Whitney test). Prior fragility fracture was associated with significantly lower bone strength and FN BMD in both the hip fracture group (p = 0.01) and the control group (p < 0.01). The prevalence of prior fragility fracture was similar in both the hip fracture and control groups (26% and 24%, respectively).
VFA data were available for 716 women. The femoral strength of women with prevalent vertebral fracture on VFA (n = 136, median 1689 N, IQ 1127 to 2346 N) was significantly lower than that of women without vertebral fracture on VFA (n = 580, median 2622 N, IQ 1788 to 3444 N; p < 0.0001). Prevalent vertebral fracture on VFA was associated with lower bone strength and FN BMD in both the hip fracture group and the control group. The prevalence of vertebral fracture on VFA was 27% in the hip fracture group and 16% in the control group.
The ORs for increased risk of hip fracture associated with a 1 SD change in each parameter are shown in Table 2. Fracture risk increased per SD decrease in FN BMD, total hip (TH) BMD, and femoral strength, and per SD increase in LSR. When FN BMD was also included in the model, the ORs for strength and LSR remained significantly greater than 1. This was still the case with further inclusion of VFA fracture and FRAX score for 10-year risk of hip fracture (OR strength 1.73, 95% CI 1.3–2.3; LSR 1.4, 95% CI 1.1–1.8). Prevalent fragility fracture or treatment group alone had no effect. In the stepwise conditional logistic regression used to determine the best predictors of fracture, the BMD and FE variables remained in the model, with FN BMD and strength being the best predictors of fracture. Treatment, prevalent fragility fracture, VFA fracture, and FRAX score were excluded from the model. The areas under the ROC curves (AUC) were compared with FN BMD as the gold standard (Table 2). The AUC for LSR combined with FN BMD, AUC = 0.69 (95% CI 0.64–0.73), was significantly (p = 0.004) greater than that for FN BMD alone, AUC = 0.66 (95% CI 0.62–0.71). The AUC for bone strength, AUC = 0.68 (95% CI 0.63–0.72), was not significantly different from that for FN BMD. In the FN fracture group, the AUC (combined with FN BMD) was 0.68 (95% CI 0.62–0.73) for strength and 0.69 (95% CI 0.64–0.74) for LSR, the latter being significantly greater than that for FN BMD alone in the FN fracture group, 0.66 (95% CI 0.61–0.71).
Table 2. Odds Ratio, Area Under the ROC Curve, and Sensitivity (%)
To our knowledge, this is the first longitudinal study to examine the association of the femoral strength parameters, derived by FE analysis of DXA BMD scans, with the risk of incident hip fracture in elderly women. Previous studies have shown that FE analysis of DXA scans improves hip fracture discrimination in cross-sectional studies compared with BMD but did not estimate whole bone strength of the proximal femur.12–15 In this longitudinal study, we further developed our previous linear-elastic model to estimate the femoral strength and LSR. We found significant associations of FE-derived femoral strength and LSR with incident hip fracture. These associations are still evident after adjusting for areal BMD of the hip, indicating that FE analysis provides additional information beyond areal BMD for the prediction of fractures. Combining BMD and LSR significantly increased the AUC and discrimination sensitivity of BMD alone by 0.03 and 3% to 4%, respectively, which implies that more than 2100 more patients would be correctly identified based on the estimated 70,000 hip fractures each year in the UK (National Institute of Clinical Excellence [NICE] 2009, The management of hip fracture in adults).
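The figure of more than 2100 patients follows from simple arithmetic, shown here as a worked check. It assumes that the 3% to 4% gain in sensitivity applies directly to the estimated 70,000 annual UK hip fractures, which is a simplification.

```latex
\[
0.03 \times 70{,}000 = 2100, \qquad 0.04 \times 70{,}000 = 2800
\]
```

On this reading, the lower bound of the sensitivity gain gives the quoted 2100 additional correctly identified patients per year, and the upper bound gives 2800.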
Because of the lack of bone strength data on DXA-based FE analysis in the literature, we will refer to QCT-based FE analysis studies in the following discussion, always bearing in mind the differences and limitations of our FE model compared with the QCT-based models. Several FE models of the proximal femur based on QCT have been developed, validated, and used for discriminating or predicting hip fracture and monitoring drug treatment of osteoporosis.10, 16, 17, 26, 31–35 These state-of-the-art QCT-based FE models closely approximate the complex 3D geometry and heterogeneous distribution of the nonlinear material properties of the femoral cortical and trabecular bone, simulate 3D loading conditions known to cause hip fracture, and take into account the different failure levels of bone in tension and in compression. Our FE model, like other DXA-based models, is restricted by the inherent limitations of DXA scans to a two-dimensional approach, which, compared with the state-of-the-art technology, is relatively crude. We did not make corrections for any measurement errors in bone size caused by the wide-angle fan-beam of the QDR4500 scanner. We modeled the femur as a plate of constant thickness with uniform volumetric density and material properties in the anteroposterior direction. Although the relationship between volumetric BMD and material properties of the human femoral neck and trochanter was used, we did not model cortical bone separately because it was not possible to identify cortical bone correctly in DXA scans. We simulated the loading conditions in a sideways fall, but the forces had to be limited to the frontal plane without any anteroposterior force components. The loading conditions were not exactly the same as in the validation study.
We chose to perform linear-elastic analysis without considering post-yield behavior because the human proximal femur has been found to behave linearly elastic up to failure.24 We did not consider the different yield stresses in tension and in compression. All these limitations may explain the underestimation of the femoral strength in the cadaveric validation. However, we found that the FE strength was moderately correlated to the experimental strength, which led to the significant association with hip fracture that was independent of the hip BMD.
Femoral strength has been reported to be around 2500 N in women with prevalent osteoporotic fractures and 4000 N in nonfracture controls.26, 35 The threshold that we assigned as low bone strength was based on data from the MrOS prospective study, which reported femoral strength values less than 3000 N in those subjects who reported new fractures during follow-up.16 Amin and colleagues reported that below an estimated bone strength of 3000 N, there was an increase in the probability of prevalent osteoporotic fractures for both men and women.26 Using the same threshold to compare bone strength with BMD with respect to incident fracture, we calculated the number of women in each group with bone strength less than 3000 N and the number with FN BMD T-score ≤ −2.5, which classifies women as osteoporotic. In the hip fracture group, 84% of women had low bone strength, 50% had osteoporosis classified by FN BMD T-score, and only one of the osteoporotic women had bone strength greater than 3000 N. On the other hand, 60% of the controls had femoral strength less than 3000 N and 27% were osteoporotic. Because the mean age of the participants was 82 years, a large proportion of nonfracture controls with low bone strength was to be expected. Keaveny and colleagues reported a prevalence of 65% for low femoral strength and 38% for osteoporosis in women in the 8th decade of age,17 which is comparable to our finding that 481 (66%) and 243 (33%) women had low femoral strength and osteoporosis, respectively.
From a biomechanics point of view, hip fracture occurs when the load applied to the bone is greater than the bone strength, that is, when the LSR is greater than 1. Keaveny and Bouxsein proposed a concept of biomechanical fracture threshold based on LSR.36 They reported that low femoral strength was associated with LSR above the theoretical biomechanical fracture threshold (Φ = 1) and consideration of the LSR may improve fracture risk assessment. Orwoll and colleagues reported that in elderly men (>65 years, mean age 73 years) the LSR, based on FE analysis of QCT scan, in incident hip fracture cases was 1.13 ± 0.41 and 0.75 ± 0.24 in the nonfractured men.16 Amin and colleagues reported LSRs for women and men with and without prevalent osteoporotic fractures being 1.43 ± 0.50 and 1.03 ± 0.36, respectively, for women and 1.50 ± 0.57 and 1.07 ± 0.36, respectively, for men.26 Similar mean LSRs and their SDs were found in our study (1.16 ± 0.52 for incident fractured group and 0.87 ± 0.41 in control group).
There was no difference in FN BMD, bone strength, or LSR according to type of fracture (neck of femur or trochanteric). Some studies have suggested that trochanteric and femoral neck fractures are associated with different geometrical and density measures.37–41 Because the FE model integrates the bone mass distribution and geometry information embedded in DXA scans with the loading condition known to cause hip fracture, these subtle differences are incorporated into the FE model but may cancel each other out in the estimate of whole bone strength. Region-specific strength indices such as those suggested by Luo and colleagues15 may have the potential to distinguish FN and trochanteric fractures, and this warrants further study.
In agreement with Amin and colleagues,26 we found that prevalent fragility fracture at baseline was associated with lower femoral strength and FN BMD in both the hip fracture group and the control group. However, with conditional logistic regression, prior fracture had no significant effect on discrimination of hip fracture.
There are strengths and additional limitations to our study. This is a nested case-control study in which fracture cases and controls were individually matched for age, height, and weight. Such a design has the advantage of controlling for these known risk factors for hip fracture. However, it prevented us from accounting for the time to fracture. The FE model is derived from a two-dimensional DXA scan, and it could be improved by generating a 3D model from the 2D image as reported by Langton and colleagues.42 We are currently developing our method to incorporate a three-dimensional model. The variability of the method (5%), which arises mainly from the interactive segmentation of the proximal femur, is higher than that of DXA BMD (1%). Reducing the variability of the FE analysis will make this method a better diagnostic tool. Improvements in modeling the forces associated with falling and variations in soft-tissue thickness may also improve the diagnostic performance of the FE analysis method.
We studied elderly women who were above the mean age of hip fracture (77 years) for the UK population (The National Institute for Health and Clinical Excellence 2009). Their bone density results were low and scan quality was not optimal for all images, which could influence the performance of the method. For this particular age group, FE analysis may not offer a significant benefit over the information gained from DXA, as significant bone loss has already occurred because of aging.
Our results demonstrate a modest but significant improvement over DXA BMD alone for the prediction of hip fractures. Bone strength was associated with hip fracture risk independently of other markers of skeletal fragility such as prevalent fragility fracture, VFA, and FRAX score. A greater proportion of women in the hip fracture group had low bone strength than were classified as osteoporotic by a BMD T-score of −2.5 or below. Further work is required to optimize the clinical utility of FE models and reduce the variability of measurements. Finite element analysis models that estimate bone strength may be a useful clinical tool to identify people at risk of fracture who have low bone strength but do not have low bone mineral density.
All authors state that they have no conflicts of interest.
This study was supported by the Arthritis Research UK (grant number 19669) and Medical Research Council (grant number G80899).
The authors thank Ms D Charlesworth, Metabolic Bone Centre, Northern General Hospital, for retrieval of the DXA scans, and Mr M Bradburn from the University of Sheffield, School of Health and Related Research, for statistical advice. We thank Dr Mary L Bouxsein for providing us with the cadaver DXA scans and experiment results for the validation work. We acknowledge Hologic Inc., for providing the modified software to acquire the bone maps for FE analysis.
Authors' roles: Study design: LY and EM. Study conduct: LY and EM. Data collection: KN, LY, and EM. Data analysis: KN and LY. Data interpretation: KN, LY, RE, and EM. Drafting manuscript: KN and LY. Revising manuscript content: KN, LY, RE, and EM. Approving final version of manuscript: KN, LY, RE, and EM. LY takes responsibility for the integrity of the data analysis.