Center for Chronic Disease Outcomes Research, Veterans Affairs Medical Center, and Department of Medicine and Division of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, USA
Although bone mineral density (BMD) is a strong predictor of fracture risk,1, 2 only a small portion of women meet BMD criteria for osteoporosis, and thus the majority of fractures occur in women with low bone mass (previously called osteopenia).3 To improve fracture prediction, the World Health Organization (WHO) has recently developed a country-specific fracture risk index using nine clinical risk factors in addition to BMD.4–6 The US National Osteoporosis Foundation (NOF) subsequently released guidelines recommending treatment for women with existing hip or spine fracture, osteoporosis by BMD (T-score ≤ −2.5), or low bone mass by BMD (−2.5 < T < −1.0) with an increased risk of fracture based on the FRAX model.7 FRAX appears similar to age-adjusted BMD alone in overall prediction of fracture risk in postmenopausal women.8 However, it is unknown how well the FRAX model predicts fractures across differing levels of BMD, particularly among women with low bone mass—who present the greatest treatment dilemma. The aim of our study was to evaluate how the FRAX model 10-year probabilities predicted actual fractures observed over 10 years in a prospective US cohort of women aged 65 years or older.
In 1986–1988, the Study of Osteoporotic Fractures (SOF) recruited 9704 community-dwelling women who were aged 65 years or older (>99% non-Hispanic white) in four US regions: Baltimore County, MD; Minneapolis, MN; Portland, OR; and the Monongahela Valley near Pittsburgh, PA.1 Women were recruited irrespective of BMD and fracture history; those unable to walk without assistance and those with bilateral hip replacements were excluded. All women provided written consent, and SOF was approved by each site's institutional review board.
About 2 years after the initial visit, 9339 (of 9451 surviving) SOF women returned for a visit that included their first dual-energy X-ray absorptiometry (DXA) BMD measurement in the clinic; 7963 had adequate DXA BMD measurement. For this analysis, we required women to have baseline measurements for all the risk factors in the FRAX model, as well as femoral neck BMD. Among the 7963 women with BMD, the primary reason for missing FRAX risk factors was unknown parental history of fracture (n = 1445). Complete data to calculate FRAX probabilities were available on 6252 women. Women who were missing risk factors for FRAX were, on average, older (72.3 versus 71.3 years), and a larger proportion reported a previous history of fracture (42% versus 34%). However, the 6252 women in the analytic cohort and those missing FRAX risk factors had a similar body mass index (BMI; 26.4 kg/m2) and a similar femoral neck BMD (0.65 g/cm2).
Measurement of clinical risk factors, including BMD
Measurement and quality-control procedures were rigorous (detailed elsewhere).1 At the baseline examination (1986–1988), height was measured by stadiometer, and weight (in light clothing without shoes) by balance-beam scale. Women also provided information on date of birth, personal fracture history after age 50, parental history of hip fracture, smoking status, alcohol use, rheumatoid arthritis (RA), and glucocorticoid use. At the second visit (1988–1990), DXA first became available, and BMD was measured with a Hologic QDR 1000 (Hologic, Bedford, MA, USA) at the proximal femur and lumbar spine. DXA BMD measurement standards and precision also have been detailed previously.9 T-scores were calculated using young women aged 20 to 29 years from the National Health and Nutrition Examination Survey (NHANES) as the reference and were computed by WHO criteria.10
Follow-up for ascertainment of fractures
Participants were contacted every 4 months by postcard (with phone follow-up for nonresponders) to ascertain incident hip and other nonspine fractures; more than 98% of these contacts were completed. Incident nonspine fractures were physician-adjudicated from radiology reports. Clinical spine fractures also were adjudicated, when reported. Major osteoporotic fractures included hip, clinical spine, wrist (distal radius or ulna), and humerus.
WHO and FRAX 10-year absolute fracture risk
The WHO 10-year absolute risk of both hip fracture and major osteoporotic fracture (ie, hip, clinical spine, wrist, or humerus) was calculated for each SOF participant by the WHO Collaborating Centre for Metabolic Bone Disease using the FRAX algorithm for US white women6 and provided to SOF (FRAX Version 3.0 was used for final analyses).11, 12 The FRAX 10-year probabilities are based on the following risk factors: age, sex, body mass index (kg/m2), previous history of fracture, parental history of hip fracture, current smoking, glucocorticoid use in the last 3 months, presence of rheumatoid arthritis (RA), other types of secondary osteoporosis, and 3 or more alcoholic beverages a day.6 For secondary osteoporosis, only RA was assessed in SOF. However, the US FRAX model calculator (with BMD) similarly does not consider other types of secondary causes of osteoporosis in the FRAX calculation when BMD is known because they typically mediate their risk through BMD.7 FRAX 10-year probabilities were provided both with and without femoral neck (FN) BMD. We did not evaluate use of nonhip BMD (eg, spine), nor is this recommended with the FRAX calculator because it has not been validated.7
To compare 10-year FRAX-predicted fracture risk with observed fracture risk in our cohort, we evaluated 6252 women who had measurements for all nine risk factors as well as DXA BMD. To allow an adequate comparison with the 10-year FRAX probabilities, we also limited follow-up to 10 years (mean follow-up was 9.4 years, range 2.2 to 10.0 years).
We evaluated each individual woman's FRAX predicted probability of hip and major osteoporotic fracture compared with observed rates of hip and major osteoporotic fracture, both with and without FN BMD T-score, in the FRAX model.5, 6 We did additional analyses that included traumatic fractures as part of observed fractures to confirm that results were consistent for all fracture types (data not shown).
Since our goal was to evaluate how FRAX predicts fractures across varying levels of baseline BMD, we stratified results based on initial FN BMD T-score by 0.5 increment as well as by T-score groups: normal (T ≥ −1.0), low bone mass (−2.5 < T < −1.0), and osteoporotic (T ≤ −2.5). So that within-strata comparisons (eg, comparisons within the low-bone-mass group) to FRAX would be valid, we developed our models on the full population before stratifying by BMD.
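The stratification described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names are hypothetical, the reference mean and standard deviation must be supplied by the caller (no NHANES values are assumed here), and the placement of the 0.5-wide bin edges is an assumption because the text does not state it.

```python
import math


def t_score(bmd: float, ref_mean: float, ref_sd: float) -> float:
    """Standardize a BMD value against a young-adult reference population."""
    if ref_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return (bmd - ref_mean) / ref_sd


def bmd_category(t: float) -> str:
    """Assign the three T-score groups used in the text."""
    if t <= -2.5:
        return "osteoporotic"
    if t < -1.0:
        return "low bone mass"
    return "normal"


def t_score_bin(t: float, width: float = 0.5) -> float:
    """Lower edge of the bin containing t (assumed edges at multiples of width)."""
    return math.floor(t / width) * width
```

With this convention, a T-score of exactly −2.5 is osteoporotic and a T-score of exactly −1.0 is normal, matching the inequalities given in the text.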
Receiver operating characteristic (ROC) curves were used to assess overall sensitivity and specificity for predicting observed hip and major osteoporotic fractures, summarized by the area under the ROC curve (AUC) statistic, using the 10-year probabilities of hip and major osteoporotic fractures, respectively, calculated with FRAX.5, 6 Higher AUC values represent better prediction with the models.
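For readers less familiar with the AUC, it can be read as the probability that a randomly chosen woman who fractured had a higher predicted probability than a randomly chosen woman who did not. The Python sketch below computes it directly from that definition; it is illustrative only (the analyses reported here used STATA and SAS), and the function name is hypothetical.

```python
def auc(scores, outcomes):
    """Area under the ROC curve by pairwise comparison; ties count one half."""
    cases = [s for s, y in zip(scores, outcomes) if y]
    controls = [s for s, y in zip(scores, outcomes) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))
```

A model that assigns the same probability to everyone yields 0.5 under this definition, which is the "no better than chance" reference value.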
STATA Version 9.2 was used to compare the AUC statistic across BMD groups (StataCorp, College Station, TX, USA). All other statistical analyses were performed using SAS Version 9.1 (SAS Institute, Cary, NC, USA). We used chi-square tests and analysis of variance to test bivariate associations. A STATA algorithm by DeLong, DeLong, and Clarke-Pearson13 was used to test the equality of the area under the curve across the three BMD groups. Post hoc pairwise comparisons using the same procedure also were conducted. Linear trends in AUC statistics also were tested using regression analyses. Specifically, the AUC statistic was regressed on the BMD group (normal, low bone mass, osteoporotic). For these trend analyses, observations were weighted by the standard deviation of the AUC statistic, and BMD groups were assumed to be equally spaced. The p value (for trend) reported is for the slope of the regression equation.
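The trend analysis amounts to a weighted straight-line fit of AUC on an equally spaced group code. The sketch below shows only the slope estimate, under stated assumptions: the function name is hypothetical, the weights are supplied by the caller (how the study's weights were constructed from the AUC standard deviations is not reproduced here), and no p value is computed.

```python
def weighted_slope(x, y, w):
    """Slope of a weighted least-squares line of y on x."""
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y, and w must have equal length of at least 2")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    return sxy / sxx


# Equally spaced codes for normal, low bone mass, osteoporotic.
group_codes = [0, 1, 2]
# Illustrative AUC values and weights only; not results from this study.
example_slope = weighted_slope(group_codes, [0.70, 0.65, 0.60], [1.0, 1.0, 1.0])
```

A negative slope in this coding would indicate that discrimination declines from the normal group toward the osteoporotic group.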
Within each 0.5 T-score increment, we used t tests to compare the FRAX-predicted probability of fracture with the actual fracture proportions. We summarized these findings in terms of predicted and observed number of fractures within each of the BMD categories by 0.5 T-score increments (Fig. 2).
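The predicted-versus-observed comparison within T-score increments can be sketched as follows. This is a simplified illustration rather than the study's procedure: it tabulates expected and observed fracture counts per bin but does not perform the t tests, the bin edges at multiples of 0.5 are an assumption, and the function and key names are hypothetical.

```python
import math
from collections import defaultdict


def calibration_by_bin(t_scores, predicted_probs, fractured, width=0.5):
    """Per T-score bin: expected fractures (sum of predicted probabilities)
    versus observed fractures, plus the corresponding proportions."""
    groups = defaultdict(list)
    for t, p, y in zip(t_scores, predicted_probs, fractured):
        lower_edge = math.floor(t / width) * width
        groups[lower_edge].append((p, bool(y)))

    rows = []
    for lower_edge in sorted(groups):
        pairs = groups[lower_edge]
        n = len(pairs)
        expected = sum(p for p, _ in pairs)
        observed = sum(1 for _, y in pairs if y)
        rows.append({
            "bin": (lower_edge, lower_edge + width),
            "n": n,
            "expected": expected,
            "observed": observed,
            "mean_predicted": expected / n,
            "observed_proportion": observed / n,
        })
    return rows
```

A bin in which `mean_predicted` exceeds `observed_proportion` corresponds to overprediction, and the reverse to underprediction.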
Although the FRAX model was developed using other data, we confirmed there was no evidence of multicollinearity between predictor variables used in the FRAX calculations in all comparisons (r < 0.4). All the statistical tests that we report are two-sided; statistically significant implies p < .05.
Sensitivity analyses based on prior fracture status or age at baseline
In addition to evaluating overall probabilities in the whole cohort, we did separate analyses among the 4097 (65.5%) women with no prior fracture history (those who did not report any fracture since age 50 at the baseline exam). We also stratified these analyses by age at baseline (≤75 years versus >75 years).
Sensitivity analyses of US FRAX treatment thresholds
A primary aim of our analysis was to evaluate FRAX among women with low bone mass (osteopenia)—those who present a clinical conundrum about treatment benefit. The US NOF recommends pharmacologic treatment of high-risk women with low bone mass (T-score between −1.0 and −2.5 on either the femoral neck or the lumbar spine) if they have a 10-year probability of a hip fracture of 3% or greater or a 10-year probability of a major osteoporotic fracture of 20% or greater based on the US adapted FRAX model.7 Based on NOF criteria, we dichotomized the 4464 women who had low bone mass (LBM) by femoral neck or spine BMD depending on whether the NOF would recommend treatment (FRAX high risk, n = 2218) or not (FRAX low risk, n = 2246).7 We then evaluated what proportion of the high- and low-risk groups of LBM women developed a fracture over 10 years. The sensitivity, specificity, positive predictive value, and negative predictive value of NOF treatment thresholds for FRAX also were calculated in these LBM women.14, 15
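The dichotomization and the performance measures described above can be expressed compactly. The sketch below is illustrative: the function names are hypothetical, probabilities are expressed as fractions (0.03 for 3%), whether a woman has low bone mass is passed in as a flag rather than derived from two sites, and the counts in the usage example are made up, not study data.

```python
def nof_recommends_treatment(low_bone_mass: bool, hip_prob: float, mof_prob: float) -> bool:
    """NOF rule for low bone mass as stated in the text: treat if the 10-year
    hip fracture probability is at least 3% or the 10-year major osteoporotic
    fracture probability is at least 20%."""
    return low_bone_mass and (hip_prob >= 0.03 or mof_prob >= 0.20)


def diagnostic_performance(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from a 2 x 2 table,
    where 'positive' means classified as high risk."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Illustrative counts only; not results from this study.
example = diagnostic_performance(tp=30, fp=30, tn=30, fn=10)
```

A threshold can have high sensitivity and still a low PPV when fractures are uncommon, because false positives then outnumber true positives.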
The characteristics of the 6252 women who were an average age of 71 years at the baseline exam are shown in Table 1, stratified by baseline BMD category. All risk factors, except history of RA and corticosteroid use, significantly differed based on baseline BMD T-score (Table 1). Over a total of 58,879 person-years of follow-up, 368 women suffered a hip fracture, and 1011 incurred a major osteoporotic fracture. Fracture risk increased with decreasing BMD, as would be expected (Table 1).
Table 1. Characteristics of the SOF Cohort at Baseline and Over 10 Years of Follow-up
A model with no utility in predicting fracture would have an AUC of 0.50 (ie, no better than flipping a coin or chance alone); AUC was greater than 0.50 for all models (Table 2). The FRAX model predicted hip and major osteoporotic fractures within all BMD categories, even when baseline BMD was not part of the probability calculation (Table 2). However, prediction with FRAX models was similar to that of simpler models (Table 2). In general, overall prediction was better (higher AUCs) for all hip fracture models (using either FRAX or simpler models) than it was for major osteoporotic fracture.
Table 2. Prediction of Fracture, Stratified by Baseline BMD
Note: n = 6252 subjects for whole cohort and 4097 subjects reporting no prior fracture after age 50; model n's vary by fracture type owing to missing values (for fracture type). FRAX models use calculated FRAX hip and major osteoporotic fracture probabilities compared with actual fractures. All models with BMD use femoral neck BMD.
FRAX hip and major osteoporotic probabilities are used for corresponding fracture outcomes.
Major osteoporotic fractures include hip, clinical spine, wrist, and humerus.6
p Value for overall chi-square comparison across BMD categories.
p < .05 for trend across BMD groups.
p < .05 compared with osteoporotic group (pairwise comparison).
p < .05 compared with FRAX with BMD model (pairwise comparison within BMD category).
When analyses were restricted to the 4097 women without prior fracture at the baseline exam for all BMD categories, FRAX prediction (AUC) was similar to the whole cohort for both fracture types (Table 2). Thus FRAX discriminated fracture risk, particularly for hip fracture, among women without current evidence of osteoporosis (by BMD or history of fracture)—women one would like to target for primary prevention.
When hip fracture risk was further evaluated among women without prior history of fracture, FRAX models with BMD predicted 10-year probability best among women aged 65 to 75 years at baseline (versus >75 years old; Table 3). Simpler models (eg, age + BMD) similarly predicted best in younger women.
Table 3. FRAX Prediction of Hip Fracture Among 4097 Women Without a History of Prior Fracture

| Models if no prior fracture history | Normal/LBM (n = 3441), AUC (95% CI) | Osteoporotic (n = 656), AUC (95% CI) |
| --- | --- | --- |
| Hip fracture, n | | |
| Age ≤ 75 | | |
| FRAX with BMD | 0.75 (0.69, 0.80) | 0.60 (0.51, 0.70) |
| FRAX without BMD | 0.62 (0.56, 0.69) | 0.60 (0.51, 0.70) |
| Age + BMD | 0.73 (0.68, 0.79) | 0.59 (0.50, 0.69) |
| | 0.63 (0.57, 0.69) | 0.58 (0.48, 0.67) |
| Age > 75 | | |
| FRAX with BMD | 0.66 (0.58, 0.73) | 0.51 (0.41, 0.60) |
| FRAX without BMD | 0.62 (0.53, 0.71) | 0.49 (0.39, 0.59) |
| Age + BMD | 0.69 (0.61, 0.77) | 0.62 (0.52, 0.71) |
| | 0.51 (0.41, 0.62) | 0.52 (0.42, 0.63) |

Note: n = 4097 women without a prior history of fracture. Although the age distribution for the whole cohort was approximately evenly split for age ≤75 versus age >75, for women without a history of fracture, there were 3427 women aged less than 75 years and 670 women aged greater than 75 years at baseline. All BMD models use femoral neck BMD.
FRAX prediction across T-score BMD increments
Clinically, it is helpful to know not only overall prediction (AUC) but also whether the error is an over- or underestimation. To better illustrate how the rates of fracture predicted by the FRAX model with BMD compared with actual rates, we evaluated this by 0.5 increments of T-score FN BMD (Fig. 1). Hip fracture prediction when using 10-year hip fracture probabilities was very close to actual fracture rates across most BMD increments (Fig. 1), consistent with higher AUC values for hip fracture (Table 2). In contrast, FRAX overpredicted major osteoporotic fractures in women with normal and low bone mass when using 10-year major osteoporotic probabilities (Fig. 1).
FRAX prediction based on NOF treatment guidelines
There were 4464 women (71% of the 6252) who had low bone mass by FN or spine BMD. Based on the FRAX US model Version 3.0, nearly half of these 4464 women would be considered high risk by the NOF and recommended for treatment (Fig. 2). Importantly, during the 10 years of follow-up after the SOF baseline exam in 1986–1988, osteoporosis treatment was less common and less available (eg, only approximately 1% used alendronate prior to the year 10 exam), so this cohort is well suited to comparing observed with predicted fracture risk. Fewer than 10% of LBM women classified as high risk by the NOF7 suffered a hip fracture, and fewer than 25% of those deemed high risk (treatment recommended) incurred a major osteoporotic fracture (Fig. 2). Among women with a history of prior fracture, both the proportion recommended for treatment and the percentage who developed a fracture were higher (Fig. 2, Table 2). The NOF treatment threshold (high risk) for women with low bone mass was reasonably sensitive, identifying most women who went on to fracture (few false negatives), but its specificity was low (many false positives) (Table 4). Moreover, the positive predictive value of this NOF threshold was very poor (Table 4) because of the very high false-positive rate.
Table 4. Proportion of Women With Low Bone Mass Meeting NOF Thresholds for Treatment (High Risk)a and Performance Characteristicsb of NOF Treatment Thresholds Overall and Based on Prior History of Fracture
NOF high risk is low bone mass (T-score between −1.0 and −2.5 by femoral neck or spine BMD, n = 4464) and 10 year FRAX probability (including BMD) of fracture of 3% or more for hip or 20% or more for major osteoporotic fracture (MOF).7
FP = false positive; TP = true positive; TN = true negative; FN = false negative. For presentation, we rounded group percentages to whole numbers (thus the group total may not be exactly 100%). PPV = positive predictive value; NPV = negative predictive value.14, 15
Sensitivity, or true-positive rate = TP/(TP + FN).
Specificity, or true-negative rate = TN/(TN + FP).
PPV = TP/(TP + FP).
NPV = TN/(TN + FN).
In this large prospective cohort study of 6252 community-dwelling women aged 65 years or older, we found that the FRAX model predicted incident hip and major osteoporotic fractures among women with normal and low bone mass, not just those with frank osteoporosis. Overall prediction in each BMD category (ie, normal, low, or osteoporotic) was similar using either FRAX model (clinical risk factors alone or combined with BMD) for all fracture types.
FRAX predicted hip fractures better in women with normal and low bone mass than it did in women with frank osteoporosis by BMD criteria. These results do not contradict prior data that BMD is a strong risk factor for fracture (hazard ratios are based on sensitivity, not specificity). Moreover, one would hope that a clinical risk model would perform best with overall prediction of sensitivity and specificity (AUC) of fracture risk among those identified as low risk by BMD. Similarly, one would hope that a risk model would be useful in women who have not yet manifested fragile bones (by experiencing a fracture after age 50). Indeed, our results suggest that the FRAX model, and assessing additional risk factors, offers particular utility in stratifying fracture risk among women with normal and low bone mass. Importantly, this improved prediction for women with normal and low bone mass was present even among women who had yet to experience a fracture since age 50 (Table 2).
Because BMD (the “gold standard”) is an excellent discriminator of fracture risk, it is not surprising that hip fractures occurred rarely in women with normal BMD values (n = 14). However, the majority of hip fractures occurred in women without osteoporosis by BMD (n = 190), and the addition of clinical risk factors improved fracture prediction in these women. This improved hip fracture prediction occurred even among women who had not “declared” their fragile bone status with a prior fracture.
Ensrud and colleagues have recently reported that FRAX prediction is similar to simpler models for prediction of overall nonspine and spine fracture rates.8, 16–19 One could reasonably argue from our results that BMD (if known) or prior fracture after age 50 (if it has occurred) are very good (and simple) predictors of fracture risk, including among women with low bone mass. However, our results also suggest that in women for whom true primary prevention is sought (ie, normal or low bone mass and no history of prior fracture), FRAX can offer utility in predicting risk, especially for hip fracture.
We now have excellent evidence that treatment for women who meet BMD criteria for osteoporosis can reduce future fracture risk.20–23 The data for treatment benefit in women with low bone mass are less compelling, particularly with prevention of nonspine fractures.20, 22, 24–26 Because women with low bone mass (osteopenia) represent the majority of all postmenopausal women, treating nearly all women to prevent future fractures is cost-prohibitive with limited resources and also unnecessarily increases adverse effects in women who are unlikely to receive individual benefit.
Thus the concept of the FRAX model is desirable to better stratify fracture risk and potential benefit with treatment in women without osteoporosis. Our findings provide insight into how FRAX probabilities (and simpler models) relate to observed risk, especially among women with low bone mass that is not yet osteoporotic. Moreover, our results suggest that FRAX can be helpful in stratifying risk among women who have not yet experienced a fracture—the target group for primary prevention. Additional evidence about how treatment based on FRAX risk stratification will reduce fracture risk is still needed.
With current NOF guidelines, the majority of postmenopausal women with low bone mass are recommended for treatment based on the FRAX model probabilities.7, 27, 28 Since publication of the report by Donaldson and colleagues on the high proportion of US women meeting NOF treatment thresholds,27 the US version of FRAX was updated (Version 3.0) to reduce overestimation of fracture.12 Our finding that current NOF treatment thresholds (based on FRAX) still identify a large proportion of women with low bone mass as high risk who will not fracture (false positives) indicates that more research is needed on how to improve screening of women with low bone mass who would benefit from primary prevention (and thus avoid unnecessarily treating a large proportion of women).
Our study has several important strengths. It is a large prospective study of 6252 community-dwelling older women with rigorous quality control of BMD and other measurements. In addition, retention of survivors is excellent, including fracture ascertainment over the 10 years of follow-up.
Our study also has some potential limitations. All FRAX risk factors were measured at baseline except BMD, which was not available until about 2 years after baseline. However, this is unlikely to introduce bias given our prior evidence on the stability of a single BMD measurement as a long-term predictor.2 Also, our results are in postmenopausal US women aged 65 years or older and may not be generalizable to other groups, particularly younger women who are transitioning through menopause.
The FRAX models provide a paradigm shift in fracture prevention because they encourage providers (and patients) to think in terms of absolute fracture risk, given that there is no compelling rationale for treating people with low absolute risks of fracture. An important limitation of FRAX is that the evidence of treatment efficacy has come from populations with osteoporosis by BMD criteria.20–23 Therefore, until treatments are shown to be efficacious in women without osteoporotic BMD, the utility of identifying high fracture risk in women without osteoporosis still should be viewed with caution. Our finding that FRAX predicts fractures within all BMD categories, including women with normal bone mass and without prior fracture, provides the first step. Treatment trials are also needed to demonstrate that fractures will be prevented among women identified by FRAX to be at high risk but without an osteoporotic BMD.
In summary, among older US women, we found that the FRAX model predicted hip and major osteoporotic fractures within all BMD categories, including women with normal and low bone mass. Simpler models provided similar risk stratification (eg, age + BMD) among women with low bone mass. For the large proportion of postmenopausal women for whom osteoporosis primary prevention is sought (normal or low bone mass without prior fracture), more research is still needed on how to reduce the false-positive rate (and unnecessary treatment) with FRAX (or other screening tests) while retaining our ability to identify women at high risk of fracture.
JAC is a consultant for Novartis and has received research funding from Novartis. DCB has received research funding from Novartis, Merck, and Amgen. DMB is a consultant for Amgen and has received research funding from Novartis, Roche, Merck, and Amgen. All the other authors state that they have no conflicts of interest.
We wish to thank the following contributors to this work: Heather Baird for technical assistance and Martie Sucec for editorial review. TA Hillier had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
This research was supported by the National Institute of Arthritis and Musculoskeletal and Skin Diseases and the National Institute on Aging (Public Health Service Grants 2 R01 AG027574-22A1, R01 AG005407, R01 AG027576-22, 2 R01 AG005394-22A1, AG05407, AG05394, AR35583, AR35582, and AR35584). The SOF investigators were completely independent of the funding source to design and conduct the study, including data collection, management, analysis, interpretation, and preparation, review, and final approval of the manuscript. This article was presented in part at the American Society for Bone and Mineral Research Annual Meeting, in Montreal, Quebec, Canada, September 13, 2008.