Low bone mineral density (BMD) is a risk factor for fracture. Although the current “gold standard” test is DXA of the hip and spine, this method is not universally available. No large studies have evaluated the ability of new, less expensive peripheral technologies to predict fracture. We studied the association between BMD measurements at peripheral sites and subsequent fracture risk at the hip, wrist/forearm, spine, and rib in 149,524 postmenopausal white women, without prior diagnosis of osteoporosis. At enrollment, each participant completed a risk assessment questionnaire and had BMD testing at the heel, forearm, or finger. Main outcomes were new fractures of the hip, wrist/forearm, spine, or rib within the first 12 months after testing. After 1 year, 2259 women reported 2340 new fractures. Based on manufacturers' normative data and multivariable adjusted analyses, women who had T scores ≤ −2.5 SD were 2.15 (finger) to 3.94 (heel ultrasound [US]) times more likely to fracture than women with normal BMD. All measurement sites/devices predicted fracture equally well, and risk prediction was similar whether calculated from the manufacturers' young normal values (T scores) or using SDs from the mean age of the National Osteoporosis Risk Assessment (NORA) population. The areas under receiver operating characteristic (ROC) curves for hip fracture were comparable with those published using measurements at hip sites. We conclude that low BMD found by peripheral technologies, regardless of the site measured, is associated with at least a twofold increased risk of fracture within 1 year, even at skeletal sites other than the one measured.
FRACTURE IS THE MOST important clinical consequence of osteoporosis, and prevention of fracture is the rationale for identification of persons at higher risk. Although several characteristics and behaviors have been shown to be risk factors for osteoporosis (e.g., white ethnicity, advanced age, cigarette smoking, low body weight, and inadequate calcium intake), evaluation of risk factors alone has been inadequate to accurately diagnose osteoporosis or low bone mass or to predict fracture risk in individual patients.(1–3) Bone mineral density (BMD) testing is currently the most objective method to diagnose osteoporosis in asymptomatic women.(2)
Low BMD is an important risk factor for osteoporotic fracture. Eighty percent of the variance in bone strength is related to bone mineral content (BMC).(4, 5) Measurement at the hip by DXA is the currently accepted gold standard.(2) Low values at peripheral skeletal sites in the elderly have also been shown to predict fractures, including hip fracture.(6–11) Prospective studies that have evaluated the association between peripheral BMD and fracture incidence have been relatively small and have not included devices recently approved by the Food and Drug Administration (FDA).(5) Use of peripheral measurements to identify persons at risk of osteoporotic fracture is attractive because these devices are smaller, more portable, and less expensive to buy, operate, and maintain than central DXA units.(12) Many of these devices are in use across the United States.
Previously, we have reported that peripheral BMD predicted fracture risk in a large cohort of women without known osteoporosis.(13) In this study, we report the powerful predictive relationship between low peripheral BMD measured at different sites and the subsequent 1-year fracture risk at different sites, both hip and nonhip, in this large population of healthy postmenopausal white women.
MATERIALS AND METHODS
The National Osteoporosis Risk Assessment (NORA) is a longitudinal observational study of osteoporosis among postmenopausal women throughout the United States. Details of the study design have been published.(13, 14) Briefly, postmenopausal women who were at least 50 years of age, who had not been diagnosed previously with osteoporosis, and who had not had BMD testing within the preceding 12 months were eligible to participate. Women were recruited through the practices of 4236 primary care physicians whose practices (located in 34 states) included large numbers of postmenopausal women. Each office provided randomly chosen names of up to 300 eligible women, who received letters of invitation from their physicians to participate in the NORA. Overall, ∼30% of women who were invited to participate did so. On average, 40–100 women in each practice agreed to enroll and gave informed consent. The study protocol and consent documents were approved by Essex, a national institutional review board.
Each participant had BMD measured at a single peripheral site: forearm, using peripheral DXA (pDEXA; Norland Medical Systems, Inc., Fort Atkinson, WI, USA); finger, using pDXA (AccuDEXA; Schick Technologies Inc., Long Island City, NY, USA); or heel, using either single X-ray absorptiometry (SXA; Osteoanalyzer; Norland Medical Systems, Inc.) or ultrasound (US; Sahara; Hologic, Inc., Bedford, MA, USA). BMD testing was performed in the physician's office by licensed technicians who were certified by the International Society for Clinical Densitometry. All instruments were calibrated daily and before use at each office site, using the manufacturer's internal standard. Quality assurance of BMD tests was maintained by staff at Synarc in Portland, OR, USA.(15) T scores, defined as the number of SDs from the young adult mean, were calculated from manufacturers' white reference databases, and the relative risk (RR) for fracture per SD below the young adult normal mean was calculated. The World Health Organization (WHO) definitions of normal (T score > −1.0), osteopenic (T score between −1.0 and −2.49), and osteoporotic (T score ≤ −2.5) BMD were used in this analysis.(16) In addition, BMD and risk were expressed as SDs from the mean age of the NORA population.
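The T score definition and the WHO cut points described above can be illustrated with a minimal sketch. The function names, the reference mean, and the reference SD used below are hypothetical and chosen only for illustration; they are not values from any manufacturer's database.

```python
def t_score(bmd, young_mean, young_sd):
    """Number of SDs that a BMD value lies from the young adult reference mean."""
    if young_sd <= 0:
        raise ValueError("young_sd must be positive")
    return (bmd - young_mean) / young_sd


def who_category(t):
    """Classify a T score with the cut points given in the text:
    normal (> -1.0), osteopenic (-1.0 to -2.49), osteoporotic (<= -2.5)."""
    if t > -1.0:
        return "normal"
    if t > -2.5:
        return "osteopenic"
    return "osteoporotic"


if __name__ == "__main__":
    # Hypothetical reference values (g/cm^2), for illustration only.
    young_mean, young_sd = 0.50, 0.10
    for bmd in (0.48, 0.33, 0.22):
        t = t_score(bmd, young_mean, young_sd)
        print(f"BMD={bmd:.2f}  T={t:+.2f}  {who_category(t)}")
```

Because the reference mean and SD are arguments, the same BMD value yields different T scores under different reference databases, which is why each device's T scores depend on its own normative data.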
At enrollment, each participant completed a mailed set of questionnaires, including general demographic information, questions about health status, and questions about risk factors for osteoporosis, including prior fractures of the spine, rib, wrist, or hip after the age of 45 years. Approximately 12 months later, participants completed follow-up questionnaires, including information on the occurrence and site(s) of new fractures since baseline.
Incident fractures were defined as having occurred between the date of BMD measurement and the date of the follow-up survey. If the date of fracture was not reported, the site of the fracture was compared with information in the baseline questionnaire. If the site matched, the fracture was considered to be preexisting and was not used in this analysis. Clinical fractures occurring at the rib, spine, wrist, forearm, and hip were included in this analysis as “osteoporotic” fractures. Events immediately preceding the fracture were not captured in the follow-up questionnaire, so the amount of trauma associated with the event is unknown. Participants who reported four or more incident fractures at follow-up were excluded from the analysis, because of the probability that this reflected major trauma rather than fragility fracture.
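The fracture classification rules above can be sketched as follows. This is an illustration under stated assumptions, not the study's actual data-processing code: the record layout, the function names, the inclusive date boundaries, and the assumption that site labels on the baseline and follow-up questionnaires are directly comparable are all hypothetical.

```python
from datetime import date

# Sites counted as "osteoporotic" fractures in the text.
OSTEOPOROTIC_SITES = {"rib", "spine", "wrist", "forearm", "hip"}


def is_incident(fracture, bmd_date, followup_date, baseline_sites):
    """Return True if a self-reported fracture counts as incident.

    fracture: dict with "site" (str) and "date" (datetime.date or None).
    baseline_sites: set of sites reported as fractured at baseline.
    """
    if fracture["site"] not in OSTEOPOROTIC_SITES:
        return False
    if fracture["date"] is not None:
        # Assumption: both boundary dates are treated as inclusive.
        return bmd_date <= fracture["date"] <= followup_date
    # No date reported: a site matching a baseline fracture is treated as preexisting.
    return fracture["site"] not in baseline_sites


def incident_fractures(reports, bmd_date, followup_date, baseline_sites):
    """Return the incident fractures for one participant, or None if she is
    excluded because she reported four or more incident fractures."""
    kept = [f for f in reports
            if is_incident(f, bmd_date, followup_date, baseline_sites)]
    if len(kept) >= 4:
        return None
    return kept


if __name__ == "__main__":
    # Hypothetical participant, for illustration only.
    reports = [
        {"site": "hip", "date": date(2000, 6, 1)},
        {"site": "wrist", "date": None},  # matches a baseline site
    ]
    kept = incident_fractures(reports, date(2000, 1, 15), date(2001, 1, 20), {"wrist"})
    print([f["site"] for f in kept])
```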
All analyses were conducted using SAS version 6.12 software (SAS Institute, Inc., Cary, NC, USA). To minimize heterogeneity and focus on differences in technology, the analysis for this report includes only white women, who comprise 89.7% of the NORA cohort.(13) Analyses were performed with and without the inclusion of women with prior fracture. Fracture rates for each measurement site/device were calculated by summing new fractures reported in follow-up and person-years of follow-up. Adjustment for important confounders, including age, body mass index (BMI), and prior fracture, was done using Cox proportional hazards, which models the hazard associated with covariates in comparison with baseline hazard. The assumption of proportionality of hazards in Cox modeling was met. To develop age-adjusted receiver operating characteristic (ROC) curves for each measurement site/device, logistic regression models were fit by maximum likelihood to incidence of fracture (yes/no) using T score as a continuous variable. In these models, the odds ratio (OR) is the exponentiation of the slope coefficient for T score and represents the change in the odds of fracture per unit decrease in T score (1 SD change in BMD), assuming the logarithm of the risk per SD change to be constant across BMD values. The Hosmer-Lemeshow statistic was used to test goodness of fit of the logistic regression. Sensitivity and specificity of BMD measurements for each site/device were calculated with the logistic regression models and the values were used to plot ROC curves (sensitivity vs. 1 − specificity) and to calculate areas under the curves (AUC).
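The two quantities derived from the logistic models, the OR per unit decrease in T score and the area under the ROC curve, can be sketched in a few lines. The analyses in this report were run in SAS; the Python below is only an illustration. The function names are hypothetical, the sign convention (a negative slope when higher T scores mean lower risk) is an assumption, and the trapezoidal rule is one common way to compute the AUC rather than a description of the software's internal method.

```python
import math


def odds_ratio_per_sd_decrease(beta_t):
    """OR per 1-unit decrease in T score, given the fitted slope for T score.
    Assumption: the model is logit(p) = a + beta_t * T, so a decrease of one
    unit multiplies the odds by exp(-beta_t)."""
    return math.exp(-beta_t)


def roc_points(risk_scores, outcomes):
    """Return (1 - specificity, sensitivity) pairs over all thresholds.
    A woman is test-positive when her predicted risk is >= the threshold."""
    pos = sum(outcomes)
    neg = len(outcomes) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one fracture and one non-fracture")
    points = [(0.0, 0.0)]
    for thr in sorted(set(risk_scores), reverse=True):
        tp = sum(1 for s, y in zip(risk_scores, outcomes) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(risk_scores, outcomes) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical predicted risks and outcomes (1 = fracture), illustration only.
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    fractured = [1, 1, 0, 1, 0, 0]
    print(auc_trapezoid(roc_points(scores, fractured)))
    print(odds_ratio_per_sd_decrease(-0.5))
```

An AUC of 0.5 corresponds to a test with no discrimination and 1.0 to perfect separation of women who did and did not fracture.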
RESULTS
Complete information, including baseline questionnaires, BMD, and follow-up information on fracture incidence, was available for 149,524 white women, who comprise the sample for this report. Of these, 70.8% (n = 105,814) were between 50 and 70 years of age and 11% (n = 16,113) had reported at baseline a history of fracture of the spine, hip, rib, or wrist after the age of 45 years. Baseline characteristics and the number tested with each device are shown in Table 1. More than one-half of the women were measured with heel SXA and another one-third were measured with forearm pDXA.
Table 1. Characteristics of White Participants by Site of BMD Measurement
Overall, 45.3% had low BMD, including 38.9% with T scores between −1.0 and −2.49 and 6.4% with T scores ≤ −2.5. As shown in Table 2, the frequency of low BMD varied by skeletal site and measurement device: heel tests yielded the smallest proportions of osteoporosis (4.1% with SXA and 2.9% with US); 9.4% of women who had BMD measured at the forearm and 12.0% of women with finger measurements had T scores ≤ −2.5.
Table 2. Distributions of BMD Among White Women by Site/Device
During the follow-up interval, 2259 women reported 2340 new clinical fractures, including 1012 wrist/forearm fractures, 658 rib fractures, 393 hip fractures, and 277 spine fractures (Table 3). Based on Mantel-Haenszel χ2 testing, there were no significant differences in fracture incidence among devices.
Table 3. Number (%) of Fractures Reported by White Participants During Follow-up According to Site of Fracture and Site/Device of Bone Density Measurement
Fracture rates at the hip, wrist/forearm, rib, and spine per 100 person-years of follow-up, by T score group, are shown in Fig. 1 for all fractures combined. Women with more than one fracture were counted only once. For each device, fracture rates were highest among women with T scores ≤ −2.5, intermediate among women with T scores between −1.0 and −2.49, and lowest among women with normal BMD. As shown in Fig. 2, results were similar for fractures of the hip among participants who had been measured at the heel with SXA or at the forearm. There were too few hip fractures in the heel US and finger pDXA groups for meaningful analysis, because of the smaller numbers tested with these devices.
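The rate calculation behind these figures (women with a new fracture divided by person-years of follow-up, scaled to 100 person-years, within each T score group) can be sketched as follows. The record layout, function name, and example numbers are hypothetical and are not NORA data.

```python
from collections import defaultdict


def fracture_rates(records):
    """Fracture rate per 100 person-years for each group.

    records: iterable of (group, followup_years, had_fracture) tuples, one per
    woman, so a woman with more than one fracture is counted only once.
    """
    events = defaultdict(int)
    years = defaultdict(float)
    for group, followup_years, had_fracture in records:
        years[group] += followup_years
        if had_fracture:
            events[group] += 1
    return {g: 100.0 * events[g] / years[g] for g in years if years[g] > 0}


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    records = [
        ("normal", 1.0, False),
        ("normal", 1.0, False),
        ("normal", 1.0, True),
        ("normal", 1.0, False),
        ("osteoporotic", 0.5, True),
        ("osteoporotic", 1.5, False),
    ]
    for group, rate in fracture_rates(records).items():
        print(f"{group}: {rate:.1f} per 100 person-years")
```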
Table 4 shows the age- and fracture history-adjusted association of BMD T score categories with risk of fracture. After adjusting for age and fracture history, low T scores were associated with increased risk of all fractures for each site/device. Although fracture risk increased exponentially with age, the effect of age differed by fracture site. The oldest women were not at increased risk of wrist fracture, relative to women aged 50–59 years, but were at markedly increased risk of hip fracture (RR = 7.91; 95% CI, 5.25–11.92). A history of fracture was independently associated with an increased risk of new fracture, with risk ratios ranging from 2.17 (95% CI, 1.87–2.52) for wrist/forearm fracture to 3.59 (95% CI, 3.02–4.25) for rib fracture.
Table 4. Relative Hazard (Risk Ratio) for Fracture Among White Women, Adjusted for Age and History of Fracture
Additional analyses were performed to calculate the age- and fracture history-adjusted association of BMD per SD decrease with risk of fracture using both the mean BMD and the SD of young adult normal values and the mean BMD and SD of the NORA population. The RR was comparable whether calculated from the manufacturer's young adult reference database or from the mean BMD of the NORA cohort (Table 5).
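One way to see why the two reference choices can give similar per-SD risks is to note that, in a model where the logarithm of risk is linear in BMD, shifting the reference mean does not change the risk per unit of BMD; only the size of the SD used as the unit matters. The sketch below illustrates that property. It is not the calculation performed for Table 5, and the function name and numeric values are hypothetical.

```python
import math


def convert_rr_per_sd(rr_per_sd_a, sd_a, sd_b):
    """Re-express a per-SD relative risk from reference SD `sd_a` to `sd_b`.

    Assumption: log risk is linear in BMD, so RR per SD = exp(slope * SD),
    where slope is the log risk per unit decrease in BMD.
    """
    if rr_per_sd_a <= 0 or sd_a <= 0 or sd_b <= 0:
        raise ValueError("relative risk and SDs must be positive")
    slope_per_unit = math.log(rr_per_sd_a) / sd_a
    return math.exp(slope_per_unit * sd_b)


if __name__ == "__main__":
    # Hypothetical values: RR of 1.5 per young adult SD of 0.10 g/cm^2,
    # re-expressed per a study population SD of 0.12 g/cm^2.
    print(convert_rr_per_sd(1.5, 0.10, 0.12))
```

When the two reference SDs are close in size, the converted RR stays close to the original, consistent with the similar estimates reported for the two approaches.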
Table 5. Relative Hazard (Risk Ratio) Per SD Decrease of BMD for Fracture Among White Women, Adjusted for Age and History of Fracture
The age-adjusted sensitivity and specificity of T score as a predictor of all incident fractures are shown in Fig. 3A and as a predictor of hip fractures in Fig. 3B. Age-adjusted AUCs for overall fracture prediction were comparable across methods/sites, ranging from 0.663 (heel US) to 0.721 (finger). AUC for hip fracture prediction was 0.773 using heel SXA and 0.749 for forearm pDXA. In analyses excluding women with prior fractures, the AUCs for all fractures and for hip fractures were only slightly reduced, to 0.663–0.699 and 0.702–0.740, respectively.
DISCUSSION
In this large sample of women unselected for osteoporosis risk, osteoporotic peripheral BMD measurements (T score ≤ −2.5) predicted a greater than twofold increased risk of new osteoporotic fracture within 1 year of the measurement. The association was graded, with a smaller but still significantly increased risk in women with T scores between −1.0 and −2.49. This association was observed for all BMD measurement sites/devices and for all osteoporotic fractures that were investigated in this study.
Although different groups of women were tested with each device, all peripheral sites measured showed similar predictive ability for fracture after correction for age differences in the subgroups, as shown by the ROC curves. The observed AUCs for hip fracture (0.749 and 0.773) did not differ significantly from those reported by Cummings et al.(10) (0.75 to 0.78) for prediction of hip fracture in older women using measurements at hip sites. Generally, they are also equivalent to AUCs reported in other studies of fracture using hip and spine measurements with DXA(17) and to values observed for identifying women with prior fractures.(18–20) The age-adjusted sensitivity and specificity of T score as a predictor of all incident fractures were comparable with AUCs in hip fracture prediction for each method/site, ranging from 0.727 to 0.773. However, neither BMD nor other diagnostic tests can predict which individuals will fracture, in part because the absolute probability of fracture is low (1.5% in the NORA population) and in part because other factors, both skeletal and nonskeletal, affect fracture risk.
BMD data were presented as T scores, which were calculated separately for each device using the manufacturer's normative database for that device. The use of a T score of ≤ −2.5, now widely used as a diagnostic definition of osteoporosis in individual patients, derives from the WHO's prevalence analysis for population health policy strategies.(16) Earlier epidemiological studies had observed the lifetime risk of hip fracture for postmenopausal white women to be 16%. Using data derived from cross-sectional examination of hip and forearm BMD in two separate populations of postmenopausal white women, the threshold of −2.5 SD was chosen because the prevalence of BMD below this value at the femoral neck was also 16%.(16) Hence, the diagnostic threshold was selected as the point at which prevalence matched fracture risk. It is important to note that the women whose BMD values were used by the WHO to derive the T score distribution and diagnostic threshold were not followed longitudinally to validate directly the fracture risk associated with the observed T scores.
Although many prospective fracture studies(10,11,21–23) calculate risk from mean BMD and the SD of the population being studied, we chose to use each manufacturer's T score conversions. Either means of expressing risk has pragmatic and historical application. Pragmatically, the T score has been highlighted as a user-friendly value reported on bone densitometry computer printouts. The T score is widely used in the clinical community because of its linkage to the National Osteoporosis Foundation (NOF) physician guidelines and is beginning to be used to evaluate fracture risk in the context of the WHO criteria for osteoporosis and osteopenia.(24) Available data suggest that expressing risk in the population in comparison with the young normal reference population database is valid.(24) However, historically, most prospective studies of fracture risk have expressed RR from the mean BMD for the average age of the population studied, and we have provided this means of expressing risk as well. As shown in Table 5, the two methods of expressing risk give similar results in the NORA population, and both produce good risk prediction.
Applying WHO criteria to peripheral measures is somewhat controversial because peripheral skeletal sites have differing proportions of cortical and trabecular bone and change differently over time.(25) Moreover, the T score calculation depends on both the mean and the SD of the young normal reference population, and as a result, T scores differ for each site because of differences in the reference populations used to establish norms for various devices. Therefore, a T score of −2.5 at one site is unlikely to be equivalent to a T score of −2.5 at another site.(25, 26) T scores from peripheral sites may be higher, on average, than T scores from central sites, so that low T scores at peripheral sites could identify women at relatively higher risk of fracture than equivalent T scores at central sites.(25) Nevertheless, the present observations confirm the predictive value of T scores from several skeletal sites and measurement devices, although these T scores were derived from different normative populations.
Use of peripheral technologies for fracture risk assessment is consistent with recommendations of the International Osteoporosis Foundation to use hip measurements for WHO diagnosis (prevalence) and peripheral technologies for fracture risk assessment.(24) The compilation of a consistent young normal reference population database will mitigate T score discrepancies, so that a standardized T score value can be used for risk prediction by peripheral devices.(26, 27) This standardization project holds the best promise of minimizing T score discrepancies that are currently seen among available technologies and different manufacturers.(28–30)
This study has several limitations. Women who responded to the follow-up survey may differ from nonrespondents; importantly, women who have fractured may have been more (or less) likely to respond to follow-up than women who did not fracture. Fracture information was collected entirely by self-report. However, others have found an 80–90% correlation between self-report and medical record evidence of fracture.(31–33) Another limitation is the lack of spine X-rays; two-thirds of spine fractures are subclinical.(34, 35) The estimate of fracture rate by heel US is less precise than that for the other devices, in part because of larger SDs but also because of the relatively small numbers of women tested with this device and, consequently, the small number of fractures reported.
In summary, although no single BMD measurement, peripheral or central, identifies all women who have low BMD, we found comparable predictive values of BMD at peripheral skeletal sites by four different devices. Low BMD measured by peripheral technologies, regardless of the site at which it is identified, was associated with increased risk of fracture within 1 year, whether that risk is expressed as a function of manufacturer's normative database T scores or age-adjusted comparisons based on the mean age of the NORA population. Although no test can identify correctly each individual who will, within a finite interval of follow-up, sustain a fracture, clinicians measuring BMD in postmenopausal women can feel confident that results obtained with peripheral technologies do predict fractures.
We acknowledge all the physicians and women who participated in the NORA and the significant contributions of the many fine people at Merck & Co., Inc.; Parexel International; and ABT Associates, who were involved in the extraordinary implementation and data collection efforts undertaken on behalf of the NORA. We also acknowledge the skilled programming and administrative support provided for this study by Suna Barlas, Ph.D., Susan Brenneman, Ph.D., David Manfredonia, David Furman, Ph.D., Carissa Mazurick, and Jessica Panish. NORA was funded and managed by Merck & Co., Inc., in collaboration with the International Society for Clinical Densitometry.