The authors state that they have no conflicts of interest
Article first published online: 16 FEB 2009
Copyright © 2009 ASBMR
Journal of Bone and Mineral Research
Volume 24, Issue 7, pages 1319–1325, July 2009
How to Cite
Moayyeri, A., Kaptoge, S., Dalzell, N., Bingham, S., Luben, R. N., Wareham, N. J., Reeve, J. and Khaw, K. T. (2009), Is QUS or DXA Better for Predicting the 10-Year Absolute Risk of Fracture?. J Bone Miner Res, 24: 1319–1325. doi: 10.1359/jbmr.090212
- Issue published online: 4 DEC 2009
- Manuscript Accepted: 11 FEB 2009
- Manuscript Revised: 23 NOV 2008
- Manuscript Received: 8 OCT 2008
- absolute risk;
- bone fractures;
- quantitative ultrasound
Although quantitative ultrasound (QUS) is known to be correlated with BMD and bone structure, its long-term predictive power for fractures in comparison with DXA is unclear. We examined this in a sample of men and women in the European Prospective Investigation into Cancer (EPIC)-Norfolk who had both heel QUS and hip DXA between 1995 and 1997. Of 1455 participants (703 men) 65–76 yr of age at baseline, 79 developed a fracture over 10.3 ± 1.4 yr of follow-up. In a sex-stratified Cox proportional-hazard model including age, height, body mass index, prior fracture, smoking, alcohol intake, and total hip BMD, a 1 SD decrease in BMD was associated with a hazard ratio (HR) for fracture of 2.26 (95% CI: 1.74–2.95). In the multivariable model with heel broadband ultrasound attenuation (BUA) in place of BMD, HR for a 1 SD decrease in BUA was 2.04 (95% CI: 1.55–2.69). Global measures of model fit showed relative superiority of the BMD model, whereas the area under the receiver operating characteristic (ROC) curve was slightly higher for the BUA model. Using both Cox models with BMD and BUA measures, we calculated exact 10-yr absolute risk of fracture for all participants and categorized them in groups of <5%, 5% to <15%, and ≥15%. Comparison of groupings based on the two models showed a total reclassification of 28.8% of participants, with the greatest reclassification (∼40%) among the intermediate- and high-risk groups. This study shows that the power of QUS for prediction of fractures among the elderly is at least comparable to that of DXA. Given the feasibility and lower cost of ultrasound measurement in primary care, further studies to develop and validate models for prediction of 10-yr risk of fracture using clinical risk factors and QUS are recommended.
Many trials have been conducted in the field of osteoporosis over the last decade, and several treatments have proven efficacy for reduction of fracture risk. Today, a major challenge is to better identify individuals at high risk of fracture who would benefit from intervention. To identify patients at high risk of fractures, DXA is widely accepted as the reference method for measuring BMD.(1–3) At the population level, a decrease in BMD is associated with a significant increase in fracture risk. However, at the individual level, BMD assessment is quite sensitive but not specific for prediction of fractures. This is explained partly by the fact that DXA measures BMD only, a surrogate of bone strength that is also influenced by bone architecture and hip geometry, and partly by the fact that the occurrence of fracture depends on other clinical risk factors.(4)
Quantitative ultrasound (QUS) of the calcaneus, developed in the past two decades, is expected to provide information on bone structure and density.(5,6) Previous studies suggest that QUS parameters are influenced by the mechanical properties of bone, which in turn are determined by the amount of bone, the bone's material properties (e.g., bone mineralization and elasticity), and its structural properties (e.g., bone architecture).(7–10) The pattern of absorption of a range of wavelengths of sound is called the broadband ultrasound attenuation (BUA; expressed in dB/MHz) and transmission of sound through bone can be quantitatively assessed by the speed of sound (SOS; expressed in m/s). Recent research has shown that ultrasonic assessments of the calcaneus are significantly discriminative and predictive of osteoporotic fractures independently of hip BMD.(11–17) In fact, major prospective studies have shown that QUS measurements are predictors of hip fracture with a similar performance to hip DXA measurements.(18–23)
The significant growth in use of QUS has been based on the affordability of the technology and the potential of sound waves to probe multiple bone properties such as BMD, microarchitecture, and elasticity. The cost of the devices is lower compared with DXA scanners and hence, QUS might be more appropriate compared with DXA assessment for use in primary care. This, however, depends on the performance of QUS for prediction of osteoporotic fractures in the long term. Several studies have tried to compare the predictive power of QUS and DXA for various types of fractures, but they have used different methods of comparison and the overall results are still inconclusive.(10,24)
Currently, the use of absolute fracture risk estimation in the field of osteoporosis research and clinical practice guidelines has come to the forefront because that is what matters to the patients and the health care providers. The World Health Organization (WHO) Scientific Group for Assessment of Osteoporosis at the Primary Health Care Level has developed a clinical tool (based on DXA and clinical risk factor) for estimation of 10-yr absolute risk of fracture in different populations.(25,26) Similar methods can now be applied for comparison of different radiological techniques or clinical risk factors to predict long-term absolute risk of fracture. We aimed in this study to compare models based on clinical risk factors and DXA with those using QUS measures obtained simultaneously for estimation of 10-yr absolute risk of fracture in elderly men and women.
MATERIALS AND METHODS
We recruited 1511 men and women ≥65 yr of age from a prospective population-based cohort study, the European Prospective Investigation into Cancer (EPIC)-Norfolk. Full details of participant recruitment and study procedures have been published elsewhere.(27) Briefly, the original cohort comprised about 25,000 men and women 40–79 yr of age, recruited between 1993 and 1997 from general practice age-sex registers in the Norfolk region, United Kingdom. The study was approved by the Norwich District Health Authority ethics committee, and all participants gave signed informed consent at the beginning of the study. All participants are being followed up for different health endpoints, including fractures, to the present.
A subset of participants in the original EPIC-Norfolk study (≥65 yr of age and without DXA-confirmed diagnosis of osteoporosis) was invited to our bone densitometry study ∼18 mo after the baseline visit. An information sheet detailing the purpose of the study was sent to eligible subjects. About 90% of invited participants consented to participate in this study and attended the examinations.(28) Over the period of May 1995 to January 1998, 1511 participants underwent hip BMD measurements using a Hologic 1000 W bone densitometer (Hologic, Bedford, MA, USA). BMD (in g/cm2) of the total hip region was used for this study. All measurements were done by the same operator, and an experienced independent operator reviewed all scans to ensure consistency of positioning of the hip regions.(29) On the same day, 1458 of these participants also had a QUS assessment of the heel with a CUBA sonometer (McCue Ultrasonics, Winchester, UK). The means of at least two measures of BUA and SOS (on the left or right calcaneus) were used for analysis in this study. Details of the QUS procedures and their predictive power for fractures in the original cohort have been published previously.(30)
Demographic, anthropometric, and lifestyle variables were collected at the time of bone measurements. Height was measured to the nearest millimeter using a free-standing stadiometer (CMS Weighing Equipment, London, UK). Weight was measured to the nearest 100 g using Salter digital scales (Salter Industrial Measurement, West Bromwich, UK). Height and weight were measured in light clothing without shoes. Body mass index (BMI) was calculated as weight in kilograms divided by the square of the height in meters. Trained nosologists obtained vital status of the entire cohort based on death certificates of the United Kingdom Office of National Statistics. All hospital contacts for participants of the EPIC-Norfolk were identified through linkage of the unique National Health Service (NHS) number of participants with the East Norfolk Health Authority (ENCORE) database. This database identifies hospital contacts throughout England and Wales for all Norfolk residents. International Classification of Diseases (ICD) 9 and 10 diagnostic codes were used to ascertain osteoporotic fractures by site (excluding fractures of skull, face, metacarpals, and phalanges) occurring in the cohort up to the end of March 2007, an average of 10.3 ± 1.4 (SD) yr (range, 8.2–13.1 yr).
Multivariable Cox proportional-hazard regression models were used to model the association between incident fractures and potential risk factors. Two separate models, one including total hip BMD and the other including BUA of the heel, were constructed with age, past history of fracture, BMI, smoking status, and alcohol intake as the covariates in both models. Both models were stratified for sex. To compare the performance of the models, several global measures of model fit were used. These measures included Bayesian information criterion (BIC),(31) Akaike's information criterion (AIC),(32) deviance information criterion (DIC),(33) likelihood ratio χ2 statistic, Nagelkerke's and Cox-Snell R-squared,(34) and D-statistic.(35) Lower values for the three information criteria and higher values for the other measures indicate better fit of the proportional-hazard models. Harrell's C-index (which is equivalent to the area under the receiver operating characteristic [ROC] curve for survival data) was used as the measure of discrimination.(36) Calibration, which refers to the ability of a model to match predicted and observed outcome rates across the entire spread of the data, was compared between the two models using the Hosmer-Lemeshow χ2 statistic.(37) This measure compares observed and predicted outcomes over deciles of risk, and higher values for its p value indicate better calibration of the model (i.e., a less significant difference between expected and observed rates).
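As a rough illustration of two of the comparison measures named above, the following Python sketch computes AIC and BIC from a model log-likelihood and Harrell's C-index from predicted risks. The function names and the toy data are hypothetical and are not taken from the study. The sample size passed to BIC is left as a parameter, because conventions differ between the number of subjects and the number of events for survival models.

```python
import math


def aic(log_lik, n_params):
    """Akaike's information criterion; lower values indicate better fit."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n):
    """Bayesian information criterion; lower values indicate better fit."""
    return -2.0 * log_lik + n_params * math.log(n)


def harrell_c(times, events, risk_scores):
    """Harrell's C-index for right-censored follow-up data.

    A pair is usable when the subject with the shorter follow-up time
    had an event. It is concordant when that subject also has the
    higher predicted risk; ties in predicted risk count as one half.
    """
    concordant = 0.0
    usable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data (hypothetical): follow-up in years, event flag, predicted risk.
toy_times = [2, 4, 6, 8, 10]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.3, 0.5, 0.4, 0.1]
toy_c = harrell_c(toy_times, toy_events, toy_risks)  # 6 of 8 usable pairs concordant
```

A C-index of 0.5 corresponds to chance discrimination and 1.0 to perfect ranking of participants by risk.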
For further comparison of the two models, 10-yr absolute risk of fractures for each participant was calculated using the baseline survivor function and the estimated log hazard ratios for the variables in each model. In general, the Cox regression model for the hazard of fracture at time t after baseline given k explanatory variables X1, X2, …, Xk included in the model is of the form:
$$h(t \mid X_1, X_2, \ldots, X_k) = h_0(t)\,\exp(b_1 X_1 + b_2 X_2 + \cdots + b_k X_k)$$
where h0(t) is the baseline hazard at time t and b1, b2, …, bk are the log hazard ratios for the k explanatory variables. h0(t) represents the instantaneous rate of failure expected at time t for a person with zero values of all covariates, and the cumulative baseline hazard H0(t) at time t is obtained by integrating h0(t). The baseline survival at time t, i.e., Pr(T > t), is then given by
$$S_0(t) = \exp[-H_0(t)]$$
and the survival for a person with covariate values X1, X2, …, Xk is obtained as $S_0(t)^{\exp(b_1 X_1 + b_2 X_2 + \cdots + b_k X_k)}$. Hence, our 10-yr risk of fracture was calculated as:
$$\text{10-yr risk} = 1 - S_0(10)^{\exp(b_1 X_1 + b_2 X_2 + \cdots + b_k X_k)}$$
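The risk calculation described above, one minus the baseline 10-yr survival raised to the power exp(b1X1 + … + bkXk), can be sketched in a few lines of Python. The function name, the baseline survival of 0.97, and the coding of the covariate as "number of SDs below the reference value" are assumptions made for illustration only; the hazard ratio of 2.26 per 1 SD decrease in BMD is the figure reported for this study.

```python
import math


def ten_year_risk(baseline_survival_10y, log_hazard_ratios, covariates):
    """Absolute 10-yr risk = 1 - S0(10) ** exp(b1*x1 + ... + bk*xk)."""
    if not 0.0 < baseline_survival_10y <= 1.0:
        raise ValueError("baseline survival must lie in (0, 1]")
    if len(log_hazard_ratios) != len(covariates):
        raise ValueError("one covariate value is needed per coefficient")
    linear_predictor = sum(b * x for b, x in zip(log_hazard_ratios, covariates))
    return 1.0 - baseline_survival_10y ** math.exp(linear_predictor)


# Hypothetical single-covariate example: BMD expressed in SDs below reference.
s0_10 = 0.97               # assumed baseline 10-yr survival, not a study value
b_bmd = math.log(2.26)     # log hazard ratio per 1 SD decrease in BMD

risk_reference = ten_year_risk(s0_10, [b_bmd], [0.0])  # person at the reference value
risk_one_sd_low = ten_year_risk(s0_10, [b_bmd], [1.0]) # person 1 SD below reference
```

Because the hazard ratio acts on the exponent of the survival function, the absolute risk does not scale by exactly 2.26 for each SD; the difference becomes more noticeable as baseline risk rises.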
Each participant was assigned two 10-yr fracture risk estimates, one from the Cox model including hip BMD and one from the model including heel BUA. Participants were categorized into three groups with absolute risks of <5%, 5% to <15%, and ≥15% based on these two models. Unlike other ultrasound devices that report combined measures of BUA and SOS (namely, quantitative ultrasound index [QUI] for Sahara and Stiffness Index [SI] for Achilles devices), there is no combined measure for the CUBA sonometer. Substitution of SOS for BUA resulted in poorer prediction in all models (BIC = 995.3 versus 991.9), and inclusion of SOS with BUA did not result in better prediction (BIC = 992.4 versus 991.9). Hence, models including SOS are not reported further. All database management and statistical analyses were performed using Stata software, version 10.0 (StataCorp, College Station, TX, USA).
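The three-band categorization can be expressed as a simple threshold rule. In the minimal sketch below, the function name and the two example risk values are hypothetical; only the cut points of 5% and 15% come from the text.

```python
def risk_category(ten_year_risk):
    """Map a 10-yr absolute fracture risk (as a proportion) to a risk band."""
    if not 0.0 <= ten_year_risk <= 1.0:
        raise ValueError("risk must be a proportion between 0 and 1")
    if ten_year_risk < 0.05:
        return "low"
    if ten_year_risk < 0.15:
        return "intermediate"
    return "high"


# Hypothetical estimates for one participant from the two models.
band_from_bmd = risk_category(0.031)
band_from_bua = risk_category(0.062)
print(band_from_bmd, band_from_bua)
```

Note that the lower bound of each band is inclusive, so a risk of exactly 5% falls in the intermediate band and a risk of exactly 15% in the high band.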
A total of 1455 participants 65–76 yr of age (703 men; mean age, 69.5 ± 3 yr) were entered into the analysis. Three participants were excluded because of incomplete data. The characteristics of study participants are summarized in Table 1. Bone characteristics were higher on average among men. Mean total hip BMD was 0.944 ± 0.140 g/cm2 among men and 0.767 ± 0.125 g/cm2 among women. Mean BUA of the calcaneus was 88.3 ± 18.2 dB/MHz among men, which was significantly higher than the mean 63.5 ± 15.4 dB/MHz for women. During 15,567 person-years of follow-up, 79 participants (61 women) developed a fracture. The Pearson correlation coefficient was 0.47 for total hip BMD and heel BUA.
Two sex-stratified proportional-hazard models including BMD and BUA are shown in Table 2. Most of the variables entered into the models were significant predictors of fractures. Table 2 shows that the hazard ratio (HR) for a 1 SD decrease in total hip BMD was 2.3 (95% CI: 1.7–3.0) compared with 2.0 (95% CI: 1.6–2.7) for a 1 SD decrease in BUA. Table 3 compares the performance of two models. Global measures of model fit (including different information criteria, likelihood ratio test, R2 estimates, and D-statistic) showed relative superiority of the BMD model, whereas the area under the ROC curve was 0.6% higher for the BUA model. High Hosmer-Lemeshow p values confirmed that both models were adequately calibrated. In general, performances of both models were fairly similar.
For the next stage of the analysis, two new variables containing estimated 10-yr absolute fracture risk using the BMD and BUA models were generated. The estimated fracture risks using the BUA model (median, 4.2%; interquartile range [IQR], 2.0–8.0%) were higher than estimated risks using the BMD model (median, 3.1%; IQR, 1.5–7.1%; Wilcoxon signed-ranks test p value < 0.001). Table 4 compares the classification of participants based on the two models into three risk categories (10-yr risk of <5%, 5% to <15%, and ≥15%). Most of the participants were classified into the same category of risk using each model. However, 419 participants (28.8%) were reclassified when the other model was used. Forty-five of 112 participants (40%) assigned to the high-risk group (10-yr risk of ≥15%) using the BMD model were reclassified to a lower-risk group according to the BUA model. The greatest reclassifications (∼40%) were observed among the groups with intermediate and high risk of fracture. Whereas most of the participants were reclassified to adjacent categories, 10 participants categorized as low risk (<5% risk) using the BMD model were reclassified to high risk (≥15% risk) according to the BUA model, and 6 participants categorized as high risk using the BMD model were reclassified as low risk using the BUA model. The distribution of the participants in different categories based on the two models is shown graphically in Fig. 1.
Because we had followed up most of the participants for >10 yr, we were able to calculate observed 10-yr fracture risk (which is the incidence rate of fracture in the first 10 yr of follow-up). Table 4 and Fig. 1 also report the observed fracture risks for different categories. These numbers show that the estimated fracture risks based on the BUA model were relatively more compatible with the observed risks, particularly in the intermediate- and high-risk groups. For instance, the right side of Fig. 1 shows that none of the participants categorized as high risk based on the BMD model but as low risk using the BUA model experienced a fracture during follow-up. Similarly, 17% of those categorized as intermediate risk based on the BMD model but reclassified as high risk using the BUA model developed a fracture. In general, observed risks were closer to the estimated fracture risks using the BUA model for participants with higher risks of fracture based on any model.
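A reclassification comparison of this kind amounts to cross-tabulating the risk bands from the two models and counting the off-diagonal cells. The sketch below uses five made-up participants, not the study data, and a hypothetical function name; the final two lines only check the percentages quoted in the text against the reported counts.

```python
from collections import Counter


def reclassification(bands_model_a, bands_model_b):
    """Cross-tabulate two band assignments; return (table, fraction moved)."""
    if len(bands_model_a) != len(bands_model_b):
        raise ValueError("both lists must cover the same participants")
    if not bands_model_a:
        raise ValueError("at least one participant is required")
    table = Counter(zip(bands_model_a, bands_model_b))
    total = sum(table.values())
    moved = sum(n for (a, b), n in table.items() if a != b)
    return table, moved / total


# Toy data for five hypothetical participants (not the study data).
bmd_bands = ["low", "low", "intermediate", "high", "high"]
bua_bands = ["low", "intermediate", "intermediate", "intermediate", "high"]
table, fraction_moved = reclassification(bmd_bands, bua_bands)
print(table[("high", "intermediate")], fraction_moved)

# Percentages quoted in the text, recomputed from the reported counts.
overall_pct = 100 * 419 / 1455    # overall reclassification, about 28.8%
high_risk_pct = 100 * 45 / 112    # moved out of the BMD high-risk band, about 40%
```

Keys of the table are (band under the first model, band under the second model), so each cell of a layout like Table 4 can be read off directly.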
To our knowledge, this is the first study that uses calculated 10-yr absolute risks of fracture for comparing QUS with DXA for their ability to predict fractures. Our results indicate that, whereas the conventional statistical methods show a relatively similar performance for both BMD and BUA models, there is a significant difference between the two models regarding categorization of patients to different risk bands. Global measures of model fit showed relative superiority of the model based on clinical risk factors and BMD, whereas the area under the ROC curve was slightly higher for the model with clinical risk factors and BUA. Nevertheless, almost one in three of the participants (28.8%) were reclassified to a different category when 10-yr absolute risk of fractures was considered. Estimated fracture risks based on the BUA model were closer to the observed fracture risk compared with the BMD model, particularly for participants with higher risks of fracture. These findings suggest that QUS and DXA measure somewhat different aspects of bone strength and that QUS measurement has hitherto been underrated for the prediction of long-term risk of fractures among the elderly.
Since 1984, when QUS measures began to be applied in bone research,(38) it has been hypothesized that ultrasound may give information not only about the bone density but also about architecture and elasticity.(6) A growing number of researchers have used QUS to assess bone status for prediction of osteoporotic fracture risk and various studies have found a lower,(11) an equal,(13–15,18,20,23) or a higher(16) prediction value than the one obtained with DXA. Relative risks or HRs have been the most widely used measures of association in prospective studies to compare predictive power of QUS and DXA.(18–23) However, these measures may not be perfect for comparison of these methods because they may only reflect the superiority of one method for estimation of short-term risk of fractures. Five of six major prospective studies in this field have followed their participants for <3 yr,(18–21,23) and the only long-term study showed similar hazard ratios for BUA and femoral neck BMD.(22) Moreover, generalization of relative risks derived from prospective studies that only used QUS measurements might be problematic because we need to consider the effect of clinical risk factors and their potential interactions with these measurements.
It has recently been appreciated that clinical practice should be founded on the estimation of absolute fracture risk in the long term, using a multitude of risk factors. The measurement of a single risk factor can only capture one aspect of the likelihood of the outcome when the disease is multifactorial, and in osteoporosis, for instance, assessment with BMD captures a minority of the fracture risk. The increase in risk with age is ∼7-fold greater than can be explained on the basis of BMD alone.(39) This has been the basis for development of the FRAX tool by the WHO scientific group. This online program for estimation of 10-yr absolute fracture risk for individuals currently considers several clinical risk factors and BMD in the femoral neck.(25,26) FRAX is likely to be a basis for future routine clinical practice in the field of osteoporosis. Whereas the FRAX methodology is the current best choice because it captures all the relevant information and summarizes it to a single sensible measure for clinicians (i.e., 10-yr probability of fractures), other potential risk factors (including clinical, radiological, and biochemical factors) can be added to, or substituted for, the current set of risk factors. Our results suggest that BUA can be considered as a suitable alternative to BMD in such models.
Glüer and Hans(40) have suggested four potential strategies on how to use ultrasound clinically. The first strategy, the estimated BMD approach, suggests use of QUS for estimation of BMD and then use of that BMD estimate for fracture risk assessment. This approach is unsatisfactory because of low coefficients of correlation observed in different studies (including our study) between heel ultrasound and axial DXA as well as the poor predictive power of peripheral DXA.(40) Another strategy, the prescreening approach, uses a threshold for QUS (presumably derived from a cross-sectional study) so that all subjects with a QUS result below this threshold would be referred for DXA assessment. This is particularly problematic given the extent of assumptions for derivation of the threshold and the view of BMD as the gold standard for fracture risk estimation.(40) The third strategy, the composite approach, categorizes subjects as high, intermediate, and low risk, and subjects with intermediate risk would be referred for further assessment (DXA, bone biomarkers, or second independent QUS at a different site). This strategy depends greatly on identification of other diagnostic techniques that add predictive power to QUS. The fourth strategy, the stand-alone approach, therefore seems to be optimal among these approaches. It considers replacement of BMD with a QUS measure for fracture risk prediction.(40) Considering the advances put forward by the FRAX method, and given the results of our study that show a similar performance of the models based on QUS and DXA for risk prediction, similar models can be built using clinical risk factors and QUS measures for estimation of 10-yr absolute fracture risk and application in clinical practice.
It should be noted that ultrasound devices have some technological drawbacks that have precluded their widespread utilization in bone assessment hitherto. An important factor is the precision of the devices. The short-term in vivo precision of BUA varies between 2.0% and 3.5%, depending on the device and the site of measurement. Because a 2–3% precision of calcaneal BUA generates a least significant change that is about six to nine times larger than the average annual loss rate in postmenopausal women,(41) QUS devices cannot be good candidates for monitoring response to therapy. Moreover, there are no criteria for diagnosis of osteoporosis using QUS measurements. It has been shown that the −2.5 SD criterion for osteoporosis cannot be applied to many QUS devices (42) and, because of the technological differences between devices, results cannot be extrapolated from one device to another.(41) However, QUS instruments have some advantages: they are radiation free, portable, and inexpensive.(24) Therefore, given the predictive power of QUS compared with DXA observed in this study, using a stand-alone approach may be the most cost-effective approach for fracture risk assessment.(40) This issue needs further attention of researchers working in this field.
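The link between precision and monitoring can be made concrete with the conventional definition of the least significant change (LSC), which is not spelled out in the text and is assumed here: for two measurements compared at about 95% confidence, LSC = 1.96 × √2 × precision error, or roughly 2.77 times the precision. The function name below is hypothetical.

```python
import math


def least_significant_change(precision_cv_percent, z=1.96):
    """LSC (%) for the difference between two measurements.

    Assumes the conventional formula z * sqrt(2) * CV, where CV is the
    short-term precision error in percent and z = 1.96 gives ~95% confidence.
    """
    if precision_cv_percent < 0:
        raise ValueError("precision error cannot be negative")
    return z * math.sqrt(2.0) * precision_cv_percent


lsc_at_2_percent = least_significant_change(2.0)  # roughly 5.5%
lsc_at_3_percent = least_significant_change(3.0)  # roughly 8.3%
```

Under this assumption, a change in BUA would need to exceed about 5.5% to 8.3% before it could be distinguished from measurement noise, which is why annual changes far smaller than this cannot be tracked reliably in an individual.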
This study has some limitations. The most important one is the choice of thresholds for categorization of participants. We acknowledge that these thresholds must ideally come from population-specific cost-utility or cost-effectiveness studies that combine absolute risk measures, age structure of the population, cost and efficacy of the therapies, and the value (or utility) of fractures in the community. Currently, however, there is no such study using absolute risk estimates in the United Kingdom (as for other parts of the world), and we had to rely on arbitrary thresholds. We considered the distribution of incident fracture cases in our study population and the estimated fracture risks using both models in this study. Given the low incidence of fractures in our population, we chose to consider ∼10% of participants as high risk and ∼30% as intermediate risk. This translated to cut points of 5% and 15%. If we were to use previously suggested thresholds (such as the thresholds of 10% and 20% for risk categorization suggested by Siminoski et al.),(43) we would have only 4% of participants as high risk and ∼11% as intermediate risk based on both models.
The other potential limitation of this study is the representativeness of the study population. Although it was population based, there is a potential for healthy participant recruitment into the EPIC-Norfolk study as well as this particular analysis.(29) Healthy participants are more likely to complete food diaries and questionnaires and consent to undergo several diagnostic procedures. The incidence rate of fractures in our participants was very low according to UK norms (∼5 per 1000 person-years). However, it should be noted that previous studies from East Anglia have shown that the rate of fracture in this region is considerably lower compared with other parts of the United Kingdom.(44) Given the follow-up procedure in the EPIC-Norfolk study, only fractures that needed admission to hospital or were recorded in general practices were considered for this study. This may have led to an underestimation of fracture rate in our study compared with other populations, although only minor fractures (e.g., of digits or ribs) are thought not to attract clinical attention. Nevertheless, this is not likely to have confounded the comparison of the results for models based on BUA and BMD unless there was an independent interaction with the composition of the sample. In any case, the results of this study need validation in other settings before generalization to other populations.
In conclusion, we estimated 10-yr absolute risk of fracture for comparison of models based on BMD and BUA as fracture predictors. We found that, whereas the conventional statistical methods showed a similar performance for both models, almost one in three participants were reclassified to a different risk band when the other model was used. Although individuals were categorized to different risk bands, both models classified a similar fraction of participants to each risk band. Interestingly, estimated fracture risks based on the BUA model were closer to the observed fracture risks. These results suggest that QUS has at least a similar performance compared with DXA in prediction of long-term fracture risk among elderly men and women. Given the feasibility and lower cost of ultrasound measurement in primary care, further studies to develop and validate models for prediction of 10-yr risk of fracture using clinical risk factors and QUS are recommended.
As part of EPIC-Norfolk, this study was supported by program and project grants from the Medical Research Council (G9321536), EU FP5 (ADOQ: QLRT-2001-02363), the Ministry of Agriculture, Fisheries, and Food (AN0523), and the Food Standards Agency (N05046). Further support was received from Cancer Research UK, the Stroke Association, and Research Into Ageing. The funding sources had no role in the study design, conduct, or analysis or in the decision to submit the manuscript for publication.
- 2. WHO 1994 Assessment of Fracture Risk and its Application to Screening for Postmenopausal Osteoporosis. World Health Organization, Geneva, Switzerland.
- 19. 2006 Prediction of hip fracture risk by quantitative ultrasound in more than 7000 Swiss women ≥70 years of age: Comparison of three technologically different bone ultrasound devices in the SEMOF study. J Bone Miner Res 21: 1457–1463.
- 25. On behalf of the World Health Organization Scientific Group 2007 Assessment of Osteoporosis at the Primary Health-Care Level. Technical Report. World Health Organization Collaborating Center for Metabolic Bone Disease, University of Sheffield, Sheffield, UK.
- 37. 1999 Applied Survival Analysis. Wiley, New York, NY, USA.