^{†} The authors have no conflict of interest.

# Does the Combination of Two BMD Measurements Improve Fracture Discrimination?^{†}

Version of Record online: 1 NOV 2003

DOI: 10.1359/jbmr.2003.18.11.1955

Copyright © 2003 ASBMR

#### How to Cite

Blake, G. M., Patel, R., Knapp, K. M. and Fogelman, I. (2003), Does the Combination of Two BMD Measurements Improve Fracture Discrimination? J Bone Miner Res, 18: 1955–1963. doi: 10.1359/jbmr.2003.18.11.1955

#### Publication History

- Issue online: 2 DEC 2009
- Version of Record online: 1 NOV 2003
- Manuscript Accepted: 2 JUL 2003
- Manuscript Revised: 8 MAY 2003
- Manuscript Received: 18 FEB 2003

### Keywords:

- bone densitometry;
- fracture risk;
- osteoporosis

### Abstract

**Combining information from different types of BMD measurement should improve the evaluation of patients' risk of fracture. This study used a bivariate gaussian model to examine the effect of combining two different BMD measurements. The results show that, in practice, there is little benefit unless the measurements are completely unrelated.**

**Introduction:** Intuitively, the combination of information from two or more different types of bone densitometry investigation should improve our ability to identify patients at high risk of fracture. However, the best way to combine measurements and the resulting gain in fracture discrimination are not known.

**Materials and Methods:** In this study, we used a bivariate gaussian model to investigate the effect of combining two different types of bone densitometry measurements. The measurements had individual relative risk values RR_{1} and RR_{2} and a correlation coefficient *r* between their Z-scores. Different approaches to the combination of the two measurements were compared by calculating the area under the curve (AUC) for the receiver operating characteristic (ROC) curve, which was obtained by plotting the percentage of fracture patients against the percentage of the whole population with a Z-score below some chosen threshold. ROC curves were calculated for three cases: (1) one type of measurement only; (2) two different types of measurements combined using their mean Z-score weighted according to the theoretical optimum weighting factors predicted by the bivariate gaussian model; and (3) two different types of measurements combined using the conventional World Health Organization (WHO) approach, where one or other measurement is below a set threshold. The theoretical model was tested using measurements of speed of sound (SOS) in the radius, phalanx, and metatarsal in patients with vertebral and Colles' fractures.

**Results:** Results were calculated for RR values of 1.5, 2.0, and 2.5 and *r* = 0, 0.5, and 0.7. Although a significant improvement in fracture discrimination was obtained when *r* = 0 and RR_{1} = RR_{2}, the improvements obtained when *r* ≥ 0.5 or RR_{1} ≠ RR_{2} were relatively modest. Slightly better fracture discrimination was obtained using the weighted mean Z-score approach compared with the WHO approach, although the differences were small. The results of the in vivo study in Colles' and vertebral fracture patients showed close agreement with the predictions of the bivariate gaussian model.

**Conclusion:** From a theoretical point of view, there is unlikely to be any practical benefit from combining information from different types of bone densitometry measurements unless they are completely unrelated.

### INTRODUCTION

Growing awareness of the impact of osteoporosis on the elderly,^{(1)} the consequent costs of healthcare,^{(2)} and the development of new treatments to prevent fractures^{(3–7)} have all contributed to a rapid growth in the demand for bone densitometry services. Today, scans to measure bone mineral density (BMD) have an essential role in evaluating patients at risk of osteoporosis.^{(8–10)} In 1994, a World Health Organization (WHO) report^{(11)} recommended that osteoporosis should be defined by expressing BMD measurements as T-scores. T-scores are calculated by taking the difference between a patient's measured BMD and the mean BMD of healthy young adults matched for gender and ethnic group and dividing by the young adult SD. The WHO report defined osteoporosis as a T-score ≤ −2.5 measured at the spine, hip, or forearm. BMD measurements may also be interpreted using Z-scores.^{(12)} Z-scores are calculated by taking the difference between the measured BMD and the mean BMD for healthy subjects matched for age, gender, and ethnic group, and dividing by the respective SD. Although they cannot be used to diagnose osteoporosis, Z-scores are useful because they express a patient's skeletal status relative to their peers. A patient's T- and Z-scores are related by the equation:

$$T\text{-score} = Z\text{-score} + \text{population mean } T\text{-score} \tag{1}$$

where the population mean T-score depends on the patient's age and gender and the type of measurement.^{(13)}

Fundamental to the role of BMD scans in diagnosing osteoporosis is the ability to assess a patient's risk of fracture. The most reliable approach to evaluating the effectiveness of bone densitometry is through prospective studies of incident fractures.^{(14,15)} Studies are analyzed using a proportional hazards model in which the findings are expressed as the relative risk (RR), defined as the increased risk of fracture for each unit decrease in Z-score.^{(16)} The results of fracture studies can also be expressed by plotting the percentage of fracture patients against the percentage of the whole study population with BMD values below some chosen threshold. As the threshold is varied, one obtains a receiver operating characteristic (ROC) curve^{(17)} in which the true positive fraction (those patients who sustained a fracture and were correctly identified to be at risk by the BMD measurement) is plotted against the false positive fraction (those patients identified as being at risk but who did not fracture). ROC curves are often parameterized by the area under the curve (AUC). The larger the AUC, the better the discrimination of the BMD measurements at identifying those patients at greatest risk of fracture.^{(18)}

The need to optimize the ROC curve raises the question of what improvement in fracture discrimination is obtained by combining information from two or more different measurements. For example, it is common to perform BMD scans of the spine and hip and make the diagnosis of osteoporosis if the T-score at either site is less than −2.5.^{(12)} However, it is unclear what gain in discrimination is obtained by this practice, or even whether individual interpretation of T-scores using the WHO threshold is the best way of combining the data. More generally, one might wish to examine the effect of other combinations such as BMD and heel ultrasound^{(19)} or speed of sound (SOS) measurements at different skeletal sites.^{(20)} In this report we consider these issues by developing a mathematical model to illustrate the gains and limitations that apply when information is combined from two different measurements. The predictions of the model are compared with SOS data obtained in patients with vertebral and Colles' fractures.

### MATERIALS AND METHODS

#### Estimation of the ROC curve for a single BMD measurement

We first give a brief description of the model for a single type of measurement. We assume that for a group of subjects from the general population with a narrow range of ages (say, a 5-year age range), the distribution of Z-score values approximates to a gaussian curve with its peak at *Z* = 0.^{(18)}

$$P(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{Z^{2}}{2}\right) \tag{2}$$

Based on the proportional hazards model, the fracture risk can be modeled by an exponential curve that scales with Z-score as exp(−β*Z*), where β is the logarithm of the relative risk [β = ln(RR)]. The Z-score distribution for the fracture population is found by multiplying Equation 2 by the fracture risk curve. It is found to be a gaussian equation with the same SD as Equation 2 but with its peak at *Z* = −β^{(18)}:

$$P_{f}(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(Z + \beta)^{2}}{2}\right) \tag{3}$$
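The step from the general-population gaussian to the fracture-population gaussian can be checked by completing the square. The short derivation below is a sketch that assumes a unit-SD gaussian centered at *Z* = 0 and the risk factor exp(−β*Z*) described above; the symbol *P*_{f} for the fracture distribution is introduced here for illustration only.

```latex
\begin{aligned}
\exp\!\left(-\tfrac{1}{2}Z^{2}\right)\exp(-\beta Z)
  &= \exp\!\left(-\tfrac{1}{2}\left(Z^{2} + 2\beta Z\right)\right) \\
  &= \exp\!\left(-\tfrac{1}{2}(Z+\beta)^{2}\right)\exp\!\left(\tfrac{1}{2}\beta^{2}\right)
\end{aligned}
% The factor exp(beta^2 / 2) does not depend on Z, so it is absorbed
% when the fracture distribution is normalized to unit area:
P_{f}(Z) \propto \exp\!\left(-\tfrac{1}{2}(Z+\beta)^{2}\right)
```

The result is a gaussian with the same SD and its peak shifted to *Z* = −β, as stated in the text.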

The gaussian model may be used to predict points on the ROC curve by calculating the area under the curves representing the fracture and the general populations up to a chosen Z-score threshold (Fig. 1A). As the threshold is varied, the ROC curve is traced out and the AUC can be calculated.
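As a rough illustration of the single measurement model, the sketch below evaluates one point on the ROC curve and the AUC for a given RR. It assumes unit-SD gaussians for both populations; the function names are hypothetical, and the closed form AUC = Φ(β/√2) is the standard result for two equal-variance gaussians rather than a formula quoted in the text.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_ppf(p, lo=-10.0, hi=10.0):
    """Inverse of norm_cdf found by bisection (adequate for illustration)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def roc_point(rr, population_fraction):
    """Fraction of the fracture population lying below the Z-score threshold
    that selects `population_fraction` of the general population."""
    beta = math.log(rr)
    threshold = norm_ppf(population_fraction)
    # The fracture population is N(-beta, 1), so Z < threshold is
    # equivalent to a standard normal variate < threshold + beta.
    return norm_cdf(threshold + beta)


def auc(rr):
    """Area under the ROC curve (fracture fraction plotted against the
    fraction of the general population) for peak separation beta."""
    return norm_cdf(math.log(rr) / math.sqrt(2.0))


if __name__ == "__main__":
    for rr in (1.5, 2.0, 2.5):
        print(rr, round(auc(rr), 3), round(roc_point(rr, 0.25), 2))
```

Under these assumptions the values for RR = 1.5, 2.0, and 2.5 can be compared directly with the single measurement figures reported in the Results.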

#### Estimation of the ROC curve for the combination of two BMD measurements

The effect of combining two different types of measurement with Z-scores *Z*_{1} and *Z*_{2} can be studied by using a bivariate gaussian function^{(21)} to represent the population distribution. For the general population the equation is:

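$$P(Z_{1}, Z_{2}) = \frac{1}{2\pi\sqrt{1 - r^{2}}} \exp\left(-\frac{Z_{1}^{2} - 2rZ_{1}Z_{2} + Z_{2}^{2}}{2(1 - r^{2})}\right) \tag{4}$$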

where *r* is the Pearson correlation coefficient between *Z*_{1} and *Z*_{2}.^{(22)}

The bivariate gaussian representing the fracture population is similar to the function describing the general population, but with its peak at the point *Z*_{1} = −β_{1}, *Z*_{2} = −β_{2}:

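$$P_{f}(Z_{1}, Z_{2}) = \frac{1}{2\pi\sqrt{1 - r^{2}}} \exp\left(-\frac{(Z_{1} + \beta_{1})^{2} - 2r(Z_{1} + \beta_{1})(Z_{2} + \beta_{2}) + (Z_{2} + \beta_{2})^{2}}{2(1 - r^{2})}\right) \tag{5}$$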

That Equation 5 has the correct form to represent the fracture population can be verified by projecting the distribution onto the *Z*_{1} axis, whereupon it reduces to the single measurement gaussian in Equation 3 with β = β_{1}. It is clear, therefore, that the constant β_{1} is related to the relative risk of the first BMD measurement by the equation β_{1} = ln(RR_{1}). In the same way it can be shown that the constant β_{2} is related to the relative risk of the second measurement by the equation β_{2} = ln(RR_{2}). Figure 1B shows the functions in Equations 4 and 5 plotted together as contours of constant number density.

There is an important relationship between β_{1}, β_{2}, and *r*. If two different types of measurement are correlated, then part of the fracture prediction capability of one is derived through its correlation with the other. If Equation 5 is rewritten to express *Z*_{2} in terms of its correlation with *Z*_{1} and a residual, the β value associated with the residual is β_{res} = (β_{2} − *r*β_{1})/√(1 − *r*^{2}). Because an increased risk of fracture is always associated with a decrease in BMD, β_{res} > 0, and hence we derive the condition β_{2} > *r*β_{1}, or equivalently *r* < β_{2}/β_{1}. This condition must apply if the combination of two different measurements is to provide increased fracture discrimination compared with a single measurement alone. There is a similar condition, *r* < β_{1}/β_{2}, but given that the correlation coefficient *r* is less than unity, one of the two conditions is trivial.
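The condition on *r* can be checked numerically. The sketch below computes β_{res} and the limiting correlation β_{2}/β_{1} for a pair of RR values; the function names are hypothetical, and it assumes the first measurement is the one with the larger RR so that the non-trivial condition is *r* < β_{2}/β_{1}.

```python
import math


def beta_residual(rr1, rr2, r):
    """beta value of the part of Z2 that is independent of Z1:
    (beta2 - r * beta1) / sqrt(1 - r^2)."""
    beta1, beta2 = math.log(rr1), math.log(rr2)
    return (beta2 - r * beta1) / math.sqrt(1.0 - r * r)


def critical_correlation(rr1, rr2):
    """Correlation at which beta_res falls to zero (assumes rr1 >= rr2 > 1)."""
    return math.log(rr2) / math.log(rr1)


if __name__ == "__main__":
    for rr1, rr2 in ((2.0, 1.5), (2.5, 2.0), (2.5, 1.5)):
        print(rr1, rr2, round(critical_correlation(rr1, rr2), 3))
    # A second measurement adds information only while beta_res > 0.
    print(beta_residual(2.0, 1.5, 0.5) > 0)  # r below the limit
    print(beta_residual(2.0, 1.5, 0.7) > 0)  # r above the limit
```

For RR_{1} = 2.0 and RR_{2} = 1.5 the limit is close to 0.58, so a correlation of 0.7 leaves no positive residual contribution from the second measurement.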

In this study, the bivariate gaussian functions representing the general and the fracture populations were integrated to derive the form of the ROC curve in a manner similar to the single measurement model (Fig. 1A). Two different methods of combining the two BMD measurements to set a threshold for identifying high-risk individuals were considered. In the first approach, a weighted mean of the two Z-scores was used (Fig. 1B). The second approach was based on the WHO interpretation of BMD results in which individuals are identified as being at risk if one or other of the two measurements is below a set threshold (Fig. 1C). These two approaches are described in turn below.

#### Combination of two BMD measurements using the weighted mean Z-score

In the first approach, a weighted mean of the two Z-scores was used to specify the threshold for the integration of the bivariate gaussian functions (Fig. 1B). We begin by considering the case where the two measurements are equally effective at predicting fracture risk. In this case β_{1} = β_{2} = β, and it is clear the measurements should be given equal weight. A simple way of achieving this is to rotate the axes in Fig. 1B through 45° to give new axes *Z*_{+} and *Z*_{−} that are oriented along the major and minor axes of the bivariate gaussian distributions, respectively. The *Z*_{+} and *Z*_{−} values associated with *Z*_{1} and *Z*_{2} are:

$$Z_{+} = \frac{Z_{1} + Z_{2}}{\sqrt{2}} \tag{6}$$

$$Z_{-} = \frac{Z_{2} - Z_{1}}{\sqrt{2}} \tag{7}$$

When the bivariate gaussian is projected onto the *Z*_{+} axis, a gaussian curve is obtained with an SD of √(1 + *r*). Because the peaks of the functions representing the two populations are at (0, 0) and (−β, −β), respectively, the separation of the two peaks is √2β. When normalized by the SD of the projected gaussian, the separation is √2β/√(1 + *r*). It follows from the example of the single measurement model illustrated in Fig. 1A that the β value for the *Z*_{+} combination of *Z*_{1} and *Z*_{2} is given by:

$$\beta_{comb} = \frac{\sqrt{2}\,\beta}{\sqrt{1 + r}} \tag{8}$$

When the bivariate gaussian is projected onto the *Z*_{−} axis, the peaks of the two populations coincide. Hence *Z*_{−} gives no information about fracture risk. We shall refer to *Z*_{+} as the weighted mean Z-score. As the value of the *Z*_{+} threshold in Fig. 1B is varied, the ROC curve is traced out and the AUC can be calculated.

The general case when β_{1} and β_{2} are unequal also has a simple solution. In this case, *Z*_{1} and *Z*_{2} are combined with weighting factors cosϕ and sinϕ, giving the following generalized equation for the weighted mean Z-score:

$$Z_{+} = Z_{1}\cos\phi + Z_{2}\sin\phi \tag{9}$$

By an argument similar to that given above for the derivation of Equation 8, in which the separation of the two peaks projected on the *Z*_{+} axis is normalized by the SD of the projected single gaussian, the β value for the weighted mean Z-score is:

$$\beta_{comb} = \frac{\beta_{1}\cos\phi + \beta_{2}\sin\phi}{\sqrt{1 + 2r\sin\phi\cos\phi}} \tag{10}$$

The value of β_{comb} is a maximum when the angle ϕ takes the value:

$$\tan\phi = \frac{\beta_{2} - r\beta_{1}}{\beta_{1} - r\beta_{2}} \tag{11}$$
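The weighted mean Z-score calculation can be sketched in a few lines. The code below assumes the forms implied by the text: weights cosϕ and sinϕ, a projected SD of √(1 + 2*r* sinϕ cosϕ), and an optimum angle with tanϕ = (β_{2} − *r*β_{1})/(β_{1} − *r*β_{2}), which reduces to tanϕ = β_{2}/β_{1} at *r* = 0 and to ϕ = 0 at *r* = β_{2}/β_{1}. The function names are hypothetical.

```python
import math


def beta_comb(rr1, rr2, r, phi):
    """beta value of Z1*cos(phi) + Z2*sin(phi): separation of the two
    projected peaks divided by the SD of the projected gaussian."""
    beta1, beta2 = math.log(rr1), math.log(rr2)
    separation = beta1 * math.cos(phi) + beta2 * math.sin(phi)
    projected_sd = math.sqrt(1.0 + 2.0 * r * math.sin(phi) * math.cos(phi))
    return separation / projected_sd


def optimal_phi(rr1, rr2, r):
    """Angle (radians) that maximizes beta_comb for the given correlation."""
    beta1, beta2 = math.log(rr1), math.log(rr2)
    return math.atan2(beta2 - r * beta1, beta1 - r * beta2)


def rr_comb(rr1, rr2, r):
    """Relative risk of the optimally weighted mean Z-score."""
    return math.exp(beta_comb(rr1, rr2, r, optimal_phi(rr1, rr2, r)))


if __name__ == "__main__":
    for r in (0.0, 0.5, 0.7):
        print("equal RR = 2.0, r =", r, "->", round(rr_comb(2.0, 2.0, r), 2))
    print("RR 2.0 and 1.5, r = 0.5 ->", round(rr_comb(2.0, 1.5, 0.5), 2))
```

For equal RR values the optimum angle is 45° and the result reduces to √2β/√(1 + *r*). With RR_{1} = RR_{2} = 2.0 and *r* = 0.5 this sketch gives RR_{comb} of about 2.23.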

#### Combination of two BMD measurements using the WHO approach

In the second method of combining two measurements, an approach equivalent to the WHO interpretation of BMD results was modeled in which the patient was defined to be at high risk of fracture if one or the other of the two Z-scores was below a set threshold (Fig. 1C). A computer program was written to integrate the functions representing the two populations according to the limits shown in Fig. 1C and calculate the corresponding ROC curve. For the sake of presenting a simple example, the calculations were performed assuming that the population mean T-score in Equation 1 was the same for both measurements. In this case, the two axes have equal Z-score thresholds, as shown in Fig. 1C. This is approximately true for lumbar spine and femoral neck BMD measurements.^{(13)} Note that although the above description of the bivariate gaussian model is given in terms of Z-scores, the point on the ROC curve corresponding to a T-score of −2.5 is readily calculated by substituting T = −2.5 in Equation 1.
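The authors' program is not reproduced in the text. As an illustration only, the sketch below shows one way such an integration could be carried out: the probability that either Z-score falls below a common threshold is obtained by a one-dimensional trapezoidal integration over the first variable, using the conditional gaussian for the second. The function names, step counts, and integration limits are assumptions chosen for this example, and the correlation must satisfy |*r*| < 1.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def norm_pdf(x):
    """Standard normal probability density function."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def prob_either_below(t1, t2, r, n=400):
    """P(X1 < t1 or X2 < t2) for a standard bivariate gaussian with
    correlation r, computed as 1 - P(X1 >= t1 and X2 >= t2)."""
    s = math.sqrt(1.0 - r * r)
    upper = max(t1, 0.0) + 8.0  # the density is negligible beyond this point
    h = (upper - t1) / n
    total = 0.0
    for i in range(n + 1):
        x = t1 + i * h
        weight = 0.5 if i in (0, n) else 1.0
        # Given X1 = x, X2 is gaussian with mean r*x and SD sqrt(1 - r^2).
        total += weight * norm_pdf(x) * (1.0 - norm_cdf((t2 - r * x) / s))
    return 1.0 - total * h


def who_auc(rr1, rr2, r, t_min=-7.0, t_max=7.0, steps=140):
    """AUC of the ROC curve traced out as the common threshold is varied."""
    beta1, beta2 = math.log(rr1), math.log(rr2)
    xs, ys = [0.0], [0.0]
    for i in range(steps + 1):
        t = t_min + (t_max - t_min) * i / steps
        xs.append(prob_either_below(t, t, r))  # general population
        # Fracture population: peaks shifted to (-beta1, -beta2).
        ys.append(prob_either_below(t + beta1, t + beta2, r))
    xs.append(1.0)
    ys.append(1.0)
    return sum(0.5 * (ys[i] + ys[i - 1]) * (xs[i] - xs[i - 1])
               for i in range(1, len(xs)))


if __name__ == "__main__":
    print(round(who_auc(2.0, 2.0, 0.0), 3), round(who_auc(2.0, 2.0, 0.5), 3))
```

For *r* = 0 the integral can be checked against the closed form 1 − [1 − Φ(*t*)]^{2}, which is a convenient test of the numerical step size.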

#### Subjects and measurements for the in vivo study

Data from an in vivo study were used to compare the predictions of the theoretical model described above with the improvements in fracture discrimination found by combining multisite axial transmission SOS measurements in patients with vertebral and Colles' fractures. A Sunlight Omnisense device (Sunlight Medical Ltd., Tel-Aviv, Israel) was used to measure SOS at the radius, phalanx, and metatarsal in a study population that consisted of 110 healthy postmenopausal women, 64 women with atraumatic vertebral fractures, and 31 women with low trauma wrist fractures.^{(20,23)} All patients gave informed consent, and the study was approved by the Local Research Ethics Committee. Logistic regression analysis was used to calculate the age-adjusted odds ratios (ORs) for fracture discrimination. Although the mathematical model used for logistic regression analysis of cross-sectional fracture studies differs slightly from the proportional hazards model used for prospective studies,^{(24)} the resulting ORs are essentially equivalent to relative risks.^{(25,26)} OR values for the combinations of pairs of SOS sites were therefore calculated by taking the weighted mean of the two Z-scores, and the results were compared with the predictions of the theoretical model calculated from the single site OR using Equations 10 and 11.

### RESULTS

When the single measurement model was used to calculate the ROC curves for RR values of 1.5, 2.0, and 2.5 (Fig. 2A), the values of the AUC were 0.613, 0.688, and 0.741, respectively (Table 1). At the point on the ROC curve representing the lowest quartile of the general population, the percentage of the fracture population captured was 39%, 51%, and 60%, respectively (Fig. 2A).

For the combination of two BMD measurements using the weighted mean Z-score approach, the simplest case is when the measurements have equal RR values. The greatest improvement in fracture discrimination is obtained for *r* = 0 when β_{comb} = √2β (Equation 8). As the correlation coefficient increases to *r* = 1, the value of β_{comb} decreases to β, and no improvement is obtained compared with a single measurement. The value of RR corresponding to β_{comb} can be calculated using the equation RR_{comb} = exp(β_{comb}). For RR_{1} = RR_{2} = 1.5, the values of RR_{comb} (and AUC) for *r* = 0, *r* = 0.5, and *r* = 0.7 are 1.77 (0.657), 1.60 (0.630), and 1.55 (0.622), respectively (Table 1). For RR_{1} = RR_{2} = 2.0, the equivalent values are 2.67 (0.756), 2.23 (0.714), and 2.13 (0.702), and for RR_{1} = RR_{2} = 2.5 the values are 3.65 (0.820), 2.89 (0.773), and 2.71 (0.759). The ROC curves for RR = 2.0 are plotted in Fig. 2B. For the point on the curves representing the lowest quartile of the general population, the percentage of the fracture population captured was 62%, 55%, and 53%, respectively, for *r* = 0, 0.5, and 0.7 compared with 51% for a single BMD measurement with the same RR value.

When the two BMD measurements have unequal RR values, the angle ϕ for calculating the optimum weighting factors in Equation 9 was found using Equation 11. The angle ϕ was plotted as a function of the correlation coefficient for three different sets of RR values (Fig. 3). When *r* is less than the critical value *r*_{c} = β_{2}/β_{1}, both weighting factors are positive, and the combination of the two measurements improves fracture discrimination. However, when *r* = *r*_{c}, ϕ = 0 and no additional discrimination is obtained. When *r* > *r*_{c}, the weighting factor for the second measurement is negative, and the additional discrimination provided by the second measurement associates an increased fracture risk with an increased BMD. This is the opposite of what is usually observed and will not be considered further here.

When the weighted mean Z-score approach was analyzed for RR_{1} = 2.0 and RR_{2} = 1.5, the values of RR_{comb} (and AUC) for *r* = 0 and *r* = 0.5 were 2.23 (0.715) and 2.01 (0.689), respectively (Table 1). The case for *r* = 0.7 was not considered because it exceeds the critical value *r*_{c} = 0.58. For RR_{1} = 2.5 and RR_{2} = 2.0, the critical value is *r*_{c} = 0.76, and the values of RR_{comb} (and AUC) for *r* = 0, *r* = 0.5, and *r* = 0.7 are 3.16 (0.791), 2.61 (0.750), and 2.51 (0.742), respectively. For RR_{1} = 2.5 and RR_{2} = 1.5, the critical value is *r*_{c} = 0.44, and the value of RR_{comb} (and AUC) for *r* = 0 is 2.72 (0.760).

For the combination of two BMD measurements using the WHO approach, the ROC curves were calculated by performing the numerical integration over the two population distributions shown in Fig. 1C. The AUC values obtained are compared with the weighted mean Z-score approach in Table 1. In each instance, slightly better fracture discrimination was obtained using the mean Z-score approach. For equal RR values, the AUC values for the WHO approach lay between those for the mean Z-score approach and a single BMD measurement. The ROC curves for a single BMD measurement and two measurements combined using the mean Z-score and WHO approaches for RR_{1} = RR_{2} = 2.0 and *r* = 0 are plotted in Fig. 2C. For the point on the curves representing the lowest quartile of the general population the percentage of the fracture population captured was 62%, 57%, and 51% for the mean Z-score, the WHO approach, and a single BMD measurement, respectively. For *r* = 0.5, the corresponding values were 55%, 53%, and 51%.

The results for the WHO approach when the two BMD measurements have unequal RR values are also included in Table 1. As with the case for equal RR values, the weighted mean Z-score approach performed slightly better than the WHO approach. The comparison between the two approaches is summarized in Fig. 4, which shows the AUC plotted as a function of the correlation coefficient for the weighted mean Z-score, and the WHO approaches for examples of equal RR values (RR_{1} = RR_{2} = 2.0) and unequal values (RR_{1} = 2.0, RR_{2} = 1.5), respectively.

Details of the subjects included in the cross-sectional studies of vertebral and Colles' fracture patients using the Sunlight Omnisense device have been published previously.^{(20,23)} OR values obtained from logistic regression analysis for radius, phalanx, and metatarsal SOS considered singly and then combined in pairs using the optimum weighting factors for the weighted mean Z-score are listed in Table 2. For the paired SOS data, Table 2 also lists the theoretical values of OR calculated using Equation 10. Values for the correlation coefficient *r* varied between 0.27 and 0.32. The average OR value for the single site measurements was 1.55, for the paired SOS data analyzed using logistic regression was 1.75, and for the paired SOS data predicted using the bivariate gaussian model was 1.74. The latter two figures show good agreement, and the in vivo data confirm the modest improvement in the ORs predicted by the model.

### DISCUSSION

The clinical value of bone densitometry investigations depends on two factors: (1) the ability of the measurements to discriminate patients who will have a fracture from those who will not; and (2) how effectively the scan findings are interpreted. The first factor depends on the RR value of the BMD measurements, which determines the area under the ROC curve. The larger RR, the larger the percentage of future fracture patients identified when treating any given percentage of the population.^{(18)} At present, the best technique is the use of a hip BMD measurement to predict hip fracture risk for which RR = 2.6.^{(16)} The second factor determining the value of BMD scanning is how effectively the findings are interpreted. Here, the prevailing paradigm is the WHO definition of osteoporosis of a T-score ≤ −2.5 at the spine, hip, or forearm. The ROC plot is a useful tool for evaluating the outcome of bone densitometry investigations because the operating point on the curve indicates the overall effectiveness of BMD scans taking into account both the RR of the measurement technique and the way the findings are interpreted.^{(18)}

Intuitively, the combination of two or more different types of measurement should improve the ability of bone densitometry investigations to identify patients at risk of fracture so that a more favorable point is achieved on the ROC curve. Ideally, this means moving the operating point towards the top left of the diagram so more fracture cases are identified and fewer patients treated overall. When more than one measurement is made (for example spine and hip BMD), most clinicians would agree that a diagnosis of osteoporosis should be made if one or other site is below the WHO T-score threshold.^{(12)} However, it is not known whether this widely used practice is the optimum method of using the data in terms of the ROC plot. An alternative is the conventional statistical approach of combining two noisy measurements by taking their mean. If each measurement has the same error, they are given equal weights, and there is an overall gain in signal to noise by a factor of √2. Bone density measurements should certainly be regarded as noisy data, because random accuracy errors caused by the effects of soft tissue composition account for around 50% of the population SD,^{(27)} suggesting that their effect might be reduced by averaging over sites. Therefore, an important objective of this study was to quantify and compare the improvement in fracture discrimination for the WHO and the weighted mean Z-score approaches.

Of the two approaches, it is easier to calculate the ROC curve for the weighted mean Z-score. In this case, the effect of combining two measurements is equivalent to a single BMD measurement with an increased RR value that can be calculated from Equation 8 for the case RR_{1} = RR_{2} and from Equations 10 and 11 for the case when RR_{1} ≠ RR_{2}. The greatest improvement in fracture discrimination is obtained when the two measurements have equal RR values and their correlation coefficient is zero. In this case, β_{comb} = √2β, and we obtain the √2 factor familiar from basic statistics. However, as the correlation coefficient increases to *r* = 1, the value of β_{comb} decreases to β, and there is no gain compared with a single measurement. When two measurements are completely uncorrelated, the improvement in discrimination is substantial. For example, for two measurements where RR_{1} = RR_{2} = 2.0, the value of RR_{comb} is 2.67, which is equivalent to the optimal bone densitometry study using hip BMD to predict hip fracture risk.^{(16)} However, in practice the correlation coefficient between different measurements often lies in the range *r* = 0.5–0.7,^{(28)} and in this case, the value of RR_{comb} is substantially less and lies in the range of 2.13–2.23. A good example of the important effect of the correlation coefficient is dual-femur DXA scanning.^{(29)} In this case RR_{1} = RR_{2} = 2.6,^{(16)} and *r* ≈ 0.95,^{(29)} giving RR_{comb} = 2.63, which represents a negligible improvement. In comparison, for two uncorrelated measurements with RR = 2.6, the RR_{comb} value would be 3.86. Were it possible to achieve this latter example in practice, it would represent a substantial gain in fracture discrimination.

The case when RR_{1} ≠ RR_{2} is more complicated because the weighting factors are unequal. The greatest improvement in fracture discrimination is still obtained when *r* = 0. In this case tanϕ = β_{2}/β_{1} (Equation 11), and the two measurements are weighted in proportion to their β values. It is readily shown from Equation 10 that in this case the β value of the weighted combination is β_{comb} = √(β_{1}^{2} + β_{2}^{2}). As the correlation coefficient increases from zero, proportionally more weight is given to the measurement with the larger RR value, and the contribution of the second measurement becomes zero when *r* = *r*_{c} = β_{2}/β_{1} (Fig. 3). An example of unequal weights is the use of spine and hip BMD to predict hip fracture risk. The Marshall meta-analysis lists RR values (and 95% CIs) of 1.6 (1.2–2.2) for spine BMD and 2.6 (2.0–3.5) for hip BMD.^{(16)} Assuming RR = 2.6 for hip BMD and a correlation coefficient *r* = 0.7, the minimum allowed RR value for spine BMD is 1.95. We therefore combined the hip RR figure of 2.6 with the 95% confidence upper limit for spine BMD of 2.2 to obtain RR_{comb} = 2.64. As with the example of dual-femur scanning discussed above, this represents a negligible improvement.
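The clinical examples quoted in the Discussion can be re-derived with a short calculation. The sketch below uses a closed form for the optimally weighted β value, β_{comb} = √[(β_{1}^{2} − 2*r*β_{1}β_{2} + β_{2}^{2})/(1 − *r*^{2})], obtained by substituting the optimum weights into the expression for β_{comb}. This closed form is not written out in the text and is offered here as an assumption to be checked; the function names are hypothetical.

```python
import math


def rr_comb_optimal(rr1, rr2, r):
    """Relative risk of the optimally weighted mean Z-score (requires |r| < 1)."""
    b1, b2 = math.log(rr1), math.log(rr2)
    beta_sq = (b1 * b1 - 2.0 * r * b1 * b2 + b2 * b2) / (1.0 - r * r)
    return math.exp(math.sqrt(beta_sq))


def minimum_rr2(rr1, r):
    """Smallest RR for the second measurement that still satisfies
    beta2 > r * beta1, i.e. that can add to fracture discrimination."""
    return math.exp(r * math.log(rr1))


if __name__ == "__main__":
    # Dual-femur DXA: equal RR values and a very high correlation.
    print(round(rr_comb_optimal(2.6, 2.6, 0.95), 2))
    # Two hypothetical uncorrelated measurements with the same RR.
    print(round(rr_comb_optimal(2.6, 2.6, 0.0), 2))
    # Spine and hip BMD for hip fracture, assuming r = 0.7.
    print(round(minimum_rr2(2.6, 0.7), 2))
    print(round(rr_comb_optimal(2.6, 2.2, 0.7), 2))
```

Under these assumptions the four printed values can be compared with the figures of 2.63, 3.86, 1.95, and 2.64 quoted in the Discussion.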

The poor correlation between different types of bone densitometry measurement is often a cause for concern because of the potential for conflicting findings between different sites.^{(18)} If two types of measurement were to correlate perfectly, they would identify exactly the same patients for treatment. However, in practice, different types of measurement often correlate poorly, with *r* ≈ 0.6–0.7 between BMD results from different sites and *r* ≈ 0.4–0.5 between QUS and BMD measurements.^{(28)} Although the poor correlation between different types of measurement can create difficulties when comparing results, it turns out that the correlations found in practice are rather too good to achieve more than a modest improvement in fracture discrimination by combination of measurements.

In this study, when the weighted mean Z-score approach was compared with the WHO approach, the former was found to perform consistently slightly better in terms of the ROC curve (Table 1). However, the differences are greatest when the correlation coefficient is zero, and in practice, for correlations of *r* = 0.5–0.7, the differences in the ROC curves between the two approaches were very small.

An important limitation of the analysis presented here is that the results depend on the assumptions that the BMD distributions for the age-matched general population are gaussian and that the fracture risk curves are exponential. We believe that these are sufficiently good approximations to provide useful insights into the effects of combining different types of BMD measurement. Although we were able to show good agreement between the results of the theoretical model presented here and data from a cross-sectional study of fracture patients, it is important that the predictions of the model are compared with data from a prospective fracture study. We note that the conclusions of the present study are consistent with an analysis of data from the Study of Osteoporotic Fractures reported by Genant et al.,^{(30)} who also drew attention to the limited benefit of combining BMD data from different sites.

It was concluded from the present study that when two different types of BMD measurement are combined, the largest gain in fracture discrimination is obtained when the correlation coefficient is zero and they have equal RR values. In this case, by using the mean Z-score approach, the exponent β in the exponential curve relating fracture risk to BMD is increased by a factor √2. In practice, however, the effects of the correlation between different measurements and of unequal values of RR ensure that the real gains in fracture discrimination are considerably smaller than this ideal case. Second, it was concluded that when the weighted mean Z-score approach was compared with the WHO approach the former performed slightly better when judged by the ROC curve, although in practice the clinical effect of this difference was found to be small.

### REFERENCES

- 1. 1992 Hip fractures in the elderly; a world-wide projection. Osteoporos Int 2:285–289.
- 2. 1997 Medical expenditures for the treatment of osteoporotic fractures in the United States in 1995: Report from the National Osteoporosis Foundation. J Bone Miner Res 12:24–35.
- 3. 1996 Randomised trial of the effect of alendronate on risk of fracture in women with existing vertebral fractures. Lancet 348:1535–1541.
- 4. 1999 Reduction of vertebral fracture risk in postmenopausal women with osteoporosis treated with raloxifene: Results from a 3-year randomized clinical trial. JAMA 282:637–645.
- 5. 1999 Effects of risedronate treatment on vertebral and nonvertebral fractures in women with postmenopausal osteoporosis. JAMA 282:1344–1352.
- 6. 2001 Effect of risedronate treatment on hip fracture risk in elderly women. N Engl J Med 344:333–340.
- 7. 2001 Effect of recombinant human parathyroid hormone (1–34) fragment on spine and non-spine fractures and bone mineral density in postmenopausal osteoporosis. N Engl J Med 344:1434–1441.
- 8. 1997 Diagnosis and management of osteoporosis: Guidelines for the utilization of bone densitometry. Calcif Tissue Int 61:433–440.
- 9. On behalf of the European Foundation for Osteoporosis and Bone Disease 1997 Guidelines for diagnosis and treatment of osteoporosis. Osteoporos Int 7:390–406.
- 10. Royal College of Physicians 1999 Osteoporosis: Clinical Guidelines for Prevention and Treatment. Royal College of Physicians, London, UK.
- 11. World Health Organization 1994 WHO Technical Report Series 843. Assessment of Fracture Risk and Its Application to Screening for Postmenopausal Osteoporosis. World Health Organization, Geneva, Switzerland.
- 12. National Osteoporosis Society 2002 Position Statement on the Reporting of Dual X-Ray Absorptiometry (DXA) Bone Mineral Density Scans. National Osteoporosis Society, Bath, UK.
- 13. 1999 Discordance in patient classification using T-scores. J Clin Densitom 2:343–350.
- 14. 1991 Which fractures are associated with low appendicular bone mass in elderly women? Ann Intern Med 115:837–842.
- 15. 1993 Bone density at various sites for prediction of hip fractures. Lancet 341:72–75.
- 16. 1996 Meta-analysis of how well measures of bone mineral density predict occurrence of osteoporotic fractures. BMJ 312:1254–1259.
- 17. 1991 Practical Statistics for Medical Research. Chapman & Hall, London, UK, pp. 409–418.
- 18. 2001 Peripheral or central densitometry: Does it matter which technique we use? J Clin Densitom 4:83–96.
- 19. 2001 Does the combination of quantitative ultrasound and dual energy x-ray absorptiometry improve fracture discrimination? Osteoporos Int 12:471–477.
- 20. 2001 Multisite quantitative ultrasound: Precision, age- and menopause-related changes, fracture discrimination and T-score equivalence with dual-energy X-ray absorptiometry. Osteoporos Int 12:456–464.
- 21. 1962 Statistical Astronomy. Dover Publications, New York, NY, USA, pp. 42–68.
- 22. 1991 Practical Statistics for Medical Research. Chapman & Hall, London, UK, pp. 277–282.
- 23. 2002 Multisite quantitative ultrasound: Colles fracture discrimination in postmenopausal women. Osteoporos Int 13:474–479.
- 24. 1995 Choosing between predictors of fractures. J Bone Miner Res 10:1816–1822.
- 25. 1992 Fracture risk as determined by prospective and retrospective study designs. Osteoporos Int 2:290–297.
- 26. 1998 What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA 280:1690–1691.
- 27. 1995 Impact of soft tissue on in-vivo accuracy of bone mineral measurements in the spine, hip and forearm: A human cadaver study. J Bone Miner Res 10:868–873.
- 28. 1997 Comparisons of non-invasive bone mineral measurements in assessing age-related bone loss, fracture discrimination and diagnostic classification. J Bone Miner Res 12:697–711.
- 29. 2000 Bilateral measurements of femoral bone mineral density. J Clin Densitom 3:133–140.
- 30. 1996 Classification based on DXA measurements for assessing the risk of hip fractures. J Bone Miner Res 11:S1;S120.