Keywords:

  • bone densitometry;
  • fracture risk;
  • osteoporosis

Abstract

Combining information from different types of BMD measurement should improve the evaluation of patients' risk of fracture. This study used a bivariate gaussian model to examine the effect of combining two different BMD measurements. The results show that, in practice, there is little benefit unless the measurements are completely unrelated.

Introduction: Intuitively, the combination of information from two or more different types of bone densitometry investigation should improve our ability to identify patients at high risk of fracture. However, the best way to combine measurements and the resulting gain in fracture discrimination are not known.

Materials and Methods: In this study, we used a bivariate gaussian model to investigate the effect of combining two different types of bone densitometry measurements. The measurements had individual relative risk values RR1 and RR2 and a correlation coefficient r between their Z-scores. Different approaches to the combination of the two measurements were compared by calculating the area under the curve (AUC) for the receiver operating characteristic (ROC) curve, which was obtained by plotting the percentage of fracture patients against the percentage of the whole population with a Z-score below some chosen threshold. ROC curves were calculated for three cases: (1) one type of measurement only; (2) two different types of measurements combined using their mean Z-score weighted according to the theoretical optimum weighting factors predicted by the bivariate gaussian model; and (3) two different types of measurements combined using the conventional World Health Organization (WHO) approach, where one or other measurement is below a set threshold. The theoretical model was tested using measurements of speed of sound (SOS) in the radius, phalanx, and metatarsal in patients with vertebral and Colles' fractures.

Results: Results were calculated for RR values of 1.5, 2.0, and 2.5 and r = 0, 0.5, and 0.7. Although a significant improvement in fracture discrimination was obtained when r = 0 and RR1 = RR2, the improvements obtained when r ≥ 0.5 or RR1 ≠ RR2 were relatively modest. Slightly better fracture discrimination was obtained using the weighted mean Z-score approach compared with the WHO approach, although the differences were small. The results of the in vivo study in Colles' and vertebral fracture patients showed close agreement with the predictions of the bivariate gaussian model.

Conclusion: From a theoretical point of view, there is unlikely in practice to be much benefit from combining information from different types of bone densitometry measurements unless they are completely unrelated.


INTRODUCTION

Growing awareness of the impact of osteoporosis on the elderly,(1) the consequent costs of healthcare,(2) and the development of new treatments to prevent fractures(3–7) have all contributed to a rapid growth in the demand for bone densitometry services. Today, scans to measure bone mineral density (BMD) have an essential role in evaluating patients at risk of osteoporosis.(8–10) In 1994, a World Health Organization (WHO) report(11) recommended that osteoporosis should be defined by expressing BMD measurements as T-scores. T-scores are calculated by taking the difference between a patient's measured BMD and the mean BMD of healthy young adults matched for gender and ethnic group and dividing by the young adult SD. The WHO report defined osteoporosis as a T-score ≤ −2.5 measured at the spine, hip, or forearm. BMD measurements may also be interpreted using Z-scores.(12) Z-scores are calculated by taking the difference between the measured BMD and the mean BMD for healthy subjects matched for age, gender, and ethnic group, and dividing by the respective SD. Although they cannot be used to diagnose osteoporosis, Z-scores are useful because they express a patient's skeletal status relative to their peers. A patient's T- and Z-scores are related by the equation:

  • $T\text{-score} = Z\text{-score} + \text{population mean } T\text{-score}$    (1)

where the population mean T-score depends on the patient's age and gender and the type of measurement.(13)

Fundamental to the role of BMD scans in diagnosing osteoporosis is the ability to assess a patient's risk of fracture. The most reliable approach to evaluating the effectiveness of bone densitometry is through prospective studies of incident fractures.(14,15) Studies are analyzed using a proportional hazards model in which the findings are expressed as the relative risk (RR), defined as the increased risk of fracture for each unit decrease in Z-score.(16) The results of fracture studies can also be expressed by plotting the percentage of fracture patients against the percentage of the whole study population with BMD values below some chosen threshold. As the threshold is varied, one obtains a receiver operating characteristic (ROC) curve(17) in which the true positive fraction (those patients who sustained a fracture and were correctly identified to be at risk by the BMD measurement) is plotted against the false positive fraction (those patients identified as being at risk but who did not fracture). ROC curves are often parameterized by the area under the curve (AUC). The larger the AUC, the better the discrimination of the BMD measurements at identifying those patients at greatest risk of fracture.(18)

The need to optimize the ROC curve raises the question of what improvement in fracture discrimination is obtained by combining information from two or more different measurements. For example, it is common to perform BMD scans of the spine and hip and make the diagnosis of osteoporosis if the T-score at either site is less than −2.5.(12) However, it is unclear what gain in discrimination is obtained by this practice, or even whether individual interpretation of T-scores using the WHO threshold is the best way of combining the data. More generally, one might wish to examine the effect of other combinations such as BMD and heel ultrasound(19) or speed of sound (SOS) measurements at different skeletal sites.(20) In this report we consider these issues by developing a mathematical model to illustrate the gains and limitations that apply when information is combined from two different measurements. The predictions of the model are compared with SOS data obtained in patients with vertebral and Colles' fractures.

MATERIALS AND METHODS

Estimation of the ROC curve for a single BMD measurement

We first give a brief description of the model for a single type of measurement. We assume that for a group of subjects from the general population with a narrow range of ages (say, a 5-year age range), the distribution of Z-score values approximates to a gaussian curve with its peak at Z = 0.(18)

  • $P_{\mathrm{pop}}(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{Z^{2}}{2}\right)$    (2)

Based on the proportional hazards model, the fracture risk can be modeled by an exponential curve that scales with Z-score as exp(−βZ), where β is the logarithm of the relative risk [β = ln(RR)]. The Z-score distribution for the fracture population is found by multiplying Equation 2 by the fracture risk curve. It is found to be a gaussian equation with the same SD as Equation 2 but with its peak at Z = −β(18):

  • $P_{\mathrm{frac}}(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{(Z+\beta)^{2}}{2}\right)$    (3)
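The step from Equation 2 to Equation 3 can be checked by completing the square. The short derivation below is a sketch that writes the unit-SD gaussian of Equation 2 as $\exp(-Z^{2}/2)/\sqrt{2\pi}$; the symbol $P_{\mathrm{pop}}$ is notation assumed here for illustration.

```latex
\begin{align*}
P_{\mathrm{pop}}(Z)\,e^{-\beta Z}
  &= \frac{1}{\sqrt{2\pi}}\exp\!\left(-\tfrac{1}{2}Z^{2}-\beta Z\right) \\
  &= \exp\!\left(\tfrac{1}{2}\beta^{2}\right)\,
     \frac{1}{\sqrt{2\pi}}\exp\!\left(-\tfrac{1}{2}(Z+\beta)^{2}\right).
\end{align*}
```

The factor $\exp(\beta^{2}/2)$ does not depend on Z and disappears when the fracture distribution is normalized to unit area, leaving a gaussian with unit SD and its peak at Z = −β.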

The gaussian model may be used to predict points on the ROC curve by calculating the area under the curves representing the fracture and the general populations up to a chosen Z-score threshold (Fig. 1A). As the threshold is varied, the ROC curve is traced out and the AUC can be calculated.
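The single-measurement model can be sketched numerically as follows. This is a minimal illustration, not the authors' program: the function names are hypothetical, the general population is taken as a unit-SD gaussian at Z = 0 and the fracture population as the same curve shifted to Z = −β, and the closed form AUC = Φ(β/√2) is a standard result for two equal-variance gaussians rather than a formula quoted from the text.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit-SD gaussian centred on Z = 0 (general population)


def roc_point(rr, z_threshold):
    """Return (general-population fraction, fracture fraction) below the threshold."""
    beta = log(rr)  # beta = ln(RR)
    # The fracture curve is the general curve shifted by -beta.
    return STD_NORMAL.cdf(z_threshold), STD_NORMAL.cdf(z_threshold + beta)


def traced_auc(rr, steps=2000):
    """Sweep the threshold to trace the ROC curve, then integrate by the trapezoid rule."""
    pts = [roc_point(rr, -8.0 + 16.0 * i / steps) for i in range(steps + 1)]
    return sum(0.5 * (y0 + y1) * (x1 - x0) for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def single_auc(rr):
    """Closed-form AUC for two unit-SD gaussians separated by beta."""
    return STD_NORMAL.cdf(log(rr) / sqrt(2.0))


def captured_in_lowest_quartile(rr):
    """Fraction of the fracture population below the general-population 25th centile."""
    return roc_point(rr, STD_NORMAL.inv_cdf(0.25))[1]


if __name__ == "__main__":
    for rr in (1.5, 2.0, 2.5):
        print(rr, round(single_auc(rr), 3), round(captured_in_lowest_quartile(rr), 2))
```

Under these assumptions the AUC values for RR = 1.5, 2.0, and 2.5 come out close to 0.613, 0.688, and 0.741, and the lowest-quartile fractions close to 39%, 51%, and 60%, in line with the figures reported in the Results.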

FIG. 1. (A) Gaussian curves representing the distribution of BMD values in a fracture population compared with an age-matched general population. The two curves are offset by a Z-score difference β = ln(RR). By integrating the two curves, the percentage of patients in each group with BMD values below a threshold Z-score can be calculated and used to draw an ROC curve in which the true positive fraction (those patients identified to be at risk who sustain a fracture) is plotted against the false positive fraction (those patients identified at risk but who do not fracture) as the threshold Z-score is varied. (B) Plot of the bivariate gaussian functions representing the distribution of BMD values in a fracture population compared with an age-matched general population. The elliptical curves represent contours of equal number density. The diagonal line represents a line of constant weighted mean Z-score that sets the threshold for integration of the two bivariate gaussian functions. By integrating over the stippled area, the percentage of patients in each group with weighted mean Z-score value below the threshold can be calculated and used to draw an ROC curve in a manner analogous to the single BMD model in A. (C) The same bivariate gaussian functions as shown in B, but with the stippled area defining the patients at high risk of fracture based on the WHO approach where patients are assumed to be at risk if one or other measurement is below the threshold.

Estimation of the ROC curve for the combination of two BMD measurements

The effect of combining two different types of measurement with Z-scores Z1 and Z2 can be studied by using a bivariate gaussian function(21) to represent the population distribution. For the general population the equation is:

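  • $P_{\mathrm{pop}}(Z_1, Z_2) = \frac{1}{2\pi\sqrt{1-r^{2}}} \exp\left(-\frac{Z_1^{2} - 2rZ_1Z_2 + Z_2^{2}}{2(1-r^{2})}\right)$    (4)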

where r is the Pearson correlation coefficient between Z1 and Z2.(22)

The bivariate gaussian representing the fracture population is similar to the function describing the general population, but with its peak at the point Z1 = −β1, Z2 = −β2:

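  • $P_{\mathrm{frac}}(Z_1, Z_2) = \frac{1}{2\pi\sqrt{1-r^{2}}} \exp\left(-\frac{(Z_1+\beta_1)^{2} - 2r(Z_1+\beta_1)(Z_2+\beta_2) + (Z_2+\beta_2)^{2}}{2(1-r^{2})}\right)$    (5)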

That Equation 5 has the correct form to represent the fracture population can be verified by projecting the distribution onto the Z1 axis, whereupon it reduces to the single-measurement gaussian in Equation 3 with β = β1. It is clear, therefore, that the constant β1 is related to the relative risk of the first BMD measurement by the equation β1 = ln(RR1). In the same way it can be shown that the constant β2 is related to the relative risk of the second measurement by the equation β2 = ln(RR2). Figure 1B shows the functions in Equations 4 and 5 plotted together as contours of constant number density.

There is an important relationship between β1, β2, and r. If two different types of measurement are correlated, then part of the fracture prediction capability of one is derived through its correlation with the other. If Equation 5 is rewritten to express Z2 in terms of its correlation with Z1 and a residual, the β value associated with the residual is βres = (β2 − rβ1)/√(1 − r²). Because an increased risk of fracture is always associated with a decrease in BMD, βres > 0, and hence we derive the condition β2 > rβ1, or equivalently r < β2/β1. This condition must apply if the combination of two different measurements is to provide increased fracture discrimination compared with a single measurement alone. There is a similar condition, r < β1/β2, but given that the correlation coefficient r is less than unity, one of the two conditions is trivial.
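The residual β value and the resulting upper limit on the correlation coefficient can be illustrated with a few lines of Python. This is a sketch with hypothetical function names; it assumes both RR values exceed 1 and that |r| < 1.

```python
from math import log, sqrt


def residual_beta(rr1, rr2, r):
    """Beta value of the part of Z2 that is independent of Z1."""
    beta1, beta2 = log(rr1), log(rr2)
    return (beta2 - r * beta1) / sqrt(1.0 - r * r)


def critical_correlation(rr1, rr2):
    """Largest r at which the weaker measurement still adds discrimination."""
    beta_hi, beta_lo = sorted((log(rr1), log(rr2)), reverse=True)
    return beta_lo / beta_hi


if __name__ == "__main__":
    print(round(critical_correlation(2.0, 1.5), 2))
    print(residual_beta(2.0, 1.5, 0.5) > 0)  # below the limit: positive residual
    print(residual_beta(2.0, 1.5, 0.7) > 0)  # above the limit: negative residual
```

For RR1 = 2.0 and RR2 = 1.5 the limit is ln(1.5)/ln(2.0), roughly 0.58, so r = 0.5 leaves a positive residual β while r = 0.7 does not.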

In this study, the bivariate gaussian functions representing the general and the fracture populations were integrated to derive the form of the ROC curve in a manner similar to the single measurement model (Fig. 1A). Two different methods of combining the two BMD measurements to set a threshold for identifying high-risk individuals were considered. In the first approach, a weighted mean of the two Z-scores was used (Fig. 1B). The second approach was based on the WHO interpretation of BMD results in which individuals are identified as being at risk if one or other of the two measurements is below a set threshold (Fig. 1C). These two approaches are described in turn below.

Combination of two BMD measurements using the weighted mean Z-score

In the first approach, a weighted mean of the two Z-scores was used to specify the threshold for the integration of the bivariate gaussian functions (Fig. 1B). We begin by considering the case where the two measurements are equally effective at predicting fracture risk. In this case β1 = β2 = β, and it is clear the measurements should be given equal weight. A simple way of achieving this is to rotate the axes in Fig. 1B through 45° to give new axes Z+ and Z− that are oriented along the major and minor axes of the bivariate gaussian distributions, respectively. The Z+ and Z− values associated with Z1 and Z2 are:

  • $Z_{+} = (Z_1 + Z_2)/\sqrt{2}$    (6a)
  • $Z_{-} = (Z_1 - Z_2)/\sqrt{2}$    (6b)

When the bivariate gaussian is projected onto the Z+ axis, a gaussian curve is obtained with an SD of √(1 + r). Because the peaks of the functions representing the two populations are at (0, 0) and (−β, −β), respectively, the separation of the two peaks is √2β. When normalized by the SD of the projected gaussian, the separation is √2β/√(1 + r). It follows from the example of the single measurement model illustrated in Fig. 1A that the β value for the Z+ combination of Z1 and Z2 is given by:

  • $\beta_{\mathrm{comb}} = \sqrt{2}\,\beta/\sqrt{1+r}$    (7)

When the bivariate gaussian is projected onto the Z− axis, the peaks of the two populations coincide. Hence Z− gives no information about fracture risk. We shall refer to Z+ as the weighted mean Z-score. As the value of the Z+ threshold in Fig. 1B is varied, the ROC curve is traced out and the AUC can be calculated.

The general case when β1 and β2 are unequal also has a simple solution. In this case, Z1 and Z2 are combined with weighting factors cosϕ and sinϕ, giving the following generalized equation for the weighted mean Z-score:

  • $Z_{+} = Z_1\cos\phi + Z_2\sin\phi$    (8)

By an argument similar to that given above for the derivation of Equation 7, in which the separation of the two peaks projected on the Z+ axis is normalized by the SD of the projected single gaussian, the β value for the weighted mean Z-score is:

  • $\beta_{\mathrm{comb}} = (\beta_1\cos\phi + \beta_2\sin\phi)/\sqrt{1 + r\sin 2\phi}$    (9)

The value of βcomb is a maximum when the angle ϕ takes the value:

  • $\tan\phi = (\beta_2 - r\beta_1)/(\beta_1 - r\beta_2)$    (10)

Note that in the case when β1 = β2, the angle ϕ = 45°, and the definition of the weighted mean Z-score given in Equation 8 reverts to the definition given in Equation 6a.
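The weighted mean Z-score calculation can be sketched in Python as below. The function names are hypothetical. The expression for βcomb is the peak separation of Z1cosϕ + Z2sinϕ divided by its projected SD, and the optimum angle is taken as tanϕ = (β2 − rβ1)/(β1 − rβ2), which is an assumption in the sense that it is derived here by maximizing βcomb rather than copied from the source; it reduces to ϕ = 45° for equal β values and to ϕ = 0 at r = β2/β1.

```python
from math import atan2, cos, degrees, exp, log, sin, sqrt
from statistics import NormalDist


def beta_comb(beta1, beta2, r, phi):
    """Beta of Z1*cos(phi) + Z2*sin(phi): peak separation over projected SD."""
    return (beta1 * cos(phi) + beta2 * sin(phi)) / sqrt(1.0 + r * sin(2.0 * phi))


def optimal_phi(beta1, beta2, r):
    """Angle (radians) that maximizes beta_comb for correlation r."""
    return atan2(beta2 - r * beta1, beta1 - r * beta2)


def combined(rr1, rr2, r):
    """Optimum angle, equivalent RR, and AUC for two combined measurements."""
    b1, b2 = log(rr1), log(rr2)
    phi = optimal_phi(b1, b2, r)
    b = beta_comb(b1, b2, r, phi)
    # AUC uses the single-measurement gaussian model with beta = beta_comb.
    return {"phi_deg": degrees(phi), "rr_comb": exp(b),
            "auc": NormalDist().cdf(b / sqrt(2.0))}


if __name__ == "__main__":
    for args in ((2.0, 2.0, 0.0), (2.0, 2.0, 0.5), (2.0, 1.5, 0.5)):
        print(args, combined(*args))
```

For example, `combined(2.0, 1.5, 0.5)` gives an equivalent RR of about 2.01 and an AUC of about 0.689 under these assumptions.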

Combination of two BMD measurements using the WHO approach

In the second method of combining two measurements, an approach equivalent to the WHO interpretation of BMD results was modeled in which the patient was defined to be at high risk of fracture if one or the other of the two Z-scores was below a set threshold (Fig. 1C). A computer program was written to integrate the functions representing the two populations according to the limits shown in Fig. 1C and calculate the corresponding ROC curve. For the sake of presenting a simple example, the calculations were performed assuming that the population mean T-score in Equation 1 was the same for both measurements. In this case, the two axes have equal Z-score thresholds, as shown in Fig. 1C. This is approximately true for lumbar spine and femoral neck BMD measurements.(13) Note that although the above description of the bivariate gaussian model is given in terms of Z-scores, the point on the ROC curve corresponding to a T-score of −2.5 is readily calculated by substituting T = −2.5 in Equation 1.
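The original program is not described in detail, so the following is only a sketch of one way such an integration might be done, with hypothetical function names. It assumes unit-variance bivariate gaussians with |r| < 1, equal Z-score thresholds on both axes, and a simple trapezoid rule; a patient is flagged when either Z-score falls below the threshold, so the flagged fraction is one minus the probability that both Z-scores lie above it.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def upper_quadrant(t, m1, m2, r, n=400):
    """P(Z1 >= t and Z2 >= t) for a unit-variance bivariate gaussian, means (m1, m2)."""
    s = sqrt(1.0 - r * r)
    lo, hi = t, max(t, m1) + 8.0  # the density is negligible beyond hi
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        z1 = lo + i * h
        dens = exp(-0.5 * (z1 - m1) ** 2) / sqrt(2.0 * pi)
        # Conditional probability that Z2 >= t given Z1 = z1.
        cond = 1.0 - PHI.cdf((t - m2 - r * (z1 - m1)) / s)
        total += (0.5 if i in (0, n) else 1.0) * dens * cond
    return total * h


def who_flagged(t, beta1, beta2, r):
    """(general, fracture) fractions with at least one Z-score below t."""
    general = 1.0 - upper_quadrant(t, 0.0, 0.0, r)
    fracture = 1.0 - upper_quadrant(t, -beta1, -beta2, r)
    return general, fracture


def who_auc(rr1, rr2, r, steps=200):
    """Area under the ROC curve traced by sweeping the common threshold."""
    b1, b2 = log(rr1), log(rr2)
    pts = [who_flagged(-7.0 + 14.0 * i / steps, b1, b2, r) for i in range(steps + 1)]
    return sum(0.5 * (y0 + y1) * (x1 - x0) for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


if __name__ == "__main__":
    print(who_auc(2.0, 2.0, 0.0))
```

When r = 0 the general-population fraction has the closed form 1 − (1 − Φ(t))², which gives a quick check on the numerical integration.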

Subjects and measurements for the in vivo study

Data from an in vivo study were used to compare the predictions of the theoretical model described above with the improvements in fracture discrimination found by combining multisite axial transmission SOS measurements in patients with vertebral and Colles' fracture. A Sunlight Omnisense device (Sunlight Medical Ltd., Tel-Aviv, Israel) was used to measure SOS at the radius, phalanx, and metatarsal in a study population that consisted of 110 healthy postmenopausal women, 64 women with atraumatic vertebral fractures, and 31 women with low trauma wrist fractures.(20,23) All patients gave informed consent, and the study was approved by the Local Research Ethics Committee. Logistic regression analysis was used to calculate the age-adjusted odds ratios (ORs) for fracture discrimination. Although the mathematical model used for logistic regression analysis of cross-sectional fracture studies differs slightly from the proportional hazards model used for prospective studies,(24) the resulting ORs are essentially equivalent to relative risks.(25,26) OR values for the combinations of pairs of SOS sites were therefore calculated by taking the weighted mean of the two Z-scores, and the results were compared with the predictions of the theoretical model calculated from the single site OR using Equations 9 and 10.

RESULTS

When the single measurement model was used to calculate the ROC curves for RR values of 1.5, 2.0, and 2.5 (Fig. 2A), the values of the AUC were 0.613, 0.688, and 0.741, respectively (Table 1). At the point on the ROC curve representing the lowest quartile of the general population, the percentage of the fracture population captured was 39%, 51%, and 60%, respectively (Fig. 2A).

Table 1. Calculated Values of the Area Under the ROC Curve

FIG. 2. (A) ROC curves for a single BMD measurement calculated using the method shown in Fig. 1A. The three curves correspond to RR values of 1.5, 2.0, and 2.5. For patients in the lowest quartile of the general population, the ROC curves include, respectively, 39%, 51%, and 60% of patients who will have a fracture. (B) Comparison of ROC curves for two BMD measurements combined using their weighted mean Z-score and calculated using the method outlined in Fig. 1B. The two measurements both have RR = 2.0, and the curves correspond to r = 0, 0.5, and 0.7, respectively. Also shown is the ROC curve for a single measurement for the same RR value. For patients in the lowest quartile of the general population, the ROC curves include, respectively, 62%, 55%, 53%, and 51% of patients who will have a fracture. (C) Comparison of ROC curves calculated for two BMD measurements combined using the weighted mean Z-score approach (Fig. 1B) and the WHO approach (Fig. 1C). The two measurements both have RR = 2.0 and the correlation coefficient r = 0. Also shown is the ROC curve for a single measurement for the same RR value. For patients in the lowest quartile of the general population, the ROC curves include, respectively, 62%, 57%, and 51% of patients who will have a fracture.

For the combination of two BMD measurements using the weighted mean Z-score approach, the simplest case is when the measurements have equal RR values. The greatest improvement in fracture discrimination is obtained for r = 0 when βcomb = √2β (Equation 7). As the correlation coefficient increases to r = 1, the value of βcomb decreases to β, and no improvement is obtained compared with a single measurement. The value of RR corresponding to βcomb can be calculated using the equation RRcomb = exp(βcomb). For RR1 = RR2 = 1.5, the values of RRcomb (and AUC) for r = 0, r = 0.5, and r = 0.7 are 1.77 (0.657), 1.60 (0.630), and 1.55 (0.622), respectively (Table 1). For RR1 = RR2 = 2.0, the equivalent values are 2.67 (0.756), 2.23 (0.714), and 2.13 (0.702), and for RR1 = RR2 = 2.5 the values are 3.65 (0.820), 2.89 (0.773), and 2.71 (0.759). The ROC curves for RR = 2.0 are plotted in Fig. 2B. For the point on the curves representing the lowest quartile of the general population, the percentage of the fracture population captured was 62%, 55%, and 53%, respectively, for r = 0, 0.5, and 0.7 compared with 51% for a single BMD measurement with the same RR value.

When the two BMD measurements have unequal RR values, the angle ϕ for calculating the optimum weighting factors in Equation 8 was found using Equation 10. The angle ϕ was plotted as a function of the correlation coefficient for three different sets of RR values (Fig. 3). When r is less than the critical value rc = β2/β1, both weighting factors are positive, and the combination of the two measurements improves fracture discrimination. However, when r = rc, ϕ = 0 and no additional discrimination is obtained. When r > rc, the weighting factor for the second measurement is negative, and the additional discrimination provided by the second measurement associates an increased fracture risk with an increased BMD. This is the opposite of what is usually observed and will not be considered further here.

FIG. 3. Curves of the angle ϕ used to calculate the optimum weighting factors cosϕ and sinϕ for the combination of two BMD measurements using the weighted mean Z-score approach when the two RR values are unequal. The angle ϕ is shown as a function of the Pearson correlation coefficient r between the two Z-score measurements.

When the weighted mean Z-score approach was analyzed for RR1 = 2.0 and RR2 = 1.5, the values of RRcomb (and AUC) for r = 0 and r = 0.5 were 2.23 (0.715) and 2.01 (0.689), respectively (Table 1). The case for r = 0.7 was not considered because it exceeds the critical value rc = 0.58. For RR1 = 2.5 and RR2 = 2.0, the critical value is rc = 0.76, and the values of RRcomb (and AUC) for r = 0, r = 0.5, and r = 0.7 are 3.16 (0.791), 2.61 (0.750), and 2.51 (0.742), respectively. For RR1 = 2.5 and RR2 = 1.5, the critical value is rc = 0.44, and the value of RRcomb (and AUC) for r = 0 is 2.72 (0.760).

For the combination of two BMD measurements using the WHO approach, the ROC curves were calculated by performing the numerical integration over the two population distributions shown in Fig. 1C. The AUC values obtained are compared with the weighted mean Z-score approach in Table 1. In each instance, slightly better fracture discrimination was obtained using the mean Z-score approach. For equal RR values, the AUC values for the WHO approach lay between those for the mean Z-score approach and a single BMD measurement. The ROC curves for a single BMD measurement and two measurements combined using the mean Z-score and WHO approaches for RR1 = RR2 = 2.0 and r = 0 are plotted in Fig. 2C. For the point on the curves representing the lowest quartile of the general population the percentage of the fracture population captured was 62%, 57%, and 51% for the mean Z-score, the WHO approach, and a single BMD measurement, respectively. For r = 0.5, the corresponding values were 55%, 53%, and 51%.

The results for the WHO approach when the two BMD measurements have unequal RR values are also included in Table 1. As with the case for equal RR values, the weighted mean Z-score approach performed slightly better than the WHO approach. The comparison between the two approaches is summarized in Fig. 4, which shows the AUC plotted as a function of the correlation coefficient for the weighted mean Z-score, and the WHO approaches for examples of equal RR values (RR1 = RR2 = 2.0) and unequal values (RR1 = 2.0, RR2 = 1.5), respectively.

FIG. 4. Comparison of the area under the ROC curve (AUC) as a function of the correlation coefficient r for the combination of two BMD measurements using the weighted mean Z-score and WHO approaches. The AUC values are shown for two measurements with equal RR values of 2.0 and two measurements with unequal RR values of 2.0 and 1.5. In the latter case, there is an upper limit to the correlation coefficient r = β2/β1 set by the requirement that the second measurement provide independent evidence of fracture risk. Also shown are the AUC values corresponding to single BMD measurements with RR values of 1.5, 2.0, and 2.5.

Details of the subjects included in the cross-sectional studies of vertebral and Colles' fracture patients using the Sunlight Omnisense device have been published previously.(20,23) OR values obtained from logistic regression analysis for radius, phalanx, and metatarsal SOS considered singly and then combined in pairs using the optimum weighting factors for the weighted mean Z-score are listed in Table 2. For the paired SOS data, Table 2 also lists the theoretical values of OR calculated using Equation 9. Values for the correlation coefficient r varied between 0.27 and 0.32. The average OR value for the single site measurements was 1.55, for the paired SOS data analyzed using logistic regression was 1.75, and for the paired SOS data predicted using the bivariate gaussian model was 1.74. The latter two figures show good agreement, and the in vivo data confirm the modest improvement in the ORs predicted by the model.

Table 2. Odds Ratios From the In Vivo Study

DISCUSSION

The clinical value of bone densitometry investigations depends on two factors: (1) the ability of the measurements to discriminate patients who will have a fracture from those who will not; and (2) how effectively the scan findings are interpreted. The first factor depends on the RR value of the BMD measurements, which determines the area under the ROC curve. The larger RR, the larger the percentage of future fracture patients identified when treating any given percentage of the population.(18) At present, the best technique is the use of a hip BMD measurement to predict hip fracture risk for which RR = 2.6.(16) The second factor determining the value of BMD scanning is how effectively the findings are interpreted. Here, the prevailing paradigm is the WHO definition of osteoporosis of a T-score ≤ −2.5 at the spine, hip, or forearm. The ROC plot is a useful tool for evaluating the outcome of bone densitometry investigations because the operating point on the curve indicates the overall effectiveness of BMD scans taking into account both the RR of the measurement technique and the way the findings are interpreted.(18)

Intuitively, the combination of two or more different types of measurement should improve the ability of bone densitometry investigations to identify patients at risk of fracture so that a more favorable point is achieved on the ROC curve. Ideally, this means moving the operating point towards the top left of the diagram so more fracture cases are identified and fewer patients treated overall. When more than one measurement is made (for example spine and hip BMD), most clinicians would agree that a diagnosis of osteoporosis should be made if one or other site is below the WHO T-score threshold.(12) However, it is not known whether this widely used practice is the optimum method of using the data in terms of the ROC plot. An alternative is the conventional statistical approach of combining two noisy measurements by taking their mean. If each measurement has the same error, they are given equal weights, and there is an overall gain in signal to noise by a factor of √2. Bone density measurements should certainly be regarded as noisy data, because random accuracy errors caused by the effects of soft tissue composition account for around 50% of the population SD,(27) suggesting that their effect might be reduced by averaging over sites. Therefore, an important objective of this study was to quantify and compare the improvement in fracture discrimination for the WHO and the weighted mean Z-score approaches.

Of the two approaches, it is easier to calculate the ROC curve for the weighted mean Z-score. In this case, the effect of combining two measurements is equivalent to a single BMD measurement with an increased RR value that can be calculated from Equation 7 for the case RR1 = RR2 and from Equations 9 and 10 for the case when RR1 ≠ RR2. The greatest improvement in fracture discrimination is obtained when the two measurements have equal RR values and their correlation coefficient is zero. In this case, βcomb = √2β, and we obtain the √2 factor familiar from basic statistics. However, as the correlation coefficient increases to r = 1, the value of βcomb decreases to β, and there is no gain compared with a single measurement. When two measurements are completely uncorrelated, the improvement in discrimination is substantial. For example, for two measurements where RR1 = RR2 = 2.0, the value of RRcomb is 2.67, which is equivalent to the optimal bone densitometry study using hip BMD to predict hip fracture risk.(16) However, in practice the correlation coefficient between different measurements often lies in the range r = 0.5–0.7,(28) and in this case, the value of RRcomb is substantially less and lies in the range of 2.13–2.23. A good example of the important effect of the correlation coefficient is dual-femur DXA scanning.(29) In this case RR1 = RR2 = 2.6,(16) and r ≈ 0.95,(29) giving RRcomb = 2.63, which represents a negligible improvement. In comparison, for two uncorrelated measurements with RR = 2.6, the RRcomb value would be 3.86. Were it possible to achieve this latter example in practice, it would represent a substantial gain in fracture discrimination.

The case when RR1 ≠ RR2 is more complicated because the weighting factors are unequal. The greatest improvement in fracture discrimination is still obtained when r = 0. In this case tanϕ = β2/β1 (Equation 10), and the two measurements are weighted in proportion to their β values. It is readily shown from Equation 9 that in this case the β value of the weighted combination is βcomb = √(β1² + β2²). As the correlation coefficient increases from zero, proportionally more weight is given to the measurement with the larger RR value, and the contribution of the second measurement becomes zero when rc = β2/β1 (Fig. 3). An example of unequal weights is the use of spine and hip BMD to predict hip fracture risk. The Marshall meta-analysis lists RR values (and 95% CIs) of 1.6 (1.2–2.2) for spine BMD and 2.6 (2.0–3.5) for hip BMD.(16) Assuming RR = 2.6 for hip BMD and a correlation coefficient r = 0.7, the minimum allowed RR value for spine BMD is 1.95. We therefore combined the hip RR figure of 2.6 with the 95% confidence upper limit for spine BMD of 2.2 to obtain RRcomb = 2.64. As with the example of dual-femur scanning discussed above, this represents a negligible improvement.
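The arithmetic behind the spine and hip example can be laid out as a short worked calculation. It is a sketch that uses a closed form for the optimum βcomb, obtained by substituting the optimum angle into the expression for the weighted combination; that closed form is not stated in the original and is introduced here as an assumption for illustration.

```latex
\begin{align*}
\beta_1 &= \ln 2.6 \approx 0.956, \qquad \beta_2 = \ln 2.2 \approx 0.788, \qquad r = 0.7,\\
\mathrm{RR}_{2,\min} &= \exp(r\beta_1) = \exp(0.669) \approx 1.95,\\
\beta_{\mathrm{comb}}^{2} &= \frac{\beta_1^{2} - 2r\beta_1\beta_2 + \beta_2^{2}}{1-r^{2}}
  \approx \frac{0.913 - 1.055 + 0.622}{0.51} \approx 0.941,\\
\mathrm{RR}_{\mathrm{comb}} &= \exp(\beta_{\mathrm{comb}}) \approx \exp(0.970) \approx 2.64 .
\end{align*}
```

The spine RR of 1.6 from the meta-analysis lies below the minimum of 1.95, which is why the upper confidence limit of 2.2 is used in the example.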

The poor correlation between different types of bone densitometry measurement is often a cause for concern because of the potential for conflicting findings between different sites.(18) If two types of measurement were to correlate perfectly, they would identify exactly the same patients for treatment. However, in practice, different types of measurement often correlate poorly, with r ≈ 0.6–0.7 between BMD results from different sites and r ≈ 0.4–0.5 between QUS and BMD measurements.(28) Although the poor correlation between different types of measurement can create difficulties when comparing results, it turns out that the correlations found in practice are rather too good to achieve more than a modest improvement in fracture discrimination by combination of measurements.

In this study, when the weighted mean Z-score approach was compared with the WHO approach, the former was found to perform consistently slightly better in terms of the ROC curve (Table 1). However, the differences are greatest when the correlation coefficient is zero, and in practice, for correlations of r = 0.5–0.7, the differences in the ROC curves between the two approaches were very small.

An important limitation of the analysis presented here is that the results depend on the assumptions that the BMD distributions for the age-matched general population are gaussian and that the fracture risk curves are exponential. We believe that these are sufficiently good approximations to provide useful insights into the effects of combining different types of BMD measurement. Although we were able to show good agreement between the results of the theoretical model presented here and data from a cross-sectional study of fracture patients, it is important that the predictions of the model are compared with data from a prospective fracture study. We note that the conclusions of the present study are consistent with an analysis of data from the Study of Osteoporotic Fractures reported by Genant et al.,(30) who also drew attention to the limited benefit of combining BMD data from different sites.

Two conclusions were drawn from the present study. First, when two different types of BMD measurement are combined, the largest gain in fracture discrimination is obtained when the correlation coefficient between them is zero and they have equal RR values. In this case, with the mean Z-score approach, the exponent β in the exponential curve relating fracture risk to BMD is increased by a factor of √2. In practice, however, the correlation between different measurements and unequal values of RR ensure that the real gains in fracture discrimination are considerably smaller than in this ideal case. Second, when the weighted mean Z-score approach was compared with the WHO approach, the former performed slightly better when judged by the ROC curve, although in practice the clinical effect of this difference was found to be small.
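
The factor of √2 can be checked with a short worked derivation. It is a sketch under the model's assumptions: each Z-score has unit variance in the population, the two Z-scores have correlation r, and an exponential risk curve with exponent β shifts the fracture cases by −β in each Z-score without changing the spread. The symbols \bar{Z} and \beta_{\mathrm{comb}} are introduced here for illustration.

```latex
\begin{align}
\bar{Z} &= \tfrac{1}{2}\,(Z_1 + Z_2), &
\operatorname{Var}(\bar{Z}) &= \tfrac{1}{4}\,(1 + 1 + 2r) = \frac{1 + r}{2}, \\
\operatorname{E}[\bar{Z} \mid \text{fracture}] &= -\tfrac{1}{2}\,(\beta + \beta) = -\beta, &
\beta_{\mathrm{comb}} &= \frac{\beta}{\sqrt{(1 + r)/2}} = \beta \sqrt{\frac{2}{1 + r}} .
\end{align}
```

Setting r = 0 gives \beta_{\mathrm{comb}} = \sqrt{2}\,\beta, so an individual RR of 2.0 per SD would correspond to a combined RR of about 2.67. Setting r = 1 gives \beta_{\mathrm{comb}} = \beta, that is, no gain.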

REFERENCES

  • 1
    Cooper C, Campion G, Melton LJ 1992 Hip fractures in the elderly; a world-wide projection. Osteoporos Int 2:285–289.
  • 2
    Ray NF, Chan JK, Thamer M, Melton LJ 1997 Medical expenditures for the treatment of osteoporotic fractures in the United States in 1995: Report from the National Osteoporosis Foundation. J Bone Miner Res 12:24–35.
  • 3
    Black DM, Cummings SR, Karpf DB, Cauley JA, Thompson DE, Nevitt MC, Bauer DC, Genant HK, Haskell WL, Marcus R, Ott SM, Torner JC, Quandt SA, Reiss TF, Ensrud KE 1996 Randomised trial of the effect of alendronate on risk of fracture in women with existing vertebral fractures. Lancet 348:1535–1541.
  • 4
    Ettinger B, Black DM, Mitlak BH, Knickerbocker RK, Nickelsen T, Genant HK, Christiansen C, Delmas PD, Zanchetta JR, Stakkestad J, Gluer CC, Krueger K, Cohen FJ, Eckert S, Ensrud KE, Avioli LV, Lips P, Cummings SR 1999 Reduction of vertebral fracture risk in postmenopausal women with osteoporosis treated with raloxifene: Results from a 3-year randomized clinical trial. JAMA 282:637–645.
  • 5
    Harris ST, Watts NB, Genant HK, McKeever CD, Hangartner T, Keller M, Chesnut CH, Brown J, Eriksen EF, Hoseyni MS, Axelrod DW, Miller PD 1999 Effects of risedronate treatment on vertebral and nonvertebral fractures in women with postmenopausal osteoporosis. JAMA 282:1344–1352.
  • 6
    McClung MR, Geusens P, Miller PD, Zippel H, Bensen WG, Roux C, Adami S, Fogelman I, Diamond T, Eastell R, Meunier PJ, Reginster J-Y 2001 Effect of risedronate treatment on hip fracture risk in elderly women. N Engl J Med 344:333–340.
  • 7
    Neer RM, Arnaud CD, Zanchetta JR, Prince R, Gaich GA, Reginster J-Y, Hodsman AB, Eriksen EF, Ish-Shalom S, Genant HK, Wang O, Mitlak BH 2001 Effect of recombinant human parathyroid hormone (1–34) fragment on spine and non-spine fractures and bone mineral density in postmenopausal osteoporosis. N Engl J Med 344:1434–1441.
  • 8
    Baran DT, Faulkner KG, Genant HK, Miller PD, Pacifici R 1997 Diagnosis and management of osteoporosis: Guidelines for the utilization of bone densitometry. Calcif Tissue Int 61:433–440.
  • 9
    Kanis JA, Delmas P, Burckhardt P, Cooper C, Torgerson D on behalf of the European Foundation for Osteoporosis and Bone Disease 1997 Guidelines for diagnosis and treatment of osteoporosis. Osteoporos Int 7:390–406.
  • 10
    Royal College of Physicians 1999 Osteoporosis: Clinical Guidelines for Prevention and Treatment. Royal College of Physicians, London, UK.
  • 11
    World Health Organization 1994 WHO Technical Report Series 843. Assessment of Fracture Risk and Its Application to Screening for Postmenopausal Osteoporosis. World Health Organization, Geneva, Switzerland.
  • 12
    National Osteoporosis Society 2002 Position Statement on the Reporting of Dual X-Ray Absorptiometry (DXA) Bone Mineral Density Scans. National Osteoporosis Society, Bath, UK.
  • 13
    Faulkner KG, Von Stetton E, Miller P 1999 Discordance in patient classification using T-scores. J Clin Densitom 2:343–350.
  • 14
    Seeley DG, Browner WS, Nevitt MC, Genant HK, Scott JC, Cummings SR 1991 Which fractures are associated with low appendicular bone mass in elderly women? Ann Intern Med 115:837–842.
  • 15
    Cummings SR, Black DM, Nevitt MC, Browner W, Cauley J, Ensrud K, Genant HK, Palmero L, Scott J, Vogt TM 1993 Bone density at various sites for prediction of hip fractures. Lancet 341:72–75.
  • 16
    Marshall D, Johnell O, Wedel H 1996 Meta-analysis of how well measures of bone mineral density predict occurrence of osteoporotic fractures. BMJ 312:1254–1259.
  • 17
    Altman DG 1991 Practical Statistics for Medical Research. Chapman & Hall, London, UK, pp. 409–418.
  • 18
    Blake GM, Fogelman I 2001 Peripheral or central densitometry: Does it matter which technique we use? J Clin Densitom 4:83–96.
  • 19
    Frost ML, Blake GM, Fogelman I 2001 Does the combination of quantitative ultrasound and dual energy x-ray absorptiometry improve fracture discrimination? Osteoporos Int 12:471–477.
  • 20
    Knapp KM, Blake GM, Spector TD, Fogelman I 2001 Multisite quantitative ultrasound: Precision, age- and menopause-related changes, fracture discrimination and T-score equivalence with dual-energy X-ray absorptiometry. Osteoporos Int 12:456–464.
  • 21
    Trumpler RJ, Weaver HF 1962 Statistical Astronomy. Dover Publications, New York, NY, USA, pp. 42–68.
  • 22
    Altman DG 1991 Practical Statistics for Medical Research. Chapman & Hall, London, UK, pp. 277–282.
  • 23
    Knapp KM, Blake GM, Fogelman I, Doyle DV, Spector TD 2002 Multisite quantitative ultrasound: Colles fracture discrimination in postmenopausal women. Osteoporos Int 13:474–479.
  • 24
    Hui SL, Slemenda CW, Carey MA, Johnston CC 1995 Choosing between predictors of fractures. J Bone Miner Res 10:1816–1822.
  • 25
    Stegman MR, Recker RR, Davies KM, Ryan RA, Heaney RP 1992 Fracture risk as determined by prospective and retrospective study designs. Osteoporos Int 2:290–297.
  • 26
    Zhang J, Yu KF 1998 What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA 280:1690–1691.
  • 27
    Svendsen OL, Hassager C, Skodt V, Christiansen C 1995 Impact of soft tissue on in-vivo accuracy of bone mineral measurements in the spine, hip and forearm: A human cadaver study. J Bone Miner Res 10:868–873.
  • 28
    Grampp S, Genant HK, Mathur A, Lang P, Jergas M, Takada M, Gluer C-C, Lu Y, Chavez M 1997 Comparisons of non-invasive bone mineral measurements in assessing age-related bone loss, fracture discrimination and diagnostic classification. J Bone Miner Res 12:697–711.
  • 29
    Mazess RB, Nord RH, Hanson JA, Barden HS 2000 Bilateral measurements of femoral bone mineral density. J Clin Densitom 3:133–140.
  • 30
    Genant HK, Lu Y, Mathur AK, Fuerst TP, Cummings SR 1996 Classification based on DXA measurements for assessing the risk of hip fractures. J Bone Miner Res 11:S1;S120.