Abstract

Background  One of the challenges of health-related quality of life research is to translate statistically significant health-related quality of life changes into interpretable, clinically or medically important ones.

Objective  To calculate the minimal important difference of the King's Health Questionnaire, a condition-specific health-related quality of life questionnaire for the assessment of men and women with lower urinary tract dysfunction.

Methods  The King's Health Questionnaire was administered to patients suffering from overactive bladder enrolled in two multinational studies. Minimal important differences were calculated using an anchor-based approach with both a global rating of patient-perceived treatment benefit and one of perceived disease impact. A distribution-based method using effect size was calculated for comparison purposes.

Results  Minimal important difference values varied slightly with each method. Using the anchor-based approach, the King's Health Questionnaire minimal important difference ranged between 5–10 points when the calculation factored out patients who reported no change and 6–12 points for patients who experienced a small improvement. The effect size method indicated a minimal important difference of 5 to 6 points for a small effect and 10 to 15 points for a medium effect.

Conclusions  In the case of the King's Health Questionnaire, the anchor-based approaches and the distribution-based approach provide similar results. A change from baseline of at least 5 points on King's Health Questionnaire domains indicates a change that is meaningful to patients and is indicative of a clinically meaningful improvement in health-related quality of life after treatment. Convergence of the estimates using different approaches should give us confidence in the values derived for the quality of life domains measured by the King's Health Questionnaire.


INTRODUCTION

Health-related quality of life is increasingly being considered an essential endpoint in therapeutic assessments and is useful in the management of patients with lower urinary tract dysfunction. Understanding how to measure and interpret health-related quality of life research findings is important in both clinical research and practice. Because health-related quality of life is a multidimensional, subjective construct, interpretation of data collected using these measures is difficult. Interpretation of the significance to patients of improvement in health-related quality of life scores is not well understood and reliance on health-related quality of life improvement as a primary outcome measure in clinical practice is often viewed with caution. Although the statistical significance of changes in health-related quality of life is reported, it has not been easy to place the degree and importance of these changes in a context that is meaningful for both patients and health care providers.1–7 Therefore, one of the challenges in using health-related quality of life instruments is the ability to translate statistically significant changes into clinically or medically important ones.8 Ideally, investigators, regulatory agencies and ultimately practitioners would like to answer the question, “How much change in health-related quality of life is enough to evaluate the treatment or to consider one treatment better than another?”2,3,9,10

Several terms have been used in connection with clinically meaningful change in health-related quality of life, including minimal important difference, minimal clinically important difference, minimal patient perceivable deterioration and minimal patient perceivable improvement. All these terms refer to the smallest (or ‘minimal’) change in a health-related quality of life score that is considered meaningful (or ‘important’) by either a clinician or a patient.1,4,8,9,11,12 The term ‘minimal’ refers to the threshold below which individual variation and measurement ‘noise’ are likely explanations for the difference. ‘Important’ addresses the “How much is enough?” question and refers to change that has an effect on clinical decision-making, for instance, continuing an effective treatment regimen or modifying an ineffective one. Although minimal important difference has statistical roots, it is not a statistical test. It is better described as an aid or guide for interpretation of health-related quality of life results.

The determination of medical importance depends on the perspective used and the evaluation criteria applied. The perspective of the interpreter addresses the “Who says so?” question. Perspectives include those of the society, the institution, the payer, the health care provider and the patient.13 The criteria against which importance is assessed may depend on the perspective. For example, patients are often concerned about the social and physical impacts of their disease; physicians about the physical aspects of the disease and its symptoms and also the concerns of their patients; and payers are interested in the resource use and economic impacts associated with treatment interventions, morbidity and mortality.

Several methods are used to determine minimal important differences4–6,14–16 and a ‘gold standard’ approach has not yet been agreed. Because the use of different perspectives, criteria or methods can yield different results, it seems prudent to use multiple methods to estimate the clinically meaningful change. Two methods are commonly used to calculate the minimal important difference of health-related quality of life questionnaires: anchor-based and distribution-based approaches.7 The anchor-based approach uses an external criterion or ‘anchor’ with which to compare improvements in health-related quality of life domain scores. This anchor is frequently a global or overall question measuring wellbeing or treatment effect. The distribution-based approach depends entirely on the distribution properties of the sample and focuses on interpreting the results in terms of the relationship between the magnitude of change and the variability.

In this article, both methods were used to estimate the minimal important difference for the King's Health Questionnaire.17 Two large-scale clinical trials using the King's Health Questionnaire in patients with overactive bladder characterised by frequency and urgency with and without urge incontinence were chosen as data sources.18,19 Results using the two different methods were compared, providing a real-life example of how these estimates might vary.

METHODS

The King's Health Questionnaire is a condition-specific health-related quality of life instrument for the assessment of patients with lower urinary tract conditions including overactive bladder, stress incontinence and voiding difficulties. Although initially validated for use in female study populations, the measure has demonstrated reliability and validity in both males and females, has been used in a number of large multinational studies and numerous smaller studies and is available in 34 linguistic and culturally validated translations.17,20–25 Although the King's Health Questionnaire also includes a symptom bother score, the analyses were restricted to the health-related quality of life domains. While the King's Health Questionnaire has been used extensively in clinical trials and is also used in clinical practice, there is no published guidance on the difference in King's Health Questionnaire scores that are perceptible to the patient and considered clinically meaningful (i.e. the minimal important difference threshold). The reader should note that negative changes reported here correspond to an improvement in health-related quality of life, as all King's Health Questionnaire domains are scored on a 0 (best) to 100 (worst) scale.

This assessment used data from two multinational randomised clinical trials assessing the efficacy and safety of tolterodine among overactive bladder patients. Study 1 was a randomised, parallel group, placebo-controlled, double-blind, multinational, multicentre, 12-week trial designed to compare the clinical efficacy and safety of tolterodine with placebo in the treatment of overactive bladder in a total of 1529 patients.19 Study inclusion criteria required patients to have an average daily micturition frequency of eight over a week-long observation period, and at least five urge incontinence episodes per week (verified by charting before randomisation). Tolterodine proved clinically efficacious, safe and well tolerated.18,19 Study 2 was a randomised, parallel group, multinational, multicentre, naturalistic, six-month, open-label study in a total of 827 patients. Inclusion criteria required patients to have urinary urgency, symptoms of urinary frequency with or without urge incontinence or symptoms of overactive bladder for six months or longer. Both tolterodine and oxybutynin proved safe and efficacious in the study population (data on file). Although the groups differed in other ways, few between-group differences were observed on the King's Health Questionnaire domains in Study 2, supporting the decision to combine the tolterodine and oxybutynin treatment groups for the minimal important difference analyses.

The anchor-based approach employed a patient-reported global change rating method using two criterion variables or ‘anchors’ (treatment benefit and disease impact) and data from two studies (Study 1 and Study 2). One of the anchors (treatment benefit) was included in both studies and the other (disease impact) was unique to Study 1. The minimal important difference was determined using a modification of a technique previously used by Juniper et al.1 In this approach, the mean King's Health Questionnaire score is calculated for two subsets of the patients: those patients who report ‘no change’ and those who report a small improvement on the anchor. The values for patients who report ‘no change’ are used as an indicator of measurement ‘noise’, whereas the values for patients who report a small improvement are informative as to the point of separation between ‘noise’ and meaningful results. These minimal important difference results for the anchor-based methods are reported in two ways:

  • As the difference in mean change from baseline King's Health Questionnaire scores for those patients reporting a one-scale point improvement (small improvement or SI) in treatment benefit or disease impact, and those indicating no change (NC) in treatment benefit or disease impact (SI − NC),
  • As those reporting a one-scale point improvement (SI) in treatment benefit or disease impact.

For the treatment benefit anchor, patients were asked, “Have you had any benefit from your treatment?” The response options available were: (1) no benefit; (2) yes, a ‘little’ benefit; and (3) yes, ‘much’ benefit. Study 1 also used a global question of perceived impact of the bladder condition (‘disease impact’) on health-related quality of life at baseline (Visit 2) and after 12 weeks (Visit 4) of treatment with a six-point rating scale. Patients were asked, “Which of the following statements describes your bladder condition best at the moment?” Response options included “My bladder condition: (1) does not cause me any problems at all; (2) causes me some very minor problems; (3) causes me some minor problems; (4) causes me (some) moderate problems; (5) causes me severe problems; and (6) causes me many severe problems.” A movement of one scale point on either anchor was defined as a small improvement. Fewer than 10% of the population reported a decline in their bladder condition on the disease impact anchor from baseline to the end of the study. Given this small number and the considerable variability in King's Health Questionnaire scores among those reporting a decline, there were too few patients for minimal important difference analyses of decline. Therefore, our analyses were based on patients who reported improvements.

Analyses utilised baseline and end-of-treatment King's Health Questionnaire scores from data pooled across countries for each study. Validity and reliability of the language versions were established prior to pooling data and analyses were conducted without regard for translation.22 Analyses were conducted using least square mean (LS mean) change from baseline King's Health Questionnaire scores calculated using a split-plot analysis of covariance, with age and gender as covariates and with baseline scores included in the model. A mean King's Health Questionnaire score for each group (for both NC and SI) was calculated. The mean King's Health Questionnaire score for those reporting no change was subtracted from the mean King's Health Questionnaire score for those who reported a small improvement (SI − NC). Analyses were conducted on available data for all patients combined.
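The SI and SI − NC calculations described above can be illustrated with a minimal Python sketch. It is not the authors' analysis: it uses simple unadjusted means rather than LS means from the analysis of covariance, and the function name, the group labels and the example change scores are all hypothetical. Negative changes denote improvement, and results are reported as absolute values.

```python
from statistics import mean


def anchor_based_mid(records):
    """Return the SI and SI - NC estimates for one questionnaire domain.

    records: list of (anchor_group, change_from_baseline) pairs, where
    anchor_group is "SI" (small improvement) or "NC" (no change).
    Other anchor groups are ignored. Unadjusted means are an assumption
    of this sketch; the article used LS means from an ANCOVA model.
    """
    si = [change for group, change in records if group == "SI"]
    nc = [change for group, change in records if group == "NC"]
    if not si or not nc:
        raise ValueError("need at least one SI and one NC patient")
    si_mean = mean(si)
    nc_mean = mean(nc)
    # Scores run from 0 (best) to 100 (worst), so improvement is negative;
    # absolute values are returned for readability.
    return {"SI": abs(si_mean), "SI_minus_NC": abs(si_mean - nc_mean)}


# Hypothetical change-from-baseline scores, for illustration only.
example_records = [
    ("NC", -5.0), ("NC", 0.0), ("NC", -4.0),
    ("SI", -10.0), ("SI", -15.0), ("SI", -11.0),
]
example_mid = anchor_based_mid(example_records)
print(example_mid)
```

In this made-up example the ‘no change’ group still improves by 3 points on average, which is the measurement ‘noise’ that the SI − NC definition factors out.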

Although several methods are in use,4–6,14–16 the most common distribution-based method is to compare two subgroups at a point in time or measure the change over time in one group to the standard deviation at baseline (i.e. concept of effect size).4,5,14 The effect size is entirely dependent on the within-group standard deviation at baseline, and therefore may vary widely with different samples taken from the same population.26 Although the effect size method is a subject of debate as to whether it is more of a measure of responsiveness than a minimal important difference measure, it was selected for inclusion in this study because it is a common, well-known measure and familiar to clinicians.7,27

Usually effect size is calculated to quantify the observed result into a small, medium or large effect size. Conventional benchmarks established by Cohen are used to interpret group effect sizes.28 An effect size between 0.20 (0.20 of a standard deviation) and 0.50 is considered to be small, between 0.51 and 0.80 to be medium and an effect size of greater than 0.80 to be large. Thus, we used 0.20 standard deviations and 0.50 standard deviations to determine the minimal change score that would be needed to achieve a small and medium effect size, respectively. This approach is generally consistent with recent use by other instrument developers.29 The standard deviation was calculated at baseline for all patients regardless of treatment assignment.
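As a rough illustration of the distribution-based calculation, the following Python sketch multiplies a baseline standard deviation by 0.20 and 0.50. The function name and the example baseline scores are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not specify which estimator was used.

```python
from statistics import stdev


def effect_size_thresholds(baseline_scores, small=0.20, medium=0.50):
    """Change scores needed for a small and a medium effect size.

    baseline_scores: domain scores at baseline for all patients,
    regardless of treatment assignment.
    Uses the sample standard deviation (an assumption of this sketch).
    """
    sd = stdev(baseline_scores)
    return {"sd": sd, "small": small * sd, "medium": medium * sd}


# Hypothetical baseline scores on the 0 to 100 scale, for illustration only.
example_baseline = [0.0, 25.0, 50.0, 75.0, 100.0]
example_thresholds = effect_size_thresholds(example_baseline)
print(example_thresholds)
```

Because both thresholds are fixed multiples of the same standard deviation, the medium-effect threshold is always 2.5 times the small-effect threshold for a given sample.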

RESULTS

Anchor-based approach

Treatment benefit anchor (“Have you had any benefit from your treatment?”)

Table 1 shows the mean change from baseline King's Health Questionnaire scores for all patients (in both studies) reporting a small improvement (SI) in treatment benefit and the difference between SI and the mean change from baseline King's Health Questionnaire scores for all patients reporting no change (NC) or SI − NC for the treatment benefit anchor. Patients reporting a SI showed changes of about 4 to 13 points on the King's Health Questionnaire in Study 1, with the exception of the General Health Perceptions domain, and about 2 to 12 points in Study 2. Patients who reported NC in treatment benefit actually demonstrated slight improvements in King's Health Questionnaire scores (with the exception of General Health Perceptions), which varied across domains but were generally less than 5 points. Because the patient did not recognise these improvements, they were factored out by subtracting them from the mean change from baseline King's Health Questionnaire scores for patients reporting SI in treatment benefit in the SI − NC approach. Using the SI − NC approach, patients showed changes of about 5 to 10 points on the King's Health Questionnaire for all patients in Study 1 (with the exceptions of the Personal Relationships and General Health Perceptions domains, which showed changes of less than 5 points), and between 4 and 10 points in Study 2 (with the exceptions of Personal Relationships and Symptom Severity domains, which showed changes of less than 2 points).

Table 1.  Summary of results using different methods to calculate the minimal important difference for the King's Health Questionnaire.

                            Treatment benefit anchor*+          Disease impact    Effect size+,#                      P = 0.05
                                                                anchor*+
                            SI                SI − NC           SI       SI − NC  Small effect      Medium effect
Domain                      Study 1  Study 2  Study 1  Study 2  Study 1  Study 1  Study 1  Study 2  Study 1  Study 2  Study 1  Study 2
Incontinence impact         12.86    11.11    9.52     5.62     14.64    7.91     5.03     5.48     12.58    13.71    1.67     2.39
Role limitations            12.59    11.70    7.77     7.48     15.13    6.29     6.11     6.28     15.28    15.71    1.77     2.27
Physical limitations        12.08    8.78     10.36    5.02     14.27    7.67     6.28     6.13     15.71    15.33    1.73     2.21
Social limitations          7.39     5.63     6.87     2.08     7.62     4.44     5.73     5.78     14.34    14.46    1.35     1.92
Personal relationships      3.57     6.75     1.97     1.41     3.98     2.12     6.07     6.56     15.17    16.41    1.94     2.51
Emotions                    8.33     9.27     6.16     9.13     9.72     5.15     6.01     5.79     15.03    14.47    1.44     1.87
Sleep and energy            8.33     10.72    6.06     9.60     8.99     4.91     5.45     5.62     13.62    14.06    1.36     1.77
Severity measures           7.61     7.08     5.47     3.65     9.52     4.49     4.99     5.15     12.47    12.89    1.25     1.64
General health perceptions  0.82     2.33     2.51     5.23     0.50     3.05     4.28     3.96     10.70    9.91     1.01     1.27
Symptom severity            7.76     2.37     5.46     1.47     9.66     5.48     2.61     3.10     6.53     7.75     0.82     1.15

  • * Absolute value using treatment benefit anchor, results using the disease impact anchor are not reported here because the anchor was only used in one study (Study 1).
  • + Population = all patients regardless of treatment group.
  • # Minimum difference between baseline and follow-up needed to achieve an effect size of 0.2 or 0.5.
  • P = 0.05: minimum mean change from baseline to follow-up (all patients combined) needed for statistical significance at P = 0.05.
  • SI = small improvement; NC = no change; ES = effect size; SD = standard deviation.

Disease impact anchor (“Which describes your bladder condition best at the moment?”)

Table 1 also shows the mean change from baseline King's Health Questionnaire scores for all Study 1 patients reporting a small improvement (SI) on the disease impact anchor, and the difference between SI and NC (SI − NC) for the disease impact anchor. Patients reporting a SI showed changes of about 3 to 15 points on the King's Health Questionnaire with the exception of the General Health Perceptions domain. Using the SI − NC approach, patients showed changes of about 4 to 8 points on the King's Health Questionnaire for all patients (with the exceptions of the Personal Relationships and General Health Perceptions domains, which showed changes of about 3 points or less).

Distribution-based approach

Table 1 shows the 0.20 (small effect) and 0.50 (medium effect) standard deviation results for all patients for each of the two studies. A 0.20 standard deviation criterion would place the minimal important difference in the range of 4 to 6 for all domains except Symptom Severity, where the minimal important difference was in the range of 2 to 3. Similarly, a 0.50 standard deviation criterion that would yield a medium effect size would place the minimal important difference in the range of 10 to 15 (6 to 8 for Symptom Severity).

DISCUSSION

A major criticism of objective assessment of treatment outcomes for overactive bladder has been the inability to relate statistically significant improvement in micturition frequency and urgency to significant improvement in health-related quality of life for treated patients. As a result, health-related quality of life assessment has become a standard in the assessment of patients with lower urinary tract dysfunction in clinical trials and to some extent also in clinical practice. Unfortunately, the difference between statistically significant improvement and clinically relevant improvement also applies to improvement in health-related quality of life scores. Frequently asked questions are “What do the results of health-related quality of life assessment mean?”, “What is relevant to the patient rather than the statistician?” and “Can health-related quality of life be simplified?” The unifying concept of all of these questions is the intangible nature of health-related quality of life measurement and the conceptual difficulty of basing clinical rationale and decision making on the results of patient-completed questionnaires. To improve the value of health-related quality of life assessment, we need to know not only that health-related quality of life improves, but also what that improvement means in real terms to the patient. The clinical significance of changes in health-related quality of life scores, the question of “How much is enough?”, is particularly important for conditions that affect the quality rather than the quantity of patients' lives, and whose treatments are chosen for their effect on health-related quality of life and are unlikely to prolong life.

How much is enough?

Anchor-based approach

The two trials reported here provided an opportunity to compare the use of two different anchors, treatment benefit and disease impact, in the same study, and to compare the use of the same anchor in two different studies with a similar overactive bladder patient population. In addition, they allowed the comparison of the SI and the SI − NC definitions of minimal important difference. Although somewhat variable across domains, the minimal important differences for the two studies, calculated using anchor-based methods in all patients, are in the range of 5 to 10 King's Health Questionnaire scale points for the SI − NC definition, in which the mean score of patients reporting NC is removed, and 6 to 12 King's Health Questionnaire scale points for the SI definition, which relies on the change in patients who report a SI. Patients demonstrating less than this amount of change on the King's Health Questionnaire did not reach a minimal important difference, based on their reports of the global ratings of change. However, smaller differences may also be important, as some results suggest that changes of 2, 3 or 4 points may be noticed by patients and considered important. These findings also varied across the studies, which may relate to the different study durations (12 weeks vs 6 months).

The anchor-based approach can be criticised for a variety of reasons. Change scores may be biased because patients under-estimate their health-related quality of life at a prior visit or provide retrospective change estimates that are highly correlated with their health-related quality of life at the current visit.26 Therefore, retrospective computation based on global measures of change may yield change values that are larger than the values derived by directly calculating average treatment-induced change.30 Our assessment is not exempt from these criticisms because it used one global question that required the patient to retrospectively determine treatment benefit and one measure that asked patients to rate their bladder condition at two time points. However, these two approaches yielded similar results, suggesting that bias was minimal in this assessment. Norman et al.31 also noted the possibility that ‘where you start matters’ in that changes at one end of a measure's scale may not be the same as changes at an opposite end. Such variation relates in part to scales being non-linear. In an effort to minimise this bias, an analysis of covariance model with baseline values was used to calculate the mean change scores for the anchor-based approach. The resulting LS mean scores are therefore less subject to this criticism.

It is also possible that the number of response options on the anchor-based question, its comprehensiveness and how conceptually similar it is to other such questions will affect minimal important difference estimates. For example, Juniper's sentinel study on minimal important difference in asthma used four global questions to classify patients as improved or worsened: “Since your last clinic visit, has there been any change in activity limitation/symptoms/emotions/overall quality of life, related to your asthma?”1 However, in the current study, a single global question was used to calculate the minimal important difference. It is therefore possible that the use of very ‘global’ questions may yield larger minimal important difference estimates than would anchors that were conceptually a ‘closer fit’ with the individual domains of health-related quality of life being measured. Also, the treatment benefit anchor produced slightly larger SI − NC values than the disease impact anchor, which is likely due to the larger number of response categories for the latter anchor. It may be that SI − NC as a threshold is dependent on the number of scale points and that Juniper and associates did well to use plus or minus 7 points in their study.

Whether to use a SI-derived minimal important difference value or an adjusted (SI − NC) value is a consideration infrequently addressed in research. If one uses the SI-only value, a second question, whether to use the value derived from only treated patients experiencing an improvement, needs to be considered. Although not shown in the tables, we calculated the minimal important differences for treated patients and for placebo patients. In the case of these trial data, the magnitude of the calculated SI-derived minimal important difference from treated patients can be twice that from the SI − NC placebo group. One might be best served to use the estimates that most closely resemble the hypothesis—an SI-derived minimal important difference in treated-only patients for intra-individual level gains due to active treatment, or the minimal important difference calculated as the SI − NC difference derived from all patients when evaluating between-group mean treatment effects such as would be calculated from a clinical trial.

Distribution-based approach

Effect size estimates were generally consistent between the two studies and were larger in treated patients than in placebo patients. The minimum change from baseline in King's Health Questionnaire scores needed to give a small effect size was about 5 to 6 points, with the exception of the General Health Perceptions and Symptom Severity domains, where it was 3 to 4 points. For a medium effect size, these values were approximately 12 to 15 points, and 6 to 11 points, respectively.

Effect size has been criticised because results vary by the sample used. Other distribution approaches, such as the standard error of measurement, are considered to be independent of the sample distribution.5 These distribution methods appear to be very objective and the separation of the minimal important difference estimate from the sample characteristics is appealing. However, these approaches are dependent on the number of response levels and on the construction of the measure to assure items are carefully balanced to reflect the importance that patients and providers place on the aspects being measured for each patient population. This is a difficult hurdle to overcome, requiring considerable developmental burden and techniques not routinely used in the development of current health-related quality of life measures.

Who says so?

The anchor-based approach is clearly patient-centred and grounded in aspects that are expected to influence treatment choices. In contrast, it is unclear whether the change measured by distribution methods is of importance to the patient.6 A small effect size may not have any particular relevance to the individual patient, but it provides valuable information about the responsiveness of the instrument and about populations of patients. However, the consistency between the anchor-based methods and the distribution-based method suggests not only a consistency in the minimal important difference ranges but helps to provide additional confidence in the construction of the King's Health Questionnaire.

The use of patient populations—a central part of the ‘who says so?’ issue—raises the question of which populations should serve as the source of the minimal important difference. Populations on open-label active treatment may rate their experience differently from those in a blinded study. Also, the inclusion of placebo-treated patients in the calculation of change scores may add additional variability and may not generate a minimal important difference value that is an appropriate standard in clinical trials comparing two or more active treatments.

Study limitations

The two studies contributing data to these analyses differed in several ways, including differing study lengths (3 months vs 6 months), differing designs (placebo-controlled, double-blind vs parallel group, naturalistic) and differing controls (placebo or active). The difference in study length is important, as recent work has indicated that the social and psychological domains of the King's Health Questionnaire have a longer response profile than the physical domains.23 Study 2 was an open-label study with more lenient inclusion criteria and included patients with less severe symptoms, as its inclusion criteria allowed enrolment of patients without urge incontinence and did not specify a number of urge incontinence episodes as in Study 1. In addition, Study 1 utilised a placebo control group, while Study 2 compared active treatments. The separation of the placebo and treated patients in Study 1 allowed the authors to gain insight into the effect of treatment differences on the minimal important difference, while the inclusion of an active comparator in Study 2 was a notable difference as well. Despite some significant study design differences, the calculated minimal important difference values were similar between studies.

Unlike the Juniper method, this study did not consider patients who declined, either because too few were available for analysis or because the anchor question had too few response options to capture a decline. Barring these limitations, one could repeat the analysis and calculate the minimal important difference for a decline in health-related quality of life. Also, the impact of including the true decliners in the NC group on the magnitude of the NC difference is unclear.

Recommendations

As we saw in our assessments, minimal important differences may vary somewhat between patient samples and by methods selected. Therefore, we recommend calculating the minimal important differences using multiple methods. We also recommend that at least one of the methods be an anchor-based approach so that the patient perspective is well represented. When several methods yield similar values, such as we found in our assessments, confidence is increased in the results.

In terms of using the King's Health Questionnaire in clinical practice, our findings provide confidence that a change of 5 or more points on the King's Health Questionnaire domains is an indication of an important effect at the patient level, although smaller changes in the Symptom Severity checklist are important to patients. Interestingly, these findings are consistent with other work that shows that minimal important differences on 100-point scales are typically 5 points. For example, the developers of the Short Form-36 (SF-36) suggest that a 5-point (5%) change is associated with future resource use and is therefore clinically meaningful.32 A 0.5-point change on a 7-point scale is considered to be a meaningful difference on the Asthma Quality of Life Questionnaire1 as is a 2 to 5 point change on a 100-point scale by the developers of the Incontinence Quality of Life Instrument.33

Although the research goal was to address the question of “How much is enough?” there is no single right answer to this question. While we provide a range of values that can be used with increased confidence, the value that you select depends on how you use it. At the individual patient level, the minimal important difference must be selected by taking into account the side effects and costs of an intervention. For example, if the intervention is very costly and/or the side effects are severe, the treatment benefit assessed with a health-related quality of life instrument would have to be large (as would the minimal important difference) for the patient to have a net benefit.31 In making between-group comparisons, such as in clinical trials, it may be appropriate to require treatment groups to be statistically different and the difference between the mean values for the groups to exceed the SI − NC value. Alternatively, the proportion of patients in each group who exceed the minimal important difference threshold, defined either as the SI − NC or the SI for anchor-based methods, or minimal important differences estimated with distribution-based methods, should be evaluated. In this approach, the specific minimal important difference value is less critical and the data are presented in a way that may be more informative for decision makers.
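The proportion-based alternative can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the function name, the group data and the 5-point threshold are illustrative, and improvement is assumed to be coded as a positive number of points.

```python
def responder_proportion(improvements, mid):
    """Share of patients whose improvement meets or exceeds the MID threshold."""
    if not improvements:
        raise ValueError("no patients in group")
    return sum(1 for x in improvements if x >= mid) / len(improvements)

# Hypothetical per-patient improvements (points) in two treatment groups
group_a = [12, 7, 3, 9, 0, 5, 15, 4]
group_b = [2, 6, 1, 0, 4, 8, 3, 5]
mid = 5  # illustrative threshold; could be SI, SI - NC or a distribution-based value

prop_a = responder_proportion(group_a, mid)
prop_b = responder_proportion(group_b, mid)
```

Repeating the calculation over several candidate thresholds shows how sensitive the between-group comparison is to the particular minimal important difference chosen.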

The minimal important differences reported here are intended to be informative—the actual threshold values to be used will depend on the specific circumstances. However, because we found good convergence on the minimal important difference estimates using data from clinical trials and with multiple methods, we have confidence that when using the King's Health Questionnaire in clinical practice, a change in King's Health Questionnaire domain scores of 5 points or more is a good indication that the treatment is working and providing meaningful improvements in health-related quality of life.

CONCLUSIONS

Using data from two different clinical studies and employing multiple methods to calculate the King's Health Questionnaire's minimal important difference yielded similar results, despite minor variability depending on the population and method used for the minimal important difference calculation. A general convergence of findings using multiple methods supports the internal validity of these results and provides additional confidence in the validity of the King's Health Questionnaire. Unfortunately, there is no single right answer to the question of “How much is enough?” However, the data and experience suggest that domain gain scores of 5 or higher for most of the King's Health Questionnaire domains are likely to represent the minimum threshold values considered clinically meaningful.

Acknowledgements

The authors would like to thank the Tolterodine Study Group (Philip van Kerrebroeck, Karl Kreder, Udo Jonas, Norm Zinner and Alan Wein) and Terra Slaton, MS, for her assistance in preparing this manuscript.

References

  • 1
    Juniper EF, Guyatt GH, Willan A, Griffith LE. Determining a minimal important change in a disease-specific quality of life questionnaire. J Clin Epidemiol 1994;47(1):81–87.
  • 2
    Jaeschke R, Guyatt GH, Keller J, Singer J. Interpreting changes in quality-of-life score in N of 1 randomized trials. Control Clin Trials 1991;12(Suppl 4):226S–233S.
  • 3
    Leidy NK, Revicki DA, Geneste B. Recommendations for evaluating the validity of quality of life claims for labeling and promotion. Value Health 1999;2(2):113–127.
  • 4
    Hays RD, Woolley JM. The concept of clinically meaningful difference in health-related quality of life research. How meaningful is it? Pharmacoeconomics 2000;18(5):419–423.
  • 5
    Wyrwich KW, Nienaber NA, Tierney WM, Wolinsky FD. Linking clinical relevance and statistical significance in evaluating intra-individual changes in health-related quality of life. Med Care 1999;37(5):469–478.
  • 6
    Juniper EF. Quality of life questionnaires: does statistically significant = clinically important? J Allergy Clin Immunol 1998;102(1):16–17.
  • 7
    Guyatt GH, Osoba D, Wu AW, Wyrwich KW, Norman GR. Methods to explain the clinical significance of health status measures. Mayo Clin Proc 2002;77(4):371–383.
  • 8
    Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials 1989;10(4):407–415.
  • 9
    Barber BL, Santanello NC, Epstein RS. Impact of the global on patient perceivable change in an asthma specific QOL questionnaire. Qual Life Res 1996;5(1):117–122.
  • 10
    Rector TS, Tschumperlin LK, Kubo SH, et al. Clinically significant improvements in the living with heart failure questionnaire score as judged by patients with heart failure. Qual Life Res 1994;3:60–61.
  • 11
    van Walraven C, Mahon JL, Moher D, Bohm C, Laupacis A. Surveying physicians to determine the minimal important difference: implications for sample-size calculation. J Clin Epidemiol 1999;52(8):717–723.
  • 12
    Redelmeier DA, Guyatt GH, Goldstein RS. Assessing the minimal important difference in symptoms: a comparison of two techniques. J Clin Epidemiol 1996;49(11):1215–1219.
  • 13
    Osoba D, Rodrigues G, Myles J, Zee B, Pater J. Interpreting the significance of changes in health-related quality of life scores. J Clin Oncol 1998;16(1):139–144.
  • 14
    Lydick E, Epstein RS. Interpretation of quality of life changes. Qual Life Res 1993;2(3):221–226.
  • 15
    Crosby RD, Kolotkin RL, Williams GR. Defining clinically meaningful change in health-related quality of life. Value Health 2001;4(2):178.
  • 16
    Jacobson NS, Truax P. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 1991;59(1):12–19.
  • 17
    Kelleher CJ, Cardozo LD, Khullar V, Salvatore S. A new questionnaire to assess the quality of life of urinary incontinent women. Br J Obstet Gynaecol 1997;104(12):1374–1379.
  • 18
    Chancellor M, Freedman S, Mitcheson HD, Antoci J, Primus G, Wein A. Tolterodine, an effective and well tolerated treatment for urge incontinence and other overactive bladder symptoms. Clin Drug Invest 2000;19(2):83–91.
  • 19
    van Kerrebroeck P, Kreder K, Jonas U. Tolterodine once-daily: superior efficacy and tolerability in the treatment of overactive bladder. Urology 2001;57(3):414–421.
  • 20
    Kelleher CJ, Cardozo LD, Toozs-Hobson PM. Quality of life and urinary incontinence. Curr Opin Obstet Gynecol 1995;7(5):404–408.
  • 21
    Kobelt G, Kirchberger I, Malone-Lee J. Review. Quality-of-life aspects of the overactive bladder and the effect of treatment with tolterodine. BJU Int 1999;83(6):583–590.
  • 22
    Reese PR, Pleil AM, Okano GJ, Kelleher CJ. Multinational study of reliability and validity of the King's Health Questionnaire in patients with overactive bladder. Qual Life Res 2003;12:427–442.
  • 23
    Okano GJ, Pleil AM, Reese PR, Kelleher CJ. Effects of long-term tolterodine treatment on physical and symptom aspects of health-related quality of life in overactive bladder patients. Value Health 2002;5(3):278.
  • 24
    Pleil AM, Reese PR, Okano GJ, Kelleher CJ. Validation of King's Health Questionnaire in patients with symptoms of overactive bladder. Qual Life Res 2000;9(3):347.
  • 25
    Kelleher CJ, Reese PR, Pleil AM, Okano GJ. Health-related quality of life of patients receiving extended-release tolterodine for overactive bladder. Am J Manag Care 2002;8(19):S608–S615.
  • 26
    Wyrwich KW, Wolinsky FD. Identifying meaningful intra-individual change standards for health-related quality of life measures. J Eval Clin Pract 2000;6(1):39–49.
  • 27
    Samsa G, Edelman D, Rothman ML, Williams GR, Lipscomb J, Matchar D. Determining clinically important differences in health status measures: a general approach with illustration to the Health Utilities Index Mark II. Pharmacoeconomics 1999;15(2):141–155.
  • 28
    Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd edition. Hillsdale, New Jersey: Lawrence Erlbaum Associates, 1988.
  • 29
    Mangione C, Lee PP, Coleman AL, Shapiro MF, Berry S, Keeler E. Responsiveness of the NEI VFQ-25 to Cataract Surgery. 13 November 2000.
  • 30
    Norman GR, Stratford P, Regehr G. Methodological problems in the retrospective computation of responsiveness to change: the lesson of Cronbach. J Clin Epidemiol 1997;50(8):869–879.
  • 31
    Norman GR, Sridhar FG, Guyatt GH, Walter SD. Relation of distribution- and anchor-based approaches in interpretation of changes in health-related quality of life. Med Care 2001;39(10):1039–1047.
  • 32
    Ware JE, Snow KK, Kosinski M, Gandek B. SF-36 Health Survey: Manual and Interpretation Guide. Boston, Massachusetts: The Health Institute, New England Medical Center, 1993.
  • 33
    Patrick DL, Martin ML, Bushnell DM, Yalcin I, Wagner TH, Buesching DP. Quality of life of women with urinary incontinence: further development of the incontinence quality of life instrument (I-QOL). Urology 1999;53(1):71–76.

Accepted 28 January 2004