Smallest detectable and minimal clinically important differences of rehabilitation intervention with their implications for required sample sizes using WOMAC and SF-36 quality of life measurement instruments in patients with osteoarthritis of the lower extremities




To discuss the concepts of the minimal clinically important difference (MCID) and the smallest detectable difference (SDD) and to examine their relation to required sample sizes for future studies using concrete data of the condition-specific Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) and the generic Medical Outcomes Study 36-Item Short Form (SF-36) in patients with osteoarthritis of the lower extremities undergoing a comprehensive inpatient rehabilitation intervention.


SDD and MCID were determined in a prospective study of 122 patients before a comprehensive inpatient rehabilitation intervention and at the 3-month followup. MCID was assessed by the transition method. Required SDD and sample sizes were determined by applying normal approximation and taking into account the calculation of power.


In the WOMAC sections the SDD and MCID ranged from 0.51 to 1.33 points (scale 0 to 10), and in the SF-36 sections the SDD and MCID ranged from 2.0 to 7.8 points (scale 0 to 100). Both questionnaires showed 2 moderately responsive sections that led to required sample sizes of 40 to 325 per treatment arm for a clinical study with unpaired data or total for paired followup data.


In rehabilitation intervention, effects larger than 12% of baseline score (6% of maximal score) can be attained and detected as MCID by the transition method in both the WOMAC and the SF-36. Effects of this size lead to reasonable sample sizes for future studies lying below n = 300. The same holds true for moderately responsive questionnaire sections with effect sizes higher than 0.25. When designing studies, assumed effects below the MCID may be detectable but are clinically meaningless.


Comprehensive assessment of patients' health status is gaining in importance now that health care, with its expanding diversity of medical interventions, is becoming increasingly evidence-based. As the growing number of the elderly in industrial nations exerts additional pressure on the fiscal resources of health care systems, medical action within strict guidelines is in greater demand(1, 2). One of the key issues for evidence-based and cost-effective medicine is the detection and proof of intervention effects.

In patients with osteoarthritis (OA), information about effectiveness of medication and joint replacement(3–7) is available, but information on rehabilitation interventions is sparse. Effects of rehabilitation intervention are substantially smaller than those of arthroplasty, which reduces disability substantially(6–10). Furthermore, small effects may be more difficult to detect and require larger sample sizes for clinical studies, making them more difficult to realize.

Most importantly, the ability of an instrument to detect such a small difference (the so-called smallest detectable difference, SDD) is essential in order to quantify the minimal difference that patients and their physicians consider clinically important (the so-called minimal clinically important difference [MCID]).

Thus, in order to illuminate small effects in rehabilitation intervention, we need sensitive instruments. For the assessment of interventions in OA of the lower extremities, the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) is generally recommended as the most sensitive condition-specific instrument(4, 11–18). As a generic health status measure the Medical Outcomes Study 36-Item Short Form (SF-36) is now widely used and allows the effect of an OA intervention to be gauged in comparison with other interventions under various conditions(3, 19–21).

The objective of our study was to examine both the MCID, in contrast to the smallest statistically detectable difference (SDD), and the consequent implications for sample sizes in the assessment of comprehensive rehabilitation intervention in OA patients using the WOMAC and SF-36 instruments.


Patients and data collection

Patients were recruited from the Zurzach Rheumatology and Rehabilitation Clinic in Switzerland. All patients with hip or knee OA who were consecutively referred to a comprehensive inpatient rehabilitation intervention by their family physician or their rheumatologist were invited to participate in the study by a letter that was sent to them 4 weeks prior to their entry into the clinic. On the day of their entry into the clinic, a physician performed the baseline interview and examination, which determined inclusion in or exclusion from the study. Patients were sent or given a set of questionnaires, including the WOMAC and the SF-36, on the day of entry into the clinic (baseline examination), on the day of discharge from the clinic, and again 3 months after their baseline examination.

According to the American College of Rheumatology (ACR) guidelines, inclusion criteria were (a) knee pain for more than 25 of the last 30 days, (b) morning stiffness of less than 30 minutes and crepitation in the knee, or (c) pain for more than 25 of the last 30 days, with osteophytes on x-rays of the knees indicating knee OA(22). Patients with hip OA were included when there was pain for more than 25 of the last 30 days and at least 2 of the following 3 criteria were present: erythrocyte sedimentation rate <20 mm/hour, osteophytes on x-rays, or obliteration of joint space(23). Patients were excluded if they did not fulfill the ACR criteria, had a history of medication abuse or nonadherence, had difficulty completing questionnaires, suffered from severe illness, or had undergone arthroplasty of the joint in question.

Patients included in the analysis filled out the questionnaires in accordance with the rules of the user's guide, which specifies completion of at least 4 of the 5 pain items, 1 of the 2 stiffness items, and 14 of the 17 function items in WOMAC(13). Furthermore, completion of the SF-36 required that both the physical component summary (PCS) and mental component summary (MCS) were calculable(20).

The comprehensive inpatient rehabilitation intervention consisted of a standardized program of passive and especially active physical therapy as well as a reduction in the use of nonsteroidal anti-inflammatory drugs (NSAIDs) as much as possible. The rehabilitation program concentrated on physical therapy and was supervised by physicians. Active kinesitherapies were performed both individually and in groups to strengthen and stretch the musculature, especially the quadriceps, as well as the passive structures in order to recreate regular joint mobility. Passive therapies included electrotherapies, hydrotherapies, thermotherapies such as cold or warm compresses, and massage. Instructions for relaxing strategies and consultations for preventive measures were additional elements of the rehabilitation program. Finally, each patient was instructed in an individual home rehabilitation program to be continued after discharge. The duration of the program varied between 3 and 4 weeks, depending on adaptation of the program to each individual patient's unique situation (severity of OA, comorbidity, etc.).


The condition-specific WOMAC questionnaire is a multidimensional measure of pain, stiffness, and physical functional disability consisting of 24 items graded in a numerical rating scale ranging from 0 (“no symptoms”) to 10 (“extreme symptoms”)(4, 12–18). We selected the most responsive sections for OA pain and function(8, 13). With 36 items, the generic SF-36 calculates 8 multi-item scales—physical functioning, role physical, bodily pain, general health, vitality, social functioning, role emotional, and mental health—and 2 summary scales, the physical component summary (PCS) and the mental component summary (MCS)(20, 21). Each scale ranges from 0 (“extreme symptoms/poor health”) to 100 (“no symptoms/perfect health”). For the analysis, bodily pain, physical functioning, and PCS were selected.

The transition questionnaire was used to gather data from the patients about their current subjective health status in relation to the OA joint in terms of their general health. At the 3-month followup, patients had to compare their general health status with that of 3 months earlier, i.e., with that at baseline examination, using the assessment categories “much worse,” “slightly worse,” “equal,” “slightly better,” and “much better.”


The changes in score from baseline to the 3-month followup were defined as effects. As responsiveness measures, the standardized response mean (SRM)(24) and the effect size (ES)(25) were used. The SRM is equal to the mean change in score (effect) divided by the standard deviation of the individual changes in score. The ES equals the mean change in score (effect) divided by the standard deviation of the baseline scores. In both coefficients, SRM and ES, a higher value indicates higher responsiveness.
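The two responsiveness coefficients defined above can be sketched in a few lines of Python. This is only an illustration: the function names and the five paired scores are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def effect_size(baseline, followup):
    """ES = mean change in score / SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(baseline)  # sample SD (n - 1): an assumption

def standardized_response_mean(baseline, followup):
    """SRM = mean change in score / SD of the individual changes."""
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)

# Hypothetical WOMAC-like scores (0 = no symptoms, 10 = extreme symptoms).
baseline = [5.0, 4.0, 6.0, 3.0, 7.0]
followup = [4.0, 4.5, 5.0, 2.5, 6.0]

es = effect_size(baseline, followup)
srm = standardized_response_mean(baseline, followup)
print(f"ES = {es:.2f}, SRM = {srm:.2f}")
```

The signs follow the direction of the scale (negative means improvement on the WOMAC), whereas the tables report ES and SRM as magnitudes.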

Effects measured by WOMAC and SF-36 were related to the transition reply categories in order to assess MCID. This is an application of the transition method, which has been established and successfully used in different settings(26–30). We compared the mean scores of WOMAC and SF-36 and the score changes between baseline and 3-month followup within the different transition categories (“much better,” “slightly better,” etc.; see above). The mean score difference between the “equal” group and the “slightly better” group resulted in the MCID for improvement. The corresponding MCID for worsening was determined by the mean score difference between the “equal” group and the “slightly worse” group.
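As a minimal sketch of the transition method described above, the MCID can be computed from the mean effects within the transition categories. The function name, the dictionary layout, and the score changes below are hypothetical illustrations, not study data.

```python
from statistics import mean

def mcid_by_transition(changes_by_category):
    """Return MCID for worsening and improvement from per-category score changes.

    MCID = absolute difference between the mean effect of the
    "slightly worse" / "slightly better" group and that of the "equal" group.
    """
    equal = mean(changes_by_category["equal"])
    worse = mean(changes_by_category["slightly worse"])
    better = mean(changes_by_category["slightly better"])
    return {
        "worsening": abs(worse - equal),
        "improvement": abs(better - equal),
    }

# Hypothetical SF-36-like changes (followup minus baseline) per transition reply.
example = {
    "slightly worse": [-2.0, 0.0],
    "equal": [5.0, 7.0],
    "slightly better": [13.0, 15.0],
}
result = mcid_by_transition(example)
print(result)
```

The "much worse" and "much better" categories are not needed for the MCID itself, which is why the sketch ignores them.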

When planning a future study, a small pilot study is often conducted in order to assess important parameters for the main study. Given the data of the pilot study, the SDD is the smallest effect that can be detected as significant by the chosen statistical method. The size of the SDD depends on the responsiveness of the measurement instrument and the sensitivity of the statistical method(31–34). For example, a parametric Student's t-test is able to detect much smaller score differences, changes, and effects than nonparametric tests such as the Wilcoxon rank-sum test. Conversely, more sensitive statistical models allow the use of smaller sample sizes. For example, a given effect will lead to smaller sample sizes when applying the t-test rather than the Wilcoxon test(33). Which statistical model and which corresponding test to choose is a question of the observed or expected distribution of the data.


In data generated by “natural” processes, approximation by normal (Gauss) distribution is feasible when the sample size is large enough, i.e., n ≥ 30(31, 33, 34). In smaller numbers of patients, the t-distribution replaces the normal distribution(34). In both cases, SDD and sample size can be determined by the calculation rules of normal distribution as follows(31, 34–37):

General equation for the sample size (n).

Effects were measured as differences (d) between 2 groups. For example, d is the difference between the mean of the intervention group and the mean of the control group. In large samples (n ≥ 30), d can be considered normally distributed with mean μd and standard error SE(d). This holds in particular when the scores of both groups are normally distributed, because the difference between 2 normally distributed variables is itself normally distributed. The null hypothesis is that there is no effect: μd = 0. The alternative hypothesis(31, 34) is that

\[ \mu_d = (z_\alpha + z_\beta) \cdot SE(d) \qquad [0] \]

The z-values come from the standard normal distribution (mean = 0, standard deviation = 1), where α = two-sided type I error (usually α = 0.05) and β = one-sided type II error; thus 1 – β = power (usually power = 0.8). In the case of n < 30, the z-values must be replaced by t-values from the t-distribution(34).

When comparing the difference (d) of 2 (effect) variables, the mean of the difference is equal to the difference of the 2 means because the mean is a linear operator. By the rules of calculation with independent normally distributed variables, the variance of the difference is the sum of the variances of the 2 means: variance(d) = s²/n1 + s²/n2, when both effect variables have the same (or a comparable) “a priori” standard deviation, s = SD, and n1, n2 are the sample sizes of the variables. In paired followup data, or when both the control and treatment groups have the same size, we can set n1 = n2 = n. Thus, SE(d) can be replaced by SE(d) = √(s²/n + s²/n) = √(2s²/n) in formula [0], resulting in the general equation for the sample size:

\[ n = 2\,(z_\alpha + z_\beta)^2 \left(\frac{SD}{\Delta}\right)^2 \qquad [1] \]

where

zα, zβ = values of the standard normal distribution (mean = 0, standard deviation = 1) at the probability of α or β, respectively

α = two-sided type I error

β = one-sided type II error (thus, 1 – β = power)

Δ = mean effect, i.e., the difference of the mean score of the intervention group (or followup score) minus the mean score of the control group (or baseline score), which equals the mean of the differences μd

SD = standard deviation of the scores at baseline (a priori standard deviation)

In the case of followup studies, we have the same subjects in the control group (before the intervention) and in the intervention group (after the intervention). Therefore, n is the total required number of the sample.
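The general sample size equation can be sketched in Python, assuming formula [1] has the form n = 2 (zα + zβ)² (SD / Δ)². The function name and the choice to round up to the next integer are illustrative assumptions; the z-values are taken from the standard library's `statistics.NormalDist`.

```python
from math import ceil
from statistics import NormalDist

def sample_size(delta, sd, alpha=0.05, power=0.80):
    """n per treatment arm (unpaired) or total (paired), normal approximation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)           # one-sided type II error
    n = 2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2
    return ceil(n)  # rounding up is an assumption of this sketch

# SF-36 bodily pain: baseline SD 16.5, MCID for improvement 7.8 (Tables 1 and 2).
n_improvement = sample_size(delta=7.8, sd=16.5)
# SF-36 bodily pain: MCID for worsening 7.2.
n_worsening = sample_size(delta=7.2, sd=16.5)
print(n_improvement, n_worsening)
```

With α = 0.05 and power = 0.8, these two calls give the bodily pain sample sizes listed in Table 3 for the MCID columns.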

Conversely, given a sample size (n) and the a priori baseline standard deviation (SD), for example from a pilot study, the smallest statistically detectable difference (SDD = Δ) can be determined from formula [1]:

\[ SDD = \Delta = (z_\alpha + z_\beta) \cdot SD \cdot \sqrt{\frac{2}{n}} \qquad [1a] \]
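As a worked illustration, assuming formula [1a] is formula [1] solved for Δ, and taking α = 0.05 and power = 0.8 (zα = 1.96, zβ = 0.84), the WOMAC pain section with n = 122 and a baseline standard deviation of 2.25 gives approximately:

\[ SDD = (1.96 + 0.84) \cdot 2.25 \cdot \sqrt{\frac{2}{122}} \approx 2.80 \cdot 2.25 \cdot 0.128 \approx 0.81 \]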

Determination of n by ES.

If we know the effect size (ES) from a pilot study, and we assume that in the control group of the main study the standard deviation is equal or comparable to the a priori standard deviation of the control group in the pilot study, we have (SD / Δ) = 1 / ES by the definition of ES. From formula [1] it follows that

\[ n = \frac{2\,(z_\alpha + z_\beta)^2}{ES^2} \qquad [2] \]

Determination of n by SRM.

If we have paired observations and we know the variance of the differences, SDΔ² (from a pilot study), then we can replace the standard error of the mean difference by SE(d) = √(SDΔ²/n) in formula [0](32, 35):

\[ \mu_d = (z_\alpha + z_\beta) \cdot \frac{SD_\Delta}{\sqrt{n}} \]

Because SRM is equal to μd / SDΔ (and μd = Δ) by formula [0] it follows that

\[ SRM = \frac{z_\alpha + z_\beta}{\sqrt{n}} \]
\[ n = \frac{(z_\alpha + z_\beta)^2}{SRM^2} \qquad [3] \]
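The two responsiveness-based routes to n can be compared in a short sketch, assuming formula [2] reads n = 2 (zα + zβ)² / ES² and formula [3] reads n = (zα + zβ)² / SRM². Function names and the rounding up to whole patients are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist

def z_sum(alpha=0.05, power=0.80):
    """z_alpha (two-sided) plus z_beta (one-sided) from the standard normal."""
    return NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)

def n_from_es(es, alpha=0.05, power=0.80):
    """Formula [2]: uses the baseline (a priori) standard deviation via ES."""
    return ceil(2 * z_sum(alpha, power) ** 2 / es ** 2)

def n_from_srm(srm, alpha=0.05, power=0.80):
    """Formula [3]: paired data, uses the SD of the individual changes via SRM."""
    return ceil(z_sum(alpha, power) ** 2 / srm ** 2)

# SF-36 bodily pain from Table 1: ES = 0.63, SRM = 0.45.
n_es = n_from_es(0.63)
n_srm = n_from_srm(0.45)
print(n_es, n_srm)
```

For SF-36 bodily pain this gives the values shown in the ES and SRM columns of Table 3.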

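For the most commonly used type I and II errors (α = 0.05, power = 0.8), the expression (zα + zβ)² from the standard normal distribution can be replaced by

\[ (z_\alpha + z_\beta)^2 = (1.96 + 0.84)^2 \approx 7.84 \]
\[ n \approx \frac{15.7}{ES^2} \]
\[ n \approx \frac{7.84}{SRM^2} \]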



Between February 1997 and September 1998, 142 patients with the diagnosis of OA of the hip or knee were included in the study according to ACR criteria. They completed the questionnaires correctly according to the rules of the WOMAC and the SF-36 at their entry into the clinic (baseline examination)(13, 20). None of the patients were scheduled to undergo arthroplasty in the near future; instead, they underwent a 3- to 4-week comprehensive inpatient rehabilitation.

Three months after the baseline examination, 122 patients were reexamined with complete WOMAC sets, and 116 were reexamined with both WOMAC and SF-36 questionnaire sets. Between the baseline examination and the 3-month followup, 2 patients had died (for reasons unrelated to their OA), 2 patients had undergone arthroplasty (after their clinic stay), 2 patients were excluded by the predetermined exclusion criteria (listed above), and 14 (WOMAC) and 20 (SF-36) patients returned incomplete forms according to the rules or refused to participate further.

The mean age of the study subjects was 65.1 years, 70.5% of the patients were female, 61.5% had knee OA, and 43 patients (35.2%) used NSAIDs or analgesics, or both, at baseline examination; most of them reduced or omitted these substances until the end of their rehabilitation stay. The level of disability of the patients in the study varied widely but was moderate on average (see Table 1, baseline scores: WOMAC global = 4.8, range 0–10). The patients who could not be included in the study or in the 3-month followup were a median of 5 years older than the study patients, but there was no difference between the groups with respect to sex or distribution of joints involved.

Table 1. Patients with hip or knee osteoarthritis, before and after inpatient rehabilitation*
Section | Baseline, mean ± SD | 3-month followup, mean ± SD | Effect (difference = 3-month followup − baseline), mean ± SD | ES | SRM

* ES = effect size; SRM = standardized response mean; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; SF-36 = Medical Outcomes Study 36-Item Short Form; PCS = physical component summary. WOMAC scale: 0 = no symptoms; 10 = extreme symptoms. SF-36 scale: 0 = extreme symptoms; 100 = no symptoms. ES = mean (effect) ÷ SD (baseline). SRM = mean (effect) ÷ SD (effect). Improvement if WOMAC effect < 0, SF-36 effect > 0.

WOMAC (n = 122)
Pain | 4.83 ± 2.25 | 4.18 ± 2.37 | −0.66 ± 1.96 | 0.29 | 0.34
Stiffness | 4.61 ± 2.67 | 4.58 ± 2.40 | −0.03 ± 2.55 | 0.01 | 0.01
Function | 4.81 ± 2.18 | 4.33 ± 2.32 | −0.47 ± 1.73 | 0.22 | 0.27
Global | 4.80 ± 2.09 | 4.32 ± 2.26 | −0.47 ± 1.72 | 0.23 | 0.27
SF-36 (n = 116)
Bodily pain | 27.1 ± 16.5 | 37.5 ± 20.3 | 10.5 ± 23.0 | 0.63 | 0.45
Physical function | 37.5 ± 20.6 | 37.9 ± 22.1 | 0.4 ±
PCS | 28.6 ± 7.7 | 30.9 ± 9.1 | 2.3 ± 8.0 | 0.30 | 0.29

Baseline scores, followup scores, and effects (Tables 1 and 2)

The scores of the examination at baseline (entry into the clinic) and at the 3-month followup are listed in Table 1. The effect is the difference between the two. The WOMAC baseline scores are positioned in the middle of the range (global score 4.80), indicating moderate illness and disability. Floor and ceiling effects were low (data not shown). WOMAC pain and function, SF-36 bodily pain, and PCS were the most responsive sections, with the highest SRM and ES, resulting in comparably small sample sizes required for future studies (Table 3, columns ES and SRM). WOMAC stiffness and SF-36 physical function were not highly responsive, with SRM and ES near zero.

Table 2. Mean effects (3-month followup vs. baseline) in groups after the categories resulting from the answer to the “transition” query “health in general related to the OA joint 3 months ago”*
Effects (3-month followup − baseline) within transition groups, and MCID:
Section | Slightly worse | Equal | Slightly better | MCID for worsening | MCID for improvement

* OA = osteoarthritis; MCID = minimal clinically important difference; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; SF-36 = Medical Outcomes Study 36-Item Short Form; PCS = physical component summary. MCID for worsening = effect (“slightly worse”) − effect (“equal”): absolute value. MCID for improvement = effect (“slightly better”) − effect (“equal”): absolute value. Improvement if WOMAC effect < 0, SF-36 effect > 0.

WOMAC (n = 122) | n = 22 | n = 42 | n = 28
SF-36 (n = 116) | n = 21 | n = 40 | n = 26
Bodily pain | −0.9 | 6.4 | 14.2 | 7.2 | 7.8
 Physical function−6.4−
Table 3. Smallest detectable difference (SDD) and sample sizes (n) given the data of a pilot study (Tables 1 and 2)*
Given the … | n of pilot study, SD (baseline) | ES | SRM | MCID for worsening, SD (baseline) | MCID for improvement, SD (baseline)
Using formula† | 1a | 2 | 3 | 1 | 1
Results in | SDD | n per treatment arm | n total | n total | n total

* SD (baseline) = standard deviation of baseline scores (a priori standard deviation); ES = effect size; SRM = standardized response mean; MCID = minimal clinically important difference; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; SF-36 = Medical Outcomes Study 36-Item Short Form; PCS = physical component summary. ES = mean (effect) ÷ SD (baseline). SRM = mean (effect) ÷ SD (effect).

† Formulas in Methods section.

WOMAC (n = 122)
SF-36 (n = 116)
Bodily pain | 6.1 | 40 | 39 | 83 | 71
 Physical function7.6>1000197238612

The differences of the mean effects between the groups of patients who replied “slightly worse” or “slightly better” and those who replied “equal” constitute the MCID for worsening and for improvement, respectively (Table 2). On average, and especially with the WOMAC scales, lower values for improvement resulted. This seems to indicate that improvement is easier to notice subjectively than worsening.

SDD and sample sizes for ES, SRM, MCID (Table 3)

The formulas described in the Methods section were applied practically to our data, which can be interpreted as coming from a previously conducted pilot study. That study produced the necessary figures for planning a future study. Inserting the n (122 for WOMAC and 116 for SF-36) and the baseline standard deviations of the pilot data, the resulting SDDs vary between 0.75 and 0.96 points for the WOMAC sections and between 6.1 and 7.6 for the SF-36 sections. The SF-36 physical component summary has a low baseline variation that gives a small SDD of 2.8 points.
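The SDD values quoted above can be reproduced approximately with a short sketch, assuming formula [1a] reads SDD = (zα + zβ) · SD · √(2/n) with α = 0.05 and power = 0.8. The pilot n and baseline standard deviations are taken from Table 1; the function and variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def smallest_detectable_difference(n, sd_baseline, alpha=0.05, power=0.80):
    """SDD for a given pilot sample size and a priori (baseline) SD."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) * sd_baseline * sqrt(2 / n)

# (pilot n, baseline SD) per questionnaire section, from Table 1.
pilot = {
    "WOMAC pain": (122, 2.25),
    "WOMAC stiffness": (122, 2.67),
    "WOMAC function": (122, 2.18),
    "WOMAC global": (122, 2.09),
    "SF-36 bodily pain": (116, 16.5),
    "SF-36 physical function": (116, 20.6),
    "SF-36 PCS": (116, 7.7),
}

sdd = {name: smallest_detectable_difference(n, sd) for name, (n, sd) in pilot.items()}
for name, value in sdd.items():
    print(f"{name}: SDD = {value:.2f}")
```

Small deviations in the last digit are possible depending on how the z-values were rounded in the original calculation.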

Analogously, sample sizes can be determined given responsiveness data (ES or SRM) or MCID and baseline standard deviations from the pilot study data. The moderately responsive sections (WOMAC pain and function, SF-36 bodily pain) need a relatively low sample size (between 40 and 167), whereas the less responsive scales (WOMAC stiffness, SF-36 physical function and PCS) require large sample sizes that will be difficult to provide in a future study.

Illustration of sample size (n) by effect and baseline standard deviation (Figure 1)

For the WOMAC, the dependency of n on the effect (absolute change in score from baseline to 3-month followup) and on the baseline standard deviation is illustrated three-dimensionally in Figure 1. The depicted plane illustrates the minimally required sample size for detection of the given effect, assuming the given baseline standard deviation either per treatment arm for unpaired data or total for paired (followup) data. The n did not exceed 300 when there were given effects greater than 0.6 points and baseline standard deviations smaller than 2.6 from our data. These differences of 6% to 13% of maximal possible value (12% to 26% of baseline value) and the standard deviations of 20% to 26% of maximal possible value reflect most of the values found by the pilot study. Specifically, 0.6 and 0.7 WOMAC points (scale 0 to 10) and 6 and 7 SF-36 points (scale 0 to 100) are on the level of SDD and MCID in both scales.

Figure 1.

Sample size (n) in dependency on the effect and the baseline standard deviation (WOMAC scale); n per treatment arm (unpaired data) or total for paired (followup) data.
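The surface shown in Figure 1 can be approximated numerically. The following sketch tabulates n over a grid of effects and baseline standard deviations, assuming formula [1] in the form n = 2 (zα + zβ)² (SD / Δ)² with α = 0.05 and power = 0.8; the grid values themselves are arbitrary illustrative choices on the WOMAC scale.

```python
from math import ceil
from statistics import NormalDist

def required_n(effect, sd, alpha=0.05, power=0.80):
    """n per arm (unpaired) or total (paired) for a given effect and baseline SD."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(2 * z ** 2 * (sd / effect) ** 2)

effects = [0.4, 0.6, 0.8, 1.0, 1.2]   # absolute change in WOMAC points
sds = [2.0, 2.2, 2.4, 2.6]            # baseline standard deviations
grid = {(e, s): required_n(e, s) for e in effects for s in sds}

# Print one row per effect and one column per baseline SD.
print("effect  " + "  ".join(f"SD={s:<4}" for s in sds))
for e in effects:
    print(f"{e:<6}  " + "  ".join(f"{grid[(e, s)]:<7}" for s in sds))
```

On this grid, every combination with an effect of at least 0.6 points and a baseline standard deviation of at most 2.6 stays below n = 300, in line with the description of the figure.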

Determination of sample size (n) by the effect size (ES) (Figure 2)

The size of n as a function of ES, decreasing with the inverse square of ES, is illustrated in Figure 2 for pairwise data, assuming a type I error of α = 0.05 and a power of 0.8 and using formula [2] in the Methods section. Below ES = 0.25, required sample sizes exceed 250, but in cases of ES > 0.4, i.e., when the baseline standard deviation is less than 2.5-fold the assumed effect, n will be smaller than 100. For ES > 0.7, the minimal required number of n = 30 is sufficient for detection of significant effects, and the curve flattens further because the normal approximation must be replaced by the t-approximation(34).

Figure 2.

Sample size (n) in dependency on the effect size (ES); n per treatment arm (unpaired data) or total for paired (followup) data.
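The curve in Figure 2 can be tabulated with a few lines of Python, assuming formula [2] in the form n = 2 (zα + zβ)² / ES² with α = 0.05 and power = 0.8. This sketch uses the normal approximation throughout and does not switch to the t-distribution for small n, so values near n = 30 are only indicative.

```python
from math import ceil
from statistics import NormalDist

def n_from_es(es, alpha=0.05, power=0.80):
    """Required n as a function of the effect size (normal approximation only)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return ceil(2 * z ** 2 / es ** 2)

# n falls with the inverse square of ES.
curve = {es: n_from_es(es) for es in (0.2, 0.25, 0.3, 0.4, 0.5, 0.7, 1.0)}
for es, n in curve.items():
    print(f"ES = {es:<4}  n = {n}")
```

Halving the effect size quadruples the required sample size, which is why sections with ES below about 0.25 quickly become impractical.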


Strategy and concepts

We have explained and illustrated the methodology and concrete results of the concepts of SDD, MCID, and sample size calculation by using the WOMAC and SF-36 data of OA patients who have undergone an inpatient rehabilitation intervention. These data can be considered a result of a previously conducted pilot study. Our aim was to build a bridge between the clinician and the epidemiologist/statistician, 2 professional groups with a long history of classic issues between them.

The clinical importance of effects for the assessment of MCID was quantified, by way of example, using the transition method(26–30). The determinations of SDD, MCID, and sample size are derived from the calculation rules for the normal distribution, which is assumed for large numbers of subjects (n > 30) whose data are dependent on “natural” processes(31, 34–37). The assumption of normally distributed effects and differences of effects results in simple equations for the user in clinical practice. The further concepts of study design, pilot and main study, and SDD and MCID will first be discussed and then applied to concrete data.

Study design

A study to describe the effect of an intervention can examine 2 independent groups of patients, one with the intervention and the other without (the control group). The conditions at baseline should be as similar as possible for both groups in order to avoid systematic bias. The required sample sizes for this design are twice as large as those resulting from a paired followup design(31). In the paired followup design, the control group can be created by the crossover design: First, one half of the patients will receive the intervention and the second half will not; then, after a “washout” period, the second half will be treated and the first half will not. This is one of the “gold standards” for drug studies, but when applying rehabilitation interventions this design is difficult to accomplish. When the same patients are assessed before and after the intervention without a control group, the followup study yields an uncontrolled determination of the effects. In this case, the formulas [1], [2] (using ES), and [3] (using SRM) can be applied.

Pilot study and main study–use of ES and SRM

When planning a study to prove the effect of an intervention, one wants to know how many patients have to be examined. To determine sample sizes, an estimate of the effect of the intervention and an estimate of the variance (standard deviation) of the data are needed. Simple estimation of these figures is vague and uncertain. These estimates can be based on the results found in the literature if studies with comparable conditions can be found. Literature or simple expectation can predict the size of the effect, but the estimation of the variance remains a problem. Ideally, a small pilot study should be performed to gain valid data under as similar conditions as possible to the future main study. To keep its realization easy, fast, and economical, a cross-sectional survey of baseline data of the future patient sample is adequate. In this case, only the baseline standard deviation (the so-called a priori standard deviation) will be known, and not the standard deviation of the single effects (differences of 2 health statuses), because the short pilot study has no followup data that will allow determination of the effect's standard deviation and by that the SRM. Therefore, we can use only the ES, which is the mean effect divided by the baseline standard deviation in formulas [1] and [2], and not the SRM in formula [3], which equals the mean effect divided by the standard deviation of the single effect. This will be most typical when planning future studies.

Interpretation of SDD and MCID

A priori effects assumed for future studies should be greater than or equal to MCID, representing clinically meaningful effects. An effect smaller than MCID may be measurable, but it will make no sense from the point of view of the patient, who will be unable to notice it. Thus, if MCID > SDD, the assumed effect will be the MCID and the sample size is sufficiently large. However, if SDD > MCID we need a greater sample size or a more sensitive statistical model that is able to detect smaller effects.
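The comparison of SDD and MCID described above can be sketched as a small planning helper. It assumes formulas [1] and [1a] in the forms n = 2 (zα + zβ)² (SD / Δ)² and SDD = (zα + zβ) · SD · √(2/n); the function name, the returned dictionary, and the treatment of MCID = SDD as sufficient are assumptions of this sketch.

```python
from math import ceil, sqrt
from statistics import NormalDist

def plan_against_mcid(n_available, sd_baseline, mcid, alpha=0.05, power=0.80):
    """Compare the SDD at the available n with the MCID and size a study for the MCID."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    sdd = z * sd_baseline * sqrt(2 / n_available)
    n_required = ceil(2 * z ** 2 * (sd_baseline / mcid) ** 2)
    return {
        "sdd": sdd,
        "mcid_detectable": mcid >= sdd,      # True: available n is large enough
        "n_required_for_mcid": n_required,   # n if the assumed effect is the MCID
    }

# SF-36 bodily pain: n = 116, SD 16.5, MCID for improvement 7.8 (Tables 1 and 2).
sf36_pain = plan_against_mcid(116, 16.5, 7.8)
# WOMAC pain: n = 122, SD 2.25, assuming an MCID of 0.6 points for illustration.
womac_pain = plan_against_mcid(122, 2.25, 0.6)
print(sf36_pain)
print(womac_pain)
```

In the first case the MCID exceeds the SDD, so the available sample suffices; in the second the SDD exceeds the MCID, and a larger sample would be needed to detect an effect of MCID size.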

Application on concrete data

In our example of 122 (WOMAC) and 116 (SF-36) inpatient rehabilitation patients, ES and MCID for worsening and MCID for improvement led to sample sizes between 40 and more than 1,000 for rehabilitation interventions. In followup studies with paired data, effect sizes greater than 0.25 led to realizable sample sizes smaller than 250. An ES above 0.5 can be regarded as high because it requires an n of less than 63 (Figure 2). Except for WOMAC stiffness and SF-36 physical function, the required sample sizes did not exceed 325, a number that may already approach the limit of feasibility for future studies in rehabilitation patients. Our data confirm that the WOMAC stiffness section is the least responsive scale(6–9), and the physical function scale of the generic SF-36 is not as sensitive as the function dimension of the condition-specific WOMAC(3, 8), despite the fact that pain is measured similarly by both instruments(6–9).

Regarding our data, one can assume an effect's standard deviation around or below the baseline standard deviation (this is true except for SF-36 bodily pain) and use the baseline standard deviation as a good estimate of that of the effect. In this case, the simple future crossover pilot study enables the use of much smaller sample sizes for paired followup data in rehabilitation patients (formulas 1 and 3).

In the WOMAC sections, we would need effects between 0.8 and 1.0 points to be detectable statistically (SDD), assuming a sample size of 122 and the a priori standard deviations of our data. Assuming that the smallest MCID in the WOMAC scale ranging from 0 to 10 points will be 0.6 points (6% of maximal value, 12% of baseline value), the required sample size will be below 300 for future followup studies (Figure 1) and will, therefore, still be realizable. Concerning bodily pain and the PCS, these figures are also valid for the SF-36. Thus, after rehabilitation intervention with more than a hundred patients, SDD and MCID remain comparable.


In rehabilitation intervention, effects larger than 12% of baseline score (6% of maximal score) can be attained and detected as MCID by the transition method in both the WOMAC and the SF-36. Effects of this size lead to reasonable sample sizes for future studies, lying below n = 300. The same holds true for moderately responsive questionnaire sections with effect sizes higher than 0.25. When designing studies, assumed effects below the MCID may be detectable but are clinically meaningless.


This study has been supported by the Zurzach Rehabilitation Foundation SPA. We thank Stephan Mariacher, MD, and Susanne Lehmann for the planning, management, and implementation of the data base and Robin Kyburg and Diane Fassett for editing the English-language manuscript.