There is conflicting evidence on the efficacy of traditional Chinese acupuncture (TCA), and the role of placebo effects elicited by acupuncturists' behavior has not been elucidated. We conducted a 3-month randomized clinical trial in patients with knee osteoarthritis to compare the efficacy of TCA with sham acupuncture and to examine the effects of acupuncturists' communication styles.
Acupuncturists were trained to interact in 1 of 2 communication styles: high or neutral expectations. Patients were randomized to 1 of 3 style groups (waiting list, high, or neutral) and, nested within style, to TCA or sham acupuncture twice a week over 6 weeks. Sham acupuncture was performed in nonmeridian points with shallow needles and minimal stimulation. Primary outcome measures were Joint-Specific Multidimensional Assessment of Pain (J-MAP), Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC), and satisfaction scores.
Patients (n = 455) received treatment (TCA or sham) and 72 controls were included. No statistically significant differences were observed between TCA and sham acupuncture, but both groups had significant reductions in J-MAP (−1.1, −1.0, and −0.1 for TCA, sham, and waiting list, respectively; P < 0.001) and WOMAC pain (−13.7, −14, and −1.7, respectively; P < 0.001) compared with the waiting list group. Statistically significant differences were observed in J-MAP pain reduction and satisfaction, favoring the high expectations group. In the TCA and sham groups, 52% and 43%, respectively, thought they had received TCA (κ = 0.05), suggesting successful blinding.
TCA was not superior to sham acupuncture. However, acupuncturists' styles had significant effects on pain reduction and satisfaction, suggesting that the analgesic benefits of acupuncture can be partially mediated through placebo effects related to the acupuncturist's behavior.
Acupuncture is a mainstay in traditional Chinese medicine. Health is attained through the flow of vital energy, Qi, through specific body paths called meridians; disease is caused by obstructions to this flow (1). In traditional Chinese acupuncture (TCA), needles are inserted at points along meridians to unblock these obstructions. In the past, acupuncture was mostly administered using manual needle manipulation. Increasingly, most acupuncturists, including those providing TCA, use electroacupuncture to increase stimulation.
There is conflicting evidence regarding the efficacy of acupuncture in knee osteoarthritis (OA) (2–5). Moreover, no study has systematically assessed the potential bias from the interactions of participants with acupuncturists. Only 2 randomized clinical trials in a recent systematic review described any attempts to limit the interactions between acupuncturists and participants (3, 6, 7). Neither achieved successful blinding. Placebo effects can be enhanced by expectations of improvement and it is conceivable that patient-provider interactions result in increased benefits if the provider has a confident attitude (8, 9).
To further evaluate the contribution of provider communication style to therapeutic responses, we conducted a randomized clinical trial of acupuncture for OA of the knee with a nested factorial design allowing us to evaluate the comparative effects of TCA and sham acupuncture while controlling for the effect of the acupuncturists' communication styles.
PATIENTS AND METHODS
We conducted a nested 2-stage randomized clinical trial to determine the effects of practitioner behavior on patients' responses to TCA or sham acupuncture in OA of the knee, and to evaluate the difference in therapeutic efficacy between these two modalities. Six acupuncturists trained in traditional Chinese medicine, licensed by the Texas State Board of Medical Examiners, were recruited through the American College of Acupuncture & Oriental Medicine. To ensure uniformity, all were Chinese, male, and had ≥2 years of clinical experience.
After completing a baseline assessment, patients were randomized to 1 of 3 groups, waiting list, high expectations, or neutral expectations communication style, and were given an appointment with an acupuncturist trained in the assigned communication style (see below) (Figure 1). During the initial 30-minute visit, the acupuncturist examined the subject in the traditional Chinese medicine method (e.g., documenting pulse), giving information and answering questions with the corresponding communication style. Participants were then randomized again to receive TCA or sham acupuncture. The study was approved by the Institutional Review Boards of the Baylor College of Medicine and University of Texas MD Anderson Cancer Center. Participants were told that the study compared traditional versus nontraditional acupuncture. The words “sham” or “placebo” were not used in the information materials or consent forms.
Randomization was computer generated with unequal blocks, stratified initially according to communication style. Patients allocated to either style were then randomized to one of the acupuncturists trained to interact in that style (acupuncturist nested within style). Finally, for each acupuncturist, patients were randomized to receive either TCA or sham acupuncture (treatment nested within acupuncturist). Opaque, sealed, consecutively numbered envelopes kept at a central location were used for allocation. Our hypothesis was that TCA would be more effective than sham acupuncture; therefore, to better evaluate the effects of communication style on placebo effects we oversampled the sham arm to allow us to conduct subgroup analyses in this group if results between the two arms (TCA and sham) were found to be different.
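The last allocation step described above (treatment nested within acupuncturist, with the sham arm oversampled) can be illustrated with a minimal sketch of permuted-block randomization. The function name, the block sizes, and the 1:2 TCA:sham ratio used here are assumptions for illustration; the text does not report the block sizes or the software used to generate the sequence.

```python
import random

def permuted_block_schedule(n_patients, arms, block_multiples=(1, 2), seed=0):
    """Return a list of arm labels built from shuffled blocks of varying size.

    arms: dict mapping arm label -> allocation weight, e.g. {"TCA": 1, "sham": 2}.
    block_multiples: each block holds k copies of the base ratio, with k drawn
    at random from this tuple (hypothetical values, not taken from the trial).
    """
    rng = random.Random(seed)
    base = [label for label, weight in arms.items() for _ in range(weight)]
    schedule = []
    while len(schedule) < n_patients:
        block = base * rng.choice(block_multiples)
        rng.shuffle(block)  # order within a block is random
        schedule.extend(block)
    return schedule[:n_patients]

# Hypothetical schedule for one acupuncturist, sham oversampled 2:1.
schedule = permuted_block_schedule(12, {"TCA": 1, "sham": 2}, block_multiples=(2,), seed=1)
print(schedule.count("TCA"), schedule.count("sham"))  # 4 8
```

Within every complete block the arms appear in the target ratio, so the allocation stays close to 1:2 throughout recruitment while the next assignment remains unpredictable.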
Subjects (n = 560) who were ≥50 years of age and had knee OA according to the American College of Rheumatology criteria (10, 11) participated in the trial. All patients had a radiologic diagnosis of OA. Additional inclusion criteria were 1) pain in the knee in the preceding 2 weeks ≥3/10 on a visual analog scale (VAS), 2) no prior treatment with acupuncture, 3) stable treatment with nonsteroidal antiinflammatory drugs and analgesics in the previous month, 4) if receiving glucosamine, a stable dose for the past 2 months, and 5) no intraarticular injections in the knee in the previous 2 months.
Communication style intervention.
Because the individual communication patterns of each acupuncturist could differ, in the first half of the study the acupuncturists were randomized, with 3 assigned to interact with a high expectations style and the other 3 in a more uncertain, neutral way. In the second half of the study, the high expectations acupuncturists were retrained to act neutrally and vice versa. One acupuncturist had to leave the study toward the end of the first half of the trial, so in order to maintain a balanced design, only 4 acupuncturists participated in the second half.
Acupuncturists conveyed high expectations of improvement, using positive utterances such as “I think this will work for you,” “I've had a lot of success with treating knee pain,” and “Most of my patients get better.” A high expectations brochure was developed and given to patients. The research coordinator assisting with these patients was also trained to interact with a high expectations style.
Acupuncturists conveyed uncertainty with utterances such as “It may or may not work for you,” “It really depends on the patient,” and “We're uncertain, and that's why we are doing the study,” and used words like “uncertain.” A neutral expectations brochure was given to patients. The research coordinator for this group was trained to interact with a neutral style.
Training materials were developed for each style. Before the trial started, acupuncturists participated in two 2-day training sessions including didactic instruction, one-on-one coaching, and group role play to practice the assigned style, with video recording to provide feedback. After completion of the first half of the trial, the acupuncturists were retrained.
We chose electroacupuncture because it is currently the most commonly used method. We used transcutaneous electrical nerve stimulation (TENS) equipment. In clinical settings, TCA is often performed by choosing points for each patient. To standardize our methods, a panel consisting of the acupuncturists determined the procedure and specific points to be used equally for all patients in each of the 2 arms: TCA points on the basis of clinical practice, and sham points outside the relevant meridians (Figure 2). Before the trial started, we tested both the TCA and sham protocols in 9 volunteers; 7 received both treatments, but only 1 correctly distinguished between the two.
Labeled boxes with disposable Millennia needles (Millennia) were created for each arm. The depth of the needle placement was shallower for sham acupuncture. For TCA, TENS was set to emit a dense disperse wave impulse at 50 Hz, dispersing at 15 Hz, 20 cycles/minute. Voltage was increased slowly from 5 V to 60 V until maximal tolerance was achieved. Patients rested for 20 minutes with continuing TENS. For sham, instead of a dense disperse wave, a 40 Hz adjustable wave was used. Voltage was increased until the patient could feel it and then immediately turned off. Patients rested for 20 minutes with the needles retained, but without TENS stimulation. Patients received 2 treatments per week (TCA or sham) for 6 weeks.
All procedures, needling techniques, and communication styles were thoroughly outlined in the training manual and reviewed at training sessions. During the trial, research assistants were instructed to enter the room randomly while patients were being treated to ensure that the needles were placed correctly with respect to the meridians. Although the assistants did not enter the treatment rooms on all occasions, the acupuncturists were aware that the assistants would periodically check for accuracy in needle placement. The initial interaction between participants and acupuncturists took place before the participant was randomly allocated to TCA or sham; therefore, the acupuncturist was unaware of what type of treatment the participant would receive. All visits were audio recorded, and feedback on performance was given at followup sessions. Audio recordings at the baseline, midpoint, and final sessions were rated on the strength of expectations communication on a 0–10-cm VAS by graduate students in the Department of Communication Sciences at Texas A&M University, unaware of treatment allocation. Acupuncturists did not know which sessions were rated.
Outcome measures were collected at baseline, 4 weeks, 6 weeks (end of treatment), and 3 months. The primary outcomes determined a priori were Joint-Specific Multidimensional Assessment of Pain (J-MAP), Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) pain subscale, and Satisfaction with Knee Procedure (SKIP).
J-MAP measures the intensity, frequency, and quality of pain (12); responses range from 1 to 7, with higher values indicating more pain (Cronbach's α = 0.93). Respondents rated each knee separately. The score was the average of both knees if the pain was bilateral and they had received treatment in both knees. For unilateral symptoms, only the J-MAP for the relevant knee was used. The WOMAC pain subscale (13) ranges from 0 to 100, with higher values indicating more pain. SKIP is a 6-item satisfaction scale with the following agree/disagree statements: “I would recommend acupuncture to my family if they needed care for the same problem,” “I feel acupuncture is worthwhile for my knee arthritis,” “All things considered, my perceptions of acupuncture are generally negative,” “I feel I was helped by the acupuncture,” “I am dissatisfied with the functioning of my knees after acupuncture,” and “Undergoing acupuncture was a waste of time.” Scores range from 1 to 5, with higher scores indicating greater satisfaction. Cronbach's alpha for our sample was 0.91.
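The patient-level J-MAP rule described above can be sketched as a small function. This is an illustration only: the function and argument names are hypothetical, and the sketch keys the rule on which knees were treated, which is an assumption for cases the text does not spell out (for example, bilateral pain with only one knee treated).

```python
def jmap_score(left, right, treated_knees):
    """Patient-level J-MAP score (1-7 scale, higher values mean more pain).

    left, right: per-knee J-MAP ratings, or None if that knee was not rated.
    treated_knees: set containing "left", "right", or both.
    """
    if treated_knees == {"left", "right"}:
        return (left + right) / 2  # bilateral treatment: average both knees
    if treated_knees == {"left"}:
        return left                # unilateral: use the relevant knee only
    if treated_knees == {"right"}:
        return right
    raise ValueError("treated_knees must name at least one knee")

print(jmap_score(5, 3, {"left", "right"}))  # 4.0
print(jmap_score(5, None, {"left"}))        # 5
```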
Secondary outcomes included the WOMAC function subscale (13), the Short-Form 12 (SF-12) health survey, the SF-12 physical component summary and mental component summary scores (14), a 10-cm VAS for pain, range of motion (ROM), and the Timed Up and Go Test (15). We also estimated the proportion of patients who achieved at least 20%, 50%, and 70% improvement in WOMAC scores, which has been shown to be clinically relevant (16). Questionnaires were self-administered; ROM and the Timed Up and Go Test were measured by a blinded research assistant.
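The 20%, 50%, and 70% improvement categories can be illustrated with a short sketch. It assumes that improvement is computed as the percent decrease from the baseline score, which the text does not state explicitly; the function name and the example scores are hypothetical.

```python
def womac_response(baseline, followup, thresholds=(20, 50, 70)):
    """Flag whether percent improvement from baseline meets each threshold.

    Higher WOMAC values indicate more pain, so improvement is a decrease
    from baseline. Percent improvement is an assumed definition here.
    """
    if baseline <= 0:
        raise ValueError("percent improvement is undefined for a baseline of 0")
    pct_improvement = 100 * (baseline - followup) / baseline
    return {t: pct_improvement >= t for t in thresholds}

# Hypothetical patient whose pain score falls from 44 to 22.
print(womac_response(44.0, 22.0))  # {20: True, 50: True, 70: False}
```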
We took several steps to minimize bias. We did not use the words “sham” or “placebo,” and stated our objective as a comparison between traditional and nontraditional acupuncture. Participants could not have received acupuncture previously. Clinic procedures were implemented to minimize contact and interaction between participants; individuals with close relationships to participants were excluded. The research assistant conducting the followup assessments was blinded. Although the statisticians could not be blinded because of unequal sample sizes, the other investigators were blinded to the allocation groups until the analysis was complete. At 3 months, patients were asked whether they thought they had received TCA or nontraditional acupuncture.
Repeated-measures analysis of variance (ANOVA) was used to assess changes over time for treatment and style. Individual growth curve analysis was used to examine change over time for each individual (17). A linear model was used, with intercept and time as random effects, treatment and style modeled as fixed effects, and interaction terms for time with treatment, time with style, and style with treatment.
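One way to write the growth curve model described above is shown below. The notation is an assumption introduced for illustration (the text gives the model only in words): y_{it} is the outcome for patient i at time t, T_i is the treatment indicator (TCA versus sham), S_i is the style indicator (high versus neutral), and u_{0i} and u_{1i} are the patient-level random intercept and random time slope.

```latex
\begin{equation}
y_{it} = \beta_0 + \beta_1 t + \beta_2 T_i + \beta_3 S_i
       + \beta_4 (t \times T_i) + \beta_5 (t \times S_i) + \beta_6 (S_i \times T_i)
       + u_{0i} + u_{1i} t + \varepsilon_{it}
\end{equation}
```

In this form, \beta_1 captures the average change over time, \beta_2 and \beta_3 the main effects of treatment and style, and \beta_4 to \beta_6 the three interaction terms listed in the text.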
The analysis was conducted as intent-to-treat, carrying the last observation forward for dropouts. Our power calculations were based on an unbalanced design (1:2:4): 80 waiting list, 160 TCA, and 320 sham. This sample size yielded a power in excess of 0.99 (α = 0.01) for ANOVA of the pain scales. All analyses were performed using SAS, version 9.1.3 (SAS Institute).
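Carrying the last observation forward can be sketched as follows. This is a generic illustration of the technique, not the SAS code used in the trial; the function name and the example scores are hypothetical.

```python
def locf(values):
    """Carry the last observed value forward over missing (None) entries.

    Leading missing entries stay None because there is nothing to carry.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Hypothetical WOMAC pain scores at baseline, 4 weeks, 6 weeks, and 3 months
# for a patient who dropped out after the 4-week assessment.
print(locf([45.0, 38.0, None, None]))  # [45.0, 38.0, 38.0, 38.0]
```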
Of the patients who were randomized (n = 560), 238 were randomized to the high expectations group, 242 to the neutral expectations group, and 80 to the waiting list group. Numbers were slightly unbalanced because one of the acupuncturists left before the end of the first half of the trial. Twenty-five patients withdrew before treatment and 8 patients in the waiting list group did not return for reassessment. The analyses included 527 patients: 455 who received TCA or sham and 72 controls. Of those who received TCA or sham, 33 patients did not complete ≥10 acupuncture treatments. No differences were observed in dropout rates between the TCA and sham groups (9.2% versus 6.3%; P > 0.20) or high versus neutral communication groups (5.7% versus 8.7%; P > 0.20). Most patients (72%) received acupuncture in both knees.
Baseline characteristics are shown in Table 1. P values are shown for differences in proportions (chi-square test) or differences in means (F test) among all groups. No statistically significant differences were observed.
Table 1. Baseline demographic and clinical characteristics*
* Continuous variables are the mean ± SD. TCA = traditional Chinese acupuncture; J-MAP = Joint-Specific Multidimensional Assessment of Pain; WOMAC = Western Ontario and McMaster Universities Osteoarthritis Index; VAS = visual analog scale; SF-12 = Short Form 12 health survey; PCS = physical component summary; MCS = mental component summary; TUG = Timed Up and Go; ROM = range of motion; NSAIDs = nonsteroidal antiinflammatory drugs.
Chi-square test for difference in proportion between style groups or F test for difference in means between style groups (high/neutral/waiting list), as appropriate.
Sex, no. (%) women
Age, years: 63.5 ± 10.4; 65.3 ± 9.0; 65.5 ± 7.8; 63.7 ± 9.1; 64.1 ± 9.6
Ethnicity, no. (%)
Educational level, no. (%)
Less than high school
High school diploma/some college
Duration of knee pain, years: 10.0 ± 11.7; 8.4 ± 9.6; 8.4 ± 7.9; 8.8 ± 10.4; 11.5 ± 12.4
J-MAP score: 4.4 ± 1.2; 4.3 ± 1.3; 4.5 ± 1.3; 4.6 ± 1.2; 4.3 ± 1.2
WOMAC pain score: 43.3 ± 18.2; 44.8 ± 18.7; 45.6 ± 18.6; 45.2 ± 17.8; 44.1 ± 15.2
WOMAC function score: 41.5 ± 19.3; 45.1 ± 18.5; 44.2 ± 18.8; 44.1 ± 17.6; 40.1 ± 16.5
VAS pain score: 57.3 ± 21.0; 57.1 ± 24.0; 59.3 ± 23.6; 57.6 ± 23.0; 54.6 ± 21.3
SF-12 PCS score: 35.4 ± 9.7; 33.6 ± 8.1; 34.6 ± 10.0; 33.4 ± 9.3; 35.3 ± 8.4
SF-12 MCS score: 51.7 ± 10.1; 53.6 ± 9.3; 52.8 ± 8.8; 53.2 ± 9.3; 53.7 ± 10.7
TUG score, seconds: 14.3 ± 9.0; 13.6 ± 4.8; 13.6 ± 5.3; 13.2 ± 6.1; 12.3 ± 3.3
ROM, degrees of flexion: 104.1 ± 14.5; 105.3 ± 12.4; 107.3 ± 12.9; 106.0 ± 13.7; 105.7 ± 13.0
Medications, no. (%)
No differences were observed between the TCA and sham groups (Table 2). Patients receiving either form of therapy showed improvement for most outcome measures compared with the waiting list group.
Table 2. Outcome measures by acupuncture treatment group*
Patient-provider interactions were audio recorded and rated blindly. Mean ratings for acupuncturists in the high expectations group were higher than those in the neutral group (P < 0.0001). Patients in the high expectations communication style group had statistically significant improvements in pain (J-MAP and VAS) and satisfaction (SKIP) compared with the neutral group, with effect sizes of 0.25 and 0.22, respectively, at 3 months (Table 3). Patients receiving either communication style showed statistically significant improvement for most outcome measures compared with the waiting list group. No significant 2-way interactions between treatment and style were found in the models.
The effect of treatment and style on patients over time was examined using individual growth curve analysis. The time parameter was statistically significant for all measures, indicating that patients improved over the course of the study, equally for both treatments (TCA and sham). For J-MAP and SKIP, communication style was statistically significant (P = 0.02 and P = 0.01, respectively), favoring the high expectations group. For treatment (TCA versus sham), none of the outcomes showed a significant effect. No interactions were observed between time and style or between time and treatment.
Table 3. Outcome measures by communication style group*
We also examined differences in treatment and style by categorizing the WOMAC pain and function scales according to 20%, 50%, and 70% improvement. No statistically significant differences or trends were observed for treatment. For style, trends were observed favoring the high expectations group for 50% WOMAC improvement. For pain at 6 weeks, 41.2% in the high expectations group achieved 50% improvement compared with 33.6% in the neutral group (P = 0.10); at 3 months, 50% improvement was reported by 35.4% of patients in the high expectations group and by 27.5% in the neutral group (P = 0.07). For WOMAC function, a trend was observed at 4 weeks, with 24.3% and 17.0% in the high and neutral style groups, respectively, reporting 50% improvement (P = 0.05).
At the end of the study we asked participants which treatment they thought they had received, with 3 possible responses: TCA, nontraditional acupuncture, or not sure. No significant differences were observed; 52% in the TCA group and 43% in the sham group thought they had received TCA (weighted κ = 0.05, P = 0.23), showing successful blinding.
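To illustrate why a κ close to 0 is read as successful blinding, the sketch below computes an unweighted Cohen's kappa for a square table of actual versus guessed treatment. It is a simplification under stated assumptions: the trial reported a weighted κ with 3 response options, whereas this example uses a 2 × 2 table, and the counts are made up for illustration and are not trial data.

```python
def cohen_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts.

    table[i][j] = number of patients with actual category i and guessed category j.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance if guesses were unrelated to actual treatment.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    return (observed - expected) / (1 - expected)

# Made-up counts (rows: actual TCA, sham; columns: guessed TCA, sham).
hypothetical = [[50, 50], [50, 50]]
print(cohen_kappa(hypothetical))  # 0.0
```

When guesses match the actual treatment no more often than chance, observed and expected agreement coincide and κ is 0; perfect guessing would give κ = 1.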
Few adverse events were observed: 26 patients (7.2% of the TCA group versus 4.9% of the sham group) had exacerbation of knee pain, 22 (5.8% TCA versus 4.6% sham) had bruising at the needle site, 3 (<1% each TCA and sham) reported muscle cramps, 1 patient (TCA) reported headache, and 1 patient (TCA) had infection at the needle site.
To our knowledge, this is the first study designed to compare the efficacy of TCA with sham acupuncture in OA of the knee while experimentally controlling and measuring the effects of provider-patient interactions in the response to acupuncture. There were 2 major findings. First, acupuncture as practiced traditionally following the Chinese method of meridian needle insertion was not superior to the sham procedure. Second, the communication intervention had a small but significant effect on pain and satisfaction with treatment, demonstrating a placebo effect related to the clinicians' communication styles. Effect sizes were small, 0.25 and 0.22, respectively.
There is considerable interest in the use of acupuncture as a relatively safe intervention for the management of knee OA. Unfortunately, the quality of many trials has been poor (18–20). In many instances, sham procedures have not been adequate to mask the true intervention (19). Specifically, the use of retractable nonpenetrating needles may not adequately blind subjects (6, 21). Two recent systematic reviews have examined the efficacy of acupuncture in this condition (3, 4). White et al (4) identified 13 trials, of which 5 were considered to be of enough quality to be included in a meta-analysis (6, 7, 22–24). Although a statistically significant effect was observed favoring acupuncture over sham, these findings were mostly due to the results of 2 trials (6, 22) that used sham procedures with retractable needles. In one trial blinding was not successfully achieved (6) and in the other it was not reported (22). Another 2007 meta-analysis (3) concluded that the effects of acupuncture for knee OA were clinically irrelevant when compared with sham therapy, but found statistically and clinically significant benefits when compared with usual care or waiting list controls, as we observed in our study. Similarly in this review (3), trials that used nonpenetrating needling procedures showed clinically small but statistically significant beneficial effects (6, 22, 25). The studies using superficial but penetrating needling at nonacupuncture points as sham showed either no benefit or modest efficacy for traditional acupuncture (7, 26).
Our study used a sham procedure with superficial needling in nonmeridian points and minimal electric stimulation. Although the procedure was minimally invasive, it was sufficient to allow successful blinding as compared with some recent studies where blinding was unsuccessful. Our sham procedure may have had an analgesic effect from superficial needling such as a release of endorphins, but this effect is also observed with oral pain placebo. Meridian point insertion following TCA practices did not have an additional effect. Furthermore, using continuous electrical stimulation in the TCA group (compared with a few seconds in the sham group) was also ineffective.
Whether the improvement observed in both TCA and sham groups is due to needling (deep or superficial) or to the placebo effects of participating in a study with frequent contact with research staff cannot be easily established. A recent large trial of acupuncture for low back pain reported no differences between individualized acupuncture tailoring needling sites, a standardized approach using specific points, and simulated nonpenetrating acupuncture (27). In a systematic review of 3-armed trials of acupuncture for pain, a small difference was observed between traditional and placebo acupuncture and a moderate difference between placebo acupuncture and no acupuncture (28). The authors questioned the importance of meridians and specific acupuncture points emphasized by TCA and concluded that the analgesic effect of acupuncture is small and may result from incomplete blinding. Moreover, an acupuncturist may unintentionally convey different signals to the patient depending on whether TCA or sham acupuncture is administered. In addition to their training in behavioral styles, in our trial the interactions of acupuncturists with patients were audio recorded to ensure adherence to protocol.
In our study, patients in the high expectations group showed a clinically small but statistically significant improvement in knee pain and satisfaction (J-MAP, VAS, and SKIP) compared with patients in the neutral group. No significant differences were observed in WOMAC scores when using interval change as an outcome (13). However, we found borderline trends favoring the high expectations group when the WOMAC pain and function scores were dichotomized to identify those patients with ≥50% improvement, an approach previously recommended (16). A difference between the J-MAP and the WOMAC is that the J-MAP asks about pain intensity and frequency in general (e.g., how intense was your knee pain), whereas the WOMAC elicits information about pain with specific activities (e.g., pain going up or down the stairs). We also evaluated each knee separately with the J-MAP, whereas the WOMAC assessed both knees simultaneously.
Expectancies play a role in patients' perceptions of outcome, and placebo responses are enhanced in patients with high expectations for improvement (29–32). These expectations can be modeled according to the behavior and communication style of the provider interacting with the patient. A recent study evaluated the role of provider communication style in patients treated for irritable bowel syndrome with sham acupuncture in a single-blind trial (9). Patients were assigned to 3 groups: waiting list, placebo acupuncture with limited communication (<5 minutes with the provider), and placebo acupuncture with augmented interaction (a 45-minute visit with the provider communicating in a warm, empathic manner). A statistically significant incremental response was observed, with patients in the augmented group experiencing the most improvement.
The effects of TCA and sham needling appear to be mediated by similar neurochemical and neurophysiologic pathways (33–36). The release of endorphins and other neuropeptides has been documented both with placebo responses and acupuncture treatment, with or without electrical stimulation. Functional magnetic resonance imaging (MRI) studies have shown broad activation and deactivation in the central nervous system (31, 34, 37). Kong et al used nonpenetrating needles as a sham procedure to evaluate functional MRI changes in response to expectations about acupuncture in 16 subjects receiving thermal stimulation (37). Investigators manipulated the subjects' expectancies about acupuncture by explaining how pain relief might be achieved. Participants had higher pain relief and more prominent functional MRI activation when their expectancies of relief were enhanced. These findings, in conjunction with other evidence, support the important neurophysiologic changes that can modulate pain relief centrally through placebo effects enhanced by expectancies.
Our study had some limitations. First, we cannot make inferences with respect to the use of a less invasive placebo, such as nonpenetrating needles, which could have produced different results. Second, although we audio recorded the verbal interactions between patients and acupuncturists, nonverbal communication may also be important in the setting of acupuncture and placebo responses, and we were unable to measure this. Finally, although we made efforts to blind the study as much as possible with the established procedures, some patients may have been aware of which treatment they were receiving (TCA or sham); however, no differential responses were observed among the groups when patients were specifically asked which therapy they thought they had received.
This is the first study examining TCA and sham acupuncture in knee OA that has also included experimental manipulation of the acupuncturists' communication styles. In summary, TCA was not superior to sham acupuncture, and the needling of meridian points was not more effective than the use of sham points. Continuous electrical stimulus or increased needle penetration in the TCA group did not improve response. Acupuncturists' communication styles had a small but statistically significant effect on pain reduction and satisfaction, suggesting that the perceived benefits of acupuncture may be partially mediated through placebo effects related to the acupuncturists' behavior.
All authors were involved in drafting the article or revising it critically for important intellectual content, and all authors approved the final version to be submitted for publication. Dr. Suarez-Almazor had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study conception and design. Suarez-Almazor, Liu, Pietz, Marcus, Street.
Acquisition of data. Suarez-Almazor, Looney, Liu.
Analysis and interpretation of data. Suarez-Almazor, Cox, Pietz, Street.
The authors are grateful to the members of the research team who contributed to this work. They are Stacey Havelka, BA, Sonya Patel, MSOM, Andrea Price, BS, and Elizabeth Simmons, RN. We are also grateful to Michael Kallen, PhD, for his contributions to the analysis of this work.