Efficacy of open‐label counterconditioning for reducing nocebo effects on pressure pain

Nocebo effects can adversely affect the experience of physical symptoms, such as pain and itch. Nocebo effects on itch and pain have been shown to be induced by conditioning with thermal heat stimuli and reduced by counterconditioning. However, open‐label counterconditioning, in which participants are informed about the placebo content of the treatment, has not been investigated, although this could be highly relevant for clinical practice. Furthermore, (open‐label) conditioning and counterconditioning have not been investigated for pain modalities relevant to musculoskeletal disorders, such as pressure pain.


| INTRODUCTION
It is well known that nocebo effects (i.e. adverse treatment outcomes not attributable to active treatment components) can be induced via learning mechanisms, including classical conditioning and suggestions (Bartels et al., 2014; Benedetti et al., 2007; Colloca et al., 2008; Thomaidou et al., 2020). Much less is known about methods to reduce nocebo effects and their translation to clinical care. First findings indicate that counterconditioning reduces nocebo effects for conditioned thermal pain and itch and may even lead to placebo effects (Bartels et al., 2017; Thomaidou et al., 2020). During counterconditioning, the original unconditioned stimulus (US) (e.g. an increase of administered pain), with which a previously neutral stimulus (e.g. activation of a sham electrode) has been paired (conditioned stimulus [CS]), is replaced by a US of opposite valence (e.g. decreased pain stimulation). First results have indicated that counterconditioning is more effective than extinction, during which the CS is no longer paired with the US, so that people gradually learn that the US and CS are no longer associated (Bartels et al., 2017; Thomaidou et al., 2020). Although not examined yet, pressure pain could be a relevant pain modality in which to examine nocebo-learning strategies, since it elicits a deep tissue pain sensation similar to the pain that patients with chronic musculoskeletal pain disorders experience (Petzke et al., 2003; Wolfe et al., 1990).
Typically, deceptive (counter)conditioning paradigms have been used in experiments (Colagiuri et al., 2015; Colloca et al., 2008, 2010; Thomaidou et al., 2020), which might lead one to think that deceptive methods are needed to treat nocebo effects. In clinical practice, however, patients need to be informed about their treatment, as deception could harm trust in the healthcare provider and treatment (Miller et al., 2005; Peerdeman et al., 2021). Therefore, it is difficult to translate current findings to clinical practice. A possible solution lies in open-label (counter)conditioning procedures, in which people are informed about using inert treatments, which could provide a non-deceptive opportunity for reducing nocebo effects. Although open-label placebos have been demonstrated to be effective (Carvalho et al., 2016; Kaptchuk et al., 2010; Kleine-Borgmann et al., 2019; Locher et al., 2017), open-label nocebo conditioning has only been examined in one study (using itch) and open-label counterconditioning has not been examined. Furthermore, it is unclear whether placebo effects induced after counterconditioning are as strong as placebo effects induced without prior nocebo conditioning. It would therefore be relevant to investigate whether these findings can be replicated in a study using (pressure) pain and to also investigate open-label counterconditioning, as these findings may help develop new treatment opportunities for reducing nocebo effects in clinical care.
In the current study, we aimed to investigate the reduction in nocebo effects on pressure pain through open-label counterconditioning combined with open-label suggestions. We first tested whether a nocebo effect could be induced by open-label conditioning and suggestions by comparing nocebo conditioning with sham conditioning. Secondly, we tested whether counterconditioning works better than extinction, on which commonly used treatments (e.g. exposure treatment) are based. Counterconditioning was also compared to continued nocebo conditioning (which mimics a real-life situation in which people repeatedly have negative experiences) and placebo conditioning (to examine the influence of prior nocebo conditioning). We hypothesized that (1) nocebo conditioning induces a stronger nocebo effect than sham conditioning; (2) both counterconditioning and extinction reduce the nocebo effect in comparison to continued nocebo conditioning; and (3) counterconditioning yields a larger reduction than extinction. We further explored (4) whether placebo conditioning and counterconditioning successfully induce a placebo effect, and (5) whether this effect is larger after placebo conditioning than after counterconditioning. Investigating the effects of open-label counterconditioning on pressure pain in healthy participants builds on prior knowledge of (closed-label) counterconditioning examined in other pain modalities and could provide a first step towards new clinically applicable treatment strategies for chronic pain disorders.
not ethically appropriate for use in clinical practice. The current study demonstrates that open-label counterconditioning in a pain modality relevant for many chronic pain conditions may be a promising new strategy for reducing nocebo effects in a non-deceptive and ethical manner, which provides promise in designing learning-based treatments to reduce nocebo effects in patients with chronic pain disorders.

| Ethics statement
This study was approved by the Psychology Research Ethics Committee of Leiden University (reference number CEP18-1114/442) and pre-registered in the International Clinical Trials Registry Platform (number NCT05284383). All participants gave written informed consent and were reimbursed by €15 in cash or study credits. The current paper reports on data from a study entailing different study aims; the current paper focusses on the efficacy of conditioning and counterconditioning for inducing and reducing nocebo effects on pressure pain, whereas in another paper the predictive value of several psychological characteristics, as well as nocebo susceptibility on the strength of the nocebo effect and its reduction will be discussed (M. Karacaoglu, S. Meijer, K.J. Peerdeman et al., unpublished data, October 2021).

| Participants
The sample size required for our primary analysis was calculated using G*Power 3.1 for an independent samples t-test (two-sided, alpha = 0.05, desired power 0.80). The expected effect size was d = 0.73, based on a similar study on counterconditioning of nocebo effects (Bartels et al., 2017). According to the sample size calculation, 31 participants were needed per group. Since the design consisted of four groups in the second phase, we aimed for a total of 124 participants.
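As a rough cross-check of the figure above, the per-group sample size can be approximated with a normal-approximation formula plus a small-sample correction term. This is only a sketch: G*Power itself works with the noncentral t distribution, and the function name `n_per_group` is a hypothetical helper, not part of the study's materials.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided independent-samples t-test.

    Assumption: normal approximation with a z_alpha^2 / 4 correction term,
    not the exact noncentral-t computation used by G*Power.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    n = 2 * (z_alpha + z_beta) ** 2 / d ** 2 + z_alpha ** 2 / 4
    return ceil(n)


n = n_per_group(0.73)  # expected effect size taken from the text
total = 4 * n          # four groups in the second phase
print(n, total)
```

With d = 0.73 this approximation agrees with the 31 participants per group, and 124 in total, reported in the text.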
Participants were recruited through flyers at Leiden University and online via Facebook, as well as via the online recruitment system Sona (Sona Systems, Tallinn, Estonia). All participants had to be female, between 18 and 35 years old and have a good understanding of written and spoken Dutch. The counterconditioning procedure tested in the current study, once found to be effective in healthy participants, is intended to be used in future research with patients with fibromyalgia. As fibromyalgia is more prevalent in women (Marques et al., 2017), only female participants were tested in the current study, to avoid the possible influence of gender differences.
Exclusion criteria were severe somatic or psychiatric morbidity (e.g. heart/lung diseases, DSM-5 psychiatric disorders), Raynaud's disease, chronic pain complaints at present or in the past (≥3 months), current pain complaints (≥2/10 on Numeric Rating Scale [NRS]), current use of medication, injuries on the non-dominant hand, refusal/inability to remove nail polish or artificial nails on the thumbnail of the non-dominant hand for the experiment, colour blindness and pregnancy or breastfeeding.
Participants were excluded from further participation if their sensory discrimination was poor, that is if they were unable to distinguish between three different pressure intensities or if a pain intensity of 4.5/10 on NRS was not reached at maximum pressure levels. Participants were asked not to consume alcohol, recreational drugs, painkillers and/or sleep medication in the 24 h prior to testing.

| Design
A randomized controlled trial with a between-within-subjects design was employed, consisting of two parts (Figure 1). In part 1 (nocebo induction), participants were randomly assigned (3:1) to the experimental group (open-label nocebo conditioning) or the control group (open-label sham conditioning). A randomization list was made by an independent person and group allocation was noted down on paper and inserted into an opaque envelope, which was opened after the pressure pain calibration procedure, to reduce experimenter bias during calibration. Since all experimental manipulations contained open-label verbal instructions, neither the experimenter nor the participant could be blinded to group allocation. In part 2 (nocebo reduction), participants from the experimental group were randomly assigned (1:1:1) to one of three groups: open-label counterconditioning, open-label extinction or open-label continued nocebo conditioning. Participants in the control group underwent an open-label placebo conditioning procedure in part 2, to be able to compare the magnitude of placebo effects after counterconditioning (preceded by nocebo conditioning) to the magnitude of placebo effects after placebo conditioning (i.e. an identical procedure, but not preceded by nocebo conditioning).

| Pain induction
To induce pain, pressure pain stimuli were applied to the thumbnail of the non-dominant hand using a custom-made automated pneumatic stimulator, borrowed from the Karolinska Institute in Stockholm, Sweden (Jensen et al., 2009). The handpiece of the stimulator (borrowed from King's College London) has a plastic piston that applies pressure via a 1 cm² hard rubber probe. The handpiece has a cylinder opening where participants can insert their thumb, placed such that the probe contacts the middle of the thumbnail. The thumbnail was selected as a neutral location to repeatedly and safely deliver pressure stimuli, as has been previously used and reported on for both healthy and clinical samples (Jensen et al., 2009). Pressure pain was chosen because it more closely resembles the deep tissue pain that patients with chronic musculoskeletal pain disorders experience, in contrast to the more commonly used method of thermal pain, which relies on applying heat to the skin and leads to a burning sensation. Additionally, patients with fibromyalgia experience a sensitivity to pressure stimuli, and applying pressure to certain 'tender points' has previously been used in fibromyalgia diagnosis, although this is not a current criterion (Wolfe et al., 1990).
Pressure stimulus duration was set at 2.5 s, with an inter-stimulus interval of 30 s. The minimum intensity of pressure given was set at 50 kPa (≈5 N/cm² or 0.5 kgf), while the maximum was set at 850 kPa (≈85 N/cm² or 8.7 kgf).
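The unit equivalences quoted for the pressure limits can be checked with a short conversion sketch. It assumes the 1 cm² probe area given for the handpiece and the standard gravity constant; the function names are illustrative only.

```python
G = 9.80665           # standard gravity: newtons per kilogram-force
PROBE_AREA_CM2 = 1.0  # probe contact area reported for the handpiece


def kpa_to_n_per_cm2(kpa):
    # 1 kPa = 1000 N/m^2 and 1 m^2 = 10 000 cm^2, so 1 kPa = 0.1 N/cm^2
    return kpa * 0.1


def force_kgf(kpa, area_cm2=PROBE_AREA_CM2):
    # Force on the probe in newtons, converted to kilogram-force
    return kpa_to_n_per_cm2(kpa) * area_cm2 / G


for kpa in (50, 850):
    print(kpa, round(kpa_to_n_per_cm2(kpa), 1), round(force_kgf(kpa), 1))
```

Rounded to one decimal, the minimum and maximum pressures correspond to the 0.5 kgf and 8.7 kgf stated in the text.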

| Pressure pain calibration
A calibration procedure was conducted in order to find the optimal pressure intensity for minimal pain (0-1/10 NRS), slight pain (2-3/10 NRS) and moderate pain (4.5-5.5/10 NRS) for the individual participant, to be used in parts 1 and 2 of the experiment. A minimally painful pressure intensity (0-1 on the NRS) was also accepted for the lowest intensity, as slight sensitization was expected to occur due to the repeated administration of pressure, which could increase the minimally painful rating above zero. Calibration consisted of three phases. In phase 1, an ascending series of pressure stimuli (50 kPa increments) was applied up to the first pressure intensity participants rated as ≥5.5. In phase 2, five different stimuli were applied three times in random order, ranging from the highest pressure intensity rated as 0 in phase 1 up to the highest pressure intensity rated between 4.5 and 5.5. If no pressure intensity during the ascending series was scored between 4.5 and 5.5, a formula was used to calculate the appropriate value (see Supplementary Appendix A). In phase 3, a calibration check was performed. The intensities for phase 3 were determined by taking the median of all intensities in phase 2 rated within the numeric ranges for no, slight and moderate pain. If participants did not rate any intensity within one or more of the intended ranges, formulas were used again to inter- or extrapolate the intensity corresponding to the intended range of pain scores (see Supplementary Appendix A). The chosen final intensities were administered twice for minimal pain and moderate pain, and thrice for slight pain. Participants were required to rate at least one out of two (or two out of three for slight pain) stimuli within the intended ranges. If this requirement was not met for any of the three intensities, again formulas were used to calculate the adjusted intensity (see Supplementary Appendix A). If manual adjustments were impossible (due to the requirement of less than the minimum or more than maximum amount of pressure), participants were excluded. In total, participants received up to 38 pressure stimuli during calibration.
FIGURE 1 Overview of the study design. In part 1 (open-label nocebo induction), participants were randomly assigned (3:1) to the experimental group (nocebo conditioning) or the control group (sham conditioning). During the learning phase of nocebo conditioning, participants received moderate pain (4.5-5.5 on 0-10 Numeric Rating Scale [NRS]) during 'DNS on' trials and slight pain (2-3 on 0-10 NRS) during 'DNS off' trials. The sham group received stimuli of slight and moderate intensity, not specifically associated with 'DNS on' or 'DNS off' (i.e. 10 moderate intensity and 10 slight-intensity stimuli were randomly paired to the 20 trials). Both groups received slight pain stimuli for all trials in the test phase. In part 2 (open-label nocebo reduction), participants from the experimental group were randomly assigned (1:1:1) to one of three groups: counterconditioning, extinction or continued nocebo conditioning. During counterconditioning, 'DNS on' trials were now paired with minimal pain (0-1 on 0-10 NRS) and 'DNS off' trials with slight pain in the learning phase. During extinction, all trials were paired with a slight pain intensity. Continued nocebo conditioning was identical to nocebo conditioning in part 1. Participants in the placebo-conditioning group received minimal pain during 'DNS on' trials and slight pain on 'DNS off' trials, which is identical to the procedure of counterconditioning. The test phases in all groups in part 2 were identical to the test phases in part 1. DNS, Dermal Nerve Stimulation.
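The median rule used to choose the phase 3 intensities in the calibration procedure can be sketched as follows. The data structure and the function name `phase3_intensities` are assumptions made for illustration, and the example ratings are invented. The interpolation formulas from Supplementary Appendix A are not reproduced here, so the sketch simply returns `None` when no phase 2 rating fell within a range.

```python
from statistics import median

# NRS target ranges described in the calibration procedure
RANGES = {"minimal": (0.0, 1.0), "slight": (2.0, 3.0), "moderate": (4.5, 5.5)}


def phase3_intensities(phase2):
    """phase2: list of (pressure_kpa, nrs_rating) pairs from calibration phase 2."""
    result = {}
    for label, (low, high) in RANGES.items():
        # Keep only the pressures whose rating fell inside the target range
        in_range = [kpa for kpa, nrs in phase2 if low <= nrs <= high]
        # None marks the case that the appendix formulas would handle
        result[label] = median(in_range) if in_range else None
    return result


# Invented example ratings, for illustration only
example = [(100, 0.5), (150, 1.0), (300, 2.5), (350, 3.0), (400, 2.0), (600, 5.0)]
print(phase3_intensities(example))
```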

| Sham TENS device
During the experiment, a sham Transcutaneous Electrical Nerve Stimulation (TENS) device combined with a message indicating its (de)activation on a screen was used as a conditioning stimulus. Depending on the randomized group allocation, participants were taught a contingency between sham (de)activation of this device and the delivery of either a non-painful, slightly painful or moderately painful pressure intensity by the pneumatic stimulator. To avoid potential interference by participants' possible previous experiences or knowledge on the functions of a TENS, the device was referred to as a Dermal Nerve Stimulation (DNS) device. Two electrodes were attached below each other on the radial side of the participants' non-dominant forearm. As part of the open-label nocebo and placebo induction, it was explained to participants that while the DNS device was sham and therefore inactive, their pain would still be influenced because of the nocebo or placebo effect, respectively. These suggestions were repeated right before the start of each part.
The messages indicating (de)activation of the device were presented to participants on a computer screen, in purple or yellow text (colours associated with either activation or deactivation were counterbalanced across participants). The messages were displayed for 3.5 s, starting 1 s before the pressure was administered. Participants were instructed to keep paying attention to the screen. In between stimuli, a fixation cross was shown.

| Nocebo-induction part
Nocebo conditioning consisted of a learning and testing phase (Figure 1). In the learning phase, a button-press on the sham DNS device by the experimenter combined with a computer screen message in either purple or yellow indicating the activation of the DNS device ('DNS ON'), was repeatedly paired with a moderate-intensity pressure pain stimulus (pressure scored as 4.5-5.5 on 0-10 NRS for that participant), whereas the other-coloured computer screen message indicating the deactivation of the sham DNS device ('DNS OFF') was repeatedly paired with a slight-intensity pressure pain stimulus (2-3 on 0-10 NRS). In total, the learning phase consisted of 10 experimental trials ('DNS ON trials') and 10 control trials ('DNS OFF trials'), presented in a standard pseudorandom order (max two stimuli of the same trial type (experimental or control) could follow each other). The testing phase consisted of three experimental and three control trials in random order, all associated with a slight pressure pain intensity. Participants in the nocebo conditioning group were given open-label suggestions about the conditioning procedure and were told conditioning would be used to teach them that the activation of the sham DNS device will increase their pain sensitivity, by manually increasing the intensity of pressure stimuli after experimental trials. The precise verbal suggestions can be found in the Supplementary Materials.
Sham conditioning deviated from nocebo conditioning only in that pressure intensity was now not associated with sham DNS (de)activation, but randomly paired. For that, a random sequence was created for the 20 pain stimuli (10 slight-intensity stimuli and 10 moderate-intensity stimuli), while the order of the messages ('DNS ON' and 'DNS OFF') was identical to nocebo conditioning. Again, max two stimuli of the same trial type (experimental or control) could follow each other. Furthermore, participants were explicitly told there was no association between the DNS messages and the pain stimuli.
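The constraint on the learning-phase order (10 'DNS ON' and 10 'DNS OFF' trials, with at most two trials of the same type in a row) can be illustrated with a small rejection-sampling sketch. This is an assumption about how such an order could be produced. The text describes a standard pseudorandom order and does not say how it was generated, and the function names are hypothetical.

```python
import random


def longest_run(seq):
    """Length of the longest stretch of identical consecutive items."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def pseudorandom_order(n_per_type=10, max_run=2, rng=None):
    """Shuffle ON/OFF trials until no trial type repeats more than max_run times."""
    rng = rng or random.Random()
    trials = ["ON"] * n_per_type + ["OFF"] * n_per_type
    while True:
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)


order = pseudorandom_order(rng=random.Random(1))  # seeded for reproducibility
print(order)
```

Roughly 4% of all shuffles of 10 + 10 trials satisfy the constraint, so the loop typically ends after a few dozen attempts.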

| Nocebo-reduction part
For all groups, the learning phase of part 2 consisted of 20 trials (10 experimental and 10 control trials) and the testing phase was identical to the testing phase for the nocebo-induction part.
The counterconditioning procedure differed from nocebo conditioning in part 1 in that a non-painful pressure stimulus (0-1 on a 0-10 NRS) instead of a moderate-intensity pressure pain stimulus now followed the 'DNS ON' message. Again, participants were given open-label suggestions about the counterconditioning procedure and were told that counterconditioning would be used to teach them that the activation of the sham DNS device now decreases their pain sensitivity.
In the extinction procedure, only slightly painful stimuli were given during all trials, in both the learning and the testing phase. Participants were given open-label suggestions about the extinction procedure and were told that the pressure stimuli were no longer manually increased after the CS, to teach them that the activation of the DNS does not influence their pain anymore.
While both counterconditioning and extinction could decrease the nocebo effect, the main difference between the methods is that during counterconditioning the pain intensity paired with experimental trials is actively decreased (to below the level of pain during control trials), which thus resembles a more active strategy of reducing nocebo effects. During extinction, the pain intensity is identical to the intensity during control trials. This is comparable to either a gradual decrease of a nocebo effect without treatment, or therapies such as exposure, where repeated exposure decreases a certain negative association.
The procedure for the continued nocebo conditioning group in part 2 was identical to nocebo conditioning in part 1 of the experiment. This procedure mimics a real-life situation in which people repeatedly have negative experiences.
The procedure for placebo conditioning was identical to the counterconditioning procedure, apart from following sham conditioning instead of nocebo conditioning in part 1 of the experiment and a slight difference in the verbal suggestions given. Participants were told placebo conditioning would be used to teach them that the activation of the sham DNS device decreases their pain sensitivity. Placebo conditioning mimics a placebo treatment without people having negative experiences prior to this treatment (e.g. no existing nocebo effects prior to the placebo treatment).

| Self-report ratings
A questionnaire including demographic and health questions was used to screen participants for inclusion. Furthermore, several validated questionnaires were used to measure baseline psychological characteristics, which is elaborated on further in a separate article as this concerns different study aims (Karacaoglu, Meijer, Peerdeman et al., unpublished data, October 2021). During the experiment, experienced pain intensity was reported after each pressure stimulus on an NRS, ranging from 0 (no pain) to 10 (worst pain imaginable). Participants were allowed to use decimals while scoring their pain. An exit questionnaire consisted of questions on (1) what participants thought the aim of the experiment was (open-ended); (2) level of focused attention during the experiment (0-10 NRS; higher score indicates more focused attention); (3) experienced pain during experimental trials in part 1 (on a scale of 0-10, with 0 indicating less pain compared to control trials, 5 indicating equal pain and 10 indicating more pain compared to control trials), (4) experienced pain during experimental trials in part 2 (on the same scale as question 3), (5) trustworthiness of the experimenter (on a scale of 0-10, with a higher score indicating more trustworthiness), (6) competence of the experimenter (on a scale of 0-10, with a higher score indicating more competence) and (7) whether participants adjusted pain ratings during the experiment to help the experimenter (on a scale of 0-10, with a higher score indicating a higher amount of adjusted answers and thus a response bias). Baseline and exit questionnaires were filled in using Qualtrics software (Qualtrics). NRS scores were verbally communicated to the experimenter, who noted the scores down using an Excel form (Microsoft Corporation).

| Experimental procedure
The experiment was conducted in a single session and took approximately 2 h, with 5-min breaks in between the different parts of the calibration procedure and a 10-min break between parts 1 and 2 of the experiment. During the experimental procedure, the experimenter always followed a detailed standardized script to ensure procedures for each participant resembled each other closely. After the procedure was explained, participants signed the informed consent form. If participants were eligible based on the screening questions, participants completed all baseline questionnaires. Individual pressure pain levels were then calibrated. Next, part 1 of the experiment commenced (nocebo-induction part), followed by part 2 (nocebo-reduction part). Finally, participants completed the exit questionnaire and were debriefed and compensated for their participation.

| Statistical analyses
All data were analysed using SPSS 25.0 (IBM SPSS Statistics). Assumptions of all statistical tests were checked through examination of histograms, Shapiro-Wilk tests, Levene's tests and boxplots. In case of violation, we used a bootstrapping approach or non-parametric tests. The threshold of significance was set at p < 0.05, unless stated otherwise. One-way analyses of variance (ANOVAs) were used to assess between-group differences in calibration values, ability to focus during testing, trust in experimenter, perceived competence of the experimenter and response bias. An overview of all analyses described below can also be found in Supplementary Appendix C.

| Nocebo induction
To examine whether a significant nocebo effect was induced after nocebo conditioning and sham conditioning, two paired samples t-tests were performed separately within the nocebo-conditioning group and within the sham-conditioning group. For this, the average NRS score of experimental trials was compared with the average NRS score after control trials in testing phase 1.
Then, to test whether the induced nocebo effect was stronger after nocebo conditioning than after sham conditioning, an independent samples t-test was used to compare the induced nocebo effect (defined as a difference score between the average NRS score on all three experimental trials and the average NRS score on all three control trials in testing phase 1) between the nocebo-conditioning group and the sham-conditioning group. A Bonferroni correction was applied to correct for multiple testing and the threshold for significance was set at p < 0.017.
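The nocebo effect score and the corrected significance threshold described above amount to the following arithmetic. The ratings in this sketch are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean


def nocebo_effect(experimental, control):
    """Mean NRS on experimental trials minus mean NRS on control trials."""
    return mean(experimental) - mean(control)


# Bonferroni correction for the three tests in this analysis step
alpha_corrected = 0.05 / 3

# Invented ratings for the three experimental and three control test trials
effect = nocebo_effect([3.5, 4.0, 3.0], [2.5, 3.0, 2.0])
print(effect, round(alpha_corrected, 3))
```

A positive score means that the experimental ('DNS ON') trials were rated as more painful than the control trials, even though the pressure was identical in the testing phase.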

| Manipulation checks
As a manipulation check to see whether the sham conditioning was actually perceived as sham, two paired samples t-tests were conducted to examine whether experimental trials were on average rated significantly differently from control trials during the learning phase of both nocebo and sham conditioning. Additionally, in the nocebo-conditioning group, it was checked whether a difference between experimental and control trials in the test phase was actually due to increased NRS scores during experimental trials, instead of decreased scores during control trials because participants thought that there should be a difference from the experimental trials. This was done by comparing the average rating of the final control trial from the learning phase with the first control trial in the testing phase, using a paired samples t-test.

| Nocebo reduction within groups
In the nocebo-reduction part of the experiment, to determine whether the reduction in the nocebo effect within each group following nocebo induction was significant, three one-sample t-tests were performed. Nocebo reduction was defined as a difference score between the nocebo effect in testing phase 1 of the nocebo-induction part of the experiment and the nocebo effect in testing phase 2 of the nocebo-reduction part (the nocebo effect in part 2 was subtracted from the nocebo effect in part 1). In each group, the amount of reduction of the nocebo effect was compared to 0, as a significant (positive) deviation from 0 indicates a significant amount of change and thus reduction in the nocebo effect. A Bonferroni correction was applied to correct for multiple testing and the threshold for significance was set at p < 0.013.
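The reduction score and the test statistic of a one-sample t-test against 0 can be sketched as below. The values are invented, the function names are hypothetical, and the p-value is left out because it requires a t distribution, which the study obtained from SPSS.

```python
from math import sqrt
from statistics import mean, stdev


def nocebo_reduction(effect_part1, effect_part2):
    """Nocebo effect in part 1 minus nocebo effect in part 2."""
    return effect_part1 - effect_part2


def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for a test of the mean against mu0."""
    n = len(values)
    standard_error = stdev(values) / sqrt(n)
    return (mean(values) - mu0) / standard_error, n - 1


# Invented per-participant nocebo effects in testing phases 1 and 2
part1 = [1.5, 2.5, 3.0]
part2 = [0.5, 0.5, 0.0]
reductions = [nocebo_reduction(a, b) for a, b in zip(part1, part2)]
t, df = one_sample_t(reductions)
print(reductions, round(t, 2), df)
```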

| Nocebo reduction: Group differences
Then, we examined whether any differences existed in nocebo reduction between the counterconditioning, extinction and continued nocebo conditioning groups. Since we were only interested in the pairwise comparisons between groups (and specifically the group × time interaction), we conducted three separate 2 × 2 mixed-model ANOVAs and used Bonferroni to correct for multiple comparisons (i.e. we tested our effects of interest against alpha < 0.017). These tests compared the interaction of group [(1) counterconditioning vs. extinction, (2) counterconditioning vs. continued nocebo conditioning, and (3) extinction vs. continued nocebo conditioning] and time (nocebo effect after part 1 vs. nocebo effect after part 2).
Finally, the speed of reduction in the nocebo effect by counterconditioning and extinction was compared by examining the interaction between group (counterconditioning and extinction) and time (all 10 experimental trials in the learning phase of part 2) using a mixed ANOVA.

| Sensitivity analyses
Sensitivity analyses were conducted to assess the influence of excluding participants for whom no nocebo effect had been induced in phase 1 (i.e. participants for whom the difference between experimental vs. control trials in testing phase part 1 was zero or positive) from all analyses on nocebo reduction, as for these participants, there was no nocebo effect to be reduced, which may lead to incorrect inferences on the effects of counterconditioning.

| Placebo induction
To test whether a placebo effect could be successfully induced by placebo conditioning (following sham conditioning), a paired samples t-test was performed for the placebo-conditioning group, to test whether the average NRS score on the experimental trials significantly differed from the average NRS score on the control trials during the testing phase of placebo conditioning. Subsequently, a paired samples t-test was performed for the counterconditioning group to test whether a placebo effect was induced after counterconditioning, followed by an independent samples t-test to explore whether placebo effects induced after sham conditioning (placebo-conditioning group) are stronger than placebo effects induced after nocebo conditioning (counterconditioning group). If no difference was found, an equivalence test was run, using the 'two one-sided tests' (TOST) approach (Lakens et al., 2018). The upper and lower equivalence bounds were based on the smallest effect size of interest, which was d = 0.5 (a medium effect size). Then, the 90% confidence interval (CI) for the effect size of the difference between the placebo effect in the counterconditioning group and the placebo-conditioning group was calculated, to determine whether the 90% CI fell within the previously established range (which would indicate equivalence).
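The decision rule of the equivalence test described above reduces to checking whether the 90% CI of the effect size lies entirely inside the equivalence bounds. The sketch below shows only that final check. The CI values are invented, the function name is hypothetical, and computing the CI itself is not shown.

```python
def is_equivalent(ci90_low, ci90_high, bound=0.5):
    """True when the whole 90% CI lies strictly inside (-bound, +bound)."""
    return -bound < ci90_low and ci90_high < bound


# Invented confidence intervals, for illustration only
print(is_equivalent(-0.30, 0.25))  # CI inside the bounds
print(is_equivalent(-0.10, 0.65))  # upper limit exceeds the bound
```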

| Participants
Participants were recruited from December 2018 to March 2020. Out of 166 enrolled participants, 56 participants were excluded. Seven were excluded because of fulfilling one of the health-related exclusion criteria, 46 had too high a pain threshold (i.e. they did not reach a moderate pain level during calibration), 2 were excluded because they sensitized during conditioning (1 from the nocebo-conditioning group, 1 from the sham-conditioning group, part 1) and reported too high pain levels to continue the experiment, and 1 was excluded due to technical difficulties during the experiment. Due to the COVID-19 pandemic, it was decided to end the study prematurely and not to continue to reach the powered 124 participants, as data collected during the pandemic was not considered to be comparable to previously collected data, due to the additional safety measures (e.g. participant and researcher wearing masks, having a different lab set-up to ensure enough distance between participant and researcher).
In total, 110 participants were included in the final analyses of part 1, whereas 108 participants were included in the analyses of part 2. A flowchart of participant inclusion and exclusion, as well as group allocation is displayed in Figure 2. Descriptive data of calibration values and exit questionnaire scores are displayed in Table 1. During screening, nine people reported having current pain complaints (of lower than 2 on a 0-10 NRS), which was reported to be either muscle soreness from working out or mild menstrual pain. No significant differences were found between groups for calibration values, trust in experimenter, perceived competence of the experimenter and response bias. The nocebo-conditioning group and sham-conditioning group did differ significantly on the self-reported amount of experienced pain (on average) on experimental trials compared with control trials in part 1, as the nocebo-conditioning group reported to have felt more pain after experimental trials than the sham-conditioning group. This indicates that participants perceived the experimental procedures in the expected way. Furthermore, regarding part 2, all groups differed significantly on the self-reported amount of experienced pain (on average) on experimental trials (in comparison to control trials) in part 2, except for the counterconditioning group and placebo-conditioning group, as participants in both groups reported to have felt less pain after experimental trials. In the extinction group, no difference was reported and in the continued nocebo conditioning group, participants reported to have felt more pain during experimental trials. This indicates all procedures were perceived by participants as intended.
FIGURE 2 Flow diagram of the randomized controlled trial.

| Induction of the nocebo effect
The mean ratings on experimental and control trials during the testing phase of conditioning and sham conditioning are displayed in Table 2 and Figure 3. We found a significant difference between experimental and control trials in the testing phase of nocebo conditioning [t(84) = 12.10, p < 0.001, d = 1.31], as well as in the testing phase of sham conditioning [t(24) = 3.34, p = 0.003, d = 0.67], indicating that both procedures led to a nocebo effect. However, as hypothesized and as displayed in Figure 3, the nocebo effect was significantly larger in the nocebo-conditioning group than in the sham-conditioning group [t(84.33) = 6.82, p < 0.001, d = 1.27]. As a manipulation check, we tested whether experimental trials were on average rated significantly differently from control trials during the learning phase of both nocebo and sham conditioning. NRS scores during experimental trials were significantly higher than during control trials [t(84) = 22.10, p < 0.001] during nocebo conditioning, consistent with the difference in pressure intensity. As expected, during sham conditioning, experimental trials were not rated significantly higher than control trials [t(24) = −1.53, p = 0.138], consistent with the fact that the different pressure intensities were not specifically paired with either experimental or control trials. Additionally, no significant differences were found in the nocebo-conditioning group between the final control trial of the learning phase and the first control trial of the testing phase [t(84) = −0.13, p = 0.942], indicating that the induced nocebo effect in the testing phase of nocebo conditioning was driven by a higher pain score during the experimental trials, rather than a lower pain score after the control trials.
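For readers who want to relate the reported t statistics to the reported effect sizes, the sketch below computes a paired-samples Cohen's d as d_z = t / sqrt(n), with n = df + 1. This is an assumption: the text does not state which formula was used for d, but the two within-group values reported above are consistent with it. The function name `cohens_dz` is hypothetical.

```python
import math


def cohens_dz(t_value: float, df: int) -> float:
    """Paired-samples effect size d_z = t / sqrt(n), where n = df + 1.

    Assumes d was derived from the paired t statistic; the source
    does not confirm the formula.
    """
    n = df + 1
    return t_value / math.sqrt(n)


# t statistics and degrees of freedom as reported for the testing phase.
nocebo_d = cohens_dz(12.10, 84)  # nocebo-conditioning group
sham_d = cohens_dz(3.34, 24)     # sham-conditioning group

print(round(nocebo_d, 2))  # 1.31
print(round(sham_d, 2))    # 0.67
```

Under this assumption the values match the reported d = 1.31 and d = 0.67. The between-group comparison uses a different (independent-samples) design, so this formula is not applied to it here.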

| Reduction of the nocebo effect
The mean reduction in each group is shown in Figure 4. The nocebo effect was effectively reduced by both counterconditioning [t(26) = 6.77, p < 0.001, d = 1.35] and extinction [t(28) = 4.60, p < 0.001, d = 0.85], whereas continued nocebo conditioning showed no significant change in the nocebo effect compared with part 1 [t(26) = −0.047]. A 2 × 2 mixed-model ANOVA showed a significant interaction between group (counterconditioning vs. extinction) and time [nocebo reduction; F(1,54) = 14.06, p < 0.001, d = 1.02], indicating a significantly larger reduction in the nocebo effect after counterconditioning compared with extinction. Another 2 × 2 mixed-model ANOVA showed a significant interaction between group (counterconditioning vs. continued nocebo conditioning) and time [F(1,52) = 36.01, p < 0.001, d = 1.66], indicating a significantly larger reduction of the nocebo effect in the counterconditioning group compared with continued nocebo conditioning. Finally, the last 2 × 2 mixed-model ANOVA also showed a significant interaction between group (continued nocebo conditioning vs. extinction) and time [F(1,54) = 10.51, p = 0.002, d = 0.86], which indicated a significantly larger reduction in the extinction group compared with continued nocebo conditioning. For this analysis, the assumption of homogeneity of variances was violated, so the data in both groups were transformed by taking log10(nocebo effect + 10), after which the assumption was met. Since this led to results highly similar to those of the original analysis, the results of the analysis using non-transformed data are reported, to stay closest to the original data.
Reduction in the NRS score during experimental trials is displayed in Figure 5. Speed of reduction did not differ between counterconditioning and extinction, as no significant interaction between group and time (10 experimental trials in the learning phase of part 2 of the experiment) was found [F(5.04) = 0.395, p = 0.853].
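The transformation mentioned above can be sketched as follows, reading it as a base-10 logarithm of the nocebo effect plus a constant of 10. That reading is an assumption about the notation. The function name `log_transform` and the example values are hypothetical and are not study data.

```python
import math


def log_transform(nocebo_effect: float, shift: float = 10.0) -> float:
    """Return log10(nocebo_effect + shift).

    The shift keeps the argument positive for difference scores on a
    0-10 scale, which can be negative.
    """
    shifted = nocebo_effect + shift
    if shifted <= 0:
        raise ValueError("nocebo_effect + shift must be positive")
    return math.log10(shifted)


# Hypothetical difference scores (experimental minus control), not study data.
example_effects = [-1.0, 0.0, 2.5]
transformed = [log_transform(x) for x in example_effects]

for raw, value in zip(example_effects, transformed):
    print(raw, round(value, 3))
```

Because the logarithm compresses larger values more than smaller ones, a transformation of this kind can bring unequal group variances closer together while preserving the order of the scores.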

| Sensitivity analyses
After exclusion of six participants who did not show a nocebo effect after part 1 (i.e. 7.2% of the nocebo-conditioning group; four showed no change, two showed a change in the opposite direction), all analyses on nocebo reduction yielded the same conclusions.
TABLE 2 Group means and SDs for reported pain during the learning and testing phase of nocebo induction and reduction, as well as for the magnitude of the nocebo effect and the reduction in the nocebo effect.

| DISCUSSION
The current study investigated the efficacy of open-label nocebo conditioning, and of open-label counterconditioning and open-label extinction on the reduction of induced nocebo effects, using the pain modality of pressure pain. We demonstrated that open-label conditioning can induce a nocebo effect on pressure pain, as participants rated more pain than was actually administered during the test phase of nocebo conditioning. Furthermore, both open-label counterconditioning and extinction combined with suggestions were found to reduce nocebo effects. Both strategies led to an immediate reduction in the nocebo effect from the start of the procedure, instead of a gradual decrease. Counterconditioning yielded a larger reduction than extinction. Counterconditioning not only reduced nocebo effects but also induced a similar level of conditioned placebo analgesia as placebo conditioning (preceded by sham conditioning), as participants rated less pain than was actually administered.

FIGURE 3 Average NRS ratings and standard error of the mean of all three experimental trials and control trials during the testing phase of nocebo conditioning and sham conditioning. Both for nocebo conditioning and sham conditioning, experimental trials were rated as significantly more painful than control trials. The magnitude of the nocebo effect was significantly larger in the nocebo-conditioning group than in the sham-conditioning group. *p < 0.01, **p < 0.001 (two-tailed). NRS, Numeric Rating Scale.

FIGURE 4 Level of nocebo reduction from the testing phase of nocebo induction to the testing phase of nocebo reduction. Means and standard errors of the mean are depicted across the three groups. Both counterconditioning and extinction led to a significant reduction in the nocebo effect, whereas continued nocebo conditioning did not. Counterconditioning led to a significantly larger reduction than extinction and continued nocebo conditioning; extinction also led to a larger reduction than continued nocebo conditioning. *p < 0.01, **p < 0.001 (two-tailed).
In line with previous research on closed-label nocebo conditioning (Bartels et al., 2014; Colloca et al., 2008; Thomaidou et al., 2020), open-label conditioning with verbal suggestions effectively induced a nocebo effect. This shows that conditioning is effective even when there is honesty about the procedure, which supports previous findings on the efficacy of open-label nocebo conditioning on itch (Meeuwis et al., 2019) and of the use of open-label placebos in clinical trials (Carvalho et al., 2016; Kaptchuk et al., 2010; Kleine-Borgmann et al., 2019; Locher et al., 2017). The current study was the first to show open-label induction of nocebo effects on pressure pain. The use of pressure pain, compared to commonly used methods like thermal and electrical pain, can be beneficial when designing (counter)conditioning-based treatments in patients with musculoskeletal pain conditions, as it more closely taps into the specific sensitivity to pressure and mimics the real-life experience.

FIGURE 5 Average nocebo effects and standard errors of the mean throughout the testing phase of part 1 and the learning phase of part 2 are displayed for counterconditioning and extinction. The first 3 trials represent the difference in pain between experimental and control trials during the testing phase of part 1, while the next 10 trials represent the difference in pain between experimental and control trials in the learning phase of part 2 (separated by the vertical line). While counterconditioning shows the largest reduction, the speed of reduction does not differ between the groups.

FIGURE 6 Average Numeric Rating Scale ratings and standard errors of the mean of all three experimental trials and control trials during the testing phase of counterconditioning and placebo conditioning. Both for counterconditioning and placebo conditioning, experimental trials were rated as significantly less painful than control trials. No significant difference in the magnitude of the placebo effect was detected; *p < 0.01, **p < 0.001, ns = p ≥ 0.05 (two-tailed).
As for nocebo reduction, this study was the first to use an open-label counterconditioning procedure to reduce nocebo effects. In line with studies using closed-label procedures (Bartels et al., 2017; Thomaidou et al., 2020), counterconditioning was found to effectively reduce nocebo effects and to be more effective than extinction. Open-label counterconditioning fully extinguished the nocebo effect and produced a placebo effect. While open-label extinction led to a significant reduction in the nocebo effect, this reduction was smaller than after counterconditioning and the nocebo effect was not fully extinguished (as participants still experienced slightly more pain during experimental trials). This finding slightly contradicts studies showing nocebo effects cannot be reduced by extinction, but it does provide further support for nocebo effects being resistant to complete extinction (i.e. no longer experiencing more pain during experimental trials than during control trials) (Bartels et al., 2017; Colagiuri et al., 2015; Colagiuri & Quinn, 2018; Colloca et al., 2008). It should be kept in mind that the current study used open-label conditioning to induce nocebo effects. This is typically not the case outside of an experimental environment, as pain is not deliberately associated with certain stimuli and people are not aware of being conditioned. Therefore, conditioning in daily life more closely resembles closed-label conditioning, during which people are not informed they are being conditioned. As more resistance to extinction was found in closed-label studies (Colagiuri et al., 2015; Colagiuri & Quinn, 2018; Colloca et al., 2008, 2010) compared with the current study, nocebo effects in the real-world context may be more resistant to extinction. This highlights the importance of finding new ways to reduce nocebo effects, such as counterconditioning.
A possible explanation for the lower resistance to nocebo reduction compared with closed-label procedures is that little or no fear towards the CS was induced during conditioning, owing to the open-label nature of the study. While trial-by-trial fear (or fear after hearing the suggestions) was not assessed in the current study, the open-label procedure was more predictable than traditional closed-label paradigms, which in turn could lead to participants feeling less anxious about the CS and the pain associated with it. A recent study has shown fear to play an important role in the induction and amplification of nocebo effects, as a larger amount of self-reported fear predicted a larger nocebo effect (Thomaidou et al., 2021). Furthermore, several studies using pain conditioning have shown that fear regarding the painful stimuli may arise as a result of conditioning (Meulders et al., 2011, 2015; Meulders & Vlaeyen, 2013). Future studies on open-label (counter)conditioning and extinction should take the possible influence of fear into account, so that the results of open- and closed-label (counter)conditioning and/or extinction can be compared more directly.
Another explanation for these findings could be the open-label instruction itself, as during closed-label procedures, even when closed-label verbal suggestions are added, it is not mentioned that the amount of administered pain is adjusted by the experimenter. During open-label counterconditioning and extinction, participants are explicitly told that the experimenter will no longer increase the pain or will lower the pain intensity, meaning there is little to no uncertainty regarding the administered pain. This is supported by our finding that speed of reduction did not differ between the counterconditioning and extinction groups. Typically, conditioned effects extinguish gradually during extinction, but our results showed a decrease right after the open-label instructions that suggested the extinction of pain increase. This illustrates the interaction between conditioning and the role of verbal suggestions in pain regulation (Montgomery & Kirsch, 1997), and the rapid extinction could thus be due to the explicit suggestion that pain would no longer be increased after presentation of the CS. This indicates that the influence of the provided verbal suggestions could be stronger than the counterconditioning or extinction procedure itself. Bajcar et al. (2021) found that the order of procedures (conditioning vs. suggestions) matters: when incongruent verbal suggestions were given after a conditioning procedure, the suggestions and not the conditioning determined the placebo effect. Furthermore, cue validity studies have shown that expectations regarding a certain stimulus, as well as how painful this stimulus is perceived to be, can change on a trial-by-trial basis because of the use of different cues (i.e. a low or high tone) (Lorenz et al., 2005). While in our experiment the cue itself is altered (the meaning of 'DNS on' is changed from part 1 to part 2), this does indicate that pain experience can be subject to sudden changes, which our results would support.
Our verbal instructions during nocebo reduction may have been more dominant than the preceding nocebo-conditioning procedure, which could explain why the nocebo effect was reduced right from the start of counterconditioning and extinction. Nevertheless, the reduction of the nocebo effect may have been strengthened by the counterconditioning or extinction procedure that followed. It could be relevant to compare the effects of counterconditioning and/or extinction with and without open-label verbal suggestions, to better disentangle the effects of the individual learning mechanisms.
In summary, in an open-label lab context, counterconditioning and extinction can both reduce nocebo effects, with counterconditioning fully reducing nocebo effects and inducing a placebo effect and extinction only partially reducing the nocebo effect. It remains to be investigated, however, whether these findings can be replicated in closed-label and/or clinical settings. In future studies, it would be relevant to investigate whether open-label counterconditioning can also effectively reduce nocebo effects induced by closed-label conditioning, since this better resembles real-world settings, and to compare open- and closed-label procedures. This can give more insight into the separate effects of learning mechanisms (i.e. conditioning and verbal suggestions) and the open-label aspect of the procedure. While one previous study did not find any differences in efficacy between open- and closed-label conditioning (Meeuwis et al., 2019), differences between open- and closed-label counterconditioning have not been researched.
Importantly, our findings suggest that counterconditioning can not only reduce a nocebo effect, but even produce as strong a placebo effect as could be induced without prior negative learning experiences. However, the lack of induced fear and the open-label nature of the study could have made it easier to reduce the established nocebo effects, meaning it is important to replicate these findings in a study inducing nocebo effects in a closed-label fashion.
The effectiveness of open-label counterconditioning in an experimental setting is promising for clinical practice, as it could offer a new treatment strategy for reducing nocebo effects, while remaining fully transparent to patients. However, it remains to be researched whether the current findings in healthy participants will also be found in chronic pain patients. Several factors can influence the rise of nocebo effects in patients with chronic pain. Conditioning effects, such as the association of a doctor's white coat with a painful treatment, but also verbal cues, such as information on certain painful side effects, can lead to negative expectations regarding a treatment or (the development of) symptoms and thus a nocebo effect (Klinger et al., 2017). Additionally, patients have been shown to have an attentional bias towards pain information (Van Ryckeghem et al., 2013), which could further increase the chance of negative expectations. Potentially, open-label procedures could be particularly effective in altering these kinds of naturally occurring expectations, as open-label suggestions are very explicit and may shift the attentional focus of patients towards pain reduction because they specifically mention that pain will be manipulated. These expectations of pain reduction are then further validated by the counterconditioning procedure itself (during which pain is actually lowered). As mentioned above, closed-label procedures leave some uncertainty regarding pain levels, which may lead patients to focus on the pain they previously experienced, while in open-label procedures this may be less likely. Nevertheless, those suffering from chronic pain might still respond less to counterconditioning than healthy controls, due to multiple negative treatment experiences in the past (Peerdeman et al., 2016).
Alternatively, patients may have a stronger desire for relief, meaning it is also possible they respond better to counterconditioning, as studies have shown desire for pain relief is associated with placebo analgesia (Vase et al., 2003, 2005). Therefore, the efficacy of this procedure should be tested in individuals with chronic pain. Additionally, applying such a procedure in a clinical setting can be more challenging than in an experimental setting, where the nocebo effect was induced experimentally. In a clinical setting, nocebo effects are acquired over time, and it may prove difficult to establish which associations induced those nocebo effects. More importantly, in the lab the symptoms experienced can be directly manipulated (i.e. the amount of pressure pain can manually be decreased during counterconditioning), whereas in a clinical setting, this is not possible (e.g. if patients experience nausea upon entering the hospital because of previous treatment experiences in the hospital, this nausea cannot easily be manipulated). Therefore, the procedure may have to be adjusted before clinical application; while the symptom cannot be targeted directly, it is possible to pair the hospital setting with something of a more positive valence than the nausea (e.g. music that makes the patient happy). Alternatively, an association with symptom decrease could be conditioned in a laboratory environment where symptoms can be manipulated, after which homework exercises can be given to promote generalization of this association to different environments. An example of such a treatment protocol is described in Meijer et al. (2022).
A possible limitation of the study is that valence of the CS was not measured throughout the experiment. Therefore, it is not possible to judge whether valence regarding the CS changed in the expected direction during conditioning and after counterconditioning or extinction. Several studies demonstrating the superiority of counterconditioning over extinction (using fear or evaluative conditioning) have suggested this might be because counterconditioning effectively changes CS valence, whereas extinction does not (Engelhard et al., 2014; Kang et al., 2018; Kerkhof et al., 2011; Raes & de Raedt, 2012). However, it could be argued that nocebo counterconditioning does not completely resemble counterconditioning used in fear or evaluative paradigms, as the US is never fully taken away (only lowered to a less intense level). It is therefore important to assess whether this is sufficient for altering CS valence, as the alteration of valence has been found to strengthen the reduction in fear and reduce relapse, suggesting that the reduction in the nocebo effect might last longer when CS valence is successfully altered during counterconditioning.
Furthermore, while this study was conducted only in females to allow better comparison in future studies with fibromyalgia patients (the majority of whom are female), the current study procedures should also be examined in males. There may be differences between males and females in the response to (open-label) suggestions and conditioning.
In conclusion, this study demonstrates that nocebo effects on the pressure pain modality can successfully be induced by a combination of open-label conditioning and verbal suggestions. Moreover, open-label counterconditioning and extinction can effectively reduce these nocebo effects, with counterconditioning leading to a stronger reduction than extinction and even producing a placebo effect similar to placebo conditioning without prior negative-associative learning. While more research is needed on the effectiveness of counterconditioning in chronic pain patients, the current study demonstrates that open-label counterconditioning in a pain modality that is relevant for many chronic pain conditions may be a promising new strategy for reducing nocebo effects in a nondeceptive and ethical manner.