Reliability and validity of a continuous pain registration procedure

Authors

  • A.J. van Wijk,

    Corresponding author
    • Department of Social Dentistry and Behavioural Sciences, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and VU University Amsterdam, The Netherlands
  • F. Lobbezoo,

    1. Department of Oral Kinesiology, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and VU University Amsterdam, MOVE Research Institute Amsterdam, Amsterdam, The Netherlands
  • J. Hoogstraten

    1. Department of Social Dentistry and Behavioural Sciences, Academic Centre for Dentistry Amsterdam (ACTA), University of Amsterdam and VU University Amsterdam, The Netherlands

  • Funding sources

    No external funding was received for this work.

  • Conflicts of interest

    There are no financial or other relationships concerning this manuscript that might lead to a conflict of interest.

Correspondence

Arjen J. van Wijk

E-mail: a.vwijk@acta.nl

Abstract

Background

Conventional pain rating scales [i.e. visual analogue scales (VAS) or numerical rating scales (NRS)] only provide a summary for different levels of pain felt, while the duration of these levels is not accounted for. If pain can be rated continuously, the area under the curve (AUC) of varying pain intensity over time can be calculated, which integrates varying pain intensity with duration. The present study examined the reproducibility and validity of a continuous pain rating procedure.

Methods

Twenty-eight healthy volunteers participated. Pain was induced using constant current delivered to the non-dominant forearm using bipolar electrodes. Pain was rated continuously on an electronic VAS monitored by a computer. For each participant, the level of current needed to achieve a weak, mild, slightly moderate and moderate level of pain was determined (part I). Next, participants were asked to rate the painfulness of six periods of electrical stimulation (part II). Unknown to the participants, they were presented with the four levels of current obtained in part I, where the level of current for mild and moderate pain was presented twice (in order to assess consistency). The order of presentation was randomized for all subjects.

Results

In general, participants produced reliable mean AUCs. In addition, the AUC of pain intensity over time could clearly discriminate between the four levels of pain used in the present study.

Discussion

A continuous pain registration procedure, using an AUC approach, may be a promising direction to explore. Results can be improved by allowing more training on the use of the electronic VAS.

What's already known about this topic?

  • Conventional pain rating scales only provide a summary for different levels of pain felt; while the duration of these levels is ignored, no means have been reported to summarize the temporal aspect in a single figure.

What does this study add?

  • If pain is rated continuously, the area under the curve (AUC) of varying pain intensity over time can be calculated.
  • The AUC integrates pain intensity with duration.

1. Introduction

Suppose a patient is given the opportunity to choose between two different procedures to perform some type of painful treatment. Although both procedures can be considered equally painful, the time needed to perform procedure one is twice as long as the time needed to perform procedure two. Which procedure would the patient choose? It is obvious that patients will choose the shorter lasting procedure. Still, it is quite common to ask a patient, directly afterwards, to rate the painfulness of a procedure or treatment. This is typically carried out using conventional pain rating scales, such as numerical rating scales (NRS), visual analogue scales (VAS) and others (Turk and Melzack, 1992), which all require a person to express his/her pain using a single point on a scale. Two important remarks should be made at this point. First, clinical pain is never the experience of a constant intensity of pain but varies over time instead. This implies that, in a clinical situation, patients must somehow process the different levels of pain felt during a given time period in order to summarize their experience in a single number (a rating on an NRS or VAS). A complicating factor is that we do not know whether people summarize their experiences in a similar fashion [although Kahneman et al. (1993) suggest a peak-and-end rule]. Second, although time seems an obvious decisive factor when pain intensity is the same, conventional pain rating scales do not allow time to be integrated in the pain rating.

So, conventional pain rating scales only provide a summary for different levels of pain felt, while the duration of these levels is not accounted for. A more adequate approach would be to express pain as the total sum of different levels of pain experienced over time. If pain is rated continuously, this expression of pain can be obtained by calculating the area under the curve (AUC) of varying pain intensity over time. The AUC can be seen as a cumulative measure of varying pain intensity over time. Stated otherwise, it integrates varying pain intensity with the duration of such intensity. The idea of the AUC of pain intensity over time is not novel. AUCs are very common in designs of analgesic clinical trials (Max and Laska, 1991). For instance, to investigate the effect of analgesics on pain following third molar surgery, patients are required to fill out measures of pain each hour for the first six (or more) postoperative hours. However, such an approach (limited measurements during a relatively long time) deviates substantially from measuring pain continuously during a relatively shorter period of time.
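The idea of integrating intensity with duration can be illustrated with a minimal sketch. The ratings below are hypothetical (a constant rating of 40 on a 0 to 100 scale, sampled once per second), and the function name `auc` and the simple rectangle rule are assumptions made for illustration, not a procedure taken from the cited trials.

```python
def auc(intensities, sample_interval_s):
    """Approximate the area under a pain-intensity curve (rectangle rule)."""
    return sum(intensities) * sample_interval_s

# Hypothetical ratings: the same constant intensity, but procedure one
# lasts twice as long as procedure two (one sample per second).
procedure_one = [40] * 20   # 20 s at intensity 40
procedure_two = [40] * 10   # 10 s at intensity 40

auc_one = auc(procedure_one, 1.0)
auc_two = auc(procedure_two, 1.0)

# A single-point rating would be 40 for both procedures, whereas the
# AUC reflects that procedure one lasts twice as long.
print(auc_one, auc_two)
```

In this sketch a single-point rating cannot separate the two procedures, while the AUC of the longer procedure is twice that of the shorter one.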

However, a continuous pain rating approach would require that subjects are able to indicate their pain while being in pain. Since pain draws attention, one can imagine that subjects are, at least, hindered in their ability to transfer the subjective experience of pain to a scale of some sort, while being stimulated. Therefore, the present study aims to assess the reproducibility and validity of a continuous pain rating procedure.

2. Methods

2.1 Participants

Thirty healthy volunteers (all dental students) agreed to participate in this study in exchange for a modest financial compensation (10 Euro). Students were approached in the lunchroom, given a pamphlet with information about the study and asked to participate. Appointments were made for at least 1 week later. The research protocol was approved by the Medical Ethical Committee of the VU University Amsterdam (Protocol Number 04.148). All subjects signed the informed consent form. Subjects were excluded from the study whenever they had a pacemaker, were pregnant, reported impaired health (assessed using a general health questionnaire) or used painkillers on the day of the experiment. Although menstrual cycle, anxiety and depression can influence pain perception (in terms of between-subject variability), these were not taken into consideration since we were primarily interested in within-subject consistency.

2.2 Method and material

2.2.1 Pain induction

Electrical current was chosen as painful stimulus and was induced using a stimulator (constant current stimulator, model DS7A, Digitimer Ltd, Hertfordshire, England). The stimulus was a rectangular pulse (1 ms width) at a 5-Hz frequency. Current was delivered through custom-made, bipolar surface electrodes made of silver amalgam (4 mm diameter). The inner part of the lower non-dominant arm, halfway between the elbow and the wrist, was cleaned with alcohol. Conduction gel (TECA Electrode Electrolyte, Beckman, MN, USA) was applied to the electrodes, which were attached (inter-electrode distance was 5 mm) on the arm (next to each other, perpendicular direction) using double adhesive rings (Product Number 3444280, Enraf Nonius, Delft, The Netherlands).

2.2.2 Pain measurement

Participants had to indicate their pain continuously, during the experiment, using an electronic VAS (Fig. 1). The electronic VAS was a 6-cm-long, vertically sliding device (i.e. a potentiometer), labelled from no pain (score 0) to moderate pain (score 8). Although the electronic VAS read 0–10, the score that was transmitted to the computer for analysis ranged from 0 to 100. The score (position) was monitored continuously on a computer. The side of the electronic VAS contained a small button. The device was held in such a way that participants could continuously indicate pain intensity using one hand (dominant arm), and press the button using the hand that held the device (non-dominant).

Figure 1.

A picture of the electronic visual analogue scale. The scale is divided into 10 scores. The score 4 reads mild pain, the score 8 reads moderate pain.

2.2.3 Procedure

The study took place during 10 consecutive working days. All experiments were performed in the afternoon. Several precautions were taken to maximize standardization. When entering the laboratory, subjects were requested to turn off their mobile phones, and a ‘Do-Not-Disturb’ sign was attached to the door. Before commencing, subjects were informed that all instructions would be read aloud from the predetermined research protocol, and that there was no room for additional conversation. Subjects were seated in a chair in such a way that their lower arms were stretched out horizontally on a table. The table was surrounded by a movable room screen to prevent visual distraction as much as possible. Two experimenters (not the authors) performed the entire study. One experimenter (male) operated the computer; the other experimenter (female) interacted with subjects (reading instructions, etc.).

The experiment consisted of two parts. In part I, it was determined what levels of electrical current were necessary to achieve a feeling of weak, mild, slightly moderate and moderate pain for each subject. Current was slowly increased, and subjects were instructed to move the electronic VAS along with their subjective pain experience (see Fig. 2). When their feeling reached weak pain, participants pressed the button on the side of the electronic VAS, and ideally, the position of the electronic VAS reached weak pain as well. Next, current was further increased until mild pain was reached, and so on for slightly moderate and moderate pain. This entire process was repeated three times. Repeating this process provided participants with the opportunity to practice using the electronic VAS, and so to establish some form of calibration to the scale. In addition, the associated levels of current could now be based on an average score. For each level of subjective pain, the associated mean current was calculated based on measurements 2 and 3. These levels of current were used in the second part of the experiment.
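The averaging step at the end of part I can be sketched as follows. The function name `calibrated_currents`, the dictionary layout and the currents are all hypothetical; only the rule itself (mean of the second and third measurement per pain level) comes from the text.

```python
def calibrated_currents(trials):
    """Return the mean current per pain level over trials 2 and 3.

    trials: list of three dicts, each mapping a pain level to the
    current (mA) at which the button was pressed on that trial.
    The first trial is left out of the mean, as described in the text.
    """
    second, third = trials[1], trials[2]
    return {level: (second[level] + third[level]) / 2 for level in second}

# Hypothetical currents (mA) for one subject; not data from the study.
trials = [
    {"weak": 0.3, "mild": 0.6, "slightly moderate": 0.8, "moderate": 1.2},
    {"weak": 0.5, "mild": 0.9, "slightly moderate": 1.1, "moderate": 1.5},
    {"weak": 0.9, "mild": 1.1, "slightly moderate": 1.5, "moderate": 1.9},
]
levels = calibrated_currents(trials)
print(levels)
```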

Figure 2.

Schematic representation of the design (part I). Current is slowly increased to determine the level of current needed to achieve a feeling of weak, mild, slightly moderate and moderate pain.

In part II of the experiment, subjects were presented with six periods of stimulation, each separated by a 60 s resting period. Subjects were told that the computer would randomly select levels of current from the interval no pain–moderate pain, and that they had to rate the painfulness on the electronic VAS. In reality, all subjects were presented with the four levels of current obtained in part I, where the level of current for mild and moderate pain was presented twice (in order to assess consistency). The order of presentation was randomized for all subjects (Fig. 3).

Figure 3.

Schematic representation of the design (part II). Each subject is presented with the four levels of current obtained from part I. The order is randomized for each subject. Stimulation periods are separated by a 60 s wait.
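The design of part II can be sketched as a shuffled list of six presentations. The text does not say how the randomization was implemented (for example, whether repeated levels were allowed to follow each other directly), so the unrestricted shuffle below is an assumption, as are the names `LEVELS`, `REPEATED` and `presentation_order`.

```python
import random

LEVELS = ["weak", "mild", "slightly moderate", "moderate"]
REPEATED = ["mild", "moderate"]  # presented twice to assess consistency

def presentation_order(rng):
    """Return the six stimulation periods in a random order."""
    order = LEVELS + REPEATED    # new list; the constants are not modified
    rng.shuffle(order)           # unrestricted shuffle (assumption)
    return order

# A seeded generator makes the order reproducible for one subject.
order = presentation_order(random.Random(1))
print(order)
```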

2.2.4 Stimulus and response incongruence

Subjects were presented with painful stimuli of a given intensity (level 1 through 4, associated with weak to moderate pain) and duration. During stimulation, subjects were required to express their pain continuously on the electronic VAS. As such, there is a possibility for unwanted measurement error to arise. This is illustrated in Fig. 4.

Figure 4.

Two possible sources of unwanted measurement error. Response delay is the time between start of stimulation and start of the response. Intensity delay is the time needed to reach the desired level of subjective pain.

First of all, experience with a pilot study (van Wijk et al., 2007) showed that some subjects need considerable time to ‘decide’ what they feel before moving the electronic VAS. This is labelled ‘response delay’ in Fig. 4 and was corrected for in the following manner. Stimulation time was set to 10 s. For each subject, the AUC was determined over the interval from the first point in time where the response started to 5 s later. With a sampling frequency of 10 Hz, this gives a response consisting of 50 measurements, which are summed to obtain the AUC (which now covers 5 s for everyone).

However, moving the VAS to a certain position on the scale takes time, and this is labelled the ‘intensity delay’. The latter is inherent to this study, in which the stimulus is a constant, but has to be rated continuously.
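The response-delay correction described above can be sketched as follows. The text does not state how the start of a response was detected, so the rule used here (first sample above zero) is an assumption, as are the function name `windowed_auc` and the two hypothetical traces. If fewer than 50 samples remain after onset, this sketch simply sums what is available; the text does not say how such cases were handled.

```python
SAMPLING_HZ = 10   # samples per second, from the text
WINDOW_S = 5       # length of the analysed window in seconds, from the text

def windowed_auc(samples, threshold=0):
    """Sum the 50 samples that follow the start of the response.

    The response is assumed to start at the first sample above
    `threshold`. Returns None when no response is found.
    """
    n = SAMPLING_HZ * WINDOW_S
    onset = next((i for i, v in enumerate(samples) if v > threshold), None)
    if onset is None:
        return None
    return sum(samples[onset:onset + n])

# Two hypothetical 10 s traces (100 samples, scores on the 0-100 scale):
# the same ramp up to a score of 40, started after 1.5 s and after 3 s.
ramp = [4 * k for k in range(1, 11)]      # 1 s 'intensity delay'
early = [0] * 15 + ramp + [40] * 75
late = [0] * 30 + ramp + [40] * 60

print(windowed_auc(early), windowed_auc(late))
```

Both traces yield the same AUC, which shows that the window removes the response delay. The ramp at the start of each window (the intensity delay) still lowers the AUC relative to a response that reaches its final position at once.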

2.2.5 Statistical analysis

Conventional statistical analyses were used: t-tests to compare mean scores, Pearson's correlation as a measure of linear association, and the intraclass correlation coefficient (ICC) as a measure of agreement (reliability). Analysis of variance (ANOVA) for repeated measures was used to compare multiple dependent mean scores. The significance level was set at alpha = 0.05.
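As an illustration of a single-measure ICC, the sketch below computes a two-way random-effects, absolute-agreement coefficient, often written ICC(2,1). The text does not state which ICC model was used, so this choice is an assumption, and the AUC values are hypothetical.

```python
def icc_2_1(ratings):
    """Single-measure, two-way random-effects ICC (absolute agreement).

    ratings: list of rows, one per subject, each holding k repeated scores.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for subjects, repetitions and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                 # between-subject mean square
    msc = ss_cols / (k - 1)                 # between-repetition mean square
    mse = ss_err / ((n - 1) * (k - 1))      # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical AUCs for four subjects, two presentations of one level.
aucs = [[500, 450], [1500, 1300], [2500, 2600], [1000, 1200]]
icc = icc_2_1(aucs)
print(round(icc, 2))
```

Unlike a comparison of means, this coefficient drops when individual subjects give different AUCs on the two presentations, even if the group means are equal.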

3. Results

3.1 Participants

Complete data for one female participant had to be discarded as a result of a technical malfunction of the stimulator. Another female subject failed to produce responses (as a result of misunderstanding the instructions) for three out of six stimulation periods and was also discarded. This resulted in a total of 28 participants (11 men, 17 women) with usable data (mean age = 23.1, standard deviation = 4.5).

3.2 Part I of the experiment

In this part of the study, it was determined what level of current would be necessary to achieve different levels of pain within subjects. The process was repeated three times. For each trial, the mean level of current associated with each of the subjective pain levels is presented in Table 1. A number of relevant findings can be noted at this point. First, ANOVA for repeated measures showed that the four levels of current (total column in Table 1) differed significantly from each other, F(3, 25) = 12.6, p < 0.001. In other words, the mean levels of current needed to achieve different levels of pain on a subjective level differed significantly. Also remarkable is the approximately equal spacing between the subjective levels of pain in terms of current needed to achieve them. That is, there seems to be a mean difference of about 0.30 mA between each subjective pain level and the next. Second, for each subjective pain level, there was a significant difference between the first, second and third trial in terms of current needed to achieve that level. The current needed to produce the same level of pain on each subsequent trial increased significantly. In other words, there seemed to be substantial habituation to the electrical stimulus. Finally, although male subjects scored consistently higher (more current) on each level than female subjects, no significant differences were present (p-values ranged from 0.21 to 0.34), and separate data are therefore not reported.

Table 1. Mean level of current (mA) associated with weak, mild, slightly moderate and moderate pain for the total sample

                       First trial*     Second trial*    Third trial*     Total*
                       Mean    SD       Mean    SD       Mean    SD       Mean    SD
Weak §                 0.36    0.25     0.56    0.54     0.84    0.84     0.70    0.69
Mild §                 0.63    0.45     0.87    0.80     1.20    1.11     1.03    0.95
Slightly moderate §    0.85    0.62     1.13    0.99     1.48    1.36     1.30    1.17
Moderate §             1.29    1.13     1.50    1.38     1.83    1.69     1.66    1.52

  1. Total = average of trial 2 and 3.
  2. *p < 0.05 within a trial across pain levels; § p < 0.05 within a pain level across the trials. SD, standard deviation.

3.3 Part II of the experiment

All subjects were presented with six periods of stimulation during which pain intensity had to be rated. Unknown to them, all subjects were presented with the four levels of current obtained in part I of the experiment, with mild and moderate painful stimulation being presented twice, and the order of stimulation randomized over all subjects. ANOVA for repeated measures was used to compare the mean AUC indicated for each stimulation period. Results show a significant main effect for the repeated measures, F(5, 23) = 34.5, p < 0.001. Subsequent pairwise comparisons show that all mean AUCs differ significantly from each other in the expected direction (higher scores for higher levels of painful stimulation), with the exception of the AUCs associated with repeated mild and repeated moderate painful stimulation. That is, no significant differences could be shown between the two presentations of mild pain, t(24) = 1.35, p = 0.19, or between the two presentations of moderate pain, t(26) = 1.27, p = 0.22. These results are shown in Table 2.

Table 2. Mean AUCs associated with each stimulation period

                               Mean AUC     SD
Weak pain                      549.33       551.7
Mild, 1st presentation         1482.5 a     791.5
Mild, 2nd presentation         1274.4 a     807.7
Slightly moderate              2117.6       929.1
Moderate, 1st presentation     2793.8 a     891.0
Moderate, 2nd presentation     2567.0 a     941.9

  1. AUC, area under the curve; SD, standard deviation.
  2. a p > 0.05 within the same level of stimulation.

To summarize, these results show that the AUC of a continuous pain rating shows discriminative validity. That is, it can easily discriminate between four levels of subjective pain, while it does not discriminate between the repeated mild and repeated moderate stimulations. In addition, the fact that the order of painful stimulation was randomized across subjects makes this result more convincing. Also, the results suggest that when subjects are stimulated with the same level of intensity, this will, on average, result in the same AUC. However, comparable mean scores do not imply consistency, and therefore ICCs were calculated. For mild painful stimulation, the (single measure) ICC was 0.60. For moderate painful stimulation, the ICC was 0.57. Neither coefficient is as high as one would prefer, but both are at an acceptable level (see the discussion).

3.4 Individual differences

Although the results so far suggest reproducible values that show discriminative validity, the responses to the painful stimulations were not uniform. That is, some subjects ‘performed’ better than others. The individual responses were explored thoroughly and subjectively categorized into ‘good’ (n = 5), ‘reasonable’ (n = 13) and ‘bad’ (n = 10). The online supplemental file (Fig. S1) shows an example for each category and makes a number of things clear. First of all, not every subject was equally good at the task of transforming a painful stimulus to a corresponding point on the electronic VAS. Some subjects were quite good, some were bad, some in the middle. This issue is addressed in the discussion. When inspecting the response curves, it is also apparent that the earlier labelled ‘intensity delay’ serves as a valid explanation for the relatively modest ICCs found. That is, it is difficult to intentionally display the same delay. Indeed, when an ICC is calculated for those subjects that showed a ‘good’ response, the ICC increases to 0.68 for mild pain, and to 0.84 for moderate pain.

4. Discussion and conclusions

The aim of the present study was to investigate the reliability and discriminative validity of a continuous pain registration procedure. Results showed that the AUC of continuously rated pain intensity over time yields reproducible mean scores and does possess discriminative validity. Reliability or reproducibility of a continuous pain registration procedure has been investigated before by the present authors (van Wijk et al., 2007), and more recently by other authors as well (Boormans et al., 2009). Both studies have shown that subjects are able to rate pain, using a continuous procedure, in a reproducible manner, which is confirmed by the present study. Although some points are worth discussing (next paragraph), results are quite positive and suggest that a continuous pain registration procedure may be a worthwhile direction to explore. Further on, we argue why this is the approach to choose in some situations.

Reliability, although acceptable, was not as high as one would desire. The following explanations should be considered. Inspecting the response curves made clear that some subjects were quite good at the task, some were very poor and some were in the middle. Rather than concluding that subjects are not able to rate their pain continuously, it is more tempting to assume that subjects could do much better with some form of training in the use of the electronic VAS. Apart from part I of the experiment, subjects had no experience at all in using the electronic VAS, or in rating painful stimuli. In the planned follow-up study, this aspect will be given thorough attention. Another point is that the order of stimuli was randomized for each subject. Although this randomization strengthens the study in a methodological sense, the painfulness of a stimulus is partially dependent on the previous stimulus. In other words, mild pain may seem less painful when it is preceded by moderate pain than when it is preceded by weak pain. Although the order was randomized, this dependence does result in more within-subject variability, which in turn negatively affects the ICCs calculated. One final factor is the habituation (Milne et al., 1991) that took place during the experiment (Table 1). That is, on each subsequent trial, more current was needed to achieve the same level of pain (part I). In order to reduce the role of habituation, there was a 60 s period of rest in between each stimulation period. We would like to add that the habituation is expected to be stronger in part I of the experiment than in part II, because in part I the levels of current needed to achieve the different levels of subjective pain were determined in one go. As a result, the stimulation time during part I of the experiment (average 2 min) is much longer than the stimulation periods in part II (10 s each). In addition, this process (from part I) was repeated three times with little time in between. In summary, although some habituation took place, we believe that its effects are acceptable and have only led to an underestimation of reliability.

One final issue concerns the following. If there is a response delay, and not all subjects can use the continuous pain rating equally well without some form of practice, would it not be better to use the VAS or NRS? The answer is most likely no, especially for situations like the one sketched in Fig. 5 [a situation that is directly related to the work by Kahneman et al. (1993)]. In this figure, two separate stimuli consist of the same pain intensity combined with equal duration. For both stimuli, the problem arises that was described in the introduction. A subject who is presented with these stimuli and who is asked to rate them on a conventional pain rating scale will have to somehow summarize the different levels of pain intensity (and their duration) into a single estimate on a VAS or NRS. This may yield a different estimate for stimulus 1 and stimulus 2 while, in a sense, both stimuli can be considered equally painful. In fact, the work by Kahneman et al. (1993) suggests that, based on the peak-and-end rule, subjects will judge stimulus 2 to be less painful than stimulus 1. However, we would expect that a continuous pain rating procedure should result in a level of experienced pain that is (approximately) the same for stimulus 1 and 2. In addition, in situations like the one sketched in Fig. 5, it is expected that conventional pain rating scales will yield lower consistency scores, while an AUC approach using a continuous pain rating procedure would be more appropriate. This specific hypothesis will also be tested in a follow-up study.

Figure 5.

Two separate stimuli that both consist of an equal amount of pain intensity combined with equal duration. How will a subject score these separate (yet comparable) stimuli on conventional pain rating scales?
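The contrast between a peak-and-end summary and the AUC can be sketched numerically. The two traces below are hypothetical and are not the stimuli drawn in Fig. 5; they only share its property of equal intensity content and equal duration. The peak-and-end summary is taken here as the mean of the highest and the last rating, which is one common reading of the rule and an assumption on our part.

```python
def auc(intensities, sample_interval_s=1.0):
    """Approximate the area under a pain-intensity curve (rectangle rule)."""
    return sum(intensities) * sample_interval_s

def peak_end(intensities):
    """Mean of the peak rating and the final rating (assumed form of the rule)."""
    return (max(intensities) + intensities[-1]) / 2

# Hypothetical ratings, one sample per second, 10 s each:
# the same two intensity levels in opposite order.
stimulus_1 = [20] * 5 + [60] * 5   # ends at the higher intensity
stimulus_2 = [60] * 5 + [20] * 5   # ends at the lower intensity

print(auc(stimulus_1), auc(stimulus_2))
print(peak_end(stimulus_1), peak_end(stimulus_2))
```

Under these assumptions the AUC is identical for both stimuli, while the peak-and-end summary is lower for the stimulus that ends at the lower intensity.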

So, a continuous pain registration procedure would be more appropriate in situations in which a painful experience consists of multiple pain intensities, possibly with differing durations. As noted in the introduction, clinical pain is hardly ever the experience of constant intensity but rather shows a fluctuating character over time. However, these are the first results concerning such a procedure, and it would be desirable to replicate these findings in experimental settings, using other stimuli and allowing more practice using the electronic VAS, before making a transfer to clinical situations and clinical pain. Therefore, a continuous pain registration procedure as used in this study should be applied with, and would be very suitable for use with, other human pain models, such as pressure pain (Reeves et al., 1986), ischemic pain (Maurset et al., 1991), thermal pain (Fruhstorfer et al., 1976), intracutaneous pain (Bromm and Meier, 1984), tonic pain (Chen et al., 1989) or other pain models (Staahl and Drewes, 2004).

To conclude, the present study shows that subjects (young and intelligent) are able to use a continuous pain registration procedure in a reliable manner, which yields valid scores.

Author contributions

All authors have substantially contributed to the manuscript. All authors have read, and commented on, the manuscript. Below you will find a more detailed description of authors' contribution.

AJ van Wijk: Conceptualized, conceived and designed the experiments, supervised the experiments, analyzed the data, wrote the manuscript.

F Lobbezoo: Contributed to the execution of experiment (laboratory, materials, etc.), contributed to conceptualization, wrote the manuscript.

J Hoogstraten: Contributed to conceptualization, supervised the project, wrote the manuscript.

Acknowledgements

J.J. van der Weijden Ing is thanked for contributing his technical expertise to the realization of this study. The Department of Oral Function kindly permitted use of their research facilities. Finally, the experimenters, Yasmeen Ahmed and Gaurav Bagga, are thanked for their commitment to the experiment and carrying it out so meticulously.
