To determine if participants in chronic disease self-management courses have a change of perspective of their health status (a response shift), and if this is measurable with a paper-based questionnaire.
Nine items were developed to measure potential benefits of self-management courses. These were based on the constructs of a previous questionnaire, the Health Education Impact Questionnaire (HEI-Q). Cognitive interviews elicited spontaneous statements about the reasons for paper-based answers. Sensitivity, specificity, and overall accuracy of items were calculated using the interview as a relative gold standard. Response shift can be negative (i.e., after the course, participants realize that, before the course, they were worse than they thought they were), positive (i.e., participants now realize they were better than they thought they were), or absent (no change).
Interviews (n = 39) reflected that true response shift occurred in approximately half the replies to questionnaire items. Of these, 31% were negative response shift, 20% were positive response shift. Response shift was absent in 32% of replies. Presence or absence of response shift could not be determined in 17% of replies across items. Significant concordance between questionnaires and cognitive interviews (average overall accuracy 0.79) indicated that the HEI-Q Perspective questionnaire detects response shift in participants of self-management courses. The questionnaire revealed that 87% of participants had response shift in at least 1 item.
This study suggests that preintervention/postintervention assessments of interventions such as self-management courses are confounded by a change in perspective of a large proportion of respondents. It also indicates response shift is a valuable outcome of self-management courses that can be measured with a paper-based questionnaire.
Health education and self-management (SM) programs for persons with arthritis and other chronic diseases have become prominent in North America, Britain, Australia, and other regions (1, 2). These programs are becoming widely endorsed by clinical and community groups as well as governments. However, systematic reviews reveal small to moderate effects for biologic outcomes (e.g., glycosylated hemoglobin, blood pressure, asthma attacks) and small to no effects for questionnaire-based assessments, including subjective patient-reported outcomes such as pain or disability in individuals with arthritis (3).
Traditional evaluation of health intervention programs relies on classic pre- and postintervention assessment, focusing on changes from baseline to followup. The outcomes assessed often include proximal outcomes (e.g., change in knowledge, self efficacy) and more distal outcomes such as health-related quality of life, pain, or disability. The assessment of these outcomes relies on eliciting participant self-perceived status pre- and postintervention. These assessments assume that participants do not change the way they perceive their health over the period of the program. The assumption is that participants have the same perspective of their disease and their wellbeing both before and after the program and, therefore, answer a followup questionnaire from the same point of view as they held for the baseline questionnaire.
The definition of response shift (RS) is a change in the meaning of one's self evaluation of a target construct as a result of recalibration (a change in a person's internal standards of measurement), reevaluation or reprioritization (a change in a person's values), and reconceptualization (a change in the way a person defines a target construct; for example, quality of life) (4). We define RS as having 3 forms of expression: negative (i.e., after the course, participants report they now realize they were worse before the course than they thought they were), positive (i.e., participants now realize they were better than they thought they were), or absent (no change).
RS is a phenomenon that was explored and developed in the fields of education and organizational change (5). It has recently been applied to medical and health-related psychosocial research where self-reported data are collected (4). In these areas, RS has primarily been examined with respect to individuals who experience long-term, often terminal illness and their caregivers (5–7). In this study, RS was investigated with regard to arthritis and other chronic disease SM programs that were of short duration (6 weeks).
Pre- and postintervention assessments are generally not designed to detect participants' changed perceptions of their health. However, as participants learn (and use) new and more effective strategies for coping with their chronic disease and observe others who are in better and worse conditions than themselves, their internal standards of measurement, their values, and their definition of health and quality of life may change. If individuals have a changed perception of symptoms, then they may answer questions in a systematically different way at followup. That is, participants may respond differently to questions posed after an SM program than they did to the same questions posed before an SM program. It may be possible for this to happen even if they have no physical improvement or decline. It is this change of perspective, or RS, that may confound outcome assessment of SM programs where an intervention involves social comparisons or cognitive behavioral interventions. It is also important to recognize that RS may be a genuine and intended (albeit implicit) treatment effect. The aims of the present study were to determine the extent to which RS occurs, and to determine if a paper-based questionnaire, the Health Education Impact Questionnaire (HEI-Q) Perspective, can accurately identify the presence of RS.
This study is a component of a larger evaluation of the outcomes of SM programs. The participants were community-based individuals who had voluntarily taken part in an SM course delivered by Australian arthritis foundations. Although a range of other outcomes was collected, only data collected on RS were used for this study. A questionnaire was mailed to 353 individuals 2–10 months after they had completed an SM program; no active followup was undertaken. Telephone calls were made to those participants who returned the questionnaire and ticked a box that indicated their permission to be contacted for an interview. Interviewees were purposefully sampled from the larger sample by RS questionnaire score (whether scores lay primarily to the left, middle, or right of the scale described below) to produce an approximate balance between negative, absent, and positive RS. The target sample was 30 but was increased to 40 to ensure saturation had been reached. One interviewee was excluded due to cognitive difficulties with the task.
The SM programs were the standard Stanford University School of Medicine Arthritis or Chronic Disease Self-Management Programs (8) comprising 6 structured sessions of 2 hours' duration. The project was approved by the University of Melbourne Research Ethics Committee.
A set of 9 items, as indicators of RS, was generated specifically for this study. Each item was designed to capture an independent potential benefit of an SM intervention. The items were derived from work undertaken to inform the development of the HEI-Q, a new chronic disease health-education outcomes questionnaire. The set of 9 items is called the HEI-Q Perspective.
The HEI-Q Perspective covers 8 dimensions (see Appendix A, available at the Arthritis Care & Research Web site at http://www.interscience.wiley.com/jpages/0004-3591:1/suppmat/index.html). Each dimension was reflected by 1 item, except the Health Service Navigation dimension, which was covered by 2 items. Therefore, there are 9 items to represent 8 dimensions. For this study, because RS was revealed through a retrospective assessment indicating that things are viewed differently now compared with before, the items were written to capture a difference in terms of diminution or augmentation of perceived importance or impact of an issue. Items were therefore presented with positive and negative statements at either end: “I now realize that before the course I was much more anxious, worried, or depressed than I thought I was” and “I now realize that before the course I was much less anxious, worried, or depressed than I thought I was.” The middle point on the scale was “About the same.” The middle point was scored 0 (absent RS), answers to the left were scored from −1 to −3 (negative RS), and answers to the right were scored from 1 to 3 (positive RS).
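The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (0 = absent RS, −1 to −3 = negative RS, 1 to 3 = positive RS); the function names and the example participant are hypothetical and are not taken from the study.

```python
from typing import Iterable


def rs_category(score: int) -> str:
    """Map one HEI-Q Perspective item score (-3..3) to an RS category."""
    if score not in range(-3, 4):
        raise ValueError(f"score must be an integer from -3 to 3, got {score!r}")
    if score < 0:
        return "negative"  # answers to the left of the midpoint
    if score > 0:
        return "positive"  # answers to the right of the midpoint
    return "absent"        # midpoint, "About the same"


def has_any_rs(scores: Iterable[int]) -> bool:
    """True if a participant shows RS (a non-zero score) on at least 1 item."""
    return any(rs_category(s) != "absent" for s in scores)


# Hypothetical participant: nine item scores invented for illustration only.
example_scores = [0, -2, 1, 0, 0, 3, 0, -1, 0]
print([rs_category(s) for s in example_scores])
print(has_any_rs(example_scores))
```

A per-participant check such as `has_any_rs` corresponds to the kind of summary reported later as the proportion of participants with RS in at least 1 item.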
Telephone interviews were conducted 1–4 weeks after a person completed the HEI-Q Perspective questionnaire. Cognitive interviews (9) were sought to establish if the answers from the paper-based questionnaire (intended to measure the presence of RS) reflected the presence or absence of RS. For each answer on the paper-based questionnaire, the interviewer (MH) first reminded the participant of his or her answer. Then the primary interview question was, “Can you tell me more about that answer?” If necessary, 2 additional prompts were used: “How did you come to see it that way?” and “Is there anything else you would like to say about that?” It was intended that these questions would elicit spontaneous answers with clear indications of RS, if RS was present. No leading questions or additional probes were used. At the end of the dialogue for each question, the participant's comments were paraphrased. If the participant did not approve the interviewer's paraphrase then it was reworded until the meaning of the participants' comments was understood by the interviewer. Paraphrases were accepted when participants answered affirmatively with comments such as “Yes,” “Oh, definitely,” or “That's true.”
A negative RS was exemplified by comments such as, “I now realize that, before the program, I was more depressed than I thought I was,” or, “Since learning what other people do, I now realize I am worse than what I thought I was.” Positive RS comments included, “When I saw other people in the program, I realized I was not so bad,” or, “I didn't think I was doing much but, when I saw other people in the program, I realized I was.” The midpoint (0 score) was scored as no (or absent) RS, corresponding with participants regarding themselves as still having the same perspective as they did before the program. When no clear negative, positive, or absent RS comment was volunteered, the interview result was scored as unclear.
For the purposes of this study, the result from the qualitative RS interview was regarded as the true indicator of RS (akin to a gold standard). The questionnaire scores were dichotomized to either absence or presence of RS (i.e., negative and positive RS were combined). In this way, questionnaire responses were categorized as either true positive (agreement between questionnaire and interview that RS was present, i.e., questionnaire and interview both indicated positive or both indicated negative RS), true negative (agreement between questionnaire and interview that RS was not present), false positive (where the questionnaire suggested RS was present but the interview could not demonstrate RS), or false negative (where the questionnaire suggested that RS was not present but either the interview found RS to be present or the absence of RS could not be demonstrated). These data were then used to calculate the following indicators of test accuracy: sensitivity (the ratio of the number of correctly identified individuals with RS and the total number of individuals with RS) and specificity (the ratio of the number of individuals correctly identified as not having RS and the total number of individuals without RS). The overall accuracy was the measure of true findings (true positive + true negative results) divided by all test results. This is also termed “the efficiency” of the test.
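As a minimal sketch of these definitions, the following Python code classifies one questionnaire/interview pair and computes the three accuracy indicators from the resulting counts. The function names are hypothetical. One case is not spelled out in the text: a questionnaire answer and an interview that both indicate RS but in opposite directions. The sketch assumes such a pair counts as a false positive; that choice is an assumption, not the study's stated rule.

```python
def classify(q_score: int, interview: str) -> str:
    """Classify one reply as TP, TN, FP, or FN.

    q_score   : questionnaire item score from -3 to 3 (0 = absent RS)
    interview : 'negative', 'absent', 'positive', or 'unclear'
    """
    if q_score != 0:
        q_direction = "negative" if q_score < 0 else "positive"
        # True positive only when the interview shows RS in the same direction.
        # ASSUMPTION: an opposite-direction interview result is treated as FP.
        return "TP" if interview == q_direction else "FP"
    # Questionnaire says no RS: agreement requires a clear 'absent' interview;
    # interview-detected RS or an unclear interview both count as FN.
    return "TN" if interview == "absent" else "FN"


def accuracy_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Sensitivity, specificity, and overall accuracy from the four counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "overall_accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Counts for the Positive and Active Engagement in Life item (Table 3).
metrics = accuracy_metrics(tp=24, tn=9, fp=4, fn=2)
print({name: round(value, 2) for name, value in metrics.items()})
```

Applied to the Table 3 counts for Positive and Active Engagement in Life, the formulas reproduce the tabulated sensitivity, specificity, and overall accuracy for that item.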
The questionnaire was sent to 353 individuals. A total of 137 participants returned the questionnaire and, of these, 121 had usable responses for all items (34% response rate). The data from the 121-person subset are reported. The mean age was 63 years (range 33–83 years), 84% were women, and approximately one-third of the sample reported arthritis as their main health problem. The demographics of the interview subset (Table 1) were similar to the overall sample.
|Characteristic||Total sample (n = 121), no. (%)||Interview subsample (n = 39), no. (%)|
|Age, mean ± SD years||63 ± 11.9||59 ± 11.0|
|Female||112 (84)||34 (87.0)|
|Main health problem|
|Arthritis (osteoarthritis and arthritis)||35 (29)||12 (31)|
|Fibromyalgia||9 (7.4)||5 (13)|
|Osteoporosis||12 (9.9)||1 (3)|
|Pain syndrome||5 (4.1)||4 (10)|
|Rheumatoid arthritis||8 (6.6)||4 (10)|
|Other musculoskeletal||12 (9.9)||4 (10)|
|Affective disorder||1 (0.8)||0 (0)|
|Diabetes||3 (2.5)||1 (3)|
|Chronic obstructive airways disease/asthma||4 (3.3)||1 (3)|
|Cardiovascular disease||10 (8.3)||2 (5)|
|Other||12 (9.9)||5 (13)|
|Missing||10 (8.3)||0 (0)|
|Total||121 (100.0)||39 (100.0)|
A total of 39 interviews were conducted. The interviews revealed that all items showed some RS, with RS occurring in approximately half the replies to items (Table 2). Across items, ∼31% had a negative RS, 20% had a positive RS, 32% had no RS, and in ∼17% of the items the presence or absence of RS could not be established. The purposeful sampling for interviews was intended to create a balanced sample across strength of endorsement of response options. Across items, 37% (range 23–59%) scored 0, 38% (28–54%) endorsed either −1 or 1, and 25% (10–38%) endorsed the ends of the scale (−3, −2, 2, or 3). The whole sample that was assessed for RS by questionnaire demonstrated similar proportions. Overall, 87% of participants had RS in at least 1 item.
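|Item||Questionnaire answer for subgroup (n = 39): negative RS, %||absent RS, %||positive RS, %||Interview answer for subgroup (n = 39): negative RS, %||absent RS, %||positive RS, %||unclear, %||Questionnaire answer for total sample (n = 121): negative RS, %||absent RS, %||positive RS, %|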
|Positive and Active Engagement in Life||38||28||33||36||23||31||10||28||47||25|
|Skill and Technique Acquisition||36||23||41||33||18||28||21||31||38||31|
|Constructive Attitudes and Approaches||44||38||18||41||31||8||21||35||50||16|
|Self-Monitoring and Insight||38||33||28||44||28||18||10||35||37||28|
|Health-Service Navigation–Knowledge||36||46||18||31||44||13||13||36||45||19|
|Health-Service Navigation–Relationships||26||59||15||15||54||8||23||16||69||16|
|Social Integration and Support||26||49||26||15||49||21||15||19||57||24|
Negative RS was most frequent in the Self-Monitoring and Insight item (44% of interviews), implying that, because of participation in the SM program, individuals now realize that they had a much poorer understanding of their health before the program than they thought they did. Typical comments include: “I didn't realize how badly it [rheumatoid arthritis] was affecting me … I was really fearful of what was going to happen in 10–15 years' time. Now I think, ‘That's not now.’ I'm doing things I'd convinced myself I could never do again. I can now live the life I want with some modification”; “I understand now that my only limitations are what I put on myself. The course helped with different techniques and not to dwell on what you can't do but what you can do.”
Another item with a frequent negative RS was Constructive Attitudes and Approaches (41%), implying that because of the SM program, these participants now realize that they had fewer constructive attitudes and approaches to their health before the program than they thought they had. Participant comments include: “I realize now I did let my health problems affect my life; I feel much stronger now because I learned how to manage the pain.”
The 2 items that related to relationships (Social Integration and Support, and Health Service Navigation–Relationships With Health Professionals) had the lowest negative RS endorsement (both 15%) and among the lowest positive RS endorsement. Conversely, these items had the highest absent RS. A typical absent RS comment for the Social Integration and Support item was, “Basically, my husband and my family … have always been supportive. I knew that anyway.” A typical absent RS comment for the Health-Service Navigation item was, “Yes. Never have any trouble talking to them or discussing things.”
The Constructive Attitudes and Approaches item also had a low positive RS endorsement. The item with the highest positive RS endorsement was the Positive and Active Engagement in Life item (31%), suggesting that approximately one-third of participants found that the program enabled them to see that before the program, they were more positively and actively engaged in life than they thought they were. Typical positive RS comments from this item include: “Well, I never used to think I was doing very much, but when I did the course I realized I was doing a lot”; “When I saw the other people on the course who were far worse than me and they were dealing with things, I realized I wasn't so badly off.”
Closely behind Positive and Active Engagement in Life with high positive RS endorsements were the Skill and Technique Acquisition item and the Emotional Wellbeing item (both 28%), suggesting that approximately one-quarter of the participants found that the program enabled them to see that before the program, they already had a variety of skills and techniques, and that they were less anxious, worried, or depressed than they thought they were. Typical positive RS comments from the Skill and Technique Acquisition item include: “I realized I was doing exercises and not sitting and thinking about the arthritic disease”; “Some of the things we were talking about I realized I already had done that.” Positive RS comments from the Emotional Wellbeing item include: “I really did think that I was anxious or depressed and it's been acknowledged, but oh I wasn't as badly depressed as some of the others. I came away thinking, ‘my lot's not that bad’”; “Some women in the course were very depressed; they isolate themselves and I am a go-go person. It made me realize that I wasn't depressed at all, just getting older and can't do things as quick as I used to.”
The Health-Directed Behavior item had the largest number of interview outcomes (28%) where it was unclear if RS had occurred. Across interview items, the average number of unclear responses was 17% (Table 2). Unclear RS answers included: “I used to play sports; a lot of bike riding, played in a brass band; but now it [the illness] has slowed me down a lot. In for a major operation. I had started to slow down before the course; had to give up many activities”; “I've done a bit more for myself.”
Across all items in the full sample of 121 participants, only 11 (13%) reported no RS across all items, 18 (22%) reported RS in 1–2 items, 20 (24%) reported RS in 3–4 items, 39 (47%) in 5–6, 28 (34%) in 7–8, and 5 (6%) reported RS across all items (Figure 1). Using the interview as the gold standard, the most accurate items were Positive and Active Engagement in Life, Self-Monitoring and Insight, and Social Integration and Support (overall accuracy score 0.85). These were followed closely by Health Service Navigation–Knowledge, with an overall accuracy score of 0.82. The least accurate item was Health-Directed Behavior (overall accuracy 0.69). The average accuracy across all items was 0.79 (Table 3).
|Item||True positive||True negative||False positive||False negative||Sensitivity||Specificity||Overall accuracy|
|Positive and Active Engagement in Life||24||9||4||2||0.92||0.69||0.85|
|Skill and Technique Acquisition||23||7||7||2||0.92||0.50||0.77|
|Constructive Attitudes and Approaches||16||12||8||3||0.84||0.60||0.72|
|Self-Monitoring and Insight||22||11||4||2||0.92||0.73||0.85|
|Social Integration and Support||14||19||2||4||0.78||0.90||0.85|
The measurement of outcomes of health interventions (i.e., pre- and postintervention assessment) relies on participants maintaining a stable perspective from which to rate their status on quantitative questionnaires. This study indicates that ∼87% of participants may have an altered internal standard of measurement on at least 1 relevant outcome of SM courses, which could confound classic pre- and postintervention assessments. This study has also demonstrated that RS can be measured reasonably accurately by questionnaire. The concordance between questionnaire and cognitive interview is high, indicating the HEI-Q Perspective is able to detect negative, absent, and positive RS in participants of SM programs.
The presence of RS therefore has important implications for assessing the benefits that individuals might receive from participating in SM programs. Individuals who have a negative RS (“I now realize that I was worse than I thought I was”) are likely to underrate the severity of their condition in their baseline questionnaire. This could result in a preintervention/postintervention questionnaire-reported underestimation of the treatment effect, thus obscuring a true no treatment effect, or a true positive treatment effect. Participants who have a positive RS (“I now realize I was much better than I thought I was”) are likely to overrate the severity of their condition in their baseline questionnaire. This could result in a preintervention/postintervention questionnaire-reported overestimation of the treatment effect, thus obscuring a true no treatment effect or a true positive treatment effect. When no RS is present, the preintervention/postintervention questionnaire response is likely to be a true reflection of the treatment effect.
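The direction of these biases can be made explicit with a simple illustrative formalization. The notation is not from the study and rests on stated assumptions: higher scores mean greater severity, B is the severity a participant would assign to their baseline state using their postcourse perspective, B_rep is the severity actually reported at baseline, F is the severity reported at followup, and the followup report is taken to be free of RS-related error.

```latex
\begin{align*}
\Delta_{\text{true}} &= B - F
  && \text{(true improvement)} \\
\delta &= B_{\text{rep}} - B
  && \text{(baseline misrating attributable to RS)} \\
\Delta_{\text{obs}} &= B_{\text{rep}} - F = \Delta_{\text{true}} + \delta
  && \text{(improvement seen in pre/post scores)} \\
\text{negative RS:}\quad \delta &< 0 \;\Rightarrow\; \Delta_{\text{obs}} < \Delta_{\text{true}}
  && \text{(underestimation)} \\
\text{positive RS:}\quad \delta &> 0 \;\Rightarrow\; \Delta_{\text{obs}} > \Delta_{\text{true}}
  && \text{(overestimation)} \\
\text{absent RS:}\quad \delta &= 0 \;\Rightarrow\; \Delta_{\text{obs}} = \Delta_{\text{true}}
\end{align*}
```

Under these assumptions, a group containing both negative and positive RS has individual errors of opposite sign, which adds variability to the observed change scores even where the group mean is little affected.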
RS-related error would reduce the accurate signal (i.e., true change), increase the amount of noise, and make the demonstration of quantitative differences between pre- and postprogram assessments difficult. In settings with substantial RS, larger sample sizes would therefore be required. It may be for this reason that previous studies of SM programs have demonstrated small to no effects for patient-reported outcomes such as quality of life and related issues (3). Warsi et al (3), using a meta-analytical approach across diseases, found that SM programs led to significant outcomes where objective measures were used, i.e., for individuals with hypertension (reduced systolic blood pressure), diabetes (reduced glycosylated hemoglobin levels), and asthma (fewer attacks), but no significant reduction in subjective patient-reported outcomes such as pain or disability for individuals with arthritis. The data from our study suggest that assessment of subjective outcomes through questionnaire may be substantially confounded by RS. The results imply that preintervention/postintervention assessments in a subgroup of persons with absent RS will report accurate patient-reported outcomes (unconfounded by RS); however, for the remainder of the individuals, preintervention/postintervention assessments will be enlarged or diminished (through RS), introducing additional measurement error and difficulty in demonstrating overall group effects.
An important strength of this study is the use of cognitive interviews to validate the HEI-Q Perspective. Adamson et al advocate the importance of combining quantitative and qualitative methods to demonstrate “the complexity of subjective views and [to try] to elucidate the meanings behind recipients' answers to items” (9). Participants in that study completed a standard questionnaire followed by 2 interviews: a structured and then a semistructured interview. The research team's aim was to “investigate perceptions and experiences of health, illness, and healthcare among random samples of white, working-class residents of 3 British cities” (9). Adamson et al call the approach “questerviews,” and their article explores the importance of this method when seeking to understand health and the health care system. Questerviews have parallels, they say, with cognitive interviewing methods to validate questionnaires because both are “concerned with how participants interpret and comprehend questions and the process behind making a response” (9).
Cognitive interviewing is used primarily to identify problems with questionnaires that cannot be identified through statistical psychometric approaches. Harris-Kojetin et al used cognitive interviewing in their Consumer Assessment of Health Plans study: “The intent of cognitive testing is to examine the question-answering process to identify and address errors being introduced into the process. This knowledge then can be used to develop better survey questions, thereby improving the accuracy and reliability of survey responses” (10).
In our study, we used the cognitive interviewing method to determine if the HEI-Q Perspective questionnaire has items that make sense, can be understood by a wide range of the population, is interpreted as intended, and, most importantly, can accurately detect RS. The interviews revealed lower overall accuracy of the Health-Directed Behavior item (“I did fewer/more healthy activities or behaviors”). This item also had the highest number of unclear interview answers, which may be an indication of ambiguity in, or misinterpretation of, the question, and it may not have been detected without the cognitive interviews.
The analytical approach taken in this study is highly conservative. The RS items were scored using a 7-point scale from −3 to 0 to 3. However, we collapsed the results into dichotomous outcomes (presence or absence of RS) to examine test accuracy. The interview approach was simple and passive and was intended to allow spontaneous quotes that might reveal RS. If clear quotes were not elicited, a participant's score was categorized as unclear, and their score contributed to false positive and false negative components of the analyses. The overall specificity (0.74) was somewhat lower than the sensitivity (0.79). Although the overall accuracy is reasonable, the conservative approach of only ascribing RS absence when a clear qualitative statement was revealed may deflate the specificity score (i.e., the ratio of the number of individuals correctly identified as not having RS and the total number of individuals without RS). A more proactive interview process and the use of the 7-point scale are likely to demonstrate even higher sensitivity and specificity of the scale and are the subject of ongoing research. Taken together, these results indicate that the HEI-Q Perspective questionnaire has high face validity and high construct validity. Future work will need to include stability and reliability.
Several potential limitations of this study require discussion. The cognitive interview first involved reminding respondents of their initial answer, and then passively asking them, “Can you tell me more about that answer?” Although the structure of the open question partially protects against confirmation bias, the unblinded interview cannot exclude reporting bias. Importantly, across most items, respondents appeared to understand what the question was intending to communicate. In this study, we did not intend to identify the source of RS (e.g., whether or not it related to social comparison). Furthermore, the specific type of RS was not sought after in the interviews. Despite this, the interview content suggested that recalibration was occurring (e.g., “It made me realize that I wasn't depressed at all”). It is possible that the cognitive nature of the items (e.g., fewer/more/poorer) was more likely to elicit recalibration RS rather than reevaluation or reconceptualization RS.
There was also variability in the lengths of time between the end of the course, completion of the questionnaire, and the interviews. Although this provides a potential for recall error, participants seemed confident and clear about their answers. Had the questionnaires been sent out and interviews conducted within a shorter period, the results may have been clearer, with fewer responses in the unclear category. A further potential limitation of this study, as with all quality of life studies, is that a true gold standard does not exist, and therefore a surrogate was used. Although the cognitive interview approach has appealing face validity, replication of the findings of the present study is important. A final potential limitation of the study is the sample used. Participants were self selected and chose to promptly return a somewhat complex questionnaire to researchers. Only approximately one-third of the individuals who were mailed the questionnaire returned it. This was not surprising because there were no incentives or active followup (reminder letters or phone calls) and no tracking of individuals who might have changed addresses. This may have biased the sample to those who were stronger endorsers of the course, i.e., those with greater negative, greater positive, or greater true change as a result of the course. The distribution of RS in unselected samples may well be different, and replication in other populations is required.
Overall, the results of this study suggest that RS may introduce substantial error into preintervention/postintervention assessments. We have identified a large proportion of participants who had a change of perspective in critical areas affected by the program. For studies where objective disease outcomes (e.g., glycosylated hemoglobin, blood pressure, or asthma) are used to assess pre- and postintervention effects, the influence of RS is nonexistent. For programs that are assessed through subjective outcomes (e.g., pain, disability, and psychological states), the presence of RS may introduce substantial measurement error into preintervention/postintervention assessments. This change, or RS, is detectable by the HEI-Q Perspective and may represent intellectual and emotional shifts that permit individuals to better cope with their disease, seek additional care, and be better equipped to manage themselves and their environment. This would suggest that RS is an important and valuable outcome of SM courses. Ongoing research is required to empirically explore if adjustment for positive, negative, and absent RS does affect the substantive conclusions from health-education program evaluations. Until RS is accounted for in outcome assessment, many effects of SM programs may be obscured, leading to false conclusions that the intervention has small or no effects.