High rates of depressive symptoms among patients with systemic sclerosis are not explained by differential reporting of somatic symptoms

Authors

  • Brett D. Thombs,

    Corresponding author
    1. Sir Mortimer B. Davis Jewish General Hospital and McGill University, Montreal, Quebec, Canada
    • Sir Mortimer B. Davis Jewish General Hospital, Institute of Community and Family Psychiatry, 4333 Cote Saint Catherine Road, Montreal, Quebec H3T 1E4, Canada
  • Samantha Fuss,

    1. Sir Mortimer B. Davis Jewish General Hospital and McGill University, Montreal, Quebec, Canada
  • Marie Hudson,

    1. Sir Mortimer B. Davis Jewish General Hospital and McGill University, Montreal, Quebec, Canada
  • Orit Schieir,

    1. Sir Mortimer B. Davis Jewish General Hospital and McGill University, Montreal, Quebec, Canada
  • Suzanne S. Taillefer,

    1. Sir Mortimer B. Davis Jewish General Hospital and McGill University, Montreal, Quebec, Canada
    • Dr. Baron is the director and Dr. Taillefer is the research coordinator of the Canadian Scleroderma Research Group, which receives grant funding from the Canadian Institutes of Health Research, the Cure Scleroderma Foundation, the Scleroderma Society of Canada, the Ontario Arthritis Society, Actelion Pharmaceuticals, and Pfizer Pharmaceuticals.

  • Joshua Fogel,

    1. Brooklyn College, City University of New York, Brooklyn, New York
  • Daniel E. Ford,

    1. Johns Hopkins Bloomberg School of Public Health and Johns Hopkins University School of Medicine, Baltimore, Maryland
  • Murray Baron,

    1. Sir Mortimer B. Davis Jewish General Hospital and McGill University, Montreal, Quebec, Canada

  • Canadian Scleroderma Research Group

    • Investigators of the Canadian Scleroderma Research Group are listed in Appendix A.


Abstract

Objective

Between 36% and 65% of patients with systemic sclerosis (SSc) report symptoms of depression above cutoff thresholds on self-report questionnaires. The objective of this study was to assess whether these high rates result from differential reporting of somatic symptoms related to the high physical burden of SSc.

Methods

Symptom profiles reported on the Center for Epidemiologic Studies Depression Scale (CES-D) were compared between a multicenter sample of 403 patients with SSc and a sample of respondents to an Internet depression survey, matched on total CES-D score, age, race/ethnicity, and sex. An exact nonparametric generalized Mantel-Haenszel procedure was used to identify differential item functioning between groups.

Results

Patients with SSc reported significantly higher frequencies (moderate to large effect size; P < 0.01) on 4 CES-D somatic symptom items: bothered, appetite, effort, and sleep. Internet respondents had higher item scores on 2 items that assessed interpersonal difficulties (unfriendly, large effect size; P < 0.01; disliked, large effect size; P < 0.01) and on 2 items that assessed lack of positive affect (happy, moderate effect size; P = 0.01; enjoy, large effect size; P < 0.01). Adjustment of standard CES-D cutoff criteria for potential bias due to somatic symptom reporting resulted in a reduction of only 3.6% in the number of SSc patients with significant symptoms of depression.

Conclusion

High rates of depressive symptoms in SSc are not due to bias related to the report of somatic symptoms. The pattern of differential item functioning between the SSc and Internet groups, however, suggests some qualitative differences in depressive symptom presentation.

INTRODUCTION

Between 36% and 65% of patients with systemic sclerosis (SSc) report symptoms of depression above cutoff thresholds on self-report questionnaires, which is a high rate even when compared with patients who have other chronic diseases when the same assessment instruments and cutoff scores are used (1). An important question, however, is whether this estimate is inflated due to bias related to the heavy physical burden of SSc. Symptoms common in SSc, such as fatigue, difficulty sleeping, or gastrointestinal symptoms and poor appetite, are easily misinterpreted as mood related (2–4).

Accurate assessment of depressive symptoms requires screening questionnaire scores to reflect depressive symptomatology rather than physical symptoms of SSc that are not directly related to depression. Methodologically, this means that patients with and without SSc who have similar levels of depressive symptoms should be equally likely to endorse items that assess symptoms of depression. Differential item functioning (DIF) is said to be present when a group characteristic that is distinct from depression (e.g., gastrointestinal symptoms among patients with SSc) results in overestimation or underestimation of depression symptoms on certain items (e.g., an item assessing changes in appetite) (5).

The Center for Epidemiologic Studies Depression Scale (CES-D) (6) is a 20-item self-report measure that is commonly used as a depression screening and research tool in rheumatic diseases including osteoarthritis (7, 8), systemic lupus erythematosus (9, 10), rheumatoid arthritis (RA) (11–14), and SSc (15). Two studies of patients with RA have reported that several somatic items of the CES-D reflect both depressive symptoms and RA disease factors, which the authors termed “criterion contamination” (16, 17). Blalock et al found that 4 CES-D items (item 7, effort; item 8, hopeful; item 11, sleep; and item 20, get going) were associated with RA disease severity indicators even after controlling for total CES-D score. Callahan et al reported that RA patients endorsed 6 CES-D items at higher rates than nonmedically-ill controls (item 2, appetite; item 3, blues; item 6, depressed; item 7, effort; item 11, sleep; and item 20, get going), although they did not control for total CES-D scores across groups. Studies of patients with SSc have reported similar results when they regressed both somatic and cognitive symptoms on a set of SSc-related predictors (15, 18), and have noted that scores on the depressive symptoms subscale of the General Health Questionnaire 28 were higher for patients with SSc than controls (19), but none have directly assessed DIF or measurement bias.

Thus, no studies have assessed DIF on the CES-D among patients with SSc. The objectives of this study were 1) to assess whether there is evidence of DIF on the CES-D, and therefore of differential symptom presentation, when patients with SSc are compared with nonmedical respondents, and 2) to evaluate whether differential reporting of somatic symptoms accounts for the high rates of depressive symptoms that have been reported among patients with SSc. In this study, we compared CES-D item responses from patients with SSc enrolled in a multicenter, pan-Canadian registry of SSc patients with those from a matched sample of respondents to an Internet depression survey who did not necessarily have a medical condition.

PATIENTS AND METHODS

Patient and comparison sample.

The SSc study sample consisted of patients enrolled in the Canadian Scleroderma Research Group Registry from September 2004 through August 2006 who completed the CES-D. Patients in the Registry were recruited from 15 centers across Canada. To be eligible for the Registry, patients must have a diagnosis of SSc made by a Registry rheumatologist, be ≥18 years of age, and be fluent in English or French. Specific diagnostic criteria were not required for entry into the Registry. American College of Rheumatology (ACR; formerly the American Rheumatism Association) criteria published in 1980 (20) have been shown to be outdated as understanding of SSc disease processes improves. Subsequent classification systems have been proposed, but none has gained universal approval (21). Thus, one of the objectives of the Registry is to improve upon existing diagnostic systems. Registry patients undergo extensive clinical history, physical evaluation, and laboratory investigations. They also complete a series of self-report questionnaires that includes, in addition to the CES-D, sociodemographic variables, lifestyle variables (e.g., smoking history), other health problems, environmental exposures, family history of autoimmune diseases, SSc symptoms, disability, quality of life, pain, and medical resource utilization. Disease severity was assessed with the scale developed by Medsger et al (22), which has been used in other studies (18, 23, 24). A severity score of 0 (normal) to 4 (end-stage) was generated for each of the 9 systems. Patients from all centers provided informed consent, and the research ethics board of each study site approved the data collection protocol.

The Internet sample included respondents from the US who completed the CES-D on the InteliHealth Web site (www.intelihealth.com) from March 1999 through December 2002. Individuals who visited the site were invited to take a depression screening test. The site could be accessed directly or by typing “depression test” into search engines. In addition to the CES-D, demographic information obtained included race/ethnicity, sex, age category, and US postal code. The CES-D was completed by 153,068 visitors to the Web site. Although surveys placed on the Internet are typically completed only once and not multiple times (25), postal code and demographic data were used to screen for possible duplicate surveys. Respondents who did not enter US postal codes were excluded, leaving 118,937 individuals. Approval from the Johns Hopkins University School of Medicine Institutional Review Board was obtained for the data collection procedure, including a waiver of informed consent and the use of anonymous data. More details about the Internet sample have been previously published (26).

Each SSc patient was matched with an Internet group participant based on sex, age category from the Internet survey (18–29 years, 30–45 years, 46–60 years, or >60 years), race/ethnicity (white or nonwhite), and total CES-D score. For each SSc patient, the automated matching protocol selected all participants from the Internet sample who matched the SSc patient exactly on sex, age category, race/ethnicity, and CES-D score. If >1 participant from the Internet sample was found to match an SSc patient, a computer-generated random number selected the participant to be included in the matched sample. If no exact matches were available, alternative matches were sought by allowing the Internet participants to differ in 1 category (but by no more than 1 level or 1 point). In these cases, matches were sought by first allowing race/ethnicity to vary, then sex, then age category, and finally total CES-D score.
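The matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the automated protocol actually used: the record layout (dictionaries with the keys `sex`, `age`, `race`, and `cesd`) and the function names are hypothetical, and the text does not state whether a matched Internet respondent was removed from the pool afterward, so the sketch leaves the pool unchanged.

```python
import random

AGE_LEVELS = ["18-29", "30-45", "46-60", ">60"]
# Order in which a single field may be relaxed when no exact match exists.
RELAX_ORDER = ("race", "sex", "age", "cesd")

def field_ok(field, a, b, relaxed):
    """Exact agreement, or a one-step difference if this field is relaxed."""
    if a == b:
        return True
    if not relaxed:
        return False
    if field == "age":
        return abs(AGE_LEVELS.index(a) - AGE_LEVELS.index(b)) == 1
    if field == "cesd":
        return abs(a - b) == 1
    return True  # sex and race/ethnicity are two-level categories here

def find_match(patient, pool, rng):
    """Return one pool record for `patient`, or None if nothing qualifies."""
    # First pass: exact match on all 4 fields (relaxed_field is None).
    # Later passes: allow exactly one field to differ, in RELAX_ORDER.
    for relaxed_field in (None,) + RELAX_ORDER:
        candidates = [
            c for c in pool
            if all(field_ok(f, patient[f], c[f], f == relaxed_field)
                   for f in RELAX_ORDER)
        ]
        if candidates:
            return rng.choice(candidates)  # random pick among equal matches
    return None

if __name__ == "__main__":
    rng = random.Random(2004)  # hypothetical seed, for reproducibility only
    patient = {"sex": "F", "age": "46-60", "race": "white", "cesd": 14}
    pool = [
        {"sex": "F", "age": "46-60", "race": "white", "cesd": 15},
        {"sex": "F", "age": "46-60", "race": "nonwhite", "cesd": 14},
    ]
    print(find_match(patient, pool, rng))
```

In the usage example no exact match exists, so the respondent who differs only on race/ethnicity is chosen before the one who differs by 1 CES-D point, mirroring the stated order of relaxation.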

Measures.

The CES-D (6) assesses the presence and severity of depressive symptomatology. The frequency of occurrence of each of 20 symptoms during the past week is rated on a 0–3 Likert-type scale (“rarely or none of the time” to “most or all of the time”), and total scores range from 0–60. A score of ≥16 is most commonly used to screen for depression, although a score ≥16 does not necessarily mean that a patient has depression (6). Although some variation in the factor structure of the CES-D has been reported, the bulk of existing studies (27) have found a 4-factor structure consistent with that originally reported by Radloff: Somatic/Vegetative, Depressed Affect, Positive Affect, Interpersonal (6).
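As a minimal sketch of the scoring just described, the following Python function totals the 20 item ratings and applies the screening cutoff. The reverse-scored set is taken from the items marked (R) in Table 2 (items 4, 8, 12, and 16); recoding those items as 3 minus the raw rating is an assumption of this sketch, as are the function names.

```python
REVERSE_SCORED = {4, 8, 12, 16}  # items marked (R) in Table 2
SCREENING_CUTOFF = 16            # most commonly used screening threshold

def cesd_total(ratings):
    """Total CES-D score from a dict {item number 1-20: raw rating 0-3}."""
    if sorted(ratings) != list(range(1, 21)):
        raise ValueError("expected ratings for items 1 through 20")
    total = 0
    for item, rating in ratings.items():
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"item {item}: rating must be 0, 1, 2, or 3")
        # Assumption: positively worded items are recoded as 3 - rating,
        # so that higher scores always mean more depressive symptoms.
        total += (3 - rating) if item in REVERSE_SCORED else rating
    return total

def screens_positive(total, cutoff=SCREENING_CUTOFF):
    """True if the total meets the screening threshold (not a diagnosis)."""
    return total >= cutoff

if __name__ == "__main__":
    example = {item: 1 for item in range(1, 21)}
    score = cesd_total(example)
    print(score, screens_positive(score))
```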

SSc-related variables (disease duration and severity, number of tender joints, number of gastrointestinal symptoms, skin involvement, respiratory problems) were compiled for patients with SSc. Skin involvement was assessed using the modified Rodnan skin thickness score ranging from 0–51 (28). Limited skin disease was defined as skin involvement limited to an area distal to the elbows and knees with or without face involvement.

DIF analysis.

An exact nonparametric generalized Mantel-Haenszel procedure for assessing DIF with polytomous (e.g., Likert) items (5, 29, 30) was used to assess whether the frequency of specific depressive symptoms differed between patients with SSc and nonmedical patients from the Internet sample. Generalized Mantel-Haenszel methods for DIF assessment are implemented by organizing data into a 2 × T × K contingency table, where, for each of 2 groups, T is the number of Likert response categories and K is the number of levels (strata) of the matching variable, typically the total score on the measure. To assess for DIF for each item, a significance test is carried out to test the null hypothesis that there is no interaction between group membership (e.g., SSc versus Internet) and item response category (e.g., 0–3 of CES-D items) across overall score strata. Both large-sample chi-square approximations and exact tests have been used in DIF procedures that make statistical inferences related to contingency tables. Exact methods and chi-square approximations have been found to perform similarly for binary (31) and polytomous (5) items, although exact methods tend to be more conservative (5) and large-sample chi-square approximations may be inadequate under certain conditions (30). Exact generalized Mantel-Haenszel tests for DIF were conducted with StatXact 4.0 (32).
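To make the 2 × T × K layout concrete, the sketch below builds the contingency table for one item and computes the large-sample Mantel chi-square statistic (1 degree of freedom) for ordered response categories. It is an illustration of the chi-square approximation only, not of the exact procedure run in StatXact; the function names, the use of NumPy, and the coding of groups as 0 (focal) and 1 (reference) are assumptions.

```python
import math
import numpy as np

def build_table(group, response, stratum, n_categories=4, n_strata=5):
    """Count respondents into a 2 x T x K array (group x response x stratum)."""
    table = np.zeros((2, n_categories, n_strata), dtype=int)
    index = (np.asarray(group), np.asarray(response), np.asarray(stratum))
    np.add.at(table, index, 1)  # unbuffered add, so repeated cells accumulate
    return table

def mantel_chi2(table):
    """Large-sample Mantel statistic for responses scored 0..T-1."""
    scores = np.arange(table.shape[1])
    observed = expected = variance = 0.0
    for k in range(table.shape[2]):
        focal, reference = table[0, :, k], table[1, :, k]
        column = focal + reference
        n_f, n_r, n = focal.sum(), reference.sum(), column.sum()
        if n < 2 or n_f == 0 or n_r == 0:
            continue  # stratum carries no information about group differences
        s1 = (scores * column).sum()
        s2 = (scores ** 2 * column).sum()
        observed += (scores * focal).sum()          # focal-group score sum
        expected += n_f * s1 / n                    # its expectation under H0
        variance += n_f * n_r * (n * s2 - s1 ** 2) / (n ** 2 * (n - 1))
    if variance == 0:
        return 0.0
    return float((observed - expected) ** 2 / variance)

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2.0))

if __name__ == "__main__":
    # Toy data: 0 = focal (e.g., SSc), 1 = reference (e.g., Internet).
    table = build_table(group=[0, 0, 1, 1], response=[0, 0, 1, 1],
                        stratum=[0, 0, 0, 0], n_categories=2, n_strata=1)
    chi2 = mantel_chi2(table)
    print(chi2, p_value_1df(chi2))
```

With samples as small as the toy example, the chi-square approximation is exactly the situation in which an exact test would be preferred.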

For small to intermediate numbers of item comparisons, thick-matching, such as quintile stratification, more consistently produces accurate results in DIF analyses than thin-matching by actual scores or with many different strata (33). Participants in the SSc and Internet groups were assigned to 5 strata based on CES-D total scores that each contained ∼20% of participants. An iterative purification procedure was used because purifying the matching variable has been shown to yield more accurate results, particularly when scales contain multiple DIF items (34). DIF was initially assessed for each item after matching across 5 strata based on CES-D total scores. Items were subsequently reanalyzed using only items not identified as having DIF to calculate the total score used for matching. This procedure was repeated until the same set of items was identified as having DIF in sequential iterations (34). For all analyses, the item under consideration was included in the matching score as recommended by Zwick et al (35).
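The thick-matching and purification steps can be outlined as follows. This is a sketch under assumptions rather than the analysis code: `has_dif` is a hypothetical callback standing in for whatever DIF test is applied to a single item (here, the exact generalized Mantel-Haenszel test), and quintile boundaries are taken from `numpy.quantile`, so tied scores can make the strata only approximately equal in size.

```python
import numpy as np

def quintile_strata(total):
    """Assign each respondent to one of 5 strata holding roughly 20% each."""
    cuts = np.quantile(total, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(cuts, total, side="left")  # values 0..4

def purify(items, group, has_dif, max_iter=10):
    """Iteratively purify the matching score.

    items   : (n_respondents, n_items) array of scored item responses
    group   : array of 0/1 group labels
    has_dif : hypothetical callable (item_scores, group, strata) -> bool
    Returns the set of item indices flagged in two successive iterations.
    """
    n_items = items.shape[1]
    flagged = set()  # iteration 1 therefore matches on the full total score
    for _ in range(max_iter):
        new_flagged = set()
        for j in range(n_items):
            # Matching score: all unflagged items, plus the studied item itself.
            keep = [i for i in range(n_items) if i not in flagged or i == j]
            strata = quintile_strata(items[:, keep].sum(axis=1))
            if has_dif(items[:, j], group, strata):
                new_flagged.add(j)
        if new_flagged == flagged:
            return flagged  # same set in sequential iterations: stop
        flagged = new_flagged
    return flagged

if __name__ == "__main__":
    group = np.array([0] * 20 + [1] * 20)
    pattern = np.tile([0, 1, 2, 3], 10)
    items = np.column_stack([np.where(group == 0, 3, 0), pattern, pattern])

    def toy_has_dif(scores, grp, strata):
        # Placeholder rule for demonstration only; not a real DIF test.
        return abs(scores[grp == 0].mean() - scores[grp == 1].mean()) > 0.5

    print(purify(items, group, toy_has_dif))
```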

In addition to significance testing, typical DIF analyses include a measure of effect size to assess the level of practical significance of DIF. A frequently used measure of effect size for polytomous items is the standardized mean difference technique (36–38). The standardized mean difference is the difference between the unweighted item mean of the focal group (e.g., patients with SSc) and the weighted mean of the reference group (e.g., nonmedical Internet respondents), where the weights are the proportion of participants in the focal group in each stratum. Since SSc and Internet respondents were matched on total CES-D score in this study, this reduced to the raw difference between groups in mean item score. An effect size estimate is generated by dividing the standardized mean difference by the SD of the item score of the combined group, which, due to matching for CES-D total scores, was equivalent to Cohen's d (39) in the current sample. We used classification rules developed by the Educational Testing Service that classify DIF in 3 categories based on statistical significance and effect size (5, 40). Items were classified as having negligible DIF in all cases where P ≥ 0.05 and in cases where P < 0.05 and effect size ≤ 0.17. Items were classified as having moderate DIF for items with P < 0.05 and 0.17 < effect size ≤ 0.25, and as having large DIF where P < 0.05 and effect size > 0.25.
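The effect size and the 3-category classification rule translate directly into code. The sketch below assumes one-to-one matching on total score, so that the standardized mean difference is the raw difference in item means as stated above. Applying the thresholds to the absolute value of the effect size, and using the sample SD (n − 1 denominator) for the combined group, are assumptions of this illustration; the function names are hypothetical.

```python
import statistics

def effect_size(focal_scores, reference_scores):
    """Difference in item means divided by the SD of the combined group.

    Assumes the groups are already matched on total score, so no
    stratum weighting of the reference-group mean is needed.
    """
    difference = statistics.mean(focal_scores) - statistics.mean(reference_scores)
    combined_sd = statistics.stdev(list(focal_scores) + list(reference_scores))
    return difference / combined_sd

def classify_dif(p_value, size):
    """Classify DIF as negligible, moderate, or large from P and effect size."""
    magnitude = abs(size)  # sign only indicates which group scored higher
    if p_value >= 0.05 or magnitude <= 0.17:
        return "negligible"
    if magnitude <= 0.25:
        return "moderate"
    return "large"

if __name__ == "__main__":
    # Effect sizes as reported in Table 2; P < 0.01 is entered as 0.005.
    for name, p, es in [("bothered", 0.005, 0.188),
                        ("appetite", 0.005, 0.306),
                        ("happy", 0.01, -0.171),
                        ("cry", 0.005, 0.148)]:
        print(name, classify_dif(p, es))
```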

Post hoc analyses were done to test the degree to which differential reporting of somatic symptoms might influence estimated rates of depression using standard cutoffs on the CES-D. Tabulation of demographic data and standardized mean difference calculations were done with SPSS 15.0 (Chicago, IL).

RESULTS

Sample characteristics.

Of the 403 patients with SSc, 392 were matched across all categories with respondents from the Internet sample, whereas 11 were matched exactly on 3 categories and within 1 point or level on a fourth category. Of the 11 patients who were matched on 3 of 4 categories, 2 differed on race/ethnicity, 3 differed by 1 age category, 1 differed on sex, and 5 differed by 1 point on the CES-D. The SSc sample and the Internet sample were composed mainly of women (86.4% and 86.1%, respectively), and the majority of each sample classified their race/ethnicity as white (79.7% and 80.1%, respectively), which is consistent with North American samples from previous reports of SSc (41). Of the patients with SSc, 90.2% met the ACR criteria for SSc (20). The mean age of the SSc sample was 55.4 years, and the largest age group of participants in the SSc and Internet samples was in the 46–60 years age range (44.8% and 44.4%, respectively). Sociodemographic and medical characteristics for the patients with SSc are summarized in Table 1. The mean total CES-D score for both the SSc and Internet groups was 14.0 points. Over one-third of patients with SSc scored ≥16 points on the CES-D (148 [36.7%] of 403), and over a quarter scored ≥19 points (110 [27.3%] of 403). The percentages of respondents from the Internet sample above these cutoffs were the same.

Table 1. SSc patient demographic and medical characteristics*

| Variable | Result |
|---|---|
| Demographic | |
| Age, mean ± SD years | 55.4 ± 12.6 |
| Female | 348 (86.4) |
| Race/ethnicity, white | 321 (79.7) |
| Education level (n = 402) | |
| Less than high school graduate | 120 (29.9) |
| High school graduate | 102 (25.3) |
| Some college | 104 (25.9) |
| College graduate | 76 (18.9) |
| Married or living as married | 293 (72.7) |
| Medical | |
| Time since non-Raynaud's symptom onset, mean ± SD years | 10.7 ± 8.7 |
| Time since diagnosis of SSc, mean ± SD years | 8.5 ± 7.7 |
| Modified Rodnan total skin thickness score, mean ± SD | 11.2 ± 10.4 |
| Diffuse SSc (n = 396) | 194 (49.9) |
| Disease severity scores | |
| General | 1.01 (1.29) |
| Peripheral vascular | 1.54 (1.25) |
| Skin | 1.33 (0.75) |
| Joint/tendon | 0.94 (1.30) |
| Muscle | 0.31 (0.81) |
| Gastrointestinal tract | 2.09 (0.76) |
| Heart | 0.51 (1.06) |
| Kidney | 0.16 (0.70) |
| Lung | 1.41 (1.14) |

* Values are the number (percentage) unless otherwise indicated. SSc = systemic sclerosis.
The number of patients ranges from 310 (kidney) to 398 (gastrointestinal tract).

DIF.

Results from the generalized Mantel-Haenszel assessment of DIF for each CES-D item are shown in Table 2. In the first iteration, 5 symptoms were more frequently present among patients in the SSc group (bothered, appetite, effort, sleep, and cry) and 6 items were more frequent in the nonmedical Internet group (failure, happy, lonely, unfriendly, enjoy, and disliked). After re-matching using total CES-D scores calculated with these items removed, results were the same in iterations 2 and 3, with the exception of the item lonely, which was no longer significant in either subsequent iteration. Effect size criteria were applied, and 4 items were found to have DIF with higher frequency ratings in the SSc group (bothered, moderate; appetite, large; effort, large; sleep, large). All of these items are included in the somatic/vegetative factor of the CES-D. In addition, DIF was identified for 4 items in the opposite direction, with higher frequency ratings for the nonmedical Internet group (unfriendly, large; happy, moderate; enjoy, large; disliked, large). Two of these items, unfriendly and disliked, comprise the interpersonal factor of the CES-D, and the 2 other items are from the positive affect factor.

Table 2. Generalized Mantel-Haenszel (GMH) assessment of differential item functioning (DIF)*

| CES-D item | Iteration 1 GMH | Iteration 1 P | Iteration 2 GMH | Iteration 2 P | Iteration 3 GMH | Iteration 3 P | Effect size | DIF category | Higher frequency | Item mean ± SD, SSc group | Item mean ± SD, Internet group |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. Bothered | −3.28 | < 0.01 | −3.32 | < 0.01 | −3.34 | < 0.01 | 0.188 | Moderate | SSc | 0.65 ± 0.85 | 0.49 ± 0.86 |
| 2. Appetite | −4.99 | < 0.01 | −4.96 | < 0.01 | −4.94 | < 0.01 | 0.306 | Large | SSc | 0.59 ± 0.94 | 0.32 ± 0.75 |
| 3. Blues | 0.30 | 0.76 | −0.55 | 0.58 | −0.55 | 0.58 | −0.017 | Negligible | | 0.47 ± 0.81 | 0.49 ± 0.93 |
| 4. As good (R) | 0.91 | 0.36 | −0.21 | 0.83 | 0.07 | 0.95 | 0.053 | Negligible | | 1.34 ± 1.28 | 1.27 ± 1.27 |
| 5. Mind | 0.47 | 0.64 | 0.18 | 0.86 | 0.11 | 0.91 | −0.029 | Negligible | | 0.68 ± 0.91 | 0.70 ± 0.98 |
| 6. Depressed | 1.94 | 0.05 | 1.36 | 0.17 | 1.26 | 0.21 | −0.101 | Negligible | | 0.58 ± 0.85 | 0.67 ± 1.01 |
| 7. Effort | −4.54 | < 0.01 | −4.41 | < 0.01 | −4.72 | < 0.01 | 0.251 | Large | SSc | 1.00 ± 0.99 | 0.74 ± 1.04 |
| 8. Hopeful (R) | −0.02 | 0.98 | 1.08 | 0.30 | 0.88 | 0.37 | 0.002 | Negligible | | 1.22 ± 1.19 | 1.23 ± 1.16 |
| 9. Failure | 2.58 | 0.01 | 1.99 | 0.05 | 2.04 | 0.04 | −0.147 | Negligible | | 0.32 ± 0.67 | 0.43 ± 0.86 |
| 10. Fearful | 0.56 | 0.57 | 0.08 | 0.94 | 0.05 | 0.96 | −0.035 | Negligible | | 0.49 ± 0.80 | 0.52 ± 0.90 |
| 11. Sleep | −6.06 | < 0.01 | −5.92 | < 0.01 | −6.12 | < 0.01 | 0.395 | Large | SSc | 1.23 ± 1.09 | 0.82 ± 0.97 |
| 12. Happy (R) | 2.73 | 0.01 | 2.45 | 0.01 | 2.25 | 0.01 | −0.171 | Moderate | Internet | 1.09 ± 1.14 | 1.29 ± 1.16 |
| 13. Talked less | 0.02 | 0.99 | −0.43 | 0.67 | −0.49 | 0.63 | 0.003 | Negligible | | 0.54 ± 0.82 | 0.54 ± 0.89 |
| 14. Lonely | 2.10 | 0.04 | 1.34 | 0.18 | 1.34 | 0.18 | −0.115 | Negligible | | 0.55 ± 0.86 | 0.66 ± 1.01 |
| 15. Unfriendly | 4.20 | < 0.01 | 3.42 | < 0.01 | 3.40 | < 0.01 | −0.252 | Large | Internet | 0.17 ± 0.52 | 0.33 ± 0.76 |
| 16. Enjoy (R) | 4.86 | < 0.01 | 4.45 | < 0.01 | 4.41 | < 0.01 | −0.306 | Large | Internet | 1.07 ± 1.16 | 1.43 ± 1.20 |
| 17. Cry | −2.61 | 0.01 | −2.99 | 0.03 | −2.95 | < 0.01 | 0.148 | Negligible | | 0.41 ± 0.73 | 0.30 ± 0.71 |
| 18. Sad | 1.21 | 0.23 | 0.50 | 0.62 | 0.39 | 0.69 | −0.063 | Negligible | | 0.64 ± 0.86 | 0.70 ± 1.01 |
| 19. Disliked | 3.94 | < 0.01 | 3.08 | < 0.01 | 3.03 | < 0.01 | −0.251 | Large | Internet | 0.16 ± 0.51 | 0.32 ± 0.76 |
| 20. Get going | −0.54 | 0.59 | −1.06 | 0.29 | −1.06 | 0.29 | 0.035 | Negligible | | 0.81 ± 0.93 | 0.77 ± 1.06 |

* Standardized generalized Mantel-Haenszel statistics based on exact version. CES-D = Center for Epidemiologic Studies Depression Scale; SSc = systemic sclerosis, scleroderma; R = recoded.

The existence of DIF in itself, however, does not necessarily reflect the degree to which clinical screening practice might be impacted. Therefore, we did a post hoc analysis of rates of patients with SSc who met established CES-D cutoffs after adjusting for DIF due to somatic symptoms in the SSc group. The standard cutoff score for depression on the CES-D is 16. The total raw difference in mean scores on the 4 somatic items with DIF that had higher scores for the SSc group was 1.09 points. We compared the percentage of patients meeting threshold criteria using the 16-point score cutoff with a cutoff score of 17 points to take into account the ∼1 point total DIF related to somatic symptoms. Based on a cutoff of 17 points, 33.1% of patients were above the threshold. Compared with the standard cutoff score of 16 (36.7% of patients), adjusting for DIF related to somatic symptoms resulted in a reduction of 3.6%.
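The arithmetic behind this adjustment can be retraced from the item means printed in Table 2. The short sketch below is illustrative only; because the table reports means rounded to 2 decimal places, the sum comes to about 1.10 rather than the 1.09 obtained from unrounded data, and the variable names are hypothetical.

```python
# (SSc mean, Internet mean) for the 4 somatic items with DIF, from Table 2.
ITEM_MEANS = {
    "bothered": (0.65, 0.49),
    "appetite": (0.59, 0.32),
    "effort": (1.00, 0.74),
    "sleep": (1.23, 0.82),
}
STANDARD_CUTOFF = 16

# Total raw difference in mean item scores attributable to somatic DIF.
total_dif = sum(ssc - internet for ssc, internet in ITEM_MEANS.values())

# Shift the cutoff upward by the DIF rounded to whole CES-D points.
adjusted_cutoff = STANDARD_CUTOFF + round(total_dif)

# Percentages above each cutoff as reported in the text.
pct_standard, pct_adjusted = 36.7, 33.1
reduction = pct_standard - pct_adjusted  # in percentage points

if __name__ == "__main__":
    print(f"total somatic DIF: {total_dif:.2f} points")
    print(f"adjusted cutoff: {adjusted_cutoff}")
    print(f"reduction: {reduction:.1f} percentage points")
```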

DISCUSSION

This study compared item responses on the CES-D from 403 patients with SSc with responses from 403 nonmedical respondents to a large Internet survey who were matched on total CES-D score, age, sex, and race/ethnicity. The main finding of this study was that patients with SSc more frequently endorsed 4 somatic symptoms: “I was bothered by things that usually do not bother me,” “I did not feel like eating; my appetite was poor,” “I felt that everything I did was an effort,” and “My sleep was restless.” In turn, respondents to the Internet survey more frequently reported that “People were unfriendly” and “I felt that people disliked me” and less frequently reported that “I enjoyed life” and “I was happy” (both reverse scored with higher depression symptom scores). However, the overall magnitude of DIF in terms of impact on assessment decisions was small. When standard CES-D cutoff scores for identifying cases were adjusted upward to account for potential bias due to somatic symptoms, the reduction in SSc patients identified with significant symptoms of depression was only 3.6%.

The results from this study were similar to findings from 2 previous studies of the CES-D in patients with RA (16, 17). Although some of the identified items differed across studies, the present study and the 2 previous reports all found that several somatic items on the CES-D were more frequently endorsed by patients versus controls or by patients with more severe medical symptomatology. Our results were also consistent with findings from 3 studies of patients with SSc that used indirect methods, but concluded that depression scores of patients with SSc are generally accurate representations of symptoms of depression (15, 18, 19). Both of the RA studies concluded that, albeit of a small magnitude, these item differences represented criterion contamination, or measurement bias. It is not clear from these conclusions, however, whether these results can indeed be attributed to bias. The question remains whether identified differences in symptom reporting represent bias in the assessment of depressive symptoms due to the overlap of somatic symptoms of depression and medical illness, or whether they are better understood in the context of different experiences of depressive symptoms between patients with chronic medical illnesses like SSc, and nonmedical participants like the nonmedical Internet respondents. The diagnosis of depression is inherently tautologic, with no real gold standard, because the symptoms define the disorder and the disorder is diagnosed by identifying manifested symptoms. Therefore, there is no underlying reason why symptoms of depression must necessarily be distributed equally among all patients with elevated symptoms. Medical and nonmedical patients with high levels of depressive symptoms may experience different types of symptoms more or less frequently, and the presence of DIF on a given item does not reveal the source of that DIF.

The overall pattern of DIF across items in this study, however, sheds some light on the issue. If DIF on the CES-D were simply due to biased measurement related to somatic symptoms in the SSc group, one would expect to find not a clear pattern of DIF in the other direction, but rather nonsystematic disturbances in item responses. In this study, however, there were several items that exhibited DIF in the opposite direction in a theoretically coherent pattern. Compared with patients with SSc who were matched on total CES-D scores, age, sex, and race/ethnicity, participants in the Internet sample reported higher frequencies (large DIF) for both items on the CES-D interpersonal factor (unfriendly and disliked). The Internet sample also exhibited statistically significant and large DIF on both items from the positive affect factor that are representative of positive affect (rather than hope for the future or acceptance by others), happy and enjoy. This pattern suggests that patients with SSc and nonmedical Internet respondents do not differ in the experience of negative affect, but may differ in terms of their experiences of somatic symptoms, interpersonal difficulties, and positive affect.

Clinically, these results suggest that the CES-D is unlikely to substantially inflate rates of depressive symptoms in patients with SSc. Removal of somatic symptoms when assessing medically-ill patients is not recommended. Across cultures, the majority of primary care patients with depression present with primarily somatic symptoms (42, 43), and somatic symptoms have been shown to be good discriminators of depressed and nondepressed individuals (44, 45). Indeed, depression treatment impacts somatic and nonsomatic symptoms similarly in patients with and without chronic medical illness (46).

There are some limitations to the present study. We used an Internet sample as a control group, and because that sample collection ensured anonymity and did not inquire about medical characteristics, we were not able to fully characterize the medical characteristics of the control sample. It is possible, for instance, that some of the respondents in the nonmedical Internet group did have a chronic medical condition. Other selection biases may have been present. It is possible, for instance, that people who filled out depression scales on the Internet may have had certain characteristics, influential to the types of symptoms they endorsed, that were different from others in the general population with similar levels of overall symptoms. It is also possible that there may have been educational differences between the Internet sample and the patients with SSc that could have influenced results; education level was not assessed in the Internet sample and could not be included in analyses. In addition, demographic data reported by the Internet sample could not be verified.

Another limitation to our study is that sample size considerations did not allow for separate comparisons by SSc disease subgroups. SSc is a heterogeneous disease, and it is possible that the expression of individual symptoms of depression might relate differently to various aspects of the disease, such as the degree of skin or organ involvement. On the other hand, these limitations are contrasted with the benefit of being able to match patients with SSc on a one-to-one basis with counterparts from a nonmedical population, which is a stronger methodology than purely statistical matching. Although we considered 2:1 matching, we decided against it because fewer patients would have matched exactly across all 4 match categories, and the quality of the match would not have been as strong. Additional limitations were that the SSc sample was drawn from patients in Canada whereas the Internet sample consisted of respondents from the US, and the SSc sample was assessed from 2004–2006 whereas the Internet sample was surveyed from 1999–2002, both of which could have resulted in unidentified biases. Given the coherence of the results, however, the likelihood of this would appear to be low.

In summary, we found that the sample of patients with SSc endorsed 4 somatic symptom items of the CES-D more frequently than a comparable nonmedical Internet survey sample. On the other hand, patients with SSc reported fewer interpersonal difficulties and less absence of positive affect. It is possible that these findings represent some degree of bias in the measurement of depressive symptoms in patients with SSc. Nonetheless, if this is the case, the degree of bias is small and the impact on clinical screening practice is negligible. The coherent pattern of DIF that was uncovered across the SSc and Internet groups suggests that, rather than simply bias, there are real differences in how symptoms of depression are experienced by patients with chronic, debilitating illness and individuals without chronic medical illness.

AUTHOR CONTRIBUTIONS

Dr. Thombs had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Study design. Thombs, Fuss, Taillefer.

Acquisition of data. Hudson, Taillefer, Fogel, Ford, Baron.

Analysis and interpretation of data. Thombs, Fuss, Hudson, Fogel, Ford.

Manuscript preparation. Thombs, Fuss, Schieir, Fogel, Ford, Baron.

Statistical analysis. Thombs, Fogel.

APPENDIX A

INVESTIGATORS OF THE CANADIAN SCLERODERMA RESEARCH GROUP

Investigators of the Canadian Scleroderma Research Group are: J. Markland: Saskatoon, Saskatchewan; J. Pope: London, Ontario; D. Robinson: Winnipeg, Manitoba; N. Jones: Edmonton, Alberta; P. Docherty: Moncton, New Brunswick; M. Abu-Hakima, S. Le Clercq: Calgary, Alberta; N. A. Khalidi, E. Kaminska: Hamilton, Ontario; E. Sutton: Halifax, Nova Scotia; C. D. Smith: Ottawa, Ontario; J. P. Mathieu, S. Ligier: Montreal, Quebec; P. Rahman: St. John's, Newfoundland.
