Effect of socioeconomic status as measured by education level on survival in breast cancer clinical trials


Correspondence to: Department of Biostatistics & Bioinformatics, Hock Plaza, Suite 802, Duke University Medical Center, Box 2717, Durham, NC 27710, USA. E-mail: james.herndon@duke.edu



This paper aims to investigate the effect of socioeconomic status, as measured by education, on the survival of breast cancer patients treated on 10 studies conducted by the Cancer and Leukemia Group B.


Sociodemographic data, including education, were reported by the patient at trial enrollment. A Cox proportional hazards model stratified by treatment arm/study was used to examine the effect of education on survival among patients with early stage and metastatic breast cancer, after adjustment for known prognostic factors.


The patient population included 1020 patients with metastatic disease and 5146 patients with early stage disease. Among metastatic patients, factors associated with poorer survival in the final multivariable model included African American race, never married, negative estrogen receptor status, prior hormonal therapy, visceral involvement, and bone involvement. Among early stage patients, significant factors associated with poorer survival included African American race, separated/widowed, post/perimenopausal, negative/unknown estrogen receptor status, negative progesterone receptor status, >4 positive nodes, tumor diameter >2 cm, and education. Having not completed high school was associated with poorer survival among early stage patients. Among metastatic patients, non-African American women who lacked a high school degree had poorer survival than other non-African American women, and African American women who lacked a high school education had better survival than educated African American women.


Having less than a high school education is a risk factor for death among patients with early stage breast cancer who participated in a clinical trial, with its impact among metastatic patients being less clear. Post-trial survivorship plans need to focus on women with low social status, as measured by education. Copyright © 2011 John Wiley & Sons, Ltd.


The relationship between socioeconomic status (SES) and health/disease has been the focus of numerous investigations over the past 20–25 years. The basis for these investigations is often through the theoretical framework of social determinants of health [1, 2]. Marmot, Wilkinson, and House reported the statistically significant effect of income and education on survival in the general population [1-5]. Marmot's explanation for this relationship in affluent countries, such as the USA, is that the relationship between income and survival is the result of relative differences in social participation and opportunity to control life circumstances [6]. Much of the literature focuses on income; however, Marmot argues that education may be a better indicator of factors linked to social position that are important to health and survival given that the effect of income on mortality is markedly reduced when education is included in predictive models [6].

In spite of the numerous published papers that have specifically examined the relationship between SES and cancer survival, questions persist because of inconsistent conclusions [7-32]. Possible explanations for this inconsistency include differences in the research question being asked (e.g., impact of SES on survival in the general population or impact in a clinical trial population), the patient population (e.g., homogeneous or heterogeneous histology or stage, as well as the national, racial, and ethnic composition), sample size or power considerations, the data source (e.g., census, regulatory, Surveillance, Epidemiology and End Results Program, clinical trial, or patient reports), and the measure of SES (e.g., income, education, or occupation).

Such methodological differences are evident in research published in the past decade that has focused on the relationship between SES and survival among patients with breast cancer. Based upon education level reported to the French census, Menville reported that higher mortality observed among highly educated breast cancer patients had attenuated since the 1970s [29]. In contrast, Bouchardy [26], Dalton [27], and Thomson [28] reported that patients with lower SES had poorer prognosis. Bouchardy used patient-reported occupation to define SES among her population-based Swiss cohort that primarily had early stage breast cancer. Dalton linked Denmark's registry of patients who received adjuvant protocol treatment with an administrative database that contained each patient's education level. Thomson derived SES from Scotland's census data. Unlike multivariate analyses reported by Bouchardy and Dalton, the effect of SES disappeared in Thomson's study after adjustment for prognostic factors. Albano used data from the National Center for Health Statistics to show that the relative risk of death from breast cancer is highest among patients with 12 or fewer years of education [30]. Gordon used census-derived measures of education and income to show that low SES was associated with poorer survival among White, node-negative breast cancer clinical trial participants but not among African Americans [31, 32].

Nearly 20 years ago, the Cancer and Leukemia Group B (CALGB), a national cooperative group funded by the National Cancer Institute, explored the relationship between SES and survival of cancer patients enrolled on eight CALGB studies [33]. After adjustment for known prognostic factors including cancer type, performance status, age, and protocol-specific factors, analyses showed that clinical trial participants with low income or only a grade school education had poorer survival than patients with higher SES. This paper is part of a larger project to examine a large database of background forms that CALGB had routinely collected between 1990 and 1998, and to ‘validate’ findings of Cella [33] concerning the relationship of education and survival within larger, homogeneous cancer patient populations that participated in CALGB trials [34].

This paper focuses specifically upon the relationship between education and survival among patients with breast cancer who participated in one of 10 CALGB-coordinated clinical trials initiated between 1987 and 1998 (Table 1). Analyses assessed the effect of education on survival separately within early stage and metastatic patient subgroups. The strength of this research is that it uses a patient-reported measure of SES (education), explores the relationship between education and survival within one cancer type, and has power (i.e., sufficient patient numbers) to detect clinically important effect sizes within the unique context of clinical trials where initial therapeutic choices are not influenced by SES.

Table 1. Cancer and Leukemia Group B breast cancer studies from which patients were drawn
| Study | Treatment arms | Enrollment dates | Number of patients treated | Number of patients with education data (%) | Number of deaths among treated patients with education data (%) | Date of last follow-up |
|---|---|---|---|---|---|---|
| Early stage | | | | | | |
| 9082 | High dose CPB + BMT | February 1991–May 1998 | 394 | 111 (28) | 50 (45) | August 2009 |
| | Intermediate dose CPB + G-CSF | | 391 | 105 (27) | 56 (53) | |
| 9141 | AC + 5 µg/kg G-CSF | March 1993–April 1994 | 87 | 72 (83) | 29 (40) | January 2008 |
| | AC + 10 µg/kg G-CSF | | 85 | 74 (87) | 32 (43) | |
| 9343 | Tamoxifen + radiation | July 1994–February 1999 | 317 | 273 (86) | 121 (44) | August 2009 |
| | Tamoxifen | | 319 | 286 (90) | 124 (43) | |
| 9344 | Cyclo + Dox 60 mg/m² → Paclitaxel 175 mg/m² | May 1994–April 1997 | 533 | 461 (86) | 173 (38) | August 2009 |
| | Cyclo + Dox 75 mg/m² → Paclitaxel 175 mg/m² | | 515 | 444 (86) | 185 (42) | |
| | Cyclo + Dox 90 mg/m² → Paclitaxel 175 mg/m² | | 517 | 448 (87) | 161 (36) | |
| | Cyclo + Dox 60 mg/m² | | 523 | 430 (82) | 177 (41) | |
| | Cyclo + Dox 75 mg/m² | | 520 | 456 (88) | 170 (37) | |
| | Cyclo + Dox 90 mg/m² | | 513 | 443 (86) | 192 (43) | |
| 9741 | Sequential Dox/Pac/Cyclo every 3 weeks | October 1997–March 1999 | 484 | 373 (77) | 110 (29) | August 2009 |
| | Sequential Dox/Pac/Cyclo every 2 weeks | | 493 | 389 (79) | 91 (23) | |
| | Concurrent Dox/Cyclo then Pac every 3 weeks | | 501 | 397 (79) | 113 (28) | |
| | Concurrent Dox/Cyclo then Pac every 2 weeks | | 495 | 384 (78) | 97 (25) | |
| Metastatic | | | | | | |
| 8741 | 160 mg Megestrol acetate | July 1987–March 1991 | 123 | 46 (37) | 44 (96) | September 2001 |
| | 1600 mg Megestrol acetate | | 124 | 43 (35) | 41 (95) | |
| | 800 mg Megestrol acetate | | 119 | 41 (34) | 40 (98) | |
| 9140 | CAF alone | November 1991–August 1995 | 120 | 36 (30) | 34 (94) | June 2002 |
| | | | 121 | 29 (24) | 29 (100) | |
| 9242 | Topotecan | April 1993–June 1994 | 53 | 40 (75) | 40 (100) | November 1997 |
| 9342 | Paclitaxel 175 mg/m² | February 1994–July 1997 | 158 | 139 (88) | 137 (99) | August 2006 |
| | Paclitaxel 210 mg/m² | | 156 | 137 (88) | 135 (99) | |
| | Paclitaxel 250 mg/m² | | 155 | 137 (88) | 133 (97) | |
| 9840 | Thrice weekly | September 1998–November 2003 | 104 | 43 (41) | 38 (88) | March 2009 |
| | Weekly | | 182 | 64 (35) | 57 (89) | |
| | Thrice weekly + Trastuzumab | | 123 | 114 (93) | 102 (89) | |
| | Weekly + Trastuzumab | | 168 | 151 (90) | 132 (87) | |

CPB, Cyclophosphamide, Cisplatin, and Carmustine; BMT, bone marrow transplant; AC, combination of Doxorubicin and Cyclophosphamide; G-CSF, granulocyte colony-stimulating factor; Dox, Doxorubicin; Cyclo, Cyclophosphamide; Pac, Paclitaxel; CAF, Cyclophosphamide, Doxorubicin, and Fluorouracil; LV, Leucovorin.

Materials and methods

CALGB collection of socioeconomic data

The CALGB Psychiatry Committee (later renamed the Psycho-Oncology Committee) piloted the collection of socioeconomic data from a background form [35] in the late 1980s and showed the feasibility of collecting such data for all variables except income. After completion of the feasibility study, CALGB initiated collection of background forms on all active studies. Patient-reported data collected at trial enrollment included education, race, and marital status. Education was presented as a multiple-choice question with the following options: grades 1–8, grades 9–11, high school graduate, some college, junior college degree, college degree, some post-college, or advanced degree. Income was not collected because of collection difficulties encountered in the feasibility study. Survival and baseline clinical data were obtained from the CALGB database and merged with the background form. When these studies were initiated, CALGB required all patients to be followed until death.

Patient population

The analyses presented in this paper are based upon education and clinical data collected in 10 breast cancer studies coordinated by the CALGB. Studies are listed in Table 1, along with data concerning accrual, availability of education data and patient status (live/dead) at last follow-up. Study results, as well as details about treatment regimens administered, have been previously reported [36-47].

Submission of the background form that contained patient education data was required for all participants in six studies; exceptions included CALGB 8741, 9082, 9140, and 9840. For CALGB 8741 and 9082, this form was collected from a patient subgroup that participated in a separate quality of life companion study initiated after accrual to the treatment study commenced [36, 38]. Enrollment to the companion of intergroup 9082 was also limited to participants from CALGB. For CALGB 9140, accrual was initiated before submission of the background form was required for all patients, as was the case for CALGB 9840, a study that included an unbalanced randomization.

Power calculations

A priori power calculations were generated to determine whether clinically meaningful effects would be detectable with available data if they existed. Cella reported that 31% of patients had only a grade school education and that their hazard ratio (HR) of death relative to patients with more than a grade school education was approximately 1.2 [33]. In the current dataset, 921 of 1020 metastatic breast cancer patients and 1881 of 5146 early stage patients are dead. Median follow-up time for early stage patients is 11.2 years. Among early stage and metastatic patients, 4% and 6% had 1–8 years of education, respectively; 11% and 19% had less than a high school education (Table 2). A two-tailed logrank test conducted within each subgroup to compare the survival of patients without and with a high school education had 80% power to detect an HR of 1.23 and 1.27, respectively, assuming α = 0.05 (48).
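The detectable hazard ratios quoted above can be approximately reproduced with Schoenfeld's formula for the logrank test, which relates the number of deaths d, the proportion p of patients in one group, and the detectable log hazard ratio: ln(HR) = (z_{1−α/2} + z_{power}) / sqrt(d · p · (1 − p)). The sketch below is illustrative only. The paper does not state which method was used, so the choice of Schoenfeld's approximation and the function name `detectable_hazard_ratio` are assumptions; the event counts and proportions are those reported in the text.

```python
from math import exp, sqrt
from statistics import NormalDist


def detectable_hazard_ratio(n_events, prop_exposed, alpha=0.05, power=0.80):
    """Smallest HR (>1) detectable by a two-sided logrank test.

    Uses Schoenfeld's approximation (an assumption; the paper does not
    name its method). n_events is the number of deaths, prop_exposed the
    proportion of patients in the comparison group.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_beta = z.inv_cdf(power)            # quantile for the target power
    log_hr = (z_alpha + z_beta) / sqrt(n_events * prop_exposed * (1 - prop_exposed))
    return exp(log_hr)


# Inputs reported in the text: deaths and proportion without a high school education.
early = detectable_hazard_ratio(n_events=1881, prop_exposed=0.11)
metastatic = detectable_hazard_ratio(n_events=921, prop_exposed=0.19)
print(f"early stage: {early:.2f}, metastatic: {metastatic:.2f}")
```

Under these assumptions the results round to 1.23 and 1.27, in line with the values reported for the early stage and metastatic subgroups.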

Table 2. Characteristics of breast cancer patients stratified by disease status and survival rates
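| Predictor | Subgroup | N | % | Number of dead | Hazard ratio | 95% lower confidence limit for hazard ratio | 95% upper confidence limit for hazard ratio | p-value* |
|---|---|---|---|---|---|---|---|---|
| Early stage patients | | | | | | | | |
| Age group (years) | <30 | 76 | 1 | 34 | Reference | | | |
| Marital status | Married | 3367 | 65 | 1156 | Reference | | | |
| Menopausal status | Premenopause | 2658 | 52 | 871 | Reference | | | |
| Number of positive nodes | 0 | 255 | 5 | 125 | Reference | | | |
| Tumor size | ≤2 cm | 1741 | 34 | 753 | Reference | | | |
| | >2 cm | 3362 | 65 | 1109 | 1.58 | 1.42 | 1.77 | |
| Education | 1–8 grades | 203 | 4 | 98 | Reference | | | |
| | 9–11 grades | 387 | 8 | 177 | 0.87 | 0.68 | 1.12 | |
| | High school graduate | 1557 | 30 | 584 | 0.67 | 0.54 | 0.83 | |
| | Some college | 1130 | 22 | 411 | 0.64 | 0.52 | 0.80 | |
| | Junior college | 341 | 7 | 120 | 0.62 | 0.48 | 0.82 | |
| | Some post-college | 247 | 5 | 82 | 0.56 | 0.42 | 0.76 | |
| | Advanced degree | 515 | 10 | 157 | 0.52 | 0.40 | 0.67 | <0.0001 |
| Metastatic patients | | | | | | | | |
| Age group (years) | <30 | 8 | 1 | 8 | Reference | | | |
| Marital status | Married | 600 | 59 | 566 | Reference | | | |
| ECOG PS | 0 | 476 | 47 | 435 | Reference | | | |
| Menopausal status | Premenopause | 171 | 17 | 154 | Reference | | | |
| Prior hormonal therapy | No | 429 | 42 | 396 | Reference | | | |
| Prior chemotherapy | No | 320 | 31 | 296 | Reference | | | |
| Prior RT | No | 429 | 42 | 396 | Reference | | | |
| Visceral involvement | No | 294 | 29 | 268 | Reference | | | |
| Bone/marrow involvement | No | 474 | 46 | 428 | Reference | | | |
| Education | 1–8 grades | 61 | 6 | 58 | Reference | | | |
| | 9–11 grades | 133 | 13 | 130 | 1.19 | 0.86 | 1.63 | |
| | High school graduate | 361 | 35 | 344 | 1.02 | 0.77 | 1.36 | |
| | Some college | 191 | 19 | 175 | 0.95 | 0.70 | 1.29 | |
| | Junior college | 44 | 4 | 41 | 0.89 | 0.59 | 1.34 | |
| | Some post-college | 54 | 5 | 51 | 0.96 | 0.66 | 1.41 | |
| | Advanced degree | 71 | 7 | 66 | 0.91 | 0.63 | 1.31 | 0.6211 |

PS was not collected in early stage studies.

Patients with N0 M0 disease were elderly patients from Cancer and Leukemia Group B 9343.

* From logrank test without adjustment for other covariables.

ECOG PS, Eastern Cooperative Oncology Group Performance Status; ER, estrogen receptor; PR, progesterone receptor; RT, radiation therapy.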

Analytic methods

To guard against possible bias associated with limiting analyses to only patients with education data, characteristics of patients (age, race) with and without education data were compared using chi-squared tests and t-tests.

Survival time, defined as the time between study enrollment and death, was censored for patients remaining alive at last follow-up. The Kaplan–Meier product limit estimator was used to describe the survival experience within patient subgroups defined by various prognostic factors and education levels [48]. Median survival estimates were generated using the estimator of Brookmeyer and Crowley [49]. The Cox proportional hazards model stratified by treatment arm/study was used to assess the relationship between survival and known prognostic factors listed in Table 2, as well as interactions of education with age and race [50, 51]. A reference cell parameterization was used to model categorical variables in the Cox model with one category representing missing data. Analyses used backwards elimination to obtain a parsimonious multivariable clinical model predictive of survival, in which age was considered as an uncategorized predictor. Martingale and Schoenfeld residuals were used to assess the adequacy of the proportional hazards assumption [51]. Once a final multivariable clinical model was determined, factors describing the effect of education were added to the Cox model. Analyses were conducted separately among early stage and metastatic patient subgroups.
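To make the survival estimators named above concrete, the following is a minimal sketch of the Kaplan–Meier product limit estimator and the usual median point estimate (the earliest time at which the estimated survival falls to 0.5 or below). The data are a made-up toy sample, not CALGB data, and the function names are hypothetical; confidence intervals for the median, as developed by Brookmeyer and Crowley, are not shown.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)          # deaths plus censorings at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t are still counted as at risk at t.
            surv *= 1 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


def median_survival(curve):
    """Earliest time with S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data (hypothetical): five patients, two of them censored.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
median = median_survival(curve)
print(curve, median)
```

In practice a maintained survival analysis package would normally be used instead; the sketch only shows how censored patients contribute to the risk set without triggering a drop in the curve.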


The analyses described in this paper are based upon the experiences of all patients accrued to one of 10 breast cancer studies that provided education data (Table 1). These 6166 patients constituted 74% of the enrolled patients. A comparison of the characteristics of patients included and not included in these analyses showed no significant difference relative to age (data not shown). Among patients who had the opportunity to provide education data, the racial composition of the group of patients who did and did not provide education data did not significantly differ.

The study cohort included 5146 early stage patients and 1020 metastatic patients. The majority were White (84%), and 52% were 50 years of age or older (Table 2). A greater proportion of metastatic patients were African American (17% vs 10%), 50 years of age or older (72% vs 48%), postmenopausal (82% vs 48%), or had less than a high school education (19% vs 11%) than that observed among early stage patients (p < 0.001 for all comparisons). African American patients were less likely to be high school graduates than non-African Americans (72% vs 90%; p < 0.0001), as were older patients (78% vs 90%; p < 0.0001) and separated/widowed patients (72% vs 89%; p < 0.0001).

The relationship between education and survival is graphically summarized in Figures 1 and 2 for early stage and metastatic disease subgroups; associated statistics are provided in Table 2. Within the early stage subgroup, education had a statistically significant effect on survival (p < 0.0001) with poorest prognosis being among patients with less than a high school education in comparison with patients who completed high school (HR = 1.47; 95% CI: 1.29, 1.68). Among patients with metastatic disease, survival of patients who did and did not complete high school was not significantly different (p = 0.095; HR = 1.15; 95% CI: 0.98, 1.35).

Figure 1. Survival stratified by education among early stage patients

Figure 2. Survival stratified by education among metastatic patients

In addition to education, the following individual factors were significantly associated with better survival among early stage patients: non-African American race, married or single women, premenopausal status, estrogen receptor (ER) positive, progesterone receptor (PR) positive, one to three positive nodes, and tumor diameter 2 cm or less. Among patients with metastatic cancer, factors associated with better survival included non-African American race, performance status (PS) = 0, ER positive, PR positive, no visceral involvement, and no bone involvement. These results are consistent with the literature concerning known prognostic factors for breast cancer.

Multivariable Cox regression analysis showed that after adjustment for known prognostic factors, early stage patients with less than a high school education were at greater risk of dying than patients who completed high school (Table 3; p = 0.0007; HR = 1.26). African American women were at greater risk of dying than non-African American women (p = 0.007; HR = 1.23), with no evidence that the effect differed among patients with and without a high school education (p = 0.453; model not shown).

Table 3. Multivariable Cox models predictive of survival
| Parameter | Hazard ratio | 95% lower confidence limit for hazard ratio | 95% upper confidence limit for hazard ratio | p-value |
|---|---|---|---|---|
| Model for early stage patients | | | | |
| African American | 1.226 | 1.057 | 1.421 | 0.0069 |
| Marital status: | | | | |
| Separated versus married/single | 1.244 | 0.971 | 1.594 | 0.0838 |
| Divorced versus married/single | 1.178 | 1.031 | 1.347 | 0.0161 |
| Widowed versus married/single | 1.270 | 1.083 | 1.489 | 0.0032 |
| ER positive/unknown | 0.669 | 0.591 | 0.756 | <0.0001 |
| PR positive/unknown | 0.851 | 0.756 | 0.957 | 0.0071 |
| Number of positive nodes | | | | |
| 4–9 vs 0–3 | 1.625 | 1.450 | 1.821 | <0.0001 |
| 10+ vs 0–3 | 2.741 | 2.374 | 3.165 | <0.0001 |
| Unknown vs 0–3 | 1.394 | 1.091 | 1.780 | 0.0078 |
| Tumor diameter >2 cm | 1.392 | 1.250 | 1.552 | <0.0001 |
| Education not high school graduate | 1.259 | 1.102 | 1.439 | 0.0007 |
| Model for metastatic patients | | | | |
| African American | 1.531 | 1.240 | 1.890 | <0.0001 |
| Performance status | | | | |
| 1 vs 0/unknown | 1.289 | 1.121 | 1.482 | 0.0004 |
| 2 vs 0/unknown | 1.424 | 1.245 | 1.629 | <0.0001 |
| Estrogen receptor | | | | |
| Positive versus negative | 0.627 | 0.534 | 0.735 | <0.0001 |
| Unknown versus negative | 0.689 | 0.544 | 0.874 | 0.0021 |
| Prior hormonal therapy | 1.194 | 1.037 | 1.374 | 0.0134 |
| Visceral involvement | 1.288 | 1.107 | 1.498 | 0.0011 |
| Bone involvement | 1.319 | 1.145 | 1.519 | 0.0001 |
| Education not high school graduate | 1.188 | 0.981 | 1.438 | 0.0779 |
| Not high school graduate among African American | 0.669 | 0.455 | 0.981 | 0.0397 |

A stratified Cox proportional hazards model with backwards elimination was used to generate these models.
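As a reading aid for Table 3, the hazard ratio, confidence limits, and p-value in each row are linked on the log scale: the standard error of ln(HR) is roughly the width of the log confidence interval divided by 2 × 1.96, and the Wald statistic is ln(HR) divided by that standard error. The sketch below applies this to the early stage education row. It assumes the reported limits are Wald intervals and the p-values are Wald p-values, which the paper does not state explicitly; the function name `wald_from_ci` is hypothetical.

```python
from math import log
from statistics import NormalDist


def wald_from_ci(hr, lower, upper, level=0.95):
    """Recover SE of ln(HR), the Wald z statistic, and a two-sided p-value.

    Assumes a symmetric (Wald) confidence interval on the log scale.
    """
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - (1 - level) / 2)
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(hr) / se
    p = 2 * (1 - norm.cdf(abs(z)))
    return se, z, p


# Early stage model, "Education not high school graduate" row of Table 3.
se, z, p = wald_from_ci(hr=1.259, lower=1.102, upper=1.439)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the recovered p-value is close to the tabulated 0.0007; small discrepancies in other rows would be expected because the table reports rounded values.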

Among metastatic breast cancer patients, multivariable analysis showed that the effect of having less than a high school education varied across racial groups (p = 0.040; Table 3). The HR associated with having less than a high school education was 0.80 (95% CI: 0.57, 1.11) among African Americans and 1.19 (95% CI: 0.98, 1.44) among non-African Americans. Of greater magnitude was the statistically significant HR for race: 1.53 (95% CI: 1.24, 1.89). A model without the interaction between race and education shows that the African American main effect was statistically significant (p = 0.001; HR = 1.35; 95% CI: 1.13, 1.62) and the education main effect was not (p = 0.442; HR = 1.07; 95% CI: 0.90, 1.26; model not shown).
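The race-specific hazard ratios for education follow from the interaction model in Table 3: in a Cox model with a race-by-education interaction, log hazard ratios add, so hazard ratios multiply. A small sketch of that arithmetic, using the rounded Table 3 estimates (the variable names are illustrative and not taken from the paper):

```python
from math import exp, log

# Rounded estimates from the metastatic model in Table 3.
hr_education = 1.188      # not a high school graduate (applies to non-African Americans)
hr_interaction = 0.669    # additional multiplier for African American patients

# Education effect among non-African Americans: the main effect alone.
hr_edu_non_african_american = hr_education

# Education effect among African Americans: main effect times interaction,
# equivalently exp(sum of the two log hazard ratios).
hr_edu_african_american = exp(log(hr_education) + log(hr_interaction))

print(f"non-African American: {hr_edu_non_african_american:.2f}")
print(f"African American: {hr_edu_african_american:.2f}")
```

The product of the rounded values is about 0.79, which agrees with the reported 0.80 up to rounding of the tabulated estimates. The confidence interval for the combined effect cannot be obtained this way, because it also depends on the covariance between the two coefficient estimates, which the paper does not report.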


We hypothesized a priori that clinical trial participation with standardized treatment plans and rigorous patient follow-up would initially negate any potential effect of social class, as measured by education, on the hazard of dying, and that after completion of ‘active’ protocol treatment, the effect of education would emerge. This hypothesis implied that education would not have an impact upon the survival of metastatic patients, given that such patients are intensively followed until treatment failure and death, and that education would have an effect among early stage patients as they are less rigorously followed long-term after protocol treatment termination.

The hypothesis was substantiated with multivariable analyses among early stage patients where the lack of a high school education and African American race were associated with a greater hazard of dying. Hazard ratios for race and education are of similar magnitude, with no evidence of an inconsistent effect of education across racial groups.

Within the metastatic patient population, the relationship between education and survival is complicated by a statistically significant interaction, which suggests that the effect of education on survival varies across racial groups. Within each racial group, the effect of education is not statistically significant. Among African Americans, the lack of a high school education is associated with better survival (HR = 0.80), whereas among non-African Americans it is associated with poorer survival (HR = 1.19). It is not clear whether the statistically significant interaction is a false positive result and an artifact of non-significant effects of education that are in opposite directions, or whether it indicates that African American women with less education are tied into services that might support them more than such services support less educated non-African American women. Regardless, survival of African Americans is poorer than that of non-African Americans (HR = 1.53).

The fact that education had a statistically significant effect on survival among early stage patients that did not vary across racial groups is consistent with reports by Meara that there is a gap in life expectancy among women with low and high education regardless of race [52]. In contrast, among patients with metastatic disease, education did not have an effect on survival within racial subgroups.

This study is unique in that it focuses on the effect of education within a setting where initial therapeutic decisions are not influenced by SES. SES, however, may have resulted in observed differences in baseline characteristics of the early stage and metastatic subgroups due to influences of SES on stage at diagnosis [53-57] and access to clinical trials.

Commonly used ‘area-wide measures’ of socioeconomic data based upon census or administrative databases are unreliable as they classify all patients within a heterogeneous community as having the same SES, as measured by education or income [58]. Dale advocates use of SES measures obtained from each individual and argues that income and education data should be obtained with the recognition that other factors such as occupation might be needed to capture the full effect of SES [16]. Furthermore, Dale recommends that investigations focusing on SES and cancer survival should have adequate sample sizes to make scientifically and statistically sound inferences and that the investigations focus on specific cancer sites. With these criteria as benchmarks, the study described in this paper is reasonably well designed to investigate the relationship between SES and cancer survival in that SES, as measured by education, is available on the individual patient level, the sample size is large enough to assure statistically sound inferences, and a relatively homogeneous population, that is, one cancer site, has been studied. The inclusion of patient-reported income would have strengthened the study; however, such data were purposely not collected as previous pilot work had indicated that a large percentage of patients would not provide such data [35].

Given that race has been the focus of much of the published clinical literature concerning the effect of SES on survival, race was included in analyses as a potential confounder of the effect of education. The increased risk of death among African American women as shown in analyses is consistent with previous reports [58-60].

Both Albain [59] and Polite [60] have wrestled with the source of the race effect, whether it is biologically based or a result of SES. The provocative paper by Albain reports no racial effect on survival among patients with acute myelogenous leukemia, limited small cell lung cancer, advanced stage non-small cell lung cancer, multiple myeloma, adjuvant colon cancer, and advanced stage non-Hodgkin's lymphoma and a statistically significant negative effect of being African American on survival among sex-specific cancers (i.e., early stage breast cancer, advanced stage ovarian cancer, and advanced stage prostate cancer). The CALGB has reported similar observations for lung cancer [34, 61, 62]. The results presented in this paper complement the report by Albain by showing that the mortality rate associated with metastatic breast cancer is greatest among African American women. Albain concludes that tumor biology and inherited host factors contribute to differential survival outcomes by race in sex-specific malignancies.

Both race and education have been found to be independent predictors of survival among early stage breast cancer patients treated on CALGB clinical trials. Among patients with metastatic disease, race also has a significant effect on survival; however, education appears to have an effect on survival that is inconsistent across racial groups. We conjecture that education is a surrogate for social status, whereas race is a surrogate for both biological/genetic and social factors. Additional research is needed to substantiate such a statement and to gain a better understanding of the relationship between race, education, stage, clinical trial participation, and survival. An integral part of this additional research needs to be an examination of the sociocultural and behavioral factors that contribute to the poorer prognosis of long-term breast cancer survivors with low SES.

Regardless of the underlying mechanism for the associations between education and survival, it is clear that post-trial survivorship plans need to focus on women with low social status, as measured by education. Issues that need consideration include long-term compliance with hormone administration, management of comorbidities, cancer prevention, and detection.


The research was supported, in part, by grants from the National Cancer Institute (CA31946) to the Cancer and Leukemia Group B (Richard L. Schilsky, MD, Chairman) and to the CALGB Statistical Center (Stephen George, PhD, CA33601). The authors were also supported by grants (Alice B. Kornblith, PhD, CA32291; Jimmie C. Holland, MD, CA77651; Electra D. Paskett, PhD, CA77658). The content of this manuscript is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute.