Points of view or opinions in this article are those of the authors and do not necessarily represent the official position or policies of the National Cancer Institute.
Differences in late-stage cancer risk between urban and rural residents are a key component of cancer disparities. Using data from the Illinois State Cancer Registry from 1998 through 2002, the authors investigated the rural-urban gradient in late-stage cancer risk for 4 major types of cancer: breast, colorectal, lung, and prostate.
Multilevel modeling was used to evaluate the role of population composition and area-based contextual factors in accounting for rural-urban variation. Instead of a simple binary rural-urban classification, a finer grained classification was used that differentiated the densely populated City of Chicago from its suburbs and from smaller metropolitan areas, large towns, and rural settings.
For all 4 cancers, the risk was highest in the most highly urbanized area and decreased as rurality increased, following a J-shaped progression that included a small upturn in risk in the most isolated rural areas. For some cancers, these geographic disparities were associated with differences in population age and race; for others, the disparities remained after controlling for differences in population composition, zip code socioeconomic characteristics, and spatial access to healthcare.
Cancer stage at the time of diagnosis is critically important in affecting the long-term health and well being of cancer patients. Detection at an early stage increases the likelihood that cancers can be treated successfully, reducing the risk of morbidity and mortality and enhancing long-term prognosis. In the US, there are wide disparities in late-stage cancer risk among population groups and among geographic regions. Low-income, vulnerable populations are more likely to be diagnosed with cancer at a distant or “late” stage. Late-stage risk also varies geographically. Some studies identify a greater risk of late-stage diagnosis among rural residents who face long distances in accessing cancer screening services. However, research on rural-urban disparities has produced mixed and conflicting findings that question whether rural residents are disadvantaged in late-stage risk.
This research investigates the rural-urban gradient in late-stage cancer risk in Illinois for 1998 through 2002. Using data from the Illinois State Cancer Registry, we examine fine-grained patterns of rural-urban variation for 4 major types of cancer: breast, colorectal, lung, and prostate. The objectives are to determine the extent and direction of the rural-urban gradient in late-stage risk for each type of cancer and to analyze the roles of individual demographic characteristics and zip code-level contextual variables in accounting for rural-urban variations.
Studies of rural-urban disparities in late-stage cancer risk in the US present a mixed picture of geographic variation. Conventional wisdom suggests that rural residents have a higher risk of late diagnosis because of numerous barriers to obtaining preventive health services and screening for early detection. These barriers include: poor geographic access to primary care and cancer screening services,1, 2 lack of insurance,3 and lack of knowledge regarding screening guidelines.4-6 The concentration of vulnerable population groups, such as the elderly, in rural areas makes these barriers even more significant.
The results from some empirical investigations have produced support for the hypothesis of a rural disadvantage by uncovering higher rates of late-stage cancer among rural residents.7, 8 African Americans living in rural areas are particularly disadvantaged,9 as are residents of remote and impoverished rural areas.10 It also has been observed that rural areas have lower rates of cancer screening, which results in increased late-stage disease and a higher proportion of unstaged cancers.10-13 Evidence of a rural disadvantage also comes from research in other countries.14-16
An equally diverse body of research suggests that there is little or no rural-urban gradient in late-stage cancer risk. Urban/rural residence had no significant association with cancer stage for patients with breast cancer and melanoma in California.17 For colorectal cancer in California, controlling for socioeconomic status eliminated rural-urban disparities in late-stage risk.18 Lung cancer survival was unrelated to urban versus rural residence but was associated strongly with socioeconomic deprivation.19 Similar findings have been reported for colorectal cancer in North Carolina20 and for breast cancer in New Zealand.21 There also is evidence that, although rural residents may have been disadvantaged in the past, the rural-urban divide has closed over time.17, 22 Knowledge of and access to cancer screening has increased substantially in many rural areas.23
Some studies even suggest that there is a reverse pattern of rural-urban disadvantage in which late-stage risk is higher in cities than in rural areas.24 Among breast cancer patients in Florida, living in urban areas was associated with a higher risk of late-stage presentation.25 Intraurban research indicates high rates of late-stage diagnosis in impoverished inner-city communities, rates that may be indicative of clusters of urban disadvantage.26-29 A recent study of colorectal and lung cancers in the US30 revealed that, after controlling for social and demographic factors that were linked to late presentation, the risk is higher in urban areas than in rural areas.
These contrasting findings highlight the need for further research into the rural-urban gradient in late-stage cancer risk. Many previous studies were limited by their reliance on a simple binary classification of urban versus rural that failed to capture the geographic diversity of residential environments in the US. Moreover, the definitions of rural and urban differed depending on whether counties or smaller geographic units were used in constructing the rural-urban classification. Past research also suggested that rural-urban disparities may vary among cancer types, reflecting differences in the availability and accessibility of screening services and health education and sociodemographic differences in affected populations. Cancer stage also has been associated with individual and contextual variables, such as age, education, and economic deprivation, and these confounding factors need to be considered in studies of geographic disparities.
MATERIALS AND METHODS
In this report, we present a multilevel statistical analysis of rural-urban disparities in late-stage cancer presentation for the 4 major types of cancer (breast, colorectal, lung, and prostate) in Illinois. Illinois is an appropriate study area because it encompasses a diverse range of geographic settings, from the densely populated Chicago metropolitan area to low-density, remote rural regions. Data concerning all cancer cases for the years 1998 through 2002 were obtained from the Illinois State Cancer Registry (ISCR). The ISCR was certified as a Gold Standard registry by the North American Association of Central Cancer Registries for every year from 1998 through 2002 (personal communication with Dr. Tiefu Shen, Director of ISCR, on October 22, 2008). Cases among Illinois residents that were diagnosed in neighboring states are included, and the completeness of case ascertainment is estimated at 98%.31 The dataset comprises individual records of cancer incidence in Illinois, and each cancer case is geocoded to the zip code of residence. For each cancer patient, variables describing cancer type, age group, sex, race, diagnosis stage, and year are included.
The ISCR uses a classification scheme consistent with Surveillance, Epidemiology, and End Results summary stage to measure disease stage at diagnosis.32 The in situ and localized categories (stage of disease code 0 or 1) were considered early stage, and regional and distant categories (codes 2-7) were considered late stage. Unstaged cases or those with unknown stage were excluded. The percentage of cases lacking stage information varied among the 4 cancer sites, ranging from 5.1% for breast cancer to 13.1% for lung cancer (Table 1).
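The staging rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and constant names are hypothetical, and any code outside the range 0-7 (or a missing value) is treated as unstaged or unknown and excluded, because the text does not list the specific codes used for unstaged cases.

```python
from typing import Iterable, List, Optional

# Stage-of-disease codes as described in the text.
EARLY_CODES = {0, 1}            # in situ and localized
LATE_CODES = set(range(2, 8))   # regional and distant (codes 2-7)


def late_stage_indicator(stage_code: Optional[int]) -> Optional[int]:
    """Return 1 for late stage, 0 for early stage, None for excluded cases."""
    if stage_code in LATE_CODES:
        return 1
    if stage_code in EARLY_CODES:
        return 0
    # Assumption: anything else is unstaged or unknown and is excluded.
    return None


def recode_stages(stage_codes: Iterable[Optional[int]]) -> List[int]:
    """Drop excluded cases and return the binary late-stage outcome."""
    indicators = (late_stage_indicator(code) for code in stage_codes)
    return [y for y in indicators if y is not None]
```

The resulting 0/1 values correspond to the binary dependent variable used in the multilevel models.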
Table 1. Cancer Stage by Type in Illinois, 1998-2002 (table body not recovered; surviving column headings: Total No. of Cases, No. of Cases)
The percentage of patients diagnosed with late-stage disease differs substantially by cancer type. Nearly 80% of lung cancer patients present with late-stage disease, compared with only 15.6% for prostate cancer. Breast cancer (36.9%) and colorectal cancer (62.9%) have intermediate values. These inequalities by cancer type reflect biologic, social, and healthcare factors, including differences in awareness and information concerning early diagnosis and in the availability of screening procedures for early detection.
Multilevel modeling was used to analyze the rural-urban gradient in late-stage cancer risk and to evaluate the role of individual demographic characteristics (age, sex, race) and zip code-level contextual factors in accounting for rural-urban variation. Multilevel modeling is an appropriate method for investigating contextual effects—the effects of the local environment or neighborhood on health outcomes.33 The dependent variable in the multilevel models is a binary variable that represents late-stage diagnosis (code 1, late) for individual cancer patients; thus, we model an individual cancer patient's risk of late diagnosis.
Independent variables exist at 2 levels: individual and zip code. The individual-level data come from the ISCR. Because of privacy and confidentiality restrictions, we have access only to a limited set of variables—age category, race, cancer type, cancer stage at diagnosis, and zip code of residence—for each cancer patient. Three dummy-coded, individual-level, independent variables were included in the models: young age (aged <50 years), older age (aged ≥70 years), and race. At the zip code level, independent variables representing rural-urban location, spatial access to healthcare, and socioeconomic characteristics of zip codes were included (Table 2).
Table 2. Variable Definitions (table body not recovered; surviving entry: travel time to nearest mammography or colonoscopy facility). Note: Travel time is included only in the models for breast and colorectal cancer.
To differentiate various types of urban, suburban, and rural locations, we subdivided the state into 5 zones based on the Rural-Urban Commuting Areas classification scheme (RUCA) developed by the Office of Rural Health Policy.34 The RUCA scheme classifies areas on the basis of urbanized population and commuting flow. We rely on the 4-tiered RUCA taxonomy, which classifies zip codes into 1) urban core areas, 2) suburban areas, 3) large town areas (urbanized population, 10,000-49,999), and 4) small town and isolated rural areas. This classification was devised by researchers at the University of Washington based on census tract data aggregated into zip code areas.35
Developed mainly for analyzing rural issues, RUCA codes do not differentiate well within urban metropolitan areas. For example, most zip codes in the Chicago metropolitan area, including the City of Chicago and surrounding suburbs, are classified into RUCA category 1 (urban core) despite their geographic and population differences. Smaller cities, such as Peoria and Rockford, also are lumped into category 1, although they differ greatly in population density and healthcare availability from Chicago and its suburbs.
To represent these urban and suburban environments better, we modified and expanded the RUCA scheme as follows (Fig. 1):
1) Chicago City: With a population of 2.9 million in 2000, Chicago represents a distinct geographic setting because of its high population density; large poor and immigrant populations; large concentration of hospitals, physicians, and other healthcare providers; and well developed public transit system.
2) Chicago Suburbs: Forming an expanding ring around Chicago, these areas are characterized by moderate-density residential and commercial development, a growing and increasingly diverse population, and strong linkages with the central city; Chicago suburbs were identified as parts of the Chicago metropolitan statistical area located in Illinois and not including the city itself.
3) Other Metropolitan: This group consists of smaller cities and suburbs in other parts of the state, such as the cities of Peoria and Springfield; zip codes in RUCA categories 1 and 2 (urban core and suburban) were selected, and those not located in Chicago or its suburbs were placed in the “Other Metropolitan” category.
4) Large Towns: This group comprises RUCA category 3, towns with populations between 10,000 and 50,000 and their surrounding rural areas with high commuting into the town.
5) Rural: These areas represent RUCA category 4, small towns (population <10,000) and isolated rural areas.
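The 5-zone scheme listed above can be expressed as a simple decision rule. The following Python sketch is illustrative only: the function name, the argument names, and the two Boolean flags (whether a zip code lies in the City of Chicago or in the Illinois portion of the Chicago metropolitan statistical area) are hypothetical inputs, and the order in which the checks are applied is an assumption about how overlapping cases would be resolved.

```python
def assign_zone(ruca4: int, in_chicago_city: bool, in_chicago_msa: bool) -> str:
    """Map a zip code to one of the 5 rural-urban zones.

    ruca4: 4-tier RUCA category (1 urban core, 2 suburban,
           3 large town, 4 small town and isolated rural).
    in_chicago_city / in_chicago_msa: hypothetical location flags.
    """
    if in_chicago_city:
        return "Chicago City"
    if in_chicago_msa:
        # Illinois part of the Chicago metropolitan area, excluding the city.
        return "Chicago Suburbs"
    if ruca4 in (1, 2):
        # Urban core or suburban zip codes outside the Chicago area.
        return "Other Metropolitan"
    if ruca4 == 3:
        return "Large Towns"
    if ruca4 == 4:
        return "Rural"
    raise ValueError(f"Unexpected RUCA category: {ruca4!r}")
```

For example, a RUCA category 1 zip code in Peoria would fall into "Other Metropolitan" under this rule, because it is urban core but lies outside Chicago and its suburbs.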
Four dummy-coded variables describe the 5 types of rural-urban settings of the zip code of residence described above (see Table 2). In all models, the reference category is the City of Chicago, so Chicago zip codes serve as the basis for comparison. The coefficients for these geographic variables describe the difference in late-stage risk for patients living in a particular type of area relative to the risk for patients living in Chicago, with all other independent variables held constant.
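To make the reference-category coding concrete, the sketch below shows one way to turn a zone label into 4 dummy variables with Chicago City as the omitted category. The names chic_sub and other_metro follow the abbreviations used in Table 3; large_town and rural, along with the function name, are hypothetical labels chosen for illustration.

```python
# Zone labels mapped to dummy-variable names; Chicago City is the
# reference category and therefore has no dummy of its own.
ZONE_TO_DUMMY = {
    "Chicago Suburbs": "chic_sub",
    "Other Metropolitan": "other_metro",
    "Large Towns": "large_town",   # hypothetical name
    "Rural": "rural",              # hypothetical name
}
REFERENCE_ZONE = "Chicago City"


def zone_dummies(zone: str) -> dict:
    """Return the 4 dummy variables (0/1) for a given zone label."""
    if zone != REFERENCE_ZONE and zone not in ZONE_TO_DUMMY:
        raise ValueError(f"Unknown zone: {zone!r}")
    active = ZONE_TO_DUMMY.get(zone)  # None for the reference zone
    return {name: int(name == active) for name in ZONE_TO_DUMMY.values()}
```

A Chicago City zip code receives zeros on all 4 dummies, so each estimated coefficient is read as a contrast with Chicago.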
In addition to these rural-urban classifications, several variables at the zip code level, obtained from 2000 Census data for Zip Code Tabulation Areas, were included as area-based socioeconomic indicators.36 A large number of socioeconomic variables from the 2000 Census were examined by estimating models using different combinations of variables. Most of the socioeconomic variables were eliminated because they had high levels of multicollinearity. For the final model, 2 area-based socioeconomic indicators were used: median household income (in logarithm) and percentage of population that does not speak English. These were chosen because they are relatively uncorrelated and represent 2 distinct dimensions of socioeconomic status: economic deprivation/well being and populations disadvantaged by language ability. Although many researchers advocate the use of poverty as an economic indicator,36 we observed that poverty varies much less than median income across rural areas in Illinois. Thus, income is a more sensitive indicator of economic well being, especially in rural areas.
The final 2 zip code-level, independent variables describe spatial access to healthcare from a cancer patient's zip code of residence: spatial access to primary care and travel time to the nearest cancer screening facility (Table 2). The latter variable is used only in the models for breast cancer and colorectal cancer, because those 2 types of cancer have clearly identifiable screening services (mammography and colonoscopy). All mammography and colonoscopy facilities in the state were geocoded by address. Travel times from the centroid of each zip code to the nearest screening facility were estimated based on real-world road networks, which account for the type of road and adjust for typically lower travel speeds in densely populated urban areas.37
Spatial access to primary care was estimated using the 2-step floating catchment area method (2SFCA).37, 38 For each zip code, the 2SFCA computes a numerical value that represents the ratio of the local supply of primary care physicians to the local demand (population) for primary care. Supply and demand are measured in a floating window within a fixed range (ie, 30 minutes) of travel time. A high value for this spatial access measure represents a high ratio of supply to demand—a large number of primary care physicians in the local area compared with the population.
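The 2 steps of the 2SFCA can be sketched as follows. This is a minimal illustration of the general method rather than the implementation used in the study: the function name, the dictionary-based inputs, and the default 30-minute threshold argument are assumptions, and travel times are taken as precomputed.

```python
from typing import Dict, Tuple


def two_step_fca(
    population: Dict[str, float],
    physicians: Dict[str, float],
    travel_time: Dict[Tuple[str, str], float],
    threshold: float = 30.0,
) -> Dict[str, float]:
    """Compute a 2SFCA accessibility score for each zip code.

    population:  zip code -> population (demand)
    physicians:  physician site -> number of primary care physicians (supply)
    travel_time: (zip code, physician site) -> travel time in minutes;
                 missing pairs are treated as out of range
    """
    inf = float("inf")

    # Step 1: for each physician site, divide supply by the total
    # population located within the travel-time threshold of that site.
    ratio = {}
    for site, supply in physicians.items():
        demand = sum(
            pop
            for zip_code, pop in population.items()
            if travel_time.get((zip_code, site), inf) <= threshold
        )
        ratio[site] = supply / demand if demand > 0 else 0.0

    # Step 2: for each zip code, sum the ratios of all physician sites
    # within the threshold of that zip code.
    access = {}
    for zip_code in population:
        access[zip_code] = sum(
            r
            for site, r in ratio.items()
            if travel_time.get((zip_code, site), inf) <= threshold
        )
    return access
```

Higher scores indicate more physicians per resident within reach, which matches the interpretation of the spatial access measure given above.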
The multilevel models specify individual cancer patients nested within zip codes—a 2-level hierarchical model. A logistic, multilevel formulation was used because the dependent variable is binary (late-stage). The specific model formulation that we used in this research is a 2-level “intercepts-as-outcomes model.” Such models assume that the effects of individual-level variables are fixed across zip codes and that zip code intercepts vary as a function of zip code socioeconomic and spatial variables. To evaluate rural-urban variation and to assess whether individual and contextual variables account for that variation, we entered independent variables into the multilevel models in blocks. The first models included only the 4 urban-rural variables; next, individual-level variables representing age and race were entered. In the final models, all zip-code socioeconomic and spatial variables were included (Table 3). All models were estimated using STATA statistical software (StataCorp, College Station, Tex).
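For readers who prefer notation, a 2-level intercepts-as-outcomes logistic model of the kind described above can be written as follows. The symbols are illustrative assumptions rather than the authors' own notation: patient i lives in zip code j, the 4 zone dummies are indexed by k, and a normally distributed random intercept is assumed, as is conventional for such models. The travel-time term would appear only in the breast and colorectal cancer models.

```latex
\begin{align}
\operatorname{logit}\,\Pr(y_{ij}=1) &= \beta_{0j}
  + \beta_{1}\,\mathrm{Young}_{ij}
  + \beta_{2}\,\mathrm{Older}_{ij}
  + \beta_{3}\,\mathrm{Race}_{ij} \\
\beta_{0j} &= \gamma_{00}
  + \sum_{k=1}^{4}\gamma_{0k}\,\mathrm{Zone}_{kj}
  + \gamma_{05}\,\ln(\mathrm{Income}_{j})
  + \gamma_{06}\,\mathrm{NonEnglish}_{j}
  + \gamma_{07}\,\mathrm{Access}_{j}
  + \gamma_{08}\,\mathrm{Travel}_{j}
  + u_{0j},
  \qquad u_{0j}\sim N(0,\tau^{2})
\end{align}
```

The first equation has fixed slopes for the individual-level variables, and the second lets the zip code intercept depend on the zone, socioeconomic, and spatial access variables, which is what the block-wise entry of variables successively adds.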
Table 3. Multilevel Model Coefficients by Cancer Type, 1998-2002*
Chic_sub indicates Chicago suburb; Other_metro, other metropolitan area.
For each type of cancer, the first model includes only the 4 dummy independent variables for the 5-category rural-urban classifications; the second model adds 3 dummy-coded, individual-level variables (age and race); and the third model adds 4 zip code-level variables (2 socioeconomic indicators and 2 for spatial access to healthcare). A negative coefficient means that the variable is associated with a reduced risk of late-stage cancer. A positive value means that the variable increases the risk of late-stage disease.
The first set of multilevel models, which includes only the rural-urban location variables, reveals significant geographic variation in late-stage cancer risk for all 4 cancers. Table 3 provides the coefficients for the multilevel models; odds ratios (ORs), calculated as exp(b), are mentioned in the text and illustrated in Figure 2 for categoric variables, such as age category and geographic zone. According to Table 3, in every case, the risk of late diagnosis is greatest among patients living in the City of Chicago. Those living in the other 4 geographic zones have significantly lower risk, with the exception of lung cancer patients in the Chicago suburbs (not significant).
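The conversion from a logistic coefficient to an odds ratio, OR = exp(b), can be illustrated with a short Python sketch. The function names are hypothetical. Note that the percentage computed here is a change in odds, which approximates a change in probability only when the outcome is relatively uncommon.

```python
import math


def odds_ratio(b: float) -> float:
    """Convert a logistic regression coefficient to an odds ratio."""
    return math.exp(b)


def percent_change_in_odds(b: float) -> float:
    """Percentage change in the odds relative to the reference category."""
    return (math.exp(b) - 1.0) * 100.0
```

A coefficient of zero gives an OR of 1 (no difference from Chicago), negative coefficients give ORs below 1, and positive coefficients give ORs above 1, consistent with the sign interpretation given in the note to Table 3.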
Graphing the ORs for each geographic zone and cancer type reveals a clear and remarkably consistent rural-urban gradient in late-stage risk (Fig. 2). Risk is highest in Chicago, decreases in the less urbanized zones, and reaches its nadir in other metropolitan areas and large towns. Risk increases somewhat among patients living in the most rural areas, tracing a reverse J-shaped gradient along the urban-rural continuum. This J-shaped gradient holds for all 4 cancer types. The gradients are steepest for breast, colorectal, and lung cancers, all of which record the lowest ORs among patients living in large towns. These patients are roughly 25% less likely to present with late-stage cancer than their counterparts in the City of Chicago, as indicated by ORs ranging from 0.71 to 0.79. The gradient is less pronounced for prostate cancer, although patients living outside Chicago still are from 15% to 20% less likely to be diagnosed with late-stage disease than those living in the city. Thus, for all cancers, late diagnosis is most concentrated in the highly urbanized City of Chicago and decreases with decreasing urbanization, recording a modest increase in the most isolated rural areas.
To determine whether differences in population composition account for these rural-urban variations, individual-level age and race variables were added to the multilevel models for each cancer type (Table 3). In every case, older age is linked to a reduced likelihood of late diagnosis. For breast cancer (OR, 0.75) and lung cancer (OR, 0.79), individuals aged ≥70 years are roughly 20% to 25% less likely to be diagnosed with late-stage disease than their younger counterparts. The decrease in risk is less for patients with colorectal cancer and prostate cancer but remains statistically significant. In contrast, young age (aged <50 years) is associated with a higher risk of late diagnosis, and the heightened risk is statistically significant for all but prostate cancer. Racial disparities also are evident. Blacks are more likely than others to be diagnosed with late-stage breast, colorectal, and prostate cancer when the analysis is controlled for age and rural-urban location. The racial disparity is particularly wide for breast cancer (OR, 1.45) and prostate cancer (OR, 1.34). For lung cancer, black patients are less likely to present with late-stage disease.
Adding these individual-level, demographic variables reduces the urban-rural gradient in late-stage risk for all cancers with the exception of lung cancer. For prostate cancer, the gradient is eliminated almost completely, suggesting that the observed rural-urban disparities are primarily the result of compositional differences in population age and race. For lung cancer and colorectal cancer, the J-shaped gradient in late-stage risk remains; however, disparities diminish for colorectal cancer but widen for lung cancer. This means that, for lung cancer, the high rates of late-stage diagnosis observed in Chicago city and suburbs are even higher than expected based on the age and racial composition of lung cancer patients. Adjusting for age and race has a different impact on rural-urban differences for breast cancer: Risks converge for rural patients and those in Chicago city; whereas patients living in Chicago suburbs, other metropolitan areas, and large towns are significantly less likely to be diagnosed with late-stage disease.
The final models include zip code-level indicators of socioeconomic conditions and spatial access to healthcare. These contextual variables achieve statistical significance only in the models for breast cancer and lung cancer (Table 3). In both cases, median income is associated inversely with late diagnosis: Residents of higher income areas have a reduced likelihood of late-stage presentation, confirming the strong ties between economic vulnerability and cancer stage observed in many research studies.39-41 Spatial access to primary healthcare also is statistically significant, and patients living in areas that lack primary healthcare resources are more likely to present with distant-stage disease. Adding these contextual variables to the models leads to further changes in rural-urban disparities. Disparities essentially are eliminated for prostate cancer and colorectal cancer (except for colorectal patients residing in large towns). Among breast cancer patients, the risk of late-stage diagnosis is less for those living in other metropolitan and large town settings when the analysis is controlled for patient demographics and area-based indicators of socioeconomic disadvantage and spatial access to healthcare. In the case of lung cancer, controlling for these factors has the opposite result, revealing wider disparities along the rural-urban continuum. The likelihood of late-stage presentation is highest among residents of Chicago city and suburbs and follows the same J-shaped pattern observed previously. All other factors being equal, patients who live outside the Chicago area are 25% to 35% less likely than their Chicago-area counterparts to present with late-stage lung cancer.
A rural-urban gradient in the risk of late-stage cancer is evident for the 4 major types of cancer in Illinois; however, there is little indication of a rural disadvantage. Instead, we observe that the likelihood of late-stage diagnosis is highest among patients living in the most densely populated zone: the City of Chicago. The observed pattern of urban disadvantage provides support for the finding reported by Paquette and Finlayson30 that, for certain cancers, the risk of late-stage presentation is higher among residents of urban areas than among nonurban residents. In the current study, we use a finer grained rural-urban classification that differentiates the densely populated City of Chicago from its suburbs and from smaller metropolitan areas, large towns, and rural settings. For all 4 cancers, risk decreases as rurality increases, following a J-shaped gradient that includes an upturn in risk in the most isolated rural areas.
For colorectal cancer and prostate cancer and, to a lesser extent, for breast cancer, rural-urban disparities largely disappear when individual-level and zip code-level variables are controlled. Thus, the geographic differences observed stem mainly from differences in the age and racial composition of cancer patients and the social and spatial characteristics of the locations in which they live. The concentration of vulnerable populations and economically disadvantaged places in Chicago and its suburbs underpins the high rates of late-stage diagnosis observed in these highly urban areas. Conversely, in the most rural areas, the lower rates of late-stage diagnosis primarily reflect the greater presence of elderly patients who have a lower risk of late-stage diagnosis. Finally, in the case of lung cancer, the rural-urban gradient not only remains after individual and zip code characteristics are controlled, but it also becomes more extreme. This suggests that there are unmodeled factors, such as cancer awareness or diagnostic differences, that vary along the rural-urban continuum, leading to systematic disparities in stage at presentation. Lung cancer has the highest percentage of unstaged cases; therefore, staging procedures and accuracy may be important.
These findings also reveal a strong and consistent advantage for patients living in the other metropolitan and large town contexts—an advantage that remains after individual and zip code variables are controlled (except for prostate cancer). Other metropolitan areas and large towns are dispersed across the state, ranging in population size from 10,000 to approximately 300,000. Nearly all of these places have primary care physicians, and many have hospital facilities. The lower incidence of late-stage presentation among patients residing in these areas warrants more in-depth investigation focused on how the interactions between individuals and local healthcare systems affect cancer screening and awareness. For example, residents of these smaller urban places may face fewer space-time constraints in accessing cancer screening services than their counterparts in the most urban and rural settings. Whatever its causes, the lower rate of late-stage presentation in these smaller urban places highlights the scope for improvement elsewhere.
The results of the current study also emphasize the need to look beyond the binary categories of urban and rural in investigating geographic health disparities. In Illinois, late-stage cancer risk consistently is lowest in the 2 geographic zones that straddle the urban-rural divide: other metropolitan places (normally classified as “urban”) and large towns (normally classified as “rural”). Using traditional urban and rural definitions smooths away variation along the rural-urban continuum, obscuring the high incidence of late diagnosis that exists in the Chicago region and the (moderately) higher incidence in isolated rural settings. Variation within these broad geographic zones also is important, although it was not examined here. Studies indicate that there is marked geographic variation in late diagnosis within large cities such as Chicago,27-29 and similar spatial inequalities are likely to exist in rural areas. Using spatial clustering methods to identify geographic concentrations of high-rate areas enables the detailed spatial targeting of public health interventions.42-44
Similar to many studies, we observed a higher risk of late-stage cancer among younger patients and a lower risk among older patients. These differences are likely to result from differences in frequency of primary care visits and age-related cancer screening protocols. We also observed significant racial disparities in late-stage disease for all types of cancer. The higher likelihood of late presentation among black patients with breast, colorectal, and prostate cancer when the analysis is controlled for age and zip code socioeconomic characteristics confirms the persistent racial disparities in late-stage cancer reported elsewhere.45, 46 Similar to disadvantaged populations in other countries,24 the black population in the US is distinctly vulnerable to late diagnosis for these types of cancer.
The causes of these racial disparities are not well understood. Emerging evidence for breast cancer points to biologic differences in tumor size, type, and lymph node involvement.47 At the same time, contextual and cultural factors that affect the use of screening services and the quality and effectiveness of those services also are likely to be important.48 Our results indicate that such factors go beyond the socioeconomic and sociocultural characteristics of residential areas and their spatial accessibility to healthcare. Black patients may face different kinds of constraints and opportunities in accessing healthcare than patients of other racial groups, and such differences are not captured in our modeling strategy.
For lung cancer, the racial disparity is reversed: blacks are less likely than others to be diagnosed with late-stage lung cancer, although the decrease in risk is small. This finding is not consistent with national data, which reveal a heightened risk of late diagnosis among black patients with lung cancer.49 However, that study controlled for the age and area-based socioeconomic characteristics that consistently increase late-stage risk. Understanding how socioeconomic deprivation interacts with and affects racial disparities in lung cancer is an important topic for future investigation.
In conclusion, rural-urban inequalities in late-stage cancer risk are an important dimension of persistent disparities in cancer morbidity and mortality. The current study results indicate that the odds of late-stage presentation are not highest among patients living in rural areas but among those living in the most urbanized setting, the Chicago metropolitan area. Thus, we observe a reversal of the commonly held view that risks are highest for rural residents. The concentration of health disadvantage in highly urbanized places emphasizes the need for more extensive urban-based cancer screening and education programs, especially programs targeted to the most vulnerable urban populations and neighborhoods. At the state level, late-stage risk varies systematically along a detailed rural-urban continuum, with both low-rate areas and high-rate areas cutting across the traditional rural-urban divide. This raises questions concerning the use of a simple, binary rural-urban classification in investigating geographic health disparities. Determining whether the J-shaped trend observed here holds for other types of cancer in other geographic contexts is an important topic for future research. Unraveling the causes of these geographic inequalities also requires attention. Although the limited set of individual and contextual variables considered here accounts for some rural-urban variation, much remains unexplained. Why, for example, are risks consistently lower among patients living in large towns in rural areas? Addressing such questions calls for an analysis of how individuals in particular geographic contexts interact with local healthcare providers and how providers respond to local population health needs, issues that define a challenging research agenda.
Conflict of Interest Disclosures
Support by the National Cancer Institute, National Institutes of Health, under Grant 1-R21-CA114501-01, is gratefully acknowledged.