Division of Cancer Prevention and Control, National Center for Chronic Disease Prevention and Health Promotion, Centers for Disease Control and Prevention, Atlanta, Georgia
Division of Cancer Prevention and Control, Centers for Disease Control and Prevention, and Division of Epidemiology and Disease Prevention, Indian Health Service, 5300 Homestead NE, Albuquerque, NM 87110
This supplement was sponsored by Cooperative Agreement Number U50 DP424071-04 from the Centers for Disease Control and Prevention, Division of Cancer Prevention and Control.
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
This article is a U.S. Government work and, as such, is in the public domain in the United States of America.
The misclassification of race decreases the accuracy of cancer incidence data for American Indians and Alaska Natives (AI/ANs) in some central cancer registries. This article describes the data sources and methods that were used to address this misclassification and to produce the cancer statistics used by most of the articles in this supplement.
Records from United States cancer registries were linked with Indian Health Service (IHS) records to identify AI/AN cases that were misclassified as non-AI/AN. Data were available from 47 registries that linked their data with IHS, met quality criteria, and agreed to participate. Analyses focused on cases among AI/AN residents in IHS Contract Health Service Delivery Area (CHSDA) counties in 33 states. Cancer incidence and stage data were compiled for non-Hispanic whites (NHWs) and AI/ANs across 6 IHS regions of the United States for 1999 through 2004.
Misclassification of AI/AN race as nonnative in central cancer registries ranged from 85 individuals in Alaska (3.4%) to 5297 individuals in the Southern Plains (44.5%). Cancer incidence rates among AI/ANs for all cancers combined were lower than for NHWs, but incidence rates varied by geographic region for AI/ANs. Restricting the rate calculations to CHSDA counties generally resulted in higher rates than those obtained for all counties combined.
The classification of race for AI/AN cases in cancer registries can be improved by linking records to the IHS and stratifying by CHSDA counties. Cancer in the AI/AN population is clarified further by describing incidence rates by geographic region. Improved cancer surveillance data for AI/AN communities should aid in the planning, implementation, and evaluation of more effective cancer control and should reduce health disparities in this population. Cancer 2008;113(5 suppl):1120–30. Published 2008 by the American Cancer Society.
Accurate cancer surveillance data are essential to plan, implement, and evaluate cancer prevention and control activities.1, 2 The goal of producing reliable estimates of cancer occurrence in American Indian and Alaska Native (AI/AN) populations has been hampered by the misclassification of race that frequently occurs in central cancer registries.3–8 Even with such misclassification in many cancer registries, data from New Mexico and Alaska9 and other regions,5–8, 10 as well as data from death records,11, 12 indicate that wide regional variation in cancer burden is characteristic of AI/AN populations. Clearly, analyses that minimize misclassification of race have the potential to give tribes and their cancer control partners a more accurate description of the cancer burden in AI/AN communities and, as a consequence, the tools to plan and implement more effective cancer prevention and control programs. In this article, we describe methods used to mitigate the effects of race misclassification and to produce statistics on cancer incidence for individual regions, for all regions combined, and for the AI/AN population reported in the individual articles in this supplement.
The 2004 intercensal population estimates of 3.1 million AI/AN individuals represented 1.1% of the total United States population.13 These individuals are members of—or related to—1 or more of over 560 federally recognized tribes or over 200 nonfederally recognized tribes, and they represent communities with diverse languages, cultures, and histories. The median age of the United States AI/AN population was just 29 years in 2000,14 far younger than the nationwide median age of 38.6 years for non-Hispanic white (NHW) individuals (Fig. 1). Approximately 75% of the AI/AN population resides west of the Mississippi River, and AI/ANs make up proportionally greater percentages of the population in Alaska, Oklahoma, and other selected regions—the Southwest, the Northern Plains, and the Pacific Northwest (Fig. 2). About 33% of AI/ANs reside on tribal reservations, trust lands, or other tribally affiliated areas; approximately 70% live in urban areas.14, 15
The Indian Health Service (IHS) provides primary healthcare to approximately 1.8 million enrolled members of federally recognized tribes, or about 58% of the United States' estimated 3.1 million AI/AN population.16 The 150 IHS hospitals and clinics are located primarily on reservation lands and in a few cities with relatively large AI/AN populations. Half of these healthcare facilities are managed by tribal governments under negotiated agreements with the federal government, and half are operated directly by the federal government. An additional 34 urban health centers receive some federal funding to provide healthcare to the urban AI/AN population. Eligible AI/ANs can receive healthcare at any IHS facility, but complex rules govern and restrict the delivery of contract health services for specialty medical care, such as cancer treatment, which is generally not available in IHS facilities. Funding for IHS is by Congressional appropriation and is currently at the level of $2532 per capita, far below the $5645 expenditures per capita from all sources for personal medical services for the remaining United States population.17
Central cancer registries generally determine AI/AN ancestry on the basis of specific statements or notations in the medical record. However, such information is not always available and may be incorrect or incomplete for mixed-race populations; significant misclassification of AI/ANs as some other racial group (nonnative) has been documented in central cancer registries.3–8 These reports of misclassification in central cancer registries have been derived from linkages between registry records and patient registration records from the IHS. The use of such linkages has been proposed as one mechanism for correcting misclassification of AI/AN race in central cancer registries in a timely manner and at relatively low cost.18
MATERIALS AND METHODS
Central cancer registries receive case information from multiple sources, including hospitals, outpatient surgery centers, free-standing radiation centers, and death certificates. In the United States, state and metropolitan central cancer registries gather data on cancer incidence. Two federal programs fund central cancer registries: the National Program of Cancer Registries (NPCR) of the Centers for Disease Control and Prevention (CDC) and the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute (NCI). Together, NPCR and SEER collect data for the entire United States population (4 states—California, Kentucky, Louisiana, and New Jersey—receive funding from both NPCR and SEER).1
The SEER Program, which was established by the NCI after Congress passed the National Cancer Act of 1971,19, 20 currently collects cancer incidence and survival data from 17 population-based cancer registries covering approximately 26% of the United States population. Recognizing the need for more complete geographic coverage for cancer incidence data, Congress established the NPCR in 1992 by enacting the Cancer Registries Amendment Act, Public Law 102-515.21 Before establishment of the NPCR, 10 states had no registry, and most states with registries lacked the resources and legislative support they needed to gather complete data. NPCR registries now cover 96% of the United States population. In 2001, NPCR registries began annually reporting incidence data to the CDC; for each registry, the first diagnosis year reported was the first year for which the registry collected data with the assistance of NPCR funds. SEER and NPCR work closely with the North American Association of Central Cancer Registries (NAACCR) to develop and promote consensus standards for cancer registration, provide education and training, certify population-based registries, evaluate and publish data, and promote the use of cancer surveillance data.1
In this supplement, incidence data from the registries refer to invasive cancers, with the exception of the urinary bladder (bladder), which includes in situ and invasive cancers,22, 23 and breast cancer, which includes 1 tabulation with in situ cancers.24 Data on the primary cancer site and on histology were coded according to the International Classification of Diseases for Oncology (ICD-O) edition in use at the time of diagnosis, converted to the 3rd edition coding,25 then categorized according to SEER site groups.26 Analysis by specific histologic diagnoses, in addition to the SEER site groups, was included in several articles (cancers of the lung, stomach, kidney and urinary bladder).22, 23, 27–29
Inclusion in the analytic dataset of data from individual registries and for individual years was determined by several factors. First, registries had to meet data standards developed for United States Cancer Statistics1 for each year of data to be included. On this basis, 5 state registries contributed data for fewer than the 6 years included in the analysis. Second, 3 states agreed to link their data with IHS yet declined to include their data in the analytic dataset. Finally, 1 state did not submit data in 2004 and was excluded. Collectively, these last 4 states represented 2.2% of the AI/AN population estimates for 2004. The ‘all counties’ incidence rates, for which no geographic restrictions apply, include data from 46 state registries and the District of Columbia. For most tabulations in this supplement, however, the analyses were restricted to ‘Contract Health Service Delivery Area’ (CHSDA) counties, which, in general, contain federally recognized tribal lands or are adjacent to tribal lands (Fig. 2). For incidence rates restricted to CHSDA counties, data from 33 registries were included.
CHSDA residence is used by the IHS to determine eligibility for services that are not available directly within the IHS. Data from the IHS Division of Epidemiology and Disease Prevention, using registry records linked with the IHS patient registration file, indicate less misclassification of race for AI/ANs in these counties.30 The CHSDA counties also have higher proportions of AI/ANs in relation to total population than do non-CHSDA counties, with 56% of the United States AI/AN population residing in the 624 counties designated as CHSDA (these counties represent 20% of the 3141 counties in the United States). Although less geographically representative (Fig. 3), analyses restricted to CHSDA counties are presented for cancer incidence in this report for the purpose of offering improved accuracy in interpreting cancer statistics for AI/ANs.
The analyses were completed for all regions combined and by individual IHS regions: Alaska, Pacific Coast, Northern Plains, Southern Plains, Southwest, and East (Fig. 3). Regional analyses have been presented in several publications focusing on AI/ANs,11, 12, 31, 32 and it was determined that this approach was preferable to the use of smaller jurisdictions, such as the Administrative Areas defined by IHS,33 which yielded less stable estimates. Table 1 shows the population coverage of cancer registry data for residents of CHSDA counties by geographic region.
Table 1. Population Coverage of State Cancer Registry Incidence Data for American Indians/Alaska Natives and Non-Hispanic Whites in Contract Health Service Delivery Area Counties by Indian Health Service Region, 2004
Source: 2004 intercensal bridged single-race population estimates, US Census Bureau/Centers for Disease Control and Prevention/National Cancer Institute (released on January 3, 2007). Available at: http://seer.cancer.gov/popdata/.
IHS indicates Indian Health Service; CHSDA, Contract Health Service Delivery Area.
Population counts are from the following states with CHSDA counties: Alabama, Alaska, Arizona, California, Colorado, Connecticut, Florida, Idaho, Indiana, Iowa, Louisiana, Maine, Massachusetts, Michigan, Minnesota, Mississippi, Montana, Nebraska, Nevada, New Mexico, New York, North Carolina, North Dakota, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Dakota, Texas, Utah, Washington, Wisconsin, and Wyoming.
Classification of Race and Ethnicity
Current Office of Management and Budget standards include the following minimum categories for the collection of race information: AI/AN, Asian, black or African American, Native Hawaiian or other Pacific Islander, and white.34 These race categories represent sociopolitical constructs and are not anthropologically or biologically based. The current standards also allow census respondents to select 1 or more races when they self-identify rather than a single race as required in previous years.35 This allowance for selection of multiple races has had a large impact on the size of the total AI/AN population, evidenced by the finding that the 2000 U.S. Census count for those who reported their race as AI/AN either alone or in combination with another race was over 58% larger than the count of those who reported their race as AI/AN alone.36
Here and in other articles included in this supplement, cancer patients are classified as ‘American Indian’ or ‘Alaska Native’ if they are identified as such in the medical record (presumably by self-designation) or if they have sufficient native ancestry in a federally recognized tribe to have received IHS services. Individual tribes determine the degree of tribal ancestry necessary for tribal membership, which, in turn, determines eligibility to receive services from IHS. To improve race classification for AI/AN cases in contributing registries, state registries submitted their case records diagnosed from as early as 1988 to 2004 for linkage with the IHS patient registration database to identify AI/AN cases that were misclassified as nonnative. No clinical information was released from the registries to the IHS. The records of non-AI/ANs in the IHS database were removed from the linkage database by applying an ‘Indian status’ algorithm developed by the IHS that is based on 3 variables: beneficiary code, tribe, and blood quantum (representing the proportion of native ancestry). Linkages were conducted using LinkPlus, a probabilistic linkage software program that was developed by the CDC for general application by cancer registries.37 By using key patient identifiers (ie, Social Security number, first name, last name, middle initial, date of birth, and date of death), LinkPlus identifies records that represent the same individual in the IHS and cancer registry databases. For each pair of records, LinkPlus assigns a weight to each identifier; these weights subsequently are combined into a final weight, which is a large positive number if all or most matching variables agree or a large negative number if they disagree. Pairs with intermediate final weights (designated as ‘clerical reviews’) were examined independently by 2 reviewers, who assigned a status of match or nonmatch. Any discrepancies between these 2 reviewers were adjudicated by a third reviewer.
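The weighting scheme described above can be illustrated with a minimal sketch of one common formulation of probabilistic linkage, in which each identifier contributes a log-likelihood-ratio weight. This is not the LinkPlus implementation: the agreement probabilities, the cutoffs, the field names, and every function name below are hypothetical and were chosen only for illustration.

```python
import math

# Hypothetical (m, u) probabilities per identifier; illustrative values only.
# m = P(field agrees | records are the same person)
# u = P(field agrees | records are different people)
FIELD_PROBS = {
    "ssn": (0.95, 0.0001),
    "last_name": (0.93, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.95, 0.003),
}

# Hypothetical cutoffs; pairs scoring between them go to clerical review.
LOWER_CUTOFF = 0.0
UPPER_CUTOFF = 15.0


def field_weight(field, value_a, value_b):
    """Log2 likelihood-ratio weight for one identifier."""
    if value_a is None or value_b is None:
        return 0.0  # a missing value is treated as uninformative
    m, u = FIELD_PROBS[field]
    if value_a == value_b:
        return math.log2(m / u)          # agreement: positive weight
    return math.log2((1 - m) / (1 - u))  # disagreement: negative weight


def final_weight(record_a, record_b):
    """Sum the per-identifier weights into one score for the pair."""
    return sum(
        field_weight(f, record_a.get(f), record_b.get(f)) for f in FIELD_PROBS
    )


def classify(weight):
    """Assign match, nonmatch, or clerical review from the final weight."""
    if weight >= UPPER_CUTOFF:
        return "match"
    if weight <= LOWER_CUTOFF:
        return "nonmatch"
    return "clerical review"
```

In this sketch, a pair that agrees on every identifier receives a large positive final weight, a pair that disagrees on every identifier receives a large negative one, and a pair that agrees on names but not on Social Security number or birth date falls between the cutoffs and would be routed to manual review.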
The race categories used by central cancer registries are specified in NAACCR standards and correspond closely to the race categories used by the U.S. Census Bureau to allow calculation of race-specific incidence rates. Race is coded independent of Spanish/Hispanic origin.38 Beginning with cancer cases diagnosed on January 1, 2000, registries have reported data in up to 5 race fields for multiracial individuals if that information is available from medical records.34 Coding rules specify that, for individuals of multiple races, a nonwhite race takes priority over white race for analytic purposes.39 For this report, all cases classified as AI/AN in the first race field were retained in that category. In addition, when the first race field was classified as white or unknown or ‘other’ and there was a positive IHS link, the case also was reclassified as AI/AN for this report. In contrast, if the first race field was coded as Asian/Pacific Islander or black race and there was a positive IHS link, then the value for the first race was retained.39
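The reclassification rule stated in the preceding paragraph can be written as a short decision function. The sketch below is illustrative only: the string labels stand in for the coded race values that registries actually record, and the function and parameter names are hypothetical.

```python
def analytic_race(first_race_field, ihs_link):
    """Return the race category used for analysis under the stated rule.

    first_race_field: label from the registry's first race field
                      (hypothetical strings, not actual registry codes).
    ihs_link: True if the case linked to the IHS patient registration file.
    """
    race = first_race_field.strip().lower()

    # Cases already recorded as AI/AN stay in that category.
    if race == "ai/an":
        return "AI/AN"

    # White, unknown, or 'other' with a positive IHS link is reclassified.
    if ihs_link and race in {"white", "unknown", "other"}:
        return "AI/AN"

    # Any other first race value (e.g., Asian/Pacific Islander or black)
    # is retained, even when there is a positive IHS link.
    return first_race_field.strip()
```

For example, `analytic_race("white", True)` yields `"AI/AN"`, whereas `analytic_race("black", True)` retains `"black"`.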
Self-identification also provides the optimal means to identify an individual's Hispanic ethnicity, but this information is not always found on cancer records. In 2005, NAACCR published a standard approach40 to strengthen the accuracy of Hispanic ethnicity for cancer cases, and this approach was used to identify NHW cases that were used as the comparison group for rate ratios.
Population estimates that are used as denominators in the rate calculations are from the NCI's publicly available, web-based statistical resources and are the same as those routinely included with the SEER*Stat statistical analysis software.13 They are based on the annual time series of July 1 estimates of county populations by age, sex, race, and Hispanic origin produced by the U.S. Census Bureau's Population Estimates Program.41
The Census Bureau currently develops annual county-level population estimates for 31 possible racial groups (5 single race groups and 26 multiple race groups) to include individuals who select 1, 2, 3, 4, or all 5 of the race categories. Corresponding multiple-race information is not widely available, however, either from state vital records (mortality data) or from medical records (incidence data). Therefore, a method for bridging the multiple-race population estimates to single-race estimates was developed by the CDC's National Center for Health Statistics using information from the pooled 1997 through 2000 National Health Interview Surveys.36, 42, 43 These bridged single-race estimates were used by the NCI to produce the final population estimates that are included in the calculations of incidence rates appearing in this report.13 Development of the bridged single-race data also makes the post-2000 race/ethnic population estimates comparable to the pre-2000 race/ethnic estimates and enables the reporting of a combined rate spanning 2000 as well as trend analyses.
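The general idea of bridging can be sketched as allocating each multiple-race population count to single-race categories according to a set of allocation proportions. The sketch below is a simplified illustration under stated assumptions and is not the published bridging method: the proportions and population counts are invented, and the function name is hypothetical.

```python
def bridge_to_single_race(single_race_counts, multi_race_counts, proportions):
    """Allocate multiple-race counts to single-race categories.

    single_race_counts: {race: count} for people reporting one race.
    multi_race_counts:  {race_combination: count} for multiple-race groups.
    proportions:        {race_combination: {race: share}}; the shares for
                        each combination are assumed to sum to 1.
    """
    bridged = {race: float(n) for race, n in single_race_counts.items()}
    for combination, count in multi_race_counts.items():
        for race, share in proportions[combination].items():
            bridged[race] = bridged.get(race, 0.0) + count * share
    return bridged


# Invented counts and allocation shares, for illustration only.
single = {"AI/AN": 1000, "white": 50000}
multi = {("AI/AN", "white"): 400}
shares = {("AI/AN", "white"): {"AI/AN": 0.25, "white": 0.75}}

bridged = bridge_to_single_race(single, multi, shares)
```

With these invented inputs, the 400 people in the combined group are split 100 to AI/AN and 300 to white, so the total population is unchanged while every person is counted in exactly one single-race category.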
All rates, expressed per 100,000 population, were directly age adjusted, using SEER*Stat software,44 to the 2000 United States standard population (19 age groups; Census P25-1130) in accordance with a 1998 U.S. Department of Health and Human Services recommendation.45, 46 Readers should avoid comparison of these data with published cancer rates that were adjusted using a different standard population.
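Direct age adjustment weights each age-specific rate by the share of the standard population in that age group and sums the results. The following sketch shows the arithmetic with 3 invented age groups rather than the 19 groups of the 2000 United States standard population; all counts are hypothetical, and the function is an illustration rather than the SEER*Stat procedure.

```python
def age_adjusted_rate(cases, population, standard_population, per=100_000):
    """Directly age-adjusted rate per `per` population.

    cases, population, standard_population: sequences with one entry
    per age group, in the same order.
    """
    total_standard = sum(standard_population)
    adjusted = 0.0
    for d, n, s in zip(cases, population, standard_population):
        weight = s / total_standard      # standard-population share
        adjusted += weight * (d / n)     # weighted age-specific rate
    return adjusted * per


# Invented data for 3 age groups (young, middle, old).
cases = [10, 50, 200]
population = [100_000, 80_000, 40_000]
standard = [500, 300, 200]

adjusted = age_adjusted_rate(cases, population, standard)
crude = sum(cases) / sum(population) * 100_000
```

Here the adjusted rate differs from the crude rate because the invented study population has a different age structure than the invented standard, which is why rates adjusted to different standard populations should not be compared.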
By using the age-adjusted incidence rates, standardized rate ratios (RRs) were calculated for AI/AN populations using NHW rates for comparison. RRs are calculated in SEER*Stat before rounding of rates and may not equal RRs calculated by the reader from rounded rates presented in the tables. Confidence intervals (CIs) for age-adjusted rates and standardized RRs were calculated based on the methods described by Tiwari et al47 using SEER*Stat software.44
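The caution about rounding can be seen in a small worked example. The two rates below are invented and do not come from the tables in this supplement; the sketch only shows why an RR computed from unrounded rates can differ from one recomputed from rates rounded to 1 decimal place.

```python
def rate_ratio(rate_group, rate_reference):
    """Standardized rate ratio: comparison group rate over reference rate."""
    return rate_group / rate_reference


# Invented unrounded age-adjusted rates per 100,000.
aian_rate = 2.449
nhw_rate = 3.251

# RR computed before the rates are rounded.
rr_unrounded = round(rate_ratio(aian_rate, nhw_rate), 2)

# RR recomputed by a reader from rates rounded to 1 decimal place.
rr_from_rounded = round(rate_ratio(round(aian_rate, 1), round(nhw_rate, 1)), 2)
```

With these invented values the first calculation gives 0.75 and the second gives 0.73, a discrepancy that is most likely to matter when rates are small.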
Most of the articles in this supplement examined the distribution of stage of disease at diagnosis for AI/AN and NHW populations regionally and in all regions combined. Incident cancer cases were classified as in situ, localized, regional, or distant using SEER Summary Stage 1977 and/or Summary Stage 2000.48 If staging was not compatible between these 2 systems, then only cases diagnosed from 2001 through 2003 were included for statistics on cancer stage.49 Relative percents (R percent) were calculated by dividing the category-specific incidence rate by the total rate to facilitate comparisons of the distributions of age-adjusted, stage-specific incidence rates between AI/AN and NHW populations across IHS regions.
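The relative percent calculation can be sketched as follows. The stage-specific rates are invented, and the sketch assumes that the total rate is the sum of the stage-specific rates supplied (including an unstaged category); the function name is hypothetical.

```python
def relative_percents(stage_rates):
    """Express each stage-specific rate as a percentage of the total rate.

    stage_rates: {stage: age-adjusted rate}; the total is taken here as
    the sum of the supplied categories (an assumption of this sketch).
    """
    total = sum(stage_rates.values())
    return {stage: 100.0 * rate / total for stage, rate in stage_rates.items()}


# Invented age-adjusted, stage-specific rates per 100,000.
example_rates = {
    "localized": 40.0,
    "regional": 30.0,
    "distant": 20.0,
    "unstaged": 10.0,
}

example_percents = relative_percents(example_rates)
```

Because each value is scaled by its own population's total rate, the resulting distributions can be compared between AI/AN and NHW populations even when their overall incidence rates differ.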
Linkages with the IHS patient registration database that were completed for 49 state cancer registries and the District of Columbia are summarized by region in Figure 4. In total, 12,103 AI/AN individuals who had been classified as non-AI/AN were identified as AI/AN by the IHS linkage in these 49 states, ranging from 85 individuals in the Alaska region (3.4%) to 5297 individuals in the Southern Plains region (44.5%).
For all regions combined, cancer incidence rates for AI/ANs residing in CHSDA counties for all cancers combined were lower than for NHWs (AI/AN men: RR, 0.75; AI/AN women: RR, 0.80; see Table 2), but AI/AN incidence rates varied substantially by geographic region. These regional variations persisted for most cancer sites among AI/ANs (data not shown; see the Table of Contents of this supplement), and incidence rates were significantly higher among AI/ANs in Alaska and the Northern and Southern Plains than among AI/ANs in the Southwest.
Table 2. Combined Incidence for All Cancer Sites Combined by Indian Health Service Region for American Indians/Alaska Natives and Non-Hispanic Whites in Contract Health Service Delivery Area Counties: United States, 1999-2004
Source: Cancer registries in the Centers for Disease Control and Prevention's National Program of Cancer Registries (NPCR) and/or the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) Program.
CHSDA indicates Contract Health Services Delivery Area; IHS, Indian Health Service; AI/AN, American Indians/Alaska Natives; CI, confidence interval; NHW, non-Hispanic whites.
AI/AN race is reported by NPCR and SEER registries or through linkage with the IHS patient registration database. AI/AN persons of Hispanic origin are included.
Rates are per 100,000 persons and are age-adjusted to the 2000 U.S. standard population (19 age groups; Census P25-1130).
RRs are calculated in SEER*Stat prior to rounding of rates and may not equal RRs calculated from rates presented in the table.
The RR is statistically significant (P < .05).
Rates and RRs for Alaska in the CHSDA Counties section were the same as those in the All Counties section, because all counties in Alaska are CHSDA counties.
Years of data and registries used: 1999-2004 (41 states and the District of Columbia): Alaska,* Alabama,* Arkansas, Arizona,* California,* Colorado,* Connecticut,* the District of Columbia, Delaware, Florida,* Georgia, Hawaii, Iowa,* Idaho,* Illinois, Indiana,* Kentucky, Louisiana,* Massachusetts,* Maine,* Michigan,* Minnesota,* Missouri, Montana,* North Carolina,* Nebraska,* New Hampshire, New Jersey, New Mexico,* Nevada,* New York,* Ohio, Oklahoma,* Oregon,* Pennsylvania,* Rhode Island,* Texas,* Utah,* Washington,* Wisconsin,* West Virginia, and Wyoming*; 1999 and 2002-2004: North Dakota*; 2001-2004: South Dakota*; 2003-2004: Mississippi* and Virginia; 2004: Tennessee (asterisks indicate states with at least 1 county designated as a CHSDA).
Percentage regional coverage of AI/AN in CHSDA counties compared with AI/AN in all counties: Alaska, 100%; East, 13.1%; Northern Plains, 59%; Southern Plains, 64.1%; Pacific Coast, 55.6%; Southwest, 87.5%.
Restricting calculations of incidence rates to CHSDA counties generally resulted in higher rates than were reported for all counties combined (Table 2). For the Northern and Southern Plains, the Pacific Coast, and the East, the rates with restriction to CHSDA counties increased by approximately 100 cases per 100,000 population per year. The rate changed less for the Southwest region, where the CHSDA rate for all cancers combined was 232.9 and that for all counties combined was 221.0. Rates presented for ‘all counties combined’ in Table 2 for Alaska were the same as CHSDA county rates, because all counties in that state are classified as CHSDA.
The methods used in this supplement enhance AI/AN cancer surveillance by addressing race misclassification and by including analyses by geographic region. Linkages of IHS and cancer registry data and restricting analyses to CHSDA counties are efficient, inexpensive ways of reducing AI/AN misclassification and of improving the accuracy of cancer incidence data among AI/ANs residing in CHSDA counties. This supplement also includes data from 46 state cancer registries, including 33 of the 35 states that contain CHSDA counties, and, thus, is one of the most comprehensive analyses of cancer incidence in AI/AN populations to date.
Findings from the analyses reported here and in other articles in this supplement, as well as earlier reports from specific regions or registries,7, 9, 50–52 indicate that wide regional variation is characteristic of results from AI/AN cancer surveillance and that region-specific data are essential to characterize the AI/AN cancer burden. In general, cancer rates among AI/ANs in CHSDA counties were highest in Alaska and the Northern and Southern Plains and lowest in the Southwest. In part, the wide regional variations may reflect geographic variations in environmental, social, and personal determinants of health (see the article by Steele et al53 in this supplement). Research designed to understand regional variations in disease risk may help identify appropriate prevention and control strategies.
There are several limitations to consider when interpreting the results presented in this supplement. First, although linkage with the IHS patient registration database improves the classification of race for AI/AN cases, the issue is not resolved completely, because AI/AN individuals who are not members of the federally recognized tribes and are not eligible for IHS services are not represented in the IHS database. In addition, some individuals may be eligible for, but never use, IHS services and, thus, are not included in the IHS database. Second, the findings from CHSDA counties highlighted in this supplement do not represent all AI/AN populations in the United States or in individual IHS regions (Table 1, Fig. 3). In particular, the East region includes only 13.1% of the total AI/AN population for that region. Furthermore, the analyses based on CHSDA designation exclude many AI/AN residents in urban areas that are not part of a CHSDA county. AI/AN residents of urban areas differ from all AI/ANs in poverty level, healthcare access, and other factors that may influence cancer trends.15, 54 Third, this analysis revealed less variation for NHWs than for AI/ANs by IHS regions using data from CHSDA counties only. Perhaps alternative groupings of states or counties would reveal a different level of variation for NHWs.
Methods for Improving Cancer Surveillance Data in AI/AN Populations
Cancer registrars rely on information available in medical and administrative records for information on race/ethnicity; often, this information either is not available or is not collected and recorded in a systematic manner. Several recent reports have recommended that hospitals implement a uniform framework for the collection of data on race, ethnicity, and language, including a rationale for reporting these data, the provision of scripts for employees to use when interviewing patients, and the development of other tools to facilitate data collection.55 Another approach to improve race classification further for AI/ANs is the development and expansion of tribal rosters, such as the Northwest tribal roster,7 to complement the IHS patient registration database and thereby increase the usefulness of data linkages.
The high rate of misclassification of AI/AN race on death certificates has been documented in several studies.56, 57 Although data on cancer mortality are not presented in this supplement, data linkages between IHS data and state death records are in progress to improve AI/AN mortality data in future reports.
In conclusion, substantial progress has been made in cancer surveillance in AI/AN populations to provide a more comprehensive and accurate picture of the cancer burden in this population than was available previously. To build on this progress, the cancer registry community and the many partners who bring cancer surveillance to fruition should continue efforts to improve race classification and routine reporting of cancer in AI/AN populations. These improved data should be readily available to the cancer control community to more effectively plan, implement, and evaluate cancer control programs that target AI/AN populations.