Martin L. Brown, Health Services and Economics Branch, National Cancer Institute, 6130 Executive Blvd, EPN 4005, Bethesda, MD 20892-7344 (email: firstname.lastname@example.org).
Context: This article compares cervical cancer screening intensity and cervical cancer mortality trends in the United States and the Netherlands to illustrate the potential of cross-national comparative studies. We discuss the lessons that can be learned from the comparison as well as the challenges in each country to effective and efficient screening.
Methods: We used nationally representative data sources in the United States and the Netherlands to estimate the number of Pap smears and the cervical cancer mortality rate since 1950. The following questions are addressed: How do differences in intensity of Pap smear use between the countries translate into differences in mortality trends? Can population coverage rates (the proportion of eligible women who had a Pap smear within a specified period) explain the mortality trends better than the total intensity of Pap smear use?
Findings: Even though three to four times more Pap smears per woman were conducted in the United States than in the Netherlands over a period of three decades, the two countries’ mortality trends were quite similar. The five-year coverage rates for women aged thirty to sixty-four were quite comparable at 80 to 90 percent. Because screening in the Netherlands was limited to ages thirty to sixty, screening rates for women under thirty and over sixty were much higher in the United States. These differences had consequences for age-specific mortality trends. The relatively good coverage rate in the Netherlands can be traced back to a nationwide invitation system based on municipal population registries. While both countries followed a “policy cycle” involving evidence review, surveillance of screening practices and outcomes, clinical guidelines, and reimbursement policies, the components of this cycle were more systematically linked and implemented nationwide in the Netherlands than in the United States. To a large extent, this was facilitated by a public health model of screening in the Netherlands, rather than a medical services model.
Conclusions: Cross-country studies like ours are natural experiments that can produce insights not easily obtained from other types of study. The cervical cancer screening system in the Netherlands seems to have been as effective as the U.S. system but used much less screening. Adequate coverage of the female population at risk seems to be of central importance.
In this article we compare one aspect of health care delivery, cervical cancer screening, which differs substantially in the United States and the Netherlands. Even though cervical cancer screening accounts for only a small fraction of overall health care spending, it is representative of the broader area of preventive health services. The U.S. Patient Protection and Affordable Care Act of 2010 requires new private health insurance plans and Medicare to cover preventive services with no deductibles or copayments. Preventive services often are cost-effective; that is, the health benefits resulting from their use represent a good “value” relative to the economic cost of these services. Whether this potential cost-effectiveness, if not actual cost savings, is realized in practice, however, depends on how the preventive services are implemented. This is an area for which comparative national studies may be informative (Cohen, Neumann, and Weinstein 2008).
This article has two main aims: to compare the health care resource utilization and subsequent health outcomes of cervical cancer control in the United States and the Netherlands and to describe how the two countries translate this knowledge of public health into health care policy and practice.
We chose cervical cancer screening for this study for several reasons. Both the United States and the Netherlands have relatively good data on resource use and health outcomes at the national level. Both countries also have a relatively long history of cervical cancer screening, and therefore sufficient time has elapsed to capture the population-based health outcome measures. One of the limitations of studying variations in the delivery of preventive services across geographic regions or time periods within a country is that once a basic approach to preventive services is developed, there is only incremental variation across place and time within that country. More variation may be observed across national boundaries. This is especially true for cervical cancer screening because no randomized clinical trials have ever been conducted to assess its impact on mortality. Moreover, the differences between countries in screening frequencies and starting and stopping ages are wide (Anttila et al. 2009; Dowling et al. 2010). We focus on conventional cervical cytology, that is, the Pap smear. Examining this well-established and conventional screening technology may not seem interesting. But the lessons learned from this “historical” example may be relevant to newer forms of cervical cancer screening, for example, DNA testing for the human papillomavirus (HPV), which are just beginning to become standard practice in advanced industrialized countries like the United States and the Netherlands. In addition, many low- and middle-income countries have not yet established screening programs for cervical cancer, which remains a major cause of mortality in much of the world (Brown et al. 2006). Thus, this case study may also have lessons for less developed countries.
First we describe the institutional factors that determine how screening is delivered in the two countries and the historical evolution of the respective policies, programs, and practices. We compare “resource use,” measured as the number of Pap smears performed per capita in the United States and the Netherlands, and “health outcomes,” measured as the mortality rates from cervical cancer in the United States and the Netherlands. We then describe in more detail how the health outcomes are related to the patterns of screening in each country and what lessons might be drawn from this comparison, in regard to both the specific design of the screening programs and the broader issues of policy formulation and program implementation. We consider this analysis to be within the genre of comparative effectiveness research, but rather than comparing drugs or medical devices, we look at two national approaches to the delivery of recurrent screening.
Data and Methods
In order to make data comparable across time and between the United States and the Netherlands, a weighted average of mortality (or incidence) rates for five-year age groups was used, with the weights corresponding to the age composition of the U.S. population in the year 2000. This standardization applies to total population rates as well as to rates for broader age-intervals, which are composed of five-year age groups.
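The weighted average described here is what is usually called direct age standardization. A minimal sketch follows, assuming age-specific rates and standard-population counts are available as dictionaries keyed by five-year age group; the function name and all numbers are hypothetical illustrations, not the study's data or the actual 2000 U.S. standard population.

```python
def age_standardized_rate(age_specific_rates, standard_population):
    """Direct standardization: weight each age-specific rate by the
    share of the standard population in that age group.

    Passing only a subset of age groups (e.g. ages 30-64) renormalizes
    the weights over that subset, which mirrors standardizing a broader
    age interval composed of five-year groups.
    """
    if set(age_specific_rates) != set(standard_population):
        raise ValueError("rates and standard population must share age groups")
    total = sum(standard_population.values())
    return sum(
        age_specific_rates[group] * standard_population[group]
        for group in age_specific_rates
    ) / total


# Hypothetical rates per 100,000 women and hypothetical standard counts.
rates = {"30-34": 2.0, "35-39": 4.0, "40-44": 6.0}
standard = {"30-34": 1000, "35-39": 2000, "40-44": 1000}

result = age_standardized_rate(rates, standard)
print(result)
```

Because the same standard weights are applied to both countries and to every calendar year, differences in the resulting rates cannot be attributed to differences in age composition.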
Cervical Cancer Incidence Data
The incidence data for the United States are from the National Cancer Institute Surveillance, Epidemiology and End Results (SEER) program (Altekruse et al. 2010). The long-term incidence data are from the nine original SEER registries, covering about 10 percent of the U.S. population. The incidence data for the Netherlands are from the Dutch National Cancer Registration (Netherlands Cancer Registry 2010).
Cervical Cancer Mortality Data
In the United States, cervical cancer mortality data are based on death certificate information reported to the states’ vital statistics offices and compiled into a national file through the Centers for Disease Control and Prevention's National Center for Health Statistics (NCHS) National Vital Statistics System and categorized according to SEER anatomic site groups to maximize comparability within the International Classification of Diseases (ICD) and its oncology version ICD-O. We accessed the cervical cancer mortality data for 1950 to 2007 for the entire United States (Altekruse et al. 2010) and comparable data for the Netherlands from the Central Bureau of Statistics (Statistics Netherlands 2010a).
Cervical Cancer Screening Intensity
As a measure of screening intensity, we used the total number of Pap smears in a year per 1,000 female population. The U.S. data are based on the responses to the questions in the National Health Interview Survey (NHIS) asking women when they had their most recent Pap smear (National Center for Health Statistics 2011a, 2011b; Swan et al. 2010). The NHIS, the principal source of information about the health of the U.S. noninstitutionalized civilian population, is a household personal interview survey of a large, statistically representative sample of 75,000 to 100,000 individuals. The U.S. Census Bureau is the data collection agent for the NHIS, which has a high response rate, above 80 percent. The survey items undergo expert review and cognitive testing in order to minimize the possibility of error due to self-reporting. The National Cancer Institute partially funds and contributes to the wording of the NHIS's cancer screening items.
The yearly number of smears was estimated from the percentage of affirmative answers to the question of whether the female respondent had had a Pap smear within the last twelve months. The five-year coverage data are based on the affirmative answer to the question about having had a Pap smear less than sixty months ago. Gardner and Lyon (1977) reconstructed the early history of screening in the United States. Screening started in the late 1950s and increased rapidly after 1960, with about a quarter of the female population over age twenty having a Pap smear annually between 1966 and 1968. Screening continued to grow steadily until 1973, when it stabilized.
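The step from survey answers to a screening-intensity figure can be sketched as follows. This is an illustration under stated assumptions, not the authors' estimation code: it assumes each affirmative answer corresponds to one smear in the year, and it applies survey weights, which the text does not describe. The function names and the sample data are hypothetical.

```python
def weighted_share(answers, weights):
    """Weighted proportion of affirmative (True) answers."""
    if len(answers) != len(weights):
        raise ValueError("answers and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    yes_weight = sum(w for a, w in zip(answers, weights) if a)
    return yes_weight / total_weight


def smears_per_1000(answers_last_12_months, weights):
    """Pap smears per 1,000 women per year, assuming one smear per
    woman who reports a smear within the last twelve months."""
    yes_weight = sum(w for a, w in zip(answers_last_12_months, weights) if a)
    return 1000.0 * yes_weight / sum(weights)


# Hypothetical respondents: True = reported a Pap smear in the period.
last_12_months = [True, False, False, True]
last_60_months = [True, True, False, True]
weights = [2.0, 1.0, 1.0, 1.0]  # hypothetical survey weights

intensity = smears_per_1000(last_12_months, weights)
five_year_coverage_share = weighted_share(last_60_months, weights)
print(intensity, five_year_coverage_share)
```

Counting one smear per affirmative answer would understate intensity if some women had more than one smear in a year, so the figure is best read as a lower bound under this assumption.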
In the Netherlands, the number of Pap smears is based on the nationwide registry of histo- and cytopathology (PALGA), which is virtually complete for cytology from 1991 onward (Casparie et al. 2007). As a result, we were able to count the annual number of cytological tests. The number of women at risk was based on data from Statistics Netherlands (Statistics Netherlands 2010b). We reconstructed the history of the intensity of screening in the Netherlands before 1990 from national health interview survey data and the registration of general practitioners (see Habbema et al. 1988). We then calculated the five-year coverage rates by comparing the number of women who in the last five years had had at least one smear, for any reason, with the number of women alive on January 1 of the analyzed year. (For the PALGA data linkage used in calculating coverage rates, see van den Akker-van Marle et al. 2003.)
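The registry-based coverage calculation described above can be sketched as follows, assuming smear records are available as (woman identifier, smear date) pairs and the denominator is the set of women alive on January 1 of the analyzed year. The record layout, function name, and data are hypothetical; the actual PALGA linkage and any age restriction are not reproduced here.

```python
from datetime import date


def five_year_coverage(smears, women_alive, reference_year):
    """Share of women alive on January 1 of reference_year who had at
    least one smear, for any reason, in the preceding five years."""
    if not women_alive:
        raise ValueError("the population at risk must not be empty")
    reference = date(reference_year, 1, 1)
    window_start = date(reference_year - 5, 1, 1)
    # A set counts each woman once, however many smears she had.
    screened = {
        woman_id
        for woman_id, smear_date in smears
        if window_start <= smear_date < reference and woman_id in women_alive
    }
    return len(screened) / len(women_alive)


# Hypothetical records.
women_alive = {"a", "b", "c", "d"}
smears = [
    ("a", date(2001, 3, 1)),
    ("a", date(2002, 5, 1)),   # second smear, same woman: counted once
    ("b", date(1996, 1, 1)),   # outside the five-year window
    ("c", date(1999, 6, 30)),
    ("e", date(2002, 1, 1)),   # not in the population at risk
]

coverage = five_year_coverage(smears, women_alive, 2003)
print(coverage)
```

Counting women rather than smears is what separates coverage from intensity: a woman with several smears in the window raises intensity but contributes only once to coverage.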
We identified the U.S. guidelines through the National Guideline Clearinghouse of the Agency for Healthcare Research and Quality (http://www.guideline.gov/index.aspx) and through a search of relevant journals.
History of Cervical Cancer Screening in the Netherlands
In the Netherlands, the Ministry of Health has the final responsibility for population-based screening programs, including the one for cervical cancer. The ministry decides on important changes in the program, after being advised by the Health Council of the Netherlands, which is similar to the U.S. Institute of Medicine. The Ministry of Health also directly finances cervical cancer screening, as well as other population-based prevention programs, such as childhood vaccinations. This financing covers the costs of establishing and maintaining the program, identifying eligible women, inviting them for screening, reimbursing the general practitioners taking the smears and the laboratories processing and interpreting them, and registration and quality control. This financing also covers the costs of conducting national surveillance and evaluation studies of the process and outcomes of screening.
The Netherlands is divided into five regions, with regional organizations managing the screening program. Each region must submit a proposal specifying the structure of the organization, the operational plan, and the budget. Regional steering groups consist of all those involved in the screening and follow-up (general practitioners, pathologists, gynecologists, public health services, and municipalities). Although the regions’ organizations may differ, they all must satisfy certain requirements, such as extending screening invitations (and reminders) to the entire eligible population. The Netherlands Institute of Public Health, which is similar to the U.S. Centers for Disease Control and Prevention (CDC), is required by the Ministry of Health to coordinate and finance the regional plans (National Institute for Public Health and the Environment 2010). The costs of further diagnostics and treatment after an abnormal smear are covered by the country's regular health insurance. The Netherlands’ health system is made up of private insurers and obligatory insurance, with the insurers dividing up the liabilities while taking account of the insured population's differences in risk (Dutch Ministry of Health, Welfare, and Sports 2010).
There was little screening in the Netherlands until the 1970s, when screening gradually began to increase. Starting in 1976, a pilot study was conducted in three regions, comprising a quarter of the Dutch population. Women aged thirty-five to fifty-three were invited every three years to come to a screening center to have a Pap smear. Other regions soon followed, and by 1980, screening was available to all women in the Netherlands. The final results of the pilot studies were presented to the minister of health in 1985 (Evaluation Committee on Early Detection of Cervical Cancer 1984). Nationwide screening was then introduced, run by regional organizations, with the same schedule of starting at age thirty-five and repeating in three-year intervals until age fifty-three. At the same time that the nationwide screening started, a Health Technology Assessment (HTA) was undertaken, which used the internationally available evidence for a detailed effectiveness and cost-effectiveness analysis (CEA) study. A number of conclusions from this study and later additional studies deeply influenced subsequent changes in the program (van Ballegooijen et al. 1993). First and foremost, the cost-effectiveness analysis showed that the thirty-five to fifty-three age group and the three-year interval schedule needed improvement. The age range was too narrow and the interval too short. A few years later, the minister decided on a new screening schedule, in accordance with the results of the CEA, with a screening interval of five years. In a letter asking the Dutch Health Insurance Council to work out a new schedule, the Ministry of Health proposed a starting age of thirty and a stopping age of fifty-five (Ministry of Welfare, Health and Culture 1991), later extended to sixty (National Health Insurance Council 1993).
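The effect of the schedule change on the number of scheduled smears per woman can be checked with a short calculation. The sketch below assumes invitations fall exactly on the starting age and at every interval thereafter, with the stopping age included; the function name is hypothetical.

```python
def screening_ages(start_age, stop_age, interval_years):
    """Ages at which a woman is invited, stopping age included."""
    return list(range(start_age, stop_age + 1, interval_years))


# Original schedule: ages 35 to 53, every three years.
original_schedule = screening_ages(35, 53, 3)
# Revised schedule: ages 30 to 60, every five years.
revised_schedule = screening_ages(30, 60, 5)

print(original_schedule, len(original_schedule))
print(revised_schedule, len(revised_schedule))
```

Under these assumptions both schedules yield seven scheduled smears in a lifetime, so the revision widened the age range covered (from eighteen to thirty years) without adding scheduled tests.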
The cost-effectiveness study underlying the change in screening schedule is an example of the interaction between applied research results and policy decisions in the Netherlands. Many other policy changes in cancer screening were also based on conclusions from evaluation studies.
Another important finding from the HTA was that many preventive smears were taken outside and/or in addition to those conducted within the screening program. The Ministry of Health decided that preventive smears taken outside the regular screening program would no longer be reimbursed, which led to an enormous decrease of these so-called opportunistic smears. The number of smears before age thirty dropped by 75 percent (Bos et al. 2002; Rebolj et al. 2007).
Measures were also taken to increase attendance. It was decided that invitation letters would be sent out by the general practitioner of the woman concerned, and a reminder letter of the invitation, also sent by the general practitioner, became obligatory. These measures were successful. Despite the increase in the screening interval from three to five years, the number of women aged thirty to sixty-four who had had a Pap smear in the last five years rose between 1994 and 2003 from 69 to 77 percent (Rebolj et al. 2007).
Other policy decisions were aimed at reducing the very high proportion of ambiguous Pap smear results requiring follow-up. New guidelines reclassified those smears with morphocytological signs of inflammation from borderline to negative, bringing down the percentage of borderline smears from 10 to 2 percent (Rebolj et al. 2007). At the same time, the monitoring of women requiring follow-up improved, and the percentage of women receiving a timely follow-up increased from 47 to 86 percent (van Ballegooijen et al. 2006). An analysis of smears that were repeated because of insufficient quality showed no difference in CIN lesions that occurred after negative smear results, whether or not endocervical cells were present in the smear. A decision was made to stop repeating smears based only on the absence of endocervical cells in the initial smear. As a result, repeat smears because of inadequate quality dropped from 8 to 1 percent (Bos et al. 2001). A more recent evaluation showed that the risk of cancer after a negative smear did not increase after the change from a three-year to a five-year interval and the adoption of a more restrictive definition of a positive smear (Rebolj et al. 2008). Many of these measures were brought together in an advisory by the Dutch Health Insurance Council. To prepare the advisory, the council established the Coordination Committee for Cervical Cancer Screening, with the involvement of the Dutch professional societies of general practitioners, gynecologists, and pathologists; the municipal health services; the regionally integrated cancer centers; and the Ministry of Health (National Health Insurance Council 1993).
In summary, the development of cervical cancer policy and practice in the Netherlands followed a cycle of pilot studies; implementation of regional, population-based screening programs; surveillance and evaluation studies; and revision and implementation of screening program guidelines. The financing, planning, and coordination of the program as a public health function of the Ministry of Health made this systematic approach possible.
History of Cervical Cancer Screening in the United States
The United States also has national public health goals in regard to cervical cancer control, but the actual delivery of cervical cancer screening is largely the responsibility of individual medical practitioners, operating in the context of a mix of federal, state, and local programs; public and private health insurance plans; and medical specialty organizations.
The lack of direct and explicit public health policy instruments in the United States has several consequences. Individual medical practices, informed by clinical guidelines and through direct interaction with the patient, are the primary channel for ensuring that cervical cancer screening takes place. The cost of cervical cancer screening is divided among many sources, depending on the patient's age, employment status, and other socioeconomic characteristics. Accordingly, there is no mechanism for determining, in anything like real time, global national expenditures and resource utilization devoted to cervical cancer screening, much less constraining or targeting these resources. Finally, the balkanization of the U.S. system has produced a proliferation of clinical guidelines and practices.
Historically, preventive services have not been covered by health insurance in the United States, with the exception of some nonprofit HMOs. Initially, the Medicare program followed this policy and practice and did not cover Pap tests, but this changed in 1990 when the U.S. Congress amended the Medicare law to cover them. This change was motivated by the observation that cervical cancer rates had declined less rapidly in elderly women than in younger women in the United States. The congressional Office of Technology Assessment (OTA) thus was commissioned to perform a cost-effectiveness analysis of this new Medicare benefit. Although the OTA study found Pap screening among women aged sixty-five and older to be very cost-effective, a crucial assumption behind this conclusion was that nothing was known about their screening history before age sixty-five. In a subsequent article published in the Annals of Internal Medicine, the authors of the OTA study concluded that
The success of the new Medicare benefit depends substantially on physicians assuring that their elderly patients, particularly women without regular prior screening, obtain high quality Papanicolaou smears. The data also show that after a woman 65 years of age or older has a history of regular negative smears, screening is inefficient and can cease. (Fahs et al. 1992, 520)
The actual Medicare benefit covers biennial Pap tests for all women covered by the program, and annual tests are covered for high-risk women. “High risk” includes such factors as a lack of recent screens and a promiscuous sexual history. The Medicare benefit does not, however, indicate the cessation of Pap testing for any group of older women and does not target screening on the basis of socioeconomic risk (Medicare.gov 2011). Because private health insurance coverage in the United States tends to follow the lead of Medicare, this rather unqualified coverage of Pap smears for older women may have unintentionally resulted in increased Pap testing among women aged thirty to sixty-five during the 1990s, as reflected in figure 1.
Relatively few policy evaluations of Pap smear usage in the United States have been conducted. A cost-effectiveness analysis by Eddy (1990) indicated that the incremental benefits were small and the incremental costs were large for short screening intervals, resulting, for instance, in an incremental cost-effectiveness ratio of about $800,000 per life-year for annual screening, compared with triennial screening. The incremental cost-effectiveness ratio for triennial screening compared with no screening is, according to Eddy, about $13,000 per life-year. Eddy concluded that “for most women, a 3-year frequency is appropriate” (Eddy 1990, 224). This conclusion was reinforced by a 2003 study using a Markov model informed by data from the CDC's National Breast and Cervical Cancer Early Detection Program (Sawaya et al. 2003). This study found that for women with a negative Pap test history, the incremental benefits of continued annual compared with triennial screening were very low relative to the resource cost. U.S. health economists often cite annual screening for cervical cancer as an example of an intervention with a very high incremental cost per life-year saved (Brown and Garber 1999; Kim, Wright, and Goldie 2002; Weinstein and Skinner 2010). These studies, however, have had only limited influence on clinical guideline formulation, Medicare coverage policy, and other health insurance providers’ policies, and they are often ignored altogether (Hagen et al. 2001; Pearson and Bach 2010).
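An incremental cost-effectiveness ratio compares each strategy with the next less intensive one rather than with no screening: it is the extra cost divided by the extra life-years gained. The sketch below shows the arithmetic only. The function name and all inputs are hypothetical and are not taken from Eddy (1990) or any other cited study.

```python
def incremental_cost_effectiveness_ratio(cost_new, effect_new,
                                         cost_old, effect_old):
    """Extra cost per extra life-year of the more intensive strategy
    relative to the less intensive comparator."""
    extra_effect = effect_new - effect_old
    if extra_effect <= 0:
        raise ValueError("the more intensive strategy adds no life-years")
    return (cost_new - cost_old) / extra_effect


# Hypothetical per-woman lifetime costs (dollars) and life-years.
icer = incremental_cost_effectiveness_ratio(
    cost_new=1000.0, effect_new=10.5,   # more frequent screening
    cost_old=400.0, effect_old=10.25,   # less frequent screening
)
print(icer)  # dollars per life-year gained
```

The ratio grows quickly as the extra life-years in the denominator shrink, which is why shortening an already short screening interval can produce a very high incremental cost per life-year even when screening as a whole is cost-effective.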
Clinical guidelines are more proximal than policy analyses to the actual practice of cervical cancer screening. The United States offers several guidelines, promulgated by a variety of public and private organizations. Because of the lack of cervical cancer screening programs organized on the basis of population-based invitation through public health agencies (with the exception of the CDC program for low-income women), the main burden and responsibility of ensuring participation in screening falls on the individual physician, especially those specializing in obstetrics and gynecology (Saraiya, McCaig, and Ekwueme 2010; Yabroff et al. 2009). Consequently, it is not surprising that physician specialty organizations as well as public agencies feel obligated to promulgate independent clinical guidelines.
The most prominent guidelines are those of the U.S. Preventive Services Task Force (USPSTF), the American Cancer Society (ACS), and the American Congress of Obstetricians and Gynecologists (ACOG). In a 2007 U.S. survey, primary care physicians ranked the ACOG (57%), ACS (55%), and USPSTF (45%) guidelines as “very influential.” Among physicians specializing in obstetrics and gynecology, 88 percent ranked the ACOG guidelines as “very influential,” compared with only 24 percent for the USPSTF guidelines (Yabroff et al. 2009). Of the three guidelines, only those of the USPSTF, sponsored by the federal government, use an evidence-based and transparent process (Grilli et al. 2000; Imperiale and Ransohoff 2010). The U.S. guidelines are summarized in table 1.
Table 1. Number of Lifetime Pap Smears Recommended, by Guideline
Starting age: “Approximately 3 years after initiation of sexual intercourse, but no later than age 21 years.”
Stopping age: “It is difficult to set an upper age limit for cervical cancer screening. … An older woman who is sexually active and has multiple sex partners … [and/or a] woman with a previous history of abnormal cytology … should continue to have routine cervical cytology examination.”
Frequency: “Annual cytology examination should be recommended for women younger than 30 years. … [Normal-risk women] aged 30 years and older who have had three consecutive cervical cytology test results that are negative … may be screened every 2–3 years.”
Starting age: “Cervical cancer screening should begin at age 21 years.”
Stopping age: “It is reasonable to discontinue cervical cancer screening at either 65 years of age or 70 years of age in women who have three or more negative cytology test results in a row and no abnormal test results in the past 10 years.”
Frequency: “Every 2 years for women aged 21–29 years. … Women aged 30 years and older who have had three consecutive cervical cytology test results that are negative … may be screened every 3 years.”
From the 1950s through 1988, the ACOG recommended annual Pap tests for all women (U.S. Congress Office of Technology Assessment 1990). The 1995 ACOG guidelines recommended annual Pap tests for women beginning at the onset of sexual activity or age eighteen (ACOG 1995). Although no upper age is given, less frequent screening is suggested for low-risk women after three or more consecutive normal findings. The less frequent interval is not specified, but the ACOG questioned the wisdom of a three-year interval, as well as the validity of “theoretical” cost-effectiveness analyses. The 2003 ACOG guidelines recommend even more intense screening than the 1995 guidelines. They allow screening to begin somewhat later, on average, than the 1995 guidelines: about three years after the initiation of sexual intercourse but no later than age twenty-one. They advise annual screening up to age thirty, however, and less frequent screening becomes an option only after age thirty (ACOG 2003).
The guidelines state that “studies have shown that in organized … screening, annual cytology examinations offer little advantage over screening performed at 2- or 3-year interval,” but “in the current U.S. practice climate, a woman's care provider may change frequently … the physician may be unable to determine a woman's screening history,” suggesting that the lack of an organized screening program in the United States, rather than the clinical performance of the Pap smear, makes annual screening preferable.
The ACS and ACOG guidelines were very similar from the 1950s through the 1980s. In 1981 the ACS stated that a three-year screening interval could be adopted after two consecutive negative tests (U.S. Congress Office of Technology Assessment 1990). The ACS's 2002 guidelines are essentially the same as the ACOG's 2003 guidelines except that for women seventy years of age or older with a recent negative history of Pap results, the option to stop having Pap tests is presented as acceptable and that biennial screening is considered acceptable using liquid-based cytology (Saslow et al. 2002).
The USPSTF's 2003 cervical cancer screening guidelines differ from the ACOG's guidelines in that they specify a screening interval of “at least every three years”; thus, annual screening is still permitted but is not indicated as preferable (USPSTF 2003). The USPSTF also discourages continued screening after age sixty-five, based on data showing that the yield of screening is low in previously screened women over sixty-five, due to the declining incidence of high-grade cervical lesions after middle age. The USPSTF also found “fair evidence” that screening women older than sixty-five is associated with an increased risk for potential harms, including false-positive results and invasive procedures and thus concluded that the potential harms of screening are likely to exceed benefits among older women who have had normal results previously and are not otherwise at high risk for cervical cancer. Although it is the least influential of the three guidelines, the more cautious approach of the 2003 USPSTF guideline may partially explain the moderate decrease in annual screening after 2000, as shown in figure 1.
The Kaiser Permanente health care system, the largest nonprofit integrated health care system in the United States, formulated its own guidelines (Kaiser Permanente National Cervical Cancer Screening Guideline Development Team 2006). Kaiser's 2006 guidelines are similar to the ACOG guidelines in regard to starting age, but Kaiser differs in recommending a screening interval of three years and a stopping age of sixty-five, assuming a normal result on the last Pap test. An analysis of the guidelines of four nonprofit HMOs, however, indicates considerable heterogeneity. Before 2006, the recommended screening intervals were one, two to three, and one to three years, depending on the plan. After 2006, three of the four plans shifted toward longer screening intervals and/or more definitive stopping ages. But one plan changed from a screening interval of one to three years to one year for women under age thirty (Buist and Williams 2010). The generally more conservative approach of the Kaiser guidelines, compared with the ACOG/ACS/USPSTF guidelines, may reflect the fact that cervical cancer screening is not mainly the responsibility of OB-GYNs in the Kaiser system. In addition, because Kaiser is an integrated health care system with a high continuity of enrollment and a mature system of electronic medical records, the ACOG guidelines’ concern about continuing adherence when a three-year screening interval is used may be mitigated.
Finally, based on a 2003 study by Sawaya and colleagues, the CDC recommended an increase in the Pap test interval to three years following three consecutive negative tests (Sawaya et al. 2003).
Physicians’ Attitudes and Practices
The interpretation of guidelines by many U.S. primary care physicians is toward intensive screening, even exceeding the ACOG guidelines. For example, about half the primary care physicians responding to a national survey in 2006/2007 recommended that an eighteen-year-old woman with no sexual history have a Pap test every two or three years or a Pap test once a year for at least three consecutive years; and 41 percent recommended continued Pap testing for a sixty-six-year-old woman with unresectable lung cancer and three negative Pap tests. Compared with those of internists and general/family practitioners, OB-GYNs' recommendations were more likely to be inconsistent with guidelines (Yabroff et al. 2009). Other data indicate that many women who have had a hysterectomy continue having Pap tests (Meissner et al. 2008). Significantly, despite the local Kaiser plans recommending a screening interval of three years, annual Pap smears continue to be common (Diana Buist and Andrew Williams, personal communication 2011).
Additional insights into these attitudes come from focus groups that the CDC's National Breast and Cervical Cancer Early Detection Program (NBCCEDP) conducted with its physician providers. None of the participating physicians were familiar with the CDC's triennial Pap test policy. About half reported that they were “annual screeners,” meaning that they typically screened all women each year. The other half were “selected extended screeners,” meaning that they increased the screening interval to two or three years for certain low-risk women. Annual screeners were more likely to report the ACOG guidelines as authoritative. Comments from this group included the fear that if given a three-year screening interval, fewer patients would return for the next test. OB-GYNs were more likely to be annual screeners, and some commented that they administered annual Pap tests because “that is what OB-GYNs do.” Some of the extended screeners were more influenced by the USPSTF's guidelines and considered them “more scientific” (Cooper et al. 2005). The survey results of Yabroff and colleagues (2009) did not differ between physicians participating in the NBCCEDP and other physicians (Benard et al. 2011).
HPV-DNA Testing in the United States and the Netherlands
In this case study, we restricted our attention to Pap testing. With the availability of HPV-DNA screening and HPV vaccination against the virus that causes cervical cancer, the potential costs of reducing cervical cancer incidence and mortality may escalate unless these technologies, used either as substitutes for or supplements to Pap testing, are disseminated in a thoughtful and efficient way. The Dutch Health Council recently released a report recommending that the Pap test be replaced by an HPV-DNA test, because of its greater sensitivity and because a self-sampling version enables women who are unable or unwilling to come into a physician's office to sample their own cervix and mail in the sample, thereby possibly improving the population coverage. Based on two modeling studies commissioned by the council, the recommended number of screening tests in a woman's lifetime was reduced from seven for the Pap test to five for the HPV-DNA test, at ages thirty, thirty-five, forty, fifty, and sixty. The report also describes the triage strategies using the Pap test to address the HPV-DNA test's lack of specificity (Health Council of the Netherlands 2011). The Ministry of Health has not yet decided on its recommendations.
The current U.S. guidelines recommend HPV-DNA testing as a “reflex” test, that is, as a second confirmatory test for women with equivocal Pap test results to determine whether to proceed to colposcopy or to continue with routine Pap testing. HPV-DNA testing is also recommended as a “co-test” to be used simultaneously with Pap testing for women over age thirty (Castle 2011). In a 2006/2007 national survey of screening practices, 60 percent of physicians in the United States responded that Pap testing should continue annually after a negative HPV test and a normal Pap smear, and another 20 percent stated that Pap testing should continue biennially, contrary to the co-test guidelines (Saraiya et al. 2010). A majority of providers reported using HPV co-testing in women younger than thirty (Lee, Berkowitz, and Saraiya 2011). A majority of physicians also did not agree that the HPV vaccine would affect screening initiation or frequency (Wong et al. 2010). Commenting on these and similar results, Roland and colleagues (2011) stated that “an organized and systematic approach to screening that promotes … evidence-based screening policies may also be necessary to ensure the screening of women at appropriate intervals.”
Cervical Cancer Screening Rates and Mortality in the United States and the Netherlands
In the period under study, a first approximation of a country's resources devoted to cervical cancer screening can be measured by the number of Pap smears taken each year. Figure 1 shows the number of Pap smears taken annually between 1950 and 2007 per 1,000 women, standardized to the age structure of the 2000 U.S. population. In the Netherlands, after a gradual increase from 1970 onward, the rate of Pap smears reached a level of about 120 per year per 1,000 women in the early 1980s and fell to a level of 100 in the late 1990s. In the United States, the use of Pap smears rose dramatically during the late 1950s, reaching a rate above 300 per year per 1,000 women in the early 1970s. This increase continued between 1987 and 2000 but fell somewhat between 2000 and 2007. Over the entire period, the Pap smear rate ranged between 300 and 450 per year per 1,000 women.
The numbers in figure 1 refer to all Pap smears, including repeat smears after an abnormal smear or after treatment. The U.S. numbers are based on the answer to the NHIS survey question about the date of the last Pap smear, asked of women answering yes to whether they had ever had a Pap smear. The wording of these questions has varied somewhat over the years, with more detailed questions being used in later NHIS surveys.
Compared with the register-based data for the Netherlands during the last twenty years, answers to survey questions may be biased upward because the number of self-reported tests is typically higher than the number shown in medical record data. The longer the question recall period was (for example, having had a Pap smear within the last five years versus within the last year), the less over-reporting there was. A recent meta-analysis (Howard, Agarwal, and Lytwyn 2009) found, for Pap smears, a report-to-record ratio of 1.1 for a five-year interval, 1.2 for a three-year interval, 1.3 for a two-year interval, and 1.3 for the only study with a one-year interval and complete registration. A comparison of record-based Pap tests in a large nonprofit HMO with self-reports from the NHIS found the reported rates to be 34 percent and 63 percent higher in the NHIS than in the HMO for the age groups eighteen to twenty-nine and sixty-five and older, respectively, but quite similar for the age group thirty to sixty-four, which contributes the most Pap smears (Insinga, Glass, and Rush 2004). There was also a downward bias in the NHIS data, however, because for those women having more than one Pap smear in the last year, only one was counted. Even assuming that on balance, the over-reporting of recent Pap smears could be as high as 30 percent in the NHIS, we estimated that the Pap smear use in the United States is still more than three times greater than in the Netherlands.
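The over-reporting adjustment described above can be sketched as simple arithmetic. The snippet below is only an illustration, not the authors' calculation: it deflates a self-reported U.S. rate by the assumed 30 percent over-reporting and compares the result with a Dutch register-based rate. Pairing the upper end of the U.S. range (450 per 1,000 women per year) with the late-1990s Dutch level (100) is an assumption made here for illustration, and the function and variable names are hypothetical.

```python
def record_equivalent_rate(self_reported_rate, report_to_record_ratio):
    """Deflate a self-reported rate to an estimated record-based rate."""
    return self_reported_rate / report_to_record_ratio


# Rates per 1,000 women per year, taken from the ranges quoted in the text.
# Pairing these two particular values is an illustrative assumption.
US_SELF_REPORTED = 450.0  # upper end of the U.S. range (NHIS, self-reported)
NL_REGISTERED = 100.0     # Dutch level in the late 1990s (register-based)
OVER_REPORTING = 1.3      # 30 percent over-reporting, the upper bound assumed in the text

us_adjusted = record_equivalent_rate(US_SELF_REPORTED, OVER_REPORTING)
ratio = us_adjusted / NL_REGISTERED

print(f"Adjusted U.S. rate: {us_adjusted:.0f} per 1,000 women per year")
print(f"Ratio to the Netherlands: {ratio:.1f}")
```

Other pairings of values from the quoted ranges give different ratios, so the result depends on which years are compared.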
Figure 2 shows that the levels and trends of cervical cancer mortality in the United States and the Netherlands were different from 1950 to 1970 and were quite similar from 1970 to 2010 (Statistics Netherlands 2010a; U.S. Department of Health, Education and Welfare 1968; WHO 1955, 1977). The higher year-to-year variability in the Dutch mortality trends is due to the smaller population of the Netherlands, compared with the United States, and was taken into account by also presenting the five-year moving averages. From 1950 to 1970, the mortality rate remained at the same level in the Netherlands but fell spectacularly in the United States. This decrease between 1950 and 1960 cannot, to a measurable degree, be attributed to screening, given that screening only became widely practiced in the later 1950s and given the delay between screening and mortality reduction. This difference between the two countries in the 1950s complicates the interpretation of the mortality trends under screening. Compared with the 1960 mortality rate, the 2007 rate was 78 percent lower in the United States and 75 percent lower in the Netherlands. When we assume that the 16 percent decrease in the U.S. mortality rate from 1950 to 1960 also applied between 1960 and 1970, the difference in decrease between the two countries no longer holds. Overall, it looks as if the vastly different screening inputs in the United States and the Netherlands are not reflected in substantially different trends in mortality. The incidence data in figure 3 (Altekruse et al. 2010; Netherlands Cancer Registry 2010) are more difficult to interpret than the mortality data, because screening may detect a number of early invasive cancers that never would have become clinically apparent without screening, especially at older ages. Nevertheless, it is notable that the incidence trends are very similar between the two countries.
Note that until 1998, incidence data for the Netherlands were available for only one region, which, due to the small numbers, explains the large variability.
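The secular-trend adjustment applied to the U.S. mortality figures can be made explicit with a short calculation. The sketch below shows one plausible reading of that adjustment and is not necessarily the authors' exact method: it assumes that the 16 percent pre-screening decline would have recurred once after 1960 and treats it as multiplicative, so that the remaining decline is measured against a lowered baseline. The function name is hypothetical.

```python
def reduction_net_of_secular_trend(observed_reduction, secular_decline):
    """Reduction relative to a baseline already lowered by a secular decline.

    Both arguments are fractions of the starting rate (0.78 means 78 percent).
    Assumes the secular decline acts multiplicatively on the starting rate.
    """
    remaining = 1.0 - observed_reduction   # rate left at the end of the period
    baseline = 1.0 - secular_decline       # rate expected without screening
    return 1.0 - remaining / baseline


US_OBSERVED = 0.78   # U.S. reduction, 1960 to 2007 (from the text)
NL_OBSERVED = 0.75   # Dutch reduction, 1960 to 2007 (from the text)
US_SECULAR = 0.16    # U.S. decline, 1950 to 1960, assumed to recur once

us_net = reduction_net_of_secular_trend(US_OBSERVED, US_SECULAR)

print(f"U.S. reduction net of secular trend: {us_net:.1%}")
print(f"Dutch reduction (no secular trend assumed): {NL_OBSERVED:.1%}")
```

Under these assumptions the U.S. reduction net of the secular trend falls slightly below the Dutch figure, which is consistent with the statement that the difference between the two countries no longer holds.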
Age-Specific Patterns of Pap Testing and Cervical Cancer Mortality in the United States and the Netherlands
The age-specific patterns of Pap testing that result from the two different approaches to screening are reflected in figure 4, which shows the five-year coverage—the percentage of women who received at least one Pap smear within the last five years—for recent years. Figure 4 indicates that in the Netherlands, the coverage rate for the population of women aged thirty to sixty-four, while increasing, remained somewhat below that of the United States for the period shown. While the Netherlands achieved this coverage with Pap testing once every five years, in the United States intervals tend to be concentrated in the range of one to three years. The one- to three-year coverage rates in women aged thirty to sixty-four were therefore much higher in the United States (not shown). Figure 4 also shows that Pap test coverage in the Netherlands is very low and decreasing for women under thirty, consistent with that country's guidelines, while in the United States the coverage for this age group is the same as for those aged thirty to sixty-four. Swan and colleagues found similar patterns (Swan et al. 2010). The high coverage rate among women aged twenty-one to twenty-nine may reflect the 2003 ACOG guidelines, which recommend annual screening for women under age thirty, as well as the fact that U.S. women in this age group are more likely to receive their primary care from OB-GYNs.
Figure 5 shows age-specific cervical cancer mortality trends for the United States and the Netherlands from 1970 to 2008. In order to focus on trends under the influence of screening and not on the absolute level, the 1970-to-1974 mortality is indexed at 100. For the age group thirty to sixty-four (figure 5b), both countries have achieved steady declines in the mortality rate since 1970, and the decline was somewhat more pronounced for the Netherlands. This may reflect the high compliance of physicians and patients with follow-ups after abnormal smears in the Netherlands as a result of the endorsement of national guidelines.
For the age group sixty-five and older (figure 5c) the two countries also achieved a comparable reduction in mortality, although the Netherlands lags behind the U.S. trend, especially before 1985. This suggests that the Dutch program could have benefited from the provision of “catch-up” Pap testing for women older than the upper age of the screening schedule (initially over fifty-three and later over fifty-nine) and who had never had a Pap test before the initiation of the national screening program. Conversely, this comparison indicates that the U.S. pattern of often continuing regular Pap testing for women well beyond the age of sixty is unlikely to have been efficient. For the age group twenty to twenty-nine (figure 5a), there was a reduction in mortality for the United States, but the downward trend was less than that for the older age groups. In absolute terms, however, the mortality due to cervical cancer is very low in this age group, representing only 1 to 2 percent of U.S. cervical cancer deaths. In the Netherlands, with very low screening rates for those under thirty, there was no reduction in cervical cancer mortality. The Netherlands data are unstable because there are annually only a few cervical cancer deaths in this age group.
In this article, we used a case study to describe differences in the policy process and resulting differences in the two countries’ program designs, practices, and resource expenditures.
Efficiency versus Economic Waste in Cervical Cancer Screening
Bentley and colleagues (2008) labeled spending to produce services that provide marginal or no health benefit over less costly alternatives as “clinical waste.” From the historical account, it is clear that the Netherlands has followed a public health approach to cervical cancer screening emphasizing global efficiency while achieving broad population coverage for disease prevention. This may have come at some cost in health outcomes, as in the case of women over the age of sixty-five. In the United States, a medical services model that emphasizes the professional autonomy to achieve optimal coverage through the individual doctor-patient relationship has resulted in a screening practice that is often more resource intensive than even those clinical guidelines recommending the most frequent screening. The most recent 2009 ACOG guidelines (ACOG 2009b), indicating that women should not have their first Pap test until age twenty-one and that between the ages of twenty-one and thirty, women should be screened no more often than biennially, in essence acknowledge that past guidelines and much of continuing practice were not consistent with efficiency or even clinical prudence. As Alan G. Waxman, who headed the ACOG committee, stated, “A review of the evidence to date shows that screening at less frequent intervals prevents cervical cancer just as well, has decreased costs, and avoids unnecessary interventions that could be harmful” (ACOG 2009a, 1).
While the comparison of the two countries’ screening intensity and cervical cancer mortality trends does not, by itself, provide causal evidence, the data are nevertheless consistent with the historical evidence that the decentralized and nonintegrated approach to cervical cancer screening in the United States produces substantial clinical waste compared with the centralized, integrated, and organized system of the Netherlands. Using cost-effectiveness modeling, Bentley and colleagues estimated the clinical waste associated with cervical cancer in the United States to be in the range of $630 million to $4 billion per year. Based on our current comparison, we can make an independent estimate. The total annual cost of cervical cancer screening has been estimated to be $2.3 billion to $3.8 billion in 2002 dollars (Insinga, Dasbach, and Elbasha 2005). If two-thirds of this cost were forgone, the savings could be in the range of $1.5 billion to $2.5 billion in 2002 dollars or $2 billion to $3.4 billion in 2010 dollars. While these savings are not large relative to the overall spending on health care in the United States, they represent considerable resources that might be better spent on, for example, more specific targeting of cervical cancer control to identifiable groups of women at high risk (Vogt et al. 2003) or on enhancing compliance with diagnostic follow-ups in the case of abnormal smears. For example, the 2008 budget for the CDC's National Breast and Cervical Cancer Early Detection Program for cervical cancer screening for low-income women was about $40 million, but it is estimated that the program currently reaches only 7 percent of eligible women in the United States (Tangka et al. 2010; Ekwueme et al. 2008).
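The arithmetic behind the savings estimate in 2002 dollars can be reproduced in a few lines. The sketch below applies the two-thirds fraction to the cost range from Insinga, Dasbach, and Elbasha (2005). The two-thirds fraction presumably corresponds to reducing Pap smear volume to about one-third of its current level, in line with the roughly threefold difference in screening intensity between the two countries, but that reading is an assumption. The conversion to 2010 dollars is not reproduced because the text does not state which price index was used. The function name is hypothetical.

```python
def forgone_cost_range(cost_low, cost_high, fraction_forgone):
    """Return the (low, high) savings if a fraction of the cost is forgone."""
    return cost_low * fraction_forgone, cost_high * fraction_forgone


# Annual U.S. screening cost in billions of 2002 dollars (from the text).
COST_LOW, COST_HIGH = 2.3, 3.8
FRACTION_FORGONE = 2 / 3  # two-thirds, as stated in the text

savings_low, savings_high = forgone_cost_range(COST_LOW, COST_HIGH, FRACTION_FORGONE)

print(f"Savings: ${savings_low:.1f} billion to ${savings_high:.1f} billion (2002 dollars)")
```

Rounded to one decimal place, the result matches the range of $1.5 billion to $2.5 billion given in the text.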
What lessons can be drawn from this cross-national case study? In both the United States and the Netherlands, the “initial choice” seems to have had a substantial influence on the subsequent evolution of policy and practice. In the United States, early guidelines for frequent screening were established because of the concerns about the low sensitivity of the Pap smear and to ensure the continuity of screening through the medical model. In the Netherlands, however, the pragmatic, resource-constrained design of the early pilot study was influential. Once established, these approaches seem to have a substantial institutional “inertia,” and only small incremental changes can be easily made. Without this inertia, Pap testing in the United States might have moved more rapidly toward more efficient starting and stopping ages and longer screening intervals. In the Netherlands, the potential benefit of a “catch-up” Pap test for older women might have been recognized, and the change from the rather poor pilot study screening schedule to a better one would have been made earlier. Of the two countries, however, the public health approach of the Netherlands, which enables strong linkages between surveillance, evaluation, policy, and practice, made evolution toward an efficient screening program more likely. While the Netherlands was inflexible in regard to the global resources made available for Pap testing, within this constraint, the screening interval was changed by almost twofold in a relatively short time. The movement away from annual screening in the United States has been slower. It is important that this anchoring to the status quo is recognized so that screening effectiveness and efficiency can be improved more rapidly. 
In addition, while the Netherlands has regional plans that exert direct control over screening practices, in somewhat parallel organizations in the United States, such as the CDC program and the Kaiser system, these practices appear to be determined more by physicians’ individual preferences, influenced by specialty society guidelines.
Despite the good screening coverage in the Netherlands, most invasive cervical cancers occurred in women over age thirty whose last Pap smear had been taken more than the recommended five years earlier. Interval cancers arising between five-yearly screenings were far less important. The important message for both countries is that reaching all women is crucial to further reducing cervical cancer mortality.
Cancer screening is a process involving many steps (Leyden et al. 2005). In this article, we described only those associations between the primary “input” of that process—Pap tests—and the ultimate “output” of that process—cervical cancer mortality. Given the substantial evidence regarding the sharply declining incremental benefit of cervical cancer screening with shorter screening intervals (Day 1986; Eddy 1990; Sawaya et al. 2003; van den Akker-van Marle et al. 2003), a reasonable hypothesis is that the more intensive pattern of Pap smear testing in the United States has had only a modest additional reduction in mortality. Furthermore, this benefit may have been canceled out by inefficiencies in the U.S. system that we did not take into account, including the technical quality of the Pap test and the less complete follow-up and treatment of abnormal Pap tests.
In our simple analysis, it is not possible to disentangle secular trends and screening effects when interpreting mortality trends. In 1950, mortality was higher in the United States. At the same time, there was a secular downward trend that was absent in the Netherlands. In 1960, mortality was 26 percent higher in the United States than in the Netherlands. When we assume no secular trends in either country after 1960 (or identical trends), the decrease in mortality until 2007 was slightly more favorable in the United States, with a 78 percent reduction, compared with 75 percent in the Netherlands.
The 1.26 relative mortality risk of the United States compared with the Netherlands in 1960 also points to a higher risk of cervical cancer. This is consistent with the higher prevalence of HPV in young women in the United States than in the Netherlands (Coupe et al. 2008; Dunne et al. 2007). The risk ratio is, however, too close to one to justify a large difference in screening intensity between the two countries.
Cross-national studies of health care interventions, such as screening for cancer, give results that are not readily obtained from one-country studies. They can lead to valuable lessons for the countries involved, despite the limitations of their observational nature. The newly established U.S. Patient-Centered Outcomes Research Institute (PCORI) therefore should consider cross-country studies eligible for sponsoring and funding.
Acknowledgments: We wish to acknowledge Timothy S. McNeel of Information Management Services, Silver Spring, Maryland, for data programming and the processing of NHIS data.
We also wish to acknowledge Andrew Williams of Kaiser Permanente Hawaii Center for Health Research and Diana Buist of Group Health Research Institute for making available unpublished data from the SEARCH research project.