Prospective and randomised public-health trial on neural network-assisted screening for cervical cancer in Finland: Results of the first year
Article first published online: 15 NOV 2002
Copyright © 2002 Wiley-Liss, Inc.
International Journal of Cancer
Volume 103, Issue 3, pages 422–426, 20 January 2003
How to Cite
Nieminen, P., Hakama, M., Viikki, M., Tarkkanen, J. and Anttila, A. (2003), Prospective and randomised public-health trial on neural network-assisted screening for cervical cancer in Finland: Results of the first year. Int. J. Cancer, 103: 422–426. doi: 10.1002/ijc.10839
- Issue published online: 5 DEC 2002
- Article first published online: 15 NOV 2002
- Manuscript Accepted: 26 SEP 2002
- Manuscript Revised: 20 AUG 2002
- Manuscript Received: 7 JUN 2002
Funded by
- Finnish Academy
- Finnish Cancer Organisation
- European Community
Keywords
- cervical cancer
- primary screening
- public health trial
Our objective was to evaluate the feasibility and relative validity of interactive neural network-assisted screening (Papnet) in primary mass screening for cervical cancer as a public health policy (routine screening). A randomized, ongoing trial involved 152,969 invitees and 108,686 attendees in the organized mass screening in Finland in 1999. Invitations were drawn from the population registry, and women were randomized 2:1 at an individual level to have their smear analyzed either conventionally or with Papnet. The distribution of smears across cytological categories and the detection rates of dysplasias, in situ carcinomas and cancers were estimated for smears analyzed conventionally (72,461) or by Papnet (36,225). A total of 108,686 smears were screened and 449 were histologically confirmed as dysplasias and carcinomas. The detection rates for histologically verified carcinoma in situ/severe dysplasia, moderate and mild dysplasias were 0.14%, 0.14% and 0.13% with conventional and 0.14%, 0.14% and 0.11% with Papnet screening, respectively. The detection rate of invasive cancer was 0.06‰ (n = 4) with the conventional method and 0.08‰ (n = 3) with Papnet. None of the differences were statistically significant (p > 0.05). The specificity of Papnet (the proportion of healthy women with normal cytology) was 92.5%, and that of the conventional smear was 92.9%. The positive predictive value (Pap Classes III–V) of Papnet was slightly but not significantly better (55% vs. 51%). Papnet screening was feasible as a part of routine screening and performed as well as the conventional method used in Finland. Organized mass screening has been practiced very successfully over the last 38 years. We are going to continue the trial to study the potential trends in cervical cancer incidence in both study arms.
The role of cytology-based screening in the prevention of cervical cancer is established. Organized mass screening has been especially effective in Finland compared to other organized screening programs1, 2 or to spontaneous screening practiced in the same population.3 The program was introduced in 1963, and since then the age-standardized mortality from and incidence of invasive disease have decreased by 80%.4, 5 In the years before screening started the age-standardized incidence was about 15/100,000 women, and it was at its lowest during the early 1990s (∼3.5/100,000 women-years). Thereafter the trends have not been as favorable as in the past, and in 1998 the rate was 4.5/100,000 women-years.
An interactive computer-assisted screening system that uses neural networks (Papnet) can potentially improve the effectiveness and efficacy of the cervical cancer screening program and decrease screener errors.6, 7, 8, 9 Some studies indicate that it is especially suitable for quality control re-screening.10, 11, 12 No prospective, randomized trials assessing the effectiveness of the computer-assisted system in primary screening are available, however. The usefulness of Papnet in primary screening seems promising. On the basis of non-randomized studies, Doornewaard et al.9 and Duggan13 concluded that in primary screening Papnet yields sensitivity and specificity equal to those of conventional screening. These studies, however, have well-known limitations as to bias.
Finland is a particularly suitable country for evaluating the performance of neural network-assisted screening in primary screening because of its highly developed health care infrastructure and a well-functioning and continuously evaluated mass screening program.
The objective of our study is to analyze whether the organized screening program for cervical cancer can be improved further by means of this new technology. We are carrying out a large prospective randomized trial that remains ongoing. The results of the first year, based on a large study population, are reported in terms of process indicators. The ultimate purpose is to assess whether Papnet-assisted screening can reduce the incidence of invasive cervical cancer. This, however, requires a longer follow-up time. We report the detection rates of cervical dysplasias, in situ carcinomas and cervical cancers using conventional and Papnet-assisted screening.
MATERIAL AND METHODS
Mass screening in Finland
An organized mass screening program for cervical cancer was introduced in Finland in 1963. The target population of screening is the entire Finnish female population aged 30–60 (≈1.2 million women). There is some variation in the age groups because the cohorts to be screened are decided by municipal authorities. Every woman is identified from the National Population Registry and invited every 5th year by a personal letter with place and individual time specified to attend the screening program. The results of the cervical smear are sent to the women and to the nationwide Mass Screening Registry.
Every year about 250,000 women are invited personally and about 180,000 smears are taken within the mass screening program. The attendance rate is 72%, and the coverage of the program depends on age, varying from 70% in the 30-year age group to 100% between ages 40 and 55.3 In addition, many spontaneous smears are taken by municipal health centres, occupational health services or private gynecologists. These smears are not registered but number about 300,000 annually.
This 2:1 randomized, prospective (still ongoing) trial is carried out as a part of the national screening program for cervical cancer. Drawing invitations from the national population registry, a large number of women (152,969), aged 30–60 (25–65 in some municipalities) were invited to attend the organized mass screening in Finland in 1999. The women were individually randomized into 2 arms to have their smear analyzed either conventionally (2/3) or with Papnet (1/3). The randomization was done by the national authorities and it was based on random allocation of the personal identification number issued to every resident in Finland.
The smears were analyzed in the laboratories of the Finnish Cancer Organisations in Helsinki, Kotka, Oulu, Pori and Tampere and in the Helsinki City Screening laboratory. Because the trial was set up as a part of the national screening program, the laboratory workload remained very similar to that of previous years and there were no changes in human resources.
The ethical committees of Helsinki University Medical Faculty and the National Research and Development Centre for Welfare and Health (STAKES) approved the study protocol, which included the decision not to inform every woman which method was used.
The interactive Papnet system (NSI Inc.) has 2 main components: 1 computerized scanning station (Pr1ma) and a number of review stations. At the scanning station, conventionally prepared stained slides are automatically mapped using video technique and then scanned by a Zeiss microscope with a video camera linked to a computer (Papnet software 3.04b, latest version). An algorithmic image processor and a neural network program then identify the cells and cell clusters with the least normal appearance and provide 128 digitized, high-resolution color images. These images are stored on compact disc and are reviewed by the cytotechnician on a personal computer linked to a high-resolution color monitor and a microscope. The 128 images per smear can be enlarged and also located on the original cervical smear slide using the slide map, which gives coordinates matched to the cytotechnician's individual microscope cross-table.
Two to 5 cytotechnicians from each of the 7 participating laboratories were trained to use the Papnet system on a 3-day intensive course in autumn 1998, followed by a few weeks to 3 months of on-site "hands on" training before the trial started at the beginning of 1999. In addition, the first 1,000 slides at each laboratory were double screened (Papnet and conventional) so that staff became accustomed to the system and to ensure high quality. The laboratory staff of all participating laboratories were also invited twice in 1999 to meetings where existing technical problems were discussed and solved and morphological criteria were updated.
The recommended trial procedure established that the slides randomized to the Papnet arm were first scanned with Papnet and then analyzed using the review stations. Only the edge areas (5% of the coverslip area) and any unscanned areas (mainly air bubbles under the coverslip, as judged from the scanning map of each slide) were screened manually. Scanning error slides, i.e., slides that could not be scanned with Papnet for various reasons (focus problems because the slide was too thick [about 70%], slides with very few cells [10%], too many air bubbles [8%], broken slide [3%], wrong glass size [3%] and other reasons), were screened conventionally but remained in the Papnet arm. Slides with abnormal Papnet images were reviewed further with a light microscope. Slides randomized to the conventional arm were analyzed in the traditional way, using light microscopy only.
Other procedures in the mass screening program (e.g., invitations, smear-taking and preparation, quality control meetings, re-screening, reporting, replies to women) remained unchanged. The smears were reported according to the original Papanicolaou classification, ranging from normal (Class I) to malignant (Class V). Quality-control cytopathologists rescreened about 10% of the negative slides in addition to the abnormal ones in both study arms at every laboratory. The criteria for an inadequate smear were the same as those used in the Bethesda 1991 System.
Odds ratios (OR) with 95% confidence intervals (CI) for cytological findings and for histologically confirmed cancerous and precancerous lesions per screened woman were calculated for the Papnet arm, using the conventional screening arm as the reference. Logistic regression with likelihood ratio statistics was used to assess whether there was any difference in cytological test sensitivity. We also compared the test specificity between the arms, using histologically confirmed lesions as the gold standard. Test specificity was defined as the proportion of cytological test negatives among those with negative histological status. We also calculated the positive predictive values of the cytological test, using histologically confirmed lesions as the standard. The statistical significance of the difference in positive predictive value between the study arms was tested with an unconditional binomial test.14 An asymptotic 2-sided test was used.
RESULTS
The total number of women invited for screening in the study was 152,969. Of these women, 50,989 were randomized to Papnet-assisted screening and 101,980 to conventional screening. Of the invited women, 108,686 attended the screening. The overall attendance rate was 71% in both arms, i.e., 36,225 women attended Papnet-assisted and 72,461 conventional screening. The numbers and percentages of attendees and the method actually used in screening (Papnet vs. manual) by arm and laboratory are shown in Table I. There was substantial contamination in the Papnet arm (average 28%, range by laboratory 9–40%) and practically none in the conventional arm. This was mainly due to technical problems with the Papnet machinery (9–22% scanning errors). The rest was due to problems in many of the participating laboratories at the beginning of the study period. The most important factor was that not all of the slides in the Papnet arm were sent to be scanned, because suboptimal logistics caused slides to accumulate and delayed the reporting process. The technical and logistical problems diminished rapidly during the first year. Such problems, however, are what one might expect with any new technology.
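|Laboratory||Papnet arm||Conventional arm|
| ||Invited||Attended (%)||Screened with Papnet||Screened manually||Invited||Attended (%)||Screened with Papnet||Screened manually|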
|CO Helsinki||13,836||9,504 (68.7)||5,635||3,869||27,672||19,336 (69.9)||104||19,232|
|CO Tampere||5,485||4,064 (74.1)||2,799||1,265||10,972||8,108 (73.9)||7||8,101|
|CO Oulu||8,601||6,035 (70.2)||4,732||1,303||17,204||12,194 (70.9)||3||12,191|
|CO Kotka||6,528||5,183 (79.4)||4,501||682||13,053||10,264 (78.6)||—||10,264|
|CO Pori||4,090||3,089 (75.5)||2,805||284||8,179||6,134 (75.0)||3||6,131|
|Helsinki City||12,449||8,350 (67.1)||5,772||2,578||24,900||16,425 (66.0)||19||16,406|
|Total||50,989||36,225 (71.0)||26,244||9,981||101,980||72,461 (71.1)||136||72,325|
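The contamination figure quoted for the Papnet arm can be checked against Table I. The sketch below is illustrative only: it assumes that the two rightmost Papnet-arm columns are the attendees screened with Papnet and those screened manually, and it defines contamination as the share of Papnet-arm attendees whose slide was screened manually. The variable and function names are hypothetical.

```python
# Papnet-arm rows of Table I: (attended, screened with Papnet, screened manually).
# Column meaning is an assumption inferred from the text and the column totals.
papnet_arm = {
    "CO Helsinki": (9504, 5635, 3869),
    "CO Tampere": (4064, 2799, 1265),
    "CO Oulu": (6035, 4732, 1303),
    "CO Kotka": (5183, 4501, 682),
    "CO Pori": (3089, 2805, 284),
    "Helsinki City": (8350, 5772, 2578),
}


def contamination_pct(attended, manual):
    """Percentage of Papnet-arm attendees whose slide was screened manually."""
    return 100.0 * manual / attended


# Contamination per laboratory.
by_lab = {
    lab: contamination_pct(attended, manual)
    for lab, (attended, _papnet, manual) in papnet_arm.items()
}

# Contamination over the whole Papnet arm.
total_attended = sum(attended for attended, _, _ in papnet_arm.values())
total_manual = sum(manual for _, _, manual in papnet_arm.values())
overall = contamination_pct(total_attended, total_manual)

for lab, pct in sorted(by_lab.items(), key=lambda item: item[1]):
    print(f"{lab:15s} {pct:5.1f}%")
print(f"{'Total':15s} {overall:5.1f}%")
```

Under these assumptions the column totals reproduce the Total row of the table, and the overall share rounds to the 28% given in the text.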
The mean age of the women randomized to the Papnet arm and to the conventional screening arm was 44.4 years, with an SD of 10.3 years in both arms. The mean age of attendees was 45.3 years with an SD of 10.0 years. The mean age of non-attendees was 42.2 years with an SD of 10.5 years in both study arms.
Severe atypias, i.e., Pap Classes IV and V, were detected slightly more frequently in the Papnet-assisted screening arm than in the conventional screening arm (Table II). Class III findings were more common in the conventional screening arm and Class II findings were more common in the Papnet-assisted arm. There were only 3 inadequate smears classified as Pap Class 0 (1 in the Papnet and 2 in the conventional screening group).
|Papanicolaou class||Papnet arm, n (‰)||Conventional arm, n (‰)||OR||95% CI|
| ||All||Screened with Papnet||Screened manually||All|| || |
|I||33,447 (923)||24,243 (924)||9,204 (922)||67,245 (928)||1.00||Reference|
|II||2,525 (70)||1,833 (70)||692 (69)||4,649 (64)||1.09||1.04–1.15|
|III||210 (5.8)||136 (5.2)||74 (7.4)||514 (7.1)||0.82||0.70–0.96|
|IV||39 (1.1)||29 (1.1)||10 (1.0)||48 (0.7)||1.63||1.07–2.49|
|V||3 (0.08)||2 (0.08)||1 (0.10)||3 (0.04)||2.01||0.37–10.9|
|0||1 (0.03)||1 (0.04)||0 (0.00)||2 (0.03)||—||—|
|Total||36,225 (1,000)||26,244 (1,000)||9,981 (1,000)||72,461 (1,000)||—||—|
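The odds ratios in Table II can be approximated directly from the cell counts. The sketch below uses a plain 2×2 odds ratio with a Wald confidence interval, which is a swapped-in simplification: the article reports logistic regression with likelihood ratio statistics, so its intervals need not match exactly. The function name is hypothetical; the counts are the Class IV and Class I (reference) cells for the whole Papnet arm and the conventional arm.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table.

    a: Papnet arm, finding present    b: Papnet arm, reference class
    c: conventional arm, finding      d: conventional arm, reference class
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Table II counts: Class I is the reference category in both arms.
PAPNET_CLASS_I = 33447
CONVENTIONAL_CLASS_I = 67245

# Class IV: 39 smears in the Papnet arm, 48 in the conventional arm.
or_iv, lo_iv, hi_iv = odds_ratio_wald(39, PAPNET_CLASS_I, 48, CONVENTIONAL_CLASS_I)
print(f"Class IV: OR = {or_iv:.2f}, 95% CI = {lo_iv:.2f}-{hi_iv:.2f}")
```

For Class IV this approximation agrees with the tabulated OR of 1.63 (1.07–2.49) to two decimals; agreement for sparser rows such as Class V is not guaranteed.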
The detection rate of invasive cancer was 0.06‰ (n = 4) in the conventional screening arm and 0.08‰ (n = 3) in the Papnet-assisted screening arm. In situ carcinoma or severe dysplasia of the cervix was detected in 0.14% of the smears in the conventional and in 0.14% in the Papnet arm. The detection rates for moderate and mild dysplasias were 0.14% and 0.13% in the conventional and 0.14% and 0.11% in the Papnet arm, respectively (Table III). None of the differences were statistically significant (p > 0.05).
|Histology||Papnet arm, n (‰)||Conventional arm, n (‰)||OR||95% CI|
| ||All||Screened with Papnet||Screened manually||All|| || |
|Invasive cancer||3 (0.08)||2 (0.08)||1 (0.10)||4 (0.06)||1.50||0.30–6.80|
|In situ carcinoma or dysplasia gravis||51 (1.4)||33 (1.3)||18 (1.8)||100 (1.4)||1.02||0.72–1.42|
|Dysplasia moderata||51 (1.4)||34 (1.3)||17 (1.7)||104 (1.4)||0.98||0.70–1.36|
|Dysplasia levis||40 (1.1)||28 (1.1)||12 (1.2)||96 (1.3)||0.83||0.57–1.20|
|Normal and other||36,080 (996)||26,147 (996)||9,933 (995)||72,157 (996)||1.00||Reference|
|Total||36,225 (1,000)||26,244 (1,000)||9,981 (1,000)||72,461 (1,000)||—||—|
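The detection rates quoted in the text follow from the counts in Table III. As a small check, the sketch below recomputes them per 1,000 screened women for each arm; the dictionary layout and function name are hypothetical, and the counts are the "all" Papnet-arm column and the conventional-arm column.

```python
# Attendees per arm (Total row of Table III).
ATTENDEES = {"papnet": 36225, "conventional": 72461}

# Histologically confirmed lesions per arm (Table III).
CONFIRMED = {
    "invasive cancer": {"papnet": 3, "conventional": 4},
    "in situ carcinoma or dysplasia gravis": {"papnet": 51, "conventional": 100},
    "dysplasia moderata": {"papnet": 51, "conventional": 104},
    "dysplasia levis": {"papnet": 40, "conventional": 96},
}


def per_mille(cases, screened):
    """Detection rate per 1,000 screened women."""
    return 1000.0 * cases / screened


rates = {
    lesion: {arm: per_mille(n, ATTENDEES[arm]) for arm, n in counts.items()}
    for lesion, counts in CONFIRMED.items()
}
total_confirmed = sum(n for counts in CONFIRMED.values() for n in counts.values())

for lesion, by_arm in rates.items():
    print(f"{lesion:40s} Papnet {by_arm['papnet']:.2f}  conventional {by_arm['conventional']:.2f}")
print("All confirmed lesions:", total_confirmed)
```

The total matches the 449 histologically confirmed lesions mentioned in the abstract.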
The specificity for severe dysplasia+ with the cut-off level at Pap Class I was 92.5% (Papnet) vs. 92.9% (conventional). There was very slight variability depending on whether the reference group was Pap Class I only or Pap Classes I and II combined (Table IV).
[Table IV: specificity (%) by study arm, with negative histology defined as less than dysplasia gravis and a negative Pap smear defined either as Pap Class I or as Pap Classes I and II. The cell values are missing from this copy.]
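The specificity figures for the Pap Class I cut-off can be roughly reproduced from Tables II and III. This is a sketch under a stated assumption that the article does not spell out: every Class I smear is treated as histology-negative for severe dysplasia+, so the count of true negatives may be off by a case or two. The names below are hypothetical.

```python
# Counts from Tables II and III; "severe_plus" adds in situ carcinoma or
# dysplasia gravis to invasive cancer.
ARMS = {
    "papnet": {"attendees": 36225, "pap_class_i": 33447, "severe_plus": 51 + 3},
    "conventional": {"attendees": 72461, "pap_class_i": 67245, "severe_plus": 100 + 4},
}


def specificity_pct(arm):
    """Approximate specificity: test negatives among histology negatives.

    Assumption: no Class I smear belongs to a woman with confirmed
    severe dysplasia+, so all Class I smears count as true negatives.
    """
    histology_negative = arm["attendees"] - arm["severe_plus"]
    return 100.0 * arm["pap_class_i"] / histology_negative


specificity = {name: specificity_pct(arm) for name, arm in ARMS.items()}
for name, value in specificity.items():
    print(f"{name:13s} {value:.1f}%")
```

Under this assumption the results round to the 92.5% and 92.9% reported in the text.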
In the screening program, clearly positive cytology leads to direct referral for colposcopy. The positive predictive value (PPV) of Class III–V cytology for histologically verified lesions (any dysplasia or cancer) was 55% in the Papnet group compared with 51% in the conventional screening group (p = 0.25, binomial test). The PPV of Class II–V cytology was 5.2% in the Papnet group vs. 5.8% in the conventional screening group (p = 0.28, binomial test) (Table V). Neither difference was statistically significant (p > 0.05). Class II cytology led to a repeat smear after 6–12 months, and a repeated Class II or worse result led to colposcopy.
|Histology||Pap III–V cytology, PPV % (n)||Pap II–V cytology, PPV % (n)|
| ||Papnet n = 252||Conventional n = 565||Difference Papnet vs. conventional (p-value)||Papnet n = 2,777||Conventional n = 5,214||Difference Papnet vs. conventional (p-value)|
|Invasive cancer||1.2 (3)||0.7 (4)||0.5 (0.49)||0.1 (3)||0.1 (4)||−0.0 (0.65)|
|Dysplasia gravis +||20.6 (52)||17.9 (101)||2.8 (0.35)||1.9 (54)||2.0 (103)||−0.0 (0.92)|
|Dysplasia moderata +||40.1 (101)||35.2 (199)||4.9 (0.18)||3.8 (105)||4.0 (207)||−0.2 (0.68)|
|Dysplasia levis +||55.2 (139)||50.8 (287)||4.4 (0.25)||5.2 (145)||5.8 (303)||−0.6 (0.28)|
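The PPV comparison in Table V can be illustrated with a pooled two-proportion z-test. This is an assumed stand-in for the asymptotic 2-sided unconditional binomial test cited in the methods, not necessarily the exact procedure the authors used; the function name is hypothetical. The counts are the "Dysplasia levis +" row for Pap III–V cytology.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Pap III-V cytology, any dysplasia or cancer confirmed:
# 139 of 252 in the Papnet arm, 287 of 565 in the conventional arm.
ppv_papnet, ppv_conventional, z_stat, p_value = two_proportion_z_test(139, 252, 287, 565)
print(f"PPV Papnet {100 * ppv_papnet:.1f}%, conventional {100 * ppv_conventional:.1f}%")
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.2f}")
```

With these counts the approximation gives PPVs of 55.2% and 50.8% and a p-value that rounds to the tabulated 0.25.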
Both study arms were almost equally sensitive in detecting micro-organisms in the Pap smears (Table VI), which is of potential relevance for gynecological practice, i.e., for treating gynecological infections. Papnet screening, however, was slightly inferior in finding actinomyces (OR = 0.83, 95% CI = 0.71–0.96).
|Finding||Papnet n = 35,972, n (%)||Conventional n = 71,894, n (%)||OR||95% CI|
|Trichomonas||53 (0.15)||139 (0.19)||0.76||0.56–1.05|
|Yeast||1,524 (4.24)||3,016 (4.20)||1.01||0.95–1.08|
|Herpes simplex virus||5 (0.01)||10 (0.01)||1.00||0.31–2.81|
|Actinomyces||243 (0.68)||584 (0.81)||0.83||0.71–0.96|
|Clue cells||1,987 (5.52)||3,899 (5.42)||1.02||0.96–1.08|
|HPV suggestive||168 (0.47)||380 (0.53)||0.88||0.73–1.06|
DISCUSSION
Our large study of automation-assisted primary screening describes the results of the first year of a still ongoing randomized, prospective study incorporated into organized mass screening for cervical cancer in Finland. The main finding was that no statistically significant differences were found in any of the indicators of test performance between the conventional and the Papnet arms. The computerized, automation-assisted Papnet screening found slightly more severe cytological lesions (Class IV–V) than conventional screening, but confirmed histopathological findings were equally common in both study groups. Assuming only Pap Class I to be test-negative, the specificity of Papnet-assisted screening was slightly lower than that of conventional screening. Papnet screening produced more Class II findings that were not confirmed as precancerous at colposcopy-guided histology. It cannot be excluded, however, that these non-dysplastic lesions have malignant potential or carry otherwise relevant gynecological information. Only longer follow-up for reduction of invasive disease will allow us to find that out. The Papnet and conventional screening arms were equally valid and effective in finding abnormal micro-organisms, mainly indicated by Class II smears. In Finland these findings imply referral of the woman for treatment for gynecological purposes (not necessarily for cancer prevention).
The Papnet system is no longer commercially available. There will, however, probably be new computer-assisted, interactive screening methods in the future. Our study is probably applicable to any technology based on neural networks or images. It is important to study the interaction between the cytotechnician or cytopathologist and the modern techniques.
The relative validity of the neural network-assisted system in quality control rescreening has been reported in many studies.10, 11, 12 On the basis of those studies, Papnet primary screening is expected to be about as sensitive as, and more specific than, the traditional one. Doornewaard et al.9 concluded in their longitudinal cohort study that in primary screening Papnet yields sensitivity and specificity equal to those of conventional screening. Duggan13 concluded that Papnet-assisted screening is as specific as conventional screening and probably more sensitive, because it is a better detector of abnormalities at the lower end of the abnormal spectrum. The sensitivity of Papnet primary screening vs. conventional screening was 87.6% vs. 72.3% when atypical squamous cells of undetermined significance (ASCUS)/atypical glandular cells of undetermined significance (AGUS) was used as the threshold, and 85.6% vs. 82.4% when low-grade squamous intraepithelial lesion (LSIL) was the threshold. Michelow et al.15 obtained similar results in their study, which simulated primary screening in an unscreened, high-risk community. Papnet was significantly superior to conventional screening in low-grade lesions including ASCUS, AGUS and low-grade squamous intraepithelial lesions (89.6% vs. 63.8%). In more severe abnormalities, including high-grade squamous intraepithelial lesions and invasive carcinoma, conventional screening was more sensitive than Papnet (Papnet 87.5% vs. conventional 94.6%), but the difference was not statistically significant.
Some results are even more favorable to Papnet. Kok et al.7 conducted a large study and found that Papnet was substantially more sensitive than conventional screening in diagnosing invasive squamous cell cancer of the cervix. Further, the diagnostic consistency for high-grade lesions and invasive carcinoma was reported to be significantly higher for Papnet than for conventional screening.6
The slight discrepancies between the published results and ours may, at least partly, be explained by differences in the relative validity of conventional screening. Most studies classify premalignant or preinvasive lesions as positive and estimate relative validity rather than interval cancer-based sensitivity. Because the objective of screening is to prevent invasive disease, such estimates involve cases of overdiagnosis (i.e., lesions that would not have progressed to invasive disease). In Finland the effect of screening on incidence and mortality was about 80%, which could not have been reached unless the Pap test had very high sensitivity. There is no overall agreement on how to define and, hence, how to estimate validity. With this reservation, estimates of sensitivity range from 20% to 90%.16 Our randomized study, with arms subjected to independent estimation of test performance in primary screening, allows the results to be expressed as detection rates and not only as sensitivity rates. Our results show that Papnet screening performed almost as well as conventional screening. They differ somewhat from the majority of the available results, which show somewhat better relative validity for Papnet. The major explanation is the high quality of organized mass screening in Finland, which has performed very well over the last 38 years with conventional manual screening. Automation-assisted screening may improve the results of a suboptimal screening organization, but it may be very difficult to beat the results of well-organized, high-quality screening. Relative validity of the test is not, however, sufficient evidence for the effectiveness of an automation-based program. Our prospective and randomized design is unique and allows an evaluation of the 2 screening methods, conventional Pap smear and Papnet, in terms of reduction in incidence of and mortality from cervical cancer.
We are going to continue the trial and to include other competing technologies such as HPV-DNA-based cervical cancer screening in a multi-arm randomized design.
ACKNOWLEDGEMENTS
The Papnet system was purchased from NSI Inc. in 1998 by the Finnish Cancer Organizations and thereafter maintained by an engineer who was a former employee of NSI Inc. None of the authors have any dependencies on NSI Inc. T. Toivonen, MD, J. Ikkala, MD, J. Martikainen, MD and T. Timonen, MD were the chief cytopathologists of the local laboratories and were responsible for the data provided by these laboratories. This work was partly supported by the Finnish Academy, Finnish Cancer Organization and the European Community.
REFERENCES
- 1. Trends in the incidence of cervical cancer in the Nordic Countries. In: Magnus K, ed. Trends in cancer incidence. Causes and practical implications. New York: Hemisphere, 1982. 279–292.
- 2. Trends in mortality from cervical cancer in the Nordic countries; Association with organized screening programmes. Lancet 1987; I: 1247–9.