Metaanalysis of the accuracy of rapid prescreening relative to full screening of pap smears

Abstract

BACKGROUND

Efficient quality assurance and improvement measures are essential ingredients in a well-organized cytology-based program for cervical carcinoma screening. Various Pap smear review procedures, aiming for optimization of accuracy, are described throughout the literature. Evaluation and synthesis of those methods are needed. In a previous study, we pooled data on the diagnostic quality of rapid reviewing (RR) of cervical smears initially reported as normal or unsatisfactory. We now focus on rapid prescreening (RPS) of unreported smears.

METHODS

Six published studies on the accuracy of RPS relative to subsequent full screening were pooled using metaanalytic methods. Individual and pooled sensitivity, specificity, and predictive values were assessed using forest plots. Random effect pooling methods were used when interstudy heterogeneity was present. Variation in sensitivity according to influencing factors was explored by metaregression.

RESULTS

The pooled average sensitivity of RPS was 64.9% (95% confidence interval [CI] 50.7–79.1%) for all abnormalities, 72.6% (95% CI 60.6–85.2%) for low-grade lesions or more severe, and 85.7% (95% CI 77.8–93.6%) for high-grade lesions or more severe. The pooled specificity was estimated at 96.8% (CI 95.8–97.8%). The sensitivity increased significantly with duration of screening and decreased with workload. Almost 3% of all abnormal slides were detected only by RPS (2.8%; CI 0.0–5.8%). This is comparable to the proportion of false-negative smears detectable by RR.

CONCLUSIONS

Rapid prescreening has a high yield for severe dysplasia and shows diagnostic properties that support its use as a quality control procedure in cytologic laboratories. We showed previously that RR is superior to full reviewing of a 10% random sample of negative slides (10% FR). Because the yield of additional abnormalities found by RR and RPS is comparable, we expect RPS to be more efficient than 10% FR as well. Cancer (Cancer Cytopathol) 2003;99:9–16. © 2003 American Cancer Society.

Occurrence of false-negative Pap test results is a known reason for the failure of cytologic screening for cervical carcinoma.1 Reduction of the false-negative rate to a strict minimum is a major preoccupation of cytotechnologists and pathologists.2, 3 Rescreening slides interpreted as negative is a quality control (QC) method specifically designed to address the sensitivity problem inherent to the interpretation of Pap smears.4 Full rescreening of a 10% random fraction of smears reported as being within normal limits is a mandatory QC procedure in the United States.5–7 This QC method is criticized for its inefficiency and lack of statistical power to detect poor performance in primary screening within laboratories.8–10 Rapid or partial reviewing (RR) of all smears initially interpreted as not abnormal has been introduced in the United Kingdom as an alternative and more useful QC standard.11–14 Rapid reviewing consists of quickly rescreening, for 30–120 seconds each, all slides originally reported as within normal limits or as inadequate, to identify those that might contain missed abnormalities. Suspect smears are subsequently fully checked by an experienced cytotechnologist or cytopathologist who determines the final report.13, 15, 16

We previously synthesized published data on RR of Pap smears from the uterine cervix initially reported as negative, a procedure intended to detect false-negative results.17–19 We established evidence that RR of all negative preparations results in the detection of more additional abnormalities than fully rescreening only 10% of the negative workload. In particular, we found that RR allows retrieval of 4.7 times more extra positives, 5.6 times more squamous intraepithelial lesions (SIL), and 7.9 times more high-grade SIL (HSIL).

In the current study, available information from citations in the literature on rapid prescreening (RPS) of cervical smears was pooled. Rapid prescreening can be defined as a partial microscopic inspection of a slide during a limited duration (maximum 120 seconds) before a full routine evaluation.20

The essential difference between rapid prescreening and rapid reviewing is that in RPS all slides are submitted to a quick partial scan by a cytotechnologist, whereas in RR the slides already found to be abnormal have been removed and only the slides initially interpreted as negative are reviewed.

MATERIALS AND METHODS

Bibliographic Retrieval

Bibliographic references were retrieved by searching MEDLINE using the following keywords combined with Boolean operators: (cervix OR cervical) AND (cancer OR carcinoma OR dysplas* OR neoplas*) OR (rapid OR partial) AND (screening OR rescreening OR reviewing).

More sources were found by screening the reference lists of retrieved papers, by using the related-papers function in MEDLINE, and by hand searching the tables of contents of specialized cytologic journals (Acta Cytologica, Cytopathology, Diagnostic Cytopathology, Cancer (Cancer Cytopathology), Journal of Clinical Pathology, and American Journal of Clinical Pathology). The search was extended to the end of 2001.

Extraction of Data and Definitions

Retrieved material was grouped by study design: 1) RPS of unreported routine smears; 2) RR of smears initially interpreted as nonpositive; 3) rapid screening of sets of selected slides. Only studies matching type 1 were included in the current metaanalysis. Other analyses, including the pooling of RR studies (type 2), were the subjects of earlier publications.17, 19

The absolute number of true-positives (TP), false-positives (FP), true-negatives (TN) and false negatives (FN), extracted or calculated from the original data, constituted the ingredients for our calculations of study-specific and pooled accuracy parameters. A case was considered TP if it was indicated at RPS as potentially abnormal and subsequently confirmed as atypical squamous cells of undetermined significance (ASCUS) or worse at full screening or at final check when full screening was not positive. Without confirmation, the case was considered to be FP. A TN case was defined as a slide that was interpreted by the rapid prescreener as nonsuspect and that subsequently was interpreted as within normal limits after full evaluation. Cytologic lesions, negative at RPS but positive at full screening, were considered as FN.

The terminology of the British Society of Clinical Cytology used for reporting of cervical smears was converted to the 1991 version of The Bethesda System using the following equivalencies: ASCUS for borderline lesions; low-grade SIL (LSIL) for mild dyskaryosis; and HSIL for moderate or severe dyskaryosis.12, 21

We calculated sensitivity (TP/[TP + FN]) for three cytologic thresholds (ASCUS or more severe, LSIL or more severe, and HSIL or more severe). The following parameters were calculated only at the threshold ASCUS or more severe: specificity (TN/[TN + FP]); positive predictive value (PPV = TP/[TP + FP]); negative predictive value (NPV = TN/[TN + FN]); the proportion of additional positive slides detected only by RPS as a percentage of all confirmed cytologic positives; the prevalence of cytologic abnormality; and the proportion of slides that were suspect at RPS.
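The definitions above can be sketched in a few lines of Python. This is only an illustration: the class name `TwoByTwo` and the counts in `example` are hypothetical and are not taken from any of the pooled studies.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """RPS result cross-classified against full screening (hypothetical helper)."""
    tp: int  # suspect at RPS, confirmed ASCUS or worse
    fp: int  # suspect at RPS, not confirmed
    tn: int  # nonsuspect at RPS, within normal limits at full screening
    fn: int  # nonsuspect at RPS, positive at full screening

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def prevalence(self) -> float:
        # Share of all slides with a confirmed cytologic abnormality.
        return (self.tp + self.fn) / self.total

    def rps_positivity(self) -> float:
        # Share of all slides flagged as suspect at RPS.
        return (self.tp + self.fp) / self.total


# Illustrative counts only (assumption, not study data).
example = TwoByTwo(tp=80, fp=40, tn=860, fn=20)
print(f"sensitivity = {example.sensitivity():.3f}")
print(f"specificity = {example.specificity():.3f}")
print(f"PPV = {example.ppv():.3f}, NPV = {example.npv():.3f}")
```

Sensitivity at the LSIL or HSIL threshold would use the same formula with TP and FN counted only among slides at or above that threshold.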

Statistical Methods

Forest plots were used to display the accuracy outcomes of separate studies with the pooled measure as a diamond at the bottom of the graph.22–24 Interstudy heterogeneity was assessed statistically by a Q-test.25, 26 When the Q-test showed significant heterogeneity, a random effect model was used for the estimation of the overall pooled measure and its 95% confidence interval (CI).27 When no statistically significant heterogeneity was found, a fixed effect pooling method was used with weighting of each study according to the reciprocal of its variance.24 A random effect model widens the CI substantially when heterogeneity is important.
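A minimal sketch of this pooling logic follows. The text specifies inverse-variance weights for the fixed effect model; the random effect part below uses the DerSimonian-Laird moment estimator, which is an assumption on our part, since the estimator is not named in the text. The function name `pool_estimates` and the input values are hypothetical. The Q statistic is returned with its degrees of freedom so it can be compared with a chi-square distribution (not computed here).

```python
import math


def pool_estimates(estimates, variances):
    """Inverse-variance fixed effect and DerSimonian-Laird random effect pooling."""
    k = len(estimates)
    w = [1.0 / v for v in variances]  # fixed effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    df = k - 1

    # Between-study variance (moment estimator, truncated at zero).
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effect weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    sum_w_re = sum(w_re)
    random = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum_w_re

    return {
        "fixed": fixed, "se_fixed": math.sqrt(1.0 / sum_w),
        "random": random, "se_random": math.sqrt(1.0 / sum_w_re),
        "q": q, "df": df, "tau2": tau2,
    }


def ci95(estimate, se):
    """Normal-approximation 95% confidence interval."""
    return estimate - 1.96 * se, estimate + 1.96 * se


# Two hypothetical, strongly discordant sensitivities with equal variance.
result = pool_estimates([0.4, 0.8], [0.01, 0.01])
print(result["q"], result["tau2"])
print("fixed CI:", ci95(result["fixed"], result["se_fixed"]))
print("random CI:", ci95(result["random"], result["se_random"]))
```

With heterogeneous inputs the between-study variance is positive, so the random effect standard error exceeds the fixed effect one and the CI widens, as described above.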

The relation between sensitivity and influencing covariates was studied comprehensively by metaregression.28–30 Metaregression allows measuring of the extent to which covariates contribute to the heterogeneity in the study outcomes. This method, based on multilevel modeling, accounts for both intrastudy and interstudy variance and is particularly useful when wide variability exists among studies.

RESULTS

Procedures

Six of 52 retrieved references contained quantitative data on RPS.19 The oldest study results, dated 1957,31 were cited by Koss.32 The original report could not be retrieved. The other five articles were more recent and were all from British authors.

The main technical procedures are provided in Table 1. The duration of RPS was 30 seconds (exactly measured20, 33 or approximate34, 36), 60 seconds,35, 36 or 120 seconds.32, 36 Usually, 10 seconds were spent on result documentation and changeover. Two techniques of slide movement were generally used. In the “step” mode, slides are evaluated using the usual speed and movement but only a few tracks are followed (diagonal, long and short border). In the “whole slide” method, coverage of the entire smear is attempted in the given time using large random jumps.37 The number of slides prescreened per session was described in only three studies: 70,20 90,33 and 100.36 Faraker et al.34 stated that the number of smears prescreened by one cytotechnologist rarely exceeded 60 per day. Several cytotechnologists with varying experience participated in the RPS exercises. The duration of subsequent full screening varied between 6 and 10 minutes.

Table 1. Technical Procedures Applied in RPS Followed by Full Routine Screening of Unreported Cervical Smears

| Reference | No. of slides | Duration of rapid screening (sec) | Slide movement | Slides screened at one RPS session | Cytologists | Duration of full screening (minutes) |
|---|---|---|---|---|---|---|
| Koss32 | 4184 | 120 | | | | |
| Baker and Melcher20 | 2030 | 30 exact; + 10 (a) | Whole slide | 70 | | 7 |
| Johnson et al.33 | 2160 | 30, + 10 (a) | Step: 10 CT; whole slide: 1 CT; step and corners: 1 CT | ± 90 (× 2) | 8 experienced; 4 less experienced | 6–10 |
| Faraker et al.34 | 9517 | 30–45 | Step | Rarely exceeding 60/day | 4 | |
| Cross35 | 1729 | ± 60 | Whole slide | | All screening staff (b) | 6–8 |
| Farrell et al.36 | 2938 | ± 30, + 10 (a) | Step | 100 | 3 experienced CTs, 1 pathologist | 7 |
| | 2925 | ± 60, + 10 (a) | Step | 100 | As above | 7 |
| | 2937 | ± 120, + 10 (a) | Step | 100 | As above | 7 |

RPS: rapid prescreening; CT: cytotechnologist.
(a) Ten seconds used for documentation and changeover.
(b) One cytoscreener who was unhappy with the method of RPS was withdrawn during the experience.

Data and Metaanalysis

The calculated accuracy parameters, the proportion of RPS additional positives not detected by full screening, the prevalence of cytologic abnormalities, and the RPS positivity rate are presented in Table 2.

Table 2. Diagnostic Performance Parameters of RPS Relative to Full Subsequent Screening

| Reference | Time (seconds) | Sensitivity, Abn (%) | Sensitivity, LSIL + (%) | Sensitivity, HSIL + (%) | Specificity (%) | PPV (%) | NPV (%) | Additional positives detected (%) | Prevalence (%) | Proportion of positives at RPS (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Simon and Ricci31 | 120 | 77.3 | | | 86.5 | 3.0 | 99.9 | 0.0 | 0.54 | 13.9 |
| Baker and Melcher20 | 30 | 79.5 | 81.9 | 100 | 95.6 | 51.4 | 98.7 | 0.0 | 5.6 | 8.6 |
| Johnson et al.33 | 30 | 35.8 | 50.6 | 58.3 | 94.5 | 40.6 | 93.4 | 3.4 | 9.4 | 8.3 |
| Faraker et al.34 | 30 | 81.4 | 86.2 | 87.4 | 98.3 | 70.1 | 99.1 | 3.8 | 4.4 | 5.1 |
| Cross35 | 60 | 55.5 | 79.0 | 81.6 | 97.3 | 67.6 | 95.5 | 0.0 | 8.5 | 6.9 |
| Farrell et al.36 | 30 | 56.9 | 59.1 | 86.8 | 98.2 | 70.4 | 96.7 | 7.2 | 7.1 | 5.8 |
| | 60 | 57.0 | 64.7 | 86.8 | 98.3 | 74.0 | 96.4 | 9.1 | 7.9 | 6.1 |
| | 120 | 69.9 | 78.7 | 96.7 | 97.2 | 69.6 | 97.2 | 9.8 | 8.4 | 8.4 |
| | All | 61.6 | 67.9 | 90.4 | 97.9 | 71.2 | 96.8 | 8.8 | 7.8 | 6.7 |

PPV: positive predictive value; NPV: negative predictive value; RPS: rapid prescreening; Abn: abnormal cytology (atypical squamous cells of undetermined significance or worse); LSIL +: low-grade squamous intraepithelial lesions or worse; HSIL +: high-grade squamous intraepithelial lesions or worse.

Figure 1 shows the sensitivity, stratified into three cytologic thresholds (ASCUS or more severe, LSIL or more severe, and HSIL or more severe), and the specificity of RPS. The sensitivity was the highest in the Faraker et al. study,34 except for high-grade abnormalities for which the Baker and Melcher study20 ranked first. At all thresholds, sensitivity was lowest in the Johnson et al. study.33

Figure 1.

Forest plots of the sensitivity for three cytologic thresholds: atypical squamous cells of undetermined significance or more severe (a), low-grade squamous intraepithelial lesions or more severe (b), and high-grade squamous intraepithelial lesions or more severe (c). (d) Specificity of rapid prescreening relative to full routine screening of unreported smears. The forest plot displays parameters of each study as a horizontal line, depicting the confidence interval and a rectangle for the point estimate with surface proportional to the weight that the study contributes to the metaanalysis. The pooled estimate is displayed at the bottom as a diamond.

The summary of the pooled accuracy parameters is shown in Table 3. The average sensitivity of RPS relative to the combination of subsequent full screening and further checking was 64.9% (CI 50.7–79.1%) for all abnormalities (ASCUS or more severe), 72.6% (CI 60.0–85.2%) for LSIL or more severe, and 85.7% (CI 77.8–93.6%) for HSIL or more severe. The Simon and Ricci study31 was excluded from the metaanalysis of the specificity because it had an extremely high FP rate (13.5%). The pooled estimate of the specificity of the remaining studies was 96.8% (CI 95.8–97.8%). If the Simon and Ricci study had been included, the pooled estimate would have been 95.0% (CI 92.8–97.3%).

Table 3. Accuracy of RPS: Summary of the Metaanalysis

| Accuracy parameter | Pooled mean (%) | CI | Range |
|---|---|---|---|
| Sensitivity for cytologic abnormalities | 64.9 | 50.7–79.1 | 35.8–81.4 |
| Sensitivity for LSIL + | 72.6 | 60.6–85.2 | 50.6–86.2 |
| Sensitivity for HSIL + | 85.7 | 77.8–93.6 | 58.3–100 |
| Specificity (a) | 96.8 | 95.8–97.8 | 94.5–98.3 |
| Positive predictive value (a) | 60.4 | 49.6–71.2 | 40.6–71.2 |
| Negative predictive value | 97.4 | 96.2–98.5 | 93.4–99.9 |
| Additional positives found | 2.8 | 0.0–5.8 | 0.0–8.8 |

RPS: rapid prescreening; LSIL +: low-grade squamous intraepithelial lesions or worse; HSIL +: high-grade squamous intraepithelial lesions or worse.
(a) Results of Simon and Ricci31 were excluded because of the outlying false-positive and prevalence rates.

The prevalence of cytologic abnormalities in five of the studies ranged from 2.7% to 4.7%. This yielded a pooled PPV of 60.4% (CI 49.6–71.2%). Again, the Simon and Ricci study data were excluded, as the prevalence of cytologic abnormality was very low (0.4%), with a PPV of 3.0%. The NPV was 97.4% (CI 96.2–98.5%) in all studies. Forest plots of the PPV and NPV are shown in Figure 2.
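The dependence of the predictive values on prevalence can be illustrated with Bayes' rule. The sketch below is not the authors' calculation (they derived PPV and NPV from the study counts); it only shows that the sensitivity, specificity, and prevalence reported in Table 2 approximately reproduce the reported predictive values. The function names are hypothetical.

```python
def ppv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity, and prevalence."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


def npv_from_rates(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value from sensitivity, specificity, and prevalence."""
    true_neg = (1.0 - prevalence) * specificity
    false_neg = prevalence * (1.0 - sensitivity)
    return true_neg / (true_neg + false_neg)


# Table 2 values, ASCUS or more severe threshold, expressed as fractions.
simon_ricci = dict(sensitivity=0.773, specificity=0.865, prevalence=0.0054)
baker_melcher = dict(sensitivity=0.795, specificity=0.956, prevalence=0.056)

for name, rates in [("Simon and Ricci", simon_ricci), ("Baker and Melcher", baker_melcher)]:
    print(f"{name}: PPV = {100 * ppv_from_rates(**rates):.1f}%, "
          f"NPV = {100 * npv_from_rates(**rates):.1f}%")
```

With a prevalence near 0.5% and a false-positive rate of 13.5%, false positives far outnumber true positives, which is why the Simon and Ricci PPV is only about 3%.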

Figure 2.

Forest plot of the positive (a) and negative (b) predictive value of rapid prescreening.

Three studies33, 34, 36 found abnormal smears at RPS that were not detected at subsequent full screening. In three other studies,20, 31, 35 no additional cytologic abnormalities were found by RPS. The overall proportion of additional positive slides was 2.9% (CI 0.0–5.8%). Faraker et al.34 did not report the grade of these abnormalities detected only by RPS and Johnson et al.33 found only ASCUS. However, 10% (6 of 60) and 45% (27 of 60) of extra abnormalities identified in the Farrell et al. study36 were HSIL and LSIL, respectively.

Variation of the Sensitivity of RPS According to the Duration of RPS

The sensitivity of RPS in relation to duration is presented in Table 4 for the three cytologic cutoffs. Only studies stating formally that prescreening took 30 seconds are included in the first group. In the Faraker et al. study,34 duration ranged from 30 to 45 seconds; this study was therefore excluded. The estimation of sensitivity at 30 seconds was very imprecise due to heterogeneity across the studies. No increase in sensitivity for all cytologic lesions was observed when reading time increased from 30 to 60 seconds. Overall, sensitivity seems to increase with duration and with severity of cytologic abnormality. Sensitivity for HSIL was quite high, even at short reading times.

Table 4. Sensitivity of RPS According to Screening Duration Defined at Three Cytologic Thresholds

| Threshold | Duration (sec) | Method of estimation | Estimate (95% CI) |
|---|---|---|---|
| All abnormalities (≥ ASCUS) | 30 | Pooled random effect | 57.3% (33.1–81.6) |
| | 60 | Pooled fixed effect | 56.4% (51.4–61.6) |
| | 120 | Pooled fixed effect | 70.6% (65.2–76.1) |
| ≥ LSIL | 30 | Pooled random effect | 64.1% (45.9–82.2) |
| | 60 | Pooled fixed effect | 69.8% (63.8–75.9) |
| | 120 | Point estimate (binomial CI) | 78.7% (71.8–84.6) |
| ≥ HSIL | 30 | Pooled random effect | 82.6% (62.5–100.0) |
| | 60 | Pooled fixed effect | 85.0% (77.6–92.3) |
| | 120 | Point estimate (binomial CI) | 96.7% (88.7–99.6) |

CI: 95% confidence interval; ≥ ASCUS: atypical squamous cells of undetermined significance or worse; ≥ LSIL: low-grade squamous intraepithelial lesions or more severe; ≥ HSIL: high-grade squamous intraepithelial lesions or more severe; RPS: rapid prescreening. Studies at 30 seconds included Baker and Melcher,20 Johnson et al.,33 and Farrell et al.;36 studies at 60 seconds included Cross35 and Farrell et al.;36 studies at 120 seconds included Simon and Ricci31 and Farrell et al.36

Multivariate Analysis of the Sensitivity

A more complete picture was obtained by considering the coefficients of metaregression with sensitivity as the dependent variable and with duration of prescreening, the number of slides per session, the mode of slide movement, and the cytologic threshold as predictor variables. By fixing the prescreen workload in the Faraker et al. study at 60 slides per session, four of the studies could be included.

Figure 3 shows the relation between duration of screening and the fitted sensitivity (lines) as well as observed sensitivities (points) for the three cutoffs. Neither a linear model with different slopes for each threshold nor a quadratic model resulted in a significantly better fit.

Figure 3.

Relation between the duration of prescreening and its sensitivity, estimated at 30, 60, and 120 seconds of prescreening for three cytologic thresholds: atypical squamous cells of undetermined significance, low-grade squamous intraepithelial lesions, and high-grade squamous intraepithelial lesions. The points are observed values reported in four published studies. The lines represent fitted estimates derived from a random effect model, allowing for interstudy heterogeneity and calculated for a workload of 100 slides per session.

The sensitivity increased by 7.0% (CI −4.8 to 18.9%) for LSIL or more severe (not significant, P = 0.24) and by 23.2% (CI 11.1–35.2%) for HSIL or more severe (highly significant, P < 0.001) in comparison to the baseline threshold (ASCUS or more severe). Sensitivity increased with the duration of prescreening by 0.26% (CI 0.09–0.42%) for every second above 30 seconds. Conversely, sensitivity decreased significantly with workload by 0.68% per slide per session. Sensitivity was unaffected by the mode of slide movement.
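To make the size of these effects concrete, the reported coefficients can be combined into a small calculation. The sketch below assumes a purely additive linear model (consistent with the finding that threshold-specific slopes did not improve the fit) and reports only differences in percentage points relative to a reference setting, because the intercept of the metaregression is not given in the text. The function name, the reference values, and the threshold labels are hypothetical.

```python
# Coefficients reported in the text, in percentage points of sensitivity.
DURATION_COEF = 0.26    # per second of prescreening above 30 seconds
WORKLOAD_COEF = -0.68   # per additional slide per session
THRESHOLD_SHIFT = {"ASCUS+": 0.0, "LSIL+": 7.0, "HSIL+": 23.2}


def sensitivity_change(duration_s, slides_per_session, threshold,
                       ref_duration_s=30, ref_slides=100, ref_threshold="ASCUS+"):
    """Modeled change in sensitivity (percentage points) versus a reference setting.

    Assumes additive effects; durations outside 30-120 seconds are rejected
    because the model should not be extrapolated beyond the observed range.
    """
    for d in (duration_s, ref_duration_s):
        if not 30 <= d <= 120:
            raise ValueError("duration must lie within 30-120 seconds")
    return (DURATION_COEF * (duration_s - ref_duration_s)
            + WORKLOAD_COEF * (slides_per_session - ref_slides)
            + THRESHOLD_SHIFT[threshold] - THRESHOLD_SHIFT[ref_threshold])


# Lengthening prescreening from 30 to 120 seconds at an unchanged workload.
print(sensitivity_change(120, 100, "ASCUS+"))
# Reducing the workload from 100 to 60 slides per session at 30 seconds.
print(sensitivity_change(30, 60, "ASCUS+"))
```

Under these assumptions, cutting the session from 100 to 60 slides has a modeled effect of about the same size as quadrupling the reading time from 30 to 120 seconds.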

Influence of Experience in RPS

Johnson et al.33 studied the sensitivity of RPS and provided details of the level of experience of the cytotechnologists in rapid screening (Table 5). The sensitivity was substantially higher for ASCUS or more severe (P = 0.02) when RPS was performed by experienced cytologists. However, there was only a marginal difference between experienced and inexperienced screeners for detection of LSIL or more severe (P = 0.07) and there was no significant difference for HSIL or more severe.

Table 5. Sensitivity of RPS in Relation to Screener Experience

| Threshold | Sensitivity, experienced screener (%) | Sensitivity, nonexperienced screener (%) | P value for difference in sensitivity |
|---|---|---|---|
| ASCUS | 40.0 | 20.5 | 0.02 |
| LSIL | 55.1 | 28.6 | 0.07 |
| HSIL | 59.4 | 50.0 | 0.71 |

RPS: rapid prescreening; ASCUS: atypical squamous cells of undetermined significance; LSIL: low-grade squamous intraepithelial lesions; HSIL: high-grade squamous intraepithelial lesions.

DISCUSSION

Rapid rescreening as a QC method was originally described and applied in the United Kingdom. Over the last 5 years, the popularity of rapid screening has grown. We found reports from nine countries.19 The current study is one in a series of systematic reviews by which we evaluate the role of rapid screening in prescreening and postscreening QC.

This study provides a comprehensive picture of the accuracy of RPS for cervical smears. An attempt was made to synthesize available information and to find plausible explanations for the variation in accuracy, using modern metaanalytic techniques. It must be noted that only relative accuracy was considered, i.e., performance of RPS was measured relative to subsequent full screening and further checking when a case was RPS positive but negative at routine screening. This study did not address the absolute sensitivity of rapid screening.

Quality of Studies

The quality of the retrieved studies was somewhat variable, with substantial heterogeneity in the outcomes and insufficient description of technical procedures. In these circumstances, metaanalytic pooling is not straightforward. Nevertheless, forest plots clearly show the variability of individual results and the heterogeneity in outcomes is at least partly accounted for by the use of random effect pooling methods. Readers are invited not to focus only on the pooled measure but to consider the whole forest plot reflecting the variation among all included individual studies. By metaregression, plausible explanations for this variation could be identified within limits of available covariate information.

Accuracy of RPS and Influencing Factors

The sensitivity of RPS differed according to cytologic category, duration of screening, number of slides prescreened in one session, and the experience of the cytotechnologist. The pooled overall sensitivity estimate is high for HSIL or more severe (86%) and substantially lower (by 23 percentage points in the metaregression) for ASCUS or more severe. The modeled sensitivity increases almost linearly with duration of prescreening between 30 and 120 seconds, by 2.6% (CI 0.9–4.2%) for every additional 10 seconds of screening. This relation is similar for all degrees of cytologic abnormality. It is remarkable that 83% of cytologically detectable HSIL lesions were found at the 30-second prescreening. Furthermore, where cytotechnologist experience was recorded, the sensitivity for HSIL was as high for inexperienced as for experienced rapid screeners. Because metaregression does not allow extrapolations beyond the interval of 30–120 seconds, we cannot comment further on the relation between the reading time of cytologic preparations and the detection rate. However, we agree with Koss32 that this relationship is very poorly documented throughout the cytologic literature and warrants further study.

Another strong influencing factor is workload. Sensitivity decreases when the number of slides exceeds 60 per session. This fact can be attributed to fatigue and loss of concentration and has been confirmed by other authors.35, 38 In the Faraker et al. laboratory,34 the maximum was 60 slides per day with a duration of 30–45 seconds, whereas Johnson et al.33 allowed two consecutive sessions of 90 slides to be read at 30 seconds each. This contrast in workload largely explains why the results reported by Johnson et al. were located at the extreme left in the forest plot and those of Faraker et al. at the right (Fig. 1). Even specificity seems to be influenced in the same way.

The NPV of RPS is high (97.4%; CI 96.2–98.5%). However, this does not imply that RPS should be used as an alternative to full screening.

Due to the inconsistency in the reporting of exact duration of screening, the coefficients from the metaregression equation should be considered with caution. The effect of determining factors could be assessed more precisely by multiple rapid screening of sets of selected smears and recording of all pertinent technical details. Pooling of such studies (type 3) is planned. As previously indicated, the results of this study were calculated relative to conventional primary screening and subsequent checking. The absolute sensitivity of RPS will be lower, because the cross-sectional FN rate of primary screening can be substantial.39, 40 In particular, rapid screening would be unlikely to detect cases with few abnormal cells.41 The capacity of RPS to identify smears missed at full conventional screening can be assessed easily by rapidly screening a set containing selected original FN slides mixed with original true positive and normal slides. Renshaw et al.42 found a ratio of sensitivity for TPs versus FNs of 1.6 for LSIL or more severe and 1.5 for HSIL. However, this assessment depends strongly on the composition of the study set and does not replicate a real RPS situation. Therefore, we will pool such studies in a separate metaanalysis.

Applications of RPS

Rapid prescreening of Pap smears has been used for several purposes. It can be used during a backlog, allowing quick identification of easily detectable pathologic cases so that concerned women can receive timely follow-up or treatment.20 It has often been used to train cytologists before introducing RR as a QC system in the laboratory and to identify skilled cytotechnologists.13, 34, 38 The knowledge of the sensitivity of RPS with respect to subsequent full reading provides a correction parameter that can be used to adjust the FN rate of primary screening evaluated by RR of cases negative after conventional full screening.17, 34

Our findings further suggest a potential of RPS in internal QC comparable to RR. The yield of positive smears, found only during RPS and not at subsequent routine screening, was 0.19% (CI 0.03–0.35%). The average sensitivity of conventional full screening increased by 2.8% (CI 0.0–5.8%). These findings are comparable to those from a previous metaanalysis of rapid rescreening.17 In that study, the yield of positive smears, found only at RR of negative cases, was 0.18% (CI 0.14–0.21%) and the average sensitivity increased by 2.6% (CI 1.8–3.5%). That study also showed that RR is a more efficient QC procedure than the full rescreening of 10% of the negative workload used in the United States. The same efficiency can be expected from RPS.17 To explore this potential, a research project is currently underway at the Cytological Institute of the Bavarian Society in Munich (in collaboration with the Scientific Institute of Public Health in Brussels, within the framework of the European Network for Cervical Cancer Screening). Although RPS and RR yield comparable gain in detection of missed abnormalities, cytotechnologists may find the prescreening more interesting.38 Moreover, RPS has a practical advantage: slides are still free of ink marks that might influence the reviewing process.

Economic Aspect of RPS

The complement of the specificity, the FP rate, indicates the extra effort and time spent without benefit by checking RPS-positive slides that were negative at full screening. On average (across five studies), the RPS FP rate was 3.2% (CI 2.2–4.2%), which looks economically acceptable. The extra gain in sensitivity obtained by RPS is modest but does not require investment, other than human resources. RR or RPS can be a cost-effective quality assurance method in comparison to more sophisticated automated rescreening procedures that require substantial investment and generally are less specific.10

To conclude, RPS shows considerable promise as a QC process, with a sensitivity gain comparable to that of RR and superior to that of 10% full rescreening. Further studies are in progress to determine its value.

As with any screening process, quality assurance and improvement of test interpretation (e.g., by reviewing the smears) are only one aspect. Optimization of sampling and, most importantly, participation of at-risk populations and follow-up of screen positives are of crucial importance in the reduction of incidence and cause-specific mortality. All of these aspects can be managed best within the framework of a population-based screening program, such as those conducted in North-European countries43–45 and as recommended by the European Commission.46, 47

Acknowledgements

The authors are grateful to Bernadette Claus, Catherine Vân Phan, and Marisa Abarca (of the Scientific Institute of Public Health) for their assistance in retrieval of literature references.
