Pooled analysis of the accuracy of five cervical cancer screening tests assessed in eleven studies in Africa and India

Abstract

Cervical cancer is the main cancer among women in sub-Saharan Africa, India and other parts of the developing world. Evaluation of the screening performance of effective, feasible and affordable early detection and management methods is a public health priority. Five screening methods were evaluated in 11 studies in India and Africa: naked-eye visual inspection of the cervix uteri after application of diluted acetic acid (VIA) or Lugol's iodine (VILI), VIA with a magnifying device (VIAM), the Pap smear, and human papillomavirus testing with the high-risk probe of the Hybrid Capture-2 assay (HC2). More than 58,000 women, aged 25–64 years, were tested with 2–5 screening tests, and outcome verification was done on all women irrespective of the screening test results. The outcome was presence or absence of cervical intraepithelial neoplasia (CIN) of different degrees or invasive cervical cancer. Verification was based on colposcopy and histological interpretation of colposcopy-directed biopsies. Negative colposcopy was accepted as a truly negative outcome. VIA showed a sensitivity of 79% (95% CI 73–85%) and 83% (95% CI 77–89%), and a specificity of 85% (95% CI 81–89%) and 84% (95% CI 80–88%) for the outcomes CIN2+ and CIN3+, respectively. VILI was on average 10% more sensitive and equally specific. VIAM showed results similar to those of VIA. The Pap smear showed the lowest sensitivity, even at the lowest cutoff of atypical squamous cells of undetermined significance (57%; 95% CI 38–76%) for CIN2+, but its specificity was rather high (93%; 95% CI 89–97%). The HC2 assay showed a sensitivity for CIN2+ of 62% (95% CI 56–68%) and a specificity of 94% (95% CI 92–95%). Substantial interstudy variation was observed in the accuracy of the visual screening methods. Accuracy of visual methods and cytology increased over time, whereas performance of HC2 was constant. Results of visual tests and colposcopy were highly correlated.
This study was the largest ever done to evaluate the cross-sectional accuracy of screening tests for cervical cancer precursors in developing countries. The merit of the study was that all screened subjects were submitted to confirmatory investigations, thus avoiding verification bias. A major finding was the consistently higher sensitivity but equal specificity of VILI compared with VIA. Nevertheless, some caution is warranted in the interpretation of the observed accuracy measures, since a certain degree of gold standard misclassification cannot be excluded. Because of the correlation between visual screening tests and colposcopy and a certain degree of over-diagnosis of apparent CIN2+ by study pathologists, it is possible that both sensitivity and specificity of VIA and VILI were overestimated. Gold standard verification error could also explain the surprisingly low sensitivity of HC2, which contrasts with findings from other studies. © 2008 Wiley-Liss, Inc.

Cervical cancer, an eminently preventable cancer, is the second most common cancer among women worldwide, affecting an estimated 490,000 women each year and causing 270,000 deaths annually.1 About 85% of this global burden of cervical cancer is experienced in sub-Saharan Africa, South and South-East Asia, Oceania, Central and South America and the Caribbean. Although cervical cancer risk has substantially declined among women in developed countries due to effective cervical cytology screening programs, cervical cancer continues to be the most common cause of premature death among middle-aged women in their most productive years and the largest single cause of years of life lost to cancer in developing countries.2–4 This is mainly due to the lack of screening, or to inefficient existing cytology screening programs, in these countries.

Different screening tests, such as conventional cytology, visual inspection using acetic acid (VIA) or Lugol's iodine (VILI), VIA with magnification (VIAM) and human papillomavirus (HPV) testing, have been investigated in the last few years for their accuracy in different settings in developing countries and with different providers, as part of the research effort to find feasible and effective options for implementing screening in these countries. The International Agency for Research on Cancer, as part of the Alliance for Cervical Cancer Prevention supported by the Bill and Melinda Gates Foundation, conducted 11 cross-sectional studies in 4 locations in India and in 5 locations in French-speaking Africa, in collaboration with the national institutions, to evaluate the accuracy of the above screening tests in detecting cervical neoplasia and to establish a platform from which screening services can be provided and health personnel can be trained in different aspects of cervical screening.5–10 Results from these studies have shown varying estimates of test accuracy, due to several factors including differences in testing, training of providers, quality assurance methods, quality and consistency of the reference standards used to establish true positive disease, reporting of histopathology and monitoring of studies. In this article, we present the results of a meta-analysis of these 11 independent studies. We computed the overall sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and pooled relative accuracy of the screening tests. Additionally, we explored the sources of heterogeneity by assessing the association between test accuracy and individual and study characteristics.

Material and methods

Study population, evaluators and tests

Between 1999 and 2003, the test accuracy of 5 cervical cancer screening methods was evaluated in more than 58,000 women aged 25 to 64 years from 11 urban settings in India and 5 African countries.5–9 The assessed tests were naked-eye visual inspection of the uterine cervix after application of acetic acid (VIA)5, 6 or after application of Lugol's iodine (VILI),5, 6 VIA with a low-level magnifying lens (4×) (VIAM),7 the conventional Papanicolaou smear8 and an HPV DNA detection test targeting 13 high-risk HPV types,9 using the B-probe of the Hybrid Capture-2 assay (HC2) (Digene, Gaithersburg, MD). The number of women examined and the tests assessed per centre are shown in Table I. VIA was evaluated in all 11 studies and VILI in all but the Calcutta 1 study. VIAM, cytology and HC2 were assessed only in India, in 3, 5 and 4 studies, respectively. The characteristics of the participating population and of the health personnel involved, the technical details and the results of each separate screening test were described previously.5–9 The Pap smear was reported in the following 4 categories: negative for neoplastic cellular changes, atypical squamous cells of undetermined significance (ASCUS), low-grade intraepithelial lesion (LSIL) and high-grade intraepithelial lesion or worse (HSIL+). The other tests were categorized as negative or positive using criteria described elsewhere.5–9, 11 Briefly, positivity for VIA or VIAM was defined as the presence of opaque, dense, well-defined aceto-white areas touching the squamo-columnar junction or close to the external os, or the presence of an aceto-white growth, observed 1 min after application of 4% acetic acid solution on the cervix, whereas the presence of mustard or saffron yellow lesions after application of Lugol's iodine was the positivity criterion for VILI.11 A signal with an RLU (relative light unit) value higher than one, using controls that contained 1 pg/mL of HPV DNA, was used as the cutoff for the HC2 test.9, 12

Table I. Number of Included Participants and Cervical Cancer Screening Tests Evaluated in Five African and Six Indian Study Centres
| Number | Centre | Country | Tests evaluated | Number of women included |
|---|---|---|---|---|
| 1 | Bamako | Mali | VIA, VILI | 5552 |
| 2 | Brazzaville | Congo | VIA, VILI | 6935 |
| 3 | Conakry | Guinea | VIA, VILI | 8627 |
| 4 | Niamey | Niger | VIA, VILI | 2534 |
| 5 | Ouagadougou | Burkina Faso | VIA, VILI | 2051 |
| 6 | Calcutta 1 | India | VIA, VIAM, Pap smear, HC2 | 5894 |
| 7 | Calcutta 2 | India | VIA, VILI, VIAM, HC2 | 8080 |
| 8 | Jaipur | India | VIA, VILI, Pap smear | 5786 |
| 9 | Mumbai | India | VIA, VILI, VIAM, Pap smear, HC2 | 4004 |
| 10 | Trivandrum 1 | India | VIA, VILI, Pap smear | 4457 |
| 11 | Trivandrum 2 | India | VIA, VILI, Pap smear, HC2 | 4759 |
| | Total | | | 58,679 |

VIA, visual inspection after application of acetic acid; VILI, visual inspection after application of Lugol's iodine; VIAM, visual inspection after application of acetic acid and using a magnifying loupe; HC2, Hybrid Capture-2 assay.

All the screening tests were applied independently by different examiners, who were blinded to the results of the other tests. Visual screening tests and collection of cervical cells for cytology and HPV testing were performed by 51 female health workers with a variety of educational qualifications: auxiliary nurse midwives (n = 33) (Burkina Faso, Congo, Guinea, Mali, Niger), registered nurses (n = 2) (Trivandrum), cyto-technicians (n = 2) (Trivandrum), university graduates in science (n = 9) (Jaipur, Calcutta) and high-school graduates (n = 5) (Mumbai). All were trained in performing and reporting VIA and VILI during a 5-day intensive course, using a training manual prepared by the International Agency for Research on Cancer.11 The training included lectures, discussions, review of photographs of normal and abnormal cervices, as well as practical experience observing examinations of 150–200 volunteer subjects. Refresher courses (1–2 days) were given to all health workers during the study period.

All participants were subsequently investigated with colposcopy on the same day and, if a colposcopically suspect or abnormal lesion was identified, punch biopsies were taken. Colposcopists and histologists examining biopsies were masked with respect to the screening test results. The histopathological diagnosis was used as gold standard for women from whom biopsies were taken; otherwise the colposcopic impression was used for determination of the final cervical status. This final outcome was categorized into 5 classes: normal or non-neoplastic changes, cervical intraepithelial neoplasia grade 1 (CIN1) including HPV changes, CIN2, CIN3 and invasive cancer. Patients with CIN or cancer were offered appropriate follow-up and treatment.
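
The reference-standard rule described above can be sketched in a few lines of Python. This is only an illustration: the names `Outcome`, `final_status` and `is_diseased` are hypothetical, and the sketch assumes that the colposcopic impression is recorded on the same five-class scale as the histological diagnosis.

```python
from enum import IntEnum
from typing import Optional


class Outcome(IntEnum):
    """Final cervical status in five classes, ordered by severity."""
    NORMAL = 0   # normal or non-neoplastic changes
    CIN1 = 1     # CIN1, including HPV changes
    CIN2 = 2
    CIN3 = 3
    CANCER = 4   # invasive cancer


def final_status(colposcopy: Outcome, histology: Optional[Outcome] = None) -> Outcome:
    """Use histology when a biopsy was taken; otherwise fall back on colposcopy."""
    return histology if histology is not None else colposcopy


def is_diseased(status: Outcome, threshold: Outcome = Outcome.CIN2) -> bool:
    """True when the final status is at or above the threshold (CIN2+ by default)."""
    return status >= threshold


# A biopsy result overrides the colposcopic impression.
biopsied = final_status(Outcome.CIN1, histology=Outcome.CIN3)
# Without a biopsy, a negative colposcopy counts as a true negative.
not_biopsied = final_status(Outcome.NORMAL)
```

Changing `threshold` to `Outcome.CIN3` gives the CIN3+ outcome used in the later analyses.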

The technicians and doctors involved in the study were trained and reoriented at the beginning of the study and retrained and assessed periodically during the course of the study. Internal and external quality control measures were introduced in the pathology laboratories. Laboratory procedures and manuals were reviewed.

Statistical methods

We used meta-analytical methods to pool test accuracy measures from the different study sites. For each category of CIN, sensitivity, specificity, predictive values of the 5 tests, and the ratio of the sensitivity and specificity of VIA compared with the other tests were assessed using random effect models, allowing for intersetting heterogeneity.13, 14
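
As an illustration of random-effects pooling, the sketch below applies the DerSimonian-Laird estimator to the per-centre sensitivities of VIA for CIN2+. It is an assumption that this resembles the procedure of references 13 and 14: the estimator, the use of untransformed proportions and the function name `pool_random_effects` are choices made here for illustration, so the result need not reproduce the published pooled value exactly. The counts are the true positives and false negatives listed for VIA in Table II.

```python
import math

# (TP, FN) for VIA at the CIN2+ threshold, one pair per centre (Table II).
VIA_CIN2 = [(130, 34), (313, 76), (153, 15), (13, 7), (45, 5), (99, 62),
            (52, 19), (56, 8), (50, 31), (132, 17), (65, 16)]


def pool_random_effects(counts):
    """DerSimonian-Laird pooling of raw proportions.

    Returns (pooled, lower, upper, tau2), where tau2 is the estimated
    between-study variance. Assumes no study has a proportion of 0 or 1.
    """
    est = [tp / (tp + fn) for tp, fn in counts]
    var = [p * (1 - p) / (tp + fn) for p, (tp, fn) in zip(est, counts)]
    w = [1 / v for v in var]                                   # fixed-effect weights
    fixed = sum(wi * p for wi, p in zip(w, est)) / sum(w)
    q = sum(wi * (p - fixed) ** 2 for wi, p in zip(w, est))    # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(est) - 1)) / c)
    w_re = [1 / (v + tau2) for v in var]                       # random-effects weights
    pooled = sum(wi * p for wi, p in zip(w_re, est)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, tau2


pooled, lower, upper, tau2 = pool_random_effects(VIA_CIN2)
print(f"pooled sensitivity {pooled:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

A positive `tau2` indicates between-study variation beyond what sampling error alone would produce, which widens the confidence interval relative to a fixed-effect analysis.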

We also estimated the pooled absolute sensitivity and specificity using a 2-level model, with independent binomial distributions for the true positives and true negatives conditional on the sensitivity and specificity in each study, and a bivariate normal model for the between-study variation of the logarithms of the odds of sensitivity and specificity.15 This method incorporates the intrinsic negative correlation between sensitivity and specificity and allows for sparse data. The overall accuracy measure, the diagnostic odds ratio (DOR), defined as the odds of sensitivity divided by the odds of (1 − specificity), was estimated using Hierarchical Summary Receiver Operating Characteristic (HSROC) regression.16 Models were fitted with metandi, a Generalized Linear Latent and Mixed Models-based procedure developed by Harbord.17
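
The definition of the DOR can be made concrete with a short Python sketch. The function names are hypothetical. Note that plugging pooled sensitivity and specificity into the formula is not the same as fitting an HSROC model, so the value obtained this way is not expected to equal the fitted DORs reported in the Results.

```python
def odds(p: float) -> float:
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1.0 - p)


def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = odds of sensitivity / odds of (1 - specificity)."""
    return odds(sensitivity) / odds(1.0 - specificity)


def dor_from_counts(tp: int, fn: int, fp: int, tn: int) -> float:
    """Equivalent cross-product form for a single 2x2 table."""
    return (tp * tn) / (fp * fn)


# Pooled VIA sensitivity and specificity for CIN2+ (Table III).
dor_pooled = diagnostic_odds_ratio(0.792, 0.847)
# VIA in Bamako, counts from Table II.
dor_bamako = dor_from_counts(tp=130, fn=34, fp=494, tn=4894)
print(round(dor_pooled, 1), round(dor_bamako, 1))
```

A DOR of 1 corresponds to a test with no discriminating ability; larger values indicate better overall discrimination.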

The influence of age, study centre, and time period on study outcomes was assessed by SROC regression.18 Age was aggregated by 5-year groups and study period by tertiles, using date of screening or chronological rank ID. Study period was considered as a proxy for accumulated experience of the assessors.

The correlation between screening test and colposcopy results was assessed with Spearman's rank correlation coefficient.

All statistical analyses were performed with STATA, version 9 (Stata Corp., College Station, TX).13

Results

Test accuracy of VIA and VILI for CIN2+

The forest plots in Figure 1 display the variation in the sensitivity and specificity of VIA and VILI, with CIN2 or worse disease as the outcome. The sensitivity of VIA varied between 61.5% (CI 53.5–69.0%) in Calcutta 1 and 91.1% (CI 85.7–94.9%) in Conakry. The pooled sensitivity was 79.2% (CI 73.3–85.0%). The lowest specificity of VIA (74.2%; CI 72.2–76.1%) was observed in Ouagadougou, and the highest specificity (94.5%; CI 93.5–95.3%) was found in Niamey. High specificity was also observed in Conakry: 93.8% (CI 93.2–94.3%). The pooled specificity of VIA was 84.7% (CI 80.7–88.8%). In general, the sensitivity of VILI was higher than that of VIA, with the exception of Jaipur (87.5%; CI 76.8–83.2%) and Trivandrum 2 (80.2%; CI 70.9–88.3%), where the sensitivities were equal to those of VIA. The specificity values of VILI varied over a range similar to that of VIA, between 73.0 and 91.6%. The overall pooled sensitivity of VILI (91.2%; CI 87.8–94.6%) was statistically significantly higher than that of VIA. On the other hand, the pooled specificity of VILI (84.5%; CI 81.3–87.8%) was not significantly different from that of VIA.

Figure 1.

Forest plots of the sensitivity and specificity of visual inspection of the cervix after application of acetic acid (VIA) or Lugol's iodine (VILI) to detect CIN2 or more serious cervical disease. The diamond at the bottom of each forest plot represents the 95% confidence interval of the pooled measure computed from a random effect model.

In addition to sensitivity and specificity, Table II shows the predictive values, the prevalence of CIN2+ and the detection rates of CIN2+ in all settings where VIA and VILI were evaluated. The pooled prevalence of CIN2+ was 2.3% (CI 1.6–3.0%), with the lowest value of 0.8% noted in Niamey and the highest value of 3.3% in Trivandrum 1. This yielded pooled PPVs with wide, overlapping confidence intervals for VIA (11.6%; CI 8.1–15.1%) and VILI (12.9%; CI 8.0–17.9%). The NPVs of both inspection methods were less heterogeneous and in general higher than 99%. Moreover, the pooled NPV of VILI (99.8%; CI 99.7–99.9%) was significantly higher than that of VIA (99.4%; CI 99.2–99.6%). The test positivity rates of VIA and VILI varied widely but were highly correlated (Spearman's correlation coefficient = 0.93), ranging from 6.0 and 9.0%, observed in Niamey, to 27.4 and 28.7%, observed in Ouagadougou, for VIA and VILI, respectively.
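
The correlation between the positivity rates of the two tests can be checked with a small Python sketch. It ranks the rounded Test+ values listed in Table II for the 10 centres where both VIA and VILI were evaluated and computes the Pearson correlation of the ranks, which is Spearman's coefficient. The helper names are hypothetical, and the use of only these 10 centres is an assumption about how the published coefficient was obtained.

```python
# Test positivity rates from Table II, in the order: Bamako, Brazzaville, Conakry,
# Niamey, Ouagadougou, Calcutta 2, Jaipur, Mumbai, Trivandrum 1, Trivandrum 2.
VIA_POS = [0.112, 0.266, 0.079, 0.060, 0.274, 0.113, 0.256, 0.129, 0.242, 0.122]
VILI_POS = [0.131, 0.158, 0.113, 0.090, 0.287, 0.141, 0.276, 0.171, 0.213, 0.143]


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


rho = spearman(VIA_POS, VILI_POS)
print(round(rho, 2))  # 0.93 with these rounded rates
```

Only four centres change rank between the two tests, which is why the coefficient is so close to 1.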

Table II. Number of True and False Positive and Negative Results, Test Accuracy of VIA and VILI to Detect CIN2 or More Severe Cervical Neoplasia; Test Positivity Rate, Prevalence of CIN2+ and Detection Rate of CIN2+; Values from 11 Centres in Africa and India and Meta-Analytically Pooled Values
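| Test | Study | TP | FN | FP | TN | Se | Sp | PPV | NPV | Test+ | Prev | Det rate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VIA | Bamako | 130 | 34 | 494 | 4894 | 0.793 | 0.908 | 0.208 | 0.993 | 0.112 | 0.030 | 0.023 |
| VIA | Brazzaville | 313 | 76 | 1532 | 5014 | 0.805 | 0.766 | 0.170 | 0.985 | 0.266 | 0.056 | 0.045 |
| VIA | Conakry | 153 | 15 | 527 | 7932 | 0.911 | 0.938 | 0.225 | 0.998 | 0.079 | 0.019 | 0.018 |
| VIA | Niamey | 13 | 7 | 139 | 2375 | 0.650 | 0.945 | 0.086 | 0.997 | 0.060 | 0.008 | 0.005 |
| VIA | Ouagadougou | 45 | 5 | 516 | 1485 | 0.900 | 0.742 | 0.080 | 0.997 | 0.274 | 0.024 | 0.022 |
| VIA | Calcutta 1 | 99 | 62 | 1023 | 4710 | 0.615 | 0.822 | 0.088 | 0.987 | 0.190 | 0.027 | 0.017 |
| VIA | Calcutta 2 | 52 | 19 | 858 | 7151 | 0.732 | 0.893 | 0.057 | 0.997 | 0.113 | 0.009 | 0.006 |
| VIA | Jaipur | 56 | 8 | 1426 | 4296 | 0.875 | 0.751 | 0.038 | 0.998 | 0.256 | 0.011 | 0.010 |
| VIA | Mumbai | 50 | 31 | 467 | 3456 | 0.617 | 0.881 | 0.097 | 0.991 | 0.129 | 0.020 | 0.012 |
| VIA | Trivandrum 1 | 132 | 17 | 945 | 3363 | 0.886 | 0.781 | 0.123 | 0.995 | 0.242 | 0.033 | 0.030 |
| VIA | Trivandrum 2 | 65 | 16 | 514 | 4164 | 0.802 | 0.890 | 0.112 | 0.996 | 0.122 | 0.017 | 0.014 |
| VIA | TOTAL/Pooled | 1108 | 290 | 8441 | 48840 | 0.792 | 0.847 | 0.116 | 0.994 | 0.167 | 0.023 | 0.018 |
| VILI | Bamako | 159 | 5 | 568 | 4820 | 0.970 | 0.895 | 0.219 | 0.999 | 0.131 | | 0.029 |
| VILI | Brazzaville | 371 | 17 | 723 | 5823 | 0.956 | 0.890 | 0.339 | 0.997 | 0.158 | | 0.054 |
| VILI | Conakry | 163 | 5 | 811 | 7648 | 0.970 | 0.904 | 0.167 | 0.999 | 0.113 | | 0.019 |
| VILI | Niamey | 18 | 2 | 210 | 2304 | 0.900 | 0.916 | 0.079 | 0.999 | 0.090 | | 0.007 |
| VILI | Ouagadougou | 49 | 1 | 539 | 1462 | 0.980 | 0.731 | 0.083 | 0.999 | 0.287 | | 0.024 |
| VILI | Calcutta 2 | 57 | 13 | 1081 | 6925 | 0.814 | 0.865 | 0.050 | 0.998 | 0.141 | | 0.007 |
| VILI | Jaipur | 56 | 8 | 1543 | 4179 | 0.875 | 0.730 | 0.035 | 0.998 | 0.276 | | 0.010 |
| VILI | Mumbai | 60 | 21 | 626 | 3296 | 0.741 | 0.840 | 0.087 | 0.994 | 0.171 | | 0.015 |
| VILI | Trivandrum 1 | 136 | 13 | 813 | 3495 | 0.913 | 0.811 | 0.143 | 0.996 | 0.213 | | 0.031 |
| VILI | Trivandrum 2 | 65 | 16 | 615 | 4063 | 0.802 | 0.869 | 0.096 | 0.996 | 0.143 | | 0.014 |
| VILI | TOTAL/Pooled | 1134 | 101 | 7529 | 44015 | 0.912 | 0.845 | 0.129 | 0.998 | 0.172 | | 0.021 |

VIA, visual inspection after application of acetic acid; VILI, visual inspection after application of Lugol's iodine; CIN, cervical intra-epithelial neoplasia; TP/FN/FP/TN, number of true positives, false negatives, false positives and true negatives; Se, sensitivity; Sp, specificity; PPV, positive predictive value; NPV, negative predictive value; Test+, test positivity rate; Prev, prevalence of CIN2+; Det rate, detection rate of CIN2+.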
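
For readers who want to recompute the per-centre measures, the sketch below derives them from the four cell counts of a 2x2 table. The function name is hypothetical, and the formulas for Test+, Prev and Det rate are inferred from the abbreviations in the footnote of Table II (they reproduce the tabulated Bamako VIA row when rounded to 3 decimals).

```python
def accuracy_measures(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Accuracy measures for one screening test in one centre."""
    n = tp + fn + fp + tn
    return {
        "Se": tp / (tp + fn),       # sensitivity
        "Sp": tn / (tn + fp),       # specificity
        "PPV": tp / (tp + fp),      # positive predictive value
        "NPV": tn / (tn + fn),      # negative predictive value
        "Test+": (tp + fp) / n,     # test positivity rate
        "Prev": (tp + fn) / n,      # prevalence of the outcome
        "Det rate": tp / n,         # detection rate
    }


# VIA in Bamako, CIN2+ outcome (Table II).
bamako_via = accuracy_measures(tp=130, fn=34, fp=494, tn=4894)
for name, value in bamako_via.items():
    print(f"{name}: {value:.3f}")
```

The pooled rows of Table II come from the meta-analytical models and cannot be obtained by applying these formulas to the summed counts.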

Summary of test accuracy of all screening tests for all categories of CIN

The sensitivity and specificity for all tests and outcomes are summarized in Table III. The sensitivity rose substantially with increasing severity of the outcome (>22% difference in sensitivity between CIN1+ and cancer), whereas the specificity decreased slightly (≤3% difference in specificity between CIN1+ and cancer). All accuracy measures showed statistically significant interstudy heterogeneity (p for Cochran's Q test <0.01), with the exception of the sensitivity of the HC2 test for the outcomes CIN2+, CIN3+ and cancer, which was statistically homogeneous (p for Cochran's Q test >0.2).

Table III. Sensitivity and Specificity of 5 Screening Tests for CIN1 or More Severe Disease (CIN1+), CIN2+, CIN3+ and Cancer; Minimum, Maximum and Meta-Analytically Pooled Measures
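| Test | Outcome | Test cutoff | Sensitivity min | Sensitivity max | Sensitivity pooled (95% CI) | Specificity min | Specificity max | Specificity pooled (95% CI) |
|---|---|---|---|---|---|---|---|---|
| VIA | CIN1+ | AW lesions or growth | 0.425 | 0.900 | 0.618 (0.523–0.713) | 0.752 | 0.951 | 0.865 (0.828–0.901) |
| VIA | CIN2+ | AW lesions or growth | 0.650 | 0.911 | 0.792 (0.733–0.850) | 0.742 | 0.945 | 0.847 (0.807–0.888) |
| VIA | CIN3+ | AW lesions or growth | 0.583 | 0.946 | 0.829 (0.771–0.887) | 0.738 | 0.943 | 0.842 (0.800–0.883) |
| VIA | Cancer | AW lesions or growth | 0.667 | 1.000 | 0.887 (0.831–0.943) | 0.731 | 0.941 | 0.836 (0.793–0.880) |
| VILI | CIN1+ | Non iodine uptake yellow areas or growth | 0.503 | 0.941 | 0.737 (0.630–0.845) | 0.741 | 0.928 | 0.866 (0.834–0.898) |
| VILI | CIN2+ | Non iodine uptake yellow areas or growth | 0.741 | 0.980 | 0.912 (0.878–0.946) | 0.730 | 0.916 | 0.845 (0.813–0.878) |
| VILI | CIN3+ | Non iodine uptake yellow areas or growth | 0.729 | 1.000 | 0.938 (0.906–0.971) | 0.726 | 0.914 | 0.838 (0.805–0.871) |
| VILI | Cancer | Non iodine uptake yellow areas or growth | 0.667 | 1.000 | 0.957 (0.918–0.997) | 0.719 | 0.911 | 0.832 (0.798–0.865) |
| VIAM | CIN1+ | AW lesions or growth | 0.425 | 0.684 | 0.585 (0.432–0.739) | 0.864 | 0.901 | 0.881 (0.858–0.904) |
| VIAM | CIN2+ | AW lesions or growth | 0.646 | 0.732 | 0.670 (0.618–0.722) | 0.833 | 0.893 | 0.862 (0.824–0.900) |
| VIAM | CIN3+ | AW lesions or growth | 0.657 | 0.744 | 0.682 (0.618–0.747) | 0.828 | 0.891 | 0.859 (0.820–0.898) |
| VIAM | Cancer | AW lesions or growth | 0.763 | 1.000 | 0.826 (0.677–0.976) | 0.824 | 0.889 | 0.855 (0.815–0.896) |
| Pap smear | CIN1+ | ASCUS+ | 0.230 | 0.655 | 0.343 (0.153–0.532) | 0.866 | 0.987 | 0.946 (0.915–0.977) |
| Pap smear | CIN2+ | ASCUS+ | 0.333 | 0.819 | 0.570 (0.376–0.763) | 0.865 | 0.985 | 0.928 (0.887–0.968) |
| Pap smear | CIN3+ | ASCUS+ | 0.356 | 0.964 | 0.630 (0.379–0.882) | 0.863 | 0.982 | 0.923 (0.881–0.966) |
| Pap smear | Cancer | ASCUS+ | 0.400 | 1.000 | 0.725 (0.549–0.900) | 0.857 | 0.977 | 0.918 (0.875–0.962) |
| Pap smear | CIN1+ | LSIL+ | 0.172 | 0.633 | 0.306 (0.112–0.499) | 0.929 | 0.993 | 0.967 (0.948–0.985) |
| Pap smear | CIN2+ | LSIL+ | 0.238 | 0.779 | 0.512 (0.300–0.724) | 0.886 | 0.991 | 0.949 (0.921–0.977) |
| Pap smear | CIN3+ | LSIL+ | 0.267 | 0.893 | 0.561 (0.327–0.796) | 0.873 | 0.988 | 0.945 (0.916–0.975) |
| Pap smear | Cancer | LSIL+ | 0.200 | 1.000 | 0.651 (0.432–0.871) | 0.865 | 0.983 | 0.941 (0.910–0.971) |
| Pap smear | CIN2+ | HSIL+ | 0.175 | 0.617 | 0.426 (0.265–0.586) | 0.977 | 0.997 | 0.993 (0.988–0.997) |
| Pap smear | CIN3+ | HSIL+ | 0.222 | 0.768 | 0.516 (0.320–0.711) | 0.975 | 0.997 | 0.990 (0.984–0.995) |
| Pap smear | Cancer | HSIL+ | 0.200 | 1.000 | 0.651 (0.432–0.871) | 0.973 | 0.995 | 0.985 (0.978–0.993) |
| HC2 | CIN1+ | RLU>1 | 0.215 | 0.337 | 0.266 (0.215–0.316) | 0.922 | 0.951 | 0.940 (0.929–0.951) |
| HC2 | CIN2+ | RLU>1 | 0.484 | 0.677 | 0.619 (0.562–0.677) | 0.916 | 0.946 | 0.936 (0.924–0.948) |
| HC2 | CIN3+ | RLU>1 | 0.623 | 0.735 | 0.684 (0.615–0.754) | 0.914 | 0.944 | 0.934 (0.922–0.946) |
| HC2 | Cancer | RLU>1 | 0.615 | 0.857 | 0.721 (0.603–0.838) | 0.911 | 0.940 | 0.930 (0.918–0.942) |

VIA, visual inspection after application of acetic acid; VILI, visual inspection after application of Lugol's iodine; VIAM, visual inspection after application of acetic acid and using a magnifying loupe; HC2, Hybrid Capture-2 assay; CIN, cervical intra-epithelial neoplasia; AW lesion, aceto-white lesion; ASCUS+, atypical squamous cells of unspecified significance; LSIL, low grade squamous intraepithelial lesion; HSIL, high-grade intraepithelial lesion; RLU, relative light units.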

HSROC curves

Figure 2 displays the joint variation in observed sensitivity and specificity in each study, the summary ROC curve fitted by HSROC regression, and the pooled sensitivity–specificity point with its 95% confidence area, for VIA and VILI and for the outcomes CIN2+ and CIN3+. Pooled sensitivities and specificities were very similar to those obtained by the separate meta-analyses shown in Figure 1 and Table III. The SROC curves and summary points for VILI lay closer to the upper-left corner than those for VIA. The fitted DORs for CIN2+ and CIN3+ were, respectively, 25.0 and 64.7 for VIA, and 30.3 and 88.5 for VILI.

Figure 2.

HSROC curves displaying the fitted sensitivity as a function of specificity of VIA (at left) and VILI (at right) with respect to detection of underlying CIN2 or worse (top) or CIN3 or worse (bottom). Individual studies are represented by small circles. The square corresponds with the pooled sensitivity and specificity and the area surrounded by an interrupted line corresponds with the 95% confidence ellipse.

Pooled relative accuracy of the screening tests

The forest plots in Figure 3 display the relative sensitivity and specificity of VILI compared with VIA with respect to the prediction of CIN2+ or CIN3+. The relative accuracy of the other screening tests is documented in Table IV. Overall, the sensitivity of VILI for CIN2+ and CIN3+ was, respectively, 10.5% (CI 4.8–16.5%) and 7.4% (CI 4.3–10.6%) higher than the sensitivity of VIA. The sensitivity of VILI relative to VIA was considerably higher in the African studies. The specificities of the two tests were not statistically significantly different. The accuracy of VIAM was similar to that of VIA. The Pap smear had a significantly lower sensitivity for CIN2+ than VIA, even at the lowest cytological cutoff of atypical squamous cells of undetermined significance or worse (relative sensitivity = 0.742; CI 0.576–0.958), but also showed a significantly higher specificity, and this difference rose with the test threshold. The sensitivity of HC2 was lower than that of VIA, but this difference did not reach statistical significance. On the other hand, the specificity of HC2 was 7 to 8% higher, and this difference was significant. For all histological outcomes and cytological cutoffs, the Pap smear was significantly less sensitive but more specific than VILI. The HC2 assay also had lower sensitivity than VILI, but this finding was significant only for CIN2+. On the other hand, the specificity of the HPV test was significantly higher. The Pap smear showed a lower sensitivity and a higher specificity than the HC2 test. The difference in sensitivity was never significant, but the difference in specificity was significant when LSIL+ or HSIL+ was used as the cutoff.

Figure 3.

Forest plots of the relative sensitivity and specificity of VILI compared with VIA for the prediction of cervical intraepithelial neoplasia, grade 2 or more severe neoplasia (CIN2+) and for CIN3+.

Table IV. Relative Sensitivity and Specificity of VIA, VILI, VIAM, Cytology (At Cutoffs ASCUS, LSIL and HSIL) and the HC2 Assay Versus Another Screening Test to Detect CIN2 or CIN3 or More Serious Cervical Disease
| Test combination | Outcome | Relative sensitivity pooled (95% CI) | Relative specificity pooled (95% CI) |
|---|---|---|---|
| VILI/VIA | CIN2+ | 1.105 (1.048–1.165)* | 0.996 (0.962–1.031) |
| VILI/VIA | CIN3+ | 1.074 (1.043–1.106)* | 0.994 (0.962–1.027) |
| VIAM/VIA | CIN2+ | 1.036 (0.926–1.159) | 0.997 (0.978–1.016) |
| VIAM/VIA | CIN3+ | 1.028 (0.897–1.180) | 0.997 (0.978–1.016) |
| Pap smear (ASCUS+)/VIA | CIN2+ | 0.742 (0.576–0.958)* | 1.125 (1.072–1.181)* |
| Pap smear (ASCUS+)/VIA | CIN3+ | 0.795 (0.613–1.032)* | 1.127 (1.073–1.183)* |
| Pap smear (LSIL+)/VIA | CIN2+ | 0.659 (0.491–0.884)* | 1.152 (1.099–1.208)* |
| Pap smear (LSIL+)/VIA | CIN3+ | 0.708 (0.535–0.937)* | 1.154 (1.099–1.211)* |
| Pap smear (HSIL+)/VIA | CIN2+ | 0.552 (0.400–0.763)* | 1.205 (1.132–1.284)* |
| Pap smear (HSIL+)/VIA | CIN3+ | 0.660 (0.499–0.872)* | 1.209 (1.133–1.291)* |
| HC2/VIA | CIN2+ | 0.883 (0.775–1.007) | 1.074 (1.051–1.097)* |
| HC2/VIA | CIN3+ | 0.956 (0.781–1.169) | 1.075 (1.051–1.099)* |
| Pap smear (ASCUS+)/VILI | CIN2+ | 0.735 (0.570–0.948)* | 1.162 (1.092–1.236)* |
| Pap smear (ASCUS+)/VILI | CIN3+ | 0.796 (0.618–1.024) | 1.164 (1.094–1.239)* |
| Pap smear (LSIL+)/VILI | CIN2+ | 0.668 (0.500–0.892)* | 1.176 (1.096–1.261)* |
| Pap smear (LSIL+)/VILI | CIN3+ | 0.721 (0.550–0.945)* | 1.178 (1.097–1.265)* |
| Pap smear (HSIL+)/VILI | CIN2+ | 0.552 (0.392–0.777)* | 1.228 (1.144–1.318)* |
| Pap smear (HSIL+)/VILI | CIN3+ | 0.676 (0.518–0.882)* | 1.233 (1.145–1.327)* |
| HC2/VILI | CIN2+ | 0.834 (0.740–0.939)* | 1.097 (1.085–1.110)* |
| HC2/VILI | CIN3+ | 0.857 (0.714–1.028) | 1.098 (1.084–1.112)* |
| Pap smear (ASCUS+)/HC2 | CIN2+ | 0.957 (0.825–1.109) | 1.008 (0.956–1.063) |
| Pap smear (ASCUS+)/HC2 | CIN3+ | 0.915 (0.742–1.130) | 1.008 (0.956–1.064) |
| Pap smear (LSIL+)/HC2 | CIN2+ | 0.870 (0.723–1.047) | 1.037 (1.013–1.061)* |
| Pap smear (LSIL+)/HC2 | CIN3+ | 0.816 (0.609–1.093) | 1.037 (1.014–1.060)* |
| Pap smear (HSIL+)/HC2 | CIN2+ | 0.786 (0.562–1.099) | 1.061 (1.052–1.070)* |
| Pap smear (HSIL+)/HC2 | CIN3+ | 0.974 (0.756–1.256) | 1.061 (1.052–1.071)* |

VIA, visual inspection after application of acetic acid; VILI, visual inspection after application of Lugol's iodine; VIAM, visual inspection after application of acetic acid and using a magnifying loupe; HC2, Hybrid Capture-2 assay; CIN, cervical intra-epithelial neoplasia; AW lesion, aceto-white lesion; ASCUS+, atypical squamous cells of unspecified significance; LSIL, low grade squamous intraepithelial lesion; HSIL, high-grade intraepithelial lesion.

* 95% confidence interval (CI) does not include unity, indicating statistically different accuracy. Relative sensitivity or specificity >1 indicates higher sensitivity or specificity of the first test; <1 indicates lower sensitivity or specificity of the first test.

Sources of variation of the diagnostic performance

There was no statistically significant variation of the DOR for the outcome of CIN2+ by age group.

The DOR of VIA increased with study period and also varied significantly by setting. The DORs were also higher in the Calcutta 2 and Trivandrum 2 studies than in the Calcutta 1 and Trivandrum 1 studies.

The DORs of VILI, VIAM and cytology at cutoff LSIL+ were significantly higher in the third period compared with the first, but there was no significant difference between the first and second period. There was a country effect for VILI with significantly higher DORs in Congo, Mali, Guinea and Niger compared with India. The DOR of VIAM and cytology varied significantly among the Indian settings where the tests were evaluated.

HPV testing with HC2 was the only screening method whose accuracy did not vary by period. Nevertheless, a significant variation by setting was observed.

For the outcome CIN3+, SROC regressions showed similar results, except for HC2, for which there was no longer a significant setting effect.

Readers can find tables containing the coefficients of the SROC regressions in http://www.iph.fgov.be/epidemio/epien/cervixen/ACCP_Meta9a.pdf.

Correlation between tests

The Spearman rank correlations between the respective tests and colposcopy are shown in Table V. High correlations (ρ > 0.60) were observed between each of the visual screening tests and colposcopy.

Table V. Correlation Between Results of Screening Tests and Colposcopy Assessed with the Rank Correlation Coefficient of Spearman (ρ)
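| | VILI | VIAM | HC2 | Cytology (LSIL+) | Colposcopy |
|---|---|---|---|---|---|
| VIA | 0.69 | 0.86 | 0.10 | 0.17 | 0.61 |
| VILI | | 0.72 | 0.08 | 0.19 | 0.74 |
| VIAM | | | 0.08 | 0.07 | 0.63 |
| HC2 | | | | 0.21 | 0.13 |
| Cytology | | | | | 0.26 |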

Discussion

The appropriateness of a screening test depends not only on its accuracy, as measured mainly by sensitivity and specificity but also on its simplicity and safety. Additionally, the PPV and NPV can be used to measure the suitability of the screening tests, but these 2 measures greatly depend on the prevalence of the disease. All these measures of test characteristics can be evaluated, provided that a suitable reference test is carried out, to distinguish the subjects who are truly diseased from those who are not.
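
The dependence of the predictive values on prevalence follows from Bayes' theorem and can be illustrated with a short Python sketch. The function name is hypothetical. The inputs are the pooled VIA sensitivity and specificity for CIN2+; because the published pooled PPV and NPV were pooled separately across studies, the plug-in values computed here are only approximations of them.

```python
def predictive_values(se: float, sp: float, prev: float):
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    ppv = se * prev / (se * prev + (1 - sp) * (1 - prev))
    npv = sp * (1 - prev) / (sp * (1 - prev) + (1 - se) * prev)
    return ppv, npv


SE_VIA, SP_VIA = 0.792, 0.847  # pooled VIA accuracy for CIN2+

# Pooled prevalence observed in the studies (2.3%).
ppv_low, npv_low = predictive_values(SE_VIA, SP_VIA, prev=0.023)
# A hypothetical higher-prevalence population, chosen only for contrast.
ppv_high, npv_high = predictive_values(SE_VIA, SP_VIA, prev=0.10)

print(f"prevalence 2.3%: PPV {ppv_low:.3f}, NPV {npv_low:.3f}")
print(f"prevalence 10%:  PPV {ppv_high:.3f}, NPV {npv_high:.3f}")
```

With sensitivity and specificity held fixed, the PPV rises and the NPV falls as prevalence increases, which is why predictive values from these studies do not transfer directly to populations with a different disease burden.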

Our study allows comparative evaluation of the accuracy of 5 different screening tests, VIA, VILI, VIAM, conventional cytology and HPV testing, in the detection of cervical neoplasia. Assessment of the test characteristics of these screening tests in different settings is of considerable importance, as they are useful in determining screening policy decisions. However, there are very few studies from the developing world that permit the test characteristics of these screening tests to be established with minimal verification bias. In our study, the reference standard consisted of histology or colposcopy if no histology result was available. All women, irrespective of the screening test result, underwent a colposcopic examination and biopsies were directed in those who had colposcopic abnormalities.

The analyses from this study show that screening with VIA or VILI can detect cervical cancer and its precursors with an accuracy as good as, or even better than, that of the standard Pap smear or of testing for high-risk HPV with the HC2 assay. However, the interstudy variation of the VIA and VILI accuracy parameters was wide. Similar sensitivity (83%) and specificity (89%) for VIA were reported in a recent study in Mongolia19 and in Kenya20 (sensitivity 73% and specificity 80%). Studies by other researchers have shown similar test sensitivities but far lower test specificities21–24 or test specificities similar to ours but with far lower sensitivities.25–27 Lower sensitivity and specificity have been reported by other workers for VILI.27, 28 This inconsistency across studies reflects the considerable subjectivity in the interpretation of visual tests by different providers, resulting from different levels of competence, training methods, monitoring and quality assurance, and also reflects the low reproducibility of visual inspection methods. The accuracy of VIA increased significantly with study phase. It also increased in Trivandrum 2 and Calcutta 2, where the same teams as in Trivandrum 1 and Calcutta 1 did the examinations. These findings underline the need for experience, continuous training and supervision.

VILI was 10% more sensitive than VIA for detecting CIN2+ but had the same specificity, and thus appears to be the preferred method for detecting high-grade CIN in developing countries. The reported sensitivity for VILI in other studies has generally been lower.27, 28 VILI requires further evaluation by more providers in other settings.

HC2 showed an unexpectedly low sensitivity (62%) for high-grade CIN, with a rather high specificity (94%), in the 4 Indian settings. In all reported studies from other developing countries where the accuracy of HPV testing by HC2 was documented without verification bias, a sensitivity exceeding 80% has been observed,20, 29–32 and in studies in the developed world a sensitivity consistently exceeding 90% has been reported.33–36

Possible explanations for this low sensitivity in our study include contamination of the sample by acetic acid or Lugol's iodine, or deterioration of the sample through exposure to high temperatures. Contamination should not normally have occurred since, according to the protocol, the sample for HC2 was collected before application of the vinegar or iodine solution. A laboratory testing problem is also improbable, given the high concordance between the Indian results and those obtained in a specialized French virology laboratory on a random subsample.9 The presence of other HPV types not included in the hrHPV DNA probe cocktail is another possibility. However, recent HPV type distribution studies and case–control studies conducted in India do not provide evidence for this hypothesis.37, 38 Finally, misclassification of the outcome could also explain the low observed sensitivity of HC2.

The interstudy variation in HC2 test accuracy was narrow and most often nonsignificant, which most probably reflects high reproducibility, independent of training or experience.

Among all evaluated tests, cytology showed the lowest sensitivity, even at the lowest cytological cutoff (57% for CIN2+). This adds further to the inconsistent results of cytology observed in low resource settings in which repeated cytology testing is difficult due to logistic problems.

It is known that colposcopy followed by biopsy taken from colposcopically suspect lesions is not a perfect gold standard.39, 40 In this study, all visual inspection methods yielded highly correlated results, which might explain the high apparent accuracy (sensitivity and specificity) of the visual methods. The test sensitivity of colposcopy itself was not evaluated in the Alliance for Cervical Cancer Prevention trials. The sensitivity of colposcopy-directed biopsy for CIN2+ in women with satisfactory colposcopy was only 57% in a Chinese study where multiple random biopsies were taken from all tested women.41 Pretorius showed that it is possible that the sensitivity of VIA is overestimated if colposcopically directed biopsy and VIA miss similar small lesions.42 Moreover, suboptimal blinding of gold standard verification in certain settings may have occurred, for instance, in Conakry where outlying high sensitivity and specificity of VIA were observed. Histological interpretation of small punch biopsies is subjective. Over-interpretation of CIN lesions, which in fact were not CIN2+, but VIA or VILI positive and HC2 negative, could explain the apparent high sensitivity of the former and low sensitivity of the latter. In a recent reevaluation of a diagnostic study on cervical cancer screening tests, conducted in Zimbabwe, including correction for gold standard misclassification yielded substantially higher estimates of the sensitivity of HC2 and lower for VIA, compared with original estimates based on colposcopy-based biopsies.43 It looks plausible that gold standard misclassification was less evident in Indian sites where providers had more experience in carrying out both the screening and confirmatory tests than their African counterparts. 
For future studies, it is clear that higher standards for disease confirmation are needed, such as p16 immunostaining of histological preparations, strict blinding of assessors, quality review by highly experienced colposcopists and histologists on random sub-samples, taking multiple random biopsies and, last but not least, robust statistical methods that adjust for misclassification and verification biases.

It was recently demonstrated in a randomized population trial that once-in-a-lifetime VIA screening reduces the incidence of cervical cancer by 25% and cause-specific mortality by 35%.44 The major finding of our multicentre study was that VILI was more sensitive than, and as specific as, VIA. This might generate further hope for improving screening with simple and affordable technologies, which challenge other promising innovative preventive policies such as screening with a simple and cheap HPV testing method and HPV vaccination.45, 46

Acknowledgements

The authors acknowledge the support received from the Bill & Melinda Gates Foundation (Seattle, WA) through the Alliance for Cervical Cancer Prevention (ACCP), the DWTC/SSTC (Federal Services for Science, Culture and Technology, Brussels, Belgium) and the Gynaecological Cancer Cochrane Review Collaboration (Bath, UK).
