p16INK4a immunocytochemistry versus human papillomavirus testing for triage of women with minor cytologic abnormalities †‡
A systematic review and meta-analysis
Version of Record online: 14 JUN 2012
Copyright © 2012 American Cancer Society
Volume 120, Issue 5, pages 294–307, 25 October 2012
How to Cite
Roelens, J., Reuschenbach, M., von Knebel Doeberitz, M., Wentzensen, N., Bergeron, C. and Arbyn, M. (2012), p16INK4a immunocytochemistry versus human papillomavirus testing for triage of women with minor cytologic abnormalities. Cancer Cytopathology, 120: 294–307. doi: 10.1002/cncy.21205
The authors acknowledge M. Nasioutziki, M. Guo, A. Szarewski, J. Cuzick and J. Monsonego for the provision of additional data as well as L. Houthuys for bibliographic support.
See related Commentary on pages 291–293, this issue.
- Issue online: 12 OCT 2012
- Manuscript Accepted: 1 FEB 2012
- Manuscript Revised: 25 JAN 2012
- Manuscript Received: 6 DEC 2011
- cervical cancer;
- cervical intraepithelial neoplasia;
- atypical squamous cells of undetermined significance;
- low-grade squamous intraepithelial lesions;
- human papillomavirus testing;
- diagnostic accuracy;
- systematic review
The best method for identifying women who have minor cervical lesions that require diagnostic workup remains unclear. The authors of this report performed a meta-analysis to assess the accuracy of cyclin-dependent kinase inhibitor 2A (p16INK4a) immunocytochemistry compared with high-risk human papillomavirus DNA testing with Hybrid Capture 2 (HC2) to detect grade 2 or greater cervical intraepithelial neoplasia (CIN2+) and CIN3+ among women who had cervical cytology indicating atypical squamous cells of undetermined significance (ASC-US) or low-grade cervical lesions (LSIL). A literature search was performed in 3 electronic databases to identify studies that were eligible for this meta-analysis. Seventeen studies were included in the meta-analysis. The pooled sensitivity of p16INK4a to detect CIN2+ was 83.2% (95% confidence interval [CI], 76.8%-88.2%) and 83.8% (95% CI, 73.5%-90.6%) in ASC-US and LSIL cervical cytology, respectively, and the pooled specificities were 71% (95% CI, 65%-76.4%) and 65.7% (95% CI, 54.2%-75.6%), respectively. Eight studies provided both HC2 and p16INK4a triage data. p16INK4a and HC2 had similar sensitivity, and p16INK4a had significantly higher specificity in the triage of women with ASC-US (relative sensitivity, 0.95 [95% CI, 0.89-1.01]; relative specificity, 1.82 [95% CI, 1.57-2.12]). In the triage of LSIL, p16INK4a had significantly lower sensitivity but higher specificity compared with HC2 (relative sensitivity, 0.87 [95% CI, 0.81-0.94]; relative specificity, 2.74 [95% CI, 1.99-3.76]). The published literature indicated the improved accuracy of p16INK4a compared with HC2 testing in the triage of women with ASC-US. In LSIL triage, p16INK4a was more specific but less sensitive. Cancer (Cancer Cytopathol) 2012. © 2012 American Cancer Society.
Cervical cancer is the third most common cancer in women worldwide. It is estimated that, in 2008, approximately 530,000 women developed cervical cancer and that 275,000 died of the disease.1 A well organized screening for and management of precancerous lesions could reduce the incidence of cervical cancer.2 Women with high-grade cervical abnormalities should be referred immediately to colposcopy or treatment. However, the optimal management of women with atypical squamous cells of undetermined significance (ASC-US) or low-grade squamous intraepithelial lesions (LSIL) remains elusive and continues to be the object of intensive research.
Testing for carcinogenic human papillomavirus (HPV) DNA has been proposed as a triage method for identifying women who are at increased risk of cervical cancer precursors and cervical cancer. Numerous clinical studies, most prominently the ASC-US/LSIL Triage Study (ALTS),3 and a meta-analysis4 indicated that the Hybrid Capture 2 (HC2) assay has improved accuracy (higher sensitivity, similar specificity) compared with repeat Papanicolaou (Pap) testing to detect grade 2 or greater cervical intraepithelial neoplasia (CIN2+) in women with ASC-US cytology. However, for LSIL, the possible advantages of HPV triage remain unclear.5 LSIL is the morphologic correlate of a productive HPV infection.6 Therefore, HPV-DNA testing nearly always yields positive results and cannot provide additional risk stratification to distinguish between women with and without underlying or developing high-grade lesions.7
Considerable research is devoted to developing objective biomarkers that can distinguish transforming from productive HPV infections and predict disease severity. The cellular tumor suppressor protein cyclin-dependent kinase inhibitor 2A (p16INK4a) has been identified as a biomarker for transforming HPV infections. It is a cyclin-dependent kinase inhibitor that decelerates the cell cycle by inactivating the cyclin-dependent kinases (CDK4/CDK6) involved in the phosphorylation of the retinoblastoma protein (pRb).8 In the presence of the high-risk HPV (hrHPV) oncogene E7, p16INK4a transcription is induced by the histone demethylase KDM6B.9 Consequently, p16INK4a protein accumulates in the cell, and this accumulation can be considered a surrogate marker of a transforming infection.10, 11
Recently, an immunocytochemical, dual-staining protocol that simultaneously detects p16INK4a and Ki-67 expression was established. The simultaneous detection of p16INK4a overexpression with the proliferation marker Ki-67 within the same cervical epithelial cell indicates deregulation of the cell cycle and does not require morphology-based interpretations.12
A previous meta-analysis demonstrated the correlation between the frequency of p16INK4a overexpression and the severity of preneoplastic cervical lesions in cellular and tissue specimens.13 No hypotheses regarding the clinical applications of p16INK4a immunostaining were addressed in that systematic review.13 Establishing a correlation between p16INK4a expression and the severity of cancer precursors is a first step in the generation of evidence for potential clinical applications in screening for cervical cancer or in the management of screen-positive women.14 Therefore, we conducted the current meta-analysis to explore the performance of p16INK4a immunocytochemistry in the triage of women with minor cytologic cervical lesions.
MATERIALS AND METHODS
Population-Index Test-Comparator Test-Outcomes-Studies Question
Before we conducted a literature search, a clinical question and corresponding “Population-Index Test-Comparator Test-Outcomes-Studies” (PICOS) question were defined as follows: Can p16INK4a be used to identify women with minor cytologic abnormalities who need referral to colposcopy? Is it better than repeat cytology, HPV testing (HC2, other HPV assays), or other biomarkers? In other words, is p16INK4a immunocytochemistry a good triage test to manage women with ASC-US or LSIL?
Three electronic databases were searched: PubMed-MedLine, EMBASE, and CENTRAL. The following search string was used in PubMed-MedLine: (cervix OR cervical OR vaginal) AND (cancer OR carcinoma OR dysplas* OR neoplasm* OR CIN OR SIL OR “Pap smear” or cytology) AND (p16* OR p16INK4a OR protein p16 OR p16 protein). No language or publication date restrictions were applied.
The references from the retrieved articles were hand-searched to identify other eligible studies. Eligibility according to the inclusion and exclusion criteria was verified independently by 2 investigators (J.R. and M.R.). When no consensus could be reached, a third investigator was involved (M.A.). Data were extracted by J.R. and checked by M.A.
Inclusion and Exclusion of Studies
We included all studies that assessed p16INK4a immunostaining or p16INK4a/Ki-67 dual staining with or without HC2 testing as a comparator test on liquid-based cytology or conventional cytology specimens which had ASC-US or LSIL cytology in which the diagnosis was verified with a reference standard. Studies were excluded if the population included <20 women who had ASC-US or LSIL cytology. If the data were not separated according to ASC-US or LSIL cytology, then separate data were requested from the authors. When the authors did not respond, the studies were excluded. When duplicate publications of the same studies were identified, the most comprehensive study was included.
Two groups of participants were considered: women with equivocal cervical lesions or ASC-US (triage group I) and women with low-grade cytologic lesions or LSIL (triage group II). For the first group, we considered women who had atypical squamous cells of undetermined significance (ASCUS), which was defined according to the 1988 version of the Bethesda System.15 For studies that used 2001 Bethesda System criteria, only the data on women with ASC-US were extracted. Studies that reported data exclusively on atypical squamous cells (ASC)-favor reactive, or ASC-cannot exclude high-grade squamous intraepithelial lesion, or atypical glandular cells were excluded. For this meta-analysis, only 1 term, “ASC-US,” was used for both versions of the Bethesda System.
For the second group, we considered only women with LSIL. Studies that used the terminology of the British Society of Clinical Cytology16 were translated into the 1988 Bethesda System. The British Society of Clinical Cytology terms borderline and mild dyskaryosis were considered similar to ASCUS and LSIL, respectively.17
Types of Outcome Measures
Outcome measures were defined before the literature search. The primary outcome was the absolute sensitivity and specificity of p16INK4a immunocytochemistry to detect underlying disease (CIN2+ or CIN3+/adenocarcinoma in situ) in the triage of women with equivocal or low-grade cytologic abnormalities. The secondary outcome was the relative sensitivity and specificity of p16INK4a immunostaining versus hrHPV testing in studies with comparator testing.
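The outcome definitions above can be made concrete with a small sketch. The code below computes absolute sensitivity and specificity from a 2 × 2 table and then forms the crude ratios between an index test and a comparator. The class name, function names, and all counts are hypothetical and chosen only for illustration; the pooled relative estimates reported in this article come from a hierarchical model, not from a simple ratio of this kind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-classification of one triage test against the reference standard."""
    tp: int  # test positive, disease (e.g. CIN2+) present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent


def sensitivity(t: TwoByTwo) -> float:
    """Proportion of diseased women who test positive."""
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Proportion of disease-free women who test negative."""
    return t.tn / (t.tn + t.fp)


def relative_accuracy(index: TwoByTwo, comparator: TwoByTwo) -> tuple[float, float]:
    """Return (relative sensitivity, relative specificity) of index vs comparator."""
    return (
        sensitivity(index) / sensitivity(comparator),
        specificity(index) / specificity(comparator),
    )


# Hypothetical counts for the same 200 women tested with both assays
# (50 with CIN2+, 150 without); these are NOT data from any included study.
p16 = TwoByTwo(tp=40, fp=30, fn=10, tn=120)
hc2 = TwoByTwo(tp=45, fp=90, fn=5, tn=60)

rel_sens, rel_spec = relative_accuracy(p16, hc2)
print(f"relative sensitivity = {rel_sens:.2f}, relative specificity = {rel_spec:.2f}")
```

A ratio below 1 for sensitivity means that the index test misses more disease than the comparator; a ratio above 1 for specificity means that it refers fewer disease-free women.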
We considered the following categories of reference standards: 1) colposcopy and either large loop excision of the transformation zone or conization on all women; 2) colposcopy, punch biopsies of colposcopically suspicious areas, and random biopsies of colposcopic normal zones on all women; 3) colposcopy and more than 1 biopsy on all women (type of biopsy unknown); 4) colposcopy and 1 or more biopsies of colposcopic suspected zone (with women considered free of CIN2+ if colposcopy is negative); 5) colposcopy and/or biopsy on all women (no further information); and 6) retrospective collection of biopsy/histology data.
Data Extraction and Statistical Analyses
Study characteristics and covariates that could influence study outcomes were tabled: primary p16INK4a antibody used, reference standard, and positivity criterion for p16INK4a. The Quality Assessment of Diagnostic Accuracy-Studies (QUADAS) checklist for evaluating the quality of diagnostic test studies was used as a tool to evaluate the quality of the studies.18 The most important quality items that were reviewed in the QUADAS checklist were the acceptability of the reference standard, the delay between tests, blinding of results, incorporation bias, and verification bias.18
The absolute accuracy of p16INK4a immunocytochemistry and hrHPV testing was pooled using the Stata-10 procedure metandi (Stata Corp., College Station, Tex). This is a 2-level, mixed logistic regression model with independent, binomial distributions for the true-positives and true-negatives conditional on the sensitivity and specificity in each study and a bivariate normal model for the logit transforms of sensitivity and specificity between studies.19, 20
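For readers who prefer notation, the 2-level model described in this paragraph can be written as follows. The symbols are assumptions introduced here for illustration (they do not appear in the article): for study i, n_{1i} and n_{0i} are the numbers of women with and without disease, and Se_i and Sp_i are the study-specific sensitivity and specificity.

```latex
\begin{align}
  \mathrm{TP}_i \mid \mathrm{Se}_i &\sim \mathrm{Binomial}\left(n_{1i}, \mathrm{Se}_i\right), \\
  \mathrm{TN}_i \mid \mathrm{Sp}_i &\sim \mathrm{Binomial}\left(n_{0i}, \mathrm{Sp}_i\right), \\
  \begin{pmatrix} \operatorname{logit}\mathrm{Se}_i \\ \operatorname{logit}\mathrm{Sp}_i \end{pmatrix}
  &\sim \mathcal{N}\!\left(
    \begin{pmatrix} \mu_{\mathrm{Se}} \\ \mu_{\mathrm{Sp}} \end{pmatrix},
    \begin{pmatrix}
      \sigma_{\mathrm{Se}}^{2} & \sigma_{\mathrm{Se,Sp}} \\
      \sigma_{\mathrm{Se,Sp}} & \sigma_{\mathrm{Sp}}^{2}
    \end{pmatrix}
  \right).
\end{align}
```

Under this formulation, the pooled sensitivity and specificity correspond to the inverse logit of the means, that is, $\operatorname{logit}^{-1}(\mu_{\mathrm{Se}})$ and $\operatorname{logit}^{-1}(\mu_{\mathrm{Sp}})$, and the covariance term allows for the correlation between sensitivity and specificity across studies.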
The relative sensitivity and specificity of p16INK4a compared with hrHPV testing was computed using the metadas macro in SAS (SAS Institute Inc., Cary, NC) for meta-analysis of diagnostic accuracy studies, which allows the inclusion of “test” as a covariate, making comparison of ≥2 tests possible.21, 22
Multivariate analyses for p16INK4a immunocytochemistry were done using metadas. Different covariates were included for the test-positivity criterion used for p16INK4a, primary antibody, preparation method of index cytology, and the reference standard used.
The electronic search yielded 810 articles (the last search was performed on August 24, 2011). The majority of articles were identified in the PubMed-Medline database (n = 619). An additional 191 articles were retrieved from EMBASE. The CENTRAL database yielded no further results. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flowchart (Fig. 1) illustrates how selected references were harvested and the reasons for the exclusion of certain articles.23 Finally, 17 reports were retained with data that fulfilled the inclusion criteria and allowed us to address the PICOS question.
Two studies provided data on the accuracy of p16INK4a immunochemistry for women with LSIL cytology,24, 25 5 studies provided data on women with ASC-US cytology,26-30 and another 10 studies provided data on the triage of both ASC-US and LSIL cytology.12, 31-39 Study characteristics and technical information of the included articles are listed in Tables 1 and 2, respectively.
|Study||Country||Study Size||Triage Group||Triage Test||Outcomes||Gold Standarda|
|Nieh, 200529||Taiwan||66||ASCUS||p16INK4a cytology, HC2||CIN2+||3|
|Holladay, 200632||United States||100||ASC-US||p16INK4a cytology, HC2||CIN2+||6|
|Meyer, 200724||United States||28||LSIL||p16INK4a cytology, HC2||CIN2+||5|
|Monsonego, 200733||France||98||ASC-US||p16INK4a cytology, HC2||CIN2+, CIN3+||3|
|Wentzensen, 200739||France||137||ASCUS||p16INK4a cytology||CIN2+||3|
|Schledermann, 200837||Denmark Sweden||43||ASC||p16INK4a cytology||CIN2+||6|
|Szarewski, 200838||United Kingdom||104||ASCUS||p16INK4a cytology, HC2||CIN2+, CIN3+||3|
|Denton, 201031||Switzerland Italy||385||ASC-US||p16INK4a cytology, HC2||CIN2+, CIN3+||6|
|Passamonti, 201035||Italy||91||ASC-US||p16INK4a cytology||CIN2+, CIN3+||4|
|Samarawardana, 201036||United States||164||ASC-US||p16INK4a cytology||CIN2+||4|
|Sung, 201030||Korea||66||ASC-US||p16INK4a cytology||CIN2+||3|
|Tsoumpou, 201025||Greece||216||LSIL||p16INK4a cytology||CIN2+||4|
|Alameda, 201126||Spain||109||ASCUS||p16INK4a cytology, HC2||CIN2+||4|
|Edgerton, 201127||United States||63||ASC-US||Dual stain (p16INK4a/Ki-67)||CIN2+||6|
|Guo, 201128||United States||65||ASC-US||p16INK4a cytology||CIN2+, CIN3+||5|
|Nasioutziki, 201134||Greece||53||ASCUS||p16INK4a cytology, HC2||CIN2+||5|
|Schmidt, 201112||Switzerland, Italy||361||ASCUS||Dual stain (p16INK4a/Ki-67), HC2||CIN2+||3|
|Study||p16INK4a Antibody||Criterion for p16INK4 Positivitya||Cytology Preparation Method||Cytology Collection Device|
|Nieh, 200529||Clone E6H4||Nuclear/cytoplasmic staining of ≥1 cytologically abnormal cervical cell||Conventional cytology||Wooden spatula/ cytobrush|
|Holladay, 200632||Clone E6H4||Cytoplasmic/nuclear staining of ≥1 cytologically abnormal cervical cell||LBC (PreservCyt [Hologic, Inc., Bedford, Mass], ThinPrep [Cytyc Corporation, Boxboro, Mass])||ND|
|Meyer, 200724||Clone E6H4||Nuclear/cytoplasmic staining of ≥1 cytologically abnormal cervical cell||LBC (PreservCyt, ThinPrep)||ND|
|Monsonego, 200733||Clone E6H4||Nuclear/cytoplasmic staining of ≥1 cytologically abnormal cervical cell||LBC (PreservCyt, ThinPrep)||ND|
|Wentzensen, 200739||Clone E6H4||Nuclear score >2a||LBC (CYTO-screen system fixative fluid [SEROA, Monaco])||Flexible brush|
|Schledermann, 200837||Clone E6H4||Nuclear staining of ≥1 cytologically abnormal cervical cell||LBC (ThinPrep, PreservCyt)||Plastic spatula, endocervical cytobrush|
|Szarewski, 200838||Clone E6H4||Nuclear score >2a||LBC (ThinPrep, PreservCyt)||Cervex broom|
|Denton, 201031||Clone E6H4||Cytotechnologist 1 and pathologist: Presence of ≥1 p16INK4a-stained cervical cell; Cytotechnologist 2: nuclear score ≥2a||LBC||ND|
|Passamonti, 201035||Clone JC8||Nuclear/cytoplasmic staining of ≥1 cytologically abnormal cervical cell||Conventional cytology, 151; LBC, 95 (ThinPrep, PreservCyt)||ND|
|Samarawardana, 201036||16P04||Nuclear/cytoplasmic strong staining in ≥30 metaplastic, koilocytotic, or cytologically equivocal cells||LBC (ThinPrep, PreservCyt)||Broom-like device|
|Sung, 201030||Clone E6H4||Nuclear/cytoplasmic staining of ≥1 cytologically abnormal cervical cell||LBC||Cytobrush|
|Tsoumpou, 201025||Clone E6H4||Nuclear/cytoplasmic staining of ≥1 cytologically abnormal cervical cell||LBC (ThinPrep, PreservCyt)||ND|
|Alameda, 201126||Clone E6H4||Nuclear score >2a||LBC (ND)||ND|
|Edgerton, 201127||Clone E6H4 Clone 274-11 AC3||Simultaneous dual staining of ≥1 cervical cell||LBC (ND, SurePath [Becton, Dickinson and Company, Franklin Lakes, NJ])||ND|
|Guo, 201128||Clone 6H12||Nuclear staining of ≥1 cytologically abnormal cervical cell with/without cytoplasmic staining||LBC (SurePath)||ND|
|Nasioutziki, 201134||Clone E6H4||Nuclear score >2a||LBC (PreservCyt; ThinPrep)||Ayre spatula and cytobrush|
|Schmidt, 201112||Clone E6H4, clone 274-11 AC3||Simultaneous dual staining of ≥1 cervical cell||LBC (ThinPrep, PreservCyt)||ND|
In 2 studies,12, 27 p16INK4a/Ki-67 dual staining using the CINtec Plus kit (Roche mtm laboratories AG, Heidelberg, Germany) was performed. The other 15 studies24-26, 28-39 applied single p16INK4a immunocytochemistry. Twelve studies24-26, 29-34, 37-39 used clone E6H4 as a primary antibody for p16INK4a, and the other primary antibodies used included clone 6H12,28 clone JC8,35 and 16P04.36 Positivity criteria for p16INK4a immunostaining differed between the studies. Five studies26, 31, 34, 38, 39 made use of the nuclear scoring proposed by Wentzensen et al.40 This scoring system takes into account nuclear staining and nuclear abnormalities (increased size, granular/hyperchromatic chromatin, irregular shape, or variable morphology from cell to cell). When a cervical cell has nuclear p16INK4a staining and 1 of the nuclear abnormalities mentioned above, a score of 2 is given. If the stained nucleus has an increased size and 1 or more nuclear abnormality, then a score of 3 and 4 is given, respectively. A nuclear score >2 or ≥2 is used as a cutoff for p16INK4a positivity. For the studies that applied p16INK4a/Ki-67 dual staining, simultaneous red nuclear staining and brown cytoplasmic staining in at least 1 cervical cell was set as the positivity criterion.12, 27 The presence of staining in 1 or more (or, in 1 study, 30 or more) cytologically abnormal cervical cells was interpreted as a positive p16INK4a reaction in the remaining 10 studies. However, there was a difference in the localization of the immunostaining. Two studies28, 37 only considered nuclear staining as a positive p16INK4a reaction, whereas 8 studies24, 25, 29, 30, 32, 33, 35, 36 considered nuclear and/or cytoplasmic staining as a positive reaction.
Triage of Atypical Squamous Cells of Undetermined Significance
Fifteen studies contained accuracy data for p16INK4a immunostaining in the triage of women with ASC-US cytology.12, 26-39 In total, 1740 women were enrolled. Eight studies performed a direct comparison with HC2 triage data.12, 26, 29, 31-34, 38 The study by Denton et al31 provided p16INK4a immunocytochemistry data interpreted independently by 2 pathologists and 1 cytotechnologist. To prevent this study from exerting undue influence on the pooled estimates, each interpretation was weighted by a factor of 0.33.
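The effect of the 0.33 weighting can be illustrated with a deliberately simplified sketch. It uses a plain weighted mean rather than the hierarchical model that was actually fitted, and the study labels and sensitivity values are hypothetical. The point is only that 3 readings of the same women, each weighted 0.33, together carry about the weight of a single study.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean of per-entry estimates."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# (label, hypothetical sensitivity, weight); values are illustrative only.
entries = [
    ("study A", 0.80, 1.0),
    ("study B", 0.90, 1.0),
    ("study C, reader 1", 0.70, 0.33),
    ("study C, reader 2", 0.75, 0.33),
    ("study C, reader 3", 0.80, 0.33),
]

values = [value for _, value, _ in entries]
weights = [weight for _, _, weight in entries]

# Combined weight of the three readings of study C (close to one study).
study_c_weight = sum(w for label, _, w in entries if label.startswith("study C"))

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
print(f"study C total weight = {study_c_weight:.2f}")
print(f"unweighted mean = {unweighted:.3f}, weighted mean = {weighted:.3f}")
```

Without the down-weighting, the triple-read study would count 3 times and pull the summary toward its own results.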
Absolute Accuracy of p16INK4a Triage
The pooled estimated absolute sensitivity and specificity values and their 95% confidence intervals (CIs) are listed in Table 3. The pooled sensitivity was 83.2% (95% CI, 76.8%-88.2%) and 85.4% (95% CI, 71.7%-93.1%) for an outcome of CIN2+ and CIN3+, respectively. To predict the absence of CIN2+ or CIN3+, the pooled absolute specificity was 71% (95% CI, 65%-76.4%) and 61.1% (95% CI, 57.2%-64.9%), respectively. The hierarchical summary receiver operating characteristic (HSROC) curve for p16INK4a triage for an outcome of CIN2+ is illustrated in Figure 2.
|Outcome||No. of Studies||Parameter||Accuracy (95% CI), %|
Relative Accuracy of p16INK4a Versus Hybrid Capture 2 Triage
The relative accuracy measures and their CIs are listed in Table 4. The relative sensitivity of p16INK4a versus HC2 for CIN2+ and CIN3+ lesions was 0.95 (95% CI, 0.89-1.01) and 0.98 (95% CI, 0.86-1.12), respectively. The relative specificity was 1.82 (95% CI, 1.57-2.12) and 1.64 (95% CI, 1.44-1.87) for predicting the absence of CIN2+ or CIN3+ lesions, respectively. The corresponding HSROC curve is illustrated in Figure 3. In Figure 3, the top graph illustrates that the 2 summary points are at almost the same height (similar sensitivity), but the summary point of p16INK4a is located more to the left (greater specificity) than that of HC2. This means that HC2 and p16INK4a have similar sensitivity in the triage of ASC-US to detect CIN2+; however, the specificity of p16INK4a is greater than the specificity of HC2.
|Outcome||Ratio: p16INK4a vs HC2 (95% CI)||P|
|Specificity||1.82 (1.57-2.12)||< .0001|
|Specificity||2.74 (1.99-3.76)||< .0001|
|Specificity||2.81 (2.38-3.33)||< .0001|
Triage of Low-Grade Squamous Intraepithelial Lesions
Absolute Accuracy p16INK4a Triage
The pooled absolute sensitivity of p16INK4a was similar to its sensitivity in the triage of ASC-US, with 83.8% (95% CI, 73.5%-90.6%) and 87.7% (95% CI, 78.6%-93.2%) absolute sensitivity to predict CIN2+ and CIN3+, respectively. The absolute specificity of p16INK4a to predict the absence of CIN2+ and CIN3+ was slightly lower than its specificity in ASC-US triage; the pooled estimates were 65.7% (95% CI, 54.2%-75.6%) and 48.9% (95% CI, 36.2%-61.7%), respectively (Table 3, Fig. 2).
Relative Accuracy of p16INK4a Versus Hybrid Capture 2 Triage
In contrast to ASC-US triage, p16INK4a triage had lower sensitivity than HC2 to predict CIN2+ or CIN3+ lesions. The relative sensitivity for CIN2+ and CIN3+ lesions was 0.87 (95% CI, 0.81-0.94) and 0.88 (95% CI, 0.81-0.95), respectively. In concordance with ASC-US triage, p16INK4a triage had statistically significantly greater specificity than HC2, with pooled relative specificity values of 2.74 (95% CI, 1.99-3.76) and 2.81 (95% CI, 2.38-3.33) for CIN2+ and CIN3+ outcomes, respectively. The corresponding HSROC curve is illustrated on the bottom graph in Figure 3. The summary point of p16INK4a is located lower (lower sensitivity) and more to the left (higher specificity) than that of HC2 testing, indicating that p16INK4a triage has higher specificity but lower sensitivity than HC2 to detect CIN2+ lesions in women with LSIL cytology (Table 4, Figs. 3, 4).
Influence of Study Characteristics
Multivariate analysis revealed higher sensitivity and specificity for studies that used the nuclear scoring system to interpret p16INK4a results and studies that applied dual staining for p16INK4a and Ki-67 compared with studies that only used simple p16INK4a expression in cytologically abnormal cells (Table 5). However, these differences were not statistically significant (P > .05).
|Covariate||No. of Studies||Sensitivity (95% CI), %||P||Specificity, (95% CI), %||P|
|Test cutoff criterion|
|p16INK4a expression in >1 cell||10||81.6 (70.2-89.3)||Ref||66.8 (62.3-70.9)||Ref|
|NS>2||5||85.9 (75.5-92.3)||.504a||83.1 (60.8-94.0)||.036a|
|Dual staining >1 cell||2||84.6 (69.1-93.1)||.696a||70.2 (56.7-80.9)||.595a|
|No. of triage tests evaluated|
|Both triage testsb||10||87 (81.1-91.3)||.119a||73.3 (62.9-81.7)||.452a|
|Only p16INK4a testingc||7||77.3 (65.1-86.1)||Ref||68.7 (60.9-75.6)||Ref|
|Test cutoff criterion|
|p16INK4a expression in >1 cell||9||79.7 (65.8-88.9)||Ref||58.9 (46.1-70.7)||Ref|
|NS>2||4||82.1 (65.9-91.6)||.778a||77.1 (56.4-89.7)||.086a|
|Dual staining >1 cell||1||94.4 (55-99.6)||.106a||68.0 (45.5-84.4)||.445a|
|No. of triage tests evaluated|
|Both triage tests||9||85.2 (74.4-91.1)||.644a||64.1 (49.3-76.6)||.681a|
|Only p16INK4a testing||5||80.2 (54.9-93.1)||Ref||68.7 (49.9-82.8)||Ref|
The studies that applied both p16INK4a and HC2 triage tests did not differ significantly in terms of sensitivity and had equal specificity compared with studies that only assessed p16INK4a immunocytochemistry. The type of p16INK4a antibody used also did not significantly influence the accuracy measures.
Our meta-analysis revealed better accuracy for p16INK4a triage of ASC-US than HC2 (similar sensitivity but better specificity) considering both CIN2+ and CIN3+ outcomes. In LSIL triage, p16INK4a staining was more specific but less sensitive than HC2.
Triage of Atypical Squamous Cells of Undetermined Significance
It has been demonstrated in large randomized trials and meta-analyses that HC2 performs better than repeat cytology in the triage of women with ASC-US.4, 5, 41, 42 Nevertheless, the specificity of HC2 triage still is not optimal (often in the 40%-60% range), resulting in colposcopy referral for many women without disease. With a pooled specificity of 71% (1.82 times higher than HC2), p16INK4a immunostaining appears to be a test that meets the demand for a more specific triage test without losing sensitivity. The specificity of HC2 in ASC-US triage in our meta-analysis, which included 8 studies (40.5%; 95% CI, 33.5%-47.9%), was lower than that in previous meta-analyses, which included 20 studies (62.5%; 95% CI, 57.8%-67.3%),5 but was not significantly different from the specificity reported in the ALTS study (48%),3 which may be explained by differences in the age composition of study populations. Age could not be controlled for throughout previous meta-analyses, because age-stratified data were not reported sufficiently in the included studies. However, within each of the 8 evaluable studies in our meta-analysis, age could not cause bias, because the 2 compared tests were done on the same women.
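The practical meaning of the specificity gap can be shown with a short calculation. The sketch below converts the 2 pooled specificities quoted in this paragraph (40.5% for HC2 and 71% for p16INK4a) into expected numbers of unnecessary colposcopy referrals. The cohort of 1000 women without CIN2+ and the function name are assumptions made for illustration, and the 2 estimates were pooled from different sets of studies, so the comparison is indicative only.

```python
def false_positive_referrals(specificity: float, women_without_disease: int) -> int:
    """Expected number of disease-free women who test positive and are referred."""
    return round(women_without_disease * (1.0 - specificity))


# Hypothetical cohort size; the specificities are the pooled values quoted in the text.
WOMEN_WITHOUT_CIN2_PLUS = 1000
HC2_SPECIFICITY = 0.405
P16_SPECIFICITY = 0.71

hc2_referrals = false_positive_referrals(HC2_SPECIFICITY, WOMEN_WITHOUT_CIN2_PLUS)
p16_referrals = false_positive_referrals(P16_SPECIFICITY, WOMEN_WITHOUT_CIN2_PLUS)
avoided = hc2_referrals - p16_referrals

print(f"HC2: {hc2_referrals} false-positive referrals per {WOMEN_WITHOUT_CIN2_PLUS}")
print(f"p16INK4a: {p16_referrals} false-positive referrals per {WOMEN_WITHOUT_CIN2_PLUS}")
print(f"referrals avoided: {avoided}")
```

Under these assumptions, roughly half of the unnecessary referrals generated by HC2 triage would be avoided with p16INK4a triage.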
Triage of Low-Grade Squamous Intraepithelial Lesions
HC2 does not perform well in many studies because of its very low specificity.7, 41, 43 However, these findings are not universal and depend on the quality of cytologic interpretation and the HPV test used. In our meta-analysis, the pooled specificity values for HC2 were very similar to those reported in previous meta-analyses (in the range from 22% to 28%) for CIN3+ and CIN2+ outcomes.5 There is clearly a need for assays that can be used universally in the triage of LSIL and that are as sensitive as and more specific than HC2. Our meta-analysis indicates that p16INK4a indeed is more specific but, in contrast to the triage of ASC-US, it is less sensitive.
Influence of Study Characteristics
The use of p16INK4a immunocytochemistry in clinical applications remains controversial because of the variation in procedures used. The most important difference between the different studies is the interpretation of p16INK4a expression.32 Because a purely color-based approach to identify abnormal cells in cervical smears using p16INK4a is hampered by the reality that a few normal endocervical, squamous metaplastic, or atrophic cells also may display some p16INK4a expression, Wentzensen et al39 defined morphologic criteria that would enable scoring of p16INK4a-positive squamous cells. A major concern of using morphology-based biomarkers is achieving adequate reproducibility. Although the nuclear scoring system revealed high reproducibility in initial reports,39, 40 it was not applied consistently in subsequent studies, and its reproducibility was not evaluated on a larger scale. The recent p16INK4a/Ki-67 dual-staining method could eliminate the need for a standardized methodology, because it allows the identification of cells with a deregulated cell cycle in cervical cytology specimens independent of morphology-based parameters. We assumed that the studies that applied the nuclear scoring system or p16INK4a/Ki-67 dual staining would have greater accuracy (higher sensitivity and specificity) to identify women with CIN2+ compared with studies that only investigated simple p16INK4a expression in cytologically abnormal cells without scoring. Multivariate analyses revealed higher sensitivity and specificity of the ASC-US studies that applied nuclear scoring or dual staining compared with those that applied simple p16INK4a immunostaining; however, in general, these differences were not statistically significant. Only the specificity of p16INK4a immunostaining with nuclear scoring in women with ASC-US was significantly higher compared with the other studies (P = .04).
p16INK4a/Ki-67 dual staining was used in only 2 ASC-US triage studies.12, 27 One study12 reported excellent sensitivity (92%) and specificity (81%) for CIN2+ using dual staining, with sensitivity similar to that of HC2 (ratio, 1.01; 95% CI, 0.92-1.16) but with increased specificity (ratio, 2.22; 95% CI, 1.89-2.62). Another study27 that used dual staining reported substantially lower sensitivity (64%) and specificity (53%) for the same outcome without comparison with HC2. This may be because the study did not follow the manufacturer's instructions for CINtec PLUS dual staining. In LSIL triage, only 1 study used dual staining and reported findings similar to those for ASC-US triage: high sensitivity (94%) and rather good specificity (68%) for CIN2+, similar to the sensitivity of HC2 (ratio, 0.98; 95% CI, 0.93-1.03) but with higher specificity than HC2 (ratio, 3.57; 95% CI, 2.76-4.60).12
The gold standard used can influence the accuracy estimates of triage tests. In the current meta-analysis, we considered colposcopy and histology as the gold standard and distinguished 6 types of verification. However, none of these methods of verification significantly influenced the triage accuracy estimates. In addition, staining of biopsies may have an impact on outcome assessments. Two studies12, 31 used p16 immunohistochemistry in addition to normal hematoxylin and eosin (H&E) staining for the histologic interpretation of biopsies. Previous studies have demonstrated that this improved gold standard increases the sensitivity of the histologic interpretation.44, 45 Our multivariate analysis revealed no significant difference in the absolute sensitivity of triage using p16INK4a immunocytochemistry between studies that used H&E staining compared with studies that used p16INK4a staining of biopsies (P = .17 and P = .22 for ASC-US and LSIL, respectively). Furthermore, outcome adjudication using p16 will bias the results in favor of p16 cytology because of autocorrelation.
Future Research on Triage of Atypical Squamous Cells of Undetermined Significance and Low-Grade Squamous Intraepithelial Lesions
The meta-analysis presented in this report is part of an international effort that includes a series of ongoing meta-analyses addressing the accuracy of triage of minor cytologic abnormalities using methods other than HC2, such as other hrHPV-DNA tests, assays that detect viral RNA, assays that detect a restricted number of HPV types (in particular, HPV types 16 and 18), and other protein markers, such as ProExC (BD Diagnostics-TriPath, Burlington, NC). All these meta-analyses will address questions of follow-up for screen-positive women participating in cytology-based screening. Investigators and authors are encouraged to follow the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines for good diagnostic research involving application of 1 or more markers followed by verification with colposcopy and colposcopy-targeted biopsies with or without additional random punch biopsies for all patients with ASC-US and LSIL.14, 46 This gold-standard verification preferably should be blinded to the results of the markers and should take place within a short delay (<10 weeks) to avoid development of disease after the triage tests. Future research also should target longitudinal outcomes, in particular the risk of developing CIN3 in women with triage-positive and triage-negative results over 3 to 5 years (the longitudinal positive predictive value and 1 minus the negative predictive value [NPV]).
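The 2 risk measures named at the end of this paragraph can be sketched numerically. The code below derives the positive predictive value (risk of disease after a positive triage result) and 1 minus the NPV (risk of disease after a negative triage result) from sensitivity, specificity, and disease prevalence. This is a cross-sectional simplification: the longitudinal measures discussed above would be estimated from follow-up data over 3 to 5 years. The 10% prevalence is a hypothetical value, and the function name is illustrative.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, 1 - NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)            # risk after a positive test
    one_minus_npv = false_neg / (false_neg + true_neg)  # risk after a negative test
    return ppv, one_minus_npv


# Pooled ASC-US estimates for p16INK4a from this review (CIN2+ outcome),
# combined with an ASSUMED prevalence of CIN2+ of 10%.
ppv, residual_risk = predictive_values(sensitivity=0.832, specificity=0.71, prevalence=0.10)
print(f"PPV = {ppv:.1%}, 1 - NPV = {residual_risk:.1%}")
```

The second quantity is the one that matters for deciding whether triage-negative women can safely return to routine screening.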
In conclusion, based on currently published data, p16INK4a immunocytochemistry may be recommended for use in the triage of women with ASC-US because of its greater specificity without loss of sensitivity compared with HC2 testing. In LSIL triage, p16INK4a is less sensitive but more specific than HC2. Therefore, it can be used as a first-step triage, justifying further diagnostic workup of p16INK4a-positive women. However, women with LSIL who are negative for p16INK4a cannot be referred back to normal screening. Those women should be reinvited for repeat testing. Dual staining in LSIL triage may be as sensitive as HC2, but this was reported in only 1 observational study, which is insufficient to justify clinical recommendations. More studies using the dual stain are currently ongoing and may have an influence on the current conclusions.
Financial support was received from: 1) the European Commission through the Prevention Strategies for HPV-Related Diseases in European Countries (PREHDICT) Network, coordinated by the Free University of Amsterdam (the Netherlands), funded by the seventh Framework program of DG Research (Brussels, Belgium), and through the European Cooperation on Development and Implementation of Cancer screening and prevention guidelines (coordinated by the International Agency for Research on Cancer [Lyon, France]), funded by the Directorate of SANCO (Luxembourg, Grand-Duchy of Luxembourg); 2) The Belgian Foundation Against Cancer (Brussels, Belgium); and 3) the Gynaecological Cancer Cochrane Review Collaboration (Bath, United Kingdom).
CONFLICT OF INTEREST DISCLOSURES
M. Von Knebel Doeberitz was a member of the supervisory board and a shareholder of mtm laboratories.
- 12. p16/Ki-67 dual-stain cytology in the triage of ASCUS and LSIL Papanicolaou cytology: results from the European Equivocal or Mildly Abnormal Papanicolaou Cytology study. Cancer Cytopathol. 2011; 119: 158-166.
- 20. Metandi: meta-analysis of diagnostic accuracy using hierarchical logistic regression. Stata J. 2009; 9: 211-229.
- 21. Diagnostic Test Accuracy Working Group. METADAS: A SAS Macro for Meta-Analysis of Diagnostic Accuracy Studies. Cary, NC: SAS Institute, Inc.; 2009. Available at: http://srdta.cochrane.org/sites/srdta.cochrane.org/files/uploads/METADAS %20Quick%20Reference%20WorkedExample%20v1.3.pdf. Accessed November 30, 2011.
- 22. Diagnostic Test Accuracy Working Group. Handbook for Diagnostic Test Accuracy Reviews. Available at: http://srdta.cochrane.org/handbook-dta-reviews. Accessed November 30, 2011.
- 28. Evaluation of p16 immunostaining to predict high-grade cervical intraepithelial neoplasia in women with Pap results of atypical squamous cells of undetermined significance. Diagn Cytopathol. 2011; 39: 482-488.
- 29. Is p16(INK4A) expression more useful than human papillomavirus test to determine the outcome of atypical squamous cells of undetermined significance-categorized Pap smear? A comparative analysis using abnormal cervical smears with follow-up biopsies. Gynecol Oncol. 2005; 97: 35-40.
- 32. A comparison of the clinical utility of p16(INK4a) immunolocalization with the presence of human papillomavirus by Hybrid Capture 2 for the detection of cervical dysplasia/neoplasia. Cancer. 2006; 108: 451-461.
- 41. [No authors listed] Human papillomavirus testing for triage of women with cytologic evidence of low-grade squamous intraepithelial lesions: baseline data from a randomized trial. The Atypical Squamous Cells of Undetermined Significance/Low-Grade Squamous Intraepithelial Lesions Triage Study (ALTS) Group. J Natl Cancer Inst. 2000; 92: 397-402.