Cervical cancer is a worldwide problem, representing 1 of the most common cancers in women across the globe.1 Cervical cancer screening programs, such as Papanicolaou (Pap) testing, decrease the incidence and mortality of squamous cell carcinoma of the uterine cervix; in the United States, there has been a decrease in the incidence of invasive cervical cancer by 70% since the implementation of Pap testing in the 1950s.2 Why is cervical cancer screening successful? Pap tests detect cell changes that may indicate a current, biologically significant lesion, allowing intervention before the occurrence of invasive carcinoma.

Not all cell changes are significant. In fact, as the prevalence of cervical cancer decreases because of screening and human papillomavirus (HPV) vaccination,3, 4 morphologic changes are more likely to be associated with insignificant, non-HPV–related lesions. HPV is a necessary factor for the development of invasive cancer. Therefore, HPV testing is a successful triage test that identifies the presence of HPV in borderline abnormalities. If there is no HPV, then there is not likely to be a precancerous lesion requiring intervention. However, current triage HPV tests simply detect the presence of HPV. HPV infection is common, but not all women with HPV infection develop cancer. Invasive cancer requires the persistence of HPV and the integration of HPV into the host genome. Our current screening programs measure 2 different variables to decide which women are at risk. Pap tests detect morphologic abnormalities that may be associated with transformed cells, and the HPV test simply confirms that HPV is present in the cell. Neither test actually identifies a cell that has the ability to transform into cancer. Consequently, women whose morphologic epithelial cell abnormalities and HPV infection would never progress to invasive cancer may be subjected to unnecessary diagnostic surveillance or treatment that may be harmful.

What would be an ideal test? An ideal cervical cancer screening test would be 1 that could identify women who have transforming cells that will result in invasive cervical cancer and clearly exclude those women who have cells in which HPV is not integrated and that will not develop into cancer. A marker of integration and transformation would be an ideal test. Cyclin-dependent kinase inhibitor 2A (P16INK4a) is a surrogate marker of cell transformation. The accumulation of P16INK4a in a cell is a result of the integration of the E7 HPV oncogene into the cell and disruption of the cyclin-dependent kinase phosphorylation/retinoblastoma protein pathways. The test to detect P16INK4a is a visual detection system that requires nuclear and cytoplasmic signals. In a review and meta-analysis by Roelens et al in this issue of Cancer Cytopathology,5 the authors thoroughly review the use of P16INK4a as a triage test for minor cytologic abnormalities defined as atypical squamous cells of undetermined significance (ASC-US) and low-grade squamous intraepithelial lesion (LSIL). The meta-analysis includes 17 studies: 2 studies of women with LSIL, 5 studies of women with ASC-US, and 10 studies that evaluated the performance of P16INK4a among women with both ASC-US and LSIL. The included studies used a variety of preparations, including conventional and liquid-based preparations (ThinPrep [Hologic, Bedford, Mass] and SurePath [Becton, Dickinson and Company, Franklin Lakes, NJ]), several different antibody clones, and different staining scoring systems; in 3 studies, combined staining of P16INK4a and Ki-67 was used. The studies were from 11 different countries, and the outcome was a biopsy that demonstrated cervical intraepithelial neoplasia (CIN) of either grade 2 or worse (CIN2+) or grade 3 or worse (CIN3+).

The authors also compared the performance of P16INK4a with the performance of Hybrid Capture 2 (HC2) (Qiagen, Venlo, Netherlands) when available. When P16INK4a was used to triage ASC-US, the meta-analysis demonstrated a sensitivity of 83.8% for CIN2+ and 87.7% for CIN3+, with specificities of 71% and 61.1% for CIN2+ and CIN3+, respectively. P16INK4a had the same sensitivity as HC2 but had greater specificity. For LSIL, the sensitivities were similar to those for ASC-US, but the specificities were lower for both CIN2+ (65.7%) and CIN3+ (48.9%). Compared with HC2, P16INK4a had a higher specificity but lower sensitivity in women with LSIL. Based on their meta-analysis, the authors conclude that P16INK4a is a better test than HC2 in ASC-US (just as sensitive with increased specificity) but does not perform as well in LSIL triage (more specific, but less sensitive).
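To make the trade-off between sensitivity and specificity concrete, the following minimal sketch converts the pooled ASC-US estimates for CIN2+ quoted above into expected counts per 1000 women triaged. The 10% prevalence of CIN2+ is a purely hypothetical value chosen for illustration and is not taken from the meta-analysis; the function name and its parameters are likewise illustrative.

```python
def triage_outcomes(sensitivity, specificity, prevalence, n=1000):
    """Expected 2x2 counts and predictive values for a triage test applied to n women."""
    diseased = n * prevalence          # women with the outcome (e.g., CIN2+)
    healthy = n - diseased             # women without the outcome
    tp = sensitivity * diseased        # true positives: correctly referred
    fn = diseased - tp                 # false negatives: lesions missed by triage
    tn = specificity * healthy         # true negatives: correctly spared follow-up
    fp = healthy - tn                  # false positives: referred unnecessarily
    return {
        "tp": tp, "fn": fn, "tn": tn, "fp": fp,
        "ppv": tp / (tp + fp),         # positive predictive value
        "npv": tn / (tn + fn),         # negative predictive value
    }


# ASSUMPTION: hypothetical CIN2+ prevalence among women with ASC-US (not from the source).
ASSUMED_PREVALENCE = 0.10

# Pooled P16INK4a estimates for CIN2+ in ASC-US as reported in the text.
result = triage_outcomes(sensitivity=0.838, specificity=0.71,
                         prevalence=ASSUMED_PREVALENCE)

for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

Under any assumed prevalence, the false-positive count scales with (1 − specificity), which is why a gain in specificity translates directly into fewer women referred for unnecessary follow-up.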

The meta-analysis is a wonderful example of the rigor with which a meta-analysis should be performed. The transparent methods, clear definitions of inclusion and exclusion criteria, outcome measure, reference standard, and statistical analyses are models for the evaluation of emerging technologies for triage methods when no large-scale, prospective studies are available.

Although the meta-analysis by Roelens et al is well done, it has limitations. The pooled results included only 1740 women with ASC-US and 2019 women with LSIL, which are relatively small samples. It also was not possible to stratify the results by age because the included studies did not report adequate age-stratified data. These limitations stem from the nature of the initial data available for reanalysis. All meta-analyses also are limited because the hypotheses investigated in the original studies may not be of primary importance in the meta-analysis; the meta-analysis depends on information that is tangential to the purpose of the original studies.

Although all meta-analyses are limited by their retrospective nature, a more significant concern in considering the clinical application of P16INK4a is that any subjective, visual method is fraught with problems of reproducibility. P16INK4a is a visual method and is similar to a Pap test in its subjective evaluation of “positive” staining. Not only did the investigators whose studies were included in this meta-analysis use various methods and antibody clones, but there also were no standard methods of determining “positivity” for P16INK4a staining. Simple positive staining, dual staining with Ki-67, and nuclear scoring systems all were reported, and specificity differed significantly only when nuclear scoring was used in ASC-US cases. Furthermore, there were instances in which investigators did not follow the manufacturer's instructions,6 which underscores the importance of method standardization and validation before the use of any method in any clinical setting. Before P16INK4a can be used clinically, it is essential to standardize interpretation and reporting. A cavalier approach to methods or to manufacturer's instructions will obfuscate any potential value of the test.

Other considerations that were not part of the meta-analysis by Roelens et al, but that need to be considered in any cancer screening program, are access to the test and cost of the test. The use of P16INK4a testing was studied in countries with well-developed cervical cancer screening programs that had sophisticated laboratory medicine services. A test like P16INK4a is inappropriate in a low-resource setting without appropriate laboratory sophistication or patient follow-up. In countries with well-developed screening programs, an increase in specificity for a triage test is important. In countries without cervical cancer screening programs, implementation of screening technology and effective clinical follow-up appropriate for the medical infrastructure of the country should be the primary consideration for screening. Finally, it is important to determine whether the cost of P16INK4a testing, balanced against the potential increased specificity of the test, will be acceptable to risk-sensitive patients and health care professionals. If patients and clinical caregivers insist on historic, conventional follow-up and treatment for ASC-US despite the results of a P16INK4a test, then the cost will only be a burden, and the test will not diminish potentially harmful follow-up.

Should we adjust our pink-colored lenses and begin widespread testing with P16INK4a? Although the current meta-analysis certainly opens the door for future research, it is premature to applaud P16INK4a testing at this time. The report by Roelens et al is a first step in considering alternative triage tests for abnormal Pap results. The authors caution that future research is essential using Standards for Reporting of Diagnostic Accuracy (STARD) guidelines,7, 8 requiring colposcopy and biopsy for all patients with ASC-US results and a blinded review of the gold standard. It is noteworthy that longitudinal studies should be developed that measure the risk of developing CIN3+ over more than 3 years. The clinical utility of P16INK4a is yet to be determined, but the current meta-analysis provides a framework on which to build future, larger scale research studies. Meanwhile, we should continue to study and search for tests that accurately detect cells that have the potential to undergo malignant transformation, allowing appropriate early intervention and minimizing harms caused by unnecessary follow-up and treatment.



No specific funding was disclosed.


The author made no disclosures.

