Keywords: cervical intraepithelial neoplasia; p16 immunohistochemistry


Abstract

The histopathological diagnosis of cervical intraepithelial neoplasia grade 2,3 (CIN 2,3) is subjective and prone to variability. In our study, we analyzed the impact of utilizing a biomarker (p16INK4A) together with histopathology to refine the “gold standard” utilized for evaluating the performance of 3 different cervical cancer screening tests: cervical cytology, human papillomavirus (HPV) DNA testing and visual inspection with acetic acid (VIA). Cervical biopsies from 2 South African cervical cancer screening studies originally diagnosed by a single pathologist were reevaluated by a second pathologist and a consensus pathology diagnosis obtained. Immunohistochemical staining for p16INK4A was then performed. The estimated sensitivity of some cervical cancer screening tests was markedly impacted by the criteria utilized to define CIN 2,3. Use of routine histopathology markedly underestimated the sensitivity of both conventional cytology and HPV DNA testing compared to an improved gold standard of consensus pathology and p16INK4A positivity. In contrast, routine histopathology overestimated the sensitivity of VIA. Our results demonstrate that refining the diagnosis of CIN 2,3 through the use of consensus pathology and immunohistochemical staining for p16INK4A has an important impact on measurement of the performance of cervical cancer screening tests. The sensitivity of screening tests such as HPV DNA testing and conventional cytology may be underestimated when an imperfect gold standard (routine histopathology) is used. In contrast, the sensitivity of other tests, such as VIA, may be overestimated with an imperfect gold standard. © 2006 Wiley-Liss, Inc.

The histopathologic interpretation of cervical intraepithelial neoplasia (CIN) is subjective and prone to considerable variation. This is documented by numerous studies investigating intraobserver and interobserver variability among groups of pathologists evaluating cervical biopsies.1, 2, 3, 4 In general, agreement between pathologists is excellent for invasive lesions, moderately good for CIN 3 and poor for CIN 1 and CIN 2.3 This results in marked difficulties in separating normal biopsies from CIN 1 and in distinguishing between CIN 1 and CIN 2,3 based on histopathology alone.4

Recognition of this variability has led to concerted efforts to identify novel biomarkers capable of more reproducibly distinguishing between normal and CIN lesions and in reducing the variability in grading of CIN lesions. A number of potentially diagnostically useful biomarkers have been identified using conventional immunohistochemical approaches and tissue microarrays.5, 6, 7, 8, 9 These include altered expression of selected MHC antigens; increased expression of Ki-67, telomerase and desmogleins and reduced expression of nm23-H1, a candidate tumor suppressor gene.10, 11, 12, 13, 14 One of the more promising biomarkers is p16INK4A, a cyclin-dependent kinase inhibitor involved in control of the cell cycle.8 The intracellular expression of p16INK4A is increased upon the binding of high-risk HPV derived E7 oncoproteins to the retinoblastoma gene product. Since essentially all cervical cancers and high-grade precursors are associated with high-risk types of HPV, it is not surprising that there is an increased expression of p16INK4A in HPV-induced neoplasia.15 Numerous studies have demonstrated that staining with p16INK4A is very uncommon in normal cervical squamous epithelium, that a proportion of CIN 1 lesions stain positively for p16INK4A, a higher proportion of CIN 2 lesions stain positively and that the vast majority of CIN 3 lesions and cancers stain positively.7, 9, 16, 17, 18, 19 Moreover, prospective follow-up studies suggest that p16INK4A-positive lesions behave as true precursor lesions. One prospective study of women with CIN 1 demonstrated a higher rate of regression of p16INK4A-negative CIN 1 lesions (71%) compared to p16-positive CIN 1 lesions (38%).20 Although some p16-negative CIN 1 lesions progressed to CIN 3 in that study, the p16-positive lesions were much more likely to progress. 
Another prospective study demonstrated that 44% of women with p16INK4A-positive biopsies that were classified as “not CIN 2,3” by consensus pathology were subsequently diagnosed with CIN 2,3.16 Therefore, there is considerable data suggesting that p16INK4A immunostaining can help identify CIN 2,3 lesions that will behave as true precursor lesions.

Improving the accuracy of the histopathological diagnosis of CIN is important not only for patient care but also for evaluating new cervical cancer screening methods. Screening and diagnostic tests are usually evaluated against a gold standard that classifies patients as to the absence or presence of the disease being tested for. Unfortunately, the gold standards that are used for evaluating screening tests for cervical cancer are imperfect and tend to misclassify some patients.21 These misclassifications produce distortions in measures of test performance that can be of considerable magnitude and can also obscure important differences between tests.21, 22 Since combining immunohistochemical detection of p16INK4A together with routine histological examination improves interobserver agreement in the diagnosis of CIN, we have assessed the impact of combining p16INK4A immunohistochemistry together with histopathology as the gold standard for determining the performance of cervical cancer screening methods.23

Material and methods


Material for our study was obtained from 2 cervical cancer screening studies conducted in Cape Town, South Africa, between January 1996 and October 1999. In the first study, 2,944 nonpregnant, previously unscreened, women ages 35–65 years were screened using a conventional Papanicolaou test, visual inspection with acetic acid (VIA) in which the cervix is washed with a 5% solution of acetic acid and then inspected for areas of acetowhitening that may represent a precancerous lesion and human papillomavirus (HPV) DNA testing using the older Hybrid Capture 1 HPV DNA assay (Digene Corporation, Gaithersburg, MD).24 A 35 mm photograph of the cervix was also obtained after the application of acetic acid (Cervigram™, National Testing Laboratories, Fenton, MO). This photograph was blindly reviewed by a trained observer and classified using the company's standard terminology. Women with cytology of low grade squamous intraepithelial lesion (LSIL) or greater, women who were VIA-positive, had Cervigram diagnoses of “warrants colposcopy (P0),” low-grade SIL, high-grade SIL or cancer, and women who had high HPV DNA viral loads (10 times the standard cutoff representing ∼100 pg/ml HPV DNA) were referred for colposcopy with biopsy of all colposcopically identified minor grade lesions or loop electrosurgical excision procedure (LEEP) if a high-grade lesion was identified. Colposcopy was performed on 26% of the 2,944 women. At the time of colposcopy, the colposcopist utilized the Reid Colposcopic Index to grade any cervical lesions. 
This index classifies cervical lesions as either squamous metaplasia or CIN 1 (overall score of 1–2), likely to be CIN1 or CIN 2 (score of 3–4) or likely to be CIN 2,3 (score of 5–8).25 When the Hybrid Capture 2 HPV DNA assay (hc2) (Digene Corporation, Gaithersburg, MD) became available, we went back to samples from our study and retested samples from all women with biopsy-confirmed CIN and a random sample of nondiseased samples from the first study using the new test.26 The data that we present in the results of this article refers to HPV testing using the new Hybrid Capture 2 assay.

In the second study, 2,754 women meeting the same inclusion/exclusion criteria were enrolled from the same community.27 These women were screened using conventional cytology, VIA and high-risk HPV DNA testing using the Hybrid Capture 2 HPV DNA assay. These women also had a 35 mm photograph of the cervix obtained. Women who were VIA-positive, had LSIL or greater, had Cervigram diagnoses of “warrants colposcopy (P0),” low-grade SIL, high-grade SIL or cancer or who were high-risk HPV DNA positive at the standard clinical cutoff (∼1 pg/ml HPV DNA), were referred for colposcopy and biopsy. This represented 44% of all women. All participants provided written informed consent and both studies were approved by the Institutional Review Boards of Columbia University and the University of Cape Town.

Sample selection and histological review

Cervical biopsies were fixed in neutral buffered formalin and processed and initially diagnosed in Cape Town. Slides were then sent to Columbia University and reviewed by a single pathologist in a blinded fashion to provide a “study diagnosis” that was used to determine the performance of the different screening tests. All biopsies diagnosed as CIN 1 (n = 194), CIN 2,3 (n = 165), cancer (n = 32), as well as 28 biopsies diagnosed as squamous metaplasia without CIN that were selected at random, were reviewed by a second pathologist, and if the second review did not agree with the first, a third pathologist reviewed the case. On the basis of all readings combined, a “consensus diagnosis” (2 out of 3 in agreement) was reached.


Paraffin blocks from 329 (80%) of the 419 biopsies could be retrieved and had sufficient diagnostic material remaining for immunohistochemistry. This included 157 CIN 1 lesions, 135 CIN 2,3 lesions, 16 cancers and 21 squamous metaplasia. NCL-p16-432, an anti-human p16INK4A monoclonal antibody (clone 6H12, Novocastra Laboratories, UK) was used at a 1:25 dilution. Prior to incubation with the primary antibody, rehydrated sections were microwaved for 15 min in 0.01 citric acid (pH 6.0) and then washed twice with distilled water.28 Endogenous peroxidase activity was abolished by incubation in methanol containing 0.3% hydrogen peroxide for 20 min. Sections were preincubated with 3% normal horse serum in phosphate-buffered saline for 1 hr at room temperature (RT), incubated with primary antibody at 4°C overnight, followed by a 1 hr incubation at RT. The avidin-biotinylated-peroxidase complex detection system was used for immunocytochemical localization (Vectastain ABC kit, Vector Laboratory, Burlingame, CA). Immunostaining was visualized using Liquid DAB Pack (BioGenex, CA). For negative controls, slides were incubated with normal rabbit IgG or preimmune serum instead of primary antibody.

p16INK4A staining was classified as diffuse (involving all layers of the epithelium), basal (involving only the basal and parabasal cell layers) or negative. Both diffuse and basal staining could be strong, moderate or weak.

Statistical methods

Associations between p16INK4A staining and disease categories were tested using χ2 tests. Agreement within pathology diagnoses was described using the κ coefficient. Sensitivity was calculated as the proportion with a positive screening test among those with CIN 2, CIN 3 or cancer (CIN 2,3+). Differences in sensitivity between the screening tests were examined using McNemar tests. Confidence intervals for proportions were calculated based on the binomial distribution. All p-values were two-sided and p < 0.05 was considered statistically significant. All analyses were done using SAS statistical software (Cary, NC).
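The sensitivity and McNemar calculations described above can be sketched as follows. This is a minimal illustration and not the SAS code used in the study: the function names are hypothetical, the interval uses the normal approximation to the binomial (the text does not state which binomial interval was used), and the counts in the example calls are invented.

```python
import math


def sensitivity_with_ci(test_positive, diseased, z=1.96):
    """Sensitivity among gold-standard positives, with an approximate 95% CI.

    Assumption: normal (Wald) approximation to the binomial distribution.
    """
    p = test_positive / diseased
    half_width = z * math.sqrt(p * (1 - p) / diseased)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b = diseased women positive on test A only,
    c = diseased women positive on test B only (the discordant pairs).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial(n, 0.5) lower tail up to the smaller discordant count.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Invented counts, for illustration only.
example_sensitivity = sensitivity_with_ci(test_positive=90, diseased=100)
example_p_value = mcnemar_exact(b=0, c=10)
```

Because both tests are applied to the same women, only the discordant pairs carry information about a difference in sensitivity, which is why the McNemar test is used here rather than a comparison of two independent proportions.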


Results

p16INK4A immunostaining

A strong association was observed between p16INK4A staining and lesion grade based on the original histological “study” diagnosis of the 329 cases (Table I; p < 0.001). Only 33% of the 21 biopsies classified as squamous metaplasia were positive for any degree of staining with p16INK4A, whereas 70% of the CIN 2 lesions, 80% of the CIN 3 lesions and 100% of the cancers stained positively. For the purposes of analysis, p16INK4A staining was classified into 2 general patterns: staining of only the basal and parabasal layers of the epithelium, and diffuse staining of the epithelium (Figure 1). Each of these staining patterns was further subclassified as strong, moderate or weak in intensity. Strong diffuse staining was the predominant staining pattern (Table I). This pattern was observed in 81% of the cases of invasive cervical cancer and 60% of the CIN 3 lesions. Other patterns of staining were found in only a minority of cases. When only basal staining was present, it was almost always strong; no cases were classified as moderate and only 3 as having weak basal staining. Based on the results presented in Table I, it was decided to utilize a dichotomous classification of either positive (any intensity of staining in either pattern) or negative for p16INK4A staining in the subsequent analysis.

Figure 1. p16INK4A immunohistochemical staining: (a) strong basal only staining; (b) strong diffuse staining; (c) weak basal only staining and (d) weak diffuse staining.

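Table I. Distribution of p16 Staining in Different Types of Cervical Lesions by Original Diagnosis

| Original diagnosis | No. | Strong p16INK4a staining, diffuse | Strong p16INK4a staining, basal | Moderate/weak p16INK4a staining, diffuse | Moderate/weak p16INK4a staining, basal | Negative for p16INK4a |
|---|---|---|---|---|---|---|
| Metaplasia | 21 | 5 (24%) | 2 (10%) | 0 | 0 | 14 (66%) |
| CIN 1 | 157 | 37 (24%) | 36 (23%) | 6 (4%) | 2 (1%) | 76 (48%) |
| CIN 2 | 70 | 25 (36%) | 8 (11%) | 15 (21%) | 1 (1%) | 21 (30%) |
| CIN 3 | 65 | 39 (60%) | 3 (5%) | 10 (15%) | 0 | 13 (20%) |
| Invasive cancer | 16 | 13 (81%) | 1 (6%) | 2 (12%) | 0 | 0 |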
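As a rough check on the reported association between staining and lesion grade, the Table I counts can be collapsed to "any staining" versus "negative" and tested with a Pearson χ2 test. The sketch below is an assumption about the layout of the test: the text does not say whether the test was run on the dichotomized or the full staining categories. The function names are hypothetical, and the closed-form tail probability is only valid for an even number of degrees of freedom.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_upper_tail_even_df(stat, df):
    """Upper-tail probability of a chi-square variable; even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = stat / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Rows: metaplasia, CIN 1, CIN 2, CIN 3, invasive cancer (original diagnosis).
# Columns: [p16INK4A-positive (any pattern or intensity), p16INK4A-negative],
# obtained by summing the staining columns of Table I.
staining_by_grade = [
    [7, 14],
    [81, 76],
    [49, 21],
    [52, 13],
    [16, 0],
]

statistic, degrees_of_freedom = chi_square_independence(staining_by_grade)
p_value = chi_square_upper_tail_even_df(statistic, degrees_of_freedom)
```

Under this dichotomized layout the p-value falls well below 0.001, in line with the association reported in the text.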

When a two-tiered CIN classification system is utilized (i.e., CIN 1 and CIN 2,3), the “consensus” diagnosis obtained after blinded pathological review differed from the original “study” diagnosis for 84 (25%) of the 329 cases. The most common change in diagnosis was from CIN 1 to metaplasia. A total of 51 (32%) of the 157 cases originally classified as CIN 1 were reclassified as metaplasia on “consensus” pathology, Table II. Cases that stained negatively for p16INK4A showed a poor correlation between the original “study” diagnosis and the “consensus” diagnosis (κ = 0.378), whereas cases that stained positively for p16INK4A showed a good level of correlation between the two diagnoses (κ = 0.736), Table II. The impact of p16INK4A staining status on discordance between the two diagnoses was most marked at the two ends of the histological spectrum. Among cases originally classified as metaplasia, 14 (100%) of the 14 that were p16INK4A-negative were classified as negative on the “consensus” diagnosis. However, 3 (43%) of the 7 p16INK4A-positive cases originally classified as metaplasia were upgraded to CIN during the blinded “consensus” read. Similarly, 114 (97%) of the 117 p16INK4A-positive cases originally classified as CIN 2,3+ (includes cases of CIN 2, CIN 3 and invasive cancer) were classified as CIN 2,3+ on the “consensus” diagnosis compared to only 68% (23 of 34) of the p16INK4A-negative cases.

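Table II. Correlation Between Original and “Consensus” Diagnosis

| Consensus diagnosis | Original diagnosis: CIN 2,3+ | Original diagnosis: CIN 1 | Original diagnosis: Metaplasia | Total |
|---|---|---|---|---|
| p16INK4A-negative cases | | | | |
| CIN 2,3+ | 23 (68%) | 2 (3%) | 0 | 25 |
| CIN 1 | 1 (3%) | 31 (41%) | 0 | 32 |
| Metaplasia | 10 (29%) | 43 (57%) | 14 (100%) | 67 |
| p16INK4A-positive cases | | | | |
| CIN 2,3+ | 114 (97%) | 14 (17%) | 1 (14%) | 129 |
| CIN 1 | 2 (1.7%) | 59 (73%) | 2 (29%) | 63 |
| Metaplasia | 1 (1%) | 8 (10%) | 4 (57%) | 13 |

CIN 2,3+ includes cases classified as CIN 2, CIN 3 and invasive cervical cancer. p16 staining is negative or positive for any pattern of staining at any intensity.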
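The κ values quoted in the text can be recomputed from the counts in Table II. The following is a minimal sketch of unweighted Cohen's κ, with hypothetical function and variable names; the first three data rows of the table are taken as the p16INK4A-negative cases and the last three as the p16INK4A-positive cases, which is how the column totals (34 and 117 cases originally called CIN 2,3+) line up with the text.

```python
def cohen_kappa(matrix):
    """Unweighted Cohen's kappa for a square agreement matrix of counts."""
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)


# Rows: consensus diagnosis (CIN 2,3+, CIN 1, metaplasia).
# Columns: original diagnosis in the same order.
p16_negative = [
    [23, 2, 0],
    [1, 31, 0],
    [10, 43, 14],
]
p16_positive = [
    [114, 14, 1],
    [2, 59, 2],
    [1, 8, 4],
]

kappa_negative = cohen_kappa(p16_negative)
kappa_positive = cohen_kappa(p16_positive)
```

Rounded to three decimals these come out at 0.378 and 0.736, matching the values reported for the p16INK4A-negative and p16INK4A-positive cases respectively.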

p16INK4A-negative CIN 2,3 lesions

All 25 cases that were classified as CIN 2,3 by consensus pathology, but that were p16INK4A-negative, were reviewed in an unblinded fashion. These cases tended to have a similar histological appearance (Figure 2). Many had abundant eosinophilic cytoplasm with enlarged hyperchromatic nuclei, coarsely clumped chromatin and prominent chromocenters. Mitoses were readily identified in 23 (91%) of the 25 lesions. In most, mitoses could be identified at approximately the midway point of the epithelium. In approximately one-third (n = 8) of the cases, mitoses were identified in the upper half of the epithelium. This histological appearance is reminiscent of reactive/reparative changes, but with greater nuclear atypia than is usually associated with such changes.

Figure 2. p16INK4A-negative lesion classified as CIN 2,3 by “consensus” pathology. (a) Hematoxylin and eosin stained section showing cells with considerable amounts of eosinophilic cytoplasm, nuclear enlargement and coarse granular chromatin and (b) p16INK4A immunohistochemical staining is negative.

The p16INK4A-negative CIN 2,3 lesions could also be separated from the p16INK4A-positive CIN 2,3 lesions using a number of nonhistological criteria (Table III). For example, the Cervigram results obtained at the screening visits were significantly different (p < 0.001) for the two groups. Ninety-one percent of the Cervigram results were classified as negative for the p16INK4A-negative cases compared to only 37% for the p16INK4A-positive cases. Similarly, the cervical cytology and HPV DNA testing results obtained at enrollment as well as the Reid Colposcopic Index score obtained at the time of colposcopy were significantly different for p16INK4A-positive and -negative cases (p < 0.001 for all). For example, a Reid Colposcopic Index of 5–8, which indicates “probable CIN 2,3,” was obtained at the time of colposcopy in only 8% of the p16INK4A-negative cases compared to 43% of the p16INK4A-positive cases. Moreover, not only was a higher proportion of the p16INK4A-negative cases high-risk HPV DNA negative using hc2, but the median RLU (relative light units, an indicator of the amount of HPV DNA present in a sample) among those who were HPV DNA positive was also lower in the p16INK4A-negative cases than in the p16INK4A-positive cases.

Table III. Nonhistological Features of Cases with a “Consensus” Diagnosis of CIN 2,3

| Feature | p16INK4a-negative (n = 25), number (%) of cases¹ | p16INK4a-positive (n = 114), number (%) of cases¹ | p-value² |
|---|---|---|---|
| Cervigram result | | | <0.001 |
| Negative | 19 (90.5) | 38 (36.9) | |
| Positive | 1 (5.8) | 1 (1.0) | |
| P0³ | 0 | 5 (4.9) | |
| CIN 1 | 1 (5.8) | 33 (32.0) | |
| CIN 2,3 | 0 | 21 (20.4) | |
| Suspicious for cancer | 0 | 5 (4.9) | |
| Cytology result | | | <0.001 |
| WNL | 16 (66.6) | 11 (9.8) | |
| ASCUS | 4 (16.7) | 3 (2.7) | |
| LSIL | 3 (12.5) | 33 (29.5) | |
| HSIL | 1 (4.2) | 62 (55.4) | |
| Cancer | 0 | 3 (2.7) | |
| Reid colposcopic index | | | <0.001 |
| 1–2 | 15 (60.0) | 19 (17.6) | |
| 3–4 | 8 (32.0) | 43 (39.8) | |
| 5–8 | 2 (8.0) | 46 (42.6) | |
| HPV DNA status using hc2 | | | <0.001 |
| High-risk negative | 11 (44.0) | 3 (2.6) | |
| High-risk positive | 14 (56.0) | 111 (97.4) | |
| Median RLU value⁴ | 9.03 (2.56–329.3) | 201.1 (44–611) | 0.006 |

¹ Numbers do not add up to totals due to missing data and are as shown. Cases of invasive cervical cancer are not included.
² χ2 test for trend except for median RLU value, which is by Kruskal-Wallis test.
³ P0 is “warrants colposcopy” diagnosis.
⁴ If high-risk HPV DNA positive by hc2. RLU is relative light units, an indicator of HPV DNA content or viral load.

Women whose biopsies were originally diagnosed as CIN 2,3 were referred for treatment using LEEP. Follow-up examinations that incorporated colposcopy and biopsy of all visible lesions were obtained at 4 and 10 months in 92 women with a “consensus” histological diagnosis of CIN 2,3. Of these 92 cases, 80 (87%) were also p16INK4A-positive. During follow-up, recurrent CIN 2,3 was diagnosed in 8 (9%) of the 92 women. All 8 were originally p16INK4A-positive. None of the 12 women with a p16INK4A-negative CIN 2,3 lesion developed recurrent or persistent CIN 2,3.

Impact of “gold standard” definition on test performance

The criteria that were used to define the “gold standard” affected the estimates of sensitivity of the 3 screening tests that were evaluated in the screening trials (Table IV). When the original study diagnosis was used as the “gold standard,” the sensitivity of HPV DNA testing was only 86.8% (95% CI: 81.3–92.2) for CIN 2,3+. This increased to 90.3% (95% CI: 85.6–94.9) when the “consensus” diagnosis was used as the “gold standard” and to 96.9% (95% CI: 93.1–99.9) when the definition of the “gold standard” required not only a consensus diagnosis of CIN 2,3+ but also that the lesion be p16INK4A-positive. Similar increases were observed in the sensitivity of cervical cytology. When the “gold standard” was defined as the original “study” diagnosis, the sensitivity of cytology was only 69.9% (95% CI: 62.4–77.3). This increased to 88.0% (95% CI: 82.3–93.7) when the gold standard was defined as the “consensus” diagnosis of CIN 2,3+ and p16INK4A positivity. In contrast to the results observed for HPV DNA testing and cytology, changing the definition of the “gold standard” did not improve the sensitivity of VIA.

Table IV. Effect of Changing Definition of Gold Standard on Screening Test Performance

Estimate of sensitivity (95% CI) of screening test

| Screening test | Original diagnosis of CIN 2,3+¹ | Consensus diagnosis of CIN 2,3+ | Consensus diagnosis of CIN 2,3+ and p16INK4a-positive² |
|---|---|---|---|
| HPV DNA testing | 86.8 (81.3–92.2) | 90.3 (85.6–94.9) | 96.9 (93.1–99.9) |
| Cervical cytology³ | 69.9 (62.4–77.3) | 76.5 (69.7–83.3) | 88.0 (82.3–93.7) |
| VIA | 71.5 (64.3–78.7) | 68.2 (60.8–75.5) | 67.4 (59.4–75.5) |

¹ CIN 2,3+ includes cases classified as CIN 2, CIN 3 and invasive cervical cancer.
² p16INK4a staining is positive for any pattern of staining at any intensity.
³ At a cutoff of ASCUS or greater.

HPV DNA testing was significantly more sensitive than the other 2 tests using any definition for the “gold standard.” However, VIA and conventional cytology did not have significantly different sensitivities when the “gold standard” did not include p16INK4A staining. Once the “gold standard” was improved with consensus pathology and p16INK4A staining, conventional cytology had significantly better sensitivity than VIA (p < 0.0001).


Discussion

Interpretation of cervical biopsies is widely recognized to be subjective and to have a significant degree of inter- and intraobserver variation. This has implications for patient care, since it means that clinical management guidelines must be inherently conservative to take into account the possibility that a given patient may not actually have the lesion for which they are being treated. It also has implications for evaluating the performance of new cervical cancer screening technologies, which are usually evaluated against the best available, but imperfect, gold standard of biopsy-confirmed CIN 2,3 and cancer. Recently, there has been interest in developing biomarkers that could assist in the diagnosis of CIN 2,3 lesions.6 One of the most promising of the new biomarkers is p16INK4a. In general, a strong association has been found between high levels of p16INK4a within a lesion by immunohistochemical staining and a diagnosis of CIN 2,3.7, 9, 29 Immunohistochemical staining of cervical biopsies for p16INK4a can reduce interobserver variation in their histopathologic interpretation.23 Moreover, women with cervical biopsies that are p16INK4a-positive but not diagnosed as CIN 2,3 are at greatly elevated risk for subsequently being diagnosed with CIN 2,3, and women with p16INK4a-negative CIN 1 lesions are less likely to progress to CIN 3 than those with p16INK4a-positive CIN 1.16, 20

In our study, we observed good correlation between p16INK4A immunohistochemical staining and a consensus pathological diagnosis of CIN 2,3+, consistent with what has been shown previously in the literature.7, 9, 29 Using archival tissue we observed some heterogeneity of p16INK4a staining in CIN 2,3+, which may reflect either variations in the degree of overexpression of p16INK4a within the lesions or variability in antigen retrieval. Lesions that were classified as CIN 2,3+ on consensus pathology but that were p16INK4A-negative had histological features suggestive of immature squamous metaplasia and did not recur after treatment. Women with these lesions were also significantly more likely (p < 0.001) to have a negative cervical cytology result, to be high-risk HPV DNA negative, to have a negative Cervigram result and to be classified as not CIN 2,3 using the Reid Colposcopic Index at the time of colposcopy than were women with lesions that were p16INK4A-positive. Moreover, if high-risk HPV DNA positive, the women with p16INK4A-negative lesions had a significantly lower HPV viral load (p = 0.006). Although proof that these p16INK4A-negative lesions are not truly neoplastic would require prospective follow-up, taken together, our findings suggest that these p16INK4A-negative lesions represent histological mimics of CIN 2,3 rather than true cervical cancer precursors. Thus, incorporating p16INK4a staining when diagnosing CIN 2,3 might be expected to reduce measurement error in the inherently error-prone “gold standard” conventionally used to assess the performance of cervical cancer screening tests.

Significant improvements in the estimated sensitivity of both cervical cytology and HPV DNA testing occurred as we attempted to improve the accuracy of the histopathological diagnosis of CIN 2,3. Simply changing from having a single pathologist evaluate the slides to having a “consensus” pathological diagnosis by 2 pathologists improved the sensitivity of HPV DNA testing from 87 to 90% and that of conventional cervical cytology from 70 to 77%. Incorporating p16INK4a immunohistochemistry together with a “consensus” pathological diagnosis resulted in further improvements in the sensitivity of both HPV DNA testing and cervical cytology. When a diagnosis of CIN 2,3 required both a consensus pathological diagnosis of CIN 2,3 and p16INK4a positivity, the sensitivity of HPV DNA testing increased to 97% and the sensitivity of cervical cytology increased to 88%. In contrast to what was seen for both HPV DNA testing and cervical cytology, changing the diagnostic criteria did not improve the sensitivity of VIA, and relative to the other tests the performance of VIA deteriorated as the measurement error in the “gold standard” was reduced. When interpreting these results it is important to bear in mind that the original pathological diagnoses were rendered by a gynecological pathologist who specializes in cervical pathology. Even greater bias in estimates of sensitivity might be introduced if someone with less experience in diagnosing cervical biopsies were responsible for determining the “gold standard.”

It is well established that the use of an imperfect “gold standard” to determine disease status can introduce a significant bias into measures of a diagnostic or screening test's performance.21 The direction of the bias is determined by whether or not the imperfect standard and the test being evaluated tend to err on the same patients. If no such tendency exists then the test being evaluated and the imperfect standard show “conditional independence.”30 When conditional independence exists, both sensitivity and specificity of the test being evaluated will be underestimated when it is evaluated against the imperfect standard. This is what was observed with HPV DNA testing and conventional cytology. In contrast, when a test and the imperfect standard have a tendency to err in the same patients, the test's performance characteristics can be either over- or underestimated, depending on the degree to which they misclassify the same patients.30 If the classification errors are highly correlated for both the test and the imperfect standard then the capabilities of the test are overestimated. This is because the test receives credit for identifying the same patients that the imperfect standard misclassifies as having the condition being tested for. Since the estimated sensitivity of VIA decreased as the criteria used to define the “gold standard” were made more rigorous, it appears that classification errors occurring with VIA actually correlate with histopathologic classification errors. This interpretation is supported by our histopathologic review of the p16INK4a-negative cases that were classified as CIN 2,3 by consensus pathology. These cases are histologically similar and have features of immature squamous metaplasia, but with an unusually pronounced degree of nuclear atypia and mitotic activity. In retrospect, these lesions most likely represent areas of immature squamous metaplasia with marked reactive atypia.
A characteristic trait of regions of immature squamous metaplasia is that they develop acetowhitening after the application of a 5% acetic acid solution.31 This would cause such women to have been classified as VIA-positive and referred for colposcopy. Since areas of immature squamous metaplasia are difficult to distinguish from CIN by colposcopic appearance alone, these areas are frequently biopsied and would be at risk for being misdiagnosed by pathology as CIN 2,3. This sequence of events could explain why the classification errors for both VIA and the histopathologic interpretation of CIN 2,3 would be expected to correlate and the capabilities of VIA overestimated in cross-sectional studies.
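The direction of the bias described above can be illustrated with a small calculation. The sketch below is not taken from the study: the function name and every number in the example calls are hypothetical, and it assumes that the test and the imperfect standard err independently among truly diseased women. The only thing that varies is how often the test is positive among nondiseased women whom the imperfect standard wrongly labels as diseased.

```python
def apparent_sensitivity(prevalence, se_test, se_ref, sp_ref, test_pos_rate_on_ref_fp):
    """Sensitivity of a test as measured against an imperfect reference standard.

    prevalence              true prevalence of disease
    se_test                 true sensitivity of the test under evaluation
    se_ref, sp_ref          sensitivity and specificity of the imperfect standard
    test_pos_rate_on_ref_fp probability that the test is positive in a nondiseased
                            woman whom the standard misclassifies as diseased
                            (equals 1 - test specificity under conditional independence)
    """
    ref_true_pos = prevalence * se_ref            # diseased, standard positive
    ref_false_pos = (1 - prevalence) * (1 - sp_ref)  # nondiseased, standard positive
    test_pos_among_ref_pos = (
        ref_true_pos * se_test + ref_false_pos * test_pos_rate_on_ref_fp
    )
    return test_pos_among_ref_pos / (ref_true_pos + ref_false_pos)


# Hypothetical scenario 1: conditional independence. A test with true sensitivity
# 0.90 and specificity 0.85 is positive in only 15% of the standard's false positives.
independent = apparent_sensitivity(0.05, 0.90, 0.95, 0.98, 0.15)

# Hypothetical scenario 2: correlated errors. A test with true sensitivity 0.70 is
# positive in every woman the standard misclassifies as diseased.
correlated = apparent_sensitivity(0.05, 0.70, 0.95, 0.98, 1.0)
```

In the first scenario the measured sensitivity falls below the true value of 0.90, because the standard's false positives are mostly test-negative and are counted as missed cases. In the second it rises above the true value of 0.70, because the test is credited with detecting women who do not have disease. This mirrors the pattern reported for HPV DNA testing and cytology on the one hand and VIA on the other.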

In summary, we have demonstrated that the use of a novel biomarker, p16INK4a, to refine the gold standard diagnosis of CIN 2,3 in cervical cancer screening trials has a marked impact on estimates of screening test performance. In some instances routine histopathological examination resulted in a marked underestimate of test sensitivity. Because new screening technologies can rarely be evaluated for their ability to actually reduce the incidence of invasive cervical cancer, it is important that studies evaluating their performance utilize biomarkers such as p16INK4a and expert consensus pathology.


  1. Top of page
  2. Abstract
  3. Material and methods
  4. Results
  5. Discussion
  6. References
  • 1
    Robertson AJ, Anderson JM, Beck JS, Burnett RA, Howatson SR, Lee FD, Lessells AM, McLaren KM, Moss SM, Simpson JG. Observer variability in histopathological reporting of cervical biopsy specimens. J Clin Pathol 1989; 42: 2318.
  • 2
    Stoler MH, Schiffman M. Interobserver reproducibility of cervical cytologic and histologic interpretations: realistic estimates from the ASCUS-LSIL triage study. JAMA 2001; 285: 15005.
  • 3
    Ismail SM, Colelough AB, Dinnen JS, Eakins D, Evans DM, Gradwell E, O'Sullivan JP, Summerell JM, Newcombe RG. Observer variation in histopathological diagnosis and grading of cervical intraepithelial neoplasia. BMJ 1989; 298: 70710.
  • 4
    McCluggage WG, Walsh MY, Thornton CM, Hamilton PW, Date A, Caughley LM, Bharucha H. Inter- and intra-observer variation in the histopathological reporting of cervical squamous intraepithelial lesions using a modified Bethesda grading system. Br J Obstet Gynaecol 1998; 105: 20610.
  • 5
    Chen Y, Miller C, Mosher R, Zhao X, Deeds J, Morrissey M, Bryant B, Yang D, Meyer R, Cronin F, Gostout BS, Smith-McCune K, et al. Identification of cervical cancer markers by cDNA and tissue microarrays. Cancer Res 2003; 63: 192735.
  • 6
    von Knebel Doeberitz M. New markers for cervical dysplasia to visualise the genomic chaos created by aberrant oncogenic papillomavirus infections. Eur J Cancer 2002; 38: 222942.
  • 7
    Keating JT, Cviko A, Riethdorf S, Riethdorf L, Quade BJ, Sun D, Duensing S, Sheets EE, Munger K, Crum CP. Ki-67, cyclin E, and p16INK4 are complimentary surrogate biomarkers for human papilloma virus-related cervical neoplasia. Am J Surg Pathol 2001; 25: 88491.
  • 8
    Cho NH, Kim YT, Kim JW. Alteration of cell cycle in cervical tumor associated with human papillomavirus: cyclin-dependent kinase inhibitors. Yonsei Med J 2002; 43: 7228.
  • 9
    Sano T, Oyama T, Kashiwabara K, Fukuda T, Nakajima T. Expression status of p16 protein is associated with human papillomavirus oncogenic potential in cervical and genital lesions. Am J Pathol 1998; 153: 17418.
  • 10
    Park TW, Riethdorf S, Schulz G, Riethdorf L, Wright T, Loning T. Clonal expansion and HPV-induced immortalization are early molecular alterations in cervical carcinogenesis. Anticancer Res 2003; 23: 155–60.
  • 11
    Kruse AJ, Baak JP, Janssen EA, Bol MG, Kjellevold KH, Fianne B, Lovslett K, Bergh J. Low- and high-risk CIN 1 and 2 lesions: prospective predictive value of grade, HPV, and Ki-67 immuno-quantitative variables. J Pathol 2003; 199: 462–70.
  • 12
    Alazawi WO, Morris LS, Stanley MA, Garrod DR, Coleman N. Altered expression of desmosomal components in high-grade squamous intraepithelial lesions of the cervix. Virchows Arch 2003; 443: 51–6.
  • 13
    Wang PH, Chang H, Ko JL, Lin LY. Nm23-H1 immunohistochemical expression in multisteps of cervical carcinogenesis. Int J Gynecol Cancer 2003; 13: 325–30.
  • 14
    Chil A, Sikorski M, Bobek M, Jakiel G, Marcinkiewicz J. Alterations in the expression of selected MHC antigens in premalignant lesions and squamous carcinomas of the uterine cervix. Acta Obstet Gynecol Scand 2003; 82: 1146–52.
  • 15
    Bosch FX, de Sanjose S. Chapter 1: Human papillomavirus and cervical cancer—burden and assessment of causality. J Natl Cancer Inst Monogr 2003; 31: 3–13.
  • 16
    Wang SS, Trunk M, Schiffman M, Herrero R, Sherman ME, Burk RD, Hildesheim A, Bratti MC, Wright T, Rodriguez AC, Chen S, Reichert A, et al. Validation of p16INK4a as a marker of oncogenic human papillomavirus infection in cervical biopsies from a population-based cohort in Costa Rica. Cancer Epidemiol Biomarkers Prev 2004; 13: 1355–60.
  • 17
    Kalof AN, Evans MF, Simmons-Arnold L, Beatty BG, Cooper K. p16INK4A immunoexpression and HPV in situ hybridization signal patterns: potential markers of high-grade cervical intraepithelial neoplasia. Am J Surg Pathol 2005; 29: 674–9.
  • 18
    Agoff SN, Lin P, Morihara J, Mao C, Kiviat NB, Koutsky LA. p16(INK4a) expression correlates with degree of cervical neoplasia: a comparison with Ki-67 expression and detection of high-risk HPV types. Mod Pathol 2003; 16: 665–73.
  • 19
    Klaes R, Friedrich T, Spitkovsky D, Ridder R, Rudy W, Petry U, Dallenbach-Hellweg G, Schmidt D, von Knebel Doeberitz M. Overexpression of p16(INK4A) as a specific marker for dysplastic and neoplastic epithelial cells of the cervix uteri. Int J Cancer 2001; 92: 276–84.
  • 20
    Negri G, Vittadello F, Romano F, Kasal A, Rivasi F, Girlando S, Mian C, Egarter-Vigl E. p16INK4a expression and progression risk of low-grade intraepithelial neoplasia of the cervix uteri. Virchows Arch 2004; 445: 616–20.
  • 21
    Valenstein PN. Evaluating diagnostic tests with imperfect standards. Am J Clin Pathol 1990; 93: 252–8.
  • 22
    Franco EL. Statistical issues in human papillomavirus testing and screening. Clin Lab Med 2000; 20: 345–67.
  • 23
    Klaes R, Benner A, Friedrich T, Ridder R, Herrington S, Jenkins D, Kurman RJ, Schmidt D, Stoler M, von Knebel Doeberitz M. p16INK4a immunohistochemistry improves interobserver agreement in the diagnosis of cervical intraepithelial neoplasia. Am J Surg Pathol 2002; 26: 1389–99.
  • 24
    Denny L, Kuhn L, Pollack A, Wainwright H, Wright TC Jr. Evaluation of alternative methods of cervical cancer screening for resource-poor settings. Cancer 2000; 89: 826–33.
  • 25
    Sankaranarayanan R, Sellors J. Colposcopy and treatment of cervical intraepithelial neoplasia: a beginner's manual. Lyon, France: IARC, 2003.
  • 26
    Kuhn L, Denny L, Pollack A, Lorincz A, Richart RM, Wright TC. Human papillomavirus DNA testing for cervical cancer screening in low-resource settings. J Natl Cancer Inst 2000; 92: 818–25.
  • 27
    Denny L, Kuhn L, Pollack A, Wright TC Jr. Direct visual inspection for cervical cancer screening: an analysis of factors influencing test performance. Cancer 2002; 94: 1699–707.
  • 28
    Shi SR, Key ME, Kalra KL. Antigen retrieval in formalin-fixed, paraffin-embedded tissues: an enhancement method for immunohistochemical staining based on microwave oven heating of tissue sections. J Histochem Cytochem 1991; 39: 741–8.
  • 29
    Dray M, Russell P, Dalrymple C, Wallman N, Angus G, Leong A, Carter J, Cheerala B. p16(INK4a) as a complementary marker of high-grade intraepithelial lesions of the uterine cervix. I. Experience with squamous lesions in 189 consecutive cervical biopsies. Pathology 2005; 37: 112–24.
  • 30
    Valenstein PN, Emancipator K. Sensitivity, specificity, and reproducibility of four measures of laboratory turnaround time. Am J Clin Pathol 1989; 91: 452–7.
  • 31
    Singer A, Monaghan JM. Lower genital tract precancer: colposcopy, pathology and treatment. London: Blackwell, 1994.