Interrater agreement of anal cytology


  • This article is a US government work and, as such, is in the public domain in the United States of America.



The majority of anal cancers are caused by persistent infections with carcinogenic human papillomaviruses (HPV). Similar to cervical carcinogenesis, the progression from HPV infection to anal cancer occurs through precancerous lesions that can be treated to prevent invasion. In analogy to cervical cytology, anal cytology has been proposed as a screening tool for anal cancer precursors in high-risk populations.


The authors analyzed the interobserver reproducibility of anal cytology in a population of 363 human immunodeficiency virus (HIV)-infected men who have sex with men (MSM). Liquid-based cytology (LBC) specimens were collected in the anal dysplasia clinic before the performance of high-resolution anoscopy on all patients. Papanicolaou-stained LBC slides were evaluated by 2 cytopathologists, each of whom was blinded to the clinical outcome and the other pathologist's results, using the revised Bethesda terminology.


Overall agreement between the 2 observers was 66% (kappa, 0.54; linear-weighted kappa, 0.69). Using dichotomized cytology results (atypical squamous cells of undetermined significance [ASC-US] or worse vs less than ASC-US), the agreement increased to 86% (kappa, 0.69). An increasing likelihood of testing positive for markers associated with HPV-related transformation, p16/Ki-67, and HPV oncogene messenger RNA was observed, with increasing severity of cytology results noted both for individual cytologists and for consensus cytology interpretation (P value for trend [ptrend] < .0001 for all).


Moderate to good agreement was observed between 2 cytopathologists evaluating anal cytology samples collected from HIV-positive MSM. A higher severity of anal cytology was associated with biomarkers of anal precancerous lesions. Anal cytology may be used for anal cancer screening in high-risk populations, and biomarkers of HPV-related transformation can serve as quality control for anal cytology. Cancer (Cancer Cytopathol) 2013. Published 2012 by the American Cancer Society.


Anal cancer is uncommon in the general population, with incidence rates of approximately 2 per 100,000 population in the United States.1 In certain high-risk populations, such as men who have sex with men (MSM) and human immunodeficiency virus (HIV)-positive men and women, the risks for anal cancer can be much higher and may approach the risk of cervical cancer reported in unscreened populations of women. In MSM, anal cancer rates are estimated to be 40 per 100,000 population,2-4 and in HIV-positive MSM the risk of anal cancer may be 2-fold to 4-fold higher2, 3, 5, 6 or more than in HIV-negative MSM. A recent analysis of 13 cohorts found that HIV-positive MSM were at the highest risk of developing anal cancer, followed by HIV-positive men or women, and all were at a much higher risk than HIV-uninfected populations.7

Analogous to cervical cytology, anal cytology has been recommended as a method of screening for the prevention of anal cancer through the detection and treatment of precancerous lesions, namely anal intraepithelial neoplasia of grade 3 (AIN3) and grade 2 (AIN2). Surprisingly, unlike for cervical cytology, there are limited data regarding the interobserver or interrater agreement of anal cytology. A previous study of 120 cytology slides from HIV-infected men reported a weighted kappa of 0.54 for agreement between 4 pathologists evaluating the slides independently.8

Because anal cancer has the same cause as cervical cancer, namely persistent infection by high-risk human papillomavirus (HR-HPV), HPV measurements and related biomarkers might be used as objective measures for quality control of anal cytology, just as HR-HPV is used for cervical cytology, specifically for atypical squamous cells of undetermined significance (ASC-US).9-11 Examples of other potentially useful biomarkers include the detection of the HPV E6/E7 oncogene messenger RNA (mRNA), p16INK4a, and HPV type 16 (HPV-16),12 which is the most carcinogenic HPV genotype. Comparisons of anal cytology and histology results from laboratory data might also provide benchmarks for anal cytology.8, 13, 14

To examine the issue of interrater agreement of anal cytology and the relation between biomarkers and anal cytologic interpretations, we conducted an analysis in a population of HIV-positive MSM enrolled at an anal cancer screening clinic in the Kaiser Permanente Northern California (KPNC) health maintenance organization.


Study Population

The study was based at the San Francisco KPNC Anal Cancer Screening Clinic. We enrolled men who were identified as positive for HIV through the Kaiser HIV registry, who were aged ≥ 18 years, who were not diagnosed with anal cancer before enrollment, and who provided informed consent. In total, 363 men were enrolled between August 2009 and June 2010. The study was reviewed and approved by the institutional review boards at KPNC and at the National Cancer Institute. All participants were asked to complete a self-administered questionnaire to collect risk factor information. Additional information regarding HIV status and medication, sexually transmitted diseases, and histopathology results were abstracted from the KPNC clinical database.

For 87 of the 271 subjects without biopsy-proven AIN2 or AIN3 at the time of enrollment, follow-up information concerning outcomes from additional clinic visits up to December 2011 was available and included in the analysis to correct for the possible imperfect sensitivity of high-resolution anoscopy (HRA).13, 15

Clinical Examination, Evaluation, and Results

During the clinical examination, 2 specimens were collected by inserting a wet flocked nylon swab16 into the anal canal up to the distal rectal vault and withdrawing with rotation and lateral pressure. Both specimens were transferred to PreservCyt medium (Hologic, Bedford, Mass). A third specimen was collected for routine testing for Chlamydia trachomatis and Neisseria gonorrhoeae. After specimen collection, participants underwent a digital anorectal examination followed by HRA. All lesions that appeared suspicious on HRA were biopsied and sent for routine histopathological review by KPNC pathologists, and were subsequently graded as condyloma or AIN1 through AIN3. No cancers were observed in this study population.

From the first specimen, a ThinPrep slide (Hologic) was prepared for routine Papanicolaou staining and evaluation. Two pathologists (T.D. and D.T.) reviewed the slides independently. Cytology results were reported analogous to the Bethesda classification17 for cervical cytology except when otherwise noted. The following categories were used: negative for intraepithelial lesion or malignancy (NILM); ASC-US; atypical squamous cells cannot rule out high-grade squamous intraepithelial lesion (HSIL) (ASC-H); low-grade squamous intraepithelial lesion (LSIL); HSIL, favor AIN2 (HSIL-AIN2); and HSIL-AIN3. ASC-H, HSIL-AIN2, and HSIL-AIN3 were combined into a single high-grade cytology category for the current analysis.

Biomarker Testing

Using the residual specimen from the first collection, mtm Laboratories AG (Heidelberg, Germany) performed the p16INK4a/Ki-67 dual immunostaining (“p16/Ki-67 staining”) using their CINtec Plus cytology kit according to their specifications. A ThinPrep 2000 processor (Hologic) was used to prepare a slide, which then was stained according to the manufacturer's instructions. The CINtec Plus cytology kit was then applied to the unstained cytology slide for p16/Ki-67 staining.

On the second collected specimen, Roche Molecular Systems (Pleasanton, Calif) tested for HR-HPV, including separate detection of HPV-16 and HPV-18 DNA, using their cobas 4800 HPV test. To prepare DNA for the cobas test, automated sample extraction was performed as follows: 500 μL of the PreservCyt specimen was pipetted into a secondary tube (a nonpyrogenic, sterile Falcon 5-mL polypropylene round-bottom tube measuring 12 mm by 75 mm). The tube was capped, mixed by vortexing, uncapped, placed on the x-480 specimen rack, and loaded onto the x-480 sample extraction module of the cobas 4800 system. The x-480 extraction module then input 400 μL of this material into the specimen preparation process. The extracted DNA was then tested as previously described.16

NorChip AS (Klokkarstua, Norway) also tested the second specimen for HPV-16, -18, -31, -33, and -45 HPV E6/E7 mRNA using their PreTect HPV-Proofer assay according to their specifications. All testing was performed masked to the results of the other assays, clinical outcomes, and patient characteristics.

Statistical Analysis

For the agreement between the 2 cytology raters, we calculated the total agreement with a binomial 95% confidence interval (95% CI). We calculated the Cohen kappa with 95% CI as a chance-corrected measure of agreement as described by Shoukri.18 Because unweighted kappa treats every disagreement equally, regardless of how far apart the 2 categories are, we also calculated linear-weighted kappa with 95% CI for the ordered cytology categories. With linear weights, disagreement between adjacent categories reduces kappa less than disagreement between nonadjacent categories. Kappa values ≤ 0.20 were interpreted as poor, values between 0.21 and 0.40 were interpreted as fair, values between 0.41 and 0.60 were interpreted as moderate, values between 0.61 and 0.80 were interpreted as good, and values > 0.80 were interpreted as very good. Exact versions of symmetry (4-category) and McNemar (2-category) chi-square tests were used to test for statistically significant differences in the distribution of the cytologic interpretations between raters. A nonparametric test of trend was used to assess whether the percentage of positive results for each biomarker, and the risk of AIN2 or higher (AIN2+), increased with increasing severity of the cytologic interpretation.19 Finally, a Fisher exact test was used to test for differences in the percentage of positive results for each biomarker between subgroups defined by the paired cytologic interpretations.
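The unweighted and linear-weighted kappa statistics described above can be illustrated with a minimal sketch. The 3-by-3 table below is hypothetical and is not study data, the function names `cohen_kappa` and `interpret` are introduced here for illustration only, and confidence intervals are omitted. The interpretation bands follow the scale given in the text.

```python
def cohen_kappa(table, linear_weights=False):
    """Cohen kappa for a square rater-by-rater count table.

    table[i][j] = number of slides rated category i by rater 1 and
    category j by rater 2, with categories ordered by severity.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Linear weights give partial credit that shrinks with distance;
        # unweighted kappa credits exact agreement only.
        if linear_weights:
            return 1 - abs(i - j) / (k - 1)
        return 1.0 if i == j else 0.0

    p_obs = sum(weight(i, j) * table[i][j]
                for i in range(k) for j in range(k)) / n
    p_exp = sum(weight(i, j) * row_tot[i] * col_tot[j]
                for i in range(k) for j in range(k)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


def interpret(kappa):
    """Map a kappa value to the verbal scale used in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical counts for illustration only (NOT the study's Table 1).
toy = [[20, 5, 0],
       [5, 10, 5],
       [0, 5, 10]]

unweighted = cohen_kappa(toy)
weighted = cohen_kappa(toy, linear_weights=True)
print(round(unweighted, 3), interpret(unweighted))
print(round(weighted, 3), interpret(weighted))
```

In this toy table every disagreement falls in an adjacent category, so the linear-weighted kappa is higher than the unweighted kappa, which is the same direction of difference reported for the 2 raters in this study.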


The 363 men enrolled in the current study had a median age of 53 years and a mean age of 53 years (range, 26 years-79 years). The majority of men were users of highly active antiretroviral therapy (93%), 89% of the men had an HIV viral load < 75 copies, and 97% had a cluster of differentiation 4 (CD4) count > 200 cells/μL (82% had CD4 counts > 350 cells/μL) at the time of enrollment. Of the 363 men who enrolled in the study, 339 (93%) had cytologic interpretations available from both study cytopathologists and these formed the basis of the current analysis. The 24 men who were not included in the analysis because of missing cytology interpretations had a nonsignificantly lower percentage of HR-HPV DNA positivity (65% vs 80%; P = .09).

Table 1 shows the comparison of the cytologic interpretations by the 2 cytopathologists (raters). The first rater called 33% of the samples as negative, 22% as ASC-US, 20% as LSIL, and 26% as high-grade cytology. The second rater called 43% of the samples as negative, 10% as ASC-US, 24% as LSIL, and 23% as high-grade cytology. The crude agreement was 66% (95% CI, 61%-71%), the kappa was 0.54 (95% CI, 0.47-0.60), and the linear-weighted kappa was 0.69 (95% CI, 0.63-0.74). The first rater was more likely to interpret the cytology as more severe (P < .0001). When the cytology was recategorized as negative or ASC-US or more severe, the crude agreement was 86% (95% CI, 82%-90%) and the kappa was 0.69 (95% CI, 0.61-0.76). Rater 1 was more likely to interpret the cytology as ASC-US or more severe (P < .0001).
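As a rough check on the crude agreement intervals reported above, the sketch below applies a normal-approximation (Wald) interval to the rounded proportions with n = 339. This is an assumption made for illustration: the text states only that a binomial 95% CI was used and does not specify the method, and the exact agreement counts are not given, so the rounded percentages stand in for them. The function name `wald_ci` is hypothetical.

```python
from math import sqrt


def wald_ci(p, n, z=1.959964):
    """Normal-approximation 95% CI for a proportion p observed in n pairs."""
    half_width = z * sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# Rounded proportions taken from the text; 339 paired interpretations.
overall = wald_ci(0.66, 339)       # 4-category crude agreement
dichotomized = wald_ci(0.86, 339)  # negative vs ASC-US or more severe

print([round(x, 2) for x in overall])
print([round(x, 2) for x in dichotomized])
```

Under these assumptions the intervals round to the same bounds as those reported (61%-71% and 82%-90%), although an exact binomial method could differ slightly at other sample sizes or proportions.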

Table 1. Interrater Agreement for Cytologic Interpretation by 2 Raters

                        Rater 2
Rater 1     Negative    ASC-US    LSIL    High-grade cytology (a)    Total
Negative    104 (b)     4 (c)     3       1                          112

Abbreviations: ASC-US, atypical squamous cells of undetermined significance; LSIL, low-grade squamous intraepithelial lesion.
(a) High-grade cytology includes high-grade squamous intraepithelial lesion (HSIL) and atypical squamous cells cannot rule out HSIL.
(b) Bold type highlights exact agreement.
(c) Italic type indicates those cells that contribute the greatest to disagreement.

Table 2 shows the relations between various biomarkers and the risk of having a histologic diagnosis of AIN2+ with the individual and paired cytologic interpretations. There was a significant trend (P value for trend [ptrend] < .0001) toward an increasing likelihood of testing positive for any of the biomarkers and/or having a diagnosis of AIN2+ with increasing severity of the cytologic interpretation for each rater individually. Similarly, there was a significant trend (ptrend < .0001) toward an increasing likelihood of testing positive for any of the biomarkers and/or having a diagnosis of AIN2+ with increasing severity of the consensus cytologic interpretation. Although the numbers for specific pairs of discordant cytologic interpretations were small, making generalization difficult, there was a tendency for these paired results to reflect a mixture of both overcalled and undercalled cytologic interpretations, as indicated by the intermediate positivity of the biomarker results compared with the consensus paired results (ie, ASC-US/ASC-US < ASC-US/LSIL or LSIL/ASC-US < LSIL/LSIL).

Table 2. Relation Between Biomarker Results and Paired Cytology Results From 2 Raters (a)

                           Rater 2
Rater 1     Biomarker    Negative        ASC-US    LSIL      High-grade cytology (b)    Total
Negative    HPV-16+      15 (14%) (c)    0 (0%)    0 (0%)    0 (0%)                     15 (13%)

Abbreviations: +, positive; AIN2, anal intraepithelial neoplasia grade 2; ASC-US, atypical squamous cells of undetermined significance; HPV-16, human papillomavirus type 16; HR-HPV, high-risk human papillomavirus; LSIL, low-grade squamous intraepithelial lesion; mRNA, messenger RNA; p16, p16INK4a/Ki-67 immunocytochemistry.
(a) For each paired cytology result, the number and percentage positive for HPV-16 DNA; HR-HPV DNA; p16INK4a/Ki-67 immunocytochemistry (p16); HPV-16, -18, -31, -33, and -45 mRNA; or AIN2 or a more severe diagnosis is presented.
(b) High-grade cytology includes high-grade squamous intraepithelial lesion (HSIL) and atypical squamous cells cannot rule out HSIL.
(c) Bold type with gray background indicates exact agreement for cytologic interpretation.

However, we observed a large number of discordant pair results of ASC-US/negative (rater 1/rater 2). Comparing the profiles of biomarker positivity and the risk of AIN2+ (Fig. 1), we noted that the profile of the ASC-US/negative subgroup was more akin to that of the negative/negative subgroup than to the ASC-US/ASC-US subgroup. Specifically, the percentage positive for HR-HPV DNA and p16/Ki-67 staining for the ASC-US/negative group was significantly lower than that for the ASC-US/ASC-US group (P = .02 and P = .03, respectively), but was not significantly higher than for the negative/negative group (P = .6 and P = 1, respectively).
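The subgroup comparisons above rely on the Fisher exact test. Because the underlying counts for these subgroups are not given in the text, the sketch below uses hypothetical counts purely to show how a two-sided exact P value is obtained for a 2-by-2 table of biomarker-positive and biomarker-negative results in 2 cytology subgroups. The function name `fisher_exact_two_sided` is introduced for illustration, and the two-sided rule used (summing all tables no more probable than the observed one) is one common convention, assumed here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are subgroups; columns are biomarker positive / negative.
    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Hypothetical counts for illustration only (NOT study data):
# subgroup A: 10 positive, 30 negative; subgroup B: 12 positive, 8 negative.
p_value = fisher_exact_two_sided(10, 30, 12, 8)
print(p_value)
```

An exact test is a natural choice here because several of the discordant subgroups are small, where a chi-square approximation may be unreliable.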

Figure 1.

The relationship between biomarker results for paired cytology results of negative/negative, atypical squamous cells of undetermined significance (ASC-US)/negative, and ASC-US/ASC-US (rater 1/rater 2) is shown. For each pair of cytology results, the percentage positive for human papillomavirus type 16 (HPV-16) DNA; high-risk HPV (HR-HPV) DNA; p16INK4a immunocytochemistry; HPV-16, -18, -31, -33, and -45 E6/E7 messenger RNA (mRNA); or a diagnosis of anal intraepithelial neoplasia of grade 2 or higher (AIN2+) is shown. *P = .02 comparing the percentage positive for HR-HPV between ASC-US/negative versus ASC-US/ASC-US. **P = .03 comparing the percentage positive for p16 between ASC-US/negative versus ASC-US/ASC-US.


In the current analysis, we found moderate to good agreement between 2 cytopathologists who were evaluating anal cytology using samples from HIV-infected MSM. When compared with the study by Lytwyn et al,8 we found a better linear-weighted kappa (0.69 vs 0.54 [overall for 4 pathologists]), but a worse unweighted kappa (0.54 vs 0.69 [median]). Thus, in the study by Lytwyn et al,8 there was better exact agreement, but when the raters disagreed with regard to the severity of the cytology, the discrepancies were more pronounced than in the current study. Any differences in interrater agreement between studies may be due to differences in screening and treatment between populations, resulting in differences in the size of the lesions and the number of diagnostically informative cells on a slide. The current study used a different collection device, a flocked nylon swab,16 rather than the typical Dacron swab, which may have altered the number of diagnostic cells on a slide. Finally, rater 2, an experienced cytopathologist who had only read cervical cytology before the study, received training for anal cytology from rater 1 before the study was initiated, which might also have influenced the agreement between the cytopathologists. It is interesting to note that in the current study, histologic confirmation of even consensus HSIL cytology results was limited because of the widely recognized limited performance of HRA.13

With annual rates of anal cancer increasing in the United States (Fig. 2), it will be important to establish screening programs targeting high-risk populations such as HIV-positive MSM and HIV-infected men and women.7 Although to the best of our knowledge there is no established method for anal cancer screening, cytology has been recommended2 and its use may be cost-effective in high-risk populations.20 The results of the current study also demonstrated that the detection of several biomarkers and the diagnosis of AIN2+ increased with the increasing severity of anal cytology, as has been shown for cervical cytology. Therefore, these biomarkers might be useful as objective standards to help monitor and maintain the performance of anal cytology. For example, retrospectively reviewing anal cytology interpreted as HSIL in conjunction with biomarker results may improve the diagnostic accuracy of an individual pathologist and identify false-negative and false-positive diagnoses.

Figure 2.

Annual age-adjusted anal cancer incidence rates in the United States are shown for (A) both sexes, (B) males, and (C) females. Incidence data were obtained from the Surveillance, Epidemiology, and End Results (SEER) 9 areas (San Francisco, Connecticut, Detroit, Hawaii, Iowa, New Mexico, Seattle, Utah, and Atlanta). Rates are per 100,000 population and are age-adjusted to the 2000 US standard population (19 age groups obtained from the US Census P25-1130). The modeled rates are the point estimates for the regression lines calculated by the Joinpoint Regression Program (Version 3.5; National Cancer Institute, Bethesda, Md).


Supported by the Intramural Research Program of the National Institutes of Health/National Cancer Institute.


Dr. Darragh has received research supplies for anal ThinPrep samples from Hologic Inc, although not for this project. She also serves on the advisory boards of OncoHealth Corporation and Arbor Vita Corporation. Dr. Castle serves as a member of a Data and Safety Monitoring Board for next-generation human papillomavirus (HPV) vaccines for Merck and Company. Dr. Castle has received HPV tests and testing for research at a reduced cost or no cost from Qiagen, Roche, and Merck and Company.