Interobserver agreement in the interpretation of anal intraepithelial neoplasia

  • Presented at the 21st International Papillomavirus Conference, Mexico City, Mexico, February 21–26, 2004.

Abstract

BACKGROUND

Anal carcinoma incidence is increasing, and is highest among men with human immunodeficiency virus (HIV) infection who have sex with men. Anal carcinoma and anal intraepithelial neoplasia (AIN) are ascertained on tissue histology, but this requires invasive procedures. Screening for AIN using anal cytology has been suggested. The authors evaluated interobserver agreement on cytologic and biopsy specimens from HIV-positive men undergoing anal carcinoma screening.

METHODS

One hundred twenty-nine HIV-positive men with a history of anal-receptive intercourse underwent anal cytology, anoscopy, and biopsy. Four pathologists independently assessed cytology and biopsy specimens and reached consensus for discordant cases.

RESULTS

Each pathologist evaluated 120 cytology and 155 biopsy specimens. The weighted kappa value for overall agreement was 0.54 (95% confidence interval [CI], 0.49–0.59) for cytology specimens and 0.59 (95% CI, 0.55–0.63) for biopsy specimens. The median kappa values for pairwise agreement among pathologists and for agreement with consensus were, respectively, 0.69 and 0.77 for cytology and 0.66 and 0.75 for biopsy. At least 3 pathologists were in agreement for 92 (76.7%) cytology and 134 (86.5%) biopsy specimens. Reliability for the Bethesda classification system was at least moderate, except for the cytologic category of atypical squamous cells of undetermined significance (kappa = 0.12). Fourteen of 29 (48.3%) cytology specimens and 36 of 47 (76.6%) biopsy specimens with consensus interpretation of high-grade squamous intraepithelial lesions (HSIL) were interpreted originally as HSIL by ≥ 3 pathologists. The kappa value for agreement with consensus distinguishing HSIL from non-HSIL ranged from 0.55 to 0.80 for cytology specimens and from 0.70 to 0.94 for biopsy specimens.

CONCLUSIONS

Agreement for cytologic and biopsy interpretations was generally at least moderate. Nevertheless, these results supported the need for disease indicators with greater reliability. Cancer 2005. © 2005 American Cancer Society.

Over the last 40 years, the incidence of invasive squamous carcinoma of the anal canal has been increasing among both men and women.1, 2 Human immunodeficiency virus (HIV)-positive men who have sex with men (MSM) have the highest age-adjusted incidence rate, which is estimated to be 70–100 per 100,000.3 This rate is at least twice that of invasive cervical carcinoma before the introduction of cervical screening.4–6

Anal carcinoma shares many similarities with cervical carcinoma, including epidemiologic evidence for sexual transmission and human papillomavirus (HPV) infection,7, 8 increased occurrence with immunosuppression,9 and the presence of oncogenic HPV-DNA in anal carcinoma tissue specimens.10–12 In addition, anal squamous epithelium undergoes morphologic changes comparable to cervical intraepithelial neoplasia (CIN), and these changes are identified as preinvasive lesions.13

High-risk individuals should be screened for preinvasive anal lesions to prevent invasive malignancy, using an approach analogous to cervical carcinoma screening.14, 15 Anal cytology can be used to screen patients for preinvasive anal lesions.16 Although management is not yet well established, patients whose anal cytology specimen is interpreted as a high-grade squamous intraepithelial lesion (HSIL) undergo anoscopy, and it has been reported that any cytologic squamous intraepithelial lesion (SIL) indicates the need for anoscopic examination.14 Patients whose tissue biopsy specimen shows SIL commonly receive at least close observation with repeat anoscopy, and patients with histologic HSIL undergo ablative therapy if possible. Patients whose tissue specimen shows a low-grade squamous intraepithelial lesion (LSIL) may not be treated routinely because the likelihood of progression to invasive cancer is low, the treatment can be painful, and the recurrence of LSIL after treatment is high.14

Because different cytologic and histologic interpretations of anal squamous lesions may critically affect management decisions, whether in the context of clinical practice or within clinical trials, knowledge of the interobserver agreement for specimen interpretation is an important issue. However, previous studies on reliability for cytologic and histologic interpretations of anal intraepithelial neoplasia (AIN) specimens are limited.17–19 We, therefore, assessed the interobserver agreement for interpretation of thin-layer cytology and biopsy specimens in the context of an ongoing prospective cross-sectional survey.

MATERIALS AND METHODS

Population and Specimen Collection

A study screening for anal carcinoma precursors is being performed, and is to recruit 400 HIV-positive men from 3 hospital-associated ambulatory clinics in Toronto, Canada. The men are ≥ 18 years old and have a history of anal-receptive intercourse. Specimens for anal cytology are obtained, followed by immediate anoscopy and directed biopsy. The primary purpose is to determine the accuracy of anal cytology for the detection of histologically confirmed AIN. After the study, men with histologic low-grade lesions will be followed by repeat anoscopy and those with high-grade lesions will be treated. The study was approved by the Human Subjects Review Board of the University Health Network and St. Michael's Hospital. We evaluated the interobserver agreement for cytology and biopsy specimens obtained from an adequate number of consecutive patients, starting from the first patient enrolled in the study.

After written informed consent was obtained from the patient, the physician obtained an anal swab specimen for cytology. The physician inserted the swab into the patient's anal canal until the tip reached the anal valves and then removed it in a twirling motion. The swab was placed into ThinPrep (Cytyc, Boxborough, MA) solution, vigorously swirled, and discarded. After the cytology specimen was obtained, a swab soaked in 3% acetic acid was inserted into the anal canal for 1 minute. Subsequently, an anoscope was inserted and the anoscopist examined the anal canal at a magnification of × 6 to × 15. Biopsy specimens were taken from any areas that appeared dysplastic, and these specimens were placed into 4% neutral-buffered formalin. If no abnormality was seen, no biopsy specimens were taken.

To ensure that the pathologists could not link the cytology and biopsy specimens from the same patient, thereby avoiding test and diagnostic review bias, cytology samples were labeled with uniform random numbers between 2000 and 2999 and biopsy samples were labeled with random numbers between 3000 and 3999. Random numbers were generated with the S-PLUS statistical analysis program (Insightful Corporation, Seattle, WA).
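The masking scheme described above can be sketched in a few lines. This is only an illustration in Python, not the S-PLUS procedure the authors used: the function name `assign_masked_labels`, the seeds, and the choice to draw labels without replacement (so that no two specimens share a label) are assumptions that the text does not state.

```python
import random


def assign_masked_labels(n_specimens, low, high, seed=None):
    """Draw n_specimens distinct labels uniformly from low..high inclusive.

    Hypothetical helper: sampling without replacement is an assumption,
    made so that every specimen receives a unique label.
    """
    rng = random.Random(seed)
    # range(low, high + 1) makes the upper bound inclusive.
    return rng.sample(range(low, high + 1), n_specimens)


# Cytology and biopsy labels come from disjoint ranges, so a label alone
# cannot be used to pair a cytology slide with a biopsy from the same patient.
cytology_labels = assign_masked_labels(120, 2000, 2999, seed=1)
biopsy_labels = assign_masked_labels(155, 3000, 3999, seed=2)
```

Because the two ranges do not overlap, a reader can tell the specimen type from the label but learns nothing about which patient it came from.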

Specimen Processing

The cytology sample was processed according to the manufacturer's instructions. The samples were placed into a ThinPrep processor, which agitated the samples, collected a layer of cells onto a filter by means of a negative pressure pulse, and then transferred the cells by positive pressure to a 20-mm circular area on a glass slide. The cells were fixed and stained with the Papanicolaou stain and screened by cytotechnologists.

The biopsy specimens were processed in a routine manner. A ribbon of three sections at three deeper levels was cut from the paraffin-embedded tissue specimen and stained with hematoxylin and eosin.

Pathology Review

Four pathologists experienced in reading anal cytology and biopsy specimens participated in the study. The pathologists had either American Board Certification or Canadian Fellowship specialist certification in anatomic pathology with additional expertise in gynecologic pathology or cytopathology. All had > 15 years of experience in the interpretation of cervical or anal cytology and biopsy specimens, and had participated in previous related research studies. The cytology was classified according to the modified Bethesda system for cervical cytology.20 Biopsy specimens were grouped into Bethesda-like categories, although the atypical squamous cell (ASC) designation was not used for histologic specimens. There was no previous discussion among pathologists regarding classification criteria. To simulate routine practice, the cytotechnologists' marks were not removed from the slides and the reporting form contained the cytotechnologist impression. Each pathologist reviewed the original histology slides. No recut sections were circulated. Each pathologist was masked to the other pathologists' interpretations, to the corresponding cytology specimen of the biopsy specimen being examined, to the corresponding biopsy specimen of the cytology specimen being examined, and to clinical and any other additional laboratory information. After all specimens were reported, the pathologists met at a multiheaded microscope and reached consensus on any specimen for which there had not been unanimous agreement in the independent review.

Statistical Analysis

For the purpose of statistical analysis, cytology specimens interpreted as "ASC, cannot exclude HSIL" (ASC-H) or as squamous cell carcinoma were grouped into the HSIL category, because the practical management of all three interpretations is anoscopic examination. Similarly, biopsy specimens interpreted as invasive cancer were grouped with HSIL, as both would require further clinical management.

We calculated the kappa statistic with 95% confidence interval (95% CI) to summarize agreement for cytology and biopsy interpretation.21 Unlike proportion of agreement, the kappa statistic takes the element of chance agreement into account. We interpreted the kappa statistic to represent the following levels of agreement: < 0.0 = poor, 0.0–0.2 = slight, 0.2–0.4 = fair, 0.4–0.6 = moderate, 0.6–0.8 = substantial, and 0.8–1.0 = almost perfect.21 The unweighted kappa statistic was used to analyze binary grouped data. Otherwise, the weighted kappa statistic was calculated. Weighted kappa statistics take the severity of disagreement into account. For example, a weight of 1.00 was assigned for exact agreement, 0.67 for interpretations 1 category away (e.g., when one pathologist's interpretation was HSIL and another pathologist's interpretation was LSIL for the same specimen), and weights of 0.33 and 0.00 for interpretations 2 and 3 categories away, respectively. To address the possibility of correlation between biopsy specimens when more than one biopsy specimen was obtained per patient, we calculated two kappa statistics, one using all the biopsy results, and the second using only the first biopsy result from those patients in whom multiple biopsies were done. If the two kappa statistics were similar, we assumed that any correlation within patients in whom multiple biopsies were done was negligible.
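The weighting scheme and the verbal scale above can be made concrete with a short sketch. It implements the standard two-rater weighted kappa (Cohen's form) with linear agreement weights, which reproduce the 1.00, 0.67, 0.33, and 0.00 weights for four ordered categories. The function names are hypothetical, the integer coding of categories is an assumption, and the sketch does not attempt the four-rater statistic or the confidence intervals, whose exact method the text does not specify.

```python
def weighted_kappa(ratings_a, ratings_b, n_categories):
    """Two-rater weighted kappa with linear agreement weights.

    Ratings are integer codes 0..n_categories-1 on an ordinal scale
    (assumed coding: 0 = negative, 1 = ASCUS, 2 = LSIL, 3 = HSIL).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    k = n_categories

    # Agreement weights: 1 on the diagonal, falling linearly to 0.
    # With k = 4 this gives 1, 2/3, 1/3, 0 for 0, 1, 2, 3 categories apart.
    weight = [[1 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]

    # Observed joint proportions and each rater's marginal proportions.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[a][b] += 1 / n
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    cells = [(i, j) for i in range(k) for j in range(k)]
    p_obs = sum(weight[i][j] * observed[i][j] for i, j in cells)
    p_exp = sum(weight[i][j] * marg_a[i] * marg_b[j] for i, j in cells)
    if p_exp == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)


def agreement_level(kappa):
    """Map a kappa value to the verbal scale quoted in the text.

    Boundary values (0.2, 0.4, ...) are assigned to the lower band;
    the text leaves this open, so that choice is an assumption.
    """
    if kappa < 0.0:
        return "poor"
    bands = [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
             (0.8, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"
```

For example, `agreement_level(0.54)` returns "moderate", matching how the overall cytology agreement is described in the Results.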

We obtained kappa statistics for the following measures of agreement: overall four-rater agreement, pairwise agreement among the four pathologists, and individual pathologist agreement with the consensus interpretation.

Agreement was calculated for the Bethesda system categories for cytology and for the Bethesda-like categories for biopsies. To determine the level of agreement for each specific diagnosis, the unweighted kappa statistic was calculated after combining all pairwise comparisons into 2 × 2 tables in which samples were classified as having that specific diagnosis or not. The first such table classified the samples according to whether the first and second of the pair of pathologists labeled the specimen as negative. The remaining tables looked at the agreement in classifying the samples as atypical squamous cells of undetermined significance (ASCUS), LSIL, and HSIL.
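One way to read the category-specific procedure is sketched below. For a chosen diagnosis, every ordered pair of pathologists on every specimen contributes one count to a 2 × 2 table (first rater gave the diagnosis or not, second rater gave the diagnosis or not), and an unweighted kappa is computed from the pooled table. The function name is hypothetical, and pooling ordered rather than unordered pairs is an assumption.

```python
from itertools import permutations


def category_specific_kappa(ratings, category):
    """Unweighted kappa for one category, pooling all ordered rater pairs.

    ratings: list of per-specimen lists, one rating per pathologist.
    """
    # table[x][y]: x (y) is 1 if the first (second) rater of the pair
    # assigned `category` to the specimen, else 0.
    table = [[0, 0], [0, 0]]
    for specimen in ratings:
        for first, second in permutations(specimen, 2):
            table[int(first == category)][int(second == category)] += 1

    total = sum(map(sum, table))
    p_obs = (table[0][0] + table[1][1]) / total
    row_yes = (table[1][0] + table[1][1]) / total
    col_yes = (table[0][1] + table[1][1]) / total
    p_exp = row_yes * col_yes + (1 - row_yes) * (1 - col_yes)
    if p_exp == 1:
        raise ValueError("kappa is undefined when every rating is identical")
    return (p_obs - p_exp) / (1 - p_exp)


# Invented toy data: four specimens, four raters each.
toy_ratings = [
    ["negative"] * 4,
    ["HSIL"] * 4,
    ["negative", "negative", "negative", "ASCUS"],
    ["LSIL", "LSIL", "HSIL", "HSIL"],
]
kappa_negative = category_specific_kappa(toy_ratings, "negative")
kappa_ascus = category_specific_kappa(toy_ratings, "ASCUS")
```

In the toy data the single ASCUS call is never matched by another rater, so the ASCUS kappa falls below zero while the kappa for "negative" stays high, which mirrors the pattern the Results report for the ASCUS category.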

We also calculated kappa statistics for agreement after dividing the data into disease versus no disease as described by Stoler and Schiffman,22 using different cutoff points that may trigger different management strategies. Cytology specimens were divided into binary categories of negative versus all interpretations of ASCUS or greater (≥ ASCUS), ≤ ASCUS versus ≥ LSIL, and ≤ LSIL versus ≥ HSIL. Because we did not use the ASCUS category for biopsy specimens, biopsy specimens were divided into negative versus ≥ LSIL, and ≤ LSIL versus ≥ HSIL.
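The binary cutoff analysis can be illustrated as follows: each interpretation is recoded as disease or no disease at a given threshold, and an unweighted kappa is computed between one pathologist and the consensus. The helper names and the example ratings are invented for illustration, and the sketch omits the confidence intervals.

```python
# Ordinal scale from least to most severe (cytology categories).
ORDER = ["negative", "ASCUS", "LSIL", "HSIL"]


def dichotomize(ratings, cutoff):
    """Return True ("disease") where a rating is at or above `cutoff`."""
    threshold = ORDER.index(cutoff)
    return [ORDER.index(r) >= threshold for r in ratings]


def binary_kappa(x, y):
    """Unweighted two-rater kappa for two equally long lists of booleans."""
    n = len(x)
    p_obs = sum(a == b for a, b in zip(x, y)) / n
    px, py = sum(x) / n, sum(y) / n
    p_exp = px * py + (1 - px) * (1 - py)
    if p_exp == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)


# Invented example ratings, not study data.
pathologist = ["negative", "ASCUS", "LSIL", "HSIL", "LSIL", "negative"]
consensus = ["negative", "LSIL", "LSIL", "HSIL", "HSIL", "negative"]

kappa_by_cutoff = {
    cutoff: binary_kappa(dichotomize(pathologist, cutoff),
                         dichotomize(consensus, cutoff))
    for cutoff in ["ASCUS", "LSIL", "HSIL"]
}
```

The same pair of raters can agree perfectly at one cutoff and only moderately at another, which is why the cutoffs are reported separately.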

We calculated the precision with which we could estimate kappa statistics from our study. With 100 cytology samples and 147 biopsy samples rated by 4 pathologists, we could be certain to identify a kappa statistic of ≥ 0.6 for cytology specimens and 0.5 for biopsy specimens, based on an expected kappa statistic of 0.7 for cytology specimens and 0.6 for biopsy specimens.

RESULTS

One cytology sample was obtained from each of the first 129 enrolled patients. Nine samples were judged to be inadequate for assessment because of insufficient cellularity by at least one pathologist. The study set therefore consisted of 120 cytology samples. One hundred fifty-five biopsy specimens from 93 patients were included in the study. No anoscopic abnormality was seen in 15 of the 129 patients and, therefore, no biopsy specimens were obtained. Twenty-one biopsy specimens consisted of glandular rectal mucosa only and were excluded from the study. In the remaining 93 patients, 1 biopsy specimen was taken from each of 46 individuals, and 2–4 biopsy specimens were obtained from 47 patients.

Table 1 lists the pathologists' interpretations, as compared with each other and with the consensus interpretation, of the cytology and biopsy specimens. Few cytology specimens were reported as ASC-H: three were interpreted as ASC-H by two pathologists, and one other was interpreted as ASC-H by one pathologist. At the consensus conference, an additional specimen was interpreted as ASC-H. All five specimens were categorized as HSIL for the purpose of analysis. The shaded diagonals mark the number of specimens for which there was complete agreement between the pathologist on the left side of the table with the pathologist and consensus listed at the top of the table. Pathologists A and B tended to downgrade their cytology interpretations as compared with the consensus and Pathologist A had a similar tendency with respect to biopsy specimens.

Table 1. Cytology and Biopsy Interpretations: Comparison between Pairs of Pathologists and the Consensus Interpretation^a
  • ASCUS: atypical squamous cells of undetermined significance; LSIL: low-grade squamous intraepithelial lesion; HSIL: high-grade squamous intraepithelial lesion.
  • ^a Shaded areas indicate the number of specimens in each category in which there was full agreement.

[Table 1 image not reproduced]

The distributions of complete and partial agreement among pathologists for cytology and biopsy specimens are shown in Figure 1. In 92 of 120 (76.7%) cytology specimens, pathologists were either unanimous (49.2%) in their interpretation or 3 of the 4 pathologists (27.5%) reported the same interpretation. Two pairs of pathologists agreed on 14 (11.7%) specimens, i.e., in these 14 specimens, there were 2 interpretations for any 1 specimen: a specimen was given 1 interpretation (e.g., LSIL) by 2 pathologists, and another interpretation by the other 2 pathologists (e.g., HSIL). In 13 (10.8%) samples, only 1 pair of pathologists agreed, i.e., there were 3 interpretations: 2 pathologists had the same opinion (e.g., for 1 specimen, 2 pathologists called the specimen LSIL), and each of the other pathologists had different interpretations (e.g., the same specimen was called HSIL and ASCUS by the other 2 pathologists, respectively). For 1 (0.8%) specimen, there were 4 different interpretations, 1 from each of the categories (negative, ASCUS, LSIL, and HSIL; Fig. 2). For 134 of 155 (86.5%) biopsy specimens, there was either unanimous agreement (48.4%) or agreement among 3 of the 4 pathologists (38.1%). Two pairs of pathologists agreed on 19 (12.3%) specimens, and there were 3 different interpretations for 2 (1.3%) samples (Fig. 3).
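The agreement patterns described above (unanimous, 3 of 4, two pairs, one pair, four different interpretations) follow directly from how the four ratings of a specimen group together. A minimal sketch, with hypothetical names and pattern labels, is:

```python
from collections import Counter

# Sizes of the groups of identical ratings, largest first, mapped to a label.
PATTERN_NAMES = {
    (4,): "unanimous",
    (3, 1): "3 of 4 agree",
    (2, 2): "two pairs agree",
    (2, 1, 1): "one pair agrees",
    (1, 1, 1, 1): "four different interpretations",
}


def agreement_pattern(specimen_ratings):
    """Classify how four raters split on a single specimen."""
    if len(specimen_ratings) != 4:
        raise ValueError("expected exactly four ratings")
    sizes = tuple(sorted(Counter(specimen_ratings).values(), reverse=True))
    return PATTERN_NAMES[sizes]


# The worked examples given in the text.
two_pairs = agreement_pattern(["LSIL", "LSIL", "HSIL", "HSIL"])
one_pair = agreement_pattern(["LSIL", "LSIL", "HSIL", "ASCUS"])
all_differ = agreement_pattern(["negative", "ASCUS", "LSIL", "HSIL"])
```

Tallying `agreement_pattern` over all specimens with a `Counter` would give a distribution of the kind plotted in Figure 1.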

Figure 1.

Distribution of complete or partial agreement among pathologists for anal cytology and biopsy specimens.

Figure 2.

Anal cytology specimen with four different observer interpretations (negative, atypical squamous cells of undetermined significance [ASCUS], low-grade squamous intraepithelial lesions, and high-grade squamous intraepithelial lesions) and consensus interpretation of ASCUS.

Figure 3.

Anal biopsy specimen with three different observer interpretations (negative, low-grade squamous intraepithelial lesions [LSIL], and high-grade squamous intraepithelial lesions) and consensus interpretation of LSIL.

For cytology specimens, the weighted kappa statistic for overall agreement among the 4 raters was 0.54 (95% CI, 0.49–0.59), the median kappa statistic for pairwise agreement among the 4 pathologists was 0.69 (range, 0.62–0.77), and the median kappa statistic for agreement with the consensus interpretation was 0.77 (range, 0.73–0.89; Table 2). For biopsy specimens, the 4-rater kappa statistic was 0.59 (95% CI, 0.55–0.63), and the median kappa statistics for pairwise agreement among pathologists and agreement with consensus were 0.66 (range, 0.56–0.77) and 0.75 (range, 0.72–0.92), respectively. The results were similar when data were reanalyzed using only the first biopsy result from subjects who had multiple biopsies. We were therefore confident that use of more than one biopsy specimen per subject did not unduly increase agreement. For both cytology and biopsy specimens, agreement among pathologists was at least moderate.

Table 2. Summary Data for Agreement between Pairs of Pathologists and with Consensus Interpretation

Cytology, weighted kappa (95% CI)
Pathologist   Pathologist B      Pathologist C      Pathologist D      Consensus
A             0.77 (0.69–0.85)   0.71 (0.61–0.80)   0.62 (0.51–0.72)   0.76 (0.67–0.85)
B             -                  0.67 (0.57–0.76)   0.65 (0.56–0.75)   0.73 (0.64–0.82)
C             -                  -                  0.71 (0.63–0.80)   0.89 (0.83–0.94)
D             -                  -                  -                  0.78 (0.70–0.86)

Biopsy, weighted kappa (95% CI)
Pathologist   Pathologist B      Pathologist C      Pathologist D      Consensus
A             0.63 (0.53–0.72)   0.71 (0.63–0.80)   0.59 (0.49–0.69)   0.72 (0.64–0.80)
B             -                  0.70 (0.61–0.79)   0.56 (0.45–0.66)   0.77 (0.69–0.84)
C             -                  -                  0.74 (0.65–0.82)   0.92 (0.87–0.97)
D             -                  -                  -                  0.73 (0.64–0.81)

  1. CI: confidence interval.
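As a small arithmetic check, the cytology summary values quoted in the text can be recomputed from the cytology entries of Table 2. The sketch below uses only the point estimates transcribed from the table; the variable names are illustrative.

```python
from statistics import median

# Cytology point estimates from Table 2.
# Pairwise: A-B, A-C, A-D, B-C, B-D, C-D.
cytology_pairwise = [0.77, 0.71, 0.62, 0.67, 0.65, 0.71]
# Each pathologist (A, B, C, D) against the consensus interpretation.
cytology_vs_consensus = [0.76, 0.73, 0.89, 0.78]

pairwise_median = round(median(cytology_pairwise), 2)
pairwise_range = (min(cytology_pairwise), max(cytology_pairwise))
consensus_median = round(median(cytology_vs_consensus), 2)
consensus_range = (min(cytology_vs_consensus), max(cytology_vs_consensus))
```

With an even number of values, `statistics.median` averages the two middle entries, which is the convention assumed here.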

Kappa statistics for individual categories are shown in Table 3. For the 120 cytology samples, 1440 pairwise comparisons were collapsed sequentially into four 2 × 2 tables. Nine hundred eighty-eight of 1440 (68.6%) paired interpretations agreed, and 382 (26.5%) were off by 1 category. There was strong agreement in the classification of cytology samples as negative (kappa = 0.84), moderate agreement in classification as LSIL or HSIL (kappa = 0.52 and 0.45, respectively), and only slight agreement in classification as ASCUS (kappa = 0.12). For the 155 biopsy specimens, 1342 of the 1860 (72.2%) pairwise comparisons agreed, and 492 (26.5%) were off by 1 category. There was substantial agreement in the classification of samples as negative or HSIL (kappa = 0.63 and 0.68, respectively), but only moderate agreement in classification of samples as LSIL (kappa = 0.44).
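The comparison counts in this paragraph can be reproduced with simple arithmetic. The totals of 1440 and 1860 equal the number of specimens multiplied by 12, which is consistent with counting each of the 6 pathologist pairs in both orders; that reading is an inference from the numbers rather than something the text states.

```python
def ordered_pair_count(n_specimens, n_raters):
    """Number of ordered rater pairs summed over all specimens."""
    return n_specimens * n_raters * (n_raters - 1)


def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


cytology_total = ordered_pair_count(120, 4)
biopsy_total = ordered_pair_count(155, 4)

cytology_exact = percent(988, cytology_total)
cytology_off_by_one = percent(382, cytology_total)
biopsy_exact = percent(1342, biopsy_total)
biopsy_off_by_one = percent(492, biopsy_total)
```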

Table 3. Category-Specific Agreement among Pathologists^a
  • CI: confidence interval; ASCUS: atypical squamous cells of undetermined significance; LSIL: low-grade squamous intraepithelial lesion; HSIL: high-grade squamous intraepithelial lesion, includes one case in which individual diagnoses were HSIL or cancer.
  • ^a Shaded areas indicate the number of cases in each category in which there was full agreement.

[Table 3 image not reproduced]

In Table 4, specimens are categorized as showing disease or no disease using different cutoff points, and the individual pathologist's interpretations are compared with the consensus interpretation. The median kappa statistics were 0.90 (range, 0.85–0.93) for the cytology disease category of negative versus ≥ ASCUS, 0.79 (range, 0.70–0.92) for ≤ ASCUS versus ≥ LSIL, and 0.62 (range, 0.55–0.80) for ≤ LSIL versus ≥ HSIL. For biopsy specimens, the median kappa statistics were 0.74 (range, 0.71–0.90) for negative versus ≥ LSIL and 0.77 (range, 0.70–0.94) for ≤ LSIL versus ≥ HSIL. Agreement for biopsy specimens was at least moderate in all disease categories, as was agreement for differentiating negative cytology from all other diagnoses, and for separating at least low-grade cytology from the combined group of negative and ASCUS lesions. The lowest reliability occurred in differentiating HSIL from all other cytology categories, where the lowest lower bound of the 95% CI was 0.38, indicating that the possibility of only fair agreement among some pathologists cannot be excluded. Results were similar when data were reanalyzed using only the first biopsy specimen from patients from whom multiple biopsy specimens were obtained.

Table 4. Summary Data for Original Pathologist Interpretation Vs. Consensus Interpretation, when Divided into “Disease” Vs. “No Disease” at Different Binary Cutoff Points

Kappa (95% CI), comparison to consensus interpretation
Specimen type   Disease cutoff point    Pathologist A      Pathologist B      Pathologist C      Pathologist D
Cytology        Negative vs. ≥ ASCUS    0.85 (0.76–0.96)   0.91 (0.83–0.99)   0.93 (0.86–0.99)   0.89 (0.79–0.98)
Cytology        ≤ ASCUS vs. ≥ LSIL      0.81 (0.70–0.92)   0.70 (0.58–0.83)   0.92 (0.84–0.99)   0.78 (0.66–0.90)
Cytology        ≤ LSIL vs. ≥ HSIL       0.56 (0.38–0.75)   0.55 (0.38–0.93)   0.80 (0.68–0.93)   0.68 (0.53–0.83)
Biopsy          Negative vs. ≥ LSIL     0.72 (0.61–0.83)   0.71 (0.59–0.82)   0.90 (0.83–0.97)   0.75 (0.65–0.86)
Biopsy          ≤ LSIL vs. ≥ HSIL       0.72 (0.60–0.84)   0.83 (0.74–0.93)   0.94 (0.88–0.99)   0.70 (0.57–0.83)

  1. CI: confidence interval; ASCUS: atypical squamous cells of undetermined significance; LSIL: low-grade squamous intraepithelial lesion; HSIL: high-grade squamous intraepithelial lesion, includes one case in which individual diagnoses were HSIL or cancer.

Twenty-nine cytology specimens were given a final consensus interpretation of HSIL. All 4 pathologists had originally interpreted 8 (27.6%) of these samples as HSIL and 3 of 4 pathologists had called 6 (20.7%) samples HSIL. Fifteen (51.7%) samples were originally interpreted as HSIL by only 1 or 2 pathologists (Fig. 4). In these 29 cytology specimens, a disagreement of > 1 degree occurred in 7 (24%) specimens. In all seven specimens, at least one pathologist believed that the specimen represented ASCUS, and for one of these specimens, one pathologist believed that the sample was negative for ASCUS or SIL.

Figure 4.

Proportion of pathologists interpreting specimens as high-grade squamous intraepithelial lesions (HSIL) compared with the consensus interpretation of HSIL.

Forty-seven biopsy specimens were interpreted as HSIL by consensus review. All 4 pathologists originally interpreted 24 (51.1%) samples as HSIL and 3 of 4 pathologists had called 12 (25.5%) samples HSIL. Eleven (23.4%) samples were originally interpreted as HSIL by only 1 or 2 pathologists (Fig. 4). Among these 47 biopsy specimens, there was > 1 degree of disagreement in 2 (4%) specimens, with both biopsy specimens interpreted as negative by 1 pathologist.

DISCUSSION

Screening for anal carcinoma precursors is relatively new, and so it is worthwhile to assess the agreement among pathologists in interpreting specimens produced by this screening activity. We carried out an assessment of interobserver reliability for cytology and biopsy specimens as part of a larger study that examines the test characteristics of anal Papanicolaou smears in detecting anal dysplasia and cancer.

In the anal canal, the morphologic changes in preneoplastic anal squamous lesions are similar to those of the uterine cervix. The interobserver reliability for cervical cytology and histology generally has been assessed as moderate.22–25 Two studies of cervical cytology showed that agreement was similar whether liquid-based or conventional cytology preparations were used.26, 27 Few studies have evaluated the reliability of anal specimen reporting.17–19 Scholefield et al.17 circulated 30 cytology slides with material obtained from the perianal skin to 6 pathologists and reported point estimates of pairwise agreement ranging from 0.65 (moderate) to 1.00 (perfect). Carter et al.18 circulated 100 paraffin and fresh frozen histology slides to 5 pathologists. Weighted kappa statistics for pairwise agreement ranged from 0.17 (slight) to 0.60 (moderate agreement). HPV and inflammatory changes were grouped together as one category separate from AIN. Although the data were not shown, reliability for all AIN grades was reported to be poor. In the third study, Colquhoun et al.19 circulated nonconsecutive histology slides to three pathologists experienced in anorectal pathology. Using the Fleiss interpretation of the kappa statistic,21 agreement for HPV changes in approximately 180 specimens was fair (kappa = 0.3; 95% CI, 0.19–0.42). Reliability for AIN was moderate (kappa = 0.64; 95% CI, 0.58–0.69) among the 3 readers, and ranged from 0.38 to 0.60 (fair to moderate) when compared with the original diagnoses from any of 9 service pathologists.

We evaluated the reliability among four pathologists in three different centers in Canada and the United States with experience reading anal cytology and histology results. The overall agreement for cytology and biopsy specimens was at least moderate, and often substantial, whether expressed by the kappa statistic for four-rater agreement, pairwise comparisons among pathologists, or by comparison of individual pathologists to the consensus interpretation. The proportion of specimens with unanimous agreement or agreement among 3 of 4 pathologists was high, i.e., > 70%. However, this result must be interpreted cautiously as reliability expressed in proportions is likely to be inflated by chance agreement. For this reason, the kappa statistic is a better measure of agreement.

The agreement among pathologists for classifying cytology and biopsy specimens into Bethesda and Bethesda-like categories was also moderate to substantial, except for the ASCUS category. Pathologists showed only slight agreement for classifying cytology specimens as ASCUS. The ASCUS-LSIL triage study also showed the greatest variability among pathologists for this category of diagnosis in cervical cytology specimens.22

The most appropriate management of the different categories of HPV-related lesions of the anal canal in HIV-positive MSM is not yet established. However, it is feasible that, similar to cervical screening, patients with negative cytology or negative anoscopy and biopsies may receive no further intervention other than continued routine screening. Patients with HSIL on Papanicolaou smears would undergo anoscopic examination and those with HSIL on biopsy specimens would potentially undergo ablative treatment. In the current study, agreement among pathologists for differentiating between specimens that were negative versus all others was moderate to substantial. Reliability was similar for biopsy specimens in separating HSIL from all other categories. There was greater disagreement among pathologists in distinguishing HSIL from no HSIL for cytology specimens, and for some specimens, agreement may only be fair. Previous studies also have shown that high-grade lesions in cervical cytology can be a source of disagreement.22, 28

The results of the current study may not be representative of pathology practice in general, as there may be more disagreement among pathologists who do not routinely evaluate anal specimens. Moreover, even among pathologists with experience in evaluating anal specimens, agreement was generally moderate, and in keeping with previous experience with cervical specimens.

It must be noted that the lower agreement statistics of some pathologists when compared with the consensus interpretation do not indicate that these pathologists are less correct, nor do higher kappa statistics indicate correctness. The consensus interpretation is not a gold standard of truth. It is simply another opinion. There are different ways of generating a consensus interpretation, and we do not know if another method would have led to different results.

Investigators have attempted to improve diagnostic agreement in the practice of pathology. In one article, investigators stated that study of written criteria and review of a teaching slide set improved reliability in the diagnosis of proliferative breast lesions.29 However, Smith et al.30 reported that group study sessions using the Bethesda System atlas did not appear to improve reproducibility for cervical ASCUS smears. The Bethesda System gives precise criteria for cytology categories, including ASCUS, and it may not be possible to improve classification of cervical lesions any further based on morphology alone. In the future, molecular markers such as p16 and aberrantly methylated tumor suppressor genes may help to improve reliability.31, 32

The current study shows that a range of opinions exists in the interpretation of anal cytology and biopsy specimens, but that, overall, there is moderate agreement. Similar reliability has been reported for interpretation of cervical cytology and histology specimens. Despite imperfect reliability, it is well accepted that cervical screening has successfully and dramatically decreased the incidence of cervical carcinoma.4, 33, 34 Nevertheless, the results of our study indicate that in addition to standard Papanicolaou smears and directed biopsies, more reliable, as well as accurate, test and gold standards would be desirable for anal carcinoma screening.
