Epidemiologic and viral factors associated with cervical neoplasia in HPV-16-positive women



While infection with high-risk HPV is the most important risk factor for cervical cancer, HPV alone is insufficient. Our purpose was to identify viral and epidemiologic factors associated with cervical disease in HPV-16 DNA-positive women referred to colposcopy. We used a standardized interview to collect epidemiologic data from consenting women. Total nucleic acids from exfoliated cervical cells were used for all viral assays (HPV detection and typing using L1 consensus PCR with line probe hybridization, variant classification by sequencing, viral load and transcript copy determination by quantitative PCR and transcript pattern by nested RT-PCR). Cervical disease was based on colposcopic biopsy. Logistic regression was used to calculate ORs with 95% CIs. There were 115 HPV-16 positive women among 839 enrollees. By univariate analyses, age >25 years (OR = 3.05, 95% CI 1.20–7.76), smoking (OR = 3.0, 95% CI 1.19–7.56), high viral load (OR = 5.27, 95% CI 2.05–13.60), detection of both E6 and E6*I transcripts (OR = 10.0, 95% CI 2.1–47.58) and high transcript copies (OR = 5.56, 95% CI 1.88–16.41) were significant risk factors for CIN III with reference to No CIN/CIN I. Less than a third of the women (31.5%) had prototype HPV-16 detected, and variants showed no association with disease, viral load or transcription. Viral DNA and transcript copies were highly correlated, and the ratio of transcript copies to DNA copies was not changed with disease status. While viral load, transcript copies and transcript pattern were statistically associated with CIN III, none of these measures effectively discriminated between HPV-16-positive women with disease requiring treatment and those who could be followed. Cellular proliferation and differentiation pathways affected by HPV should be investigated as biomarkers for cervical cancer screening. © 2005 Wiley-Liss, Inc.

High-risk HPV types are recognized as carcinogenic agents, 1 and infection with high-risk HPV is the most important risk factor for cervical cancer. However, it is clear that HPV alone is insufficient to cause cancer. Most HPV infections are transient and associated with no lesions or spontaneously regressing low-grade precancerous lesions. Factors that facilitate HPV oncogenesis are important targets for early detection, therapeutic intervention and monitoring the response to therapy. Identifying these factors may be difficult because of the high baseline risk associated with HPV infection. One approach is to identify risk factors for cervical disease in groups with high-risk HPV infection.

Our purpose was to examine viral and epidemiologic factors associated with biopsy-confirmed cervical disease in HPV-16 DNA-positive women referred to colposcopy. Viral characteristics other than simple HPV type may be associated with disease and reflect increased virulence or increased host susceptibility. We examined the genotype variants of HPV-16, HPV DNA copy number (viral load), pattern and extent of HPV oncogenic (E6/E7) transcription (transcript load) and ratio of E6/E7 transcript load to viral load. We also examined known demographic and behavioral risk factors for cervical neoplasia.

In univariate analysis, we found age >25 years, smoking, high viral load, presence of transcripts coding for both E6 and E7 and high copy number of transcripts to be significant risk factors for CIN III (reference no disease and CIN I). In multivariate analysis, only age, transcript pattern and viral load were identified as independent risk factors. While these factors were statistically different between CIN III and reference, the separation was insufficient to discriminate between the groups, making it unlikely that they could be useful in early detection of disease. We found HPV DNA and E6/E7 transcript copy numbers to be highly correlated. Their ratio, an indirect measure of transcriptional regulation, did not vary significantly with cervical disease, indicating no evidence of transcriptional deregulation of the E6/E7 viral oncogenes with progression of preinvasive cervical neoplasia.


Abbreviations: BMI, body mass index; CI, confidence interval; CIN, cervical intraepithelial neoplasia; DTT, dithiothreitol; HPV, human papillomavirus; OR, odds ratio; TNA, total nucleic acid extract containing RNA and DNA.

Material and methods

Enrollment and sample preparation

Participants were recruited from nonpregnant, HIV-negative women, aged 18–69 years, attending colposcopy clinics at urban public hospitals in Atlanta, Georgia, and Detroit, Michigan, between December 2000 and December 2002. Consenting women were interviewed by clinic coordinators using a standardized questionnaire to determine epidemiologic risk factors associated with cervical neoplasia.

After visualization of the cervix, ecto- and endocervical cells were collected using a CytoBroom (Cytyc, Marlborough, MA) and dislodged into PreservCyt collection medium (Cytyc). If a cytology diagnosis was required, the collection device was used to prepare a conventional Pap smear and then placed into the PreservCyt collection medium. Samples were transported to the laboratory at ambient temperature and stored at 4°C until processed.

Colposcopic examination was performed following collection of cervical cells. Findings were recorded by clinic personnel using a standardized data collection form. Cervical samples obtained for diagnosis were submitted to the hospital pathology laboratory and subsequently reviewed by study pathologists. Cervical disease status was based on review of colposcopy, biopsy and cytology findings. CIN was grouped into 3 grades (CIN I, II and III). Subjects with no evidence of dysplasia were designated “No CIN”. We combined those women whose disease would not require treatment other than follow-up, i.e., No CIN and CIN I, into a single group for all analyses.

Within 2 weeks of sample collection, a single TNA sample was prepared from 14 ml of each 20 ml PreservCyt sample using modifications of the MasterPure Complete DNA and RNA Purification kit instructions (Epicentre, Madison, WI) as previously described. 2 TNA was resuspended in 50 μl TE buffer with 50 units of RNasin (Promega, Madison, WI), aliquoted and stored at –70°C until use. TNA quality was evaluated using ethidium bromide–stained denaturing agarose gel electrophoresis. We used appropriate dilutions of this TNA for all assays and included appropriate negative controls to monitor potential contamination. TNA extracts of cell lines CaSki and SiHa (ATCC, Rockville, MD) were prepared and used as HPV-16 positive controls.

HPV DNA detection and genotyping

HPV detection and typing were performed using the Roche line blot assay (reagents provided as a gift from Roche Molecular Systems, Inc., Pleasanton, CA). This assay uses HPV L1 consensus PCR with biotinylated PGMY09/11 primer sets and β-globin as an internal control for sample amplification. 3, 4 TNA, 5 μl of a 1:8 dilution (water diluent), was used in the 100 μl PCR. TNA from CaSki cells harboring HPV-16 was used as the positive control. Amplicons (10 μl) were evaluated for β-globin and HPV bands with 1.5% agarose gel electrophoresis stained with ethidium bromide, and those with an HPV band were hybridized to the typing strips. Samples with an HPV band that did not hybridize to the strip were sequenced as previously described to determine the HPV type. 5 Samples negative for β-globin and HPV were considered inadequate for interpretation. All HPV-16 positive samples were selected for further testing.

HPV-16 DNA quantitation and sequencing

We tested 10 μl of the 1:40 TNA dilution in a 50 μl reaction using the previously described multiplex quantitative PCR to determine HPV-16 viral load and β-globin copy numbers. 6 Viral load was expressed as HPV DNA copies/cell on the basis of 2 copies of β-globin/cell. For dichotomous analyses, the median was used as the cut-off between low and high viral loads.
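The normalization and median split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the four example copy numbers are hypothetical, and the choice to call a value "high" only when it is strictly greater than the median is an assumption about how ties were handled.

```python
from statistics import median


def copies_per_cell(hpv_copies, beta_globin_copies):
    """HPV-16 DNA copies per cell, assuming 2 beta-globin copies per cell."""
    if beta_globin_copies <= 0:
        raise ValueError("beta-globin copy number must be positive")
    cells = beta_globin_copies / 2.0
    return hpv_copies / cells


def dichotomize_at_median(loads):
    """Return the median cut-off and a low/high label for each viral load."""
    cutoff = median(loads)
    # Assumption: values equal to the median are labeled "low".
    return cutoff, ["high" if load > cutoff else "low" for load in loads]


# Hypothetical (HPV-16 copies, beta-globin copies) pairs, not study data.
example_reactions = [(500.0, 2000.0), (10.0, 4000.0), (8000.0, 1000.0), (3.0, 6000.0)]
loads = [copies_per_cell(hpv, globin) for hpv, globin in example_reactions]
cutoff, labels = dichotomize_at_median(loads)

for load, label in zip(loads, labels):
    print(f"{load:.3f} copies/cell -> {label}")
```

Dividing by the cell number rather than by the extract volume keeps samples with very different cellularity comparable.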

For variant determination, an 863 bp fragment corresponding to the E6/E7 region (nucleotides 22–885) was amplified and purified using Centricon YM-100 columns (Millipore, Bedford, MA). The purified product was sequenced with primers corresponding to nucleotides 22–43 (forward) and nucleotides 702–723 (reverse) using the BigDye Terminator Cycle Sequencing Kit and 3100 ABI sequencer (Applied Biosystems, Foster City, CA). Sequence was analyzed using the Sequencher software (Gene Codes, Ann Arbor, MI). HPV-16 variant sequences were identified with reference to the European prototype (Ep) sequence (HPV16R) and classified using previously described groups. 7, 8 Nonprototypes include Asian American (AA), African-1 (Af1), African-2 (Af2) and European variant subclasses E-C109G, E-G131G and E-350G. E6L83V variant refers to nonprototypes (AA, E-C109G, E-G131G and E-350G) with substitution at the 83rd residue from leucine to valine due to a change from T to G at nucleotide 350.

Preparation of cDNA

TNA (3 μl) was digested in a 10 μl reaction with 5 units of DNase I (GenHunter, Nashville, TN) for 30 min at 37°C, then placed on ice. After addition of 1 μl (300 ng) random hexamers (Invitrogen, Carlsbad, CA), the mixture was incubated at 75°C for 5 min, then placed on ice. The final RT reaction (20 μl) included 15 mM Tris-HCl; 50 mM KCl; 2.5 mM MgCl2; 1.0 mM each of dATP, dCTP, dGTP and dTTP; 10 mM DTT; and 12.5 units/μl of Superscript II (Invitrogen). The reaction was incubated at 42°C for 1 hr and terminated by heat inactivation at 70°C for 15 min. cDNA was used for subsequent PCR assays of E6/E7 and β-actin. Four microliters of each RT reaction removed prior to the addition of RT enzyme served as a no-RT control template.

β-Actin PCR

To monitor the integrity of RNA, we amplified a 637 bp fragment of β-actin (GenBank accession X00351) using a previously described primer set. 9 The reverse primer spans the exon 5–6 joining site to prevent amplification of genomic DNA. The 25 μl reaction included 2 μl of a 20-fold dilution of cDNA, 0.4 μM of each primer, 1.5 mM MgCl2 and the reaction buffer and enzyme mix of the High-Fidelity Expand PCR kit (Roche Diagnostics Corp., Indianapolis, IN). Cycling conditions were one cycle at 94°C for 2 min; 30 cycles at 94°C for 15 sec, 55°C for 1 min, 72°C for 2 min; and one cycle at 72°C for 7 min. Products were analyzed on a 1.2% agarose gel stained with ethidium bromide.

Transcription pattern of E6/E7

We modified a previously described nested RT-PCR for evaluation of the E6/E7 transcription pattern. 10 The first PCR contained 2 μl of cDNA and outer primers corresponding to nucleotides 142–161 (forward) and nucleotides 647–666 (reverse). The second PCR contained 1 μl of the first PCR product and nested primers corresponding to nucleotides 192–211 (forward) and nucleotides 567–586 (reverse). Both PCRs (25 μl) included 20 mM Tris-HCl, 50 mM KCl, 1.5 mM MgCl2, 0.3 μM each of primers, 0.2 mM dNTP and 1.25 units of platinum Taq DNA polymerase (Invitrogen). Cycling conditions for both PCRs consisted of one cycle at 94°C for 2 min; 30 cycles at 94°C for 30 sec, 55°C for 45 sec and 72°C for 1 min; and one cycle at 72°C for 7 min.

Amplicons (5 μl) from the final PCR were evaluated with 2.0% agarose gel stained with ethidium bromide to detect bands from the 3 potential E6/E7 transcripts: 395 bp from unspliced E6/E7, 213 bp from the major spliced product E6*I and 95 bp from the minor spliced product E6*II. For each sample, amplification from contaminating DNA was ruled out by reactions containing no-RT control template.

E6/E7 transcript quantification

We developed a real-time RT-PCR with SYBR green I dye detection to quantify all 3 potential E6/E7 transcripts. The HPV-16-specific primer set corresponding to nucleotides 657–680 (forward) and nucleotides 743–764 (reverse) avoided variant positions in the HPV-16 E6/E7 region and yielded a 108 bp product (Tm = 83 ± 0.5°C). Human DNA and TNA cervical extracts with HPV types 6, 18, 11, 31, 33, 35, 42, 45, 51, 52, 53, 54, 55, 56, 59, 66, 68, 83 and 84 gave no products.

We used a purified 238 bp PCR fragment derived from HPV-16 DNA amplified with primers corresponding to nucleotides 527–550 (forward) and 744–764 (reverse) as the standard. The concentration was determined by UV spectrophotometry and adjusted to 1 × 10⁹ copies/μl in Tris EDTA buffer (10 mM Tris HCl, 0.1 mM EDTA). Aliquots were stored at –20°C until use. At the time of each assay, a standard curve ranging from 1 × 10⁰ to 1 × 10⁷ copies/reaction was prepared by 10-fold dilutions of a fresh aliquot.

Reactions were set up in duplicate using the DNA Master SYBR green I mixture (Roche Diagnostics Corp.) and TaqStart antibody (Clontech, Palo Alto, CA)–mediated “hot start”. Each reaction (20 μl) contained 2 μl template, 0.4 μM each primer and 4 mM MgCl2. Cycling conditions for amplification and melting curve analysis were set up as described earlier 11 with primers annealing at 65°C with a 5 sec hold and product-specific signal acquisition at 80°C with a 2 sec hold. Calculations used the Fit point method of LightCycler Software version 3.5 (Roche Diagnostics Corp.). HPV E6/E7 copies were expressed as copies/cell based on the β-globin (cell number) determined for that extract in the previously described DNA quantitation assays. The 25th percentile was used to categorize women into those with low (negatives and <25th percentile, n = 43) vs. high (n = 72) E6/E7 transcript copy number.
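Quantification against a 10-fold dilution series is usually done by regressing the crossing point on log10 copy number and inverting the fitted line for unknowns. The sketch below shows that general approach in Python. It is not the Fit point algorithm of the LightCycler software; the crossing-point values are hypothetical and all function names are illustrative.

```python
import math


def fit_standard_curve(copies, crossing_points):
    """Least-squares fit of crossing point = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(crossing_points) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, crossing_points))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_crossing_point(crossing_point, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((crossing_point - intercept) / slope)


def amplification_efficiency(slope):
    """Per-cycle efficiency implied by the slope (1.0 means perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


# Standards spanning 10^0 to 10^7 copies/reaction, as in the dilution series above.
standard_copies = [10.0 ** k for k in range(0, 8)]
# Hypothetical crossing points lying exactly on a line, for illustration only.
standard_cp = [38.0 - 3.32 * k for k in range(0, 8)]

slope, intercept = fit_standard_curve(standard_copies, standard_cp)
unknown_copies = copies_from_crossing_point(24.72, slope, intercept)
print(f"slope={slope:.2f}, efficiency={amplification_efficiency(slope):.2f}")
print(f"estimated copies/reaction: {unknown_copies:.0f}")
```

The estimated copies per reaction would then be divided by the cell number from the β-globin assay to give copies/cell.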

Statistical analysis

To determine differences between women positive and negative for HPV-16, we used the χ² test for proportions and the t-test or Wilcoxon test for continuous outcomes. Within the HPV-16 positive group, univariate associations of epidemiologic and viral factors with preinvasive CIN II and CIN III diagnoses were measured by ORs and 95% CIs estimated from polytomous logistic regression analysis using no CIN/CIN I as the reference group. Variables significant at the 0.25 level in univariate analyses were used in stepwise logistic regression analyses to determine individual multivariate models for CIN II and CIN III diagnoses. Statistical analysis was done using SAS version 8.1 (SAS Institute, Cary, NC). For comparison of HPV-16 DNA and E6/E7 transcriptional activity, HPV-16 DNA copies/cell and HPV-16 E6/E7 transcript copies/cell were log2-transformed to create a scatterplot. Samples negative for E6/E7 transcript were assigned a value of 1 × 10⁻⁸ copies/cell for log2 transformation. The Pearson correlation coefficient (r) was determined using Microsoft (Redmond, WA) Excel for all samples or samples stratified by disease.
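The log2 transformation and correlation step can be illustrated as follows. This is a minimal Python sketch, not the Excel procedure used in the study: the paired values are hypothetical, the function names are illustrative, and only the 1 × 10⁻⁸ placeholder for transcript-negative samples is taken from the text.

```python
import math

# Value assigned to transcript-negative samples before log2 transformation.
NEGATIVE_PLACEHOLDER = 1e-8


def log2_copies(copies_per_cell):
    """log2 of copies/cell, substituting the placeholder for negative samples."""
    value = copies_per_cell if copies_per_cell > 0 else NEGATIVE_PLACEHOLDER
    return math.log2(value)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements (copies/cell), not study data.
dna_copies = [0.02, 0.5, 1.0, 8.0, 120.0, 3000.0]
transcript_copies = [0.0, 0.004, 0.01, 0.3, 2.0, 40.0]

log_dna = [log2_copies(v) for v in dna_copies]
log_rna = [log2_copies(v) for v in transcript_copies]
print(f"r = {pearson_r(log_dna, log_rna):.2f}")
```

Because the placeholder is far below any measured value, transcript-negative samples sit at an extreme of the log scale and can pull on the correlation, which is worth keeping in mind when reading r.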


Results

Demographic and behavioral characteristics of the study sample

The demographic and behavioral characteristics of the 839 women enrolled are shown in Table I. Reflecting enrollment from urban public hospitals, 70% of the sample reported income under USD$20,000 and two-thirds had high school or less education. The median age was 27 years (range 18–62), and the median BMI was 27.7 (range 16.8–87.6), indicating that 50% of the women were overweight, obese or morbidly obese. There were no significant differences between women from the 2 cities (data not shown). As expected for a population referred to colposcopy, the prevalence of any HPV type was very high, detected in 585 (69.7%) women. HPV-16 had the highest type-specific prevalence, detected in 115 women (13.7%). HPV-16 positive women were more likely to be younger (p = 0.0134), to be white (p = 0.0103) and to have a higher number of lifetime partners (p = 0.0197) than those who were HPV-16 negative. No other significant differences were detected.

Table I. Demographic and behavioral characteristics of the study sample stratified by HPV-16 and cervical disease status

Column groups: HPV-16 status (All, Negative, Positive); Cervical disease in HPV-16+ (n = 115) (No CIN/CIN I, CIN II, CIN III)

Characteristics¹ | All (n = 839) | Negative² (n = 724) | Positive (n = 115) | No CIN/CIN I (n = 60) | CIN II (n = 24) | CIN III (n = 31)
Age (years)³,⁴ | | | | | |
  Median (range) | 27 (18–69) | 28 (18–69) | 25 (19–62) | 24 (19–62) | 24 (19–49) | 27.5 (20–60)
Race | | | | | |
  African American | 78. | | | | |
Marital status | | | | | |
  Married/living with a partner | 22.9 | 23.2 | 20.9 | 16.7 | 20.8 | 29.0
Yearly household income | | | | | |
Education | | | | | |
  High school or less | 67.0 | 66.3 | 71.3 | 66.7 | 75.0 | 77.4
  Beyond high school | 30.3 | 30.7 | 27.8 | 33.3 | 25.0 | 19.4
BMI | | | | | |
  <18.5 (underweight/anorexic) | | | | | |
  18.5–24.9 (normal) | 22.1 | 21.4 | 26.1 | 30.0 | 20.8 | 22.6
  25–29.9 (overweight) | 18.4 | 18.0 | 20.9 | 16.7 | 20.8 | 29.0
  30–39.9 (obese) | 19.4 | 20.9 | 10.4 | 6.7 | 8.3 | 19.4
  ≥40 (morbidly obese) | 7.4 | 6.9 | 10.4 | 6.7 | 16.7 | 12.9
Smoking status | | | | | |
  Never smoked | 69.9 | 70.6 | 65.2 | 75.0 | 62.5 | 48.4
  Former smoker | 9.9 | 10. | | | |
  Current smoker | 18.5 | 17.3 | 26.1 | 16.7 | 29.2 | 41.9
Alcohol consumption | | | | | |
  At least once a month | 24.8 | 24.9 | 24.4 | 23.3 | 20.8 | 29.0
  At least once a week | 24.2 | 23.8 | 27.0 | 23.3 | 29.2 | 32.3
Number of pregnancies | | | | | |
  Median (range) | 2 (0–13) | 2 (0–13) | 2 (0–12) | 2 (0–12) | 2 (0–9) | 3.5 (0–10)
Number of lifetime male partners³ | | | | | |
  Median (range) | 5 (1–800) | 4 (1–800) | 5 (1–30) | 5 (1–14) | 6 (1–30) | 4 (1–5)
Age (years) at first sex | | | | | |
  Mean (SD) | 16.2 (2.7) | 16.3 (2.7) | 16.0 (2.3) | 16.4 (2.2) | 15.6 (2.0) | 15.6 (2.7)
Oral contraceptive use | | | | | |

¹ Values for each characteristic are percentages and may not add up to 100% because of missing data.
² Negative includes HPV-negative and positive for any type other than HPV-16.
³ Significant difference between HPV-16+ and HPV-16− groups.
⁴ Significant difference between diagnosis groups among HPV-16+ women.

Preinvasive cervical neoplasia in HPV-16 positive women

Of the 115 HPV-16 positive women, 37 had No CIN and 23, 24 and 31 had CIN I, CIN II and CIN III, respectively. Disease groups were similar with respect to demographics, lifestyle and sexual behavior, except for age (Table I). Women with CIN III were older than those with CIN II or No CIN/CIN I (p = 0.0213). CIN II was not associated with any of the nonviral characteristics, but CIN III was significantly associated with age (p = 0.0191) and smoking (p = 0.0198) (Table II). HPV-16 positive women who were >25 years old or smoked were 3 times more likely to develop CIN III.

Table II. Univariate analysis of demographic and behavioral risk factors for CIN II and CIN III in HPV-16+ women

Characteristics | CIN II, OR (95% CI)¹ | CIN III, OR (95% CI)¹
Age (years) | |
  >25 | 0.78 (0.30–2.07) | 3.05 (1.20–7.76)²
Race | |
  Black | 1.82 (0.54–6.14) | 0.81 (0.31–2.14)
Education | |
  >High school | 1.0 | 1.0
  ≤High school | 1.50 (0.52–4.37) | 2.0 (0.70–5.68)
Household income | |
  >$20,000 | 0.64 (0.19–2.23) | 0.68 (0.21–2.18)
Marital status | |
  Married/living with partner | 1.0 | 1.0
  Single/divorced/widow | 0.76 (0.23–2.51) | 0.47 (0.17–1.13)
Ever a smoker | |
  Yes | 1.8 (0.65–4.95) | 3.0 (1.19–7.56)²
Alcohol use | |
  Once a month | 0.95 (0.28–3.22) | 1.87 (0.63–5.52)
  Once a week | 1.33 (0.43–4.10) | 2.08 (0.72–6.01)
BMI | |
  Overweight | 1.80 (0.42–7.76) | 2.31 (0.66–8.11)
  Obese/morbidly obese | 2.70 (0.63–11.51) | 3.21 (0.90–11.51)
Number of pregnancies | |
  Never pregnant | 1.0 | 1.0
  1–2 | 1.53 (0.35–6.67) | 1.53 (0.35–6.67)
  ≥3 | 1.47 (0.34–6.39) | 2.49 (0.60–10.29)
Number of lifetime male partners | |
  >5 | 1.52 (0.57–4.04) | 0.82 (0.32–2.11)
Age (years) at first sex | |
  >16 | 0.60 (0.22–1.67) | 0.58 (0.22–1.54)
Oral contraceptive use | |
  Sometimes | 0.53 (0.15–1.93) | 1.33 (0.42–4.28)
  Regularly | 0.51 (0.17–1.53) | 0.86 (0.29–2.54)

¹ Reference group is No CIN/CIN I.
² Statistically significant at the 0.05 level.

Viral cofactors of preinvasive cervical neoplasia in HPV-16 positive women

More than half of the 115 women (61.7%, 71) had 1–12 types in addition to HPV-16 detected. Of the 111 women analyzed for HPV-16 variant status, 76 (68.5%) were nonprototypes. Variant types were AA (12.6%), Af1 (16.2%), Af2 (9.9%), E-C109G (2.7%), E-G131G (9%) and E-350G (18%). E6L83V variant comprised 42.3% (47/111). The HPV-16 viral load ranged from 3.05 × 10⁻⁵ to 3.5 × 10⁵ copies/cell (mean 3.12 × 10³, median 1.0 copies/cell) and cell line controls gave expected results (CaSki 3.94 ± 0.21 × 10² and SiHa 1.2 ± 0.2). The number of cells per volume of extract assayed (0.35% of original sample) ranged from 9.39 to 1.69 × 10⁵ cells (mean 2.00 × 10⁴, median 9.23 × 10³).
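The 0.35% figure appears to follow from the volumes given in the methods, assuming it refers to the template used in the quantitative PCR (14 ml of each 20 ml sample extracted into 50 μl of TNA, of which 10 μl of a 1:40 dilution was assayed). Under that assumption the arithmetic is:

```latex
\[
\frac{14\ \text{ml}}{20\ \text{ml}} \times \frac{10\ \mu\text{l} \times \tfrac{1}{40}}{50\ \mu\text{l}}
  = 0.70 \times 0.005
  = 0.0035
  = 0.35\%
\]
```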

HPV-16 E6/E7 transcripts were detected in the vast majority of women by nested RT-PCR (92, 80.0%) or real-time RT-PCR (97, 84.3%). The E6*I splice product was the strongest band in most samples. The most common nested RT-PCR pattern included transcripts coding for both E6 and E7 proteins (unspliced E6 and E6*I with or without E6*II) and was detected in 63 samples (54.7%) and in the CaSki and SiHa cell lines. Transcripts coding only for E7 (E6*I with or without E6*II) were found in 21 and those coding only for E6 (unspliced E6 only), in 8 samples. Quantitative assessment of E6/E7 transcript copies in samples positive by real time RT-PCR ranged from 1.0 × 10⁻⁵ to 6.8 × 10² per cell (mean 12.9). CaSki and SiHa gave 41.6 ± 1.5 and 14.8 ± 2.6 transcript copies per cell, respectively. The higher level of E6/E7 transcript in CaSki than in SiHa is compatible with a prior report. 12

We examined the relationship between transcript detection, variant status and HPV-16 viral load, using both qualitative (nested RT-PCR) and quantitative (real time RT-PCR) measures. A similar proportion of prototype and nonprototype HPV-16 samples had the predominant E6/E7 pattern (unspliced E6 and E6*I) detected [20/35 (57.1%) and 42/76 (55.2%), respectively] as did E6L83V vs. non-E6L83V samples [28/47 (59.6%) and 34/64 (53.1%), respectively]. However, the unspliced E6 and E6*I pattern was significantly more common in samples with a high compared to low viral load [43/58 (74%) and 20/57 (35%), respectively; p < 0.0001]. The ratio of E6/E7 transcript copies to HPV-16 DNA copies per cell ranged from 2.59 × 10⁻⁵ to 3.22 (mean 9.9 × 10⁻², median 8.1 × 10⁻³). The E6/E7 transcript to DNA ratio was 0.11 for CaSki and 12.23 for SiHa cells, consistent with the relation between SiHa and CaSki reported previously. 12 There was a significant positive linear correlation between the DNA and E6/E7 transcript copies, shown in Figure 1 as a scatterplot of log2-transformed values for all samples (r = 0.76, p < 0.0001). Also shown in Figure 1, the linear regression lines for samples stratified by disease were statistically identical.

Figure 1.

Scatterplot of HPV DNA copy number/cell vs. E6/E7 transcript copy number/cell for all HPV-16+ samples, stratified by cervical disease status. Both HPV DNA and E6/E7 transcript copy numbers were normalized per cell for direct comparison, as shown in the scatterplot. Cell numbers were determined based on the cellular DNA copy number for β-globin (2 copies/cell) in the aliquots of TNA used for these assays. Gray circles and dotted linear regression line, No CIN/CIN I; open squares and solid linear regression line, CIN II; black diamond and dashed linear regression line, CIN III.

Univariate analyses of viral risk factors for CIN II and III are shown in Table III. Women with a high viral load were 5 times more likely to be diagnosed with CIN II or CIN III compared to those with low HPV-16 viral load. If both unspliced E6 and E6*I were present, HPV-16-infected women had a 12.5-fold increased risk of CIN II and a 10-fold increased risk of CIN III. However, high E6/E7 transcript copy number was a risk factor (5-fold) only for CIN III, and E6 unspliced or E6*I transcript alone was significant (10-fold) only for CIN II. Viral characteristics such as multiple HPV types, ratio of E6/E7 transcripts to HPV-16 DNA copy and variant status (both nonprototype and E6L83V) were not associated with a diagnosis of CIN II or CIN III.

Table III. Univariate analysis of viral risk factors for CIN II and CIN III in HPV-16+ women

Characteristic | No CIN/CIN I, n (%) | CIN II¹, n (%) | CIN II¹, OR (95% CI) | CIN III¹, n (%) | CIN III¹, OR (95% CI)
HPV type | | | | |
  HPV-16 only | 20 (33.3) | 8 (33.3) | 1.0 | 16 (51.6) | 1.0
  HPV-16 with other types | 40 (66.7) | 16 (66.7) | 1.0 (0.37–2.73) | 15 (48.4) | 0.47 (0.19–1.14)
Viral load | | | | |
  Low | 41 (68.3) | 7 (29.2) | 1.0 | 9 (29.0) | 1.0
  High | 19 (31.7) | 17 (70.8) | 5.24 (1.86–14.75)² | 22 (71.0) | 5.27 (2.05–13.60)²
E6/E7 transcript copies | | | | |
  Low | 31 (51.7) | 7 (29.2) | 1.0 | 5 (16.1) | 1.0
  High | 29 (48.3) | 17 (70.8) | 2.60 (0.94–7.17) | 26 (83.9) | 5.56 (1.88–16.41)²
E6/E7 transcript pattern | | | | |
  No transcripts | 20 (33.3) | 1 (4.2) | 1.0 | 2 (6.5) | 1.0
  Any pattern | 40 (66.7) | 23 (95.8) | 11.50 (1.45–91.40)² | 29 (93.5) | 7.25 (1.57–33.49)²
  E6 single or E6*I single | 16 (26.7) | 8 (33.3) | 10.0 (1.13–88.49)² | 5 (16.1) | 3.13 (0.53–18.29)
  Both E6 and E6*I | 24 (40.0) | 15 (62.5) | 12.5 (1.52–103.05)² | 24 (77.4) | 10.0 (2.1–47.58)²
RNA/DNA ratio | | | | |
  Low | 27 (45.0) | 8 (33.3) | 1.0 | 8 (25.8) | 1.0
  High | 33 (55.0) | 15 (66.7) | 1.64 (0.61–4.40) | 23 (74.2) | 2.35 (0.91–6.09)
Nonprototype variant | | | | |
  Absent | 20 (34.5) | 7 (29.2) | 1.0 | 8 (27.6) | 1.0
  Present | 38 (65.5) | 17 (70.8) | 1.28 (0.45–3.59) | 21 (72.4) | 1.38 (0.52–3.67)
Variant E6L83V | | | | |
  Absent | 34 (58.6) | 14 (58.3) | 1.0 | 16 (55.2) | 1.0
  Present | 24 (41.4) | 10 (41.7) | 1.01 (0.39–2.66) | 13 (44.8) | 1.15 (0.47–2.83)

¹ Reference group is No CIN/CIN I.
² Statistically significant at the 0.05 level.
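For a single dichotomous predictor, the odds ratio from logistic regression equals the cross-product ratio of the 2 × 2 table, and the Wald 95% CI can be computed from the cell counts. The Python sketch below illustrates this with the viral load counts for CIN III vs. No CIN/CIN I from Table III. It is an illustration of the calculation, not the SAS procedure used in the study, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_wald_ci(exposed_cases, unexposed_cases,
                       exposed_refs, unexposed_refs, level=0.95):
    """Odds ratio and Wald confidence interval from 2 x 2 cell counts."""
    counts = (exposed_cases, unexposed_cases, exposed_refs, unexposed_refs)
    if min(counts) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (exposed_cases * unexposed_refs) / (unexposed_cases * exposed_refs)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Table III, viral load: CIN III high = 22, low = 9; No CIN/CIN I high = 19, low = 41.
or_cin3, lo_cin3, hi_cin3 = odds_ratio_wald_ci(22, 9, 19, 41)
print(f"OR = {or_cin3:.2f} (95% CI {lo_cin3:.2f}-{hi_cin3:.2f})")
# prints: OR = 5.27 (95% CI 2.05-13.60)
```

The wide intervals for the transcript-pattern rows follow directly from this formula: a reference cell with only 1 or 2 women contributes a reciprocal of 1 or 0.5 to the variance of log(OR).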

While the viral characteristics of high viral load, transcript pattern (both unspliced E6 and E6*I present) and transcript copy number were significantly associated with disease, they did not allow for a clear discrimination between HPV-16 subjects with cervical disease requiring treatment and those who could be followed clinically. The sensitivity values in Table IV show that not all of the CIN II or CIN III cases would be identified if HPV-16 DNA-positive women were triaged on the basis of these viral characteristics. The effect is a reduction in the sensitivity of disease detection achieved with HPV-16 DNA detection. Changing the threshold or cut-off value for the viral DNA and transcript copy number did not improve discrimination; areas under the receiver operating characteristic curve were 0.72 and 0.71, respectively (data not shown).

Table IV. Sensitivity and specificity of viral characteristics for disease detection in HPV-16+ women

Viral characteristic | Diagnosis¹ | Sensitivity (%) | Specificity (%)
High viral load | CIN II | 70.8 | 68.3
High viral load | CIN III | 70.9 | 68.3
High copies of E6/E7 transcript | CIN II | 70.8 | 51.6
High copies of E6/E7 transcript | CIN III | 83.8 | 51.6
Both E6/E6*I present | CIN II | 62.5 | 60.0
Both E6/E6*I present | CIN III | 77.4 | 60.0

¹ Reference group is No CIN/CIN I.
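The sensitivity and specificity values can be recomputed from the counts in Table III, treating No CIN/CIN I as the non-diseased group. The Python sketch below is an illustration with hypothetical names. Small differences in the last digit relative to the published values (for example 71.0 vs. 70.9) presumably reflect rounding.

```python
def sensitivity_specificity(cases_positive, cases_total, refs_negative, refs_total):
    """Sensitivity and specificity, both as percentages."""
    sensitivity = 100.0 * cases_positive / cases_total
    specificity = 100.0 * refs_negative / refs_total
    return sensitivity, specificity


REFERENCE_TOTAL = 60  # women with No CIN/CIN I

# Counts from Table III: marker-negative reference women, then
# (marker-positive cases, total cases) for each diagnosis.
markers = {
    "High viral load": {"ref_negative": 41, "CIN II": (17, 24), "CIN III": (22, 31)},
    "High E6/E7 transcript copies": {"ref_negative": 31, "CIN II": (17, 24), "CIN III": (26, 31)},
    "Both E6 and E6*I present": {"ref_negative": 36, "CIN II": (15, 24), "CIN III": (24, 31)},
}

results = {}
for marker, counts in markers.items():
    for diagnosis in ("CIN II", "CIN III"):
        positive, total = counts[diagnosis]
        results[(marker, diagnosis)] = sensitivity_specificity(
            positive, total, counts["ref_negative"], REFERENCE_TOTAL
        )

for (marker, diagnosis), (sens, spec) in results.items():
    print(f"{marker}, {diagnosis}: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```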

Multivariate analysis of factors associated with cervical neoplasia in HPV-16+ women

Variables univariately associated at the 0.25 level were considered for the stepwise logistic regression models of CIN II [i.e., indicators of high viral load, high E6/E7 transcript copies, presence of E6 unspliced (E6 only) or E6*I alone (E7 only), presence of E6/E6*I together (E6 and E7) and frequency of oral contraceptive use] and CIN III (i.e., indicators of age >25, less than high school education, single/divorced/widow marital status, ever smoking, frequency of alcohol use, number of pregnancies, HPV-16 with other types, high viral load, high E6/E7 copies, E6 only or E7 only, E6 and E7 and high RNA/DNA ratio). Age, detection of transcripts for both E6 and E7 and high viral load were independently associated with CIN III (Table V). High viral load was the only factor associated with CIN II.

Table V. Multivariate stepwise logistic regression for risk factors of CIN II and CIN III in HPV-16+ women

Characteristic | CIN II¹, OR (95% CI)² | CIN II¹, p value | CIN III¹, OR (95% CI)² | CIN III¹, p value
Both E6 and E6*I present³ | | | 4.30 (1.40–13.17) | 0.0108
High vs. low viral load | 5.24 (1.86–14.75) | 0.0017 | 3.13 (1.10–8.90) | 0.0321
Age >25 vs. ≤25 | | | 2.94 (1.04–8.29) | 0.0413

¹ Reference group is No CIN/CIN I.
² OR with 95% CI given for variables that met the 0.05 significance level in the multivariate model.
³ Reference group is all other transcript patterns.



Discussion

To determine the factors contributing to cervical disease in HPV-infected women, we focused on the HPV-16 positive women examined at colposcopy. This group was similar to the HPV-16 negative group (women with other HPV types or negative for HPV) with respect to most demographic and behavioral characteristics. HPV-16 positive vs. HPV-16 negative women showed statistically significant differences in age (median 25 vs. 28 years), race (13% vs. 5% white) and lifetime number of sex partners (5 vs. 4); however, the small differences were of questionable biologic significance. While there was no association with cervical disease status, half of the women in the study were overweight, obese or morbidly obese.

Among demographic and lifestyle factors known to be risk factors for cervical neoplasia, univariate analysis identified only age and smoking as being associated with a 3-fold increased risk for CIN III in HPV-16 positive women. Smoking may act as an additional carcinogen, furthering the genetic instability contributed by HPV. In addition, it may modify the host immune response. While we did not find an association between smoking and viral load or viral transcription (data not shown), smoking may be a modifier of HPV function in some way. Although the magnitude of the increased risk from smoking was in the range of earlier reports, 13, 14, 15 smoking was not an independent risk factor for CIN III in the multivariate model. As reviewed previously, age, oral contraceptive use, smoking, number of pregnancies (parity) and number of lifetime sexual partners modulate the risk of progression from HPV infection to high-grade cervical lesions and cancer. 16 Our failure to detect associations with all these factors may be attributable to the cross-sectional study design, the size of our study population or population differences across studies.

Based on β-globin determination for each TNA extract, cell content varied between samples. Therefore, for correlations of viral load and viral transcripts with disease, we normalized HPV-16 DNA and E6/E7 transcript copies by cell number. This normalization approach has been used by other investigators 17 for viral DNA load. Artifacts from sample-to-sample variation in the efficiency of the RT reactions and PCRs will not be accounted for by this correction; however, extraction and reaction conditions were applied consistently to all samples to minimize this variation. Because we performed the HPV-16 RNA and DNA assays on aliquots of the same extract, viral RNA and DNA copies could be compared directly.

High viral load (greater than the median 1 copy/cell) was a risk factor (3- to 5-fold) for both CIN II and CIN III. Previous studies using either quantitative PCR 17, 18, 19, 20, 21 or hybrid capture II assays 22, 23, 24 also found viral load to be associated with CIN III, though the reported magnitude of risk varied widely, from 3- to 365-fold. High viral load may be a surrogate for viral persistence, but longitudinal studies would be required to address this.

We found E6/E7 expression in the majority of HPV-16 positive samples by both nested RT-PCR and real-time RT-PCR (80% and 84%, respectively). We used nested RT-PCR to identify the 3 different E6/E7 transcripts resulting from alternative splicing. Because each transcript has a different coding potential, the E6/E7 transcript pattern may be associated with neoplastic progression. A transcript pattern including both unspliced E6 and E6*I transcripts, encoding E6 and E7 proteins, respectively, was an independent 4-fold risk factor for the development of CIN III. This is consistent with results of in vitro models and transgenic mouse experiments, suggesting different and complementary roles for the E6 and E7 proteins in the multistep process of cervical carcinogenesis. 25

As overexpression of E6/E7 due to viral integration and loss of E2 repression or to promoter hypomethylation is considered a key mechanism for neoplastic progression, 26 we hypothesized that E6/E7 overexpression, reflected by alterations in the ratio of E6/E7 transcripts to HPV-16 DNA, would be associated with disease. Contrary to our expectations, we found the number of HPV-16 DNA copies/cell and the number of E6/E7 transcript copies/cell to be linearly correlated regardless of cervical disease. The disease-specific regression lines for the relationship of log2 E6/E7 transcript copies and log2 viral DNA copies per cell completely overlapped (Fig. 1). These findings agree with one prior study that used quantification of HPV-16 DNA and RNA by PCR and microwell hybridization 27 and found no change in the ratio of transcripts to DNA with disease progression but disagree with another study that used real-time PCR and concluded that, while both HPV-16 DNA and E6/E7 RNA copy number increased with disease, RNA was a more sensitive indicator of disease severity. 28 The latter study used an RT-PCR assay targeting different regions of E6/E7 and a one-step RT-PCR protocol that did not permit use of a no-RT control. Moreover, these investigators had relatively few HPV-16 positive samples, used separate DNA and RNA isolation methods and correlated results with cytology rather than biopsy findings.
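The comparison of disease-specific regression lines can be sketched as an ordinary least-squares fit of log2 transcript copies per cell on log2 DNA copies per cell within each disease group. This is a minimal illustration only: the function name is hypothetical, the values are invented with a constant transcript/DNA ratio, and nothing here is the statistical procedure or the data used in the study.

```python
import math


def fit_log2_line(dna_per_cell, rna_per_cell):
    """Least-squares slope and intercept of log2(RNA) on log2(DNA).

    Inputs must be positive because of the log transform.
    """
    x = [math.log2(v) for v in dna_per_cell]
    y = [math.log2(v) for v in rna_per_cell]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented values (illustrative only): transcripts are a fixed 0.1 of DNA
# copies in both groups, so the two fitted lines coincide.
groups = {
    "No CIN/CIN I": ([0.5, 2, 8, 32], [0.05, 0.2, 0.8, 3.2]),
    "CIN III": ([1, 4, 16, 64], [0.1, 0.4, 1.6, 6.4]),
}
fits = {name: fit_log2_line(dna, rna) for name, (dna, rna) in groups.items()}
```

On the log2 scale a constant transcript/DNA ratio appears as a slope of 1 with an intercept equal to log2 of the ratio, so an upward shift of the intercept in one disease group would be the signature of relative overexpression.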

The failure to detect a change in the ratio of oncogenic transcripts to viral DNA in exfoliated cells does not exclude E6/E7 deregulation as a biologically significant mechanism of disease progression, but it does make it unlikely that assays of HPV transcriptional activity will improve the sensitivity or specificity of cervical cancer screening. Transcriptional deregulation may be most apparent in the basal compartment of the epithelium, which is not well represented in exfoliated samples, or it may be important later in neoplastic progression in the transition from in situ to invasive disease. In addition, deregulation in E6/E7 expression with disease progression may be masked by the spectrum of disease present in women with CIN III 29 as well as by the large number of transcriptionally silent HPV-16 DNA copies. The mean ratio of E6/E7 transcript copies/viral DNA was <0.1, indicating that fewer than one in 10 DNA copies is active, so small but significant changes in transcript copies could be missed.

In summary, our study demonstrates transcriptional activity of HPV-16 in the majority of HPV-16 positive samples, a correlation of HPV-16 viral load and E6/E7 transcriptional activity and evidence that fewer than one in 10 HPV-16 DNA copies is transcriptionally active. Use of a TNA extraction protocol and accurate real-time PCR quantitative measures of RNA and DNA extracted from clinical material strengthens our findings. While high viral load, high transcript copies and transcript pattern were statistically associated with cervical disease, none of these factors adequately discriminated between disease categories. This emphasizes that, while statistical significance can be used to search for biomarkers, sensitivity and specificity for discrimination between groups must also be evaluated. These findings cannot be extrapolated to all high-risk HPV types as differences have been reported for the association between copy number and disease for other viral types. 18 However, they do suggest that cellular proliferation and differentiation pathways specifically affected by high-risk HPV may perform better than viral factors as biomarkers for cervical cancer screening. We are currently using cellular gene expression profiling to identify candidate biomarkers.
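The distinction between statistical association and discrimination can be made concrete with a small sketch that computes sensitivity and specificity for a marker dichotomized at a cutoff. The function name, the cutoff and the values are hypothetical and chosen for illustration; they are not measurements from this study.

```python
def sensitivity_specificity(values_diseased, values_nondiseased, cutoff):
    """Sensitivity and specificity when values above the cutoff are called positive."""
    true_positives = sum(v > cutoff for v in values_diseased)
    true_negatives = sum(v <= cutoff for v in values_nondiseased)
    sensitivity = true_positives / len(values_diseased)
    specificity = true_negatives / len(values_nondiseased)
    return sensitivity, specificity


# Invented per-cell marker values (illustrative only). The diseased group is
# shifted upward on average, yet the two distributions overlap.
diseased = [0.5, 2, 3, 8, 20]
nondiseased = [0.1, 0.4, 0.9, 1.5, 6]
sens, spec = sensitivity_specificity(diseased, nondiseased, cutoff=1.0)
```

A marker whose distributions overlap in this way can yield an elevated odds ratio while still misclassifying a substantial share of women at any single cutoff.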


M.T.R. was supported by a grant from the National Cancer Institute (K24 CA80846). The authors gratefully acknowledge the assistance of Ms. J. Bailey, Ms. M. Bienaiz, Ms. S. Murray, Ms. D. Marshall and Dr. D. Parker with patient recruitment and Ms. D. Taysavang, Ms. M. Brown, Ms. S. Bergman, Ms. R.A. Tucker and Dr. D. Rollin with laboratory assays.