Discovery and development of differentially methylated regions in human papillomavirus‐related oropharyngeal squamous cell carcinoma

Human papillomavirus (HPV)-related oropharyngeal squamous cell carcinoma (OPSCC) exhibits a distinct pattern of epigenetic alterations. In this study, we identified differentially methylated regions (DMRs) with potential utility in screening for HPV-positive OPSCC. Genome-wide DNA methylation was measured using methyl-CpG binding domain protein-enriched genome sequencing (MBD-seq) in 50 HPV-positive OPSCC tissues and 25 normal tissues. Fifty-one DMRs were defined with maximal methylation specificity to cancer samples. The Cancer Genome Atlas (TCGA) methylation array data were used to evaluate the performance of the proposed candidates. Supervised hierarchical clustering of the 51 DMRs showed that HPV-positive OPSCC had significantly higher DNA methylation levels than normal samples and non-HPV-related head and neck squamous cell carcinoma (HNSCC). The methylation levels of all top 20 DNA methylation biomarkers in HPV-positive OPSCC were significantly higher than those in normal samples. Further confirmation using quantitative methylation-specific PCR (QMSP) in an independent set of 24 HPV-related OPSCCs and 22 controls showed that 16 of the 20 candidates had significantly higher methylation levels in HPV-positive OPSCC samples than in controls. One candidate, OR6S1, had a sensitivity of 100%, while 17 candidates (KCNA3, EMBP1, CCDC181, DPP4, ITGA4, BEND4, ELMO1, SFMBT2, C1QL3, MIR129-2, NID2, HOXB4, ZNF439, ZNF93, VSTM2B, ZNF137P and ZNF773) had specificities of 100%. The prediction accuracy of the 20 candidates ranged from 56.2% to 99.8% by receiver operating characteristic analysis. We have defined 20 highly specific DMRs in HPV-related OPSCC, which can potentially be applied to molecular-based detection tests and improve disease management.


Introduction
HNSCC is a highly lethal cancer that affects over 60,000 people in the United States annually. 1 While the overall incidence of HNSCC, including HPV-unrelated oral cancers, has decreased in the past decades, the incidence of oropharyngeal cancer continues to increase, and it is now clear that HPV infection is driving this increase. 2 HPV-related HNSCC is known to occur in a patient population separate from that of tobacco-related HNSCC, and exhibits a different etiology, site predilection, prognosis, and set of genetic and epigenetic alterations. 3,4 High-risk HPV plays an important causal role in the pathogenesis of OPSCC, analogous to the etiologic role of high-risk HPV infection in cervical cancer. The overwhelming majority of HPV-positive OPSCC is currently diagnosed as advanced stage III/IV disease requiring multimodality therapy, whereas early stage disease can usually be treated with single modality therapy at decreased cost and morbidity.
Cervical cancer, another HPV-related cancer, has benefitted from aggressive screening, including routine pap smears, regular gynecologic examination, and HPV DNA detection. This benefit exists despite the objectively poor test performance of pap testing, and a decrease in disease-specific morbidity has been achieved. 5 Given the precedent of successful screening for cervical cancer, it is attractive to consider similar population-based screening for HPV-positive OPSCC. However, there are barriers to initiating population-based screening for HPV-positive OPSCC using standard cytologic examination. HPV DNA in salivary rinses is relatively prevalent and is not specific enough to serve as a single screening test: data demonstrate that high-risk HPV DNA prevalence in saliva is as high as 6% in some groups, including men in their 50s. 6 Given these barriers to HPV DNA-based population screening for HPV-positive OPSCC, it is attractive to define other measurable DNA alterations characteristic of HPV-positive OPSCC. Numerous studies have demonstrated that abnormal epigenetic regulation is closely related to tumorigenesis. 7-10 In the dominant paradigm, methylation of CpG island promoters results in transcriptional repression of tumor suppressor genes. Analyses of HNSCC conducted by TCGA have shown significant epigenetic differences between HPV-related HNSCC and non-HPV HNSCC. 3 HPV-related HNSCC has higher average methylation levels over regularly methylated loci than non-HPV HNSCC.
Of note, the TCGA analysis is not based upon comparison with noncancer control tissues, but depends upon mucosal tissues adjacent to primary HNSCC. Unfortunately, "normal" mucosal tissues adjacent to HNSCC are well known to carry genetic and epigenetic alterations characteristic of HNSCC, and given the high rate of tobacco exposure in this cohort, they may confound these analyses. 11 The cis control of gene expression through alterations in DNA methylation can be complex, requiring truly unbiased assessment of DNA methylation to draw robust inferences. Genome-wide epigenetic profiling technologies have proven to be valuable in the study of distinct patterns of DNA methylation. While the Illumina DNA methylation array in the TCGA study is biased toward designed probes within designated regions, such as gene promoter regions, we employed a moderate-cost, quantitative genome-wide DNA methylation profiling strategy, MBD-seq, in our study. MBD-seq provides a more unbiased genome-wide evaluation of the methylome landscape and has sensitivity high enough to detect single methyl-CpG sites. 12,13 In our study, we performed a whole-genome epigenetic analysis in HPV-positive OPSCC to facilitate biomarker development. We defined epigenetic biomarkers and qPCR-based assays from a cohort of HPV-positive OPSCC patients, and then performed a case-control study to assess the ability of these markers to discriminate patients with HPV-positive OPSCC. Our study serves as a discovery and diagnostic validation of DMRs for HPV-positive OPSCC detection, with potential application to the development of detection tests as well as to defining epigenetic drivers of HPV-positive OPSCC.

Tissue samples
We used two independent cohorts, each including HPV-positive OPSCC patient samples and normal control samples. Patients were evaluated and enrolled in accordance with a predefined protocol in the Department of Otolaryngology-Head and Neck Surgery at Johns Hopkins Medical Institutions (Baltimore, MD). Appropriate informed consent was obtained after Institutional Review Board approval. The discovery cohort included 50 primary HPV-positive OPSCC tissues and 25 normal mucosal samples from noncancerous patients from the previously published study. 14,15 All primary tissues were independently confirmed to be consistent with OPSCC by two investigators from the Pathology Department of Johns Hopkins Hospital. The HPV status was determined using in situ hybridization for high-risk HPV subtypes or p16 immunohistochemistry. HPV16 E6 and E7 quantitative real-time polymerase chain reaction (qPCR) was used for confirmation in equivocal cases. Clinical characteristics of the discovery cohort are shown in Table S1, Supporting Information. The independent validation cohort included 24 primary HPV-positive OPSCC tissues and 22 normal samples from noncancer patients. To obtain robust cancer-unaffected controls, control specimens were randomly selected from available uvulopalatopharyngoplasty (UPPP) surgeries of cancer-unaffected patients.

What's new? High-risk human papillomavirus (HPV) infection plays an important role in the pathogenesis of oropharyngeal squamous cell carcinoma, but population-based HPV screening similar to that for cervical cancer is not yet in place. To define other measurable DNA alterations apart from HPV, the authors performed genome-wide quantitative DNA methylation profiling and identified 20 differentially methylated regions with high prediction accuracy for HPV-positive oropharyngeal tumors, opening the door for the development of detection assays as well as defining epigenetic alterations associated with HPV-positive tumors.

DNA preparation
Ten-micron frozen sections of tumor or normal tissue samples were prepared. Thirty-five frozen sections on slides were microdissected and digested in 1% SDS (Sigma-Aldrich, St. Louis, MO) and 50 μg/mL proteinase K (Invitrogen, Carlsbad, CA) at 48 °C for 48 hours. DNA was purified by phenol-chloroform extraction and ethanol precipitation as described previously. 16 DNA was resuspended in LoTE buffer, and DNA concentration was quantified using the NanoDrop spectrophotometer (Thermo Scientific, Carlsbad, CA).

RNA preparation
RNA was isolated from 0.35 mm thick frozen tissue with the mirVana miRNA Isolation Kit (Ambion, Foster City, CA) per the manufacturer's recommendations. RNA concentration was quantified using the NanoDrop spectrophotometer.

DNA methylation sequencing analysis
While the Illumina DNA methylation array is biased toward designed probes within designated regions, such as gene promoter regions, recently developed methyl-CpG binding domain genome sequencing (MBD-seq) provides a more unbiased genome-wide evaluation of the methylome landscape and has sensitivity high enough to detect single methyl-CpG sites. 12,13 Approximately 2 μg of genomic DNA from each sample was sonicated to a modal size of 150-200 bp. The resulting fragments were subjected to end-repair using the NEBNext SOLiD DNA library preparation kit end-repair module (New England Biolabs, Ipswich, MA). The library was ligated to double-stranded 5′-dephosphorylated P1 and P2 SOLiD sequencing adaptors (Thermo Scientific, Carlsbad, CA), and subjected to nick translation to remove nicks generated at the 5′ end of the double-stranded adaptors during ligation. This library was then divided into two fractions: a total input fraction, and a fraction that was enriched for methylated DNA using MBD2-MBD bound magnetic beads as described previously. 9,12,17 Each of these fractions was subjected to emulsion PCR and SOLiD sequencing workflows in the Next Generation Sequencing Center (NGSC) facility. The obtained sequences were mapped against the reference human genomic sequence (GRCh37/hg19) using Bioscope/Lifescope software, which specializes in the analysis of SOLiD next generation sequencing data.
Methods for RNA sequencing and data processing have been previously described. 14,15 Briefly, 47 tumors and 25 normal samples passed minimum quality thresholds after RNA extraction. Sequencing was performed using the HiSeq 2500 platform sequencer (Illumina) and the TruSeq Cluster Kit for 2×100 bp sequencing. The RNA sequencing data were then normalized using the version 2 protocols developed by TCGA. 3 The RNA sequences were aligned to the GRCh37/hg19 genome assembly using MapSplice2 version 2.0.1.9. Supervised clustering of the RNA sequencing data was performed using R/Bioconductor version 3.3.2.

TCGA data selection
The TCGA data set for HNSCC has been completed and partly published. 3 We selected the HPV-positive OPSCC samples, which comprised 48 samples. Histologic imaging available for TCGA normal specimens was examined, and only specimens with normal mucosal epithelia were included as normal. For normal controls, we kept only the 6 samples confirmed as true squamous epithelium tissues, to reach a credible validation. The Illumina probe for one candidate, LOC388692, was not found within a 300 bp range, and this candidate was not further analyzed. Supervised clustering of the 50 candidate DMRs in the methylation array data was performed using R/Bioconductor version 3.3.2. Two Illumina probes were found for each of SFMBT2 and OR1F1, and only one probe per gene was kept in the heatmap.

Bisulfite treatment and quantitative methylation-specific PCR (QMSP)
QMSP was performed in another independent validation cohort to confirm the DNA methylation candidates. DNA from tissue samples was subjected to bisulfite conversion. Unmethylated cytosines in 2 μg of genomic DNA were converted to uracil using the EpiTect Bisulfite Kit (Qiagen, Valencia, CA) as described previously. 16,18 Primers and probes for QMSP of each of the top 20 candidates were designed on the bisulfite-converted sequence using PrimerQuest (http://www.idtdna.com/Primerquest/Home/Index) to specifically include CpG dinucleotides in the regions found to be hypermethylated in tumor samples by MBD-seq. Primers and a probe for β-actin were designed in areas without CpG nucleotides, so that amplification of the β-actin gene was independent of the methylation status of CpG nucleotides. Primer and probe sequences are available in Table S3, Supporting Information. QMSP was performed on a real-time PCR machine with normalization to an internal reference control (β-actin) as described previously. 10,18-20

Statistical analysis
Model-based analysis of ChIP-seq (MACS) analysis and candidate prioritizing. Methylated regions were identified as positional peaks in the population of aligned sequencing reads in the MBD-enriched data compared with the total input fraction, using the MACS v1.4 software. 9,21-23 MACS identifies peaks after accounting for both global and local biases using the enriched-to-input fraction. A MACS p-value cut-off (p < 10⁻⁶) was used to define regions that were methylated. Utilizing our differential.coverage R package (https://github.com/favorov/differential.coverage), we uniformly separated the human genome into nonoverlapping 100 bp regions. For each region, we distributed all the samples into a 4-field contingency table representing whether the sample is tumor or normal and whether the region intersects with any MACS-processed MBD-seq signal in the sample.
By calculating the Fisher exact test p-value for the contingency table, we obtained for each region a measure of association between the MBD signal and the disease status. We filtered out all regions with an FDR-adjusted Fisher exact test p-value >0.05. Because of its high tumor heterogeneity, HNSCC is commonly characterized by a low incidence of individual methylation events, and differences are difficult to detect using conventional statistical methods. Therefore, we focused on regions with minimal DNA methylation detection in normal samples (Fig. 1). We removed all regions in which any normal sample had a MACS signal, and we focused only on regions in which over 50% of tumor samples had methylation signals. We also analyzed each 100 bp region together with its 300 bp flanks (a 700 bp region in total), to account for the minimal length of MBD-seq reads. For each overlapping 700 bp region we built the distribution of the number of reads in the MBD-seq-enriched results for normal samples. We then filtered out all regions with nonzero median and with any outlier that had more than 20 reads within each 700 bp region. Lastly, all overlapping regions (within 500 bp of each other) were combined, since a common QMSP assay would be designed for them.
P-value calculation. DNA methylation beta values (β, proportion of methylation) in the TCGA cohort and DNA methylation levels confirmed by QMSP were compared between OPSCC and control samples using Student's t-test. All statistical tests were two sided, with p < 0.05 considered statistically significant.
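The region-level prioritization described above can be illustrated with a minimal Python sketch. It is not the authors' differential.coverage R package: the function names, the toy counts, and the choice of the Benjamini-Hochberg procedure for the FDR adjustment are assumptions made here for illustration (the text states only that p-values were FDR-adjusted). The sketch covers the per-region 2×2 Fisher exact test, the adjustment, the "no normal sample methylated, more than 50% of tumors methylated" filter, and the merging of regions within 500 bp of each other.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over all tables that are no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR method), input order kept."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_regions(counts, n_tumor=50, n_normal=25, alpha=0.05, min_tumor_frac=0.5):
    """counts maps a region id to (tumors with a peak, normals with a peak)."""
    ids = list(counts)
    pvals = [fisher_exact_two_sided(t, n_tumor - t, n, n_normal - n)
             for t, n in (counts[r] for r in ids)]
    qvals = benjamini_hochberg(pvals)
    return [r for r, q in zip(ids, qvals)
            if q <= alpha                               # keep FDR-adjusted p <= 0.05
            and counts[r][1] == 0                       # no normal sample with a peak
            and counts[r][0] / n_tumor > min_tumor_frac]  # peak in over 50% of tumors

def merge_nearby(intervals, max_gap=500):
    """Merge (start, end) intervals on one chromosome that lie within max_gap bp."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]

# Hypothetical toy input: region id -> (methylated tumors, methylated normals).
toy_counts = {"r1": (40, 0), "r2": (30, 5), "r3": (10, 0), "r4": (26, 0)}
selected = select_regions(toy_counts)
```

In this toy input, region r2 is dropped because some normal samples carry a peak and r3 because too few tumors do, which mirrors the emphasis in the text on specificity over raw statistical significance.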
Sensitivity and specificity quantitation. We estimated β-values in the TCGA cohort from unmethylated (U) and methylated (M) measurements on a probe-level basis as β = M/(M + U). 16 A sample was classified as "test positive" if β > 0.15, and "test negative" if β < 0.15. Subjects with a diagnosis of HPV-positive OPSCC were defined as "presence of disease", and normal controls were defined as "absence of disease". True positive (sensitivity) and true negative (specificity) rates were determined for each candidate. The 95% confidence intervals (CIs) for sensitivity and specificity were calculated assuming a binomial distribution using R/Bioconductor version 3.3.2. A cutpoint for the DNA methylation detected by QMSP was established for each candidate using a modified version of the optimality criteria proposed by Perkins and Schisterman 24 (Table 2). Methylation levels of each candidate were treated as binary variables by dichotomizing the DNA methylation at the corresponding cutpoint. Sensitivity, specificity and the corresponding CIs were calculated in the same way as described above.
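As a worked illustration of the classification rule above, the following Python sketch computes β = M/(M + U), applies the 0.15 cut-off, and reports sensitivity and specificity with 95% intervals. Several points are assumptions rather than the authors' procedure: the analysis in the paper was done in R, the text does not say which binomial interval was used (the Wilson score interval is used here as one common choice), samples with β exactly equal to 0.15 are treated as test negative, and all function names and toy values are hypothetical.

```python
from math import sqrt

def beta_value(methylated, unmethylated):
    """Beta value from methylated (M) and unmethylated (U) signal: M / (M + U)."""
    return methylated / (methylated + unmethylated)

def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (an assumed CI method)."""
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return max(0.0, center - half), min(1.0, center + half)

def sensitivity_specificity(tumor_betas, normal_betas, cutoff=0.15):
    """Classify beta > cutoff as test positive; everything else as test negative."""
    true_pos = sum(b > cutoff for b in tumor_betas)
    true_neg = sum(not (b > cutoff) for b in normal_betas)
    return {
        "sensitivity": true_pos / len(tumor_betas),
        "sensitivity_ci": wilson_ci(true_pos, len(tumor_betas)),
        "specificity": true_neg / len(normal_betas),
        "specificity_ci": wilson_ci(true_neg, len(normal_betas)),
    }

# Hypothetical toy beta values for one candidate probe.
result = sensitivity_specificity([0.6, 0.4, 0.1, 0.8], [0.05, 0.10, 0.02])
```

With only a handful of normal samples, as in the TCGA comparison with 6 controls, an observed specificity of 100% still carries a wide interval, which is why the intervals are worth reporting alongside the point estimates.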
Receiver operating characteristic analysis. To explore the performance of each individual candidate, a receiver operating characteristic analysis was performed, and the area under the curve (AUC), an index of predictive power, was reported. All analyses were done using R/Bioconductor version 3.3.2.
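The AUC reported for each candidate can be understood as the probability that a randomly chosen tumor sample has a higher methylation level than a randomly chosen control. The short Python sketch below computes it directly from that definition (equivalent to the Mann-Whitney statistic), counting ties as one half. It is an illustration only: the paper's analysis was run in R, and the function name and toy values here are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical methylation levels for one candidate.
auc_example = roc_auc([0.9, 0.8, 0.3], [0.1, 0.2, 0.3])
```

An AUC near 0.5 indicates no discrimination between tumors and controls, while a value near 1 indicates that almost every tumor sample is more methylated than every control.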

Discovery of differentially methylated candidates in HPV-positive OPSCC
MBD-seq data from a discovery cohort of 50 HPV-related OPSCC primary tumors and 25 normal tissue samples were used to identify relevant differentially methylated candidates in HPV-positive OPSCC. HPV-positive OPSCC patients were predominantly male (90%) and white (96%), with an average age of 55.2 years (Table S1, Supporting Information). The control patients were more evenly distributed by sex and ethnic group (40% male and 56% white). Current or former smoking was found in 62% of HPV-positive OPSCC patients, compared with 24% of the control population. Current or past alcohol consumption was found in 70% of HPV-positive OPSCC patients and in 4% of healthy control patients. Utilizing MACS-processed MBD-seq data, we prioritized HPV-positive OPSCC-related candidates based on the scheme summarized in Fig. 1. We focused on regions with minimal DNA methylation detection in normal samples, and 51 candidates with maximal methylation specificity to cancer samples were defined for further analysis (Table S2, Supporting Information). These 51 candidate DMRs demonstrated maximal specificity (all 100%) and were highly prevalent in OPSCC (50-80%). Of the 51 candidates, 40 were found within UCSC-annotated CpG islands, 9 within CpG shores (2,000 bp upstream or downstream of a CpG island), and 2 were outside of any CpG island, CpG shore, or CpG shelf (2,000 bp upstream or downstream of a CpG shore). The majority (n = 33) of candidates were found within 6,000 bp of a nearby gene transcription start site.
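The island, shore and shelf categories used above follow a simple distance rule, which the following Python sketch makes explicit. It is only an illustration under stated assumptions: the island coordinates are invented, a single position stands in for a whole DMR, the label "outside" is a placeholder for regions beyond the shelf, and real annotation would use the UCSC CpG island track rather than a hand-written list.

```python
def cpg_context(position, islands, flank=2000):
    """Label a position by its distance to the nearest CpG island.

    islands is a list of (start, end) coordinates on the same chromosome.
    Shores extend `flank` bp beyond an island, shelves a further `flank` bp.
    """
    # Distance is 0 inside an island, otherwise the gap to the closer edge.
    distance = min(max(start - position, position - end, 0) for start, end in islands)
    if distance == 0:
        return "island"
    if distance <= flank:
        return "shore"
    if distance <= 2 * flank:
        return "shelf"
    return "outside"

# Hypothetical island at 10,000-11,000 bp.
toy_islands = [(10000, 11000)]
labels = [cpg_context(p, toy_islands) for p in (10500, 9000, 13500, 20000)]
```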

Confirmation of methylation markers for differentiating HPV-positive OPSCC and normal tissues in TCGA cohort
To confirm the DNA methylation levels of the 51 DMR candidates in a validation OPSCC cohort, DNA methylation results were downloaded from TCGA, which includes 528 HNSCC tumors, containing 48 HPV-positive OPSCC, and 6 normal samples. Unfortunately, TCGA contains only 6 control samples confirmed as true squamous epithelium tissues (the other 44 TCGA samples designated as controls belong to muscle, salivary gland and other tissues). We matched Illumina probes within a 300 bp range of the 51 candidates (Table S2, Supporting Information). The Illumina probe for one candidate, LOC388692, was not found within a 300 bp range, and this candidate was not further analyzed. Of the 50 analyzed DMRs, 49 were found to be significantly hypermethylated in HPV-positive OPSCC by Student's t-test (Table S2, Supporting Information). Overall, of the 50 DMRs, 32 (64%) had DNA methylation β < 0.15 in all 6 normal samples, and 37 (74%) had β > 0.15 in 80-100% of tumor samples. Supervised hierarchical clustering of these candidate DMRs was able to distinguish between HPV-positive OPSCC and normal samples. Interestingly, clustering of these DMRs was also able to distinguish HPV-positive OPSCC from HPV-negative HNSCC and HNSCC at other tumor sites, indicating that these candidate DMRs are highly specific to HPV-positive OPSCC (Fig. 2). Overall, the TCGA methylation data provided independent confirmation of, and high concordance with, the discovery cohort for the 50 candidate DMRs.

Cancer Epidemiology
Ren et al.
ZNF137P and ZNF773. Among these 20 candidates, 17 were found within CpG islands, and 3 within CpG shores (Table 1). The methylation levels of the 20 candidates in the 48 HPV-positive OPSCC and 6 controls in the TCGA dataset are shown separately (Fig. 3). The methylation levels of all 20 candidates were significantly higher among the patients with HPV-positive OPSCC than among controls. Setting a cut-off value of β = 0.15, we found that 18 of the 20 biomarkers showed 100% specificity, while 16 of the 20 biomarkers showed 80-100% sensitivity for distinguishing tumor from normal in the TCGA cohort (Table 1). Moreover, we investigated the performance of these 20 candidates in HPV-negative OPSCC separately, and found that the methylation levels of 16 candidates were significantly higher in HPV-positive OPSCC than in HPV-negative OPSCC (Fig. S1, Supporting Information). DNA hypermethylation in promoter regions often leads to decreased gene expression through epigenetic silencing. We therefore used gene expression data from our discovery cohort (including 47 HPV-positive OPSCC and 25 normal samples) to investigate potential gene expression changes driven by DNA methylation. TCGA gene expression data were not used because only 6 true control samples were included. Supervised hierarchical clustering of the expression of the 20 candidates is shown in Figure S2, Supporting Information. Two candidates (EMBP1 and MIR129-2) were excluded because their expression levels were not available in the RNA-seq data of the discovery cohort. Most of the candidates within CpG islands demonstrated decreased expression in HPV-positive OPSCC compared to normal samples, which was consistent with their increased methylation levels.

Discussion
HPV-related OPSCC is a separate entity from HPV-negative HNSCC tumors. However, because of the asymptomatic nature of early stage lesions, patients usually present at a more advanced stage (III/IV) at initial diagnosis, and detection markers hold great promise for the timely diagnosis and treatment of HPV-positive OPSCC. 25 In our study, we identified and confirmed specific DNA methylation markers for accurate classification and diagnosis of HPV-positive OPSCC.
We employed a genome-wide DNA methylation profiling strategy, MBD-seq, in 50 HPV-positive OPSCC and 25 normal controls and defined 51 DMRs with maximal cancer specificity. Candidates were additionally confirmed in TCGA and in a second independent cohort, as high-throughput data are often poorly validated because of biases inherent to a single methodology. 16,26 We also utilized diverse detection platforms with different methodologies: Illumina Infinium HumanMethylation450 arrays and QMSP for DNA methylation, and RNA-seq for gene expression. Using our pipeline, 49 of the 51 candidates were confirmed to be specifically hypermethylated in HPV-positive OPSCC tissues, and hypomethylated in noncancer tumor-adjacent tissues in TCGA. We identified the top 20 promising markers and further confirmed them using QMSP in an independent cohort, in which 17 markers demonstrated a specificity of 100%, with a robust ability to differentiate HPV-positive OPSCC from normal controls according to receiver operating characteristic analysis. Analysis of the gene expression of these 20 candidates in the discovery cohort suggested that most of them had decreased expression in HPV-positive OPSCC compared to normal samples. Biologically relevant methylation changes occur earlier in cancer development, 27,28 and only methylation changes that occur earlier in tumor development will allow for the development of subsequent gene expression changes. 18 This may explain why the decreased gene expression of the 20 candidates was not perfectly consistent with their increased methylation levels.
Of the 20 candidates confirmed by multiple confirmation steps, ITGA4 (integrin subunit alpha 4) has previously been found to be hypermethylated in OPSCC, 29 and NID2 (nidogen 2) and HOXB4 (homeobox B4) have previously been found to be hypermethylated and downregulated in oral squamous cell carcinoma (OSCC). 27,30 The functional roles of ITGA4, NID2 and HOXB4 have not been investigated in HNSCC. However, in chronic lymphocytic leukemia, ITGA4 on the surface of cells facilitates migration and adhesion to the microenvironment. 31 NID2 is known as a component of the basement membrane that stabilizes the extracellular matrix (ECM) network. Expression of NID2 suppresses migration and inhibits metastasis by suppressing the EGFR/AKT and integrin/FAK/PLCγ pathways. 32 HOXB4, a hematopoietic transcription factor, downregulates the expression of Prdm16, a proto-oncogene necessary for the self-renewal and maintenance of murine hematopoietic stem cells. 33 The other eight frequently methylated candidates (CCDC181, DPP4, BEND4, CTNND2, ELMO1, SFMBT2, C1QL3, MIR129-2) confirmed in our study have also shown hypermethylation in other malignancies (Table 1). 34-42 Among them, SFMBT2 and MIR129-2 have been shown to act as tumor suppressors. SFMBT2 negatively regulates migration and invasion by targeting MMP-9 and MMP-26. 40 MIR129-2 suppresses migration and invasion by directly inhibiting HMGB1. 42 The ELMO1/Dock180-Rac pathway serves an important role in promoting invasion and metastasis in multiple cancers. 39 DPP4 and CTNND2 act both as tumor suppressors and as markers of tumor aggressiveness, depending on tumor type. 36,38 Knockdown of KCNA3 significantly suppressed cell proliferation and increased apoptosis, 43,44 EMBP1 was found to be associated with ER-positive breast cancer and lower grade breast tumors, 45 and ZNF93 may be involved in the DNA repair pathway after DNA damage by chemotherapy, 46 but the methylation levels of these three candidates were not noted in prior reports. Fairly little is known about six candidates (ATP5EP2, OR6S1, ZNF439, VSTM2B, ZNF137P, ZNF773) selected in our study, and this is the first report of epigenetic changes in these six candidate DMRs in solid tumors. Three prospective biomarkers (ZNF439, ZNF137P, ZNF773) belong to the zinc finger protein group, whose members have previously been shown to possess tumor suppressor activity. 18,47

Figure 3. TCGA confirmation of 20 biomarker candidates. Twenty biomarker candidates were confirmed using the TCGA methylation array dataset. Normal samples (n = 6, left, black) and HPV-positive OPSCC samples (n = 48, right, gray) were included. The methylation levels of these 20 candidates in HPV-positive OPSCC were significantly higher than those in normal samples.

Our study is the largest genome-wide DNA methylation study to date defining DMRs specific to HPV-positive OPSCC. Nakagawa et al. 29 ran an Infinium 450k BeadArray on a relatively small cohort, which included 13 OPSCC samples and 4 noncancerous samples, and identified frequently hypermethylated and silenced genes in OPSCC that are preferentially hypermethylated in HPV-positive tumors. The four most frequently methylated genes (RXRG, GHSR, CTNNA2, and ITGA4) were discovered and validated in Nakagawa's study, including one candidate (ITGA4) confirmed in our study. However, the sensitivity, specificity and AUC of ITGA4 were not analyzed in that study. In our study, a first published cohort consisting of 50 HPV-positive OPSCC and 25 normal controls was used for the discovery of potential DMRs using MBD-seq. Furthermore, the DMRs were additionally confirmed in TCGA (528 tumor samples) and in an independent cohort (24 tumor samples).
Our study has limitations. First, the clinical characteristics of tumor and normal patients could not be matched, because of the inherent differences between HNSCC and UPPP populations. 18,48,49 These clinical differences may contribute to methylation differences, but the UPPP population has helped reveal strong cancer-specific signatures of HNSCC in previous studies. 16,48,49 Moreover, the TCGA cohort and another independent validation cohort confirmed our candidate DMRs. Second, some proposed DMRs had limited sensitivity or specificity in the validation cohort. However, our results are comparable to those of the most prominent HNSCC methylated biomarkers, such as EDNRB (38% sensitivity and 78% specificity) and DCC (27% sensitivity and 88% specificity). 50 Despite the lower sensitivity of ITGA4, BEND4, ZNF439, ZNF93 and VSTM2B, the 100% specificity of these individual candidates suggests that they are promising candidates for OPSCC detection. Another candidate, OR6S1, had lower specificity, but showed higher sensitivity and AUC for detecting OPSCC. It is potentially attractive to combine these biomarkers in a panel, in which the high specificity of single biomarkers would increase the specificity of the entire panel without sacrificing sensitivity.

Figure. Predictive accuracy of 20 biomarker candidates in the validation cohort. Twenty markers were confirmed in a separate validation cohort including HPV-related OPSCC tissues and normal controls. Normal samples (n = 22, left, black) and HPV-positive OPSCC samples (n = 24, right, gray) were included. The methylation levels of 16 candidates in HPV-positive OPSCC were significantly higher than those in normal samples.
In conclusion, the present study examines HPV-positive OPSCC in a large sample using a technology that provides broad, relatively unbiased coverage of methylated regions across the genome. Integration of deliberate statistical methods and cross-cohort confirmation narrowed this discovery to a list of 20 DMRs with potential roles in HPV-positive OPSCC carcinogenesis. These DNA methylation biomarkers in HPV-related OPSCC could potentially be applied to develop a population-based screening test and improve disease management. Additional studies with large independent patient sets that incorporate treatment should be performed in the future to determine whether these DMRs have prognostic implications.