Oral microbial community composition is associated with pancreatic cancer: A case‐control study in Iran

Abstract
Background: Oral microbiota may be related to pancreatic cancer risk because periodontal disease, a condition linked to multiple specific microbes, has been associated with increased risk of pancreatic cancer. We evaluated the association between oral microbiota and pancreatic cancer in Iran.
Methods: A total of 273 pancreatic adenocarcinoma cases and 285 controls recruited from tertiary hospitals and a specialty clinic in Tehran, Iran provided saliva samples and filled out a questionnaire regarding demographics and lifestyle characteristics. DNA was extracted from saliva and the V4 region of the 16S rRNA gene was PCR amplified and sequenced on the MiSeq. The sequencing data were processed using the DADA2 plugin in QIIME 2 and taxonomy was assigned against the Human Oral Microbiome Database. Logistic regression and MiRKAT models were calculated with adjustment for potential confounders.
Results: No association was observed for alpha diversity with an average of 91.11 (standard deviation [SD] 2.59) sequence variants for cases and 89.42 (SD 2.58) for controls. However, there was evidence for an association between beta diversity and case status. The association between the Bray-Curtis dissimilarity and pancreatic cancer was particularly strong with a MiRKAT P-value of .000142 and specific principal coordinate vectors had strong associations with cancer risk. Several specific taxa were also associated with case status after adjustment for multiple comparisons.
Conclusion: The overall microbial community appeared to differ between pancreatic cancer cases and controls. Whether these reflect differences evident before development of pancreatic cancer will need to be evaluated in prospective studies.


| INTRODUCTION
Microbes living in and on the human body, including bacteria, viruses and archaea, have the potential to impact human health and disease. There is evidence that the microbiota is related to a number of conditions, such as inflammatory bowel disease, 1 diabetes, 2 and cancer. 3 It has been hypothesized that oral microbiota may play a role in the etiology of pancreatic cancer, particularly due to the associations detected between periodontal disease and pancreatic cancer risk. [4][5][6][7][8][9] Oral health and periodontal disease are associated with oral microbial diversity. Distinct oral microbiome communities by gingivitis status 10 and dental caries 11 have been observed. Clustering by periodontal disease status has also been detected 12 and there are multiple specific microbes strongly implicated in periodontal disease etiology, including Porphyromonas gingivalis and Aggregatibacter actinomycetemcomitans. 13 These data suggest that the relationships between oral health and periodontal disease with pancreatic cancer may be related to changes in the oral microbiota. A limited number of studies have assessed the relationship between the oral microbiota or antibodies to oral bacteria and pancreatic cancer. [14][15][16][17][18] However, these studies were conducted within populations in the United States and Europe and many had very small sample sizes.
Pancreatic cancer ranks as the 12th most common incident cancer globally, but due to its poor prognosis, it is the seventh most common cause of cancer death. 19 Mortality from pancreatic cancer has also been increasing over the past few years, including in Iran, 20 unlike trends for many other cancer sites. 21 Given the poor prognosis after a diagnosis of pancreatic cancer, identifying new risk factors is essential and changes in the oral microbiota offer a promising new avenue in the search for risk factors. Therefore, we evaluated the association between oral microbiota and pancreatic cancer in a case-control study in Iran.

| Study population
The recruitment of pancreatic cancer cases and controls for this study has been previously described in detail. 22 In brief, participants were recruited from patients referred for endoscopic ultrasonography related to suspicion of a mass or cyst in the pancreas or bile ducts, assessment of submucosal lesions found during esophago-gastro-duodenal endoscopy, or to rule out bile duct stones at one of three tertiary hospitals or a specialty clinic in Tehran, Iran from January 2011 to January 2015. After providing informed consent, participants responded to a questionnaire and provided saliva samples, which were immediately stored at −70°C. The questionnaire included information related to demographics, tobacco and opium use, and body mass index. Endoscopic ultrasonography was conducted and for those with mass or cystic lesions, fine needle aspirates were obtained. The pancreatic tissues were interpreted by an expert pathologist and those with pancreatic adenocarcinoma defined by histopathology were considered pancreatic cancer cases. Participants who had a normal pancreas at the endoscopic ultrasonography exam, were aged 40 years or older, had no history of liver or renal failure or cancer, did not consume a special diet, and did not develop pancreatic disease or any cancer within one year of the initial visit were considered controls. Final diagnoses for the controls were asymptomatic small (<10 mm) submucosal lesions in the esophagus or stomach, or gallbladder or common bile duct stones without cholangitis. A total of 357 pancreatic cancer cases and 328 controls were identified and saliva specimens were available from 287 cases and 300 controls. Of the 280 cases with staging data, there were 29 stage I, 160 stage II, 37 stage III, and 54 stage IV pancreatic tumors.

KEYWORDS
case-control study, microbiota, pancreatic cancer

| DNA extraction, amplification, and sequencing
Saliva samples were shipped on dry ice to the National Cancer Institute for processing. DNA extraction, PCR amplification, and sequencing were completed as described in detail previously. 23 In brief, DNA extraction batches were created by randomly selecting study participants within sex-stratified sets of cases and controls to have an adequate distribution of both cases and controls and both sexes within each batch. The laboratory was blinded to the case or control status of each sample. Within each DNA extraction batch, three quality control (QC) samples were also included: oral artificial community or chemostat community 24 ; blank; and extraction duplicate of a randomly selected sample. The saliva samples were thawed at 4°C and extracted using the DSP DNA Virus Pathogen kit on a QIAsymphony instrument (Qiagen). The V4 region of the 16S rRNA gene was PCR amplified for 25 cycles and 2 × 250 bp paired end sequencing was performed on the Illumina MiSeq.

| Bioinformatic data processing
Sequence data processing was performed with QIIME 2 2017.2. 25 Sequences were demultiplexed, and quality control and paired-end read joining were performed with DADA2. 26 The first ten bases were trimmed from forward and reverse reads; forward reads were truncated at 225 bases and reverse reads were truncated at 200 bases. Taxonomy was assigned to the resulting amplicon sequence variants (ASVs) using q2-feature-classifier 27 and the Human Oral Microbiome Database version 14.51. 28 ASVs not assigned at least to the phylum level were excluded. Taxonomic relative abundances from the phylum to genus level were generated. A phylogenetic tree was created by aligning ASVs with MAFFT, 29 filtering highly variable positions using q2-alignment, and applying FastTree 30 followed by midpoint rooting using q2-phylogeny. Diversity metrics, including Bray-Curtis, 31 weighted and unweighted UniFrac, 32 observed sequence variants, Shannon index, 33 and Faith's Phylogenetic Diversity (PD) 34 were computed using q2-diversity at 40 000 sequences per sample. Principal coordinates analysis (PCoA) was applied to the beta diversity distance matrices using q2-diversity.
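To make the diversity metrics named above concrete, the following is a minimal sketch of three of them (observed sequence variants, Shannon index, and Bray-Curtis dissimilarity) computed from per-sample ASV counts. It is not the q2-diversity implementation. The function names and the toy counts are hypothetical, the base-2 logarithm for the Shannon index is an assumption, and both samples are assumed to have already been rarefied to the same depth.

```python
import math

def observed_features(counts):
    """Number of ASVs with a non-zero count in one sample."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts, base=2.0):
    """Shannon entropy of ASV proportions (the log base is an assumption)."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p, base) for p in props)

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two samples' ASV count vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator

# Hypothetical counts for two samples, each rarefied to 100 reads.
sample_a = [50, 30, 20, 0]
sample_b = [25, 25, 25, 25]

print(observed_features(sample_a))
print(shannon_index(sample_b))
print(bray_curtis(sample_a, sample_b))
```

Alpha diversity metrics summarize a single sample, whereas Bray-Curtis is defined for a pair of samples, which is why beta diversity is stored as a sample-by-sample matrix.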
For quality control analysis, the taxonomic composition of the oral artificial community samples was compared to the known composition of the mock using q2-quality-control. The mean taxon accuracy rate (fraction of observed taxa that were expected 27 ) was 0.74 (±0.6 standard deviation [SD]) at the genus level, indicating a low false-positive error rate. The mean taxon detection rate (fraction of expected taxa that are observed 27 ) was 0.77 (±0.3 SD) at the genus level, indicating a low false-negative detection rate. In addition, both the oral artificial and chemostat communities displayed high levels of consistency across runs with mean Shannon estimates of 3.86 (±0.10 SD) and 4.09 (±0.10 SD), respectively. Similarly, oral artificial community and chemostat community within-group Bray-Curtis distances were 0.14 (±0.05 SD) and 0.11 (±0.04 SD), respectively, and significantly different from other sample types (PERMANOVA P < .05), indicating a low level of variation in beta diversity across sequencing runs.
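The two mock-community rates defined above can be illustrated with a short sketch. This follows the definitions given in the text (fraction of observed taxa that were expected, and fraction of expected taxa that were observed) rather than the q2-quality-control code itself; the function names and the genus lists are hypothetical and are not the composition of the study's artificial community.

```python
def taxon_accuracy_rate(observed, expected):
    """Fraction of observed taxa that were expected in the mock community."""
    observed, expected = set(observed), set(expected)
    return len(observed & expected) / len(observed)

def taxon_detection_rate(observed, expected):
    """Fraction of expected taxa that were actually observed."""
    observed, expected = set(observed), set(expected)
    return len(observed & expected) / len(expected)

# Hypothetical genus-level compositions, for illustration only.
expected_genera = {"Streptococcus", "Veillonella", "Prevotella", "Rothia"}
observed_genera = {"Streptococcus", "Veillonella", "Prevotella",
                   "Neisseria", "Haemophilus"}

print(taxon_accuracy_rate(observed_genera, expected_genera))
print(taxon_detection_rate(observed_genera, expected_genera))
```

A taxon accuracy rate below 1 reflects unexpected taxa (false positives), while a taxon detection rate below 1 reflects expected taxa that were missed (false negatives).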

| Statistical analysis
A total of 273 cases and 285 controls remained after excluding QC samples. Descriptive characteristics of the pancreatic cancer cases and controls were presented. The average alpha diversity estimates were calculated by case status and by demographic and lifestyle factors.
To evaluate associations with pancreatic cancer, we created logistic regression models to calculate odds ratios (OR), 95% confidence intervals (CI), and P-values. Alpha diversity was modeled using quartiles from the distribution within the controls with a test for trend using alpha diversity as a continuous variable. For beta diversity, we modeled the first six PCoA vectors, which accounted for 42%-65% of the overall variance in the matrices, in independent logistic regression models. The ORs were calculated based on normalized values for the PCoA vectors such that the ORs represent a one standard deviation (SD) increase in the vector. Restricting to taxa with a relative abundance in cases and controls of at least 1%, we created logistic regression models for the relative abundance of taxa. Restricting to taxa with an overall prevalence ranging from 5% to 95%, we created models for the presence of individual taxa. We set Bonferroni-adjusted P-value significance thresholds based on the number of statistical tests for beta diversity and the taxonomic analyses. For all statistical models, we generated three models: (a) unadjusted; (b) adjusted for age and sex; and (c) adjusted for age, sex, body mass index (BMI; continuous), and any tobacco and/or opium use.
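The arithmetic behind these modeling choices can be sketched as follows. The helpers below are hypothetical and are not the authors' analysis code: they show how a predictor is standardized so that a coefficient corresponds to a one-SD increase, how a logistic regression coefficient and its standard error convert to an OR with a Wald 95% CI, and how a Bonferroni threshold follows from the number of tests. The count of 18 tests (six PCoA vectors from each of three matrices) is an assumption used only for illustration; the threshold actually applied to beta diversity is not stated in this section.

```python
import math

def standardize(values):
    """Rescale to mean 0 and sample SD 1, so a model coefficient is per one SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return [(x - mean) / sd for x in values]

def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a log-odds coefficient and its SE to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# Hypothetical PCoA coordinates and a hypothetical fitted coefficient.
print(standardize([0.10, -0.20, 0.05, 0.30]))
print(odds_ratio_with_ci(beta=math.log(2), se=0.25))
print(bonferroni_threshold(0.05, 18))  # 18 tests is an assumed count
```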
For the association between the overall beta diversity matrices and pancreatic cancer, the MiRKAT test was used. 35 Additionally, supervised learning was performed with q2-sample-classifier 36 using random forest classification models 37 grown with 500 trees. Microbial ASV abundances, taxonomic abundances, age, sex, BMI, and tobacco and/or opium use were used as features for prediction of cancer cases. The model was trained on 80% of the samples and validated on the 20% hold-out set.
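The hold-out evaluation described above can be sketched without the classifier itself. The code below is an illustrative assumption rather than the q2-sample-classifier procedure: it splits sample indices 80/20 at random (whether the original split was stratified is not stated), and it computes an error rate together with a baseline error, here assumed to mean the error of always predicting the most frequent class. All function names are hypothetical.

```python
import random
from collections import Counter

def split_train_test(n_samples, test_fraction=0.2, seed=0):
    """Randomly split sample indices into training and hold-out sets."""
    ids = list(range(n_samples))
    random.Random(seed).shuffle(ids)
    n_test = round(n_samples * test_fraction)
    return ids[n_test:], ids[:n_test]

def baseline_error(labels):
    """Error from always predicting the most frequent class (assumed definition)."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return 1 - most_common_count / len(labels)

def error_rate(true_labels, predicted_labels):
    """Fraction of samples whose predicted label differs from the true label."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)

# Labels built from the study's group sizes; the split itself is illustrative.
labels = ["case"] * 273 + ["control"] * 285
train_ids, test_ids = split_train_test(len(labels))
holdout = [labels[i] for i in test_ids]
print(len(train_ids), len(test_ids), baseline_error(holdout))
```

Comparing a classifier's hold-out error with this baseline indicates how much the features improve on guessing the majority class.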

| RESULTS
Note (Table 1): Any cigarette smoking incorporates reporting ever smoking factory made cigarettes with or without a filter, or smoking hand-made cigarettes. Any alcohol consumption incorporates reporting ever consuming beer, imported alcoholic beverages, homemade alcoholic beverages, or spirits. Any opium use incorporates reporting ever smoking opium, using heroin, smoking burned opium, using opium juice, or using crystal.
As seen in Table 1 [...] of consistent patterns with pancreatic cancer case status for alpha diversity (Table 2). However, alpha diversity was significantly different by age, cigarette smoking, and opium use. When considering the overall microbial community composition, as captured by the full beta diversity matrices, a significant difference was detected between pancreatic cancer cases and controls. For example, in unadjusted MiRKAT models, the P-value for the association with case status was 5.15 × 10⁻⁷ for Bray-Curtis. After adjustment for age, sex, BMI (continuous), and use of tobacco or opium, the associations with beta diversity were attenuated, but remained statistically significant (Table 3). Random Forest classification could weakly predict pancreatic cancer based on the oral microbiota and other covariates, yielding a 32.1% error rate for prediction of cases compared to controls, a reduction from the baseline error rate of 49.1%. The top 25 most predictive features included several ASVs and taxa including Streptococcus parasanguinis, Granulicatella adiacens, and Rothia aeria.
To further understand the overall beta diversity association with pancreatic cancer, we investigated the first six PCoA vectors from the three beta diversity matrices. No visual clustering was observed by case status, but strong associations were observed for specific PCoA vectors in logistic regression models. For the Bray-Curtis matrix, two PCoA vectors were significantly associated with pancreatic cancer after adjustment for potential confounders. For unweighted UniFrac and weighted UniFrac, one PCoA vector each was associated with pancreatic cancer in the fully adjusted models, although the P-value for PCoA2 from weighted UniFrac was marginally higher than the Bonferroni-adjusted significance threshold. In general, associations were similar when restricted to individuals who reported never smoking cigarettes or using opium (Table 2).
From a single beta diversity matrix, the PCoA vectors are orthogonal and therefore uncorrelated with other PCoA vectors, but this is not the case for PCoA vectors from different beta diversity matrices. To evaluate whether the PCoA associations from each beta diversity matrix were measuring similar microbial community characteristics, we calculated Pearson correlations between the vectors. Bray-Curtis PCoA1 was positively correlated with both unweighted UniFrac PCoA3 (R = 0.39) and weighted UniFrac PCoA2 (R = 0.53), while Bray-Curtis PCoA4 was negatively correlated with both unweighted UniFrac PCoA3 (R = −0.23) and weighted UniFrac PCoA2 (R = −0.43). Unweighted UniFrac PCoA3 was also positively correlated with weighted UniFrac PCoA2 (R = 0.37). The significant PCoA vectors were also associated with some demographic and lifestyle factors. For example, Bray-Curtis PCoA1 was associated with sex, cigarette smoking, and opium use, but as noted above, these vectors remained significantly associated with pancreatic cancer case status in fully adjusted models.
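For readers who want the calculation spelled out, the following is a minimal sketch of the Pearson correlation applied to two PCoA vectors. The function name and the coordinate values are hypothetical; each list is assumed to hold one coordinate per participant, in the same participant order.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical coordinates for five participants on two PCoA axes
# derived from different beta diversity matrices.
bray_curtis_pcoa1 = [0.12, -0.05, 0.30, -0.22, 0.01]
weighted_unifrac_pcoa2 = [0.08, -0.01, 0.19, -0.15, -0.04]
print(pearson_r(bray_curtis_pcoa1, weighted_unifrac_pcoa2))
```

Because the sign of a PCoA axis is arbitrary, the magnitude of R is more informative than its direction when comparing axes across matrices.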
When considering individual taxa, for relative abundance, only one genus had a marginally significant association with the odds of pancreatic cancer, Haemophilus. The OR for an increase of 1% in the relative abundance of Haemophilus was 0.95 (95% CI: 0.92, 0.98), although the P-value was slightly greater than the Bonferroni-adjusted significance threshold of 0.0038 (P = .0043). However, we saw consistently strong inverse associations at higher taxonomic levels for Haemophilus (ie, Proteobacteria, Gammaproteobacteria, Pasteurellales, and Pasteurellaceae, at the phylum, class, order, and family levels, respectively) demonstrating the overall lower odds of pancreatic cancer at all taxonomic levels. For the presence/absence of taxa, for individuals with presence of the order Enterobacteriales, there was 2.80 times greater odds of pancreatic cancer (95% CI: 1.69, 4.78). This strong association was seen at the family level (Enterobacteriaceae) and at the genus level (unknown genus). Similarly, the presence of the genus Lachnospiraceae G7 was associated with 2.40 times greater odds of pancreatic cancer (95% CI: 1.52, 3.84). Although not statistically significant using a Bonferroni-adjusted P-value significance threshold of .0011, the presence of the family Bacteroidaceae (OR 1.90; 95% CI: 1.29, 2.84) or Staphylococcaceae (OR 1.81; 95% CI: 1.25, 2.62) were both associated with increased odds of pancreatic cancer. When we restricted to individuals who reported never smoking cigarettes or using opium, the association estimates for the statistically significant taxa from both relative abundance and the presence/absence were generally similar (results not shown).
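Two pieces of arithmetic help in reading these estimates, sketched below with hypothetical helper names. First, under the log-linear form of a logistic model, an OR per 1% increase in relative abundance compounds multiplicatively over larger increases. Second, an unadjusted OR for presence versus absence of a taxon can be computed from a 2 × 2 table with a Woolf (log-based) 95% CI. The table counts are invented for illustration and are not the study's data, and the reported ORs come from adjusted logistic regression models, so they would not be reproduced by this unadjusted calculation.

```python
import math

def scale_odds_ratio(or_per_unit, units):
    """OR for a change of `units`, given the OR for a one-unit change."""
    return or_per_unit ** units

def odds_ratio_2x2(exposed_cases, exposed_controls,
                   unexposed_cases, unexposed_controls, z=1.96):
    """Unadjusted OR with a Woolf 95% CI from 2 x 2 table counts."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases)
    se_log_or = math.sqrt(1 / exposed_cases + 1 / exposed_controls
                          + 1 / unexposed_cases + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# OR of 0.95 per 1% increase, extended to a 10% increase.
print(scale_odds_ratio(0.95, 10))

# Hypothetical counts: taxon present in 40 cases and 20 controls,
# absent in 60 cases and 80 controls.
print(odds_ratio_2x2(40, 20, 60, 80))
```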

| DISCUSSION
In this study of 273 pancreatic cancer cases and 285 controls from Iran, we found evidence of differing microbial communities by case status using multiple methods to test beta diversity. No associations with pancreatic cancer were detected for alpha diversity. There were also indications for associations between specific taxa and pancreatic cancer. Increasing relative levels of Haemophilus were associated with decreased odds of pancreatic cancer, while the presence of Enterobacteriaceae, Lachnospiraceae G7, Bacteroidaceae, or Staphylococcaceae were associated with increased odds of pancreatic cancer.
In previous studies of the oral microbiota and pancreatic cancer, no associations between alpha diversity and pancreatic cancer have been detected 15,18 which was consistent with our findings. This is notable given the possibility of reverse causation in a cross-sectional study of pancreatic cancer where subjects may be feeling unwell at the time of enrolment. Lower alpha diversity in fecal samples has been found to be related to inflammatory bowel disease, 1  and Westernization, 39 which suggests that increased fecal microbial diversity is a marker of good overall health. Oral microbial diversity may be less indicative of general overall health and for pancreatic cancer, not a marker of disease. We found associations between pancreatic cancer and the microbial community composition (ie, beta diversity), but previous studies have not observed this association. 15,16,18 Analyses with beta diversity are complicated due to the pairwise nature of beta diversity matrices and PCoA vectors are dependent on the distance matrix from the study population and not directly comparable to other studies. Methods are being developed to facilitate between study comparisons by using a standard reference set from which to calculate dissimilarities, 40 but no ideal standard reference set currently exists. Our beta diversity associations may differ from previous studies because they focused on Western populations. A recent study observed that within one province in China, the factor that explained the largest percent of variability in beta diversity was the district in which the individual lived. In addition, microbial models to predict specific diseases were location specific. 41 The beta diversity associations we observed may be specific to the Iranian population or associations may be detected using a standard reference set.
Previous studies have also investigated associations between specific bacteria or antibodies to bacteria with pancreatic cancer. We found that an increased relative abundance of Haemophilus and its higher taxonomic levels (ie, Proteobacteria, Gammaproteobacteria, Pasteurellales, and Pasteurellaceae) were associated with decreased odds of pancreatic cancer. Two previous studies also observed lower relative abundances of Proteobacteria in pancreatic cancer cases compared to controls 15,18 with one finding an association with Haemophilus. 18 The presence of Enterobacteriaceae, Lachnospiraceae G7, Bacteroidaceae, or Staphylococcaceae were associated with increased odds of pancreatic cancer, although Bacteroidaceae and Staphylococcaceae had slightly greater P-values than the Bonferroni-adjusted threshold. One study found that individuals who had pancreatic tissue sampled after foregut surgery had a higher prevalence of Kluyvera (genus within the Enterobacteriaceae family) compared to pancreatic tissue from deceased controls; however, when comparing pancreatic cancer tissue to deceased control pancreatic tissue, Salmonella, Enterobacter, and Raoultella (genera within the Enterobacteriaceae family) had a lower prevalence. Similar to our results, this study also found a higher prevalence of Bacteroides (genus within the Bacteroidaceae family) in pancreatic cancer tissue compared to deceased control pancreatic tissue. 42 A number of additional taxa have been detected to be associated with pancreatic cancer in previous studies, often with a special focus on periodontal pathogens. 4-6 P gingivalis, detected by antibody and in oral wash specimens, was associated with increased risk of pancreatic cancer in two prospective studies. 16,17 However, a small case-control study found lower levels of Porphyromonas in oral samples from pancreatic cancer cases compared to controls. 15 In our study, Porphyromonas was not associated with pancreatic cancer and was detected in 76.92% of cases and 76.49% of controls. The presence of another periodontal pathogen, Aggregatibacter actinomycetemcomitans, was associated with an increased risk of pancreatic cancer, whereas other periodontal pathogens (Tannerella forsythia and Prevotella intermedia) were not. 16 The periodontal pathogen association differences could again be related to the distinct population under study since there are differences in oral health in Iran compared to Western populations. One estimate of tooth loss from the Golestan Cohort study, a cohort of adults in northeastern Iran, found extensive tooth loss with men and women having lost an average of 15.9 and 18.7 teeth, respectively. 43 In comparison, estimates from NHANES in the United States indicated that adults aged 50 to 64 had on average lost only about 5.7 teeth. 44 It is also possible that some associations may differ due to unique oral sampling methods since previous studies have observed clustering by oral site or collection method. [45][46][47]
This study has some limitations. First, saliva samples were collected from pancreatic cancer cases at the time of diagnosis, so we cannot distinguish whether any microbial associations were related to pancreatic cancer etiology or presence of disease. Second, controls were identified among patients who were also referred for endoscopic ultrasonography, so they do not represent healthy individuals and may have experienced microbial changes from underlying conditions. Third, although we had a fairly large sample size compared to the previous studies, this study was still underpowered for the taxa-specific analyses. Finally, information regarding oral health or tooth loss in this population was not obtained so we are unable to address potential confounding by these factors. However, to the best of our knowledge, this is the first study of the oral microbiota and pancreatic cancer conducted outside of the United States or Europe.
In conclusion, the oral microbial communities detected in pancreatic cancer cases differed from those in controls. The presence or relative abundance of some specific microbial taxa was also associated with pancreatic cancer, including Haemophilus, Enterobacteriaceae, Lachnospiraceae G7, Bacteroidaceae, and Staphylococcaceae. The microbial community and taxa-level differences could be related to the presence of pancreatic cancer or the risk of developing pancreatic cancer. Therefore, large, prospective studies of diverse populations are needed to evaluate these associations, both to determine the microbiota related to pancreatic cancer etiology and to identify microbiota that may help with early detection.