A review of lifestyle, metabolic risk factors, and blood‐based biomarkers for early diagnosis of pancreatic ductal adenocarcinoma

Abstract We aimed to review the epidemiologic literature examining lifestyle and metabolic risk factors, and blood‐based biomarkers including multi‐omics (genomics, proteomics, and metabolomics) and to discuss how these predictive markers can inform early diagnosis of pancreatic ductal adenocarcinoma (PDAC). A search of the PubMed database was conducted in June 2018 to review epidemiologic studies of (i) lifestyle and metabolic risk factors for PDAC, genome‐wide association studies, and risk prediction models incorporating these factors and (ii) blood‐based biomarkers for PDAC (conventional diagnostic markers, metabolomics, and proteomics). Prospective cohort studies have reported at least 20 possible risk factors for PDAC, including smoking, heavy alcohol drinking, adiposity, diabetes, and pancreatitis, but the relative risks and population attributable fractions of individual risk factors are small (mostly < 10%). High‐throughput technologies have continued to yield promising genetic, metabolic, and protein biomarkers in addition to conventional biomarkers such as carbohydrate antigen 19‐9. Nonetheless, most studies have utilized a hospital‐based case–control design, and the diagnostic accuracy is low in studies that collected pre‐diagnostic samples. Risk prediction models incorporating lifestyle and metabolic factors as well as other clinical parameters have shown good discrimination and calibration. Combination of traditional risk factors, genomics, and blood‐based biomarkers can help identify high‐risk populations and inform clinical decisions. Multi‐omics investigations can provide valuable insights into disease etiology, but prospective cohort studies that collect pre‐diagnostic samples and validation in independent studies are warranted.


Introduction
Pancreatic ductal adenocarcinoma (PDAC) has the highest case fatality rate of all cancers. [1][2][3] It has a median survival of 4 to 6 months and a 5-year survival of less than 5%. 1,2 More than 80% of PDAC patients are diagnosed at a late stage (stages III and IV), and only 20-25% of patients have localized, surgically resectable tumors. 4 This is because of the nonspecific and late-presenting signs and symptoms of PDAC (e.g. nausea and vomiting, bloating, abdominal pain, weight loss, jaundice, and new-onset diabetes) and the inaccessible location of the pancreas. 1,2 Despite its dismal prognosis, survival of PDAC is higher when it is diagnosed at an early stage. Compared with the overall 5-year survival of less than 5%, Cancer Research UK data showed a 5-year survival of 7-25% for resectable PDAC. 5 Similarly, Surveillance, Epidemiology, and End Results data in the USA (2006-2012) showed a 5-year survival of ~30% for localized PDAC, 11% for tumors with regional lymph node spread, and ~3% for distant metastasis. 6 Furthermore, recent genome sequencing data suggested that it takes at least 10 years between the initiating mutation and the birth of the parental founder cell and an additional 5 years between tumor initiation and the acquisition of metastatic ability. 7 These data demonstrate a potential window of opportunity for early detection of PDAC if diagnostic biomarkers were available.
Prospective studies so far have reported over 20 potential risk factors for PDAC, primarily lifestyle and metabolic risk factors, while case-control studies have suggested the clinical utility of genomics, proteomics, and metabolomics assays in early diagnosis of PDAC. An integrated approach of traditional risk factors and biomarkers may improve our understanding of risk prediction and early diagnosis of PDAC. This review gives a timely overview of these areas and may be particularly helpful for informing large-scale population-based studies, given the readily available resources. In this article, we will review (i) epidemiologic studies of lifestyle and metabolic risk factors for PDAC (mainly smoking, alcohol, diet, adiposity, diabetes, and physical activity), genome-wide association studies (GWAS), and risk prediction models incorporating these factors and (ii) epidemiologic studies of blood-based biomarkers for PDAC (conventional diagnostic markers, metabolomics, and proteomics). As summarized in Figure 1, these factors have the potential to be predictive of PDAC risk and can provide valuable insights into risk prediction, early diagnosis, and treatment of PDAC. Familial PDAC with identified genetic susceptibility will not be discussed in detail. Other potential biomarkers including cell-free noncoding RNA, circulating tumor DNA (ctDNA), circulating tumor cells, exosomes, and the microbiome have been reviewed elsewhere and will not be discussed in detail (Table S1). [8][9][10]

Lifestyle and metabolic risk factors
Pancreatic ductal adenocarcinoma has a multifactorial etiology, and several risk factors have been reported. 11 Table 1 shows the study information and pooled estimates from published meta-analyses of prospective studies. Despite the large number suggested, the majority of risk factors have a relative risk (RR) of less than 2, and if causal associations are assumed, the population attributable fraction (PAF) of individual risk factors is small. 11 For example, the PAF for smoking ranges from 11% to 32%, while PAFs for other risk factors are less than 10%. 11 Future studies assessing the combined effects of risk factors may inform risk prediction of PDAC and help identify at-risk populations. Furthermore, an appreciable proportion of PDAC cases cannot be explained by established risk factors, and therefore, other risk factors need to be investigated, such as infections, medications, and immunity.
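To illustrate why a modest RR translates into a small PAF, the following is a minimal sketch using Levin's formula, PAF = p(RR - 1) / [1 + p(RR - 1)], where p is the exposure prevalence. The function name and the 20% smoking prevalence are assumptions chosen for illustration only; the RR of 1.7 corresponds to the 70% excess risk for current smokers cited in this review. The cited PAF estimates may have been derived with other methods.

```python
def levin_paf(prevalence: float, rr: float) -> float:
    """Population attributable fraction by Levin's formula.

    prevalence: proportion of the population exposed (0 to 1).
    rr: relative risk in exposed versus unexposed individuals.
    Assumes the association is causal and unconfounded.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if rr <= 0.0:
        raise ValueError("relative risk must be positive")
    excess = prevalence * (rr - 1.0)
    return excess / (1.0 + excess)


# Hypothetical prevalence of current smoking (20%); RR 1.7 for current smokers.
smoking_paf = levin_paf(prevalence=0.20, rr=1.7)
print(f"Illustrative PAF for smoking: {smoking_paf:.1%}")

# A less common exposure with a weaker RR yields a much smaller PAF.
print(f"Illustrative PAF, p=5%, RR=1.5: {levin_paf(0.05, 1.5):.1%}")
```

Under these assumed inputs the smoking PAF falls inside the 11-32% range quoted above, whereas a rarer exposure with a similar RR contributes only a few percent.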
Lifestyle risk factors. Lifestyle risk factors including smoking, alcohol drinking, and diet have been investigated in relation to risk of PDAC. Among these lifestyle factors, smoking is the best established. A meta-analysis of 35 prospective cohort studies with 14 236 PDAC cases reported a 70% and 20% excess risk among current and former smokers, respectively. 12 Among current smokers, there were also moderate dose-response relationships with amount and duration smoked, with each 20 cigarettes per day and each 10-year smoking duration associated with 60% and 16% higher risk, respectively. 12 Heavy alcohol drinking is also associated with higher risk of PDAC, while the effects of light-to-moderate drinking remain unclear. Previous prospective studies have shown that heavy alcohol drinking (i.e. ≥ 3 drinks or 36 g alcohol per day) is associated with a 30% higher risk of PDAC, whereas light-to-moderate drinking is not associated with risk. 13 Although the role of diet in relation to PDAC risk has been inconclusive, prospective studies have suggested that low consumption of red meat and processed meat and high consumption of fresh fruits are associated with lower risk. A meta-analysis of eight prospective cohort studies involving 2761 PDAC cases reported an RR of 1.19 (0.98-1.45) comparing 100 versus 20 g/day of red meat intake, 14 while another meta-analysis of seven prospective cohort studies involving 2748 PDAC cases reported an RR of 1.17 (1.01-1.34) comparing 50 versus 20 g/day of processed meat intake. 14 A meta-analysis of five prospective cohort studies involving 1532 PDAC cases reported a null association between fruit intake and PDAC risk (RR 1.00, 0.95-1.05, per 100 g/day). 14

Metabolic risk factors. In addition to lifestyle factors, metabolic risk factors that are related to the insulin resistance syndrome may play a role in the etiology of PDAC.
Physical activity is associated with improved insulin sensitivity, lower blood glucose, and lower risk of developing type 2 diabetes. 14 However, previous prospective studies have been inconclusive as to whether physical activity is associated with risk of PDAC. In the meta-analysis conducted by the World Cancer Research Fund (WCRF), each 20 metabolic equivalent of task-hours per day (MET-h/day) higher total physical activity was associated with a nonsignificant ~20% lower risk of PDAC (RR per 20 MET-h/day 0.81 [0.64-1.02]), while leisure-time physical activity was not related to PDAC (RR per 10 MET-h/day 0.99 [0.96-1.03]). 14 However, this meta-analysis included a limited number of PDAC cases, with 687 cases for total and 1315 cases for leisure-time physical activity. Similar to the WCRF systematic literature review, a recent meta-analysis of prospective studies showed that neither total physical activity nor leisure-time physical activity was associated with risk of PDAC. 15

Figure 1 Steps towards early diagnosis of pancreatic ductal adenocarcinoma. Risk prediction models that are currently available include sociodemographics, lifestyle risk factors, medical history, and, for some, genetic variants. Ideally, biomarkers can be incorporated into these models. The current recommendation is selective screening of individuals at increased risk for PDAC based on their family history or identifiable genetic predisposition. 73 The current screening modalities include endoscopic ultrasonography and/or magnetic resonance imaging/magnetic resonance cholangiopancreatography but not biomarkers. 73 Lifestyle risk factors including smoking, alcohol, and diet are behavioral factors that are potentially modifiable. Metabolic risk factors, especially those related to the insulin resistance syndrome, are important risk factors for PDAC. These include adiposity, diabetes, hyperglycemia, physical activity, and metabolic syndrome. Other possible risk factors for PDAC are reviewed elsewhere. 11 CA 19-9 is the only conventional biomarker that has been demonstrated to be clinically useful, despite its relatively low sensitivity and specificity. Genomic investigations of PDAC have identified genetic syndromes or mutations in familial PDAC and genetic polymorphisms in sporadic PDAC. Proteomics is the comprehensive characterization of the identity, characteristics, and interactions of the proteins found in individual cellular systems. 40 Metabolomics is the comprehensive characterization of small low-molecular-weight metabolites in biological samples. 41 Both metabolomics and proteomics can provide coverage of metabolites and proteins in much greater quantities than traditional laboratory approaches.

Adiposity is an established risk factor for PDAC, and the WCRF has judged the evidence that body fatness is a cause of PDAC to be convincing. 14 General adiposity, as assessed by body mass index (BMI) measured or self-reported in middle-to-old ages, is positively associated with risk of PDAC. A meta-analysis involving 23 prospective cohort studies and 9504 PDAC cases reported a 10% higher risk associated with 5 kg/m² higher adulthood BMI (RR 1.10, 1.07-1.14). 16 Despite the small number of prospective studies, central adiposity (waist circumference or waist-to-hip ratio) is also positively associated with risk of PDAC. The same meta-analysis reported an 11% higher risk associated with 10-cm higher waist circumference (RR 1.11, 1.05-1.18, 5 studies, 949 PDAC cases) and a 19% higher risk associated with 0.1-unit higher waist-to-hip ratio (RR 1.19, 1.09-1.31, 4 studies, 1047 PDAC cases). Young adulthood adiposity, as assessed by self-reported BMI at age 18-25 years, also shows a positive association with risk of PDAC. 17 A recent meta-analysis involving five prospective cohort studies and 4602 PDAC cases reported an 18% higher risk associated with 5 kg/m² higher young adulthood BMI (RR 1.18, 1.12-1.24). 17 PDAC has a long subclinical period in which unintentional weight loss might occur, and therefore, the association between adiposity and PDAC risk may be affected by reverse causation. Nonetheless, recent evidence from a Mendelian randomization study has shown that genetically higher BMI is associated with increased risk of PDAC (odds ratio 1.66 [1.05-2.63] per 4.6 kg/m² higher BMI), 20 suggesting a causal role of BMI in PDAC etiology.
Diabetes is associated with a 1.5-fold to 2.5-fold higher risk of PDAC. 11 A recent meta-analysis involving 34 prospective studies and 35 761 PDAC cases showed that participants with diabetes have a twofold higher risk (RR = 1.98 [1.92-2.03]), and the pooled RR was 1.52 (1.43-1.63) when restricting to 22 prospective cohort studies. 18 The association of diabetes with PDAC is independent of obesity. A meta-analysis of nine prospective studies reported a RR of 1.46 (1.36-1.56) when further adjusting for BMI. 18 Among participants without diabetes, there is a positive association between blood glucose and risk of PDAC, with RRs of 1.11 (1.02-1.20), 1.15 (1.09-1.21), and 1.13 (1.08-1.19) per 1 mmol/L higher fasting blood glucose, random blood glucose, and post-load blood glucose, respectively. 18 The association of diabetes may also be confounded by reverse causation (i.e. diabetes may be a consequence rather than a cause of PDAC). Preclinical PDAC can induce diabetes due to beta-cell dysfunction and insulin resistance. 21 It has been estimated that approximately 40-50% of newly diagnosed PDAC patients have diabetes at diagnosis. [21][22][23] Although hyperglycemia and hyperinsulinemia have been proposed as the underlying mechanisms linking diabetes and PDAC risk, a recent Mendelian randomization study reported no evidence of a causal relationship between type 2 diabetes and PDAC risk, 20 suggesting that the positive associations in observational studies may be partly explained by reverse causality. However, it should be noted that Mendelian randomization studies rely on important assumptions. 24 On the other hand, this study suggested that genetically increased plasma insulin (i.e. higher levels of plasma insulin predicted by genetic variants) was causally associated with PDAC risk, in line with previous observational evidence showing a positive association between plasma insulin and risk of PDAC. 
Indeed, prospective studies have suggested positive associations of plasma insulin, insulin-like growth factors (IGFs), and IGF-binding proteins (IGFBPs) with risk of PDAC, although the evidence has been inconclusive. [25][26][27][28][29][30][31] Pancreatitis, an inflammatory disease of the pancreas, is also associated with risk of PDAC. 2 Although alcohol, gallstones, and autoimmune diseases are the main causes of pancreatitis, metabolic risk factors including adiposity and diabetes are important risk factors for pancreatitis. 2 Previous prospective studies have identified metabolic risk factors for pancreatitis including adiposity, hyperglycemia, and diabetes that are also risk factors for PDAC. 2 Previous prospective and case-control studies have shown a higher risk of PDAC associated with a diagnosis of pancreatitis, with reported RR or odds ratio ranging from 2.7 to 13. 11 Although the strong positive association may be partly due to reverse causation (i.e. pancreatic tumor-related ductal obstruction) and misdiagnosis of PDAC as pancreatitis, 19 previous studies showed higher risk of PDAC when excluding pancreatitis cases diagnosed within 2 years of PDAC diagnosis. 19 Furthermore, a few small case-control studies have shown overexpression of biomarkers in both pancreatitis and PDAC (e.g. ephrin receptor A3 and fibrillin 1), 3 while recent case-control studies that compared metabolomics and proteomics profiles of pancreatitis and PDAC can inform the shared etiology and differential diagnosis between the two diseases (see the sections on Metabolomics and Proteomics). More importantly, pancreatitis is predictive of subsequent PDAC diagnosis (see the section on Risk Prediction).
Apart from the lifestyle and metabolic risk factors mentioned earlier, other risk factors for PDAC have been reported, including hepatitis B virus (RR 1.2-1.4) and non-O blood group (RR 1.3-1.4). Although the evidence has been inconclusive, history of allergy is associated with lower risk of PDAC, while regular use of aspirin and nonsteroidal anti-inflammatory drugs is not associated with risk of PDAC. 11 In addition, Helicobacter pylori infection and periodontitis are associated with higher risks of PDAC, 32 possibly because of the increased inflammatory response and the interaction between the human microbiome and the immune system. 10 However, the reported associations of H. pylori with PDAC risk have not been consistent. 11 Although inflammation has been implicated in the etiology of PDAC, some prospective studies have shown null associations of inflammation markers with risk of PDAC, including interleukin (IL)-6, C-reactive protein, and tumor necrosis factor-α. [33][34][35]

Genomics
While Mendelian randomization studies utilize genetic instruments for putative risk factors to assess the causality of metabolic risk factors in relation to the disease, large-scale, trans-ethnic GWAS have continued to identify common genetic variants that influence disease risk. For sporadic PDAC, previous GWAS have identified at least 22 common variants in primarily European populations (Table S2). Recent GWAS have identified five new susceptibility loci among participants of Chinese descent and have suggested three loci among participants of Japanese descent. 36,37 Despite the larger number of variants/loci identified compared with familial PDAC, the RRs are relatively small, mostly ranging from 0.7 to 1.3. Although GWAS findings advance our understanding of the development of PDAC, future studies are warranted to investigate the biological mechanisms of these common susceptibility alleles and to incorporate the genetic information into risk prediction models. Furthermore, future genetic risk models need to incorporate genetic variants of a wide range of allele frequencies, including rare, low-frequency, and common variants. 38 On the other hand, the genetic basis for familial PDAC remains poorly understood, although several genetic factors for PDAC have been identified. A positive family history of PDAC has been reported to be associated with an 80-200% higher risk. 11 Previous reviews have summarized inherited disorders that carry an increased risk of PDAC, the genes involved, and the corresponding RRs. 1,2,8 These involved germline mutations in BRCA1, BRCA2, CDKN2A, STK11, PRSS1, SPINK1, PALB2, ATM, and CFTR, and the associated RRs ranged from 2.2 to over 100. However, these genetic alterations are rare in the general population and only account for approximately 10% of all PDAC cases. 2

Blood-based biomarkers
Blood sampling through venesection is a noninvasive and cost-effective approach that can provide high-throughput diagnostic information. 2 Investigating blood-based biomarkers can provide insights into the biological mechanisms. Conventional assays including traditional tumor biomarkers (e.g. carbohydrate antigen 19-9 [CA 19-9], carcinoembryonic antigen [CEA], and carbohydrate antigen 125 [CA-125]) are readily available in clinical settings. In addition to traditional tumor biomarkers, proteomics and metabolomics (including lipidomics) have recently become more feasible, allowing the identification of promising clinical biomarkers. [39][40][41][42] An exhaustive examination of potential biomarkers for PDAC is beyond the scope of this review. A compendium of 441 secreted proteins overexpressed in pancreatic cancer has been reported elsewhere. 3

Conventional biomarkers. Carbohydrate antigen 19-9 (CA 19-9) is a Lewis antigen of the mucin 1 protein class and the basis of a well-established blood test for the early detection of PDAC. 2 Although it is the most extensively evaluated marker for PDAC, CA 19-9 has poor specificity for PDAC, and the use of CA 19-9 alone for PDAC screening has been discouraged. 43 Although CA 19-9 has been reported to discriminate symptomatic individuals from healthy controls (sensitivity 80%, 95% confidence interval [CI] 78-83%; specificity 80%, 95% CI 78-82%) 2 and from patients with benign pancreatic disease (sensitivity 78%, 95% CI 72-80%; specificity 83%), 44 it has been shown to be ineffective in the mass screening of asymptomatic subjects. 45 Another limitation is that CA 19-9 is elevated in patients with nonmalignant diseases, including liver cirrhosis, chronic pancreatitis (CP), and cholangitis, and in patients with other cancers of the gastrointestinal system. 46 Moreover, CA 19-9 is not expressed in Lewis blood-type-negative patients (approximately 5-10% of the population). 46 Previous studies on CA 19-9 collected blood samples after a diagnosis was made, whereas a recent study showed good diagnostic performance using pre-diagnostic blood samples. This nested case-control study within the UK Collaborative Trial of Ovarian Cancer Screening, with 154 cases, analyzed samples taken up to 6 years before clinical presentation of PDAC. 47 This study showed that at 95% specificity, CA 19-9 (> 37 U/mL) had a sensitivity of 68% up to 1 year and 53% up to 2 years before diagnosis. This suggests that more than half of PDAC cases can be detected 1-2 years before clinical presentation. In addition, the investigators showed that the combination of CA 19-9 and CA-125 improved sensitivity because CA-125 was elevated in 20% of CA 19-9-negative cases.
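The gain from adding a second marker can be made concrete with a short calculation. The sketch below assumes an "either marker positive" rule and assumes that the 20% figure applies to the same cases and time window as the 68% sensitivity of CA 19-9; neither assumption is stated in the source, and the resulting number is illustrative rather than a reported result. The function name is hypothetical. An either-positive rule would also be expected to lower specificity, which this sketch does not model.

```python
def combined_sensitivity(primary_sensitivity: float, rescue_fraction: float) -> float:
    """Sensitivity of an 'either marker positive' rule.

    primary_sensitivity: fraction of cases detected by the first marker.
    rescue_fraction: fraction of first-marker-negative cases in which
        the second marker is elevated.
    """
    missed = 1.0 - primary_sensitivity
    return primary_sensitivity + missed * rescue_fraction


# CA 19-9 sensitivity up to 1 year before diagnosis (68%), with CA-125
# elevated in 20% of CA 19-9-negative cases (assumed to be the same window).
illustrative = combined_sensitivity(0.68, 0.20)
print(f"Illustrative combined sensitivity: {illustrative:.1%}")
```

Under these assumptions the second marker recovers about 6 percentage points of sensitivity, which shows why a marker that is elevated in only a minority of missed cases still adds value.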
Apart from CA 19-9, alternative biomarkers have been investigated in early detection of PDAC, including tumor markers (e.g. CA-125, CA-242, α-fetoprotein, and CEA), cytokines/chemokines (IL-2, IL-10, IL-13, and tumor necrosis factor-α), cell adhesion molecules (e.g. intercellular adhesion molecule 1), proteases/inhibitors in extracellular matrix degradation (e.g. matrix metalloproteinase and tissue inhibitor of metalloproteinase), acute-phase reactants (e.g. C-reactive protein and serum amyloid A), and other biomarkers (e.g. osteoprotegerin and IGF-binding proteins [IGFBP2 and IGFBP3]). 46,48,49 However, when combined with CA 19-9, the majority of these biomarkers have not been shown to improve the diagnostic accuracy compared with CA 19-9 alone. 46 Although additional studies are needed to validate the use of these biomarkers, a few studies have reported that macrophage colony-stimulating factor 1, haptoglobin, tumor-specific growth factor, heat shock protein 27, clivatuzumab, mucin 1, CEA-related cell adhesion molecule 1, mucin 5AC, and miR-1290 had higher sensitivity and specificity for diagnosis of PDAC than CA 19-9 alone. 3,9,50

Proteomics. In recent years, advances have been made in proteomics assays (mainly antibody microarrays) to capture the systemic immune response to cancer. 42 As a consequence, multiplexed proteomics panels have been investigated with the aim of increasing sensitivity and specificity. Such a multiplexed serum biomarker signature has the potential to improve the diagnostic accuracy for PDAC and to distinguish PDAC from benign conditions in case-control studies (number of PDAC cases 13-401, median 80; Tables 2 and S3). 42 However, there are several limitations for investigations of both proteomics and metabolomics: (i) the majority of case-control studies recruited PDAC cases of early and advanced stages and therefore could not distinguish between markers only present in advanced disease and biomarkers useful for early diagnosis, (ii) some of the results have not been validated in independent samples, and (iii) the diagnostic accuracy of proteomic assays was low in studies that collected pre-diagnostic samples (number of PDAC cases, 87-174; Table 3), demonstrating the challenge for early diagnosis.
At least 15 case-control studies have assessed protein panels mostly consisting of two to five biomarkers (Tables 2 and S3). These case-control studies reported an area under the receiver operating characteristic curve (AUC) of 0.88-0.99 in distinguishing PDAC cases from healthy controls and of 0.82-0.90 in distinguishing PDAC cases from benign pancreatic conditions (e.g. acute pancreatitis or CP and benign pancreatic cyst). Previous studies demonstrated the benefit of a larger panel of protein biomarkers in both Caucasians and Asians, using a case-control study design (Table S3). 49,[57][58][59] Wingren et al. presented the first multiplex serum biomarker signature in a case-control study of 34 PDAC patients and 30 healthy controls, as well as 16 CP and 23 autoimmune pancreatitis patients. 49 Based on a 25-serum biomarker signature, an AUC of 0.95 was achieved in distinguishing PDAC from healthy controls. Of note, PDAC could be discriminated from inflammatory diseases of the pancreas (AUC: CP 0.86 and autoimmune pancreatitis 0.99). In the validation study, PDAC could also be distinguished from healthy controls and inflammatory diseases of the pancreas, achieving an AUC of 0.88. In a subsequent study, the same study group extended the platform with novel antibodies predominantly targeting cancer-associated antigens and demonstrated robust serum signatures that could be identified in a multicenter trial. 58 This multicenter trial involved 338 case and control serum samples (156 PDAC, 152 other pancreatic diseases, and 30 controls with nonpancreatic conditions) from five hospitals in Spain. Based on 293-plex recombinant antibody microarrays, PDAC cases could be distinguished from healthy participants with an AUC of 0.98, using a multiplexed biomarker signature of up to 10 serum markers.
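Because the AUC is the headline metric in these studies, a brief sketch of what it measures may help. The AUC equals the probability that a randomly chosen case has a higher marker score than a randomly chosen control, with ties counted as one half (the Mann-Whitney interpretation). The function name and the scores below are invented for illustration and are not data from any cited study.

```python
from typing import Sequence


def auc_from_scores(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """AUC as the fraction of case-control pairs ranked correctly.

    Each pair in which the case scores higher counts as 1, each tie
    counts as 0.5, and the total is divided by the number of pairs.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    correctly_ranked = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                correctly_ranked += 1.0
            elif case == control:
                correctly_ranked += 0.5
    return correctly_ranked / (len(case_scores) * len(control_scores))


# Hypothetical biomarker signature scores.
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
print(f"AUC = {auc_from_scores(cases, controls):.2f}")
```

In this toy example eight of the nine case-control pairs are ranked correctly. An AUC of 0.5 corresponds to chance and 1.0 to perfect separation, which is why values of 0.66-0.69 in pre-diagnostic samples are regarded as poor discrimination relative to 0.9 or above in hospital-based comparisons.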
In a recent study, the same study group identified stage-associated biomarkers by comparing stage I-IV patients and demonstrated the possibility of diagnosing PDAC at earlier disease stages (Table S3). 59 The investigators used a recombinant antibody microarray platform (350 antibodies) to analyze 213 Chinese plasma samples from PDAC patients and healthy controls. Based on a 25-biomarker signature, they reported that all PDAC stages could be distinguished from controls, with the accuracy increasing with disease progression (from stage I to stage IV). In particular, patients with stage I/II PDAC could be discriminated from healthy controls with an AUC of 0.80. Furthermore, the investigators showed a clear overlap between this study and a previous study involving Caucasians when comparing the 25 highest-ranked antibodies, indicating that this proteomics assay was generalizable across the two ethnic groups studied (Caucasian and Asian).
Although case-control studies have demonstrated good diagnostic accuracy of multiplex protein signatures, several prospective studies using pre-diagnostic samples have yielded low discrimination (Table 3). [51][52][53] In a case-control study with 160 PDAC cases, PDAC patients could be distinguished from healthy controls with an AUC of 0.93, using three serum protein biomarkers (CA 19-9, intercellular adhesion molecule 1, and osteoprotegerin). 60 However, in a population-based prospective study of 135 incident PDAC cases, the same three-biomarker panel merely achieved an AUC of 0.69 and 0.66 in samples collected < 1 and ≥ 1 year prior to diagnosis, not superior to the AUC for CA 19-9 alone (0.68 vs 0.63). 51 The findings suggested that this protein signature could not be used for pre-diagnostic risk assessment. Another study analyzed plasma samples from a mouse model together with diagnostic and pre-diagnostic plasma from 87 PDAC cases in the Women's Health Initiative, using an antibody microarray platform containing 130 antibodies. 52 This cross-species approach identified a panel of three protein biomarkers (v-erb-b2 erythroblastic leukemia viral oncogene homolog 2, tenascin C, and estrogen receptor 1) achieving an AUC of 0.86 in diagnostic samples. However, the AUC decreased to 0.68 in pre-diagnostic samples (87 women who were diagnosed with PDAC within 4 years of blood collection), albeit slightly superior to CA 19-9 alone (AUC = 0.60). By contrast, a nested case-control study within the UK Collaborative Trial of Ovarian Cancer Screening with 154 PDAC cases assessed the combinations of CA 19-9 with CA-125 and with TSP-1 in the pre-diagnostic plasma samples of PDAC patients and showed good diagnostic accuracy in distinguishing PDAC patients from healthy controls up to 2 years prior to diagnosis (Table 3). 47,53 The combination of TSP-1 and CA 19-9 achieved an AUC of 0.85, superior to both markers alone (0.69 and 0.77, respectively; P < 0.01). 53

In addition to studies investigating proteomics alone, recent evidence has shown that the combination of protein biomarkers and ctDNA can reach a sensitivity of ~70% and specificity of > 99% in distinguishing PDAC cases from healthy controls (Table S3). 61,62 Combining protein biomarkers and ctDNA increases sensitivity because the majority of cancer patients are detected by only one biomarker. 61,62 Although ctDNA is elevated in 85% of patients with advanced cancers, plasma ctDNA is detectable only in a small proportion of patients with early-stage cancers, 63,64 and the sensitivity of ctDNA tests is limited for localized cancers. 9

Metabolomics. Metabolomics is the comprehensive characterization of small low-molecular-weight metabolites in biological samples, 41 allowing investigation of associations of metabolic alterations with conventional metabolic risk factors and with specific diseases. In recent years, metabolomics technologies (mainly mass spectrometry [MS] and nuclear magnetic resonance) have allowed the identification of metabolite biomarkers, 39,41 with the promise to inform early detection of PDAC. However, data on metabolic signatures are still limited compared with those from proteomic and genomic profiling studies of cancer. So far, there have been several case-control studies using MS metabolite profiling to examine diagnostic performance of various platforms, which measured metabolites of a diverse range (number of cases 5-360, median 49; Table S4). In general, the diagnostic performance is superior to CA 19-9 alone and has been validated in independent test sets. However, the majority of previous studies used a case-control design and measured blood biomarkers after diagnosis of PDAC. Metabolic profiling using blood samples collected before cancer occurrence may inform early detection and improve understanding of etiology of PDAC (number of PDAC cases 170-453; Table 3).
At least 17 case-control studies suggested that MS-based metabolomics in blood samples could be useful in PDAC detection and could distinguish PDAC cases from healthy controls, with an AUC greater than 0.8 (Table S4). However, only four studies reported the AUC of biomarkers to distinguish PDAC from other benign diseases (e.g. CP and biliary diseases), which are important differential diagnoses of PDAC in clinical settings. These studies showed good discrimination of PDAC cases from benign hepatobiliary disease (i.e. benign tumor and CP), CP, and type 2 diabetes. A recent study by Mayerle et al. used both untargeted and targeted MS-based approaches including lipidomics and showed that PDAC patients could be distinguished from CP patients, as well as from healthy controls. 65 These studies showed that individuals with PDAC had higher sphingomyelin than individuals with CP and higher lysophosphatidylcholine than healthy controls (Table S5).
On the other hand, three nested case-control studies within prospective cohort studies investigated the performance of metabolomics in pre-diagnostic blood samples (Table 3). Using liquid chromatography-tandem mass spectrometry, Mayers and colleagues assessed 83 metabolites in central metabolism and amino acid metabolism in 453 PDAC cases and 898 controls nested in four prospective studies. 54 After a median follow-up of 8 years, they found that elevated plasma levels of branched-chain amino acids were associated with a twofold increased risk of PDAC (HR 2.00-2.13, comparing top vs bottom quintile). This elevated risk was independent of known predisposing factors, with the strongest association observed among subjects with samples collected 2 to 5 years before diagnosis. Another nested case-control study within the Japan Public Health Center-based Prospective Study with 170 cases quantified 12 targeted metabolites and showed that, among patients diagnosed in the first 6 years of follow-up, higher levels of 1,5-anhydroglucitol (1,5-AG), asparagine, tyrosine, and uric acid were associated with decreased risk of PDAC after adjustment for potential confounders (P for trend 0.02-0.04). 55 However, when analyzing the cases during the entire follow-up, higher 1,5-AG and lower methionine levels showed nonsignificant associations with decreased risk of PDAC (P for trend 0.06 and 0.07, respectively). Recently, a nested case-control study in the Shanghai Men's Health Study and the Shanghai Women's Health Study with 226 cases identified 10 metabolites that were associated with risk of PDAC, including seven glycerophospholipids. 56 The study also showed that the association was similar for cases diagnosed < 5 and ≥ 5 years after plasma collection. 
Despite the null associations of conventional lipids (triglycerides, total cholesterol, and low-density and high-density lipoprotein cholesterol) with PDAC in prospective studies (Table S6), these other lipids may be promising candidates as biomarkers.

Risk prediction
Lifestyle and metabolic risk factors, as well as genomics and blood-based biomarkers, are predictive of risk of PDAC. However, blood-based biomarkers have rarely been investigated in relation to risk prediction in the general population, as the majority of previous studies are hospital-based case-control studies. The primary goal of a risk prediction model is to provide a tool for identifying individuals at high risk of PDAC. Well-developed models can provide accurate risk assessment for individuals and can inform screening decisions. Although imaging (endoscopic ultrasonography and/or magnetic resonance cholangiopancreatography) has been recommended for initial screening among high-risk populations, there is no consensus on screening modalities and intervals for follow-up imaging. [1][2][3] There are two types of risk prediction models, developed either in high-risk populations (e.g. positive family history of PDAC and new-onset diabetes) or in the general population (Table 4).
The O/E ratio represents the age-standardized incidence ratio for the group of individuals within the Risk Index RR category, standardized using observed 10-year age-specific incidence rates in the cohort. In all studies, diagnosis of PDAC was ascertained by medical records and the International Classification of Diseases code.
PancPRO is a Mendelian risk prediction model, built using a Bayesian modeling framework, for identifying high-risk individuals among those with familial PDAC; it was validated in an independent cohort from the National Family Pancreas Tumor Registry (961 families and 26 incident PDAC cases). 66 PancPRO was shown to have good discrimination and calibration in this validation cohort, with an AUC of 0.75 (0.68-0.81) and an observed to predicted PDAC ratio of 0.83 (0.52-1.20). Another risk prediction model included detectable symptomatology preceding the diagnosis of PDAC as well as other risk factors. 67 The estimates were obtained from a case-control study in which information on current medications and recent signs and symptoms was collected. The 5-year absolute risk was calculated from the US Surveillance, Epidemiology, and End Results incidence data from 2008 to 2010. A total of 0.87% of controls had a 5-year absolute risk > 5%; these individuals had a combination of recent diagnosis of diabetes and pancreatitis, current use of proton-pump inhibitors, Jewish ancestry, non-O blood group, and current smoking. A recent study developed and internally validated a risk prediction model for PDAC among patients with new-onset diabetes, showing the promise of risk stratification among high-risk populations. 68 The study involved 109,385 individuals with newly diagnosed diabetes, and the outcome was PDAC diagnosed within 3 years of diabetes onset. The prediction model included demographic, behavioral, and clinical variables that were routinely collected at the time of diabetes diagnosis. The authors showed that if the predicted risk threshold was set at 1% over 3 years, the model would have a sensitivity of 44.7%, a specificity of 94.0%, and a positive predictive value of 2.6%.
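The low positive predictive value despite a specificity of 94.0% follows from the rarity of the outcome. The sketch below is an illustration only: it applies the standard Bayes relationship between sensitivity, specificity, outcome prevalence, and positive predictive value to the three figures quoted above. The function names are hypothetical, and the back-calculated prevalence is an implied value derived here under that relationship, not a figure reported by the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule: true positives / all test positives."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def implied_prevalence(sensitivity, specificity, ppv):
    """Solve the PPV formula above for the outcome prevalence."""
    false_pos_rate = 1.0 - specificity
    return (ppv * false_pos_rate) / (
        sensitivity * (1.0 - ppv) + ppv * false_pos_rate
    )


# Figures quoted in the text for the 1% / 3-year risk threshold.
SENSITIVITY = 0.447
SPECIFICITY = 0.940
PPV = 0.026

# Assumption: the three quoted figures are mutually consistent under Bayes' rule.
p = implied_prevalence(SENSITIVITY, SPECIFICITY, PPV)
print(f"Implied 3-year PDAC prevalence: {p:.4%}")
print(f"PPV recomputed from it: {positive_predictive_value(SENSITIVITY, SPECIFICITY, p):.3f}")
```

Under these assumptions the implied 3-year prevalence is about 0.36%, so even a modest 6% false-positive rate generates far more false positives than true positives, which is why restricting screening to enriched high-risk groups matters.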
For risk prediction models developed in the general population, the majority of previous studies developed risk prediction models using routinely collected data and traditional regression models (Table 4), [69][70][71][72] and some validated the prediction models in separate populations 71,72 and showed good discrimination and calibration. Two retrospective cohort studies have developed and validated prediction models for PDAC using information on socio-demographic, lifestyle, and clinical variables, 71,72 and both have shown good diagnostic accuracy in independent cohorts. The first study included routinely collected data of ~5 million patients aged 25-84 years from 753 QResearch general practices in England. 71 In an external validation cohort of 1.6 million patients, their sex-specific risk prediction model showed good diagnostic accuracy (AUC of 0.86 in men and 0.87 in women) and showed a 10-year absolute risk of ~0.8% in both sexes for participants with the top 10% of predicted risk. Likewise, another retrospective cohort involving ~2 million Korean individuals who underwent biennial examinations reported an 8-year absolute risk for participants with all risk factors included in the prediction model (1.5% in men and 1.2% in women). 72 So far, only the PanScan Consortium developed an RR model involving genetic risk factors as well as traditional, nongenetic risk factors. 70 However, their model had limited diagnostic accuracy, with an AUC of 0.58 for nongenetic factors, 0.57 for genetic factors, and 0.61 for both nongenetic and genetic factors. In particular, they found that the genetic factors did not add substantively to a risk model based on lifestyle factors only.
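To make the AUC values above easier to interpret, the following minimal sketch computes the AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case (the rank-based, Mann-Whitney interpretation). The function name and the predicted risks are hypothetical toy values, not data from any of the cited studies.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly.

    A pair counts 1 if the case has the higher predicted risk,
    0.5 if the two predicted risks are tied, and 0 otherwise.
    """
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks, for illustration only.
toy_cases = [0.9, 0.6, 0.4]
toy_controls = [0.5, 0.3, 0.2, 0.1]

print(f"Toy AUC: {auc_from_scores(toy_cases, toy_controls):.3f}")
```

On this scale, 0.5 corresponds to ranking cases and non-cases no better than chance and 1.0 to perfect separation, so an AUC of 0.58 to 0.61 indicates only slight improvement over chance, whereas 0.86 to 0.87 indicates that most case and non-case pairs are ranked correctly.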

Conclusions and future directions
Despite the large number of risk factors suggested by observational studies, the magnitude of RRs is overall small and the PAFs are low. Risk prediction models incorporating these lifestyle and metabolic risk factors as well as other factors routinely collected by health insurance systems have achieved satisfactory sensitivity and specificity. Genetic studies with very large sample sizes are required for the discovery of genetic variants, ranging from rare and low-frequency to common variants, in order to inform risk prediction models. Novel blood-based biomarkers, particularly those from proteomics and metabolomics, can inform early diagnosis of PDAC. However, prospective cohort studies that collect pre-diagnostic samples and validation in independent studies are warranted. In this context, prospective biobank studies with samples collected prior to disease onset are valuable resources. For example, the China Kadoorie Biobank has proposed a multi-omics approach (metabolomics, proteomics, and genomics) to investigate novel biomarkers relevant for risk prediction and early diagnosis of PDAC. Similarly, the European Prospective Investigation into Cancer and Nutrition has included metabolomics in its ongoing research topics for PDAC.
Because of the lack of noninvasive and low-cost screening tools, the current recommendation is that screening the general population for PDAC is not feasible, and screening will need to be restricted to people at high risk of PDAC. 73 Based on our review of literature in this field, we propose that a bridge between risk factor epidemiology and multi-omics investigations is needed because (i) combinations of biomarkers, as well as combinations of biomarkers with traditional risk factors and genomics data, can provide much more information than a single biomarker alone (e.g. CA 19-9); (ii) algorithms to identify high-risk populations, together with cost-effective biomarkers that have sufficient discriminatory power, are needed to inform clinical decisions; and (iii) multi-omics investigations can help identify etiological factors for PDAC and new pathways as potential therapeutic targets.

Supporting information
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Table S1. Reviews of ctDNA, cell-free non-coding RNA, exosomes, microbiome, circulating tumor cells, and PDAC.
Table S2. Study information of GWAS studies of PDAC.
Table S3. Study information of case-control studies of proteomics and PDAC.
Table S4. Study information of case-control studies of metabolomics and PDAC.
Table S5. Case-control studies of metabolomics and PDAC reporting on phospholipids.
Table S6. Prospective studies of blood lipids and risk of PDAC.