Risk stratification of symptomatic patients suspected of colorectal cancer using faecal and urinary markers

Faecal markers, such as the faecal immunochemical test for haemoglobin (FIT) and faecal calprotectin (FCP), have been increasingly used to exclude colorectal cancer (CRC) and colonic inflammation. However, a considerable number of patients with lower gastrointestinal symptoms have cancer but a negative FIT result (i.e. a false negative), which has impeded the use of FIT in clinical practice. We undertook a study of the diagnostic accuracy for CRC of FIT, FCP and urinary volatile organic compounds (VOCs) in patients with lower gastrointestinal symptoms.


Introduction
Colorectal cancer accounts for 12% of all cancers in the UK, with a strong age relationship. The clinical presentation of CRC is varied; hence the dilemma for the clinician is to distinguish those with significant versus nonsignificant pathology without recourse to invasive and costly investigations [1].
Symptoms alone are not sufficiently sensitive to diagnose CRC, and up to a third of patients who undergo invasive investigations have normal outcomes (i.e. colonoscopy reported as macroscopically and microscopically normal) [2]. For those with lower gastrointestinal symptoms there is increasing evidence for the use of the faecal immunochemical test for haemoglobin (FIT) as first-line testing; this has a high negative predictive value (NPV) of 0.99, but sensitivity is relatively low (0.80-0.90) [2][3][4].
In symptomatic patients it remains uncertain how to interpret a FIT-negative result; in other words, can such patients be sufficiently reassured that they do not have cancer? Alternative or additional tests are therefore required. For example, a stool marker for inflammation, faecal calprotectin (FCP), or urinary volatile metabolic markers could be utilized to risk stratify those with suspected CRC. Metabolic markers such as urinary, faecal or breath volatile organic compounds (VOCs) have previously been shown to aid in CRC detection [5][6][7][8]. To our knowledge, this is the first study to use a combination of faecal tests (FIT and FCP) as well as urinary VOC testing in a symptomatic population suspected of CRC.

Method
Design and setting
This was a single-centre, prospective, blinded study of patients with lower gastrointestinal symptoms referred by family physicians to tertiary care with suspected CRC. Ethical approval was granted by Coventry and Warwickshire Research Ethics Committee, UK as part of the FAMISHED (Food and Fermentation using Metagenomics in Health and Disease) multicentre study (09/H1211/38). The study protocol conforms to ethical guidelines of the 1975 Declaration of Helsinki as reflected by the institution's human research committee.

Participants
A total of 1850 patients were approached, with criteria for inclusion based on national referral criteria (Appendix S1 in the online Supporting Information). Of these, 834 were excluded for a combination of reasons including physical frailty, illness and language barriers. One thousand and sixteen patients provided consent and underwent colonic investigations (endoscopic or radiological colonic cross-sectional imaging). Sixteen patients withdrew from the study and 310 failed to provide stool samples (69% return rate). A further 78 patients were excluded as only one stool sample was provided (both the FIT sample collection device and stool for FCP were required). Spot urine samples were received from 762 patients (76% return rate), but 39 were excluded due to insufficient sample volume or failed VOC urine analysis. This left a group of 562 patients with matching urine and stool samples (FIT and FCP) who were included for final statistical analysis (Fig. 1). Those who were under the age of 18, were pregnant, did not meet the referral criteria for urgent review for lower gastrointestinal symptoms or had incomplete colonic examinations were excluded from the study.
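The participant flow above can be checked with simple arithmetic. The following is a minimal sketch only: the variable names are illustrative, and the use of the 1000 patients remaining after withdrawals as the denominator for both return rates is an assumption inferred from the quoted percentages, not something the text states explicitly.

```python
# Participant flow figures quoted in the text (variable names are illustrative).
approached = 1850
excluded_before_consent = 834
consented = approached - excluded_before_consent      # patients who gave consent

withdrew = 16
remaining = consented - withdrew                      # ASSUMED denominator for return rates

no_stool = 310
stool_returned = remaining - no_stool
stool_return_rate = stool_returned / remaining

single_stool_sample = 78
paired_stool = stool_returned - single_stool_sample   # both FIT and FCP samples present

urine_received = 762
urine_return_rate = urine_received / remaining
urine_failed = 39
urine_usable = urine_received - urine_failed

final_cohort = 562  # reported number with matching urine and stool samples

print(f"consented: {consented}")
print(f"stool return rate: {stool_return_rate:.0%}")
print(f"urine return rate: {urine_return_rate:.0%}")
print(f"paired stool: {paired_stool}, usable urine: {urine_usable}, final: {final_cohort}")
```

Under this assumption the quoted return rates are reproduced, and the final cohort is no larger than either the paired-stool or the usable-urine group, as expected for the overlap of the two.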
All study participants were given a pack containing a FIT sample collection device (Extel Hemo-auto MC A device; Kyowa Medex, Tokyo, Japan via Alpha Laboratories Ltd., Eastleigh, UK), which holds 2 mg of faeces in 2 ml of buffer, and a Universal Sterilin 30 ml stool pot for the FCP sample. Written and pictorial instructions for collections were provided, with the Fe-Col® sample collection aid (Alpha Laboratories Ltd.). Time of collection and time of receipt of samples at the laboratory were recorded and all samples were stored refrigerated at 2-8°C until analysis. Patients were asked to return the sample prior to colonic investigations. Samples returned more than 4 days after collection were excluded from analysis.
All study participants were also provided with a Universal Sterilin 30 ml pot to collect a spot urine sample at the time of the clinic visit. Timing of collection was recorded and urine samples were stored with sealed caps in a freezer at −80°C. For patients who were unable to provide the sample in the clinic, a urine container was provided as well as a return envelope, and the sample was sent via courier from their primary physician and stored at −80°C.

Intervention
Quantitative FIT was performed on an automated HM-JACKarc analyser (Kyowa Medex) by the Midlands and North West Bowel Cancer Screening Hub, Rugby, UK on a weekly basis. Stool samples for FCP analysis were extracted manually by trained laboratory staff alongside the routine calprotectin service. Extracted calprotectin was measured using the EliA Calprotectin fluoroimmunoassay on the automated Thermo Fisher Immuno-Cap 250 analyser (Thermo Fisher Scientific, Waltham, Massachusetts, USA).
In the analyses for CRC, thresholds were determined from the data (a priori) to maximize sensitivity under the constraint that the NPV was ≥ 0.99. For FCP, no threshold achieved an NPV ≥ 0.99. The lower limit of detection of the FIT assay is 3 µg/g faeces.
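The threshold rule described above (maximize sensitivity subject to NPV ≥ 0.99) can be illustrated as a simple search over candidate cut-offs. This is a minimal sketch under stated assumptions, not the study's actual procedure: the function names, the convention that a value at or above the threshold counts as positive, the tie-break in favour of the higher threshold, and the toy data are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def confusion(values: Sequence[float], has_crc: Sequence[bool],
              threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), calling a value positive if it is >= threshold."""
    tp = fp = tn = fn = 0
    for value, diseased in zip(values, has_crc):
        positive = value >= threshold  # ASSUMPTION: '>=' rather than '>'
        if diseased and positive:
            tp += 1
        elif diseased:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def select_threshold(values: Sequence[float], has_crc: Sequence[bool],
                     min_npv: float = 0.99) -> Optional[Tuple[float, float]]:
    """Return (sensitivity, threshold) with maximal sensitivity among thresholds
    whose NPV is >= min_npv, or None if no threshold satisfies the constraint."""
    best: Optional[Tuple[float, float]] = None
    for threshold in sorted(set(values)):
        tp, fp, tn, fn = confusion(values, has_crc, threshold)
        if tn + fn == 0 or tp + fn == 0:
            continue  # NPV or sensitivity undefined at this threshold
        npv = tn / (tn + fn)
        sensitivity = tp / (tp + fn)
        # Tuple comparison: higher sensitivity first, then higher threshold.
        if npv >= min_npv and (best is None or (sensitivity, threshold) > best):
            best = (sensitivity, threshold)
    return best


# Toy data (hypothetical marker values, not study measurements).
toy_values = [3, 3, 5, 8, 12, 40, 150, 200]
toy_crc = [False, False, False, False, False, True, True, True]
print(select_threshold(toy_values, toy_crc))
```

A return value of `None` corresponds to the situation reported for FCP, where no threshold met the NPV constraint. Note that a threshold chosen and evaluated on the same data in this way is optimistic.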
Samples were initially stored at −80°C. Prior to urinary VOC analysis, they underwent a graded defrost process (based on our unpublished established protocol). Ten-millilitre glass vial aliquots (Thermo-Fisher Scientific, Loughborough, UK) suitable for use with an autosampler (MPS, Gerstel GmbH, Mülheim an der Ruhr, Germany) were used. Crimp caps (Chromacol Ltd., Merck, UK) with silicone polytetrafluoroethylene septa were used to seal each sample. Each septum and crimp cap was baked for 6 h at 200°C prior to use to remove any potential interfering molecules. Control blanks of air were prepared using the same method. A commercial gas analysis instrument [Lonestar (FAIMS), Owlstone, Cambridge, UK], based on ion mobility spectroscopy (IMS), was used to analyse VOCs emanating from the urine samples. Details of the Lonestar and its application in medical diagnostics have been reviewed by Covington et al. [9]. The setup was bespoke for this application, to detect unique VOC chemical 'fingerprints'. Our previous study has determined the optimal sample capture and storage methods to minimize diurnal and day-to-day variation [10] (see Appendix S1 for further methodological details on the use of FAIMS).

Outcomes
Participating clinicians, endoscopists and radiologists were blinded to the results of the urinary VOCs and faecal tests (FIT/FCP). Diagnosis of CRC and adenomas was confirmed histologically. High-risk adenomas were defined as lesions with high-grade dysplasia and/or serrated, villous histology, ≥ 10 mm in size or the presence of three or more adenomas. Hyperplastic polyps were excluded. Faecal and urinary VOC results were compared with the outcome of the colonic investigations and divided into clinical groups: CRC, high-risk adenoma, all adenomas, others and normal.
All analysis was carried out using R 3.4.2 (R Foundation for Statistical Computing, Vienna, Austria). To combine FIT and FCP measurements, 10-fold cross validation and a Bayesian robust logistic regression model were used to generate predictive scores for each patient using the above criteria.
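The role of 10-fold cross-validation in generating a predictive score for each patient can be sketched as follows. This is an illustration only, written in Python rather than R: the helper names are hypothetical, the fold assignment is a plain shuffled split (the source does not say whether folds were stratified), and a trivial prevalence model stands in for the Bayesian robust logistic regression.

```python
import random
from typing import Any, Callable, List, Sequence


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def out_of_fold_scores(X: Sequence[Any], y: Sequence[int],
                       fit: Callable[[list, list], Any],
                       predict: Callable[[Any, Any], float],
                       k: int = 10, seed: int = 0) -> List[float]:
    """Score each patient with a model trained on the other folds only."""
    scores: List[Any] = [None] * len(X)
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            scores[i] = predict(model, X[i])
    return scores


# Stand-in model (NOT the study's model): predicts the training-set prevalence.
def fit_prevalence(X_train: list, y_train: list) -> float:
    return sum(y_train) / len(y_train)


def predict_prevalence(model: float, x: Any) -> float:
    return model


toy_X = list(range(20))                 # hypothetical feature placeholder
toy_y = [i % 2 for i in range(20)]      # hypothetical outcome labels
toy_scores = out_of_fold_scores(toy_X, toy_y, fit_prevalence, predict_prevalence)
print(toy_scores[:5])
```

The design point is that no patient's score comes from a model that saw that patient's outcome, so thresholds applied to these scores are less optimistic than in-sample estimates.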
The VOC data variables were assessed for predictive value with a Mann-Whitney U-test at a significance level of 0.8, with multi-test correction performed using sequential goodness of fit (SGoF) [11]. This yielded a set of about 10^4 candidate variables. The dimensionality of the selected variables was then reduced using principal component analysis (PCA) [12], with the number of principal components chosen using cross-validation (typically around 35) [13]. For testing using VOCs alone, a logistic regression model was trained on the results. For the two-phase test incorporating FIT as a first phase, a support vector machine with a radial kernel was used on the top 128 variables, without using SGoF for variable selection. Ten-fold cross-validation was used to generate predictive scores for each patient, and a threshold was selected to maximize sensitivity.
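The first step of this pipeline, univariate screening with a Mann-Whitney U-test, can be sketched as below. This is a simplified illustration under stated assumptions: it uses the two-sided normal approximation with tie correction and no continuity correction, the function names are hypothetical, and the SGoF multi-test correction, PCA and classifier steps are deliberately not implemented.

```python
import math
from typing import List, Sequence, Tuple


def average_ranks(values: Sequence[float]) -> Tuple[List[float], float]:
    """Return 1-based ranks (ties share the average rank) and sum of t^3 - t over tie groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_term = 0.0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    return ranks, tie_term


def mann_whitney_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided p-value from the normal approximation to the U statistic."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks, tie_term = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0  # all values tied: no evidence of a difference
    z = (u1 - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2))


def screen_features(rows: Sequence[Sequence[float]], labels: Sequence[bool],
                    alpha: float = 0.8) -> List[int]:
    """Keep the column indices whose p-value is below alpha (0.8 as quoted in the text)."""
    kept = []
    for j in range(len(rows[0])):
        cases = [row[j] for row, label in zip(rows, labels) if label]
        controls = [row[j] for row, label in zip(rows, labels) if not label]
        if mann_whitney_p(cases, controls) < alpha:
            kept.append(j)
    return kept


# Toy data: column 0 separates the groups, column 1 is constant (hypothetical values).
toy_rows = [[float(i), 5.0] for i in range(1, 11)]
toy_labels = [False] * 5 + [True] * 5
print(screen_features(toy_rows, toy_labels))
```

With a threshold as permissive as 0.8, this step acts as a loose pre-filter that discards only clearly uninformative variables, leaving dimensionality reduction to the later PCA step.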

Results
The main patient characteristics are shown in Table 1. A total of 562 patients completed colonic investigations in addition to providing stool for FIT and FCP [...] (see Table 2 for the diagnostic performance of FIT).
FCP alone performed less well than FIT in the CRC and adenoma groups. Test performances combining FIT and FCP are shown in Table 2. The respective receiver operating characteristic (ROC) curves for CRC, high-risk adenomas and all adenomas using FIT are shown in Fig. 2.

Urinary VOCs
For urinary VOC analysis, logistic regression using SGoF selection and PCA was applied to form a unique 'chemical fingerprint'. The sensitivity of urinary VOCs [...] (Table 3).
Table notes: High-risk adenoma: adenoma with high-grade dysplasia, villous histology, ≥ 10 mm or ≥ 3 adenomas. Thresholds applied to achieve the highest sensitivity under the constraint of keeping the negative predictive value > 0.99. Note that the lowest limit of detection for the FIT assay is 3 µg/g faeces.
Further analysis of urinary VOCs in the setting of FIT-negative CRC (missed cancer) improved the sensitivity to 0.97 (95% CI 0.90-1.0) and specificity to 0.72 (95% CI 0.68-0.76), with an NPV of 1.0 (95% CI 0.99-1.0). Figure 3 shows a box plot applying the two-stage filter process to depict improvement in CRC detection when urinary VOCs are used in FIT-negative CRC patients compared with FIT alone. The decision threshold line in each case is the value (of either FIT or predicted probability) that separates positive from negative predictions of cancer. Thus, for any patient above the line the test indicates a likelihood of having cancer, and for any patient below the line it does not. Overall, patients are classed as negative for the two-stage test if they are negative for both FIT and VOC screening, and positive if they are positive for either test: FIT (above the threshold) or VOC screening. We did not observe any differences after stratifying for age or gender. For all adenomas, urinary VOC did not improve detection in those with false-negative FIT.
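The two-stage classification rule described above can be written out explicitly. The sketch below is illustrative only: the function names, the threshold values, the use of '>=' at the decision line and the toy patients are all assumptions, and none of the numbers are study data.

```python
from typing import Dict, Sequence


def two_stage_positive(fit_value: float, voc_score: float,
                       fit_threshold: float, voc_threshold: float) -> bool:
    """Positive if FIT is positive; otherwise fall back to the VOC result."""
    if fit_value >= fit_threshold:   # ASSUMPTION: '>=' at the decision line
        return True                  # stage 1 positive, VOC result not needed
    return voc_score >= voc_threshold


def ratio(numerator: int, denominator: int) -> float:
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(predicted: Sequence[bool],
                       has_crc: Sequence[bool]) -> Dict[str, float]:
    """Sensitivity, specificity and NPV from paired predictions and outcomes."""
    tp = sum(1 for p, d in zip(predicted, has_crc) if p and d)
    fn = sum(1 for p, d in zip(predicted, has_crc) if not p and d)
    tn = sum(1 for p, d in zip(predicted, has_crc) if not p and not d)
    fp = sum(1 for p, d in zip(predicted, has_crc) if p and not d)
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical patients: (FIT value, VOC predicted probability, CRC present).
toy_patients = [
    (150.0, 0.2, True),   # FIT positive
    (2.0, 0.9, True),     # FIT negative, rescued by VOC
    (1.0, 0.1, True),     # negative on both tests (missed)
    (0.0, 0.1, False),
    (0.0, 0.7, False),    # VOC false positive
    (20.0, 0.1, False),   # FIT false positive
    (3.0, 0.2, False),
]
toy_predictions = [two_stage_positive(f, v, fit_threshold=10.0, voc_threshold=0.5)
                   for f, v, _ in toy_patients]
toy_metrics = diagnostic_metrics(toy_predictions, [c for _, _, c in toy_patients])
print(toy_metrics)
```

Because a patient is negative only when both tests are negative, the second stage can only add positives: sensitivity cannot fall relative to FIT alone, while specificity cannot rise.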

Discussion
Whilst studies evaluating various faecal markers (methylated genes, microRNA and protein markers) have shown promise for the detection of CRC and adenomas, their application within a clinical setting has been limited due to high cost and poor sensitivity, especially when applied in areas with a low disease prevalence [14]. Thus, the emphasis has been on low-cost, noninvasive testing such as FIT for detection of CRC and adenomas. Experience from using FIT in the screening population reveals that it has a relatively high specificity at the expense of sensitivity. Various FIT devices have been trialled and it has become evident that there is considerable heterogeneity in FIT devices for the detection of CRC. This is further compounded by the fact that various threshold levels are applied and there is uncertainty surrounding those who test negative with FIT [15]. In the UK, CRC detection based on symptoms alone is low, ranging from 4% to 8% [16,17]. This suggests that over 90% of patients undergo negative tests for exclusion of CRC. Thus, it is imperative that alternative noninvasive prescreening markers such as FIT and other faecal or urinary markers are used in those with gastrointestinal symptoms so as to minimize unnecessary investigations.
In this study we have shown for the first time the value of dual-modality testing using noninvasive markers (FIT and urinary VOCs) as a two-stage process to exclude CRC. In those with lower gastrointestinal symptoms suspicious of CRC, the use of FIT alone revealed a sensitivity of 80% [meta-analysis suggests 90% (CI 87-92%); unpublished]. Thus, there is the potential to miss one or two cancers out of every ten, which is not sufficiently robust for everyday clinical use. Urinary VOC on its own was less sensitive (0.63; CI 0.46-0.79) and in combination with FIT did not show any improvement in sensitivity (0.80; CI 0.60-0.93). However, the use of a two-stage test, namely the addition of urinary VOC testing in those who test negative for FIT (i.e. false-negative CRC), increases the combined sensitivity to 97%; this is more acceptable for clinical use and comparable to the performance of colonoscopy at a fraction of the cost.
Urinary VOC analysis on its own (VOC 'positive') only provides 63% sensitivity and specificity for CRC detection, and performance does not improve when it is used in combination with FIT or FCP. This may be due to the heterogeneity of the chemical fingerprint that is produced by CRC. However, following preselection by FIT it performs well, as there is reduced background 'volatile noise' making it more specific to detect either haemoglobin moiety/breakdown products or glycation end-products. Whilst urinary VOCs demonstrate a high sensitivity for adenoma detection, the lack of specificity and high false-positive rate suggests that this marker may not perform as well for adenoma detection.
The VOCs that are detected using our preanalytical method and FAIMS are unique, as detection is based on volatiles existing in the gaseous phase rather than in the liquid phase. The disease separation is characterized by the mobility of individual ions (i.e. physical rather than chemical properties of the ion), which have low molecular weights. Unlike conventional gas chromatography-mass spectrometry (GC-MS), the specific chemicals are not identified but a 'chemical fingerprint' is formed. GC-MS is limited by its high running and labour costs as well as run time. This impedes its use within routine clinical practice, where rapid, low-cost and simple operation (by nonskilled operators) would be preferable. The composition of key volatile compounds gives rise to the unique chemical fingerprint identified in this study, allowing classification into 'VOC positive or negative' outcomes. VOCs reflect metabolic cellular changes within the host; for example, detection of advanced glycation end-products, which have been implicated in colon carcinogenesis [18]. Our previous work [19] [undertaken using a Bruker Scion GCMS, fitted with dynamic head space sampling and solid-phase micro-extraction (SPME) preconcentration system] has identified three chemicals which are modulated in CRC (the 'VOC positive' signature). In particular we noted a high incidence of 1,3,5,7-cyclooctatetraene and a low incidence of 1,3-propanediamine and 4-methylbenzoic acid (dietary metabolite). Methylbenzene has also recently been reported to provide a unique chemical signature in those with CRC when a different technology (non-GCMS) is used [20]. A higher incidence of acetone was noted in those with colorectal adenomas; this has been shown by members of our group (unpublished) to be produced by C. difficile and other Clostridiales. Allyl isothiocyanate was also detected in those with CRC but not at elevated levels; the latter is produced by certain E. coli strains as dietary substrates, which can affect the integrity of the gut mucosa [21].
The use of dual-modality testing, initially with FIT followed by urinary VOCs, enables 97% sensitivity with 100% NPV if both tests are negative for the detection of CRC and high-grade adenomas. Furthermore, findings from this study suggest that the combination of FIT and VOC offers the option for personalized strategies for CRC detection in those with symptoms and avoids the need for repeat FIT testing (if FIT is negative the test probability is unlikely to improve unless there are preanalytical errors).
It is envisaged that both these noninvasive tests (FIT and urinary VOCs) can be undertaken within primary care and analysed within a central laboratory (as FIT currently is) at low cost to guide secondary care referral patterns. The FAIMS unit is commercially available and urine VOCs are deemed stable up to 12 months when stored frozen [10]. It has been purported within a simulation model that for an equivalent biomarker to compare with FIT (£18/test) it should not exceed seven-fold the unit cost of FIT [22]; this is fulfilled by urinary VOCs (£28/test), which cost less than twice the unit cost of FIT. A proposed clinical algorithm is outlined, highlighting the use of a dual noninvasive diagnostic approach in those with lower gastrointestinal symptoms (Figure S1).