Diagnostic yield from symptomatic lower gastrointestinal endoscopy in the UK: A British Society of Gastroenterology analysis using data from the National Endoscopy Database

The value of lower gastrointestinal endoscopy (LGIE; colonoscopy or sigmoidoscopy) relates to its ability to detect clinically relevant findings, predominantly cancers, preneoplastic polyps or inflammatory bowel disease. There are concerns that many LGIEs are performed on low‐risk patients with limited benefit.


| INTRODUCTION
The volume of lower gastrointestinal (GI) endoscopy performed to investigate symptoms in the UK continues to rise annually, combined with decreasing cancer conversion rates. 1 This practice of high-volume, low-yield endoscopy has resulted in services being unable to cope with demand, 2 patients subjected to invasive procedures with inherent risk of complications but with little chance of altering their management, 3,4 substantial environmental impact, 5 and high cost to the National Health Service (NHS). Despite low-yield endoscopy still having the potential of identifying pathology and offering reassurance through normal findings, the high opportunity cost in publicly funded systems must be considered. Unlike the independent sector, which can adjust capacity to meet demand, endoscopy resources in public healthcare are often limited. For example, a major obstacle to the planned expansion to younger age-groups of the English FIT-based bowel cancer screening programme, where the PPV for cancer is higher (7.1%) 6 and with earlier cancer stage detection, 7 is the lack of available endoscopy capacity.
Alongside delaying Bowel Cancer Screening Programme expansion, there are now significant diagnostic delays for patients awaiting lower GI endoscopy in the UK, and it is predicted these will result in a 16% increase in colorectal cancer deaths over the coming 5 years. 8 Endoscopy services, unable to cope with rising demand prior to the COVID pandemic, 9 now face additional backlogs of over half a million procedures. 10 Resuming pre-pandemic work patterns will not address the backlog, while increasing the volume of endoscopy performed is not feasible due to the funding and workforce crises currently facing the NHS. 11 The best solution lies in optimising the use of available endoscopy resources: ensuring higher-risk patients are investigated in a timely manner while reducing the volume of low-yield endoscopy. Implementing this approach has proven challenging, in part due to the absence of nationwide data linking procedure indications with diagnostic yields.
Without this, it is difficult to perform a comprehensive, nationwide comparison of endoscopy indications over extended periods.
Data from the National Endoscopy Database (NED) can resolve this issue. NED data is compiled directly from electronic endoscopy reports, which are automatically uploaded without adding to the workload of endoscopists. 12 The data generated offers a unique opportunity to gain valuable insights into the state of endoscopy services in the UK.
The primary aim of this study was to use data from the NED to evaluate diagnostic yield for lower GI endoscopy - colonoscopy and flexible sigmoidoscopy - performed to investigate patient symptoms for different patient age-groups and sex, aiming to identify opportunities for future capacity optimisation. Specifically, the study estimates the positive predictive value (PPV) of common GI symptoms (at different patient ages and sex) for identifying large polyps and cancer, to help guide referral pathways and inform both patients and healthcare professionals regarding the necessity of undergoing endoscopic assessment.

| Data source
The data for this study was extracted from the NED for all endoscopy procedures performed in NED-uploading sites in the UK from March 1, 2019 to February 29, 2020. At that time, national referral guidelines were primarily based on patient symptoms and age, 14 and faecal immunochemical testing (FIT) was not commonly used outside the UK's bowel cancer screening programme (BCSP). Local endoscopy reporting systems automatically upload data to NED via a standardised data schema developed following data mapping and validation exercises. 12 At the time of data collection, 407 of the 515 UK endoscopy sites (79%) were uploading to NED, 15 alongside three-monthly monitoring to ensure upload success rates were 98% or greater. 12 As the NED data is anonymised and not linked to histology, endoscopic diagnoses were used for this study.
The analysis was restricted to lower GI endoscopy procedures - colonoscopy or flexible sigmoidoscopy - in adult patients (Figure 1).
Duplicate procedures were excluded (with index procedure retained), as were endoscopies conducted on patients under 18 years old, those over 99 years old (as such cases were likely to be data entry errors), and those with unclassified patient sex. Abandoned procedures, including colonoscopies which did not reach the caecum or equivalent and flexible sigmoidoscopy which did not enter the rectum, were also excluded unless a diagnosis of cancer was the reason for the incomplete procedure. NED offers the flexibility of free-text entry in both the indication for referral and diagnosis fields, in addition to pre-set input options from the standardised data schema.
To prevent omission of relevant data, free-text entries were carefully scrutinised using the txttool utility in Stata 16 and, where appropriate, re-coded to corresponding NED terms for indication and diagnosis (this process is described in Appendix S1). Following this, if endoscopies did not include a recorded indication or diagnosis, they were excluded, as were OGD uploads.
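The exclusion rules above can be sketched as a single eligibility check. This is a minimal illustration rather than the study's Stata code: the `Procedure` record and all of its field names are hypothetical and do not reflect the NED schema, and free-text recoding is assumed to have already been done.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Procedure:
    # All field names are hypothetical; they are not the NED schema.
    procedure_type: str               # "colonoscopy", "flexi_sig" or "ogd"
    age: int
    sex: Optional[str]                # "M", "F", or None when unclassified
    completed: bool                   # False for abandoned procedures
    cancer_caused_incompletion: bool  # True if cancer stopped the procedure
    indication: Optional[str]         # None or "" when nothing was recorded
    diagnosis: Optional[str]          # None or "" when nothing was recorded
    is_duplicate: bool = False        # True for repeats of an index procedure


def is_eligible(p: Procedure) -> bool:
    """Return True if the procedure survives the exclusions described in the text."""
    if p.is_duplicate:
        return False
    if p.procedure_type not in {"colonoscopy", "flexi_sig"}:
        return False  # e.g. OGD uploads
    if p.age < 18 or p.age > 99:
        return False  # children, and likely data entry errors
    if p.sex not in {"M", "F"}:
        return False
    if not p.completed and not p.cancer_caused_incompletion:
        return False  # abandoned, unless cancer was the reason
    if not p.indication or not p.diagnosis:
        return False
    return True
```

Because each rule is an independent exclusion, the order of the checks does not change which procedures are retained.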

| Data coding: Indications, symptoms and diagnoses
As NED allows for the recording of multiple indications for a procedure, the data was reduced for analysis to one indication per procedure, using a hierarchy of severity. The hierarchy was as follows: therapeutic/emergency, screening, abnormal prior investigation, post-CRC surveillance, polyp surveillance, other surveillance, IBD assessment/surveillance and symptomatic. When multiple indications were recorded for a procedure, the procedure was categorised as the indication 'highest' in the hierarchy. For example, if indications included both post-CRC surveillance and diarrhoea, the endoscopy was categorised as being for post-CRC surveillance. As this analysis focused on lower GI endoscopy performed to investigate patient symptoms in an outpatient setting, those conducted for other indications were excluded.
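The reduction to a single indication per procedure can be sketched as follows. The ordering is taken from the hierarchy listed above; the function name is hypothetical, and the sketch assumes each recorded indication has already been mapped to one of the eight category labels (so a symptom such as diarrhoea arrives as "symptomatic").

```python
# Order taken from the hierarchy in the text, highest priority first.
INDICATION_HIERARCHY = [
    "therapeutic/emergency",
    "screening",
    "abnormal prior investigation",
    "post-CRC surveillance",
    "polyp surveillance",
    "other surveillance",
    "IBD assessment/surveillance",
    "symptomatic",
]
RANK = {name: position for position, name in enumerate(INDICATION_HIERARCHY)}


def primary_indication(indications):
    """Return the recorded indication that sits highest in the hierarchy.

    Labels not present in the hierarchy are ignored; None is returned
    when no recognised indication was recorded.
    """
    known = [label for label in indications if label in RANK]
    if not known:
        return None
    return min(known, key=lambda label: RANK[label])
```

For the example in the text, `primary_indication(["symptomatic", "post-CRC surveillance"])` returns `"post-CRC surveillance"`, so that procedure would not enter the symptomatic analysis.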
In NED, the indications fields may include multiple symptoms. For the primary analysis, each endoscopy was classified by symptom - as described in Table 1 - using a hierarchy of 'severity' (e.g. if anaemia and constipation were both recorded as indications, the symptom was categorised as anaemia). The effects of broadened symptom combinations, examining each symptom individually as well as in conjunction with another symptom, were also explored in secondary analyses (Table S1).
The outcome variables for the analyses related to endoscopic diagnoses: normal, diverticulosis, inflammatory bowel disease (IBD), polyp(s), large polyp, cancer. For each endoscopy, multiple diagnoses could be recorded except when the outcome was 'normal'. For example, an endoscopy revealing a 10 mm polyp would be categorised under both the polyp(s) and large polyp groups. Certain diagnoses (e.g. haemorrhoids, melanosis) were recorded by NED, but were unlikely to impact clinical course. Consequently, if only these diagnoses were recorded, the procedure was re-classified as normal.
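The diagnosis grouping described above can be illustrated with a short sketch. Several details are assumptions made for illustration: the 10 mm cut-off for a large polyp is inferred from the example in the text, the set of incidental diagnoses contains only the two examples named there, and the function and label names are hypothetical.

```python
# Diagnoses treated as incidental. The text gives haemorrhoids and melanosis
# as examples, so this set is illustrative rather than complete.
INCIDENTAL = {"haemorrhoids", "melanosis"}

# Assumed size threshold, inferred from the 10 mm example in the text.
LARGE_POLYP_MM = 10


def diagnosis_groups(diagnoses, largest_polyp_mm=None):
    """Map recorded diagnoses to the non-exclusive outcome groups."""
    groups = {d for d in diagnoses if d not in INCIDENTAL and d != "normal"}
    has_large_polyp = (
        "polyp(s)" in groups
        and largest_polyp_mm is not None
        and largest_polyp_mm >= LARGE_POLYP_MM
    )
    if has_large_polyp:
        groups.add("large polyp")
    # 'normal' is exclusive: it applies only when nothing else remains.
    return groups if groups else {"normal"}
```

Under these assumptions, a procedure recording only haemorrhoids is returned as `{"normal"}`, while one recording a 10 mm polyp falls into both the polyp(s) and large polyp groups.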

| Statistical analysis
Descriptive statistics (expressed as percentages unless otherwise noted) were applied to summarise the patient demographics (age, sex) and symptom groups. Pearson's chi-square test was used to identify associations between categorical variables. Patient demographics and symptom data groups were compared by health sector (NHS or independent sector).
As many endoscopies conducted within independent sector sites are on behalf of the NHS, 9 with providers being reimbursed by NHS services to enhance capacity and alleviate waiting times (i.e. outsourcing), the primary analysis encompassed data from all participating sites. Secondary analysis was conducted focusing specifically on data uploaded from NHS sites (Table S2).
The unadjusted PPVs, together with 95% confidence intervals (CI) calculated using the Wilson method, 17 of each diagnosis group (alongside the PPV of those reporting only diverticulosis or normal findings) within each age group (categorised as 18-39, 40-49, 50-59, 60-69, 70-79 and 80-99 years old) were determined, and further broken down by patient sex. Analysis was completed for colonoscopy initially, then repeated for flexible sigmoidoscopy. The impact of symptoms, patient sex, and patient age upon the PPV of large polyp (and subsequently cancer) was examined using two-level mixed effects logistic regression models. 18 These accounted for the non-independence of procedures (which are clustered within endoscopists) by fitting endoscopist as a random effect, with symptoms, patient sex, and patient age-group as fixed effects upon the binomial dependent variable. The primary models included patient age by group (18-39, 40-49, etc.); a subsequent analysis explored the impact of an age cut-off of 50, and for that analysis regression was performed with age as a dichotomous variable (aged <50, aged 50 or more). Postestimation analysis was then performed to calculate the marginal means of the dependent variable based on covariates, with results displayed as adjusted PPVs with 95% CIs. Analysis was then rerun using the additional symptom combinations within the random effects upon the dependent variable of cancer. Finally, regression was repeated restricting consideration to colonoscopies and flexible sigmoidoscopy uploaded from NHS sites.
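For readers unfamiliar with the Wilson method used for the unadjusted PPVs, the following is a minimal sketch of the standard Wilson score interval for a binomial proportion. It is not the study's Stata code, and the function names are hypothetical.

```python
from math import sqrt

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def wilson_interval(events, n, z=Z_95):
    """Wilson score interval for the proportion events / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = events / n
    denominator = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denominator
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denominator
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


def ppv_with_ci(events, n):
    """Unadjusted PPV (events / n) with its 95% Wilson interval."""
    lower, upper = wilson_interval(events, n)
    return events / n, lower, upper
```

A call such as `ppv_with_ci(15, 1000)`, using hypothetical counts of 15 cancers among 1000 procedures, returns the point estimate together with its interval. Unlike the simple Wald interval, the Wilson interval stays within 0 and 1 and behaves sensibly for the small proportions typical of cancer PPVs.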

| Approvals
As no patient identifiable information was used, this project was assessed as not requiring ethical approval by the North Tees and Hartlepool NHS Foundation Trust Caldicott guardian.
The most frequent indication for colonoscopy was rectal bleeding (28.1%), followed by altered bowel habit (19.9%) and anaemia (17.6%). Colonoscopy performed at independent sector sites was on younger patients (median age 61 vs. 50, p < 0.01) and a higher proportion was performed to investigate abdominal pain or constipation (30.9% vs. 14.1% at NHS sites, p < 0.01). Most symptomatic flexible sigmoidoscopy was performed to investigate rectal bleeding (71.1%).

TABLE 1 Explanatory and outcome variables.

Symptom groups Hierarchy Description
Rectal

Note: The symptom groups were mutually exclusive and each procedure could only be recorded in one category. The diagnosis groups were not mutually exclusive, meaning that each procedure could be classified under multiple diagnosis groups, except for the 'normal' category.
Two-thirds of symptomatic colonoscopies (66.5%) were reported as normal or identifying only diverticulosis; this applied to 71.8% of those performed among female patients compared to 60.4% in males.
While the adjusted PPV (aPPV) for cancer exceeded 3% among male patients over 50 and female patients over 60 with indications of rectal bleeding or anaemia, most symptomatic colonoscopies (52.1%) were carried out in patient groups where the aPPV for cancer was under 1%. This included 62.5% of colonoscopies performed on female patients and 39.8% of those performed on male patients.
The findings of the secondary analysis -when consideration was restricted to procedures conducted in NHS sites only -were largely the same as the primary analysis (Table S2).
The aPPVs for additional symptom combinations are shown in Table S1.

| Multivariable analyses: Flexible sigmoidoscopy
Increasing patient age and male sex were associated with increased aOR of both large polyp and cancer (Table 6).
Rectal bleeding was associated with increased risk of large polyp (aOR: 2.0, 95% CI:

As over half of lower GI endoscopies in the UK are performed to investigate lower GI symptoms, alterations to this pathway hold great potential for rationalising endoscopy capacity. Effective triage of symptomatic endoscopies requires the identification of patients at higher risk of both cancer and IBD (delays in either diagnosis lead to poorer patient outcomes 19,20 ) while minimising investigation for low-yield symptoms. A recent meta-analysis of outcomes from symptomatic colonoscopy - including 31 studies with 45,000 patients in total - concluded that rectal bleeding and anaemia were the most practical alarm symptoms for cancer; however, analysis was limited by heterogeneity between study populations and wide variability in cancer prevalence. 21 The findings of our analysis - which included eight times the number of procedures - were consistent with this.
Most studies on cancer yield from colonoscopy have analysed screening colonoscopy, with large European studies on non-enriched (i.e. not triaged using pre-colonoscopy stool testing) patients identifying cancer PPV of 0.5-0.8% in those aged 55-75. 22,23 Our study's overall cancer PPV of 1.5% suggests, unsurprisingly, that those with lower GI symptoms are at higher overall risk of cancer than the general population. However, this higher PPV is largely based on the strength of rectal bleeding and anaemia as predictors of cancer; the PPVs for other lower GI symptoms (which ranged from 0.4% for constipation to 0.8% for weight loss) were similar to those expected within non-enriched screening populations. The PPVs of altered bowel habit and weight loss were lower than previously reported, 24,25 with the PPV for rectal bleeding at colonoscopy over three times that for altered bowel habit, differing from previous studies showing similar cancer yields. 24,26 A potential explanation for our different findings is that promoting altered bowel habit as an alarm symptom in urgent cancer pathways (as was done in the English guidelines 14 ) might have altered referral thresholds for a largely subjective symptom common in the general population, 27 leading to a higher volume of lower risk referrals than in the past. 28 The well-established strong associations between patient age and sex and incidence of cancer 29 were reflected in the PPVs reported. Male patients were at higher risk of cancer than females, with cancer PPVs for females roughly equating to those for males 10 years younger. Yet paradoxically, almost 55% of symptomatic endoscopy was performed on females, with 62.5% of colonoscopies on females performed on those with cancer PPV under 1% (compared to 39.8% among males).
Colonoscopies at independent sites recorded lower cancer and large polyp detection rates compared to NHS sites. Differences in case-mix likely explain much of this variation, but given the higher post-colonoscopy colorectal cancer rates previously observed in the independent sector, 30 alongside updated analysis emphasising colonoscopy quality's role in reducing these rates, 31 further investigation is necessary.
Only 3.2% of colonoscopies reported large polyps, and these accounted for only 9% of all polyps. Although the distribution of polyps based on patient age-groups and sex aligned with earlier screening studies, the rate of large polyp reporting was approximately half of what was previously documented. 32,33 This is likely due to differing definitions: in our study, we were only able to use size as the criterion for an advanced polyp, whereas other studies focus on adenomas and include small polyps with tubulovillous histology or high-grade dysplasia. Analysis again identified rectal bleeding and anaemia as the symptoms most strongly predicting large polyps, with a strong age-gradient - although less strong than that seen in cancer, consistent with previous studies. 34 National referral guidelines in England have recently downgraded several low-risk symptoms towards non-urgent or no investigation 14 : our results support this and should assist ongoing review of referral criteria, leading to more tailored guidance for clinicians. Currently, the most effective solution for determining which patients with lower GI symptoms require endoscopy, as well as the most appropriate initial examination, is the use of biomarkers. In the lower GI tract, established biomarkers are now available to help predict the risk of both IBD and cancer. Regarding suspected IBD, faecal calprotectin can reliably differentiate between IBD and IBS: using a calprotectin cut-off of 50 μg/g in adult patients suspected of having IBD reduces the colonoscopy requirement by two-thirds, with high negative predictive value. 35 For cancer, faecal immunochemical testing (FIT) has been used as a screening test within the English BCSP since 2020 36 and BSG guidelines have recently been published advising its routine use within the symptomatic population. 37 Incorporating FIT into the symptomatic population improves prediction of cancer, 38 while using a FIT threshold of 10 μg Hb/g in a symptomatic population would reduce endoscopy volume with a sensitivity of 87% and specificity of 84%. 39 The most common reason for flexible sigmoidoscopy in patients was isolated rectal bleeding, where flexible sigmoidoscopy is often preferred to colonoscopy due to better tolerance, lower risk of complications, and the high proportion of left-sided pathology. 40,41 However, this paradigm needs to be revisited with the introduction of FIT-testing, 42 as perhaps a better algorithm might be to perform a full colonoscopy for those who are FIT-positive, whereas FIT-negative patients could be triaged to lower risk examinations such as rectoscopy, or no investigation, reducing pressure on endoscopy services. 43
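To illustrate how the FIT sensitivity and specificity quoted above translate into predictive values, the sketch below applies Bayes' theorem. The sensitivity (87%) and specificity (84%) are the figures cited in the text; the 1.5% prevalence simply reuses this study's overall colonoscopy cancer PPV as a stand-in for pre-test cancer risk, which is an assumption made for illustration only and not a result of the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV, fraction testing positive) via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv, true_pos + false_pos


# Sensitivity and specificity as quoted for a 10 ug Hb/g threshold.
# The prevalence of 1.5% is an illustrative assumption (see lead-in).
ppv, npv, positive_fraction = predictive_values(0.87, 0.84, 0.015)
print(f"PPV {ppv:.1%}, NPV {npv:.1%}, FIT-positive {positive_fraction:.1%}")
```

The third return value approximates the share of symptomatic patients who would proceed to endoscopy if only FIT-positive patients were investigated, which is the quantity relevant to capacity planning. Different prevalence assumptions (for example, by age group or symptom) can be substituted directly.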

| Strengths and limitations
The primary strength of the study is that it utilised real-world data collected over the course of 1 year, providing a comprehensive picture of national endoscopy activity. The study period was chosen to avoid the influence of the COVID-19 pandemic on endoscopy services. 44 However, some endoscopy sites did not upload to the NED, which could potentially introduce bias as these non-uploading sites may have

| CONCLUSION
In conclusion, our national analysis highlights the variation in cancer risks from common GI symptoms and the substantial proportion of endoscopy that is devoted to low-yield patient cohorts, resulting in inefficient utilisation of endoscopy capacity.
of each diagnosis group (alongside the PPV of those reporting only diverticulosis or normal findings) within each age group (categorised as 18-39, 40-49, 50-59, 60-69, 70-79 and 80-99 years old) was determined, and further broken down by patient sex.

FIGURE 1 Flowchart illustrating creation of analysis dataset. All exclusions are shaded in grey; those included in the study are shown in green.
Hartlepool NHS Foundation Trust Caldicott guardian. The Research and Development Department at North Tees and Hartlepool NHS Foundation Trust approved the project. The analysis was supported by the Joint Advisory Group for GI endoscopy and the British Society of Gastroenterology (BSG).

Grouping of diagnoses based on severity
Diagnosis groups Description/NED terms included in diagnosis group

bleeding 1 Indications include rectal bleeding, either alone or in combination with other symptoms

Characteristics of patients and symptoms present during procedures performed to investigate symptoms in both NHS and independent sector sites.

Proportion of symptomatic procedures recording each diagnosis group, by patient age-group and sex (presented as unadjusted positive predictive values (PPV) with 95% confidence intervals).
gated rectal bleeding, which exhibited the highest positive predictive values for large polyp and cancer, almost half were performed on patients under 50 years. Patient age and sex were strongly related to the risks of large polyps and cancer. However, a third of procedures were conducted on patients under 50 years old, where yield of significant pathology was low.

TABLE 2
a Symptom groups are hierarchical, as per Table 1, and each procedure can only be included in a single group.

TABLE 3
Note: (a) Results are not hierarchical, so each colonoscopy could be included in multiple diagnostic categories (with the exception of normal). (b) Results are not hierarchical, so each procedure could include multiple diagnoses.
Mixed-effects logistic regression of the association of reporting large polyps and cancer from colonoscopy by patient age group, patient sex, and symptoms (with endoscopist variation as random effect): numbers of procedures and cancers, adjusted odds ratios (aOR), with 95% CIs and p values.
TABLE 4
Note: ORs mutually adjusted for variables in the table. Symptom groups defined as per Table 1.

Adjusted positive predictive values (PPVs) for identifying large polyps and cancer from symptomatic colonoscopy, by symptom, age group and patient sex, based on modelled output from regression analysis: overall and by sex. Color denotes pathology risk, with green as lowest and red as highest.

Mixed-effects logistic regression of the association of reporting large polyps and cancer from flexible sigmoidoscopy by patient age group, patient sex, and symptoms (with endoscopist variation as random effect): numbers of procedures and cancers, adjusted odds ratios (OR), with 95% CIs and p values.
Note: ORs mutually adjusted for variables in the table. Symptoms defined as per Table 1.

however, a post-hoc sensitivity analysis revealed the indications and cancer PPV for these systems following exclusions closely resembled other systems reliably recording diagnosis, hence it is unlikely to have introduced substantial bias. The study demonstrates the utility of accurate and accessible national data, being able to assess both the PPV for cancer and the volume of endoscopies performed to investigate GI symptoms, within patient age groups and by sex. These will change over time, influenced by referral guidelines, growth in use of FIT, endoscopy capacity and population demographics. By monitoring trends, policymakers can proactively appraise and improve pathways and help ensure optimal use of resources. The second iteration of NED is currently being rolled out, 13 and updated data fields, such as FIT level and cancer location, will enable the development of, or assessment of, real-world impact of more complex risk prediction models incorporating results of pre-endoscopic screening tests.
Adjusted positive predictive values (PPVs) for identifying large polyps and cancer from symptomatic flexible sigmoidoscopy, by symptom, age group and patient sex, based on modelled output from regression analysis: overall and by sex.Color denotes pathology risk, with green as lowest and red as highest.
Triaging younger patients with low-risk symptoms towards conservative management, or better risk assessment including pre-endoscopic screening tools such as FIT testing, would permit more effective targeting of our endoscopic resource towards those most likely to benefit, improving population health and helping to mitigate the predicted increase in colorectal cancer mortality due to COVID-related diagnostic delays.

TABLE 7