Anastomotic leak in colorectal cancer surgery: Contribution of gut microbiota and prediction approaches

To monitor prospectively the occurrence of colorectal anastomotic leakage (CAL) in patients with colon cancer undergoing resectional surgery, characterizing the microbiota in both faeces and mucosal biopsies of anastomosis. In a second stage, we investigated the ability to predict CAL using machine learning models based on clinical data and microbiota composition.


INTRODUCTION
Despite the recent methodological advances that have been incorporated into colon cancer surgery, colorectal anastomotic leakage (CAL) remains the leading cause of morbidity and mortality following surgery for colorectal cancer, with an overall incidence of 7%-8% [1][2][3]. Impaired healing of the intestinal epithelium results in leakage of the intestinal contents into the peritoneal cavity, requiring antibiotic treatment (leakage severity categories a and b) and, in the worst cases, emergency reoperation (leakage severity category c) [4]. The occurrence of CAL is strongly associated with shorter survival [5].
Although the specific CAL triggers remain unknown, numerous human and surgical factors have been identified [2,6]. The critical surgical aspects are good blood flow, a tension-free connection and carefully executed mucosal apposition, using either a hand-sewn or a stapled technique [7]. However, the rates of CAL have not decreased despite numerous efforts to minimize both tissue trauma and postoperative ileus [8], and opinions remain divided regarding the use of prophylactic antibiotics [9,10].
The possible involvement of the gut microbiota in the development of CAL has also been postulated [11], with reported evidence of the predominance of Enterococcus and Pseudomonas in CAL drainage fluid [12]. Although both genera are common commensals in healthy individuals, it has been hypothesized that under stressful conditions, such as surgery, some virulence factors that are usually silent could be overexpressed [13]. Genes encoding the collagenase GelE or matrix metalloproteinase 9 (MMP-9) for collagen degradation are frequently detected in Enterococcus [14], whereas in Pseudomonas aeruginosa, the lasB and prpL genes have been linked to collagen digestion [15]. For Streptococcus gallolyticus subsp. gallolyticus, a classical microorganism associated with colorectal cancer, elevated adhesion to collagen has been reported [16]. Recent studies have highlighted the contribution of bacterial proteins to collagen degradation. In 2019, Guyton et al. proposed a simple, non-invasive and inexpensive presurgical test to help predict CAL before surgery; however, it has not yet been incorporated into clinical practice [17].
Most data on the microbiota in CAL have been obtained using bacteria culturable from faeces or drainage fluids. However, massive sequencing of 16S rDNA amplicons allows the detection of non-culturable microorganisms, which are the most abundant microorganisms in the colon. Moreover, faeces is not the most appropriate sample for the study of CAL because it contains DNA from microorganisms throughout the digestive tract, diluting possible local pathogens that are only present around the wound. The aims of this study were to monitor prospectively the occurrence of CAL in patients with colon cancer undergoing resectional surgery that included placement of an anastomosis and to characterize the culturable and non-culturable microbiota in both faeces and mucosal biopsy samples. In a second stage, we tested the ability of machine learning models to predict CAL, using clinical data and data on microbiota composition in faeces and biopsy tissue.

Patients and sampling
A total of 111 patients with colorectal cancer attending our institution between 2018 and 2019 were recruited after signing the informed consent approved by the Ethics Committee, reference 030/17 (Table 1). We aimed to recruit at least 10 patients with CAL, in accordance with the prospective design of the study. Anthropometric and clinical variables for each patient were retrieved from their clinical records. Each patient provided a faecal sample 2-3 days prior to surgery, and biopsy samples from proximal and distal sites in the healthy margins of the tumour were taken from the tumour piece after surgery. All samples were immediately frozen at −80°C after collection. Based on their clinical evolution, we differentiated patients who developed CAL (CAL group) from an equal number of patients with satisfactory evolution matched for sex, age and body mass index (BMI), who constituted the control group.

Bacterial isolation, identification and characterization
Faeces and mucosal biopsy samples were cultured on Columbia agar + 5% sheep blood (BioMerieux), M-Enterococcus agar (Difco) and MacConkey agar (Difco) at 37°C, with and without 5% CO2, for 48 h. We identified the resulting colonies using matrix-assisted laser desorption/ionization coupled to a time-of-flight mass spectrometer (MALDI-TOF) (Bruker), complemented with amplification of 16S rDNA or sodA by PCR and sequencing with BLAST confirmation. The genetic relationship of the isolates was determined using pulsed-field gel electrophoresis (PFGE), with specific conditions for each species. Phenotypic production of both collagenase and protease activity in bacteria isolated from faeces or mucosal biopsy samples was tested in Minimal Medium (Sigma) containing 100 μg/L of collagen or 10% sterile skimmed milk, supplemented with 5% human blood for streptococcal isolates. Carriage of genes encoding the virulence factors aggregation substance (agg), enterococcal surface protein (esp), hemolysin (cylA), collagen adhesin

What does this paper add to the literature?
This is the first work to include biopsies of surgical margins in the study of the pathogenesis of anastomotic fistulas. We have corroborated the role of collagenase and protease expression by Enterococcus, also demonstrating that the pathogen is not present in the colonic location before surgery. Our hypothesis is that it is a regular inhabitant of the small intestine, which only colonizes the anastomosis when collagen is produced. Overall, our study provides new data on the pathogenesis of anastomotic fistula and on future therapeutic approaches with antibiotics.

Bacterial composition according to 16S rDNA massive sequencing
After a slow defrost at −20°C for 24 h and then another at 4°C for 24 h, the biopsy samples were completely dissolved in proteinase K. In addition, 0.5 g of faeces was solubilized in 5 mL of sterile water. DNA from biopsy and faecal samples was extracted using the Speedtools tissue DNA extraction kit (Biotools), and the V3 and V4 regions of the 16S rDNA gene were sequenced (2 × 300 bp) on a MiSeq platform (Illumina). For gene library preparation, we employed the 16S Metagenomic Sequencing Library Preparation protocol (Illumina; Cod. 15044223 Rev. A). Sequence quality control was performed using DADA2 (https://qiime2.org/), with a rarefaction of 12,000 sequences per sample. Taxonomy was assigned to the amplicon sequence variants using the Silva_132 classifier. Alpha and beta diversity studies were performed using the q2-diversity add-on of QIIME2 [19], after normalizing the samples by rarefaction (subsampling without replacement). Linear discriminant analysis Effect Size (LEfSe) was performed to assess which taxa explained the differences between groups [20]. Sequence data were deposited in GenBank (BioProject PRJNA871042).
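The rarefaction and diversity steps described above can be illustrated with a minimal standard-library sketch. It is not the QIIME2 implementation: the function names (`rarefy`, `shannon`, `bray_curtis`), the taxon counts and the use of log base 2 for the Shannon index are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from {taxon: read count}."""
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    if len(reads) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    return Counter(rng.sample(reads, depth))


def shannon(counts):
    """Shannon index in bits (log base 2 is an assumed convention here)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values() if n > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {taxon: count} profiles."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    den = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return num / den


# Hypothetical read counts for two samples (not data from the study).
rng = random.Random(0)
sample_a = rarefy({"Enterococcus": 20000, "Ruminococcus": 5000}, 12000, rng)
sample_b = rarefy({"Enterococcus": 1000, "Ruminococcus": 30000}, 12000, rng)
print(shannon(sample_a), shannon(sample_b), bray_curtis(sample_a, sample_b))
```

Rarefying both samples to the same depth before computing the indices keeps the comparison from being driven by differences in sequencing depth.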

Machine learning prediction
The accuracy of various predictive models was explored using algorithms such as logistic regression (LR), random forest, k-nearest neighbours, neural networks (NN) and support vector machines, on the following two data sets: bacterial abundance in biopsy samples and faeces; and available clinical data. Both data sets were randomly divided into training and test sets. Model performance was evaluated using precision and accuracy, and we paid special attention to the f1-score, which allowed us to fine-tune the precision of CAL detection. Given that our data sets are considered "small" from a machine learning point of view, we averaged our results over 500 experiments (each dividing the data sets into training and test sets). All models were implemented in Python 3.8.10, using the scikit-learn package.
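The evaluation scheme described above (repeated random train/test splits with per-class f1-scores averaged over the repetitions) can be sketched as follows. This is a standard-library illustration, not the authors' scikit-learn code: `midpoint_rule` is a hypothetical single-feature placeholder standing in for a real classifier, and the stratified split and 30% test fraction are assumptions.

```python
import random
from statistics import mean


def f1_per_class(y_true, y_pred, label):
    """F1-score for one class: harmonic mean of precision and recall."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def stratified_split(y, test_fraction, rng):
    """Random split that keeps at least one sample of each class in the test set."""
    test = set()
    for label in sorted(set(y)):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        test.update(idx[:max(1, round(len(idx) * test_fraction))])
    train = [i for i in range(len(y)) if i not in test]
    return train, sorted(test)


def repeated_f1(X, y, fit_predict, n_repeats=500, test_fraction=0.3, seed=0):
    """Average the per-class F1 over `n_repeats` random train/test splits."""
    rng = random.Random(seed)
    scores = {0: [], 1: []}  # 0 = no CAL, 1 = CAL
    for _ in range(n_repeats):
        tr, te = stratified_split(y, test_fraction, rng)
        y_pred = fit_predict([X[i] for i in tr], [y[i] for i in tr],
                             [X[i] for i in te])
        y_true = [y[i] for i in te]
        for label in scores:
            scores[label].append(f1_per_class(y_true, y_pred, label))
    return {label: mean(values) for label, values in scores.items()}


def midpoint_rule(X_train, y_train, X_test):
    """Hypothetical placeholder classifier: threshold between the class means."""
    m0 = mean(x for x, t in zip(X_train, y_train) if t == 0)
    m1 = mean(x for x, t in zip(X_train, y_train) if t == 1)
    cut = (m0 + m1) / 2
    return [int((x > cut) == (m1 > m0)) for x in X_test]


# Synthetic, perfectly separable feature for 101 controls and 10 CAL cases.
X = [0.0] * 101 + [10.0] * 10
y = [0] * 101 + [1] * 10
print(repeated_f1(X, y, midpoint_rule, n_repeats=50))
```

Reporting the f1-score separately for each class matters with so few CAL cases, because overall accuracy can stay high even when the minority class is predicted poorly.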

Statistical analysis
Continuous variables were expressed as median (SD). Statistical differences in the continuous variables were determined using the Student's t-test (for data following a normal distribution) or the Mann-Whitney U test (for data following a non-normal distribution). For categorical variables, the chi-squared test was used. Values of p < 0.05 were considered statistically significant.
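As an illustration of the categorical comparison mentioned above, the following sketch computes a Pearson chi-squared test for a 2 × 2 table using only the standard library. The function name `chi2_2x2`, the absence of a continuity correction and the example counts are assumptions; the software actually used for the statistical analysis is not stated in the text.

```python
import math


def chi2_2x2(table):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    Assumes every row and column total is non-zero.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, r in enumerate(row_totals):
        for j, k in enumerate(col_totals):
            expected = r * k / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows = CAL / control, columns = factor present / absent.
chi2, p = chi2_2x2([[10, 20], [20, 10]])
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```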

Patients and surgical data
The relevant characteristics of patients are given in

Streptococcus parasanguis 4
Total 26
Note: The control group was 10 patients with an anastomosis healing correctly after colon cancer surgery; patients in the control group were matched to patients in the CAL group according to sex and age.

TABLE 2
Identification of bacterial species in patients with colorectal anastomotic leakage (CAL) and controls, and the possible contribution of these bacteria to the development of CAL.

16S rDNA massive sequencing
Alpha diversity values obtained for biopsy and faecal samples were comparable in both groups, and some statistical differences were detected (Figure 3). The most relevant result was the difference in alpha diversity values of the proximal biopsy samples between CAL and non-CAL groups. In addition, as expected, the alpha diversity of the faecal samples was higher than that of the biopsy samples. Regarding beta diversity, Bray-Curtis analysis demonstrated a significant difference between biopsy samples from the proximal (p = 0.048) and distal (p = 0.014) healthy surgical margins of the tumour piece when comparing patients with and without CAL (Figure 3). However, when phylogenetic distances were considered according to Weighted and Unweighted UniFrac, no significant differences were found.
Data from all faecal samples were investigated using LEfSe; for comparison of proximal and distal biopsy samples, however, only data from patients with CAL (n = 5) and controls (n = 40) undergoing right hemicolectomy were considered. This criterion was established in view of the differences in the composition of the microbiota between the ascending and descending colon (Figure 4). The LEfSe detected significant differences between faecal and biopsy samples,

Prediction of CAL using machine learning
All 16S rDNA data obtained from biopsy samples, as well as stool and clinical data, were included in machine learning analyses; however, the data sets were separately studied including a binary classification: CAL and non-CAL. The algorithms with the best results were LR and NN; we finally selected the LR method because by using this method we can obtain the most significant coefficients to fit the model, giving us greater interpretability. All results were averaged over 500 experiments (Figure 5). In the clinical data set, based on our best set of parameters (C = 100, penalty = 'l1', i.e., a Lasso regression), we obtained f1-score values of 96% for no CAL and 67% for CAL (see the patient confusion matrix). In the

FIGURE 4
Linear discriminant analysis Effect Size (LEfSe) based on Linear Discriminant Analysis (LDA) was used to investigate the composition of the bacterial microbiota in all patients (CAL and control) and samples (for all patients, faeces and biopsy samples taken from proximal and distal healthy margins of the tumour were obtained). Colour was used to differentiate patients with colorectal anastomotic leakage (CAL, red) from those with adequate healing of the anastomosis (Control, green).

FIGURE 5
Prediction of colorectal anastomotic leakage (CAL) by a machine learning approach using both clinical data and microbiota composition. The most accurate results were obtained for predicting non-CAL using the microbiota data (56 vs. 4).

DISCUSSION
The rates of CAL have remained stable worldwide over the last century, and deciphering its aetiology remains a challenge [1][2][3]. The first evidence of bacterial involvement in CAL pathogenesis was published in 1955 [9]; as a consequence, prophylactic oral antibiotic therapy was promoted as part of the preparation for gastrointestinal surgery until the mid-1980s. Subsequently, the intravenous route was used to administer antibiotics, to avoid the generation of multidrug-resistant bacteria in the normal microbiota of the digestive tract. Recent studies have once again focused on the benefit of administration of antibiotics via the oral route [21][22][23], as well as on impregnation of the surgical margins of the anastomosis with antibiotic during surgery or using slow-release antibiotic devices [24]. Reducing antibiotic consumption is crucial for preventing the selection and spread of antibiotic-resistant microorganisms; however, the surgical outcome should never be compromised, and there remain many controversial opinions on this point [25]. Preventing the occurrence of CAL is also an antibiotic-saving strategy, because in the case of emergence of antibiotic-resistant microorganisms following surgery the patient will need to be treated more aggressively to eradicate the pathogens after the redo-surgery. More studies are needed to demonstrate the advantages of oral and/or local administration of antibiotics versus administration via the intravenous route. According to our results, the microbiota involved in CAL is not strictly present at the site before surgery; its development appears to be related to the production of collagen during healing after surgery. Consequently, antibiotic prophylaxis might only be effective if antimicrobials remain at an effective concentration at the anastomotic site for 5-10 days after surgery.
The rate of CAL reported in the present study (9%) is comparable with the rates reported by other authors [2,26]. The two regions of gut on which we focused regarding CAL were the caecum and sigmoid, in which the oxygen gradient is critical for bacterial growth and for which variations in oxygen could be associated with a stressful environment. This is consistent with the hypothesis that it is only necessary to be colonized by a species of bacteria with a particular virulence factor (mainly collagenase) for CAL to develop; however, the influence of the local tissue environment on expression of virulence factors by microorganisms has not been sufficiently evaluated.
Across studies, the consistent predictive clinical features for CAL are male sex, BMI >35, poor nutritional status, lower anastomosis and the number of staples (>3) used to close the surgical site [27].
Elevated serological CRP levels are directly related to the development of CAL [28], and elevated serological CRP had the best predictive power in our series on day 5; interestingly, however, at day 3 it was just the opposite.
The classical bacterial pathogens implicated in CAL belong to the genera Enterococcus, Pseudomonas and Bifidobacterium [12,14][29][30][31], and in the present study, all were identified by traditional cultivation methods in the anastomotic drainage fluid or in the faeces. The introduction of 16S rDNA sequencing techniques revealed a much more complex ecosystem than was initially expected [12,32,33]. Previous studies using this technology indicated an abundance of Lachnospiraceae in patients with CAL, whereas Prevotella copri and the Streptococcus genus were negatively correlated [30,34,35]. However, those studies were performed on faeces, which contain the microbiota of the entire digestive tract. E. faecalis is not directly related to collagen degradation but contributes to prolonged inflammation by proinflammatory cytokine production [14]. In our series, the enterococcal isolates from patients with CAL were significantly more virulent than those obtained from patients in the control group with no CAL. Significant differences between patients with CAL and those in the control group were only found for the gelE and agg genes; nevertheless, environmental factors such as oxidative stress [37] or hyperoxia [38] should be considered, given that site-specific environmental factors can promote or silence the expression of bacterial virulence factors.
The prediction of CAL based on clinical features and/or composition of the microbiota could be used to optimize antibiotic prophylaxis and, more importantly, to prevent development of CAL by active monitoring.In that sense, machine learning tools can contribute to treatment optimization, as previously demonstrated [6].
In our study, clinical data and composition of the microbiota were combined to increase the predictive power; however, after testing all available algorithms, the best results were obtained with both data sets separately.Each analysis was performed 500 times, thus increasing the predictive power.It has been proposed that the increase of Escherichia coli or Enterococcus spp. in drainage fluid should be monitored by PCR [39]; however, this is only possible 2-4 days after surgery, when CAL has already manifested [40].The usefulness of quantitative PCR for gelE and agg in faeces prior to surgery, and of course phenotypic collagenase testing using selective media, as already proposed by Guyton et al., should be validated in other cohorts [17].
Our work has limitations. The most relevant limitation is the lack of culturable Pseudomonas spp., and although we cannot rule out that this is a particular characteristic of our patients as the 16S rDNA counts were extremely low, our suspicion is that the freezing process limits their viability. This genus was also not relevant in the machine learning analysis, reinforcing the hypothesis that it did not contribute to CAL in our patients. Another limitation was the small number of patients with CAL who could be included in the machine learning analysis: a much larger number is required to obtain conclusive results. Finally, as almost all patients received the same prophylactic antibiotic, we did not measure its effect on microbiological cultures, as 16S rDNA sequencing detects their DNA.
On the other hand, the strengths of this study were employing both culture-dependent and culture-independent techniques, phenotypic and genotypic bacterial typing, analysis of faeces and mucosal biopsies, and predictive analysis.We must also bear in mind that patients with colon cancer already have a different microbiota, and accordingly we must always compare data following colon cancer surgery between those who develop CAL and those who do not, because a control group comprising patients who do not have colon cancer is not appropriate.
Biopsy samples from surgical margins are more appropriate than samples of faeces for exploring involvement of the intestinal microbiota in CAL.The levels of enterococci are only increased in the anastomosis after surgery, and enterococcal collagenases and proteases are involved in the degradation of the anastomotic scar.
Based on machine learning models, the occurrence of CAL can be

confirming the importance of sample selection in microbiota studies. The CAL and control groups presented relevant particularities in the composition of their microbiota detected in the biopsy samples taken from proximal and distal sites of the tumour piece, highlighting the higher abundance of bacteria from the phylum Actinobacteria in the control group. Contrary to expectations, the abundance of Pseudomonas spp. in biopsy samples from the distal healthy margin of the tumour was associated with adequate healing. Lastly, no significant association with classical CAL pathogens, such as those from the order Enterobacterales, or enterococci, was detected.
The exudate recovered from the drainage tube of two patients with CAL was analysed using 16S rDNA massive sequencing. More than 60 different microorganisms were identified: Ruminococcus gnavus (49%) and Enterococcus spp. (15%) were the most prevalent microorganisms in one patient, whereas Enterococcus spp. were most prevalent (accounting for 85% of the total microorganisms) in the other patient. In both patients, the proportions of these microorganisms were low in the faecal samples and in biopsy samples taken from the healthy distal and proximal margins of the tumour before surgery: the proportions of Enterococcus spp. were less than 0.1% in all samples from both patients. In both patients, the proportions of Ruminococcus were 11% in faecal samples and 12% in biopsy samples taken from the healthy proximal margin of the tumour; the level of Ruminococcus was only 1% in the biopsy sample from the healthy distal margin of the tumour taken from the patient who had Ruminococcus in the exudate, whereas in the other patient Ruminococcus was almost undetectable.

FIGURE 2
(A) Phenotypic production of protease (following culture in Minimal Medium plus 10% sterile skimmed milk) or collagenase (following culture in Minimal Medium plus 10% collagen). Circles indicate bacteria growing in minimal medium with milk (for bacteria expressing gelatinase) and in supplemented minimal medium (for collagenases, which are clearly visible as blue colonies). (B) Summary of data obtained for Enterococcus faecalis and Enterococcus faecium, including phenotypic results and detection, by PCR, of the virulence factor genes expressed. CAL, patients with clinical anastomotic leakage; Control, patients with no clinical anastomotic leakage. The y-axis is calculated from the total number of isolates of each species; for example, E. faecalis has 5 isolates in CAL, so 100% on the axis corresponds to 5 strains. (C) Dendrograms of the genetic relationship among E. faecalis and E. faecium; data were obtained using pulsed-field gel electrophoresis (PFGE).

FIGURE 3
Alpha diversity: Shannon index values of faecal and biopsy samples among patients with colorectal anastomotic leakage (CAL) and controls (no CAL). *p < 0.05, **p < 0.01, and ***p < 0.001. Beta diversity: principal coordinate analysis (PCoA) based on Bray-Curtis distances representing all patients and samples and differentiated by colour between CAL (red) and adequate healing of the anastomosis (no CAL) (green).
bacteria data set, with the parameters C = 1 and penalty = 'l2' (i.e., a ridge regression), we obtained f1-scores of 95% and 50% for no CAL and CAL, respectively (see the bacteria confusion matrix). The most significant coefficients for the clinical data set were obtained for 3-day C-reactive protein (CRP) and diabetes (negative values) and for 5-day CRP and cardiopathy (positive values). The most significant coefficients for the bacteria data set were those of Acinetobacter and Gordonibacter (negative values) and of Dielma and Elusimicrobium (positive values). All these genera are minority taxa, so their predictive value is conditioned by their low abundance.
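For orientation, the C and penalty settings quoted above correspond to the regularized logistic regression objective as documented for scikit-learn's LogisticRegression. The notation below is ours rather than the paper's: w is the coefficient vector, b the intercept, x_i the feature vector of patient i, and y_i ∈ {−1, 1} encodes non-CAL/CAL.

```latex
% penalty = 'l1' (Lasso-type), as reported for the clinical data set (C = 100)
\min_{w,\,b}\; \lVert w \rVert_1 + C \sum_{i=1}^{n} \log\!\left(1 + e^{-y_i\,(x_i^{\top} w + b)}\right)

% penalty = 'l2' (ridge-type), as reported for the bacteria data set (C = 1)
\min_{w,\,b}\; \tfrac{1}{2}\, w^{\top} w + C \sum_{i=1}^{n} \log\!\left(1 + e^{-y_i\,(x_i^{\top} w + b)}\right)
```

In this formulation C is the inverse of the regularization strength, so a smaller C shrinks the coefficients more strongly. The l1 penalty can set coefficients exactly to zero, which is one reason the remaining non-zero coefficients are convenient to interpret.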

Of the 111 patients recruited, 62 (55.8%) were men with a median (range) age of 72.7 (50-94) years, and 49 (44.2%) were women with a median (range) age of 74.2 (52-87) years. All patients received mechanical bowel preparation. The prophylactic antibiotic used in the surgery was amoxicillin-clavulanate in 104 (93.6%) patients and ciprofloxacin with metronidazole and piperacillin-tazobactam in the remaining patients. All surgeries were scheduled in advance. Tumour location was mostly in the sigmoid colon (42.3%), followed by the caecum (25.2%) and the ascending colon (10.8%), the remaining locations being uncommon (Figure 1). The anastomoses were performed using staples with hand-sewn reinforcement in 58 (52.2%) patients, staples only in 31 (27.9%) patients and hand-sewn only in 22 (19.8%) patients; 63 (56.7%) anastomoses were side-to-side, 43 (38.7%) were end-to-end and 5 (4.5%) were side-to-end. Laparoscopic intervention was performed in 57 (51.3%) patients, and a drainage tube was left in 94 (84.6%). Ultimately, 10 (9.0%) patients had CAL, for whom no relevant clinical or surgical differences from patients in the control group were found (Table 1).

TABLE 1
Characteristics of patients with and without colorectal anastomotic leakage (CAL).
Note: Values are given as median ± SD, n (%), n or %.
Abbreviations: BMI, body mass index; ASA, American Society of Anesthesiologists; ATB, antibiotics; CRP, C-reactive protein.

FIGURE 1
Schematic representation of the operations carried out in patients included in the present study, showing tumour location, occurrence of colorectal anastomotic leakage (CAL) and type of surgery performed.

10) Relevant results related to CAL
Enterococcus faecalis 6 Lower production of haemolysin Higher production of collagenase, gelE (four in CAL vs. one in control) and agg (four in CAL vs. three in control) genes
Enterococcus faecium 6 Lower gelE (two in CAL vs. three in control) and agg (two in CAL vs. none in control) genes Higher production of protease
Enterococcus avium 1

Pseudomonas spp. were not obtained in the microbiological cultures of faeces and biopsy samples, whereas 36 colonies (22 from patients with CAL and 14 from patients in the control group) were identified as Enterococcus spp. and 14 (two from patients with CAL and 12 from patients in the control group) as Streptococcus spp. The carriage of virulence factors, and the production of both collagenase and protease, were determined in all isolates. Significant differences between groups were only found for gelE (80% vs. 16% in Enterococcus faecalis, and 20% vs. 50% in Enterococcus faecium) and for agg (80% vs. 16.6%) in E. faecalis isolates, whereas other relevant data were not found for minor species (Table 2), except for the protease activity of S. gallolyticus subsp. gallolyticus. Two additional E. faecalis isolates were recovered from the CAL drainage fluid of two patients; both of these isolates carried gelE and agg and expressed collagenase activity. The PFGE analysis demonstrated no clonal origin among the isolates (Figure 2).