Survival prediction by Bayesian network modeling for pseudomyxoma peritonei after cytoreductive surgery plus hyperthermic intraperitoneal chemotherapy

Abstract Objectives: To establish a survival prognostic model for pseudomyxoma peritonei (PMP) treated with cytoreductive surgery (CRS) plus hyperthermic intraperitoneal chemotherapy (HIPEC) based on a Bayesian network (BN). Methods: A total of 453 PMP patients were included from the database at our center. The dataset was divided at a ratio of 8:2 into a training set to establish the BN model and a testing set to perform internal validation. In the training set, univariate and multivariate analyses were performed to identify independent prognostic factors for BN model construction. The confusion matrix, receiver operating characteristic (ROC) curve and area under the curve (AUC) were used to evaluate the performance of the BN model. Results: The univariate and multivariate analyses identified 7 independent prognostic factors: gender, previous operation history, histological grading, lymphatic metastasis, peritoneal cancer index, completeness of cytoreduction and splenectomy (all p < 0.05). Based on these independent factors, the BN model was established on the training set. After internal validation, the accuracy and AUC of the BN model were 70.3% and 73.5%, respectively. Conclusion: The BN model provides a reasonable level of predictive performance for PMP patients undergoing CRS + HIPEC.

PMP mainly originates from appendiceal mucinous tumors. Tumor cells and mucus enter the abdominal and pelvic cavity through the perforated appendix wall, then accumulate and redistribute in the abdomen and pelvis, leading to mucinous ascites, peritoneal implantation, omental cake, and organ involvement, particularly of the ovary and the spleen. [1][2][3][4] Many factors affect the prognosis of PMP, such as age, peritoneal cancer index (PCI), completeness of cytoreduction (CC), histological grading, lymphatic metastasis, vascular invasion, stripped peritoneum area, and number of anastomoses. [6][7][8] Identifying prognostic factors and developing a survival prognostic model for PMP are important for predicting the clinical outcome of PMP patients treated with CRS + HIPEC and for making clinical treatment decisions.
In recent years, machine learning methods have been widely used in the medical field. 9 A Bayesian network (BN) is a probabilistic model built on a directed acyclic graph, which infers the unknown probabilities of variables from known probabilities. Previous studies have developed BN models for survival prediction of malignant tumors such as lung cancer, breast cancer, gallbladder cancer and colon cancer, [10][11][12][13] which showed high predictive accuracy. At present, no study has established a BN-based survival prognostic model for PMP. Therefore, this study aims to construct and evaluate a BN prediction model for PMP.

| Clinical information
Our institute is a medical center specialized in treating peritoneal metastases from gastrointestinal and gynecological malignancies, mainly using CRS + HIPEC and postoperative integrated treatment approaches. Each patient treated at our center has been entered into a prospectively established database, which contains detailed clinicopathological information on more than 1980 patients. From this database, we screened 453 PMP patients who underwent CRS + HIPEC for the first time from December 2004 to July 2021. All patients met the inclusion and exclusion criteria. 14 The study was approved by the Ethics Committee of Beijing Shijitan Hospital. All patients signed informed consent.

| CRS + HIPEC
After general anesthesia, a midline xiphoid-pubic incision was made to enter the abdomen. Once the abdominal wall was opened, the characteristics and volume of ascites were recorded and the PCI was evaluated according to Sugarbaker's principle. 15 Then, maximal CRS was performed, including resection of the visceral and parietal peritoneum, resection of tumor-involved organs, and lymphadenectomy.
The CC score was evaluated after CRS according to Sugarbaker's criteria. 15 After CRS, open HIPEC was performed. The chemotherapy drugs were docetaxel 120 mg + cisplatin 120 mg or cisplatin 120 mg + mitomycin C 30 mg, each dissolved in 3000 ml of heated saline at 43°C for 60 min.
Digestive tract and urinary tract reconstructions were performed after HIPEC. An intestinal stoma was created if necessary. Drainage tubes were placed and the incision was sutured with reduced tension. After the operation, the patient was transferred to the intensive care unit for recovery and then to the surgical oncology ward once the condition stabilized.

| Follow-up
Follow-up consisted of physical examination, tumor response evaluation and collection of survival information. Follow-up was performed once every 3 months within 2 years after CRS + HIPEC, once every 6 months during the third year, and once every year thereafter. 16 The last follow-up was on December 31, 2021, with a follow-up rate of 100%.

| Definition
Overall survival (OS) was defined as the time interval from the date of clinical diagnosis to the date of death or the last follow-up.

| Statistical analysis
SPSS 26.0 (IBM Corporation, SPSS, Armonk, NY) was used for data collection and analysis. Continuous variables were reported as median (range) and compared with the t-test or rank-sum test. Categorical variables were presented as number (percentage) and analyzed by the χ² test or Fisher's exact test. The Kaplan-Meier method was used to estimate OS and the log-rank test was used for comparison between groups. A p value <0.05 was considered significant. Univariate and multivariate Cox regression analyses were conducted to identify the independent risk factors for OS. R software (version 4.1.2, developed by The R Foundation for Statistical Computing) was used for BN model development and evaluation.
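The Kaplan-Meier estimate of OS mentioned above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in this study: the function name `kaplan_meier` and the follow-up data are hypothetical and serve only to show how censored observations leave the risk set without lowering the survival estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time in months for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # patients leaving the risk set at each time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # deaths and censored patients both leave
    return curve


# Hypothetical follow-up data (months); not taken from the study.
times = [6, 12, 12, 20, 30, 36, 40, 50]
events = [1, 1, 0, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(times, events)
```

Each element of `km_curve` pairs a death time with the estimated probability of surviving beyond it, which is the step function drawn in a Kaplan-Meier plot.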

| Development of the BN model
The "bnlearn" package (version 4.7) was used for BN structure learning, parameter learning and inference. To evaluate the BN model performance, all PMP patients were randomly split into a training set and a testing set at a ratio of 8:2. The training set was used to establish the BN model and the testing set was used for internal validation. In the training set, univariate and multivariate analyses were performed to screen for independent prognostic factors for BN model construction. We selected OS as the target variable and 36 months as the target cut-off time point. As a discrete BN requires categorical variables, all variables were discretized before the model was constructed.
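The data preparation described above (a random 8:2 split and dichotomization of OS at 36 months) can be sketched in Python as follows. This is an illustration only, not the R code used in the study. The function names, the random seed and, in particular, the treatment of patients censored before 36 months are assumptions, since the text does not state how such cases were handled.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=0):
    """Randomly split records into training and testing sets (default 8:2)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary choice
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def dichotomize_os(os_months, died, cutoff=36):
    """Discretize OS into survival status at the cut-off time.

    Returns "dead" if death was observed on or before the cut-off,
    "alive" if the patient was followed beyond the cut-off, and None
    if the patient was censored earlier (status unknown; how the study
    treated these cases is not stated, so this is an assumption).
    """
    if died and os_months <= cutoff:
        return "dead"
    if os_months > cutoff:
        return "alive"
    return None


# Hypothetical patient identifiers; not the study database.
train_ids, test_ids = split_train_test(list(range(100)))
```

A patient who died at 50 months is labeled "alive" at the 36-month cut-off, because the target variable is survival status at that time point rather than final status.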

| Evaluation of BN model
The confusion matrix is a cross table containing the observed and predicted classes with relevant statistics, which can be obtained by internal validation. The accuracy of the BN model is defined by the following equation: Accuracy = [true positive (TP) + true negative (TN)]/[TP + false positive (FP) + TN + false negative (FN)]. Using the "ROCR" package (version 1.0-11), the receiver operating characteristic (ROC) curve and the area under the curve (AUC) were calculated to evaluate the overall performance of the BN model.

[...] histological grading, vascular invasion, lymphatic metastasis, PCI, CC, number of organ resections, number of anastomoses, RBC transfusion volume, ascites volume, and splenectomy (all p < 0.05) (Table 2). Factors with p < 0.05 were incorporated into multivariate Cox regression analysis, which identified 7 independent prognostic factors: gender, previous operation history, histological grading, lymphatic metastasis, PCI, CC and splenectomy (all p < 0.05) (Table 3). Kaplan-Meier curves of the training set and subgroup comparisons based on these 7 independent prognostic factors are shown in Figure 1A-H. Based on the 7 independent prognostic factors above, the BN model for the training set was constructed (Figure 2A).

| Internal validation for BN model
The confusion matrix of internal validation is listed in [...] (Figure 2B).
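The evaluation metrics defined earlier (accuracy from the confusion matrix, and AUC) can be sketched in plain Python. This is not the "ROCR" computation used in the study. The labels and predicted probabilities are hypothetical, the 0.5 classification threshold is an assumption, and the AUC is computed through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels coded as 1/0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Accuracy = (TP + TN) / (TP + FP + TN + FN)."""
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)
area = auc(y_true, scores)
```

Accuracy depends on the chosen threshold, whereas the AUC summarizes ranking quality over all thresholds, which is why the two figures reported for a model need not agree.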

| DISCUSSIONS
The development and use of cancer survival prediction models are of great value to physicians making clinical decisions. In this study, we constructed a BN model to predict the survival of PMP patients based on 7 independent prognostic factors. After internal validation, the BN model showed a reasonable level of predictive performance, with an accuracy of 70.3% and an AUC of 73.5%. The univariate and multivariate analyses of the training set showed that gender, previous operation history, histological grading, lymphatic metastasis, PCI, CC and splenectomy were the independent prognostic factors. Chua et al. 17 conducted a large multi-center study of 2298 patients, which showed that age, severe adverse events, CC and high-grade PMP were independent risk factors for OS. Another study by Ansari et al. 18 confirmed that male sex, high-grade PMP, and high levels of carbohydrate antigen (CA) 125 and carcinoembryonic antigen (CEA) were independent risk factors for poor prognosis. As mentioned above, many factors affect the prognosis of PMP, and PMP cases differ to some extent among treatment centers.
Among the 7 independent prognostic factors selected by multivariate analysis in our study, two were associated with CRS + HIPEC: splenectomy and CC score. Our study showed that splenectomy provided significantly better survival than nonsplenectomy for PMP patients. The reason may be that splenectomy increases the likelihood of complete cytoreduction. However, one study 19 showed that splenectomy could increase the major complication rate in patients undergoing CRS + HIPEC. The efficacy and perioperative safety of splenectomy therefore need to be verified in further studies. CC score is a critical independent prognostic factor for PMP patients. As shown in the BN model we constructed, PCI and splenectomy have a large impact on CC score. PMP patients with a low PCI who underwent splenectomy during standardized CRS + HIPEC had a lower CC score and a longer OS.
Histological grading and lymphatic metastasis are also independent factors affecting the survival and prognosis of PMP patients. The BN model showed that histological grading was correlated with lymphatic metastasis, and the lymphatic metastasis rate was higher in patients with high pathological grade.
In 2001, Sugarbaker systematically studied CRS + HIPEC + early postoperative intraperitoneal chemotherapy (EPIC) for PMP, demonstrating that this therapy was the optimal treatment strategy for PMP patients. This treatment embodies the advantages of comprehensive surgery-based treatment, integrating the synergistic effects of surgical resection, regional chemotherapy, hyperthermia and large-volume liquid lavage. CRS can remove all visible tumor tissue and HIPEC can eliminate micrometastases and free tumor cells. Current studies 17,18,[20][21][22][23][24][25] have reported that the median OS (mOS) of PMP treated with standard CRS + HIPEC was 103. .0 months, the median progression-free survival time was 40.0-98.0 months, and the 5- and 10-year survival rates were 49.0%-92.1% and 32.8%-80.8%, respectively.
The mOS of the training set in this study was 102.4 months, and the 3-, 5- and 10-year survival rates were 82.3%, 68.1% and 43.9%, respectively. An earlier study from our center 26 showed that the mOS of 254 PMP patients was 55.4 months, and the 3- and 5-year survival rates were 61.0% and 44.3%, respectively. CRS + HIPEC can markedly prolong the survival of PMP patients.
Currently, HIPEC regimens vary among treatment centers. Oxaliplatin and mitomycin C are the most commonly used chemotherapy drugs for HIPEC. There is no international consensus on the best drug and dose for HIPEC. Therefore, international peritoneal cancer centers need to strengthen cooperation and conduct multi-center, large-sample randomized controlled clinical trials to identify HIPEC protocols with high efficacy and low toxicity.
The nomogram is a graphical representation that has been used to predict cancer survival in recent years. Two studies 27,28 have developed nomograms for predicting survival in PMP patients. Chen et al. 27 developed a nomogram to predict OS that incorporated age, grade, location, T stage, N stage, M stage, lymph nodes removed and chemotherapy. The C-index of the nomogram model was 0.757 on internal validation. Another nomogram survival model, proposed by Bai et al., 28 was based on 5 independent prognostic factors: D-dimer level, carbohydrate antigen (CA) 125 level, CA19-9 level, degree of radical surgery and histological grade. The C-index of the model was 0.825; the AUC of the model was not reported. Nomograms and BN models are both based on independent risk factors. A BN model can further illuminate the relationships and interactions among the independent factors. Moreover, a BN model is a direct and structured illustration of how the factors work together to contribute to the outcome. Researchers can improve the accuracy of the model by adjusting the conditional probability of each variable node according to clinical experience and research.
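Because the nomogram studies report a C-index while this study reports an AUC, it may help to see what the C-index measures. The following Python sketch computes Harrell's concordance index for right-censored data under its usual definition. The function name and the data are hypothetical, and this is not the code used by the cited studies.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed death
    and patient j was followed for longer. The pair is concordant when
    the model assigned the higher risk score to patient i.
    Tied risk scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical data: survival in months, death indicator, predicted risk.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risk = [0.9, 0.4, 0.1, 0.2]
c_index = concordance_index(times, events, risk)
```

For a binary outcome without censoring, the C-index reduces to the AUC, so the two measures are related but are computed on different targets (time to event versus status at a fixed cut-off).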
In recent years, BN has been widely used in artificial intelligence, systems biology, disease diagnosis and prognosis, scientific decision-making and other fields. Its value in the medical field is also prominent. The BN survival prediction model has the following advantages: (1) The model is presented in the form of a graph, which is simple and intuitive; (2) The correlations between variables can be identified and the conditional probability of each variable can be calculated and predicted; (3) The inference function of BN can guide treatment decision-making.
There were three major limitations in this study: (1) The survival prognostic model established in this study was based on single-center data, and only internal validation was conducted, without external validation; (2) The cases included in this study spanned a long period, which resulted in heterogeneity among the cases; (3) Preoperative tumor markers, Ki-67, P53 and other pathological indicators were not included in this study.
The prediction accuracy of the BN model in this study remains to be improved. In future studies, we will expand the sample size, include more variables and conduct external validation to improve the prediction accuracy of the survival prognostic model for PMP.
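The inference function listed among these advantages can be illustrated with a very small discrete BN in Python. The sketch assumes a hypothetical network fragment in which PCI and splenectomy are parents of CC and CC is the parent of survival status at 36 months. The fragment is loosely motivated by the relationships discussed earlier, but it is not the structure in Figure 2A, and every probability in the tables is invented for illustration.

```python
# Hypothetical conditional probability tables (CPTs); not study results.
# P(CC | PCI, splenectomy)
P_CC = {
    ("low", "yes"): {"complete": 0.9, "incomplete": 0.1},
    ("low", "no"): {"complete": 0.8, "incomplete": 0.2},
    ("high", "yes"): {"complete": 0.5, "incomplete": 0.5},
    ("high", "no"): {"complete": 0.3, "incomplete": 0.7},
}
# P(alive at 36 months | CC)
P_ALIVE = {"complete": 0.9, "incomplete": 0.6}


def prob_alive(pci, splenectomy):
    """P(alive at 36 months | PCI, splenectomy), summing over CC."""
    cc_dist = P_CC[(pci, splenectomy)]
    return sum(p_cc * P_ALIVE[cc] for cc, p_cc in cc_dist.items())


def posterior_cc(pci, splenectomy):
    """P(CC | alive at 36 months, PCI, splenectomy) by Bayes' rule."""
    evidence = prob_alive(pci, splenectomy)
    cc_dist = P_CC[(pci, splenectomy)]
    return {cc: p_cc * P_ALIVE[cc] / evidence for cc, p_cc in cc_dist.items()}


favorable = prob_alive("low", "yes")
unfavorable = prob_alive("high", "no")
cc_given_alive = posterior_cc("high", "no")
```

The first function is a forward (predictive) query, while the second reasons backward from an observed outcome to an unobserved cause. Supporting both directions with the same tables is what distinguishes BN inference from a one-way regression formula.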

| CONCLUSIONS
To conclude, this study established a BN-based survival prediction model for PMP from 7 independent prognostic factors, which could help clinical treatment decision making and outcome prediction.