Mortality prediction models after radical cystectomy for bladder tumour: A systematic review and critical appraisal

To identify risk‐predictive models for bladder‐specific cancer mortality in patients undergoing radical cystectomy and assess their clinical utility and risk of bias.


| INTRODUCTION
Urothelial bladder carcinoma (UBC) is the most common type of bladder neoplasia. It is associated with smoking (causing about 50% of cases) and environmental risk factors such as occupational exposure to toxic products. [1][2][3] According to GLOBOCAN data, in 2020 bladder carcinoma was the fifth most common tumour in Europe. 4,5 Urothelial bladder carcinoma is classified into superficial (non-muscle-invasive) or infiltrative (muscle-invasive) tumours based on whether or not they involve the muscular layer of the bladder. Most UBCs (75%) are non-muscle-invasive, with a high recurrence rate but a low rate of progression and mortality. 1,3 The remainder are muscle-invasive (T2-T4) and result in higher morbidity and mortality, with 5-year cancer-specific survival ranging from 23.5% to 65% depending on the study. 3,6,7 The standard treatment for nonmetastatic muscle-invasive UBC is radical cystectomy preceded by neoadjuvant chemotherapy. 3 However, radical cystectomy is associated with high morbidity and mortality, with complication rates ranging from 25% to 35% and perioperative mortality rates ranging from 0.7% to 11%. 8 In addition to tumour stage, there are other prognostic factors for mortality in patients with UBC, such as age, sex, positive margins, lymphovascular invasion (LVI), and neoadjuvant and adjuvant treatments. [9][10][11][12][13][14][15] UBC is therefore a heterogeneous disease with a variable clinical course. Prediction models can be useful tools to assess each patient's individualized risk and the treatment to be applied to achieve maximum oncologic efficacy with the least possible comorbidity. 16 Predictive mortality models, which incorporate relevant prognostic factors, may determine a patient's individual risk of death. They are often presented in the form of intuitive graphs, mathematical formulas, or risk groups that facilitate their use. 17
However, predictive mortality models can have a high risk of bias and often lack independent external validation, limiting their applicability to clinical practice. 18 This is why in 2014 the 'CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies' (CHARMS) was developed as a guideline for conducting systematic reviews of predictive models. 19 Five years later, the Prediction model Risk Of Bias ASsessment Tool (PROBAST) was published to assess the clinical applicability and risk of bias of prediction models, based on the results obtained from a CHARMS review. 20,21 Both tools have been widely used in several diseases, showing the limitations of biased models and the difficulty of applying them in clinical practice. [22][23][24] As far as we know, no systematic review of prediction models of UBC mortality has been carried out with the application of CHARMS and PROBAST. [19][20][21] Hence, a summary of the existing models, including a description of the risk of bias in each model, is lacking; such a summary would allow clinicians to better stratify the mortality risk of these patients.
Consequently, the objective of this study is to systematically review the available evidence focused on predictive models of cancer-specific mortality in patients with UBC undergoing radical cystectomy, to evaluate their main characteristics and to assess the risk of bias and clinical applicability.

| Study design and literature search
This systematic review was performed following a prespecified protocol (registered in the PROSPERO database, CRD42021224626), and in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. 25 We included original studies in English, Spanish or Portuguese describing the development and internal validation of a multivariable predictive model for cancer-specific mortality in patients with UBC who underwent or were candidates for radical cystectomy. We also included studies that carried out external validation in the same study. We considered studies that predicted mortality in the short, medium and long term. We excluded review articles, studies evaluating recurrence or all-cause mortality, studies performing external validation only, and studies using markers not available in clinical practice (genetic or biomarker analysis).
A literature search was performed in the MEDLINE (through PubMed) and EMBASE databases, including all studies published from database inception until 31 October 2021, using the following descriptors: 1) Related to cancer: bladder cancer and bladder neoplasms; 2) Related to predictive models: nomograms, predictive models, scoring system, points system, risk score and prediction model; 3) Related to the outcome: mortality, recurrence, death, prognosis and survival. The complete search equations are included in Appendix S1.
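As an illustration only, the three descriptor groups listed above could be combined into a Boolean search string as sketched below. The helper name `build_query` and the choice of OR within a group and AND between groups are assumptions made for this sketch; the search equations actually used are those in Appendix S1.

```python
def build_query(groups):
    """Join terms with OR inside each group and AND between groups.

    Hypothetical helper for illustration; not the search equation
    reported in Appendix S1.
    """
    blocks = []
    for terms in groups:
        quoted = [f'"{term}"' for term in terms]
        blocks.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(blocks)


# Descriptor groups as listed in the text.
cancer_terms = ["bladder cancer", "bladder neoplasms"]
model_terms = ["nomograms", "predictive models", "scoring system",
               "points system", "risk score", "prediction model"]
outcome_terms = ["mortality", "recurrence", "death", "prognosis", "survival"]

query = build_query([cancer_terms, model_terms, outcome_terms])
print(query)
```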
Titles and abstracts were reviewed independently by two researchers (PS-S and LM-C). To validate the inclusion of the articles, the concordance between authors was assessed using the kappa index (KI), which had to be greater than 0.60. 26 If this condition was met, possible discrepancies were resolved by consensus among all the authors of the review. Once the abstracts were selected, the same procedure was replicated for the full text of the articles selected in the previous step. In addition, to reduce the possibility of publication bias, a manual search was performed using the bibliographic references of the models selected for the review. In line with previous evidence, 27 we also searched for and included grey literature through three strategies: searches in grey literature databases, searches in clinical trial registers, and searches in conference proceedings.
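The agreement check described above can be sketched as follows, assuming the kappa index refers to Cohen's kappa for two raters. The screening decisions shown are hypothetical and are not data from this review.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical screening decisions for ten abstracts (illustrative only).
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "include", "exclude",
              "include", "include", "exclude", "exclude", "exclude"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
meets_threshold = kappa > 0.60  # threshold stated in the text
print(round(kappa, 3), meets_threshold)
```

In this made-up example the reviewers disagree on one abstract out of ten, giving observed agreement of 0.90 against chance agreement of 0.54.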

| Data extraction
Variables were extracted according to the 11 items in the CHARMS checklist, 19 to identify potential sources of bias, organize information, and identify relevant information used to evaluate prediction modelling studies. Two of the authors (PS-S and LM-C) reviewed the studies independently according to CHARMS, and disagreements were resolved by discussion and consensus with a third author. 19 To validate the concordance between authors in the CHARMS items, the KI was used, which had to be >0.60. 26
Aspects related to the risk of bias were analysed using the PROBAST tool, [19][20][21] which assesses the presence of systematic errors and the applicability of predictive models. Risk of bias is analysed in four domains (participants, predictors, outcome and analysis) and applicability in three (participants, predictors and outcome). These domains refer to the patients selected, how the predictors are handled and when they are measured, how the outcome is measured and whether the statistical analysis has been performed correctly. Each domain has a number of items, for which the PROBAST statement itself gives assessment guidance, 20,21 and each item is answered 'yes', 'probably yes', 'no', 'probably no' or 'no information'. Based on the responses to all items, the domain is categorized as 'low', 'high' or 'unclear' risk of bias and 'low', 'high' or 'unclear' concern regarding applicability. After all domains have been assessed, an overall evaluation is reached following the principle of 'the worst score counts', whereby the overall rating is the worst rating obtained across the domains. The KI was also used to assess concordance among the authors. 26
The Scottish Intercollegiate Guidelines Network Grading Review Group (SIGN) recommendations were used to assess the level of recommendation and level of evidence. 28 The PROBAST and SIGN assessments followed the same procedure previously defined for the CHARMS application, in which two independent investigators assessed each of the studies.
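The 'worst score counts' principle used for the overall PROBAST evaluation can be illustrated with a minimal sketch. The severity ordering (low, then unclear, then high) and the example study below are assumptions made for illustration; the function name `overall_rating` is hypothetical.

```python
# Assumed ordering of ratings from best to worst.
SEVERITY = {"low": 0, "unclear": 1, "high": 2}


def overall_rating(domain_ratings):
    """Return the worst rating found across the assessed domains."""
    if not domain_ratings:
        raise ValueError("at least one domain rating is required")
    for rating in domain_ratings.values():
        if rating not in SEVERITY:
            raise ValueError(f"unknown rating: {rating!r}")
    return max(domain_ratings.values(), key=SEVERITY.__getitem__)


# Hypothetical study (illustrative only, not one of the included models).
risk_of_bias = {
    "participants": "low",
    "predictors": "low",
    "outcome": "unclear",
    "analysis": "high",
}
applicability = {
    "participants": "low",
    "predictors": "unclear",
    "outcome": "low",
}

overall_rob = overall_rating(risk_of_bias)
overall_applicability = overall_rating(applicability)
print(overall_rob, overall_applicability)
```

Under this rule a single domain rated 'high' is enough to make the overall rating 'high', regardless of the other domains.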

| Characteristics of the studies according to CHARMS
Tables S1-S7 show the analysis of the 11 CHARMS domains in full detail.
In the Interpretation and Discussion domain, the conclusions indicate that the results obtained would be further validated in an independent study, 6,[9][10][11][12]15,[30][31][32][34][35][36][37][38][39][40][41] with the exception of Simone and Solsona et al. who were conclusive in the use of the models in clinical practice. 13,33 All authors interpreted the relationship between predictors and outcome, compared their results with other studies, and discussed the strengths and limitations of their research. Discussion of generalization to other geographic areas was not addressed in two papers. 13,33
FIGURE 1 PRISMA flow chart for the systematic review of predictive models in cancer-specific mortality for bladder cancer in patients treated with radical cystectomy

| PROBAST analysis
The PROBAST analysis is detailed in Appendix S2, and the results are summarized in Table 3 and Figure 2. In the participants domain, the risk of bias or nonapplicability was high in 52.6% of the articles. 9,[11][12][13]31,34,35,37,39,40 In the analysis domain, the risk of bias was high in all papers except for Xylinas et al. 11 Generally, the reasons for the risk of bias in this domain were the low number of participants, the categorization of continuous variables, the inappropriate handling of missing data and problems with the calibration of the models. Finally, in the overall assessment, the risk of bias was high in all the studies and applicability was low in 10 of the 19 articles (52.6%). 9,[11][12][13]31,34,35,37,39,40

| Scottish intercollegiate guidelines network grading review group (SIGN) recommendations
Based on the SIGN criteria, the review has a grade of evidence 2++: high-quality systematic reviews of case-control or cohort studies with a grade B recommendation.

| Synopsis and results
More than 3000 abstracts were reviewed for this systematic review, which eventually included a total of 19 prognostic models of cancer-specific survival (CSS) in patients with UBC after radical cystectomy, of which 100% have a high risk of bias and nearly 53% have low applicability. 6,[9][10][11][12][13]15,[30][31][32][33][34][35][36][37][38][39][40][41] Several predictors were consistently selected for inclusion in the different models, such as the sociodemographic variables of age and sex, tumour characteristics such as TNM stage, histological subtype/grade and LVI, and neoadjuvant chemotherapy. Most of the models showed C-index values higher than 0.7, indicating a good model, and only one study presented C-index values higher than 0.8 (strong model).
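To make the C-index thresholds above concrete, the following minimal sketch computes a concordance index in the style of Harrell's C for right-censored survival data and labels it with the 0.7 and 0.8 cut-offs mentioned in the text. The patient data are hypothetical, pairs with tied event times are ignored for simplicity, and the included studies may have computed their C-index differently.

```python
def concordance_index(times, events, risks):
    """Share of comparable patient pairs ranked correctly by predicted risk.

    A pair is comparable when the patient with the shorter follow-up time
    had the event. Ties in predicted risk count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def interpret(c_index):
    """Label a C-index using the cut-offs stated in the text."""
    if c_index > 0.8:
        return "strong"
    if c_index > 0.7:
        return "good"
    return "below 0.7"


# Hypothetical cohort: follow-up in months, event = 1 for cancer-specific
# death and 0 for censoring, risk = model-predicted risk score.
times = [5, 12, 20, 30, 42]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = concordance_index(times, events, risks)
label = interpret(c_index)
print(c_index, label)
```

In this toy cohort 7 of the 8 comparable pairs are ranked correctly, so the C-index is 0.875.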

| Strengths and limitations
The main strength of our work is that it provides a synthesis of all the predictive models published to date, indicating their main characteristics, and an assessment of their methodology and clinical applicability. Hence, this paper provides a global overview of the different predictive models of specific mortality in UBC, which could help in the elaboration of consensus documents and clinical practice guidelines.
TABLE 1 (Continued) Note: This table summarizes the most used variables; the full table is attached as Table S2. N nodes removed = number of nodes removed.
One possible limitation is selection bias: some articles that met the inclusion criteria and none of the exclusion criteria may have been missed, as may articles published in a language other than those used by the authors. However, independent review by two researchers minimizes the risk of such bias. The Scopus and Web of Science databases were not included in the search because they have been found to return studies similar to those in the databases used, so their inclusion can produce considerable 'noise', as has been observed in previous reviews. 42,43 The Cochrane Database was not used either, as it mainly indexes systematic reviews and clinical trials, which are not the subject of this systematic review.
Information was extracted from each included article systematically by two reviewers, using an objective tool (PROBAST) to assess the risk of bias and applicability, in order to minimize the possibility of information bias.

| Comparison with existing literature
Our systematic review shows that existing models in the literature have a high risk of bias and low clinical applicability. These findings are consistent with those of other reviews, such as the previous systematic review by Beneyto et al., 23 which used the PROBAST criteria to identify and summarize predictive models of mortality in sepsis. The included studies showed a high risk of bias in the Participants, Predictors, Outcome and Analysis domains, with the risk of bias in the latter domain being high in 80%-100% of the studies. Furthermore, 12%-53% of the included models were not applicable. Our results were in line with these studies, finding a high risk of Analysis bias in 94% of the papers included, while 53% of the models presented low applicability in clinical practice.
In our systematic review, only 5 of the 19 studies included an external validation. 6,11,33,38,41 Previous studies 44,45 also showed a similar percentage of studies including external validation of the developed model. External validation of the models in a new context is essential to assess the impact on health outcomes in clinical practice. However, this is not the subject of this review, and thus, we did not check whether the other studies had been externally validated in a subsequent study. In addition, to reduce the risk of bias the study design should be prospective, and only 2 of the studies included in our systematic review had this design. 31,33

| Implications for research and clinical practice
There are other models focused on UBC that predict other types of outcomes, such as overall survival [46][47][48][49] or the risk of recurrence, 50,51 and models in the population with non-muscle-invasive bladder tumours 52 or with metastatic UBC. [53][54][55] We decided to focus on studies that evaluated nonmetastatic patients after radical cystectomy and predicted CSS, since this is one of the groups of patients with the highest morbidity and mortality. Consequently, it could be worthwhile to carry out systematic reviews for other types of outcomes and patients.
TABLE 3 Risk of bias and concern regarding applicability of the included studies that evaluated cancer-specific mortality prediction models in patients with bladder cancer (PROBAST)
Abbreviations: −, high ROB/high concern regarding applicability; ?, unclear ROB/unclear concern regarding applicability; +, low ROB/low concern regarding applicability; PROBAST, Prediction model Risk Of Bias ASsessment Tool; ROB, risk of bias.
Based on our results, further research should also focus on the development of new prognostic models that follow the recommendations of CHARMS and PROBAST, to increase their applicability in clinical practice.

| CONCLUSIONS
This systematic review analyses the predictive models for cancer-specific mortality in patients with UBC after radical cystectomy, through the application of CHARMS and PROBAST. Although the C-index values were considered good, the models included have a high risk of bias and low applicability, so they should be applied with caution. There is a need for studies that develop new prognostic models meeting the standards called for within international consensus frameworks, including a prospective design and external validation of the model.