A meta‐analysis comparing efficacy and safety between proton beam therapy versus carbon ion radiotherapy

Abstract

Background: This study aimed to compare the outcomes of proton beam therapy (PBT) and carbon ion radiotherapy (CIRT) through a systematic review and meta‐analysis of the existing clinical evidence.

Methods: A systematic literature search was performed to identify studies comparing the clinical outcomes of PBT and CIRT. The included studies were required to report oncological outcomes (local control [LC], progression‐free survival [PFS], or overall survival [OS]) or adverse events.

Results: Eighteen articles comprising 1857 patients (947 treated with PBT and 910 treated with CIRT) were included in the analysis. The pooled analysis of the overall population yielded hazard ratios of 0.690 (95% confidence interval [CI], 0.493–0.967; p = 0.031) for LC, 0.952 (95% CI, 0.604–1.500; p = 0.831) for PFS, and 1.183 (95% CI, 0.872–1.607; p = 0.281) for OS, with CIRT as the reference. The subgroup analyses covered patients treated in the head and neck, patients treated in areas other than the head and neck, and patients with chordomas and chondrosarcomas. These analyses revealed no significant differences in any outcome except LC in the subgroup of patients treated in areas other than the head and neck. Adverse event rates were comparable in both groups, with an odds ratio (OR) of 1.097 (95% CI, 0.744–1.616; p = 0.641). Meta‐regression analysis for possible heterogeneity did not demonstrate a significant association between treatment outcomes and the ratio of biologically effective doses between modalities.

Conclusion: This study highlighted the comparability of PBT and CIRT in terms of oncological outcomes and adverse events.

Particle beam radiotherapy (PBRT), a form of radiation therapy (RT), can deliver high radiation doses to tumors and exert antitumor effects. Notably, PBRT is distinguished by a characteristic depth-dose distribution known as the Bragg peak. 1 This characteristic allows high doses to be delivered to the tumor while minimizing exposure to nearby normal tissues. Carbon ion RT (CIRT) and proton beam therapy (PBT) have been increasingly utilized, and over 250,000 patients had undergone PBRT by 2019. The availability of PBRT has been expanding, with over 100 facilities offering this treatment. 2 Although some physical differences exist between proton and carbon-ion beams with respect to the widths of the penumbra and fragmentation tail, these particle beams are generally considered to exhibit similar physical profiles. 3 However, because of its higher relative biological effectiveness (RBE) and linear energy transfer compared to proton beams, CIRT is expected to have superior biological effectiveness. 4 Nonetheless, studies on PBRT are mostly single-arm studies, which carry limited weight when comparing the oncological outcomes of the two treatment modalities. While a few prospective randomized controlled trials (RCTs) and meta-analyses have compared the treatment outcomes and toxicities between the two modalities of PBRT, the available evidence remains insufficient. Moreover, no meta-analysis has focused exclusively on studies comparing the two treatment arms, PBT versus CIRT.
Therefore, we aimed to systematically review and summarize the published clinical evidence, specifically comparing the treatment outcomes and toxicities between PBT and CIRT.

| Search strategy and selection criteria
Systematic literature searches were conducted to identify all available articles on the clinical outcomes of PBRT, with the last search performed on June 1, 2023. The first search query identified studies using PBT or CIRT, and the second query included all types of tumors that were known candidates for PBRT. The Cochrane Library, PubMed, and EMBASE electronic databases were used, and the keywords used for the literature searches were ("particle" OR "heavy ion" OR "carbon ion" OR "carbon radiation" OR "radiation therapy technique" OR "Cion" OR "CIRT" OR "c ion rt") AND ("cancer" OR "tumor" OR "neoplasm" OR "carcinoma" OR "chordoma" OR "sarcoma"). Additional manual searches of references were also performed. Studies were included if they were written in English and met the Population, Intervention, Comparison, Outcome, and Study (PICOS) criteria, defined as follows: Population (P), human subjects; Intervention (I), all types of PBRT; Comparison (C), comparison between PBT and CIRT; Outcomes (O), any oncologic outcome, including local control (LC), progression-free survival (PFS), and overall survival (OS), or any adverse event (AE); and Study (S), only RCTs or case-control studies. This study was registered in PROSPERO (Protocol No: CRD42023450927).

| Data extraction
Four investigators (Jang JY, Kim K, Lee TH, and Yoo GS) extracted the general characteristics of each article. The recorded data included the name of the first author, year of publication, study design, treatment type, sample size, dose per fraction, number of fractions, type of disease, site of the treated area, total dose, pre-RT treatments, and the study population (age and sex). The sample size and the number of events related to treatment outcomes and the occurrence of AEs were recorded for each treatment arm. To compensate for the heterogeneity in dose per fraction and number of fractions, we used the biologically effective dose (BED), with an alpha-beta ratio of 3 for toxicity evaluation and 10 for oncologic outcome evaluation. The 3-year and 5-year LC, PFS, and OS rates were extracted from each study. For AEs, we extracted data on the most frequently reported toxicities common to both treatment groups to ensure consistency in the analysis.
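The BED conversion described above can be sketched as follows. This is a minimal illustration that assumes the standard linear-quadratic form, BED = n × d × (1 + d / (α/β)); the text does not state the exact formula used, and the function name and the example schedule (30 fractions of 2 Gy) are hypothetical rather than taken from any included study.

```python
def bed(dose_per_fraction_gy: float, n_fractions: int, alpha_beta_gy: float) -> float:
    """Biologically effective dose under the linear-quadratic model (assumed form).

    BED = n * d * (1 + d / (alpha/beta)), where n is the number of fractions
    and d is the dose per fraction.
    """
    if dose_per_fraction_gy <= 0 or n_fractions <= 0 or alpha_beta_gy <= 0:
        raise ValueError("dose, fraction count, and alpha/beta must be positive")
    total_dose = n_fractions * dose_per_fraction_gy
    return total_dose * (1.0 + dose_per_fraction_gy / alpha_beta_gy)


# Hypothetical schedule, for illustration only: 30 fractions of 2 Gy.
# alpha/beta = 10 is used for oncologic outcomes and 3 for toxicity, as in the text.
bed_tumor = round(bed(2.0, 30, 10.0), 1)     # 60 * 1.2
bed_toxicity = round(bed(2.0, 30, 3.0), 1)   # 60 * (1 + 2/3)
print(bed_tumor, bed_toxicity)
```

Because the same schedule maps to different BED values for different alpha-beta ratios, the two ratios have to be applied separately to the oncologic and toxicity analyses.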

| Quality assessment
We performed a quality assessment of all the studies included in the analysis. Four radiation oncologists individually applied the star-based Newcastle-Ottawa Scale. Each item in the assessment could receive a maximum of one star, except for comparability, which could receive one or two stars. The scores were converted to the Agency for Healthcare Research and Quality standards and categorized as good, fair, or poor quality based on the following criteria: good quality, 3 or 4 stars in the selection domain AND 1 or 2 stars in the comparability domain AND 2 or 3 stars in the outcome/exposure domain; fair quality, 2 stars in the selection domain AND 1 or 2 stars in the comparability domain AND 2 or 3 stars in the outcome/exposure domain; poor quality, 0 or 1 star in the selection domain OR 0 stars in the comparability domain OR 0 or 1 star in the outcome/exposure domain. 5
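The conversion rule above is a small decision table, and it can be written out as a sketch. The function name and the input validation are illustrative assumptions; only the thresholds come from the criteria stated in the text.

```python
def ahrq_quality(selection: int, comparability: int, outcome: int) -> str:
    """Map Newcastle-Ottawa star counts to an AHRQ quality category.

    Star ranges: selection 0-4, comparability 0-2, outcome/exposure 0-3.
    """
    if not (0 <= selection <= 4 and 0 <= comparability <= 2 and 0 <= outcome <= 3):
        raise ValueError("star counts out of range")
    # Poor: any single domain falls below its minimum.
    if selection <= 1 or comparability == 0 or outcome <= 1:
        return "poor"
    # Fair: exactly 2 selection stars with adequate comparability and outcome.
    if selection == 2:
        return "fair"
    # Good: 3 or 4 selection stars with adequate comparability and outcome.
    return "good"


print(ahrq_quality(4, 2, 3), ahrq_quality(2, 1, 2), ahrq_quality(3, 0, 3))
```

Checking the "poor" condition first keeps the three categories mutually exclusive and exhaustive, since the poor criteria are joined by OR while the good and fair criteria are joined by AND.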

| Statistical analysis
The Biomedical Statistics Center of our institution conducted the statistical analyses using R 4.2.3 (Vienna, Austria; http://www.R-project.org/) with the packages "metafor" and "meta". To estimate the effect of particle beam type on treatment outcomes and toxicities, we extracted or calculated the log hazard ratio (HR) and standard error (SE) for LC, PFS, and OS using Parmar's method, and the log odds ratio (OR) and SE for AE. 6,7 All HRs and ORs were calculated as the ratio of PBT to CIRT, with CIRT as the reference. A random-effects model was consistently used for the overall population, whereas a fixed-effects model was employed for subgroup analysis. Heterogeneity was measured using the Higgins and Green I² test. 8 I² ranged between 0% (no heterogeneity) and 100% (maximal heterogeneity), and heterogeneity was considered substantial when p < 0.1 by Cochran's Q test and I² > 50%. We also evaluated potential publication bias using Egger's regression test and funnel plots. 9 For the meta-regression analysis, we used inverse-weighted mixed-effects regression models to evaluate the effect of radiation dose on oncological outcomes and AE. 10 Statistical significance was set at p < 0.05.
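To make the pooling and heterogeneity steps concrete, the sketch below pools log effect sizes with a DerSimonian-Laird random-effects model and reports Cochran's Q and I². This is an illustrative assumption, not the authors' code: the analysis was run in R with "metafor" and "meta", whose estimator settings are not stated in the text, so the numbers need not match. The function name and the toy inputs are hypothetical.

```python
import math


def pool_random_effects(log_effects, std_errors):
    """Pool log HRs or log ORs with a DerSimonian-Laird random-effects model."""
    if len(log_effects) != len(std_errors) or len(log_effects) < 2:
        raise ValueError("need at least two studies with matching inputs")
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sw
    # Cochran's Q and the I^2 statistic (percent).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = len(log_effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Between-study variance (method of moments), truncated at zero.
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights, pooled estimate, and two-sided Wald p-value.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    p = math.erfc(abs(pooled / se_pooled) / math.sqrt(2.0))
    return {
        "ratio": math.exp(pooled),
        "ci": (math.exp(pooled - 1.959964 * se_pooled),
               math.exp(pooled + 1.959964 * se_pooled)),
        "p": p, "q": q, "i2": i2, "tau2": tau2,
    }


# Hypothetical log HRs (PBT vs. CIRT) and standard errors, for illustration only.
result = pool_random_effects([math.log(0.6), math.log(0.8), math.log(1.1)],
                             [0.30, 0.25, 0.35])
print(result)
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights, which is why the two models agree for outcomes with I² of 0%.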

| Selected articles and characteristics
Figure 1 presents the literature search results. A total of 3,983 articles were initially identified from the three electronic databases, and 18 articles were ultimately included in the analysis. 11-28 The characteristics of the included articles are shown in Table 1. Except for two, all were retrospective studies. Among the treated sites, there were 10 articles on the head and neck (including the paranasal sinus, nasal cavity, and skull base), four on the lung, two on the liver, and two on the pelvis. Concerning the type of tumor, the analysis included five articles on skull base tumors, comprising three articles on chordomas and two on chondrosarcomas. In addition, there were three articles on non-small cell lung cancer; two on adenoid cystic carcinoma of the head and neck; two on hepatocellular carcinoma; and one each on mucosal melanoma, squamous cell carcinoma, any malignancy of the head and neck, oligometastatic disease of the lung, sacral chordoma, and prostate cancer. The number of studies reporting each outcome was 12 for LC, nine for PFS, 13 for OS, and 11 for AE.

A pooled analysis was conducted, and the results are presented in Figure 2. The heterogeneity test results are presented in Table 3. Moderate heterogeneity, with an I² of 47.5% (p = 0.057), was observed only among the studies on PFS, while the remaining outcomes showed no heterogeneity. Furthermore, Egger's regression test indicated no publication bias, with p-values > 0.5 for all outcomes (Table S1). The pooled HR for LC was estimated to be 0.690 (95% confidence interval [CI], 0.493-0.967; p = 0.031), indicating a significant difference favoring PBT. For PFS and OS, the estimated HRs were 0.952 (95% CI, 0.604-1.500; p = 0.831) and 1.183 (95% CI, 0.872-1.607; p = 0.281), respectively, indicating no significant difference.
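The relationship between a reported HR, its 95% CI, and its p-value can be checked with a short calculation. The sketch below assumes a symmetric Wald interval on the log scale, which is the usual construction but is not stated explicitly in the text; the function name is hypothetical. Applied to the pooled LC and OS estimates reported above, it returns p-values of approximately 0.031 and 0.281, in line with the reported values.

```python
import math


def wald_p_from_ci(ratio: float, lower: float, upper: float, z_crit: float = 1.959964):
    """Recover the SE, z statistic, and two-sided p-value from a ratio and its 95% CI.

    Assumes the CI is exp(log(ratio) +/- z_crit * SE), i.e. symmetric on the log scale.
    """
    if not (0 < lower < upper) or ratio <= 0:
        raise ValueError("ratio and CI bounds must be positive, with lower < upper")
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return se, z, p


# Pooled estimates taken from the text (PBT vs. CIRT, CIRT as reference).
for label, hr, lo, hi in [("LC", 0.690, 0.493, 0.967), ("OS", 1.183, 0.872, 1.607)]:
    se, z, p = wald_p_from_ci(hr, lo, hi)
    print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.3f}")
```

Small discrepancies from published p-values are expected because the HR and CI bounds are rounded to three decimal places.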

| Oncologic outcomes in the subgroup population: LC, PFS, and OS
The number of studies used in the subgroup analysis and the heterogeneity test results are shown in Table S2. In the subgroup analysis of patients treated in the head and neck region, the HRs for LC, PFS, and OS were 0.861 (95% CI, 0.536-1.383; p = 0.536), 1.542 (95% CI, 0.893-2.661; p = 0.120), and 0.965 (95% CI, 0.608-1.531; p = 0.880), respectively, using a fixed-effects model, indicating no significant difference (Figure S1). In the pooled analysis of patients treated in areas other than the head and neck, only PFS showed moderate heterogeneity, with an I² of 44.7%. However, considering the limited number of studies, a fixed-effects model was employed, yielding HR estimates of 0.551 (95% CI, 0.341-0.890; p = 0.015), 0.738 (95% CI, 0.427-1.277; p = 0.120), and 1.389 (95% CI, 0.923-2.090; p = 0.880) for LC, PFS, and OS, respectively (Figure S2). LC favored PBT, whereas PFS and OS showed no significant difference. Another subgroup analysis was conducted on articles on patients with chordomas and chondrosarcomas. PFS analysis was not conducted because of the limited number of articles available. The HRs for LC and OS were 0.809 (95% CI, 0.451-1.449; p = 0.476) and 0.956 (95% CI, 0.541-1.689; p = 0.877), respectively, demonstrating no significant difference (Figure S3).

| Adverse events
A total of 11 studies provided data on AEs, with three reporting the OR for AE ≥ Grade 3. The treatment characteristics and results are presented in Table 4. Heterogeneity tests showed an I² value of 0% for all outcomes. In the pooled analysis of the overall population, the OR for any AE was 1.097 (95% CI, 0.744-1.616; p = 0.641) (Figure 2D). Subgroup analyses according to the treatment site and pathology also revealed no significant differences in the occurrence of any AEs between PBT and CIRT (Figure S4). Furthermore, no significant differences were observed in the occurrence of Grade ≥ 3 AEs in the overall population (Figure S5).

| Meta-regression with BED ratio
A meta-regression analysis was conducted using the BED ratio to explore the factors that may explain the possible heterogeneity in the HR of oncologic outcomes. However, no significant association was found between the HR of each outcome and the BED ratio (Figure S6). Furthermore, permutation tests were conducted to address the limitation of the small sample size, yielding consistent findings that reinforced the validity of the observed trends (Table S3).
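The idea behind this meta-regression can be illustrated with a simplified sketch: an inverse-variance weighted regression of the log HR on the BED ratio. Note the assumptions. The authors used mixed-effects models, whereas the version below is a fixed-effect simplification with no between-study variance term and no permutation test; the function name, the direction of the BED ratio, and the input values are all hypothetical.

```python
import math


def weighted_meta_regression(x, log_effects, std_errors):
    """Inverse-variance weighted regression of log effect size on one moderator.

    Fixed-effect simplification: weights are 1 / SE^2 and the slope variance
    is 1 / sum(w * (x - xbar)^2), treating within-study variances as known.
    """
    if not (len(x) == len(log_effects) == len(std_errors)) or len(x) < 3:
        raise ValueError("need at least three studies with matching inputs")
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, log_effects)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if sxx == 0:
        raise ValueError("moderator has no variation across studies")
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, log_effects))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    p = math.erfc(abs(slope / se_slope) / math.sqrt(2.0))  # two-sided Wald test
    return {"slope": slope, "intercept": intercept, "se_slope": se_slope, "p": p}


# Hypothetical per-study BED ratios, log HRs, and standard errors.
fit = weighted_meta_regression([0.8, 0.9, 1.0, 1.1],
                               [-0.40, -0.10, -0.30, 0.05],
                               [0.30, 0.25, 0.35, 0.40])
print(fit)
```

A slope whose confidence interval includes zero corresponds to the finding reported above, namely no significant association between the BED ratio and the HR.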

| DISCUSSION
To our knowledge, this is the first meta-analysis that compares PBT and CIRT exclusively using comparative articles. Despite the difficulty in making direct comparisons owing to the diverse endpoints reported in each study, we observed a degree of comparability in oncologic outcomes and risk of toxicities between the two modalities.
It is widely recognized that although PBT and CIRT share the common advantages inherent to particle beams, they also exhibit distinct properties. Heavy ions exhibit reduced longitudinal and lateral scattering compared to protons, resulting in a smaller dose halo and a narrow penumbra. 29 Furthermore, a carbon-ion beam, with an RBE ranging from 1.5 to 3.4, which is greater than that of a proton beam, is expected to be more effective in eradicating cancer cells with hypoxia and radioresistance. 30 Given these characteristics, it was expected that CIRT would yield superior oncologic outcomes and reduced toxicity compared with PBT. However, evidence confirming the superiority of CIRT is scarce, which may be attributable to several reasons. Comparative studies of PBT versus CIRT are difficult to conduct in the real world because of factors such as patient preference, insurance coverage, and the limited number of heavy-ion centers offering both modalities, which introduces potential bias and limits the opportunities for such studies. 2][33] Furthermore, most studies combined data on photon, proton, and carbon therapies, predominantly emphasizing comparisons between PBRT and photon treatment. The present study is noteworthy as the first meta-analysis on this topic to focus solely on comparative studies, and it confirmed comparable outcomes between the two modalities. Moreover, the significance of our research was enhanced by incorporating a meta-regression analysis that aimed to evaluate the effect of radiation dose on outcomes.

TABLE 2 Clinical characteristics and treatment outcomes of the included studies. The RBE values were calculated using local effect model I.
Our results indicated a modestly better LC with PBT in the overall population. However, this result requires cautious interpretation because of the potential contribution of the study by Iwata et al., in which the follow-up duration was at most 35.5 months and the number of events was only 15. 14 Given the limited quality of that study, these results should be interpreted with caution. Another noteworthy point is that there were no differences in outcomes based on tumor pathology or irradiation site. As sarcomas, including chondrosarcoma and chordoma, are known to be radioresistant compared to other histologies, the potential superiority of CIRT over PBT has often been expected. 34,35 Furthermore, CIRT can potentially be more favorable for treating tumors located in the head and neck area because its distinct physical properties provide a narrower irradiated volume than PBT. 36 However, the present study showed no significant differences in the oncological outcomes or risk of toxicities in either the sarcoma or head and neck subgroups. Nevertheless, concluding that CIRT is not more beneficial than PBT might be premature because several limitations in its real-world application still need to be addressed. Because of the difficulties in comparing PBT and CIRT in the real world, the included studies are few in number and low in quality. 37 In particular, most of the included studies were conducted retrospectively, and 44.4% of them showed poor quality based on the Newcastle-Ottawa Scale. In fact, most centers that perform CIRT use only fixed-beam gantries, which restrict the optimization of irradiation angles, thereby limiting the quality of dosimetry. 38 Furthermore, because the optimal dose prescription and biological model for CIRT have not yet been standardized among institutions, the CIRT protocols among the studies may be diverse. 39 In particular, the inherent variability of RBE with carbon ions is a major challenge in unifying clinical protocols for CIRT among institutions. 40,41 Therefore, considering these limitations is crucial when interpreting the findings and drawing conclusions regarding their effectiveness. In the future, the successful integration of modern technologies, such as gantry rotation, along with the establishment and optimization of biological models may offer promising potential for the utilization of carbon ions, particularly in radioresistant histology. Furthermore, as RCTs comparing PBT and CIRT are ongoing, these studies may provide valuable insights into the comparative effectiveness and potential advantages of each treatment modality (NCT01182753, NCT01182779, NCT01165671, NCT01641185, NCT01811394).
Our study had several limitations. First, the restricted number of articles available for analysis stemmed from our stringent inclusion criteria, which focused exclusively on comparative studies. We did not include single-arm studies in order to limit the risk of bias, and as a result, our analysis was based on a relatively small number of studies. 42,43 While we made efforts to conduct distinct analyses for various cancer types and organs, we ultimately had to opt for a pooled analysis because of the limited availability of eligible studies. We expect that as high-quality comparative research continues to emerge, more robust meta-analyses will become increasingly feasible. Moreover, conducting comparative research requires access to both CIRT and PBT within the same institution, which restricted our study to a limited number of centers and possibly introduced selection bias. Second, while the majority of the included studies focused on head and neck cancer, followed by lung cancer, prostate cancer is the malignancy most frequently treated with both PBT and CIRT in the real world. 44,45 This discrepancy between publication and real-world utilization is worth noting, and readers should be cautious in their interpretations, considering potential bias. Third, the lack of detailed information on clinical factors such as stage or prior treatment history posed challenges during our analysis. Lastly, the absence of a consensus on a standardized RBE for CIRT is a limitation, as different studies have employed varying RBE values or models. Despite these limitations, our greatest strength lies in our exclusive focus on comparative studies, excluding case reports and series.

FIGURE 1 PRISMA flow chart of literature search and selection. PICOS, Population, Intervention, Comparison, Outcome, and Study design.

TABLE 1 Characteristics of the included studies.
a The biologically effective dose (BED10) was calculated by applying an α/β ratio of 10, with calculations rounded to the first decimal place.
b Reporting on the entire population without specifying individual numbers for protons and carbons.
c The number of lesions, rather than the number of patients, is provided.
d RBE values of carbon ion radiotherapy were determined as 2.0-3.7, depending on the depth of the spread-out Bragg peaks.
e Authors did not provide RBE values but only described the model used.

FIGURE 2 Forest plots with random effect model of pooled analyses regarding (A) local control, (B) progression-free survival, (C) overall survival, and (D) adverse events. AE, adverse event; CI, confidence interval; HR, hazard ratio; LC, local control; OR, odds ratio; OS, overall survival; PFS, progression-free survival; RE, random effect.

TABLE 3 Studies included in the pooled analysis for each outcome and analysis for heterogeneity.

TABLE 4 Treatment-related complications in the included studies.
Abbreviations: AE, adverse event; BED, biologically effective dose; CIRT, carbon ion radiotherapy; H&N, head and neck; N/A, not available; NC, nasal cavity; PBT, proton beam therapy; PNS, paranasal sinus.