Combined tumor plus nontumor interim FDG‐PET parameters are prognostic for response to chemoradiation in squamous cell esophageal cancer

We have investigated the prognostic value of two novel interim 18F‐fluorodeoxyglucose positron emission tomography (FDG‐PET) parameters in patients undergoing chemoradiation (CRT) for esophageal squamous cell carcinoma (ESCC): one tumor parameter (maximal standardized uptake ratio, rSUR) and one normal tissue parameter (change of FDG uptake within irradiated nontumor‐affected esophagus, ∆SUVNTO). PET data of 134 European and Chinese patients were analyzed. Parameter establishment was based on 36 patients undergoing preoperative CRT plus surgery; validation was performed in 98 patients receiving definitive CRT. Patients received PET imaging before and during the fourth week of CRT. Clinical parameters, baseline PET parameters, and interim PET parameters (rSUR and ∆SUVNTO) were analyzed with respect to event‐free survival (EFS), overall survival (OS), loco‐regional control (LRC) and freedom from distant metastases (FFDM). Combining rSUR and ∆SUVNTO revealed a strong prognostic impact on EFS, OS, LRC and FFDM in patients undergoing preoperative CRT. In the definitive CRT cohort, univariate analysis with respect to EFS revealed several staging parameters plus both previously established interim PET parameters as significant prognostic factors. Multivariate analyses revealed only rSUR and ∆SUVNTO as independent prognostic factors (p = 0.003, p = 0.008). Combination of these parameters with the cutoff established in preoperative CRT revealed excellent discrimination of patients with a long or short EFS (73% vs. 17% at 2 years, respectively) and significantly discriminated all other endpoints (OS, p < 0.001; LRC, p < 0.001; FFDM, p = 0.02), even in subgroups. Combined use of the interim FDG‐PET derived parameters ∆SUVNTO and rSUR seems to have predictive potential, which may allow selection of responders for definitive CRT and omission of surgery.


Introduction
The best treatment approach for locally advanced esophageal squamous cell carcinoma (ESCC) is controversial. Preoperative chemoradiation (pCRT) bears an overall survival (OS) benefit compared to surgery alone in several trials, including adenocarcinomas and ESCC. [1][2][3] Therefore, there is clear evidence that pCRT should be performed in case of surgical resection. However, whether this trimodality approach comprises an additional OS benefit compared to definitive chemoradiation (dCRT) is uncertain. Two randomized phase-III studies did not show an OS difference between dCRT and pCRT plus surgery, but higher loco-regional failure rates after dCRT. 4,5 As some recent population-based analyses suggest better OS after trimodality treatment, pCRT plus surgery is usually the treatment of choice in medically fit patients. 6,7 Treatment-related morbidity is considerable after trimodality treatment. The mentioned phase-III trials both reported postsurgical mortality rates of around 10%. 4,5 Since up to one-half of patients show complete histopathological remission after pCRT, identification of tumors/patients exhibiting high sensitivity to chemoradiation (CRT) should be a pivotal issue for improved treatment individualization, for example, by organ preservation. 2,8 The applicability of an interim positron emission tomography (PET) scan with 18F-fluorodeoxyglucose (FDG) to measure response to CRT is of limited use with current standard PET parameters, especially in ESCC, 9 although PET-tailored treatment has shown some promising results in adenocarcinomas of the esophagus. 10 Currently, the most often investigated FDG-PET/CT parameter is tumor glucose uptake quantified as standardized uptake value (SUV). However, SUV quantification has several well-known shortcomings, for example, uptake time dependence of the SUV, interstudy variability of the arterial input function and susceptibility to errors in scanner calibration. [11][12][13]
In recent publications, it was shown that the uptake time normalized tumor to blood SUV ratio (standardized uptake ratio, SUR) essentially eliminates most of these shortcomings and leads to an improved correlation with the metabolic uptake rate, [14][15][16] improved test-retest stability 17 and significantly better prognostic value compared to tumor SUV. [18][19][20] It was shown recently that the maximum SUR during interim FDG-PET/CT (rSUR) has a high prognostic value in patients treated with pCRT for esophageal cancer. 18 Furthermore, recent studies reported a high prognostic value of the FDG uptake within irradiated nontumor affected (o)esophagus (NTO) in a cohort of patients treated with pCRT and dCRT, most likely reflecting radiation-induced inflammation. 21,22 The aim of our study was to confirm that both established interim FDG-PET parameters are prognostic and can identify patients that would be candidates for organ preservation.

Establishment of parameters in patients treated with pCRT
Data on the patients treated with pCRT have been published. 18,22 A reanalysis of the combined PET parameters was performed for the 36 patients with squamous cell histology included in both prior analyses; patient details can be found in Supporting Information Table S1.

Inclusion criteria
Inclusion criteria for the retrospective multicenter validation cohort were: Treatment with dCRT for histologically confirmed ESCC with normo-fractionated radiotherapy with curative intent and prescription of curative radiation doses, concomitant chemotherapy according to international standards with platinum compound plus taxane/5-fluorouracil, initial FDG-PET for staging and interim FDG-PET during week four of CRT, no evidence of distant metastases in both PET scans.

Patients, treatment and follow-up
In total, 98 patients were included in the validation cohort. A summary of patient and tumor characteristics is given in Table 1. Chinese patients received dCRT as standard of care while in the two European centers dCRT was restricted to patients with relevant comorbidities or declining surgery.
Details of the 72 patients from the Department of Radiation Oncology of the University Hospital Xiamen have been previously described. 21,23 Briefly, patients received dCRT between 2009 and 2013, mostly using intensity-modulated radiotherapy (IMRT). After a radiation dose of 50 Gy to the tumor, affected lymph nodes, safety margins, and elective lymph nodes, a consecutive boost of 4-16 Gy was prescribed to tumor/affected lymph nodes (average total dose: 58.9 Gy) with reduced margins. Concomitant chemotherapy consisted of two cycles of cisplatin (25 mg/m²/day, Days 1-3 and Days 29-31) and either paclitaxel (135 mg/m²/day, Day 1 and Day 29) or 5-fluorouracil (500 mg/m²/day, . The 23 patients treated at the Department of Radiotherapy and Radiation Oncology of the University Hospital Carl Gustav Carus Dresden, Germany, constitute a subgroup of a previously published study who received CRT and pretherapeutic as well as interim FDG-PET. 20 Patients were treated between 2007 and 2014 using 3D conformal radiotherapy. A total dose of 50 Gy was delivered to the tumor, affected lymph nodes, and elective mediastinal lymph nodes, followed by a sequential boost of 10-20 Gy (average total dose: 66.1 Gy). Concomitant chemotherapy consisted of cisplatin (70 mg/m²) and 5-fluorouracil (3,000 mg/m² as an infusion over 96 hr), applied during the first and fourth weeks of treatment.
Three patients treated at the Department of Radiation Oncology of the Charité University Hospital, Campus Virchow-Klinikum, Germany, were included. Patients were treated in 2016 and received volumetric modulated arc therapy or tomotherapy with an elective dose to mediastinal lymph nodes of 50.4 Gy and a consecutive boost to macroscopic tumor volumes with reduced margins of 57.6 Gy. During both treatment series a simultaneous integrated boost was applied to the metabolic tumor volume as delineated by FDG-PET up to 64 or 66 Gy. Concomitant chemotherapy consisted of weekly carboplatin (AUC = 2.0) and paclitaxel (50 mg/m²).
Chinese patients received barium swallow tests and chest CT every 3 months during the first 2 years and every 6 months during the next 3 years of follow-up. European patients received CT scans at the same intervals and additional endoscopic examinations every 3-6 months. Loco-regional recurrence had to be confirmed by clear radiological signs of malignancy, by continuous imaging, or by biopsy.

PET imaging
Patients from Xiamen were scanned with a Discovery STE (General Electric Medical Systems, Milwaukee, WI). Applied radiation dose at the time of the second scan was on average  Patients from Berlin were scanned with a Gemini TF 16 Astonish (Philips Medical Systems, Cleveland, OH). Applied radiation dose (within the macroscopic tumor) at time of the second scan was on average 41.5 Gy (range: 38.0-46.0 Gy). Data acquisition started 71 ± 9 min (range: 60-86 min) after injection of 236-248 MBq FDG (3D PET acquisition, 90 sec acquisition time per bed position). PET data were reconstructed using BLOB-OS-TF reconstruction (Philips Astonish TF technology: 3 iterations, 33 subsets).

Image analysis
Tracer uptake in the NTO was determined using a roughly cylindrical region of interest (ROI) which was manually delineated as described previously. 21,22 The minimum longitudinal distance to the tumor or affected lymph nodes was 20 mm. The ROI had to be in the high-dose treatment volume and the minimum volume was 5 ml (minimum longitudinal length 20 mm). The delineating observer was blinded to patient outcome. For the resulting ROIs, maximum SUV (SUV max ) was computed.
The metabolically active part of the primary tumor was delineated in the PET data by an automatic algorithm based on adaptive thresholding considering the local background. 24,25 The resulting delineation was inspected visually by an experienced observer and manually corrected if deemed necessary. Manual delineation was performed in 5 out of 98 cases exhibiting only low diffuse tracer accumulation in the respective lesion. In 14 further cases, the delineation algorithm was not able to separate primary tumor and FDG avid lymph nodes in the immediate vicinity of the primary tumor. These lymph nodes were manually removed from the ROI. For the delineated ROIs, SUV max , and the metabolic tumor volume (MTV) were computed.
Since tracer uptake time T was not fully standardized in this retrospective study, lesion SUVs and NTO SUVs were adjusted to an uptake time T 0 = 75 min after injection using the following formula:
$$\mathrm{SUV}_{tc} = \mathrm{SUV}\cdot\left(\frac{T_0}{T}\right)^{1-b}$$
where SUV tc is the time-corrected SUV, T is the time at which the SUV was actually measured and b = 0.31 describes the shape and decrease of the arterial input function over time. 15 As only time-corrected SUVs were investigated, the index "tc" is omitted hereafter.
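As an illustration of the uptake time correction, the following Python sketch assumes a power-law form, SUV_tc = SUV · (T0/T)^(1 − b), which is what an arterial input function decaying as t^(−b) would imply. The function name and the exact functional form are assumptions made for this example; Reference 15 remains the authoritative source.

```python
def time_corrected_suv(suv, uptake_min, t0_min=75.0, b=0.31):
    """Scale an SUV measured at `uptake_min` minutes p.i. to the reference time `t0_min`.

    Assumed form: SUV_tc = SUV * (T0 / T) ** (1 - b), with b = 0.31 as quoted in the text.
    """
    if uptake_min <= 0:
        raise ValueError("uptake time must be positive")
    return suv * (t0_min / uptake_min) ** (1.0 - b)


# A scan acquired earlier than 75 min is scaled up, a later one is scaled down.
early = time_corrected_suv(5.0, 60.0)
late = time_corrected_suv(5.0, 90.0)
```

A scan acquired exactly at 75 min is left unchanged, which is a quick sanity check for any implementation of this correction.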
The arterial blood SUV (BSUV) needed for computation of SUR values was determined by defining a roughly cylindrical aorta ROI in the attenuation CT data and transferring it to the PET data. The aorta ROI was centered in the lumen of the descending aorta observing a minimum volume of 5 ml. To reduce partial volume effects, a concentric safety margin of about 8 mm from the aortic wall was ensured. Planes showing high tracer uptake close to the aorta or showing obvious attenuation correction artifacts were excluded. BSUV was computed as the mean SUV in the aorta ROI.
Lesion SUR max was computed as the ratio of lesion SUV max and BSUV. Uptake time correction to T 0 = 75 min p.i. was performed as described in References 15 and 16:
$$\mathrm{SUR}_{tc} = \frac{T_0}{T}\cdot\mathrm{SUR}$$
where T is the actual time of measurement in the respective scan. In the following, we omit the index "max" in the notation of SUV max and SUR max as only maximum values of these quantities were considered. SUR was analyzed only for tumor lesions and not for NTO. The SUR concept relies on irreversible FDG trapping, which is a valid assumption for malignant tumor lesions. This is not necessarily true for the potentially inflammation-induced NTO uptake, where the precise tracer kinetics remain unclear. 26,27 The fractional differences in SUV between the first and second scan were computed as follows:
$$\Delta\mathrm{SUV} = \frac{\mathrm{rSUV} - \mathrm{bSUV}}{\mathrm{bSUV}}$$
where the prefixes "b" and "r" refer to baseline and interim PET scans, respectively. The fractional differences of SUR were computed accordingly. ROI definition and ROI analyses were performed using the ROVER software, version 3.0.36 (ABX GmbH, Radeberg, Germany).
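The two quantities described above can be sketched in a few lines of Python. This is a minimal illustration, assuming that the lesion SUV and the blood SUV passed in are the values measured at the actual scan time T, that the SUR correction is a linear scaling by T0/T, and that the fractional difference is expressed in percent relative to baseline. Function names are hypothetical.

```python
def sur(lesion_suv_max, blood_suv, uptake_min, t0_min=75.0):
    """Tumor-to-blood uptake ratio, scaled linearly to the reference uptake time."""
    if blood_suv <= 0 or uptake_min <= 0:
        raise ValueError("blood SUV and uptake time must be positive")
    return (lesion_suv_max / blood_suv) * (t0_min / uptake_min)


def fractional_difference(baseline, interim):
    """Change from baseline ("b") to interim ("r") scan, in percent of baseline."""
    if baseline == 0:
        raise ValueError("baseline value must be non-zero")
    return 100.0 * (interim - baseline) / baseline


# Example: lesion SUVmax 9.0, aortic blood SUV 1.5, scan started 60 min p.i.
rsur = sur(9.0, 1.5, 60.0)
# Example: NTO SUV rising from 2.0 at baseline to 2.5 at interim PET.
delta_suv_nto = fractional_difference(2.0, 2.5)
```

Under these assumptions a positive fractional difference means that uptake increased between the baseline and the interim scan.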

Statistical analysis
The primary clinical endpoint was event-free survival (EFS) with any disease recurrence (loco-regional or distant) or death being classified as an event. Secondary endpoints were OS, loco-regional control (LRC), and freedom from distant metastases (FFDM) measured from the start of dCRT to death and/or event. Patients who did not keep follow-up appointments and for whom information on OS or tumor status was, therefore, unavailable were censored at the date of last follow-up. Survival analysis with respect to EFS was performed for eight PET parameters: ΔSUV NTO , MTV, SUV and SUR determined in the baseline PET, SUV and SUR determined in the interim PET (rSUV and rSUR), and the fractional difference of SUV and SUR (ΔSUV and ΔSUR). In addition, the correlation of the clinical parameters gender, age, ethnic group (Asian or European), grade, T-stage, N-stage, UICC-stage and the PET parameters was analyzed. The prognostic value of PET parameters was investigated using univariate and multivariate Cox proportional hazard regression in which the PET parameters were included as metric parameters. Parameters showing a significant effect in this analysis were further analyzed in univariate Cox regression using binarized PET parameters. For ΔSUV NTO , the cutoff value used for binarization was 0% (i.e., high risk was defined as no increase of SUV NTO from first to second PET scan 21 ). The cutoff values for the other PET parameters were calculated by minimizing the p-value in univariate Cox regression as described in Reference 20. The optimal cutoff was determined for EFS and then applied to OS, LRC, and FFDM. Cutoff values leading to p < 0.05 were tested for stability (i.e., sensitivity of the prognostic value to variation of the cutoff value). 
In this test, the range of cutoff values still leading to a significant effect in univariate analysis was computed by successively decreasing/increasing the cutoff value (starting at the optimal value) and repeating univariate Cox regression. Significant parameters were combined; high risk was defined by the presence of one or more risk factors. Probability of survival was computed and rendered as Kaplan-Meier curves. Correlation was tested using Spearman's rank correlation method. Hazard ratios (HR) were compared using the bootstrap method (random resampling with replacement, 10^5 samples) to determine the statistical distribution of (HR1 − HR2) from which the relevant p-value then was derived. Statistical significance was defined as a p-value of less than 0.05. Statistical analysis was performed with the R language and environment for statistical computing version 3.5.1 (2019, R Core Team, Vienna, Austria). 28
Clinical trial information. All patients had to give written informed consent before treatment and imaging. The studies were approved by the Institutional Review Boards of the participating centers and the joint analysis was additionally approved by the Institutional Review Board of the first author's institution (EA2/122/17) and was conducted in accordance with the guidelines of the International Conference on Harmonization/Good Clinical Practice and the principles of the Declaration of Helsinki.
Table note: PET parameters were included as metric parameters. Note that due to high correlation of bSUV and bSUR, the analysis was performed separately for these two parameters (bSUV above, bSUR below). All significant values are in bold.
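The Kaplan-Meier estimate mentioned in the statistical analysis can be illustrated with a short, dependency-free Python sketch. The study itself used R; the function below is a hypothetical stand-in written for this example, and the follow-up times and event flags are invented toy values, not study data.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up times (e.g., months from start of dCRT)
    events : 1 if the event (recurrence or death) occurred, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Toy example: five patients, two of them censored.
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients reduce the number at risk without lowering the survival estimate, which is why the curve only steps down at event times.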

Data availability
The data that support the findings of our study are available from the corresponding author upon reasonable request.

Results
The prognostic value of each single PET parameter has been previously published in single-institution studies. 18,22,29 Since combination of the novel tumor and nontumor parameter has not been assessed and the previously published cohort was not restricted to ESCC only, a reanalysis of 36 ESCC patients treated with pCRT plus surgery evaluating rSUR was performed. Combining the resulting optimal cutoff value rSUR = 4.5 and the previously published value for ΔSUV NTO (0%) and defining low-risk patients as patients presenting ΔSUV NTO > 0% and rSUR ≤ 4.5 revealed a very strong prognostic impact on all endpoints (EFS, OS, LRC, FFDM, see Supporting Information Table S2 and Fig. S3). While several PET parameters were associated with histopathological tumor regression (Supporting Information Fig. S1) the combination of rSUR and ΔSUV NTO did not only show a correlation with tumor regression but seems to deliver further biological information beyond local tumor response as shown in Figure 1.
In the validation cohort of 98 patients treated with dCRT, 2-, 3- and 5-year OS rates were 48, 41 and 27%, respectively. The median follow-up time was 16 and 56 months in surviving patients (range, 8-102 months). The rates for EFS, LRC and FFDM at 5 years were 31, 34 and 71%, which is in line with survival rates reported in current publications. 30,31 Univariate analysis with respect to EFS revealed the PET parameters MTV, ΔSUV NTO , bSUV, bSUR and rSUR as significant prognostic factors. The only significant clinical parameters were N-stage and UICC stage (Table 2, Supporting Information Fig. S2). Correlation analysis showed a strong correlation of bSUV and bSUR (Spearman's rho = 0.8, p < 0.001). All other clinical and PET parameters showed only weak correlation (Spearman's rho < 0.45); see Supporting Information Tables S3 and S4 for a summary of all PET parameters. Therefore, bSUV and bSUR were analyzed separately in multivariate Cox regression in combination with MTV, ΔSUV NTO , rSUR and UICC stage in order to identify the best combination of PET parameters. Both analyses revealed ΔSUV NTO and rSUR as independent prognostic factors (p = 0.006 and p = 0.004, respectively). MTV, bSUV and bSUR did not provide additional prognostic information, while UICC stage showed a trend for significance (p = 0.073); details are reported in Table 3.
ΔSUV NTO and rSUR were binarized using the same cutoff values as in the pCRT plus surgery cohort (ΔSUV NTO = 0% and rSUR = 4.5). In univariate analysis, the binarized parameters, too, were significant prognostic factors (ΔSUV NTO : HR = 2.02, p = 0.0073; rSUR: HR = 3.02, p < 0.001). In a cutoff stability test, both parameters turned out to be very stable regarding their ability to discriminate high- and low-risk patients (Supporting Information Table S5).
Subsequently, ΔSUV NTO and rSUR were combined in the multicenter dCRT validation cohort. High risk was defined by ΔSUV NTO ≤ 0% and/or rSUR > 4.5. Univariate analysis of the combination parameter with respect to EFS revealed a hazard ratio of HR = 4.02 (p < 0.001). Corresponding Kaplan-Meier curves are shown in Figure 2 for EFS and Supporting Information Figure S4 for all other endpoints. HR of combination was significantly larger than HRs of ΔSUV NTO and rSUR (p = 0.012 and p = 0.041, respectively). Two years EFS rate was 17% in the high-risk group and 73% in the low-risk group (defined by the combination of ΔSUV NTO and rSUR).
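The combined risk rule can be written as a small Python function. This is a sketch using the cutoffs stated in the text (ΔSUV NTO ≤ 0% and/or rSUR > 4.5 defines high risk); the function and argument names are hypothetical.

```python
def risk_group(delta_suv_nto_pct, rsur, nto_cutoff=0.0, rsur_cutoff=4.5):
    """Classify a patient as "high" or "low" risk from the two interim PET parameters.

    High risk: no increase of NTO uptake (delta <= cutoff) and/or rSUR above cutoff.
    Low risk : NTO uptake increased AND rSUR at or below cutoff.
    """
    high = delta_suv_nto_pct <= nto_cutoff or rsur > rsur_cutoff
    return "high" if high else "low"


# One risk factor is enough for the high-risk group.
examples = [
    risk_group(12.0, 3.8),   # NTO uptake increased, low rSUR
    risk_group(-4.0, 3.8),   # NTO uptake decreased
    risk_group(12.0, 5.2),   # high rSUR
]
```

Note that the boundary values fall on different sides: ΔSUV NTO of exactly 0% counts as a risk factor, whereas rSUR of exactly 4.5 does not.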
Both PET parameters were also able to predict OS, LRC and FFDM in univariate analysis for each parameter separately. The only exception was ΔSUV NTO for predicting FFDM, which only showed a trend for significance (p = 0.063). Furthermore, the combination of both parameters notably increased effect size compared to ΔSUV NTO and rSUR alone for all endpoints; Figure 3 shows the corresponding Kaplan-Meier curves. The results of univariate Cox regression are shown in Supporting Information Table S6. To exclude bias due to ethnic differences in the dCRT cohort of our study, analyses were also performed separately for European and Asian patients. In both subgroups, the combination of ΔSUV NTO and rSUR still significantly discriminated between high and low risk for all endpoints except FFDM (Supporting Information Figs. S5 and S6). Subgroup analyses showed that the combination of parameters had a high predictive value in terms of EFS and OS regardless of radiation dose, type of concomitant chemotherapy or age, as shown in Figure 4.

Discussion
In this retrospective multicenter evaluation of FDG-PET at baseline and during treatment we could demonstrate, for the first time, that combination of ΔSUV NTO and rSUR appears not only to improve discrimination of patients with good and poor prognosis but also to be potentially predictive for treatment with CRT. The observed loco-regional 2-year control rates of 95% after dCRT suggest that combination of these interim-PET parameters enables prediction of CRT response which is most likely linked to a more favorable tumor biology as shown by the excellent outcome regarding all endpoints in this group (EFS, FFDM and OS). This is an important prerequisite for future studies on individualized treatment, specifically organ preservation in patients presenting positive response parameters.
Compared to various staging PET parameters determined before treatment, the combination of these two interim parameters provided independent and significantly enhanced prognostic value. Our study adds further evidence to the growing knowledge on the promising prognostic value of NTO and SUR in esophageal cancer 18,[20][21][22] and identifies the combination of both parameters obtained during interim PET at the end of the fourth week of dCRT as a promising tool for treatment individualization. To the best of our knowledge, our study is the largest analysis of patients with interim FDG-PET in ESCC with clearly specified timing of interim PET (end of week four of CRT). Our data indicate the potential usefulness of interim PET for response assessment and further treatment planning in ESCC. The prognostic value of both parameters (determined separately) has originally been identified in patients undergoing pCRT. 18,22 The reanalysis of ESCC patients treated with trimodality and the high predictive impact in patients treated with dCRT shown in our study suggests that the combination of both parameters among patients undergoing pCRT might be a useful tool to select well-responding patients for dCRT but continue with surgery in the remaining patients. Additionally, due to the high impact on all endpoints, these parameters might also be very useful for further treatment stratification, for example, additional checkpoint-inhibition in the high-risk group or combined TGFβ and checkpoint inhibition. 32 Two trials that randomized patients to pCRT plus surgery or dCRT alone did either not include any biomarker for response assessment or only used very basic clinical/radiological parameters to include only patients with at least a partial response to pCRT. 4,5 In both trials, OS of patients did not differ significantly between treatment arms, but 3-year locoregional recurrence rates were about 40% after dCRT. 
The role of interim PET for response assessment and potential treatment guidance of ESCC is uncertain. A recent review identified 13 mostly retrospective studies investigating the prognostic value of interim PET in esophageal cancer. The authors concluded that a slight majority of the studies (8 of 13) were able to show a prognostic value of interim PET. 9 However, patients with adenocarcinomas and ESCC were included and PET examinations were performed at various time points between week 2 and 6 of CRT. These heterogeneities combined with the low number of (mostly retrospective) data make it difficult to draw any firm conclusions on the utility of interim-PET for treatment guidance. Another limitation of interim PET during CRT is the increased uptake of normal tissue, especially mucosa and gut. Preclinical and clinical studies have shown that there is a correlation between radiation-induced inflammation and FDG uptake [33][34][35] and FDG-PET is also used in several inflammatory diseases. 36 The increased normal tissue uptake hampers evaluation of tumor response and may have led to negative studies on response assessment in interim PET. Our data suggest that inclusion of NTO is able to counteract this phenomenon.
In our study, only patients with complete PET/CT scans before treatment and at the end of the fourth week of pCRT/ dCRT were included. This is based on our findings in pCRT patients that demonstrated a positive prognostic value of both investigated PET parameters at this timepoint: ΔSUV NTO and rSUR. 18,22 Additionally, at this timepoint modification of the treatment schedule from pCRT to dCRT would still be possible.
Our study has several limitations. Retrospective analysis generally is prone to bias which may affect the reliability of prognostic biomarkers, requiring independent validation. PET-scanners, imaging protocols, radiotherapy techniques, total radiation doses and chemotherapy in our study were not standardized and added to the heterogeneity of the patient population. Furthermore, patient and tumor characteristics of Asian patients could differ from European patients. dCRT in the participating European centers is usually restricted to patients with relevant comorbidities or patients who refuse surgery, while in China dCRT is more commonly applied in patients who also would be candidates for surgery. A recent publication was able to show that BRCA2 loss-of-function germline mutations play an important role in the genetic predisposition of Chinese patients to develop ESCC. 37 On the one hand, these underlying heterogeneities may add to the risk of statistical overfitting leading to spurious associations. On the other hand, the heterogeneity of our validation cohort, in light of the persisting effect in all subgroups, suggests that the PET response biomarkers are very robust. Despite the small sample size, several parameters (ethnicity, age, radiation dose and chemotherapy regimen) also did not impact the high prognostic/predictive value. Regarding the implementation of NTO, glucose uptake within irradiated nontumor tissue also bears a significant prognostic value in head and neck and lung cancer that is independent from PET parameters of the tumor. 38,39 These data suggest that the FDG uptake within the nontumor irradiated tissue can be used to perform biology-dependent radiation dosimetry in vivo. The further biological mechanisms are part of ongoing research. Most likely immunological effects and/or a similar radiosensitivity of tumor and normal tissue are responsible for the observed phenomenon. 39
To the best of our knowledge, the present study is the largest analysis of patients undergoing dCRT with an interim FDG-PET within a standardized interval (fourth week of radiotherapy). The combination of ΔSUV NTO = 0% and rSUR = 4.5 excellently discriminated ESCC patients treated with trimodality (exploration cohort) and dCRT (validation cohort). Together, these data indicate that ΔSUV NTO and rSUR enable prediction of CRT response in ESCC patients.

Conclusion
The combination of tumor and nontumor FDG-PET parameters during Week 4 of CRT appears promising to discriminate good from poor prognosis in esophageal cancer patients irrespective of the chosen therapeutic approach (pCRT plus surgery or dCRT). This could be further exploited as a predictive tool during pCRT to select good responders for dCRT and omit surgery.