Prognostic and predictive role of [18F]fluorodeoxyglucose positron emission tomography (FDG‐PET) in patients with unresectable malignant pleural mesothelioma (MPM) treated with up‐front pemetrexed‐based chemotherapy

Abstract The aim of this study was to evaluate the role of metabolic parameters analyzed at baseline and at interim FDG‐PET in predicting disease outcome in unresectable MPM patients receiving pemetrexed‐based chemotherapy. A consecutive series of MPM patients treated between February 2004 and July 2013 with first‐line pemetrexed‐based chemotherapy, and evaluated by FDG‐PET and CT scan at baseline and after two cycles of chemotherapy, was reviewed. Best CT scan response was assessed according to modified RECIST criteria. Progression‐free survival (PFS) and overall survival (OS) were correlated with FDG‐PET parameters, such as maximum standardized uptake value (SUVmax), total lesion glycolysis (TLG), and percentage changes in SUVmax (∆SUV) and TLG (∆TLG). Overall, 142 patients were enrolled; 77 (54%) received talc pleurodesis before chemotherapy. Baseline SUVmax and TLG showed a statistically significant correlation with PFS and OS (P < 0.05) in both groups of patients (treated and untreated with pleurodesis). In 65 patients not receiving pleurodesis, SUVmax reduction ≥25% (∆SUV ≥ 25%) and TLG reduction ≥30% (∆TLG ≥ 30%) were significantly associated with longer PFS (P < 0.05). Patients showing both ∆SUV ≥ 25% and ∆TLG ≥ 30% responses had a significant reduction in the risk of disease progression (HR: 0.31, P < 0.001) and death (HR: 0.52, P = 0.044). Neither ∆SUV nor ∆TLG showed a similar association with survival outcomes in patients treated with pleurodesis. Our study confirmed the prognostic role of baseline FDG‐PET in a large series of MPM patients treated with first‐line pemetrexed‐based chemotherapy. Moreover, use of ∆SUV ≥ 25% and ∆TLG ≥ 30% as cut‐off values to define early metabolic response supported the role of FDG‐PET in predicting disease outcome and treatment response in patients not receiving pleurodesis.


Introduction
Malignant pleural mesothelioma (MPM) is a rare and mostly fatal tumor, whose incidence is unfortunately increasing worldwide [1]. At diagnosis, the majority of MPM patients are not amenable to up-front radical surgery; thus chemotherapy represents the standard treatment option. Proper definition of baseline prognostic characteristics and reliable assessment of response to therapy are important components of patient care in everyday practice as well as in clinical trials. However, tumor assessment and response evaluation with conventional criteria based on contrast-enhanced computed tomography (CT) measurements are challenging in MPM, because of its diffuse pattern of growth. Modified RECIST criteria have been implemented and are considered the reference standard in clinical practice and ongoing trials. However, they have a high interobserver variability and were not supported by theoretical studies on modeling of mesothelioma growth [2][3][4][5]. Moreover, like all CT criteria, they do not take into account the viability of tumor tissue, which can be better assessed with a functional imaging technique such as [18F]fluorodeoxyglucose positron emission tomography (FDG-PET) [3,6].
Prognostic scores based on clinical factors, such as histological subtype, gender, Eastern Cooperative Oncology Group (ECOG) performance status (PS), and leukocyte and platelet counts have been proposed and validated by the Cancer and Leukemia Group B (CALGB) and the European Organization for Research and Treatment of Cancer (EORTC) [7,8]. The tumor avidity for FDG has been investigated as a surrogate marker of tumor biology. Nowak et al. incorporated semiquantitative PET parameters and pleurodesis into pretreatment predictors, proposing a prognostic nomogram [9]. More recently, other authors have confirmed that pretreatment FDG-PET data are robust predictors of survival in MPM, with volume-based PET parameters and histology being the main independent prognostic factors [10][11][12].
Other studies have explored the value of FDG uptake in response evaluation during chemotherapy. In fact, the early identification of responders to chemotherapy should make it possible to avoid ineffective treatment with significant toxicities in these patients, who are usually elderly, with several comorbidities and reduced performance status, and should also allow optimization of the economic resources of the public health system. Different PET parameters were taken into account when analyzing the metabolic response (MR), defined as a decrease in the maximum standardized uptake value (SUVmax), or with dedicated algorithms analyzing volume-based parameters, such as total glycolytic volume (TGV) or total lesion glycolysis (TLG) [11][12][13][14][15][16][17][18]. All these studies, although conducted in small patient cohorts, suggested that in MPM patients treated with chemotherapy, an early reduction in FDG uptake could be significantly correlated with outcome, especially when talc pleurodesis is not performed at diagnosis.
The aim of this study was to evaluate the role of FDG-PET parameters in predicting disease outcome in a larger cohort of MPM patients treated with upfront pemetrexed-based chemotherapy.

Study population
A consecutive series of MPM patients treated in our Institutions (Humanitas Clinical and Research Center, Rozzano, Milan, Italy and Humanitas Gavazzeni Clinic, Bergamo, Italy) between February 2004 and July 2013 with up-front pemetrexed-based chemotherapy, and evaluated by FDG-PET and CT scan at baseline and after two cycles of therapy, was retrospectively assessed.
Patients who received pleurodesis were included in our study, whereas patients who received less than two cycles of chemotherapy were excluded. Eligibility criteria comprised age ≥18 years, a histological diagnosis of MPM, ECOG PS ≤2, and an estimated life expectancy >12 weeks. The EORTC prognostic score for MPM (good vs. poor) was calculated for each patient [8].
Treatment was repeated for a maximum of six cycles, or until progression or unacceptable toxicity. After completion of chemotherapy, patients were evaluated with chest-abdomen CT scans every 3 months until disease progression. Patients were also followed up for survival until death, or last contact if still alive. This study was conducted with the approval of the local ethics committee, and according to the Helsinki Declaration. The trial was registered at www.clinicaltrials.gov (NCT00969098).

Imaging modalities
Imaging modalities have been described previously [19]. Chest-abdomen CT scans were acquired with a Philips Aura single-slice system in the first 22 patients, and with either Philips Brilliance or Philips Mx 8000 16 scanners in the following cases. PET scans were obtained from the base of the skull to the thighs using a Siemens ECAT ACCEL full-ring scanner until February 2007 (n = 27), whereas later images were acquired on an integrated PET/CT tomograph: (A) Siemens Biograph LSO 6 scanner, with an integrated 6-slice CT; (B) GE Discovery PET/CT 690, with an integrated 64-slice CT; (C) Philips Gemini LXL PET/CT with an integrated 16-slice CT. In order to ensure consistent semiquantitative and quantitative values, each patient was studied during the course of the therapeutic protocol with the same PET or PET/CT scanner. Moreover, since 2011 all our tomographs were accredited with the EANM Research Ltd (EARL) program and image analysis was performed using standardized algorithms [20].
Tumor burden was calculated with three-dimensional volumes of interest (VOIs) drawn on the volume of metabolic tumor-related activity. The standard method of quantification was performed as described by Boucek et al. in the first 29 patients (in whom volume-based analysis was done by a semiautomated iterative threshold-based region-growing algorithm developed at Sir Charles Gairdner Hospital in Nedlands, Australia), whereas in the remaining patients the analysis for TLG computation was done using liver-based threshold semiautomated contouring on the GE ADW4.6 workstation (GE Healthcare, Waukesha, WI) [14,21]. Two board-certified nuclear medicine physicians, working independently and blinded to each other, applied the three-dimensional volume-based region-growing algorithm or the new liver-based quantitative analysis method in the same patients [19]. We previously evaluated the consistency between these two techniques [22]. Both methods defined VOIs at baseline and interim scans, corresponding to the metabolic tumor volume (MTV), while the semiquantitative measures of SUVmax and SUVmean were obtained from the tissue within the VOI: SUVmax was defined as the highest pixel value and SUVmean was defined as the mean SUV related to the tumor burden. Calculation of TLG was done according to the following formula: TLG = MTV (ml) × SUVmean.
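The relationship between MTV, SUVmean, SUVmax and TLG described above can be illustrated with a minimal Python sketch. It assumes the VOI has already been segmented and is available as a flat list of voxel SUV values together with a single voxel volume; the function name `lesion_metrics`, the data layout and the numbers are hypothetical and are not taken from the workstation software used in the study.

```python
from typing import Dict, Sequence


def lesion_metrics(suv_voxels: Sequence[float], voxel_volume_ml: float) -> Dict[str, float]:
    """Summarise the voxels inside one already-segmented VOI.

    suv_voxels      : SUV of every voxel inside the VOI (hypothetical input layout)
    voxel_volume_ml : volume of a single voxel in millilitres
    """
    if len(suv_voxels) == 0:
        raise ValueError("VOI contains no voxels")
    if voxel_volume_ml <= 0:
        raise ValueError("voxel volume must be positive")

    mtv_ml = len(suv_voxels) * voxel_volume_ml      # metabolic tumor volume
    suv_max = max(suv_voxels)                       # highest voxel value in the VOI
    suv_mean = sum(suv_voxels) / len(suv_voxels)    # mean SUV within the VOI
    tlg = mtv_ml * suv_mean                         # TLG = MTV (ml) x SUVmean
    return {"mtv_ml": mtv_ml, "suv_max": suv_max, "suv_mean": suv_mean, "tlg": tlg}


# Illustrative values only, not patient data.
example = lesion_metrics([2.0, 4.0, 6.0, 8.0], voxel_volume_ml=0.5)
print(example)
```

The sketch covers only the final arithmetic; the threshold-based segmentation that defines the VOI is outside its scope.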

Response assessment
Response assessment methods have been previously described [19]. Modified RECIST criteria were used to classify tumor response to treatment as complete response (CR), partial response (PR), stable disease (SD), or progressive disease (PD) [2].
Tumor metabolic response with FDG-PET was based on measurements obtained at the same time-point as for interim CT scan (at baseline and after two cycles of chemotherapy) according to two different parameters: (A) percentage change in SUV max between baseline and interim PET (∆SUV); (B) percentage change in TLG between baseline and interim PET (∆TLG). In both cases, data were analyzed in continuous form, applying cut-off percentages of metabolic response obtained by merging previously published data from our hospital. Dedicated statistical analyses of this study cohort were also performed [23,24].
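As an illustration of the two response parameters, the following Python sketch computes a percentage change between baseline and interim values and compares the reduction with a cut-off. The cut-offs of 25% for SUVmax and 30% for TLG are those reported in this paper; the function names and the example values are assumptions made for the illustration.

```python
def percent_change(baseline: float, interim: float) -> float:
    """Percentage change from baseline to interim (negative means a reduction)."""
    if baseline <= 0:
        raise ValueError("baseline value must be positive")
    return (interim - baseline) / baseline * 100.0


def is_metabolic_responder(change_pct: float, reduction_cutoff_pct: float) -> bool:
    """True when the reduction is at least as large as the cut-off."""
    return change_pct <= -reduction_cutoff_pct


# Hypothetical patient: SUVmax falls from 8.0 to 5.0, TLG from 1000 to 750.
delta_suv = percent_change(8.0, 5.0)         # -37.5
delta_tlg = percent_change(1000.0, 750.0)    # -25.0

suv_response = is_metabolic_responder(delta_suv, 25.0)   # reduction >= 25%
tlg_response = is_metabolic_responder(delta_tlg, 30.0)   # reduction >= 30%
combined_response = suv_response and tlg_response

print(delta_suv, delta_tlg, suv_response, tlg_response, combined_response)
```

In this hypothetical case the patient meets the ∆SUV criterion but not the ∆TLG criterion, so would not count as a responder on both parameters.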

Statistical analyses
This was an observational retrospective analysis on a consecutive series of MPM patients, stratified according to previous talc pleurodesis. Patient characteristics were described in terms of number and percentage, or median and range. For continuous data, differences between groups were compared by Student's t test or the Wilcoxon test, when appropriate.
Progression-free survival (PFS) was defined as the time from the first day of chemotherapy treatment until progression, death from any cause or the last visit when a patient was alive without progression. Overall survival (OS) was defined as the time between the start of treatment and patient death or last contact for patients who were alive.
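The PFS definition above translates into a follow-up time plus an event indicator, which is the form needed for time-to-event analysis. The Python sketch below shows one way to derive both from dates; the function name, the argument layout, the choice to return time in days and the example dates are assumptions for illustration only.

```python
from datetime import date
from typing import Optional, Tuple


def pfs_days(
    chemo_start: date,
    progression: Optional[date],
    death: Optional[date],
    last_visit: date,
) -> Tuple[int, bool]:
    """Return (follow-up in days, event observed) for progression-free survival.

    The event is progression or death from any cause, whichever comes first.
    Patients alive without progression are censored at the last visit.
    """
    event_dates = [d for d in (progression, death) if d is not None]
    if event_dates:
        end, event = min(event_dates), True
    else:
        end, event = last_visit, False
    if end < chemo_start:
        raise ValueError("end of follow-up precedes start of chemotherapy")
    return (end - chemo_start).days, event


# Hypothetical patients, not study data.
progressed = pfs_days(date(2010, 1, 1), date(2010, 7, 1), None, date(2010, 9, 1))
censored = pfs_days(date(2010, 1, 1), None, None, date(2010, 3, 1))
print(progressed, censored)
```

OS can be coded the same way, with death as the only event and last contact as the censoring date.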
Survival curves were generated with the Kaplan-Meier method. Statistically significant variables in the univariate analysis were included in the multivariate model if they confirmed an independent effect. Hazard ratios (HR) with 95% confidence intervals (CI) were calculated with the Cox proportional-hazards regression model in univariate and multivariate analyses. For continuous variables, in the case of a statistically significant association, a recursive regression tree was estimated in order to identify a cutoff value to discriminate patients into different prognostic groups. Statistical significance was set at P < 0.05 for each evaluation.
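For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched in a few lines of plain Python. This is a didactic illustration under simple assumptions (one follow-up time and one event flag per patient) and is not the R code used for the analyses in this study; the function name and the example data are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Product-limit estimate of the survival function.

    times  : follow-up time of each patient
    events : 1 if the event was observed at that time, 0 if censored
    Returns (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    events_at = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(events_at):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        survival *= 1.0 - events_at[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical follow-up in months; the third and fifth patients are censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why the estimate changes only at observed event times.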
All analyses were performed using R software, version 3.0.3 (R Foundation for Statistical Computing, Vienna, Austria); graphics were made using Stata Statistical Software, version 13 (StataCorp. 2013., College Station, TX).

Role of FDG-PET in MPM patients not treated with pleurodesis
On univariate analysis, tumor histology, SUVmax at baseline, TLG at baseline and ∆SUV showed a statistically significant association with both PFS and OS, while ∆TLG showed a statistically significant association with PFS (Table 3). EORTC score was a prognostic factor for OS only. The recursive analysis identified indicative cut-offs of 6.2 for SUVmax and 927.3 for TLG, while corresponding cut-offs for ∆SUV and ∆TLG were −27.8% and −34.97%, respectively. These last two values were similar to previously published data; therefore, we applied a SUV reduction of ≥25% (∆SUV ≥ 25%) and a TLG reduction of ≥30% (∆TLG ≥ 30%) as reference cut-off values [22,23]. PET parameters categorized according to cut-off values were significantly associated with outcome, except ∆TLG, which showed a statistically significant association with PFS only (Figs. 1 and 2).
On multivariate analysis, all PET parameters considered at baseline, that is, SUVmax (P = 0.030) and TLG (P = 0.047), and after two cycles of chemotherapy, that is, ∆SUV (P = 0.028) and ∆TLG (P = 0.049), were significantly associated with PFS. On the other hand, only SUVmax (P = 0.005) was significantly associated with OS. Upon combining the two PET parameters as variation after two cycles of chemotherapy, patients showing both ∆SUV (∆SUV ≥ 25%) and ∆TLG (∆TLG ≥ 30%) responses had a significant reduction in the risk of disease progression (HR: 0.31, P < 0.001) and death (HR: 0.52, P = 0.044).
Table abbreviations: SUVmax, maximum standardized uptake value; ∆SUV, percentage change in SUVmax between baseline PET and interim PET after two cycles of therapy; TLG, total lesion glycolysis; ∆TLG, percentage change in TLG between baseline PET and interim PET after two cycles of therapy.
Baseline values of PET parameters in patients treated with pleurodesis were not significantly different from those in patients not treated with pleurodesis (P = 0.863 and P = 0.389 for SUVmax and TLG, respectively). Moreover, baseline PET parameters did not differ significantly from those evaluated after two cycles of chemotherapy (P = 0.805 and P = 0.343 for SUVmax and TLG, respectively).

Role of FDG-PET in MPM patients treated with pleurodesis
On univariate analysis, ECOG PS, SUVmax at baseline and TLG at baseline showed a statistically significant association with both PFS and OS, whereas histology was associated with OS only (Table 3). The recursive analysis identified indicative cut-offs for SUVmax (9.25) and TLG (534.3) at baseline that distinguished patients with a different outcome (Figs. 1 and 2). On multivariate analysis, baseline TLG (P < 0.001) had a significant association with both PFS and OS, whereas ECOG PS and tumor histology were associated only with PFS and OS, respectively.
None of the variations after two cycles of chemotherapy (i.e., ∆SUV and ∆TLG) considered in continuous form or using the cut-off percentages cited above showed a significant association with PFS or OS.

Discussion
FDG-PET has been increasingly used in MPM for staging and monitoring tumor response to chemotherapy. In fact, preliminary observations suggested that MPM avidity for FDG might be regarded as a surrogate marker of tumor biology with a prognostic significance, while therapy-induced changes in FDG uptake might predict response and patient outcome early in the course of therapy [25].
Flores et al. incorporated SUVmax into a prognostic model with stage and histology, observing that a SUVmax value >10 was associated with poor prognosis [26]. Similarly, SUVmax was an independent predictor of survival in two other patient series, with cut-off values of 10.7 and 5, respectively [10,27]. In contrast, Nowak et al. reported that FDG-PET volumetric parameters significantly predicted survival, whereas SUVmax did not [9]. In particular, baseline TGV was included in a nomogram of pretreatment prognostic factors for MPM. Recently, Klabatsa et al. confirmed TLG and histology as independent prognostic factors, whereas Hooper et al. observed baseline TGV as an independent predictor of worse OS in this disease [11,12]. Moreover, Kadota et al. [28] observed that the baseline level of SUVmax could also identify the subgroup with the worse prognosis among patients with epithelial histology.
In our cohort of patients not receiving pleurodesis, a SUVmax ≥ 6.2 at baseline was significantly associated with a poor prognosis, in agreement with literature data [10,26,27]. Although we applied the same quantification method as used by Nowak et al., at multivariate analysis, only baseline SUVmax showed a statistically significant correlation with OS, whereas TLG did not [9].
We hypothesize that SUVmax could identify the most aggressive tumor clones that drive the prognosis of the disease. This sign of malignancy is probably underweighted in TLG analysis because of the algorithm that calculates this value [29]. This sort of calculation could therefore obscure the significance of focal uptake identified with SUVmax. Conversely, because TLG constitutes an overall estimate of tumor (metabolic) burden, it might be more suitable for response assessment than for survival prognostication. In clinical practice, these data suggest that SUVmax could be sufficient to determine the prognosis of patients who have not undergone pleurodesis.
On the other hand, in our cohort of patients treated with pleurodesis, baseline TLG was a strong independent prognostic factor for PFS and OS, regardless of the inflammatory effects induced by pleurodesis itself. In particular, patients receiving pleurodesis and having a baseline TLG ≤ 534.3 showed a significantly longer median OS than patients with a TLG > 534.3. These results are in agreement with the data of Hooper et al., who reported that baseline TGV predicted the prognosis independently of talc pleurodesis, and with the data of Nowak et al., who observed that baseline TGV remained predictive of survival in patients with previous pleurodesis, independently of histology [9,12]. Taken together, these data support the prognostic role of quantitative PET parameters even in patients treated with pleurodesis, at least at baseline.
Several preliminary studies have explored the role of metabolic response evaluated by FDG-PET in MPM patients treated with pemetrexed-based chemotherapy who have not received talc pleurodesis. In these studies, semiquantitative (SUVmax) and quantitative analyses (MTV, TGV or TLG) were applied by computing variations in areas of FDG accumulation at different time points during treatment [11][12][13][14][15][16][17][18]. In a previous study by our group, a 25% decrease in SUVmax correlated with improved time to progression (14 months vs. 7 months in nonresponders) [13]. However, considering that MPM is often diffuse and heterogeneous, several authors have postulated that SUVmax, as a single-pixel parameter, may not be representative of changes within the entire tumor following chemotherapy [14,30]. Veit-Haibach et al. reported that a TGV reduction obtained after three cycles of chemotherapy was predictive of response as determined by RECIST criteria [15]. Both TGV reduction and CT scan response were associated with improved survival, whereas SUVmax and SUVmean were not, suggesting that volumetric PET measurements of tumor uptake may be more accurate than SUVmax. Evidence of response was reported by Francis et al. as early as after one cycle of chemotherapy using a quantitative semiautomated volume-based FDG-PET analysis able to obtain the TGV [14]. All the reported data, although obtained in small cohorts, suggest that in MPM patients treated with chemotherapy, an early reduction in FDG uptake can be correlated with patient outcome, in particular when talc pleurodesis is not performed. By contrast, Hooper et al. observed that change in interval TGV (baseline/after two cycles of chemotherapy) did not predict OS or chemotherapy response on CT scan [12]. In particular, analyzing 33 out of 41 (80%) MPM patients classified as metabolic responders on interval PET-CT (30% or greater fall in TGV), they did not observe a significant difference between the metabolic responders and nonmetabolic responders group in terms of time to progression on interval CT scan at 2 months (after three cycles of chemotherapy).
In our cohort of patients not treated with talc pleurodesis, ∆SUV and ∆TLG after two cycles of chemotherapy were significantly correlated with PFS, suggesting their predictive role in response assessment. Recursive analysis on our cohort of patients identified −27.8% and −34.97% as the cut-off percentages of metabolic response in terms of reduction in SUV and TLG, respectively. From these data, in agreement with previously published data for other tumors, we postulate that reductions of ≥25% in SUV and ≥30% in TLG (i.e., ∆SUV ≥ 25% and ∆TLG ≥ 30%) might have a role in defining metabolic response [23,24]. The added value of the assessment of metabolic response on PET, as previously reported by our group, could reside in its ability to predict outcome in MPM patients who show SD on CT scan [13,19]. When ∆SUV and ∆TLG were combined, the correlation with PFS improved, suggesting that while ∆SUV alone could be sufficient in clinical practice, the use of both parameters could be more appropriate in clinical trials, when the aim is to test a new treatment.
In patients treated with talc pleurodesis, neither ∆SUV nor ∆TLG showed a significant correlation with PFS or OS, suggesting that FDG signal in these patients is not reliable in the presence of an important inflammatory process. Potentially, the FDG uptake due to inflammation could mask the tumor uptake, particularly in the presence of tumors with low baseline FDG-avidity. In fact, regardless of talc pleurodesis, either ∆SUV or ∆TLG evaluations remain challenging in patients with low SUVmax at baseline. New radiopharmaceuticals under investigation may overcome the limitations demonstrated by FDG in this setting [31].
In conclusion, this study confirms the prognostic role of baseline FDG-PET in a large series of MPM patients treated with first-line pemetrexed-based chemotherapy. Moreover, the use of a SUVmax reduction ≥25% and a TLG reduction ≥30% as cut-off values for the definition of metabolic response after two cycles of chemotherapy confirms the role of FDG-PET in predicting disease outcome and treatment response in patients who have not undergone talc pleurodesis.