A comparison of two methodologies for radiotherapy treatment plan optimization and QA for clinical trials

Abstract

Background and purpose: The efficacy of clinical trials and the outcome of patient treatment depend on the quality assurance (QA) of radiation therapy (RT) plans. Two widely used approaches provide plan optimization guidance based on patient-specific anatomy. This study examined these two techniques, the knowledge-based planning (KBP) technique and a first-principle (FP) technique, for dose-volume histogram predictions, RT plan optimization, and prospective QA.

Methods: This analysis included 60, 44, and 10 RT plans from three Radiation Therapy Oncology Group (RTOG) multi-institutional trials: RTOG 0631 (Spine SRS), RTOG 1308 (NSCLC), and RTOG 0522 (H&N), respectively. Both approaches were compared in terms of dose prediction and plan optimization. The dose predictions were also compared to the original plan submitted to the trials for the QA procedure.

Results: For the RTOG 0631 (Spine SRS) and RTOG 0522 (H&N) plans, the dose predictions from both techniques have correlation coefficients of >0.9. The RT plans that were re-optimized based on the predictions from both techniques showed similar quality, with no statistically significant differences in target coverage or organ-at-risk sparing. The predictions of mean lung and heart doses from both methods for RTOG 1308 (NSCLC) patients, on the other hand, have a discrepancy of up to 14 Gy.

Conclusions: Both methods are valuable tools for optimization guidance of RT plans for Spine SRS and Head and Neck cases, as well as for QA purposes. However, the findings suggest that KBP may be more feasible for inoperable lung cancer patients who are treated with IMRT plans that have spatially unevenly distributed beam angles.


INTRODUCTION
quality score was found to be correlated with treatment outcome. 4 However, the strong association between RT deviations and clinical outcomes may not truly represent causation 1 ; deviations from protocol guidelines may be related to unfavorable patient anatomy (e.g., tumor size, shape, and location) or the quality of the treatment plan. The purpose of QA is to identify cases that are not compliant with the protocol and illuminate the underlying reasons for non-compliance.
A knowledge-based planning (KBP) method that calculates achievable RT plans based on patient anatomy and past planning experience has been reported. 5 This method was adopted and introduced as a separate module (RapidPlan) in the Eclipse treatment planning system (TPS) (Varian Medical Systems, Palo Alto, CA, USA). 6 This module has been widely tested in clinical settings for plan optimization and QA of clinical trials. [7][8][9][10][11][12] Another method that generates direct predictions of organ-at-risk (OAR) dose-volume histograms (DVHs) for treatment plans based on individual patient anatomy and dosimetry was also introduced. 13 This method calculates the predictions based on the first principle (FP) benchmark dose with maximum dose gradients estimated around the target volume(s). PlanIQ (Sun Nuclear Corp, Melbourne, FL, USA) 13,14 is a standalone commercial platform that implements this prediction method. Unlike KBP, the FP method does not require prior knowledge or beam angles specifications. This method was tested for successful dose reduction on the contralateral parotid and larynx in clinical head and neck 4-arc plans. 15 PlanIQ DVH predictions have been integrated into the AutoPlan module of Pinnacle TPS (Philips Medical System, Fitchburg, WI, USA) in order to achieve more personalized prediction-guided RT plan optimizations. 16,17 The findings showed that the integration improves OAR sparing for all disease sites by a statistically significant amount.
No comprehensive comparison of the KBP and FP methods has been published. Using data from multi-center clinical trials, this study compares these two methods for plan optimization guidance and QA for different disease sites. Instead of using Pinnacle, the FP predictions were imported into Varian Eclipse and compared to the KBP module in Eclipse for plan optimization.

Materials
Randomly selected patient DICOM data submitted to the following three National Clinical Trials Network clinical trials were used in this study.  20 Ten cases were used for testing, and a model previously published was used for the KBP method. The dose constraints provided in the protocol were used for plan guidance and evaluations.

Methods
Figure 1 depicts the workflow for data preparation, DVH prediction, plan optimization, and evaluation. Initially, the benchmark doses were calculated using PlanIQ. The prescription dose was applied to the target with a 3-mm grid resolution and a 6-MV dose kernel. The dose kernel was deformed based on the CT density, and a low-dose periphery with a high gradient at the target surface was used to calculate dose spillage. High-voltage photon-beam dose gradients depend on numerous factors, including energy spectrum, depth in tissue, transmission of the modeling material, shape of the modulating edge, and local density of the tissue. The PlanIQ dose algorithm applies the simplest and unachievable dose gradient perpendicular to a photon beamlet (sheer gradient) with the user-selected energy and based on the standard transmission and leaf end shape of the common Varian 120-leaf multi-leaf collimator. The gradient also varies slightly depending on the local anatomy density based on local CT Hounsfield units. 13 A sliding bar was provided in PlanIQ for the feasibility estimation of DVHs based on the benchmark dose, as shown in Figure 2. The estimate of achievable DVHs can be classified into the following four categories: (1) Impossible at 100% coverage (red), (2) Difficult (orange), (3) Challenging (yellow), and (4) Probable (green). 13 The technical details of the benchmark dose and feasible DVH calculations have been previously reported. 22 To generate DVH predictions for all prediction comparisons, the sliding bar in PlanIQ was placed between Difficult and Challenging (the dotted line shown in Figure 2) in this study. This sliding bar position was also used for head and neck plan optimization guidance, which aided in the generation of an RT plan with the best quality for both target coverage and OAR sparing.
To meet protocol constraints, the sliding bar was shifted to the region between difficult and impossible (specifically between the red and orange range in Figure 2) to generate optimization objectives for the spinal cord for RTOG 0631 (Spine SRS). For plan optimization guidance, the DVH predictions were exported and imported into Eclipse TPS.
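As background for the DVH quantities compared throughout this study, the following is a minimal sketch of how a cumulative DVH and the mean and maximum dose of a structure can be derived from per-voxel doses. It is not the PlanIQ or Eclipse implementation; the function names, the equal-voxel-volume assumption, and the sample doses are all illustrative assumptions.

```python
from typing import List, Tuple


def cumulative_dvh(doses_gy: List[float], bin_width_gy: float = 1.0) -> List[Tuple[float, float]]:
    """Return (dose threshold in Gy, percent of volume receiving at least that dose).

    Assumes every voxel has the same volume (an illustrative simplification).
    """
    if not doses_gy:
        raise ValueError("structure has no voxels")
    if bin_width_gy <= 0:
        raise ValueError("bin width must be positive")
    n_voxels = len(doses_gy)
    highest = max(doses_gy)
    points = []
    threshold = 0.0
    while threshold <= highest:
        covered = sum(1 for d in doses_gy if d >= threshold)
        points.append((threshold, 100.0 * covered / n_voxels))
        threshold += bin_width_gy
    return points


def mean_dose(doses_gy: List[float]) -> float:
    """Mean dose of the structure in Gy."""
    return sum(doses_gy) / len(doses_gy)


def max_dose(doses_gy: List[float]) -> float:
    """Maximum point dose (Dmax) of the structure in Gy."""
    return max(doses_gy)


# Hypothetical voxel doses for a small OAR (not trial data).
voxels = [2.0, 4.0, 6.0, 8.0, 10.0]
dvh = cumulative_dvh(voxels, bin_width_gy=5.0)
for threshold, volume_pct in dvh:
    print(f"{threshold:5.1f} Gy -> {volume_pct:5.1f} % of volume")
print("mean:", mean_dose(voxels), "Gy, max:", max_dose(voxels), "Gy")
```

A predicted DVH and a realized DVH expressed in this form can be compared point by point, or through summary values such as the mean dose and Dmax.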

Plan optimization, evaluation, and QA
For the RTOG 0631 (Spine SRS) plans, a high-dose normal tissue ring structure was added to guarantee fast dose falloff for stereotactic radiosurgery. The ring structure is generated around the planning target volume (PTV) extending 5 mm beyond the PTV. Volumetric modulated arc therapy plans with two 360° arcs and fixed collimator angles of 0° and 90° were created for each case from RTOG 0631 (Spine SRS) and RTOG 0522 (H&N). The arc geometry tool in Eclipse was used to select isocenter and jaw settings for each arc. For RTOG 1308 (NSCLC) plans, the originally submitted plan beam arrangement was used. Eclipse version 13.6 in the NRG cloud and the Photon Optimizer with a medium (2.5 mm) resolution were used. Final dose calculations were performed using the analytical anisotropic algorithm on a 2.5-mm grid.
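The 5-mm ring around the PTV can be illustrated with a simplified two-dimensional sketch. This is not the Eclipse structure tool: the grid, the voxel spacing, the center-to-center distance rule, and the assumption that the ring starts at the PTV surface are all hypothetical choices made for illustration.

```python
import math
from typing import List

Mask = List[List[bool]]


def ring_mask(ptv: Mask, spacing_mm: float, margin_mm: float) -> Mask:
    """Mark voxels outside the PTV whose center lies within margin_mm of a PTV voxel center."""
    rows, cols = len(ptv), len(ptv[0])
    reach = int(margin_mm // spacing_mm)  # search radius in voxels
    ring = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if ptv[r][c]:
                continue  # the ring excludes the PTV itself
            for dr in range(-reach, reach + 1):
                for dc in range(-reach, reach + 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and ptv[rr][cc] and math.hypot(dr, dc) * spacing_mm <= margin_mm:
                        ring[r][c] = True
    return ring


# Hypothetical 7 x 7 slice with a single-voxel PTV at the center, 2.5-mm voxels.
ptv = [[False] * 7 for _ in range(7)]
ptv[3][3] = True
ring = ring_mask(ptv, spacing_mm=2.5, margin_mm=5.0)
ring_voxel_count = sum(cell for row in ring for cell in row)
for row in ring:
    print("".join("o" if cell else "." for cell in row))
```

In a real planning system the expansion would be three-dimensional and operate on contours rather than a coarse voxel grid; the sketch only shows the geometric idea of a shell extending a fixed distance beyond the target.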
For each test patient, two plans were generated with the same dosimetric parameters and priority weightings: one using model-generated objectives (KBP plan) and the other using FP-predicted objectives (FP plan). All of the cases went through two optimization iterations.
Following that, the two plans were compared using the protocol compliance criteria listed in Tables 1 and 2. Both plans were also compared to the one that was originally submitted. The target dose conformity indices were calculated.
For the conformity index, we used the Paddick Index as follows:

$$\mathrm{CI}_{\mathrm{Paddick}} = \frac{TV_{PI}^{2}}{TV \times PIV}$$

where $TV_{PI}$ is the target volume encompassed by the prescription isodose surface, $PIV$ is the prescription isodose surface volume, and $TV$ is the target volume.
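The Paddick index is commonly written as the square of the covered target volume divided by the product of the target volume and the prescription isodose volume. A minimal sketch of that calculation follows; the function name and the example volumes are hypothetical and are not taken from the trial data.

```python
def paddick_conformity_index(tv_pi: float, tv: float, piv: float) -> float:
    """Paddick conformity index: TV_PI^2 / (TV * PIV).

    tv_pi: target volume covered by the prescription isodose (cc)
    tv:    target volume (cc)
    piv:   prescription isodose volume (cc)
    """
    if tv <= 0 or piv <= 0:
        raise ValueError("TV and PIV must be positive")
    if tv_pi < 0 or tv_pi > min(tv, piv):
        raise ValueError("TV_PI must lie between 0 and min(TV, PIV)")
    return tv_pi ** 2 / (tv * piv)


# Hypothetical volumes in cc: an 18-cc overlap of a 20-cc target and a 25-cc isodose volume.
ci = paddick_conformity_index(tv_pi=18.0, tv=20.0, piv=25.0)
print(f"Paddick CI = {ci:.3f}")
```

The index equals 1 only when the prescription isodose volume coincides exactly with the target; both under-coverage and dose spill outside the target lower it.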
The results for the originally submitted plan were compared with predictions obtained from both methods to explore the possibility of utilizing these predictions for plan QA.

Comparison of KBP and FP methods for DVH predictions
The average predictions of test cases in each scenario are plotted in Figure 3 using DVH predictions obtained from both the KBP and FP methods (sliding bar between Challenging and Difficult). To compare the two methods, specific critical OARs were used. When target coverage is not sacrificed, the high-dose region of the KBP prediction, shown in Figure 3 for the RTOG 0522 (H&N) parotids, reflects the actual dose distribution in the region where the parotids overlap with the target. The FP predictions for the RTOG 0631 (Spine SRS) spinal cord maximum dose were around 4 Gy higher on average than the KBP predictions. The FP method calculates possible OAR doses based on uniform coverage of the target with the prescription dose, which is not true for SRS plans. As a result, shifting the sliding bar to a more difficult region will result in more accurate OAR dose predictions for SRS plans. The sliding bar was therefore moved to the region between Difficult and Impossible to generate goals for the RTOG 0631 (Spine SRS) plan optimizations that meet protocol dose constraints.
When compared to the KBP prediction, the FP prediction for the RTOG 1308 (NSCLC) heart was 40 Gy lower (Figure 3). Target coverage was significantly reduced, and the dose distribution was strained as a result of plan optimization based on the FP predictions. The sliding bar position was then investigated to generate feasible dose predictions for the RTOG 1308 (NSCLC) heart and lungs for plan optimization guidance. However, a universal sliding bar position that generates feasible predictions for all patients in the RTOG 1308 (NSCLC) test cohort has not yet been identified.

Plan QA results
Predictions for dosimetric points of critical OARs are plotted in Figure 4 against the values from the originally submitted plan and from the re-optimized KBP plan. Figure 4a,b shows the maximum dose values for the cauda equina and spinal cord, respectively, for patients in RTOG 0631 (Spine SRS). The spinal cord maximum dose for original Plan 1 was 4 Gy higher than that of the KBP prediction, whereas that for Plan 2 was 2 Gy lower than that of the KBP prediction. The average prediction from the FP method (sliding bar placed between Challenging and Difficult) was 3.8 ± 1.8 Gy (paired two-sided t-test p = 0.0015) higher than that of the KBP method. The two predictions for the cauda equina did not differ by a statistically significant amount (0.2 ± 1.4 Gy).
In Figure 4c, the submitted plan for case 2 yielded a mean parotid dose 48 and 43 Gy higher than the FP prediction and KBP prediction, respectively. The KBP plan reduced the mean parotid dose by 39 Gy, increased the PTV_7000 D95% by 10.2 Gy, and reduced the spinal cord Dmax by 3.1 Gy. This indicates that both methods are useful tools for QA of parotid sparing for submitted cases. For all 10 cases, the correlation coefficient between the FP predictions and KBP predictions was 0.962, while that between the KBP predictions and re-plan-realized values was 0.957. Figure 4d shows the RTOG 1308 (NSCLC) heart. The average FP prediction for the heart mean dose was 8.6 Gy lower (p = 0.0819) than that of the KBP prediction, 8.6 Gy lower (p = 0.0102) than the KBP re-plan value, and 9.6 Gy lower (p = 0.0493) than the original plan-realized value. The correlation coefficient between the KBP prediction and re-plan-realized value for the mean heart dose was 0.984. For individual patients treated with few-beam intensity-modulated RT (e.g., cases 2, 5, and 10), the FP predictions deviated significantly from the achievable values.
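The paired comparisons and correlation coefficients reported above can be sketched as follows. This is an illustration of the statistics only, using the Python standard library; the function names and the dose values are hypothetical and do not reproduce the study data. The p-value is not computed here: it would be read from Student's t distribution with n - 1 degrees of freedom, for example with a statistics package.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def paired_difference_summary(a: List[float], b: List[float]) -> Tuple[float, float, float, int]:
    """Return mean difference, SD of differences, paired t statistic, and degrees of freedom."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [x - y for x, y in zip(a, b)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if d_sd == 0:
        raise ValueError("all paired differences are identical; t statistic undefined")
    t_stat = d_mean / (d_sd / math.sqrt(len(diffs)))
    return d_mean, d_sd, t_stat, len(diffs) - 1


def pearson_r(a: List[float], b: List[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical maximum-dose predictions in Gy for four patients (not trial data).
fp_pred = [14.0, 12.0, 15.0, 13.0]
kbp_pred = [10.0, 9.0, 11.0, 10.0]

d_mean, d_sd, t_stat, dof = paired_difference_summary(fp_pred, kbp_pred)
r = pearson_r(fp_pred, kbp_pred)
print(f"FP - KBP = {d_mean:.1f} ± {d_sd:.1f} Gy, t = {t_stat:.2f}, dof = {dof}")
print(f"Pearson r = {r:.3f}")
```

A high correlation with a non-zero mean difference, as in this toy data, corresponds to two methods that rank patients similarly while one predicts systematically higher doses.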

DISCUSSIONS
The radiation oncology community in general accepts the KBP method. When compared to the FP method, it uses a more accurate dose calculation algorithm, namely the analytical anisotropic algorithm 6 used by the Varian Eclipse TPS. Furthermore, KBP DVH predictions take into consideration the actual plan field configurations. As a result, KBP models provide more accurate predictions across the disease sites. However, because a plan library is required, this method is less flexible than the FP method. Because knowledge of OAR tolerable dose constraints is constantly expanding, library cases must be rebuilt to accommodate these changes. Comparison of the FP and KBP predictions for the same cases may provide insight into library cases used in the models. In this analysis, the average spinal cord Dmax for the RTOG 0631 (Spine SRS) cases obtained from the KBP predictions was lower than that of the FP predictions (the bar placed between Difficult and Challenging). This finding suggests that the library plans used in the RTOG 0631 (Spine SRS) model sacrificed PTV coverage to meet the spinal cord dose constraint. The FP method provides flexibility to the user to generate optimization objectives for plan guidance. When the sliding bar in Figure 2 was adjusted to the red region for spinal cord DVH predictions, the realized plans exhibited statistically comparable spinal cord sparing results compared with the KBP plans.
Plan optimization for RTOG 0522 (H&N) and RTOG 0631 (Spine SRS) plans based on predictions from both methods yielded plans of similar quality, with no statistically significant differences in target coverage or OAR sparing. This indicated that the FP method provided useful predictions for plan optimization in head and neck and spine SRS cases without the need for prior knowledge.
A report 17 on the successful use of FP predictions for planning guidance for lung cancer patients is noteworthy. Patients in the RTOG 1308 (NSCLC) study have large, irregularly shaped stage II to IIIB inoperable lung lesions. For the cohort of patients, unique angle intensity-modulated RT techniques were used to deliver the escalated prescription dose of 70 Gy while also meeting the more stringent OAR constraints. 19 Although the FP method was successful in generating optimization guidance in some cases, it was unsuccessful in generating attainable predictions in others.
For other disease sites not covered in this study, more comparisons and actual planning are needed. Finally, individual clinical judgment determines the applicability of FP predictions as well as the proper position of the sliding bars.

CONCLUSIONS
The KBP approach can be a reliable tool for all disease sites and clinical situations if a high-quality plan library is available. The FP technique, on the other hand, provides quick insight into the patient's anatomy (without the need for prior knowledge) as well as flexible plan optimization guidance. However, the FP technique ignores beam geometry and relies on a less precise dose calculation algorithm. As a result, it might not be appropriate in some situations, such as patients with inoperable lung cancer treated with few-beam IMRT in RTOG 1308 (NSCLC).

CONFLICT OF INTEREST
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.

AUTHOR CONTRIBUTIONS
Huaizhi Geng and Ying Xiao conceived and designed the experiments and wrote the manuscript. Tawfik Giaddui, Chingyun Cheng, and Haoyu Zhong helped with the data analysis and paper writing. Samuel Ryu, Zhongxing Liao, Fang-Fang Yin, Michael T Gillin, and Radhe Mohan are principal investigators and physics co-chairs for the clinical trials related to this study.

FUNDING INFORMATION
"This project was supported by grants U10CA180868 (NRG Oncology Operations), U10CA180822 (NRG Oncology SDMC), and U24CA180803 (IROC), from the National Cancer Institute, and in part by a grant from the Pennsylvania Department of Health. The Department specifically disclaims responsibility for any analyses, interpretations, or conclusions by Eli Lilly."

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from IROC Philadelphia RT QA. Restrictions apply to the availability of these data, which were used under license for this study. Data are available from ACR cloud service with the permission of IROC Philadelphia RT QA.