Validation of virtual water phantom software for pre‐treatment verification of single‐isocenter multiple‐target stereotactic radiosurgery

Abstract The aim of this study was to benchmark the accuracy of the VIrtual Phantom Epid dose Reconstruction (VIPER) software for pre‐treatment dosimetric verification of multiple‐target stereotactic radiosurgery (SRS). VIPER is an EPID‐based method to reconstruct a 3D dose distribution in a virtual phantom from in‐air portal images. The accuracy of the VIPER dose calculation was assessed using several MLC‐defined fields for a 6 MV photon beam. Central axis percent depth doses (PDDs) and output factors were measured with an ionization chamber in a water tank, while dose planes at a depth of 10 cm in a solid flat phantom were acquired with radiochromic films. The accuracy of VIPER for multiple‐target SRS plan verification was benchmarked against Monte Carlo simulations. Eighteen multiple‐target SRS plans designed with the Eclipse treatment planning system were mapped to a cylindrical water phantom. For each plan, the 3D dose distribution reconstructed by VIPER within the phantom was compared with the Monte Carlo simulation, using a 3D gamma analysis. Dose differences (VIPER vs. measurements) generally within 2% were found for the MLC‐defined fields, while film dosimetry revealed gamma passing rates (GPRs) ≥95% for a 3%/1 mm criterion. For the 18 multiple‐target SRS plans, average 3D GPRs greater than 93% and 98% were found for the 3%/2 mm and 5%/2 mm criteria, respectively. Our results validate the use of VIPER as a dosimetric verification tool for pre‐treatment QA of single‐isocenter multiple‐target SRS plans. The method requires no setup time on the linac and results in an accurate 3D characterization of the delivered dose.

radiotherapy (IMRT) and volumetric modulated arc therapy (VMAT), it is feasible to simultaneously treat multiple metastases using a single isocenter. 5-10 Patient-specific quality assurance (PSQA) is a challenge in cases involving a large number of small, off-isocenter, and widely spread-out lesions. A detector with high spatial resolution and a scanning range large enough to encompass all targets is needed for this kind of PSQA. Current electronic portal imaging devices (EPIDs) offer a large detection area (up to 40 × 40 cm²) and high spatial resolution (~0.3-0.4 mm). 11 Three-dimensional (3D) dose reconstruction over the head volume is therefore feasible from EPID images, potentially enabling dose verification of single-isocenter multiple-target (SIMT) SRS plans. 12 Ansbacher described a method for rapid evaluation of IMRT plans, using EPID images for reconstruction of the dose delivered to a virtual 3D cylindrical phantom. 13 A similar method is used by the VIPER (VIrtual Phantom Epid dose Reconstruction) software developed at Calvary Mater Newcastle Hospital (CMNH), which has been used for remote EPID-based external dosimetric auditing of IMRT and VMAT for clinical trials. 14,15 VIPER is not currently commercially available, but it can be made available on request for research purposes.
A dosimetric validation of the 3D dose reconstruction performed by VIPER is needed before it can be used clinically. However, accurate 3D measurements are extremely difficult to perform and are only available at a few research centers. Monte Carlo (MC) simulation is one of the most accurate methods for evaluating dose distributions calculated by other algorithms. 16-18 MC simulation is often used to estimate dose distributions when experimental measurements are difficult or impossible to perform. The PRIMO software (https://www.primoproject.net/), which permits MC simulations of Varian radiotherapy linacs in a user-friendly manner, has been used in this study. 19,20 This study investigates the accuracy of the VIPER software for pre-treatment patient-specific QA of single-isocenter multiple-target SRS plans. For small static MLC-based fields, dose distributions computed by VIPER on a flat water phantom were compared to 1-D and 2-D measurements in a water tank and with film.
Also, 3D dose distributions computed on a cylindrical water phantom for several SIMT SRS plans were compared with the corresponding 3D dose distributions reported by the PRIMO MC software.

2.A | Linac and VIPER software configuration
All measurements reported in this work were performed at Hospital Quirónsalud Barcelona (Spain) (HQB), which adopted fixed-gantry intensity-modulated radiosurgery in 2009 as its standard procedure for cranial SRS. This was done after assessing the dosimetric advantages compared to a dynamic arc-based SRS technique. 21 Stereotactic radiosurgery is currently planned at HQB using a fixed-gantry sliding window IMRT technique. SRS plans are calculated in the Eclipse treatment planning system (TPS) version 13.6 (Varian Medical Systems, Palo Alto, CA), using the analytical anisotropic algorithm (AAA), with a 1-mm calculation grid size, and 6 MV photon beams from a Varian 2100 CD linac. A non-coplanar beam arrangement (11-14 fields) with a single isocenter is always used. The linac is equipped with a Millennium 120 MLC and a PortalVision aS500 EPID with a sensitive area of 40 × 30 cm² and a resolution of 512 × 384 pixels. This results in a pixel size of 0.784 mm when it is placed at a source-detector-distance (SDD) of 100 cm. The accuracy of the dose calculation performed by Eclipse for small lesions, and the targeting accuracy of single-isocenter IMRT SRS for multiple lesions were previously investigated by our group. 22,23
VIPER is software developed in MATLAB (The MathWorks, Natick, USA) at CMNH that allows EPID-based 3D dose distribution reconstruction for combined IMRT fields and VMAT arcs of a plan in a "virtual" water phantom. Details of the algorithm to convert EPID images to dose have been described previously. 24,25 The method corrects for EPID sag and EPID support arm backscatter.
VIPER requires configuration for each linac used at each center.
Two calibration plans (EPID and TPS calibration plans), and DICOM images of several virtual water phantoms are supplied by CMNH.
The EPID calibration plan consists of fifteen static fields to be delivered to the EPID to determine EPID positioning and sag with gantry angle. The TPS calibration plan consists of a single 10 × 10 cm² field with 100 monitor units (MUs), and gantry zero (Varian IEC 601-2-1 scale) with the isocenter at the center of each "virtual" phantom.
The dose distributions derived from the local TPS (Eclipse, in this study) are used as calibration doses for the EPID-to-dose conversion model. The acquired EPID calibration images and TPS calculated doses are sent to CMNH to create a customized configuration file for each particular linac and TPS beam model.
In this study, the VIPER software (v. 3.10 beta, May 2019) was configured for 6 MV beams, for a dose rate of 600 MU per minute and for the EPID placed at an SDD of 100 cm. All deliveries were scheduled and managed using the Varian Aria version 13.6 record-and-verify system. All EPID images were acquired in-air with no phantom or treatment couch present and using the integrated imaging mode. VIPER is provided with CT datasets of a "virtual" flat phantom of 20 cm height, 50 cm width, and 50 cm length (VFP20), and a "virtual" cylindrical phantom of 20 cm diameter and 40 cm length (VCP20). The VCP20 and VFP20 are termed "virtual" phantoms because no physical phantoms are irradiated and only in-air EPID measurements are required by VIPER. The VFP20 phantom is intended for 2D single-field analysis of each individual IMRT field of a plan, with normal incidence (zero gantry angle) and the isocenter placed at a depth of 10 cm. However, VIPER also performs the 3D dose reconstruction onto the VFP20 of a single field at gantry zero (Varian IEC 601-2-1 scale). Additionally, VIPER uses the VCP20 phantom for a 3D combined-field analysis. This follows the same principle as described by Ansbacher 13 but uses different dose calculation algorithms and depth-dependent (5 cm intervals) EPID-to-dose conversion models combined with missing tissue and buildup corrections. Doses at other depths are interpolated, and the 3D dose is rotated by the delivered gantry angle about the isocenter, which is placed at the center of the VCP20.
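The two geometric steps mentioned above (interpolation between the depth-dependent dose planes and rotation about the isocenter) can be illustrated with a minimal Python sketch. It is not VIPER code: linear interpolation, the function names, and the sign convention of the rotation are assumptions made only for illustration, since the interpolation scheme used by VIPER is not specified here.

```python
import numpy as np

def interpolate_plane(depths_cm, planes, query_cm):
    """Linearly interpolate a dose plane at query_cm from planes at known depths.

    Assumption: linear interpolation between neighbouring model depths.
    depths_cm must be sorted in increasing order; planes has shape (n_depths, ny, nx).
    """
    depths = np.asarray(depths_cm, dtype=float)
    planes = np.asarray(planes, dtype=float)
    if not depths[0] <= query_cm <= depths[-1]:
        raise ValueError("query depth lies outside the modelled depth range")
    hi = int(np.searchsorted(depths, query_cm))  # first index with depth >= query
    if depths[hi] == query_cm:
        return planes[hi].copy()
    lo = hi - 1
    weight = (query_cm - depths[lo]) / (depths[hi] - depths[lo])
    return (1.0 - weight) * planes[lo] + weight * planes[hi]

def rotate_about_isocenter(points, gantry_deg, isocenter=(0.0, 0.0)):
    """Rotate 2D points (in the plane perpendicular to the gantry axis) about the isocenter.

    Assumption: counter-clockwise rotation for positive angles; the actual
    convention depends on the coordinate system of the software.
    """
    theta = np.deg2rad(gantry_deg)
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    iso = np.asarray(isocenter, dtype=float)
    return (np.asarray(points, dtype=float) - iso) @ rot.T + iso

# Illustrative values only: two uniform planes at the 5 cm model spacing
planes = np.stack([np.full((2, 2), 100.0), np.full((2, 2), 80.0)])
plane_7p5 = interpolate_plane([5.0, 10.0], planes, 7.5)
rotated = rotate_about_isocenter([[1.0, 0.0]], 90.0)
```

A full 3D reconstruction would resample the rotated dose grid onto the phantom grid; the sketch only shows the coordinate transformation for individual points.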
To perform a 3D VIPER-based verification of an SRS plan, the plan has to be mapped in Eclipse onto the VCP20 phantom and recalculated using the original MUs and fluences. The VCP20-based plan is then delivered to the EPID, along with an open 10 × 10 cm² field (100 MU) used by VIPER to calibrate the EPID signal to dose. In addition, an optional whole-detector 40 × 30 cm² field is used to determine the uniformity of the EPID. For all the plans included in this study, the 10 × 10 cm² and 40 × 30 cm² calibration images were acquired at each SRS plan verification. Once delivery is complete, the recorded EPID images (DICOM format), the RP DICOM plan file, and the RD DICOM dose file of the SRS plan to be verified are imported into the VIPER software. VIPER then computes the 3D dose distribution inside the VCP20 phantom for comparison with the TPS calculation using its 2D and 3D gamma analysis tools.

2.B | Validation of the VIPER configuration
On-site benchmarking of VIPER against measurement was performed by investigating the accuracy of the dose distributions reconstructed onto the VFP20 phantom for static 1 × 1, 2 × 2, and 3 × 3 cm² fields. Field apertures were defined by the MLC with the jaws set at 10 × 10 cm². One in-air EPID image was acquired for each small field on three different days. Dose reconstruction was done for normal field incidence and a source-to-surface distance (SSD) of 90 cm. To derive absolute doses for these small MLC-defined fields, output factors (OFs) relative to the 10 × 10 cm² field were measured at a depth of 10 cm in the MP3 water phantom using the PinPoint ionization chamber in conjunction with a PTW Unidos electrometer. For these measurements, the detector was placed with its stem perpendicular to the beam axis (radial orientation), as recommended by the International Atomic Energy Agency Technical Report Series No 483 (IAEA-TRS 483). 26 In addition, the derived output factors were corrected with the small-field correction factors given in this report.
PDD and OF ionization chamber-based measurements were repeated on three different days.
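As a minimal sketch of how a measured output factor and a measured PDD can be combined into an absolute dose at a given depth, and how a local percentage difference against a reconstructed dose can then be formed, consider the following. The formula assumes that the output factor is defined at the 10 cm reference depth and that the PDD is renormalized to that depth; the function names and all numbers are hypothetical and are not data from this study.

```python
def absolute_dose(ref_dose_cgy, output_factor, pdd_at_depth, pdd_at_ref_depth):
    """Dose at a given depth for a small field.

    Assumption: ref_dose_cgy is the dose of the 10 x 10 cm2 reference field at the
    reference depth, output_factor is defined at that same depth, and both PDD
    values belong to the small field under study.
    """
    return ref_dose_cgy * output_factor * pdd_at_depth / pdd_at_ref_depth

def local_percent_difference(evaluated, reference):
    """Local difference (%) of an evaluated dose relative to the reference dose."""
    return 100.0 * (evaluated - reference) / reference

# Hypothetical numbers, for illustration only
measured = absolute_dose(ref_dose_cgy=100.0, output_factor=0.8,
                         pdd_at_depth=60.0, pdd_at_ref_depth=50.0)
delta_d = local_percent_difference(evaluated=97.92, reference=measured)
```

With these illustrative inputs the derived dose is 96 cGy and the local difference is 2%.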
For the static 1 × 1, 2 × 2, and 3 × 3 cm² MLC-defined fields, the two-dimensional dose distributions reconstructed by VIPER at 10 cm depth in the VFP20 were compared to the respective measurements. GAFChromic™ EBT3 films (Ashland Inc., Wayne, NJ, USA) in a water-equivalent slab phantom (RW3, PTW) were used. Three films per field size were exposed. Films were scanned 20 hr after exposure with an Epson Pro V750 flatbed scanner (Seiko Epson Corporation, Nagano, Japan) in transmission mode with a resolution of 150 dpi (0.2 mm/pixel) and 48-bit RGB format. Film dosimetry was carried out using the web-based application https://www.radiochromic.com (v. 3.2.2), which uses a multichannel algorithm to improve dose accuracy. 27 The VIPER and film-based measurements were compared in the Radiochromic.com software by performing a global gamma analysis with 3%/1 mm and 5%/1 mm criteria. The comparisons were performed using thresholds of 10% and 80% of the maximum dose, to include and exclude the beam penumbra, respectively.
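The global gamma comparison described above can be sketched in a few lines of Python. This is a simplified brute-force version written for illustration: it searches only on the pixel grid (no sub-pixel interpolation) and is not the algorithm implemented in Radiochromic.com or VIPER. The function names and the test distribution are hypothetical.

```python
import numpy as np

def global_gamma_2d(reference, evaluated, pixel_mm, dose_pct, dta_mm, cutoff_pct=10.0):
    """Brute-force global gamma index of `evaluated` against `reference`.

    The dose tolerance is dose_pct of the reference maximum (global normalisation).
    Points below cutoff_pct of the reference maximum are excluded (set to NaN).
    """
    ref = np.asarray(reference, dtype=float)
    ev = np.asarray(evaluated, dtype=float)
    if ref.shape != ev.shape:
        raise ValueError("dose planes must share the same grid")
    dose_tol = dose_pct / 100.0 * ref.max()
    # Searching out to 2 x DTA is enough to decide pass/fail (gamma <= 1)
    reach = max(1, int(np.ceil(2.0 * dta_mm / pixel_mm)))
    rows, cols = ref.shape
    gamma = np.full(ref.shape, np.nan)
    for r, c in zip(*np.nonzero(ref >= cutoff_pct / 100.0 * ref.max())):
        r0, r1 = max(r - reach, 0), min(r + reach + 1, rows)
        c0, c1 = max(c - reach, 0), min(c + reach + 1, cols)
        rr, cc = np.mgrid[r0:r1, c0:c1]
        dist2 = ((rr - r) ** 2 + (cc - c) ** 2) * pixel_mm ** 2
        diff2 = (ev[r0:r1, c0:c1] - ref[r, c]) ** 2
        gamma[r, c] = np.sqrt(np.min(dist2 / dta_mm ** 2 + diff2 / dose_tol ** 2))
    return gamma

def passing_rate(gamma):
    """Percentage of evaluated points (non-NaN) with gamma <= 1."""
    valid = ~np.isnan(gamma)
    return 100.0 * np.count_nonzero(gamma[valid] <= 1.0) / np.count_nonzero(valid)

# Hypothetical Gaussian-shaped dose plane; the evaluated plane is 2% higher everywhere
y, x = np.mgrid[-10:11, -10:11]
ref_plane = 100.0 * np.exp(-(x ** 2 + y ** 2) / 50.0)
gpr_3_1 = passing_rate(global_gamma_2d(ref_plane, 1.02 * ref_plane, 0.5, 3.0, 1.0))
```

Raising `cutoff_pct` from 10 to 80 restricts the evaluation to the high-dose region, which corresponds to excluding the penumbra as described above.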
Relative dose profiles through the beam central axis were extracted from the planar dose distributions to be compared.
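The field size (FWHM) and penumbra can be extracted from such a relative profile as sketched below. The 80%-20% penumbra definition and the normalization to the profile maximum are assumptions made for illustration; the exact definitions used by the analysis software are not stated here. The trapezoidal profile is synthetic.

```python
import numpy as np

def edge_position(x, y, level, side):
    """Position where the profile falls below `level`, walking outward from the peak.

    side is 'left' or 'right'. Linear interpolation is used between the last
    sample at or above the level and the first sample below it.
    """
    step = -1 if side == "left" else 1
    i = int(np.argmax(y))
    while 0 <= i + step < len(y) and y[i + step] >= level:
        i += step
    j = i + step
    if not 0 <= j < len(y):
        raise ValueError("profile does not fall below the requested level")
    frac = (y[i] - level) / (y[i] - y[j])
    return x[i] + frac * (x[j] - x[i])

def field_size_and_penumbra(positions_mm, profile):
    """Return (FWHM, left penumbra, right penumbra) in mm.

    Assumptions: profile normalised to its maximum; penumbra taken as the
    distance between the 80% and 20% dose levels.
    """
    x = np.asarray(positions_mm, dtype=float)
    y = np.asarray(profile, dtype=float)
    y = 100.0 * y / y.max()
    fwhm = edge_position(x, y, 50.0, "right") - edge_position(x, y, 50.0, "left")
    pen_left = edge_position(x, y, 80.0, "left") - edge_position(x, y, 20.0, "left")
    pen_right = edge_position(x, y, 20.0, "right") - edge_position(x, y, 80.0, "right")
    return fwhm, pen_left, pen_right

# Synthetic trapezoidal profile: flat for |x| <= 8 mm, zero beyond |x| = 12 mm
positions = np.arange(-20.0, 20.5, 0.5)
profile = np.clip((12.0 - np.abs(positions)) / 4.0, 0.0, 1.0)
fwhm, pen_left, pen_right = field_size_and_penumbra(positions, profile)
```

For this synthetic profile the 50% level is crossed at ±10 mm and the 80% and 20% levels at ±8.8 mm and ±11.2 mm, so the sketch should return a 20 mm field size and 2.4 mm penumbras.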
Given the stated advantage of a large measurement area for the VIPER method, benchmarking of some off-axis apertures was also included. The IMRT field known as the "Aida" pattern was used. 28 It represents a sequence of rectangles of decreasing width (from 12 to 1 cm) as shown in Figure 1. Doses computed by VIPER at the center of each aperture at 10-cm depth in the VFP20 phantom were compared to the respective doses measured in the MP3 water phantom using the PinPoint ionization chamber in axial orientation.
Finally, verification of the VIPER configuration against measurement was performed using the "chair" test. In-air EPID images were acquired for a "chair" test planned in Eclipse using the VFP20. all depths. Doses reconstructed by VIPER were compared with these measurements. Repeatability of the ionization chamber measurements was within 1%.

2.C | Verification of the VIPER 3D dose reconstruction
The PRIMO MC software has been used in this study for verification of the full 3D dose distributions computed by VIPER for SIMT SRS plans. The accuracy of PRIMO (with the default simulation parameters) for the dose calculation of static 3DCRT beams was previously benchmarked by our group against the reference dosimetry dataset from IROC-H (Imaging and Radiation Oncology Core-Houston). A dosimetric accuracy within 2.8% was found for 6 MV beams from the Varian Clinac 2100 CD with a Millennium 120 MLC used in the present study. 29 Our team also validated PRIMO for independent verification of IMRT SRS plans. 30 SRS plans for single and multiple targets defined in a polystyrene phantom were simulated with PRIMO, and dose planes were compared against radiochromic film measurements. Single targets consisted of spheres of 0.5, 1, 2, and 3 cm diameter directly outlined in the phantom CT images, while a set of three spheres of 1 cm diameter was outlined to mimic a multiple-target case. In addition, one brain metastasis (1 cm³) and one vestibular schwannoma (1 cm³) were mapped from two clinical cases to the phantom. All targets were located in the phantom with their centers at the film plane. GPRs ≥97% for a global 5%/1 mm criterion, and local dose differences at the target centers within ±3.6% were found.
Each SIMT SRS plan was mapped in Eclipse to the VCP20 water phantom, and couch positions were set to 0 degrees ("verification plan"). The verification plan was re-calculated in Eclipse with a grid size of 1 mm by keeping the same MUs, and then was delivered onto the EPID without any phantom. DynaLog files generated by the MLC controller during this delivery were retrieved, and the actual MLC segments were reconstructed using an in-house code. 31 For each verification plan, the DICOM RP file and the corresponding in-air acquired EPID images were imported into the VIPER software to reconstruct the 3D dose distribution within the VCP20 ("VIPER plan").
Finally, each verification plan was simulated within the VCP20 with the PRIMO MC software (v. 3.1.0.1772). A calculation voxel size of 1.2 mm × 1.2 mm × 1.0 mm was used. The simulation conditions used were described in a previous work. 29 Simulation of each verification plan was done using the actual DynaLog file-based MLC segments instead of the planned MLC patterns ("PRIMO plan").
The dosimetric agreement between VIPER and PRIMO plans was assessed using the 3D gamma tool available in the PRIMO software.
Global gamma analysis with the criteria of 3%/2 mm and 5%/2 mm was used in the present study. GPRs for both criteria were calculated within the ROIs of the VCP20 receiving at least 10%, 20%, 30%, 50%, 70%, 80%, and 90% of the maximum VIPER dose. A minimum GPR of 90% is considered an acceptable level for the comparison, according to the AAPM Task Group No. 218. 32 In addition, PRIMO allows computing the dose-volume histogram (DVH) percentage of agreement (PA) for each ROI. The quantity PA was introduced by Rodriguez et al. as an indicator of the similarity of two DVHs, and it was shown to be more sensitive than the GPR in detecting differences between the dose distributions being compared. 33 A PA value of 100% indicates perfect DVH agreement. A minimum PA of 95% is considered in this study to indicate good DVH agreement.
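The evaluation of GPRs within dose-threshold ROIs can be sketched as follows, assuming that a gamma array has already been computed on the same grid as the VIPER dose. This is an illustrative sketch and not the PRIMO implementation; the function names and the small arrays are hypothetical.

```python
import numpy as np

def gpr_by_roi(gamma, dose, thresholds_pct=(10, 20, 30, 50, 70, 80, 90)):
    """Gamma passing rate (%) inside each ROI receiving at least t% of the maximum dose.

    gamma and dose must have the same shape (1D, 2D or 3D).
    """
    gamma = np.asarray(gamma, dtype=float)
    dose = np.asarray(dose, dtype=float)
    dose_max = dose.max()
    rates = {}
    for t in thresholds_pct:
        roi = dose >= t / 100.0 * dose_max  # never empty: contains the maximum-dose voxel
        rates[t] = 100.0 * np.count_nonzero(gamma[roi] <= 1.0) / np.count_nonzero(roi)
    return rates

def meets_action_level(rates, minimum_gpr=90.0):
    """Flag each ROI against the 90% acceptance level quoted above."""
    return {t: rate >= minimum_gpr for t, rate in rates.items()}

# Hypothetical four-voxel example, for illustration only
dose = np.array([20.0, 60.0, 100.0, 100.0])
gamma = np.array([2.0, 0.5, 0.5, 1.5])
rates = gpr_by_roi(gamma, dose, thresholds_pct=(10, 50, 90))
flags = meets_action_level(rates)
```

Because the low-dose voxels drop out as the threshold rises, the GPR in the higher-threshold ROIs is dominated by the target regions, which is why several thresholds are reported.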
The median dose (D50) to a high-dose volume (HDV) is another metric included in the analysis to gain insight into the differences in absolute dose values. The 3D dose distributions reconstructed by VIPER were directly compared with the PRIMO calculations by the percentage difference in median dose to the high-dose volume (%ΔHDVD50), as described by Olaciregui-Ruiz et al. 34 The selected HDV is the ROI80%, as the 80% isodose is a common prescription of the SRS treatments at HQB.

3.A | Validation of the VIPER configuration
Table 1, with a maximum difference of 3.2% at 5 cm depth for the 1 × 1 cm² field. Repeatability of the PDD values derived from VIPER and those measured with the ionization chamber, collected on three different days, was better than 0.05% and 0.2%, respectively. The percentage local differences (ΔD) between the absolute doses given by VIPER and the respective absolute doses derived by combining the measured OFs and PDDs are shown in Table 1 for the depths of 1.5, 5, 10, and 15 cm. The maximum difference was 3.2% at a depth of 15 cm for the 3 × 3 cm² field. Repeatability of ΔD was better than 0.5% from data collected on three different days.
FIG. 3. PDD comparison (VIPER vs. PinPoint-based measurements).
TABLE 1. VIPER vs. PinPoint-based measurements: differences in PDD (ΔPDD) and local dose differences (ΔD) at several depths in water for three small fields.
Table 2 displays the OFs at a depth of 10 cm in water derived by VIPER from in-air EPID images and those measured with the PinPoint ionization chamber. Repeatability of OF measurements was better than 0.5% from data collected on three different days. Differences of 1.8%, 1.0%, and 1.6% were found in the OFs for the 1 × 1, 2 × 2, and 3 × 3 cm² fields, respectively. For off-axis apertures, Table 3 shows the local dose differences (VIPER vs. PinPoint-based measurement) at the center of each rectangular aperture of the Aida test. A maximum difference of 0.8% was found for the smallest 1 × 3 cm² aperture at 10 cm off-axis distance from the central axis.
Figure 4 shows the relative dose profiles at a depth of 10 cm in the VFP20 phantom provided by VIPER and those measured with film in RW3 for the 1 × 1, 2 × 2, and 3 × 3 cm² fields. Table 4 shows the differences (VIPER vs. measured) found in field size (FWHM) and penumbras along the cross- and in-plane directions, as quantified from three film measurements per field for the 1 × 1, 2 × 2, and 3 × 3 cm² fields. The cross-plane direction coincides with the leaf motion direction. VIPER always overestimated the beam penumbra (up to ~1 mm). The differences between the field size values given by VIPER and film were within 0.4 mm. Figure 4 shows a systematic underestimation (<2.5%) of the out-of-field dose by the VIPER software. The GPRs of the 2D global gamma evaluation between the VIPER-reconstructed dose planes at 10 cm depth in the VFP20 phantom and the reference 2D dose distributions measured with film in RW3 were above 95% for all the static fields, for both global 3%/1 mm and 5%/1 mm criteria (Table 5).
However, small discrepancies (<1.1 mm) in the penumbra of the small static 1 × 1, 2 × 2, and 3 × 3 cm² fields were found between the VIPER and the film-based methods (Table 4).
Due to its high spatial resolution, radiochromic film is a very suitable detector for lateral profile measurements in small fields (IAEA
TABLE 6. VIPER vs. PinPoint-based measurements: local dose differences at several depths in water for the selected points of the "chair" test.
which has not been specifically optimized for small fields. 15 Hence, although there is margin for improvement in the VIPER modeling of the penumbra, it is still adequate for the purposes of this work. From Table 6, VIPER underestimated doses in the high MLC transmission regions of the chair test, which is a known issue with EPID response but does not affect gamma pass rates significantly. 38,39 A potential source of inaccuracy in this study is that the point doses measured using the PinPoint ionization chamber were made in the very large MP3 water phantom (73 cm × 52 cm × 64 cm) in contrast to the size of the VFP20 phantom (20 cm × 20 cm × 40 cm) used by VIPER for dose reconstruction, such that the difference in scatter could be an issue for dose comparison (VIPER vs. measured).
However, the excess of scatter due to the larger water phantom MP3 was estimated in Eclipse and it resulted in a negligible error of less than 0.2%.
It is well known that hardening of the photon energy spectrum of small fields occurs at any point on the beam central axis with decreasing field size. 26 This effect is revealed by the PDD values measured with the PinPoint chamber, but not by the PDD values given by VIPER (Table 1). We think that this issue could be solved by adjusting the current parameters of the depth-dependent scatter EPID kernel of VIPER to improve the agreement for depth doses on the central axis for small fields. The accuracy of VIPER was investigated for auditing IMRT/VMAT plans in the Virtual Epid Standard Phantom Audit (VESPA) project. 14 However, the EPID-to-dose conversion performed by VIPER was not specifically developed for small field dosimetry, as has been done in some back-projection portal dosimetry models. 40
The accuracy of VIPER for 3D dose reconstruction was assessed using a gamma index analysis for 18 IMRT SRS plans, with PRIMO MC simulations as the reference. As Miften et al. stated, both the spatial and dosimetric uncertainties need to be considered when comparing dose distributions to determine whether the reference and evaluated dose distributions (PRIMO and VIPER, respectively, in our study) agree to within clinically relevant limits. 32 A typical 3% limit is taken as the acceptance criterion for the dose differences during IMRT PSQA. 32 Using this 3% criterion presumes that the uncertainty in PRIMO itself is significantly less than this value. However, we believe it is appropriate to expand the 3% gamma dose criterion to 5% to account for the statistical uncertainty (~2%) of the PRIMO dose distributions. 41
At this point, the question arises about the advantage of using the EPID measurements and reconstructing the dose in the VCP20 (VIPER software) compared to taking the DynaLog files and recomputing the delivered fields in the VCP20 using the PRIMO software.
It is well described that DynaLog files can be used in combination with a dose calculation algorithm (e.g., MC simulation) to estimate the dose delivered to a patient. 31,33 Although these logs contain information on the MLC leaf positions for each MU delivered by the linac, they do not provide a direct and independent measurement of the actual fluence delivered or beam intensity, in contrast to the use of an EPID. Moreover, it has been reported that some MLC malfunctions are not recorded in the treatment logs. 48,49 However, they can be detected with an EPID acquisition. We therefore think that a more independent check of the SRS plan can be done using EPID-based 3D dose reconstruction than with an MC and DynaLog-based procedure.
The MLC used in this study has 40 central leaf pairs of 5 mm width (covering the central 20 cm field size), and 20 outer leaf pairs of 1 cm width (covering up to 40 cm field size). Therefore, the 1 cm leaves could compromise the treatment of very small targets.
According to our procedure for SIMT SRS, the treatment isocenter is located at the center of the brain such that it always lies in the 20-cm field (superior-inferior length) where only 5-mm leaves are available. In this way, all lesions included in this study were covered by the 5-mm leaves. In our clinical routine, we have not so far encountered a brain longer than 20 cm in the superior-inferior direction.
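The leaf-width argument above can be checked with a short Python sketch based only on the MLC geometry quoted in the text (40 central leaf pairs of 5 mm and 20 outer leaf pairs of 1 cm, projected at the isocenter). The function names and the target offsets are hypothetical, and the sketch assumes that offsets are measured perpendicular to the leaf motion direction from a field centered on the isocenter.

```python
CENTRAL_HALF_WIDTH_CM = 40 * 0.5 / 2.0                       # 40 pairs x 5 mm -> +/-10 cm
TOTAL_HALF_WIDTH_CM = CENTRAL_HALF_WIDTH_CM + 20 * 1.0 / 2.0  # plus 20 pairs x 1 cm -> +/-20 cm

def leaf_width_mm(offset_cm):
    """Leaf width (mm) covering a point at offset_cm from the isocenter.

    Assumption: offset measured perpendicular to the leaf motion direction.
    """
    distance = abs(offset_cm)
    if distance <= CENTRAL_HALF_WIDTH_CM:
        return 5.0
    if distance <= TOTAL_HALF_WIDTH_CM:
        return 10.0
    raise ValueError("point lies outside the MLC field")

def covered_by_fine_leaves(center_offset_cm, radius_cm):
    """True if a spherical target lies entirely under the 5-mm leaves."""
    return abs(center_offset_cm) + radius_cm <= CENTRAL_HALF_WIDTH_CM

# Hypothetical targets: (offset from isocenter in cm, radius in cm)
targets = [(-7.5, 0.5), (2.0, 1.0), (9.8, 0.5)]
fine_leaf_flags = [covered_by_fine_leaves(offset, radius) for offset, radius in targets]
```

In this illustration the third target extends 0.3 cm beyond the central 20 cm and would therefore be partly shaped by 1-cm leaves.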
The VIPER verification of an SRS plan is based on a virtual phantom; that is, a real phantom is not used and therefore no fiducial or landmark is available to check the targeting accuracy. For instance, the isocenter wobble of the couch is not taken into account by VIPER, as the EPID is insensitive to the couch rotation.
In summary, this study has demonstrated that the VIPER software can be a streamlined alternative to commercially available 2D detector arrays, overcoming their drawbacks related to detector field coverage and resolution when SIMT plans have to be verified. By contrast, the VIPER software has some limitations: (a) 3D dose distribution calculation with respect to the patient's CT anatomy cannot be done, as performed by other software; 34,50 (b) reconstruction of non-coplanar fields onto the VCP20 phantom is not supported by the version of VIPER used in this study; and (c) VIPER cannot replace the geometric checks used in SRS QA to verify the isocenter and targeting accuracy, like a Winston-Lutz test or an end-to-end test. 23 The present study only assessed the feasibility of using the VIPER software as a tool for PSQA. To that purpose, the VIPER software needs to be validated independently of the TPS (with an MC simulation in this study).

| CONCLUSIONS
VIPER software calculations were in agreement with PRIMO simulations within 5%/2 mm for clinical single-isocenter multiple-target SRS cases. Our results support the use of VIPER as a dosimetric check tool for pre-treatment QA of single-isocenter multiple-target plans.
VIPER is an alternative to commercially available 2D detector arrays for this task.