Evaluation of an electron Monte Carlo dose calculation algorithm for electron beams

The electron Monte Carlo (eMC) dose calculation algorithm of the Eclipse treatment planning system is based heavily upon Monte Carlo simulation of the linac head and modeling of the linac beam characteristics with minimal measurement of beam data. Commissioning of the eMC algorithm on multiple identical linacs provided a unique opportunity to systematically evaluate the algorithm with actual measurements of clinically relevant beam and dose parameters. In this study, measured and eMC calculated dose distributions were compared both along and perpendicular to electron beam direction for electron energy/applicator/depth combination using measurement data from four Varian CLINAC 21EX linear accelerators (Varian Medical Systems, Palo Alto, CA). Cutout factors for sizes down to 3×3 cm were also compared. Comparisons between the measurement and the eMC calculated values show that the R90, R80, R50, and R10 values mostly agree within 3 mm. Measure and calculated bremsstrahlung dose Dx correlates well statistically although eMC calculated Dx values are consistently smaller than the measured, with maximum discrepancy of 1% for the 20 MeV electron beams. Surface dose agrees mostly within 2%. Field width and penumbra agree mostly within 3 mm. Calculation grid size is found to have a significant effect on the dose calculation. A grid size of 5 mm can produce erroneous dose distributions. Using a grid size of 2.5 mm and a 3% accuracy specified for the eMC to stop calculation iteration, the absolute output agrees with measurements within 3% for field sizes of 5×5 cm or larger. For cutout of 3×3 cm, however, the output disagreement can reach 8%. Our results indicate that the eMC algorithm in Eclipse provides acceptable agreement with measurement data for most clinical situations. Calculation grid size of 2.5 mm or smaller is recommended.


I. INTRODUCTION
The electron Monte Carlo (eMC) algorithm of the Eclipse treatment planning system (Varian Medical Systems, Palo Alto, CA) uses electron energy dependent dose kernel libraries, of macroscopic spheres of various radii and materials, that are pre-calculated with the EGS4 Monte Carlo code. (1) The advantage of this macro-Monte Carlo approach is short treatment planning time with dose calculation accuracy similar to that of bona fide Monte Carlo program. Explicit knowledge of beam forming parts inside a Varian CLINAC 21EX linear accelerator is used in pre-calculating these dose kernels. Therefore, minimal amount of measured beam data is required for the commissioning of the eMC treatment planning system. The required input beam data include the profiles in air at 95 cm for each energy, the relative depth-dose curve in water and absolute dose at a specified point for each energy/applicator combination. Profiles at different depths in water for each electron energy/ applicator combination are not used as input. The algorithm uses initial phase space model (2) to derive the initial beam phase space based on the PDDs, assuming a linac head design of a standard Varian Clinac 21EX linear accelerator.
Agreement between calculation and measurement can be affected by many factors. Previously published studies provided substantial information on factors related to eMC calculation settings, such as accuracy, calculation grids and smoothing methods using measurement data from one or two machines. (3) However, because of the stochastic nature of Monte Carlo calculation and inevitable variations in measurements, comparisons based upon data from a group of machines provide more reliable results. Further more, systematic information on how other clinically relevant factors (such as electron energies, cone sizes, and depth) affect the agreement, is lacking. We have recently commissioned the eMC algorithm for four Varian Clinac 21EX linear accelerators at a single clinic location, which provide a unique opportunity to systematically compare eMC calculation and actual measurements. In addition to profiles at various depths, electron cutout factors for cutouts down to the size of 3 × 3 cm were measured. The output verification is especially important due to the stochastic nature of the Monte Carlo method which gives the output with a specified statistical uncertainty.
We present a systemic comparison between eMC calculations and extensive measurements on four Varian Clinac 21EX accelerators. Recommendations on how to use the treatment planning systems are given.

II. MATERIALS AND METHODS
The four commissioned linear accelerators share the same Varian Clinac 21EX model number. Their photon beams are matched in PDD and beam profiles. The electron beams are not matched.
Required beam data for the eMC algorithm were collected for each of the machines. Profiles in air were obtained for each electron energy at 95 cm. A large water tank, 48 × 48 × 41 cm, with a 3D scanning mechanism (Blue Phantom: Scanditronix-Wellhofer, Bartlett, TN) was used to acquire relative depth-dose curve in water at SSD=100 cm and absolute dose depth of 100% dose. Both the reference and field detectors were diodes. The distance of 0.05 cm from the active layer to the casing surface of the scanning diode was taken into account. From PDDs, the following parameters were derived: R 100 (depth of 100% dose), R 90 (depth at which dose is 90% of the maximum dose), R 50 , D x (bremsstrahlung tail) and surface doses. Profiles were scanned at the depths of R 100 , R 90 , and R 50 in water for each beam energy and applicator combination. The scanning system was set to acquire data points every 0.5 mm. OmniPro Accept (Version 6.1, Scanditronix-Wellhofer, Bartlett, TN) was used to acquire and export the data. Square Cerrobend electron cutouts were made for each applicator down to the size of 3 × 3 cm. For consistency, all cutout factors were measured with the same mini ionization chamber CC01 (cavity volume 0.01 cm 3 , radius 1.0 mm, Wellhofer-Scanditronix, Bartlett, TN) placed at R 100 of each energy/cone. Data were collected by different groups of physicists using the same measurement devices.
Measured beam data acquired from phantom scan were exported from OmniPro in ASCII format. Each measurement setup was then calculated in Eclipse eMC and the PDD and profiles were exported from Eclipse in the DICOM format and converted to ASCII format for comparison. Linear interpolation of the eMC dose distribution was used when necessary to match the measurement locations.
Dose distributions for each energy/cone combination were calculated with algorithms commissioned for each machine. The accuracy was set at 2%. The software allows the planner to define either statistical accuracy or numbers of particles to transport. According to the manufacturer, the accuracy refers to the mean statistical error in dose for all voxels receiving more than 50% of the maximum dose value located within the region of interest. (7) When the stated mean statistical accuracy is achieved, the algorithm will stop simulation. Medium level 3-D Gaussian method was used for smoothing after calculation. Calculation grid size of 2.5 mm was used throughout except for a field of 3 × 3 cm, where 1 mm was used and will be discussed in the grid size investigation. Selection of these parameters was based upon computational time and published studies. (3)(4)(5) Each electron cutout configuration was calculated in Eclipse and the cutout factors were derived from the calculations. A total dose of 10,000 cGy was prescribed and normalized to R 100 . Since different conventions are being used for some of the beam parameters, the definitions, as used in this study, are listed in Appendix A for clarity.
The combination of energy/cone size/depth/cutout size in measurement and eMC calculation generated a substantial amount of data. In order to analyze the data systematically and consistently, a statistical software package SAS (SAS, Cary, NC) was utilized. Analysis of variance (ANOVA) and linear regression were used to study the influence of factors (e.g., machines, energies, cone sizes) on the discrepancies between measurements and eMC calculation results. Significant level was set at 0.05. Table 1 compares the depth R 90 , R 80 , R 50 , and R 10 in water of measured and eMC calculated PDD at 90%, 80%, 50% and 10% of maximum dose, averaged over four tested machines. R 100 was not compared because the PDD curve around R 100 is fairly flat therefore a small amount of noise in PDD measurement can cause a big error in identifying the exact location of the PDD maximum. Small differences in each machine's energy, scatter foils or scanning system setup can affect these parameters. (6) Nonetheless, ANOVA analysis indicated that machine was not a statistically significant factor (p < 0.05) affecting the difference between the measured and eMC calculated results. Further analyses shown in Figs. 1a -1c revealed a few interesting findings. Fig. 1a illustrates the range of disagreement in measured depths among the machines. For cones smaller than 20 × 20 cm, the difference was mostly smaller than 2 mm. These differences were comparable with those in the literature. (8)(9) A maximum of 4 mm difference was observed for the 25 × 25 cm cone at 10% dose for 16 MeV and 20 MeV. Disagreement among eMC calculated depths is shown in Fig. 1b. Variations of depth in water at 10% dose among machines were rarely reported in literature because this range is contaminated with bremsstrahlung photons and beyond the range of clinical interest. Nevertheless we report our result here for interested readers. Our result indicates that even though R 50 , R 80 and R 90 agreed reasonably well among machines, larger variations at R 10 are possible, likely due to slight variations in energy spectrum. Contrary to measurement, the largest differences in eMC calculated depths occurred at smallest cone (6 × 6 cm) with a magnitude of 5-7 mm at lower energy of 6 MeV and 9 MeV (Fig.1b). Because our machines were individually measured and commissioned using each machine's own measurement data, the differences we observed could be a combination of variations in inherent beam spectrum, measurement uncertainty and statistical uncertainty of the algorithm.

A. Central-axis depth dose curve 1. Depth in water
Differences between measured and eMC calculated R 90 , R 80 , R 50 , and R 10 , averaged over machines, were calculated and shown in Fig. 1c. Larger discrepancies between the measurement and eMC calculated result occurred at R 10 . Overall, the discrepancies were mostly within 3 mm. Electron beam energy appeared to affect the discrepancies, too. At lower energy levels (6 and 9 MeV), the eMC calculated R 90 , R 80 , R 50 , and R 10 were slightly larger than measurement, indicating that eMC calculated PDDs shifted to the right of the measured PDD. Measurement and eMC calculation agreed reasonably well at 12 MeV and 16 MeV. As beam energy went up to 20 MeV, eMC calculation became slightly smaller than measurement indicating the eMC calculated PDDs shifted to the left of the measured PDD. As the end user of the software, it is difficult to know the reason for the above observation without knowing the details of parameters and procedures of the algorithm. Nonetheless we thought the observation was interesting enough to be reported here.

Dose due to Bremsstrahlung
Linear regression analysis indicated that eMC calculated D x s correlated well with measured D x s (R-square = 0.95, p < 0.01). However, compared to measurement, eMC calculated D x s were consistently smaller than measured D x s except for a few outliers. Fig. 2 summarizes the measured and eMC calculated D x averaged over four machines. In both measurement and calculation, D x increased with energy and cone size, reaching 5-6% at 20 MeV. The discrepancy between measurement and calculation became more apparent for larger cone sizes at higher energies, reaching about 1%.

Surface dose
Surface dose for both measurement and eMC calculation was determined at 0.5 mm depth on central-axis PDD curves. As shown in Fig. 3a and Fig. 3b, surface dose increases with electron beam energy, averaging around 75% at 6 MeV and around 90% at 20 MeV. Variations among the machines were small for both measurements (Fig. 3a) and eMC calculation (Fig. 3b). Discrepancies between measurement and eMC were mostly within 2% with slightly larger value (yet still within 3%) for 6 × 6 cm cone (Fig. 3c). Table 2 lists measured and eMC calculated field width of crossplane profiles acquired at R 100 , R 90 , and R 50 , averaged over the machines. Similar results were obtained between crossplane and inplane from eMC calculation, we therefore only present results from crossplane for presentation clarity. Because difference between measured and eMC calculated field width can be affected by measurement differences or statistical variations of eMC calculation, we further analyzed these factors, as shown in Figs. 4a -4c. Fig. 4a shows the measured differences among the four machines. When energy increased, the disagreement range (maximum of measurement -minimum of measurement) among the machines increased as well, reaching a maximum of 2.6 mm at 20 MeV. Similar pattern was seen for eMC calculated field width, with a comparable maximum at 20 MeV (Fig. 4b). However, when comparing each machine's measured field width with its corresponding eMC calculated field width, one can observe that the measured field width was systematically smaller than eMC calculated field width in most cases. Higher energies (16 MeV and 20 MeV) appeared to have larger discrepancies than lower energies (6 MeV, 9 MeV and 12 MeV) as shown in Fig. 4c. Overall, however, the discrepancy between eMC calculated and measured field width were all within 3 mm. For lower and medium energies (6-16 MeV) and small cone sizes (< 20 × 20 cm) which are most frequently encountered in clinic, the agreements were mostly within 1 mm. Table 3 lists measured and eMC calculated penumbra of crossplane profiles acquired at R 100 , R 90 , and R 50 , averaged over the machines. Compared to field width, the uncertainties in both measurement and eMC calculation were much larger, as shown in Fig. 5a and Fig. 5b. Here the machines demonstrated their "individuality", with penumbra from one machine consistently smaller than the others. Nonetheless, the difference between measured and its corresponding eMC calculated  penumbra was much smaller (Fig. 5c), a result of individual commissioning of each machine's algorithm using its own electron beam data. At clinically more relevant depth (R 90 and R 100 ), the difference between measured and eMC calculated penumbra was within 3 mm.

Agreement of relative dose
We compared agreement of relative dose within 80% of field width. Differences between measured and eMC calculated relative dose in crossplane were used. Both measured and calculated dose are relative dose normalized to maximum dose at central axis. Measured profiles were smoothed, centralized, and made symmetric before exporting from OmniPro Accept, therefore the mean difference estimates the separation between the measured and calculated profiles. As shown in the Fig. 6, the differences were mostly smaller than 2% at 6, 9, 12 and 16 MeV. A maximum of 3% occurred at 20 MeV at R 100 . These agreements demonstrate that the eMC calculated profiles follow the measured profiles very well given the dose accuracy of 2% set for the eMC calculations.

Symmetry
Symmetry for eMC calculated profiles at different depths for different energy/cone combinations is plotted in Fig. 7. Larger values were found for profiles at R 50 . Symmetry for profiles at R 90 and R 100 were within 2.5% in general. The range of disagreement we observed here is reasonable, because the accuracy defined by the manufacturer refers to the mean statistical error in dose for all voxels receiving more than 50% of the maximum dose value located within the region of interest. With preset accuracy of 2%, outliers greater than 2% are expected, because the symmetry here is defined as the maximum difference in relative dose between points on equal distance from the central axis within the central 80% of the field width. Table 4 lists the measured and eMC calculated OFs averaged over four machines. Measured OFs and eMC calculated OFs agreed well for cutouts greater than 3 × 3 cm (Fig. 8c). The agreement is within 3% for cutouts greater than 5 × 5 cm, 5% for cutouts smaller than 5 × 5 cm but equal or larger than 4 × 4 cm. For cutout of 3 × 3 cm, however, the agreement was significantly poorer, reaching as much as 8%. We performed ANOVA analysis, which indicated that there were no statistically  significant differences among the four machines (P < 0.05). Energy level and cone size did not affect agreement either. Fig. 8a plots measured OFs among different machines. Up to 7.5% difference among machines was observed for 3× 3 cm cutouts. Similar variations were observed for eMC calculated OFs among the machines (Fig. 8b). This indicates that large disagreement between OFs for 3 × 3 cm cutout can be contributed from both measurement and eMC uncertainties.

D. Grid size
Investigation was conducted to study the impact of eMC calculation grid size on dose distribution calculation for a 15 × 15 cm cone at 12 MeV (Fig. 9a). Four dose calculation grid sizes, 1.0 mm, 1.5 mm, 2.0 mm. 2.5 mm, and 5.0 mm, were employed. Grid size of 5 mm gave a dose about 10% lower than actual measurement. Grid size of 2 mm provided significant improvement, with an eMC result being 3% lower than actual measurement. Changing grid size from 2 mm to 1.5 mm did not provide significant gain for this parameter until the grid size of 1 mm, yet it substantially increased computational time (Fig. 9b)  We observed a hot spot and a split in all machines when using calculation grid size of 5 mm for cone size greater than 15 cm and electron beam energy greater than 16 MeV (Fig. 10). The hot spot and split disappeared when grid size was reduced to 2.5 mm with other parameters unchanged. This observation warrants further investigation and a caution using a calculation grid size of 5 mm for final clinical calculation.

APPENDIX -DEFINITIONS OF SOME OF THE PARAMETERS USED IN THIS ARTICLE
Field Width: The width on the profile curve at 50% of the central axis dose. Penumbra: Distances between 20% and 80% of central axis dose on one side of the profile, averaged over the right and left sides. D x : Dose due to Bremsstrahlung X-ray. The value is taken to be the dose at R 10 +50 mm on the PDD curve.