Evaluating IMRT and VMAT dose accuracy: Practical examples of failure to detect systematic errors when applying a commonly used metric and action levels




This study (1) examines a variety of real-world cases where systematic errors were not detected by widely accepted methods for IMRT/VMAT dosimetric accuracy evaluation, and (2) drills down to identify failure modes and their corresponding means for detection, diagnosis, and mitigation. The primary goal of detailing these case studies is to explore different, more sensitive methods and metrics that could be used more effectively for evaluating accuracy of dose algorithms, delivery systems, and QA devices.


The authors present seven real-world case studies representing a variety of combinations of the treatment planning system (TPS), linac, delivery modality, and systematic error type. These case studies are typical of what might be used as part of an IMRT or VMAT commissioning test suite, and they vary in complexity. Each case study is first analyzed according to TG-119 instructions for gamma passing rates and action levels for per-beam and/or composite plan dosimetric QA. Each case study is then analyzed in depth with advanced diagnostic methods (dose profile examination, EPID-based measurements, dose difference pattern analysis, 3D measurement-guided dose reconstruction, and dose grid inspection) and more sensitive metrics (2% local normalization/2 mm DTA and estimated DVH comparisons).
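To make the contrast between the two gamma criteria concrete, the following is a minimal one-dimensional sketch in Python. It is not the software used in the study: the function names `gamma_1d` and `passing_rate`, the brute-force search, the synthetic profile, and the 10% low-dose threshold are all assumptions made for illustration. It shows how a uniform 2.5% dose scaling error can pass a 3% global/3 mm test at every point while failing a 2% local/2 mm test across the flat, low-gradient part of the profile.

```python
import numpy as np

def gamma_1d(x, ref, evl, dose_tol, dta_mm, local=False, threshold=0.1):
    """Brute-force 1D gamma of an evaluated profile against a reference.

    dose_tol is a fraction (0.03 for 3%). With local=False the dose
    criterion is a fraction of the reference maximum (global
    normalization); with local=True it is a fraction of the reference
    dose at each point. Points below threshold * max(ref) are skipped
    and returned as NaN (the threshold value is an assumption).
    """
    x = np.asarray(x, dtype=float)
    ref = np.asarray(ref, dtype=float)
    evl = np.asarray(evl, dtype=float)
    dmax = ref.max()
    dose_crit = dose_tol * ref if local else np.full_like(ref, dose_tol * dmax)
    gamma = np.full(ref.shape, np.nan)
    for i in np.flatnonzero(ref >= threshold * dmax):
        dist_term = ((x - x[i]) / dta_mm) ** 2
        dose_term = ((evl - ref[i]) / dose_crit[i]) ** 2
        gamma[i] = np.sqrt(np.min(dist_term + dose_term))
    return gamma

def passing_rate(gamma):
    """Percentage of analyzed (non-NaN) points with gamma <= 1."""
    valid = ~np.isnan(gamma)
    return 100.0 * np.mean(gamma[valid] <= 1.0)

if __name__ == "__main__":
    # Synthetic flat-topped profile on a 0.5 mm grid (illustrative only).
    x = np.arange(-50.0, 50.5, 0.5)
    ref = 1.0 / (1.0 + np.exp((np.abs(x) - 30.0) / 3.0))
    evl = 1.025 * ref  # hypothetical systematic 2.5% scaling error

    g_global = gamma_1d(x, ref, evl, dose_tol=0.03, dta_mm=3.0, local=False)
    g_local = gamma_1d(x, ref, evl, dose_tol=0.02, dta_mm=2.0, local=True)
    print("3% global / 3 mm passing rate:", passing_rate(g_global))
    print("2% local  / 2 mm passing rate:", passing_rate(g_local))
```

In this toy case the distance-to-agreement term can only compensate for the scaling error in the penumbra, where the gradient is steep. The looser global criterion absorbs the error everywhere, which is the kind of insensitivity the case studies examine.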


For these case studies, the conventional 3%/3 mm gamma passing rates exceeded 99% for IMRT per-beam analyses and ranged from 93.9% to 100% for composite plan dose analysis, well above the TG-119 action levels of 90% and 88%, respectively. However, all cases had systematic errors that were detected only by using advanced diagnostic techniques and more sensitive metrics. The systematic errors caused variable but noteworthy impact, including estimated target dose coverage loss of up to 5.5% and local dose deviations up to 31.5%. Types of errors included TPS model settings, algorithm limitations, and modeling and alignment of QA phantoms in the TPS. Most of the errors were correctable after detection and diagnosis, and the uncorrectable errors provided useful information about system limitations, which is another key element of system commissioning.


Many forms of relevant systematic errors can go undetected when the currently prevalent metrics for IMRT/VMAT commissioning are used. If alternative methods and metrics are used instead of (or in addition to) the conventional metrics, these errors are more likely to be detected, and only once they are detected can they be properly diagnosed and rooted out of the system. Removing systematic errors should be a goal not only of commissioning by the end users but also of product validation by the manufacturers. For any systematic errors that cannot be removed, detecting and quantifying them is important because it helps the physicist understand the limits of the system and work with the manufacturer on improvements. In summary, IMRT and VMAT commissioning, along with product validation, would benefit from the retirement of the 3%/3 mm passing rates as a primary metric of performance, and the adoption instead of tighter tolerances, more diligent diagnostics, and more thorough analysis.