• contouring quality;
  • CriSIS;
  • penalty metrics;
  • quantitative;
  • radiation oncology trainee education



The objective of this study was to develop and assess the feasibility of utilizing consensus-based penalty metrics for the purpose of critical structure and organ at risk (OAR) contouring quality assurance and improvement.


A Delphi study was conducted to obtain consensus on contouring penalty metrics to assess trainee-generated OAR contours. Voxel-based penalty metric equations were used to score regions of discordance between trainee and expert contour sets. The utility of these penalty metric scores for objective feedback on contouring quality was assessed by using cases prepared for weekly radiation oncology radiation oncology trainee treatment planning rounds.


In two Delphi rounds, six radiation oncology specialists reached agreement on clinical importance/impact and organ radiosensitivity as the two primary criteria for the creation of the Critical Structure Inter-comparison of Segmentation (CriSIS) penalty functions. Linear/quadratic penalty scoring functions (for over- and under-contouring) with one of four levels of severity (none, low, moderate and high) were assigned for each of 20 OARs in order to generate a CriSIS score when new OAR contours are compared with reference/expert standards. Six cases (central nervous system, head and neck, gastrointestinal, genitourinary, gynaecological and thoracic) then were used to validate 18 OAR metrics through comparison of trainee and expert contour sets using the consensus derived CriSIS functions. For 14 OARs, there was an improvement in CriSIS score post-educational intervention.


The use of consensus-based contouring penalty metrics to provide quantitative information for contouring improvement is feasible.