MO-G-201-02: Comparing Sample Size Requirements for Knowledge-Based Treatment Planning




To compare how training set size affects the accuracy of a knowledge-based planning (KBP) model applied to prostate and head and neck (HN) cancer.


We selected a KBP model from the literature that uses distance-to-target histograms and organ volumes to predict an achievable dose-volume histogram (DVH) curve for each organ at risk (OAR). We trained both the prostate and HN models using training set sizes of n = 10, 20, 30, 50, 75, and 100. From each of the two cohorts of 218 treatment plans, we set aside 100 randomly selected plans to serve as a validation set for all experiments. For each value of n, we randomly selected 100 different training sets, with replacement, from the remaining 118 plans, and trained one model per training set for each site. To evaluate the models, we predicted DVH curves for each of the 100 plans in the validation set. To estimate the minimum required sample size, we used the rank-sum test to determine whether the median error for each sample size from 10 to 75 was equal to the median error for the maximum sample size of 100.
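The resampling design and the rank-sum comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. All function names are hypothetical. The sketch assumes that plans are distinct within a training set while the same plan may recur across sets, that each trained model is summarized by one scalar error, that the rank-sum p-value comes from a normal approximation with tie correction and no continuity correction, and that the minimum sample size is the smallest n whose errors do not differ from those at the largest n at a significance level of 0.05. The abstract states none of these details.

```python
import math
import random


def split_holdout(plan_ids, n_validation, rng):
    """Shuffle the plan IDs and return (training_pool, validation_set)."""
    shuffled = list(plan_ids)
    rng.shuffle(shuffled)
    return shuffled[n_validation:], shuffled[:n_validation]


def draw_training_sets(pool, n, n_sets, rng):
    """Draw n_sets independent training sets of n distinct plans each.

    Assumption: plans are unique within a set but may recur across sets.
    """
    return [rng.sample(pool, n) for _ in range(n_sets)]


def rank_sum_p_value(x, y):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    total = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < total:
        j = i
        while j < total and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i                    # size of this group of tied values
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((total + 1) - tie_term / (total * (total - 1)))
    if var_u == 0:
        return 1.0  # every value is tied, so there is no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2.0))


def minimum_sample_size(errors_by_n, alpha=0.05):
    """Smallest n whose errors are not significantly different from the largest n.

    Assumption: alpha and this decision rule are illustrative choices.
    """
    sizes = sorted(errors_by_n)
    reference = errors_by_n[sizes[-1]]
    for n in sizes[:-1]:
        if rank_sum_p_value(errors_by_n[n], reference) >= alpha:
            return n
    return sizes[-1]


if __name__ == "__main__":
    rng = random.Random(0)
    pool, validation = split_holdout(range(218), 100, rng)
    training_sets = draw_training_sets(pool, 30, 100, rng)
    print(len(pool), len(validation), len(training_sets))
```

In practice, `errors_by_n` would map each training set size to the 100 per-model errors measured on the validation set for a given OAR. A library routine such as `scipy.stats.ranksums` could replace the hand-written test.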


In general, HN required larger sample sizes than prostate. For prostate, a minimum training set size of 30 plans was needed to accurately predict the bladder DVH, while at least 75 plans were needed for the rectum. For HN, the minimum training set size was 100 for the larynx, esophagus, and spinal cord, 75 for the left parotid and mandible, and only 50 for the right parotid.


The minimum sample size required for accurate treatment plan generation using KBP depends on both the OAR and the treatment site. Adequate sample sizes are essential for successful clinical implementation of KBP models.

This research was funded in part by the Natural Sciences and Engineering Research Council of Canada.