The performance of normal-tissue complication probability models in the presence of confounding factors




This work explores methods of accounting for patient-specific factors in normal-tissue complication probability (NTCP) modeling and compares model performance on pseudoclinical datasets for “lung” and “rectum” complications.


Datasets consisting of dose distributions and the resulting normal-tissue complications were simulated, with varying levels of confounding (i.e., nondosimetric) factors influencing the outcome. The simulated confounding factors were patient radiosensitivity and health status. Seven empirical NTCP models were fitted to each dataset; this is analogous to fitting alternative models to datasets from different populations treated with the same technique. The performance of these models was compared using the area under the receiver operating characteristic (ROC) curve (AUC), and the impact of confounding factors on model performance was studied. The patient-specific factors were then accounted for by (1) stratification and (2) two ways of modifying the traditional NTCP models to include these factors.
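As an illustration of the fit-and-score step, the sketch below evaluates one of the empirical models named in this work (Lyman–Kutcher–Burman) on a differential DVH and computes the AUC by pairwise comparison of scores. The function names, the DVH representation (lists of bin doses and bin volumes), and all parameter values are assumptions made for this example; they are not taken from the study.

```python
import math

def geud(doses, volumes, n):
    """Generalized equivalent uniform dose of a differential DVH.

    doses, volumes: dose and (relative or absolute) volume per DVH bin.
    n: LKB volume-effect parameter (n = 1 reduces to the mean dose).
    """
    total = sum(volumes)
    return sum((v / total) * d ** (1.0 / n) for d, v in zip(doses, volumes)) ** n

def lkb_ntcp(doses, volumes, td50, m, n):
    """LKB NTCP: cumulative normal of t = (gEUD - TD50) / (m * TD50)."""
    t = (geud(doses, volumes, n) - td50) / (m * td50)
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def auc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen complication case
    (outcome 1) scores higher than a randomly chosen complication-free
    case (outcome 0); ties count as one half.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical two-bin DVHs for three patients and made-up outcomes.
dvhs = [([10.0, 30.0], [0.8, 0.2]), ([10.0, 30.0], [0.5, 0.5]), ([10.0, 30.0], [0.2, 0.8])]
outcomes = [0, 0, 1]
scores = [lkb_ntcp(d, v, td50=25.0, m=0.3, n=1.0) for d, v in dvhs]
print(auc(scores, outcomes))
```

In a full analysis the parameters `td50`, `m`, and `n` would be estimated from the data (for example by maximum likelihood) before scoring; here they are fixed by hand to keep the sketch short.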


Confounding factors had a greater impact on model performance than the choice of model. All models except the maximum dose model performed similarly well on the rectum datasets, while critical-volume type models were slightly better than the mean dose, Lyman–Kutcher–Burman, and relative seriality models for lung. This difference was more apparent when the dataset contained no confounding factors. The two alternative functions including patient-specific factors used in this work (one logistic and one cumulative normal function) were found to be equivalent, and both were more efficient than stratifying datasets according to patient-specific factors and fitting models to each subgroup individually. For datasets including confounding factors, performance improved greatly when using models that accounted for them; AUC increased from around 0.7 to close to unity.
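One plausible way to see why a logistic and a cumulative normal function can behave almost identically is sketched below. The abstract does not give the functional forms used, so the linear predictor (intercept, a dose term, and one binary patient-specific factor), the coefficient values, and the function names are all hypothetical. The sketch relies only on the well-known approximation that a logistic curve with its argument scaled by about 1.702 stays within 0.01 of the standard normal cumulative distribution.

```python
import math

def logistic_ntcp(dose, factor, b0, b_dose, b_factor):
    """Logistic NTCP with a dose term and one patient-specific factor."""
    z = b0 + b_dose * dose + b_factor * factor
    return 1.0 / (1.0 + math.exp(-z))

def probit_ntcp(dose, factor, a0, a_dose, a_factor):
    """Cumulative-normal NTCP with the same linear predictor."""
    z = a0 + a_dose * dose + a_factor * factor
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical probit coefficients: intercept, per-Gy slope, factor effect.
a = (-4.0, 0.2, 1.0)

# Scaling the coefficients by ~1.702 gives a closely matching logistic curve.
SCALE = 1.702
b = tuple(SCALE * c for c in a)

# Largest disagreement between the two curves over 0-40 Gy, factor off/on.
max_gap = max(
    abs(logistic_ntcp(d, f, *b) - probit_ntcp(d, f, *a))
    for d in range(0, 41)
    for f in (0, 1)
)
print(max_gap)
```

With a positive factor coefficient, a patient with the factor present (`factor=1`) gets a higher predicted NTCP at every dose than one without it, which is the sense in which such a model accounts for a nondosimetric factor without splitting the dataset into subgroups.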


This work shows that identifying confounding factors and developing methods to quantify them are more important than the choice of NTCP model. Most dose–volume histogram (DVH)-based NTCP models can be generalized to include confounding factors.