Which method predicts recidivism best?: a comparison of statistical, machine learning and data mining predictive models


Address for correspondence: N. Tollenaar, Research and Documentation Centre, Ministry of Security and Justice, Schedeldoekshaven 131, 2311 EM, Den Haag, Zuid-Holland, The Netherlands.
E-mail: n.tollenaar@minvenj.nl


Summary. Using criminal population conviction histories of recent offenders, prediction models are developed that predict three types of criminal recidivism: general recidivism, violent recidivism and sexual recidivism. The research question is whether prediction techniques from modern statistics, data mining and machine learning provide an improvement in predictive performance over classical statistical methods, namely logistic regression and linear discriminant analysis. These models are compared on a large selection of performance measures. Results indicate that classical methods perform as well as or better than their modern counterparts. The predictive performance of the various techniques differs only slightly for general and violent recidivism, whereas differences are larger for sexual recidivism. For the general and violent recidivism data we present the results of logistic regression, and for sexual recidivism those of linear discriminant analysis.