Random survival forests for high-dimensional data



Minimal depth is a dimensionless order statistic that measures the predictiveness of a variable in a survival tree. It can be used to select variables in high-dimensional problems using Random Survival Forests (RSF), a new extension of Breiman's Random Forests (RF) to survival settings. We review this methodology and demonstrate its use in high-dimensional survival problems using randomSurvivalForest, a public domain R-language package. We discuss effective ways to regularize forests and how to properly tune the RF parameters ‘nodesize’ and ‘mtry’. We also introduce new graphical ways of using minimal depth to explore variable relationships.

© 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 4: 115–132, 2011
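As an informal illustration of the minimal depth idea, the following Python sketch computes, for a single toy tree, the depth of the split on each variable that lies closest to the root, taking the root to be at depth 0. It is not the randomSurvivalForest implementation: the `Node` class, the `minimal_depth` function, and the variable names `age`, `stage`, and `marker` are hypothetical and chosen only for illustration. Variables that never split in the tree are simply left out of the result here, which is an assumption of this sketch.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """Hypothetical binary tree node; split_var is None for a terminal node."""
    split_var: Optional[str] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def minimal_depth(root: Node) -> Dict[str, int]:
    """Map each splitting variable to the depth of its split nearest the root."""
    depths: Dict[str, int] = {}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if node is None or node.split_var is None:
            continue  # terminal nodes carry no split
        # Breadth-first order guarantees the first visit is the shallowest one.
        depths.setdefault(node.split_var, depth)
        queue.append((node.left, depth + 1))
        queue.append((node.right, depth + 1))
    return depths


# Toy tree: the root splits on "age"; "stage" and "marker" split further down.
toy_tree = Node(
    "age",
    left=Node("stage", left=Node(), right=Node("marker", Node(), Node())),
    right=Node("age", Node(), Node()),
)

result = minimal_depth(toy_tree)
print(result)
```

In this toy tree, a smaller value indicates that the variable splits nearer the root. In a forest, one would typically summarize these per-tree values across all trees, for example by averaging them.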