Forecasting business failure using two-stage ensemble of multivariate discriminant analysis and logistic regression



A major drawback of classical statistical methods for business failure prediction based on financial distress is their limited accuracy. This work analyses the use of a two-stage ensemble of multivariate discriminant analysis (MDA) and logit to improve the predictive performance of classical statistical methods. All possible ratios are first built from the quantities involved, and three common filters, namely stepwise MDA, stepwise logit, and the t-test, are then used to select three further subsets of ratios. Principal components analysis is applied to each of the four resulting feature spaces, producing four principal components spaces (PCSs). MDA and logit are each used to produce predictions on the four PCSs, giving eight component models. Two levels of ensemble are then implemented: the first combines the predictions of models of the same type (i.e. MDA or logit), and the second combines the two resulting ensembles with the single best model. Within the ensemble, each of the eight models is weighted on the basis of the rank order of its predictive accuracy, and predictions are combined by majority voting. MDA, logit, and the more recent challenger model, the support vector machine, each in its best standalone configuration, are used for comparison. Empirical results indicate that the two-stage ensemble of MDA and logit compares favourably with the three comparative models and with all of its component models.
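
The two-stage combination described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, and the abstract does not specify several details that are assumed here. Specifically, the sketch assumes labels coded as 1 for failure and 0 for non-failure, a weight equal to each model's accuracy rank within its own group (1 for the least accurate), ties resolved as non-failure, the "one best model" taken as the most accurate of the eight, and an unweighted vote at the second stage.

```python
from typing import List, Sequence


def rank_weights(accuracies: Sequence[float]) -> List[int]:
    """Assumed weighting: weight = accuracy rank (1 = worst, n = best)."""
    order = sorted(range(len(accuracies)), key=lambda i: accuracies[i])
    weights = [0] * len(accuracies)
    for rank, idx in enumerate(order, start=1):
        weights[idx] = rank
    return weights


def weighted_vote(predictions: Sequence[Sequence[int]],
                  weights: Sequence[int]) -> List[int]:
    """Weighted majority vote over binary labels (1 = failure).

    predictions[m][j] is the label model m assigns to sample j.
    Assumption: a tie is resolved as 0 (non-failure).
    """
    total = sum(weights)
    n_samples = len(predictions[0])
    combined = []
    for j in range(n_samples):
        score = sum(w for p, w in zip(predictions, weights) if p[j] == 1)
        combined.append(1 if 2 * score > total else 0)
    return combined


def two_stage_ensemble(mda_preds: List[List[int]], mda_accs: List[float],
                       logit_preds: List[List[int]], logit_accs: List[float]
                       ) -> List[int]:
    """Stage 1: one rank-weighted vote per model type.
    Stage 2: vote over the two stage-1 ensembles and the best single model.
    """
    mda_ens = weighted_vote(mda_preds, rank_weights(mda_accs))
    logit_ens = weighted_vote(logit_preds, rank_weights(logit_accs))

    # Assumption: the "best model" is the most accurate of all eight.
    all_preds = mda_preds + logit_preds
    all_accs = mda_accs + logit_accs
    best = all_preds[max(range(len(all_accs)), key=lambda i: all_accs[i])]

    # Assumption: the three stage-2 voters carry equal weight.
    return weighted_vote([mda_ens, logit_ens, best], [1, 1, 1])
```

In this sketch each inner list holds the predictions of one component model (for example, MDA fitted on one of the four PCSs), and the accuracies would come from a validation set. Using three voters at the second stage means an unweighted vote can never tie.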