• data mining;
• modeling;
• ensemble;
• health care


We built two models for the Institute for Operations Research and the Management Sciences (INFORMS) 2009 data mining contest to predict hospital transfer (task 1) and hospital mortality (task 2). Based on the patient medical information provided in the training and test datasets, we created a series of variables to characterize each patient's current and historical medical condition. We converted all categorical fields, such as Diagnosis Code, Clinical Management Category, Procedure Code, and Product Line, into numerical values by using the positive rate in each category. We used a gradient boosting decision tree as our classifier. Ensembles built from different boosted decision tree models were used to improve the prediction accuracy. We postprocessed the model scores by utilizing the two scores available for the same hospital admission. Our final results were ranked second in both tasks. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 253-258, 2010
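As an illustration of the positive-rate conversion of categorical fields described above, the following is a minimal sketch and not the contest implementation. The function names, the toy data, and the fallback to the overall positive rate for categories unseen in training are all assumptions made for this example.

```python
from collections import defaultdict


def fit_positive_rates(categories, labels):
    """Return ({category: positive rate}, overall positive rate).

    `labels` are 0/1 outcomes (e.g. transfer or mortality) aligned with
    `categories`. Both names are hypothetical, chosen for this sketch.
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for cat, y in zip(categories, labels):
        counts[cat] += 1
        positives[cat] += y
    rates = {cat: positives[cat] / counts[cat] for cat in counts}
    overall = sum(labels) / len(labels)
    return rates, overall


def encode(categories, rates, default):
    """Replace each category with its training positive rate.

    Assumption: categories not seen in training get `default`.
    """
    return [rates.get(cat, default) for cat in categories]


# Hypothetical toy data standing in for a field such as Diagnosis Code.
train_codes = ["A", "A", "B", "B", "B", "C"]
train_labels = [1, 0, 0, 0, 1, 1]

rates, overall = fit_positive_rates(train_codes, train_labels)
encoded_test = encode(["A", "B", "D"], rates, overall)
```

The rates are computed on training records only and then applied to the test records, so the encoded value for a test record does not depend on its own (unknown) label.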