Using data mining to predict success in a weight loss trial

Authors

  • M. Batterham,

    Corresponding author
    1. Statistical Consulting Centre, National Institute for Applied Statistical Research Australia, University of Wollongong, Wollongong, NSW, Australia
    • Correspondence

      M. Batterham, Statistical Consulting Centre, National Institute for Applied Statistical Research Australia, University of Wollongong, Northfields Ave, Wollongong, NSW 2522, Australia.

      Tel.: +61 2 4221 8190

      E-mail: marijka@uow.edu.au

  • L. Tapsell,

    1. Nutrition and Dietetics, School of Medicine, Faculty of Science, Medicine and Health, University of Wollongong, Wollongong, NSW, Australia
  • K. Charlton,

    1. School of Medicine, Faculty of Science, Medicine and Health, University of Wollongong, Wollongong, NSW, Australia
  • J. O'Shea,

    1. School of Medicine, Faculty of Science, Medicine and Health, University of Wollongong, Wollongong, NSW, Australia
  • R. Thorne

    1. School of Medicine, Faculty of Science, Medicine and Health, University of Wollongong, Wollongong, NSW, Australia

Abstract

Background

Traditional methods for predicting weight loss success use regression approaches, which make the assumption that the relationships between the independent and dependent (or logit of the dependent) variable are linear. The aim of the present study was to investigate the relationship between common demographic and early weight loss variables to predict weight loss success at 12 months without making this assumption.

Methods

Data mining methods (decision trees, generalised additive models and multivariate adaptive regression splines), in addition to logistic regression, were employed to predict weight loss success (defined as ≥5%) at the end of a 12-month dietary intervention using: (i) demographic variables [body mass index (BMI), sex and age]; (ii) percentage weight loss at 1 month; and (iii) the difference between actual and predicted weight loss using an energy balance model. The methods were compared by assessing model parsimony and the area under the curve (AUC).
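The AUC used for this comparison can be read as the probability that a randomly chosen successful participant is given a higher predicted score than a randomly chosen unsuccessful one. The following is a minimal Python sketch of that calculation using the pairwise (Mann-Whitney) formulation. The function name `auc` and the example scores and outcomes are hypothetical illustration values, not trial data, and the sketch does not reproduce the software used in the study.

```python
def auc(scores, outcomes):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    scores   -- predicted scores, higher meaning "more likely to succeed"
    outcomes -- 1 for weight loss success, 0 otherwise
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one success and one non-success")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # success ranked above non-success
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical illustration values only (not data from the trial).
example_scores = [0.9, 0.8, 0.4, 0.35, 0.2]
example_outcomes = [1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_outcomes)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the candidate models were compared.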

Results

The decision tree provided the most clinically useful model and had good accuracy (AUC 0.720; 95% confidence interval = 0.600–0.840). Percentage weight loss at 1 month (≥0.75%) was the strongest predictor of successful weight loss. Among individuals losing ≥0.75%, those with a BMI ≥27 kg m⁻² were more likely to be successful than those with a BMI between 25 and 27 kg m⁻².
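The two splits reported above can be written as a short set of rules. The Python sketch below assumes only the thresholds stated in this abstract (0.75% loss at 1 month, BMI of 27 kg m⁻², and the ≥5% success definition). The function names and group labels are hypothetical, and because the abstract does not report success rates for each branch, the sketch returns a group label rather than a probability.

```python
def percent_weight_loss(baseline_kg, current_kg):
    """Weight lost as a percentage of baseline weight."""
    return 100.0 * (baseline_kg - current_kg) / baseline_kg


def is_success(baseline_kg, month_12_kg):
    """Success as defined in the abstract: at least 5% loss at 12 months."""
    return percent_weight_loss(baseline_kg, month_12_kg) >= 5.0


def tree_group(loss_1_month_pct, bmi):
    """Assign a participant to a branch using the two reported splits.

    The labels are hypothetical; the fitted tree itself is not reproduced.
    """
    if loss_1_month_pct < 0.75:
        return "early loss below 0.75%"
    if bmi >= 27.0:
        # Reported as the group more likely to succeed
        return "early loss >= 0.75%, BMI >= 27"
    # In the trial this branch covered BMI between 25 and 27
    return "early loss >= 0.75%, BMI below 27"
```

For example, a participant who goes from 100 kg to 99 kg in the first month has lost 1.0% and, with a BMI of 30 kg m⁻², falls into the branch reported as most likely to reach the 5% target.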

Conclusions

Data mining methods can provide a more accurate way of assessing relationships when conventional assumptions are not met. In the present study, a decision tree provided the most parsimonious model. Given that early weight loss cannot be predicted before randomisation, incorporating this information into a post-randomisation trial design may give better weight loss results.
