Chapter 4. Logistic Regression
Published Online: 30 JAN 2006
Copyright © 2006 John Wiley & Sons, Inc.
Data Mining Methods and Models
How to Cite
Larose, D. T. (2006) Logistic Regression, in Data Mining Methods and Models, John Wiley & Sons, Inc., Hoboken, NJ, USA. doi: 10.1002/0471756482.ch4
- Published Print: 11 NOV 2005
Print ISBN: 9780471666561
Online ISBN: 9780471756484
Keywords
- maximum likelihood estimation
- categorical response
- the zero-cell problem
- multiple logistic regression
Logistic regression is introduced through a simple example: predicting the presence of disease from age. The maximum likelihood estimation methods for logistic regression are outlined. Emphasis is placed on interpreting logistic regression output. Inference within the framework of the logistic regression model is discussed, including how to determine whether the predictors are significant. Methods for interpreting the logistic regression model are examined for dichotomous, polychotomous, and continuous predictors. The assumption of linearity is discussed, along with methods for handling the zero-cell problem. We then turn to multiple logistic regression, where more than one predictor is used to classify a response. Methods are discussed for introducing higher-order terms to handle nonlinearity. As usual, the logistic regression model must be validated. Finally, logistic regression is demonstrated on a small example using the freely available software WEKA.
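To make the abstract's opening example concrete, here is a minimal sketch of a single-predictor logistic regression fitted by maximum likelihood. It is not the chapter's own code: the chapter demonstrates WEKA, whereas this sketch uses plain Python with Newton-Raphson iterations. The data values and the names `ages`, `disease`, `fit_simple_logistic`, and `predict_prob` are hypothetical and chosen only for illustration. The model assumed is the standard one, in which the probability of disease at age x is 1 / (1 + exp(-(b0 + b1*x))).

```python
import math

# Hypothetical toy data (NOT taken from the chapter): age in years and a
# 0/1 flag for presence of disease. The classes overlap, so the maximum
# likelihood estimate exists.
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80]
disease = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1]


def predict_prob(b0, b1, x):
    """Estimated probability of a positive response at predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def fit_simple_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Fit intercept b0 and slope b1 by Newton-Raphson on the log-likelihood.

    The predictor is centered internally for numerical stability; the
    returned coefficients are on the original (uncentered) scale.
    """
    mean_x = sum(xs) / len(xs)
    cx = [x - mean_x for x in xs]
    a, b = 0.0, 0.0  # coefficients on the centered scale
    for _ in range(max_iter):
        p = [predict_prob(a, b, x) for x in cx]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(y - pi for y, pi in zip(ys, p))
        g1 = sum((y - pi) * x for y, pi, x in zip(ys, p, cx))
        # Information matrix entries (negative Hessian).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, cx))
        h11 = sum(wi * x * x for wi, x in zip(w, cx))
        # Solve the 2x2 system H * delta = g explicitly.
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    # Convert the intercept back to the original predictor scale.
    return a - b * mean_x, b


b0, b1 = fit_simple_logistic(ages, disease)
print("intercept:", b0)
print("slope:", b1)
# exp(b1) is the estimated odds ratio for a one-year increase in age.
print("odds ratio per year:", math.exp(b1))
print("estimated P(disease | age 50):", predict_prob(b0, b1, 50))
```

The exponentiated slope is read as an odds ratio, which is the usual way a continuous predictor is interpreted in this model. With real data, a statistics package would also report standard errors and significance tests, which this sketch omits.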