• Generalized estimating equations
• Random effects model
• Logistic regression
• Longitudinal data
• Tree mortality

Abstract Developing models to predict tree mortality from long-term repeated-measurement data sets is challenging because of the nature of mortality and the dependence among observations. Marginal (population-averaged) models based on generalized estimating equations (GEE) and random effects (subject-specific) models offer two possible ways to account for this dependence. For this study, standard logistic, marginal logistic based on the GEE approach, and random effects logistic regression models were fitted and compared. In addition, four model evaluation statistics were calculated by means of K-fold cross-validation: the mean prediction error, the mean absolute prediction error, the variance of prediction error, and the mean square error. Results from this study suggest that the random effects model produced the smallest evaluation statistics among the three models. Although marginal logistic regression accounted for correlations between observations, it did not noticeably improve model performance compared with the standard logistic regression model, which assumed independence. This study indicates that the random effects model was able to increase the overall accuracy of mortality modeling. Moreover, it was able to capture correlation derived from the hierarchical data structure as well as serial correlation generated through repeated measurements.
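As an illustration of the four evaluation statistics named in the abstract, the following minimal sketch computes them from observed mortality outcomes (0 or 1) and predicted mortality probabilities for held-out observations. The function name `evaluation_statistics`, the sign convention (observed minus predicted), and the use of the population variance (dividing by n) are assumptions made for this example; the abstract does not specify them.

```python
import numpy as np


def evaluation_statistics(observed, predicted):
    """Return the four prediction-error statistics for one set of predictions.

    Assumptions (not stated in the abstract): the prediction error is
    observed minus predicted, and the variance divides by n rather than n - 1.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = observed - predicted
    return {
        "mean_error": errors.mean(),                   # signed bias
        "mean_absolute_error": np.abs(errors).mean(),  # average error magnitude
        "error_variance": errors.var(),                # spread around the mean error
        "mean_square_error": (errors ** 2).mean(),     # variance plus squared bias
    }


# Hypothetical held-out outcomes (1 = dead, 0 = alive) and predicted probabilities.
stats = evaluation_statistics([1, 0, 0, 1], [0.8, 0.1, 0.3, 0.6])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

In a K-fold setting, the same function could be applied to the predictions pooled across the K held-out folds, or to each fold separately and then averaged. Under the population-variance assumption used here, the mean square error equals the error variance plus the squared mean error, which offers a quick consistency check.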