Prequential analysis of complex data with adaptive model reselection

Authors

  • Jennifer Clarke

    Corresponding author
    1. Department of Epidemiology and Public Health, University of Miami, Miami, FL 33136, USA
  • Bertrand Clarke

    1. Department of Epidemiology and Public Health, University of Miami, Miami, FL 33136, USA
    2. Department of Medicine, University of Miami, Miami, FL 33136, USA
    3. Center for Computational Science, University of Miami, Miami, FL 33136, USA

  • This work was partially supported by National Institutes of Health Grant 5K25CA111636.

Abstract

In Prequential analysis, an inference method is viewed as a forecasting system, and the quality of the inference method is judged by the quality of its predictions. This is an alternative to more traditional statistical methods that focus on inferring the parameters of the data generating distribution. In this paper, we introduce adaptive combined average predictors (ACAPs) for the Prequential analysis of complex data. That is, we use convex combinations of two different model averages to form a predictor at each time step in a sequence. A novel feature of our strategy is that the models in each average are re-chosen adaptively at each time step. To assess the complexity of a given data set, we introduce measures of data complexity for continuous response data. We validate our measures in several simulated contexts prior to using them in real data examples. The performance of ACAPs is compared with the performance of predictors based on stacking or likelihood weighted averaging in several model classes and in both simulated and real data sets. Our results suggest that ACAPs achieve a better trade-off between model list bias and model list variability in cases where the data are very complex. This implies that the choices of model class and averaging method should be guided by a concept of complexity matching, i.e. the analysis of a complex data set may require a more complex model class and averaging strategy than the analysis of a simpler data set. We propose that complexity matching is akin to a bias-variance trade-off in statistical modeling.

Copyright © 2009 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 2: 000-000, 2009
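The general scheme described in the abstract (predict, then observe, then update, with a convex combination of two model averages whose member models are re-chosen at every step) can be illustrated with a toy sketch. This is not the authors' ACAP procedure: the three candidate forecasters, the top-k selection rule, the inverse-error weights, and the grid search over the convex weight are all assumptions made purely for illustration, and the names `candidate_forecasts` and `prequential_run` are hypothetical.

```python
def candidate_forecasts(history):
    """Three toy forecasters of the next value (a hypothetical model list)."""
    if not history:
        return [0.0, 0.0, 0.0]
    last = history[-1]                       # "last value" forecaster
    mean = sum(history) / len(history)       # running-mean forecaster
    window = history[-3:]
    return [last, mean, sum(window) / len(window)]  # plus a short moving average


def prequential_run(ys, k=2, grid=11):
    """Predict each y before seeing it; return predictions and cumulative squared loss."""
    n_models = 3
    model_sse = [0.0] * n_models             # past squared error of each model
    lams = [i / (grid - 1) for i in range(grid)]
    lam_sse = [0.0] * grid                   # past squared error of each convex weight
    history, preds, total_loss = [], [], 0.0

    for y in ys:
        f = candidate_forecasts(history)

        # Re-choose the k models with the smallest past error at this step.
        chosen = sorted(range(n_models), key=lambda j: model_sse[j])[:k]

        # Average A (assumed): equal weights over the chosen models.
        avg_a = sum(f[j] for j in chosen) / k

        # Average B (assumed): inverse-error weights over the chosen models.
        w = [1.0 / (1.0 + model_sse[j]) for j in chosen]
        avg_b = sum(wi * f[j] for wi, j in zip(w, chosen)) / sum(w)

        # Convex weight with the best track record so far (ties go to the first).
        best = min(range(grid), key=lambda i: lam_sse[i])
        pred = lams[best] * avg_a + (1.0 - lams[best]) * avg_b
        preds.append(pred)

        # Only now is the outcome revealed; score the prediction and update.
        total_loss += (y - pred) ** 2
        for j in range(n_models):
            model_sse[j] += (y - f[j]) ** 2
        for i, lam in enumerate(lams):
            lam_sse[i] += (y - (lam * avg_a + (1.0 - lam) * avg_b)) ** 2
        history.append(y)

    return preds, total_loss
```

Calling `prequential_run` on a sequence of responses returns the one-step-ahead predictions together with their cumulative squared error, which is the kind of predictive score a Prequential comparison between methods would rest on.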