Volume 73, Issue 4
BIOMETRIC METHODOLOGY

Stagewise generalized estimating equations with grouped variables

Gregory Vaughan

Department of Statistics, University of Connecticut, Storrs, Connecticut, U.S.A.

Robert Aseltine

Division of Behavioral Science and Community Health, University of Connecticut Health Center, Farmington, Connecticut, U.S.A.

Center for Public Health and Health Policy, University of Connecticut Health Center, Farmington, Connecticut, U.S.A.

Kun Chen

Department of Statistics, University of Connecticut, Storrs, Connecticut, U.S.A.

Center for Public Health and Health Policy, University of Connecticut Health Center, Farmington, Connecticut, U.S.A.

Jun Yan

Corresponding Author

E-mail address: jun.yan@uconn.edu

Department of Statistics, University of Connecticut, Storrs, Connecticut, U.S.A.

Center for Public Health and Health Policy, University of Connecticut Health Center, Farmington, Connecticut, U.S.A.

First published: 13 February 2017
Citations: 3

Summary

Forward stagewise estimation is a revived slow‐brewing approach for model building that is particularly attractive in dealing with complex data structures for both its computational efficiency and its intrinsic connections with penalized estimation. Under the framework of generalized estimating equations, we study general stagewise estimation approaches that can handle clustered data and non‐Gaussian/non‐linear models in the presence of prior variable grouping structure. As the grouping structure is often not ideal in that even the important groups may contain irrelevant variables, the key is to simultaneously conduct group selection and within‐group variable selection, that is, bi‐level selection. We propose two approaches to address the challenge. The first is a bi‐level stagewise estimating equations (BiSEE) approach, which is shown to correspond to the sparse group lasso penalized regression. The second is a hierarchical stagewise estimating equations (HiSEE) approach to handle more general hierarchical grouping structure, in which each stagewise estimation step itself is executed as a hierarchical selection process based on the grouping structure. Simulation studies show that BiSEE and HiSEE yield competitive model selection and predictive performance compared to existing approaches. We apply the proposed approaches to study the association between the suicide‐related hospitalization rates of the 15–19 age group and the characteristics of the school districts in the State of Connecticut.
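The bi-level idea described above (choose a group, then a variable within it, then take a small step) can be illustrated with a toy sketch. This is not the authors' BiSEE or HiSEE algorithm. It assumes an identity link and an independence working correlation, so the estimating function reduces to U(beta) = X'(y - X beta)/n. The function name `bilevel_stagewise`, its parameters, and the size-normalised group score are all hypothetical choices made for illustration.

```python
import numpy as np


def bilevel_stagewise(X, y, groups, step=0.01, n_steps=500):
    """Toy bi-level forward stagewise fit (illustrative assumption, not BiSEE).

    Assumes an identity link and an independence working correlation, so the
    estimating function is U(beta) = X^T (y - X beta) / n.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, p = X.shape
    beta = np.zeros(p)

    for _ in range(n_steps):
        # Current value of the estimating function at beta.
        U = X.T @ (y - X @ beta) / n

        # Group level: pick the group whose estimating function has the
        # largest size-normalised Euclidean norm.
        scores = [
            np.linalg.norm(U[groups == g]) / np.sqrt(np.sum(groups == g))
            for g in labels
        ]
        best_group = labels[int(np.argmax(scores))]

        # Within-group level: pick the variable with the largest |U_j|.
        idx = np.flatnonzero(groups == best_group)
        j = idx[int(np.argmax(np.abs(U[idx])))]

        # Stagewise update: a small fixed step in the direction of U_j.
        beta[j] += step * np.sign(U[j])

    return beta
```

With a small `step`, coefficients enter the model gradually, and the number of steps plays the role of a tuning parameter, much as a penalty level does in penalized estimation. Handling clustered data with a non-independence working correlation, or a non-identity link, would require replacing `U` with the corresponding generalized estimating function.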

Number of times cited according to CrossRef: 3

  • Efficient interaction selection for clustered data via stagewise generalized estimating equations. Statistics in Medicine 39(22), 2855-2868 (2020). doi:10.1002/sim.8574
  • Sequential adaptive variables and subject selection for GEE methods. Biometrics 76(2), 496-507 (2019). doi:10.1111/biom.13160
  • Efficient big data model selection with applications to fraud detection. International Journal of Forecasting (2018). doi:10.1016/j.ijforecast.2018.03.002
