Standard Article

Categorical Data

Ecological Statistics

  1. Peter B. Imrey,
  2. Douglas G. Simpson

Published Online: 15 JAN 2013

DOI: 10.1002/9780470057339.vac011.pub2

Encyclopedia of Environmetrics

How to Cite

Imrey, P. B. and Simpson, D. G. 2013. Categorical Data. Encyclopedia of Environmetrics. 1.

Author Information

  1. University of Illinois, IL, USA


Categorical data refers to counts of events or individuals observed through some defined process and allocated to subgroups, or categories, corresponding to levels of one or more attributes. This article reviews methods for interpreting collections of such counts when they arise from apparently random environmental processes and may be treated as dependent variables relative to potentially explanatory factors or covariates. An introduction of basic terminology, including measures of relative frequency and association, is followed by a review of the Poisson, binomial, multinomial, and hypergeometric probability distributions and products thereof that result from conditioning upon sums of independent Poisson counts. These form the basis for modeling the random variation in observed categorical data. For modeling structural relationships, generalized linear models are first defined, and Poisson regression, logistic regression, and log-linear models are each considered within that framework. Methods are summarized for analyzing correlated counts that arise from observing a categorical dependent variable on the same observational units under several measurement conditions or at multiple observation times, or on multiple observational units within matched sets. These methods include weighted least-squares functional regression, conditional logistic regression, Cochran–Mantel–Haenszel tests, generalized linear mixed models, and analyses using generalized estimating equations. Overdispersed, zero-inflated, and ordinal data, Bayes and empirical Bayes methods, spatial modeling, exact methods, and software available in late 2011 are briefly discussed.
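The role of conditioning on sums of independent Poisson counts can be illustrated with the simplest case: for two independent Poisson counts, the distribution of the first count given their total is binomial, with success probability equal to the first mean divided by the sum of the means. The following is a minimal numerical sketch of that fact, not code from the article; the function names and the means 3.0 and 5.0 are illustrative assumptions.

```python
from math import comb, exp, factorial


def poisson_pmf(k, mean):
    """Probability that a Poisson count with the given mean equals k."""
    return exp(-mean) * mean**k / factorial(k)


def conditional_pmf(k, n, mean1, mean2):
    """P(X1 = k | X1 + X2 = n) for independent Poisson counts X1 and X2."""
    joint = poisson_pmf(k, mean1) * poisson_pmf(n - k, mean2)
    # The sum of independent Poisson counts is Poisson with the summed mean.
    total = poisson_pmf(n, mean1 + mean2)
    return joint / total


def binomial_pmf(k, n, p):
    """Probability of k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Illustrative (assumed) values: two Poisson means and a fixed observed total.
mean1, mean2, n = 3.0, 5.0, 10
p = mean1 / (mean1 + mean2)

# Largest discrepancy between the conditional Poisson and binomial probabilities.
max_gap = max(
    abs(conditional_pmf(k, n, mean1, mean2) - binomial_pmf(k, n, p))
    for k in range(n + 1)
)
print(f"largest absolute difference: {max_gap:.2e}")
```

The same argument with more than two independent Poisson counts gives a multinomial distribution for the counts given their total.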


  • categorical data;
  • random counts;
  • Poisson regression;
  • logistic regression;
  • log-linear models;
  • generalized linear models;
  • Cochran–Mantel–Haenszel tests;
  • weighted least-squares models;
  • generalized linear mixed models;
  • generalized estimating equations;
  • spatial analysis of rates