A Network-based Analysis of the 1861 Hagelloch Measles Data

Authors

  • Chris Groendyke,

    Corresponding author
    1. Department of Statistics, Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A.
    2. Current address: Department of Mathematics, Robert Morris University, Moon Township, Pennsylvania 15108, U.S.A.
  • David Welch,

    1. Department of Statistics, Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A.
    2. Center for Infectious Disease Dynamics, Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A.
    3. Current address: Department of Computer Science, University of Auckland, Auckland 1142, New Zealand
  • David R. Hunter

    1. Department of Statistics, Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A.
    2. Center for Infectious Disease Dynamics, Pennsylvania State University, University Park, Pennsylvania 16802, U.S.A.

email: groendyke@rmu.edu

Abstract

Summary: In this article, we demonstrate a statistical method for fitting the parameters of a sophisticated network and epidemic model to disease data. The pattern of contacts between hosts is described by a class of dyadic independence exponential-family random graph models (ERGMs), whereas the transmission process that runs over the network is modeled as a stochastic susceptible-exposed-infectious-removed (SEIR) epidemic. We fit these models to very detailed data from the 1861 measles outbreak in Hagelloch, Germany. The network models include parameters for all recorded host covariates, including age, sex, household and classroom membership, and household location, whereas the SEIR epidemic model has exponentially distributed transmission times with gamma-distributed latent and infective periods. This approach allows us to make meaningful statements about the structure of the population, separate from the transmission process, as well as to provide estimates of various biological quantities of interest, such as the effective reproductive number, R. Using reversible jump Markov chain Monte Carlo, we produce samples from the joint posterior distribution of all the unknowns of this model (the network, transmission tree, network parameters, and SEIR parameters) and perform Bayesian model selection to find the best-fitting network model. We compare our results with those of previous analyses and show that the ERGM network model fits the data better than a Bernoulli network model used previously. We also provide a software package, written in R, that performs this type of analysis.
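
To make the model structure described in the summary concrete, the following is a minimal forward-simulation sketch in Python. It is not the authors' R package and performs no inference. It assumes a dyadic independence network in which the log-odds of a contact depend only on an edge term and a shared-household term, then runs an SEIR epidemic over that network with exponentially distributed transmission times and gamma-distributed latent and infective periods. The function names, the single household covariate, and all parameter values are hypothetical choices made for illustration.

```python
import heapq
import math
import random


def sample_contact_network(households, eta_edges, eta_household, rng):
    """Draw a dyadic independence graph: every pair is an independent
    Bernoulli trial whose log-odds depend on shared household membership."""
    n = len(households)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            same = households[i] == households[j]
            log_odds = eta_edges + eta_household * same
            if rng.random() < 1.0 / (1.0 + math.exp(-log_odds)):
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def simulate_seir(neighbors, beta, latent, infective, rng, seed_node=0):
    """Simulate an SEIR epidemic over a fixed contact network.

    beta is the per-edge transmission rate; latent and infective are
    (shape, scale) pairs for the gamma-distributed period lengths.
    Returns {node: (t_exposed, t_infectious, t_removed)} for infected nodes.
    """
    times = {}
    queue = [(0.0, seed_node)]  # candidate exposures, earliest first
    while queue:
        t_exposed, node = heapq.heappop(queue)
        if node in times:
            continue  # already infected by an earlier contact
        t_infectious = t_exposed + rng.gammavariate(*latent)
        t_removed = t_infectious + rng.gammavariate(*infective)
        times[node] = (t_exposed, t_infectious, t_removed)
        for other in sorted(neighbors[node]):
            if other not in times:
                # Exponential waiting time measured from infectious onset.
                t_contact = t_infectious + rng.expovariate(beta)
                if t_contact < t_removed:
                    heapq.heappush(queue, (t_contact, other))
    return times


if __name__ == "__main__":
    rng = random.Random(1861)
    households = [i // 4 for i in range(40)]  # hypothetical: 10 households of 4
    network = sample_contact_network(households, -3.0, 4.0, rng)
    epidemic = simulate_seir(network, 0.5, (4.0, 2.5), (3.0, 2.0), rng)
    print(f"{len(epidemic)} of {len(households)} hosts infected")
```

Because the network and the epidemic are generated in two separate steps, the network parameters and the SEIR parameters play distinct roles, which is the separation between population structure and transmission process that the summary emphasizes.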
