Keywords:

  • Biosurveillance;
  • Clusters;
  • Control chart;
  • Epidemics;
  • Infectious diseases;
  • Outbreak;
  • Prospective detection;
  • Surveillance

Abstract

  1. Top of page
  2. Abstract
  3. 1. Setting the scene
  4. 2. Regression techniques
  5. 3. Time series methodology
  6. 4. Methods inspired by statistical process control
  7. 5. Methods incorporating spatial information
  8. 6. Multivariate outbreak detection
  9. 7. Comparison and evaluation of prospective outbreak detection systems
  10. 8. Final remarks
  11. Acknowledgements
  12. References

Summary.  Unusual clusters of disease must be detected rapidly for effective public health interventions to be introduced. Over the past decade there has been a surge in interest in statistical methods for the early detection of infectious disease outbreaks. This growth in interest has given rise to much new methodological work, ranging across the spectrum of statistical methods. The paper presents a comprehensive review of the statistical approaches that have been proposed. Applications to both laboratory and syndromic surveillance data are provided to illustrate the various methods.


1. Setting the scene

The past decade has witnessed a large increase in research activity on the statistical issues that are related to prospective detection of outbreaks of infectious diseases. The major challenges in this expanding field derive from its focus on prospective detection, namely detection of outbreaks as they arise, in a sufficiently timely fashion to enable effective control measures to be taken. The growth in this area, which is sometimes now referred to as biosurveillance (e.g. Shmueli and Burkom (2010)), has been so rapid as to spawn conferences, a Learned Society, the International Society for Disease Surveillance (http://www.syndromic.org), which was founded in 2005, and an entire issue of Statistics in Medicine introduced by Rolka (2011).

Investigations of outbreaks go back at least to John Snow's iconic removal of the handle of London's Broad Street pump during the 1854 cholera epidemic. In the modern era, following a trend that is apparent in all areas of epidemiology, statistical methods have come to the fore in outbreak detection and control. For several decades (see Tillett and Spencer (1982) for an early example), statistical techniques have been used to provide early warnings of outbreaks, supplementing more traditional surveillance based on a network of alert public health physicians. Since the early 1990s, the increasingly widespread availability of computerized databases which can be interrogated for evidence of emerging outbreaks has greatly facilitated the use of statistical outbreak detection and has witnessed the creation of automated detection systems to process data on very large numbers of infections at frequent time intervals. Since the turn of the 21st century, two factors have combined to give further impetus to developments in this area: new concerns about the possible threat of large-scale bioterrorism and heightened public and media awareness about emerging or re-emerging infections, including hospital-acquired infections such as methicillin-resistant Staphylococcus aureus and Clostridium difficile, and global epidemics such as severe acute respiratory syndrome in 2002–2003 and the 2009 H1N1 swine influenza pandemic. Similar statistical surveillance methods have also been used for the early detection of new antimicrobial resistant strains of infectious pathogens.

Under these influences, a new focus has emerged, namely the surveillance of syndromes, which complements the previous emphasis on surveillance of infections. Thus, much of the new literature in the field relates to syndromic surveillance, which exploits more diverse sources of data, such as calls to telephone or Internet helplines, medical consultations and pharmacy sales, that are believed to reflect in more timely fashion changes in behaviour that may stem from a large-scale bioterrorist incident. So far, no such incident has occurred. However, the need to detect outbreaks from more mundane sources—such as contaminated foodstuffs, breakdowns in water treatment plants, low vaccine efficacy or imported infections—remains and has led to further developments, as witnessed by the inclusion of routine methods for statistical outbreak detection in laboratory-based surveillance systems in several European countries (Hulth et al., 2010).

Both syndromic and laboratory-based prospective surveillance methods for outbreak detection pose diverse statistical challenges, relating to sources of data, evaluation, multiplicity control and follow-up, as well as the statistical techniques used to detect anomalies in data series. Periodic reviews of these methods, with varying emphases, have appeared in the statistical literature, notably Sonesson and Bock (2003), Farrington and Andrews (2004), Buckeridge et al. (2005) and Shmueli and Burkom (2010). There have also been numerous developments in related fields such as pharmacovigilance and institutional performance monitoring.

The aim of the present review is to provide an account of the statistical methodologies that have been proposed for detecting anomalies in data series, specifically in the context of prospective outbreak detection. These methods are used to identify unusual patterns in data, which may result from infectious disease outbreaks. Prospective detection involves identifying anomalies as they arise, to enable control measures to be implemented, if deemed appropriate.

Our aim is not to cover the entire range of statistical issues or sources of data that are relevant to outbreak detection (Fienberg and Shmueli, 2005); comprehensive coverage is probably no longer possible in a single review paper. Nor do we seek to document all the variants of each methodological approach, as developed in response to particular circumstances. Rather, we focus entirely on broad classes of statistical methods for detecting aberrations. Our motivation for doing so is to inform a detailed study of some of the outbreak detection systems that are used in the UK. Surveillance for healthcare-associated infections, such as surgical site infections or ventilator-associated pneumonia, is outside the remit of this review, although it is mentioned in places, as some aspects of healthcare surveillance overlap disease outbreak detection. Much healthcare surveillance, however, addresses different aims, such as the monitoring of hospital performance and the evaluation of quality indicators. The paper is structured in the following sections: regression techniques (Section 2), time series methodology (Section 3), methods inspired by statistical process control (Section 4), methods incorporating spatial information (Section 5) and multivariate outbreak detection (Section 6). We stress that this classification is chosen only to help our presentation of material and is not based on any rigorous taxonomy. Several methods could be classified under more than one heading. We include a brief review of evaluation methodologies in Section 7. Concluding comments are given in Section 8.

2. Regression techniques

Regression methods of outbreak detection have been widely used, both for detecting outbreaks in surveillance systems on the basis of laboratory reports and notified infections, and for syndromic surveillance. Their application differs from other areas of biostatistics, in that they are used primarily to obtain standardized residuals. The distribution of these residuals in the absence of an outbreak is then used to determine a threshold value.

Regression methods can be regarded as extending the Shewhart chart (Shewhart, 1931), in which a process variable yt, which is normally distributed N(μ, σ²) when in control, is monitored by tracking the values of yt − μ, an alert being declared when |yt − μ| > kσ for some prespecified value of k. (Throughout the paper, random variables and their realizations are not distinguished by using upper-case and lower-case letters. This is because in the multivariate case the reader will be more concerned about whether a vector or matrix of data is involved than with the distinction between random variables and their observed values.)

When applied to outbreak detection, only the upper control limit μ+kσ is usually of interest. Regression methods generalize the Shewhart chart in three respects: the in-control mean μ and possibly the in-control standard deviation σ vary with time; both these quantities must be estimated from historical data; and the distribution of yt may not be normal.

The performance of regression outbreak detection methods may be expected to reflect the performance of Shewhart charts: applied to an observation at time t, they are effective at detecting large outbreaks starting at time t, but rather less effective at detecting more gradual outbreaks starting at some time earlier than t.
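The one-sided Shewhart rule that regression methods generalize can be sketched in a few lines. This is an illustrative sketch only: the function name `shewhart_alarms`, the toy baseline and the choice k = 3 are assumptions made for the example, and μ and σ are simply estimated by the sample mean and standard deviation of an outbreak-free baseline.

```python
from statistics import mean, stdev

def shewhart_alarms(baseline, current, k=3.0):
    """Flag values in `current` that exceed the upper control limit mu + k*sigma.

    mu and sigma are estimated from `baseline` (assumed outbreak-free).
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    upper = mu + k * sigma
    return upper, [y > upper for y in current]

# Hypothetical weekly counts, for illustration only.
baseline = [10, 12, 11, 9, 10, 13, 11, 10, 12, 12]
upper, alarms = shewhart_alarms(baseline, [11, 14, 25], k=3.0)
```

Because each observation is compared with the limit in isolation, a slow rise that never crosses the limit in any single period goes unflagged, which is the weakness noted above.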

In Section 2.1 and Section 2.2 parametric and semiparametric regression methods are described. Most regression methods are based on a threshold value, above which reports are declared aberrant. How these thresholds may be obtained is explained in Section 2.3, whereas in Section 2.4 non-thresholding methods are considered.

2.1. Parametric models

Perhaps the simplest regression model for outbreak detection is that described by Stroup et al. (1989), in which the expected disease count at month t, E(yt), is calculated as the mean of observed counts at months t−1,t and t+1 over some prespecified number of years. This ensures that seasonal effects are automatically adjusted for by design rather than by explicit modelling, thus providing some element of robustness. However, this model does not incorporate time trends. Stroup et al. (1989) applied this model, using normal errors, to data on notifiable infections. The results are summarized in a simple graphic that is published routinely in the Morbidity and Mortality Weekly Report of the Centers for Disease Control and Prevention (Stroup et al., 1993).
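A minimal sketch of this kind of historical baseline follows, assuming a single list of monthly counts indexed from 0 and a simple mean-plus-k-standard-deviations rule under normal errors. The names `stroup_baseline` and `is_aberrant` and the default k = 2 are hypothetical choices for illustration, not the published procedure.

```python
from statistics import mean, stdev

def stroup_baseline(series, t, years=5):
    """Mean and sd of counts at months t-1, t, t+1 in each of the previous `years` years.

    `series` is a list of monthly counts; `t` is the index of the current month.
    """
    if t - 12 * years - 1 < 0:
        raise ValueError("not enough history for the requested number of years")
    window = [series[t - 12 * j + d] for j in range(1, years + 1) for d in (-1, 0, 1)]
    return mean(window), stdev(window)

def is_aberrant(series, t, years=5, k=2.0):
    """Flag month t if its count exceeds baseline mean + k * sd (normal approximation)."""
    mu, sd = stroup_baseline(series, t, years)
    return series[t] > mu + k * sd
```

Because the comparison window is centred on the same calendar month in earlier years, seasonality is handled without any explicit seasonal term, but a steady upward trend would inflate the apparent excess.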

A commonly used fully parametric outbreak detection regression model is based on that of Serfling (1963), who modelled historical baselines by using a trigonometric function with linear trend of the form

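  • $E(y_t) = \alpha_0 + \alpha_1 t + \sum_{j=1}^{r} \{ \beta_j \cos(2\pi j t / 52) + \gamma_j \sin(2\pi j t / 52) \}$   (1)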

and normal errors with constant variance. Serfling (1963) used the regression equation (1) to estimate excess mortality due to influenza on the basis of weekly data on pneumonia–influenza deaths. This model has subsequently been used to detect the onset of epidemics of influenza (Costagliola et al., 1991, 1994). An automated version of the Serfling model with cubic trend and three trigonometric terms (i.e. r=3), with model selection based on the Akaike information criterion, has been developed for prospective and retrospective surveillance and is available as a Web-based application (Pelat et al., 2007) at http://www.u707.jussieu.fr/periodic__regression. Fig. 1 shows a sample output screen from the system of Pelat et al. (2007), displaying the result of a prospective analysis of weekly counts of Salmonella enteritidis phage type 4 in England, Wales and Northern Ireland from 2000 to 2009 and model-based extrapolation for the year 2010 with an epidemic threshold. The user must first select a subset of the whole data series (which is called the ‘training period’) which is used to estimate the baseline level. In Fig. 1 the training period consists of the years 2006–2009. Then the 20% highest values in the training data are excluded to account for past outbreaks (default value, 15%). A Serfling-type regression equation is used to model the baseline. A threshold is obtained by taking an upper percentile for the prediction distribution; here the upper 90th percentile is chosen. The system declares an aberration as soon as an observation exceeds this threshold.
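The fit-then-threshold logic of a Serfling-type baseline can be sketched as follows. This is an assumption-laden illustration rather than the system of Pelat et al. (2007): it uses a linear trend with a single annual harmonic on a 52-week period, ordinary least squares via the normal equations, and a threshold equal to the fitted baseline plus z residual standard deviations, which ignores parameter uncertainty and does not exclude past outbreaks from the training data. All function names are hypothetical.

```python
import math

def design_row(t, period=52.0):
    """Regressors for week t: intercept, linear trend, one cosine and one sine term."""
    w = 2.0 * math.pi * t / period
    return [1.0, float(t), math.cos(w), math.sin(w)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_serfling(times, counts, period=52.0):
    """Least-squares coefficients and residual standard deviation for the baseline."""
    rows = [design_row(t, period) for t in times]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, counts)) for i in range(p)]
    beta = solve(xtx, xty)
    resid = [y - sum(b * x for b, x in zip(beta, r)) for r, y in zip(rows, counts)]
    sigma = math.sqrt(sum(e * e for e in resid) / (len(counts) - p))
    return beta, sigma

def threshold(t, beta, sigma, z=1.645, period=52.0):
    """Fitted baseline at week t plus z residual standard deviations."""
    return sum(b * x for b, x in zip(beta, design_row(t, period))) + z * sigma
```

A week t would then be flagged when its observed count exceeds `threshold(t, beta, sigma)`; adding further harmonics or a higher-order trend only changes `design_row`.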

Figure 1.  Sample output screen from the system of Pelat et al. (2007) (see the text for explanation), showing the observed counts, the model for the baseline and the upper forecast limit

When data are sparse, the normal errors regression model is inappropriate. Parker (1989) instead used Poisson regression with a logarithmic link to monitor the mortality that is associated with abortions. Such a model can be elaborated at will; Jackson et al. (2007) described a Poisson log-linear model for syndromic surveillance with terms for the day of the week, month, linear time trend and holidays. If denominator population data are available, binomial logistic models can be fitted in much the same way, the flexibility of the generalized linear model approach allowing further extensions to include random effects, e.g. to represent spatial variation (see Section 5).

The log-linear regression model of Farrington et al. (1996), like that of Stroup et al. (1989), adjusts for seasonal effects by design and explicitly allows for linear trends. Since much surveillance data exhibit considerable overdispersion, the model is quasi-Poisson with

  • $E(y_t) = \mu_t, \quad \log(\mu_t) = \alpha + \beta t, \quad \mathrm{var}(y_t) = \phi \mu_t$   (2)

where φ is the dispersion parameter. To try to reduce the influence of past outbreaks, baseline values with high residuals are given lower weights in the regression. Estimates of model parameters in expression (2) are obtained by a quasi-likelihood method.

The method of Farrington et al. (1996) is used routinely by the Health Protection Agency to detect outbreaks in laboratory-based surveillance data in England and Wales, and is referred to as the ‘Lab-Base’ exceedance system hereafter. Elaborations of the Lab-Base exceedance system have been described for use with laboratory data in Scotland (McCabe et al., 2003) and the Netherlands (Widdowson et al., 2003).

2.2. Semiparametric models

All the regression models so far described use a parametric model to represent the historical data. A contrasting strategy is to use a non-parametric model for the historical baseline, as widely used in monitoring mortality and other effects in environmental time series (Dominici et al., 2003). The ‘Salmonella potential outbreak targeting’ system that was described by Stern and Lightfoot (1999) uses a smoothing method to obtain baselines and standard deviations. 5 years’ historical data are first smoothed, and the baseline value for each time point in the yearly cycle is taken to be the median of the five smoothed values. The standard deviation is obtained by smoothing the residuals (raw values minus smooth values). This system implicitly assumes Gaussian errors by using an alarm threshold of 2 standard deviations above the baseline (with a filter for low counts).

Wieland et al. (2007) proposed to model both the mean μt and the variance $\sigma_t^2$ at each time point t, using separate generalized additive models (GAMs) for both quantities. First, a GAM is fitted to historical data to obtain $\hat{\mu}_t$. Then a second GAM is fitted to the residuals from the first GAM to obtain $\hat{\sigma}_t^2$. The threshold is taken as $\hat{\mu}_t + k\hat{\sigma}_t$ for some choice of k. The smooth terms in the GAMs were based on Gaussian kernel smoothers, with bandwidth chosen to minimize the mean predictive squared error on the historical data.

Mean regression methods lack robustness to the presence of outbreaks in the baseline values, which bias $\hat{\mu}_t$ upwards and hence reduce the sensitivity of the system. Sensitivity is the probability that a true outbreak is detected: a high sensitivity is desirable for outbreaks of public health importance. Serfling (1963) identified past outbreaks by visual inspection and omitted them from the model; this will tend to bias $\hat{\mu}_t$ downwards and thus to reduce the specificity. Specificity is the probability that a time period without an outbreak is correctly identified as such. The complement of this, 1 − specificity, must be kept low to avoid user fatigue and loss of credibility. The Lab-Base exceedance system (Farrington et al., 1996) downweights outliers in the baselines, but this reduces rather than eliminates the bias, as do non-parametric smoothing techniques.

An alternative is to use a wavelet transform of the baseline values to account for low frequency variation due to trends and seasonality, while remaining robust to high frequency variation resulting from past outbreaks or other artefacts, such as holiday dips. This approach was proposed by Zhang et al. (2003), whose simple and hence easily automated wavelet-based anomaly detector subtracts a baseline value obtained by using the wavelet transform and bases thresholds on the distribution of the residuals (see also Wieland et al. (2007)). More complex wavelet-based methods, which may also be applied in a time series framework, were used by Goldenberg et al. (2002) and discussed in Shmueli (2005).

2.3. Obtaining the thresholds

Most regression-based methods specify a model for the mean at time t and declare an alarm at time t if the observed value lies above some threshold determined by the sample statistics and the quantiles of a suitable distribution, e.g. the normal, Poisson or negative binomial distributions. A commonly used procedure for large counts is to estimate the baseline value $\hat{\mu}_t$ and the process variance at time t, $\hat{\sigma}_t^2$, and, assuming normal errors, to define the upper threshold by

  • $u_t = \hat{\mu}_t + k\hat{\sigma}_t$

where in many applications it is further assumed that the process variance is constant: $\sigma_t^2 = \sigma^2$. A more accurate approach is to base the threshold on an upper 100(1−α)% prediction limit which takes account of both the process variance and the uncertainty in the estimation of the baseline value. Thus, for example, the Poisson model of Parker (1989) combines the estimated process variance $\hat{\sigma}_t^2$ with the regression error variance $\widehat{\mathrm{var}}(\hat{\mu}_t)$, obtained by using the delta method, to obtain the total variance

  • $\hat{\sigma}_t^2 + \widehat{\mathrm{var}}(\hat{\mu}_t)$

The quasi-Poisson method of Farrington et al. (1996) uses a similar approach. Clearly, regression methods do not account for serial correlation in the baselines. Kafadar and Stroup (1992) investigated bootstrap and jackknife procedures to estimate the variance in the presence of such correlation but suggested that such adjustments are inadvisable unless the auto-correlation structure is known or can be well estimated.

A further problem with thresholds of the form μt+kσ is that the normal theory on which they are based is usually inappropriate, especially when the background means μt are small. The accuracy of the threshold—namely the extent to which P(yt>ut) matches the nominal 100α% level on which it is based—will usually vary with μt; hence the sensitivity and specificity will vary with μt. This is undesirable for systems that are designed to monitor several different data series with a wide range of expected values.
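The dependence of the false alarm probability on the background mean can be checked numerically. The sketch below assumes exactly Poisson counts with known mean μ (so the standard deviation is √μ) and a normal-theory threshold μ + 1.645√μ with nominal exceedance probability 5%; the function names are hypothetical.

```python
import math

def poisson_sf(threshold, mu):
    """P(Y > threshold) for Y ~ Poisson(mu), by direct summation of the pmf."""
    kmax = math.floor(threshold)
    if kmax < 0:
        return 1.0
    term = math.exp(-mu)  # P(Y = 0)
    cdf = term
    for k in range(1, kmax + 1):
        term *= mu / k    # P(Y = k) from P(Y = k - 1)
        cdf += term
    return 1.0 - cdf

def false_alarm_rate(mu, k=1.645):
    """Actual exceedance probability of the normal-theory threshold mu + k*sqrt(mu)."""
    return poisson_sf(mu + k * math.sqrt(mu), mu)

rates = {mu: false_alarm_rate(mu) for mu in (0.5, 2.0, 10.0, 50.0)}
```

Under these assumptions the actual rate is about 0.09 for μ = 0.5 and about 0.05 for μ = 2, so a single value of k does not deliver the same specificity across series with different background levels.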

Consequently, some detection algorithms apply a transformation to approximate normality, or to approximate symmetry, derive the threshold on the transformed scale and then transform the threshold back to the original scale. The quasi-Poisson method of Farrington et al. (1996) uses the 2/3-power transformation, which yields approximate symmetry for Poisson data. The upper threshold is defined as

  • $u_t = \hat{\mu}_t \left\{ 1 + \tfrac{2}{3} z \, \hat{\mu}_t^{-1} \left[ \hat{\phi}\hat{\mu}_t + \widehat{\mathrm{var}}(\hat{\mu}_t) \right]^{1/2} \right\}^{3/2}$

where z is the 100(1−α)-percentile of the standard normal distribution, $\hat{\mu}_t$ is the estimated baseline, $\hat{\phi}$ is the estimated dispersion parameter and $\widehat{\mathrm{var}}(\hat{\mu}_t)$ is the estimated variance of $\hat{\mu}_t$. The rationale is that, for rare organisms, φ≃1 and counts are distributed approximately Poisson, whereas for frequent organisms normal approximations are valid. There are many possibilities for obtaining a normal approximation to the error distribution: for example Cooper et al. (2004) used a hyperbolic-sine-based transformation applied to a binomial proportion to achieve a similar result.
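As a worked illustration, the sketch below evaluates a threshold of the 2/3-power type, assuming it takes the form baseline × {1 + (2/3) z √(φμ + var(μ̂)) / μ}^(3/2) as in Farrington et al. (1996). The function name and argument names are hypothetical, and the inputs (estimated baseline, its variance and the dispersion) are taken as given rather than estimated here.

```python
import math

def farrington_threshold(mu_hat, var_mu_hat, phi_hat, z=1.645):
    """Upper threshold obtained on the 2/3-power scale and transformed back.

    mu_hat     : estimated baseline count (must be positive)
    var_mu_hat : estimated variance of mu_hat
    phi_hat    : estimated dispersion parameter (1 for Poisson data)
    z          : standard normal percentile for the chosen alpha
    """
    total_var = phi_hat * mu_hat + var_mu_hat  # process variance plus estimation variance
    return mu_hat * (1.0 + (2.0 / 3.0) * z * math.sqrt(total_var) / mu_hat) ** 1.5

# Example: baseline of 4 cases, Poisson dispersion, baseline treated as known.
u = farrington_threshold(mu_hat=4.0, var_mu_hat=0.0, phi_hat=1.0)
```

With these inputs the threshold is roughly 7.7 cases, somewhat above the 7.3 that the untransformed rule μ + 1.645√μ would give, reflecting the right skew of small Poisson counts.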

2.4. Non-thresholding methods

Most regression methods for outbreak detection use a threshold ut at time t to determine whether the current observation yt is aberrant. An alternative is to test the null hypothesis that yt belongs to the same distribution as the baseline values; Parker (1989) discussed various tests, including the likelihood ratio test, for Poisson data.

Other criteria for detecting outbreaks may be specified, based on the qualitative features of an outbreak, such as the start of an outbreak, at which point a previously stationary time series begins to increase, and the peak of an outbreak, at which the counts stop increasing and start decreasing (Andersson et al., 2008). Bock et al. (2008) described methods for identifying the peak of an epidemic, and Frisén and Andersson (2009), Frisén et al. (2009, 2010) describe parametric and semiparametric regression methods for detecting the onset of an epidemic.

The semiparametric model to detect onset of epidemics assumes that the disease counts belong to a specified distribution within the regular exponential family. Under the semiparametric version of the model, it is assumed that the disease counts ys up to time t (s ≤ t) either have constant mean (the null hypothesis) or increase monotonically from time t=d where d is known (the alternative hypothesis). For Poisson counts that increase from the start of the series (d=1), the likelihood ratio statistic is

  • $LR = \prod_{s=1}^{t} \left( \hat{\mu}_s^{(1)} / \hat{\mu}_s^{(0)} \right)^{y_s}$   (3)

where $\hat{\mu}_s^{(0)}$ is the maximum likelihood estimator of the process mean at time s under the null hypothesis, and $\hat{\mu}_s^{(1)}$ is the maximum likelihood estimator under the alternative hypothesis. An alert is declared if LR>k for some prespecified value k. Frisén et al. (2009, 2010) describe applications to outbreaks of influenza and tularaemia in Sweden. Computer software for this surveillance method is available (http://www.statistics.gu.se/surveillance), called ‘Outbreak detection P’, both as an SAS program and as a Visual Basic for Applications macro in Microsoft Excel for Windows. The semiparametric method by Frisén and Andersson (2009) is also implemented in the R software package surveillance (Höhle, 2007; Höhle and Mazick, 2010). The surveillance package provides interfaces of several well-known procedures for prospective outbreak detection, among others those of Stroup et al. (1989) and Farrington et al. (1996) and the system that is used by the Robert Koch Institute, Germany (Hulth et al., 2010). For a complete overview of its content, see http://surveillance.r-forge.r-project.org.
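A minimal sketch of such a likelihood ratio for Poisson counts with d = 1 is given below. It assumes that the null estimate is the overall mean and that the estimate under the monotone alternative is the non-decreasing isotonic regression of the counts, computed with the pool-adjacent-violators algorithm; since both sets of fitted means sum to the observed total, the exponential terms of the Poisson likelihoods cancel and only the product of ratios remains. The function names are hypothetical and this is not the code of the software cited above.

```python
def pava_increasing(y):
    """Pool-adjacent-violators: non-decreasing least-squares fit to the sequence y."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in y:
        blocks.append([float(v), 1])
        # Merge backwards while the previous block mean exceeds the last block mean.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fit = []
    for s, n in blocks:
        fit.extend([s / n] * n)
    return fit

def outbreak_lr(y):
    """Likelihood ratio of a monotone increasing Poisson mean against a constant mean."""
    mu0 = sum(y) / len(y)       # estimate under the null: constant mean
    mu1 = pava_increasing(y)    # estimate under the alternative: non-decreasing means
    lr = 1.0
    for ys, m1 in zip(y, mu1):
        if ys > 0:              # zero counts contribute a factor of 1
            lr *= (m1 / mu0) ** ys
    return lr
```

For the counts 1, 1, 2, 4 the statistic equals 4, whereas for any non-increasing sequence the monotone fit collapses to the overall mean and the statistic equals 1. In prospective use the statistic would be recomputed as each new count arrives and compared with k.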

3. Time series methodology

Unlike most regression techniques, time series methods acknowledge the correlation structure of the data. For syndromic and laboratory data which are generally collected daily or weekly the principal correlations are auto-correlations at a lag of one time period (serial correlation) and correlations associated with the seasonal pattern in the data, which can be a combination of weekly or yearly seasonality. When time series data are available over a relatively long period of time, it is important to estimate the trend and seasonal components as the auto-correlation structure can only be identified by using a stationary time series. Methods for estimating the trend and seasonal effects are briefly described in Section 3.1. Failure to account properly for the auto-correlation will result in a misspecified model which may have bias in the estimated effects and prediction intervals which are too narrow, leading to a larger number of potential exceptions. The Box–Jenkins-type methodology was designed to take the auto-correlation structure into account. These models are considered in Section 3.2. Much statistical innovation in the field of outbreak detection has tended towards time series models of increasing complexity, including Bayesian and hidden Markov models (HMMs). The latter are discussed in Section 3.3.

3.1. Trend and seasonal estimation

With outbreak surveillance, the estimation of trend is best accomplished through a relatively simple procedure that is flexible and does not make any great demands on intervention by the operator. For time series, where there is a considerable amount of historical data and where the seasonal pattern is regular, a Serfling model (Serfling, 1963) based on sines and cosines may be used to estimate the trend and seasonal component.

Two common time series methods that are used in surveillance are simple exponential smoothing (e.g. Healy (1983) and Ngo et al. (1996)) and the Holt–Winters procedure (Holt, 1957; Winters, 1960). Simple exponential smoothing assumes that the data have no trend or seasonality. It forms predictions by taking a weighted average of past observations, where the weights decrease exponentially the further they are into the past (see Section 4).
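Simple exponential smoothing can be written as a one-line recursion, sketched below. The smoothing constant `alpha`, the initialization of the level at the first observation and the function name are illustrative assumptions; in practice the constant would be chosen, for example, to minimize past forecast errors.

```python
def exponential_smoothing(y, alpha=0.3):
    """One-step-ahead forecasts from simple exponential smoothing.

    Returns the forecasts for y[1], ..., y[-1] and the forecast for the next,
    as yet unobserved, value. The level starts at the first observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    level = float(y[0])
    forecasts = []
    for obs in y[1:]:
        forecasts.append(level)                        # forecast made before seeing obs
        level = alpha * obs + (1.0 - alpha) * level    # update with the new observation
    return forecasts, level

forecasts, next_forecast = exponential_smoothing([10, 20, 30], alpha=0.5)
```

Unrolling the recursion shows that an observation j steps in the past receives weight alpha × (1 − alpha)^j, which is the exponential decay of weights described above.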

The Holt–Winters technique is a generalization of simple exponential smoothing that allows for local trend and seasonal factors. It is a method that has been used in many surveillance systems and has performed well in many forecasting competitions in comparison with other more complex methods (Chatfield and Yar, 1988). In syndromic surveillance systems it has been used to model out-patient attendance, in comparison with adaptive (using a moving baseline window) and non-adaptive regression models (Burkom et al., 2007), and found to provide a better fit to the data. It has also been used to account for the effects of temporal auto-correlation and spatial correlation when modelling calls to NHS24, the National Health Service telephone health advice and information service for Scotland, and laboratory reports (Wagner, 2010).

3.2. Auto-regressive integrated moving average and integer-valued auto-regressive models

Auto-regressive integrated moving average (ARIMA) models (Box and Jenkins, 1970) have been used for detecting outbreaks of infectious disease (e.g. Choi (1981), Helfenstein (1986), Reis and Mandl (2003) and Watier et al. (1991)). Fitting an ARIMA model requires the time series to be stationary. As much syndromic and laboratory data are likely to have seasonal and trend components it is necessary to remove these components from the original data before estimating the auto-correlation. Furthermore, the statistical testing procedure is based on the normal distribution and this will really only be valid for infections or syndromes which occur frequently. When using outbreak surveillance for relatively rare events, or for more common events in smaller areas, then integer-valued methods may be more appropriate.

The integer-valued auto-regressive (INAR) model (e.g. Weiß (2009)) is based on the convolution operator ‘∘’ (Steutel and van Harn, 1979), where

  • $\alpha \circ x = \sum_{k=1}^{x} y_k$   (4)

and the yk are independent and identically distributed Bernoulli random variables with probability P(yk=1)=α and x is a non-negative discrete random variable. Using equation (4), an INAR model of order p, which is denoted by INAR(p), can be defined as

  • $x_t = \sum_{i=1}^{p} \alpha_i \circ x_{t-i} + \varepsilon_t$   (5)

where ɛt is a (non-negative) random shock. It follows from equation (5) that an INAR(1) model is

  • $x_t = \alpha_1 \circ x_{t-1} + \varepsilon_t$   (6)

Model (6) states that the number of new cases in the interval (t−1,t] is made up of two components—the xt−1-cases transmit the infection independently with probability α1 and, as a consequence, Σ yk new cases arise, and a random number ɛt of new cases are generated via independent sources. Parameter estimates may be obtained by using, for example, the Yule–Walker estimation technique.
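The INAR(1) mechanism can be made concrete with a short simulation. The sketch below assumes Poisson-distributed random shocks with mean `lam`, which is one common choice rather than something fixed by the model; the function names, the seed and the simple Poisson sampler are illustrative. The Yule–Walker step uses the lag-1 sample auto-correlation as the estimate of α1 and, under the Poisson assumption, the sample mean multiplied by (1 − α1) as the estimate of the shock mean.

```python
import math
import random

def thin(alpha, x, rng):
    """Thinning: the number of successes among x independent Bernoulli(alpha) trials."""
    return sum(1 for _ in range(x) if rng.random() < alpha)

def poisson_draw(lam, rng):
    """Poisson variate by Knuth's multiplication method; adequate for small lam."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_inar1(n, alpha, lam, x0=0, seed=1):
    """Simulate n steps of x_t = alpha o x_{t-1} + eps_t with Poisson(lam) shocks."""
    rng = random.Random(seed)
    x, path = x0, []
    for _ in range(n):
        x = thin(alpha, x, rng) + poisson_draw(lam, rng)
        path.append(x)
    return path

def yule_walker_inar1(x):
    """Yule-Walker estimates: lag-1 auto-correlation and implied shock mean."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    alpha_hat = sum(dev[t] * dev[t - 1] for t in range(1, n)) / sum(d * d for d in dev)
    return alpha_hat, mean * (1.0 - alpha_hat)
```

A typical check is to simulate a long path with known parameters, for example `simulate_inar1(5000, 0.5, 2.0)`, and confirm that `yule_walker_inar1` returns values near 0.5 and 2.0.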

When fitting ARIMA and INAR models to meningococcal incidence in the Montreal region of Canada (Cardinal et al., 1999), INAR(5) and AR(5) models were required for data aggregated in 13 4-week periods each year, from 1986 to 1993. There was no evidence of a trend or seasonality. In Scotland, using daily data on calls to NHS24 about vomiting there was evidence of a weekly pattern and a trend (Wagner, 2010). The data were best fitted by an AR(6) model on a seasonally differenced series. ARIMA models were used to describe the daily visits to emergency departments at a hospital in Boston over a period of 10 years from 1992 (Reis and Mandl, 2003) and an ARMA(2,1) model was required for total visits, whereas an ARMA(1,1) model was required for respiratory visits.

An extensive investigation of the use of ARIMA modelling, in comparison with statistical process control methods (see Section 4), has been carried out (Williamson and Weatherby Hudson, 1999). They found that ARIMA modelling was unable to model eight out of 17 syndrome time series because of non-stationarity, partly arising from sparse data. Furthermore each series had to be investigated separately. For the series which were successfully modelled, one-step-ahead forecasts were satisfactory for forecasts up to 3 years in the future though better forecasts were obtained by using continuously updated models.

In a syndromic-based detection approach, Reis and Mandl (2003) analysed healthcare utilization patterns by a two-step time series modelling approach. A trimmed mean seasonal model was used to capture both the yearly and the weekly trends in daily utilization rates. The residuals from the trimmed mean seasonal model were then fitted by an ARIMA model. An AR(6) model combined with a Serfling model with indicator variables for weekends and holidays was used to model attendance at ambulatory care centres for influenza-like symptoms (Miller et al., 2003).

In these illustrations, the traditional ARIMA models require a relatively large number of parameters for the auto-correlation. Furthermore, a model for one syndrome is not easily adapted for use with another syndrome, so the whole process of model identification must be carried out each time, making the process difficult to automate.

With each new observation the parameters of the ARIMA model should be re-estimated. It is not clear how this can be achieved automatically, since model identification often relies on looking at residual plots. A practical approach may be to keep using the same model for a period of time, say 1 month for daily data, and only to refit the model each month. This might be practical for any one series but not for many. Thus it is likely that ARIMA methods might be better suited to the retrospective analysis of time series data, rather than prospective use within an outbreak surveillance system.

Heisterkamp et al. (2006) proposed the use of a so-called hierarchical time series model, which does not require a long time series of data for parameter estimation. The observed counts are assumed to be Poisson distributed, whereas an unobserved process is assumed for the time trend in the expected number of cases. The models for the unobserved process range from a completely stationary process over time to an auto-regressive model of order 3. This provides a flexible model for the trend in a time series. Likelihood ratio tests are used to discriminate between the various models in the hierarchy. It is claimed that the hierarchical time series model can detect outbreaks faster than the Lab-Base exceedance system (Heisterkamp et al., 2006).

3.3. Bayesian and hidden Markov models

Le Strat and Carrat (1999) proposed the use of an HMM (e.g. Cappé et al. (2005)) for monitoring epidemiological data. The basic idea is to segment the time series of disease counts into an epidemic and a non-epidemic phase. In the HMM of Le Strat and Carrat (1999) each observed yt (t=1,…,n) is associated with a latent variable zt ∈ {0,1} that determines the conditional distribution of yt, i.e. yt|zt=k ∼ fk(yt;θk), where k ∈ {0,1}, fk is a prespecified density (e.g. Gaussian or Poisson) and θk are parameters to be estimated. The unobserved state sequence zt (t=1,…,n) is modelled by a two-state homogeneous Markov chain of order 1 with stationary transition probabilities

  • $p_{kl} = P(z_t = l \mid z_{t-1} = k)$,

where k,l ∈ {0,1} denote the two states of zt (1, epidemic; 0, non-epidemic). For example, p01 is the probability of switching from the non-epidemic to the epidemic state. Note that, in this Markov-dependent mixture model, yt is conditionally independent of all the remaining variables, given zt. Le Strat and Carrat (1999) also considered HMMs with more than two hidden states and performed model selection by using the Bayesian information criterion. Parameter estimates were obtained by means of a modified version of the EM algorithm (Dempster et al., 1977). Model extensions to account for time trends and seasonality were proposed by using the cyclic regression function of Serfling (1963). An on-line version of the retrospective approach of Le Strat and Carrat (1999) is implemented in the R package surveillance (Höhle, 2007; Höhle and Mazick, 2010).
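The on-line use of such a two-state model can be illustrated by the forward (filtering) recursion, which updates the probability of the epidemic state as each new count arrives. The following is a minimal sketch only: it assumes Poisson emission densities with known means and known transition probabilities (in practice these would be estimated, e.g. by the EM algorithm), and the function names and numerical values are hypothetical rather than taken from Le Strat and Carrat (1999) or from the surveillance package.

```python
import math


def poisson_pmf(y, mean):
    """Poisson probability mass function, evaluated via the log scale."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))


def filter_epidemic_probability(counts, means, trans, init=(0.5, 0.5)):
    """Forward recursion for a two-state Poisson HMM.

    means[k]    : assumed Poisson mean in state k (0 non-epidemic, 1 epidemic)
    trans[k][l] : assumed P(z_t = l | z_{t-1} = k)
    Returns the list of P(z_t = 1 | y_1, ..., y_t) for t = 1, ..., n.
    """
    probs = []
    prior = list(init)  # P(z_1 = k) before y_1 is seen
    for y in counts:
        joint = [prior[k] * poisson_pmf(y, means[k]) for k in (0, 1)]
        total = joint[0] + joint[1]
        post = [j / total for j in joint]
        probs.append(post[1])
        # One-step-ahead prediction gives the prior for the next time point.
        prior = [post[0] * trans[0][l] + post[1] * trans[1][l] for l in (0, 1)]
    return probs


# Hypothetical weekly counts with a short run of elevated values.
counts = [3, 2, 4, 3, 12, 15, 14, 4, 3]
probs = filter_epidemic_probability(
    counts, means=(3.0, 12.0), trans=[[0.95, 0.05], [0.10, 0.90]]
)
flagged = [t for t, p in enumerate(probs, start=1) if p > 0.5]
```

Flagging a week when the filtered probability exceeds 0.5 is one possible decision rule; the threshold is a design choice that trades timeliness against false alerts.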

Rath et al. (2003) and Madigan (2005) presented further exploration of HMMs for surveillance, the latter incorporating the Bayesian perspective, which requires prior distributions to be specified for model parameters. Markov chain Monte Carlo (MCMC) methods were used for parameter estimation.

In another Bayesian approach, Martínez-Beneito et al. (2008) proposed a Markov switching model (e.g. Cappé et al. (2005)) for prospective surveillance of weekly influenza incidence rates. In a Markov switching model the observed variables depend not only on the hidden state variables but also on the lagged observable variables. This setting makes the Markov switching model more suitable for time series analysis than HMMs. In Martínez-Beneito et al. (2008) the conditional distribution of the first-order difference series, formed by the differences between rates in consecutive weeks, is modelled either as a first-order auto-regressive process or as a Gaussian white noise process, depending on whether the system is in an epidemic or non-epidemic phase. Let $y_{t,j}$ denote the first-order difference that corresponds to the difference between the rates in weeks t+1 and t in season j. Conditionally on the value of zt, and writing ρ for the auto-regressive coefficient and $\sigma_{0,j}^2$ and $\sigma_{1,j}^2$ for the variances in the non-epidemic and epidemic phases of season j,

  • $y_{t,j} \mid z_t = 0 \sim N(0, \sigma_{0,j}^2), \qquad y_{t,j} \mid z_t = 1 \sim N(\rho\, y_{t-1,j}, \sigma_{1,j}^2).$

The advantage of using Markov switching models is that the differenced series is detrended, enabling auto-regressive modelling to be used to analyse the data. The methodology that was described in Martínez-Beneito et al. (2008) is implemented in a Web-based application called FluDetWeb (Conesa et al., 2009), which is used for the early detection of the onset of influenza epidemics. To illustrate this approach, we performed a prospective analysis using influenza-like illness data from the Valencian sentinel network. The data set consists of 11 time series formed by the weekly influenza-like illness incidence rates (per 100 000 inhabitants in the Communitat Valenciana, Spain) during the seasons from 1996–1997 to 2006–2007. The data set can be downloaded via http://www.geeitema.org/doc/meviepi/influenza.html. An influenza season lasts 30 weeks (from the 42nd week of one year to the 19th week of the following year). In Fig. 2, the weekly influenza-like illness incidence rates for the seasons 2005–2006 and 2006–2007 are compared. The black dots in Fig. 2(b) indicate that the posterior probability of being in an epidemic phase exceeded 0.5 in weeks 15, 16 and 17 of the 2006–2007 influenza season. The posterior probabilities were found by means of the application FluDetWeb.

Figure 2.  Comparison of influenza-like illness rates in the Communitat Valenciana, Spain, between (a) the 2005–2006 and (b) the 2006–2007 influenza season: black dots, epidemic phase; other points, non-epidemic phase

Conesa et al. (2010) introduced a framework of models with the idea of using them on any kind of surveillance data. In particular, the process of the observed cases is modelled via a Bayesian hierarchical Poisson model in which the intensity parameter is a function of the incidence rate. Various options for modelling the mean of the rates were described, including the option of modelling the mean at each phase as auto-regressive processes of order 0, 1 and 2 (David Conesa, 2010, personal communication).

Lu et al. (2010) developed a Markov switching model with jumps to handle the effect that is caused by past outbreaks. In addition to the hidden state variable that models the disease outbreak state, this model utilizes two further hidden state variables in each period. The second hidden state variable models the presence of extreme values and, if an extreme value exists, the third hidden state variable represents the size of the extreme value. This is done to absorb any effect caused by sporadic extreme values in the training data.

In Cowling et al. (2006), dynamic linear models (West and Harrison, 1997) were used as an approach to the early detection of the onset of the influenza epidemic period that requires only a few weeks of baseline data.

For retrospective detection, Held et al. (2006b) proposed a two-component model where the counts are viewed as the sum of a ‘parameter-driven’ (endemic) and an ‘observation-driven’ (epidemic) component. The model for the parameter-driven component is a Poisson or negative binomial regression model. The observation-driven component is modelled with an auto-regressive parameter. Model estimates are obtained through Bayesian inference via MCMC techniques. The retrospective approach of Held et al. (2006b) can be used in a prospective setting.

The above models have clear similarities in that they are designed to monitor a system that can exist in two states, where the time of the change from one state to another is unknown. The models can be generalized to three or more states but, in terms of their utility for outbreak surveillance, the need for more than two states is questionable, as the main aim is to detect a change. The main differences between the models are in their methods of modelling trend, seasonality and auto-correlation. For most of these models the simplest method of estimating the parameters is through Bayesian MCMC sampling. Although this is feasible for one infection or syndrome, the computation time may present difficulties for surveillance with many end points where the models require daily updating.

4. Methods inspired by statistical process control


The methods of statistical process control (e.g. Montgomery (2009) and Oakland (2008)) have a long history of application to problems in public health surveillance (Woodall, 2006). Several proposed approaches for the on-line detection of outbreaks of infectious diseases are directly inspired by, or related to, methods of statistical process control. This is not surprising because the problem of detecting unusual clusters of diseases in epidemiological data prospectively is similar to that of detecting aberrations in industrial production processes as they arise. The main tools for tracking the characteristics of a production process over time are control charts. These are discussed in Section 4.1. In Sections 4.2 and 4.3, further methods are considered which share a flavour of the statistical process control methodology, namely temporal scan statistics and methods based on the time to failure.

4.1. Control charts

The first control chart was proposed by Shewhart (1931) (see Section 2). The Shewhart chart utilizes information about only the last time point. Later, Page (1954) and Roberts (1959) derived control charts with memory: the cumulative sum (CUSUM) and exponentially weighted moving average (EWMA) control chart respectively. To start with the former, let {yt,t=1,2,…} denote the time series of the counts being monitored. Assuming that $y_t \sim N(\mu_t, \sigma_t^2)$, the one-sided (standardized) Gaussian CUSUM at time t is defined iteratively by

  • $C_t = \max\{0,\; C_{t-1} + (y_t - \mu_t)/\sigma_t - k\}$ (7)

where C0=0 and k>0 is a constant that depends on the size of aberration of interest. It is often chosen to be $k = 1/2$ (Rogerson and Yamada, 2004a). The baselines μt can be calculated from counts in comparable periods in previous years. These counts are also used to estimate the standard deviation σt. In the absence of any systematic departure from the expected values μt, the CUSUM (7) tends to remain at or close to 0. If Ct>h, where h is a specified threshold value, the process is declared to be out of control. Usually, the CUSUM is then reset to 0 and the process starts again. There are many variants of this basic procedure, though. For example, one system restarts the process with the CUSUM set to half the alerting threshold, to increase sensitivity to early signals (Lucas and Crosier, 2000). Methods based on the CUSUM formula are implemented in the early aberration reporting system of the Centers for Disease Control and Prevention, which is used throughout the USA as a syndromic surveillance system (Hutwagner et al., 2003).
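The recursion and reset rule described above can be sketched as follows. This is an illustrative implementation under stated assumptions: the baselines and standard deviations are taken as given, the CUSUM is reset to 0 after each alarm, and the function name, the default threshold h=4 and the example data are hypothetical.

```python
def gaussian_cusum(counts, baselines, sds, k=0.5, h=4.0, reset=True):
    """One-sided standardized Gaussian CUSUM.

    Returns (values, alarms): the CUSUM value recorded at each time point
    (before any reset) and the 1-based time points at which C_t > h.
    """
    c = 0.0  # C_0 = 0
    values, alarms = [], []
    for t, (y, mu, sd) in enumerate(zip(counts, baselines, sds), start=1):
        c = max(0.0, c + (y - mu) / sd - k)
        values.append(c)
        if c > h:
            alarms.append(t)
            if reset:
                c = 0.0  # restart the chart after an alarm
    return values, alarms


# Hypothetical weekly counts with a constant baseline of 10 and sd of 2.
counts = [10, 11, 9, 10, 14, 15, 16, 10]
values, alarms = gaussian_cusum(counts, [10] * 8, [2] * 8, k=0.5, h=4.0)
```

Replacing the reset value 0.0 by h / 2 would give a restart at half the alerting threshold, in the spirit of the variant mentioned in the text.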

Fig. 3(b) shows the CUSUM for Salmonella enteritidis in England, Wales and Northern Ireland in the year 2009. An outbreak from the 28th week onwards is detected by the CUSUM (7). The values in Fig. 3(a) are the weekly counts from the previous years 2000–2008 which were used to calculate μt and σt. The threshold h was chosen on the basis of a predetermined acceptable value for the in-control average run length ARL0, i.e. the average time between alerts when there is no outbreak. The reciprocal of ARL0 approximates the false alarm rate, i.e. the proportion of reporting periods in which an alert is raised although no outbreak is in progress. Tables that can be used to find the value of h that is associated with chosen values of ARL0 and k are available (see for instance Rogerson (2001)).
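When tables are not to hand, the relationship between h, k and ARL0 can also be explored by simulation. The sketch below estimates ARL0 for the standardized one-sided CUSUM by Monte Carlo sampling, assuming independent standard normal in-control observations; the function names, the number of runs and the seed are illustrative choices, and the estimate is subject to simulation error.

```python
import random


def simulate_run_length(k, h, rng, max_steps=100_000):
    """Time of the first false alarm for in-control N(0, 1) observations."""
    c = 0.0
    for t in range(1, max_steps + 1):
        c = max(0.0, c + rng.gauss(0.0, 1.0) - k)
        if c > h:
            return t
    return max_steps  # truncated run; only relevant for very large h


def estimate_arl0(k, h, n_runs=2000, seed=1):
    """Monte Carlo estimate of the in-control average run length."""
    rng = random.Random(seed)
    total = sum(simulate_run_length(k, h, rng) for _ in range(n_runs))
    return total / n_runs


arl_low = estimate_arl0(k=0.5, h=2.0, n_runs=500)
arl_high = estimate_arl0(k=0.5, h=4.0, n_runs=500)
```

Raising h lengthens the average time between false alerts, at the cost of slower detection of true outbreaks; a threshold can be found by repeating the estimate over a grid of h values until the target ARL0 is bracketed.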

Figure 3.  (a) Weekly counts of Salmonella enteritidis phage types 1, 4, 6, 6A, 8, 14B and 21 in England, Wales and Northern Ireland from 2000 to 2008 and (b) CUSUM of weekly counts in the year 2009 (k=0.5 and h=30; the threshold is marked on the chart)

In the case of rare events, the CUSUM approach (7) is not adequate, since the counts do not have a normal distribution. One remedy is to use the Poisson CUSUM (Lucas, 1985). Other methods that are used in disease surveillance to detect an increase in the mean of a Poisson distribution include, for example, the short memory scheme of Shore and Quade (1989), which is based on the distribution of cases in the current and previous periods. Kenett and Pollak (1996) used the Shiryaev–Roberts statistic (Shiryaev, 1963; Roberts, 1966) and applied it to a non-homogeneous Poisson process. Whereas Gaussian or Poisson CUSUMs are designed to analyse counted data, binomial CUSUMs (e.g. Reynolds and Stoumbos (2000)) can be used to monitor proportions.
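For illustration, a Poisson CUSUM can be written in the form C_t = max(0, C_{t-1} + y_t - k), where the reference value k = (μ1 - μ0)/(log μ1 - log μ0) lies between the in-control mean μ0 and the out-of-control mean μ1 that the chart is tuned to detect. The sketch below assumes constant, known means and uses the same alarm rule C_t > h as in the Gaussian case; the function names and data are hypothetical.

```python
import math


def poisson_cusum_reference(mu0, mu1):
    """Reference value k for detecting a shift in Poisson mean from mu0 to mu1."""
    return (mu1 - mu0) / (math.log(mu1) - math.log(mu0))


def poisson_cusum(counts, mu0, mu1, h):
    """Poisson CUSUM on raw counts; returns (values, alarms), reset after alarm."""
    k = poisson_cusum_reference(mu0, mu1)
    c = 0.0
    values, alarms = [], []
    for t, y in enumerate(counts, start=1):
        c = max(0.0, c + y - k)
        values.append(c)
        if c > h:
            alarms.append(t)
            c = 0.0
    return values, alarms


# Hypothetical rare-event series: about one report per week in control.
pc_values, pc_alarms = poisson_cusum([0, 1, 0, 3, 2, 4, 0], mu0=1.0, mu1=2.0, h=3.0)
```

Because the counts enter unstandardized, no normal approximation is needed, which is the attraction of this chart for rare events.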

Because CUSUMs are sensitive to small sustained changes in the mean numbers of reports, they are well suited to detecting relatively long lasting epidemics, such as influenza. However, for the same reason, they are sensitive to small changes in reporting efficiency and other artefacts of the reporting process. Thus, they may lack robustness when used with surveillance data unless the baselines are frequently reset.

The EWMA control chart gives less weight to more historical data. The EWMA is defined by the recursive equation

  • $z_t = (1-\gamma)\, z_{t-1} + \gamma\, y_t$ (8)

where z0=0 and the weight parameter γ ∈ (0,1]. The weighting for each older data point decreases exponentially, giving much more importance to recent observations, while not discarding older observations entirely. For γ=1, equation (8) is the same as the method by Shewhart (1931). The asymptotic (one-sided) variant of the EWMA chart will give an alarm at

  • $t_A = \min\{t : z_t > L\sigma_z\}$ (9)

where L>0 is a constant and σz is the asymptotic standard deviation of zt (Sonesson, 2003). Alternatively, one can use the exact standard deviation (which is increasing in time) instead of the asymptotic standard deviation in the alarm limit (9). For the EWMA chart, Elbert and Burkom (2009) and Burkom et al. (2007) proposed the Holt–Winters technique for generalized exponential smoothing (see Section 3.1) to account for trends and seasonal features in syndromic data. Dong et al. (2008) constructed three types of EWMA methods that do not require an assumption of identical distributions of the counts to detect a positive shift in the rate of incidence. Adaptations of the EWMA method for Poisson and binomial data are available (Borror et al., 1998; Gan, 1991). Using the exponential smoothing technique and properties of numerical derivatives, Nobre and Stroup (1994) developed a method which bases monitoring on changes in the numerical gradient of the variable under surveillance with respect to time.

Höhle and Paul (2008) presented count data regression charts, which accommodate seasonal variation in the mean of the infectious disease counts. Assume that the observed counts originate from a negative binomial distribution parameterized by its mean μ and dispersion parameter θ. For θ→0 the Poisson distribution with mean μ is obtained. For the in-control situation yt∼NegBin(μ0,t,θ), where

  • $\log(\mu_{0,t}) = \beta_0 + \beta_1 t + c(t)$ (10)

and c(t) is a cyclic function that may be modelled, for example, by trigonometric terms (Serfling, 1963), i.e. the in-control mean is assumed to be time varying and linear on the log-scale. The out-of-control situation is characterized by a multiplicative shift μ1,t=μ0,t exp(κ) with κ≥0, which corresponds to an additive increase of the mean on the log-scale. It is assumed that the in-control parameters are known, whereas κ is unknown and is estimated via maximum likelihood. A generalized likelihood ratio statistic is computed to detect, on line, whether a shift in the intercept occurred. Extensions of the basic seasonal count data regression chart are available that take account of auto-correlation between observations (Höhle and Paul, 2008) or the population size of the age strata (Höhle and Mazick, 2010). Other modified CUSUM methods that allow for time varying Poisson means were proposed by Rossi et al. (1999) and Rogerson and Yamada (2004b).
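Returning to the basic EWMA chart of equations (8) and (9), a minimal sketch may help to fix ideas. It assumes that the monitored values have already been standardized so that, in control, they are independent with mean 0 and variance 1; under that assumption the asymptotic variance of zt is γ/(2−γ). The function name and the example values of γ and L are hypothetical.

```python
import math


def ewma_chart(standardized, gamma=0.3, L=3.0):
    """One-sided EWMA chart with the asymptotic alarm limit.

    standardized : in-control observations assumed i.i.d. with mean 0, sd 1
    Returns (values, first_alarm), where first_alarm is the first 1-based
    time point with z_t > L * sigma_z, or None if there is no alarm.
    """
    sigma_z = math.sqrt(gamma / (2.0 - gamma))  # asymptotic sd of z_t
    z = 0.0  # z_0 = 0
    values, first_alarm = [], None
    for t, y in enumerate(standardized, start=1):
        z = (1.0 - gamma) * z + gamma * y
        values.append(z)
        if first_alarm is None and z > L * sigma_z:
            first_alarm = t
    return values, first_alarm


ewma_values, ewma_alarm = ewma_chart([0, 0, 2, 2, 2], gamma=0.5, L=2.0)
```

With γ=1 the smoothed value equals the latest observation, so the chart reduces to a Shewhart-type rule; smaller values of γ give more weight to history and more sensitivity to small sustained shifts.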

The use of control charts has also been widely advocated for the surveillance of healthcare-associated infections (Benneyan, 1998a,b; Woodall, 2006; Carey, 2003; Limaye et al., 2008). In this context, CUSUM charts are more frequently useful than EWMA charts (Woodall, 2006), but Shewhart charts appear to be the charts that have found greatest application, often being used to show the proportion of incidents in fixed periods of time. For example, they have been used in this way to monitor anaesthesia-related adverse events (Fasting and Gisvold, 2003) and risk-adjusted mortality rates of patients in hospital following admission for acute myocardial infarction (Coory et al., 2008). Morton et al. (2001) considered the application of Shewhart, CUSUM and EWMA charts for continuous realtime monitoring of various hospital acquired infections, such as vascular surgical site infection and Klebsiella pneumoniae. It was concluded that Shewhart and EWMA charts are together ideal for monitoring bacteraemia and multiresistant organism rates, whereas Shewhart and CUSUM charts together are suitable for surgical infection surveillance.

4.2. Temporal scan statistics

Scan statistics (e.g. Glaz et al. (2001)) can be used to detect and evaluate clusters of disease cases in either a purely temporal, purely spatial or space–time setting (Woodall et al., 2008). In a temporal setting, this is usually done by gradually scanning a window across time, noting the number of observed and expected observations inside the interval. The scan statistic has long been used for retrospective detection of temporal clusters in epidemiology (Wallenstein, 1980). Kulldorff (2001), Ismail et al. (2003) and Naus and Wallenstein (2006) adapted the scan statistic for use in prospective temporal surveillance.

There are two general types of prospective temporal scan-based methods. One type involves counting the number of incidences in a single region in the most recent time period (or window) of a fixed length (Ismail et al., 2003; Naus and Wallenstein, 2006). Let yn denote the observation at the current time point n and let L be the fixed window size. The scan statistic can be viewed as an unweighted moving sum (Han et al., 2010; Joner et al., 2008):

  • $S_n = \sum_{i=n-L+1}^{n} y_i$. (11)

An alert is flagged as soon as equation (11) exceeds a threshold h, i.e. the first time that Sn>h, where h is typically chosen in conjunction with an acceptable value of ARL0, although choosing h so that the type I error is a predetermined value α has also been suggested (Naus and Wallenstein, 2006).
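The fixed-window scheme amounts to a running sum that is compared with h at each new time point. The sketch below is illustrative only: the function name and data are hypothetical, and alarms are only considered once a full window of L observations is available, which is an assumption about how the start-up period is handled.

```python
def moving_sum_scan(counts, window, h):
    """Unweighted moving sum over the most recent `window` observations.

    Returns (sums, first_alarm): the moving sum at each time point (a partial
    sum while fewer than `window` observations exist) and the first 1-based
    time point n with S_n > h, or None.
    """
    sums, first_alarm = [], None
    running = 0
    for n, y in enumerate(counts, start=1):
        running += y
        if n > window:
            running -= counts[n - window - 1]  # drop the value leaving the window
        sums.append(running)
        if first_alarm is None and n >= window and running > h:
            first_alarm = n
    return sums, first_alarm


scan_sums, scan_alarm = moving_sum_scan([1, 0, 2, 1, 4, 5, 0], window=3, h=7)
```

Updating the sum incrementally keeps the cost per new observation constant, which matters when many series are monitored in parallel.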

In the prospective temporal scan method of Kulldorff (2001), the length of the window is not a constant but varies over a range of values (see also Wallenstein and Naus (2004)). Since the temporal scan statistic by Kulldorff (2001) can be viewed as a special case of his spatiotemporal procedure, a discussion of this method is deferred till Section 5. Public health surveillance data are often non-stationary with seasonal and other effects that are seldom found in industrial process control data. Wallenstein and Naus (2004) proposed a temporal scan method that can account for seasonal effects.

4.3. Methods based on interevent times

Methods which base detection on total reports will fail when events are very rare, because even a single report will then be unusual in a statistical sense. In such cases, one might either specify a minimum size of outbreak that must be exceeded for the count to qualify as an aberration (Farrington et al., 1996), impose a lower bound on the standard error used to normalize residuals or alternatively use the ‘sets monitoring technique’ (Chen, 1978), which bases detection on the time intervals between reports (see Farrington and Andrews (2004) for a brief review of this methodology). Sego et al. (2008) proposed the Bernoulli CUSUM chart for the surveillance of rare health events instead.

Other techniques based on the time to failure have been proposed, such as time between event (exponential) CUSUM or EWMA schemes (e.g. Gan (1994, 1998)). Exponential control charts arise naturally in the context of monitoring the rate of occurrence of rare events, since interevent times for a homogeneous Poisson process are exponentially distributed random variables.
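As an illustration of monitoring interevent times, the sketch below cumulates the log-likelihood ratio of an exponential distribution with rate λ1 against one with rate λ0, where λ1>λ0 represents an increased rate of occurrence (shorter gaps between reports). This is a generic log-likelihood-ratio CUSUM and is not necessarily parameterized in the same way as the schemes of Gan (1994, 1998); the rates, the threshold and the function name are hypothetical.

```python
import math


def exponential_cusum(gaps, rate0, rate1, h):
    """CUSUM on interevent times for an increase in event rate.

    gaps  : times between successive reports
    rate0 : assumed in-control event rate
    rate1 : out-of-control rate to be detected (rate1 > rate0)
    Returns (values, first_alarm) with first_alarm the 1-based index of the
    first gap at which the CUSUM exceeds h, or None.
    """
    if not rate1 > rate0 > 0:
        raise ValueError("require rate1 > rate0 > 0")
    c = 0.0
    values, first_alarm = [], None
    for i, x in enumerate(gaps, start=1):
        # log of the density ratio f1(x) / f0(x) for exponential densities
        llr = math.log(rate1 / rate0) - (rate1 - rate0) * x
        c = max(0.0, c + llr)
        values.append(c)
        if first_alarm is None and c > h:
            first_alarm = i
    return values, first_alarm


gap_values, gap_alarm = exponential_cusum(
    [1.5, 2.0, 0.1, 0.05, 0.1, 0.2, 0.1], rate0=1.0, rate1=2.0, h=2.0
)
```

Each short gap adds a positive increment and each long gap a negative one, so the chart is updated whenever a report arrives rather than at fixed reporting periods.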

5. Methods incorporating spatial information


5.1. General

As well as giving a date, in almost any surveillance system the reported incidence of a disease will specify a location. Using the spatial information that is given by the location can potentially enable localized outbreaks of a disease to be detected, or variations in regional patterns to be identified. To use spatial data, surveillance methods must have some notion of the distance between observations or some spatial structure. Several methods require only a cut-off value that categorizes pairs of observations as either being ‘close’ or ‘not close’ (e.g. Rogerson (2001) and Kulldorff (2001)). In others, an appropriate adjacency matrix or distance metric is defined. Suppose that there are m geographical units and let S be an m×m symmetric matrix of values skl representing the closeness of geographical units k and l, with 0 ≤ skl ≤ 1. For example, Rogerson (1997) used the metric

  • $s_{kl} = \exp(-d_{kl}/\tau)$ (12)

where dkl is the geographic distance between reporting units k and l and τ is a specified constant. Tango (1995) suggested setting τ in equation (12) equal to 5 but results are fairly insensitive to its precise value (Bithell, 1992). When the locations of observations are known individually, spatial structure has been imposed by using kernel estimation to fit a smooth surface that represents the intensity of reported cases (Diggle et al., 2005). Both fixed and variable kernel bandwidths have been used, but it seems sensible to have bandwidths that are narrower in large towns than in sparsely populated country areas. When a region is divided into subareas, spatial correlation can also be induced by defining a relationship between neighbouring areas. For example, if uk denotes the ‘location effect’ of area k, then a common choice for modelling spatial correlation is to assume that uk follows the conditional auto-regressive model that was proposed by Besag et al. (1991). This model states that each uk is normally distributed around the mean value of u among its immediate neighbours, i.e. $u_k \mid u_{-k} \sim N(\bar{u}_k, \sigma_k^2)$, where

  • $\bar{u}_k = \frac{1}{n_k} \sum_{l \in \delta_k} u_l$

and

  • $\sigma_k^2 = \sigma_u^2 / n_k$,

with δk denoting the set of immediate neighbours of area k, nk the number of those neighbours and σu² a common variance parameter.

In principle, it is suboptimal to ignore spatial information in disease surveillance. Indeed, it may seem suboptimal to adopt any surveillance system that does not model the space–time patterns of diseases as completely as possible when such information is available and reliable. However, the use of spatial information is computationally demanding, and it may not be practical for surveillance systems that monitor several hundred diseases and disease organisms—computer limitations will restrict the complexity of calculations that can be performed for each disease. Early approaches to the detection of localized outbreaks of disease focused on only detecting outbreaks, enabling the methods to be comparatively simple. They include methods based on CUSUMs and the scan statistic. These methods are described in Sections 5.2 and 5.3 respectively. We then describe methods based on regression models in Section 5.4. These methods have the more ambitious aim of fitting space–time models to occurrences of disease, typically by using the Bayesian paradigm so that MCMC sampling can be used as a tool to estimate model parameters. The resulting methods are computationally very intensive. However, there are examples where methods that require MCMC sampling have been applied in the routine surveillance of some specific diseases (e.g. Diggle et al. (2005)), though strategies to reduce computation were necessary.

5.2. Spatial cumulative sum charts

CUSUM charts are a standard approach for detecting outbreaks and changes in spatial pattern. CUSUMs cumulate information from each case. In the spatial context, the information that is cumulated typically involves spatiotemporal proximity to other points. To isolate the specific contribution of the latest observation, conditioning on previous information is required. Thus the formula for the CUSUM given in equation (7) must be modified. The modified formula is

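  • $C_t = \max\left\{0,\; C_{t-1} + \frac{y_t - E(y_t \mid \nu_{t-1})}{\sqrt{\mathrm{var}(y_t \mid \nu_{t-1})}} - k\right\}$ (13)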

where Ct is the CUSUM at time t, k is a specified constant and E(yt|νt−1) and var(yt|νt−1) are the expectation and variance of yt conditional on relevant information νt−1 from the first t−1 reporting periods respectively. An alarm is flagged if the CUSUM exceeds a threshold. Statistics that were developed for the retrospective analysis of spatial disease data have been adapted to give the yt. The Knox test (Knox, 1964) aims to identify space–time interactions by categorizing any two cases as close in space, close in time, close in both or close in neither. The number of observations that are close in both space and time is referred to as the Knox statistic and a space–time interaction is indicated if it is unusually large. Rogerson (2001) adapted the Knox statistic for use in surveillance by equating its value after t cases to yt in the CUSUM in equation (13). He showed that νt−1 may be replaced by yt−1 (i.e. yt−1 contains all the relevant information in the first t−1 observations). He also gave formulae for the conditional means and variances that are needed to form the CUSUM. However, empirical testing by Marshall et al. (2007) suggests that these estimates of means and variances can be poor and that computer simulation should be used to estimate both them and the threshold at which an alarm is triggered.
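The Knox statistic itself is simple to compute, as the following sketch shows. It assumes that each case is recorded as planar coordinates and a time, that Euclidean distance is appropriate and that the two cut-off values are supplied by the user; all names and data are hypothetical. The conditional means and variances needed for the CUSUM (13) are not computed here and, following the remarks above, would probably be obtained by simulation.

```python
import math
from itertools import combinations


def knox_statistic(cases, space_cutoff, time_cutoff):
    """Number of case pairs that are close in both space and time.

    cases : list of (x, y, t) tuples
    """
    n_close = 0
    for (x1, y1, t1), (x2, y2, t2) in combinations(cases, 2):
        close_in_space = math.hypot(x1 - x2, y1 - y2) <= space_cutoff
        close_in_time = abs(t1 - t2) <= time_cutoff
        if close_in_space and close_in_time:
            n_close += 1
    return n_close


def knox_running_values(cases, space_cutoff, time_cutoff):
    """Knox statistic after each successive case (cases in order of report)."""
    return [
        knox_statistic(cases[:i], space_cutoff, time_cutoff)
        for i in range(1, len(cases) + 1)
    ]


cases = [(0, 0, 1), (0.5, 0, 2), (10, 10, 2), (0.2, 0.1, 3), (10.3, 10, 9)]
running = knox_running_values(cases, space_cutoff=1.0, time_cutoff=2)
```

The running values play the role of yt in equation (13). Recomputing the statistic from scratch, as done here for clarity, is quadratic in the number of cases; an operational system would update it incrementally.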

Rogerson (1997) adapted Tango's statistic (Tango, 1995) in a way that was similar to the method used with the Knox statistic. Tango's statistic for spatial clustering has the form

  • $C_G = (r - p)^{\top} S\, (r - p)$ (14)

where r and p are m×1 vectors containing the observed and expected proportions of cases at each location respectively, and S has the meaning that was assigned in Section 5.1. The statistic (14) was designed as a retrospective test of spatial clustering and is completely insensitive to a global change in the rate of occurrence of disease. In part this is a disadvantage, but it also means that the statistic remains valid when there is seasonality and/or annual trend.
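The quadratic form and its insensitivity to a global change in rate can be checked numerically. The sketch below assumes the exponential closeness measure skl = exp(−dkl/τ) with Euclidean distances between hypothetical unit centroids; the function names and the data are illustrative only.

```python
import math


def closeness_matrix(coords, tau):
    """Closeness s_kl = exp(-d_kl / tau) from planar centroid coordinates."""
    return [
        [math.exp(-math.hypot(xk - xl, yk - yl) / tau) for (xl, yl) in coords]
        for (xk, yk) in coords
    ]


def tango_statistic(observed, expected, closeness):
    """Quadratic form (r - p)' S (r - p) on observed and expected proportions."""
    n_obs, n_exp = sum(observed), sum(expected)
    diff = [o / n_obs - e / n_exp for o, e in zip(observed, expected)]
    m = len(diff)
    return sum(
        diff[k] * closeness[k][l] * diff[l] for k in range(m) for l in range(m)
    )


S = closeness_matrix([(0, 0), (1, 0), (5, 0)], tau=2.0)
expected = [10, 10, 10]
no_cluster = tango_statistic([10, 10, 10], expected, S)
clustered = tango_statistic([20, 20, 5], expected, S)
doubled = tango_statistic([40, 40, 10], expected, S)  # same pattern, twice the cases
```

Because only proportions enter the statistic, doubling every count leaves it unchanged, which is the property noted in the text; an excess concentrated in neighbouring units increases it.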

Raubertas (1989) gave a method that forms neighbourhoods in a way that allows a reporting unit to belong to more than one neighbourhood. The data within a neighbourhood are pooled in weighted averages, using a measure of closeness as weights. For each neighbourhood a CUSUM is formed that monitors the rate of incidence and an alarm is triggered if any CUSUM exceeds a threshold. This early method continues to influence surveillance methods in varied contexts (e.g. Sparks (2010)).

ClusterSeer (Jacquez et al., 2002) and GeoSurveillance (Yamada et al., 2009) are software packages that implement surveillance methods based on spatial CUSUMs. The GeoSurveillance package was used to perform spatiotemporal surveillance of the weekly counts of all Salmonella enteritidis cases in England, Wales and Northern Ireland in 2009. The geographical units are 12 regions defined by the Health Protection Agency, namely ‘North East’, ‘Yorkshire & Humberside’, ‘East Midlands’, ‘East’, ‘London’, ‘South East’, ‘South West’, ‘West Midlands’, ‘North West’, ‘Channel Islands & Isle of Man’, ‘Wales’ and ‘Northern Ireland’. Fig. 4 displays the results for four of these regions and gives evidence that spatial heterogeneity is inherent in the data (see also Fig. 3 in which data aggregated over the 12 geographical areas are presented).


Figure 4.  CUSUMs of weekly counts of Salmonella enteritidis phage types 1, 4, 6, 6A, 8, 14B and 21 for four regions in England, Wales and Northern Ireland in the year 2009: ∘, London; ▵, East; +, Northern Ireland; ×, Wales


5.3. Space–time scan statistics

For a temporal scan, a search window is gradually moved across time, looking for a window in which the number of observed cases is unexpectedly high (see Section 4.2). Similarly, for a spatial scan, a circular window is moved over a map of the study area, looking for a position where the circle contains an unexpectedly high number of cases. For a space–time scan the two are combined to form a cylindrical ‘window’ whose height is the time dimension. This form of search window has been used for the retrospective detection of local disease outbreaks since at least Wallenstein et al. (1989). Kulldorff (2001) adapted the method for prospective surveillance.

The temporal component of the scan windows that was used by Kulldorff (2001) has a varying start time but reaches to the end of the current monitoring period, as the aim is to detect local outbreaks of disease that are currently active. The spatial components of the windows have varying centres and radii. A search finds the scan window that has the most unexpectedly high number of cases, judged by the likelihood ratio

  • $\max_z \{L(z)/L_0\}$ (15)

where L(z) is the maximum likelihood for cylinder z if the scan window has its own rate of occurrence and L0 is the maximum likelihood if the scan window has the underlying base rate as its rate of occurrence. The likelihoods in equation (15) are generally based on the assumption that the number of cases follows a Poisson distribution. The maximum likelihood ratio is defined to be the space–time scan statistic and a p-value for its significance is obtained through Monte Carlo hypothesis testing.
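Under the Poisson assumption, and conditioning on the total number of cases N, the log-likelihood ratio for a window with c observed and e expected cases takes the familiar form c log(c/e) + (N−c) log{(N−c)/(N−e)} when c>e, and 0 otherwise. The sketch below evaluates this over a small, user-supplied set of candidate zones and over window lengths ending at the current period. It is a simplified illustration: zones are given as explicit tuples of area indices rather than generated from circles, the expected counts are assumed to be scaled to sum to N, and the Monte Carlo step that yields the p-value is omitted. All names and data are hypothetical.

```python
import math


def poisson_llr(cases_in, expected_in, cases_total):
    """Poisson log-likelihood ratio for one window (elevated rates only)."""
    c, e, n = cases_in, expected_in, cases_total
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if n > c:
        llr += (n - c) * math.log((n - c) / (n - e))
    return llr


def best_cylinder(counts, expected, zones, max_length):
    """Search cylinders that end at the last period.

    counts[t][k], expected[t][k] : cases in period t and area k
    zones : candidate spatial zones, each a tuple of area indices
    Returns (llr, zone, length) for the cylinder with the largest ratio.
    """
    n_periods = len(counts)
    total = sum(sum(row) for row in counts)
    best = (0.0, None, None)
    for zone in zones:
        for length in range(1, min(max_length, n_periods) + 1):
            periods = range(n_periods - length, n_periods)
            c = sum(counts[t][k] for t in periods for k in zone)
            e = sum(expected[t][k] for t in periods for k in zone)
            llr = poisson_llr(c, e, total)
            if llr > best[0]:
                best = (llr, zone, length)
    return best


counts = [[2, 2, 2], [2, 2, 2], [2, 8, 2]]
cell = sum(sum(row) for row in counts) / 9.0  # uniform expectation per cell
expected = [[cell] * 3 for _ in range(3)]
zones = [(0,), (1,), (2,), (0, 1), (1, 2)]
llr, zone, length = best_cylinder(counts, expected, zones, max_length=3)
```

To attach a p-value, the same search would be repeated on data sets simulated under the null hypothesis and the observed maximum ranked among the simulated maxima.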

Various methods aimed at improving the scan statistic have been suggested. Kleinman et al. (2005) focused on the baseline rates of incidence of disease in the population at risk, which are used in calculating the scan statistic. They proposed one estimate of these rates that incorporates an adjustment for the day of the week, month and holidays, and a second estimate that includes an additional adjustment for local history of illness. Kulldorff et al. (2005) suggested a scan statistic that does not require the size of the population at risk to be estimated and only needs data on cases. The scan statistic is a log-likelihood ratio, which is very commonly based on the Poisson assumption, and the p-value is an empirical probability of a statistic as large as that observed on the basis of Monte Carlo repetitions. Its assumptions are similar to those made with the Knox statistic. Assunção and Correa (2009) adapted the Shiryaev–Roberts statistic (Shiryaev, 1963; Roberts, 1966) as a scan statistic, modelling a disease outbreak as a change point in a cylindrical scan window. They improved the speed of computation of their method through a formula for updating estimates when a case occurs, rather than having to calculate estimates from scratch.

Sonesson (2007) fitted the space–time scan statistics of Kulldorff (2001) and Kulldorff et al. (2005) into a CUSUM framework and examined properties of the resultant methods. Spatial scan statistics (without a temporal component) have also been used for disease surveillance by considering case incidence in a short fixed time period, such as the previous 7 days (Mostashari et al., 2003). In a different direction, using ellipses (rather than circles) for the spatial component of a space–time scan window has been suggested (Kulldorff et al., 2006). Also, non-parametric spatial scan windows to detect irregularly shaped clusters have been explored (Assunção et al., 2006; Duczmal and Assunção, 2004; Takahashi et al., 2008; Tango and Takahashi, 2005). However, using such windows significantly increases computer time.

The space–time scan statistic is currently the most widely used method for detecting the emergence of localized clusters of disease (Shmueli and Burkom, 2010). SaTScan (Kulldorff, 2010) and ClusterSeer (Jacquez et al., 2002) are software packages that implement scan statistic methods. SaTScan is used by various American federal, state and city agencies for retrospective and prospective cluster detection (Zhang and Lin, 2009), including the New York City Department of Health, who use it for syndromic surveillance. SaTScan is versatile in its definition of a spatial location. To monitor infectious disease outbreaks in hospitals, for example, a ‘spatial’ location has consisted of individual wards and services (such as medicine or oncology) or groups of wards or services sharing in patient care (such as cardiology and cardiac surgery services), regardless of physical proximity (Huang et al., 2010).

5.4. Spatiotemporal regression methods

A variety of spatiotemporal regression methods have been proposed. An important distinction between them is whether they analyse aggregated data or data at an individual level. Often the region of interest is broken into small areas and the aggregated number of cases in each area in each time period forms the response. This form of area level data has usually been modelled as a discrete spatial model in which neighbourhood relationships between the areas are defined. The second form of data can arise with sparse data, when the location of each case is recorded at an individual level. These individual level data have been modelled as a Cox point process (Cox, 1955).

For an area level model, let ykt denote the number of cases in area k in time period t. Sometimes the model specifies that ykt follows a binomial distribution (e.g. Diggle et al. (2004) and Kleinman et al. (2004)) but, more commonly, it is assumed that ykt follows a Poisson distribution. Interest then centres on estimating the Poisson mean (Lawson et al., 2003; Vidal Rodeiro and Lawson, 2006; Watkins et al., 2009; Zhou and Lawson, 2008), which varies with k and t. Lawson and his co-workers generally assume that E(ykt)=ektθkt, where θkt is the unknown true relative risk in area k and time period t and ekt is the expected number of cases in the kth area in that period. The values of ekt must be specified before the model is fitted or else there is an identifiability problem. The ekt could be based on the ‘at-risk’ population demographics in each area, perhaps with an adjustment for seasonality or trend. Information about the ekt might also be gleaned from monitoring a different disease that has a similar at-risk population structure to that of the disease of interest (Lawson, 2005). The logarithm of the relative risk is decomposed into spatiotemporal components:

  • $$\log(\theta_{kt}) = u_k + v_k + \tau_t + \gamma_{kt}$$

where uk represents spatially correlated extra variation, vk represents uncorrelated extra variation, τt describes temporal variation and γkt is the space–time interaction. It is generally assumed that τt and γkt follow random walks to allow a smooth variation in time (Knorr-Held, 2000): $\tau_t \sim N(\tau_{t-1}, \sigma_\tau^2)$ and $\gamma_{kt} \sim N(\gamma_{k,t-1}, \sigma_\gamma^2)$. In general, the conditional auto-regressive model of Besag et al. (1991) is used to model the spatial correlation between the uk.

The complexity of this model makes it virtually essential to use Bayesian methods for model fitting. Vague prior distributions are given to the model parameters and MCMC methods are used to sample from the posterior distribution and to estimate parameters. To test for changes in spatial pattern, the estimates of parameters could be monitored (Lawson et al., 2003) but, much more commonly, observations are compared with one-step-ahead predictions (Kleinman et al., 2004). To detect gradual signals of interest, buffers between modelling intervals and test intervals can be used.

Other modelling strategies have also been proposed. Zhou and Lawson (2008) gave a computationally cheap approach in which separate spatial models are fitted to the data from each collection period. These models are combined by forming EWMAs and a sharp change in the weighted average in any neighbourhood suggests an outbreak. Another computationally quick method was given by Kleinman et al. (2004), who fitted generalized linear mixed models that include time-of-day and seasonality components, but which allow only uncorrelated heterogeneity between areas. Several researchers have used a model in which the disease process can switch between two (unobserved) states: endemic (non-outbreak) and epidemic (outbreak). HMMs are used to model the process, and an alarm is flagged on the basis of a Bayes factor that reflects the relative likelihoods of each state (Lawson et al., 2003; Madigan, 2005; Watkins et al., 2009).

Modelling individual level data has attracted far less attention. Clark and Lawson (2006) proposed a novel approach using non-parametric regression via kernel smoothers. However, the major work with individual level data was in the ‘Ascertainment and enhancement of gastrointestinal infection surveillance and statistics’ project, which was reported in Diggle et al. (2004, 2005). The aim of this project was to develop a monitoring tool that could identify anomalies in the space–time distribution of non-specific gastrointestinal infections. The data that it collected were the location x and date t of each individual case. As the point process model for these data, Diggle et al. (2005) used a non-stationary log-Gaussian Cox process in which the spatiotemporal intensity λ(x,t) was decomposed as

  • $$\lambda(x,t) = \lambda_0(x)\,\mu_0(t)\,R(x,t)$$ (16)

where λ0(x) is a smoothly varying surface describing the normal disease pattern that was estimated by using kernel smoothing, μ0(t) is temporal variation, modelled parametrically to reflect day-of-week and season effects, and R(x,t) is the residual space–time variation.

Both λ0(x) and μ0(t) in equation (16) could be estimated from historical data but up-to-date predictions of R(x,t) were required. These predictions were based on the most recent 5 days’ data. Naturally they had far more uncertainty attached to them than the estimates of λ0(x) and μ0(t), so these estimates were treated as deterministic quantities. To detect outbreaks, the region was divided into neighbourhoods and MCMC sampling was used to determine the probability in each neighbourhood that R(x,t) exceeded a prespecified threshold. The results were reported daily on maps with colours denoting probability levels.

The complexity of models places a burden on computer resources and the problem is exacerbated if MCMC sampling is used to implement Bayesian methods. As noted by Lawson et al. (2003), page 952, ‘… for any Bayesian model, computational speed-ups must be sought to make implementation realistic in a surveillance context’. In line with this, it seems sensible to follow Diggle et al. (2005) and to treat any quantity that is estimated from historical data as deterministic. Also, historical data can generally be restricted to a moving window of the last 3, 5 or 8 years (Lawson et al., 2003) unless the data are very sparse. Similarly, a window of just the most recent past (perhaps only a few days) has generally been used when much more weight should be given to recent data (e.g. Diggle et al. (2005)). Numerical approximations have also been used to good effect. For example, Kleinman et al. (2004) estimated parameters through quasi-likelihood, which is computationally less demanding but which introduces some bias. In many situations these pragmatic measures would simply be an option, but in the context of realtime surveillance they are a necessity.

6. Multivariate outbreak detection

6.1. Scope of multivariate detection

Most outbreak detection systems track more than one data series. For example, in the UK, systems of laboratory surveillance (Farrington et al., 1996; McCabe et al., 2003) and syndromic surveillance (Baker et al., 2003; Robertson, 2006) and systems for institutional surveillance (Marshall et al., 2004) may typically monitor dozens or even hundreds of different series of data. When the different data series (and, most importantly, outbreaks within them) are likely to be unrelated, it usually makes most sense to consider them as separate univariate series. However, in some cases, several series will relate to the same underlying process, and hence process changes are likely to be strongly correlated. This applies, for example, to indicators of influenza (Stroup et al., 1988; Griffin et al., 2009; Mann, 2009), reports of gastrointestinal illness from different sources (Kulldorff et al., 2007) or time series of counts of the same infection in different age groups (Held et al., 2005). In such circumstances multivariate methods of outbreak detection are likely to be fruitful in exploiting dependences, both between the underlying processes and between the timing of outbreaks.

An overview of methods for multivariate surveillance is given in Sonesson and Frisén (2005) and Frisén (2010), who classified genuinely multivariate approaches into categories that include reduction of dimensionality, joint modelling and vector accumulation methods. These are discussed in turn in Section 6.2, Section 6.3 and Section 6.4 respectively. Sonesson and Frisén (2005) and Frisén (2010) also mentioned so-called ‘parallel surveillance’ methods. A parallel approach monitors each variable separately by means of a univariate surveillance method. An alarm for the multivariate process is declared if some condition is fulfilled, e.g. the first time that any of the univariate processes gives an alarm. These methods will not be considered further here.

6.2. Dimension reduction

Dimension reduction methods for multivariate surveillance data could in principle include standard tools such as principal component analysis (Jolliffe, 2002). These have been used for detecting aberrations in other fields (Ku et al., 1995), though they may lead to problems of interpretation. A more popular approach is to reduce the multivariate data at each time point to a scalar, which is then monitored by univariate surveillance methods. Let y={yt,t=1,2,…} be the multivariate process under surveillance, where yt=(y1t,y2t,…,ypt)T is observed with in-control mean p×1 vector μ and p×p covariance matrix Σ. An early multivariate surveillance scheme is that based on Hotelling's T2 (Hotelling, 1947; Jackson, 1959). The process parameter at time t for multivariate data yt is

  • $$T^2(t) = (y_t - \mu)^{\mathrm{T}}\,\Sigma^{-1}\,(y_t - \mu)$$ (17)

The statistic (17) is a multivariate extension of the Shewhart chart, an alarm being declared at time tA=min{t:T2(t)>h}, where h is a specified threshold. For a bivariate process (xt,yt) with change-point (tx,ty), Andersson (2009a) showed that the conditional expected delay, defined as the expectation of tA−t(1) given that tA>t(1), where t(1)=min(tx,ty), depends only on |tx−ty|.
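As a minimal sketch of how a Hotelling-type statistic might be computed and turned into an alarm time, the following Python code assumes that the in-control mean μ and covariance matrix Σ are known, and it restricts attention to the bivariate case so that Σ can be inverted in closed form. The function names and the threshold used in any call are illustrative assumptions and are not taken from the cited references.

```python
def hotelling_t2(y, mu, sigma):
    """T^2 = (y - mu)^T Sigma^{-1} (y - mu) for a bivariate observation y.

    sigma is a 2x2 covariance matrix given as nested sequences
    (assumed known and positive definite).
    """
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if det <= 0:
        raise ValueError("sigma must have a positive determinant")
    d0, d1 = y[0] - mu[0], y[1] - mu[1]
    # Closed-form inverse of a 2x2 matrix: (1/det) * [[d, -b], [-c, a]]
    return (d0 * (d * d0 - b * d1) + d1 * (-c * d0 + a * d1)) / det


def alarm_time(series, mu, sigma, h):
    """First time t (1-based) at which T^2(t) > h, or None if no alarm."""
    for t, y in enumerate(series, start=1):
        if hotelling_t2(y, mu, sigma) > h:
            return t
    return None
```

In practice μ and Σ would have to be estimated from historical in-control data, and h would be chosen to give an acceptable false alarm rate; neither step is shown here.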

A further possibility is to undertake univariate analyses on each data set, and to combine the p-values for the marginal tests into a single ‘consensus’ value. One such method uses Fisher's rule to obtain the summary statistic F from n individual p-values pi:

  • $$F = -2 \sum_{i=1}^{n} \log(p_i)$$

If the n tests are independent then the null distribution of F is χ2 with 2n degrees of freedom. This and other methods for combining p-values, along with Hotelling's T2, were discussed in an outbreak detection setting by Burkom et al. (2005).
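Fisher's rule can be sketched in a few lines of Python. The code below assumes independent tests and uses the closed form of the χ2 survival function for an even number of degrees of freedom, so that only the standard library is needed; the function name is an illustrative choice.

```python
import math


def fisher_combined(p_values):
    """Return (F, combined p-value), where F = -2 * sum(log p_i).

    Assumes the n tests are independent, so that F is chi-squared
    with 2n degrees of freedom under the null hypothesis.
    """
    n = len(p_values)
    if n == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("need at least one p-value, each in (0, 1]")
    f = -2.0 * sum(math.log(p) for p in p_values)
    # For 2n degrees of freedom the survival function has a closed form:
    # P(X > f) = exp(-f/2) * sum_{k=0}^{n-1} (f/2)^k / k!
    half = f / 2.0
    term, total = 1.0, 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return f, math.exp(-half) * total
```

With a single p-value the combined value equals that p-value, which gives a quick sanity check. If the series are correlated, as is likely when they track the same underlying process, the χ2 reference distribution no longer applies.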

6.3. Joint modelling methods

Kulldorff et al. (2007) developed a multivariate space–time scan statistic based on the sum of the log-likelihood-ratio statistics for the univariate processes. This generalizes an earlier univariate version (Kulldorff, 1997). Thus, suppose that the total number of reports for series j is Nj (j=1,…,p). For a space–time cylinder z with nj,z cases from series j and expected cases μj,z obtained under a Poisson model, the likelihood ratio for a ‘high’ cluster is

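  • $$\mathrm{LR}_j(z) = \left(\frac{n_{j,z}}{\mu_{j,z}}\right)^{n_{j,z}} \left(\frac{N_j - n_{j,z}}{N_j - \mu_{j,z}}\right)^{N_j - n_{j,z}}$$ (18)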

if nj,z>μj,z and 1 otherwise. The multivariate scan statistic for detecting high clusters is then

  • $$\max_{z} \sum_{j=1}^{p} \log\{\mathrm{LR}_j(z)\}$$ (19)

which adjusts automatically for multiple testing that is inherent in considering multiple series as well as multiple cylinders. Equation (18) differs very slightly from Kulldorff's formulation: his LRj(z) is multiplied by an indicator function, but this will cause problems when taking logarithms.
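The calculation can be sketched as follows, assuming the standard Poisson form of the likelihood ratio, (n/μ)^n {(N−n)/(N−μ)}^(N−n) when n>μ and 1 otherwise. The dictionary of candidate cylinders and the function names are hypothetical, and the Monte Carlo step that is needed to obtain a p-value or recurrence interval is deliberately omitted.

```python
import math


def log_lr(n, mu, total):
    """Log Poisson likelihood ratio for a 'high' cluster; 0 when n <= mu.

    n: observed cases in the cylinder, mu: expected cases in the cylinder,
    total: total number of reports for the series (requires mu < total).
    """
    if n <= mu:
        return 0.0
    inside = n * math.log(n / mu)
    rest = total - n
    outside = rest * math.log(rest / (total - mu)) if rest > 0 else 0.0
    return inside + outside


def multivariate_scan(cylinders, totals):
    """Return (best cylinder label, statistic).

    cylinders maps a label z to a list of (n_jz, mu_jz) pairs, one per
    series j; totals lists N_j in the same order.
    """
    scores = {
        z: sum(log_lr(n, mu, big_n) for (n, mu), big_n in zip(counts, totals))
        for z, counts in cylinders.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Because a series contributes zero when its count does not exceed expectation, a cylinder can score highly on the strength of one series alone or through moderate excesses in several.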

To illustrate this approach, the multivariate space–time scan statistic (19) was applied to syndromic surveillance data in Scotland. The data give the daily number of calls to NHS24 originating within the Glasgow postcode area, by postcode district, from February 19th to April 1st, 2007. This area consists of 50 postcode districts. Data are recorded for the two symptoms diarrhoea and vomiting, which are syndromic indicators for norovirus infection. A space–time scan statistic analysis was performed for each of the 7 days from March 26th to April 1st, using the data from February 19th, 2007, up to that day. We used 3 days as the maximum temporal window size. Calculations were performed by using the SaTScan software (Kulldorff, 2010). The results of the analyses are presented in Table 1. The strongest signal was on April 1st, with the cluster consisting of 2 days and 25 postcode districts. With a recurrence interval of 2 years and 269 days, this cluster is unlikely to be a chance occurrence. A second cluster with an identical recurrence interval was detected on March 26th, containing 27 postcode districts during 3 days.

Table 1.   Two most likely clusters for norovirus in the Glasgow postcode area, during March 26th–April 1st, 2007, as generated by the multivariate space–time scan statistic
             Cluster characteristics   Results for diarrhoea                Results for vomiting
Date         Number of    Number of    Observed  Expected  Relative risk    Observed  Expected  Relative risk    Recurrence interval
             postcodes    days
March 26th   27           3            49        49.29     0.99             192       123.84    1.59             2 years 269 days
April 1st    25           2            47        28.28     1.69             164       76.02     2.22             2 years 269 days

Other joint modelling approaches have been suggested, based on a joint model for the entire multivariate data series. In one such method, the alarm function is based on the likelihood ratio derived from the joint distribution of the multivariate process (Andersson, 2009b). This is a multivariate version of the method of Frisén and de Maré (1991). Schiöler and Frisén (2011) present a multivariate extension of the semiparametric univariate method of Frisén and Andersson (2009), based on the sufficient reduction approach of Frisén et al. (2011) for step changes.

Several other joint modelling methods have been used for infectious disease data but have not been applied to outbreak detection per se, though extensions in this direction should be possible. Held et al. (2005) extended to the multivariate case a model incorporating a branching process that was presented in Held et al. (2006b). This model was further extended in Paul et al. (2008) to analyse data from several pathogens. A multivariate spatial model for different gastrointestinal infections was presented in Held et al. (2006a).

Sebastiani et al. (2006) used dynamic Bayesian networks to study the interplay of four different sources of data that are monitored for influenza surveillance. Mann (2009) developed a multivariate HMM with a shared hidden process to model several markers of influenza.

6.4. Vector accumulation methods

Vector accumulation methods include multivariate extensions of the EWMA and CUSUM charts. Generalizing the EWMA is relatively straightforward (Lowry et al., 1992), via the recursive scheme

  • $$z_t = \Lambda y_t + (I_p - \Lambda)\,z_{t-1}$$ (20)

where Λ is a diagonal p×p matrix of values in (0,1] and Ip is an identity matrix of order p. The chart from equation (20) goes out of control when

  • $$z_t^{\mathrm{T}}\,\Sigma_{z_t}^{-1}\,z_t > h$$

for some control limit h, where Σzt is the covariance matrix of zt. In contrast, generalizing the standard univariate two-sided CUSUM is complicated by the fact that there are two CUSUMs for each variable. Crosier (1988) and Pignatiello and Runger (1990) developed multivariate CUSUMs as generalizations of new univariate CUSUMs requiring a single CUSUM per variable. Golosnoy et al. (2009) investigated the properties of the multivariate CUSUM chart of Pignatiello and Runger (1990) and suggested further enhancements. Ngai and Zhang (2001) generalized the standard univariate CUSUM in the sense that the p-dimensional version reduces to it when p=1. These various multivariate CUSUMs all apply only to independent multivariate observations. Bodnar and Schmid (2007) further extended these methods to multivariate time series, taking into account dependences in the underlying process. Moving beyond Gaussian processes, an example of a rank-based multivariate CUSUM can be found in Qiu and Hawkins (2001), with further development of this non-parametric scheme provided by Qiu and Hawkins (2003).
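A multivariate EWMA of this kind can be sketched under some simplifying assumptions, all of which are ours rather than the source's: Λ=λIp with a single smoothing constant λ, independent observations, a starting value of zero, observations centred on a known in-control mean, and a known inverse covariance matrix supplied by the caller. Under these assumptions the covariance matrix of zt is ctΣ with ct=λ{1−(1−λ)^(2t)}/(2−λ), so no matrix inversion is needed inside the loop.

```python
def mewma_alarm(series, mu, sigma_inv, lam, h):
    """First time t (1-based) at which the MEWMA statistic exceeds h, else None.

    Assumes Lambda = lam * I_p, independent observations, z_0 = 0, a known
    in-control mean mu and a known inverse covariance matrix sigma_inv.
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    p = len(mu)
    z = [0.0] * p
    for t, y in enumerate(series, start=1):
        # Recursion on centred data: z_t = lam * (y_t - mu) + (1 - lam) * z_{t-1}
        z = [lam * (y[i] - mu[i]) + (1.0 - lam) * z[i] for i in range(p)]
        # Exact covariance factor: Cov(z_t) = c_t * Sigma
        c_t = lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t))
        quad = sum(z[i] * sigma_inv[i][j] * z[j] for i in range(p) for j in range(p))
        if quad / c_t > h:
            return t
    return None
```

Setting lam=1 removes the memory of the chart and recovers a Hotelling-type Shewhart chart, which is a convenient check on an implementation.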

7. Comparison and evaluation of prospective outbreak detection systems

As documented in earlier sections, a rich array of methodologies for outbreak detection is available. This raises the question: which to use? It is not feasible to make detailed recommendations about which method is ‘best’, because this will depend critically on the specific details of the application and implementation, as well as its purpose and context. In particular, the key issues that are likely to affect any assessment of the relative merits of different methods are

  • (a)
     the scope of the system, in particular how many parallel data series are to be monitored, which can range from one to several thousand,
  • (b)
     the quality of the available data, including the method of data collection, and the delay between event occurrence and reporting,
  • (c)
     the spatiotemporal features of the data, such as count frequency, trend structure, seasonality, epidemicity, time step and spatial resolution,
  • (d)
     the features of the outbreaks that may occur, e.g. explosive or gradual onset, brief or long duration, and level of severity, or a mix of these,
  • (e)
     the use to which the system is to be put, including the post-signal processing protocols,
  • (f)
     the availability of processing power and human resources to support the system and
  • (g)
     the choice of metric to evaluate results.

Accordingly, we venture no recommendations on the strengths and weaknesses of the different methods. Instead, and in the spirit of this review, we focus on some of the statistical issues that are involved in evaluating and comparing methods. This remains to some extent an undeveloped area, in spite of much discussion and some progress; see Buckeridge et al. (2005), Fraker et al. (2008) and Fricker (2011) for overviews. It is influenced by two distinct perspectives: technical approaches derived from statistical process control and the more empirical perspective of epidemiology.

7.1. Optimality properties

In the statistical process control literature, performance of a detection system is usually assessed by criteria based on time to event, such as the ARL. The ARL to detect an on-going outbreak that started at the same time as the surveillance is ARL1. The criterion of minimal ARL1, given a fixed ARL0, is sometimes used in public health settings (Grigg et al., 2003; Musonda et al., 2008), but the usefulness of this criterion for detecting outbreaks has been questioned by Frisén (1992, 2003), owing in part to the restriction that the measure relates only to outbreaks starting concurrently with the surveillance.

Frisén (1992) considered other criteria, such as the probability of a false alarm before some specified time t after the start of surveillance, the probability of detection of an outbreak starting at time t before time t+d for some d and the positive predictive value of an alarm, namely the probability that an outbreak is occurring, given that the system has signalled one. These and related measures are discussed in Sonesson and Bock (2003). Recently, measures of performance based on other temporal criteria have been proposed (Fraker et al., 2008). One such is the recurrence interval, namely the time interval over which the expected number of false alarms is 1, when the process is in control.

It is sometimes possible to determine which statistical detection method is optimal under a given criterion. The main theoretical results on optimality properties are those of Frisén and de Maré (1991) and Frisén (1992, 2003). Consider a series of independent random variables xs at times s with mean μ0 at s<T and mean μ1>μ0 at s≥T, where T is the (unobserved) start of the outbreak. An alarm function p is defined on the basis of suitable null and alternative hypotheses, defining a test with rejection region of the form p(xs)>k. This test maximizes the probability that an alarm is detected, among all tests with a specified false alarm probability. Optimality in this sense derives from a version of the Neyman–Pearson lemma, and indeed the test may be expressed as a likelihood ratio statistic. Thus

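  • $$p(x_s) = \sum_{t=1}^{s} w(s,t)\,L(s,t) > k_s$$

where

  • $$w(s,t) = \frac{P(T=t)}{P(T\le s)}, \qquad L(s,t) = \frac{f_x(x_s \mid T=t)}{f_x(x_s \mid T>s)}$$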

Here fx is the density of x and ks is the critical value of the test at time s.

Within this framework, Frisén (2003, 2007) showed that the Shewhart chart method, for which p(xs)=L(s,s), maximizes the probability of detecting the outbreak at time T=s, and, asymptotically (as μ1 increases), minimizes the time to detection, for a fixed false detection probability. In the Shiryaev–Roberts method (Shiryaev, 1963; Roberts, 1966), dependence on T through w(s,t) is eliminated by setting w(s,t)=1, which may be justified asymptotically as corresponding to a vanishingly low probability of an outbreak. More generally, however, the performance of a method will depend on the time T at which the outbreak occurs after surveillance has begun. Minimax optimality criteria minimize the expected delay in detection, for the worst possible choice of T. Some CUSUM methods, for which p(xs)=maxt{L(s,t)}, are minimax optimal in this sense (Frisén, 2003).
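To make the three alarm functions concrete, the sketch below computes them for one simple case that we assume for illustration: independent normal observations with known standard deviation σ whose mean shifts from μ0 to μ1. In that case log L(s,t) is the sum over u=t,…,s of (μ1−μ0)(xu−(μ0+μ1)/2)/σ2, so the CUSUM form maxt L(s,t) and the Shiryaev–Roberts form Σt L(s,t) (that is, w(s,t)=1) both have simple recursions. The function name is hypothetical.

```python
import math


def alarm_statistics(xs, mu0, mu1, sigma):
    """Return a list of (shewhart, cusum, shiryaev_roberts) values, one per time s.

    shewhart         = L(s, s)
    cusum            = max over t <= s of L(s, t)
    shiryaev_roberts = sum over t <= s of L(s, t)
    Assumes independent N(mu, sigma^2) data with a shift from mu0 to mu1.
    """
    best_log = 0.0  # running value of max_t log L(s, t)
    sr = 0.0        # running value of sum_t L(s, t)
    out = []
    for x in xs:
        # Log-likelihood-ratio contribution of the current observation
        ell = (mu1 - mu0) / sigma ** 2 * (x - 0.5 * (mu0 + mu1))
        best_log = max(best_log, 0.0) + ell
        sr = (1.0 + sr) * math.exp(ell)
        out.append((math.exp(ell), math.exp(best_log), sr))
    return out
```

An alarm would be raised the first time the chosen statistic exceeds its critical value; choosing that value to achieve a target false alarm behaviour is a separate calibration step that is not shown.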

However, the usefulness of such theoretical optimality properties is limited by the restrictive assumptions under which they are derived. In practice, surveillance data are noisy, often non-stationary or auto-correlated, outbreaks are of varying durations and intensities but seldom indefinite, and the distributions of the observed counts are not generally Poisson. Arguably, the notion of the process being ‘in control’ or in a ‘steady state’ bears little relation to reality in the context of surveillance. Although Frisén (2003) discussed optimality properties in the presence of such complexities, a sufficiently robust general theory remains elusive. In addition, it is often advisable to consider several performance measures concurrently. For these reasons, more empirical criteria are usually employed to compare methods.

7.2. Epidemiological perspective

The epidemiological perspective derives from the evaluation of diagnostic tests in clinical medicine and is based on concepts such as sensitivity, specificity and predictive value, applied to surveillance time units rather than individual patients. An extensive discussion of these types of measure, relating largely to the era before computerized surveillance, but of lasting relevance, may be found in Thacker et al. (1988), who identified the following criteria: usefulness, cost, sensitivity, specificity, representativeness, timeliness, simplicity, flexibility and acceptability. These measures relate to the performance of a surveillance system as a whole, including the process of data collection and the wider public health impact of the system, rather than the evaluation of different statistical algorithms.

To evaluate statistical algorithms, it is customary to use a combination of numerical indices, most commonly sensitivity and specificity (or positive predictive value), and timeliness, namely the delay between the start of the outbreak and its detection. However, none of these quantities is straightforward to standardize or operationalize in the context of outbreak detection, owing to the contextual factors that were set out at the beginning of this section; their big advantage is that they relate directly to the preoccupations of system users.

Several approaches have been taken to combine these indices to obtain a single performance measure which can be compared meaningfully across detection systems. The standard method is to calculate a receiver operating characteristic (ROC) curve, and to use the area under the curve (AUC) as a summary measure. ROC curves are obtained by plotting sensitivity against 1 − specificity and thus ignore the key timeliness variable. Buckeridge et al. (2005) discussed a variant, the activity monitoring operating characteristic curve, which is obtained by plotting a measure of timeliness (such as the time to detection of an outbreak on an inverted scale) against 1 − specificity. Zhang et al. (2003) used both ROC and activity monitoring operating characteristic curves to compare performance of several outbreak detection systems.

Kleinman and Abrams (2006) have proposed a weighted ROC curve, in which each point of the curve is weighted by a timeliness measure, and a weighted AUC is derived which thus takes timeliness into account. They suggested two ways of doing this. In the first, it is assumed that there is a reference time after the outbreak starts by which it must be detected, and the proportion Psav of time that is saved relative to this reference time is measured. Thus, suppose that an outbreak starts at time t, and that the predetermined reference time is t+D. If the outbreak is detected at time s≥t, or not detected (in which case s=∞), then

  • $$P_{\mathrm{sav}} = \max\left(0,\ \frac{t + D - s}{D}\right)$$

Martínez-Beneito et al. (2008) used this approach to evaluate different systems for detecting influenza epidemics. In the second method of Kleinman and Abrams (2006), timeliness is a 0–1 variable according to whether the outbreak was detected within a specified time or not, and it is therefore similar to the successful detection probability of Frisén (1992). Under both schemes, the weighted AUC equals 1 if the system is fully accurate and timely. Kleinman and Abrams (2006) also proposed several three-dimensional generalizations of the ROC curve, the third dimension being a timeliness measure. The AUC is replaced by the volume under the curve. Cowling et al. (2006) used the volume under the curve approach in comparing different outbreak detection systems for influenza.
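The first timeliness weight can be illustrated with a short Python function. It assumes that the proportion of time saved takes the form max{0, (t+D−s)/D}, which is our reading of the description above rather than a formula quoted from Kleinman and Abrams (2006); the function name is hypothetical.

```python
import math


def proportion_time_saved(t, s, big_d):
    """Proportion of the window [t, t + D] saved by detecting at time s.

    t: outbreak start, s: detection time (math.inf if never detected),
    big_d: length D of the reference window (must be positive).
    """
    if big_d <= 0:
        raise ValueError("D must be positive")
    if s < t:
        raise ValueError("detection cannot precede the outbreak start")
    # Detection at the start saves everything; at or after t + D saves nothing
    return max(0.0, (t + big_d - s) / big_d)
```

For example, with t=10 and D=5, detection at s=12 gives a value of 0.6, whereas detection after time 15, or no detection at all, gives 0.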

Further complications arise when the system is intended to work with more than one data series. To control the false discovery rate, namely the proportion of signals that do not correspond to outbreaks, either a Bonferroni correction may be applied or the methods of Benjamini and Hochberg (1995) may be used. Marshall et al. (2004) discussed these issues in the context of hospital performance monitoring. In contrast, Farrington et al. (1996) used an empirical approach in which, each week, the number of flagged organisms (out of many hundred reported) is limited to some manageable ranked number with the smallest p-values (Farrington, 2004), since the true timeliness of outbreak detection depends also on post-signal confirmatory procedures. Grigg and Spiegelhalter (2008) showed how the methods of Benjamini and Hochberg (1995) can be applied to p-values derived from CUSUMs. For spatial surveillance, which may also involve monitoring several series of data, a further elaboration of the ROC curve has been proposed, in which the fraction of outbreak locations detected is plotted against 1 − specificity (Buckeridge et al., 2005).
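The step-up procedure of Benjamini and Hochberg (1995) is short enough to sketch directly. The code below takes one p-value per monitored series and returns the indices of the series that would be flagged at false discovery rate q; it assumes the usual conditions for the procedure (independent or positively dependent tests), and the function name is an illustrative choice.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Indices of the series flagged by the Benjamini-Hochberg procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank k whose p-value satisfies p_(k) <= k * q / m
        if p_values[i] <= rank * q / m:
            cutoff = rank
    # Every series ranked at or below the cutoff is flagged
    return sorted(order[:cutoff])
```

Note the step-up behaviour: a series whose p-value fails its own threshold is still flagged if a larger p-value further down the ranking passes.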

Evaluation methods for large surveillance systems are an area requiring further work; it is unlikely that a single plot or index will suffice. For example, evaluations of systems applied to multiple organisms should ideally allow for differences in public health importance of different outbreaks (as determined by their size, severity and cost). Frisén (2003) discussed utilities, but in the context of delays in detection rather than severity of outbreak. This is a topic requiring further development, in particular to incorporate prior knowledge about the likely severity of an outbreak for a particular organism and age group combination, and to derive suitable importance weights.

7.3. Study designs

Most of the empirical investigations of outbreak detection algorithms have been done using real data, wholly simulated data or real data with superimposed simulated outbreaks (Buckeridge et al., 2005; Choi et al., 2010). Evaluations based on real data suffer from the obvious problem that the true outbreaks are generally not known. Kleinman and Abrams (2006) proposed to evaluate a new surveillance system by comparing the outbreak signals that it generates with outbreaks in real data identified by an existing system of recognized reliability. They suggested three tests, including a permutation test in which outbreak locations and dates are randomly reassigned, to assess whether the new system works in the sense that it detects outbreaks better than chance, or that the signals are generated earlier than by the existing system.

An appealing approach is to inject simulated outbreaks into real data series. This combines the realism of idiosyncratically noisy surveillance data, with knowledge of the outbreaks, the epidemic curves and durations of which are under the investigator's control. This is the approach that was taken by Neill (2009). Others have used fully simulated data, with superimposed outbreaks (Hutwagner, Browne, Seeman and Fleischhauer, 2005; Hutwagner, Thompson, Seeman and Treadwell, 2005).
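The injection idea can be sketched as follows. The code adds Poisson-distributed extra cases to a copy of a baseline count series, with the expected number of extra cases per day given by an epidemic curve chosen by the investigator. The Poisson assumption, the function names and the simple sampler are ours and are not taken from the studies cited above.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (small means only)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def inject_outbreak(baseline, start, shape, seed=None):
    """Return a copy of baseline with simulated outbreak cases added.

    shape[i] is the expected number of extra cases at index start + i;
    any part of the outbreak falling beyond the end of the series is dropped.
    """
    rng = random.Random(seed)
    out = list(baseline)
    for offset, mean in enumerate(shape):
        day = start + offset
        if day >= len(out):
            break
        out[day] += poisson_draw(rng, mean)
    return out
```

Repeating the injection over many start dates and epidemic curves, and running the detection algorithm on each modified series, yields empirical estimates of sensitivity and timeliness, while the unmodified series gives the false alarm behaviour.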

Relatively few evaluations of automated statistical surveillance methods against traditional approaches have been undertaken: Leal and Laupland (2008) have attempted a systematic review, whereas Huang et al. (2010) carried out a retrospective cohort study. Kleinman and Abrams (2006) have commented that, in view of the multiplicity of evaluation measures that are now available, an evaluation metric is now required for evaluating evaluation metrics.

8. Final remarks

In this paper we have sought to review the methods that have been proposed for identifying outbreaks of infectious diseases as they arise, in sufficiently timely fashion to allow interventions to take place. In limiting ourselves to these statistical methods, we chose not to deal with several issues of critical importance, which require further research and involvement of statisticians. One key issue is how to design effective and flexible user interfaces—ideally including a choice of statistical algorithms. Another is how to create effective protocols to handle the signals that emerge from a statistical surveillance system. A third is how to deal with the imperfections in data, such as reporting delays, which inevitably affect prospective detection systems.

This review shows that a very broad range of statistical techniques have been proposed for prospective outbreak detection. The choice of which statistical technique to use will depend critically on the nature of the intended application. In particular, systems that are designed for the surveillance of a single infection or syndrome should arguably be tuned to the specific features of that infection and may need frequent user intervention. In contrast, systems that are designed for routine application to hundreds or thousands of possible infections, with diverse frequencies and temporal patterns, will require robustness and automation. Systems will also vary according to what features they are designed to detect; for example trends and seasonal variation may or may not be of interest according to context. No single statistical technique is likely to be ideal in every setting.

Statistical outbreak detection is a multidisciplinary science, involving epidemiologists and computer programmers, as well as statisticians. Its rapid development over recent years presents an opportunity for statisticians, through the numerous techniques that are now available, and the increased acceptance of statistical methodology in this field. But it also presents a challenge to the statisticians: to demonstrate that these new statistical methodologies provide added value in public health terms, by effectively supplementing other surveillance methods for detecting outbreaks.

Acknowledgements

The Joint Editor, two reviewers and Bill Woodall made valuable comments and suggestions on the first draft of this paper. We are grateful to David Conesa, Marianne Frisén, Michael Höhle and Ron Kenett for making us aware of some references that we would otherwise have missed.

This study was supported by grants from the National Institute for Health Research and the Medical Research Council.

References

  1. Top of page
  2. Abstract
  3. 1. Setting the scene
  4. 2. Regression techniques
  5. 3. Time series methodology
  6. 4. Methods inspired by statistical process control
  7. 5. Methods incorporating spatial information
  8. 6. Multivariate outbreak detection
  9. 7. Comparison and evaluation of prospective outbreak detection systems
  10. 8. Final remarks
  11. Acknowledgements
  12. References
  • Andersson, E. (2009a) Hotelling's T2 method in multivariate on-line surveillance: on the delay of an alarm. Communs Statist. Theor. Meth., 38, 26252633.
  • Andersson, E. (2009b) Effect of dependency in systems for multivariate surveillance. Communs Statist. Simuln Computn, 38, 454472.
  • Andersson, E., Bock, D. and Frisén, M. (2008) Modeling influenza incidence for the purpose of on-line monitoring. Statist. Meth. Med. Res., 17, 421438.
  • Assunção, R. and Correa, T. (2009) Surveillance to detect emerging space-time clusters. Computnl Statist. Data Anal., 53, 28172830.
  • Assunção, R., Costa, M., Tavares, A. and Ferreira, S. (2006) Fast detection of arbitrarily shaped disease clusters. Statist. Med., 25, 723742.
  • Baker, M., Smith, G. E., Cooper, D., Verlander, N. Q., Chinemana, F., Cotterill, S., Hollyoak, V. and Griffiths, R. (2003) Early warning and NHS Direct: a role in community surveillance? J. Publ. Hlth Med., 25, 362368.
  • Benjamini, Y. and Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Statist. Soc. B, 57, 289300.
  • Benneyan, J. C. (1998a) Statistical quality control methods in infection control and hospital epidemiology, Part I: Introduction and basic theory. Infectn Contr. Hosp. Epidem., 19, 194214.
  • Benneyan, J. C. (1998b) Statistical quality control methods in infection control and hospital epidemiology, Part II: Chart use, statistical properties, and research issues. Infectn Contr. Hosp. Epidem., 19, 265283.
  • Besag, J., York, J. and Mollié, A. (1991) A Bayesian image restoration, with two applications in spatial statistics (with discussion). Ann. Inst. Statist. Math., 43, 159.
  • Bithell, J. F. (1992) Statistical methods for analysing point-source exposures. In Geographic and Environmental Epidemiology: Methods for Small Area Studies (eds P. Elliot, D. Cuzick, D. English and R. Stern), pp. 135. Oxford: Oxford University Press.
  • Bock, D., Andersson, E. and Frisén, M. (2008) Statistical surveillance of epidemics: peak detection of influenza in Sweden. Biometr. J., 50, 7185.
  • Bodnar, O. and Schmid, W. (2007) Surveillance of the mean behavior of multivariate time series. Statist. Neerland., 61, 383406.
  • Borror, C. M., Champ, C. W. and Rigdon, S. E. (1998) Poisson EWMA control charts. J. Qual. Technol., 30, 352361.
  • Box, G. E. P. and Jenkins, G. M. (1970) Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day.
  • Buckeridge, D. L., Burkom, H. S., Campbell, M., Hogan, W. R. and Moore, A. (2005) Algorithms for rapid outbreak detection: a research synthesis. J. Biomed. Informat., 38, 99113.
  • Burkom, H. S., Murphy, S., Coberly, J. and Hurt-Mullen, K. (2005) Public health monitoring tools for multiple data streams. Morb. Mort. Wkly Rep., suppl., 54, 5562.
  • Burkom, H. S., Murphy, S. P. and Shmueli, G. (2007) Automated time series forecasting for biosurveillance. Statist. Med., 26, 42024218.
  • Cappé, O., Moulines, E. and Rydén, T. (2005) Inference in Hidden Markov Models. New York: Springer.
  • Cardinal, M., Roy, R. and Lambert, J. (1999) On the application of integer-valued time series models for the analysis of disease incidence. Statist. Med., 18, 20252039.
  • Carey, R. G. (2003) Improving Healthcare with Control Charts: Basic and Advances SWPC Methods and Case Studies. Milwaukee: American Society for Quality Quality Press.
  • Chatfield, C. and Yar, M. (1988) Holt-Winters forecasting: some practical issues. Statistician, 37, 129140.
  • Chen, R. (1978) A surveillance system for congenital malformations. J. Am. Statist. Ass., 73, 323327.
  • Choi, K. (1981) An evaluation of influenza mortality surveillance, 1962-1979: 1, Time series forecasts of expected pneumonia and influenza deaths. Am. J. Epidem., 113, 215226.
  • Choi, B. Y., Kim, H., Go, U. Y., Jeong, J.-H. and Lee, J. W. (2010) Comparison of various statistical methods for detecting disease outbreaks. Computnl Statist., 25, 603617.
  • Clark, A. B. and Lawson, A. B. (2006) Surveillance of individual level disease maps. Statist. Meth. Med. Res., 15, 353362.
  • Conesa, D., López-Quílez, A., Martínez-Beneito, M. A., Miralles, M. T. and Verdejo, F. (2009) FluDetWeb: an interactive web-based system for the early detection of the onset of influenza epidemics. BMC Med. Informat. Decsn Makng, 9, article 36.
  • Conesa, D., Martínez-Beneito, M. A., Amorós, R. and López-Quílez, A. (2010) Bayesian hierarchical Poisson models with a hidden Markov structure for the detection of influenza epidemic outbreaks. Working Paper. Departament d'Estadística i Investigació Operativa, Universitat de València, València.
  • Cooper, D. L., Smith, G., Baker, M., Chinemana, F., Verlander, N. Q., Gerard, E., Hollyoak, V. and Griffiths, R. (2004) National symptom surveillance using calls to a telephone health advice service—United Kingdom, December 2001-February 2003. Morb. Mort. Wkly Rep ., suppl., 53, 179183.
  • Coory, M., Duckett, S. and Sketcher-Baker, K. (2008) Using control charts to monitor quality of hospital care with administrative data. Int. J. Qual. Hlth Care, 20, 3139.
  • Costagliola, D., Flahault, A., Galinec, D., Garnerin, P., Menares, J. and Valleron, A.-J. (1991) A routine tool for detection and assessment of epidemics of influenza-like syndromes in France. Am. J. Publ. Hlth, 81, 9799.
  • Costagliola, D., Flahault, A., Galinec, D., Garnerin, P., Menares, J. and Valleron, A.-J. (1994) When is the epidemic warning cut-off point exceeded? Eur. J. Epidem., 10, 475476
  • Cowling, B. J., Wong, I. O. L., Riley, S. and Leung, B. M. (2006) Methods for monitoring influenza data. Int. J. Epidem., 35, 13141321.
  • Cox, D. R. (1955) Some statistical methods related with series of events (with discussion). J. R. Statist. Soc. B, 17, 129164.
  • Crosier, R. B. (1988) A new two-sided cumulative sum quality control scheme. Technometrics, 28, 187194.
  • Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Statist. Soc. B, 39, 138.
  • Diggle, P. J., Knorr-Held, L., Rowlingson, B., Su, T., Hawtin, P. and Bryant, T. (2004) Spatio-temporal point processes: methods and applications. In Monitoring the Health of Populations: Statistical Principles & Methods for Public Health Surveillance (eds R. Brookmeyer and D. F. Stroup), pp. 233266. Oxford: Oxford University Press.
  • Diggle, P. J., Rowlingson, B. and Su, T.-L. (2005) Point process methodology for on-line spatio-temporal disease surveillance. Environmetrics, 16, 423434.
  • Dominici, F., Sheppard, L. and Clyde, M. (2003) Health effects of air pollution: a statistical review. Int. Statist. Rev., 71, 243276.
  • Dong, Y., Hedayat, A. S. and Sinha, B. K. (2008) Surveillance strategies for detecting changepoint in incidence rate based on exponentially weighted moving average methods. J. Am. Statist. Ass., 103, 843853.
  • Duczmal, L. and Assunção, R. M. (2004) A simulated annealing strategy for the detection of arbitrarily shaped spatial clusters. Computnl Statist. Data Anal., 45, 269286.
  • Elbert, Y. and Burkom, H. S. (2009) Development and evaluation of a data-adaptive alerting algorithm for univariate temporal biosurveillance data. Statist. Med., 28, 32263248.
  • Farrington, C. P. (2004) Monitoring clinical performance: invited comments on the papers by Grigg and Farewell and Marshall et al. J. R. Statist. Soc. A, 167, 562563.
  • Farrington, C. P. and Andrews, N. (2004) Outbreak detection: application to infectious disease surveillance. In Monitoring the Health of Populations: Statistical Principles & Methods for Public Health Surveillance (eds R. Brookmeyer and D. F. Stroup), pp. 203231. Oxford: Oxford University Press.
  • Farrington, C. P., Andrews, N. J., Beale, A. D. and Catchpole, M. A. (1996) A statistical algorithm for the early detection of outbreaks of infectious disease. J. R. Statist. Soc. A, 159, 547563.
  • Fasting, S. and Gisvold, S. E. (2003) Statistical process control methods allow the analysis and improvement of anesthesia care. Can. J. Anesth., 50, 767774.
  • Fienberg, S. E. and Shmueli, G. (2005) Statistical issues and challenges associated with rapid detection of bio-terrorist attacks. Statist. Med., 24, 513529.
  • Fraker, S. E., Woodall, W. H. and Mousavi, S. (2008) Performance metrics for surveillance schemes. Qual. Engng, 20, 451464.
  • Fricker, Jr, R. D. (2001) Rejoinder: some methodological issues in biosurveillance. Statist. Med., 30, 434441.
  • Frisén, M. (1992) Evaluations of methods for statistical surveillance. Statist. Med., 11, 14891502.
  • Frisén, M. (2003) Statistical surveillance: optimality and methods. Int. Statist. Rev., 71, 403434.
  • Frisén, M. (2007) Properties and use of the Shewhart method and its followers. Sequentl Anal., 26, 171193.
  • Frisén, M. (2010) Principles for multivariate surveillance. In Frontiers in Statistical Quality Control 9 (eds H.-J. Lenz, P.-T. Wilrich and W. Schmid), pp. 133144. Berlin: Springer.
  • Frisén, M. and Andersson, E. (2009) Semiparametric surveillance of monotonic changes. Sequentl Anal., 28, 434454.
  • Frisén, M., Andersson, E. and Pettersson, K. (2010) Semiparametric estimation of outbreak regression. Statistics, 44, 107117.
  • Frisén, M., Andersson, E. and Schiöler, L. (2009) Robust outbreak surveillance of epidemics in Sweden. Statist. Med., 28, 476493.
  • Frisén, M., Andersson, E. and Schiöler, L. (2011) Sufficient reduction in multivariate surveillance. Communs Statist. Theor. Meth., 40, 18211838.
  • Frisén, M. and de Maré, J. (1991) Optimal surveillance. Biometrika, 78, 271280.
  • Gan, F. F. (1991) Monitoring observations generated from a binomial distribution using modified exponentially weighted moving average control charts. J. Statist. Computn Simuln, 37, 4560.
  • Gan, F. F. (1994) Design of optimal exponential CUSUM control charts. J. Qual. Technol., 26, 109124.
  • Gan, F. F. (1998) Designs of one- and two-sided exponential EWMA Charts. J. Qual. Technol., 30, 4469.
  • Glaz, J., Naus, J. and Wallenstein, W. (2001) Scan Statistics. New York: Springer.
  • Goldenberg, A., Shmueli, G., Caruana, R. and Fienberg, S. (2002) Early statistical detection of anthrax outbreaks by tracking over-the-counter medication sales. Proc. Natn. Acad. Sci. USA, 99, 52375240.
  • Golosnoy, V., Ragulin, S. and Schmid, W. (2009) Multivariate CUSUM chart: properties and enhancements. Adv. Statist. Anal., 93, 263279.
  • Griffin, B. A., Jain, A. K., Davies-Cole, J., Glymph, C., Lum, G., Washington, S. C. and Stoto, M. A. (2009) Early detection of influenza outbreaks using the DC Department of Health's syndromic surveillance system. BMC Publ. Hlth, 9, article 483.
  • Grigg, O. A., Farewell, V. T. and Spiegelhalter, D. J. (2003) Use of risk-adjusted CUSUM and RSPRT charts for monitoring in medical contexts. Statist. Meth. Med. Res., 12, 147170.
  • Grigg, O. A. and Spiegelhalter, D. J. (2008) An empirical approximation to the null unbounded steady state distribution of the cumulative sum statistic. Technometrics, 50, 501511.
  • Han, S. W., Tsui, K.-L., Ariyajunya, B. and Kim, S. B. (2010) A comparison of CUSUM, EWMA, and temporal scan statistics for detection of increases in Poisson rates. Qual. Reliab. Engng Int., 26, 279289.
  • Healy, M. J. R. (1983) A simple method for monitoring routine statistics. Statistician, 32, 347349.
  • Heisterkamp, S. H., Dekkers, A. L. M. and Heijne, J. C. M. (2006) Automated detection of infectious disease outbreaks: hierarchical time series models. Statist. Med., 25, 41794196.
  • Held, L., Graziano, G., Frank, C. and Rue, H. (2006a) Joint spatial analysis of gastrointestinal infectious diseases. Statist. Meth. Med. Res., 15, 465480.
  • Held, L., Hofmann, M., Höhle, M. and Schmid, V. (2006b) A two-component model for counts of infectious diseases. Biostatistics, 7, 422437.
  • Held, L., Höhle, M. and Hofmann, M. (2005) A statistical framework for the analysis of multivariate infectious disease surveillance counts. Statist. Modllng, 5, 187199.
  • Helfenstein, U. (1986) Box-Jenkins modelling of some viral infectious diseases. Statist. Med., 5, 3747.
  • Höhle, M. (2007) Surveillance: an R package for the monitoring of infectious diseases. Computnl Statist., 22, 571582.
  • Höhle, M. and Mazick, A. (2010) Aberration detection in R illustrated by Danish mortality monitoring. In Biosurveillance: Methods and Case Studies (eds T. Kass-Hout and X. Zhang), pp. 215237. London: CRC Press.
  • Höhle, M. and Paul, M. (2008) Count data regression charts for the monitoring of surveillance time series. Computnl Statist. Data Anal., 52, 43574368.
  • Holt, C. C. (1957) Forecasting trends and seasonals by exponentially weighted moving averages. Memorandum ONR 52/1957. Carnegie Institute of Technology, Pittsburgh.
  • Hotelling, H. (1947) Multivariate quality control. In Techniques of Statistical Analysis (eds C. Eisenhart, M. W. Hastay and W. A. Wallis), pp. 111184. New York: McGraw-Hill.
  • Huang, S. S., Yokoe, D. S., Stelling, J., Placzek, H., Kulldorff, M., Kleinman, K., O'Brien, T. F., Calderwood, M. S., Vostok, J., Dunn, J. and Platt, R. (2010) Automated detection of infectious disease outbreaks in hospitals: a retrospective cohort study. PLOS Med., 7, article e1000238.
  • Hulth, A., Andrews, N., Ethelberg, S., Dreesman, J., Faensen, D., van Pelt, W. and Schnitzler, J. (2010) Practical usage of computer-supported outbreak detection in five European countries. Eurosurveillance, 15, article 36.
  • Hutwagner, L., Browne, T., Seeman, G. M. and Fleischhauer, A. T. (2005) Comparing aberration detection methods with simulated data. Emergng Infect. Dis., 11, 314316.
  • Hutwagner, L., Thompson, W. W., Seeman, G. M. and Treadwell, T. (2003) The Bioterrorism Preparedness and Response Early Aberration Reporting System (EARS). J. Urb. Hlth, 80, suppl., 8996.
  • Hutwagner, L., Thompson, W. W., Seeman, G. M. and Treadwell, T. (2005) A simulation model for assessing aberration detection methods in public health surveillance for systems with limited baselines. Statist. Med., 24, 543550.
  • Ismail, N. A., Pettitt, A. N. and Webster, R. A. (2003) ‘Online’ monitoring and retrospective analysis of hospital outcomes based on a scan statistic. Statist. Med., 22, 28612876.
  • Jackson, J. E. (1959) Quality control methods for several related variables. Technometrics, 1, 359377.
  • Jackson, M. J., Baer, A., Painter, I. and Duchin, J. (2007) A simulation study comparing aberration detection algorithms for syndromic surveillance. BMC Med. Informat. Decsn Makng, 7, article 6.
  • Jacquez, G. M., Greilling, D. A., Durbeck, H., Estberg, L., Do, E., Long, A.and Rommel, B. (2002) ClusterSeer User Guide 2: Software for Identifying Disease Clusters. Ann Arbor: TerraSeer.
  • Jolliffe, I. T. (2002) Principal Component Analysis, 2nd edn. New York: Springer.
  • Joner, M. D., Woodall, W. H. and Reynolds, M. R. (2008) Detecting a rate increase using a Bernoulli scan statistic. Statist. Med., 27, 25552575.
  • Kafadar, K. and Stroup, D. (1992) Analysis of aberrations in public health surveillance data: estimating variances on correlated samples. Statist. Med., 11, 15511568.
  • Kenett, R. S. and Pollak, M. (1996) Data-analytic aspects of the Shiryaev-Roberts control chart: surveillance of a non-homogeneous Poisson process. J. Appl. Statist., 23, 125137.
  • Kleinman, K. P. and Abrams, A. (2006) Assessing surveillance using sensitivity, specificity and timeliness. Statist. Meth. Med. Res., 15, 445464.
  • Kleinman, K. P., Abrams, A., Kulldorff, M. and Platt, R. (2005) A model-adjusted space-time scan statistic with an application to syndromic surveillance. Epidem. Infectn, 133, 409419.
  • Kleinman, K. P., Lazarus, R. and Platt, R. (2004) A generalized linear mixed models approach for detecting incident clusters of disease in small areas with an application to biological terrorism. Am. J. Epidem., 159, 217224.
  • Knorr-Held, L. (2000) Bayesian modelling of inseparable space-time variation in disease risk. Statist. Med., 19, 25552567.
  • Knox, E. G. (1964) The detection of space-time interactions. Appl. Statist., 13, 2529.
  • Ku, W., Storer, R. and Georgakis, C. (1995) Disturbance detection and isolation by dynamic principal component analysis. Chemometr. Intell. Lab. Syst., 30, 179196.
  • Kulldorff, M. (1997) A spatial scan statistic. Communs Statist. Theor. Meth., 26, 14811496.
  • Kulldorff, M. (2001) Prospective time periodic geographical disease surveillance using a scan statistic. J. R. Statist. Soc. A, 164, 6172.
  • Kulldorff, M. (2010) SaTScan version 8.2.1: software for the spatial, temporal, and space-time scan statistics. (Available from http://www.satscan.org.)
  • Kulldorff, M., Heffernan, R., Hartman, J., Assunção, R. and Mostashari, F. (2005) A space-time permutation scan statistic for disease outbreak detection. PLOS Med., 2, 216224.
  • Kulldorff, M., Huang, L., Pickle, L. and Duczmal, L. (2006) An elliptic spatial scan statistic. Statist. Med., 25, 39293943.
  • Kulldorff, M., Mostashari, F., Duczmal, L., Yih, W. K., Kleinman, K. and Platt, R. (2007) Multivariate scan statistics for disease surveillance. Statist. Med., 26, 18241833.
  • Lawson, A. B. (2005) Spatial and spatio-temporal disease analysis. In Spatial and Syndromic Surveillance for Public Health (eds A. B. Lawson and K. Kleinman), pp. 5576. Chichester: Wiley.
  • Lawson, A. B., Clark, A. and Rodeiro, C. L. V. (2003) Developments in general and syndromic surveillance for small area health data. J. Appl. Statist., 31, 951966.
  • Leal, J. and Laupland, K. B. (2008) Validity of electronic surveillance systems: a systematic review. J. Hosp. Infectn, 69, 220229.
  • Le Strat, Y. and Carrat, F. (1999) Monitoring epidemiologic surveillance data using hidden Markov models. Statist. Med., 18, 34633478.
  • Limaye, S. S., Mastrangelo, C. M., Zerr, D. M. and Jeffries, H. (2008) A statistical approach to reduce hospital-associated infections. Qual. Engng, 20, 414425.
  • Lowry, C. A., Woodall, W. H., Champ, C. W. and Rigdon, S. E. (1992) A multivariate exponentially weighted moving average control chart. Technometrics, 34, 4653.
  • Lu, H.-M., Zeng, D. and Chen, H. (2010) Prospective infectious disease outbreak detection using Markov switching models. IEEE Trans. Knowl. Data Engng, 22, 565577.
  • Lucas, J. M. (1985) Counted data CUSUM's. Technometrics, 27, 129144.
  • Lucas, J. M. and Crosier, R. B. (2000) Fast initial response for CUSUM quality-control schemes: give your CUSUM a head start. Technometrics, 42, 102107.
  • Madigan, D. (2005) Bayesian data mining for health surveillance. In Spatial and Syndromic Surveillance for Public Health (eds A. B. Lawson and K. P. Kleinman), pp. 203221. Chichester: Wiley.
  • Mann, A. E. (2009) Estimating the impact of influenza vaccinations and antigenic drift on influenza-related morbidity and mortality in England and Wales using hidden Markov models. PhD Thesis. London School of Hygiene and Tropical Medicine, London.
  • Marshall, C., Best, N., Bottle, A. and Aylin, P. (2004) Statistical issues in the prospective monitoring of health outcomes across multiple units. J. R. Statist. Soc. A, 167, 541559.
  • Marshall, J. B., Spitzner, D. J. and Woodall, W. H. (2007) Use of the local Knox statistic for the prospective monitoring of disease occurrences in space and time. Statist. Med., 26, 15791593.
  • Martínez-Beneito, M. A., Conesa, D., López-Quílez, A. and López-Maside, A. (2008) Bayesian Markov switching models for the early detection of influenza epidemics. Statist. Med., 27, 44554468.
  • McCabe, G. J., Greenhalgh, D., Gettingby, G., Holmes, E. and Cowden, J. (2003) Prediction of infectious diseases: an exception reporting system. J. Med. Informat. Technol., 5, 6774.
  • Miller, B., Kassenborg, H., Dunsmuir, W., Griffith, J., Hadidi, M., Nordin, J. and Danila, R. (2003) Syndromic surveillance for influenzalike illness in ambulatory care network. Emergng Infect. Dis., 10, 18061811.
  • Montgomery, D. C. (2009) Introduction to Statistical Process Control, 6th edn. Hoboken: Wiley.
  • Morton, A. P., Whitby, M., McLaws, M. L., Dobson, A., McElwain, S., Looke, D., Stackelroth, J. and Sartor, A. (2001) The application of statistical process control charts to the detection and monitoring of hospital-acquired infections. J. Qual. Clin. Pract., 21, 112117.
  • Mostashari, F., Kulldorff, M., Hartman, J. J., Miller, J. R. and Kulasekera, V. (2003) Dead bird clusters as an early warning system for West Nile virus activity. Emergng Infect. Dis., 9, 641646.
  • Musonda, P., Hocine, M. N., Andrews, N. J., Tubert-Bitter, P. and Farrington, C. P. (2008) Monitoring vaccine safety using case series cumulative sum charts. Vaccine, 26, 53585367.
  • Naus, J. and Wallenstein, S. (2006) Temporal surveillance using scan statistics. Statist. Med., 25, 311324.
  • Neill, D. B. (2009) An empirical comparison of spatial scan statistics for outbreak detection. Int. J. Hlth Geographics, 8, article 20.
  • Ngai, H.-M. and Zhang, J. (2001) Multivariate cumulative sum control charts based on projection pursuit. Statist. Sin., 11, 747766.
  • Ngo, L., Tager, I. B. and Hadley, D. (1996) Application of exponential smoothing for nosocomial infection surveillance. Am. J. Epidem., 143, 637647.
  • Nobre, F. F. and Stroup, D. F. (1994) A monitoring system to detect changes in public health surveillance data. J. Epidem., 23, 408418.
  • Oakland, J. S. (2008) Statistical Process Control, 6th edn. Oxford: Butterworth–Heinemann.
  • Page, E. S. (1954) Continuous inspection schemes. Biometrika, 41, 100115.
  • Parker, R. A. (1989) Analysis of surveillance data with Poisson regression: a case study. Statist. Med., 8, 285294.
  • Paul, M., Held, L. and Toschke, A. M. (2008) Multivariate modelling of infectious disease surveillance data. Statist. Med., 27, 62506267.
  • Pelat, C., Boëlle, P.-Y., Cowling, B. J., Carrat, F., Flahault, A., Ansart, S. and Valleron, A.-J. (2007) Online detection and quantification of epidemics. BMC Med. Informat. Decsn Makng, 7, article 29.
  • Pignatiello, J. J. and Runger, G. C. (1990) Comparison of multivariate CUSUM charts. J. Qual. Technol., 22, 173186.
  • Qiu, P. H. and Hawkins, D. (2001) A rank-based multivariate CUSUM procedure. Technometrics, 43, 120132.
  • Qiu, P. and Hawkins, D. (2003) A nonparametric multivariate cumulative sum procedure for detecting shifts in all directions. Statistician, 52, 151164.
  • Rath, T. M., Carreras, M. and Sebastiani, P. (2003) Automated detection of influenza epidemics with hidden Markov models. In Advances in Intelligent Data Analysis V (eds M. R. Berthold, H.-J. Lenz, E. Bradley, R. Kruse and C. Borgelt), pp. 521532. Berlin: Springer.
  • Raubertas, R. F. (1989) An analysis of disease surveillance data that uses the geographical locations of the reporting units. Statist. Med., 8, 267271.
  • Reis, B. Y. and Mandl, K. D. (2003) Time series modeling for syndromic surveillance. BMC Med. Informat. Decsn Makng, 3, article 2.
  • Reynolds, M. R. and Stoumbos, Z. G. (2000) A general approach to modeling CUSUM charts for a proportion. IIE Trans., 32, 515535.
  • Roberts, S. W. (1959) Control chart tests based on geometric moving averages. Technometrics, 1, 239250.
  • Roberts, S. W. (1966) A comparison of some control chart procedures. Technometrics, 8, 411430.
  • Robertson, C. (2006) Protecting the leaders: syndromic surveillance for the G8 summit in Scotland. Significance, 3, 6972.
  • Rogerson, P. A. (1997) Surveillance systems for monitoring the development of spatial patterns. Statist. Med., 26, 20812093.
  • Rogerson, P. A. (2001) Monitoring point patterns for the development of space–time clusters. J. R. Statist. Soc. A, 164, 8796.
  • Rogerson, P. A. and Yamada, I. (2004a) Monitoring change in spatial patterns of disease: comparing univariate and multivariate cumulative sum approaches. Statist. Med., 23, 21952214.
  • Rogerson, P. A. and Yamada, I. (2004b) Approaches to syndromic surveillance when data consist of small regional counts. Morb. Mort. Wkly Rep., suppl., 53, 7985.
  • Rolka, H. R. (2011) Preface. Statist. Med., 30, 401402.
  • Rossi, G., Lampugnani, L. and Marchi, M. (1999) An approximate CUSUM procedure for surveillance of health events. Statist. Med., 18, 21112122.
  • Schiöler, L. and Frisén, M. (2011) Multivariate outbreak detection. J. Appl. Statist., to be published.
  • Sebastiani, P., Mandl, K. D., Szolovits, P., Kohane, I. S. and Ramoni, M. F. (2006) A Bayesian dynamic model for influenza surveillance. Statist. Med., 25, 18031816.
  • Sego, L. H., Woodall, W. H. and Reynolds, M. R. (2008) A comparison of surveillance methods for small incidence rates. Statist. Med., 27, 12251247.
  • Serfling, R. (1963) Methods for current statistical analysis of excess pneumonia-influenza deaths. Publ. Hlth Rep., 78, 494506.
  • Shewhart, W. A. (1931) Economic Control of Quality of Manufactured Product. Princeton: Van Nostrand Reinhold.
  • Shiryaev, A. N. (1963) On the detection of disorder in a manufacturing process. Theor. Probab. Applic., 8,247265.
  • Shmueli, G. (2005) Wavelet-based monitoring for modern biosurveillance. Working Paper RHS-06-002. Robert H. Smith School of Business, University of Maryland, College Park. (Available from http://ssrn.com/abstract=902878.)
  • Shmueli, G. and Burkom, H. (2010) Statistical challenges facing early outbreak detection in biosurveillance. Technometrics, 52, 3951.
  • Shore, D. L. and Quade, D. (1989) A surveillance system based on a short memory scheme. Statist. Med., 8, 311322.
  • Sonesson, C. (2003) Evaluations of some exponentially weighted moving average methods. J. Appl. Statist., 30, 11151133.
  • Sonesson, C. (2007) A CUSUM framework for detection of space-time disease clusters using scan statistics. Statist. Med., 26, 47704789.
  • Sonesson, C. and Bock, D. (2003) A review and discussion of prospective statistical surveillance in public health. J. R. Statist. Soc. A, 166, 521.
  • Sonesson, C. and Frisén, M. (2005) Multivariate surveillance. In Spatial Surveillance for Public Health (eds A. B. Lawson and K. Kleinman), pp. 169186. Chichester: Wiley.
  • Sparks, R. (2010) Enhancing road safety through early detection of outbreaks in the frequency of motor vehicle crashes. Safty Sci., 48, 135144.
  • Stern, L. and Lightfoot, D. (1999) Automated outbreak detection: a quantitative retrospective analysis. Epidem. Infectn, 122, 103110.
  • Steutel, F. W. and van Harn, K. (1979) Discrete analogues of self-decomposability and stability. Ann. Probab., 7, 893899.
  • Stroup, D. F., Thacker, S. B. and Herndon, J. L. (1988) Application of multiple time series analysis to the estimation of pneumonia and influenza mortality by age 1962-1983. Statist. Med., 7, 10451059.
  • Stroup, D., Wharton, M., Kafadar, K. and Dean, A. (1993) Evaluation of a method for detecting aberrations in public health surveillance data. Am. J. Epidem., 137, 373380.
  • Stroup, D. F., Williamson, G. D., Herndon, J. L. and Karon, J. M. (1989) Detection of aberrations in the occurrence of notifiable diseases surveillance data. Statist. Med., 8, 323329.
  • Takahashi, K., Kulldorff, M., Tango, T. and Yih, K. (2008) A flexibly shaped space-time scan statistic for disease outbreak detection and monitoring. Int. J. Hlth Geog., 7, article 14.
  • Tango, T. (1995) A class of tests for detecting general and focussed clustering of rare diseases. Statist. Med., 14, 23232334.
  • Tango, T. and Takahashi, K. (2005) A flexibly shaped spatial scan statistic for detecting clusters. Int. J. Hlth Geog., 4, article 11.
  • Thacker, S. B., Parrish, R. G. and Trowbridge, F. L. (1988) A method for evaluating systems of epidemiological surveillance. Wrld Hlth Statist. Q., 41, 1118.
  • Tillett, H. E. and Spencer, I.-L. (1982) Influenza surveillance in England and Wales using routine statistics. J. Hyg. Camb., 88, 8394.
  • Vidal Rodeiro, C. L. and Lawson, A. B. (2006) Monitoring changes in spatio-temporal maps of disease. Biometr. J., 3, 463480.
  • Wagner, A. (2010) Extending Scottish exception reporting systems spatially and temporally. PhD Thesis. University of Strathclyde, Glasgow.
  • Wallenstein, S. (1980) A test for detection of clustering in time. Am. J. Epidem., 111, 367372.
  • Wallenstein, S., Gould, M. and Kleinman, M. (1989) Use of the scan statistic to detect time-space clustering. Am. J. Epidem., 130, 10571064.
  • Wallenstein, S. and Naus, J. (2004) Scan statistics for temporal surveillance for biologic terrorism. Morb. Mort. Wkly Rep., suppl., 53, 7478.
  • Watier, L., Richardson, S. and Hubert, B. (1991) A time series construction of an alert threshold with application to S. bovismorbificans in France. Statist. Med., 10, 14931509.
  • Watkins, R. E., Eagleson, S., Veenendaal, B., Wright, G. and Plant, A. J. (2009) Disease surveillance using a hidden Markov model. BMC Med. Informat. Decsn Makng, 9, article 39.
  • Weiß, C. H. (2009) Categorical Time Series Analysis and Applications in Statistical Quality Control. Berlin: dissertation.de.
  • West, M. and Harrison, J. (1997) Bayesian Forecasting and Dynamic Models, 2nd edn. New York: Springer.
  • Widdowson, M.-A., Bosman, A., van Straten, E., Tinga, M., Chaves, S., van Eerden, L. and van Pelt, W. (2003) Automated laboratory-based system using the internet for disease outbreak detection. Emergng Infect. Dis., 9, 10461052.
  • Wieland, S. C., Brownstein, J. S., Berger, B. and Mandl, K. D. (2007) Automated real-time constant specificity surveillance for disease outbreaks. BMC Med. Informat. Desn Makng, 7, article 15.
  • Williamson, G. and Weatherby Hudson, G. (1999) A monitoring system for detecting aberrations in public health surveillance reports. Statist. Med., 18, 32833298.
  • Winters, P. R. (1960) Forecasting sales by exponentially weighted moving averages. Mangmnt Sci., 6, 324342.
  • Woodall, W. H. (2006) Use of control charts in health care and public health surveillance (with discussion). J. Qual. Technol., 38, 88103.
  • Woodall, W. H., Marshall, J. B., Joner, Jr, M. D., Fraker, S. E. and Abdel-Salam, A.-S. G. (2008) On the use and evaluation of prospective scan methods for health-related surveillance. J. R. Statist. Soc. A, 171, 223237.
  • Yamada, I., Rogerson, P. and Lee, G. (2009) GeoSurveillance: a GIS-based system for the detection and monitoring of spatial clusters. J. Geog. Syst., 11, 155173.
  • Zhang, T. and Lin, G. (2009) Spatial scan statistics in loglinear models. Computnl Statist. Data Anal., 53, 28512858.
  • Zhang, J., Tsui, F.-C., Wagner, M. M. and Hogan, W. (2003) Detection of outbreaks from time series data using wavelet transform. In Proc. American Medical Informatics Association A. Symp., pp. 748752. Madison: Omnipress.
  • Zhou, H. and Lawson, A. B. (2008) EWMA smoothing and Bayesian spatials modeling for health surveillance. Statist. Med., 27, 59075928.