Forecasting influenza outbreak dynamics in Melbourne from Internet search query surveillance data

Background Accurate forecasting of seasonal influenza epidemics is of great concern to healthcare providers in temperate climates, as these epidemics vary substantially in their size, timing and duration from year to year, making it a challenge to deliver timely and proportionate responses. Previous studies have shown that Bayesian estimation techniques can accurately predict when an influenza epidemic will peak many weeks in advance, using existing surveillance data, but these methods must be tailored both to the target population and to the surveillance system.
Objectives Our aim was to evaluate whether forecasts of similar accuracy could be obtained for metropolitan Melbourne (Australia).
Methods We used the bootstrap particle filter and a mechanistic infection model to generate epidemic forecasts for metropolitan Melbourne from weekly Internet search query surveillance data reported by Google Flu Trends for 2006–14.
Results and Conclusions Optimal observation models were selected from hundreds of candidates using a novel approach that treats forecasts in a manner akin to receiver operating characteristic (ROC) curves. We show that the timing of the epidemic peak can be accurately predicted 4–6 weeks in advance, but that the magnitude of the epidemic peak and the overall burden are much harder to predict. We then discuss how the infection and observation models and the filtering process may be refined to improve forecast robustness, thereby improving the utility of these methods for healthcare decision support.


S2 Particle filter
Particles are initially assigned uniform weights w_i (Eq. S12), which are subsequently adjusted in response to each observation y_t (Eq. S13) and normalised so as to sum to one (Eq. S14). When the effective number of particles (Eq. S15) drops below the threshold N_min (Table S2), the particles are resampled in proportion to their weights; this is done using the systematic (deterministic) method, as described by Kitagawa (1996, see appendix).
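The weight update, effective-sample-size check and systematic resampling step can be illustrated as follows. This is a minimal Python sketch under our reading of Eqs. S12–S15; the function names and vectorised form are illustrative, not the paper's implementation.

```python
import numpy as np

def effective_sample_size(weights):
    # N_eff = 1 / sum(w_i^2) for normalised weights (Eq. S15).
    return 1.0 / np.sum(weights ** 2)

def systematic_resample(weights, rng=None):
    # Systematic (deterministic) resampling (Kitagawa, 1996): a single
    # uniform offset u ~ U[0, 1/N) generates N evenly spaced points,
    # which are mapped onto the cumulative distribution of the weights.
    n = len(weights)
    rng = np.random.default_rng() if rng is None else rng
    positions = (rng.random() + np.arange(n)) / n
    cumsum = np.cumsum(weights)
    cumsum[-1] = 1.0  # guard against floating-point rounding
    return np.searchsorted(cumsum, positions)

def filter_step(weights, likelihoods, n_min):
    # Adjust weights by the observation likelihood (Eq. S13) and
    # normalise (Eq. S14); resample when N_eff drops below N_min,
    # resetting the weights to uniform (Eq. S12).
    w = weights * likelihoods
    w /= w.sum()
    if effective_sample_size(w) < n_min:
        idx = systematic_resample(w)
        w = np.full(len(w), 1.0 / len(w))
        return idx, w
    return np.arange(len(w)), w
```

Resampling in proportion to the weights means a particle with (nearly) all of the weight is duplicated across the whole ensemble, while negligible-weight particles are discarded.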

S3 Observation model
Google Flu Trends reports ILI prevalence as integer counts of ILI cases per 100,000 GP visits, and so we can express the daily probability of an individual presenting with ILI in the absence of influenza (p_bg) as a function of the (imposed) background ILI rate B_R (Table S3) and the daily number of GP visits per individual (v_daily) (Eq. S16). The value of v_daily was obtained from the reported annual rate of 5,615 GP attendances per 1,000 population (Department of Health & Human Services, 2013).
The probability that an infected individual will visit a GP and be identified as having ILI (p_id) is a parameter of the observation model (Table S3). The probability that an individual will be identified as an ILI case over some time interval [t − ∆, t] is the sum of the probabilities of two mutually exclusive events (Eq. S17): becoming infectious (p_inf) and being identified (p_id), or not becoming infectious but presenting with background ILI. The probability of becoming infectious is defined as the fraction of the population that became infectious (i.e., transitioned from state E to state I) during the time interval (Eq. S18), and subsumes both symptomatic and asymptomatic infections. In the absence of reliable data to the contrary, both types of infection are assumed to be identically infectious. The observation probability (p_id) therefore represents the probability of an infection being symptomatic and observed.
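One plausible reading of this construction can be sketched in Python. The functional form of p_bg below is our own assumption (the exact expression is Eq. S16), and v_daily is derived from the reported annual rate of 5,615 attendances per 1,000 population.

```python
def ili_probability(p_inf, p_id, B_R, v_daily=5_615 / 1_000 / 365):
    # p_bg: daily probability of presenting with ILI in the absence of
    # influenza, from the imposed background rate B_R (cases per
    # 100,000 GP visits) and the daily GP-visit rate (cf. Eq. S16).
    p_bg = (B_R / 100_000) * v_daily
    # Either the individual becomes infectious and is identified, or
    # they do not become infectious but present with background ILI:
    # the two mutually exclusive cases of Eq. S17.
    return p_inf * p_id + (1 - p_inf) * p_bg
```

With p_inf = 0 (no transmission) this reduces to the background presentation probability alone, as expected outside the epidemic season.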
A negative binomial distribution is used to define the likelihood of obtaining an ILI presentation rate y_t from a given particle x_t (Eq. S19). We use the parameter κ to convert ILI incidence from cases per 100,000 GP visits to population incidence (Eq. S20). The dispersion parameter k (Eq. S21) controls the variance: as k increases, the variance decreases and the distribution approaches the Poisson distribution.
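Under a mean–dispersion parameterisation, this likelihood can be sketched as follows (stdlib-only Python). The parameterisation with mean mu and dispersion k gives variance mu + mu²/k, matching the limiting behaviour described above; the exact form of Eqs. S19–S21 should be taken from the supplement's equations.

```python
from math import exp, lgamma, log

def nbinom_pmf(y, mu, k):
    # Negative binomial pmf with mean mu and dispersion k, so that the
    # variance is mu + mu**2 / k; as k grows the variance tends to mu
    # and the distribution approaches the Poisson.
    log_pmf = (lgamma(y + k) - lgamma(k) - lgamma(y + 1)
               + k * log(k / (k + mu)) + y * log(mu / (k + mu)))
    return exp(log_pmf)
```

Computing the pmf on the log scale via `lgamma` avoids overflow for the large counts that arise at epidemic peaks.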

S4 Forecast variance and accuracy
Here we show how the forecast variance and accuracy change over the weeks leading up to the observed epidemic peak in each calendar year. As shown in Figure S1, the variance of both the peak size and peak timing forecasts decreases substantially as the forecasting date approaches the date of the true epidemic peak. Note that the y-axis is logarithmic and differs in scale between the two plots, since the peak time forecasts are more precise than the peak size forecasts.
Forecast accuracy increases as the forecast variance decreases (Figure S2), and so accuracy increases as the true peak is approached (Figure S3). For example, when using a threshold of ±10 days as the definition of an "accurate" prediction of the peak timing (equivalent to ±1 week, when aggregating dates into weekly bins), in 4 of the calendar years under consideration more than 50% of the weighted forecasts accurately predicted the timing of the peak 5 weeks in advance.
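The "more than 50% of the weighted forecasts within ±10 days" criterion amounts to a weighted hit rate over the forecast ensemble. The helper below is a hypothetical illustration; `weighted_accuracy` and its arguments are our own names, not the paper's.

```python
import numpy as np

def weighted_accuracy(predicted_peaks, weights, true_peak, tolerance):
    # Fraction of the forecast ensemble's weight assigned to particles
    # whose predicted peak lies within +/- tolerance of the true peak
    # (e.g. tolerance = 10 days for the peak timing criterion).
    predicted_peaks = np.asarray(predicted_peaks, dtype=float)
    weights = np.asarray(weights, dtype=float)
    hits = np.abs(predicted_peaks - true_peak) <= tolerance
    return float(weights[hits].sum() / weights.sum())
```

The same function applies to the peak size criterion by passing a relative tolerance (e.g. 10% of the true peak size) as `tolerance`.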

Figure S1: The ensemble forecast variances steadily decrease as the true epidemic peak is approached; the variance (log10) of the predicted size is shown on the left, and the variance (log10) of the predicted time on the right. Results are shown for B_R = 300, p_id = 0.05 and k = 10; three years are excluded on the grounds that the epidemic peak was either very small (2010), absent (2011), or occurred after an earlier, smaller peak (2014).

Figure S2: The accuracy of the peak timing forecasts (left; ±7, 10, 14 days) and the peak size forecasts (right; ±10%, 20%, 33%) increases as the forecast variance decreases (i.e., as forecast precision increases). Forecast accuracy is shown for three tolerance levels, relative to the true peak time (left) and peak size (right). Results are shown for the same forecasts as in Figure S1.

Figure S3: The accuracy of the peak timing forecasts (left) and peak size forecasts (right) increases as the true epidemic peak is approached; dashed lines show 50% accuracy. Results are shown for the same forecasts as in Figure S1.

S5 Particle filter robustness
In an effort to avoid particle degeneracy (as observed, e.g., in the 2014 forecasts for k = 10 and k = 100), we explored a number of approaches to improve filter robustness. We tested the normal distribution observation model used by Shaman and Karspeck (2012), where the variance is a function of the mean ("N S&K", Figures S4-S5). However, the variance was sufficiently large to reduce forecast performance in every year except 2013 and 2014. We also explored similar observation models in which the magnitude of the variance was reduced ("N #1" . . . "N #4", Figures S4-S5) and observed that this did not improve forecasting performance in 2013 and 2014.
Increasing the number of particles five-fold ("N_px = 18K") and/or decreasing the resampling threshold ("N_eff > 25%") did not improve forecasting performance in 2013 and 2014 (as shown for the "N #4" observation model in Figures S6-S7).
We also tested two normal distributions where the variance was estimated from data in the early phase of the season: first, using all values < 450 (σ = 100); and second, using all values < 800 (σ = 170). Neither improved forecasting performance in 2013 and 2014, even when the resampling threshold was decreased ("N_eff > 25%"; Figures S8-S9).
We also tested the performance of our original (negative binomial) observation model when the number of particles was increased five-fold ("N_px = 18K") and when the resampling threshold was reduced ("N_eff > 25%" and "N_eff > 50%") but, as shown in Figures S10-S11, this did not improve forecasting performance in 2013 and 2014.
For each of these modifications to the particle filter, we also plotted the resulting accuracy of the peak timing forecasts over the weeks prior to the true peak (with a tolerance of ±10 days) against the accuracy of the original method ("Original (k=10)" and "Original (k=100)") in Figures S12-S19.

Figure S6: A comparison of our original forecasts (k = 10) with forecasts generated using a Gaussian observation model ("N #4") where the number of particles was increased five-fold ("N_px = 18K") and with a lower resampling threshold ("N_px = 18K, N_eff > 25%").

Figure S7: A comparison of our original forecasts (k = 100) with forecasts generated using a Gaussian observation model ("N #4") where the number of particles was increased five-fold ("N_px = 18K") and with a lower resampling threshold ("N_px = 18K, N_eff > 25%").

Figure S8: A comparison of our original forecasts (k = 10) with forecasts generated using Gaussian observation models with constant variances ("N (σ = 100)" and "N (σ = 170)"), and with a lower resampling threshold ("N (σ = 170), N_eff > 25%").
Figure S10: A comparison of our original forecasts (k = 10) with forecasts generated using the original model where the number of particles was increased five-fold ("N_px = 18K") and where the resampling threshold was decreased ("N_eff > 25%" and "N_eff > 50%").