Influenza‐associated mortality in South Africa, 2009‐2013: The importance of choices related to influenza infection proxies

Background Regression modeling methods are commonly used to estimate influenza‐associated mortality using covariates such as laboratory‐confirmed influenza activity in the population as a proxy of influenza incidence. Objective We examined the choices of influenza proxies that can be used from influenza laboratory surveillance data and their impact on influenza‐associated mortality estimates. Method Semiparametric generalized additive models with a smoothing spline were applied on national mortality data from South Africa and influenza surveillance data as covariates to obtain influenza‐associated mortality estimates from respiratory causes from 2009 to 2013. Proxies examined included alternative ways of expressing influenza laboratory surveillance data such as weekly or yearly proportion or rate of positive samples, using influenza subtypes, or total influenza data and expressing the data as influenza season‐specific or across all seasons. Result Based on model fit, weekly proportion and influenza subtype‐specific proxy formulation provided the best fit. The choice of proxies used gave large differences to mortality estimates, but the 95% confidence interval of these estimates overlaps. Conclusion Regardless of proxy chosen, mortality estimates produced may be broadly consistent and not statistically significant for public health practice.


| INTRODUCTION
Estimation of the mortality burden of seasonal and pandemic influenza is important in public health as it can be used to inform the impact of influenza control policies and programs; however, such estimates are not easy to ascertain. Using influenza-coded deaths usually grossly underestimate the burden of influenza-associated deaths, 1 as these deaths are more often complicated by secondary bacterial coinfections or exacerbation of underlying chronic conditions 2 or even cardiac complications. 3 These deaths are usually recorded with an underlying cause of death other than influenza. In South Africa, the mortality risk is further compounded by high HIV prevalence, which puts HIV patients at a much higher risk of influenza-related mortality and other opportunistic infections. 4 This adds another level of uncertainty in the recorded underlying cause of death.
To overcome underestimation of influenza-related deaths, ecological time-series modeling is used to estimate influenza's association with a broader range of coded causes of death. Many of these methods originate from the original Serfling regression approach, 5 with numerous adaptations. 6,7 These approaches are based on the [Correction added on 25 January 2018, after first online publication: Funding information has been added.] The copyright line for this article was changed on 25 January 2018 after original online publication long-recognized elevation in weekly or monthly time series of deaths that occurs when influenza is circulating in populations. The population rate of influenza-associated deaths is obtained by estimating excess mortality by subtracting a seasonally varying background rate of noninfluenza deaths from observed death rates. 5 A more convincing modeling approach includes independent covariates representing trends in incidence of influenza infection in the model. Influenza-associated excess mortality can then be estimated using the parameter estimates of the influenza variables of the model, which represent the relationship between observed trends in influenza and the modeled mortality outcomes. Indicators or proxies for the incidence of influenza infection include time series of laboratory-confirmed influenza infections in influenza-like illness (ILI) surveillance from sentinel health facilities. [8][9][10] Laboratory surveillance time series can be expressed in various ways. Firstly, proxies could be expressed as weekly or monthly counts of total influenza-positive samples or proportion positive of samples tested. Using proportions standardizes the data, automatically adjusting for changes in testing frequency and laboratory procedures over time. 11,12 On the other hand, the proportion may be influenced by factors other than influenza-the denominator may vary due to the incidence of non-influenza causes of acute respiratory infection.

Proxies can be further disaggregated by influenza virus types (A and B) and subtypes of influenza A, such as A(H3N2) and A(H1N1)
pdm09, if laboratory surveillance data are sufficiently refined. While several influenza types and subtypes (henceforth denoted as (sub) types) cocirculate, relative timing and incidence can vary. 8 In addition, virus virulence can vary from season to season, even within the same type or subtype. 13 To account for varying virulence, a separate influenza-incidence proxy can be used for each season. 14 It is unknown which proxy provides the best estimates in ecological time-series studies. Several studies estimating influenzaassociated mortality in South Africa used monthly proportions of total influenza-positive viral samples, with the denominator being the total annual number of samples tested. [15][16][17] The aim of this study is to look at how estimated influenza-associated mortality rates are affected by the choice of influenza proxies used.

| Study period and setting
The study period included 260 weeks (Monday-Sunday) from 5 January 2009 to 29 December 2013 inclusive. The setting was the nation of South Africa with a population of about 53 million in 2013.

| Mortality data and population denominators
All-season: influenza proxy for whole study period is analyzed as a whole. b Season-specific: influenza proxy for each season analyzed as individual covariates. c All-influenza: influenza proxy of weekly positive samples is not subtyped but considered as a whole. d Influenza (sub)types: proxy consisting of influenza types and subtypes analyzed as individuals covariates-A(H1N1)pdm09, A(H3N2), and B. e Weekly proportion: proxy calculated as total weekly samples testing positive for influenza viruses divided by total weekly samples tested. f Yearly proportion: proxy calculated as total weekly samples testing positive for influenza viruses divided by total yearly samples tested. g Rate: proxy calculated as total weekly samples testing positive for influenza viruses divided by weekly population in South Africa.
Population denominators for rates were obtained from mid-year population estimates from Statistics South Africa 19 and were linearly interpolated to provide denominators for weekly death rates.

| Influenza surveillance data and proxies
Information on circulating influenza viruses was obtained from the South African Severe Acute Respiratory Illness (SARI) surveillance program, which is a prospective, hospital-based sentinel surveillance program that covers 4 of 9 provinces in South Africa. 20 Respiratory samples were tested by real-time reverse-transcription polymerase chain reaction, and influenza viruses identified were typed as influenza A or B. Influenza A viruses were subtyped into influenza A(H1N1) pdm09 or influenza A(H3N2).
The 3 influenza-incidence proxy variables we evaluated are summarized in Table 1. First, weekly population rates of influenza-positive specimens were used to account for changing population size. The other 2 methods were calculated as proportions of positive specimens.
The "weekly proportion" was simply the weekly number of specimens testing positive for each influenza (sub)type by the total number of specimens tested for that week. The "annual proportion" was calculated by dividing the weekly number of specimens testing positive for each (sub)type by the total number of specimens tested in the year of analysis.
We also looked at allowing the relationship between viral proxies and influenza-associated mortality to vary by year, to accommodate annual changes in virus virulence or changes in the surveillance program. Each influenza (sub)type in a particular season therefore is represented by an individual covariate, which is set to 0 in all other seasons. 14 Proxy variables for these were termed "season-specific" as opposed to "all-season," which considered a single viral proxy variable for each virus category across all seasons.

| Statistical analysis
Statistical analysis was conducted using the generalized additive model (GAM) procedure in SAS 9.3 (SAS Institute Inc., Cary, NC, USA). Weekly rates of influenza-associated deaths from respiratory causes were estimated using a semiparametric GAM. Independent variables of influenzaincidence proxies were parametrically and linearly related to respiratory mortality incidence. We used a natural cubic smoothing spline of weeks to account for the time trend in non-influenza-associated (background) mortality. In an Australian study, the GAM model formulation provided an improved model fit compared with the more conventional trigonometric (sinusoidal) background mortality estimation approach. 21 The final model equations for each of the proxy definitions above are as follows: 1. Influenza (sub)type, all-season formulation: E(mortality rate) = β 0 + β 1 t +β 2 (Influenza A(H1N1)) +β 3 (Influenza A(H3N2)) +β 4 (Influenza B) + spline(t)

4.
All-influenza, all-season formulation: where E(mortality rate) was the expected respiratory mortality rate, t was the sequential week number of the weekly time series, Influenza A(H1N1) was the proxy for influenza A(H1N1)pdm09, Influenza A(H3N2) was the proxy for influenza A(H3N2), Influenza B was the proxy for influenza B, all-influenza was the aggregated proxy for all 3 influenza viruses, and spline was the spline curve for t and was specified with 31 degrees of freedom which achieved a degree of control for autocorrelation (r < .2). One degree of freedom is allocated to the parametric linear time variable (β 1 t), and the remaining degrees of freedom are distributed at 6 per year for the spline. 21 Estimated background mortality was calculated from the model by setting all the influenza variables to zero and introducing them into the fitted model formula. The estimated influenza-associated excess death rate in each week was determined by multiplying the influenza parameter estimate (β) by its respective influenza surveillance proxy value. Negative influenza parameter estimates led to negative mortality estimates, which are biologically meaningless, 14 and were regarded as zero when aggregating estimates for (sub)type-specific influenza.
The 95% confidence interval (CI) of the estimated mortality rate was obtained by multiplying the influenza surveillance proxy value using each of the upper and lower 95% CI as shown below: where SE is the standard error of the parameter estimate, β.
The standard error for the 95% CI for aggregate (sub)type-specific estimates was calculated as: where SE 1 …SE n are standard errors of aggregated proxies used.
An equivalent procedure was used to estimate the 95% CI for the annual mean influenza-attributable mortality estimates.

| Influenza surveillance
A mean of 4330 respiratory specimens were tested annually for influenza viruses. The mean annual number of specimens that tested positive for one or more influenza viruses was 332 (7.7%). Over the study period, different influenza (sub)types circulated at varying times throughout the year and their relative annual proportions are shown in Figure 1. Time series of influenza surveillance as weekly proportion, yearly proportion, and population rate is shown in Figure S1. In 2009, influenza virus activity occurred in 2 peaks; influenza A(H3N2) dominated from May to July, and a second peak occurred from August to September with the introduction of influenza A(H1N1)pdm09.

| Observed mortality rate
During the study period, 302 112 respiratory deaths were recorded.
The mean respiratory mortality weekly rate over the study period was 2.29 per 100 000 population for all ages, 1.69 per 100 000 population for persons aged <65 years, and 13.49 per 100 000 population for persons aged ≥65 years. A marked downward trend in mortality rate can be seen in the all age and <65 years age groups ( Figure S2).

| Goodness-of-fit model
The best model fit as assessed by the lowest RMSE value in both age groups and all ages combined was obtained by using weekly proportion of influenza (sub)types as proxies ( Table 2). Best fit was not confined to all-season or season-specific proxies. In the all-age group, the use of season-specific proxy variables gave better fit than all-season proxy variables. In the 2 age-stratified groups, all-season proxy variables improved fit.

| Influenza-associated deaths
Models that use either yearly proportion or rate of season-specific influenza (sub)types and all-influenza proxies gave consistently similar estimates in all 3 age groups. The remaining model formulations, however, gave different mortality estimates. While point estimates were different across the models, the 95% confidence intervals overlapped (Tables 3-5).
In the all-age analysis, the annual mean mortality estimates ranged from 2.58 to 4.66 per 100 000 population. Confidence intervals of the annual mean mortality estimates overlapped with the lowest and highest limit at 1.21-7.30 per 100 000 population, respectively (Table 3).
In the 2 age-stratified groups, the range of annual mean estimates was wider, with the ≥65 years age group experiencing much higher mortality rates. Confidence intervals of annual mean mortality for these 2 age groups also overlapped and included 0. For the <65 years age group, annual mean mortality estimates ranged from 0.75 to 3.39 per 100 000 population and the lowest and highest confidence limits were −0.50 and 3.92, respectively ( Excess deaths by (sub)type are shown in Figure S2. A direct relationship between dominant influenza (sub)type in circulation with excess mortality in the year was observed with the exception in 2012 where excess deaths were observed to be due to influenza A(H3N2) even though dominant strain in circulation that year was influenza B.
When examining mortality estimates contributed by each influenza (sub)type proxy across all models and age groups, point estimates differed across models, but their 95% confidence intervals generally  (Tables S1-S6).

| DISCUSSION
Using South African SARI surveillance data, our study explored alternative influenza-incidence proxies and their effect on model fit and respiratory mortality estimates. We tested model formulations considering  Annual mean mortality rate is calculated by averaging aggregated annual total mortality rates over 5 y.
b Influenza proxy for study period is analyzed as a whole.
c Annual mortality rate obtained by aggregating mortality rate contributed by individual influenza types and subtypes, except when mortality estimate is negative in which case, it is considered as zero for that year in the calculation as negative estimates are biologically non-meaningful.  Annual mean mortality rate is calculated by averaging aggregated annual total mortality rates over 5 y.
b Influenza proxy for study period is analyzed as a whole.
c Annual mortality rate obtained by aggregating mortality rate contributed by individual influenza types and subtypes, except when mortality estimate is negative in which case, it is considered as zero for that year in the calculation as negative estimates are biologically non-meaningful.  Annual mean mortality rate is calculated by averaging aggregated annual total mortality rates over 5 y.
b Influenza proxy for study period is analyzed as a whole.
c Annual mortality rate obtained by aggregating mortality rate contributed by individual influenza types and subtypes, except when mortality estimate is negative in which case, it is considered as zero for that year in the calculation as negative estimates are biologically non-meaningful. provided the best model fit for the age-specific models. Among all the best-fitting model formulations, the weekly proportion proxy combined with (sub)type-specific and either season-specific and all-season proxies produced reasonably higher model fit. This would suggest that the more important choice may be whether to use type or season-specific proxies rather than whether to use a proportion or a rate. Regardless of proxy formulation, none of the point estimates were statistically significantly different from each other for a given age category.
We did not evaluate a proxy of absolute counts of influenzapositive samples. Assuming testing practices were constant over time, counts should be a reasonable proxy. The population rates we used should be almost equivalent to counts except they adjusted for changes in total underlying population at risk. However, this did not account for changes in surveillance practice or geographic coverage, which were known to occur during the study period. Surprisingly, though, estimates from the population rate proxy did not differ substantially from the yearly proportion proxy. If the number of specimens tested was available, a preferred method may then be to use the proportion positive as a weekly or yearly proportion, which inherently would adjust for the impact of testing and health-seeking behavior 12 and would be independent of population size. On the other hand, proportions are sensitive to the size of the denominator which for influenza tests may vary according to the incidence of other pathogens circulating that cause influenza-like illness. Small numbers of tests can also create problems. For example, sentinel surveillance systems with small numbers of participating sites could have smaller denominators.
One positive specimen of 2 specimens tested at the start, or outside, of the influenza season produces a proportion of 50%, even though there is low influenza activity. In previous South African studies, laboratory surveillance data were provided monthly and proxies were calculated as yearly proportions to adjust for possible bias such as different laboratory methods and different specimen sampling over time due to sampling behavior as well as varying changes in total annual number of samples tested. 11,15,16 In our study, the surveillance data had adequate numbers and it was done on a weekly basis.
We tried season-and (sub)type-specific viral proxies to adjust for any potential year-to-year differences between the level of influenza activity and the resulting disease burden. 14,23 Such differences could be due to variation in virulence arising from antigenic drift of influenza strains, 24 but also differences in population susceptibility or vaccination coverage and effectiveness over the years. Much wider con- Can be unstable if denominator is a small count. Does not vary if infection incidence increases but proportion positive is the same. Only possible if number of tests is available.

Yearly proportion
Adjusts for year-to-year changes in surveillance practice or coverage. Less subject to small denominator problems than weekly proportion.
Does not adjust for within-year changes in testing behavior. Does not vary if infection incidence increases for all weeks but proportion positive is the same. Requires whole years of surveillance data. Only possible if number of tests is available.

Weekly rate
Adjusts for population changes. Reflects absolute incidence of infection, if testing behavior or surveillance coverage unchanged.
Does not adjust for testing behavior or surveillance coverage over time

Influenza (sub) types
Higher resolution according to differing epidemiology and virulence of various (sub)types. May provide information for assessing vaccine effectiveness by component strains.

Reduced model parsimony.
Reduced statistical power per (sub)type compared with combined influenza series. Proportion of specimens typed or subtyped can vary over time.

All-influenza
Summary impact of overall influenza. Does not require (sub)type information. Larger counts may lead to greater statistical power to detect an association with mortality.
Does not provide vaccine strain component information.

Season-specific
Accounts for year-to-year changes in virus virulence, surveillance system coverage, and testing behavior.
Reduced statistical power compared with an all-season proxy.
All-season Higher statistical power to detect an association between influenza and mortality compared with season-specific variables.
Does not account for year-to-year changes in virus virulence, surveillance system coverage, and testing. So what proxies should be used in practice? We only evaluated proxies with 1 data source in 1 setting. Nevertheless, our findings can be used to provide guidance in other scenarios or to analyze other influenza disease burden outcomes such as hospitalization and intensive care admissions. In this case, our study could inform how influenza proxies should be formulated. We also stress that there is a need to understand how disease burden outcomes are identified and recorded by surveillance systems in order to appropriately adjust for any biases.
We have summarized the advantages and disadvantages of using the various influenza proxy formulations in Table 6.
This study had several limitations. Our study period since the introduction of the pandemic strain only consisted of 5 years, which was relatively short. A longer period would be more ideal to fully encapsulate and understand how influenza-associated mortality would change over the years. The data quality of the SARI surveillance data was fairly robust but 1 criticism of the program was its representativeness. 20 The SARI surveillance program only covered 4 out of 9 provinces in the country in 6 sentinel sites, and therefore, it was questionable if the data could nationally represent the viral circulation activity in the population. Also, the number of surveillance sites and output from some of the sites changed over the study period. In addition, there may be a sampling bias toward critically ill patients who required hospitalization and it was not known what proportion of the population this group of patients constituted. Its representation of viral circulation in the population could be overestimated. Other proxies such as ILI for nonsevere cases in the population could be incorporated into the model to balance out the data bias for influenza viral activity. In addition, there have been other studies that explored the interaction of ILI and laboratory surveillance data as a more representative proxy for influenza incidence in the population. 10 Besides influenza, there was evidence that other viruses such as respiratory syncytial virus (RSV), parainfluenza, and noroviruses may cocirculate with influenza and contributed to excess winter mortality in the elderly, 27 although previous studies had shown that RSV and influenza did not cocirculate in South Africa at the same time. 15,16 Nevertheless, our influenza-associated mortality estimates could have been influenced by other confounding pathogens.
In conclusion, our study provides guidance on choosing appropriate influenza-incidence proxies for use in estimating the mortality burden of influenza. Improved model fit through use of (sub)type-specific proxies suggests that the different mortality risk associated with each of influenza A(H1N1)pdm09, A(H3N2), and B appears to be an important factor in estimating mortality. However, varying mortality risk from season to season among influenza strains does not appear to be a clear factor from our study. Regardless of proxy chosen, estimates produced may be broadly consistent and not statistically significantly different. Thus, the large mortality health burden attributable to influenza will still be evident.