Spatiotemporal reconstruction and transmission dynamics during the 2016–17 H5N8 highly pathogenic avian influenza epidemic in Italy

Summary Effective control of avian diseases in domestic populations requires understanding of the transmission dynamics facilitating viral emergence and spread. In 2016–17, Italy experienced a significant avian influenza epidemic caused by a highly pathogenic A(H5N8) virus, which affected domestic premises housing around 2.7 million birds, primarily in the north‐eastern regions with the highest density of poultry farms (Lombardy, Emilia‐Romagna and Veneto). We perform integrated analyses of genetic, spatiotemporal and host data within a Bayesian phylogenetic framework. Using continuous and discrete phylogeography, we estimate the locations of movements responsible for the spread and persistence of the epidemic. The information derived from these analyses on rates of transmission between regions through time can be used to assess the success of control measures. Using an approach based on phylogenetic–temporal distances between domestic cases, we infer the presence of cryptic wild bird‐mediated transmission, information that can be used to complement existing epidemiological methods for distinguishing transmission within the domestic population from incursions across the wildlife–domestic interface, a common challenge in veterinary epidemiology. Spatiotemporal reconstruction of the epidemic reveals a highly skewed distribution of virus movements with a high proportion of shorter distance local movements interspersed with occasional long‐distance dispersal events associated with wild birds. We also show how such inference be used to identify possible instances of human‐mediated movements where distances between phylogenetically linked domestic cases are unusually high.


| INTRODUC TI ON
In 2016-17, a highly pathogenic avian influenza (HPAI) epidemic caused by A(H5N8) viruses of clade 2.3.4.4 occurred in 29 European Countries. This was the largest HPAI epidemic ever recorded in Europe in terms of number of poultry outbreaks, wild bird deaths recorded and geographical extent (Adlhoch et al., 2018;Brown, Kuiken, et al., 2017;.
Between October 2016 and December 2017, approximately 1,500 cases were identified in wild birds and more than 1,200 outbreaks were reported in poultry across Europe, with frequent primary incursions affecting all productive sectors, ranging from backyard/ rural to industrial. Secondary spread between domestic premises was also detected, most commonly associated with domestic waterfowl species (Brown, Kuiken, et al., 2017;. The first HPAI cases in Italy were identified in late December 2016 with two A(H5N8) and two A(H5N5) HPAI viruses confirmed in dead wild birds found in north-eastern Italy between 30 December 2016 and 23 January 2017 (Fusaro et al., 2017). While H5N5 subtype viruses were not further detected, A(H5N8) HPAI spread widely in northern Italy leading to several cases in both domestic poultry and wild birds (Fusaro et al., 2017). In contrast to the rest of Europe, Italy experienced two distinct epidemic waves during 2016-2017 which presented different characteristics . The first wave which comprised 16 cases confirmed in poultry and five in wild birds (excluding two A(H5N5) cases) started in late December 2016 and continued until late May 2017 following an atypical epidemic curve, with sporadic weekly reported cases and a limited involvement of wild birds in discordance with observations in other European Countries (Globig et al., 2017;Verhagen et al., 2017). Most cases were detected at the periphery of the most densely populated poultry area (DPPA) in the northern Italian regions of Veneto, Lombardy and Emilia-Romagna.
The second epidemic wave started in the second half of July 2017 and ended by mid-December of the same year (Brown, Kuiken, et al., 2017;Mulatti et al., 2018). Sixty-seven outbreaks were observed in poultry farms, mainly fattening turkey premises, while only seven cases were confirmed in wild birds. During the second wave, the virus spread to the DPPA in the Po Valley, where most of the cases were located. Sporadic, isolated outbreaks were also notified in Piedmont, and Lazio, in western and central Italy, respectively. The second epidemic wave presented a classic occurrence pattern for an infectious disease spreading in a densely populated area, with evidence of lateral spread leading to clusters of interconnected outbreaks .
For rapidly evolving viruses, the time scale of epidemic spread often enables integrated reconstruction of both the evolutionary history and spatial processes underlying an epidemic (Dellicour, Rose, & Pybus, 2016;Lemey, Rambaut, Drummond, & Suchard, 2009;Pybus, Tatem, & Lemey, 2015). Phylogeographic inference has become a commonly used technique for the study of virus dispersal over the course of an epidemic in either discrete (Bedford et al., 2015;Brunker et al., 2018;Dudas et al., 2017;Hall, Knowles, Wadsworth, Rambaut, & Woolhouse, 2013;Lemey et al., 2009;Trewby, Nadin-Davis, Real, & Biek, 2017) or continuous space (Brunker et al., 2018;Dellicour, Baele, et al., 2018;Dellicour, Vrancken, Vrancken, Trovão, Fargette, & Lemey, 2018;Jacquot, Nomikou, Palmarini, Mertens, & Biek, 2017). Although the grouping of locations implicit in discrete phylogeography is often arbitrary, the boundaries used may correspond to borders around which control measures are structured and these analyses can be useful for quantifying transmission rates across national or international borders (Dudas et al., 2017;Trewby et al., 2017). Continuous phylogeography which models change in latitude and longitude as a diffusion process along branches of the phylogeny can provide a more detailed vision of the dispersal process, though requires more comprehensive location data and assumes the dispersal process maintain a relationship with geographical distance (Dellicour, Vrancken, et al., 2018), an assumption that may generally not be the case for human influenza viruses which have been shown to follow air transportation migration routes at a global scale (Lemey et al., 2014;Bedford et al., 2015). Owing to this in part, discrete phylogeography has often been applied in the study of influenza viruses in general (Bedford et al., 2015;Lemey et al., 2009;Lu, Leigh Brown, & Lycett, 2017;Lu, Lycett, & Brown, 2014;Lycett et al., 2012;Magee, Beard, Suchard, Lemey, & Scotch, 2015;The Global Consortium for H5N8 and Related Influenza Viruses, 2016). The discrete trait modelling framework used for discrete phylogeography has also been used to reconstruct ancestral hosts across phylogenies for a range of viruses, including HPAI H5N8 (Black, Breyta, Bedford, & Kurath, 2016;Dudas, Carvalho, Rambaut, & Bedford, 2018;Hall et al., 2013;Lu et al., 2014;The Global Consortium for H5N8 and Related Influenza Viruses, 2016;Trovão, Suchard, Baele, Gilbert, & Lemey, 2015). Such reconstructions enable various insights, for example, on source-sink dynamics and the quantification of the proportion of evolution occurring within different host species  or the support for epidemiological links between host species associated with outbreaks of avian influenza (Lu et al., 2014).
Avian influenza outbreaks are typically epidemiologically complex in nature, as they may involve multiple wild bird species that vary in spatial ecology and clinical disease severity, and domestic incursions may affect various species that differ in housing, susceptibility and connectedness. Possible routes of transmission include migratory and local movements of wild birds, farm-to-farm transmission between holdings in close proximity and human-mediated transmission involving transport of infected birds or sharing of vehicles or personnel between holdings (Brown, Kuiken, et al., 2017;Mulatti et al., 2018). The integration of various data types using phylodynamic analyses can shed insight into transmission dynamics and the processes shaping an epidemic.
For example, the contribution of wild birds, different poultry species and production types to the onwards transmission and the routes of transmission responsible for geographical expansion of an epidemic may be elucidated. Ultimately, a better understanding of pathogen dispersal and transmission during the course of an epidemic may help to assess existing prevention and intervention measures and shape future strategies.
In this paper, we present integrated phylogenetic and phylodynamic analyses of genetic, geographical and host data to reveal features of the 2016-17 A(H5N8) HPAI epidemic in Italy. We reconstruct the geographical pattern of pathogen dispersal using both continuous and discrete phylogeographic approaches. We also reconstruct host types across the phylogeny describing the epidemic to allow significant host type-to-host type routes of transmission to be identified.
To account for likely under-sampling of the epidemic in wild birds, we infer the presence of cryptic wild bird-mediated transmission across the phylogeny through consideration of evolutionary and geographical distances between observed cases. Transition rates between geographical regions and host types are quantified over the duration of the epidemic. Finally, relationships between host types and reconstructed movement distances are explored allowing inferences on the production types associated with virus dispersal to be determined.

| Data
Virus sequences and epidemiological data for 83 domestic cases and 12 wild birds were analysed. Epidemiological investigations, the collection of dead birds and sequencing were carried out as described in our previous work (Fusaro et al., 2017;Mulatti et al., 2018). Virus sequences have been deposited in the GISAID EpiFlu Database (https ://platf orm.gisaid.org/) in association with the aforementioned studies. The accession numbers of sequences used for analysis in the paper are as follows: EPI1261344-EPI1261994, EPI961493-EPI961507, EPI961509-EPI961522, EPI961524-EPI961527, EPI1040223-EPI1040238, EPI1081918-EPI1081924 and EPI1081966-EPI1081973. For each Case ID, date, host type, geographical region and province are detailed in Table S1.

| Phylogenetic analysis
Time-resolved phylogenetic trees were reconstructed using BEAST v1.10.4 . Sequence evolution was modelled using a general time reversible model with a proportion of invariant sites, a gamma distribution describing amongsite rate variation with four categories estimated from the data (GTR + I + Γ 4 ) and allowing for rates of change at the third codon position to differ relative to the first two. This model was implemented with a relaxed molecular clock with branch rates drawn from a lognormal distribution (Drummond, Ho, Phillips, & Rambaut, 2006) and a minimally constrained Bayesian skyline demographic model (Drummond, Rambaut, Shapiro, & Pybus, 2005).
Four runs of 50,000,000 steps were sampled every 5,000 steps.

| Continuous phylogeography
Spatial diffusion was mapped to viral phylogenies using the continuous phylogeographic framework described by Lemey, Rambaut, Welch, and Suchard, (2010) implemented in BEAST. A homogenous Brownian diffusion model, which assumes that the rate of spatial diffusion does not vary significantly across the phylogeny, was compared with relaxed random walk (RRW) models (Lemey et al., 2010) which include branch-specific scaling factors drawn from an underlying distribution (either Cauchy, gamma or lognormal). Brownian and RRW models describing the distribution of spatial diffusion across the phylogeny were compared using AICM. Reconstructed dates and geographical coordinates at internal nodes of the phylogeny were extracted, and per-branch dispersal distances were calculated from geographical coordinates by converting latitude and longitude to radians and applying the spherical law of cosines with the radius of the Earth specified as 6,371 km. For visualization, branches were dated at the midpoint between the dates estimated for the nodes they connect.

| Discrete trait analysis
Rates of transition between geographical regions and between host types were estimated in BEAST by inferring models of discrete trait evolution using an asymmetric model that allows the rate of transition between any given pair of traits to differ depending on the direction (Lemey et al., 2009). The number of sequences in each discrete state for the both the geographical and host analyses is stated in Table   S2. Bayesian stochastic search variable selection (BSSVS) was used to identify phylogenetically robust transitions by assigning a binary indicator variable to each transition which was sampled by MCMC.
Transitions included in the model (indicator variable = 1) in more than

| Inference of cryptic wild bird-mediated transmission
Phylogenetic, or patristic, distances between viruses positioned in time-scaled phylogenetic trees were calculated and averaged across a posterior sample of 100 trees. For each domestic case, a putative domestic ancestor was identified as the domestic case recorded on a previous date with the shortest patristic distance.
Each domestic case was thus associated with an evolutionary distance to its putative domestic ancestor. The minimum and maximum distances to putative domestic ancestors were used to define a range of threshold phylogenetic distances. Using thresholds at intervals of 0.005 within this range, a post-order traversal that proceeds from the tips towards the root of the tree was used to infer either wild bird or domestic host type at each internal node of the phylogeny. Internal nodes with both descendants being domestic cases with evolutionary distances to their putative domestic ancestor below this threshold were inferred to be domestic, and otherwise, a wild bird host was inferred. At internal nodes deeper in the tree, domestic host type was inferred if both descendant nodes were inferred to be domestic, otherwise wild bird was inferred.
Domestic cases were classified as being the result of primary (wild bird to domestic) or secondary (domestic to domestic) transmission depending on whether the internal node from which they were descended was inferred to be wild bird or domestic using the above methodology. Using a range of phylogenetic distance thresholds, the classification of cases made using this approach was compared with the reported classification of primary and secondary cases made using established epidemiological methods . For each value used as a threshold phylogenetic distance, the sensitivity and precision with which secondary cases reported by Mulatti et al. were calculated. From these measures, F-scores which balance sensitivity and precision were calculated. Using thresholds that resulted in a reasonable level of accordance (F 1.5 > 0.8 and specificity >0.7), the proportion of threshold values at which each internal node was estimated to be wild bird was calculated. Rates of transmission between and within wild and domestic birds through time were calculated as described in the above discrete trait analysis section. To combine the results of the discrete trait reconstruction of host type and this analysis, internal nodes retained their host type inferred using the discrete trait model unless a wild bird host was inferred using the distance-based approach. Rates of transition between each of the host categories considered in the discrete trait analysis (backyard, broiler, layer, turkey, domestic waterfowl and wild bird) were calculated and averaged across a posterior sample of 100 trees. Rates greater than one when averaged across the posterior set of phylogenies were considered significant and plotted in a network map. May 2017 (95% HPD, 4 April-2 July). As a result, we can be confident that the ancestors of the Italy-A and Italy-B lineages had diverged prior to the first case of the second wave, detected on 19 July, indicating that there were at least two epidemiological origins to the second wave of cases.

| Phylogeographic reconstruction of dispersal history in continuous space
To investigate diffusion dynamics during the course of the epidemic, geographical coordinates associated with cases were integrated into phylogenetic analysis using the continuous phylogeography approach described by Lemey et al. (2010) A spatiotemporal reconstruction of the epidemic is shown in  Figure S2. Reconstructed movements for wave 1 were generally of greater distance (mean = 33 km, median = 11 km) than wave 2 (mean = 11 km, median = 2.5 km). Within wave 2, the Italy-A lineage had a lower mean distance (7.6 km) though similar median distance (2.3 km) compared with Italy-B (mean = 14 km, median = 2.6 km). This reflects the distribution of movements associated with Italy-B being particularly right-skewed with many very short movements associated with local spread and occasional very long-distance dispersal events.

| Quantification of rates of transition between geographical regions
Transition rates between the six geographical regions (Veneto n = 28, Emilia-Romagna n = 7, Friuli-Venezia Giulia n = 3, Lazio n = 1, Lombardy n = 51 and Piedmont n = 5) where cases were detected during the course of the epidemic were also estimated using a model of discrete trait evolution. Of the 30 possible regionto-region transition rates, BSSVS indicated that six were significant using a 0.5 inclusion probability as a cut-off ( Figure 3; inclusion probabilities, Bayes factor support and conditional transition rates for all transitions are shown in Table S3). The greatest rate of region-to-region transmission across both waves of the epidemic was identified as being from Veneto to neighbouring Lombardy.
Veneto was also identified as the most probable location of the most recent common ancestor of the sampled viruses (posterior probability = 0.99), consistent with the continuous phylogeographic reconstruction.
To examine the rate of transmission between and within re-

F I G U R E 3
Dynamics of region-to-region transmission from discrete trait analysis. Time-scaled maximum clade credibility HA phylogeny with branches coloured according to the region inferred for the descendant node following the colour scheme of the network map (topleft) which shows phylogenetically robust region-to-region transitions. The width of arrows indicates the rate of transition occurrence (the number of transitions per unit of time), and arrow colour indicates posterior support: dark red indicates inclusion probability 0.9-1.0; red, 0.8-0.9; dark orange, 0.7-0.8; light orange, 0.6-0.7; yellow, 0.5-0.6

| Quantification of the role of wild birds in transmission between host types
To explore transmission dynamics at the wildlife-domestic interface and the relative roles of primary incursions into the domestic sector from wild birds and secondary spread between domestic premises, rates of transmission between host types were investigated. Six categories of host type that accounted for differences in species and production type were considered (backyard n = 21, broiler n = 5, layer n = 12, turkey n = 39, domestic waterfowl n = 6 and wild bird n = 12 In spite of the potential for bias described above, the most probable host type at the root of the tree was inferred to be wild bird, albeit with considerable uncertainty (posterior probability = 0.63), and rates of wild bird to domestic transmission were estimated to be higher than rates of transmission from domestic to wild birds. Nonetheless, the results of the discrete trait model contradict other sources of information  and resulted in only 21% of internal nodes being estimated to be wild bird (full results of discrete trait analysis shown in Figure S3 and inferred at internal nodes using a range of threshold phylogenetic distances. The resulting classification was compared with the classification of cases made using established epidemiological methods . A threshold value of 0.19 was found to provide a good level of accordance with existing methods, balancing  (Table S5), and was applied across the phylogeny.
Using this threshold, each branch of the phylogeny was classified as representing either transmission within wild birds, from wild birds to domestic, or domestic to domestic. Transmission type is shown mapped to the phylogeny in Figure 5a and through time in Figure 5b.
Using The phylogenetic tree with host types reconstructed at internal nodes and adjusted for the presence of cryptic wild bird-meditated transmission is shown in Figure 6, alongside a network map showing significant rates averaged across a posterior sample of trees. The highest rate of transmission between host types was estimated as being from wild birds to domestic turkeys, followed by the rate of wild birds to backyard premises. There is also robust evidence of transitions from wild birds to domestic waterfowl and laying chickens, but not to broiler chickens (though a single wild bird to broiler transition is inferred in the MCC, Figure 6). Of the five domestic host categories, phylogenetically robust onwards transmission to at least one other production type was detected from turkeys and from laying chickens. Across a posterior sample of trees, there was no robust evidence of onwards transmission to other production types from either backyard, domestic waterfowl or broilers. The phylogeny in Figure 6 allows the production types involved in the clusters of domestic to domestic transmission in Figure 5 to be elu-

F I G U R E 6
Host-type transmission dynamics Time-scaled maximum clade credibility HA phylogeny with branches coloured according to the host type inferred for the descendant node following the colour scheme of the network map (top-left) which shows phylogenetically robust host-to-host transitions. Host type at internal nodes reconstructed using discrete traits model with BSSVS and adjusted to infer the presence of unobserved wild bird hosts using phylogenetic-temporal distances between domestic cases. In the network map, phylogenetically robust transitions are shown with arrow width relative to the rate of transition occurrence (the number of transitions per unit of time) The spatial pattern of host type-to-host type transmission was visualized by projecting the phylogeny in Figure 5a in geographical space ( Figure S4). Wild bird driven dispersal can be seen to be responsible for geographical expansion of the epidemic. Wild bird to domestic transmission was particularly prevalent in the eastern portion of the area affected during the epidemic, occurring more frequently during first wave of cases and within the Italy-A genetic sub-epidemic during the second wave. The large cluster of domestic to domestic transmission in the Italy-B genetic group (Figure 5a) is restricted to a small geographical area to the south of Brescia ( Figure   S4). In contrast, the smaller cluster of domestic to domestic trans- November). This is consistent with the timing of arrival of migratory birds in north-eastern Italy (Spagnesi & Serra, 2005)  has been observed in previous epidemics in Italy (Busani et al., 2009;Mannelli, Ferrè, & Marangon, 2006), with fattening turkeys being at higher risk of virus introduction from wild birds, and acting as source of infection for other production types in the cluster of secondary cases. In contrast, chicken broilers accounted for only a small fraction of domestic cases (five out of 83), and in common with backyards and domestic waterfowl, no significant onwards transmission to other production types was detected.
Estimating rates of transmission between under-sampled reservoir populations and a well-sampled target species is a common challenge in infectious disease epidemiology. Discrete trait analysis is recognized as tending to underestimate transmission rates from, and overestimate rates towards, under-sampled populations . We have used an approach based on evolutionary distances between domestic cases to infer the presence of cryptic wild bird-mediated virus transmission in areas of the phylogeny where viruses were not sampled from wild birds. This approach requires some judgement of a threshold distance below which cases can be judged as likely originating from secondary transmission. Given judgements of this kind are already being made by epidemiologists given available spatiotemporal data and knowledge of potential epidemiological links , we see this analysis as providing additional information from genetic data with which the processes driving epidemics can be inferred. This approach also relies on an assumption that the level of domestic to wild bird transmission is low, though this assumption is supported by the discrete trait analysis performed and existing knowledge about the epidemic .
In contrast to the 2016-17 HPAI A(H5N8) epidemic elsewhere in Europe that was generally identified primarily in the wild bird population (Brown, Kuiken, et al., 2017;, wild bird cases during the epidemic in Italy were relatively rare in comparison with domestic cases (Fusaro et al., 2017;Mulatti et al., 2018). Our analyses indicate the role of wild birds during the Italian epidemic was significantly greater than apparent when considering either the number of wild bird cases or the position of wild bird cases in the phylogeny alone, with wild bird hosts reconstructed at deep nodes associated with long-distance dispersal events using a standard discrete trait approach. The extent of transmission within wild birds was inferred to be quite stable over the period of the epidemic, which extended beyond the nesting (July/August) and wintering migratory periods (January/February), to periods when migratory movements were limited (Spina & Volponi, 2008). Both the time period spanned and geographical extent of the epidemic, extending far from the large wetlands in proximity to coastal areas, in which domestic incursions associated with migratory birds were initially located (Fusaro et al., 2017;Mulatti et al., 2018), supporting spill-over from migratory to residential wild birds. Several primary incursions were detected in each of the recognized epidemic waves, whereas secondary spread within the domestic sector was probably restricted to the second wave and peaked in October and November

2017.
Co-inference of host type and phylogeography enabled the relationship between virus dispersal and host to be investigated. Longdistance dispersal was far more likely to be associated with branches of the tree inferred to correspond to virus dispersal either within wild birds or transmission from wild birds to domestic, and these movements were responsible for all geographical expansion of the area affected during the epidemic. However, some longer dispersal events were inferred to occur in areas of the phylogeny inferred to representing clusters of secondary spread. These instances represent genetically similar viruses detected on nearby dates that are separated by large geographical distances inconsistent with airborne transmission between domestic premises (Jonges et al., 2015;Scoizec et al., 2018;Torremorell et al., 2016). Therefore, such instances highlight opportunities for further investigation, as they may indicate human-mediated dispersal, missing domestic cases if the timescale allows, or perhaps undetected domestic-wild bird-domestic transmission. For example, long-distance virus movements were detected within a cluster of phylogenetically linked rural poultry outbreaks detected from the end of September to the end of October in several provinces of Lombardy region to the east of Milan were identified. In this case, there is knowledge that these geographically diffuse cases seem to be associated with the human-mediated movement of birds associated with a live bird market. This approach therefore has the potential to detect other instances of human-mediated virus dispersal. The methods used could be applied in the course of an epidemic, in an almost real-time fashion, by updating the list of viral sequences to be included in the analyses and the ancillary epidemiological data.
This could guide the direction of further investigations in infected farms identified as possible secondary cases, or where genomic distances, and spatiotemporal information could suggest the presence of a common source of infection.
In this paper, we describe the geographical movements and host types underlying the persistence and geographical dispersal of the 2016-17 A(H5N8) HPAI epidemic in Italy. We detect significant heterogeneity in the distances covered and tempo of virus dispersal across the epidemic associated with the host types involved. In addition to reconstructing the pattern of movement across the area affected by the epidemic, we co-estimate the timing and host types producing inferences that can be particularly useful for evaluating the effectiveness of control measures introduced or to inform the design of future measures. Using a phylogeny-based approach, we identify an important role of wild birds in the persistence and geographical expansion of the avian influenza epidemic. The role of wild birds is inferred to be much greater relative to the rate at which they were sampled, indicating that increased surveillance in wild birds could enable transmission to be understood in greater detail. To follow this work and further explain epidemic spread, correlations between variation in reconstructed dispersal distances and velocities and environmental variables including elevation, distance to fresh water, land coverage and poultry density are currently being investigated using recently developed methods (Dellicour et al., 2016;Jacquot et al., 2017).

E TH I C A L A PPROVA L
The study did not include any experimentation on animals or humans. Samples were taken from dead birds on infected farms, as part of measures provided by the European Council Directive on Avian Influenza (Council Directive 2005/94/EC).

CO N FLI C T O F I NTE R E S T
The authors declare no competing interests.