Social structure contains epidemics and regulates individual roles in disease transmission in a group‐living mammal

Abstract Population structure is critical to infectious disease transmission. As a result, theoretical and empirical contact network models of infectious disease spread are increasingly providing valuable insights into wildlife epidemiology. Analyzing an exceptionally detailed dataset on contact structure within a high‐density population of European badgers Meles meles, we show that a modular contact network produced by spatially structured stable social groups leads to smaller epidemics, particularly for infections with intermediate transmissibility. The key advance is that we identify considerable variation among individuals in their role in disease spread, with these new insights made possible by the detail in the badger dataset. Furthermore, the important impacts on epidemiology are found even though the modularity of the badger network is much lower than the threshold that previous work suggested was necessary. These findings reveal the importance of stable social group structure for disease dynamics, with important management implications for socially structured populations.


ROZINS et al.
because high fragmentation and close-knit subgroupings delay the spread of disease and serve to "trap" infections within networks (Sah, Leu, Cross, Hudson, & Bansal, 2017). It is also likely that the role of individuals in the spread of disease may change in more modular networks, whereby individuals that act as "bridges" between different regions of the network will be integral to regulating disease transmission (Salathé & Jones, 2010). However, the importance of these bridging individuals in wildlife populations made up of multiple stable social groups has received little attention.
Previously, networks derived from empirical observations (Rushmore et al., 2014; VanderWaal, Atwill, Isbell, & McCowan, 2014) have been used to provide insights into the role of network structure in disease transmission in nonhuman animals. However, recent advances in bio-logging technology to collect high-resolution social contact data, and methods of network analysis, have enabled the quantification of social interactions among wild animals (Blyton, Banks, Peakall, Lindenmayer, & Gordon, 2014; Hamede, Bashford, McCallum, & Jones, 2009; Hirsch, Reynolds, Gehrt, & Craft, 2016; Pinter-Wollman et al., 2013; White et al., 2017). This has facilitated modeling work that has provided many important insights into how social systems and network structure influence the transmission of directly transmitted infections in nonhuman animals (e.g., Sah et al., 2018). However, an important gap remains in understanding the role of stable social group structure at a population level (cf. work within groups; Nunn, Craft, Gillespie, Schaller, & Kappeler, 2015; Sah et al., 2018; VanderWaal et al., 2014; White et al., 2017), and in particular how it interacts with the role of individuals in the spread of infection.
We exploit a contact network dataset collected in a high-density population of European badgers Meles meles (Figure 1) in Gloucestershire (UK) using UHF proximity loggers (Sirtrack Ltd, Havelock, New Zealand). In much of the United Kingdom and Ireland, badgers live at higher densities than in the rest of the species' range (McDonald, Robertson, & Silk, 2018; Roper, 2010); they live in territorial social groups that share communal dens known as setts (Roper, 2010). Individuals in these populations interact very frequently with others from the same group but much more sporadically with those from neighboring groups (Roper, 2010;. This results in modular contact networks, in which individuals from the same social group are much more closely connected than individuals from different social groups. While there are some differences in social contacts over the course of a year (Silk et al., 2017), the overall structure of the networks persists.
We used disease simulations to examine the implications of social group structures for both the risk and size of epidemics in networks generated directly from the empirically derived European badger contact network. We also investigated the impact, on spread of infection and epidemic size, of (a) the duration of contact, (b) the identity, and (c) the social position of the first-infected individual. We simulated the spread of a generic infection with SIR (susceptible-infected-removed) type dynamics for a range of transmission probabilities (and subsequently a range of basic reproductive ratios, $R_0$).

| Empirical data collection
Data were collected from a high-density population of badgers in Woodchester Park, Gloucestershire, UK. This population has been the subject of a long-term mark-recapture study since the 1970s (McDonald et al., 2018). A detailed capture history is available for all individuals in the population. Data for the social networks used in this study were collected by using proximity-logging radio tags (Sirtrack) to capture the interactions between 51 individuals living in eight communal setts located at the core of this long-term study population. Data were collected over a 1-year period from June 2009 to May 2010. The individuals used to construct the networks included subadults and adults, consisted of 24 males and 27 females, and represented 80% of the total population.

| Generating simulated networks
Simulation of networks based on the observed dataset allowed us to incorporate uncertainty in epidemiological estimates. We simulated single (static) annual networks as the network structure is qualitatively similar throughout the year, and this approach enabled us to incorporate all of the information we had on the contacts of individuals. Networks were simulated that (a) matched the spatial structure of the observed network data (referred to as spatially structured networks, SSN), (b) matched the degree distribution (individual variation in number of connections in the network) but not the spatial structure of the observed network (referred to as spatially unstructured networks, SUN), and (c) were random networks with identical density (number of edges) to the empirically derived network (referred to as random unstructured networks, RUN). All simulated networks comprised 51 nodes, the same as the number of badgers in the observed network.
FIGURE 1 European badger, Meles meles
Spatially structured networks were simulated from the observed network data collected by . The observed association data were fitted with a zero-inflated negative binomial generalized linear model in the R package pscl (Zeileis, Kleiber, & Jackman, 2008). The duration (in seconds) of interactions between every dyad in the population was the response variable in the model. The explanatory variables were the distance in meters between the main setts in which two individuals were caught, the distance in terms of social group territories between two individuals, and whether or not two individuals were members of the same territorial group (as a binary indicator variable) according to bait marking studies (Delahay et al., 2000) completed in 2009. Using this method, networks were simulated that closely matched the observed network in mean degree (Supporting Information Table S1), degree distribution, and "spatial" structure/modularity (Figure 2).
We also generated spatially unstructured networks (SUN) using the rewire() function in the R package igraph (Csardi & Nepusz, 2006), which generated, for each simulated spatially structured network, an equivalent network with equal degree distribution but without "spatial" structure.
Finally, the function random.graph.game() in igraph was used to generate random networks (RUN) with the same number of edges as the binary full networks.
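The two null-network constructions described above can be illustrated with a minimal sketch in plain Python. It is not the igraph code used in the study: the double-edge-swap routine is one common way of randomizing a network while keeping every node's degree, and the fixed-edge-count sampler is one common random graph model. The function names and the toy edge list are hypothetical.

```python
import random

def degree_sequence(edges, n_nodes):
    """Number of edges attached to each node."""
    deg = [0] * n_nodes
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def degree_preserving_rewire(edges, n_swaps, rng):
    """Randomize an undirected simple graph by repeated double-edge swaps.

    Each accepted swap turns edges (a, b) and (c, d) into (a, d) and (c, b),
    so every node keeps its degree while group structure is broken up.
    """
    edges = [tuple(sorted(e)) for e in edges]
    edge_set = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:  # random orientation of the second edge
            c, d = d, c
        new1, new2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        # Reject swaps that would create a self-loop or a duplicate edge
        if a == d or c == b or new1 in edge_set or new2 in edge_set:
            continue
        edge_set.difference_update([edges[i], edges[j]])
        edge_set.update([new1, new2])
        edges[i], edges[j] = new1, new2
    return edges

def random_graph_fixed_edges(n_nodes, n_edges, rng):
    """Random simple graph with a fixed number of nodes and edges."""
    all_pairs = [(u, v) for u in range(n_nodes) for v in range(u + 1, n_nodes)]
    return rng.sample(all_pairs, n_edges)

# Hypothetical toy network: two triangles joined by one bridging edge
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
rng = random.Random(1)
sun_like = degree_preserving_rewire(toy_edges, 200, rng)
run_like = random_graph_fixed_edges(6, len(toy_edges), rng)
```

The rewired network keeps the degree sequence of the toy network, whereas the random network keeps only the number of nodes and edges.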
FIGURE 2 Network structure of the observed and simulated networks used in this study: (a) the distribution of modularity scores for networks simulated with equivalent social group and spatial structure (SSN), (b) the distribution of modularity scores in spatially unstructured networks (SUN), (c) the distribution of modularity scores in random networks (RUN), (d) the structure of the empirical badger contact network, (e) an example of a simulated unstructured network (SUN), and (f) an example of a simulated random network (RUN). Nodes in each network are colored according to their degree, and each boxplot is based on 1,000 generated networks. All modularity scores were calculated using the fast greedy algorithm in the R package igraph, and weighted edges have been transformed to binary according to set temporal cut-offs (NF: all edges are included; F100: only edges with contact durations >100 s are included; F1000: only edges with contact durations >1,000 s are included)
Following Sah et al. (2017), we analyzed unweighted (binary) versions of these networks. This is a conservative approach because ignoring the weights of edges reduces the modularity of the observed social structure (i.e., it treats all connections as equal; therefore, weak between-group connections are treated as equally important as strong within-group connections). An additional advantage of this approach, rather than including the contact duration as a parameter in disease transmission (see Rushmore et al., 2014), is that it allows for stochastic variability in individual susceptibility to disease transmission. In order to confirm that our results would be similar for infections that required longer contact durations, we repeated our main analyses in (a) unfiltered networks including all edges (NF), (b) networks filtered to include only edges with contact durations >100 s (F100), and (c) networks filtered to include only edges with contact durations >1,000 s (F1000).
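As an illustration of the filtering step, the following minimal Python sketch converts a weighted contact list into binary edge sets at the three cut-offs. The dictionary layout, the function name, and the toy contact durations are assumptions made for the example and are not taken from the badger dataset.

```python
def filter_to_binary(durations, min_seconds=0):
    """Keep a dyad as a binary edge if its contact duration exceeds the cut-off.

    durations: dict mapping a dyad (i, j) to total contact time in seconds.
    """
    return {pair for pair, secs in durations.items() if secs > min_seconds}

# Hypothetical contact durations (seconds) between four individuals
toy_durations = {(0, 1): 5400, (0, 2): 150, (1, 2): 40, (2, 3): 1000}

nf = filter_to_binary(toy_durations)           # NF: all recorded contacts
f100 = filter_to_binary(toy_durations, 100)    # F100: durations > 100 s
f1000 = filter_to_binary(toy_durations, 1000)  # F1000: durations > 1,000 s
```

Because the cut-offs are strict inequalities, a dyad with exactly 1,000 s of contact is dropped from the F1000 network in this sketch.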
Finally, using the R package igraph (Csardi & Nepusz, 2006), network community structure and modularity scores were determined. We used the fast greedy algorithm to define social communities, and report information for the number of communities with the highest modularity score.
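For readers unfamiliar with modularity scores, the sketch below computes the standard Newman modularity Q of a binary network for a given division into communities. It only scores a division that is supplied to it; it does not implement the fast greedy search used to find the division. The function name and toy network are hypothetical.

```python
def modularity(edges, community):
    """Modularity Q of an undirected binary network for a given partition.

    Q = sum over communities c of  e_c / m - (d_c / (2 m))^2,
    where m is the number of edges, e_c the number of edges inside c,
    and d_c the summed degree of the nodes in c.
    """
    m = len(edges)
    within = {}
    degree_sum = {}
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            within[cu] = within.get(cu, 0) + 1
    return sum(within.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical toy network: two triangles joined by one bridging edge
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
two_groups = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
one_group = {node: "A" for node in range(6)}

q_two = modularity(toy_edges, two_groups)
q_one = modularity(toy_edges, one_group)
```

Splitting the toy network into its two triangles gives a positive score, whereas placing all nodes in a single community gives a score of zero.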

| Network measures
Three measures of individual connectedness (centrality) were calculated for all of the nodes on each of the unfiltered simulated networks: degree (number of edges), eigenvector centrality (rewards edges/connections to important nodes), and betweenness centrality (rewards nodes that act as bridges between other nodes).
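The three measures can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the code used in the study: eigenvector centrality is obtained by power iteration and scaled so that the maximum is 1, and betweenness is obtained with Brandes' algorithm for unweighted, undirected networks. All function names and the toy network are hypothetical.

```python
from collections import deque

def adjacency(edges, n_nodes):
    """Neighbor lists for an undirected network."""
    adj = [[] for _ in range(n_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

def degree(adj):
    return [len(neighbors) for neighbors in adj]

def eigenvector_centrality(adj, iters=1000, tol=1e-10):
    """Power iteration on A + I (same eigenvectors as A, avoids oscillation)."""
    x = [1.0] * len(adj)
    for _ in range(iters):
        new = [x[i] + sum(x[j] for j in adj[i]) for i in range(len(adj))]
        top = max(new)
        new = [value / top for value in new]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x

def betweenness_centrality(adj):
    """Brandes' algorithm: shortest paths passing through each node."""
    n = len(adj)
    bc = [0.0] * n
    for s in range(n):
        order, preds = [], [[] for _ in range(n)]
        sigma = [0] * n   # number of shortest paths from s
        dist = [-1] * n   # distance from s (-1 = not yet reached)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = [0.0] * n
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return [value / 2 for value in bc]  # each pair was counted in both directions

# Hypothetical toy network: two triangles joined by one bridging edge (2-3)
toy_adj = adjacency([(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)], 6)
toy_degree = degree(toy_adj)
toy_eigen = eigenvector_centrality(toy_adj)
toy_between = betweenness_centrality(toy_adj)
```

In the toy network the two bridging nodes have only one more connection than the others, yet they carry all of the betweenness, which is the kind of indirect connectivity that degree alone does not capture.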

| Epidemic simulation
All simulations started with a single infected individual in a completely susceptible population. Following Sah et al. (2017), all simulations followed an SIR-type (Susceptible-Infected-Removed) model in which each individual falls within exactly one compartment: susceptible to infection, infectious, or removed. Each of the 51 individuals was chosen to be patient zero 100 times. Therefore, for a network of a particular class (SSN, SUN, or RUN), we ran 5,100 simulations for each of nine unique transmission probabilities T, on 1,000 unique graphs.
Transmission probabilities were chosen to be between 0.075 and 0.275.
Simulations proceeded as follows:
1. A single individual is infected (patient zero).
2. Within the first iteration, patient zero's neighbors (those connected by an edge) become infected (and infectious) with probability T and patient zero is removed from the network.
3. In all subsequent iterations, the neighbors of any infectious individual become infected (and infectious) with probability T, and all previously infectious individuals are removed from the network.
Note that the iterative steps do not explicitly represent uniform time units, but follow a bond percolation-like approach. This approach, one that ignores temporal dynamics, has been shown to produce similar results to time-sensitive simulations (Rushmore et al., 2014), but is less computationally expensive. The disease will work its way through the network as described above, until no new individuals can become infected. The termination of an outbreak can be the result of it burning out due to a lack of successful transmission events, or because there are no more susceptible individuals to infect.
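The simulation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the published simulation code: the network is a hypothetical six-node toy graph, the three transmission probabilities are illustrative values within the range used in the study, and all function names are invented for the example.

```python
import random

def simulate_outbreak(adj, patient_zero, T, rng):
    """Run one outbreak and return its final size (patient zero included)."""
    status = ["S"] * len(adj)
    status[patient_zero] = "I"
    infectious = [patient_zero]
    while infectious:
        newly_infected = []
        for i in infectious:
            for j in adj[i]:
                # Each susceptible neighbor is infected with probability T
                if status[j] == "S" and rng.random() < T:
                    status[j] = "I"
                    newly_infected.append(j)
        # Individuals infectious in this iteration are then removed
        for i in infectious:
            status[i] = "R"
        infectious = newly_infected
    return status.count("R")

def mean_epidemic_size(adj, patient_zero, T, n_runs, rng):
    """Average final size over repeated outbreaks from the same patient zero."""
    sizes = [simulate_outbreak(adj, patient_zero, T, rng) for _ in range(n_runs)]
    return sum(sizes) / n_runs

# Hypothetical toy network: two triangles joined by one bridging edge (2-3)
toy_adj = [[1, 2], [0, 2], [0, 1, 3], [2, 4, 5], [3, 5], [3, 4]]
rng = random.Random(42)
for T in (0.075, 0.175, 0.275):
    means = [mean_epidemic_size(toy_adj, z, T, 100, rng)
             for z in range(len(toy_adj))]
    print(T, [round(m, 2) for m in means])
```

The loop ends when an iteration produces no new infections, which covers both ways an outbreak can terminate: failed transmission or exhaustion of susceptible individuals.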

| Basic reproduction number and the epidemic threshold
For each network, we determine the basic reproduction number: $R_0 = T\left(\frac{\langle k^2 \rangle}{\langle k \rangle} - 1\right)$, where $\langle k \rangle$ is the mean degree, $\langle k^2 \rangle$ is the mean squared degree, and $T$ is the transmission probability (Newman, 2002). This calculation of $R_0$ will produce identical $R_0$ values for networks with identical degree distributions (i.e., SSN and SUN networks). The epidemic threshold refers to the transmission probability for which $R_0 = 1$. Therefore, for each transmission parameter and network, we compute the corresponding value of $R_0$, which is more easily interpreted epidemiologically than simply a transmission probability (Supporting Information Table S2).
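As a worked illustration of this formula, the following Python sketch computes the basic reproduction number and the epidemic threshold from a list of node degrees. The function names and the toy degree sequences are hypothetical and are not taken from the badger networks.

```python
def degree_ratio(degrees):
    """Mean squared degree divided by mean degree, minus one."""
    mean_k = sum(degrees) / len(degrees)
    mean_k2 = sum(k * k for k in degrees) / len(degrees)
    return mean_k2 / mean_k - 1

def basic_reproduction_number(degrees, T):
    """R0 = T * (<k^2> / <k> - 1) for transmission probability T."""
    return T * degree_ratio(degrees)

def epidemic_threshold(degrees):
    """Transmission probability at which R0 equals 1."""
    return 1 / degree_ratio(degrees)

# Hypothetical degree sequences with the same mean degree (4)
regular = [4, 4, 4, 4, 4, 4]
heterogeneous = [1, 1, 2, 4, 8, 8]

r0_regular = basic_reproduction_number(regular, 0.2)
r0_heterogeneous = basic_reproduction_number(heterogeneous, 0.2)
```

The two toy sequences share a mean degree, but the heterogeneous one has a larger mean squared degree and therefore a larger basic reproduction number and a lower epidemic threshold at the same transmission probability.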

| Variation in epidemic size given the identity of the initially infected individual
We used data from disease simulations to quantify the variation in (

| Epidemic size as an outcome of the network position of the initially infected individual
We used data on the mean epidemic size for each individual in the unfiltered networks to quantify the effect of degree, eigenvector centrality, and betweenness centrality on the size of the epidemic separately for each transmission probability in each type of network.
Mean epidemic size was divided by the total size of the network (to produce a mean proportion of the population infected), and this variable was then transformed using the logit function. First, we examined each centrality measure in a separate linear mixed effects model to assess differences in their power in predicting epidemic size. We included a network measure as a fixed effect variable alongside a random effect to control for network identity (including both a random intercept and random slope). Due to the expectation that the effect of centrality on epidemic size might decline for larger centrality values (especially for infections with higher transmission probabilities), we fitted two models for each centrality measure: one in which the raw values of the measure were used, and one in which the measure had been log(Measure+1) transformed. Second, we constructed a combined linear mixed effects model that included fixed effect variables for all three centrality measures after they had been log(Measure+1) transformed. This model included a random intercept and uncorrelated random slopes for each network measure related to the identity of the network. This second model enabled us to identify the importance of indirect connections (measured using eigenvector centrality and betweenness centrality) while controlling for the effect of direct contacts (degree).
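The two transformations applied before model fitting can be written out as a short Python sketch. It covers only the transformations, not the mixed effects models, and the function names and example values are hypothetical. Note that the logit is undefined when the proportion infected is exactly 0 or 1; how such cases were treated is not stated here, so the sketch simply raises an error.

```python
import math

def logit_proportion_infected(mean_epidemic_size, network_size):
    """Logit of the mean proportion of the population infected."""
    p = mean_epidemic_size / network_size
    if not 0 < p < 1:
        raise ValueError("logit is undefined for proportions of 0 or 1")
    return math.log(p / (1 - p))

def log_plus_one(measure):
    """log(Measure + 1) transform applied to a centrality measure."""
    return math.log1p(measure)

# Hypothetical values for one patient zero in a 51-node network
response = logit_proportion_infected(12.75, 51)
predictor = log_plus_one(9)  # e.g., a node with degree 9
```

The log(Measure+1) form keeps nodes with a centrality of zero (for example, zero betweenness) in the analysis, since log(0 + 1) = 0.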

| RESULTS
The observed badger social network displayed clear community structure (Figure 2d). The weighted contact network was formed of six communities with a modularity score of 0.462 for this division. The binary contact network was split into three communities with a modularity score of 0.484 for this division. Networks simulated using a negative binomial function fitted to the observed dataset (SSN; see methods) retained this modular structure (Figure 2a), as well as having a similar mean (unweighted) degree (Supporting Information Table S1) and (unweighted) degree distribution (Supporting Information Figure S1). Networks rewired to maintain the degree distribution without retaining the spatial and social group structure of the original network (SUN; Figure 2b) and random networks (RUN; Figure 2c) had considerably reduced modularity (Figure 2). Networks filtered to only contain edges of longer durations had higher modularity scores than unfiltered networks (Figure 2). The epidemic threshold, calculated by setting $R_0 = 1$, was greater in the random networks than in the structured networks (SSN, SUN), regardless of filtering level (NF, F100, F1000; Supporting Information Table S3). Additionally, as filtering increased, so too did the epidemic threshold. Thus, highly filtered networks (F1000) require a more intense infection (higher transmission probability, T) for an epidemic to occur.
Table S7). This similarity between SSN and SUN is expected because they have identical degree distributions. However, for filtered networks (F100, F1000), the transmissibility (and hence $R_0$) of the pathogen had no impact on the degree of variation (Supporting Information Figures S3 and S4). Therefore, in networks with realistic social and spatial structuring, there is greater individual heterogeneity in spreader status compared with networks that lack this structuring.
This reveals that heterogeneity in importance for disease transmission can be greatest in modular networks without "superspreader" dynamics, and that the

| DISCUSSION
Our disease simulations showed that the spatial structure of empirically derived badger contact networks reduced the probability of large epidemics. This provides evidence for the importance of a "social bottleneck" (Nunn, Craft, et al., 2015; Sah et al., 2018; VanderWaal et al., 2014; White et al., 2017) in disease transmission functioning at a population level. We also found considerable variation between individuals in the size of epidemics they generated in networks with realistic spatial and social group structure, and this variation depended on both their local position (direct connections) and their global position (role in broader network connectivity). Our results agree with evidence from other simulation studies that networks with higher modularity will have smaller epidemics and lower peak prevalence of infection (Newman, 2003; Salathé & Jones, 2010). The most modular networks in this study were those that were both spatially and socially structured. Interestingly, networks that had identical social structure (degree distribution) but lacked spatial structure had lower modularity and higher mean outbreak size, even when compared to random networks. This suggests that social connections, in the absence of spatial structure, promote disease spread. Our results highlight that the importance of community structure extends to much lower levels of modularity (Q ~ 0.45) than the high modularity scores used by the most directly comparable previous study (Salathé & Jones, 2010).
Previous simulation models have found that groups with structure (and therefore higher modularity) tend to have lower parasite prevalence due to the presence of social bottlenecks (Nunn, Jordán, McCabe, Verdolin, & Fewell, 2015; VanderWaal et al., 2014). At a population level, social bottlenecks would be expected to result in the aggregation of infection within particular social groups for pathogens with low to intermediate transmission probabilities (Manlove, Cassirer, Cross, Plowright, & Hudson, 2014). By exploring the effect of modularity for a greater range of transmission probabilities, we were able to reveal that the effect of community structure varies, depending on the transmissibility of the pathogen being investigated, as well as the duration of contacts being considered. While we found a strong relationship between modularity and epidemic size, it should be noted that other network properties might change along with modularity that we do not account for. However, it would not be possible to completely change the modularity of a network while maintaining identical values for all other higher order properties of the network (such as degree distribution). Therefore, disentangling these properties, while remaining biologically meaningful, is difficult.
The reduction in epidemic size was greatest for pathogens with intermediate rather than low or high transmission probabilities ($R_0$ = 2.2) when all contacts, regardless of duration, were considered capable of transmitting infections. This suggests that the impact of structural delay and trapping of infection spread that is apparent in modular networks (Sah et al., 2017) peaks for infections that are able to spread effectively, but do not have sufficiently high transmission probabilities to facilitate escape from subregions of the network (Cross, Lloyd-Smith, Johnson, & Getz, 2005). The importance of social structure in trapping infection further supports the idea that badgers that form more out-of-group contacts can act as "capacitors" in controlling the spread of infection under certain conditions, especially for less readily transmissible pathogens. Additionally, populations with highly modular networks often consist of a large proportion of both highly central and highly isolated individuals (characteristic of high eigenvector centralization; Griffin & Nunn, 2012). For highly transmissible infections, the isolated nodes can act to limit the outbreak size, whereas for less transmissible infections, the highly central individuals may drive outbreaks.
FIGURE 6 Mean epidemic size given the (a-b) degree, (c-d) eigenvector centrality and (e-f) betweenness centrality, of the initially infected individual for disease simulations on the unfiltered (NF), spatially structured network (SSN), for two transmission probabilities. The coloring reflects the frequency of observations, with lighter colors reflecting a higher frequency and darker colors lower frequency
The contact duration that permitted transmission of infection (i.e., level of edge-weight filtering in disease simulations) had a strong impact on the size of the epidemic for a given transmission probability. However, it did not alter the importance of modularity in limiting epidemic size compared to networks without spatial structure, tending to simply mean that the peak difference occurred for simulations with higher pathogen transmissibility. In fact, the increased modu-
Rather, our findings suggest the opposite, with epidemics being just as likely but smaller in size in networks with realistic spatial and social structure. Further, the most severe epidemics in networks with realistic social structure did not necessarily stem from initially infected individuals with the highest degree, and depended in part on other aspects of network position that accounted for indirect connections. This emphasizes the potentially conflicting effects of high modularity and heterogeneity in network position on disease dynamics. The relationship between superspreader-type dynamics and the limiting effects of spatial and social structure may therefore be fundamental in driving the dynamics of host-pathogen interactions in natural populations. Finally, the individual network measures, and their impact on the epidemic size, are likely to hold true in the complete badger social network (recall that the 51 badgers make up 80% of the total population). While concerns have been expressed about the reliability of social networks constructed using a subset of the population, it has been found that an individual's social and spatial importance should not change with the addition of individuals to the network (Silk, Jackson, Croft, Colhoun, & Bearhop, 2015).
Culling of high-density badger populations has been used as an attempt to control disease, but has proved controversial (McDonald, 2014). It has been postulated that culling-induced perturbation of the badger social system, comprising increased ranging behavior, less clearly defined territorial boundaries, and increased dispersal, has reduced the effectiveness of this approach (Carter et al., 2007; McDonald, Delahay, Carter, Smith, & Cheeseman, 2008). By demonstrating the importance of host social structure in limiting epidemic size at a population level, our results provide a novel insight into how social perturbation of badger populations might be detrimental from a disease control perspective; decreasing the modularity of the social contact network may be integral to increases in disease incidence and epidemic size that can result from social perturbation.
FIGURE 7 The effect of increasing different centrality measures on mean epidemic size in (a) spatially structured networks (SSN), (b) spatially unstructured networks (SUN) and (c) random networks (RUN). Points represent model estimates from model predictions and shaded areas represent the estimate ±95% confidence intervals (blue: degree, yellow: eigenvector centrality, pink: betweenness centrality). Dashed lines represent the model prediction for the effect of degree (blue) and eigenvector centrality (yellow) at the lowest transmission probability to facilitate comparisons. Effect sizes are from a model including all three measures so represent the effect of each measure while accounting for variation in the others. All centrality measures have been log(Measure+1) transformed and then scaled to be mean centered and have unit variance
Social perturbation has been suggested to be problematic for disease control in wildlife hosts (Laddomada, 2000; McDonald et al., 2008), and the lessons learned could be applied more generally in cases where animal social networks are naturally modular to identify situations where social perturbation might have particularly important consequences for pathogen transmission. For example, social group structure seems likely to be important in limiting epidemic size in many species, and therefore, perturbation is likely to be especially important in species with stable social structures, like badgers, in which social bottlenecks are more likely to occur in the absence of perturbation.
From a management perspective, the considerable among-individual variation in importance to transmission, together with the importance of indirect connections in the spatially structured networks, suggests that targeting individual badgers with high degree might not be the most effective strategy. It has been shown that when vaccination coverage is low, vaccination efforts that target betweenness centrality rather than degree result in smaller epidemics (Rushmore et al., 2014; Salathé & Jones, 2010). However, our results suggest that in these networks (with intermediate levels of modularity), it is important to account for both degree and measures of indirect connections, such as betweenness and eigenvector centrality, when assessing the most important individuals for the spread of infection. A key challenge now is to identify those individual traits that relate to occupation of these network positions, in order to be able to target management interventions more efficiently (Delahay, Smith, & Hutchings, 2009; VanderWaal & Ezenwa, 2016). For example, in badgers, there is a tendency for individuals that use outlier setts (located away from the main setts) to occupy potentially important, bridging network positions with high numbers of direct and indirect connections. Therefore, being able to target management interventions at outlier setts, or being able to better define seasonal variation in which individuals are likely to use these setts, may contribute disproportionately to successful interventions. However, it has also been shown that when time is limited, it may be more effective to vaccinate many lower priority animals quickly, rather than waiting for opportunities to vaccinate more important individuals (Robinson et al., 2018).
In conclusion, using epidemiological simulations, we have shown that the stable social group structure of the European badger population manifests in modular contact networks that are likely to experience smaller epidemics than equivalent networks without this structure, especially for pathogens with intermediate transmission probabilities. The nature of these contact networks also means that it is important to take into account both direct and indirect connections of individuals in the network when determining their role in disease transmission, and that important individuals may differ for pathogens with different $R_0$ values. The design and implementation of effective disease management interventions should therefore acknowledge that individual variation in network positions, social groupings, and pathogen traits closely interact to influence transmission, and that the social systems of many wildlife populations might already be optimized for the containment or mitigation of the spread of disease.

ACKNOWLEDGMENTS
CR and MJS are funded by NERC standard grant NE/M004546/1 awarded to RM, DJH, DPC, and MB. RD and the team at Woodchester Park are project partners. Network data were collected by NW as part of a PhD studentship funded by Defra. We thank Keith Silk for providing the photograph for Figure 1.

CONFLICT OF INTEREST
None declared.

DATA ACCESSIBILITY
The original weighted adjacency matrix for the high-density population of European badgers, as well as code used for simulating networks and disease simulations, can be found online at https://doi.org/10.5061/dryad.49n3878.