Linking social and pathogen transmission networks using microbial genetics in giraffe (Giraffa camelopardalis)


  • Kimberly L. VanderWaal,

    Corresponding author
    1. Animal Behavior Graduate Group, University of California, Davis, CA, USA
    2. International Institute for Human-Animal Networks, University of California, Davis, CA, USA
    3. Wangari Maathai Institute for Peace and Environmental Studies, University of Nairobi, Nairobi, Kenya
  • Edward R. Atwill,

    1. Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, CA, USA
    2. Western Institute for Food Safety and Security, University of California, Davis, CA, USA
  • Lynne A. Isbell,

    1. Animal Behavior Graduate Group, University of California, Davis, CA, USA
    2. Department of Anthropology, University of California, Davis, CA, USA
  • Brenda McCowan

    1. Animal Behavior Graduate Group, University of California, Davis, CA, USA
    2. International Institute for Human-Animal Networks, University of California, Davis, CA, USA
    3. Department of Population Health and Reproduction, School of Veterinary Medicine, University of California, Davis, CA, USA

Summary

  1. Although network analysis has drawn considerable attention as a promising tool for disease ecology, empirical research has been hindered by limitations in detecting the occurrence of pathogen transmission (who transmitted to whom) within social networks.
  2. Using a novel approach, we utilize the genetics of a diverse microbe, Escherichia coli, to infer where direct or indirect transmission has occurred and use these data to construct transmission networks for a wild giraffe population (Giraffa camelopardalis). Individuals were considered to be a part of the same transmission chain and were interlinked in the transmission network if they shared genetic subtypes of E. coli.
  3. By using microbial genetics to quantify who transmits to whom independently from the behavioural data on who is in contact with whom, we were able to directly investigate how the structure of contact networks influences the structure of the transmission network. To distinguish between the effects of social and environmental contact on transmission dynamics, the transmission network was compared with two separate contact networks defined from the behavioural data: a social network based on association patterns, and a spatial network based on patterns of home-range overlap among individuals.
  4. We found that links in the transmission network were more likely to occur between individuals that were strongly linked in the social network. Furthermore, individuals that had more numerous connections or that occupied ‘bottleneck’ positions in the social network tended to occupy similar positions in the transmission network. No similar correlations were observed between the spatial and transmission networks. This indicates that an individual's social network position is predictive of transmission network position, which has implications for identifying individuals that function as super-spreaders or transmission bottlenecks in the population.
  5. These results emphasize the importance of association patterns in understanding transmission dynamics, even for environmentally transmitted microbes like E. coli. This study is the first to use microbial genetics to construct and analyse transmission networks in a wildlife population and highlights the potential utility of an approach integrating microbial genetics with network analysis.

Introduction

Epidemiological models have traditionally assumed that the probability of contact is equal for every pair of individuals in the population. Thus, every individual is equally likely to acquire a pathogen from an infected animal. In reality, however, an animal's risk of infection is dependent on local patterns of interaction. Therefore, population spatial and social structure create heterogeneity in transmission patterns (Keeling & Eames 2005; Bansal, Grenfell & Meyers 2007; Craft & Caillaud 2011). Network theory provides a set of tools for analysing such heterogeneity by not only taking into account direct connections, but also indirect links between individuals, providing a sophisticated method to analyse relations among individuals (Wasserman & Faust 1994; Croft, James & Krause 2008). Recently, there has been considerable effort to incorporate network theory into epidemiological models (Keeling 2005; May 2006; Bansal, Grenfell & Meyers 2007). Compared with traditional mass-action models, network models tend to result in reductions in the early growth rate, number of secondary infections produced by each infected individual, and final size of an epidemic (Keeling & Eames 2005; Turner et al. 2008). Thus, accounting for heterogeneity in transmission dynamics improves our ability to understand and predict the dynamics of infectious disease (Keeling 1999; Keeling & Eames 2005; Ames et al. 2011).

Although network analysis has been lauded as the next key tool for examining how heterogeneous transmission patterns affect disease spread (Delahay, Smith & Hutchings 2009; Craft & Caillaud 2011; Tompkins et al. 2011), the structure of transmission networks in wildlife is relatively unknown and difficult to quantify. In empirical studies, contact networks are generally constructed using behavioural data on space use or social interactions, both of which have the potential to create transmission opportunities (Godfrey et al. 2009; Perkins et al. 2009; VanderWaal et al. 2013a). While modelling has proven to be a useful approach to study the importance of these wildlife contact networks for disease spread (Cross et al. 2004; Perkins, Ferrari & Hudson 2008; Craft et al. 2010; Griffin & Nunn 2012), it is difficult to empirically study transmission routes in wild populations because data on who transmitted a pathogen to whom are almost impossible to obtain using current methods, such as commonly used serological techniques (Caley, Marion & Hutchings 2009). In many cases, ‘transmission networks’ have been constructed based on social interactions between individuals without assessing any pathogens (Böhm, Hutchings & White 2009; Grear, Perkins & Hudson 2009; Hamede et al. 2009). Studies that do integrate networks with empirical data on pathogens typically treat an individual's infection status as an individual attribute, and then assess how one's connectivity in the contact network influences the likelihood of being infected. Generally, it has been shown that infected animals tend to have more connections in the contact network (Corner, Pfeiffer & Morris 2003; Godfrey et al. 2009, 2010), or that individuals are at greater risk of becoming infected if they are connected to infected animals or engage in certain types of interactions (Otterstatter & Thomson 2007; Drewe 2009; Porphyre, McKenzie & Stevenson 2011; MacIntosh et al. 2012; VanderWaal et al. 2013a). Because the ‘transmission networks’ in these studies are not independently defined from behaviour-based contact networks, it has not been possible to truly assess how social patterns influence the structure of transmission networks.

In contrast, we use the genetics of a diverse microbe, Escherichia coli, to reveal where transmission has already occurred and construct a transmission network based on quantifiable transmission events in a wild giraffe population (Giraffa camelopardalis). If two individuals share the same genetic subtype of E. coli, then we infer that either direct transmission has occurred through social interactions or indirect transmission has occurred due to exposure to a common environmental source (Archie, Luikart & Ezenwa 2008). Recent work by Bull, Godfrey & Gordon (2012) analysed how lizard social networks influenced the likelihood of two individuals sharing the same genetic subtype of Salmonella, but the strain-sharing data were not conceptualized as a transmission network that was distinct from the social network. In our transmission networks, individual giraffe are interlinked based upon patterns of E. coli subtype sharing. Therefore, the transmission network is defined independently from behaviour-based contact networks. This allows us to directly compare the structure of transmission and contact networks.

Using this novel approach, we examine how well the transmission network overlays onto social and spatial networks. This allows us to ascertain the relative importance of social vs. environmental transmission mechanisms for enteric microbes such as E. coli. If social contact is crucial for transmission to occur, then the occurrence of links in the transmission network should be highly correlated with the social network. However, if environmental transmission is more common, then the transmission network should map more closely onto a spatial network in which individuals are linked according to the extent to which their home-ranges overlap.

In addition, individuals that are highly connected in the transmission network may be potential super-spreaders (Lloyd-Smith et al. 2005). Do individuals become super-spreaders because they are highly social or because they have a large number of spatial contacts? By comparing individual-level connectivity across networks, we investigate which mechanisms give rise to individual-level variation in transmission network connectivity. If highly social animals are potential super-spreaders, then individual-level connectivity in the social and transmission networks should be correlated. However, if widely ranging animals tend to be super-spreaders, then transmission network connectivity should be correlated with large home-range sizes and high connectivity in the spatial network.

Materials and methods

Study Site and Organisms

Escherichia coli is an excellent model organism for examining microbe transmission pathways in relation to contact networks. It can readily be cultured from faecal samples and exhibits immense genetic diversity. Furthermore, the tools for genotyping E. coli are well-established (Dombek et al. 2000; Goldberg, Gillespie & Singer 2006). Escherichia coli is frequently used as an indicator for environmental faecal contamination (Tallon et al. 2005), and subtyping is a common method for tracing E. coli to its source (Simpson, Santo Domingo & Reasoner 2002). Escherichia coli subtype sharing has been used to demonstrate that transmission regularly occurs between humans and their pets, and between humans, livestock and wild primates (Goldberg et al. 2008; Johnson, Clabots & Kuskowski 2008). Recently, serotypes of shiga-toxin producing E. coli have become a major public health concern and are considered an emerging infectious disease (Beutin 2006). Transmission of E. coli generally occurs via a faecal–oral route, although flies can function as mechanical vectors (Förster et al. 2007; Fetene & Worku 2009).

This study was conducted at Ol Pejeta Conservancy (OPC), a 364 km2 semi-arid savanna woodland ecosystem located in Laikipia, Kenya (0°N, 36°56′E). All giraffe within OPC were recognized using individually unique spot patterns on their necks. At the conclusion of this study, OPC had a population of 212 reticulated giraffe. Immigration and emigration were relatively negligible for this population because, except for a few narrow gaps, OPC is enclosed by a perimeter fence. Indeed, in the last 5 months of the study, only two new adults were discovered. Giraffes were aged according to height estimates and age-associated behaviours (VanderWaal et al. 2013b). Animals were considered juveniles if they were <1·5 years and adult at approximately 4 years. In this study, subadults (1·5–4 years) were grouped with adults because they no longer constantly accompanied their mothers and they exhibited adult-like social and space-use patterns. OPC's giraffe population consisted of 160 adults, 20 subadults and 32 juveniles. The population exhibited a 50 : 50 sex ratio.

Field Observations

From 21 January 2011 to 2 August 2011, giraffe group composition and membership were recorded for all giraffe groups sighted while driving daily survey routes. Routes were pre-determined so that different regions of the study area were surveyed in rotation, allowing for most of the study area to be surveyed once every 3 days. Routes were approximately 100 km in length, covered approximately 115 km2 each, and traversed all habitat types. Giraffe groups observed from survey routes were followed off-road until a complete census of the individuals present was accomplished.

All individuals observed within a group were recorded as ‘in association’ with every other member of the group. A group was defined as a solitary individual or set of individuals engaged in the same behaviour, or moving in the same direction or towards a common destination, as long as each giraffe was no more than 500 m from at least one other group member. This definition was adapted from the literature (Leuthold 1979; Fennessy 2004). During the study period, we collected a total of 1089 sightings of giraffe groups. Each individual giraffe was observed on average 31·1 ± 7·6 SD times (approximately once per week). Group sizes at OPC ranged from 1 to 44 giraffe (mean: 5·42, mode: 1 giraffe).

Each individual's home range was mapped using the GPS locations recorded for each sighting. Home-range boundaries were determined using a fixed-kernel utilization distribution of sightings. A 75% contour (kernel density isopleth) was used to produce a core home range for each animal (Harris et al. 1999). We found that there was no correlation between number of observations and home-range size when the total number of sightings for an individual was greater than five. Average home-range size was 95·7 ± 3·3 km2 for adult males, 64·2 ± 3·4 km2 for adult females and 51·0 ± 7·7 km2 for juveniles. Giraffe were excluded from subsequent analyses if they were seen fewer than five times (n = 2) or if we failed to collect a faecal sample from them (n = 14).

Faecal Sample Collection and Genetic Analysis

We collected faecal samples from 194 giraffe. Because there can be significant turnover of E. coli subtypes in the gut (Anderson, Whitlock & Harwood 2006), faecal samples were collected during a brief time period to ensure comparability (10 August 2011 – 11 September 2011). Four faecal samples were collected after this period, but within 3 weeks of the end of the primary sampling period. Faecal samples were collected immediately after defecation was observed and transported on ice to the field laboratory. Samples were streaked for bacterial isolation onto CHROMagar EC agar (CHROMagar, Paris, France), a selective chromogenic agar that exhibits high specificity for E. coli. After overnight incubation at 37 °C, four randomly selected E. coli colony isolates were cultured and then frozen. Using data from captive giraffe, which hosted approximately two subtypes of E. coli per individual (unpublished data), we calculated that there was a <10% probability of failing to capture subtype diversity in the gut if four isolates were taken (Singer et al. 2000). Genetic subtypes were determined using BOX-PCR and gel electrophoresis, which is a well-established method for discriminating between genetically similar E. coli subtypes (Cesaris et al. 2007; Mohapatra & Mazumder 2008). See Appendix S1 in the online supporting information for detailed laboratory procedures.

Densitometric profiles for each isolate were generated from the banding patterns revealed by BOX-PCR and gel electrophoresis (Johnson & O'Bryan 2000; Goldberg, Gillespie & Singer 2006). Similarity of each isolate to all others was determined through pairwise comparisons of densitometric curves (Fig. S1). Subtypes were considered to be matching if their densitometric curves were >90% similar (Pearson's correlation coefficient). Based on a reproducibility analysis conducted in our laboratory, this threshold value minimizes Type I errors in matching to 1% while limiting the Type II error rate to <5%. Among the 776 E. coli isolates that were analysed, 134 genetically distinct subtypes were found in the OPC giraffe population. On average, an individual giraffe hosted approximately 1·7 subtypes.
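The matching rule described above can be illustrated with a minimal sketch. It assumes that each densitometric curve has already been reduced to a list of intensity values sampled at the same gel positions; the curves, the function names and the way the 90% threshold is applied to the raw Pearson coefficient are illustrative assumptions, not the laboratory pipeline used in the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length curves."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("curves must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a flat curve has no defined correlation")
    return sxy / sqrt(sxx * syy)


def same_subtype(curve_a, curve_b, threshold=0.90):
    """True if two curves are more similar than the matching threshold."""
    return pearson(curve_a, curve_b) > threshold


# Hypothetical curves: curve_b has the same banding pattern as curve_a at a
# different overall intensity; curve_c has bands in different positions.
curve_a = [0.1, 0.9, 0.2, 0.1, 0.8, 0.1, 0.3, 0.7]
curve_b = [0.2, 1.7, 0.5, 0.2, 1.6, 0.3, 0.6, 1.5]
curve_c = [0.8, 0.1, 0.1, 0.9, 0.1, 0.7, 0.1, 0.2]

print(same_subtype(curve_a, curve_b))  # same banding pattern
print(same_subtype(curve_a, curve_c))  # different banding pattern
```

Because the Pearson coefficient is insensitive to overall intensity, two isolates with the same banding pattern match even if one lane is brighter than the other.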

Network Construction

We constructed one transmission network and three contact networks representing three types of contact: (i) a social network, (ii) spatial network and (iii) water-sharing network. All networks contained the exact same set of individuals (N = 194). For the transmission network, two individuals were considered to be part of the same transmission chain if they shared at least one E. coli subtype. In this network, individuals were linked according to subtype sharing. Connections between individuals, called ‘ties’, were unweighted (i.e. two individuals either shared subtypes [1], or they did not [0]).
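As a minimal sketch of the transmission-network rule, the snippet below links every pair of hosts that share at least one subtype. The host identifiers and subtype labels are hypothetical, and the dictionary-of-sets input format is an assumption made for illustration.

```python
from itertools import combinations


def transmission_ties(subtypes_by_host):
    """Return the unweighted ties between hosts sharing >= 1 subtype.

    Each tie is a tuple (a, b) with a < b, so every dyad appears once.
    """
    ties = set()
    for a, b in combinations(sorted(subtypes_by_host), 2):
        if subtypes_by_host[a] & subtypes_by_host[b]:  # non-empty intersection
            ties.add((a, b))
    return ties


# Hypothetical hosts and the E. coli subtypes isolated from each.
hosts = {
    "G01": {"s12", "s40"},
    "G02": {"s40"},
    "G03": {"s07"},
    "G04": {"s07", "s12"},
}

ties = transmission_ties(hosts)
print(sorted(ties))
```

A tie records only whether subtypes were shared (1) or not (0); the number of shared subtypes is deliberately ignored, in line with the unweighted definition above.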

A social network was constructed from observed association patterns. We calculated the association strength (AS) between every pair of individuals as the total number of observations they were seen together divided by the total number they were seen together or apart. Pairs with non-zero AS were linked in the social network, with ties weighted according to the value of their AS.
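The association strength calculation can be sketched as follows. The sketch assumes that "together or apart" refers to all sightings in which at least one member of the dyad was observed (a simple ratio index); the sighting records and individual labels are hypothetical.

```python
def association_strength(sightings, a, b):
    """Sightings with both a and b, divided by sightings with a or b."""
    together = sum(1 for group in sightings if a in group and b in group)
    either = sum(1 for group in sightings if a in group or b in group)
    return together / either if either else 0.0


# Hypothetical group sightings; each set is one observed group.
sightings = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A"},
    {"B", "C"},
    {"C"},
]

print(association_strength(sightings, "A", "B"))  # 2 of 4 sightings
print(association_strength(sightings, "A", "C"))  # 1 of 5 sightings
```

Dyads that were never seen together receive a value of zero and are therefore not linked in the social network.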

We also constructed a spatial network using the extent of home-range overlap to connect individuals. Home-range overlap between two giraffe was defined as the number of 1-km2 grid squares that fell within both individuals' home ranges, divided by the total size of both individuals' home ranges. These dyadic spatial overlap values were used to connect giraffe in the spatial network, with ties weighted according to the extent to which their home ranges overlapped.
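One possible reading of the overlap measure is sketched below. It assumes that each home range is stored as a set of 1-km2 grid cells and that "the total size of both individuals' home ranges" means the area of their union; if the denominator was instead the sum of the two areas, the values would differ. The cell coordinates are hypothetical.

```python
def home_range_overlap(cells_a, cells_b):
    """Shared grid cells divided by the cells covered by either home range.

    Assumption: the denominator is the union of the two ranges.
    """
    union = cells_a | cells_b
    if not union:
        return 0.0
    return len(cells_a & cells_b) / len(union)


# Hypothetical home ranges as sets of (column, row) grid cells.
range_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
range_b = {(1, 0), (1, 1), (2, 0), (2, 1), (3, 0), (3, 1)}

print(home_range_overlap(range_a, range_b))  # 2 shared cells out of 8
```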

Because water points may serve as an environmental source of bacteria, we also constructed a network based on shared water points. Water points could be rivers, streams, dammed streams or concrete water troughs constructed for cattle. No observations of individual usage of water points were made. Rather, it was assumed that animals have a high probability of using water points within their home range. We counted the number of permanent water points that fell within each animal's home range, and then calculated the proportion of water points that were shared between each pair. These dyadic values were used to connect giraffe in a water-sharing network, with ties being weighted according to the number of water sources shared.

Network-Level Analysis – To What Extent Do the Contact Networks Predict the Transmission Network?

If subtype sharing is dependent on social contact, then AS in the social network should be correlated with the occurrence of links in the transmission network. However, if environmental transmission dominates transmission dynamics, then correlations should be highest between the transmission and spatial networks. If drinking from the same water source is the most critical component leading to environmental transmission, then individuals that have a higher probability of drinking from the same water sources would be more likely to share E. coli subtypes, and the water-sharing network should yield the highest correlation with the transmission network compared with the other contact networks.

To determine the extent to which the contact networks (social, spatial, water-sharing) influenced the transmission network, we used the multiple regression quadratic assignment procedure (MR-QAP), a method of matrix regression developed for network data (Krackhardt 1988; Dekker, Krackhardt & Snijders 2007). Each network was represented as an adjacency matrix, where each cell denoted the relationship between the ith and jth giraffe. Essentially, MR-QAP coerces matrices into vectors and then performs a standard logistic regression on the log odds of a link occurring in the transmission network given the dyad's social and spatial relationships. Relational data used to construct networks have the potential to be autocorrelated, given that every element in the jth row of the matrix is associated with a single individual. Thus, traditional standard error and P-value estimates are potentially biased because the assumption of independence is violated; a single individual appears in multiple dyadic relationships. Therefore, MR-QAP uses a Monte Carlo method, in which rows and columns are randomly permuted within matrices, to determine the significance of regression coefficients (Dekker, Krackhardt & Snijders 2007). Using MR-QAP with double Dekker semipartialling and 1000 permutations (Dekker, Krackhardt & Snijders 2007), we investigated the effect of tie strength (AS, home-range overlap, shared water sources) in the contact networks on the log odds of a tie occurring in the transmission network. The transmission network was regressed against each contact network separately in univariate models and against the contact networks in combination with multivariate analyses (Table 1). Analysis was performed using the ‘sna’ package of R (Butts 2010).
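The permutation logic behind QAP can be illustrated with a deliberately simplified sketch. It is not the analysis described above: it uses a single Pearson correlation between two matrices in place of logistic regression, and it omits double Dekker semipartialling. It shows only the core idea that rows and columns are permuted together, so that each individual keeps its full set of dyadic values. The matrices, function names and two-sided definition of the P-value are assumptions.

```python
import random
from math import sqrt


def symmetric_matrix(n, ties):
    """Build an n x n symmetric matrix from {(i, j): weight}."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), w in ties.items():
        m[i][j] = m[j][i] = w
    return m


def upper_triangle(m):
    """Dyadic values above the diagonal, as a flat vector."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("matrix has no variation among dyads")
    return sxy / sqrt(sxx * syy)


def permute_nodes(m, order):
    """Relabel individuals: rows and columns are permuted together."""
    return [[m[i][j] for j in order] for i in order]


def qap_correlation(y, x, n_perm=1000, seed=0):
    """Observed matrix correlation and its two-sided permutation P-value."""
    rng = random.Random(seed)
    x_vec = upper_triangle(x)
    observed = pearson(upper_triangle(y), x_vec)
    order = list(range(len(y)))
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        r = pearson(upper_triangle(permute_nodes(y, order)), x_vec)
        if abs(r) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm


# Hypothetical five-individual networks.
transmission = symmetric_matrix(5, {(0, 1): 1, (0, 2): 1, (1, 2): 1, (3, 4): 1})
social = symmetric_matrix(
    5, {(0, 1): 0.4, (0, 2): 0.3, (1, 2): 0.5, (3, 4): 0.6, (0, 3): 0.05}
)

observed, p_value = qap_correlation(transmission, social)
print(observed, p_value)
```

Permuting node labels rather than individual cells preserves the dependence among dyads that involve the same animal, which is why the resulting null distribution is appropriate for relational data.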

Table 1. Effects of association strength and spatial overlap on the probability (log odds) that two animals were linked in the transmission network. Regression coefficients were estimated through standard logistic regression. P-values were based on MR-QAP permutation tests

Model   Predictor              Coefficient   P-value
A       Association strength   1·22          0·02
B       Spatial overlap        0·03          0·91
C       Shared water sources   0·02          0·92
D       Association strength   2·26          0·07
        Spatial overlap        −0·50         0·25
E       Association strength   2·13          0·03
        Shared water sources   −0·41         0·19

Individual-Level Analysis – Does an Individual's Position in the Contact Networks Predict Its Position in the Transmission Network?

We calculated the centrality and connectivity of each individual in the transmission network using five established measures of individual-level connectivity: degree, betweenness, closeness, eigenvector centrality and information centrality (see Appendix S2 for metric definitions). Because many of these metrics were highly correlated with each other, we chose to focus our analysis on degree and betweenness, which are two of the most common and easily interpretable metrics used in the literature. Degree is defined as the number of individuals to which the focal animal is linked. Betweenness is a measure of centrality that is based on how many paths pass through the focal individual if the shortest paths between every other pair of individuals are traced (Wasserman & Faust 1994). Thus, in the context of pathogen transmission, it quantifies the extent to which an individual serves as a conduit or bottleneck for the flow of pathogens through a network. Individuals with high betweenness can be considered to ‘mediate’ or ‘regulate’ flow of pathogens between sections of the network that would otherwise be poorly connected. These individuals have the potential to play a large role in regulating pathogen spread in the network (Borgatti 1995).
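To make the two focal metrics concrete, the sketch below computes degree and unweighted betweenness for a small hypothetical network in which one individual (D) bridges two otherwise separate clusters. Betweenness is computed with Brandes' algorithm, a standard method that is swapped in here for illustration; the source does not state which routine was used for the transmission network.

```python
from collections import deque


def degree(adj):
    """Number of neighbours of each individual."""
    return {v: len(neighbours) for v, neighbours in adj.items()}


def betweenness(adj):
    """Unweighted betweenness (Brandes' algorithm), each dyad counted once."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search, counting shortest paths from the source.
        visited_order = []
        predecessors = {v: [] for v in adj}
        n_paths = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visited_order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate path dependencies from the farthest nodes inwards.
        dependency = {v: 0.0 for v in adj}
        while visited_order:
            w = visited_order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    # Each undirected dyad was visited from both ends, so halve the totals.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical network: triangles A-B-C and E-F-G joined only through D.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D", "F", "G"},
    "F": {"E", "G"},
    "G": {"E", "F"},
}

print(degree(adj))
print(betweenness(adj))
```

In this example D has a low degree (two neighbours) but the highest betweenness, because every shortest path between the two clusters passes through it. This is the sense in which a bottleneck individual need not be a hub.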

We also calculated these measures for each individual in each contact network. Because links in the social and spatial networks were weighted according to tie strength, we used weighted versions of degree and betweenness (Newman 2001; Opsahl 2009). Weighted degree (hereafter: social/spatial degree) simultaneously accounts for the number of individuals the focal animal is connected to and the strength of those connections (tie strength). Overall tie strength, which is closely related to degree, is the sum of the weights of all links connected to the focal animal, regardless of the total number of neighbours. To illustrate the distinction between degree and overall tie strength, envision two individuals: individual A has only one neighbour, but they are linked with a tie strength of 10; individual B has 10 neighbours, but the tie strength with each neighbour is only 1. Overall tie strength does not distinguish between A and B's connectivity (summed weight of links = 10), but B is scored more highly for weighted degree because it has a greater diversity of neighbours. Thus, overall tie strength and weighted degree capture different elements of an individual's connectivity. Weighted measures were calculated in the ‘tnet’ package in R (Opsahl 2009).
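The A and B example can be worked through numerically. The sketch assumes that weighted degree takes the form k × (s/k)^α, where k is the number of neighbours, s is the summed tie weight and α is a tuning parameter; α = 0·5 is an assumption made here for illustration, as the value used in the analysis is not stated.

```python
def overall_tie_strength(weights):
    """Sum of the weights of all ties attached to the focal individual."""
    return sum(weights)


def weighted_degree(weights, alpha=0.5):
    """k * (s / k) ** alpha, with k neighbours and summed weight s.

    alpha = 0.5 is an assumed tuning value, not taken from the source.
    """
    k = sum(1 for w in weights if w > 0)
    if k == 0:
        return 0.0
    s = sum(weights)
    return k * (s / k) ** alpha


ties_a = [10]        # individual A: one neighbour, tie strength 10
ties_b = [1] * 10    # individual B: ten neighbours, tie strength 1 each

print(overall_tie_strength(ties_a), overall_tie_strength(ties_b))
print(weighted_degree(ties_a), weighted_degree(ties_b))
```

Both individuals have an overall tie strength of 10, but B scores 10 for weighted degree whereas A scores the square root of 10 (about 3·16), reflecting B's greater diversity of neighbours.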

To address whether an individual's position in a contact network influenced its position in the transmission network, we used general linear models (GLMs) to examine correlations between transmission degree and betweenness, home-range size, and connectivity in the contact networks. Measures of social and spatial network connectivity included weighted betweenness, weighted degree and overall tie strength. We also used GLMs to examine the effect of home-range size on an individual's connectivity in the contact networks and the effect of the spatial network on social network connectivity. All GLMs were univariate because high levels of multicollinearity among the network metrics precluded multivariate analysis. Because of possible non-independence concerns in network data, regression coefficients were determined using GLMs while P-values were calculated via permutation methods in which the order of y was randomized relative to x (Hanneman & Riddle 2005). The regression coefficient was recalculated after each permutation of the data (3000 total permutations), generating a distribution of coefficients in which the relationship of y to x was calculated on randomized data. P-values were defined as the proportion of permutations that produced coefficients more extreme than the observed value. Because degree exhibited a bimodal distribution, degree was ranked for GLM analyses. Higher ranks indicate higher degree.
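The permutation procedure for individual-level models can be sketched as follows. The sketch assumes a simple linear model with one predictor, a two-sided definition of "more extreme" (absolute coefficient at least as large as the observed one) and average ranks for tied values; the data and function names are hypothetical.

```python
import random


def slope(x, y):
    """Least-squares regression coefficient of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def rank(values):
    """Ranks from 1 (lowest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def permutation_p(x, y, n_perm=3000, seed=0):
    """Observed coefficient and the proportion of permutations at least as extreme."""
    rng = random.Random(seed)
    observed = slope(x, y)
    shuffled = list(y)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # randomize the order of y relative to x
        if abs(slope(x, shuffled)) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm


# Hypothetical data: social degree (x) and transmission degree (y).
social_degree = [3.1, 4.0, 5.2, 6.8, 7.5, 8.1, 9.4, 10.2]
transmission_degree = [2, 2, 5, 4, 9, 9, 12, 15]

coefficient, p_value = permutation_p(social_degree, rank(transmission_degree))
print(coefficient, p_value)
```

Shuffling y breaks any relationship with x while leaving both marginal distributions intact, so the permuted coefficients describe what would be expected if network position in one network carried no information about position in the other.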

Individual-level analyses of the water-sharing network were not included because this network was nearly identical to the spatial network. Centrality metrics of individuals in the spatial overlap and water-sharing networks were highly correlated (r > 0·9), and the relationships of these values to transmission network centrality were very similar.

Results

We tested the extent to which the transmission network was predicted by three types of contact networks. Only AS in the social network was significantly correlated with the presence of a transmission link (i.e. sharing E. coli subtypes); individuals that associated more frequently were more likely to be part of the same E. coli transmission chain (Table 1, Fig. 1). Contact among individuals via shared space use, whether measured by home-range overlap or shared water sources, was not significantly correlated with sharing E. coli subtypes. Thus, the presence and strength of links in the social network best predicted the transmission network. The higher correspondence of the transmission network to the social network can be visually assessed in Fig. 2, which depicts a subset of the transmission network (N = 30 randomly selected giraffe) alongside the spatial and social networks.

Figure 1.

Relationship between association strength (AS) in the social network and the probability of sharing genetic subtypes of E. coli in the transmission network. Shading indicates a 95% confidence interval around the regression line. Data points indicate the proportion of dyads (±SE) that shared subtypes. Dyads were binned by association strength (AS ≤ 0·05, n = 15 151 dyads; 0·05 < AS ≤ 0·1, n = 1991 dyads; 0·1 < AS ≤ 0·15, n = 873 dyads; 0·15 < AS ≤ 0·2, n = 355 dyads; 0·2 < AS ≤ 0·5, n = 324 dyads; AS > 0·5, n = 27 dyads).

Figure 2.

Comparison of the (a) spatial, (b) social and (c) transmission networks. For visualization purposes, only links that exceed the mean tie strength + 1 SD are pictured in the social and spatial network (Home-range overlap > 36·4%, AS > 8·9%), and only a randomly selected subset of 30 individuals is shown.

At the individual level, we investigated the extent to which an animal's position in one network was correlated with its position in another network (Fig. 3). Only models with significant coefficients are presented in Fig. 3 (See Table S1 for full model output). A giraffe's degree in the transmission network was positively correlated with its degree in the social network, indicating that individual giraffe with higher social degree and overall social tie strength tended to have a greater number of connections in the transmission network. Home-range size and measures of spatial degree and tie strength were not correlated with transmission degree, which mirrors the results of the MR-QAP analysis.

Figure 3.

Inter-relationships among individual-level measures of connectivity. All relationships depicted are positive and significant in univariate general linear models (P < 0·05), except for the relationships between transmission and social betweenness (P = 0·075) and between social betweenness and spatial degree (P = 0·079). Regression coefficients are noted next to each arrow. Individual connectivity measures in the transmission network (betweenness/degree) were correlated with the same measure in the social network, but not correlated with spatial connectivity measures or home-range size. Social connectivity was influenced by home-range size and spatial connectivity.

An individual's transmission betweenness was positively correlated with social betweenness, although this trend was not significant (P = 0·075). Transmission betweenness was unaffected by connectivity in the spatial network or measures of social degree. It is also interesting to note that social betweenness is positively correlated with spatial degree, likely because animals that are well connected in the spatial network are positioned ideally to connect social communities. Individuals whose home ranges overlapped with a large number of others tended to have a greater diversity of associates (social degree), but not higher social tie strength.

Discussion

Our results clearly indicate that patterns of social interaction play a stronger role in determining patterns of E. coli transmission than space use. Both network- and individual-level analyses found greater correspondence of the transmission network to the social network as compared to the spatial network. Association strength between dyads in the social network was strongly correlated with the occurrence of a transmission link between that pair of individuals. In contrast, shared space use had no effect on the occurrence of links in the transmission network, regardless of whether shared space use was measured as home-range overlap or shared water sources (Table 1). Measures of individual connectivity paralleled these findings in that an individual's position (degree/betweenness) in the transmission network was highly correlated with its position in the social network, but not the spatial network. Individuals with greater numbers of social connections (social degree) tended to have more connections in the transmission network. In other words, individuals that functioned as social hubs were also transmission hubs. Furthermore, giraffe with high social betweenness may bridge socially distinct clusters of individuals. These same individuals also tended to serve as bridges in the transmission network by occupying positions of high flow (high transmission betweenness) and were more likely to be positioned along potential bottlenecks for pathogen spread in the network. In an epidemic, bottlenecks may function as firebreaks for pathogen spread, but when these individuals do become infected, the pathogen may spread to new regions of the network (Salathé & Jones 2010).

While there seems to be greater relative importance of social patterns over space use in determining the transmission network, our results do not negate a role for shared space use. Rather, the importance of spatial patterns on transmission seemed to be mediated through the role that the spatial network played in determining the social network. Specifically, individuals with high degree in the spatial network tended to have higher social degree and betweenness, which were in turn strongly correlated with higher transmission degree and betweenness. Giraffe whose home ranges overlapped with a greater diversity of individuals also tended to have higher social degree and social betweenness (Fig. 3). In addition, pairs of individuals that had greater home-range overlap had significantly higher AS (VanderWaal et al. 2013b), and AS was positively correlated with the likelihood of sharing E. coli subtypes. These results indicate that space-use patterns play a key role in determining the social network, which in turn affects pathogen transmission patterns.

Indeed, when the contact networks are visually compared (Fig. 2a,b), the social network appears to be a pared-down, or filtered, version of the spatial network. While overlapping in space was a prerequisite for social interaction, some giraffe that shared space did not associate. The social network can be considered to simply apply a spatiotemporal filter to the spatial overlap data: dyads must not only share space, but also be at the same place at the same time to be linked in the social network. The spatial ties that were eliminated by this spatiotemporal (social) filter were ones that provided little information about transmission opportunities. It is also apparent that not all social contacts resulted in sharing E. coli subtypes (Fig. 2c). These findings about the inter-relationship between social, spatial and transmission patterns within a population have broad applicability to other host–pathogen systems.
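The filtering idea can be sketched in a few lines. The example below is a deliberate simplification: it uses hypothetical sighting records of the form (individual, location, day) and treats any shared location as a spatial tie, whereas the study measured shared space use as home-range overlap or shared water sources and social ties as association strength. All identifiers and records are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sighting records: (individual, location, day).
sightings = [
    ("G1", "north", 1), ("G2", "north", 1),
    ("G3", "north", 5),
    ("G3", "south", 2), ("G4", "south", 2),
]

def co_occurrence_edges(records, key):
    """Link every pair of individuals whose records share the same key."""
    groups = defaultdict(set)
    for rec in records:
        groups[key(rec)].add(rec[0])
    edges = set()
    for members in groups.values():
        edges.update(combinations(sorted(members), 2))
    return edges

# Spatial tie: same place at any time.
spatial_edges = co_occurrence_edges(sightings, key=lambda r: r[1])
# Social tie: same place at the same time (the spatiotemporal filter).
social_edges = co_occurrence_edges(sightings, key=lambda r: (r[1], r[2]))

# Spatial ties removed by the filter: shared space, but never together.
filtered_out = spatial_edges - social_edges
```

Here `G3` used the northern location on a different day from `G1` and `G2`, so both of its spatial ties to them are removed by the filter, while the `G1`-`G2` and `G3`-`G4` ties are retained. By construction, the social edge set is always a subset of the spatial edge set.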

Other studies indicate that E. coli is primarily environmentally transmitted through the ingestion of faecally contaminated water and forage, although there is evidence that direct transmission is important in certain situations (Besser et al. 2001; Henderson 2008; Turner et al. 2008). While giraffe do engage in some tactile contact during greetings and fights, direct contact is not a substantial feature of social interactions (Bashaw et al. 2007). Thus, the importance of social interactions in transmitting E. coli seems counter-intuitive at first glance. It is possible that the role of environmental transmission may be reduced in giraffe because they rarely forage on the ground (Young & Isbell 1991), and forage contamination by gastrointestinal parasites is minimized at higher feeding heights (Apio, Plath & Wronski 2006). Perhaps more likely, however, is that social associations led to synchronous space use. Even though E. coli can persist in the environment for months, densities rapidly drop off in the first 2 weeks (Medema, Bahar & Schets 1997; Avery et al. 2008) and inactivation by sunlight further reduces populations (Sinton, Hall & Braithwaite 2007). If two giraffe frequently associate, then they may drink from the same water source at the same point in time. Each would then be exposed to the same waterborne E. coli. Thus, frequent association may enhance spatiotemporal synchrony in exposure to environmental sources of E. coli.

Because of the difficulty of quantifying who transmitted to whom in wildlife populations, previous studies have typically used an individual's social network connectivity as a risk factor for infection. For example, individuals with higher degree in the social network are more likely to be infected (Godfrey et al. 2009; Porphyre, McKenzie & Stevenson 2011), or animals that engage in certain types of behaviour are at higher risk (Drewe 2009). In a recent study that incorporated microbial genetics, the approach was similar to previous risk factor analyses in that social network connectivity was analysed as a risk factor for being infected with specific genetic strains of Salmonella (Bull, Godfrey & Gordon 2012). Bull, Godfrey & Gordon (2012) did perform a dyad-level analysis on the likelihood of two individuals sharing Salmonella strains, and similar to our study, they found that strain sharing was influenced by the strength of the pair's social relationship rather than spatial proximity. Our approach builds upon their work by conceptualizing the strain-sharing data as a distinct network and then quantifying individual-level connectivity in this transmission network. Thus, our approach allows us to directly compare contact networks to transmission networks. This opens the door to new lines of research in network epidemiology, such as the extent to which animals occupy similar positions in contact and transmission networks, characterizing super-spreaders in transmission networks, or the effect of environmental change on the structure and connectivity of transmission networks.
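As a minimal sketch of this conceptual step, the example below turns strain-sharing data into a network by linking two individuals whenever they share at least one E. coli subtype, and then computes each individual's transmission degree. The individual identifiers, subtype labels and function names are hypothetical, and the sketch omits the genetic subtyping itself.

```python
from itertools import combinations

# Hypothetical data: the set of E. coli subtypes isolated from each individual.
subtypes_by_host = {
    "G1": {"s1", "s2"},
    "G2": {"s2"},
    "G3": {"s3"},
    "G4": {"s1", "s3"},
}

def transmission_edges(subtypes):
    """Link two hosts if they share at least one subtype."""
    return {
        (a, b)
        for a, b in combinations(sorted(subtypes), 2)
        if subtypes[a] & subtypes[b]
    }

def degree(edges, nodes):
    """Number of transmission links per host (0 for unlinked hosts)."""
    deg = {n: 0 for n in nodes}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg

edges = transmission_edges(subtypes_by_host)
trans_degree = degree(edges, subtypes_by_host)
```

A link in this network indicates only that two hosts belong to the same inferred transmission chain, whether through direct or indirect transmission. It does not identify who transmitted to whom. Once the data are in this form, the same connectivity measures used for contact networks can be applied to the transmission network.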

In the novel approach used here, we integrated host behavioural data with microbial genetic data to provide a detailed picture of how contact and transmission patterns are related. While not usually pathogenic, E. coli is a useful proxy for pathogen transmission because it allows us to study transmission pathways without waiting for a clinical epidemic or making post hoc conclusions about transmission patterns after an epidemic has occurred. Transmission routes demonstrated by E. coli are most applicable to other faecal–oral pathogens that are epidemiologically similar. Although this study focused on a commensal organism, this is the first study to our knowledge to utilize this integrative approach to construct a network for any transmissible agent in an animal population. These methods can be employed to demonstrate possible routes of transmission through an ecosystem and are broadly applicable across studies of both intra- and interspecific routes of transmission.


We gratefully thank N. Sharpe, P. Buckham-Bonnet, J. Lillienstein and S. Preckler-Quisquater for invaluable assistance in the field; the Atwill laboratory and E. Wang for assistance in laboratory work; K. Gitahi of the University of Nairobi, OPC staff, especially N. Gichohi and G.O. Paul, and the Office of the President of the Republic of Kenya for enabling various facets of the research; and V. Ezenwa, A. Sih and T. Young for valuable comments during the development of the project. This research was approved by Kenya's National Council for Science and Technology (Permit NCST/RRI/12/1/MAS/147) and the UC Davis Institutional Animal Care and Use Committee (protocol no. 15887). This work was supported by the National Science Foundation (Doctoral Dissertation Improvement Grant IOS-1209338), Phoenix Zoo, Oregon Zoo, Sigma Xi, the Animal Behavior Society, the American Society of Mammalogists, Explorer's Club, Northeastern Wisconsin Zoo, Cleveland Metroparks Zoo, Cleveland Zoological Society, the UC Davis Wildlife Health Center, and the UC Davis Faculty Research Grant programme. K.L.V. was supported by a National Science Foundation Graduate Research Fellowship. We also thank several anonymous reviewers for their careful and considered comments.