The population and landscape genetics of the European badger (Meles meles) in Ireland

Abstract The population genetic structure of free-ranging species is expected to reflect landscape-level effects. Quantifying the role of these factors and their relative contribution often has important implications for wildlife management. The population genetics of the European badger (Meles meles) have received considerable attention, not least because the species acts as a potential wildlife reservoir for bovine tuberculosis (bTB) in Britain and Ireland. Herein, we detail the most comprehensive population and landscape genetic study of the badger in Ireland to date, comprising 454 Irish badger samples genotyped at 14 microsatellite loci. Bayesian and multivariate clustering methods demonstrated continuous clinal variation across the island, with potentially distinct differentiation observed in Northern Ireland. Landscape genetic analyses identified geographic distance and elevation as the primary drivers of genetic differentiation, in keeping with badgers exhibiting high levels of philopatry. Other factors hypothesized to affect gene flow, including earthworm habitat suitability, land cover type, and the River Shannon, had little to no detectable effect. By providing a more accurate picture of badger population structure and the factors affecting it, these data can guide current efforts to manage the species in Ireland and to better understand its role in bTB.

This can provide key information for wildlife management, including cases where species act as reservoirs or vectors of pathogen infection (Frantz, Pope, Etherington, Wilson, & Burke, 2010; Kierepka & Latch, 2016a,b; Pope, Domingo-Roura, Erven, & Burke, 2006). The European badger (Meles meles; Figure 1) is the largest terrestrial carnivore in Britain and Ireland, and is of significant ecological (e.g., ecosystem engineer) and economic importance (as a suspected reservoir of bovine tuberculosis, bTB) across this territory (Roper, 2010). A significant body of work has been undertaken to elucidate the species' role in cattle bTB epidemiology (Roper, 2010). The species' genetic structure at continental scale has been studied extensively (Del Cerro, Fernando, Chaschin, Taberlet, & Bosch, 2010; Frantz et al., 2014; Marmi et al., 2006; O'Meara et al., 2012). In contrast, there is limited information available on badger population genetics at the national scale, which is likely to be the scale most relevant to management. To date, there has been no large island-wide survey of the genetic structure of the badger in Ireland.
Studies from across Europe have noted that badgers exhibit limited dispersal, that is, philopatry (Pope et al., 2006). Dispersal distance seems to be inversely proportional to population density, with a large proportion of individuals exhibiting philopatry at high densities (Frantz, Cellina, Krier, Schley, & Burke, 2009). Although badger densities in Ireland are typically not as high as those in southern Britain (e.g., Woodchester Park; 0.4 setts/km² vs. 2.88 setts/km², respectively), they can still be considered relatively high compared to other European populations (Byrne, Sleeman, O'Keefe, & Davenport, 2012; Pope et al., 2006). It is also noteworthy that, whilst general philopatry appears to hold in Ireland, mark-recapture studies of Irish badgers have documented rare long-distance dispersal of up to 22.1 km (Byrne, Quinn, et al., 2014).
Aside from geographic distance, other landscape features likely affect gene flow of badgers. Water bodies and motorways have been observed to hinder European badger gene flow (Frantz et al., 2010). Furthermore, badgers have generally been recorded at low altitudes (<200 m; Byrne et al., 2012), and their abundance, habitat selection, and foraging behavior are positively associated with land use categories such as pasture, forested areas, and grasslands, whereas urban and arable land are generally avoided (Byrne et al., 2012; Hammond, McGrath, & Martin, 2001).
On the other hand, the effect of biotic interactions on the gene flow of organisms in general has been little studied (Hand, Lowe, Kovach, Muhlfeld, & Luikart, 2015), despite the crucial insights that such research could provide. In this sense, badgers are a particularly interesting system because they are generally assumed to be earthworm (Lumbricus terrestris) specialists (Kruuk & Parish, 1981; Muldowney, Curry, O'Keefe, & Schmidt, 2003), which could result in gene flow being strongly affected by earthworm availability.
Conversely, there is some indication that in Ireland, the diet of the badger varies seasonally and is less reliant on earthworms than observed elsewhere (Cleary, Corner, O'Keefe, & Marples, 2009), which could result in little effect of prey availability on gene flow.
In light of the above, applying landscape genetics to study the effect of landscape features and biotic interactions on badger gene flow may help to inform more fully on the ecology of the species in Ireland. The latter could be of benefit in developing a better understanding of how/whether badger population structure influences bovine tuberculosis epidemiology. In this study, therefore, we aim to provide the first comprehensive, large-scale assessment of genetic population structure of the badger across Ireland and to identify the landscape features which have likely shaped it. Specifically, we studied the influence of geographic distance, landscape variables (elevation, land cover, and Ireland's only continental-scale river, the Shannon), and biotic interactions (earthworm availability) on badger gene flow.
For a full breakdown of numbers of animals submitted per county across Ireland, see Supporting Information Table S1. GPS locations for all animals are found in Supporting Information Data S1. The map in Figure

| DNA extraction
We extracted DNA from all tissue samples using a Qiagen DNeasy tissue mini kit (Qiagen, Crawley, West Sussex, United Kingdom).
Extracted DNA was stored at −20°C until PCR amplification for either microsatellite or mitochondrial DNA sequence analysis.

| Genotyping quality control
We regenotyped a random selection of 5% of all extracted DNA samples tested. All microsatellite data were subjected to analysis by

| Population genetic clustering
We determined standard population genetic indices of diversity from the microsatellite data using GENEPOP v4.2 (Rousset, 2008), any potential subpopulation's history in Ireland, ancient or recent divergence of all subpopulations from a common ancestral population were both plausible scenarios, as was the possibility of different founding populations being translocated to Ireland by human agency, as has previously been inferred (Frantz et al., 2014;O'Meara et al., 2012).
All latter scenarios would have had consequences for heterogeneity in observed patterns of genetic relatedness and divergence among extant subpopulations; therefore, we ran STRUCTURE models accounting for both correlated and independent allele frequencies and assessed which produced the highest log likelihood for the inferred best fitting values of K. To infer the best fitting number of subpopulations (K), we used the ΔK method of Evanno, Regnaut, and Goudet (2005) over consecutive values from K = 1 to K = 10 with a burn-in of 50,000 and a Markov chain length of 100,000, for 20 iterations per K value. Convergence of key statistics along the burn-in chain was assessed as per the STRUCTURE manual. We then extracted and analyzed the data using STRUCTURE Harvester (Earl & vonHoldt, 2012). Data for each K value (n = 20) were processed by the program CLUMPP (Jakobsson & Rosenberg, 2007), with final illustrations produced using DISTRUCT (Rosenberg, 2004). We assigned individual badgers which exhibited 85% of their genetic heritage, or greater, to specific STRUCTURE defined subpopulations.
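The ΔK statistic described above can be illustrated with a short sketch. This is not the code used in the study (STRUCTURE Harvester performed that step); it is a minimal Python illustration that assumes the replicate log likelihoods L(K) are already available, and it uses the common simplification of taking the second-order difference of the mean L(K) across replicates. The function name `evanno_delta_k` and the example values are hypothetical.

```python
from statistics import mean, stdev

def evanno_delta_k(lnp):
    """Delta K from replicate log likelihoods.

    lnp maps each K to a list of replicate L(K) values.
    Delta K is only defined where both K-1 and K+1 were also run.
    Assumes replicates at each K are not all identical (non-zero SD).
    """
    mean_l = {k: mean(v) for k, v in lnp.items()}
    delta = {}
    for k in sorted(lnp):
        if k - 1 in lnp and k + 1 in lnp:
            # |L(K+1) - 2 L(K) + L(K-1)| computed on the mean L(K)
            second_diff = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
            delta[k] = second_diff / stdev(lnp[k])
    return delta

# Hypothetical log likelihoods, three replicates per K (not study data)
example = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4700.0, -4701.0, -4699.0],
    3: [-4650.0, -4655.0, -4645.0],
    4: [-4640.0, -4660.0, -4620.0],
}
delta_k = evanno_delta_k(example)
best_k = max(delta_k, key=delta_k.get)
```

Because ΔK needs both neighbours of K, it cannot be evaluated at the smallest or largest K tested, which is one reason the L(K) curve is usually inspected alongside it.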
Assignment thresholds for other animal species have been set at a variety of other values: 50% for the American badger (Kierepka & Latch, 2016b), 70% for red deer and jaguars (Dellicour et al., 2011; Wultsch et al., 2016), and up to 90% for other mustelids (Cegelski, Waits, & Anderson, 2003) and reptiles (Gaillard et al., 2017). Given the reduced genetic diversity of the European badger in Ireland, the general philopatry of the species (Pope et al., 2006), and the continuous sampling structure we employed, we decided it would be best to use a threshold of 85%.
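The threshold rule can be expressed in a few lines. The sketch below is illustrative only: it assumes each individual is represented by a list of membership coefficients (one per inferred cluster, as produced by STRUCTURE and averaged by CLUMPP), and the function name `assign_cluster` is hypothetical.

```python
def assign_cluster(q, threshold=0.85):
    """Return the 1-based index of the cluster whose membership
    coefficient is at least `threshold`, or None if the individual
    is treated as admixed (no coefficient reaches the threshold)."""
    best = max(range(len(q)), key=lambda i: q[i])
    return best + 1 if q[best] >= threshold else None

# Hypothetical membership coefficients for K = 5
assigned = assign_cluster([0.02, 0.91, 0.03, 0.02, 0.02])   # assigned to cluster 2
admixed = assign_cluster([0.45, 0.40, 0.05, 0.05, 0.05])    # None, i.e. admixed
```

Raising the threshold increases confidence in each assignment but moves more individuals into the admixed category.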
We quantified genetic differentiation between inferred STRUCTURE populations by calculating pairwise FST values using FSTAT 2.9.3.2 (Goudet, 1995). Statistical significance of pairwise values was tested by permutation with corrections for multiple comparisons. Genetic differentiation between pairs of populations was also quantified using Jost's D statistic (Jost, 2008) calculated by the mmod package (Winter, 2012) in the R environment v3.2.2 (R Development Core Team, 2008). All population data were mapped using ArcGIS ArcMAP 10 using latitude and longitude coordinates based on the Irish Grid (ESRI, 2011).
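To show what Jost's D measures, the following sketch computes it for a single locus from known allele frequencies, using D = [(HT − HS)/(1 − HS)] × [n/(n − 1)] for n populations. It is a simplified illustration: it assumes equal weighting of populations and omits the small-sample bias corrections that estimators such as those in mmod apply. Function names and frequencies are hypothetical.

```python
def expected_het(freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), from allele frequencies."""
    return 1.0 - sum(p * p for p in freqs.values())

def jost_d(pops):
    """Jost's D for one locus.

    pops: list of dicts mapping allele label -> frequency in that population.
    No sampling-bias correction is applied (assumption for illustration).
    """
    n = len(pops)
    alleles = set().union(*pops)
    mean_freqs = {a: sum(p.get(a, 0.0) for p in pops) / n for a in alleles}
    h_s = sum(expected_het(p) for p in pops) / n   # mean within-population
    h_t = expected_het(mean_freqs)                 # pooled
    return (h_t - h_s) / (1.0 - h_s) * n / (n - 1)

# Two hypothetical populations sharing no alleles: complete differentiation
d_distinct = jost_d([{"A": 0.5, "B": 0.5}, {"C": 0.5, "D": 0.5}])
# Two populations with identical frequencies: no differentiation
d_same = jost_d([{"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}])
```

In the first case D reaches 1 even though a fixation index of the form (HT − HS)/HT would be only about 0.33, which is why the two statistics are often reported together for highly variable markers.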
The STRUCTURE clustering algorithm works by minimizing linkage disequilibrium between markers and departures from Hardy-Weinberg equilibrium among individuals in assigned populations (Pritchard et al., 2000; Wilkinson, Haley, Alderson, & Wiener, 2011). In continuously distributed species in which there is clinal genetic differentiation with isolation by distance, the algorithm can assign populations arbitrarily, so care must be taken in interpreting the data (Frantz et al., 2009; Pritchard et al., 2000). In line with this concern, we opted to make use of an additional multivariate clustering algorithm (Frantz et al., 2009) that does not make assumptions about linkage disequilibrium and Hardy-Weinberg equilibrium. Discriminant analysis of principal components (DAPC) is such a method, more suited to the investigation of population substructure in continuously distributed species that exhibit clinal genetic variation (Jombart, Devillard, & Balloux, 2010). We implemented the DAPC method in the adegenet package (Jombart, 2008) in the R environment v3.2.2. We first used the find.clusters function to assign individual samples to proposed subpopulations, retaining all 70 principal components to infer a range of possible clusters. We then applied the DAPC analysis function to the upper and lower values of this range, in both cases retaining 30 principal components and all linear discriminants, to produce scatterplots for both the upper and lower values of K. As with STRUCTURE outputs, all population data were mapped using ArcGIS ArcMAP 10 using latitude and longitude coordinates based on the Irish Grid (ESRI, 2011).

| Landscape genetics analyses
In order to examine the influence of geographic distance, landscape variables, and biotic interactions on badger gene flow, we combined Mantel tests, multiple regression on distance matrices (MRM), and redundancy analysis (RDA). Analyses were conducted with the subset of individuals for which precise coordinates of sampling were available (n = 433).

| Mantel tests and Multiple regression on distance matrices (MRM)
As a first step, we estimated interindividual genetic distances (Smouse & Peakall, 1999) using the R package PopGenReport (Adamack & Gruber, 2014). Next, we used ArcGIS to generate resistance surfaces (rasters) representing the hypothesized resistance a particular environmental feature poses to badger gene flow (McRae, 2006). These surfaces were generated for land cover, elevation, earthworm availability, and geographic distance.
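For readers unfamiliar with the Smouse and Peakall (1999) distance, the sketch below shows the per-locus calculation for codominant diploid genotypes: each genotype is coded as a vector of allele counts, and the squared distance is half the sum of squared differences between the two vectors. This is an illustration rather than the PopGenReport implementation; the function names and the choice to skip loci with missing data are assumptions.

```python
from collections import Counter

def locus_distance(g1, g2):
    """Squared distance between two diploid genotypes at one locus.

    g1, g2: tuples of two allele labels, e.g. (152, 156).
    Gives 0 for identical genotypes, 1 for AA vs AB, 4 for AA vs BB.
    """
    c1, c2 = Counter(g1), Counter(g2)
    return 0.5 * sum((c1[a] - c2[a]) ** 2 for a in set(c1) | set(c2))

def individual_distance(ind1, ind2):
    """Sum of per-locus distances; loci missing (None) in either
    individual are skipped (an assumption made for this sketch)."""
    return sum(locus_distance(a, b) for a, b in zip(ind1, ind2)
               if a is not None and b is not None)

# Hypothetical three-locus genotypes (allele sizes in base pairs)
badger_a = [(152, 152), (200, 204), (310, 310)]
badger_b = [(152, 156), (200, 204), None]
dist_ab = individual_distance(badger_a, badger_b)
```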
To generate land cover surfaces, we used the CORINE land cover data set (EEA, 1995) and generated two types of resistance surfaces: (a) based on broad land cover categories (CORINE Level 1), we assigned low resistance to forest and seminatural areas, intermediate resistance to agricultural areas, and high resistance to artificial surfaces; (b) within said categories (CORINE Level 2), we made further distinctions: within artificial surfaces, we lowered resistance for artificial, nonagricultural vegetated areas; within agricultural areas, we assigned low resistance to pastures and high resistance to arable lands and within forest and seminatural areas, we assigned highest resistance to open spaces with no vegetation.
Overall, taking into account general observations on badger ecology (Byrne et al., 2012; Hammond et al., 2001), we assumed that open areas (whether artificial or natural) posed higher resistance to gene flow than areas with vegetation cover. Resistance ratios for land cover surfaces were varied (1:10:100 vs. 1:100:1,000 vs. 1:100:10,000), thus generating six surfaces in total. Details on these surfaces and ratios are available in Supporting Information Data S2. We obtained a digital elevation model from CGIAR (http://srtm.csi.cgiar.org/), and two surfaces were generated, one maintaining raw elevation (masl) and one using a threshold of 200 m above sea level to assign low vs. high resistance (1:100) because badgers tend to avoid elevations beyond this threshold (Byrne et al., 2012).
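The reclassification logic behind these surfaces can be sketched as follows. The study built its rasters in ArcGIS; the Python below is only an illustration on small nested lists, with hypothetical class labels for the three broad CORINE Level 1 categories named above, and with the assumption that cells strictly above 200 m receive the high resistance value.

```python
# Hypothetical labels, ordered from low to high assumed resistance
LEVEL1_CLASSES = ("forest_seminatural", "agricultural", "artificial")

# The three resistance ratios described in the text
RATIOS = [(1, 10, 100), (1, 100, 1000), (1, 100, 10000)]

def land_cover_resistance(class_grid, ratio):
    """Replace each land cover label with its resistance value."""
    lookup = dict(zip(LEVEL1_CLASSES, ratio))
    return [[lookup[cell] for cell in row] for row in class_grid]

def elevation_resistance(elev_grid, threshold_m=200, low=1, high=100):
    """Binary elevation surface: low resistance at or below the
    threshold, high resistance above it (1:100 by default)."""
    return [[high if cell > threshold_m else low for cell in row]
            for row in elev_grid]

# Tiny hypothetical rasters (2 x 2 cells)
cover = [["agricultural", "forest_seminatural"],
         ["artificial", "agricultural"]]
elevation = [[35, 180], [240, 610]]

cover_surfaces = [land_cover_resistance(cover, r) for r in RATIOS]
elev_surface = elevation_resistance(elevation)
```

Testing several ratios for the same ranking of classes is a way of checking whether conclusions depend on the arbitrary magnitude of the resistance values rather than on their order.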
To obtain a resistance surface related to earthworm availability, we used MaxEnt (Phillips, Anderson, & Schapire, 2006) to generate a raster of earthworm relative habitat suitability (EHS; Merow, Smith, & Silander, 2013). We obtained available records of the species (n = 30) from the Global Biodiversity Information Facility, GBIF (http://www.gbif.org/), and reviewed relevant literature on the ecology of Lumbricus terrestris and related taxa (Marchán et al., 2015; Rutgers et al., 2016) in order to select environmental variables to use as input in MaxEnt analyses. Coordinates for all L. terrestris records used in the MaxEnt analysis are shown in Supporting Information Table S2.
Environmental variable rasters were tested for correlation using ArcGIS v10.2 to avoid redundancy. Because no rasters were highly correlated (r < 0.80), they were used together as input in MaxEnt software, where analyses were run with default settings. The MaxEnt output model had an area under the curve (AUC) of 0.791, which is considered acceptable predictive accuracy (Araújo, Pearson, Thuiller, & Erhard, 2005). The variables with the highest percent contribution to the model were BIO5 (31.3%) and BIO12 (30.2%), followed by pH (20.2%), BIO4 (9.8%), and silt (8.6%). Overall, habitat suitability increased with increasing values of BIO5 (maximum temperature of the warmest month), whilst the opposite was true for BIO12 (annual precipitation), for which habitat suitability decreased at values beyond ~770 mm (see Supporting Information Figure S2). The output habitat suitability raster is shown in Figure 3.
To test for isolation by distance (IBD), we generated a "flat" resistance surface in which all cells had the same value (=1). This is an alternative to using Euclidean distance that accounts for the finite size of the landscape (Dudaniec, Spear, Richardson, & Storfer, 2012).
From the generated resistance surfaces, we obtained resistance-distance matrices using the software Circuitscape 4.0 (McRae, Dickson, Keitt, & Shah, 2008). For land cover, elevation, and "flat" surfaces, we used default settings, which assume that these factors inhibit badger gene flow (i.e., raster values = resistance). For EHS, settings were modified so that raster values represented "conductance" (i.e., habitat suitability ranging from 0.001 to 0.950) because we expected high EHS to facilitate badger gene flow. Finally, to test whether the River Shannon acts as a barrier to gene flow, we used individual badger locations to generate a "barrier matrix" in which values indicated whether individuals had been sampled on the same side of the river (=1) or on opposite sides (=100). Once we obtained the resistance matrices, we used the package ecodist (Goslee & Urban, 2007) in R to conduct Mantel tests and MRM. Although the use of Mantel (and related) tests in landscape genetics has been criticized (Legendre & Fortin, 2010), recent simulations have shown that they are highly effective at detecting isolation by distance (Kierepka & Latch, 2014). Hence, we performed a simple Mantel test between badger genetic distance and the "flat" matrix to test for IBD. Next, we performed partial Mantel tests on genetic and landscape resistance matrices, whilst controlling for "flat" distance. For both simple and partial Mantel tests, the significance of correlations (Spearman) was determined from 10,000 permutations. Partial Mantel tests identified the resistance matrices that showed significant correlation (p < 0.05) with genetic distance matrices. These matrices were checked for correlation with one another using the "cor" function in R, and only those without strong correlation (r < 0.8) were retained for MRM analyses. In MRM models, matrices were used as predictors of genetic distance.
Model selection was done using a backward elimination approach, with a "threshold" p < 0.05. Significance of regression coefficients and R-square values was assessed from 10,000 permutations.

| Redundancy analysis
Ordination techniques, such as redundancy analysis (RDA), are increasingly used in landscape genetics studies given their power to
We conducted RDA in the R package vegan (Oksanen, Kindt, Legendre, & O'Hara, 2008). We first built a "full" model using all predictors together and checked for multicollinearity among them with the function "vif.cca." Because all variance inflation factors were low (<5), we retained all predictors. Next, we used the function "ordistep" to conduct (backward) stepwise model selection and thus identify a "minimal" model. Having identified the minimal model, we conducted partial RDAs to estimate the amount of genetic variance explained solely by landscape/biotic variables controlling for geographic location (latitude/longitude) and that explained solely by geographic location controlling for landscape/biotic variables. All models were tested for significance using the function "anova.cca."

| Data quality assurance
Microsatellite retyping produced results identical to those initially obtained. Microchecker detected no evidence for genotyping errors or null alleles. Microsatellite allele calls for all samples are found in Supporting Information Data S1.

Indices of diversity, inbreeding fixation (FIS), and tests for Hardy-Weinberg equilibrium (HWE) across all of Ireland, NI, RoI, and the five Irish populations identified by STRUCTURE (see below) are shown in Table 1. Across all of Ireland, deviations from HWE were observed across 12 of the 14 loci genotyped (Table 1A).
Within the NI and RoI populations, seven and nine loci were out of HWE, respectively (Table 1B and C). Across the five subpopulations identified by STRUCTURE, between 1 and 2 loci deviated from HWE (Table 1D, E, F, G, and H). General population genetic indices of diversity across all of Ireland (see Table 1A) were similar to those described before by (O'Meara et al., 2012

| Clustering and assignment methods
The independent allele frequencies STRUCTURE model outputs indicated a plateauing of log likelihood of K (L(K)) around K = 4 or 5 (Supporting Information Figure S3A). The Evanno ΔK plot showed two peaks, the highest at K = 2 and a lower one at K = 4 (Supporting Information Figure S3B). The correlated allele frequencies STRUCTURE model outputs indicated a plateauing of the log likelihood of K (L(K)) around K = 5 (Supporting Information Figure S4A). The Evanno ΔK plot showed two peaks, the largest at K = 2 and a smaller one at K = 5 (Supporting Information Figure S4B). The correlated allele frequencies STRUCTURE model exhibited the highest mean log likelihood at all inferred values of K for 20 replicates compared to the independent allele frequencies model (Supporting Information Table S3). Consequently, we focused our efforts on the data from the correlated allele frequencies model. It has been noted before that the Evanno method can underestimate the true value of K when genetic differentiation is minimal between subpopulations, thereby preferentially finding the highest clustering hierarchy in a dataset (Waples & Gaggiotti, 2006).
In another study involving the American badger, Taxidea taxus, two peaks have been observed using the Evanno method (Kierepka & Latch, 2016a), with the second being suggestive of further substructure (Evanno et al., 2005; Kierepka & Latch, 2016a). We chose therefore to investigate both K = 2 and K = 5 for the British and Irish data. DISTRUCT admixture plots of the K = 2 and K = 5 badger populations are shown in Figures 3a,b,
Subpopulation IR1 was primarily located in the northeastern County of Down (Figures 2b and 4b). Subpopulation IR2 was largely localized to the northern Counties of Antrim and Derry (Figures 2b and 4b). Subpopulation IR3 was made up of few badgers, sporadically distributed across Counties Monaghan, Fermanagh, Leitrim, Sligo, and Roscommon (Figures 2b and 4b). Subpopulation IR4 was distributed in the southeast, principally in Counties Wicklow, Kildare, and Wexford (Figures 2b and 4b). Subpopulation IR5 was primarily located in the southwestern Counties of Clare, Cork, Kerry, Limerick, and Waterford. Pairwise FST and Jost's D statistics on subpopulation genetic differentiation for the K = 5 STRUCTURE analysis are shown in Table 2. All pairwise FST values between subpopulations were significantly greater than zero. Admixed animals, which could not be definitively assigned to any of the five identified subpopulations, were distributed across the midlands and northwestern counties (Figures 2b and 4b).
Supporting Information Figure S5 shows the curve of values of Bayesian information criterion (BIC) for each of the simulated values of K derived by find.clusters. The decreasing values of BIC begin to plateau at K = 7, reaching their lowest value at K = 10, before beginning to rise again (Supporting Information Figure S5). This type of pattern, with multiple possible values of K, is typical of "real world" scenarios involving continuously distributed species (Jombart, 2008). We therefore chose to apply the DAPC method to both the K = 7 and K = 10 assigned clusters. Scatterplots and associated geolocations for all subpopulations under both the K = 7 and K = 10 scenarios are illustrated in Figures 5 and 6, respectively. At K = 7, linear discriminant axis 1 (LD1) accounted for 30.4% of the observed genetic variance, whilst linear discriminant axis 2 (LD2) accounted for 17.4% of observed genetic variance (Figure 5a). At K = 10, LD1 accounted for 26.6% of the observed genetic variance, whilst (LD2) accounted for 16.0% of observed genetic variance (Figure 6a).
At K = 7, the DAPC scatterplot and geolocation plot (Figure 5a,b) suggested that badgers from counties in the Republic of Ireland (Clusters 1-5) were more genetically homogeneous, exhibiting considerable overlap in both genetic and physical space. Conversely, Clusters 6 and 7 were primarily located in Northern Ireland and formed more distinct clusters in genetic space (Figure 5a,b). At K = 10, a similar pattern was observed with badgers from the Republic of Ireland exhibiting less genetic differentiation from each other than when compared to those from Northern Ireland (Figure 6a,b). However, in comparison with the K = 7 scenario, the K = 10 clusters exhibited a pattern more in keeping with a gradual cline in genetic variance both in genetic and physical space across the island.
FIGURE 6 (a) DAPC K = 10 scatterplot of individual badgers assigned to the 10 inferred subpopulation clusters; (b) DAPC K = 10 geolocations of all individual badgers and assigned subpopulation clusters

| Landscape genetic analyses
The simple Mantel test showed a significant correlation between genetic distance and the flat resistance surface (r = 0.24; p < 0.05), confirming IBD. When controlling for flat resistance-distance in partial Mantel tests, genetic distance showed significant positive correlation with raw elevation (r = 0.09; p < 0.05) suggesting that elevation inhibits gene flow.

| DISCUSSION
In this study, we sought to better understand the population structure of the Irish badger and to determine how abiotic and biotic features of the Irish landscape affected gene flow and contributed to the extant population structure. Standard population genetic indices revealed island-wide evidence of population substructure. The number of loci observed to be out of Hardy-Weinberg equilibrium, and the higher values of the fixation index FIS across the whole island as a single unit and when split into its two political units, indicated a lack of panmixia over large distances (Table 1A, B, and C). Indeed, these data are consistent with a Wahlund effect, with some subpopulation differentiation and limited connectivity. Similar findings have been noted before in badgers across Europe (Pope et al., 2006).
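The Wahlund effect referred to here can be made concrete with a small worked example. The sketch below is illustrative and uses hypothetical numbers: it pools subpopulations that are each in Hardy-Weinberg equilibrium at a biallelic locus but differ in allele frequency, and shows that the pooled sample has fewer heterozygotes than expected, giving a positive FIS = 1 − Ho/He.

```python
def pooled_fis(subpops):
    """FIS of a pooled sample at a biallelic locus.

    subpops: list of (size, p) pairs, where p is the frequency of one
    allele and each subpopulation is assumed to be in HWE internally.
    """
    total = sum(n for n, _ in subpops)
    p_bar = sum(n * p for n, p in subpops) / total
    # Observed heterozygosity: weighted mean of within-subpopulation 2pq
    h_obs = sum(n * 2 * p * (1 - p) for n, p in subpops) / total
    # Expected heterozygosity if the pooled sample were one random-mating unit
    h_exp = 2 * p_bar * (1 - p_bar)
    return 1 - h_obs / h_exp

# Two equally sized hypothetical subpopulations with divergent frequencies
fis_structured = pooled_fis([(100, 0.2), (100, 0.8)])
# Same frequencies in both subpopulations: no heterozygote deficit
fis_uniform = pooled_fis([(100, 0.5), (100, 0.5)])
```

With frequencies of 0.2 and 0.8, observed heterozygosity is 0.32 against an expectation of 0.50, so FIS is 0.36 even though mating is random within each subpopulation.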
Interestingly, RoI and NI exhibited very similar levels of heterozygosity and allelic diversity (Table 1A and B). RoI occupies approximately five times the landmass of NI, and whilst there are regional differences in land type and suitability for badgers (Byrne, Acevedo, Green, & O'Keeffe, 2014; Reid, Etherington, Wilson, Montgomery, & McDonald, 2012), one might have expected to see a more diverse RoI badger population. That this is not the case may be a result of the ongoing culling efforts in RoI (Sheridan, 2011). However, without baseline precull data, it is difficult to be certain that this is the case.
Wide scale culling has not been a feature of TB control schemes in Northern Ireland where badger populations have remained stable over many years (Reid et al., 2012).
Regarding the inferred population differentiation, STRUCTURE analysis indicated two levels of hierarchical clustering in the microsatellite data. At K = 2, there was an apparent northeastern to southwestern cline in badger genetic differentiation. From the data we present in this study, the reason for such structuring is not apparent.
It may however have something to do with the way in which Ireland was populated by badgers in the past. The DAPC data for both K = 7 and K = 10 scenarios, particularly as pertaining to Northern Irish badgers being more genetically distinct than their southern contemporaries, support the STRUCTURE K = 2 inference that there is po-  (Byrne, Quinn, et al., 2014). Both MRM and RDA models showed that elevation is related to genetic variation in badgers, which could indicate that gene flow is hindered by upland habitat. This is in accordance with data on badger habitat selection, with setts less commonly found in upland vegetation types (Byrne, Acevedo, et al., 2014;Hammond et al., 2001;Reid et al., 2012).
In terms of other environmental variables, only EHS appeared to affect genetic variance in badgers, but its influence was only detected through RDA and the variance explained was rather low (up to 2% combined with elevation). This pattern indicates that earthworm availability has a low influence on badger dispersal/gene flow, consistent with previous suggestions that Irish badgers have less specialized diets than populations elsewhere (Cleary et al., 2009).
No other landscape features seemed to affect badger gene flow in Ireland, including the River Shannon. Although the River Shannon is a sizeable waterway, depth varies along its course and there are numerous man-made structures such as bridges and canals that could allow badgers to cross. Indeed, it has been previously shown that rivers do not always represent impermeable barriers to badger gene flow (Frantz et al., 2010) or dispersal (Sleeman et al., 2009).
For a species with strict habitat requirements and limited dispersal ability, strong associations among habitats and genetic differentiation are expected (Kierepka & Latch, 2016b). However, despite evidence for limited dispersal in Irish badgers (i.e., IBD), and indications that aspects of badger ecology are affected by land cover (Byrne et al., 2012;Hammond et al., 2001), the latter was not related to badger genetic variation in landscape genetic analyses. Previous research has suggested that badgers in Ireland are less ecologically specialized than badgers in other parts of their range. For instance, several studies within Ireland have recorded setts in the vicinity of seemingly "unsuitable" habitats such as roadways, graveyards, and railways (Byrne et al., 2012), indicating the species is tolerant of human disturbance and can make use of a number of land cover types.
Our overall findings thus highlight the importance of spatial replication in landscape genetics studies (Castillo et al., 2016), as populations across a species' range may differ in their ecological interactions and requirements. Furthermore, the effect of particular landscape features on gene flow can strongly depend on the degree of landscape heterogeneity (Bull et al., 2011), which for Ireland could be classified as low.

| CONCLUDING REMARKS
Our data demonstrate that geographic distance and elevation are preeminent drivers of continuous, clinal, genetic variation in contemporary Irish badgers. These data can guide management of the species in Ireland, particularly in the context of controlling bovine tuberculosis in cattle. Knowledge of genetic population structure allows us to formulate testable hypotheses about how this level of partitioning might affect spatial disease patterns in the pathogen (Biek & Real, 2010).
Additionally, whilst philopatry appears to be the norm in the Irish badger and may contribute to maintenance of local M. bovis clusters, a small proportion of dispersers have been observed to move larger distances (Byrne, Quinn, et al., 2014). Given that there appear to be no major physical barriers to gene flow for badgers in Ireland, infected animals dispersing over a wider scale may therefore be a risk for disease spread. Our data and findings represent a strong foundation and timely opportunity to address these pertinent issues in the future.

ACKNOWLEDGMENTS
The authors extend their thanks to Dr Olaf Schmidt, University College Dublin, for his advice on data handling for EHS rasters. This research was funded by the Department of Agriculture, Environment and Rural Affairs for Northern Ireland (DAERA-NI) and the Department of Agriculture, Food and the Marine (DAFM), Republic of Ireland.

CONFLICT OF INTEREST
The authors declare no competing interest.

AUTHORS' CONTRIBUTIONS
JG participated in the design of the study, carried out statistical analysis, and drafted the manuscript. AB participated in the design of the study and drafted the manuscript. JL, EP, and GK carried out the molecular laboratory work. EC, JO'K, UF, and DO'M collected field samples. DE helped with statistical analysis. CM carried out field sampling and mapping. RB and RS participated in the design of the study and drafted the manuscript. AA conceived the study, participated in its design, drafted the manuscript, and performed statistical analysis and molecular laboratory work. All authors gave final approval for publication.

DATA ACCESSIBILITY
All badger genotypic data are available from the Dryad Digital