The genetic network of greater sage‐grouse: Range‐wide identification of keystone hubs of connectivity

Abstract Genetic networks can characterize complex genetic relationships among groups of individuals, which can be used to rank nodes most important to the overall connectivity of the system. Ranking allows scarce resources to be guided toward nodes integral to connectivity. The greater sage‐grouse (Centrocercus urophasianus) is a species of conservation concern that breeds on spatially discrete leks that must remain connected by genetic exchange for population persistence. We genotyped 5,950 individuals from 1,200 greater sage‐grouse leks distributed across the entire species’ geographic range. We found a small‐world network composed of 458 nodes connected by 14,481 edges. This network was composed of hubs—that is, nodes facilitating gene flow across the network—and spokes—that is, nodes where connectivity is served by hubs. It is within these hubs that the greatest genetic diversity was housed. Using indices of network centrality, we identified hub nodes of greatest conservation importance. We also identified keystone nodes with elevated centrality despite low local population size. Hub and keystone nodes were found across the entire species’ contiguous range, although nodes with elevated importance to network‐wide connectivity were found more central: especially in northeastern, central, and southwestern Wyoming and eastern Idaho. Nodes among which genes are most readily exchanged were mostly located in Montana and northern Wyoming, as well as Utah and eastern Nevada. The loss of hub or keystone nodes could lead to the disintegration of the network into smaller, isolated subnetworks. Protecting both hub nodes and keystone nodes will conserve genetic diversity and should maintain network connections to ensure a resilient and viable population over time. Our analysis shows that network models can be used to model gene flow, offering insights into its pattern and process, with application to prioritizing landscapes for conservation.


| INTRODUCTION
Understanding population structure and quantifying genetic connectivity are important for guiding ongoing conservation and restoration efforts (Crooks & Sanjayan, 2006). Traditionally, population structure is first analyzed and subpopulations delineated, then genetic connectivity among subpopulations is quantified. However, this process need not be completed in two stages. Genetic network models can be used to simultaneously gain an understanding of population structure and to quantify genetic connectivity among populations in natural systems (Bunn, Urban, & Keitt, 2000;Dyer, 2007;Dyer & Nason, 2004).
Genetic networks are constructed of components called nodes and edges, where nodes may represent populations and edges represent genetic connectivity among nodes (Sallaberry, Zaidi, & Melançon, 2013). Each node can be weighted by the genetic diversity within it and each edge by the genetic covariance among local populations (Bunn et al., 2000). The overall structure of the network provides a means by which to rank the importance of each component to maintaining network connectivity (Jacoby & Freeman, 2016). One can think of network structure in terms of the commercial airline model. In such models, nodes and edges are known, as are nodes of high and low connectivity (henceforth, hub nodes and spoke nodes, respectively). In the airline industry, hub nodes are strategically selected to maximize efficiency of air traffic, while spoke nodes are selected based on limited need for services. For wildlife populations, however, it is generally unknown which populations serve as hub nodes and which serve as spoke nodes.
Qualifying genetic network structure and identifying nodes that act as hubs can be very informative to conservation and management of wildlife species (Garroway, Bowman, Carr, & Wilson, 2008; Lookingbill, Gardner, Ferrari, & Keller, 2010). Knowledge of which nodes are connected to one another and which nodes rank highly in network centrality can facilitate prioritization for management (Jacoby & Freeman, 2016). Prior network modeling of wildlife populations has shown that which nodes function as hubs and which function as spokes is not intuitive (Bunn et al., 2000; Garroway, Bowman, & Wilson, 2011; Garroway et al., 2008; Koen, Bowman, & Wilson, 2015). One might expect a node's proximity to the center of the species' range to influence that node's importance to connectivity, with centrally located nodes having greater genetic exchange than peripheral nodes. However, it has been shown that populations at the periphery of a species' range can act as critical hub nodes, connecting populations across the network, and that populations located toward the center of the range do not necessarily function as hub nodes (Bunn et al., 2000).
Emergent properties of genetic networks can be used to identify hub nodes and spoke nodes and the sensitivity of the entire network to the loss of connectivity (Dunne, Williams, & Martinez, 2002). There are three common network structures: (1) single-scale ("regular"), (2) broad-scale ("random"), and (3) small-world, a subset of which are known as scale-free (Amaral, Scala, Barthélémy, & Stanley, 2000; Bray, 2003). Regular networks are highly structured such that proximal nodes tend to be linked to each other, while distant nodes tend not to be linked: a structure comparable to the isolation by distance pattern commonly discovered in the population genetics literature (e.g., a stepping-stone model; Wright, 1943). In a regular genetic network, genetic connectivity is between neighboring nodes and nodes separated by a greater number of edges will be more isolated from one another. In regular networks, hub nodes are nonexistent as all nodes are equally connected. Random networks are unstructured such that proximity of nodes is irrelevant to whether nodes are connected or not and to the strength of connections: a structure most similar to the theoretical island model first proposed by Wright (1931) and analogous to the population genetic concept of panmixia. In a random genetic network, genetic connectivity is unencumbered across the entire network because the number of steps between any two nodes is relatively small such that close and distant nodes have equal chances of being linked. In random networks, there are no hub nodes, but there exist thoroughfares through the network that foster quick transit among any set of nodes. In contrast, small-world networks are composed of few highly connected nodes (hub nodes) and a greater number of more isolated nodes (spoke nodes), much like the hub-and-spoke model characteristic of the familiar commercial airline model.
Most nodes can be reached from every other node by a small number of steps, often routed through central hub nodes, which foster connectivity among the spoke nodes. Redundancy is an important characteristic of small-world networks. In small-world genetic networks, genetic connectivity is greatest among nearest neighbor nodes, but genetic connectivity can exist between any two nodes by a small number of steps through hub nodes. Hub nodes are nodes at which genetic connectivity is concentrated, such that they serve to connect other, more distal nodes (the spoke nodes). An extreme form of small-world networks is scale-free networks. In scale-free networks, there is less redundancy in internode connections and greater centrality for the hub nodes.
Within any network's structure, individual node importance to network connectivity can be quantified by centrality indices. There are several centrality indices, each of which quantifies the importance of a node to network connectivity in a different way (Table 1). Some centrality indices rank nodes based on local connectivity and some based on network-wide connectivity. Therefore, these function-valued centrality indices can be easily transformed into node-specific rankings, and these rankings can be used to prioritize conservation (Jacoby & Freeman, 2016).
Sagebrush provides essential cover, is a staple of the species' diet, and is where the bird nests and rears its broods; sage-grouse also congregate in the spring to display and breed on leks (Beever & Aldridge, 2011; Hagen, Connelly, & Schroeder, 2007; Patterson, 1952; Remington & Braun, 1985; Wallestad & Eng, 1975). On leks, males battle with one another to claim the center and energetically display to potential mates. Lek attendance by males is significantly correlated with female lek attendance (Bradbury, Vehrencamp, & Gibson, 1989).
Sage-grouse are known to disperse during the breeding season and are capable of long-distance breeding dispersal movements (Cross et al., 2017). While the fashion of long-distance dispersal movements is unknown, most migratory movements are made in stepping-stone fashion (Tack, 2009), and short-distance abrupt singular movements are common when suitable habitat is lacking (Dunn & Braun, 1986). The lek mating system of sage-grouse is well suited to network analyses because leks are fairly fixed spatial locations. Given the species' patterns of dispersal, we would expect that network structure should be composed of clustered, hub node-like nodes characteristic of a small-world network (Garroway et al., 2008).
TABLE 1 Network parameters used to quantify connectivity, the unit for which each is calculated, and the definition of the parameter, and relation of the parameter as pertains to the greater sage-grouse population network. All but characteristic path length and weight are centrality indices
landscapes. Perhaps more importantly, network analysis will greatly benefit planning in identifying conservation targets with the greatest benefit to maintaining genetic connectivity.
In this study, we had two primary objectives. First, we sought to determine the network structure of connectivity among leks, weighting edges by genetic divergence based on genetic covariance among leks. Second, we sought to identify which leks were important to maintaining overall population connectivity and persistence using network centrality indices. Within this second objective, we also sought to identify keystone nodes, that is, nodes that act as more important to maintaining gene flow than their size or location within the species range alone would indicate.

| Study area and sampling
We used 16,420 spatially referenced sage-grouse feather and blood (2016), following a smaller pilot sample. Feather samples were collected from leks using noninvasive methods (Bush, Vinsky, Aldridge, & Paszkowski, 2005;Segelbacher, 2002) after having been dropped by sage-grouse during breeding activity, while blood samples were collected from sage-grouse on leks as part of radiotelemetry field research. The only location throughout the entire distribution of the species that we did not use was Washington State because samples from this location were collected during a different period (from 1992 to 1999) than the rest of the samples.

| DNA extraction
Genetic analysis was conducted at two molecular biological labo-

| Microsatellite DNA amplification and genotyping
We based our analysis upon a panel of neutral, polymorphic microsatellite loci both to identify individuals from noninvasively collected samples (of unknown individual origin) and to quantify relatedness (i.e., functional movement resulting in gene flow

| Network construction
A minimum of four individuals per node is required to calculate within-node genetic variation (Dyer, 2014b). Therefore, before constructing the network, we performed a hierarchical clustering analysis of lek locations. First, we calculated a distance-based tree using the geographic coordinates for the leks from which the 6,242 individual samples were collected (using the HCLUST function in base R). Second, we clustered all lek locations within 15 km of one another (cut distance implemented using the CUTREE function in base R). We selected 15 km as the cut distance, as this is the best estimate of median breeding dispersal distance among leks for sage-grouse (Cross et al., 2017). Third, we removed any resultant clusters of leks composed of fewer than four individuals.
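The three clustering steps can be illustrated with a minimal Python sketch. It is not the authors' R code: it assumes complete linkage (the linkage method is not stated in the text), great-circle distances on a spherical Earth of radius 6,371 km, and hypothetical inputs (`coords` as latitude and longitude pairs, `individuals_per_lek` as sample counts). The function names are illustrative only.

```python
import math
from collections import Counter

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def cluster_leks(coords, cut_km=15.0):
    """Agglomerative clustering (complete linkage, an assumption), cut at cut_km.

    Returns one cluster label per lek.
    """
    clusters = [[i] for i in range(len(coords))]

    def linkage(c1, c2):
        # Complete linkage: the largest pairwise distance between two clusters.
        return max(haversine_km(coords[i], coords[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > cut_km:  # no remaining merge falls within the cut distance
            break
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(coords)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def keep_nodes(labels, individuals_per_lek, min_individuals=4):
    """Labels of clusters that retain at least min_individuals sampled birds."""
    totals = Counter()
    for label, n in zip(labels, individuals_per_lek):
        totals[label] += n
    return {label for label, n in totals.items() if n >= min_individuals}
```

The naive merge loop is cubic in the number of leks, which is adequate for illustration but slow for range-wide data.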
Following clustering, we constructed a weighted population network among the resulting clusters, which we henceforth refer to as nodes. For all clustered samples and for all nodes, we calculated the mean and standard deviation for number of alleles per locus, effective number of alleles, expected and observed heterozygosity, and F_IS. We estimated genetic covariance among nodes, where microsatellite genetic covariance represents the weight of each network edge connecting nodes. We used the packages GSTUDIO (Dyer, 2014a) and POPGRAPH (Dyer, 2014b) in program R to estimate the conditional genetic covariance network following the methods of Dyer and Nason (2004) using default parameters (α = 0.05 and tolerance = 1 × 10⁻⁴; Garroway et al., 2008). Following pruning using the recommended settings, the resultant minimal incidence matrix contained the smallest set of edges that sufficiently capture the among-node genetic covariance structure (Dyer & Nason, 2004).
We also calculated the minimum spanning tree, which is the subset of network edges that connect all nodes together with the maximum genetic covariance among nodes (edge weight), without any cycles.
To test for structure within the minimum spanning tree, we tested for correlation between weighted (factoring genetic covariance) distances among nodes in the minimum spanning tree and geographic distance (great circle distance) among all nodes calculated using the RDIST.EARTH function in the FIELDS package (Nychka, Furrer, Paige, & Sain, 2015) in R.
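A minimal sketch of the spanning tree described above, assuming edges are supplied as hypothetical `(u, v, covariance)` tuples with integer node ids. It uses Kruskal's algorithm with a union-find structure and processes edges from highest to lowest covariance, so the result connects all nodes with the greatest total edge weight and no cycles. The authors' own implementation is not given in the text.

```python
def maximum_covariance_spanning_tree(n_nodes, edges):
    """Kruskal's algorithm over (u, v, covariance) edges, highest covariance first."""
    parent = list(range(n_nodes))

    def find(x):
        # Find the root of x, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # adding this edge creates no cycle
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree
```

For a connected network the returned tree contains exactly `n_nodes - 1` edges.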

| Network structure determination
To determine the network structure, we compared the degree distribution, clustering coefficient, and characteristic path length of the sage-grouse genetic network to that of 1000 Erdos-Renyi model random networks with the same number of nodes, edges, and edge weight distribution as the range-wide sage-grouse genetic network.
The characteristic path length is defined as the average shortest path length between all pairs of nodes in the network, and it provides an understanding of how long it takes alleles to traverse the network.
We generated the random networks using package IGRAPH (Csardi & Nepusz, 2006) in program R and tested for significant differences between the degree distribution, clustering coefficient, and characteristic path length of the true sage-grouse network and the random networks using permutation tests, following the methods of Garroway et al. (2008). We used the results of these comparisons to determine whether network structure was purely a function of the number of nodes and edges or whether network structure was a result of nonrandom processes. For example, if we found a characteristic path length that did not deviate significantly from that of the random networks coupled with a significantly higher clustering coefficient than that of the random networks, then we could conclude that the network had small-world or scale-free characteristics (Watts & Strogatz, 1998). Furthermore, if we found a degree distribution that did not follow the power law (which would indicate a scale-free network), was not binomial (which would indicate a random network) or fixed (which would indicate a regular network), but was instead fat-tailed, we would conclude that the likely network structure was that of the hub-and-spoke small-world network.
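The comparison against random networks can be sketched in Python as follows. This is a simplified illustration, not the IGRAPH workflow used in the study. It treats the network as unweighted, ignores the edge weight distribution, draws random graphs with a fixed number of nodes and edges, and reports only the proportion of random graphs whose statistic is at least as large as the observed one. It does not reproduce the permutation test of Garroway et al. (2008). All function names are hypothetical.

```python
import random
from collections import deque
from itertools import combinations

def adjacency(n, edges):
    """Adjacency sets for an undirected, unweighted graph on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def mean_clustering(adj):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbors count as 0."""
    values = []
    for neighbors in adj:
        k = len(neighbors)
        if k < 2:
            values.append(0.0)
            continue
        links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
        values.append(2 * links / (k * (k - 1)))
    return sum(values) / len(values)

def characteristic_path_length(adj):
    """Mean shortest path length (in edges) over all reachable pairs of nodes."""
    total, pairs = 0, 0
    for source in range(len(adj)):
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")

def compare_to_random(n, edges, n_random=1000, seed=1):
    """Observed statistics plus the share of random graphs that match or exceed them."""
    rng = random.Random(seed)
    observed = adjacency(n, edges)
    obs_c = mean_clustering(observed)
    obs_l = characteristic_path_length(observed)
    all_pairs = list(combinations(range(n), 2))
    count_c = count_l = 0
    for _ in range(n_random):
        rand = adjacency(n, rng.sample(all_pairs, len(edges)))
        count_c += mean_clustering(rand) >= obs_c
        count_l += characteristic_path_length(rand) >= obs_l
    return obs_c, obs_l, count_c / n_random, count_l / n_random
```

Under these assumptions, a small proportion for the clustering coefficient together with an unremarkable proportion for path length would point toward the small-world pattern described above.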
We quantified pairwise conditional genetic distance among all nodes. Conditional genetic distance is the length of the shortest path connecting each pair of nodes conditioned on network structure (Dyer, Nason, & Garrick, 2010) or the relative strength of the genetic covariance between nodes along the connecting edges (Koen, Bowman, Garroway, & Wilson, 2013). When compared to geographic distance among nodes, conditional genetic distance can provide insight into network structure. For example, if conditional genetic distance is correlated with geographic distance, one can conclude that the process of isolation by distance shaped a genetic network (Dyer et al., 2010).
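As an illustration of the shortest-path definition, the sketch below computes path lengths from one node with Dijkstra's algorithm. It assumes each edge already carries a nonnegative length that increases with genetic differentiation. How edge covariance is converted to such a length is not described in this section, so the `(u, v, length)` input format is hypothetical.

```python
import heapq

def shortest_path_lengths(n_nodes, edges, source):
    """Dijkstra's algorithm over undirected (u, v, length) edges with lengths >= 0."""
    adj = [[] for _ in range(n_nodes)]
    for u, v, length in edges:
        adj[u].append((v, length))
        adj[v].append((u, length))

    dist = [float("inf")] * n_nodes
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:  # stale heap entry, a shorter path was already found
            continue
        for v, length in adj[u]:
            candidate = d + length
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist
```

Running the function once per node yields the full pairwise matrix, which can then be compared with geographic distance.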
We calculated six centrality indices (Table 1) and used these metrics to quantify connectivity and relative isolation of each node across the network. To calculate standard error (SE) of the mean and median as well as their respective 95% confidence intervals (CI), we calculated 1000 resampled networks of 75% of the nodes.
Betweenness centrality quantifies the importance of a node in terms of the bottleneck to gene flow it creates, eigenvector centrality quantifies how connected a node is, and strength quantifies how strong the connection is between a node and all its neighboring nodes (Garroway et al., 2008). Eigenvector centrality is an index of both how well a node is connected and how well a node's immediate connections are connected; in essence, it measures both direct and indirect connectivity. These properties make eigenvector centrality a better index than betweenness if one is interested in quantifying the strength of connections. This is because eigenvector centrality increases not only with the immediate connectivity of the node of interest, but also with the immediate connectivity of the nodes to which it is connected. When quantifying connectivity, we used the centrality index of node strength rather than node degree, as Koen et al. (2015) found that strength more adequately depicts migration and gene flow than degree centrality.
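Two of these indices are simple enough to sketch directly. The Python below is an illustration under stated assumptions, not the IGRAPH implementation used in the study: strength is the sum of a node's edge weights, and eigenvector centrality is approximated by power iteration on the weighted adjacency matrix. The iteration adds the identity matrix (which leaves the eigenvectors unchanged) so that it also converges on bipartite graphs, and scores are scaled so the largest equals 1. Edge weights are assumed nonnegative and the graph connected.

```python
def strength(n_nodes, edges):
    """Sum of edge weights incident to each node, from (u, v, weight) edges."""
    totals = [0.0] * n_nodes
    for u, v, w in edges:
        totals[u] += w
        totals[v] += w
    return totals

def eigenvector_centrality(n_nodes, edges, iterations=1000, tol=1e-10):
    """Power iteration on (A + I); scores are scaled so the maximum is 1."""
    adj = [[] for _ in range(n_nodes)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    x = [1.0] * n_nodes
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbors' weighted scores.
        new = [x[i] + sum(w * x[j] for j, w in adj[i]) for i in range(n_nodes)]
        top = max(new)
        new = [value / top for value in new]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x
```

In a star-shaped network, for example, the central node scores 1 and every leaf scores lower, because each leaf's only neighbor is the center while the center is adjacent to every leaf.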
To examine relationships between network centrality indices, we tested for pairwise correlation between all indices using Spearman's rank correlation. We also tested for correlation between each network centrality index and mean peak male count per node (calculation described below). All network centrality indices were calculated using the IGRAPH package in R, and all correlations were calculated in base R.
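For reference, Spearman's rank correlation is the Pearson correlation of the ranked values, with tied observations given their average rank. A minimal Python sketch follows. It computes only the coefficient, not the significance test, and the function names are illustrative.

```python
import math

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```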
To identify hub nodes of genetic exchange, we screened all nodes within the top 1% of each network centrality index (Table 1), and within the top 50% of all node centrality indices combined in order to identify those nodes that were most important to local network (regional) and network-wide (range-wide) connectivity. These nodes represent the top hub nodes of genetic exchange that maintain connectivity at all scales. To identify spoke nodes, we identified the nodes with the lowest ranking for each centrality index.
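The screening rule can be sketched as two set operations. The Python below assumes each centrality index is supplied as a hypothetical dictionary mapping node id to score, and that a higher score means greater centrality. An index for which smaller values indicate greater connectivity would need to be negated first. Rounding the cutoff up to at least one node is also an assumption.

```python
import math

def top_fraction(scores, fraction):
    """Node ids in the top `fraction` of scores (always at least one node)."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def screen_hubs(indices, per_index=0.01, combined=0.50):
    """Top nodes for each index, plus nodes in the top half of every index."""
    top_each = {name: top_fraction(s, per_index) for name, s in indices.items()}
    top_all = set.intersection(
        *(top_fraction(s, combined) for s in indices.values())
    )
    return top_each, top_all
```

The intersection step is what makes the combined criterion strict: a node must rank in the upper half of every index at once.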

| Keystone nodes
We hypothesized that nodes composed of the most highly attended leks and the most geographically central nodes in the species' range would rank highest for centrality (i.e., node abundance and range centrality would be positively correlated with node centrality). To calculate abundance, we used the per-lek high male counts recorded between 2005 and 2015 (WAFWA, 2015) and calculated the mean peak male count per node over these years using all leks constituting each node (male lek attendance is significantly correlated with female lek attendance; Bradbury et al., 1989). Using mean peak male count, we tested for correlation with network centrality indices.
We defined range centrality as the great circle distance of each node from the geographic center of the sampling distribution.
We calculated range centrality as the distance of each node from the centroid of a minimum convex polygon enveloping all nodes, such that increased magnitude of distance was equivalent to inverse range centrality. We calculated the minimum convex polygon using the GCONVEXHULL function and calculated the centroid of the minimum convex polygon using the GCENTROID function in the RGEOS package (Bivand & Rundel, 2017) in R. Finally, we calculated the distance of every node from the centroid using the RDIST.EARTH function in the FIELDS package (Nychka et al., 2015) in R.
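The range centrality calculation can be approximated in a few lines of Python. This sketch stands in for the RGEOS and FIELDS calls and makes simplifying assumptions: the convex hull and its centroid are computed in the longitude and latitude plane without projection, at least three nodes are not collinear, and great-circle distance uses a spherical Earth of radius 6,371 km. Nodes are hypothetical `(lon, lat)` pairs in degrees.

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_centroid(polygon):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        c = x0 * y1 - x1 * y0
        area += c
        cx += (x0 + x1) * c
        cy += (y0 + y1) * c
    area *= 0.5
    return cx / (6 * area), cy / (6 * area)

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def distance_from_range_center_km(nodes):
    """Distance of each (lon, lat) node from the centroid of the convex hull."""
    c_lon, c_lat = polygon_centroid(convex_hull(nodes))
    return [haversine_km(lon, lat, c_lon, c_lat) for lon, lat in nodes]
```

Larger returned distances correspond to more peripheral nodes, that is, lower range centrality.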
Using range centrality, we tested for correlation with node centrality indices.
Finally, we sought to identify nodes with greater importance to genetic connectivity than the magnitude of lek attendance within the node or node location within the species range alone might indicate. We call these nodes keystone nodes. We identified keystone nodes as those that were low in attendance or peripheral to the range, but that still ranked high in centrality. To identify keystone nodes, we plotted both the mean peak male count within a node and the range centrality of each node against each network centrality index. We then designated the outliers of these plots as keystone nodes.
evidence for a positive correlation between conditional genetic distance and geographic distance (Table 4).

| Network structure determination
The sage-grouse genetic network deviated from random network structure in both mean clustering coefficient and mean charac-  Figure 4d).

| Node properties
In order to describe node location, we used USGS hydrologic cataloging units, also known as watersheds (https://water.usgs.gov/GIS/metadata/usgswrd/XML/huc250k.xml). We discovered hub nodes
FIGURE 2 The greater sage-grouse range-wide genetic network minimum spanning tree. The minimum spanning tree is pruned such that only the most highly weighted edges (i.e., the connections representative of the greatest genetic covariance) are shown between all nodes (n = 458). Distance among nodes in the minimum spanning tree was highly correlated with geographic distance between nodes (r_s = 0.
(Figure 3).
Only five nodes ranked in the top 50th percentile of all network centrality indices, indicating their importance to genetic connectivity both locally and network-wide as well as the rarity of this combination of local and network-wide importance. The range of each centrality index for these hub nodes was large (Table 3).
These nodes were located within the Idaho Falls, Lake Walcott,
Nodes with high betweenness centrality act as bridges between different parts of the network, so their loss can have network-wide impacts on genetic connectivity (Garroway et al., 2008). We identified several hub nodes whose betweenness ranking was high,
Of the remaining two nodes, one was located in the Lake Walcott watershed in Idaho, and one was located in the Upper Bear watershed in Utah (Figure 3). Of these, the node with the greatest betweenness (1491)
To identify nodes that covaried the greatest with all other nodes in the network, we ranked nodes by closeness centrality. Closeness is an index of the average shortest path between a node and all other nodes in the network. Hence, a smaller closeness index indicates shorter paths on average, and therefore, greater connectivity.
There were a small number of very closely covarying nodes in the network (left-skewed distribution: Table 3, Figure 4). The top-ranked closeness nodes were central to the species' range, away from the periphery. Two of the nodes in the top 1% of closeness centrality were located in the Lake Walcott watershed in Idaho. The remaining nodes were located in the Little Snake watershed in Colorado
TABLE 3 Network centrality indices (betweenness, closeness, clustering coefficient, degree, and eigenvector) and network connectivity (strength and weight) for the range-wide greater sage-grouse genetic network (a) and networks (b) calculated from 1000 networks constructed from a resample of 75% (n = 343 nodes) of the originally sampled 458 nodes (sampled without replacement). Listed are the network centrality index, the component for which each index was calculated, minimum, mean, median, standard error (SE) of the mean and median, and 95% confidence intervals (CI) of the mean and median
To identify nodes that anchor tightly knit groups connected by a high number of edges, we examined node rankings by clustering coefficient. Increased clustering coefficient is indicative of small-world characteristics. Network-wide, there was a low chance that any two nodes connected to a given node were also connected to one another (right-skewed distribution: Table 3, Figure 4). The nodes in the top 1% of clustering coefficient were found across the species' range and were mostly toward its periphery (Figure 3).
(Table 4)). Node strength was normally distributed (Table 3, Figure 4).
We found evidence for a strongly positive significant correlation between betweenness and closeness, and eigenvector and strength (Table 4). All other significant correlations among centrality indices were weak and negative. We found evidence for a strongly positive significant correlation between number of alleles per node and betweenness and closeness, although all other significant correlations were moderately negative or weak and positive.
The evidence for a correlation between mean peak male count and centrality indices was weak when significant. Finally, we found evidence for a strongly positive significant correlation between the number of samples in a node and betweenness, but only moderate or weak relationships when testing for correlation with other centrality indices.

| Edge properties
Edge weight is an index of the magnitude of genetic covariance between nodes and can be used to identify nodes most closely linked.
Overall, genetic connectivity among nodes has led to increased network connectivity, with the occurrence of some highly connected nodes evidenced by a right-skewed distribution of edge weight (Table 3; Figure 4). The top 0.1% of edges with the greatest genetic covariance emanated from a node in the Spring-Steptoe Valleys watershed in Nevada. This node also has the lowest closeness centrality, low eigenvector centrality (0.33), and very low betweenness (0). The edge of least weight connected two nodes within the Fremont watershed in Utah (in the southcentral UT group of nodes in Figure 3). This node was of moderate importance to network-wide connectivity (betweenness: 115), but had low connectivity to other nearby nodes (eigenvector centrality: 0.11).

| Keystone nodes
It was common that the hub nodes (those with the highest centrality rankings) were also those with lower mean peak male count (Figure 5). There was strong evidence for a weak positive correlation between mean peak male count and eigenvector centrality and mean peak male count and strength and strong evidence for weak negative correlation between mean peak male count and betweenness and between mean peak male count and closeness (Table 4 and Figure 5).
TABLE 4 Correlation between network centrality indices, range centrality, and lek attendance per node for the range-wide greater sage-grouse genetic network. Spearman's rank correlation (r_s) is shown below the diagonal with significance (p) above
Across all centrality indices, we discovered 26 nodes that ranked high for network centrality despite having lower mean peak male count than nodes of similar ranking (Figure 6). These 26 keystone nodes were located across the entire species' range. Four of these keystone nodes ranked highly for more than one centrality index, with high rankings coupled for eigenvector centrality and strength and for closeness and clustering coefficient. In all cases, these nodes were keystone for betweenness and closeness or for eigenvector and strength.

| Emergent network properties
The greatest utility of our network analysis is its ability to be used to prioritize and target conservation efforts to the nodes most important to maintaining network connectivity at any desired scale.
Our network approach allows nodes to be ranked across multiple centrality indices, indicative of different scales and patterns of connectivity, each with unique importance to conservation.
We discovered that the sage-grouse range-wide genetic network is best characterized as hub-and-spoke topology most resembling the structure of a small-world network and not that of a random or regular network. Both the mean clustering coefficient (0.19 ± 3.35 × 10⁻³ [SE]) and the mean characteristic path length (1.88 ± 7.04 × 10⁻³ [SE], [1.88, 1.91]) were lower than has been reported for other species (e.g., 0.254 and 2.26 in Garroway et al., 2008). The fat-tailed distribution of node degree (Figure 4d) confirmed small-world network structure by ruling out scale-free structure, for which the degree distribution follows a power law.
Many hub nodes of connectivity within the network are located across the species' range (Figure 3), with most spoke nodes located along the periphery of the range. This hub-and-spoke topology is evident in the minimum spanning tree, with important hub nodes of genetic connectivity occurring in nearly every state across the contiguous range (Figure 2). Loss of one of these highly connected hub nodes within several major basins could severely affect overall network connectivity.
FIGURE 6 Keystone nodes (n = 26): nodes with greater importance to genetic connectivity than the magnitude of lek attendance within the node or node location within the species range alone might indicate. These nodes were low in mean peak high male count relative to their network centrality rankings. Points representing keystone nodes for more than one centrality index are offset to the left or right, such that these offset touching points represent the same node. Node color indicates centrality measure. Shaded polygons depict the watershed within which these top-ranking nodes are located. The species' range is shown as light gray polygons
We documented strong connectivity across the entire network, evidenced by high-ranking nodes and edges across the species' range. This means that some of the nodes may be able to recover should they be extirpated but the habitat remain intact or be restored (e.g., following a local extinction caused by West Nile virus, as in Naugle et al., 2004, or return after restoration following natural resource extraction, as in Naugle et al., 2011). The ability to recover is exhibited in the network's traversability (i.e., the apparent low resistance to gene flow). However, to the best of our knowledge, node recovery has not been previously investigated in wildlife networks. The minimum spanning tree can serve as a powerful guide in making management decisions related to the relative importance of individual nodes to overall landscape connectivity (Urban & Keitt, 2001), as it is possible to model which nodes or which parts of the range will most likely be affected by the loss of any given node.
Our results suggest that distance plays an important role in structuring genetic connectivity (also known as isolation by distance, Wright, 1943). The vast majority of edges in the minimum spanning tree (those connections that represent the greatest covariance) connect geographically proximal nodes (Figure 2a). Similarly, there was a correlation between conditional genetic distance and geographic distance. These results support prior findings of isolation by distance across the species' range (Bush et al., 2011; Cross et al., 2016; Davis, Reese, Gardner, & Bird, 2015; Fedy, Row, & Oyler-McCance, 2017; Oyler-McCance, Taylor, & Quinn, 2005; Schulwitz, Bedrosian, & Johnson, 2014). However, nodes with greater centrality, important to both local and network-wide genetic exchange, are located across the species' range.
Both Cross et al. (2016) and Oyler-McCance et al. (2005) found that sage-grouse subpopulations in southwestern Montana were diverged from populations in the rest of the state. We confirm this prior finding, showing that sage-grouse from these same subpopulations are more closely related to conspecifics in Idaho than to subpopulations in Montana, as is evident in edge connectivity within the minimum spanning tree (Figure 2b). Cross et al. (2016) also found that the population in Northern Montana was diverged from the subpopulation in Southeastern Montana and the Dakotas and from the southcentral Montana subpopulation (the SE-W subpopulation in Cross et al., 2016). We confirm these findings here, showing nodes with very high clustering coefficient (indicative of highly interconnected network subunits) within the same regions ( Figure 3). We expect that the other top-ranked nodes for clustering coefficient in the Middle Snake-Succor watershed in Idaho and the Fremont watershed in Utah might also be embedded at the core of their respective subpopulations. Schulwitz et al. (2014) found that the subpopulations in southern and southeastern Montana and the Dakotas were both highly connected to leks in northern Wyoming. We also found the same pattern of connectivity, evident in the hub-and-spoke topology of the minimum spanning tree. In our case, a hub node in Wyoming/southcentral Montana is located in the Big Horn Lake watershed of northcentral Wyoming, and a hub node for Wyoming/southeastern Montana subpopulations is located within the Crazy Woman watershed of northeastern Wyoming (Figures 2b and 3). Davis et al. (2015) found that the small northern California population known to have experienced population declines had retained genetic diversity. We confirm this understanding by finding that the nodes in this area show elevated local connectivity (covariance) within the area. 
We also found that genetic connectivity into the northern California nodes comes from nodes to the north in Oregon (Figure 2b). Oyler-McCance, Casazza, Fike, and Coates (2014) discovered a northern and a southern subpopulation within the Bi-State population in southern California and southwestern Nevada. We found the same break evidenced by a lack of edges connecting these two units in the minimum spanning tree (Figure 2b). This lack of interconnectivity among nodes in the northern and southern groups is especially surprising, given that both groups exhibit greater covariance with far more geographically distant nodes. Fedy et al. (2017) documented genetic differentiation between birds in the Bighorn and Powder River Basins of Wyoming as well as differentiation between the northern and southern parts of the state, differences reflected in our analysis as evidenced by edge connectivity within the minimum spanning tree.

| Hubs of genetic exchange
We identified nodes with high importance to large-scale, network-wide genetic connectivity (i.e., nodes with high betweenness), and nodes within the top 50% of all centrality indices important to both network-wide and local connectivity. These top-ranked hub nodes are located across the entire range of the species. The locations of these hub nodes important to network-wide connectivity are in areas that should foster range-wide genetic connectivity due to their location in the topography of the western landscape. Basin, which ranks lower for other connectivity indices, connects to the southeastern-west subpopulation in the Yellowstone River Basin of Montana; nodes in the Big Horn Lake watershed anchor both basins. We suspect that the topology of the genetic network is largely shaped by the topography of the landscape, a hypothesis previously posited for sage-grouse (Row et al., 2015; Schulwitz et al., 2014), and which has been found to influence genetic structure in other species (e.g., Roffler et al., 2014).
We identified 26 keystone nodes across the range of sage-grouse that stand out with increased importance to genetic connectivity despite having lower mean peak male count (Figure 6). These keystone nodes do not follow the presupposition of increased centrality with increased mean peak male count (i.e., a proxy for population size for any given node) and include the highest ranking nodes for each centrality index, regardless of the population size (Figure 5). We believe that these keystone nodes and other hub nodes (Figure 3) are top candidates for targeted conservation efforts, as their protection will help secure range-wide genetic connectivity. The keystone nodes are also distributed across the entire species' range, from the core to the periphery (Figure 6). Therefore, neither range centrality nor local population size alone should be trusted as a proxy for prioritizing targeted conservation actions for sage-grouse.
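The idea of a keystone node, one with elevated centrality despite a low mean peak male count, can be sketched with a simple screening rule. The rule below (centrality in the top fraction of nodes and male count below the median) is an assumption made for illustration and is not necessarily the criterion used in this study; the node names, values, and the `top_fraction` parameter are hypothetical.

```python
from statistics import median

def keystone_candidates(nodes, top_fraction=0.25):
    """Flag nodes with high centrality but below-median male counts.

    nodes maps a node name to (centrality, mean_peak_male_count).
    The thresholds are illustrative assumptions, not the study's rule.
    """
    # Rank node names from highest to lowest centrality.
    ranked = sorted(nodes, key=lambda name: nodes[name][0], reverse=True)
    n_top = max(1, int(len(ranked) * top_fraction))
    top = set(ranked[:n_top])
    # Median of the population-size proxy across all nodes.
    median_count = median(count for _, count in nodes.values())
    return sorted(name for name in top if nodes[name][1] < median_count)

# Hypothetical toy values: (centrality score, mean peak male count).
toy_nodes = {
    "n1": (0.90, 12.0),
    "n2": (0.85, 80.0),
    "n3": (0.30, 60.0),
    "n4": (0.20, 15.0),
    "n5": (0.10, 40.0),
    "n6": (0.05, 25.0),
    "n7": (0.40, 30.0),
    "n8": (0.35, 50.0),
}

print(keystone_candidates(toy_nodes))
```

In this toy example, `n1` and `n2` both rank highly for centrality, but only `n1` does so with a small local population, which is the contrast the keystone concept is meant to capture.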

| Limitations of the study and future directions
Prior research has modeled range-wide sage-grouse connectivity using a network approach. Knick and Hanser (2011) weighted nodes using lek attendance and limited edge connections using hypothesized dispersal thresholds. However, these imposed dispersal thresholds may have affected the resultant network structure. For example, Knick and Hanser (2011) used an exponential decay function to determine the probability of connectivity of leks. Imposing dispersal thresholds likely oversimplified the contribution that each priority area for conservation made to network connectivity by assuming that dispersal limitations are equal among all nodes, regardless of the internal population dynamics within nodes and environmental conditions within and among nodes. Crist, Knick, and Hanser (2017) used network approaches to generate several models of hypothesized connectivity among sage-grouse priority areas for conservation, which are areas that protect larger leks (i.e., those with more males visible during breeding) and surrounding area. They characterized the centrality of each priority area for conservation and concluded that several subnetworks exist across the species' range.
However, in their analysis, patch size, shape, and boundary length all had an effect on the pattern of connectivity and centrality. Our analysis provides insight into genetic connectivity using centrality indices based solely on the species' biology: the genetic covariance resulting from cumulative dispersal and breeding, a quantitative metric.
We have confidence in the cut distance we used to cluster leks into nodes, as it is empirically based on dispersal distances documented over a vast area, across multiple years, involving both sexes (Cross et al., 2017). Our clustering approach increased genetic variance within nodes, but also increased covariance among nodes (Dyer, 2015). Choice of cut distance depends on the desired scale of analysis for conservation and management application. We could have performed this analysis using individual leks. Doing so would have resulted in finer resolution for our results. However, it also would have resulted in fewer individuals per node, which would have limited our characterization of within-node genetic variation. Furthermore, we would have had to cut many leks from our analysis due to the minimum node composition requirement of four individuals. By clustering leks into nodes, we were restricted to making statements about the connectivity of larger landscapes that extend beyond the size of an individual lek and which were potentially representative of leks unsampled within the same landscapes. Furthermore, our clustering approach reflects the biology of the species, as prior research has shown that both female and male sage-grouse attend multiple leks within a breeding season (Cross et al., 2017; Dunn & Braun, 1985; Semple, Wayne, & Gibson, 2001).
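The trade-off described above can be made concrete with a minimal sketch of distance-based clustering. It assumes single-linkage grouping (any two leks within the cut distance end up in the same node), which may differ from the linkage actually used in this study. The coordinates, sample counts, and the cut distance of 10 units are hypothetical; only the minimum of four individuals per node is taken from the text.

```python
import math

def cluster_leks(coords, cut_distance):
    """Group leks into nodes by single linkage at a given cut distance."""
    parent = list(range(len(coords)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= cut_distance:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(coords)):
        clusters.setdefault(find(i), []).append(i)
    return list(clusters.values())

def filter_nodes(clusters, samples_per_lek, min_individuals=4):
    """Keep only nodes with at least the minimum number of individuals."""
    return [c for c in clusters
            if sum(samples_per_lek[i] for i in c) >= min_individuals]

# Hypothetical lek coordinates (arbitrary units) and samples per lek.
lek_coords = [(0, 0), (5, 0), (9, 0), (100, 0), (103, 0), (300, 0)]
lek_samples = [2, 1, 3, 1, 2, 3]

nodes = cluster_leks(lek_coords, cut_distance=10)
kept = filter_nodes(nodes, lek_samples)
print(nodes)  # lek indices grouped into nodes
print(kept)   # nodes meeting the four-individual minimum
```

In this toy case no single lek has four samples, so a lek-level analysis would discard every lek, whereas clustering retains one node of three leks with six individuals.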
We found evidence for correlation between some network centrality indices and samples per node and mean peak male count. However, when significant, these relationships were only moderate or weak in all but one case: that of betweenness and samples within a node (Table 4). Therefore, we do not believe that sample size drove the centrality of a node. Larger populations might be expected to act as hub nodes, because highly populated nodes would be expected to house greater genetic diversity and to be the sources of dispersers. However, as discussed above, the highest ranking nodes for each centrality index were never those with the greatest mean peak male count (Figure 5).
Future work should examine the effect of the spatial distribution of individuals composing nodes on the resultant network model structure. For example, constructing a genetic network where priority areas for conservation serve as nodes may help prioritize conservation based on existing management boundaries at a larger landscape scale. It is worth noting that if priority areas for conservation are treated as nodes, larger priority areas for conservation may score higher for centrality indices due to the within-node proportion of the genetic covariance, which will increase centrality.

| Applications and future directions
We believe that the greatest utility of our network analysis will be its use in prioritizing and targeting conservation efforts to the nodes most important to maintaining network connectivity. This network approach allows for the ranking of nodes by multiple centrality indices, indicative of different scales and different patterns of connectivity. These indices can be used to locate the top-ranking nodes (and, more importantly, the leks that compose those nodes), which can then be prioritized in accordance with management goals (Bottrill et al., 2008). If goals are to conserve hubs of genetic exchange that connect the greatest number of nodes range-wide, then ranking nodes based on betweenness is most relevant. If goals are to conserve hubs of genetic exchange that connect immediate connections, then ranking nodes based on eigenvector centrality is most relevant. If goals are to conserve nodes that have the greatest genetic exchange with their immediate connections, then ranking nodes based on strength is most relevant. If goals are to conserve hubs of local connectivity, then ranking nodes based on closeness or clustering coefficient is most relevant.
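To illustrate how different indices can produce different rankings, the following sketch computes two of them, strength and betweenness, on a small hypothetical network. The edge list and weights are toy values. Betweenness is computed here on unweighted shortest paths using Brandes' algorithm, which is a simplifying assumption; the indices in this study may have been computed with edge weights or with different software.

```python
from collections import deque

# Hypothetical weighted edges: (node, node, edge weight).
edges = [("A", "B", 0.9), ("B", "C", 0.8), ("B", "D", 0.4), ("D", "E", 0.7)]

adj, strength = {}, {}
for u, v, w in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)
    # Strength: sum of the weights of edges incident to a node.
    strength[u] = strength.get(u, 0.0) + w
    strength[v] = strength.get(v, 0.0) + w

def betweenness(adj):
    """Unweighted betweenness centrality (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}   # -1 marks an unvisited node
        sigma[s], dist[s] = 1, 0
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                  # accumulate path dependencies
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

def rank(scores):
    """Node names from highest to lowest score, ties broken by name."""
    return sorted(scores, key=lambda n: (-scores[n], n))

bc = betweenness(adj)
print("by betweenness:", rank(bc))
print("by strength:   ", rank(strength))
```

Here both indices place the same two nodes first, but the remaining nodes tie at zero betweenness while still differing in strength, so the choice of index changes which of them would be prioritized next.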
Conservation actions may be targeted first toward the top-ranking nodes, or managers may first choose to combine network centrality with economic cost before deciding where to act. We can imagine many additional ways in which network centrality may be combined with additional metrics to target conservation resources.
Our hope is that the empirically based sage-grouse genetic network we constructed will prove a useful tool to conservation planners.

ACKNOWLEDGMENTS
We thank Rodney

DATA ACCESSIBILITY
Sample microsatellite genotypes and node membership are available on USGS ScienceBase: https://doi.org/10.5066/f73n22pn.