Clustering and ensembling approaches to support surrogate‐based species management

Surrogate species can provide an efficient mechanism for biodiversity conservation if they encompass the needs or indicate the status of a broader set of species. When species that are the focus of ongoing management efforts act as effective surrogates for other species, these incidental surrogacy benefits lead to additional efficiency. Assessing surrogate relationships often relies on grouping species by distributional patterns or by species traits, but there are few approaches for integrating outputs from multiple methods into summaries of surrogate relationships that can inform decision‐making.


| INTRODUCTION
Biodiversity conservation cannot proceed by providing habitat and threat mitigation for each and every species individually (Gross & Noon, 2015), and knowledge of species' distributions, abundances, life histories and responses to environmental perturbations is unavailable for the vast majority of organisms (Hortal et al., 2015).
The sheer size of the species pool in any given landscape has been a major motivating factor behind the emergence of surrogates as an organizing principle for conservation strategies (Noon, McKelvey, & Dickson, 2009). Surrogate approaches implement monitoring and management for a small set of species or landscape features, with the assumption that these species or features adequately represent the broader ecological community (Caro, 2010; Sato et al., 2019).
Surrogates are generally used for monitoring the status or trend of unmeasured species or attributes, and prioritizing locations for conservation action (Caro, Eadie, & Sih, 2005; Hunter et al., 2016). The use of surrogates to infer the status of other species, or the state of the ecological system, has been formalized by national land management agencies (e.g., U.S. Department of Agriculture Forest Service, 2012; U.S. Fish & Wildlife Service, 2015), and in international assessment frameworks to support biodiversity conservation (Gregory & van Strien, 2010; Maes et al., 2016).
Two lines of research characterize species-oriented surrogacy investigations. One line focuses on empirically deriving the subset of species (i.e., the surrogates) that best characterizes the patterns seen in a broader assemblage (Bal, Tulloch, Addison, McDonald-Madden, & Rhodes, 2018; Butler, Freckleton, Renwick, & Norris, 2012; Fleishman, Thomson, Mac Nally, Murphy, & Fay, 2005). The other acknowledges that economic, conservation and legal considerations will focus conservation and management on certain species.
Under this latter situation, the question shifts from identifying the best surrogate subset to asking whether there are incidental surrogacy benefits that can be assigned to the species that are already the focus of decision-making (Carlisle, Keinath, Albeke, & Chalfoun, 2018; Naugle, Johnson, Estey, & Higgins, 2001). Incidental surrogacy (benefits to non-target species arising from management of target species) would increase the efficiency of management and conservation, but as with all potential surrogates, pragmatic motivations do not guarantee efficacy for biodiversity conservation. Indeed, attempts to test the veracity of surrogate relationships have raised concerns about their strength and stability (reviewed in de Morais, Santos Ribas, Ortega, Heino, & Bini, 2018). Congruence among species' abundances (e.g., Landeiro et al., 2012), covariation among species' population trajectories (e.g., Cushman, McKelvey, Noon, & McGarigal, 2010) and the coincidence of high species richness among taxa (e.g., Westgate, Barton, Lane, & Lindenmayer, 2014) are often weak. Surrogates based on abiotic features or ecosystem processes face similar challenges for maintaining species-level biodiversity across locations (Rodrigues & Brooks, 2007).
Questions about efficacy are further complicated by high methodological sensitivity in the approaches used to select surrogates and quantify surrogate relationships. Species-based surrogates are often derived from clustering methods (Bal et al., 2018), in which species are first clustered into groups based on some measure of similarity and then one or more surrogates are selected to represent the other members of their group (Lambeck, 1997; Wiens, Hayward, Holthausen, & Wisdom, 2008). The goal is to derive a parsimonious set of surrogate species that is comprehensive, such that all unmeasured species are adequately represented, and complementary, such that selected surrogates have minimal redundancy in the targets they represent (Noon et al., 2009). Under incidental surrogacy, those species grouped with species that are already the focus of decision-making may be inferred to be represented for purposes of monitoring or management. Unfortunately, high sensitivity to methodological decisions can make such inferences unreliable. Clustering is generally based on habitat associations, patterns of species co-occurrence or abundance, or species traits that capture ecological and life history strategies (McGarigal, Cushman, & Stafford, 2000). This requires selecting the coarseness of habitat classifications, the spatial and temporal resolution of occurrence or abundance data, or the selected traits. The clustering method and the number of groups into which species are divided can also lead to considerable uncertainty, but comparison and synthesis of the resulting sets of species clusters is rare. One possibility that has not been explored in the context of multispecies management is the use of ensemble clusters (Vega-Pons & Ruiz-Shulcloper, 2011).
When no single clustering method is best a priori, an ensemble cluster based on multiple methods may provide a synthetic way to account for uncertainty in developing groupings that reflect species' ecological niches and environmental sensitivities.
Here, we compare and ensemble clustering approaches to quantify incidental surrogacy benefits among wetland-dependent birds breeding in the Prairie Pothole Region (PPR) of the United States. In the PPR, management priorities target the populations of five upland-nesting waterfowl because of their importance as game species: Mallard (Anas platyrhynchos), Northern Pintail (Anas acuta), Blue-winged Teal (Anas discors), Gadwall (Anas strepera) and Northern Shoveler (Anas clypeata). Considerable resources are invested in securing wetland and associated grassland habitat for these species, efforts that have benefited other wildlife (Thiere et al., 2009) and secured ecosystem services (Johnson, 2019).
Upland-nesting waterfowl may serve as effective surrogates for other birds using a variety of habitats and resources because these waterfowl make use of different habitats over the course of the breeding season and shift across the landscape among years. Waterfowl nest at highest densities around small wetlands (Reynolds, Shaffer, Loesch, & Robert, 2006), have higher abundance and nest success in grassland-dominated compared with agriculturally dominated uplands (Austin, Buhl, Guntenspergen, Norling, & Sklebar, 2001; Greenwood, Sargeant, Johnson, Cowardin, & Shaffer, 1995) and utilize larger, deeper wetlands for brood rearing later in summer. Conservation efforts have focused on landscapes with a high density of wetlands situated within grassland-dominated habitats, an arrangement that has been shown to benefit other wetland-dependent birds (Naugle et al., 2001). Together, the target set of waterfowl may act as umbrella species by encompassing a variety of wetland and adjacent upland habitats. Nevertheless, species likely vary in their responses to management efforts directed towards these waterfowl, implying that incidental benefits may manifest as a gradient in surrogacy strength.

KEYWORDS: coarse/fine filter, multispecies management, Prairie Pothole Region, surrogate species
Our work evaluates sensitivity in clustering approaches and illustrates the application of an ensemble approach to summarize the strength of surrogate relationships. Specifically, we asked: (a) How do estimated surrogate relationships differ when based on abundance versus trait data, and when varying the spatiotemporal scale or the selection of biological traits?; (b) Which methodological approaches are best supported by empirical validation?; and (c) Can an ensemble approach integrate divergent outputs to derive and visualize a continuous measure of surrogate relationship strength between the five target waterfowl species and other wetland-dependent birds? Our work highlights a suite of approaches to account for methodological sensitivity when quantifying incidental surrogacy benefits attributable to species that are the targets of ongoing management.

| Data sources
We used North American Breeding Bird Survey (BBS) data (Sauer et al., 2014) to quantify patterns of avian abundance. The BBS data consist of ~40-km roadside routes, made up of 50 survey stops arrayed systematically along the route. A 3-min count of all birds seen or heard within 400 m is conducted at each stop. Routes are intended to be surveyed once per year during the breeding season. We included counts of birds from standard routes that were conducted under acceptable weather conditions and whose route paths were in the Prairie Pothole Region of North and South Dakota (Figure S1).
This study area includes the highest current densities of breeding waterfowl in the United States (Loesch, Reynolds, & Hansen, 2012).
We studied the 35 wetland-dependent species that occurred on at least 1% of stops in data used for estimation of species clusters (Table S1).
We analysed patterns of species' abundance in BBS data at two spatial and two temporal resolutions, as there are often trade-offs between the resolution and scope of ecological datasets, and scale decisions can affect surrogate relationships (Hess et al., 2006). Spatially, we conducted analyses with data from each stop as the unit of observation (hereafter referred to as the stop level) or with the sum of abundances over each sequence of 10 stops (hereafter referred to as a segment) as the unit of observation. Temporally, we averaged abundances over the available years or treated each year separately. There was a trade-off between the spatial grain and the temporal duration because a longer time series was available at the coarser (i.e., segment level) spatial scale (starting in 1968; n = 51 routes) compared with the finer spatial scale (starting in 1997; n = 49 routes); the longer time period encompassed greater temporal variation in climate. Both datasets were analysed through 2010, the last year for which evapotranspiration-related covariates were available.
Temporally averaged data should be less subject to sampling artefacts and are relevant for long-term conservation decisions (e.g., land purchases and easements), whereas analyses in which each year was treated separately should be better able to capture responses to annual weather in a region typified by cycles of drought and deluge (Millett, Johnson, & Guntenspergen, 2009). Land cover and climate conditions are associated with wetland densities (Sofaer et al., 2016) and with occupancy of wetland-dependent birds in the PPR (Steen, Skagen, & Noon, 2014) and were summarized at spatial and temporal resolutions to match that of the avian survey data (see Appendix S1).
We withheld a portion of the BBS data to serve as a validation dataset, with the goal of comparing variance explained among clusters. We used data from odd-numbered stops/segments and odd-numbered years for estimation and reserved even-numbered stops/segments and years for validation. Given spatial autocorrelation between neighbouring stops and temporal autocorrelation between sequential years, the absolute variance explained was expected to be optimistic (Bahn & McGill, 2013); however, we were primarily interested in relative performance among clustering approaches.
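The odd/even partition described above can be sketched in a few lines. This is only an illustration in Python, not the code used in the study (which was written in R); the record layout (`unit`, `year`, `count`) is hypothetical, and dropping records that mix an odd unit with an even year (or the reverse) is an assumed reading of the scheme.

```python
def split_estimation_validation(records):
    """Split survey records into estimation (odd) and validation (even) sets.

    Each record is a dict with hypothetical keys 'unit' (stop or segment
    number) and 'year'. Records that mix odd and even indices are dropped;
    this is an assumption, not something the text states explicitly.
    """
    estimation, validation = [], []
    for rec in records:
        unit_odd = rec["unit"] % 2 == 1
        year_odd = rec["year"] % 2 == 1
        if unit_odd and year_odd:
            estimation.append(rec)
        elif not unit_odd and not year_odd:
            validation.append(rec)
    return estimation, validation


# Made-up records, for illustration only.
records = [
    {"unit": 1, "year": 1997, "count": 3},  # odd unit, odd year
    {"unit": 2, "year": 1998, "count": 0},  # even unit, even year
    {"unit": 1, "year": 1998, "count": 5},  # mixed: dropped
    {"unit": 4, "year": 2000, "count": 2},  # even unit, even year
]
est, val = split_estimation_validation(records)
```

Interleaving stops and years in this way keeps the two subsets spatially and temporally adjacent, which is why the absolute variance explained is expected to be optimistic.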

| Clustering based on species' abundances
Species' habitat requirements are reflected in their patterns of occupancy and abundance over sites and times with variable environmental conditions. Without knowledge of the underlying environmental gradients, it is possible to cluster species according to similarities in their patterns of relative abundance. These "r-mode analyses" (Legendre & Legendre, 2012) can be used to develop hierarchical clusters (e.g., Wiens, Crist, Day, Murphy, & Hayward, 1996). To group species according to similarity in patterns of relative abundance, we clustered based on BBS counts at each of the four combinations of spatial and temporal scales (stop vs. segment level; years averaged vs. years separate). We applied the Hellinger transformation (Legendre & Gallagher, 2001) to the matrix of counts at each spatial and temporal scale. This transformation yields patterns of each species' relative abundance across sites and is recommended because it avoids clustering on double zeros: shared absences between two species are not treated as an indication of resemblance (Legendre & Gallagher, 2001; Legendre & Legendre, 2012). We computed a Euclidean distance matrix based on the transformed data and used hierarchical agglomerative clustering based on Ward's criterion (Murtagh & Legendre, 2014) to quantify species relationships.
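As a minimal sketch of the transformation and distance steps, the Python below applies a Hellinger transformation to a small made-up count matrix and computes pairwise Euclidean distances among species. It assumes species are rows and sampling units are columns, so that each count is divided by its species total before taking the square root; the function names are hypothetical, and the clustering step itself (Ward's criterion) is not shown.

```python
import math


def hellinger_rows(counts):
    """Hellinger-transform each row: sqrt(count / row total).

    Rows are species and columns are sampling units (an assumed
    orientation), so each row becomes a profile of that species'
    relative abundance. All-zero rows are returned as zeros.
    """
    transformed = []
    for row in counts:
        total = sum(row)
        transformed.append(
            [math.sqrt(v / total) if total > 0 else 0.0 for v in row]
        )
    return transformed


def euclidean_matrix(rows):
    """Symmetric matrix of Euclidean distances between rows."""
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
            dist[i][j] = dist[j][i] = d
    return dist


# Made-up counts for three species across four sampling units.
counts = [
    [4, 0, 0, 4],  # species A
    [8, 0, 0, 8],  # species B: same profile as A, twice as abundant
    [0, 5, 5, 0],  # species C: found only where A and B are absent
]
dist = euclidean_matrix(hellinger_rows(counts))
```

In this toy example species A and B end up at distance zero because the transformation removes differences in total abundance, whereas A and C, which never co-occur, are separated by the maximum possible distance of sqrt(2).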
We also attempted to cluster species according to how relative abundance changed in response to climate and land cover; unfortunately, the models used in this approach did not reliably converge (see Appendix S1).

| Clustering based on species' traits
We classified traits into three broad sets that the literature suggests may be important in apportioning species into groups that may respond similarly to natural or anthropogenic disturbance or to management (Okes, Hockey, & Cumming, 2008;Wiens et al., 2008). The first trait set included attributes related to a species' life history, such as body size, clutch size and the duration of the reproductive period. Slow life history strategies have often been associated with greater vulnerability to global change (Pacifici et al., 2015). Our second trait set focused on natural history, which we defined as ecological attributes characterizing organism-environment interactions including food source, foraging strategy, nest type and nest location.
We expected species relying on shared resources to show similar responses to change (Butler et al., 2012). Our third trait set captured aspects of species' distributions. These biogeographical traits included attributes such as migratory strategy, whether the PPR was considered a core versus peripheral part of the breeding range, and the southern-most latitude of the breeding range, to reflect maximum thermal tolerance and thus sensitivity to climate change (Jiguet, Gadot, Julliard, Newson, & Couvet, 2007). We expected species with similar migratory strategies and southern breeding extents, and for which the PPR was considered core to the breeding range, to show similar abundance patterns.
These traits (Table S2) were compiled for each wetland-dependent bird species meeting the prevalence criteria and were primarily collected from three sources: Ehrlich, Dobkin, and Wheye (1988), Dunning (2008) and Rodewald (2015). We used hierarchical agglomerative clustering to generate four sets of clusters based on: (a) life history traits, (b) natural history traits, (c) biogeographical traits and (d) all traits combined. We used Gower's distance metric to derive the dissimilarity matrix upon which clustering was based, because of mixed data types (i.e., continuous, categorical and ordinal) among trait variables (Podani, 1999). We compared results of different clustering algorithms before selecting Ward's agglomerative clustering (Figure S2). Trait-based and relative abundance clustering were implemented using base R (R Core Team, 2016) and the cluster (Maechler, Rousseeuw, Struyf, Hubert, & Hornik, 2016) and vegan (Oksanen et al., 2016) packages, and visualizations were based on the dendextend (Galili, 2015) and ggplot2 (Wickham, 2009) packages. An overview of our estimation and ensembling approach is shown in Figure 1.
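To illustrate why Gower's metric suits mixed trait data, the following Python sketch computes a simplified Gower dissimilarity for numeric and categorical traits. It is not the implementation used in the study: ordinal traits, which Podani (1999) treats separately, are left out, all traits are weighted equally, and the species labels and trait values are invented for the example.

```python
def gower_distance(a, b, kinds, ranges):
    """Simplified Gower dissimilarity between two trait records.

    kinds[k] is 'num' or 'cat'; ranges[k] is the observed range of
    numeric trait k (unused for categorical traits). Traits missing
    (None) in either record are skipped. Ordinal traits are not
    handled in this sketch.
    """
    total, used = 0.0, 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue
        if kind == "num":
            total += abs(x - y) / rng if rng > 0 else 0.0
        else:
            total += 0.0 if x == y else 1.0
        used += 1
    return total / used if used else float("nan")


# Invented trait table: body mass (g), clutch size, nest location.
kinds = ["num", "num", "cat"]
traits = {
    "sp1": [1000.0, 10, "ground"],
    "sp2": [400.0, 10, "ground"],
    "sp3": [300.0, 4, "floating"],
}
columns = list(zip(*traits.values()))
ranges = [
    max(col) - min(col) if kind == "num" else None
    for col, kind in zip(columns, kinds)
]
d12 = gower_distance(traits["sp1"], traits["sp2"], kinds, ranges)
d13 = gower_distance(traits["sp1"], traits["sp3"], kinds, ranges)
```

Scaling each numeric difference by the trait's range puts body mass and clutch size on the same 0 to 1 footing as a categorical mismatch, so no single trait dominates the resulting dissimilarity.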

| Validation of clusters
We asked whether empirical validation could be used to select among the sets of species groups that were derived from clustering the different datasets (four abundance-based and four trait-based).
Validation measured how well each group could explain (dis)similarity in relative abundance in withheld validation data from even-numbered BBS stops/segments and years. Validation data were summarized at each of the four spatiotemporal scales, Hellinger-transformed and used to calculate pairwise distances among species. This mirrored the process used in estimation of abundance-based groupings, potentially giving abundance-based clustering an advantage in our assessment of validation performance. To address how well each set of species groups explained dissimilarity in relative abundance, we partitioned the variance in the validation distance matrix using permutational MANOVA, which computes a pseudo-F ratio of the sums of squared distances among groups relative to the sum of squared distances within groups (Anderson, 2001). As the number of groups increases (i.e., the cut point is shifted towards the tips of the dendrogram), more variance is explained among rather than within groups. Permutational MANOVAs were implemented with the adonis function in the vegan package of R (Oksanen et al., 2016).
For each input data type (abundance-based vs. trait-based), we averaged variance explained across distance matrices derived from the four spatial and temporal scales of validation data.
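The variance partition underlying the pseudo-F ratio can be written directly from a distance matrix. The sketch below follows the sums-of-squares formulation of Anderson (2001) in plain Python; it is an illustration rather than a reimplementation of vegan's adonis function, the function name is hypothetical, and the permutation step that yields a p-value is omitted.

```python
def pseudo_f(dist, groups):
    """Return (pseudo-F, R^2) for a grouping of objects.

    dist is a symmetric matrix of pairwise distances and groups[i] is
    the group label of object i. Sums of squares are computed from
    squared distances, as in Anderson (2001). No permutation test is
    performed in this sketch.
    """
    n = len(dist)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares: squared distances over all pairs, divided by n.
    ss_total = sum(
        dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)
    ) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in labels:
        members = [i for i in range(n) if groups[i] == g]
        ss_within += sum(
            dist[i][j] ** 2
            for pos, i in enumerate(members)
            for j in members[pos + 1:]
        ) / len(members)
    ss_among = ss_total - ss_within
    f_ratio = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f_ratio, ss_among / ss_total


# Four made-up objects on a line at positions 0, 1, 10 and 11.
positions = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in positions] for p in positions]
f_ratio, r_squared = pseudo_f(dist, [0, 0, 1, 1])
```

Here the second return value corresponds to the proportion of variance explained among groups, which is the quantity compared across cluster sets in the validation described above.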
FIGURE 1 We estimated relationships among species based on abundance or trait data, and integrated each set of outputs into an ensemble. Each input dataset (four each of trait and abundance) was used to derive a distance matrix among species, and hierarchical clustering was used to create a dendrogram from each distance matrix (Figures S3 and S4). We considered alternative cut points for each dendrogram (red dashed lines), to yield between 3 and 8 groups of species. Ensembles were created by summing the number of times each pair of species was grouped together; this number was used as the edge weight in a network (each circle represents a species, and both distance and the width of lines reflect edge weight, i.e., strength of connection).

| Ensembles of clusters
Statistical ensembles have been shown to increase predictive accuracy in many contexts (e.g., Breiman, 2001) and are particularly useful when multiple reasonable methodological approaches yield different results. We used ensembles to integrate results based on different spatiotemporal scales or different traits, and for different numbers of species groups. Trait- and abundance-based analyses were kept separate because of higher overall performance of abundance-based clustering (Figure 3). We considered between 3 and 8 groups, a range chosen based on management relevance (to create a number of groups that separates among species while still reducing the potential number of management targets) and visual inspection of hierarchical clustering outputs (e.g., scree plots; Legendre & Legendre, 2012). Each set of clusters was converted to a co-association matrix, which was a symmetrical species-by-species matrix with ones in cells corresponding to species in the same group, and zeros otherwise. To create each ensemble, we summed these matrices across different numbers of groups and data scale within each data type (e.g., for all trait-based clustering output). We also considered ensembles that weighted each co-association matrix by the variance explained by that grouping; however, because variance explained was similar across spatiotemporal scales of abundance and for different sets of traits, this approach yielded qualitatively similar results.
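The construction of a co-association matrix and its summation into an ensemble can be sketched as follows. This Python example is illustrative only: the function names are hypothetical, the three partitions are made up, and the optional `weights` argument stands in for the variance-explained weighting mentioned above.

```python
def co_association(labels):
    """Species-by-species matrix: 1 if two species share a group, else 0."""
    n = len(labels)
    return [
        [1 if labels[i] == labels[j] else 0 for j in range(n)]
        for i in range(n)
    ]


def ensemble(partitions, weights=None):
    """Sum co-association matrices over a list of partitions.

    partitions is a list of label vectors, one per cluster set (for
    example, one per combination of input dataset and number of
    groups). weights, if given, scales each matrix before summing.
    """
    n = len(partitions[0])
    if weights is None:
        weights = [1] * len(partitions)
    total = [[0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        matrix = co_association(labels)
        for i in range(n):
            for j in range(n):
                total[i][j] += w * matrix[i][j]
    return total


# Made-up group labels for four species (A, B, C, D) under three cluster sets.
partitions = [
    [1, 1, 2, 2],
    [1, 1, 1, 2],
    [1, 2, 2, 3],
]
ens = ensemble(partitions)
```

In this example species A and B are grouped together in two of the three cluster sets, so their ensemble value is 2, while A and D are never grouped together and receive 0.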
FIGURE 2 Clustering based on abundance data at different spatial (stop vs. segment level) and temporal (years separate vs. averaged) scales consistently grouped the five target waterfowl species together, but diverged in the estimated relationship between these waterfowl and other species.

We used graph theory to create a network of species relationships from our ensemble matrices. Graph theoretic approaches integrate results from co-association matrices by treating each object (species in our example) as a node and using the number of cluster sets in which two objects were grouped to define edge weights (Vega-Pons & Ruiz-Shulcloper, 2011). We visualized relationships among species using the Fruchterman-Reingold layout algorithm (Fruchterman & Reingold, 1991) implemented in the igraph R package (Csardi & Nepusz, 2006); this layout algorithm brings nodes (species) with stronger connections closer together, while also prioritizing similar edge lengths and overall network symmetry. Edge weights (i.e., line thickness) reflected the degree of support for a surrogate relationship between each pair of species. We coloured nodes by each species' average strength of surrogate relationship to the target waterfowl; values were rescaled to between 0 and 1 within each network.
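One way to derive the node-level measure from an ensemble matrix is sketched below in Python. The exact rescaling used in the study is not specified, so this example assumes min-max rescaling over the non-target species only; the function name, species labels and matrix values are all hypothetical, and the network layout itself is not reproduced.

```python
def surrogate_strength(ensemble_matrix, species, targets):
    """Mean edge weight from each non-target species to the targets,
    min-max rescaled to [0, 1] (an assumed rescaling)."""
    index = {name: i for i, name in enumerate(species)}
    raw = {}
    for name in species:
        if name in targets:
            continue
        weights = [ensemble_matrix[index[name]][index[t]] for t in targets]
        raw[name] = sum(weights) / len(targets)
    lowest, highest = min(raw.values()), max(raw.values())
    span = highest - lowest
    return {
        name: (value - lowest) / span if span > 0 else 0.0
        for name, value in raw.items()
    }


# Made-up ensemble matrix for two targets (T1, T2) and three other species.
species = ["T1", "T2", "X", "Y", "Z"]
ensemble_matrix = [
    [6, 6, 4, 2, 0],
    [6, 6, 6, 2, 1],
    [4, 6, 6, 3, 1],
    [2, 2, 3, 6, 2],
    [0, 1, 1, 2, 6],
]
strength = surrogate_strength(ensemble_matrix, species, ["T1", "T2"])
```

Because the values are rescaled within a network, they rank species by relative support for a surrogate relationship and should not be compared directly between ensembles built from different inputs.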

| Clustering based on species' abundances
For all scales of input data, the five target waterfowl species showed close relationships to each other and were placed together under all numbers of groups we considered (up to eight groups; Figure S4). However, there were qualitative differences in the estimated hierarchical relationships with other species (Figure 2). For example, clustering based on segment-level data placed the American Avocet (Recurvirostra americana) with the five target waterfowl, but clustering based on stop-level data separated the Avocet from these waterfowl in the most basal division of the dendrogram. This result points to these species showing similar patterns of abundance across the landscape at coarse resolutions, but using different habitats at finer spatial resolutions.
Although the spatial and temporal scales at which the abundance data were summarized substantially changed the estimated relationships among species (Figure 2), the variance explained in withheld validation data was similar across scales (Figure 3). Therefore, a single scale (stop vs. segment level; years separate vs. years averaged) could not be empirically selected based on validation performance.
These results highlighted the relevance of an ensemble approach for encompassing the uncertainty in estimated species relationships arising from the use of different methodological workflows.

| Clustering based on species traits
For a given group size, clustering based on biogeographical traits generally explained less variation in the validation data than clustering based on life history, natural history or all traits combined (Figure 3). Clustering based on the biogeographical trait set was also the only one in which the five target waterfowl were never all grouped together, because the Blue-winged Teal was separated from the other target species at the most basal division in the dendrogram, and Gadwall was separated from the remaining species in a more proximal split (Figure S3). Although the variance explained by the other three trait sets was similar, species membership in the underlying groups differed among all sets (Figure S3). In other words, very different species groupings explained a similar amount of variation in relative abundance patterns within validation data, again pointing to the benefits of ensembling across alternatives.
Clustering based on species traits explained less variation in withheld BBS data than clustering based on similarity in patterns of relative abundance, particularly as the number of groups increased (Figure 3). In general, the underlying hierarchical relationships differed considerably between abundance-based and trait-based dendrograms (Figure S5). All clusters showed a linear increase in variance explained in withheld validation data as the number of groups in the cluster set increased (Figure 3); this was an expected feature of the

FIGURE 3 Clustering based on patterns of relative abundance generally explained more variation in the abundance-based validation set compared with clustering based on species traits. However, clustering based on each spatial and temporal scale of abundance data explained a similar amount of variation, indicating that empirical validation could not effectively select a single spatiotemporal scale of analysis. Abundance-based input datasets were calculated at either the segment-level or stop-level spatial scale, and data were either averaged across years or separate values for each year were retained. Variation explained was calculated based on withheld validation data on species' abundances, averaged over each of the four spatial and temporal scales.

| DISCUSSION
Surrogates are a largely unavoidable component of pragmatic biodiversity conservation because it is impossible to monitor the status and trend of each species, much less manage for all species individually (Noon et al., 2009).

... Table S1 for species' common and scientific names. The distance between nodes (species), the thickness of edges and the colour of nodes show mutually reinforcing information, with closer position, thicker edges and darker colours each corresponding to a stronger estimated surrogate relationship among species. Absolute position within each network is stochastic within the layout algorithm, and only relative positions are meaningful.

... arising from methodological decisions. We found that clustering based on different spatial and temporal scales of abundance data (Figure 2) or on different trait sets (Figure S3) yielded qualitatively different hierarchical relationships among species, and our ensemble approach provided a tool for integrating and visualizing these divergent relationships. We discuss our findings to emphasize the benefits of ensembling, highlight the management relevance of incidental surrogacy and provide future research directions to advance the use of surrogate relationships for biodiversity conservation and management.

| Benefits of ensembling
Ecological analyses of observational datasets can typically be approached from different analytical perspectives, such that multiple sets of methodological choices are reasonable a priori. Surrogate relationships among species have often been estimated via hierarchical clustering (e.g., Wiens et al., 2008) and can also be estimated and visualized via extensions of generalized linear models, ordination and other methods. Each method entails a set of additional decisions, and sensitivity analyses are commonly used to characterize the quantitative and qualitative impacts on scientific inference. For example, hierarchical clustering is recognized to be sensitive to the input data and agglomeration rules (Legendre & Legendre, 2012). In our work, the scales at which abundance data were summarized affected surrogate relationships, and we developed an ensemble approach to summarize results across scales and visualize them using a network of species relationships (Figure 4). Our approach yielded a continuous estimate of surrogate relationship strength between each species and the five waterfowl species that are the focus of ongoing management (Figure 5), thereby estimating the benefits of incidental surrogacy that may be attributed to these five waterfowl species.
Ensembling is not without potential costs, particularly that inference based on better performing models will be diluted as outputs are combined across models. However, there is often no single "best" model, as is well recognized in the fields of climate modelling (Tebaldi & Knutti, 2007) and species distribution modelling (Araujo & New, 2007).
Assessments of model performance can be useful in culling models to include in an ensemble, but in other cases, performance assessments may not yield definitive rankings or may not capture model transferability to new spatial and temporal contexts. Our evaluations of model performance on withheld validation data showed that clusters based on relative abundance generally performed better than trait-based clusters (Figure 3). Based on this finding, we did not combine outputs from the abundance- and trait-based analyses and emphasize the ensemble based on patterns of abundance. At the same time, our performance assessment had limited utility for selecting among spatiotemporal scales of abundance or among most of the alternative trait sets (Figure 3). This made it difficult to select the spatial and temporal scales that "best" quantified surrogacy strength in our system, so we developed an ensemble across scales. We suggest that ensembling may help avoid the potential problems associated with selecting a single approach that is calibrated to the characteristics of a particular study and spatiotemporal scale, to yield a more general and transferable assessment of surrogacy relationships.

| Management relevance
Both the network visualizations (Figure 4) and the estimated surrogate strength (Figure 5) can help guide management decisions.
At the most basic level, whether a species would be expected to accrue incidental benefits from management focused on the five target duck species is not a simple "yes/no" question; it is a matter of degree. This is particularly noteworthy in our system since the five waterfowl species were always located on the margin of the ... Table S1 4a and 5), the set of non-waterfowl species with the strongest as- ... (Steen, Sofaer, Skagen, Ray, & Noon, 2017), suggesting that benefits of surrogacy may accrue to a species whose future population dynamics may be of concern. The network visualizations can also be used by managers to identify additional surrogate species to better achieve assemblage-wide management objectives. This could be done informally, by selecting species in different portions of the network, or via a more formal optimization to select the minimum number of additional surrogates that maximize assemblage coverage (e.g., Wade et al., 2014).
Investigations that have attempted to quantify the degree of incidental surrogacy benefits of waterfowl to other avian species have reached a variety of conclusions. Some have found limited benefits, particularly when considering alignment of reproductive success among species (Grant & Shaffer, 2012;Koper & Schmiegelow, 2007).
Others have emphasized the spatial overlap in habitat use between waterfowl and other wetland-dependent birds (Naugle et al., 2001) and have even found that ducks were "moderately successful" as surrogates for grassland bird abundance and richness (Skinner & Clark, 2008). However, many studies of surrogacy have focused on binary relationships among species (i.e., a surrogate or not) and have not considered methodological uncertainty. Continuous estimates of surrogacy strength based on ensembles are likely to be more realistic than binary alternatives and are also more amenable to further validation.
In some cases, the spatial and temporal scales of analysis can be selected to reflect the relevant management decisions in a given system. For example, we found that the American Avocet showed more similar abundance patterns to the target waterfowl at the coarser spatial scale we considered (Figure 2). This points to the use of similar landscapes within the PPR and supports the multispecies benefits of a management strategy that preserves wetland complexes and adjacent grassland. Conversely, analyses conducted at the stop level and among separate years should do best for capturing the degree to which species share habitats and benefit from similar conditions, but may underestimate the degree to which species use the same locations at different times, or use different habitats within a single landscape.
The question of whether a surrogate relationship is strong and consistent enough to help achieve a given set of management goals will necessarily incorporate human values, available resources, and tolerance for risk (Wiens et al., 2008). No set of surrogates will capture all biodiversity, and the appropriateness of surrogate approaches will depend on how adequate representation of other species is defined, as well as the management scale of interest.
Expectations for surrogate relationships must be realistic. For example, Mallards and Northern Pintail cluster closely in our analyses (Figure 4), and yet, these species have had divergent population trajectories (Miller & Duncan, 1999).
Research should also address the development of sets of surrogates that include species of ongoing management interest. Sets of surrogates are often more useful and effective than individual indicators (e.g., Niemeijer & de Groot, 2008), and it is both practical and efficient for surrogate sets to include managed species. However, species-specific interventions aside from habitat conservation and management (e.g., targeted predator control, relocations, modifications of harvest regulations) can potentially decouple the spatial and temporal population dynamics of managed species and unmanaged species with similar habitat requirements (Simberloff, 1998).
These models share a central assumption with surrogate methods for multispecies management: that efficiency can be gained by characterizing common ways in which organisms respond to their environments. The conceptual framework is also similar to that of Lambeck (1997), who proposed grouping species according to their shared threats.
Mixture models consider multiple ecological factors, thereby addressing the concern that most species are affected by multiple sources of ecological variation rather than having a single limiting factor (Lindenmayer et al., 2002). Our attempt to cluster species according to their sensitivity to variation in weather and land use was not computationally successful (see Appendix S1), but these methods may be useful in situations where more refined covariate data are available or where species diverge more in their habitat requirements. Approaches that integrate species co-occurrence patterns with environmental variables (Harris, 2015) may also provide insight into the strength and underlying drivers of surrogate relationships. Similarly, the ensembles we developed complement network-based approaches for understanding patterns of co-occurrence (Morueta-Holme et al., 2016). Given ongoing land use change (Rashford, Walker, & Bastian, 2011) and climate change (Johnson et al., 2010; Sofaer et al., 2016) in the PPR, it will be important to evaluate whether explicit estimation of environmental sensitivities increases the stability of surrogate relationships.
Conceptual and empirical models for how species-focused management, surrogates and landscape-scale abiotic and habitat metrics can be effectively combined remain sorely needed. Monitoring indices of habitat availability is generally not a substitute for monitoring populations (Noon, Murphy, Beissinger, Shaffer, & Dellasala, 2003; Noss, 1990), but it is also infeasible to test whether conserving abiotic diversity or a given distribution of habitat is likely to maintain the viability of each and every species in a landscape.
Surrogates have the potential to fill an intermediate role between coarse filter approaches based on abiotic or habitat features and fine-filter approaches that are species-specific (Tingley, Darling, & Wilcove, 2014). Accounting for the uncertainty in surrogate relationships is a key step in selecting credible surrogates. Integrating the use of surrogates into monitoring and decision-making is challenging because it requires clear objectives, scientifically defensible surrogate relationships, and clear feasibility and relevance of the surrogates in management contexts (Dale & Beyeler, 2001).
Evaluating how well species that are the current focus of management serve as surrogates for other species is one step in developing complementary and comprehensive surrogate sets. The network-based visualization and ensembling approaches we introduce here provide tools for understanding surrogate relationships that can be broadly applied to the selection of taxon-based or attribute-based surrogates.

ACKNOWLEDGEMENTS
This work was funded by the U.S. Department of the Interior North Central Climate Adaptation Science Center grant G13AC00390.
Any use of trade, firm or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government. We thank the many coordinators and volunteers who have participated in the Breeding Bird Survey since its establishment.

DATA ACCESSIBILITY
Data and associated metadata will be made publicly available upon manuscript acceptance (Sofaer, 2019).