Using species distribution models to define nesting habitat of the eastern metapopulation of double‐crested cormorants

Abstract When organisms with similar phenotypes have conflicting management and conservation initiatives, approaches are needed to differentiate among subpopulations or discrete groups. For example, the eastern metapopulation of the double‐crested cormorant (Phalacrocorax auritus) has a migratory phenotype that is culled because they are viewed as a threat to commercial and natural resources, whereas resident birds are targeted for conservation. Understanding the distinct breeding habitats of resident versus migratory cormorants would aid in identification and management decisions. Here, we use species distribution models (SDM: Maxent) of cormorant nesting habitat to examine the eastern P. auritus metapopulation and the predicted breeding sites of its phenotypes. We then estimate the phenotypic identity of breeding colonies of cormorants where management plans are being developed. We transferred SDMs trained on data from resident bird colonies in Florida and migratory bird colonies in Minnesota to South Carolina in an effort to identify the phenotype of breeding cormorants there based on the local landscape characteristics. Nesting habitat characteristics of cormorant colonies in South Carolina more closely resembled those of the Florida phenotype than those of birds of the Minnesota phenotype. The presence of the resident phenotype in summer suggests that migratory and resident cormorants will co‐occur in South Carolina in winter. Thus, there is an opportunity for separate management strategies for the two phenotypes in that state. We found differences in nesting habitat characteristics that could be used to refine management strategies and reduce human conflicts with abundant winter migrants and, at the same time, conserve less common colonies of resident cormorants. The models we use here show potential for advancing the study of geographically overlapping phenotypes with differing conservation and management priorities.


| INTRODUCTION
Wildlife management initiatives are often developed to protect individual species, subspecies, or populations (Blair, Gutierrez-Espeleta, & Melnick, 2013;Groom, Meffe, & Carroll, 2006). At times, however, subpopulations (local populations of a metapopulation; Hanski & Gilpin, 1991) or discrete groups within a species (phenotypes) may be difficult to differentiate and have conflicting management or conservation goals. Differentiation of subgroups with similar phenotypes, however, might be possible using their seasonal distributions and behavioral traits such as foraging patterns and preferences. For example, thousands of migratory doublecrested cormorants Phalacrocorax auritus that winter on lakes in South Carolina are viewed as a threat to commercial and natural resources, whereas small colonies of cormorants that breed in the state during the summer are viewed as favorable contributors to ecosystem processes and need conservation. Resource managers in South Carolina were tasked to develop management strategies to limit the negative impacts of the migratory cormorants and, at the same time, to conserve the year-round colonies (personal communication, D. Shipes, SCDNR). Understanding the distinct breeding patterns and habitats of resident versus migratory cormorants could help identify the subgroups and inform the conservation and management plans (Carranza & Winn, 1954;Fonteneau, Paillisson, & Marion, 2009).
Human-wildlife conflicts over shared fisheries resources have precipitated dozens of culling programs for P. auritus, killing thousands of birds annually to reduce competition for sport and forage fishes (Dorr, Hatch, & Weseloh, 2014). Once distributed ubiquitously throughout North America, cormorant nesting colonies were documented along nearly all freshwater and coastal habitats (Audubon, 1840(Audubon, -1844. Cormorant population bottlenecks occurred in the twentieth century when their abundance in North America declined from millions to thousands (Dorr et al., 2014;Wires & Cuthbert, 2006). During this population decline, two phenotypes (suggested by some to be distinct subspecies) of the eastern population became apparent: a migratory group that breeds in the northern United States and Canada and a resident group that breeds in the south. Both groups winter in the southeastern United States and Mexico. During their population bottlenecks, there were no breeding cormorants in the state of South Carolina. Although P. auritus populations are now considered to be recovered (Dorr et al., 2014), it is unclear whether contemporary breeding colonies in South Carolina can justifiably be managed separately from migratory birds.
A group of cormorants sometimes referred to as the Florida subspecies (Forrester et al., 2003;Hatch, 1995;Post, 1988) is thought to be re-expanding northward, and managers suggest that birds in South Carolina may belong to this group (F.J. Cuthbert et al. 2011-unpub-lished data MNDNR). Others, however, have suggested the migratory phenotype is cueing in on the large lake systems (like habitats found in northern breeding sites) and reducing their migration distances (Post & Post, 1988;Post & Seals, 1991). The debate over the status of contemporary southern breeding birds remains contentious as molecular studies have yet to successfully differentiate among breeding phenotypes of P. auritus in eastern North America (Green, Waits, Avery, & Leberg, 2006;Mercer, Haig, & Roby, 2013;Waits, Avery, Tobin, & Leberg, 2003). Sheehan, Tonkyn, Yarrow, and Johnson (2016b) used intestinal parasite assemblages as evidence that resident and migratory birds forage together in Mississippi and Alabama during the winter. If this is the case in South Carolina, lethal management of cormorants in winter risks the concurrent removal of the local birds that breed in that state. Parasite community data are not yet available for birds breeding or wintering in South Carolina and require lethal means to obtain. Therefore, we sought nonlethal species distribution modeling (SDM) methods to define the migratory and resident phenotypes of cormorants to better identify and categorize the cormorant colonies nesting in South Carolina.
Without an official designation of subspecies, we consider the two groups of cormorants (northern breeding and southern breeding) to be phenotypes occurring within a metapopulation. The use of the metapopulation concept is fitting for P. auritus because they live in fragmented landscapes, their suitable habitat (water) is limited and occurs in discrete fragments (Hanski, Mononen, & Ovaskainen, 2011;Ojanen et al. 2013), and their population dynamics are considered to be independent (Hanski, 2004). We have found no evidence that birds commonly switch between northern and southern breeding, although banding evidence suggests imperfect fidelity among these groups (personal observation, B. Dorr, USDA/APHIS/NWRC). Thus, we consider migration behavior a phenotypic characteristic that differentiates the two groups (Hanski, 2004).
Migration behavior is used to define groups of birds for conservation and management. In Mississippi, resident Mississippi sandhill cranes (Grus canadensis pulla) are listed as endangered (Henkel et al., 2012) and consequently are conserved, while migratory birds elsewhere are hunted for sport (Raftovich, Chandler, & Wilkins, 2015). Similarly, some migratory Canada Geese are protected from hunting and harassment, while culling programs of resident birds are commonplace (Beston, Williams, Nichols, & Castelli, 2015;Holevinski, Malecki, & Curtis, 2006;Nichols, 2014). Because genetic distinction within the aforementioned metapopulations is questionable or nonexistent (Glenn, Thompson, Ballard, Roberson, & French, 2002;Smith, Craven, & Curtis, 1999), distinguishing phenotypes based on the behavior is an accepted form of differentiation for conservation and management planning. Migration and breeding behaviors are influenced by climatic, geologic, biologic, and anthropogenic factors (Guillaumet et al., 2011;Hutto, 1985;Walther et al., 2002), and if truly different, we expect migratory and resident groups of P. auritus to respond to these variables in distinct ways. Using the environmental characteristics of known nesting sites of cormorants, we developed two ecological niche models to describe the habitat of resident and migratory P. auritus during the breeding season.
If the two phenotypes are to be managed differently throughout their range, the methods used here could be considered in other situations where the identity of breeding cormorants is in question.
Species distribution models can be used to predict future, current, and past distributions of species, such as current distributions of rare or cryptic species (Engler, Guisan, & Rechsteiner, 2004), potential distributions of invasive species (Young, Abbott, Caldwell, & Schrader, 2013), and future distributions of organisms in relation to climate change (Thomas et al., 2004). The types of data input into SDM can limit the statistical assessments used to develop predictive models (Aarts, Fieberg, & Matthiopoulos, 2012;Hastie & Fithian, 2013). Using presence-absence and abundance data with true records of absences are ideal for SDM development (Howard, Stephens, Pearce-Higgins, Gregory, & Willis, 2014;Van Couwenberghe, Collet, Plerrat, Verheyen, & Gegout, 2013), but the availability of such data is limited. Nonetheless, presence-only datasets can still be used in a presence-absence assessment by assuming all locations not listed as presence points are absence points (Phillips & Dudik, 2008). This presence-inferred-absence method requires reliable presence data to ensure that true presence points are not included in the absence dataset (Guisan & Thuiller, 2005) and to identify what conditions exist at occurrence sites that do not occur elsewhere in the landscape. Maximum entropy (Maxent) is an increasingly popular method for developing predictive SDMs based on the presence-only data (Evans et al., 2010;Phillips & Dudik, 2008), and its applicability and methodology are well documented (Oppel et al., 2012;Peterson, Papes, & Eaton, 2007;Renner & Warton, 2013).
In this study, we use SDM predictions, to better identify which phenotype(s) are likely to be breeding in a state where conservation and management priorities require reliable predictions. We developed SDMs from environmental variables expected to be important for successful breeding of resident and migratory P. auritus subpopulations.
Breeding waterbird colonies are relatively conspicuous, and there is a low likelihood of missed detection (Ridgway, 2010); thus, presenceinferred-absence models are suitable for modeling P. auritus nesting habitat distributions. We hypothesized that the habitat of known breeding sites for migratory and resident phenotypes would be significantly different from the general landscapes within Minnesota and Florida, respectively, and that landscape variables important for the prediction of nesting sites would be associated with waterways, fisheries, and avian mortality. We confirmed model predictions using contemporary breeding colony data of P. auritus within the states of Minnesota and Florida using observed presence, absence, and colony size data. We hypothesized that the prediction values provided by the models would correlate with the size of a cormorant colony. Lastly, we use the models trained on Minnesota and Florida nesting data to predict contemporary nesting sites for resident and migratory cormorants in South Carolina.

| Nesting colony data
We developed nesting habitat models for migratory P. auritus using available nesting survey data from Minnesota (1977-2010: Guillaumet, Dorr, Wang, & Doyle, 2013Wires & Cuthbert, 2006;Dorr et al., 2014) and resident P. auritus in Florida (1970( -1999( : Nisbet et al., 2002. These states are historical breeding areas for migratory and resident cormorant phenotypes, respectively. Data for nesting sites in South Carolina were based on the colonies documented by SCDNR in 2011 and 2012 (unpublished data, breeding bird database accessed October 2013, SCDNR), publications reporting contemporary nesting locations (Post & Seals, 1991), and personal observations from the field (unpublished data, K. Sheehan, Clemson University, 2011. All count data were converted to presence points for Maxent model creation and presence/absence for model validation. Each colony location was initially reported as a single point despite multiple habitat characteristics occurring within a nesting site. For example, at a 30-m resolution, a single colony could occur in forested, undeveloped, and wetland habitat. To capture the full range of environmental characteristics within each colony, we converted point data to polygons, using automated and manual methods. This also allowed us to overcome geographic positioning errors (Naimi, Skidmore, Groen, & Hamm, 2011) that associated some colonies with unlikely nesting habitats (e.g., open water adjacent to island nesting sites). Geospatial analysis of water layers from the National Hydrography Dataset (NHD) was used to identify areas of land that were <10,000 km 2 and surrounded by water. The resulting polygons were spatially joined with nesting colony data. Manual validations were performed by overlaying colony polygons on satellite imagery; we manually adjusted the colony extent to accommodate sites not captured during automation, as was the case when rookeries occupied islands smaller than the spatial resolution of the source dataset (30 m × 30 m; Figure 1) or where rookeries occurred in swamps or mainland peninsulas. The polygon layer containing colony data was converted into a raster with each 30 m × 30 m cell representing the presence or absence of a nesting F I G U R E 1 Conversion of nesting points to polygons. Aerial imagery of island nesting sites reported in Florida where sandy and shell hash substrates (light areas in the photograph) connect P. auritus nesting areas (inset image) in trees. Example polygons drawn around P. auritus colonies. Albers Equal Area Conic projection colony. And finally, each presence cell was converted to point data (one point created at the center of each 30 m × 30 m cell) for input into the Maxent analysis. Because most colonies were represented by multiple points, potential spatial autocorrelation could inflate the fit of our models (Hijmans, 2012). We tested model residuals for spatial autocorrelation using a 100-permutation Mantel test on distance matrices of geographic covariance and residual covariance. To overcome issues of spatial autocorrelation, we validated our models in three ways: (1) We verified low prediction probability of observed absence points (created in the same way as present points); (2) we created random landscape models to verify that the habitat used by P. auritus was distinct from habitat throughout the rest of the state; and (3) we transferred the models (based on the observed presence, observed absence, and random data points) between all states to verify that the predicted habitat models were unique to each specified circumstance.

| Layer development for individual parameters
The selection of nesting locations by P. auritus could be associated with foraging habitat, nesting habitat, and/or anthropocentric parameters. These variables were derived from data layers obtained through publicly available Web downloads from the National Atlas, National Land Cover Database (NLCD), National Wetlands Inventory, and the NHD (see Table S1 in Appendix S1 in Supporting Information). Fish consumption advisories and avian disease outbreaks were obtained from the Environmental Protection Agency, and fish stocking activity data were obtained from each state's fisheries agency. We obtained climate data from the PRISM Climate Group, Oregon State University.
Because downloaded data were in different formats (raster, polygon, point, polyline) and sometimes were split into multiple files, each parameter was individually processed to obtain similarity in transformation, registry, and raster cell value. In some cases, this required simply snapping raster data to a common registry point (e.g., climate data), and in other cases, spatial joining of data and object classes (e.g., wetland data), raster conversion (e.g., fish advisories), and raster reclassification (e.g., land use data) was needed to achieve consistent and comparable formatting (parameter processing details appear in Appendix S1). We provide a schematic example of layer development for wetland data layers ( Figure 2). Guisan and Thuiller (2005) recommend focal statistics for highly mobile organisms because observations are likely to vary between potential and realized distributions. The likelihood of any given focal cell to be impacted positively or negatively by the values of other nearby cells was either summed, averaged, or maximized at a radius of either 3.5 or 10 km. We based focal radii on foraging distances reported during the breeding season (Coleman, Richmond, Rudstam, & Mattison, 2005;Sheehan, Hanson-Dorr, Dorr, Yarrow & Johnson, 2016a). The final state-based raster layers were converted to tagged image file format (tiff) and imported into the R statistical computing environment (r-project.org) with the Maxent Java application (cs.princeton.edu/~schapire/maxent/) using the "dismo" package (Hijmans & Elith, 2013). To capture biological processes that occur at differing spatial extents, multiple focal statistics were calculated for many environmental variables. As such, we expected these layers to covary and identified groups of correlated parameters during model development (derived from the same initial dataset or source agency; see Table S2 in Appendix S1). Within each group, the variable that explained the most variance in nesting site distribution was retained and the remaining group members were removed from the model (York et al., 2011;Young et al., 2013).

| Species distribution models
We assessed the influence of environmental parameters (derived parameters) of the entire extent of each state for nest site selection of P. auritus in Minnesota and Florida. We stacked all derived variables (Phillips, Anderson, & Schapire, 2006) and, using default Maxent algorithms, reduced the number of model parameters using three series of five iterations, removing environmental variables in the following order: variables contributing no explanatory power to the model (providing 0% contribution); variables providing 0.5% or less explanatory contribution (Holt, Salkeld, Fritz, Tucker, & Gong, 2009); and variables that covaried significantly with parameters with greater explanatory power (Koncki and Aronson 2015; see Table S2 in Appendix S1). By removing nonexplanatory and redundant variables, we reduced model parameters and overfitting (Merckx, Steyaert, Vanreusel, Vincx, & Vanaverbeke, 2011). The models trained on Minnesota and Florida nesting data were used to generate predicted geographic distributions within the political boundaries of each state (Minnesota, Figure 3; Florida, Figure 4; and South Carolina, Figure 5).

| Testing model predictions
In addition to common predictors of model success (receiver operating characteristic area under the curve [AUC], which is the likelihood F I G U R E 2 Derivation steps for wetland-related layers used to develop the Maxent model trained on resident cormorants in Florida. Although their base layers used different in geographic extents, the same methods were used to develop wetland-related layers for the states of Minnesota and South Carolina of a model to assign a higher prediction value for any randomly chosen presence point when compared to any randomly chosen background point; Merow, Smith, & Silander, 2013), we used observed absence data points in Minnesota and Florida (Nisbet et al., 2002;Wires, Cuthbert, & Hamilton, 2011;Wires, Haws, & Cuthbert, 2005) to test the predictive ability of each model. where each point was categorized as nesting habitat (1) or nonnesting habitat (0) of P. auritus (Cao et al., 2013). Chi-square analyses of actual versus predicted data were used to assess the ability of each model to correctly predict the status of a site observed for nesting P.
auritus. We also assessed whether model output was a good predictor of colony size and used linear regressions comparing nest density (nests/km 2 ) to prediction values derived by the Minnesota and Florida models.
To confirm that predictor variables in nesting models were not merely a representation of the landscape, we created a random set of 5,000 points within the state boundaries of Minnesota and Florida and developed null models in the same manner as the empirical models created with the presence data. These predictive maps of the random landscape developed from the null models were compared to empirical nesting census data for Minnesota, Florida, and South Carolina.
The null models aided in confirming that our models trained on the observed presence data were accurate despite potential spatial autocorrelation (Hijmans, 2012).

F I G U R E 4 Prediction of suitable cormorant nesting habitat in Florida. Albers Equal Area Conic projection
Seventeen parameters were included in the final models predicting P. auritus nesting habitat in Minnesota and Florida (Table 2). Of these, eight variables occurred in both models including variables important for cormorant foraging, nesting, and anthropogenic factors.

| Model predictions for South Carolina
Contemporary colonies of P. auritus nesting in South Carolina persist in and around reservoir lakes created in the 1950s (Post & Post, 1988;Post & Seals, 1991;personal (Table 1)

| Model validation results
The residuals of our models were spatially autocorrelated (migra-    We detected similar variables in the models trained on data from Minnesota and Florida, but the importance of each parameter varied. Effective management of wildlife metapopulations that have phenotypes with differing conservation imperatives requires a thorough understanding of landscape features that could promote or discourage the establishment of local populations. The inclusion of specific variables in both models highlights the importance of the landscape parameters selected in our models and could be investigated in greater detail to improve conservation and management plans for suites of wildlife species (Guisan & Thuiller, 2005;Nicholson et al., 2006) including waterbirds.

| Considerations for other waterbird distribution models
Understanding the mechanistic biology and ecology of other nesting waterbirds will almost certainly require different local variables of importance. Specific parameters that could be informative include fine-scale submerged and emergent vegetation data, climate conditions that would contribute to exposure severity such as lake fetch and forest cover density, and recreation variables that might be useful for estimating the anthropogenic use of each water body (water depth, boat launches, beaches, industry, etc.). P. auritus often nest near conspecifics and other waterbirds, and information regarding nesting sites of other bird species could be used to better inform models. We did not include conspecific data for the models presented here, because previous occurrence data collected over long periods of time were not readily accessible within all regions examined. Nonetheless, the landscape-derived variables in our models are generally available so that the methods used here can be considered in future applications.
To expand on the methods used here, we suggest that future studies train SDMs with occupancy data from telemetered resident and migratory birds. This would allow managers to identify the habitat used during the nonbreeding season, when subpopulations overlap geographically and when management activities directly impact multiple groups or organisms. If these tracking studies were combined with SDMs, wildlife management programs could better identify critical habitats and geographic areas to target for subspecies/subpopulation conservation and management activities (Venier, Pearce, McKee, McKenney, & Nieme, 2004). Additionally, changes in migration and breeding behaviors resulting from climatic, geologic, biologic (e.g., disease), and anthropogenic changes in the landscape (Huntley et al., 2006;Kavanagh & Bamkin, 1995) could be predicted using SDMs (Peterson, 2001). Using the environmental characteristics of the present distributions of wildlife can help develop management plans to accommodate transitions through a changing landscape, effectively promoting proactive rather than reactive management (Lotter & le Maitre, 2014;May, Page, & Fleming, 2016). For these predictive models to accurately describe potential distributions, consistency among parameter variables used to train models and those to which models are transferred is critical.  (Peterson et al., 2007;Randin et al., 2006;Wolmarans, Robertson, & van Rensburg, 2010). In our models, data that might have differed significantly in range were ranked prior to focal statistic transformation (Table S3 in Appendix S1). This allowed for the values derived from focal statistics to be similar in range, preventing transferability problems such as interpretation errors associated with clamping (Phillips et al., 2006).

| Conclusions and management implications
Phalacrocorax auritus has a salacious history in North America where harassment and exploitation of nesting colonies by humans were historically common (Wires & Cuthbert, 2006). Cormorants are subject to various laws that allow for both their protection and management, including lethal control and reduction or elimination from suitable nesting and roosting habitats as a means of limiting their impacts on natural resources (Dorr & Somers, 2012;Dorr et al., 2014;Wires et al., 2005). Thus, the realized ecological niche where P. auritus nests do not necessarily represent its potential ecological niche and human disturbance characterizes a portion of the disparity between the fundamental and realized habitat uses of P. auritus. Our models include anthropocentric variables that could help increase prediction accuracy for nest site suitability and, therefore, identify regions where management of resident phenotypes might be treated differently. Many of these anthropocentric variables can be altered to some degree through urban planning and natural resource management, points to consider for management goals designed to influence the distribution of P. auritus.
Geographic parameters such as prevalence of water bodies and forested land are critical predictors for the distribution of waterbird metapopulations as are anthropogenic parameters such as the quantity and distribution of impervious surfaces (Becker & Weisberg, 2015).
These and other parameters can be manipulated through changes in land management practices (Liu & Taylor, 2002). By including model parameters that influence the feeding and nesting success of cormorants, we developed assessments that predict species distribution better than climatic variables alone (Cabral & Kreft, 2012)

ACKNOWLEDGMENTS
The development of this project would not have happened without the inspiration of Dr. Robert Baldwin, who encouraged the initial ideas in his landscape ecology and conservation course at Clemson University. We appreciate the use of the GIS computing facilities on the Clemson University campus and the support provided by ESRI staff for overcoming obstacles in the project implementation. We thank the Clemson University faculty including Dr. David Tonkyn and Dr. Alan Johnson who reviewed this publication prior to submission along with two anonymous reviewers, whose recommendations for improvement were highly valuable.