Why georeferencing matters: Introducing a practical protocol to prepare species occurrence records for spatial analysis

Abstract Species Distribution Models (SDMs) are widely used to understand environmental controls on species’ ranges and to forecast species range shifts in response to climatic changes. The quality of input data is crucial determinant of the model's accuracy. While museum records can be useful sources of presence data for many species, they do not always include accurate geographic coordinates. Therefore, actual locations must be verified through the process of georeferencing. We present a practical, standardized manual georeferencing method (the Spatial Analysis Georeferencing Accuracy (SAGA) protocol) to classify the spatial resolution of museum records specifically for building improved SDMs. We used the high‐elevation plant Saxifraga austromontana Wiegand (Saxifragaceae) as a case study to test the effect of using this protocol when developing an SDM. In MAXENT, we generated and compared SDMs using a comprehensive occurrence dataset that had undergone three different levels of georeferencing: (1) trained using all publicly available herbarium records of the species, minus outliers (2) trained using herbarium records claimed to be previously georeferenced, and (3) trained using herbarium records that we have manually georeferenced to a ≤ 1‐km resolution using the SAGA protocol. Model predictions of suitable habitat for S. austromontana differed greatly depending on georeferencing level. The SDMs fitted with presence locations georeferenced using SAGA outperformed all others. Differences among models were exacerbated for future distribution predictions. Under rapid climate change, accurately forecasting the response of species becomes increasingly important. Failure to georeference location data and cull inaccurate samples leads to erroneous model output, limiting the utility of spatial analyses. We present a simple, standardized georeferencing method to be adopted by curators, ecologists, and modelers to improve the geographic accuracy of museum records and SDM predictions.


| INTRODUCTION
Climate change is predicted to result in massive species range shifts and population-level extinctions (Clark, Bell, Kwit, & Zhu, 2014;Hijmans & Graham, 2006;Thomas et al., 2004;Thuiller, Lavorel, Araújo, Sykes, & Prentice, 2005). Observing, describing, and forecasting patterns of biodiversity under changing climate conditions are critical goals in the fields of biogeography, conservation, and ecology (Bucklin et al., 2015). Species Distribution Models (SDMs), also referred to as Bioclimatic Envelope Models, are the most widely used approach for predicting past, present, and future suitable habitats for common and rare species (Elith, Kearney, & Phillips, 2010;Hijmans & Graham, 2006;Phillips & Dudík, 2008;Wiens, Stralberg, Jongsomjit, Howell, & Snyder, 2009). These models are used to predict climate change impacts (Keith et al., 2008;Serra-Diaz et al., 2014;Wiens et al., 2009), construct phylogeographic patterns (Forester, DeChaine, & Bunn, 2013), and guide efforts to locate new populations of rare species (Williams et al., 2009). Reliable SDMs can inform land managers where to concentrate conservation resources to best preserve areas of ecological importance. Because SDMs rely on species occurrence coordinates, climate data, and other environmental variables to define a species' bioclimatic niche and project future ranges (Bucklin et al., 2015;Flower, Murdock, Taylor, & Zwiers, 2013), the accuracy of those variables strongly affects the reliability of the model's predictions. In this paper, we analyze the effects of using species presence records of varying accuracy, demonstrating the importance of rigorous georeferencing to obtain optimal SDM results.
Although there are a variety of modeling methods and algorithms for generating SDMs, correlative models constructed using only species occurrence records and climate data are commonly used tools (Bucklin et al., 2015;Flower et al., 2013;Guillera-Arroita et al., 2015;Oke & Thompson, 2015). These models do not include true absence data, nor do they explicitly account for additional variables such as interspecies interactions or species' dispersal abilities (Flower et al., 2013;Pearson & Dawson, 2003). Correlative models predict the realized niche of the species, not the fundamental niche, due to their reliance on observed presence records (Wiens et al., 2009). There are several notable sources of uncertainty in the process of SDM development (Wiens et al., 2009). One source of uncertainty arises because of the fact that any ecological or climatic model is constrained by the selection of environmental variables. While there is no consensus as to which environmental or climate variables are to be included in standard SDMs, many agree that the selection of variables can potentially introduce bias (Bucklin et al., 2015). A model's accuracy is also constrained by the resolution and quality of the climate data (Real, Luz Márquez, Olivero, & Estrada, 2010). Climate data are usually represented as continuous grids interpolated from quality-controlled climate station datasets (Daly et al., 2008). The quality of these climate data and the methods of interpolating from point records to a continuous surface and correcting for factors such as elevation and aspect can be sources of error in SDMs (Real et al., 2010). There can also be issues regarding the taxonomic identification of the specimen (Lozier, Aniello, & Hickerson, 2009). Species can be misidentified, or the systematics and taxonomy may have evolved over the years to include different species classifications. Sampling bias and imperfect detection are also noted limitations of the current available data for species distributions (Boakes et al., 2010;Fourcade, Engler, Rödder, & Secondi, 2014;Guillera-Arroita et al., 2015;Newbold, 2010). Among all these potential sources of model uncertainty, one particularly important variable for creating reliable SDMs is the accuracy of the species occurrence localities (Newbold, 2010).
Museum and herbarium records can provide valuable information on the distribution of extinct and extant species (Anderson, 2012;Davis, Willis, Connolly, Kelly, & Ellison, 2015;Newbold, 2010). Millions of occurrence records can be accessed directly from the museum or in reputable online databases, many publicly available (Newbold, 2010). Most include a written site description and often geographic coordinates (see Fig. S1 in Supporting Information). The quality of location data generally declines with specimen age. Herbarium records' site descriptions and associated geographic coordinates are frequently used to build highresolution SDMs (Alvarado-Serrano & Knowles, 2014;Forester et al., 2013;Lozier et al., 2009). Site coordinates should have as good or better resolution than the climate data, often ≤1 km 2 , in order to produce useful SDMs (Wiens et al., 2009). Failure to assess spatial error in these occurrence record coordinates can have significant impacts on apparent species distributions (Rowe, 2005), although the severity of this effect varies among species and is partially dependent on the modeling method used (Graham et al., 2008). Several studies address the effect of sampling bias on SDM output (Boakes et al., 2010;Fourcade et al., 2014;Phillips et al., 2009), but less attention has been paid to the standardization of georeferencing to improve model performance. Previous research on the role of locational accuracy has focused on the effects of adding simulated random locational error (Graham et al., 2008), rather than assessing the error in actual museum records.
Most herbarium and museum records were not documented by collectors with the intention of use in geographic modeling, resulting in many potential sources of spatial error (Bowe & Haq, 2010).
The two main branches of georeferencing methods are manual georeferencing and "Georeference Calculators." Manual georeferencing requires the meticulous human interpretation of site descriptions and assigning coordinates using detailed topographic maps. This can take several minutes per sample and is increasingly taxing with large datasets. Georeference Calculators are computer algorithms designed to automate the tedious process of interpreting written site descriptions to estimate geographic coordinates and a degree of confidence (Wieczorek & Wieczorek, 2015). Many publications present SDM results, at varying spatial resolution, without explicitly stating how or if the data were georeferenced (Table 1).
In this paper, we set out to answer the following question: What are the consequences of using occurrence data of varying levels of spatial accuracy to inform present and future SDMs for a highelevation plant? To address this question, first we outline a standardized method of georeferencing occurrence records specifically for building more useful SDMs, the Spatial Analysis Georeferencing Accuracy (SAGA) protocol. Next, to demonstrate the importance of a standardized process, we built current and future SDMs in MAXENT for the high-elevation wildflower Saxifraga austromontana Wiegand (Saxifragaceae), using three sets of herbarium records, each georeferenced to a different level of spatial accuracy. Although we focus on a single plant species, the methods could be extended to any taxon with historical museum or herbarium occurrence records.

| Study system: Saxifraga austromontana
Saxifraga austromontana, the Prickly Saxifrage, is an ideal case-study species for investigating how various georeferencing methods affect SDM results because of its geographically large, but topographically limited, range and extensive herbarium records ( Figure 1). First, this plant is endemic to, but widely distributed across, mountainous regions of western North America from 30 to 55 degrees' latitude ( Figure 2), where it inhabits a topographically complex region near tree line. Second, it has an extensive history of collections spanning over 200 years resulting in over 3,000 herbarium records available in online databases. The extensive collections of this species, and others in the genus with overlapping and extended ranges, limit the effect of sampling bias. T A B L E 1 Examples of methods used to georeferenced species occurrence records as described in species distribution modeling (SDM) papers. Georeferencing practices are not standardized, and often the resolution of the resulting SDM is finer than the historical records used to train the model. Without accurately georeferenced presence points, it is impossible to create a credible SDM No mention of georeferencing Forester et al. (2013) Online herbarium records 50 km "georeferencing was evaluated for accuracy"

| Historical herbaria record data
The O dataset was edited to omit duplicate records and extreme outliers. Duplicate records across herbaria were found using accession numbers, GUID numbers, collector numbers, and site descriptions.
Outliers were defined as occurrence records located very far outside of the known species range, such as records in the oceans, in the Great Plains, outside of North America, north of 55 degrees' latitude (no confirmed records exist north of this latitude), and records in the state of Oregon outside of the Wallowa mountain range (the range of S. vespertina). Omission of outliers is common practice for building SDMs, yet not everyone goes beyond this step (Table 1). The O dataset includes 1,363 unique herbarium records ( Figure 2).
The "Previously Georeferenced" (PG) dataset includes all records from the O dataset that explicitly state they have been georeferenced by other herbaria using a variety of methods. We omitted outliers and duplicates, as above, and removed records with coordinate uncertainty listed as >1 km. The final PG dataset includes 525 unique herbarium records ( Figure 2).
The "Newly Georeferenced" (NG) dataset includes all historical herbarium records from the O dataset that we were able to manually georeference to a 1-km or finer resolution. To conduct this manual georeferencing, we developed a novel method, the Spatial Analysis Georeferencing Accuracy (SAGA) protocol to standardize the process of georeferencing. We believe that the SAGA protocol is an improvement over other georeferencing practices in terms of both accuracy and straightforward implementation. This method is based on meticulously and manually georeferencing each herbarium record of interest and verifying written site descriptions using reliable external resources such as Google Earth, USGS Topographic Maps, and the Atlas of Canada to ensure accurate geographic coordinates. Each record must be reviewed, either through the online database it was downloaded from or by physically examining the herbarium specimen. All locations should be transformed into decimal degrees, with coordinates recorded relative to the WGS 1984 geodetic datum. Minimum spatial accuracy of each location following manual georeferencing should be recorded on an ordinal scale of 1-5 (Table 2) to allow for easy sorting and spatial analysis based on the spatial resolution of the occurrence data. We applied the SAGA protocol to the O dataset to create our NG dataset. The NG dataset only includes herbarium records with a confidence of 1-3 (Table 2) for a total of 1,104 unique historical herbarium records ( Figure 2).

| Species distribution models
We intentionally did not use all SDM approaches or an ensemble approach, but rather a widely used robust method to demonstrate the need for and utility of the standardized georeferencing protocol we present. We built SDMs using the MAXENT Software (Phillips, Anderson, & Schapire, 2006), one of the most, if not the most, widely used SDM platforms (Fourcade et al., 2014;Guillera-Arroita et al., 2015;Merow, Smith, & Silander, 2013). MAXENT is built on machine learning and Bayesian statistics of maximum likelihood (Elith et al., 2011;Halvorsen, Mazzoni, Bryn, & Bakkestuen, 2015), and is especially popular because it outperforms other methods based on predictive accuracy and is user-friendly (Merow et al., 2013).

| Climate variables
We used monthly PRISM data (Daly et al., 2008) for the reference period  to define the bioclimatic envelope of S. austromontana. We felt that the   such as location, elevation, and aspect (Daly et al., 2008). The climate data for this study were downscaled from 4 km 2 grid cells to a resolution of 1 km 2 and made available from ClimateWNA http://tinyurl.com/ClimateWNA (Hamann, Wang, Spittlehouse, & Murdock, 2013;Wang et al., 2012). We selected seven final variables for use in SDMs (Tables 3 and S3) using a multistep process. First variables were preselected from the complete list available for ecological relevance to our taxa and similar high-elevation species (Körner, 1995(Körner, , 2003. Next, we further reduced variables to eliminate highly correlated parameters (Pearson's r > |0.75|), Table 3. To decide between correlated variables, we relied on ecological relevance and informed judgment to select for a diverse suite of climate variables representing temperature, precipitation, heat moisture indexes, and more (Table 3). We also downscaled projected values of these variables for a 30-year period centered on 2080. Future climate projections were obtained from ClimateWNA using an ensemble of 23 Atmosphere-Ocean General Circulation Models (AOGCMs) of the Coupled Model Intercomparison Project phase 3 (CMIP3) under the A2 emission scenario, selected based on validation rank (Hamann et al., 2013).
T A B L E 2 Standardized confidence rankings for determining the spatial accuracy of species occurrence records using the Spatial Analysis Georeferencing Accuracy (SAGA) protocol. SAGA requires manual georeferencing of each occurrence record by interpreting the site location and verifying or assigning a location in the form of WGS 1984 geographic coordinates. The SAGA protocol uses an ordinal accuracy ranking of 1-5 to classify the spatial resolution of the occurrence data. Confidence ranks of 1-3 may be useful for constructing Species Distribution Models using 1-km or coarser climate data. Ranks of 4 and 5 are not appropriate for spatial analysis and should be omitted

| Climate space analysis
To assess whether the occurrence records in each of our three geo-

| Model evaluation
We evaluated the models using the area under the receiver operating curve (AUC) because it is a generally accepted and widely used metric for model evaluations (Merow et al., 2013). The AUC score is the probability that a randomly chosen presence point is ranked higher than a random background point, and is penalized for predictions outside of presence locations (Merow et al., 2013). A high AUC value (>0.8) indicates that models can properly distinguish between presences and random background samples. Although the AUC has been highly criticized as a metric of model performance (Lobo, Jiménez-Valverde, & Real, 2008), there are few alternatives for presence-only models (Merow et al., 2013).
To quantify the geographic differences between models created using occurrence records of varying accuracy, we used the 10% cumulative logistic threshold, which defines a binary response of suitable or nonsuitable habitat from a continuous output (Merow et al., 2013).
Choosing biologically meaningful thresholds is challenging (Merow et al., 2013), yet this method can be used to easily compare the outputs of two or more models (Franklin et al., 2013). We compared area of suitable habitat for the reference and future predictions across the three georeferencing categories. Cartography and spatial comparisons were performed in ArcGIS 10.3.

| Climate space analysis
The NG dataset captures a significantly different range of environmental conditions than the other two datasets. The ANOVAs revealed that values extracted at each presence point in the O and NG datasets capture significantly different values for six of the seven climate variables (Figure 3 and Table S2). The PG and NG datasets capture   and 2080 future projections based on an ensemble of 23 CMIP3 coupled atmosphere-ocean general circulation models (Hamann et al., 2013) significantly different values for five of seven climate variables.  (Figure 4).

| Species distribution models
All MAXENT models were statistically valid (AUC > 0.88); however, the models predicted very different areas of suitable habitat, especially for future scenarios ( Figure 5 and Figure 6, Table 4). The SDMs for the reference period

| DISCUSSION
A standardized process is needed to ensure consistent spatial accu- .26% of the total variance. Ecologically, increasing PC1 can be interpreted as representing greater growing season moisture availability (more precipitation as snow (PAS), higher summer moisture index (cmiJJA), lower annual heat moisture index (AHM), and lower mean temperature of the warmest month (MWMT)). Higher values on PC2 represent increasing cold season length and severity (later start to the frost-free period (bFFP), greater difference between summer and winter temperatures (TD), and colder winter temperatures (MCMT)). Cluster ellipses delineate 95% confidence intervals. For PCA loadings see Table S1 (  (Figure 3). This indicates that the PG dataset is not much better than the O dataset at defining the specific niche of S. austromontana.
The differences in climate space among our models led to drastically different SDM outputs and strikingly different predictions of current and future ranges. Using the 10% cumulative logistic threshold to define a binary response of suitable or nonsuitable habitat, the O and PG models resulted in suitable habitat covering geographic areas 1.4 and 1.2 times larger than the NG dataset for the reference period.
Erroneously placed presence locations, such as WTU-VP-90424 circled in Figure 2,  the A2 emission scenario. The NG models are more consistent with other studies on alpine taxa that forecast a 40%-80% reduction in suitable habitat by the end of the century (Dirnböck, Essl, & Rabitsch, 2011;Dullinger et al., 2012;Forester et al., 2013). Further, the NG model predicts a relatively small gain in habitat by 2080, equivalent to 21%-29% of the area of gain predicted by the other two models, explained by limited upslope habitat for alpine taxa. Such underprediction of future range loss is worrying for any species, but especially for high-elevation species, which are disproportionately affected by climate change (Gottfried et al., 2012) and often have little room for upward range expansion (Jackson, Gergel, & Martin, 2015).
Relying on potentially inaccurate presence records when modeling species' ranges could lead to serious overestimation of the area in which these species can persist, misleading conservation and management efforts. SDMs can be developed to their full potential only when they are trained using many high-precision occurrence records for a species (Randin et al., 2009 Lake. These calculators are popular because they are easy to use and allow for batch processing of CSV files with many listed localities, but the spatial accuracy of these outputs is questionable.
As these tools become more and more popular and public access to species occurrence data increases, it is paramount to remember that convincing SDMs can be produced from dubious data (Lozier et al., 2009). Museum and herbaria databases are invaluable archives of occurrence information, yet must be used with caution, especially when applied to spatial analyses. Our results indicate that SDMs built using low-accuracy location data capture a significantly broader climate envelope, predict a more widespread spatial distribution, and predict less loss under climate change scenarios than SDMs trained on accurate collection records. Conservation and management decisions could vary considerably depending on which model's output they were based on.
This study highlights the importance of meticulously georeferencing all records manually before use in SDMs and reveals the need for a standardized protocol such as SAGA, as varying levels of georeferencing result in significantly different models of habitat suitability for the same species. The tradeoff of manual georeferencing is the time it takes to analyze each record. As datasets increase in size, the feasibility of georeferencing each record becomes increasingly daunting.
Batch georeferencing calculators may be desirable for large datasets, but reliable technology is not yet available. As the resolution of historical and projected climate data increases, more advanced and accurate SDMs become possible, but only if species occurrence records are also available at an increasingly fine scale. Field collectors must record accurate coordinates, GPS uncertainty, and detailed site descriptions, assuming use in future spatial analyses. Curators of databases must only make available accurately georeferenced occurrence records, or explicitly state otherwise. Lastly, end users must suspect occurrence records to be inaccurate and georeference before performing spatial analyses using a protocol such as SAGA. All parties should share the improved data, ultimately improving publicly available datasets and resulting science.

CONFLICT OF INTEREST
None declared.

AUTHOR CONTRIBUTION
Trevor Bloom was responsible for project development as part of his MS thesis at Western Washington University. He conducted the data collection and led the analyses, interpretation, and writing. Aquila Flower advised the spatial analyses and interpretation of the results; she also greatly assisted in the writing and editing of the publication.
Eric DeChaine served as Trevor Bloom's thesis adviser and was the overall project supervisor. He assisted in the formulation and development of the study, specimen acquisition, data interpretation, and writing.