Ecological niche models (ENMs) and species distribution models have become increasingly popular tools for predicting the geographic ranges of species and have been important for conservation (Kremen et al., 2007), for predicting changes in distribution from past or future climatic events (Hijmans & Graham, 2006), and for investigating patterns of speciation and niche divergence (Wiens & Graham, 2005; Carstens & Richards, 2007; Warren et al., 2008). The basic premise of the ENM approach is to predict the occurrence of species on a landscape from georeferenced site locality data and sets of spatially explicit environmental data layers that are assumed to correlate with the species’ range. In many cases, models are based on researchers’ own collection data and on detailed knowledge of the taxa being studied, making predictions reasonable depictions of species occurrences given the current modelling technology. However, the increasing availability of locality data in online literature, museum databases and online data portals [e.g. GBIF (http://data.gbif.org/)] is providing unprecedented access to biodiversity data and allowing researchers to greatly expand the deployment of species distribution models and/or ENMs. While the value of publicly available sample locality data is not questioned, the consequent introduction of errors in the accuracy of specimen identity and georeferencing could be problematic for developing ENMs from public data sources (Graham et al., 2004; Soberón & Peterson, 2004). Although georeferencing inaccuracies can be identified in databases from qualitative or quantitative accuracy thresholds (e.g. http://manisnet.org/GeorefGuide.html), poor taxonomy and/or misidentification may be less detectable. This issue may be particularly problematic, for example, with cryptic species or subspecies that are morphologically similar but may have very distinct ecological requirements and geographic distributions, or for those data sources that contain indirect observations rather than references only to physical specimens.
In our own attempts at implementing ENMs we have encountered probable misidentifications in biodiversity records for a number of species, and expect that many researchers have had similar issues. To demonstrate the potential for generating dubious yet visually convincing distributions from publicly available data we use ENMs to predict the range not of misidentified cryptic species, but of a crypto-zoological species – the North American Sasquatch, or Bigfoot. Supposedly, Sasquatch belongs to a large primate lineage descended from the extinct Asian species Gigantopithicus blacki, but see Milinkovich et al. (2004) and Coltman & Davis (2005) for phylogenetic analyses indicating possible membership in the ungulate clade. Sasquatch is regularly reported in forested lands of western North America, as well as being considered a significant indigenous American and western North American folk legend (Meldrum, 2007), however the existence of this creature has never been verified with a typed specimen.
We present ENMs for Bigfoot in western North America based on a repository of (1) putative sightings and auditory detections (n = 551), and (2) footprint measurements (n = 95) collected from 1944 to 2005 (Fig. 1). This data set was taken from a collection of reported Bigfoot encounters archived by the Bigfoot Field Researchers Organization (BFRO; http://www.bfro.net/news/google_earth.asp). The reports generally consisted of a description of the event and where it occurred. The reports used were filtered to eliminate spurious points by carefully examining event descriptions prior to incorporation in the present study. The events were assigned geographic coordinates by matching descriptions of event locations to actual locations on USGS quad maps and commercially available atlases.
ENMs were constructed using the maximum entropy niche modelling approach implemented in the software maxent v3.1 (Phillips et al., 2006), with 80% of the data used in training and 20% retained as test points. Environmental data layers were constructed for the 19 BIOCLIM variables (at 5-arcminute resolution) in the WORLDCLIM data set (Hijmans et al., 2005). To reduce the number of bioclimatic variables in order to minimize model overfitting, we extracted climate data for 5000 random points sampled from the geographic extent of our study and calculated correlations between each variable for these points. For pairs with a correlation coefficient > |0.80|, one variable was selected. This resulted in a set of nine variables: (1) annual mean temperature, (2) mean diurnal range, (3) isothermality, (4) temperature annual range, (5) mean temperature of wettest quarter, (6) mean temperature of driest quarter, (7) precipitation seasonality, (8) precipitation of warmest quarter, and (9) precipitation of coldest quarter.
maxent appeared to perform well in our analysis, according to the area under the receiver operating characteristic curve (AUC) and threshold-based evaluation methods (Phillips et al., 2006). The ENM had an AUC of 0.983 and strongly rejected the hypothesis that test points are predicted no better than by a random prediction for all thresholds implemented in maxent. No locality points fell outside the predicted distribution, with the exception of a single observation from Imperial, California. In general, the ENM shows that Bigfoot should be broadly distributed in western North America, with a range comprising western North American mountain ranges such as the Sierra Nevada Mountains, the Cascades, the Blue Mountains, the southern Selkirk Mountains, and the Coastal Range of the Pacific Northwest. Based on jackknife analyses for models including each variable alone, ‘precipitation of the coldest quarter’ was the bioclimatic variable that contributed most to the ENM, followed by ‘temperature annual range’, ‘mean temperature of the wettest quarter’, and ‘mean temperature of the driest quarter’. An ENM produced using footprint data alone was highly similar to that for all sighting data (not shown).
It is expected that species distributions are likely to be altered under global warming, and a number of studies have used environmental layers derived from climate-change models to project contemporary distributions into the future using ENMs (Hijmans & Graham, 2006; Pearson et al., 2006; Loarie et al., 2008). Despite potential weaknesses in such approaches (Pearson et al., 2006), we were interested in examining the potential ramifications of climate change on hypothetical remnant Sasquatch populations and to predict how the frequency of sightings might change in the future. We projected ENMs generated from the WORLDCLIM data onto bioclimatic layers simulated for a doubling of atmospheric CO2 (http://www.diva-gis.org/climate.htm). As expected for montane organisms, the model predicts Bigfoot to abandon lower altitudes and also to lose habitat in coastal regions (Fig. 2b, c). However, this loss of habitat should be compensated by a large potential gain in the northern part of the Sasquatch range and in several other montane areas (e.g. Arizona, Nevada, Utah), should such areas remain undisturbed by human activity in the near future (Fig. 2c). Thus, given our model and available data, we might expect Bigfoot sightings to increase in frequency in northern latitudes and at higher elevations over the coming years.
Notably, the predicted distribution of Sasquatch (Fig. 2a) appears similar to that which might be expected for other large mammals of western North America, including the American black bear (Ursus americanus Pallas, 1780), sightings of which are thought to be sometimes confused as Bigfoot encounters (Meldrum, 2007). We sampled records of U. americanus from the same states as considered for Bigfoot (California, Oregon and Washington) from GBIF. We selected records for which physical specimens were available, allowing us to reasonably assume that black bears were not misidentified, and for which site localities could be georeferenced to a named place at minimum. Although this level of geographical accuracy may not be ideal for many ENM applications, it should be sufficient for our purposes here. We performed maxent runs as described above (present climate only). The model (Fig. 2d) performed well according to AUC (both test and training AUC > 0.98) and all threshold statistics (all P < 10−47). However, visualizing all black bear records from GBIF reveals that limiting model training to locations within California, Oregon and Washington leads to an under-prediction of the known distribution of U. americanus, particularly in New Mexico, Colorado, and parts of Canada and Mexico (Fig. 2d). Although this suggests some methodological limitations for the distribution models as implemented here, these other locality points may belong to different subspecies that experience unique environmental conditions in inland regions relative to the more coastally distributed specimens used in our analysis (Larivière, 2001). Our ENM prediction does in fact appear quite similar to the distributions of the subspecies U. a. altifrontalis and U. a. californiensis (Larivière, 2001), supporting this possibility. In any event, for our comparisons here, any under-prediction should have little consequence since we expect such limitations to be shared by the Sasquatch model, given the similar distribution of locality points.
The general similarities between distributions of the two ‘species’ are clear (Fig. 2a, d), despite the much smaller number of available black bear coordinates. Furthermore, the exact same bioclimatic variables (see above) contributed most to the ENM when evaluated using maxent’s variable jackknifing procedure. We used the I-statistic (Warren et al., 2008) to quantify the degree of similarity between the two ENMs using the program ENMTools. The observed value of I = 0.849 indeed indicates a high degree of overlap, and falls well within the null distribution generated from maxent runs for 100 randomizations of Bigfoot and black bear coordinates (Fig. 3; P < observed = 0.32). Thus, the two ‘species’ do not demonstrate significant niche differentiation with respect to the selected bioclimatic variables. Although it is possible that Sasquatch and U. americanus share such remarkably similar bioclimatic requirements, we nonetheless suspect that many Bigfoot sightings are, in fact, of black bears.