Correspondence Stephen Hartley, Centre for Biodiversity & Restoration Ecology, School of Biological Sciences, Victoria University of Wellington, PO Box 600, Wellington 6140, New Zealand. Tel: +64 4 463 5447 Email: email@example.com
Invasive species threaten biodiversity; hence, predicting where they may establish is vital for conservation. Our aim is to provide a robust predictive model for an invasive species suitable for managers acting at both global and regional scales. Specifically, we investigate one of the world's worst invasive species [the red-eared slider turtle (RES) Trachemys scripta elegans] and one of the world's biodiversity hotspots (New Zealand) as our representative systems. We used climate data and location records to define a bioclimatic envelope for the species. Multimodel inference was used to predict areas suitable for RES establishment, weighting in favour of models with low false-negative and high true-positive rates in predictive cross-validation tests. Our performance criterion was the partial area under the curve of a receiver operating characteristic plot where sensitivity exceeded 0.95. We generated both conservative (best-case scenario) and liberal (worst-case scenario) predictions, based on different levels of information about breeding potential. All predictions were expressed on a standard scale of suitability relative to existing known distribution. Globally, the best climate matches for RES outside of their native range in North America include south-east Asia, and parts of Europe, areas where RES have already established. The best available site in New Zealand is considered climatically more suitable than 16% of global sites where RES have bred successfully. While RES can survive in several areas throughout New Zealand, the potential to establish a self-sustaining (i.e. breeding) population appears restricted to the upper areas of the North Island where the mean daily temperatures in the hottest month exceed 18 °C. The methods developed here were designed to reduce false-negative predictions, because this represents a precautionary approach for species that pose a biosecurity risk. They could readily be adapted, however, to reduce false-positives when predicting areas suitable for translocation of rare and endangered species.
Climate matching is a useful tool that is often used to elucidate areas that may be susceptible to the potential establishment and invasion by a species of interest (e.g. Forsyth et al., 2004; Rodda, Jarnevich & Reed, 2009). Although climate-based predictions of potential invasions are associated with a level of uncertainty (Sikder, Mal-Sarkar & Mal, 2006), it is generally more cost effective to attempt to identify potential invaders and to keep them out of the environment than to control a species once it has become established (Mack et al., 2000; Manchester & Bullock, 2000). Bioclimatic niche models identify areas suitable for a species' survival and potential establishment. They are based on statistical associations between climate variables and the presence or absence of a species, both within and outside of its native range, and provide valuable assessments of the likelihood of establishment before a species is introduced into an area (Thuiller et al., 2005).
Invasive species often have the greatest detrimental effects in naïve environments that contain no ecological equivalents. An excellent example is New Zealand, where for 80 million years evolution proceeded without the influence of terrestrial mammals (except bats) (King, 2005). Since the introduction of terrestrial mammals, the native fauna has undergone dramatic declines. Resident reptiles in New Zealand consist of the endemic tuatara (Order: Rhynchocephalia) and lizards (Order: Squamata). No crocodilians or freshwater turtles are naturally present. The RES is one of the most affordable (from 50 NZD) and easily obtained exotic reptiles in the New Zealand pet trade (K. Heidy Kikillus, pers. obs.). Although the importation of RES has been banned since 1965, local reptile breeders legally contribute c. 2000 turtles annually to the New Zealand market (Feldman, 2005), and RES have been found living in the wild throughout New Zealand (Thomas & Hartnell, 2000; Hoskins, 2006; Dykes, 2007). Some regional councils now classify RES as pests in the wild and advocate further research to determine whether this species poses a threat to New Zealand's environment (Auckland Regional Council, 2007; Greater Wellington Regional Council, 2007).
The aim of this paper is to provide a predictive model for an invasive species worldwide, and to use these data to estimate probabilities of establishment of that invasive species across a case-study region. Specifically, we use one of the worst invasive species worldwide (the RES) and one of the most highly vulnerable habitats (New Zealand) as our representative systems. We aim to: (1) predict areas of suitable RES habitat globally with a model chosen for its low false-negative and high true-positive prediction rate; (2) use the global data to predict areas in New Zealand that are susceptible to RES establishment; (3) provide both conservative (best-case scenario) and liberal (worst-case scenario) predictions, based on knowledge of breeding status; (4) express all predictions on an intuitive scale that is independent of the sample size and relative to the suitability across the full range of known occurrences.
Current distribution records of RES presence were collected from scientific publications, online databases and wildlife field guides (supporting information Tables S1 and S2). If latitude–longitude coordinates were given, these were recorded. If coordinates were not supplied, locations were found using Google Earth (http://earth.google.com) and their latitude–longitude coordinates (WGS84) were recorded. Location records were assigned to one of two datasets: (1) conservative dataset (records where RES have bred successfully in native and introduced ranges), (2) liberal dataset (records from the conservative dataset plus RES sightings where breeding was not confirmed). If a location record did not list breeding status, even if successful breeding was highly likely (e.g. juveniles found in the wild), it was placed in the liberal dataset.
Global climate data for temperature and precipitation were obtained from the Intergovernmental Panel on Climate Change (IPCC) as an array of half-degree latitude × longitude grid cells (New, Hulme & Jones, 1999). Monthly averages for the period 1961–1990 were used to calculate the mean annual temperature (MAT), mean temperature in the hottest month (MAXAVG), mean temperature in the coldest month (MINAVG), mean daily maximum temperature in the hottest month (MAXMAX), mean daily minimum temperature in the coldest month (MINMIN) and total annual precipitation (PPT).
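The six predictors can be derived from monthly normals for a single grid cell along the following lines. This is a minimal sketch rather than the procedure actually used: the function name and argument names are hypothetical, the hottest and coldest months are assumed to be defined by the monthly mean temperature, and MAT is taken as the unweighted mean of the 12 monthly means.

```python
def climate_summary(tmean, tmax, tmin, precip):
    """Summarise 12 monthly climate normals for one grid cell.

    tmean, tmax, tmin: monthly mean, mean daily maximum and mean daily
    minimum temperature (deg C); precip: monthly precipitation totals.
    Assumption: hottest/coldest month is chosen by monthly mean temperature.
    """
    for series in (tmean, tmax, tmin, precip):
        if len(series) != 12:
            raise ValueError("expected 12 monthly values")
    hottest = max(range(12), key=lambda m: tmean[m])
    coldest = min(range(12), key=lambda m: tmean[m])
    return {
        "MAT": sum(tmean) / 12,      # mean annual temperature (unweighted)
        "MAXAVG": tmean[hottest],    # mean temperature, hottest month
        "MINAVG": tmean[coldest],    # mean temperature, coldest month
        "MAXMAX": tmax[hottest],     # mean daily maximum, hottest month
        "MINMIN": tmin[coldest],     # mean daily minimum, coldest month
        "PPT": sum(precip),          # total annual precipitation
    }


# Made-up Southern Hemisphere cell (January hottest, July coldest).
example_tmean = [20, 19, 18, 15, 12, 10, 9, 10, 12, 14, 16, 18]
example = climate_summary(
    example_tmean,
    [t + 5 for t in example_tmean],
    [t - 5 for t in example_tmean],
    [100] * 12,
)
```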
The presence or absence of RES was modelled using binary logistic regression performed with R v.2.8.1 (R Development Core Team, 2007). All half-degree gridcells with no record of RES were treated as absences. Twelve alternative models, incorporating different combinations of temperature and precipitation variables, were constructed (Table 1). The multimodel approach allowed us to estimate the consistency of predictions on a site-by-site basis in the face of uncertainty about which is the ‘correct’ model (Hartley, Harris & Lester, 2006). It also enabled us to identify which climatic variables were most effective in defining a climatic envelope for the species that could be transferred across space. Furthermore, when there is some uncertainty regarding the correct model, predictions based on multimodel inference are generally regarded as more robust than those derived from a single model defined a priori (Roura-Pascual et al., 2009).
Table 1. Twelve different specifications representing the set of plausible candidate models
Model  Predictors
(a)  Winter minimum, summer maximum and total annual precipitation
(b)  (a)+all two-way interactions
(c)  Daily minimum, daily maximum and total annual precipitation
(d)  (c)+all two-way interactions
(e)  Summer maximum, total annual precipitation, +two-way interaction
(f)  Winter minimum, total annual precipitation, +two-way interaction
(g)  Daily maximum, total annual precipitation, +two-way interaction
(h)  Daily minimum, total annual precipitation, +two-way interaction
(i)  Mean annual temperature and total annual precipitation
(j)  (i)+all two-way interactions
(k)  Daily minimum, daily maximum, winter minimum, summer maximum and total annual precipitation
(l)  (k)+all two-way interactions
MINAVG, mean winter temperature; MAXAVG, mean summer temperature; PPT, total annual precipitation; MINMIN, mean coldest daily temperature; MAXMAX, mean warmest daily temperature; MAT, mean annual temperature.
The predictive ability of each of the 12 models was evaluated using cross-validation. For cross-validation purposes the global dataset was partitioned into five geographic areas with a minimum of 10 positive occurrences in each: (1) western North America: comprised of both introduced and native range records; (2) central North America: comprised of native range records only; (3) central–eastern North America: comprised of both introduced and native range records; (4) eastern North America: comprised of both introduced and native range records; (5) rest of the world: comprised of introduced range records only (Fig. 1). Each model was run using three of these geographic areas as ‘Training data’ (calibration) and the two remaining areas as ‘Test data’ (evaluation). The coarse-scale geographic separation of the test and training data minimizes the chances of identifying spurious relationships due to cross-correlations between spatially autocorrelated predictor variables (Lennon, 2000; Hartley et al., 2006). Every possible combination of three training data groups and two test data groups was fitted and evaluated. Our specific metric for quantifying predictive ability of the models was based on receiver operating characteristic (ROC) plots of the test data calculated using the ROCR package (Sing et al., 2004).
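Choosing three of the five geographic areas for training and leaving two for testing gives 10 distinct partitions. A short sketch of that enumeration follows; the region labels are taken from the text, while the function name is hypothetical and the model fitting itself is omitted.

```python
from itertools import combinations

REGIONS = [
    "western North America",
    "central North America",
    "central-eastern North America",
    "eastern North America",
    "rest of the world",
]


def train_test_splits(regions, n_train=3):
    """Return every (training regions, test regions) partition."""
    splits = []
    for train in combinations(regions, n_train):
        # Regions not used for calibration are held out for evaluation.
        test = tuple(r for r in regions if r not in train)
        splits.append((train, test))
    return splits


splits = train_test_splits(REGIONS)
print(len(splits))  # 10
```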
To produce an ROC plot, the predicted probability of occurrence (a continuous value between 0 and 1) is converted into a binary prediction of presence or absence, using a threshold value. The binary predictions are then cross-classified with the observed data and the proportion of true positives (also referred to as ‘sensitivity’) is plotted against the proportion of false positives (‘1−specificity’). A curve is generated by repeating this process for all possible thresholds between 0 and 1 (Fielding & Bell, 1997; Peterson, Papeş & Soberón, 2008). The region in the upper-left (0,1) corner of the ROC plot represents a model and threshold that maximizes true positives while minimizing false positives (Fielding & Bell, 1997).
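The construction of the curve can be illustrated in a few lines. This is a generic sketch and not the ROCR implementation used in the analysis; the function name and the example scores are invented for illustration. Cells with tied predicted probabilities are added together, so each distinct threshold contributes one point.

```python
def roc_points(scores, labels):
    """Return ROC points as (false-positive rate, sensitivity) pairs.

    scores: predicted probabilities of occurrence;
    labels: 1 for an observed presence, 0 for a presumed absence.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one presence and one absence")
    # Sort by decreasing score: lowering the threshold then adds cells
    # to the predicted-presence set one score value at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up predictions for six grid cells (three presences, three absences).
points = roc_points([0.9, 0.8, 0.7, 0.6, 0.4, 0.2], [1, 1, 0, 1, 0, 0])
```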
Often the area under the curve (AUC) is used as a performance measure for bioclimatic models, as it integrates performance across all possible thresholds (Fielding & Bell, 1997). In a biosecurity context, however, a precautionary approach would suggest that it is important to achieve a high rate of true positives and few false negatives, even though this may come at the expense of a high number of false positives. We are not interested, therefore, in how models perform with high thresholds that minimize false positives at the expense of true positives (the left-hand side of a traditional ROC plot). Instead, we set a lower limit on our required true positive rate (sensitivity) of 0.95, and measured the AUC over the part of the curve to the right of this sensitivity limit (Fig. 2). This performance measure is called the partial area under the curve (pAUC) (see also Peterson et al., 2008). Each bioclimatic model was scored based on the number of times it was selected as the best predictive model (i.e. the one with the highest pAUC of the test data) out of the 10 possible combinations of the data partitions (Table 2). The scores were divided by 10 to convert them into model weights, which represent our relative belief in the predictive ability of each model (Burnham & Anderson, 2002).
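One way to compute such a partial area, and to turn the per-partition winners into weights, is sketched below. The exact integration limits used in the analysis are not spelled out in the text, so this sketch assumes the pAUC is the area under the ROC curve from the point where it first reaches the sensitivity limit to a false-positive rate of 1, with linear interpolation between points. The function names are hypothetical, the list of winning models is invented for illustration, and ties for best model within a partition are not handled.

```python
from collections import Counter


def partial_auc(points, min_sensitivity=0.95):
    """Area under the ROC curve where sensitivity >= min_sensitivity.

    points: (false-positive rate, sensitivity) pairs ordered from
    (0, 0) to (1, 1), with sensitivity non-decreasing.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 < min_sensitivity:
            continue  # the whole segment lies below the limit
        if y0 < min_sensitivity:
            # The segment crosses the limit: clip its start point.
            frac = (min_sensitivity - y0) / (y1 - y0)
            x0 = x0 + frac * (x1 - x0)
            y0 = min_sensitivity
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid rule
    return area


def model_weights(best_model_per_partition):
    """Weight = share of partitions in which a model had the highest pAUC."""
    counts = Counter(best_model_per_partition)
    n = len(best_model_per_partition)
    return {model: count / n for model, count in counts.items()}


perfect = partial_auc([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)])   # 1.0
uninformative = partial_auc([(0.0, 0.0), (1.0, 1.0)])         # about 0.049

# Hypothetical winners across the 10 partitions (not the published results).
weights = model_weights(["l", "l", "l", "l", "a", "a", "k", "k", "c", "e"])
```

Under this definition a model that reaches a sensitivity of 0.95 at a lower false-positive rate receives a larger pAUC, which is the behaviour a precautionary criterion requires.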
Table 2. Comparison of models from Table 1 using the conservative dataset [confirmed breeding records of the red-eared slider (RES) Trachemys scripta elegans]
Cross-validation indicates performance in predicting withheld, geographically separated, test data (mean of 10 partitions). pAUC, partial area under the curve measured where sensitivity ≥0.95.
Performance under self-validation, where all available data were used to make final predictions of suitability.
Model weights determined from the number of times models were selected as the best predictive model (i.e. the one with the highest pAUC of the test data). Models in bold account for >50% of the total weight.
Suitability score=rank of New Zealand's most suitable site relative to all the positive location records, based on the modelled probability of occurrence. A score of 0.16 indicates that New Zealand's most suitable site is modelled as being at least as favourable as 16% of the locations from which RES have been recorded to breed successfully.
Weighted multi-model average
Final predictions were made from a multimodel weighted average fitted to the complete dataset of all known records and presumed absences (Table 2 and supporting information Table S3). The predictions of the weighted model average from both the conservative and the liberal datasets were visualized as (1) climate envelopes in climate space; (2) maps of predicted suitability; (3) box plots of suitability for specified geographic regions of management interest. In each of the above visualizations, the predicted probability of occurrence from the logistic regression (a value from 0 to 1 that is sensitive to prevalence in the fitting dataset) was rescaled into a score of relative suitability that has a more intuitive interpretation. For example, a suitability score of 0.2 indicates that the location in question has been modelled to have a predicted probability of occurrence equal to or greater than 20% of the sites from which the species has been recorded. A consistent colour scheme was adopted for visualizing ‘suitability’ with break points at the 0th, 1st, 5th, 10th, 50th and 90th percentiles of the predicted probability of occurrence of the positive sites. The values of the 1st and 5th percentiles were reflected back across the 0th percentile, on the logit scale, to create a value for a ‘-1st’ and a ‘-5th’ percentile, the latter two representing values below the suitability of even the least suitable occupied site. Maps were redrawn at a finer resolution (5-min resolution) for New Zealand using data from WorldClim (Hijmans et al., 2005) in order to determine more precisely which areas of New Zealand may be climatically suitable for RES.
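The rescaling and the reflected break points can be sketched as follows. This is an illustration under assumptions rather than the code used for the figures: the function names are hypothetical, the example probabilities are invented, and the suitability score is taken to be the fraction of occupied sites whose predicted probability does not exceed that of the focal site.

```python
import math


def suitability(p_site, p_occupied):
    """Fraction of occupied sites with predicted probability <= p_site."""
    return sum(1 for p in p_occupied if p <= p_site) / len(p_occupied)


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


def reflect_below_minimum(p_percentile, p_minimum):
    """Reflect a percentile value across the 0th percentile on the logit scale.

    The result lies as far below the minimum (in logit units) as the
    percentile value lies above it.
    """
    return inv_logit(2 * logit(p_minimum) - logit(p_percentile))


# Made-up predicted probabilities at five occupied sites.
occupied = [0.1, 0.2, 0.3, 0.4, 0.5]
score = suitability(0.25, occupied)            # 0.4
below = reflect_below_minimum(0.2, 0.1)        # 4/85, about 0.047
```

Because the score is a rank among occupied sites, it does not change when the number of presumed absences in the fitting data changes, which is what makes it comparable across datasets of different prevalence.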
The conservative dataset had 191 locations of confirmed breeding sites of RES (supporting information Table S1). Some sites were in the same terrestrial grid cell and were not counted twice, meaning that 163 location records were used in analyses. The liberal dataset contained 459 location records (including the records from the conservative dataset) (supporting information Table S2). Again, only one record per terrestrial grid cell was used, resulting in 352 location records.
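Collapsing records to one per grid cell can be done by mapping each coordinate pair to a cell index. The sketch below assumes half-degree cells aligned to whole multiples of 0.5° and ignores the land/sea mask; the function names and example coordinates are hypothetical.

```python
import math


def grid_cell(lat, lon, size=0.5):
    """Index of the grid cell containing a latitude-longitude point."""
    return (math.floor(lat / size), math.floor(lon / size))


def unique_cells(records, size=0.5):
    """Set of occupied cells, so that each cell is counted only once."""
    return {grid_cell(lat, lon, size) for lat, lon in records}


# Three made-up records; the first two fall in the same half-degree cell.
records = [(-36.85, 174.76), (-36.90, 174.80), (-41.29, 174.78)]
cells = unique_cells(records)
print(len(cells))  # 2
```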
Based on the pAUC values of the test data partition, ‘model l’ had the best predictive ability for the confirmed breeding dataset, with a relative weighting of 0.4 (Table 2). For the liberal dataset, ‘model j’ was the best performing model, with a relative weighting of 0.3 (supporting information Table S3). For consistency, ‘model l’ was used to generate a description of the RES climate envelope for both datasets (Fig. 3a and supporting information Fig. S1a).
Global predictions of areas with a suitable climate for RES derived from the multimodel average are shown in Fig. 3b and supporting information Fig. S1b. As expected, the global predictions for each dataset highlighted the species' native range in the south-eastern US as an area highly suitable for RES. When utilizing the conservative dataset (confirmed breeding records), areas identified as the best climatic match (suitability of 0.5 or greater) occurred in south-eastern China, northern India, southern Turkey, central Europe (especially along the eastern Adriatic coast) and South America (Uruguay and Argentina, near Buenos Aires). Other suitable areas (suitability ≥0.1) included parts of eastern and western Australia, northern Africa, southern Africa, Madagascar, Europe, Asia, South America and areas surrounding the RES native range in North America (Fig. 3b). Results obtained with the liberal dataset expanded on areas highlighted by the conservative dataset, with the most suitable sites identified for the presence of RES in North America (native range), South America, Europe, Asia and eastern Australia (supporting information Fig. S1b).
The ‘overlap’ in suitability between the worldwide set of grid cells occupied by RES and grid cells available in New Zealand is evident in Fig. 3c and supporting information Fig. S1c. The suitability of the ‘best’ available site in New Zealand was 0.16 based on the conservative dataset and the final fit using the multimodel average. In other words, the best site available in New Zealand is believed to be climatically better than 16% of sites incorporated into the model where RES have already been recorded to breed successfully. With regard to the liberal dataset, the relative New Zealand suitability score was 0.822, meaning that in terms of adult survival, the New Zealand environment is as suitable as at least 82% of the other locations from which free-living RES have been recorded from across the world.
Higher resolution, regional-scale predictions indicated that areas in the northern portion of New Zealand's North Island would be more suitable as RES establishment sites than areas further south (Fig. 3d). When RES location records where breeding status was unknown were incorporated (liberal dataset), the model predicted a larger area of suitable habitat where RES could survive in New Zealand (supporting information Fig. S1d). These results are consistent with reports of RES living in the wild in New Zealand (Hoskins, 2006). Currently, there are no records of RES breeding in New Zealand outside of captivity.
Global predictions and the climate envelope for RES
Our results indicate that a number of areas not currently known to be occupied by RES possess climates that are highly suitable for their breeding and establishment. Most notably, large areas of south-east Asia are predicted to be just as suitable as the majority of the native range. This is a concern as RES are farmed in China and can be found for sale in Chinese markets (Shi, 2008). Turtles, including RES, are often released into the wild in Asia as part of religious ceremonies, and work is currently being conducted in and around Singapore to determine whether the vast numbers of released sliders are successfully reproducing in the wild (Ramsay et al., 2007). Feral RES are also found in Thailand, Hong Kong, Malaysia, Vietnam, Korea, Japan, Indonesia and Taiwan (Ramsay et al., 2007).
Among the models that received some support (weights ≥0.1) for predicting the breeding range, all the different environmental variables are represented. This indicates that there is not one single climatic factor that dominates the definition of this species' fundamental niche. With the liberal dataset (including potential non-breeding records), the mean annual temperature and climate variables relating to winter minima (MINAVG and MINMIN) were most highly represented.
In their native range, RES occupy a variety of habitats, suggesting that they possess a broad tolerance of environmental conditions (Newbery, 1984). The adult RES is able to tolerate cold temperatures and can survive winter conditions where temperatures decline below −10 °C for extended periods (Cadi & Joly, 2004). Juvenile RES, however, are not as cold tolerant as the adults and the hatchlings may die at temperatures of −0.6 °C. In their native range, juvenile RES are known to overwinter in their nests in attempts to avoid or limit their exposure to freezing temperatures, and emerge the following spring (Tucker & Packard, 1998). The limited cold tolerance of hatchling RES may restrict their potential invasive range. At the other extreme, RES within their native range may survive body temperatures of over 40 °C, although their preferred body temperature is 28 °C (Crawford, Spotila & Standora, 1983).
Assessing potential for establishment at a regional level (New Zealand)
New Zealand's most suitable site received a suitability score of 0.16 based on the multimodel average. This was similar to the prediction of the single best model, ‘model l’. However, models that included only the mean annual temperature or winter minima (models f, h, i and j) among their predictors generated much higher suitability scores than those that included a measure of summer maxima. This suggests that summer temperatures are the pivotal factor in determining whether or not the New Zealand climate is considered suitable for the establishment of RES.
While adult RES are able to successfully overwinter in the warmer parts of New Zealand (Hoskins, 2006), it is unknown whether they are able to successfully reproduce in the wild and, thus, establish self-sustaining populations.
Adult females are able to store sperm for years and are capable of producing up to three clutches of between three and 24 eggs per year (Thornton, 1994; ISSG, 2006; O'Keeffe, 2006). Successful incubation of RES eggs requires temperatures between 22 and 33 °C for 55–80 days (Congdon & Gibbons, 1990; ISSG, 2006). The only reported record of successful RES breeding outdoors in New Zealand was in an outdoor exhibit at a zoo (Feldman, 1992). Feldman (1992) constructed an artificial turtle nest during a cooler than normal year in Northland, New Zealand. The temperature of the constructed nest hovered in the low 20 °C range, but reached 27 °C on occasion. Based on the results of this single nest, Feldman (1992) concluded that RES were unable to reproduce in New Zealand. RES exhibit temperature-dependent sex determination, where males are produced at cooler incubation temperatures (below c. 28 °C) and females are produced at higher temperatures (above c. 30 °C), with a pivotal temperature (where both males and females are produced) at around 29 °C (Congdon & Gibbons, 1990; Etchberger et al., 1991; Dodd, Murdock & Wibbels, 2006). In New Zealand, RES would likely experience cool incubation temperatures, which may produce only male hatchlings from successful nests. As it can be difficult to infer soil temperatures from air temperature data (Hartley & Lester, 2003), and given that the metabolic activity of developing eggs may raise the surrounding soil temperature (Burger, 1976), further research on potential nest temperatures in New Zealand is required.
Previous attempts to model areas of climatic suitability for RES at a regional scale have included additional variables such as amount of solar radiation and ‘human footprint’, a measure of human influence on the global surface (Ficetola et al., 2009). These factors are relevant, as RES are dependent on environmental warmth to regulate body temperature and for egg incubation, and released pets are often concentrated in areas of high human density. However, such predictions developed on a fine-scale regional area are not always transferable to a global scale (Randin et al., 2006). Our global-scale results are broadly consistent with the predictions for northern Italy made by Ficetola et al. (2009). Furthermore, both studies confirm that reproductive populations are associated with a warmer bioclimatic envelope than populations where breeding status is not certain.
Climate envelope approach
Climate matching between a species' native and non-native ranges is considered only a ‘first step’ in predicting establishment risk (Ficetola, Thuiller & Miaud, 2007). Climate suitability can be thought of as a coarse-scale filter of whether a species can establish in a country or a region; then, given that the climate is suitable, a more detailed analysis should examine the role of biotic interactions and the fine-scale distribution of a suitable habitat (Pearson, Dawson & Liu, 2004). Further factors to consider include propagule pressure and the history of invasions elsewhere, although this information is not always available (Kolar & Lodge, 2001; Bomford et al., 2009). Nonetheless, climate matching provides an opportunity to assess a species' establishment potential (Thuiller et al., 2005).
The use of the two datasets [‘Conservative’ (confirmed RES breeding) and ‘Liberal’ (unconfirmed RES breeding)] in this study allowed us to better visualize the potential range of RES. The conservative dataset may lead to predictions of potential range that are too conservative, and although the liberal dataset may overestimate the potential breeding range, the two datasets help to define the two extremes of what might be possible.
Choice of model metrics
From a management perspective, the suitability of the most vulnerable site within a region is a meaningful summary of a region's vulnerability to invasion, particularly against non-native species that are regularly released throughout the region. In other instances, the specific suitability at points of entry (e.g. airports and seaports) may provide the best estimate of risk. The development of a suitability score that is independent of the number of positive records in the dataset will also allow biosecurity managers to more easily compare the invasion threat of multiple species, making it possible to rank species in order of threat level and, therefore, guide management decisions and allocation of resources.
Many measures of classification performance are available (Fielding & Bell, 1997), as are different rationales for selecting specific classification thresholds (Liu et al., 2005). In a biosecurity context, false-negative predictions are more costly than false-positives, as it is more costly to eradicate a pest than to identify a species that may become a problem (Mack et al., 2000). If one knew the relative cost of false-negatives versus false-positives, a cost-minimization approach could be used to select the most efficient threshold (Hartley et al., 2006); however, this is rarely the case. Thus, in this paper we adopted a precautionary approach by only comparing models across the range of thresholds with a high sensitivity (≥0.95) and low rates of false-negatives, using the partial-AUC as our integrated performance criterion. If we were producing a bioclimatic niche model for a rare species of conservation concern (e.g. to identify sites most suitable for reintroduction), then we might define the pAUC differently, by setting an upper limit on the false-positive rate (e.g. 1−specificity=0.05) and measuring the area under the curve that lies to the left of this vertical line, in order to minimize the risk of including unsuitable sites in the selected set, even at the cost of excluding some good ones.
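For comparison, the cost-minimization alternative mentioned above can be sketched as a search over candidate thresholds. This is a generic illustration, not the procedure of Hartley et al. (2006): the function name, the unit costs and the example data are all hypothetical, and a cell is predicted as a presence when its score is at or above the threshold.

```python
def min_cost_threshold(scores, labels, cost_fn, cost_fp):
    """Return (threshold, total cost) minimising misclassification cost.

    cost_fn / cost_fp: assumed cost of one false negative / false positive.
    Candidate thresholds are the distinct scores plus infinity (which
    predicts absence everywhere). Ties go to the lowest threshold.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best_threshold, best_cost = None, float("inf")
    for t in candidates:
        fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= t)
        cost = cost_fn * fn + cost_fp * fp
        if cost < best_cost:
            best_threshold, best_cost = t, cost
    return best_threshold, best_cost


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

# Biosecurity-style costs: a missed invasion is 10 times worse.
pest = min_cost_threshold(scores, labels, cost_fn=10, cost_fp=1)   # (0.6, 1)
# Translocation-style costs: a wrongly selected site is 10 times worse.
rare = min_cost_threshold(scores, labels, cost_fn=1, cost_fp=10)   # (0.8, 1)
```

With these invented data the preferred threshold moves from 0.6 to 0.8 when the cost ratio is reversed, which mirrors the contrast between the biosecurity and reintroduction settings described in the text.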
We did not use Akaike's information criterion (AIC) (which is based on maximum likelihood methods and a correct estimation of the degrees of freedom in the model) as a measure of model performance because the spatial autocorrelation of predictors and response variable invalidates the supposed degrees of freedom. Hartley et al. (2006) considered a quasi-AIC, but concluded that predictive performance in independent test data was the more reliable criterion for model weighting. Multimodel inferences are more robust to errors in model specification (Chatfield, 1995; Marmion et al., 2009), and also allow a more meaningful assessment of uncertainty, which is crucial in risk assessment.
Finally, global-scale data were used to parameterize the models, but if finer scale climate data are available for particular areas (e.g. a country) then fine-scale predictions can be made. Here we made those predictions for New Zealand; however, the method could equally be applied to other regional or national study areas and to species of high conservation importance.
Conservation and management implications
At a global scale, the RES has already demonstrated itself to be a threat to local biodiversity (e.g. Cadi & Joly, 2004; ISSG, 2006), and the potential for it to continue to expand its range into south-east Asia suggests that tighter regulation of the pet trade in these areas is warranted. In bioclimatically suitable areas such as Florida, owners require permits to keep RES (FWC, 2008).
Our regional-scale predictions indicate that, while RES are able to survive in several areas throughout New Zealand, the potential to establish self-sustaining (i.e. breeding) populations appears to be restricted to the upper areas of the North Island, where the mean daily temperatures in the hottest month exceed 18 °C, such as Northland, Coromandel and parts of the Bay of Plenty. Because of the temperature-dependent sex determination of RES, it is possible that conditions in these areas may produce primarily male hatchlings; however, with the constant supplementation of individuals (including adult females with the ability to store sperm) to the population via deliberate releases and escapes from captivity, combined with their longevity, the potential for negative impacts on the New Zealand environment exists. Further research, including the incorporation of predicted climate change, research into nest temperatures that could be achieved in New Zealand, biotic interactions, diet and disease transmission is needed to complete a more holistic risk assessment of whether RES pose a credible risk to New Zealand's environment and economy. In the meantime, precautionary measures should be taken to ensure that more RES are not released into the wild and that existing feral individuals are removed from the New Zealand environment.
With increasing global travel and the sustained demand for unusual pets, there is a continuing need to develop methods for quantifying the risk of establishment for a range of invasive species. The pAUC methods developed in this paper are particularly suited for modelling the potential distribution of both invasive and ‘at risk’ species where false-negatives and false-positives do not have equal costs or consequences. Multimodel inference was used to increase the robustness of our predictions, and for communicating the range of plausible predictions. Finally, a suitability score was developed as an aid to an intuitive interpretation of the relative risk of establishment, irrespective of the prevalence of the species records or whether or not the species has reached an equilibrium. This suite of methods should prove useful in increasing the accuracy of the climate–envelope approach, and more importantly in communicating these results to those charged with prioritizing biosecurity and conservation resources.
We thank Andrew Stein, Mark Mitchell and Greg Hoskins of the Auckland Regional Council for providing New Zealand RES location records, Haitao Shi (Hainan Normal University) and Paul Pendelbury (Reptrans UK) for discussions about RES in China and the UK, respectively, Ken Miller for assistance with figures, and the VUW Bug Club, Herpetological Hatchet group, and two anonymous reviewers for providing comments to improve this manuscript. Financial support was provided by Victoria University of Wellington, the New Zealand Biosecurity Institute, the Society for Research on Amphibians and Reptiles in New Zealand, and Education NZ, to KHK and the Foundation for Research Science and Technology, New Zealand Science and Technology Postdoctoral Fellowship to KMH.