Quantifying species' geographic range changes: conceptual and statistical issues

Geographic range is an important metric used to evaluate species' environmental relationships. Additionally, a very small or rapidly shrinking range may indicate elevated extinction risk. However, a species does not fully occupy its range in the way a lake fills a basin and is instead best thought of as a cloud of points rather than an area per se. Samples of species presence or abundance are subject to issues of both inherent detectability and stochastic detectability due to sample locations in space and time. In addition, populations fluctuate in space and time. These factors mean that the population centroid, range area, and range margin are not deterministic. Examples from the literature demonstrate that multidirectional range changes are ubiquitous due to stochastic effects. Confounding factors, particularly due to human activities such as land use change, also complicate inference about range changes. Herein, statistical tests for centroid and margin changes in the context of stochastic fluctuations of clouds of points and sample error are demonstrated. Scale-dependent area analysis and tests for range area change are presented along with tests for responses at the community level (suites of species). This test allows for evaluation of change in spatial pattern for patchy populations with or without outliers. Geographic range change can reliably be tested with statistics based on distributions of clouds of observed points rather than bounded areas per se, if potential confounding is taken into account.


INTRODUCTION
Geographic range is a key attribute of any species. Changes in range may provide information on human impacts, invasive species spread, climate change impacts, and other changes affecting species. Rapid range shrinkage can be an early warning of species decline (Channell and Lomolino 2000, Rodríguez 2002) that may be easier to monitor than population size. Range shifts could affect decisions about sustainable use (e.g., hunting, fishing, or commercial use) and/or conservation strategies. Trailing (warm) edge losses (Hampe and Petit 2005, Wiens 2016) and leading edge (e.g., up mountain slopes) shifts could be considered a fingerprint of climate change response (Parmesan and Yohe 2003).
Modern remote sensing allows many geographic features to be mapped in a fairly continuous fashion across the landscape. Farmlands, for example, can be reliably identified, as can forest cover. Although small waterbodies (e.g., streams) may be hidden by tree cover or be ephemeral, major waterbodies are reliably detected. In distinct contrast, most species must be detected from surveys or samples. On mountains, transects across the elevational gradient may be used to detect species, with repeated surveys over time. Forest inventory plots provide regular long-term data for certain plant species. Hunting, trapping, or fishing reports can provide valuable data for some species. In most cases, a fine grid is not provided by such sampling, and thus, a number of sampling and mapping issues arise, which are addressed in my study.
The fundamental problem with detecting range changes is that survey points or transects need to be converted into areas and boundaries. Whereas a farm field or lake is a spatially extensive object that can be surveyed or mapped from aerial photography with small error relative to size, organisms do not truly occupy an area of the landscape. The area occupied by sparse, scattered objects is in fact an ill-posed metric (Loehle 2011). A plant may be found scattered across 100,000 km², but it does not occupy that area. The problem is greater for ambulatory animals.
The operational solution to this problem has been to view the geographic range as that area on the map where a species has some probability of being found. It can be formalized by drawing a line on a map around all known locations for a species. While this casual map is useful for general conservation purposes, the ill-posed nature of range metrics inhibits change detection. To detect and quantify a range boundary or area change, it is necessary that the measure be repeatable and that measurement error be quantifiable. A unique problem here, compared to measures such as the mass of a rock, is that the methods used to convert point or transect data into area occupied and range boundaries are strongly affected by scale and method. For example, some data on endangered species (e.g., http://www.natureserve.org) only specify the presence/absence at the county or watershed level. Fine-scale maps cannot be drawn from such data though they might be useful for a large-scale (e.g., national) map. Explicit consideration of these mapping issues is so far sparse. Because changes in species' ranges are not controlled experiments, correlation with hypothesized governing factors (e.g., climate) must be evaluated. This raises the issue of confounding due to unmeasured factors such as habitat loss. Finally, because plant and animal populations are dynamic, we should expect them to vary over time naturally at any given spot on the map. This temporal variation will interact with sampling variability to affect reliability of range change determinations.
Herein, these issues are explored with reference to a suite of recent studies of geographic range changes in North America and Europe. These studies often take advantage of historical floristic surveys, some 100 yr old, to look for change. In other cases, long-term animal data exist. In many cases, the purpose of these studies has been to look for signs of climate change effects. These studies generally assume that warming effects should lead to species shifts uphill or to the north.

MAPPING ISSUES
A geographic range boundary is not a well-posed metric (Loehle 2011, Loehle and Sleep 2015). Adding or removing a few outlier observations can cause a large change in a range boundary (Fig. 1a), as can changes in survey effort (Kujala et al. 2013) and random sampling effects (Clement et al. 2016), including map misregistration errors (e.g., maps from different times do not line up properly) at times of interest (Dai and Khorram 1998) or location error in general (Kennedy et al. 2009). Some of these artifacts can be ameliorated by using a grid to define the range (Fig. 1b), but irregular and discontinuous occupied regions remain a problem and "area" remains undefined. Brown et al. (2016) showed that methodological issues (frequency of sampling, number of species, non-climatic drivers, and abundance vs. presence data) could explain 22% of the variation in range shifts in their compilation of 651 marine species responses to climate, more than the 7.8% explained by ecological traits. Thus, range changes need to be evaluated in the context of a given scale, sample method, and sample intensity.
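The sensitivity of a line-around-the-points boundary to outliers can be illustrated with a minimal sketch. It assumes the range is drawn as a convex hull around the records, which is only one of several possible boundary rules; the coordinates, the seed, and the function names `convex_hull` and `polygon_area` are hypothetical and not taken from any of the cited studies.

```python
import random

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive for a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

# Hypothetical records: 200 observations scattered over a 100 x 100 km region
rng = random.Random(1)
records = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]
base_area = polygon_area(convex_hull(records))

# A single vagrant record 200 km east of the main cloud of points
outlier_area = polygon_area(convex_hull(records + [(300.0, 50.0)]))

print(f"hull area without outlier: {base_area:.0f} km^2")
print(f"hull area with one outlier: {outlier_area:.0f} km^2")
```

In this sketch a single added record enlarges the mapped area substantially even though the cloud of points is otherwise unchanged, which is the behavior described for Fig. 1a.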
Some further complications arise from these considerations. Area occupied will depend on grid resolution (Fig. 1b), meaning that surveys over time need to use the same grid. In some cases, the survey itself may be on a grid (Groom 2013). In all cases, although grid detections with a given survey effort are fairly reliable (few false positives), lack of detection in a grid square is never a guarantee of absence. In a later section, I introduce scale-dependent tests for range size change that account for these issues.
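The dependence of area occupied on grid resolution can be shown with a short sketch. It assumes a simple rule in which a cell counts as occupied if it contains at least one record; the records, the cell sizes, and the function name `occupied_area` are hypothetical.

```python
import random

def occupied_area(points, cell_size):
    """Total area of grid cells (side = cell_size) containing at least one record."""
    cells = {(int(x // cell_size), int(y // cell_size)) for x, y in points}
    return len(cells) * cell_size ** 2

# Hypothetical records: 50 observations scattered over a 100 x 100 km region
rng = random.Random(7)
records = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(50)]

# Nested cell sizes (each grid is a coarsening of the previous one)
cell_sizes = (1, 2, 4, 8, 16, 32)
areas = [occupied_area(records, size) for size in cell_sizes]
for size, area in zip(cell_sizes, areas):
    print(f"cell size {size:>2} km -> area occupied {area} km^2")
```

Because the same 50 records yield a different area at every cell size, areas from two surveys are comparable only when both are computed on the same grid.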
A feature that has been frequently studied is the northern or upper elevational forest ecotone (forest boundary) or tree limit (including scattered trees). For example, Meshinev et al. (2000) and Kullman (2001, 2002) found tree limit rise in Europe. However, the old forest boundary is an irregular and diffuse object. Therefore, defining expansion distance is an ill-defined problem, as is rate of spread, a problem exacerbated by the slow rate of tree growth at cold margins. When historical maps are used, unique problems arise due to map accuracy and resolution. MacDonald et al. (1998) used historical maps of northern forest limit in central Canada, created to assist settlers and others. They could not detect significant differences between these maps and their own delimitation of forest. Such comparisons should specifically note the resolution and accuracy of the maps from both time periods as any change or lack of change is within this context of limited resolution (greater than or not different within x km). Studies based on historical photography that fail to detect ecotone shifts (Butler et al. 1994, Klasner and Fagre 2002) are likewise limited to a certain minimum resolution. Small or scattered trees beyond the existing tree line may not be evident, for example.
This points to the general issue that with maps, photographs, or surveys using boundaries or grids, there are minimum detectable levels below which change can occur but not be observed. Likewise, for a given estimate of change, a measure of uncertainty is needed. How do we properly measure distance of scattered trees from an irregular ecotone? How does that translate into a boundary movement or rate of spread? Is the spread significant? No standard exists at present for operationalizing these tests. Some suggestions are offered in what follows.

POPULATION FLUCTUATIONS AND DETECTABILITY
In contrast to many geographic features, species range boundaries are diffuse, and species may be difficult to detect. Surveys are therefore required to document species' occurrences, but presence can be missed for several reasons, including insufficient sampling effort, especially for rare or cryptic species. It is known that most species' populations fluctuate at various temporal and spatial scales (Lenoir and Svenning 2013), sometimes due to weather fluctuations. Volatile (e.g., arthropods) or mobile (e.g., birds) species can evince large local fluctuations. Such population changes could produce a temporary range margin shift or change in a species density distribution across its range (Walter et al. 2015), leading to a change in the population centroid or distribution peak, especially in elevational gradient studies where a detected range shift might be on the order of only a few hundred meters. In fact, McCain et al. (2016) reported that, with typical values for population variability, single repeat local surveys have a 50% chance of estimating a local population trend opposite to the true trend and a bias toward erroneous detection of range shifts or contractions.
From these general considerations, it can be predicted that a pool of species would be expected to have members with both increasing and decreasing population trends (both real and due to sampling artifacts) at range boundaries, as well as expanding and contracting ranges. Range margin or distribution changes may also occur due to exogenous factors (e.g., succession, fire, changes in agriculture, or exploitation). This means that spurious range changes are likely, with unknown frequency that depends on local history, sampling method, and detectability (Fig. 2). Care must thus be taken to evaluate shifts in all directions and to attempt to determine causation. Magnitudes of shifts and their causes are critical to conservation efforts.

FIELD DATA EXAMPLES
While the above arguments about range boundary sampling error over time are robust, it is useful to examine data. Multiple studies illustrate various aspects of the difficulties encountered when trying to detect range changes, including stochastic effects, sample size effects, human and natural confounding effects, and others. Examples of these issues are presented next. This is not a comprehensive review or meta-analysis, but rather is illustrative of the various pitfalls that exist.
There are several ways in which species' records can be obtained from surveys. A random sample of locations across a landscape might be surveyed and presence of species noted. Some forest inventory designs take this approach. Samples might be randomly located within grid cells (per Groom 2013). Data from multiple studies conducted for other reasons might be compiled. Remote sensing can be used for certain special classes (e.g., forest vs. non-forest). Road surveys are used for breeding bird counts. Finally, elevational distributions are often sampled using plots laid out along transects.
Groom (2013) evaluated the entire flora of Great Britain (England, Scotland, and Wales) during 1978-1989 vs. 1995-2011 based on detailed repeat surveys. He plotted shifts of abundance centers in all directions. In the following discussion, qualitative assessments are made of his circular (directional species shift) plots as the author did not provide these data. For different sectors (England north, England south, Scotland, and Wales), the patterns differed. For species with increasing occupancy (n = 1243), centers of mass shifted in all directions with a weak northward tendency according to the author (no statistics were performed). For species with decreasing occupancy (n = 1281), which could have been due to chance, climate, human impacts, other factors, or a combination of factors, there was a weak westerly trend. Again, regional trends differed. Species from warmer areas were not more likely to be moving north according to the author, but ranges per se were not reported.
Centers of abundance (rather than boundaries per se) shifted in all directions over the roughly 20-yr interval, likely due to both sampling error (noise) and population fluctuations. Some of the shifts may have resulted from the shape of the respective land areas, as noted by the author.
Fig. 2. Effect of detections on range size metric. In (a), an initial estimate of range size is compared to (b) at a later time. Due to inadequate sampling, the actual increase in range is estimated to be a decrease. Effect would also apply to grid-based map.
Vegetation studies at elevational ecotones present a similar picture. Felde et al. (2012) evaluated vascular vegetation change on an elevational gradient in south-central Norway using an 80-yr resurvey (1922-1932 vs. 2008). Over the period, temperature and precipitation rose, with a concomitant reduction in snow cover. In addition, grazing by several livestock species decreased considerably after 1956. Using permutation tests within elevational bands, they detected significant changes in range margins and distributions but directions of change were not consistent with documented environmental changes. The authors argued for individualistic explanations for such shifts with some confounding due to reduced grazing. It was not entirely clear how they performed permutation tests. Lenoir et al. (2008) used long-term comprehensive floristic surveys in France during 1905-2000 to test for elevational shifts over time in probability of presence. They created two subsets of data roughly 20 yr apart, although the earlier period had a much wider temporal spread, and balanced number of samples by grid count and elevation. This balanced grid design is commendable. Of the 171 species, maximum probability of presence (based on logit model fits of presence data) moved upslope for 41 and downslope for five, for a mean elevational shift of approximately 29 m/decade.
In a study of 74 common tree species across the United States over a 28-yr period, Hanberry and Hansen (2015) summarized density of trees >12.7 cm dbh by latitudinal bands. They found 12 species that expanded northward, 13 species that expanded southward, and one species that expanded in both directions. Small decreases in trailing edge abundances (north and south) were possibly detected. However, no general northward shift was detectable, nor were significant range shrinkages, according to the authors. They did not specify their criteria for significant shifts. Data resolution may not have allowed precise testing.
These multidirectional shift results for plants (also Frei et al. 2010, Foster and D'Amato 2015) also hold true for arthropods (Parmesan et al. 1999, Konvicka et al. 2003, Hickling et al. 2005), birds (Hitch and Leberg 2007, Sorte and Thompson 2007, Popy et al. 2010, Gillings et al. 2015, Bateman et al. 2016), and mammals (Moritz et al. 2008, Myers et al. 2009, Rowe et al. 2010). In each study, some percentage of the species moved both up and down or north and south or in all directions when this was reported, whether the metric was range boundary, position along transects, or abundance centroid. For the few studies that estimated range size, both increasing and decreasing range areas were found. These multidirectional shifts are what is expected due to sampling any rare, patchy object (a species) that has endogenous fluctuations and sampling error. The unresolved question is how one can separate the signal from this noise. That is, when species are patchy and shifting in multiple directions, how does this affect our inference of a treatment magnitude? Approaches are proposed in a following section. Parmesan et al. (2011) noted that it is not advisable to attribute any particular range shift to climate change. One reason for caution is that, as shown above, opposite direction movements are common and pose a problem for interpretation. More critically, there are confounding factors, such as habitat loss, that may produce trends mimicking the factor being studied.

CONFOUNDING FACTORS
Changes in human land use and/or management can locally produce changes that may look like a warming effect. Pauli et al. (2007) found a net expansion upward in ranges of vascular plants in the high Alps (Austria), though they speculated that reduced high meadow grazing in the Alps over recent decades may be partially causative. Similarly, Chauchard et al. (2010) attributed the spread of white fir (Abies alba) to higher elevations in the Alps to succession following land abandonment in the 18th and 19th centuries. Note the long causal delay in this latter case. In both cases, reduced intensity of land use led to woody invasion upslope. Bodin et al. (2013) used data on 30,985 plots from the French National Forest Inventory for abundance of 252 plant species over two inventory periods in southeast France. Successive inventories for the same plots were compared for surveys in the 1980s and 1990s. There was a mean upward trend for species peak abundance of 18 m in elevation, but accounting for forest maturation (succession) mainly at lower elevations (which disfavored heliophytes) removed any significant differences between number of species whose peak distribution shifted up vs. down in elevation. Forest maturation again resulted from less intensive management in this region. Myers et al. (2009) reported that very rapid movements north by some small mammals in the northern Great Lakes region of the United States were possibly facilitated by human activities.
Lower elevation dry-end ecotones in the American West show a 100-yr trend of densification and some expansion downslope (Veblen and Lorenz 1991, Brown and Wu 2005, Stine et al. 2014, Brown et al. 2015, Addington et al. 2017). The cause of this forest expansion appears to be reduced incidence of fire since the mid-1800s, together with cattle grazing, which helps pine (Pinus spp.) seedlings become established by reducing grass competition (Addington et al. 2017). It would be erroneous to attribute this change to a climatic factor.
An obvious confounding factor is habitat loss. Lavergne et al. (2005, 2006) documented stable presence of most locally rare plants in a region of southern coastal France over a 115-yr interval, but losses of some Euro-Siberian species at their southern range margin. Much of the loss of species was attributed by the authors to habitat loss from development. Franco et al. (2006) found that the southern range margins for four butterfly (Lepidoptera) species in Britain had shrunk in the past 20-40 yr, primarily due to habitat loss.
Directional shifts may occur for other than the hypothesized reason. For the eastern United States, a dominant northwestern shift was found for plants (Ash et al. 2017, Fei et al. 2017) that tracked combined temperature and precipitation change patterns. In England, there was a northwestern bias to shifts but mainly due to landmass configuration (Groom 2013). A northeastward shift in bird distributions in northern California has also been reported that tracked precipitation increases.
As noted above, increased sampling effort at a more recent time will tend to create a bias toward range expansion. Kujala et al. (2013), in a reanalysis of bird range studies, found that controlling for survey effort either reduced (Zuckerberg et al. 2009) or largely eliminated (Lennon 1999, Brommer 2004) range shifts found by these studies.
Finally, land use change could attract/repel or favor/disfavor species. Changes in agricultural practices are an example. Arthropods and birds that prefer agricultural landscapes could be affected by such changes. In the earlier example, reduced grazing of high meadows in the Alps may have led to tree invasion.
In addition to tests of centroid or boundary shifts per se, it is also important to test for correlations with the hypothesized driving factors, such as warming. For example, Lehikoinen and Virkkala (2016) compared detailed spatial data for 128 bird species in Finland during 1970-1989 vs. 2000-2012 (approximately a 25-yr period). The center of gravity of bird densities for all species combined shifted an average 37 km to the north-northeast (approximately 1.48 km/yr). While this range change distance is consistent with predictions, they found that species-specific directions of density shifts did not correlate with temperature shifts. They attributed this lack of correlation to influences by factors such as land use change. In such a case, it is difficult to assess the hypothesis (warming, in this case) because of lack of species-specific correlations. In other cases, precipitation shifts may be as important as temperature in explaining range shifts (Ash et al. 2017, Fei et al. 2017). Seasonality rather than temperature per se may also need to be considered.
Inferring causation when a range shift is observed is not easy. In most of the examples given, confounding could act to mimic the hypothesized change. Confounding could also operate in the opposite direction, causing species to move downhill, appear to have a shrinking range if recent survey effort is less than a prior survey, or shift distribution peaks in random directions due to disturbances. For eliminating confounding due to human activities, there are some possible guidelines. For example, habitat loss and land abandonment from agriculture can be documented. Likewise, changes in land management (e.g., cessation of grazing, fire suppression) can have long-term, known effects. Some such changes are at least roughly predictable. Effects due to other factors (e.g., water pollution) may be considered based on known data.

STATISTICAL TESTS
It is well-known that spatial features are difficult to quantify. Boundary features such as coastlines are subject to map accuracy issues (Crowell et al. 1991), with change detection being a particular difficulty (Alesheikh et al. 2007, Wernette et al. 2017). A species' range boundary is just such a feature. Lengths of boundaries can be scale-specific (Loehle 1983). Small map classification errors have been shown to potentially create large errors in landscape pattern indices such as habitat fragmentation (Langford et al. 2006), an issue relevant to irregular range boundaries or patchy ranges. A great deal of work has examined animal home ranges (Laver and Kelly 2008), and it might seem that this work is analogous to the geographic range problem. However, it is not directly applicable. Convex polygon methods produce an estimate of home range size but are subject to sample size, sampling regime, and boundary cutoff effects (Loehle 1990, Seaman et al. 1999, Börger et al. 2006). Utilization distribution methods create a probability of use map rather than an area estimate. While a utilization distribution can be converted to an areal metric by specifying a probability cutoff, the many different algorithms and computational choices involved in modeling this metric (Laver and Kelly 2008) mean that there is considerable ambiguity in the result. My suggestion is that working directly with the cloud of survey points may be simpler.
It is important to note that the problem of range change detection is not the same statistical problem as that addressed in the remote sensing literature. Remote sensing data lead to spatially continuous maps of classified pixels, whereas species data are based on scattered sample points. While remote sensing classification at the pixel scale has accuracy and misclassification issues for certain land cover classes (De Bruin and Gorte 2000, Stehman 2001, Foody 2002, Herold et al. 2008), it is sample density and detectability limitations rather than misclassification that limit species range map accuracy. Finally, land cover change detection suffers from map alignment and mapping errors at different survey times and generally produces statistics on changes in numbers of pixels by category (Coppin et al. 2004, Jin et al. 2013), whereas range changes address shifts in the spatial distribution of clouds of points. Thus, while some of the lessons and cautions from remote sensing studies can be carried over, the statistical questions and data differ considerably. In what follows, I introduce some approaches for dealing with changes in the spatial location of clouds of observed species locations.
Some previous work has been based on a distribution model or occupancy modeling (Lenoir et al. 2008, Moritz et al. 2008). From this analysis, one can develop a probability of presence model at each of two times and test for a change of boundary, peak abundance, or entire range. While useful in certain cases, species data may be too sparse in space, multimodal, or too patchy to support such analyses. I here provide alternate methods for such patchy distributions.
The simplest case to consider is the population centroid. This is analogous to the center of mass of a physical object. Due to both variation in where each sample is taken over time and detectability issues, a species may be present but not observed at any sample point at a particular time. The effect of sampling error on the centroid over repeat samples is illustrated in Fig. 3a. In the example, the true locations of organisms stay the same. We can picture this as presence within a large cell (say 1 km²). The centroid clearly varies simply due to detectability at the cell level. In the example, center of mass over repeat surveys could differ from each other in space by more than two cell widths (2 km).
In a second case, we can consider a species that fluctuates mainly at a small scale. Annual plants and many animals would fall into this category. In Fig. 3b, this case is illustrated with a reassignment of individuals to cells for each repeat survey but with no sample error. In this case, the resampled centroids also vary.
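The two cases can be reproduced in outline with a small simulation. This is a minimal sketch rather than the code behind Fig. 3: the grid size (20 x 20), occupancy probability (0.3), and detection probability (0.5) follow the figure caption, while the number of repeat surveys, the random seed, and all function names are assumptions.

```python
import math
import random

def centroid(cells):
    """Mean x and y of a non-empty list of occupied (x, y) cells."""
    n = len(cells)
    return (sum(x for x, _ in cells) / n, sum(y for _, y in cells) / n)

def random_population(rng, size=20, p_occupied=0.3):
    """Assign population units to cells of a size x size grid at random."""
    return [(x, y) for x in range(size) for y in range(size)
            if rng.random() < p_occupied]

def survey(rng, population, p_detect=0.5):
    """Keep each occupied cell with probability p_detect (imperfect detection)."""
    return [cell for cell in population if rng.random() < p_detect]

def max_distance(points, origin):
    """Largest Euclidean distance from origin to any point."""
    return max(math.hypot(x - origin[0], y - origin[1]) for x, y in points)

rng = random.Random(42)
population = random_population(rng)
true_centroid = centroid(population)

# Case (a): fixed population, 50% of occupied cells missed in each survey
detect_centroids = [centroid(survey(rng, population)) for _ in range(200)]

# Case (b): population reassigned to cells before each survey, no sample error
fluct_centroids = [centroid(random_population(rng)) for _ in range(200)]

print("true centroid:", true_centroid)
print("max offset, detection error only:", max_distance(detect_centroids, true_centroid))
print("max offset, population fluctuation only:", max_distance(fluct_centroids, true_centroid))
```

In both cases the centroid moves between surveys even though no directional change has been imposed, so some centroid displacement is the expectation under the null.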
These two cases illustrate that the null expectation due purely to sampling error and population fluctuations over space and time is for the centroid to vary. To say that the centroid varies more than chance, we need to characterize the null distribution of centroid variation due to chance. It is not likely possible to develop standard expectations for the centroid null for several reasons. Samples are rarely taken on a regular grid. Bird surveys, for example, are often conducted along roads (Robbins et al. 1989) and may have a higher density in more populous areas. Forest inventory plots are sparse and placed randomly or stratified by forest type. Surveys may be opportunistic or local, such that a species range is not covered uniformly. Historical species ranges may be based on archival (e.g., museum) records that were opportunistic. Finally, sample density and/or grid scale will affect results, as will patchiness of the range and species rarity. It is thus likely that study-specific estimates of the null distribution (variation of the centroid by chance) are needed. Given a distribution of sample locations over space and an estimate of detectability, a resampling procedure, as done here, could be used to estimate sample error and detectability effects on centroid variability. The effect of population fluctuations is more complex but could be estimated with simple models (Cabral and Schurr 2010). The resultant null can be characterized by a mean and variance of centroid shift, plus a directional null, which is likely to be equal in all directions unless constrained by topography, human development, or other factors. The null can be suitably modified; for example, no major shift could be expected if a coastline blocks the way in that direction. This test can be applied to individual species or suites of species.
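One simple way to build a study-specific null for the centroid shift is a label-permutation test, sketched below. This is a swapped-in technique offered as an illustration, not the resampling procedure used for the figures: it assumes that sampling effort and sample locations are comparable between the two survey periods, and it does not model detectability or population dynamics explicitly. The survey data, seeds, and function names are hypothetical.

```python
import math
import random

def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def centroid_shift(survey1, survey2):
    """Euclidean distance between the centroids of two sets of records."""
    (x1, y1), (x2, y2) = centroid(survey1), centroid(survey2)
    return math.hypot(x2 - x1, y2 - y1)

def permutation_p_value(survey1, survey2, n_perm=2000, seed=0):
    """Share of random relabelings whose centroid shift is at least the observed one."""
    rng = random.Random(seed)
    observed = centroid_shift(survey1, survey2)
    pooled = list(survey1) + list(survey2)
    n1 = len(survey1)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if centroid_shift(pooled[:n1], pooled[n1:]) >= observed:
            hits += 1
    # Add-one correction so the estimated P value is never exactly zero
    return observed, (hits + 1) / (n_perm + 1)

def scatter(rng, n, center_x, center_y, spread=20.0):
    """Hypothetical records: a Gaussian cloud of n points (km)."""
    return [(rng.gauss(center_x, spread), rng.gauss(center_y, spread)) for _ in range(n)]

rng = random.Random(3)
early = scatter(rng, 80, 50, 50)
late_same = scatter(rng, 80, 50, 50)      # no true shift
late_shifted = scatter(rng, 80, 50, 80)   # true shift of 30 km north

shift_same, p_same = permutation_p_value(early, late_same)
shift_moved, p_shifted = permutation_p_value(early, late_shifted)
print(f"no true shift: observed {shift_same:.1f} km, P = {p_same:.3f}")
print(f"30 km shift:   observed {shift_moved:.1f} km, P = {p_shifted:.4f}")
```

A directional version of the same idea would compare the observed bearing of the shift with the bearings of the permuted shifts, and the permutation scheme would need to be restricted where a coastline or other barrier rules out some directions.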
If large numbers of species exhibit changes in all directions (as in Groom 2013), this is what we expect by chance and the treatment effect is the directional change minus this null. That is, if the net direction effect is only slightly bigger than the null, as in Groom (2013), then we can say that the effect of interest is perhaps present but not strong compared to random. In the case of plants in Britain (Groom 2013), the remeasurement interval is perhaps too short for a signal to overwhelm noise.
Detecting range boundary changes poses a different problem. In the introduction, a range boundary created by simply drawing a line around observed locations (Fig. 1) was shown to be rather arbitrary. Statistics cannot be applied to such boundaries. Consider a set of location data at a range boundary (Fig. 4a), where a species becomes less common toward the boundary (top of figure). A new dataset was constructed with a less steep gradient and thus a range margin that has moved up (Fig. 4b). While visually we might believe that we can see the difference between the two maps, this visual impression is not objective. How can we test it statistically? One approach is to use horizontal bands (perpendicular to the gradient) and count observations in each band. This approach was used by Hanberry and Hansen (2015) for trees in the United States. For wide bands, as in their study, small changes are not detectable. More seriously, for any banding-type data aggregation, we still need a null distribution for comparison. How much will banded data fluctuate due to sample error and random population fluctuations? Simulation as in Fig. 3 could be conducted to answer this question.
Fig. 3. Effect of stochastic processes on population center of mass. For a 20 × 20 grid, population units are assigned randomly to cells with probability 0.3. Population centroid of a randomly generated sample is red. (a) For 50% stochastic sampling error (failure to sample the cell or detect) from the same data as in (a), the computed centroid varies (green). (b) For stochastic fluctuations in the population (random reassignment to cells), the same effect is observed (blue). Note that random samples cluster around the true centroid in (a) but in (b), the original centroid (red) is just another random draw from the spatial distribution.
A second approach is to sample transects along the gradient. This is a common method in studies of elevational change. With a set of several transects at two times, the upper (or lower) elevational limit of the species may be determined. Considering hypothetical transects (vertical) placed on Fig. 4a or b, it is clear that individual transects will differ considerably in the highest observation captured. There are two approaches to consider: the highest observation of the sampled transects and the mean highest of those sampled. For a random map such as Fig. 4b, the variances of the maximum and of the mean maximum over N consecutive transects were determined for different numbers of transects.
For a simulated boundary 400 cells wide with most occupied cells below 40 high, the variance of the mean of N transects for different bin widths N (technically equivalent to the same number of random transects) is statistically well-behaved (Fig. 5, dashed), with a downward slope. If we add two extreme outliers at vertical location 200 at horizontal locations 1 and 50, the variance of the sample means is higher (Fig. 5, solid) but still well-behaved. In contrast, with the outliers added, the variance of the maximum per sample is not well-behaved. It rises with number of transects per sample until all samples include the outliers, at which point the variance goes to zero (not shown). This means that with this metric, the outliers totally govern the outcome, making it unsuitable.
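The comparison of the two metrics can be sketched as follows. This is an illustration under stated assumptions, not the simulation behind Fig. 5: the per-transect maxima are drawn uniformly between 10 and 40 (a hypothetical choice), and the function names are invented for the example. Only the boundary width of 400 and the outliers at height 200 in columns 1 and 50 are taken from the text.

```python
import random
from statistics import mean, pvariance

def simulated_maxima(width=400, low=10, high=40, seed=0):
    """Hypothetical highest occupied cell in each of `width` vertical transects."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(width)]

def add_outliers(maxima, columns=(1, 50), height=200):
    """Copy of maxima with extreme values placed at the given 1-based columns."""
    out = list(maxima)
    for col in columns:
        out[col - 1] = height
    return out

def window_variances(maxima, n):
    """Variance, over every run of n consecutive transects, of the run's
    mean maximum and of its single highest value."""
    means, tops = [], []
    for start in range(len(maxima) - n + 1):
        window = maxima[start:start + n]
        means.append(mean(window))
        tops.append(max(window))
    return pvariance(means), pvariance(tops)

base = simulated_maxima()
with_outliers = add_outliers(base)
for n in (5, 20, 80, 351):
    print(n, window_variances(base, n), window_variances(with_outliers, n))
```

With 400 columns, every run of 351 or more consecutive transects contains the outlier at column 50, so the variance of the per-run maximum is exactly zero at that point, while the variance of the mean maximum continues to reflect the remaining transects.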
Fig. 5. ... Fig. 4b that is 400 cells wide. Samples generated by initiating the five transects starting at 1, at 2, etc. With two outliers added (see text), the variance of the mean maximum is higher (solid) but is still well-behaved.
With the above in mind, we can perform a test for difference using the mean maximum across all 40 transects in Fig. 4. In the constructed example, the base case (Fig. 4a) has a mean maximum across all 40 vertical transects of 23.9 and the expanding case (Fig. 4b) a mean maximum of 27.4. We might naively call these different, but a standard test of H₀ (no difference in means) using a t test across the 40 transects gives P = 0.011. In this case, we can reject H₀ at the 5% level (i.e., the means differ) but cannot reject it at the 1% level.
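A test of this kind can be sketched with the standard library alone. Note that the sketch swaps in a two-sided permutation test on the difference of means in place of the t test used above, because the standard library has no t distribution. The occupancy grids are not available here, so the per-transect maxima below are hypothetical random values, and the function names are assumptions.

```python
import random

def transect_maxima(grid):
    """Highest occupied row index in each column of grid[y][x] (truthy = present).
    Columns with no occupied cell are skipped."""
    maxima = []
    for x in range(len(grid[0])):
        occupied = [y for y, row in enumerate(grid) if row[x]]
        if occupied:
            maxima.append(max(occupied))
    return maxima

def permutation_p_value(a, b, n_perm=9999, seed=0):
    """Two-sided permutation p-value for a difference in means between a and b."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        first, second = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(first) / len(first) - sum(second) / len(second))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-transect maxima for two surveys of 40 transects each.
rng = random.Random(7)
base_case = [rng.randint(15, 32) for _ in range(40)]
expanded_case = [rng.randint(18, 36) for _ in range(40)]
p_value = permutation_p_value(base_case, expanded_case)
```

The permutation approach makes no normality assumption about the transect maxima, which may be a reasonable choice when a few transects capture outlying observations.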
Given that appropriate tests have been performed for each of a suite of species, per the above, how can we assess the overall community trend? For example, in the study by Hanberry and Hansen (2015), of 74 common trees in the United States, 12 expanded northward, 13 expanded southward, and one expanded in both directions. The approximately equal (12 vs. 13) numbers expanding in each direction would seem to rule out a directional signal, but what numbers would indicate such a signal? If 14 expanded northward and 10 expanded southward out of 74, would we (should we) call this significant? One possibility for cases where both northern and southern (upper and lower) boundaries were studied is a chi-square contingency table. Consider an artificial example for elevational range margins, tested as suggested above. For elevational ranges for a group of species, if the same number shift up vs. down at both range margins, we can say that there is no effect. In Table 1, therefore, we test against this null expectation using a chi-square statistic with one degree of freedom. In the example, we reject the null and conclude that the shifts are not equal in all directions. Note, however, that there are more species at the lower boundary shifting downhill than expected, which is counter to a general expectation of upward shifts. We must conclude that while shifts are not random, they also fail to match the expected pattern. We see just this case in expansion of forest both up and down in the Valles Caldera, New Mexico, where montane meadows below the forest are maintained by cold air drainage and fire (Coop and Givnish 2007). It is important to also remember that the chi-square approximation for contingency tables is generally considered unreliable unless all cells have expected values of at least five, so studies with small species pools (Franco et al. 2006, Myers et al. 2009) cannot be tested this way.
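The contingency-table calculation can be sketched as follows. This is one reading of the test described above, namely the standard chi-square test of independence on a 2 × 2 table (range margin by shift direction) with one degree of freedom. The counts are hypothetical and are not those of Table 1, and the function name is an assumption. For one degree of freedom, the upper-tail probability of the chi-square distribution equals erfc(sqrt(χ²/2)), so no statistics library is needed.

```python
import math

def chi_square_2x2(table):
    """Chi-square test of independence for a 2 x 2 table of counts.

    Returns (chi2, p_value, expected_ok), where expected_ok is False when any
    expected count is below 5 and the approximation should not be trusted.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    expected = [[row_totals[i] * col_totals[j] / n for j in range(2)]
                for i in range(2)]
    chi2 = sum((table[i][j] - expected[i][j]) ** 2 / expected[i][j]
               for i in range(2) for j in range(2))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # exact for 1 degree of freedom
    expected_ok = min(min(row) for row in expected) >= 5
    return chi2, p_value, expected_ok

# Hypothetical counts of species: rows = upper and lower margin,
# columns = shifted uphill, shifted downhill.
shifts = [[20, 5],
          [5, 20]]
chi2, p_value, expected_ok = chi_square_2x2(shifts)
```

Returning the expected-count check alongside the statistic keeps the small-sample caveat attached to every result rather than leaving it to the analyst to remember.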
The areal metric most directly relevant to extinction risk is geographic range. As noted, this is an ill-posed metric if we think of it as a single number (an area). We can analyze the occupancy of the landscape by a species in terms of a gridded occurrence map. Given a set of species locations, we can overlay a 1 × 1 km grid, a 2 × 2 km grid, a 4 × 4 km grid, and so on, and in each case count the occupied cells (in binary terms, not number per cell). As cells get larger, fewer are occupied, giving a declining curve. To account for the fact that the origin of a map grid is arbitrary, it is necessary to shift the grid origin and recalculate cell occupancy. When this is done (Fig. 6), there is a spread of the number of occupied cells at each scale. This provides an opportunity for scale-dependent statistical testing. Stochastic detectability can also be included in this analysis. Consider now a later time at which the species has increased in both density and extent (Fig. 7). Because there are 25 observations at each scale due to map offset resampling, we can conduct a simple test for difference of means at each scale using the mean and variance of the re-gridded maps. Doing so, the two maps differ (the new population is larger) at the 0.01 level for cells up to 24 units wide. At 26 units wide, they differ at the 0.05 level (P = 0.034). At 30 units wide, they do not differ. Note that if the smaller range were the more recent survey, this same analysis would detect both range contraction and lower density (at the small cell size).
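The re-gridding procedure can be sketched as follows. The point sets, cell sizes, and function names are hypothetical; only the idea of 25 grid-origin offsets per scale is taken from the text, here implemented as a 5 × 5 set of offsets evenly spaced within one cell, which is an assumption. The sketch returns a Welch t statistic for the difference in mean occupied-cell counts; converting it to a P value requires a t distribution, which the standard library does not provide.

```python
import math
import random
from statistics import mean, variance

def occupied_cells(points, cell, dx=0.0, dy=0.0):
    """Number of grid cells of width `cell` holding at least one point,
    with the grid origin shifted by (dx, dy)."""
    return len({(math.floor((x - dx) / cell), math.floor((y - dy) / cell))
                for x, y in points})

def occupancy_by_offset(points, cell, steps=5):
    """Occupied-cell counts for steps * steps origin offsets (25 by default)."""
    return [occupied_cells(points, cell, i * cell / steps, j * cell / steps)
            for i in range(steps) for j in range(steps)]

def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se

# Hypothetical surveys: the later one is denser and more extensive.
rng = random.Random(3)
earlier = [(rng.uniform(0, 60), rng.uniform(0, 60)) for _ in range(150)]
later = [(rng.uniform(0, 80), rng.uniform(0, 80)) for _ in range(300)]
for cell in (2, 8, 24, 30):
    t = welch_t(occupancy_by_offset(later, cell),
                occupancy_by_offset(earlier, cell))
    print(cell, round(t, 2))
```

Stochastic detectability could be added by randomly thinning the points before each call to `occupancy_by_offset`, which would widen the spread of counts at each scale.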
Outliers affect curve shape. In Fig. 8, the same population is compared to the case with three outliers added. Occupancy for the smallest cells is not much affected, but it is for larger cell sizes. Based on this, the loss of outlier locations would be detectable using the same test as above. If there are large disjunct population patches, the effect is similar but more easily detectable.
When ranges are viewed as a collection of points, a scale-dependent analysis can be used to detect change in density (small cell sizes), range size, centroid location, and patchiness. It is still not possible to measure the true area, as this remains undefined, but the goal of detecting area change can be achieved.

CONCLUSIONS
Detecting changes in spatial patterns is exceedingly difficult. This is especially so for plant and animal geographic ranges because in no case is the range a compact, well-defined geometric figure, nor do species fill the space of their range. Because range data are based on surveys, they are contingent on the survey spatial pattern, sample size, and species detectability at any given location and time. It is thus more useful to think of a species range as a distribution of sample points with unknown probabilities outside (between) sampled points than as areas with discrete boundaries. Tests for margin shifts, centroid shifts, shape and extent changes, and community-level changes were developed and illustrated. These methods consider sample error, map alignment error, and population stochasticity to incorporate a proper null model into analyses.
The nature of encountered data, such as species location samples, also poses difficulties for statistical inference. If data come from historical maps (e.g., for northern treeline), the resolution of the map will limit the smallest change that is detectable. The same holds for data collected on a grid (such as 10 × 10 km). This means that for any species or ecotone for which no change is detected, the correct conclusion is that there was no change greater than this minimum resolution. This minimum resolution is in addition to the stochastic error types documented above.
I have shown that any study of a change in range metrics should consider the data as a cloud of points and compare any change to a null expectation of variation due to sampling error and population fluctuations. This is simply standard statistical inference. Given the growing interest in conservation, many more range change studies are likely in coming years. These studies would benefit from more rigorous and consistent statistical testing.