Challenging the geographic bias in recognising large‐scale patterns of diversity change

Geographic structure is a fundamental organising principle in ecological and Earth sciences, and our planet is conceptually divided into distinct geographic clusters (e.g. ecoregions and biomes) demarcating unique diversity patterns. Given recent advances in technology and data availability, however, we ask whether geographically clustering diversity time‐series should be the default framework to identify meaningful patterns of diversity change.


| INTRODUCTION
Biodiversity describes the variety and heterogeneity of organisms at all levels of the hierarchy of life, from genes to species to ecosystems (Gaston, 2000). It is changing at accelerating rates worldwide due to human activity (Hull et al., 2015; Pecl et al., 2017).
This implies an urgent need for appropriate assessments to explain the uneven distributions and patterns of biodiversity in order to decide conservation policies and gain maximal conservation benefits (Cardinale et al., 2012; Jetz et al., 2019). Most research attempting to describe and analyse the multiple dimensions of biodiversity has been from one of two perspectives. First, metrics and indicators are proliferating to capture different facets of, and values derived from, biodiversity (Pascual et al., 2021; Pereira et al., 2013; Skidmore et al., 2021). For example, taxonomic diversity has been assumed to function as a surrogate for different biodiversity facets (Rapacciuolo et al., 2019), mainly owing to the easy quantification and interpretation of species distribution data (Magurran, 2021). However, with the developing recognition that species are not equivalent (e.g. some perform unique functions and some clades carry more evolutionary history than others) (Vane-Wright et al., 1991; Vellend et al., 2011; Winter et al., 2013), measures of functional and phylogenetic diversity have recently been developed (Cadotte et al., 2015; Winter et al., 2013) to incorporate these facets of diversity. To be more specific, functional diversity is a measurement that incorporates different functions that species perform in an ecosystem and is usually characterised in terms of functional traits, that is, morphological, physiological or behavioural characteristics (Petchey & Gaston, 2006).
Phylogenetic diversity is a family of metrics derived from the phylogenetic distance among taxa and is sometimes controversially treated as a proxy for functional diversity, because functional traits are often phylogenetically conserved and measured imperfectly (Cadotte et al., 2017; Mazel et al., 2018).
Second, studies have been conducted to investigate the spatial heterogeneity of biodiversity in different geographic clusters (e.g. ecoregions and biomes) within which organisms are assumed to have similar responses to environments (Dobrowski et al., 2021; Yu et al., 2019). For example, regional factors have often been used to infer environmental drivers for different populations in niche modelling and biodiversity mapping (Jetz et al., 2019), and diversity change at a single site has been employed to represent the change at a coarse scale (Antão et al., 2020; Daskalova et al., 2020). Whilst it is becoming widely recognised that species respond differently to the same environmental pressures and the relationship between them may vary at different scales (Jarzyna & Jetz, 2018; Owen et al., 2019; Tucker et al., 2018), macroecological studies like these continue to use predefined geographic clusters to carry out their analyses and recognise diversity patterns. This is likely owing to two reasons.
First, it is a classic approach to observe and record biological processes at a local scale and summarise the large-scale biological patterns in a bottom-up way, where an assumption of similarity due to geographic proximity is easily adopted (Díaz & Malhi, 2022; Hughes, Orr, Yang, & Qiao, 2021; Tscharntke et al., 2012; Willig et al., 2003).
Second, although there has been extensive data collection for biodiversity surveys, it remains challenging to sample a whole given area at large scales to capture all its diversity changes (Hughes, Orr, Ma, et al., 2021; Valdez et al., 2023), especially for long-term biological survey programmes that require substantial coordination and support (Bowler et al., 2022; Zhang et al., 2021). Thus, regional diversity patterns are calculated from samples within them, and geographic structure is used per se.
However, using geographic structure to define and analyse patterns of diversity change may introduce biases for two reasons. First, the spatial distribution of diversity is highly heterogeneous (Gaston, 2000), which means diversity in the same predefined geographic cluster can still vary greatly. Second, given the significance of human effects on nature, divergent diversity trends can be observed in sites that, whilst geographically close, experience these effects to varying degrees. This suggests that events that can change diversity trends may not match simple geographic structures. Therefore, patterns of diversity change that are recognised by different geographical clusters may not effectively capture temporal diversity patterns. However, although such biases may exist, analyses that have estimated diversity change have commonly used geographic structure to delineate large-scale patterns of diversity change, such as ecoregions (Harrison et al., 2018; Sano et al., 2019), bird conservation regions (Jarzyna & Jetz, 2018), country boundaries (Normander et al., 2012) and continents (Blowes et al., 2019; Kerr et al., 2015; Soroye et al., 2020). Little attention has been paid to whether patterns of diversity change can most effectively be recognised by geographic clustering.
Here, we ask: Should geographic structure be used as a default approach to recognise diversity patterns? To address this question, we propose a framework that tests whether time-series identified as having similar behaviour are geographically structured. We first recognise patterns of diversity change based on behaviours of diversity time-series, neglecting their geographic locations. As such, diversity time-series are recognised as having the same pattern on the basis of similar trends in their variability rather than their geographic proximity.
We then test the spatial dependence of time-series within the same pattern. A geographic structure is found when diversity time-series that have a similar pattern of temporal change tend to be geographically clustered (Figure 1b-e). Otherwise, the time-series should be distributed independently (Figure 1f,g), signalling that geographical recognition of diversity patterns is likely biased. We illustrate this framework with the North American Breeding Bird Survey data both across North America and east of the Mississippi River. The data have been collected annually since 1966 on over 5000 survey routes located across the continent of North America (Pardieck et al., 2020). To incorporate multiple facets of diversity, we calculate taxonomic, functional and phylogenetic diversity for all time-series. As a comparison, we complement this framework using analysis of a remote-sensing data set (MOD13C2 MODIS/Terra Vegetation Indices) (Didan, 2015) to produce a vegetation index (EVI) for cells across North America occupied by bird survey routes, which we expect a priori to demonstrate very strong spatial structuring.

| Pattern recognition
Diversity change time-series are inherently likely to show autocorrelation (Daskalova et al., 2021), and methods used to group diversity time-series should incorporate this. Here, we used self-organising maps (SOMs; T. Kohonen, 1997) to determine the patterns of the different diversity time-series. A self-organising map is based on an artificial neural network and uses an unsupervised learning method, well adapted to complex data analyses. As an informative and intuitive method in feature extraction, SOM has found wide application in a variety of disciplines for clustering, classification, dimensionality reduction and data visualisation (Liu et al., 2016). Some examples of the application of SOM in ecology are as follows: determining representative species (Park et al., 2006); investigating fish assemblages (Penczak et al., 2012); and modelling temporal evolution of Pacific surface chlorophyll (Huang et al., 2017).
In an SOM algorithm, high-dimensional data sets are projected onto a low-dimensional space (typically two-dimensional), whilst preserving the similarity and the difference between the input data vectors. The process is organised by automatically detecting relevant subgroups of similar input vectors and generating neurons (virtual vectors) that describe the coordinates of centres of the subgroups. Neurons in a network respond to a given subgroup of similar input vectors and move closer to each other with the addition of new input vectors to it. Whilst a classic result from a SOM is a topology map (where the data sets to be classified

One of the most challenging steps in using SOM is to choose the number of groups. Topographic and quantisation errors (Teuvo Kohonen, 2012) have been proposed to help determine this for an SOM approach. The topographic error (TE) denotes the continuity of the topology mapping. Here, we calculated TE by the average distance between all pairs of most similar neurons of input vectors. The topographic error therefore can measure the topographic preservation and the accuracy of the mapping in topology.
$$\mathrm{TE} = \frac{1}{N}\sum_{i=1}^{N} \lVert u_i - u \rVert,$$

where $N$ is the number of neurons, and $\lVert u_i - u \rVert$ is the topographic distance between a given neuron and all other neurons.
The quantisation error (QE) represents the average distance between each input vector and its neuron:

$$\mathrm{QE} = \frac{1}{N}\sum_{i=1}^{N} \lVert x_i - u_i \rVert,$$

where $N$ is the number of the input diversity time-series, and $\lVert x_i - u_i \rVert$ is the topographic distance between every input vector $x_i$ and its neuron $u_i$. Usually, small values of TE and QE suggest a good performance of the SOM model. However, QE and TE gradually decrease as the number of neurons increases and thus can only help determine the optimal number of neurons at a local scale (Park et al., 2018). As the data structure is unknown before running a SOM model, the SOM surface is usually set to be similar to a square (i.e. equal length and width) to reduce bias (Park et al., 2018). We selected the number of neurons based on both TE and QE of relatively small values for different numbers of neurons ranging from 4 to 25 with arbitrary intervals to reach a square-shaped surface.
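The quantisation error described above can be illustrated with a minimal sketch. It assumes Euclidean distance between an input vector and its best matching neuron; the function names and the toy data are hypothetical and are not taken from the 'kohonen' package or from the analysis itself.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_matching_unit(vector, codebook):
    """Index of the neuron (codebook vector) closest to the input vector."""
    return min(range(len(codebook)), key=lambda k: euclidean(vector, codebook[k]))

def quantisation_error(data, codebook):
    """Average distance between each input vector and its best matching neuron."""
    total = 0.0
    for vector in data:
        bmu = best_matching_unit(vector, codebook)
        total += euclidean(vector, codebook[bmu])
    return total / len(data)

# Toy input vectors (hypothetical), compared against one neuron and two neurons.
series = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
coarse = [[5.0, 1.0]]
fine = [[0.0, 1.0], [10.0, 1.0]]

qe_coarse = quantisation_error(series, coarse)
qe_fine = quantisation_error(series, fine)
```

In this toy case the error falls when a second neuron is added, which is the behaviour that makes QE unsuitable as the sole criterion for choosing the map size.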
Self-organising map models can be applied directly to time-series data (Vesanto & Alhoniemi, 2000). We used the SOM model as a clustering and pattern recognition method to investigate temporal diversity patterns. Specifically, the input diversity time-series were iterated by SOM to find the best matching neuron. Locations with similar diversity variability (i.e. diversity range and variation) were taken as one group. Only once the data were classified on their temporal diversity variability was the geographic distribution of the time-series identified and plotted. We generated nine groups of diversity time-series to explore their temporal change for the relatively smaller TE and QE of these numbers (Figures S1-S17), which can also efficiently extract patterns of diversity. To contrast diversity variability with that from other numbers of groups, we also calculated the SOM models of 5 × 5, 5 × 4, 4 × 4 and 2 × 2 map size. We employed the 'supersom' function of the 'kohonen' package in R 4.0.1 (https://www.r-project.org). All SOM models were performed with a hexagonal structure and Gaussian neighbourhood function. We used the hexagonal structure because nodes in a hexagonal structure have six neighbours and can display greater variance in neighbourhood size, compared to a rectangular structure where nodes have four neighbours. Taxonomic, functional and phylogenetic diversity were used simultaneously with the same weight in each SOM model.
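To make the grouping step concrete, the following is a simplified sketch of SOM training and group assignment. It is not the 'supersom' implementation: it assumes a rectangular grid rather than a hexagonal one, a single data layer, linearly decaying learning rate and neighbourhood width, and synthetic time-series. All names and parameter values are hypothetical.

```python
import math
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(vector, codebook):
    """Index of the neuron whose codebook vector is closest to the input."""
    return min(range(len(codebook)), key=lambda k: squared_distance(vector, codebook[k]))

def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=1):
    """Train a small SOM with a Gaussian neighbourhood on a rectangular grid."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise each neuron from a randomly chosen input vector.
    codebook = [list(rng.choice(data)) for _ in grid]
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        decay = 1.0 - t / n_iter
        lr = lr0 * decay                 # learning rate shrinks over time
        sigma = sigma0 * decay + 0.01    # neighbourhood width shrinks over time
        x = rng.choice(data)
        br, bc = grid[best_matching_unit(x, codebook)]
        for k, (r, c) in enumerate(grid):
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
            codebook[k] = [w + lr * h * (xi - w) for w, xi in zip(codebook[k], x)]
    return codebook

# Synthetic standardised time-series (hypothetical): ten low and ten high series.
rng = random.Random(0)
low = [[0.1 + rng.uniform(-0.02, 0.02) for _ in range(5)] for _ in range(10)]
high = [[0.9 + rng.uniform(-0.02, 0.02) for _ in range(5)] for _ in range(10)]

codebook = train_som(low + high, rows=2, cols=1)
groups = [best_matching_unit(s, codebook) for s in low + high]
```

Each time-series is assigned to the group of its best matching neuron without reference to its location, so geography enters only in the later spatial test.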

| Spatial dependence
We used a Monte Carlo test with the average nearest neighbour (ANN) distance to test spatial dependence (Clark & Evans, 1954) in membership of the groups of diversity time-series. $\mathrm{ANN} = \frac{1}{n}\sum_{i=1}^{n} d_i$, where $d_i$ corresponds to the distance between feature $i$ and its nearest neighbouring feature, and $n$ is the number of features. The great circle distance on a WGS84 ellipsoid was used for all the distance calculations. Distance in raster data is calculated by the centres of each cell. For a given group of time-series that have the same pattern, we randomly selected an equal number of time-series to the number of time-series of that group from across all time-series analysed. Then, the ANN distance for locations of the randomly selected time-series was calculated. We did not generate random points across the space for randomisation, because the biological surveys are often not randomly located (e.g. avoiding overlap). That process might be explicit (i.e. the people who organise the surveys specify this), or it might not be (e.g. it just results from the decisions individuals make, but they make these in relation to other people's choices). The randomisation process was replicated 1000 times to generate a density histogram showing the distribution of ANN distances of the randomly distributed time-series. A pseudo p-value was calculated to show the probability of incorrectness if rejecting the geographic independence hypothesis that diversity time-series that have the same pattern are randomly distributed: where $N_{\mathrm{greater}}$ is the number of simulated ANN distances greater than the real ANN distance, and $N$ is the number of replications. We did not remove locations that might lead to spatial pseudoreplications as here we aimed to test the geographic proximity of diversity time-series, which would be reflected by the results of the above analysis (e.g. p-value).
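The randomisation test can be sketched as follows. Several simplifications are assumed: distances use the haversine formula on a sphere of radius 6371 km rather than the WGS84 ellipsoid, the site coordinates are synthetic, and the pseudo p-value uses one common convention, min(N_greater + 1, N + 1 − N_greater) / (N + 1), which may differ from the exact expression used in the analysis. Function names are hypothetical.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # spherical approximation (assumption, not WGS84)

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def ann_distance(points):
    """Average over all points of the distance to the nearest other point."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(haversine_km(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)

def monte_carlo_ann(group, all_points, n_rep=1000, seed=0):
    """Compare the observed ANN of a group with ANN of random draws of sites."""
    rng = random.Random(seed)
    observed = ann_distance(group)
    # Draw from the surveyed sites only, not from arbitrary points in space.
    simulated = [ann_distance(rng.sample(all_points, len(group)))
                 for _ in range(n_rep)]
    n_greater = sum(1 for s in simulated if s > observed)
    # Assumed pseudo p-value convention; see the lead-in.
    p_value = min(n_greater + 1, n_rep + 1 - n_greater) / (n_rep + 1)
    return observed, simulated, p_value

# Synthetic sites on a 10 x 10 degree grid; the focal group is a tight 3 x 3 block.
sites = [(30.0 + i, -100.0 + j) for i in range(10) for j in range(10)]
cluster = [(30.0 + i, -100.0 + j) for i in range(3) for j in range(3)]

observed, simulated, p_value = monte_carlo_ann(cluster, sites, n_rep=200)
```

A clustered group gives an observed ANN far to the left of the simulated distribution, whereas a group scattered like the sites as a whole would fall near its centre.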

| Data collection
We used the North American Breeding Bird Survey (BBS) data to investigate the biodiversity changes in different diversity facets

produces less biased estimates than other imputation methods in risk models using regression (Ambler et al., 2007). We used a random forest algorithm (Breiman, 2001) for MICE, as it had high performance, required little computational time and was good at dealing with high-dimensional data (Pantanowitz & Marwala, 2009; Penone et al., 2014). We removed from all routes records for unidentified species (66 out of 713 species) and analysed a total of 911,702 records (20,721 per year) from 316 out of 5690 survey routes with less than 3 years' data gaps during 1973-2016. We imputed a sum of 331 data gaps (2.4% of the total number of years across 316 routes) from 214 routes based on all taxonomic, functional and phylogenetic records for each survey route. To reduce the effects of nonrandom spatial sampling, we also implemented our approach for the region east of the Mississippi River, which resulted in 200 survey routes as a comparison.

| Diversity measures
To compare taxonomic, functional and phylogenetic diversity in a consistent framework, we calculated these using an abundance-weighted measure, Rao's quadratic entropy (Rao, 1982):

$$Q = \sum_{i}\sum_{j} d_{ij}\, p_i\, p_j,$$

where $d_{ij}$ is the taxonomic, functional or phylogenetic dissimilarity between each pair of species $i$ and $j$, whilst $p_i$ and $p_j$ are their relative abundance. We then calculated the three diversity indices for a given time-series for every year.
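As a small worked example of this measure, the sketch below computes Rao's quadratic entropy from raw abundances and a pairwise dissimilarity matrix. The counts are hypothetical, and the code assumes the double sum runs over all ordered pairs with zero dissimilarity on the diagonal; under that convention, setting every between-species dissimilarity to 1 gives one minus the sum of squared relative abundances.

```python
def rao_q(abundances, dissimilarity):
    """Rao's quadratic entropy: sum over species pairs of d_ij * p_i * p_j."""
    total = sum(abundances)
    p = [a / total for a in abundances]
    n = len(p)
    return sum(dissimilarity[i][j] * p[i] * p[j]
               for i in range(n) for j in range(n))

def gini_simpson(abundances):
    """Gini-Simpson index: one minus the sum of squared relative abundances."""
    total = sum(abundances)
    return 1.0 - sum((a / total) ** 2 for a in abundances)

# Hypothetical counts for three species on one route in one year.
counts = [10, 30, 60]

# Taxonomic case: every pair of distinct species is equally dissimilar.
unit = [[0.0 if i == j else 1.0 for j in range(3)] for i in range(3)]

q_tax = rao_q(counts, unit)
```

Replacing `unit` with a functional or phylogenetic distance matrix gives the corresponding facet from the same function, which is what keeps the three indices comparable.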
For taxonomic diversity, the distance between all species is the same ($d_{ij} = 1$), and Rao's Q, in this case, is equivalent to the Gini-Simpson index, which usually measures the probability of interspecific encounter (Hurlbert, 1971). For the calculation of functional diversity, we used four categories of 16 traits from Elton Traits 1.0 (Wilman et al., 2014). These traits, which are assumed to represent key Eltonian niche dimensions, comprised: body mass, diet (i.e. the proportional use of invertebrates, vertebrates, carrion, fresh fruits, nectar and pollen, seeds, and other plant materials in species' diet), foraging niche (i.e. prevalence of foraging below water surfaces, on water surface, on terrestrial ground level, in understorey, in midcanopy, in upper canopy, and aerial) and broad habitat types (i.e. pelagic or not). The traits that we used to calculate functional diversity are correlated with each other to different extents (Figure S2).
However, considering there are no traits that are fully correlated and each trait carries unique features shaping avian functions, we retain all traits in the Elton trait database for subsequent analyses. We gave equal weights to each trait category, which resulted in weights for body mass and broad habitat type of 1, and 1/7 for each diet and foraging niche variable. Functional distance was calculated using a multivariate trait dissimilarity of Gower's distance (Gower, 1971) for each pairwise species (Pavoine et al., 2009), with functional diversity calculated as the subtree length on the UPGMA (Unweighted Pair Group Method with Arithmetic Mean) dendrograms of these distances. Phylogenetic distance was calculated using 100 phylogenies sampled from a full pseudo-posterior distribution of phylogenetic trees (http://birdtree.org). The mean phylogenetic diversity across these 100 dendrograms was calculated. All three diversity facets were standardised by $SD_i = (D_i - \min(D)) / (\max(D) - \min(D))$ before further analysis, where $SD_i$ is the standardised diversity, D is the

| Relationship between different diversity facets
To investigate the correlation of functional-taxonomic, phylogenetic-taxonomic and functional-phylogenetic diversity in each group of diversity time-series, we adopted generalised additive models (GAM) and included the year of a given time-series as a random effect. We calculated the deviance explained (DE) and 95% CI for all the groups to measure the correlation of different pairs of diversity facets.

| MODIS/Terra Vegetation Indices
We used the enhanced vegetation index (EVI) from a remote-sensing product (MOD13C2) (Didan, 2015) as a test of the framework in different sample coverages for a system where we expect strong spatial clustering.The EVI provided biomass information and can be used to quantify vegetation greenness.The product provided a monthly EVI in a 5600-m resolution from 2000 to 2021 across the globe.As we aim to test the spatial dependence of EVI time-series, we here quantified their annual change in September when the activity of vegetation is usually at a maximum and relatively constant over the year (Villamuelas et al., 2016) across North America to simplify the calculation.We applied our framework with EVI time-series that locate within the same cells as bird survey routes (316 time-series).

| RESULTS
To test the relative fit of geographic structuring, we assessed the spatial dependence of time-series within each pattern recognised by changes in their variabilities using two data sets with different spatial sample coverages.We found no evidence of stronger geographic structure on diversity change of North American breeding birds than random.We found that the average distance between the survey routes lay in the middle part of the distance frequency curve calculated from randomly distributed points, although the number of survey routes within eight of the nine groups varied (Figure 2).
Similar results were also found using survey routes east of the Mississippi River (Figure 3

We found that EVI time-series recognised as having the same pattern of temporal change had a clear geographic structure (Figure 5). We found strong evidence that the distribution of the time-series is distinct from random across all groups (Figure 5 and Figures S15-S17). Hence, this method is able to detect and return strongly spatially clustered signals.

| DISCUSSION
Despite the frequent use of geographic structure in studies of the distribution of biodiversity (Blowes et al., 2019; Devictor et al., 2010; Gaston, 2000; Willig et al., 2003), we reveal that using geographic

By identifying geographically disparate patterns of similar diversity change, we emphasise the complex resultant structure of biodiversity (Brooks et al., 2006; Gaston, 2000) from two perspectives.
First, from a spatial perspective, biodiversity is usually understood as having multiple facets, including taxonomic and species composition, ecosystem function and service potential and evolutionary history, which leads to the complexity in its distribution and structure (Gaston, 2000; Grenyer et al., 2006; Petchey & Gaston, 2006; Tilman, 1982). Communities in neighbouring regions may face different environmental and human pressures and, as a result, have divergent trends in diversity change (Catford et al., 2022). Therefore, the estimation of large-scale diversity patterns may be biased by using geographic structure a priori and not distinguishing the behaviours between diversity time-series. Second, in temporal contexts, biodiversity change is also influenced in complex ways by factors such as the exploitation of biological resources, the introduction of exotic species, and climate and land-use change (Mantyka-Pringle et al., 2012; Newbold et al., 2015; Sax & Gaines, 2003). This means that, despite studies based on static diversity data having found strong geographic structures in their distributions (Bahn & McGill, 2007, 2013), approaches that incorporate the dynamic changes of diversity might still reach contrasting conclusions. In addition, the complex change in environments in turn may influence the behaviours of surveyors and estimation of diversity trends (Bowler et al., 2022; Zhang et al., 2021). This is particularly important for recognising patterns of diversity changes considering that the spatial sample coverage of even the most extensive biodiversity data sets is believed to be less than 7% of the Earth's surface at a 5-km resolution (Hughes, Orr, Ma, et al., 2021), and analyses estimating long-term diversity trends may abandon short time-series or time-series with data gaps (Zhang et al., 2021). Using

disproportionately to taxonomic diversity in multiple groups. Specifically, significant changes in functional diversity were accompanied by more minor changes in taxonomic diversity. Similarly, previous studies also found unsynchronised changes between taxonomic and functional diversity (Le Bagousse-Pinguet et al., 2019; Monnet et al., 2014).
The correlation between taxonomic and functional diversity was strong and positive in the long term, which was a likely consequence of the replacement of functionally redundant species with unique ones (Bełcik et al., 2020).Furthermore, changes in phylogenetic diversity in the groups were smaller than or even opposite to what would have been expected given the concomitant changes in taxonomic and functional diversity.Correspondingly, phylogenetic diversity showed a tendency to level off and even decline with increasing taxonomic and functional diversity.This phylogenetic homogenisation implies that the bird assemblages have perhaps undergone recruitment of, or even replacement by, closer relatives of species already present.The fast adaptive radiation of colonising species could also contribute to the high levels of taxonomic and functional diversity and low levels of phylogenetic diversity across North America (Jarzyna & Jetz, 2017).
We also tested our framework with an EVI time-series data set.
The results showed strong evidence that EVI time-series of the same pattern are geographically structured using the same locations as bird survey routes.This suggests that geographic structure may work well in understanding ecological changes in plant biomass (Holt et al., 2013;Smith et al., 2018) but may not play a meaningful role in delimiting more complex patterns of animal diversity change.Therefore, the results emphasise the geographic biases in recognising patterns of diversity change and the importance of our framework to partition biodiversity change by the similarity of time-series.
Since we focus on the geographic bias in long-term diversity estimations with existing diversity time-series data, there are caveats in interpreting our results.First, temporal or spatial scale-dependences may influence this geographic bias.Future work could explore interplays between different spatial sample coverages and lengths of diversity time-series and how they may influence the geographic bias.
Second, our work concentrated on temperate areas whilst diversity dynamics in other biomes can be different given possibly less disturbance in those regions. It would be beneficial to explore the extent to which the nongeographic structure is influenced by biomes. Third, in the main analysis, we used North American Breeding Bird Survey data that followed a standard protocol in collecting data. However, studies that might use our framework with other data sets (e.g. citizen data) should be aware of sampling variation, and rarefaction and resampling techniques can be useful for these data sets (Chao et al., 2020; Gotelli & Colwell, 2001).
Future work could also concentrate on identifying drivers of the nongeographic biodiversity patterns extracted from time-series, which can be achieved by determining (non)geographic patterns in the temporal variation of biotic and/or abiotic factors and exploring their potential mismatch with recognised diversity patterns.Then, by analysing the degree of mismatch between diversity and factor patterns, it could be possible to establish networks of ecological interactions and their contributions to the dynamic of diversity change in each pattern.In doing so, recognising nongeographic patterns also provides a bridge between empirical systems and theoretical interaction frameworks, such as has been used in identifying the relationship between patterns of phytoplankton and eddy lifecycles in oceanography (Huang et al., 2017).

FIGURE 1 Conceptual framework to test spatial dependence. (a) Biodiversity facets from different sites are clustered onto a 2 × 1 SOM surface. (b and c) Recognised patterns of diversity change show strong geographic structures. (d and e) Recognised patterns of diversity change have medium geographic structures. (f and g) Represent no evidence for strong or medium geographic structures underlying the recognised patterns of diversity change. An average nearest neighbour (ANN) distance is used to measure the geographic proximity of different locations. ANN distance from the diversity time-series falling to the left of the histogram of ANN distances for randomly generated locations supports that geographic structure is important for pattern recognition (c and e), whilst ANN distance from real diversity time-series adhering to recognition of the weak geographic pattern will fall in the middle of the distribution (g). Points in (b), (d), and (f) with the same colour surrounded by a dotted circle represent that they have similar diversity change and are recognised into the same pattern. Solid lines in (c), (e) and (g) represent the ANN distance of diversity time-series, and dashed lines represent the average ANN distance for a set of random samples.

are mapped onto map nodes), the neurons that are neighbours on the topology map are expected to represent similar patterns; consequently, dissimilar patterns are expected to be distant from each other on the map. The training process iterates and tunes the network based on a preselected number of groups (hence, self-organising).

(Pardieck et al., 2020). Breeding Bird Survey is a long-term avian monitoring programme that tracks the population dynamics of breeding birds and follows a strict survey protocol allowing for yearly comparisons. Breeding Bird Survey data are collected once per year in June on over 5000 survey routes that are located across the continent of North America. Each survey route is approximately 40 km long, with 50 stops, and split into five segments. At each stop, birds are recorded when they are seen or heard within a 400 m radius during 3 min. Generally, the trained observers record the stop ID, and the presence and abundance of each species encountered. The sampling for each whole route is conducted over approximately 5 h in the early morning. Breeding Bird Survey data from the early years of the survey are limited due to insufficient sample size and spatial coverage, so we excluded those from 1966 to 1972. An SOM model treats data gaps as a form of similarity between time-series (Park et al., 2018), which introduces bias if actual data in these gaps would have been different. To keep the integrity of temporal coverage and obtain enough samples, for the period after 1972, we retained the routes with gaps of less than 3 years' duration and then conducted a multivariate imputation by chained equations (MICE; van Buuren & Groothuis-Oudshoorn, 2011), as implemented by the 'mice' package in R, to fill the gaps after diversity calculation. Multivariate imputation by chained equations has a better prediction quality and

and Figures S12-S14) and other numbers of groups of diversity time-series (Figures S3-S6). The temporal trends, range and correlation of taxonomic, functional and phylogenetic diversity varied across the different groups of diversity time-series (Figure 4 and Figures S7-S10). Taxonomic diversity showed a gradual increase in four patterns (Figure 4G1,G4,G7,G9) and decline or no change in the other groups (Figure 4G2,G3,G5,G6,G8). Functional and taxonomic diversity were positively correlated in most groups (DE = 0.44, 95% CI = 0.30 ∼ 0.59;

FIGURE 2 Distribution of survey routes in the North American Breeding Bird Survey and spatial independence test. Each panel presents the distribution of diversity time-series that have the same pattern as identified by the SOM. The inset density plot shows the distribution of the ANN distance of 1000 Monte Carlo tests. Solid lines in the density plot represent the average nearest neighbour (ANN) distance of the survey routes, and dashed lines represent the mean value of ANN for 1000 random samples. Blue points represent the start locations of survey routes in the focal SOM group and orange points are the start locations of other survey routes that do not fall into the SOM group. Percentages at the top left of each panel denote the percentages of time-series in that group relative to all time-series. Pseudo p-values represent the probability of incorrectness if rejecting the geographic independence hypothesis that diversity time-series that have the same pattern are randomly distributed.

Figure
Figure S11). Functional diversity showed similar trends to taxonomic diversity, but the change rates were different. The correlation between taxonomic and phylogenetic diversity varied less in groups of diversity time-series (Figure 4). Phylogenetic diversity had a trend of slowly increasing or staying flat as taxonomic diversity increased (DE = 0.16, 95% CI = 0.09 ∼ 0.23). The correlation between functional and phylogenetic diversity differed more in groups of diversity time-series compared with the correlation between taxonomic and functional diversity, or taxonomic and phylogenetic diversity (DE = 0.41, 95% CI = 0.32 ∼ 0.52). In most of the groups (seven out of nine groups), functional diversity increased faster than the increase of phylogenetic diversity and could also increase more slowly or even decline with the rise of phylogenetic diversity (Figure 4).
structure as a default rule to recognise diversity patterns may be biased; there are other kinds of structure in such data that are important to understand diversity patterns. We proposed a nongeographic framework to group biodiversity time-series based on a pattern recognition method (i.e. SOM) and highlight potential geographic biases to understand patterns of diversity change. Specifically, we classified the time-series that varied similarly in the temporal trend as having the same pattern and tested whether the geographic distribution of areas of the same pattern is random. Using the North American Breeding Bird Survey data, we demonstrated that diversity time-series that have the same pattern of temporal change could be independent of their geographic locations. This implies that in large-scale studies, geographic proximity may not correspond to shared temporal diversity trends. Ignoring the possible variation between time-series of a given geographic cluster, for example simply taking all time-series as a whole, might bias the understanding of biodiversity change.

FIGURE 3 Distribution and spatial independence test of survey routes for the North American Breeding Bird Survey to the east of the Mississippi River. Each panel presents the distribution of diversity time-series that have the same pattern as identified by the SOM. The inset density plot shows the distribution of the ANN distance of 1000 Monte Carlo tests. Solid lines in the density plot represent the average nearest neighbour (ANN) distance of the survey routes, and dashed lines represent the mean value of ANN for 1000 random samples. Blue points represent the start locations of each survey route in the focal SOM group and orange points are the start locations of other survey routes that do not fall into the SOM group. Percentages at the top left of each panel denote the percentages of time-series in that group relative to all time-series. Pseudo p-values represent the probability of incorrectness if rejecting the geographic independence hypothesis that diversity time-series that have the same pattern are randomly distributed.

FIGURE 4 Temporal changes of taxonomic, functional and phylogenetic diversity in the North American Breeding Bird Survey. Each panel denotes the change of diversity time-series that have the same pattern. The error bars represent a 95% confidence interval. The nonlinear regressions (a loess sliding window with a 33% range width; smoothed line) of the diversity facets were added to describe the major temporal trajectory of each group.

geographic structure a priori to seek patterns of dynamic diversity change can result in failures to incorporate diversity changes that were not recorded and bias the diversity estimation. Given the difficulties of improving the spatial coverage of long-term biodiversity surveys, the nongeographic framework that organises diversity time-series independently of their relative or specific locations also adds a novel way to conduct macroecological studies in recognising large-scale diversity patterns for existing diversity time-series data sets. As for the patterns of North American breeding bird diversity, we found functional and phylogenetic diversity changed

FIGURE 5 Distribution and spatial independence test of enhanced vegetation index (EVI) time-series from the same locations as bird survey routes. Each panel presents the distribution of EVI time-series (blue pixels) that have the same pattern. The inset density plot shows the distribution of the ANN distance of 1000 Monte Carlo tests. Solid lines in the density plot represent the average nearest neighbour (ANN) distance of the locations of EVI time-series, and dashed lines represent the mean value of ANN for 1000 random samples. Blue points represent the start locations of each survey route in the focal SOM group and orange points are the start locations of other survey routes that do not fall into the SOM group. Percentages at the top left of each panel denote the percentages of time-series in that group relative to all time-series. Pseudo p-values represent the probability of incorrectness if rejecting the geographic independence hypothesis that EVI time-series that have the same pattern are randomly distributed.