Benchmarking plant diversity of Palaearctic grasslands and other open habitats

Aims: Understanding fine-grain diversity patterns across large spatial extents is fundamental for macroecological research and biodiversity conservation. Using the GrassPlot database, we provide benchmarks of fine-grain richness values of Palaearctic open habitats for vascular plants, bryophytes, lichens and complete vegetation (i.e., the sum of the former three groups). Location: Palaearctic biogeographic realm. Methods: We used 126,524 plots of eight standard grain sizes from the GrassPlot database: 0.0001, 0.001, 0.01, 0.1, 1, 10, 100 and 1,000 m², and calculated the mean richness and standard deviations, as well as maximum, minimum, median, and first and third quartiles for each combination of grain size, taxonomic group, biome, region, vegetation type and phytosociological class. Results: Patterns of plant diversity in vegetation types and biomes differ across grain sizes and taxonomic groups. Overall, secondary (mostly semi-natural) grasslands and natural grasslands are the richest vegetation types. The open-access file "GrassPlot Diversity Benchmarks" and the web tool "GrassPlot Diversity Explorer" are now available online (https://edgg.org/databases/GrasslandDiversityExplorer) and provide more insights into species richness patterns in the Palaearctic open habitats. Conclusions: The GrassPlot Diversity Benchmarks provide high-quality data on species richness in open habitat types across the Palaearctic. These benchmark data can be used in vegetation ecology, macroecology, biodiversity conservation and data quality checking. While the amount of data in the underlying GrassPlot database and their spatial coverage are smaller than in other extensive vegetation-plot databases, species recordings in GrassPlot


| INTRODUCTION
Documenting and understanding patterns of biodiversity is a central issue in biogeography and macroecology (Gaston, 2000;Barthlott et al., 2007;Pärtel et al., 2016) and is also fundamental for sustainable land use and biodiversity conservation (Whittaker et al., 2015), as ecosystem function and stability are dependent on biodiversity (Tilman & Downing, 1994;Hooper et al., 2005). The increasing awareness of the current environmental crisis makes biodiversity studies even more valuable and necessary, especially for ecosystems such as grasslands, which are massively threatened by land-use change (Fischer et al., 2018). Plant species richness has been mapped globally using coarse-grain data (Barthlott et al., 2005;Kier et al., 2005;Brummit et al., 2020). However, fine-grain data on the local co-occurrence of species in plant communities across continental or global spatial extents are required for macroecological studies that link diversity patterns and assembly processes (Bruelheide et al., 2019). Nevertheless, information on broad-scale, fine-grain plant distribution is still scattered, inconsistent, and often of uncertain quality, especially for bryophytes and lichens (Beck et al., 2012).
However, it should be considered that vegetation plots derived from phytosociological sampling may vary in plot size by several orders of magnitude, even within the same vegetation type (Chytrý, 2001). Sometimes information on plot size may be lacking or only approximate. Therefore, diversity inference from phytosociological data has to consider plot sizes and should be interpreted with caution (Chytrý, 2001; Chytrý & Otýpková, 2003).
Journal of Vegetation Science, BIURRUN et al.
Ecologists and conservationists need reliable species richness benchmarks (i.e., maximum, minimum, mean and other basic statistics) to assess plant communities as being above or below average in richness for a specific region or vegetation type (Yen et al., 2019). To produce reliable benchmarks, plot size should be integrated into any analysis, and large amounts of high-quality vegetation-plot data are needed. Previous studies providing global richness data at several plot sizes focused on maximum values and left out information on the distribution of richness values (Wilson et al., 2012;Chytrý et al., 2015).
This information is needed for both fundamental research and biodiversity conservation (Dengler et al., 2016a;Yen et al., 2019), e.g., when establishing thresholds between average and species-rich grasslands or identifying species-poor degraded grasslands for restoration.
Palaearctic grasslands host a considerable part of the realm's diversity (Dengler et al., 2020a). At fine spatial grains (<100 m²), they can even hold higher plant diversity than tropical forests (Wilson et al., 2012). After an early and rudimentary attempt of benchmark-
We aim to display hotspots and coldspots of fine-grain α-diversity (species richness) across biomes and vegetation types. Besides total plant richness (complete vegetation), we separately assess vascular plant, bryophyte and lichen richness, as it has already been demonstrated that the richness of these taxonomic groups should be assessed separately (Dengler et al., 2016a). In summary, we: (a) present major diversity patterns in Palaearctic open habitats that can be derived from GrassPlot; (b) introduce the GrassPlot Diversity Benchmarks (a data set made public together with this article) and the GrassPlot Diversity Explorer (an online tool released together with this article); and (c) outline some potential applications and impacts of both.

| Data compilation
We used plot-based data from the collaborative vegetation database GrassPlot. For this benchmarking study, we retrieved all plots with grain sizes 0.0001, 0.001, 0.01, 0.1, 1, 10, 100 and 1,000 m² contained in GrassPlot v.2.10 (version of 1 Oct 2020), belonging to 225 data sets (Appendix S1). According to the typical species-area relationships (SARs) in Palaearctic grasslands (Dengler et al., 2020b), a 10% difference in area means only about a 2% difference in richness or less, which is negligible compared to any other source of richness variation. Thus, 2,372 plots deviating less than 10% from standard grain sizes (0.0009, 0.09, 9, 10.89, 900 and 1,024 m²) were also selected and used for the benchmarks of the respective grain size. The final data set contained 126,524 plots (Table 1). Biomes were assigned using the biome classification provided in Bruelheide et al. (2019), which is based on the nine ecozones of Schultz (2005) plus an additional alpine biome based on Körner et al. (2017). Plots were also assigned to ten geographic regions following Dengler et al. (2020a). We created a two-level vegetation typology with 22 vegetation types grouped into six coarse categories: natural grasslands, secondary grasslands, azonal communities, dwarf shrublands, tall-forb and ruderal communities, and deserts and semi-deserts (more details in Appendix S2). Plots were assigned to vegetation types based on expert knowledge, either individually by data owners or using general assignment rules of phytosociological syntaxa to vegetation types (see Appendix S2). Among the plots in the data set, 75% have a phytosociological assignment at least at the class level.
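The tolerance argument above can be checked with a short calculation. The sketch below assumes a power-law SAR, S = c · A^z, and an exponent z = 0.2; both the functional form used here and the value of z are illustrative assumptions, since the text only refers to "typical" SARs without stating an exponent.

```python
# Assumption: power-law SAR, S = c * A**z, so the richness ratio between two
# areas depends only on the area ratio and on z (the constant c cancels out).
def richness_ratio(area_ratio: float, z: float) -> float:
    """Factor by which richness changes when area is multiplied by area_ratio."""
    return area_ratio ** z


z_assumed = 0.2  # illustrative exponent, not a value taken from the article

for area_ratio in (0.9, 1.1):
    change_percent = (richness_ratio(area_ratio, z_assumed) - 1.0) * 100.0
    print(f"area x {area_ratio}: richness changes by {change_percent:+.1f}%")
```

Under these assumptions, a 10% deviation in area shifts expected richness by roughly 2%, in line with the statement in the text; a steeper assumed z would give a proportionally larger shift.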
GrassPlot includes plot data sampled following two alternative methods for recording the presence of vascular plant species: "rooted presence", which only records individuals as present in the plot if they root inside, and "shoot presence", which records individuals as present if any part of stems or leaves are inside the plot (Dengler, 2008). The majority of plots in the data set were recorded using the "shoot presence" method, and 13.4% of plots used "rooted presence", while only a small fraction (0.1%) used a combined method, where shrubs were recorded using "rooted presence" and grasses and forbs using the "shoot presence", or the recording method was not known (0.2%).
For linguistic convenience, we include lichens under the generic term "plants". Thus, we considered four taxonomic groups: vascular plants, bryophytes, lichens and complete vegetation (i.e., the sum of the former three groups).

| Establishing and providing benchmark values
We calculated mean species richness values and standard deviations, as well as maximum, minimum, median, and first and third quartiles for each combination of grain size, taxonomic group, biome, region, country, vegetation type (at coarse and fine classification level), phytosociological class and method (shoot vs rooted, nested series with seven standard grain sizes vs any plots). The data are organized as a spreadsheet, in which each of the 728,396 lines represents one combination of these factors, and the columns provide the statistics, i.e., number of plots, number of independent observations, minimum, maximum, mean, standard deviation, median, and first and third quartiles. We call these data the GrassPlot Diversity Benchmarks and provide them in Appendix S3 in the Supporting Information as a spreadsheet file (70 MB). This file is open access and is also provided on the website of the GrassPlot Diversity Explorer (https://edgg.org/databases/GrasslandDiversityExplorer) for free download. We intend to update it at regular intervals while keeping former versions available to make any studies based on these data reproducible.
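To illustrate the structure of such a benchmark table, the following minimal sketch groups plots by a few factors and computes the statistics listed above. The records, the reduced set of grouping factors and the quartile method are all hypothetical choices made for illustration; this is not the code used to produce Appendix S3.

```python
from collections import defaultdict
from statistics import mean, median, quantiles, stdev

# Hypothetical plot records (not GrassPlot data):
# (grain size in m2, taxonomic group, vegetation type, species richness)
plots = [
    (10, "vascular plants", "secondary grasslands", 30),
    (10, "vascular plants", "secondary grasslands", 42),
    (10, "vascular plants", "secondary grasslands", 51),
    (10, "vascular plants", "secondary grasslands", 27),
    (10, "bryophytes", "secondary grasslands", 4),
    (10, "bryophytes", "secondary grasslands", 9),
]


def benchmark_table(records):
    """Summarise richness for every combination of the grouping factors."""
    groups = defaultdict(list)
    for grain, taxon, vegetation, richness in records:
        groups[(grain, taxon, vegetation)].append(richness)

    table = {}
    for key, values in groups.items():
        if len(values) > 1:
            # The quartile method is an assumption; the article does not name one.
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
            sd = stdev(values)
        else:
            q1 = q3 = values[0]
            sd = None  # standard deviation is undefined for a single plot
        table[key] = {
            "n": len(values), "min": min(values), "max": max(values),
            "mean": mean(values), "sd": sd, "median": median(values),
            "q1": q1, "q3": q3,
        }
    return table


table = benchmark_table(plots)
for key, stats in table.items():
    print(key, stats)
```

Each dictionary key corresponds to one line of the spreadsheet, and each inner dictionary to its statistic columns.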
Many nested series contain several subplots of the same size. Sometimes these are multiple contiguous subplots covering the entire surface of the largest plot. Because of the high degree of spatial pseudoreplication, using these richness values separately for calculating mean richness might bias the results. Thus, for all benchmarks except the maximum and minimum richness, we used the averaged values of each grain size in each nested series, i.e., only the independent observations. The number of independent observations decreased from 126,524 to 48,449 plots (Table 1), 6,509 of them belonging to nested series with at least seven of our standard grain sizes, 16,499 belonging to nested series with fewer than seven standard sizes, and 25,441 individual plots. In the data set containing only independent observations, the percentage of plots using "rooted presence" rose from 13.4 to 23.4%.
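The averaging step can be sketched as follows. The records and the series identifiers are hypothetical; the sketch only shows how subplots of one grain size are first averaged within each nested series, while minimum and maximum are still taken from all subplots.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical subplot records: (nested series id, grain size in m2, richness)
subplots = [
    ("series_A", 1, 10),
    ("series_A", 1, 14),
    ("series_A", 1, 12),
    ("series_A", 1, 16),
    ("series_B", 1, 30),
]


def independent_observations(records):
    """Average richness of same-sized subplots within each nested series."""
    by_series = defaultdict(list)
    for series_id, grain, richness in records:
        by_series[(series_id, grain)].append(richness)
    return {key: mean(values) for key, values in by_series.items()}


def summarize(records, grain):
    """Mean from independent observations; min and max from all subplots."""
    raw = [richness for _, g, richness in records if g == grain]
    independent = [
        value
        for (_, g), value in independent_observations(records).items()
        if g == grain
    ]
    return {
        "n_all": len(raw),
        "n_ind": len(independent),
        "mean": mean(independent),
        "min": min(raw),
        "max": max(raw),
    }


summary = summarize(subplots, grain=1)
naive_mean = mean(richness for _, _, richness in subplots)
print(summary, naive_mean)
```

In this toy example the four subplots of one series would dominate a naive mean, whereas the averaged version gives each series equal weight.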
We also added two filtering options as they can have significant effects on resulting richness patterns. (a) We allow filtering for data that were sampled with "rooted presence" or "shoot presence". As has been shown theoretically (Williamson, 2003) and empirically (Güler et al., 2016; Cancellieri et al., 2017; Zhang et al., 2021), species richness recorded with the rooted method deviates increasingly negatively from values recorded with the shoot-presence method as grain size decreases. (b) Subsetting to only those plots belonging to nested series with at least the seven "EDGG standard grain sizes" (0.0001 m² to 100 m²; see Dengler et al., 2016b) is also possible. This function can be important when analyzing SARs, which otherwise might be distorted by uneven representation of different grain sizes in specific regions.

| Richness hotspots
In this study we aim at identifying fine-grain α-diversity hotspots (hereafter, richness hotspots). These richness hotspots are different from the biodiversity hotspots of Myers et al. (2000), who emphasized a concentration of endemic species in larger regions combined with severe habitat loss. Other criteria such as the number of rare or threatened species and total species richness are also currently used to identify these hotspots; moreover, this term is now most commonly used with reference to regions of high species richness (Reid, 1998). Another difference with the most widely used concept of the biodiversity hotspot is that we are using fine-grain resolution (plot level, e.g., lower than 1 km²), while most studies identify hotspots using coarse-grain resolution maps, generally at 10,000 km² or even coarser (Myers et al., 2000).

| Development of the GrassPlot Diversity Explorer
The GrassPlot Diversity Explorer (https://edgg.org/databases/GrasslandDiversityExplorer) was developed to provide a dynamic version of the GrassPlot Diversity Benchmarks. We did this in R version 4.0.2 (R Core Team, 2020), using the shiny package (Chang et al., 2020). We also used other R packages, including tidyr and dplyr for data preparation (Wickham & Henry, 2020; Wickham et al., 2020), ggplot2, ggpubr and sunburstR for visualization of the outcomes (Wickham, 2016; Bostock et al., 2020; Kassambara, 2020), summarytools for generating summary statistics (Comtois, 2020), leaflet for producing an interactive map (Cheng et al., 2019), and shinyWidgets and shinycssloaders to increase the functionality of the shiny package (Perrier et al., 2020; Sali & Attali, 2020). The GrassPlot Diversity Explorer was then deployed on a dedicated server using the rsconnect package (Allaire, 2019).
TABLE 1 The number of available plots per taxonomic group and grain sizes. Standard sizes are indicated; 0.001 m² also includes 0.0009 m²; 0.1 m² includes 0.09 m²; 10 m² includes 9 and 10.89 m²; and 1,000 m² includes 900 and 1,024 m². N all = total number of plots. N ind. = number of independent observations, i.e., after averaging several subplots of the same grain size in the same nested series. Columns: All groups, Vascular plants, Bryophytes, Lichens.

Richness hotspots of vascular plants in grasslands and other open habitats are scattered across the Palaearctic. However, they may vary across grain sizes, both regarding mean richness (Figure 2) and maximum richness (Appendix S4). Richness hotspots also change according to vegetation type and taxonomic group (Appendix S4). Maximum richness hotspots of bryophytes, lichens and complete vegetation also vary with grain size (Appendix S4).
The box plots show the elevation distribution of plots across biomes, with the number of plots (n) above each bar. To fill in the Arabian Peninsula, the biome Tropics with summer rain is indicated in orange colour although GrassPlot does not contain any data from this biome
grain size. In addition to arctic-alpine heathlands, sandy dry grasslands, rocky grasslands and mesic grasslands show the highest values, as well as several azonal communities such as saline communities, rocks and screes, and wetlands (Appendix S5). Maximum richness corresponds to secondary grasslands across most grain sizes, but once again, the pattern changes for bryophytes and lichens, with maxima often in natural grasslands (Table 2). As regards biomes, the maximum richness slightly changes across grain sizes and taxonomic groups, although the temperate mid-latitudes hold most of the maxima for all taxonomic groups (Appendix S5).

Patterns of plant diversity in vegetation types
Species-area relationships of the six best-represented grassland

| GrassPlot Diversity Explorer
The GrassPlot Diversity Explorer is an easy-to-use online interactive tool that provides users flexibility in exploring and visualizing richness data collected in the GrassPlot database. The GrassPlot Diversity Explorer can be accessed via the EDGG website (https://edgg.org/databases/GrasslandDiversityExplorer). The tool is organized into eight panels (Figure 6). The first panel shows species
While the SARs were not the focus of this paper, our data illustrate some general patterns. The SARs plotted in "semi-log" space (i.e., with area logarithmized, but not species richness; Figure 5 and Appendix S5) invariably show an upward curvature, at least those that are based on the nested-plot data. This shape corresponds to a power function (see Dengler, 2008).
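Why a power function curves upward in semi-log space can be shown numerically. The sketch below generates richness values from an assumed power law S = c · A^z (the parameters c = 20 and z = 0.2 are hypothetical), checks that the richness gain per tenfold increase in area keeps growing, and recovers the parameters with an ordinary least-squares fit in log-log space, which is one common way to fit such a curve and not necessarily the approach of the authors.

```python
import math

# The seven EDGG standard grain sizes in m2, as listed in the article.
grain_sizes = [0.0001, 0.001, 0.01, 0.1, 1, 10, 100]

# Hypothetical power-law parameters, used only to generate example values.
c_assumed, z_assumed = 20.0, 0.2
richness = [c_assumed * area ** z_assumed for area in grain_sizes]

# Grain sizes are evenly spaced on a log axis, so growing increments between
# neighbouring sizes mean the curve bends upward in semi-log space.
increments = [b - a for a, b in zip(richness, richness[1:])]
is_convex = all(later > earlier for earlier, later in zip(increments, increments[1:]))


def fit_power_law(areas, species):
    """Least-squares fit of log10(S) = log10(c) + z * log10(A)."""
    xs = [math.log10(a) for a in areas]
    ys = [math.log10(s) for s in species]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return 10 ** intercept, slope


c_fit, z_fit = fit_power_law(grain_sizes, richness)
print(is_convex, round(c_fit, 3), round(z_fit, 3))
```

A logarithmic SAR (S = a + b · log A) would instead give constant increments and a straight line in the same plot, which is why the observed curvature points to the power function.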

| Data quality and methodological settings
GrassPlot only includes phytodiversity data that were carefully sampled with the aim of recording complete species lists within precisely
Database, it is well known that there are also other biases in the data. This study found, in several phytosociological classes, that the mean richness decreased above a certain threshold area, a pattern explained by the tendency of phytosociologists to select larger-than-average plots in vegetation types that are inherently poorer in species. When comparing the mean richness data from Chytrý (2001) for the three classes that are also contained in GrassPlot (Festuco-Brometea, Molinio-Arrhenatheretea, Phragmito-Magnocaricetea) we found substantially lower mean richness in the phytosociological database than in GrassPlot (not shown). Similarly, comparing the mean richness data of Festuco-Brometea grasslands from the Nordic-Baltic Grassland Vegetation Database (Dengler et al., 2006) with GrassPlot data from the same geographic region, we found a good match at 1 m², but increasing relative difference toward larger grain sizes (not shown). The consistently higher richness values in GrassPlot were unexpected as it is often assumed that phytosociologists preferentially sample plots with a species richness above average (Holeksa & Woźniak, 2005; Diekmann et al., 2007). By contrast, most GrassPlot data are based on systematic or random sampling or the approach of the EDGG Field Workshops (Dengler et al., 2016b), which aims to maximize between-plot heterogeneity, i.e., both presumably species-rich and species-poor stands are selected for making plots (which should not bias means, but possibly increase variance). A plausible explanation for the pattern found is that the average completeness of plots in phytosociological databases is lower than most researchers, including ourselves, would have guessed. This indicates that it might be risky to take the richness data from large phytosociological databases at face value.
TABLE 2 Maximum richness values for each taxonomic group and grain size across coarse-level vegetation types. The highest values for each taxonomic group are shown in bold. A: natural grasslands; B: secondary grasslands; C: azonal communities; D: dwarf shrublands; E: tall-forb and ruderal communities; F: deserts and semi-deserts. + or − before the maximum values indicates that they are derived from slightly smaller (+) or bigger (−) grain sizes than the standard ones, i.e., 0.0009, 0.09, 9, 10.89, 900
A more comprehensive study comparing the GrassPlot benchmarks with the mean richness values derived from EVA or sPlot should explore how prevalent such a pattern is and whether its strength varies systematically between regions, vegetation types and grain sizes.
While these findings underline the good suitability of typical data contained in GrassPlot for biodiversity analyses, we do not claim that the richness records are 100% complete. It has been shown repeatedly that this is nearly impossible, even when plots are sampled by more than one experienced author (see Lepš & Hadincová, 1992; Klimeš et al., 2001; Archaux et al., 2006). However, the results support the view that the fraction of overlooked species must be minor compared to average phytosociological data and possibly even compensated by an equally minor fraction of erroneously recorded species. When the complete GrassPlot data are used, in very few cases, we also found that richness above a certain threshold appeared to stagnate or even slightly decline (Appendix S5). However, this can be easily explained by biases caused by large numbers of plots that were sampled in local clusters and only for one grain size but not for the others. The effect disappeared when considering only nested-plot series that contain all seven standard grain sizes (Appendix S5). When comparing the continuous and dashed lines in these figures, it turns out that the dashed lines (the values for any plots) are largely below the continuous lines (nested plots with all seven grain sizes). This indicates that apart from biases due to adding local clusters (which should equally often be above and below the average), even within GrassPlot data, there is a "quality gradient": on average, the richness records in nested plots are more complete, but the differences are much smaller than between GrassPlot and conventional phytosociological databases. Finally, the way of recording plants as present in a plot, shoot presence vs rooted presence (Dengler, 2008), can also influence richness records, as highlighted by Williamson (2003).
In the habitats studied here, a visible effect occurs at grain sizes below 1 m² (Appendix S5) which is consistent
FIGURE 5 Species-area relationships for vascular plants (a) and complete vegetation (b) for six selected grassland types. Only plots belonging to nested series with at least seven standard grain sizes were included. No filtering by sampling method (rooted vs shoot) was applied
While we trust that our richness data for individual plots are more reliable than most other sources, the aggregated richness patterns reported in this paper in some cases might still be biased or misleading. First, data coverage in GrassPlot is sparser than in other big vegetation-plot databases. Consequently, there might be stronger biases concerning geography and vegetation types. Second, there are a few data sets in GrassPlot that have specifically been collected with the aim of studying sites of exceptional richness (e.g., Merunková et al., 2012; Roleček et al., 2014; Hájek et al., 2020).
However, GrassPlot also contains data that have been sampled in regions where a certain vegetation class is known to be poorer in species than in other parts of the respective country. In addition, a prevalence of vegetation plots from one subtype of a certain category might make this entire category appear relatively richer or  (Erdős et al., 2018). Some bias may also be caused by disputed borders between vegetation types.
Since the assignments to the fine-level vegetation types were largely based on syntaxonomy, and the fine-level types were fully nested in coarse categories, there are some "gray zones", e.g., some rocky, alpine and xeric grasslands might be secondary, and, vice versa, some meso-xeric grasslands might be natural, particularly those in the transition to the steppic natural grasslands (e.g., forest-steppes, Erdős et al., 2020), often maintained through grazing by wild herbivores and fire (Pärtel et al., 2005).

| Vegetation ecology
In studies on certain vegetation types, it is useful for authors to compare not only the richness values within their sample, but also

| Macroecology
An increasing number of studies use the enormous amount of vegetation-plot data from national and regional (see Dengler et al., 2011), continental (EVA;Chytrý et al., 2016) and global (sPlot; Bruelheide et al., 2019) vegetation-plot databases. This approach has great potential for macroecology as it combines fine-grain sizes with large spatial extents, a combination that could contribute to a more mechanistic understanding of patterns, but for a long time was underrepresented in macroecology (Beck et al., 2012). Moreover, vegetation-plot data allow for a much wider range of macroecological analyses than species occurrence databases do (Dengler et al., 2011;Bruelheide et al., 2019). Most of such plot-based macroecological papers take the information in the underlying databases as unquestioned facts. While such studies often address the unequal distribution of plots in space and time (Lengyel et al., 2011) and the preferential sampling of more species-rich communities (Divíšek & Chytrý, 2018), and sometimes also their different plot sizes (Večeřa et al., 2019), to our knowledge, the issue that the recorded species lists might be incomplete was hitherto not addressed in macroecological studies. Moreover, given the different traditions of phytosociology in different countries (Guarino et al., 2018), one can assume that the average degree of incompleteness might vary regionally, leading not only to biased absolute numbers but also unreliable patterns. Incomplete species lists are particularly problematic for macroecological studies on α-diversity and some studies on β-diversity, while studies on community-weighted means of traits or assembly rules are probably less affected, at least not when assuming that the overlooked species mostly were the rare ones with low cover.
Depending on the sensitivity of the study topic toward biased spe-

| Biodiversity conservation
In conservation, a typical challenge is to prioritize areas that deserve protection. Here our benchmarks could become a useful and applicable tool. As species richness is generally seen as one of the leading criteria for such prioritizations (Brooks et al., 2006;Brum et al., 2017), one could set an objective criterion for prioritization such as plots above the third quartile or 50% above the mean value.
Since the GrassPlot Diversity Benchmarks provide such values for any grain size up to 100 m² and specifically for each vegetation type, one can even compare across these categories, e.g., the threshold for alpine grasslands will be different from that for wetlands. In any case, we would like to emphasize that species richness cannot be used as a single criterion, as several naturally species-poor habitats are more species-rich after degradation, such as lower levels of salinity in saline communities. Another typical question in this context is whether a particular management or restoration measure was successful or what is the restoration potential of a specific habitat type. Did the measure achieve the typical diversity of that habitat type? Referring to richness data from the literature is troublesome in such cases as they were often recorded on different grain sizes and usually only at a single grain size, making the "translation" to another grain size challenging. All this is much easier with the GrassPlot Diversity Benchmarks, acknowledging that they largely reflect the situation during the past two decades as there is only a small fraction of 20th-century plots included. We also acknowledge that species number should not always be used as a unique criterion for such assessments, as restoration projects often monitor richness of habitat-specific target species to avoid bias caused by sites with high richness of ruderal or alien species.
Finally, we would like to advise again to carefully check plot number and spatial representativeness using the Explorer tool when using these benchmarks.
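The two example criteria mentioned above (richness above the third quartile, or 50% above the mean) can be written down as a simple rule. The benchmark values and the function name below are hypothetical and serve only to show how the two rules can disagree for the same plot.

```python
# Hypothetical benchmark row for one vegetation type and grain size;
# real values would come from the GrassPlot Diversity Benchmarks.
benchmark = {"mean": 30.0, "q3": 38.0}


def is_priority(richness: float, row: dict, rule: str = "q3") -> bool:
    """Apply one of the two example prioritization rules named in the text."""
    if rule == "q3":
        return richness > row["q3"]
    if rule == "mean_plus_50":
        return richness > 1.5 * row["mean"]
    raise ValueError(f"unknown rule: {rule}")


for observed in (40, 50):
    print(
        observed,
        is_priority(observed, benchmark, "q3"),
        is_priority(observed, benchmark, "mean_plus_50"),
    )
```

Because the rules can give different answers, the chosen criterion should be stated explicitly, and, as emphasized above, richness should not be the only criterion.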

| Quality check of data
In all the above-mentioned applications, the GrassPlot Diversity Explorer can be helpful for researchers and students alike to get feedback on how complete their field records likely are. The GrassPlot Diversity Benchmarks provide vegetation-plot databases with the option of checking the reliability of data sets before including them. For example, data sets with mean richness below the first quartile of the respective vegetation type × region × grain size should be considered carefully. They do not necessarily need to be excluded but could be labeled as doubtful unless the originators provide convincing reasons that the studied stands are actually so species-poor. This quality check may also be used when data from large vegetation-plot databases are selected for specific projects.
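A quality check of this kind might look like the sketch below. The lookup table, its keys and the labels are hypothetical; the only element taken from the text is the rule that a data set whose mean richness falls below the first quartile of the matching vegetation type × region × grain size combination is labeled as doubtful and not excluded automatically.

```python
from statistics import mean

# Hypothetical first-quartile benchmarks keyed by
# (vegetation type, region, grain size in m2).
first_quartiles = {
    ("mesic grasslands", "Region X", 10): 22.0,
}


def check_data_set(richness_values, vegetation, region, grain):
    """Label a data set by comparing its mean richness with the benchmark Q1."""
    q1 = first_quartiles.get((vegetation, region, grain))
    if q1 is None:
        return "no benchmark available"
    if mean(richness_values) < q1:
        return "doubtful: ask originators for justification"
    return "plausible"


print(check_data_set([12, 15, 18], "mesic grasslands", "Region X", 10))
print(check_data_set([25, 31, 28], "mesic grasslands", "Region X", 10))
print(check_data_set([25, 31, 28], "wetlands", "Region X", 10))
```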

| CONCLUSIONS AND OUTLOOK
The GrassPlot Diversity Benchmarks provide high-quality richness data from a wide range of open habitat types across the Palaearctic realm. The restriction to eight standard grain sizes, each separated by a factor of 10, is similar to some standardized sampling schemes on other continents, such as the Carolina Vegetation Survey in North America (Peet et al., 1998) and the BIOTA Observatories in Africa (Jürgens et al., 2012). Seven of the eight grain sizes are already well populated with data; only high-quality observations for 1,000 m² are still sparse (which is understandable, given the enormous time effort for a complete sampling of such an area; see Dolnik, 2003).
The amount of data in the underlying GrassPlot database and their spatial coverage are much lower than in the EVA (Chytrý et al., 2016) and sPlot (Bruelheide et al., 2019) databases, which is an important constraint that may affect the aggregated patterns reflected in the diversity benchmarks. However, we have shown that species recordings are, on average, apparently much more complete in GrassPlot.
Thus, depending on the research question, either EVA/sPlot, GrassPlot or a combination of both might be the best data source.
Our study further emphasizes the advantages of standardized methodologies and a set of uniform standard grain sizes. We

ACKNOWLEDGEMENTS
We thank Manuel J. Steinbauer for the concept of the richness map in Figure 2. We thank the hundreds of vegetation ecologists who sampled the high-quality data used in this article and contributed them to GrassPlot.

DATA AVAILABILITY STATEMENT
The aggregated data (as used in this paper) for any combination of the taxonomic group, grain size, vegetation type, region, biome, and methodological settings (rooted vs shoot; subsetting to only those plots belonging to nested series with at least seven standard grain sizes) are provided in Appendix S3. Future updates will be made available as GrassPlot Diversity Benchmarks (fixed versions) and dynamically in the GrassPlot Diversity Explorer (both at https://edgg.org/databases/GrasslandDiversityExplorer). The underlying plot-level data are available upon request from the GrassPlot database, following its Bylaws (https://edgg.org/databases/GrassPlot).