Integrating the Rabinowitz rarity framework with a National Plant Inventory in South Korea

Abstract Increasingly large presence-only survey datasets are becoming available for use in conservation assessments. Potentially, these records could be used to determine spatial patterns of plant species rarity and endemism. We test the integration of a large South Korean species record database with Rabinowitz rarity classes. Rabinowitz proposed seven classes of species rarity using three variables: geographic range, habitat specificity, and local population size. We estimated the range size and local abundance of 2,215 plant species from species occurrence records, and habitat specificity as the number of landcover types in which each species' records were found. We classified each species into a rarity class or as common, compared species composition by class to national lists, and mapped the spatial pattern of species richness for each rarity class. Species were classed as having narrow or wide geographic ranges using a cutoff of 315 km, the average across all species of a range size index (Dmax) based on the maximum distance between observations. There were four classes each within the narrow and wide range groups, sorted using cutoffs of local abundance and habitat specificity. Nationally listed endangered species only appeared in the narrow-range classes, while nationally listed endemic species appeared in almost all classes. Species richness in most rarity classes was high in northeastern South Korea, especially for species with narrow ranges. Policy implications. Large presence-only surveys may be able to estimate some classes of rarity better than others, but modifying them to include estimates of local abundance and habitat types could greatly increase their utility. Application of the Rabinowitz rarity framework to such surveys can extend their utility beyond species distribution models and can identify areas that need further surveys and areas of conservation priority. Future studies should be aware of the subjectivity of the rarity classification and that regional-scale implementations of the framework may differ.


| INTRODUCTION
Biological rarity in plants has been described in several ways, including edaphic endemism (Kruckeberg & Rabinowitz, 1985), having low population sizes (Flather & Sieg, 2013), or being phylogenetically unique (McKinney, 1997), and is essentially a function of a species' attributes: geographic range, habitat specificity, and local population size (Rabinowitz, 1981). Large presence-only survey datasets are increasingly available for use in conservation assessments. For example, the Global Biodiversity Information Facility (GBIF; http://www.gbif.org/) and many digitized herbaria collections such as the Australasian Virtual Herbarium (https://avh.chah.org.au/) and California's Jepson Herbaria (http://ucjeps.berkeley.edu/) contain species presence data. Such records have been used extensively for species distribution modeling (SDM) (Ancillotto, Strubbe, Menchetti, & Mori, 2016; Wright, Hijmans, Schwartz, & Shaffer, 2015), which can portray potential range. However, SDMs rely on the assumptions that species are in equilibrium with current environmental conditions and that their distribution or abundance represents environmental tolerances and resource requirements (Franklin & Miller, 2009).
Uncertainties in SDM results arise from the quantity and quality of observation data, from the predictor variables used, and from errors in the modeling (Elith & Leathwick, 2009). Model results are influenced significantly by sample size (Hernandez, Graham, Master, & Albert, 2006), and in many cases the number of records available to fit a reliable model is limited, particularly for rare species and poorly surveyed regions (Choe, Thorne, & Seo, 2016; Wisz et al., 2008). Furthermore, in many SDM studies the testing of model performance is not sufficient to convince ecologists (Vaughan & Ormerod, 2005).
On the other hand, species presence records could potentially be used in other ways, such as to determine plant rarity or endemism, which could offer a new tool for regional conservation efforts (Fleishman, Noss, & Noon, 2006). This approach is little tested, and traditional definitions of rarity may need to be modified to move from local to regional definitions of rarity (Saetersdal, 1994) and to integrate with large observation datasets. We asked how well a large survey of plant species observation records could identify different types of rarity and how well it could portray the resulting spatial patterns. To test the suitability of survey data for this type of analysis, we applied a classification of species' rarity types developed by Rabinowitz (1981) to a recently completed national-scale presence-only survey of South Korean plants to see how well the survey could be used to define species' rarity classes. We compared the resulting classification with nationally listed plant species lists and the IUCN Red List of Threatened Species, and examined the ability of the resulting spatial outputs to identify priority conservation areas and areas that need further species locality surveys in South Korea.
Rabinowitz (1981) proposed different types of species rarity using the following three variables: geographic range, habitat specificity, and local population size. Combining these three variables allows species to be classified into eight categories, seven of which represent different forms of rarity. These classes of rarity can be useful for conservation assessments, drawing on the basic characteristics of plant species from various literature sources and expert knowledge (Espeland & Emam, 2011; Silcock, Fensham, & Martin, 2011).
For example, Broennimann, Vittoz, Moser, and Guisan (2005) classified Swiss conservation priority plant species into Rabinowitz's rarity classes and, by comparing species' rarity classes with IUCN extinction risks, found that species with the most restricted distributions are at higher risk of extinction. Caiafa and Martins (2010) used the forms of rarity from Rabinowitz's classification to divide southern Brazilian Atlantic rainforest species into rarity classes and found that 11% were represented in the most restricted rarity type. Anacker, Gogol-Prokurat, Leidholm, and Schoenig (2013)
We classified the plant species into their respective rarity types and mapped overall plant species richness and plant species richness for each rarity class at various spatial resolutions to test the effect of spatial scale on the results. We used the resulting spatial patterns to examine the congruence among the different rarity classes and to identify potential conservation priority regions in South Korea and areas that may need further surveys. We show the strengths and limitations of using large systematic survey data to identify rarity classes for use in plant conservation. In addition, our approach provides a set of conservation plant lists for South Korea, and suggestions to complement ongoing national survey projects collecting natural ecosystem information by testing the accuracy and effectiveness of surveys.

| METHODS
We classified each species into one of Rabinowitz's rarity classes by using occurrence points to estimate the three rarity descriptors.
We estimated the geographic range size of each plant species by calculating the distances among its surveyed occurrence points. We calculated the average of the minimum distances between occurrence points of the same species as a proxy for local abundance, and we calculated habitat specificity by counting, on a detailed landcover map, the number of landcover types in which the observations for each species were found.

| Survey data
This study is confined to the mainland of South Korea, which has an area of 95,219 km². About 70% of the land area is mountainous

| Rarity measures
Rabinowitz proposed categories to classify species' rarity types using geographic range (wide vs. narrow), habitat specificity (broad vs. restricted), and local population size (large vs. small). The three attributes combine for eight (2 × 2 × 2) classes. One of these (wide range, broadly adapted, and large local population) is not rare, leaving seven types of rarity.
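The eight combinations can be enumerated directly. The following is a minimal Python sketch (the study itself used R, and this is not the authors' code). The letter codes follow the first-letter convention used for the classes in this paper (N/W for narrow or wide range, R/B for restricted or broad habitat specificity, S/L for small or large local abundance); the variable names are illustrative only.

```python
from itertools import product

# One letter per binary attribute, in the order range / habitat / abundance.
RANGE = "NW"      # Narrow, Wide
HABITAT = "RB"    # Restricted, Broad
ABUNDANCE = "SL"  # Small, Large

# 2 x 2 x 2 = 8 class codes such as "N/R/S" or "W/B/L".
classes = ["/".join(combo) for combo in product(RANGE, HABITAT, ABUNDANCE)]

# Wide range, broad habitat, large local population is the common (non-rare) class.
COMMON = "W/B/L"
rarity_classes = [c for c in classes if c != COMMON]

print(classes)
print(len(rarity_classes), "rarity classes")
```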
FIGURE 1 Locations of surveyed areas (left) and shaded topography (right) of South Korea
We adapted Rabinowitz's classification of rarity and abundance to classify the South Korean plant species (Table 1). We used a range size statistic, maximum distance (Dmax), to estimate each species' geographic range from the occurrence records (Hijmans & Spooner, 2001). Dmax is the largest distance (in kilometers) between any pair of occurrence points of a species and was taken to represent range size. Range size can also be calculated as an area using formal methods such as a convex polygon (Anacker et al., 2013; Gaston & Fuller, 2009). In a sensitivity analysis, we compared our range size metric with the convex polygon metric for species with three or more observations, and the agreement rate was 85% (Table S1).
Therefore, we used the Dmax metric in order to evaluate as many species as possible, including 234 species (10.6% of analyzed species) with only two observation records.
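As an illustration of the range-size index, the sketch below computes Dmax for one species. It is written in Python rather than the R used in the study, and it assumes that occurrence points are given as (latitude, longitude) in decimal degrees and that distances are great-circle (haversine) distances; the paper does not state which distance formula was used. The function names, the tie rule at the cutoff, and the sample coordinates are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch
CUTOFF_KM = 315.0         # average Dmax of all species reported in the paper


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def d_max(points):
    """Largest pairwise distance (km) among a species' occurrence points."""
    if len(points) < 2:
        raise ValueError("need at least two occurrence points")
    return max(haversine_km(p, q) for p, q in combinations(points, 2))


# Hypothetical records for one species; not taken from the survey.
records = [(37.5, 127.0), (37.6, 127.1), (37.4, 126.9)]
# Treating a value equal to the cutoff as "wide" is an assumption.
range_class = "wide" if d_max(records) >= CUTOFF_KM else "narrow"
print(round(d_max(records), 1), range_class)
```

Because the index only needs one pair of points, it is defined for species with two records, which is the property the text relies on.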
In addition, we calculated the minimum distance to estimate each species' local abundance. For each occurrence point, the minimum distance is the distance (in kilometers) to the nearest other occurrence point of the same species. We assumed that shorter distances between occurrence points indicate higher local prevalence and could therefore serve as a proxy for abundance. We averaged the minimum distances of all occurrence points of each species and named this value Dmin. We divided Dmax by Dmin and used this value as a proxy for local abundance. This proxy value increases as Dmax becomes larger and Dmin becomes smaller.
Regardless of range size, if the distances between the observation records were short, local abundance was considered to be large. All of the geographic calculations were conducted in R (version 3.5.1), using loops to iterate the calculations over each species.
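The local abundance proxy can be sketched in the same way. This Python example (again, not the authors' R code) assumes that coordinates have already been projected to a planar system with units of kilometers, so that Euclidean distance is adequate; the function names and the handling of duplicate locations are assumptions.

```python
import math


def d_max(points):
    """Largest pairwise distance (km) among occurrence points."""
    if len(points) < 2:
        raise ValueError("need at least two occurrence points")
    return max(math.dist(p, q)
               for i, p in enumerate(points) for q in points[i + 1:])


def d_min(points):
    """Mean over all points of the distance (km) to the nearest other point."""
    if len(points) < 2:
        raise ValueError("need at least two occurrence points")
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return sum(nearest) / len(nearest)


def abundance_proxy(points):
    """Dmax / Dmin; larger values are read as higher local abundance."""
    dmin = d_min(points)
    if dmin == 0:
        # All records share a location with another record; the ratio is undefined.
        raise ValueError("Dmin is zero; remove duplicate locations first")
    return d_max(points) / dmin


# Hypothetical planar coordinates in km.
clustered = [(0, 0), (1, 0), (4, 0)]
print(abundance_proxy(clustered))
```

One consequence of the definitions is worth noting: for a species with exactly two records, Dmax and Dmin are the same distance, so the ratio is always 1.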
For habitat specificity, we used a landcover map of South Korea to examine the habitat specificity of each species. The 2007 landcover map we used was derived from SPOT 5 and KOMPSAT 2 imagery and has 23 categories, including urbanized area, agricultural uses, forest, grass, wetlands, barren land, and water (website: http://egis.me.go.kr/map/map.do?type=land). We counted the number of landcover types overlapping with each species' surveyed points as a proxy for habitat specificity.
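Counting landcover types per species amounts to looking up the category under each point and counting the distinct values. The Python sketch below uses a tiny made-up raster in place of the 23-category national map; the grid, cell size, category names, and function names are all hypothetical, and a real analysis would read the map with a GIS library instead.

```python
CELL_KM = 1.0  # toy cell size; not the resolution of the real landcover map

# Toy raster: row index increases with y, column index with x (hypothetical).
TOY_LANDCOVER = [
    ["forest", "forest", "grass"],
    ["forest", "wetland", "grass"],
    ["urban", "urban", "water"],
]


def landcover_at(x_km, y_km):
    """Return the landcover category of the toy cell containing (x, y)."""
    row = int(y_km // CELL_KM)
    col = int(x_km // CELL_KM)
    return TOY_LANDCOVER[row][col]


def habitat_specificity(points):
    """Number of distinct landcover types under a species' occurrence points."""
    return len({landcover_at(x, y) for x, y in points})


# Hypothetical occurrence points (km) inside the toy raster.
points = [(0.5, 0.5), (1.5, 0.2), (1.2, 1.7), (2.5, 0.1)]
n_types = habitat_specificity(points)
print(n_types)
```

A lower count is read as more restricted habitat use; the count can never exceed the number of records, so it is bounded for sparsely recorded species.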
Then, we classified each species into the rarity framework using the average value of all species for each index: Dmax, Dmax/Dmin, and habitat specificity. We classified all species into the eight categories and compared them to the national registration of endangered and endemic species and to the IUCN Red List species in Korea (Table 1) to assess how well the large dataset performed in identifying different classes of plants.
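The classification step can be sketched as three comparisons against the all-species means. This is an illustrative Python version under stated assumptions: values equal to a mean are assigned to the upper class (the paper does not state the tie rule), the species names and index values are invented, and the class codes follow the first-letter convention used in this paper.

```python
from statistics import mean


def classify(indices):
    """Assign a rarity code to each species.

    indices maps species name -> (dmax_km, abundance_proxy, n_landcover_types).
    Cutoffs are the means of each index over all species.
    """
    cut_range = mean(v[0] for v in indices.values())
    cut_abund = mean(v[1] for v in indices.values())
    cut_hab = mean(v[2] for v in indices.values())
    codes = {}
    for name, (dmax, abund, n_types) in indices.items():
        codes[name] = "/".join([
            "W" if dmax >= cut_range else "N",   # wide vs. narrow range
            "B" if n_types >= cut_hab else "R",  # broad vs. restricted habitat
            "L" if abund >= cut_abund else "S",  # large vs. small abundance
        ])
    return codes


# Invented index values for four hypothetical species.
toy = {
    "sp_a": (100, 2, 1),
    "sp_b": (500, 20, 6),
    "sp_c": (400, 3, 5),
    "sp_d": (200, 15, 2),
}
codes = classify(toy)
print(codes)
```

Because the cutoffs are means of the analyzed species, adding or removing species shifts every cutoff, which is one reason the resulting classes are relative to the dataset.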

| Species richness
We calculated overall species richness using grids with three spatial resolutions (10, 25, and 50 km) because grid size can induce different spatial patterns (Seo, Thorne, Hannah, & Thuiller, 2009). Smaller grid cells produce richness patterns that are helpful for conservation at a local scale but also leave many empty areas where survey data are absent. Larger grid cells produce general richness patterns at a national scale but may generalize the diversity patterns excessively (Orozco-Ramírez, Perales, & Hijmans, 2016). We calculated species richness for the seven rarity classes at the 10 km resolution to find the spatial patterns of each species group, because the rarity types were measurable at this finer resolution. To find conservation priority areas and areas needing additional surveys, we took the top 20% of richness areas from each of the eight species groups and combined them to see which areas contain many types of rarity.
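The gridding and overlay steps can be sketched as follows. This Python example assumes planar coordinates in kilometers and square cells indexed by integer division; it interprets "top 20%" as the highest-ranked fifth of occupied cells, with ties broken arbitrarily by cell index. Both choices are assumptions, as the paper does not give these details, and all names and records are hypothetical.

```python
import math
from collections import defaultdict


def richness(records, cell_km):
    """Count distinct species per grid cell from (species, x_km, y_km) records."""
    cells = defaultdict(set)
    for species, x, y in records:
        cells[(math.floor(x / cell_km), math.floor(y / cell_km))].add(species)
    return {cell: len(spp) for cell, spp in cells.items()}


def top_cells(rich, fraction=0.2):
    """Return the richest `fraction` of occupied cells (at least one cell)."""
    n_keep = max(1, math.ceil(fraction * len(rich)))
    ranked = sorted(rich, key=lambda cell: (-rich[cell], cell))
    return set(ranked[:n_keep])


def congruence(records_by_class, cell_km=10, fraction=0.2):
    """For each cell, count how many classes rank it among their top cells."""
    counts = defaultdict(int)
    for records in records_by_class.values():
        for cell in top_cells(richness(records, cell_km), fraction):
            counts[cell] += 1
    return dict(counts)


# Invented records for two classes.
toy = {
    "N/R/S": [("a", 5, 5), ("b", 7, 3), ("c", 25, 5)],
    "W/B/S": [("d", 1, 1), ("e", 55, 55)],
}
overlap = congruence(toy)
print(overlap)
```

Rerunning `richness` with `cell_km` set to 25 or 50 gives the coarser grids; cells with no records simply do not appear in the result, which corresponds to the empty areas noted for small cells.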

| National level species distributions
The number of occurrence records for each species varies from 2 to 762 (Table S1), with 24 species observed more than 500 times (9% of the survey data). The most frequently observed species was Lindera obtusiloba (762 observations), followed by Zanthoxylum schinifolium (701), Stephanandra incisa (649), Aster scaber (595), and
TABLE 1 Species classification into eight categories using Rabinowitz's (1981)
Index definitions: Dmax, the largest distance between any pair of occurrence points of each species; Dmin, the average of minimum distances between occurrence points for each point.
Note. In the table, the code is made up of the first letters of the binary classification of each index. We estimated the geographic range size of each plant species by calculating the distances among its surveyed occurrence points. We calculated the average of the minimum distances between occurrence points of the same species as a proxy for local abundance, and we calculated habitat specificity by counting the number of landcover types in which the observations for each species were found. We classified each species into the rarity framework using the average value of all species for each index.

| Species classification
Using the average Dmax of all species as the cutoff, 915 species had narrow geographic ranges and 1,300 species had wide geographic ranges (Table 1 and
Within the narrow geographic range species, the average habitat specificity was 3.0 landcover types per species and the average local abundance was 6.1 (Figure 3a). The number of species in each rarity class is given in Table 1. Species with narrow geographic range and restricted habitat specificity (N/R/S and N/R/L) are often endangered or threatened species (Rabinowitz, 1981; Turchetto et al., 2016). Our results identified 7 of 11 listed endangered species, 36 of 114 known South Korean endemic species, and 18 of 28 IUCN Red List species as belonging to this restricted group. The other species we identified in these groups could be considered when updating the national lists of endangered species. The remaining four listed endangered species in our data fell into the groups with narrow geographic range but broad habitat specificity.
Within the four groups with wide geographic range, species with restricted habitat specificity and small local abundance are generally predictable and are at risk of habitat destruction (Franklin & Miller, 2009). There are 193 species with wide ranges and broad habitat specificity but small abundance, which we termed widespread sparse species (W/B/S; the lower right of Figure 3b).
In this study, nationally listed endangered species appear only in the narrow geographic range classes, while the nationally listed South Korean endemic species appear in all groups except N/R/L (Table 1). The number of endemic species belonging to narrow-range groups was 1.3 times higher than the number of endemic species in wide-range groups (Table 1).

| Species richness
We combined the top 20% of richness areas from each of the eight species groups (Figure 6). Of the 34,600 km² selected, 13,000 km² (38%) are high richness areas for only one rarity type, while 1,100 km² (3%) are high richness areas for all seven rarity groups and 3,600 km² (10%) are high richness areas for six rarity groups. High richness areas for five or more species groups cover 7,100 km² (21%) and are distributed mainly in the eastern parts of South Korea (Figure 6a). In particular, we note the areas with the highest species richness across the rarity classes: the 10 × 10 km grid cells containing Myeonokchi, Soksil, Yongsan, and Oeryong villages in Gangwon Province; Namtong, Daegak, and Judong villages in Gyeongsang Province; and Hwagok in South Chungcheong Province (Figure 6b).

| DISCUSSION
We combined a large presence-only dataset of 2,215 plant species with the well-known Rabinowitz (1981) rarity classification, using only spatial data to classify the rarity status of multiple species and to identify high species richness areas for each rarity class. The IUCN
discuss our approach and provide suggestions for further national survey protocols and for plant conservation in South Korea.
FIGURE 3 Species classification into eight categories using Rabinowitz's (1981)
FIGURE 6 Spatial congruence of high species richness areas for the eight species classes (a) and for seven rarity classes, excluding the common species group (b). We took the top 20% of species-rich areas from each of the eight species classes and combined them to see how many types of rarity were found in each grid cell. We marked areas with the highest species richness for multiple rarity classes with blue circles in Figure 6b.

| Implications of the rarity framework for plant conservation in South Korea
For the common species class and for all species combined, high species richness areas are evenly distributed across the country when smaller grid cells are used (Figures 4 and 5). However, for species richness in the rarity categories, we found that Gangwon Province (northeastern South Korea) contains the highest species richness, especially for species with narrow geographic ranges. The primary mountain ranges of South Korea, including the highly biodiverse Baekdudaegan Mountains (Choe et al., 2016; Ministry of Environment, 2012), are mostly located in Gangwon Province. Our study also found high species richness of rarity classes, and of nationally listed endangered and endemic species in each rarity class, in these mountains (Figure 5 and Figure S2), which indicates that our simple approach can identify high biodiversity areas for rare species. We suggest that further field surveys in these areas could help identify high priority plant conservation areas, that biological monitoring focused on tracking populations of narrow geographic range plant species be implemented in the northeastern parts of South Korea, and that regional urban development and land use plans should consider the preservation of the high biodiversity in this area.

| Scaling the Rabinowitz framework to a national scale
Rabinowitz considers species' geographic range size an important attribute for classifying rarity. In our study, nationally listed South Korean endangered species appeared only in the narrow geographic range classes, and the large survey data were fairly successful in identifying narrow-range species.
We assumed that the species observation data we used accurately represent the range distributions of plants in South Korea, but applying the Rabinowitz framework allowed us to consider the limits of the national survey data. Our dataset contained 435 species with two or three observations, including seven nationally listed endangered species and 35 endemic species. We included these species in order to retain the largest species list possible. However, these species comprise 86% of the N/R/S class, and we suspect this rarity class is inflated. We also acknowledge that unlisted species in this class may not be rare. Nevertheless, the species we list should be further investigated, as they may be candidates for the nationally rare species list. Species with narrow geographic range but broad habitat specificity (N/B/L and N/B/S) are considered theoretically unlikely by Rabinowitz (Franklin & Miller, 2009; Rabinowitz, 1981), but they made up a large portion of the species listed in the survey.
These results could be due to incomplete surveys or to local extinction, meaning the species were actually more broadly distributed historically, or possibly to error introduced by our measure of habitat breadth, which intersects the landcover map with the survey presence points. Potential error from the landcover map could be due to faulty landcover classification or to conversion of landcover at a given observation point since the time of the mapping (2007). It is also possible that this rarity definition does not scale as well to regional data-driven analyses.
Rabinowitz defined endemic species as confined to specific habitat types within a small area but with large abundance, so species in our N/R/L class are endemic species according to Rabinowitz. However, the nationally listed South Korean endemic species occur in almost all groups except N/R/L. The nationally listed endemic species had relatively large range sizes according to our measures, with a mean Dmax of 252 km, compared to 315 km for all species. We checked the attributes of the 11 species that we classed into the N/R/L group (Table 1). Three of them are distributed nationwide but were classified as narrow range because they lack records in the survey. These errors could be addressed by modifying the survey protocols. The remaining species in our N/R/L group appear only in limited geographic areas with specific habitat types and were well classified by our approach. For nationally listed endemic species, N/R/S was a better fit: 36 of the 114 nationally listed endemics were classed there. Therefore, the definition of endemism, which is likely context-specific, can have substantial consequences for the classification of large numbers of species using spatial data.

| Spatially defined rarity indices representing Rabinowitz's rarity classes
We devised the three rarity metrics using only spatial data, so they can be calculated rapidly with simple loops over species. This approach provides a useful alternative when other information on species is insufficient, but it also has limitations. For example, we classified each group using the average value of all species for each index, but more meaningful cutoffs should also be considered.
This section discusses the strengths and limitations of the three spatially defined rarity metrics we devised to classify species. For range size, we used maximum distance (Dmax) to estimate each species' geographic range. However, 10.6% of the species we studied had only two observation records, and it is uncertain how well a distance metric can represent species' range size. Therefore, we used a convex hull polygon approach to estimate range size (Gaston & Fuller, 2009) for species in our dataset with three or more observations, to test for agreement in the wide or narrow range size classification. Agreement between the approaches was 85% (Table S1), so we decided to use Dmax to evaluate as many species as possible for our study. This approach permitted us to estimate the range size of all species, but estimates of range size for species with low observation numbers and/or restricted habitats are uncertain.
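The agreement check between the two range-size measures can be sketched as follows. This Python example assumes planar coordinates in kilometers, computes the convex hull with Andrew's monotone chain algorithm and its area with the shoelace formula, and classifies both metrics at their respective means; the paper reports the 85% agreement but does not describe the cutoff used for the hull metric, so that choice and all names and values here are assumptions.

```python
from statistics import mean


def _cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area (km^2) of the convex hull; 0.0 for fewer than three hull vertices."""
    hull = convex_hull(points)
    n = len(hull)
    if n < 3:
        return 0.0
    twice = sum(hull[i][0] * hull[(i + 1) % n][1]
                - hull[(i + 1) % n][0] * hull[i][1] for i in range(n))
    return abs(twice) / 2


def agreement_rate(dmax_by_species, area_by_species):
    """Share of species given the same wide/narrow class by both metrics."""
    cut_d = mean(dmax_by_species.values())
    cut_a = mean(area_by_species.values())
    same = sum((dmax_by_species[s] >= cut_d) == (area_by_species[s] >= cut_a)
               for s in dmax_by_species)
    return same / len(dmax_by_species)


# Invented values for four hypothetical species.
toy_dmax = {"a": 100, "b": 500, "c": 300, "d": 320}
toy_area = {"a": 10, "b": 900, "c": 400, "d": 100}
print(agreement_rate(toy_dmax, toy_area))
```

Collinear or two-point records yield a hull area of zero, which is why the area-based comparison is restricted to species with three or more observations.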
For local abundance, we assumed that shorter distances between occurrence points meant higher local abundance and divided Dmax by Dmin to calculate the local abundance of each species. Many studies have found that species with wide ranges tend to have large abundance (Gaston & Lawton, 1990), so we devised the distance metric to calculate local abundance using the maximum distance we used for range size. This metric could be greatly improved with abundance records or potentially with plant functional type definitions (Soudzilovskaia et al., 2013), but we were limited to presence-only records. Nevertheless, by using this metric we were able to estimate an abundance value for all species, which may be correct at least in relative terms among species.
Habitat specificity is linked to the niche breadth of species, which can be related to measures such as soil type, moisture level, or successional state (Li, Nicotra, Xu, & Du, 2015). Rabinowitz describes habitat specificity as the preferred habitat types of species, such as thickets or forest edges. This was the only metric we developed that used data external to the species occurrence records. We used a landcover map to count the number of different landcover types that plant species occupied. Although this method does not require an extensive literature review to identify each species' characteristics, it has the drawback of being influenced by both the quality of the survey data and the accuracy of the landcover map, and it could be improved with more detailed landcover maps and habitat description records.

| Survey methods
Integrating the observation data with the rarity framework identified potential improvements for the South Korean national ecosystem survey protocols, which may extend to species record datasets generally. First, many species had few observation records but large Dmax index values (Figure 2). Some of these species are widely dis-
, 2007; Fidelibus & MacAller, 1993) and can provide long-term reference data.

| CONCLUSIONS
Many national-scale surveys and biodiversity portals such as GBIF (Global Biodiversity Information Facility; http://www.gbif.org/), iNaturalist (http://www.inaturalist.org/), and MOL (Map Of Life; https://www.mol.org/) contain species presence records. While many of these datasets also contain measures of species abundance, our simple approach using presence-only data can be used to extract more information from records that are already available and is therefore applicable to other regional or larger studies. In addition, although this study only considered species' occurrences, application of species' phylogenetic and functional types could provide useful information.
Plant species are one component of biodiversity, and the areas we identified may not represent South Korea's overall biodiversity. Comparison with species richness patterns from other taxonomic groups could improve the conservation target portfolio for South Korea (Dobson, Rodriguez, Roberts, & Wilcove, 1997). We found that presence-only data have the potential to be combined with rarity classifications, with caveats about the definitions inherent in the rarity classification and potential disconnects when scaling up from local to regional implementations (Franklin, Wejnert, Hathaway, Rochester, & Fisher, 2009).
(2014001310009). The authors thank three anonymous referees for constructive comments on the initial manuscript.

CONFLICT OF INTEREST
None declared.