Core and occasional species: A new way forward

Abstract Various methods have been used to divide communities into core species and occasional or satellite species. Some methods are somewhat arbitrary, and there is evidence that many communities are more multimodal than bimodal. They also tend to rely on having multiple years of data. A completely novel method is presented that not only has no requirement for long‐term datasets but can divide communities into multiple groups. It is based on probability a species is present, calculated using Simpson's index and the sequential removal of species from the data. The sequential Simpson's index method was applied to species data from a grassland insect community. It was also applied to eleven other datasets that had been divided into core and occasional species in previously published studies. The new method was found not only to be consistent with previous core–occasional assessments but also able to identify multimodality in species abundance distributions. Although ideally used with a measure of persistence (frequency of occurrence) to rank species, community structure is consistently described even with only species abundance data. The method can be applied to short or long‐term datasets and can help identify multimodality and provide valuable insight into how communities change in time or space.


| INTRODUC TI ON
The idea of dividing up communities into species groups, based on their relative abundance or frequency of occurrence (persistence), is long standing (e.g., see Winterbottom, 1949). However, it was arguably not approached in a more formalized way until the coresatellite hypothesis of Hanski (Hanski, 1982;Magurran, 2007;Supp et al., 2015), in which distribution the core species are found at more sites and are relatively abundant, compared with satellite species.
The division of species into groups based on the frequency that species are encountered was incorporated into the UK's National Vegetation Classification (NVC), in which plant species were classified into five frequency classes, based on 20% bands (Rodwell, 2006).
In the NVC, the two highest frequency bands were given the term, constant, with the other three groupings being, frequent, occasional, and scarce. Magurran and Henderson (2003) extended the concept by replacing spatial distribution with temporal frequency and not applying any a priori expectation of frequency classification. They used data from a 21-year study of fish in the Bristol Channel, UK. By plotting maximum yearly abundance against number of years, a species was recorded (persistence); they were able divide the community into core and occasional species. Core species were more persistent and generally more abundant and ecologically associated with estuarine habitats. Occasional species were infrequent, typically less abundant and not considered estuarine species.
The deconstruction of communities into core and occasional groups is valuable as it may enable changes in the abundance of individual species or species groups to be detected (Magurran & Henderson, 2010;Whittaker, 2015). It can also be used to assist in the study of species abundance distributions (SADs) and the testing species abundance models (Magurran, 2007). Although a number of models have been proposed to describe SADs (Antão et al., 2017;Magurran, 2004), it has become clear that SADs are frequently not adequately described by them, at least in part due to their multimodal nature (Antão et al., 2017;. Multimodality can be defined as the presence of two or more species groupings within communities (Antão et al., 2017;Ugland & Gray, 1982). One way of understanding multimodality is to develop alternative models to describe SADs, such as the Gambin model (Matthews, Borregaard, et al., 2014;Ugland et al., 2007;Whittaker, 2015). Another approach is to identify the groupings that generate multiple modes and study their characteristics.
Among these studies, Genner et al. (2004) used a modification of Magurran and Henderson's (2003) method to identify core and occasional species of fish in long-term data from the Bristol and English Channels. Plotting abundance against persistence, they modeled the relationship with a third-order polynomial and divided the community at the inflection point into core and occasional species, with the same Bristol Channel community sub-division as Magurran and Henderson (2003). Ulrich and Ollik (2004) classified core species of forest Hymenoptera as being those found in six or more years from eight, while occasional species were found in three or less years. Several other studies have taken a similar, rather arbitrary approach by defining core species at being one end of a measure of frequency and occasional the other end (Coyle et al., 2013;Hansen & Carey, 2015;Ulrich & Zalewski, 2006). Some have used a single frequency point to divide communities into two groups (Barnes et al., 2016;Dolan et al., 2009;, while others have considered the presence of intermediate species (Boss & Silva, 2014;Supp et al., 2015). Alternative methods have included using species' variance to mean ratio, calculating Simpson's index for subgroups of species and fitting mathematical models to community data (Boisnier et al., 2010;van der Gast et al., 2011;Gray et al., 2005).
There has been criticism that many approaches have been arbitrary and a potential source of artifacts, although some studies have tried to use biological characteristics of the communities to help improve the reliability of species groupings (Barlow et al., 2010;Boisnier et al., 2010;Whittaker, 2015). As well as arbitrariness, any attempt to divide a community into core and occasional species presupposes the hypothesis that a binary division best describes a community.
However, such a simple a priori approach may not reflect the reality and complexity of many, or indeed most communities, and their arbitrary nature makes replication of methods problematic. Indeed, with multimodality believed to be widespread (Antão et al., 2017;Dornelas & Connolly, 2008), applying a bimodal description to a community risks missing potentially interesting ecological patterns.
Therefore, there is a need for an approach to divide communities in a more descriptive, objective, and reproducible way.
Another limitation of some previous methods has been that core-occasional and core-satellite distinctions have relied on having very large datasets, only available after sampling for many years or at many sites, thus precluding their use from shorter term or smaller scale studies. For example, Magurran and Henderson (2003) was based on 21 years of sampling fish communities. Deconstructing communities without such runs of data is more difficult, and unfortunately, much community data from field research are short-term (Barlow et al., 2010). Finding a reliable method for objectively identifying species groupings without long-term/multiple site sampling would therefore be very valuable. It would allow many more community datasets to be studied in terms of species groupings and would have the additional benefit of making it easier to look at how SADs and changes to species groupings vary over time or space (Magurran, 2007;Magurran & Henderson, 2010). I studied the grassland insect community from an intensively cattle-grazed site at Teagasc Grange, Ireland, over 4 years, to which I attempted to apply a binary core-occasional partitioning of species.
After an initial use of a modified form of the Genner et al. (2004) approach, in which I identified the method of calculating persistence as being potentially problematic, I developed a novel approach based on calculation of the Simpson's index and the sequential removal of species from the data. The new method has the advantage over previous partitioning methods of being descriptive and making no a priori assumption regarding the pattern of species groupings within a community, thus moving away from a simple binary division on communities. It can also be applied to a much wider range of datasets, as it is much less reliant on having data from multiple years or sites.
The method was tested against data from Magurran and Henderson (2003) and Genner et al. (2004), as well as nine other datasets from previous studies. In this paper, I describe this new method and discuss its application to the study of community structure. Specifically, I test the following questions: 1. Is it sensitive to the method of calculating persistence (frequency of occurrence)? 2. Does it show the same pattern a species grouping when species are ranked by persistence or by abundance?
3. Can it be applied to a range of previous community datasets?

| Insect sampling
Coleoptera and Hemiptera were suction sampled (n = 692) from the vegetation and ground surface of a cattle-grazed agricultural grassland (Drennan & McGee, 2009;Helden et al., 2015) at Teagasc Grange, Co Meath, Ireland, between 2002 and2005. Further details of sampling and species identification can be found in Helden et al. (2015).

| Statistical modeling and index calculations
The statistical modeling and calculation of values for Simpson's index, described below, were carried out using R version 3.4.0 (R Core Team, 2017). Code used for data manipulation and subsequent calculation of Simpson's index is given in Appendix A.1.

| Application of Genner et al. method to Grange
Hemiptera and Coleoptera data The Genner et al. (2004) method was applied to Grange Hemiptera and Coleoptera species and abundance data, to divide the species into core and occasional species (i.e., a binary classification). With only 4 years of data, year could not be used as a useful measure of persistence. Instead, it was measured by the number of times a species was found in a group of samples. To take account of phenological change, samples from all years and dates were randomly allocated in approximately equal-sized groups. Group size was defined by the number of individuals rather than number of samples. The minimum group size was taken as the number of individuals required to give a stabilized value (i.e., groups of the same or larger size would show no difference) of Simpson's index (1/D) (Lande et al., 2000;Magurran, 2004). For Hemiptera, this was a mean group size of 43.6 individuals (115 groups) and for Coleoptera 99.7 individuals (63 groups). For each species, persistence was determined as the number of groups in which it was present.
Log 10 mean abundance for each species was plotted as the response and persistence as the explanatory variable to give a sigmoidal pattern. A third-order polynomial line (y = x + x 2 + x 3 ) was fitted, and the value for persistence at the inflection point was determined. Species were classified as core if more frequent than the persistence value observed at the inflection point and as occasional if less so.
Although minimum group size could be determined using Simpson's index (see above), it was not known what the effect of larger group sizes would be on the number of core species. Therefore, the procedure was repeated for a series of nine other group sizes for

| Proposed new sequential Simpson's index method applied to Grange insect data and to Bristol and English Channel fish data
The proposed new sequential Simpson's index method was applied to Hemiptera and Coleoptera data from Grange, as well as to the Bristol Channel and English Channel fish data from Genner et al. (2004).
For Grange, the minimum group size (as defined in the previous section, above) was used for Hemiptera (115 groups, mean abundance 43.6) and Coleoptera (63 groups, mean 99.7). The frequency that a species was found in the groups was used only to rank species in terms of persistence. Species of equal persistence were ordered by abundance. Simpson's index (1/D) was calculated for the full community after adding one to the abundance of each species. Then, the highest ranked species (most persistent) was removed and Simpson's index recalculated. This was repeated with the sequential removal of the most persistent until only one species remained, when Simpson's 1/D = 1.
Simpson's index was plotted as the response variable with species rank (persistence) as the explanatory variable. The resultant data were modeled with polynomial generalized linear models using Gaussian error structure. A series of models, beginning with 1/D + (1/D) 2 , were created. For each new model, the next sequential power was added, such that the next was a third order polynomial, followed by a fourth order polynomial, and so on. The proportion of deviance explained by each model was calculated using the Dsquared function in the modEvA package (Barbosa et al., 2015). Increasing the polynomial power led to an increase in deviance explained, up to a point at which the deviance showed little further increase. The first model as deviance plateaued was taken as the optimal model. Model simplification was then carried out by the sequential removal of any nonsignificant terms, with adjacent models compared with AIC values and deletion tests used to justify changes (Crawley, 2007).
Polynomial models showed linear alternation in concavity at inflection points. These were identified by solving the model for each species rank and determining between which ranks the model concavity changed (Appendix A.1). The inflection points were taken as indicating dividing points between species groups. Original abundance and persistence data were compared with the inflection points. In some cases, at the less abundant and infrequent end of the data, species groups were combined when abundance and/or persistence were identical or very similar. In some cases when adjacent species near group boundaries had identical persistence, the boundary of the grouping was moved slightly away from the inflection point itself. The multiple species groupings form a continuum from the most persistent at one extreme to the least persistent at the other. The data were also inspected to try to determine a suitable dividing point between core (C) and occasional (O) species groups. This was judged in part by the position of the inflection points and partly by the abundance and persistence of the species. In most datasets, this was either at the central inflection point, in the case of an even number of groupings, or one of the two central inflection points for an odd number.
To test whether the number of core species defined by the new method was sensitive to group size, it was tested using the same 10 groups of Grange Coleoptera and 10 groups of Hemiptera data, as described in the previous section. This enabled the method to be com- The sequential Simpson's method was then applied to the same four datasets but instead of ranking species by persistence, they were ranked by abundance alone. This was because in many datasets, based on single sampling events, it is not possible to rank according to persistence. Species near boundaries were compared and if of equal abundance the boundary was adjusted accordingly.

| Application of new sequential Simpson's index method applied to nine published datasets
The new method was applied to nine datasets from seven papers.
These were as follows.

The average abundance and number of sites occupied by
Onthophagus (Coleoptera: Scarabaeidae) in Sarawak, Malaysia, grouped as core or satellite by Hanski (1982). Data extracted from figures 4 and 9 in Hanski (1982).
2. Tree species abundance from Barro Colorado Island (BCI), Panama, classified as common or rare by Gray et al. (2005). Species groupings were derived from the BCI-50 plot in figure 3b of Gray 3. Carabidae (Coleoptera) abundance and number of sites found in Poland. Data from table 1 of Ulrich and Zalewski (2006). 5. The persistence and abundance of fish species from an artificial marine reef in California, USA, which were divided into core and transient groups by Boisnier et al. (2010). Original data from table 5 of Matthews (1985).
6. Boisnier et al. (2010) also used a second dataset, from a study of fish from an artificial marine reef in Australia. The data from table 2 in Branden et al. (1994). 7. Two datasets, phytoplankton and fish from a lake in Wisconsin, USA, were divided into core, intermediate, and occasional species by Hansen and Carey (2015). Data from supporting information S1 table in Hansen and Carey (2015).
8. Abundance data of small mammals from Arizona, USA, divided into core, intermediate, and transient by Supp et al. (2015). Data were available from table 1 in Supp et al. (2015).
All were ordered by the frequency of species being found. For data from Hanski (1982), BCI (Gray et al. 2005), and Ulrich & Zalewski (2006) this was in terms of the number of sites a species was present, with the number of times a species was found for the remainder. In contrast, there was no significant effect of group size using the propose new sequential Simpson's index method (Spearman rank correlation: Hemiptera r s = 177.22, n = 10, p < 0.839; Coleoptera r s = 157.53, n = 10, p = 0.901) ( Figure 1).

| Proposed new sequential Simpson's index method applied to different community datasets
The following description of the results focuses on the overall patterns, particularly in terms of the core-occasional species split. More detailed description of the model output for each dataset can be found in Appendix A.2.

| Bristol Channel fish
When ranked by persistence, the community was divided between 33 core species within four groups, and 48 occasional species in five groups (Table 1, Figure 2a), which was identical to the 33-48 split of Genner et al. (2004). When ranked by abundance, 30 species were classified as core (in three groups) and 51 as occasional (two groups). The ratio of core and occasional species, when ordered by persistence and by abundance using the new method, showed no significant difference The abundance-based division was also not significantly different than that of Genner et al. ( 2 1 = 0.234, p = 0.629). Although there was little difference between the core-occasional split between persistence ranked and abundance ranked data, the subgroups did show differences in size and number (Figures 2a, 3a).

| English Channel fish
Modeling of persistence ranked data resulted in 39 core species in four groups and 33 occasional species also in four groups (Table 1,

| Grange Hemiptera
The persistence ranked community was split into six core groups containing 16 species and three occasional groups with a total of 25 species (Table 1).
The core group with the greatest persistence contained a single species of grass-feeding aphid, Rhopalosiphum that was clearly more abundant and more frequent than all other species (Appendix A.3). The next most persistent core group also contained a single grass-feeding aphid, Metopolophium, much less abundant but still found in 102 of the 115 groups. Further details of the boundaries of each core and occasional group, in terms of persistence and abundance figures, are shown in Appendix A.3.
One species of Javesella (Delphacidae) was in the fourth core group, another in the fifth, and a third was in the occasional group with the highest persistence. Similarly, the genus Macrosteles (Cicadellidae) had one species in the fourth core, one species in fifth core, and one in sixth core group. Details of the species within each group can be found in Appendix A.4.
Modeling of abundance ranked data resulted in classification of 15 core and 26 occasional species. There was no significant difference between the number of species categorized as either core or occasional between the persistence ranked and abundance ranked data ( 2 1 = 0.203, p = 0.652). There was also no difference in the categorization of the subgroup size (Fisher's exact test p = 0.889).

| Grange Coleoptera
The best fitting model for persistence ranked data divided the community into five species groups: two core groups, with a total of 38 TA B L E 1 Species groupings of the two fish and two insect communities deconstructed using the sequential Simpson's index method, after being ranked by persistence   Two species, Mocyta fungi and Amischa analis, occurred in all possible sample groups and were both far more abundant than any other species. There were 11 species from the genus Stenus (Staphylinidae).
Four in the most persistent core group, three in the next group, three in the first occasional group, and one in the next (Appendix A.4). There were three Curculionoidea that specialize on feeding on Trifolium: Sitona lepidus and Protapion fulvipes in the most persistent core group and Ischnopterapion virens in the most frequent occasional group.
Details of species groupings can be found in Appendix A.4.
The distribution of species into core or occasional categories was not significantly different from when data were ranked by persistence ( 2 1 = 0.019, p = 0.891). The distribution of subgroups was almost identical for occasional groups and the least abundant/persistent core group but for the abundance ranked data there were a further three core groups as opposed to one for persistence ranked.

| Tree species, Barro Colorado Island (BCI), Panama (Gray et al., 2005)
The 225 species were divided into four core groups, totaling 138 species, and three occasional groups with 92 species (Table 2). If the most occasional group was considered equivalent to the rare category of Gray et al. (2005), which is logical given the rapid rise in Simpson's index value for species rank 1-39 shown in the polynomial model (Figure 4b), and the rest were classified as common, there was no significant difference between the two approaches ( 2 1 = 0.227, p = 0.633).

| Carabidae (Coleoptera), Poland (Ulrich & Zalewski, 2006)
There were 44⋅ carabid species grouped into six core groups and 31 species in the two occasional groups (Table 2, Figure 4c). Ulrich & Zalewski (2006) suggested 20 core and 31 satellite species, with 24 considered intermediate. If the core groups were considered equivalent to the categories adopted by these authors as core and intermediate species (total 44), then the groupings are identical in size between the two approaches.

| Tintinnid ciliates, Mediterranean Sea (Dolan et al., 2009)
The community was divided into three core groups, totaling 30 species and two occasional groups, which together had 18 species (Table 2, Figure 4d). Dolan et al. (2009) suggested 11 core and 49 occasional species (the disparity in overall totals being due to undetected data points from figure 3 of Dolan et al. (2009), the source of the data). There was a significant difference in the groupings using the two techniques ( 2 1 = 22.088, p < 0.001). However, the division between Dolan's core and occasional aligns almost exactly with the boundary between groups C1 and C2. Dolan et al. (2009) considered core species to be only those that were found on all 18 days sampled, whereas with my approach species found on 7 or more days were included.

| Artificial reef fish, California (Boisnier et al., 2010)
The Californian fish community was divided into 16 core species, in four groups, and five occasional species in a single group (Table 2, Figure 3e). This was exactly the same core-occasional classification as used by Boisnier et al. (2010).

| Artificial reef fish, Australia (Boisnier et al., 2010)
There was a total of 22 species of Australian fish in five core groups, and 25 species divided among the three occasional groups (Table 2, Figure 3f).

| Lake fish, Wisconsin, USA (Hansen & Carey, 2015)
Of the 36 species of fish, 22 were core, arranged in four groups, with 15 occasional species in three groups (  Figure 4g). However, C1-C3 total 13 species (Table 2), which is very close to Hansen & Carey's core classification.

| Lake phytoplankton, Wisconsin, USA (Hansen & Carey, 2015)
The phytoplankton community was divided into 90 core species, in three groups, and 156 occasional species in two groups (Table 2, Figure 4h). This was very different from the 10 core species suggested by Hansen and Carey (2015). However, the least frequent occasional grouping was of 96 species, which was almost identical to the 97 considered occasional by the source study.

| Small mammals, Arizona, USA (Supp et al., 2015)
There were three core subgroups containing a total of eight species and two occasional groups, with 13 species (Table 2, Figure 4i). If the occasional groups were considered to be equivalent to the intermediate and transient species of Supp et al. (2015), then the groupings from the two techniques were almost identical ( 2 1 = 0.097, p = 0.755).

| D ISCUSS I ON
Previous plotting of abundance against persistence (i.e., frequency of occurrence) used to deconstruct communities into core and occasional species (Genner et al., 2004;Magurran & Henderson, 2003) relied on the availability of community data over many sampling events, and without this, it has been difficult to reliably distinguish these species groups (Barlow et al., 2010). However, the strong effect of group size on the number of species classified as being core makes it difficult to know which group size is most appropriate and to have confidence in the allocation of species. Moreover, this problem does not just apply to short-term datasets but to any sample series.
For example, if with the dataset from Genner et al. (2004) persistence were measured in months, half years or pairs of years instead of on an annual basis, the result using these previous methods would be a large difference in the number of core species identified. So, does that mean that the classification of core species by Magurran and Henderson (2003) and by Genner et al. (2004) was unreliable?
Although that is theoretically possible, in actuality this appeared not to be the case, as the division of the Bristol Channel fish community was found to be closely aligned to the ecology of the species and their relative abundance (Magurran & Henderson, 2003). However, the problem of group size still puts into question the applicability of grouping techniques that rely on persistence in this way.
The group size problem was solved with the new method by using persistence simply as a ranking, not as a continuous variable. The community was then deconstructed based on the differential abundance of species using the property of Simpson's index (1/D) as a measure of the probability that any two individuals drawn at random are the same species (Magurran, 2004). Using Simpson's index on progressively smaller sections of the community is a novel approach. Boisnier et al. (2010) had done something similar before but added all species at a given level of persistence, whereas I had removed TA B L E 2 Species groupings of nine additional datasets deconstructed using the sequential Simpson's index method  However, a really important outcome is that, on the assumption, a community is adequately sampled, there is no need for many years of data to divide it into core and occasional species. A particularly useful consequence of which is that by assessing the identity of species groupings over a series of shorter time periods, changes to F I G U R E 4 Simpson's index (1/D) values calculated from species abundance data, from nine previously studied datasets, ranked by persistence (except (i), which was ranked abundance) with the sequential removal of the most persistent (abundant) species. Solid lines show the best fitting polynomial models. Vertical lines show the boundaries between species groups, with the dashed line indicating the division between suggested core and occasional species groups. Species groupings identified by previous studies are shown as red for core, species, can be more easily studied (Genner et al., 2004;Henderson & Magurran, 2014;Henderson et al., 2011;Magurran, 2007;Magurran & Henderson, 2010). It could help to show whether the number or proportion of species in a group changed or whether particular species switched between subgroups or from a core group to an occasional grouping or vice versa. Such changes might be related to species invasions, climate change, or other anthropogenic factors such as changing habitat management and the differential removal of species through hunting.
The sequential Simpson's index method resulted in a very similar or identical pattern of deconstruction previously identified for the Bristol Channel (Magurran & Henderson, 2003) and English Channel fish communities (Genner et al., 2004), indicating it is a reliable way of distinguishing core species from occasional. However, more than simply dividing communities into two groups the higher order polynomials of the models, with their multiple inflection points, resulted in core and occasional species, being divided into a number of subgroups. Therefore, the new method enables both a binary division and a more complex structure that may be a better reflection of reality.
One of the main advantages of the new method is that it reduces subjectivity relative to other methods of grouping core and occasional species, through identifying an optimum model based on statistical grounds. Previous methods, such as those of Hansen and Carey (2015) and Dolan et al. (2009), have had a more subjective approach, in their cases considering core species only as those that were found on all sampling occasions. Application of the new method is much more objective and is likely to be much closer to a representation of reality.
One of the criticisms of the concept of a core (i.e., common)occasional (i.e., satellite or rare) dichotomy has been that it is an over simplification of reality (Ulrich & Ollik, 2004;Whittaker, 2015). The new method retains the core-occasional framework while identifying of subgroups helps to describe better the patterns of species that occur between the extremes of persistence (Ugland & Gray, 1982;Ulrich & Ollik, 2004). This approach may be particularly useful in studying multimodal species abundance distributions (Antão et al., 2017;. Do subgroups have any biological or ecological meaning or are they simply a description of probability due to relative abundance of their component species? Magurran and Henderson (2003) found that in an estuarine ecosystem the core species were those associated with muddy substrates, whereas the occasional species preferred habitats, or are normally found in deep water. This showed that there was a connection between habitat association, abundance, and persistence. So, by extension, habitat preference or other ecological characteristics can be expected to align with subgroups. Indeed, Ugland and Gray (1982) suggested that communities are composed of groups of species whose constituents have some similarity in their adaptation to a habitat. Therefore, subgroups of species may have ecological meaning in relation to habitat preference and other niche characteristics.
Unfortunately for many species, particularly invertebrates, knowledge of their ecology is limited, so it may be difficult to relate this to species groupings. However, some comment can be made about the subgroups of Hemiptera and Coleoptera identified from the grassland at Grange. The Hemiptera genera, Javesella and Macrosteles, were represented by relatively unspecialized grass-feeding species, often associated with disturbed grasslands (Nickel, 2003). Within the subgroups, both genera showed a clear abundance ranking of species. A similar pattern was found in Coleoptera of the genus Stenus, which are predators of soft-bodied arthropods such as Collembola, and in Trifolium feeding weevils (Curculionoidea) (Lott & Anderson, 2011;Morris, 1990Morris, , 2002. The order and scale of the differences between these species may reflect differences their respective niches (Harpole & Tilman, 2006;Southwood, 1996). Many other core Hemiptera species were generalist grass feeders, and core Coleoptera were often generalist predators or associated with cattle dung (Nickel, 2003;Skidmore, 1991).
Several core species could be related to the presence of their specialized food plants, such as the aphid Thecabius affinis, which feeds of Ranunculus, and the aphid Acyrthosiphon pisum a specialist on Trifolium repens and other leguminous plants (Blackman, 2010;Heie, 1980). Consequently, these Hemiptera and Coleoptera examples suggest that the subgrouping of species may well be related to their ecological niche.
The identity of the species included in the subgroups did differ depending on whether persistence or abundance was used to rank them. However, due to the close relationship between persistence and abundance (Magurran, 2004;Magurran & Henderson, 2003), the overall size and pattern of the subgroups showed little difference between models based on abundance ranking and those with persistence ranking. Consequently, abundance data alone could be used for the identification of core and occasional species and of the subgroup structure. To do so, a community would have to be well sampled and data give a good representation of true species richness (Gotelli & Colwell, 2001;Magurran, 2004). However, on that assumption, the new technique does not require long runs of sample data over many years. Therefore, data collected over shortterm periods or which are related to very long-lived species such as trees (Barlow et al., 2010;Condit et al., 2012;Umaña et al., 2017) can be deconstructed. Furthermore, communities could be deconstructed using data from individual years allowing investigation of how species groupings change over time in relation to issues such as climate change and human impact (Genner et al., 2004;Henderson & Magurran, 2014;Henderson et al., 2011;Magurran, 2007;Magurran & Henderson, 2010). Allocation of a species to a grouping would be mathematically independent of their status in other years. This would be particularly useful in studying changes in increasing or declining species, such as the fish Liparis liparis in the Bristol Channel, which may have declined due to increasing water temperature (Henderson et al., 2011). The new approach could also be extended to spatial studies to identify core and satellite species and species subgroupings, assessed at the individual site scale Supp et al., 2015).

HELDEN
The application of the new sequential Simpson's method to nine further datasets investigated its wider applicability. The division into core and occasional was identical or very nearly identical to the classification used by Hanski (1982), Boisnier et al. (2010) (Matthews 1985 data), Ulrich & Zalewski (2006) and Supp et al. (2015). For that of Gray et al. (2005), Dolan et al. (2009), andBoisnier et al. (2010) (Brandon 1994 data), binary core-occasional division did differ but there were strong similarities in other aspects of the groups identified. The least similarity was shown with the two Hansen and Carey (2015) datasets, who had not made a binary division but instead had grouped species at the extremes of abundance and/or persistence as core and occasional, with those species between being considered intermediate. Therefore, the new method can be successfully applied to a range of taxa, locations, and study approaches.
The division of communities into multiple groupings is aligned with the view that a binary division into core and occasional is simplistic and in reality communities are much more likely to be multimodal in structure (Antão et al., 2017;.
Although the new method can be used for a simple core-occasional dichotomy, it also identifies a more complex structure of multimodality. By applying the multiple groups to the previously studied datasets, rather than focusing on a simple dichotomy, reveals that there is strong agreement between the findings of the new approach and the various techniques used by other studies, even when the initial core-occasional grouping did not fit as well.

| CON CLUS ION
In applying the core-occasional species concept to a short-term data set of grassland insects, I found a probability based way of dividing communities into species groups. I have demonstrated that by sequentially removing species and calculating Simpson's index reveals patterns that distinguish the same core and occasional species classification as proposed by earlier studies and so show that it can be applied to a wide range of datasets, whether based on short-or longterm data. However, it has the advantage over previous methods in having no a priori assumption of a binary division of species but rather can identify multimodality in species abundance distributions.
It agrees well with previous studies that have used a range of taxa and sampling methodologies and offers a more objective approach to the study of species groupings. Moreover, it does not rely on long time series of sample data and can be used even if only species abundances are known, assuming well-sampled communities. It can therefore be applied to a far greater range of community data allowing a more fine-scaled approach and so has the potential to provide a valuable insight into how communities change in time or space.

ACK N OWLED G M ENTS
I am very grateful to Peter Henderson and Anne Magurran for giving me permission to use their fish community data and to Condit et al. (2012) for the use of BCI data. Annette Anderson helped with collection and initial sorting into orders of some of the insect samples. Purvis. The subsequent work did not receive any grants from funding agencies in the public, commercial, or not-for-profit sectors.

CO N FLI C T O F I NTE R E S T
The author has no competing interests to declare.

DATA AVA I L A B I L I T Y S TAT E M E N T
Data from Teagasc Grange can be accessed at https://doi.
• The values of the model (y) for each species rank (x) are listed (see table, below -data from Supp et al. (2015)).
• The difference is taken between each value of y and the value of the preceding value i.e. y x − y x−1 .
• The difference values change sequentially in either a positive or negative direction.
• Inflection points are represented when the difference values change direction. In the table below, the superscript letters (a-g) indicate this, with inflection points occurring between values with different letters. For example, the difference figure decreases from species 2 to species 3, then after species 3 it starts to increase (until species 5), so an inflection point (change from a decreasing difference to increasing difference) lies somewhere between species 3 and 4.
When ranked by abundance, data were best described by a ninth order polynomial. The glm model was as follows: Deconstructing the community resulted in three core groups (C1 = 5, C2 = 9, C3 = 16) totaling 30 species and 99.4% of individuals, and two occasional groups (O1 = 16, O2 = 35) totaling 51 species.