EUNIS Habitat Classification: Expert system, characteristic species combinations and distribution maps of European habitats

Aim: The EUNIS Habitat Classification is a widely used reference framework for European habitat types (habitats), but it lacks formal definitions of individual habitats that would enable their unequivocal identification. Our goal was to develop a tool for assigning vegetation-plot records to the habitats of the EUNIS system, use it to classify a European vegetation-plot database, and compile statistically-derived characteristic species combinations and distribution maps for these habitats.

KEYWORDS: coastal habitat, diagnostic species, distribution map, dune vegetation, European Nature Information System (EUNIS), European Vegetation Archive (EVA), expert system, forest, grassland, habitat classification, man-made habitat, shrubland, vegetation database, vegetation plot, wetland


| INTRODUCTION
Comprehensive systems of classification of natural, semi-natural and man-made habitat types (hereafter also "habitats") are essential tools for nature conservation. They are important for designing networks of protected areas, conducting inventories of natural areas, monitoring, management planning, environmental impact assessment and setting targets for ecological restoration. The EUNIS (European Nature Information System) Habitat Classification, developed by the European Topic Centre for Biodiversity for the European Environment Agency (EEA) in the 1990s and early 2000s (Davies and Moss, 1998; Davies et al., 2004; Moss, 2008), is the main comprehensive pan-European hierarchical classification of habitats covering both the marine and terrestrial realms (Evans, 2012; Rodwell et al., 2018). It is extensively used in research and for various applications, including the implementation of European Community directives related to environmental protection (Vilà et al., 2007; Chytrý et al., 2008; De Graaf et al., 2009; Strasser and Lang, 2015; Adamo et al., 2016; Gigante et al., 2018; Hämmerle et al., 2018). It has also become one of the key elements for the European Directive 2007/2/EC on Infrastructure for Spatial Information in the European Union (INSPIRE, 2013) and the updated version of Resolution 4 of the Bern Convention on the Conservation of European Wildlife and Natural Habitats, which is the legislative basis for the Emerald network, a complement of the Natura 2000 network in the European countries that are not members of the European Union (Council of Europe, 2018).
Terrestrial habitats in EUNIS are often based on phytosociological vegetation types, such as those defined in EuroVegChecklist (Mucina et al., 2016; Rodwell et al., 2018). However, while phytosociological classification is mainly based on species composition and vegetation structure (De Cáceres et al., 2015), the EUNIS Habitat Classification also emphasizes the abiotic environment and geographic location as classification criteria. It also includes habitats in which plants are nearly or entirely absent. Still, most of the terrestrial habitats of EUNIS can be successfully defined using methods of vegetation science.
In recent years, the EEA recognized the EUNIS Habitat Classification as a key tool for assessing progress towards the European Union biodiversity targets and global Aichi targets.
EUNIS became a European reference to which national and regional classifications and various data sets could be linked in the framework of the INSPIRE Directive. As such, EUNIS enables structured dialogue between different networks of experts, including those describing habitats through in-situ vegetation sampling, those working with satellite imagery, and those developing and evaluating various policies.
To improve these uses of the EUNIS Habitat Classification, the EEA initiated a process of its revision at Level 3 (for the terrestrial realm) and 4 (for the marine realm) of the classification hierarchy.
This revision established more consistency, removed ambiguity and overlaps in definitions of types, and extended the typology to the entire European continent and adjacent seas, although still with some gaps, especially in eastern Europe (Russia and some adjacent countries). The proposals for revision of grassland, shrubland and forest habitat classification were summarized in a series of reports (Schaminée et al., 2012, 2013, 2014, 2016a), and a preliminary version of the revised EUNIS Habitat Classification was used in the project European Red List of Habitats (Janssen et al., 2016). The revisions included additions of new units, splitting or merging existing units and changes in habitat names and definitions. The revised EUNIS classification underwent public consultations with international experts and country representatives of Eionet, a partnership network of the European Environment Agency (https://www.eionet.europa.eu/). The public consultations resulted in further changes in the delimitation of individual habitats and their names. Based on the consultation proposals, a refinement of the classification for grassland, shrubland and forest habitats was made by Schaminée et al. (2018), for coastal and wetland habitats by Schaminée et al. (2019) and for vegetated man-made habitats by Schaminée et al. (2020). The work on the remaining sections is under way.
Classification expert systems assign individual vegetation plots to already established classification systems. This type of classification can also be called identification. It is particularly relevant for the EUNIS Habitat Classification because once a large number of vegetation plots from different parts of Europe are consistently assigned to habitats, exact characterization of species composition, distribution and environmental relationships of these habitats can be provided. This is of great importance for practitioners because so far the EUNIS habitats were only characterized by brief and often rather unclear textual descriptions and lists of units taken without revision from previous classifications such as CORINE Biotopes or Palaearctic Habitat Classification (Rodwell et al., 2018). Such superficial, and in some cases inconsistent, characterization confused the meaning of the EUNIS habitats. Therefore, the current interpretation of the same habitat type can vary among European countries.
Our aims here are to: (a) develop a classification expert system for automatic assignment of vegetation-plot records to coastal, wetland, grassland, shrubland, forest and man-made habitats of the revised EUNIS Habitat Classification at Level 3 of the classification hierarchy; (b) base this system on algebraic and set-theoretic concepts combined using formal logic; (c) assign all available European vegetation plots to EUNIS habitats; (d) define the characteristic species combination for each habitat based on a statistical analysis of the plots assigned to this habitat by the expert system; and (e) provide distribution maps of individual habitats based on the location of vegetation plots assigned to these habitats.

| Revised EUNIS habitat classification
EUNIS provides a hierarchical classification of European habitats.
Before the recent revision, EUNIS contained the following habitat groups at Level 1, i.e. the highest level of the classification hierarchy (Davies et al., 2004)
Recently, classification and delimitation of the habitat groups B, D, E, F, G and I and individual habitats at Levels 2 and 3 were revised, and all the habitat units at Levels 1 to 3 were re-coded and some of them renamed. The former group D was extended by adding helophyte beds, previously classified to group C, and renamed to Wetlands. However, some types of wetlands still remain in group C, which is currently under revision. The current codes and names of the six revised habitat groups, which are the focus of this paper, are as follows:
• R - Grasslands and lands dominated by forbs, mosses or lichens (called "Grasslands" in this paper)
• S - Heathlands, scrub and tundra (called "Shrublands" in this paper)
• T - Forests and other wooded land (called "Forests" in this paper)
• V - Vegetated man-made habitats (called "Man-made habitats" in this paper)
A list of the individual habitats belonging to these six groups is provided in Table 1, and habitat factsheets with their descriptions and corresponding phytosociological alliances of EuroVegChecklist (Mucina et al., 2016) are in Appendix S1. We prepared the lists of corresponding alliances based on expert judgement by comparing the basic characteristics of the EUNIS habitats with the EuroVegChecklist alliances. These lists can help scientists and practitioners who are familiar with European phytosociological classification to understand the content of individual EUNIS habitats. However, although the EUNIS habitat classification was, to a large extent, inspired by the phytosociological classification system, it developed independently of it, which implies that the phytosociological alliances are often not nested within habitat types. The "one-to-many" or "many-to-one" relationships of the EUNIS habitats to the EuroVegChecklist alliances are much more common than simple one-to-one matches.
Within the six revised habitat groups, we selected only those habitats that could be defined based on floristic criteria (Table 1, Appendix S1). We did not develop formal definitions for those habitat types that represent mosaics of several different habitats (e.g., wooded pastures) because their complete structure cannot be represented by single vegetation plots. We also did not consider those forest habitats that are defined based on the management practice or successional stage but do not differ floristically from related types with different management or in different successional stages (e.g., T41 Early-stage natural and semi-natural forest and regrowth or T43 Coppice and early-stage plantations). We were able to define plantations of non-site-native trees (those planted at sites where they would not occur naturally) but unable to define plantations of site-native trees because their floristic composition is usually indiscernible from that of natural forests.

| Data sources
The primary source for producing characteristic species combinations and maps for EUNIS habitats was a data set of European vegetation-plot records (henceforth "vegetation plots" or "plots").
Such plots typically contain a full list of vascular plant species, often also a list of bryophytes and lichens, estimates of cover abundance of each species and various additional sources of information on vegetation structure, location and environmental features in the plot (Dengler et al., 2011). These plots were extracted from the EVA database (Chytrý et al., 2016; accessed on 19 May 2020), and several other databases not included in EVA (see the full list of databases used in this study in Appendix S2). The geographical scope was the whole of Europe (including the European part of Russia), Azores, Madeira, Canary Islands, Anatolia, Cyprus, Georgia, Armenia and Azerbaijan. The data set contained plots representing both the target habitat groups (coastal, wetlands, grasslands, shrublands, forests and man-made) and non-target habitat groups (marine, inland surface water and inland sparsely vegetated habitats). The latter groups are not in the focus of this study because the revision of their classification is not yet complete or published; still, plots sampled in these types were needed to assure correct classification of the habitats that are transitional between the target and non-target habitat groups. We excluded the plots that reported only species composition without cover-abundance information for individual species. Further, we excluded plots smaller than 1 m², larger than 1,000 m², without geographical coordinates and those with reported uncertainty of the coordinates larger than 10 km. The plots with a missing indication of location uncertainty were retained, assuming most of them were within 10 km from the indicated coordinates. The resulting data set contained a total of 1,261,373 georeferenced plots. The data set was prepared using the Turboveg 3 program (Hennekens, 2015) and analysed using the Juice 7.1 program (Tichý, 2002). It is the most extensive data set of vegetation plots ever analysed (compare Bruelheide et al., 2019).

TABLE 1 Overview of the revised EUNIS habitat types and the number of plots assigned to each of them by the EUNIS-ESy expert system, with the current codes and corresponding codes used in the 2007 version of the EUNIS classification and in the European Red List of Habitats; habitats marked by an asterisk could not be defined in the expert system; numbers of plots for the habitats at Levels 1 and 2 of the classification hierarchy (grey rows) are the sums of the numbers of plots for the subordinated habitats at Level 3; for habitat groups (i.e. Level 1 habitats), the numbers of additional plots assigned directly to this level are given in brackets after the plus sign; the total number of the classified plots is 1,125,121 (783,177 at Level 3 and 341,944 [...]
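The plot exclusion criteria described in this section can be sketched as a simple filter. This is only an illustration, not the procedure actually run in Turboveg 3 or Juice: the field names (`has_cover`, `area_m2`, `lat`, `lon`, `uncertainty_km`) are hypothetical, and the handling of plots with a missing area (kept here) is an assumption, because the text does not state it.

```python
def keep_plot(plot):
    """Return True if a plot record passes the exclusion criteria described in the text.

    `plot` is a dict with hypothetical keys; None marks a missing value.
    """
    # Plots without cover-abundance information are excluded.
    if not plot["has_cover"]:
        return False
    # Plots smaller than 1 m2 or larger than 1,000 m2 are excluded.
    # Assumption: a missing area does not exclude the plot.
    area = plot["area_m2"]
    if area is not None and (area < 1.0 or area > 1000.0):
        return False
    # Plots without geographical coordinates are excluded.
    if plot["lat"] is None or plot["lon"] is None:
        return False
    # Reported coordinate uncertainty larger than 10 km excludes the plot;
    # a missing uncertainty is tolerated, as stated in the text.
    uncertainty = plot["uncertainty_km"]
    if uncertainty is not None and uncertainty > 10.0:
        return False
    return True


example = {"has_cover": True, "area_m2": 25.0, "lat": 50.1, "lon": 14.4,
           "uncertainty_km": None}
print(keep_plot(example))
```

Treating a missing uncertainty as acceptable mirrors the stated assumption that most such plots lie within 10 km of the indicated coordinates.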
The taxon names in this data set originated from various national and thematic international databases, most of them managed in Turboveg 2 (Hennekens and Schaminée, 2001), which use different taxon lists with partly inconsistent taxon concepts and names. Taxonomy and nomenclature were unified using Turboveg 3 in two steps. Firstly, the names from the original databases were interpreted by regional botanists, considering the taxonomic concepts and nomenclature used in the focal region of each database.
This step was important because it solved regional differences in the use and meaning of some taxon names. In this step, taxon lists of most of the European vegetation-plot databases were matched to accepted names in the SynBioSys Taxon Database, an unpublished working database of taxon names and concepts used in the EVA project (Chytrý et al., 2016). Secondly, the names of vascular plants [...] The names of bryophytes and lichens followed the SynBioSys Taxon Database.
The cover of individual species was, in most vegetation plots, recorded using a cover-abundance scale (70% of plots were recorded using a variant of the Braun-Blanquet scale; Westhoff and van der Maarel, 1973).We transformed all of these scales to the arithmetic mid-point percent cover values corresponding to the individual cover-abundance classes following the default conversion of the Turboveg 2 program (Hennekens and Schaminée, 2001).
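The mid-point transformation can be illustrated with a short sketch. The class limits below are a simplified, hypothetical variant of the Braun-Blanquet scale chosen only for illustration; they are not the default conversion table of Turboveg 2, and the classes "r" and "+" are left out because they are not defined by a cover interval.

```python
# Hypothetical class limits (lower %, upper %) of a simplified Braun-Blanquet scale.
# These are illustrative values, not the Turboveg 2 defaults.
CLASS_LIMITS = {
    "1": (0.0, 5.0),
    "2": (5.0, 25.0),
    "3": (25.0, 50.0),
    "4": (50.0, 75.0),
    "5": (75.0, 100.0),
}


def midpoint_cover(code):
    """Return the arithmetic mid-point percentage cover of a cover-abundance class."""
    lower, upper = CLASS_LIMITS[code]
    return (lower + upper) / 2.0


# Convert one plot recorded on the ordinal scale to percentage covers.
plot_codes = {"Fagus sylvatica": "5", "Galium odoratum": "2"}
plot_percent = {species: midpoint_cover(code) for species, code in plot_codes.items()}
print(plot_percent)
```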

| EUNIS-ESy: An expert system for identifying EUNIS habitats in vegetation-plot databases
The new classification expert system EUNIS-ESy (= EUNIS Expert System) was developed for identifying coastal, wetland, grassland, shrubland, forest and vegetated man-made habitats of the EUNIS Habitat Classification based on species composition and cover abundances of particular species or species groups. In the habitats that are difficult to distinguish based on purely floristic criteria, plot-location criteria were added.
EUNIS-ESy is based on formal definitions of habitats written as logical formulas in an editable script stored as a TXT file (Appendix S3). The computer program that runs the expert system evaluates all the plots of a vegetation database and checks for each of them whether it meets the conditions of one or more of the formal definitions of habitats included in this script. If a plot matches a definition of one habitat, it is assigned to this habitat. In an ideal case, habitats should be mutually exclusive, and each plot should be assigned to one and only one habitat. However, in reality, some plant communities have a transitional composition corresponding to two or even more habitats. The plots representing such communities are simultaneously assigned to all of these habitats. Other plant communities with idiosyncratic or impoverished species composition may not match any definition and remain unclassified by the expert system. Nevertheless, the expert system was prepared with the aim of allowing a large majority of plots to be assigned unequivocally to a single habitat.
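The three possible outcomes for a plot (one habitat, several habitats, or unclassified) can be sketched as follows. The habitat codes, species names and thresholds are hypothetical placeholders; the sketch shows only the matching logic, not the actual definitions in Appendix S3.

```python
def matching_habitats(plot, definitions):
    """Return the codes of all habitats whose formal definition the plot satisfies."""
    return [code for code, rule in definitions.items() if rule(plot)]


# Hypothetical definitions: each one is a predicate on a plot (species -> % cover).
definitions = {
    "HAB-A": lambda plot: plot.get("species a", 0.0) > 25.0,
    "HAB-B": lambda plot: plot.get("species b", 0.0) > 25.0,
}

typical = {"species a": 60.0}
transitional = {"species a": 40.0, "species b": 40.0}
impoverished = {"species c": 10.0}

for name, plot in [("typical", typical), ("transitional", transitional),
                   ("impoverished", impoverished)]:
    # An empty list means that the plot remains unclassified.
    print(name, matching_habitats(plot, definitions))
```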
The expert system script of EUNIS-ESy (Appendix S3) is divided into three sections, which represent successive steps in the analysis.
Section 1 merges selected taxon names. In most cases, it merges subspecies, varieties or forms to the species level. We did so in order to improve the consistency of taxonomic concepts across the data set because some authors recorded only species while others also recorded infraspecific taxa. Further, we merged taxonomically difficult groups of species containing many misidentifications into species aggregates.
Section 2 enumerates species belonging to individual species groups that characterize particular habitats or groups of habitats and are used in the formal definitions of habitats. A single species can be assigned to more than one group. The initial lists of species included in the groups were compiled from relevant phytosociological literature, national habitat handbooks, personal field experience, data from our previous EUNIS reports (Schaminée et al., 2013, 2014, 2016a, 2016b) [...]
The cover criterion is the percentage cover of specific species or the total cover of a species group occurring in a plot. The criteria are combined using the relational operators GR (greater than) or GE (greater than or equal to).

| Species-based assignment rules
When applied to a single species, an assignment rule with the occurrence criterion is "Species name GR 00," meaning that the species is present (its percentage cover is greater than zero). A non-zero percentage value defines a cover criterion; for example, "Fagus sylvatica GR 50" denotes that the cover of Fagus sylvatica in the plot should be greater than a preselected threshold cover of 50%. Alternatively, the cover of a species can be compared with the total cover of all the other species occurring in the plot, for example, "Erica tetralix GR #$$" means that the cover of Erica tetralix should be greater than the cover of any other species in the plot, or "Erica tetralix GR $50" means that the cover of Erica tetralix should be greater than 50% of the total percentage cover of all species in the plot (see Tichý et al., 2019, their Appendix S1, for syntax details).
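The four single-species rules above can be expressed as simple predicates. This is a minimal sketch, not the Juice implementation: the function names and the example plot are hypothetical, and the total cover of all species is passed in by the caller because the way it is aggregated is not specified at this point in the text.

```python
def is_present(plot, species):
    """'Species GR 00': the species occurs in the plot."""
    return plot.get(species, 0.0) > 0.0


def cover_greater_than(plot, species, threshold):
    """'Species GR nn': the cover exceeds a fixed percentage threshold."""
    return plot.get(species, 0.0) > threshold


def dominates(plot, species):
    """'Species GR #$$': the cover exceeds that of any other species in the plot."""
    others = [cover for name, cover in plot.items() if name != species]
    return plot.get(species, 0.0) > max(others, default=0.0)


def exceeds_share(plot, species, percent, total_cover):
    """'Species GR $nn': the cover exceeds nn% of the total cover of all species."""
    return plot.get(species, 0.0) > percent / 100.0 * total_cover


# Hypothetical plot: species -> percentage cover.
plot = {"Erica tetralix": 40.0, "Molinia caerulea": 30.0, "Sphagnum compactum": 10.0}
print(is_present(plot, "Erica tetralix"),
      cover_greater_than(plot, "Erica tetralix", 50.0),
      dominates(plot, "Erica tetralix"))
```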
When applied to species groups, the occurrence criterion assesses whether the number of species of the target group in the plot exceeds a pre-selected threshold, or whether the number of species of one group is greater than the number of species of another group (or other groups). The cover criterion for species groups assesses whether the total cover of the species belonging to the group is greater than a preselected threshold, or whether it is greater than the total cover of species belonging to another group (or other groups). The total cover of a species group is computed by combining percentage covers of individual species of the group following the Jennings-Fischer formula, which returns values that do not exceed 100%. Alternatively, in discriminating species groups (see below), the total cover can be computed as a simple sum of percentage covers of individual species, or a sum of square-rooted percentage covers of individual species.
EUNIS-ESy contains two basic types of species groups called "functional species groups" and "discriminating species groups." All the groups of both types are defined in Section 2 of the expert system script by listing species belonging to them. In that section, functional groups are indicated by the symbols ### and discriminating species groups by ##D. The groups of both types are used to define assignment rules in Section 3 of the expert-system script (Appendix S3). Most relational operators can be applied to both functional and discriminating groups, but one set of operators can only be applied to the discriminating groups.
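The three ways of combining species covers into a group total can be compared in a short sketch. The text does not spell out the Jennings-Fischer formula; the form used below, which assumes random overlap of species, $100\,(1-\prod_i (1-c_i/100))$, is our reading of it and should be checked against the cited syntax documentation. The function names are hypothetical.

```python
import math


def total_cover_random_overlap(covers):
    """Combine percentage covers assuming random overlap of species.

    Assumed form of the Jennings-Fischer formula; the result never exceeds 100.
    """
    uncovered = 1.0
    for cover in covers:
        uncovered *= 1.0 - cover / 100.0
    return 100.0 * (1.0 - uncovered)


def total_cover_sum(covers):
    """Simple sum of percentage covers (can exceed 100)."""
    return sum(covers)


def total_cover_sqrt(covers):
    """Sum of square-rooted percentage covers, which down-weights dominants."""
    return sum(math.sqrt(cover) for cover in covers)


covers = [50.0, 50.0]
print(total_cover_random_overlap(covers), total_cover_sum(covers))
```

Two species with 50% cover each give 75% under the random-overlap assumption but 100% under simple summation, which illustrates why only the former is bounded by 100%.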

Functional species groups
The concept of the functional species groups follows Landucci et al. (2015). These groups comprise species with similar traits (e.g., life form, morphology or phenology), but also species with similar distribution ranges, affinity to the same habitat, or species characterized by a combination of these properties.
The assignment rules for the functional species groups with occurrence criteria are as follows:
• The plot should contain at least n species of the group ("#nn Group-name" in the script, where nn is a two-digit number of required species; for example, "#03 Dwarf-shrubs" means that at least three species of the functional group Dwarf shrubs should be present in the plot).
• The plot should contain more species from one group than from another group ("### Group-name1 GR ### Group-name2" in the script; for example, "### Wet-grassland-herbs GR ### Mesic-grassland-herbs" means that the plot should contain more herb species of wet grassland than of mesic grassland).
The assignment rules for the functional species groups with cover criteria are as follows:
• The total cover of a functional species group in the plot should be greater than a threshold ("#TC Group-name GR nn" in the script, where #TC means the total cover of the species of the group calculated using the Jennings-Fischer formula and GR nn means greater than a percentage threshold; for example, "#TC Dwarf-shrubs GR 50" means that the total cover of dwarf shrubs in the plot should be greater than 50%).
• The total cover of a functional species group should be greater than that of another functional group ("#TC Group-name1 GR #TC Group-name2" in the script; for example, "#TC Wet-grassland-herbs GR #TC Mesic-grassland-herbs" means that the plot should contain a greater total cover of wet grassland herbs than that of mesic grassland herbs).
• The total cover of a functional species group should be greater than that of another functional group, excluding the species of the former group from the latter group ("#TC Group-name1 GR #TC Group-name2 EXCEPT #TC Group-name1" in the script; for example, "#TC Dark-taiga-trees GR #TC Trees EXCEPT #TC Dark-taiga-trees" means that the total cover of the dark taiga trees in the plot should be greater than that of other trees). This formula can also be used for comparing the cover of a single species with the total cover of a group, e.g., "Picea abies GR #TC Trees EXCEPT Picea abies" means that Picea abies should have a greater cover than the total cover of the other trees.
• The total cover of a functional species group in a plot should be greater than nn% of the total cover of all the species in a plot ("#TC Group-name GR $50" in the script, where $50 means 50% of the total cover of all species; for example, "#TC Dwarf-shrubs GR $50" means that the total cover of dwarf shrubs should be greater than 50% of the total cover of all species in the plot).
• A general group containing all the species occurring in the plot can be created, and its total cover computed using the #T$ notation in the script. Such a group can be used to identify whether the total cover in the plot is greater than a given threshold (e.g., "#T$ GR 30" means that the total vegetation cover in the plot should be greater than 30%). Alternatively, this general group can be used to define that the total cover of a functional species group should be greater than the total cover of all the other species in the plot ("#TC Group-name GR #T$" in the script; in this case, #T$ means the total cover of all the other species in the plot excluding the species of the group involved in the comparison; for example, "#TC Dwarf-shrubs GR #T$" means that the total cover of dwarf shrubs should be greater than the total cover of all the other species, i.e. non-dwarf-shrubs, in the plot).
• Finally, the assignment rules can consider the cover of only one species of the group, specifically the one that has the highest cover in the plot, using the symbol #SC (single-species cover). For example, "#SC Phrygana-shrubs GE #$$" means that the cover of at least one species belonging to the group of phrygana shrubs is greater than the cover of any other species in the plot or at least it is equal to the cover of the species with the highest cover value of those not belonging to this group; "Corylus avellana GR #SC Shrubs" means that the species Corylus avellana has a greater cover than the cover of any single species in the functional species group of shrubs except Corylus avellana.
In all cases, two or more functional species groups can be merged in the logical formulas. For example, "#TC Trees|#TC Shrubs|#TC Dwarf-shrubs" represents the total cover of all woody plants in the plot.
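The occurrence and cover rules for functional species groups listed above can be sketched as small helper functions. The group memberships and the example plot are hypothetical, the helper names are ours, and the random-overlap form assumed for the Jennings-Fischer formula is not given in the text.

```python
def combine(covers):
    """Assumed Jennings-Fischer form: combine covers under random overlap."""
    uncovered = 1.0
    for cover in covers:
        uncovered *= 1.0 - cover / 100.0
    return 100.0 * (1.0 - uncovered)


def n_species(plot, group):
    """Number of species of the group present in the plot ('#nn' and '###' rules)."""
    return sum(1 for species in group if plot.get(species, 0.0) > 0.0)


def tc(plot, group, exclude=frozenset()):
    """'#TC Group' with an optional 'EXCEPT' set of species left out."""
    return combine(c for s, c in plot.items() if s in group and s not in exclude)


def t_other(plot, group):
    """'#T$' in a comparison: total cover of all species outside the group."""
    return combine(c for s, c in plot.items() if s not in group)


def sc(plot, group):
    """'#SC Group': cover of the single group species with the highest cover."""
    return max((c for s, c in plot.items() if s in group), default=0.0)


# Hypothetical group memberships and plot (species -> % cover).
TREES = {"Picea abies", "Abies sibirica", "Betula pubescens"}
DARK_TAIGA_TREES = {"Picea abies", "Abies sibirica"}
DWARF_SHRUBS = {"Vaccinium myrtillus"}
plot = {"Picea abies": 40.0, "Abies sibirica": 20.0,
        "Betula pubescens": 30.0, "Vaccinium myrtillus": 25.0}

# "#TC Dark-taiga-trees GR #TC Trees EXCEPT #TC Dark-taiga-trees"
print(tc(plot, DARK_TAIGA_TREES) > tc(plot, TREES, exclude=DARK_TAIGA_TREES))
# Merged groups, as in "#TC Trees|#TC Dwarf-shrubs", are a union of the species sets.
print(tc(plot, TREES | DWARF_SHRUBS))
```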
An example of a habitat definition based on functional species groups (S35 Temperate and submediterranean thorn scrub):
• (<#TC Temperate-submediterranean-deciduous-shrubs GR 25> AND (<#TC Temperate-submediterranean-deciduous-shrubs GR $50> OR <#SC Temperate-submediterranean-deciduous-shrubs GR #$$>)) NOT (<#TC Mesomediterranean-maquis-shrubs GR [...]
This means that the total cover of the functional species group "Temperate-submediterranean deciduous shrubs", calculated using the Jennings-Fischer formula, should be greater than 25% and, at the same time, either the total cover of this group should be greater than 50% of the total cover of all the species in the plot or the cover of any species of this group should be greater than the highest cover in the plot of a single species that does not belong to this functional species group. In addition, the total cover of the functional species group "Mesomediterranean maquis shrubs" should not be greater than 5% and the total cover of the functional species group "Trees" should not be greater than 10%.
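The verbal reading of the S35 definition can be turned into a predicate. This is a sketch under stated assumptions: the species lists of the three groups are hypothetical stand-ins for those in Appendix S3, the total cover of all species is assumed to be combined in the same random-overlap way as group covers, and that random-overlap form of the Jennings-Fischer formula is itself our assumption.

```python
def combine(covers):
    """Assumed Jennings-Fischer form: combine covers under random overlap."""
    uncovered = 1.0
    for cover in covers:
        uncovered *= 1.0 - cover / 100.0
    return 100.0 * (1.0 - uncovered)


# Hypothetical group memberships (the real lists are in the expert-system script).
DECIDUOUS_SHRUBS = {"Prunus spinosa", "Crataegus monogyna", "Rosa canina"}
MAQUIS_SHRUBS = {"Pistacia lentiscus"}
TREES = {"Quercus robur"}


def is_s35(plot):
    """Evaluate the S35 definition as described in the text (plot: species -> % cover)."""
    def tc(group):
        return combine(c for s, c in plot.items() if s in group)

    shrub_cover = tc(DECIDUOUS_SHRUBS)
    total = combine(plot.values())  # total cover of all species (assumed aggregation)
    top_shrub = max((c for s, c in plot.items() if s in DECIDUOUS_SHRUBS), default=0.0)
    top_other = max((c for s, c in plot.items() if s not in DECIDUOUS_SHRUBS), default=0.0)

    dominant = shrub_cover > 0.5 * total or top_shrub > top_other
    excluded = tc(MAQUIS_SHRUBS) > 5.0 or tc(TREES) > 10.0
    return shrub_cover > 25.0 and dominant and not excluded


scrub = {"Prunus spinosa": 60.0, "Crataegus monogyna": 20.0,
         "Brachypodium pinnatum": 15.0, "Quercus robur": 5.0}
print(is_s35(scrub))
```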

Discriminating species groups
The concept of discriminating species groups follows the proposals of Dengler et al. (2006) and Willner (2011) and the principles of the expert system developed by L. [...] among broad habitat groups. For example, the expression "##Q +04 R1A-Semi-dry-perennial-calcareous-grassland" in the expert system script means that the sum of square-rooted percentage covers of the species belonging to the discriminating species group of semi-dry perennial calcareous grassland in the plot should be greater than the sum of square-rooted percentage covers of the species of the discriminating species group of any other non-forest habitat which has a discriminating group with a name starting with +04.
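The ##Q comparison described in the example can be sketched as follows. Only the group name "+04 R1A-Semi-dry-perennial-calcareous-grassland" comes from the text; the rival groups, all species memberships and the function names are hypothetical, and the sketch assumes that rival groups are identified purely by the shared name prefix.

```python
import math


def sqrt_score(plot, members):
    """Sum of square-rooted percentage covers of the group's species in the plot."""
    return sum(math.sqrt(plot.get(species, 0.0)) for species in members)


def wins_among_rivals(plot, target, groups):
    """'##Q target': the target group outscores every other group with the same prefix."""
    prefix = target.split()[0]  # e.g. "+04"
    own = sqrt_score(plot, groups[target])
    rivals = [sqrt_score(plot, members) for name, members in groups.items()
              if name != target and name.split()[0] == prefix]
    return all(own > rival for rival in rivals)


# Hypothetical discriminating species groups (name -> species).
groups = {
    "+04 R1A-Semi-dry-perennial-calcareous-grassland": {"Bromus erectus",
                                                        "Brachypodium pinnatum"},
    "+04 Hypothetical-mesic-meadow": {"Arrhenatherum elatius", "Dactylis glomerata"},
    "+01 Hypothetical-broad-group": {"Arrhenatherum elatius", "Bromus erectus"},
}
plot = {"Bromus erectus": 36.0, "Brachypodium pinnatum": 16.0,
        "Arrhenatherum elatius": 9.0}
print(wins_among_rivals(plot, "+04 R1A-Semi-dry-perennial-calcareous-grassland", groups))
```

Square-rooting the covers reduces the influence of a single dominant species, so that several diagnostic species with low cover can outweigh one abundant generalist.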
The relative importance of functional and discriminating species groups for habitat classification varies among habitat groups.
In the habitats defined by the presence of certain dominant species, especially forest and shrubland habitats, the use of functional groups in combination with threshold cover values is often sufficient and most effective to define the particular habitat (e.g., heathland is a habitat determined by the dominance of ericoid or genistoid dwarf-shrub species). In contrast, for the habitats characterized by a weak or irregular dominance of specific species, such as most grassland and coastal habitats, this method of habitat definition rarely provides satisfactory classification. For such habitats, we based the classification mainly on the discriminating species groups.
An example of a definition based on discriminating species groups (Q22 Poor fen): [...]
This means that the sum of square-rooted percentage covers of the discriminating species group of all mire habitats should be greater than that of any other broad habitat group, and at the same time the sum of square-rooted percentage covers of the discriminating species groups of fens, acidophilous fens and poor fens should be greater than those of their contrasting groups with the same number at the beginning of their name, and the cover of acidophilous fen species should be greater than 25% and the cover of either trees or shrubs should not be greater than 15%. Note that in this example, the group "+09 Acidophilous-fen-species" is used both as a discriminating species group (##Q) and a functional species group (#TC).

| Location-based assignment rules
Some habitats in the EUNIS classification are defined partly by their occurrence in specific latitudinal vegetation zones, altitudinal vegetation belts or habitat complexes. For example, some groups of coniferous forests are divided into boreal and temperate types, or some habitat types are defined by their occurrence on coastal dunes, although similar habitats also occur on inland dunes. In several cases, it is impossible to distinguish such habitats by plant species composition and cover alone, especially if they are species-poor, because at least in some places, their species composition can be the same in the boreal and temperate zones, or in coastal and inland dunes. Therefore, we included several location-based assignment rules into the formal definitions of habitats in EUNIS-ESy, complementing the species-based assignment rules. Nevertheless, we could define most of the habitats purely based on species composition, and we added the location-based assignment rules only when the species-based classification was unable to separate some types or would have required very complex definitions.
The location-based criteria are either qualitative or quantitative: in the expert-system script, they are indicated as $$C (C stands for "character") or $$N (N stands for "numeric"). The qualitative criteria are defined using the relational operator EQ (equal to), e.g., "$$C Country EQ Belgium" (the plot was located in Belgium). The quantitative criteria can also use the operator EQ, but more often they use the operators GR (greater than) or GE (greater than or equal to), e.g., "$$N Altitude (m) GR 1,000" (altitude of the plot was higher than 1,000 m a.s.l.). A range of the quantitative criteria can be defined by a combination of two statements, e.g., "<$$N Altitude (m) GE 500> NOT <$$N Altitude (m) GR 1,000>" defines an altitude from 500 m to 1,000 m a.s.l.
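The qualitative and quantitative location criteria can be sketched as predicates on the header data of a plot. The header field names follow the examples in the text ("Country", "Altitude (m)"); the function names are hypothetical, and the treatment of a missing altitude (the rule is not met) is an assumption.

```python
def country_is(header, country):
    """'$$C Country EQ <name>': qualitative criterion on a character field."""
    return header.get("Country") == country


def altitude_in_band(header, lower=500.0, upper=1000.0):
    """'<$$N Altitude (m) GE lower> NOT <$$N Altitude (m) GR upper>'."""
    altitude = header.get("Altitude (m)")
    if altitude is None:
        return False  # assumption: a missing value cannot satisfy the rule
    return altitude >= lower and not altitude > upper


header = {"Country": "Belgium", "Altitude (m)": 500.0}
print(country_is(header, "Belgium"), altitude_in_band(header))
```

Because the range is written as GE for the lower limit and NOT GR for the upper limit, both boundary values (500 m and 1,000 m) fall inside the band.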
The location information must be provided for each vegetation plot in a specific database field ("header data" in the terminology implemented in Turboveg and Juice). The following variables were used in EUNIS-ESy to define the location-based assignment rules:
• Country - a qualitative variable containing country names
[...]
This means that a plot is assigned to the habitat type F12 if the total cover of the functional group of the Arctic and alpine bryophytes and lichens is greater than the cover of all the other species in the plot, at the same time at least two species of this group are present, the total cover of Sphagnum species or trees is not greater than 5%, and either the latitude is greater than 65°N or the plot is from Iceland.

| The hierarchical structure of the expert system
The expert system was developed hierarchically.Each habitat definition was assigned a priority degree.When the expert system is running, the definitions with the highest priority are applied to the data set first, and the plots that meet the requirements of these definitions are assigned to the habitats, while other plots remain unclassified.Then, the definitions with lower priority are applied to the remaining unclassified plots.
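The priority-driven cascade can be sketched as a loop over priority levels from highest to lowest. This is an illustration of the described logic only: the data structures, habitat codes and rules are hypothetical, and the sketch assumes that a plot may receive several habitats when it matches several definitions at the same priority level.

```python
def classify(plots, definitions):
    """Assign plots to habitats, applying definitions from highest to lowest priority.

    plots: dict plot_id -> plot data
    definitions: list of (priority, habitat_code, predicate) tuples
    """
    assigned = {plot_id: [] for plot_id in plots}
    levels = sorted({priority for priority, _, _ in definitions}, reverse=True)
    for level in levels:
        for plot_id, habitats in assigned.items():
            if habitats:
                continue  # classified at a higher priority level; not re-evaluated
            assigned[plot_id] = [code for priority, code, rule in definitions
                                 if priority == level and rule(plots[plot_id])]
    return assigned


# Hypothetical two-step definition of one habitat: narrow rule first, broad rule later.
definitions = [
    (8, "HAB-X", lambda plot: plot["specialist_cover"] > 15.0),
    (1, "HAB-X", lambda plot: plot["discriminating_score"] > 5.0),
    (1, "HAB-Y", lambda plot: plot["discriminating_score"] <= 5.0
         and plot["total_cover"] > 30.0),
]
plots = {
    "typical": {"specialist_cover": 20.0, "discriminating_score": 9.0, "total_cover": 80.0},
    "less typical": {"specialist_cover": 2.0, "discriminating_score": 7.0, "total_cover": 70.0},
    "other": {"specialist_cover": 0.0, "discriminating_score": 1.0, "total_cover": 60.0},
    "sparse": {"specialist_cover": 0.0, "discriminating_score": 1.0, "total_cover": 10.0},
}
print(classify(plots, definitions))
```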
The current study deals only with the EUNIS habitat groups that were revised so far, i.e.N (Coastal), Q (Wetlands), R (Grasslands), S (Shrublands), T (Forests) and V (Man-made).However, the hierarchy of the expert system has been designed to include the vegetated habitats from the other groups (A -Marine, C -Inland surface waters, and H -Inland sparsely vegetated) once their revision is finished and published.Preliminary definitions of these habitats were developed and included in the expert system, which is important for separation of the target habitat groups from the non-target groups.The codes and concepts of these habitats in the expert system correspond to those used in the European Red List of Habitats (Janssen et al., 2016).However, these preliminary definitions were not tested and therefore not included in the results of the current study.
In some cases, we created two definitions with different priority levels to define a single habitat. The narrower definition is applied at a higher priority level. It is usually based on the occurrence or a high total cover of species from a functional group that comprises species narrowly specialized to the habitat. This definition classifies the plots that are typical examples of the particular habitat, but it leaves many less typical plots of this habitat unclassified. Subsequently, a broader definition is applied at a lower priority level to the unclassified plots. This definition is based on a discriminating species group and classifies the plots that are less typical examples of the habitat but still possess more features of this habitat than of any other habitat. Such a two-step approach is needed for habitats in which the occurrence of narrowly specialized species is a sufficient criterion for habitat assignment even if such species have a low cover. If only definitions based on discriminating species groups were used, some of the plots of these habitats could be misclassified.
For example, the habitat R11 Pannonian and Pontic sandy steppe is defined by two formulas. First, a narrower definition is applied at a higher hierarchical level:
• <#TC R11-Pannonian-and-Pontic-sandy-steppe-specialists GR
This means that the total cover of the functional species group R11-Pannonian-and-Pontic-sandy-steppe-specialists, which includes a selection of narrow ecological specialists of this habitat, should be greater than 15%, and the total cover of trees and shrubs should not be greater than 15%. Then, a broader definition of the same habitat is applied to the plots that were not yet classified by any formula with a higher priority level:
should not exceed 15%.
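The two-step logic for R11 can be sketched in Python as follows. The sketch is illustrative only: the function names, the habitat label "other habitat" and the species assigned to each group are hypothetical, and total cover is taken as a plain sum of percentage covers, which is an assumption rather than the rule used by the software. The narrow test checks the total cover of specialists, while the broad test compares the sum of square-rooted covers of the R11 discriminating group with that of every other habitat and requires at least three species of the group.

```python
import math


def group_cover(species, group):
    """Summed percentage cover of the group members present in the plot (assumed rule)."""
    return sum(cover for name, cover in species.items() if name in group)


def sqrt_cover_sum(species, group):
    """Sum of square-rooted percentage covers of the group members present in the plot."""
    return sum(math.sqrt(cover) for name, cover in species.items() if name in group)


def r11_narrow(species, specialists, woody):
    """Narrow definition: specialists cover > 15% and trees plus shrubs <= 15%."""
    return group_cover(species, specialists) > 15 and group_cover(species, woody) <= 15


def r11_broad(species, discriminating, woody, target="R11"):
    """Broad definition based on discriminating species groups of all habitats."""
    scores = {habitat: sqrt_cover_sum(species, group)
              for habitat, group in discriminating.items()}
    best_other = max((score for habitat, score in scores.items() if habitat != target),
                     default=0.0)
    n_target = sum(1 for name in species if name in discriminating[target])
    return (scores[target] > best_other
            and n_target >= 3
            and group_cover(species, woody) <= 15)


# Hypothetical groups and a plot with low cover of specialists.
specialists = {"Festuca vaginata", "Koeleria glauca"}
discriminating = {
    "R11": {"Festuca vaginata", "Koeleria glauca", "Stipa borysthenica"},
    "other habitat": {"Festuca rupicola"},
}
woody = {"Pinus sylvestris"}
plot_species = {"Festuca vaginata": 4, "Koeleria glauca": 4,
                "Stipa borysthenica": 1, "Festuca rupicola": 9}
```

In this example the plot fails the narrow definition (specialists cover 8%) but passes the broad one, because the R11 group scores 2 + 2 + 1 = 5 against 3 for the competing group and three R11 species are present.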
The expert system contains habitat definitions assigned to eight priority levels (Figure 1). The definitions at the highest priority level (8) are applied first in the classification, while the definitions at the lowest priority level (1) are applied last:
• Level 8: Coastal habitats dominated by woody plants (i.e. coastal heaths, dune scrub and dune forests; habitats N18 to N1G) and Macaronesian heath (S43) are defined based on a high cover of dominant species or a high total cover of functional groups of dominant species (dwarf shrubs, shrubs and trees) in combination with the occurrence on the coast, on coastal dunes, or in Macaronesia, respectively.
• Level 7: Other (i.e. herbaceous) coastal habitats (N11-N17 and N1H-N1J) and marine habitats of the tidal zone dominated by vascular plants (A25a-A25d) are defined based on the discriminating species groups of habitats within these two groups, in combination with occurrence on the coast or in coastal dunes.
• Levels 5-6: The habitat group H (inland sparsely vegetated habitats) is defined based on a cover not greater than 30% in combination with the occurrence of at least one species specialized to the habitats of this group. Individual habitats within this group are provisionally defined based on their preliminary discriminating species groups at level 5. In the future, level 6 with narrower definitions of these habitats based on specialist species will be added.
• Level 4: Some coastal herbaceous (group N), some grassland (group R), all shrubland except the Macaronesian heath (group S), all forest (group T) and some man-made (group V) habitats are classified using definitions based on a high cover of characteristic dominant species or functional groups of characteristic dominant species, or the presence of a specified minimum number of species narrowly specialized to individual habitats.
• Level 3: Habitat groups of shrublands (group S) and forests (group T) are defined based on the dominance of shrubs, dwarf shrubs or trees. Vegetation plots that have not been previously classified to specific habitats within these groups are classified directly to these broad groups.
• Level 2: All the non-shrubland and non-forest habitats that were not defined before are defined based on the discriminating species groups.
• Level 1: Habitat groups of wetland (Q, separated into the groups of Qa - mires and Qb - helophyte beds), grassland (R), man-made (V), inland surface water (C) and inland sparsely vegetated (H) habitats, without assignment to any specific habitat, are defined based on discriminating species groups. Vegetation plots that have not been previously classified to specific habitats within these groups are classified directly to these habitat groups.
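The priority-based application of definitions described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the algorithm as implemented in Juice: the tuple layout `(priority, habitat, predicate)`, the plot fields and the toy definitions are all hypothetical. Definitions are applied from the highest level downwards, and only plots that are still unclassified are tested at each lower level.

```python
def classify(plots, definitions):
    """Assign plots to habitats by descending priority level.

    plots       -- dict mapping plot id -> plot data (any structure the predicates accept)
    definitions -- list of (priority, habitat, predicate) tuples; predicate(plot) -> bool
    Returns a dict plot id -> list of habitats matched at the first level that matched.
    """
    assigned = {}
    levels = sorted({priority for priority, _, _ in definitions}, reverse=True)
    for level in levels:
        current = [(habitat, test) for priority, habitat, test in definitions
                   if priority == level]
        for plot_id, plot in plots.items():
            if plot_id in assigned:
                continue  # already classified at a higher priority level
            matches = [habitat for habitat, test in current if test(plot)]
            if matches:
                # More than one match means the plot fits several habitats at this level.
                assigned[plot_id] = matches
    return assigned


# Toy definitions with hypothetical thresholds and field names.
definitions = [
    (4, "specific forest habitat", lambda p: p["tree_cover"] > 50 and p["beech_cover"] > 25),
    (3, "T", lambda p: p["tree_cover"] > 25),
    (1, "R", lambda p: p["grass_cover"] > 25),
]
plots = {
    "a": {"tree_cover": 80, "beech_cover": 60, "grass_cover": 5},
    "b": {"tree_cover": 40, "beech_cover": 0, "grass_cover": 30},
    "c": {"tree_cover": 0, "beech_cover": 0, "grass_cover": 70},
    "d": {"tree_cover": 0, "beech_cover": 0, "grass_cover": 5},
}
result = classify(plots, definitions)
```

Plot "b" illustrates the effect of the hierarchy: it also satisfies the level 1 grassland test, but it is assigned to the forest group at level 3 and is never offered to the lower level. Plot "d" matches nothing and stays unclassified.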

| Iterative evaluation and optimization of the expert system
The expert system and formal definitions of individual habitats therein were created based on expert opinion combined with iterative improvement, which used information from the evaluation of the results of successive classification trials. A preliminary version of the expert system contained the initial species groups (Section 2.3) and the first version of formal definitions for a subset of habitats belonging to the same habitat group (e.g., Forests).
The formal definitions were proposed by the authors of this paper, who considered the content of each target habitat and the options provided by the formal language of the expert system. The aim was to propose definitions that would include most plots belonging to the habitat and none or very few plots not belonging to the habitat. This preliminary version of the expert system was applied to a data set of European vegetation plots in the Juice program. The resulting classification was evaluated by the experts, focusing on false positive (a plot does not belong to the habitat but is assigned to it) and false negative (a plot belongs to the habitat but is not assigned to it) classification results. The classification could only be validated based on the judgement of human experts because there is no standard of correct plot-level classification. Therefore, the plots that the expert system assigned to individual habitats were checked by several experts from different countries, who were specialists in different habitats. These plots were also mapped, and the experts paid special attention to geographically outlying plots and to the absence of plots in areas where the habitat was expected. If the experts identified misclassified plots, the expert system was modified to avoid such misclassifications. The modifications were made either to the content of the species groups or to the assignment rules in the formal definitions. For species groups, the species that contributed to misidentifications were identified and removed from the groups, while other species that might contribute to the correct habitat identification were added to the relevant groups. For formal definitions, the structure of the formulas or the thresholds used in the formulas were changed. This process was repeated many times until the misclassifications identifiable by the experts were eliminated. Once the final classification was achieved for one habitat group, the first versions of formal definitions of another habitat group were added, and the iterative optimization process was repeated.

| Characteristic species combination
In phytosociology, "characteristic species combination" is defined as a combination of diagnostic species and species with higher constancy that together define a vegetation unit (Braun-Blanquet, 1964). Here we use this term as an umbrella for the three types of species that Chytrý and Tichý (2003) introduced to characterize vegetation types: diagnostic, constant and dominant species. Diagnostic species (Whittaker, 1962; Westhoff and van der Maarel, 1973) are species with occurrences concentrated in a particular habitat, being absent or rare in other habitats. As such, they are useful as positive indicators of the habitat. However, diagnostic species may be absent from the habitat at many sites. Constant species are species that occur frequently but not necessarily exclusively in a particular habitat: some of them may be generalist species that are also frequent in other habitats. Dominant species are those that often reach high cover in a particular habitat, thus determining the habitat physiognomy.
For the purposes of computing the characteristic species combination for each EUNIS habitat at Level 3 of the classification hierarchy, we used a data set of all the plots classified at this level, including the vegetated marine (A), inland surface water (C) and inland sparsely vegetated (H) habitats, which were defined provisionally in the expert system.We performed a stratified resampling of this data set (Knollová et al., 2005) to balance the spatially uneven sampling effort across Europe, i.e. a high concentration of vegetation plots in relatively small areas contrasting with low density or absence of vegetation plots in other, often large areas.
This procedure should reduce the bias in identification of diagnostic, constant and dominant species, especially for those species that are frequent in heavily sampled areas but rare or absent elsewhere. The stratification was applied to a data set of those vegetation plots that were classified by the expert system to Level 3 habitats, excluding the plots classified to Level 1 (habitat groups) but not to Level 3 habitats. All the plots in this data set were assigned to geographical grid cells of 5 min × 3 min of longitude × latitude (corresponding to approximately 6.0 km × 5.5 km at 50° N). If a cell contained more than one plot belonging to the same habitat, one randomly selected plot was retained, while the others were removed. If such resampling resulted in <20 plots per habitat across the whole data set, some of the previously removed plots were selected randomly and returned to the data set, ensuring that the total number of plots of the habitat was 20. Habitats with fewer than 20 plots in the whole data set were not resampled.
The resampled data set contained 233,352 plots, i.e. 28% of all the plots classified to Level 3 habitats, but it was more balanced and more representative than the original data set.
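The stratified resampling described above can be sketched in Python as follows. This is a simplified illustration, not the procedure of Knollová et al. (2005) as implemented in the software: the plot fields `habitat`, `lon` and `lat`, the function names and the fixed random seed are assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def grid_cell(lon, lat):
    """Index of the 5' (longitude) x 3' (latitude) grid cell containing a point."""
    return (math.floor(lon * 60 / 5), math.floor(lat * 60 / 3))


def resample(plots, min_plots=20, seed=0):
    """Keep one random plot per habitat and grid cell, topping up to min_plots."""
    rng = random.Random(seed)  # fixed seed only to make the example reproducible
    by_habitat = defaultdict(list)
    for plot in plots:
        by_habitat[plot["habitat"]].append(plot)

    kept = []
    for members in by_habitat.values():
        if len(members) < min_plots:
            kept.extend(members)  # habitats with too few plots are not resampled
            continue
        cells = defaultdict(list)
        for plot in members:
            cells[grid_cell(plot["lon"], plot["lat"])].append(plot)
        selected = [rng.choice(group) for group in cells.values()]
        if len(selected) < min_plots:
            # Return randomly chosen removed plots until the habitat has min_plots.
            selected_ids = {id(plot) for plot in selected}
            removed = [plot for plot in members if id(plot) not in selected_ids]
            selected += rng.sample(removed, min_plots - len(selected))
        kept.extend(selected)
    return kept


# Hypothetical data: X is clustered in one cell, Y is rare, Z is spread over 40 cells.
plots = ([{"habitat": "X", "lon": 16.60, "lat": 49.20} for _ in range(30)]
         + [{"habitat": "Y", "lon": 16.60, "lat": 49.20} for _ in range(5)]
         + [{"habitat": "Z", "lon": 10.0 + 0.1 * i, "lat": 50.0} for i in range(40)])
sample = resample(plots)
counts = {h: sum(1 for p in sample if p["habitat"] == h) for h in ("X", "Y", "Z")}
```

In this toy data set the 30 clustered plots of X are reduced to 20, the 5 plots of Y are kept unchanged, and all 40 plots of Z survive because each lies in a different cell.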
FIGURE 1 A scheme of the EUNIS-ESy classification expert system with eight hierarchical levels indicated by the numbers on the left side. Groups of habitats in the right-hand column are further separated using the formal definitions of individual habitats (Appendix S3). Some habitats have two or three (broader and narrower) alternative definitions, while others have a single main definition
Diagnostic species were determined based on species fidelity, i.e. the degree of concentration of species occurrences in each group of plots representing a Level 3 EUNIS habitat. Fidelity was calculated using the phi coefficient of association (Sokal and Rohlf, 1995; Chytrý et al., 2002) standardized as if each habitat was represented by the same number of plots (Tichý and Chytrý, 2006). The species with a value of phi greater than 0.15 for a particular habitat were considered as diagnostic for this habitat. This threshold was selected arbitrarily as a compromise between a stringent selection of few species with high diagnostic value (if phi was higher) and a lax selection of many species with weak diagnostic value (if phi was lower). However, the concentration of species occurrences in the habitat, even if expressed by a high value of the phi coefficient, may not be statistically significant for some habitats represented by a low number of plots in the data set. Therefore, the statistical significance of the species-habitat association was tested using Fisher's exact test (Sokal and Rohlf, 1995), and if not significant at p < 0.05, the species was excluded from the list of diagnostic species (Tichý and Chytrý, 2006).
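The fidelity calculation can be illustrated with a short Python sketch. It rests on two assumptions that should be checked against Tichý and Chytrý (2006): the standardization is implemented by treating every habitat as if it had the same number of plots, so that only the relative frequency of the species in each habitat enters the phi coefficient, and Fisher's exact test is computed as a one-sided (right-tail) test on the raw counts. The function names and the example numbers are hypothetical.

```python
import math


def phi_equalized(freqs, target):
    """Phi coefficient with all habitats treated as equally large.

    freqs  -- dict habitat -> relative frequency (0..1) of the species in that habitat
    target -- habitat for which fidelity is computed
    Derived from phi = (N*n_p - n*N_p) / sqrt(n*N_p*(N - n)*(N - N_p))
    with N = k habitats of unit size, N_p = 1, n_p = freqs[target], n = sum of freqs.
    """
    k = len(freqs)
    total = sum(freqs.values())
    if k < 2 or total == 0 or total == k:
        return 0.0  # undefined cases: no occurrence, or species present in every plot
    return (k * freqs[target] - total) / math.sqrt(total * (k - total) * (k - 1))


def fisher_right_tail(n_plots, n_target, n_occ, n_occ_target):
    """One-sided Fisher's exact test: P(at least n_occ_target occurrences in target)."""
    upper = min(n_occ, n_target)
    favourable = sum(math.comb(n_target, k) * math.comb(n_plots - n_target, n_occ - k)
                     for k in range(n_occ_target, upper + 1))
    return favourable / math.comb(n_plots, n_occ)


def is_diagnostic(phi, p_value, phi_threshold=0.15, alpha=0.05):
    """Apply the two criteria stated in the text."""
    return phi > phi_threshold and p_value < alpha


# Hypothetical species: frequent in habitat A, rare in B and C.
phi = phi_equalized({"A": 0.8, "B": 0.1, "C": 0.1}, "A")
# Hypothetical raw counts: 20 plots, 10 in the target habitat, 8 occurrences, all in target.
p_value = fisher_right_tail(20, 10, 8, 8)
```

For the example frequencies, the equalized phi agrees with the ordinary phi coefficient computed for three habitats of 100 plots each with 80, 10 and 10 occurrences.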
Constant species were defined as those with a constancy (= percentage occurrence frequency) of at least 10% in the target habitat.
This threshold is much lower than usually used for constant species of vegetation types in phytosociology.However, a lower value is needed for EUNIS habitats than for vegetation types, because many habitat types comprise several vegetation types occurring across broad geographic ranges with varying species composition; as a result, few species have a higher constancy across the whole habitat.
Dominant species were defined as those that occurred with a cover greater than 25% in at least 5% of vegetation plots classified to the target habitat. This means that a species is considered as dominant even if it does not belong to the tallest vegetation layer, and a single plot can have more than one dominant species. Conversely, a habitat can have no dominant species, especially if it has sparse vegetation cover.
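The thresholds for constant and dominant species given above translate directly into a short calculation. The following Python sketch assumes that each plot of the target habitat is available as a dictionary of species names and percentage covers; this data layout, the function name and the example species are illustrative assumptions.

```python
def constant_and_dominant(plots, min_constancy=10.0, cover_limit=25.0, min_share=5.0):
    """Return (constant, dominant) species sets for one habitat.

    plots -- list of dicts, each mapping species name -> percentage cover in one plot
    A species is constant if it occurs in at least min_constancy % of the plots,
    and dominant if its cover exceeds cover_limit % in at least min_share % of the plots.
    """
    n_plots = len(plots)
    occurrences = {}
    high_cover = {}
    for species in plots:
        for name, cover in species.items():
            occurrences[name] = occurrences.get(name, 0) + 1
            if cover > cover_limit:
                high_cover[name] = high_cover.get(name, 0) + 1
    constant = {name for name, k in occurrences.items()
                if 100 * k / n_plots >= min_constancy}
    dominant = {name for name, k in high_cover.items()
                if 100 * k / n_plots >= min_share}
    return constant, dominant


# Hypothetical habitat with 20 plots.
habitat_plots = [{"Fagus sylvatica": 60} for _ in range(20)]
habitat_plots[0]["Oxalis acetosella"] = 5
habitat_plots[1]["Oxalis acetosella"] = 5
habitat_plots[2]["Allium ursinum"] = 30
constant, dominant = constant_and_dominant(habitat_plots)
```

The example shows that the two categories are independent: a species found in a single plot out of 20 is not constant (5% constancy) but still counts as dominant if its cover there exceeds 25%.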
Records of taxa identified only to the genus level and records of epiphytic lichen species were removed from the characteristic species combinations. Records of other non-vascular plants (bryophytes and non-epiphytic lichens) were retained because many of these species are important ecological indicators. However, as they were not recorded in all plots, their calculated constancy values were likely underestimated. Their fidelity can be either underestimated (if they were sampled only in some proportion of plots of the habitat) or overestimated (if bryophytes and lichens were more often sampled in some habitats than in others). A solution would be to compute constancy values only for plots where bryophytes and lichens were recorded (or would have been recorded if present). However, this was not possible because vegetation plots without records of bryophytes and lichens in most cases do not contain information whether these species were really absent or just not recorded. Therefore, we reported the values calculated for bryophytes and lichens based on all the plots, but we emphasize that these values can be inaccurate and have to be interpreted with caution.
As a quality test of our results, we made a formal comparison of characteristic species combinations computed for the EUNIS forest habitats with an earlier established list of indicator species for French forest habitats prepared on the basis of a different data set (Gégout et al., 2009). This exercise, performed by national experts (J.-C. Gégout and L. Maciejewski), revealed a high degree of correspondence between the two lists, thereby indicating the reliability of the characteristic species combinations computed in our study.

| Habitat distribution mapping
Distribution maps for individual habitats were prepared by plotting the location of all vegetation plots classified to individual habitats (before stratified resampling) on a map. All the maps were checked for outlying locations, which in most cases pointed out either an error in coordinates or misidentification of an important species that led to an erroneous classification to a different habitat. Errors were corrected, a new classification was prepared, and both the characteristic species combinations and distribution maps were updated.
Because of a strong geographic bias in the available European vegetation plots, especially their low density in northern and eastern Europe, we indicated the locations of plots belonging to individual habitats on grid maps showing the regional density of plots belonging to the particular habitat group.Such maps indicate whether the absence of occurrences of the habitat in a region is likely real or caused by the absence of data from the region (Figure 2).

| Expert system software tools
A software tool to apply the expert system script to a data set of vegetation plots was developed within the Juice 7 program and, in a simpler form that does not contain all the functions, also in

| RESULTS
We developed formal definitions for 199 EUNIS habitats including 25 coastal (group N), 18 wetland (Q), 55 grassland (R), 43 shrubland (S), 46 forest (T) and 12 man-made (V) habitats (Table 1) and included them in the expert system (Appendix S3). We were un-
habitats, divided into diagnostic, constant and dominant species, are listed in habitat factsheets (Appendix S1) and also provided in a spreadsheet format (Appendix S4).
The distribution maps of these habitats are also included in habitat factsheets (Appendix S1). These maps include only localities of the vegetation plots identified by the expert system as belonging to the habitat. Therefore, they can be biased by the distribution of the available plots for some habitats (Figure 2). To estimate the magnitude of the potential bias, the densities of plots
Classification for nature conservation survey, planning, monitoring and reporting on the international, national and regional levels.
EUNIS comprises concepts of individual habitat types that resulted from discussions of international teams of experts and public consultations with national experts and practitioners, organized by the European Environment Agency (Rodwell et al., 2018). Therefore, the aim of the current study was not (and could not be) to revise this classification or concepts of individual habitats within it. Our aim was to develop formal definitions that would closely match the concepts of individual habitats and enable correct assignment of vegetation plots to these habitats. However, the work on this expert system was done in parallel with the EUNIS revision process in 2013-2019, and various experiences from developing formal definitions were fed back to the revision process and influenced its outcome (Schaminée et al., 2012, 2013, 2014, 2016a, 2016b, 2018, 2019, 2020). Further refinements of the formal definitions and expert system were made during the preparation of the current paper, based on the feedback from an international team of co-authors.
Therefore, the results presented here are an update of the work that was previously summarized in the reports cited above.
The present paper deals with the six habitat groups for which the EUNIS classification has already been revised (Schaminée et al., 2018, 2019, 2020)
Therefore, the expert system approach based on floristically defined vegetation types cannot be used to identify them.
Still, there are some habitats in these remaining habitat groups for which it will be possible to develop formal definitions and add them to the expert system, once these habitats are revised in the process guided by the European Environment Agency. In the marine habitats, the expert system would presumably also work with non-plant benthic species if data were available in a suitable form. These tasks remain for the future. Nevertheless, the current expert system includes preliminary definitions of the non-revised habitats from the remaining groups. In this way, its structure is prepared for the inclusion of new habitat definitions, which will replace the current preliminary definitions.

| Comparison with other expert systems
The expert system EUNIS-ESy presented here is not the first one developed for the classification of European vegetation plots. Other currently available pan-European expert systems were designed to identify phytosociological alliances or associations within a specific vegetation type, e.g., floodplain forests and alder carrs (Douda et al., 2016), beech forests (Willner et al., 2017), fens (Peterka et al., 2017), coastal dune grasslands (Marcenò et al., 2018), Mediterranean Lygeum spartum grasslands (Marcenò et al., 2019) and marshes (Landucci et al., 2020). Other expert systems have a more restricted geographic scope (see an overview with source codes at http://www.sci.muni.cz/botany/juice/?idm=25). These expert systems provided useful resources, and some species groups and decision rules proposed in some of them were used, with modifications, in EUNIS-ESy.
Some of the mentioned expert systems are designed to be applied only to the vegetation plots belonging to the broad habitat/vegetation type for which the expert system was developed. The plots not belonging to this scope have to be removed before classification; otherwise, they might be erroneously assigned to some of the types defined in the expert system. In contrast, EUNIS-ESy includes, in addition to the formal definitions of the coastal, wetland, grassland, shrubland, forest and man-made habitats, also preliminary definitions of all the other European vegetated habitat types.
As a result, it can be applied to any vegetation plot from Europe.
Some expert systems for vegetation classification were also developed outside Europe. The expert system for national forest vegetation classification of Taiwan (Li et al., 2013), provided in a code executable in the R program, used an approach similar to that of the European expert systems, following the principles outlined by Bruelheide (1997, 2000) and Kočí et al. (2003). A different approach, based on supervised or semi-supervised fuzzy classification performed using the noise clustering algorithm, was applied for matching new plots to the units of existing national vegetation classification in New Zealand (Wiser and De Cáceres, 2013; Wiser et al., 2016).

| Remarks on the practical application of EUNIS-ESy
EUNIS-ESy can be used to assign vegetation plots to habitat types using the script provided in Appendix S3.
• Plot location information. Because EUNIS-ESy classifies habitats (and not purely vegetation types), it requires that input data contain information on the location and some environmental features, in addition to species composition and covers. This information includes plots' geographical coordinates, altitude, and their location in specific countries, ecoregions (Dinerstein et al., 2017), on the coast, or in coastal dunes. Location-based criteria are used in 90 (48%) habitat definitions. For some habitats, they are essential, whereas for others they are only used for removing geographical outliers. If the information on location was missing in the input data, the expert system would classify the plots, but the classification might be wrong, especially for the habitats for which the location criteria are essential to the definition. If the location information (except coordinates) is not available in the input data, the Turboveg 3 export function can derive it from an overlay of plot coordinates with relevant GIS layers and store the values (e.g., location on coastal dunes or in a certain ecoregion) to the header data of vegetation plots. However, the expert system itself does not extract this information from plot coordinates.
• Tested vs not-tested habitats. The current version of the expert system was tested for the coastal, wetland, grassland, shrubland, forest and man-made habitats (EUNIS habitat groups N, Q, R, S, T and V). It also contains preliminary definitions of vegetated marine habitats, inland surface water habitats and inland sparsely vegetated habitats. However, these definitions have not been tested, and the proportion of plots misclassified by these definitions may be high. Indeed, the concept and delimitation of these habitats may change considerably in the process of EUNIS revisions.
• Classification accuracy at the regional level. EUNIS-ESy was designed for use in Europe and adjacent areas including Macaronesia, Anatolia, Cyprus and the Caucasus region. It may work well also in adjacent parts of Siberia, the non-desert part of Kazakhstan or in the biome of Mediterranean sclerophyllous vegetation in the Near East and northern Africa. However, the misclassification risk is higher there because the expert system was not tested for these regions. Misclassifications can also be more common in some regions within the geographical scope of this expert system such as Turkey, Cyprus and the Caucasus. These regions have high habitat and vegetation diversity but sparse data, which did not allow the same level of testing of the classification accuracy as in other parts of Europe.
• Classification accuracy at the local level. Even in Europe, some misclassifications can occur because the formal definitions of the habitats are optimized for the whole of Europe, which does not allow all local between-habitat differences in species composition to be considered. There are many pairs of species that clearly belong to different habitats in some European regions while sharing the same habitat in other regions. Therefore, in local applications of the expert system, the classification of specific vegetation plots should be considered as their suggested classification rather than their correct classification.
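Because missing location information does not stop the classification but can silently produce wrong results, it may be useful to check the header data before running the expert system. The following Python sketch is a hypothetical pre-check, not part of EUNIS-ESy, Juice or Turboveg: it uses the header variable names and category codes listed in the description of the location variables, while the function name and the dictionary layout of a header record are assumptions.

```python
COAST_VALUES = {"ARC_COAST", "ATL_COAST", "BAL_COAST", "BLA_COAST", "MED_COAST", "N_COAST"}
DUNE_VALUES = {"Y_DUNES", "N_DUNES"}
REQUIRED_FIELDS = ("Country", "Ecoreg", "Coast_EEA", "Dunes_Bohn",
                   "DEG_LAT", "DEG_LON", "Altitude (m)")


def header_problems(header):
    """List problems in the location fields of one plot header (empty list if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS
                if header.get(name) in (None, "")]

    coast = header.get("Coast_EEA")
    if coast not in (None, "") and coast not in COAST_VALUES:
        problems.append(f"unknown Coast_EEA value: {coast}")

    dunes = header.get("Dunes_Bohn")
    if dunes not in (None, "") and dunes not in DUNE_VALUES:
        problems.append(f"unknown Dunes_Bohn value: {dunes}")

    lat = header.get("DEG_LAT")
    if isinstance(lat, (int, float)) and not -90 <= lat <= 90:
        problems.append(f"latitude out of range: {lat}")

    lon = header.get("DEG_LON")
    if isinstance(lon, (int, float)) and not -180 <= lon <= 180:
        problems.append(f"longitude out of range: {lon}")

    return problems


# Hypothetical header record; the ecoregion code is a placeholder value.
header = {"Country": "Iceland", "Ecoreg": 100, "Coast_EEA": "N_COAST",
          "Dunes_Bohn": "N_DUNES", "DEG_LAT": 64.5, "DEG_LON": -18.2, "Altitude (m)": 0}
```

Plots for which `header_problems` returns a non-empty list could be set aside or have their location fields derived from coordinates before classification.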

| Characteristic species combinations and distribution maps of habitats
Characteristic species combinations provided in this study are based on a statistical analysis of a large database of European vegetation plots, following the approach originally proposed by Chytrý and Tichý (2003) for an analysis of the Czech National Phytosociological Database and subsequently used for analyses of vegetation-plot databases in other countries (Jarolímek and Šibík, 2008; Kącki et al., 2013).
The formal division of the characteristic species combination into diagnostic (concentrated in the habitat), constant (frequent, but not necessarily concentrated) and dominant (often attaining a high cover) provides a comprehensive characterization of each habitat through its plant species.
However, both of these compilations are based on data from various sources and concepts developed independently by various experts, which introduces some inconsistencies. Moreover, these lists do not distinguish between diagnostic, constant and dominant species. In contrast, our lists are consistent across habitats, clearly discriminate the three categories of species included in the characteristic species combination, and provide a numerical ranking of importance of each species within each category. Therefore, they can be used for various analyses and practical applications as the most reliable source so far of information on species composition of different European habitats. However, it is important to note that although we used a geographically stratified resampling of the data set, these species lists are, to some extent, biased due to considerable differences in vegetation plot density among European regions. In particular, species from northern and eastern Europe can be underrepresented in these lists. Moreover, information on bryophyte and lichen species is affected by the lack of their recording in many vegetation plots.
Nevertheless, once the data sets from undersampled regions and with a more consistent recording of non-vascular plants become available, an extended data set can be classified by the current expert system and the species lists of characteristic species combinations can be updated.
The maps of habitat distribution based on the available European vegetation plots appear realistic for the habitats restricted to western, central and southern Europe, but have many gaps for the habitats occurring in or extending to the north and east. Distribution ranges of habitats can be successfully modelled if the original records of habitat occurrence cover a large part of the real range (Jiménez-Alfaro et al., 2018). However, our modelling exercise (Schaminée et al., 2014) yielded unstable and often incorrect results, especially in extrapolations to data-poor areas in eastern Europe.
Therefore, we refrained from complementing the maps with models here. Further work should be focused on identifying the areas with the most important data gaps for individual habitats and collecting data from such areas. The number of plots classified to each habitat reported in Table 1 partly reflects the occurrence frequency of each habitat in Europe, but it also reflects the research intensity. Special attention should be paid to the habitats that are so far represented by very few plots.

| CONCLUSIONS
The expert system EUNIS-ESy introduced in this paper has been shown to effectively assign vegetation-plot records to EUNIS habitats with a high level of accuracy as evaluated by expert judgement.
The novel possibility of combining floristic data with plot-specific geographic or environmental data as classification criteria allows enormous flexibility, which makes it possible to apply the expert system not only to floristically defined vegetation types but also to other habitat types that are defined through a combination of vegetation type and other criteria.
The expert system approach to habitat identification has several advantages: (a) it enables identification of vegetation plots
A - Marine habitats, B - Coastal habitats, C - Inland surface waters, D - Mires, bogs and fens, E - Grasslands and lands dominated by forbs, mosses or lichens, F - Heathland, scrub and tundra, G - Woodland, forest and other wooded land, H - Inland unvegetated or sparsely vegetated habitats, I - Regularly or recently cultivated agricultural, horticultural and domestic habitats, and J - Constructed, industrial and other artificial habitats.
from the SynBioSys Taxon Database and the names from the original databases that did not match any name in the SynBioSys Taxon Database were translated to the nomenclature of the Euro+Med PlantBase (Euro+Med, 2006-2020; ww2.bgbm.org/EuroPlusMed), using a complete list of accepted names and synonyms of European and Mediterranean vascular plant taxa provided by the Berlin-Dahlem Botanical Garden and Botanical Museum in February 2020.
Species-based (or in general, taxon-based) assignment rules typically consist of three components: (a) a taxon specifier; (b) a criterion; and (c) a relational operator. The taxon specifier can be a single species or a pre-defined species group. The criteria are based on either occurrence or cover. The occurrence criterion is the presence or absence of a species or a species group in a plot.

• Ecoreg - a quantitative variable containing the three-digit codes of the terrestrial ecoregions (Dinerstein et al., 2017; see https://ecoregions2017.appspot.com/). Ecoregions were only used in the definitions of some shrubland and forest types
• Coast_EEA - a qualitative variable indicating whether the location is on the coastline, including a buffer distance of up to 5,000 m from the coast. A digital coast map provided by the European Environment Agency (https://www.eea.europa.eu/data-and-maps/data/eea-coastline-for-analysis-1/gis-data/europe-coastline-shapefile) was used to identify coastal plots based on their geographic coordinates. The categories are as follows: Arctic (ARC_COAST), Atlantic (ATL_COAST), Baltic (BAL_COAST), Black Sea (BLA_COAST), Mediterranean (MED_COAST) and Not on the coast (N_COAST)
• Dunes_Bohn - a qualitative variable indicating whether the location is on coastal dunes. A digital version of the Map of Natural Vegetation of Europe (Bohn et al., 2003) was used, and the plots with geographic coordinates corresponding to the mapping units from P1 to P16 (coastal dune vegetation) were given the value Y_DUNES, whereas the others were given the value N_DUNES
• DEG_LAT, DEG_LON - quantitative variables containing the degrees of latitude and longitude of the plot in the coordinate system WGS 84, format DD.DDDD. Western longitudes are indicated with a minus sign
• Altitude (m) - a quantitative variable containing the altitude of the plot in metres above sea level.
CHYTRÝ et al.
An example of a definition containing location-based assignment rules (F12 Moss and lichen tundra):
• ((<#TC Arctic-alpine-bryophytes-lichens GR #T$> AND <#02Arctic-alpine-bryophytes-lichens>) NOT (<#TC Sphagnum GR 05> OR <#TC Trees GR 05>)) AND (<$$N DEG_LAT GR 65> OR <$$C Country EQ Iceland>)
Pannonian-and-Pontic-sandy-steppe> AND <#03 +04 R11-Pannonian-and-Pontic-sandy-steppe>) NOT <#TC Trees|#TC Shrubs GR 15>
This means that the sum of square-rooted percentage covers of a discriminating species group of the Pannonian and Pontic sandy steppe (including both the narrow specialists and frequently occurring less-specialized species) should be greater than the sum of square-rooted percentage covers of the discriminating species groups of any other habitat, and the plot should contain at least three species of this group, and the total cover of trees and shrubs

[...] the Turboveg 3 program. The syntax of the expert system was described by Tichý et al. (2019: their Appendix S1). The code for applying the expert system in the R program (www.r-project.org) was developed by Bruelheide et al. (https://git.loe.auf.uni-rostock.de/misc/ESy).
[We were not] able to develop formal definitions for 2 coastal, 3 grassland, 1 shrubland, 8 forest and 19 man-made habitats because EUNIS defines these habitats by features not associated with species composition and cover (e.g., abiotic habitat features, vegetation structure, successional age or the origin as plantation of site-native trees). Others of these non-defined habitats were mosaics of trees or shrubs and herbaceous vegetation (e.g., wooded pastures, orchards or vineyards), in which only the herbaceous component is usually recorded in vegetation plots. Although the habitats were defined at hierarchical Level 3 of the EUNIS classification, one habitat (Q3 Palsa and polygon mires) was defined at Level 2, because the two subordinate habitats at Level 3 could not be distinguished based on species composition. In addition, the expert system contains 46 preliminary definitions of the habitats of other groups: A - marine (coastal salt marshes), C - inland surface water and H - inland sparsely vegetated habitats. However, they have not been tested and require considerable revision in the future.

Of all 1,261,373 vegetation plots in the data set, 1,125,121 were classified to one of the six habitat groups N, Q, R, S, T or V.
Of those, 784,901 were classified directly to one of the habitats at hierarchical Level 3 (or Level 2 for Q3) and 341,944 were classified directly to habitat groups (i.e. Level 1 habitats). A further 73,188 plots were preliminarily classified to the habitat groups A, C and H or their Level 3 habitats, 59,745 plots remained unclassified and 3,319 plots were classified to more than one habitat.

The resulting characteristic species combinations for the EUNIS coastal, wetland, grassland, shrubland, forest and man-made [...]

FIGURE 2 The density of the vegetation-plot data expressed as numbers of plots in 50 km × 50 km grid cells of the 1,261,373 plots used for the classification (a) and the plots classified by the expert system to habitat groups N, Q, R, S, T, V (b-g) and individual habitats within these groups

[...] is the first tool that automatically classifies vegetation plots across Europe to habitat types of the EUNIS Habitat Classification. The development of this expert system represents a major step forward in the applicability of the EUNIS Habitat [...]

Applied Vegetation Science, Chytrý et al.

[Table header: EUNIS2020 code | EUNIS2007 code | Red List code | EUNIS2020 habitat name | No. of plots]
*Small coniferous planted other wooded land -

[...] definitions of habitats written as logical formulas that combine taxonomic specifiers, relational operators and threshold abundance criteria, which can be joined as required by the logical operators AND, OR or NOT. A technical description of the syntax of the expert system is provided by Tichý et al. (2019: their [...]).

[...] European Red List of Habitats (https://forum.eionet.europa.eu/european-red-list-habitats/library/terrestrial-habitats; see Janssen et al., 2016) and other sources. These initial lists were critically revised and extensively modified based on multiple classification trials with successive versions of EUNIS-ESy (Section 2.4).

• Location-based assignment rules (external classification criteria according to De Cáceres et al., 2015): information about plot location, such as geographical coordinates, which can be used for assigning plots to biogeographical regions (e.g., boreal vs temperate), landscape types (e.g., coastal vs inland) or altitudinal belts.
[...] expert-based lists of diagnostic species for European vegetation classes defined in EuroVegChecklist and included them in an expert system. In this EuroVegChecklist expert system, diagnostic species were used directly as discriminating species in our terminology (see Section 2, Methods), which provides an acceptable classification for many plots. However, unless diagnostic species lists are optimized for the purpose of identification of vegetation types, the classification error rate in expert systems is relatively high (Tichý et al., 2019), which is the case for the EuroVegChecklist expert system, which applies the same classification approach across all vegetation formations. Another expert system on the European scale was developed by Giannetti et al. (2018) for the classification of European Forest Types produced by forestry experts as a tool for sustainable forest management (EEA, 2006; Barbati et al., 2014). This expert system classifies [...] based on the dominant tree species, their basal area and information derived from plot location, e.g., altitude, biogeographical region or occurrence in wetland areas. Like EUNIS-ESy, it is a rule-based expert system (Grosan and Abraham, 2011) using the information provided by human experts, both on the dominant species and location, as classification criteria.

[...] This script can be run in Juice (https://www.sci.muni.cz/botany/juice/), Turboveg 3 or in R using the code developed by Bruelheide et al. (https://git.loe.auf.uni-rostock.de/misc/ESy). Detailed instructions for running the expert system in Juice are available in Appendix S5. For proper functioning, users have to consider especially these points:
• Taxonomic harmonization. Taxon nomenclature and taxonomic concepts in the input data set of vegetation plots should correspond to those of the Euro+Med PlantBase. If an export from the EVA database is used for the analysis, nomenclature can be automatically converted to the Euro+Med standard in Turboveg 3 using the in-built SynBioSys Taxon Database. Standard data ex[...]
• Species cover vs presence data. Because EUNIS-ESy uses the information on species cover, it cannot reliably classify data in which only species presences (not covers) are recorded. If applied to such data, this expert system can correctly classify some plots of grasslands and other open-land habitats, but it consistently fails to provide the correct classification of shrubland, forest and some other habitats. Therefore, we do not recommend applying EUNIS-ESy to presence-only data.
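Because the recommendation above depends on whether covers were actually recorded, it can be useful to screen a data set before running the expert system. The following Python sketch shows one possible heuristic; it is not part of EUNIS-ESy, Juice or Turboveg 3. The function name `looks_presence_only` and the assumption that presence-only data carry a single constant cover value (for example 1 for every record) are hypothetical.

```python
def looks_presence_only(plots):
    """Return True if all positive cover values in all plots are identical.

    Assumption: presence-only data are stored with one constant value
    (e.g., 1) in place of real percentage covers, so a data set with no
    variation in cover carries no usable cover information.
    """
    distinct = {c for plot in plots for c in plot.values() if c > 0}
    return len(distinct) <= 1


# Hypothetical plot records (species -> recorded value).
presence_data = [{"sp_a": 1, "sp_b": 1}, {"sp_c": 1, "sp_a": 1}]
cover_data = [{"sp_a": 25, "sp_b": 1}, {"sp_c": 60}]

print(looks_presence_only(presence_data))  # True
print(looks_presence_only(cover_data))     # False
```

A data set flagged by this check would make every cover threshold in a definition uninformative, which is consistent with the failure for shrubland and forest habitats described above. A data set that passes is not guaranteed to contain reliable covers.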