Composite landscape predictors improve distribution models of ecosystem types

Distribution modelling is a useful approach to obtain knowledge about the spatial distribution of biodiversity, which is required, for example, for red‐list assessments. While distribution modelling methods have been applied mostly to single species, modelling of communities and ecosystems (EDM; ecosystem‐level distribution modelling) produces results that are more directly relevant for management and decision‐making. Although the choice of predictors is a pivotal part of the modelling process, few studies have compared the suitability of different sets of predictors for EDM. In this study, we compare the performance of 50 single environmental variables with that of 11 composite landscape gradients (CLGs) for prediction of ecosystem types. The CLGs represent gradients in landscape element composition derived from multivariate analyses, for example “inner‐outer coast” and “land use intensity.”


| INTRODUCTION
Human impact transforms nature all over the world (Ellis, Goldewijk, Siebert, Lightman, & Ramankutty, 2010), and the need for sustainable management of ecosystems is increasing (Díaz et al., 2019).
To understand, monitor and manage Nature's diversity, that is, the variation in Earth's biotic and abiotic processes and features (see e.g. Zarnetske et al., 2019), we must know where this diversity is (Whittaker et al., 2005). Although management strategies such as red lists for ecosystems require systematic mapping of nature, that is, high-quality land cover maps of ecosystems (Keith et al., 2015), only a minor fraction of the Earth's surface has so far been mapped by field survey methods (Alexander & Millington, 2000). Remote sensing methods, although useful for a wide range of purposes, have not yet proven able to interpret community structure and species composition in ecosystems with the geographic and thematic accuracy required for many research and management purposes (Myers-Smith et al., 2020; Strand, 2013). Alternative, efficient pathways to information about the spatial distribution of ecosystems over large areas are therefore needed to enhance the precision and credibility of red lists and global change assessments. Recent guidelines and studies point to distribution modelling as a promising tool for this purpose (Bland, Keith, Miller, Murray, & Rodríguez, 2017; Horvath et al., 2019).
Distribution models are models that treat the geographic distribution of observable objects of a specific type (e.g. species) as a response to a set of supplied predictors (Halvorsen, 2012). Although single species are the most common target for distribution modelling (Henderson, Ohmann, Gregory, Roberts, & Zald, 2014), distribution modelling methods are, in principle, applicable to target objects of many kinds, for example, species assemblages or species groups, patterns of species richness (Santos et al., 2020), plant communities (Franklin, 2013; Jiménez-Alfaro et al., 2018; Ovaskainen & Soininen, 2011), potential vegetation (Hemsing & Bryn, 2012), present and past "vegetation types" (Horvath et al., 2019; Janská et al., 2017; Longcore, Noujdina, & Dixon, 2019) and "ecosystem types" (Halvorsen, 2012). In this article, we use the term "ecosystem-level distribution modelling" (EDM) as an umbrella term for distribution modelling with units above the species level as modelling targets. The term applies to biotic communities, defined by species composition, as well as to ecosystems and their abiotic components. Compared with distribution modelling of single species, EDM has technical as well as practical advantages such as increased power to detect shared environmental patterns for multiple species, and, perhaps more importantly, enhanced potential to generate results relevant for management and decision-making (Ferrier & Guisan, 2006). Nevertheless, neither the encouragement of Ferrier and Guisan (2006) in their review of EDM studies nor the increasing popularity of species distribution modelling (Araújo et al., 2019; Lobo, Jiménez-Valverde, & Hortal, 2010) has so far triggered extensive use of EDM. Several methodological questions in EDM therefore await closer examination.
Access to relevant predictor variables is pivotal for any application of distribution modelling (Araújo et al., 2019), but the theoretically optimal predictor set is difficult to identify as well as to obtain (Austin, 2002). Therefore, some important variables tend to be missing from most distribution models, reflecting: (a) lack of knowledge about which environmental factors cause the current distribution of a modelling target; and/or (b) lack of spatial data that represent processes and attributes known to be important (Austin, 2002; Barry & Elith, 2006). Few studies have explicitly addressed the relative suitability of different predictors for EDM (Halvorsen, 2012, but see Jiménez-Alfaro et al., 2018). Hence, a better understanding of which proximate variables cause the spatial distribution of different ecosystems is needed in EDM, along with better spatial proxies for these variables for applied modelling purposes. Jiménez-Alfaro et al. (2018) concluded that any community type at any hierarchical level may be modelled at continental extent, provided it is consistently defined by species composition and constrained by environmental factors. Hierarchy theory has shown that the aggregation of similar components into fewer composite units (i.e. numerous species into fewer functional types or species into communities and ecosystems) may reduce the number of variables required to obtain models of a given quality, that is, with a certain predictive power (e.g. Allen & Starr, 2017). However, a shift of modelling target from single species to community or ecosystem may require a reformulation of the "ecological model," that is, the theoretical basis for the modelling process (Austin, 2002). With reference to ecosystems as modelling targets, we define "ecosystem types" as recurrent abstract "units of assessment that represent complexes of organisms and their associated physical environment within an area" (Keith et al., 2015, based on Tansley, 1935).
The concept of the environmental complex-gradient (Whittaker, 1956), that is, a set of correlated environmental variables that act on the species in concert rather than one by one, is fundamental for describing and understanding variation in species' responses to the environment (Halvorsen, 2012). We hypothesize that the complex-gradient concept, commonly used to understand and describe species' relationships to the environment, can be extended to the landscape level as well as be implemented in studies of the distribution of ecosystem types. This extension implies that each level of ecological diversity contains subsystems at the level below (Turner & Gardner, 2015); landscapes contain ecosystems and other landscape elements, while ecosystems contain species and their environment (Halvorsen, Bryn, & Erikstad, 2016; Noss, 1990). Accordingly, we define a "complex landscape gradient" (CLG) as an "abstract continuous variable that expresses more or less gradual, coordinated change in a set of more or less strongly correlated landscape variables." Thus, CLGs are composite variables expressing parallel, gradual or discontinuous variation in the presence and/or abundance of landscape elements. We define a "landscape element" as a "natural or human-induced object or characteristic, including spatial units assigned to types at an ecological diversity level lower than the landscape level, which can be identified and observed on a spatial scale relevant for the landscape level of ecological diversity" (Halvorsen et al., 2016). Composite landscape gradients can be obtained from multivariate analyses of landscape element compositional data undertaken to reduce the dimensionality of an n-dimensional landscape-level hyperspace (Erikstad, Uttakleiv, & Halvorsen, 2015).
KEYWORDS conservation planning, distribution modelling, ecosystem classification, ecosystem types, IUCN Red List of Ecosystems, landscape gradients, spatial prediction, species response curves
Furthermore, segments along two or more CLGs can be combined into "landscape types," defined as "more or less uniform areas characterized by their content of observable, natural and human-induced landscape elements." Our definitions establish variation in landscape element composition along CLGs as a parallel to the spatiotemporal domain defined by Delcourt, Delcourt, and Webb (1982) as "meso-scale," capturing abiotic and biotic patterns that occur at spatial scales of approximately 10⁶-10¹⁰ m² in response to processes operating at temporal scales of 10-10⁴ years (e.g. geomorphological processes, climatic fluctuations, human land use, fire regimes, etc.). Analyses of data from Norway indicate that response curves of landscape elements (including ecosystems) along CLGs bear resemblance to species response curves along local environmental complex gradients (see Figure 1; Erikstad, Halvorsen, & Simensen, 2019); most ecosystems appear to have distinct optima along CLGs, that is, intervals in which they reach maximum occurrence probability. If this is the case, such landscape gradients may potentially be useful as predictors of ecosystem types in EDMs.
The aim of this study was threefold: (a) to explore how well distributions of ecosystem types can be predicted; (b) to compare the predictive power of different sets of predictors in EDM; and (c) to test whether EDM can be improved by using composite "landscape predictors" (CLGs and landscape types) as predictors.

| Study area
The study area comprised the entire mainland of Norway including coastal islands, but excluding the Svalbard archipelago, Jan Mayen and Bear Island, spanning latitudes from 57°57′N to 71°11′N and longitudes from 4°29′E to 31°10′E. Mainland Norway covers only 323,802 km², but comprises an exceptional range of natural variation, given its moderate size (Halvorsen et al., 2016), including both terrestrial, marine, limnic and snow and ice ecosystems. The study area is characterized by a wide range of climatic variation; all seven temperature-related vegetation zones commonly recognized in northern Europe (from boreo-nemoral to high alpine) occur in Norway (Bakkestuen, Erikstad, & Halvorsen, 2008). Norway has a high mineral and bedrock diversity (Ramberg, Bryhni, Nøttvedt, & Rangnes, 2008), and high diversity of landforms (Gjessing, 1978). In addition to natural variation, the diversity of ecosystems in Norway is enhanced by variation in human land use. Throughout history, most Norwegian ecosystems have been affected by land use activities such as domestic grazing, outfield fodder collection, heath burning, reindeer husbandry, forestry, and industrial, urban and recreational development (Almås, Gjerdåker, Lunden, Myhre, & Øye, 2004). The diversity of Norwegian ecosystems and landscapes is thoroughly described by the theoretical framework "Nature in Norway" (Halvorsen et al., 2016) from which terms, definitions and typologies applied in this study have been obtained.
FIGURE 1 "Landscape element response plot," showing the distributions of eleven landscape elements along a "complex landscape gradient" within inland hills and mountains, as identified by ordination axis 1 obtained by use of global non-metric multidimensional scaling (GNMDS). The gradient reflects variation in abundance of landscape elements from steep rugged barren mountains (high alpine areas, left side of axis 1) towards areas with gentle slope in the lowland (right). The response curve is derived from ordination of 85 landscape elements recorded in 3,966 sampling units throughout Norway. Axes are scaled in half-change (H.C.) units: one unit corresponds to 50% turnover of landscape element composition.

| Response variables
The response variables in our study are (the occurrence of) nine terrestrial ecosystem types whose management (i.e. conservation planning, general land use planning and red-list assessments) would benefit specifically from better knowledge of their spatial distributions (NBIC, 2018). The nine ecosystem types in our study are difficult to map reliably by remote sensing methods (Erikstad et al., 2009; Strand, 2013), and none of them are currently included in full-coverage data sets for Norway.
Data from field-based vegetation and ecosystem-type mapping during the period 2004-2018 were used as training data for parameterization of EDM models for each type (Figure 2). We chose ecosystem types with equivalent definitions in two different systems of types, so that different sources of field data could be combined (Bryn, Strand, Angeloff, & Rekdal, 2018; Bryn & Ullerud, 2018; see Appendices S2 and S6). The raw data for the response variables were collected from three sources: (a) regional 9 × 9 km-grid surveys (AR 9 × 9, e.g. Bryn et al., 2015; Rekdal & Angeloff, 2013); (b) a subset of vegetation maps produced in the period 2004-2014 (NIBIO, 2018); and (c) data from ecosystem-type mapping conducted by the Norwegian Environment Agency (2019). The raw response data were post-processed to reduce bias related to spatial clustering of the training data (e.g. spatial autocorrelation, see Appendix S4). The training data used for each ecosystem type consisted of all presences remaining after post hoc processing and a random sample of ~10 000 true absences (Nad'o & Kaňuch, 2018).

| Predictors
We use the term "predictor" as a collective term for variables potentially accounting for variation in response variables, including continuous variables as well as categorical variables with several classes (e.g. land cover types). All our predictors were generated with a grain (pixel) size of 100 × 100 m, or adapted to this grain size by rasterization from vector formats or interpolation by kriging (see Table 2 and Appendix S7 for further details). Strongly correlated variables (|τ| > 0.7) were omitted (Appendix S18). For model building, we used predictors from three qualitatively different predictor sets (basic climatic, geological, biological predictors, landscape predictors and neutral pseudo-predictors), and combinations of these (Table 2, Figure 2).
FIGURE 2 Methodological overview. For each of nine ecosystem types, we used one presence/absence response variable and predictors from qualitatively different predictor sets to build nine different models: two models from the 50 "basic" predictors (i.e. all single environmental variables including climatic variables and climatic variables only); two models from landscape variables (landscape types and complex landscape gradients, that is CLGs); four models from combinations of predictors from the other predictor sets; and one model from the control group of pseudo-predictors. The predictor sets represent different pools of predictors, available for model selection.
BIOCLIM variables); and (e) a land cover variable (9 classes of the "Land Resource Map" AR50).
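The correlation screening mentioned in this section (omitting variables with |τ| > 0.7) can be illustrated with a minimal sketch. It assumes that τ is Kendall's rank correlation (tau-b) and that, within a correlated pair, the variable listed later is the one dropped; the source does not state which member of a pair was removed, so that rule, the function names and the toy predictor values below are hypothetical.

```python
from itertools import combinations

def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length sequences (O(n^2), fine for a sketch)."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # pair tied in both variables: excluded from every term
        if dx == 0:
            ties_x += 1  # tied in x only
        elif dy == 0:
            ties_y += 1  # tied in y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = ((concordant + discordant + ties_x)
             * (concordant + discordant + ties_y)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0

def drop_correlated(predictors, threshold=0.7):
    """Greedy filter (assumed rule): keep a predictor only if |tau| <= threshold
    against every predictor already kept."""
    kept = []
    for name, values in predictors.items():
        if all(abs(kendall_tau_b(values, predictors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical values for eight raster cells.
predictors = {
    "elevation":   [10, 50, 120, 300, 650, 900, 1200, 1500],
    "temperature": [7.5, 7.1, 6.8, 5.2, 3.0, 1.9, 0.4, -1.2],  # monotone in elevation
    "slope":       [3, 14, 2, 25, 8, 30, 5, 12],
}
kept = drop_correlated(predictors)
print(kept)
```

In this toy case "temperature" is perfectly rank-correlated with "elevation" and is dropped, while "slope" is retained. The result of a greedy filter depends on the order of the candidates, which is one reason the rule is flagged as an assumption.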
The "landscape predictors" set consisted of 13 variables developed as a part of the new system for description and mapping of ecological diversity, Nature in Norway (Halvorsen et al., 2016; Appendix S11). Of these 13, the 11 CLGs represent variation in landscape element composition within three functional categories (abiotic, biotic and land use-related variation). We identified patterns of variation in landscape element composition by parallel use of detrended correspondence analysis (DCA; Hill & Gauch, 1980) and global non-metric multidimensional scaling (GNMDS; Minchin, 1987). To avoid circularity, the nine targets for EDM were deliberately selected not to be among the 85 variables used for identifying CLGs.
TABLE 1 Properties of modelled ecosystem types and model performance statistics (AUC and TSS). Ecosystem types (rows) are ordered by decreasing mean AUC for the n = 9 models for each ecosystem type.
The Nature in Norway analyses supported a division of the study area into six "major landscape types" identified by geomorphological criteria: inland hills and mountains; inland valleys; inland plains; coastal plains; coastal fjords; and coastal hills and mountains. To further subdivide these major types into "minor landscape types," we first extracted groups of highly correlated variables as candidates for CLGs within functional variable categories (abiotic, biotic and land use-related variation). For each of these variable groups, we used constrained ordination (RDA, ter Braak, 1985) with forward selection of variables to obtain a parsimonious and orthogonal set of variables to represent each CLG. The CLGs obtained were subsequently divided into a number of discrete intervals, depending on the total length of each CLG as measured in units of compositional turnover. Each interval comprised a fixed amount (8%) of compositional turnover along a CLG. GIS-based proxies for the 11 CLGs were finally projected to the entire study area (Table 2, Appendices S12 and S20). Landscape types were obtained by combining segments along all CLGs identified as important for a given major landscape type; every unique combination of intervals along the set of relevant landscape gradients defined one landscape type. Hence, the twelfth and thirteenth "landscape predictor" consisted of six major and 284 minor landscape types, respectively.
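The step from continuous CLGs to discrete landscape types can be sketched as follows. This is only an illustration of the idea that every unique combination of gradient intervals defines one type; the interval width, the assumption that gradient scores start at zero, and all names and values are hypothetical and not taken from the study.

```python
import math

def segment_index(score, interval_width):
    """0-based index of the gradient interval containing a score.
    Assumes scores are measured from 0 in units of compositional turnover."""
    return int(math.floor(score / interval_width))

def landscape_types(cells, interval_width):
    """Assign each cell a type label; cells sharing the same combination of
    interval indices along all gradients receive the same label."""
    types = {}    # combination of interval indices -> type label
    labels = []
    for scores in cells:
        key = tuple(segment_index(s, interval_width) for s in scores)
        labels.append(types.setdefault(key, len(types) + 1))
    return labels, types

# Four hypothetical cells scored along two hypothetical gradients.
cells = [(0.2, 1.3), (0.4, 1.1), (0.7, 1.3), (0.2, 0.1)]
labels, types = landscape_types(cells, interval_width=0.5)
print(labels)  # cells 1 and 2 fall in the same interval combination
print(types)
```

The first two cells share the same interval on both gradients and therefore the same type, whereas the other two differ on one gradient each. With more gradients and intervals, the number of realized combinations grows quickly, which is consistent with a few major types being subdivided into many minor types.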
The "neutral predictors" set consisted of 11 variables derived from "neutral landscape models" (Gardner, Milne, Turner, & O'Neill, 1987; Appendix S13) as a control group for estimating the magnitude of differences in model performance that could arise by chance alone (see e.g. Fourcade, Besnard, & Secondi, 2018). These predictors, referred to as "pseudo-predictors," are completely artificial but show similar levels of spatial autocorrelation as the basic and landscape predictors.
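To illustrate what a spatially autocorrelated but ecologically meaningless pseudo-predictor might look like, the sketch below smooths a raster of random noise. This is not the neutral landscape model of Gardner et al. (1987) nor the procedure used in the study (see Appendix S13 for that); a repeated moving average is swapped in here purely as a simple stand-in, and the raster size, seed and function names are arbitrary assumptions.

```python
import random

def random_raster(n_rows, n_cols, seed=0):
    """Raster of independent uniform random values (no spatial structure)."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]

def smooth(raster):
    """One pass of a 3 x 3 moving average (edge cells use the neighbours that exist)."""
    n_rows, n_cols = len(raster), len(raster[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [raster[i][j]
                      for i in range(max(0, r - 1), min(n_rows, r + 2))
                      for j in range(max(0, c - 1), min(n_cols, c + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def lag1_correlation(raster):
    """Pearson correlation between each cell and its right-hand neighbour."""
    a = [row[c] for row in raster for c in range(len(row) - 1)]
    b = [row[c + 1] for row in raster for c in range(len(row) - 1)]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

noise = random_raster(30, 30)
pseudo = smooth(smooth(noise))   # two passes give smoother, patchier patterns
print(lag1_correlation(noise), lag1_correlation(pseudo))
```

The smoothed raster has a much higher neighbour correlation than the raw noise, yet carries no environmental information, which is the property a control predictor needs.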

| Model building
We fit generalized linear models (GLM; McCullagh & Nelder, 1989) with logit link function and binomial errors ("logistic regression") to the occurrence probability of each of the nine ecosystem types, as recommended for presence/absence data by Elith and Leathwick (2009). "Derived" predictors were obtained from original predictors by seven different transformation types: linear, monotonous, deviation, forward hinge, reverse hinge, threshold and binary (Vollering, Halvorsen, & Mazzoni, 2019). The effect of variable transformation is that the functional relationship between the occurrence of the modelled target and a predictor can be described more flexibly than if only the original predictors were allowed to enter the model (Vollering et al., 2019).
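The role of derived predictors in a logistic model can be illustrated with a small sketch. The functional forms below are simplified assumptions chosen for illustration, not the exact definitions of Vollering et al. (2019); the knots, optimum and coefficients are hypothetical, and no model fitting is performed.

```python
import math

def forward_hinge(x, knot, x_max):
    """0 up to the knot, then rising linearly to 1 at x_max (for x <= x_max)."""
    return max(0.0, (x - knot) / (x_max - knot))

def reverse_hinge(x, knot, x_min):
    """1 at x_min, falling linearly to 0 at the knot, and 0 beyond it (for x >= x_min)."""
    return max(0.0, (knot - x) / (knot - x_min))

def threshold(x, cut):
    """Step function: 0 below the cut, 1 at or above it."""
    return 1.0 if x >= cut else 0.0

def deviation(x, optimum):
    """Absolute distance from an assumed optimum; allows unimodal responses."""
    return abs(x - optimum)

def occurrence_probability(intercept, terms):
    """Inverse logit of a linear predictor; terms = [(coefficient, derived_value), ...]."""
    eta = intercept + sum(b * v for b, v in terms)
    return 1.0 / (1.0 + math.exp(-eta))

def p_along_gradient(x):
    """Hypothetical response along a gradient scaled 0-1, with optimum at 0.4."""
    return occurrence_probability(1.0, [(-6.0, deviation(x, 0.4))])

for x in (0.0, 0.4, 1.0):
    print(x, round(p_along_gradient(x), 3))
```

With a negative coefficient on the deviation term, the predicted probability peaks at the assumed optimum and declines on both sides, which is the kind of unimodal response a single linear term cannot represent.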
We selected variables by using an automated forward stepwise selection procedure based on F tests of nested models. First, a representative group of "derived predictors" was selected for each individual predictor based on their explanatory power. Single derived predictors were added to the model until no more derived predictors could be added that satisfied the pre-set threshold significance criterion (α = 0.001, see Vollering et al., 2019). Second, selection of predictors (each represented by a set of selected derived predictors) was performed by the same forward selection procedure (Vollering et al., 2019; see Appendices S8, S16 and S20 for details).
The full variable selection procedure was repeated for all predictor sets (1-9) described in Table 3. The resulting 81 models thus represented unique combinations of predictor variables and targeted ecosystem types. For models with two sets of predictors joined by a "+" sign, landscape or neutral variables (explaining a significant amount of variation) were added to the best model derived from the basic predictors, to test the effect of adding the second set on model performance (see Bailey, Boyd, & Field, 2018). An example of models derived from different predictor sets is provided in Table 4, for the ecosystem type T32 Semi-natural grasslands.

| Model evaluation
For evaluation of EDM models, we used a data set that was collected independently of the training data. This data set was obtained from the sample-based area frame field survey programme AR18×18 (Strand, 2013). The survey, which was conducted in 2004-2014, included a systematic survey of 1,081 plots, each 0.9 km², distributed over the Norwegian mainland according to the 18 × 18 km LUCAS grid (Strand, 2013). The survey encompassed variation along all major environmental gradients recognized in Norway, covering the full spatial extent of the study area (see Appendices S3, S4, S19 and S20).
We applied two discrimination metrics for evaluation of EDM models: the area under the receiver operating characteristic curve (AUC; Fielding & Bell, 1997) and the true skill statistic (TSS) maximized for model specificity and sensitivity (Liu, Berry, Dawson, & Pearson, 2005).
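Both metrics can be computed directly from observed presences/absences and predicted probabilities. The sketch below is a plain-Python illustration under the assumption that TSS is defined as sensitivity + specificity - 1 and maximized over all candidate thresholds; the evaluation data are made up, and the function names are hypothetical.

```python
def auc(y_true, scores):
    """AUC as the probability that a random presence receives a higher score
    than a random absence (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def max_tss(y_true, scores):
    """Return (largest TSS, threshold at which it is first reached), where
    TSS = sensitivity + specificity - 1 and a cell is predicted present if score >= threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best = (-1.0, None)
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        tss = tp / n_pos + tn / n_neg - 1.0
        if tss > best[0]:
            best = (tss, t)
    return best

# Made-up evaluation data: 1 = presence, 0 = absence.
y = [1, 1, 1, 0, 1, 0, 0, 0]
p = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
auc_value = auc(y, p)
tss_value, cut = max_tss(y, p)
print(auc_value, tss_value, cut)
```

AUC is threshold-independent, whereas TSS depends on the cut-off chosen to convert probabilities into presence/absence; reporting the maximized TSS removes that arbitrary choice from the comparison between models.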

| Model performance
AUC values for the 81 models obtained for the nine ecosystem types ranged from 0.524 to 0.919 (Table 3). The best EDMs for seven out of nine ecosystem types were classified as "good" (AUC > 0.8), two of these as "excellent" (AUC > 0.9). The best models as judged by AUC were developed for T22 "Dry grass alpine heath," T34 "Coastal heathland" and T32 "Grassland" (Table 1). The ecosystem types discriminated with lowest success were V3 "Bog" and V1 "Open fen."

| Performance of the predictors
When all variables were available for model building (i.e. predictor set = "all"), the variables that were most often included in the models were as follows: "quaternary geology" (included in models for n = 8 ecosystem types); AR50 land cover (n = 6); "standard curvature" (a local morphometric terrain parameter, n = 6); CLG "distance to coast" (n = 5); and CLG "vegetation cover" (n = 5).

TABLE 3
Note: Predictor sets represent non-overlapping "pools" of candidate predictors, from which any variable that explained a sufficient amount of variation could, in principle, be included in the models. Predictor sets containing a + sign represent cases for which selected variables from the last-mentioned predictor set were added to the final derived model for the ecosystem type, obtained solely by the first mentioned predictor set. Predictor sets (rows) are ordered by decreasing rank sum, based upon single-model AUC. Entries sharing a letter in the column "similar sets" (e.g. "ab") do not have significantly different median values at the p < .05 level (Wilcoxon paired rank-sum test). Each ecosystem type is modelled separately, and the number of ecosystem-type models is the same (n = 9) for all predictor sets.
Excluding the neutral predictor set, which produced low AUC values, model performance differed less among predictor sets than among ecosystem types (Figure 3). However, the Kruskal-Wallis rank-sum test revealed highly significant differences in AUC values also among predictor sets (χ² = 33.085, p < .001, df = 8). The best EDMs in terms of both AUC and TSS (

TABLE 4
Example showing nine models with different combinations of predictors for one of the nine ecosystem types: T32 Semi-natural grassland.

Type example T32 Semi-natural grassland
Semi-natural grassland includes meadows formed by forest or shrub clearance followed by livestock grazing and/or haymaking, provided that they have not been subjected to ploughing, reseeding or heavy fertilization. The vegetation is dominated by graminoids and herbs; nitrophilous species are not prominent. Semi-natural grassland may be open (treeless) or, also when actively managed, have an open tree layer (wooded or coppice meadows). Land management intensity, lime richness and risk of drought are the most important LECs. Since the middle of the 20th century, traditional use of semi-natural grasslands has decreased, and conversion into arable fields or agriculturally improved grassland, or abandonment, has taken place.
Table 3 and Figure 3).

| Model parsimony
The simplest models, obtained solely from landscape variables (Table 3, Figure 4 and Appendix S17: Figure S17 Figure 4 shows that above a certain minimum number of predictors (≈10), the predictive power was negatively affected by including more predictors.

| How well can ecosystem types be predicted?
Our results that EDMs for seven out of nine ecosystem types are "good" (AUC > 0.8) or "excellent" (AUC > 0.9) according to the criteria of Araújo et al. (2005), based upon independent model evaluation, show that discrimination of ecosystem types over large areas at a high spatial resolution (100 × 100 m) is possible. We thus consider our models to be of value for several practical management purposes, such as preparations for field-based mapping, as a supporting tool in red-list assessments, and as a knowledge base for land use and conservation planning at a regional scale in cases where detailed mapping is either not feasible or not necessary. Our results thus support the conclusion of Ferrier and Guisan (2006) that EDM deserves to be used more often, and more widely, as an alternative or a supplement to modelling of individual species.
The use of "pseudo-predictors" confirms that the model performance metrics reliably discriminate between models based upon randomly created gradients/patterns and models derived from real data, that is, environmental variables in the widest sense.
Nevertheless, the substantial variation in predictive performance among pseudo-predictors calls for caution against over-interpretation of small differences in model performance also among sets of real predictors. However, the statistical tests demonstrate significant differences in the predictive power of different predictor sets in our study, even when the numerical differences in performance statistics are small. Moreover, due to inherent properties of the model performance metrics, relatively smaller differences represent more strongly significant model improvements in the upper range of AUC values, than at the lower values (see Fielding & Bell, 1997). We thus interpret the results of our analyses as sufficiently clear to represent general patterns in the overall ecological response of ecosystem types to different sets of explanatory variables.

| Finding the best predictors
Our results support the idea that identifying an ideal combination of predictors for each modelled target is critically important in EDM, like in species distribution modelling. The models with highest predictive power for each ecosystem type were characterized by a unique combination of predictors, not duplicated by models for any other ecosystem type in the study. This result is in line with ecological theory, since the ecosystem types in our study are defined by differences in species composition as explained by variation along different sets of local environmental complex gradients and subject to the action of different structuring processes (Halvorsen et al., 2016; see also Appendix S2). Accordingly, variables to be used for EDM should be specifically selected for each ecosystem type.
The poor performance of climatic predictors alone indicates that current regional, that is climatic, variables alone are insufficient for explaining the distribution of ecosystem types at the resolution of our study. This is not surprising since the ecosystem types modelled in our study are defined by local complex variables and not by regional climatic conditions (see Appendix S2). Climatic predictors showed higher predictive power for ecosystem types well known to have a high probability of presence within a specific climatic region (e.g. T14 Exposed ridge and T22 Arctic-alpine dry grass heath). Furthermore, the current climate does not necessarily reflect the prevailing conditions in the period when the modelled target was established; there might, for instance, be a time lag in species' responses to changes in environmental conditions (Bertrand et al., 2011; Guisan, Thuiller, & Zimmermann, 2017; Maiorano et al., 2013).
The fact that "basic predictors" (also including geological and land cover variables) performed significantly better than the cli-

| Do "landscape predictors" improve distribution models of ecosystem types?
CLGs performed reasonably well as predictors of ecosystem types in our study: for a majority of ecosystem types, EDMs based upon CLG predictors efficiently accounted for considerable amounts of variation (i.e. high predictive power with few variables). The individual models (Table 4, Appendix S6) indicate that the CLGs enable flexible fitting of relationships between response and predictors. An example is the red-listed ecosystem type "T32 Semi-natural grassland." The abundance of this ecosystem type increases gradually with the intensity of human land use up to a certain point, before it decreases towards heavily utilized or highly urbanized areas (see frequency of observed presence-plot in Appendix S16). This unimodal relationship is well captured by the CLG "land use intensity," which includes the total abundance of buildings, infrastructure and man-induced land cover types. This information is lost in, for example, a traditional land cover map, in which the internal variability within, and the relationship between, the discrete, non-ordered classes are hidden.
Notably, the CLGs are indirect (distal) gradients that reproduce local landscape gradients at meso-scale, but that do not allow direct mechanistic modelling of the processes which give rise to the observed patterns. However, since direct (proximal) predictors that represent multiple drivers, operating over long time spans, are difficult or impossible to represent by adequate proxies, the aggregated patterns that result from such processes (CLGs) may in some cases serve as better surrogates for these processes than, for example, the current climate so often used for species distribution modelling. Our study demonstrates that CLGs extracted by ordination of landscape data may be of predictive significance for single ecosystem types that were not subject to prior ordination. The good performance of the CLGs demonstrated in this study provides an interesting parallel to distribution modelling studies at the species level for which it has been demonstrated that gradients obtained by multivariate methods (e.g. ordination axes) are better predictors of species diversity than single environmental variables (e.g. Ejrnaes, 2000; Margules, Nicholls, & Austin, 1987; Santos et al., 2020). Gradient analysis at the landscape level is to a large extent still unexplored (but see Luck & Wu, 2002).

| Missing predictors
In line with, for example, Ullerud, Bryn, and Klanderud (2016) and Horvath et al. (2019), we find that some ecosystem types are more easily discriminated by means of available predictors than others. This is expected, since different ecosystem types will typically have different response curves to underlying environmental gradients.
Our results show that it is inherently more difficult to discriminate presence from absence for ecosystem types with broad and flat-topped (platykurtic) response curves along identified gradients than for types with sharper-peaked (leptokurtic) response curves. However, differences in model performance may also arise due to "missing predictors," that is, lack of spatial data representing attributes known to be important (see Barry & Elith, 2006).

| Model complexity
After initial model selection based on internal evaluation (see Vollering et al., 2019), simple models performed consistently better in external evaluation data than more complex models (Figure 4).
Our results indicate that the use of CLGs as predictors in EDM may

| Conservation and management implications
In the procedure and criteria for red-list assessments of ecosystem types (IUCN, 2016), spatial distribution plays a key role. For many ecosystem types, the detailed distribution is unknown, and in practice, the assessment is therefore accomplished by expert judgements (NBIC, 2018). A good EDM can provide information about the distribution that may support and improve these expert judgments considerably. Together with risk modelling, cause-effect modelling and modelling of changes in distribution over time, EDM has been suggested as a preferred method for such assessments (Bland et al., 2017).
Development of high-quality prediction maps is a cost-effective tool for mapping the distribution of species and ecosystems, allowing for cost-efficient field efforts. The prediction maps developed in our study (Figure 5, Appendix S21) are suitable for use as a supporting tool in the upcoming red-list assessments for Norway (NBIC, 2018) as well as for planning of field-based mapping of red-listed ecosystem types (NBIC, 2018; Norwegian Environment Agency, 2019).
FIGURE 5 Examples of predictors and spatial prediction maps at a resolution of 100 × 100 m for the entire study area, the mainland of Norway, spanning 13 latitudinal and 27 longitudinal degrees. (a) "Bioclim 1, annual mean temperature," one of the 50 "Basic predictors," (b) "CLG land use intensity," one of the 11 complex landscape gradients (CLGs). (c) "Minor landscape types." (d-f) Spatial prediction of the ecosystem types: "T27 Boulder field"; "T32 Semi-natural grasslands"; and "V2 Mire and swamp forest."

| CONCLUSIONS
The current spatial distribution of ecosystems is the result of climatic, geological, biological and human land use-related processes that have been acting in concert over thousands of years. Identifying variables that can predict the outcome of such aggregated processes is inherently difficult, but critically important in EDM. Our results indicate that ecosystems, just like species, may have distinct optima (maximum probability of occurrence) within specific segments of broad-scale gradients in the landscape. We suggest that improvements in EDM may be achieved by combining the development of better proxies for missing predictors with more knowledge about the distributions of ecosystems along complex gradients in the landscape, at several scales and levels. EDMs may complement field-based mapping and remote sensing in improving our knowledge about the spatial distribution of ecosystems. Such knowledge is essential for planning and management purposes, since the biodiversity of our planet cannot be managed species by species.

DATA AVAILABILITY STATEMENT
R-scripts, spatial predictions and raster data for the composite landscape predictors and the neutral predictors supporting the results in the paper are available for download from the DRYAD database at https://doi.org/10.5061/dryad.cjsxksn33. Other spatial data produced in this work are available on request from the authors. Owing to ownership restrictions on the original area frame survey data (AR18×18), these data are not openly available from the authors.