Data‐driven bioregionalization: A seascape‐scale study of macrobenthic communities in the Eurasian Arctic

We conduct the first model‐based assessment of the biogeographical subdivision of Eurasian Arctic seas to (1) delineate spatial distribution and boundaries of macrobenthic communities on a seascape level; (2) assess the significance of environmental drivers of macrobenthic community structures; (3) compare our modelling results to historical biogeographical classifications; and (4) couple the model to climate scenarios of environmental changes to project potential shifts in the distribution and composition of macrobenthic communities by 2100.


| INTRODUC TI ON
Human demands for ocean space and resources are rapidly growing, adding to the increasing cumulative impacts of multiple stressors related to climate change (e.g. Ingeman et al., 2019), such as ocean warming and increasing thermal stratification, sea-ice extent and thickness decline, as well as changes in water circulation. These profound environmental shifts lead to far-reaching alterations of the composition and structure of marine populations and communities and cause pronounced quantitative and qualitative changes in ecosystem functioning (McGill et al., 2015;Pecl et al., 2017). Without adequate conservation measures, this will result in a loss of biodiversity, resilience, ecosystem services and functioning (Halpern et al., 2015;Worm et al., 2006).
In the Arctic, global warming is about twice as rapid as worldwide (IPCC, 2014), and sea-ice retreat is the most relevant climate factor for marine ecosystems (Macias-Fauria & Post, 2018;Post et al., 2013). Both climate impacts are causing profound qualitative and quantitative changes in the structure and dynamics of benthic communities (Kedra et al., 2015;. However, information on these processes is scarce, due to the challenges of collecting important baseline field data on Arctic biodiversity (e.g. Archambault et al., 2010;Bluhm et al., 2011;Piepenburg et al., 2011), the general lack of open-access large-scale data that can be used to inform biogeographical modelling (e.g. Costello et al., 2018;Renaud et al., 2015), as well as the emphasis of most efforts on easyaccessible and well-researched areas, such as the Barents Sea (e.g. Frainer et al., 2017;Kortsch et al., 2012) or Bering and Chukchi seas (e.g. Grebmeier et al., 2006). Moreover, assessments of biodiversityecosystem functions relationship in the Arctic region are scarce  and the use of species and community distribution models (SDM/CDM) to up-scale local or regional analyses has rarely been used in marine ecosystems, particularly in Arctic seas Renaud et al., 2015Renaud et al., , 2019Robinson et al., 2011).
Our knowledge of Arctic benthic ecosystems is geographically unevenly distributed (Piepenburg et al., 2011). However, a comprehensive understanding of the large-scale patterns of Arctic benthic communities can serve as an integrative indicator of the effects of climate change in the region (Birchenough et al., 2015). Studies by Matishov et al., 2012 showed that the macrobenthic fauna of the Barents Sea responds to strong and long-term climatic anomalies and the ratio of boreal-Arctic species composition changes significantly in relation to warm and cold bottom water conditions. In order to shed light on the benthic community dynamics in response to climate change, the use of joint species distribution models that could take into account the multidimensional nature of communities is essential. Such models represent the studied communities as multidimensional spaces where the species respond jointly to the environment and to each other Tikhonov et al., 2020). This representation should inform our knowledge of remote areas with insufficient sampling activity (Jetz et al., 2019), a prominent feature of benthic ecosystems in the Arctic (Piepenburg et al., 2011;Renaud et al., 2015) in order to reveal potential shifts of the benthic communities.
Additionally, large-scale knowledge about benthic ecosystems in these largely inaccessible regions is necessary for the establishment of marine protected areas (MPAs). These regions, in addition to climate-change stressors, will experience an increase in economic activities (e.g. Northern Sea Route, increased fishery activities, underwater mining) . Establishing MPAs in the Eurasian Arctic is also important, as currently only about ~2.5% of this area is designated as such (Spiridonov et al., 2012), although about 10% of areas should be designated as protected areas to ensure the sustainability and proper functioning of biomes (Vreugdenhil et al., 2003). In order to establish MPAs, comprehensive knowledge of the water column and the delimitation of biogeographical zones must be combined . The current research should provide a baseline for identifying priority areas for conservation management.
To increase large-scale knowledge of benthic species assemblages in these little explored areas, we employ recent multivariate community models  underpinned by a novel synthesis of Arctic benthic fauna (Hansen et al., 2019;Casper Kraan, Thomas Brey, Paul Kloss, Jan Hansen & Dieter Piepenburg, unpublished data). We offer the first model-based biogeographical assessment of macrobenthic assemblages in the Eurasian Arctic, with the specific objectives to (1) delineate spatial distribution and boundaries of seafloor communities; (2) assess the significance of a broad selection of environmental drivers on structuring macrobenthic fauna; (3) compare such model-based results to historical biogeographical classifications; (4) use climate-change scenarios to project future distributions of benthic communities.
We use the Region of Common Profile (RCP; Foster et al., 2013) approach to assess multispecies bioregionalization. This method is based on a mixture-of-experts model framework, jointly considering multispecies data and environmental variables.
It allows mapping of regions within which species share the same probability of occurrence Hill et al., 2017). This is a great advantage compared to traditional methods using some measures of distance or dissimilarity between sampling locations since those do not allow to assess species' individual group membership or propagate uncertainty throughout the modelling process (Warton et al., 2015). Moreover, this approach allows prediction of assemblages in areas where only environmental information is available and their possible shifts in response to climate change .

K E Y W O R D S
distribution modelling, ecological projections, Eurasian Arctic seas, macrobenthic communities, region of common profile, seascape-scale bioregionalization.
We focus on macrobenthic communities, that is, organisms >500 µm, for four reasons: (1) Macrobenthic fauna consists of rather stationary and long-lived organisms and rely nutritionally almost entirely on the organic flux from euphotic layers. Hence, they reflect changes in surface-layer production in their own dynamics (e.g. Grebmeier et al., 2015). (2) No attempt has yet been made to regionalize Arctic seafloor communities using modern multi-species models.
(3) Increased knowledge of such seafloor communities will increase their inclusion and importance in the development of conservation priority areas or other conservation measures . (4) Our knowledge of the spatial distribution of benthic organisms in the Arctic region is rather limited (Piepenburg et al., 2011).

| Study region
Our study region includes three Eurasian shelf seas, that is, Barents, Kara and Laptev Seas (Spalding et al., 2007), covering a total of 3.5 million km 2 (Figure 1). The key environmental feature in this region is sea-ice cover (e.g. Itkin & Krumpen, 2017), lasting between 3-4 months in the Barents Sea to 9-10 months in the Laptev Sea.
Sea ice influences many factors that in turn directly influence the distribution and abundance of benthic communities, such as energy and carbon flow from the water column, the effect of desalination in summer, and vice versa in winter. Run-off from the Yenisei, Ob and Lena rivers also has an impact on environmental conditions, bringing in a significant amount of heat and freshwater, thereby conditioning coastal water circulation and ice regime, as well as determining biochemical regime (Zenkevitch, 1963).  Piepenburg et al., 2011), as well as taxonomically aligned with the world register of marine species (WoRMS Editorial Board, 2019). To reduce bias introduced by various grab-sampling devices, we limited our analysis to presence/absence data (Piepenburg et al., 2011). We ignored temporal variation in benthic species occurrences, since limiting the dataset time frame would severely impact the amount of information available and possibly make this study less valuable or even impossible. However, we matched environmental variables to the time benthic samples were taken (see below).

| Environmental data
Environmental parameters such as salinity, temperature, sediment composition and food availability were proven to be the most important factors determining the distribution of Arctic benthic communities (Dayton, 1990). It was shown that the similarities of faunal assemblages of benthic communities correlate with different depth zones (Steffens et al., 2006). The steepness of the slope may reflect the structure of benthic communities, being the driving force behind the accumulation of organic sediments (Ichino, 2015). Moreover, environmental factors as apparent oxygen utilization, concentrations of phosphates, silicates and nitrates are considered to represent the benthic remineralization function and, therefore, are assumed to be indicators of benthic activity .
The above-mentioned environmental parameters were extracted from public oceanographical databases and satellite-based remotesensing data repositories (see Figure A1). Variables were projected to a unified raster graphical format with coordinate reference sys- were extracted from GlobColour (Maritorena et al., 2010); ocean primary production (mg C m −2 day −−1 ) were extracted from www. scien ce.orego nstate.edu/ocean.produ ctivity (accessed September 2019). Additionally, we also estimated POC flux to the sea bottom (g C m −2 yr −1 ) using the empirical function based on mean net primary production in the ocean surface layer and exponential flux decrease with water depth proposed by Suess (1980). The procedure of data extraction for biological samples and the creation of predictive spaces was performed similar to sea ice.
While the effects of global warming and ocean uptake of CO 2 have been well-studied for the surface ocean environment, there are not that many studies that focus on the shelf bottom waters (Holt et al., 2017), especially in the Arctic Ocean (Popova et al., 2014).  (Jubb et al., 2013). We consider the CMIP5 model for projected sea-ice concentrations instead of a more recent CMIP6 model, as there are no comprehensive intercomparison studies of the CMIP6 and SRES F I G U R E 1 Study area. Eurasian Arctic shelf regions in the Barents, Kara and Laptev Seas (red) covering a total of ~3.5 million km 2 . White dots mark the locations of the 354 sampling stations, visited during scientific expeditions between 1991 and 2013, from where the macrobenthos data used in our study have been collected scenarios so far performed. Moreover, it was shown recently that the performance of sea-ice models based on either CMIP5 or CMIP6 projections are similar in many aspects (Notz & Community, 2020).

| Statistical modelling
The RCP approach performs a one-step classification of multi-species data into groups (assemblages), based on species composition and species' responses to environmental covariates . Each species of a particular RCP has the same probability of occurrence within this RCP, and each RCP is a latent variable (i.e. it is initially unknown and only revealed by the model), whose distribution varies as a function of the environment. This dependence allows predicting the species' probability of occurrence at locations where no samples, but only environmental data are available (e.g. Foster et al., 2013). RCP models are implemented in the "RCPmod" package in R R Core Team, 2019).
Using RCPs requires a number of steps. First, to remove covariates that have limited association with our biological data, generalized additive models (GAM) were used for each environmental variable and each species as a qualitative screening method Hui et al., 2013). Environmental covariates with significant values (p < 0.05) for at least 66% of the species were included for further analysis . These GAMs were also used to identify the optimal number of species to be included in the RCP analysis ( Figure A2), which was 169 species, occurring at least at 30 sampling locations (Table A1). Finally, correlation analysis of environmental variables that passed this preliminary screening was done ( Figure A3), retaining environmental variables with a correlation <0.7 to avoid collinearity (Dormann et al., 2012).
Then, a set of meaningful environmental variables, that is, water depth, sea-ice coverage, near-bottom water temperature and salinity, proportion of fine sediments and POC, were a priori included in each model. The remaining variables were successively added to the RCP analysis, using a forward selection procedure based on a Bayesian information criterion (BIC) to determine the optimal combination of parameters . Continuous environmental covariates were added as linear and quadratic polynomials.
Finally, to determine the number of RCPs, we ran preliminary models that indicated that the optimal number of RCPs varied between two and seven regions. Then, full models were run with 2-7 RCPs, all meaningful environmental variables, 50 random starts, and using BIC to determine model fit. Random starts are necessary to avoid getting stuck in a local likelihood maximum . This then indicated that four RCPs were optimal ( Figure A4).
Next, we determined the optimal set of environmental variables, by combining all environmental variables (a priori and variable ones) with the preset of four RCPs and 300 random starts, using BIC to determine model fit. Model uncertainty was estimated via 1000 bootstrap replications, providing 95% confidence intervals (CI).
To perform spatio-temporal projections of future shifts in macrobenthic community distribution, the best-trained RCP model was coupled to decadal changes (each decade from 2020 to 2099) of key environmental drivers projected by NorESM1-M (sea-ice cover) and SINMOD (bottom-water temperature and salinity) models. For visualizing, the produced probabilistic maps were transferred into the hard-classed maps (i.e. values of the most probable regions at each spatial point were mapped) .

| General bioregionalization
The optimal model consisted of four RCPs: RCP 1 and RCP 2 encompass coastal and offshore shelf areas of the Kara and Laptev  (Table A1).

| Description of the regions of common profile (RCP)
RCP 1 has a spatial extent of ~570,000 km 2 with an average probability of occurrence of 73.5%. The region has a predominantly shallow zonation, the euphotic depth reaches the seabed in most places.  (Table A1).  Table A1). RCP 3 has a spatial extent of ~1,100,000 km 2 and the highest av-  (Table A1).  (Table A1).

| Climate change-driven projection
The climate-change projections suggested a general eastward shift of the RCPs over the 21st century, mainly correlated with retreating sea-ice and increasing sea-bottom temperature ( Figure 5). RCP 1 was projected to experience the most profound shift, disappearing almost completely from the Barents and Kara seas. RCP 3 will be displaced in the trenches of the Barents and Kara Seas by RCP 2 and RCP 4 and will be confined to the deep regions of the continental slope and the deep sea of the Eurasian basins ( Figure 5). By the end of the 21st century, RCP 2 will become concentrated in the south of the Barents and Kara Seas, as well as over the offshore shelf of the Laptev Sea ( Figure 5). RCP 4 was also projected to expand eastward and northward into the Kara Sea, as well as towards the fjords of Spitsbergen and Novaya Zemlya ( Figure 5).

| DISCUSS ION
Based on our novel synthesis of Arctic benthic fauna (Hansen et al., 2019) and the recently developed model-based bioregionalization method 'RCP' , we identified four bioregions across the Eurasian Arctic shelves (Figures 2, 3). Each region is represented by characteristic patterns of biodiversity (Table A1)  While there are long-term data from quite a number of local studies on benthic biota for littoral regions (bays and fjords) in Eurasian-Arctic seas , there are only a few seascape-scale studies, also including offshore regions, that our results can be compared to (Mironov, 2013;Zenkevitch, 1963). Zenkevitch's (1963) bioregionalization, based on data sampled before 1947, overlaps with our Eurasian-Arctic study area, allowing us to put our findings spanning the period 1991 to 2013 into a historical context ( Figure 5). Overall, there is a pronounced similarity between Zenkevitch's (1963) spatial structure, which was based on traditional analysis methods, and our RCP model-based results: RCP 1 and RCP 2 largely correspond to the "Shallow brackish-water province" and the "Shallow marine Siberian province", respectively, which are both parts of the "High-Arctic shallow sub-region". RCP 3 aligns with the "Abyssal Arctic sub-region", covering deep-sea parts of the Arctic Ocean and slopes of the Kara and the Laptev Seas. In contrast with Zenkevitch's scheme (1963), RCP 3 also overlaps with other regions and is even found in shelf areas. Zenkevitch's (1963) "Shallow low-Arctic sub-region" largely coincides with RCP 4, mainly encompassing the Barents Sea bounded by Spitsbergen and Franz Josef Land in the north and Novaya Zemlya in the east. Notable dissimilarities between the RCP structure and the classification of Zenkevitch (1963) exist at the southern and northern borders of the Barents Sea. In the RCP model, the southern Barents Sea belongs to RCP 1 and RCP 2 instead of RCP 4, and in the north RCP 4 wedges into the shallow high-Arctic area, separating Franz Joseph Land from Svalbard. The areas around these two archipelagos are characterized by a mosaic of RCP 2 and RCP 4.
The spatio-temporal projections based on the relatively optimistic climate-change scenarios reveal a clear trend of borealization of the major part of the study area. These shifts are strongly related to the retreat of the Arctic sea ice, which is assumed to eventually lead to the weakening of the pelago-benthic coupling and a decrease of the input of organic matter to the sea bottom (Piepenburg, 2005;Wassmann & Reigstad, 2011). Moreover, the warming of the nearbottom oceanic layer, as well as the change of the salinity regime in some Eurasian shelf regions, are projected to partially affect suitable benthic habitat areas (Renaud et al., 2019) and cause poleward migration of species (IPCC, 2014).
RCP 1 is projected to be most vulnerable to projected climatechange impacts, as it may virtually disappear from the Kara Sea and partly shrink in the Laptev Sea during the 21st century ( Figure 5).
These Arctic seas are expected to be most affected by sea-ice retreat and warming (Bentsen et al., 2013;Wallhead et al., 2017). However, in the eastern part of the studied region, RCP 1 will still dominate in the Laptev Sea, whilst RCP 2 will become almost absent according to projections for 2020-2050 ( Figure A6). It has to be noted, however, that the results need to be interpreted by considering the uncertainty introduced by the sea-ice NorESM1-M model, its resolution did not resolve the occurrence of the rather narrow flaw-lead polynyas in the Laptev Sea ( Figures 5, A6), which mostly determines the distribution of RCP 2. After 2050, RCP 2 will expand quite significantly while the extent of RCP 1 will shrink by 2099 due to further sea-ice decline. Therefore, it is quite likely that the projected contraction of RCP 1 and expansion of RCP 2 in this region will be even more pronounced. However, to answer this question, it is necessary to use models with a higher spatial resolution, which are appropriate for reconciling the fine-scale spatial distribution of the sea ice and can reveal the dynamics of formation and distribution of polynyas in the area.
RCP 4 is most expanding its spatial extent throughout the entire climate simulation by 2100 (Figures 5, A6), reflecting that it is most affiliated with higher temperatures and less sea ice (Figure 4) under global warming (IPCC, 2014). However, while it is projected to generally expand northwards, it will not do so towards near-shore zones ( Figures 5, A6) because there RCP 2 that is similarly affiliated with increased temperatures and less ice (Figure 4) will persist and impede the expansion of RCP 4. Moreover, RCP 4 is most representative at high salinities, while RCP 2 is more tolerant to the typical strong salinity fluctuations in near-shore zones (Figure 4). The deep RCP 3 will gradually retreat northwards to even deeper regions of the central Arctic Ocean, in response to the projected warming of near-bottom waters in shelf depressions, sea-ice decline ( Figure 5) and be gradually replaced by the "warm-water" RCPs 2 and 4 ( Figures 5, A6).
The projections suggest that RCP 1 will start occurring in the south-western Barents Sea after 2030 ( Figure A6). This prediction is rather surprising, as RCP 1 is actually mostly affiliated with the presence of sea ice, cold temperatures and low salinities, while in this area climate change will rather lead to the opposite trends ( Figure 4). This artefact highlights the fact that when using  Zenkevitch (1963)). Current distributions are based on our RCP study and present sea-ice and oceanographical datasets. Future distributions of the four RCPs were estimated by coupling the best-trained RCP model to environmental conditions predicted by projections of NorESM1-M (sea-ice cover) and SINMOD (bottom-water temperature) models that in turn were driven by the climate scenarios Representative Concentration Pathway RCP6.0 of the Coupled Model Intercomparison Project (CMIP5) (for NorEMS1-M) and A1B of the Special Report on Emissions Scenarios (SRES) A1B (for SINMOD) shifts, and an appropriately advanced feature should be implemented within the RCP model.
There is evidence of both long-term resilience and vulnerability of benthic communities to climate-driven environmental change in the Eurasian Arctic. Kokarev et al., (2017) Fossheim et al., (2015) found the same rapid borealization for fish communities. These trends are not only caused by ocean warming but very likely also in response to indirect impacts resulting from the changes in the pelagic-benthic coupling due to the borealization of the pelagic biota and the shifts in the food-web pathways (Kortsch et al., 2015).
With regard to the projections until 2099, only a subset (sea ice, bottom-water temperature and salinity) of the environmental drivers used in the RCP model were considered. Others, such as POC and the depth of the euphotic zone, had to be kept stationary, due to lack of information about their spatio-temporal dynamics in the future, although their climate-driven change will potentially lead to a shift in benthic fauna.

| Implications for conservation management
Marine conservation efforts in the Arctic region are considered to be a high priority in the face of recent climatic changes and the concomitant intensification of human activities Spiridonov et al., 2017). In general, the RCP approach can be considered a powerful model-based mapping tool for informing ongoing and planned spatial conservation management . For instance, in case of the Eurasian-Arctic study area, it could help improving the identification of conservation priority areas that Solovyev et al., (2017) suggested based on the delineation of biogeographical provinces of benthic invertebrates, by application of a more accurate quantitative approach. The RCP approach is also suitable as a conceptual framework for establishing community-level Essential Biodiversity Variables (EBVs), which have been proposed as a key tool for better understanding the patterns and trends in Earth's biodiversity (Pereira et al., 2013). To this end, multiple current and projected RCPs, modelled for various marine biota (e.g. pelagic, benthic), diversity measures and environmental parameter sets, have to be stacked in a temporal cubic format (Jetz et al., 2019).

| Data paradigm in biogeographical research
Accelerating climate change is leading to the rapid poleward migration of species in the oceans. Therefore, decisions have to be made for enhancing the efficiency of marine conservation efforts.
Moreover, fast progress in the aggregation of biological data (e.g. PANGAEA) and eDNA-based assessments of the 'dark diversity' (Boussarie et al., 2018) will result in rapidly growing data volumes, posing challenges for the efficient analysis of these data that is necessary for further enhancing our knowledge on biodiversity and advancing environmental management. To address these challenges, modelling approaches should be established within the frameworks of the bio-information systems (e.g. PANGAEA). In this context, the integration of RCP approaches is the key to address these tasks.
Here, we demonstrated how RCP modelling can be used to extract valuable information from a broad spectrum of datasets, bridging between past, current and future states of the ecoregions.
Overall, the representation of bioregions in the cube-shaped format of layers of RCPs models on the number of different marine communities with underlying environmental layers could give an in-depth level of the ecosystem. Such abstractions will ultimately enhance our knowledge about the functioning of the ecosystems across multiple spatio-temporal scales (Jetz et al., 2019).

| Conclusions
Building on recent RCP models , an unrivalled synthesis of seascape-scale data of Arctic seafloor communities and ecologically relevant environmental parameters (e.g. Pantiukhin et al., 2019), we examined the bioregionalization of seafloor communities in the Eurasian Arctic. Our work provides the first modelbased quantitative mapping of macrobenthic assemblages based on the spatial distribution of taxa and key environmental drivers, such as water depth, sea-ice cover, bottom-water temperature and salinity, proportion of fine sediments, POC and depth of the euphotic zone. Climate scenario-based spatio-temporal projections of future RCP distribution allowed for the first time to assess community-level shifts of ecoregions over the course of the 21st century. Applying the RCP approach on a long-term basis would offer an opportunity to identify 'hotspot' regions, which are characterized by particularly pronounced change and particularly little available information, and thus guide future field activities in the Eurasian-Arctic seas. to thank the editor and anonymous reviewers for providing valuable advice to improve the manuscript. All datasets used in this study are taken from open-access repositories. Our study did not need any research permits.

DATA AVA I L A B I L I T Y S TAT E M E N T
All biotic data are available online via https://doi.panga ea.de/10.1594/PANGA EA.910004. Environmental data were taken from open-source online repositories (see main text), with the exception of sediment-data that we compiled for this study . The results of the RCP model and raw data are available from the Dryad Digital Repository: https://doi.org/10.5061/ dryad.8pk0p 2nn9.