Historical contingency via priority effects counteracts environmental change on metacommunity dynamics across decades

Community ecology has had a strong focus on single snapshots of species compositional variation in time. However, environmental change often occurs slowly at relatively broad spatio‐temporal scales, which requires historically explicit assessments of long‐term metacommunity dynamics, such as the order of species arrival during community assembly (i.e., priority effects), a theme that merits further empirical quantification. In this study, we applied the Bayesian inference scheme of Hierarchical Modeling of Species Communities together with information on functional traits and evolutionary dependencies to efficiently explore the question of how ecological communities are organized in space and time. To do this, we used a comprehensive time‐series dataset from boreal lake plants and adopted the perspective that more sound conclusions on metacommunity dynamics can be gained from studies that consider a historically integrative approach over long timeframes. Our findings revealed that historical contingency via priority effects can profoundly shape community assembly under the influence of environmental change across decades (here, from the 1940s to the 2010s). Similarly, our results supported the existence of both positive and negative species‐to‐species associations in lake plants, suggesting that functional divergence can switch the inhibition–facilitation balance at the metacommunity level. Perhaps more importantly, this proof‐of‐concept study supports the notion that community ecology should include a historical perspective and suggests that ignoring priority effects may risk our ability to identify the true magnitude of change in present‐day biotic communities.

The mechanisms that govern species coexistence and determine the assembly of present-day ecological communities have been strongly debated throughout the history of ecological science (Fukami 2015). From observed patterns of species occurrences and distributions, Diamond (1975) derived his nowclassic assembly rule of forbidden vs. permissible species associations, which he used to explain the empirical observation that functionally similar bird species rarely coexist on the same island in the Bismarck archipelago. Fueled by a concurrent surge in studies of ecological communities (Grant and Abbott 1980), Diamond's hypotheses spurred a series of mechanistic analyses exploring alternative models of community assembly that continue today (Logue et al. 2011;Viana and Chase 2019). Recent decades have seen a shift from a primarily local perspective on species coexistence via interspecific interactions to the recognition that local communities are dynamic in space and time (Leibold et al. 2004;Fukami 2015;De Meester et al. 2016). Although these ideas can be traced to the mid-1960s (MacArthur andWilson 1967), the metacommunity concept (i.e., a set of local communities connected by dispersal of constituent organisms; Logue et al. 2011) has empirically advanced our knowledge on the spatio-temporal organisation of communities in the terrestrial (Meynard et al. 2013), marine (Moritz et al. 2013) and freshwater (Heino et al. 2015) realms. However, as recently discussed by Brown et al. (2017), empirical research in metacommunity ecology has primarily focused on analyzing single snapshot observations of compositional variation, thereby jeopardizing attempts to resolve the historical contingencies that are not evident from such static views of community organization (Datry et al. 2015). [Correction added on 09 March 2021, after first online publication: the previous two sentences were reworded to avoid duplicating information contained in an original source, which has now been properly cited.] These issues beg for a shifting paradigm in the types of questions we ask in community ecology. For example, rather than simply asking about the relative importance of niche selection, dispersal events or ecological drift (Vellend 2016), we argue that this question must be put in a historically explicit context.
Discussing around the state of ecological research before the end of the century, Lawton (1999) stated that "community ecology (…) is a mess", not least because this discipline is riddled with contingency so challenging that "useful generalizations are hard to find" (Fukami 2015). [Correction added on 09 March 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] Basically, according to Fukami (2015), this is to say that long-lasting ecological imprints on contemporary communities can often be understood only by considering the legacies of the past, whether due to biotic or abiotic factors (Drake 1991). Indeed, it is becoming increasingly apparent that community assembly depends on the order of species arrival, i.e., priority effects (Chase 2003;Fukami 2015). In this regard, Fukami (2015) noticed that the importance of historical contingencies for community assembly was already recognized when " Gleason (1926) discussed what we now call priority effects as a conceptual alternative to Clements (1916) then-dominant climax model of ecological succession". [Correction added on 09 March 2021, after first online publication: the previous two sentences were reworded to avoid duplicating information contained in an original source, which has now been properly cited.] Although the study of community assembly history offers a means to a deeper understanding of ecological patterns by decoding how inhibition and facilitation interacts with environmental change to divert communities onto alternative trajectories (De Meester et al. 2016), historical contingency via priority effects is usually absent from the traditional archetypes of community assembly (reviewed in Logue et al. 2011), and has remained a fairly minor topic for the literature to date. If ecological theory holds true, however, incumbent species could preempt niche space at the expense of late-successional species (Chase 2003), although these and other priority effects (e.g., niche modification, see Table 1) may still be difficult to predict unambiguously (Little and Altermatt 2018).
The limited research on priority effects to date mostly focuses on microbes (e.g., Zee and Fukami 2018), fungi (e.g., Kennedy and Bruns 2005), vertebrates (e.g., Stroud et al. 2019) and terrestrial plants (e.g., Grman and Suding 2010). In addition, these studies have been conducted especially in the context of invasion biology and suggest that short-term historical contingency is likely to dictate community trajectories across ecosystems. In freshwater ecosystems, however, most research on community assembly history has centered on testing the model (Scheffer et al. 1993) of alternative equilibria. Despite these studies showed how ecological communities varying in their assembly histories can differ in species composition and ecosystem functioning, few studies (e.g., Fried-Petersen et al. 2020) have adopted a metacommunity perspective on time series exceeding a few years. Therefore, we are largely neglecting potentially important features of ecosystems, such as environmental changes and species turnover over longer timeframes (e.g., across decades). Consequently, very little is known of the long-term impact of historical contingency via priority effects and its interaction with environmental variation on community assembly. The little available research on this topic posits a major hindrance for both basic and applied ecological issues (Tucker and Fukami 2014), including management and restoration of natural ecosystems (Heino 2013a;Fukami 2015). [Correction added on 09 March 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] In the following, we use the heuristic framework provided by Hierarchical Modeling of Species Communities (HMSC; Table 1; Ovaskainen et al. 2017a), an extension of joint species distribution models (JSDM; Warton et al. 2015;Dormann et al. 2018), to explore the question of how lake vascular plant communities are organized in space and time. In other words, we investigate whether environmental change and historical contingency via priority effects shape species co-occurrences and ultimately determine metacommunity dynamics during a 70-year study period. Here, we adopted the holistic view that single snapshot studies are inadequate for characterizing ecological processes that influence community assembly, or at least, more sound conclusions can be gained from studies that consider community assembly history over relatively long timeframes (Magurran et al. 2018;Hendershot et al. 2020). Lake vascular plants provide excellent model systems for our purpose, since these communities are comprised of organisms that experience vegetative regeneration and emerge from propagule banks and by immigrants arriving from the regional pool of species throughout the growing season (Viana et al. 2016). Moreover, lake plants can be classified into several functional groups, with helophytes and hydrophytes (Table 1) showing differences in sensitivity to ice cover during winters and accessibility to carbon and nutrients from the atmosphere, water and sediments (Alahuhta et al. 2018a). Vascular plants also represent the dominant flora in lake ecosystems worldwide (Cook 1999) and work as ecosystem engineers and foundation species, providing a suite of ecologically valuable functions such as carbon sequestration, sediment stabilization and nutrient cycling (O'Hare et al. 2018). They also provide habitat and food for a wide range of other dependent freshwater and terrestrial biotas and, finally, influence water quality and primary production (Lacoul and Freedman 2006).
Using temporally replicated data from small boreal lakes (1947-1950 to 2017) and based on recent theoretical models and empirical studies on lake plants (Jamoneau et al. 2019;Lindholm et al. 2020a,b), we hypothesized (H1) that environmental changes would primarily control compositional variation among communities, thereby leading to relatively strong signals of long-term environmental niche filtering in both species and traits (García-Girón et al. 2019;García-Girón et al. 2020a). However, because most lake plants are perennial species showing clonal reproduction (Santamaría 2002) and vegetative regeneration (Barrat-Segretain et al. 1998), and are associated in discrete clusters (e.g., García-Girón et al. 2020b), we predicted (H2) that metacommunity dynamics would also depend on the local community of plants already established via priority effects. Consequently, we also expected (H3) that HMSC models with both abiotic and biotic covariates would more accurately predict the long-term trajectory of local communities. In other words, the milieu of interactions throughout the history of community assembly, together with spatiotemporal changes in the surrounding environment, would ultimately determine metacommunity dynamics. However, we also predicted (H4) that these mechanisms would appear highly variable across different species, and across species classified as helophytes or hydrophytes, not least because plant functional groups usually show different assembly Table 1. A glossary of some of the most relevant concepts dealt with in this paper.

Concept Description
Community-level random effects HMSC allows estimating species-to-species residual associations, which are implemented through a latent factor approach.
To do this, HMSC includes community-level random effects to model not only variation within species, but also covariation among species, i.e., species-to-species residual associations (Ovaskainen et al. 2017a). Here, we included community-level random effects at the levels of lake identity, sampling time points and its interaction.
Environmental change Here, we understand environmental change as any consistent temporal trend in the environment that affects ecological community dynamics in the long run. Helophytes Emergent plant species that photosynthesise above the water surface and have aerial reproductive structures (Cirujano et al. 2014).
HMSC * A new mathematically flexible routine to analyse community-level data by means of Bayesian joint species distribution models (JSDM). It enables the joint assessment of environmental filtering and priority effects by incorporating species responses to their environment, prior occupancy of study sites, species traits and phylogenetic legacies (Ovaskainen et al. 2017a).
[Correction added on 13 September 2021, after first online publication: the previous sentence was reworded to avoid duplicating information contained in the originally cited source.]

Historical contingency
The effect of the order and timing of past events-either abiotic or biotic-on community assembly (Fukami 2015).

Hydrophytes
Submerged (ceratophyllids, isoetids, and elodeids) and floating-leaved (nymphaeids and lemnids) anchored and freefloating plants that grow primarily under water or with their leaves floating on top of the water surface, respectively (Cirujano et al. 2014).

Metacommunity
A set of local communities connected by the dispersal of constituent species (Leibold et al. 2004). From the very beginning, four not mutually exclusive archetypes-species sorting, patch dynamics, mass effects, and neutral dynamics-have been associated with metacommunity dynamics. Each of the archetypes is distinguished by the amount of emphasis they place on a combination of regional and local processes, disturbance, and the degree to which species are equivalent in their traits (reviewed in Brown et al. 2017).

Priority effects
The prior species occurrences influence the trajectory of communities (De Meester et al. 2016). According to Fukami (2015), in niche preemption, "species that arrive early reduce the amount of resources available to other species and, in doing so, limit (…) colonization opportunities for late-successional species (…) that need these resources to survive and reproduce". By contrast, in niche modification, "early-arriving species change the types of niches available within the local site and, consequently, the identities of late-arriving species that can colonize the community" (Fukami 2015). See Supporting Information S1 for a schematic synthesis. [Correction added on 09 March 2021, after first online publication: these definitions were placed within quotation marks in order to appropriately credit the original source.] Putative species associations HMSC models the residual variance in species distributions as a variance-covariance matrix and then identifies species pairs showing positive or negative associations, which can be considered as hypotheses of biotic interactions. * HMSC, Hierarchical Modeling of Species Communities (Ovaskainen et al. 2017a,b).
[Correction added on 09 March 2021, after first online publication: a footnote was removed from this table to avoid redundancy and eliminate a citation.] mechanisms (Ot'ahel'ová et al. 2011). We finally expected (H5) that we would find multiple species-to-species associations after controlling the fixed and community-level random effects, either as a result of facilitation or eventual competitive exclusion in lake plants (Lacoul and Freedman 2006;Song et al. 2017).

Materials and methods
Test dataset and regional characteristics The study region, dataset characteristics and field methods have been described previously (Lindholm et al. 2020a,b,c) and are here paraphrased briefly to aid understanding of the ecogeographical context of the regional dataset. We studied a total of 27 small lakes located in the Kokemäenjoki drainage basin (Finland) and, more specifically, in the area between the two large lakes Roine and Pyhäjärvi (Supporting Information S2). The elevational range among the study lakes is relatively low, ranging between 77 and 131 m a.s.l. Nonetheless, changes in environmental conditions occur even along the relatively modest elevational range of the study area, mainly as a result of the glacial and postglacial processes that washed fine-grained sediments with nutrients across the landscape (Seppälä 2005). Many of these lakes have brown, humic waters and are situated in small chains of lakes and streams, both features typical of boreal lakescapes. Naturally more eutrophic lakes are located at lower elevations and are mainly surrounded by settlements and agricultural land, whereas smaller and more oligotrophic lakes at higher elevations are less affected by anthropogenic pressures (mostly forest roads, scattered summer cottages and diffuse nutrient load from activities in surrounding forests) and are surrounded by coniferous forests and peatlands. These heterogeneous pressures allowed us to assess the effects of environmental mechanisms that are relevant at different spatio-temporal scales, while accounting for relevant historical contingencies which are required for an objective assessment of metacommunity dynamics (Fried-Petersen et al. 2020).
Lake plants were sampled during five different decades (i.e., 1947-1950, 1975-1978, 1991-1993, 2005-2008, and 2017) using similar methods (here, an aquascope and two different types of rakes were used to survey the whole lake area) to maintain data comparability. For clarity, we refer to these surveys as 1940s, 1970s, 1990s, 2000s, and 2010s. The plant identification was done at the lowest possible taxonomic rank; therefore, we excluded taxa identified to genus level (e.g., Isoetes spp.) and hybrids in further analyses. In this study, we concentrated on presence-absence data of species classified traditionally as aquatic vascular plants in Finland (Linkola 1933), but we also considered seven tall species growing in the water from the sedge genus Carex. In total, 65 aquatic vascular taxa (including several species complexes) were examined (Supporting Information S3). With time-series datasets, presence-absence data are the most reliable source of ecological information (Lindholm et al. 2020a), particularly when compared to coverages or other information representing species abundances. Indeed, lake plants are usually perennial and are thus expected to disappear in a short period only marginally, contrary to their abundances, which are more prone to fluctuations (Jamoneau et al. 2019). Interestingly, as changes in these lake plant communities was modest through decades (Lindholm et al. 2020c), with few temporal gains of species and losses of species at the lake level (Supporting Information S4), it suggests no major biases in our data caused by species turnover in between surveys.
We focused on physico-chemical variables that are widely identified to be key drivers of lake plants: lake area (ha), elevation (m), maximum depth (m), water transparency (measured by Secchi depth, m) and pH (Supporting Information S2; Lacoul and Freedman 2006). All these environmental features represent a larger complex of ecologically important drivers that correlate with other hydromorphological and water chemistry variables which were not available for all study surveys (e.g., Kosten et al. 2009). In addition, we derived three land-use variables (agricultural area (%), including field and pasture area, built area (%) and amounts of ditches) from the base maps for each decade (National Land Survey in Finland 2017, 2018). Importantly, we decided to concentrate on these land-use types as several studies (e.g., Ecke 2009) have found that they influence water chemistry and other physical features of lakes, and because these are the land-use types that have changed most over the past decades in the study area. This reflects the trend of increasing anthropogenic pressure observed all over the world (Song et al. 2018). We also considered geographical coordinates of lake centers and attempted to use climatic variables; however, reliable climate data for our study area were not available for the whole 70-year study period (Lindholm et al. 2020a).

Statistical analyses
The presence-absence of plant species (Supporting Information S4) was analyzed using Hierarchical Modeling of Species Communities (HMSC, Ovaskainen et al. 2017a). This framework is an extension of the Generalized Linear Modeling (GLM) paradigm and models the recorded data (lakes and species denoted as i and j, respectively, where i = 1, …, m s and j = 1, …, n s ) as y ij $ D j L ij ,σ 2 j , where D j is a statistical distribution for a particular data type, "L ij is the linear predictor, and σ 2 j is a variance parameter" (Ovaskainen et al., 2017a;Tikhonov et al. 2017;Tikhonov et al. 2020a). [Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] The linear predictor L ij is delineated with a combination of fixed (F) and random (R) expressions (see Ovaskainen et al. 2017a for further details) as L ij = L F ij + L R ij , where L F ij includes the environmental term E (L E ij ) and the association term P (L P ij ; see below) that models the effect of environmental covariates and priority effects, respectively. [Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] In summary, the fixed effects model is expressed as the regression L E ij = P nc k = 1 x ik β jk , where x ik indicates a measured environmental covariate k = 1, …, n c in lake i (i.e., matrix X; Fig. 1), and β jk is the regression coefficient for the linear response of species j to this environmental covariate (see Ovaskainen et al., 2017a). [Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] To generate a parametrization of the model for our empirical community-level data, we assumed the regression coefficients as a multivariate normal distribution, β j. $N(μ j. , V), where the vector of the expected niche μ j. and the variance-covariance matrix V are estimated parameters (see Ovaskainen   of plant species (EPR model). Each white box denotes different data types, whereas yellow boxes and black arrows represent "estimated parameters" (here, the linear predictors L, the random variation in species co-occurrences Ω, the species niches β, the influence of traits on species γ and the phylogenetic signal parameter ρ), and "relationships described by statistical distributions" (sensu Ovaskainen et al., 2017a). Time-series datasets contain location and time information about the samples. Gray and white colors in boxes denote "different environmental conditions" (Ovaskainen et al., 2017a) of the landscape and colored flowers are different lake plant species. Presence-absence matrix (Y matrix) represents site x species community data for a set of lakes with different spatio-temporal contexts. The environmental template (X matrix) includes a set of physico-chemical (in this example, water chemistry and elevation) and land-use (in this example, agricultural area and built area) variables for each study site. Multi-trait information (T matrix) includes a combination of multiple functional attributes (here, represented by different geometric shapes and colors) for each plant species in the presence-absence community data. The phylogenetic matrix (P matrix) accounts for the evolutionary legacies of the regional species pool. For clarity, we only include three different time-series datasets in the graphical scheme. Artwork and figure caption modified from Ovaskainen et al. (2017a). For clarity, we only include three different time-series datasets in the graphical scheme. et al., 2017 and papers therein for mathematical and notation details). [Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in original sources, which have now been properly cited.] For hierarchical studies, such as ours, the random effects L R ij are incorporated mathematically as "the sum of the random effects over all levels of the study exercise" (Tikhonov, 2018) following the equation: L R ij = P nr r = 1 L r p r i ð Þj , where n r denotes the number of exercise levels r = 1, …, n r , p r (Á) indicates the projection for exercise level r, which maps the sites i = 1…m s to the corresponding units q = 1,…, n r p , and n r p is the total number of units for a certain exercise level r (Ovaskainen et al. 2017a;Tikhonov, 2018). [Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] The HMSC approach incorporates aspects of traditional species distribution models to delineate patters of ecological communities and, at the same time, account for variation attributed to phylogeny, functional traits, environmental niche filtering, and spatial and temporal random effects (Ovaskainen et al., 2017a;Fig. 1;Supporting Information S5).
[Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] Here, we used default priors (see Ovaskainen et al. 2017a and Supporting Information therein for details) and modeled species occurrences using the probit model (thus, excluding the variance term σ 2 j from the models) for presence-absence data with the Hmsc R package (Tikhonov et al. 2020b). As fixed explanatory variables, we included (log-transformed) physicochemical and (logit-transformed) land-use covariates (except amounts of ditches, for which we used the log-transformation), with lake identity, surveys (i.e., sampling time points) and the interaction effect of lake identity with surveys representing the community-level random effects (Ovaskainen et al. 2017a). The transformed physico-chemical (excluding elevation) and land-use variables were then modeled as the slope between each covariate and the time period in between surveys, and included those described previously, i.e., maximum depth, lake area, pH, Secchi depth, agricultural area, built area and ditches. Here, we also used the occurrences of the species in the previous survey as predictors of their occurrences in the following surveys. Specifically, to separate species' responses to the shared environment (β jk ) from species' responses to predecessor species (α j 1 j 2 ), we separately examined the influence of occurrence of species j 2 in the previous surveys on the occurrence of species j 1 in the focal survey (see Ovaskainen et al. 2017a for details). Hence, priority effects were modeled as the regression L P ij = P ns j = 1 x ij 1 α j 1 j 2 , where x ij 1 represents species j 1 in lake i, and α j 1 j 2 is the regression coefficient representing the linear response of species j 1 to predecessor species j 2 . As the regression parameters measure how the occurrences of focal species depends on the occurrences of predecessor species (Ovaskainen et al. 2017a), we interpret these as describing historical contingencies via priority effects (Little and Altermatt 2018).
We primarily compared four models. The first included three types of factors (EPR model): physico-chemical and land-use covariates (E), prior plant species occurrences (i.e., priority effects, P), and community-level random effects (R). Because there was no prior occupancy data for the first survey (1940s), this EPR model (L EPR ij = L E ij + L P ij + L R ij ) was run using information from the second (1970s) through fifth (2010s) surveys (see Little and Altermatt 2018 for a similar approach). The second model (R model) included only the community-level random effects (L R ij ) for the five study decades (1940s, 1970s, 1990s, 2000s, and 2010s). For the sake of comparison, we re-ran the analyses (ER model) using only the second through fifth surveys with community-level random effects plus physico-chemical and land-use covariates (L ER ij = L E ij + L R ij ), and repeated the model (PR model) including community-level random effects plus the prior presence of plant species (L PR ij = L P ij + L R ij ). Following Tikhonov et al. (2020a), we fitted all models with 10,000 Markov chain Monte Carlo (MCMC) steps, out of which we discarded the first 2000 steps as burn in, and subsequently thinned the remaining samples by 10. For comparing the predictive performance of the models, we ran a two-fold cross validation. To do this, we first split the data into two sets, of which both contained a randomly selected half of the study sites. We then fitted the models to both sets of data and used the fitted models to predict species occurrence probabilities in the half of the data not used in model fitting, resulting in predictions for the whole dataset based on independent datasets used for training ). We measured the predictive performance of the models against the validation data using Tjur R 2 coefficients of discrimination (Tjur 2009). Specifically, we obtained an overall measure of predictive power by averaging the species-specific Tjur R 2 values.
Additionally, we included phylogenetic and functional information (see Supporting Information S6) into the HMSC procedure to map the environmental responses onto the phylogeny and model how species traits (here, growth form, normal method of propagation, perennation and potential size; Supporting Information S6) influence their niches. More specifically, we followed Ovaskainen et al. (2017a) and modelled the influence of functional traits on species responses to the environment as μ jk = P nt l = 1 t lj γ lk , where "t lj is the value of a trait l = 1, …, n t for species j" (Fig. 1), whereas the parameter γ lk denotes "the effect of trait l on response to covariate k" (Ovaskainen et al., 2017a). Interestingly, this previous equation can also be used to respond to the question of how much variation in species' niches can be attributed to their functional traits (see Ovaskainen et al. 2017a for details). Moreover, we further followed Ovaskainen et al. (2017a) and accounted for phylogenetic legacies (Fig. 1)   added on 13 September 2021, after first online publication: the previous two sentences were reworded to avoid duplicating information contained in the originally cited source.] Further following Ovaskainen et al. (2017a), we partitioned the explained variance in presence-absence of each species among the fixed (i.e., physico-chemical and land-use covariates, and prior plant species occurrences) and community-level random (i.e., lake identity, surveys and the interaction effect of lake identity with surveys) variables. This summarized the relative contribution of the physico-chemical environment, landuse changes, the prior presence of plants and random effects to the occupancy patterns of each species in the HMSC model. Finally, we used the latent part of the EPR model to analyse residual correlations between pairs of species (Ovaskainen et al. 2017a) and obtain a set of 'hypothetical species associations' (Little & Altermatt, 2018). [Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] More specifically, the HMSC model incorporates the species-to-species associations by a correlation matrix as M j 1 j 2 = Ω j 1 j 2 = ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi , where "the off-diagonal element Ω j 1 j 2 denotes the covariation among species pairs j 1 and j 2 " (Ovaskainen et al. 2017a;Tikhonov 2018). [Correction added on 13 September 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] Since environmental covariates, community-level random effects and prior site occupancy were included in the analyses least 75% posterior probability for β parameters. The branches in the phylogenetic tree are colored to represent different plant lineages, i.e., Malpighiales, Brassicales, Saxifragales, Lamiales, Asterales, Ericales, Caryophyllales, Ranunculales, Ceratophyllales, Poales, Asparagales, Alismatales, Acorales, Nymphaeales and Equisetales (from left to right). SeeSupporting Information S6 for the tip labels on the phylogeny. (b) Heatmap showing how species traits (here, growth form, normal method of propagation, perennation and potential size; Supporting Information S6) respond to physicochemical covariates. The color scales and threshold posterior support for γ parameters correspond with those of (a). residual correlations in matrix M j 1 j 2 described if a species pair j 1 and j 2 is empirically detected in the same study sites more or less than random expectations from the EPR model (Ovaskainen et al. 2017a;Warton et al. 2015). Note, however, that hypothetical species-to-species associations do not provide empirical evidence for biotic interations (Dormann et al. 2018). Instead, these signals suggest that biotic interactions might have a strong effect on the above-mentioned associations (Ovaskainen et al., 2017a;Little and Altermatt 2018). [Correction added on 13 September 2021, after first online publication: the previous two sentences were reworded to avoid duplicating information contained in an original source, which has now been properly cited.] All analyses were performed in R programming language (R Development Core Team 2018) for CentOS Linux 7.7 (The CentOS Project, <https://www.centos.org/>) using the supercomputing resources managed by SCAYLE (Supercomputing Centre of Castilla y León, <https://www.scayle.es/>).

Results
The cross-validation exercise of the models showed that the average predictive performance generally increased with model complexity, so that the R model using only the random effects (n = 135) produced the worst predictions (mean Tjur R 2 = 0.21). While individually accounting for the environmental template (n = 108; mean Tjur R 2 = 0.70) and predecessor species (n = 108; mean Tjur R 2 = 0.75) clearly improved predictions, the "full" EPR model (n = 108) showed the highest average predictive power (mean Tjur R 2 = 0.95), supporting that environmental and prior occupancy data provide complementary information (Little & Altermatt 2018). [Correction added on 13 September 2021, after first online publication: the previous sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] Since the "full" EPR model produced the best predictions (Kruskal-Wallis χ 2 = 216.2; p < 0.001), the following results are shown for this model. Importantly, the convergence diagnostics (Supporting Information S7) suggested reasonably good MCMC convergence, as most effective sample sizes were relatively high and potential scale reduction factors were close to one.
Variance partitioning indicated that prior occupancy patterns (i.e., priority effects) were by far the most important explanatory variables influencing plant occurrences, accounting for 72.1% of the explained variance in the "full" model (Fig. 2a). The physico-chemical environment and land-use covariates explained 21.6% and 6.1% of the variance, respectively, whereas the community-level random effects captured only residual co-occurrence patterns among the species (0.2%). Surprisingly, we found no differences in the relative Residual correlations are shown only if pairwise associations are supported by at least 75% posterior probability (the remaining associations are in white) (Ovaskainen et al. 2017). Positive and negative associations denote species pairs co-occurring in more and less sites than predicted by random chance and their niches, respectively (Little & Altermatt 2018). Species (here, n = 32 since we only represented species with at least one positive or negative pairwise association) have been ordered after hierarchical clustering with Ward's criterion (Ward 1963) on pairwise co-occurrence values. (b) Chord diagram illustrating positive (red) and negative (blue) species associations with at least 95% posterior probability. Associations are indicated by intersecting chords, i.e., the residual correlation of species j 1 with species j 2 . For clarity, species were classified as helophytes and hydrophytes (green and yellow segments, respectively). SeeSupporting Information S3 for species abbreviations. contributions of the physico-chemical environment, land-use changes, the prior presence of plants and random effects to the occupancy patterns of species classified as helophytes and hydrophytes ( Fig. 2b; Factorial ANOVA, F = 0.1, p$1).
In general, there was a strong posterior support for many β parameters being positive (red) or negative (blue), i.e., there were relatively strong signals of environmental niche filtering (Fig. 3a). Most species exhibited positive responses to lake area and Secchi depth (except for some helophytes, e.g., Carex diandra and Typha angustifolia for lake area and Eleocharis spp. and Schoenoplectus lacustris for Secchi depth), and the opposite pattern was found for elevation and maximum depth (with exceptions including, e.g., Potamogeton perfoliatus and Sparganium natans for elevation and S. lacustris and Typha latifolia for maximum depth). On the other hand, some species exhibited preference for basic waters (e.g., Ceratophyllum demersum and Ranunculus spp.), whereas other plants exhibited avoidance of basic pH and thus preference for more acidic conditions (e.g., Phragmites australis and Potamogeton crispus). Importantly, we did not find a strong phylogenetic influence on how the species responded to the explanatory variables, as the 95% confidence interval of parameter ρ was 0.05 (min, max = 0.00 − 0.14) (Supporting Information S8). This means that species traits controlling for environmental preferences were not phylogenetically constrained for lake plants. We observed a positive association between lake area and Secchi depth with individual potential size, suggesting that small-sized plant species tended to be less common in large and deep lakes. Moreover, seed-dispersing plants were more likely to occur in deeper lakes (Fig. 3b). Note, however, that the functional traits studied explained a modest proportion of the variation (on a scale from 0 to 1) in both the species niches (0.10) and the species occurrences (0.15).
After we accounted for the physico-chemical environment, land-use changes, the prior presence of plants and random effects (sensu Little and Altermatt, 2018), the EPR model provided evidence for associations between different species pairs (Fig. 4a). For example, Potamogeton praelongus and P. perfoliatus, as well as Alisma plantago-aquatica and Hippuris vulgaris, were found together more (i.e. residual correlation > 0.95) than what our models predicted by random chance or even considering their niches (Fig. 4b).
[Correction added on 13 September 2021, after first online publication: the previous two sentences were reworded to avoid duplicating information contained in an original source, which has now been properly cited.] Similarly, Lemna trisulca, Eleocharis palustris complex, and Sparganium emersum were particularly likely to be found at the lakes where Spirodela polyrhiza, Eleocharis acicularis and Sagittaria sagittifolia were also detected in previous surveys, respectively. Conversely, Nuphar lutea and Utricularia minor rarely co-occurred with Utricularia vulgaris (despite somewhat similar habitat requirements) and the invasive Elodea canadensis. Other species pairs were also found to experience heterospecific attraction (e.g., Crassula aquatica and E. acicularis) and inhibition (e. g., Eleocharis spp. and T. angustifolia). [Correction added on 13 September 2021, after first online publication: the previous sentence was reworded to avoid duplicating information contained in an original source.].

Discussion
Although much is known about the spatial variation of ecological communities in terrestrial (Meynard et al. 2013), marine (Moritz et al. 2013) and freshwater (Heino et al. 2015) ecosystems, community assembly history still remains understudied (Datry et al. 2016). [Correction added on 09 March 2021, after first online publication: this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] To overcome this shortage, we have taken the advantage of Hierarchical Modeling of Species Communities (HMSC) and long-term data on the ecology of boreal lake plants to examine the imprints of historical contingency via priority effects and environmental change on the assembly of communities across decades. This approach has steadily become popular (e.g., Tikhonov et al. 2017;Tikhonov et al. 2020a) in statistical literature, and perhaps most importantly, it proved to be a useful heuristic framework that allowed us to compare and relate different metacommunity dynamics across a fivedimensional parameter space (here, represented by the linear predictors L, the species niches β, the influence of traits on species niches γ, residual species associations Ω and the influence of phylogeny on species niches ρ).
To the best of our knowledge, our study provides the first empirical evidence that historical contingency via priority effects outweighs environmental change and can persist across multiple decades in lake systems. More specifically, variance in community composition explained by environmental changes was relatively modest, partially refuting our first hypothesis (H1) that environmental changes (here, the local physico-chemical environment and land-use covariates) primarily drive compositional variation among communities. At the same time, these findings confirmed our second expectation (H2) that species occupancy strongly depends on the community of plants already established via priority effects, thereby underlying the importance of history in patterning metacommunity dynamics. In a first tier of analyses, however, we also found evidence for relatively strong signals of longterm environmental niche filtering in the functional traits of the species. As we expected (H3), our results revealed that time-series models with high-resolution community data provide a more comprehensive understanding than snapshot observations. Contrary to our expectations (H4), we found no big differences in the relative contributions of assembly mechanisms to the occupancy patterns across all species and species classified as helophytes or hydrophytes. Finally, we report evidence supporting the existence of both positive and negative species-to-species associations after controlling for the shared responses to the environment, predecessor species and random effects, thereby confirming our last hypothesis (H5). Our main message is simple: historical contingencies via priority effects have profound implications for interpreting long-term community dynamics observed in nature. We therefore advocate the expansion of the metacommunity framework to consider priority effects as another potentially important factor explaining metacommunity dynamics.
Historical contingency via priority effects is key to understanding metacommunity dynamics.
Priority effects have been shown to affect community assembly across a wide range of terrestrial ecosystems, particularly in Pyrenean forests (Kouba et al. 2015) and Scandinavian grasslands (Cousins 2009). In freshwater ecosystems, historical influences are often investigated through the use of experimental mesocosms (Chase 2010) and palaeoecological records (Hundey et al. 2014), usually with the aim to reconstruct former ecosystems without testing their influence on contemporary biodiversity (but see Mergeay et al. 2011). As a consequence, our current theoretical and empirical understanding of historical contingencies usually neglects priority effects over longer timeframes. Our findings demonstrate, however, that long-term priority effects can profoundly shape community assembly under the influence of environmental change in lake ecosystems. In this regard, we show that the unexplained variance of traditional analytical tools, such as the variation partitioning approach in the context of constrained ordinations (Alahuhta et al. 2018b), could be assigned to priority effects and argue that community assembly history provides an underappreciated perspective on context dependency in ecology (see De Meester et al. 2016). We therefore strongly believe that ignoring historical contingencies, even if they cannot be explicitly examined using conventional statistical tools, may compromise our understanding on metacommunity dynamics and decrease the accuracy and performance of predictive models. In other words, although metacommunity theory explains most niche-based and spatial-based signals on ecological communities (Brown et al. 2017), our results now suggest a more nuanced role of historical contingency via long-term priority effects.
More specifically, the priority effect exerted by incumbent communities that we detected here might have been strongly enhanced by vegetative strength and the storage effects that is created by the production of new, large buffer of long-viable propagules in the local propagule bank (in lake plants, for example, total seedling densities and vegetative plant parts can reach ca. 3300 propagules m −2 and 0.8 complete plants m −2 , respectively ;Kautsky 1990;Smart and Dick 1999). Indeed, previous analysis on temporal patterns of lake plants, which were mostly based on local scale of observations, attributed community persistence to high recruitment rates (e.g., Capers 2003) and the high regeneration abilities of vegetative fragments (Barrat-Segretain et al. 1998). In this regard, regenerative strategies, involving plant fragments, deeply anchored rhizome networks and buds, are already known to efficiently predict the underlying long-lasting stability of aquatic vegetation in disturbed habitats (e.g., Combroux et al. 2001). Although the mechanism underlying priority effects remains unclear in our study, we propose that the colonization of early-arriving species is consolidated by their regeneration success and the propagule pressure they produce, and this may have extended the impact of their priority effect in time (Mergeay et al. 2011), as observed in our data. Indeed, this reasoning may also help to explain the only slightly different patterns in spatial beta diversity across decades found for the same dataset of lake plant communities (Lindholm et al. 2020a).
Storage effects, vegetative resistance and high recruitment rates may have contributed to the weakening of environmental niche filtering by historical contingencies (Tucker and Fukami 2014). Although most important environmental covariates were likely included in our study models, we cannot exclude the possibility that missing explanatory variables might have constrained species niches and long-term community dynamics of these lake plants (see Little & Altermatt, 2018 for a similar reasoning on freshwater macroinvertebrates). [Correction added on 13 September 2021, after first online publication: the previous sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] However, we found a relatively strong posterior support for many β parameters of our analysis (Fig. 3a), suggesting that species and trait compositions were somehow affected by environmental variation. Specifically, the role of lake area can be attributed to the separate, but not mutually exclusive effects of increased area per se and habitat diversity (more habitats and microhabitats appear when ecosystem size increases; Heino 2013b) on occupancy patterns, whereas the relationships between plants and altitude might be associated to the climate and local habitat gradients between lowland and upland lake groups (Alahuhta et al. 2018b). Similarly, the preference of some species for basic and acidic conditions can be explained by how pH affects bicarbonate concentration in freshwaters (Iversen et al. 2019). However, since only summer pH values were available in our dataset, these findings should be interpreted with caution, not least because summer pH values can fluctuate greatly according to the intensity in the primary production of microalgae and submerged plants, particularly in more eutrophic waters (Jeppesen et al. 1998). Indeed, this might explain why P. australis and P. crispus exhibited avoidance of basic pH in our study area despite the fact that these plant species usually show wide ecological amplitude (Cirujano et al. 2014).
The relationships between potential plant size, lake area and Secchi depth have been discussed elsewhere (Lacoul and Freedman 2006), suggesting that larger and deeper lakes tend to support relatively tall species with elongated steams. On the other hand, the rather weak phylogenetic signal detected in our study underscores the idea that lake plants have a high phenotypic plasticity both morphologically and ecologically and, at the same time, agree with a recent study (García-Girón et al. 2020a) that found no support for the existence of phylogenetic signals in plant convergent lineages worldwide. Perhaps more importantly, due to high phenotypic plasticity (Lacoul and Freedman 2006), we suggest that lake plants probably adapted to the relatively modest changes in the environment during the 70-year study period, thereby potentially explaining the relatively low variance explained by niche-based processes in metacommunity dynamics.
Variable species-to-species associations at the metacommunity level.
Many studies have concluded that biotic interactions have an important role in improving biodiversity predictions (e.g., Kissling and Schleuning 2015;García-Girón et al. 2020c) and, in line with a recent replacement experiment at fine spatial grains (Hao et al. 2013), our results support this conclusion for lake plants. After controlling the fixed and community-level random effects, two main species co-occurrence groupings emerged showing both positive interspecific interactions and mutually exclusive presences. More specifically, there were a host of positive interactions between hydrophytes (e.g., Potamogeton spp. and H. vulgaris) and helophytes (e.g., A. plantago-aquatica), which may be related to the differentiated use of nutrients from the water column and sediments (Lycarião and Dantas 2017), thereby reducing the effects of niche-shrinking factors and enabling species to exploit a greater portion of the available resources (i.e., facilitation, Table 1). Moreover, pre-existing plant canopy (e.g., S. emersum) could have served as a substrate for small free-floating plants (e.g., L. trisulca and S. polyrhiza) lacking anchoring rooting capability, which probably promoted plant growth and persistence by reducing current and wind wave disturbances (Santos et al. 2011). On the other hand, several species belonging to the same life form were less prone to co-occur than our EPR model predicted. [Correction added on 13 September 2021, after first online publication: the previous sentence was reworded to avoid duplicating information contained in an original source.] For instance, certain opportunistic invasive species of Hydrocharitaceae (e.g., E. canadensis) are particularly effective at appropriating nutrients and space, particularly bicarbonate at low-light conditions (Lacoul and Freedman 2006), thereby excluding other hydrophytes (e.g., Utricularia spp.) from shallow habitats (i.e., inhibition, Table 1). In addition, the rapid growth, large seed production and large development of some emergent species (Typha spp.) were likely to eliminate several smaller neighboring species of spikerushes (Eleocharis spp.). In this study, the greater competitiveness of Typha spp. was probably mediated by the release of allelochemicals and its relatively larger allocation of biomass to rhizomes (Lacoul and Freedman 2006). Perhaps more importantly, these findings are likely to be consistent with the suggestion that functional divergence in lake plants (e.g., life forms) can switch the inhibitionfacilitation balance at the metacommunity level (Song et al. 2017) and contrast with previous studies that divided lake plant associations on the basis of the trophic state of the habitats (Toivonen and Huttinen 1995). However, the existence of the species-to-species associations highlighted here should be examined in more detail using manipulative in-lake experiments (Dormann et al. 2018), not least because biotic interactions inferred from HMSC can only be considered as working hypotheses (Tikhonov et al. 2017). [Correction added on 13 September 2021, after first online publication: the previous sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.]

Caveats
There are four potential caveats in our study that must be carefully explained due to common limitations of the modeling approach. First, it is possible that missing environmental covariates could affect statistical inference (Ovaskainen et al. 2017a), including species-to-species association networks. Although major environmental gradients for boreal lake plants have been covered in the models (Alahuhta et al. 2015;Lindholm et al. 2020a,b), more detailed data on climate, nutrient concentrations and sediment properties would have benefitted our study. However, as the land use and nutrient concentrations are closely related in lake systems (Ecke 2009) and latent factors may represent hidden or unmeasured environmental covariates (Warton et al. 2015), this should not be a critical issue in our study (Lindholm et al. 2020b). Another difficulty in identifying historical components is that the abiotic environment of lakes and their plant communities both influence each other. This makes it extremely difficult to separate abiotic-mediated signals from the priority effects exerted by incumbent communities, especially in observational studies (Ovaskainen et al. 2017a,b;Norberg et al. 2019b). Third, we used a presence-absence community matrix in our HMSC models, whereas data on species' relative abundances might more easily reflect empirical patterns created by historical contingencies via priority effects (Ovaskainen et al. 2017a), as has been evidenced for the significant negative impacts of the invasive E. canadensis on native lacustrine floras across European countries (Mjelde et al. 2012). An alternative solution for future studies would be to mathematically account for heterospecific inhibition and attraction by means of density-dependent models (Ovaskainen et al., 2017a). [Correction added on 13 September 2021, after first online publication: the previous two sentences were reworded to avoid duplicating information contained in the originally cited source.] However, we strongly believe this is a minor problem since García-Girón et al. (2020c) recently refuted Grinnellian ideas that species occurrences are less likely to capture biotic interactions in lentic systems. Finally, interspecific interactions among trophic guilds, e.g., herbivory stress and cascading effects in food webs (Carpenter et al. 2001), might also alter the assembly mechanisms and species-to-species associations we unraveled here. For example, the introduction and spread of the non-native muskrat (Ondatra zibethicus L.) in the present study area have probably caused changes in lake plant communities due to nesting, housing behavior and feeding habits (Toivonen and Meriläinen 1980), thereby creating different successional stages among lakes across the landscape. These potential caveats, however, do not dismiss the validity and significance of our results, but they highlight the risk of too lenient interpretation of observational data (Dormann et al. 2018).

Conclusions and implications
Our study is a proof-of-concept towards a more historically integrative approach in metacommunity ecology, but it is certainly only a first step. While this field of research has been fostered by key conceptual advances (Logue et al. 2011), empirical research based on snapshot data of one time point has often revealed context dependency in community-environment relationships (Datry et al. 2016, see Heino et al. 2015 for a review in aquatic systems). [Correction added on 09 March 2021, after first online publication: part of this sentence was reworded to avoid duplicating information contained in an original source, which has now been properly cited.] This is partially because analytical routines to infer community assembly processes are not keeping pace and still hamper our ability to synthesize knowledge. Using the Bayesian inference scheme of Hierarchical Modeling of Species Communities (HSMC; Ovaskainen et al. 2017a), this work provides evidence that the unexplained variance of traditional analytical tools, such as the variation partitioning approach, could be assigned to historical components. These could potentially lead to priority effects that can leave their imprint on species distributions via vegetative regeneration, long-lasting propagule pressure and plant recruitment, thereby resulting in historically contingent communities across multiple decades. Our study therefore demonstrates that incorporating long-term perspectives offers considerable untapped potential for increasing our understanding of metacommunity dynamics. Further, our findings revealed relatively strong signals of long-term environmental niche filtering of species and trait composition and supported the existence of both positive and negative species-to-species associations in lake plants. Consequently, we argue that historically integrative inferences can be useful to compare studies on different taxa, ecosystems and regions, and should contribute to synthesis in ecology.
Studies of this nature provide managers and conservationists the necessary information to incorporate regional and historical contingencies, rather than purely local conditions, in their management schemes. This is because geographical regions differ in their history (e.g., the order and timing of immigration and the abiotic and biotic settings) and therefore a one-size-fits-all management approach might not be efficient across regions. Moreover, our results might also be exploited in ecological restoration projects by selectively inducing long-term priority effects of target species to buffer against successful colonization and establishment of alien aquatic plant species. This is a particularly relevant application because, unlike riparian vegetation, submerged and floating-leaved plants are typically expected to colonize independent of human interference after restoration (Zefferman 2015). A tentatively further benefit of this historical approach is that areas within a region where plant communities are more susceptible to environmental change can be managed in a spatio-temporally explicit way. That is, lakes may be identified that should receive conservation and restoration priority which may allow for a more targeted allocation of limited management resources (Heino et al. 2020).
Since comprehensive spatio-temporal assessments of biodiversity are becoming available, we encourage future studies to consider historical contingency via priority effects as a potentially important assembly mechanism that might counteract environmental change on metacommunity dynamics across decades. Environmental change has been relatively modest in our study lakes, so more research is needed from metacommunities with strong human impact to find out whether priority effects dominate in such systems, too. By doing so, we should collectively advance the goal of understanding the circumstances under which assembly mechanisms are of greatest importance for metacommunity dynamics while navigating the Anthropocene.