Good things take time—Diversity effects on tree growth shift from negative to positive during stand development in boreal forests

Long‐term grassland biodiversity experiments have shown that diversity effects on productivity tend to strengthen through time, as complementarity among coexisting species increases. But it remains less clear whether this pattern also holds for other ecosystems such as forests, and if so why. Here we explore whether diversity effects on tree growth change predictably during stand development in Finland's boreal forests. Using tree ring records from mature forests, we tested whether diameter growth trajectories of dominant tree species growing in mixture differed from those in monoculture. We then compared these results with data from the world's longest running tree diversity experiment, where the same combinations of species sampled in mature forests were planted in 1999. We found that diversity effects on tree growth strengthened progressively through time, only becoming significantly positive around 20 years after seedling establishment. This shift coincided with the period in which canopy closure occurs in these forests, at which time trees begin to interact and compete above‐ground. These temporal trends were remarkably consistent across different tree species sampled in mature forests, and broadly matched growth responses observed in the much younger experimental plots. Synthesis. Our results mirror those from grassland ecosystems and suggest that canopy closure is a key phase for promoting niche complementarity in diverse tree communities. They also provide a series of testable hypotheses for the growing number of tree diversity experiments that have been established in recent years.


| INTRODUCTION
All things being equal, diverse tree communities generally sequester and store more carbon from the atmosphere than their species-poor counterparts (Fichtner et al., 2018; Morin, Fahse, Scherer-Lorenzen, & Bugmann, 2011; Vilà et al., 2013). Yet underlying this overall positive relationship between tree diversity and productivity is a considerable degree of spatial and temporal variation in the strength of diversity effects on tree growth (Forrester, 2014; Jucker et al., 2016; Jucker, Bouriaud, Avăcăriei, Dănilă, et al., 2014; Searle & Chen, 2020). Recent work has highlighted how differences in climate, soils, canopy structure and species composition account for much of the spatial variation in the strength and direction of these diversity effects (Baeten et al., 2019; Forrester, 2014; Jucker et al., 2016; Ratcliffe et al., 2016; Toïgo et al., 2015). However, considerably less is known about how and why diversity effects on tree growth change through time during stand development (Taylor, Gao, & Chen, 2020; Zhang, Chen, & Reich, 2012).
However, unlike in grassland ecosystems where community dynamics are relatively fast, in forests the process of canopy filling is a slow one which unfolds over the course of multiple successive growing seasons during which neighbouring trees expand their crowns and begin competing for light. Consequently, overyielding (whereby species in mixture outperform those in monoculture) may take years to manifest in regenerating stands. This may help explain why, in contrast to observational studies conducted in mature forests, most tree diversity experiments established in temperate and boreal forests in the last 5-10 years have so far found little evidence of overyielding (Haase et al., 2015; Verheyen et al., 2016; Grossman et al., 2018; Kambach et al., 2019; although see Williams et al., 2017). The problem is that testing this hypothesis would require long-term, annually resolved growth records for trees exposed to different levels of diversity, data which are not typically recorded in forests.
Here we overcome this challenge by using tree ring records to reconstruct the growth trajectories of individual trees from stands that span a tree diversity gradient ranging from monocultures to 3-species mixtures. Using this dataset, we explore how diversity effects on tree growth change during the early stages of stand development in regenerating boreal forests in Finland. We hypothesize that diversity effects should become increasingly positive with time and that this shift should coincide with the period of canopy closure, which occurs approximately 20-25 years after a stand-replacing disturbance in these forests (Angelstam & Kuuluvainen, 2004; Shorohova, Kuuluvainen, Kangur, & Jõgiste, 2009). To complement this analysis, we then compare these growth responses with those observed in the Satakunta experiment in Finland, one of the world's longest running tree diversity experiments, where the same combinations of species we sampled in closed-canopy forests were planted two decades ago. We expect that temporal trends in the strength of diversity effects on tree growth in these experimental plots should mirror those observed in closed-canopy forests. However, because of the relatively young age of trees in the Satakunta experiment, overyielding will be less evident.

| Overview
To explore how diversity effects on tree growth vary through time, here we take advantage of two complementary research platforms: the FunDivEUROPE plot network, which captures closed-canopy forests characterized by different levels of tree diversity, and the Satakunta tree diversity experiment. Below we provide an overview of these two platforms before detailing the approach we used to model the effects of diversity on tree growth. For a comprehensive description of the FunDivEUROPE project and of the Satakunta experiment see Baeten et al. (2013) and Verheyen et al. (2016) respectively.
Note that while the FunDivEUROPE network spans multiple sites across Europe, here we focus exclusively on the site in Finland. This is for two main reasons. Firstly, this site lies less than 400 km east of Satakunta (see Appendix S1), which is one of the longest running tree diversity experiments anywhere in the world. The two platforms share the same target tree species (which include Pinus sylvestris, Picea abies and Betula pendula) and replicated plots with all possible combinations of these species are found at both sites (Table 1). This provides a unique opportunity to compare tree growth responses to diversity in natural and experimental forests in a way that would be hard to do anywhere else. Secondly, the FunDivEUROPE plots in Finland all consist of even-aged stands that have regenerated naturally following clear cutting in the past 40-60 years (Table 1; Appendix S1). This makes comparing growth trajectories through time and across plots much simpler than would be the case in older, uneven-aged stands.

| FunDivEUROPE plot network
As part of the FunDivEUROPE project, six study sites were established across Europe, including one in the region of Northern Karelia in eastern Finland. At this site, 28 permanent plots (30 × 30 m in size) with all possible combinations of the three locally dominant tree species (P. sylvestris, P. abies and B. pendula) were established in 2012 in closed-canopy forest stands. This includes seven possible species combinations (three monoculture treatments, three 2-species mixtures and one 3-species mixture), each of which was replicated at least three times (Table 1). This full factorial design mimics that of most tree diversity experiments, thus allowing diversity effects to be teased apart from identity and compositional effects. To enable statistically rigorous comparisons across diversity levels, the final list of 28 plots was selected from a wider pool of candidates following a screening procedure that aimed to maximize community evenness while minimizing differences in topography, soil properties, climate, stand development stage and management history among plots (for details see Baeten et al., 2013). In particular, all plots were established in even-aged stands that regenerated naturally following clear cutting and have not been actively managed. Stand age varied between 40 and 60 years, resulting in predictable differences in stem density and mean tree size among plots (Table 1).
Importantly, however, we found no evidence that these differences in stem density and mean tree size were related to variation in tree diversity among stands when the plots were established (see Appendix S2 for details).

TABLE 1 Summary of the FunDivEUROPE plot network and the Satakunta tree diversity experiment
Notes:
The Satakunta experiment includes a total of 114 plots (38 plots × 3 blocks). Only plots which feature combinations of P. sylvestris, P. abies and B. pendula were used for this study (42 plots; 14 × 3 blocks).
c For the FunDivEUROPE plots stem densities include all trees with D ≥ 7.5 cm in the plot. For the Satakunta experiment, 169 trees were initially planted in each plot (13 × 13 rows with seedlings 1.5 m apart).
d Calculated as $\sqrt{\sum D^{2}/n}$, where n is the number of stems with D ≥ 7.5 cm in the plot. See Appendix S2 for the relationship between stem density, quadratic mean stem diameter and stand age in the FunDivEUROPE plots.
e In the FunDivEUROPE network B. pendula monocultures were replicated three times, and the 2-species mixture of B. pendula and P. sylvestris was replicated four times.
f Tree diameters in the Satakunta plots were measured in 2004, 2009, 2011 and 2016.
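The stand-level summaries reported in Table 1 can be illustrated with a short sketch. The function names and the example diameters below are hypothetical and are not taken from the study; only the 7.5 cm threshold, the 30 × 30 m plot size and the quadratic mean formula come from the text.

```python
import math


def quadratic_mean_diameter(diameters_cm, min_diameter_cm=7.5):
    """Quadratic mean stem diameter, sqrt(sum(D^2) / n).

    Only stems with D >= min_diameter_cm are counted, matching the
    threshold used for the FunDivEUROPE plots.
    """
    stems = [d for d in diameters_cm if d >= min_diameter_cm]
    if not stems:
        raise ValueError("no stems at or above the diameter threshold")
    return math.sqrt(sum(d * d for d in stems) / len(stems))


def stems_per_hectare(n_stems, plot_side_m=30.0):
    """Scale a stem count from a square plot to a per-hectare density."""
    return n_stems * 10_000.0 / (plot_side_m ** 2)


# Hypothetical diameters (cm); the 6.0 cm stem falls below the threshold.
example_diameters = [6.0, 8.0, 10.0, 12.0, 20.0]
qmd = quadratic_mean_diameter(example_diameters)
density = stems_per_hectare(n_stems=90)
print(f"quadratic mean diameter = {qmd:.2f} cm, density = {density:.0f} stems/ha")
```

Because the quadratic mean weights large stems more heavily than the arithmetic mean, it tracks stand basal area more closely, which is presumably why it is the size metric reported alongside stem density.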

| Reconstructing temporal growth trends from tree ring data
Within each FunDivEUROPE plot, all stems ≥7.5 cm in diameter were mapped, identified to species and permanently marked (n = 2,146 stems). For each stem, we measured diameter at 1.3 m above ground (D, in cm) using diameter tape and tree height (H, in m) using a vertex hypsometer (Haglöf AB). To reconstruct the growth trajectories of individual trees, in September 2012 we extracted bark-to-pith increment cores from a subset of trees in each plot following a size-stratified random sampling approach. Specifically, we cored 12 trees per species in monoculture plots and eight trees per species in all mixture plots (n = 430 cores). This approach ensures that the tree size distribution of each plot is adequately captured by the subsample without needing to core all trees in a plot (Nehrbass-Ahles et al., 2014). This is important, as growth trajectories and responses to competition of canopy dominant and suppressed trees can vary considerably (Luo, McIntire, Boisvenue, Nikiema, & Chen, 2020).
Wood cores were extracted using a 5.15-mm diameter increment borer (Haglöf AB) and stored in polycarbonate sheeting to air dry. Cores were then mounted on wooden boards and sanded with progressively finer grit sizes before being digitally scanned using a high-resolution flatbed scanner (2,400 dpi optical resolution).
From the scanned images we measured annual radial increments for all cored trees using the software CDendro (Cybis Elektronik & Data). Individual chronologies were crossdated against species-level reference curves generated by pooling all samples belonging to a given species to detect any misplaced or missing ring boundaries. From these chronologies we calculated the annual diameter increment of each cored tree (D_incr, in cm/year), as well as its age.
For trees in which cores did not include the pith, we estimated the number of missing rings by first calculating the distance to pith from the innermost visible ring using the pith locator tool in CDendro and then dividing this distance by the mean increment of the five innermost rings (Rozas, 2003). Finally, the true age of each tree was adjusted to account for the number of years needed for trees to reach a height of 1.3 m at which cores were extracted. We did this by fitting species-specific height-age functions using data from the Satakunta experiment (see section below and Appendix S3 for details). We chose to use D_incr to represent tree growth instead of basal area increments because the former showed a simpler relationship with tree age which we were able to capture using well-established nonlinear plant growth models (see below and Appendix S4 for details). The disadvantage of D_incr is that, compared to basal area increments, it is a poorer surrogate of whole-tree biomass growth. We note, however, that replacing D_incr with basal area increments in our analysis did not affect our results (Appendix S4).
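The age correction described above can be sketched as follows. This is a minimal illustration under stated assumptions: ring widths are assumed to be ordered from the pith side outwards, the function names are hypothetical, and the number of years needed to reach coring height is supplied as a plain number rather than being derived from the species-specific height-age functions used in the study.

```python
def estimate_missing_rings(distance_to_pith_mm, ring_widths_mm, n_inner=5):
    """Estimate rings missing between the innermost visible ring and the pith.

    ring_widths_mm must be ordered from the innermost (pith side) ring
    outwards. The distance to pith is divided by the mean width of the
    n_inner innermost rings, then rounded to a whole number of rings.
    """
    inner = ring_widths_mm[:n_inner]
    if not inner:
        raise ValueError("at least one measured ring is required")
    mean_increment = sum(inner) / len(inner)
    return round(distance_to_pith_mm / mean_increment)


def tree_age(n_measured_rings, missing_rings, years_to_coring_height):
    """Total age: measured rings + missing rings + years to reach 1.3 m."""
    return n_measured_rings + missing_rings + years_to_coring_height


# Hypothetical core: 40 measured rings, pith missed by 7.5 mm.
widths = [2.0, 2.5, 3.0, 2.5, 2.5] + [1.8] * 35
missing = estimate_missing_rings(distance_to_pith_mm=7.5, ring_widths_mm=widths)
age = tree_age(len(widths), missing, years_to_coring_height=5)  # 5 is illustrative
print(missing, age)
```

Using only the innermost rings for the divisor matters because juvenile rings are typically wider than later ones, so a whole-core mean would tend to overestimate the number of missing rings.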

| Satakunta tree diversity experiment
The Satakunta tree diversity experiment was established in the Satakunta region of southwestern Finland in 1999. It forms part of TreeDivNet, a global network of tree diversity experiments, of which it is the longest running study and the only one in the boreal forest biome (Verheyen et al., 2016). The experiment includes 114 plots (20 × 20 m in size) in which different combinations of five target tree species were planted in clear-cut areas (Table 1). Diversity treatments include monocultures and 2-, 3- and 5-species mixtures. Plots are grouped into three blocks, with all species compositions replicated two times within each block. The target species include P. sylvestris, P. abies and B. pendula, as well as Larix sibirica and Alnus glutinosa. For the purposes of this study only plots which feature combinations of P. sylvestris, P. abies and B. pendula were analysed (42 plots; 14 × 3 blocks). One hundred and sixty-nine seedlings were planted in each plot (13 × 13 rows with seedlings 1.5 m apart). Seedlings of P. abies were 2 years old at the time of planting, while those of P. sylvestris and B. pendula were 1 year old. An equal number of seedlings was planted for each species in the mixture treatments, but planting locations inside the plots were assigned randomly.

| Tree growth measurements
Tree growth was monitored at four points in time during the experiment (2004, 2009, 2011 and 2016; Table 1).

| Comparing alternative tree growth models
Having reconstructed diameter growth trends from the tree ring records, we then used these data to model the growth trajectory of trees across the FunDivEUROPE plots to determine how diversity effects on tree growth vary through time (Figure 1). We started by comparing different diameter growth models using the approach outlined in Paine et al. (2012). Because diameter growth tends to vary nonlinearly with tree age (with initial increases in growth rates followed by a decline and levelling-off phase), we used nonlinear regression to model changes in growth rate through time. All models were fit using the nls function in R (R Core Development Team, 2019). Following a comprehensive comparison of alternative models based on AIC (Appendix S4), we settled on a modified version of the Ricker function (Bolker, 2008) to capture how D_incr varies as a function of tree age (A; in years) in Equation (1), where α, β and γ are parameters to be estimated from the data using a nonlinear least squares approach. This flexible function outperformed all other nonlinear plant growth models we tested (Appendix S4).
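As an illustration of how candidate growth models fitted by least squares can be ranked by AIC, the sketch below uses the standard Gaussian-error form, n ln(RSS/n) + 2k, which omits an additive constant that is identical for all models fitted to the same data and therefore does not affect the ranking. The observations and predictions are hypothetical, and the function name is not from the study.

```python
import math


def aic_least_squares(observed, predicted, n_params):
    """AIC (up to a shared additive constant) for a least-squares fit.

    Assumes independent Gaussian errors. The residual variance counts
    as one extra estimated parameter, hence k = n_params + 1.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    k = n_params + 1
    return n * math.log(rss / n) + 2 * k


# Hypothetical annual diameter increments (cm/year) at five ages.
observed = [0.2, 0.6, 0.8, 0.7, 0.5]
# Predictions from a hypothetical 3-parameter hump-shaped curve.
pred_curve = [0.25, 0.55, 0.8, 0.7, 0.5]
# Predictions from a 1-parameter constant (mean-only) model.
pred_constant = [0.56] * 5

aic_curve = aic_least_squares(observed, pred_curve, n_params=3)
aic_constant = aic_least_squares(observed, pred_constant, n_params=1)
print(f"curve: {aic_curve:.2f}, constant: {aic_constant:.2f}")
```

The model with the lower value is preferred; the 2k term penalises the extra parameters of the more flexible curve, so it is only selected if the reduction in residual sum of squares is large enough.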
Integrating Equation (1) provides a function for modelling cumulative diameter increments through time (Equation 2), where α, β and γ are the same parameters estimated for Equation (1). Equation (2) allows the diameter of a tree to be estimated based on its age. This is particularly convenient as it provides a way to directly compare growth trends in the FunDivEUROPE plots to those observed at Satakunta, where tree growth increments were not measured on an annual basis.
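The link between an annual increment curve and cumulative diameter can be sketched numerically. The functional form used here, α·A^β·exp(−γ·A), is an assumption chosen for illustration: it is one common three-parameter generalisation of the Ricker function and may differ from the parameterisation used in the study. The parameter values are hypothetical, and the integral is approximated with the trapezoid rule rather than a closed-form expression.

```python
import math


def ricker_increment(age, alpha, beta, gamma):
    """Assumed Ricker-type increment curve: alpha * age^beta * exp(-gamma * age).

    Requires beta >= 0 so that the curve is defined at age 0.
    """
    return alpha * age ** beta * math.exp(-gamma * age)


def cumulative_diameter(age, alpha, beta, gamma, steps_per_year=100):
    """Integrate the increment curve from 0 to age with the trapezoid rule."""
    n = max(1, int(round(age * steps_per_year)))
    h = age / n
    total = 0.5 * (ricker_increment(0.0, alpha, beta, gamma)
                   + ricker_increment(age, alpha, beta, gamma))
    for i in range(1, n):
        total += ricker_increment(i * h, alpha, beta, gamma)
    return total * h


# Hypothetical parameters; with beta = 1 the curve peaks at age 1 / gamma.
alpha, beta, gamma = 0.2, 1.0, 0.1
for a in (5, 10, 20, 30):
    inc = ricker_increment(a, alpha, beta, gamma)
    diam = cumulative_diameter(a, alpha, beta, gamma)
    print(f"age {a:2d}: increment = {inc:.3f} cm/year, diameter = {diam:.2f} cm")
```

Working with the cumulative curve is what makes the tree ring data comparable with repeat-census data: a diameter measured at a known age can be set directly against the integrated increment curve evaluated at that age.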

| Testing the effects of diversity on tree growth through time
Having identified a growth function that captures age-related variation in tree growth for all three study species, we then used this model to quantify how diversity effects on tree growth change through time.
To do this, for each species we first fit separate growth models for trees growing in all possible species combinations (i.e. monoculture, the three possible 2-species mixtures and the 3-species mixture). We then used the fitted models to predict D_incr and D as a function of tree age for each of these treatments and calculated the differential between tree growth trajectories in monoculture and the mixtures through time (see Figure 1 for a schematic representation). This allowed us to not only test whether trees in mixture grow faster than those in monoculture, but also determine at what age diversity effects emerge. For the purposes of model fitting we restricted the analysis to include only the first 30 years of growth, as beyond this threshold the number of trees with complete chronologies dropped off sharply (Appendix S1). In order to test whether growth differences between treatments were statistically significant, we used Monte Carlo simulations as implemented by the predictNLS function in the propagate R package to estimate 95% confidence intervals for each fitted model (Spiess, 2018).
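The logic of comparing growth trajectories and identifying the age at which diversity effects become significantly positive can be sketched with a simplified Monte Carlo procedure. This is a stand-in under several assumptions and not a reimplementation of predictNLS: parameters are drawn from independent normal distributions, the increment curve is an assumed Ricker-type form, the intervals are computed on the mixture-minus-monoculture difference, and all parameter estimates and standard errors are hypothetical.

```python
import math
import random


def ricker_increment(age, alpha, beta, gamma):
    """Assumed Ricker-type increment curve (illustrative form only)."""
    return alpha * age ** beta * math.exp(-gamma * age)


def draw_parameters(rng, estimates):
    """Draw one parameter set; estimates maps name -> (mean, std. error)."""
    return {name: rng.gauss(mean, se) for name, (mean, se) in estimates.items()}


def difference_ci(age, mono, mix, n_draws=2000, seed=1):
    """95% Monte Carlo interval for mixture minus monoculture growth at one age."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_draws):
        p_mono = draw_parameters(rng, mono)
        p_mix = draw_parameters(rng, mix)
        diffs.append(ricker_increment(age, **p_mix)
                     - ricker_increment(age, **p_mono))
    diffs.sort()
    lower = diffs[int(0.025 * (n_draws - 1))]
    upper = diffs[int(0.975 * (n_draws - 1))]
    return lower, upper


def first_significant_age(mono, mix, ages):
    """First age at which the whole interval lies above zero, else None."""
    for age in ages:
        lower, _ = difference_ci(age, mono, mix)
        if lower > 0:
            return age
    return None


# Hypothetical fits: the mixture starts slower but declines less steeply.
mono_fit = {"alpha": (0.25, 0.004), "beta": (1.0, 0.005), "gamma": (0.12, 0.001)}
mix_fit = {"alpha": (0.20, 0.004), "beta": (1.0, 0.005), "gamma": (0.10, 0.001)}

onset = first_significant_age(mono_fit, mix_fit, ages=range(1, 31))
print("diversity effect first significantly positive at age:", onset)
```

With these illustrative values the mean curves cross at roughly age 11, but the reported onset is later because the interval must clear zero entirely. The same distinction applies when reading Figure 3: the age at which the difference curve crosses zero and the age at which it becomes significantly positive are not the same quantity.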
Our analysis makes two important assumptions about the FunDivEUROPE data which are worth stating explicitly. The first is that species composition and diversity have remained relatively stable since stand establishment. If true, current-day species composition can be combined with tree ring records to infer how diversity effects on growth have changed through time. While we have no information on the community composition of the plots prior to their establishment in 2012, a second census was conducted in 2017. This shows that in the five years following our initial sampling, the species composition of the plots has remained unchanged (Appendix S2). While these observations do not capture the initial phases of stand development in the FunDivEUROPE plots, a second census of the Satakunta plots in 2011 revealed almost no changes in community composition during the first 12 years of the experiment (Appendix S2). Together, these data suggest that species composition and diversity are likely to have remained relatively constant during the initial stages of stand development in these forests.
The second assumption is that stand structural attributes known to influence tree growth (such as the number and size of trees in a plot) vary independently of tree diversity. Note that this does not mean we assume that the number and size of trees in a plot has remained constant through time. Instead, the assumption is that changes in the number and size of trees have been similar among plots, allowing us to directly compare the growth trajectories of trees across the diversity gradient. Supporting this premise, the repeat census data from the FunDivEUROPE plots show that the number and mean size of trees varies closely with stand age (Appendix S2), following a classic self-thinning pattern (Yoda, Kira, Ogawa, & Hozumi, 1963). Crucially, however, at the time of establishing the plots we found no significant differences in mean tree size and density across diversity levels (Appendix S2). A very similar pattern emerged from the Satakunta plots, where rates of stem exclusion during the initial 12 years of the experiment were statistically indistinguishable across the diversity treatments (Appendix S2).

| Comparing diversity effects on growth in the FunDivEUROPE and Satakunta plots
To compare diversity effects on tree growth between the FunDivEUROPE plots and the Satakunta experiment, we used mixed-effects models to estimate differences in diameter between trees in monoculture and mixture at each census period of the experiment (2004, 2009, 2011 and 2016). [...] assigned a value of D = 0 (15 and 9 trees respectively).

| Diversity effects on growth in the FunDivEUROPE plots
While the shape of the relationship between D_incr and age was similar across the three species, clear quantitative differences in their growth trajectories also emerged (Figure 2). Of the three, P. sylvestris was the fastest growing early on (mean D_incr before age 15 = 0.81 cm/year, compared to 0.69 and 0.72 cm/year in P. abies and B. pendula respectively). However, P. sylvestris also showed the steepest decline in diameter growth rate with age of all three species, and by age 30 growth differences between species had reversed (mean D_incr after age [...]).
When we compared the growth trajectories of trees in monoculture and mixture, we found that on average diversity effects on growth tended to shift from mostly negative to overwhelmingly positive during stand development (Figure 3). This pattern matched our predictions and was remarkably consistent across species and diversity treatments (Figure 3; Table 2). By age 35 the average diameter growth rate of a tree in mixture was 25% faster than that of a tree in monoculture (Table 2). This overyielding effect was significantly strongest for trees in the 3-species mixtures (+32%, compared to +22% in the 2-species mixture) and for B. pendula (+39% across treatments, compared to +21% and +15% for P. sylvestris and P. abies respectively). Moreover, when comparing across species and treatments we found that the average age at which diversity effects on growth shifted to significantly positive was 21 (Table 2).
This coincides with the period in which regenerating boreal forests in Finland typically achieve canopy closure.

| Comparing diversity effects on growth in the FunDivEUROPE and Satakunta plots
The cumulative diameter growth trajectories of trees in the Satakunta experiment were very similar to those observed in the FunDivEUROPE plots (Figure 2a-c), although on average P. sylvestris grew quicker at Satakunta (D at age 18 = 11.2 cm, compared to 9.9 cm in the FunDivEUROPE plots). When we compared the effects of diversity on diameter growth between the two platforms, we found good or partial agreement for seven of the nine possible species combinations (Figure 4). In particular, P. abies showed similar responses to diversity in the FunDivEUROPE and Satakunta plots, particularly when mixed with P. sylvestris (Figure 4d) and in the 3-species mixture (Figure 4f).

FIGURE 3 Difference in diameter growth between trees in monoculture and those in mixture as a function of tree age for (a-c) Pinus sylvestris (PINSYL), (d-f) Picea abies (PICABI) and (g-i) Betula pendula (BETPEN) in the FunDivEUROPE plots. Shaded regions in grey correspond to the 95% confidence intervals of the curves. See Figure 1.

Equally, for all three species, temporal trends in diversity effects in the 3-species mixtures were broadly consistent with those observed in the FunDivEUROPE plots (Figure 4c,f,i).
The clear exception where growth responses to diversity did not match between the two research platforms was the

| DISCUSSION
Across the FunDivEUROPE plots, we found a clear pattern whereby diversity effects on tree growth shifted from mostly negative to positive during the first 35 years of stand regeneration following clear cutting. These trends were remarkably consistent across species and mixture types (Figure 3), and closely match what has previously been observed in long-term grassland biodiversity experiments (Cardinale et al., 2007; Guerrero-Ramírez et al., 2017; Reich et al., 2012; Zuppinger-Dingley et al., 2014). Observational studies conducted across a range of forest ecosystems have revealed a considerable degree of variation in the strength and even the direction of diversity effects on productivity (Paquette & Messier, 2011; Ratcliffe et al., 2016; Vilà et al., 2013). Previous work has shown that this context dependency can be partially explained by environmental differences among forest types, such as those associated with climate or soils (Forrester, 2014; Jucker et al., 2016; Jucker, Bouriaud, Avăcăriei, Dănilă, et al., 2014; Ratcliffe et al., 2017; Toïgo et al., 2015). Our study highlights how changes in species interactions during stand development can also play an important role in determining the strength of diversity-productivity relationships in forests (Lasky et al., 2014; Taylor et al., 2020). It also illustrates the value of focusing on how individual trees respond to species mixing in order to better understand community-level responses (Chamagne et al., 2017; Fichtner et al., 2018).

| Canopy packing as a driver of diversity-productivity relationships in forests
On average, overyielding in the FunDivEUROPE plots first became apparent around 20 years after seedling establishment (Figure 3; Table 2). This coincides with the period in which boreal forests in northern Europe typically undergo canopy closure and enter the phase of stem exclusion (based on observations in the Satakunta experimental plots; see also Angelstam & Kuuluvainen, 2004; Shorohova et al., 2009), lending support to our hypothesis that the process of canopy filling is key to promoting positive diversity-productivity relationships in forests. Growing evidence suggests that by combining tree species with complementary crown architectures, phenologies and abilities to tolerate shade, diverse forests are able to use canopy space more efficiently (Jucker et al., 2015; Pretzsch, 2014; Williams et al., 2017). This in turn alleviates the effects of competition for light among neighbours, allowing trees to grow faster in mixture and pack more densely in space (Kunz et al., 2019; Sapijanskas et al., 2014; Searle & Chen, 2020; Williams et al., 2017).
Despite the low number of tree species present in our study system, differences in their ecological strategies still present numerous opportunities to maximize the use of above-ground space. Firstly, phenological differences between the evergreen conifers and the deciduous B. pendula can reduce competition for light among neighbouring trees at the onset and end of the growing season. Secondly, while both P. sylvestris and B. pendula (in particular) are light-demanding species, P. abies is able to persist and grow even in low-light conditions (Niinemets & Valladares, 2006). These contrasting abilities to tolerate shade are also reflected in differences in the way the three species invest in vertical growth and crown expansion (Appendix S3), which enables them to vertically and horizontally partition canopy space. Finally, these crown complementarity effects can be further enhanced by the ability of individual trees to plastically adapt the vertical distribution of their branches and leaves to suit that of their neighbours (Jucker et al., 2015; Pretzsch, 2014; Sapijanskas et al., 2014). For example, previous work conducted across the FunDivEUROPE network revealed that trees in mixed-species stands had significantly wider and deeper crowns than their counterparts growing in monoculture (Jucker et al., 2015). When scaled up from individual trees to whole stands, these crown complementarity effects allow mixed-species forests to use canopy space more efficiently, thus contributing to overyielding at the community level (Jucker et al., 2015; Pretzsch, 2014; Williams et al., 2017).
Tree rings provide one way to address this challenge by allowing the long-term growth trends of individual trees to be accurately reconstructed. However, they tell us nothing about the past composition of a stand. Consequently, attributing growth responses to diversity becomes progressively harder the further back in time one goes.
One way around this is to use a space-for-time substitution, where plots at different stages of stand development are compared. Using this approach, Taylor et al. (2020) recently showed that in Canada's boreal forests diversity-productivity relationships tended to peak in mid-successional stands. However, the difficulty with this type of study is that accounting for differences in management practices is often hampered by a lack of historical data, particularly for older stands. Moreover, because of recent climate change, conditions under which forests are regenerating today will often be substantially different to those in which currently mature stands developed in the past. To complement these analyses, it can therefore be useful to pair them with simulation models of forest dynamics (Morin et al., 2011). In this respect, Holzwarth, Rüger, and Wirth (2015) [...] (Pretzsch et al., 2015).
While our results are predominantly observational and representative of a single, low-diversity ecosystem, they provide a series of testable hypotheses for the growing number of tree diversity experiments established in recent years. Large-scale syntheses will clarify whether the tendency of diversity effects to strengthen through time is a general one, and if so, help elucidate the mechanisms driving it. Here we focused on one possible explanation for these temporal trends-the slow onset of canopy interactions among neighbouring trees. But other processes are also likely to be at play.
For instance, studies in both grasslands and forests have shown that trophic interactions are key to promoting positive biodiversity-ecosystem functioning relationships (Ammer, 2019; Eisenhauer, 2012), but these interactions take time to establish (Eisenhauer, Reich, & Scheu, 2012). Similarly, soil nutrients have been shown to influence how quickly diversity effects emerge in grasslands by constraining rates of ecosystem development (Guerrero-Ramírez et al., 2017).
Future work leveraging networks of tree diversity experiments will also help clarify whether some of the other trends we observe in our data (such as the tendency of diversity to negatively influence growth in the early stages of stand development) also emerge across different species and forest types.
In contrast to our expectations, which were for diversity effects in the earliest stages of stand development to be mostly neutral, seven of the nine species combinations in the FunDivEUROPE plots showed negative effects of diversity on tree growth between ages 5 and 15 (Figure 3). This initial negative relationship between diversity and growth likely explains why we found no significant differences in mean tree size across the diversity gradient (Appendix S2), as it would have offset any subsequent increases in growth in mixed-species plots. Early synthesis work from tree diversity experiments outside tropical and subtropical regions has mostly revealed neutral effects of diversity on above-ground productivity at a community level (Grossman et al., 2018; Kambach et al., 2019). This pattern could emerge even if diversity were to negatively influence the early stage growth of individual trees, provided that survival rates were higher in mixtures. However, even if this were the case, it still raises the question of what might cause individual trees to grow more slowly at first when in mixture. Above-ground interactions seem an unlikely candidate, as competition for light among neighbouring trees would initially be weak. Trophic interactions, both above- and below-ground (e.g. slower colonization by mutualistic fungi or increased pest and pathogen loads in mixed-species plots), are possible explanations worth exploring further (Ammer, 2019; Eisenhauer, 2012).

| Bridging the gap between observational studies and tree diversity experiments
The fact that positive diversity effects on tree growth in the FunDivEUROPE plots tended to strengthen with time and only became apparent once stands matured enough to achieve canopy closure may explain why most tree diversity experiments established outside the tropics have so far found little evidence that diverse tree communities are more productive than species-poor ones (Grossman et al., 2018; Haase et al., 2015; Kambach et al., 2019; Verheyen et al., 2016). Currently, the average duration of the 26 globally distributed tree diversity experiments that form TreeDivNet is 9 years (range 1-20 years, with Satakunta being the oldest; for details see: http://www.treedivnet.ugent.be and Verheyen et al., 2016). Our results from the FunDivEUROPE plots suggest this may simply not be long enough for the above-ground interactions that underpin the positive effects of diversity on tree growth to manifest themselves, particularly in slower growing boreal and temperate forests. Moreover, just as we find in the FunDivEUROPE plots, recent work suggests that in the BEF-China experiment the strength of these diversity effects has been progressively increasing through time (Huang et al., 2018).
Outside the tropics, experimental evidence for positive diversity-productivity relationships in the early stages of stand development is much more mixed (for a review see Grossman et al., 2018). One notable exception comes from studies in the IDENT network (Tobner, Paquette, Reich, Gravel, & Messier, 2014). For instance, Williams et al. (2017) found positive effects of diversity on productivity emerging relatively soon after planting in an experiment established in 2009 at the temperate-boreal forest ecotone in Quebec. Crucially, this study also concluded that increased canopy packing in mixed-species plots was driving positive diversity effects on productivity. The fact that these effects emerged so early in the experiment is likely attributable to the study's design, which involved planting seedlings at extremely high densities to speed up their interaction (planting density = 40,000 seedlings ha⁻¹, almost 10 times as high as Satakunta; Tobner et al., 2014).
When comparing early stage tree growth responses to diversity in the Satakunta and FunDivEUROPE plots, we generally found reasonable agreement between the two research platforms (Figure 4).
However, there were a few exceptions, the most notable of which was the behaviour of both P. sylvestris and B. pendula when grown in combination with one another (Figure 4b). There are several plausible explanations for the mismatch we observed. For instance, spatio-temporal differences in climate and soils can strongly influence species interactions (Forrester, 2014;Jucker, Bouriaud, Avăcăriei, Dănilă, et al., 2014;Pretzsch et al., 2015), and diversity effects on tree growth have generally been shown to be strongest in more stressful and less productive environments (Jucker et al., 2016;Toïgo et al., 2015). Mean annual temperature at Satakunta is around 3°C warmer than in Northern Karelia where the FunDivEUROPE plots were established (Table 1). This difference would have been further amplified by the fact that Finland has warmed considerably in the decades separating the establishment of the FunDivEUROPE stands from the planting of the Satakunta experiment. These differences in climate may explain why P. sylvestris grew faster at Satakunta (Figure 2) and could have contributed to shifting the competitive balance between the two species.
Another possible explanation for the contrasting responses to diversity in the two platforms is differences in tree density and spatial arrangement (Table 1). As is fairly common practice in tree diversity experiments (e.g. Tobner et al., 2014), planting density in the Satakunta plots was higher than what is typically found across managed forests in northern Europe (4,225 ha⁻¹, compared to 1,600-2,000 ha⁻¹ in commercially planted stands in Finland). Planting seedlings at high density encourages species interactions to begin sooner, but it may also fundamentally alter their outcome (Ammer, 2019). Finally, a further factor worth considering is herbivory. In particular, browsing pressure by moose has been shown to increase in mixed stands of P. sylvestris and B. pendula relative to monocultures (Milligan & Koricheva, 2013;Nevalainen, Matala, Korhonen, Ihalainen, & Nikula, 2016). Moreover, work by Muiruri, Milligan, Morath, and Koricheva (2015) at Satakunta showed that these differences in browsing can actually alter the growth response of B. pendula to mixing, shifting it from positive-saturating at low browsing intensities to neutral under high browsing pressure. Given that damage by moose more than doubled across Finland's forests between the 1980s (when trees in the FunDivEUROPE plots would have been short enough to be susceptible to moose browsing) and the early 2000s (Nevalainen et al., 2016), it is possible that differences in browsing pressure between the two platforms contributed to the discrepancy in the results.
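
The planting densities quoted in this section can be put side by side with a few lines of arithmetic. The sketch below uses only the figures given in the text (40,000 ha⁻¹ for the IDENT experiment, 4,225 ha⁻¹ at Satakunta and 1,600-2,000 ha⁻¹ in commercially planted Finnish stands). The conversion to a between-tree distance assumes a regular square planting grid, which is a simplification introduced here for illustration and not a description of any of the actual planting designs; the function and variable names are likewise hypothetical.

```python
import math

# Planting densities (stems per hectare) as quoted in the text.
DENSITY_IDENT = 40_000
DENSITY_SATAKUNTA = 4_225
DENSITY_COMMERCIAL = (1_600, 2_000)  # lower and upper bound

M2_PER_HECTARE = 10_000


def square_grid_spacing(density_per_ha: float) -> float:
    """Distance (m) between neighbouring trees on a regular square grid.

    ASSUMPTION: a square grid is used purely for illustration; the real
    spatial arrangement of each experiment or stand may differ.
    """
    return math.sqrt(M2_PER_HECTARE / density_per_ha)


# How many times denser is one platform than another?
ident_vs_satakunta = DENSITY_IDENT / DENSITY_SATAKUNTA
satakunta_vs_commercial = [DENSITY_SATAKUNTA / d for d in DENSITY_COMMERCIAL]

if __name__ == "__main__":
    print(f"IDENT / Satakunta density ratio: {ident_vs_satakunta:.2f}")
    print(
        "Satakunta / commercial density ratio: "
        f"{min(satakunta_vs_commercial):.2f} to {max(satakunta_vs_commercial):.2f}"
    )
    for label, density in [
        ("IDENT", DENSITY_IDENT),
        ("Satakunta", DENSITY_SATAKUNTA),
    ]:
        print(f"{label}: about {square_grid_spacing(density):.2f} m between trees")
```

Under these assumptions the IDENT density works out at roughly 9.5 times that of Satakunta, consistent with the "almost 10 times" stated above, while Satakunta is itself a little over twice as dense as commercially planted stands.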

| CONCLUSIONS
Using a combination of tree ring records and data from a long-term tree diversity experiment, we find that diversity effects on tree growth change predictably during the early stages of stand development in Finland's boreal forests. In doing so, we take a further step towards reconciling the results of previous studies which suggest that while diversity effects in forests are generally positive, they can also vary substantially through space and time.
Our results point to canopy closure as a key phase of stand development during which positive diversity effects on tree growth first emerge. This reinforces the importance of canopy space filling as an ecological mechanism for explaining why diverse forests are, on average, more productive than species-poor ones. It also provides a testable prediction for when positive diversity effects on tree growth should emerge across different forest types. This is critical when it comes to bridging the gap between observational studies, from which most of our understanding of how diversity relates to productivity in forests has traditionally come, and tree diversity experiments, which have grown rapidly in number and ecological realism in recent years. Overall, our study lends further support to the growing evidence that management and conservation strategies aimed at increasing tree diversity in forests have the potential to enhance carbon sequestration. However, as with most good things, a little patience is needed before we reap the benefits of what we sow.

ACKNOWLEDGEMENTS
We thank FunDivEUROPE site managers and field technicians for establishing the permanent plots, and are grateful to D. Avăcăriței, I. T.J. designed the study, collected the tree ring data, performed the analyses and wrote the first draft of the manuscript; G.I. and O.B. collected and curated the data from the second census of the FunDivEUROPE plots. All the authors contributed substantially to revisions.

PEER REVIEW
The peer review history for this article is available at https://publons.