Green hay application and diverse seeding approaches to restore grazed lowland meadows: progress after 4 years and effects of a flood risk gradient

The two most common approaches to target species introduction in European meadow restoration are green‐hay transfer from species‐rich donor sites and the use of diverse seed mixtures reflecting the chosen target community. The potential of both approaches to restore species‐rich grassland has been variously reviewed, but very few studies have experimentally compared them at one and the same site. Moreover, studies involving one or both approaches have rarely taken into account environmental gradients at a site, and measured the impacts of such gradients on restoration outcomes. Such gradients do, for example, exist during grassland restoration on former arable land in river floodplains, where gradients in the occurrence of flooding, and in associated edaphic characteristics such as nutrient availability, might affect restoration outcomes. Using a randomized complete block experimental design, based on five different indicators of restoration progress, we compared the usefulness of green‐hay application and diverse seeding to restore species‐rich grazed meadows of the MG5 grassland type according to the British National Vegetation Classification, and also investigated how restoration outcomes differed after 4 years between areas within experimental plots characterized by high flood risk and areas characterized by low flood risk. Overall, both restoration approaches yielded similar results over the course of the experiment, whereas high flood risk and associated edaphic factors such as high availability of phosphorus negatively affected restoration progress particularly in terms of floristic similarity to restoration targets. These results highlight the need to take into account environmental gradients during meadow restoration.

• Carefully designed seed mixtures can produce restoration outcomes as good as those from application of green hay from species-rich donor sites. However, depending on context, other considerations such as species availability or cost might take precedence.

Author contributions: RP conceived and designed the research with the help of MN, JR; SH, LH, MW carried out botanical and soil monitoring; MW analyzed the data and led and coordinated the writing of the manuscript; SH, LH, MN, JR, RP edited the manuscript.

Introduction
Due to agricultural intensification, species-rich lowland seminatural grassland has markedly declined in extent over the last 70 years both in the United Kingdom (Bullock et al. 2011; Ridding et al. 2015) and in continental Europe (Veen et al. 2009). In the United Kingdom, conversion into agriculturally improved grassland has been a main driver of this decline (Ridding et al. 2015). In the case of grazed hay meadows of the Cynosurus cristatus-Centaurea nigra type, classified as MG5 grassland in the British National Vegetation Classification (NVC; Rodwell 1992), and once the most widespread type of lowland hay meadow in Britain (Rodwell et al. 2007), less than 10,000 ha are now left across England and Wales (Maddock 2008). Remaining MG5 grassland is now strongly fragmented, comprising isolated and small stands in otherwise highly intensified pastoral landscapes (Rodwell et al. 2007). Consequently, there is a substantial risk of local extinction of MG5 specialist species due to small population sizes and strongly reduced dispersal between sites (Ozinga et al. 2009). To reduce fragmentation and its negative effects on such specialist species, and to restore a coherent and resilient network of sites, both the floristic diversification of existing species-poor habitat and the creation of additional high-quality habitat are needed (Lawton et al. 2010). However, unassisted reversion of species-poor intensive grassland to species-rich lowland meadow habitat solely relying on management adaptation is hampered by limited dispersal of target species even where suitable source habitat exists nearby (Coulson et al. 2001), and usually requires many decades (Stampfli & Zeiter 1999; Bischoff 2002).
This is due to the operation of two key constraints, seed limitation and microsite limitation (Bakker & Berendse 1999; Walker et al. 2004). Seed limitation means that propagules of target species fail to reach a restored site, and microsite limitation means that site-specific constraints prevent establishment even after propagules arrive at the site.
To overcome these constraints, active restoration is required. Seed limitation is usually addressed via target-species introduction (Bakker & Berendse 1999; Walker et al. 2004). In the case of meadow restoration, the two most common approaches are species-rich seed mixtures and transfer of green hay from high-quality local donor sites (Hedberg & Kotowski 2010; Kiehl et al. 2010). The term "green hay" in grassland restoration indicates that freshly harvested hay from a species-rich donor meadow is applied to a restoration site for seed transfer, usually on the same day it is harvested, while it is still green (Jones 1993; Albert et al. 2019). A detailed discussion of practical aspects of this method can be found in Trueman and Millett (2003). While attempts have been made to compare results from both seeding and green hay approaches (Hedberg & Kotowski 2010; Kiehl et al. 2010), very few studies have directly compared them (Jones et al. 1995). Also, we know of no such study that has simultaneously involved a gradient in site conditions that might affect microsite limitation and thus restoration outcome.
Such microsite limitation can be very pronounced (Walker et al. 2004), with raised soil fertility and associated primary productivity exacerbating competition among plant species (Foster 2001; Öster et al. 2009). In such situations, short-term alleviation of microsite limitation is crucial to allow initial establishment of actively introduced target species, and can be achieved through disturbance by cultivation, resulting in bare-ground creation and provision of a low-competition environment (Hofmann & Isselstein 2004; Wagner et al. 2016). However, competitive generalists tend to establish more reliably than uncompetitive specialists whose occurrence is more restricted to agriculturally unimproved grassland (Pywell et al. 2003). Moreover, such differences in establishment success in relation to plant strategy often tend to become more pronounced with time, as ecological filtering causes a gradual decline of uncompetitive specialists after initially successful establishment (Pywell et al. 2003).
In river floodplains, soil fertility is usually higher in low-lying areas close to the river channel than in more distant high-lying areas less affected by flooding and sediment deposition (Sival et al. 2005). This is also usually reflected by grassland species composition, with lower species richness and increased cover of generalist species adapted to high fertility closer to the river channel (Klaus et al. 2011). Differential flooding and associated environmental gradients, e.g. in soil fertility, thus represent an opportunity to explore trajectories and outcomes of ecological restoration as simultaneously affected by different approaches and by pre-existing environmental variation.
In this study, we investigated the following two main questions: (1) Provided both green hay application and diverse seeding are carefully designed and carried out to restore MG5 grassland, is one approach superior to the other in terms of short-term progress after 4 years of restoration? (2) To what extent can occasional flooding and associated gradients in soil fertility affect such outcomes?

Field Site and Experimental Design
A 4-year meadow restoration experiment was set up in July 2013 in three species-poor grassland fields at Hillesden Estate, a 1,000 ha arable farm in Buckinghamshire, England (51°57′58″N 0°58′15″W), along the western bank of the small river Padbury Brook, a tributary to the River Twins. These fields were arable until 2007, and then sown with a mixture of perennial ryegrass Lolium perenne and white clover Trifolium repens, and managed intensively as agriculturally improved L. perenne grassland (= MG7 in the NVC; Rodwell 1992). The reason for this conversion was that parts of the fields were subjected to occasional flooding, and were thus no longer considered reliable enough for crop production. The soil in all three fields is alluvial clay and clay loam, with a pH of about 6.5. Mean annual temperature is 9.7 °C and annual rainfall is 648 mm, of which 381 mm falls between April and October (data from 1981 to 2010; Met Office 2019). The experimental design was a randomized complete block design with four replicate blocks, two of which were located in the same field. Within each block, three restoration treatments were applied to plots of between 0.95 and 2.7 ha, depending on overall size and shape of the field in which the block was located. These treatments were: (1) a green hay treatment to which target species were introduced by application of freshly harvested hay from a species-rich donor meadow; (2) a diverse seeding treatment to which target species were introduced by sowing of a specifically tailored seed mixture containing many grasses and forbs; and (3) a control treatment as provided by the species-poor extant grassland.
Treatments (1) and (2) were designed to restore Cynosurus cristatus-Centaurea nigra grassland, that is MG5 grassland according to the NVC (Rodwell 1992), which had been determined as a suitable restoration target, based on location, soil, hydrology, and a proposed management as grazed hay meadow.
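The allocation of treatments in a randomized complete block design can be sketched as follows. This is a minimal illustration only: the text does not describe how the randomization was actually performed, and the function name, block numbering, and seed are hypothetical.

```python
import random

# Treatment labels follow the text; everything else is an assumption.
TREATMENTS = ["green hay", "diverse seeding", "control"]

def randomize_blocks(n_blocks, treatments, seed=None):
    """Assign every treatment exactly once, in random order, within each block."""
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)
        rng.shuffle(order)  # independent random order per block
        layout[block] = order
    return layout

# Four replicate blocks, as in the experiment; the seed is arbitrary.
layout = randomize_blocks(4, TREATMENTS, seed=1)
for block, order in layout.items():
    print(block, order)
```

Each block receives every treatment exactly once, so differences between blocks (for example, between fields) can be separated from treatment effects.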
To prepare the experimental plots for green hay application and seeding, a silage cut was carried out in the week commencing 17 June, 2013. This was followed by the marking out of experimental plots, and subsequent glyphosate spraying and cultivation of those plots assigned to the green hay and diverse seeding treatments, to create a suitable shallow tilth and bare ground to facilitate seedling establishment of target species. Spraying was carried out in the week commencing 24 June, 2013 and cultivation in the week commencing 8 July, 2013. In doing so, we followed recommendations for green hay restoration by Natural England to create a short sward followed by bare-ground creation prior to hay application (Natural England 2010), and by Trueman and Millett (2003) to combine cultivation and spraying to achieve more lasting bare-ground creation.
The hay applied in the green hay treatment was from species-rich MG5 grassland 3.68 ha in size, located within Rushbeds Wood SSSI nature reserve, c. 15 km south of the experimental site. A hay cut was carried out on 24 July, 2013 using a disk mower, and a forage harvester was used to load a tractor trailer for transport to the experimental site. On the same day as the cut took place, the hay was evenly spread onto the green hay treatment plots using a muck spreader. The total area of the four replicate green hay restoration plots was 8.15 ha, resulting in an actual area ratio of 1:2.2 for the spreading of green hay from the donor site, thus exceeding a recommended ratio of 1:3 (Natural England 2010).
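The donor-to-receptor area ratio follows directly from the two areas given in the text. A short check, with a hypothetical helper name:

```python
def donor_to_receptor_ratio(donor_ha, receptor_ha):
    """Return x in the ratio 1:x (donor area : receptor area)."""
    if donor_ha <= 0:
        raise ValueError("donor area must be positive")
    return receptor_ha / donor_ha

# Areas as reported: 3.68 ha donor meadow, 8.15 ha of green hay plots.
ratio = donor_to_receptor_ratio(3.68, 8.15)
print(f"1:{ratio:.1f}")  # 8.15 / 3.68 is about 2.21, printed as 1:2.2
```

A ratio of 1:2.2 means that hay from each donor hectare was spread over less receptor area than under the recommended 1:3, that is, more hay per unit of restored area.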
The diverse seeding treatment was applied in September 2013 at a rate of 12 kg of seed per hectare, using a targeted seed mixture of 6 grasses and 20 forbs, supplied by a specialist company for native seed (Emorsgate Seeds Ltd., King's Lynn, U.K.; Table S1).
Post-establishment management from 2014 onwards was identical for all experimental plots and based on the typical management of MG5 grassland, involving a single cut in the summer, followed by aftermath and winter grazing (Rodwell 1992).

Vegetation Monitoring and Soil Analysis
Vegetation recording at the green hay donor site was carried out on 3 July, 2013, 3 weeks before the hay cut. A total of 24 quadrats of 1 × 1 m were randomly placed within the site, avoiding a margin of 2 m width around the edge, and percentage cover was visually estimated for all vascular plant species, following the nomenclature of Stace (2010). The assessment also involved a recording of site-level species abundance using the DAFOR scale (Kershaw & Looney 1985), also capturing species not picked up during the quadrat-based assessment. For a full species list, see Table S2.
Vegetation sampling in the restoration experiment was carried out annually in July between 2014 and 2017. Within each replicate plot, areas were delineated as either being at high or low risk of flooding, using aerial photographs (25 cm resolution) taken in 2007 and 2012. Areas determined as being at higher risk of flooding were, as expected, generally closer to watercourses and in the lower-lying areas of the field. Based on this delineation, within each experimental plot, seven randomly placed 1 × 1 m quadrats were recorded annually within areas designated as being at high flood risk, and another seven within areas designated as being at low flood risk. As with the green hay donor site, a margin of 2 m width around the edge of each treatment plot was avoided. Using this stratified sampling approach allowed us to take flood risk into account in our analyses.
Soil sampling for textural, chemical, and bulk density analyses was carried out in October 2017, to confirm soil gradients in accordance with our delineation of areas of high and low flood risk, and to test whether experimental treatments had any effects on the measured soil parameters. Three pooled samples were collected in each experimental plot, two from the area of the plot characterized by low risk of flooding, and one from the (usually smaller) area characterized by high risk of flooding. We also collected four pooled soil samples in nearby arable fields, also located at Hillesden Estate and characterized by comparable elevation and distance from the river as the experimental plots, and another four pooled samples from the green hay donor meadow. These additional samples enabled us to determine soil characteristics both for the starting point of restoration on former arable land and for an endpoint of successful ecological restoration of MG5 grassland. Soil chemical and textural analyses were carried out by NRM Laboratories (Berkshire, U.K.). They included measurement of soil pH by 1:2.5 water extract, of available phosphorus (Olsen et al. 1954), of available potassium via ammonium-nitrate extraction (Ministry of Agriculture, Fisheries and Food 1981), of soil texture via laser diffraction, of loss-on-ignition via dry combustion at 430 °C, and of total N by high-temperature combustion in a Carlo Erba NCS2500 elemental analyzer (Carlo Erba Instruments, Milan, Italy).

Data Analysis
Soil Characteristics. To test for treatment and flood risk effects on soil characteristics, we constructed linear mixed models (LMM) as provided in SAS 9.3 PROC MIXED (SAS Institute, Cary, NC, U.S.A.). We included restoration treatment, flood risk level, and their interaction as fixed factors, and specified main plots, nested within blocks, as random effects (Schabenberger & Pierce 2002). To ensure normality of residuals and variance homogeneity, all soil data was Box-Cox-transformed (Quinn & Keough 2002).
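For readers unfamiliar with the Box-Cox transformation, a minimal sketch follows. The text does not report how the transformation parameter lambda was chosen; the grid search over a profile log-likelihood shown here is one common approach and is an assumption, as are the function names and the example values.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def box_cox_loglik(values, lam):
    """Profile log-likelihood of lambda, up to an additive constant."""
    n = len(values)
    y = box_cox(values, lam)
    mean_y = sum(y) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n
    return -0.5 * n * math.log(var_y) + (lam - 1.0) * sum(math.log(v) for v in values)

def best_lambda(values, grid):
    """Pick the grid value of lambda with the highest profile log-likelihood."""
    return max(grid, key=lambda lam: box_cox_loglik(values, lam))

# Hypothetical right-skewed measurements (not data from the study).
example = [1.0, 2.0, 4.0, 8.0, 16.0]
chosen = best_lambda(example, [-1.0, -0.5, 0.0, 0.5, 1.0])
transformed = box_cox(example, chosen)
```

For this geometric series the log transform (lambda = 0) is selected, which spaces the values evenly.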
Restoration Progress. Restoration progress was characterized in several ways, including (1) calculation of floristic similarity with the vegetation at the green hay donor site, (2) calculation of goodness-of-fit with MG5 grassland as defined by Rodwell (1992), (3) total cover and species density per m² of positive indicator species of MG5 grassland sensu Robertson and Jefferson (2000), and (4) total species density per m². This multi-pronged approach was chosen to enable a direct comparison with the green hay donor site as well as a quantification of progress towards a more general target of MG5 grassland, and to allow deeper insights into underlying processes.
Floristic similarity with the vegetation of the green hay donor site was calculated as follows. First, on the basis of individual quadrat data standardized to total cover, we calculated interquadrat Bray-Curtis similarity values (Bray & Curtis 1957) for each quadrat recorded during 4 years in the experiment with each of the 24 quadrats recorded in 2013 at the green hay donor site. This was followed by a two-step averaging procedure, first individually for each quadrat recorded in the experiment with the 24 donor-site quadrats, and then across the 7 replicate quadrats annually recorded per treatment plot in a given sub-area and level of flood risk.
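The similarity calculation described above can be sketched as follows, assuming quadrat data are held as dictionaries of species to percentage cover. The function names and the toy quadrats are hypothetical and are not data from the study.

```python
def standardize(cover):
    """Rescale a quadrat's cover values so that they sum to 1."""
    total = sum(cover.values())
    return {sp: c / total for sp, c in cover.items()}

def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (%) between two species -> cover dicts."""
    species = set(a) | set(b)
    shared = sum(min(a.get(sp, 0.0), b.get(sp, 0.0)) for sp in species)
    total = sum(a.values()) + sum(b.values())
    return 100.0 * 2.0 * shared / total

def mean(xs):
    return sum(xs) / len(xs)

def plot_similarity(plot_quadrats, donor_quadrats):
    """Two-step averaging: per quadrat across donor quadrats, then across quadrats."""
    donor = [standardize(q) for q in donor_quadrats]
    per_quadrat = []
    for q in plot_quadrats:
        qs = standardize(q)
        per_quadrat.append(mean([bray_curtis_similarity(qs, d) for d in donor]))
    return mean(per_quadrat)

# Toy example with made-up cover values.
restored = [{"A": 50.0, "B": 50.0}]
donor = [{"A": 25.0, "C": 75.0}, {"A": 50.0, "B": 50.0}]
similarity = plot_similarity(restored, donor)  # mean of 25% and 100%
```

With data standardized to total cover, the Bray-Curtis similarity reduces to the summed minimum relative cover of shared species.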
In contrast to this purely cover-based approach, calculation of goodness-of-fit with MG5 grassland, as carried out by TABLEFIT, version 2.0 (Hill 2015), takes into account both species cover and frequency of occurrence (Hill 1989). TABLEFIT analyses were also carried out for the dataset of 24 quadrats from the green hay donor site, to determine the donor grassland's closest match within the NVC (Rodwell 1992).
Summed cover and species density of MG5 positive indicator species sensu Robertson and Jefferson (2000), as well as total species density, were calculated by averaging across the seven replicate quadrats annually recorded per sub-area of high or low flood risk within a given treatment plot. For comparison, for the latter three parameters, we also calculated average cover and species density values for the set of 24 green hay donor-site quadrats recorded in 2013.
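The per-quadrat summaries and their averaging can be illustrated with a short sketch. The indicator set below is a hypothetical two-species subset used only for illustration; it is not the indicator list of Robertson and Jefferson (2000), and the cover values are invented.

```python
def quadrat_summary(cover, indicators):
    """Summarize one quadrat given a dict of species -> percentage cover."""
    present = {sp for sp, c in cover.items() if c > 0}
    found = present & set(indicators)
    return {
        "indicator_cover": sum(cover[sp] for sp in found),  # summed % cover
        "indicator_density": len(found),                     # species per quadrat
        "total_density": len(present),                       # species per quadrat
    }

def stratum_means(quadrats, indicators):
    """Average each summary across the quadrats of one flood-risk sub-area."""
    summaries = [quadrat_summary(q, indicators) for q in quadrats]
    n = len(summaries)
    return {key: sum(s[key] for s in summaries) / n for key in summaries[0]}

# Hypothetical subset of positive indicators and made-up quadrat data.
INDICATORS = {"Centaurea nigra", "Leucanthemum vulgare"}
quadrats = [
    {"Centaurea nigra": 10.0, "Lolium perenne": 40.0, "Trifolium repens": 5.0},
    {"Leucanthemum vulgare": 4.0, "Centaurea nigra": 6.0, "Lolium perenne": 30.0},
]
means = stratum_means(quadrats, INDICATORS)
```

Because each quadrat is 1 × 1 m, the averaged counts are directly interpretable as species density per square metre.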
Statistical analyses of these indicators of restoration progress were carried out using repeated-measures LMM as provided in SAS 9.3 PROC MIXED (SAS Institute, Cary, NC, U.S.A.). Prior to analyses, to ensure normality of residuals and variance homogeneity, species density and summed cover parameters were Box-Cox-transformed, and similarity and goodness-of-fit parameters that have an upper bound of 100% were arcsine-transformed (Quinn & Keough 2002). We were primarily interested in comparing restoration progress following active intervention via green hay application or diverse seeding. As already mentioned, grassland restoration is much faster via such interventions than via unassisted natural regeneration (Bakker & Berendse 1999; Walker et al. 2004), rendering moot the question of differences due to intervention versus non-intervention. This point was also illustrated for our experiment by a preliminary exploratory ordination analysis using nonmetric multidimensional scaling (NMDS; Fig. S1), as provided by PC-ORD, Version 7.02 (McCune & Mefford 2015), indicating a clear difference between the two treatments involving active restoration and the control treatment. Hence, for clearer analysis of the differences in outcome between green hay application and diverse seeding approaches, only these two active restoration treatments were compared in LMM analyses. We included restoration treatment, flood risk level, year, and all possible interactions between these as fixed factors, and specified year as repeated-measures factor, and main plots nested within blocks as random effects (Schabenberger & Pierce 2002). We repeated each analysis using various alternative covariance structures for the repeated factor, including unstructured, compound symmetric, and several autoregressive structures. The model with the most suitable error structure was identified using Akaike's information criterion (AIC; Akaike 1974).
In case of a significant main effect of year, pairwise comparisons between years were carried out using two-sided Tukey HSD tests.
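The choice among covariance structures by AIC, as described above, amounts to picking the fitted model with the lowest criterion value. A minimal sketch follows; the log-likelihoods and parameter counts are invented for illustration and do not come from the study, and the structure names are hypothetical labels.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln(L)."""
    return 2.0 * n_params - 2.0 * log_likelihood

def select_by_aic(fits):
    """fits maps a model name to (log-likelihood, number of parameters)."""
    table = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    best = min(table, key=table.get)  # lowest AIC is preferred
    return best, table

# Hypothetical fits for a 4-year repeated factor.
fits = {
    "unstructured": (-120.4, 10),       # 4 variances + 6 covariances
    "compound_symmetric": (-124.9, 2),
    "ar1": (-123.1, 2),
}
best, table = select_by_aic(fits)
print(best, round(table[best], 1))
```

The penalty term means that a richly parameterized structure is preferred only when it improves the likelihood enough to offset its extra parameters.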
Compositional Dynamics. Initial differences in species composition between restoration treatments necessitated separate analyses for each treatment to test compositional shifts across the 4 years of the experiment. Hence, data from each treatment was analyzed separately for the occurrence of two separate trends, including (1) a general trend in species composition independent of flood risk level and (2) a modification to this general trend in response to flood risk level. Both types of analysis were carried out using CANOCO, version 5.10 (ter Braak & Šmilauer 2018). For general trends in each treatment, prior to analyses, plant quadrat cover data was averaged across all 14 quadrats recorded in a given year and treatment plot. For modifications to these general trends as caused by differences in flood risk, prior to analyses, plant quadrat cover data was averaged across the seven quadrats recorded at the respective level of flood risk in a given year and treatment plot.
Prior to analyses, all species percent cover data was log(x + 1) transformed, and as we were interested in absolute changes in species cover, cover data was not standardized. Length of the longest gradient in initial detrended correspondence analyses ranged from 1.55 to 2.38 for the six datasets, indicating a particular suitability of linear ordination techniques (Šmilauer & Lepš 2014). Hence, partial redundancy analysis (pRDA) was used for both types of analysis. For statistical significance testing, Monte Carlo permutation with 9,999 permutations was used (ter Braak & Šmilauer 2018). General trends were tested using a time series permutation scheme, specifying Year as explanatory variable and Block as covariate, with permutation blocks defined accordingly. Differential trends in response to flood risk level were tested by specifying the Flood Risk × Year interaction as explanatory variable, and Block, Year, and the Block × Year interaction as covariates, thus effectively removing block effects and general and block-related compositional trends. In this case, a hierarchical design was used for permutations, with "whole-plots" defined on the basis of areas of a given flood risk within treatment plots and "split-plots" defined as individual years within one such area, and permutations were carried out at whole-plot level. To provide a baseline measure of compositional variation within each dataset, we also carried out principal component analyses.
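The logic of a Monte Carlo permutation test can be illustrated with a deliberately simplified sketch. It uses unrestricted shuffling and a squared correlation as the test statistic, whereas the analyses described above rely on CANOCO's restricted permutation schemes and pseudo-F statistics; the function names and data are hypothetical.

```python
import random

def r_squared(x, y):
    """Squared Pearson correlation, used here as a simple test statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy * sxy / (sxx * syy)

def permutation_p_value(x, y, statistic, n_perm=9999, seed=0):
    """Share of permuted statistics at least as large as the observed one."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    y_perm = list(y)
    n_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # unrestricted shuffle (a simplification)
        if statistic(x, y_perm) >= observed:
            n_extreme += 1
    # The observed arrangement counts as one of the permutations.
    return (n_extreme + 1) / (n_perm + 1)

# Made-up response values that increase with time.
time_points = [1, 2, 3, 4, 5, 6, 7, 8]
response = [0.2, 0.5, 0.9, 1.1, 1.6, 1.8, 2.3, 2.4]
p_value = permutation_p_value(time_points, response, r_squared)
```

With 9,999 permutations the smallest attainable p-value is 1/10,000 = 0.0001, because the observed arrangement is included in the reference set.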
To test whether species-level trends in cover of successfully introduced species in each of the two active restoration treatments, and trends of extant species in the control treatment, were affected by life strategy or realized niche, we then carried out Spearman correlation analyses of species' constrained pRDA axis 1 scores versus their C-, S-, and R-scores for established plant strategies (Hunt et al. 2004; Grime et al. 2007) and versus their Ellenberg indicator values for light, moisture, reaction, and nitrogen (Hill et al. 2004). Again, analyses were carried out to investigate general trends as well as modifications to these trends in response to flood risk level. In these Spearman correlation analyses, we tested for the strength and direction of monotonic relationships between pRDA axis 1 scores of species on the one hand, and their life history and realized niche characteristics on the other hand.
In all correlation analyses, to minimize the impact of rare species that might have been characterized with low precision along pRDA ordination axes, all species recorded in only 1 year in a given experimental treatment were excluded from analysis.

Restoration Ecology April 2021
Furthermore, in the analyses of extant species' trends in the control treatment, only species were included that were not potentially augmented by diverse seeding or green hay application.
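Spearman's rank correlation, as used in the analyses above, is the Pearson correlation of ranked values, with tied values sharing their average rank. A minimal sketch with hypothetical function names; the t approximation shown is one common route to a significance test and is not necessarily the one used in the study.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rank correlation coefficient."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def t_statistic(r, n):
    """t approximation for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Made-up axis scores and indicator values for five species.
scores = [0.8, 0.1, -0.3, 0.5, -0.6]
n_values = [7, 5, 4, 6, 2]
rho = spearman(scores, n_values)
```

Because only ranks enter the calculation, the coefficient captures any monotonic relationship, not only linear ones.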
Finally, to characterize performance with time of individual introduced species in the green hay and diverse seeding treatments, and of extant species in the control treatment, we calculated species response curves using generalized additive models as provided by CANOCO (ter Braak & Šmilauer 2018), as characterized by species' case scores on the constrained first pRDA ordination axis of general trend analyses (Šmilauer & Lepš 2014). For the control treatment, we calculated species response curves only for species recorded with an average cover exceeding 1% in at least 1 year. For each species whose response was modeled in a given experimental treatment, we compared two alternative models with one and two degrees of freedom, respectively. Based on AIC (Akaike 1974), we then chose the more parsimonious model.

Soil Characteristics
Restoration treatments did not affect any of the soil characteristics measured in the final year of the experiment (see Table S3). However, for seven of the nine characteristics, we detected an independent main effect of flood risk level (Table S3; Fig. 1). The soil in areas of high flood risk had much higher levels of available phosphorus (Fig. 1A) and of available potassium (Fig. 1B). Furthermore, soil in areas of high flood risk also differed texturally by having lower sand content (Fig. 1D) and higher clay content (Fig. 1F), and had higher organic matter content as determined by loss-on-ignition (LOI; Fig. 1G), lower pH (Fig. 1H), and lower bulk density (Fig. 1I).
Also shown in each panel in Figure 1 in the form of orange bands is the range of values for each soil parameter measured in the four arable reference samples, with green bands indicating the range of soil parameter values measured in reference samples from the green hay donor site. As indicated by these green bands, both available P and soil pH were much higher at the experimental site than at the green hay donor site (Fig. 1A & 1H), whereas total N and LOI were lower (Fig. 1C & 1G).
Figure 1 (caption fragment). For comparison, the range of values for these parameters as measured in local arable fields is indicated by orange bands, and the range of values as measured at the green hay donor site is indicated by green bands, representing potential starting points and a potential endpoint of grassland restoration, respectively (in each case n = 4).

Restoration Progress
As indicated by highly significant effects of year in LMM analyses, all five indicators of restoration progress were highly dynamic (see Table S4). Flood risk level mostly affected indicators based on floristic similarity or goodness-of-fit, whereas restoration approach in terms of green hay application versus diverse seeding had few effects, and only in interaction with other factors (Table S4). Floristic similarity with the green hay donor site was affected by flood risk level (F [1,42] = 6.67, p = 0.013), with higher similarity achieved in areas characterized by low risk (Fig. 2A). It also differed highly significantly between years (F [3,42] = 29.93, p < 0.001), and was lower in year 1 than in years 2-4 (Tukey's t ≤ −6.70, p < 0.001 for all three pairwise comparisons involving year 1). Vegetation restored by green hay application tended to be more similar to that of the green hay donor site than vegetation restored via diverse seeding, but this difference was marginally non-significant (F [1,3] = 9.33, p = 0.055).
Goodness-of-fit of actively restored grassland to MG5 grassland was also significantly affected by flood risk level (F [1,42] = 9.33, p = 0.004), with a closer goodness-of-fit in low-flood-risk areas (Fig. 2B). Grassland restored via green hay application and grassland restored by diverse seeding did not consistently differ from each other (F [1,3] = 1.51, p = 0.306), but goodness-of-fit was affected by a three-way interaction between year, restoration treatment, and level of flood risk (F [3,42] = 5.50, p = 0.003), with diversely seeded areas of low flood risk having only a slightly higher fit to the MG5 reference than diversely seeded areas of high flood risk in the second and third year of the experiment, but a much higher fit than the latter in the first and fourth year of the experiment (Fig. 2B). As with similarity to the green hay donor site, goodness-of-fit to MG5 grassland differed highly significantly between years (F [3,42]  p < 0.001), again with lower levels in year 1 than in years 2-4 (Tukey's t ≤ −7.77, p < 0.001 for all three pairwise comparisons involving year 1). The closest goodness-of-fit to MG5 grassland was found in the low-flood-risk areas within the green hay treatment, leveling off at 52% from year 2 onwards (Fig. 2B). This is slightly lower than a goodness-of-fit to MG5 grassland of 66% calculated for the vegetation at the green hay donor site, which conformed more closely to MG5 grassland than to any other NVC vegetation type.
Total cover of MG5 positive indicator species in both restoration treatments also highly significantly differed between years (F [3,42] = 51.81, p < 0.001), increasing over time (Fig. 2C). Pairwise comparisons based on Tukey tests indicate that cover was lowest in year 1, and highest in years 3 and 4, with intermediate cover in year 2. MG5 positive indicator cover reached its highest level (45%) in year 4 in the low-flood-risk areas of diversely seeded restored grassland (Fig. 2C), which is almost as high as the mean positive indicator cover of 49.1% (SE: ±3.3%; n = 24) found at the green hay donor site in 2013. While it appears from Fig. 2C that the diversely seeded treatment was characterized by higher total cover of MG5 indicator species than the green hay treatment, this difference was not significant (F [1,3] = 5.54, p = 0.100).
Species density per m² of MG5 positive indicators in active restoration treatments also differed between years (F [3,42] = 5.06, p = 0.004; Fig. 2D), and was also lower in year 1 than in years 3 (Tukey's t = −3.63, p < 0.001) and 4 (Tukey's t = −2.96, p = 0.025). In year 4 of the experiment, average MG5 positive indicator species density in active restoration treatments at a given flood risk level ranged from 1.9 to 2.6 species per m² (Fig. 2D), and was still markedly lower than the species density of MG5 positive indicators of 6.9 species per m² (SE: ±0.3; n = 24) found at the green hay donor site in 2013.
Total species density in the two active restoration treatments also differed highly significantly between years (F [3,42] = 14.48, p < 0.001; Fig. 2E). Pairwise comparisons based on Tukey tests indicate that it was slightly lower overall in year 1, and highest in year 3, with intermediate levels in years 2 and 4. Patterns were also affected by two-way interactions between year and restoration treatment (F [3,42] = 7.28, p < 0.001), and between year and level of flood risk (F [3,42] = 5.10, p = 0.004). The first of these interactions appears to have been the result of higher initial species richness in year 1 in the diverse seeding treatment than in the green hay treatment (Fig. 2E). The second interaction appears to have resulted from a temporary reversal in year 2 of the experiment of the pattern of slightly lower species richness in high-flood-risk areas that was observed in other years (Fig. 2E). Average total species density in the final year of the experiment in the active restoration treatments at a given flood risk level ranged from 12.3 to 14.8 species per m² and was still markedly lower than the average total species density at the green hay donor site in 2013 of 21.9 species per m² (SE: ±0.7; n = 24).

Compositional Dynamics
General Trends With Time. As established by pRDA analyses, pronounced shifts in plant species composition occurred in all three treatments (Table S5). Taking into account relevant covariates, year as an explanatory variable explained 32% of partial variation left in the control treatment, 39% in the diverse seeding treatment, and 43% in the green hay treatment, with Monte Carlo permutation tests indicating these shifts to be highly significant (pseudo-F values from 6.6 to 9.9 and p < 0.001 for all three analyses; Table S5).
As established by a second set of pRDA analyses, specifying the Flood Risk × Year interaction as explanatory variable and a different set of covariates, species compositional trends were modified by flood risk level in the two active restoration treatments (pseudo-F = 8.3 for diverse seeding and pseudo-F = 8.9 for green hay application; for both treatments p = 0.031; see Table S5), but not in the control treatment (pseudo-F = 3.1; p = 0.204).
Neither for the green hay treatment nor for the diverse seeding treatment did Spearman correlation of experimentally introduced species' pRDA axis 1 scores, representing overall trends in performance with time, and their life strategy and realized niche characteristics yield any significant relationships (Table S6). However, while no such dependence could be established in relation to life history and realized niche, as indicated by species response curves for both treatments, significant shifts in percentage cover occurred over the 4 years of the experiment in several of the introduced target species (Fig. S2A,B). Moreover, a comparison of species response curves for both active restoration treatments indicates similarities in temporal patterns of percentage cover of several species simultaneously introduced in both treatments (Fig. S2A,B). Notably, in both treatments, cover of Agrostis capillaris and of Centaurea nigra steadily increased with time, whereas cover of Leucanthemum vulgare first increased, and then decreased (Fig. S2A,B). One notable difference between treatments was that Plantago lanceolata established a high initial cover in the first year in the green hay treatment, followed by a steady decline, whereas in the diverse seeding treatment, it appeared to establish less well initially, to then steadily increase in cover (Fig. S2A,B). Another notable difference between both treatments was that cover of Trifolium pratense increased rapidly in the green hay treatment, reaching a plateau of about 18-21% in years 2-4 of the experiment (Fig. S2B), whereas in the diverse seeding treatment, it remained under 2% in the first 3 years, and the species was no longer recorded in year 4.
For extant species already present in the control treatment at the start of the experiment, Spearman correlation analyses established that species characterized by high Ellenberg N-values were more likely to decline than species characterized by lower N-values (r_S = −0.53, n = 21, p = 0.013). The most obvious cover changes at species level were a decrease in ryegrass L. perenne, and an increase in T. repens (Fig. S3).
Trends in Response to Flood Risk Level. Species-level cover trends of experimentally introduced green hay species in response to differential flood risk, as characterized by pRDA axis 1 scores of an ordination analysis of such trends, were significantly related to C and R strategy scores sensu Grime et al. (2007), with high-flood-risk areas being associated with a relative increase with time of species characterized by high C-scores (r_S = 0.52, n = 29, p = 0.004), when compared to low-flood-risk areas, and a relative decrease of species characterized by high R-scores (r_S = −0.47, n = 29, p = 0.009). Compared to low-flood-risk areas, high-flood-risk areas were also associated with a relative increase with time of species characterized by high Ellenberg R-values (r_S = 0.38, n = 29, p = 0.043), and of species characterized by high Ellenberg N-values (r_S = 0.59, n = 29, p < 0.001). In the diverse seeding treatment, trends in relation to these characteristics of introduced species had the same direction as in the green hay treatment, but were not significant (Grime's C-score: r_S = 0.14, n = 21, p = 0.545; Grime's R-score: r_S = −0.26, n = 29, p = 0.263; Ellenberg's R-value: r_S = 0.30, n = 29, p = 0.183; Ellenberg's N-value: r_S = 0.24, n = 29, p = 0.293; see Table S6 for full results).
Spearman correlation analyses of differential trends within control treatment plots in relation to flood risk level indicate a relative increase of extant species characterized by high Ellenberg N-values in plot areas characterized by high flood risk when compared to areas of low flood risk (r_S = 0.47, n = 21, p = 0.033), suggesting that the overall negative trend of high-N species was more pronounced in low-flood-risk areas than in high-flood-risk areas.
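The rank correlations reported above relate species-level trend scores to trait or indicator values. A minimal sketch of how such a Spearman coefficient can be computed is given below, assuming average ranks for ties. The function names and the example values are hypothetical illustrations and are not data from the study; significance testing, as reported in the text, is not covered.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's r_S: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical illustration data (NOT values from the study):
# one Ellenberg N-value and one ordination axis 1 score per species.
ellenberg_n = [2, 3, 4, 4, 5, 6, 7, 7]
axis1_score = [0.8, 0.5, 0.6, 0.1, -0.2, 0.0, -0.6, -0.9]

r_s = spearman_rs(ellenberg_n, axis1_score)
print(f"r_S = {r_s:.2f}, n = {len(ellenberg_n)}")
```

In this made-up example, species with higher N-values have lower axis scores, so the coefficient is negative, mirroring the direction of the relationship described for extant species in the control treatment.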

Soil Characteristics
High levels of available P and K, combined with low total N and soil organic matter, as was the case in at least part of the experimental area in our study, are a common starting point for restoring grassland on former arable land (McCrea et al. 2004; Donath et al. 2007). However, a requirement of sufficiently low levels of available P has been identified as prerequisite for high plant species richness in grassland (Janssens et al. 1998; Critchley et al. 2002a). Our site was converted from arable land to species-poor intensive grassland 6 years prior to the experiment, and accordingly, at the end of our experiment, particularly in high-flood-risk areas, total N and organic matter content were already higher than at local arable reference sites. On the other hand, Olsen-P and extractable K are still high in these high-flood-risk areas, whereas in low-flood-risk areas, P levels are already lower than those found at local arable reference sites, and K levels are at the lower end of the range of values at these sites. While Olsen-P is still slightly higher in areas of low flood risk than at the green hay donor site, its levels in these areas are already comparable with those typically found in MG5 grassland (Critchley et al. 2002b), and below a threshold for Olsen-P of 15 mg P/L defined by Critchley et al. (2002b) as critical for the formation of species-rich grassland. In the case of extractable K, levels found in low-flood-risk areas at the experimental site are similar to both those found at the green hay donor site and those typically found in MG5 grassland (Critchley et al. 2002b). In contrast, in high-flood-risk areas, both P and K levels were higher than those found at the green hay donor site and those typically encountered in MG5 grassland (Critchley et al. 2002b).
The fact that levels of various soil parameters differed between areas according to flood risk level was not unexpected. Nutrient inputs by river flooding have been described for rivers of similar size to ours (Sival et al. 2005), as have increased levels of, e.g., soil P in more regularly flooded parts of floodplains (Klaus et al. 2011). Hence, conditions at our site are likely to reflect those found on other ex-arable land similarly subjected to occasional flooding, whose limited utility for arable cropping makes it a candidate for grassland restoration.

Restoration Progress and Compositional Dynamics
Both our active restoration treatments resulted in grassland vegetation much more floristically similar to MG5 grassland and to the green hay donor site than the control treatment. In the control, we observed shifts in the relative cover of extant species, in line with the shift from previously intensive management to extensive management characteristic of species-rich MG5 grassland, but independent colonization by restoration target species from outside the experiment did not occur.
Somewhat surprisingly, green hay application did not produce a sward more similar to that of the green hay donor site than diverse seeding, although a near-significant trend was observed. Our results in terms of the diversely seeded vegetation's goodness-of-fit to MG5 grassland and similarity to the green hay donor site vegetation underline that good results can also be achieved with species-rich seed mixtures. Other indicators of restoration progress in our study, such as summed cover and species density of MG5 positive indicator species sensu Robertson and Jefferson (2000), and total species density, also yielded comparable results for the green hay and diverse seeding approaches. As we also carried out an analysis of green hay seed content (data not shown), we were able to ascertain that the seeds of similar numbers of MG5 positive indicator species were introduced in both treatments (green hay application: 11; diverse seeding: 12), indicating an equivalence of both treatments in terms of their potential to restore MG5 grassland.
Notably, regardless of approach, similarity of restored vegetation to that of the green hay donor site and goodness-of-fit to MG5 grassland, after an initial increase in the first 2 years, thereafter remained more or less constant, whereas cover of MG5 positive indicator species continued to increase. This continued increase in positive indicator species cover in subsequent years was mainly caused by C. nigra. Another positive indicator in both treatments, L. vulgare, peaked in years 2-3 and decreased again in the final year. As a relatively ruderal grassland species of the CR/CSR type according to Grime et al. (2007), this species establishes reliably and often initially abundantly from seed in the early open phase of grassland restoration (Pywell et al. 2003). Leontodon saxatilis, another positive indicator species in the green hay treatment, continuously declined between years 1 and 4. Several other species that occur in MG5 grassland, but are not defined by Robertson and Jefferson (2000) as positive indicators, similarly declined over the course of the experiment, including Festuca rubra in the diverse seeding treatment, and P. lanceolata and Scorzoneroides autumnalis in the green hay treatment. It thus appears that the observed leveling-off of compositional similarity with the two references was not a reflection of compositional stability, but the net outcome of increases in the abundance of some target species and of decreases by others, with these opposing trends canceling each other out. This means that, ultimately, only those target species particularly well-adapted to the prevailing conditions at the experimental site may persist, whereas other, less well-adapted species likely continue to decline and may get gradually eliminated from the sward, due to abiotic and competition filters at this former arable site continuing to affect plant species composition.
Notably, we found that the strength of these filters is positively linked with flood risk level. The similarity of grassland restored by green hay application or by diverse seeding with both references was markedly lower in high-flood-risk areas than in low-flood-risk areas. Stronger filtering in high-flood-risk areas is also supported by correlation analyses, which indicate that species performing increasingly better with time in high-flood-risk areas than in low-flood-risk areas tended to be more competitive and better-adapted to high levels of soil fertility, as indicated by the correlation of species responses with Ellenberg N-values reflecting site productivity (Wagner et al. 2007), in line with existing edaphic gradients of extractable P and K. Similar patterns in terms of Ellenberg R-values of species match existing gradients in soil pH. For the R-score sensu Grime et al. (2007), which characterizes the "ruderality" of plant species, we found an opposite trend. This finding suggests that, with increasing time after initial restoration, continued opportunities for plant establishment from seed, as required by species with a more ruderal strategy, primarily persist in low-flood-risk areas characterized by lower fertility. Overall, these differential trends in the performance of species introduced via active restoration, between areas of low and of high flood risk, indicate a process of species sorting in line with underlying environmental gradients. It appears that, once the window-of-opportunity for initial target species establishment is closed, pre-existing environmental filters gradually affect species in accordance with their established life strategies, sorting them according to their realized niche requirements along spatial environmental gradients.
Eventually, species compositional patterns in our restored grassland will likely resemble those found in extant grassland along river floodplains, where competitive nutrient-demanding grassland species tend to be more common in floodplain compartments that are not protected from flooding (Klaus et al. 2011). However, it remains to be seen how such processes of species sorting affect species composition in the longer term, both in terms of community assembly and with respect to the degree of success in restoring the reference plant community (Bischoff et al. 2018; Harvolk-Schöning et al. 2020).
Another important finding of our study is that, on its own, no single indicator of restoration progress allows a detailed enough assessment of the underlying dynamics to evaluate prospects for longer-term success. In fact, the chosen indicators displayed very different trajectories, as exemplified by the quick leveling-off of indicators of reference similarity, compared to a steady increase in MG5 positive indicator cover.
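To illustrate why a flat similarity trajectory need not imply a stable composition, the following minimal sketch uses Bray-Curtis similarity on percentage cover. Both the choice of index and all cover values are assumptions made for illustration only; they are not the similarity measures or data used in this study, and the species names are placeholders.

```python
def bray_curtis_similarity(sample, reference):
    """Bray-Curtis similarity (1 minus dissimilarity) of two cover dicts."""
    species = set(sample) | set(reference)
    shared = sum(min(sample.get(s, 0), reference.get(s, 0)) for s in species)
    total = sum(sample.values()) + sum(reference.values())
    return 2 * shared / total


# Hypothetical percentage cover values (NOT data from the study).
reference = {"species_a": 20, "species_b": 20, "species_c": 20}
year_2 = {"species_a": 10, "species_b": 20, "species_c": 5}
year_4 = {"species_a": 20, "species_b": 10, "species_c": 5}

sim_year_2 = bray_curtis_similarity(year_2, reference)
sim_year_4 = bray_curtis_similarity(year_4, reference)

# species_a doubled and species_b halved between the two years, yet the
# two similarity values are identical because the changes offset each other.
print(sim_year_2, sim_year_4)
```

In this made-up case, an increase in one target species is exactly offset by a decline in another, so the index alone would not reveal the turnover. Tracking species-level trends alongside summary indicators avoids this blind spot.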
Total species density, the least specific of our indicators, was high in year 1 in the diverse seeding treatment and dropped in year 2, whereas in the green hay treatment, it started off from a lower level but then constantly increased until year 3 of the study. Similar increases in species density in the first few years after green hay application, both in terms of introduced target species and across all species, were also found by Schmiede et al. (2012). In our study, the difference in patterns between the two active restoration treatments partly reflects the fact that species from the weed seed bank in the soil on this relatively recent ex-arable site initially established in much greater abundance in the diversely seeded treatment, causing a transient flush in their occurrence. Similar patterns in other studies have been interpreted as a result of weed seedling emergence being stimulated by cultivation, with this effect being negated by green hay application but not seeding, due to the added cover of mulch provided by green hay decreasing light availability at the soil surface, and adding to the thickness of the layer that seedlings from the soil seed bank must penetrate in order to emerge (Jones et al. 1995; Desserud & Naeth 2011). A continued increase in overall species richness in the second year after green hay application, and sometimes even in the third year as observed in our study, is not uncommon (Mann & Tischew 2010; Sengl et al. 2017), and is attributed to an initial delay of seedling emergence caused by green hay mulching (Sengl et al. 2017) or to primary seed dormancy resulting in delayed germination (Mann & Tischew 2010).

Recommendations for Restoration
In our study comparing green hay application with the use of a specifically tailored species-rich seed mixture, neither approach proved superior to the other; that is, restoration of MG5 grassland can be achieved with either approach. However, depending on context, other considerations might favor one approach over the other. Green hay application tends to be less costly than the use of diverse commercial seed mixtures, and also allows the transfer of species not available commercially. This is especially important as about two thirds of the species of European grassland are still not commercially available (Ladouceur et al. 2018). However, green hay donor sites may not always be available, as high-quality remnants of species-rich grassland are rare, and such sites need to be local to recipient sites because the quick transfer of green hay is essential (Trueman & Millett 2003).
It is also possible to combine green hay and seed-sowing approaches from the start, e.g. if the amount of green hay available is not sufficient to cover a whole site, or if the aim is to establish as many species as possible from the reference community. As shown by Baasch et al. (2016), such a combination has the potential to produce better results than either approach on its own.
In our study, total species richness and richness of MG5 indicator species did not increase after year 2. As indicated by longer-term follow-up studies after green hay restoration (Sullivan et al. 2020) and/or restoration by other means (Bissels et al. 2004; Harvolk-Schöning et al. 2020), further progress towards the reference beyond the first 4 years is often also slow or absent, due to low functional connectivity with high-quality remnant sites. It has thus been suggested that to achieve further restoration progress, deliberate introduction of additional target species may be required during later stages of restoration (Sullivan et al. 2020).
Another implication of our results is that the variation in environmental conditions between different parts of a field can affect patterns of results at a recipient site. In such situations, a more spatially targeted approach might be more efficient, particularly if resources are limited and have to be concentrated on those areas characterized by the most favorable conditions for restoration of species-rich references. With respect to MG5 grassland restoration as carried out in our experiment, elevated levels specifically of phosphorus in those areas characterized by higher flood risk likely represent a bigger obstacle than elevated levels of other nutrients. This is further underlined by results from a study on MG5 grassland creation by McCrea et al. (2001), who found that plant species diversity was closely negatively associated with soil P levels, whereas no such relationship was found for soil K levels. In a subsequent study that also included old MG5 grassland reference sites, McCrea et al. (2004) found that high K levels favored plant diversity. Accordingly, Klaus et al. (2011) have questioned the suitability of regularly flooded areas next to river channels that are characterized by high P inputs, for the restoration of grassland vegetation typically occurring at sites characterized by low nutrient levels, and other targets may have to be defined for such areas. Even when such gradients, e.g. in soil fertility, are less obvious but nonetheless suspected, it might be worth investing in spatial monitoring of site conditions to have this information available at the planning stage.
The following information may be found in the online version of this article:
Figure S1. NMDS sample plot of a joint analysis of vegetation quadrats in the experiment and at the green hay donor site.
Figure S2. Species response curves for experimentally introduced species in the diverse seeding and green hay treatments.
Figure S3. Species response curves for extant species in the control treatment.
Table S1. Species composition of the seed mixture used in the diverse seeding treatment.