A decision support framework assessing management impacts on crop yield, soil carbon changes and nitrogen losses to the environment

Agricultural management practices have multiple impacts on farming systems, including crop yield, soil fertility parameters such as soil organic carbon (SOC), and environmental quality. Agricultural decision support tools (DSTs) are key in sustainable farm strategies to optimize yield and minimize environmental losses because both the current agroecosystem properties as well as the effectiveness of management practices are highly variable in time and space. Here, we introduce a highly data‐driven framework focusing on the evaluation of agronomic measures to reach agronomic and environmental targets. We demonstrate the potential of this approach by a proof of principle for 81 selected farm types across Europe, focusing on measures with respect to crop rotation, fertilization and soil tillage. Synthesizing data from long‐term experiments and meta‐analytical models, we estimated the impact of these measures on crop yield, SOC and N surpluses, while accounting for site‐specific properties for the current and desired situation. The impacts of these measures on all farm types have been quantified, and optimum sets of agronomic measures have been selected in order to maximize crop yield and SOC levels and minimize N surpluses to reach the critical values for NO3− concentrations in groundwater. Our results, quantifying trade‐offs among sustainability indicators that have traditionally been analyzed separately, illustrate that the suitability of measures varies by soil, climate and crop types within Europe. Our approach is promising for mapping region‐specific management recommendations and evaluating the effectiveness of agronomic measures over multiple environmental goals and targets.


| INTRODUCTION
To meet the demands of a growing population, agriculture continues to intensify, with increasing and evolving impacts on crop growth, soil quality and environmental quality (Kanianska, 2016). World food production has increased greatly since the 1960s, with a 68% increase in Europe over 40 years and an increase in per capita agricultural production, accompanied by a corresponding increase in machinery and fertilizer use (Pretty, 2008). Increased inputs of nitrogen (N) and phosphorus (P) to the soil have also led to substantial negative impacts on biodiversity, drinking and surface water quality, and human health (Amery & Schoumans, 2014; Cordell, Drangert, & White, 2009; European Commission, 2013; Kros, de Vries, & Voogd, 2015; Pretty, 2008; Tonitto, David, & Drinkwater, 2006; Velthof et al., 2014). In addition, there are indications of a decline in soil organic carbon (SOC) content in response to climate change (Wiesmeier et al., 2016), a decline that is regarded as a threat to European soils because of the crucial link between SOC and ecosystem functioning (Haddaway et al., 2014; Panagos, Hiederer, Van Liedekerke, & Bampa, 2013; Stolte et al., 2016). Agriculture is challenged to intensify sustainably in order to meet the demands of improving yields without compromising environmental integrity or public health. Sustainable intensification refers to strategies for increasing food production on existing agricultural land while minimizing environmental impacts (Pretty, 2008). These strategies include fertilizer, crop and soil management (Basch et al., 2011). There is a broad understanding that soils are not just a growing medium for crops but that they also support multiple ecosystem services, such as water purification, carbon sequestration, nutrient cycling and the provision of habitats for biodiversity (Bünemann et al., 2018; Rinot, Levy, Steinberger, Svoray, & Eshel, 2019).
Closing yield gaps, for example, remains relevant in substantial parts of Eastern Europe where less than 40% of production potential has been achieved (Pradhan, Fischer, van Velthuizen, Reusser, & Kropp, 2015). Increased fertilizer application is perhaps the most widely applied strategy to enhance crop yield; however, although the efficiency of fertilizer use is generally higher in Europe (De Vries & Schulte-Uebbing, 2020), more than half of applied N is typically lost to the environment (Galloway & Cowling, 2002). Improving N use efficiency and decreasing pesticide use are particularly challenging across Europe given net precipitation surpluses, intensive cropping systems, and increased runoff risk from soil compaction or unsustainable fertilizer practices (van den Akker & Soane, 2005; Wang & Li, 2019). Soils and their management in agriculture are also critical contributors to the ability of the earth's biosphere to act as a carbon sink and to reduce emissions of nitrous oxide (Haddaway et al., 2014; Minasny et al., 2017; Paustian et al., 2016; Stolte et al., 2016; van Groenigen, Qi, Osenberg, Luo, & Hungate, 2014).
Because agricultural management practices are part of sustainable intensification, it is key that we understand their effects on crop growth, soil and environmental quality. For example, practices such as diversified crop rotations and improved nutrient management through 4R practices, that is, applying fertilizer according to the right type, right amount, right timing and right placement (Venterea, Coulter, & Dolan, 2016), could help maintain crop yields while reducing nutrient losses (Eagle et al., 2017; Tonitto et al., 2006). Although agricultural intensification might negatively impact SOC contents (Luo, Wang, & Sun, 2010), measures such as residue management, reduced tillage and optimized rotation schemes can sequester carbon, becoming influential in climate change mitigation and increasing soil fertility (Haddaway et al., 2014; Lugato, Bampa, Panagos, Montanarella, & Jones, 2014). Various best management practices (BMPs) are recommended based on goals for soil quality, nutrient surpluses and water use (Antonopoulos et al., 2018). Because impacts vary by agro-ecological conditions (Abdalla, Chivenge, Ciais, & Chaplot, 2016; Qin, Hu, & Oenema, 2015), successful management strategies should be tailored to site properties, as illustrated for selenium by Ros et al. (2016). Furthermore, various synergies and trade-offs among BMPs and sustainability indicators exist (Klapwijk et al., 2014).
Considering only a single impact can lead to unexpected outcomes for the other aspects that a management measure affects. For example, measures to increase SOC may lead to increases in nitrous oxide emissions (Gao et al., 2018; Lugato, Leip, & Jones, 2018).
Frameworks for integrated, goal-oriented assessment of both the environmental performance of agricultural measures and crop yields are essential for sustainable intensification and for unravelling the aforementioned trade-offs and synergies. A wide range of decision support tools (DSTs) have been developed and used over the last decades, providing decision options for policymakers and farmers (Power, 2007). Some DSTs are more strongly focused on a certain topic, such as soil protection (Oleson et al., 2015; Sarangi, Madramootoo, & Cox, 2004), precision agriculture (Venkatalakshmi & Devi, 2013), fertigation management (Elia & Conversa, 2015) or specific nutrient measures (Hewett, Quinn, Whitehead, Heathwaite, & Flynn, 2004; PLANET, 2019). Others have been developed for a specific geographic context (Manos, Bournaris, Papathanasiou, Moulogianni, & Voudouris, 2007) or for land evaluation and spatial planning of sustainable management operations (De la Rosa, Mayol, Diaz-Pereira, Fernandez, & De la Rosa Jr, 2004). Most of these DSTs make use of some form of multi-criteria analysis, focusing on resource allocation for farmers in coordination with environmental protection (Latinopoulos, 2009; Manos et al., 2007; Parsons, 2002). These DSTs therefore rely on a series of input databases as a function of the geographic context. Several DSTs explicitly aim for increased resilience of agricultural systems to climate-induced environmental changes (Oleson et al., 2015; Wenkel et al., 2013).
We aim to evaluate the effectiveness of recommended agricultural measures along with their trade-offs (or synergies) for the environment considering multiple sustainability objectives, while accounting for site-specific properties and links to target levels. Although the design of many DSTs partially addresses these goals, the linkage between management measures and site-specific targets is usually weak and dependent on process-based models. For example, MEBOT assesses economic and environmental impacts of manure and fertilizer strategies at the farm level using a process-oriented model for capturing nutrient and carbon cycling dynamics at the field level (Schreuder, van Dijk, van Asperen, de Boer, & van der Schoot, 2008). GPFarm evaluates on-farm management practices and sustainability indicators given targets for transport of nitrates and pesticides at the catchment scale, as well as optimizing crop yield and animal production, given environmental and economic impacts (Ascough et al., 2001). DSSAT and APSIM (continental to global applicability) focus on crop growth simulation based on detailed computations related to soil-plant-atmosphere dynamics, where APSIM additionally includes preferences of the farmer as well as trade-offs between profit and risk (Holzworth et al., 2014; Hoogenboom et al., 2015; Jones, 1993; Jones et al., 2003). The recent launch of Soil Navigator DSS, a field-scale DST for assessment and management of soil functions, is a cutting-edge development due to its continental coverage, use of multi-criteria decision analysis, integration of expert knowledge, and close consultation with stakeholders for end-use (Debeljak et al., 2019). Although highly relevant and valuable for scientific and modelling purposes, a significant number of DSTs rely on process-based models, implying the need for relatively high data inputs by the user (to allow site specificity) as well as expert knowledge for generating decision-support insights.
The environmental impact assessment of measures on the farm scale is usually weakly underpinned by experimental evidence.
Despite the existence of various DSTs for agriculture, we find that very few tools evaluate agronomic measures quantitatively using impact assessment on targeted environmental as well as agronomic outcomes. This limitation is likely to originate from the fact that most of them use process-based models to evaluate the impacts of measures, and these models are intrinsically difficult to parameterize on the field and farm scale (see e.g., Lutz, Stoorvogel, and Müller (2019b) for an overview of models assessing losses of nitrous oxide). In addition, model validation is challenging, because independent data on management impacts are limited, and model results often deviate from measurements due to the complexity of interactions between measures and soil processes (see e.g., Lutz et al. (2019a)). An alternative approach is to use quantitative environmental impacts based on meta-analytical regression models, which have gained attention and importance in the last decade. Meta-analysis is defined as the quantitative analysis of empirical research results (Haddaway and Rytwinski, 2018), where an average effect size and its significance are summarized across multiple studies (Franke, 2015). Meta-analytical models are highly data driven due to their dependence on field experimental evidence and have been increasingly used in environmental sciences to find general patterns among field experiments, settle controversies among conflicting studies and generate new hypotheses. As such they have the potential to overcome the current challenges regarding site specificity and to unravel site-driven interactions among agroecosystem properties controlling the efficiency and impact of measures.
We foresee a new evolution of DSTs that incorporate these data-driven approaches, integrating literature-based evidence to evaluate the impacts of agricultural measures. The SmartSOIL toolbox, for example, integrated meta-analytical data into its assessment of crop yield, SOC and economic impacts in select regions of Europe (Oleson et al., 2015). To support and evaluate the potential of this evolution towards empirically supported management effects, we developed a framework for integrating meta-analytical data and illustrate its potential in a case study, while linking outcomes to agronomic and environmental targets. Our approach, in contrast to process-based tools, links many meta-analytical models into one dataset by quantifying changes in these agronomic and environmental indicators due to management measures. This ultimately leads to empirically driven decision support for targeted shifts in best management practices on arable farms. As a proof of principle, we illustrate the overall benefits and trade-offs that BMPs have for three selected indicators, crop yield, SOC and N surplus, which represent the crop, soil and environmental domains, respectively.

| Overall approach
Based on the empirical foundation of meta-analysis and long-term field experiments, we developed a decision support framework (DSF) interlinking BMPs and their impacts on sustainability indicators for agricultural production and environmental impacts in view of meeting critical environmental limits and yield targets (Figures 1  and 2). Variation in soil, crop and climate types (site specificity) is included, making it applicable across various spatial scales. In contrast to traditional baseline management, which is covered by other modelling suites, we focus on decision making for targeted sustainable shifts in management that are relevant across Europe. Sustainable shifts from the current situation are evaluated in view of desired critical limits and thresholds for yield, soil organic carbon levels, nitrogen surpluses and targets for crop production. We assume in our model that the current situation means a management practice is not applied (i.e., the control measure is currently applied) and that changes occur when the practice is applied (i.e., the treatment). Trade-offs and synergies among agronomic measures (so-called "best management practices") are assessed in order to define optimum management practices that maximize agronomic production with minimum environmental losses and maintenance of soil quality. The main objective of our framework is not only to improve the evaluation of specific aspects of soil quality as well as agricultural and environmental sustainability, but also to demonstrate a straightforward flexible approach of connecting data from long-term field studies in a usable, relevant and simple manner. 
We first describe what options are included in our impact analysis for specific farming systems across Europe (Section 2.2) and subsequently the methodology behind our framework, consisting of: estimated site-specific effects of management practices on crop yields, SOC and N surplus via meta-analytical models (Section 2.3), the derivation of site-specific target values for crop yields and SOC as well as critical limits for N surplus (Section 2.4), and an integrated evaluation and ranking of management measures in view of their impacts on crop yields, SOC and N surplus (Section 2.5). Calculations can be performed for spatial "units" varying in resolution (fields, farms and regions) given predefined categories for soil type, cropping systems and climatic conditions.
The predefined set of management practices (the currently available set is described in the next section) is evaluated in terms of the greatest overall improvement considering all indicators (e.g., largest increase for crop yield and SOC, largest decrease for N surplus). Built-in (default) values for (a) management impacts, (b) targets of indicators and (c) reference status of site properties help to minimize data inputs from users. Users can define which crop types or management practices are included (with no selection, all options are given as outputs) as well as the time period over which measures are evaluated. In principle, users may also define their own reference and target values, without affecting the model operations. The user can view varied quantitative information, such as the management impacts, absolute distances to targets and relative distances to targets, via output tables of the model framework.

| Management, site properties and impacts included
We currently focus our evaluation on a series of agronomic measures, including (a) crop management measures such as diversification (by addition of a cover crop, legume crop, extra crop species or green manure into rotation) and crop residue incorporation, (b) soil tillage practices, and (c) multiple fertilization strategies. More specifically, we assessed six so-called treatments: combined organic and mineral fertilizer effects compared to a mineral fertilizer control (CF), organic fertilizer effects compared to a mineral fertilizer control (OF), no tillage … losses (Figure 1b). We currently focus our evaluation on crop yield, SOC and N surplus to represent each of these domains (hereafter simply called "impacts"). Our framework provides evaluations for the three crop types (cereals, maize and root crops), the three soil types (clay, loam and sand) and the three climate types (continental, Mediterranean and oceanic) (Figure 1c). This results in 27 combinations that are further described in the next section.
FIGURE 1 Overview of the decision support framework approach, consisting of (a) management practices assessed, (b) estimation of management impacts, (c) the influence of agroecosystem properties on impacts, environmental zones adapted from Metzger, Bunce, Jongman, Mücher, and Watkins (2005)
Note: The range in annual percentage changes for each indicator due to each management practice is given, as well as a description of the variation in estimates, including the site factors most contributing to this variation. Combined fertilizer and organic fertilizer are in comparison to a mineral fertilizer control, no tillage and reduced tillage in comparison to conventional tillage, crop residue incorporation in comparison to residue removal, and crop rotation in comparison to a monoculture control. SOC, soil organic carbon; N, nitrogen.

| Meta-analytical estimates of management impacts
Meta-analysis publications calculate effect sizes (e.g., a ratio of treatment effect to the control or null effect or an absolute change due to a treatment), often reported in units such as percentage changes. In total, 37 meta-analysis publications (Table S4) were used to estimate the effect sizes in our analysis (Table 1), which were applied in our framework for the 27 combinations of crop, soil and climate types on crop yield, SOC and N surplus for the six management practices (Section 2.2). Data are expressed as % change year⁻¹ compared to a control. To assess the % change year⁻¹, we took the average duration of experiments into account, which has been recorded for each meta-study. If not reported in annual units, changes were averaged over the mean years of duration to derive an average % change year⁻¹. Studies on SOC had an average experimental duration of 15 years. Crop yield and N surplus are generally reported by meta-studies as mean changes within 1 year (season), averaged over the various years included in field measurements. The average duration included by studies quantifying impacts of measures on crop yield and N surplus was 10 years.
Where multiple studies report on the same management impact, individual effect sizes were weighted in inverse proportion to the reported variance and aggregated into an overall mean (Young, Ros, & de Vries, 2019) via Equations 1 and 2, thereby extending the applicability of the meta-analytical models found:
$$\bar{x} = \frac{\sum_i x_i / \sigma_i^2}{\sum_i 1 / \sigma_i^2} \quad (1)$$
$$\sigma_{\bar{x}} = \sqrt{\frac{1}{\sum_i 1 / \sigma_i^2}} \quad (2)$$
where $\bar{x}$ = weighted mean, $\sigma_{\bar{x}}$ = standard error of the weighted mean, $x_i$ = individual mean from reported effect size and $\sigma_i^2$ = individual variance from reported effect size.
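The inverse-variance weighting described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `inverse_variance_mean` and the example numbers are hypothetical, and the formula used is the standard inverse-variance pooled mean with its standard error.

```python
import math


def inverse_variance_mean(means, variances):
    """Pool effect sizes, weighting each by 1 / variance.

    Returns the weighted mean and the standard error of that mean.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * x for w, x in zip(weights, means)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return pooled, standard_error


# Hypothetical effect sizes (% change per year) from two meta-studies.
pooled, se = inverse_variance_mean([1.0, 3.0], [1.0, 4.0])
print(round(pooled, 3), round(se, 3))  # 1.4 0.894
```

The more precise estimate (variance 1.0) dominates the pooled value, which is the intended behaviour of this weighting.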
In addition to global (or overall) estimates, existing meta-studies often provide effect sizes for different site properties as moderator variables of the management-impact relationship (Meurer, Haddaway, Bolinder, & Kätterer, 2018; Quemada, Baranski, Nobel-de Lange, Vallejo, & Cooper, 2013). For soil type, for example, this may result in subgroups of clay, sand or loam, where each estimate is a mean of the observations collected for one soil type by that meta-study. Whenever available, effect size estimates for individual crop, soil and climate types were averaged to get an overall mean for each unique combination of crop, soil and climate. When a mean for a specific site property was missing, it was replaced by an overall estimate for that impact. Impacts were thus compiled from a synthesis of effect sizes in meta-analysis literature, reported for various climate zones, soils and crop types.
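The combination of subgroup estimates with a fallback to the overall estimate could look roughly like the sketch below. It assumes a simple unweighted average across the three moderators (the text says only that estimates were "averaged"); the function name `site_effect`, the dictionary layout and all numbers are hypothetical.

```python
from statistics import mean


def site_effect(crop, soil, climate, subgroup_effects, overall_effect):
    """Average moderator-specific effect sizes for one site.

    subgroup_effects maps a moderator name ("crop", "soil", "climate") to a
    dict of {level: effect size}. A missing level falls back to the overall
    estimate for that impact. Unweighted averaging is an assumption here.
    """
    estimates = []
    for moderator, level in (("crop", crop), ("soil", soil), ("climate", climate)):
        value = subgroup_effects.get(moderator, {}).get(level)
        estimates.append(overall_effect if value is None else value)
    return mean(estimates)


# Hypothetical effect sizes (% change per year) for one practice and indicator.
subgroups = {"soil": {"clay": 1.0}, "climate": {"oceanic": 2.0}}
print(site_effect("maize", "clay", "oceanic", subgroups, overall_effect=3.0))  # 2.0
```

Here no crop-specific estimate exists, so the overall value of 3.0 stands in for the crop moderator before averaging.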

| Site-specific assessment of target values and critical limits
We evaluate the impacts of six agronomic measures in relation to current and desired values for crop yield, SOC and N surplus. The performances of these measures are evaluated based on minimization of the distance between the current and desired situation, given target values for crop yield and SOC as well as critical limits for N surplus (Figure 3d). This distinction between target and limit values is made in order to distinguish between targets that need to be maximized (e.g., crop yield and SOC) and limits that need to be minimized (e.g., N surplus). Target crop yields were defined as 80% of the water-limited yield potential, or the exploitable yield that can be achieved when crops are grown under optimal nutrient supply and protection against pests, based on cost-effectiveness. National estimates were downscaled to reflect subnational variation from the Global Yield Gap Atlas (GYGA) (Van Ittersum et al., 2013). Targets for SOC were based on the analysis performed by Körschens, Weigel, and Schulz (1998). Using long-term experiments that began in 1902, they proposed critical limits for SOC for optimum crop production in relation to clay content, ranging from 0.5% SOC at 4% clay to 1.75% SOC at 38% clay (Table S2).
FIGURE 3 Map of geographic regions used for our case study and for which estimations of reference and target values were made for various soil and climate types. The indicated soil types were selected within three different climate zones (oceanic, continental and Mediterranean) and are labelled as such. Crop yields of the three crop types (cereals, maize and root crops) were estimated from these locations. The model is therefore applied to one homogeneous arable land unit for each combination of crop, soil and climate type. NCU, NitroEurope Calculation Units
With this approach we recognize the fact that one common or uniform threshold for SOC that limits crop production does not seem appropriate (Goulding et al., 2013;Loveland & Webb, 2003;Oldfield, Bradford, & Wood, 2019) and that clay particles might stabilize organic matter in soil (Goulding et al., 2013).
Critical limits for N surpluses were calculated by:
$$N_{\mathrm{surplus,crit}} = \frac{N_{\mathrm{crit}} \times PS}{fr_{N,\mathrm{leach}}}$$
where $N_{\mathrm{crit}}$ is the critical nitrate concentration of 11.3 mg NO3-N/L in groundwater, $PS$ is the precipitation surplus and $fr_{N,\mathrm{leach}}$ is the fraction of the N surplus that is leached from the rooting zone (the leaching or runoff fraction). Multiplying the critical nitrate concentration by the mean precipitation surplus leads to a critical N leaching rate, which is divided by the leaching fraction. The leaching fraction is calculated as a function of land use, soil type, precipitation surplus and SOC content (Velthof et al., 2009).
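As a rough numerical illustration of the calculation described above, the sketch below converts a critical concentration and a precipitation surplus into a critical N surplus. The function name, the assumption that the precipitation surplus is given in mm per year, the unit-conversion factor and the example inputs (300 mm, leaching fraction 0.5) are all assumptions made for this example; only the 11.3 mg NO3-N/L value comes from the text.

```python
N_CRIT_MG_PER_L = 11.3  # critical NO3-N concentration in groundwater (from the text)


def critical_n_surplus(precip_surplus_mm, leach_fraction, n_crit_mg_per_l=N_CRIT_MG_PER_L):
    """Critical N surplus in kg N per ha per year (illustrative sketch).

    precip_surplus_mm: precipitation surplus, assumed here to be in mm per year.
    leach_fraction: fraction of the N surplus leached from the rooting zone.
    """
    if not 0.0 < leach_fraction <= 1.0:
        raise ValueError("leach_fraction must be in the interval (0, 1]")
    # 1 mm of water over 1 ha is 10,000 L, so (mg/L) * mm = 10,000 mg/ha = 0.01 kg/ha.
    critical_leaching = n_crit_mg_per_l * precip_surplus_mm / 100.0
    return critical_leaching / leach_fraction


# Hypothetical site: 300 mm precipitation surplus, half of the surplus leaches.
print(round(critical_n_surplus(300.0, 0.5), 1))  # 67.8
```

A smaller leaching fraction allows a larger surplus before the groundwater limit is reached, which is why the critical limit depends on soil type and land use.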

| Optimization: multiple objectives and distances to targets
Our DSF is initiated for crop yield, SOC and environmental N loss, which all have partly conflicting objectives by nature (Figure 2). Our goals are to minimize N losses (environmental impact) as well as maximize both crop yield (production) and soil carbon (soil fertility). To assess the suitability of BMPs, the management-induced impact on crop yield, SOC and N surpluses is evaluated given the distance of these indicators to targets or critical levels (dependent on site conditions), where the best BMP is considered the BMP with the least deviations from these targets. The objectives are summarized in Figure 2, showing the relevant attributes (indicators), objectives (direction of improvement), targets (acceptable levels) and goals (outcomes) for decision making.
The change in an indicator $i$ (being crop yield, SOC or N surplus) due to a management measure $m$ at a certain site $s$ (being a combination of soil, crop, and climate type) was estimated via:
where $\delta R_{s,i,m,t}$ is the estimated change in indicator $i$ on site $s$ due to measure $m$, over a predefined time step $t$. This change is derived from meta-analytical models. Based on $\delta R_{s,i,m,t}$, the relative deviation from a target (e.g., yield or SOC) or a critical limit (e.g., N surplus) is used to evaluate the measure. This relative deviation $D_{s,i,m,t}$ was derived as:
which assigns each indicator a percentage deviation from its target (yield and SOC) or limit (N surplus). With the idea to meet several goals simultaneously, indicators with critical limits (N surplus) were assigned a negative deviation because the goal is to minimize the deviation rather than maximize it. The overall index or score $S$ of a measure $m$ on the set of indicators $i$ at a site $s$ was then estimated by the summed deviations. The best overall measure for one site $s$ is the measure with the lowest score. A suitability ranking of the management practices was subsequently derived based on $S$, where the best-ranked (most suitable) practice is that resulting in the least overall deviations from both targets (a shift in positive direction is improvement) and limits (negative shift is improvement). In its current form, our analysis uses an overall deviation as the ranking measure, which simply maximizes this improvement. This means that the system prioritizes additional improvement in indicators even if some targets (limits) have already been met.
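One way to picture the scoring and ranking step is the sketch below. It is an illustration under stated assumptions, not the framework's actual equations: the annual percentage change is applied linearly over the time step, the deviation for a target is expressed as the percentage shortfall below it, the deviation for a limit as the percentage exceedance above it (the sign flip for limits), and the score is the plain sum, so the lowest score ranks first. All names and numbers are hypothetical.

```python
def projected_value(reference, pct_change_per_year, years):
    """Apply an annual % change linearly over the time step (an assumption)."""
    return reference * (1.0 + pct_change_per_year * years / 100.0)


def deviation(value, goal, is_limit):
    """Percentage distance from the goal; positive means not yet met."""
    if is_limit:
        return (value - goal) / goal * 100.0  # exceedance above a critical limit
    return (goal - value) / goal * 100.0      # shortfall below a target


def rank_measures(reference, goals, limits, effects, years=1):
    """Return (measure, score) pairs sorted from most to least suitable."""
    scores = {}
    for measure, effect in effects.items():
        scores[measure] = sum(
            deviation(projected_value(reference[i], effect[i], years), goals[i], i in limits)
            for i in reference
        )
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical site and two hypothetical measures (% change per year).
reference = {"yield": 8000.0, "soc": 1.0, "n_surplus": 100.0}
goals = {"yield": 10000.0, "soc": 1.25, "n_surplus": 80.0}
effects = {
    "measure_a": {"yield": 0.0, "soc": 0.0, "n_surplus": -20.0},
    "measure_b": {"yield": 5.0, "soc": 0.0, "n_surplus": 0.0},
}
ranking = rank_measures(reference, goals, {"n_surplus"}, effects)
print([(name, round(score, 1)) for name, score in ranking])
# [('measure_a', 40.0), ('measure_b', 61.0)]
```

Because the deviations are not clipped at zero, a measure keeps gaining credit for pushing an indicator beyond its target or below its limit, mirroring the behaviour noted in the text.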

| Site selection for farming systems in Europe
We use the environmental zones developed by Metzger et al. (2005) to select a variety of climate and land types in Europe. These environmental zones have been aggregated into major climate regions by previous studies (Spiegel et al., 2014; Zavattaro, Costamagna, Grignani, & Bechini, 2014). We selected regions from three environmental zones, namely southern (Mediterranean), north-western (maritime/oceanic) and eastern (continental), to represent the variation across European arable farming systems.
Sites were selected for the various combinations of agroecosystem properties, consisting of three types of land use (crop types wheat, maize and potatoes), three soil types (sand, loam and clay) and the three climate regions mentioned above. Three countries were included for each of the climate types. This included the Netherlands, France and Ireland for north-western Europe (Oceanic), Poland, Romania and Hungary for eastern Europe (Continental), and Spain, Italy and Greece for southern Europe (Mediterranean). Figure 3 gives a spatial representation of the selection of soil and arable land across climates, resulting in 81 unique combinations.
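The count of 81 combinations follows from three crops, three soils and nine countries (three per climate region). A short Python sketch makes the enumeration explicit; the variable names are illustrative only, while the crops, soils and countries are those listed above.

```python
from itertools import product

CROPS = ("wheat", "maize", "potatoes")
SOILS = ("sand", "loam", "clay")
COUNTRIES_BY_CLIMATE = {
    "oceanic": ("Netherlands", "France", "Ireland"),
    "continental": ("Poland", "Romania", "Hungary"),
    "mediterranean": ("Spain", "Italy", "Greece"),
}

# One site per (crop, soil, climate, country) combination.
sites = [
    (crop, soil, climate, country)
    for climate, countries in COUNTRIES_BY_CLIMATE.items()
    for country, crop, soil in product(countries, CROPS, SOILS)
]

print(len(sites))  # 81
```

Each climate region contributes 27 of the 81 sites, matching the 27 crop, soil and climate combinations for which effect sizes are compiled.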

| Input data on estimated impacts of management measures
Using the meta-analytical models, the impacts of a series of management practices were quantified (see Table 1). On average, use of combined fertilizer as compared to mineral fertilizer has a positive impact on SOC, a neutral effect on crop yield (values just above and below zero), and results in a large decrease in N surplus (−35 to −33%). Organic fertilizer as compared to mineral fertilizer has a range of impacts on crop yield (−12 to 3%) and results in a clear strong increase in SOC (1 to 8%) and N surplus (26%). No tillage, as compared to conventional tillage, has a strong negative impact on yield (−11 to −9.4%) and results in small increases in SOC (less than 1%) and N surplus (1.1%). Reduced tillage decreases yield slightly (−2.4 to 0.3%), has a close to neutral effect on SOC, and increases N surplus slightly (0.9%). Residue incorporation has a range of impacts on crop yield (−12 to 0.1%), a slight positive impact on SOC (less than 1%), and slightly reduces N surplus (−2%). Crop rotation compared to a monoculture has different effects for crop yield (−2 to 5.7%) and neutral effects for SOC (−0.3 to 0.4%), and decreases N surplus (−25 to 5%). A full overview of the impact of measures on crop yield, SOC and N surpluses expressed as % change yr⁻¹ compared to a control is given in Table S1. Their actual contribution to enhancing crop yield and SOC content and lowering N surpluses below critical levels for all 81 sites is given in Table S3.

| Input data for reference and target values
Data on current (reference) and target values (crop yields and SOC contents) or critical limits (N surpluses) for the 81 country and farm-type combinations were derived from a European database containing agroecosystem properties for 40,000 specific combinations of soil properties, topography and climate (also called NitroEurope Calculation Units, NCU). This database was generated by the INTEGRATOR model for European-wide assessments of nitrogen and greenhouse gas fluxes in response to changes in land cover, land management and climate (De Vries et al., 2011; Kros et al., 2012 (2020)). Target crop yields were derived for wheat from the Global Yield Gap Atlas (GYGA) and scaled to other crops as described in De . Current clay and SOC values in the topsoil were generated from WISE, SPADE1 and EFSDB databases, which jointly contain about 3,600 soil profiles, irregularly distributed over Europe (Heuvelink, Kros, Reinds, & de Vries, 2016). Data for clay and soil organic matter contents at the NCU level were derived with a multivariate regression kriging model accounting for the spatial structure of soil properties and their dependence on explanatory variables such as soil type and land cover (Heuvelink et al., 2016). Target limits for SOC were assessed as a function of clay content (see Section 2). Current N surpluses at the NCU level were estimated as the total N input by animal manure, fertilizer, biosolids, biological N fixation and atmospheric deposition minus the N that is removed by crop harvesting. More details are given by . Critical limits for N surpluses at the NCU level were derived as a function of N input and precipitation surplus as described in Section 2.
Ranges in current and target crop yields, SOC contents and current and critical target N surpluses for the most common combinations of crops and soils are given in Table 2, with ranges related to differences in three climate zones and nine countries. Cereal crop yield targets range from 2,839 to 9,270 kg ha⁻¹, maize targets from 7,022 to 13,813 kg ha⁻¹, and root crop targets from 16,096 to 48,263 kg ha⁻¹ under different soil and climate types. Reference yields range from 2,950 to 8,660 kg ha⁻¹, 1,570 to 12,171 kg ha⁻¹, and 13,981 to 45,348 kg ha⁻¹, respectively, with variation by climate but not soil type. Furthermore, target SOC ranges from 1 to 1.5%, with slightly higher targets for clay and loam compared to sand. Reference SOC ranges from 0.5 to 10%, with a clear general increase in the order of clay, loam, sand. Critical N surplus ranges from 8 to 118 kg ha⁻¹ overall, with slightly lower limits for sand, and some higher upper limits for maize. Reference N surplus ranges from 9 to 178 kg ha⁻¹, with a slight increase in this range in the order of maize, root crops, cereals. Detailed information on both reference and targets for all 81 combinations of crops, soils, climate types and countries is given in Supporting Information Table S2. Table S3 summarizes properties of each site in relation to the derived targets and limits, illustrating the influence of these deviations on the ranking outcomes. The deviations presented for each management measure and each indicator (18 columns in total) show the trade-offs (positive/negative for different indicators) or synergies (positive/positive) that each individual management practice poses, which is also described in Section 3.2.

| Impact analysis of agronomic measures in relation to targets
Crop yields across Europe deviated from target yields with a range of −88 to 54% in clayey soils, −88 to 32% in loamy soils, and −88 to 27% in sandy soils. The deviation was the smallest for root crops (13% below target) and almost twice as big for cereals and maize (21 to 23% below target). Crop yields in Mediterranean climates were almost comparable to target yields (3% deviation), whereas the biggest deviations occurred in continental climates (36% yield gap). Average SOC over all crops was 3% lower than the optimum value in clay soils, but was more than 50% above the optimum threshold value for loamy and sandy soils. For the averaged N surplus it was the opposite: on clay soils the N surplus was 11% below the critical level but it exceeded the critical level by 15% on average in sandy and loamy soils. Differences among cropping systems were negligible for the averaged deviation of SOC from the optimum target, and cereals had a minor exceedance of the N surplus of about 8%. Surprisingly, this exceedance was on average 4% higher than for maize and root crops. Climatic effects were more pronounced for SOC, with SOC levels well above targets (180%) for oceanic climates and averaged SOC contents for the other zones 26% below the target value. The exceedance of the N surplus was 6% smaller in Mediterranean climate zones compared to both oceanic and continental ones.
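The deviations reported above can be read as relative distances between the reference value and the target or critical limit. The short Python sketch below shows this calculation under the assumption that the deviation is expressed as a percentage of the target; the function name and the example values are hypothetical and do not correspond to any specific site in the study.

```python
def relative_deviation(reference: float, target: float) -> float:
    """Deviation of the reference value from the target, in % of the target.

    Negative values mean the reference lies below the target (e.g., a yield
    gap); positive values mean it lies above (e.g., an N surplus exceeding
    its critical limit). This sign convention is an assumption.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    return 100.0 * (reference - target) / target


# Hypothetical site: yield of 6,000 against a target of 8,000 kg/ha.
yield_gap = relative_deviation(6000.0, 8000.0)

# Hypothetical site: N surplus of 92 against a critical limit of 80 kg N/ha.
n_exceedance = relative_deviation(92.0, 80.0)

print(f"Yield deviation: {yield_gap:+.1f}%")
print(f"N surplus deviation: {n_exceedance:+.1f}%")
```

Because both indicators are expressed relative to their own target or limit, deviations for yield, SOC and N surplus become comparable even though the indicators have different units.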
All measures brought crop yields and SOC on average 1-2% closer to targets. For N surplus this was more varied. Combined fertilizer and residue incorporation brought N surplus 2.5 and 0.1% further above the limit on average, respectively. N surplus moved on average approximately 2% closer to meeting targets due to organic fertilizer and crop rotation, whereas there was only a small positive effect of tillage practices. For crop yield, the best impacts were seen from organic fertilizer and cereals in oceanic climates, no tillage in oceanic climates, reduced tillage for maize in oceanic and continental climates, and residue incorporation in oceanic climates for cereals. Crop diversification had a positive impact in an oceanic climate for maize, and especially low effects on cereals under the same climate. More positive effects on SOC were found for combined fertilizer and residue incorporation in oceanic climates as well as organic fertilizer on cereals in oceanic climate. No tillage also had the highest effect in oceanic climates, this effect decreasing clearly from continental to Mediterranean. Effects of reduced tillage seem to be highest in France and Hungary for sand and loam and lowest in Italy, Romania and Poland for the same soil types. Unusually high effects were found for combined fertilizer on N surplus under maize and root crops in Italy and Spain (Mediterranean climate) for clay and loam. This effect is similar for residue incorporation. Other high effects are for combined fertilizer on N for maize crops in the Netherlands and France (oceanic climate).

| Variability in site properties in relation to BMPs
A series of management suitability rankings for the selected 81 farming systems (Tables 3 and S3) were calculated for impacts over a period of 5 years (Section 2.5). Management suitability outcomes vary by different site properties (Table 2) in addition to the variability from [...] (16) is most varied over continental (9) and Mediterranean (9) climates. Countries such as Romania, Poland and Greece have less variation in ranking outcomes than countries such as France and the Netherlands (see Table 3 for descriptive columns on site property variation).

TABLE 3 Summary of suitability ranking outcomes for our case study, which was calculated for 1 year of management
Considering the variation in management impacts as well as targets and limits for crop yield, SOC content and N surplus, it is logical that management predictions show distinct trends or influences by climate, crop and soil type combinations (Tables 1 and 2). However, the heterogeneity in management recommendation outcomes indicates that no single site property acts as the sole predictor of management performance. This shows that holistic approaches, which capture the variation in best management recommendations as a function of many site properties, are preferable to one-dimensionally designed DSTs.

| Optimum management measures across Europe
We summarize the results of management outcomes in Table 3. For each case, the six BMPs are ranked from highest (1) to lowest (6) suitability. The most common recommended measure to sustainably intensify agriculture in Europe (applicable to 77 of 81 sites) is: optimizing the nutrient budgets by combining mineral and organic fertilizers instead of mineral fertilizer only. The use of combined fertilizer strategies is the best overall practice due to its large reduction of N surplus and its neutral to positive effects on SOC and crop yield. This is logical when considering the synergistic effects of combined sources on crop yield (macro- and micronutrients from mineral and organic sources, respectively) and SOC (additional organic matter, which is approximately 50% organic carbon, is added from organic sources) (Hijbeek et al., 2017; Janssen, 2002; Pribyl, 2010).
Next to combined fertilizer, other practices show varying performance in terms of what measures are ranked from second to sixth. Crop rotation diversification and residue management are the next most promising measures to improve soil quality as well as crop yield and nutrient buffering in almost all farming systems, as they are often ranked in second and third place. This might be related to the somewhat neutral overall impacts of these measures on indicators. Finally, tillage practices are most consistently ranked fourth (reduced tillage) and fifth (no tillage). This can be explained by the negative impact of no tillage on yield, resulting in a higher overall performance of reduced tillage for all indicators (see Table 1 and Section 3.3). The use of organic amendments alone was most often ranked last (sixth), although it shows greater variability in performance across all sites compared to other measures. This is logical considering its negative impacts on yield (general decrease, although varied) and N surplus (increase). Organic amendments without mineral sources may thus give additional support for the improvement of these sustainability indicators at certain sites (see management impacts description in Table 1 and Section 3.3).
As with combined fertilizer, the high ranking of crop measures is related to the decreasing effects on N surplus but neutral effects on SOC and yield. The reverse is true for tillage practices and organic fertilizer. N surplus changes are relatively large in magnitude, and often the lower ranked practices are those which increase N surplus the most. Although using organic instead of mineral fertilizers has a positive effect on SOC and yield for example, the negative effect on N may outweigh this and make organic fertilizer lowest in the ranking.
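To make the ranking logic more tangible, the following Python sketch orders the six measures from 1 (most suitable) to 6 (least suitable) using a simple equal-weight score. This is not the procedure of Section 2.5: the scoring rule (yield change plus SOC change minus N surplus change) is an assumption made for illustration, and the effect values are invented placeholders rather than results from this study.

```python
# Hypothetical relative effects (%) of each measure on
# (crop yield, SOC, N surplus). Placeholder values, NOT study results.
effects = {
    "combined fertilizer": (1.0, 2.0, -8.0),
    "crop rotation": (0.5, 0.5, -4.0),
    "residue incorporation": (0.5, 1.0, -2.0),
    "reduced tillage": (0.0, 1.0, 1.0),
    "no tillage": (-2.0, 1.5, 1.0),
    "organic fertilizer": (-1.0, 3.0, 6.0),
}


def score(effect):
    """Equal-weight score: reward yield and SOC gains, penalize N surplus gains."""
    d_yield, d_soc, d_n_surplus = effect
    return d_yield + d_soc - d_n_surplus


def rank_measures(measure_effects):
    """Return {measure: rank}, where rank 1 is the highest-scoring measure."""
    ordered = sorted(measure_effects,
                     key=lambda name: score(measure_effects[name]),
                     reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


for name, rank in rank_measures(effects).items():
    print(rank, name)
```

With these placeholder values the N surplus term dominates the score because its changes are larger in magnitude than those for yield and SOC, which mirrors the observation that the lower ranked practices are often those that increase N surplus the most.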

| Sensitivity of parameters
The sensitivity of outcomes to the influence of N surplus is linked to the fact that the indicators crop yield, SOC and N surplus are different in nature, especially with respect to dynamics through time. Carbon accumulates over the longer term (e.g., crop rotation and residue inputs), whereas changes in yield and N surplus mostly happen within one season. Large effects of nutrient management can result in more variability within a short time frame, resulting in bigger fluctuations in impacts for crop yield and especially N surplus. Crop yield is also more inherently related to weather factors, meaning a smaller magnitude in impacts could be expected for yield, and relatively larger (more direct) effects on N surplus are therefore not surprising. We found indications of a decreasing influence of N surplus over time when comparing our 5-year interval to results over shorter and longer periods (not shown). This indicates more heterogeneity in management recommendations when SOC is allowed to accumulate over time, and shows that our DSF has the opportunity to integrate both short- and long-term dynamics affecting farm and soil management.
In our proof of principle we simplified the optimization procedure to maximize yield with minimal N losses, even if some targets or limits have already been met. As a consequence, measures still improve in ranking when they lower the N surplus far under the critical limits, indicating that N surplus as an indicator is overprioritized in comparison with the indicators crop yield and SOC. This raises the question of how to prioritize such different effects. As a future aspect of this DSF, we envision the implementation of a weighting process where a natural cut-off is included as soon as a threshold or limit is reached, as well as a process where the user may put emphasis on different goals (e.g., short versus long term, or preference for environmental quality versus profits). We furthermore envision weighting functions that give more priority to "problem indicators" (i.e., where there is still potential to reach a target or limit) and less priority to "safe indicators" (i.e., where targets or limits have already been met).
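One way such a cut-off could work is sketched below in Python. The idea is to credit a measure only for the part of its effect that closes the remaining gap to a target or limit, so that improvements beyond the threshold earn nothing and "safe indicators" contribute zero. The function names, the normalization by the target and the example values are assumptions for illustration; this is not an implemented part of the DSF.

```python
def gap(value: float, target: float, higher_is_better: bool) -> float:
    """Remaining distance to the target or limit; zero once it has been met."""
    shortfall = target - value if higher_is_better else value - target
    return max(shortfall, 0.0)


def credited_benefit(current: float, change: float, target: float,
                     higher_is_better: bool) -> float:
    """Reduction of the gap caused by a measure, relative to the target.

    Improvements beyond the target or limit are not credited (the cut-off).
    A change that opens or widens a gap yields a negative value.
    """
    before = gap(current, target, higher_is_better)
    after = gap(current + change, target, higher_is_better)
    return (before - after) / target


# Hypothetical "safe indicator": N surplus of 60 is already below a limit of 80,
# so lowering it further by 20 kg N/ha earns no credit.
safe = credited_benefit(60.0, -20.0, 80.0, higher_is_better=False)

# Hypothetical "problem indicator": N surplus of 100 exceeds the limit of 80;
# a reduction of 30 is credited only up to the limit (20 of the 30).
problem = credited_benefit(100.0, -30.0, 80.0, higher_is_better=False)

print(safe, problem)
```

Summing such credited benefits over crop yield, SOC and N surplus, optionally with user-defined weights, would keep N surplus from dominating the ranking at sites where the critical limit has already been met.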

| Integrated assessment of synergies and trade-offs
Considering current DSTs for agriculture, we find that there is a need for an integrated assessment of synergies and trade-offs of multiple MPs on various indicators related to crop yield, soil quality and environmental quality. Such an analysis is relevant to choosing practices that stimulate sustainable agricultural intensification. Instead of using process-based models to evaluate management approaches, which are hard to model and even more challenging to validate, we used published meta-analysis data from long-term field experiments to evaluate the effect of nutrient, crop and soil management practices on multiple agricultural and soil-related indicators.
A decision support framework such as that proposed here covers agroecosystems across Europe and has a strong empirical foundation. We demonstrate that our relatively simple DSF provides insights into recommended BMPs given the current and desired status of crop yield, SOC and N surplus. This first proof of principle of our DSF shows that data-driven meta-regression models strongly contribute to tools that can be used for site-specific recommendations of BMPs by accounting for variable site properties. This will help provide regional guidelines for sustainable intensification. Due to the flexibility of our approach, there is great potential in extending it to a more comprehensive list of important sustainability indicators and typical BMPs. It can also easily be incorporated into existing tools, providing them with site-specific insights into targets as well as empirical-based synergies and trade-offs between agronomic and environmental indicators.

| Data-driven empirical approaches
In contrast to most DSTs, our framework strongly builds on data-driven empirical (statistical) models. Considering the quickly growing body of quantitative assessments on management-induced changes in impacts (Eagle et al., 2017; Gurevitch, Koricheva, Nakagawa, & Stewart, 2018; Philibert, Loyce, & Makowski, 2012), our framework integrates these meta-analytical models into a knowledge platform to aid sustainable agricultural intensification. We expect continual progress in the quantification of management-induced impacts as field research evolves. In fact, it is not uncommon for authors using meta-analysis to publish updates of their work over time as more field studies are published, or to be involved in ongoing meta-analysis projects (Cayuela et al., 2014, 2015; Haddaway et al., 2014, 2017; Lehtinen et al., 2014). We therefore consider meta-analytical models to be a flexible and adaptable aspect of our DSF, allowing the framework to be updated with the most recent research results with minimal restructuring of equations and inputs.

| Potential use of the DST framework
In terms of applications, farmers may use their specific location for an indication of what changes could be expected for a shift in management. Different goals may also be assessed, such as crop yield effects (traditional farmer objective) in comparison to environmental goals (which an increasing number of farmers find important). We envision a user interface that includes summaries of desired outputs that the user chooses; for example, a graph showing the effect of a specific measure on all indicators or on relative distances to target. Our highly data-driven approach based on long-term field experiments and site properties can easily be implemented in other DSTs, enhancing their applicability across multiple farms, soils and regions. Given the growing demand for scientific methods that quantify the environmental performance of farm management, these types of DSFs might also assist consultancy workers in making farm-specific recommendations. Finally, policymakers may visualize potential changes over regions of Europe in terms of management currently applied in comparison to what management should be applied (and where) in order to reach targets for arable farming. Potential problem areas or best practice areas can then be identified in terms of gaps between initial reference status and targets and limits to further provide recommendations for meeting these targets.
Our developed framework highlights the relevance and importance of agricultural measures and allows practitioners and scholars to increase insights into the best management to improve soil and ecosystem functioning, to identify effective measures over multiple environmental objectives, and to quantify the overall environmental performance of agricultural ecosystems. The novelty of our research lies in its simple and reproducible approach, which integrates recent meta-analytical outcomes to quantify synergies and trade-offs among sustainability indicators that have traditionally been analyzed separately. Because soil is intertwined with crop production, climate change and environmental buffering, it is crucial to identify these synergies in order to reconcile crop production, soil quality and environmental impacts. Applied at the European level, this type of analysis will provide valuable comparisons for sustainable management in various arable farm types.