Decoupling of impact factors reveals the response of German winter wheat yields to climatic changes

Yield development of agricultural crops over time is not merely the result of genetic and agronomic factors, but also the outcome of a complex interaction between climatic and site‐specific soil conditions. However, the influence of past climatic changes on yield trends remains unclear, particularly under consideration of different soil conditions. In this study, we determine the effects of single agrometeorological factors on the evolution of German winter wheat yields between 1958 and 2015 from 298 published nitrogen (N)‐fertilization experiments. For this purpose, we separate climatic from genetic and agronomic yield effects using linear mixed effect models and estimate the climatic influence based on a coefficient of determination for these models. We found earlier occurrence of wheat growth stages, and shortened development phases except for the phase of stem elongation. Agrometeorological factors are defined as climate covariates related to the growth of winter wheat. Our results indicate a general and strong effect of agroclimatic changes on yield development, in particular due to increasing mean temperatures and heat stress events during the grain‐filling period. Except for heat stress days with more than 31°C, yields at sites with higher yield potential were less prone to adverse weather effects than at sites with lower yield potential. Our data furthermore reveal that a potential yield levelling, as found for many West‐European countries, predominantly occurred at sites with relatively low yield potential and about one decade earlier (mid‐1980s) compared to averaged yield data for the whole of Germany. Interestingly, effects related to high precipitation events were less relevant than temperature‐related effects and became relevant particularly during the vegetative growth phase. Overall, this study emphasizes the sensitivity of yield productivity to past climatic conditions, under consideration of regional differences, and underlines the necessity of finding adaptation strategies for food production under ongoing and expected climate change.


| INTRODUC TI ON
Historical climate change, and first and foremost rising temperatures during the second half of the 20th century, contributed to profound changes in yield in many major wheat-producing regions globally and in Western Europe in particular (Alexander et al., 2006;Asseng et al., 2014;Frich et al., 2002;Lobell, Schlenker, & Costa-Roberts, 2011). For instance, an estimated net loss of 4% in wheat yield, coinciding with increasing temperature and decreasing precipitation between 1980 and 2008, has been found for France (Lobell et al., 2011). Wheat has been found to be very sensitive to high temperatures, and its response to heat stress varies at different phenological stages (Farooq, Bramley, Palta, & Siddique, 2011;Slafer & Rawson, 1994). High temperatures are presumed to be more harmful to grain yield during the reproductive growth phase than during the vegetative phase (VP; Wollenweber, Porter, & Schellberg, 2003). Heat stress around anthesis mainly leads to a reduction in photosynthesis rate, increased respiration, accelerated leaf senescence and enhanced evapotranspiration, which finally results in reduced grain numbers (Porter & Gawith, 1999;Reyer et al., 2013;Wheeler et al., 1996;Wollenweber et al., 2003). Exposed to drought stress during reproduction, grain yields of wheat are negatively affected due to a hampered uptake of nutrients, and in combination with a diminished surface cooling (induced by reduced transpiration) crop canopy temperatures increase and lead to further decrease in photosynthetic rates (Mäkinen et al., 2018;Porter & Semenov, 2005). During stem elongation, water is needed for expansive growth processes that bring up the spike to the top of the canopy through the unfolding leaf, as well as for spike growth and cell expansion, pollen ripening or grain growth and filling (Farooq et al., 2011). With enhanced climate variability during summer across Europe, encompassing a higher risk of heatwaves, droughts and heavy precipitation events, negative impacts on yields are likely to increase (Mäkinen et al., 2018;Meehl & Tebaldi, 2004;Porter & Semenov, 2005).
Besides environmental factors, quality and quantity of grain yields are determined by the crop genetic yield potential and by agronomic measures of crop management that aim to reduce environmental limitations. Despite the ongoing advancement in breeding for higher grain yields since the 1960s, the use of nitrogen fertilizers, irrigation or pesticides, the steady increase of winter wheat (Triticum aestivum L.) yields during the second half of the 20th century has slowed down since the 1990s in several regions of the globe (Brisson et al., 2010;Calderini & Slafer, 1998;Chen, Zhou, & Pang, 2015;Grassini, Eskridge, & Cassman, 2013;. For instance, historical yield records reveal that winter wheat yields almost simultaneously reached a plateau at about 7-8.5 t/ha in many West European high-yield countries between 1991 and 2000 (Brisson et al., 2010;Grassini et al., 2013).
For Germany, yield data from the National Statistical Office show an increase of winter wheat yields from approximately 3 t/ha in 1960 up to 7.5 t/ha in the year 1999 (Wiesmeier, Hübner, & Kögel-Knabner, 2015), but thereafter, no further increase has been documented. A number of causes for the stagnation of grain yield have been discussed. Besides aspects of climatic change (e.g. increasing temperatures, high precipitation events or drought stress) or the genetic and agronomic progress (expansion of wheat to sites with lower productivity, increasing shares of 'second wheat' in crop rotation), socio-economic incentives and/or constraints (e.g. world market price for wheat grain or general production factors; expansion of organic production systems; legal limitations to fertilization; political subsidies, price influences from climate events) were in the focus of research (Brisson et al., 2010;Grassini et al., 2013;Himanen, Hakala, & Kahiluoto, 2013;Laidig et al., 2017;Olesen et al., 2012;Reidsma, Oude Lansink, & Ewert, 2008;Trnka et al., 2019).
Many studies only account for the effect of one single impact factor and only take yield data from official statistics that represent annual averages at national or global scale, which result from diverse crop management of a multitude of farmers acting under diverse production conditions (Brisson et al., 2010;Calderini & Slafer, 1998;Hafner, 2003;Lobell & Field, 2007;Wiesmeier et al., 2015).
Similarly, studies building on a single experiment or using historical yield and fertilization data from only one location are limited in their explanatory power due to the restricted number of years, locations and used cultivars. Hence, a set-up combining multiple locations over a long period is preferable to compensate for these limitations.
In this study, we aim at examining the importance of a range of individual agroclimatic factors affecting winter wheat yields. However, since also other factors affecting yield development changed over time, it is necessary to disentangle the effects of climate change from those of genetic progress, and of all other agronomic factors that changed over time (e.g. socio-economic incentives and constraints). In order to dissect these factors, mixed-effect models were used, which allow attributing yield variability to randomly distributed independent effects. These models serve to analyse data setsincluding unbalanced ones-and to dissect various impact factors by means of fixed regression terms and random residuals (Laidig, Drobek, & Meyer, 2008;Laidig et al., 2017;Mackay et al., 2011;. Moreover, statistical models do not depend on field calibration data for many driving variables as required for process-oriented models, and model uncertainties can be assessed in a more transparent way (Lobell & Burke, 2010b).
Mixed-effect models can further take into account the influence of regional diversity, for instance, regarding soil type and quality. In this study, we make use of historical field trial records found in 34 publications that provide data on winter wheat yields, N-fertilization levels and cultivars from a total of 298 N-fertilization experiments over a long period of time and with a large geographic spread across Germany. The amount of nitrogen fertilizer applied is the agronomic factor, which usually has the strongest impact on yield and quality traits and features large regional and inter-annual variation (Basso, | 3603 Ritchie, Cammarano, & Sartori, 2011;Delogu et al., 1998;Whitfield & Smith, 1992). In order to exclude limiting effects of the factor 'N-fertilization', using data from optimum N-fertilization levels has been proven a useful approach (Basso et al., 2011;Raun et al., 2002).
Specifically, experiments with multiple N-levels allow calculating the maximum of the nitrogen supply-yield relationship under standardized and predefined conditions. We analyse the yield development of winter wheat over the last 60 years in Germany based on published data from wheat-N fertilization experiments. We identify the underlying causes of the observed trends by considering agronomic, genetic, inter-annual and geographic differences. Based on responses described in the literature, this study considers climatic factors linked to important winter wheat crop growth stages and phases. To estimate the explanatory value of an individual agroclimatic factor, we use a novel approach (Piepho, 2019) that compares the total variance estimates between two mixed-effect models-one that includes the climate variable of interest and one that does not, and employs the calculation of a coefficient of determination for such models (Section 2). First, we run the analysis over all experiment data across all study sites. In order to understand the impact and specify the significance of soil factors we then split the data into groups of experimental sites differing in yield potential and soil type. Understanding the interactive effects of climatic changes and genetic adaptation (i.e. genetic improvement through plant breeding) on yield productivity development is crucial for developing viable crop adaptation strategies to future climate change.

| Study design
In order to assess the impact of climatic changes on winter wheat yield development, we applied several processing steps from raw data to final data analysis. As illustrated in Figure 1, we first gathered specific crop data including yields, N-fertilization amounts, cultivar choice and year of release (YOR). Furthermore, for every experimental site we gathered winter wheat phenology data (i.e. beginning of phenological stages), climatic data and site-specific soil information including soil type and soil rating (Ackerzahl) henceforth called 'soil yield potential'. In the second stage, we processed the experimental data to derive the optimal N-fertilization amounts and maximum yields. We used the climatic and phenological information to derive crop-specific agroclimatic conditions throughout the observation period. Third, for all time-series data, trend analyses were performed using simple linear regression models followed by segmented regression analysis. Additionally, linear plateau analyses were carried out for the yield and nitrogen data. Fourth, we used a data set with all data combined to decouple individual agroclimatic conditions from F I G U R E 1 Data processing scheme from raw data to final data analysis. Crop, phenological, climatic, agroclimatic and site-specific data were collected in the first place and analysed for their trends. All data were combined into a data set from which individual agroclimatic variables were decoupled from all other influential factors using statistical mixed-effect models. Finally, the intensity of each variable was estimated by computing the coefficient of determination (R 2 ) for generalized linear mixed models as introduced by Piepho (2019) all other influential crop parameters to finally assess the intensity of these individual factors by computing the coefficient of determination (R 2 ) for generalized linear mixed models as introduced by Piepho (2019).

| Crop data
Winter wheat field experiment data were gathered from multiple sources including peer-reviewed articles, dissertation theses, habilitation theses, conference papers and N-fertilization experiment reports from state authorities. Germany was selected as study region, as it represents a specific breeding region and ensures a sufficient number of well-documented experiments. All considered experiments investigated the response of grain yield of one or several cultivars to varied levels of N-fertilization (suboptimal, optimal, supraoptimal). In rainfed experiments with optimal plant protection, grain yields are supposed to be defined by the site-specific soil and climate conditions and limited by the genetic yield potential of the selected cultivar and the applied N rates. To be included in the study data set, the following criteria had to be fulfilled by an experiment: at least three levels of total rates of mineral N-fertilization (sum of all applications within one growing season [GS]) are represented, plant protection excluded biotic stress and the experiment was not irrigated. Moreover, site-and year-specific grain yields at defined dry matter content (either 86% or 100% dry matter) had to be available for each N-fertilization level. To make data comparable, all analyses were carried out by converting data to 100% dry matter, expressed in tons dry matter per hectare (t DM/ha). The exact location of each experimental site had to be given. A final data set was derived from 34 publications comprising 43 individual experimental sites between 1958 and 2015 and a total of 59 individual cultivars (Bönecke et al., 2020). The duration of the individual experiments ranged from 1 to 6 years. Information about the YOR of the corresponding cultivars was retrieved from databases of the Federal Plant Variety Office and from GRIS (Genetic Resources Information System for Wheat and Triticale, CIMMYT).

Calculation of Y max and N max
N-fertilization experiments provide the opportunity to estimate the yield maxima (Y max ) for each experiment, which then represent the environmental and agronomic limits in an experimental year and of a region. Y max can then be used to compare the yield variation between different locations and years. To derive potential grain yields under non-N-limited conditions for each experimental year, site and cultivar, data on dry matter grain yield at individual N-fertilization levels were used to calculate Y max and the corresponding maximum N-fertilization level (N max ). Most trials implemented suboptimal, close to optimal and supraoptimal N-levels in equal quantity steps (e.g. 50, 100, 150 kg N/ha). We fitted a quadratic yield response function.
to the data of each individual N-fertilization response trial, where N is the applied N-level and Y is the observed yield (Equation 1; Figure 2), using the statistical software R (R Development Core Team, 2008).
Values of Y max and N max were obtained from the coefficients of these functions by setting their first derivatives to 0 and solving for N: where a, b, and c are the model parameters.
When fitting quadratic functions to the trial data, several scenarios need to be considered for interpreting the derived yield as Y max under the given site and climatic conditions: the derived Y max value may lie (a) beyond the highest observed N-level, and (b) it may be higher or lower than a measured Y max value. To deal with these issues,

F I G U R E 2
Examples for the yield response curves to nitrogen fertilizer level and derived maximum yields and corresponding nitrogen values of three fertilization experiments. The green crosses mark the maximum achievable winter wheat yields as derived from a quadratic linear function and its corresponding nitrogen level | 3605 certain thresholds were set: When the derived Y max was not reached within the observed range of N-levels, an upper threshold was set for acceptance of a study. Specifically, we determined the mean width of the N-level increments. The threshold was computed as the largest N-level tested in the study, plus the mean increment. A study was accepted only if Y max was estimated to occur at an N-level below that threshold. For instance, in an 80-120-160 kg N/ha trial, the mean increment of the N-fertilization levels is 40 kg N/ha and when Y max was estimated at 210 kg N/ha, the trial was then excluded from the data set, whereas when Y max was derived below or equal to 200 kg N/ha, the trial was included. Moreover, Y max values below 50 kg N/ha were excluded from further analysis and considered as unrealistic under conditions present in Germany. When the derived Y max value was below the highest yield measured by more than 5%, the measured yield was taken as Y max . Yet, when the derived Y max was greater than 5% of the highest measured yield, the derived Y max was chosen to reflect the maximum yield achievable. In any case, experiments outside these defined thresholds were excluded from the data set.
About 11.2% of the data were removed from further analysis due to decreasing yield functions and coefficients of determination below 0.5. Following these plausibility tests, 324 out of 331 individual trials remained within the data set. The correlations between Y max and the corresponding N max , the intercept and the linear coefficient of the quadratic regression model were 0.51, 0.62 and 0.33 respectively.
Correlations between N max, the intercept, and the slope factors of the quadratic regression model ranged between −0.1 and 0.52.

National yield data
The national average winter wheat yield data were obtained from

| Phenological data
None of the publications from which yield data were retrieved provided precise information about phenological stages and phases. Therefore, data on the beginning of the individual phe- To eliminate errors and incorrectly recorded single values, data records were processed by an automated selection process. As suggested in Menzel (2003), only observation sites with relatively complete data records of more than 20 or 30 years should be considered as meaningful for reliable predictions because trend analysis strongly depends on the number of years included in the linear regression. In this study, 30 years was set as minimum record length to ensure a certain degree of temporal stability of the resulting trends. Even then, a small uncertainty remained because part of the variation in trends might be caused by differing start and end years. With respect to topography and altitude (m a.s.l.), all stations with more than 50 m difference in altitude from the experimental sites were removed to avoid misinterpretation due to vertical thermal differences and their influence on stage initiation. Mean values and standard deviation for each year of the dates were calculated and potential outliers removed when they fell outside the range of the mean value ± two times the standard deviation as suggested in Siebert and Ewert (2012). After applying the filtering process, the total number of observations obtained for the 43 sites across the study regions was 15,238. Moreover, the duration of the whole GS, the generative phase (GP) and the VP was calculated, and in addition VP was subdivided into the leaf development and tillering phase (LP) and the stem elongation and booting phase (SP). As shown in Figure 3, the length of the VP was defined as the time between emergence and heading and the GP as the time between heading and hard dough. The LP and SP were separated at the beginning of the stem elongation.

| Climatic data
As for phenology, high-resolution weather data that allow calculating agrometeorological variables was scarcely provided within the publications. For this reason and to investigate long-term climate trends for the crop growth phases outlined above at each respective experimental site, we obtained information from several databases providing climate data. The daily mean temperature (°C), precipitation (mm) and relative air humidity (%) were obtained from the hydrological raster (HYRAS) data set (5 × 5 km 2 ) of the DWD for the period 1951-2015 (Frick et al., 2014). The interpolation of the gridded HYRAS data set is based on a combination of multiple linear regression and inverse distance weights and described in detail in Rauthe, Steiner, Riediger, Mazurkiewicz, and Gratzki (2013

| Agrometeorological variables
Obtained climate variables were evaluated for the long-term changes in accordance with the duration of the phenological phases of winter wheat. This was done for individual sites and the overall mean across all study sites. In addition, the long-term trend was estimated for the normal calendar year as well as for the periods of the defined hydrological winter (1 November to 30 April) and summer (1 May to 31 October). This was intended to provide information on the weather conditions independent of crop phenology.

Crop and drought parameters
To assess crop-related and site-specific developments during the observation period, additional climatic and hydrological variables were calculated from the available meteorological data sets. First of all, the thermal duration (TD, °Cd) for each phenological phase was calculated after McMaster and Wilhelm (1997; formula described in the Supporting Information). The potential evapotranspiration (PET, mm) for winter wheat crops was calculated after Haude (1954). The formula and the data processing are described in the Supporting Information. Moreover, the climatic water balance (CWB, mm) as a drought indicator differs largely within Germany and was calculated to estimate the available water supply. It is defined as the difference between the precipitation height and the amount of the PET.

Adverse weather conditions
Moreover, we accounted the occurrence of adverse weather events in order to evaluate their impact on the yield potential. Due to limitations in the climate projection as daily values based on the available DWD data, we counted the frequency of those days with high precipitation events, those days with potential heat stress impact, and those days with negative CWB and calculated their cumulative numbers for each crop developmental phase and each individual site. We defined high precipitation events in two ways in order to account for all days that may have caused water logging and lodging even in short phases such as SP or GP in the first place and in particular for those precipitation events that cause severe logging and lodging effects in particular during summer months. Thus, we considered those days with a minimum rainfall of 20 mm in 24 hr as reported and analysed in Gömann et al. (2015) and the number of days, which had more than 40 mm of daily rainfall as high rainfall events (Mäkinen et al., 2018;Trnka et al., 2014). We used 27°C as upper temperature threshold with considerable impact on yield losses due to sterilization of grains of wheat around anthesis (Mitchell, Mitchell, Driscoll, Franklin, & Lawlor, 1993;Tashiro & Wardlaw, 1989), and 31°C as threshold for large detrimental effects around anthesis (Porter & Gawith, 1999;Wheeler et al., 1996).

Climate conditions
In order to obtain additional site-specific information of each trial site, the long-term mean annual temperature, precipitation and CWB were calculated using the HYRAS data set based on the 30 year  reference period ( Figure S7). Mean annual temperatures ranged from 7.8 to 10.6°C at the sites of this study. The average was at 8.9°C and the median at 8.8°C. Mean annual precipitation at the sites varied between 580 and 1,110 mm. The average was 774 mm and the median 744 mm. As for the mean annual CWB, the minimum was 198 mm, the maximum 822 mm, the average 465 mm and the median was at 431 mm.
Each site can be determined by the mean number of stress events during a specific growth phase. For example, based on the number of heat stress days during the GP-detected over the entire observation period-the number of these stress events for each site varied between 1.6 and 3.6 days and was 2.4 days in average.

Soil type
Soil types differ in water and nutrient availability and thus have an impact on the yield potential. Therefore, the soil type was retrieved from the publications at the level of the soil type group according to the German soil classification system-KA5 (Ad-hoc- AG Boden, 2005 for Geosciences and Natural Resources (BGR) in Germany (Düwel, Siebner, Utermann, & Krone, 2007). Four soil type groups where identified for the experimental sites in this study: 8 loamy sands, 12 sandy loams, 13 loams and 10 clayey silts. Locations with loamy soils (loam and sandy loam) are dominant within the data set over the experimental years from 1958 until 2015, whereas silty soils (clayey silt) and sandy soils (loamy sand) were predominant in the early and mid-1980s ( Figure 4a).

Soil yield potential
The suitability of a site under agricultural land use and its estimated yield potential is an indicator for the annual productivity of grain yields. Thus, we retrieved soil yield potential for all sites in this study based on the Muencheberg Soil Quality Rating (MSQR) developed by the Leibniz Centre for Agricultural Landscape Research (ZALF) at a global scale (Mueller et al., 2010). In brief, this approach evaluates a set of soil describing properties, such as the substrate, rooting depth, topsoil structure or soil compaction in combination with potential yield effecting hazardous factors that are critical for farming and limit the overall soil quality, such as drought risk, soil depth above solid rock, flooding or extreme waterlogging regimes. The final scores range from about 0 to 102 points and are displayed on a map that shows the MSQR for cropland in Germany based on the land use stratified soil map of Germany at scale 1:1,000,000. The MSQR for the 43 sites in this study ranged from 31 to 99 points, with a median at 69 and a mean of 67.9 points. In order to divide the research area in relatively poor and relatively good soils, the data set was split at a soil yield potential of 70 points.

| General trend analysis-Linear and segmented
In order to analyse the development of all obtained phenological, climatic and agroclimatic time series, data were fitted with ordinary linear regression models (Equation 4) in the first place and depicted using the statistical software R (R Development Core Team, 2008). Linear trends and point data were then analysed visually and when trend changes were obvious within the point data, the data were analysed again by segmented piecewise linear regressions (Equation 5) to obtain more detailed information about these potential trend changes (Piepho & Ogutu, 2003;Schabenberger & Pierce, 2001).
where the parameters of the model are a 1 , a 2 , b 1 , b 2 and x 0 .
There is an implicit constraint that the regression lines must intersect at x = x 0 . This can be done by removing one parameter and explicitly introduce x 0 as a parameter: which can be transformed into: and allows removing a 2 from the list of parameters to be estimated, leaving the parameters a 1 , b 1 , b 2 and x 0 .

| Segmented plateau analysis for maximum yield and nitrogen development
To test whether the yield and nitrogen development result in any kind of levelling, data were plotted against year using linear regressions with an upper plateau (Equation 7)-graphically depicted as a rising line or curve followed by a plateau. The 'linear plateau' model corresponds to a special case of the segmentation approach with b 2 = 0.
The segmentation analysis is based on an iterative approach where a breakpoint value is estimated based on nonlinear least squares (Muggeo, 2003(Muggeo, , 2008. The initial parameters were derived from values of a pre-fitted ordinary linear model. The advantage of this type of fit is that it can estimate the year of change or transition to plateau. The significances of these trends were calculated using the t test. Only those phenological and climatic changes with significant trends were chosen for the final discussion.

| General decoupling-Dissecting genetic from nongenetic sources
Here we explicitly point out that the data set used for this analysis differs from the data set used to investigate phenology and climate trends ( Figure 1). A subset of years with experimental data, which include cultivar information, was established. Next, all data on the beginning of the phenological stages, the duration of phenological phases and the agroclimatic variables were attached for those experimental years where it was possible to obtain this information.
However, this data set faces certain limitations. First, the true phenology of the cultivars used in the fertilization trials may differ from the phenological information obtained from the DWD. Second, agroclimatic information was only available for those years where weather and phenological data were available and adverse weather events occurred. Moreover, effects of individual agroclimatic conditions are considered to occur only when they vary in their variability and error. It also needs to be considered that cultivars largely vary in occurrence, duration over time, and in location. Also, the applied N-levels vary between experiments, over time and in location.
To overcome the uncertainty of such unbalanced data sets, we used well established statistical mixed-effect models, which take the large number of environmental and nonenvironmental covariates into account. These models include a predetermined number of independent factors treated as random effects. Moreover, to disentangle the main effects that influence the evolution of winter wheat yields and to quantify the impact of an individual agroclimatic factor, a varying number of fixed effects can be included in these models.
Grain yield is a function of genetic and nongenetic conditions and thus, a standard three-way model after Laidig et al. (2008) was established: where y ijk represents the mean yield of the ith genotype in the jth location and the kth year, µ is the overall mean, G i is the main effect of the ith genotype, L j is the main effect of the jth location, Y k is the main effect of the kth year, LY jk is the jkth location × year interaction effect, GL ij is the ijth genotype × location interaction effect, GY ik is the ikth genotype × year interaction effect, GLY ′ ijk is the residual of the ijkth genotype × location × year interaction effect and error of the mean. As in Piepho et al. (2014), we assume that all effects except µ, G i and Y k are random and independent with constant variance, following a normal distribution. Thus, we integrated genetic and nongenetic time trends as fixed regression components into the model. G i was then estimated as the following regression term based on the YOR: where β is the fixed regression coefficient for the genetic trend, r i is the first year of testing (YOR) for the ith cultivar, and H i is the random deviation of G i from the genetic trend line. If there was a linear nongenetic time trend, we modelled Y k as: where γ is the fixed regression coefficient for the nongenetic trend, t k is the continuous covariate for the experimental year and Z k is a random residual. Both, β and γ quantify the genetic and nongenetic trends per year in the same units as y ijk .
In this study, we made use of data from crops grown under optimum N-fertilization levels and full crop protection. Hence, changes in agronomic practices can be considered to play a minor role and the time effect predominantly represents the effect of climatic changes. To evaluate whether the overall climate change has to be accounted for as linear or nonlinear regression terms in the mixed model, a pre-analysis was carried out, modelling the time effect first with a linear relationship and second with a quadratic relationship.
The latter should account for the potential yield levelling. While the linear relationship was significant (p < .001) in the pretest, the quadratic was not (p = .69). Hence, the time trend was included as a linear function and a potential yield levelling cannot be traced back to climate change only.

| Specific decoupling-Dissecting individual agroclimatic trends
We further investigated whether an agronomic variable has a specific impact on winter wheat yield development. While we assume that γ in Equation (10) represents all climatic changes over time combined, we assume that Z k in Equation (10) neither represents the interyear variability of a single agroclimatic variable appropriately nor does it account for the effect of a climatic factor at a specific location. Thus, in order to simultaneously model interyear and interlocation variation due to climatic or agronomic variables, we modified the model by regressing LY jk on these variables: where α is the fixed regression coefficient for the respective climatic or agronomic covariate, s jk is the specific value of the covariate for the kth year and the jth location and C jk is a random residual location × year interaction.
Crop growth and yields not merely depend on weather effects during the GS, but also on the site and soil conditions, which, hence, should be considered more explicitly. To account for such disparities among the experimental sites, the variance estimates were additionally adjusted for their location attributes (L j ) yield potential and soil type, and each tested within separate models as: where L j is the main effect of the jth location, δ is the fixed regression coefficient for the respective spatial attribute or site condition (e.g. yield potential), u j is the specific value of the attribute for the jth location and S j is the random deviation from the trend.

| Estimating the intensity of individual agroclimatic variables on yield variation
For each independent variable assessed, the total variance, defined as the sum of variance components of all random effects, was estimated twice: once without and once with the agroclimatic variable included. The model without the factor is henceforth called M −x and the model with the specific factor is called M +x , where M is the model, x describes the specific factor and '−' and '+' refer to the absence or presence of the factor respectively. In both models, trend components in G i , Y k and L j were modelled using regression Equations (9), (10), and (12) respectively. In M +x , however, the spatio-temporal covariate s jk was modelled additionally as per Equation (11) In order to assess the impact of climatic or agronomic factors on yield development under different site-specific conditions, this procedure was repeated for subsets of two groups of soil type and two groups of soil yield potential. For this purpose, soils of sandy loam and loamy sand were combined and addressed as sandy soils, and loams and clayey silts were addressed as loamy soils. The threshold for the two groups of different yield potential was the median yield potential (70) across the study sites. For further analysis, the agroclimatic variables were also tested for interaction with the time effect.

| Adjusting trends of agroclimatic covariates
To compare the slope for an agroclimatic variable (α) with the overall time trend (γ) and the genetic time trend (β), α was multiplied by the covariate's slope in an ordinary regression on time over the entire observation period (b, from Equation 4) to yield an adjustment climate trend:

| Yield development as a result of soil type and site-specific yield potential
Over all the trials analysed in this study, between 1958 and 1997 the annual yield of winter wheat increased on average by 0.12 t DM ha −1 year −1 and reached a plateau at 8.35 t DM/ha (Figure 4a; equation coefficients in Table 1). This corresponds to average wheat yields in Germany recorded by the Federal Statistical Office which also show increases beginning in the late 1950s until the end of the (10) 1990s. From around 1997 onwards, no further yield increase was observed in our trial data, which also corresponds with the official data and with other studies which point out a yield levelling beginning in the late 1990s (Brisson et al., 2010;Grassini et al., 2013). In parallel, the optimal nitrogen fertilizer dosage increased by about At sites with relatively low yield potential (<70 quality points, Section 2), yield levelling occurred about a decade earlier compared to the average of all sites (Figure 4c), while for sites with relatively high yield potential no stagnation was revealed. Still, our data indicate that yield levelling occurred at sites with light soils (sandy loams and loamy sands) as well as at sites with heavy soils (clayey silts and loams; Figure S1). The optimal nitrogen dosage showed a levelling for all soils and both yield potential groups ( Figure S2a-d).

| Earlier occurrence of wheat growth stages and shortened development phases except for stem elongation
Besides agronomic and soil factors, weather conditions during the whole GS and the occurrence of severe weather events during TA B L E 1 Regression coefficients of winter wheat yield (t DM/ha) and N-fertilization (kg N/ha) trends estimated by segmented regression line analysis and simple linear models (Equations 4-7). Heavy soils refer to loams and clayey silts and light soils to sandy loams and loamy sands (Section 2). Low yield potential soils are soils with MSQR below 70 points and high yield potential soils are those with MSQR above 70 points (Section 2). SE describes the standard error of the fitted parameter  (Figure 5a; equation coefficients in Table 2).
Yet, there was a strong geographical heterogeneity, and sowing tended to shift to earlier dates by up to 6 days per decade at North German sites, whereas South German sites showed inconsistent patterns of sowing dates (±2 days per decade) during the same period ( Figure S3). A similar pattern as for the overall trend in sowing was found for the dates of emergence (r 2 = .88 between sowing and emergence dates) and of heading (

| Increasing temperatures during the vegetative growth period
Across the trial locations, the annual mean temperature increased on average by 0.024°C/year and was nearly 1.3°C higher in 2006 than in 1951 ( Figure S4; equation coefficients in Table 3). The same trend was found for the mean annual minimum and maximum temperatures. The temperature increase was accompanied by an increase of the mean annual potential evapotranspiration by 1.6 mm per year, summing up to 85 mm over the entire observation period of 55 years. However, mean annual precipitation and mean annual CWB did not change significantly, even though the latter showed a negative trend indicating dryer conditions.
Climate change became apparent not only by enhanced annual mean temperatures but also by higher mean temperatures during the VP of winter wheat (Figure 6a; equation coefficients in Table 3).

| Climatic variation explained yield variability the most
Differentiating the results of the main influential factors on yield development from the specific agroclimatic effects, the general

F I G U R E 7 Proportions of variance explained by the main influential factors (genotype, location and time) and their interactions in
German long-term yield data. Results indicate a relative strong variance of all factors that changes along the timeline (time effect) except those caused by altering genotypes, which were relatively low. Moderate variabilities were only found for the location effect and the location × time interaction. Other interaction effects were rather low with no effect for the genotype × time interaction

| Elevated temperatures, heat stress and drought-affected yield development
In order to quantify the influence of an individual agroclimatic factor in terms of explained variance on yield development, we modified our model by including a fixed regression term for that particular agroclimatic variable (Section 2). The estimated variances explain the importance of a tested variable on yield development and can be considered as a 'coefficient of determination' for mixed-effect models (Piepho, 2019; Section 2).
The increase of the mean temperature during the GP explained about 16% of the yield variance over all locations (Figure 8; Table 4).
Higher temperatures are presumed to have led to enhanced evapotranspiration rates during that phase, which also had an effect of about 16%. The number of days with maximum temperatures above 27 or 31°C during the GP explained about 25% and 17%, respectively, of the variance. While the effect of the mean global radiation on yield variation was relatively low during the entire GS (<2%), it was more pronounced during the phase of leaf development (6%) and stem elongation (4%). Heat stress and temperature effects were less pronounced when the influence of the soil yield potential or the soil type was removed from the random effects (Table S3).
The trend of the agroclimatic factors on yield development (Table 5)

F I G U R E 8
Effect of selected agroclimatic variables on winter wheat yield development. Each break represents an agroclimatic variable and the stretches of the coloured areas shows their influence in percentages, described as 'coefficient of determination for mixed effectmodels' (Section 2). Low yield potential sites refer to sites with quality points between 0 and less than 70, while high yield potential sites refer to quality points between 70 and 100. The classification is based on the Muencheberg Soil Quality Rating (MSQR; Mueller et al., 2010; Section 2). T mean is the average temperature (°C), T max is the maximum temperature (°C), HS27n is defined as the number of heat stress days above 27°C, HS31n is defined as the number of heat stress days above 31°C, GRmean is the average global radiation (W/m 2 ), Psum is the total precipitation amount (mm), HR20sum is the total amount of precipitation of days with minimum 20 mm precipitation (mm), ETPmean is the average winter wheat-specific potential evapotranspiration (mm), and CWBsum is the wheat-specific climatic water balance (mm). All climate variables are related to wheat-specific growth stages: the entire growing season (GS), the vegetative phase (VP), the generative phase (GP), the leaf developmental phase (LP) and the shooting phase (SP; Section 2). The mathematical symbol '+' describes a positive relationship to yield development and the '−' a negative relationship (Table 2) TA B L E 4 Estimated explained variance (%) of the agroclimatic variables (units in parentheses) on winter wheat yield development between 1958 and 2006 across the study sites in Germany. All explained variances are estimated after accounting for the genetic and nongenetic time trends. Low yield potential sites refer to sites with quality points between 0 and less than 70, while high yield potential sites refer to quality points between 70 and 100. (Section 2)

| Sites with relatively low yield potential were particularly affected by temperature and heat stress events
After analysing the sensitivity of individual agroclimatic factors for the total of the sites represented, we separately examined those sites with relatively low (<70 quality points) and relatively high (>70 quality points) soil yield potential (Section 2). In comparison to the average of all locations, temperature-related effects were on average about onesixth higher at sites with low yield potential (Figure 8 and Table 4).
There, the number of days with heat stress above 27°C during the GP explained nearly 30%, being the strongest effect of a single agroclimatic factor at the same time. Heat stress days above 31°C during the GP still explained up to 21%. The number of heat stress days above 31°C doubled within the whole observation period considered ( Figure   S6). Moreover, while the effect of the mean temperature during the GP increased to as much as 19%, the effect of maximum temperatures during the GP was even more pronounced and explained nearly 23% of the yield variation. As a consequence, the effect of the mean evapotranspiration during the GP also increased by more than one-third at

TA B L E 4 (Continued)
TA B L E 5 Coefficient estimates of the fixed effects (genotype, time and selected agroclimatic variables as found in Figure 4) in the mixedeffect models on the yield development over time. SE denotes the standard error, df the degree of freedom, n the number of observations and p the significance of the estimates of each model. Values in parentheses describe the trend of the appropriate selected agroclimatic variable adjusted by its trend over time (Section 2). Low yield potential sites refer to sites with quality points between 0 and less than 70, while high yield potential sites refer to quality points between 70 and 100 (Section 2)

| 3621
sites with light soils to 26% in comparison to the total of the sites considered. Similar effects were found for soils classified as light soils, which are shown in Table S6.
Except for the effect of days with maximum temperatures above 31°C during the GP, temperature and heat stress effects were less pronounced at sites with higher yield potential (Figure 8). Here the explained yield variance decreased to about 16% for the effect of days with maximum temperatures above 27°C during the GP. Yet, when stress events with maximum temperatures above 31°C occurred, the explained variance increased from 17% to 25%.
Regarding the exposure of wheat to heat stress events above 31°C during the GP along the overall time effect, sites with fewer heat stress events were less afflicted by high temperatures than sites with more stress events (Figure 9a). Winter wheat yields increased at sites of low yield potential and with an average of 1.6 heat stress days by 0.1 t DM ha −1 year −1 , while at sites with an average of 3.6 heat stress days yields increased by about 0.088 t DM ha −1 year −1 .
At sites with high yield potential and exposed to 1.9 heat stress days during the GP, yields increased by about 0.072 t DM ha −1 year −1 and by about 0.071 t DM ha −1 year −1 at sites with 3.0 heat stress days ( Figure 9b).

| Negative effects due to high rainfall events and water deficit
Interestingly, our data reveal that effects related to high precipitation events (defined as days with more than 20 mm precipitation) became relevant during the VP (Figure 8). Here the total amount of precipitation through high rainfall events explained about 2.8% of the variance in winter wheat yields. The effect was strongest during the SP (3.5%), which still is inferior to temperature-related effects. Nevertheless, waterlogging might become more likely due to high precipitation events, leading to oxygen deficiency or enhanced erosion processes, which causes nutrient losses accelerated by a reduced ground cover canopy in particular during the LP at sites with higher clay content (Malik, Colmer, Lambers, Setter, & Schortemeyer, 2002;Nearing et al., 2005). The effect of the overall precipitation amount during the SP was more than 1.5-fold stronger at sites of high yield potential (5.5%) as compared to all sites (3.3%). The magnitude of radiation effects increased at sites with higher yield potential, foremost during leaf development to up to 30%. The relatively large temperature and radiation effects are presumed to have resulted in an enhanced effect of the negative CWB in particular during stem elongation. This effect was not evident when considering the total of all sites or sites with low yield potential only. All climate effects are found in a similar way in soils classified as heavy (Table S7).

| D ISCUSS I ON
For interpreting the results of the statistical models used for timeseries analysis, several aspects need to be considered. First, these models tend to include collinearity effects between predictor variables (e.g. temperature and precipitation) and hence results do not stack up automatically to 100% (Lobell & Burke, 2010b). This means a single effect might be suppressed or intensified due to interdependencies or confounding effects with other variables. Moreover, they assume that past relationships will continue in future, even though, for example, management systems may have evolved or changed completely, and they may have low signal-to-noise ratios in yield or weather records in many locations (Lobell & Burke, 2010a).
In this meta-analysis, we found that many agroclimatic effects were negatively related to yield development except for the mean global radiation during SP. We assume that the prolongation of the SP was caused by breeding progress since the mean temperature during this phase did not change significantly and modelling revealed no effect. Moreover, the SP is known to be critical for yield development and has been proposed as a target trait to improve yield and environmental adaptation of wheat in numerous studies (González, Slafer, & Miralles, 2003;Kronenberg, Yu, Walter, & Hund, 2017;Miralles, Richards, & Slafer, 2000). The prolongation of the SP may, however, also have been due to advanced sowing dates. The advanced sowing may have been a consequence of changed climates prior to sowing dates (Johnen, Boettcher, & Kage, 2012). The probability of a higher number of leaf primordia initiated between sowing and the double ridge stage is greater with earlier sowing and may lead to a delay in the appearance of the leaves that emerge until stem elongation and of the final flag leaf until heading (Johnen et al., 2012). However, the duration of the SP did not change significantly beyond 1988-nearly one decade before winter wheat yields, in general, started to level off ( Figure 4a). Interestingly, the TD of the SP showed an ongoing increase by approximately 40°Cd between 1988 and 1997. This may indicate that in this period winter wheat continued to increase radiation uptake due to enhanced temperature and light use potential during stem elongation as well as due to increased nitrogen uptake from the continuous increase of fertilizer input (Loomis & Amthor, 1999).
The values presented in Figure 7 reflect the explained variance of the main effects (genotype, location, time) and their interactions. The results show that the variability of the time effects explained the variance of the winter wheat yields over time and across the study sites the most, followed by the variability of the location effect. However, a lower explained variance of the genotype effect does not automatically mean that this factor has a lower influence on yield development. The slopes of the specific covariates used in the fixed regression part of the mixed-effects model might, therefore, provide a more detailed assessment (Table 5). While most of the agroclimatic parameters had a negative impact on yield development, the effect of the genotype was always positive. However, the slopes of the agroclimatic covariates come in their own unit (e.g. °C) and in themselves have no time unit as the genetic and the overall time effect. Hence, an adjusted climate trend, which was obtained by multiplying the slope of the covariate with the time trend of this specific covariate, was used for comparison with the two temporal trends (Section 2). We found that the trend of the genetic progress was, for example, about 100 times larger than the trend of the mean or maximum temperature during the GP at all study sites (0.004 t ha −1 year −1 )as well as at sites with lower or greater yield potential. Moreover, the results indicate a high-temperature variability between the years and over the locations in this study.
This study uses the agronomic yield maxima for each experimental year and site as achieved by optimal N-fertilization dosage (estimated from fertilizer dose-yield response functions, Section 2) and optimal plant protection. In this way, the effect of variation of agronomic practices over time was minimized so that the overall time trend essentially reflects climatic effects (Figures 7 and 8).
In practice, however, optimal agronomic management is often not realized on the farmers' fields ( Figure 4a). In practical farming, socio-economic aspects have an important influence on the intensity of agronomic inputs, for example the fact that since the late 1990s subsidies from the European Union have been paid based on production area instead on yield, or that export subsidies were reduced (Himanen et al., 2013;Reidsma et al., 2008) . Nevertheless, we also assume that the optimal amount of N fertilizer is a result of decreasing prices of nitrogen, comparable to, for example, decreasing prices of the crops and, additionally.
The increased potential evapotranspiration during stem elongation between 1951 and 1992 is very likely a result of the SP prolongation rather than a climatic change response, as for mean temperature and precipitation no significant changes were detected, and mixedmodel analysis revealed no effects due to potential evapotranspiration. This prolongation is probably also the explanation of the increased number of days with negative CWB until 1997. Resulting water limitations, accompanied by an increased number of heat stress days during stem elongation ( Figure S6) may have enhanced irreversible plant damage associated with yield losses in particular on shallow soils and/or in dry regions before maturity was reached (Farooq et al., 2011;Semenov, 2009).
Our data show that yield stagnation in German winter wheat occurred since the late 1990s for all sites combined, and since the late 1980s on soils with relatively low yield potential. Climate variation-spatial and temporal-explained most of the variability of the winter wheat yields (>50%), whereas genetic variation over time explained only 4%. Our results emphasize that except for heat stress days with more than 31°C, sites with higher yield potential were less prone to adverse weather effects than sites with lower yield potential. In general, elevated temperatures, heat stress during the GP and drought stress during the SP affected wheat development the most. Regarding the sole effect of the agroclimatic variables at all experimental sites combined, the mean temperature during the GP explained about 16% of the yield variability. Days with maximum temperatures above 27 or 31°C during the GP explained about 25% and 17% respectively. With respect to the winter wheat yield development of the entire observation period , the mean temperature during the GP reduced yields by about 0.23 t/ha in total. At sites with higher yield potential, relatively large radiation effect during the leaf development (30%) and negative precipitation effects during stem elongation (5.5%) are presumed to have resulted in an enhanced effect of the negative CWB in particular during stem elongation. Hence, the response of yield productivity to past climatic conditions demonstrates the sensitivity of German wheat production to climatic variation and underlines the need of finding adaptation strategies for food production under expected ongoing climate change.
The analysis shows that German wheat production is continuously adjusting to climatic changes, both with regard to the genetic adjustment (i.e. respective cultivar/variety selection choice) as well as management adjustment, especially shift of sowing times. However, In the light of continuous climatic changes in the future, further cropping system adjustments might be required to support stable winter wheat production in Germany. As such, it might be necessary to employ additional measures such as irrigation, in particular at sites with light soils and high risk of drought induced yield losses. Furthermore, earlier sowing in combination with 'early' wheat genotypes might be suitable to escape drought stress.