Assessing lignocellulosic biomass production from crop residues in the European Union: Modelling, analysis of the current scenario and drivers of interannual variability

This study assesses crop residues in the EU from major crops, using empirical models to predict crop residues from yield statistics; furthermore, it analyses the inter-annual variability of those estimates over the period 1998-2015 and identifies its main drivers across Europe. The models were constructed from an exhaustive collection of experimental data from scientific papers for the following crops: wheat, barley, rye, oats, triticale, rice, maize, sorghum, rapeseed, sunflower, soybean, potato and sugar beet. We discuss the assumptions on the relationship between yield and the harvest index adopted by previous studies to interpret the experimental data, quantify the uncertainties of these models, and establish the premises to implement them at regional scale (i.e., NUTS level 3) within the EU. To this end, we created a consolidated sub-national statistical dataset along with an algorithm able to aggregate (figures are provided at country level) and disaggregate (production on a 25 km grid is provided as supplementary material) the estimates. The total lignocellulosic biomass production in the EU28 over the review period, according to our models, is 419 Mt, of which wheat is the major contributor (155 Mt). Our results show that maize and rapeseed are the two crops with the highest residue yield, 8.9 and 8.6 t/ha, respectively. The spatial analysis revealed that these three crops, which according to our results are a priori highly suitable feedstocks for second-generation biofuels in the EU, are unevenly distributed across Europe. Weather fluctuation was identified as the major driver of residue production from cereals, whereas for starch crops and oilseeds, which are predominant in northern Europe, the main driver was the marked production trend, likely influenced by agricultural policies and agro-management over the review period. Our results, among others, could help to understand and quantify the ecological boundaries of the bioeconomy from agriculture.


| INTRODUCTION
The European Union (EU) aims at decarbonizing its economy by 2050 with an 80%-95% reduction in greenhouse gas (GHG) emissions compared to 1990 (European Council, 2011). This ambitious goal has been mainly driven by setting several targets and introducing multisectorial EU policy packages (Scarlat, Dallemand, Monforti-Ferrario, & Nita, 2015). These are currently being updated for a shorter horizon to ensure that they fit the economic, environmental and social challenges that our society faces (EEA, 2015; European Commission, 2016; OECD, 2014). Within this context, bioenergy is expected to play a central role (Cudlínová, Lapka, & Vávra, 2017; European Commission, 2011). Specifically, bioenergy is expected to contribute, among others, to climate-change mitigation by providing energy services that displace fossil fuels while generating fewer GHG emissions (Koponen, Soimakallio, Kline, Cowie, & Brandão, 2018; Schlamadinger et al., 1997).
In this regard, the Intergovernmental Panel on Climate Change (IPCC) agrees with the critical role that bioenergy can play for mitigation, but also remarks that there are issues to consider, such as the sustainability of practices and the efficiency of bioenergy systems (IPCC, 2014). Indeed, biofuels do not necessarily have a lower environmental impact than conventional fossil fuels (McCoy, 2017;Mueller & Kwik, 2013;Posen, Jaramillo, & Griffin, 2016;Weiss et al., 2012), e.g. when indirect land-use change (iLUC, Finkbeiner, 2014;Wicke, Verweij, van Meijl, van Vuuren, & Faaij, 2012) is taken into account, GHG emissions from biofuels may actually be higher than those from fossil sources (Searchinger et al., 2008). The concurrent uses of the biomass feedstock for bioenergy with food and feed production along with the mentioned environmental impacts of biofuels production have led policy frameworks to emphasize techno-scientific innovation for producing energy using lignocellulosic biomass through second-generation or advanced biofuels (Boucher, 2012;Hansen, 2014;Levidow & Papaioannou, 2016), and combined heat and power from nonfood biomass (Creutzig et al., 2015;Martinez-Hernandez et al., 2013).
Crop residues are the main feedstock of lignocellulosic biomass from agriculture and are expected to provide a major contribution to the production of advanced biofuels (Bourguignon, 2017). Although the production of advanced biofuels has been explicitly supported by the EU since 2015 (European Parliament, 2018), there are still economic and technological challenges (Marelli et al., 2015) to establishing an operational industrial-scale production capacity in the EU, and hence a mature bioeconomy market. In this context, a quantitative assessment of residue production is an essential preliminary step for the deployment of a second-generation biofuel industry in the EU. This assessment is indirect, since there are no systematic statistics on the amount of crop residue biomass produced. To that end, one of the most frequently used approaches is based on constructing empirical models that infer residual biomass (R) from crop yield statistics (Y). Following this statistical approach, several studies in recent years have tried to estimate crop residue potentials in the EU (Bentsen, Felby, & Thorsen, 2014; Böttcher et al., 2010; de Wit & Faaij, 2010; Monforti, Bódis, Scarlat, & Dallemand, 2013; Scarlat, Martinov, & Dallemand, 2010). Although there is evidence of this relationship between Y and biomass partitioning (e.g. Larsen, Bruun, & Lindedam, 2012), the effect of different environmental and management factors (weather, agro-climatic conditions, fertilization, crop genetics) on crop yield and biomass partitioning can be quite complex (Unkovich, Baldock, & Forbes, 2010). Indeed, reference works studying the variability of the harvest index (HI), such as Donald and Hamblin (1976) or Hay (1995), have highlighted the contrasting ways in which biomass partitioning changes depending on these factors.
The main objective of this paper is to assess the production of lignocellulosic biomass from agricultural residues in the EU28. To achieve it, we first propose empirical models to infer residue production from agricultural production statistics. Such models are constructed from experimental data, and assume that a relationship exists between the crop economic yield (Y, grain and tuber/root yield, expressed in t/ha) and the residue yield (R, lignocellulosic biomass production per unit area, expressed in t/ha). Compared to previous works proposing this kind of model (e.g. Bentsen et al., 2014; Scarlat et al., 2010), we put the emphasis on two aspects: explaining the nature and factors determining the observed relationship between economic and residue yield for the main crops in the EU (complete analysis provided as Supporting Information 1, S1); and quantifying the uncertainties of these models.
Secondly, we apply these models to estimate the agricultural residues production in the EU28 from agricultural production statistics. Special attention is paid to analyse the interannual variability in residues production in the period from 1998 to 2015, disentangling the effect of variations in crop area, variability in weather conditions, and agromanagement for the different crops and EU countries. That analysis helps in understanding the existing trends in residue production linked to new policies and technological improvements, and the possible effects of adverse weather extremes in the residues production figures.
Moreover, the analysis also made it possible to identify those crops and countries that are characterized by a residue yield that is higher and less susceptible to weather fluctuation. This relates to the stability of feedstock supply from crop residues in the EU. The review by Kluts, Wicke, Leemans, and Faaij (2017) of existing studies assessing Europe's bioenergy potential provides an accurate overview of the existing methodologies used to assess the different potentials of residual biomass. In many of these studies, production of crop residues is inferred empirically from crop economic production (e.g. grain production), assuming that both variables are correlated. de Wit and Faaij (2010) or Böttcher et al. (2010) used crop-specific empirical conversion factors, such as the residue-to-product ratio (RPR) or the HI for the main cereals and oilseeds, assuming a fixed crop biomass partitioning that is therefore not influenced by weather/climatic conditions or changes in agro-management.

| Deriving empirical models to predict crop residues production
Both conversion factors relate economic yield and residues: HI = Y / (Y + R) (Equation 1) and RPR = R / Y, where Y corresponds to economic yield (e.g. the grain yield in cereals and oilseeds, tuber yield for potato and root yield for sugar beet), expressed in t/ha, and R includes the remaining aboveground biomass (e.g. leaves, stems, husks, chaff) not considered as economic yield, also expressed in weight per unit area. To calculate the HI or RPR consistently, both Y and R must always refer to dry-matter weight (Donald & Hamblin, 1976).
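As a minimal sketch of the two partitioning indices discussed here, assuming the usual definitions HI = Y / (Y + R) and RPR = R / Y with both terms expressed in dry matter. The function names, the moisture-correction helper and the example values are illustrative assumptions, not taken from the study.

```python
def to_dry_matter(fresh_weight, moisture_fraction):
    """Convert a fresh weight (t/ha) to dry matter, given moisture as a 0-1 fraction."""
    if not 0.0 <= moisture_fraction < 1.0:
        raise ValueError("moisture_fraction must be in [0, 1)")
    return fresh_weight * (1.0 - moisture_fraction)


def harvest_index(y_dry, r_dry):
    """Assumed definition: share of economic yield in total biomass, Y / (Y + R)."""
    return y_dry / (y_dry + r_dry)


def residue_to_product_ratio(y_dry, r_dry):
    """Assumed definition: residue per unit of economic yield, R / Y."""
    return r_dry / y_dry


# Hypothetical plot: 7 t/ha of grain at 14% moisture and 6 t/ha of dry residues.
y = to_dry_matter(7.0, 0.14)
r = 6.0
print(round(harvest_index(y, r), 3), round(residue_to_product_ratio(y, r), 3))
```

Under these definitions the two indices carry the same information, since RPR = (1 - HI) / HI, which is why either can be used as a conversion factor.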
Other studies, such as Bentsen et al. (2014), Monforti et al. (2013) or Scarlat et al. (2010), use empirical regression models (e.g. exponential, logarithmic, etc.) between economic yield and the RPR or the HI for different crops, assuming that a positive correlation between crop yield and the proportion of biomass allocated to plant storage organs (grains, fruits, tubers, roots…) exists for all crops. According to that assumption, any increment in the total crop biomass production would be mostly allocated to the plant storage organs (grains or fruits, tubers, etc.).
To verify the hypothesis of previous works, understand the relationship between these variables, analyse the influence of factors investigated (directly or indirectly) by the scientific community, and identify the most appropriate way to infer residues empirically from economic yield, a dataset of 1,580 experimental observations of crop economic yield (Y), residue yield (R) and the HI was collected from a selection of 84 scientific papers published in English (see Table 1 for references, crops covered and geographical distributions). This dataset resulted from filtering and transforming the samples originally collected (around 2,500 observations from more than 130 studies) to make them comparable among experiments in terms of weight per area and moisture content for yields and, especially, in terms of the definition of HI.
The experimental data on Y, R and HI collected from scientific literature were statistically processed to generate a predictive regression model and 95% confidence intervals. Heteroscedasticity appears mainly in regressions between Y and R: in some crops, the variance of R increases progressively with Y, while in other crops exactly the opposite behaviour is observed (see Section 3). The presence of heteroscedasticity violates the assumption of constant error variance, leading to unsatisfactory results in regression analysis. Using HI as the dependent variable in the empirical models instead of R, and subsequently estimating R from HI, solves this problem in most cases, as we may consider HI as a transformation of R, normalized by Y (Equation 1). Therefore, the regression model was established using Y as predictor and HI as predicted variable in all the crops studied, with the exception of sugar beet and potato, where the relationship between Y and R does not indicate the presence of heteroscedasticity.
A correct estimation of the confidence intervals requires a distribution of residuals close to normal. To satisfy that condition, the predicted variables (R for potato and sugar beet, HI for the remaining crops) were transformed using the group of functions proposed by Johnson (1949), commonly used for this purpose. The empirical constants and the functional shape for each crop were fitted using the algorithm proposed by Hill, Hill, and Holder (1976) as implemented in the Johnson Curve Toolbox for Matlab (available at https://it.mathworks.com/matlabcentral/fileexchange/46123-johnson-curve-toolbox/). The first function is a logistic transformation (LT, Equation 3), where P is the original predicted variable (HI or R), $P_t$ is the transformed one and γ, δ, λ and ξ are empirical constants. The second function is a hyperbolic sine transformation (HS, Equation 4). $P_t$ is, in all cases, linearly correlated with Y, and a least-squares method is then applied to estimate the slope (a) and offset (b) of the linear regression $P_t = aY + b$ (Equation 5). The 95% confidence interval (CI) for $P_t$ is calculated by multiplying the standard error of $P_t$ in the linear regression by 1.96. Then, the predicted variables $P_t$ and CI can be transformed back to the original variables using the inverse transform functions for LT and HS (Equations 6 and 7, respectively). Note that $P_t$ can be substituted in Equations 6 and 7 by the linear regression model (Equation 5) to get the complete regression models. The values retrieved for all empirical coefficients are presented in the Results section.
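The transformation, regression and back-transformation chain described above can be sketched as follows. This is only an illustration under stated assumptions: the LT and HS functions are written in the standard bounded and unbounded forms of the Johnson (1949) system, which may differ in detail from the exact expressions used in the study, and every numeric value (parameters, slope, offset, standard error) is hypothetical rather than taken from Table 2.

```python
import math


def johnson_lt(p, gamma, delta, lam, xi):
    """Logistic-type (bounded) transform; requires xi < p < xi + lam."""
    return gamma + delta * math.log((p - xi) / (xi + lam - p))


def johnson_lt_inverse(pt, gamma, delta, lam, xi):
    """Inverse of johnson_lt: maps a transformed value back to the original scale."""
    return xi + lam / (1.0 + math.exp(-(pt - gamma) / delta))


def johnson_hs(p, gamma, delta, lam, xi):
    """Hyperbolic-sine-type (unbounded) transform."""
    return gamma + delta * math.asinh((p - xi) / lam)


def johnson_hs_inverse(pt, gamma, delta, lam, xi):
    """Inverse of johnson_hs."""
    return xi + lam * math.sinh((pt - gamma) / delta)


def predict_with_ci(y, a, b, std_error, inverse, params, z=1.96):
    """Predict on the transformed scale (P_t = a*Y + b), then back-transform
    the estimate and the +/- z*std_error limits to the original scale."""
    pt = a * y + b
    estimate = inverse(pt, *params)
    bounds = sorted(inverse(pt + sign * z * std_error, *params) for sign in (-1.0, 1.0))
    return bounds[0], estimate, bounds[1]


# Hypothetical parameters (gamma, delta, lam, xi) and regression coefficients.
params = (0.3, 1.2, 1.0, 0.0)
lower, hi_estimate, upper = predict_with_ci(
    y=6.0, a=0.05, b=-0.4, std_error=0.5, inverse=johnson_lt_inverse, params=params
)
print(lower, hi_estimate, upper)
```

Because the limits are symmetric only on the transformed scale, the back-transformed interval is generally asymmetric around the estimate.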
Equation 5 was applied to predict the transformed HI and the confidence intervals at NUTS 3 using dry-matter economic yield statistics for wheat, barley, rapeseed, sunflower, maize, sorghum, soybean and rice. Then Equations 6 or 7, depending on the crop, were used to compute the actual HI from the transformed one. The HI is considered region-specific and fixed over the period 1998-2015 for winter cereals (wheat, barley, triticale, rye and oats) and rapeseed (see S1), and is therefore predicted using the average economic yield over that period. This is justified because an analysis of the drivers determining the variability of the HI, reported in S1, indicates that changes in yield induced by water stress have a considerable impact on HI. In many of the main EU producers, however, the interannual yield variability of these winter crops is frequently not determined by water stress (López-Lozano et al., 2015) and is therefore not expected to change the HI. The values of the empirical parameters in Equations 3-7 for wheat were also used to predict residues of triticale, rye and oats, as not enough experimental data were found for these three cereals and the HI can be expected to be close. For the summer crops, HI is computed for every year and region.
After Equation 1, residue yield (in t/ha) can be obtained from observed Y and predicted HI as $R = Y\,(1 - \mathrm{HI})/\mathrm{HI}$ (Equation 8). For potato and sugar beet, Equation 8 is not necessary, as Equations 5-7 predict R directly.
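A short sketch of this last step, assuming that HI is defined as Y / (Y + R), so that R = Y (1 - HI) / HI. The function name and the economic yield value are hypothetical; the HI of 0.32 is the approximate value the text reports for sunflower.

```python
def residue_yield(y_dry, hi):
    """Residue yield (t/ha, dry matter) from economic yield and harvest index,
    assuming HI = Y / (Y + R)."""
    if not 0.0 < hi <= 1.0:
        raise ValueError("hi must be in (0, 1]")
    return y_dry * (1.0 - hi) / hi


# Hypothetical dry-matter yield of 2 t/ha with HI = 0.32.
print(residue_yield(2.0, 0.32))
```

With a fixed HI, R grows in direct proportion to Y, which is the linear case the text describes for crops whose HI is nearly constant.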
Regional residue production (at NUTS 3) and confidence intervals are computed by multiplying R by the respective sown area, and then aggregated to derive country figures (NUTS 0). The estimations of residue production at NUTS 3 were further disaggregated into a regular grid of 25 km, to produce a continuous distribution map over the EU based on the following expression:

$$B_{c,g} = \sum_{i=1}^{n} B_{c,i,N=3}\,\frac{L_{g \cap i}}{L_{i,N=3}}$$

where $B_{c,g}$ is the residue production (in tons of dry matter) for a given crop c in the grid cell g; $B_{c,i,N=3}$ is the residue production for that crop in region i at NUTS 3; $L_{g \cap i}$ is the area intersection of the land cover L between grid cell g and region i; and $L_{i,N=3}$ is the area of land cover L within region i. The sum in the above expression refers to all regions i = 1…n intersecting with grid cell g. The values of L are extracted from the Corine Land Cover 2006 map (Büttner & Kosztra, 2014) classes "nonirrigated arable land," "permanently irrigated arable land" and "rice."

Overall, the algorithm first detects null values in the statistical series, differentiating null values from zeros, and then fills these null values by applying rules recursively for all the NUTS levels. After gap-filling, the algorithm calculates regional weights on each dataset as the ratio of the area or production (A/P) of a given region r at NUTS level n to the A/P of the region R at NUTS level n − 1 to which it belongs. Once the full set of regional weights is established for a given crop and year, the area or production value for any region r at NUTS level n can be retrieved by multiplying the national-level Eurostat value of the country (NUTS 0) to which region r belongs by the regional weight of r and the weights of all the regions R from level 1 to l + 1 containing r. The downscaled production into the 25 km grid over the EU is also provided as Supplemental Material in a separate GIS file.
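The area-weighted disaggregation of regional production onto grid cells can be sketched as below. This is a simplified illustration rather than the authors' implementation: the dictionary-based data layout, the function name and all identifiers and numbers are assumptions, and the land-cover intersection areas are taken as already computed by a GIS step.

```python
def disaggregate_to_grid(region_production, intersections, region_land_area):
    """Distribute regional residue production over grid cells.

    region_production: {region_id: production of the crop in that region}
    intersections:     {(grid_id, region_id): arable land-cover area shared by both}
    region_land_area:  {region_id: total arable land-cover area in the region}
    """
    grid = {}
    for (grid_id, region_id), overlap in intersections.items():
        total = region_land_area[region_id]
        if total <= 0:
            continue  # no arable land mapped in this region, nothing to allocate
        share = overlap / total
        grid[grid_id] = grid.get(grid_id, 0.0) + region_production[region_id] * share
    return grid


# Hypothetical example: two regions overlapping two grid cells.
production = {"A": 100.0, "B": 50.0}
land_area = {"A": 10.0, "B": 5.0}
overlaps = {("g1", "A"): 4.0, ("g2", "A"): 6.0, ("g2", "B"): 5.0}
grid_production = disaggregate_to_grid(production, overlaps, land_area)
print(grid_production)
```

When the grid cells jointly cover all the arable land of each region, the gridded values sum back to the regional totals, which is a convenient consistency check.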

| Analysis of interannual variability and main drivers of residue production
The interannual variability of crop residue production was quantified using the coefficient of variation (CV), expressed in percentage according to $CV_i = 100\,\sigma_i/\mu_i$, in which i refers to a crop or a group of crops, σ corresponds to the standard deviation and μ to the mean. The measurable explanatory factors of this variability are identified as changes in area (A), weather (W) and technical and agro-management drivers (T).
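As a brief illustration, the CV of an annual series can be computed as below. The series is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
import statistics


def coefficient_of_variation(series):
    """CV in percent: standard deviation relative to the mean of the series."""
    mean = statistics.fmean(series)
    if mean == 0:
        raise ValueError("CV is undefined for a series with zero mean")
    return 100.0 * statistics.stdev(series) / mean


# Hypothetical annual residue production values (Mt).
annual_production = [90.0, 100.0, 110.0]
print(coefficient_of_variation(annual_production))
```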
We first quantify the fraction of the variance in residue production (as dependent variable, $y_i$) that is attributable to changes in area (A) and residue yield (R) by conducting a multiple linear regression analysis:

$$y_i = \beta_{0,i} + \beta_{1,i}\,x_{1,i} + \beta_{2,i}\,x_{2,i} + e_i$$

where i corresponds to a crop or a group of crops, $e_i$ are the residuals, $\beta_{0,i}$ is the intercept, $x_{1,i}$ and $x_{2,i}$ correspond to the independent variables R and A, and $\beta_{1,i}$ and $\beta_{2,i}$ represent the weight of these variables in explaining the variance of $y_i$. To make them comparable for quantifying the relative proportion of variance in crop estimates explained by A and R, the coefficients $\beta_{j,i}$ were standardized following Bring (1994), where j refers to the respective coefficient and the standardization uses the corresponding mean ($\mu_{i,j}$) and standard deviation ($\sigma_{i,j}$).
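The regression with two predictors and the comparison of standardized coefficients can be sketched as follows. The standardization shown (each coefficient multiplied by the ratio of the predictor's standard deviation to that of the dependent variable) is one common convention and is an assumption here; it may not be the exact variant applied in the study. The data are hypothetical.

```python
import statistics


def standardized_coefficients(y, x1, x2):
    """Ordinary least squares with two predictors, solved from centred sums.
    Returns (intercept, beta1, beta2) and the two standardized coefficients."""
    my, m1, m2 = (statistics.fmean(v) for v in (y, x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are collinear or constant")
    beta1 = (s22 * s1y - s12 * s2y) / det
    beta2 = (s11 * s2y - s12 * s1y) / det
    beta0 = my - beta1 * m1 - beta2 * m2
    sy = statistics.stdev(y)
    std1 = beta1 * statistics.stdev(x1) / sy
    std2 = beta2 * statistics.stdev(x2) / sy
    return (beta0, beta1, beta2), (std1, std2)


# Hypothetical annual series: residue yield (t/ha), area (Mha), production (Mt).
residue_yield_series = [5.0, 5.4, 4.8, 5.6, 5.2]
area_series = [10.0, 10.2, 9.9, 10.1, 10.4]
production_series = [r * a for r, a in zip(residue_yield_series, area_series)]
coefs, std_coefs = standardized_coefficients(
    production_series, residue_yield_series, area_series
)
print(coefs, std_coefs)
```

Comparing the absolute values of the two standardized coefficients gives a rough indication of whether yield or area dominates the year-to-year variation in production.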
The variance of R was decomposed into the factors T and W. We assume that the influence of technology is mainly reflected by the time trend component of the yield series (Ceglar, Toreti, Lecerf, Van der Velde, & Dentener, 2016; Finger, 2010; Kucharik & Ramankutty, 2005; Lobell et al., 2005) and that, by inference, the remaining unexplained variance is due to weather changes. To this end, a linear trend model (Chen, Wang, Zhang, Tao, & Wei, 2017; Li et al., 2016; Supit et al., 2010) was fitted to the residue yield data (as dependent variable, $y'_i$) over the 18-year period 1998-2015 as follows:

$$y'_i = b_{0,i} + b_{1,i}\,t_{1,i}$$

where i refers to a specific crop or crop group, $t_{1,i}$ is the year, $b_{0,i}$ the intercept and $b_{1,i}$ represents the annual growth rate of residue yield (R, t/ha) over the period 1998-2015, which is presented per country and at EU28 level. The resulting coefficient of determination, $r^2$, and its complement to unity, $1 - r^2$, are interpreted as the proportion of the variance of R that is explained by T and W, respectively, under the hypothesis that T and W are statistically independent and exhaustive.
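The trend-based split of the residue yield variance can be illustrated with a simple least-squares fit. This is a sketch under the stated hypothesis that the linear trend captures technology and agro-management (T) and the remainder is weather (W); the function name and the example series are hypothetical.

```python
def trend_decomposition(years, values):
    """Fit values = b0 + b1 * year by least squares and return
    (b1, b0, r2, 1 - r2); r2 is read as the T share and 1 - r2 as the W share."""
    n = len(years)
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    s_tt = sum((t - mean_t) ** 2 for t in years)
    s_vv = sum((v - mean_v) ** 2 for v in values)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    if s_tt == 0 or s_vv == 0:
        raise ValueError("need varying years and varying values")
    slope = s_tv / s_tt
    intercept = mean_v - slope * mean_t
    r2 = s_tv ** 2 / (s_tt * s_vv)
    return slope, intercept, r2, 1.0 - r2


# Hypothetical residue yield series (t/ha): a 0.05 t/ha annual trend plus
# an alternating weather-like disturbance.
years = list(range(1998, 2016))
values = [5.0 + 0.05 * (t - 1998) + (0.3 if t % 2 else -0.3) for t in years]
slope, intercept, share_t, share_w = trend_decomposition(years, values)
print(slope, share_t, share_w)
```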

| Empirical models for crop residues prediction
The empirical models proposed to predict residue yield from economic yield are shown in Figure 1, whereas the empirical coefficients of the full models (Equations 5-7) are given in Table 2. The proposed models, except for potato and sugar beet, are based on a regression between Y and HI.
Among the crops studied, wheat, barley, rapeseed, sunflower and soybean present a positive, strong relationship between residues and economic yield. The models do not deviate largely from linearity, which means that the predicted HI tends to be relatively stable and weakly dependent on Y. We can deduce from Equation 1 that the curvature of the model between Y and R is determined by the slope of the relationship between Y and HI. Therefore, a constant HI implies a perfectly linear model between Y and R (Figure 1, dashed black line), which is a good approximation for sunflower, where the proposed model predicts an HI close to 0.32 along the Y interval. In wheat and barley, by contrast, the assumption of a constant HI would lead to a moderate but systematic overestimation of R at high yield values. The reader is referred to S1 for a deeper analysis on this.
The models for maize and sorghum, by contrast, are strongly nonlinear and indicate a weak relationship between Y and R: a positive relationship when Y < 2.5 t/ha and an almost constant R as Y increases further. Similarly, economic yield and residues are uncorrelated in potato and sugar beet, and the model predictions would be almost constant (2.5 t/ha of residues for potato, 6 t/ha for sugar beet) regardless of the observed yield. The slightly negative slope of the model for sugar beet seems to be a consequence of the small number of observations available. As a consequence of the low correlation between Y and R, the HI is strongly variable and correlated with yield (see S1). According to the data analysed, the hypothesis of a constant HI is invalid for these crops.
Model uncertainties are large, as indicated by both the model errors and the confidence intervals shown in Figure 1. Moreover, the relationship between residues and yield is highly heteroscedastic, except for potato and sugar beet. For those crops presenting an appreciable positive relationship between R and Y (wheat, barley, rapeseed, sunflower and soybean), the error variance increases dramatically with Y. In sorghum and maize, the situation is exactly the opposite: very large uncertainties in R are expected for low yields, and R then converges progressively to a value around 12.5 t/ha as economic yield increases. These results reflect how much the nature of the relationship between Y, R and HI differs among crops. For wheat, barley, rapeseed, sunflower and soybean, a dependence between economic yield and residues exists. In these crops, the grain yield is, to some extent, determined during vegetative growth, when leaf formation and stem elongation occur and the plants adapt some of the yield components (e.g. number of tillers and inflorescences) to the available resources (Sadras & Slafer, 2012; Unkovich et al., 2010). As indicated by Sadras and Connor (1991), pre-anthesis transpiration contributes to an important proportion of the final yield. In other words, these crops tend to keep biomass partitioning between vegetative and reproductive organs (and thus HI) stable in the presence of abiotic stress. Most of the economic yield variability is caused, first, by agroclimatic differences among experiments, and second, by irrigation and N fertilizing treatments within the same experiment (see S1 for a discussion on the factors influencing the relationship between Y, R and HI). Genetic differences among cultivars under the same environmental conditions cause only a moderate impact on yield, but may have a large influence on biomass partitioning, especially between old and new varieties.
The improvement of wheat and barley cultivars was focused on the production of modern semidwarf cultivars (Hay, 1995; Peltonen-Sainio, Muurinen, Rajala, & Jauhiainen, 2008) with a different plant architecture compared to older varieties: shorter, thicker plants with a higher HI. This variability in the ratio between residues and yield biomass, introduced by differences among cultivars, is higher under potential growing conditions (i.e. low abiotic stress pressure), which explains the increasing variance in R observed for both crops in Figure 1 when Y tends to be high.
In maize and sorghum, Y, R and HI are related in a different way compared to the above-mentioned crops. Grain yield is highly determined by the number of kernels per plant (Otegui & Bonhomme, 1998; Tolk, Howell, & Miller, 2013), which are established after the onset of vegetative organs and are, therefore, highly sensitive to available resources (mainly water) during reproductive phases (Grant, Jackson, Kiniry, & Arkin, 1989; NeSmith & Ritchie, 1992). That explains the weak relationship between R and Y (Figure 1), and a large variability of the HI, correlated with Y (see S1). Genetic differences and irrigation treatments are responsible for the large R uncertainties in the proposed model. Maize and sorghum breeding efforts were focused on identifying cultivars able to cope with water stress, for instance by reducing water uptake during vegetative growth to maximize water availability for the reproductive phases (D'Andrea, Otegui, & De La Vega, 2008; Edmeades, Bolaños, Chapman, Lafitte, & Bänziger, 1999). Therefore, the expression of these genetic differences is maximum when water stress is high, explaining the high variance in R when yields are low. Moreover, the timing of watering treatments in irrigation experiments may also have a strong impact on the HI of both crops (Farré & Faci, 2006), for instance if water constraints affect exclusively the kernel set and grain-filling phases. Overall, the models proposed present a general agreement with the previous work of Bentsen et al. (2014), with the exception of rice, for which our predictions would give systematically lower residue production (Figure 1). Compared to Scarlat et al. (2010), the discrepancies of the maize and rapeseed models are significant. The maize model in that study depicts a positive linear relationship between economic yield and residues (an almost constant RPR) that diverges substantially from our results. For rapeseed, the results in Scarlat et al. (2010) indicate a higher biomass production in seeds compared to vegetative parts (RPR > 1), whereas the experimental data analysed in this study indicate a much lower biomass partitioning to seeds.
The empirical models proposed in this paper are shown here as a relationship between Y and R (Figure 1).

TABLE 2 Parameters of the empirical regression models for the estimation of crop HI. Y refers to yield data at 0% moisture content

| Estimated production of lignocellulosic biomass from agricultural residues in the EU28
Lignocellulosic biomass production from crop residues in EU28 is estimated at 419 million tonnes of dry matter (Mt) per year for the crop categories considered (cereals, oilseeds, and sugar and starchy crops) during the reference period 2011-2015. These three groups of crops cover, approximately, 95% of the total EU agricultural residue production (García-Condado, López-Lozano, & der van Velde, 2018). The breakdown of these crop figures is given in Figure 2. About 79% of the total (331 Mt) originates from cereals (wheat, rye, barley, oats, grain maize, triticale, sorghum and rice), whereas oilseeds (rapeseed, sunflower and soya) contribute 76 Mt (18% of total), and residues from sugar and starch crops constitute only a minor fraction of the EU production (13 Mt, 3%). The top four crops (wheat, maize, rapeseed and barley) together represent more than 80% (347 Mt) of the total. In Table 3, the average EU28 residue yield (lignocellulosic biomass production per unit area, expressed in t/ha) of the crops studied is ranked. Maize is the crop with the highest residue yield, as it produces, on average, 8.9 t/ha of residues, closely followed by rapeseed. Wheat is the major contributor to the EU total residue production thanks to its large sown area (26 Mha), but produces, on average, 5.9 t/ha of residues, much less than maize and rapeseed. Among the main crops, barley is the one with the lowest residue yield (on average, 4 t/ha). That variability in the average residue yield among crops is due to two factors: differences in biomass partitioning predicted by the models (e.g. low HI for rapeseed, high for barley), and irrigation, which increases the overall productivity of crops such as maize or rice that are permanently irrigated in some countries.

FIGURE 2 EU28 lignocellulosic biomass from crop residue production (in Mt dry matter per year) from cereals, oilseeds, and sugar and starch crops calculated for the reference period 2011-2015

[Table column headings: Residue yield (dry t/ha); Area (Mha)]
The uncertainty of the EU28 residue production estimate, calculated from the confidence intervals of the empirical models, is rather large: the upper and lower limits are, respectively, 764 and 292 Mt, a range that represents, in relative terms, 112% of the estimated value. The confidence intervals are not symmetric, as indicated by the empirical models (Figure 1): the lower confidence limit is relatively close to the model estimate of 419 Mt. By contrast, the experimental observations are much more scattered among those varieties and growing conditions resulting in a low HI, which explains the large distance from the model prediction to the upper confidence limit.
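The relative width and the asymmetry of the interval can be checked with a few lines of arithmetic. The helper below is only an illustrative sketch; its name is hypothetical, and the three input values are the EU28 estimate and confidence limits quoted in the text.

```python
def ci_summary(estimate, lower, upper):
    """Return the interval width and the distances below and above the
    estimate, all expressed as a percentage of the estimate."""
    width_pct = 100.0 * (upper - lower) / estimate
    below_pct = 100.0 * (estimate - lower) / estimate
    above_pct = 100.0 * (upper - estimate) / estimate
    return width_pct, below_pct, above_pct


# EU28 totals from the text: estimate 419 Mt, limits 292 and 764 Mt.
width, below, above = ci_summary(419.0, 292.0, 764.0)
print(f"width {width:.1f}%, below {below:.1f}%, above {above:.1f}%")
```

The distance to the upper limit is well over twice the distance to the lower one, which is the asymmetry described above.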
The estimation uncertainties vary moderately among crops, as shown in Figure 3. In wheat and barley, the difference between the upper and lower confidence limits is close to 100% of the estimated value. Maize residue estimations present the largest uncertainties in relative terms, around 177% of the predicted value (estimated production is 85 Mt; the lower and upper limits are 54 and 205 Mt, respectively), a consequence of the large error variance in the empirical model (Figure 1). By contrast, confidence intervals for sugar beet, rice and soybean are the smallest, between 65% and 70% of the model estimations.

Figure 4 displays the interannual variability from 1998 to 2015 of the total EU28 estimations of lignocellulosic biomass from agricultural residues and the proportion of that variability explained by the different drivers considered. In relative terms, the expected interannual variability is 7.5%, and would be primarily driven by changes in residue yield, with a minor influence of changes in the sown area. Cereals, as the most important contributors to total production, present similar values. By contrast, the interannual variability of residues from oilseeds and from sugar and starchy crops is much higher (above 20%), and the contribution of area changes is much larger compared to cereals.

| Influence of weather conditions, agromanagement and agricultural policy in the interannual variability of the EU28 production
Area changes are mainly attributed to the impact of agricultural policies. For instance, the existing market needs and the strict regulation of sugar beet production in the EU lead to a negative area trend of 0.18 Mha (Figure 5), which explains almost entirely the variability of residue production from the sugar and starchy crops. Conversely, a positive trend in oilseeds area is appreciable from 2003 to 2012, coinciding with the announcement of the EU to establish a biofuels support policy, primarily with the aim of lowering CO2 emissions in the transport sector (Bourguignon, 2015). In absolute terms, that increase of oilseeds area is mainly coming from rapeseed, which is a major contributor to the production of biodiesel (Junginger, Goh, & Faaij, 2014). That increasing interest in oilseeds led to a parallel intensification of agro-management in oilseeds (e.g. new varieties, fertilizing, etc.), resulting in a positive trend of total crop biomass yield and, consequently, also of residue yields (+0.11 t/ha). This trend represents approximately 70% of the residue yield variability of oilseeds.

FIGURE 3 Estimations of current (2011-2015) residue production (in Mt of dry matter per year) in EU28 per crop. Solid lines represent the 95% confidence intervals of the residue production in EU28

FIGURE 4 Coefficient of variation (in percentage, CV%) of residue production (left panel) and residue yield (right panel) at EU level from 1998 to 2015 for different crop groups. The colours in the stacked bars represent the proportion of the production and yield variances explained, respectively, by area, yield, weather and agro-management
By contrast, the variability of residue yield, the main factor of residue production in cereals, is almost equally explained by weather and technological factors (Figure 4, right panel). A positive trend in residue yield from cereals (0.05 t/ha, Figure 5) explains half of the estimated variance in residue yield, and is attributed to an overall increase of cereal biomass yield due to improvements in agro-management. The effect of weather conditions on residue yield would explain the remaining half of the variance. The lowest residue yields estimated for cereals shown in Figure 5 are explained by extreme weather events: an exceptionally cold spring followed by a drought in August impacted cereals in 2003 (Ceglar et al., 2016; Fontana, Toreti, Ceglar, & De, 2015; López-Lozano et al., 2015; Van der Velde, Tubiello, Vrieling, & Bouraoui, 2012; Zaitchik, Macalady, Bonneau, & Smith, 2006; Zampieri, Ceglar, Dentener, & Toreti, 2017); a drought across all the Black Sea area affected summer crop yields in 2007 and 2012 (Bussay, van der Velde, Fumagalli, & Seguini, 2015). Conversely, in 2014 and 2015 the high estimates correspond with highly favourable weather conditions across the growing season in most of the EU territory. There are, nevertheless, some differences among cereals: trends represent less than 20% of the interannual variability of residue yield in maize (Figure 5), which is mostly determined by weather conditions.

FIGURE 5 Time series and corresponding trend of harvested area (grey surface, Mha) and residue yield (purple line, t/ha) for cereals (top), sugar and starch crops (middle) and oil crops (bottom)

| Differences in lignocellulosic biomass production among the EU 28 countries
Table 4 shows the estimates of lignocellulosic biomass production from crop residues by Member State for the main crop groups. The top nine producers are France, Germany, Poland, Romania, Spain, the UK, Hungary, Italy and Bulgaria, which together account for a residue production of 339 Mt, or 80% of total EU28 production. The top four countries alone account for more than half of the estimated EU28 residue production.
The feedstocks vary substantially among the main producing countries, as shown in Figure 6. Large areas dedicated to winter cereals, and especially their high biomass yield, make them the most important source of lignocellulosic biomass from agricultural residues in countries in the north of the EU. Regionally, production is concentrated in north-eastern France, East Anglia (UK), central Germany and western Poland (Figure 7). The contribution of rapeseed to total production is also quantitatively relevant for these northern countries, thanks to the low HI of that crop, which leads to a higher residue yield than that of winter cereals (Table 3).
In Romania, Hungary and Italy, maize constitutes the main source of agricultural residues (Figure 6), and production is highly concentrated across the Po valley and the Danube basin (Figure 7). According to our results, maize produces, on average, more lignocellulosic biomass per unit area than any other of the studied crops. The model computed for maize (see Figure 1) indicates that it is possible to obtain a high residue yield (8-9 t/ha) even when grain yields are moderate or low (4-5 t/ha). This is especially relevant for Hungary and Romania, where maize is rainfed. Other countries such as Croatia and Slovenia, where maize also has a major share of production, are among those with the highest residue yields in the EU (Figure 6). By contrast, Spain presents the lowest residue yield (<4 t/ha) among the main producers, due to the important contribution of barley, which has a high HI, to total residue production.
The differing composition of total agricultural residue production also influences the uncertainty of the country estimates (Figure 6). In relative terms, the countries where maize is highly relevant for total residue production present the largest uncertainties (e.g. Romania, Bulgaria, Hungary or Croatia), given the large error variance of the maize model (Figure 1). Similarly, in France, where production of lignocellulosic biomass from maize residues is significant, relative uncertainties are higher than in Germany, where residues from maize are minor. Conversely, in the Netherlands or Belgium, where the contribution of sugar and starch crops to total residue production is highly relevant, the relative uncertainties are the lowest (30% and 50%, respectively).
TABLE 4 Estimated lignocellulosic biomass production per year in the period 2011-2015 from crop residues per Member State for the main crop groups: cereals (wheat, rye, barley, oats, grain maize, triticale, sorghum, rice), oilseeds (rapeseed, sunflower, soya) and sugar and starch crops (sugar beet, potato). Member States are ranked by decreasing production. All values in million tonnes (dry weight)
FIGURE 6 Estimation of current (2011-2015) residue production (in Mt of dry matter) at Member State level. The coloured areas denoting crops correspond to their contribution to the residue production in each country. The green points represent the total residue yield (tonnes per hectare) by country, while the dashed line indicates the total EU 28 residue yield. The thin grey line corresponds to the 95% confidence intervals of the estimates for each country. Countries are ranked in decreasing order of their residue production
FIGURE 7 Spatial distribution of crop residue estimates, in kt of dry matter, per 25 km grid cell across the EU28 for the average period (2011-2015). Biomass production is shown for the major crops (wheat, maize and rapeseed), along with the distribution of total feedstock coming from the crop groups cereals (wheat, rye, barley, oats, grain maize, triticale, sorghum and rice), oilseeds (rapeseed, sunflower and soya) and sugar and starch crops (sugar beet and potato)
The estimated interannual variability of total lignocellulosic biomass production from crop residues in most of the EU 28 countries would be primarily driven by variations in residue yield, rather than changes in area (Figure 8, top panels). Among the top producers, only in Italy are changes in crop area more relevant than changes in residue yield, as crop residues there come mainly from maize grown under irrigation in the north of the country.
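The distinction between variability driven by residue yield and variability driven by area can be made concrete with a small sketch. It assumes that residue production equals harvested area times residue yield, so that the variance of log production splits into an area share and a yield share that sum to one. This log-variance split, the function names and the numbers are illustrative assumptions rather than the decomposition used to build Figure 8.

```python
import math
from statistics import mean


def covariance(xs, ys):
    """Population covariance of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def area_yield_shares(area, residue_yield):
    """Split the variance of log(production) into (area share, yield share).

    Uses production = area * residue_yield, hence
    log(production) = log(area) + log(residue_yield).
    Each share is cov(log(factor), log(production)) / var(log(production)),
    so any covariance between area and yield is divided evenly between them.
    """
    log_area = [math.log(a) for a in area]
    log_yield = [math.log(y) for y in residue_yield]
    log_prod = [a + y for a, y in zip(log_area, log_yield)]
    var_prod = covariance(log_prod, log_prod)
    if var_prod == 0:
        raise ValueError("production is constant; shares are undefined")
    return (covariance(log_area, log_prod) / var_prod,
            covariance(log_yield, log_prod) / var_prod)


# Hypothetical yearly values, NOT data from the study:
# a nearly stable area (Mha) and a strongly fluctuating residue yield (t/ha).
area = [2.0, 2.1, 2.0, 1.9, 2.0]
residue_yield = [6.0, 8.5, 5.5, 9.0, 7.0]

area_share, yield_share = area_yield_shares(area, residue_yield)
print(f"area share = {area_share:.2f}, residue-yield share = {yield_share:.2f}")
```

A share can fall outside the 0-1 range when area and yield are strongly negatively correlated, which is one reason to treat this split as a descriptive aid only.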
The interannual variability of residue yield estimations differs substantially among countries. Central and northern countries (e.g. France, Germany and the UK) are characterized by a prevalence of winter crops and temperate, humid agro-climatic conditions, which would lead to high and stable residue yields over the years (CV < 10%). In north-eastern EU countries (e.g. Poland, the Czech Republic, Slovakia, Latvia, Lithuania, Estonia) the residue yield variability would be higher and mostly linked to technical and agro-management factors (Peltonen-Sainio, Salo, Jauhiainen, Lehtonen, & Sieviläinen, 2015), resulting in a positive trend in total crop biomass yield during the last 15 years. In southern countries (e.g. Romania, Hungary, Spain and Bulgaria) the interannual variability of residue yield is also high (it can reach 20%) but is mostly driven by weather conditions, as crop production there is heavily determined by rainfall regimes (López-Lozano et al., 2015).
Differences do exist among crops. In most of the main producers, the variability of maize residue production would be driven by inter-annual changes in sown area (Figure 8) rather than in yield. Indeed, the inter-annual variability of maize residue yield is very low when compared with wheat or rapeseed. This is partially due to the empirical model computed for maize, which tends to predict almost constant residue yields (Figure 1) when grain yield is above 2.5 t/ha. Among the top producers, only in Romania and Hungary would the variability be primarily explained by residue yield, and only in these countries would the variability of residue yield exceed 10%, as they are exposed to severe summer droughts that drastically reduce crop yields.
FIGURE 8 Inter-annual variability (expressed as coefficient of variation in percentage, CV%) of the residue production and residue yield at Member State level from 1998 to 2015, calculated for the crops in this study (cereals, oilseeds, sugar and starch crops) and the crops with the highest production in the EU: wheat, maize and rapeseed. Member States are ranked in decreasing order of their residue production, with arrows marking the group of countries covering >80% of EU total production
Regarding rapeseed, with the exception of France and Germany, the top two producers, the inter-annual variability of lignocellulosic biomass from residues would be higher than 20%, mostly explained by changes in area. As mentioned before, the area of oilseeds, and particularly of rapeseed, increased rapidly until recently owing to growing interest in this crop as a feedstock for first-generation biodiesel: in Poland, the area in 2015 was double that of 1998, and in Romania it increased almost fivefold. In addition, average residue yields have increased in many of these countries (Figure 8) as a result of technical improvements that raised crop productivity.

| DISCUSSION
The total estimated production of lignocellulosic biomass from crop residues in the EU28, according to our models, is 419 Mt for the reference period 2011-2015. Wheat is, according to our results, the main residue feedstock (155 Mt, 37% of total EU production), predominant in central-northern Europe (mainly the UK, France and Germany). Apart from wheat, our study also identifies maize stover (84 Mt, 20%), mainly grown in southern Europe, as an attractive feedstock of lignocellulosic biomass, as previously noted by Mitchell et al. (2016). Maize presents the highest residue yield of all crops (9 t/ha) and, according to our results, these high residue yields could also be achieved under moderately dry agro-climatic conditions. Rapeseed residues are the third most important feedstock (54.5 Mt, 13% of total EU production) thanks to a positive trend in production, a consequence of increases in area and yield over the last 15 years. Moreover, rapeseed allocates a high proportion of total biomass to the vegetative plant organs: the data used to construct our models indicate that the HI of rapeseed ranges between 0.2 and 0.3, very low compared to the other field crops. This high importance of rapeseed as a source of lignocellulosic biomass is, nevertheless, highly dependent on the assumptions and data used in our models, as in the previous study of Scarlat et al. (2010) the proportion of rapeseed residues in total EU production is much lower.
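To see why a low HI translates into a large residue supply, the following minimal sketch converts a harvest index into a residue-to-product ratio (RPR). It assumes the usual definition of HI as economic yield divided by total above-ground biomass, both in dry matter, and a fixed HI. The empirical models of this study instead let the relationship vary with yield, so the function names and the constant-HI simplification are illustrative assumptions only.

```python
def residue_to_product_ratio(harvest_index):
    """Residue mass per unit of economic yield for a given harvest index.

    With HI = yield / (yield + residue), it follows that
    residue / yield = (1 - HI) / HI.
    """
    if not 0.0 < harvest_index < 1.0:
        raise ValueError("harvest index must lie strictly between 0 and 1")
    return (1.0 - harvest_index) / harvest_index


def residue_yield(economic_yield, harvest_index):
    """Residue yield (t/ha) implied by an economic yield (t/ha) and a fixed HI."""
    return economic_yield * residue_to_product_ratio(harvest_index)


# HI range reported above for rapeseed, plus a higher value for contrast.
for hi in (0.2, 0.3, 0.5):
    print(f"HI = {hi:.1f} -> RPR = {residue_to_product_ratio(hi):.2f}")
```

Under this simplification, an HI between 0.2 and 0.3 implies roughly 2.3 to 4 tonnes of residue per tonne of seed, whereas an HI of 0.5 implies one tonne of residue per tonne of product.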
In this study, production estimates have also been disaggregated onto a regular grid to describe the spatial distribution of residue production across the EU (also addressed in Monforti et al., 2013), and are provided as Supplemental Material. The economic sustainability of biofuel production systems is largely conditioned by the design of the supply chain, where biomass transportation costs (Searcy, Flynn, Ghafoori, & Kumar, 2007) play a central role.
The amount of available biomass, the distance between production, processing and consumption hotspots, and the timing of biomass supply (e.g. residues from winter and/or summer crops) are critical factors determining the optimal full supply chain: the need for biomass preprocessing, the location, number and capacity of the processing plants, or the distance to the biofuel markets (Lin et al., 2016; Sultana & Kumar, 2011). Moreover, GHG emissions from biomass transportation are accounted for in the life cycle assessment of straw-based bioethanol and power systems (Borrion, McManus, & Hammond, 2012; Martinez-Hernandez et al., 2013; Morales, Quintero, Conejeros, & Aroca, 2015).
The estimations reported in this study, as they are based on models constructed from experimental data on crop HI, refer to the total biomass produced by the plant which, implicitly, includes a proportion of biomass that cannot actually be harvested by combines due to technical limitations (Douglas, Rasmussen, & Allmaras, 1989; Kim & Gregory, 2015). Quantitative studies on the sustainable use of crop residues for bioenergy that use biomass estimations computed from these empirical models have to take into account, among other factors, this fraction of uncollectable biomass that is necessarily left on the soil.
In this study we put emphasis on quantifying the uncertainties of the empirical models proposed and the estimations produced, an aspect not fully addressed in previous studies such as Bentsen et al. (2014) or Scarlat et al. (2010). In our view these uncertainties are relatively large as a consequence of two factors. First, empirical models inferring residues from economic yield, either directly or using parameters such as the HI or the RPR, are oversimplifications, as the influence of genetic differences among crop varieties and of environmental conditions on biomass partitioning can be rather complex. Second, the models are constructed from data produced in experimental conditions that may not be representative of commercial EU agriculture. In our study we have used a much larger number of observations to construct our models than previous works, but perhaps at the cost of including agro-management practices (e.g. use of old landraces for some crops, fertilizing treatments) that are beyond the range of commercial conditions, thus artificially increasing the error variance of the models proposed.
More complex biophysical models able to describe the actual effects of weather on crop biomass partitioning could reduce model uncertainties. General crop models such as WOFOST (de Wit et al., 2018), CropSyst (Stockle, Cabelguenne, & Debaeke, 1997), or STICS (Brisson, Launay, Mary, & Beaudoin, 2009) can adequately simulate how environmental and management factors influence yield formation and biomass partitioning in crops. Similarly, canopy models driven by Earth Observation data, using satellite and weather observations, would have the potential to quantify the total biomass (expressed as net primary production) over croplands (e.g. Lobell et al., 2002; Prince, Haskett, Steininger, Strand, & Wright, 2001).
The uncertainties related to varietal differences in biomass partitioning are still difficult to tackle in such models, and systematic field observations representative of actual agricultural conditions in the EU would be needed. Nevertheless, the large variance observed in the models indicates the range of achievable residue yields in crops like wheat and barley, where varietal improvement has led to modern cultivars with higher HI. The analysis of possible scenarios for lignocellulosic biomass production should take into account those cultivars with a lower HI, which can significantly increase the biomass supply for bioenergy under low abiotic stress pressure.