Current Models Underestimate Future Irrigated Areas

Predictions of global irrigated areas are widely used to guide strategies that aim to secure environmental welfare and manage climate change. Here we show that these predictions, which range between 240 and 450 million hectares (Mha), underestimate the potential extension of irrigation by ignoring basic parametric and model uncertainties. We found that the probability distribution of global irrigated areas in 2050 spans almost half an order of magnitude (∼300–800 Mha, P2.5,P97.5), with the right tail pushing values to up to ∼1,800 Mha. This uncertainty is mostly irreducible as it is largely caused by either population‐related parameters or the assumptions behind the model design. Model end‐users and policy makers should acknowledge that irrigated areas are likely to grow much more than previously thought in order to avoid underestimating potential environmental costs.


Introduction
The size of irrigated areas is a key parameter defining the social-ecological impact of irrigated agriculture, the most intensive agricultural regime in terms of labour and production per surface unit (Boserup, 1965): It determines crop yields (Dillon, 2011;Netting, 1993),water consumption (Wisser et al., 2008;Puy et al., 2017), the impact of agriculture on global warming (IPCC, 2014), the sustainability of river systems (De Graaf et al., 2019), the spread of water-borne or air-borne diseases (through large-scale flood and sprinkler irrigation respectively FAO, 1997), or rainfall patterns in and around irrigated areas (Alter et al., 2015;Cook et al., 2011). This explains why modeling future irrigated areas is paramount to guide midterm policies on water and biomass demand and to prevent environmental deterioration, poverty and climate change (Hejazi et al., 2014;Shen et al., 2008).
To date, projections of the global irrigated area suggest that it will increase linearly, reaching between 250 and 450 Mha by 2050 (Alcamo et al., 2007(Alcamo et al., , 2005Alexandratos & Bruinsma, 2012;Fischer et al., 2007;Molden, 2007;Rosegrant et al., 2002;Shiklomanov & Rodda, 2003) ( Figure S1 in the supporting information). The narrow range of these predictions raises a fundamental question: Are models of future irrigated areas robust, or do they neglect crucial uncertainties?
Here we address this query and conduct a systematic uncertainty and global sensitivity analysis of the projection of continental and global irrigated areas for 2050. Following previous works (Alcamo et al., 2007(Alcamo et al., , 2005Alexandratos & Bruinsma, 2012;Fischer et al., 2007;Molden, 2007;Rosegrant et al., 2002;Shiklomanov & Rodda, 2003), we model continental irrigated areas as a function of population, cropland, and water available for irrigation. We keep our model as simple as possible to facilitate its assessment, and unlike previous approaches, we map the model output back onto the projection's main parametric and structural uncertainties Finally, we discuss the nature of these uncertainties, and reflect on the implications of our results for all estimations that use the future extension of irrigation as a model input.

The Model
Population is a major constrain of agricultural systems and has been widely used as a driver to project irrigation-related variables in many modeling exercises (Alexandratos & Bruinsma, 2012;Boserup, 1965;Shen et al., 2008;Shiklomanov & Rodda, 2003). We used population size rather than population density to model irrigated areas because it better explains changes in irrigated areas at the continental level ( Figures S2 and S3). Although irrigated areas might be conditioned by other factors (i.e., trans-oceanic food demands and trade networks), the available data suggests that their effective influence is effectively summarized by population size, a parameter that encapsulates many complex features of social-ecological systems (Bettencourt, 2013;Oka et al., 2017;Puy, 2018).
We described the relation between irrigated areas and population with the following equation: where Y and N are respectively irrigated area and population size, is a constant, and the scaling exponent describing the growth rate between population size and irrigated area. In order to ground the model on current continental estimates of irrigated areas, we described as where Y 0 and N 0 are the irrigated area and the population size at the baseline year t = 0.
The world population has been growing super-exponentially for most of known human history, and it has begun to slow down only recently (Johansen & Sornette, 2001). The United Nations (2017) projects this deceleration in population growth rates up to 2100. However, we considered this slowdown uncertain based on the fact that the United Nations has been revising their projections, both upwards and downwards, since 2013, and the intrinsic unpredictability that characterize population dynamics (Pison, 2019). We assumed that some fluctuations in population growth/decline rates should be expected and modeled population growth as where r reflects population growth rate and is an exponent that controls the speed of population growth/decline.
Our final model is the result of rearranging equations (1)-(3) (see Text S1 in the supporting information for a step-by-step description): Equation (4) implies that the growth rate of irrigated areas is open ended. This concurs with recent evidence that suggests that continental irrigated areas, on average, grow super-exponentially for a given increase in population (Puy, 2018). However, it can be argued that the expansion of irrigated agriculture is constrained by the availability of cropland and water for irrigation. Such premise draws from the ecological concept of "carrying capacity" the idea of a static boundary that limits how much of a given thing can exist in a given environment (Chapman & Byron, 2018). Although technological innovations to save land and water (i.e., vertical agriculture, Despommier, 2011, or seawater desalination, Burn et al., 2015) have shown this boundary to be dynamic, we acknowledge the implausibility of (1) irrigated areas increasing from the current ∼6% to beyond 100% of the arable land in 30 years' time and (2) irrigated areas consuming all freshwater available, as this would leave no water for industrial or domestic use. We thus limited the growth of irrigated areas as follows: where f(x) is our new model, Y comes from equation (4), and K is the maximum cropland available.
In order to operationalize water constraints, we began by modeling the relation between the water required for irrigation and the irrigated area as where w is the volume of water required to irrigate the area Y , the intercept, and the slope of the line of best fit ( Figure S4). We then limited the growth of the irrigated area by the maximum volume of water available for irrigation, which we made a fraction of the total water available: If the amount of water required Note. DU stands for discrete uniform and j is the index of the continent. The rationale behind the selection of the probability distributions is explained in the supporting information. Tables S1-S4 show the specific distributions used for Y 0 , r, K, and W a in each continent j.
to irrigate the area Y is smaller than the total water that can be allocated to irrigation, we keep Y as the model output. Otherwise, the largest area that can be irrigated with the total water available for irrigation is used as the model output: where f(x) is our new model, Y comes from equation (4), W a is the total water available, and is the proportion of W a that is allocated to irrigation. Altogether, our model includes seven uncertain parameters (Table 1). Text S2 in the supporting information justifies their characterization as uncertain and describes the steps taken to mathematically formalize this uncertainty.

The Triggers
Models also embed structural uncertainties, that is, alternative model formulations might be plausible, and parameters can be computed using a range of different yet equally reasonable approaches (Saltelli et al., 2008). This is the case of in equation (1) and of ( , ) in equation (6). To tackle these model uncertainties, we made and ( , ) a function of triggers, = f(X 1 , X 2 , X 3 , X 4 ) and ( , ) = f(W 1 , W 3 , W 4 ). The rationale is the following: • (X 1 , W 1 ): There are currently six data sets on global irrigated areas (Meier et al., 2018;Thenkabail et al., 2009;Siebert et al., 2013;Salmon et al., 2015;FAO, 2016;FAO, 2017). The values provided by these data sets might differ by up to 4 orders of magnitude at the national level ( Figures S5-S7). These different estimations introduce uncertainty in the value of Y in equations (1) and (6), and in consequence, in the calculation of and ( , ). X 1 and W 1 were thus used to select which of the six irrigated area data sets should be chosen for the computation of and ( , ). • X 2 : in equation (1) can be computed via ordinary least squares (OLS) or standard major axis (SMA) regressions. OLS would allocate all equation error to Y (i.e., it assumes there is an independent and a dependent variable), whereas SMA would split it into N and Y (i.e., it does not differentiate between the driver and the response variable) (Smith, 2009;Warton et al., 2006). The use of OLS thus implicitly endorses Boserup, 's (1965) model, in which population drives the adoption of intensive agricultural systems, whereas SMA is more appropriate if both variables are considered to be inextricably intertwined (Morrison, 1994). X 2 was therefore used to decide whether OLS or SMA regressions should be used in equation (1). • (X 3 , W 3 ). OLS and SMA might provide different estimations of the line of best fit depending on whether robust or nonrobust methods are used. We could confirm the presence of bivariate outliers affecting the computation of and and ( , ) ( Figures S8 and S9). X 3 and W 3 were thus used to select whether to apply robust or nonrobust methods for the estimation of and ( , ), respectively. • (X 4 , W 4 ). Even if the aforementioned triggers account for the uncertainty in the computation of and ( , ), there still would be uncertainty with regard to their true value. This can be approached by bootstrapping ( , ) n times in each of the 24 (12) possible scenarios created by the triggers, thus creating 24 (12) different vectors of ( , ) with length n per continent. X 4 and W 4 were therefore used to decide which and ( , ) values should be extracted from the vector and used in the computation of equation (4).

Uncertainty and Sensitivity Analysis
In order to analyze how the model parameters and triggers convey uncertainty to the prediction of irrigated areas, we propagated uncertainty from the inputs to the output. Sobol' quasi-random number sequences were used to generate a (2 15 , 2k) matrix for each continent (Bratley & Fox, 1988), with k being the number of model parameters plus triggers. The first k-matrix was labeled A and the remaining k-matrix, B. We allocated each parameter and trigger to a different column and described them using probability distributions according to the continent and the available data (Text S2 and Table 1).
Our model ran rowwise in both the A and B matrices, as follows: For v = 1, 2, … , 2 15 rows, it retrieved a (v) , ( (v) , (v) ) value following the conditions set by To examine which parameters and triggers conveyed the most uncertainty to the model output, we relied on Sobol' indices. We constructed k additional matrices A (i) B , i = 1, 2, … , k parameters, where the ith matrix is composed of all columns of the A matrix except the ith, which comes from the B matrix. Sobol' indices decompose the model output variance (V Y ) as the sum of all conditional variances up to the kth order, as where V i is the first-order variance contribution of the ih parameter, V ij the second-order contribution of the interaction between the ith and the jth parameter, and so on up to the kth order. The first-order effect, or the proportion of variance contributed by the ith parameter, is then calculated as We also computed the total-order effect (S Ti ), which reflects the proportion of variance contributed by the first-order effect of the ith parameter plus its interactions with all the rest: The larger the difference between S Ti and S i , the more the parameter is involved in interactions. Here we used Saltelli et al.'s (2010) and Jansen's (1999) estimators to compute S i and S Ti , respectively.
In order to ensure the robustness of the results, all the steps taken during our analysis have been replicated independently by two of the coauthors (A. P. and S. L. P.) using two different programming languages and scripts (R and Python, respectively). The code and all data sets required to replicate our results are publicly available in Zenodo (Lo Piano, 2020;Puy, 2020). Figure 1 shows the empirical distribution of irrigated areas in 2050, Y in equation (4). To compare our results with the projections currently available, Table 2 presents the proportion of model runs yielding min ≤ Y ≤ max and Y ≥ max, with min and max being the minimum and maximum extension of irrigation in 2050, as suggested by existing reports (Alcamo et al., 2007(Alcamo et al., , 2005Alexandratos & Bruinsma, 2012;Fischer et al., 2007;Molden, 2007;Rosegrant et al., 2002;Shiklomanov & Rodda, 2003) (Table S5).

Uncertainty Analysis
The distribution of the projection of continental irrigated areas significantly overshoots previous estimates, which makes the likelihood of an extreme value much larger than previously suggested (Figure 1a). This is especially the case for the irrigated area of Africa, whose range of possible sizes spans more than one order of magnitude (∼20-130 Mha; P 2.5 , P 97.5 ). If we take into account the full empirical distribution, we observe that 60% of our model runs are beyond the maximum extension of irrigation proposed so far for that continent (39 Mha) (Alexandratos & Bruinsma, 2012), and 2% over the 135 Mha which are potentially irrigable according to Xie et al.'s (2014) updated land estimates. Note. The proportion of model runs yielding Y ≤ min is not shown.
In the Americas, Asia, and Europe the potential increase is also significant (∼48-120, ∼190-550, and ∼15-40 Mha, respectively; P 2.5 , P 97.5 ). These values are unevenly constrained by the minimum and maximum extensions reported by existing works (19-80%; Table 2). The uncertainty in the projection for the Americas, Asia and Europe increases to 320, 880, and 63 Mha, respectively, if the upper 2.5% of the distribution is taken into account. In the case of Asia, 15% of our model runs also yielded values as large as K, suggesting significant chances of irrigated agriculture extending over all available cropland by 2050 (Zhang & Cai, 2011). Europe is the only continent with a significant proportion of model runs (∼8%) where Y < min, indicating that European irrigated areas could actually shrink as a result of a negative population growth rate.
To know how uncertainty at the continental level translates at the global level, we generated a set of row vectors with the model output for the continent j in the vth row. We then calculated the global irrigated area (Figure 1b). The range of possible extensions, which is bounded between 300 and 800 Mha (P 2.5 , P 97.5 ), increases to ∼1,800 Mha if the right tail of the distribution is taken into account. Our results suggest that there are ∼100% and 95% chances of global irrigated areas in 2050 being larger than Alcamo et al.'s (2005) (262 Mha) and FAO's (322 Mha) predictions (Alexandratos & Bruinsma, 2012), and 60% chances of irrigated areas being larger than the largest extension proposed to date (450 Mha) (Molden, 2007). In fact, a scenario where global irrigated areas extend over an area at least twice that size cannot be ruled out (∼1%).
Since the expansion of irrigation in our model is driven by population growth (see equation (3)), we checked whether our irrigated area estimates were caused by unreasonably high population values. Instead, the results indicate that the 2.5th to 97.5th percentiles are consistently associated with modeled population sizes bounded by the United Nations' lowest and highest population estimates for 2050 ( Figure S10).

Sensitivity Analysis
Figures 2 and S11-S14 present the results of the sensitivity analysis. We found that the parameters that model water ( , W a ) and cropland (K) availability, as well as the different statistical frameworks available to model how water withdrawals scale with irrigated areas (W 1 , W 3 ), do not contribute any significant uncertainty to the projection of irrigated areas in any continent. Unless the other uncertainties are not reduced, they can be excluded from future models or fixed at any value within their uncertainty range without significantly affecting the projection (Figure 2a).
The other parameters are influential and their importance varies depending on the continent. Europe is the region most influenced by the first-order effect of the irrigated area baseline value Y 0 , which explains 55% of the uncertainty in the projection. For countries such as Slovakia or Belgium, the uncertainty in the extension of irrigated agriculture spans three to four orders of magnitude ( Figure S7). If reducing the model output variance for Europe is a priority, then efforts should be invested in improving the mapping of irrigated areas. This might be achieved by increasing the accuracy of state-of-the-art measuring tools, such as remote sensing techniques; by reaching an agreement on what exactly is an irrigated area; or by better merging the data compiled in official reports with that obtained from satellite imagery (FAO, 2016(FAO, , 2017Meier et al., 2018;Salmon et al., 2015;Siebert et al., 2013;Thenkabail et al., 2009).
The reduction of the projection uncertainty for the other continents is more problematic: First, because a significant proportion of the model output uncertainty for Asia (∼65%) and the Americas (∼70%) is caused by Figure 2. Sobol' indices. S i and S Ti refer respectively to Sobol' first-and total-order indices. S i measures the influence of a parameter in the model output, while S Ti measures the influence of a parameter jointly with its interactions. The blue, transparent horizontal frame indicates the approximation error in the calculation of the S Ti indices, S * Ti . Only S Ti whose lower 95% confidence interval does not overlap with S * Ti are considered truly influential (see supporting information). (a) Model parameters/triggers (b) Clusters of parameters/triggers: irrigation (X1, Y 0 , W 1 , W a , ), model (X 2 , X 3 , X 4 , W 3 , W 4 ), and population (r, ). See Table 1 and Text S2 in the supporting information for a description of the parameters. the population growth rate r, an index largely driven by unpredictable factors such as demographic stochasticity (i.e., randomness in birth/death rates and sex determination) or demographic heterogeneity (i.e., variance in the features affecting survival and reproduction) (Melbourne & Hastings, 2008). Second, because in the case of Africa, it also largely results from the different model structures available (S X 2 + S X 4 ≈ 30%), which is harder to pin down with our current knowledge.
To gain further insight into the structure of the model output uncertainty, we clustered the parameters and the triggers in three groups and simultaneously assessed the uncertainty driven by irrigation, population and model-related inputs (Figure 2b). We observed that the model is largely additive for the Americas, Asia, and Europe: 88-94% of the output uncertainty is explained by the sum of first-order effects, which are dominated by population-related uncertainties in the case of the Americas and Asia and by irrigation-related uncertainties in the case of Europe. The model is nonadditive for Africa (74%), meaning that ∼25% of the model output variance results from the interactions between at least two different clusters.

Discussion and Conclusions
Irrigation models are widely used to define policies on water and food security, environmental sustainability, and climate change. Their reliability strongly depends on whether their inferences hold once the chain of assumptions they embed is allowed to vary within reasonable bounds. In this paper we provide an understandable, transparent model of future irrigated areas, and exhaustively explore its behavior by activating 10.1029/2020GL087360 the main sources of uncertainty in the assumptions underlying the analysis. Our study aligns with existing works and guidelines that regard the study of uncertainty as an unavoidable step to judge the quality of model-based inferences (Eker et al., 2018;Funtowicz & Ravetz, 1990;Jakeman et al., 2006;van der Sluijs et al., 2004).
Our estimates improve existing ones in several ways. First, in model parsimony and transparency: our projection, which assumes that the growth of irrigated areas is a function of population size limited by future water and cropland availability, is summarized in a one-liner equation that can be readily reviewed. Furthermore, the model output holds against its main underlying parametric (7) and structural (7) uncertainties (Table 1). Information on model dimensionality, mathematical formalization or uncertainty assessment is lacking in previous projections (Alcamo et al., 2007(Alcamo et al., , 2005Alexandratos & Bruinsma, 2012;Fischer et al., 2007;Molden, 2007;Rosegrant et al., 2002;Shiklomanov & Rodda, 2003), making it hard to assess their robustness.
Although a systematic account is beyond the scope of this study, the evidence suggests that previous models might have indeed left several structural and parametric uncertainties unchecked. For instance, Alcamo et al.'s (2007) or OECD's (2012) proposal of a null increase of irrigated areas assumes the exhaustion of available cropland, the intensification of rainfed agriculture, increases in crop yields, and optimization of world trade. Shiklomanov and Rodda (2003) and Shen et al. (2008), in contrast, assume that past trends in population and irrigated area will continue in the future within the limits of available cropland . These two opposing narratives reflect alternative ways of framing the projection, a setting which can only be dealt with in the context of a stringent uncertainty and sensitivity analysis. Even basic model inputs such as current irrigated areas and future cropland available are treated by previous models as known parameters, despite strong evidence indicating otherwise (Meier et al., 2018;Zhang & Cai, 2011).
Second, our exercise provides a probabilistic estimation of future irrigated areas given the uncertainties involved in the models designed with this purpose. Our results indicate that current estimates are unreliable and should not be used to guide policies on irrigation. By overlooking uncertainties, previous studies failed to detect the considerable weight of the right tail of the output distributions, which can only be accounted for with a global sensitivity analysis (Saltelli et al., 2008). Varying all uncertainties simultaneously is especially important when the model under study is nonadditive (i.e., when it includes multiplications, exponents, etc). This is the case with the model presented in equation (4) and is likely to hold true also for any model projecting irrigated areas as a function of population. As a practitioner of sensitivity analysis would expect, the results of a modeling exercise then tend to exceed what one would envision from the linear mindset, leading to "surprises" (Taleb, 2007).
Third, our work characterizes for the first time the uncertainty involved in projecting irrigated areas. Figure 2 shows that it is mostly additive and yet it might not be easily reduced. In the case of the Americas, Asia, and Europe it is largely driven by the systemic unpredictability that characterises population dynamics, which might be hard to pin down. Even if a fertility-constraining policy is globally implemented, its effects might not be felt until way past 2050, suggesting that there are no easy ways to change the broad population trends (Bradshaw & Brook, 2014). Large-scale population control measures implemented without understanding the underlying demographical regime might also have unwelcomed and counterintuitive consequences (i.e., higher birth rates, as in India in the early 1950s; see Connelly, 2008). In the case of Africa, which presents a larger contribution of nonadditivities, canceling out the stochasticity in population dynamics would not make the projection of irrigated areas less sensitive to model uncertainties. This renders the addition of extra mathematical complexity unlikely to increase the accuracy of the model should all the parametric and structural uncertainty of the addition be also taken into account.
By casting a long shadow on the robustness of current predictions on irrigated areas, our results also loom large over all the studies that rely on these estimates to model any related variable or environmental phenomenon. This includes not only projections on irrigation water withdrawals (Alcamo et al., 2007;Döll & Siebert, 2002;Wada et al., 2016;Wisser et al., 2008) or CO 2 emissions (Chaturvedi et al., 2013) but also models on climate change. As a direct effect, it has been demonstrated that the extent of irrigation affects the amount of water vapor in the atmosphere and surface temperature (Boucher et al., 2004). Higher-order effects might include an increase in albedo and shortwave reflection (Cook et al., 2015) or the modification of downwind and cloud formation processes (Lo & Famiglietti, 2013), potentially affecting very remote regions