Using the Mid‐Holocene “Greening” of the Sahara to Narrow Acceptable Ranges on Climate Model Parameters

During the early to mid‐Holocene vegetation expanded to cover much of the present‐day Sahara. Although driven by a well‐understood difference in the orbital configuration, general circulation models have generally failed to simulate the required rainfall increase. One possible explanation is the presence of systematic biases in the representations of atmospheric convection which might also impact future projections. We employ a Bayesian method to learn from an ensemble of present day and mid‐Holocene simulations that vary parameters in the convection, boundary layer and cloud schemes. The model can reproduce the “Green Sahara” rainfall if mixing between convective plumes and the environment is increased in the upper troposphere relative to lower down. This does not appreciably impact the present day simulation, meaning that the paleoclimate reconstructions are able to narrow constraints on suitable parameter ranges. This suggests that other uncertain components of climate models could be targeted in this way.

To be useful in this way, a paleoclimate state or transition must be associated with a good understanding of both the underlying forcings (e.g., a change in greenhouse gas levels or a change in orbit) and the resultant impacts in the climate system.
The early to mid-Holocene (around 11,000-4,000 years before present) is frequently highlighted as such a period (Biasutti et al., 2018;Braconnot et al., 2012;Harrison et al., 2015). This is because the ultimate forcing during this time was a very well-understood change in the configuration of Earth's orbit (Berger & Loutre, 1991). The resulting increase in northern hemisphere summer insolation drove enhanced monsoon circulation and precipitation (Kutzbach & Street-Perrott, 1985). This led to a so-called "Greening" of the Sahara (e.g., Claussen et al., 2017). This is evidenced by the development of savanna or steppe-like vegetation (Hély et al., 2014), expansion of lakes and rivers (Kohfeld & Harrison, 2000;Skonieczny et al., 2015), a reduction in dust deposited over the Atlantic (de Menocal et al., 2000;Williams et al., 2016) and the presence of neolithic settlements and domesticated animals (Manning & Timpson, 2014).
Pollen and macro-fossil evidence suggests that annual mean precipitation increased by 1.1 mm day −1 0.1 (Bartlein et al., 2011) relative to the present day (meanstandard error for 15- 30 N). There is some uncertainty on the spatial pattern of change, with earlier compilations of pollen suggesting a relatively uniform vegetation change (Hoelzmann et al., 1998), while more recent datasets suggest greater changes in vegetation in the South compared with further North (Hély et al., 2014). The pollen samples have been integrated with a vegetation model to infer climate, showing the same overall precipitation increase (Wu et al., 2007). A larger rainfall increase of around 1.5 (0.9-2.8) mm day −1 has been inferred from marine core leaf-wax hydrogen isotope ratios (Tierney et al., 2017). A recent isotope-enabled modeling study suggests these isotope changes are consistent with a slightly smaller precipitation increase (Thompson et al., 2021). Despite these uncertainties, all lines of evidence agree on a minimum increase in precipitation of at least 0.7 mm day −1 (Joussaume et al., 1999) that enabled vegetation to grow across much of the present-day Sahara Pachur & Holzmann, 1991;Peyron et al., 2006;Ritchie & Haynes, 1987;Street-Perrott et al., 1990).
All GCM simulations driven with the orbital configuration for 6,000 years before present (6 ka BP), simulate an increase in precipitation, but almost always much smaller than these pollen observations imply over the Sahara itself (Braconnot et al, 2007(Braconnot et al, , 2012Brierley et al., 2020;Joussaume et al., 1999). This is also true when dynamic vegetation and/or dynamic dust processes are enabled (Harrison et al., 2015;Hopcroft & Valdes, 2019;Perez-Sanz et al., 2014). In contrast, when significant changes in the land surface albedo and/or significant reductions in dust aerosols are specified, sufficient rainfall can be simulated (e.g., Levis et al., 2004;Pausata et al., 2016;Skinner & Poulsen, 2016).
Saharan dust particles are less absorbing than is prescribed in most climate models which tend to use outdated radiative properties (Albani & Mahowald, 2019;Hopcroft & Valdes, 2019). A significant dust reduction during the mid-Holocene probably did not appreciably enhance convective precipitation (Hopcroft & Valdes, 2019). Moreover, the reduction in dust loading would have altered cloud formation through dustcloud interactions, and this has been shown to reduce stratiform precipitation (Thompson et al., 2019). Land-surface feedbacks can efficiently drive the monsoon northwards (Levis et al., 2004;Skinner & Poulsen, 2016;Texier et al., 2000) but there is little agreement on how the "greening" of the Sahara should be configured in models (Chen et al., 2020;Hopcroft et al., 2017;Lu et al., 2018;Street-Perrott et al., 1990;Texier et al., 2000). It is thus not trivial to judge whether or not a sufficient precipitation enhancement is achieved for the right reasons in model simulations of the mid-Holocene. The model-data disparity over North Africa has persisted for several decades across multiple GCMs (Biasutti et al., 2018). This suggests systematic biases that either require more detailed physical representations or a different approach to parameter choices.
In this work, we use the atmospheric component of the coupled GCM HadCM3 (Gordon et al., 2000;Pope et al., 2000;Valdes et al., 2017) which also does not simulate a "greening" of the Sahara under mid-Holocene boundary conditions (Braconnot et al., 2007). We use this GCM to evaluate what the model failure may reveal about the representation of precipitation in GCMs and to compare the parametric constraints from present-day and mid-Holocene climatic conditions.

General Circulation Model and Boundary Conditions
We use the Met Office Hadley Center atmosphere model 3 (HadAM3) with the MOSES 2.1 land surface scheme and prescribed vegetation cover (Essery et al., 2003;Pope et al., 2000), specifically HadAM3B-M2.1aN . This GCM has horizontal resolution of 3.75 × 2.5° (longitude-latitude) with 19 vertical levels. HadAM3 uses the mass-flux convection scheme by Gregory and Rowntree (1990) which is comparable in complexity to schemes used in several other GCMs (Maraun & Widmann, 2018;Stensrud, 2007). Relative to the published configuration of the model (here labeled ORIG), a revised (REV) model version was developed here that includes a humidity-dependence of the mixing and forced detrainment from convection following Derbyshire, et al. (2011), as implemented in more recent Hadley Centre models.
The preindustrial setup follows that of Valdes et al. (2017) with prescribed observed present-day vegetation coverage (Loveland et al., 2000) and preindustrial levels of greenhouse gases ( 2 CO , 4 CH and N 2 O). We use 1981-2010 climatological sea-surface temperatures (SSTs) and sea-ice from HadISST (Rayner et al., 2003(Rayner et al., , updated to 2010. For the mid-Holocene, the orbital parameters are modified for the conditions of 6 ka before present (BP) (Berger & Loutre, 1991). Sea surface temperatures (SST) and sea-ice are modified by adding the 6 ka minus preindustrial difference as simulated with the coupled model HadCM3B-M2.1aD, to the preindustrial HadISST climatology. The simulations setup is summarized in Table S1.
Today, the Sahara has a surface albedo of around 0.35 (Loeb et al., 2012). A reduction in albedo would strengthen the monsoon (Boos & Storelvmo, 2016;Charney, 1975;Street-Perrott et al., 1990;Texier et al., 2000). The mid-Holocene "greening" involved the northwards expansion of grasses and shrubs Hély et al., 2014). Satellite observations show that these biomes have an albedo of 0.17-0.25 when precipitation is in the range reconstructed for the "greening" (i.e., 1.1 mm day −1 Bartlein et al., 2011), see supporting information S1 and Figure S1. The Sahel which is at the periphery of the present-day West African monsoon is in the upper part of this range (0.2-0.3). This may present the best analogue for the mid-Holocene "greening." This higher end is also consistent with mid-Holocene simulations with dynamic vegetation and soils (e.g., Claussen & Gayler, 1997;Vamborg et al., 2011). Many model studies have prescribed a value at the very lower end of 0.15 (Chandan & Peltier, 2020;Levis et al., 2004;Pausata et al., 2016;Skinner & Poulsen, 2016), which is typically seen in regions of higher rainfall of 2.0-3.0 mm day −1 .
We do not use the HadCM3B-M2.1aD dynamic vegetation scheme as it is overly sensitive to arid conditions (Hopcroft et al., 2017). Instead bare soil in the Sahara region (from 10-35°N, 30°W-50°E) is replaced with grasses and shrubs with a total fractional coverage of 50%. This produces a surface albedo of 0.27 relative to 0.31 in the preindustrial simulation. This a relatively conservative change for the period since it is at the upper end of the range discussed above.

Perturbed Parameter Ensemble
We introduce a new variable E into HadAM3 which controls the vertical dependence of entrainment and mixing detrainment-the mixing of environmental air into the convecting air, and of convecting air into its environment. By default, the entrainment rate decays with altitude in proportion to pressure. This was intended as an ad-hoc representation of larger clouds, which proportionally mix less with their surrounding, reaching higher (Gregory & Rowntree, 1990). With the new parameter E we relax this assumption. Increasing E increases the upper troposphere entrainment values and reduces those near to the land surface. A value of zero returns the default proportional dependence on pressure (see supporting information S2).
We configured a 150-member perturbed parameter ensemble using the REV configuration of HadAM3. Eleven GCM parameters within the convection, boundary layer or large-scale cloud schemes were sampled. This includes three new parameters: E which controls the vertical profile of entrainment and detrainment, r det which sets the sensitivity of forced detrainment to the buoyancy gradient (Derbyshire et al., 2011), and  det which sets the sensitivity of detrainment to relative humidity (Derbyshire et al., 2011). The 11 model parameters are assigned different values globally leading to 150 paired ensemble members of preindustrial and mid-Holocene simulations. The parameter definitions and ranges used are given in Table S2 and illustrated schematically in Figure S2. The evaluations are selected using a Latin hypercube method (McKay et al., 1979), which distributes the parameter samples optimally across the 11-dimensional state space.

Statistical Modeling and Parameter Tuning
We used a Gaussian process emulator (e.g., Kennedy & O'Hagan, 2001) to construct a statistical representation of the perturbed parameter ensemble of GCM simulations. Emulators have been extensively used in analyzing complex numerical models like GCMs (Edwards et al., 2019;McNeall et al., 2016;Rougier et al., 2009;Sexton et al., 2012). The emulator represents some output as a linear function of the input parameter values combined with a Gaussian process (Roustant et al., 2012). In this way, the emulator interpolates in multidimensional parameter space to predict the GCM response at any combination of input parameter values. further details are given in supporting information Text S3.
We use a Bayesian method (see supporting information Text S3.2) to update the model parameters based on the mid-Holocene paleoclimate reconstructions. In a Bayesian method we compute a posterior probability distribution function (PDF) on the model parameters based on the prior PDF and the likelihood (e.g., Rougier, 2007). The prior is taken as the current model version and the likelihood quantifies the performance of the GCM for selected climate outputs such as simulated precipitation. Thus we condition the model parameters with the present-day observed climate variables and optionally the mid-Holocene rainfall increase.
The posterior PDF must be approximated using a Markov chain Monte Carlo method (Gilks et al., 1995) and since the MCMC algorithm requires many thousands of iterations, we use the emulator in place of the full GCM. We perform this process twice. First including four observational targets for the present day (Table S3) and second adding to this the mid-Holocene absolute precipitation over North Africa inferred from pollen data (Bartlein et al., 2011). Thus we derive a new parameter set suitable for both present and mid-Holocene conditions, which is different to Su and Neelin (2005) who used different parameter sets for the two time periods.

Sampling Convection, Clouds and Boundary Layer within a Global Model
The resultant precipitation anomalies for the 112 simulations that completed 50 model years are averaged over North Africa (  20 W-30°E by 15-30°N) in Figure 1. This region of North Africa includes many of the fossil pollen sites. All model simulations overestimate present-day precipitation in Africa and in North Africa in particular. This is a systematic bias in HadAM3 . Part of which is due to an underestimation of the soil albedo in the Sahara region in the model. The simulated difference for mid-Holocene minus preindustrial in precipitation in this region ranges from 0.7 mm day −1 to 2.6 mm day −1 for JJAS, when most precipitation falls. Many ensemble members with the preindustrial precipitation similar to the original (around 0.5 mm day −1 ), have much higher increases of about 2.0 mm day −1 for the mid-Holocene. The weak correlation between the two axes in Figure 1 shows that different factors influence precipitation in the preindustrial compared with the precipitation difference between the two time periods.
The parameter dependence of the mid-Holocene precipitation anomaly is shown in Figure S3. The most obvious relationship is with E, for which higher values result in larger changes. E controls the vertical profile of entrainment and detrainment, which is the rate of mixing of convective clouds with the surrounding air masses. In many models entrainment decays with altitude. High values of E increase the upper level entrainment and reduces it near the land surface. This produces a wetter mid-Holocene in North Africa.
The Gaussian process emulator is used to calculate influence of varying each parameter value individually on the simulated precipitation change over North Africa for the mid-Holocene. The result of this is shown  (20°W-30°E, 15°N-30°N) against the simulated mid-Holocene minus preindustrial precipitation (mm day −1 ) change. The observed precipitation in this region from CRU  based on years 1961-1990, is indicated by the shaded gray bar. The impact of using present-day versus preindustrial precipitation observations is likely to be small and less important than differences due to model biases and due to significant spatial gaps in the early instrumental observations.
in Figure 2 and the emulator skill is evaluated in Figure S4. We find that E is the dominant parameter. We examined the parameter dependence of the West African monsoon in the preindustrial ensemble members (over 5- 15 N) ( Figure S5). This shows that the parameters which exert the strongest control on precipitation in this monsoon region ( q ini, ct and  det ) are not the same as for the mid-Holocene anomaly relative to the preindustrial simulation (E, F, T ini and q ini).

Mechanisms of Enhanced Mid-Holocene Precipitation
Given the profound effect of changing E on the mid-Holocene North African precipitation, we ran a simulation with only one change from REV: increasing E from its default value of 0 to 0.25. Figure 3a shows the percentage difference in the mid-Holocene minus preindustrial (6 kaGS-0 ka) precipitation anomalies for the pair of simulations with E = 0.25 compared to the pair with E = 0. With E = 0.25 the precipitation anomaly is generally larger across North Africa and is nearly twice as large in the North West (Figure 3a). The latitude of the precipitation maximum moves northwards by around 2.5- 5 compared to the E = 0 simulation (not shown). It produces a more diffuse precipitation band during JJAS which pushes the periphery of the monsoon further into the dry interior.
A key diagnostic of the convection scheme is the updraught mass-flux. The simulated mean convective updraught over North Africa decreases fractionally much more with height than the mean over the wet regions of the tropics as a whole. This is presumably because of dilution by the extremely dry environment in North Africa, which makes it harder for moist convective plumes to persist. In all model versions the HOPCROFT ET AL.  increase in rainfall over North Africa at 6 ka BP is accompanied by the updraught mass flux weakening lower down and strengthening aloft (Figure 3b), becoming more like tropical moist convection elsewhere in the tropics. The mid-Holocene boundary conditions lead to less dilution of convective plumes low down, so that those that reach their lifting condensation level (LCL: indicated by vertical lines in Figure 3b) are more vigorous and end up penetrating higher overall.
The direct effect of increasing the parameter E, that is, decreasing the entrainment rate near the surface and increasing entrainment rate higher up, is to amplify these mass flux changes for the mid-Holocene relative to the preindustrial (Figure 3b), so that the REV (E = 0.25) case has a lower mass-flux near the surface than the REV model version, and a stronger updraught mass flux above the lifting condensation level (LCL: indicated by vertical lines in Figure 3b). Interactions between convection and its environment mean the net effect can be very different in other regions (Figure 3a), but over North Africa these changes reinforce each other to produce larger amplitude mass flux changes, and a stronger rainfall increase at 6 ka BP (Figure 3d).
In tandem with this, the circulation (zonal wind) and humidity anomalies associated with heavier downpours in the Sahara are different when E is given a higher value (Figures 3c and 3d). For E = 0, wetter days north of  15 N are associated with a strengthened tropical easterly jet (TEJ) and a slightly weakened Africa HOPCROFT ET AL.
10.1029/2020GL092043 6 of 12 easterly jet (AEJ), as observed (Nicholson, 2009). African Easterly Waves (AEWs) move along the jet and contribute precipitation to the North (Claussen et al., 2017) and this is well represented in HadAM3 (Taylor et al., 2002). For E = 0.25, the same precipitation increase is achieved with a 20% smaller increase in the TEJ. This suggests that convection is more effective in this model configuration and this partly explains the increased precipitation response for E = 0.25. Some of these downpours are also associated with tropical plumes (Knippertz, 2003), especially in the months of August-October (Dallmeyer et al., 2020;Skinner & Poulsen, 2016). When E is increased to 0.25, plumes contribute more precipitation in the region from 20- 35 N despite relatively similar mean climatologies of large-scale circulation and humidity.

Learning from the Model-Data Mismatch
The mid-Holocene pollen quantitative precipitation reconstruction over North Africa gives an annual-mean precipitation increase of 1.1  0.1 mm 1 day (Bartlein et al., 2011) relative to the present day. We use the annual mean reconstruction and model outputs in a probabilistic formulation to optimize the GCM so that it is consistent with the pollen-based precipitation reconstructions and hence the widespread environmental evidence for an invigoration of the hydrological cycle.
The posterior PDFs on the 11 parameters are shown in Figure S6. Two cases are considered where the second only differs with the inclusion of the mid-Holocene precipitation target. For the mid-Holocene the algorithm favors high E because as discussed above, it has an extremely strong impact on the response to the mid-Holocene insolation. The emulator predicted mid-Holocene precipitation increase is very different between the two cases, showing sensitivity of the system to parameter combinations and also that the optimization against present-day observations does not guarantee an improvement for the mid-Holocene.

New Model Version
One optimized parameter set derived from the PDFs on the model parameters ( Figure S6 and Tables S4 and S5) was used in new preindustrial and mid-Holocene GCM simulations and is denoted REVopt. In this we only changed parameters from their original GCM values where there is stronger preference posterior PDF. Whilst the choice of parameters which underline REVopt is based on the posterior PDF sampling, it would be more consistent with the Bayesian paradigm to think in terms of the probability distribution on the parameters, rather than to focus on any single parameter set. However, given computational limitations and to simplify the presentation of the results we mostly focus on the REVopt parameter set.
This simulation is compared with the ORIG and REV models in Figure 4. The difference between the two controls (ORIG and REV) is due to changes to the detrainment parametrization following Derbyshire et al. (2011). Figure 4c shows that in REVopt the precipitation anomaly over North Africa is approaching double that in the ORIG simulations for JJAS, and the annual mean anomaly is also 67% larger. There is a large region of precipitation increase across North Africa and Arabia and Northern India. This shows that the statistically-inferred parameter changes are effective when introduced into the GCM. Crucially, the present-day performance of REVopt is very similar to ORIG and REV for both temperature and precipitation (supporting information Text S4 and Figures S7 and S8). This means that the improvement for the mid-Holocene has not significantly altered the present-day simulation.
HOPCROFT ET AL.

Discussion
The strong precipitation increase during the early-to mid-Holocene in North Africa presents a unique challenge to climate simulations of tropical precipitation. In this perturbed parameter study we find that parameters controlling the preindustrial climatology of precipitation are different from those that determine the anomaly under a climate change scenario. The Bayesian approach demonstrates that a modified vertical profile of convective entrainment can significantly improve the simulation of the mid-Holocene North Africa despite having little impact on the simulation of the present day.
Like any model, HadCM3 has biases and simplifications. For example, HadCM3 suffers from having too little, too optically bright cloud cover (Massey et al., 2015), a common problem in CMIP5 models (Nam et al., 2012). It also has too much precipitation over Africa and too little over South America, but does not suffer from a double ITCZ bias or weak ENSO variability, which are common problems in many GCMs. Overall its performance compared to observations is typical of GCMs used in recent intercomparisons .
Structural limitations mean that many biases could be corrected by varying model parameters, and there are many more than the 11 we varied. Also, the existence of compensating errors means that tuning that improves one bias can actually exacerbate another. Despite this, in the REVopt case we significantly improved the mid-Holocene precipitation in comparison with reconstructions, without affecting the simulation of the present day state. It is possible that with a more comprehensive list of parameters, for example, of the order of 20-50, some of these other biases may be reduced.
Tropical precipitation in GCMs has recently been improved through the use of adaptive convective entrainment, whereby local entrainment rates reduce following convective activity. On a local scale this could produce a similar effect as in our study, reducing entrainment in the lower troposphere, following precursor convective plumes (Mapes & Neale, 2011;Willett & Whitall, 2017). Future work should consider how such dynamic entrainment parametrizations (e.g., Hohenegger & Bretherton, 2011;Mapes & Neale, 2011) could similarly improve modeling of the mid-Holocene and whether this is consistent with our statistically derived model changes.
Convection-permitting atmospheric model simulations, with high resolution and no convection parametrization, have highlighted further significant improvements when convection parametrizations are deactivated Birch et al., 2014;Finney et al., 2019;Kendon et al., 2019;Marsham et al., 2013;Pante & Knippertz, 2019). This includes a more realistic diurnal cycle of precipitation (Marsham et al., 2013), improved simulation of wet spells Kendon et al., 2019) and of cloud cover and humidity (Pante & Knippertz, 2019). Making this transition is not without drawbacks and substantial model errors can emerge (Pante & Knippertz, 2019) that are difficult to eliminate because of the high computational cost of test simulations. Future work could compare convection-permitting simulations and ensembles of GCM simulations like those studied here for both present day and paleoclimate conditions to identify further ways to improve GCM parametrization schemes.
Some have argued that the Green Sahara modeling problem may be resolved by altering model boundary conditions such as vegetation coverage or dust loading (Chandan & Peltier, 2020;Pausata et al., 2016). Here, we have shown that it is also possible to improve the simulation of precipitation through changes to the model parametrizations, many aspects of which are subject to significant uncertainty. This focus on the model itself would strengthen the rationale for evaluating the relationship between past and future climate change, given that very different boundary conditions characterize the past and future (orbital changes for the mid-Holocene, vs. greenhouse gases and land use for the near future). Some earlier (Claussen & Gayler, 1997) and more recent model studies (Dallmeyer et al., 2020) have successfully simulated the Green Sahara without prescribing large changes to the land-surface. We speculate that this shows that for a limited number of GCMs the convection schemes are compatible with these paleoclimatic states. In most other models including HadCM3 used here, the convection scheme may need to be specifically tuned. It would be informative to test the impacts of similar entrainment changes in other models, in transient Holocene simulations and under future climate scenarios.

Conclusions
Most climate models are developed and calibrated against modern climate. We have shown that there are multiple parameter sets which allow for a good simulation of the present day conditions, but that only a subset of these are also able to satisfy past criteria. This example provides a new quantitative demonstration of how a paleoclimate state may provide information of relevance to uncertainty in simulating future precipitation change. Paleoclimate may therefore be a significantly undervalued source of additional information for informing the parameter values and parametrization choices in GCMs as it has rarely been used in this way, see examples by Hopcroft and Valdes (2015) and DiNezio et al. (2016). Well documented climate states in the past thus may have the potential to be used in model development and this approach can include other important paleoclimate changes.