Most climate models predict a weakening of the North Atlantic thermohaline circulation for the 21st century when forced by increasing levels of greenhouse gas concentrations. The model spread, however, is rather large, even when the forcing scenario is identical, indicating a large uncertainty in the response to forcing. In order to reduce the model uncertainties a weighting procedure is applied considering the skill of each model in simulating hydrographic properties and observation-based circulation estimates. This procedure yields a “best estimate” for the evolution of the North Atlantic THC during the 21st century by taking into account a measure of model quality. Using 28 projections from 9 different coupled global climate models of a scenario of future CO2 increase (SRESA1B) performed for the upcoming fourth assessment report of the Intergovernmental Panel on Climate Change, the analysis predicts a gradual weakening of the North Atlantic THC by 25(±25)% until 2100.
 The thermohaline circulation (THC) is a global 3-dimensional belt of ocean currents that transports large amounts of heat and freshwater around the world [Manabe and Stouffer, 1999]. In the North Atlantic, it is manifested in a meridional overturning circulation (AMOC) which, through its northward transport of warm tropical waters by the Gulf Stream and North Atlantic Current, effectively contributes to the warming of Northern Europe [Trenberth and Caron, 2001; Rahmstorf, 2003]. Previous model projections [e.g., Manabe and Stouffer, 1993; Stocker and Schmittner, 1997; Intergovernmental Panel on Climate Change (IPCC), 2001] suggested that global warming may lead to a strong weakening or even to a complete disappearance of the AMOC, which would have serious impacts on the climate, the ecology and the economy of many countries surrounding the North Atlantic.
 The value of ensemble prediction is well established in numerical weather forecasting and seasonal climate prediction. An important outcome from the field of seasonal climate prediction is the demonstration of the superiority of the multi-model ensemble over any single model. This feature is quite universal and not restricted to any particular region or variable [Palmer et al., 2004]. Thus a multi-model ensemble is an effective method for sampling model uncertainties and for making more reliable forecasts. When applied to global change prediction, multi-model ensembles, however, yield a large uncertainty for the climate of the 21st century both globally and regionally [IPCC, 2001]. In particular, the future evolution of the AMOC is characterized by a large model spread: While some models simulate a rather strong weakening of the AMOC, other models are relatively stable and simulate either only a moderate or no change [IPCC, 2001; Gregory et al., 2005]. Here we investigate the behavior of the AMOC in the most recent greenhouse simulations conducted for the upcoming Fourth Assessment Report (AR4) of the Intergovernmental Panel on Climate Change (IPCC). Our aim is to improve the model projections of the AMOC by reducing the uncertainty. We do this by taking into account, in the assessment of the multi-model ensemble, each model's skill in simulating observation-based circulation estimates and observed climatological hydrographic conditions.
 Such an assessment methodology based on model skill to obtain more reliable forecasts has a long history in weather and seasonal forecasting [e.g., Fraedrich and Leslie, 1987; Fraedrich and Smith, 1989; Metzger et al., 2004]. A similar approach was followed by Murphy et al. , who analyzed surface air temperature in an ensemble of greenhouse simulations. They constrained the different model versions by a multi-variate climate prediction index derived from observations. A major result of this study is that the weighted probability density function of climate sensitivity based on model performance is narrower than the unweighted one, thus decreasing the uncertainty. Knutti et al.  investigated an ensemble of reduced complexity models using observed surface warming and ocean heat uptake as constraints and found a very broad range of AMOC responses.
2. Model Data and Observations
 We obtained results from 9 global climate models which were integrated as part of the AR4 of IPCC. All models were integrated using observed concentrations of greenhouse gases/aerosols from 1850 to present (scenario 20C3M). Future concentrations are prescribed according to IPCC-scenario SRESA1B until 2100. In this scenario, carbon dioxide concentrations rise to about 700 ppm by 2100, leading to a radiative forcing of about 6 W/m². Globally averaged surface air temperature increases by about 3°C by 2100 in a simplified model with an intermediate climate sensitivity [IPCC, 2001]. If multiple (ensemble) runs were available for an individual model we used the ensemble mean in the assessment.
 We evaluate the ocean component of the climate models in terms of the simulated global temperature (T), salinity (S) and pycnocline depth (D = ∫(ρmax − ρ) z dz / ∫(ρmax − ρ) dz) distributions during the period 1981–2000. The pycnocline depth is a dynamically important variable controlling the upper ocean flow [Gnanadesikan et al., 2002]. Since our goal is an assessment of the simulation of the Atlantic overturning circulation we use additional measures that characterize the Atlantic circulation. Sea surface temperature (SST) and salinity (SSS) in the North Atlantic depend strongly on the AMOC through its northward advection of warm and salty subtropical surface waters, an effect that is most pronounced between 40–70°N [e.g., Schmittner et al., 2002]. Therefore we use SST and SSS in this region as well as the pycnocline depth restricted to the Atlantic basin north of 35°S, since theory [Marotzke and Klinger, 2000] and model results [Hughes and Weaver, 1994] suggest that the density gradients within the Atlantic drive the overturning. Additionally, observation-based estimates of the mass flux at 24°N, at 48°N and of its maximum value are used as controls in the model assessment (see Table 1).
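The pycnocline depth integral defined above can be evaluated numerically on a discrete depth grid. The following is a minimal sketch in Python, assuming that ρmax is the maximum density found in the water column and that a trapezoidal rule over the model levels is adequate. The function name and the idealized linear profile are illustrative only and are not taken from the original analysis.

```python
def pycnocline_depth(depth, density):
    """Approximate D = ∫(ρmax − ρ) z dz / ∫(ρmax − ρ) dz with the trapezoidal rule.

    depth   : increasing depths in metres (positive downward)
    density : density at those depths in kg/m^3
    Assumption: ρmax is taken as the maximum density of the profile.
    """
    if len(depth) != len(density) or len(depth) < 2:
        raise ValueError("depth and density must have equal length >= 2")
    rho_max = max(density)
    deficit = [rho_max - rho for rho in density]  # (ρmax − ρ) at each level
    numerator = 0.0
    denominator = 0.0
    for k in range(len(depth) - 1):
        dz = depth[k + 1] - depth[k]
        # Trapezoid for the depth-weighted density deficit
        numerator += 0.5 * (deficit[k] * depth[k] + deficit[k + 1] * depth[k + 1]) * dz
        # Trapezoid for the density deficit itself
        denominator += 0.5 * (deficit[k] + deficit[k + 1]) * dz
    if denominator == 0.0:
        raise ValueError("profile has no stratification")
    return numerator / denominator


# Idealized test profile (hypothetical): density increases linearly
# from 1025 to 1028 kg/m^3 over H = 1000 m, for which D = H/3 analytically.
depth = [10.0 * k for k in range(101)]
density = [1025.0 + 3.0 * z / 1000.0 for z in depth]
d_linear = pycnocline_depth(depth, density)
```

For a linear profile the analytic result H/3 provides a simple check on the discretization; real model output would need the integral restricted to the relevant basin and grid.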
Table 1. RMS Errors for the Individual Models and the Resulting Weight Wa
RMS errors are normalized by the standard deviation of the observations. The weights gi for the individual variables in the calculation of the total weight are given. Bold numbers give the two best; italic numbers give the two worst models. Numbers in parentheses in columns 8–10 denote the absolute value of the circulation in Sv.
Number of ensemble runs for the 20C3M and SRESA1B scenarios, respectively.
 The skill score S is a combination of the normalized (by the standard deviation of the observations) root mean square (rms) errors of the above described variables weighted by gi (Table 1) in order to emphasize circulation estimates and the tracer distributions in the North Atlantic:
We have tested different versions of the skill score, e.g. including different choices for the gi, considering correlation coefficients and pattern rms errors. The main results were similar and therefore we restrict our discussion to the above formulation.
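The ingredients of such a skill score can be sketched in Python. The normalization of the rms error by the standard deviation of the observations follows the description above. The way the individual errors are combined here (a gi-weighted mean of squared errors) is an assumption made purely for illustration, and all function names and sample numbers are hypothetical.

```python
import math


def normalized_rms_error(model, obs):
    """RMS difference between model and observations, divided by the
    standard deviation of the observations."""
    n = len(obs)
    if n == 0 or len(model) != n:
        raise ValueError("model and obs must be non-empty and of equal length")
    mean_obs = sum(obs) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    rms = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    return rms / std_obs


def skill_score(errors, g):
    """Combine normalized rms errors into a single score.

    errors : dict variable name -> normalized rms error
    g      : dict variable name -> weight g_i for that variable
    Assumed form (not from the source): sqrt(sum(g_i * e_i^2) / sum(g_i)).
    """
    total = sum(g[name] for name in errors)
    return math.sqrt(sum(g[name] * errors[name] ** 2 for name in errors) / total)


# Hypothetical sample values
obs = [1.0, 2.0, 3.0, 4.0]
model = [1.5, 2.5, 3.5, 4.5]
err = normalized_rms_error(model, obs)
score = skill_score({"T": 0.0, "S": 2.0}, {"T": 1.0, "S": 3.0})
```

With this assumed form, a larger gi makes the corresponding variable dominate the score, which is the stated purpose of emphasizing the circulation estimates and the North Atlantic tracer distributions.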
 The weights W used in the assessment of the models are calculated based on a probabilistic approach assuming Gaussian statistics as from Murphy et al. :
In order to account for the fact that flux corrections may influence the transient model response and artificially increase the correspondence with the observations, we penalized model 1 (global flux correction) by multiplying its rms errors of T, S and D by 2, and model 9 (tropical flux correction) by multiplying them by 1.3.
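A Gaussian weighting with a flux-correction penalty can be sketched as follows. The sketch assumes that each model's weight is proportional to exp(−S²/2) and that the weights are normalized to sum to one; this specific form is an assumption and is not quoted from the text. The penalty factors (2 for model 1, 1.3 for model 9, applied to the rms errors of T, S and D) are those stated above, while the function names and sample errors are hypothetical.

```python
import math

# Penalty factors for flux-corrected models, as stated in the text
FLUX_PENALTY = {1: 2.0, 9: 1.3}


def penalize(model_id, errors, penalized=("T", "S", "D")):
    """Scale the rms errors of T, S and D for flux-corrected models."""
    factor = FLUX_PENALTY.get(model_id, 1.0)
    return {
        name: (err * factor if name in penalized else err)
        for name, err in errors.items()
    }


def gaussian_weights(scores):
    """Map skill scores S to normalized weights.

    Assumed form (not from the source): W proportional to exp(-S^2 / 2).
    """
    raw = {model: math.exp(-0.5 * s * s) for model, s in scores.items()}
    total = sum(raw.values())
    return {model: w / total for model, w in raw.items()}


# Hypothetical sample values
penalized_errors = penalize(1, {"T": 0.5, "flux24N": 0.2})
weights = gaussian_weights({"A": 0.5, "B": 1.0, "C": 3.0})
```

Under this assumed form, a model whose score is several units larger than the best score receives a weight close to zero, which is qualitatively the behavior reported for the poorest models.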
 Finally, the question arises whether the model responses should be scaled by the climate sensitivity. We did not find, however, any systematic relationship between climate sensitivity and AMOC response.
4. Model Assessment
Taylor diagrams display the correspondence of each model with the observations for the global fields of temperature, salinity and pycnocline depth (Figure 1a) and for North Atlantic SST, SSS and pycnocline depth (Figure 1b). The model data were normalized by the observed standard deviation. A “perfect” model would lie at the point (1,1) in the σ,R-plane of the Taylor diagram, i.e., it would have a normalized standard deviation σ of one and a correlation R of one with the observations.
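The two coordinates of a model in such a diagram can be computed as in the following Python sketch. It assumes unweighted statistics over a flattened field (no area or volume weighting, which a real ocean analysis would likely need); the function name and sample data are hypothetical.

```python
import math


def taylor_statistics(model, obs):
    """Return (sigma, R): the model standard deviation normalized by the
    observed one, and the pattern correlation with the observations."""
    n = len(obs)
    if n < 2 or len(model) != n:
        raise ValueError("model and obs must have equal length >= 2")
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    std_m = math.sqrt(sum((m - mean_m) ** 2 for m in model) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs)) / n
    return std_m / std_o, cov / (std_m * std_o)


# Hypothetical sample data
obs = [2.0, 4.0, 7.0, 11.0]
sigma_perfect, r_perfect = taylor_statistics(obs, obs)
# A model with the right pattern but twice the amplitude
sigma_double, r_double = taylor_statistics([2.0 * o for o in obs], obs)
```

A model identical to the observations lands at (1,1), whereas a model with the correct pattern but doubled amplitude keeps R = 1 and moves to σ = 2, which illustrates why both coordinates are needed.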
 In general, temperature is simulated more successfully by the models than salinity or pycnocline depth. Furthermore, global statistics exhibit less spread than those for the North Atlantic. In particular, all models simulate the global ocean temperature distribution quite realistically, with a normalized standard deviation close to unity and correlations above 0.9. Models 3 and 5 display a systematically too deep thermocline (not shown) resulting in larger rms errors (Table 1). The global salinity distribution is simulated less successfully than that for temperature: The correlations are much smaller and the standard deviations are off by at least 10% in most models. This suggests that the atmospheric hydrological cycle and/or sea ice are still not very well simulated in most models. Models 3 and 5 display a systematically too salty upper ocean and model 9 has no gradients below a few hundred meters depth (not shown). The correlations for the global field of pycnocline depth are similar to those for salinity but the rms errors are larger (Table 1), likely because errors for temperature and salinity may add. The North Atlantic statistics, which may be more relevant for the AMOC, exhibit a similar tendency: Temperature is simulated with more success than salinity, and the spread is larger for salinity. There appears to be no clear systematic relationship between global and North Atlantic statistics.
 The mass flux is inconsistent with the observations for models 6, 3 and 1. Model 6 has almost no deep water formation in the North Atlantic. The final weight W is almost zero for models 3 and 6 and very small for models 5 and 1. Models 2 and 8 are superior to the others.
 How do these model statistics affect the projection? We show in Figure 2 the index of the AMOC at 24°N. As in the report by IPCC , there is still a large spread in the model behavior. The initial states, the level of decadal variability, and the response to greenhouse warming all differ considerably among the models. The initial conditions, for instance, can differ by as much as 10 Sv (1 Sv = 10⁶ m³/s). Likewise, the level of decadal variability ranges from virtually no variability to a decadal standard deviation of several Sverdrups (Sv). The weighted model mean from 1980–1999 is consistent with the observations.
 The projection based on the weighted mean shows a linear weakening of the circulation from 15.7(±3.5) Sv during the last decade of the last century (1990–1999) towards 11.8(±2.9) Sv during the decade 2090–2099. This represents a decrease of about 25(±25)%. This result is remarkably robust with respect to alternative methods of calculating the model weights. The unweighted mean exhibits a similar reduction of 27%, from 14.2(±6.0) Sv to 10.3(±4.6) Sv, suggesting that the response of the AMOC does not depend much on the model performance. We can thus conclude that a considerable weakening of the AMOC can be expected until 2100. No individual model shows an abrupt collapse of the circulation during this century.
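The quoted percentage changes follow directly from the decadal means given above, as this short Python check illustrates. The helper name is illustrative, and the ± ranges are not propagated here.

```python
def percent_decrease(start, end):
    """Relative decrease from start to end, in percent."""
    return 100.0 * (start - end) / start


# Decadal means in Sv, taken from the text (1990-1999 and 2090-2099)
weighted_change = percent_decrease(15.7, 11.8)    # about 24.8, quoted as 25%
unweighted_change = percent_decrease(14.2, 10.3)  # about 27.5, quoted as 27%
```

Both means drop by the same absolute amount, 3.9 Sv; the percentages differ only because the unweighted mean starts from a lower value.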
 There are some problems with our methodology. First, it remains to be shown that simulating correct climatological mass fluxes, and temperature, salinity or pycnocline depth patterns can really improve AMOC prediction. As small-scale processes such as convection in the Labrador Sea or the overflows across the Greenland-Iceland-Scotland ridge system may influence AMOC behaviour, an approach based more on the important physical mechanisms to evaluate the climate models would be desirable. As such our methodology can be regarded only as a first step in the direction of a more refined model evaluation. The lack of an advanced ocean observing system, however, makes this a challenge. Strategies developed for ocean model evaluation, e.g. natural or artificial tracer distributions as in the OCMIP exercises [Doney et al., 2004], should be extended to coupled models. Evaluation of the time dependent AMOC response should be attempted in the future. This will be possible using direct observations for the last century, and for the more distant past by including the increasing data base of paleo ocean circulation changes (e.g. during the last glacial maximum).
 An interesting outcome of our study is the result that the weighted model means are very similar to the unweighted ensemble mean. Does this mean the effort of weighting the models is useless? Whereas the evolution of the weighted and unweighted means is similar, the weighted standard deviation is generally smaller than the unweighted (Figure 2, top panel). This suggests that our method of assessing the models with observations can indeed reduce the uncertainty in the projection.
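How weighting can narrow the ensemble spread without moving the mean very far can be illustrated with a small Python sketch. It assumes the usual weighted mean and a weighted (population-type) standard deviation; whether the original analysis used exactly this estimator is not stated, and the AMOC values and weights below are invented for illustration.

```python
import math


def weighted_mean_std(values, weights):
    """Weighted mean and weighted standard deviation of an ensemble.

    Assumed estimator: variance = sum(w * (x - mean)^2) / sum(w).
    """
    if len(values) != len(weights) or not values:
        raise ValueError("values and weights must be non-empty and of equal length")
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    variance = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, math.sqrt(variance)


# Hypothetical ensemble: AMOC strength in Sv for four models, one of which
# is an outlier that receives a very small weight.
amoc = [18.0, 15.0, 16.0, 6.0]
skill_weights = [0.30, 0.40, 0.29, 0.01]

mean_w, std_w = weighted_mean_std(amoc, skill_weights)
mean_u, std_u = weighted_mean_std(amoc, [1.0] * len(amoc))
```

In this invented example the down-weighted outlier contributes little to the weighted variance, so the weighted standard deviation is smaller than the unweighted one.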
 Finally, what can we conclude for the stability of the North Atlantic THC under increased levels of greenhouse gas concentrations? First, a significant weakening of the AMOC is to be expected until 2100. Second, this change will evolve gradually; no model simulates an abrupt change. These two findings are consistent with the study of Gregory et al. , who analyzed another type of greenhouse warming simulations. Third, the anthropogenically induced change in the North Atlantic THC is unlikely to leave the range of natural variability during the next several decades. This was also concluded by Curry et al.  from an analysis of ocean observations of the last 50 years and by M. Latif et al. (Is the thermohaline circulation changing, submitted to Journal of Climate, 2005) from an investigation of the SSTs of the last century. We note, however, that increased melting from the Greenland ice sheet, a process not included in present climate models, may induce an additional freshwater forcing for the North Atlantic and accelerate the weakening of the AMOC during the 21st century.
 This work was supported by the German CLIVAR and European ENSEMBLES projects and the SFB 460. We acknowledge the international modeling groups for providing their data for analysis, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for collecting and archiving the model data, the JSC/CLIVAR Working Group on Coupled Modelling (WGCM) and their Coupled Model Intercomparison Project (CMIP) and Climate Simulation Panel for organizing the model data analysis activity, and the IPCC WG1 TSU for technical support. The IPCC Data Archive at Lawrence Livermore National Laboratory is supported by the Office of Science, U.S. Department of Energy. The help of Frank Kösters in the early stages of this study is gratefully acknowledged.