While ocean observations of temperature and salinity extend back to the 19th century, their observation count, as well as geographical and vertical distributions all changed dramatically between successive decades. Similarly, atmospheric observations were unevenly distributed in space and time. This study explores the usefulness of past oceanic and atmospheric observing systems to detect extreme climate events through a set of observing system simulation experiments. In these experiments an initial simulation of the evolving ocean state during 1995–1998 (Nature Run) is sub-sampled using the same distribution of surface and subsurface observations as exists in successive decades. The result is a set of synthetic ocean observation re-samples of the massive mainly tropical/subtropical climate anomalies of the 1995–1998 years. These synthetic observation re-samples are then assimilated into a general circulation ocean model using a conventional assimilation scheme. In one set of experiments the model used in data assimilation is driven with climatological forcing to mimic the effects of poorly specified surface forcing. The results indicate that prior to the 1940s the historical observing network alone was only able to resolve limited aspects of tropical/subtropical variability. In contrast, by the 1960s the observing system was sufficient to resolve variability without additional wind information. In a second set of assimilation experiments surface meteorological forcing is improved to an extent consistent with meteorological error estimates for past decades. When this historical surface forcing is also included the results suggest that this extreme climate variability is reproducible even back to the early years of the 20th century. The paper concludes with a discussion of the implications of several simplifying assumptions used to obtain these optimistic results.
 Reanalyses of ocean circulation variations have found increasing application in climate studies in recent years. Most such reanalyses limit their temporal span to the relatively data-rich recent decades, partly because of concerns about data sampling. One exception is the study of El Niño events throughout the 20th Century by Giese and Ray . Attempts to reconstruct circulation variations in earlier decades face two problems. The first problem is the limited availability of surface meteorological forcing and information about the error characteristics. This problem is being addressed by the 20th Century Reanalysis project (20CRv2) of Compo et al. . The second problem is the limited distribution of historical surface and subsurface oceanographic observations, whose numbers, horizontal and vertical distributions, and errors have all varied dramatically in time.
 The current study presents results from observing system simulation experiments (e.g., following Atlas et al. ) to explore the expected accuracy of historical reanalysis of tropical/subtropical ocean climate variability during the 20th and early 21st centuries. The simulation (known as the Nature Run) is sub-sampled to create ‘synthetic observations’ in a pattern resembling the historical temporal and geographic pattern and type of observations. These synthetic observations are used in a data assimilation analysis driven by degraded surface forcing starting from inaccurate initial conditions in an attempt to reconstruct the variability in the Nature Run. To allow the results from different experiments to be compared and to explore their usefulness in detecting extreme climate anomalies the historical sampling patterns and surface meteorological forcing accuracies are applied to the four year base period 1995–1998.
 The 1995–8 base period is chosen because it includes some of the most striking climate anomalies of the century, including the development and decay of the massive tropical El Niño/La Niña of 1997–8, which had global teleconnections [McPhaden, 1999]. The 1997 El Niño began early in the year with anomalous weakening of the trade winds along the Pacific Equator and corresponding warming of SST. These anomalies grew and expanded geographically throughout the boreal summer of 1997, so that by fall the seasonal upwelling along the coast of Peru was suppressed. At the height of the El Niño (which for heat content occurred in November, 1997) SSTs were more than 4°C above normal in the cold tongue region of the eastern equatorial Pacific with 0/300 m heat content anomalies of up to 2°C [Hasegawa and Hanawa, 2003]. The most extreme monthly values of the Southern Oscillation Index of zonal sea level pressure gradient occurred two months later in early 1998. The current study begins with a numerical simulation (Nature Run), carried out using a model described later, whose variability along the equator is shown in Figure 1 (an estimate of the observed changes is superimposed). Off the equator the teleconnections of the 1997/8 El Niño conditions extended globally. For example, intensification of the easterly trade winds over the Indian Ocean was linked to a westward shift of ocean heat and a strong east-west SST dipole [Saji et al., 1999; Webster et al., 1999; Murtugudde et al., 2000]. In the Atlantic atmosphere hurricane formation was suppressed as the vertical shear of tropospheric winds increased, while SST in the southern tropical Atlantic was anomalously cool.
 The demise of the El Niño occurred abruptly in boreal spring of 1998 when the trade winds intensified and SST dropped at some locations by as much as 8°C in a month, a change that is also evident in the Nature Run (Figure 1). The tropical Pacific then rapidly transitioned into cool La Niña conditions, with near-surface cooling throughout the first half of 1998 and high values of the Southern Oscillation Index (>2.0), reflecting a strengthened zonal sea level pressure gradient, and increased frequency of North Atlantic hurricanes in the second half. These La Niña conditions persisted into 2000.
 The ocean observations we focus on are contained in the combined set of global surface and subsurface records of historical ocean temperature and salinity observations (Figures 2 (top) and 2 (bottom)). Prior to the 1940s the number of temperature and salinity profiles rarely exceeded 500/month and many of these measurements were confined to shallow depths that did not penetrate the thermocline. Geographically, most of the measurements were taken in the northern oceans, particularly the North Atlantic and the western North Pacific. An interesting exception is the extensive data collected during the Meteor expeditions of 1925–1927 which surveyed both hemispheres of the Atlantic [Defant, 1981]. Most observations were collected using instruments such as reversing thermometers and water sample bottles.
 By the 1930s the corresponding set of SST measurements is most extensive along ship routes in the northern oceans and, interestingly, across the Northern Indian Ocean. In the late 1930s and 1940s the introduction of the mechanical bathythermograph (MBT) led to a gradual increase in coverage throughout the northern oceans, although much of the additional data is at depths shallower than 200 m. In the years following World War II the descendant of the MBT, the expendable bathythermograph (XBT), led to a massive increase in temperature information. Like the MBT the XBT is limited to measuring temperature, but can be deployed from a moving ship, which led to a dramatic increase in instrument numbers. However, concerns have developed over the presence of uncorrected time-dependent bias [Levitus et al., 2009], prompting efforts to correct XBT bias in ocean reanalyses [Giese et al., 2011]. The most dramatic changes in observation coverage have come from the deployment of the tropical mooring array, first in the Pacific and then successively in the Atlantic and Indian Oceans, and the Argo float system initiated after 2000. Argo, which when first deployed was limited to the upper 1000 m but has since been extended to 2000 m, has resulted in a massive increase in both the number and spatial coverage of subsurface temperature and, most dramatically, salinity observations [Roemmich et al., 2009].
 In this article we explore an initial series of observing system simulation experiments of the type described above using the observing array for five decades: 1925–8, 1945–8, 1965–8, 1995–8, and 2005–8. For each decade we explore the extent to which the historical observing system could detect a climate anomaly as substantial as the events of 1997/8.
2. Methods and Data
 The Nature Run is carried out using a global ocean model based on Parallel Ocean Program version 2.01 software [Smith et al., 1992]. The horizontal resolution, averaged globally, is 0.4° × 0.25° with 40 vertical levels (10 m resolution near-surface) and a fully resolved Arctic Ocean. Vertical mixing is based on the Large et al.  K-Profile Parameterization, while horizontal diffusion is modeled using biharmonic mixing. Rivers are included with climatological seasonal discharge. There is no explicit sea ice model, although surface heat flux is modified to simulate its climatological seasonal impact. The model includes an explicit free surface.
 Surface forcing from the 20CRv2 reanalysis of Compo et al.  drives the ocean model beginning in year 1871, with surface wind stress supplying the surface momentum fluxes. 20CRv2 assimilates only surface observations of synoptic pressure and monthly SST and sea ice distribution from the Hadley Center HadISST1.1 data set [Rayner et al., 2003]. Solar radiation, specific humidity, cloud cover, 2 m air temperature, precipitation and 10 m wind speed are used in the bulk formulae for computing heat and freshwater fluxes. The 20CRv2 reanalysis is constructed using an ensemble Kalman filter with a 6 h cycle and 56 ensemble members, for which the mean of the ensemble members is the analysis and the spread of the ensemble members is the uncertainty. The ensemble error covariance is inflated following Anderson and Anderson , a procedure which partially accounts for the presence of both model and observation sampling error.
 As the 20CRv2 reanalysis progresses into later years of the 20th century the ensemble spread in the subtropics gradually collapses. This process is illustrated in Figure 3 (top), which shows the vertical component of the curl of the surface wind stress averaged over the North Atlantic subtropical gyre. The curl, which is related to the atmospheric torque that drives the gyre circulation, undergoes significant decadal and multidecadal fluctuations. Prior to the 1920s the amplitude of these fluctuations falls within the envelope of the ensemble spread, meaning that the surface stress forcing of subtropical gyre fluctuations remains essentially unknown. But beginning in the 1920s the spread of the 56 ensemble members is sufficiently reduced to clearly distinguish decadal variability in the reanalysis. Tropical zonal winds are more strongly constrained by SST (e.g., as represented in the simple model of Gill ) than subtropical winds. Thus the uncertainty associated with an index of tropical Pacific zonal wind stress anomaly (Figure 3, bottom) reduces only rather slowly through the decades.
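The comparison between ensemble spread and signal amplitude described above can be illustrated with a minimal sketch. It assumes that "spread" means the sample standard deviation across members (the text does not say which measure is used), and the function name and synthetic index below are hypothetical, not part of 20CRv2.

```python
import numpy as np

def ensemble_mean_and_spread(members):
    """Return the ensemble mean (the analysis) and the spread across members.

    members: array of shape (n_members, n_times).
    Spread is taken here as the sample standard deviation, which is an
    assumption made for illustration only.
    """
    members = np.asarray(members, dtype=float)
    return members.mean(axis=0), members.std(axis=0, ddof=1)

# Synthetic example: 56 members sharing one slowly varying signal,
# each offset by a fixed member-dependent error.
n_members, n_times = 56, 120
base = np.sin(2 * np.pi * np.arange(n_times) / 60.0)   # hypothetical index
offsets = np.linspace(-1.0, 1.0, n_members)             # hypothetical errors
members = base[None, :] + offsets[:, None]

mean, spread = ensemble_mean_and_spread(members)

# The signal is distinguishable only where its amplitude exceeds the spread.
detectable = np.abs(mean) > spread
```

In this toy case the spread is constant in time, so only the peaks of the signal rise above it; a shrinking spread would make progressively more of the record detectable.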
 Three different surface forcing fields (winds, freshwater, and heat flux) are derived from 20CRv2: 1) climatological monthly forcing based on the ensemble mean, which is used in the first set of experiments discussed below; 2) historical monthly forcing based on the ensemble mean, which is used in the Nature Run; and 3) degraded historical monthly forcing. This degraded historical forcing is constructed by adding the difference between an individual ensemble member and the ensemble mean from one era to the ensemble mean of a different era. This degraded surface forcing is used in the second set of experiments discussed below. One limitation of this procedure is that it does not take account of the possible correlation of the ensemble spread with the phase of ENSO.
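The construction of the degraded forcing can be written compactly. The following is a minimal sketch of that arithmetic, assuming the forcing is held in NumPy arrays with matching shapes; the function name, variable names, and synthetic values are hypothetical.

```python
import numpy as np

def degraded_forcing(target_mean, source_member, source_mean):
    """Add one member's departure from its own era's ensemble mean
    to the ensemble mean of a different (target) era."""
    perturbation = (np.asarray(source_member, dtype=float)
                    - np.asarray(source_mean, dtype=float))
    return np.asarray(target_mean, dtype=float) + perturbation

# Synthetic monthly index, 48 months (four years), 56 members.
rng = np.random.default_rng(0)
n_members, n_months = 56, 48
ens_1920s = 0.05 + 0.02 * rng.standard_normal((n_members, n_months))
mean_1920s = ens_1920s.mean(axis=0)
mean_1990s = 0.05 + 0.01 * np.sin(2 * np.pi * np.arange(n_months) / 12)

# 1995-8 ensemble-mean forcing carrying 1920s-sized noise from member 3.
w1920s = degraded_forcing(mean_1990s, ens_1920s[3], mean_1920s)
```

Because the perturbation is drawn independently of the target era's state, this sketch shares the limitation noted in the text: any dependence of the spread on ENSO phase is lost.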
 The observing system simulation experiments use the sequential Simple Ocean Data Assimilation (SODA) of Carton and Giese , briefly described here. The assimilation cycle is carried out every 10 days, but the corrections are introduced incrementally every time step, in which an analysis at time t is followed by a five-day simulation [Bloom et al., 1996]. On day t + 5 the assimilation package is called to produce estimates of the temperature and salinity updates. The data window for this assimilation spans ±45 days (although observations at large time lags have reduced influence on the estimates). Then a simulation is carried out for 10 days beginning at time t with temperature and salinity corrections added incrementally to produce the final analysis for the 10-day period (from time t to t + 10). This procedure has the advantage of maintaining a nearly geostrophic relationship between the pressure and velocity fields with a minimum excitation of spurious gravity waves. Model output, such as temperature, salinity, and velocity, is averaged by month and is mapped onto a uniform global 0.5° × 0.5° × 40-level grid using the horizontal grid Spherical Coordinate Remapping and Interpolation Package with second order conservative remapping [Jones, 1999].
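The timing of this cycle (first guess to day t + 5, analysis at mid-cycle, then a 10-day rerun from t with the correction spread over the time steps) can be sketched with a toy scalar model. This is an illustration of the incremental update idea only, not the SODA code: the relaxation model `step`, the scalar `gain`, and the one-day time step are all hypothetical stand-ins.

```python
def step(x, dt=1.0, tau=30.0, x_clim=0.0):
    """Toy one-day model step: relax toward climatology (placeholder for the ocean model)."""
    return x + dt * (x_clim - x) / tau

def iau_cycle(x_t, obs_mid, gain=0.5, cycle_days=10):
    """Run one assimilation cycle with incrementally applied corrections.

    x_t:     state at the start of the cycle (time t)
    obs_mid: observation valid at mid-cycle (time t + cycle_days / 2)
    gain:    scalar weight standing in for the real analysis step
    """
    # 1. First-guess simulation from t to mid-cycle.
    x_fg = x_t
    for _ in range(cycle_days // 2):
        x_fg = step(x_fg)

    # 2. Analysis at mid-cycle gives the total correction.
    increment = gain * (obs_mid - x_fg)

    # 3. Rerun from t over the full cycle, adding an equal share each step.
    x = x_t
    for _ in range(cycle_days):
        x = step(x) + increment / cycle_days
    return x, increment

x_final, inc = iau_cycle(x_t=1.0, obs_mid=2.0)
```

Spreading the correction over the cycle rather than inserting it at once is what limits the shock to the model, which is the advantage the text attributes to the procedure.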
 The temperature and salinity data used in the data assimilation are derived in the following way. Vertical profile data locations, measurement types (temperature and/or salinity), and depths are obtained from the most recent release of the World Ocean Database 2009 (WOD09) [Johnson et al., 2009] with locations of surface temperature samples obtained from ICOADS 2.5. For each of five four-year intervals (1925–8, 1945–8, 1965–8, 1995–8, and 2005–8), at each profile observation location in WOD09 or surface measurement in ICOADS 2.5 during these years, a synthetic observation is created of the same variable (temperature or salinity), month, and depth range as the original observation, but with the year date shifted so the observations sample the Nature Run during 1995–1998.
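The year shift and sampling step might look like the following minimal sketch. It assumes that each four-year interval maps year by year onto 1995–1998 and that the Nature Run is sampled at the nearest grid point; the text specifies neither detail, and all function names are hypothetical.

```python
import numpy as np

BASE_START = 1995  # first year of the Nature Run base period

def shift_year(year, era_start):
    """Map a year within a four-year era onto the 1995-1998 base period."""
    offset = year - era_start
    if not 0 <= offset <= 3:
        raise ValueError("year lies outside the four-year era")
    return BASE_START + offset

def sample_nature_run(field, lats, lons, obs_lat, obs_lon):
    """Nearest-grid-point sample of a 2-D field (longitude wrap-around ignored)."""
    j = int(np.argmin(np.abs(np.asarray(lats) - obs_lat)))
    i = int(np.argmin(np.abs(np.asarray(lons) - obs_lon)))
    return field[j, i]

# A profile taken in March 1926 samples the Nature Run in March 1996.
synthetic_year = shift_year(1926, era_start=1925)

# Tiny synthetic grid standing in for one month and level of the Nature Run.
lats = np.arange(-2.0, 3.0)          # -2 ... 2
lons = np.arange(0.0, 5.0)           #  0 ... 4
field = lats[:, None] * 10.0 + lons[None, :]
value = sample_nature_run(field, lats, lons, obs_lat=1.2, obs_lon=2.9)
```

Because the synthetic value is read directly from the model field, it carries no observation error.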
 The 1920s observing array includes nearly 18,000 profiles including extensive sampling in the northwest Pacific, the Meteor expedition in the Atlantic and various North Atlantic experiments. The 1940s observing array begins with extremely limited sampling in 1945 during World War II, but rapidly grows after 1946 to a total of over 26,000 profiles, including some in the Arctic and Southern Ocean. In the late 1940s Ocean Weather Station time series were set up in the North Atlantic and Pacific [e.g., Østerhus and Gammelsrød, 1999]. By the 1960s the observing array has grown through expansion of both scientific experiments and volunteer observing ship sampling to over 184,000 profiles during the four year period. In the 1990s the total observation count is similar (195,000), but by the 2000s the observation count increases dramatically (908,000) due to the deployment of the Argo system.
 These five sets of synthetic observations are then assimilated in a set of five observation-alone experiments (listed in Table 1), which are driven by climatological surface forcing. Thus in these experiments all interannual variability may be ascribed to the data assimilation. In addition, a numerical simulation called the Climatology Experiment is carried out, forced with the same climatological surface forcing used in the observation-alone experiments.
Table 1. Experiments. All experiments except the Nature Run begin from initial conditions provided by a simulation driven with climatological surface forcing. The Nature Run is forced with observed surface forcing beginning in January, 1871. Observation Alone Experiments combine data assimilation with a climatologically forced model. Observation and Wind Experiments each have three ensemble members that combine data assimilation with a model driven by different representations of observed forcing.
Nature Run: simulation begins Jan 1871
Observation Alone Experiments
1920s: SST and profiles from 1925 to 1928
1940s: SST and profiles from 1945 to 1948
1960s: SST and profiles from 1965 to 1968
1990s: SST and profiles from 1995 to 1998
2000s: SST and profiles from 2005 to 2008
Observation and Wind Experiments (3 Ensemble Members of Each)
W1920s: forcing for 1995–8 plus 1920s noise; SST and profiles from 1925 to 1928
W1990s: forcing for 1995–8 plus 1990s noise; SST and profiles from 1995 to 1998
 An additional set of two observation and wind experiments is carried out, each with three ensemble members (Table 1). In these experiments historical surface forcing is provided for the period 1995–8 in addition to the data assimilation, but degraded as described above to reflect the uncertainties associated with the spread of the 20th century reanalysis. Two such experiments are presented: Expt. W1920s, reflecting the uncertain forcing and limited data coverage in 1925–8, and Expt. W1990s, reflecting the more accurate forcing and expanded data coverage of 1995–8. Anomalies from the climatological monthly cycle are computed by removing the monthly climatology computed from the Climatology Experiment, as well as the time mean.
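The anomaly calculation can be sketched as follows. This is one reading of the procedure, assuming monthly series that start in January and span whole years, and interpreting "the time mean" as the mean of the series remaining after the monthly climatology is removed; the function name and synthetic series are hypothetical.

```python
import numpy as np

def monthly_anomalies(series, clim_run):
    """Remove the monthly climatology of a reference run, then the time mean.

    series, clim_run: arrays with time as the first axis, starting in
    January and covering a whole number of years.
    """
    series = np.asarray(series, dtype=float)
    clim_run = np.asarray(clim_run, dtype=float)
    if series.shape[0] % 12 or clim_run.shape[0] % 12:
        raise ValueError("time axis must span whole years")

    # Average all Januaries, all Februaries, ... of the reference run.
    monthly_clim = clim_run.reshape(-1, 12, *clim_run.shape[1:]).mean(axis=0)

    # Repeat the 12-month cycle to match the length of the series.
    reps = (series.shape[0] // 12,) + (1,) * (series.ndim - 1)
    anomalies = series - np.tile(monthly_clim, reps)

    # Remove the remaining time mean.
    return anomalies - anomalies.mean(axis=0)

# Synthetic four-year example: seasonal cycle plus a step-like signal.
months = np.arange(48)
seasonal = np.cos(2 * np.pi * months / 12)
signal = np.where(months >= 24, 1.0, -1.0)   # zero time mean by construction
experiment = 5.0 + seasonal + signal
climatology_experiment = 3.0 + seasonal

anom = monthly_anomalies(experiment, climatology_experiment)
```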
 We begin by focusing on 0/300 m heat content variability, which we choose as a proxy for the thermal inertia of the upper ocean on interannual timescales, and SST. We exclude from our discussion regions south of 30°S because of lack of observational sampling until recently. Poleward of the tropics this 300 m depth layer is wholly above the thermocline whereas in the tropics the thermocline generally lies within the upper 300 m. As a result, 0/300 m heat content variability in the tropics largely reflects changes in the depth of the thermocline whereas the relationship between these variables is less apparent at higher latitudes. When comparing variability among experiments (shown in the first part of section 3) the output is smoothed with six passes of a five point top hat smoother to reduce the impact of slight differences in the locations of fronts and eddies and to emphasize differences in basin-scale structures. This smoother eliminates variability for wavelengths of 4° or shorter but reduces RMS variability for wavelengths longer than 16° by less than 20%.
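The quoted filter characteristics can be checked against the analytic response of a repeated running mean. The sketch below assumes the smoother is applied along one dimension on the 0.5° output grid, which is not stated explicitly; function names are hypothetical. Under those assumptions six passes of a five-point top hat retain roughly 1% of the amplitude at a 4° wavelength and roughly 79% at 16°, broadly in line with the figures given above.

```python
import numpy as np

def tophat_response(wavelength_deg, dx_deg=0.5, width=5, passes=6):
    """Amplitude response of `passes` applications of a `width`-point running mean."""
    k = np.pi * dx_deg / np.asarray(wavelength_deg, dtype=float)
    single_pass = np.sin(width * k) / (width * np.sin(k))
    return single_pass ** passes

def smooth(values, width=5, passes=6):
    """Apply the running mean repeatedly (values near the ends are zero-padded)."""
    kernel = np.ones(width) / width
    out = np.asarray(values, dtype=float)
    for _ in range(passes):
        out = np.convolve(out, kernel, mode="same")
    return out

response_4deg = tophat_response(4.0)
response_16deg = tophat_response(16.0)

# Numerical check on a 16-degree sine wave sampled every 0.5 degrees.
x = np.arange(0.0, 320.0, 0.5)
wave = np.sin(2 * np.pi * x / 16.0)
interior_amplitude = np.abs(smooth(wave)[50:-50]).max()
```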
 Heat content in the Nature Run shows substantial 0.5–2°C subseasonal variability in the western boundary and outflow regions of the Gulf Stream and Kuroshio as well as in the tropical Pacific and Indian Oceans (Figure 4a, top left). Comparison to the overlying contours reassures us that the anomaly pattern in the Nature Run closely resembles the SODA reanalysis. The corresponding variability of subseasonal SST is geographically broader, covering much of the North Pacific and North Atlantic (Figure 4b). In the tropics variability is primarily due to changes in Pacific Ocean temperature resulting from the El Niño/La Niña of 1997–1998 together with changes in Indian Ocean temperature associated with a massive dipole-related westward shift of heat. Heat content variability is distributed uniformly in the zonal direction, while SST variability is concentrated in the eastern cold tongue regions of the tropical Pacific and Atlantic and the cooler western Indian Ocean (Figure 4b).
 Heat content variability in the five experiments roughly divides into two groups (Figure 4a). The 1920s and 1940s experiments have heat content variability in the western boundary current systems of approximately the correct magnitude with only weak <1°C tropical variability. Interestingly, the 1920s experiment does show some variability in the western north tropical Pacific. The remaining three experiments show significant variability in the tropical Pacific and Indian Ocean as well as the western tropical Atlantic. A consistent feature in all experiments is somewhat weak variability relative to the Nature Run within a few degrees of the equator, presumably because of the impact of using climatological seasonal winds. In contrast to heat content, SST variability is qualitatively similar among the five experiments (Figure 4b). All five show ENSO variability in the eastern tropical Pacific, although somewhat reduced in the 1920s and 1940s experiments. All show significant variability in the North Pacific and North Atlantic.
 Since the variability in the North Pacific and Atlantic is rather similar we next consider the growth of RMS difference between heat content variability from the experiments and the Nature Run and Climatology Experiment averaged across both basins (0°-60°N, 120°E-30°E) as a function of time (Figure 5a, top). These results again fall into two groups. For the 1920s and 1940s experiments the RMS difference from the Climatology Experiment increases from zero rapidly in the first couple of months to >0.5°C and then more slowly throughout the four-year period toward 1°C. In contrast, the 1960s, 1990s, and 2000s experiments all develop differences from the Climatology Experiment within the first two months that exceed 1°C and that rapidly rise to about 1.5°C, similar to the RMS difference between the Nature Run and the Climatology Experiment (1.43°C).
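A basin-averaged RMS difference of this kind could be computed as in the sketch below. The cosine-latitude area weighting and the use of NaN to mark land are assumptions made for illustration, since the text does not describe how the average is formed; the function name is hypothetical.

```python
import numpy as np

def rms_difference(a, b, lats):
    """Area-weighted RMS difference between two fields at each time.

    a, b: arrays of shape (time, lat, lon); NaN marks points to skip.
    lats: latitudes in degrees, length equal to the lat dimension.
    """
    diff2 = (np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2
    weights = np.cos(np.deg2rad(lats))[None, :, None] * np.ones_like(diff2)
    valid = ~np.isnan(diff2)
    weighted_sum = np.where(valid, diff2 * weights, 0.0).sum(axis=(1, 2))
    weight_total = np.where(valid, weights, 0.0).sum(axis=(1, 2))
    return np.sqrt(weighted_sum / weight_total)

# Synthetic check: two fields that differ by 1.5 everywhere, one land point.
lats = np.array([0.0, 20.0, 40.0, 60.0])
nature = np.zeros((3, 4, 5))
experiment = nature + 1.5
experiment[:, 1, 2] = np.nan

rms = rms_difference(experiment, nature, lats)
```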
 The corresponding RMS heat content differences from the Nature Run start at about 1.5°C (reflecting differing initial conditions). In the 1940s experiment the difference drops to around 1.3°C, and in the 1920s experiment it remains slightly higher (Figure 5a, bottom). In contrast the 1960s, 1990s, and 2000s experiments all drop rapidly to RMS differences of around 0.5°C, with slightly lower values for the 2000s. The RMS SST differences from the Climatology Experiment and Nature Run show much less diversity, although the RMS SST differences from the Nature Run are somewhat larger for the 1920s and 1940s than for the later experiments (Figure 5b, bottom). These results reinforce the conclusion that SST analyses were already reasonably accurate early in the 20th century, with an RMS error of at most 1°C, and that this error is reduced to about 0.5°C in recent decades.
 We next consider differences in basin-average temperature for the North Pacific and North Atlantic Oceans from the Nature Run (excluding the deep tropics; Figures 6a and 6b). The Climatology Experiment in comparison is too cool by 0.7°C near surface in the summer and too warm in the winter by a similar amount. The depth of maximum error, however, is between 100 m and 200 m, where the Climatology Experiment is too warm by more than 0.8°C. The 1920s experiment has spatially averaged errors similar to the Climatology Experiment for the first few months, but the temperature error gradually decreases so that by 1998 it is frequently below 0.2°C in the Atlantic Ocean and 0.4°C in the Pacific Ocean. The exception to this is in the mixed layer where seasonal temperature biases remain year-round. Salinity is too low in comparison to the Nature Run by 0.2 psu at depths shallower than 100 m, similar to the Climatology Experiment. This fresh bias persists throughout the four-year period due to the limited salinity observation coverage during these years.
 In the 1940s experiment the difference from the Nature Run shows that the basin-average temperature error gradually declines to less than 0.2°C for levels deeper than the mixed layer in the Atlantic Ocean and to values nearly as low in the Pacific Ocean by 1998 (Figures 6a and 6b). However, salinity errors remain nearly as large as for the 1920s experiment. In the remaining three experiments, basin-average temperature errors are rapidly reduced, after the experiment begins, to below 0.2°C except in the mixed layer. Salinity errors are substantially reduced to below 0.2 psu over much of the water column in the North Pacific, and are somewhat larger in the North Atlantic. Salinity error increases in the 1990s experiment relative to the 1960s experiment due to reductions in salinity profile information, but decreases strikingly to well below 0.1 psu for the 2000s experiment in both ocean basins.
 Next we examine the representation of the enormous El Niño of 1997/8, specifically the spatial structure of anomalies in the extreme season of September–November, 1997, spanning the period when anomalies of heat content and SST develop (Figures 7a and 7b). To reduce the contribution of seasonal variability we remove the climatological monthly variability as estimated from the Climatology Experiment prior to computing anomalies. During this season there are heat content and SST anomalies both as large as ±2.5°C, with a large warm heat content anomaly in the eastern basin and a somewhat smaller cool heat content anomaly in the western basin. In the Indian Ocean the development of the Indian Ocean dipole produced a −1°C cool anomaly in the eastern basin and a corresponding warm anomaly just south of the equator in the central basin.
 The 1920s experiment shows a reasonable representation of the cool heat content anomaly in the western tropical Pacific as the result of the presence of nearly 500 ocean station profiles collected by Japanese scientists in this region from 1925 to 1927 (a sample of this data is evident in Figure 8). The 1920s experiment also shows a weak warm heat content anomaly (interesting because there are no direct observations) and a +1°C warm SST anomaly in the central Pacific (Figure 7b). In contrast, the 1940s experiment has only 300, mostly American, station observations in the tropical Pacific during the period 1945–1947, and these are distributed further eastward (again, a sample is evident in Figure 8). As a result the 1940s experiment captures more of the warm heat content anomaly in the east than the 1920s experiment, but none of the cold anomaly in the west, and it has a stronger, but geographically more limited, SST signal along the equator. By the 1960s the observation coverage of the tropics has increased dramatically, as indicated in Figure 8, extending well north and south of the equator. As a result, this experiment has a qualitatively realistic representation of heat content and SST anomalies. Note that for all experiments heat content anomalies are weak in a narrow latitude band surrounding the equator, again due to the use of climatological wind forcing. In the Indian Ocean sector all experiments show the appearance of a cool SST anomaly in the east, and the 1960s and later experiments show corresponding heat content anomalies.
 In the last set of two experiments climatological surface forcing in Expts. 1920s and 1990s is replaced with forcing for 1995–8, degraded to reflect the difference between a 20CRv2 ensemble member and the ensemble mean for either 1925–8 (Expt. W1920s) or 1995–8 (Expt. W1990s). Each experiment is repeated three times with the errors determined using three randomly selected 20CRv2 ensemble members. The improvement in the analysis in the tropics due to the inclusion of the observed forcing is consistent with the ocean model behaving like a forced quasi-linear system. This highlights the importance of having the correct atmospheric forcing for the simulation of ENSO.
 To illustrate the impact of both forcing (primarily winds) and updating observations we examine tropical heat content, temperature, and zonal velocity anomalies for two three-month periods, beginning with poorly sampled Expt. W1920s. The first period is September–November, 1997, spanning the peak of the heat content anomaly associated with El Niño (Figure 9a). In contrast to the situation presented in Figure 7, when historical winds are added to the assimilation the representation of the heat content anomaly associated with El Niño becomes qualitatively realistic even in the 1920s. Each ensemble experiment shows a strong warm heat content anomaly exceeding 2.5°C east of 160°W and a correspondingly strong cold heat content anomaly north of the equator in the west. The warm heat content anomaly in each member experiment is confined to the upper 200 m, but does extend below the depth of the mean 15°C isotherm. In each member experiment the zone of the intense eastward flowing Equatorial Undercurrent is similarly confined to the eastern basin, at depths greater than 50 m. Each experiment shows a negative heat content anomaly in the eastern Indian Ocean which is most intense at 100 m depths. Closer inspection does show some differences. For example, member 1 winds have a larger westerly anomaly in the central Pacific south of the equator than the other members considered here or than the ensemble average of the member winds. This difference in forcing seems to be the cause of an enhanced negative heat content anomaly in the western basin south of the equator.
 The second period we examine is September–November, 1998 when La Niña conditions had set in (Figure 9b). The heat content anomalies associated with La Niña are not as strong as for El Niño and the differences from one member experiment to another are greater. However, all show gross features similar to those of the Nature Run with anomalously cool temperatures along the equator in the mid-Pacific, with the most intense anomalies in excess of −4°C at 100 m depth. All show these anomalously cool temperatures extending into the Southern Hemisphere in the western Pacific, but with contrasting anomalously warm temperatures to the North. All experiments show the eastward flowing Equatorial Undercurrent extending along the equator all the way from 120°E to 100°W at depths shallower than the depth of the 15°C isotherm except in the far east. Figure 10, corresponding to Expt. W1990s, when the quality of the wind product is good and the historical sampling is also high, shows that the spatial structure of the temperature and velocity field in the tropics may be reproduced with considerable accuracy.
 The time evolution of the climate anomalies along the Equator in Figure 11 shows many of the features of the observed evolution of climate anomalies described in the Introduction (see also the overlying contours from the SODA reanalysis). Heat content is anomalously warm in the central basin in October, 1997 through the beginning of 1998, with clear evidence of eastward propagation and eastward surface currents of up to 40 cm/s. This warm anomaly is followed by a cool anomaly that begins in the western Pacific in October, 1997 and arrives in the central basin in July, 1998. Anomalies of SST lag anomalies of heat content somewhat, so that the eastern Pacific has warm anomalies exceeding 3°C by January, 1998, followed by anomalously cool temperatures in July. In the Indian Ocean the relaxation of the normal monsoonal conditions led to anomalously cool SSTs of −1°C in boreal fall of 1997 in the eastern basin and anomalously warm SSTs in the western basin the following spring. Experiments W1920s and W1990s both reproduce the gross features of this evolution as well as many of the finer details.
 This study is designed to address the question of the extent to which massive climate anomalies such as the 1997/8 El Niño, the 1997 shift of the Indian Ocean dipole, and the 1998/2000 La Niña would have been represented had these events occurred during other decades in the past century. Our approach is to begin with a numerical simulation of the ocean that spans the four year period 1995–1998, which we refer to as the Nature Run. We then carry out a set of observing system simulation experiments attempting to reconstruct the results of the Nature Run by applying data assimilation, but beginning with erroneous initial conditions and using inaccurate forcing. In our first set of experiments we sample the Nature Run at the locations where historically observations were collected. We assimilate these synthetic observations into a model driven by climatological monthly surface forcing. Thus in this set of experiments all information about year-to-year climate variability comes from the assimilation of observations rather than from initial conditions or surface forcing. In a second set of experiments we use both assimilation and historically varying surface forcing (degraded consistent with meteorological error estimates).
 In the first set of five experiments synthetic observations from five decades (1925–8, 1945–8, 1965–8, 1995–8, and 2005–8) are assimilated. Each successive experiment has more observations, which extend to deeper levels, reflecting the growth of the observing system over time. The results show that by the 1960s the ocean surface and subsurface observing system is sufficiently extensive that the major features of the warm and cold phases of ENSO and shifts in the Indian dipole can be reproduced even in the absence of information from surface forcing. In the northern subtropics/midlatitudes the comparison of basin-average quantities such as temperature and salinity in the upper 300 m suggests that even as far back as the 1940s the historical observational coverage allows us to estimate temperature anomalies with an accuracy of 0.2°C, but that basin-averaged salinity remains unconstrained. From the 1960s forward basin-averaged temperature errors are less than 0.2°C throughout the upper 300 m and salinity errors are generally less than 0.2 psu except within the mixed layer. The salinity errors increase somewhat due to a shift toward observing systems that measure temperature but not salinity, and then decrease as a result of the introduction of Argo.
 In the second set of experiments the 1920s and 1990s experiments described above are repeated, now including historical surface forcing that is degraded consistent with meteorological error estimates from the 20CRv2 reanalysis. The results of these more realistic experiments show that historical representation of surface meteorological forcing is sufficiently accurate (if we accept the 20CRv2 error estimates) that tropical phenomena of the magnitude of the 1997/8 El Niño and associated 1997 Indian dipole and 1998 La Niña could have been described qualitatively at least as early as the 1920s and likely even earlier. This result, which is partly a consequence of the low estimates of wind error along the equator, is extremely encouraging for the application of data assimilation to historical ocean climate studies in the first half of the 20th century [e.g., Giese et al., 2010].
 The current study comes with a number of qualifications. It focuses on phenomena resembling the climate anomalies of the late-1990s which had their largest expression in the tropics. By focusing on this extraordinary era and the tropics at interannual timescales, we likely overestimate the ability of an ocean reanalysis to track more usual climate events, those outside of the tropics, and those on longer decadal timescales. Simplifying assumptions in this study include the assumption that the ocean observations are perfect (no observation error). The use of the same numerical simulation model for the Nature Run and for the experiments eliminates deleterious effects of model error and unresolved physical processes. The study also accepts, without any attempt to verify, the error estimates that accompany 20CRv2. Additional studies are needed to explore all of these qualifications.
 Finally, we note that our emphasis in this study on analysis of just two variables, SST and heat content, tends to underemphasize the contributions of the modern ocean observing systems. Indeed, it is important to emphasize that the atmospheric and oceanic observing systems provide complementary information, each making the other more valuable.
 Financial support for this research has been provided to JAC by the National Science Foundation (OCE-0351319). Support for BSG and HFS is provided by NOAA (NA06OAR4310146) and the National Science Foundation (OCE-0351804).