The main goal of this study is to determine how well large-scale temperature, salinity, upper ocean heat content (UOHC), and surface mixed layer fields and their variability can be reconstructed from the Argo observing system. The approach is to sample and reconstruct these oceanic fields from a coarse-resolution ocean general circulation model (OGCM), quantify the errors in the reconstructed fields, and analyze the factors controlling these errors. In particular, this study analyzes the effects of float movements on the spatial coverage and reconstruction of temporal variability. Overall performance of the simulated Argo array is good, and the reconstructed climatological means of such key quantities as the temperature, salinity, UOHC, and mixed layer depth are very close to the actual OGCM-simulated values in most of the global ocean. However, the differences between the reconstructed and actual fields (“reconstruction errors”) are more significant in several regions with strong currents, such as the Antarctic Circumpolar Current (ACC). The results also suggest that the detection of the year-to-year changes in UOHC in the ACC, in high-latitude North Atlantic, and near the coasts can be particularly problematic. As illustrated by sensitivity experiments, the main effect of float movements is to increase reconstruction errors. This adverse effect of float movements is the main cause of large errors in the UOHC interannual difference in the ACC. When the spatial sampling coverage is improved, for example, by increasing the number of floats, the accuracy of reconstruction improves substantially.
 Addressing the need of the oceanographic community for global, continuous profiling of the ice-free upper ocean, the Argo array reached its deployment target of 3000 floats in 2007. The floats spend most of the time at a depth of 1000–2000 m, where they are advected by the deep oceanic currents. Every 10 days, the floats surface and take vertical profiles of temperature and salinity over the depth interval extending from 1000–2000 m to the surface. The resulting three-dimensional global data set is unprecedented in the history of oceanographic observations. For the first time, reliable estimates of the global upper ocean stratification are available, which will significantly advance understanding of the state of the oceans and enhance the ability to predict future climate changes. In particular, data for temperature and salinity help to constrain climate models used for climate change projections, whereas the information contained in such derived variables as the upper ocean heat content (UOHC) and mixed layer depth (MLD) is central for the ability to detect ongoing climate changes.
 Several factors, however, impact the ability of the Argo observing system to accurately reconstruct the upper oceanic state, and an analysis of such factors is crucial for the understanding of the limitations of the Argo system. Coarse spatial and temporal sampling is an important limiting factor. If the 3000 floats are nearly uniformly distributed, the average distance between floats is approximately 300 km. Such coarse spatial resolution is insufficient to resolve sharp oceanic fronts associated with western boundary currents, and the Argo array was never intended to provide accurate measurements in these areas. The Antarctic Circumpolar Current (ACC) represents another region with strong currents and high gradients, and the ability of the Argo to reconstruct temperature and salinity structure there is an important uncertainty in the Argo-based reconstruction of the global oceanic state. This paper addresses the significance of the differences between the reconstructed and actual values of some state estimates and examines the effects of float movements and spatial sampling coverage on these biases. The goal of this study is to examine how well temperature, salinity, UOHC and surface mixed layer fields and their variability can be reconstructed from the Argo data.
1.1. Effects of Advection
 Movement of the Argo floats by oceanic currents has complicated effects on the overall accuracy of reconstruction of the gridded fields from the Argo data. In particular, the constant redistribution of floats on average acts to increase the spatial sampling coverage of the Argo system, by providing observations from more points in the domain, and can thus be expected to improve the Argo ability to reconstruct oceanic fields. In the regions with divergent currents, on the other hand, the spatial sampling coverage can be expected to decrease [Vecchi and Harrison, 2007]. The float redistribution can also negatively impact the reconstruction of the time variability in sampled fields, by decreasing the time a float spends near any particular location.
 These competing “positive” and “adverse” effects can be illustrated with the example of a one-dimensional, uniform current. The density of the sampling coverage, in this case, can be measured by the length scale d within which at least one sampling is guaranteed during a time interval T. If the floats are not moving, d = D, where D is the spacing between floats (300 km). If the floats are advected by a current of speed U over time T, d is reduced to D − UT. A long advective length scale UT leads to increased spatial sampling coverage (smaller d), and for UT ∼ 300 km, the advection-induced improvement in the sampling coverage becomes very significant. Since the resolution of the annual cycle is usually a minimum requirement for reconstruction of climatology and variability of oceanic variables, T is taken to be a monthly time scale, 30 days. Then for the slow advection of U < 0.01 m s⁻¹, typical for most large-scale subsurface currents in the oceanic interior, d is very close to D and the spatial sampling coverage is largely unaffected by advection. Moreover, the time mean velocities are directed primarily along mean isobars, and the advection does not affect important cross-gradient resolution. Positive effects of advection become noticeable at approximately U > 0.03 m s⁻¹, and for the fast advection of U ∼ 0.1 m s⁻¹, d becomes approximately 40 km. Note, however, that the required spatial resolution should in general be proportional to the characteristic length scale L on which temperature and salinity change. This length scale is typically shorter in the regions of strong geostrophic currents, which limits the positive effect of advection.
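As a rough numerical check of the estimates above, the following sketch evaluates d = D − UT for D = 300 km and T = 30 days at the three speeds quoted in the text. The function name and the floor at zero (applied when UT exceeds D) are illustrative assumptions and are not part of the study.

```python
SECONDS_PER_DAY = 86400.0


def coverage_scale_km(spacing_km, speed_m_s, interval_days):
    """Return d = D - U*T in km for a one-dimensional, uniform current.

    The result is floored at zero (an assumption made here for UT > D).
    """
    advective_km = speed_m_s * interval_days * SECONDS_PER_DAY / 1000.0
    return max(spacing_km - advective_km, 0.0)


# D = 300 km and T = 30 days, as in the text.
for speed in (0.01, 0.03, 0.1):
    d = coverage_scale_km(300.0, speed, 30.0)
    print(f"U = {speed:.2f} m/s: d = {d:.1f} km")
```

For U = 0.1 m s⁻¹ the advective length UT is about 259 km, which gives the d of approximately 40 km quoted above.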
 The importance of the adverse effects of advection can be quantified by estimating an advective time scale L/U. On this time scale, float movements cause noticeable changes in the sampled fields by changing the accuracy of reconstruction at any fixed location. If the time scale L/U is comparable to the time scale of interest, the movement of floats can severely distort the reconstructed variability. The advective time scale L/U is approximately 580 days for U ∼ 0.01 m s⁻¹ and L ∼ 500 km (typical scales for the oceanic interior), and 23 days for U ∼ 0.05 m s⁻¹ and L ∼ 100 km (typical scales for the ACC). On the basis of these estimates, significant distortion of the annual cycle is expected in regions of strong current and short characteristic length scale, such as the ACC and boundary current regions, whereas in the interior of midlatitude gyres, the adverse effects of float movements are expected to be felt on interannual time scales only. Finally, even relatively weak (U ∼ 0.002 m s⁻¹) but divergent flows can cause significant gaps in spatial coverage over the course of 5 years (L ∼ 300 km).
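The advective time scales quoted above follow from simple unit conversion. The sketch below evaluates L/U for the three (L, U) pairs in the text; the function name and the labels are illustrative only.

```python
SECONDS_PER_DAY = 86400.0


def advective_time_days(length_km, speed_m_s):
    """Return the advective time scale L/U in days."""
    return length_km * 1000.0 / speed_m_s / SECONDS_PER_DAY


# (label, L in km, U in m/s), with values taken from the text.
cases = [
    ("oceanic interior", 500.0, 0.01),
    ("ACC", 100.0, 0.05),
    ("weak divergent flow", 300.0, 0.002),
]
for label, length_km, speed in cases:
    days = advective_time_days(length_km, speed)
    print(f"{label}: L/U = {days:.0f} days ({days / 365.25:.2f} years)")
```

The third case gives a little under 5 years, consistent with the estimate for weak divergent flows.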
1.2. Observing System Simulation Experiments
 To estimate the error in the fields reconstructed from Argo data, a simulation of the Argo sampling is studied here using an oceanic general circulation model (OGCM), in an Observing System Simulation Experiment (OSSE) [Arnold and Dey, 1986]. The technique has been used for the analysis of different ocean observing systems in ocean models of varying complexity [Kindle, 1986; Barth and Wunsch, 1990; Bennett, 1990; Hernandez et al., 1995; Hackert et al., 1998]. For the Argo array, OSSEs concentrated on the Indian Ocean [Schiller et al., 2004; Oke and Schiller, 2007; Ballabrera-Poy et al., 2007; Vecchi and Harrison, 2007] and the Mediterranean Sea [Griffa et al., 2006]. The OSSE approach has several advantages. Most importantly, the actual field is known and thus the errors in the reconstructed fields can be accurately estimated. Additionally, parameters of the observing system and the “observed” oceanic state can be modified in sensitivity studies, and the effects of various factors on the accuracy of the observing system can be isolated and estimated. For example, Schiller et al. demonstrate the importance of spatial resolution for capturing intraseasonal oscillations in the upper Indian Ocean, and recommend a meridional resolution of 100 km in the equatorial region. Vecchi and Harrison demonstrate that more frequent sampling decreases the accuracy of the Argo system in the tropics because of the increased time spent by the floats at the surface and the resulting enhanced divergence of the floats. The OSSE approach also permits calculation of the optimal float distribution and trajectories [Oke and Schiller, 2007; Griffa et al., 2006].
 This study makes a first attempt to analyze the Argo system in global OSSEs and to focus on the effects of float movements and spatial sampling coverage on the ability of the Argo system to reconstruct oceanic state variables. It can be argued that, due to its coarse spatial resolution, the main objective of the Argo array is to measure large-scale fields. This study evaluates to what degree this objective can be met and utilizes a coarse-resolution ocean model in order to analyze the reconstruction of large-scale oceanic features. All the advection is, however, carried by large-scale currents and, most importantly, the mesoscale eddies are absent. These idealized simulations do not attempt to account for the actual launching times and locations, differences in sampling times, differences in float design, and finite ascending/descending times. In addition, the pressure offset error and the effects of the loss of floats due to mechanical failures and vandalism are also ignored. Given all these idealizations, this study is expected to provide an upper bound estimate on the accuracy of the Argo observing system.
 The paper is organized as follows. The numerical model and details of the simulations are described in section 2, which compares the reconstructed fields to the original OGCM-simulated values and analyzes the reconstruction errors, the differences between the reconstructed and actual OGCM-simulated values. Section 3 presents an analysis of the reconstruction errors in the temperature, salinity, UOHC and MLD in the “standard” simulation (defined later). Section 4 analyzes sensitivity experiments designed to illuminate the effects of float movements (section 4.1) and the density of the spatial sampling coverage (section 4.2) on the expected accuracy of the Argo system. The analyses of section 4 are focused on a single variable, UOHC. The discussion and conclusions are presented in section 5.
2. Numerical Model and Experimental Design
 The numerical model used in this study is discussed in detail by Kamenkovich; here only a brief description is provided. The model is based on the GFDL MOM3 code [Pacanowski and Griffies, 1999]. The horizontal resolution is 2° in longitude and latitude. The model domain is global and extends from 78°S to 84°N. There are 25 levels in the vertical, with layer thickness increasing from 17 m at the surface to 510 m at the bottom. The bathymetry of the model is derived from the Scripps Topography. Vertical diffusivity varies continuously from 0.25 × 10⁻⁴ m² s⁻¹ at the surface to 1.0 × 10⁻⁴ m² s⁻¹ at the bottom. This profile reflects the increase of the vertical mixing from the thermocline to the deep ocean [Bryan and Lewis, 1979], and the intensification of mixing by rough bottom topography [Polzin et al., 1997]. Heat and salt transports by the mesoscale eddies are parameterized by the Gent-McWilliams scheme [Gent and McWilliams, 1990] with coefficients for isopycnal diffusion of tracers and isopycnal thickness of 500 m² s⁻¹. The K-profile parameterization (KPP) scheme [Large et al., 1994] is used to represent turbulent mixing within a surface boundary layer. Horizontal and vertical viscosity in the model are 8 × 10⁴ and 10⁻⁴ m² s⁻¹, respectively. Flow of dense waters down a topographic slope is parameterized following Campin and Goosse; this parameterization improves properties of the deep and bottom water in the model. The Gibraltar Strait is not resolved and is closed; the Mediterranean Outflow is parameterized by a local salinity source following Rahmstorf.
 Heat fluxes into the ocean are calculated using conventional bulk formulas and are corrected for stability. Daily values for the 2-m air temperature and humidity, 10-m wind speed, and zonal and meridional components of the wind stress are taken from years 1979–2001 of the NCEP-NCAR reanalysis. Climatological monthly values are used for all other atmospheric variables. Cloud cover and solar radiation are taken from the International Satellite Cloud Climatology Project. Freshwater fluxes are taken from Jiang et al. and include river runoff data. To avoid a drift in surface salinity toward unrealistic values, weak restoring of the surface salinity to the Levitus climatological values [Levitus and Boyer, 1994], with a restoring time scale of 180 days, is used.
 The model is coupled to a thermodynamic sea ice model [Visbeck et al., 1998]. The ice model diagnoses ice thickness, ice cover, i.e., the percentage of ice-covered ocean in a grid box, surface temperature of the ice/snow, and heat and salt exchanges with the ocean. To keep the amount of sea ice in the model close to observations, the “ice correction” is employed: In the ice-covered areas, the model has an anomalous heat flux out of the ocean that is proportional to the difference between the observed and model-simulated ice cover. The observed values for years 1979–2001 are taken from the National Snow and Ice Data Center data set.
 The simulated Argo-like array of 3000 floats is initially randomly distributed in the model domain. In the standard run, the floats are advected by the GCM-simulated velocities at the 1500-m depth. The floats surface every 10 days, taking an instantaneous vertical profile during their ascent; they spend 8 h at the surface, where they are advected by the surface currents, and then return to the 1500-m depth level. All floats surface simultaneously. The floats are assumed to be lost when they enter regions with ocean depth shallower than 1500 m or ice-covered areas; after being lost, a float is not replaced by another one. The simulations are run for 10 years, which corresponds to model forcing of 1992–2001.
 Float locations at the end of years 1, 5, and 10 are shown in Figure 1. By the end of year 5 (Figure 1b), several regions exhibit significantly decreased spatial sampling coverage. The analysis reveals that these changes are not due to the loss of simulated Argo floats, but rather due to the divergence of the profilers. In the tropics, divergent near-surface currents clear the area of the Argo floats, resulting in significant gaps by the end of year 10. Similarly, the spatial sampling coverage also steadily decreases with time in the vicinity of the western coasts of all continents and in the subpolar gyre region in the North Atlantic.
 The uneven spatial coverage is further illustrated by the frequency of occupation of each 2° × 2° box by the simulated Argo floats (Figure 2). To calculate the frequency of occupation, the number of floats within 1° of each grid point at each 10-day sampling is counted, summed over all samplings, and divided by the total number of samplings (365). Overall, approximately 41% of ice-free grid boxes are sampled at least once a month (sampling frequency of 1/3) and 11% are sampled more than twice a month. There are some significantly undersampled areas with very small sampling frequency (<0.1), including parts of the ACC, the high-latitude North Atlantic and the tropics.
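The frequency-of-occupation calculation described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes that "within 1°" means a half-open 2° × 2° box centered on the grid point, wraps longitude differences across the dateline, and uses hypothetical names and data layouts.

```python
def occupation_frequency(grid_points, samplings, half_width_deg=1.0):
    """Average number of floats per sampling near each grid point.

    grid_points: list of (lon, lat) grid-point coordinates in degrees.
    samplings: list of samplings, each a list of (lon, lat) float positions.
    A float is counted for a grid point when it lies within half_width_deg
    in both longitude and latitude (box criterion, an assumption).
    """
    counts = [0] * len(grid_points)
    for floats in samplings:
        for flon, flat in floats:
            for i, (glon, glat) in enumerate(grid_points):
                # Wrap the longitude difference into [-180, 180).
                dlon = (flon - glon + 180.0) % 360.0 - 180.0
                dlat = flat - glat
                in_lon = -half_width_deg <= dlon < half_width_deg
                in_lat = -half_width_deg <= dlat < half_width_deg
                if in_lon and in_lat:
                    counts[i] += 1
    return [count / len(samplings) for count in counts]
```

With 10-day sampling, a box visited once in every three samplings has a frequency of 1/3, which corresponds to the "at least once a month" threshold used in the text.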
 Random errors in temperature and salinity, with magnitudes of 0.005 K and 0.01, respectively, are added to the synthetic data. These pseudo measurements are then objectively analyzed to produce global gridded maps of temperature and salinity. The objective analysis (OA) scheme of Mariano and Brown is used to map the float-sampled data onto the model grid. The OA scheme is applied on each vertical layer of the 10-day sampled data. A Gaussian correlation function with decorrelation scales of 8° longitude by 5° latitude is assumed.
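To make the role of the decorrelation scales concrete, the sketch below defines an anisotropic Gaussian correlation function with scales of 8° longitude and 5° latitude and uses it in a Gaussian-weighted average. This is a deliberately simplified stand-in and not the OA scheme used in the study: the exact functional form of the correlation function, the weighted-average mapping, and all names are assumptions made for illustration.

```python
import math


def gaussian_correlation(dlon_deg, dlat_deg, lon_scale=8.0, lat_scale=5.0):
    """Anisotropic Gaussian correlation (assumed form exp(-(dx/a)^2 - (dy/b)^2))."""
    return math.exp(-(dlon_deg / lon_scale) ** 2 - (dlat_deg / lat_scale) ** 2)


def weighted_estimate(target, observations):
    """Gaussian-weighted average of observations at a target point.

    target: (lon, lat) in degrees.
    observations: non-empty list of (lon, lat, value) tuples.
    This is a simple weighted mean, not a full objective analysis.
    """
    tlon, tlat = target
    numerator = 0.0
    denominator = 0.0
    for lon, lat, value in observations:
        weight = gaussian_correlation(lon - tlon, lat - tlat)
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

In this form, an observation displaced by 8° in longitude and one displaced by 5° in latitude receive the same weight, which reflects the zonally elongated scales assumed in the text.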
3. Estimates of the Expected Argo Reconstruction Accuracy: Standard Run
 The analysis of the reconstruction errors is carried out for three quantities: the annual mean climatological values, the amplitude of the climatological annual cycle and the amplitude of interannual difference. The first two of these characterize the climatology averaged over 10 years of OGCM data. For the measure of the magnitude of the climatological annual cycle, the absolute value of the difference between September and March values [Gleckler et al., 2006] is used. Because of the limited length of the simulated and real Argo data, quantifying the interannual variability is not straightforward. In this study, the analysis is focused on one simple measure of an average magnitude of the year-to-year change, an absolute value of the difference between the annual mean values at years 10 and 1. Two additional measures, value of the linear trend and standard deviation (STD) of the annual-mean values, will also be briefly discussed below. Spatial distributions of the reconstruction errors in all three measures are qualitatively similar.
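The two variability measures defined above, and the corresponding reconstruction error, can be written compactly as follows. This is a minimal sketch for a single grid point; the function names and the convention that monthly climatologies start in January are assumptions.

```python
def annual_cycle_amplitude(monthly_climatology):
    """|September - March| from 12 climatological monthly means (January first)."""
    if len(monthly_climatology) != 12:
        raise ValueError("expected 12 monthly values")
    return abs(monthly_climatology[8] - monthly_climatology[2])


def interannual_difference(annual_means):
    """|last year - first year| of a sequence of annual means (years 1 to 10)."""
    return abs(annual_means[-1] - annual_means[0])


def reconstruction_error(reconstructed, actual, measure):
    """Error in a measure: measure(reconstructed) - measure(actual)."""
    return measure(reconstructed) - measure(actual)
```

For example, `reconstruction_error(recon_annual, model_annual, interannual_difference)` would give the error in the interannual difference at one location, where `recon_annual` and `model_annual` are hypothetical lists of ten annual means.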
 Area-averaged (over the ice-free ocean) reconstruction errors in the vertical profiles of temperature and salinity are shown in Figure 3 for the ACC (Figures 3a and 3c) and the area north of it (40°S–45°N, Figures 3b and 3d). The errors decrease with depth at all latitudes, in concert with decreasing spatial gradients and temporal variability in the actual fields. The errors exhibit a maximum in the upper 100 m, where the average errors in the annual mean temperature exceed 0.5 K in the ACC and 0.3 K elsewhere. These errors are significant for the upper ocean temperature (exceeding 10 percent in the ACC) and, as will be seen, lead to noticeable biases in UOHC and MLD. Average errors below 1000 m are small, less than 0.2 K for the ACC and less than 0.05 K for the 40°S–45°N region. The errors in salinity are smaller than 0.06.
 The horizontal distribution of reconstruction errors for UOHC is reported in Figure 4. UOHC is calculated here over the top 800 m. Figure 4a shows the errors in annual mean climatological UOHC. For convenience, these values are given in their temperature equivalent, using units of K, by dividing the heat content per unit area by ρCpH (where ρ = 1025 kg m⁻³, Cp = 4186 J kg⁻¹ K⁻¹, H = 800 m) or 3.4 × 10⁹ J m⁻² K⁻¹. In the bulk of the midlatitude interior, the errors are small and amount to less than 0.05 K. The distribution of these errors exhibits zonal bands, explained by the tendency of the Argo reconstruction to underestimate the large-scale latitudinal gradients in UOHC, which is not surprising given the coarse spatial resolution of the Argo array. As a result of this tendency, the curvature (the derivative of the gradient) is underestimated as well, and close inspection reveals that the latitudes of negative curvature correspond to the negative biases in UOHC, whereas the latitudes of positive curvature correspond to the positive biases.
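The conversion factor quoted above can be verified directly from the stated constants. The sketch below computes ρCpH and converts a heat content per unit area into its temperature equivalent; the function and constant names are illustrative.

```python
RHO = 1025.0   # seawater density, kg m^-3 (value from the text)
CP = 4186.0    # specific heat, J kg^-1 K^-1 (value from the text)
DEPTH = 800.0  # integration depth H, m (value from the text)

HEAT_CAPACITY_PER_AREA = RHO * CP * DEPTH  # J m^-2 K^-1


def heat_content_to_kelvin(heat_j_m2):
    """Convert heat content per unit area (J m^-2) to temperature equivalent (K)."""
    return heat_j_m2 / HEAT_CAPACITY_PER_AREA


print(f"rho * Cp * H = {HEAT_CAPACITY_PER_AREA:.3e} J m^-2 K^-1")
```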
 There are, however, several locations with more significant reconstruction errors. Most notably, the reconstruction errors are large (∼0.5 K) in the Indian sector of the ACC and the high-latitude North Atlantic. There are also smaller, but significant, reconstruction errors in the tropical Indian Ocean. All these regions are characterized by gradually decreasing spatial sampling coverage (Figure 2). Consistent with this gradual loss of the sampling coverage, the errors grow with time in the tropical Indian Ocean [see also Vecchi and Harrison, 2007] and in the high-latitude North Atlantic (Figures 5a and 5b). The growth begins after year 3 in the former and after year 1 in the latter region. The tropical Atlantic and Pacific oceans, although they also experience decreasing spatial sampling coverage, do not exhibit significant errors in the annual mean climatology. The errors vary significantly in the Indian sector of the ACC, but do not exhibit a noticeable trend (Figure 5c); similarly, no trend is visible in the midlatitude gyre interior (Figure 5d). As expected, large biases are also seen in the immediate proximity of such intense western boundary currents as the Gulf Stream and Kuroshio, where the errors exceed 0.3 K. These errors, as explained above, are caused by the inability of the Argo system to resolve sharp gradients and strong curvature in the cross-stream temperature structure. Note, however, that the western boundary currents simulated by this coarse-resolution GCM are noticeably (by a factor of 2 or 3) wider than their counterparts in the real ocean, and thus the reconstruction errors for the real Argo system are expected to be even greater. The errors are also large in the vicinity of the coasts and the ice edges, because of the inadequate spatial coverage due to the simulated “loss” of floats and the fact that the real and simulated Argo floats spend most of the time at depth and are unable to sample shallow areas.
Figures 4b and 4c show the reconstruction errors in the variability of UOHC on seasonal and interannual scales. These errors are shown in the units of surface heat flux (W m⁻²). For conversion, the September–March UOHC difference per unit area is divided by 6 months, and the difference between years 10 and 1 is divided by 9 years. The errors in the annual cycle tend to be distributed in zonal bands in the low latitudes, but exhibit strong variations in the zonal direction in the rest of the domain (Figure 4b). As with the annual mean values, the largest errors in the annual cycle are found in the Indian sector of the ACC and the high-latitude North Atlantic, where they exceed 20 W m⁻². Tropical areas exhibit errors in the annual cycle amplitude that tend to exceed the errors in the midlatitudes, and that can be deemed even more significant given the smaller amplitude of the tropical annual cycle. In the rest of the domain, the errors in the annual cycle amplitude are small (<5 W m⁻²), and the globally averaged error magnitude (absolute value of errors) is 9.25 W m⁻² (Table 1). Given that the actual seasonal cycle magnitude varies between 20 and 30 W m⁻² in the tropics and between 70 and 150 W m⁻² at midlatitudes, the results suggest that the Argo system can resolve the annual cycle rather accurately (within <5%) everywhere, except in the ACC, the high-latitude North Atlantic and the tropics.
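The conversion to flux units described above amounts to dividing a heat content difference by the corresponding time interval. The sketch below assumes a 365-day year and treats "6 months" as half of that year; both choices, and the function name, are assumptions made for illustration.

```python
SECONDS_PER_DAY = 86400.0
DAYS_PER_YEAR = 365.0  # assumption; the text does not specify the calendar


def heat_difference_to_flux(delta_j_m2, interval_years):
    """Convert a UOHC difference per unit area (J m^-2) to a flux (W m^-2)."""
    return delta_j_m2 / (interval_years * DAYS_PER_YEAR * SECONDS_PER_DAY)


# Example: a temperature-equivalent difference of 0.05 K over the top 800 m,
# using rho = 1025 kg m^-3 and Cp = 4186 J kg^-1 K^-1 as in the text.
delta = 0.05 * 1025.0 * 4186.0 * 800.0  # J m^-2

seasonal = heat_difference_to_flux(delta, 0.5)     # September-March, 6 months
interannual = heat_difference_to_flux(delta, 9.0)  # year 10 minus year 1
print(f"seasonal: {seasonal:.2f} W m^-2, interannual: {interannual:.2f} W m^-2")
```

The same heat content difference therefore corresponds to a flux 18 times larger on the seasonal scale than on the 9-year scale, which is worth keeping in mind when comparing the magnitudes in Figures 4b and 4c.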
Table 1. Area-Averaged UOHC Reconstruction Errors and Their Difference Between Experimentsᵃ
ᵃThe average is taken over the ice-free areas deeper than 800 m. “Standard” through “1500 Floats” columns show integrated absolute values of errors for the five experiments. “Standard Parked” and “1500 Standard” columns show the difference in magnitude (absolute values) of errors between the experiments, with positive (increase in errors) and negative (decrease in errors) values integrated separately.
Annual mean (degrees)
Amplitude of annual cycle (W m⁻²)
Interannual difference (W m⁻²)
Figure 4c shows the reconstruction errors in the magnitude of the interannual difference, in flux units. Spatial variability in the reconstruction errors is significant. The errors exceed 5–10 W m⁻² in several regions, most notably the ACC, the high-latitude North Atlantic, and near the coasts and western boundary currents. In the rest of the domain, the errors are smaller than 0.5 W m⁻². The average error magnitude, the global average of the absolute value of reconstruction errors over the ice-free ocean deeper than 800 m, is 1.2 W m⁻² (Table 1); the global average of the errors themselves (not their absolute values) is 0.6 W m⁻². The area-averaged absolute value of the errors in STD of the annual mean UOHC is 4.45 W m⁻². These are significant reconstruction errors. For comparison, the globally averaged UOHC interannual difference itself in this OGCM is 2.4 W m⁻²; the globally averaged STD of the annual mean UOHC is 13.6 W m⁻². It is noteworthy that these significant errors in the global UOHC values are mainly due to errors in a few regions seen in Figure 4c (primarily in proximity to ice edges and coasts), and are significantly smaller in the open ocean. For example, if the errors with magnitude of more than 5 W m⁻² are excluded from the analysis, the resulting global error is substantially reduced from 1.2 to 0.08 W m⁻².
 The linear warming trend in the annual mean, globally averaged UOHC in this OGCM is weak, and amounts to only 0.22 W m⁻², as estimated by a linear fit. This weakness can be attributed to the use of climatological values for the cloud cover, the absence of greenhouse forcing in these simulations, and the limited extent of the simulation. In contrast, the linear trend from the reconstructed values amounts to almost 0.94 W m⁻²; this significant value is mainly a result of large reconstruction errors. For comparison, the radiative forcing, the response to an instantaneous increase in anthropogenic greenhouse gas inventories, is estimated to be 1.6 W m⁻² (IPCC Fourth Assessment Report); the current best estimate for the radiative imbalance is 0.85 W m⁻² [Hansen et al., 2005]. The results, therefore, suggest that errors in the estimated global interannual UOHC anomaly, for a 10-year period, are significant relative to a climate forcing signal. Given this large error, detecting a trend in the UOHC anomaly associated with global warming appears problematic from the Argo data alone.
 MLD is defined here as the depth at which the buoyancy difference with the surface equals 0.03 m s⁻². In this definition, σθ(z; z0) stands for the potential density at depth z referenced to the depth z0.
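A threshold criterion of this kind can be sketched as follows. This is a simplified illustration rather than the exact definition used in the study: it takes a single potential density profile (ignoring the dependence on the reference depth z0), defines the buoyancy difference as g(ρ(z) − ρ(0))/ρ0 with assumed values g = 9.81 m s⁻² and ρ0 = 1025 kg m⁻³, and interpolates linearly between levels. All names are hypothetical.

```python
GRAVITY = 9.81  # m s^-2 (assumed value)


def mixed_layer_depth(depths_m, densities, threshold=0.03, rho0=1025.0):
    """Depth at which the buoyancy difference with the surface reaches threshold.

    depths_m: depths in m, positive downward, increasing, surface level first.
    densities: potential density (kg m^-3) at those depths.
    Returns the deepest level if the threshold is never reached.
    """
    surface_density = densities[0]
    prev_depth, prev_buoyancy = depths_m[0], 0.0
    for depth, density in zip(depths_m[1:], densities[1:]):
        buoyancy = GRAVITY * (density - surface_density) / rho0
        if buoyancy >= threshold:
            # Linear interpolation between the two bracketing levels.
            fraction = (threshold - prev_buoyancy) / (buoyancy - prev_buoyancy)
            return prev_depth + fraction * (depth - prev_depth)
        prev_depth, prev_buoyancy = depth, buoyancy
    return depths_m[-1]
```

Because the criterion depends on small density differences near the surface, modest biases in reconstructed temperature or salinity can shift the depth at which the threshold is crossed.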
 MLD is highly sensitive to the near-surface stratification, and thus even small biases in the subsurface temperature and salinity can result in significant errors in MLD. As a result, the reconstruction errors in the annual mean MLD are significant in many parts of the domain; see Figure 6. These biases tend to be negative, so the Argo underestimates annual mean climatology of MLD. The largest negative biases in MLD are found in the regions with the deepest mixed layer: the Indo-Pacific sector of the ACC, western boundary regions and the high-latitude North Atlantic, where they exceed 10 m. In the rest of the domain the errors are smaller than 5 m. The global error magnitude, the absolute value of MLD error integrated over the entire domain, amounts to approximately 2.35 m.
 The MLD reconstruction errors in the amplitude of the annual cycle generally exceed the errors in the annual mean values (Figure 6b). The largest negative biases correspond to the regions of strong wintertime deepening: Indian sector of the ACC, particularly next to Australia, and the regions of the deep water formation in the high-latitude North Atlantic. The magnitude of the annual cycle in MLD is underestimated by more than 100 m in these locations. Similarly, a cold bias in the reconstructed temperatures in the regions of the Gulf Stream and Kuroshio (Figure 4) leads to a negative bias in the wintertime values and annual cycle of MLD.
 Over the course of 10 years, the GCM-simulated annual mean MLD exhibits significant (up to 100 m) changes at high latitudes (not shown): deepening southwest of Australia and west of the Drake Passage, and shoaling southeast of Australia and in the North Atlantic. The amplitude of these 10-year changes is overestimated by the Argo system, with the errors exceeding 50 m in the ACC and the North Atlantic (Figure 6c). These errors severely distort and often overwhelm the actual signal. Over the rest of the domain, the annual mean MLD errors are less than 10 m.
4. Role of Float Movements and Spatial Sampling Coverage: Sensitivity Runs
4.1. Effects of Float Movements
 Two sensitivity runs described in this section address the effects of float movements on the reconstruction errors. In the first experiment, the “parked floats” run, the Argo floats do not change positions in time. The distribution of the floats in the parked floats run is identical to that at the beginning of the standard run, and any differences between these two simulations are therefore caused solely by the movements of floats in the standard case. Figure 7 shows the difference in the magnitude (absolute value) of the errors between the standard and parked floats cases, and thus the portion of reconstruction errors that can be attributed to float movements. The error difference is shown for the annual mean UOHC, the magnitude of the UOHC annual cycle, and the magnitude of the UOHC interannual difference.
 The differences in the reconstruction errors for the annual mean UOHC between the standard and the parked float cases are small and are both positive and negative; see Figure 7a and Table 1. (Note that in Table 1, the advection-caused increases (positive values in Figure 7) and decreases (negative values in Figure 7) in errors are shown separately.) On the basis of the arguments in section 1.1, the large-scale currents simulated by this OGCM cannot significantly affect spatial sampling coverage on important seasonal time scales, and the positive effect of advection cannot be expected. Large differences in the reconstruction errors in the high-latitude North Atlantic between the standard and parked floats runs are explained by the loss of spatial sampling coverage in the standard run (Figure 1). Similarly, strong advection by the ACC currents causes large reconstruction errors in the standard case. The errors in the tropics decrease in the parked float case, but the changes are very small. In the rest of the basin, the adverse effect of advection for the annual mean UOHC is negligible (mostly less than 0.25°C).
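The separate integration of error increases and decreases can be sketched as follows. This is an illustrative calculation, not the one used to build Table 1: it assumes that grid cells are weighted by the cosine of latitude, that both partial sums are normalized by the total weight, and that the inputs are flat lists over ice-free grid cells. The function name is hypothetical.

```python
import math


def split_error_change(errors_a, errors_b, lats_deg):
    """Area-weighted change in error magnitude, |a| - |b|, split by sign.

    Returns (increase, decrease): the weighted sums of positive and negative
    differences, each divided by the total weight. Weights are cos(latitude),
    an assumption appropriate for a regular latitude-longitude grid.
    """
    increase = 0.0
    decrease = 0.0
    total_weight = 0.0
    for err_a, err_b, lat in zip(errors_a, errors_b, lats_deg):
        weight = math.cos(math.radians(lat))
        diff = abs(err_a) - abs(err_b)
        total_weight += weight
        if diff > 0.0:
            increase += weight * diff
        else:
            decrease += weight * diff
    return increase / total_weight, decrease / total_weight
```

Keeping the two parts separate prevents regions where float movements reduce the errors from masking regions where they increase them, which a single net average would do.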
 The biases in the reconstructed amplitude of the UOHC annual cycle, in contrast, are noticeably increased by the movements of floats, particularly in the ACC and near the coasts. As argued in section 1.1, float movements distort the reconstructed annual cycle, especially in the regions of strong currents and high temperature gradients. The reconstruction errors attributable to float movements are comparable in magnitude to the errors in the standard case; compare Figures 4b and 7b. The errors in the tropics in the standard case, however, are only partly explained by the float movements. The effects of float movements are small in the rest of the domain, as shown by the small globally averaged difference between the errors in the two cases (approximately 1.5 W m⁻²).
 The largest effect of float movements is seen in the magnitude of the interannual difference (Figure 7c). As argued in section 1.1, large-scale oceanic currents are expected to significantly distort the reconstructed variability on the interannual time scale. Direct comparison of Figures 4c and 7c in fact implies that the float movements are the main cause of errors in the standard case, particularly in the ACC and the high-latitude North Atlantic. Large increases in errors due to advection are also found in the vicinity of the coasts and near the ice edge, where some floats become lost in the standard case. As a result, the globally averaged error is significantly reduced in the parked floats run (Table 1). The reconstructed linear warming trend becomes 0.22 W m⁻², the same value as for the actual GCM-simulated data. The global mean error in STD of the annual mean UOHC is reduced to 1.3 W m⁻².
 How large can the effects of float redistribution be? Strong eddy currents, absent from the coarse-resolution simulations, can significantly rearrange the Argo array over the course of 10 days. Although such rearrangement cannot be equally strong everywhere, it seems useful to obtain an upper bound on the importance of the advection effects. In the second highly idealized sensitivity run, the “random position” case, the effects of the advection are taken to the extreme and the floats change their positions randomly over the entire domain every 10 days, each time sampling takes place. As a result, the sampling frequency becomes more uniform spatially, with 80% of grid boxes sampled at least once a month, and only 2% sampled more than twice a month.
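The sampling statistics of such a random position experiment can be sketched as follows. This is an illustrative toy, not the procedure used in the OSSEs: it assumes equal-area grid boxes, places every float uniformly at random at each sampling time, and takes three samplings per month for a 10-day cycle. The function name, the number of boxes, and the seed are hypothetical.

```python
import random

def monthly_coverage(n_boxes, n_floats, samplings_per_month=3, seed=0):
    """Return the fractions of boxes sampled at least once and more than twice.

    Each float is moved to a uniformly random box at every sampling time,
    mimicking the idealized "random position" case.
    """
    rng = random.Random(seed)
    counts = [0] * n_boxes
    for _ in range(samplings_per_month):
        for _ in range(n_floats):
            counts[rng.randrange(n_boxes)] += 1
    at_least_once = sum(c >= 1 for c in counts) / n_boxes
    more_than_twice = sum(c > 2 for c in counts) / n_boxes
    return at_least_once, more_than_twice

if __name__ == "__main__":
    # 5000 boxes is a hypothetical grid size chosen only for the demonstration.
    print(monthly_coverage(n_boxes=5000, n_floats=3000))
```

Under these assumptions the expected fraction of boxes sampled at least once is 1 − (1 − 1/n)^m for n boxes and m profiles per month, so the coverage depends only on the ratio of profiles to boxes.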
 Differences in the magnitude of reconstruction errors between the random position and standard cases are shown in Figure 8. The random redistribution leads to a reduction in the magnitude of reconstruction errors, due to the significant overall increase in the spatial sampling coverage. The largest reduction in errors is found in the high-latitude North Atlantic and ACC. Among all fields, the largest reduction in errors is in the magnitude of the interannual difference, for which the reconstruction errors decrease in most of the domain (Figure 8c) and by more than 75% on global average (Table 1). The reconstructed linear warming trend is 0.23 Wm−2 in this case, and the global mean error in STD of the annual mean UOHC is 1.2 Wm−2. It is noteworthy that the globally averaged reconstruction errors in the interannual signal are similar between the parked floats and random position cases. In both sensitivity runs, the adverse effects of advection are effectively alleviated: by the lack of advection in the parked floats case and by the random change in float position in the random position case. This leads to the improved ability of the Argo system to reconstruct the sampled fields. The effects of random redistribution on the reconstruction of the annual cycle are more complex (Figure 8b). The reconstruction errors decrease on global average (by almost 20%), but increase in parts of the Indian Ocean and the western boundary regions of the Northern Hemisphere.
4.2. Sensitivity to the Number of Floats
 This section explores the sensitivity of the reconstruction errors to the density of spatial sampling coverage. Two sensitivity experiments, differing only in the number of floats, are analyzed: the case with the doubled number of floats (“6000 float” case) and the case with the halved number of floats (“1500 float” case). In addition to testing sensitivity of the Argo performance to the density of the spatial sampling coverage, the second experiment also addresses the consequences of the decrease in the number of Argo floats as they approach the end of their life span.
 In the 6000 float case, the sampling frequency increases, with 58% of grid boxes sampled at least every month and more than 20% sampled more than twice a month. As expected, the accuracy of the reconstruction improves for all three fields (Figure 9). The largest decreases in reconstruction errors (29 and 33%) are seen in the annual mean and interannual difference; see Table 1. The magnitude and geographical distribution of changes in reconstruction errors due to doubling the number of floats are similar to the effects of the random redistribution of floats, particularly for the annual means and the magnitudes of the annual cycle. These facts confirm the conclusion that the error reductions in the random position case are mainly caused by the increased sampling coverage of the simulated Argo array. The globally averaged error magnitude in the interannual difference, nevertheless, remains large (0.8 Wm−2 in Table 1), and the linear trend (0.64 Wm−2) is still significantly different from the actual GCM-simulated value of 0.22 Wm−2. Doubling the number of floats cannot dramatically improve the reconstruction of the global UOHC year-to-year changes, even for the coarse-resolution fields.
 A decrease in the number of floats to 1500 leads to a substantial increase in the magnitude of reconstruction errors; see Figure 10 and Table 1. The average magnitudes of errors in the annual mean and interannual difference each increase by almost 43%; the errors in the annual cycle go up by 32%. The linear trend is 1.1 Wm−2, more than five times the actual one. Despite these negative impacts of the reduced number of floats, the reconstruction of climatology (annual mean and annual cycle) remains rather accurate in the oceanic interior north of the ACC, but reconstruction of the interannual difference using 1500 floats has significantly higher errors than using 3000 floats.
5. Summary and Conclusions
 This study employs a 2° × 2° resolution OGCM to produce fields that are then subsampled in ways similar to how the Argo float array samples the ocean, and compares the fields reconstructed from this simulated “Argo array” with the direct model fields. The objective of this study is twofold: to estimate the expected accuracy of the Argo system in reconstructing oceanic state variables and their variability on large spatial scales, and to study the effects of float movements and spatial sampling coverage on such accuracy. These OSSEs suggest that the overall performance of the simulated Argo array is good, in the sense that the reconstructed large-scale features of the climatological annual mean and annual cycles of such key quantities as the temperature, salinity, UOHC and MLD are very close to the actual GCM-simulated values over most of the global ocean. However, all cases exhibit similarly significant differences between the reconstructed and actual fields in the regions of high gradients and strong currents. Although the Argo array was never intended to reconstruct sharp gradients within relatively small regions of the western boundary currents, the results also demonstrate significant errors in large parts of the ACC, particularly in the Indian sector. Large errors are also found in the regions with deep mixed layers, such as the high-latitude North Atlantic and regions south of Australia. Gradual loss of spatial sampling coverage, caused by divergent oceanic currents, additionally results in large errors in the subpolar region of the North Atlantic, parts of the ACC and some coastal areas; errors of the same origin in the tropics are much smaller.
 The errors in the interannual difference in UOHC, defined as the difference of heat content between the annual means at years 10 and 1, are significant: 0.6 Wm−2 (in flux units) on global average compared to 2.4 Wm−2 in the actual OGCM-simulated signal; the global mean error magnitude is 1.2 Wm−2. These errors are nearly doubled when the interannual difference (in flux units) is calculated over 5 instead of 10 years, which stresses the importance of a longer data record. The reconstruction of the linear warming trend in the annual mean global UOHC is particularly unreliable in this study, with errors greatly exceeding the signal itself. This can be partly explained in this study by the weak actual trend, limited extent of simulations and significant year-to-year variability. These results, nevertheless, cast doubts on the accuracy of detecting interannual global changes from the Argo data alone, particularly using data over a small number of years. Lyman and Johnson [2008] analyzed UOHC derived from sea surface height data, and estimated reconstruction errors in the UOHC linear trend from synthetic in situ measurements to be only approximately 0.1 Wm−2 (normalized to the area of the Earth). It is, however, difficult to reconcile their results and the ones presented in this study, given the largely unknown accuracy of the UOHC anomaly estimates from SSH data [Wunsch et al., 2007; Lyman and Johnson, 2008] and several idealizations in the OSSEs described here. Because of significant spatial variability in the year-to-year changes in UOHC [Harrison and Carson, 2007; Lyman and Johnson, 2008], global integrals can be strongly affected by errors in the regional estimates of UOHC. Large biases in the ACC, in particular, can distort an important contribution of this region to the global warming of the upper ocean. The largest biases, however, are found in relatively small areas near the coasts and ice edges, particularly in the high-latitude ACC and North Atlantic. Excluding these areas from the analysis leads to a substantial reduction in the error in global mean UOHC (down to 0.08 Wm−2), which puts emphasis on the importance of regional year-to-year changes in UOHC and stresses the need for more measurements in the high-latitude regions, near the coasts and ice edges.
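The sensitivity of the flux-unit error to the record length can be illustrated with a short sketch. It assumes that "flux units" means the change in column heat content (J m−2) divided by the elapsed time, which is one natural reading and is not spelled out in the text; the function name, the 365-day year, and the sample numbers are hypothetical.

```python
SECONDS_PER_YEAR = 365.0 * 24 * 3600  # assumed 365-day year

def interannual_difference_flux(heat_start, heat_end, years_apart):
    """Convert a change in heat content (J m-2) into an equivalent mean flux (W m-2)."""
    if years_apart <= 0:
        raise ValueError("years_apart must be positive")
    return (heat_end - heat_start) / (years_apart * SECONDS_PER_YEAR)

if __name__ == "__main__":
    heat_error = 3.1536e8  # hypothetical heat content error, J m-2
    # The same heat content error expressed as a flux over two record lengths.
    print(interannual_difference_flux(0.0, heat_error, 10))
    print(interannual_difference_flux(0.0, heat_error, 5))
```

If the error in the heat content difference is roughly independent of the record length, halving the interval doubles the error expressed as a flux, which is consistent with the near doubling reported for the 5 year case.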
 As discussed in section 1.1, movements of Argo floats have both positive and adverse effects on the reconstruction errors: they act to provide observations from more points in the domain, but distort the reconstructed time dependency at any location and result in the gradual loss of coverage in several regions. The relative importance of the adverse effects of float movements is expected to be larger in the regions of strong, and particularly divergent, oceanic currents and on longer time scales. It is therefore not surprising that the float movements represent a major source of errors in the ACC (particularly in the Indian sector), high latitudes of the North Atlantic and boundary regions, and on the interannual time scale. The results also suggest that, in this model, the large-scale advection is too weak to considerably improve the spatial sampling coverage. The overall effect of float movements is, therefore, to increase the reconstruction errors.
 The positive effect of float movements can be expected to become more significant in the presence of strong advection, such as that by mesoscale eddies. Powerful, rapidly changing eddy motions, which are not resolved in these simulations, can significantly rearrange the Argo array between samplings. The potential importance of such float rearrangement for the spatial sampling coverage is illustrated by the second sensitivity experiment. In this highly idealized random position experiment, the efficiency of advection is taken to the extreme, and the floats are randomly redistributed each time sampling takes place. The results exhibit a significant overall decrease in the reconstruction errors in the annual means and interannual differences. The effects of the actual eddies in the real ocean are not, however, expected to cause such rapid redistribution of Argo floats, definitely not everywhere in the World Ocean. The random change in float position is also expected to lead to underestimation of the adverse effect of advection on reconstruction of time dependency. It is therefore premature to conclude that the eddy currents can act to reduce reconstruction errors, particularly on the global scale. The mesoscale variability is likely to introduce a wealth of additional effects on Argo performance. The net effect of eddies on the accuracy of the Argo observing system remains to be seen, and is the subject of an ongoing study.
 When the number of floats is doubled, and the average spacing between floats decreases from 300 km to 215 km, the errors decrease substantially (around 30%) over most of the domain. The remaining errors in the ACC and other regions suggest that even the doubled sampling density is not sufficient for accurate reconstructions in those regions. The importance of density of spatial sampling is further illustrated by the sensitivity experiment with the halved number of floats, which exhibits a significant increase in the magnitude of reconstruction errors. The errors in climatology (multiyear average) in this case, nevertheless, remain small.
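The change in average spacing follows from simple geometry: for floats spread over a fixed area, the mean spacing scales with the inverse square root of the number of floats. A minimal sketch of this scaling, assuming a uniform distribution of floats (the function name is hypothetical):

```python
import math

def scaled_spacing_km(spacing_km, n_old, n_new):
    """Mean spacing after changing the float count, assuming uniform coverage.

    The area per float scales as 1/N, so the spacing scales as 1/sqrt(N).
    """
    if n_old <= 0 or n_new <= 0:
        raise ValueError("float counts must be positive")
    return spacing_km * math.sqrt(n_old / n_new)

if __name__ == "__main__":
    print(scaled_spacing_km(300.0, 3000, 6000))  # doubled array
    print(scaled_spacing_km(300.0, 3000, 1500))  # halved array
```

Doubling the array gives 300/√2 ≈ 212 km, close to the 215 km quoted above, while halving it increases the spacing to roughly 424 km.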
 This study illustrates significant effects of Argo float movements on the expected accuracy of reconstruction of large-scale oceanic state variables. It represents a first step in this direction and focuses on the effects of large-scale oceanic currents in a coarse-resolution OGCM. The results described here suggest that the effects of float movements are predominantly adverse, acting to distort the variability on seasonal and longer time scales. The conclusions, therefore, emphasize the need for methods compensating for these adverse effects of float movements, and advocate the utility of combining Argo data with the data from methods less affected by advection, such as moorings, XBTs and satellite-based sea level measurements. The need for additional measurements in such “problematic” areas as the ACC and the high-latitude North Atlantic is another recommendation of this study.
 The authors would like to thank two anonymous reviewers for their helpful advice on improving this manuscript. Kamenkovich and Cheng were supported by the NOAA Climate Program Office, Climate Observation Division. Kamenkovich would also like to acknowledge the support by the National Aeronautics and Space Administration grant NNG06GA66G.