Corresponding author: S. D. Jones, School of Environmental Sciences, University of East Anglia, Norwich NR4 7TJ, UK. (email@example.com)
 Understanding the variability and coherence of surface ocean pCO2 on a global scale can provide insights into its physical and biogeochemical drivers and inform future sampling strategies and data assimilation methods. We present temporal and spatial autocorrelation analyses of surface ocean pCO2 on a 5° × 5° grid using the Lamont-Doherty Earth Observatory database. The seasonal cycle is robust with an interannual autocorrelation of ∼0.4 across multiple years. The global median spatial autocorrelation (e-folding) length is 400 ± 250 km, with large variability across different regions. Autocorrelation lengths of up to 3,000 km are found along major currents and basin gyres while autocorrelation lengths as low as 50 km are found in coastal regions and other areas of physical turbulence. Zonal (east–west) autocorrelation lengths are typically longer than their meridional counterparts, reflecting the zonal nature of many major ocean features. Uncertainties in spatial autocorrelation in different ocean basins are between 42% and 73% of the calculated decorrelation length. The spatial autocorrelation length in air-sea fluxes (200 ± 150 km) is much shorter than for pCO2 due to the high variability of the gas transfer velocity.
 The ocean is estimated to absorb approximately 25% of the total anthropogenic emissions of carbon dioxide (CO2) released into the atmosphere every year [Mikaloff Fletcher et al., 2006; Le Quéré et al., 2009]. The partial pressure of CO2 at the ocean surface (pCO2) is a fundamental determinant of the rate at which CO2 is absorbed by the ocean [Fangohr and Woolf, 2007]. Thus, understanding the spatial and temporal variability of surface ocean pCO2 is critical to understanding the interaction between the atmospheric and oceanic carbon cycles.
 Some attempts at global assessments of pCO2 variability have been undertaken despite these limitations. Li et al. produced global maps of spatial autocorrelation lengths from surface ocean pCO2 based on previous data sets of pCO2 measurements, but their analysis was made on a coarse 10° × 10° grid and their results were restricted to variability on scales of <∼1,000 km only. These limitations both reduced the ability to discern long-scale autocorrelations and restricted detection of finer detail. Sweeney et al. examined the decorrelation lengths for a selection of cruise tracks with a view to estimating desired spatial sampling rates for future observation projects, but a full global analysis was not attempted.
 This paper presents a global assessment of the spatial and temporal variability of surface ocean pCO2, based on the measurements from the Lamont-Doherty Earth Observatory database (LDEO database) [Takahashi et al., 2009]. This work expands on previous analysis by using a more extensive data set, by looking at much larger spatial scales limited only by the length of individual cruise tracks, and by examining the directional features of the autocorrelation characteristics. The factors controlling the spatial variability of pCO2 are identified by decomposing the pCO2 signal into its temperature and residual components [Takahashi et al., 2002]. The study is extended to cover the spatial variability of air-sea CO2 fluxes. The differences in spatial variability between pCO2 and CO2 fluxes are identified and discussed. Finally the influence of other external drivers (winds, ocean circulation, biology) on CO2 variability is also examined directly or through the analysis of proxy variables (sea surface height, surface chlorophyll). This study thus provides a global view of the spatial and temporal coherence of surface ocean CO2 data, their underlying controls, and their correspondence with the signatures of known physical and biological drivers.
 The global view of oceanic pCO2 variability presented here can inform strategies for determining sampling rates in both space and time [Sweeney et al., 2002; Lenton et al., 2009]. It will also prove useful in a number of modeling projects: sea-air CO2 fluxes can be calculated using these data, which in turn can be used as prior estimates of ocean variability for inverse modeling techniques based on data assimilation [Rödenbeck et al., 2003]. Furthermore, knowledge of the autocorrelation characteristics of pCO2 can inform advanced methods of interpolating the sparse measurements available, such as those used for other physical and geochemical variables [Levitus, 1982]. This will provide an important improvement over the necessarily less comprehensive interpolations performed to date [Lefèvre et al., 2002; Takahashi et al., 2002; Schuster et al., 2009].
2. Data Preparation
 The LDEO database consists of ∼4.1 million individual surface ocean pCO2 measurements spanning the period 1968–2008. Outliers were detected and removed from the data to reduce the influence of erroneous entries caused by transcription errors or faulty instrumentation. The remaining measurements were converted into two separate formats for temporal and spatial autocorrelation.
 The autocorrelation calculations performed in this analysis are based on two surface ocean pCO2 data products: time series for each 5° × 5° grid cell and ship track data. Section 2 describes the treatment of the data necessary to construct the data products and their rationale. The method will be described in detail in section 3.
2.1. Data for Temporal Autocorrelation
 To compute the temporal autocorrelation, the ocean was divided into 5° × 5° grid cells and time series constructed for each. This grid size represents a compromise between a high-resolution analysis and the limitations of the available data. Daily mean values were calculated for each cell to produce a time series spanning the complete time period of the data set. For leap years, a ‘day’ was calculated as 366/365 calendar days, to produce a constant year length of 365 days throughout. While calculating the daily mean value for each grid cell, any measurements falling outside 3 standard deviations of the mean were flagged as outliers in an iterative process, repeated until no outliers were detected. 17,952 measurements (0.004%) were flagged as outliers in this manner. Further outliers were removed by examining the complete daily time series for each grid cell as follows. A linear trend for the time series was calculated and temporarily removed. Any day whose mean pCO2 level fell outside 3 standard deviations of the mean was flagged in an iterative process, again repeated until no outliers were detected. A total of 268 days' measurements (0.007%) were flagged across all grid cells.
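The iterative screening described above can be sketched as follows. This is a minimal illustration rather than the code used for the study: the function name `flag_outliers` and the choice of the population standard deviation are assumptions, and the temporary removal of the linear trend applied to the daily series is left out.

```python
from statistics import mean, pstdev


def flag_outliers(values, n_sigma=3.0):
    """Return the indices flagged by iterative sigma clipping.

    On each pass, values further than n_sigma standard deviations from
    the mean of the values still kept are flagged. Passes repeat until
    one of them flags nothing.
    """
    kept = list(range(len(values)))
    flagged = set()
    while len(kept) > 2:
        sample = [values[i] for i in kept]
        centre = mean(sample)
        spread = pstdev(sample)
        if spread == 0:
            break  # all remaining values identical, nothing left to flag
        bad = [i for i in kept if abs(values[i] - centre) > n_sigma * spread]
        if not bad:
            break
        flagged.update(bad)
        kept = [i for i in kept if i not in flagged]
    return flagged
```

With only a handful of values a single point can never lie more than 3 standard deviations from the mean, so a screen of this kind only becomes effective once a cell holds more than about ten values.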
 The measurements flagged as outliers were removed from both the binned and original data sets, which were then used as the basis for the temporal and spatial autocorrelation analysis respectively.
2.2. Data for Spatial Autocorrelation
 Calculating the spatial variability of the pCO2 measurements requires a set of data with sufficient spatial coverage over the ocean. Using the gridded data set created for the temporal autocorrelation above was not suitable: in any given day or month there was insufficient coverage to calculate the variability, and combining multiple maps from the gridded data set to produce sufficient spatial coverage would artificially increase the variability of the data as pCO2 levels changed over time. Using gridded data also restricts the spatial resolution of the final autocorrelation calculation, and prevents the detection of small-scale variability.
 The LDEO database is constructed from measurements taken along individual cruise tracks. Each cruise represents a suitable data set for assessing the spatial variability of pCO2 in the region through which it passes, with most cruises made up of several hundred measurements logged to subkilometer precision. Each cruise's measurements are also taken closely together in time, thus minimizing the effect of temporal variations in pCO2. Calculating the spatial autocorrelation for each cruise's data, then projecting the results onto a gridded map, allows a global assessment of the spatial variability of pCO2 levels to be created.
 Unfortunately, the LDEO database does not identify individual cruises: it lists only the institutes or scientists who collected the data. Extraction of specific cruise information was therefore performed by analyzing the characteristics of the data as follows. The measurements provided from a single source were grouped together and sorted by date and time. Where two consecutive measurements were taken within 10 days, both were assumed to be from the same cruise period; greater time periods between measurements were treated as boundaries between separate cruises. The 10 day period was chosen to provide a balance between maintaining coherent cruises, and accounting for reduced correlations due to large time differences between measurements. The measurements from each of these periods were split into cruises by assessing their geographical proximity. For any pair of measurements to be considered as part of the same cruise, they could not be separated by more than the distance a ship is likely to travel in the time between the two measurements. The threshold was set at a rate of 1,500 km d−1, equating to an average speed of 33 knots. While this is faster than most ships can travel, it provides some flexibility to account for errors in the recorded measurement positions and/or times. Even this threshold was not sufficient to capture accurately all cruises: in some cases, all the measurements for a cruise are recorded on a single date or at short fixed intervals, presumably where accurate time records were not available. These cruises would be split erroneously using the above threshold, and so were identified and processed manually. Finally, any cruise containing fewer than five measurements was discarded, as this was a strong indication of errors in the original data such as misrecorded ship positions. This yielded a total of 1,535 individual cruises from the LDEO database.
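The time and distance rules described above can be sketched as a single pass over the measurements from one source. This is a simplified illustration under stated assumptions: the record layout `(datetime, lat, lon)`, the function names, the great-circle (haversine) distance, and the comparison of each measurement only with the one immediately before it are all choices made here, not details taken from the LDEO processing. Cruises with missing or fixed time stamps, which were handled manually, are not covered.

```python
import math
from datetime import datetime, timedelta

MAX_GAP = timedelta(days=10)       # longest gap allowed within one cruise
MAX_SPEED_KM_PER_DAY = 1500.0      # plausibility limit on ship speed
MIN_MEASUREMENTS = 5               # shorter cruises are discarded
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two positions in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def split_into_cruises(records):
    """Split one source's (datetime, lat, lon) records into cruises."""
    cruises, current = [], []
    for rec in sorted(records, key=lambda r: r[0]):
        if current:
            prev = current[-1]
            gap = rec[0] - prev[0]
            gap_days = gap.total_seconds() / 86400.0
            dist = haversine_km(prev[1], prev[2], rec[1], rec[2])
            too_late = gap > MAX_GAP
            too_far = dist > MAX_SPEED_KM_PER_DAY * gap_days
            if too_late or too_far:
                cruises.append(current)   # close the cruise and start a new one
                current = []
        current.append(rec)
    if current:
        cruises.append(current)
    return [c for c in cruises if len(c) >= MIN_MEASUREMENTS]
```

Note that two measurements with identical time stamps but different positions fail the speed test in this sketch, which is exactly the situation that required manual processing.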
3.1. Temporal Autocorrelation
 A temporal autocorrelation function (ACF) for each 5° × 5° grid cell was calculated at monthly resolution. Since no grid cells contained a complete time series, the autocorrelations could be computed only where the original and lagged time series contained pairs of values at the same time steps. In some cases the number of paired time steps was very small, so a measure of the statistical significance of the result was required to ensure that the results were robust. The statistical significance of each ACF value was calculated as a function of the number of time steps used in the calculation using the formula
where T is the threshold of statistical significance; Q is the quantile of the cumulative distribution function of the normal distribution at 95% [Wichura, 1988]; and n is the number of values used to construct the ACF. This function gives a threshold between 0 and 1. Individual ACF values between T and 0 (either positive or negative) are not statistically significant; any such values in the calculated ACFs were discarded. A value of T ≤ 0.5 was used for this study, since the number of monthly values available for a given grid cell is relatively low. A total of 348 grid cells had ACFs containing statistically significant values at the 95% level (Figure 1a). The majority of these were in the North Pacific and North Atlantic, where most pCO2 measurements are available [Takahashi et al., 2009].
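A sketch of an autocorrelation estimate for a gappy monthly series, with a significance screen of this kind, is given below. Several points are assumptions made for illustration: the threshold is taken to be the common white-noise form T = Q/√n with Q the two-sided 95% normal quantile (about 1.96), which matches the symbols defined above but is not confirmed here; the correlation at each lag is a Pearson correlation over the available pairs; and the condition T ≤ 0.5 is read as a requirement on the number of pairs. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def significance_threshold(n, level=0.95):
    """Assumed threshold T = Q / sqrt(n) for n paired values."""
    q = NormalDist().inv_cdf((1 + level) / 2)   # two-sided quantile, ~1.96
    return q / math.sqrt(n)


def lagged_autocorrelation(series, lag):
    """Pearson correlation between a series and itself shifted by lag.

    Missing months are given as None; only time steps where both the
    original and the lagged value exist are used.
    Returns (r, number_of_pairs), with r None when it cannot be computed.
    """
    pairs = [(series[i], series[i + lag])
             for i in range(len(series) - lag)
             if series[i] is not None and series[i + lag] is not None]
    n = len(pairs)
    if n < 3:
        return None, n
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None, n
    return sxy / math.sqrt(sxx * syy), n


def significant_acf(series, max_lag, max_threshold=0.5):
    """Keep only ACF values that pass the assumed significance screen."""
    acf = {}
    for lag in range(1, max_lag + 1):
        r, n = lagged_autocorrelation(series, lag)
        if r is None:
            continue
        t = significance_threshold(n)
        if t <= max_threshold and abs(r) >= t:
            acf[lag] = r
    return acf
```

Under these assumptions a threshold of 0.5 corresponds to roughly 16 paired months, so lags supported by fewer pairs are dropped regardless of the correlation obtained.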
 Temporal ACFs at daily resolution were also calculated for each grid cell, but too few cells produced statistically significant ACFs to allow a robust analysis of the results.
3.2. Spatial Autocorrelation
 Spatial autocorrelation functions were calculated for each cruise in the LDEO database using the Moran's I technique [Moran, 1950], comparing the similarity of pairs of measurements within the cruise. Autocorrelation values for the cruise were calculated in distance groups of 50 km. For the 0–50 km bin, pairs of measurements separated by 50 km or less were assessed to give an autocorrelation value for the cruise at a distance lag of 50 km. Next, pairs of measurements separated by 50–100 km were examined and so on, to build a complete ACF covering the full distance of the cruise. This approach limited the smallest detectable autocorrelation length to 50 km, which meant that some detail was lost around coastlines where autocorrelations are likely to be very short. However, this was necessary to reduce the amount of computation required for the analysis to a feasible level.
 Any cruise from the LDEO database that covered a distance of 50 km or less was discarded from the analysis. Similarly, any cruise with a correlation length within 100 km of the overall cruise distance was also discarded, as it is likely that the correlation length was limited by the length of the cruise. A total of 1,454 cruises remained for the spatial autocorrelation analysis. The Moran's I technique includes an assessment of the statistical significance of its results. Any value that fell below the threshold of 95% significance was discarded. The decorrelation length of the measurements from each cruise was determined by the e-folding length of the ACF.
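The binned calculation can be illustrated with a small correlogram routine. It uses the standard distance-class form of Moran's I, I(d) = (n / W) Σ w_ij (x_i − x̄)(x_j − x̄) / Σ (x_i − x̄)², where w_ij is 1 when the pair separation falls in the bin and W is the number of such pairs. The sketch makes simplifying assumptions that are not taken from the text: positions are given as along-track distances in km, each bin is labelled by its upper edge, the e-folding length is the first lag at which the value drops below 1/e, and the significance test is omitted. The function names are hypothetical.

```python
import math


def moran_correlogram(positions_km, values, bin_km=50.0):
    """Moran's I per distance bin, keyed by the bin's upper edge in km."""
    n = len(values)
    if n < 2:
        return {}
    mean_v = sum(values) / n
    dev = [v - mean_v for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return {}
    sums, counts = {}, {}
    for i in range(n):
        for j in range(i + 1, n):
            dist = abs(positions_km[i] - positions_km[j])
            # (0, 50] km goes to bin 0, (50, 100] km to bin 1, and so on
            b = max(0, math.ceil(dist / bin_km) - 1)
            sums[b] = sums.get(b, 0.0) + dev[i] * dev[j]
            counts[b] = counts.get(b, 0) + 1
    # Each unordered pair is counted once in both sums and counts, so the
    # factor of two for ordered pairs cancels in the ratio.
    return {(b + 1) * bin_km: (n / counts[b]) * sums[b] / denom
            for b in sorted(sums)}


def e_folding_length(correlogram):
    """First lag (km) at which the correlogram falls below 1/e, else None."""
    for lag in sorted(correlogram):
        if correlogram[lag] < 1.0 / math.e:
            return lag
    return None
```

The double loop over pairs scales with the square of the number of measurements, which illustrates why the 50 km binning and the coarse grids used for the satellite products were needed to keep the computation manageable.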
 Many of the cruises in the data set pass through different water masses, meaning that the ACF for each cruise represents the combined autocorrelation characteristics of all the water masses encountered and any variability between them is hidden. The autocorrelation analysis for each cruise was extended to reveal this variability. For each grid cell through which the cruise passed, an ACF was calculated for the measurements taken within a reference distance of the center of the cell. The reference distance was set at five times the e-folding length of the original ACF calculated for the entire cruise.
 Global maps of spatial autocorrelation lengths were produced using the e-folding lengths of the ACFs calculated for the individual grid cells. Where more than one cruise had an ACF for a given grid cell, the mean e-folding length of all those ACFs was calculated to determine the spatial autocorrelation length for that particular cell. This produced a map of decorrelation lengths for each 5° × 5° grid cell. An accompanying map showing the number of cruises contributing to each cell's value was constructed to provide a measure of the confidence level for each cell. Additional maps were produced to show directional autocorrelations. A zonal map was computed using the 571 cruises traveling within 30° of the east–west direction, and a meridional map from the 521 cruises traveling within 30° of the north–south direction.
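The directional selection and the per-cell averaging can be sketched as below. How the heading of a cruise was determined is not stated, so the sketch assumes the initial great-circle bearing between two positions (for example the first and last points of a cruise); the function names and the label "oblique" for cruises that fit neither class are also illustrative.

```python
import math


def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in 0..360."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    return (math.degrees(math.atan2(y, x)) + 360.0) % 360.0


def classify_direction(bearing_deg, tolerance_deg=30.0):
    """Label a heading as zonal, meridional, or oblique."""
    # Angle between the heading and the north-south axis, folded to 0..90
    off_meridian = abs(((bearing_deg + 90.0) % 180.0) - 90.0)
    if off_meridian <= tolerance_deg:
        return "meridional"
    if 90.0 - off_meridian <= tolerance_deg:
        return "zonal"
    return "oblique"


def cell_means(lengths_by_cell):
    """Map each cell to (mean e-folding length, number of cruises)."""
    return {cell: (sum(v) / len(v), len(v))
            for cell, v in lengths_by_cell.items() if v}
```

Returning the cruise count alongside the mean mirrors the accompanying map of cruise numbers used as a measure of confidence.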
 Assessing the uncertainty of the spatial autocorrelation lengths was difficult because very few cruises contribute to each grid cell over much of the ocean. An estimate of the uncertainty for each grid cell was calculated as follows. The standard deviation of the autocorrelation lengths that were used in each grid cell was plotted against the mean autocorrelation length calculated for that cell, and a linear fit applied to the scatterplot. The slope of the fitted line was converted to an uncertainty expressed as a percentage of the grid cells' mean autocorrelation length. The uncertainty estimates were calculated for the global ocean as well as smaller ocean regions. Examples of the scatterplots and fitted slopes are shown in Figure 2. The linear fit used to estimate the uncertainty was robust, as illustrated by the r2 values of the linear fits (Table 1).
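The slope-based estimate can be sketched as an ordinary least-squares fit of the per-cell standard deviation against the per-cell mean. The details are assumptions made for illustration: the fit includes an intercept, the sample standard deviation is used, cells with a single cruise are skipped, and the function name `relative_uncertainty` is hypothetical.

```python
from statistics import mean, stdev


def relative_uncertainty(lengths_by_cell):
    """Return (uncertainty in percent, r squared) from a linear fit.

    lengths_by_cell maps each grid cell to the list of autocorrelation
    lengths (km) of the cruises contributing to it.
    """
    points = [(mean(v), stdev(v))
              for v in lengths_by_cell.values() if len(v) >= 2]
    if len(points) < 2:
        raise ValueError("need at least two cells with two or more cruises")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    if sxx == 0 or syy == 0:
        raise ValueError("cell means or standard deviations do not vary")
    slope = sxy / sxx                 # standard deviation per unit mean length
    r_squared = sxy ** 2 / (sxx * syy)
    return 100.0 * slope, r_squared
```

A slope of 0.59, for example, would be reported as an uncertainty of 59% of the autocorrelation length.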
Table 1. Uncertainty Levels for the Autocorrelation Lengths of pCO2 Measurements in Different Ocean Regions(a)
(a) Uncertainties are calculated as the linear relationship between the autocorrelation length for each grid cell and the standard deviation of cruise autocorrelation lengths contributing to that cell. This gives the uncertainty as a percentage of the calculated autocorrelation length. Numbers in parentheses show the r2 coefficient of the linear fit to illustrate the robustness of the uncertainty estimate. The boundary between the eastern and western North Pacific is at 170°E, and the equatorial Pacific is between 15°S and 15°N.
 We examined the pCO2 autocorrelation lengths in greater detail by extracting the temperature-driven component of the pCO2 measurements, calculated as pCO2 at a constant temperature, and a residual component representing the effect of all other processes. Following the method of Takahashi et al., pCO2 has been observed to vary with temperature at the rate
 This allows each pCO2 measurement to be decomposed into temperature and residual components:
where pCO2 and T are the in situ pCO2 and SST measurements, respectively, and the reference temperature is the global mean sea surface temperature (20.29°C), calculated from Level 3 Standard measurements from the Aqua-MODIS satellite provided by NASA/GSFC/DAAC (http://oceancolor.gsfc.nasa.gov). Spatial autocorrelation maps of each component were produced for direct comparison.
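A temperature normalization of this kind can be sketched as follows. The sketch assumes the widely used coefficient of 0.0423 per °C for the isochemical temperature dependence of seawater pCO2 associated with the Takahashi et al. method. The function names, and the split into a constant-temperature value and the remainder, are illustrative assumptions and may not match the exact decomposition used in this study.

```python
import math

TEMPERATURE_COEFF = 0.0423   # assumed d(ln pCO2)/dT, per degree C
GLOBAL_MEAN_SST = 20.29      # reference temperature in degrees C (from the text)


def pco2_at_reference_temperature(pco2, sst, t_ref=GLOBAL_MEAN_SST):
    """pCO2 (uatm) the water would have if brought to t_ref isochemically."""
    return pco2 * math.exp(TEMPERATURE_COEFF * (t_ref - sst))


def split_components(pco2, sst, t_ref=GLOBAL_MEAN_SST):
    """Return (constant-temperature pCO2, remainder) for one measurement.

    The two parts always add up to the original measurement.
    """
    at_ref = pco2_at_reference_temperature(pco2, sst, t_ref)
    return at_ref, pco2 - at_ref
```

For water warmer than the reference temperature the constant-temperature value is lower than the measured pCO2, and the remainder is positive.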
3.3. Autocorrelation of Drivers
 Spatial autocorrelation analysis was also performed on other ocean properties to determine possible drivers for the autocorrelation of pCO2 values. We used chlorophyll data from the SeaWiFS satellite, sea surface temperature (SST) data from the MODIS satellite, and sea surface height (SSH) data from AVISO. The latter was used as a proxy for ocean currents, since spatial gradients in SSH are a strong indicator of current strength and direction [Imawaki et al., 2001; van Sebille et al., 2010]. These are gridded data sets covering multiple years. To eliminate the influence of seasonal cycles and trends, a single grid was produced for each data set containing the temporally averaged data from the whole data set.
 The nature of gridded data sets means that they cannot be used to detect very short decorrelation lengths unless they are of very high resolution, at which point the computation requirements of the Moran's I technique become unmanageable. However, using a coarse grid allows an approximation of the spatial ACF for each grid cell to be obtained while maintaining realistic computation times. We used 1° × 1° grids for each of the data sets, and the decorrelation limit was set to 0.1 instead of the e-folding length to compensate for the larger distances between data points. Even so, the minimum detectable decorrelation length was 200 km with spatial autocorrelation lag steps of 100 km instead of the 50 km obtained for the pCO2 autocorrelation.
3.4. Spatial Flux Autocorrelation
 Spatial autocorrelation analyses were also performed on air-sea CO2 fluxes. Instantaneous CO2 flux values were calculated for each of the individual measurements using the standard formulation
F = k s ΔpCO2
where k is the gas transfer velocity; s the solubility; and ΔpCO2 the difference between the atmospheric and oceanic pCO2. The gas transfer velocity k was calculated using the wind formulation by Wanninkhof with bomb 14C corrections by Sweeney et al. Six-hourly wind data were taken from the ERA-Interim Reanalysis [Simmons et al., 2007] for measurements from 1989 onward, and from the ERA-40 Reanalysis [Uppala et al., 2005] for measurements prior to that date. The solubility s was calculated according to the method presented by Weiss, using the in situ temperature and salinity values from the LDEO database. The Hadley Centre's EN3 data set [Ingleby and Huddleston, 2007] was used where salinity data were missing from the LDEO database. The atmospheric pCO2 levels used to calculate ΔpCO2 were taken from the corresponding latitude in the GLOBALVIEW atmospheric CO2 database [GLOBALVIEW-CO2, 2011] for measurements from 1979 onward, and from the Mauna Loa record [Keeling et al., 1976] for measurements prior to 1979. Barometric pressure values were taken from the in situ measurements recorded in the LDEO database.
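The product of the three terms can be sketched with explicit unit handling. The quadratic wind-speed form with a Schmidt number scaling and the coefficient 0.27 are assumptions based on commonly quoted versions of the Wanninkhof and Sweeney et al. parameterizations; the exact coefficients used in this study are not given here. The Schmidt number and the solubility are taken as inputs rather than computed, and the function names and output units are illustrative.

```python
import math

HOURS_PER_YEAR = 24 * 365


def gas_transfer_velocity(wind_speed_ms, schmidt_number, coefficient=0.27):
    """Gas transfer velocity k in cm/h from the 10 m wind speed in m/s.

    Assumed form: k = coefficient * U**2 * (Sc / 660) ** -0.5
    """
    return coefficient * wind_speed_ms ** 2 * math.sqrt(660.0 / schmidt_number)


def co2_flux(k_cm_per_hour, solubility_mol_l_atm, delta_pco2_uatm):
    """Flux k * s * delta_pCO2 in mol per square metre per year.

    The sign of the result follows the sign convention chosen for
    delta_pco2_uatm by the caller.
    """
    k_m_per_year = k_cm_per_hour * 0.01 * HOURS_PER_YEAR
    s_mol_m3_atm = solubility_mol_l_atm * 1000.0      # litres to cubic metres
    delta_atm = delta_pco2_uatm * 1e-6                # microatmospheres to atmospheres
    return k_m_per_year * s_mol_m3_atm * delta_atm
```

Because k depends on the square of the wind speed, small-scale variability in the wind field feeds directly into the flux, which is relevant to the comparison of pCO2 and flux autocorrelation lengths.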
 The spatial autocorrelation of air-sea flux values was calculated in exactly the same manner as for the pCO2 values, using the same set of 1,454 cruises. Autocorrelation maps were also produced for each of the flux components k, s and ΔpCO2 to see which had the greatest influence in determining the flux decorrelation scales.
4. Results and Discussion
4.1. Temporal Autocorrelation
 The monthly temporal ACF shows almost no subseasonal variability, with a dominant seasonal cycle (Figure 1b). The e-folding length of this ACF falls between the first and second months. The 12 month autocorrelation is ∼0.46. The interannual correlation decays only very slowly (∼0.33 after 4 years), indicating that the seasonal cycle is consistent and robust. The ACF from the original data is indistinguishable from the ACF computed from the observations with the long-term trend removed. This means that the slow decay of the temporal ACF is not due to the trend in pCO2 levels, but is caused by other sources of interannual variability.
 The prominence of the seasonal cycle is not consistent across all regions. Examining the 6 and 12 month lags in the ACF for five major ocean regions (Figure 1b) shows that the seasonal cycle is strong in the North Pacific and North Atlantic, and slightly less influential in the Indian and Southern Oceans (although there is much less data available in these regions). In the equatorial Pacific, a seasonal cycle is not evident at all. This is consistent with previous analyses of the seasonal cycle of pCO2 levels [Takahashi et al., 2009].
4.2. Spatial Autocorrelation
 The decorrelation lengths calculated for each grid cell range between 50 km and 3,150 km (Figure 4a), with a median of 400 km and 25%/75% quantiles of 200 km and 650 km respectively. This reflects the large variability of the world's oceans. The zonal and meridional mean decorrelation lengths (Figure 4b) are 450 (250–850) km and 350 (200–550) km respectively. Zonal decorrelation lengths are frequently longer than their meridional counterparts (Figure 3) because many ocean currents run east–west, resulting in a zonal transport of water with similar characteristics in most regions.
 The uncertainties for the autocorrelation lengths were calculated in seven ocean regions as well as globally (Table 1). The global mean uncertainty for the map of all cruises (Figure 4a) is 59% of the calculated autocorrelation length, varying between 42% and 73% in different regions. The zonal and meridional uncertainties are 46% (19%–79%) and 37% (20%–72%) respectively. Errors in the zonal and meridional autocorrelation lengths are smaller than those found in the directionless autocorrelations because they eliminate much of the variability caused by different cruises crossing or following currents. The region with greatest uncertainty is the North Atlantic, where the uncertainty is greater than 70% in all directions. This is because there are several gyres, currents and upwelling/downwelling areas [Schmitz, 1996] in this relatively small region, including the Gulf Stream whose position varies over time [Kelly, 1991]. This means that cruises passing through this region will encounter several different water masses with different spatial variability, which may be in different locations for different cruises in the LDEO database. This accounts for the large uncertainties in spatial autocorrelation length in the North Atlantic. The varying position of the Kuroshio current and its extension [Kawabe, 1995] has a similar effect in the western North Pacific, which shows much higher zonal variability than the eastern North Pacific.
 The map of mean autocorrelation lengths highlights many of the major ocean currents and gyres as regions where autocorrelation lengths are long (1,000 km and above), especially away from the coasts (Figure 4a). The North Pacific and South Atlantic gyres are clearly discernible, as are the currents of the Indian Ocean. Short autocorrelation lengths (400 km and below) are evident where waters are heterogeneous or where different water masses are in close proximity. This is most evident in the Southern Ocean, where the water characteristics are heterogeneous [Watson and Naveira Garabato, 2006], especially around Drake Passage and the Scotia Sea [Heywood et al., 2002]. Other prominent regions of short autocorrelation lengths include the Humboldt current system off Chile, Peru and into the equatorial Pacific, where biological activity is particularly pronounced [Morales and Lange, 2004]; the North Atlantic around Iceland and Greenland, where the Gulf Stream is most prominent [Dickson and Brown, 1994]; the Kuroshio current in the western Pacific south of Japan [Taft et al., 1973]; the highly variable currents of the Caribbean Sea and Gulf of Mexico [Richardson, 2005]; and the continental shelf of the South Atlantic Bight of the United States [Jiang et al., 2008]. The North Atlantic is the only ocean basin with no obvious coherence in spatial autocorrelation lengths. This is due to the high variability of the currents in this region combined with the effects of biological activity. The distribution of autocorrelation lengths in the North Atlantic becomes much clearer when the zonal and meridional cruises are assessed separately (Figure 5).
 The accompanying map of cruise counts for each cell shows areas of most prolific coverage (and therefore greatest confidence) in the western North Pacific off the Japanese coast, the Caribbean islands and Drake Passage in the Southern Ocean, with over 50 cruises recorded in the database (Figure 4b). The North Atlantic and North Pacific have 10 or more cruises recorded over the majority of their areas. The remainder of the world's oceans are only minimally sampled outside repeat cruise tracks such as those between New Zealand and the Antarctic, and a repeated circular cruise track in the Indian Ocean.
 Further detail of spatial autocorrelation patterns can be seen by examining the zonal and meridional cruises independently (Figure 5). The extended autocorrelation lengths in the North Pacific basin (1,200 ± 700 km), the South Equatorial current (1,500 ± 500 km) and the Antarctic Circumpolar current (1,300 ± 500 km) are more clearly discernible in the zonal map along the main direction of water flow, with much shorter meridional autocorrelations of 550 ± 200 km, 450 ± 150 km and 450 ± 150 km respectively. Meridional correlations dominate in the Atlantic Ocean only, particularly in the middle to high latitudes of the North Atlantic (1,400 ± 1,000 km) and the western South Atlantic (1,100 ± 200 km). In the western tropical Atlantic the autocorrelations follow the bifurcation of the South Equatorial current on the coast of South America, forming the Brazil and North Brazil currents [da Silveira et al., 1994]. In the eastern North Atlantic, the autocorrelations are associated with the Canary Current [Schmitz, 1996]. The long meridional autocorrelations in the western North Atlantic follow the Gulf Stream and the North Atlantic Current [Flatau et al., 2003], showing the greatest dominance of meridional over zonal correlations. Cruises traveling east–west here will cross many currents carrying waters of different characteristics, thereby producing short autocorrelation lengths; north–south cruises, meanwhile, will not see this effect. These long autocorrelations extend as far north as Greenland and Iceland, where the North Atlantic Current loses its identity around Greenland and there is a large area of dense, sinking water at the limits of the thermohaline circulation [Dickson and Brown, 1994].
 A full analysis of zonal and meridional autocorrelations cannot be performed for the eastern South Pacific, the region of the Southern Ocean south of South Africa, or for much of the South Atlantic due to the unidirectional nature of the cruises in this region (Figures 5c and 5d). Comparing the autocorrelation lengths of the temperature-driven and residual components (Figure 6a) shows that the temperature-driven component is more spatially stable in much of the ocean, with 61% (17%) of grid cells reporting longer correlations for the temperature (residual) component. The residual component tends to have the longest relative autocorrelation length in the midlatitudes of the Atlantic, with similar but weaker patterns in the Pacific midlatitudes. This pattern compares well with analyses of the biological influence on surface pCO2 levels [Takahashi et al., 2002], indicating that this is a significant constituent of the residual component. The relative spatial stability of these two components varies with the seasonal cycle. In the summer months (June–August/December–February in the Northern/Southern Hemisphere), the pattern of relative spatial stability (Figure 6b) is much the same as that for the complete year, while the pattern in the winter months changes significantly (Figure 6c). Analysis of the seasonal differences in the two components (not shown) shows that this is due to a combination of the temperature component becoming less spatially stable in the winter months, and biological activity becoming more spatially stable as it decreases to a minimum in most regions.
4.3. Comparison With Drivers
 Comparing the maps of pCO2 autocorrelation lengths with those of chlorophyll, SST and SSH (Figure 7) shows the extent to which the latter variables may act as drivers for the pCO2 autocorrelations. The chlorophyll and SST maps show the same basic large-scale patterns of spatial autocorrelation, with larger autocorrelations in the central basins of the Atlantic and Pacific. Values in the eastern Indian Ocean are not well defined, since they are consistently shorter than the 200 km lower limit on detectable autocorrelation lengths for the gridded data sets and therefore show no variability. SSH autocorrelation lengths are also below the 200 km threshold across much of the global ocean, with only the equatorial and North Pacific, tropical Atlantic and portions of the Southern Ocean exhibiting longer autocorrelation lengths.
 The pattern of the chlorophyll and SST maps is visible to some extent in the map of pCO2 autocorrelation lengths, although it is obvious that these are not leading drivers of the autocorrelation length since the pCO2 map shows greater spatial variability. This is confirmed with a quantitative comparison of the maps, with pattern correlations of r2 = 0.24 and r2 = 0.21 for chlorophyll and SST respectively. The SSH map cannot be reliably compared to the pCO2 map because of the limited number of regions in which the autocorrelation length can be estimated. However, the relatively low similarity of pCO2 autocorrelation lengths to those of chlorophyll and SST, and the fact that ocean currents and gyres are clearly visible in the zonal and meridional maps of autocorrelation (Figure 5), lead to the conclusion that the physical circulation of the oceans is likely to be the largest influence on the patterns of pCO2 autocorrelation.
4.4. Flux Autocorrelation
 Spatial autocorrelation lengths for CO2 fluxes are approximately half those calculated from the pCO2 measurements (200 (150–350) km). Estimated uncertainties for the flux autocorrelation lengths are very similar to those for the pCO2 measurements (Table 1). Mapping the individual components of the flux calculation (Figure 8) reveals the primary cause of this difference. The ocean pCO2 and ΔpCO2 autocorrelation lengths are essentially identical, with a mean difference that is smaller than the 50 km resolution of this analysis; atmospheric CO2 therefore has no influence on the flux autocorrelation. Solubility autocorrelation lengths are typically longer than those of the pCO2 measurements (600 (350–950) km), but this is the parameter with by far the smallest influence over the calculated flux value, consistent with current understanding of the carbonate system [Takahashi et al., 2009]. The difference between the pCO2 measurements and the gas transfer velocity is 150 (50–350) km, which is very close to the overall difference between pCO2 and the total flux (150 (50–300) km). Pattern correlation tests show that the fluxes have a very similar distribution to both pCO2 and the gas transfer velocity, with r2 = 0.71 and r2 = 0.76 respectively. Thus we conclude that the gas transfer velocity is most influential in causing the decreased autocorrelation length in CO2 fluxes.
4.5.1. Bias Detection
 Tests for the existence of systematic biases in the data show that there are no inherent characteristics of the LDEO data set that influence the results of this study. Checks were performed to ensure that the spatial autocorrelation length for a given grid cell is not influenced by the number of cruises contributing to that value, despite observations that the regions with most cruises tend to be regions of short spatial autocorrelation length. A linear regression fit on the relationship between autocorrelation length and the number of cruises in each cell gives an r2 of 0.056, indicating that there is no meaningful relationship. Furthermore, examination of the sea surface height (SSH) (calculated using the AVISO SSH anomaly data) shows that the regions of high cruise counts and short autocorrelation lengths are regions of high SSH variability. This indicates high mesoscale variability caused by unstable currents, where short autocorrelation lengths are expected.
4.5.2. Comparison With Previous Studies
 The results of our autocorrelation analysis compare well with previous studies of pCO2 variability, but provide near-global coverage and a level of detail that better highlights oceanographic features and allows the identification of underlying drivers. The strong seasonal cycle in the temporal ACF is in agreement with similar regional studies, both in terms of interannual variability [Bates et al., 1996; Gruber et al., 2002; Wong et al., 2010] and the ability to fit harmonic curves to time series of pCO2 measurements [Schuster et al., 2009]. The spatial autocorrelation analysis also compares well with other studies examining both surface ocean pCO2 and related air-sea fluxes. The gyre and current features visible in Figures 4 and 5 are similar to those described by Li et al., but our maps provide a more coherent picture and details that were not captured therein because of the scale limitation and the coarse grid selected. The additional resolution and less restrictive limits used here significantly enhance the ability to detect and understand these characteristics. The short autocorrelation lengths in the Humboldt current region agree well with high spatial pCO2 variability associated with strong CO2 drawdowns [Lefèvre et al., 2002]. Relatively short autocorrelation lengths also agree with high spatial variability of carbon fluxes found in the south–east Atlantic [Santana-Casiano et al., 2009] and the South Atlantic Bight [Jiang et al., 2008], while the “moderate” variability in the western equatorial Pacific [Ishii et al., 2009] is reflected in autocorrelation lengths close to the global average. The autocorrelation lengths found in this study also closely match estimates of the required spatial sampling rate for pCO2 along specific cruise tracks from previous versions of the Takahashi database [Sweeney et al., 2002]. The directional autocorrelation lengths we find in the regions matching the same cruises are very close to the results from that study, which is to be expected since both studies are based upon the analysis of individual cruises. However, our analysis shows that all available data should be examined to provide a true picture of spatial variability of pCO2 across the oceans.
5. Summary and Conclusion
 The temporal and spatial autocorrelation analysis of the LDEO database of surface ocean pCO2 measurements and their corresponding air-sea fluxes provides a comprehensive insight into the global variability of these critical ocean characteristics. For pCO2 in the temporal dimension, the monthly mean ACF exhibits a robust and consistent seasonal cycle. For pCO2 in the spatial dimension, the global median and quantile autocorrelation lengths of pCO2 are 400 (200–650) km. For the air-sea CO2 flux, the global median autocorrelation length decreases to 200 (150–350) km because of the spatial variability of the gas transfer velocity. In both cases zonal correlations are longer than their meridional counterparts, indicating that ocean currents play a significant role in determining these lengths. The major ocean currents and gyres have longer correlations in both pCO2 and CO2 fluxes than those regions with less heterogeneous characteristics, consistent with the autocorrelation lengths in sea surface height.
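 The autocorrelation lengths summarized above are e-folding lengths, that is, the separation at which the autocorrelation function (ACF) falls to 1/e. The following is a minimal sketch of how such a length might be estimated, assuming evenly spaced samples along a single transect and linear interpolation between lags. The function names are hypothetical, and this is not necessarily the procedure used in this study.

```python
import math


def sample_acf(values, max_lag):
    """Sample autocorrelation of an evenly spaced series for lags 0..max_lag.

    Assumes the series is not constant (its variance must be non-zero).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    var = sum(d * d for d in dev)
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / var
        for k in range(max_lag + 1)
    ]


def efolding_length(acf, spacing_km):
    """Distance at which the ACF first drops to 1/e, or None if it never does.

    The crossing point is linearly interpolated between neighbouring lags.
    """
    threshold = 1.0 / math.e
    for k in range(1, len(acf)):
        if acf[k] <= threshold:
            frac = (acf[k - 1] - threshold) / (acf[k - 1] - acf[k])
            return (k - 1 + frac) * spacing_km
    return None


# Idealized exponential ACF with a 400 km decay scale, sampled every 50 km.
ideal_acf = [math.exp(-k * 50.0 / 400.0) for k in range(20)]
print(efolding_length(ideal_acf, 50.0))
```

 Returning `None` when the ACF never reaches 1/e keeps transects that are too short to resolve the decay from being reported as having a length.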
 The results of this study will be useful to both the measurement and modeling communities. They will inform future research into the interaction between the atmospheric and oceanic carbon cycles, and help to develop future oceanic measurement strategies. The results are particularly relevant for atmospheric CO2 inversions, which require a priori correlations in Bayesian inverse calculations to estimate CO2 fluxes from atmospheric data. Our analysis suggests that inverse calculations should incorporate a priori correlation of pCO2 patterns and compute CO2 fluxes using observed winds to optimize the information content of the available surface ocean data. Such a strategy would require the addition of a surface ocean box in inversions in order to merge the oceanic and atmospheric data streams most effectively.
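 To illustrate how an autocorrelation length could enter a Bayesian inversion as an a priori correlation, the following is a minimal sketch, assuming an exponential decay of correlation with great-circle distance and a single uniform prior uncertainty. The exponential form, the function names, and the example values are assumptions made for illustration only; an actual inversion system may specify its prior covariance differently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (degrees), haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def prior_covariance(points, sigma, length_km):
    """Prior covariance C_ij = sigma^2 * exp(-d_ij / length_km).

    points    : list of (lat, lon) pairs in degrees
    sigma     : assumed prior standard deviation, same at every point
    length_km : e-folding correlation length
    """
    return [
        [sigma ** 2 * math.exp(-great_circle_km(*p, *q) / length_km)
         for q in points]
        for p in points
    ]


# Three cell centres 5 degrees apart on the equator; 400 km is the global
# median pCO2 autocorrelation length, and sigma = 10 is an arbitrary value.
cells = [(0.0, 0.0), (0.0, 5.0), (0.0, 10.0)]
for row in prior_covariance(cells, sigma=10.0, length_km=400.0):
    print([round(c, 2) for c in row])
```

 With a 400 km length, neighbouring 5° cells on the equator (about 556 km apart) retain roughly a quarter of the prior variance as covariance, whereas a regionally varying length would strengthen or weaken that coupling.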
 We thank all the people who contributed data to the LDEO database, in particular: Thorarinn S. Arnarson, Dorothee C. E. Bakker, Nicholas R. Bates, Richard Bellarby, Wei-Jun Cai, Francisco Chavez, David W. Chipman, Cathy E. Cosca, Brune Delille, Hein J. W. de Baar, Richard A. Feely, Gernot Friederich, John Goddard, Burke Hales, Mario Hoppema, Masao Ishii, Trus Johannessen, Arne Körtzinger, Nicolas Metzl, Takashi Midorikawa, Ludger Mintrop, P. P. Murphy, Timothy Newberger, Yukihiro Nojiri, Jon Olafsson, Are Olsen, Christopher L. Sabine, Ute Schuster, Tobias Steinhoff, Stewart C. Sutherland, Peter Salomeh, Colm Sweeney, Taro Takahashi, Rik Wanninkhof, Andrew Watson, Ray F. Weiss, C. S. Wong, and H. Yoshikawa-Inoue. The SSH anomaly products were produced and distributed by Aviso (http://www.aviso.oceanobs.com/), as part of the Ssalto ground processing segment. The SST data were Level 3 Standard measurements from the Aqua-MODIS satellite provided by NASA/GSFC/DAAC (http://oceancolor.gsfc.nasa.gov). The SeaWiFS Chlorophyll data were produced by NASA/GSFC/DAAC (http://oceancolor.gsfc.nasa.gov). We thank Ute Schuster, Andrew Manning, and the reviewers for their invaluable comments and suggestions. The autocorrelation calculations presented in this paper were carried out on the High Performance Computing Cluster supported by the Research Computing Service at the University of East Anglia. Steve Jones is supported by a PhD Studentship funded by UK NERC Project reference NE/F005733/1.