Climate variability drives significant changes in the physical state of the North Pacific, and there may be important impacts of this variability on the upper ocean carbon balance across the basin. We address this issue by considering the response of seven biogeochemical ocean models to climate variability in the North Pacific. The models' upper ocean pCO2 and air-sea CO2 flux respond similarly to climate variability on seasonal to decadal timescales. Modeled seasonal cycles of pCO2 and its temperature- and non-temperature-driven components at three contrasting oceanographic sites capture the basic features found in observations (Takahashi et al., 2002, 2006; Keeling et al., 2004; Brix et al., 2004). However, particularly in the Western Subarctic Gyre, the models have difficulty representing the temporal structure of the total pCO2 seasonal cycle because it results from the difference of these two large and opposing components. In all but one model, the air-sea CO2 flux interannual variability (1σ) in the North Pacific is smaller (ranges across models from 0.03 to 0.11 PgC/yr) than in the Tropical Pacific (ranges across models from 0.08 to 0.19 PgC/yr), and the time series of the first or second EOF of the air-sea CO2 flux has a significant correlation with the Pacific Decadal Oscillation (PDO). Though air-sea CO2 flux anomalies are correlated with the PDO, their magnitudes are small (up to ±0.025 PgC/yr (1σ)). Flux anomalies are damped because anomalies in the key drivers of pCO2 (temperature, dissolved inorganic carbon (DIC), and alkalinity) are all of similar magnitude and have strongly opposing effects that damp total pCO2 anomalies.
 At interannual to decadal timescales, the physical state of the North Pacific is strongly influenced by the dominant mode of decadal-scale atmospheric variability associated with changes in the strength of the wintertime Aleutian Low [Trenberth and Hurrell, 1994]. Changes in surface wind stress alter Ekman flows, surface ocean mixing and heat fluxes. The dominant pattern of the sea surface temperature (SST) response was termed the Pacific Decadal Oscillation (PDO) by Mantua et al. . Positive PDO anomalies are associated with cold SSTs in the central and western Pacific, and with warm SSTs in the Alaska Gyre, along the North American coast, and southward into the tropics [Miller et al., 2004; Mantua et al., 1997]. A wide range of ecosystem and fisheries impacts are associated with the PDO pattern [Chai et al., 2003; Mantua et al., 1997; Polovina et al., 1995].
Russell and Wallace  consider the pattern of sea level pressure (SLP) across the Northern Hemisphere that covaries with the seasonal drawdown in the atmospheric CO2 record. They find that this pattern of SLP variability has a strong similarity to the Northern Annular Mode (NAM), and that the normalized difference vegetation index (NDVI) also varies in a spatially coherent way with the atmospheric CO2 record. They conclude that large-scale climate variability drives notable changes in the seasonal carbon drawdown by the land biosphere. Miller et al.  illustrate an intriguing correlation of atmospheric CO2 anomalies at Mauna Loa with the PDO. Since the North Pacific Ocean also responds to climate variability, is it possible that a portion of the observed variability in the atmospheric CO2 record is due to the North Pacific Ocean? In their atmospheric inversion study, Patra et al.  find such a relationship, suggesting that the sea-air CO2 flux over the North Pacific is significantly associated with the PDO at 5 months lag. Newman et al.  and Rodgers et al.  illustrate a strong relationship between the PDO and the El Niño–Southern Oscillation (ENSO) cycle, and Wong et al. [2002a] have shown that carbon cycle variability at Ocean Station Papa (OSP) in the northeast Pacific is correlated with ENSO.
 Changes in the year-to-year CO2 flux response to climate variability are primarily a function of the surface ocean response to climate changes, and there are also likely to be long-term trends in ocean CO2 uptake due to the buildup of anthropogenic CO2 in the atmosphere. However, data for quantification of interannual variability and long-term trends are relatively sparse. Sabine et al.  summarize recent observations of changes in the uptake rate of CO2 into the North Pacific. These studies suggest increased uptake in the tropics and subtropics in the recent past, but the temporal structure and magnitudes are not consistent from study to study. In the subarctic, there is more disagreement about recent changes. In the North Atlantic, Feely et al.  report that repeat hydrography observations suggest significant changes in carbon uptake rates, and results from repeat hydrography in the North Pacific will soon be available. Models can help with interpretation and synthesis of these kinds of observations. They can help to understand driving mechanisms and fill in the gaps in space and time between the observations.
 Several studies of extratropical North Pacific CO2 flux variability on interannual timescales have used time series observations at Station ALOHA near Hawaii [Dore et al., 2003; Brix et al., 2004; Keeling et al., 2004] and OSP in the northeast subarctic Pacific [Signorini et al., 2001]. Across most of the Pacific, time series data are sparse. On seasonal timescales, however, data for surface ocean pCO2 and sea-to-air CO2 fluxes in the subtropics and high latitudes are more available [Takahashi et al., 2006, 2002, 1993; Keeling et al., 2004; Sasai and Ikeda, 2003; Signorini et al., 2001]. Using cargo-ship observations in six biogeochemical provinces of the subarctic North Pacific, Chierici et al.  illustrate dramatic east-west contrasts in total pCO2 cycles due to different relative amplitudes in the cycles of its component parts. The seasonal cycle of surface ocean pCO2 results from the opposing influences of sea surface temperature (SST) and biological and physical influences on dissolved inorganic carbon (DIC). The pCO2 is positively correlated with SST, and thus temperatures drive pCO2 low in winter and high in summer. The DIC cycle has the opposite phase: Vertical mixing entrains deep waters rich in DIC in winter and increases pCO2, while in the summer, biological production reduces DIC and decreases pCO2 in the stratified surface mixed layer waters. Alkalinity and salinity cycles are small. The total pCO2 cycle at any location depends on the local balance of these components. At Station ALOHA and OSP, Brix et al.  and Signorini et al.  illustrate that the same processes drive interannual variability in pCO2.
 In this paper, we compare seven ocean biogeochemical models in the North Pacific to each other and to data on seasonal to decadal timescales. Seasonal and interannual comparisons to observations allow us to assess the accuracy of model simulations of the upper ocean carbon cycle response to climate forcing on these timescales. In light of this evaluation, we consider the patterns of basin-scale decadal timescale variability in the air-sea CO2 flux and its driving components in response to climate forcing. Specifically, we (1) assess model performance on seasonal to interannual timescales in reference to data in three distinct oceanographic regions; (2) elucidate the model mechanisms that regulate interannual variability in the North Pacific air-sea CO2 flux; and (3) highlight the strengths of and several key challenges for ocean carbon cycle modeling.
 This intercomparison benefits from the work of the Ocean-Carbon Cycle Model Intercomparison Project (OCMIP). OCMIP has illustrated that large differences in ocean model physics [Doney et al., 2004a] can significantly affect deep ocean ventilation and uptake of CFCs [Dutay et al., 2002], anthropogenic carbon and 14C [Matsumoto et al., 2004]. Many of the models presented here were included in the OCMIP process, using earlier, typically coarser-resolution versions of their physical models with uniform biogeochemical parameterizations agreed on in advance. This effort is distinct from OCMIP in that the models have been independently designed and executed. No part of these models was prescribed to be consistent. Instead, these models represent a sampling of the variety of ocean biogeochemical models used in current research to understand the carbon cycle of the North Pacific and global oceans.
 This study is additionally motivated by the fact that these and other similar models are being applied at the global scale and in other basins to estimate current carbon sinks, to explain mechanisms of ocean biogeochemical variability, and for data interpretation. Versions of these or similar models will be used in the next set of Intergovernmental Panel on Climate Change (IPCC)-class models for predictions of future climate states. Thus it is important to evaluate how well current models capture the carbon cycle and to identify ways in which models can be improved.
 In section 2, we describe the models and the data. In section 3, we present model-data comparisons on seasonal timescales at three North Pacific regions for which data are available, and in section 4, we consider model-to-model comparisons on interannual to decadal timescales. In sections 5 and 6 we discuss the comparisons and conclude.
2. Models and Data
2.1. Ocean Biogeochemical Models
 We compare North Pacific sea-to-air CO2 fluxes and features of the surface ocean carbon cycle of seven ocean biogeochemical models that are based on the following physical models: (1) ROMS-Maine, based on the Regional Ocean Modeling System physical model, configured for the Pacific Ocean [Wang and Chao, 2004]; (2) MIT, based on the MITgcm physical model [Marshall et al., 1997a, 1997b]; (3) UMD, based on the University of Maryland Pacific regional model [Christian et al., 2002]; (4) NCOM-Maine, based on the ocean component of the National Center for Atmospheric Research (NCAR) Climate System Model (CSM) [Gent et al., 1998], configured for the Pacific Ocean [Chai et al., 2003]; (5) BEC-CCSM, based on the ocean physical component of the Community Climate System Model (CCSM-3) which, in turn, is based on the Parallel Ocean Program (CCSM-POP 1.4) [Smith and Gent, 2004]; (6) MPI-Met, based on the Max-Planck-Institute ocean model (MPI-OM) [Marsland et al., 2003]; and (7) PISCES-T, based on the OPA (Océan PAralléllisé) physical model [Madec et al., 1999]. Each modeling group made independent choices for all aspects of the models, including resolution, parameterizations, ecosystem structures, and boundary and initial conditions. Model details are presented in Table 1 and the additional references noted therein.
Table 1. Model Descriptions, Ordered by Total Years of Simulation
UMD model is forced with NCEP winds. Precipitation and solar radiation are also prescribed. A coupled atmospheric component determines latent, sensible, and long wave fluxes [Murtugudde et al., 1996].
Monthly observations, without spatial variation.
Monthly observations, with seasonal and latitudinal gradients in atmospheric CO2 interpolated from NOAA/ESRL Global Monitoring Division (formerly CMDL) shipboard observations.
Average of Mauna Loa and South Pole monthly observations, without spatial variation.
 Five models are z-coordinate models. The other models, ROMS-Maine and UMD, use s-coordinates and σ-coordinates, respectively. Horizontal resolution ranges from 0.5° × 0.5° (ROMS-Maine) up to 3° × 0.6–1.8° (BEC-CCSM). Four models are global, and three (ROMS-Maine, UMD, and NCOM-Maine) are regional, with southern boundaries far from the region of analysis. Six models use National Centers for Environmental Prediction (NCEP) [Kistler et al., 2001] surface forcing, and one (NCOM-Maine) uses Comprehensive Ocean-Atmosphere Data Set (COADS) [da Silva et al., 1994]. The UMD model is unique in that the surface heat and freshwater fluxes are modeled rather than specified from reanalysis products [Murtugudde et al., 1996].
 Six models incorporate complex ecosystem representations including various nutrient, phytoplankton, zooplankton and detritus compartments. One model (MIT) uses a simpler export parameterization accounting for light limitation, nutrient limitation, and regional variability in export efficiency. All models use the work of Wanninkhof  to parameterize gas exchange. The BEC-CCSM model was run with a preindustrial atmospheric pCO2 boundary condition, which impacts mean pCO2, DIC and alkalinity in comparison to data and other models. All other models were run with anthropogenic-era atmospheric pCO2 boundary conditions based on ice core records [Enting et al., 1994], the Mauna Loa and/or South Pole time series.
 Finally, models were run for a range of timespans, from 14 years (1990–2004, ROMS-Maine) to 56 years (1948–2004, PISCES-T). In most models, some drift in the North Pacific air-sea CO2 flux is observed in the first years of the runs, and thus these years were not included in this analysis. For NCOM-Maine, we neglect years after 1992 because surface forcing was transitioned from COADS to pseudo-COADS (B. Winters, personal communication, 2000) in 1993, significantly altering the magnitude of the air-sea CO2 flux seasonal cycle. Table 1 lists both modeled and analyzed years and other distinguishing features of the seven models.
 In most models, the anthropogenic pCO2 boundary condition varies with time, and models may not be spun up long enough to be in full equilibrium with the atmospheric pCO2. Additionally, the models have different periods of simulation. To address these challenges to model-to-model comparison, we remove the mean values and detrend model results for the bulk of our analysis and focus our analysis on seasonal to decadal timescale variability. We briefly compare mean modeled quantities and fluxes to data in sections 3.1 and 4.1.
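The mean removal and detrending described above can be sketched as follows. This is a minimal illustration that assumes a linear least-squares trend; the text does not state which detrending method was used, and the function name and synthetic series are hypothetical.

```python
import numpy as np

def remove_mean_and_trend(series):
    """Subtract the least-squares straight line (mean plus linear trend)."""
    y = np.asarray(series, dtype=float)
    t = np.arange(y.size)
    slope, intercept = np.polyfit(t, y, 1)  # assumed: linear trend only
    return y - (slope * t + intercept)

# Synthetic monthly series (hypothetical values): offset + trend + seasonal cycle
months = np.arange(240)
flux = 0.5 + 0.001 * months + 0.1 * np.sin(2 * np.pi * months / 12)
anom = remove_mean_and_trend(flux)
```

After this step, series from models with different simulation periods and spin-up states can be compared in terms of their variability alone.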
2.2. Regional Surface Ocean Data
 We use comparisons of the models to seasonal data in the Western Subarctic Gyre (WSG) and Alaskan Gyre (AG) to evaluate the modeled response to seasonal forcing. In the Subtropical Gyre, we compare model results to seasonal and interannually varying data at Station ALOHA near Hawaii.
 In section 3.2, we present seasonal comparisons to ocean surface pCO2, sea surface temperature (SST) and sea surface salinity (SSS) observations from the database of Takahashi et al.  in two regions of the North Pacific (Figure 1): Kuril (46°N–50°N, 150°E–160°E) and Ocean Station Papa (OSP, 47.5°N–52.5°N, 150°W–140°W). Data are averaged across each region, binned by month and then each monthly bin is averaged to form a climatological year. At Kuril, 5330 observations from 1984–2002 are used, with 20% of binned months having data. At OSP, there are 6428 observations from 1973–2001, and 24% of binned months have data. These areas are selected to test the models' performance in oceanographic environments exhibiting contrasting seasonal variability in surface water pCO2 as well as to maximize the density of observations. The monthly variability of the observed pCO2 values is on average ±25 μatm (1σ), a large portion of which is attributable to meridional variability within each box area. Model results are average seasonal cycles of all analyzed model years (Table 1).
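The binning into a climatological year can be sketched as below. This is one reading of the procedure, assuming observations are first averaged within each year-month bin and those bin means are then averaged by calendar month; the function name and sample values are hypothetical.

```python
import numpy as np

def climatological_year(years, months, values):
    """Two-step average: per (year, month) bin, then per calendar month."""
    bins = {}
    for y, m, v in zip(years, months, values):
        bins.setdefault((int(y), int(m)), []).append(float(v))
    clim = np.full(12, np.nan)  # NaN marks calendar months with no data
    for m in range(1, 13):
        bin_means = [np.mean(v) for (_, mm), v in bins.items() if mm == m]
        if bin_means:
            clim[m - 1] = np.mean(bin_means)
    return clim

# Hypothetical pCO2 observations (uatm), all in January
clim = climatological_year([1990, 1990, 1991], [1, 1, 1], [300.0, 310.0, 330.0])
```

Averaging the bin means, rather than pooling all observations, keeps a heavily sampled year from dominating the climatology.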
 Observations of sDIC and sALK in the WSG are not available in the data set of Takahashi et al. . Instead, we present observations from Station KNOT (44°N, 155°E) [Tsurushima et al., 2002] where data were collected from June 1998 to February 2000 in all months except for March, April, and September.
Brix et al.  and Keeling et al.  presented a detailed analysis of the interannual variability of the carbon cycle at Station ALOHA, the location of the U.S. JGOFS Hawaii Ocean Time-series (HOT) program (22°45′N, 158°00′W). In this study, we compare modeled cycles and interannual variability to their SST, SSS, salinity-normalized DIC (sDIC), salinity-normalized ALK (sALK) and pCO2 data from 1988–2002. Error in the observed pCO2 is estimated at ±4 μatm (H. Brix, personal communication, 2006).
 In the Kuril and OSP regions, observed phosphate concentrations from the World Ocean Atlas (WOA01) [Conkright et al., 2002] are compared to the models, while at Station ALOHA, climatological phosphate data from the HOT program [Karl and Lukas, 1996] are presented. For nitrate-only models (NCOM-Maine, UMD), phosphate is estimated from nitrate and the Redfield ratio (N:P = 16:1). At Station ALOHA, this is a somewhat arbitrary correction as upper ocean inorganic nutrient concentrations do not follow Redfield stoichiometry [Karl et al., 2001], but this has little bearing on the model-data comparison as concentrations of both nutrients are uniformly low. WOA01 climatological temperature and salinity fields [Stephens et al., 2002; Boyer et al., 2002] are used to calculate observed mixed layer depths (MLDs) in the Kuril and OSP regions, and HOT data are used at Station ALOHA. Model MLDs are computed using time-varying temperature and salinity and then averaged to a climatological year. In both data and models, MLDs are calculated using a Δσθ criterion of 0.125 kg/m3.
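A minimal sketch of the two conversions described above: the Redfield estimate of phosphate from nitrate, and a mixed layer depth from the Δσθ = 0.125 kg/m3 criterion. It assumes the shallowest level is the surface reference and uses linear interpolation between levels; neither detail is stated in the text, and the function names and profile values are hypothetical.

```python
import numpy as np

def phosphate_from_nitrate(nitrate, n_to_p=16.0):
    """Redfield estimate: P = N / 16."""
    return np.asarray(nitrate, dtype=float) / n_to_p

def mixed_layer_depth(depth, sigma_theta, threshold=0.125):
    """Depth where sigma_theta first exceeds its shallowest value by threshold."""
    depth = np.asarray(depth, dtype=float)
    excess = np.asarray(sigma_theta, dtype=float) - sigma_theta[0]
    hits = np.nonzero(excess >= threshold)[0]
    if hits.size == 0:
        return np.nan  # criterion never met within the profile
    k = hits[0]
    # Linear interpolation between the bracketing levels (assumed)
    frac = (threshold - excess[k - 1]) / (excess[k] - excess[k - 1])
    return depth[k - 1] + frac * (depth[k] - depth[k - 1])

# Hypothetical profile: depth in m, potential density anomaly in kg/m^3
z = [0.0, 10.0, 20.0, 30.0, 50.0]
sig = [25.00, 25.00, 25.05, 25.30, 26.00]
mld = mixed_layer_depth(z, sig)
po4 = phosphate_from_nitrate(16.0)
```

For the model fields, the same function would be applied to each time step before averaging to a climatological year, as the text describes.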
2.3. Calculation of pCO2 Components
 Variability in temperature, dissolved inorganic carbon (DIC) concentrations, alkalinity (ALK) and salinity (S) impact surface ocean pCO2. Separation of these multiple influences allows a more detailed understanding of the surface ocean carbon cycle and of the models' representation of these processes. For this study, we separate these influences in two ways.
 For comparisons at the seasonal timescale, we separate the influence of temperature (pCO2-T) from all other influences (pCO2-nonT) as in the work of Takahashi et al. . The separation is based on the experimental finding of Takahashi et al.  that the temperature effect for isochemical seawater (∂lnpCO2/∂T) is 0.0423°C−1. The two components are calculated following equations 1 and 2, in which the overbar represents the temporal mean value.
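Equations 1 and 2 are not reproduced in this text. Assuming they take the standard form of the Takahashi et al. separation, with the coefficient quoted above and T in °C, they can be written as:

```latex
\begin{align}
p\mathrm{CO}_2\text{-}T &= \overline{p\mathrm{CO}_2}\,
  \exp\!\left[0.0423\,\left(T-\overline{T}\right)\right] \tag{1}\\
p\mathrm{CO}_2\text{-nonT} &= p\mathrm{CO}_2\,
  \exp\!\left[0.0423\,\left(\overline{T}-T\right)\right] \tag{2}
\end{align}
```

In this form, pCO2-T is the pCO2 expected if only temperature varied about the mean state, and pCO2-nonT is the observed pCO2 normalized to the mean temperature.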
This separation is useful because the components can be directly calculated from both the available in situ data and from the models, and it is used in the comparisons to data presented in this paper.
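As a rough illustration of how the two components might be computed from paired pCO2 and SST series, the sketch below assumes the standard Takahashi et al. form, pCO2-T = mean(pCO2)·exp[0.0423(T − mean(T))] and pCO2-nonT = pCO2·exp[0.0423(mean(T) − T)]. The function name and the synthetic data are hypothetical.

```python
import numpy as np

DLNPCO2_DT = 0.0423  # per deg C, coefficient quoted in the text

def split_pco2(pco2, sst):
    """Return (pCO2-T, pCO2-nonT) for paired pCO2 (uatm) and SST (deg C)."""
    pco2 = np.asarray(pco2, dtype=float)
    sst = np.asarray(sst, dtype=float)
    t_anom = sst - sst.mean()
    pco2_t = pco2.mean() * np.exp(DLNPCO2_DT * t_anom)
    pco2_nont = pco2 * np.exp(-DLNPCO2_DT * t_anom)
    return pco2_t, pco2_nont

# Synthetic check: a purely temperature-driven cycle about 350 uatm
sst = 10.0 + 4.0 * np.cos(2 * np.pi * np.arange(12) / 12)
pco2 = 350.0 * np.exp(DLNPCO2_DT * (sst - sst.mean()))
pco2_t, pco2_nont = split_pco2(pco2, sst)
```

For this synthetic series, the non-temperature component is flat at 350 μatm, because all of the variability was introduced through temperature.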
 For model-model comparisons at interannual to decadal timescales, we use a more detailed separation in which DIC, alkalinity and salinity influences can be calculated and presented individually [Takahashi et al., 1993],
Each term is calculated using the variability of one component (e.g., DIC, T, ALK, S) while other quantities are held at their long-term mean value. The carbonate system equilibrium constants as in the original model simulations (Table 1) are used for the calculations. Owing to the monthly values used in the post-processing, there are some differences between the total pCO2 calculated this way and the model-simulated pCO2, but they are less than ±5 μatm and without coherent spatial structures. Other evaluations not presented here show that the temperature driven component (pCO2-dT/dt) of equation (3) is almost identical to pCO2-T as calculated in equation (1), and that the sum of the other three components (pCO2-dDIC/dt, pCO2-dALK/dt, pCO2-dSSS/dt) is approximately equal to pCO2-nonT.
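Equation (3) is not reproduced in this text. A form consistent with the component names used here (pCO2-dDIC/dt, pCO2-dT/dt, pCO2-dALK/dt, pCO2-dSSS/dt) is the first-order expansion below; it is a reconstruction under that assumption, not necessarily the authors' exact notation.

```latex
\begin{equation}
\frac{d\,p\mathrm{CO}_2}{dt} \approx
  \frac{\partial p\mathrm{CO}_2}{\partial \mathrm{DIC}}\,\frac{d\,\mathrm{DIC}}{dt}
+ \frac{\partial p\mathrm{CO}_2}{\partial T}\,\frac{dT}{dt}
+ \frac{\partial p\mathrm{CO}_2}{\partial \mathrm{ALK}}\,\frac{d\,\mathrm{ALK}}{dt}
+ \frac{\partial p\mathrm{CO}_2}{\partial S}\,\frac{dS}{dt}
\tag{3}
\end{equation}
```

Each term on the right-hand side corresponds to one of the four named components, evaluated by varying a single driver while the others are held at their long-term means.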
2.4. Pacific Decadal Oscillation
Mantua et al.  define the PDO index as the leading principal component of observed monthly SST anomalies in the Pacific poleward of 20°N. The PDO time series used in this study is updated from Mantua et al.  (N. Mantua, personal communication, 2005). While the PDO is the generally accepted canonical pattern of North Pacific decadal variability, Bond et al.  illustrate that in the years since 1990, an orthogonal pattern of variability has gained energy, and thus the PDO index may, in fact, be only a partial indicator of North Pacific climate. Since our model intercomparison period spans 1951–2004 and most of the models have the majority of their timespan before 1990, we focus on PDO-related climate variability and its links to the upper ocean carbon cycle.
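The idea of a leading principal component can be illustrated with a singular value decomposition of a time-by-space anomaly matrix. This is a generic EOF sketch on synthetic data, not the procedure used to produce the PDO index in this study; it omits area weighting and any other preprocessing, and all names and values are hypothetical.

```python
import numpy as np

def leading_pc(anomalies):
    """Leading mode of a (time, space) anomaly matrix via SVD.

    Returns the standardized principal component time series, the
    spatial pattern (EOF), and the fraction of variance explained.
    """
    x = np.asarray(anomalies, dtype=float)
    x = x - x.mean(axis=0)  # remove the time mean at each grid point
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    pc = u[:, 0] * s[0]
    frac = s[0] ** 2 / np.sum(s ** 2)
    return pc / pc.std(), vt[0], frac

# Synthetic field: one oscillating pattern plus weak noise
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * np.arange(240) / 120)
pattern = np.linspace(-1.0, 1.0, 50)
field = np.outer(signal, pattern) + 0.1 * rng.standard_normal((240, 50))
pc, eof, frac = leading_pc(field)
```

The sign of a principal component is arbitrary, so comparisons against an index such as the PDO are usually made with the absolute value of the correlation or after fixing a sign convention.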
3. Seasonal and Interannual Comparison of Models to Observations
3.1. Mean Comparisons at Kuril and Station KNOT, OSP, and Station ALOHA
 In Tables 2–4, we compare mean values of SST, sea surface salinity (SSS), mixed layer depths (MLDs), phosphate (P), salinity-normalized dissolved inorganic carbon (sDIC), salinity-normalized alkalinity (sALK), and pCO2 in the Kuril region and at Station KNOT (Table 2), in the OSP region (Table 3), and at Station ALOHA (Table 4). For some models, there are substantial deviations of mean modeled quantities from the observations in the Kuril region (Table 2). Models suggest large differences in mean sDIC and sALK that contribute substantially to differences in mean pCO2; however, these values are poorly constrained by the data from Station KNOT and so it is hard to determine which model is more accurate. Models generally compare better to the data in terms of mean quantities in the OSP region (Table 3) and at Station ALOHA (Table 4).
Table 2. Mean Values in the Western Subarctic Gyre: Kuril Region (1984–2002) and Station KNOT (1998–2000)
SST, SSS, and pCO2 from Takahashi et al. . Phosphate from Conkright et al. . Number of months in each period with data: 1973–2001 (n = 83), 1990–2001 (n = 39), 1982–1998 (n = 33), 1982–2001 (n = 47), 1973–1992 (n = 44).
3.2. Seasonal Cycles at Kuril and Station KNOT, OSP, and Station ALOHA
 In this section, seasonal variability at the Kuril and OSP regions and at Stations KNOT and ALOHA of SST, SSS, MLDs, P, sDIC, sALK, pCO2, pCO2-T and pCO2-nonT are presented. These comparisons illustrate the fidelity of the models' representation of seasonal-timescale physical variability and its impacts on the upper ocean carbon cycle.
 In the Western Subarctic Gyre in the Kuril region (Figure 2), the models tend to capture the basic shape of the seasonal cycle of SST and SSS. However, several models overestimate the amplitude of the SST cycle, and the UMD model underestimates the amplitude of both the SST and SSS cycles. We note that the UMD model was developed for the tropical Pacific and extended into the subarctic with no tuning. The only freshwater boundary condition is precipitation, which is adequate in the tropical Pacific but not in the higher latitudes, as illustrated in this comparison and in the mean values presented in Table 2. For MLDs at Kuril, models range from being able to capture these quite well (ROMS-Maine, UMD, MPI-Met) to having almost no deep winter mixing (PISCES-T). Somewhat surprisingly, the wintertime phosphorus maximum is not clearly connected to the mixing depth in all models. Specifically, PISCES-T does as well as ROMS-Maine and UMD in capturing the wintertime nutrient maximum despite shallow winter mixed layers, suggesting that horizontal nutrient supply is important in PISCES-T. No model fully captures the dramatic and rapid phosphorus drawdown in spring, indicating that modeled biologically mediated nutrient and carbon removal is not fast enough. Considering the short data record at Station KNOT, the seasonal changes in SST, SSS, MLD and P are consistent with those seen in the Kuril region.
 The deviation from the mean for sDIC observed at Station KNOT is approximately ±50 mmol/m3, which is 2 to 3 times as large as the largest seasonal amplitude of the models (MIT) in the Kuril region. The observed seasonal amplitude for sALK is much smaller than that for sDIC. Considering large experimental uncertainties for ALK measurements (approximately ±10 μeq/kg for expedition-to-expedition reproducibility), it seems that the seasonal variability for sALK is small. Consistent with the observation by Wong et al. [2002b], this suggests that CaCO3 production is small in the western subarctic. Model sALK values seem to be consistent with the observations, although NCOM-Maine yields a phase opposite to that indicated by the observations. For any given model, there is not a clear relationship between the amplitudes of the phosphorus and sDIC cycle, indicating that horizontal and vertical transport and mixing and air-sea exchange are important in addition to biological DIC drawdown. Modeled sALK cycles also vary substantially, both in amplitude and phase.
 The above forcings on the carbon cycle result in the pCO2 cycle at Kuril being out of phase with the observations in all models except for UMD. The models have a low excursion of pCO2 in winter and a peak in summer, while the data show the opposite phase. This is because the amplitude of the pCO2-T cycle is either slightly overestimated, or reasonably simulated, while that of the pCO2-nonT cycle is substantially underestimated. UMD's pCO2 cycle is in phase with the observations because both the pCO2-T and pCO2-nonT cycles are too small.
 The physics and biogeochemistry of the Kuril region, where cold Oyashio Current water and Northwestern Subarctic Gyre water converge, are complex and difficult to model. Oyashio water consists of three components: extension of the Alaskan Current, outflow of the Bering Sea and outflow of the Okhotsk Sea. In the real world, this area is full of eddies formed by interactions of the fast flowing Oyashio Current with the Kuril Islands. These complexities are not well resolved in space and time with the limited number of observations. Hence the disagreements between the model results and the observations may be partly due to undersampling. Model resolution may also play a role. The highest-resolution, eddy-permitting model (ROMS-Maine) comes closest to capturing the component parts of the pCO2 cycle here. Another factor in this region may be the effect of nutrient drawdown on alkalinity. If nutrients are not sufficiently drawn down, or if the model formulation does not include adjustment of alkalinity with removal of nitric acid (MIT, UMD), alkalinity will be too low. Drawdown of 20 μmol/kg of nitrate should increase the alkalinity by this amount, which, in turn, should cause a pCO2 decrease of 30 μatm, and thus if the effect is ignored, as in two models, or too small, pCO2 values in summer will be too high by an appreciable amount. In order to capture the correct seasonal cycle of pCO2 in this complex region, simulation of both the pCO2-T and pCO2-nonT cycles needs to be quite accurate, and this is clearly a challenge for these seven models.
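The size of the alkalinity effect quoted above can be checked with a back-of-envelope calculation. The values used here are illustrative assumptions, not taken from the text: an alkalinity sensitivity factor of about −9.4 (a commonly cited mean-ocean value), a background alkalinity of about 2250 μmol/kg, and a pCO2 of about 350 μatm.

```latex
\begin{equation*}
\Delta p\mathrm{CO}_2 \approx
  \gamma_{\mathrm{ALK}}\; p\mathrm{CO}_2\;
  \frac{\Delta \mathrm{ALK}}{\mathrm{ALK}}
  = (-9.4)\times 350\ \mu\mathrm{atm}\times\frac{20}{2250}
  \approx -29\ \mu\mathrm{atm}
\end{equation*}
```

Under these assumptions the result is close to the 30 μatm decrease stated in the text; the sensitivity factor is larger in magnitude in cold subarctic waters, so the actual effect there may differ somewhat.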
 In the Alaska Gyre at OSP (Figure 3), as at Kuril, all models except for UMD capture the SST and SSS cycles reasonably well. MLDs are generally better captured at OSP than at Kuril, though PISCES-T is still too shallow. The relatively small phosphorus cycle is also better captured, but its amplitude is still underestimated by most models, except for ROMS-Maine and PISCES-T. The sDIC and sALK cycles are of magnitude comparable to these cycles at Kuril, but the lack of data makes detailed comment difficult. The models are better able to capture the pCO2-T and pCO2-nonT cycles at OSP than at Kuril, and for this reason are generally able to capture the total pCO2 cycle within the ±25 μatm data error estimate. Only a few models have a late-summer pCO2 peak that is not consistent with the data, associated primarily with too large a pCO2-T peak. Overall, these comparisons suggest that in this region of less vigorous mixing and biological cycling [Signorini et al., 2001], models perform more realistically in terms of both pCO2-T and pCO2-nonT and thus the total pCO2 cycle is better simulated.
 In the subtropics near Hawaii, we compare to the observations at Station ALOHA [Keeling et al., 2004; Brix et al., 2004] and from the HOT database [Karl and Lukas, 1996] (Figure 4). In section 3.4, we will discuss model-data comparisons of interannual variability at this location. At Station ALOHA, the models capture the seasonal cycle of SST quite well. Most models have a smaller SSS cycle than the data, and the details of the variability also differ from the observations. MLDs in the models tend to be reasonable in the summer and fall, but too deep in winter. ROMS-Maine MLDs are too shallow throughout the year while UMD MLDs are too deep. The small modeled phosphorus variation is generally consistent with the HOT data. NCOM-Maine and ROMS-Maine are best able to capture the amplitude of the sDIC cycle, while the other models underestimate the amplitude. The models are out of phase with the data for the sALK cycle. Nevertheless, most models are able to capture the pCO2 cycle at Station ALOHA because the total pCO2 cycle is primarily temperature controlled [Keeling et al., 2004], and the models capture or slightly underestimate the amplitude of the pCO2-T cycle. Underestimation of the amplitude of the pCO2-nonT cycle in most models is consistent with the underestimation of the sDIC cycle. The underestimation of this cycle compensates for the tendency to underestimate the pCO2-T cycle, and helps to make the pCO2 cycle compare quite well.
3.3. Seasonal Variability Across the North Pacific
 Mapping late summer/early fall (August, September, October) minus late winter/early spring (February, March, April) pCO2, pCO2-T and pCO2-nonT (Figure 5) illustrates the magnitude of the seasonal forcings on upper ocean pCO2 in the observations [Takahashi et al., 2002] and models across the extratropical North Pacific (15°N–60°N, 140°E–125°W). The observed pCO2-T cycle is positive across the whole basin, with largest amplitudes in the center of the subtropical gyre and in the Kuroshio region. Observed pCO2-nonT becomes increasingly negative to the north and west, with the strong biological drawdown in the central northern part of the domain that is likely driven by iron from the shelf in the eastern Bering Sea [Tyrrell et al., 2005]. To the north of the subarctic front (at approximately 45°N) and north of 36°N to the west of the dateline, the total pCO2 amplitude is weakly negative, indicating dominance of the pCO2-nonT component. In most of the Subtropical Gyre, the total pCO2 amplitude is weakly positive, indicating dominance of the pCO2-T component.
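The seasonal difference mapped in Figure 5 can be sketched as the mean of August to October minus the mean of February to April, applied to a 12-month climatology. The function name and the synthetic cycle below are hypothetical; the same function works on a (12, lat, lon) array.

```python
import numpy as np

def seasonal_amplitude(clim):
    """Mean(Aug, Sep, Oct) minus mean(Feb, Mar, Apr).

    clim: array whose leading axis holds the 12 calendar months (Jan..Dec).
    """
    clim = np.asarray(clim, dtype=float)
    late_summer = clim[[7, 8, 9]].mean(axis=0)   # Aug, Sep, Oct
    late_winter = clim[[1, 2, 3]].mean(axis=0)   # Feb, Mar, Apr
    return late_summer - late_winter

# Synthetic temperature-like cycle peaking in September (index 8)
month = np.arange(12)
cycle = 350.0 + 20.0 * np.cos(2 * np.pi * (month - 8) / 12)
amp = seasonal_amplitude(cycle)
```

A positive value indicates a summer maximum, as for the temperature-driven component, and a negative value indicates a winter maximum, as for the non-temperature component in the subarctic.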
 Modeled pCO2-T cycles broadly agree with the data, showing an enhanced cycle in the subtropical North Pacific between 30°N and 40°N from Japan across to about 160°W, the region of the Kuroshio extension and maximum in winter mixed layer depths. The Alaska Gyre (AG) shows a weaker cycle, consistent with the local comparison at OSP (Figure 3). In the center of the subtropical gyre, modeled pCO2-T cycle amplitudes range from slightly smaller than the data estimate to somewhat larger (MIT, ROMS-Maine). The exception is UMD, whose pCO2-T cycle is substantially too small.
 Modeled pCO2-nonT cycles have patterns consistent with the data, but the amplitudes are more difficult to capture, particularly to the north and west. The amplitude of the pCO2-nonT cycle is of a similar or slightly smaller magnitude in comparison to the data in the Subtropical Gyre. However, the amplitude tends not to increase sufficiently to the north and west, particularly into the Western Subarctic Gyre (WSG) and to the north where iron supply from the shelf in the eastern Bering Sea is not captured well, if at all, in any of the models. The damped nature of the cycle in the WSG is consistent with comparisons in the Kuril region (Figure 2) in which none of the models capture the full seasonal amplitude of pCO2-nonT. As explained above, this is an oceanographically complex and biogeochemically vigorous region such that these comparisons are likely impacted by both undersampling in the data and deficiencies in the simulations.
 These patterns of the pCO2-T and pCO2-nonT cycles combine to create the total pCO2 (Figure 5, column 1). Most models do not capture the right seasonality of pCO2 in the subarctic, suggesting a temperature-controlled cycle when the nontemperature component should be more dominant. ROMS-Maine captures some of the observed non-temperature-controlled behavior only in the far north, and is unable to capture it farther south because the pCO2-T is too strong and overcompensates for well-simulated pCO2-nonT. To the south and east, ROMS-Maine captures the observed pCO2 cycle very well. UMD captures the observed total pCO2 cycle well in the subarctic, but this is because both the pCO2-T and pCO2-nonT cycles are far too damped. PISCES-T best captures the spatial pattern and amplitude of the east-west gradient in the total pCO2 cycle, largely because the model is able to capture the observed east-west gradient in the pCO2-nonT cycle and also captures the pCO2-T reasonably well.
3.4. Interannual Variability Near Hawaii
 Seasonal comparisons across the North Pacific illustrate that the seasonal variability of the total pCO2 cycle results from the opposing influences of the pCO2-T and pCO2-nonT cycles. In this section, the same behavior is illustrated for both modeled and observed interannual variability at Station ALOHA.
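 The opposition between the two components can be illustrated with a minimal sketch. It assumes the pCO2-T and pCO2-nonT components are defined as in Takahashi et al. [2002], with a constant temperature sensitivity of 0.0423 per °C; the function name and the synthetic inputs are hypothetical and are not taken from the models in this study.

```python
import numpy as np

def split_pco2(pco2, sst, coeff=0.0423):
    """Split a pCO2 series into temperature and non-temperature parts.

    coeff is the isochemical temperature sensitivity (1/degC) following
    Takahashi et al. [2002]; treating it as constant is an assumption.
    """
    pco2 = np.asarray(pco2, dtype=float)
    sst = np.asarray(sst, dtype=float)
    sst_mean = sst.mean()
    # pCO2-T: mean pCO2 perturbed by the temperature departures
    pco2_t = pco2.mean() * np.exp(coeff * (sst - sst_mean))
    # pCO2-nonT: pCO2 normalized to the mean temperature
    pco2_nont = pco2 * np.exp(coeff * (sst_mean - sst))
    return pco2_t, pco2_nont

# Synthetic example: SST swings by +/-5 degC while total pCO2 stays flat
months = np.arange(12)
sst = 20.0 + 5.0 * np.cos(2.0 * np.pi * months / 12.0)
pco2 = np.full(12, 350.0)
pco2_t, pco2_nont = split_pco2(pco2, sst)
```

 In this synthetic case the total pCO2 has no seasonal cycle at all, yet both components have large cycles that are exactly out of phase, which is the limiting case of the damping described in the text.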
 In Figure 6, we compare the Station ALOHA data syntheses of Brix et al.  and Keeling et al.  to the modeled interannual variability from 1988 to 2002. The models capture most of the details of the SST variability. The models are much more challenged with respect to the SSS variability, with modeled variability that is generally much lower than observed. Some of the observed variations in SSS are mirrored by the models. However, none of the models capture the substantial freshening from approximately 1995 to 1997 that Brix et al.  suggest is due to water mass changes. The models capture the appropriate magnitude and some of the temporal features of the observed variation in sDIC, but the comparison deteriorates after 1999. For sALK, models underestimate the variability seen in the observations, with the exception of ROMS-Maine, which substantially overestimates the sALK variability.
 As in the seasonal comparisons presented earlier, modeled pCO2-T compares well to the data, consistent with the SST representation. With the exception of 2000–2001, most models capture pCO2-T variability within the ±4 μatm data uncertainty. pCO2-nonT compares well to the observations in terms of the larger, multiyear shifts seen in the observations; however, on shorter timescales, the details of the pCO2-nonT interannual variability are quantitatively different from the observations in all models across most of this time period. As with the seasonal comparisons across the extratropics, we find that the interannual pCO2 variation has the correct amplitude, although its quantitative details are not correctly modeled, and that this is primarily due to difficulty in capturing the details of the pCO2-nonT component.
3.5. Summary of Comparisons to Observations
 In order to capture the full pCO2 cycle, it is essential to capture quite accurately both the pCO2-T and pCO2-nonT components. This is a nontrivial challenge for models as the components are the net result of a wide array of biogeochemical parameterizations, boundary conditions, model resolution and parameterizations of subgridscale physics. In Figure 7, we summarize these relationships by plotting percent deviation of the modeled cycles relative to the observations for the two components. At Kuril, the modeling challenge is primarily underestimation of the amplitude of the pCO2-nonT component. At OSP the amplitude of the pCO2-nonT component is too large in some cases and too small in others, and significant overestimation of the pCO2-T component also occurs in MIT and ROMS-Maine. At Station ALOHA, despite smaller amplitude cycles that exaggerate these model-data differences, models generally do well with the pCO2-T component, but the pCO2-nonT component has some substantial deviations. Interannual comparisons at Station ALOHA (Figure 6) also illustrate difficulty with the pCO2-nonT component. By the measure used in Figure 7, and keeping in mind the data uncertainties mentioned previously, the most successful models overall are ROMS-Maine and PISCES-T in the Kuril region; PISCES-T, NCOM-Maine and BEC-CCSM in the OSP region; and MIT at Station ALOHA.
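 A measure of the kind summarized in Figure 7 could be computed as in the sketch below. The exact definition used for Figure 7 is not restated here, so this assumes the deviation is taken on the peak-to-trough seasonal amplitude; the function name and the sample values are hypothetical.

```python
import numpy as np

def percent_deviation(model_cycle, obs_cycle):
    """Percent deviation of the modeled seasonal amplitude from the observed one.

    Amplitude is taken as max minus min of each cycle (an assumption).
    """
    model_amp = np.max(model_cycle) - np.min(model_cycle)
    obs_amp = np.max(obs_cycle) - np.min(obs_cycle)
    return 100.0 * (model_amp - obs_amp) / obs_amp

obs = np.array([330.0, 340.0, 360.0, 350.0])    # hypothetical values, uatm
model = np.array([335.0, 340.0, 350.0, 345.0])  # hypothetical values, uatm
dev = percent_deviation(model, obs)
```

 A negative value indicates a damped modeled cycle, as found for pCO2-nonT at Kuril; a positive value indicates overestimation.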
 Though the models have difficulty capturing the details of the pCO2-T and pCO2-nonT seasonal cycles and interannual variability at all locations, they clearly show that these components oppose each other and damp the total pCO2 as in the observations. In the models, the mechanistic relationship of these two components is consistent with the data, causing total pCO2 seasonal cycles and interannual variability to be consistently smaller than either of the two components. Though the models are clearly imperfect, the fact that they capture these fundamental relationships makes them reasonable tools for studying the carbon cycle response to climate variability on longer timescales across the North Pacific. We proceed with this analysis in the next section.
4. Modeled Interannual to Decadal Timescale Variability
 Interannual to decadal variability in the sea-to-air CO2 flux, pCO2 and pCO2 components across the extratropical North Pacific is driven by variations in seasonal physical forcings. In this section, we compare the modeled responses to climate variability and illustrate that the mechanisms driving the seasonal cycle also control the longer-term response of the upper ocean carbon cycle.
4.1. Magnitude of the CO2 Flux: Mean and Variability
 Mean CO2 fluxes and variability are presented in Table 5. In six of the seven models, the extratropics are a sink of carbon, ranging from 0.38 to 0.79 PgC/yr. UMD suggests a small source of 0.08 PgC/yr. The Tropical Pacific (15°S–15°N, 120°E–80°W) is a source of carbon to the atmosphere (0.40 to 0.70 PgC/yr) in all models. Takahashi et al.  find a tropical (14°S–14°N) source of 0.62 ± 0.18 PgC/yr for 1995, with which the models are generally consistent.
Table 5. Pacific CO2 Flux: Mean and Variability (1σ)a
North Pacific (15–60°N, 140°E–125°W): Mean (All Years) | Variability (All Years)
Tropics (15°S–15°N, 120°E–80°W): Mean (All Years) | Variability (All Years)
a Units: PgC/yr. Variability is calculated based on detrended monthly means smoothed with a 12-month box.
 Flux variability (1σ) in the extratropical North Pacific ranges across the models from 0.03 to 0.11 PgC/yr (Figure 8a and Table 5). Comparing columns 4 and 6 in Table 5, we see that extratropical variability in the four global models is substantially smaller than tropical variability. In the regional models (UMD, ROMS-Maine, and NCOM-Maine), however, the extratropical variability ranges from 50% to 100% of the tropical variability. In UMD and NCOM-Maine, this is largely due to a tropical variability that is too small in light of previous modeling and data analyses [McKinley et al., 2004; Feely et al., 1999]. Model-to-model differences in the mean sinks for each year are due to the fact that mean pCO2 at any point in time is controlled by several competing factors (SST, DIC, ALK) that tend to be large and out of phase with each other. Since the models respond slightly differently to each forcing, they differ in their simulation of the total pCO2 and thus the CO2 flux. Evaluation of interior patterns of DIC evolution should also be undertaken to better understand these mean flux differences. The spatial patterns of surface pCO2 responses in the models and their impacts on the CO2 flux will be explored in detail in the coming sections.
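 The variability statistic described in the footnote to Table 5 can be sketched as follows. This is a minimal illustration, assuming a linear detrend and a centered 12-month running mean that drops the incomplete windows at the ends of the record; the function name and the synthetic series are hypothetical.

```python
import numpy as np

def flux_variability(monthly_flux, window=12):
    """1-sigma variability of a basin-integrated monthly flux series (PgC/yr)."""
    flux = np.asarray(monthly_flux, dtype=float)
    t = np.arange(flux.size)
    # Remove a linear trend (assumed form of the detrending)
    slope, intercept = np.polyfit(t, flux, 1)
    detrended = flux - (slope * t + intercept)
    # 12-month box filter; mode="valid" drops the partially filled ends
    box = np.ones(window) / window
    smoothed = np.convolve(detrended, box, mode="valid")
    return smoothed.std()

# Synthetic 20-year series: a growing sink, with and without a 5-year oscillation
t = np.arange(240)
trend_only = -0.5 - 0.001 * t
with_oscillation = trend_only + 0.1 * np.sin(2.0 * np.pi * t / 60.0)
```

 With this definition a steadily growing sink contributes no variability, so the statistic isolates interannual and longer departures from the trend.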
 In Table 6, we present mean CO2 fluxes for each model in each decade. Takahashi et al.  find that the Pacific north of 14°N has a 1995 mean sink of 0.49 ± 0.12 PgC/yr using the gas exchange parameterization of Wanninkhof ; two of five anthropogenic models and the preindustrial BEC-CCSM fall within this range in the 1990s. The three anthropogenic models simulating the 1970s suggest a substantial reduction in the CO2 sink during this period, a feature that can also be seen in the detrended decadal mean anomalies (Figure 8b). The 1970s was a period of generally negative phase of the PDO, and the reduced sink in this decade is consistent with the models' responses to the PDO that will be discussed in section 4.2. In the decadal mean flux results presented in Table 6 for the anthropogenic models, an increasing sink of CO2 with time, forced at least in part by increasing atmospheric pCO2, is also evident.
 In summary, the models tend to suggest a carbon sink in the extratropical North Pacific that is consistent with climatological data for most models. Additionally, the models suggest that extratropical variability is smaller than in the tropical Pacific, and roughly an order of magnitude lower than suggested by atmospheric inversions [Baker et al., 2006; Patra et al., 2005].
4.2. PDO and CO2 Flux Variability
 In this section, we use both Empirical Orthogonal Function (EOF) analysis and regression analysis to study temporal anomalies in modeled CO2 fluxes, and the pCO2 and wind speed variability driving these flux changes. We begin with EOF analysis, which uncovers the most prominent spatial patterns of CO2 flux variability on interannual timescales. This analysis indicates that the inherent variability of the modeled fluxes has a significant relationship to the PDO. We continue with regressions, in which a relationship to the PDO is assumed and the spatial patterns of CO2 fluxes, pCO2 and wind speeds that covary with the PDO are calculated. EOF analysis illustrates an inherent relationship of the PDO to CO2 fluxes, and regression analysis allows us to narrow our study to specifically consider how the fluxes and their driving components vary with the PDO.
 The first (ROMS-Maine, MIT, NCOM-Maine, BEC-CCSM, MPI-Met) or second (PISCES-T) EOF modes exhibit coherent, longitudinal bands of CO2 flux variability between about 27°N and 40°N across the Pacific basin in all models except UMD (Figure 9). The percent of the total variance explained by these EOF patterns is from 9 to 17%. The principal component of this pattern is associated with the PDO with a correlation of 0.34 (MIT) to 0.72 (BEC-CCSM). Thus the first or second modes of CO2 flux variability are correlated with the PDO, and the spatial patterns are also similar across the models. This pattern has a similar spatial structure to the first EOF of the SST that is used to define the PDO [Mantua et al., 1997], and the positive phase of the PDO is associated with colder SSTs in this region which is also consistent with an increased CO2 uptake.
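 The EOF step can be sketched with a singular value decomposition, as below. This is an illustration only: it omits area weighting and any smoothing applied in the study, and the function name, the synthetic field, and the random stand-in for the PDO index are all hypothetical.

```python
import numpy as np

def leading_eof(anomalies):
    """Leading EOF of a (time, space) anomaly matrix with time means removed.

    Returns the spatial pattern, its principal component time series,
    and the fraction of total variance explained.
    """
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    variance_fraction = s**2 / np.sum(s**2)
    pc = u[:, 0] * s[0]
    eof = vt[0]
    return eof, pc, variance_fraction[0]

# Synthetic flux anomalies: one pattern driven by an index, plus noise
rng = np.random.default_rng(0)
n_time, n_space = 120, 50
index = rng.standard_normal(n_time)               # stand-in for the PDO index
pattern = np.sin(np.linspace(0.0, np.pi, n_space))
field = np.outer(index, pattern) + 0.1 * rng.standard_normal((n_time, n_space))
field -= field.mean(axis=0)

eof, pc, frac = leading_eof(field)
corr = np.corrcoef(pc, index)[0, 1]
```

 The sign of an EOF and its principal component is arbitrary, so only the magnitude of the correlation with the index is meaningful until a sign convention is chosen.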
 The total variance and EOFs of the UMD model illustrate that very high variance in the Sea of Okhotsk dominates the regional variability. Several processes not considered in this model are important in this region, including terrestrial freshwater flux, tidally driven topographic mixing, and sea ice formation. This may obscure a PDO-like pattern as seen in the other models.
 Regression of the CO2 flux anomalies on the PDO (Figure 10) reveals modeled spatial patterns that covary with the PDO. Units are (mol CO2/m2/yr)/(1σ PDO), i.e., the flux change associated with 1 standard deviation shift in the PDO index. The local flux response to the PDO is small in all models. On the whole, the spatial details are in less agreement between models than was found with the EOF analysis. However, there is some agreement as to an increased efflux with positive PDO phase in the western subtropics between about 18°N and 27°N, and an influx to the north or northwest (MIT, UMD). In the Western Subarctic Gyre, the models generally suggest an influx anomaly with the positive phase of the PDO. Excepting MPI-Met and NCOM-Maine, the models also suggest an efflux anomaly in the Alaska Gyre.
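 A regression map in units of flux per 1σ of the PDO can be computed as in the following sketch, which assumes an ordinary least squares slope at each grid point against the standardized index; the function name and the synthetic inputs are hypothetical.

```python
import numpy as np

def regress_on_index(anomalies, index):
    """Least squares slope of each grid point on a standardized index.

    anomalies: (time, space) array; index: (time,) array.
    The result is the flux change per 1-sigma change in the index.
    """
    idx = (index - index.mean()) / index.std()
    anomalies = anomalies - anomalies.mean(axis=0)
    return idx @ anomalies / (idx @ idx)

# Synthetic check: build a field whose response to the index is known
rng = np.random.default_rng(1)
pdo = rng.standard_normal(54)                     # stand-in annual index
pattern = np.array([-0.2, 0.0, 0.3])              # hypothetical mol CO2/m2/yr per sigma
pdo_std = (pdo - pdo.mean()) / pdo.std()
flux = np.outer(pdo_std, pattern)
slope = regress_on_index(flux, pdo)
```

 Integrating such a map over the basin, weighted by grid cell area, would give a net flux anomaly per 1σ of the PDO.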
 The magnitude of the flux anomaly associated with a 1σ deviation in the PDO is calculated by integrating over the flux regression pattern (Table 7). What does this tell us about the percentage of the total CO2 flux variability in the North Pacific that is associated with the PDO? We estimate this by dividing the flux anomaly associated with a 1σ deviation in the PDO by the total variability in Table 5, and the result is shown in the last column of Table 7. From 6% (BEC-CCSM) to 38% (MIT) of the total amplitude of the variability can be attributed to the PDO. How much of a CO2 flux could a change in the PDO drive? The maximum annual excursion of the PDO for the period 1951–2004 is ±2.2σ. Multiplying this by the maximum net flux from the models (−0.025 (PgC/yr)/(1σ PDO) from ROMS-Maine, Table 7), we find a maximum flux excursion due to the PDO of ±0.06 PgC/yr (ROMS-Maine) given this set of models. This upper bound estimate for the magnitude of the CO2 flux variability correlated with the PDO is quite small with respect to the global ocean flux extremes of up to ±0.7 PgC/yr from global ocean models [Wetzel et al., 2005; McKinley et al., 2004; Obata and Kitamura, 2003; Le Quéré et al., 2003, 2000; Winguth et al., 1994], and it is even smaller with respect to terrestrial CO2 flux variability estimates of ±2 PgC/yr [Peylin et al., 2005]. These models suggest that even extreme phases of the PDO do not cause a large change in the North Pacific air-sea CO2 flux.
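 The upper bound quoted above follows from a single multiplication, written out here using only the numbers given in the text:

```latex
\[
|\Delta F|_{\max} \approx 2.2\,\sigma_{\mathrm{PDO}} \times
0.025\ \frac{\mathrm{PgC/yr}}{\sigma_{\mathrm{PDO}}}
= 0.055\ \mathrm{PgC/yr} \approx 0.06\ \mathrm{PgC/yr}
\]
\[
\frac{0.06}{0.7} \approx 0.09, \qquad \frac{0.06}{2} = 0.03
\]
```

 That is, the largest PDO-related excursion is less than one tenth of the global ocean flux extremes and about 3% of the terrestrial flux variability estimate.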
Σ regression as percent of total North Pacific variability, Table 5.
4.3. PDO Impacts on pCO2 and Wind Speed
 Why are the simulated CO2 flux anomalies correlated with the PDO, a dominant mode of surface-ocean physical variability in the North Pacific, so small? The sea-to-air flux is driven by the atmosphere-ocean pCO2 gradient, the gas transfer velocity and the solubility of CO2 in seawater. In the parameterization common to all models [Wanninkhof, 1992], the gas transfer velocity is proportional to the square of the wind speed. In this section, we consider pCO2 and wind speeds independently to understand how the PDO alters the sea-to-air CO2 flux in the models on interannual to decadal timescales.
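 The roles of the pCO2 gradient and the wind speed in the flux can be made concrete with a minimal sketch of a Wanninkhof [1992] style bulk formula. The coefficient 0.31 is the value that paper gives for short-term winds (0.39 for long-term averaged winds); which one each model uses is not stated here, so the choice, the default solubility, and the function names are assumptions for illustration.

```python
def gas_transfer_velocity(wind_speed, schmidt=660.0, coeff=0.31):
    """Gas transfer velocity in cm/hr from the 10 m wind speed in m/s.

    Quadratic wind dependence with a Schmidt number correction;
    coeff=0.31 is an assumed choice (short-term winds).
    """
    return coeff * wind_speed**2 * (schmidt / 660.0) ** -0.5

def sea_to_air_flux(wind_speed, pco2_ocean, pco2_atm,
                    solubility=3.0e-5, schmidt=660.0):
    """Sea-to-air CO2 flux in mol/m2/yr (positive = outgassing).

    pCO2 values in uatm; solubility in mol/m3/uatm (assumed typical value).
    """
    # Convert cm/hr to m/yr
    k_m_per_yr = gas_transfer_velocity(wind_speed, schmidt) * 0.01 * 24.0 * 365.0
    return k_m_per_yr * solubility * (pco2_ocean - pco2_atm)

outgassing = sea_to_air_flux(7.0, 380.0, 360.0)
uptake = sea_to_air_flux(7.0, 340.0, 360.0)
```

 In this form the sign of the flux is set entirely by the pCO2 difference, while doubling the wind speed quadruples its magnitude.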
 Regressions of annual anomalies in pCO2 and its components on the PDO are shown in Figure 11. Here we use the more detailed separation of the pCO2 components discussed in section 2.3 (equation (3)). We find that while the components of pCO2 do respond significantly to the PDO, they do so in an out-of-phase fashion that results in a small total pCO2 variability. The opposition of the driving components parallels the seasonal patterns explored in section 3.2.
 The temperature effect on pCO2 (pCO2-dT/dt; Figure 11, column 2) has a strong response to the PDO, consistent with the PDO being an index defined as the principal component associated with the first EOF of SST in the region; and the pattern of pCO2-dT/dt response is quite similar to the regression of the SST anomaly on the PDO [e.g., Mantua et al., 1997, Figure 2a]. In all models, during the positive phase of the PDO, cold SSTs across the western and central North Pacific result in negative pCO2-dT/dt. Along the west coast of North America and in the southeast of our study region, warm SST anomalies drive positive anomalies in pCO2-dT/dt in all models except UMD. The response pattern also has some similarity to that of the seasonal amplitude of pCO2-T shown in Figure 5, with a larger response in the subtropics than to the north.
 The temperature effect is largely countered by the DIC-driven component (pCO2-dDIC/dt; Figure 11, column 3) in all models, with its center of action typically in the central North Pacific. The positive phase of the PDO is associated with increased pCO2-dDIC/dt in the subtropics and the Western Subarctic Gyre. In the Alaska Gyre (AG), the response is generally negative. Thus we find that in the subtropics of all the models, the influence of increased mixing and Ekman supply of DIC [Chai et al., 2003] associated with the PDO outweighs any increases in biological carbon export. This result is consistent with analysis of a 1000-year integration of a coupled ocean-atmosphere model [Doney et al., 2006], which illustrates that export production partially counters DIC upwelling under colder SST regimes, reducing the flux of CO2 to the atmosphere. In the AG, Polovina et al.  show that shoaling mixed layers associated with the PDO increase productivity by reducing light limitation. Both reduced MLDs and increased productivity would reduce surface ocean DIC, such that the negative pCO2-dDIC/dt response found in the AG of all models except ROMS-Maine is consistent with this previous work.
 At OSP, Signorini et al.  illustrate the importance of the alkalinity term in determining the total pCO2 and the CO2 flux. These models indicate a similar sensitivity across much of the basin. The alkalinity-driven component (pCO2-dALK/dt; Figure 11, column 4) has a predominantly negative impact with positive PDO across the central Subtropical Gyre in six of the seven models. In the MIT model, the impact of alkalinity on pCO2 is more positive in the subtropics. The detailed spatial pattern of the response in alkalinity varies substantially between the models in other parts of the basin, in contrast to the pCO2-dT/dt and pCO2-dDIC/dt components that are more consistent from one model to another. The magnitude of the pCO2-dALK/dt term is similar to that of the pCO2-dT/dt and pCO2-dDIC/dt terms (column 4, Figure 11).
 Alkalinity changes are due primarily to the effects of water balance and growth of organisms secreting CaCO3 shells. However, precipitation/evaporation affects DIC and ALK in equal proportions, and the effect of DIC changes on pCO2 (∂lnpCO2/∂lnDIC = +10, global surface water mean) is similar in magnitude but opposite in sign to the effect of ALK changes (∂lnpCO2/∂lnALK = −9); hence these effects tend to cancel each other. The net effect of water balance on pCO2 is therefore small (∂lnpCO2/∂lnSal = +0.9), and a 1% increase in salinity should increase pCO2 by about 0.9% [Takahashi et al., 1993]. The production of CaCO3 shells decreases ALK, and hence increases pCO2. Although a portion of this effect is compensated by the accompanying decrease in DIC (one-half of the ALK decrease, due to the stoichiometry of CaCO3), the effect is still large: pCO2 increases by 3.2% with each 10 μmol kg−1 of CaCO3 production. On the global average, the production of CaCO3 amounts to 20% [Broecker and Peng, 1982] to 30% of the net community production [Gruber and Sarmiento, 2002]. In the western North Pacific, the production of CaCO3 is negligibly small, whereas in the eastern North Pacific, it may be as high as 70% of the net community carbon production [Wong et al., 2002b]. The tendency for anticorrelation of the alkalinity effects with the DIC effects (columns 3 and 4, Figure 11) should represent the effects of freshwater forcing, whereas deviations from this tendency suggest changes in the organic carbon/CaCO3 production ratio.
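 The CaCO3 figure can be checked with the two sensitivities quoted above. The sketch below is a linearized estimate that assumes background concentrations of DIC = 2000 μmol kg−1 and ALK = 2200 μmol kg−1; these background values and the function name are illustrative assumptions, not values taken from the text.

```python
def caco3_pco2_change(caco3, dic=2000.0, alk=2200.0,
                      gamma_dic=10.0, gamma_alk=-9.0):
    """Fractional pCO2 change from CaCO3 production (linearized).

    caco3, dic, alk in umol/kg. Each mole of CaCO3 formed removes one
    mole of DIC and two moles of ALK. gamma_dic and gamma_alk are the
    logarithmic sensitivities quoted in the text; dic and alk defaults
    are assumed background values.
    """
    dic_term = gamma_dic * (-caco3 / dic)        # DIC loss lowers pCO2
    alk_term = gamma_alk * (-2.0 * caco3 / alk)  # ALK loss raises pCO2
    return dic_term + alk_term

change = caco3_pco2_change(10.0)
print(f"pCO2 change: {100.0 * change:.1f}%")
```

 Under these assumed background values the DIC term lowers pCO2 by 5% and the ALK term raises it by about 8.2%, leaving a net increase close to the 3.2% cited in the text.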
 The salinity-driven pCO2 component (pCO2-dSSS/dt) has some regions of significant correlation with the PDO, almost always opposite in sign to the alkalinity effect. Salinity effects on pCO2 are uniformly small (<±2 μatm). The net effect on pCO2 thermodynamics of surface freshwater fluxes or CaCO3 production are significantly larger for alkalinity than for salinity.
 The similarity of modeled pCO2 component responses to physical forcing on interannual to decadal timescales is partially due to the overall fidelity of the physical and biogeochemical simulations to the real system. The fact that the underlying processes are observed and modeled with some success on seasonal and interannual timescales supports this conclusion. However, it should also be noted that the models share many physical and biogeochemical parameterizations. The impact of alternate parameterizations of gas exchange on CO2 flux response to the PDO [Signorini et al., 2001; Takahashi et al., 2002] is not tested in this study. Additionally, the fact that most models apply NCEP forcing fields should be kept in mind when the commonality of the models' response is considered.
 As expected, the total pCO2 regression pattern (Figure 11, column 1) and the CO2 flux pattern (Figure 10) are largely coherent; that is, positive pCO2 anomalies result in positive CO2 flux anomalies and vice versa. While the total ΔpCO2 determines the sign of the CO2 flux, wind speeds determine the magnitude of the flux in the parameterization of Wanninkhof . Wind speeds vary with the PDO owing to changes in the atmospheric pressure field. In both the NCEP and COADS data sets, winds are stronger over the North Pacific during the positive phase of the PDO (Figure 12). COADS winds have a stronger response with a more southerly maximum than NCEP winds. NCEP also indicates reduced wind speed in the Alaska Gyre. Regions of maximum wind speed in the subtropics are also generally consistent with the maximum flux responses in the first and second EOFs (Figure 9) that have significant PDO correlations. The wind forcing (Figure 12) complements pCO2 changes to drive sea-to-air CO2 flux variability. Analysis with the MIT model in which sea-to-air fluxes are calculated with variations in only one of pCO2 or winds (not shown) confirms that pCO2 and winds complement each other in this way.
 On all timescales considered here, temperature and nontemperature (chemical, biological, convective, advective) forcings act in opposition to determine surface ocean pCO2. Comparisons to data illustrate that models reproduce quite well the effect of temperature on surface ocean pCO2. However, the nontemperature influences on pCO2 are harder to capture, particularly in oceanographically complex regions such as the WSG. Because of difficulty with the non-temperature component, models do less well with the overall pCO2 cycle, which is the difference of these two large and out-of-phase cycles (Figures 2–4). Additionally, at interannual to decadal timescales, the models agree more closely with one another on the responses of these components than on the net pCO2 and flux anomalies (Figures 6 and 11). These seven models suggest a parallel between the seasonal and interannual pCO2 response to physical variability at the large scale, and in the robustness of models' simulation of these patterns. This suggests that where data are not available for comparison, we should have more confidence in modeled temperature responses to climate variability than in the nontemperature component responses.
 The first or second mode of variability in North Pacific CO2 flux anomalies is associated with the PDO in all but one model in this study. Since CO2 flux variability is ultimately physically forced, such an association with the dominant mode of atmospheric variability is not unexpected. Despite this relationship, we find that the magnitude of the interannual variability in the air-sea CO2 fluxes is small when compared to global ocean and terrestrial biosphere flux variations. On interannual to decadal timescales, we find that the pCO2 impacts of variability in T, DIC and ALK tend to cancel each other out, resulting in a small total variability in pCO2. The out-of-phase responses of the pCO2 components to climate variability are responsible for small net pCO2 responses and basin-integrated CO2 flux variability. The basin-integrated air-sea CO2 flux variability is also damped by cancellation between regions with opposing signs of the air-sea flux.
 The alkalinity response to climate variability on interannual timescales is as important as temperature and DIC responses. In the models, boundary conditions, restorations to SSS and other parameterizations involved with freshwater forcing determine the very different alkalinity responses, and have a notable impact on the total pCO2. Freshwater forcing also impacts the DIC response in a manner that opposes the ALK response, resulting in some cancellation of these effects. Still, model structures, parameterizations and forcings for freshwater are generally poorly constrained by observations. CaCO3 production algorithms also differ substantially. Observations of alkalinity variability on seasonal to interannual timescales to constrain the models are also quite limited. More effort is needed to better understand the alkalinity response to climate variability and its impact on pCO2, and also to improve model parameterizations and forcing fields.
 Interannual to decadal timescale responses of pCO2 and its components to the PDO are generally amplified in the higher resolution models (ROMS-Maine, MIT, NCOM-Maine; Figure 11). ROMS-Maine is also most realistic physically and biogeochemically at Kuril (Figure 2), and for pCO2-nonT across the Western Subarctic Gyre (Figure 5). At the same time, the moderately high resolution MIT model, which employs a simple export parameterization and treats alkalinity as a linear function of salinity, performs as well as several of the models with complex ecosystems and more complete carbon equilibrium systems in terms of the component seasonal cycles (Figure 7), and has decadal timescale responses to the PDO that are consistent with the other models (Figure 11). This comparison illustrates the importance of model spatial resolution to the representation of the surface ocean carbon cycle, particularly in regions that are physically complex. The coarser-resolution global models with complex ecosystems, such as BEC-CCSM and PISCES-T, perform best in less complex regions such as OSP (Figures 3 and 7).
 A specific physical feature found in the models is that SST and SSS are best reproduced in the models that prescribe surface heat and freshwater fluxes, even though mixed layers are in some cases significantly too shallow (Figures 2 and 3). This implies that the surface CO2 variability is not necessarily supported by model subsurface dynamics and thermodynamics. Further evaluation of subsurface dynamics and impacts on the carbon cycle would be a valuable undertaking.
 A specific ecosystem factor that is not accounted for in any of the models is the impact of variability in aeolian dust supply to the surface ocean. All models use either a climatological dust supply or include implicit iron limitation in their ecosystem equations. Dore et al.  find substantial interannual variations in nitrogen fixation levels at HOT that can be linked to variability in iron limitation [Grabowski et al., 2006], and thus could be impacted by interannual variations in dust supply. Mahowald et al. , using an atmospheric transport model, illustrate variations in dust column amount with the PDO over the North Pacific. In a new version of the BEC-CCSM, S. C. Doney et al. (manuscript in preparation, 2006) find that variability in iron supply drives a CO2 flux variability of 0.02 PgC/yr (1σ) in the North Pacific, equivalent to 50% of the variability due to physical variability alone (Table 5). There are many other aspects of ecosystem variability in the models that would be useful to evaluate in more detail than allowed for in this study. How ecosystem structure and forcing cause variability in export of carbon from the surface ocean would be particularly interesting to investigate.
 Seasonal variations of surface water pCO2 from seven ocean biogeochemical models have been compared with the observations made in three contrasting oceanographic environments: OSP in the eastern subarctic gyre, Station ALOHA in the subtropical gyre, and an area east of the Kuril Islands in the Oyashio/Western Subarctic Gyre area. For the first two areas, the models are found to reproduce the pCO2 seasonal amplitude and phase in a manner consistent with the observations. However, the model results for the Kuril area fail to yield a temporal structure of the total pCO2 cycle that is consistent with the observations. The highly complex oceanographic settings that involve the Western Subarctic Gyre waters, the Oyashio Current, and the outflows of the Bering and Okhotsk Seas make this area difficult to capture in physical models, particularly models with coarse resolution.
 Across much of the subarctic North Pacific, models are consistent with observations as to the amplitude of pCO2 variations on seasonal to interannual timescales, but not as to the temporal structure of these variations (Figures 2–6). The fact that the models poorly represent the net response to seasonal forcing over significant parts of the study region is a concern when models are to be employed to project the response of the ocean carbon cycle to climate change. Capturing the details of the total pCO2 variations in the high-latitude North Pacific is complicated by the fact that it is the sum of temperature and non-temperature-driven components that are large and out of phase. However, at seasonal and interannual timescales, we find the models capture these component parts significantly better than they do the full pCO2 cycle. Model-to-model agreement as to the response of DIC and temperature components to physical variability also suggests greater robustness than in the magnitudes and spatial patterns of the net pCO2 response. Thus we conclude that the small amplitude of the CO2 flux variation in the models is generally realistic, but that the temporal details of the flux variability are most likely not reliably simulated by any of the models. Even accounting for additional variability in CO2 fluxes due to variations in dust-borne iron supply (S. C. Doney et al., manuscript in preparation, 2006), it is hard to envision a mechanism by which high latitude air-sea CO2 fluxes could vary to the large degree (±0.2 to ±0.5 PgC/yr) suggested by recent inversions of atmospheric CO2 data [Baker et al., 2006; Patra et al., 2005; Rödenbeck et al., 2003; Bousquet et al., 2000] and extrapolations of data from Station ALOHA [Brix et al., 2004]. Additionally, these findings support the conclusion of Russell and Wallace  that the terrestrial carbon response to climate variability dominates the observed atmospheric CO2 response.
 Future observing strategies could take advantage of models' indication of where the centers of action of surface ocean carbon cycle response to climate variability are likely to be. Enormous benefit and understanding about the upper ocean carbon cycle has been derived from the time series at Hawaii [Karl and Lukas, 1996; Dore et al., 2003; Brix et al., 2004; Keeling et al., 2004], but this analysis and previous work [Brix et al., 2004] illustrate that this time series is not optimally located for observations of long-term variability in surface ocean pCO2 response to climate variability because it lies along the zero line in the temperature- and non-temperature-driven responses to the PDO, the canonical mode of physical variability in the North Pacific (Figure 11). A long-term, higher-latitude, central Pacific time series station would be very useful [Doney et al., 2004b].
 Four of the “Anthropogenic” models in this study that span at least 2 decades (MIT, NCOM-Maine, MPI, PISCES-T) illustrate both increasing CO2 uptake into the North Pacific, due at least in part to the increasing atmospheric pCO2 boundary condition, and decadal variability in this sink (Table 6). The reduction in uptake found in the 1970s is consistent with the work of Watanabe et al.  summarized by Sabine et al. . Though the basin-integrated air-sea CO2 flux variability is small, these models do suggest that the DIC content of the surface ocean can vary substantially with climate variability (Figure 11). In order to understand in situ observations of the changing inventory of DIC in the ocean, we need to better understand these patterns of change. As shown in this paper, models capture some of the observed variability in the surface ocean carbon cycle and thus, despite their limitations, are a tool that can aid understanding of the observed temporal and spatial variability in the CO2 sink in order to develop a more coherent picture of carbon cycle variability in the North Pacific. Better understanding and improvement of models is, of course, still needed. Additional intercomparisons that evaluate the biological and solubility pumps together in addition to air-sea CO2 variability would be useful. Understanding how modeled interior structure and variability impact the surface ocean carbon cycle would also help to interpret observations of decadal changes in the rate of uptake of anthropogenic carbon into the ocean.
 We thank the organizers of the Understanding North Pacific Carbon-cycle Changes: A Data Synthesis and Modeling Workshop for encouraging this intercomparison project. We also thank D. Vimont for helpful discussion and assistance with the analysis, and we thank A. Kozyr and S. C. Sutherland for data management. F. Chai and L. Shi acknowledge grant support from NSF (OCE 0137272) and NASA (NAG5-9348); S. Doney and I. Lima acknowledge grant support from NSF/ONR NOPP (N000140210370) and NASA (NNG05GG30G); G. McKinley acknowledges grant support from NASA (NNG05GF94G); C. Le Quéré, E. Buitenhuis and P. Wetzel acknowledge grant support from the European Union (EVK2-CT-2001-00134 and 511176[GOCE]); and T. Takahashi acknowledges grant support from NOAA (NA16GP2001). This is LDEO contribution 6882.