The global assessment of phytoplankton biomass and its variations in time and space is essential for the long-term evaluation of ocean ecosystem health and for understanding changes in the ocean carbon cycle [Field et al., 1998; Gregg and Conkright, 2002; Fasham, 2003]. The sheer size of the ocean and the costs associated with its in situ sampling have led to the deployment of satellite ocean color missions [IOCCG, 1999; McClain et al., 2004]. These global determinations of the upper ocean chlorophyll distribution have produced the first consistent views of the space/time dynamics of the ocean biosphere [Yoder et al., 1993; Longhurst, 1995; Behrenfeld et al., 2001]. However, satellite ocean color data are produced through a complex procedure that accounts for atmospheric, surface and in-water effects to yield useful products such as the chlorophyll a concentration [Gordon and Morel, 1983; McClain et al., 2004]. Some of the models used have roots in first principles while others are empirical and are constructed by statistically modeling field observations. A critical part of this procedure is the bio-optical model, which relates a measure of ocean color, the water-leaving radiance spectrum, to an in-water constituent, such as the chlorophyll concentration [Gordon and Morel, 1983; O'Reilly et al., 1998]. These models are developed and validated using limited in situ data which do not span the full range of oceanic conditions [Claustre and Maritorena, 2003]. Hence, this data limitation creates the potential for significant biases in remote sensing products, with important implications for the assessments that depend on them.
 A five-year time series of monthly satellite ocean color observations from the Sea-viewing Wide Field-of-view Sensor (SeaWiFS) [McClain et al., 2004] is used to determine surface chlorophyll concentrations (Chl) using the operational empirical bio-optical algorithm (OC4v4) [O'Reilly et al., 1998, 2000] and a semi-analytical algorithm (GSM) [Maritorena et al., 2002; Maritorena and Siegel, 2005]. Both algorithms have been developed using the best available data set of biological (chlorophyll concentrations) and optical (water-leaving radiance spectra) properties. The OC4v4 algorithm is a polynomial relationship of water-leaving radiance ratios numerically fit to global chlorophyll observations [O'Reilly et al., 1998]. It assumes that the major optically-active components in the surface ocean covary with Chl in a consistent manner globally. In contrast, the GSM algorithm considers that Chl, colored dissolved and detrital organic materials (CDM) and particulate abundances each independently affect ocean color, and these properties are retrieved simultaneously from a water-leaving radiance spectrum [Maritorena et al., 2002; Siegel et al., 2002, 2005]. Values of the parameters used in the GSM model are derived using a data set very similar to the one used for developing the OC4v4 algorithm, but one that also includes CDM and the optical backscattering due to particulates [Maritorena et al., 2002; Maritorena and Siegel, 2005]. Both algorithms are good Chl predictors, as demonstrated using a match-up data set of SeaWiFS imagery and coincident in situ observations (Table 1). When water depths are greater than 1000 m (chosen to reflect open ocean conditions), the performance of the two bio-optical algorithms is indistinguishable (Table 1).
Table 1. Column headings: OC4v4 vs. in situ; GSM vs. in situ; OC4v4 vs. in situ (Z > 1000 m); GSM vs. in situ (Z > 1000 m).
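The empirical approach can be made concrete with a minimal sketch of a maximum-band-ratio polynomial of the OC4v4 type. The coefficients below are those commonly quoted for OC4v4 [O'Reilly et al., 2000] and, together with the function name `oc4v4_chl` and the use of remote-sensing reflectances as inputs, should be treated as assumptions to be checked against the original algorithm description rather than as the operational implementation.

```python
import math

# Coefficients commonly quoted for OC4v4 (assumption; verify against
# O'Reilly et al., 2000 before any quantitative use).
OC4V4_COEFFS = (0.366, -3.067, 1.930, 0.649, -1.532)


def oc4v4_chl(rrs443, rrs490, rrs510, rrs555):
    """Estimate Chl (mg m^-3) from a maximum blue-to-green band ratio.

    Inputs are reflectances (or radiances normalized consistently) at
    443, 490, 510 and 555 nm. Only their ratio enters the calculation.
    """
    if min(rrs443, rrs490, rrs510, rrs555) <= 0.0:
        raise ValueError("all band values must be positive")
    # Use the largest of the three blue bands relative to the green band.
    ratio = max(rrs443, rrs490, rrs510) / rrs555
    r = math.log10(ratio)
    # Fourth-order polynomial in log10(ratio) gives log10(Chl).
    log_chl = sum(a * r ** i for i, a in enumerate(OC4V4_COEFFS))
    return 10.0 ** log_chl
```

Because a single band ratio is the only input, any constituent that changes the blue-to-green ratio (such as CDM) is necessarily interpreted as a change in Chl.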
 However, comparison of the two global Chl climatologies shows large qualitative and quantitative differences (Figure 1). Normalized percentage differences (ΔChl) exceed 50% over large expanses of the ocean where retrievals from the empirical algorithm (OC4v4) are greater than those from the semi-analytical algorithm (GSM). Large differences are seen poleward of 40° latitude, particularly in the northern hemisphere, where they approach 100%. On the other hand, GSM Chl values are greater than the OC4v4 Chl retrievals by as much as 50% in the clear waters of the subtropical gyres. These differences are of the same size as the errors in Chl retrievals reported for the Southern Ocean from the previous generation of satellite ocean color observations [Mitchell and Holm-Hansen, 1991; Sullivan et al., 1993], but they extend over much larger regions than the Southern Ocean alone (Figure 1).
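As an illustration of how a normalized percentage difference of this kind might be computed pixel by pixel, the sketch below assumes that ΔChl is the OC4v4 minus GSM difference normalized by the GSM value. The normalization and the function name `delta_chl_percent` are assumptions made for illustration; the text above does not state the exact definition.

```python
import math


def delta_chl_percent(chl_oc4v4, chl_gsm):
    """Percentage difference between paired Chl retrievals.

    Assumed definition: 100 * (OC4v4 - GSM) / GSM, so values are
    positive where the empirical retrieval exceeds the semi-analytical
    one. Pairs with missing or non-positive GSM values yield NaN.
    """
    result = []
    for oc4, gsm in zip(chl_oc4v4, chl_gsm):
        valid = math.isfinite(oc4) and math.isfinite(gsm) and gsm > 0.0
        result.append(100.0 * (oc4 - gsm) / gsm if valid else math.nan)
    return result


# Hypothetical retrievals (mg m^-3) for three pixels; the third has no
# usable GSM value and is masked.
example = delta_chl_percent([0.20, 0.05, 0.30], [0.10, 0.10, 0.0])
```

Under this assumed convention, an OC4v4 retrieval twice the GSM value corresponds to ΔChl = 100%.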
 The spatial patterns of ΔChl and the colored detrital material (CDM) distribution suggest a central role for CDM in creating the observed differences (Figures 1c and 1d). Regions with high average CDM retrievals correspond to regions where the OC4v4 algorithm retrieves higher Chl values than does the GSM algorithm. This can also be seen in the strong correspondence observed between ΔChl and CDM for the entire 5-year data set (Figure 2). The cause of the large differences in global Chl climatologies appears to lie in the different underlying assumptions of the two models. The semi-analytical model (GSM) is able to account for absorption of light by CDM independently, while the empirical model (OC4v4) assumes that CDM covaries in a consistent way with Chl.
 For nearly all of the ocean, the CDM signal is driven by changes in the colored dissolved organic material content (CDOM) [Siegel et al., 2002]. High quality, open ocean CDOM observations are even rarer than Chl observations [Nelson and Siegel, 2002]. That said, the GSM algorithm performs well for predicting CDOM [Siegel et al., 2005], as demonstrated using match-up data of satellite and field observations (R² = 0.61; N = 112) and meridional transect observations from the North Atlantic Ocean (R² = 0.65; N = 111). Thus, the correspondences between satellite determinations of CDM and in situ CDOM observations, and between the ΔChl and CDM signals, both suggest that the varying CDOM contribution is not properly accounted for in the OC4v4 algorithm [Siegel et al., 2005].
 Other processes could conceivably create the observed discrepancies, though it is hard to make a convincing argument for any of them. For example, land-sea interactions are not driving the observed differences, as the expected patterns from riverine inputs are largely inconsistent with the observed ΔChl distribution (Figure 1c) [see Siegel et al., 2002]. Further, it is unlikely that the observed differences are an artifact of the procedures used to correct the satellite signals for the atmospheric path, as there is no correspondence between ΔChl and retrieved aerosol property indices in either space or time (data not shown). Changes in phytoplankton community structure or photoadaptation, which alter phytoplankton light absorption per unit chlorophyll in response to light and other environmental stresses [Bricaud et al., 1998; Cota et al., 2004], are also not likely to create the observed differences. The effects of photoadaptation and community structure shifts on light absorption per unit chlorophyll are typically modeled using non-linear power-law relationships with Chl [Bricaud et al., 1998; Carder et al., 2004]. Hence, the global relationship between phytoplankton absorption and Chl will be accounted for in the empirical fittings of the two bio-optical models. Residual differences may still occur, although the observed large-scale patterns of ΔChl (Figure 1c) and their relationship to the derived CDM distribution (Figure 1d) make it unlikely that such residuals explain the discrepancy.
 We conclude that the observed differences in the two Chl retrievals are built into the models themselves. A comparison of clear water, in situ observations (in situ Chl < 0.25 mg m⁻³, as defined by Gordon and Clark [1981]) shows differences between the two models (ΔChl) that increase as a function of CDM (Figure 3a; data from Werdell and Bailey). This trend is also seen as a function of in situ spectrophotometric observations of CDM (Figure 3b; data from Siegel et al.). Type II regression statistics show weak yet significant positive relationships between ΔChl and either the algorithm-produced CDM value or the in situ observed CDOM. The correspondence between these figures produced with in situ observations and the same figure constructed with satellite data (Figure 2) is striking, supporting our argument that the observed differences are built into the algorithms. In turn, these differences are created by the limitations of the data sets used to parameterize the two bio-optical algorithms. Relatively few of the in situ data used to develop the two algorithms come from locations where ΔChl values are large (see the locations of the NOMAD data in the upper left panel of Figure 1).
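For readers unfamiliar with Type II regression, the sketch below shows one common variant, the reduced major axis (geometric mean) fit, which treats both variables as subject to error. The text does not say which Type II method was used, so the choice of reduced major axis and the function name `reduced_major_axis` are assumptions for illustration only.

```python
import math


def reduced_major_axis(x, y):
    """Reduced major axis (Type II) regression of y on x.

    Returns (slope, intercept, r), where the slope magnitude is the
    ratio of the standard deviations and its sign follows the Pearson
    correlation coefficient r.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired samples with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("x and y must each have non-zero variance")
    r = sxy / math.sqrt(sxx * syy)
    # Unlike ordinary least squares, the slope does not shrink toward
    # zero as the scatter increases.
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r
```

In the present context, `x` would hold CDM (retrieved or in situ) and `y` the corresponding ΔChl values; a weak but positive `r` with a positive slope is the pattern described above.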
 Empirical modeling requires that model performance be optimized through comparison to a development data set. Performance can, however, be significantly degraded outside the range of conditions covered by the development data set [Davis, 1977]. It is for this reason that an evolution toward mechanistic relationships is preferable. Here, we apply a semi-analytical, bio-optical model (GSM) which accounts for the independence among open ocean optical properties [Siegel et al., 2002]. Comparison of the two bio-optical models indicates that there may be a serious bias in our interpretations of satellite ocean color data. This issue revolves around the contribution that colored dissolved organic materials make to ocean color variations [Siegel et al., 2005]. Empirical algorithms, like OC4v4, assume that CDM is a fixed function of Chl for all ocean waters. The advantage of a semi-analytical approach is that it can accommodate the independent and highly variable contributions of CDM and chlorophyll to ocean optical properties.
 Implications of our results go well beyond the simple quantification of light absorption by two components of the upper ocean. Perhaps most importantly, these discrepancies between models seriously impact assessments of ocean net primary production (NPP) and the quantification of global ocean carbon cycling. For example, the Vertically Generalized Productivity Model [Behrenfeld and Falkowski, 1997; Behrenfeld et al., 2001] applied using OC4v4 Chl yields a global net primary production of 53.5 Gt C/y, whereas the GSM Chl field gives a value of 37.1 Gt C/y. This difference of 16.4 Gt C/y is particularly worrisome when compared to the magnitude of natural interannual variations in NPP, such as the 5 Gt C/y change associated with the 1997–1999 El Niño to La Niña transition [Behrenfeld et al., 2001]. The detection of longer-term temporal change in operational chlorophyll records [Gregg and Conkright, 2002] may also be easily confounded by the details of the bio-optical models identified here. For example, if CDM varies independently from Chl, then changes in global biospheric processes may be incorrectly attributed [Siegel et al., 2005]. Finally, a path for retrieving information on growth rates from space has recently been developed that is based on phytoplankton chlorophyll-to-carbon ratios derived from GSM absorption and backscattering products [Behrenfeld et al., 2005]. Clearly, any uncertainty in the attribution of absorption to CDM versus chlorophyll will impact interpretations and calculations made from satellite ocean color imagery.
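The scale of the NPP discrepancy can be checked with simple arithmetic on the values quoted above. The short sketch below does only that; the function name `npp_discrepancy` and the returned field names are hypothetical, and no productivity model is run.

```python
def npp_discrepancy(npp_oc4v4, npp_gsm, interannual_change):
    """Compare the algorithm-driven NPP difference with a natural signal.

    All inputs are in Gt C per year. Returns the absolute difference,
    the difference relative to each estimate (percent), and the ratio
    of the difference to the interannual change.
    """
    difference = npp_oc4v4 - npp_gsm
    return {
        "difference": difference,
        "percent_of_oc4v4": 100.0 * difference / npp_oc4v4,
        "percent_of_gsm": 100.0 * difference / npp_gsm,
        "ratio_to_interannual": difference / interannual_change,
    }


# Values quoted in the text: 53.5 and 37.1 Gt C/y, and the 5 Gt C/y
# change across the 1997-1999 El Nino to La Nina transition.
summary = npp_discrepancy(53.5, 37.1, 5.0)
```

With the quoted values, the difference between algorithms is a little more than three times the interannual change used for comparison.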
 The present work shows how algorithms with nearly identical validation characteristics (Table 1) can produce important differences when applied to the global ocean. This will have an important bearing on the remote assessment of ecosystem functioning and carbon cycling for the world's oceans using satellite ocean color data products. The resolution of this issue will require continued improvements in remote sensing algorithms, in the in situ data sets used to validate these models, and in the ocean viewing instrumentation deployed in space. This work clearly suggests that remote sensing algorithms must evolve from empirical toward mechanistic approaches so that the confounding influences of independent ocean optical properties can be diagnosed successfully. This path must also emphasize in situ sampling that covers the entire parameter range the ocean provides; the current limitation reflects the fact that much of the ocean remains largely unexplored [Claustre and Maritorena, 2003]. These improvements must in turn be coupled to the development of new satellite-based technologies capable of accurately separating signals from the dominant in-water constituents. It is along this path that we will reduce the uncertainty in our remote assessments of the ocean biosphere due to competing ocean optical constituents.