 Growing recognition of the importance of natural and anthropogenic aerosols in climate research led to numerous efforts to obtain information on aerosols based on model simulations, satellite remote sensing, and ground observations. This study describes an approach to combine information from independent sources that complement each other in their capabilities to achieve a global characterization of monthly mean clear-sky daytime aerosol optical depth. The following sources of information have been used: simulations from the Global Ozone Chemistry Aerosol Radiation and Transport (GOCART) model; retrievals from the Moderate Resolution Imaging Spectroradiometer (MODIS) instrument on the Terra satellite; and measurements from the Aerosol Robotic Network (AERONET). Leading empirical orthogonal functions (EOFs) are used to represent the significant variation signals from model and satellite results; the EOFs are fitted to the ground observations to propagate the AERONET information at a global scale. The methodology is implemented with a 2-year time record when collocated data from all three sources are available.
 Numerous approaches have been developed to study large-scale atmospheric aerosols based on remote sensing and model simulations. Major sensors used for AOD retrievals include the advanced very high resolution radiometer (AVHRR) [Rao et al., 1989; Stowe et al., 1997; Husar et al., 1997; Higurashi and Nakajima, 1999; Mishchenko et al., 1999]; the Total Ozone Mapping Spectrometer (TOMS) [Herman et al., 1997; Torres et al., 1998, 2002]; Polarization and Directionality of the Earth's Reflectance (POLDER) [Goloub et al., 1999; Deuzé et al., 2001]; Moderate resolution Imaging Spectroradiometer (MODIS) [Kaufman et al., 1997; Tanré et al., 1997]; and Multiangle Imaging Spectroradiometer (MISR) [Martonchik et al., 1998]. Detailed descriptions of spaceborne remote sensing of aerosol properties are presented in the work of King et al. . Model simulations of the wide spectrum of aerosol types are provided by chemical transport models (CTMs) that are off-line modules driven by meteorological data or from global circulation models (GCMs) which take aerosol processes as an integrated part within the simulation scheme. Description, intercomparison of models and evaluation against satellite retrievals and ground observations are presented in the work of Penner et al.  and Kinne et al. [2001, 2003]. Very few historical ground measurements of aerosol properties are available due to limitations of instrument maintenance and calibration, and degradation of the filters used. Recently, a centrally maintained ground-based Aerosol Robotic Network (AERONET) has been in operation for more than 10 years to provide accurate point measurements at more than 100 stations [Holben et al., 1998, 2001].
 Each of the above approaches has advantages as well as deficiencies. Ground observations give accurate point information, yet, are limited in spatial coverage. Satellites have improved geographical coverage, but the accuracy of the retrieved values is affected by surface conditions, cloud contamination, and uncertainties about aerosol microphysical and chemical properties. Models capture the mechanisms of aerosol production, transformation, transport and deposition and provide a comprehensive description of aerosol properties, but the complex processes are simulated with highly parameterized schemes which need continuous evaluation. Integrated analysis is required to combine the useful aspects of the individual data sources to give a complete description [Charlson, 2001; Diner et al., 2004].
 Optimal assimilation of AOD on a global scale from multiple data sources requires reliable error information. Obtaining accurate estimates of error variance and covariance structure remains a challenge given the limited “ground truth.” In this work, an empirical method is presented for obtaining representative monthly grid area averaged clear-sky daytime AOD by combining the advantages of each data set. Temporally collocated monthly mean AOD at 0.55 μm from satellite retrievals, model simulations and ground measurements are used. As a major sensor designed to provide high quality, routine retrievals both over ocean and land, MODIS data are selected; GOCART model which produces reasonable spatial structures [Chin et al., 2000] is utilized; the best available ground measurements are taken from the AERONET. Analysis was performed for a 2-year period (March 2000 to February 2002) and spatial domain between 60°S and 60°N where most MODIS retrievals and AERONET stations exist. To obtain a global field, extrapolation to high latitudes has been performed based on the spatial distribution of the GOCART model results.
 Data sources used are described in section 2; a quality check of MODIS and AERONET data is presented in section 3; comparison of spatial and temporal variability between GOCART and MODIS data is given in section 4; in section 5 the empirical combination method is introduced and implemented; discussion and summary are presented in section 6.
2. Data Sources
2.1. GOCART Model Simulations
 The Global Ozone Chemistry Aerosol Radiation and Transport (GOCART) model is a three-dimensional chemical transport model with a horizontal resolution of 2.5° longitude by 2° latitude and 20–30 vertical layers, depending on the background meteorology used (the Goddard Earth Observing System Data Assimilation System) [Chin et al., 2000, 2002; Ginoux et al., 2001]. As a forward model that provides needed AOD information, GOCART estimates the emissions of the key types of aerosols (sulfate, dust, organic carbon, black carbon and sea salt) and their precursors based on state-of-the-art data sets of fossil/biofuel combustion, biomass burning and surface topographic features. Chemical reactions (e.g., DMS and SO2 oxidation), transport mechanisms (advection, diffusion and convection), and aging and removal processes are built into the model to simulate the aerosol evolution. To derive AOD, the dry aerosol mass M_d of each aerosol component is calculated, and aerosol optical parameters and hygroscopic effects are assumed in order to estimate the mass extinction efficiency β, which describes a linear relationship between the dry aerosol mass and the AOD at a specified wavelength. Most of these processes are highly parameterized and could be sources of error. Evaluation of the GOCART AOD against satellite retrievals and AERONET observations revealed that the model has the capability to reproduce prominent spatial and temporal variations, in particular in areas with strong signals (biomass burning and dust dominant) [Chin et al., 2002].
2.2. MODIS Satellite Retrievals
 The Moderate resolution Imaging Spectroradiometer (MODIS) onboard the EOS Terra and Aqua polar orbiting satellites is a well-designed instrument for AOD retrievals [Salomonson et al., 1989; King et al., 1999]. With 36 well-calibrated bands of wide spectral range of radiance observations it is possible to implement improved cloud screening algorithms, obtain better determination of surface reflectance, and therefore, a better estimate of AOD from the clear sky path radiances. Owing to availability of observations at high spatial resolution and nearly daily global coverage, MODIS presents an unprecedented opportunity to monitor global aerosol characteristics.
 Retrievals of AOD using multispectral signals from MODIS are performed separately over ocean and land [Kaufman et al., 1997; Tanré et al., 1997]. Over land, a multispectral cloud mask is used for cloud screening [Ackerman et al., 1998]. The dark target technique is used to determine the surface reflectance at blue and red channels (0.47 and 0.66 μm). Major sources of error in the retrievals over land are subpixel cloud contamination, inappropriate aerosol models and inaccurate surface reflectance estimation over areas with subpixel surface water, snow or ice cover. Evaluation of three months of level 2 (10 km × 10 km) land AOD product with AERONET data shows that retrievals are within the expected error range (±0.05 ± 0.2τ) for the 470 and 660 nm wavelengths [Chu et al., 2002]. Over oceans, cloud screening is based on the spatial variability of visible reflectance in combination with tests using infrared channels [Martins et al., 2002]. Five fine mode and six coarse mode aerosol models are built in the lookup table; selection and relative contribution of each mode is based on a least square best fit to the multispectral path radiances. Validation of about six months level 2 ocean AOD product with AERONET observations shows that retrievals are well within the expected error uncertainties (±0.03 ± 0.05τ), with standard error being about 0.02 for wavelengths 0.66 and 0.87 μm [Remer et al., 2002].
2.3. AERONET Observations
 The AERONET is a globally distributed federated network of ground-based observations representing a wide range of atmospheric conditions [Holben et al., 1998, 2001]. AERONET uses the weather-resistant automatic CIMEL Sun/Sky radiometer to make frequent measurements of atmospheric aerosol optical properties at remote sites. Assessment of possible errors due to calibration uncertainties, inaccuracy in ozone absorption, and Rayleigh scattering calculations shows that the total uncertainty in AOD is about 0.01 to 0.02 [Holben et al., 1998; Eck et al., 1999]. Therefore AERONET data are regarded as a quality “benchmark” and are extensively used for the evaluation of other AOD products and calculation of radiative effects.
3. Quality Check and Data Preparation
3.1. MODIS Data
 Level 3 version 4 1° × 1° monthly mean AOD data derived from MODIS observations on Terra are used in this study. MODIS retrievals are restricted by surface conditions and cloud presence, and therefore the daily count of “pixels” (spatial resolution of 10 km by 10 km) within each grid cell varies from several to nearly two thousand. Most grids with limited retrievals are found in arid areas (bright surfaces), high latitudes (snow/ice cover) and the “roaring forties” of the Southern Hemisphere Ocean (glint effects). Temporal and spatial averages formed from these low numbers of retrievals could cause a large sampling error. Yet filtering of MODIS data based solely on a minimum daily count at pixel level could be problematic, because a less conservative threshold would likely include suspicious data, while a too conservative limit would discard too much valuable information.
 An obvious feature of the unfiltered MODIS data is the existence of some local discontinuities. For the spatially and temporally averaged AOD, large variation among adjacent grid points might be unrealistic and could be the result of under-sampling. To check whether this is indeed the case, a discontinuity index is defined for each grid point as follows: the local average and standard deviation are determined from a 3 by 3 array of points centered on the target grid; the absolute difference between the target grid value and the local mean is calculated; and the discontinuity index is set to be the absolute value of this difference minus the local standard deviation. Accordingly, a large index value indicates large variations around the central point, and therefore, high discontinuity. Next, MODIS grid data are grouped based on the discontinuity index at a 0.1 bin size, and the average pixel daily count is calculated for each bin. The result of this analysis is shown in Figure 1. It can be seen that a higher index is related to a smaller pixel count; namely, large discontinuity is associated with under-sampling. Figure 1 also shows that more than 97% of the grids have an index lower than 0.2, which implies that the discontinuity index could also be used to improve the quality of MODIS monthly mean AOD with minimum loss of data. Availability of overlapping MODIS retrievals from Terra and Aqua provides an opportunity to test this idea. Since the local overpass times of Terra and Aqua (about 10:30 am and 1:30 pm) are close to each other, the monthly mean values from these two satellites should be consistent. If a large discrepancy exists, it can be attributed to sampling errors [Kaufman et al., 2000]. Seventeen months (July 2002 to November 2003) of level 3 monthly mean AOD data are taken from both platforms and a linear correlation between the two data sets is calculated at different combinations of two thresholds (minimum pixel daily count and maximum discontinuity index) (Table 1). If all data are used, the correlation is only 0.79. The correlation improves as the lower limit of pixel count increases and the upper limit of discontinuity index decreases. It is evident that a combination of these two criteria could result in higher correlation with elimination of a small amount of data. When the minimum pixel count is chosen to be 10 and the maximum discontinuity index 0.2, the correlation increases to 0.91, and less than 5% of the data are filtered out. After implementing these criteria on the 2-year Terra MODIS monthly mean AOD used in this study, more than 96.7% of the data remained.
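The discontinuity index and the two-threshold filter described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the index is read as |target − local mean| − local standard deviation, the 3 by 3 window is assumed to include the target cell, the population standard deviation is assumed, and edge cells are simply left undefined.

```python
import numpy as np

def discontinuity_index(aod):
    """Discontinuity index on the interior of a 2-D AOD grid.

    For each 3x3 neighbourhood: |centre - local mean| - local std.
    Edge cells are left as NaN (assumption: the text does not say how
    borders or missing neighbours are treated).
    """
    aod = np.asarray(aod, dtype=float)
    index = np.full(aod.shape, np.nan)
    for i in range(1, aod.shape[0] - 1):
        for j in range(1, aod.shape[1] - 1):
            window = aod[i - 1:i + 2, j - 1:j + 2]
            index[i, j] = abs(aod[i, j] - window.mean()) - window.std()
    return index

def quality_mask(aod, pixel_count, min_count=10, max_index=0.2):
    """True where a grid cell passes both thresholds quoted in the text."""
    idx = discontinuity_index(aod)
    # Comparisons with NaN are False, so edge cells are rejected here.
    return (np.asarray(pixel_count) >= min_count) & (idx <= max_index)
```

With a uniform field of 0.1 and a single spike of 1.0, the spike cell receives an index of about 0.52 and is rejected by the 0.2 threshold, while its neighbours are retained.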
Table 1. Correlation Between the 1° × 1° MODIS Monthly Mean AOD From Terra and Aqua (July 2002 to November 2003) at Different Combinations of Minimum Pixel Daily Count and Maximum Discontinuity Index^a
Columns: maximum discontinuity index of 0.1, 0.2, 0.3, 0.4, 0.5, 0.8, 1.0, and ∞.
Rows: minimum pixel daily count of 0, 10, 20, 30, 50, and 100.
^a Also shown is percentage of grids that satisfy the requirement.
 For compatibility with GOCART model output (2.5° × 2°), the 1° × 1° MODIS data are degraded to the same resolution using area-weighted averages. In this remapped data set, data-void grids are present in bright surface areas and at high latitudes. To make the data set complete, interpolation/extrapolation of the AOD values from neighboring grids is performed based on the Poisson technique [Oort and Rasmusson, 1971; Reynolds, 1988]. The Poisson equation
$$\nabla^2 \phi = \rho$$
describes an equilibrium solution of a field (ϕ) which is balanced by the external forcing (ρ) and the diffusion process. Using Poisson's equation, spatial distribution information (locations of the local minima and maxima and the rate of change of the AOD field) can be prescribed in terms of the forcing ρ. The forcing terms for the data-void grids are calculated from GOCART model results, and MODIS data are taken as boundary values. In order to fill in high latitudes, GOCART data serve as external boundaries with the assumption that low AOD values from GOCART in the polar regions represent the relatively clean atmospheric conditions. Finite differences in spherical coordinates and the successive overrelaxation (SOR) method [Press et al., 1995] are implemented to solve this second-order differential equation iteratively. Data within the region of interest (60°S, 60°N) are further analyzed. The rationale behind this filling process is to keep the magnitudes of AOD from MODIS and to utilize the spatial distribution information from GOCART. As an example of the effect of quality check and void-filling, one month (August 2000) of data from GOCART simulation results, 1° × 1° MODIS level 3 data, and error-filtered, void-filled, remapped global 2.5° × 2° MODIS monthly mean AOD are shown in Figure 2.
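The void-filling idea can be illustrated with a small SOR solver. The sketch below makes several simplifying assumptions that are not in the text: a Cartesian grid with unit spacing instead of spherical coordinates, a five-point Laplacian, border cells that are all known, and hypothetical names (`poisson_fill`, `guide`, `known`). Known cells play the role of the MODIS boundary values, and the forcing ρ is taken from the Laplacian of the guide field (the role played by GOCART).

```python
import numpy as np

def laplacian(field):
    """Five-point Laplacian on interior points (unit grid spacing)."""
    lap = np.zeros_like(field, dtype=float)
    lap[1:-1, 1:-1] = (field[:-2, 1:-1] + field[2:, 1:-1]
                       + field[1:-1, :-2] + field[1:-1, 2:]
                       - 4.0 * field[1:-1, 1:-1])
    return lap

def poisson_fill(values, known, guide, omega=1.5, tol=1e-8, max_iter=10000):
    """Fill cells where `known` is False by solving lap(phi) = rho with SOR.

    rho comes from the Laplacian of `guide`; known cells are held fixed.
    """
    known = np.asarray(known, dtype=bool)
    guide = np.asarray(guide, dtype=float)
    if not (known[0].all() and known[-1].all()
            and known[:, 0].all() and known[:, -1].all()):
        raise ValueError("border cells must be known in this sketch")
    phi = np.where(known, values, guide).astype(float)  # first guess
    rho = laplacian(guide)
    unknown = list(zip(*np.nonzero(~known)))
    for _ in range(max_iter):
        max_change = 0.0
        for i, j in unknown:
            # Gauss-Seidel value from the discrete Poisson equation
            gs = 0.25 * (phi[i - 1, j] + phi[i + 1, j]
                         + phi[i, j - 1] + phi[i, j + 1] - rho[i, j])
            change = omega * (gs - phi[i, j])  # over-relaxed update
            phi[i, j] += change
            max_change = max(max_change, abs(change))
        if max_change < tol:
            break
    return phi
```

Because only the second derivative of the guide field enters through ρ, a constant or linear offset in the guide does not propagate into the filled values; the magnitudes are set by the known cells.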
3.2. AERONET Data
 Quality assured level 2.0 data from AERONET are used to compute the monthly mean AOD values for each individual site (Figure 3). Optical depths at two adjacent wavelengths (0.5 and 0.67 μm) are used to interpolate to the standard wavelength (0.55 μm) based on the Ångström empirical expression given as:
$$\tau = \beta \lambda^{-\alpha}$$
where λ is the corresponding wavelength in microns for the AOD τ, β is the Ångström turbidity coefficient, and α is the wavelength exponent. Monthly averages are calculated on the basis of daily mean values. Although AERONET provides accurate point measurements, regional representation of monthly means can be questionable [Chin et al., 2002; Kinne et al., 2003]. Table 2 lists the monthly mean AOD of multiple AERONET stations collocated within the same 2.5° × 2° grid cell. Most of the collocated sites have AOD values that are close to each other. Variations larger than 0.1 exist in grids in proximity to source regions of biomass burning (August–September in Ndola and Solwize) and dust outbreaks (April in Beijing and XianHe). Possible reasons are the episodic nature of dust outbreaks and biomass burning, the short lifetime of large particles and the directionality of the transport. Local pollution could also lead to large subgrid variation, as observed at Penn_State_Univ, GSFC and MD_Science_Center in July 2001. Since most of the time aerosol properties and concentrations are consistent over larger scales, AERONET monthly mean AOD is a good estimate of the grid mean value, and can be used for evaluation of grid averaged products [Chin et al., 2000; Yu et al., 2003; Kinne et al., 2003]. A full year comparison among collocated AERONET, GOCART and MODIS data is presented in Figure 4. To achieve a better representation of spatial and temporal variations, AERONET stations are grouped into six regions, and sorted by direction as specified in Table 3. Generally, in terms of magnitude, AERONET data are comparable to the other two grid averaged data sets. MODIS retrievals appear to be the highest among the three, in particular in western North America during the spring/summer time and in dust dominated regions (region D). This can be partially attributed to inaccurate estimation of surface reflectance [Chin et al., 2004] and insufficient knowledge of the optical properties of nonspherical particles [Levy et al., 2003].
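The interpolation of AERONET optical depths to 0.55 μm with the Ångström expression can be sketched as below. The function name and argument names are illustrative only; the two input wavelengths (0.5 and 0.67 μm) are those quoted in the text.

```python
import numpy as np

def angstrom_interpolate(tau1, lam1, tau2, lam2, lam_target=0.55):
    """Interpolate AOD to lam_target (microns) assuming tau = beta * lam**(-alpha)."""
    # Wavelength exponent from the two measured optical depths
    alpha = -np.log(tau1 / tau2) / np.log(lam1 / lam2)
    # Turbidity coefficient: the AOD the power law gives at 1 micron
    beta = tau1 * lam1 ** alpha
    return beta * lam_target ** (-alpha)

# Example: AOD measured at 0.50 and 0.67 microns
tau_055 = angstrom_interpolate(0.30, 0.50, 0.21, 0.67)
```

For a positive wavelength exponent, the interpolated value lies between the two measured optical depths.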
Table 2a. Monthly Mean AOD of AERONET Sites Located Within the Same 2.5° × 2° Grid Cell^a
^a From March 2000 to February 2001. Cells with subgrid variation larger than 0.1 are shown in boldface. NA, temporally collocated measurements are not available.
 AERONET monthly mean AOD values are also affected by under-sampling. Stations are marked as questionable if, within one month, the total number of measurements is less than 100 and the number of days in operation is less than five. During March 2000 to February 2001, three measurements are eliminated because they show much higher values than MODIS and GOCART data (Figure 3) (Dakar in August 2000; NCU Taiwan in October 2000; Mexico City in November 2000). A similar comparison was performed for 2001 and two unrealistic AERONET monthly mean values were filtered out (Yulin in April 2001 and Philadelphia in June 2001). Data from Mauna Loa, Hawaii, are not used in the analysis since the site is located 3.4 km above sea level, is essentially devoid of aerosol effects, and is used for calibration of Sun photometers [Holben et al., 1998].
3.3. Temporal Sampling Differences
 The data sets used in this study have some intrinsic differences in their temporal coverage. GOCART model simulates the whole aerosol life cycle, and the monthly mean is an all-sky all-time average; MODIS onboard Terra provides AOD at the daily local overpass time under cloud-free condition, and the monthly mean represents the clear-sky prenoontime value; AERONET measurements are performed for all clear-sky daytime situations, and therefore, the temporal coverage is intermediate between GOCART and MODIS. Relative to the clear-sky daytime average, inclusion of aerosols under cloudy condition in GOCART might introduce a bias which is difficult to estimate because of the compensating effects of secondary aerosol (sulfate) production, hygroscopic growth and wet deposition [Chin et al., 2002]. Possible bias of MODIS monthly means can be attributed to AOD diurnal variations. While Kaufman et al.  found that measurements at MODIS overpass time represent clear-sky daytime averages quite well, others show detectable diurnal variability in urban/industrial areas (10–40%) [Smirnov et al., 2002], and in the southern African biomass burning region (25%) [Eck et al., 2003]. Scatterplots of AERONET monthly mean AOD against the GOCART and MODIS data are shown in Figure 5. For cases when AERONET AOD is less than 0.6, MODIS retrievals have a positive bias, while GOCART simulations do not show a significant bias. For high values of AERONET AODs both data sets tend to underestimate such values. The discrepancies revealed in the comparison could be attributed to intrinsic uncertainties of each data set as well as to the sampling incompatibility. Confident estimates of the deficiencies in both data sources would be possible only when the sampling effects are reliably estimated.
 To propagate the high quality information of the AERONET, an empirical method is proposed here that takes advantage of the spatial distribution information from GOCART and MODIS. To test whether reliable geographical distribution information can be retrieved from model and satellite data, intercomparison of the variability signals of GOCART and MODIS data is first performed.
4. Comparison of MODIS and GOCART Variability
 In order to compare the spatial and temporal variability of MODIS and GOCART, anomalies (difference between monthly means and the total 24-month average) are calculated. Coupled analysis is performed based on the singular value decomposition (SVD) method, which is a powerful tool to identify pairs of spatial patterns (modes) with the maximum temporal covariance between the two fields [Bretherton et al., 1992]. It has been widely applied to meteorological data for exploring the coupled relationship between two physically related variables [Wallace et al., 1992; Wang and Ting, 2000]. If the two fields have large common signals and are jointly analyzed using the SVD method, the spatial distributions of the modes and the temporal variation of the expansion coefficients are expected to be similar. The contribution of each pair of modes is described by the squared covariance fraction (SCF), defined as:
$$\mathrm{SCF}_i = \frac{\sigma_i^2}{\sum_{j=1}^{M} \sigma_j^2}$$
where σi is the ith singular value and M is the total number of coupled pairs. In the coupled GOCART and MODIS anomaly SVD analysis, more than 95% of squared covariance is explained by the first three leading modes (see Figure 6), suggesting that most of the variation signal is contained in these modes. The temporal evolution of MODIS and GOCART data is similar, namely, mode 1 and mode 2 represent a strong annual cycle, and mode 3 describes the seasonal variation. The amplitude of the MODIS time series appears to be larger than that of GOCART, due to the larger variance associated with the satellite retrievals.
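A coupled SVD analysis of this kind can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: both anomaly fields are arranged as (time, grid point) arrays with the time mean already removed, no area weighting is applied, and the function name `coupled_svd` is hypothetical.

```python
import numpy as np

def coupled_svd(left, right):
    """SVD of the temporal cross-covariance between two anomaly fields.

    left, right: arrays of shape (n_time, n_points) with zero time mean.
    Returns the paired spatial patterns, the squared covariance fraction
    of each pair, and the expansion coefficients of both fields.
    """
    n_time = left.shape[0]
    cov = left.T @ right / (n_time - 1)          # cross-covariance matrix
    u, s, vt = np.linalg.svd(cov, full_matrices=False)
    scf = s**2 / np.sum(s**2)                    # squared covariance fraction
    left_coeff = left @ u                        # time series, left field
    right_coeff = right @ vt.T                   # time series, right field
    return u, vt.T, scf, left_coeff, right_coeff
```

When the two fields share a dominant signal, the leading SCF is close to one and the expansion coefficients of the first pair of modes are highly correlated in time.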
 The first three leading coupled modes are presented in Figure 7. Pairs of modes display similar spatial distributions; large-scale prominent features such as biomass burning in the southern hemisphere, tropical Africa and southeast Asia, and dust over northern Africa and Asia with transport over the tropical Atlantic Ocean, are in good agreement.
 Missing MODIS data over part of northern Africa and Saudi Arabia are filled based on GOCART spatial information. The filled desert area is less than 2% of the total analysis domain, yet, nearly 9% of the total variance is found in this region. The question arises whether this data filling has a large effect on the anomaly analysis. Sensitivity coupled SVD analysis performed with unfilled MODIS data shows little difference. The reason could be attributed to the large outflow areas over ocean and nearby dark land surfaces, which maintain strong signals from the data void regions.
 The SVD coupled analysis indicates that in spite of the differences in temporal coverage (clear-sky snapshot versus all-sky all-time) the variability information from MODIS retrievals and GOCART model results is in good agreement. The different sampling strategies do not seem to have significant effect on the spatial and temporal variability signals of the two data sets. These results are the basis for utilizing the spatial variation information from both data sets to distribute AERONET data at a global scale.
5. Empirical Combination
 Progress has been made to combine satellite retrievals with model products based on optimal interpolation (OI) techniques. Collins et al.  dynamically assimilate the AVHRR retrievals to a chemical transport model; Yu et al.  merge the monthly mean MODIS retrievals with the GOCART model results and analyze a complete annual cycle for global AOD. In this study, we present an “empirical” method (not solely dependent on the error analysis) to combine the AERONET, MODIS and GOCART AOD monthly mean data. The global long-term averages are determined first, followed by spatial and temporal variations constructed from truncated EOF fitting. A flowchart of the empirical combination scheme is presented in Figure 8.
5.1. Two-Year Averaged Global AOD
 The minimum variance estimation method [Daley, 1991] is commonly used to average two data sets, with weights determined as:
$$w_1 = \frac{e_2^2}{e_1^2 + e_2^2}, \qquad w_2 = \frac{e_1^2}{e_1^2 + e_2^2}$$
where w is the weight, e² is the unbiased error variance, and the subscripts 1 and 2 denote the two data sets [North et al., 1991; Huffman et al., 1995; Xie and Arkin, 1996]. To estimate the respective averaging weights for GOCART and MODIS 2-year mean AOD, data from Figure 5 are binned at 0.02 AOD units based on AERONET measurements. Assuming that inside each bin the mean value contains the bias, the standard deviation can be regarded as the square root of the unbiased error variance. The “error” is not entirely due to the deficiencies of model simulation schemes and satellite retrievals; sampling differences could also exert an effect. The analysis result and the linear fits of the mean value and the standard deviations are shown in Figure 9. The scattered distribution at the high-value end is due to data scarcity and might be statistically insignificant. On the basis of this analysis, unbiased error variances are set to be (0.057 + 0.158τ)² for GOCART results and (0.074 + 0.134τ)² for MODIS data. Consequently, the fractional contribution (weight) of GOCART data decreases monotonically from 0.63 to 0.48 as τ increases from 0 to 1. To assign the fractional contributions to both data sets, there is a need to determine the value of τ. Since the linear regression is performed with respect to the “ground truth,” τ should be the “true” value rather than estimates from model and satellite. In this work we set τ to be the arithmetic average of collocated GOCART and MODIS data, since the fractional contribution of GOCART (or MODIS) is a slowly varying function of τ. A difference of 0.2 in τ results in a change of at most 0.037 in weights and therefore, inaccuracies in τ will not have a large impact on the contribution of each data set.
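The weighting can be checked numerically with the error models quoted above. The sketch below assumes the standard two-source minimum variance form (each weight proportional to the other source's error variance); the function names are illustrative.

```python
import numpy as np

def error_std_gocart(tau):
    """Fitted error standard deviation for GOCART, from the text."""
    return 0.057 + 0.158 * tau

def error_std_modis(tau):
    """Fitted error standard deviation for MODIS, from the text."""
    return 0.074 + 0.134 * tau

def gocart_weight(tau):
    """Minimum variance weight of GOCART: w_G = e_M^2 / (e_G^2 + e_M^2)."""
    e_g2 = error_std_gocart(tau) ** 2
    e_m2 = error_std_modis(tau) ** 2
    return e_m2 / (e_g2 + e_m2)

def combine(gocart, modis):
    """Weighted average, with tau set to the mean of the two inputs."""
    tau = 0.5 * (np.asarray(gocart) + np.asarray(modis))
    w = gocart_weight(tau)
    return w * gocart + (1.0 - w) * modis

print(round(gocart_weight(0.0), 2), round(gocart_weight(1.0), 2))  # 0.63 0.48
```

The printed values reproduce the 0.63 and 0.48 end points stated for τ = 0 and τ = 1.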
 Since the weighted average is computed from data that might be biased, a further check of the bias is necessary. Figure 10 presents a comparison of the combined 2-year average with data from thirteen AERONET stations (twelve grid values because GSFC and MD_Science_Center are located within one cell and averaged). The merged 2-year means are generally larger than the AERONET observations (the difference is below 0.1). To reduce the remaining bias, the Poisson technique is used: the twelve grid points serve as anchor points (internal boundaries) and the weighted averaged data at the polar regions are kept as external boundary values. We assume that these AERONET long-term averages represent the area average and that a small bias remains in the merged data at high latitude. Forcing terms at the remaining points are calculated from the weighted averaged data. The rationale of this procedure is that a linear bias contained in the original field does not affect the value of the second derivative (forcing term ρ in the Poisson equation). By using accurate values at some anchor grid points and keeping the original forcing terms at the remaining points, reconstruction of the field can reduce constant and linear bias from the original data. This technique is well established in the assimilation of SST and precipitation [Reynolds, 1988; Reynolds and Marsico, 1993; Reynolds and Smith, 1994; Xie and Arkin, 1996]. Figure 11 shows the 2-year mean AOD from GOCART, MODIS (void-filled) and the final result within (60°S, 60°N). The effect of the Poisson technique is also displayed. GOCART data (with spatially averaged AOD being 0.13) are smaller than MODIS results (0.19), and the weighted averaged result lies in between (0.16). The Poisson technique has an overall reduction effect (Figure 11d) due to general overestimation by the 2-year weighted average compared with AERONET (Figure 10), which brings the final averaged value (0.13) close to that of the GOCART simulations. In the 2-year average AOD (Figure 11c), large values are found in Africa and Asia, mostly from mineral dust, combined with biomass burning and industrial pollution. Also evident are the westward propagation from north Africa and the eastward transport from east Asia. AODs over South America are somewhat lower than over South Africa, perhaps due to the shorter and less intense burning season [Duncan et al., 2003]. Urban/industrial aerosol signals can be detected in the eastern United States and Europe.
5.2. Spatial and Temporal Variations
 Propagating the AERONET information at global scale is difficult, largely due to the limited number of stations and the inhomogeneous and anisotropic AOD spatial distributions, which might not be reliably described by simple modeled covariance functions. Truncated EOF fitting is more suitable for this case because of its ability to distribute sparse data to large scale in a more realistic and coherent manner. Such an approach has been applied to the reconstruction of historical SST and model data assimilation [Smith et al., 1996; Kaplan et al., 1997; Joaquim et al., 2001].
 Denoting the leading EOF modes computed from MODIS and GOCART anomalies as E, and AERONET anomalies as O:
$$O = HE\mu + d$$
where μ is the vector of expansion coefficients; H is the observation operator, which converts the data from grid space to the observation locations; and d is the difference between the observation anomaly and the constructed value. The best estimate of the expansion coefficients μ in a least squares sense requires:
$$\frac{\partial (d^{T} d)}{\partial \mu} = 0$$
which is equivalent to:
$$(HE)^{T}(HE)\,\mu = (HE)^{T} O$$
μ is derived by solving this linear system, and the constructed anomalies are calculated from HEμ.
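The least squares fit of the leading EOFs to station anomalies can be sketched as follows. The sketch assumes that H simply selects the grid cells containing stations (so it is represented by an index array), and the names `fit_eofs` and `station_idx` are hypothetical. It returns the reconstructed anomaly on the full grid, Eμ; sampling that field at the station cells gives HEμ.

```python
import numpy as np

def fit_eofs(eofs, station_idx, obs_anom):
    """Fit truncated EOFs to sparse observations via the normal equations.

    eofs:        (n_grid, n_modes) matrix E of leading spatial modes
    station_idx: indices of grid cells containing stations (plays the role of H)
    obs_anom:    (n_stations,) observed anomalies O at those cells
    """
    he = eofs[station_idx, :]            # H E: modes sampled at the stations
    lhs = he.T @ he                      # (HE)^T (HE)
    rhs = he.T @ obs_anom                # (HE)^T O
    mu = np.linalg.solve(lhs, rhs)       # expansion coefficients
    return mu, eofs @ mu                 # coefficients and gridded anomaly
```

Solving the normal equations directly mirrors the linear system in the text; for poorly conditioned cases a routine such as `numpy.linalg.lstsq` would be a numerically safer alternative.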
 We assume that the quality checked AERONET monthly means of AOD could be regarded as the grid average, so the observation operator H is simply mapping the stations to the grid points where they are located. AERONET anomalies O are calculated relative to the above estimated 2-year average AOD values at the corresponding grids. Leading modes E are derived from area-weighted EOF analysis, performed on the composite MODIS and GOCART anomalies (i.e., concatenate two data sets together). Figure 12 shows the percentage of the total variance explained by each mode. More than 70% contribution comes from the first 5 modes, which indicates that large common variability is shared by the two data sets. In order to fit the leading EOFs to the measurements month by month, relative significance of each mode must also be determined on a monthly basis. The leading sequence of the EOFs is determined by:
where the index i represents the ith mode, and t denotes the time; T is the normalized expansion coefficients (temporal amplitude) from the EOF analysis and σ is the eigenvalue (explained variance).
5.2.2. Sensitivity Tests
 Before implementing the EOF fitting, the following questions remain: (1) How many modes are necessary to achieve a satisfactory result? (2) Is the method robust with respect to observational and sampling errors in the AERONET data?
 To test how many EOFs are needed to capture significant signals, and to test the performance of truncated EOF fitting, the following sensitivity test is designed: AERONET anomalies are replaced with the MODIS/GOCART anomalies at the grid points where AERONET stations are located. The result of this simulated EOF fitting is compared with the original MODIS and GOCART anomaly fields to check whether significant signals can be reconstructed. To quantify the resemblance between two fields, each two-dimensional data array is reshaped into a vector, and the vector cosine is computed as a similarity index. A cosine value of 0.71 represents a projection angle of 45°, which serves as an acceptable lower bound for two spatially similar fields. The robustness of the fitting, which is determined by the condition number of the matrix $(HE)^{T}(HE)$, is also calculated. A larger condition number makes the linear system ill-conditioned and very sensitive to small changes in the observed values, and is thus unfavorable for the fitting process.
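 The two diagnostics used in this test can be sketched as follows. The function names and the index-list form of H are assumptions made for the illustration, and missing values are not handled.

```python
import numpy as np

def vector_cosine(field_a, field_b):
    """Cosine of the angle between two fields flattened to vectors."""
    a = np.ravel(field_a)
    b = np.ravel(field_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def normal_matrix_condition(E, station_idx):
    """2-norm condition number of (HE)^T (HE) for EOFs sampled at stations."""
    HE = E[station_idx, :]
    return float(np.linalg.cond(HE.T @ HE))

# Two small fields whose flattened vectors are 45 degrees apart.
a = np.array([[1.0, 0.0], [0.0, 0.0]])
b = np.array([[1.0, 1.0], [0.0, 0.0]])
cos_ab = vector_cosine(a, b)  # approximately 0.707, the 45 degree bound

# Orthonormal patterns sampled at every grid point give the best case.
cond_best = normal_matrix_condition(np.eye(4), [0, 1, 2, 3])
```

 Sampling the EOFs at only a few stations destroys their orthogonality, which is why the condition number grows as more modes are fitted to the same set of stations.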
 Test results for August 2000 are shown in Figure 13. The condition number and the similarity (vector cosine) between the fitting results and the simulated field are displayed as a function of the number of EOFs used. Test results for the other months are similar. Generally, even with only a few EOFs in the fit, large-scale spatial patterns are successfully reproduced, with the vector cosine larger than 0.8. As more modes are included, the similarity increases; however, the condition number also becomes larger.
 The following empirical rules are used to decide how many leading EOFs to retain: a sufficient number of modes is needed to capture significant spatial variability; modes with small and comparable eigenvalues are usually degenerate and might be contaminated by errors [North et al., 1982]; and the condition number should be relatively small. As a compromise, a threshold of 0.02 on the relative significance value $sig_{i,t}$ is used to truncate the EOF modes for each month. Table 4 gives the number and index of the leading modes used and the number of grid points with available AERONET measurements.
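 The monthly truncation by a relative significance threshold can be illustrated with the sketch below. Note the assumption: the exact expression for the relative significance is not reproduced here, so the code uses a hypothetical form in which each mode's eigenvalue is multiplied by its squared normalized amplitude for the month and the result is normalized to sum to one. The function name `select_modes` and the numbers are also invented for the example; only the threshold of 0.02 comes from the text.

```python
import numpy as np

def select_modes(T_month, eigvals, threshold=0.02):
    """Return 1-based indices of modes retained for one month.

    T_month : (n_modes,) normalized temporal amplitudes for that month
    eigvals : (n_modes,) eigenvalues (explained variance) of the modes
    ASSUMPTION: significance = eigvals * T_month**2, normalized to sum to 1.
    This form is hypothetical and stands in for the study's definition.
    """
    contrib = np.asarray(eigvals, dtype=float) * np.asarray(T_month, dtype=float) ** 2
    sig = contrib / contrib.sum()
    return [int(i) + 1 for i in np.flatnonzero(sig > threshold)]

# Invented example: mode 3 is nearly inactive this month and is dropped.
eigvals = [5.0, 3.0, 1.0, 0.5]
T_month = [1.0, 0.8, 0.05, 0.9]
kept = select_modes(T_month, eigvals)
```

 Under this assumed definition, a mode with a large eigenvalue can still be skipped in a month when its amplitude is small, which would be consistent with non-consecutive mode indices in a given month.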
Table 4. Number and Index of the Leading Modes and Number of Grid Points With Available AERONET Values Used in the EOF Fitting for Each Month
Leading EOFs Used in the Fitting
Number of Grid Points With AERONET Measurements
1, 2, 3, 4, 5, 6, 17
1, 2, 3, 4, 5, 6, 7, 8
1, 2, 3, 4, 5, 6, 7, 8, 9
1, 2, 3, 4, 5, 6,
1, 2, 3, 4, 5, 6, 8, 11, 13
1, 2, 3, 4, 5, 6, 7, 13
1, 2, 3, 4, 5,
1, 2, 3, 4, 5, 7, 8, 10
1, 2, 3, 4, 5, 7, 8, 9, 10, 12
1, 2, 3, 4, 5, 6
1, 2, 3, 4, 5, 7
1, 2, 3, 4, 5,
1, 2, 3, 4, 5, 6, 7, 8
1, 2, 3, 4, 5, 7, 9, 15
1, 2, 3, 4, 7, 8
1, 2, 3, 4, 5, 6, 8, 10
1, 2, 3, 4, 5, 6, 8, 9, 11
1, 2, 3, 4, 5, 8, 10
1, 2, 3, 4, 5, 6,
1, 2, 3, 4, 6, 7
1, 2, 3, 4, 5, 7
1, 2, 4, 5, 7
1, 2, 3, 4, 5
1, 2, 3, 4, 5
5.2.3. Combination Results
 Truncated EOF fitting is performed to construct the anomaly field for the period March 2000 to February 2002 over the domain (60°S, 60°N). Monthly mean AODs are obtained by adding the anomaly back to the 2-year average. The high-latitude regions are filled in with the Poisson technique, as in the MODIS data void filling. Large-scale spatial and temporal variations are well represented in the combined results (Figure 14).
 Least squares fitting does not reproduce the AERONET AOD values exactly. Unless the spatial variation of the measurements can be represented by a linear combination of the available patterns (the leading EOFs), the results might not be satisfactory. A scatterplot of the combination results against AERONET data (Figure 15) shows high correlation. The dispersion and small negative bias of the merged AOD can be caused by the different temporal coverage of the GOCART and MODIS data, subgrid-scale variability in the AERONET observations, loss of small-scale signals due to the truncation of EOFs, and an inaccurate long-term average from which the AERONET anomalies are calculated.
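 The statistics behind such a scatterplot comparison (bias, dispersion, and correlation) can be computed as in the sketch below. The function name `comparison_stats` and the four AOD pairs are made-up placeholders for illustration only and are not values from this study.

```python
import numpy as np

def comparison_stats(merged, aeronet):
    """Mean bias, RMS difference, and correlation of merged AOD vs. AERONET."""
    merged = np.asarray(merged, dtype=float)
    aeronet = np.asarray(aeronet, dtype=float)
    diff = merged - aeronet
    return {
        "bias": float(diff.mean()),                 # negative: merged is lower
        "rmse": float(np.sqrt(np.mean(diff ** 2))), # dispersion about AERONET
        "corr": float(np.corrcoef(merged, aeronet)[0, 1]),
    }

# Placeholder pairs (not data from the paper): merged values slightly low.
stats = comparison_stats([0.10, 0.18, 0.33, 0.45], [0.12, 0.20, 0.35, 0.50])
```

 A high correlation together with a small negative bias, as in this toy case, is the pattern described for the merged AOD.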
 To assess the differences between the combination results and the GOCART/MODIS data and to evaluate regional performance, comparisons are made for the six regions specified in Table 3 (Figure 3). Figure 16 displays the intercomparison between the data sets and scatterplots of each against AERONET measurements. Only grid points with MODIS retrievals are selected.
 Region A: Merged AOD agrees well with GOCART in spring and summer but is lower than both data sets in autumn and winter. In general, the combination result gives a higher correlation with AERONET.
 Region B: Combination results display a negative bias. MODIS data agree well with AERONET, while GOCART tends to have a low bias and thus does not improve the combination results in this area.
 Region C: Combination results show good agreement with AERONET, and regionally averaged combination values are generally between GOCART and MODIS.
 Region D: Combination results agree better with GOCART results in this region. Underestimations exist for some high AERONET AOD cases. This region has the largest known aerosol burden.
 Region E: Both GOCART and MODIS tend to overestimate the AOD in this region. Although the correlation is not improved for the combination results, the positive bias is largely reduced.
 Region F: Combination results show a high degree of agreement with GOCART data. MODIS data display a positive bias in the small to medium range of AOD; however, they agree better with AERONET at the high end of values.
 Overall, merged results are closer to GOCART than to MODIS. A possible explanation is that in the range of low and medium AOD, GOCART data do not show a significant bias, while MODIS data have a positive bias (Figures 5 and 9). Since monthly AODs over a large part of the world are within the low and medium range, the combination results tend to be close to the model simulations. However, variations of the merged results are more consistent with MODIS retrievals, which would imply that the GOCART model provides better estimates of the magnitude, while MODIS results are better for describing variations.
6. Discussion and Summary
 When describing the Progressive Retrieval and Assimilation Global Observing Network (PARAGON) concept, Diner et al.  emphasize the need to reduce the uncertainties in our understanding of aerosol-climate interactions. Specifically: “The complexity of the aerosol-climate problem implies that no single type of observation or model is sufficient to characterize the current system or to provide the means to predict aerosol impacts in the future with high confidence”. Consequently, information must be drawn from multiple observational and theoretical techniques, platforms, and vantage points, and strategies that explicitly plan for the integration and interpretation of the various components. In the present study an attempt has been made to reduce the errors at global scale in AOD by developing a merging approach to obtain global monthly mean clear-sky daytime AODs, using observations from independent sources. The methodology was implemented with a 2-year record of simultaneous information from model outputs, satellite retrievals and ground observations. This approach has the following merits:
 1. Leading EOFs can retrieve the significant and geographically continuous variation signals from model and satellite data.
 2. Fitting the leading EOFs to the ground observations can propagate the AERONET information in an inhomogeneous and anisotropic manner, with an amplitude that is close to the measurements in a least squares sense.
 3. Truncated EOF fitting is robust and not very sensitive to possible sampling errors in the ground observations. If the sampling errors lead to variations that cannot be explained by the leading EOFs, these signals will be largely ignored in the fitting process.
 Limitations regarding this scheme are:
 1. It is empirical in nature, and its assumptions can be only partially tested because of the limited amount of high-quality monthly mean, grid-area-averaged AOD data.
 2. Propagation of AERONET information in the time dimension was not implemented. Kaplan et al. constructed a first-order linear Markov model to provide further constraints on the temporal amplitudes. However, a reliable model of this type can be built only when the database of collocated information is expanded.
 3. A more realistic observation operator than H might ameliorate the regional representativeness problem of AERONET point measurements. However, finding the relationship between the areal average and the point value remains an open issue. It is hoped that the full potential of the proposed approach will be achieved when longer-term information from independent sources becomes available.
 This work was supported under the NASA Science Fellowship grant NGT530450 and NASA EOD/IDS grant NAG59634 to the University of Maryland. Thanks are due to the sponsoring agencies and to the providers of the satellite, ground, and model data used in this study. Helpful comments of Mian Chin and Theodore L. Anderson are gratefully appreciated. Insightful comments from reviewers helped to improve the manuscript and are acknowledged.