Application of spectral analysis techniques in the intercomparison of aerosol data: 1. An EOF approach to analyze the spatial-temporal variability of aerosol optical depth using multiple remote sensing data sets



[1] Many remote sensing techniques and passive sensors have been developed to measure global aerosol properties. While instantaneous comparisons between pixel-level data often reveal quantitative differences, here we use Empirical Orthogonal Function (EOF) analysis, also known as Principal Component Analysis, to demonstrate that satellite-derived aerosol optical depth (AOD) data sets exhibit essentially the same spatial and temporal variability and are thus suitable for large-scale studies. Analysis results show that the first four EOF modes of AOD account for the bulk of the variance and agree well across the four data sets used in this study (i.e., Aqua MODIS, Terra MODIS, MISR, and SeaWiFS). Only SeaWiFS data over land have slightly different EOF patterns. Globally, the first two EOF modes show annual cycles and are mainly related to Sahara dust in the northern hemisphere and biomass burning in the southern hemisphere, respectively. After removing the mean seasonal cycle from the data, major aerosol sources, including biomass burning in South America and dust in West Africa, are revealed in the dominant modes due to the different interannual variability of aerosol emissions. The enhancement of biomass burning associated with El Niño over Indonesia and central South America is also captured with the EOF technique.

1 Introduction

[2] Atmospheric aerosols have been identified as the largest source of uncertainty in anthropogenic forcing of climate change. Emphasis has been placed on retrieving aerosol properties (e.g., optical depth, Ångström Exponent) from satellite measurements in order to have the required global coverage. For example, the Moderate Resolution Imaging Spectroradiometer (MODIS) and Multi-angle Imaging Spectroradiometer (MISR) are two dedicated sensors whose data have been extensively used in aerosol research. With multiple observational data sets available, it is important to examine their consistency in representing the spatial and temporal variability of aerosol properties, especially in the study of aerosol climate effects and the validation of GCMs. For example, Liu et al. [2006] and Ginoux et al. [2006] both used Level 3 monthly mean satellite data to compare with aerosol models. In addition, large-scale variability is usually the focus when studying the interaction between climate modes and aerosol properties.

[3] In this paper, we focus on the analysis of monthly mean, gridded AOD products from several sensors and examine whether the different data sets capture the same spatial and temporal changes in aerosol properties. An Empirical Orthogonal Function (EOF) approach is adopted in order to reduce the noise level in the data, to strengthen the signal, and to isolate patterns of different variability. The method is well suited to the efficient representation of multidimensional aerosol data. Li et al. [2009] used rotated EOF analysis on Aerosol Index (AI) data and found that despite inherent differences in satellite orbit and instrumental design, the dominant modes of variability from Nimbus 7 TOMS, Earth Probe TOMS, and OMI AI data sets agree, and sources of major absorbing species are well separated. This method also revealed a consistent ENSO-AE correlation in MODIS, MISR and SeaWiFS data [Li et al., 2011]. These results imply that averaging and spectral analyses of satellite data emphasize the similarities rather than the differences. These methods are effective ways of assessing the utility of the data to describe large-scale space and time variability, which is not reflected in instantaneous, pixel-level comparisons. Here we perform EOF analysis on four independent data sets to demonstrate their ability to constrain the spatial distribution and temporal variations of aerosols, as well as to provide insights into potential problems. The agreement between the data sets also confirms that the EOF modes represent real aerosol variability and increases our confidence in using the data sets despite the differences in their sensors, sampling, and algorithms. The paper is organized as follows: Section 2 introduces the multiple data sets used, including MODIS on Aqua and Terra, MISR, and SeaWiFS. The EOF approach is described in section 3. Section 4 presents the results of the EOF analysis of the full AOD data set and of the AOD anomalies. Finally, the summary and conclusions are given in section 5.

2 Data

2.1 MODIS


[4] The MODIS instrument was launched on EOS-Terra in December 1999 and later also on EOS-Aqua in May 2002. It has a viewing swath of 2330 km and covers the entire surface of the Earth approximately every 2 days. The MODIS aerosol retrievals have separate algorithms for land and ocean. A detailed description of the retrieval algorithm for data collections 005 and 051 can be found in Levy et al. [2009]. Moreover, a Deep Blue algorithm [Hsu et al., 2004] has also been developed to retrieve aerosol properties over bright surfaces such as deserts, where the dark target approach is not applicable. The retrieval accuracy is found to be Δ(AOD) = ±0.03 ± 0.05 × AOD over ocean and Δ(AOD) = ±0.05 ± 0.15 × AOD over land [Remer et al., 2008], while one standard deviation of the effective radius retrievals falls within ±0.11 µm [Remer et al., 2008; Levy et al., 2010].
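As an illustration of how the quoted accuracy envelopes can be read, the following minimal sketch evaluates the half-width of the envelope for a given AOD. The function names are hypothetical, and applying the relative term to the reference AOD (rather than the retrieved one) is an assumption made here for illustration.

```python
def modis_expected_error(aod, surface):
    """Half-width of the accuracy envelope quoted in the text.

    surface is "ocean" (0.03 + 0.05 * AOD) or "land" (0.05 + 0.15 * AOD).
    """
    coefficients = {"ocean": (0.03, 0.05), "land": (0.05, 0.15)}
    absolute, relative = coefficients[surface]
    return absolute + relative * aod


def within_expected_error(retrieved, reference, surface):
    """True if the retrieval lies inside the envelope around the reference.

    Assumption: the envelope is evaluated at the reference AOD.
    """
    return abs(retrieved - reference) <= modis_expected_error(reference, surface)
```

For a reference AOD of 0.2, the half-width is 0.08 over land and 0.04 over ocean, so the same retrieval can fall inside the land envelope but outside the ocean one.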

[5] In this study, we use the MODIS collection 051 Level 3 monthly mean 550 nm AOD product from both the Aqua and Terra platforms. Deep Blue products [Hsu et al., 2006] are also included to fill the gaps in the dark target retrievals. The Level 3 product reports AOD on a 1° × 1° grid. The temporal coverage of the data used in this study is from October 2002 to December 2010. Land and ocean products are analyzed separately because different algorithms are used in the data processing.

2.2 MISR

[6] The MISR instrument was also launched onboard EOS-Terra in December 1999. It consists of nine pushbroom cameras that view the Earth in nine different directions [Diner et al., 1998; Martonchik et al., 1998]. MISR has a ~400 km swath width and takes 9 days for complete global coverage. The aerosol retrieval algorithm has been described in Martonchik et al. [2009]. It has been reported that about two thirds of the MISR AOD values fall within ±0.05 or ±0.2× (AERONET AOD), while more than a third are within ±0.03 or ±0.1× (AERONET AOD) [Kahn et al., 2010].

[7] MISR version F15_0031 Level 3 gridded and monthly averaged 550 nm AOD data are used. The original 0.5° × 0.5° data resolution has been rescaled to 1° × 1° resolution in order to be consistent with MODIS. The rescaling is performed by assigning equal weights to each subgrid, and the final 1° × 1° grid box is considered valid only when more than half of the subgrids have valid data. Here we also use data from October 2002 to December 2010.
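The rescaling rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the processing code used in the study: missing values are assumed to be stored as NaN, the function name `coarsen_half_degree` is hypothetical, and "more than half" is read as at least three of the four 0.5° subgrids.

```python
import numpy as np


def coarsen_half_degree(aod_half_deg):
    """Average 2 x 2 blocks of a 0.5-degree field onto a 1-degree grid.

    Each valid subgrid gets equal weight; a 1-degree cell is kept only
    when more than half (at least 3 of 4) of its subgrids are valid.
    """
    field = np.asarray(aod_half_deg, dtype=float)
    n_lat, n_lon = field.shape
    if n_lat % 2 or n_lon % 2:
        raise ValueError("grid dimensions must be even")

    # Gather the four subgrids of every 1-degree cell along the last axis.
    blocks = (
        field.reshape(n_lat // 2, 2, n_lon // 2, 2)
        .swapaxes(1, 2)
        .reshape(n_lat // 2, n_lon // 2, 4)
    )
    valid = ~np.isnan(blocks)
    n_valid = valid.sum(axis=-1)
    total = np.where(valid, blocks, 0.0).sum(axis=-1)

    coarse = np.full(n_valid.shape, np.nan)
    keep = n_valid > 2  # more than half of the 4 subgrids
    coarse[keep] = total[keep] / n_valid[keep]
    return coarse
```

Equal weighting ignores the small change in cell area with latitude inside a 1° box, which matches the description in the text.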

2.3 SeaWiFS

[8] The SeaWiFS instrument, launched on the SeaStar spacecraft in August 1997, is designed primarily for routine global ocean color measurements and the generation of ocean bio-optical property data. The instrument has a swath width of 1502 km and covers the globe in approximately 2 days. The SeaWiFS aerosol retrieval algorithm is split into three major components: ocean, vegetated land, and barren land. The ocean algorithm and the validation of the products are described by Sayer et al. [2012a]. The over-land retrieval uses the “Deep Blue” algorithm [Hsu et al., 2004, 2006; Sayer et al., 2012b], which is also part of the MODIS processing. The accuracy of the AOD retrieval at 550 nm is reported to be 0.05 ± 0.2× (AERONET AOD) over land [Sayer et al., 2012b] and 0.03 ± 0.15× (AERONET AOD) over ocean [Sayer et al., 2012a].

[9] Here we use Level 3 gridded monthly mean version 003 AOD products at 550 nm. The spatial resolution is selected at 1° × 1° to be consistent with the other data sets. The data period is from October 2002 to December 2010, when the mission was declared over.

[10] As mentioned above, the three satellite sensors have different sampling strategies and sampling frequencies. In Figure 1, we show the pixel counts per grid box for the four data sets averaged over the study period. All four data sets have sufficient sampling over the subtropics and midlatitudes. MISR and SeaWiFS have far fewer pixels over the cloudy tropical zones. Because of its narrower swath, MISR also has significantly fewer samples per grid box.

Figure 1.

Pixel count per grid box on a log scale for the four data sets used in the study, averaged over the entire study period. The MODIS pixel count for the Deep Blue product is not available, so most of the desert regions are not covered. The two MODIS data sets and SeaWiFS have good sampling over the subtropics and midlatitudes. MISR also has moderate sampling over these regions. The sampling of MISR and SeaWiFS over the tropics is limited.

3 Methodology

[11] EOF analysis [Bjornsson and Venegas, 1997] is performed to extract spatially and temporally varying modes within each data set. The land and ocean data are analyzed separately due to the different strategies and aerosol models used in their retrieval algorithm.

[12] The EOF method aims at decomposing the multivariate data matrix into a set of independent, orthogonal eigenvectors, with the first eigenvector explaining the most variance, the second eigenvector explaining the most of the remaining variance, and so on. EOF analysis has traditionally been applied to climate variables such as SST to examine climate modes. Here we consider it potentially useful in studying aerosol data for two main reasons: (1) the composition of aerosols is rather complicated, and different aerosol species have different mechanisms of generation, transformation, and deposition. The EOF method may help isolate independent aerosol sources or processes such as transport and removal; (2) the aerosol data sets are comparatively noisy, due to uncertainties in surface reflectance, cloud screening, instrument calibration, and retrieval assumptions. Ideally, much of the noise should be randomly distributed, and EOF analysis will filter the noise into a series of high-order modes while concentrating the signal in the dominant modes.

[13] Specifically, assume X is the N × M data matrix, where N is the number of locations and M is the number of observations at each location. Then the EOFs are found by determining the eigenvectors of the covariance matrix C, which is

$$C = X X^{T} \tag{1}$$

C is an N × N real, positive semidefinite matrix and can therefore be written as

$$C = E \Lambda E^{T} \tag{2}$$

where Λ is a diagonal matrix whose elements are the N eigenvalues of C and E is an orthogonal matrix whose columns are the N orthogonal eigenvectors, i.e., EOFs. Each EOF has a corresponding time series, the so-called Principal Component (PC), which can be computed from

$$P = X^{T} E \tag{3}$$

where P is an M × N matrix whose columns are the N PCs. So P and E satisfy

$$X = E P^{T} \tag{4}$$

Combining (1), (2), and (4), we can see that

$$P^{T} P = \Lambda \tag{5}$$

Since Λ is diagonal, the PCs are mutually orthogonal and the eigenvalues are equal to their variances.
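
The decomposition described in this section can be sketched numerically as follows. This is a minimal illustration rather than the code used in the study. It assumes a gap-free N × M array (locations × times), removes the time mean at each location, and forms the unnormalized product of the data matrix with its transpose as the covariance matrix; these choices and the function name `eof_analysis` are assumptions made here.

```python
import numpy as np


def eof_analysis(data):
    """EOFs and PCs of an N x M (locations x times) array without gaps.

    Returns (eofs, pcs, eigenvalues, explained), with modes sorted from
    largest to smallest eigenvalue.
    """
    x = np.asarray(data, dtype=float)
    # Assumption: remove the time mean at each location before the analysis.
    x = x - x.mean(axis=1, keepdims=True)

    cov = x @ x.T  # N x N, unnormalized
    eigenvalues, eofs = np.linalg.eigh(cov)  # eigh returns ascending order

    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eofs = eofs[:, order]  # columns are the EOFs

    pcs = x.T @ eofs  # M x N, columns are the PCs
    explained = eigenvalues / eigenvalues.sum()
    return eofs, pcs, eigenvalues, explained
```

The `explained` array gives the fraction of variance carried by each mode, which is how statements such as "the first four modes account for the bulk of the variance" can be checked. For a global 1° × 1° grid, N is large, so in practice a singular value decomposition of the data matrix is often preferred over forming the N × N matrix explicitly.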

[14] Prior to the analysis, the mean of each column of the data matrix has been removed. Moreover, as aerosols typically have strong seasonal cycles, we also repeat the analysis on the anomalies after removing the mean seasonal cycle. The calculation of the anomalies is described below.

[15] First, each monthly mean time series is organized as $Z_{y,m}$, which denotes the observation in year y and month m. Next, the mean seasonal cycle is calculated by averaging each calendar month over all years:

$$\bar{Z}_{m} = \frac{1}{Y_{m}} \sum_{y} Z_{y,m} \tag{6}$$

where $Y_{m}$ is the number of years with an observation in month m.

[16] Finally, the mean seasonal cycle is removed from each observation by

$$Z^{a}_{y,m} = Z_{y,m} - \bar{Z}_{m} \tag{7}$$

[17] Here $Z^{a}$ denotes the anomaly of Z, which is the “deseasonalized” data.
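
The anomaly calculation can be sketched as follows, assuming the monthly values are stored consecutively along the first axis and missing values are NaN. The function name `remove_seasonal_cycle` and the `first_month` argument are hypothetical and only serve the illustration.

```python
import numpy as np


def remove_seasonal_cycle(series, first_month):
    """Subtract the mean seasonal cycle from consecutive monthly values.

    series: array with time (months) on the first axis.
    first_month: calendar month (1-12) of the first sample.
    """
    z = np.asarray(series, dtype=float)
    # Calendar month index of every sample (0 = January).
    month = (np.arange(z.shape[0]) + first_month - 1) % 12

    anomalies = np.empty_like(z)
    for m in range(12):
        sel = month == m
        if sel.any():
            # Mean over all years for this calendar month, ignoring gaps.
            anomalies[sel] = z[sel] - np.nanmean(z[sel], axis=0)
    return anomalies
```

For a record starting in October, as in this study, the call would be `remove_seasonal_cycle(series, first_month=10)`; months from October to December then have one more year in their average than the others.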

4 Results

4.1 Analysis of AOD Data

[18] The first four EOF modes of AOD data from MODIS, MISR, and SeaWiFS explain the bulk of the variance (>50%) in the data for both land and ocean (Figures 2-5). The EOF patterns and the PCs agree quite well across the data sets. For the two MODIS data sets and MISR, the correlations between the PCs are well above 0.9 and those between the EOFs are mostly above 0.8 (Tables 1 and 2). The relatively lower correlation between SeaWiFS and the other data sets may arise from the AOD threshold value specified in its retrieval algorithm [Sayer et al., 2012b] and may be related to some particular regions and aerosol types as discussed in section 4.2.
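
One way such a comparison between data sets could be set up is sketched below. Because the sign of an EOF/PC pair is arbitrary and the order of modes with similar variance can differ between data sets, this sketch uses the absolute Pearson correlation and picks the best-matching mode. That choice, and the function names, are assumptions made for illustration; the text does not state how the correlations in the tables were computed.

```python
import numpy as np


def mode_correlations(pcs_a, pcs_b):
    """Absolute Pearson correlation between the PCs of two data sets.

    pcs_a, pcs_b: arrays of shape (time, modes) on the same time axis.
    Returns an array of shape (modes_a, modes_b).
    """
    a = np.asarray(pcs_a, dtype=float)
    b = np.asarray(pcs_b, dtype=float)
    a = (a - a.mean(axis=0)) / a.std(axis=0)
    b = (b - b.mean(axis=0)) / b.std(axis=0)
    return np.abs(a.T @ b) / a.shape[0]


def best_match(pcs_a, pcs_b):
    """For each mode of data set A, index and |r| of the closest mode in B."""
    corr = mode_correlations(pcs_a, pcs_b)
    return corr.argmax(axis=1), corr.max(axis=1)
```

The same functions can be applied to the spatial patterns by passing arrays of shape (locations, modes) restricted to grid boxes that are valid in both data sets.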

Figure 2.

The first four EOFs of land AOD data with the seasonal cycle left in. For SeaWiFS, Mode 2* and Mode 3* are the original Mode 3 and Mode 2. The first four modes agree closely among the data sets. The first two modes are dominated by dust and biomass burning aerosols, respectively, indicating these are the major aerosol types globally.

Figure 3.

The PC time series of the first four modes shown in Figure 2. The first two PCs exhibit annual cycles. The first PC peaks in the summer, corresponding to dust variability. The second PC peaks around October, representing the biomass-burning season over South America and South Africa. The third and fourth PCs show semiannual variability.

Figure 4.

The first four EOFs of ocean AOD data with the seasonal cycle left in. The agreement among the data sets is even better than over land. The signals are concentrated over coastal regions, indicating that ocean aerosols are mostly dominated by land sources.

Figure 5.

The PC time series of the first four modes shown in Figure 4. Similar to land data, the first two PCs exhibit annual cycles and the next two PCs show semiannual variability.

Table 1. Land Data—Correlation Between the First Four EOFs and PCs
Table 2. Ocean Data—Correlation Between the First Four EOFs and PCs

[19] From Figures 2 and 3, the first PC of land data displays an annual cycle with a maximum in the boreal summer and a minimum in the boreal winter. The associated spatial pattern also shows reversed signs for the northern and southern hemispheres. The dust source regions [e.g., Washington et al., 2003], including North Africa (Sahara desert), the Middle East, and Northwest China (Taklamakan desert), have the strongest signals in EOF 1. PC 2 also has an annual cycle but peaks in the northern hemisphere autumn (September to November). For this EOF, positive signs are observed over South America, South Africa, and Southeast Asia, and negative signs mainly appear over the Sahel. This mode is likely to be associated with the spatial and temporal variability of biomass burning aerosol (sources described in van der Werf et al. [2006]). The results of the first two EOFs indicate that dust and biomass burning aerosols account for the bulk of the global aerosol loading, as well as its temporal variability. The third and fourth modes show a mixture of aerosol signals, and their PCs exhibit more intra-annual variability.

[20] The results for ocean data (Figures 4 and 5) are similar to those for land. The agreement across the data sets is even better, especially for SeaWiFS. This is not surprising, as the retrieval of aerosol properties over the dark water surface is a much easier task [Mishchenko et al., 1999]. The first four modes also explain most of the variance. The EOF patterns are mostly associated with aerosol transport from land sources, as the highest AOD values are generally found over coastal regions. Consistent with the land results, the first two PCs of ocean data both display clear seasonal cycles. The first EOF is also related to the dust pattern, with prominent transport features off the West African coast and to the North Arabian Sea. The second EOF pattern highlights transport of biomass burning aerosols from South Africa, the Sahel, and tropical Asia. In addition, EOF 2 also appears to show transpacific transport of Asian aerosols, which peaks during the spring [e.g., Yu et al., 2008]. However, compared with MODIS and MISR, the second and fourth EOF modes of SeaWiFS seem to miss the transport of biomass burning aerosols from South Africa. This feature is consistent with that of land data and will be further examined in section 4.3.

[21] Overall, the EOF analysis results suggest that different data sets agree on the spatial and temporal variability of predominant aerosols that produce the strongest signal in satellite retrievals. Globally, the bulk of the signals in the AOD are seasonal variability of northern and southern hemisphere aerosols, which are dominated by dust and biomass burning, respectively. The high correlation indicates that the four data sets are consistent in representing dominant spatial and temporal variability of global aerosols, despite their different characteristics in sampling, calibration, retrieval assumptions, etc. We thus conclude that these data are reliable in examining large-scale aerosol properties.

4.2 Analysis of AOD Anomalies

[22] In a further step, we repeat the EOF analysis on the AOD anomaly data set constructed in section 3. This allows a clearer separation of aerosol source regions with different interannual variability. It also helps examine the variability of aerosols with certain atmospheric processes or climate modes. Figures 6-9 show the first four EOFs of Aqua, Terra MODIS, MISR, and SeaWiFS AOD over land and ocean, respectively. The correlation between the EOFs and PCs is still mostly above 0.5 (Tables 3 and 4). EOF 1 of land data clearly displays the strong biomass burning source in the Amazonian Basin. The strong positive anomaly in 2007, followed by two strong negative anomalies in 2008 and 2009, is consistent with Torres et al. [2010]. EOF 2 of Aqua MODIS and EOF 3 of Terra MODIS and MISR highlight dust sources over North Africa and south of the Sahel. For SeaWiFS, the biomass burning and dust signals are split between the first two modes. EOF 3 of Aqua MODIS data and EOF 2 of Terra MODIS and MISR data, although noisier, capture aerosol variability over East Asia.

Figure 6.

The first four EOFs of land AOD anomalies. Mode 2* and Mode 3* of Terra data are the original Mode 3 and Mode 2, respectively. SeaWiFS Mode 1* and Mode 2* are the original Mode 2 and Mode 1, respectively. The deseasonalized data appear noisier. Nonetheless, the four data sets still agree well in the first two modes. Biomass-burning regions show up in the first mode, indicating the higher interannual variability of aerosols over these regions.

Figure 7.

The PC time series of the first four modes shown in Figure 6.

Figure 8.

The first four EOFs of ocean AOD anomalies. Mode 1* and Mode 2* of SeaWiFS data are the original Mode 2 and Mode 1, respectively. The four data sets show consistent patterns of aerosol transport from major source regions, and the agreement is better than for the “deseasonalized” land data.

Figure 9.

The PC time series of the first four modes shown in Figure 8. Note that all four data sets show the enhanced biomass burning over Southeast Asia associated with the 2006 El Niño (grey shade in PC4).

Table 3. Land Anomalies—Correlation Between the First Four EOFs and PCs
Table 4. Ocean Anomalies—Correlation Between the First Four EOFs and PCs

[23] With respect to the oceans, the dominant four EOFs are also well correlated, except that the second and third EOFs of MISR are flipped in order relative to those of MODIS and SeaWiFS, i.e., EOF 2 of MISR is correlated with EOF 3 of MODIS and EOF 3 of MISR is correlated with EOF 2 of MODIS. Similar to the results without removing the seasonal cycle, high AODs over oceans are mostly found in coastal regions that are associated with major land sources. The first EOFs are clearly associated with dust transport off the West African coast (Figures 8 and 9). Interestingly, EOF 4 of all four data sets captures the increased biomass burning over Southeast Asia during the 2006 El Niño (Figures 8 and 9), although the spatial signal from Aqua data appears weaker. This result is consistent with previous studies by van der Werf et al. [2006], Le Page et al. [2007] and Logan et al. [2008], which documented intensified biomass burning over Indonesia. It indicates that EOF analysis is able to separate the interannual variability of aerosol properties influenced by climate modes such as ENSO.

[24] In sum, the comparison between the EOF modes of AOD anomalies also reveals overall consistency. The biomass burning region over South America appears in the first EOF of land data. Moreover, the influence of ENSO on aerosol variability over Indonesia is also reflected in one of the major modes in all four data sets over ocean. MODIS and MISR AOD achieve fairly good agreement in terms of large-scale aerosol changes. SeaWiFS data also agree on the mean condition. However, the aerosol types are not as well separated by the analysis as they are for MODIS and MISR. Factors influencing the EOF patterns of different data sets include instrumental calibration issues, aerosol models used in the retrievals, cloud screening, surface reflectance, and surface wind speed, and will be discussed in a separate study.

5 Summary and Conclusions

[25] In this study, we use an EOF approach to analyze the spatial and temporal variability in multisensor aerosol retrievals and examine the consistency and differences between the four data sets. Analysis of the AOD data indicates good agreement over both land and ocean. The major modes are highly correlated in both the spatial pattern and time series. The dominant feature of land AOD is associated with dust sources, including the Sahara, Persian Gulf, and Central Asia. And the second largest aerosol signal is attributed to biomass burning over South America, South Africa, and the Sahel region. Over the oceans, the dominant aerosol regimes are similar. Transpacific transport from Asia also appears in the second EOF in addition to biomass burning. Some differences in the SeaWiFS data mainly come from South America and Africa and are associated with biomass burning aerosols.

[26] After removing the mean seasonal cycle in the data, biomass burning over South America appears in the dominant EOF due to its strong source strength and comparatively large interannual variability, while the second EOF features West African dust. The dominant EOFs of AOD anomalies over ocean exhibit similar results. Moreover, all four data sets capture the enhanced biomass burning around Indonesia during the 2006 El Niño.

[27] In conclusion, a parallel comparison between major EOF modes of different remote sensing products provides an alternative and effective means to assess the data consistency in representing aerosol variability on large spatial and temporal scales. Moreover, this method also helps to identify major aerosol sources and the influence of climate modes. While various sources of uncertainty still exist in aerosol retrievals, the results presented here indicate that the current MODIS, MISR, and SeaWiFS AOD data sets are useful on a large-scale basis and can be used to investigate aerosol sources and their variability. Further quantitative analysis of the differences in the data sets requires the use of different techniques and data levels, which will be the subject of future work.


[28] We thank the MODIS, MISR, and SeaWiFS science teams for providing the data used in this study. We also thank the anonymous reviewers for providing many helpful comments and suggestions. This study is funded by climate grant 509496. Jing Li is also funded by the NASA Postdoctoral Program (NPP).