Journal of Geophysical Research: Solid Earth

Extracting surface wave attenuation from seismic noise using correlation of the coda of correlation

Authors


Corresponding author: J. Zhang, EES-17: Geophysics, Los Alamos National Laboratory, Los Alamos, NM 87545, USA. (jjzhang@lanl.gov)

Abstract

[1] Extracting surface wave travel time information from the cross-correlation (CC) of seismic ambient noise has been a great success and remains a fast-growing field. However, it is still challenging to exploit the amplitude content of the noise CC. Although spatial averaging can yield meaningful constraints on attenuation from noise CC amplitudes, clear bias is observed when spatially varying attenuation is estimated with the traditional noise CC calculation methods. Perhaps the key lies in the development of novel techniques that can mitigate the effect of the uneven distribution of natural noise sources. In this paper, we propose a new method that uses the correlation of the coda of correlation of noise (C3) for amplitude measurement. We examine the ability of the method to retrieve surface wave attenuation using data from selected line array stations of the USArray. By comparing C3-derived attenuation coefficients with those estimated from earthquake data, we demonstrate that C3 effectively reduces bias and allows for more reliable attenuation estimates from noise. This is probably because the coda of noise correlation contains more diffused noise energy, and thus, the C3 processing effectively makes the noise source distribution more homogeneous. When selecting auxiliary stations for C3 calculation, we find that stations closer to noise sources (near the coast) tend to yield better signal-to-noise ratios. We suggest preprocessing noise data using a transient removal and temporal flattening method, to mitigate the effect of temporal fluctuation of the noise source intensity and to retain relative amplitudes. In this study, we focus our analysis on 18 s measurements.

1 Introduction

[2] Constructing the empirical Green's function (EGF) from the cross-correlation (CC) of seismic noise has proven to be a standard and successful tool for imaging Earth's velocity structure [e.g., Shapiro et al., 2005; Sabra et al., 2005; Yao et al., 2006; Lin et al., 2007]. Although the EGF constructed from noise is usually not exact due to the uneven distribution of noise sources, bias in travel time estimates is found to be small [Yao et al., 2006; Yang et al., 2008; Yao and van der Hilst, 2009; Harmon et al., 2010]. The ability to accurately retrieve surface wave travel time from noise, in spite of the lack of full noise field diffusivity, is also demonstrated in theory [Snieder, 2004; Godin, 2009; Weaver et al., 2009; Froment et al., 2010].

[3] The EGF derived from seismic noise correlation also contains, in its amplitude, information about the Earth's attenuation. In laboratory experiments and numerical simulations, Weaver and Lobkis [2001], Larose et al. [2007], Cupillard and Capdeville [2010], and Weaver [2011a] were able to extract accurate medium attenuation from acoustic ambient noise. Early efforts using seismic noise correlation to study surface wave amplitudes were also encouraging. Prieto and Beroza [2008] showed a clear correlation between the relative amplitudes of seismic noise EGF at different stations and those obtained from earthquakes. Using a line array, Matzel [2008] obtained a noise CC amplitude decay that is related to surface wave geometric spreading and attenuation. In addition, Taylor et al. [2009] developed a standing-wave method that can be used to estimate site amplifications using seismic noise. Unlike travel time, however, bias in attenuation estimates from EGF amplitudes can be significant if noise sources are unevenly distributed [Harmon et al., 2010; Cupillard and Capdeville, 2010; Tsai, 2011], which is often the case in seismology. For example, the dominant sources of the microseism field across southern California lie in multiple locations, and their locations and strengths change with season [Gerstoft and Tanimoto, 2007].

[4] Therefore, reliably extracting attenuation properties of the Earth structure from seismic noise requires techniques that can reduce the bias due to uneven noise source distributions. Prieto et al. [2009, 2011] proposed an approach to estimate surface wave attenuation using the spatial coherency of the seismic noise field. They used azimuthal averaging to suppress the effect of noise directionality. Lawrence and Prieto [2011] extended the method to image the lateral variation of surface wave attenuation for the western United States. Although there are theoretical arguments that the expression conjectured by Prieto et al. [2009] may not be strictly valid in the presence of unevenly distributed noise sources [Tsai, 2011] (Weaver, personal communication), Nakahara [2012] showed that this approach is at least approximately valid for weak attenuation. Spatial averaging was examined in the time domain as well by Lin et al. [2011], who showed that EGF amplitudes azimuthally averaged over a large region yield average attenuation estimates that are consistent with estimates derived from earthquakes.

[5] Recently, Weaver [2011a, 2011b] established theoretically and numerically that accurate medium attenuation could be retrieved from noise CC amplitudes if noise source intensity varies smoothly as a function of location, particularly near the extension of the line linking the two stations. The CC amplitude from such a noise field depends mainly on the medium attenuation, site amplification, and noise source intensity in the direction of the interstation line, but little on noise source intensities in other directions. The result shows promise that spatially varying surface wave attenuation, site amplification, and noise source intensity may be derived from noise CC amplitudes without averaging if novel techniques can be developed to sufficiently diffuse the noise source distribution.

[6] In this study, we evaluate the method of calculating the correlation of the coda of correlation (C3) in reducing the bias in attenuation estimation from noise. Because the coda of the noise CC contains further scattered energy, the effective noise sources that generate coda are presumably more diffused. The C3 method was developed by Stehly et al. [2008] to address the problem of poorly constructed EGF in some directions due to the azimuthal variation of noise sources. It has been successfully used to retrieve robust travel time estimates in situations where noise source intensity varies strongly in space [Stehly et al., 2008; Garnier and Papanicolaou, 2009; de Ridder et al., 2009; Froment et al., 2011]. To the best of our knowledge, this study is the first attempt at exploiting the noise CC coda for accurate amplitude measurement. To preprocess the data, we use a temporal flattening method suggested by Weaver [2011a], instead of traditional methods such as one bit or running absolute mean (RAM) [Bensen et al., 2007], in order to mitigate the effect of temporal fluctuation of the noise source intensity and to retain relative amplitudes. We focus our analysis on 18 s measurements near the primary microseisms peak.

2 Data

[7] We collected 2 years (2007 and 2008) worth of continuous, long-period (one sample per second), vertical-component seismograms from the Transportable Array (TA) of the USArray and Southern California Seismic Network (CI). We then form line arrays of different lengths, locations, and directions from among all the stations for our analysis. We require a minimum of five stations for an array. For each array, we find an earthquake that is either located near one of the end stations of the array or along the extension of the array. The earthquake also has to be recorded by the array with sufficient signal-to-noise ratios (SNRs). The selected line arrays and earthquakes are listed in Table 1. The maximum difference in azimuth between the reference end station and all other stations in an array is less than 15°. The numbers of common days when all stations in a line array have data range from 125 to 365. Except for lines 3 and 11 (Table 1), where we use surface waves from a distant earthquake along the extension of the array, the maximum distance between the earthquakes and corresponding reference stations is 33 km. Before the correlation calculation, data are band pass filtered with a frequency domain Gaussian filter of the form $\exp[-\alpha(\omega-\omega_0)^2/\omega_0^2]$ within a frequency band and zero outside the band [Herrmann, 1973], where α is the filter constant that dictates the filter width and ω0 is the center circular frequency. We set the filter constant to 20. Earthquake signals are filtered twice using the same filter for noise CC comparison and four times for C3 comparison. This is because noise CC calculation effectively doubles the order of the filter, and the C3 calculation doubles it again. Tests confirm that results filtered this way have consistent frequency contents.
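The filtering step can be sketched in a few lines. This is a minimal illustration, not the processing code used in the study: the function name `gaussian_bandpass`, the `passes` argument, and the Gaussian form exp[−α(ω − ω0)²/ω0²] (the commonly used Herrmann-type filter) are assumptions, and the band limits outside which the filter is set to zero are not given in the text, so no cutoff is applied here.

```python
import numpy as np

def gaussian_bandpass(trace, dt, period, alpha=20.0, passes=1):
    """Frequency-domain Gaussian filter centered at the given period (s).

    Assumed gain: exp(-alpha * ((omega - omega0) / omega0)**2).
    passes=2 mimics filtering twice (noise CC comparison),
    passes=4 mimics filtering four times (C3 comparison).
    """
    n = len(trace)
    omega = 2.0 * np.pi * np.fft.rfftfreq(n, d=dt)   # circular frequencies
    omega0 = 2.0 * np.pi / period                    # center circular frequency
    gain = np.exp(-alpha * ((omega - omega0) / omega0) ** 2)
    # Applying the filter k times multiplies the spectrum by gain**k.
    return np.fft.irfft(np.fft.rfft(trace) * gain ** passes, n=n)
```

Because the filter is zero phase, repeated application only narrows the passband, which is why matching the number of passes keeps the earthquake and EGF frequency contents comparable.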

Table 1. List of Earthquakes and Line Arrays Used in the Study
| Earthquake | Array | Figure |
| 2008/07/29 Mw5.4 33.96N 117.77W Depth: 14 km | 1: BFS-GSC-SHO-V11A-U11A-T11A-S13A-R13A-R14A-Q14A-Q15A-P15A-P16A-O16A | Figures 1, 6, and 11 |
| | 2: BFS-SLA-FUR-U10A-GRA-S10A-S11A-R10A-R11A-Q10A-Q11A-P11A-P12A-O11A-O12A-N11A-N12A-M12A-M13A-L12A-L13A | Figures 9a and 9d |
| 2008/03/25 Mw4.2 44.71N 110.07W Depth: 9 km | 3: G15A-F13A-F12A-F11A-E11A-F10A-E10A-E09A-D09A-D08A-E08A-E07A-D07A-E06A-D06A-D05A | Figure 7 |
| 2007/06/12 Mw4.6 37.54N 118.86W Depth: 10 km | 4: MLAC-R05C-P05C-O05C-M02C-L02A | Figure 8 |
| | 5: MLAC-P07A-O07A-N07B-N08A-M07A-M08A-L08A-K07A-K08A-J07A-H08A-G08A-F08A | |
| | 6: MLAC-Q08A-P09A-O09A-O10A-N10A-N11A-M11A-L12A-K12A-J13A-G15A | |
| | 7: MLAC-S11A-S12A-S13A-S14A-S15A | |
| | 8: MLAC-GRA-U10A-SHO-V11A-V12A-W12A-NEE2-W13A-X13A-Y14A | |
| | 9: MLAC-CWC-LRL-RRX-BBR-SWS | |
| 2007/03/09 Mw4.7 38.43N 119.38W Depth: 9 km | 10: R06C-P08A-O09A-O10A-N10A-N11A-M11A-M12A-L12A-L13A-K13A | Figures 9b and 9e |
| 2007/07/27 Mw5.1 44.39N 129.78W Depth: 10 km | 11: K04A-L07A-M08A-M09A-N11A-O12A-O13A | Figures 9c and 9f |
| 2008/02/21 Mw6.0 41.19N 114.86W Depth: 8 km | M12A against all other TA and CI stations operated during 2008 | Figure 3 |

[8] We calculate daily noise CC and C3 between the reference station (colocated with, or nearest to, the selected earthquake) and the rest of the stations in an array. Seats et al. [2012] recommended dividing the data into shorter and overlapping time windows before calculating noise CC. Our tests using 30 min windows indicate little improvement in terms of the comparison between the noise- and earthquake-based attenuation estimates. We thus choose not to apply a shorter window calculation because of concerns about the memory cost of calculating C3. We measure envelope amplitudes of both noise EGF and filtered earthquake signals for comparison. Geometric spreading is corrected by multiplying the amplitude by a distance term $\sqrt{\sin d}$, where d is the interstation or epicentral distance in degrees. The decay of the corrected EGF amplitude A as a function of distance r can then be used to estimate the average attenuation coefficient γ along the array, through the following relationship: A(r) = exp(−γr). Table 2 lists the attenuation (and velocity) estimates along with associated 95% confidence intervals from earthquakes, C3 EGF, and noise CC-derived EGF for all line array examples we present below.
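As an illustration of how γ follows from the amplitude decay, the sketch below corrects amplitudes for geometric spreading and fits a straight line to the log amplitudes. It is a hedged example under stated assumptions: the function name `attenuation_from_amplitudes` is hypothetical, the spreading correction is assumed to be the standard √(sin d) term, the conversion of 111.19 km per degree is an assumed constant, and the 95% confidence intervals reported in Table 2 are not computed here.

```python
import numpy as np

def attenuation_from_amplitudes(dist_km, amp, km_per_deg=111.19):
    """Estimate the attenuation coefficient gamma (1/km) along a line array.

    dist_km : interstation or epicentral distances (km)
    amp     : peak envelope amplitudes measured at those distances
    """
    dist_km = np.asarray(dist_km, dtype=float)
    amp = np.asarray(amp, dtype=float)
    # Assumed geometric spreading correction: multiply by sqrt(sin(d)),
    # with d the angular distance.
    d_rad = np.radians(dist_km / km_per_deg)
    corrected = amp * np.sqrt(np.sin(d_rad))
    # ln A(r) = const - gamma * r, so gamma is minus the fitted slope.
    slope, _intercept = np.polyfit(dist_km, np.log(corrected), 1)
    return -slope
```

The intercept absorbs any constant scaling (source strength or site term common to the array), so only the relative decay with distance constrains γ.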

Table 2. Velocity and Attenuation Estimates with 95% Confidence Interval from Peak Envelope Amplitudes of Earthquakes and Noise EGFs
| Line | Velocity (km/s): EQ | C3 | CC Flattening | CC RAM | Attenuation Coefficient (×10⁻⁴ km⁻¹): EQ | C3 | CC Flattening | CC RAM |
| 1 | 2.9 ± 0.1 | 3.0 ± 0.2 | 3.0 ± 0.2 | 3.0 ± 0.2 | 8.4 ± 3.4 | 8.1 ± 6.1 | 4.1 ± 6.5 | 1.8 ± 5.1 |
| 2 | 2.9 ± 0.1 | 2.8 ± 0.1 | 2.8 ± 0.1 | 2.8 ± 0.1 | 5.9 ± 3.8 | 6.2 ± 7.1 | 2.5 ± 6.6 | 0.2 ± 6.3 |
| 3 | 2.8 ± 0.1 | 2.8 ± 0.2 | N/A | N/A | 9.1 ± 7.0 | 10.5 ± 8.7 | N/A | N/A |
| 4 | 2.5 ± 0.6 | 2.8 ± 0.8 | N/A | N/A | 17.5 ± 5.5 | 10.9 ± 22 | N/A | N/A |
| 5 | 2.7 ± 0.2 | 2.9 ± 0.5 | 2.9 ± 0.2 | 2.9 ± 0.2 | −4.9 ± 4.8 | 4.5 ± 9.0 | 1.7 ± 11 | 1.7 ± 10 |
| 6 | 3.0 ± 0.3 | 2.8 ± 0.1 | 2.9 ± 0.1 | 2.9 ± 0.1 | 4.8 ± 5.0 | 2.9 ± 5.8 | −0.1 ± 3.7 | −2.4 ± 3.0 |
| 7 | 2.7 ± 0.3 | 2.8 ± 0.6 | 2.8 ± 0.3 | 2.8 ± 0.3 | −1.1 ± 7.5 | −0.8 ± 11 | 6.6 ± 20 | 0.8 ± 18 |
| 8 | 3.0 ± 0.2 | 3.0 ± 0.7 | 3.1 ± 0.4 | 3.1 ± 0.3 | 5.1 ± 9.9 | 4.1 ± 16 | 1.8 ± 19 | 0.3 ± 15 |
| 9 | 2.5 ± 0.2 | 2.7 ± 0.5 | 2.8 ± 0.3 | 2.9 ± 0.2 | 7.1 ± 29 | 5.1 ± 10 | −2.0 ± 14 | −4.4 ± 8.5 |
| 10 | 2.9 ± 0.1 | 2.9 ± 0.1 | 2.9 ± 0.1 | 2.9 ± 0.1 | 5.7 ± 5.7 | 3.2 ± 10 | −0.8 ± 7.3 | −2.4 ± 7.2 |
| 11 | 2.9 ± 0.2 | 3.1 ± 1.0 | 2.8 ± 0.3 | 2.8 ± 0.3 | 8.2 ± 12 | 8.8 ± 14 | 1.9 ± 11 | 2.9 ± 5 |
| 1 (8 s) | 2.9 ± 0.2 | 2.8 ± 0.3 | 2.8 ± 0.3 | 2.8 ± 0.3 | 11.1 ± 11 | 11.9 ± 12 | 10.6 ± 9.6 | 5.7 ± 9.6 |

[9] To measure surface wave velocity from noise-derived EGF, previous studies often construct a symmetric EGF by stacking negative and positive lag components (signals that travel in opposite directions) to improve SNR. For attenuation estimation, however, we should use only the outgoing signal that travels from the end reference station to other stations. The cross-correlation between signal x at the reference station and signal y at a second station is defined as follows [Bendat and Piersol, 2000]:

$$CC_{xy}(t) = \int x(\tau)\, y(\tau + t)\, d\tau \qquad (1)$$

Equation (1) shows that EGF at a positive lag time (t > 0), where the signal recorded by the reference station at time τ is correlated with the signal recorded by the second station at a later time τ + t, represents the outgoing energy from the reference station. In fact, Weaver [2011a] showed numerically that negative lag EGF amplitudes decay differently than positive lag EGF amplitudes if the noise field intensity varies azimuthally due to the variation of noise energy flux at different stations in a line array. As a result, we only measure amplitudes of positive lag EGF in our analysis.
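The lag convention matters for picking the outgoing signal, so a small sketch may help. It assumes the discrete, unnormalized form CC(k) = Σ x[n] y[n + k], with x the reference station; the function names `cross_correlate` and `positive_lag` are hypothetical and the normalization is left out.

```python
import numpy as np

def cross_correlate(x, y, max_lag):
    """Return lags (samples) and CC[k] = sum_n x[n] * y[n + k], |k| <= max_lag.

    x is the reference-station trace, y the second-station trace.
    Requires max_lag < len(x).
    """
    n = len(x)
    nfft = 2 * n                                   # zero pad: no wrap-around
    spec = np.conj(np.fft.rfft(x, nfft)) * np.fft.rfft(y, nfft)
    full = np.fft.irfft(spec, nfft)                # negative lags sit at the end
    cc = np.roll(full, max_lag)[: 2 * max_lag + 1]
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc

def positive_lag(lags, cc):
    """Keep only t > 0: energy travelling outward from the reference station."""
    keep = lags > 0
    return lags[keep], cc[keep]
```

With this convention, a signal that reaches the second station 15 samples after the reference station produces a peak at lag +15.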

3 Typical Bias Due to Uneven Noise Source Distribution

[10] When comparing noise CC amplitude decay with that from earthquake data, we often observe a typical bias as an underestimate of the attenuation. Figure 1 presents such an example using a line array (line array 1 in Table 1). The noise is preprocessed following traditional procedures involving instrument response correction, filtering, RAM normalization, and spectral whitening [e.g., Bensen et al., 2007]. We use a filter with the center period of 18 s in the filtering. We calculate the noise CC using daylong signals and stack the outputs over the days for which all stations in the array have data. Since all noise CCs are calculated with the same data length, we do not divide the stacked signals by the data length. To illustrate the bias introduced by symmetric EGF in attenuation estimates, we also include symmetric EGF measurements in this example.

Figure 1.

An example of typical bias of attenuation estimates from 18 s noise. (a) Map of the western United States showing USArray stations (blue triangles), an earthquake (yellow star, EQ2008/07/29 Mw5.4 33.96N 117.77W), and a line array (red triangles). The end reference station BFS is in black. (b) Comparison of the earthquake signals (black) with the CC-derived EGFs (red). In order to better compare the amplitudes, the earthquake signals are time shifted (same for Figures 6, 7, 11, and B1). (c) Comparison of the apparent attenuation estimated from the earthquake signal amplitudes (black stars) with those from noise CC (with RAM preprocessing), measured from positive lag time (blue circles) and symmetric (magenta crosses) EGF components, respectively. Attenuation coefficients (listed in the inset) are derived by fitting the logarithms of geometric spreading corrected amplitudes with straight lines. Amplitudes of different data sets are separated by multiplying each of them with a different scaling factor. This is done to ease the comparison. Again, the same procedure is used in following figures whenever applicable.

[11] For this line array, although clear EGF signal is constructed (Figure 1b), EGF amplitudes yield a much smaller attenuation coefficient (1.8E-4) than that from the earthquake data (7.9E-4), and the bias is even larger (−4.7E-4 versus 7.9E-4) when the symmetric EGF component is used (Figure 1c). The underestimation of attenuation from noise EGF can be explained by the existence of a directionally unsmooth or discrete distribution of noise source intensity. An extreme scenario is a discrete and focused noise source located along the extension of the line array, i.e., the source region is much smaller than the distance between the source and the line array. In this case, the source behaves like a point source. The geometric spreading term of the EGF amplitude for such a source is inline image, instead of inline image, where D is the distance in degrees between the source and the reference station of the line array, and d is the interstation distance in degrees [Cupillard and Capdeville, 2010]. If we correct EGF amplitudes using inline image instead of inline image, the resulting amplitudes differ from the correct amplitudes by a factor inline image, which decreases with increasing d. The end result is that amplitudes corrected for geometric spreading using inline image will decay slower than the amplitude decay due to attenuation, which is what we see in Figure 1c.

[12] To illustrate the variation of noise source distribution, we perform beamforming [Gerstoft and Tanimoto, 2007] (Appendix A) using seismic noise recorded at the CI component of USArray stations (Figure 2a). Figure 2b plots the beamforming outputs in terms of spectral power in dB. The major energy of the fundamental-mode Rayleigh wave (the high energy at ~0.3 s/km) as a function of azimuth serves as a proxy for the directional distribution of the noise source intensity. Clearly, the noise source intensity is anisotropic and temporally varying. At all four sample periods, noise sources tend to be more widely spread during winter, coming from both the Pacific and the North Atlantic, while in summer, sources from the South Pacific seem dominant. The figure shows that the noise sources may not be smooth and sometimes appear strongly localized (e.g., 16 s summer and 18 s winter). In cases where noise sources are localized, attenuation derived from noise EGF would likely be biased low, according to the point source scenario discussed above.

Figure 2.

An illustration of noise source distribution for noise recorded in the western United States from beamforming. (a) Map of the western United States showing a subset of USArray stations (red triangles) used for beamforming. (b) Beamforming outputs as a function of slowness (radial coordinate) and azimuth at periods of 8, 10, 16, and 18 s for winter (December 2006 to February 2007) and summer (June–August 2007), respectively. Power of each output is normalized to the scale of 0–1.

4 Azimuthally and Spatially Averaged Attenuation Coefficient Estimates From Noise

[13] Following Lin et al. [2011], we investigate whether an azimuthally and spatially averaged attenuation estimate determined from noise-derived EGF agrees with the estimate using an earthquake. For all USArray stations that recorded the 2008/02/21 earthquake (Table 1), we calculate noise CC between station M12A (approximately colocated with the earthquake) and all other stations (Figure 3a). We preprocess data using the RAM normalization [Bensen et al., 2007], as Lin et al. [2011] did, and perform duration correction by normalizing noise CC amplitudes with the length of the data used in the stacking. We then use the criteria of SNR > 2 and a distance range of 200–1000 km to select amplitude measurements for analysis. Here SNR is defined as the peak envelope amplitude of the EGF divided by the standard deviation of 1000 s noise CC coda. We measure EGF envelope amplitudes from both positive lag and symmetric components. After geometric-spreading correction, we compare the average amplitude decay of noise EGF for all stations with that of the earthquake signal at multiple periods between 6 and 25 s.
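The SNR criterion can be sketched as follows. The text specifies only the ratio of the peak envelope amplitude to the standard deviation of a 1000 s coda window; the function names, the explicit window arguments, and the choice to take the standard deviation of the raw coda samples (rather than of their envelope) are assumptions made for illustration.

```python
import numpy as np

def envelope(x):
    """Envelope via the analytic signal, computed with NumPy FFTs only."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))

def egf_snr(egf, dt, signal_window, coda_window):
    """Peak envelope in signal_window divided by the std of coda_window.

    Both windows are (start, end) times in seconds from the start of egf.
    """
    env = envelope(egf)
    s0, s1 = (int(round(t / dt)) for t in signal_window)
    c0, c1 = (int(round(t / dt)) for t in coda_window)
    return env[s0:s1].max() / np.std(egf[c0:c1])
```

A measurement would then be kept when `egf_snr(...) > 2` and the interstation distance lies between 200 and 1000 km.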

Figure 3.

Comparison of spatially averaged attenuation coefficients from noise EGF amplitudes with those from earthquake-generated amplitudes. (a) Map of the western United States showing the locations of an earthquake (yellow star), the reference station M12A (red triangle), and other stations (blue triangles) that recorded the earthquake and used in the analysis. (b) Attenuation coefficients at different periods estimated from amplitudes of noise EGF calculated for all stations shown in the map versus those estimated from the earthquake. Error bars indicate 95% confidence intervals. (c) At different periods, the log of positive lag EGF amplitudes as a function of distance (red dots) is compared with that of the earthquake data (black dots). Black straight lines are the best fits. Numbers are the estimated attenuation coefficients.

[14] Figure 3 shows the comparison result. Overall, the spatially averaged EGF amplitudes yield attenuation coefficient estimates that are similar to those from the earthquake across the whole microseism band. Most of the estimates from noise EGF are within 95% confidence intervals of corresponding earthquake estimates. In addition, EGF estimates for secondary microseism periods (6–10 s) are less consistent with the earthquake-based estimates than those for primary microseism periods (14–20 s), and average attenuation coefficients determined by using the positive lag EGF component agree slightly better with the earthquake-derived results. Figure 3c also shows that EGF amplitudes are more scattered than earthquake amplitudes.

[15] Our observations confirm Lin et al.'s [2011] conclusion that ambient seismic noise clearly contains meaningful anelasticity information of the Earth, and spatially averaged estimate could serve as a constraint for higher-resolution imaging. However, to image spatially varying attenuations from noise, it is still necessary to explore feasible processing methods that can reduce bias due to an inhomogeneous noise source distribution.

5 Transient Removal and Temporal Flattening

[16] Normalization such as one bit or RAM is often applied in traditional noise CC calculations [Bensen et al., 2007], which accelerates the EGF convergence efficiently for extracting travel time information. However, this may not be appropriate for extracting attenuation, as the relative amplitude information is lost during the nonlinear operation. Although one-bit preprocessing may still recover attenuation in the case of a uniform distribution of noise sources [Cupillard et al., 2011], simulations have shown that it fails for nonuniformly distributed noise sources [Cupillard and Capdeville, 2010; Weaver, 2011a], which is likely the case when dealing with real seismic data. In order to retain amplitude information, we employ a different preprocessing method in this analysis. We first remove transient signals from sources such as earthquakes and instrument glitches after band-pass filtering the data. We then use a temporal flattening technique suggested by Weaver [2011a] to reduce the effect of temporal fluctuation of noise intensity and to retain relative amplitudes. We note, however, that neither of these preprocessing techniques is aimed at smoothing the noise source distribution in space. Figure 4 gives an example illustrating the effects of the methods on the seismograms.

Figure 4.

An example illustrating data preprocessing using transient removal and temporal flattening techniques. (a) One year (2007) worth of raw seismic records at station BFS, which is shown in Figure 1a. Spikes are high-amplitude transient events (earthquakes, instrument glitches, etc.). (b) Noise data shown in Figure 4a after transient removal. (c) Global noise level averaged over all stations of the line array shown in Figure 1a. (d) Noise data shown in Figure 4b after temporal flattening using the global noise level shown in Figure 4c.

[17] For transient removal, we first calculate a daily median amplitude level (A1 day), i.e., the median of the envelope amplitudes of each daylong data trace, which represents the daily noise amplitude level, assuming that most of the data are noise with scarce transients. We then step through consecutive 10 min long windows and calculate the mean of the envelope amplitude for each window (A10 min). Transients are identified as those 10 min windows with A10 min > 2A1 day, and the data in these windows are replaced with zeros. We also tested using 1 month long data to calculate the average noise amplitude level, and the result is similar. Figure 4b shows that the method we use, along with the parameters we choose, adequately removes significant transient signals. We find that the amount of data removed constitutes less than 15% of the total data, which has little effect on the convergence of noise EGF.
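The transient removal rule above can be sketched as follows. This is a minimal example, assuming non-overlapping 10 min windows aligned to the start of the day; the function names and the default arguments are illustrative, and the envelope is computed with a NumPy-only analytic signal rather than whatever routine was used in the study.

```python
import numpy as np

def envelope(x):
    """Envelope via the analytic signal, computed with NumPy FFTs only."""
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * weights))

def remove_transients(day_trace, dt=1.0, window_s=600.0, factor=2.0):
    """Zero every 10 min window whose mean envelope exceeds factor * A_1day."""
    env = envelope(day_trace)
    daily_level = np.median(env)            # A_1day: daily median envelope
    nwin = int(round(window_s / dt))
    out = np.array(day_trace, dtype=float)  # work on a copy
    for start in range(0, len(out), nwin):
        window_level = env[start : start + nwin].mean()   # A_10min
        if window_level > factor * daily_level:
            out[start : start + nwin] = 0.0
    return out
```

Using the median for the daily level keeps the threshold insensitive to the transients themselves, as long as they occupy a small part of the day.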

[18] The transient-removed noise amplitudes can still vary with time (Figure 4b), both seasonally (high in winter and low in summer across the western United States) and occasionally (e.g., due to strong storms). Strong noise off the strike direction of a station pair could cause spurious noise CC arrivals and/or decelerate the convergence of EGF. To reduce the temporal variation of noise field intensity, we then apply the temporal flattening technique to the transient-removed data, in which we normalize each daylong trace by a global noise amplitude level for that day. We calculate the global noise level as the quadratic mean (square root of the mean of squares) of noise standard deviations at all stations in an array (and coda stations defined in section 6) after transients are removed. The flattened and transient-removed noise data (Figure 4d) allow rapid convergence of EGF, while relative amplitude information between stations is retained. Slight improvement from temporal flattening can be seen in Figure 6c for the line array in Figure 6a (the same array shown in Figure 1a), where temporal flattening results in an average attenuation coefficient of 4.1E-4, which is closer to the earthquake-derived attenuation coefficient (8.4E-4) compared with either one-bit (2.2E-4) or RAM (1.8E-4) normalization results.
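A sketch of the temporal flattening step is given below, assuming the daylong traces of all array (and coda) stations are stacked into one 2-D array. The function names are hypothetical, and whether zeroed transient windows are excluded from each station's standard deviation is not stated in the text; here they are simply included.

```python
import numpy as np

def global_noise_level(day_traces):
    """Quadratic mean of the per-station standard deviations for one day.

    day_traces : array of shape (n_stations, n_samples), transients removed.
    """
    station_std = np.std(day_traces, axis=1)
    return np.sqrt(np.mean(station_std ** 2))

def temporal_flatten(day_traces):
    """Divide all stations by the same daily level.

    One common scale factor per day removes day-to-day intensity changes
    while leaving amplitude ratios between stations untouched.
    """
    return np.asarray(day_traces, dtype=float) / global_noise_level(day_traces)
```

Because every station is divided by the same number, a day with a strong storm is scaled down as a whole, unlike one-bit or RAM normalization, which rescale each trace independently.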

6 Correlation of the Coda of Correlation (C3)

[19] Line array examples (e.g., Figures 1c and 6c) show a clear discrepancy between the noise CC-based attenuation estimate and that from earthquake data, despite improvements from the temporal flattening. Averaging over space and azimuth [Lin et al., 2011] may reduce bias, yet it does not allow for a spatially varying attenuation estimate. One possible approach to accounting for the effect of uneven noise source distribution is to first accurately characterize the distribution in both space and time [e.g., Stehly et al., 2006; Gerstoft and Tanimoto, 2007; Yang and Ritzwoller, 2008]. Another approach is to smooth the noise source distribution. Since noise EGF represents the signal from a virtual source at one of the two stations involved, the coda of the EGF should then be signals scattered by scatterers around the two stations. Using coda of noise EGF to calculate a second CC should then produce an EGF (C3 EGF) that is the result of a more homogeneous effective noise field from scatterers. In fact, Stehly et al. [2008] have demonstrated that calculating C3 using scattered coda energy in noise CC results in an EGF with improved time symmetry and less azimuthal dependence on the noise source distribution. Since then, a few studies [Garnier and Papanicolaou, 2009; de Ridder et al., 2009; Froment et al., 2011] have further established that the C3 method can suppress effects due to non-isotropic noise source distribution and enhance the quality of travel time estimates. In this section, we demonstrate that C3 also yields more reliable attenuation estimates.

[20] Figure 5 is an illustration showing how we calculate C3. In order to construct C3 EGF between a pair of stations R1 and R2, we first use preprocessed data to calculate daily CC between a third station S (termed “coda station”) and R1, CCS,R1, and between S and R2, CCS,R2. We then select the coda of EGF from CCS,R1 and CCS,R2 using a 1500 s time window Tcoda. The window starts at 500 s lag time, which is well after the surface wave arrival even for the longest interstation distance (<1000 km) in our analysis. We select coda from both positive and negative lag EGF. We then flip the coda segment from the negative lag EGF and add both coda segments together (mirror stacking) before calculating C3. The whole C3 calculation can be expressed as follows:

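$$C3_{R1,R2}(t) = \int \left[ CC_{S,R1}^{(+)T_{coda}}(\tau) + CC_{S,R1}^{(-)T_{coda}}(-\tau) \right] \left[ CC_{S,R2}^{(+)T_{coda}}(\tau + t) + CC_{S,R2}^{(-)T_{coda}}(-\tau - t) \right] d\tau \qquad (2)$$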

where (±)Tcoda signifies a positive or negative lag coda segment. We also tested using only positive or negative lag coda, or stacking positive and negative lag coda without flipping to calculate C3. Our conclusion is that the method we adopted yields the most accurate attenuation estimation. We provide a comparison of these different C3 calculation methods in Appendix B.
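The mirror stacking and the second correlation can be sketched as below for a single coda station and a single day. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, each input CC is assumed to be stored as an odd-length array with zero lag at the center, and normalization is omitted.

```python
import numpy as np

def mirror_stacked_coda(cc, dt=1.0, t_start=500.0, t_len=1500.0):
    """Flip the negative-lag coda onto the positive-lag coda and add them.

    cc : correlation with lags -M..M (length 2M + 1, zero lag at index M).
    """
    mid = len(cc) // 2
    i0 = int(round(t_start / dt))
    i1 = i0 + int(round(t_len / dt))
    pos = cc[mid + i0 : mid + i1]                      # lags +i0 .. +(i1 - 1)
    neg = cc[mid - i1 + 1 : mid - i0 + 1][::-1]        # lags -i0 .. -(i1 - 1)
    return pos + neg

def c3(cc_s_r1, cc_s_r2, max_lag, dt=1.0, t_start=500.0, t_len=1500.0):
    """Correlate the mirror-stacked codas of CC(S,R1) and CC(S,R2).

    Returns C3 for lags -max_lag..max_lag, where positive lags correspond
    to energy travelling from R1 toward R2.
    """
    a = mirror_stacked_coda(cc_s_r1, dt, t_start, t_len)
    b = mirror_stacked_coda(cc_s_r2, dt, t_start, t_len)
    nfft = 2 * len(a)                                  # zero pad: no wrap-around
    spec = np.conj(np.fft.rfft(a, nfft)) * np.fft.rfft(b, nfft)
    full = np.fft.irfft(spec, nfft)
    return np.roll(full, max_lag)[: 2 * max_lag + 1]
```

The final C3 EGF for a station pair would then be the sum of such outputs over coda stations and over common operating days.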

Figure 5.

An illustration of the calculation of correlation of the coda of correlation (C3). For a pair of stations R1 and R2, we first calculate noise CC between a third station S and R1, CCS,R1, as well as between S and R2, CCS,R2. C3 EGF between stations R1 and R2 is then calculated by cross-correlating the coda of CCS,R1 and CCS,R2.

[21] To compute C3 for a line array, we select multiple coda stations around the array. For station pairs in the array, we calculate C3 using each of the coda stations for each day when all stations, including coda stations, have data. We then stack these C3 over coda stations, as well as over the common operating days, to obtain the final C3 EGF for each station pair.

[22] We also test the effect of selected coda stations on the resulting SNR of the C3 EGF by choosing coda stations from different regions around the line array. We find that stations closer to the coast tend to allow C3 to converge to an EGF with higher SNR, except for the northwest United States, where coda stations further inland can also yield C3 EGF with high SNR. Note that for C3 EGF, we calculate SNR as the peak envelope amplitude of the EGF divided by the standard deviation of 200 s C3 coda, instead of 1000 s noise CC coda as we use for the SNR of noise CC-derived EGF. Indeed, there should be stronger noise energy near noise sources (along the coast), and thus presumably more scattered energy as well. We also find that the time symmetry feature of C3 depends little on the locations of the coda stations. As a result of our test, we choose stations for C3 calculations based on the criterion that they are within 1° of the coastline. They also need to have at least 100 operating days in common with the line array stations. An alternative approach would be to use all available stations as coda stations to calculate C3 and select those with high SNR EGF for stacking. However, this would incur a tremendous amount of computation. With the criterion that we use, which results in 27 or fewer coda stations for the examples shown below, we obtain sufficient SNR of the C3 EGF with a reasonable amount of computation.

[23] Below, we present examples showing the effectiveness of C3 in reducing the bias in surface wave attenuation estimates. We use an SNR of seven to select C3 EGF for the examples. All examples, except the last one, are for data filtered around 18 s. The last example shown in Figure 11 is for data filtered around 8 s.

[24] Figure 6 is the first example (line array 1, same as in Figure 1) comparing C3 EGF amplitudes with those of EGFs from traditional noise CC in terms of containing unbiased surface wave attenuation information. Figure 6b shows that with the method we use, C3 produces clear EGF signals. The minimum SNR of the EGFs is 12. The attenuation coefficients derived from C3 EGF amplitudes agree better with the earthquake-based estimate than the CC-derived results do (see Figure 6d versus 6c). This is true for all data preprocessing methods used: temporal flattening, RAM, and one bit. The temporal flattening data preprocessing yields the best match to the earthquake-based estimate (8.1E-4 versus 8.4E-4). In the following examples, we thus calculate C3 using data preprocessed by transient removal and temporal flattening only. It is also noteworthy that C3 EGF shows little bias in travel time as well. The estimated velocity of C3 EGF from Figure 6b is 3.0 ± 0.2 km/s, which is comparable to the earthquake-based estimate of 2.9 ± 0.1 km/s. We see similar consistency for other line array examples shown below as well (see Table 2).

Figure 6.

Comparison of attenuation coefficients extracted from noise CC-derived EGF, C3 EGF, and signals from an earthquake. Data are filtered around 18 s. Noise data are preprocessed with temporal flattening, RAM, or one-bit normalization. (a) Map of the western United States showing USArray stations (blue triangles), an earthquake (yellow star, EQ2008/07/29 Mw5.4 33.96N 117.77W), a line array BFS-O16A (red triangles), reference station BFS (black triangle), and coda stations used for C3 calculation (green triangles). (b) Comparison of the record section of the earthquake signal (black) with that of the C3 EGF (red). (c) Comparison of the apparent attenuation estimated from the geometric spreading corrected earthquake amplitudes (black stars) with those estimated from noise CC-derived EGF, respectively, preprocessed using temporal flattening (red circles), RAM (blue circles), and one bit (green circles). Straight lines are the best fits to the log amplitudes. Estimated attenuation coefficients are listed in the inset. (d) Comparison of the apparent attenuation estimated from geometric spreading corrected earthquake amplitudes (black stars) with those estimated from C3 EGF, respectively, preprocessed using temporal flattening (red circles), RAM (blue circles), and one bit (green circles). Straight lines are the best fits to the log amplitudes. Estimated attenuation coefficients are listed in the inset.

[25] As we argue in section 2, the outgoing signals (positive lag EGF of station pairs between a reference station and other stations in a line array) should be used when estimating attenuation in a certain direction. However, a positive lag EGF may not converge due to the lack of noise energy flux in the outgoing direction from the reference station. For instance, for the line array shown in Figure 7a (line array 3 in Table 1), the noise energy traveling westward across the line array is weak, resulting in poorly converged outgoing EGFs (red parts in Figure 7c). This problem can be mitigated by using C3, which provides nearly symmetric EGF with sufficient SNR (red parts in Figure 7d) and permits reliable attenuation extraction from noise (Figure 7b). The more symmetric characteristic of C3 is also valuable in tomographic attenuation-model inversions where crossing paths in all directions are desired.
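The distinction between positive and negative lag EGFs, and the degree of time symmetry, can be made concrete with a short sketch. It assumes a two-sided correlation stored with an odd number of samples and zero lag at the center. The correlation coefficient used as a symmetry measure is a choice made for this illustration, not a quantity defined in the text.

```python
import numpy as np


def split_lags(egf):
    """Split a two-sided correlation (odd length, zero lag at the center) into its two sides.

    Both outputs start at zero lag and run toward increasing |lag|.
    """
    egf = np.asarray(egf, dtype=float)
    if len(egf) % 2 == 0:
        raise ValueError("expected an odd number of samples with zero lag at the center")
    mid = len(egf) // 2
    positive = egf[mid:]      # lags 0, +dt, +2dt, ...
    negative = egf[mid::-1]   # lags 0, -dt, -2dt, ... (time reversed)
    return positive, negative


def symmetry_coefficient(egf):
    """Correlation coefficient between the positive lag and time-reversed negative lag sides."""
    positive, negative = split_lags(egf)
    return float(np.corrcoef(positive, negative)[0, 1])
```

A value near 1 indicates a nearly symmetric EGF, whereas a one-sided EGF, as expected when the noise flux is weak in one direction, gives a much lower value.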

Figure 7.

An example of successful attenuation extraction from positive lag C3 EGF at 18 s, where positive lag EGF from noise CC fails to converge. (a) Map of the western United States showing USArray stations (blue triangles), an earthquake (yellow star, EQ2008/03/25 Mw4.2 44.71N 110.07W), a line array G15A-D05A (red triangles), reference station G15A (black triangle), and coda stations used for C3 calculation (green triangles). (b) Comparison of the apparent attenuation estimated from geometric spreading corrected earthquake amplitudes (black stars) with that estimated from C3 EGF (red circles). Straight lines are the best fits to the log amplitudes. Estimated attenuation coefficients are listed in the inset. (c) Comparison of the record section of the earthquake signal (black) with that of the CC-derived EGF (blue). The positive lag EGFs are colored red and bounded by two red lines. (d) Comparison of the record section of the earthquake signal (black) with that of the C3 EGF (blue). The positive lag EGFs are colored red and bounded by two red lines.

[26] The improvement of C3 in estimating surface wave attenuation can be seen in a few more examples. In Figure 8, we examine six line arrays sharing a common reference station MLAC and sampling different directions (Figure 8a; line arrays 4–9 in Table 1). For four of the six lines in Figure 8 (line arrays 6–9), C3 EGFs yield attenuation estimates that are fairly consistent with those from the earthquake, whereas (except for line array 7) estimates from noise CC-derived EGFs are much lower or even negative (Figures 8d–8g).

Figure 8.

Comparison of attenuation coefficients extracted from C3 (with transient removal and temporal flattening preprocessing) and noise CC (with RAM preprocessing) of 18 s noise with that from an earthquake for six line arrays sampling different directions. (a) Map of the western United States showing USArray stations (blue triangles), an earthquake (yellow star, EQ2007/06/12 Mw4.6 37.54N 118.86W), the six line arrays (red triangles), the reference station MLAC (black triangle), and coda stations used for C3 calculations (green triangles). (b–g) Comparison of the apparent attenuation estimated from the geometric spreading corrected earthquake amplitudes (black stars) with those estimated from C3 (red circles) and noise CC-derived (blue circles) EGF, respectively, for line arrays 4–9. (Noise CC fails to converge for line array 4.) Straight lines are the best fits to the log amplitudes. Estimated attenuation coefficients are listed in the insets. Names of the line array stations are listed in Table 1.

[27] For line array 4 in Figure 8, because the positive lag EGFs from noise CC have very poor SNR, we are not able to make a noise CC-based estimate. The C3-based attenuation estimate for line array 4 (Figure 8b), although available, is less consistent with the earthquake estimate than those for the four better lines (line arrays 6–9). One possible reason for the inconsistency is that there are only five stations for the line, and the C3 EGF amplitudes show large variations. These factors certainly reduce the reliability of the estimate, which has an uncertainty of ±22E-4, the largest among all line arrays.

[28] Line array 5 in Figure 8 is another example showing noise attenuation estimates that are different from the earthquake estimate. For this line, earthquake amplitudes increase significantly with increasing distance, yielding an apparent attenuation coefficient of −4.9E-4, whereas noise EGF amplitudes decrease. Although it requires more detailed analysis to adequately explain this observation, which is beyond the scope of this study, there are several factors that could potentially contribute to the observed discrepancy. These factors include the difference between an earthquake and the virtual source at the reference station in terms of source size, mechanism, and location (including depth), as well as possible different path and/or site responses to earthquake and noise signals. These factors may also explain the observation that amplitude variations among stations in a line array sometimes exhibit different behavior between noise EGFs and earthquake signals. Some of these effects may be empirically corrected before inverting for an attenuation map. For example, Lin et al. [2012] developed a method that utilizes spatial differential operators to determine the local amplitude variation and the (de)focusing.

[29] Although our examples show only a path-averaged attenuation estimate for each line array, we see some interesting similarities between the estimates and published attenuation models for the western United States [Philips and Stead, 2008; Lawrence and Prieto, 2011], which correlate well with the geology and tectonics of the region. For example, attenuation along line array 4 in Figure 8, which traverses northern California, is particularly high both in the noise C3 estimate and in published models, particularly the Lg attenuation model of Philips and Stead [2008]. Low attenuation is observed along line array 7 in Figure 8, which extends from central Nevada to the Colorado Plateau and samples low-attenuation regions such as the central and eastern Nevada low-attenuation blocks and the Colorado Plateau in Philips and Stead's [2008] model. The other lines, which traverse regions with mixed attenuation characteristics, yield intermediate attenuation estimates.

[30] To increase sampling, Figure 9 shows additional line array examples (line arrays 2, 10, and 11 in Table 1). Overall, the improvement from C3 is apparent. Combining all the examples shown above, we quantify the improvement of C3 versus noise CC by plotting noise-derived attenuation coefficient estimates versus those from earthquakes in Figure 10a and their differences in Figure 10b. Examples from line arrays 3 and 4 are not included because we do not have estimates from noise CC-derived EGF for those two line arrays (see Table 2). Figure 10a shows that attenuation coefficients from C3 EGF correlate much more strongly with those from earthquakes than do attenuation coefficients from noise CC-derived EGF. The only outlier is the estimate for line array 5, where the earthquake data yield a negative attenuation coefficient. Figure 10b shows that estimates from noise CC not only have a larger bias (mean difference of −4.3E-4 for noise CC versus 0.3E-4 for C3; median difference of −5.4E-4 for noise CC versus −0.3E-4 for C3) but also differ from the earthquake results with more scatter than the C3 estimates do (two times standard deviation of 9.6E-4 for noise CC versus 7.2E-4 for C3). If we calculate noise CC using temporal flattening preprocessing instead of RAM (not shown), the mean difference between noise CC-derived EGF and earthquake estimates is −2.4E-4 and the median is −3.8E-4, with two times standard deviation of 10.3E-4. These examples demonstrate that C3 processing can largely reduce bias when extracting attenuation from noise.
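The bias statistics quoted above (mean, median, and two times the standard deviation of the differences between noise-derived and earthquake-derived attenuation coefficients) can be computed as in the following sketch. The function name is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which form was used.

```python
import statistics


def bias_summary(noise_estimates, earthquake_estimates):
    """Summarize differences (noise minus earthquake) between paired attenuation estimates."""
    if len(noise_estimates) != len(earthquake_estimates):
        raise ValueError("estimates must be paired one-to-one by line array")
    diffs = [n - e for n, e in zip(noise_estimates, earthquake_estimates)]
    return {
        "mean": statistics.mean(diffs),
        "median": statistics.median(diffs),
        # Sample standard deviation; an assumed choice, not stated in the text.
        "two_sigma": 2.0 * statistics.stdev(diffs),
    }
```

A mean or median far from zero indicates systematic bias, while a large two-sigma value indicates scatter, which are the two properties compared between C3 and noise CC.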

Figure 9.

Comparison of attenuation coefficients extracted from C3 (with transient removal and temporal flattening preprocessing) and noise CC (with RAM preprocessing) of 18 s noise with that from an earthquake for three more line arrays. (a–c) Map of the western United States showing USArray stations (blue triangles), an earthquake (yellow star), a line array (red triangles), the reference station (black triangle), and coda stations (green triangles) used for C3 calculations. (d–f) Comparison of the apparent attenuation estimated from geometric spreading corrected earthquake amplitudes (black stars) with those estimated from C3 (red circles) and noise CC-derived (blue circles) EGF, respectively, for lines in Figures 9a–9c. Straight lines are the best fits to the log amplitudes. Estimated attenuation coefficients are listed in the insets. Names of the line array stations are listed in Table 1.

Figure 10.

Comparison of the bias in noise-derived attenuation estimates between using C3 and noise CC. (a) C3-derived estimates (red circles) and noise CC-derived estimates (blue crosses) plotted against earthquake estimates. (b) Difference between attenuation coefficients estimated from noise and those from earthquakes for C3 estimates (red circles) and noise CC estimates (blue crosses). Horizontal lines indicate the mean of C3 estimates (red), the mean of CC estimates (blue), and zero (black). Error bars indicate the range of mean ± 2 times standard deviation (σ). Values of mean and two times standard deviation are also shown for C3 (red) and noise CC (blue) estimates, respectively.

[31] Finally, we test the ability of C3 to extract reliable surface wave attenuation for noise at the secondary microseism peak of 8 s. In general, the SNR of many examples (line arrays 4, 7, 8, 9, and 11) is poor for 8 s C3 EGF (e.g., Figure 11a for line array 8) and for noise CC-derived EGF as well (not shown). This may indicate that the 8 s noise field is much less diffused, despite the fact that noise at the secondary microseism peak is the strongest. Recall that the spatially averaged attenuation estimate near 8 s also shows larger bias and uncertainty (Figure 3b). A few examples with fairly good SNR (line arrays 1, 3, and 6) show good agreement between the C3-based attenuation estimate and that from the earthquake data (e.g., Figures 11b and 11c for line array 1). Improvement from applying temporal flattening to noise CC can also be seen in Figure 11c. For other examples (line arrays 2, 5, and 10), neither C3 EGF nor noise CC-derived EGF shows a convincing match with the earthquake result. The ability of C3 versus noise CC to extract spatially varying attenuation is thus less conclusive for noise around 8 s. However, it is interesting to note that in some cases, both the C3-derived estimate and that based on noise CC compare well with the earthquake result in amplitude variation between stations (e.g., Figure 11c).

Figure 11.

Examples of C3 EGF construction and attenuation extraction from 8 s noise. (a) Comparison of the record section of the 8 s earthquake signal (black) with that of the C3 EGF (red) for line array 8 in Figure 8. (b) Comparison of the record section of the 8 s earthquake signal (black) with that of the C3 EGF (red) for line array 1 (Table 1). (c) For line array 1 at 8 s, comparison of the apparent attenuation estimated from geometric spreading corrected earthquake amplitudes (black stars) with those estimated from C3 EGF (red circles) and from noise CC-derived EGF, respectively, preprocessed using temporal flattening (green circles) and RAM (blue circles). Straight lines are the best fits to the log amplitudes. Estimated attenuation coefficients are listed in the inset.

[32] The fact that extracting surface wave attenuation from noise does not work as well at 8 s as at 18 s may be indicative of stronger effects of small-scale medium heterogeneities and higher-mode surface wave contamination at shorter periods, which may be different for noise and earthquake signals with slightly different source and path characteristics. The generally low SNR of 8 s EGFs also affects the reliability of attenuation estimates. One way to address the problem of low SNR at 8 s is to use more coda stations and/or longer data records. Ma and Beroza [2012] showed that C3 EGFs could be constructed across seismic stations regardless of whether or not they operate simultaneously. Therefore, it is possible to calculate the C3 EGF between a pair of stations using data spanning a longer time than the common operating time of the stations. In addition, calculating another iteration of noise CC from C3 coda [Froment et al., 2011] may also help.

7 Conclusions

[33] We have experimentally examined procedures that can help extract more reliable surface wave attenuation information from noise correlations. Although a spatial average clearly shows meaningful amplitude information, traditional noise correlation processing often fails to extract unbiased spatially varying attenuation estimate from noise, presumably due to the uneven distribution of noise source intensity. Using data recorded by the USArray, we select line arrays to study the ability of noise correlation methods to extract surface wave attenuation and compare the results with earthquake data. With the line array examples, we demonstrate that the correlation of the coda of correlation (C3) effectively reduces bias and improves attenuation estimates at least for noise around the primary microseisms peak (18 s). Results for noise at the secondary microseisms peak (8 s) are less conclusive. For data preprocessing, applying temporal flattening to transient-removed data, instead of the traditional RAM method, also improves the attenuation estimate to some degree.

Appendix A

Array Beamforming

[34] We estimate source spectra and locations using frequency domain beamforming, in which the peak beam power indicates the speed and direction of the major noise energy [e.g., Gerstoft and Tanimoto, 2007]. In the frequency domain, the seismic displacement records are ordered into an N-dimensional vector $\mathbf{v}(\omega)$, where N is the number of stations. We assume a plane wave response $\mathbf{w}(\omega) = \exp(i\omega s\,\mathbf{r}\cdot\mathbf{e})$ for seismic noise energy, where $s$ is slowness, $\mathbf{r}$ describes the coordinates of the array stations relative to the array center, and $\mathbf{e}$ contains the direction cosines of the plane wave for a given azimuth $\theta$. The beamformer output is then given by $b(\omega, s, \theta) = \mathbf{w}^{H}(\omega)\,\mathbf{C}(\omega)\,\mathbf{w}(\omega)$, where $\mathbf{C}(\omega)$ is the cross-spectral density matrix given by $\mathbf{C}(\omega) = E(\mathbf{v}\mathbf{v}^{H})$, $E$ denotes the expected value, and the superscript $H$ indicates the conjugate (Hermitian) transpose.
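The beamformer described above can be sketched in a few lines for a single frequency. This is a minimal illustration rather than the processing code used in this study. The function names, the (east, north) coordinate layout in kilometers, the azimuth convention (clockwise from north), and the absence of any normalization by the number of stations are all assumptions.

```python
import numpy as np


def steering_vector(omega, slowness, azimuth_deg, coords_km):
    """Plane wave response w = exp(i * omega * s * r.e) for each station.

    coords_km has shape (N, 2) with columns (east, north) relative to the array center.
    """
    az = np.deg2rad(azimuth_deg)
    e = np.array([np.sin(az), np.cos(az)])  # direction cosines, azimuth clockwise from north
    return np.exp(1j * omega * slowness * (coords_km @ e))


def beam_power(csdm, omega, slowness, azimuth_deg, coords_km):
    """Beamformer output b = w^H C w for one frequency, slowness, and azimuth."""
    w = steering_vector(omega, slowness, azimuth_deg, coords_km)
    return float(np.real(w.conj() @ csdm @ w))


def estimate_csdm(spectra):
    """Estimate C = E(v v^H) by averaging outer products over time windows.

    spectra has shape (n_windows, N): one complex spectral value per window and station.
    """
    spectra = np.asarray(spectra)
    return np.einsum("ki,kj->ij", spectra, spectra.conj()) / spectra.shape[0]
```

Scanning `beam_power` over a grid of slowness and azimuth values and locating the maximum gives the apparent speed and direction of the dominant noise energy at that frequency.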

[35] Using a subset of USArray stations (Figure 2a), we first filter the data into a band that covers most microseism energy (0.01–0.2 Hz). We also remove transients using the method described in section 5. We then separate 1 year (2007) of data into 4 h long windows and Fourier transform each window into the frequency domain. We perform beamforming for each 4 h window, so that the beamformer outputs are a function of time as well. We then stack them over the winter (from December to February) and the summer (from June to August), respectively, to show typical seasonal noise source distributions (Figure 2b).

Appendix B

A Comparison of Different C3 Calculation Methods

[36] C3 can be calculated using only the positive lag coda, only the negative lag coda, positive and negative lag coda stacked together, or positive lag and flipped negative lag coda stacked together (mirror stacking). We tested all these methods on the line array data shown in the main text. Figure B1 gives an example showing the resulting attenuation estimates from these methods compared with the estimate from earthquake data. It is apparent that C3 calculated using mirror-stacked coda produces the attenuation estimate that is most consistent with that from the earthquake data. The other methods also yield improvements over the noise CC estimate, but the improvements are not as significant. It is also noteworthy that EGFs from C3 calculated using mirror-stacked coda are the most symmetric. This might imply that the effective noise field for the EGFs is more homogeneous. We obtain results similar to those shown in Figure B1 for all line array examples given in the main text.
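The mirror stacking variant can be illustrated with the following sketch, which forms the coda from the positive lag side plus the time-flipped negative lag side of each noise CC, cross-correlates the codas for the two stations of a pair, and stacks over coda stations. The function names, the coda window parameters, the absence of any normalization or weighting, and the lag sign convention are assumptions made for the example. It is not the exact processing used for Figure B1.

```python
import numpy as np


def mirror_stacked_coda(cc, dt, coda_start, coda_length):
    """Sum the positive lag coda and the time-flipped negative lag coda of a two-sided CC.

    cc must have an odd number of samples with zero lag at the center.
    """
    cc = np.asarray(cc, dtype=float)
    mid = len(cc) // 2
    i0 = int(round(coda_start / dt))
    i1 = i0 + int(round(coda_length / dt))
    if i0 < 1 or i1 > mid:
        raise ValueError("coda window must lie within the available nonzero lags")
    positive = cc[mid + i0:mid + i1]            # lags +i0 ... +(i1 - 1)
    negative_flipped = cc[mid - i0:mid - i1:-1]  # lags -i0 ... -(i1 - 1), ordered by |lag|
    return positive + negative_flipped


def c3(cc_aux_a, cc_aux_b, dt, coda_start, coda_length):
    """Stack, over coda stations, the cross-correlation of the codas for stations A and B.

    cc_aux_a[k] and cc_aux_b[k] are the noise CCs between coda station k and A or B.
    The output has zero lag at its center; a positive lag means the coda for B lags that for A.
    """
    stack = None
    for cc_sa, cc_sb in zip(cc_aux_a, cc_aux_b):
        coda_a = mirror_stacked_coda(cc_sa, dt, coda_start, coda_length)
        coda_b = mirror_stacked_coda(cc_sb, dt, coda_start, coda_length)
        xc = np.correlate(coda_b, coda_a, mode="full")
        stack = xc if stack is None else stack + xc
    return stack
```

Replacing `positive + negative_flipped` with only one of the two terms gives the positive-lag-only or negative-lag-only variants compared in this appendix.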

Figure B1.

Comparison of EGFs and attenuation coefficients extracted from C3 calculated using different coda segments and/or stacking methods, and signals from an earthquake. Data are filtered around 18 s. (a) Map of the western United States showing USArray stations (blue triangles), an earthquake (yellow star, EQ2008/07/29 Mw5.4 33.96N 117.77W), a line array BFS-O16A (red triangles), reference station BFS (black triangle), and coda stations used for C3 calculation (green triangles). (b–e) Comparison of earthquake signals (black) with C3 EGFs constructed using, respectively, positive lag and flipped negative lag CC coda stacked together (mirror stacking, red), positive lag and unflipped negative lag CC coda stacked together (stacking, blue), positive lag CC coda only (magenta), and negative lag CC coda only (green). (f) Comparison of the apparent attenuation coefficient estimated from geometric spreading corrected earthquake amplitudes (black stars) with those estimated from C3 using mirror-stacked CC coda (red circles), stacked CC coda (blue circles), positive lag CC coda only (magenta circles), negative lag CC coda only (green circles), and that estimated from noise CC (orange circles). Straight lines are best fits to the log amplitudes. Estimated attenuation coefficients are listed in the legend.

[37] Because of the complexity of a scattering medium, it may be challenging to establish a rigorous theoretical foundation for our C3 processing method. A full theoretical derivation would require taking into account all scattering orders and is beyond the scope of this study. As a result, we recognize that our method is purely empirical and more tests may be needed when trying to use the method in other parts of the world.

Acknowledgments

[38] Data were from the USArray component of the EarthScope experiment and were obtained through the Incorporated Research Institutions for Seismology. We are grateful for the thorough and constructive reviews by Fan-Chi Lin, an anonymous reviewer, and the Associate Editor. Our work benefited greatly from our discussions with Richard L. Weaver and Xiaodong Song at the University of Illinois at Urbana-Champaign. This work was performed under the auspices of the U.S. Department of Energy by Los Alamos National Laboratory under Contract Number DE-AC52-06NA25396.