Compressive sensing (CS) is a technique for finding sparse signal representations to underdetermined linear measurement equations. We use CS to locate seismic sources during the rupture of the 2011 Tohoku-Oki Mw9.0 earthquake in Japan from teleseismic P waves recorded by an array of stations in the United States. The seismic sources are located by minimizing the ℓ2-norm of the difference between the observed and modeled waveforms penalized by the ℓ1-norm of the seismic source vector. The resulting minimization problem is convex and can be solved efficiently. Our results show clear frequency-dependent rupture modes with high-frequency energy radiation dominant in the down-dip region and low-frequency radiation in the updip region, which may be caused by differences in rupture behavior (more intermittent or continuous) at the slab interface due to heterogeneous frictional properties.
 For this complex earthquake rupture, we seek a sparse set of spatiotemporal source locations consistent with the measurements. Based on records on an array of sensors, Compressive Sensing (CS) [Donoho, 2006; Candès et al., 2006; Malioutov et al., 2005] recovers sparsely distributed source locations. This is based on the assumption that the source signals are linear and spatially sparse. CS has been presented as overcoming the data deluge [Baraniuk, 2011] and has been used in fields as diverse as magnetic resonance imaging (MRI) [Lustig et al., 2007] and computational photography [Duarte et al., 2008].
 Conventional narrowband beamforming is accurate when only a single source is present, but suffers from low resolution for multiple sources or propagation paths. High-resolution methods such as MUSIC [Goldstein and Archuleta, 1991; Meng et al., 2011] retain only a basis for the estimated noise subspace, and estimating that subspace reliably requires many observations. Our CS method works on a single observation in the frequency domain and does not estimate the noise subspace.
 In the time domain, back-projection of teleseismic P waves [Ishii et al., 2005; Walker and Shearer, 2009; Xu et al., 2009] has been used to image the rupture of earthquakes. These back-projection methods usually need to perform time-averaging to obtain reliable estimates of rupture information.
 In this study we use the CS method and teleseismic P-wave data of the 2011 Mw9.0 Tohoku-Oki earthquake from about 500 stations in the U.S. to image the rupture process of the main shock (Figure 1), which reveals apparent frequency-dependent rupture modes of this earthquake.
2. Theory of Compressive Sensing
 Central to CS is the sparse recovery problem, which we cast as follows. Let (θ1,…, θM)T be a vector of M candidate source locations on a suitably chosen grid on Earth's surface and let x(ω) = (x1,…, xM)T be the complex-valued source vector. In the frequency domain, we observe seismic waveforms b(ω) on an array of N stations:
$$\mathbf{b}(\omega) = \mathbf{A}(\omega)\,\mathbf{x}(\omega) + \mathbf{n}(\omega) \qquad (1)$$
Here, the nmth element of A(ω) is $A_{nm}(\omega) = e^{-i\omega\tau_{nm}}/\sqrt{N}$, where τnm is the traveltime from location θm to station n. The number of candidate source locations M is larger than the number of stations N, i.e., N < M. The additive noise n(ω) is zero-mean circularly symmetric Gaussian with cross-spectral density matrix σ²I. There are therefore infinitely many source vectors x(ω) that result in the same observed waveform b(ω). To recover a physically meaningful source vector, we use our knowledge of its sparsity: we require that x(ω) has only K nonzero entries with K ≪ M. The recovery of such a sparse x(ω) could, e.g., be formulated as
$$\min_{\mathbf{x}} \|\mathbf{x}\|_0 \quad \text{subject to} \quad \|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2 \le \varepsilon \qquad (2)$$
where ∥x∥0 counts the number of nonzero entries in x. The additive Gaussian noise motivates an ℓ2-norm interpretation of the constraint with ε being the noise floor. Unfortunately, this problem is non-convex, hard to solve, and unstable in the presence of noise [Baraniuk, 2007].
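The measurement model behind the sparse recovery problem can be sketched numerically. The snippet below is a minimal illustration, not the authors' code: the traveltimes are random placeholders, the function name `steering_matrix` is hypothetical, and both the sign of the phase and the 1/√N column scaling are assumptions chosen so that every column of A has unit ℓ2-norm.

```python
import numpy as np

def steering_matrix(tau, omega):
    """Return the N x M matrix of phase delays with unit-norm columns.

    tau   : (N, M) traveltimes in s from candidate location m to station n
    omega : angular frequency in rad/s
    The exponent sign and the 1/sqrt(N) scaling are assumptions of this sketch.
    """
    n_sta = tau.shape[0]
    return np.exp(-1j * omega * tau) / np.sqrt(n_sta)

rng = np.random.default_rng(0)
n_sta, n_src = 40, 200                                  # N < M: underdetermined
tau = rng.uniform(600.0, 700.0, size=(n_sta, n_src))    # hypothetical traveltimes
omega = 2 * np.pi * 0.23                                # 0.23 Hz

A = steering_matrix(tau, omega)

# Sparse source vector: K = 2 nonzero complex amplitudes out of M entries.
x = np.zeros(n_src, dtype=complex)
x[[50, 120]] = [1.0 + 0.5j, -0.7j]

# Zero-mean circularly symmetric complex Gaussian noise with variance sigma^2.
sigma = 0.01
noise = sigma * (rng.standard_normal(n_sta)
                 + 1j * rng.standard_normal(n_sta)) / np.sqrt(2)
b = A @ x + noise

print(A.shape, np.count_nonzero(x))
```

Because N < M, many different vectors x reproduce the same b, which is why a sparsity constraint is needed to single out one of them.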
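 Practical recovery algorithms rely on the restricted isometry property (RIP). The conditions for RIP are satisfied due to the pseudo-randomly varying phases [Donoho, 2006]. Following Malioutov et al. [2005], we convexify (2) to
$$\min_{\mathbf{x}} \|\mathbf{x}\|_1 \quad \text{subject to} \quad \|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2 \le \varepsilon \qquad (3)$$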
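Finally, we introduce the Lagrange multiplier λ⁻¹ > 0 and arrive at the following second-order cone problem
$$\min_{\mathbf{x}} \|\mathbf{A}\mathbf{x} - \mathbf{b}\|_2 + \lambda\,\|\mathbf{x}\|_1 \qquad (4)$$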
This problem is efficiently solvable with interior-point solvers [Boyd and Vandenberghe, 2004] and we use the convex optimization package CVX (M. Grant and S. P. Boyd, CVX: Matlab Software for Disciplined Convex Programming, version 1.21, accessed 25 Jan 2011, available at http://cvxr.com/cvx).
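For readers without CVX, the effect of an ℓ1-penalized fit can be illustrated with a few lines of NumPy. The sketch below swaps in a different technique from the one used here: it runs the iterative soft-thresholding algorithm (ISTA) on the closely related squared-misfit problem min ½∥Ax − b∥2² + μ∥x∥1, not an interior-point solve of the second-order cone problem. The weight `mu` is therefore not the same quantity as λ, and the traveltimes, function names, and problem sizes are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Complex soft-thresholding: shrink magnitudes by t, keep phases."""
    mag = np.abs(z)
    scale = np.maximum(1.0 - t / np.maximum(mag, 1e-15), 0.0)
    return scale * z

def ista_lasso(A, b, mu, n_iter=500):
    """Minimize 0.5*||A x - b||_2^2 + mu*||x||_1 by proximal gradient steps."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ x - b)       # gradient of the squared misfit
        x = soft_threshold(x - step * grad, step * mu)
    return x

rng = np.random.default_rng(1)
n_sta, n_src = 40, 200
tau = rng.uniform(600.0, 700.0, size=(n_sta, n_src))    # hypothetical traveltimes
A = np.exp(-1j * 2 * np.pi * 0.23 * tau) / np.sqrt(n_sta)

true_idx = 77
x_true = np.zeros(n_src, dtype=complex)
x_true[true_idx] = np.exp(1j * 0.8)           # unit amplitude, arbitrary phase
b = A @ x_true                                # noise-free single source

x_hat = ista_lasso(A, b, mu=0.1)
print(int(np.argmax(np.abs(x_hat))), true_idx)
```

The ℓ1 penalty shrinks the recovered amplitude slightly below its true value; the quantity of interest in this sketch is the location of the largest entry.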
 To obtain meaningful solutions to equation (4), a proper Lagrange multiplier λ is important. Using equation (1) with just one source and no noise gives ∥b∥2 = ∥Ax∥2 = ∥x∥1. We define the residual r = ∥b − Ax∥2/∥b∥2, which depends on the level of random noise and other coherent but unmodeled signals. These expressions indicate λ ∼ r. This balances the ratio between data misfit (∥b − Ax∥2) and model constraint (∥x∥1) in the misfit function (equation (4)). Using the array in Figure 1a, Figure 2 shows synthetic tests of the performance for different λ values at f = 0.23 Hz with a randomly generated source amplitude and phase. We add 10% random noise (r = 0.1) to the synthetic spectrum data. We use λ = r as it is around the “knee” in the L-curve in Figure 2h between the data misfit and model constraints.
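The scaling argument for λ can be written out step by step. The lines below are one reading of that argument, assuming that the columns a_m of A have unit ℓ2-norm and that the misfit term in equation (4) is the unsquared ℓ2-norm:

```latex
% Single source at grid point m, unit-norm column a_m, no noise:
\|\mathbf{b}\|_2 = \|\mathbf{a}_m x_m\|_2 = |x_m| = \|\mathbf{x}\|_1 .
% With noise, the relative residual r = \|b - Ax\|_2 / \|b\|_2 gives
\|\mathbf{b} - \mathbf{A}\mathbf{x}\|_2 = r\,\|\mathbf{b}\|_2 \approx r\,\|\mathbf{x}\|_1 .
% The misfit and penalty terms are of comparable size when
\|\mathbf{b} - \mathbf{A}\mathbf{x}\|_2 \approx \lambda\,\|\mathbf{x}\|_1
\quad\Longrightarrow\quad \lambda \approx r .
```

For example, 10% noise (r = 0.1) suggests λ ≈ 0.1 under these assumptions.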
 Conventional beamforming is first used to image the source location. For one source (same source spectrum as in Figure 2a), the beamformer peak corresponds to the input source location (Figure 2e), but with poor resolution. However, CS exactly picks the true source location with super resolution if proper damping is used (Figure 2c). For the synthetic model with two sources (Figure 2f; 10% noise), the beamforming output (Figure 2g) cannot get the correct locations due to signal interference. However, CS (Figure 2f) finds the locations exactly with super lateral resolution.
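The conventional beamformer used for comparison can be sketched as the power |a_m^H b|² evaluated at every grid point. This is a minimal illustration with hypothetical random traveltimes and an assumed unit-norm steering matrix, so it shows the leakage of power onto other grid points, not the geographic main lobe seen with a real array.

```python
import numpy as np

def beamform_power(A, b):
    """Conventional (Bartlett) beamformer power |a_m^H b|^2 for every grid point."""
    return np.abs(A.conj().T @ b) ** 2

rng = np.random.default_rng(2)
n_sta, n_src = 40, 200
tau = rng.uniform(600.0, 700.0, size=(n_sta, n_src))    # hypothetical traveltimes
A = np.exp(-1j * 2 * np.pi * 0.23 * tau) / np.sqrt(n_sta)

true_idx = 33
b = A[:, true_idx] * (0.6 - 0.8j)             # single source with unit amplitude

power = beamform_power(A, b)
peak = int(np.argmax(power))
n_leak = int(np.sum(power > 0.1 * power.max()))   # grid points above 10% of the peak
print(peak, n_leak)
```

The peak falls on the true grid point, but other grid points also carry nonzero power. With several sources these contributions interfere, which is the situation where a sparse solution helps.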
3. Data and Analysis
 We apply CS to image the March 11 Mw9.0 Tohoku-Oki earthquake using the first 200 s of the teleseismic P waves recorded by over 500 stations in the U.S. Waveform data (10 Hz sampling) are first 0.05–4 Hz bandpass filtered and each trace is normalized by its peak amplitude. The waveforms are aligned after correction for the predicted P-wave traveltime from the hypocenter (38.19°N, 142.68°E, 23 km depth [Chu et al., 2011]) to each station using the IASP91 1-D model. To suppress effects of 3-D heterogeneity on the traveltimes, we realign the waveforms using multichannel cross-correlation and clustering analysis [e.g., Ishii et al., 2005] for the first 8 s of the P waves (Figure 1b). We use N = 476 stations (Figure 1a) with good data quality as shown in Figure 1c. Other waveforms (e.g., pP and pwP [Chu et al., 2011]) are present in the first 180 s and these could cause some biases in our source location.
 For CS source locations, we use a sliding time window. We estimate the source locations in two frequency bands: 0.05–0.2 and 0.2–1 Hz. The sliding window has length 20 s for lower frequencies (0.05–0.2 Hz) and 10 s for higher frequencies (0.2–1 Hz). This window is moved in steps of 2 s for the aligned waveforms. Since we directly use the aligned waveforms with respect to the hypocenter, the coefficient matrix A becomes
$$A_{nm}(\omega) = \frac{1}{\sqrt{N}}\, e^{-i\omega\,\Delta\tau_{nm}} \qquad (5)$$
where Δτnm = tnm − tn0 is the difference between the predicted traveltimes from station n to location m (tnm) and to the hypocenter (tn0). We only invert for sources on the plane at the hypocenter depth as the depth resolution is very poor for teleseismic P waves [e.g., Xu et al., 2009]. The source locations are spaced every 10 km in a grid with total number of source points M = 41 × 41 = 1681. Therefore, for each frequency of each time window, A is a 476 × 1681 (N × M) complex-valued matrix. Since the choice of damping λ depends on the noise (both random and coherent) level and affects the inversion results (see section 2.2), λ = 0.25 is chosen upon consideration of a possible large coherent noise level (r = 0.25, corresponding to 25% noise), which gives reasonable data fitting and solutions for a sparse source distribution.
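The coefficient matrix for hypocenter-aligned waveforms can be sketched as follows. In this sketch A is stored with one row per station and one column per grid point, so that A x has one entry per station. The traveltimes are random placeholders (a real application would use a 1-D model such as IASP91), the function name is hypothetical, placing the hypocenter at the grid center is an assumption, and so are the phase sign and 1/√N scaling.

```python
import numpy as np

n_side = 41                        # 41 x 41 grid with 10 km spacing
n_sta = 476                        # number of stations N
n_src = n_side * n_side            # number of grid points M = 1681

rng = np.random.default_rng(3)
# Placeholder traveltimes t_nm in s (station n, grid point m).
t_nm = rng.uniform(600.0, 720.0, size=(n_sta, n_src))
hypo_idx = (n_side // 2) * n_side + n_side // 2   # assumed: hypocenter at grid center
t_n0 = t_nm[:, hypo_idx]

def relative_steering_matrix(t_nm, t_n0, freq_hz):
    """Phase delays relative to the hypocenter, scaled to unit-norm columns."""
    dtau = t_nm - t_n0[:, None]               # traveltime difference per station and grid point
    return np.exp(-2j * np.pi * freq_hz * dtau) / np.sqrt(t_nm.shape[0])

A = relative_steering_matrix(t_nm, t_n0, freq_hz=0.39)
print(A.shape)
```

The column belonging to the hypocenter has zero relative delay, so all of its entries equal 1/√N. A source at the hypocenter therefore appears as a signal that is in phase across the aligned array.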
 For each time snapshot and each frequency band, we first taper the windowed waveforms and then obtain the complex-valued spectrum data b(ω) for each discrete frequency using a Fourier transform. The data at each frequency and snapshot are then used to invert for the source distribution. By combining inversion results from all the snapshots, we obtain the spatial and temporal distribution of sources during the earthquake.
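The windowing, tapering, and Fourier transform steps can be sketched as below for the 0.2–1 Hz band (10 s window, 2 s step, 10 Hz sampling). The Hann taper is an assumption, since the taper type is not specified, and the traces are random placeholders standing in for aligned P waveforms. The function name `window_spectra` is hypothetical.

```python
import numpy as np

def window_spectra(traces, dt, t_start, win_len, f_lo, f_hi):
    """Taper one time window of aligned traces and return its in-band spectrum.

    traces : (N, T) array, one row per station
    Returns (freqs, spec) with spec of shape (n_freq, N): one data vector per frequency.
    """
    i0 = int(round(t_start / dt))
    n_win = int(round(win_len / dt))
    segment = traces[:, i0:i0 + n_win]
    taper = np.hanning(n_win)                 # taper choice is an assumption
    spec = np.fft.rfft(segment * taper, axis=1)
    freqs = np.fft.rfftfreq(n_win, d=dt)
    keep = (freqs >= f_lo - 1e-9) & (freqs <= f_hi + 1e-9)
    return freqs[keep], spec[:, keep].T

dt = 0.1                                      # 10 Hz sampling
rng = np.random.default_rng(4)
traces = rng.standard_normal((5, 2000))       # 5 hypothetical stations, 200 s each
starts = np.arange(0.0, 200.0 - 10.0 + 1e-9, 2.0)   # window start times in s

freqs, spec = window_spectra(traces, dt, starts[3], 10.0, 0.2, 1.0)
print(len(starts), freqs.shape, spec.shape)
```

A 10 s window gives a frequency spacing of 0.1 Hz, so the 0.2–1 Hz band contains nine discrete frequencies. Each row of `spec` is one data vector to be inverted for the source distribution.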
4. Results and Discussion
 In Figures 3a–3c we show the source amplitude ∣x∣ at 0.39 Hz from CS at three representative time snapshots. It is likely that source locations vary with frequency. To obtain the average source location of a snapshot in a frequency band with discrete frequencies ωk (k = 1,…, K), we smooth the single-frequency source power (∣x(ωk, t)∣2) in that snapshot using a 2-D Gaussian function to obtain the snapshot source power:
$$P_i(t) = C \sum_{k=1}^{K} \sum_{j=1}^{M} \left| x_j(\omega_k, t) \right|^2 \exp\!\left(-\frac{d_{ij}^2}{R^2}\right) \qquad (6)$$
where t gives the time window, dij the distance between locations i and j, R the smoothing distance (here 50 km), and C a normalization constant. The snapshot source power in the 0.2–0.5 Hz band is shown as Figures 3d–3f. In some cases, two distinct sources may appear in the same snapshot (e.g., Figure 3e). From the snapshot power map we pick the first and the second largest local peaks (crosses in Figures 3d–3f) as representative sources in that snapshot.
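The smoothing and peak-picking steps can be sketched as follows. This is an illustration under stated assumptions: the Gaussian kernel is taken as exp(−d²/R²), the normalization constant scales the map to a maximum of 1, and local peaks are defined on an 8-point neighborhood. None of these choices is confirmed by the text, and the function names are hypothetical.

```python
import numpy as np

def snapshot_power(x_band, coords_km, R_km=50.0):
    """Gaussian-smoothed source power on the grid for one snapshot.

    x_band    : (K, M) complex source vectors, one row per frequency in the band
    coords_km : (M, 2) grid coordinates in km
    """
    raw = np.sum(np.abs(x_band) ** 2, axis=0)            # power summed over frequencies
    diff = coords_km[:, None, :] - coords_km[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)                      # squared distances d_ij^2
    power = np.exp(-d2 / R_km ** 2) @ raw                # assumed kernel exp(-d^2/R^2)
    return power / power.max()                           # assumed normalization

def local_peaks(power_2d, n_peaks=2):
    """(row, col) of the largest strict local maxima on an 8-point neighborhood."""
    padded = np.pad(power_2d, 1, constant_values=-np.inf)
    core = padded[1:-1, 1:-1]
    is_peak = np.ones(core.shape, dtype=bool)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di or dj:
                nb = padded[1 + di:padded.shape[0] - 1 + di,
                            1 + dj:padded.shape[1] - 1 + dj]
                is_peak &= core > nb
    rows, cols = np.nonzero(is_peak)
    order = np.argsort(power_2d[rows, cols])[::-1][:n_peaks]
    return list(zip(rows[order].tolist(), cols[order].tolist()))

n_side = 21                                              # small grid, 10 km spacing
gx, gy = np.meshgrid(np.arange(n_side) * 10.0,
                     np.arange(n_side) * 10.0, indexing="ij")
coords = np.column_stack([gx.ravel(), gy.ravel()])

x_band = np.zeros((2, n_side * n_side), dtype=complex)   # two frequencies
x_band[:, 5 * n_side + 8] = [1.0, 1.0j]                  # stronger source at (5, 8)
x_band[:, 15 * n_side + 12] = [0.8, -0.8]                # weaker source at (15, 12)

P = snapshot_power(x_band, coords)
peaks = local_peaks(P.reshape(n_side, n_side))
print(peaks)
```

With two sources separated by about 108 km and a smoothing distance of 50 km, the smoothed map stays bimodal and the two largest local peaks coincide with the input locations.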
 The total source power at each grid location i in a frequency band is obtained by summing the snapshot source power over all snapshots in that band. Figures 4a–4d show the total source power in four frequency bands. For the higher frequency bands (0.5–1 and 0.2–0.5 Hz), the dominant source power is located around the hypocenter region (Figures 4a and 4b). In the lower frequency bands (0.1–0.2 and 0.05–0.1 Hz), the dominant source power shifts to northeast of the hypocenter, closer to the trench.
 A potential source at the north/south limit of the grid will have about a 10 s faster/slower traveltime than from the hypocenter. Corrections for this traveltime difference must be applied to obtain correct source times. For each time window t ∈ [tb, te], with tb and te the start and end times of the window, the source time for source location m is approximated by tS = (tb + te)/2 + median(tn0 − tnm), where the median acts on the station index n ∈ [1, 2,…, N]. Results without this time correction provide inaccurate source times and rupture speeds.
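The source-time correction amounts to a one-line computation, sketched here with made-up traveltimes for four stations and a hypothetical function name `source_time`.

```python
import numpy as np

def source_time(t_b, t_e, t_n0, t_nm_col):
    """Window midpoint plus the median over stations of (t_n0 - t_nm)."""
    return 0.5 * (t_b + t_e) + np.median(t_n0 - t_nm_col)

# Hypothetical traveltimes in s for four stations.
t_n0 = np.array([700.0, 705.0, 712.0, 698.0])            # from the hypocenter
t_nm = t_n0 + np.array([9.0, 10.0, 10.0, 11.0])          # grid point about 10 s slower

t_s = source_time(40.0, 50.0, t_n0, t_nm)
print(t_s)   # prints 35.0
```

A grid point whose waves take about 10 s longer to reach the array than those from the hypocenter must have radiated about 10 s before the window midpoint, here at 35 s instead of 45 s.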
 The spatial and temporal distribution of sources after the time correction is shown in Figures 4e–4h for four frequency bands. Note the apparent migration of sources from the down-dip region (close to the coast) in the higher frequency bands to the offshore region (close to the trench) in the lower frequency bands. The spatial distribution of high-frequency energy radiation during the main shock (Figures 4e and 4f) is generally similar to the aftershock distribution (open circles in Figure 4i) within the two days after the main shock. In the high-frequency bands (0.5–1 and 0.2–0.5 Hz), the region close to the hypocenter may rupture multiple times, as inferred from the clustered distribution of sources with different source times (mostly less than 100 s) in the hypocenter area. Clear northward rupture is seen in the 0.2–0.5 Hz band (Figure 4f) around 100 s. Significant southward rupture and energy radiation starts at 100 s or later (Figures 4e–4g).
 The west/northwest rupture is obvious at both high and low frequencies (Figures 4e–4h). In particular we observe significant energy radiation around 100–110 s near the coast region (Figure 4h). In the lower frequency bands (Figures 4g and 4h) northeastward rupture is apparent and energy radiation in the region between the hypocenter and trench mostly occurred before 80 s. Even around 170 s or later, energy radiation is observed closer to the trench (Figures 4f–4h), but it is not clear whether these sources are from the main shock or early aftershocks. One striking feature is that there is almost no energy radiation in the southwestern part of the rupture area (close to the coast) in the lowest frequency band (0.05–0.1 Hz) (Figure 4h). Most sources occurred around the southern part of the trench after about 140 s in the 0.05–0.1 Hz band.
 Our source power and distribution in Figure 4 from CS in the high-frequency bands share similar features with frequency-domain MUSIC imaging [Meng et al., 2011] and time-domain back-projection [Ishii, 2011; Koper et al., 2011; Wang and Mori, 2011; Zhang et al., 2011] (Figures 4j and 4k), which reveal dominant seismic energy radiation in the down-dip region (or around the hypocenter). However, low-frequency slip inversion results from seismic data and geodetic data [Chu et al., 2011; Ide et al., 2011; Koper et al., 2011; Simons et al., 2011] reveal large slip patches dominantly close to the trench (Figure 4l). Our low-frequency results (Figures 4g and 4h) confirm this. For high-frequency energy source distributions (Figures 4j and 4k), the difference between our results and the others may be due to differences in the stations used, the frequency content of the data, the hypocenter location, the data bandwidth for initial alignment of the P waves, time- versus frequency-domain analysis, etc. Since our results are relative to the hypocenter, as in back-projection methods, any change in the hypocenter location (e.g., from USGS [e.g., Meng et al., 2011; Koper et al., 2011]) will systematically shift the absolute location of the sources.
 Since high-frequency seismic radiation is physically different from finite fault slip, and the former usually occurs during sudden changes of rupture speed, it is not straightforward to compare high-frequency rupture images with finite slip inversions. However, our method can simultaneously obtain energy radiation in both the high- and low-frequency bands, which provides a unique way to investigate frequency-dependent rupture modes of great earthquakes. The frequency-dependent spatial and temporal distribution of seismic energy radiation during the Tohoku-Oki main shock (Figure 4) reveals complicated rupture modes of this megathrust earthquake. These features may be caused by depth-varying frictional properties at the slab interface in this region [Koper et al., 2011], where the updip region (close to the trench) ruptures more smoothly and continuously to generate dominantly low-frequency seismic energy radiation. However, the down-dip region, in particular the patch southwest of the hypocenter that lacks low frequency radiation, ruptures more intermittently due to sudden changes of frictional properties in a relatively small area, for instance, at the transition between the brittle and ductile regions [Simons et al., 2011]. Future numerical modeling using heterogeneous frictional properties should help in understanding more about the rupture physics of this earthquake.
 We thank T. Lay and an anonymous reviewer for their constructive comments and Editor M. Wysession for his assistance. We appreciate S. Ide, L. Meng, Z. Wang, S. Wei, and H. Zhang for providing their slip inversion or back-projection results for our comparison. This work was supported by a Green Scholarship at IGPP/SIO, UCSD to YH, NSF grants EAR-0710881, EAR-0944109 and OCE-1030022, and grant ICT08-44 of Wiener Wissenschafts-, Forschungs- und Technologiefonds.
 The Editor thanks Thorne Lay and an anonymous reviewer for their assistance in evaluating this paper.