We derive a finite slip model for the 2013 Mw 8.3 Sea of Okhotsk Earthquake (Z = 610 km) by inverting calibrated teleseismic P waveforms. The inversion shows that the earthquake ruptured on a 10° dipping rectangular fault zone (140 km × 50 km) and evolved into a sequence of four large sub-events (E1–E4) with an average rupture speed of 4.0 km/s. The rupture process can be divided into two main stages. The first propagated south, rupturing sub-events E1, E2, and E4. The second stage (E3) originated near E2 with a delay of 12 s and ruptured northward, filling the slip gap between E1 and E2. This kinematic process produces an overall slip pattern similar to that observed in shallow swarms, except it occurs over a compressed time span of about 30 s and without many aftershocks, suggesting that sub-event triggering for deep events is significantly more efficient than for shallow events.
Deep earthquakes within subducted slabs of oceanic lithosphere provide key information about the thermal, thermodynamic, and mechanical properties inside slabs [Kirby et al., 1996]. While tomography studies indicate a variety of structures [Deal and Nolet, 1999], recent waveform modeling indicates that the subducted Kuril slab has high-velocity anomalies of over 5% [Zhan and Helmberger, 2012], with geometry similar to that presented in Figure 1a. Thus, this cold, heavy structure has a strong stress concentration above the transition into the higher-viscosity lower mantle [Tao and O'Connell, 1993]. Apparently, many slabs deform at this depth when they encounter this resistance, with compressional stresses parallel to the dip of the seismicity zone [Isacks and Molnar, 1971]. However, it is not clear whether these events result from dehydration embrittlement-induced failure [Omori et al., 2004] or some combination of transformation triggering caused by olivine phase changes [Kirby et al., 1991]. The 24 May 2013 Sea of Okhotsk earthquake, the largest deep event in the instrumental record to date (Figure 1c), provides an unprecedented opportunity for detailed examination of the rupture process and mechanisms of deep-faulting earthquakes. This event occurred beneath the Sea of Okhotsk at a depth of 610 km with a normal-faulting moment tensor mechanism on a shallow-dipping fault (after U.S. Geological Survey and Global Centroid Moment Tensor solutions).
While shallow earthquakes have well-developed aftershock activity that can be used to estimate source dimensions and geometries, deep earthquakes are generally devoid of such activity [Wiens and McGuire, 1995]. Additionally, geodetic data are not available due to the great depth of rupture. Thus, we need to rely completely on seismic waveform modeling both to constrain the faulting dimensions and to investigate the spatial and temporal evolution of the rupture. Previous studies utilizing this approach include analysis of the 1994 Mw 8.3 Great Bolivian event (Z = 637 km), where rupture dimensions were estimated to be about 40 km × 40 km [Beck et al., 1995; Kikuchi and Kanamori, 1994; Silver et al., 1995]. Teleseismic observations of circum-Pacific earthquakes have improved significantly in recent years with the addition of Global Seismographic Network (GSN) stations throughout the Pacific region, and the Sea of Okhotsk event was well recorded over a broad range of distance and azimuth (Figure 1b). Even so, the travel paths involve 3-D complexity, especially to island locations in the Pacific. The effects of these path complications are illustrated in the inset of Figure 1c, where the vertical P wave velocity records at station WAKE are displayed for the Mw 8.3 main shock and an Mw 6.7 aftershock. Note the strong coda for the aftershock, which has a source duration of about 1 s (Figure S1 in the supporting information).
2 Path Calibration and New Source Time Function
To account for the path complications in the source inversion, we applied empirical path corrections to the main shock observations using the Mw 6.7 aftershock records. The calibration is obtained by convolving the main shock data with the 1-D synthetic of the aftershock and then deconvolving with the data of the aftershock (see supporting information for details). This process effectively removes the 3-D path effects from the main shock records in the frequency band below 1 Hz (Figure 2). Inverting data at frequencies higher than 1 Hz could potentially provide more details of the rupture process but is very challenging due to the high variability of P wave amplitude [Ni et al., 2010] and the 3-D structure in the source region (Figure S1).
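The calibration described above (convolve with the aftershock's 1-D synthetic, deconvolve with the aftershock data) can be sketched in the frequency domain. This is a minimal illustration rather than the procedure given in the supporting information: the function name `calibrate_mainshock`, the water-level regularization, and the toy signals are assumptions introduced here.

```python
import numpy as np

def calibrate_mainshock(main_data, aft_data, aft_syn, water_level=0.01):
    """Spectral form of main_data * aft_syn / aft_data.

    The empirical path response carried by the aftershock record is divided
    out of the main shock record and replaced by the 1-D synthetic response.
    The water level (an assumed regularization) bounds the division.
    """
    n = 2 * max(len(main_data), len(aft_data), len(aft_syn))  # zero-pad against wrap-around
    M = np.fft.rfft(main_data, n)
    A = np.fft.rfft(aft_data, n)
    S = np.fft.rfft(aft_syn, n)
    power = np.abs(A) ** 2
    # Clip small spectral power so near-zero notches do not blow up.
    denom = np.maximum(power, water_level * power.max())
    calibrated = np.fft.irfft(M * S * np.conj(A) / denom, n)
    return calibrated[: len(main_data)]

# Toy check with made-up signals: a path operator with coda,
# an impulsive aftershock, and a boxcar main shock source.
npts = 64
path = np.zeros(npts)
path[0], path[3], path[8] = 1.0, 0.5, -0.2
source = np.zeros(npts)
source[5:15] = 1.0
aft_syn = np.zeros(npts)
aft_syn[0] = 1.0                               # 1-D synthetic: direct arrival only
aft_data = path.copy()                         # observed aftershock carries the coda
main_data = np.convolve(source, path)[:npts]   # observed main shock carries it too
recovered = calibrate_mainshock(main_data, aft_data, aft_syn)
```

In this toy case the aftershock is impulsive, so `recovered` is the main shock source without the coda. With real records the aftershock has a finite duration (about 1 s here), which is one reason the correction is limited to frequencies below about 1 Hz.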
 Traditionally, longer-period displacement P and SH waves are used for resolving the finite rupture process of great earthquakes [Beck et al., 1995; Kikuchi and Kanamori, 1994; Silver et al., 1995], and most source inversions assume geometric ray path Green's functions or Green's functions from 1-D Earth models. While the recordings from many stations appear to be adequately modeled by this assumption, many are not, especially when inversions are conducted with velocity waveforms at frequencies approaching 1 Hz. One of the difficulties lies in the deficiency of higher-frequency energy in the cosine-like source time function (Figure 2a) that is often used in the inversion of longer-period waveforms. To overcome this issue, we adopt a Kostrov-like source time function (Figure 2a), which has proven quite efficient in modeling higher-frequency strong motion amplitudes [Graves and Pitarka, 2010; Wei et al., 2012]. For the finite fault inversion, we follow a wavelet-based inversion scheme [Ji et al., 2002] with the addition of the updated source time function (see supporting information for details of inversion setup).
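To illustrate why the shape of the source time function matters at higher frequencies, the sketch below compares a cosine pulse with a simplified Kostrov-like pulse of equal duration. The Kostrov-like shape used here (fast linear rise, 1/sqrt(t) decay, cosine taper) is a hypothetical stand-in chosen for illustration. It is not the parameterization of Graves and Pitarka [2010] or the one used in the inversion, and all function names and parameter values are assumptions.

```python
import numpy as np

def cosine_stf(t, duration):
    """Symmetric raised-cosine pulse, normalized to unit area."""
    s = np.where(t <= duration, 1.0 - np.cos(2.0 * np.pi * t / duration), 0.0)
    return s / (s.sum() * (t[1] - t[0]))

def kostrov_like_stf(t, duration, rise_fraction=0.1):
    """Hypothetical Kostrov-like pulse, normalized to unit area.

    Linear rise to a peak at rise_fraction * duration, then 1/sqrt(t)
    decay multiplied by a cosine taper that reaches zero at `duration`.
    """
    t_peak = rise_fraction * duration
    rise = t / t_peak
    decay = np.sqrt(t_peak / np.maximum(t, t_peak))
    x = np.clip((t - t_peak) / (duration - t_peak), 0.0, 1.0)
    taper = 0.5 * (1.0 + np.cos(np.pi * x))
    s = np.where(t < t_peak, rise, decay * taper)
    return s / (s.sum() * (t[1] - t[0]))

def amplitude_at(stf, dt, freq, nfft=4096):
    """Amplitude spectrum of `stf` at the FFT bin nearest `freq` (Hz)."""
    spec = np.abs(np.fft.rfft(stf, nfft)) * dt
    freqs = np.fft.rfftfreq(nfft, dt)
    return spec[np.argmin(np.abs(freqs - freq))]

dt, duration = 0.01, 2.0                     # assumed sampling and pulse length (s)
t = np.arange(0.0, duration + dt / 2, dt)
cos_pulse = cosine_stf(t, duration)
kos_pulse = kostrov_like_stf(t, duration)
ratio = amplitude_at(kos_pulse, dt, 1.25) / amplitude_at(cos_pulse, dt, 1.25)
```

Here `ratio` compares the spectral amplitudes of the two pulses at 1.25 Hz. A value above 1 means the asymmetric pulse carries more high-frequency energy than the cosine pulse of the same duration and area.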
The new source time function and the calibrated data allow better resolution of the shorter-period features of the velocity waveforms. An example is displayed in Figure 2b for a set of oceanic stations marked by the circle in Figure 1b. This small cluster of stations (WAKE, KWAJ, and TARA) lies at nearly the same azimuth from the earthquake and ranges in distance from 36° to 55°. The result of using a cosine source time function inversion on the original data (Figure 2b, left) is compared with that of the new source time function inversion on the calibrated data (Figure 2b, right), and the contribution of each approach is shown in Figure 2c. The original data have more complicated coda, and the peak amplitudes are affected by station-specific path effects such as focusing and defocusing; e.g., KWAJ (47°/451.6 µm/s) has a larger peak amplitude than the much closer station WAKE (36°/339.2 µm/s). The same features are observed in the aftershock records and are thus not likely related to main shock source complexity. As displayed in Figure 2b, the calibration procedure accounts not only for the strong coda but also for the anomalous amplitudes, allowing the corrected data to behave more consistently with the response of the 1-D velocity model we use. Also note that the shorter-period content in the synthetics is increased by the use of the new source time function. Apparently, the combination of path calibration and the new source time function greatly improves the waveform fitting, whereas either approach alone handles only a portion of the misfit.
3 Inversion Results
The rupture process for this earthquake is quite complex (Figure 1c), and the process of obtaining our preferred rupture model required extensive testing to understand the inherent trade-offs among parameters such as slip amplitude, rise time, rupture velocity, and rupture direction. In addition, we performed our initial analysis using displacement waveforms, which allows identification of the longer length-scale features of the rupture process. Once the general aspects of the rupture process were determined, we then used the velocity waveforms in our final inversions to better resolve the shorter length-scale rupture features. The details of this inversion process are discussed in the supporting information (Figures S2–S16). We use the fault plane with dip of 10° and strike of 177° after the National Earthquake Information Center W phase mechanism. The conjugate fault plane with strike of 15° and dip of 80° is easily rejected because slip on this fault plane cannot fit the strong directivity effect observed along the azimuth of 165°, which is 30° off the strike of this steeply dipping fault (Figure S2). Our results indicate that the primary rupture stage (S1) was unilateral from north to south followed by a secondary rupture stage (S2) initiating at a different hypocenter and propagating in the reverse direction across the fault plane (Figure 1). Moreover, the slip distribution displays a series of relatively strong patches, appearing as discrete sub-events. Viewing the synthetics and data in a standard directivity plot is useful to better understand the relative timing and location of the sub-events (Figure 3). See Figure 4 for corresponding sub-events in our preferred model. Here the rupture directivity parameter is defined as $\Gamma_i^n = \cos(\theta_i - \theta_n)/c_i$ [Ammon et al., 2005; Silver et al., 1995], where $\theta_i$ is the azimuth of the ith station, $\theta_n$ is the rupture direction of the nth sub-event, and $c_i$ is the phase velocity of P arrivals. The arrival time of the nth sub-event at the ith station can be expressed as $T_i^n = T_n - L_n \Gamma_i^n$, where $T_n$ is the sub-event origin time and $L_n$ is the distance of the sub-event relative to the hypocenter. Displacement and velocity waveforms are arranged in this format (Figure 3) for rupture directivity along an azimuth of 165°, which describes the line connecting the hypocenter with the most distant asperity (Figure 1). From the slopes of the aligned arrivals (dashed lines), the distance from the hypocenter to different sub-events can be estimated. Note that E1 and E3 have similar slopes, which means they are located nearby, while E2 and E4 are about 50 and 90 km to the south, respectively. In addition, as shown in both data sets, we observe that sub-event E3 ruptures in the opposite direction, interfering with E4. To model E3, we added a secondary reversed rupture stage (S2), with the initiation point (secondary hypocenter) located on the largest asperity (E2) and delayed 12 s after the main shock origin time.
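The directivity relation can be made concrete with a short sketch. It assumes the standard form in which the directivity parameter is cos(station azimuth minus rupture azimuth) divided by the P phase velocity, and the sub-event arrival time is its origin time minus distance times that parameter. The phase velocity (15 km/s), the origin time (24 s), and the station azimuths are hypothetical values chosen for illustration. Only the 165° rupture azimuth and the roughly 90 km offset of E4 come from the text.

```python
import numpy as np

def directivity_parameter(station_az_deg, rupture_az_deg, phase_velocity_km_s):
    """cos(station azimuth - rupture azimuth) / phase velocity, in s/km."""
    return np.cos(np.radians(station_az_deg - rupture_az_deg)) / phase_velocity_km_s

def subevent_arrival(origin_time_s, distance_km, station_az_deg,
                     rupture_az_deg, phase_velocity_km_s):
    """Sub-event arrival time relative to the hypocentral P arrival."""
    gamma = directivity_parameter(station_az_deg, rupture_az_deg, phase_velocity_km_s)
    return origin_time_s - distance_km * gamma

# Hypothetical sub-event: 90 km from the hypocenter along azimuth 165 deg,
# assumed origin time 24 s, assumed phase velocity 15 km/s.
t_along = subevent_arrival(24.0, 90.0, 165.0, 165.0, 15.0)    # station in the rupture direction
t_against = subevent_arrival(24.0, 90.0, 345.0, 165.0, 15.0)  # station in the opposite direction

# Recover distance and origin time from the slope and intercept of
# arrival time versus directivity parameter, as in a directivity plot.
station_az = np.array([15.0, 75.0, 135.0, 165.0, 225.0, 285.0, 345.0])
gammas = directivity_parameter(station_az, 165.0, 15.0)
arrivals = subevent_arrival(24.0, 90.0, station_az, 165.0, 15.0)
slope, intercept = np.polyfit(gammas, arrivals, 1)
estimated_distance_km = -slope
estimated_origin_time_s = intercept
```

Stations in the rupture direction see the sub-event early and stations behind it see it late. A straight-line fit through the aligned arrivals therefore returns the sub-event distance as the negative slope and the origin time as the intercept.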
 It proves useful to display the individual contributions of the two stages to help understand this interference. In Figure S16a, we present a detailed analysis of waveform matching for two stations with one located to the north (SFJD) and another to the south (CTAO). See Figure S16b and Figure 3b for the locations and rupture directivity parameters of these two stations. For station SFJD, the interference between the two rupture stages enhances the amplitude with almost equal contributions from S1 and S2. At CTAO, however, the contribution from S2 is minimal, and strong southward rupture directivity on S1 dominates the waveform amplitude. Furthermore, the complicated interference between the two rupture stages greatly improves the average waveform cross-correlation coefficients between the data and synthetics (approaching 1.0 for some sites, Figure S16b). The comparisons of various source models with all the waveforms are presented in Figures S9 and S10, demonstrating the improvement by adding S2 (E3).
The four largest sub-events in the rupture model are marked by rectangles in Figure 4 with their timings and strengths shown in the moment-rate plot, corresponding to those identified in Figure 3. The total moment of our preferred model is 4.8 × 10^28 dyne·cm, and the moment magnitudes for E1–E4 are 7.8, 8.0, 7.9, and 7.9, respectively. The beginning sub-event (E1) shows its own complexity of rupture history, which starts from an Mw 6.5 event (estimated from moment-rate function in Figure 4), then evolves into larger moment around 8 s. The largest sub-event E2 has a peak moment rate around 14 s and is located about 50 km to the south of the hypocenter, corresponding to an average rupture speed of about 4.0 km/s. A slip gap of about 30 km is left between E1 and E2, which is subsequently filled during the northward rupture of E3 at around 24 s. At the same time, southward rupture continues and the last sub-event (E4) occurs 40 km south of E2. The four sub-events produce over 80% of the total moment, in an area of 140 km × 50 km with average slip amplitude of about 4 m. The rise time and displacement inversion results can be found in Figure S13.
Compared with the 1994 Bolivian earthquake, the rupture area of the 2013 Sea of Okhotsk event is at least three times larger. Assuming a dip slip mechanism and a rectangular rupture plane [Kanamori and Anderson, 1975], the estimated static stress drop is about 8 MPa, which is more than an order of magnitude smaller than that of the Bolivian event (>100 MPa) [Kanamori et al., 1998]. Additionally, the rupture velocity is about 75% of the local shear wave speed, which is relatively high compared with that of the 1994 Bolivian earthquake and is more consistent with that observed for shallow earthquakes. Assuming a mode III crack and using the definition of seismic efficiency as the ratio between radiated energy and the total potential energy change ($\eta = 1 - \sqrt{(1 - V/\beta)/(1 + V/\beta)}$, where V is rupture speed and β is shear wave speed) [Kanamori et al., 1998] gives a value of 0.62 for the Sea of Okhotsk event, compared to 0.18 for the Bolivian event [Kanamori et al., 1998], which implies relatively low sliding friction during the rupture process of the Sea of Okhotsk earthquake. Both events have similar downdip rupture widths, which are primarily controlled by the width of the subducting slab; however, the along-strike dimension of the Sea of Okhotsk event is nearly four times that of Bolivia. Furthermore, while the rate of subduction in these regions is similar (~70–90 mm/yr), the age of the subducting oceanic lithosphere beneath the Sea of Okhotsk (~110 Ma) is twice that beneath Bolivia (~55 Ma) [Muller et al., 2008]. This suggests that the combination of large rupture area and low static stress drop may be caused by failure of relatively cold and brittle material in the very old slab, which ruptures in a weakly dissipative manner.
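The two numbers quoted above can be checked with a short calculation. The sketch assumes the mode III efficiency expression 1 - sqrt((1 - V/β)/(1 + V/β)) and the dip-slip rectangular-fault stress drop of Kanamori and Anderson [1975] for a Poisson solid, 8/(3π) · μ · D / w. The rigidity of 1.2 × 10^11 Pa is an assumed value typical of ~600 km depth and is not stated in the text. The slip (4 m), width (50 km), and velocity ratio (0.75) are taken from the text.

```python
import math

def seismic_efficiency_mode3(v_over_beta):
    """Seismic efficiency of a mode III crack for rupture speed ratio V / beta."""
    return 1.0 - math.sqrt((1.0 - v_over_beta) / (1.0 + v_over_beta))

def dip_slip_stress_drop(mean_slip_m, width_m, rigidity_pa):
    """Static stress drop (Pa) for a dip-slip rectangular fault.

    Uses 8 / (3 * pi) * mu * D / w, i.e. the Kanamori and Anderson (1975)
    expression with lambda = mu (Poisson solid, an assumption here).
    """
    return 8.0 / (3.0 * math.pi) * rigidity_pa * mean_slip_m / width_m

ASSUMED_RIGIDITY_PA = 1.2e11   # assumption: not given in the text

eta = seismic_efficiency_mode3(0.75)                                  # about 0.62
stress_drop_mpa = dip_slip_stress_drop(4.0, 50e3, ASSUMED_RIGIDITY_PA) / 1e6  # about 8 MPa
```

Under these assumptions the efficiency reproduces the quoted 0.62, and the stress drop lands near the quoted 8 MPa. The stress drop scales linearly with the assumed rigidity, so a different choice shifts it proportionally.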
The rupture behavior of sub-events in the Sea of Okhotsk earthquake is in some ways similar to that observed during the 2012 Brawley swarm [Wei et al., 2013], in which the largest sub-events ruptured with complementary slip distributions, effectively filling the low-slip regions left behind by earlier ruptures. However, in contrast to the hours-long rupture process of shallow swarms, the rupture process of the Sea of Okhotsk event is condensed into a total duration of only 30 s. The complementary slip distributions between sub-events strongly suggest a triggering relationship between them. The triggering of seismicity by large earthquakes, even globally, is easily observed either at the time of the dynamic wave passage or delayed by hours to days [Gomberg et al., 2004; King et al., 1994; Pollitz et al., 2012]. The latter appears common in earthquake swarm environments where larger events have their own sequence of aftershocks. This delay can be explained in terms of a rate-and-state framework with quasistatic preparation [Kaneko and Lapusta, 2008; Noda and Lapusta, 2013]. In contrast, transformational fault triggering with its strong positive feedback heating is expected to be much faster. Since the timing of E3 is consistent with the S wave fields originating from E2, dynamic triggering is likely involved. We suggest that deforming slabs at great depths are enriched with stress concentrations (asperities) that can be easily triggered, thus greatly shortening the time delay between triggered sub-events.
 The rupture of the 2013 Mw 8.3 Sea of Okhotsk Earthquake is composed of four major sub-events (E1–E4) and can be divided into two rupture stages. The stage one sub-events (E1, E2, and E4) mainly ruptured toward the south, and the second stage sub-event (E3) ruptured northward, filling in the slip gap between E1 and E2. The earthquake ruptured along a 10° dipping fault zone (140 km × 50 km) with an average rupture speed of 4.0 km/s. The relatively fast rupture speed coupled with the complementary nature of the sub-event slip distributions suggests a strong triggering relationship among the sub-events, possibly related to stress concentrations within the deep structure of the very old slab material.
 The teleseismic data were downloaded from IRIS, and figures are made with GMT. This research is supported by NSF grant EAR-1142020 and USGS award G10AP00048, Caltech Tectonics Observatory. The manuscript was improved by the constructive input of Ken Hudnut, Gavin Hayes, and two anonymous reviewers.
 The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.