The 2011 Tohoku earthquake produced tens of meters of fault slip near the Japan Trench, which generated a devastating tsunami. The rupture process before the huge slip is still unclear because of limited resolution. Here I perform a multiscale slip inversion analysis to examine the first 10 and 20 s of the rupture process and the whole rupture process at different scales. The results show that 4 s after the initiation, this earthquake started with a relatively high-speed rupture that had a peak slip rate faster than 1 m/s and a rupture velocity comparable to 3 km/s. Fourteen seconds after the initiation, the rupture propagation direction changed from northward to westward, near the edge of the M 7.3 foreshock coseismic slip area. The stress release by the foreshock may have contributed to the complex small-scale rupture propagation, which may appear to be slow rupture propagation when only long-period data are examined.
 The 2011 Tohoku earthquake at 5:46 on 11 March 2011 (UTC) is the largest earthquake instrumentally recorded in Japan. This earthquake is characterized by a huge slip of more than 40 m in a shallow region close to the Japan Trench [e.g., Lay et al., 2011; Shao et al., 2011; Suzuki et al., 2011; Yagi and Fukahata, 2011; Iinuma et al., 2012], which produced the devastating tsunami. Seismic slip inversion indicates that the huge shallow slip started about 40 s after the onset [Yagi and Fukahata, 2011]. It is important to understand how this huge shallow slip occurred in terms of both the long-term stress accumulation and the coseismic rupture. Hoshiba and Iwakiri  pointed out that the peaks of acceleration, velocity, and displacement in the first 3–5 s of the P waves are comparable to or even weaker than those of an Mw 6 earthquake. Chu et al.  reported that the first 4 s of the rupture process are equivalent to an individual Mw 4.7 thrust event. Fukahata et al.  suggested that a major moment release started around 20 s after the onset, 35 km west of the hypocenter, as inferred from abrupt rises in 1 Hz GPS data and a comparison with far-field term calculations using a full-space homogeneous elastic medium and point sources. However, there have not yet been detailed images of the initial part of the rupture process. Here I perform a multiscale slip inversion analysis [Uchide and Ide, 2007] to image the initial rupture process. The coseismic and postseismic slip for the largest foreshock on 9 March (Mw 7.3) have been estimated from seismic data [Imanishi et al., 2012; Kato et al., 2012; Suzuki et al., 2012] and from GPS with ocean bottom pressure data [Ohta et al., 2012]. The relationship between the areas of the mainshock coseismic rupture and the areas of the foreshock slip can be clearly seen.
 I use velocity seismograms from Hi-net and strong motion data from KiK-net and F-net, all of which are operated by the National Research Institute for Earth Science and Disaster Prevention (NIED) [Okada et al., 2004]. The locations of the stations are shown in Figure 1. Hi-net and KiK-net consist of short-period velocity seismometers and strong motion accelerometers, respectively, installed in common boreholes. F-net is a network of strong motion and broadband seismometers installed in tunnels. Together, these networks cover a wide frequency band and a broad dynamic range, which makes them well suited to studying earthquake source processes.
 The velocity seismograms at representative stations along the Pacific coast in the Tohoku region (Figure 1c) show stepwise increases in amplitude. The P arrivals are clear in the Hi-net records. The amplitudes of the first 4 s of the P waves are comparable to those of the Mw 5.2 earthquake (22:20, 4 December 2005 (UTC)), although the source duration of M 5 earthquakes is expected to be shorter than 1 s. From 4 s, waves with amplitudes 10 times larger arrive. Another increase in amplitude is seen around 14–18 s. These amplitude steps probably reflect stepwise growth of the rupture; therefore, this study images the rupture process corresponding to these steps.
 The multiscale slip inversion method [Uchide and Ide, 2007], using fault models on three spatial scales, is applied to the Tohoku earthquake. I refer to the smallest, medium, and largest scales as Scales 1, 2, and 3, respectively. The slip velocity distribution history on the fault model is represented by multiple triangle functions in both time and space. I solve a linear least squares problem to determine the amplitudes of the triangle functions that minimize the misfit between observed and synthetic waveforms. Where the model regions at different scales overlap in time and space, the slip velocity distribution history at a larger scale is the spatiotemporal average of that at a smaller scale. Based on this linear relationship, the observation equations at the individual scales are combined [see Uchide and Ide, 2007].
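 The idea of representing a slip velocity history by triangle functions and solving for their amplitudes by linear least squares can be illustrated with a minimal one-dimensional sketch. This is a toy under stated assumptions, not the implementation of Uchide and Ide [2007]: it uses time only (no spatial dimension), a single station, and a hypothetical Green's function supplied by the caller; the function names `triangle`, `design_matrix`, `invert_amplitudes`, and `coarsen` are introduced here for illustration.

```python
import numpy as np

def triangle(t, center, half_width):
    """Unit-height triangle function centered at `center` (zero outside its support)."""
    return np.clip(1.0 - np.abs(t - center) / half_width, 0.0, None)

def design_matrix(t, centers, half_width, green):
    """One column per triangle: the triangle convolved with the Green's function.

    `green` is a hypothetical Green's function sampled on the same time axis as `t`.
    """
    dt = t[1] - t[0]
    columns = [
        np.convolve(triangle(t, c, half_width), green)[: t.size] * dt
        for c in centers
    ]
    return np.stack(columns, axis=1)

def invert_amplitudes(t, data, centers, half_width, green):
    """Least squares estimate of the triangle amplitudes from a waveform."""
    kernel = design_matrix(t, centers, half_width, green)
    amplitudes, *_ = np.linalg.lstsq(kernel, data, rcond=None)
    return amplitudes

def coarsen(amplitudes, factor):
    """Average consecutive groups of fine-scale amplitudes.

    This mimics, in time only, the statement that the larger-scale model
    is an average of the smaller-scale model where the two overlap.
    """
    fine = np.asarray(amplitudes, dtype=float)
    return fine.reshape(-1, factor).mean(axis=1)

if __name__ == "__main__":
    t = np.arange(0.0, 20.0, 0.1)
    centers = np.arange(1.0, 9.0, 1.0)
    green = np.exp(-t / 2.0) * np.sin(2.0 * np.pi * t / 3.0)  # toy Green's function
    true_amplitudes = np.array([0.1, 0.3, 1.0, 1.2, 0.8, 0.5, 0.2, 0.1])
    data = design_matrix(t, centers, 1.0, green) @ true_amplitudes
    estimate = invert_amplitudes(t, data, centers, 1.0, green)
    print(np.round(estimate, 3))
    print(coarsen(estimate, 2))
```

 Because the coarse amplitudes are a fixed linear combination of the fine ones, the observation equations for the two scales can be stacked into a single linear system, which is the property the multiscale method relies on.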
 Table 1 summarizes the settings of the fault model, data, and Green's functions at each scale. The duration of the triangle functions at each scale is a quarter or half of the period of the low-pass filter. The fault model is composed of four fault planes (namely, NE, NW, SE, and SW), whose geometries are described in Figure 1 and Table 2. The eastern edges of the NE and SE planes correspond to the Japan Trench, and a finite slip on these edges is allowed, by setting node points of the slip functions on the edge. Otherwise, the model cannot give a good fit to observed waveforms. A slip on the border between adjacent planes is also allowed. The slip is set to be zero on the other boundaries since the fault segments are set by trial and error to be large enough to include the whole rupture area. To set the rupture initiation point, I refer to the hypocenter relocated by Suzuki et al.  using data from ocean bottom seismometers as well as other onshore networks. Suzuki et al.  pointed out that their hypocenters are systematically 5 km deeper than the plate interface; therefore, I set the rupture initiation point at 38.0919°N, 142.7897°E at a depth of 23 km. Scale 3 includes the entire rupture process, whereas Scales 2 and 1 are for the first 20 and 10 s, respectively.
Table 1. Setting of Multiscale Slip Inversion Analysis
 The Green's functions for Scale 3 are calculated using the reflection-transmission matrix [Kennett and Kerry, 1979] and the wave number integral [Bouchon, 1981] methods, using a 1-D velocity structure (Table 3), which is based on Iwasaki et al.  and the preliminary reference Earth model [Dziewonski and Anderson, 1981]. The effect of anelastic attenuation is modeled by using a complex seismic velocity, which for frequency ω is
where v1 denotes phase velocities of body waves at frequency ω = 2π [Takeo, 1985]. In general, the uncertainty of the velocity structure used for calculating Green's functions is problematic in slip inversion analyses, but such problems are not severe for seismic data at periods longer than 20 s. For example, S waves may be affected by the thick seawater layer in the Pacific Ocean, but longer-period seismic waves are not strongly influenced [e.g., Hatayama, 2004].
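 For orientation, a widely used constant-Q form of a complex velocity referenced to 1 Hz is sketched below. It is an assumption chosen to be consistent with the definition of v1 given above, and is not necessarily the exact expression of Takeo [1985]. The quality factor Q is a symbol introduced here, and the sign of the imaginary term depends on the Fourier transform convention (it is negative for a time dependence of exp(−iωt) and positive for the opposite convention).

```latex
v(\omega) = v_1 \left[ 1 + \frac{1}{\pi Q} \ln\!\left( \frac{\omega}{2\pi} \right) - \frac{i}{2Q} \right]
```

 The logarithmic term describes the weak physical dispersion that accompanies attenuation, and it vanishes at ω = 2π, where the real part of the velocity equals v1.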
Table 3. One-Dimensional Structure of P and S Wave Velocities (Vp and Vs, Respectively) for Calculating Green's Functions
 The empirical Green's functions (EGFs) for Scales 1 and 2 are observed waveforms of an Mw 4.9 earthquake that occurred at 21:00, 9 March 2011 (UTC). The moment magnitude is from F-net. I relocated the hypocenter of the EGF event at 38.2265°N, 142.8406°E at a depth of 21.2 km by a master event relocation technique using the time differences of P arrivals and the mainshock as the master event.
 Spatial smoothing constraints are applied to stabilize the analysis. I set two hyperparameters that control the intensity of the smoothing constraint, one shared by Scales 1 and 2 and the other for Scale 3. The values of the hyperparameters are determined by minimizing Akaike's Bayesian Information Criterion (ABIC) [Akaike, 1980; Yabuki and Matsu'ura, 1992].
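 How a smoothing weight can be chosen by minimizing ABIC is sketched below for a generic linear inverse problem with a single hyperparameter. The sketch assumes uncorrelated data errors of equal variance, a one-dimensional second-difference smoothing operator, and the ABIC expression for this setting as given in the formulation of Yabuki and Matsu'ura [1992], up to an additive constant. It uses one hyperparameter rather than the two used in this study, and the names `second_difference`, `abic`, and `select_alpha2` are hypothetical.

```python
import numpy as np

def second_difference(n):
    """Second-difference operator of shape (n - 2, n), a simple 1-D smoothing matrix."""
    op = np.zeros((n - 2, n))
    for i in range(n - 2):
        op[i, i:i + 3] = [1.0, -2.0, 1.0]
    return op

def abic(kernel, data, smoother, alpha2):
    """ABIC (up to a constant) and the regularized solution for smoothing weight alpha2.

    Assumes data = kernel @ model + noise, with independent noise of equal variance.
    """
    n_data, n_model = kernel.shape
    penalty = smoother.T @ smoother
    rank = np.linalg.matrix_rank(penalty)
    normal = kernel.T @ kernel + alpha2 * penalty
    model = np.linalg.solve(normal, kernel.T @ data)
    residual = data - kernel @ model
    s = residual @ residual + alpha2 * (model @ penalty @ model)
    _, logdet = np.linalg.slogdet(normal)
    value = (n_data + rank - n_model) * np.log(s) - rank * np.log(alpha2) + logdet
    return value, model

def select_alpha2(kernel, data, smoother, candidates):
    """Grid search: return the candidate weight with the smallest ABIC, plus all scores."""
    scores = [abic(kernel, data, smoother, a)[0] for a in candidates]
    return candidates[int(np.argmin(scores))], scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    kernel = rng.normal(size=(40, 12))           # toy forward operator
    true_model = np.sin(np.linspace(0.0, np.pi, 12))
    data = kernel @ true_model + 0.1 * rng.normal(size=40)
    smoother = second_difference(12)
    best, _ = select_alpha2(kernel, data, smoother, [1e-3, 1e-2, 1e-1, 1.0, 1e1, 1e2])
    print("selected alpha^2:", best)
```

 A grid search is used here only for simplicity; with two hyperparameters the same criterion would be evaluated over a two-dimensional grid.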
 Figure 2 shows the estimated source model. The seismic moments at 10 and 20 s are 9.3 × 10^19 N m (equivalent to Mw 7.2) and 1.2 × 10^21 N m (Mw 8.0), respectively, and the final seismic moment is 5.6 × 10^22 N m (Mw 9.1). The characteristics of the rupture differ among the scales. Scale 3 shows a shallow slip starting around 40 s, with a low slip velocity in the first 20 s. However, at Scales 1 and 2, a dynamic rupture process with a slip velocity of more than 1 m/s is estimated from 4 s. The slow rupture in the first 20 s seen at Scale 3 is therefore only apparent and results from the limited resolution of the longer-period data. Scales 1 and 2 imply dynamic ruptures with various rupture propagation directions, which combine to give an apparently slow rupture velocity at Scale 3. During the first 3 s at Scale 1, a small rupture is found. Between 4 and 10 s, the rupture propagates bilaterally northward and southward. The rupture toward the north stopped propagating around 14 s, and the direction of the main rupture propagation changed toward the west.
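 The correspondence between the seismic moments and the moment magnitudes quoted above can be checked with the standard relation Mw = (2/3)(log10 M0 − 9.1), with M0 in N m. The short sketch below applies it to the three values in this paragraph; the function name `moment_magnitude` is introduced here for illustration.

```python
import math

def moment_magnitude(m0_newton_meter):
    """Moment magnitude from seismic moment in N m: Mw = (2/3) * (log10(M0) - 9.1)."""
    return (2.0 / 3.0) * (math.log10(m0_newton_meter) - 9.1)

if __name__ == "__main__":
    for label, m0 in [("10 s", 9.3e19), ("20 s", 1.2e21), ("final", 5.6e22)]:
        print(f"{label}: M0 = {m0:.1e} N m, Mw = {moment_magnitude(m0):.1f}")
```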
 Figure 3 compares the observed and synthetic waveforms. The synthetic waveforms fit the overall features of the observed waveforms, with a variance reduction of 64%. The observed waveforms at stations to the north and south of the source region show monochromatic oscillations. These are probably due to a propagation effect, such as localized oscillation of the water column at a trench [Ihmlé and Madariaga, 1996], that is not simulated by the 1-D velocity model (Table 3). Since the synthetic waveforms do not follow these oscillations, it is likely that the obtained source model is not distorted by them.
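 The variance reduction is not defined explicitly in the text. A common definition, assumed here, is one minus the ratio of the summed squared residuals to the summed squared observations, so that a perfect fit gives 1 (100%) and an all-zero synthetic gives 0. The sketch below implements that definition; the function name `variance_reduction` and the toy waveforms are hypothetical.

```python
import numpy as np

def variance_reduction(observed, synthetic):
    """Variance reduction, assumed here as 1 - sum((obs - syn)^2) / sum(obs^2)."""
    observed = np.asarray(observed, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    return 1.0 - np.sum((observed - synthetic) ** 2) / np.sum(observed ** 2)

if __name__ == "__main__":
    t = np.linspace(0.0, 10.0, 501)
    observed = np.sin(t) + 0.3 * np.sin(5.0 * t)   # toy record with an extra oscillation
    synthetic = np.sin(t)                          # toy synthetic without that oscillation
    print(f"variance reduction: {100.0 * variance_reduction(observed, synthetic):.0f}%")
```

 In practice the sums would run over all stations and components used in the inversion, possibly with weights, which this sketch omits.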
5 Discussion and Conclusions
 The multiscale slip inversion analysis of the first 20 s of the 2011 Tohoku earthquake shows that the slip velocity and the rupture velocity are generally fast and correspond to a dynamic rupture (i.e., slip velocity on the order of 1 m/s and rupture velocity comparable to shear wave velocities); I do not see any indication of a slow rupture at the beginning of the mainshock. In the first 3 s, the multiscale slip inversion result indicates that a high-speed rupture started but stopped immediately. The cumulative seismic moment in the first 3 s is 1.2 × 10^18 N m (equivalent to Mw 6.0). One possibility is that this rupture is a distinct, short-lived event, as suggested by the appearance of the waveforms (Figure 1c) and the centroid moment tensor (CMT) inversion [Chu et al., 2011]. It is still difficult to investigate the details of such an “M 6 event” from onshore seismic data; therefore, careful studies with a good data set are needed to confirm whether this is a distinct event.
 This earthquake shows a complex rupture propagation pattern: A northward and southward bilateral rupture is followed by westward rupture propagation. The beginning of the westward rupture propagation may correspond to the breakout of a major moment release inferred from a high-rate GPS observation [Fukahata et al., 2012]. The rupture was propagating westward at 20 s; however, more changes of the rupture propagation direction are expected later since the rupture eventually reaches the region close to the Japan Trench to the east. In fact, Asano and Iwata  estimated a strong motion generation area (SMGA) on the west of the rupture area during the first 20 s, in which the rupture mainly propagates eastward (Figure 4). Such a complex rupture may give the appearance of an average slow rupture propagation when looking at a large scale.
 Figure 4 compares the slip distribution for the first 20 s of the mainshock with those of the M 7.3 foreshock and its afterslip [Ohta et al., 2012]. The change in the rupture propagation direction occurred at the edge of the source region of the M 7.3 foreshock. It is probable that much of the shear stress in the source region of the M 7.3 foreshock had already been released, so that the mainshock rupture could not initially propagate into that region. Therefore, the M 7.3 foreshock may be one factor that influenced the complex rupture propagation direction of the mainshock seen in Figure 2. However, the effect of preceding earthquakes, including immediate foreshocks, on the following mainshock is not simple: the stress release in their source regions may impede a following earthquake rupture, while the surrounding stress redistribution can promote rupture [Mitsui et al., 2012; Ide and Aochi, 2013].
 I thank Jim Mori and two anonymous reviewers for their constructive comments. I used seismograms of Hi-net, KiK-net, and F-net and the F-net CMT catalog maintained by NIED. The figures are drawn using GMT [Wessel and Smith, 1991]. T.U. was a research fellow of the Japan Society for the Promotion of Science.
 The Editor thanks two anonymous reviewers for their assistance in evaluating this paper.