We developed a three-step method for three-dimensional (3-D) S wave velocity tomography by fitting synthetic cross spectra to the observed ones of ambient seismic noise. We applied this method to the recording of Hi-net tiltmeters in Japan at 679 stations from June 2004 to December 2004. First, we calculated normalized cross spectra between radial components and those between transverse components for every pair of stations. The first step is local 1-D S wave velocity inversion for each station assuming small lateral heterogeneity under a 100-km circle of a station. We measured the dispersion curves of fundamental Rayleigh waves, fundamental Love waves, and first overtone of Love waves by fitting the synthetic cross spectra to the observed ones between pairs of stations within the circle. We inverted the measured dispersion curves for obtaining a 1-D S wave velocity model. The second step is the inversion of the observed cross spectra for obtaining path-averaged 1-D S wave velocity structure. The third step is the inversion of the resultant path-averaged structures for obtaining 3-D S wave velocity structure (0.1° × 0.1° × 1 km grid from the surface to a depth of 50 km) using ray approximation. The resultant S wave velocity structures show clear low-velocity anomalies along tectonic lines from the surface to a depth of 20 km. In particular, along the Hidaka mountain range, we observed S wave perturbation more extreme than −20%. They also show low-velocity anomalies under volcanoes in Kyusyu and Tohoku. In the southwestern part of Shikoku, our results show a clear low-velocity anomaly corresponding to an accretional belt (Shimanto belt). Below 20 km, we observe a low-velocity anomaly in the center of Japan, which suggests a thick crust.
 It is well known that microseisms are excited at random by standing ocean surface waves. The typical frequency of microseisms, about 0.2 Hz, is approximately double the typical frequency of ocean surface waves because of nonlinear interactions [Hasselmann, 1963; Longuet-Higgins, 1950]. Microseisms are the main source of noise for seismic observation because they mask seismic signals from earthquakes.
 Conversely, by exploiting their random excitation, one-dimensional (1-D) S wave velocity structures at shallow depth (≤5 km) have been explored since the early work of Aki [1957, 1965]. In this method, known as the spatial autocorrelation method, the dispersion curves of surface waves are obtained from the cross spectra between many pairs of stations in an array. The measured dispersion curves are then inverted for a 1-D S wave velocity structure under the array.
 Recently, Shapiro et al.  performed a cross-correlation analysis of long sequences of the ambient seismic noise at around 0.1 Hz to obtain a group velocity anomaly of Rayleigh waves due to the lateral heterogeneity of the crust in Southern California. They inverted the measured anomalies for obtaining a group velocity map. This method is called ambient noise surface wave tomography. The ambient noise tomography is theoretically justified by the fact that a cross correlation function between two stations provides its Green's function between the stations [e.g., Snieder, 2004]. These studies resulted in group speed maps at short periods (7.5–15 s) that display a striking correlation with the principal geological units in California with low-speed anomalies corresponding to the major sedimentary basins and high-speed anomalies corresponding to the igneous core of the main mountain regions. Group velocity maps have also been obtained at larger scales and longer periods across much of Europe [Yang et al., 2007], in South Korea at very short periods [Cho et al., 2007], and in Tibet at long periods [Yao et al., 2006]. However, three-dimensional (3-D) S wave velocity inversion has not been performed because of complex propagation of the observed waves.
 For performing 3-D inversion, the phase information of the observed surface waves is important. However, only group velocity maps were obtained by the ambient noise tomography in most cases because the propagation of short-period surface waves, which are most sensitive to the crust, is too complicated. The waves are preferentially attenuated and scattered; therefore, propagation over distances exceeding 100 km distorts their waveforms significantly. In order to use the phase information, both dense instrumentation and widely distributed stations are required. The dense Hi-net tiltmeter array in Japan [Obara et al., 2005] enables us to use this phase information.
 For performing 3-D S wave velocity inversion, we developed a new method to fully utilize the waveform information. For modeling the observed cross spectra, we formulated synthetic cross spectra based on a normal mode theory with an assumption of stochastic stationary excitation of surface waves [Fukao et al., 2002; Nishida and Fukao, 2007]. The method we have used is similar to partitioned waveform inversion [Nolet, 1990; van der Lee and Nolet, 1997]. Our method has three steps: (1) measurement of dispersion curves using many pairs of the observed cross spectra as in the spatial autocorrelation method [Aki, 1957], and inversion of the dispersion curves for obtaining local 1-D S wave velocity models; (2) estimation of path-averaged 1-D S wave velocity structures by modeling observed cross spectra; and (3) inversion of path-averaged structures for obtaining 3-D S wave velocity structure (0.1° × 0.1° × 1 km grid from the surface to a depth of 50 km) using ray approximation.
 Hi-net is a dense array that covers all of Japan and is operated by the National Research Institute for Earth Science and Disaster Prevention. The Hi-net tiltmeter network consists of 679 tiltmeters (Figure 1) buried in deep boreholes of 100 m depth or more [Okada et al., 2004]; it can be used as a network of horizontal long-period seismometers (accelerometers) [Tono et al., 2005; Tonegawa et al., 2006].
 For each station, we remove glitches and divide all the records in a time period from June 2004 to December 2004 into 1024 s segments with an overlap of 512 s. Each segment is Fourier-transformed to obtain cross spectra. Figure 2 shows the probability density of the power spectra [e.g., McNamara and Buland, 2004]. It shows a clear thick red curve with a peak at about 0.2 Hz, which corresponds to microseisms. The high probability densities in red at about 0.2 Hz show a background level that is three orders of magnitude larger than the new low noise model (NLNM) [Peterson, 1993]. This high background level originates from the location of the stations near coastlines. In order to analyze the background wavefield, we must discard outliers such as earthquakes, local nonstationary ground motions, and instrumental noise [Nishida and Kobayashi, 1999]. In order to detect the outliers, we estimate the background level t at time t using the median of mean square amplitudes Ii,t of the ith station filtered from 0.05 to 0.2 Hz. We discard the segments with Ii,t larger than 10t and those with Ii,t smaller than 0.1t. We also discard all segments disturbed by transients when a sudden change of background level between successive time segments is larger than 0.12t.
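The segmentation and amplitude-based selection described above can be sketched as follows. This is a minimal illustration, not the processing code used in this study: the function names `split_segments` and `select_segments` are hypothetical, the background level is assumed to be the median over stations of the mean square amplitudes in each segment, and the additional criterion on sudden changes between successive segments is omitted.

```python
import numpy as np

def split_segments(x, fs, seg_len_s=1024, overlap_s=512):
    """Split a 1-D record into overlapping segments (one segment per row)."""
    n = int(seg_len_s * fs)                      # samples per segment
    step = int((seg_len_s - overlap_s) * fs)     # hop between segment starts
    starts = range(0, len(x) - n + 1, step)
    return np.array([x[s:s + n] for s in starts])

def select_segments(power, upper=10.0, lower=0.1):
    """power[i, t]: mean square amplitude of station i in segment t,
    assumed already band-pass-filtered from 0.05 to 0.2 Hz.
    Assumption: the background level at time t is the median over stations."""
    background = np.median(power, axis=0)        # shape (n_segments,)
    keep = (power <= upper * background) & (power >= lower * background)
    return keep, background
```

The boolean mask `keep` marks the station-segment pairs that survive both thresholds; only segments kept at both stations of a pair would then enter the cross spectra.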
 In order to study the propagation features of the observed surface waves, we calculated the frequency-slowness spectra of the data [e.g., Rost and Thomas, 2002] at 0.075 Hz (Figure 3). The spectrum of radial components (R) shows clear Rayleigh wave propagation from all directions, and that of transverse components (T) also shows clear Love wave propagation. Surprisingly, the observed amplitudes of Love waves are larger than those of Rayleigh waves, in contrast with the dominance of Rayleigh waves in most cases [Kawakami et al., 2005]. Therefore, we use information from both Love waves and Rayleigh waves in this study. Figures 3a and 3b show that the surface waves travel from all directions, although their amplitudes change slightly with direction. This feature justifies the ambient noise surface wave tomography [Snieder, 2004], because a cross-correlation analysis of anisotropically excited surface waves would produce apparent waves with a fast phase velocity.
3. Cross Spectra Between Pairs of Stations
 We calculate the ensemble average of normalized cross spectra, djα of the jth pair (between the kth and the lth stations) for their common record segments as
where skα is the Fourier spectrum of the selected segments of the kth station. Here, α denotes the component: the cross spectrum is taken either between radial components (R) or between transverse components (T), as defined in Figure 4. We model these spectra in section 4.
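As an illustration of an ensemble-averaged normalized cross spectrum, the following sketch normalizes each segment's cross spectrum by the product of the two amplitude spectra before averaging. This particular normalization and the choice of which spectrum is conjugated are assumptions made for the example, and the function name `normalized_cross_spectrum` is hypothetical.

```python
import numpy as np

def normalized_cross_spectrum(seg_k, seg_l):
    """seg_k, seg_l: arrays of shape (n_segments, n_samples) holding the
    common record segments of two stations for one component (R or T)."""
    s_k = np.fft.rfft(seg_k, axis=1)
    s_l = np.fft.rfft(seg_l, axis=1)
    cross = s_k * np.conj(s_l)
    norm = np.abs(s_k) * np.abs(s_l)
    # Guard against division by zero in empty frequency bins.
    norm = np.where(norm > 0, norm, 1.0)
    # Ensemble average over segments (assumed per-segment normalization).
    return np.mean(cross / norm, axis=0)
```

With this normalization every segment contributes a unit-modulus phase factor, so the modulus of the average lies between 0 and 1 and energetic segments do not dominate the estimate.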
 Because of an isotropic excitation of Love and Rayleigh waves, as shown in Figure 3, the display of the cross-correlation functions between two stations against their separation distance should indicate clear Rayleigh and Love wave propagation [e.g., Shapiro and Campillo, 2004]. In fact, this is the case of the observed records, as shown in Figure 5, where the cross-correlation functions between R and T are band-pass-filtered from 0.02 to 0.5 Hz. They show clear propagation and dispersion of Rayleigh waves and Love waves. The cross-correlation functions with a separation distance longer than 1000 km are distorted because of their dispersion and attenuation. The cross-correlation functions shown in Figure 5a also exhibit weak crustal P wave propagation up to a separation distance of about 400 km.
 In order to confirm the dispersive nature of the observed waves, a display in wave number-frequency domain is better. In Figure 6, we plot such wave number-frequency spectra between R and T against phase velocity and frequency [Nishida et al., 2002] using all observed cross spectra. These plots exhibit a clear fundamental Rayleigh wave branch and a fundamental Love wave branch. We can also detect the first and the second overtone branches of Love waves in Figure 6b, although separation of the first overtones from fundamental Love waves in space-time domain is difficult. The crustal overtones result from the constructive interference of overcritical multiply reflected shear waves in the crust [Levshin et al., 2005].
 In order to construct the 3-D S wave velocity model, we use fundamental Rayleigh waves, fundamental Love waves, and the first overtone of Love waves below 0.2 Hz. The measurement of overtones provides valuable information for better vertical resolution. Above 0.2 Hz, wave propagation in some regions is too complicated to model. We can also observe ambient crustal P waves in Figure 5a and the second overtone of Love waves, but we do not use them because of their small amplitudes.
4. Forward Problem: Synthetic Cross Spectra of Horizontal Components Between a Pair of Stations for Homogeneous Excitation Sources
 In this section, for modeling the observed cross spectra, we calculate the synthetic cross spectra of Rayleigh waves and Love waves. Following Fukao et al.  we assume homogeneous and isotropic excitation sources. We can write the synthetic cross spectra as equation (A10) and equation (A11) of Appendix A, based on the normal mode theory. In this study, assuming that the separation distance Δ between pairs of stations is much shorter than the Earth's radius, we can approximate the cross spectra Ψα as
where ω is angular frequency, n represents a mode branch, knα is the wave number of the nth overtone, qnα is the inverse of the quality factor of the nth overtone, and anα is the power spectrum of the nth overtone. Here, ψ is the wave function defined by
where U(ω) is the group velocity of the wave. Later, we model the observed cross spectra using these equations.
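To illustrate why cross spectra of an isotropically excited wavefield depend on the product of wave number and separation distance, the sketch below reproduces the classical scalar result of the spatial autocorrelation method [Aki, 1957]: averaging plane waves over all azimuths yields the Bessel function J0(kΔ). This is a simplified scalar analogue only; it is not the expression for horizontal components used in this section, and it ignores attenuation. The function names are hypothetical.

```python
import numpy as np

def azimuthal_average(k, delta, n_az=3600):
    """Average coherence exp(i k delta cos(theta)) of plane waves arriving
    with equal amplitude from uniformly distributed azimuths theta."""
    theta = np.arange(n_az) * 2.0 * np.pi / n_az
    return np.mean(np.exp(1j * k * delta * np.cos(theta)))

def bessel_j0(x, n=20000):
    """J0(x) from its integral form (1/pi) * int_0^pi cos(x sin t) dt,
    evaluated with the midpoint rule."""
    t = (np.arange(n) + 0.5) * np.pi / n
    return np.mean(np.cos(x * np.sin(t)))

# Example: 0.1 Hz wave with an assumed phase velocity of 3.5 km/s, 50 km apart.
k = 2.0 * np.pi * 0.1 / 3.5
coherence = azimuthal_average(k, 50.0)
```

The imaginary part of `coherence` vanishes for an isotropic source distribution, which is why azimuthally uneven excitation shows up as a phase bias in measured cross spectra.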
5. Local 1-D S Wave Velocity Model for Each Station
 The first step of our inversion procedure is a 1-D S wave velocity inversion for each station. For this, we assume that the lateral heterogeneity within a 100-km radius of a station is sufficiently small. A typical example of the circle at the Hakuta station (N.HKTH) is shown in Figure 1. In order to check the plausibility of the assumptions of isotropic excitation of the surface waves and of small lateral heterogeneity within the circle, we show cross-correlation functions within the 100-km circle for N.HKTH in Figure 7. In Figure 7 we corrected the amplitudes only for geometrical spreading. We also show them within the circles at the Kamikawa station (N.KKAH), the Karakuwa station (N.KKWH) and the Asahi station (N.ASNH) in Figures 8–10. Locations of the stations are shown in Figure 1. Figures 7–10 show clear surface wave propagation and no apparent waves with fast phase velocity due to anisotropic excitation. These features suggest that our assumptions are plausible. By fitting the synthetic cross spectra to the observed ones between pairs of stations within the circle, we measure the phase velocity of fundamental Rayleigh waves, fundamental Love waves, and the first overtone of Love waves. From the dispersion curves, we obtain 679 reference 1-D S wave velocity structures by phase velocity inversion.
5.1. Measurement of Dispersion Curves of Surface Waves
 In order to construct a reference 1-D S wave velocity model within a 100-km circle for the mth station, we estimate wave number k0α(ω)(m) of a fundamental mode branch, its quality factor q0α(ω)(m), and excitation terms a0α(ω)(m) described in equation (2) by minimizing the square difference Sα(m) between the synthetic cross spectra and the observed ones for each frequency. Here we define Sα(m) as
where djα(ω) is a cross spectrum of the jth pair of stations and wj is a data quality-dependent weight of the jth observed cross spectrum. Here we simply corrected topographic effects, following Snieder, by δtjα (see Appendix B for details). A three-step grid search method has been implemented to find the minimum Sα(m) for each frequency.
 At the first step, we set q0α = 1/200 as the initial value. For the entire wave number and frequency range, we estimate corresponding excitation term a(k, ω) by minimizing Sα(m) as ∂Sα(m)/∂a = 0.
 At the next step, we plot the least squares misfits Sα(m)(k, ω) with estimated a(k, ω) against frequency and the corresponding phase velocity. We show four typical examples from Figure 7 to Figure 10. All Sα(m) show clear dispersion curves of these waves regardless of their location. The clear propagation suggests that our assumptions of isotropic excitation and small lateral heterogeneity are plausible. For each frequency, a clear minimum value of Sα(m) can be identified in black, which exhibits a clear fundamental branch of Rayleigh waves with a phase velocity from 3.25 to 3.6 km/s, and Love waves from 3.4 to 4 km/s. We can determine k0α(ω)(m) by selecting the minimum value from 0.05 to 0.2 Hz. We also obtain a0α(ω)(m) corresponding to the determined k0α(ω)(m).
 At the third step, keeping k0α(ω)(m) and a0α(ω)(m) constant, we estimate q0α(ω)(m) using a grid search. Here, q0α(ω)(m) is influenced not only by intrinsic anelasticity but also by small heterogeneity within the circle; therefore, we cannot interpret the physical meaning of q0α(ω)(m). However, q0α is necessary only for reducing the contribution of fundamental Love waves in order to measure the wave number of the first overtone of a Love wave, as shown below, because fundamental Love waves mask the first overtone, as shown in Figure 7. In the space-time domain, separation of the first overtones from the fundamental Love waves is also difficult because they overlap, as shown in Figure 7 (top right).
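The first two steps of the grid search can be sketched at a single frequency as follows. For each trial phase velocity the excitation term follows in closed form from ∂S/∂a = 0, and the phase velocity with the smallest weighted misfit is selected. The sketch is a simplified stand-in for the actual procedure: it assumes a scalar model in which the synthetic cross spectrum is a·J0(ωΔ/c), it neglects attenuation and the topographic correction, and it omits the third step (the search over q). All function and variable names are hypothetical.

```python
import numpy as np

def j0(x, n=2000):
    """Bessel J0 via midpoint quadrature of (1/pi) int_0^pi cos(x sin t) dt."""
    t = (np.arange(n) + 0.5) * np.pi / n
    return np.mean(np.cos(np.multiply.outer(x, np.sin(t))), axis=-1)

def grid_search_phase_velocity(d, dist, freq, w, c_grid):
    """d[j]: observed (real) cross spectrum of pair j at one frequency;
    dist[j]: station separation (km); w[j]: weight of pair j;
    c_grid: trial phase velocities (km/s)."""
    omega = 2.0 * np.pi * freq
    misfit = np.empty(len(c_grid))
    amp = np.empty(len(c_grid))
    for i, c in enumerate(c_grid):
        psi = j0(omega * dist / c)
        # dS/da = 0 for S = sum_j w_j (d_j - a psi_j)^2 gives a in closed form.
        amp[i] = np.sum(w * d * psi) / np.sum(w * psi**2)
        misfit[i] = np.sum(w * (d - amp[i] * psi)**2)
    best = np.argmin(misfit)
    return c_grid[best], amp[best], misfit

# Synthetic check: 40 pairs within 100 km, true c = 3.5 km/s, true a = 0.8.
dist = np.linspace(5.0, 100.0, 40)
d_obs = 0.8 * j0(2.0 * np.pi * 0.1 * dist / 3.5)
c_best, a_best, S = grid_search_phase_velocity(
    d_obs, dist, 0.1, np.ones(40), np.linspace(2.5, 4.5, 41))
```

Because the amplitude is eliminated analytically, only the wave number (here the phase velocity) has to be scanned, which keeps the search one-dimensional at each frequency.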
 In order to measure the wave number of the first overtone of a Love wave for the mth reference station, k1T(ω)(m), and its excitation term a1T(ω)(m), we reduce the contribution of the estimated fundamental Love waves from the observed spectra as
because the first overtone of a Love wave is masked by the fundamental Love waves, as shown in Figure 7. Then, we can determine the dispersion curves of the first overtone using the same method as that used for fundamental modes. A resultant plot of their misfits is shown in Figure 11 for N.HKTH. Figure 11 exhibits a clear first overtone branch from 4 to 4.5 km/s. We measure k1T(ω)(m) from 0.1 to 0.2 Hz because a crustal overtone of a Love wave can exist only above 0.1 Hz.
5.2. 1-D S Wave Velocity Inversion Using a Simulated Annealing Method
 At the first stage of 1-D S wave velocity inversion, we modify the initial model using linearized phase velocity inversion in order to accelerate the inversion in the second stage. This inversion begins with the initial P and S wave velocity model JMA2001 [Ueno et al., 2002], and a density model by scaling the P wave velocity [Christensen and Mooney, 1995; Mooney et al., 1998]. For the initial model, we calculate the wave number perturbations δknα(ω)(m) of the nth overtone for the mth reference station as
where r is the radial distance from the center of the Earth, R is the radius of the Earth, and Rb is the bottom of the model in this inversion. Here, Knα(ω, z)(m) is the 1-D sensitivity kernel of the nth branch for the mth reference model [Takeuchi and Saito, 1972; Dahlen and Tromp, 1998]. Because the wave number perturbation is less sensitive to the density perturbation (δρ) and the P wave velocity perturbation (δα) than to the S wave velocity perturbation, we simply scale them [Masters et al., 2000] as
 In Figure 12 we plot typical examples of sensitivity kernels. At 0.2 Hz, the maximum sensitivity of the fundamental Love wave is in the upper crust above 10 km. The sensitivity deepens and broadens as period increases. At 0.05 Hz, the fundamental Rayleigh wave is sensitive to the crust above 40 km. The first overtone of the Love wave is sensitive to the crust from 20 to 40 km. By using the kernels, we estimate the S wave velocity perturbation δβ(r) employing the linearized phase velocity inversion.
 In the next stage, in order to avoid the initial model dependence of the 1-D S wave velocity inversion, we employ a simulated annealing method [Rothman, 1985; Shapiro and Ritzwoller, 2002; Metropolis et al., 1953], which searches the model space globally and efficiently. We invert the measured dispersion curves for a 1-D S wave velocity structure modeled by a continuous piecewise linear function of depth with nodes (0, 2, 4, …, 40, 50 km) using very fast simulated annealing (FSA) [Ingber, 1989], which is very efficient for many geophysical applications [e.g., Zhao et al., 1996]. The FSA method uses a random sampling of a new model with a Cauchy-like distribution, which depends on the current model and the temperature parameter. This search scheme allows the use of a very fast cooling schedule.
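The sampling and cooling scheme can be sketched as follows, using the generating distribution and exponential schedule of Ingber [1989]. This is a toy illustration under stated assumptions, not the inversion code of this study: the misfit is a simple quadratic rather than a dispersion-curve misfit, a single temperature is shared by all parameters, out-of-bound trials are clipped instead of redrawn, and the names `vfsa_step` and `vfsa` and the values of `T0` and `c` are hypothetical.

```python
import numpy as np

def vfsa_step(T, rng, size):
    """Draw perturbations in [-1, 1] from the Cauchy-like distribution
    y = sgn(u - 1/2) T [(1 + 1/T)^|2u - 1| - 1], with u uniform on [0, 1)."""
    u = rng.random(size)
    return np.sign(u - 0.5) * T * ((1.0 + 1.0 / T) ** np.abs(2.0 * u - 1.0) - 1.0)

def vfsa(misfit, lo, hi, x0, n_iter=200, T0=1.0, c=1.0, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    x = np.asarray(x0, float)
    D = x.size
    e = misfit(x)
    best_x, best_e = x.copy(), e
    for k in range(1, n_iter + 1):
        T = T0 * np.exp(-c * k ** (1.0 / D))     # very fast cooling schedule
        trial = x + vfsa_step(T, rng, D) * (hi - lo)
        trial = np.clip(trial, lo, hi)           # simplification: clip, not redraw
        e_trial = misfit(trial)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if e_trial <= e or rng.random() < np.exp(-(e_trial - e) / T):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x.copy(), e
    return best_x, best_e

# Toy misfit: two "node velocities" with a minimum at (3.2, 3.8) km/s.
target = np.array([3.2, 3.8])
toy_misfit = lambda v: float(np.sum((v - target) ** 2))
model, err = vfsa(toy_misfit, [2.0, 2.0], [5.0, 5.0], [2.5, 4.5])
```

At high temperature the distribution spans the whole parameter range, while at low temperature it concentrates near the current model but keeps heavy tails, which is what permits the fast cooling.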
Figure 13 shows a typical example of the inversion for the reference station N.HKTH. Figure 13a shows the observed dispersion curves and the ones predicted by the final model; the final model predicts the observed dispersion curves well. Figure 13b shows models for all iterations of FSA; the color of the models darkens with iterations. Figure 13b shows a rapid convergence of the iterations (200 times). Similarly, we estimated 679 1-D S wave velocity models for all stations. In Figure 14, we show typical examples of the resultant structures. The models at N.KKAH and N.ASNH are typical 1-D structures in a low-velocity region, whereas those at N.KKWH and N.HKTH are typical ones in a high-velocity region. Figure 14 shows that the S wave velocity anomaly reaches 20% in the mid crust. This large velocity contrast shows the necessity of these local 1-D models for our waveform inversion. Figure 14 does not show a sharp Moho discontinuity at any of the stations, because of the broad sensitivity kernels at that depth (Figure 12).
6. Path-Averaged S Wave Velocity Structure
 The second step is the waveform inversion of cross spectra for average S wave velocity perturbations along their paths. In section 5, we assumed that the lateral heterogeneity within a 100-km circle of a reference station is sufficiently small. Therefore, we can calculate the synthetic cross spectra between a pair of stations within the circle using a first-order approximation. We use only the cross spectra with a separation distance of 200 km in order to avoid the complexity of wave propagations such as multipath effects.
 We define the average wave number perturbation (m) from the mth reference model along the jth path by
where knα(θ, ϕ, ω) is local wave number of nth overtone, kn(ω)(m) is synthetic wave number for mth reference model, Pj represents a jth raypath, and Δj is separation distance along the jth path. (m) can be represented by a convolution between its sensitivity kernel and the path-averaged S wave speed perturbation (m) from the mth reference model along the jth path as
Here, we define (m) by
where β(r, θ, ϕ) is the S wave velocity at (r, θ, ϕ) and β0(r)(m) is the mth reference 1-D model.
 Then, the synthetic cross spectra jα of the jth pair of station can be written to the first order of a wave number perturbation as
We can obtain path-averaged 1-D S wave velocity structures by minimizing the square difference between the observed and the synthetic cross spectra ( − dR)2 + ( − dT)2.
Figure 15 shows typical examples of the observed cross-correlation functions band-pass-filtered from 0.05 to 0.2 Hz. We also plot synthetics from the reference 1-D model and the path-averaged model. Figure 15 shows that path-averaged models significantly improve waveform fitting. Figure 16 shows mean S wave velocity of the path averages with their standard deviations for reference station N.HKTH. Figure 16 shows that the S wave velocity perturbations below the depth of 15 km decrease due to low sensitivity.
 We obtain the averaged S wave velocity structure along the jth path for all reference stations as
where M is the number of reference stations for the jth path.
7. Inversion of 3-D S Wave Velocity Structure Using Ray Approximation
 The third step is the construction of a 3-D S wave tomographic model from a collection of path-averaged 1-D S wave velocity structures. The shear slowness of the 1-D models obtained from the waveform inversion can be regarded as the average of the shear slowness structure along the great circle path between a pair of stations [Debayle and Kennett, 2000] as
where slowness p = 1/β. The path-averaged model for each path represents an average of the slowness structure encountered along the path. We model the slowness p(r, θ, ϕ) by blocks of 0.1° latitude and longitude and 1 km depth.
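In discretized form, the path average of slowness is the length-weighted mean of the cell slownesses crossed by the path. The sketch below illustrates this for one depth; the function name `path_average_slowness` and the two-cell example are hypothetical.

```python
import numpy as np

def path_average_slowness(beta_cells, seg_len):
    """beta_cells[i]: S wave velocity (km/s) of cell i crossed by the path;
    seg_len[i]: length (km) of the path inside that cell."""
    p = 1.0 / np.asarray(beta_cells, float)      # slowness p = 1 / beta
    seg_len = np.asarray(seg_len, float)
    return np.sum(p * seg_len) / np.sum(seg_len)

# Two cells of 3.0 and 4.0 km/s, each crossed over 10 km.
p_bar = path_average_slowness([3.0, 4.0], [10.0, 10.0])
beta_bar = 1.0 / p_bar
```

The corresponding path-averaged velocity is a harmonic mean (24/7 ≈ 3.43 km/s here), slightly lower than the arithmetic mean of 3.5 km/s, which is why the tomographic problem is linear in slowness rather than in velocity.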
 The S wave velocity anomaly on the order of 100-km scale reaches 20%, as shown in Figure 14. For the 3-D inversion with an initial 1-D model for entire Japan, the damping of the 3-D inversion distorts small-scale structures due to the strong anomaly. In order to avoid the distortion, we estimate the initial slowness model p0(r, θi, ϕi) using the weighted average of the resultant path-averaged S wave velocity structure as
where Δsj,i is the path length of the ith grid intersected by the jth path and wj is the weight for the jth path. We reduce prediction by p0 from data as
where δp is the slowness perturbation (p − p0).
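A minimal sketch of this construction is given below. It assumes that the initial model in each cell is the average of the path-averaged slownesses weighted by both the path weight and the path length inside the cell, and that the data are reduced by subtracting the path average predicted by p0; these forms are assumptions made for illustration. The names `initial_slowness`, `residuals`, and `G` are hypothetical.

```python
import numpy as np

def initial_slowness(G, p_bar, w):
    """G[j, i]: length (km) of path j inside cell i; p_bar[j]: path-averaged
    slowness of path j; w[j]: weight of path j.
    Returns p0[i], with NaN in cells crossed by no path."""
    num = (w * p_bar) @ G
    den = w @ G
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

def residuals(G, p_bar, p0):
    """Path-averaged slowness minus the path average predicted by p0."""
    path_len = G.sum(axis=1)
    return p_bar - (G @ np.nan_to_num(p0)) / path_len

# Two paths over three cells; the middle cell is shared by both paths.
G = np.array([[10.0, 10.0, 0.0],
              [0.0, 10.0, 10.0]])
p_bar = np.array([0.30, 0.28])
p0 = initial_slowness(G, p_bar, np.ones(2))
dp = residuals(G, p_bar, p0)
```

Starting from such a laterally varying p0 leaves only small residuals for the damped inversion to explain, so the damping acts on the perturbation rather than on the strong regional anomaly.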
 Strong spatial variations of path density shown in Figure 17 also distort the spatial resolution of this inversion. In order to homogenize their resolution [Barmin et al., 2001] we change variables as
where η is the path density. Although use of a sampling-dependent cell parameterization may result in a better conditioned inverse problem, the hit count scaling gives good results in many cases [e.g., Bijwaard et al., 1998]. For each depth (every 1 km), we estimated δp′ using a singular value decomposition algorithm. The inverses of singular values smaller than 2.5% of the maximum singular value are replaced by zeros.
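The truncation rule can be sketched with a generic truncated singular value decomposition solver. The sketch assumes a small dense matrix and does not include the path-density scaling; the function name `truncated_svd_solve` and the test matrix are hypothetical.

```python
import numpy as np

def truncated_svd_solve(G, d, rel_cutoff=0.025):
    """Least squares solution of G m = d in which the inverse of every
    singular value below rel_cutoff times the largest one is set to zero."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    s_inv = np.zeros_like(s)
    keep = s >= rel_cutoff * s[0]                # singular values are sorted
    s_inv[keep] = 1.0 / s[keep]
    m = Vt.T @ (s_inv * (U.T @ d))
    # Model resolution matrix built from the retained right singular vectors.
    R = Vt.T[:, keep] @ Vt[keep, :]
    return m, R

# The third singular value (0.01) is below 2.5% of the largest (1.0).
G = np.diag([1.0, 0.5, 0.01])
m, R = truncated_svd_solve(G, np.array([1.0, 1.0, 1.0]))
```

In this example the poorly constrained third parameter is left at zero instead of being amplified by a factor of 100, and the corresponding diagonal element of the resolution matrix is zero.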
 In order to ascertain the adequacy of the ray coverage and the reliability of the obtained image, we plot a typical example of a recovered image for a localized anomaly at a block (Figure 18). Figure 18 shows 23% recovery in amplitude due to the damping applied in the inversion. The recovered image is broadened from 0.1° × 0.1° to about 0.3° × 0.3°. We also plot the diagonal components of the resolution matrix at the corresponding blocks in Figure 19. Figure 19 shows that the spatial resolution is almost uniform, although its value near a coastline is low due to a lack of path coverage. Below 20 km, because spatial resolution is constrained by the wavelength of excited surface waves rather than path density, the resolution length broadens as frequency decreases. For example, at a depth of 30 km, the typical wavelength of the corresponding fundamental surface waves at 0.05 Hz is about 70 km.
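As a rough check of the quoted wavelength, assuming a representative phase velocity c of about 3.5 km/s (a value chosen here from within the range measured for the fundamental modes in section 5.1, not a number stated for this depth), the wavelength at f = 0.05 Hz follows from

```latex
\lambda = \frac{c}{f} \approx \frac{3.5\ \mathrm{km\,s^{-1}}}{0.05\ \mathrm{Hz}} = 70\ \mathrm{km}
```

which is several times the horizontal scale of about 0.3° recovered at shallow depth.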
 The resultant S wave velocity structures at nine depths (2, 5, 10, 15, 20, 25, 30, 35, 40 km) are shown in Figure 20. At shallow depths from 2 to 10 km, we can identify low-velocity anomalies in the volcanic regions of Tohoku and Kyusyu. The strongest low-velocity anomaly is located along the Hidaka mountain range in Hokkaido. We also observe a strong low-velocity anomaly in Chubu and Kanto around major tectonic lines. At depths from 10 to 15 km, Figure 20 shows a clear low-velocity anomaly due to an accretional belt in the southwestern part of Shikoku. At depths from 2 to 25 km in Chugoku and Tohoku, we can identify a high-velocity anomaly. At depths from 25 to 40 km, Figure 20 shows a low-velocity anomaly at the center of Japan, which suggests a thick crust. In this depth range, the low-velocity anomaly in Hokkaido disappears. Over the whole area, we cannot detect sharp Moho discontinuities because of the low sensitivity of the corresponding surface waves, although we can identify low-velocity anomalies at 40 km due to lateral variations of Moho depth.
 Our result reveals a low-velocity belt along the Hidaka collision zone (HCZ) from the surface to a depth of 20 km (Figure 21). In the upper crust, we observe S wave velocity perturbations more extreme than −20%. A resultant profile from A to B in Figure 21 is consistent with that obtained by seismic refraction/wide-angle reflection experiments [Iwasaki et al., 2004]. This low-velocity anomaly suddenly disappears below 25 km. In the eastern part of the profile from A to B, Figure 21 exhibits thick superficial sedimentary layers of about 5 km thickness. To the north of the HCZ, the profile from C to D in Figure 21 also shows a strong low-velocity anomaly along the lithospheric boundary, although it disappears at the center of the profile.
 Arc-arc collisions are important processes in the transformation of an island arc crust to the new continental crust. In the southern Hokkaido region of Japan, the Kuril arc has been colliding with the northeast Japan arc since middle Miocene [Kimura, 1994], resulting in the uplift of the Hidaka Mountains behind the Hidaka Main Thrust. Seismic tomographic and exploration studies confirm that the Kuril arc is obducted toward the west [Tsumura et al., 1999; Murai et al., 2003; Iwasaki et al., 2004]. Their results show that the lower crust of the Kuril arc is delaminated at a depth of 20 ∼ 30 km on the eastern side of the Hidaka Mountains and that the lower part is descending westward. The S wave velocity structure along the profile from A to B in Figure 21 shows a high-velocity anomaly at a depth of 10 ∼ 20 km, corresponding to the obducted crust at a shallow depth. Strong low-velocity anomalies along the profile from A to B suggest a severe deformation during the collision.
Figure 22 shows an S wave velocity structure beneath Tohoku. A depth slice at 2 km reveals low-velocity anomalies under volcanoes with a scale of about 40 km. The low-velocity anomalies are confined to two or three active volcanoes [Nakajima et al., 2001]. The low-velocity anomaly below volcanoes disappears from 15 to 25 km. Below 25 km, we can observe a large-scale low-velocity anomaly beneath the volcanic front, although our results do not have sufficient spatial resolution in the horizontal range of 30 ∼ 40 km. At all depths, we observe a high-velocity anomaly beneath the fore-arc side of Tohoku (from C to D in Figure 22).
8.3. Chubu and Kanto
Figure 23 shows two clear low-velocity anomalies around tectonic lines. The first anomaly is located from the surface to a depth of 25 km in Kanto to the south of the Median Tectonic Line (MTL), one of the longest and most active arc-parallel fault systems in Japan. This anomaly was formed by the successive accretion of sediments and oceanic plate materials. The second anomaly is distributed along the Niigata-Kobe Tectonic Zone (NKTZ) [Sagiya et al., 2000]. The zone of high strain rates along the NKTZ was revealed by a GPS array in Japan. This high strain rate zone, which is approximately 500 km long in the NE–SW direction and approximately 100 km wide, undergoes contraction in the WNW–ESE direction. The contraction rate is a few times larger than that in the surrounding regions. On the basis of the resultant velocity structures, the NKTZ can be divided into two regions: the western part and the eastern part, where volcanoes are concentrated. In the western part, we observe a low-velocity anomaly at a depth of about 20 km. This may be explained by the existence of an aqueous fluid derived from the Philippine Sea slab [Nakajima and Hasegawa, 2007a]. In the eastern part, we can observe a clear, thin, vertical dike-like low-velocity structure. The low-velocity anomalies extend from the middle crust to the upper crust and are probably due to the fluids related to the back-arc volcanism [Nakajima and Hasegawa, 2007a]. Below 30 km, we observe a low-velocity anomaly with a scale of 200 km at the center of this area. The low velocity reflects the thick crust, with a Moho depth of about 40 km [Zhao et al., 1992]. We also observe a high-velocity anomaly below 30 km beneath eastern Kanto, which suggests segments of the Philippine Sea slab.
8.4. Southwestern Japan and Kyusyu
 In Figure 24, the profile from A to B shows a clear low-velocity anomaly below 30 km corresponding to the oceanic crust. The Philippine Sea slab subducts continuously in a SE–NW direction beneath this region [Shiomi et al., 2006]. We cannot detect the oceanic Moho because of poor spatial resolution in that depth range.
 The vertical slice from C to D shows a low-velocity anomaly in south Shikoku from the surface to a depth of 20 km. The depth slices at 5 and 15 km show that the low-velocity anomalies are distributed on the southern side of MTL, corresponding to the Shimanto belt. The Shimanto belt was formed by the successive accretion of sediments and oceanic plate materials. On the other hand, the depth slice at 5 km in Figure 24 shows a high-velocity anomaly on the northern side of the MTL corresponding to the granitic rocks of the Ryoke metamorphic belt.
 In the northern part of Kyusyu, the depth slice at 5 km shows a low-velocity anomaly under individual volcanoes. On the other hand, in the southern part, Figure 24 shows a larger low-velocity anomaly under volcanoes. The low-velocity anomalies under volcanoes in northern Kyusyu disappear below 20 km, whereas that in the southern part extends down to 30 km.
9. Statistical Comparison With Other Local Models
 In order to check the plausibility of our model, we calculated the cross-model correlation between our model and three local S wave velocity models obtained by traveltime tomography: a model in Tohoku [Nakajima et al., 2001], one in Kanto [Matsubara et al., 2005], and one in western Japan [Nakajima and Hasegawa, 2007b]. The areas of the three models are shown in Figure 20. For depth slices from 0 to 40 km we calculated cross-correlation coefficients between the S wave velocity perturbations of our model and those of the other models (Figure 25). At shallow depths above 5 km, the coefficients are lower than those at other depths because both models are insensitive to the shallow S wave velocity structure. At depths from 5 to 40 km the coefficients are positive. This result shows that our model is consistent with the other models, although differences in spatial resolution cause undulation of the values. Below 30 km the correlations are degraded by the poor spatial resolution of our method due to finite wavelength effects. A local minimum at about 15 km in Tohoku may originate from the existence of a crustal discontinuity (the Conrad) in their model [Nakajima et al., 2001].
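The comparison for one depth slice can be sketched as a Pearson correlation coefficient between two perturbation maps sampled on a common grid. The sketch assumes that both models have already been interpolated to the same cells and that undefined cells are marked with NaN; the function name `slice_correlation` and the sample values are hypothetical.

```python
import numpy as np

def slice_correlation(dv_a, dv_b):
    """Pearson correlation between two velocity-perturbation maps on the
    same grid, ignoring cells where either model is undefined (NaN)."""
    a = np.ravel(np.asarray(dv_a, float))
    b = np.ravel(np.asarray(dv_b, float))
    ok = np.isfinite(a) & np.isfinite(b)
    return np.corrcoef(a[ok], b[ok])[0, 1]

# Two maps that differ only in amplitude and offset are perfectly correlated.
map_a = np.array([[1.0, 2.0], [3.0, np.nan]])
r_same = slice_correlation(map_a, 2.0 * map_a + 1.0)
r_anti = slice_correlation(map_a, -map_a)
```

Because the coefficient is insensitive to amplitude and offset, it measures agreement in the pattern of anomalies and not in their strength, which suits a comparison between models with different damping.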
 We developed a three-step waveform inversion method for estimating the 3-D S wave velocity structure from the observed cross spectra of ambient seismic noise. This method was applied to real data obtained from 679 Hi-net tiltmeter stations in Japan. The resultant S wave velocity maps show four clear low-velocity anomalies around tectonic lines from the surface to 20 km: (1) a low-velocity anomaly along the Hidaka Collision Zone, which was formed by a collision between the Kuril arc and the northeast Japan arc; (2) that in the Niigata-Kobe Tectonic Zone, where volcanoes are concentrated; (3) that in Kanto to the south of the Median Tectonic Line, which was formed by the successive accretion of sediments and oceanic plate materials; and (4) that corresponding to the Shimanto belt in Shikoku, which was also formed by successive accretion. They also show low-velocity anomalies under volcanoes in Tohoku and Kyusyu. Below 20 km, we observed a low-velocity anomaly at the center of Japan, which suggests a thick crust. Throughout the entire crust, our results reveal high-velocity anomalies beneath Chugoku and the fore-arc side of Tohoku. Statistical comparisons with current traveltime tomography models show that our model is consistent with them.
 Above 30 km, we can resolve sharp velocity anomalies as strong as −25%. The uniform raypath distribution enables us to resolve such strong and sharp S wave perturbations. On the other hand, below 30 km, the finite wavelength of the observed surface waves broadens our tomographic images. In order to improve the spatial resolution below this depth, joint inversion of ambient noise surface wave measurements, receiver functions, and teleseismic measurements will be needed in future studies.
 In this study, we neglected the effects of radial anisotropy because an isotropic model can explain most of the observed dispersion curves without radial anisotropy, as shown in Figure 13. A uniform path distribution is suitable for the estimation of an azimuthally anisotropic structure; at present, however, we cannot distinguish azimuthal anisotropy from the effects of source heterogeneity. For an anisotropic inversion, we must develop a theory to calculate synthetic cross spectra for heterogeneous excitation sources in future studies.
Appendix A: Synthetic Cross Spectrum Between Two Stations for Homogeneous Sources
 We calculate synthetic cross spectra between a station at x0 and the other one at x1 assuming that the Rayleigh waves and Love waves are excited by spatially isotropic and homogeneous sources at the Earth's surface [e.g., Fukao et al., 2002; Nishida and Fukao, 2007]. Here, we define spherical coordinates in Figure A1. The horizontal displacement sh at x can be represented by the sum of normal modes as
where blmn is the coefficient of expansion of the nth overtone of spheroidal modes and clmn is that of toroidal modes. Here, Blm represents the spheroidal modes with angular order l and azimuthal order m, and Clm represents the toroidal modes [Dahlen and Tromp, 1998] as
 Therefore we can write its radial component for a station x1 at (θ, ϕ) as
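Because Blm and Clm are singular at the pole x0, here we evaluate their asymptotics near the pole. All Blm and Clm except m = ±1 vanish as θ → 0, so that the horizontal displacement at the pole x0 approaches
Here, we assume homogeneous excitation of waves, which is characterized by the following statistical features: $\langle b_{lmn} b^{*}_{l'm'n'} \rangle = \langle b_{ln}^{2} \rangle \delta_{ll'} \delta_{mm'} \delta_{nn'}$, $\langle c_{lmn} c^{*}_{l'm'n'} \rangle = \langle c_{ln}^{2} \rangle \delta_{ll'} \delta_{mm'} \delta_{nn'}$, and $\langle b_{lmn} c^{*}_{l'm'n'} \rangle = 0$. $\langle b_{ln}^{2} \rangle$ and $\langle c_{ln}^{2} \rangle$ are sums of real resonance functions with peaks at the eigenfrequencies fl and bandwidths of flql, where ql is the inverse of the quality factor. The cross spectra between radial components can be written as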
Similarly, we can calculate the cross spectra between transverse components as
Here, assuming that the separation distance θ between two stations is much smaller than the Earth's radius R and that the half bandwidth of modal peaks (= flql/2) is much larger than the mode spacing (fl+1 − fl), we can simplify the above equations of a synthetic cross spectrum Ψ as
Here, ψ is a wave function defined by
where U is the group velocity of the wave, J0 is the 0th order Bessel function of the first kind, J1 is the first-order Bessel function of the first kind, n represents a mode branch, and an is a power spectrum of the nth overtone. Here, we approximate the spherical harmonics by Bessel functions assuming that angular distance between x0 and x1 is much shorter than π.
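 The small-angle approximation of spherical harmonics by Bessel functions can be checked numerically. The following Python sketch compares the Legendre polynomial P_l(cos θ), the zonal (m = 0) case, with J0((l + 1/2)θ) for separations up to 0.1 rad (about 640 km on the Earth's surface). The angular order l = 200, the range of θ, and the helper names `bessel_j0` and `legendre_pl` are assumptions chosen for illustration only, and J0 is evaluated from Bessel's integral so that the sketch depends only on NumPy. It illustrates the approximation itself, not the derivation used in this appendix.

```python
import numpy as np
from numpy.polynomial import legendre


def bessel_j0(x, n_quad=4000):
    """J0(x) from Bessel's integral, (1/pi) * int_0^pi cos(x sin t) dt."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Midpoint rule: the integrand is smooth and periodic with period pi,
    # so the rule converges rapidly for the moderate arguments used here.
    t = (np.arange(n_quad) + 0.5) * np.pi / n_quad
    return np.cos(np.outer(x, np.sin(t))).mean(axis=1)


def legendre_pl(l, theta):
    """Legendre polynomial P_l(cos theta) via NumPy's Legendre series."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0  # select the single term P_l
    return legendre.legval(np.cos(theta), coeffs)


angular_order = 200                    # hypothetical angular order l
theta = np.linspace(0.001, 0.1, 50)    # angular separations in radians
exact = legendre_pl(angular_order, theta)
approx = bessel_j0((angular_order + 0.5) * theta)
max_error = np.max(np.abs(exact - approx))
print(f"max |P_l(cos theta) - J0((l + 1/2) theta)| = {max_error:.2e}")
```

 The misfit grows with θ, which is why the approximation requires the angular distance between x0 and x1 to be much shorter than π.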
Appendix B: Correction for Topography
 Because topography affects the dispersion curves significantly above 0.05 Hz, topographic corrections are crucial for this study. Following Snieder, we can write a correction of Love waves δtT for local wave number due to topography as
We can also write a correction of Rayleigh waves δtR due to topography in the same way as
Here, I1 is the energy integral of an eigenfunction, U is the group velocity, y3 is the eigenfunction of a displacement on the Earth's surface, and is the mean altitude along the wave path.
 We thank two anonymous reviewers and the Associate Editor for many constructive comments.