We construct a detailed image of velocity discontinuity beneath southwest Japan by using receiver functions. We first propose an improved receiver function estimation method based on the statistical multivariate autoregressive model. Then we apply the new method to the teleseismic waveform data recorded by the high-density seismograph network in southwest Japan. The results show a clear velocity discontinuity at 30 km depth beneath the southern Shikoku region. This discontinuity, corresponding to the boundary between the oceanic crust and the high-velocity layer (the oceanic Moho) of the Philippine Sea plate (PHS), continues down to the north indicating that the aseismic PHS extends to the central Chugoku region. The continental Moho is also clearly imaged beneath the Chugoku region. The depth contour of the PHS shows a rather complicated feature with ridges and valleys. The most significant ridge is located around longitude 133°E, west of which the contours are basically directed north-south, changing to west-east to the east. Beneath the western part of this ridge, hypocenters of microearthquakes are located above this velocity discontinuity, while earthquakes mainly occur below the discontinuity, within the oceanic mantle, beneath the eastern part of the ridge.
 The Philippine Sea plate (PHS) is subducting toward the northwest from the Nankai Trough beneath the Chugoku-Shikoku region, southwest Japan. Subduction of the PHS has caused the Nankai earthquakes of magnitude (M) about 8 and a few inland earthquakes with M 7. Knowledge of the detailed configuration of the PHS is very important for understanding the subduction process and evaluating the source process of impending large earthquakes. Recently, various seismological, petrological and/or thermal models of the subducting PHS have been proposed and the relations between the slab geometry and the local seismicity have been discussed [e.g., Ohkura, 2000; Kurashimo et al., 2002; Gutscher and Peacock, 2003; Hacker et al., 2003]. So far, most of the configuration models of the PHS are based on investigations of hypocentral distribution of microearthquakes [e.g., Yamazaki and Oida, 1985; Nakamura et al., 1997] and wide-angle reflection and/or refraction surveys [e.g., Kodaira et al., 2000; Baba et al., 2002; Kurashimo et al., 2002]. However, in the Chugoku region of Honshu Island, seismicity along the subducting PHS is very low, and no reflection and/or refraction phases from the PHS have been observed by artificial explosion surveys. Figure 1 is a location map of Japan and southwest Japan with the depth contours of the PHS according to the depth distribution of microearthquakes [Nakamura et al., 1997]. It is not clear from this map whether the PHS continues toward the north beneath the Chugoku region or not. Nakanishi  and Nakanishi et al.  clearly identified the subducting PHS as a high-velocity layer (HVL) underlying a relatively low velocity layer, identified as the oceanic crust, by analyzing the ScSp phase of teleseismic waves observed in the western Chugoku and Shikoku region. They also indicated that the PHS descending from the Nankai Trough may have reached the upper mantle beneath the Chugoku region without seismic activity there. 
Tomography studies on velocity structure have been done in this area by Hirahara , Zhao et al. , and Honda and Nakanishi . They stated that the HVL in the uppermost mantle beneath the Chugoku region is evidence of the existence of the PHS there. However, tomography is sensitive to the gradual velocity change within the Earth and is not effective at detecting an abrupt velocity discontinuity. There are significant inconsistencies in the depths of the upper boundary of the HVL among those tomographic results, and none of them gave a clear configuration of the PHS in this region.
 The spatial relationship between the configuration of the plate and the local seismicity is essential to understanding the seismotectonics of southwest Japan. For earthquakes that occur in the Shikoku region in the depth range of 30–60 km, one can frequently find a distinct pair of later P and S phases after the initial P and S waves at stations in the eastern part of the Chugoku and/or the Shikoku region. Ohkura  demonstrated that these later phases are guided waves that traveled through both the oceanic crust (the low-velocity layer) and the continental lower crust. These guided waves can be observed if the earthquakes occur in the oceanic crust and if the oceanic crust is in contact with the lower crust between the hypocenters and the receivers. Analyzing the wide-angle reflection and refraction survey data, Kurashimo et al.  found that the HVL corresponds to the oceanic mantle of the PHS. They concluded that the subcrustal earthquakes beneath the eastern part of the Shikoku region are located within the HVL of the PHS, underlying the oceanic crust. However, this is inconsistent with the findings of Ohkura . A similar discrepancy in the locations of intermediate-depth earthquakes is reported for the region along the Nazca slab at 21°S, where the seismicity in the Wadati-Benioff zone is located in the slab mantle [ANCORP Working Group, 1999], but most of the earthquakes at 60–120 km depth occur within the subducting crust at 22°S to 23°S [Yuan et al., 2000]. More recently, Preston et al.  reported that along the Cascadia slab, shallow earthquakes occur in the slab mantle and deep earthquakes are mainly located within the subducting crust, similar to the situation in the Shikoku region and along the Nazca slab.
 A receiver function is obtained by deconvolving the source wavelet from a teleseismic waveform, in either the time or frequency domain. This method does not require measurements of the arrival time of direct P or converted phases, so reading errors in phase arrivals do not affect the accuracy of depth estimates for discontinuities beneath a station. This technique has been intensively used to investigate the geometry of the continental Moho worldwide [e.g., Sheehan et al., 1995; Zhu, 2000; Zhu and Kanamori, 2000; Levin et al., 2002; Stankiewicz et al., 2002] and subducting slabs [e.g., Li et al., 2000; Yuan et al., 2000; Ferris et al., 2003]. Beneath the Japanese Islands, Li et al.  described receiver function images of velocity discontinuities using broadband seismograph records. They indicated the locations of the subducting Pacific slab and the upper mantle discontinuities at 410 and 660 km.
 Recently, the National Research Institute for Earth Sciences and Disaster Prevention has established a high-sensitivity seismograph network (Hi-net) with an average station interval of 20 km nationwide over Japan [Obara et al., 2000]. The waveform data accumulated by this high-density seismograph network offer an unprecedented opportunity to study both the spatial configuration of the PHS and the seismicity beneath southwest Japan.
Yamauchi et al.  applied the conventional receiver function analysis method to the teleseismic waveform data observed by the Hi-net and other seismograph networks. They found a seismic velocity discontinuity in the uppermost mantle beneath southwest Japan and stated that the PHS subducts aseismically under the western Chugoku region. They proposed depth contours for the upper boundary of the PHS. However, the contours of their PHS model are significantly different from those derived from the spatial distribution of the microearthquakes beneath the Shikoku region.
 In this paper, we first improve the conventional receiver function analysis method on the basis of a statistical model, with a special focus on the method of spectral estimation. Then, by applying this method to teleseismic waveform data recorded by the Hi-net seismograph stations in southwest Japan, we derive spatially varying depth profiles of receiver function amplitude. Finally, we discuss the relationship between the depth contours of the PHS and the distribution of microearthquakes.
2. Estimation of Receiver Function
2.1. Receiver Function and Conventional Estimation Method
 Receiver function analysis is a method for extracting P-to-S converted phases from teleseismic waves to provide information about the subsurface structure beneath a seismic station. The receiver function h(t) is given by the deconvolution of the radial component Xr(t) by the vertical component Xz(t). In the frequency domain, the receiver function is given by
$$H(f) = \frac{R(f)\,Z^{*}(f)}{Z(f)\,Z^{*}(f)}, \tag{1}$$
where Z(f) and R(f) are vertical and radial component data in the frequency domain, respectively, and the asterisk means complex conjugation. Taking the inverse Fourier transform of H(f), we obtain a time domain receiver function h(t) [e.g., Langston, 1979; Ammon, 1991].
 The fast Fourier transform (FFT) technique is widely used to calculate the spectra of the vertical Z(f) and radial R(f) components for a given lapse time window. When we estimate the receiver function by using the FFT, we sometimes encounter serious instability in the frequency domain deconvolution process. The spectrum calculated by the Fourier transform can be considered as the convolution of the spectrum of the infinite data with that of a finite time window. If the sidelobes of the time window spectrum are not negligible, the estimated spectrum will be contaminated. Contamination by this time window effect and by noise can cause serious instability in the spectrum estimation and in the division in equation (1). To stabilize the receiver function estimation, the water level method [e.g., Helmberger and Wiggins, 1971] has frequently been used. The water level method replaces the spectral notches in the denominator of equation (1) with a constant value. The constant value is determined by the product of the maximum amplitude of the power spectrum of the vertical component and a water level value c (c ≪ 1). Obviously, such a process smooths out not only the noise but also information about subsurface structure and sometimes significantly distorts the estimated receiver functions. As an example, we compare the “ideal” receiver function with the one estimated by using the water level method (see Figure 2). The synthetic seismograms are calculated theoretically by using the reflection matrix method [Kennett and Kerry, 1979] for the incidence of a Ricker wavelet of dominant frequency 1 Hz at the bottom of the model structure with an incidence angle of 25°. Figure 2c shows a comparison of the ideal receiver function and the receiver functions estimated by using the water level method with water level values c of 0.01 and 0.001, respectively. The ideal receiver function is given by the deconvolution of the radial response of the one-dimensional (1-D) structure by the vertical one. 
The radial and vertical structural responses are also calculated theoretically by the reflection matrix method. The receiver functions derived by the conventional method using the FFT and the water level method show strong distortion, with a large negative amplitude around the initial motion (t = 0). In this example, the distortion decreases with decreasing water level value. However, the spectral roughness may include information about the characteristics of the structure. Hence there is a trade-off between stability and spectral information in this process. Moreover, since a negative receiver function amplitude is usually interpreted as a low-velocity layer, the distortion may lead to a serious misinterpretation of the structure. Therefore distortion control is critical for receiver function interpretation.
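The water level stabilization described above can be illustrated with a minimal NumPy sketch. It is not the processing code used in this study: the function name `water_level_deconvolution` and its arguments are hypothetical, and no additional filtering or time shift is applied.

```python
import numpy as np

def water_level_deconvolution(radial, vertical, c=0.01):
    """Frequency-domain receiver function stabilized by a water level c (c << 1).

    Hypothetical helper for illustration; both traces must have equal length.
    """
    n = len(vertical)
    R = np.fft.rfft(radial, n)
    Z = np.fft.rfft(vertical, n)
    power = (Z * np.conj(Z)).real      # |Z(f)|^2, the denominator of equation (1)
    floor = c * power.max()            # water level: c times the maximum power
    denom = np.maximum(power, floor)   # spectral notches are raised to the floor
    H = R * np.conj(Z) / denom
    return np.fft.irfft(H, n)          # time-domain receiver function h(t)
```

With a larger `c`, more of the denominator is replaced by the constant floor, which is the trade-off between stability and spectral information noted in the text.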
 As both seismic observation and computer techniques develop, methods for receiver function estimation have evolved. Soda et al.  proposed a method that smooths the FFT spectrum with the Hanning window to stabilize receiver function estimation. This is superior to the water level method because its meaning in the frequency domain is clear and it causes less distortion of the receiver function. However, this method requires a choice of the window width for the spectrum smoothing. To avoid the instability in conventional frequency domain deconvolution, time domain deconvolution methods have been proposed [e.g., Gurrola et al., 1995; Sheehan et al., 1995; Ligorría and Ammon, 1999]. Receiver functions estimated this way keep the spectral characteristics of low-frequency components and satisfy causality (no leakage appears before the direct P wave arrival). Time domain deconvolution tends to be dominated by the Fourier components with the largest amplitude [Park and Levin, 2000], and a difficulty is that this method sometimes requires the damping parameter to be set subjectively. Park and Levin  developed a frequency domain receiver function estimation method using multiple-taper correlation estimates. This method is able to calculate the variance of receiver functions in the frequency domain, with which one can evaluate a weighted-average stack to improve the signal-to-noise ratio. However, it also requires choosing a time-bandwidth product. In this paper, we propose a new technique using another spectral estimation method with objective parameter selection. This improved technique is suitable for imaging shallow structure (crust and uppermost mantle).
2.2. Improved Receiver Function Estimation Method Based on the Multivariate Autoregressive Model
 The maximum entropy method (MEM) and the autoregressive (AR) model method [Akaike, 1969] estimate spectra on the basis of statistical models. These methods have advantages over the FFT in stability and resolution. With the multivariate autoregressive (MAR) model, power and cross spectra can be estimated simultaneously from the same statistical model. Thus this method is also very effective for receiver function estimation.
 Sampling the vertical and radial components of seismic waves with time interval Δt, we construct the time series xz(n) and xr(n). The MAR model of order M explains the multivariate time series X(n) = (xz(n), xr(n))t by its past values X(n − 1), ⋯, X(n − M) and the white noise U(n):
$$X(n) = \sum_{m=1}^{M} A(m)\,X(n-m) + U(n), \qquad n = 1, \ldots, N, \tag{2}$$
where N is the number of discrete waveform data and the superscript t denotes transposition. The 2 × 2 square matrix A(m) is the AR coefficient matrix at lag m. To apply this model to a time series, we estimate the AR coefficients and the covariance matrix of the white noise. In Appendix A, we explain how to calculate these parameters from the observed time series.
 We define a nondimensional parameter fs as the ratio of the seismic wave frequency f to the sampling frequency Fs (= 1/Δt):
$$f_s = \frac{f}{F_s} = f\,\Delta t. \tag{3}$$
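The time series X(n) is interpreted as the output of a linear system for the input U(n), where the response function in the frequency domain is A(fs)^−1, with
$$A(f_s) = I - \sum_{m=1}^{M} A(m)\,e^{-2\pi i m f_s}. \tag{4}$$
Therefore the cross-spectral matrix P(fs) is represented as
$$P(f_s) = A(f_s)^{-1}\,P_u(f_s)\,\left[A(f_s)^{-1}\right]^{*t}, \tag{5}$$
where Pu(fs) is the cross-spectral matrix of the input U(n), I is the unit matrix, and the superscript *t denotes the conjugate transpose. The diagonal components Pzz(fs) and Prr(fs) of the cross-spectral matrix P(fs) give the power spectra of xz(n) and xr(n), respectively, and the off-diagonal component Prz(fs) gives the cross spectrum. Hence the Fourier transform of the receiver function estimated by using the MAR model is given by
$$H(f_s) = \frac{P_{rz}(f_s)}{P_{zz}(f_s)}. \tag{6}$$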
To find the statistically most suitable order of autoregression M, we use Akaike's information criterion (AIC):
where Ŵm is the estimated covariance matrix of white noise for order m (see Appendix A). The m value that minimizes the AIC value gives the optimum order of the MAR model. The bold curve in Figure 2d shows the synthetic receiver function estimated by using the MAR model for the synthetic waveforms shown in Figure 2a. The receiver function shown in Figure 2d is significantly improved around the direct P wave arrival (t ∼ 0 s). It is worth noting that the obtained receiver function decays with lapse time and that its amplitude becomes small beyond 15 s. These are expected characteristics. A receiver function may be considered as a filtered cross-correlation function without phase change, and the lag time of the cross-correlation function corresponds to the lapse time of the receiver function. Generally, the accuracy of a correlation function estimated from finite data decreases with increasing lag time. A spectrum estimated by using the MAR model does not extract information that is not statistically significant. As a result, the amplitude of a receiver function estimated by using the MAR model gradually attenuates with increasing lapse time. The decrease in the accuracy of receiver functions with lapse time has also been discussed for the multitaper spectral correlation method [Park and Levin, 2000] and for the time domain deconvolution method [Gurrola et al., 1995; Ligorría and Ammon, 1999].
 Our improved “MAR model method” has two advantages, at least, over the conventional frequency domain deconvolution method: one can objectively select the most suitable parameters with statistical conditions by using AIC instead of ambiguous choices of the parameters and the deconvolution process is much more stable without specific filtering process like the water level method.
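As an illustration of the spectral step of the MAR model method, the sketch below evaluates the cross-spectral matrix and the receiver function spectrum from given AR coefficient matrices. It is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the frequency response is assumed to take the standard form A(fs) = I − Σ A(m) exp(−2πi m fs), the white noise spectrum Pu(fs) is taken equal to the covariance matrix W (as stated in Appendix A), and the component order is (z, r).

```python
import numpy as np

def mar_cross_spectrum(A, W, fs):
    """Cross-spectral matrix P(fs) of a MAR model.

    A  : (M, k, k) array of AR coefficient matrices A(1), ..., A(M)
    W  : (k, k) covariance matrix of the white noise
    fs : 1-D array of nondimensional frequencies f / Fs
    """
    M, k = A.shape[0], A.shape[1]
    P = np.empty((len(fs), k, k), dtype=complex)
    for i, f in enumerate(fs):
        Af = np.eye(k, dtype=complex)
        for m in range(1, M + 1):
            # assumed standard form of the AR frequency response
            Af -= A[m - 1] * np.exp(-2j * np.pi * m * f)
        Ainv = np.linalg.inv(Af)
        P[i] = Ainv @ W @ Ainv.conj().T
    return P

def mar_receiver_function_spectrum(A, W, fs):
    """H(fs) = Prz(fs) / Pzz(fs), with index 0 = vertical and 1 = radial."""
    P = mar_cross_spectrum(A, W, fs)
    return P[:, 1, 0] / P[:, 0, 0]
```

An inverse Fourier transform of the returned spectrum, sampled on a regular grid of `fs`, would then give a time domain receiver function.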
 We use the teleseismic waveform data recorded at 181 Hi-net stations in the Chugoku-Shikoku region, southwest Japan (Figure 3). We select 241 earthquakes (October 2000 to March 2003) with magnitudes of 5.5 or larger and epicentral distances between 27° and 90°. Figure 4 shows the epicenter distribution of the teleseismic events used in this study. The incidence angles at each station are in the range of 20°–40°. To estimate the receiver functions, we analyze waveforms with a duration of 120 s starting 30 s before the onset of the P wave.
 Each Hi-net station is equipped with a three-component short-period velocity seismograph of natural frequency 1 Hz at the bottom of a borehole with a depth of more than 100 m. To avoid contamination by the surface reflection and other high-frequency noise, we select the stations with sensor depths shallower than 350 m and apply a Gaussian low-pass filter with a cutoff frequency of 0.6 Hz to the raw seismograms. The response of the Hi-net system to ground motion decreases in proportion to the square of frequency for frequencies lower than 1 Hz. To exclude low-frequency components, a high-pass filter with a corner frequency of 0.1 Hz is applied to the seismograms together with the instrument response correction, giving 0.1 to 0.6 Hz band-pass-filtered seismograms. In total, 9911 receiver functions are estimated.
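The 0.1 to 0.6 Hz band-pass step can be sketched as a zero-phase frequency-domain filter. The exact filter shapes are not given in the text, so the forms below are assumptions made for illustration: a Gaussian low-pass exp(−0.5 (f/f_high)²) and a smooth high-pass taper with corner f_low. The function name is hypothetical and the instrument response correction is not included.

```python
import numpy as np

def bandpass_gaussian(trace, dt, f_low=0.1, f_high=0.6, order=2):
    """Zero-phase band-pass: Gaussian low-pass times a smooth high-pass taper.

    trace : 1-D array of samples; dt : sampling interval in seconds.
    Filter shapes are illustrative assumptions, not the published ones.
    """
    n = len(trace)
    f = np.fft.rfftfreq(n, dt)
    lowpass = np.exp(-0.5 * (f / f_high) ** 2)   # assumed Gaussian form
    ratio = (f / f_low) ** (2 * order)
    highpass = ratio / (1.0 + ratio)             # 0 at f = 0, tends to 1 above f_low
    spectrum = np.fft.rfft(trace) * lowpass * highpass
    return np.fft.irfft(spectrum, n)
```

Because both factors are real and non-negative, the filter changes amplitudes only and introduces no phase shift, which keeps the timing of converted phases unchanged.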
4. Results and Discussion
4.1. Examples of Receiver Functions
 To investigate the depth distribution of the velocity discontinuities, we transform the independent variable of the estimated receiver functions from lapse time to depth by using the JMA2001 model [Ueno et al., 2002] as the reference velocity model (Figure 5). Figure 6 shows examples of receiver functions in the depth domain estimated at the three stations shown by squares in Figure 3. In Figure 6, we line up the receiver functions according to backazimuth. Large amplitudes at 0 km depth indicate the arrival of the direct P wave. A later phase with positive amplitude indicates the existence of a seismic velocity discontinuity across which the deeper layer has a higher velocity than the shallower one. After the direct P wave, coherent later phases are clearly visible in Figure 6. For station OOTH, the remarkable phases, indicated with arrows, appear at about 30 km in depth. For earthquakes located south-southeast of the station (in the backazimuthal range 140°–180°), the phases are slightly shallower than in the other traces. For earthquakes located west of the station (in the backazimuthal range 225°–320°), the same phases have larger amplitudes and are slightly deeper than in the other traces. Such characteristics indicate that there is an HVL at 30 km depth beneath station OOTH and that the HVL dips toward the west or northwest.
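The conversion from lapse time to depth can be sketched for a horizontally layered model as follows. This is a minimal illustration, assuming the usual Ps-minus-P delay for a ray parameter p (in s/km), t(z) = Σ Δz [√(1/Vs² − p²) − √(1/Vp² − p²)]; the function names and the layer values in any usage are hypothetical and are not the JMA2001 model. The receiver function is assumed to start at the direct P arrival (t = 0).

```python
import numpy as np

def ps_delay_table(thickness_km, vp, vs, p):
    """Depths of layer bottoms and cumulative Ps-minus-P delay times (s)."""
    thickness_km = np.asarray(thickness_km, dtype=float)
    vp = np.asarray(vp, dtype=float)
    vs = np.asarray(vs, dtype=float)
    qs = np.sqrt(1.0 / vs**2 - p**2)   # vertical slowness of the S leg
    qp = np.sqrt(1.0 / vp**2 - p**2)   # vertical slowness of the P leg
    depth = np.concatenate(([0.0], np.cumsum(thickness_km)))
    delay = np.concatenate(([0.0], np.cumsum(thickness_km * (qs - qp))))
    return depth, delay

def time_to_depth(rf, dt, thickness_km, vp, vs, p, depths_km):
    """Resample a time-domain receiver function onto a depth axis."""
    z_nodes, t_nodes = ps_delay_table(thickness_km, vp, vs, p)
    t_at_depth = np.interp(depths_km, z_nodes, t_nodes)  # delay time of each depth
    t_axis = np.arange(len(rf)) * dt
    return np.interp(t_at_depth, t_axis, rf)
```

For a single layer with Vp = 6.0 km/s, Vs = 3.5 km/s, and p = 0.06 s/km, the delay accumulates at about 0.124 s per kilometer, so a phase at about 3.7 s maps to a depth of about 30 km.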
4.2. Cross Sections of Receiver Functions
4.2.1. Subducting Philippine Sea Plate
 If only data from a sparse seismic network were available, it would be difficult to identify what these discontinuities represent and to discuss their spatial distribution. Many investigations of the depth distribution of velocity discontinuities have been based on receiver function images obtained with dense seismic arrays [e.g., Neal and Pavlis, 1999; Yuan et al., 2000; Zhu, 2000; Ferris et al., 2003]. The high density of the Hi-net stations enables us to trace these discontinuities spatially beneath southwestern Japan. For each event-station pair, the ray parameter is evaluated using the IASP91 Earth model [Kennett and Engdahl, 1991] with the epicentral distance and the focal depth, and the ray is traced beneath the station using the JMA2001 model (Figure 5). We divide the study area into blocks with sizes of 2.5 km in the horizontal direction and 2 km in depth. The amplitude of the receiver function at a certain depth is attributed to the block through which the ray passes at that depth. If many rays pass through the same block, we take the average of the amplitudes over all rays. Through this procedure for all seismic rays, the configuration of a velocity discontinuity is constructed. Figure 7 shows cross sections of the receiver function amplitude along the 10 lines shown in Figure 3. Red blocks indicate positive amplitudes of the receiver functions. Black dots in Figure 7 represent microearthquake hypocenters (October 1997 to about June 2003, M ≥ 0.0) beneath the crust. These hypocenters are determined with the JMA2001 velocity model by the routine processing of the Japan Meteorological Agency (JMA). As illustrated in profiles A-A′ to E-E′ in Figure 7, the continuous distribution of red blocks indicated by thick lines beneath the Shikoku region shows that there is a velocity discontinuity subducting from the south toward the north with a low dip angle. This discontinuity also descends toward the Kyushu region, to the west of the Shikoku region, as shown in profiles G-G′ to I-I′ in Figure 7. 
The depth of this discontinuity is 30 km beneath the southern Shikoku region (J-J′) and 40 km beneath the Seto Inland Sea (G-G′), and it continues down to the north, reaching about 60 km beneath the Chugoku region (F-F′). The southern segment of this discontinuity corresponds well with the hypocenter distribution of microearthquakes. Therefore the clear discontinuity in Figure 7 indicates the upper boundary of the HVL (the oceanic Moho discontinuity) of the PHS. As shown in sections C-C′ and D-D′ in Figure 7, the northern portion of the HVL of the PHS clearly extends to the central part of the Chugoku region, where red blocks continue from the south toward the north with very few earthquake hypocenters. This demonstrates that the aseismic PHS extends at least to the central part of the Chugoku region.
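The block-averaging procedure used to build the cross sections can be sketched as below. This is an illustrative sketch, not the code used for Figure 7: the function name is hypothetical, and the inputs are assumed to be the along-profile distance, depth, and receiver function amplitude of every sample point on every traced ray, already projected onto a profile.

```python
import numpy as np

def stack_into_blocks(x_km, z_km, amp, x_max, z_max, dx=2.5, dz=2.0):
    """Average receiver function amplitudes in blocks of dx (horizontal) by dz (depth).

    Returns the mean amplitude per block (NaN where no ray passes) and the hit count.
    """
    nx = int(np.ceil(x_max / dx))
    nz = int(np.ceil(z_max / dz))
    total = np.zeros((nz, nx))
    count = np.zeros((nz, nx), dtype=int)
    ix = np.floor(np.asarray(x_km, dtype=float) / dx).astype(int)
    iz = np.floor(np.asarray(z_km, dtype=float) / dz).astype(int)
    amp = np.asarray(amp, dtype=float)
    ok = (ix >= 0) & (ix < nx) & (iz >= 0) & (iz < nz)   # drop points off the grid
    np.add.at(total, (iz[ok], ix[ok]), amp[ok])          # unbuffered accumulation
    np.add.at(count, (iz[ok], ix[ok]), 1)
    mean = np.full((nz, nx), np.nan)
    hit = count > 0
    mean[hit] = total[hit] / count[hit]
    return mean, count
```

Returning the hit count alongside the mean makes it possible to mask blocks crossed by too few rays, such as those around the Seto Inland Sea where ray coverage is sparse.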
4.2.2. Continental Moho Discontinuity
 In Figure 7, aligned red blocks around 30–40 km depth beneath the Chugoku region can be seen, as indicated by the thin line. As illustrated in A-A′ to E-E′ in Figure 7, these discontinuities tend to incline toward the Pacific side. This feature and the depth of these discontinuities are consistent with the configuration of the continental Moho discontinuity revealed by Zhao et al.  from a travel time inversion analysis. According to their results, the continental and oceanic Moho beneath the KYDH and IAMH stations are located at depths of 35 and 31 km, respectively. We can find the converted phases corresponding to them in Figure 6, as shown with arrows. Looking at profiles A-A′ and E-E′ in Figure 7 in detail, we find a valley beneath the central Chugoku region. The same feature seems to exist in B-B′ to D-D′ in Figure 7 but not as clearly. The location of this valley corresponds to the mountainous area with elevations of around 1000 m. This area shows a negative Bouguer gravity anomaly [Gravity Research Group in Southwest Japan, 2001], indicating that the thickness of the continental crust (the depth of the continental Moho) is larger beneath the central Chugoku region. This is consistent with the receiver function images shown in Figure 7. The continental Moho shown from A-A′ to E-E′ in Figure 7 becomes obscure beneath the southern Chugoku region and the Seto Inland Sea, where the subducting oceanic Moho lies quite close to it. Ohkura  argued that the mantle wedge does not exist beneath the Shikoku region, where the oceanic crust is in contact with the continental crust. It is hard to image this area from our results because the ray coverage is insufficient around the Seto Inland Sea. However, profile E-E′ in Figure 7 confirms that the mantle wedge is absent in the western Shikoku region around 33.4°N, at least at the northern end of the Shikoku region. In the eastern Shikoku region, the edge is located around 33.5°N, as shown in profile A-A′ of Figure 7. 
These features are consistent with the results of Ohkura .
4.2.3. Evaluation of the Error of Depth Conversion
 A simple horizontally layered velocity model is used to transform the receiver function lapse time to depth. To evaluate the error of depth estimation caused by dipping structure, we perform the following experiment. Figure 8 illustrates a profile of a two-dimensional (2-D) structure in which a dipping high-velocity (VP = 8.0 km/s) slab is embedded in the JMA2001 1-D model. The P-to-S conversion depths are set at 30, 40, and 50 km, corresponding to different station locations. We let the dip angle ϕ vary from 0° to 30° in steps of 5°. For a direct P wave with an incidence angle θ = 35° arriving from the updip direction, the lapse time of the phase converted at the dipping slab after the arrival of direct P is calculated and then transformed to depth for each ϕ at the different “stations.” This assumed incidence angle is equivalent to a source at an epicentral distance of about 45°. As listed in Table 1, the maximum change in the depth is within 15% (several kilometers) for dip angles between 0° and 30°. Since the dip angle of the subducting slab shown in Figure 7 is mostly within 10°, it is reasonable to say that the error in the estimated conversion depth caused by the simple 1-D model is around 2–3 km. Of course, we should keep in mind that the error increases with increasing dip angle and that this error estimation is empirical.
Table 1. Estimation of Depth Conversion Errors for “Dipping Slab” Model^a
^a Dip indicates the dip angle of the subducting slab. The parameters Z, t, and D denote the assumed P-to-S conversion point depth, the lapse time of the phase converted at the dipping slab after the arrival of the direct P wave, and the resultant conversion depth using the 1-D model, respectively.
 The hypocenters used in this study are located using the 1-D regional velocity structure (JMA2001). Matsubara et al.  constructed a 3-D velocity structure beneath the Japanese Islands based on travel time tomography and then relocated the hypocenters using their 3-D model. Figure 9 gives the E-W profile of the spatial variation of the average differences between the focal depths beneath the Shikoku region determined using their 3-D (Dep3D) and 1-D (Dep1D) models. Within our study region, the average error of the focal depth caused by the 1-D velocity model is within ±2 km, while the standard deviation of the average is ±1.5 km. Beneath the eastern Shikoku region (east of 134°E), Dep3D is systematically 2 km shallower than Dep1D. Considering the uncertainty of both the depth conversion of the receiver functions and the focal depths, the hypocenters that are close to (within 5 km of) the slab boundary are excluded from the discussion of the spatial relationship between the configuration of the PHS and the local seismicity.
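The 5 km exclusion rule can be written as a simple classification of each hypocenter relative to the discontinuity depth at its epicenter. The sketch below is illustrative only; the function name and the label strings are hypothetical, and the boundary depth is assumed to have been interpolated from the depth contours beforehand.

```python
import numpy as np

def classify_hypocenters(hypo_depth_km, boundary_depth_km, margin_km=5.0):
    """Label hypocenters as above or below the discontinuity, excluding those
    within margin_km of it (depths are positive downward)."""
    diff = np.asarray(hypo_depth_km, dtype=float) - np.asarray(boundary_depth_km, dtype=float)
    return np.where(diff < -margin_km, "above",
                    np.where(diff > margin_km, "below", "excluded"))
```

For a boundary at 40 km depth, events at 30, 38, and 47 km would be labeled above, excluded, and below, respectively.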
4.3. Depth Contour of the Philippine Sea Plate and Local Seismicity
 We made more than 100 cross sections of receiver function amplitude, half of them striking NW-SE and the others NE-SW, to construct the PHS model with high resolution. Figure 10 shows the depth contours of the upper boundary of the HVL (the oceanic Moho discontinuity) of the PHS. Clearly, the PHS dips to the north with a dip angle of about 10° beneath Shikoku. The dip angle becomes low at the Seto Inland Sea and then changes to approximately 25° at the central part of the Chugoku region. The depth contours show a rather complicated feature with ridges and valleys. The most significant ridge is located around longitude 133°E (R1 in Figure 10). East of the ridge, the contours trend NE-SW, and they bend to south-north west of it. The depth contours of local seismicity (Figure 1) also bend around longitude 132.5°E. There is a small valley (V1 in Figure 10) in the depth contours of the PHS in the area west of the ridge R1. This valley coincides with seismicity in the depth range of 30–40 km, represented by orange and yellow in Figure 7. In the eastern part of the Shikoku region, the depth contours of the PHS show a small ridge around 133.7°E (R2 in Figure 10), where the depth contours of the local seismicity are smooth. In our PHS model, there is a valley at 134.2°E (V2 in Figure 10). Seismicity is higher east of this valley, although seismic activity is low in the central part of the Shikoku region.
Yamauchi et al.  carried out the conventional receiver function analysis to generate a PHS model. They showed a large valley around our valley V1 in Figure 10, and the contours of their model are directed east-west to the west of the valley. However, in profiles G-G′ and I-I′ of Figure 7, the velocity discontinuity represented by red blocks appears to deepen monotonically to the west. Also, the south-north directed depth contours of the distribution of the microearthquakes in the western part of the Shikoku region do not support their model but are consistent with our model. Because their PHS model is constructed from a few profiles with parallel strikes only, it is hard to trace the configuration of the velocity discontinuities where the contours of the discontinuities are parallel to the cross sections. In constructing our PHS model, we add the information obtained from the cross sections with NE-SW strikes, shown in F-F′ to J-J′ of Figure 7 as examples. Therefore our PHS model has much higher spatial resolution than the previous one, especially in the western part of the Shikoku region.
 As shown in Figure 11a, almost all local earthquakes are located above the HVL west of the ridge at around 133°E. This result confirms the observation of later phases interpreted by Oda et al.  and Ohkura , as mentioned earlier. Ohkura  suggested that this high-seismicity zone in the subducting oceanic crust might be caused by the gabbro to eclogite phase transformation. East of the ridge, where the subducting PHS runs east-west with a lower dip angle, earthquakes mainly occur within the slab mantle (Figure 11b). The dip angle of the HVL becomes larger east of 134°E, while the seismicity is significantly lower. Those features are also consistent with the analyses of refraction and wide-angle reflection survey by Kurashimo et al. . The Kinan seamount chain exists to the southeast off the Shikoku region (upper map in Figure 1), and these seamounts may be subducting beneath the eastern part of the Shikoku region [Kodaira et al., 2000; Seno and Yamasaki, 2003]. Seno and Yamasaki  suggested the possibility that dehydration may not take place in the subducting seamounts because of the lack of hydrous minerals. This might be a reason why no earthquake occurs above the HVL. Seno and Maruyama  proposed that the subduction direction of the PHS is west-northwest in southwest Japan at the present time but it was northwest 17 Ma. The dipping direction of the PHS is also northwest (R1 and R2 in Figure 10). The complicated stress field generated by discrepancy between the plate motion and the slab geometry may be one of the major causes of the different pattern of the seismicity under the crust between the eastern area and the western area of the Shikoku region.
 We developed a new receiver function estimation method based on the MAR model and applied it to teleseismic waveform data recorded by the Hi-net. Taking advantage of the high density of the Hi-net stations, we constructed a detailed image of velocity discontinuities beneath southwest Japan. The largest discontinuity, which corresponds to the boundary between the oceanic crust and the HVL (the oceanic Moho) of the PHS, subducts continuously from the south toward the north. We have confirmed the extension of the aseismic PHS beneath the central part of the Chugoku region. We found a deepening of this discontinuity toward the Kyushu region, west of the Shikoku region. The continental Moho discontinuity was also imaged clearly. We found that the Moho has a valley beneath the central Chugoku region. The location of the valley corresponds to high surface topography, and the shape of the continental Moho is consistent with the gravity anomaly. The depth contours of the HVL of the PHS trend NE-SW in the east and bend to a south-north direction around 133°E. Almost all local earthquakes occur above the HVL to the west of the significant ridge around longitude 133°E, whereas almost all earthquakes east of the ridge are located in the upper mantle. The configuration of the PHS resolved by this study enables us to investigate the relationships between seismicity in the slab and the slab geometry for the first time.
Appendix A: Estimation of AR Coefficients and Covariance Matrix of White Noise
 In this paper, we adopt the Yule-Walker method [e.g., Kitagawa, 1998] to calculate AR coefficients and covariance matrix of white noise. A cross-covariance matrix R(m) for multivariate time series X(n) is generally defined as
If a lag m is not equal to zero, the second term of equation (A2) becomes zero. If m is equal to zero, the second term is modified to
where W is a covariance matrix of white noise. Correlation coefficients of white noise are zero when lag time is not equal to zero. Because both power spectrum and cross spectrum of white noise take constant values without frequency dependences, a covariance matrix W coincides with a cross-spectral matrix Pu(fs) in equation (5). From equations (A2) and (A3), we can get the “multivariate Yule-Walker equations”:
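The display equations referred to as (A1) to (A4) do not appear in this copy of the text. Under the standard Yule-Walker derivation, with X(n) following the MAR model of order M and E[·] denoting the expectation (a symbol assumed here), they would plausibly read as follows. This is a reconstruction consistent with the surrounding sentences, not the original typesetting.

```latex
\begin{align}
R(m) &= E\!\left[X(n)\,X(n-m)^{t}\right], \tag{A1}\\
R(m) &= \sum_{l=1}^{M} A(l)\,R(m-l) + E\!\left[U(n)\,X(n-m)^{t}\right], \tag{A2}\\
E\!\left[U(n)\,X(n)^{t}\right] &= E\!\left[U(n)\,U(n)^{t}\right] = W, \tag{A3}\\
R(0) &= \sum_{l=1}^{M} A(l)\,R(-l) + W, \qquad
R(m) = \sum_{l=1}^{M} A(l)\,R(m-l) \quad (m = 1, \ldots, M). \tag{A4}
\end{align}
```

In this form the second term of (A2) vanishes for positive lags because the white noise U(n) is uncorrelated with past values of X, and the first relation of (A4) is the one that yields W once the AR coefficients are known.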
 On the other hand, a cross-covariance matrix R̂(m) for the observed multivariate time series, X(1), ⋯, X(N), is written as
Solutions of simultaneous equation (A6) are the estimated AR coefficients, Â(1), ⋯, Â(M). By using these estimated coefficients and the first equation of equation (A4), we can calculate the estimated covariance matrix of white noise, Ŵm:
 This covariance matrix of white noise, Ŵm, is used to search for the statistically suitable order of autoregression, M, by the AIC given by equation (7). Power and cross spectra of the observed time series are calculated with the estimated AR coefficients, Â(1), ⋯, Â(M), using equation (5).
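The estimation procedure of this appendix can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' code: all function names are hypothetical, the sample cross-covariance is assumed to use a 1/N normalization after removing the mean, and the criterion is taken in the common form AIC(m) = N log det Ŵm + 2k²m (constant terms omitted), which may differ from the exact expression of equation (7).

```python
import numpy as np

def sample_cross_covariance(x, max_lag):
    """R[m] = (1/N) * sum_n x(n) x(n-m)^t for m = 0..max_lag; x has shape (N, k)."""
    x = x - x.mean(axis=0)
    n, k = x.shape
    R = np.empty((max_lag + 1, k, k))
    for m in range(max_lag + 1):
        R[m] = x[m:].T @ x[:n - m] / n
    return R

def fit_mar_yule_walker(x, order):
    """Solve the multivariate Yule-Walker equations for A(1..order) and W."""
    n, k = x.shape
    R = sample_cross_covariance(x, order)

    def lagged(j):
        # negative lags follow from R(-j) = R(j)^t
        return R[j] if j >= 0 else R[-j].T

    # Block (l, m) of G is R(m - l), so that [A(1) ... A(M)] @ G = [R(1) ... R(M)]
    G = np.block([[lagged(m - l) for m in range(1, order + 1)]
                  for l in range(1, order + 1)])
    rhs = np.hstack([R[m] for m in range(1, order + 1)])
    A_stack = np.linalg.solve(G.T, rhs.T).T
    A = np.stack([A_stack[:, l * k:(l + 1) * k] for l in range(order)])
    # First Yule-Walker relation: W = R(0) - sum_l A(l) R(-l)
    W = R[0] - sum(A[l] @ R[l + 1].T for l in range(order))
    return A, W

def select_order_by_aic(x, max_order):
    """Return (order, A, W) minimizing the assumed AIC form."""
    n, k = x.shape
    best = None
    for m in range(1, max_order + 1):
        A, W = fit_mar_yule_walker(x, m)
        aic = n * np.log(np.linalg.det(W)) + 2 * k * k * m
        if best is None or aic < best[0]:
            best = (aic, m, A, W)
    return best[1], best[2], best[3]
```

For two-component data, `x` would hold the vertical and radial samples as columns; the returned `A` and `W` are the quantities needed to evaluate the cross-spectral matrix.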
 We thank Anshu Jin, Shigeo Kinoshita, Toru Matsuzawa, and two anonymous reviewers who kindly gave us many useful comments. We are grateful to Charles J. Ammon for providing receiver function estimation software and to Makoto Matsubara, who provided hypocentral data determined using his 3-D velocity model. We also thank Japan Meteorological Agency and Ministry of Education, Culture, Sports, Science and Technology, Japan for providing hypocentral data. We used GMT software [Wessel and Smith, 1991] to draw figures. This work is partially supported by the Japan-Germany Research Cooperative Program of JSPS.