Geochemistry, Geophysics, Geosystems

Imaging Yellowstone plume-lithosphere interactions from inversion of ballistic and diffusive Rayleigh wave dispersion and crustal thickness data



[1] Diffusive and ballistic Rayleigh wave dispersion data from three PASSCAL seismic deployments are combined with crustal thickness constraints from receiver function analysis to produce a high-resolution shear velocity image of the Yellowstone hot spot track crust and uppermost mantle. This synoptic image shows the following crustal features: the eastern Snake River Plain (ESRP) high-velocity midcrustal layer, low-velocity lower crust beneath the 4.0–6.6 Ma Heise caldera field, high-velocity lower crust beneath the <2.1 Ma Yellowstone Calderas, and low-velocity upper crustal volume beneath the <2.1 Ma Yellowstone caldera fields. The low-velocity lower crust beneath the 4.0–6.6 Ma Heise caldera field is found to extend outward 50–80 km from the ESRP margins, consistent with outflow of the magmatically heated and thickened ESRP lower crust. In addition, the lack of 10 km of crustal thickening of the ESRP crust, associated with the estimated 10 km of magmatic thickening, requires that the ESRP lower crust has flowed outward in a complex fashion governed by preexisting lower crustal strength heterogeneity. Within the northern Wyoming province, the so-called 7.x km/s lower crustal magmatic layer is found to extend westward to the N-S oriented pre-Cambrian rift margin. The high-velocity, hence high-density, 7.x layer beneath the <2.1 Ma Yellowstone caldera fields has apparently inhibited heating of the subcaldera lower crust and instead magmatic heat and fluids are exchanged with the country rock above 13 km depth. The narrow 80 km diameter plume imaged by body wave tomograms, after being sheared to horizontal by plate drift, is manifest as a very low velocity (3.9 km/s) layer that is only about 110 km wide. The ESRP mantle lithosphere has been thinned to about 28 km thickness by the plume's transport of heat and magma upward, lateral advection of the lower lithosphere by plume shear, and ongoing lithospheric dilatation.

1. Introduction

[2] The most remarkable feature of the Yellowstone hot spot track is the time-transgressive sequence of caldera fields with estimated ash flow tuff and rhyolite eruptive volumes of 1,200 to 12,200 km³ [Armstrong et al., 1975; Bonnichsen et al., 2008; Christiansen, 2001; Leeman et al., 2008; Morgan and McIntosh, 2005; Sabra et al., 2005; Smith and Braile, 1994]. The creation of these caldera fields requires large inputs of magma and heat to the crust from the underlying plume. This heat exchange is primarily accomplished via basaltic magma input to the midcrust from the combined melting of the plume layer and the overlying thin Archean age mantle lithosphere [Leeman et al., 2008; Lum et al., 1989; Menzies et al., 1983]. The most recent caldera activity is primarily within Yellowstone Park and consists of the 2.1 Ma Huckleberry Ridge caldera, the 1.2 Ma Mesa Fall caldera, and the 0.6 Ma Lava Creek caldera [Christiansen, 2001]. The most significant activity since the Lava Creek caldera event is the 150,000–75,000 a growth of the Sour Creek and Mallard Lake rhyolitic domes in addition to subsequent minor small basaltic flows [Christiansen, 2001]. Ground deformation data show that the most recent 0.6 Ma caldera within Yellowstone Park has large changes in vertical velocity rates consistent with the movement of fluids and perhaps magma at depth [Chang et al., 2007]. Tomographic imaging of the Yellowstone caldera upper crust finds low-velocity bodies above 8 km depth beneath the young rhyolitic domes [Husen et al., 2004; Smith et al., 1982].

[3] Within the eastern Snake River Plain (ESRP) province (Figure 1), a sequence of four major caldera fields is found to contain 1–3 km of dominantly high-silica rhyolitic caldera fill: the Heise (6.6–4.0 Ma), Picabo (10 Ma), Twin Falls (10.5 Ma), and Bruneau-Jarbidge caldera fields (12.7 Ma) [Leeman et al., 2008; Shervais et al., 2006; Smith et al., 1982; Sparlin et al., 1982]. Each caldera field consists of several distinct caldera eruptions over 1–2 Ma timescales. These caldera fields reside within the structural and topographic down-warp called the ESRP. Two processes are cited to explain the ESRP down-warp: (1) an increase in the mean density of the midcrust due to emplacement of mantle derived and fractionated ferro-basalts [Christiansen and McCurry, 2008; McQuarrie and Rodgers, 1998; Sparlin et al., 1982] and (2) an outward directed flow of the ESRP lower crust forced by the load associated with midcrustal magmatic intrusion and the thickened ESRP lower crust [McQuarrie and Rodgers, 1998; H. Y. Yuan et al., Variations in crustal velocity and thickness along the Yellowstone hot spot track, manuscript in preparation, 2008]. In this work, we seek to quantify the importance of these two effects.

Figure 1.

Yellowstone region map with seismic stations and topography. Symbols refer to seismic stations of different PASSCAL arrays as indicated in the legend. White text shows location of the Bighorn (BH), Wind River (WR), and Green River (GR) sedimentary basins. Inset identifies <2.1 Ma Yellowstone caldera field (Y), 4.0–6.6 Ma Heise caldera field (H), and 10.3 Ma Picabo caldera field (P).

[4] A veneer of <1 km thick late stage basalts covers most of the ESRP calderas and provides important chemical constraints with respect to the basalt source region and magma fractionation-assimilation-mixing histories [Bonnichsen et al., 2008; Boroughs et al., 2005; Christiansen and McCurry, 2008; Christiansen, 2001; Hughes et al., 2002; Leeman et al., 1985; Menzies et al., 1984; Shervais et al., 2006; Wilshire et al., 1988]. In addition, numerous Quaternary rhyolitic domes along the axis of the ESRP provide further magmatic history constraints [McCurry et al., 1997]. Xenolith plucking by this recent magmatism within the ESRP has produced mid to lower crust xenoliths with 2.6–3.2 Ga ages [Leeman et al., 1985]. These Archean ages are consistent with the age of the Wyoming craton crust immediately to the east [Chamberlain and Mueller, 2007; Frost and Fanning, 2006]. Unfortunately, no mantle xenoliths have been found within the ESRP basalt flows.

[5] Several lines of evidence support the association of the ESRP high-velocity midcrust with a very large midcrustal sill complex (referred to as MCS hereafter) [Sparlin et al., 1982]. First, petrologic analysis suggests that the high-silica caldera magmas are fractionated from midcrustal basalt intrusions with modest amounts of remelting of previous intrusions and minor amounts of Archean country rock melting [Christiansen and McCurry, 2008; Christiansen, 2001; Shervais et al., 2006]. Given the estimated caldera volumes, petrologic models suggest that a 10–20 km thick layer of mantle derived basalt has been added to the ESRP crust [Christiansen and McCurry, 2008; Leeman et al., 2008; Lum et al., 1989]. Second, flexural modeling of observed geologic tilt indicators about the ESRP margins suggests that a 10–20 km thick layer with a 3–4% density increase is required to explain the geologic tilt indicators [McQuarrie and Rodgers, 1998]. Third, the 1978 seismic refraction lines find a 10–13 km thick high-velocity layer within the ESRP midcrust [Priestley and Orcutt, 1982; Sparlin et al., 1982]. This layer is modeled with a P wave velocity of 6.5 km/s and is surrounded by 6.2 km/s Archean country rock. Fourth, receiver function analysis from the densely spaced 1993 eastern Snake River Plain PASSCAL experiment is roughly consistent with the refraction based MCS model. Unfortunately, receiver function results from the 2000–2001 Continental Dynamics–Yellowstone (CD-YEL) PASSCAL experiment are too sparse to accurately constrain the MCS (Yuan et al., manuscript in preparation, 2008).

[6] With respect to the ultimate origin of the ESRP and Yellowstone Park calderas that form the Yellowstone hot spot track, a weak upper mantle plume has been found by the CD-YEL experiment tomograms [Schutt and Dueker, 2008; Waite et al., 2006; Yuan and Dueker, 2005]. This upper mantle plume is found not to cross the 660 km discontinuity based on mantle discontinuity topography [Fee and Dueker, 2004]. As the North American plate has drifted to the southwest at 2–3 cm/a [Sella et al., 2002], the plate-sheared 80 km diameter plume conduit [Yuan and Dueker, 2005] has significantly thinned the ESRP mantle lithosphere and transferred large volumes of magmatic heat and mass into the ESRP crust. At the depth of the 410 km discontinuity, the plume conduit is imaged 100 km to the NW of the Yellowstone Caldera, roughly beneath Dillon, Montana [Fee and Dueker, 2004; Yuan and Dueker, 2005]. This offset of the plume with depth manifests the 75° plunge of the plume conduit toward the NW. The plunge of the plume conduit suggests that the deep mantle flow is to the east [Steinberger, 2000]. Prior to the CD-YEL experiment, the nonplume case for Yellowstone had been presented [Christiansen et al., 2002; Humphreys et al., 2000], albeit without the new seismic sampling provided by the CD-YEL experiment. However, we believe that the new tomographic and discontinuity topography images are now sufficiently compelling to make a plume origin for the hot spot track a good working hypothesis.

[7] A primary motivation for this study is that our recently published ballistic Rayleigh wave shear velocity image was unable to resolve intracrustal structure well because of a lack of Rayleigh wave dispersion data below 30 s period [Schutt et al., 2008]. Therefore, to provide improved resolution of the region's crustal velocity structure, diffusive wavefield Green's functions have been extracted from 91 stations associated with the CD-YEL, Billings, and National Seismic Network (NSN) arrays. Extraction of interstation Green's functions from diffusive wavefield cross correlation [Ritzwoller et al., 2006; Shapiro et al., 2005] confirms the theoretical framework with respect to interstation Green's function estimation [Roux et al., 2005; Sabra et al., 2005; Wapenaar et al., 2005]. Such analysis has been used to produce shear velocity tomograms from around the world [Bensen et al., 2008; Lin et al., 2008; Yang et al., 2007]. Measurement of the diffusive wavefield Green's functions compensates for the shortcomings of traditional ballistic surface wave imaging, which is limited by the spatial and temporal heterogeneity of earthquake locations.

2. Data Processing and Group Velocity Measurements

[8] Continuous one sample per second vertical component seismic data from 91 stations recorded by three temporary PASSCAL arrays denoted as the CD-YEL array [Fee and Dueker, 2004], the Billings array (BA) [Yuan et al., 2008], and the Snake River Plain array (SRP) [Walker et al., 2004] are used in addition to seven National Seismic Network (NSN) stations (Figure 1). From this data set, diffusive group velocity measurements are extracted from the August 1999 to July 2001 time span. The CD-YEL array operated from year-julian day 2000–179 to 2001–133, the Billings array from 1999–281 to 2000–239, and the Snake River Plain array from 2000–200 to 2001–255. The NSN stations were operational during the entire time span. The ambient noise processing method used here is similar to that of previous studies [Bensen et al., 2008; Sabra et al., 2005; Shapiro and Campillo, 2004] and is briefly described below.

[9] Prior to cross correlation of station pair data, the waveforms from each station are processed by: removing the instrument responses, removing linear data trends, and 3–100 s band pass filtering. Amplitude normalization of the waveforms is performed using spectral whitening in the frequency domain. Finally, all possible overlapping data from station pairs are correlated in the frequency domain using 600 s long time windows. To reduce the total number of individual station pair correlations, these 600 s correlations are summed into day-long correlation functions which are stored in the Antelope relational database for further processing and error estimation.
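The windowed correlation and stacking steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used in this study: it works in the time domain on plain Python lists, it omits instrument response removal, band pass filtering, and spectral whitening, and the function names (remove_linear_trend, cross_correlate, stack_windows) are hypothetical.

```python
def remove_linear_trend(x):
    """Subtract the least-squares straight line from a list of samples."""
    n = len(x)
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    num = sum((i - t_mean) * (xi - x_mean) for i, xi in enumerate(x))
    den = sum((i - t_mean) ** 2 for i in range(n))
    slope = num / den if den else 0.0
    return [xi - (x_mean + slope * (i - t_mean)) for i, xi in enumerate(x)]


def cross_correlate(a, b, max_lag):
    """Time-domain cross correlation for lags -max_lag..+max_lag (in samples).

    With this sign convention, a peak at a positive lag means that the
    signal arrives at station b after it arrives at station a.
    """
    n = len(a)
    out = []
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                total += a[i] * b[j]
        out.append(total)
    return out


def stack_windows(trace_a, trace_b, window_len, max_lag):
    """Correlate consecutive windows and sum them into one stacked function."""
    stack = [0.0] * (2 * max_lag + 1)
    for start in range(0, len(trace_a) - window_len + 1, window_len):
        a = remove_linear_trend(trace_a[start:start + window_len])
        b = remove_linear_trend(trace_b[start:start + window_len])
        for k, value in enumerate(cross_correlate(a, b, max_lag)):
            stack[k] += value
    return stack
```

For one sample per second data, window_len=600 would correspond to the 600 s windows used here; a frequency domain implementation would be preferred for full-length records.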

[10] Figure 2 shows the broadband correlation functions for a selected station pair. The temporal variations in the correlation functions result from changes in the diffusive wavefield power flux with respect to changes in seasonal noise sources [Yang and Ritzwoller, 2008]. This particular station pair correlation function exhibits well defined fundamental mode Rayleigh packets for both positive and negative lag times, although not all station pairs show such symmetric correlation functions. To minimize the asymmetric nature of the correlation functions, the positive and negative lags are averaged (Figure 3), as is common practice [Bensen et al., 2008].
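The averaging of positive and negative lags can be written in a few lines. The sketch below assumes that the correlation function is stored as an odd-length list spanning lags −N to +N with zero lag at the center; the function name symmetric_component is hypothetical.

```python
def symmetric_component(correlation):
    """Average the positive and negative lags of a correlation function.

    The input spans lags -N..+N (odd length, zero lag at the center);
    the output spans lags 0..N.
    """
    if len(correlation) % 2 == 0:
        raise ValueError("expected an odd number of samples (lags -N..+N)")
    mid = len(correlation) // 2
    return [0.5 * (correlation[mid + k] + correlation[mid - k])
            for k in range(mid + 1)]
```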

Figure 2.

Vertical component broadband noise correlogram image for two stations separated by 348 km at an azimuth of 358° for the time period 1 July 2000 to 1 May 2001. The correlation function for each day is amplitude normalized to one with red and blue representing positive and negative amplitudes. White bands indicate time periods with no data available. The stacked correlation function is plotted at the top.

Figure 3.

Example ambient noise cross-correlation record section. Symmetric component stacked cross-correlation function envelopes are filtered at 20 s and arranged by interstation distance. The solid blue line denotes a group velocity of 3.0 km/s. Using our two wavelength data cutoff, interstation paths <120 km are not used.

[11] For each station pair, the fundamental mode Rayleigh wave dispersion measurements are determined via a frequency-time analysis [Dziewonski and Hales, 1972] of the stack of the full set of the day-long correlation functions. The set of wave periods analyzed derives from a bank of 50% overlapping Gaussian filters with a linear distribution of filter center periods. From these filtered correlations the lag time of the envelope peak is measured to define the group arrival time (tg). The group velocity is x/tg, where x is the interstation distance. Figure 4 shows the correlation functions for two stations filtered at selected center periods. Note that the longer-period waves arrive earlier with respect to the shorter-period waves, consistent with an increase in group velocity with period.
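The relation between the envelope peak and the group velocity can be illustrated with a short sketch. It assumes that the envelope of the narrow band filtered, symmetric correlation function has already been computed and is sampled at a uniform interval starting at zero lag; the function name group_velocity is hypothetical.

```python
def group_velocity(envelope, dt, distance_km):
    """Return x / tg, where tg is the lag time of the envelope maximum.

    envelope    : envelope samples, first sample at zero lag
    dt          : sample interval in seconds
    distance_km : interstation distance x in km
    """
    peak_index = max(range(len(envelope)), key=envelope.__getitem__)
    tg = peak_index * dt
    if tg <= 0.0:
        raise ValueError("envelope peak at zero lag; group velocity undefined")
    return distance_km / tg
```

For example, an envelope peak at 75 s lag for a 226 km path corresponds to a group velocity of about 3.0 km/s.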

Figure 4.

Vertical component noise correlogram for two stations separated by 226 km at an azimuth of 348°, filtered with variable width Gaussian at different center periods. (a) Broadband, (b) 20.8 s, (c) 12.3 s, and (d) 6.1 s. Dashed line shows the envelope of the signal.

3. Group Velocity Error Analysis

[12] The average amount of temporal overlap for our 3560 station pairs is 145 days; these days are approximately evenly divided between the Northern Hemisphere's winter and summer seasons. To estimate standard errors for the group arrival time and the signal-to-noise ratio (SNR), bootstrap resampling with replacement [Efron and Tibshirani, 1986] is used on the set of day correlations for each station pair. To calculate group arrival time and SNR uncertainties, 100 bootstraps of group arrival time and the SNR for each station pair are performed. This bootstrap method implicitly addresses the bias that temporal variations of the diffusive wavefield flux have upon the group velocity measurements. The SNR is computed at each period by dividing the maximum amplitude of the correlation function envelope by the root-mean-square (RMS) of the correlation function within a noise window. The noise window extends from the upper lag time of the signal window to the end of the 600 s correlation function. As expected, the SNR is a good proxy for the amplitude of the diffusive wave power flux and hence is peaked at the mean period of the primary and double-frequency microseismic peaks at 6–12 s [Schulte-Pelkum et al., 2004]. The standard errors of the dispersion measurements are <0.1 km/s for periods >10 s, but the standard errors increase at periods <10 s (Figure 5). We suspect that this increased error at <10 s period is related to strong wavefield scattering into body and other surface wave modes within the array.
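The bootstrap error estimate and the SNR definition given above can be sketched as follows. This is an illustration under stated assumptions: each day correlation is a list of equal length, the statistic of interest (for example the group arrival time) is supplied as a callable, and the names stack_days, bootstrap_standard_error, and snr are hypothetical.

```python
import math
import random


def stack_days(days):
    """Sum a list of equal-length day correlations sample by sample."""
    return [sum(column) for column in zip(*days)]


def bootstrap_standard_error(days, measure, n_boot=100, seed=0):
    """Standard deviation of `measure` over bootstrap resamples of the days.

    Each resample draws len(days) day correlations with replacement,
    stacks them, and applies `measure` to the stack.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(days) for _ in days]
        estimates.append(measure(stack_days(sample)))
    mean = sum(estimates) / n_boot
    variance = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return math.sqrt(variance)


def snr(envelope, correlation, noise_start):
    """Envelope maximum divided by the RMS of the correlation function in
    the noise window, which runs from index noise_start to the end."""
    noise = correlation[noise_start:]
    rms = math.sqrt(sum(v * v for v in noise) / len(noise))
    return max(envelope) / rms
```

Here n_boot=100 follows the number of bootstraps quoted in the text; the random seed is only there to make the sketch reproducible.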

Figure 5.

Group velocity measurement characteristics plotted versus period. (a) Average signal-to-noise ratio before data culling. (b) Number of observations. (c) Standard deviation of group velocity. Average is indicated by the solid line, and median is indicated by the dashed line. (d) Average group velocity with one average standard deviation error bound. Note that Figures 5c and 5d show data characteristics after SNR and interstation distance culling only. Also note that Figure 5b represents final number of data used for tomographic inversion after culling based on SNR, interstation distance, and group velocity standard deviation as described in text.

[13] The group velocity measurement procedure yields a large number of group arrival times which are then culled into usable data by applying three culling metrics. First, a two wavelength cutoff is imposed because body and higher mode wave scattering interferes [Roux et al., 2005] with the ability to accurately measure the arrival time of the Rayleigh wave envelope: e.g., a 10 s period wave traveling at 3 km/s group speed requires the station pairs be separated by >60 km. This contamination of correlation functions at short offsets is observed in Figure 3. Although other studies use a three wavelength rule [Lin et al., 2008], we have found that our two wavelength rule yields group velocity standard errors of <0.15 km/s. Second, the SNR of the cross correlation functions provides a reliable estimate of the dispersion measurement errors and an SNR cutoff value of 10 has been used. Third, a group velocity measurement is discarded if the group velocity standard error is >0.1 km/s. These culling metrics remove 55% of the total number of theoretical station pairs (Figure 5).
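The three culling metrics can be combined into a single test per measurement, as in the sketch below. The function name keep_measurement is hypothetical, and the use of a caller-supplied reference group velocity for the wavelength (3 km/s in the example from the text) is an assumption of this sketch.

```python
def keep_measurement(distance_km, period_s, velocity_kms, snr, std_err_kms,
                     min_wavelengths=2.0, min_snr=10.0, max_std_err=0.1):
    """Return True if a group velocity measurement passes all three culls."""
    wavelength_km = velocity_kms * period_s
    if distance_km < min_wavelengths * wavelength_km:
        return False  # path shorter than two wavelengths
    if snr < min_snr:
        return False  # correlation function too noisy
    if std_err_kms > max_std_err:
        return False  # bootstrap standard error too large
    return True
```

For a 10 s period and a 3 km/s reference velocity, a 61 km path passes the distance test and a 59 km path does not.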

4. Group Velocity Tomography

[14] The culled group velocity data set is inverted for isotropic group velocity maps at the measured wave periods. The group velocity models are parameterized as 20 km square blocks and ray theory is used to compute the great circle paths between station pairs. Inversion of the data kernel matrix is regularized using diagonal damping. To simulate the finite frequency effects associated with wave propagation [Spetzler et al., 2002], a convolutional quelling operator is applied to the data kernel matrix [Meyerholtz et al., 1989]. This operator has a Gaussian functional form with a half width that increases linearly with the wave period. The mean group velocity at each wave period is used as a starting model.

[15] Our tomographic equation solver is an iteratively reweighted least squares (IRLS) algorithm [Aster et al., 2005] that down-weights group velocity data that present large residuals during iteration. After the first iteration, data are down-weighted to zero if a data residual is >3 times the standard error of the data. Two more IRLS iterations produce no significant changes in the model. To assess how damping controls the group velocity models, the resolution versus data residual variance trade-off curves were evaluated via multiple inversions at different damping values [Menke, 1984], which results in an L-shaped trade-off curve. In all our group velocity map inversions, the damping value nearest the bend in the curve is used as our optimal damping.
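The zero-weighting of outliers after the first iteration can be illustrated with the sketch below. It shows only the hard three standard error cutoff described in the text, not the full IRLS weighting scheme of the solver; the function name outlier_weights is hypothetical, and the resulting weights are assumed to multiply the corresponding rows of the tomographic system before the next iteration.

```python
def outlier_weights(residuals, std_errs, cutoff=3.0):
    """Return 0.0 for data whose |residual| exceeds cutoff standard errors,
    and 1.0 otherwise."""
    return [0.0 if abs(r) > cutoff * s else 1.0
            for r, s in zip(residuals, std_errs)]
```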

[16] Spatial resolution is assessed using both single-checker and many-checkers synthetic tests (Figures 6 and 7). Each input checker is a 60 km by 60 km square with an input group velocity anomaly of unity. These tests show that 60 km lateral scale velocity variations can be well resolved by our 5.4 s period data set (Figure 7). Likewise, a Yellowstone caldera field scale anomaly at 10.3 s period can be resolved (Figure 6a). Finally, the ESRP midcrustal sill can be resolved by 20.8 s period data (Figure 6b). These tests give us confidence that the primary group velocity anomalies found in this study are resolved structure. Our group velocity maps at selected periods are presented in Figure 8. At 4.8 s period, the most prominent features are the low group velocities associated with Bighorn, Wind River, and Green River sedimentary basins in western Wyoming (Figure 1). At 14.6 s period, the group velocity map reveals a distinct low group velocity anomaly beneath the most recent 0.6 Ma Yellowstone caldera.
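A checkerboard input model of the kind used in the many-checkers test can be generated as sketched below. The 20 km block size and 60 km checker size follow the text; the alternating sign of the unit anomaly and the function name checkerboard are assumptions of this sketch.

```python
def checkerboard(n_rows, n_cols, block_km=20.0, checker_km=60.0,
                 amplitude=1.0):
    """Return a grid (list of rows) of alternating +/- amplitude checkers."""
    cells = int(round(checker_km / block_km))  # blocks per checker side
    return [[amplitude if ((r // cells) + (c // cells)) % 2 == 0
             else -amplitude
             for c in range(n_cols)]
            for r in range(n_rows)]
```

Synthetic travel times computed through such a model along the real station pair paths, and then inverted with the same damping, indicate which checkers are recovered.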

Figure 6.

Feature recovery test for group velocity tomographic inversions. (a) Test for significance of velocity anomaly within the Yellowstone Caldera at 10.3 s. (b) Test for midcrustal velocity anomaly beneath the Heise caldera field at 20.8 s. For both tests the input model has unity amplitude outlined by white lines. The maximum amplitude recovery is indicated by the maximum scale bar value. Solid black line outlines region of good resolution.

Figure 7.

Group velocity checkerboard test at 5.4 s with 60 km input checkers.

Figure 8.

Group velocity maps: (a) 4.8 s, (b) 7.3 s, (c) 14.6 s, and (d) 19.6 s.

5. Inversion of Diffusive and Ballistic Data for Shear Velocity Model

[17] To form a three-dimensional shear velocity model, a joint inversion of our diffusive and ballistic fundamental mode Rayleigh wave maps was performed with crustal thickness constraints specified using our new crustal thickness map (Yuan et al., manuscript in preparation, 2008). The ballistic wave measurements were performed using the two-plane wave technique [The MELT Seismic Team, 1998; Li et al., 2005] and the results have been described in a prior paper [Schutt et al., 2008]. Our shear velocity model data are derived by interpolating the ballistic and diffusive dispersion maps on a 20 km grid of nodes. At each grid node, the group and phase velocity dispersion curves were extracted and inverted for a one-dimensional shear velocity profile using an iterated linear least squares equation solver that incorporates model smoothness constraints and weighting of the model vector norm [Herrmann and Ammon, 2002]. To assess the coherence between the diffusive and ballistic data sets, inversion of the diffusive data only was compared to the combined inversion of the diffusive and ballistic data. Comparison of these two inverse images shows that the crustal structure is highly correlated, suggesting that the constraints provided by the two data sets are coherent.

[18] Our crustal thickness constraints were provided by stacking of P wave receiver functions at each of our 91 stations. The processing and results of this work are presented in a companion paper (Yuan et al., manuscript in preparation, 2008). Briefly, 20–50 high-quality receiver functions for each station were stacked with move out corrections to produce a set of clearly identifiable direct P-S conversions from the Moho. A free-surface reverberation analysis (the H-K method [Zhu and Kanamori, 2000]) was performed to constrain the bulk crustal Vp/Vs ratio, which varies reasonably between 1.76 and 1.87. The two standard error value for the P-S Moho times is <0.2 s or about 2 km (Yuan et al., manuscript in preparation, 2008). The resulting crustal thickness estimates from the 91 stations were then interpolated into a 2-D map using a two-dimensional least squares spline fit. To impose these crustal thickness timing constraints, the crustal thickness map and shear velocity inversion were iterated twice, with the crustal thicknesses from the previous iteration used as the starting model for the subsequent shear velocity inversion. After two iterations this procedure resulted in very minor changes of <1 km in the crustal thickness map.
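The conversion between a P-S Moho delay time and crustal thickness can be illustrated with the standard single layer relation H = t / (sqrt(1/Vs² − p²) − sqrt(1/Vp² − p²)). The sketch below is not the companion paper's procedure; the average crustal Vp of 6.3 km/s and ray parameter of 0.06 s/km used in the example are assumed values, and the function name moho_depth_from_ps is hypothetical.

```python
import math


def moho_depth_from_ps(t_ps, vp, vp_vs, ray_param=0.06):
    """Crustal thickness (km) from the Moho P-S delay time (s) for a single
    homogeneous crustal layer.

    vp        : average crustal P velocity in km/s (assumed value)
    vp_vs     : bulk crustal Vp/Vs ratio
    ray_param : horizontal slowness in s/km (assumed value)
    """
    vs = vp / vp_vs
    q_s = math.sqrt(1.0 / vs ** 2 - ray_param ** 2)  # vertical S slowness
    q_p = math.sqrt(1.0 / vp ** 2 - ray_param ** 2)  # vertical P slowness
    return t_ps / (q_s - q_p)
```

With these assumed values and Vp/Vs = 1.78, a 0.2 s timing uncertainty maps to roughly 1.5 km of thickness, the same order as the approximately 2 km quoted above.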

[19] The one-dimensional shear velocity model parameterization consists of 2 km thick crustal layers, 5 km thick layers from the Moho to 100 km, and 10 km thick layers from 100 to 200 km. To encourage the uppermost 6 km of the model to absorb the large surficial low-velocity layers associated with sedimentary basins and caldera fill, the 0–6 km depth layers are down-weighted by a factor of five in the model norm calculation. The average shear velocity increase across the crust-mantle boundary is defined as 0.75 km/s on the basis of our receiver function analysis (Yuan et al., manuscript in preparation, 2008). The starting model crust and mantle shear velocity values are 3.65 km/s and 4.40 km/s with Vp/Vs ratios of 1.78 and 1.81, respectively (Yuan et al., manuscript in preparation, 2008). The starting velocity model is purposely set to high values to guard against the development of false low-velocity zones [Cho et al., 2007; Julia et al., 2003]. Increasing or decreasing the starting velocity model has little effect on the resulting shear velocity model as long as the crustal thickness is fixed to the receiver function values. The layer velocity standard errors are difficult to determine accurately because the regularization bias due to diagonal damping, model norm weighting, and iteration controls the final solution [Aster et al., 2005].
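The layer parameterization can be built from a Moho depth as sketched below. How a layer that would straddle the Moho or the 100 km boundary is handled is not stated in the text; truncating it at the boundary is an assumption of this sketch, as is the function name layer_thicknesses.

```python
def layer_thicknesses(moho_km, crust_dz=2.0, mantle_dz=5.0, deep_dz=10.0,
                      switch_km=100.0, bottom_km=200.0):
    """Return layer thicknesses (km) from the surface to bottom_km.

    Layers are crust_dz thick above the Moho, mantle_dz thick from the
    Moho to switch_km, and deep_dz thick below. A layer that would cross
    a boundary is truncated at that boundary.
    """
    layers = []
    depth = 0.0
    for limit, dz in ((moho_km, crust_dz), (switch_km, mantle_dz),
                      (bottom_km, deep_dz)):
        while depth < limit - 1e-9:
            h = min(dz, limit - depth)
            layers.append(h)
            depth += h
    return layers
```

For a 42 km thick crust this gives 21 crustal layers, 12 layers between the Moho and 100 km (the last one 3 km thick), and 10 layers below 100 km.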

[20] The resolution of the 400 one-dimensional shear velocity inversions is estimated as the trace of the resolution matrix. The trace of the resolution matrix is an estimate of the number of degrees of freedom (DOF) constrained by the dispersion data. For the upper 45 km of our shear velocity models, 4–5 DOF are found indicating very good resolution of the crust. Noteworthy is that our prior ballistic wave only inversion [Schutt et al., 2008] had only about 2 DOF above 45 km depth. Thus, the increase in crustal DOF with the combined inversion indicates the utility of our diffusive group velocity data. From 45 to 200 km depth, our data constrain 2–3 DOF, indicating decent resolution of the mantle lithosphere and asthenosphere to about 125 km depth. Given that we are using the same ballistic wave dispersion data set presented by Schutt et al. [2008], our subcrustal resolution is only slightly better with respect to our prior ballistic-only inversion.
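The link between the trace of the resolution matrix and the number of constrained degrees of freedom can be made concrete for the simplest case. For a damped least squares solution with diagonal damping ε, the resolution matrix is R = (GᵀG + ε²I)⁻¹GᵀG, and its trace is the sum of s²/(s² + ε²) over the singular values s of G. The sketch below assumes this zeroth-order damping only; the solver used here also applies smoothness constraints and model norm weighting, so it is an illustration rather than the exact calculation, and the function name resolution_trace is hypothetical.

```python
def resolution_trace(singular_values, damping):
    """Trace of R = (G^T G + damping^2 I)^-1 G^T G from the singular values
    of G. Each well-constrained direction contributes close to 1 and each
    poorly constrained direction contributes close to 0."""
    return sum(s * s / (s * s + damping * damping) for s in singular_values)
```

For singular values of 10, 1, and 0.1 with a damping of 1, the trace is 1.5: the first direction is almost fully resolved, the second is half resolved, and the third contributes almost nothing.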

[21] For a grid point located within the Heise caldera field on the ESRP (Figure 1), the fit of our final shear velocity profile predicted data to our observed data is shown in Figure 9a. The fit to both the group and phase velocity data is within one standard error of the observed data everywhere except for a group velocity maximum at 15 s period. Because the iteration number is the primary inversion parameter controlling the shape of the velocity profile, the change in the velocity model with iteration number is shown (Figure 9b). Figure 9b shows that the model converges monotonically toward a solution in six iterations. Noteworthy velocity features found by the inversion are: the low-velocity surface layer associated with caldera fill (0–6 km), the high-velocity midcrustal layer (i.e., the velocity “nose” at 15–30 km), the low-velocity lower crustal channel (30–40 km), and an asthenospheric low-velocity channel that is the sheared plume layer (80–100 km).

Figure 9.

Inversion for shear velocity at a sample point within the ESRP. (a) Data points are group (blue) and phase (black) velocity dispersion data with standard deviation error bars. Lines show best fit model from one-dimensional inversion for shear velocity (blue line in Figure 9b). (b) One-dimensional shear velocity model for the first six iterations.

[22] The crustal thickness values used in our starting model are a very important quantity that controls the final model profile. In particular, the low-velocity lower crustal layer beneath the ESRP is especially sensitive to the accuracy of our crustal thickness constraints. Thus, a set of inversions that spans our range of crustal thickness uncertainties is presented (Figure 10). For these inversions, the crustal thickness is perturbed within our ±2 km uncertainty bounds: i.e., from 40 to 44 km for this particular grid point. This test shows that the shear velocity minimum in the lower crust varies by <0.1 km/s with respect to crustal thickness perturbations, implying that the lower crustal low-velocity layer is required by the surface wave data.

Figure 10.

Significance test of low-velocity lower crust beneath the ESRP using dispersion curves shown in Figure 9a as input data. (a) Velocity models for Moho depths within the 40–44 km uncertainty bounds for Moho depth. (b) One-dimensional shear velocity models for corresponding input models in Figure 10a after six iterations with damping factor of 0.05 km².

[23] To assess how the resolution and data residual variance vary with respect to damping and iteration number for our 400 shear velocity profile inversions, mean resolution and mean error values were calculated (Figure 11). The L1 residual norm versus iteration number plot shows that no significant variance reduction occurs after six iterations. Using six iterations as our preferred value, the resolution spread versus L1 residual norm figure shows the expected monotonic behavior as the damping parameter is varied. To illustrate how the damping value controls our model solutions, two differently damped model cross sections are presented in Figures 12 and 13. These two differently damped models are well correlated, with the more damped model being effectively a low-pass filtered version of the less damped model. Noteworthy is that no new structure (artifact) is created in the less damped model.

Figure 11.

Determination of regularization parameter trade-off for shear velocity inversion. The top curve presents trade-off between model resolution, as determined by Dirichlet spread functions [Menke, 1984], and the fit to the data (L1 norm of the residuals) for different damping values. Using six iterations, the red circle and blue square denote damping values of 0.2 km² and 0.05 km², respectively. Each point is the average value and one standard deviation error bar for all sample points within the region of good resolution outlined by the solid black line in Figures 8a–8d. The bottom curve shows how the residual L1 norm varies with respect to iteration number for damping of 0.05 km². After six iterations, no additional iterations are warranted. Note that the first two iterations are always performed with a high damping value to minimize model instabilities.

Figure 12.

Less damped (0.05 km²) shear velocity model. (a–c) Cross sections and (d) locations. White line indicates Moho depth. The model presentation is offset at 50 km depth where separate color scales are used as indicated.

Figure 13.

More damped (0.2 km²) shear velocity model. (a–c) Cross sections and (d) locations. White line indicates Moho depth. Color scale bars are the same as Figure 12.

[24] The primary difference between the two models is that the less damped model has greater velocity range and more spatially focused structure. Specifically, the following crustal features are more focused in the less damped model: the ESRP midcrustal high-velocity layer and the low-velocity lower crustal layer under the Heise caldera field which extends laterally up to 80 km from the ESRP margins (Figure 12b). The most remarkable subcrustal velocity difference between the models is that a 25 km thick mantle lithosphere with shear velocities of 4.0–4.2 km/s is imaged by the less damped model. Also, the plume anomaly beneath the Yellowstone caldera dips to the NW in the less damped model (Figure 12c) consistent with the P wave tomography [Yuan and Dueker, 2005]. Given these observations, we chose the less damped model to be our preferred model given its higher resolution and lack of artifacts related to underdamping or overiteration of the inverse problem.

6. Shear Velocity Model Results

[25] Our preferred 3-D shear velocity model is presented in Figures 12, 14, and 15. In all the cross sections and map views, only shear velocity profiles with group velocity resolution values greater than one half of the maximum value are rendered. At 5 km depth, the low velocities associated with the Bighorn, Wind River, Green River, and Belt basins are clearly imaged (Figure 14a). In addition, low upper crustal velocities are found above 5 km depth beneath the <2.1 Ma Yellowstone caldera fields. In cross section (Figure 12a), the low-velocity surface layer associated with the ESRP calderas and basalt fields is found extending to about 3 km depth. At 7 km depth, the near-surface low-velocity layers associated with the ESRP and sedimentary basins disappear and the low-velocity zone beneath the 0.6 Ma Yellowstone caldera is found. Two cross sections show that the 0.6 Ma caldera anomaly extends to 12 km depth and is the most prominent upper crustal velocity anomaly besides the sedimentary basins (Figures 12a and 12b). At 25–30 km depth, the upper boundary of the high-velocity lower crustal layer beneath the Wyoming craton is found, consistent with previous refraction [Gorman et al., 2002; Henstock et al., 1998; Snelson et al., 1998] and receiver function results (Yuan et al., manuscript in preparation, 2008).

Figure 14.

Shear velocity maps of less damped model: (a) 5 km, (b) 7 km, (c) 25–30 km (average), (d) 35 km, (e) 50 km, and (f) 80 km depth. Solid lines are 0.2 km/s shear velocity contour lines. Dotted white lines show state boundaries. Regions not rendered have poor resolution. Shear velocity variations indicated by the color bars. BH, WR, and GR in Figure 14a refer to sedimentary basins as in Figure 1.

Figure 15.

Shear velocity model stacked map views. From top to bottom, slice depth is at 5, 7, 25–30, 35, 50, 60, and 80 km depth. Shear velocity variations indicated by the color bars to the right, with the bottom three slices using the lower velocity scale. Black dashed line outlines the ESRP, and the white dashed lines represent state boundaries.

[26] At midcrustal depths, the high-velocity MCS layer beneath the ESRP is observed along with high velocities beneath the Laramide deformed Dillon Block in SW Montana [Foster et al., 2006; Mueller et al., 2002] (Figure 16a). In cross section, the 3.7 km/s contour outlines this ESRP midcrustal feature as a 10–15 km thick layer (Figures 12a and 12b). At 35 km depth, the previously noted high-velocity lower crustal layer (so-called 7.x layer [Henstock et al., 1998]) is found beneath the Wyoming craton which includes the Dillon Block (Figure 16a). Remarkable is that the westernmost limit of the high-velocity Wyoming craton lower crust is demarcated along a N–S line at −111° longitude near the pre-Cambrian rift margin [Foster et al., 2006]. The lack of high-velocity lower crust beneath the southern Absaroka Range is the most significant anomaly with respect to the pervasive high-velocity 7.x layer.

Figure 16.

Decorated crust-mantle structure. (a) Map view at 35 km depth. DB, Dillon Block; YC, Yellowstone Caldera. Thick black line shows approximate Precambrian hinge line. (b) ESRP-YC cross section. MCS, midcrustal sill; LVLC, low-velocity lower crust.

[27] Another remarkable feature at 35 km depth is the low-velocity lower crust found along a N-S trend along the Wyoming-Idaho border. Two other smaller regions of low-velocity lower crust are found surrounding the high-velocity Dillon Block. As discussed in the next section, we believe this low-velocity lower crust is due to outflow of hot lower crust from beneath the ESRP. In an ESRP parallel cross section, the low-velocity lower crust beneath the Heise caldera field (Figure 12a) is found to thin to the SW toward the older caldera fields. This thinning of the low-velocity lower crustal layer is consistent with the 1978 refraction model [Priestley and Orcutt, 1982]. In an ESRP perpendicular cross section through the Heise fields (Figure 12b), the low-velocity lower crust extends 50–80 km to the NW and SE of the ESRP margins. Group velocity checkerboard tests show that resolution is good in this region (Figure 7) and our crustal thickness maps are also well constrained here.

[28] At 50–60 km depth, a low-velocity mantle lid (4.2 km/s) is found beneath the ESRP and Yellowstone caldera fields (Figure 14e). Outside this low-velocity plume-disturbed region, high-velocity mantle lithosphere (4.6 km/s) is found with some embedded low-velocity artifacts on the eastern side of the image because the crust is 52–54 km thick here. At 80 km depth, the spatial distribution of the sheared plume layer mantle is outlined by the 4.0 km/s velocity contour (Figure 14f). This velocity contour outlines a low-velocity swath that is about 120 km wide and extends from just NE of the 0.6 Ma Yellowstone caldera along the extent of the ESRP sampled by our data. In cross section (Figure 12), the sheared plume layer is found to be impinging upon a lithosphere that is 100–125 km thick while the bottom of the plume layer extends to about 125 km depth. However, the depth extent of the plume layer is not well constrained because of a decrease in resolution with depth and the small lateral extent of the sheared plume layer. Yet, it is noteworthy that the body wave tomograms [Schutt and Humphreys, 2004; Yuan and Dueker, 2005] find a very similar depth and width associated with the sheared plume layer.

7. Discussion

[29] The presence of a high-velocity layer within the ESRP midcrust was initially detected by the 1978 Yellowstone–eastern Snake River Plain seismic profiling experiment [Priestley and Orcutt, 1982; Smith et al., 1982; Sparlin et al., 1982]. Modeling of the ESRP parallel and perpendicular refraction profiles found a 10 km thick high-velocity layer between 10 and 20 km depth with a P wave speed of 6.53 km/s. This layer was named the midcrustal sill (MCS) even though it is almost certainly a composite of hundreds of individual sill intrusions [Annen et al., 2006; Annen and Sparks, 2002; Leeman et al., 2008; Shervais et al., 2006]. A discrepancy exists between the refraction determined depth to the top of the MCS (10–12 km) [Braile et al., 1982] and our finding of the depth to the top of the MCS of 15–20 km. To rationalize this discrepancy, we note two factors. First, for the refraction line down the middle of the ESRP, the phase identified as refracting off the top and bottom of the MCS is a weak secondary arrival as noted by Braile et al. [1982]. No formal model errors are presented for the refraction model, and we estimate that a 5 km deeper depth to the top of the MCS is plausible. Second, our surface wave model finding of the top of the MCS at 15–20 km depth is an average along the 80 km wide ESRP, whereas the ESRP-parallel refraction model samples only along the middle of the ESRP. The top of the MCS is probably quite irregular reflecting the fact that only about 1/3 of the ESRP area is occupied by calderas. Presumably, the calderas have the MCS rising to shallower depths beneath them.

[30] In addition to the ESRP-parallel refraction line [Braile et al., 1982], the ESRP-perpendicular refraction line [Sparlin et al., 1982] crosses the Picabo caldera field near Pocatello, Idaho (Figure 1). This line models the MCS as a domal shaped body that is thickest (10 km) and shallowest (8 km) beneath the middle of the ESRP. Similar to our assessment of the Braile et al. [1982] model, two sources of this discrepancy between the refraction and surface wave models are plausible. First, this refraction line crosses the Picabo caldera field and thus it is plausible that the MCS does extend to shallower depth beneath this large caldera. Second, no formal errors are presented in the Sparlin et al. [1982] work and the authors note that their modeling was guided by the Braile et al. [1982] model. In addition, receiver function analysis along the ESRP-perpendicular refraction line finds no evidence for the domal geometry [Peng and Humphreys, 1998]. Noteworthy is that the refraction modeled MCS is about the same width as the 80 km wide ESRP. The refraction model of the MCS is consistent with the observed Bouguer gravity high over the ESRP for a MCS density of 2.88 g/cm³ intruded in 2.65 g/cm³ Archean country rock [Sparlin et al., 1982].
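
As an order-of-magnitude illustration of why a dense MCS produces a gravity high, the density contrast quoted above can be inserted into the infinite (Bouguer) slab formula, Δg = 2πGΔρh. This is a sketch under strong assumptions, not the gravity modeling of Sparlin et al. [1982]: the infinite-slab geometry is assumed here for simplicity and overestimates the anomaly of a body of finite width, and the 10 km thickness is taken from the refraction model described above. The function name is hypothetical.

```python
import math

G = 6.674e-11          # gravitational constant, m^3 kg^-1 s^-2
M_PER_S2_PER_MGAL = 1.0e-5  # 1 mGal expressed in m/s^2


def bouguer_slab_mgal(density_contrast_kg_m3, thickness_m):
    """Gravity anomaly (mGal) of an infinite horizontal slab (assumed geometry)."""
    anomaly_si = 2.0 * math.pi * G * density_contrast_kg_m3 * thickness_m
    return anomaly_si / M_PER_S2_PER_MGAL


# Density contrast from the text: 2.88 g/cm^3 MCS in 2.65 g/cm^3 country rock.
delta_rho = (2.88 - 2.65) * 1000.0   # converted to kg/m^3
anomaly = bouguer_slab_mgal(delta_rho, 10_000.0)  # 10 km thick layer
print(f"Infinite-slab anomaly: {anomaly:.0f} mGal")
```

Because the ESRP is only about 80 km wide, the value from this slab approximation should be read as an upper bound on the contribution of the MCS.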

[31] The 10 km thick ESRP high-velocity midcrustal layer found by our image (Figures 12a and 12b) is consistent with the MCS model required by petrologic and chemical analysis. The petrologic models suggest that basaltic magma from an upper mantle source ascends into the crust to a level at which it reaches neutral buoyancy and forms a sill. Over time, rhyolitic magmas are extracted from the basaltic intrusion via remelting previous basalt sills, fractionation, residual liquid formation, and minor amounts of Archean crustal assimilation [Christiansen and McCurry, 2008; Leeman et al., 2008; Shervais et al., 2006]. Thus, the basaltic input of mass and heat is intimately related to the formation of the massive caldera eruptions [Annen et al., 2006]. Within a few Ma after a caldera eruption, the excess heat is conducted away and the resulting MCS is more dense, and rheologically stronger, with respect to the Archean country rock it has intruded [Anders and Sleep, 1992].

[32] Estimation of the total high-silica caldera eruptive volumes provides constraint with respect to the total MCS volume by assuming an extrusive to intrusive volume ratio. Caldera volume estimates are provided by several authors and may contain 50% errors due to the lack of caldera exposure in many cases [Bonnichsen et al., 2008; Christiansen, 2001; Leeman et al., 2008; Morgan and McIntosh, 2005]. Noteworthy is that the caldera volumes have progressively decreased in time since the 12.7 Ma Bruneau-Jarbidge field [Smith and Braile, 1994]. Using a basalt-rhyolite fractionation ratio of 2:1 [Bonnichsen et al., 2008], a 10 km thick MCS layer beneath the ESRP would be predicted. The composition of the midcrustal sill is predicted to be a ferro-basalt on the basis of seismic velocities and the volcanic rock fractionation trends [Bindeman et al., 2007; Christiansen and McCurry, 2008; Shervais et al., 2006]. This thickness estimate is probably a lower bound as the late stage basalts and rhyolitic domes require further magmatic inflation of the ESRP crust.

[33] One potential source of crustal thickness variations between the ESRP and its margins would be differential extension over the last 14 Ma of ESRP. Yet, no definitive evidence exists to suggest differential extension between the ESRP and its margins. This conclusion is supported by the lack of ESRP parallel bounding normal faults [McQuarrie and Rodgers, 1998] and the relatively uniform modern-day regional GPS crustal velocity field [Puskas et al., 2007]. Thus, assuming no significant differential extension between the ESRP and its margins, the addition of 10 km of basalt to the ESRP crust will thicken it by 10 km with respect to its margins. The crust expelled into the atmosphere during explosive eruptions is estimated to be <1 km [Perkins and Nash, 2002]. Yet, assuming a 10 km magmatic thickening of the ESRP crust, our crustal thickness map finds that the ESRP crust is no more than 3–5 km thicker with respect to the ESRP margins (Yuan et al., manuscript in preparation, 2008). Thus, we conclude that 5–7 km of ESRP crust is “missing” because of lower crustal outflow. The seismic velocity evidence for this lower crustal outflow is provided by the low-velocity lowermost crust that extends 50–80 km laterally from the ESRP margins (Figure 12b). The flow originates from the low-velocity lower crust beneath the Heise caldera field. In contrast, no evidence for lower crustal outflow from beneath the <2.1 Ma Yellowstone caldera fields is found probably because of their smaller volume and stronger surrounding crust [Lowry et al., 2000].
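
The crustal budget argument above reduces to a subtraction, sketched here with the values quoted in the text. The function name is hypothetical, and the sketch neglects the <1 km of crust estimated to have been expelled during explosive eruptions.

```python
def missing_crust_km(magmatic_thickening_km, observed_excess_km):
    """Crust unaccounted for after comparing added and observed thickness."""
    return magmatic_thickening_km - observed_excess_km


# Values from the text: 10 km of magmatic thickening versus an ESRP crust
# that is only 3-5 km thicker than the crust of its margins.
missing_min = missing_crust_km(10.0, 5.0)
missing_max = missing_crust_km(10.0, 3.0)
print(f"Missing crust: {missing_min:.0f}-{missing_max:.0f} km")
```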

[34] The lower crustal flow of 50–80 km over 5 Ma would be a 1.0–1.6 cm/a flow rate which is a reasonable rate given the high heat flow [Blackwell et al., 1991] and low-viscosity structure [Lowry et al., 2000] found along the ESRP. Analysis of the viscous relaxation from the Mb 7.1 Borah Peak earthquake finds a low-viscosity lower crust [Barrientos et al., 1987]. The driving force for the crustal outflow is provided by both the lateral density contrasts associated with the densified MCS layer and thickening of the ESRP lower crust. Yet, Figure 14d shows the outflow of the low-velocity crust from beneath the Heise field is spatially nonuniform. This is probably due to preexisting lower crustal compositional, hence strength, differences related to the variable long-term magmatic evolution of the crust [Foster et al., 2006].
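
The flow rate quoted above follows from a unit conversion of the outflow distance and duration, sketched below. The function and constant names are hypothetical; the distances and the 5 Ma duration are those given in the text.

```python
CM_PER_KM = 1.0e5
YEARS_PER_MA = 1.0e6


def flow_rate_cm_per_year(distance_km, duration_ma):
    """Average flow rate in cm/a for a distance in km covered in duration_ma Ma."""
    return distance_km * CM_PER_KM / (duration_ma * YEARS_PER_MA)


# Lower crustal outflow of 50-80 km over 5 Ma, as stated in the text.
rate_min = flow_rate_cm_per_year(50.0, 5.0)
rate_max = flow_rate_cm_per_year(80.0, 5.0)
print(f"Flow rate: {rate_min:.1f}-{rate_max:.1f} cm/a")
```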

[35] Beneath the most recent 0.6 Ma Yellowstone Caldera, low shear wave velocities (2.8 to 3.1 km/s) are found above 12 km consistent with previous refraction results [Smith et al., 1982]. The most significant difference between the Yellowstone and Heise caldera fields is the lower crustal velocity: beneath the Heise field the velocities are low, whereas beneath the Yellowstone caldera fields the velocities are high (Figures 12b and 12c). This finding suggests that magma does not stagnate to transfer its heat and fluids into the lower crust beneath the Yellowstone caldera fields. This lack of magma stagnation in the lower crust is consistent with the high-velocity, hence high-density, lower crust found beneath the Yellowstone caldera fields. This Yellowstone caldera high-velocity lower crust is spatially connected to the 7.x km/s layer found to the east beneath most of the northern Wyoming province crust [Gorman et al., 2002; Henstock et al., 1998; Snelson et al., 1998]. Noteworthy is that our new results extend the known extent of the 7.x km/s layer with respect to the previous Deep Probe refraction results.

[36] The mantle lithosphere thickness and thermal state beneath the ESRP and Yellowstone caldera fields are important to interpreting the magmatic petrologic and chemical data [Hanan et al., 2008; Leeman et al., 2008; Shervais et al., 2006]. In particular, the late stage ESRP basalt fields require that the Archean mantle lithosphere is being melted [Leeman, 1982; Leeman et al., 1985; Menzies et al., 1984] or at least that plume layer basaltic melts are geochemically equilibrating with this ancient mantle [Leeman, 1982; Menzies et al., 1983]. These petrologic models require that melt from the plume layer is transferring magmatic heat into the mantle lithosphere with modest amounts of conductive thermal diffusion that increases with time. A hot ESRP mantle lithosphere is suggested by the low shear velocities (4.0–4.2 km/s) of the ESRP mantle lithosphere (Figures 12a and 14e).

[37] The ESRP lithospheric thickness is found to be about 55–70 km with the velocity varying between 4.0 and 4.2 km/s. Given an average ESRP crustal thickness of 42 km, the ESRP mantle lithosphere is no thicker than 28 km. The variations in the ESRP mantle lithosphere thickness and velocity may be related to Rayleigh-Taylor instabilities at the lithosphere-asthenosphere boundary, yet our current seismic resolution does not warrant any interpretation of the ESRP mantle lid variations. Outside the ESRP and Yellowstone volcanic track, the mean lithospheric thickness is about 120 km (Figures 12 and 15). This thickness is consistent with xenolith studies from the eastern Montana kimberlite pipes [Carlson and Irving, 1994; Carlson et al., 2004; Dudas et al., 1987]. Remarkable is that the Wyoming craton mantle lithosphere to the east of −111° longitude is significantly faster (>4.6 km/s) with respect to the mantle lithosphere beneath SW Montana (4.4–4.6 km/s) (Figures 12a and 12b). Most likely this mantle lithosphere velocity difference reflects the low heat flow in the nonplume affected northern Wyoming craton.
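
The 28 km bound on the ESRP mantle lithosphere quoted above is the difference between the depth to the base of the lithosphere and the average crustal thickness. A minimal sketch with the values from the text (the function name is hypothetical):

```python
def mantle_lid_thickness_km(lithosphere_base_km, crustal_thickness_km):
    """Mantle lithosphere thickness as lithosphere base depth minus Moho depth."""
    return lithosphere_base_km - crustal_thickness_km


# Values from the text: lithosphere base at 55-70 km, average crust of 42 km.
lid_min = mantle_lid_thickness_km(55.0, 42.0)
lid_max = mantle_lid_thickness_km(70.0, 42.0)
print(f"ESRP mantle lithosphere: {lid_min:.0f}-{lid_max:.0f} km thick")
```

The upper value corresponds to the "no thicker than 28 km" statement; the lower value indicates how thin the lid may be where the lithosphere base is shallowest.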

[38] While the deeper plume conduit found by the body wave tomograms is not imaged by our work, our results do find that the plume conduit beneath Yellowstone Park is tilted to the NW above 150 km depth (Figure 12c). This finding is consistent with the NW plume tilt found by the body wave tomograms [Waite et al., 2006; Yuan and Dueker, 2005]. Downstream of the Yellowstone caldera, the plume conduit becomes a sheared plume layer as it is dragged to the SW by North American plate drift. As found by all previous Yellowstone tomograms [Schutt and Humphreys, 2004; Waite et al., 2006; Yuan and Dueker, 2005], the plume conduit does not spread laterally much at the base of the lithosphere [Ebinger and Sleep, 1998; Sleep, 1990]. For example, at 100 km depth beneath the Yellowstone caldera, the plume anomaly is about 100 km wide, while downstream at the 8–10 Ma Picabo caldera field, the plume is only 150 km wide at the same depth. We suggest the minor lateral spreading of this plume is related to its very small volume and heat flux [Schutt et al., 2008].

[39] The base of this low-velocity sheared plume layer beneath the ESRP is imaged at about 125 km depth. However, both vertical and lateral resolution decrease with depth and hence this depth to the base of the plume layer is tentative. To estimate the approximate plume volume flux, the horizontal extent of the ESRP portion of the sheared plume layer (150 km) is multiplied by the height (55 km) and a 20 km/Ma plate drift rate to yield 11,250 km³/Ma. This volumetric flux rate is close to an estimate (10,048 km³/Ma) produced by using the 80 km diameter of the plume conduit between 200 and 400 km depth from the body wave tomography [Yuan and Dueker, 2005], and assuming a 2 cm/a conduit upwelling rate. Thus, the Yellowstone plume has a flux rate that is very small with respect to the Hawaiian plume [Sleep, 1990] and should be considered a weak lukewarm upper mantle plume [Schutt and Dueker, 2008]. The reason the Yellowstone track volcanism is as voluminous as observed is that the ESRP lithosphere is both relatively thin and actively extending at about 2–4 mm/a [Puskas et al., 2007].

8. Conclusions

[40] The primary features found by our inversion of fundamental mode Rayleigh wave diffusive and ballistic dispersion data are shown in Figure 16. At 35 km depth, the high-velocity lower crust beneath the northern Wyoming province is labeled along with its notable absence beneath the Eocene age southern Absaroka volcanic field. Noteworthy is that this high-velocity 7.x crustal layer is found beneath the <2.1 Ma Yellowstone calderas. The low-velocity crustal outflow vectors from beneath the 4.0–6.6 Ma Heise caldera fields are labeled. Whether the isolated high-velocity anomaly beneath the Dillon Block is part of this 7.x layer is unknown at present, and the north trending low-velocity channel at −111° longitude has no obvious geologic explanation at present. In a cross section parallel to the ESRP (Figure 16b), the plume conduit found by the body wave tomograms is shown. The upward flow within this plume conduit is sheared to the SW by North American plate drift to create the sheared plume layer beneath the ESRP and Yellowstone caldera fields. The two primary crustal structures associated with the ESRP are the 10 km thick MCS extending along the length of the ESRP sampled by our data, and the low-velocity lower crust that is thickest beneath the Heise caldera field and thins to the SW toward the 10 Ma Picabo caldera field. Beneath the <2.1 Ma Yellowstone caldera fields, the low-velocity inverted volcanic cupola is found above 12 km depth.

Acknowledgments

[41] The NSF IRIS PASSCAL program is thanked for use of the broadband recorders. J. Stachnik thanks A. Ferris for helpful discussion and software collaboration. Figures were created using GMT [Wessel and Smith, 1998]. The data used were assembled from the IRIS Data Management Center. Constructive reviews by Anthony Lowry and Samantha Hansen improved the content and presentation of this manuscript.