Crust and Upper Mantle Structure Beneath the Eastern United States

The Eastern United States (EUS) has a complex geological history and hosts several seismic active regions. We investigate the subsurface structure beneath the broader EUS. To produce reliable images of the subsurface, we simultaneously invert smoothed P‐wave receiver functions, Rayleigh‐wave phase and group velocity measurements, and Bouguer gravity observations for the 3D shear‐wave speed. Using surface‐wave observations (3–250 s) and spatially smoothed receiver functions, our velocity models are robust, reliable, and rich in detail. The shear‐wave velocity models fit all three types of observations well. The resulting velocity model for the eastern U.S. shows thinner crust beneath New England, the east coast, and the Mississippi Embayment (ME). A relatively thicker crust was found beneath the stable North America craton. A relatively slower upper mantle was imaged beneath New England, the east coast, and western ME. A comparison of crust thickness derived from our model against four recent published models shows first‐order consistency. A relatively small upper mantle low‐speed region correlates with a published P‐wave analysis that has associated the anomaly with a 75 Ma kimberlite volcanic site in Kentucky. We also explored the relationship between the subsurface structure and seismicity in the eastern U.S. We found that earthquakes often locate near regions with seismic velocity variations, but not universally. Not all regions of significant subsurface wave speed changes are loci of seismicity. A weak correlation between upper mantle shear velocity and earthquake focal mechanism has been observed.

Though the EUS is naturally less seismically active than the western U.S., several intraplate seismic zones (see Figure 1), including the New Madrid seismic zone (NMSZ), the Wabash Valley seismic zone, the South Carolina seismic zone, the Central Virginia seismic zone, and the West Quebec seismic zone, show localized seismic activity in the east coast. Three great-to-major earthquakes (Magnitude 7-to-8) struck the NMSZ in 1811-1812 and caused severe damage (Hough et al., 2000;Johnston, 1996). Paleoseismic studies suggest that at least two large earthquakes occurred thousands of years prior to 1800 (Tuttle, 2002). Past destructive earthquakes and slow deformation (Newman, 1999) provoke debate over the seismic hazard assessment (Stein, 2007). The mechanism of stress concentration that is proposed to have caused these large events is unsolved. Some geodynamic models predicted a stress concentration in the seismogenic zone (e.g., Levandowski et al., 2016;Pollitz, 2001). However, the assumed structural models differ significantly due to poor images of the subsurface, especially in the crust. Detailed structure images beneath the EUS are required to answer outstanding questions, such as what factors control the occurrence of intraplate earthquakes.
The deployment of EarthScope USArray Transportable Array (TA) provided unprecedented station coverage in the EUS for more than 10 years. The TA has crept continuously across the U.S. and collected seismic observations at more than 2,000 unique locations. With a nearly uniform station spacing (∼70 km) and high-quality sensors, the array was designed to improve our understanding of the subsurface structure and has led to many seismic velocity models for the lithosphere structure of the EUS and surround regions (e.g., B. B. Yang et al., 2014;Biryol et al., 2016;Bollmann et al., 2019;Dong & Menke, 2017;Golos et al., 2018;Menke et al., 2018;Savage, 2021;Savage et al., 2017;Wagner et al., 2018). To better utilize the improved data coverage and compare the subsurface structure across a broad region, we investigate the seismic velocity variations beneath the EUS with a model parameterization that is suitable for TA stations.
Even with dense seismic networks, tightly constraining subsurface 3D geologic variations is a challenge. But the seismological community has worked for decades with less data to extract information valuable enough to steadily advance our understanding of the lithosphere. P-wave receiver functions processed from teleseismic P waves provide us a way to repeatedly sample subsurface structure beneath seismic stations as source-side effects are removed by deconvolution (e.g., Langston, 1979). To extract detailed subsurface structural parameters from receiver functions, an inversion is often used to estimate model parameters from receiver function observations. The inversion of receiver functions is nonunique (e.g., Ammon et al., 1990), but incorporating complementary observations has shown to ease the nonunique problem and improve the sensitivity (e.g., Chai et al., 2015;Chong et al., 2016;Julià et al., 2000Julià et al., , 2003Özalaybey et al., 1997;Sun et al., 2014). The construction of 3D models adds additional complexity. Different data sets may average 3D heterogeneity differently, making one-dimensional models that fit all the data difficult to construct. For example, 3D scattering in receiver functions can introduce hard-to-identify high-wavenumber artifacts during an inversion. An often used and generally successful approach to reduce these effects is to target smooth earth models that represent averages of the true structure. We used the receiver function smoothing/interpolation technique (Chai et al., 2015) to reduce the scattering noise in the data prior to the inversion to avoid mapping noise into artifacts. Simultaneous inversion using smoothed receiver functions has produced reliable images of subsurface seismic velocity variations (Chai et al., 2015).
We simultaneously inverted smoothed/interpolated P-wave receiver functions, Rayleigh-wave phase and group velocity measurements, and Bouguer gravity observations for subsurface structure beneath the EUS using linearized uniform cell-based 3D inversions. Compared with recent studies (e.g., Porter et al., 2016;Schmandt et al., 2015;Shen & Ritzwoller, 2016), we used a wider period range for surface-wave dispersion observations, included longer receiver function signals, and reduced scattering noise in receiver functions through spatially smoothing. We describe the procedures used to process each observation type in the next section. Details of the receiver function smoothing/interpolation technique are discussed in the following section and are followed with a description of the theory and processes for the simultaneous inversion. The fits to each type of observations are illustrated in a separate section to show an important component of the quality control applied to our results. We end with a discussion on the resulting 3D shear-wave velocity models that compares our results with recently published models and an investigation of the relationship between seismicity and subsurface material variations.

Data
We used three types of geophysical observations to image the subsurface structure including P-wave receiver functions, Rayleigh-wave dispersion measurements, and Bouguer gravity observations. The receiver functions were computed using observations from about 1,700 seismic stations operated within the broad EUS ( Figure S1 in Supporting Information S1).

Receiver Function Observations
P-wave receiver functions were processed using teleseismic waveforms recorded at TA and many other seismic networks (a complete list of networks can be found in Table S1). Seismic events (around 3,700) with body-wave magnitude m b larger than 5.7 during station operation times (prior June 2015) and at epicentral distances from 30° to 100° (see Figure S2 in Supporting Information S1) were selected to avoid P-waveform interference with upper mantle triplications and the core shadow zone. We downloaded three-component P-waveforms along with metadata for these events from the Data Management Center of Incorporated Research Institutions for Seismology (IRIS DMC). The seismograms were rotated to the radial-transverse-vertical system from the original north-east-vertical recording coordinates. P-wave receiver functions (Langston, 1979) were obtained by deconvolving the vertical component from the radial and transverse components using the time-domain iterative deconvolution algorithm (Ligorría & Ammon, 1999). Gaussian filters (Ammon, 1991) were used in the deconvolution process to limit the bandwidth of the receiver functions and preserve absolute amplitudes of receiver functions. Specifically, we computed receiver functions with a Gaussian filter of 1.0 and 2.5 that correspond roughly to pulse widths of 1.67 and 0.67 s at half the maximum, respectively. Similar to receiver functions obtained for the western US region (e.g., Chai et al., 2015), many receiver functions processed from even high-quality stations are quite noisy. In order to exclude noisy outliers, we used three programmable waveform selection criteria. Other waveform selection criteria have been used by previous studies (e.g., X. Yang et al., 2016), but visual examinations of our receiver function waveforms confirm that these three criteria are sufficient to remove problematic receiver functions from the data. First, receiver functions with signal-to-noise ratios (measured using a 170-s time window that ends 10 s earlier than the expected P arrival time as noise and a 130-s time window immediately after as signal) less than 10 were discarded. Second, since the deconvolution can be unstable, we reject receiver functions with a convolution fit less than 85%; the convolution fit is computed as a signal power ratio between the radial component and a convolved signal of receiver function and the vertical component. Even with these relatively strict criteria, occasionally, some problematic receiver functions (e.g., unusual large amplitude, wrong polarity, and low-frequency noise) could pass through and influence our observations. To identify these signals, we computed the signal difference of each receiver function with respect to the single-station averaged waveform to further clean up our receiver function data set and excluded receiver functions with a signal difference larger than 300% of the stacked waveform signal power. The distribution of receiver function signal differences follows the extreme value distribution for the EUS data set ( Figure S3 in Supporting Information S1), which suggests that a small portion of receiver functions is significantly different from the single-station averaged receiver functions. We obtained around 228,000 (38% of the raw receiver functions) acceptable receiver functions in total. The accepted receiver functions were then binned into three ray parameter bins (smaller than 0.05 s/km, larger than 0.07 s/km, and in between) for both Gaussian 1.0 and 2.5 receiver functions. Therefore, we have six receiver function waveforms at a station when data are abundant.

Surface-Wave Dispersion Observations
We used both Rayleigh wave group and phase velocity observations in the simultaneous inversion. We avoid Love waves to minimize complexity that may arise from anisotropy. Our results are thus an approximation of the potentially anisotropic Earth. Surface-wave dispersion observations from two studies were blended to combine the short-period dispersion values from Herrmann et al. (2021) and longer period dispersion observations from Ekström (2011). The short-period dispersion observations were measured using both earthquake signals and ambient noise cross correlations. The measured dispersion observations were localized to grid points through a surface-wave tomography (e.g., Ekström, 2011;Herrmann et al., 2021). The short-period dispersion model used a 1°-by-1° grid with a node at the center of each grid cell. Only the dispersion data corresponding to cells with good ray path coverage (at least 50 km of ray path) were used. We adopted the following formula from Maceira and Ammon (2009) for a smooth blending of the dispersion curves. In the expressions above, s(T) is the blended dispersion value (group or phase velocity). T is the period, s 1 (T) is a value from the long-period dispersion tomography (Ekström, 2011), and s 2 (T) is a value from the short-period dispersion tomography  at the same period. φ is a control parameter that is a function of T c and ε. ε is set equal to 0.5. T c is the period around which we want to switch from short-and long-period surfacewave observations. Since the available period range varies from location to location (due to ray path coverage differences), we used a grid search to determine the best transitional period (T c ). We tried a range of T c and computed the gradient of the resulting dispersion curves. In order to minimize artificial anomalies caused by the blending, the optimal T c is the one with a minimum gradient (the smoothest dispersion curve). All blended dispersion curves were visually examined and minor adjustments were made as needed. Generally, a dispersion-curve transition period between 30 and 40 s was chosen. The blended dispersion data set spans from 3 to 250 s in the best case. As an example, the blended dispersion curves are compared with recent dispersion models (Bensen et al., 2007;Ekström, 2011Ekström, , 2014Herrmann et al., 2021;Jin & Gaherty, 2015) in Figure S4 in Supporting Information S1. The blending of dispersion curves greatly extended the period range and is consistent with alternative recent dispersion models at most places.

Gravity Observations
Gravity observations were extracted from a global Bouguer gravity model WGM2012 (Balmino et al., 2012). This Bouguer gravity data set was computed by spherical harmonic analysis using ETOPO1 topography-bathymetry data and gravity observations from the EGM2008 global gravity model (Pavlis et al., 2012). The lateral resolution of the gravity data is 5 arc min (∼9 km), which is higher than what we can resolve with a 1°-by-1° grid in our inversion. The Bouguer gravity observations are averaged within a 1°-by-1° grid for the EUS to reduce gravity anomalies that are due to smaller scale structure perturbations. Long wavenumber gravity signals are primarily caused by deep density changes. However, the interpretation of deep gravity signals is nonunique and can be associated with density variations, plate flexure, and upper mantle dynamic effects. We wavenumber-filtered the gravity observations with a box-car filter so that the remaining gravity signals are mainly sensitive to the shallow density structure (upper ∼15 km). Initial gravity observations from WGM2012 and wavenumber-filtered gravity values are compared in Figure S5 in Supporting Information S1. The filtered gravity image shows fewer small-scale and large wavenumber variations than the raw image as expected.

Receiver-Function Smoothing/Interpolation
P-wave receiver functions and surface-wave dispersion observations have different spatial (lateral) sensitivity. Receiver functions are sensitive to sharp changes in vertical and lateral structure. In particular, large, complicated (and hopelessly aliased) variations near the surface often strongly influence P-wave receiver functions. Surfacewave dispersion observations are more sensitive to longer wavenumber variations in the structure. We smoothed the receiver-function wavefield spatially to reduce near-surface scattering effects and to better complement the surface-wave dispersion measurements. We have developed receiver function smoothing/interpolation to attack this issue by simplifying the receiver function wavefield (Chai et al., 2015). The smoothing/interpolation reduces the scattering noise on receiver function observations and isolates the spatially coherent components of the signals (see Figure 2, Figure S6 in Supporting Information S1). Compared with the traditional single-station averaged receiver functions, interpolated/smoothed receiver functions are more consistent laterally for different lag times.
A spatially smoothed receiver function is computed by averaging receiver functions recorded at adjacent stations with distance-dependent weights. Smoothing/interpolation also equalizes the lateral sensitivity of receiver functions and surface-wave dispersion measurements, which can reduce potential inconsistencies between surfacewave observations and body-wave receiver functions. Interpolation also provides us a way to approximate receiver functions at grid points where surface-wave observations are commonly estimated by tomography. This functionality is quite useful especially for 3D inversions using multiple types of observations. We can express the receiver function wavefield as R (x,y,t), where x and y are spatial coordinates (e.g., latitude and longitude), and t is the lag time after the direct P wave. Then, the interpolated receiver function wavefield is computed using the following formula repeatedly for all locations of interest (e.g., grid points) where r ij is the distance between the point of interest (x i , y i ) and station location (x j , y j ), and n is total number of stations. d 1 and d 2 (d 1 < d 2 ) are two distance parameters that control the distance range for the stations included in the averaging and the weight (w j ) of the different stations. Receiver functions recorded at stations at distances less than or equal to d 1 are weighted as 1. Stations located at distances between d 1 and d 2 are weighted between 0 and 1 with the weight linearly decreased from 1 at distance d 1 to 0 at distance d 2 . Chai et al. (2015) compared smoothed receiver functions computed using several pairs of distance parameters for different seismic network configurations in the western U.S. In our analysis, we choose a d 1 of 110 km and a d 2 of 160 km for the EUS to match the 1° grid used for surface-wave dispersion tomography. By spatially smoothing the receiver function wavefield, we sacrifice a spatial resolution for simplicity and better average properties, but we may blur some geological transitions and boundaries. The resulting smoothed receiver functions are more complementary to the surface-wave dispersion data and comprise a data set more consistent with the other observations used in the joint inversion and with less near-surface scattering effects that can complicate the inverse models.

Simultaneous Inversion
To estimate the subsurface shear-wave speeds beneath the EUS, we simultaneously inverted the smoothed/interpolated receiver function wavefield, Rayleigh wave group and phase velocities in the period range from 3 to 250 s, and wavenumber-filtered Bouguer gravity observations. The study region was divided into 950 1°-by-1° size cells for the EUS so the best lateral resolution of our model is 110 km by 110 km. The resulting 3D grid was used for our simultaneous inversions in a hybrid 1D-3D manner. Receiver function and surface-wave dispersion calculations were performed with a 1D formalism for each cell across the region where observations are available. Gravity calculations were computed in 3D using rectangular prisms. The lateral dimension of the prisms is the same as the grid size whereas the vertical dimension varies from 1 km near the surface to 50 km in the mantle. A linearized discrete geophysical inversion developed from (Chai et al., 2015;Julià et al., 2000;Maceira & Ammon, 2009) was used with smoothness-based stabilization. A jumping strategy is used for regularization to allow constraints directly on model parameters and not changes in model parameters (e.g., Constable et al., 1987). The linearized inversion can be expressed as where D s , D r , and D g are matrices containing the partial derivatives of the seismic shear velocity model corresponding to the dispersion, receiver function, and gravity estimates, respectively. m 0 is the 3D velocity model from previous iteration. m a is the a priori velocity model. m 1 is the updated velocity model for the current iteration. R s , R r , and R g are the corresponding data residual vectors, and 2 , 2 , and 2 are global weights assigned to the three data sets. Those weights are defined in the same way as Julià et al. (2000) as 2 where N is the number of data points for the specific data set and 2 is the kth data point variance. p(T) was introduced by Maceira and Ammon (2009) to control the trade-off between fitting both gravity and surface-wave dispersion measurements. The matrix Δ applies vertical smoothing with a weight η to make the 1D velocity profiles vary smoothly-necessary when data constraints are not sufficient. W is a diagonal matrix to constrain velocities from varying too far from the a priori values in m a with associated weights. Lateral smoothing is added through the matrix S using the first differences between shear velocity values in adjacent grid cells. The lateral smoothing does not smooth across the ocean-continent boundary to allow a sharp change in material properties across this well-defined (for our cell size) feature. The inverse equation is solved using a Conjugate-Gradient LSQR solver for sparse linear equations (Paige & Saunders, 1982). The weights and smoothing parameters are determined by performing suites of inversions and comparing data misfits and model properties, such as roughness. Interactive visualization tools were used to help the comparison of inversion results with different weighting parameters (Chai et al., 2018). The vertical smoothness constraints increase with depth to reflect a loss in data resolution and the increase in surface wavelengths sampling the greater depth in the model. Partial derivatives for surface-wave dispersion were computed using finite-difference approximations. The simulation of surface-wave dispersion is based on algorithms from Saito (1988). The gravity derivatives were computed using the equations from Plouff (1976) and the chain rule. The inversion is performed to estimate shear-wave speed, which is related to P-wave speed by the V p / V s ratio, and to the density using formulas described in Maceira and Ammon (2009) and Chai et al. (2015). The V p /V s ratio was fixed throughout the inversion (inherited from the initial model).
The inversion for the EUS started with an initial model that was based on Crust Model 1.0 (Laske et al., 2013). Velocities below the crust were initialized with the AK135 velocity model (Kennett et al., 1995). Since flat-Earth codes were used to compute dispersion, the initial model was flattened using the formulas from Biswas (1972) with a slightly modified density transformation exponent (see Supporting Information of Chai et al., 2015 for details). Velocities and densities were unflattened prior to the gravity partial calculation and flattened after the gravity computation since the density-shear velocity relationship is based on laboratory measurements (corresponding to the spherical model).
Based on the results from hundreds of 3D inversions (multiple cells) and thousands of 1D inversions (single cell), it became clear that tuning the weights for each data set and constraints is necessary to produce good fits to the observations. We experimented with weight selection using numerous 1D inversions with observations selected from several samples in the overall data set. We searched for optimal ranges of weight parameters by randomly testing 1D inversion parameters and examining the range of velocity models that resulted from the simulation. Within this suite of velocity models, some fit the data poorly. For many choices of the weights, the results were quite similar-those that fit the observations well often differed from the average model by less than 0.1 km/s. The tests suggest that we can use data fits as a guide to select both the data and smoothness weights for the larger 3D inversions. To be sure that the resulting model fits the observations reasonably well, we used interactive visualization tools (Chai et al., 2018) to visually check data fits for all three types of observations.
The EUS results were obtained after eight total iterations of the 3D inversion ( Figure S7 in Supporting Information S1). We began the inversion without gravity observations and modeled the seismic data for a total of four iterations. We then added the gravity observations for the final four iterations. The final EUS model has been unflattened for interpretation and easy comparison with published spherical Earth models.

Data Fits
Fitting data is the most important quality control for our inversion. We discuss not only the overall data fit of the three types of observations we use, but also the spatial distribution of data misfits. Fit to surface-wave dispersion and receiver functions is significantly improved in the first four iterations and is stable for the last four iterations ( Figure S7 in Supporting Information S1). Gravity misfit was reduced greatly after one iteration. In general, our inverted model fits all three types of observations much better than the starting model. Figure 3 shows the receiver function and surface-wave dispersion misfit distributions of all cells for the initial and final models. In all cases, an improvement in fit is clear. We use a 95% threshold to quantify the overall misfit. The 95% misfit threshold represents the fractional misfit value equal to or larger than that found for 95% of the cells. The metric is used to eliminate the influence of the worst fitting 5% of the cells (which are often in the ocean, where data coverage is substantially more limited). As shown in Figure 3, the 95% misfit threshold was improved from 15.5% (initial model) to 6.5% (inverted model) for group velocity measurements, from 8% (initial model) to 2.5% (inverted model) for phase velocity measurements, from 102% (initial model) to 33% (inverted model) for Gaussian 1.0 receiver functions, and from 132% (initial model) to 54% (inverted model) for Gaussian 2.5 receiver functions. The modest reduction in dispersion misfits reflects the relatively small range of speeds within the data. Any model with a roughly correct average dispersion value performs reasonably well with this metric. As expected, phase velocity measurements are fit better than group velocity measurements. Narrowband receiver functions were fit better than broadband receiver functions.

Receiver Function Misfits
In Figure S8 in Supporting Information S1, we show the spatial distribution (at available grid points) of receiver function misfits for both the initial model and the inverted (final) model. Receiver function misfits computed from the initial model are quite large at most grid points. The misfit shown in the maps (Figures S8a and S8b in Supporting Information S1) is the average of those corresponding to all available (up to six) receiver functions. For example, at the location corresponding to Figure S8c in Supporting Information S1, the misfit (represented as the circle size in Figures S8a and S8b in Supporting Information S1) is computed from six receiver functions (three ray parameter bins and two Gaussian widths). Comparing the observed and predicted receiver functions in Figure S8c in Supporting Information S1, we found that the converted phase and multiples arrive later in initialmodel receiver functions. Receiver functions computed from the inverted model agree well with observations. At many grid points, the receiver functions computed with the initial model differ significantly from the observations, while the inverted models fit the observations nicely (see Figure S8d in Supporting Information S1 for an example). Receiver function misfits corresponding to the inverted model are uniformly small with slightly larger values in Gulf of Mexico coastal regions and the Williston basin where thick sediments complicate receiver function waveforms. The relatively large misfits in eastern Tennessee and western North Carolina may be due to anisotropy or less-optimal V p /V s ratios.

Surface-Wave Dispersion Misfits
The spatial distribution of surface-wave dispersion misfits (an average of group and phase speed misfits) is shown in Figure S9 in Supporting Information S1. Group and phase velocities computed from the initial model differ significantly from surface-wave dispersion observations for the most model grid points as indicated by cells with large circles in Figure S9a in Supporting Information S1. For example, we compared observed and simulated dispersion curves at a grid point located in northeast Mississippi in Figure S9c in Supporting Information S1. Predicted dispersion curves based on the inverted model agree well with the observations, while simulated group and phase velocities from the initial model are too small at most periods. Even for regions that the initial model-derived surface-wave dispersion curves are similar to observations (see Figure S9d in Supporting Information S1 for an example), the inverted model predicts the surface-wave measurements better, especially at short periods (less than 20 s). As shown in Figure S9c in Supporting Information S1, surface-wave dispersion misfits corresponding to the inverted model are much smaller than those from the initial model except for a few off-coast locations. Large misfits at these off-coast grid points (at the gulf coast of Florida and near northern New Jersey) are likely due to limited observations and a poor starting model.

Gravity Misfits
Gravity observations are much fewer in number compared to receiver functions and surface-wave dispersion. For this reason, we use a smaller weight for gravity observations and include gravity data at a later stage of the inversion. The predicted gravity values agree with Bouguer gravity observations well ( Figure S10 in Supporting Information S1). Since the gravity calculations are performed in 3D, we used a buffer zone (gray color-filled regions in Figure S10 in Supporting Information S1) to avoid edge effects. In general, our goal with the gravity data is to fit the relatively high-wavenumber features that are likely associated with crustal density variations. The addition of gravity to the inversion does not introduce significant changes to the model. However, these changes are necessary to fit the gravity observations. Numerical tests showed that small changes of roughly 0.1 km/s to the upper 15 km account for an improved fit to the gravity. We only fit the first-order features in gravity observations to avoid overfitting.

Results
Representative shear-velocity depth slices are shown in Figure 4. Maps in Figures 4a and 4b are a plot of the average speed within the specified depth range. At shallow depth (0-5 km, Figure 4a), low velocities correlate well with major sedimentary basins and the coastal plains. Low-velocity layers near the surface are thick along the coastal plains and within the basins of the Great Plains compared with the Michigan, Illinois, and Appalachian basins. Average lower upper-crustal shear velocities are lower beneath the Mid-continent rift, which interrupts the relatively fast shallow crusts of Wisconsin and northern Minnesota and the Canadian Shield regions to the north and continues southward through southern Minnesota and Iowa. There is a hint (solid ellipse in Figure 4b) of a sharper transition along the northern (western) Mississippi Embayment (ME) boundary than along the east. Seismicity extends from the lower shear-wave speeds into a region of more normal speeds. On average, the Appalachian region and the Midwest appear slightly faster than the region to the west that includes Wisconsin, Iowa, Nebraska, Kansas, and Oklahoma. The fast region in the northwest lies beneath the Williston Basin and is not well resolved, but the model, driven by dispersion measurements at these depths, suggests perhaps an unusually fast middle crust beneath the basin (Figure 4b). Two East Coast regions, the Carolinas and southern Virginia, and Maine appear slower than the regions immediately west in the middle crust ( Figure 4b).
Figure 4c is a map of the shear-velocity model at 37 km depth. This depth was chosen to provide the first-order information on crustal thickness-in areas of thin crust, we see mantle-like speeds (dark blue colors, V s larger than 4.4 km/s), while in many other areas, we simply see the lower crust. The map identifies the southern ME and coastal regions as regions of unusually thin crust. Regions of relatively fast deep crust are suggested in the eastern Dakotas, beneath Michigan, and parts of Canada and New England (see Figure 5 for crust thickness variations). We discuss specific estimates of crustal thickness later. Figure 4d is a map of the shear-wave model at a depth of   (Figures 5a-5d and 5f-5i) show detailed velocity changes as a function of depth. To define crustal thickness, we measured the depth corresponding to P-wave velocity larger than 7.8 km/s (based on visual inspections of velocity profiles). Only a few cells failed to reach this speed before a depth of 53 km. The automatically measured crustal thicknesses are shown as dashed gray lines in Figures 5a-5d and 5f-5i, which match the crust-mantle transition well visually at all presented locations.
Our smoothing approach to receiver functions and reliance on surface-wave dispersion tomography to constrain the deeper features in the model increase the consequences of the assumption of sharp features, such as the crust-mantle boundary in the initial model (Crust 1.0) on the location of the crust-mantle transition. However, as shown earlier, even when the model is good, slight adjustments are made to improve the alignment of the converted phases originating from the crust-mantle boundary region. Although we did not perform a rigorous analysis, our examination of the model suggests that when the constraints used to build Crust 1.0 were based on good nearby data, little adjustment in crustal thickness was needed by our inversion. In regions where we believe that Crust 1.0 relied heavily on interpolation, the inversion changes from our initial model were larger. Although adjustments are generally not large, typically 3 km, our model is almost systematically thinner than Crust 1.0 by a few kilometers (both models share the same V p /V s ratio).
Crustal thickness patterns for eastern North America are familiar. The crust is thinner near the coast and thicker landward. The average crustal thickness in the region is 43 km with a standard deviation of 5 km. The relatively thinner crust (<43 km) is also imaged beneath eastern North Dakota. The crust beneath the ME and into east Texas appears significantly thinner (<35 km) than the continental interior. Thicker crust (>43 km) is found beneath the Appalachian Mountains and the Great Plains. The depth of the crustal-mantle transition is less well constrained beneath thick sedimentary basins due to the dominance of basin reverberations in the receiver functions. One potential application of our model is to form the basis of a wavefield downward continuation and decomposition (Chai et al., 2017;Langston, 2011) to extract receiver function signals generated from the crust-mantle transition from teleseismic P-wave seismograms, assuming that the shallow structure is known. We leave that analysis for future efforts.
We compared our crustal thickness measurements with several published models (Crotwell & Owens, 2005;Porter et al., 2016;Schmandt et al., 2015;Shen & Ritzwoller, 2016). The EARS project (Crotwell & Owens, 2005) measured crustal thickness from automatically computed P-wave receiver functions using the H-k stacking method (Zhu & Kanamori, 2000) by assuming a constant velocity in the crust. The original results may suffer from scattering noise in receiver functions (e.g., Chai et al., 2015). We spatially smoothed the measured crustal thickness from EARS to show first-order features (see Figure S11 in Supporting Information S1). Comparing our crustal thickness map (Figure 5a) with four recently published models ( Figure S12 in Supporting Information S1), we note that a thinner crust was imaged along the east coast and beneath the ME for all models. However, notable differences exist among these models. The average crustal thicknesses and the standard deviations from the Shen-2016 model (Shen & Ritzwoller, 2016), the Schmandt-2014 model (Schmandt et al., 2015), and our model are the same. The EARS model (Crotwell & Owens, 2005) may underestimate crustal thicknesses due to oversimplified model parameterization (constant-velocity crustal layer). The Porter-2015 model (Porter et al., 2016) used the EARS results to constrain crustal thickness and shared the same issue. The common conversion point stacking procedure (Zhu, 2000) used for the Schmandt-2014 model reduced some scattering but had difficulty in challenging regions (e.g., sedimentary basins) where stacking energy is not focused. To a large degree, the Schmandt-2014 model, the Shen-2016, and our model are consistent. Small differences exist between the Shen-2016 and our model (see Figure S13 in Supporting Information S1). The EARS model is certainly suitable for a starting model. We believe that our model is less contaminated by scattering in receiver functions and includes simultaneously modeled constraints from Rayleigh waves (similar to the Shen-2016 model) and near-surface gravity.
Figures 6a and 6b are maps of the average crustal and upper-mantle shear-wave speed, respectively. The mean of the average crustal V s velocity across the model continental grid points is 3.7 km/s (standard deviation 0.1 km/s). The corresponding mean crustal V p velocity is 6.5 km/s (standard deviation 0.2 km/s, see Figure S14 in Supporting Information S1). These mean values agree reasonably well with the global compilation of Christensen and Mooney (1995). Relatively fast average crustal velocity is found beneath the stable North America craton. Regions of very slow average crustal shear-wave speeds are associated with thick sediments in the southern ME. The mean uppermost mantle shear-wave speed is the average of the upper 60 km of the mantle and it is 4.50 km/s (standard deviation 0.05 km/s). The average uppermost mantle P-wave speed in the study area is 8.1 km/s (standard deviation 0.1 km/s), which is also consistent with the global compilation (Christensen & Mooney, 1995) and regional surveys (Li et al., 2007). Note these P-wave speeds were derived from shear-wave speed using V p / V s ratios inherited from Crust 1.0. The uppermost mantle shear-wave velocity is slowest beneath New Mexico and the associated southern Basin and Range province. We imaged relatively slow average shear-wave speeds in the uppermost mantle beneath New England and the southeastern US, regions that may have interacted with Great Meteor (e.g., Eaton & Frederiksen, 2007) and Bermuda (e.g., Cox & Van Arsdale, 2002) hotspots, respectively. Slow speeds along the east coast seem to follow the trend of eastern North America rift basins but extend further into northern New England than the surface expression of rifting. A slightly slower uppermost mantle is imaged beneath the western ME. Localized low-average mantle speeds (dashed ellipses in Figure 6b) near the Kansas-Nebraska border and the Kentucky-Ohio border are situated near kimberlite volcanic site and may represent modified lithosphere associated with these volcanic structures. The relatively slow upper mantle weakly correlates with the region of thrust faulting along the eastern margin of the U.S. and within southeast Canada. The thrust faulting environment also correlates with the thin crust for much of the region, but not northern New York or southeastern Canada ( Figure S15 in Supporting Information S1).
We do not see any evidence for slower mantle resulting from lithospheric delamination and upwelling that is evident in geochemical observations (Mazza et al., 2014). Thus, if lithospheric delamination and mantle upwelling occurred as the rocks suggest, the feature must be quite localized. The West Virginia-Virginia border overlies a relatively abrupt transition in crustal thickness. The crust is relatively flat toward the Appalachian interior from west to east but begins to thin near the area containing the kimberlite intrusions (see Figure 6b for the locations). More evidence for a significant change in lithospheric structure across the region is the difference in mid-crustal speed, which decreases by about 0.2-0.3 km/s from the Appalachians eastward (see Figure 4b). Although the transition is smoother, the upper mantle speed also decreases toward the east in the same region. The differences in structure suggest interesting geologic differences throughout the lithosphere from the Appalachians to central and eastern Virginia.
In Figure S16 in Supporting Information S1, we merged the EUS model with the published model in the western U.S. (Chai et al., 2015). The smooth transition from west to east indicates that our EUS model is compatible with the published western U.S. model. The anomalous upper mantle beneath the western ME, the east coast, and New England is modest compared with that beneath the Basin and Range province in the western U.S.
An automated cluster analysis (Chai et al., 2015) was used to group the 1D shear velocity profiles into clusters of similar Earth structure. The spatial distribution of the clusters using shear velocities between the depth range of 5 and 200 km is shown in Figure S17 in Supporting Information S1. The clustered velocity models are shown in Figure S18 in Supporting Information S1. The crust is thinner beneath western ME, southern Basin and Range region, southern Interior Plains, and Atlantic Plain. A thicker crust was imaged beneath western Interior Plains, Appalachian Highlands, Interior Highlands, and central Interior Plains. The southeastern Canadian Shield shows a slighter thinner crust. The upper mantle is faster beneath the stable North America craton. The seismicity in the region does not show a simple correlation to the subsurface seismic speed variation on the broad image in Figure S18 in Supporting Information S1. The clusters obtained with shear velocities between the depth range of 0 and 20 km is shown in Figures S19 and S20 in Supporting Information S1. To a certain extent, the distribution of clusters ( Figure S19 in Supporting Information S1) can be used as a guide to the choice of a local 1D velocity model for initial earthquake location or moment tensor inversion. For instance, the same 1D velocity model may be used for the region specified by the same cluster. We use more detailed cross sections to explore the relationship between the spatial distribution of earthquakes and subsurface structural variations.

Discussion
Large-scale regional lithospheric models contain many features that are often described in a list of short commentary related to geology. Many of the structural features that we observed have been pointed out in earlier studies (e.g., Chen et al., 2016;Porter et al., 2016;Schmandt et al., 2015;Shen & Ritzwoller, 2016); we noted several of these in the Results section. One aspect not covered in other papers is the fact that a detailed crustal model provides an opportunity to explore the relationship(s) between seismicity and subsurface structure (if any exist). We use the cross sections defined in Figure 1 to review the 3D model in detail and to explore potential relationships between structure and seismicity. The cross sections were chosen to cover most of the seismically active regions. Despite intensive studies, the cause of localized seismicity in the region remains poorly understood. Many ideas have been proposed to explain the intraplate seismicity of the EUS, such as stress concentration in regions of crustal weakness or changes in lithospheric structure (Grana & Richardson, 1996;Grollimund & Zoback, 2001;Kenner, 2000;Levandowski et al., 2016;Pollitz, 2001). Numerical models have shown that crustal deformation can be induced at shallow depth when a dense anomaly resides in the crust (Levandowski et al., 2016;Pollitz, 2001;Zhan et al., 2016) above a relatively weak upper mantle. The depth and cause of a weak zone are still under debate (Chen et al., 2014;Pollitz & Mooney, 2014). Since seismicity in Oklahoma may be related to human activity, we skip that region (see Chai et al., 2021 for a detailed study of subsurface structure in that region). But note that ultimately, the Oklahoma activity is a result of the geologic structure associated with the hydrocarbon-rich basins in the area. The EUS model contains no significant features deep beneath the Oklahoma seismic activity. Two regions of particular interest are the northern ME and the Appalachian Mountains. We discussed some of the seismicity patterns and structure relationships for New Madrid in the Results Section 6.
The cross-section A1-A2 (Figure 7) passes through seismically active regions in Oklahoma, the NMSZ, and the Eastern Tennessee seismic zone (ETSZ). The NMSZ is underlain by a relatively fast lower crust, which has been interpreted as a mafic intrusion (Catchings, 1999;Mooney et al., 1983). Reconstruction of the feature details is difficult with the coarse 100-km sampling, but the broader scale model shows the slower mantle in comparison with regions to the east and west. Numerical tests and Rayleigh-wave sensitivity kernels suggest that our resolution begins to degrade at about 150 km ( Figure S21 in Supporting Information S1), so we do not image the entire feature. Nyamwandha et al. (2016) suggested that a Cretaceous thermal event producing these anomalies is associated with upwelling fluids from the Farallon Slab along with an already weakened and thinned lithosphere as a result of interaction with the Bermuda Hot Spot roughly 80-100 Ma (Cox & Van Arsdale, 2002). The crustal thickness is maximum beneath the Valley and Ridge and decreases by about 10 km at the coastal region to the east. Although not associated with seismicity, the cross section indicates relatively higher mid-crust speeds beneath the Piedmont region of North Carolina.
Seismicity in the ETSZ occurs on basement faults that have no surface expression (Steltenpohl et al., 2010). Seismicity is located in the Valley and Ridge Province west of the highest Appalachian elevations. A recent local earthquake tomography study imaged the existence of the buried fault (Powell et al., 2014) as a low-velocity anomaly. The lateral extent of the low anomaly is beyond the resolution of our velocity model. Our model contains a velocity change in the lower crust beneath the ETSZ (ellipse in Figure 7), which may be associated with the transition from the Granite-Rhyolite basement (west) to the Grenville basement (east) (Fisher et al., 2010). Upper mantle shear velocity beneath the region is 2%-3% faster than the mantle beneath the NMSZ at depths of about 100 km.
The northwest-southeast-striking cross-section B1-B2 (Figure 8) shows apparently thinned crust beneath western North and South Dakota (solid ellipse) which is consistent with Thurner et al. (2015) who argued that the apparently thin crust is actually a very fast underplated crust in the vicinity of the 2 Ga Trans-Hudson Orogen. The model at what could be considered uppermost mantle depths is relatively slow, so it is plausible to interpret the material as fast crust. A region of apparently thin crust in southeastern North Dakota is more perplexing. Thurner et al. (2015) estimated a crustal thickness of 30-35 km in this region. Our apparent crustal thickness is slightly Black circles are events with depth uncertainties less than 5 km. Gray circles represent earthquakes with larger depth uncertainties or without uncertainties. Note the image is vertically exaggerated.
larger, that is, about 38-40 km. To be fair, reverberations in the thin surface sedimentary cover interfere with the Ps arrival from the apparent crust-mantle boundary, which may affect an accurate estimation of the crust-mantle boundary. The multiples in receiver functions from the apparent crust-mantle boundary are free of near-surface interference, which provide some constraints on the boundary. The P-wave speeds in the apparent lower crust are in the 6.6-7.0 km/s range, not unusual, and the values increase to 7.8 sharply and reach values of 8.0-8.1 km/s by a depth of 43 km. The region has relatively low heat flow (50-60 mW/m 2 , Blackwell et al., 2011) though the data coverage is sparse. The apparently thinned crust may be a result of eclogite facies mafic material in the lowermost crust rendering the petrologic crust-mantle boundary seismically transparent (Furlong & Fountain, 1986;Griffin & O'Reilly, 1987). In cratonic areas, eclogite facies mafic material in the lowermost crust can have a P-wave speed larger than 8 km/s, which can lead to bias in the estimation of the crust-mantle boundary. Near the South Dakota Iowa border, the mid-crustal shear-wave speed decreases relatively abruptly. The relatively slow mid-crust (see also in Visualization S1) from Iowa through Missouri may be related to a batholith inferred from the Missouri gravity low (Hildenbrand et al., 1996). The structure from Iowa through Missouri is relatively uniform until interrupted by the relatively fast lower crust beneath the NMSZ. The upper mantle beneath Iowa and Missouri is relatively fast and uniform across this region of relatively slow middle crust. To the southeast of the NMSZ, the middle crust again is relatively slow and the crust begins to thin (dashed ellipse in Figure 8) and the upper crust speed decreases near the Alabama-Georgia Border. The upper mantle below 100 km, from the NMSZ to the southeast, is one of the slowest profiles in the model and includes the region believed to have been crossed by the Bermuda Hot Spot (Chu et al., 2013;Cox & Van Arsdale, 2002).
Additional cross sections can be found in Figures S22-S25 in Supporting Information S1 and discussed in Text S1 in Supporting Information S1. The inverted 3D seismic velocity model for the EUS is available in Data Set S1. The 3D seismic velocity model for the western United States from Chai et al. (2015) is provided in Data Set S2.  We also provide an interactive tool (Chai et al., 2018) to easily view the 3D model with depth slides and depth profiles side by side for both the EUS (Visualization S1) and the western United States (Visualization S2).

Conclusions
Using spatially smoothed P-wave receiver functions, surface-wave dispersion, and Bouguer gravity observations, we construct 3D shear-wave velocity models in the EUS. The average crustal thickness of the EUS model is 43 km; the average crustal shear speed is 3.7 km/s; and the average uppermost mantle shear-wave velocity in the model is 4.5 km/s. We imaged the thinner crust beneath New England, the East Coast, and the ME. The relative slow average shear-wave speeds in the mantle beneath New England and the southeastern US may be linked to hotspots. Compared to a compilation of basement age (Lund et al., 2015), regions with thin crust were formed after 670 Ma and thicker crust in the cratonic region formed before 1,000 Ma.
A comparison of seismicity and subsurface shear velocity suggests that often, but not universally, earthquakes locate near regions with seismic velocity variation. However, not all regions of significant subsurface seismic speed changes are loci of seismicity. The eastern seaboard mantle appears slow, consistent with coarser, but deeper sampling models that have been used as a basis for estimating dynamic topographic changes along the eastern seaboard (Rowley et al., 2013). A weak correlation between upper mantle shear velocity and earthquake focal mechanism has been observed. A relatively small upper mantle low-speed region in eastern Kentucky and southwestern Ohio correlates with the area of perturbed upper mantle P-waves analyzed by Chu et al. (2013), which they associated with the circa 75 Ma kimberlite volcanic site near Elliot Kentucky. The northern ME and in particular the region of the large earthquakes in 1811-12 appear to be underlain by a relatively fast lower crust and a relatively slow uppermost mantle. Levandowski et al. (2016) suggested that such a structure can focus stress in the upper crust, and our model is consistent with the idea. (DOE), Office of Fossil Energy, Carbon Storage Program through the Science-informed Machine Learning for Accelerating Real-Time Decisions in Subsurface Applications (SMART) Initiative. The facilities of IRIS Data Services, and specifically the IRIS Data Management Center, were used for access the waveforms, related metadata, and/or derived products used in this study. See Table S1 for a full list of seismic networks used in this study. IRIS Data Services are funded through the Seismological Facilities for the Advancement of Geoscience and EarthScope (SAGE) Proposal of the National Science Foundation under Cooperative Agreement EAR-1261681. Earthquake catalogs from U.S. Geological Survey (https://earthquake.usgs.gov/earth-