Distributed Extension Across the Ethiopian Rift and Plateau Illuminated by Joint Inversion of Surface Waves and Scattered Body Waves

The East African Rift System provides a rare location in which to observe a wide scope of rifting states. Well‐defined active narrow rifting in the Main Ethiopian Rift (MER) transitions to incipient extension and eventually pre‐rifted lithosphere through the northwestern flank of the Ethiopian Plateau (EP). Although the MER is well studied, the off‐axis region has received less attention. We develop Rayleigh wave phase velocity maps, Ps receiver functions, and H‐κ stack surfaces, and jointly invert these data using a trans‐dimensional, hierarchical Bayesian inversion algorithm to create shear velocity profiles across the MER and EP. All shear velocities observed are slower than the PREM global average, a reflection of the elevated temperatures that persist from plume impingement. In the EP, we find a shallow mantle slow shear velocity lineament parallel to the MER axis, amidst otherwise faster shear velocities. The crust is shallow in the MER, and also in the northwestern‐most EP flank. Thicker crust found elsewhere throughout the plateau is caused by crustal underplating and flood basalt emplacement. Shear velocities more reduced than the already low regional average, in concert with surficial volcanic features, geodetic observations, and slow P‐ and S‐wave anomalies, support off‐axis extension in the Ethiopian plateau, requiring reevaluation of the localization of continental breakup in the narrow MER.

The scope of previous geophysical studies has been primarily constrained to the MER axis, Afar, and the nearaxis portions of the EP. However, geodetic data suggests that ∼20% -30% of present-day extension does not occur within the magmatic segments of the MER (C. Ebinger & Casey, 2001) and may be distributed across the EP (Birhanu et al., 2016). While seismicity supports continuing extension along the border faults (Illsley-Kemp et al., 2018;La Rosa et al., 2021;Lapins et al., 2020), westward distribution of seismicity (Keir et al., 2009;Lamessa et al., 2019;Muluneh et al., 2020), volcanism (Kieffer et al., 2004), small-scale GPS movements (D. Stamps et al., 2014;D. S. Stamps et al., 2018), and low shear velocities (Chambers et al., 2019(Chambers et al., , 2021 in the EP support a degree of off-rift extension, which begs the questions: where does this off-axis extension occur, and how is extension accommodated at depth? In order to answer these questions, we interrogate the shear velocity structure of the lower crust and upper mantle in the EP and northwestern Ethiopian Plateau flank (NWEP) using a recently deployed broadband seismic array in conjunction with legacy data (Figure 1a). We develop Rayleigh wave phase velocity maps, Ps receiver functions, and H-κ stack surfaces from teleseisms, and jointly invert these complementary data types using a trans-dimensional hierarchical Bayesian inversion algorithm . We solve for 1-D shear velocity profiles beneath the EP, NWEP, and within the MER, and use these to support the existence of off-axis extension. We interpret shear velocity structure in the context of previous studies (e.g., Dugda et al., 2007;Gallacher et al., 2016;Weeraratne et al., 2003) and thermal structure associated with rifting.

Geological and Tectonic Background
Ethiopia has a complex tectonic and volcanic history that has shaped the evolution of the MER and EP (Figure 1b). Formed within the Neoproterozoic orogenic Mozambique Belt (Abbate et al., 2015;Asrat et al., 2001;Corti, 2009;Meert & Lieberman, 2008), Ethiopia preserves the Proterozoic-aged Himalayan style collision zone (Burke & Sengör, 1986;Shackleton, 1986) between East and West Gondwana, which scarred the basement with numerous sutures (Berhe, 1990;Stern et al., 1990;Vail, 1985), and Precambrian aged fractures . Some of these structures were reactivated in the Paleozoic-Mesozoic, creating ultimately aborted rifts (C. Ebinger et al., 2000;Corti, 2009;Mège & Korme, 2004), which left lithospheric weaknesses that have the potential to be exploited by magmatic or extensional forces. , atop an exaggerated DEM for symbol clarity. Stations used to develop receiver functions, and that are inverted to develop shear velocity profiles are plotted in blue, green, and red, depending on their shear velocity group (see Section 3.6). Station DAQU, which was inverted but not grouped, is plotted in dark grey. Stations plotted in white were used only for phase velocity measurements. Black arrows are GPS data from Birhanu et al. (2016). b) A map of relevant geology in the study area, colored by elevation (in m) from (Amante & Eakins, 2009), showing flood basalt extent from Dochartaigh (2019), geologically and currently active major faults digitized from Thiéblemont et al. (2016), and a variety of volcanic and hydrothermal features (Keir et al., 2009). Within the MER, NW-SE trending strike-slip faults have reactivated to allow extension (Corti, 2009;Gani et al., 2009;Korme et al., 1997Korme et al., , 2004, demonstrating that existing fault structures control some degree of extension within the MER. NW-SE faults in the Ogaden basin and Blue Nile basin (Figure 1b), which transect the EP and NWEP, reactivated from the early to late Jurassic and interspersed lithospheric structures that may accommodate localized extension. Interrelation between Precambrian structures, active normal faulting, and magmatism outside the MER (cf., Hautot et al., 2006) is evidenced by Paleozoic-Mesozoic and Late Miocene-Quaternary dike swarms to the southwest and west of Lake Tana (Chorowicz et al., 1998;Mège & Korme, 2004;Rooney et al., 2018). Today, earthquake distribution is focused within the MER and Afar, often near magmatic segments, Keir et al., 2006Keir et al., , 2009Wilks et al., 2017), and along border faults separating the MER from adjacent plateaus (La Rosa et al., 2021;Lapins et al., 2020;Lavayssière et al., 2019). Sparse earthquakes are dispersed throughout the EP (Ayele & Kulhánek, 1997;Lamessa et al., 2019), with increasing concentration nearing the MER.
Thermal modification from impingement of the Afar plume and associated Tertiary volcanics have altered the lithospheric structure of the MER from that of the Mozambique Belt, which may serve as a comparison point to the pre-rift lithospheric structure of Ethiopia (Keranen et al., 2009). Plume impingement may have also chemically altered the lithosphere (Boyce et al., 2016(Boyce et al., , 2021Liao et al., 2017;Wang et al., 2015;Wenker & Beaumont, 2018). Plume-related magmatism, exploiting pre-existing weaknesses and fabrics in the lithosphere, produced a ∼1000 km diameter (Baker et al., 1996;Hofmann et al., 1997;Ukstins et al., 2002), 2 km thick region of flood basalts that cap much of the EP (Figure 1b). Flood basalt emplacement began ∼45 Ma (Rooney et al., 2014), but was most prevalent between 31 -29 Ma. Within the EP, shield volcanoes developed at 22 Ma and 10-7 Ma (Kieffer et al., 2004) (Figure 1b). Between 10-12 Ma, volcanoes appeared both within the MER and near Lake Tana (Abate et al., 1998;Conticelli et al., 1999;Kieffer et al., 2004). Present day fumaroles and thermal springs are active in both regions (Figure 1b). Quaternary volcanism is present in the form of Strombolian volcanoes and spatter cones south of Lake Tana (Kieffer et al., 2004), and along the Yerrer-Tullu Wellel Volcanotectonic lineament (YTVL), where clusters of volcanoes and fractures running E-W from the MER have erupted coevally with northern MER volcanoes (Abebe et al., 1998;Chernet et al., 1998).

Phase Velocity Maps
Using the Automated Surface Wave phase velocity Measuring Systems (ASWMS) , we developed phase velocity maps at 25, 32, 40, 50, 60, 80, and 100 s using Rayleigh waves from teleseismic earthquakes. This period range provides sensitivity to lower crustal and mantle wavespeeds ( Figure S16 in Supporting Information S1). We used 5741 high quality earthquakes occurring between 2000-03-10 and 2017-10-03, of Mw ≥5.5, a maximum event depth of 250 km, and spanning a distance of 30°-160° from from the center of our study area (12°N 39.5°E). We processed vertical seismograms for all stations between 4°N and 20°N latitude, and 34°E to 45°E longitude ( Figure 1a). Our final dataset includes 415 stations within 17 networks (see Data Availability Statement).

of 24
We briefly explain the ASWMS methodology and quality controls used. For the full procedure, see . For each earthquake, vertical cross-correlograms, C(t), were computed in the time domain for each possible station pair according to: where S 1 and S 2 are the de-trended, windowed, vertical seismograms of the two stations with instrument response removed, and ⋆ denotes cross-correlation. W S is a 300 s windowing function, with a 75 s Hanning taper. C(t) contains lag and coherency information about the Rayleigh wave energy traveling between the two stations. The dominant energy of the waveform was isolated using another windowing function, W c (t), and then filtered using a series of Gaussian narrow band filters. By fitting the filtered, windowed cross-correlogram function, F i *(W c C(t)) using a 5-parameter Gaussian wavelet we extracted frequency-dependent phase delays, t p , for this station pair .
Station pairs with inter-station distance (in km) less than 5, or greater than 3.55× the period (in seconds), were discarded to reduce the possibility of cycle skipping or spurious measurements. Station pairs with cross-coherency ≤0.6 were discarded to prevent measurement on poorly correlated cross-correlograms. t p were then inverted using the Eikonal equation to develop phase velocity surfaces at each period for each event (Lin et al., 2009). Regularization for this step was implemented as in , with consequent resolution as demonstrated by synthetic recovery tests (Section 3.2, Figure S4 in Supporting Information S1). Finally, these phase velocity surfaces were stacked (on a per-pixel basis; see below) over all events, thereby piecing together regions of temporally and spatially isolated station coverage.
ASWMS was initially designed for the large, evenly spaced US Transportable Array . Use of this method for smaller arrays is feasible Orfanos et al., 2016;Pratt et al., 2017) but requires careful quality control. As such, events for which more station pairs were discarded than not were disincluded from the stacks. We also culled individual phase delay measurements that had apparent velocities below 3.1 km/s or above 4.8 km/s, as velocities outside these bounds are unrealistic for our period range (Ekström, 2011;Ekström et al., 1997). Testing demonstrated that expanding these limits by 0.2 km/s did not affect the final result. Pixels were included in the final stack ( Figure 5) if the number of rays crossing those pixels from all event maps exceeded 66 (providing on average one ray entering each pixel per km of circumference). We also excluded pixels with phase velocities more than three standard deviations from the stack mean. For the stacking, each event's phase velocity map was weighted proportionally to the number of inter-station rays measured for that event. Maps were smoothed using a linear function with a lengthscale of 0.4× the wavelength.

Resolution
To assess the resolution of our phase velocity tomography, we conducted checkerboard (Figure 2), ( Figure S3 in Supporting Information S1), semblance (Figures S7 and S9 in Supporting Information S1) and spike tests ( Figure  S4 in Supporting Information S1) following Russell (2021). We construct input checkerboard maps using 2-D sinusoids that vary in amplitude between 3.8-4.2 km/s. We varied the checker length scale linearly with period to approximately match the Rayleigh wave wavelengths: the sizes range from 0.9° (25 s) to 3.6° (100 s). Using the same rays at each period as in the real data, we forward modeled inter-station delay times, adding ray-length-dependent noise by randomly perturbing the true ray-path averaged velocity, with standard deviation of 0.05 km/s. This level of noise was found to replicate the fraction of quality control measurements discarded in ASWMS for the real data. The checkerboard tests show velocity anomalies are recovered well at all periods ( Figure 2 and Figure S2 in Supporting Information S1), with a small amount of NE-SW smearing in Afar, a region which is not the focus of the present study. We used these checkerboard tests to calculate the semblance at each period   (Figures S7 and S9 in Supporting Information S1), and mask regions with <0.7 semblance in our results ( Figure 5). To assess depth sensitivity of phase velocity at each period to shear velocity structure, we compute Vs sensitivity kernels using our final (i.e., most applicable for the area) shear velocity models from the joint inversion (Section 3.5). Our highest overall sensitivity to Vs is between 25 and 200 km ( Figure S16 in Supporting Information S1 and Supporting Information S2).

Ps Receiver Functions
Receiver functions (RFs) were calculated at several recently deployed stations on the EP and rift flank as well as a few long-standing stations within the rift (Figure 1a, indicated in blue, green, and red). We used Mw ≥ 5.5 earthquakes 30° to 90° from the approximate center of the study region (12°N, 39.5°E) between 2000-03-10 and 2017-10-03. We extracted seismogram traces spanning ±100 s from the predicted P-wave arrival. Traces were de-meaned and bandpass filtered from 0.01 Hz -0.25 Hz using a 2 pole Butterworth filter. We culled events with a signal-to-noise (SNR) below 1.5, where signal and noise were computed as the maxima of envelope functions based on 8 s and 26 s smoothing of the whole trace, respectively (Krueger et al., 2021). Waveforms were rotated into Z,R,T orientation and then migrated to the median ray parameter of the earthquake. We used the iterative time domain deconvolution technique (Ligorría & Ammon, 1999) to calculate receiver functions.

Receiver Function Quality Control
On the basis of an a priori assumption that first-order crustal structure is one-dimensional, we assume that highly similar RFs reflect the true 1-D crustal impedance structure. Other more disparate and complex RFs are either caused by noise or reflect more complicated 2-D structure which cannot be recovered by this study. While strongly anisotropic crust or dipping layers may produce receiver functions with bimodal phase arrival times or strong back-azimuthal variations (e.g., J. O. Hammond, 2014;Savage, 1998), we did not encounter this in our data.
We used a multi-stage process to select a family of receiver functions with similar waveforms, relying partly on the Density Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm (Ester et al., 1996). DBSCAN is a path-specific hierarchical algorithm that uses two parameters to define clusters. The first is a variable distance metric, ϵ, which defines the maximum tolerable difference between two RF traces within the same cluster. The second is the minimum number of data to form a cluster, which we set at a constant 30% of the  Figure S2 in Supporting Information S1, and displays similar resolution. Each period has the input checkerboard displayed as dark red (3.8 km/s) or blue (4.2 km/s) contours, atop the checkerboard surface. At 80s, all stations used in checkerboard tests are plotted in inverted white triangles with black edges. number of RFs for a given earthquake, a value was found to adequately differentiate families of good RFs from noise. In practice, we found that varying this value between 25%-75% made little difference.
Our quality control procedure comprised six steps: (i): We calculated the L 1 norm of the differences between all receiver functions for a given station (Figure 3a), and used those values as distances subject to ϵ. (ii) We used DBSCAN with a high ϵ (lenient) culling step on the RFs windowed narrowly around the expected P arrival, from T = 0 to T = 1.5s. This discriminated a cluster of RFs with "normal" P-wave arrivals from those with abnormally late or early arrivals, which were discarded ( Figure 3b). (iii) We culled receiver functions with amplitudes less than 0.5× the mean amplitude at T = 0. (iv) RFs that passed these culling steps were again processed using DBSCAN, this time with a time window from T = 2s to T = 15s, which included the Ps and PpPs arrival phases for most receiver functions. We excluded the PpSs + PsPs phases in this clustering step as they generally had lower amplitude arrivals. Within this step, we changed ϵ until a cluster of RFs were found that had a clear Ps arrival pulse, and the number of clustered traces did not change by increasing ϵ. (v) After a cluster of RFs were found, we culled traces with crustal multiple phase (PpSs + PsPs) amplitudes greater than 30% of the trace's P arrival amplitude, which is greater than expected for such phases (e.g., Julià, 2007). (vi) Finally we visually assessed each station's RFs, and culled RFs with an unclear Ps arrival (Figure 3c). The stacked receiver function ( Figure 3d) used in the joint inversion (Section 3.5) was the linear mean of the 'cleaned' receiver functions that passed the above quality controls.

H-κ Stacks
Receiver functions at 28 stations that passed the "cleaning" process were used in a standard H-κ stacking procedure (Zhu & Kanamori, 2000) to estimate crustal thickness and Vp/Vs ratio. We used a Vp of 6.3 km/s, and phase weights of 0.7, 0.2, and 0.1 for the Ps, PpPs, and PpSs + PsPs phases, respectively. The resultant H-κ stack for each station was used to constrain crustal thickness during joint inversion (Section 3.5, Figure 4c). H-κ stacks for each station are provided in Supporting Information S2.  (ii) of the quality controls. Thin red lines are receiver functions culled by DBSCAN, thin blue lines are receiver functions passing quality controls, and the thick blue line is the mean of the clustered receiver functions. c) The manual visualization and cull step (vi) after clustering. Traces with colored red and blue fills, signifying positive and negative phases, were used for the final stack, others were manually removed. d) The final stacked receiver function from the 38 receiver functions that passed quality control, normalized to the parent phase amplitude. 7 of 24  found that many H-κ derived crustal thickness and Vp/Vs ratio estimates were unreliable within the EP. They found that stations within flood basalts produced unreliable H-κ stacks with high standard deviation and broad minima, due to low RF SNR and broad, low amplitude arrival phases. They attributed this to a gradational Moho, complex crust, or sediments and basalts. We performed a modified version of Ogden et al.'s (2019) random input perturbation procedure on several stations in the region using receiver functions from before the quality control process and those that passed the quality control process ( Figure S10 in Supporting Information S1). We found that a distinct high energy maximum was able to be well resolved after quality control for most stations (26 of 28), and be moderately resolved for the rest. We note that strong anisotropy may impact the H-κ process (cf. J. O. Hammond, 2014), or might cause the quality control algorithm to reject back-azimuthal swaths. However, while our earthquake arrivals are not uniformly distributed, we do not find evidence of systemic back azimuthal patterns or automatic rejections.

Joint Inversion Algorithm
We used a trans-dimensional, hierarchical, Bayesian inversion algorithm  to create 1-D shear velocity profiles of lower crustal and mantle structure. This algorithm uses Bayesian statistics and Monte Carlo Markov Chains to explore the model space, developing ensembles of shear velocity models that closely predict observed Rayleigh wave phase velocities and receiver functions. To ensure we adequately explored the parameter space for each station, we ran 6 parallel Markov Chains for 21,000 iterations, approximately 16× the number of iterations required for a given chain to achieve stationarity (the "burn-in"). Proposed model updates were made by drawing from local Gaussian distributions for single parameters in each iteration. Model perturbations were accepted or rejected using a Metropolis-Hastings acceptance criterion (Hastings, 1970;Metropolis et al., 1953), and were confined by a priori distributions determined by generous geologically plausible ranges. An empirical prior was developed which confirms that this algorithm does not systemically introduce bias ( Figure S12 in Supporting Information S1). We adapted the method of  to incorporate H-κ surfaces as an additional data type. To some extent, this step simply increases the power of the RF data within the joint inversion. However by implicitly migrating crustal multiples in a more self-consistent manner than in a synthetic RF stack, it sharpens the inversion's constraints on crustal Vp/Vs ratio, for essentially zero increased computational expense. . An example of the joint inversion data types at station GUBA, including final data fit. a) Stacked P-wave radial receiver function (black). The predicted receiver function computed from the mean posterior shear velocity profile following inversion is shown in red . b) Rayleigh wave dispersion curve at this station (black) and standard deviation at each period, together with MINEOS-predicted phase velocities from the mean posterior shear velocity profile (red). c) H-κ amplitude surface (η), with the corresponding values from the entire posterior model ensemble (light grey dots), and the mean of this posterior (red dot).

of 24
We assumed that errors between different data types are uncorrelated Calò et al., 2016;Drilleau et al., 2013;Khan et al., 2011;Roy & Romanowicz, 2017;Shen et al., 2013), so the overall model likelihood function is computed as the product of likelihoods for each individual data type: where σ i is the data error for data type d i , n i are the degrees of freedom, m is the proposed model, and Φ i are misfit functions for body waves, surface waves, and H-κ surfaces, respectively, as follows: where R(t) and Z(t) are the radial and vertical observed time series of stacked receiver functions, r p and z p are corresponding predicted time series, and * is the convolution operator. Note that this "cross-convolution" body wave misfit equation does not require synthetics to match the true source time function (Bodin et al., 2014;Menke & Levin, 2003). C and c p are the phase velocity observations and predictions at frequency f. η is the amplitude of the H-κ surface, and H p , κ p , are the proposed model Moho thickness and Vp/Vs ratio.
Body wave data were forward modeled using a propagator matrix code (Keith & Crampin, 1977a, 1977b, 1977c which creates synthetic Z,R,T waveforms from a 1-D layered input velocity model. We used a 1-s Gaussian synthetic source time function and a ray parameter averaged from the RFs in the station stack. Only the first 10 seconds of RFs were used, which contain the P and Ps arrival phases. Surface wave phase velocities were forward modeled using MINEOS from CIG (Computational Infrastructure for Geodynamics) . MINEOS is computationally expensive, so data prediction relied on perturbation kernels calculated infrequently (and updated per  using MINEOS whenever the current accepted model departed substantially from the most recent kernel basis model). Because MINEOS requires a whole Earth model, our models linearly graded into the PREM global model (Dziewonski & Anderson, 1981) between 200 km and 300 km, and basal knots were prevented below 160 km to ensure no unrealistic sharp velocity gradients were involved in grading into the global velocity average.

Shear Velocity Aggregate Groups
On the basis of the observed shear velocity profiles (Section 4.3), we determined that stations could be assimilated into three characteristic groups that well span the range of rifting states found in our dataset. We tested clustering the stations into subgroups to identify trends and simplify interpretation. To form these clusters, we started with N random groups, and for 1000 iterations, we randomly moved a single station to a different random group. We then calculated the summed absolute misfit between individual profiles and their group mean V S (z), and added the total misfit across all the groups. If this misfit decreased from the previous iteration, indicating overall more similar intra-group velocities, we accepted the new group assignment. We tested this scheme for N = 2 to 5 groups and found that N = 3 achieved the best balance of minimizing intra-group variability, while maintaining parsimony and geographic coherency. Our preferred groups of stations are as follows: Group 1 encompasses 10 stations in the NWEP, Central EP, and Somalian Plateau (Figure 1a, blue). Group 2 comprises 12 stations the central-NWEP (Figure 1a, green). Group 3 includes 5 stations in the MER, and the MER-adjacent EP (Figure 1a, in red). Station AMME (8.3031°N, 39.0934°E) was manually placed into group 3 because of its location in the MER instead of group 2, to which the shear velocity structure more closely matches. For this analysis, we ignored station DAQU (13.1448°N, 37.8976°E) which had a shear velocity structure that does not closely match any group, despite the joint inversion yielding a good data fit (Figure 1a, in grey).

Phase Velocity Maps
Phase velocities in the NMER and SMER follow the surficial border faults of the MER at short periods (25-32 s), and are slower than much of the EP and the average phase velocity from 25 -80 s (Figure 5a). The CMER appears up to 0.1 km/s faster than the SMER and NMER from 32 -60 s, but remains slower than the EP by 0.1 km/s and Somalian Plateau by 0.2 km/s. While we note that the error threshold in the CMER is 0.14 km/s, regions we compare it to show lower error (Figures S8 and S9 in Supporting Information S1). We can, however, resolve that the CMER remains substantially slower than the EP by 0.05 -0.2 km/s, and Somalian Plateau by 0.1 -0.3 km/s between 32 -60 s. The region of slow velocity extends further west from the MER into the EP with increasing period. This does not occur east of the rift, within the Somalian Plateau, as observed previously (e.g., Bastow et al., 2005Bastow et al., , 2008. By 80 s, the MER, the Somalian Plateau, and eastern EP homogenize to phase velocities of 3.75 -3.8 km/s, which remain reduced compared to the westernmost EP. Slow phase velocities extend in a lineament east-southeast from Lake Tana at periods of 25 -60 s (Figures 5a-5e). This lineament roughly follows the trend of inferred late Paleozoic-Mesozoic aged rifts (Mège & Korme, 2004), Quaternary volcanoes, and spatter cones (Kieffer et al., 2004). Phase velocities here are ∼0.1 km/s faster than the MER at 25 s ( ∼3.5 km/s, vs. 3.4 km/s). In the north, this structure links to slow phase velocities at the western boundary of Afar from 25 -32 s. This structure grades into less reduced, but still slow, phase velocities to the southwest of Lake Tana between 25 -50 s. This slow phase velocity lineament does not broaden with period, and homogenizes to the fast background NWEP phase velocity by 80 seconds (Figure 5f).
A pronounced ∼200 km E-W, ∼160 km N-S, fast phase velocity region in the central EP at 32 s separates the slower phase velocities in the MER and EP lineament (Figure 5b). Velocities here are ∼4.0 ± 0.14 km/s, ∼8% faster than the ∼3.7 ± 0.15 km/s area average at 32 s. This appears to result from a shallow feature: the phase velocities in this location are not noticeably faster than the regional average at periods greater than ∼50 s, although at longer periods most of the EP and flank are ∼0.1 km/s faster than the study region average. Fast phase velocities also exist in the Somalian Plateau from 25 -50 s ( ∼3.7 -3.9 km/s). At periods >60s, the phase velocities on either side of the MER are roughly symmetrical at ∼3.7 ± 0.17 km/s, and the rift-proximate part of the Somalian plateau that we image is clearly slower than the rift-distal NWEP (3.95 ± 0.15 km/s).

Receiver Functions
The number of total receiver functions at each station varied between 9 -165, and averaged 65. The fraction of receiver functions culled by our automatic quality controls averaged 51%, and ranged between 26 -82% of recorded traces. On average, manual culling removed a further ∼16% of traces. The number of RFs that passed quality controls to develop the final stacked RF varied from 2 -73, with a mean of 23. The quality control process summarized in Figure 3 may be found for each inverted station in the Supporting Information S2.
All stacked receiver functions include a distinct Ps phase arrival at ∼5 s (Figure 3d), and 25 show a clear PpPs phase arrival (Figure 3d, ∼16 s). 21 stations displayed a PpSs + PsPs arrival phase (Figure 3d, ∼21 s), which was often broader and lower amplitude than preceding phases, potentially caused by stacking across moveout, or more local issues outlined in , including non-simple crustal structure, gradational Mohos, or flood basalts overlaying sediments. We do note, however, that no stations lacking the PpSs + PsPs phase arrivals failed the test criteria posed in , supporting our argument that (notwithstanding tradeoffs) crustal thickness may be determined here using H-κ stacking, even without these later crustal multiples.
A negative phase following the Ps arrival is observed at 16 stations (e.g., Figure 3d, ∼6 s). All group 1 station RFs record this phase, but it does not arrive at a consistent time. It is absent from all group 3 stations except AMME, where it immediately follows the Ps phase, and has a large negative amplitude. More rigorous investigation of these negative velocity gradients, which may reflect layering within or at the base of the lithosphere (Abt et al., 2010), will be the subject of future work. Because of the time window used for RFs in joint inversion, these phases must also be accounted for in the body wave forward modelling (Figure 4).

Joint Inversion
We inverted for shear velocity models at 28 station locations within the EP, NWEP, and MER, which are most relevant to interrogating the possibility of off-axis extension (Figure 1a). Stations in the CMER and SMER (Figure 1a, in red) serve as points of comparison between our results and previous studies (e.g., Dugda et al., 2007;Gallacher et al., 2016;Weeraratne et al., 2003). Overall, our inversions fit the data well ( Figure 4). With misfit defined as in Equations 3-5, we find a median RMS misfit for RFs of 0.054 parent-normalized amplitudinal units, for phase velocities a median RMS misfit of 0.0453 km/s, and for the HK surfaces, on average our posterior models are at or above 94.9% of the H-κ energy surface maximum. The closer fit to smooth phase velocity curves compared to the RFs is typical to these types of inversions (Bodin et al., 2012).

Crustal Thickness
We report the median and 1σ (actually the ±17 th percentile of the posterior ensemble about the median) crustal thicknesses at inverted stations ( Figure 6a). Crustal thickness values are generally well constrained, with the a mean 1σ uncertainty of 1.68 km ± 0.67 km, a minimum of 0.84 km, and a maximum of 3.65 km. Crustal thicknesses are similar to those found from the H-κ stacking method ( Figure S11 in Supporting Information S1), but notable divergences appear at two stations in the central EP (Stations NEKE, HARO), indicating that phase velocities are not straightforwardly compatible with the receiver function constraints here.
The crust is thickest northeast of the central EP, and near Lake Tana, varying from 35 -40 km. This is also the region with some of the highest Vp/Vs ratios: ∼1.89. Stations in the CMER and Somalian plateau also exhibit thick crust of ∼37.5 km and ∼36.5 km, respectively. Crust at stations in the SMER is thinner, at ∼29 -31 km. Trending northwest of the SMER axis to the central EP are three stations of similar thickness, ∼31 km, amongst ∼34 -36 km thick crust.
Crustal thickness is bimodal along the slow phase velocity lineament, split between ∼29 -30 km and ∼33 -34 km. Crust in the NWEP is between 28.5 -31 km. We find that our crustal thicknesses are more similar to those of Keranen et al. (2009) than Dugda et al. (2007 and Stuart et al. (2006), but are not strongly inconsistent with the latter studies. A list of crustal thicknesses with associated standard deviation may be found in the (Table T2 in Supporting Information S1).

Shear Velocities
We obtain a median shear velocity profile at each station from the posterior model ensemble. We focus on interpreting velocities from 30 and 140 km depth for tectonophysical relevance (Figures 6b-6e), but our phase velocity sensitivity extends to ≥165 km ( Figure S16 in Supporting Information S1). We note that at 30 km the observed shear velocities represent crustal structure at most stations ( Figure 6a). The 1σ shear velocity uncertainties between 60 and 120 km are approximately ±0.05 km/s across all inverted stations. 1-D shear velocity profiles at each inverted station, with 1σ and 2σ uncertainties, may be found in Figure S15 in Supporting Information S1 and Supporting Information S2. We compare stations' shear velocity profiles to PREM (Dziewonski & Anderson, 1981), and also to the average shear velocity of the inverted stations at depth, since we find that the entire region is slower than the global average. For the grouped velocity profiles, we calculated median and 32 -68% percentile velocity ranges from the combined posterior distributions concatenated across all stations within the groups.
Shear velocities closely mirror the trends seen in phase velocities ( Figure 5). Slow shear velocities exist in the CMER, SMER, and near-rift EP stations, but also near Lake Tana, and southeast of it along the EP. Fast velocities are resolved within the central EP, Somalian Plateau, and NWEP.
For group 1, the median Vs profile (see Section 3.6, Figures 6a and 1a) is faster than the regional average at all depths. 1σ bounds of group 1 shear velocity span 3.0 -4.3% (0.13 -0.19 km/s) slower than PREM from 60 -120 km (Figures 7a and 7d). This depth range is later used as representative of each group due to its consistency between stations, as measured by similarity of shear velocity profiles within each group (Figure 7).
Four of six northwestern-most NWEP stations are faster than the regional average at 50 km by 1.2 -4.8% (4.3 -4.45 km/s) (Figure 6b). At 70 km depth and below, all six northwestern stations are 0.5 -5% faster (4.27 -4.46 km/s) than regional average (Figures 6d and 6e). Stations within the central EP manifest heterogeneous mean shear velocities between 30 and 50 km, but this region is 0 -4% faster (4.25 -4.42 km/s) than the regional average deeper than 50 km. The Somalian Plateau is faster than the regional average between 50 -100 km by 1 -3.5% (4.3 -4.4 km/s). Group 2 shear velocities, which include Lake Tana and EP stations to the southwest of it, display slower velocities than those found in group 1. Shear velocities are 4.6 -6.2% (0.21 -0.28 km/s) slower than PREM from 60 -120 km (Figures 7b and 7d). Unlike group 1, group 2 velocity profiles seem to have a thin or absent lid, and deeper than 60 km get faster monotonically with depth. At depths greater than ∼100 km, they are statistically indistinguishable from group 1.
At 50-70 km, shear velocities in the southwestern EP and near Lake Tana are moderately slower than the regional average by 0 -2% (4.17 -4.25 km/s) (Figures 6c and 6d). By 110 km, only the southwestern EP shows reduced shear velocities slower than the regional average by 0 -1% (4.2 -4.25 km/s) (Figures 6e and 6f), while other stations throughout the EP are homogeneously faster than the regional average by 0.5 -1.5% (4.27 -4.31 km/s).
Group 3 shear velocities, which include the MER and the near-rift EP, are the most reduced compared to PREM; 7.7 -8.7% (0.34 -0.39 km/s) slower from 60 -120 km (Figures 7c and 7d). Between 50 and 110 km depth, these stations are slower than the regional average by 2 -6% (4.00 -4.17 km/s), with SMER stations displaying the slowest velocities. Station MERT displays an increase in shear velocity between ∼40 -50 km, and then decreases between ∼50 -60 km. This may be evidence of a thin fast lithospheric lid, which is not displayed by other stations in group 3.

Regional Shear Velocity Trends
We find mantle shear velocities in this region are slower than PREM and much slower than a global average (French et al., 2013) for nominally Proterozoic (Stern, 2002) continent. We observe this for all stations and velocity aggregate groups, even the "fast" group 1 most distal from the rift (Figure 7d). As many others have shown before, plume impingement and continental rifting have combined to produce one of the slowest regions of upper mantle on the planet.
In the Central EP and NWEP (group 1), we find a faster uppermost mantle from ∼40 -80 km depth (Figures 7a  and 7d). Some individual stations, and our median group 1 Vs profile, hint at a negative velocity gradient below this depth, implying a gradual lithosphere-asthenosphere boundary (LAB). However, our posterior ensemble contains enough range that we cannot identify a statistically significant LAB. Despite this, at some of the stations in this region, RFs also record a negative pulse, as seen in others' work (Lavayssière et al., 2018), implying steep negative velocity gradients. Although hardly representative of the entire Somalian Plateau, our single station east of the rift also exhibits the above features. We interpret this as evidence for a lack of substantial extension in the central EP and, in agreement with previous velocity models (Bastow et al., 2008;Chambers et al., 2019), a sharp transition to un-extended lithosphere immediately east of the rift.
A faster uppermost mantle is also present in the Mozambique Belt (Weeraratne et al., 2003). We argue that the Mozambique Belt serves as an informative comparative structure (Keranen et al., 2009) from which to assess velocity differences resultant from plume impingement and associated volcanism, despite its non-continuity along the EARS. The similarity in shear velocity structure, but uniformly slower velocities of group 1 compared to the Mozambique Belt, suggest holistic lithospheric velocity perturbations far off-rift (Figures 7a and 7d). At ∼45 km depth, the group 1 profile is ∼0.25 km/s slower than the Mozambique Belt. Velocities in the two regions gradually become more similar with depth, and are barely distinguishable by ∼165 km. The fact that these regions seem to differ particularly in shallow mantle structure demonstrates the role ongoing tectonics play.
Middling shear velocities in group 2 are distinct from group 1 and 3, with stations in this region lacking a fast velocity lid, but also failing to display a negative concave velocity structure between 40 -140 km (Figures 7b  and 7d). Shear velocities are up to ∼3.5%, or 0.13 km/s slower than group 1, but faster than group 3 by a maximum of ∼3.8%, or 0.15 km/s. We interpret this region as manifesting an intermediate rifting state between two end-members (the far flank and the MER). We do note that shear velocities within group 2 are more heterogeneous than groups 1 and 3, perhaps a reflection of complex crustal structure. Dugda et al. (2007) observed a lack of a fast velocity lid in the MER and Afar (Figure 7b), and concluded that advanced thermal erosion of the lithosphere had already taken place within the rift (cf. Rychert et al., 2012). We similarly propose that the minimal fast velocity lid within the EP slow velocity lineament is indicative of extensional lithospheric perturbation surprisingly far from the rift axis (see Section 5.3).
Group 3 evinces particularly slow shear velocities in the upper mantle. Extensive velocity structure perturbation here is consistent with the tectonic and thermal effects of rifting in the MER and proximal EP. For reference, we compare group 3 to the mean global shear velocity structure of 0 -25 Ma oceanic crust from the compilation of French et al. (2013) and find V S (z) is slower than this reference by 3.6 -4.3% (Figures 7c and 7d). The fact that these values are even slower than the young oceans likely reflects unusually high mantle potential temperatures and/or less mature melt percolation network beneath still-extant continental crust, as well as perhaps a high-biasing of the oceanic regional average by cooler off-ridge structure. Taken at face value, and attributing velocity reduction entirely to partial melt, the ∼4% (0.16 km/s) offset reflects an excess of between 0.5% and 4% in situ melt compared to the young oceans depending on the chosen Vs -% melt relationship (Faul et al., 1994;The MELT Seismic Team, 1998;W. C. Hammond & Humphreys, 2000). The similarity in velocity profile shape evidences that the maturation of the rifted asthenosphere significantly precedes crustal breakup here Keranen et al., 2009).
Group 3 has the best (albeit limited) geographic overlap with previous studies, with which we find good agreement. For instance, the profile of group 3 is nearly indistinguishable from station ARBA in the work of Dugda et al. (2007) (Figure 7c), except for a more negative velocity gradient seen between 35 -60 km depth. Compared to the MER shear velocity aggregate from Gallacher et al. (2016), we find dissimilarity between 30 and 60 km, after which both profiles are nearly indistinguishable. Gallacher et al. (2016) sampled mostly the NMER, CMER, and Afar, which capture a more extensively rifted, and thus slower, upper mantle structure.

Quantifying the Contributions of Melt and Temperature
Pervasive low velocities throughout this region point to systematically elevated temperature (Dugda et al., 2007), a more silicic mantle composition, grain size reduction, the presence of melt and other fluids, or a combination of these factors (e.g., . Compositional variance between pyrolite, piclogite, and harzburgite can account for only ∼1% δVs (Cammarano et al., 2003), and grain size reduction from 1 cm to 1 mm at 1200°C reduces velocities a maximum of 0.1 km/s (Hirth & Kohlstedf, 2003), or ∼2% δVs (Keranen et al., 2009). Our data therefore likely constrain the combination of elevated mantle temperatures and/or in situ melt fraction. Dugda et al. (2007) explained slow MER shear velocities by invoking ∼250°C temperature increase caused compared to the Mozambique Belt. Other tomographic studies posit even hotter temperatures, 160 -550°C above normal mantle (Chambers et al., 2019). We attempt to quantitatively connect our shear velocity observations to mantle conditions in two ways. First, by generating a suite of geotherms and forward-modelling shear velocity profiles, seeking a best fit to the average Vs(z) of each group. Second, we take a simpler approach, positing that independent effects of excess temperature (dT) and in situ melt fraction (ϕ) can explain slowness compared to PREM (δV S ), at some representative depth.

Forward-Modeled Geotherms Fit to Observed Vs
In method 1, candidate geotherms are constructed by grid searching over a range of mantle potential temperatures (T p ), surface heat flows (q 0 ), and radiogenic crustal heat production values (A 0 ) (Figures S20-S22 and Table T3 in Supporting Information S1). We use the formulation of  to compute V S (T, z, d) at 1 Hz. We assume a grain size, d, of 1 mm, obtain anharmonic shear moduli from the tool of Abers and Hacker (2016) assuming a pyrolitic composition, and calculate self-consistent pressure profiles. Continental geotherms cannot fully predict observed shear velocities between 30 and 60 km (Figures S17-S19 in Supporting Information S1) using existing scaling relationships , so we choose to minimize predicted misfit between 60 and 120 km, the range where our phase velocities are most sensitive, and the majority of melt is expected to reside.
We augment our forward calculation to account for the velocity reduction effects of partial melt. We use super-solidus temperature excess to predict a melt fraction, assuming 0.0995% melt is generated per degree Celsius above the dry peridotite solidus (Figure 8a), which was calculated from a pMELTS algorithm curve (Ghiorso et al., 2002) using a LOSIMG composition peridotite (Hart & Zindler, 1986), at 3 GPa, with no melt escape (a necessary simplification). We translate this melt fraction into a velocity reduction beyond the melt-free case using the relationship from Faul et al. (1994), whereby 1% partial melt slows shear velocities by 3.3% (this value is for penny shaped inclusions, likely more applicable here based on the presence of anisotropy and high differential stress from rifting ). We account for melt percolation effects by convolving the instantaneous melt curve with an exponential function to smooth melt upwards with decay lengthscale of 20 km. Figure 8. a) Best fit geotherms for group 1 (blue), group 2 (green), and group 3 (red) using method 1 to fit shear velocity between 60 -120 km depth (Section 5.2), with temperatures indicated by the lower x-axis. The dry peridotite solidus from Asimow et al. (2004) is plotted in black dots. Xenolith pressure -temperature equilibration conditions from Conticelli et al. (1999) and Ferrando et al. (2008) are shown in orange circles and triangles, respectively. Partial melt percent (upper x-axis) calculated using the super-solidus forward modeling method is shown for groups 2 and 3 (dashed, colored as above) to the left. b) The slowness domain, showing linear tradeoff between melt and mantle temperature excess, contoured in increments of −2% δVs. Mean ±1σ slowness compared to PREM between depths of 60 -120 km is shown for each velocity group (colors as above). The melting curve of a LOSIMG composition peridotite (Hart & Zindler, 1986) at 3 Gpa is plotted in solid purple.
We also apply a compaction function downwards at the Moho with a lengthscale of 5 km to mimic the effect of neutral buoyancy of the mafic melt in the crust while ensuring total melt percentage is preserved. This ad hoc procedure acts to distribute the effect of any melt across a broader depth range, in keeping with our lack of observed sharp low velocity layers. Our treatment of melt production and percolation is admittedly simplistic, but sufficient for this exercise. A second caveat is that we are combining a melt-free anelastic scaling law with a nominally anharmonic constitutive relationship for the effect of melt on velocity. However, given the absence of a more self-consistent approach and the other uncertainties in this forward modelling approach, we argue that this is a good approximation to balance the effects of temperature and melt on velocity. We estimate uncertainty in melt fraction from the suite of forward models that fit the velocity profiles to better than 95% of the best fit model.

A Simple Melt -Temperature Tradeoff
In method 2, we assume that = + , using derivatives of −0.0012 km/s/K (Dugda et al., 2007;Jackson et al., 2002) and −0.047 km/s/% (Faul et al., 1994), respectively. Note that we prefer this model to W. C. Hammond and Humphreys (2000), as discussed above. Consistent with method 1, we use the average Vs between 60 -120 km for each group (4.33 ± 0.053 km/s for group 1, 4.25 ± 0.048 km/s for group 2, and 4.11 ± 0.063 km/s for group 3) to yield contours of relative slowness for each group in reference to PREM (4.47 km/s Voigt average) (Figure 8b). We assume that PREM represents a melt-free mantle with a potential temperature of 1300°C, and account for a 0.4 K/km adiabat when extending to 90 km depth. Naively, this approach is limited to establishing the linear trade-off between thermal and melt perturbations. However, these parameters are themselves of course linked. We therefore use the pMELTS calculation described above to compute the peridotite melting curve (Figure 8b) at 3 GPa ( ∼90 km depth), assuming batch melting. Theoretically, the intersection between this curve and the groups' slowness contours constrains mantle conditions. Realistically, due to melt percolation, our estimates of melt fraction are likely maxima, and estimates of temperatures may be minima.

Physical Interpretation of Low Velocities
Using the first method, group 1 is best fit by a geotherm with parameters: 45 km Moho, Q 0 = 75 mW/m 2 , T p = 1425°C, A 0 = 0.99e −7 W/m 3 . This estimate agrees with independent xenolith-based (Rooney et al., 2012) and geodynamically modelled (Armitage et al., 2015) estimates of local mantle temperatures in the range 1350 -1450°C within the MER and Afar. This geotherm does not intersect the solidus of dry peridotite (Figure 8a). Similarly, the second method suggests an upper mantle temperature ∼100°C above average, that does not intersect the pMELTS melting curve (Figure 8b). Our seismic velocities do not seem to require in situ melt here, even when considering error. This is in agreement with an absence of volcanism in the NWEP. While some thermal springs exist near the central EP, no recent volcanism exists in any location aggregated into group 1. Group 1 stations sit atop the coolest mantle in the region, apparently ∼1450°. That said, ∼30 Ma xenolith geothermobarometry (Ferrando et al., 2008) (Figure 8a) and potential temperature estimates of 1410°C from primitive magmas 10 -40 Ma (Rooney et al., 2012) imply that in past mantle temperatures below this region were lower than those inferred today. Even in the distal flank of the rift, there seems to be a warming trend over time.
Group 2 is best fit by a geotherm with parameters: 30 km Moho, Q 0 = 75 mW/m 2 , T p = 1475°C, A 0 = 1.32e −6 W/m 3 . This is hotter than the 1350 -1450°C ambient mantle temperatures estimated for the MER and Afar (Armitage et al., 2015;Rooney et al., 2012). Group 2 velocity averages are low enough to require the presence of upper mantle partial melt. A maximum of 0.46% ± 0.08% partial melt is found at 70 km depth (Figure 8a) using method 1. A mantle temperature of ∼1500°C, and 0.1 +0.3 −0.1 % melt (Figure 8b) is predicted using method 2. This 0.1-0.46% partial melt component is slightly lower than the 1.1% melt component invoked by Chambers et al. (2019) to the east of Lake Tana, possibly because here we group the slower northern and less slow southern stations in the off-axis lineament.
Xenolith data (Figure 8a) at Injibara, just south of Lake Tana, contain signatures of hydrous enrichment (Conticelli et al., 1999;Ferrando et al., 2008). Xenoliths indicate the mantle here is primarily depleted peridotite (Conticelli et al., 1999;Rooney et al., 2012) which may require hydration for substantial melt absent of very high temperatures (Asimow et al., 2004). Quaternary volcanoes and thermal springs extend southwest from Lake Tana as far as 11°N, following surficial border faults. These features, previously linked to shallow velocity reduction (Chambers et al., 2019;Korostelev et al., 2015), are mirrored by the low velocity lineament we observe in the group 2 footprint ( Figure 6). Our data support the presence of particularly elevated uppermost mantle temperature, as well as hydration and melt, extending south of Lake Tana.
Group 3 is best fit by a geotherm with parameters: 30 km Moho, Q 0 = 75 mW/m 2 , T p = 1525°C, A 0 = 1.65e −6 W/m 3 . This is 75 -175°C hotter than some others have proposed (Armitage et al., 2015;Rooney et al., 2012). This region, close to the magmatic rift, is melt-rich (Armitage et al., 2010;J. O. Hammond, 2014;White & McKenzie, 1995). We obtain estimates of up to 1% ± 0.18% partial melt at 80 km ( Figure 8a) using method 1, and a temperature of ∼1515°C with 0.95 ± 0.4 % melt is predicted using method 2 (Figure 8b). These methods closely agree on a partial melt percentage ∼1%, which aligns with recent estimates within the MER between 0.5 -2.7% (Chambers et al., 2019(Chambers et al., , 2021Gallacher et al., 2016;Keranen et al., 2009). Fully disentangling seismic velocities, melt fraction, and pore orientation/anisotropy is not fully possible using our methodology. However, recent studies (Chambers et al., 2021) have found radial anisotropy as high as 6.5 ± 0.5% in the shallow crust (5-15 km) of the near-MER EP, supporting melt storage in the form of sills, in agreement with geochemical studies in the area (Rooney et al., 2017). Strong azimuthal anisotropy, primarily in the rift, also complicates interpretation (e.g., J. O. Hammond, 2014;Kendall et al., 2005). However, good backazimuthal distribution ( Figure S1 in Supporting Information S1) likely means that we average out this effect. While volcanoes are confined mostly to the narrow rift valley (Figure 1b), our phase velocities are slow in a relatively broad swath extending 100 km beyond (particularly to the west of) the border faults even at 32 -40 s. This may stem partly from resolution issues, but likely also reflects the nature of melt channelization and focusing within the rift (cf., Holtzman & Kendall, 2010).

Crustal Thickness Variations
The Moho depths we observe in the center of the Ethiopian Plateau are qualitatively consistent with crustal thickening following the impingement of the Afar Plume ∼30 Ma (Mackenzie et al., 2005;Stuart et al., 2006). This event is thought to have thickened the crust in two ways: up 2 km of flood basalt emplacement at the surface (Baker et al., 1996;Hofmann et al., 1997;Ukstins et al., 2002), accompanied by up to 10 km of underplating, yielding a total crustal thickness up to ∼45 km. This is best observed from wide-angle seismic reflection/ refraction data (Mackenzie et al., 2005), but that study only imaged the deepest crust ∼100 km on either side of the rift valley. Some degree of thickening by igneous intrusion is also well recorded by gravity surveys (Cornwell et al., 2006), and increased Vp/Vs ratios signifying a more mafic crustal composition . Xenolith data collected along the YTVL also support underplated material beneath the near-rift EP (Abebe et al., 1998;Adhana, 2014;Rooney et al., 2017), and melt emplacement well outside the rift axis is supported by abundant basaltic dikes (Mège & Korme, 2004;Rooney et al., 2018) west of Lake Tana, where magmatism exploited zones of pre-existing weakness.
We do not observe crust as thick as 45 km in our field region. The average crustal thickness of stations in our inversion within the EP is 34 km, and the thickest crust we see is 39 km at SERE, just north of Lake Tana. Ignoring erosion, 12 km of thickening would suggest a pre-plume thickness of 22 km in the plateau, which we argue is unrealistically thin for continental crust. In order to estimate an appropriate baseline, we propose that stations that lie outside the bounds of flood basalt emplacement (Figure 1b) in the NW of our field region best exemplify pre-rift/plume lithospheric structure and crustal thickness: ∼30 -32 km ( Figure 6). One caveat is that four of six of these stations lie within the erosional basin of the Blue Nile Gorge, which has incised a mean of 0.37 km of rock within the basin since 29 Ma (Gani et al., 2007). As another regional comparison, ∼35 km crust is observed further west in Sudan (El Tahir et al., 2013;Mohamed et al., 2001), which similarly lacks flood basalts or underplating.
With this baseline of ∼31 km, the crustal thicknesses of 31-39 km we see in the EP to Lake Tana imply igneous addition on the order of 0 -8 km, distributed non-uniformly across the region. This disparity in estimated underplating compared to previous studies may simply be geographical -there was more igneous thickening in the region of the MER than on the flank. Alternatively, it might be methodological; our receiver functions might be sensing the substantial velocity boundary between pre-existing crust and igneous underplate, and not the (perhaps more gradational) underplate-mantle boundary (e.g., Stuart et al., 2006) to which wide-angle reflection/refraction is more sensitive. Nonetheless, since our method attempts to fit the full receiver function waveform, and since we do not see an abundance of double-peaked Pxs phases, we estimate that the crustal thickness patterns we see are real.
Two observations stand out in the EP slow velocity lineament: Firstly the thickest crust in our field region is around Lake Tana and coincides with some of the lowest off-rift shear/phase velocities. Secondly, we see a modest trend of thinning to the southwest (Figure 6a). This tapering has several possible explanations. It may be that more erosion has occurred in the south-central plateau than previously thought, or alternatively that the plume emplaced more igneous material to the north, creating gradients in flood basalt thickness. Finally, it may be that the pre-plume, pre-extension thickness increased from north to south, and we are imaging a step in the flood basalt province; N-S trending sutures and dike swarms are more concentrated near Lake Tana (Figure 1b) perhaps connoting more extensional strain in that area in the geologic past that has altered crustal thickness. This region is further discussed in Section 5.4.
Although we only include three stations from within the rift axis in this study, they show evidence for thin crust within the SMER (<31 km), and thicker crust (∼38 km) within the CMER, consistent with previous work (Dugda et al., 2007;Keranen et al., 2009;Stuart et al., 2006). Although simple models of pure shear rifting would predict the thinnest crust in the region within the well defined rift valley (Buck, 1991), thicker crust in the CMER supports complex, diachronous rift evolution of the MER (Bonini et al., 2005;Keranen & Klemperer, 2008;Wolfenden et al., 2004).

Off -Axis Extension
The northern portion of the slow velocity lineament in the EP coincides with Quaternary volcanoes, spatter cones, fumaroles, and thermal springs (Figure 1b) (Abebe et al., 1998;Kieffer et al., 2004;Keir et al., 2009). These features are concentrated near Lake Tana, where we find the slowest velocities, and where calculations (Section 5.2) seem to require 0.1 -0.46% partial melt in the upper mantle to fit group 2 Vs observations. This region is also observed to have regionally pervasive slow P and S-wave arrivals (Bastow et al., , 2008Boyce et al., 2021;Keir et al., 2009) and small amounts (<1mm/yr) of contemporary surface extension as determined by GPS measurements (Birhanu et al., 2016). We find Vp/Vs ratios at stations closest to Lake Tana among the highest in our region, in excess of 1.8 ( Figure S11 in Supporting Information S1), indicative of melt (J. O. . Surficial volcanic features suggest that the northern portion of the slow velocity lineament is more mature than the southern, which is also supported by a larger magnitude GPS velocity measurements near Lake Tana (Figure 1a). Geologic evidence suggests that Lake Tana was previously a particular locus for extension. Pre-existing N-S trending sutures in this area (Johnson et al., 2004;Shackleton, 1986) are believed to have guided plume-fed dike swarms (Mège & Korme, 2004;Rooney et al., 2018) and may remain active zones of weakness. Multi-phase graben faulting (Chorowicz et al., 1998, and references therein), strike-slip motion within the God Serpent dike swarm, shear structures within their margins , and magnetic imbrications (Callot et al., 2001;Schultz et al., 2008) all indicate that Lake Tana has experienced localized extension episodically since the Oligocene. Ti systematics indicate that the region around Lake Tana may have been initial, or an important secondary, locus of plume-sourced melt infiltration 30 Ma (Chorowicz et al., 1998;Rooney et al., 2018).
Without continued extension, the magmatic activity and -as we have shown -low velocities southwest of Lake Tana would have ceased since the plume arrival. Our results, together with a host of previous work, suggests ongoing localized, shallow extension >450 km from the rift axis along a rift-parallel lineament extending south west from Lake Tana. Particularly reduced shear velocities, and volcanism, near Lake Tana suggest that the northern portion of the slow velocity lineament is more mature than the south. That said, our imaging of velocity perturbation at depth extending south of the ∼10.6°N southernmost limit of volcanism indicates that deeper lithospheric perturbation can lead surficial extension. This extension may be driven from the bottom-up. P-and S-wave delay time observations and tomography of the Ethiopian plateau have previously hinted at a slow upper mantle extending well beyond the MER valley (Bastow et al., , 2008Gallacher et al., 2016;Keir et al., 2009). Absolute P-wave delay times of up to 1 second imply δV p of as more than −2% off axis, with these low velocities asymmetrically observed to the north west of the rift (Boyce et al., 2021). These results agree qualitatively with our data, which reveal upper mantle velocities 4.3% (0.19 km/s) slower than PREM in the group 2 region. Clearly, there is substantial thermal perturbation and -our calculations implyin situ melt beneath large swaths of the Ethiopian plateau, with particular apparent concentration in the slow velocity lineament.
Geodetic data suggest that while the majority of extension in the northern EARS is localized within the MER and Afar, up to ∼20%, or 1 -2 mm/yr, of extension may be distributed across the EP (Birhanu et al., 2016). Keranen et al. (2009) have interpreted crustal thickness maps as implying broad distribution of strain beyond the MER. Small magnitude GPS measurements in the EP of <1mm/yr are of similar magnitude to those in the near-MER EP (Figure 1a) (Birhanu et al., 2016), suggesting extensional accommodation is present in both locales. Such a combination of diffuse and narrow extension has also been proposed to occur in the southern EARS (D. Stamps et al., 2021).
Finally, it is interesting that notably thick crust near Lake Tana (∼36 -39 km), and at other Group 2 stations, coincides with some of the slowest upper mantle velocities where we have argued extension is ongoing (Figure 6). At face value, this is paradoxical. We propose two potential explanations in the context of off-axis extension. One possibility is that >10 km of magmatically driven thickening (Mackenzie et al., 2005) associated with plume impingement has taken place, following which high (>40%) strain pure shear extension in the plateau -evidenced by slow shear velocities -has re-thinned the crust by up to 9 km to present day values (∼30 -39 km). Alternatively, extension off-axis has reduced velocities due to lithospheric degradation (Havlin et al., 2013) and melt percolation, but is still too immature to have initiated much crustal thinning. In this case, our results would seem to be inconsistent with >5km of magmatic thickening. It may also be that underplating of the Ethiopian plateau tapers substantially away from the -now rifted -center. The more parsimonious explanation would seem to be the latter; melt and thermal effects of bottom-driven incipient rifting well precede crustal breakup, at least within the Ethiopian rift flank.

Conclusions
We have investigated the northwestern EP flank of the East African Rift and Ethiopian Plateau using Rayleigh wave phase velocity measurements and Ps receiver functions to interrogate the deep crustal and shallow mantle structure. We jointly inverted these data for shear velocity profiles using a trans-dimensional, hierarchical Bayesian inversion algorithm, incorporating H-κ receiver function stacks to better constrain crustal thickness.
Throughout the study region we found slow phase and shear velocities, reflecting mantle temperatures significantly elevated compared to a stable continent. We showed that high temperatures and in situ mantle melt content extend beyond the rift. The central Ethiopian plateau has a fast lid above a gradual negative velocity gradient. Within and near the SMER, we found maximally thinned crust and extremely low upper mantle velocities -the effects of rifting are clear. Similarly thin crust within the NWEP flank, on the other hand, is paired with fast upper mantle velocities. These flank stations appear to be the best local representatives of pre-rift and pre-plume lithosphere, comparatively unaffected by either process.
In the NWEP, we observed a lineament sub-parallel to the rift axis with slow shallow upper mantle velocities. This lineament coincides with GPS-observed divergence, pervasively delayed teleseismic arrivals, and Quaternary aged volcanism, fumaroles, and thermal springs near Lake Tana. These observations support the presence of localized off-axis extension, hundreds of kilometers from the MER. Plume-derived underplating and flood basalts cause ambiguity in determining the extensional maturity of the slow velocity lineament. If igneous addition has offset crustal thinning this region may already be heavily extended. Alternatively, there may be insufficient off-axis extension to have allowed much crustal thinning at all, but melt percolation and lithospheric degradation have already begun to substantially reduce shear velocity. In either case, it is apparent that some amount of that off-rift extension occurs well away from the MER, and that continental breakup in Ethiopia is more complex than simple narrow rifting.

Data Availability Statement
All data used are available from the Iris Data Management Center (https://service.iris.edu/). IRIS Data Services are funded through the Seismological Facilities for the Advancement of Geoscience (SAGE) Award of the National Science Foundation under Cooperative Support Agreement EAR-1851048 and EAR-1261681. All models and derived datasets associated with this manuscript are available in Dryad repository (https://doi.org/10.25349/ D9RK6H). Networks used to develop Rayleigh wave phase velocity maps and receiver functions, in alphabetical order, are: AFAR0911 2H (