Magnitude of the 1920 Haiyuan Earthquake Reestimated Using Seismological and Geomorphological Methods

Reported magnitudes ranging between 7.8 and 8.7 highlight a confusion about the true size of the 1920 Haiyuan earthquake, the largest earthquake recorded in the northeast Tibetan Plateau. We compiled a global data set of previously unlooked‐at historical seismograms and performed modern computational analyses on the digitized seismic records. We found the epicenter to be near Haiyuan town and obtained a moment magnitude of MW=7.9±0.2. Following traditional approaches, we obtained mB=7.9±0.3 with data from 21 stations and MS(20)=8.1±0.2 with data from three stations. Geomorphologically, we mapped the surface rupture and horizontal offsets on high‐resolution Pleiades satellite and drone imagery that covered the entire western and middle sections of the 1920 Haiyuan earthquake rupture and compiled offsets reported on the eastern section from field measurements in the 1980s. Careful discrimination between single‐event and cumulative offsets suggests average horizontal slips of 3.0±1.0 m on the western section, 4.5±1.5 m on the middle section, and 3.5±0.5 m on the eastern section, indicating a total moment magnitude of MW=7.8±0.1. Thus, the seismological and geomorphological results agree within the uncertainties, a weighted average giving a moment magnitude of MW=7.9±0.2 for the 1920 Haiyuan earthquake. It is likely that earthquake magnitudes based on the historical M were systematically overestimated.


Introduction
The 1920 Haiyuan earthquake is one of the most destructive earthquakes of the past century (Chen & Molnar, 1977;P. Zhang et al., 1988) and is commonly reported as the largest known event in the tectonically active northeastern Tibetan Plateau (Figure 1). Accurate knowledge of its magnitude and strain release is of great importance to our understanding of the seismic risks in this area. However, a confusing number of magnitude values on different scales have been reported for this earthquake (Table A1). The most often quoted of these are M 8.7 (Cavalié et al., 2008;Deng et al., 1986;Lasserre et al., 1999Lasserre et al., , 2002P. Zhang et al., 1987;W. Zhang et al., 1988), M 8 (Cavalié et al., 2008;Lasserre et al., 2007;, although the definition of the M used is unclear, and M S 8.5 (Ren et al., 2016(Ren et al., , 2018. However, M W 8.3 is currently Figure 1. Tectonic overview of the northeast Tibetan Plateau, the location of which is given by the red box on the inset map. Blue arrows with error ellipses represent the Global Positioning System velocity field relative to the stable Eurasia (Wang & Shen, 2020). White circles are earthquakes between 1904 and 2015 with magnitude above 5.0 in the ISC-GEM catalog (ISC, 2019). The 1920 Haiyuan earthquake is highlighted in orange. The red lines show the extent of the rupture, with the western and middle sections mapped on Pleiades imagery and the eastern section mapped on Google Earth with reference to the 1:50,000 geological map in IGCEA and NBCEA (1990). Black box shows the extent of Figure 2. magnitude m B (section 2.5). Then, we measure horizontal offsets on orthorectified Pleiades satellite and unmanned aerial vehicle imagery to estimate the coseismic slip distribution and the average slip of the 1920 earthquake and hence calculate its moment magnitude M W (section 3). Finally, we reconcile the differences in reported magnitudes and shed light on how to interpret historical earthquake magnitudes (section 4).

Tectonic Setting
The northeastern Tibetan Plateau is a tectonically active region bounded by the E-W-trending strike-slip Haiyuan Fault, the Altyn Tagh Fault, the Western Qinlin Fault, and the Kunlun Fault as well as the NW-SE-trending thrust faults in the Qilian Shan and Liupanshan (Figure 1) (Song et al., 2019;Wang et al., 2014;. The Qilian Shan and the Qaidam Basin absorb ∼10 mm/yr of N-S shortening, ∼25% of the shortening between India and the Alashan block (Wang et al., 2001). The combination of thrusts and strike-slip faults controls the elevation and eastward extrusion of the Tibetan Plateau in this NE part of the plateau (Meyer et al., 1998;Tapponnier et al., 1986;Wang et al., 2001).
The ∼1,000 km long Haiyuan Fault System is a major geomorphological structure in the NE Tibetan Plateau that accommodates oblique convergence with deformation partitioning between major left-lateral strike-slip faults and thrusts Tapponnier et al., 2001). The 1920 Haiyuan earthquake ruptured the eastern end of the Haiyuan fault (Figure 1), between the creeping Laohushan section (Jolivet et al., 2012) and the Liupanshan Thrust System (D. Zheng et al., 2006) (Figure 2). Estimated geological slip rates along the 1920 rupture range from 3.2 ± 0.2 mm/yr in the western section (Matrau et al., 2019) to 4.5 ± 1.0 mm/yr in the middle section (Li et al., 2009), which are consistent with the 5-6.3 mm/yr creep rate at Laohushan based on interferometric synthetic-aperture radar (InSAR) (Cavalié et al., 2008;Daout et al., 2016;Jolivet et al., 2012) and the 2.8-3.5 mm/yr overall slip rate absorbed by thrusting across Liupanshan based on measurements using the Global Positioning System (GPS) (Li et al., 2017).
Overall, the Haiyuan fault is a near-vertical plane that dips steeply to the south (Daout et al., 2016;Gaudemer et al., 1995). Seismic reflection profiles near Santang in the western section and the Salt Lake in the middle section suggested dips between 60°and 70° (Fan et al., 2004;Wang et al., 2012Wang et al., , 2014. A surface dip measurement of 67°was reported from a fault scarp at Haizixia in the eastern section (IGCEA & NBCEA, 1990). In addition, Duan et al. (2018) constrained, using the precise locations of small earthquakes, the dip of the 1920 Haiyuan rupture to be uniformly ∼80°above 8-km depth and found that, below 8-km depth, the dip decreases from ∼80°to ∼45°from west to east.
The seismogenic thickness in this region is well constrained. The current locking depth underneath the 1920 rupture, as inverted from InSAR, is 20 km (Jolivet et al., 2012). Eighty-nine percent of small earthquakes registered within 30 km of the 1920 rupture since 1976 occurred in the upper 22 km of the crust . Aftershocks of the 2016 M S 6.4 Mengyuan earthquake to the west of the 1920 rupture reached 19-km depth (Li, Jiang, et al., 2016;. Although Li, Shan, et al. (2016) suggested shallow locking depth of 5 km below the 1920 rupture from GPS measurements, they obtained similar locking depths of 22 km to the west of the Laohushan creeping section and 15-20 km at the Liupanshan. Therefore, even though the current locking depths under the 1920 rupture is still debated, we can assume that the locking depth of Haiyuan fault before the 1920 rupture was ∼20 km.

1920 Haiyuan Earthquake
The Haiyuan earthquake occurred at 12:05:55.42 UTC on 16 December 1920(ISC, 2018. Described as "Where the Mountain Walked" because of extensive landslides (Close & McCormick, 1922), this event caused over 230,000 casualties, making it the second deadliest earthquake in the twentieth century (IGCEA & NBCEA, 1990). The maximum intensity of shaking reached XII, with all buildings completely destroyed around the Salt Lake pull-apart basin and Xi'anzhou of Haiyuan County (Figure 2).
The previously reported seismological epicenters are scattered up to 30 km away from the rupture due to the sparse seismic network at the time and the coarser velocity model used ( Figure 2). The epicenter inferred from macroseismic shaking observations is at the Salt Lake, very close to the epicenter in the Centennial catalog (IGCEA & NBCEA, 1990). However, recent physics-based rupture models suggest an epicenter further west would better match the observed intensity of shaking pattern .
Detailed field mapping in the 1980s reported 237 km of surface rupture (Deng et al., 1986;IGCEA & NBCEA, 1990), which is now commonly divided into the western, middle, and eastern section based on geometry and the slip pattern (P. Zhang et al., 2005;Ren et al., 2016;Xu et al., 2019): the 70 km long 110°striking western section from Santang to the Dayingshui basin, the 90 km long 110-120°striking middle section from east of the Dayingshui basin to Luanduizi, and the 77 km long 145°striking eastern section from Luzigou to Haizixia ( Figure 2) (IGCEA & NBCEA, 1990).
The maximum slip was first thought to be the 10-m offset at Shikaguangou in the middle section (Deng et al., 1986;IGCEA & NBCEA, 1990), but this offset is now believed to be a cumulative offset (Ren et al., 2016;Xu et al., 2019). Dynamic rupture simulations predicted horizontal slip distributions that form bell-shaped curves on the three sections with the largest offset of 6.5 m near the center of the middle section .

A Brief History of Magnitude Scales
Following Richter's invention of the Californian local magnitude scale, M L , 21 years after the 1920 earthquake, Gutenberg released the first global earthquake catalog in terms of surface wave magnitude, M (Gutenberg & Richter, 1941). The M magnitude was derived from amplitudes of surface waves with periods around 20 s (Abe, 1981;Bormann, 2012;Geller & Kanamori, 1977;Gutenberg, 1945b). Realizing that surface waves of all deep earthquakes are relatively small, Gutenberg (1945a) developed the broadband body wave magnitude m, which later became known as m B (Gutenberg & Richter, 1956), and gave more uniform results  Ren et al. (2016). The isoseismal contours in brown are reproduced from Zhuang et al. (2018). Yellow dots show previously reported epicenters: HIST, catalog of historically felt earthquakes in China, preferred by Xu et al. (2019); CENT, centennial catalog (Engdahl & Villaseñor, 2002); ISC-GEM, International Seismological Centre-Global Earthquake Model catalog (ISC, 2019); ISS, International Seismological Summary catalog (Villasenor & Engdahl, 2005;Villaseñor et al., 1997); CB, center-bottom nucleation point used in the finite fault waveform modeling in section 2.4. Stars show relocated epicenters in this study from phase arrival times from waveforms (blue), bulletins (orange), and waveforms and bulletins combined (red), with hypocentral depth fixed at 15 km; the orientations of the 1-standard-deviation error ellipses reflect the azimuthal distribution of the seismic stations available (Figure 3). The green box around the strike-slip portion of the fault rupture represents Pleiades data coverage in Figure 8a. across hypocentral depths. After the implementation of the World-Wide Standardized Seismograph Network in the 1960s, m B was adapted to only measure 1-s P wave amplitudes and became the narrow band m b . It was only in the late 1970s that the moment magnitude, M W , was introduced to scale with radiated energy and moment release (Hanks & Kanamori, 1979;Kanamori, 1977). Significant efforts were made to unify the different scales by applying corrections (Pacheco & Sykes, 1992) or choosing preferred scales according to magnitude range (Engdahl & Villaseñor, 2002). The reported magnitude of the 1920 Haiyuan earthquake varied from source to source as summarized in Appendix A.

Data Acquisition and Preparation
According to the International Seismological Centre (ISC) online bulletin (https://www.isc.ac.uk/iscbulletin/search/bulletin/), 78 stations reported phase readings for the 1920 Haiyuan Earthquake (ISC, I. S. C., 2018). However, many of these records are no longer to be found due to war damage, fire, misplacement, and untracked lending over the years (Okal, 2015). Over 1 year of data hunting, we recovered 60 analog seismograms recorded at 27 stations in 12 countries on 13 unique types of instruments ( Figure 3 and Appendix B; also see Data Set S1 in the supporting information for archive contacts). The only Chinese analog seismogram of this event, recorded at the Zi-ka-wei station in Shanghai, was saturated upon the first P wave arrival and the recording stylus was dislocated soon after, rendering it unusable.
The seismograms are preserved in a variety of formats. Whether a record is suitable for magnitude determination depends critically on the availability of a scale reference. Newly made scans from the original smoked papers are the best as they retain the true dimensions. Photos have to be taken against a ruler placed in a way that minimizes parallax error. Microfiches were usually carefully prepared with scale references. Microfilms,  (c) show the station distribution for Europe and Japan, respectively. The three-letter codes in the plot represent station names. When followed by a fourth character in the text, they represent waveforms. "B," "W," and "G" are used to differentiate Bosch-Omori, Wiechert, and Gallitzin seismographs at those stations where multiple instruments were operational. "H" stands for "historical" when only one type of instrument was operational at the station (Table B1). however, were scanned in the past at unknown resolution. If the microfilm did not contain a scale on the page, the record could only be used for epicenter relocation and focal mechanism determination but not for amplitude measurements.
We first digitized the waveforms by hand as Bezier curves in version 2.10 of the GNU Image Manipulation Program (https://www.gimp.org). We digitized the traces starting from the beginning of the minute mark 3 min before the minute of first arrival and stopped where either a waveform becomes saturated, the trace breaks due to dislocation of the stylus, or at the edge of the seismogram if the left and right edges could not be reliably spliced together. As the seismograms were scanned at different resolutions, pixels-per-inch and pixels-per-minute values were both required to convert the digitized records into time series in units of millimeter and seconds (Figures 4a and 4b).  (Heimann et al., 2017). (c, d) Enlargements of the boxes in panel (b). Green markers highlight the amplitude (measured from zero) and period (crest-to-crest or trough-to-trough) measurements for m B calculation in section 2.5.
Due to the mechanical design of the seismographs, which involved a hinged stylus swinging on a rotating cylindrical drum, these analog waveforms have unwanted curvature and slant ( Figure 4a). To correct these geometric artifacts, we ran an inversion algorithm, developed at the University of Potsdam (see Kulikova & Krüger, 2015), to restore the symmetry of the waveforms based on the geometry of the instrument setup described in Čadek (1987). Instrument parameters including natural period, T o , magnification factor, V o , and either damping ratio, ϵ, or damping constant, h, were collated from the seismograms, scanned station bulletins, station-specific literature, Wood (1921), andMcComb andWest (1931), in descending order of preference.
When neither damping constant, h, nor damping ratio, ϵ, was found in literature, we calculated them from the calibration pulse at the start of the seismogram, as in the case of records GIFH, RCRH, CHVH, and TOLH, where the stylus was first set in motion after a change in paper and recorded a series of decaying oscillations (see Equations 4.27 and 4.31 in Scherbaum, 2001): where a k and a k+2 are amplitudes one full wavelength apart in the calibration pulse.
If we could not find damping information in literature and the seismograms did not contain the calibration pulse (e.g., AKIH, LPZH, and TACB), we chose as a best estimate the most common damping ratio of the same type of instrument in Wood (1921) or McComb and West (1931). These parameters (Appendix B) were used to calculate the instrument specific transfer functions, V(T), which is the key to converting between the recorded and the actual ground displacements (Scherbaum, 2001): where P 1,2 are the poles of the low-pass filter that describes the frequency response of the seismograph, A g is the ground displacement in micrometer, and A s is the recorded signal in millimeter on the seismogram.
The station books and local bulletins that we sourced from individual contacts and obtained from the the Istituto Nazionale di Geofisica e Vulcanologia's EUROSISMOS project webpage (https://sismos.rm.ingv.it/ en/index.php/bulletins) also provided additional phase arrival data which we added to the ISC bulletins for epicenter relocation in section 2.2.

Epicenter Relocation
We first picked the arrival times of P, PP, PPP, S, SS, SSS, and SP phases from the waveforms in the Snuffler browser of the Pyrocko toolbox (Heimann, 2019) ( Figure 4b; see Data Set S2 for picked phase times). Since only one vertical component waveform was available, the P-type phase arrivals were preferentially picked on the horizontal component that forms the smaller angle with the radial direction from the source, and the Stype phase arrivals were picked on the component with the earlier onset.
We then used version 6.0c of the Hyposat program package (Schweitzer, 2001) to invert for the origin of the rupture based on the Crust 1.0 and AK135 velocity models (Kennett et al., 1995;Laske et al., 2013). The program used both onset times and travel time differences to iterate for the best-fitting hypocenter. Phases with large time residuals were successively removed until all remaining phases had residuals below 30 s.
Three independent inversions were performed with arrival times from newly picked waveforms only, bulletins (previously published and newly added from station books readings) only, and the combined data, respectively. Due to the lack of depth phases, we fixed the depth at 15 km assuming the rupture started near the bottom of the seismogenic layer (Das & Scholz, 1983). Changing the fixed depth shifts the epicenter along the major axes of the error ellipses. The resultant epicenters are all within 20 km of Haiyuan and fall within the XII intensity isoseismal contour ( Figure 2; Zhuang et al., 2018). We chose the epicenter derived from the combination of waveforms and bulletins (105.540 ± 0.251E, 36.481 ± 0.256N) as our best estimate to use for focal mechanism determination as it represents a balance of less numerous but more accurate, and more numerous but less accurate, data, and it lies closest to the known geomorphic trace of the fault rupture.

Focal Mechanism Determination
Considering the often unknown polarity and inconsistent minute-mark lengths of our waveforms, we could not use the first motions nor waveform modeling to determine the earthquake focal mechanism, so we determined the focal mechanism by fitting the radiation pattern in terms of the ratio between P-and S-phase amplitudes.
First, Green's functions assuming a point source were computed for a distance range between 1,000 km (17.9°) to 18,000 km (162.2°) at intervals of 20 km, a depth range between 0 and 40 km at intervals of 2 km, and sampled at 2-Hz frequency. The calculation was based on the Qseis2006 code in the Fomosto tool of the Pyrocko package (Heimann et al., 2017Wang, 1999). The Green's functions were convolved with the instrument-specific transfer functions to generate synthetic seismograms in the same domain as the digitized records.
We then measured the amplitude of P, PP, PPP, S, SS, and PS automatically by extracting the maximum absolute amplitude of each phase from time windows determined from theoretical travel times. The misfit, η, in the amplitude ratios between real and synthetic seismograms was calculated as where K is the number of waveforms and N is the number of epicentral distance dependent i/j phase combinations.
We performed a grid search through strikes between 80°and 160°, dips between 20°and 90°, rakes between −60°and 60°at steps of 10°, and centroid depths between 0 and 30 km, at steps of 2 km. Figure 5a shows the best-fit strike, dip, and rake for each centroid depth suggesting the centroid depth is not well constrained. However, the focal mechanism is stable, with dip constrained between 80°and 90°and rake between −10°and 10°( Figure 5b). This range is consistent with our geological understanding that the 1920 Haiyuan earthquake was a shallow earthquake that slipped left laterally on a steeply dipping fault (Deng et al., 1984;Fan et al., 2004;IGCEA & NBCEA, 1990;Wang et al., 2012Wang et al., , 2014.

Waveform Modeling for M W
As the surface waves were clipped for the majority of our records, we only modeled the body wave portion of a subset of high-quality records. The records had to be long enough to contain the full body wave trains, be well-damped, and, more importantly, be of known polarity (Kulikova & Krüger, 2015Kulikova et al., 2016). For stations within 30°epicentral distances, we defined a time window from 100 s before the theoretical P wave arrival to 200 s after the theoretical S wave arrival. For stations further away, we set the end of the time window to 200 s after the theoretical SS arrival. As the historical instruments mostly had narrow frequency bands, we band-pass filtered both digitized and synthetic waveforms to between 0.01 and 0.1 Hz in order to preserve the signals. To characterize the misfit, we correlated the real and synthetic waveforms for the time shift to keep them aligned, before cropping them again to only include the time window with overlapping data. The waveforms were then Butterworth filtered with a cosine taper of 10 s before the average relative misfit, γ, was calculated: where K is the total number of records being modeled, N is the total number of points sampled in the extracted time window, and y obs and y syn are the real and synthetic waveforms sampled at 2 Hz (Kulikova & Krüger, 2015Kulikova et al., 2016). The synthetic waveforms in the earlier steps were generated using point sources, which means that all the seismic moment was released at once to create impulsive arrivals. However, with rupture velocities of 2.5-5.5 km/s, it would take about one minute to rupture a fault over 200 km. Therefore, in order to reproduce the moment rate embodied by the waveforms, we modeled the seismograms with a rectangular source. We initialized the fault as a single vertical plane that ruptures bilaterally, as indicated by our epicentral location near the central part of the rupture.
We tested the best-fitting focal mechanisms in Figure 5b

Journal of Geophysical Research: Solid Earth
and rake = 0°as the focal mechanism for magnitude determination. Fault dimensions had the least significant effect on γ, so we fixed the fault length and width to be 240 and 20 km, respectively, based on our geological understanding. With the nucleation point (center bottom, CB in Figure 2) and rupture speed (3 km/s) selected based on misfit values from our limited data set, the synthetic magnitude was then varied to find the best-fitting M W .
The magnitude with minimum misfit was M W 7.8, with γ = 0.86. However, the synthetic waveforms visibly underestimated the amplitudes, which we speculate resulted from the inconsistent minute mark lengths on our seismograms causing poor alignment between the synthetic and digitized waveforms. In order to avoid bias from misfits due to misalignment in the time domain, we used the root-mean-squared waveform displacement as a proxy for earthquake magnitude. The magnitude that provides the best match of the root-mean-square value of displacements sampled along the body wave trains in the overlapping time window was chosen to be the best-fit magnitude for each station component. Figure 6 shows waveform fits at our final magnitude M W 7.94, as averaged from the best M W for each station in terms of the root-meansquared displacement (Figure 7a). To check for consistency, a test with M W 7.94 and varying source parameters was performed. The misfit with the above parameter set was γ = 0.88, only 0.1% larger than the minimum misfit found by grid searching the source parameters again, which happened at a rupture speed of 2.8 km/s with all other parameters unchanged.
The ability of the synthetic waveforms to fit the data varies between stations. One reason could be that the parameters we compiled from historical documents failed to accurately describe the instrument response on the day. Rarely could we find the full set of parameters in the same document. The parameters  Table B1 only represent a combination of values we trusted the most, based on the date of publication, associated elaborations, and whether the values agree with what we observe on the seismograms. The fact that the synthetic waveforms tend to underestimate the P wave train could be related to the background noise in the seismograms which sometimes have similar amplitudes as the recorded P wave train. Therefore, we trust the S wave trains more as they are more energetic.

m B and M S Determination
To check if we could reproduce the body wave and surface wave magnitudes obtained by Gutenberg and Richter (1954) and Gutenberg and Richter (1941) and to understand the differences between M W and other types of magnitudes reported, we calculated m B and M S using our collected seismograms according to the following definitions: and where A H is the vectorially combined horizontal phase amplitude in μm and T is the corresponding period in s. Δ is the epicentral distance, and Q is a distance-and phase-dependent calibration value tabulated in Gutenberg and Richter (1956).
The classical m B , which is equivalent to the m in Gutenberg (1945a), is a medium-period broadband body wave magnitude (Bormann, 2012). Unlike the more familiar narrow-band m b , m B accepts PH, PV, PPH, PPV, or SH phase amplitudes with 2 s < T < 20 s and only saturates at ∼8.5, hence was a popular choice for determining magnitudes of large earthquakes before the 1960s.
Due to the ambiguity in the meaning of and method for obtaining Gutenberg's surface wave magnitude, M, we adopted the International Association of Seismology and Physics of the Earth's Interior (IASPEI) standard long-period teleseismic surface wave magnitude M S (20), which takes the highest phase amplitudes with 18 s < T <22 s from stations with epicentral distance 20°< Δ < 160° (Bormann, 2012).
For both m B and M S (20), if the measured phases were within half a wavelength on the two orthogonal horizontal components and their periods differed by less than 1 s, we vectorially combined the amplitudes to give the true horizontal amplitude, A H .
If vectorial combination was not possible, for m B , we back-projected the body wave P, PP, S, and SS phase amplitudes using the back-azimuth angle to obtain A H trigonometrically, expecting the body waves to be polarized into radial and transverse components. For M S (20), we did not use the station if we could not find matching phases satisfying the requirements.
The amplitudes, A, and periods, T, were measured in the Snuffler seismogram browser (Heimann, 2019) (Figures 4c and 4d). The measured displacements A s in mm were then converted to ground displacement A g in μm through the transfer function, V(T) (Equation 2).
The epicentral distances, Δ, and back-azimuth angles were calculated between our relocated epicenter and the latitudes and longitudes of historical stations in local bulletins or Wood (1921) using the locdiff utility function of the Seismic Handler software package (Stammler, 1993).
In total, 123 phase amplitudes and periods were measured (see Data Sets S3 and S4) to obtain an average magnitude for each station (Figures 7b and 7c). The wide phase options, broadband period allowance, and the back-azimuth method allowed us to retain 21 stations for body wave magnitude determination. The resultant magnitude is m B = 7.9 ± 0.3. Due to the strict requirements for M S (20), only three stations with long-period instruments fulfilled the criteria and gave a surface wave magnitude of M S (20) = 8.1 ± 0.2.

Geomorphological Approach
Mapping of rupture length and continuity, combined with measurements of coseismic slip, are valuable for estimating moment release, as well as assessing the geometrical complexity of the rupture and phenomena related to the rupture process (Clark et al., 2017;. For the 237 km long 1920 Haiyuan earthquake rupture, detailed field surveys along the entire rupture were performed in the 1980s (Deng et al., 1986;IGCEA & NBCEA, 1990;W. Zhang et al., 1987) (see Data Set S5 for their offset measurements), but their interpretation of the coseismic slip was challenged by Ren et al. (2016) based on 70 km of LiDAR data that partially covered the western and middle sections (pink patches in Figure 8a).
The 1980s field studies (Deng et al., 1986;IGCEA & NBCEA, 1990;W. Zhang et al., 1987) found that the surface slip distribution of the 1920 Haiyuan earthquake followed bell-shaped curves with peak offsets of 6, 10, and 4 m at the centers of the western, middle, and eastern sections (white squares in Figures 9a and 10a). In contrast, Ren et al. (2016) observed clustering of 1.5-3.5-m offsets and 8-12-m offsets along about two thirds of the western section (red error bars and gray bands in Figure 9a), hence classified the former as single-event displacements from the 1920 Haiyuan earthquake and the latter as cumulative displacements from 1920 and previous events. However, this classification leaves an ambiguity whether the offsets measured between 3.5 and 8 m represent single-event or cumulative displacements. In the middle section, Ren et al. (2016) argued that the 10-m maximum offset reported by Deng et al. (1986) and IGCEA and NBCEA (1990) (circled in Figure 10a

Journal of Geophysical Research: Solid Earth
These differences prompted us to revisit the mapping of offsets in order to see whether peak slip in excess of that measured by Ren et al. (2016) may have occurred in their data gap, to discriminate between the 3.5-8-m offset measurements in the western section and the scattered offsets in the middle section in order to interpret whether they were produced by the 1920 earthquake or represent more than one event. Finally, with a better understanding of the surface slip distribution along the entire length of the strike-slip western and middle sections and with the offsets reported by IGCEA and NBCEA (1990) on the eastern section, we estimate the seismic moment release M o and hence moment magnitude M W of the 1920 Haiyuan earthquake.

Image Processing and Field Acquisition
Twelve scenes of tri-stereo Pleiades optical imagery totalling over 1,115 km 2 were acquired on cloud-free and snow-free days in the winter between November 2016 and January 2017, in order to minimize vegetation To orthorectify the imagery, we first processed the panchromatic imagery using the Leica Photogrammetry Suite module offered by ERDAS IMAGINE software following the methods described in , Middleton et al. (2015), and Ainscoe et al. (2019). Tie points were computed based on satellite orbital information and then manually inspected to ensure even distribution and that they were away from shadows. We then used the Enhanced Automatic Terrain Extraction tool to extract point clouds based on relief displacement between images. The point clouds of all the possible combinations of images were combined to generate a raster DEM of 1-m resolution. After orthorectifying the multispectral and panchromatic imagery with the DEMs, we pan-sharpened the orthorectified multispectral imagery to the resolution of the orthorectified panchromatic imagery. The resultant ∼0.7-m resolution orthorectified multispectral imagery allowed us to map fault traces in detail and to measure horizontal offsets in the QGIS software.
Fieldwork was undertaken in June 2017 and June 2018 to measure selected offsets with a tape measure and to conduct low-altitude photographic surveys over 10 sites (Figure 8a). Ground control markers were distributed evenly across the survey areas and their geographic coordinates determined using differential GPS. Aerial photos were taken with a camera mounted on a DJI Phantom 4 quadcopter, which traversed the area at flying altitudes of ∼100 m. Flight itineraries were programmed to ensure 60% overlap between tracks and 30% overlap along track between consecutive photos. Using the structure-from-motion method (Westoby et al., 2012), we created orthophotos and a DEM from the aerial photos using the Agisoft Metashape software. The drone DEMs were of decimeter-resolution and served to validate measurements from the Pleiades imagery.

Offset Measurement
Surface rupture and horizontal offsets were mapped in QGIS software using the line feature tool (see Data Sets S7-S9 for the resultant .kml files and measurements). For each displaced stream, we measured the best offset between the thalwegs of channels, the minimum offset between the inner banks, and the maximum offset between the outer banks, using the method of Elliott et al. (2015) (Figure 11). Measurements are summarized as blue triangular probability distributions in Figures 9a and 10a.
In areas with complex rupture patterns, we measured the total slip as the sum of displacements across all participating strands. For example, at the en echelon faults at Baishuihe site (Figure 12), Gullies 1, 2, and 3 are interpreted as being offset by one splay and Gullies 4 and 5 by two splays. Hence, the summed best offsets measured are 3.3, 5.6, and 5.6 m for Gullies 1-3, 3.4 and 7.2 m for the two beheaded downstream channels of Gully 4, and 7.2 m for Gully 5. In rupture zones where we could not resolve the offsets on individual strands, the offsets were not measured. In this way, we ensured that the offset obtained represents the total slip rather than a partial slip on each strand.
The measurements we made on the satellite and low-altitude photogrammetric data sets are well correlated ( Figure 13a) and were corroborated with our field measurements (Figure 14). At Zhoujia Kiln, two streams 30 m from each other were displaced. The offsets measured on them were 3.2+3.2/−1.8 m and 1.5+1.1/−0.9 m from the Pleiades imagery, 2.9+2.1/−1.6 m and 1.5+0.9/−0.8 m on the drone DEM, and 2.8+1.9/−0.7 m and 1.4+1.0/−0.4 m in the field, all consistent with each other. As drone DEMs have higher resolution than Pleiades imagery, the drone measurements were preferred and, when available, used in place of Pleiades measurements for subsequent analysis.
When we compared our measurements to those made by Ren et al. (2016), we find that the set of features we selected to measure are not always identical due to different interpretations and features being highlighted in different ways by the different data sets (Figures 15a-15d). Sometimes, careful manual relocation of the measurements from Ren et al. (2016) was necessary to associate the measurements with the corresponding features. However, there is generally good agreement between the values, including similar uncertainties, obtained despite the different data sets employed and the different methods used to estimate offset (Figure 13b).
A quality rating was assigned to each offset measurement following Scharer et al. (2014). High quality was given to linear features that cross the fault at high angles (>60°), with well-defined edges and sharp

10.1029/2019JB019244
Journal of Geophysical Research: Solid Earth displacement. Medium quality was given to features that were sinuous or less well defined or cut the surface rupture at smaller angles. Low quality was given to features that were poorly rated on at least two of the three factors mentioned above.

Cumulative or Single Event Offsets?
It has been shown that slip variation over 100-m length scales is a real physical phenomena (Gold et al., 2013). Therefore, the only times we can be certain that an offset is cumulative is when there is a smaller offset within tens of meters from it. Thus, at Baishuihe site (Figure 12), we are confident that the offsets of Gullies 2, 3, and 5 are cumulative and the 3.3 and 3.4 m from Gullies 1 and 4 are likely single-event offsets from the 1920 earthquake. Similarly, at Mijiashan, Changtan, and Diwanxian sites in the western section (Figures 15a-15d), we see small offsets of 2-4 m juxtaposed against 5-9-m offsets, suggesting these larger offsets are cumulative offsets which could be attributed to 1920 and the previous event.

Surface Slip Distribution in the Middle Section
The middle section (Figure 10c) is characterized by a 35-km linear trace in the west that connects, through a 1 km wide right-stepping compressional jog, with a 20-km trace that penetrates the Salt Lake pull-apart basin, followed by another 35-km linear section, culminating with a series of en echelon faults at the eastern end. In the westernmost linear trace, we observed a gentle increase of slip from 2.9 and 4.3 m at Lijiaping, through 5.7 m at Guansiwan, to 6-7 m at Huadaozi (Figures 10d-10f and 10k-10m), following the same pattern reported by Deng et al. (1986) (Figure 10a). The section around the jog is now heavily altered by mining activity; hence, we could not verify the scattered offsets measured in the 1980s, but we suspect the cluster of small offsets measured by Ren et al. (2016) could be related to the strike-slip-inhibiting effect of the restraining bend. In the Salt Lake basin, a series of field boundaries have been offset by 2-5 m, with the 5 m highlighted in Ren et al. (2016) as the best preserved (Figure 10a). From the east of the Salt Lake all the way to Dagoumen is classified as the area of maximum shaking intensity during the 1920 earthquake (IGCEA & NBCEA, 1990). Along this section, we found the <5 m good quality offsets identified by Deng et al. (1986) (Figures 10a and 10g-10j) and verified that we could not find smaller offsets within tens of meters of some 5.4-7.1-m offsets highlighted by the gray band in Ren et al. (2016) (Figure 10a). Therefore, we kept them all as single-event offsets. Further east, the surface rupture becomes less continuous and fractures into en echelon splays with ∼3.3-m offset as shown in the Baishuihe site ( Figure 12).

The Eastern Section
Along the eastern section, Deng et al. (1986) reported 25 horizontal offsets between 2.7 and 4 m and one of 7.5 m (Figure 16a). The 7.5-m horizontal displacement was near a 1.0-m scarp height, whereas two other 3.8and 4-m horizontal offsets were near a 0.5-m scarp height. Therefore, we interpret the 7.5 m as a cumulative offset. The average and standard deviation of the other horizontal measurements is 3.5 ± 0.5 m.
The scarp height measurements were all associated with a southwest-dipping scarp on a northeast fault bounding the extensional Laohuyaoxian pull-apart basin. Therefore, they are formed by normal slip in a locally transtensional regime instead of the thrust motion that might be expected for an oblique convergent regime due to the fault bend. The other geomorphic features reported along the eastern section were tension cracks, landslides, and southwestern dipping scarps that provide no hard evidence for any thrust motion.
If we were to be conservative in estimating the maximum possible slip on the eastern section, we could calculate the shortening across the fault due to the change of strike, α, of 20°, and hence the potential thrust motion on the fault that would result (Figure 16b). First, we can assume the slip vector is conserved between the middle and eastern sections, that is, d′ = d = 5 m, the strike-slip component parallel to the eastern section would be D SS ¼ d ′ cosðαÞ ¼ 4:7 m and the expected shortening across the fault would be D SH ¼ d ′ sinðαÞ ¼ 1:7 m. Taking a 70°dip on the eastern section (Duan et al., 2018), we get D T = 5 m for the thrust component of the slip, D V = 4.7 m for the vertical displacement, and D = 6.9 m for the total slip on the eastern section. We note that the expected D SS = 4.7 m is larger than the 3.5 ± 0.5 m of horizontal Figure 11. Schematic illustration of converting minimum, best, and maximum offset measurements into a triangular probability density function (PDF).

10.1029/2019JB019244
Journal of Geophysical Research: Solid Earth offsets observed in the field (Deng et al., 1986). However, the average magnitude of slip can vary from section to section as it does between the western and middle sections. If we assume that only the slip direction is conserved, but not the slip magnitude, then scaling values by 3.5/4.7, we obtain D SH = 1.3 m, D T = 3.8 m, D V = 3.6 m, and D = 5.2 m.
However, no such significant vertical displacements (D V = 4.7 m or 3.6 m) were documented by Deng et al. (1986). A simple explanation could be that the stress field is modified by events on the thrusts around the eastern section at other times so that slip on the eastern section in 1920 was indeed pure strike slip.

Average Slip and M W
The moment magnitude, M W , can be calculated combining the moment release, M o , from the three sections: where μ is the shear modulus of the seismogenic layer, D is the average slip along the fault, L is length of rupture, and W is the down-dip width of the rupture (Aki, 1966;Kanamori, 1977). LiDAR DEM, suggesting high-resolution satellite imagery can be as good as LiDAR data for offset measurement . To estimate the average slip, D, on each section, we stacked the triangular PDFs from the 284 and 123 offset measurements along the rupture length to obtain the cumulative offset probability density (COPD) of the western and middle sections, respectively (Figures 9b and 10b). Assuming the offset population from each event follows a normal distribution, as was observed from, for example, the 1992 Landers earthquake (Milliner et al., 2015), we fit a Gaussian curve to the bottom half of the lowest peak of high-quality measurements, following Klinger et al. (2011), Kurtz et al. (2018), Zielke et al. (2010), and Zielke et al. (2015). We obtained the means and standard deviations of 3.0 ± 1.0 m for the western section and 4.5 ± 1.5 m for the middle section. These results agree with the single-event offsets we determined at Mijiashan (3.9 m), Changtan (1.9 and 2.7 m), Diwanxian (2.3 and 3.5 m) in the western section, and Baishuihe (3.4 m) at the eastern end of the middle section (Figures 15a-15d and 12).
To assess the uncertainties in our offset measurements and their potential impact on the average slip, D, we performed 20 "blind" measurements at each of eight example sites (two sites in Figure 14b and three high-quality [yellow] sites in Figure 15b from the western section and two sites in Figure 10p and one site in Figure 12e from the middle section) ( Figure S1). The measurements were carried out with the feature layer turned off in QGIS such that the line segments drawn to mark the offset lengths (see the yellow lines in Figure 14a and 14b) were invisible. Distances between the thalwegs, left banks, and right banks of the streams were measured with the image zoomed to different magnifications. All but one site have the reported thalweg-to-thalweg distances inside the two sigma range from the average of the 20 blind measurements. All but one site have the maximum value of the 20 blind measurements smaller than the reported outer-bank distance, and all sites have the minimum value of the 20 blind measurements larger than the reported inner-bank distance (Figure 11a). Therefore, our individual PDFs defined in Figure 11b cover 99% of the full range of possible measurements. The differences between the averages of 20 blind measurements and our reported thalweg-to-thalweg distances across eight examples average to 0.1 m, insignificant compared to the 1-and 1.5-m uncertainties associated with the average slips in the western and middle sections. Newly plotted COPDs based on measurements from Ren et al. (2016) and histograms based on measurements from Deng et al. (1986) also agree with our reported average slips ( Figures S2-S8).
There are also other possible sources of errors due to locally varying dip and strike, channel erosion, and scarp degradation that are unaccounted for, but given the steep fault dip and the near 0°rake constrained

10.1029/2019JB019244
Journal of Geophysical Research: Solid Earth by our focal mechanism ( Figure 5), we expect them to be negligible (Mackenzie & Elliott, 2017). Such uncertainties might be better assessed and the subjectivity of choosing the "best" marker further reduced in future studies by adopting a mathematical approach on high quality DEMs using a code such as that developed by Stewart et al. (2018).
Taking 70, 90, and 77 km as the lengths, L, of each section, a width, W, of 20 km, and a shear modulus μ of 34 GPa based on local velocity profiles (Chen et al., 2017), we calculate the total M o and obtain M W = 7.8 ± 0.1 ( Table 1). Uncertainty of 2 GPa in μ, 20 km in L, or 1.5 km in W only increases the uncertainty of M W by 0.02. As the average slip, D, is measured at the ground surface, it might be lower than the slip at depth, thus providing a lower bound on the moment estimate. The hypothetical case considered in Figure 16b, with a maximum possible slip of 6.9 m on the eastern section, gives M W = 7.9 even with maximum slip on the western and middle sections. This conservative upper bound value gives us confidence that the moment contributed by this possible, but unproven, scenario will be limited and its effect on M W negligible.

Average M W
For a direct comparison between the different estimates of the magnitude of the 1920 Haiyuan earthquake, we use the orthogonal regression relationships in Bormann and Saul (2008) and Storchak et al. (2012)   Our reestimated moment magnitude M W 7.9 ± 0.2 agrees with the M W = 8 estimated by Deng et al. (1984) using M o derived from spectral densities first published by W. Chen and Molnar (1977). W. Chen and Molnar (1977) initially obtained M W 8.3 instead of M W 8.0 because they modeled the earthquake using a dip of 45°and a rake of −45°. However, as our focal mechanism inversion shows, this earthquake happened on a predominantly vertical fault plane and slipped with a rake close to 0°. Therefore, M W 8.3, the current value in the ISC-GEM (ISC, 2019) and Centennial catalog (Engdahl & Villaseñor, 2002), both based on W. Chen and Molnar (1977), is likely to be overestimated.
The moment magnitude M W 7.9 and the rupture length of 230 km fit the scaling relationship for strike-slip earthquakes in Wells and Coppersmith (1994). With M W 7.9, the 1920 Haiyuan earthquake is similar in magnitude to the 2001 M W 7.8 Kokoxili earthquake Lasserre et al., 2005) and the 2008 M W 7.9 Wenchuan earthquake (Lin et al., 2009;Liu-Zeng et al., 2009;Xu et al., 2009) that also occurred within or on the boundary of the Tibetan Plateau.

m, m B and M W
Our m B average from global records corroborates the m 7.9 determined by Richter (1958) using Pasadena records and supports our understanding that m B and m are equivalent.
The regression relationship M W = 1.33×m B −2.36 is associated with a standard deviation of σ = ± 0.18 (Bormann & Saul, 2008), reflecting the scatter in the original catalogs. Considering the ∼0.2 uncertainty, the m B -converted M W = 8.1 ± 0.4 is not inconsistent with the M W = 7.9 ± 0.2 from body wave modeling. This is reassuring as the moment magnitude was originally designed to scale well with m B (Kanamori, 1977) and m B later proved to be a good estimator of M W (Bormann & Saul, 2008).

M S and M
There are two IASPEI standards of M S , a narrow-band M S (20) with period 20 ± 2 s, which we used, and a broad-band M S (BB) with periods 3-60 s (Bormann et al., 2009;Storchak et al., 2012). ISC recently obtained a broad-band surface wave magnitude M S (BB) = 8.06 ± 0.11 for this earthquake using amplitudes and periods reported in nine station bulletins (Di Giacomo, 2020 Abe (1981) found that the original M is systematically larger than his reworked M S by an average of 0.2 units, and in extreme cases, 0.9 units. This overestimation was most likely because Gutenberg vectorially combined the highest amplitudes of the N-S and E-W components, which might not have occurred at the same time (Bormann, 2012;Geller & Kanamori, 1977).  Bormann and Saul (2008) and Storchak et al. (2012)

M and M W
A regression relationship between M and M W has not been developed. Nevertheless, Richter (1958) introduced the relationship m = 2.5+0.63M which would convert an M value of 8.6 to an m value of 7.9, the same as our m B value as discussed above. Thus, when dealing with M of large shallow earthquakes between 1904 and 1952, it would be advisable to first convert M into m (m B ), which would then be comparable to M W as discussed in section 4.2.
In addition, values of M in some historical catalogs were assigned based on intensity of shaking. For example, Gu (1983) assigned the Haiyuan earthquake M8 1 2 based on I ∘ = XII. This conversion was made using the relationship M = 0.58I ∘ +1.5 derived by Li (1960)

Epicenter, Intensity, and Surface Slip Distribution
Our relocated epicenters based on the 1°Crust 1.0 velocity model converge on the town of Haiyuan. The answer is stable whether we use the waveforms only, the bulletins only, or these two data sets combined.
With the large number of stations in Europe and Japan, the epicenter is better constrained along strike than perpendicular to strike. Therefore, we believe a location at this longitude is more accurate than those reported in previous catalogs.
Earthquakes have been observed to nucleate and terminate near geometrical discontinuities (compressional or extensional jogs, triple junctions, and fault bend) (King & Nábělek, 1985). This phenomenon is observed for the 1976 Tangshan earthquake (King & Nábělek, 1985), the 1995 Tianzhu earthquake , and the 2001 Kokoxili earthquake . The colocation of our epicenter and the junction between the middle section and the eastern splay ( Figure 2) also supports this theory. We infer that the geometrical triple junction and the transition between the strike-slip and the transpression regimes could have concentrated stress near Haiyuan before the 1920 earthquake started (Aki, 1979;King & Nábělek, 1985).
As many 4-6-m offsets measured on the western section are actually cumulative, the average offset on the western section is only 3 ± 1 m, smaller than the 4.5 ± 1.5 m on the middle section and the 3.5 ± 0.5 m on the eastern section. Therefore, the surface slip distribution we obtained also correlates with the shape of the intensity contours, which is narrower in the west and wider in the east (Figure 2).

Conclusion
Using digitized analog seismograms, we have relocated the 1920 epicenter to 105.540E, 36.481N near Haiyuan, resolved the focal mechanism to be predominantly strike-slip on a vertical plane, and modeled the body waves to obtain M W = 7.9 ± 0.2. From horizontal offsets measured from the orthorectified Pleiades satellite imagery and DEM derived from aerial drone photos acquired by a quad-copter, we estimated M W = 7.8 ± 0.1 using the average slip and length of the rupture.
We also measured the phase amplitudes and periods of the waveforms to obtain broadband body wave magnitude m B = 7.9 ± 0.3 and the IASPEI standard surface wave magnitude M S (20) = 8.1 ± 0.2. The m B is equal to the m and m B previously reported by Gutenberg and Richter (1954) and Abe (1981). Converting m B and M into M W and averaging the best estimates from all four approaches, we constrain the magnitude of the 1920 Haiyuan earthquake to be M W = 7.9 ± 0.2, consistent within the uncertainty with the M W 8.0 published by Deng et al. (1984). The M S (20) is smaller than the original M 8.5 by Gutenberg and Richter (1941), suggesting Gutenberg's definition of M is not compatible with that of the modern M S and overestimates the size of earthquakes.
M reported based on intensity of shaking could be overestimated as well, as in the case of the Haiyuan earthquake, the relationship was derived from Gutenberg's surface wave magnitude M (Gu, 1983;Li, 1960).  The last digit of the four-letter station code is used to differentiate seismograms recorded by multiple types of instruments at the same station. "B," Bosch-Omori; "W," Wiechert; "G," Gallitzin; "E," Ewing; and "H," historical when only one type of instrument was in operation.

Journal of Geophysical Research: Solid Earth Data Availability Statement
Our scanned and digitized seismograms can be found in the IRIS Seismo Archives (https://ds.iris.edu/ seismo-archives/quakes/1920haiyuan/).