Geophysical Research Letters

Paleoseismic interevent times interpreted for an unsegmented earthquake rupture forecast


Corresponding author: T. Parsons, U.S. Geological Survey, Menlo Park, CA 94025, USA.


[1] Forecasters want to consider an increasingly rich variety of earthquake ruptures. Past occurrence is captured in part by paleoseismic observations, which necessarily see three-dimensional ruptures only at a point. This has not been a problem before, because forecasts have assumed that faults are segmented, and that repeated ruptures occur uniformly along them. A technique is now required to calculate paleo-earthquake rates at points that may be affected by multiple recurrence processes, and that is consistent with an all-possible-ruptures forecast. Dating uncertainties are addressed by bootstrapping across event time windows, and the resulting distributions are transformed into log space as f(ln(T)), where T is interevent time. This takes advantage of a property of time-dependent recurrence distributions in which their logarithms are normally distributed. Paleoseismic series necessarily have a finite number of observations such that the true long-term mean interevent time (μ) is hard to estimate. However, the mode (most frequent value) is easier to identify. Since the mode is equal to the mean of a normal distribution, μ can thus be found at the mode (m) of f(ln(T)) as μ = e^m. The point μ − σ occurs where 32% of a folded (half) normal distribution is found in the interval between ln(T) = 0 and m. The μ + σ value is identified by symmetry, which overcomes the difficulty of absent long intervals in the record. Tests are conducted with complex synthetic interevent distributions, and applications to real data from the Hayward and Garlock faults in California are shown.

1. Introduction

[2] Paleoseismic records of earthquake sequences arise only at some points along faults because they require specific geological conditions such as a continuous and datable sedimentary record that is thick enough to capture multiple earthquake disturbances, and that is deposited near an active fault. Because of this, many faults have limited or irregular spatial paleoseismic coverage. Thus paleo-sites are essentially a point process, and like earthquake epicenters, often cannot reveal much information about rupture dimensions or variability. However, they provide essential empirical mean earthquake rates that are crucial to seismic hazard assessment.

[3] If a fault is assumed to behave according to a characteristic earthquake model [e.g., Schwartz and Coppersmith, 1984; Wesnousky, 1994], where fault segments repeatedly rupture in a similar fashion, then one paleoseismic site along a fault segment is representative of that segment's earthquake recurrence distribution. However, if one hypothesizes a broader range of possible ruptures [e.g., Field and Page, 2011] that overlap, branch, or that have variable magnitudes [Weldon et al., 2004] with different recurrence distributions, then interpreting paleoseismic information at one point on a segment can become more complicated. Indeed, sites where overlapping ruptures are thought to occur, like Wrightwood and Pallet Creek on the San Andreas fault [Biasi and Weldon, 2009], have paleoseismic series that cannot be reproduced by any one recurrence distribution even after 50 × 10^6 attempts [Parsons, 2008a], signaling a more complex process. Additionally, earthquake sequences may change character, branching into long-term cycles of increased or diminished activity owing to fault interactions [e.g., Marzocchi and Lombardi, 2008] that obey different recurrence distributions.

[4] Paleoseismic observations reveal a number of earthquakes above an observable surface slip threshold (assumed proportional to magnitude) in a period. This empirical information is extremely valuable to earthquake forecasters who need a rate to make probability calculations. The variability in earthquake recurrence intervals due to inconsistency in the rupture process, which gives rise to aleatory uncertainty in hazard modeling, is perhaps best represented by paleoseismic observations. Two sources of epistemic uncertainty must be accounted for before a paleoseismic rate constrains a probability calculation: (1) dating uncertainty (usually radiocarbon dating), and (2) the effects of undersampling that can cause a time-limited historical or paleoseismic record to preferentially reflect the shortest intervals and miss the longest ones [Stein and Newman, 2004]. Dating uncertainty can be addressed by bootstrapping across the possible event time ranges (sampling a uniform PDF determined by the reported uncertainties) [e.g., Ellsworth et al., 1999; Biasi et al., 2002]. Undersampling has been accounted for by Monte Carlo sampling from long-tailed recurrence distributions [e.g., Console et al., 2008; Parsons, 2008a]. This has been necessary because the arithmetic mean of observed interevent times is unlikely to represent the true average recurrence: the means of distributions thought to represent earthquake occurrence are all skewed to the right of their modes, and many samples are required to capture that.

[5] In this paper I present a method to estimate the long-term mean and confidence bounds on the earthquake rate at a point when a segmented and/or characteristic earthquake rupture concept is not assumed. This application is to be explored for the Uniform California Earthquake Rupture Forecast version 3 (UCERF3); prior California forecasts have segmented faults by characteristic ruptures [e.g., Field et al., 2009]. Results are given as interevent times (T) rather than earthquake rates (1/T), because T is more intuitive and more often reported in the paleoseismic literature. The primary result of interest for UCERF3 is the range of allowable interevent times rather than the mean.

2. Method

[6] In accordance with UCERF, I assume that earthquake recurrence is time dependent, meaning that a Poisson process is not applicable. Probability density functions commonly thought to represent time dependent earthquake recurrence like the lognormal [e.g., Nishenko and Buland, 1987], Brownian Passage Time (inverse Gaussian) [Kagan and Knopoff, 1987; Matthews et al., 2002] or Weibull [Hagiwara, 1974] are skewed such that the logarithms of their probability density are approximately normally distributed [e.g., Sachs, 1984] (Figure 1).

Figure 1.

(a) Theoretical example of three different time-dependent recurrence processes affecting a point on a fault displayed as a combined probability density function. Relative amplitudes are governed by coefficients of variation. (b) The natural logarithms of recurrence times ln(T) are binned, which are distributed approximately normally, indicating that the sum of lognormal distributions can be interpreted as lognormal. Real observations are expected to be sparse for long interevent times, so the log distribution is summed up to the mode of a folded (half) normal distribution. The −1σ bound is the point x where 32% of the density of the folded distribution occurs. If the half-normal distribution is reflected across the mode (m), then the +1σ bound can be identified by symmetry. (c, d) The same process is shown but for a more complex example with very different recurrence means. More uncertainty is introduced, but reasonable values for the ±1σ bounds are found.

[7] The point (m) where the mode of the normal distribution f(ln(T)) occurs is also its mean (Figure 1). Thus e^m yields the “true” mean (μ) of the interevent time distribution. The left half of the ln(T) distribution can be thought of as a folded, or half normal distribution. Solving for the point x from

$$\int_{0}^{x} f(\ln T)\, d\ln T = 0.32 \int_{0}^{m} f(\ln T)\, d\ln T \qquad (1)$$

where 32% of the summed value of the folded distribution is located, also yields the point one standard deviation (σ) below the mean of the complete distribution. This is because 32% of the folded distribution is 16% of the complete distribution, which in turn marks the lower bound of where 68% of the density lies. With x read as a position on the ln(T) axis, the value of e^x thus represents the μ − σ bound on interevent time. The variable σ is used here to denote 68% confidence bounds on the mean of the skewed interevent distribution because it is also the standard deviation of the normal f(ln(T)) distribution. Operationally, f(ln(T)) is expressed as a histogram, which is summed numerically. Maximum likelihood estimators exist for folded normal distributions, but their calculation and error estimation are “troublesome” according to Johnson [1962]. The histogram mode is subject to binning uncertainties, so I make 10 bootstrap calculations for every series to ensure that it is stable.
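The 32% figure can be checked against the standard normal cumulative distribution Φ. The following is a short worked check, not part of the original derivation. It assumes that f(ln(T)) is exactly normal with standard deviation σ in log units and that negligible density lies below ln(T) = 0:

```latex
\frac{\int_{-\infty}^{m-\sigma} f(\ln T)\, d\ln T}{\int_{-\infty}^{m} f(\ln T)\, d\ln T}
  = \frac{\Phi(-1)}{\Phi(0)}
  = \frac{0.1587}{0.5}
  \approx 0.317
```

Rounded, about 32% of the half distribution lies below m − σ, which is about 16% of the complete distribution.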

[8] If there were total sampling of the underlying time dependent distribution, then ln(T) would be a complete normal distribution, fully symmetric about its mean. This is however unlikely for most paleoseismic sites unless a very long record is present. An advantage of transforming the data into log space is that symmetry applies. Therefore, if the more complete, left side of the normal distribution f(ln(T)) is reflected across its mode (m), then the +1σ bound on the recurrence interval estimate can be extrapolated to e^(2m − x) (Figure 1).
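The steps described so far (bin by ln(T), locate the mode, sum the left half to the 32% point, reflect across the mode) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the paper. The function name `folded_log_bounds`, the bin width, and the choice to read x from the sorted values rather than from bin edges are all hypothetical, and the synthetic lognormal input is invented for the example.

```python
import math
import random
from collections import Counter


def folded_log_bounds(intervals, bin_width=0.25):
    """Return (mu - sigma, mu, mu + sigma) interevent times in years.

    Assumption: x is treated as a position on the ln(T) axis, so the
    lower bound is exp(x) and the upper bound is exp(2m - x).
    """
    # Keep ln(T) >= 0 (T >= 1 yr), matching the summation range 0..m.
    logs = [math.log(t) for t in intervals if t >= 1.0]
    if not logs:
        raise ValueError("no intervals of at least 1 yr")
    # Histogram of ln(T); the fullest bin gives the mode m (bin centre).
    counts = Counter(int(v / bin_width) for v in logs)
    mode_bin = max(counts, key=counts.get)
    m = (mode_bin + 0.5) * bin_width
    # Folded (half) distribution: all values from ln(T) = 0 up to m.
    left = sorted(v for v in logs if v <= m)
    if not left:
        raise ValueError("no values at or below the mode")
    # x is the point below which 32% of the folded sum has accumulated.
    x = left[int(0.32 * len(left))]
    return math.exp(x), math.exp(m), math.exp(2.0 * m - x)


# Usage on invented data: lognormal intervals with ln-mean 5.375, ln-sd 0.6.
rng = random.Random(0)
synthetic = [rng.lognormvariate(5.375, 0.6) for _ in range(50000)]
lower, mean, upper = folded_log_bounds(synthetic)
```

Because the histogram mode depends on bin placement, a coarser or finer `bin_width` can shift m by up to a bin, which is consistent with the binning uncertainty noted in the text.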

[9] The primary uncertainty associated with this approach is that it approximates an unknown interevent time distribution, or combination of distributions, as generally lognormal. For example, a normal distribution fit to a Weibull or Brownian Passage Time distribution on a log axis could be biased because the Brownian Passage Time has a slightly heavier right tail and the Weibull a slightly heavier left tail. Combinations of different interevent behavior at a single site would also affect the resolution of this technique. I next examine the impacts of these effects.

3. Testing With Known Distributions

[10] I generate 100 recurrence distributions at random that are meant to simulate point process behavior in an unsegmented fault system. These are combinations of 1 to 5 Brownian Passage Time distributions, which can have means ranging from 0.1 to 1000 yr, and coefficients of variation from 0.1 to 1.0. Each is given a normalized random weighting ranging from 0.1 to 1.0 when combined; Figure 1 shows two examples. I choose to combine 5 or fewer distributions because I am most concerned about strongly multimodal distributions. If there are more distributions combined, then their modes tend to get smoothed out.
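A synthetic test distribution of this kind could be drawn as follows. This is a hedged sketch rather than the original test code. It assumes that the component means, coefficients of variation, and raw weights are drawn uniformly within the stated ranges (the text does not say how they are drawn), and it samples the Brownian Passage Time as an inverse Gaussian with mean μ and shape μ/α² using the transformation method of Michael, Schucany, and Haas. The function names are hypothetical.

```python
import math
import random


def sample_bpt(mu, alpha, rng):
    """One draw from a Brownian Passage Time (inverse Gaussian) distribution
    with mean mu and coefficient of variation alpha."""
    lam = mu / alpha ** 2                      # inverse Gaussian shape
    y = rng.gauss(0.0, 1.0) ** 2
    x = (mu + mu * mu * y / (2.0 * lam)
         - (mu / (2.0 * lam)) * math.sqrt(4.0 * mu * lam * y + mu * mu * y * y))
    # Accept the smaller root with probability mu / (mu + x).
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x


def random_mixture(rng):
    """1 to 5 components as (mean, cov, normalized weight) tuples.
    Uniform draws over the stated ranges are an assumption."""
    k = rng.randint(1, 5)
    raw = [(rng.uniform(0.1, 1000.0),          # mean interevent time, yr
            rng.uniform(0.1, 1.0),             # coefficient of variation
            rng.uniform(0.1, 1.0))             # raw weight
           for _ in range(k)]
    total = sum(w for _, _, w in raw)
    return [(mu, a, w / total) for mu, a, w in raw]


def sample_mixture(components, n, rng):
    """Draw n interevent times from the weighted combination."""
    weights = [w for _, _, w in components]
    picks = rng.choices(components, weights=weights, k=n)
    return [sample_bpt(mu, a, rng) for mu, a, _ in picks]


# Usage: one random mixture and 1000 interevent times drawn from it.
rng = random.Random(42)
components = random_mixture(rng)
times = sample_mixture(components, 1000, rng)
```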

[11] The folded ln(T) method as described above is applied to each of the 100 synthetic distributions, and their actual ±1σ ranges are found by counting where 68% of their density occurs. These numbers are compared with calculated values from the proposed method as a way to assess it (Figure 2). The μ − σ bounds on interevent times are fit fairly well, with all misfit magnitudes less than 50 yr, and the majority less than 20 yr (Figures 2a and 2b). This is not surprising since the method is tuned to fit the μ − σ bound, and the logarithmic bins of the ln(T) histogram are smallest at low T. The tradeoff from being able to apply Gaussian statistics to ln(T) is that each bin has increasing width with increasing T. When the folded normal distributions of ln(T) are reflected across the modes/means, and the μ + σ bounds on interevent times are extrapolated, the misfits are thus proportionately larger (Figures 2c and 2d), increasing with greater mean interevent times (μ) (Figure 3). However, while μ + σ misfits are greater, the majority are less than 20 yr.

Figure 2.

(a) Actual vs. modeled (μ − σ) values on interevent time are plotted against each other; the red line shows a slope = 1.0 for reference. “Actual” refers to the variability of the tested distributions. (b) The absolute misfits are shown as a histogram. The misfits are relatively small on the lower (μ − σ) bounds. (c) The actual and modeled (μ + σ) values are plotted. The dashed black line shows linear fit with a slope of 1.12, suggesting that the method skews a little high. (d) Misfits are greater on the upper (μ + σ) bound because of logarithmic binning. (e) Misfits are given as a function of the number of events used to model mean interevent times (from 100 realizations).

Figure 3.

(a) Mean interevent times (μ) are plotted against +1σ misfits, and (b) μ + σ values are plotted against their misfits. Red lines show linear fits that indicate misfit is a function of increasing interevent time. (c) Calculated μ values are plotted against the actual means. If the method worked perfectly all the points would fall on the red line, which would have a slope of 1.0, but instead there is scatter and the slope is 0.9.

[12] Real paleoseismic observations are subject to radiocarbon dating uncertainties, and have varying numbers of events. A second test is applied where intervals are drawn at random from an example known distribution (parameters also generated at random: μ = 195, coefficient of variation α = 0.6). Dating uncertainty bounds of ±50 yr are added to each interval, and interval distributions are created by bootstrapping across time windows for each event. The number of included events is systematically reduced from 15 to 2, with misfits being stable down to ∼4 events (Figure 2e).

4. Example Application to Observations: South Hayward and Central Garlock Faults

[13] When the method is applied at California sites, it returns reasonable earthquake rate estimates based on comparison with results from prior methods (±1σ ranges encompass past results: see Table S1 in Text S1 of the auxiliary material).

[14] However, in the interest of full disclosure, I show applications to two California paleoseismic series that cover the range from closest to greatest mismatch from past studies. The first series comes from Tyson's Lagoon, which lies on the south Hayward fault in the San Francisco Bay region of California, and preserves a ∼1900-yr record of 11 paleoearthquakes and one historic event in 1868 [Lienkaemper et al., 2010]. Possible intervals are bootstrapped across reported time windows for each event from radiocarbon dating uncertainty as shown in Figure 4a. In this example, 1000 series of 12 events are drawn from within the uncertainty bounds, and their intervals calculated.
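Bootstrapping across event time windows can be sketched as below. This is an illustrative sketch under assumptions, not the procedure as coded for the paper: each event date is drawn from a uniform distribution over its window, and each drawn series is sorted before differencing so that overlapping windows give short but never negative intervals (the sorting step is an assumption). The function name and the windows in the usage lines are invented and are not the Tyson's Lagoon dates.

```python
import random


def bootstrap_intervals(windows, n_series, rng):
    """Pool interevent times from n_series random draws of event dates.

    windows: list of (earliest, latest) calendar years, one per event.
    """
    pooled = []
    for _ in range(n_series):
        # One date per event, uniform within its dating window.
        dates = sorted(rng.uniform(lo, hi) for lo, hi in windows)
        # Differences between consecutive events are the interevent times.
        pooled.extend(later - earlier
                      for earlier, later in zip(dates, dates[1:]))
    return pooled


# Usage with made-up windows (years); 1000 series of 3 events each.
rng = random.Random(7)
example_windows = [(0.0, 100.0), (150.0, 250.0), (300.0, 400.0)]
intervals = bootstrap_intervals(example_windows, 1000, rng)
```

A series of 12 events drawn 1000 times, as in the Hayward example, would pool 11 × 1000 intervals in the same way.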

Figure 4.

Best and worst case results: (a) observed Hayward fault paleoseismic event intervals from Lienkaemper et al. [2010] are bootstrapped across reported radiocarbon dating uncertainties. (b) The same information is displayed except binning is by natural logarithm of interevent time. The mode of the folded normal distribution is 5.35 (T = 211). This value is interpreted as the mean of the complete distribution, and the ±1σ range for Hayward fault interevent times extends from 116 to 384 yr. “Missing” data refers to the longest interevent times that are inferred to exist based on the long-tailed time-dependent recurrence distributions used to make earthquake probability calculations. (c, d) The same method is applied on a strongly multimodal series from the central Garlock fault. Distinct clustering behavior makes it more difficult to interpret this series as having one mean interevent time.

[15] The same procedure was used on the real data as described in Sections 2 and 3. Observations were rebinned by ln(T), and the mode identified (Figure 4b). Equation (1) is solved to find the μ − σ value. The folded normal distribution in log space is then reflected across the mode, and the μ + σ value is found through symmetry. The exponentials of these values yield a mean interevent time on the south Hayward fault of 211 yr, and a ±1σ range from 116 to 384 yr. Rate values are the reciprocals, yielding 0.0026–0.0086 events/yr, with a mean of 0.0047 events/yr. The mean recurrence interval value of 211 yr is essentially the same as the 210-yr value calculated using a Monte Carlo technique by Parsons [2008b].
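The reported Hayward numbers can be checked with a few lines of arithmetic. The sketch below uses only values stated in the text (mode 5.35, bounds of 116 and 384 yr); the function name is hypothetical. It also confirms that the two bounds sit symmetrically about the mode in log space, as the reflection step requires.

```python
import math


def interevent_to_rates(lower_t, mean_t, upper_t):
    """Convert interevent times (yr) to rates (events/yr).
    The longest interval gives the lowest rate, so the order flips."""
    return 1.0 / upper_t, 1.0 / mean_t, 1.0 / lower_t


m = 5.35                                  # mode of f(ln(T)), from Figure 4b
mean_t = math.exp(m)                      # mean interevent time, yr
low_rate, mean_rate, high_rate = interevent_to_rates(116.0, mean_t, 384.0)

# Midpoint of the two bounds in log space; should lie close to m.
log_midpoint = 0.5 * (math.log(116.0) + math.log(384.0))
```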

[16] The concern about absent longest intervals seems to be present with the Hayward fault record, as the normal distribution of ln(T) is truncated (“missing” data area shown in Figure 4b). This issue is important in cases like UCERF, where most of the calculated earthquake rates come from long-term fault slip rate measurements inferred from offsets of geologic features that have occurred over 10^3–10^6 yr periods. Since California paleoseismic constraints mostly represent 10^2–10^3 yr scales, their rates need to be consistent with longer-term, fault-slip-rate solutions that may have involved many more earthquake cycles. Further, UCERF has applied time-dependent Brownian Passage Time functions (with their long tail assumptions) for probability calculations; thus the underlying earthquake rate values also need to be consistent with that assumption.

[17] The ∼250-km long left-lateral Garlock fault trends roughly east–west across southern California. The El Paso Peaks paleoseismic site lies on the central segment of the fault, and is located in an extensional stepover that has been filled by an ephemeral stream. Dawson et al. [2003] report on six well-resolved earthquakes that happened during the past 7000 yr. They also note that earthquake occurrence has been very irregular (intervals range from 215 to 3300 yr), which can be seen in Figure 4c. Bootstrapping of the central Garlock intervals results in a distinctly multimodal distribution. Re-binning by natural logarithm puts the mode of the distribution on the very high end, which returns a very high mean interevent time of μ = 3362 yr. In cases like this, it may be best to consider different modes individually, because of potential rupture mode switching [e.g., Zöller et al., 2007; Hillers et al., 2009], or double branching behavior [Marzocchi and Lombardi, 2008] in which the Garlock fault may only be stressed by slip events on the San Andreas fault, rather than directly by plate motions [Parsons, 2006]. As a comparative measure, I calculated the fit of the observed intervals to a lognormal distribution using a Kolmogorov-Smirnov (KS) test (see auxiliary material for details). While neither the Hayward nor Garlock series can be confirmed as lognormally distributed at high significance (if they could this paper would not be necessary), the significance level of the Hayward series (55% confidence) far exceeds that of the central Garlock series (1%).

[18] Lastly, a general consequence of dating uncertainty is that some event windows overlap, which when bootstrapped, leads to many interevent times that are close to zero (Figure 4a). These are not part of the transformed normal distribution shown in Figure 4b. They instead result in an isolated spike in negative log space. This occurrence can be interpreted as short-term clustering behavior (like aftershocks), and therefore can be added as a static shift to the interpreted rate if desired.

5. Conclusions

[19] A simple method is identified to calculate long-term earthquake rates from point process observations of paleo-earthquakes. A feature of time-dependent earthquake recurrence distributions is that binning them by their natural logarithms results in a normal distribution. This allows identification of the μ − σ bound from the most complete part of the observed record, while the μ + σ bound is extrapolated by symmetry. Tests using complicated, multi-modal synthetic distributions show that the method works. This method can estimate earthquake rates at sites known to have multiple recurrence processes operating, and where Monte Carlo methods have failed such as at the Pallet Creek and Wrightwood sites. An application using real data from the south Hayward fault returns virtually the same mean as that found with computationally intensive Monte Carlo sampling. However, very irregular sequences with long interevent times such as those on the Garlock fault remain difficult to interpret.


[20] My thanks to Eric Geist and David Schwartz for their constructive review comments on an initial draft and to Seth Stein, Max Werner, and Editor Andrew Newman for their helpful GRL reviews.

[21] The Editor thanks Maximilian Werner and Seth Stein for their assistance in evaluating this paper.