## 1. Introduction

[2] The obvious clustering of earthquakes in space and time implies that an earthquake increases the probability of another one nearby. This clustering is most often described by mathematical models of the pair-wise relationship of earlier and later events. By analogy with population studies, the earlier earthquake is often called a “parent”, and the later one a “child”. Some events, called “immigrants”, apparently arise spontaneously, without known parents, but immigrants may trigger children. Many models allow parents to have many children, but some allow only one parent per child. Most models specify a space- and time-dependent event rate based on the location, time, magnitude, and perhaps other properties of all previous events. Goodness-of-fit of a model is evaluated by the degree to which the child events occur when and where the calculated event rate is high.

[3] Many analyses evaluate goodness-of-fit by summing a log-likelihood score for some cataloged earthquakes in a specific space, time and magnitude window called the “target window”. We argue that some events outside this target window should be included in the evaluation because they may have spawned events in the target window. Thus, we introduce the “auxiliary window” which includes possible triggers of the events in the target window. The combination of the target window and the auxiliary window is called the “data window”. The triggering potential should be estimated for all earthquakes in the data window, but the model's success should be based only on quakes in the target window.
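The distinction between the three windows can be made concrete with a short sketch. The `Event` and `Window` types and the `split_catalog` helper below are hypothetical names introduced only for illustration; they assume each window is a simple box in time, longitude, latitude, and magnitude, and that the data window contains the target window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    t: float  # occurrence time (e.g. decimal years)
    x: float  # longitude
    y: float  # latitude
    m: float  # magnitude


@dataclass(frozen=True)
class Window:
    # Hypothetical box-shaped space-time-magnitude window.
    t0: float
    t1: float
    x0: float
    x1: float
    y0: float
    y1: float
    m_min: float

    def contains(self, ev: Event) -> bool:
        return (self.t0 <= ev.t <= self.t1
                and self.x0 <= ev.x <= self.x1
                and self.y0 <= ev.y <= self.y1
                and ev.m >= self.m_min)


def split_catalog(events, target: Window, data: Window):
    """Return (target_events, auxiliary_events).

    Events outside the data window are dropped. Events inside the data
    window but outside the target window form the auxiliary set: they
    may trigger target events but are not scored themselves.
    """
    in_data = [ev for ev in events if data.contains(ev)]
    target_events = [ev for ev in in_data if target.contains(ev)]
    auxiliary_events = [ev for ev in in_data if not target.contains(ev)]
    return target_events, auxiliary_events
```

In this sketch both sets would feed the rate calculation, while only `target_events` would enter the goodness-of-fit score.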

[4] One approach would be to assume the data window is fixed in advance, and ask what target window meets the criterion that events inside have their parents within the data window. Instead, we assume that the target window is fixed in advance, and we focus on the adequacy of the auxiliary window. How big is big enough? In general bigger is better, but very distant, early, or small events may have little influence within the target window, and practical considerations enter the decision. We provide here some tools for assessing the adequacy of the auxiliary window and the value of expanding it.

[5] The optimum size of the auxiliary window depends on the temporal, spatial, and magnitude extent of triggering, the very properties that the model is designed to measure. Thus we have a “Catch-22”: we need to know the answer before we can choose the data needed to get the answer. For that reason we explore the behavior of model parameters for both real and simulated catalogs as a function of the auxiliary window size. We use the Epidemic-Type Aftershock Sequence (ETAS) model introduced by *Ogata* [1988, 1998]; it allows both multiple children and multiple parents.

[6] Mathematically, this model is expressed in the form of the conditional rate (or conditional intensity) [see, e.g., *Daley and Vere-Jones*, 2003], *λ*(*t*, *x*, *y*∣*H*_{t}), defined as the expected seismicity rate at time *t* and location (*x*, *y*) given *H*_{t}, the history of events prior to time *t*. If the background seismicity rate is stationary and the magnitude distribution is independent, as suggested by *Zhuang et al.* [2005], the conditional rate function can be written

where

where *η*(*x*, *y*) is the background (spontaneous) seismicity rate, *m*_{0} is the minimum magnitude threshold, the parameter *p* sets the temporal decay rate of primary (first-generation) triggered events, the parameter *K*_{0} scales the expected number of aftershocks per event, and the exponent *a* governs the effect of an earthquake's magnitude on its expected number of aftershocks. The sum covers all previous earthquakes in the catalog.
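As a rough illustration of how such a conditional rate is evaluated, the sketch below sums a background term and one triggering term per earlier event. The kernel forms are assumptions: an exponential productivity law, a normalized Omori-type time density, and a normalized power-law spatial density scaled by magnitude, which is a common space-time ETAS parameterization and not necessarily the exact one used here. The function names and the `params` dictionary are hypothetical.

```python
import math


def kappa(m, K0, a, m0):
    # Assumed productivity law: expected number of direct aftershocks.
    return K0 * math.exp(a * (m - m0))


def g(dt, c, p):
    # Assumed Omori-type time density, normalized to 1 on (0, inf); needs p > 1.
    return (p - 1.0) / c * (1.0 + dt / c) ** (-p)


def f(dx, dy, m, d, q, gamma, m0):
    # Assumed power-law spatial density, normalized over the plane; needs q > 1.
    s = d * math.exp(gamma * (m - m0))
    return (q - 1.0) / (math.pi * s) * (1.0 + (dx * dx + dy * dy) / s) ** (-q)


def intensity(t, x, y, history, eta, params, m0):
    """Conditional rate at (t, x, y) given earlier events.

    history: iterable of (t_i, x_i, y_i, m_i) tuples
    eta:     callable giving the background rate at (x, y)
    params:  dict with keys K0, a, c, p, d, q, gamma (assumed names)
    """
    rate = eta(x, y)
    for ti, xi, yi, mi in history:
        if ti >= t:
            continue  # only strictly earlier events can contribute
        rate += (kappa(mi, params["K0"], params["a"], m0)
                 * g(t - ti, params["c"], params["p"])
                 * f(x - xi, y - yi, mi, params["d"], params["q"],
                     params["gamma"], m0))
    return rate
```

Because the loop runs over every earlier event supplied in `history`, dropping auxiliary events from that list lowers the computed rate wherever they would have contributed.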

[7] We believe our results apply to clustering models in general [e.g., *Kagan*, 1991] but prefer the ETAS model for several reasons. It is widely used and its properties are widely explored. *Wang et al.* [2010a] showed that the ETAS model of *Zhuang et al.* [2005] has some advantages over others and effectively distinguishes spontaneous from triggered events. *Sornette and Werner* [2005a, 2005b] studied the relationship in the ETAS model between the lower magnitude threshold and the branching ratio, that is, the average number of directly triggered events per event [*Helmstetter and Sornette*, 2003]. *Wang et al.* [2010b] studied the uncertainties in ETAS parameter estimates. The bias in parameter estimates introduced by neglecting earthquakes below a lower magnitude cutoff has been investigated by *Schoenberg et al.* [2010].

[8] Given an earthquake catalog, the parameters of the ETAS model (1) can be estimated by maximizing the log-likelihood function,

where θ = (*K*_{0}, *p*, *c*, *d*, *q*, *γ*) is the parameter vector for (1) [*Daley and Vere-Jones*, 2003] and *S* = [*x*_{0}, *x*_{1}] × [*y*_{0}, *y*_{1}]. Here, [*T*_{0}, *T*_{1}] × [*m*_{1}, *m*_{max}] × *S* represents the target space-time-magnitude window. The distinction between the target and auxiliary windows is best seen in equations (1) and (6): all events are used in (1), but only those in the target window appear in (6). Qualitatively, events in both windows contribute to the forecast earthquake rate *λ*(*t*, *x*, *y*∣*H*_{t}), but only those in the target window contribute to the log-likelihood *L*(θ).
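For orientation, the generic log-likelihood of a space-time point process restricted to a target window can be written as below. This is the standard textbook form, given here as a sketch in the notation of this section; the exact expression in equation (6), in particular how the magnitude window enters, may be written differently.

```latex
L(\theta) \;=\;
\sum_{i:\; t_i \in [T_0, T_1],\; (x_i, y_i) \in S,\; m_i \in [m_1, m_{\max}]}
\log \lambda(t_i, x_i, y_i \mid H_{t_i})
\;-\;
\int_{T_0}^{T_1} \!\! \iint_{S} \lambda(t, x, y \mid H_t)\, dx\, dy\, dt
```

The sum and the integral run only over the target window, while each *λ* inside them depends on the full history *H*_{t}, which is where auxiliary events enter.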

[9] The target and auxiliary windows are subject to choice. In this study we fixed the target window and varied the size of the auxiliary window. Based on California catalog comparisons by *Kagan et al.* [2006] and *Wang et al.* [2009], we chose the Advanced National Seismic System (ANSS) earthquake catalog. The target space-time-magnitude window, which includes 1122 events, is:

We varied the start of the auxiliary time window in 5-year steps from 1940 to 1980; it always ended in 1980. We varied the minimum auxiliary magnitude threshold from 3.8 to 3.0 in 0.1-unit steps and the size of the space window by 2-degree steps in both longitude and latitude.
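The time and magnitude settings described above can be enumerated with a few lines of code. This is only a sketch: the helper names are hypothetical, the text does not say whether the settings were varied jointly or one at a time, and the spatial steps are omitted because the extent of the space window is not given here.

```python
def auxiliary_time_starts(first=1940, last=1980, step=5):
    # Start years of the auxiliary time window, earliest first.
    return list(range(first, last + 1, step))


def auxiliary_magnitude_thresholds(high=3.8, low=3.0, step=0.1):
    # Minimum auxiliary magnitudes, from least to most inclusive.
    # Values are rounded to one decimal to avoid floating-point drift.
    n = round((high - low) / step)
    return [round(high - k * step, 1) for k in range(n + 1)]


if __name__ == "__main__":
    for year in auxiliary_time_starts():
        print("auxiliary window starts in", year)
    for m in auxiliary_magnitude_thresholds():
        print("auxiliary magnitude threshold", m)
```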

[10] To measure the bias introduced by this estimation procedure, including data limitations, we generated synthetic catalogs from an ETAS model with known parameters. For every operation we performed on the actual catalog, we did the same on twelve synthetic catalogs; each covered a larger area, time, and magnitude range than the actual data. Simulating over an expanded data window assured that the simulated quakes were consistent with the assumed ETAS parameters over the actual data window. The simulation parameters are shown in Table 2. We estimated ETAS parameters for the simulated catalogs as we did for the actual data, using the same auxiliary and target windows. For simplicity in the simulations, we assumed that the spontaneous seismicity rate *η*(*x*, *y*) was spatially uniform.
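A synthetic catalog of this kind can be generated by a branching simulation, sketched below under several assumptions that are not taken from the text: magnitudes follow a truncated Gutenberg-Richter law with b-value `b`, the triggering kernels are an exponential productivity law with normalized Omori-type and power-law densities, and every parameter value is a placeholder. The function `simulate_etas` is a hypothetical name, and this is not necessarily the procedure used for the twelve catalogs.

```python
import math
import random


def simulate_etas(rng, t0, t1, box, mu, m0, b, K0, a, c, p, d, q, gamma,
                  m_max=8.0):
    """Return a time-sorted list of (t, x, y, m) inside box and [t0, t1)."""
    x0, x1, y0, y1 = box
    beta = b * math.log(10.0)

    def draw_magnitude():
        # Inverse-CDF draw from an exponential law truncated to [m0, m_max].
        u = rng.random()
        return m0 - math.log(1.0 - u * (1.0 - math.exp(-beta * (m_max - m0)))) / beta

    def poisson(mean):
        # Count unit-rate exponential arrivals before `mean`.
        n, s = 0, rng.expovariate(1.0)
        while s < mean:
            n += 1
            s += rng.expovariate(1.0)
        return n

    # Spatially uniform background: mu is a rate per unit time per unit area.
    n_bg = poisson(mu * (t1 - t0) * (x1 - x0) * (y1 - y0))
    queue = [(rng.uniform(t0, t1), rng.uniform(x0, x1), rng.uniform(y0, y1),
              draw_magnitude()) for _ in range(n_bg)]
    events = []
    while queue:
        tp, xp, yp, mp = queue.pop()
        events.append((tp, xp, yp, mp))
        s = d * math.exp(gamma * (mp - m0))
        for _ in range(poisson(K0 * math.exp(a * (mp - m0)))):
            # Inverse-CDF draws from the assumed time and distance kernels.
            dt = c * ((1.0 - rng.random()) ** (-1.0 / (p - 1.0)) - 1.0)
            if tp + dt >= t1:
                continue
            r = math.sqrt(s * ((1.0 - rng.random()) ** (-1.0 / (q - 1.0)) - 1.0))
            phi = rng.uniform(0.0, 2.0 * math.pi)
            # Children outside the box stay in the queue so that their own
            # offspring can fall back inside; the box filter is applied last.
            queue.append((tp + dt, xp + r * math.cos(phi),
                          yp + r * math.sin(phi), draw_magnitude()))
    inside = [e for e in events if x0 <= e[1] <= x1 and y0 <= e[2] <= y1]
    return sorted(inside)
```

The simulation terminates only if the process is subcritical, that is, if the mean number of direct aftershocks per event is below one for the chosen `K0`, `a`, and `b`. Running it over a box and time span larger than the data window, then cropping, mirrors the expanded-window strategy described above.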