The physics of earthquake triggering together with simple assumptions of self-similarity imply the existence of a minimum magnitude m0 below which earthquakes do not trigger other earthquakes. Noting that the magnitude md of completeness of a seismic catalog is not, in general, the same as the magnitude m0 of the smallest triggering earthquake, we compare observed aftershock sequence parameters with the predictions made by the epidemic-type aftershock sequence model to constrain the value of m0. In particular, we use quantitative fits to observed aftershock sequences from three previous studies, as well as Båth's law, to obtain four estimates of m0. We show that the branching ratio n (average number of triggered earthquakes per earthquake, also equal to the fraction of aftershocks in a seismic catalog) is the key parameter controlling the estimate of the minimum triggering magnitude m0. Conversely, physical upper bounds for m0 estimated from rate and state friction indicate that at the very least, 55% of all earthquakes are aftershocks.
Scale invariance in earthquake phenomena is widely manifested empirically, in the Gutenberg-Richter (GR) magnitude-frequency relation, in the Omori aftershock decay rate, and in many other relationships. Scale invariance means that there are no preferred length scales in seismogenic processes and in spatiotemporal structures. However, there are many reports that purport to identify characteristic scales. As emphasized by Matsu'ura, Aki, and Sornette, the search for characteristic structures in specific fault zones could allow the separation of large earthquakes from small ones and thus advance earthquake prediction.
Although there is clear evidence of deviations from self-similarity at large scales [Kagan, 1999; Pisarenko and Sornette, 2003], the issue is much murkier at small scales. For instance, Iio reports a lower-magnitude cutoff mmin ≈ −1.4 for very small aftershocks of the 1984 Western Nagano Prefecture, Japan, earthquake (mJMA = 6.8), even though the high sensitivity of the observation system (focal distances less than 1 km and very low ground noise) would have permitted the detection of much smaller magnitudes. On the basis of induced seismicity associated with deep gold mines, Richardson and Jordan find a lower-magnitude cutoff mmin ≈ 0 for friction-dominated earthquakes, while fracture-dominated earthquakes have no lower cutoff but an upper cutoff of magnitude ≈1. Using deep borehole recordings, Abercrombie [1995a, 1995b] found that small earthquakes exist down to at least magnitude 0 and that source scaling relationships hold down to at least −1. On the basis of seismic power spectra, of evidence for a low-velocity, low-Q zone reaching the top of the ductile part of the crust, and of seismic guided waves in fault zones, Li et al. argue for a characteristic earthquake magnitude of about 3 associated with the width of fault zones. Another characteristic magnitude in the range 4–5 is proposed by Aki on the basis of the simultaneous change of coda Q⁻¹ and the fractional rate of occurrence of earthquakes in this magnitude interval. At Parkfield, Heimpel and Malin found evidence of a transition from creep-dominated slip to earthquake-dominated slip taking place in the range of magnitudes close to M = 0.9, above their detection limit of M = 0.3. The authors underline that their results do not suggest the existence of a minimum earthquake size, but rather indicate a nucleation scale in their stochastic rupture model. They further consider it likely that this scale varies with the geological setting.
Similarly, Marone and Kilgore  suggested that the critical slip distance over which strength breaks down during nucleation in models of velocity-weakening friction scales with shear strain in fault zones. Therefore, if the critical slip scale fixes a minimum earthquake size, as we consider below, then the smallest earthquake may be nonuniversal and change with the maturity or gouge thickness of the fault.
From a theoretical point of view, the equation of motion for a continuum solid is scale-independent, suggesting that deformation processes in solids should produce self-similar patterns manifested in power law statistics. However, the symmetry of an equation does not guarantee that the solutions of this equation share the same symmetry. The difference (when it exists) in the symmetry between a solution and its governing equation is known as the phenomenon of “spontaneous symmetry breaking” [Consoli and Stevenson, 2000] and underlies a large variety of systems (explaining for instance the nonzero masses of fundamental particles [Englert, 2002]). Of course, length scales associated with rheology and existing structures can produce deviations from exact self-similarity. For instance, a transition from stable creep to a dynamic instability at a nucleation size whose dimensions depend on frictional and elastic parameters defines a minimum earthquake size [Dieterich, 1992], estimated at magnitude ≈−3 by Ben-Zion. This minimum size corresponds only to events triggered according to the mechanism of unstable sliding controlled by slip weakening and thus concerns friction-dominated earthquakes.
 A different perspective is offered by models of triggered seismicity in which earthquakes (so-called foreshocks and main shocks) trigger other earthquakes (so-called main shocks and aftershocks, respectively). Recent studies suggest that maybe more than 2/3 of events are triggered by previous earthquakes [see Helmstetter and Sornette, 2003b, and references therein]. In this context, the relevant question is no longer how small is the smallest earthquake but how small is the smallest earthquake which can trigger other earthquakes (and, in particular, larger earthquakes).
The effects of seismicity below the detection threshold in models of triggered seismicity are also considered in the closely related work by Sornette and Werner. In particular, earthquakes too small to detect with the current network sensitivity are shown to bias the estimates of the branching ratio and the background event rate. The article also uses the estimates of the smallest triggering earthquake established below to link the apparent (measured) percentage of triggered quakes in a seismic catalog to the real fraction.
2. ETAS Model and the Smallest Triggering Earthquake
To make the discussion precise, let us consider the epidemic-type aftershock sequence (ETAS) model, in which any earthquake may trigger other earthquakes, which in turn may trigger more, and so on. Introduced in slightly different forms by Kagan and Knopoff and by Ogata, the model describes statistically the spatiotemporal clustering of seismicity.
The ETAS model consists of three assumed laws about the nature of seismicity viewed as a marked point process. We restrict this study to the temporal domain only, summing over the whole spatial domain of interest. First, the magnitude of any earthquake, regardless of time, space or magnitude of the mother shock, is drawn randomly from the exponential Gutenberg-Richter (GR) law. Its normalized probability density function (pdf) is expressed as
$$P(m) = \frac{b \ln(10)\, 10^{-b(m - m_0)}}{1 - 10^{-b(m_{\max} - m_0)}}, \qquad m_0 \le m \le m_{\max}, \qquad (1)$$
where the constant exponent b is typically close to one, and the cutoffs m0 (see below) and mmax serve to normalize the pdf. The upper cutoff mmax is introduced to avoid unphysical, infinitely large earthquakes. Its value was estimated to be in the range 8–9.5 [Kagan, 1999]. As the impact of a finite mmax is quite weak in the calculations below, replacing the abrupt cutoff mmax by a smooth taper would introduce negligible corrections to our results.
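As an illustration, the truncated GR law can be sketched in Python. The snippet assumes the standard truncated exponential form, b ln(10) 10^(−b(m − m0)) normalized on [m0, mmax]. The function names and the default value m0 = 0 are hypothetical choices made only for this example; b = 1 and mmax = 8.5 are taken from the ranges quoted in the text.

```python
import math
import random

def gr_pdf(m, b=1.0, m0=0.0, mmax=8.5):
    """Truncated Gutenberg-Richter pdf on [m0, mmax]; zero outside."""
    if m < m0 or m > mmax:
        return 0.0
    norm = 1.0 - 10.0 ** (-b * (mmax - m0))
    return b * math.log(10.0) * 10.0 ** (-b * (m - m0)) / norm

def gr_sample(rng, b=1.0, m0=0.0, mmax=8.5):
    """Draw one magnitude by inverting the truncated GR cdf."""
    u = rng.random()  # uniform in [0, 1)
    norm = 1.0 - 10.0 ** (-b * (mmax - m0))
    return m0 - math.log10(1.0 - u * norm) / b

rng = random.Random(0)
magnitudes = [gr_sample(rng) for _ in range(5)]
```

Sampling by inverse transform keeps every draw inside [m0, mmax], so no rejection step is needed.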
Second, the model assumes that direct aftershocks are distributed in time according to the modified “direct” Omori law [see Utsu et al., 1995, and references therein]. Denoting the usual Omori law exponent by p = 1 + θ and assuming θ > 0, the normalized pdf of the Omori law can be written as
$$\Psi(t) = \frac{\theta\, c^{\theta}}{(t + c)^{1+\theta}}, \qquad (2)$$
where t is the time since the earthquake and c is a constant.
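A minimal sketch of the normalized Omori law follows, assuming the form θ c^θ / (t + c)^(1+θ) with θ > 0. The function names are hypothetical; the defaults θ = 0.1 and c = 0.001 days are illustrative values of the order discussed later in the text.

```python
import random

def omori_pdf(t, theta=0.1, c=0.001):
    """Normalized Omori pdf for the waiting time t >= 0 (in days)."""
    return theta * c ** theta / (t + c) ** (1.0 + theta)

def omori_cdf(t, theta=0.1, c=0.001):
    """Probability that a direct aftershock occurs before time t."""
    return 1.0 - (c / (t + c)) ** theta

def omori_sample(rng, theta=0.1, c=0.001):
    """Draw one waiting time by inverting the cdf."""
    u = rng.random()  # uniform in [0, 1)
    return c * ((1.0 - u) ** (-1.0 / theta) - 1.0)

rng = random.Random(0)
waiting_times = [omori_sample(rng) for _ in range(5)]
```

Because θ is small, the distribution is very heavy tailed: individual draws can span many orders of magnitude in time.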
Third, the number of direct aftershocks of an event of magnitude m is assumed to follow the productivity law
$$\rho(m) = k\, 10^{\alpha (m - m_0)}, \qquad m_0 \le m \le m_{\max}, \qquad (3)$$
where k and α are constants. Note that the productivity law (3) is zero below the cutoff m0; that is, earthquakes smaller than m0 do not trigger other earthquakes. This is typically assumed in studies using the ETAS model. The existence of the small-magnitude cutoff m0 is necessary to ensure the convergence of the models of triggered seismicity (in the statistical physics of phase transitions and in particle physics, this is called an “ultraviolet” cutoff which is often necessary to make the theory convergent). Below, we show that there are observable consequences of the existence of the cutoff m0 thus providing constraints on its physical value.
 Since the present formulation of the ETAS model requires cutoffs to ensure its convergence, it is interesting to mention the variation recently introduced by D. Vere-Jones (A class of self-similar random measures, submitted to Journal of Applied Probability, 2004, hereinafter referred to as Vere-Jones, submitted manuscript, 2004). This modified model is completely self-similar yet well defined and convergent. No cutoffs break the self-similarity. To remove all scales, he first requires the Omori law constant c to be a function of magnitude so that the plateau following large main shocks lasts longer than for smaller shocks. Secondly, α is set equal to b. Thirdly, he introduces a function S that penalizes a large departure of a daughter's magnitude from the mother's magnitude. This implies that the magnitudes of the daughters are distributed according to a modified GR law, with a bend around the mother magnitude. This completely self-similar model without cutoffs requires a conditioning of aftershock magnitudes on mother magnitudes, for which observational evidence remains to be established. This question is of fundamental importance in order to clarify whether the ETAS cutoff magnitude m0 is of real physical relevance to earthquake triggering. We will not consider the model of Vere-Jones (submitted manuscript, 2004) further here, but see A. Saichev and D. Sornette, Vere-Jones' self-similar branching model, submitted to Physical Review E, for a theoretical analysis parallel to the ETAS model.
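The key parameter of the ETAS model is defined as the number n of direct aftershocks per earthquake, averaged over all magnitudes. Here, we must distinguish between the two cases α = b and α ≠ b:
$$n = \int_{m_0}^{m_{\max}} P(m)\,\rho(m)\,dm = \begin{cases} \dfrac{k b}{b - \alpha}\, \dfrac{1 - 10^{-(b-\alpha)(m_{\max} - m_0)}}{1 - 10^{-b(m_{\max} - m_0)}}, & \alpha \ne b, \\[2ex] \dfrac{k\, b \ln(10)\,(m_{\max} - m_0)}{1 - 10^{-b(m_{\max} - m_0)}}, & \alpha = b, \end{cases} \qquad (4)$$
where P(m) denotes the GR pdf (1) and ρ(m) the productivity law (3).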
 Three regimes can be distinguished on the basis of the value of n. The case n < 1 corresponds to the subcritical regime, where aftershock sequences die out with probability one. The case n > 1 describes unbounded, explosive seismicity that may lead to finite time singularities [Sornette and Helmstetter, 2002]. The critical case n = 1 separates the two regimes.
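The branching ratio and the three regimes can be sketched as follows. The closed forms assume that n is the productivity law k 10^(α(m − m0)) averaged over the truncated GR pdf on [m0, mmax]. Function names and the example value k = 0.02 are hypothetical.

```python
import math

def branching_ratio(k, alpha, b, m0, mmax):
    """Average number of direct aftershocks per earthquake."""
    span = mmax - m0
    norm = 1.0 - 10.0 ** (-b * span)
    if math.isclose(alpha, b):
        # Limit alpha -> b of the general expression below.
        return k * b * math.log(10.0) * span / norm
    return k * b / (b - alpha) * (1.0 - 10.0 ** (-(b - alpha) * span)) / norm

def regime(n):
    """Classify the branching process by the value of n."""
    if n < 1.0:
        return "subcritical"
    if n > 1.0:
        return "supercritical"
    return "critical"

n_example = branching_ratio(k=0.02, alpha=0.8, b=1.0, m0=0.0, mmax=8.5)
label = regime(n_example)
```

For fixed k, lowering m0 or raising α toward b increases n, which is why the cutoff m0 matters for convergence.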
 The fact that we use the same cutoff for the productivity cutoff and the Gutenberg-Richter (GR) cutoff is not a restriction as long as the real cutoff for the Gutenberg-Richter law is smaller than or equal to the cutoff for the productivity law. In that case, truncating the GR law at the productivity cutoff just means that all smaller earthquakes, which do not trigger any events, do not participate in the cascade of triggered events. This should not be confused with the standard but incorrect procedure in many previous studies of triggered seismicity of simply replacing the GR and productivity cutoff m0 with the detection threshold md in equations (1) and (3) [see, e.g., Ogata, 1988, 1998, 2004; Kagan, 1991; Guo and Ogata, 1997; Console et al., 2003b; Ogata et al., 2003; Zhuang et al., 2004]. This may lead to a bias in the estimated parameters.
 The realization that the detection threshold md and the triggering threshold m0 are different leads to the question of whether we can extract the size of the smallest triggering earthquake. Here, we infer useful information on m0 from the physics of earthquake triggering embodied in the simple ETAS formalism, from Båth's law, and from available catalogs.
We will assume that the detection threshold md of a seismic catalog is (currently still) larger than the smallest triggering earthquake m0. This assumption seems justified since, for instance, Helmstetter et al. found that m = 2 earthquakes trigger their own sequences of (possibly larger) magnitudes. Their Figure 1 presents evidence that the scaling of aftershock productivity continues down to at least magnitude 2. This implies that m0 has not yet been observed directly and is below the detection threshold.
There is no loss of generality in considering one (independent) branch (sequence or cascade of aftershocks) of the ETAS model. Let an independent background event of magnitude M1 occur at some origin of time. We will refer to independent (nontriggered) background events as main shocks or initial shocks and any triggered events as aftershocks, independent of magnitude. The main shock will trigger direct aftershocks according to the productivity law (3). Each of the direct aftershocks will trigger its own aftershocks, which in turn produce their own, and so on. Averaged over all magnitudes, each aftershock produces n direct offspring according to (4). Thus over all time, we can write the average of the total number Ntotal of direct and indirect aftershocks of the initial main shock as an infinite sum over terms of (3) multiplied by n to the power of the generation [Helmstetter and Sornette, 2003b], which can be expressed for n < 1 as
$$N_{\mathrm{total}} = k\,10^{\alpha(M_1 - m_0)}\left(1 + n + n^2 + \cdots\right) = \frac{k\,10^{\alpha(M_1 - m_0)}}{1 - n}. \qquad (5)$$
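However, since we can only detect events above the detection threshold md, the total number of observed aftershocks Nobs of the sequence is simply Ntotal multiplied by the fraction of events above the detection threshold, given by $(10^{b(m_{\max} - m_d)} - 1)/(10^{b(m_{\max} - m_0)} - 1)$ according to the GR distribution. The observed number of events in the sequence is therefore
$$N_{\mathrm{obs}} = \frac{k\,10^{\alpha(M_1 - m_0)}}{1 - n}\; \frac{10^{b(m_{\max} - m_d)} - 1}{10^{b(m_{\max} - m_0)} - 1}. \qquad (6)$$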
Equation (6) predicts the average observed number of direct and indirect aftershocks of a main shock of magnitude M1 > md. To estimate m0, we need to eliminate or find estimates of the three unknowns n, k, and Nobs. We can eliminate k through the expression (4) for n, leaving n and Nobs. The mean number of observed aftershocks as a function of main shock magnitude M1 was estimated by Helmstetter et al., Felzer et al., and Reasenberg and Jones and can also be obtained from Båth's law. In the following sections, we use these four estimates for Nobs and thus obtain m0 as a function of the only remaining unknown n. Acknowledging the controversy surrounding the estimation of the percentage of aftershocks in a catalog, we nevertheless use existing estimates of n to finally obtain quantitative values for m0.
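The averages discussed above can be checked with a small sketch. It assumes that the mean total number of aftershocks is the direct productivity divided by 1 − n, and that the detected fraction follows from the truncated GR law. The Monte Carlo part is a deliberate simplification and not the full ETAS model: every aftershock is given a Poisson number of offspring with mean n, that is, magnitudes are averaged out before sampling. All function names and numerical values are hypothetical.

```python
import math
import random

def n_total(k, alpha, M1, m0, n):
    """Mean number of direct plus indirect aftershocks (requires n < 1)."""
    if not 0.0 <= n < 1.0:
        raise ValueError("the geometric sum converges only for n < 1")
    return k * 10.0 ** (alpha * (M1 - m0)) / (1.0 - n)

def detected_fraction(b, md, m0, mmax):
    """Fraction of GR-distributed events on [m0, mmax] lying above md."""
    return (10.0 ** (b * (mmax - md)) - 1.0) / (10.0 ** (b * (mmax - m0)) - 1.0)

def n_obs(k, alpha, b, M1, m0, md, mmax, n):
    """Mean number of aftershocks above the detection threshold."""
    return n_total(k, alpha, M1, m0, n) * detected_fraction(b, md, m0, mmax)

def poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def simulated_mean_total(direct_mean, n, trials=20000, seed=0):
    """Average cascade size when each aftershock has Poisson(n) offspring."""
    rng = random.Random(seed)
    grand_total = 0
    for _ in range(trials):
        pending = poisson(rng, direct_mean)  # first generation
        while pending > 0:
            grand_total += pending
            pending = poisson(rng, n * pending)  # next generation
    return grand_total / trials

estimate = simulated_mean_total(direct_mean=5.0, n=0.5)  # theory: 5 / (1 - 0.5) = 10
```

The simulated mean should approach the geometric-sum value as the number of trials grows; individual cascades fluctuate strongly, increasingly so as n approaches 1.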
 As we rely on fits and estimates of constants to obtain m0, it is useful to attempt an error estimation of m0 given the variation of these constants. In particular, we can solve equation (6) for m0 and find its variation Δm0 with Δn, which amounts to assuming that the leading error in m0 comes from the relatively poorly known n. This leads to
For Δn ≃ 0.2 and b − α ≃ 0.2, one obtains Δm0 ≃ 1.6 for n = 0.5, and Δm0 ≃ 4.4 for n = 0.9. Given that the other parameters may also contain errors (see also below) and that the estimates of n may be biased by undetected seismicity [Sornette and Werner, 2005], these error estimates may themselves contain large errors. We therefore stress that the following sections present order of magnitude calculations.
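The following is a sketch of how an error estimate of this kind can arise. It assumes α < b and mmax large enough that the terms 10^(−b(mmax − md)) and 10^(−(b−α)(mmax − m0)) can be neglected after k is eliminated between (4) and (6). It is a reconstruction under these stated approximations and not necessarily the exact expression intended above.

```latex
N_{\mathrm{obs}} \simeq \frac{n}{1-n}\,\frac{b-\alpha}{b}\,
  10^{\alpha M_1 - b m_d}\,10^{(b-\alpha)\, m_0}
\;\Longrightarrow\;
m_0 \simeq \frac{1}{b-\alpha}\left[\log_{10}\frac{1-n}{n}
  + \log_{10}\frac{b\,N_{\mathrm{obs}}}{b-\alpha} + b\,m_d - \alpha M_1\right],
\qquad
|\Delta m_0| \simeq \frac{\Delta n}{(b-\alpha)\,\ln(10)\; n\,(1-n)} .
```

With Δn = 0.2 and b − α = 0.2, this sketch gives |Δm0| of about 1.7 for n = 0.5 and about 4.8 for n = 0.9, the same order as the values quoted above.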
3. Constraint on the Smallest Triggering Earthquake From the ETAS Model and Observed Estimates of Aftershock Numbers
Following the recipe outlined above, we begin by using the estimates of the observed number of aftershocks Nobs obtained by Helmstetter et al. in order to find m0 as a function of n. Helmstetter et al. sidestepped the problems associated with maximum likelihood estimates of the complete model parameters by fitting stacked observed aftershock rates within predefined space-time windows using the formula
based on the scaling laws (the GR law, the Omori law, and the productivity law) discussed above. The constant Kfit includes all aftershocks, direct and indirect, and thus corresponds to a global renormalized constant different from k in the ETAS productivity law (3). Furthermore, pfit is also a global exponent, which may be different from the local exponent 1 + θ of the ETAS model for n close to 1 and at not too long times, as explained by Sornette and Sornette and by Helmstetter and Sornette. The total number of aftershocks is then obtained by integrating over an unnormalized Omori law according to Helmstetter et al.:
Equating the ETAS model prediction Nobs(M1) given by (6) with the empirical estimate Nfit(M1) given by (10), and eliminating the unknown k through the expression for n in (4) leads to an equation for m0 as a function of n:
for α ≠ b and
for α = b.
Expression (11) shows that, provided an estimate of the branching ratio n is available, we can deduce m0, since the other quantities can be measured independently: b is close to 1, α is usually between 0.5 and 1, md depends on the catalog but is often about 3, c is typically close to 0.001 days, and Kfit in equation (8) is obtained from the calibration of the productivity of earthquakes as a function of their magnitude. In Table 1 of their study, Helmstetter et al. report values for Kfit in the range from 0.0009 to 0.0193 days^(p−1), 0.94 ≤ α ≤ 1.16, b ≈ 0.95, and md = [2, 3]. They find c < 0.001 days and p = 0.9. We will thus assume θ = 0.1 (see above: the global exponent pfit may differ from the local exponent 1 + θ).
We note that md appears in the expression (10) for m0. Clearly, a detection threshold that evolves with seismic technology should not influence the physics of triggering. We thus expect m0 to be independent of md. The reason md does appear in the expression can be traced to the GR law (1), which is normalized over the magnitude interval from m0 to mmax. When integrated to give the probability of m lying in the range from md to mmax, the factor involving md does not enter as simply as in the formulation (8) of Helmstetter et al. Therefore the factors do not cancel out when comparing the ETAS prediction with the assumed parameterization of Helmstetter et al., and md remains in the equations. Assuming that the GR law is correctly normalized in the present ETAS model, this implies a (weak) dependence of Kfit on md. Given the correlation between α and Kfit (see below), the estimates of α may thus also depend on md. Finally, for practical purposes we note that for any reasonable values of the other parameters, the influence of md is negligible.
 The estimate of m0 that we are trying to obtain relies on the adequacy of the model used here and on the stability and reliability of the quoted parameters. For now, we sidestep any possible difficulties in the determination of the parameters and present in Figure 1 the magnitude of the smallest triggering earthquake m0 as a function of the average number n of direct aftershocks per main shock for a range of parameters. For n = 0, m0 equals the largest possible earthquake mmax, representing the limit that earthquakes do not trigger any aftershocks. At the other end, for n = 1, the formula predicts that m0 diverges to minus infinity. Recall that n = 1 corresponds to the system being exactly at the critical value of a branching process and the statistical average Nobs(m) of the total number of events triggered over all generations by a mother event of magnitude m becomes infinite. Of course, individual sequences have a finite lifetime and a finite progeny with probability one and the theoretical average loses its significance because of the fat-tailed nature of the corresponding distribution [Athreya and Ney, 1972; Saichev et al., 2005; Saichev and Sornette, 2004]. Therefore the prediction on m0 becomes unreliable for n close to 1 (how close to 1 depends on b − α which controls the amplitude of the fluctuations from realization to realization).
 For a wide range of n and combinations between α and Kfit, the magnitude of the smallest triggering earthquake lies between 0 and −10. Only for values of n above 0.9 does the size of m0 become smaller than −10. For reference, a magnitude −10 event roughly corresponds to a fault of length 1 mm, that is, to grain size.
 Given that we expect m0 to be smaller than the detection threshold md, the horizontal line at md = 3 serves as a (very) conservative estimate of the upper limit of m0 and thus provides constraints on the combination of parameters α, Kfit and n. For example, for α = 1, at least 65% of all earthquakes must be aftershocks. This lower limit increases drastically to about 90% for α = 0.5. Note that we extrapolate Kfit from the observed values for α around one to smaller values of α using an exponential fit (see below).
We can obtain another external bound on m0 from estimates of the minimum slip required before static friction drops to kinetic friction and unstable sliding begins, according to models of velocity-weakening friction. For example, the parameter dc in rate- and state-dependent friction [Dieterich, 1992, 1994] was estimated at 0.5 m from seismograms [Ide and Takeo, 1997] and similarly at 40–90 cm from slip-velocity records [Mikumo et al., 2003], although both probably correspond to upper bounds. Estimates of dc from laboratory friction experiments give 1 to 100 μm, approximately 4 to 6 orders of magnitude less than the upper bound determined by seismic studies. One could conclude that either the upper bound from seismic studies is so extreme as to render the comparison to laboratory studies meaningless, or the slip weakening process is in fact different at laboratory scales [Kanamori and Brodsky, 2004]. Scholz related the critical slip to a minimum nucleation length Lc = Gdc/([B − A]σ), where G is the shear modulus, σ is the effective normal stress, and B − A is a material property. Following Lapusta and Rice, we take the values G = 30,000 MPa, B − A = 0.004, σ = 50 MPa, and dc = 100 μm to obtain Lc ≈ 15 m, that is, of order 10 m. If we assume that the minimum slip needed to initiate unstable sliding scales with the minimum length of a friction-based earthquake, then, neglecting fracture-based earthquakes, Lc corresponds to the size of the smallest earthquake. Given that the smallest triggering earthquake must be equal to or larger than the smallest earthquake, but that the estimate of Lc is an upper limit, we use this value of Lc as an upper limit of the smallest triggering earthquake. From the relations between fault length, moment and moment magnitude [Kanamori and Brodsky, 2004] with Lc = 10 m and a stress drop of 3 MPa, we obtain an upper limit of magnitude 0.5 for the smallest triggering earthquake. This upper limit is represented in Figure 1 as the bottom, dotted horizontal line.
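The arithmetic behind this bound can be sketched as follows. The nucleation length uses the expression Lc = G dc/([B − A]σ) quoted above. The conversion from length to magnitude is an assumption made for this example only: a circular crack with moment M0 = (16/7) Δσ r³, with the radius r set equal to Lc, and Mw = (2/3)(log10 M0 − 9.1) for M0 in N m. It is not necessarily the relation used in the text, and the function names are hypothetical.

```python
import math

def nucleation_length(G, dc, B_minus_A, sigma):
    """Minimum nucleation length Lc = G * dc / ((B - A) * sigma), SI units."""
    return G * dc / (B_minus_A * sigma)

def moment_magnitude_circular(radius_m, stress_drop_pa):
    """Moment magnitude of a circular crack (assumed source model)."""
    moment = (16.0 / 7.0) * stress_drop_pa * radius_m ** 3  # N m
    return (2.0 / 3.0) * (math.log10(moment) - 9.1)

# Values quoted in the text, converted to SI units.
Lc = nucleation_length(G=30000e6, dc=100e-6, B_minus_A=0.004, sigma=50e6)

# Magnitude for a 10 m source and a 3 MPa stress drop.
mw_10m = moment_magnitude_circular(radius_m=10.0, stress_drop_pa=3e6)
```

Under these assumptions the quoted parameter values give Lc = 15 m, and a 10 m source with a 3 MPa stress drop corresponds to a magnitude of about 0.5. Reading the length as a diameter instead of a radius would lower the magnitude by about 0.6, which gives a sense of the uncertainty of such order of magnitude estimates.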
Felzer et al. have used α = b on the basis of an argument of self-similarity. Helmstetter et al. also argue for a value of α essentially indistinguishable from b based on fits of stacked aftershock decay rates in predefined space-time windows. Other studies have found α smaller and much smaller than b [see, e.g., Console et al., 2003b; Helmstetter, 2003; Zhuang et al., 2004]. In view of the lack of consensus and to keep the discussion independent of the estimation problem, we use the correlation we found between the parameters Kfit and α estimated by Helmstetter et al. to extrapolate to smaller α. The existence of such a correlation is standard in joint estimations of several parameters and can be deduced from the inverse of the Fisher matrix of the log likelihood function [Rao, 1965]. Such correlation can also be enhanced if the model is misspecified. We performed an exponential least squares fit to the scatterplot (see Figure 2) to obtain a relationship between the parameters and then calculated an estimate of Kfit for smaller values of α according to the best fit relationship Kfit = 11441.3285 exp(−α/0.07056). In the absence of other estimates, this method provides one possibility to extrapolate to small α. The resulting curves for m0 are plotted in Figure 1.
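The extrapolation can be sketched as below. The first function simply evaluates the quoted best fit relationship. The second shows one common way to obtain such a relationship, a linear least squares fit of ln(Kfit) against α; the fitting procedure actually used for Figure 2 may differ. The demonstration points are synthetic, generated from the quoted relationship itself, and are not catalog data. All names are hypothetical.

```python
import math

def kfit_from_alpha(alpha, amplitude=11441.3285, scale=0.07056):
    """Quoted relationship Kfit = amplitude * exp(-alpha / scale)."""
    return amplitude * math.exp(-alpha / scale)

def fit_exponential(alphas, kfits):
    """Least squares fit of ln(Kfit) = ln(amplitude) - alpha / scale."""
    ys = [math.log(value) for value in kfits]
    count = len(alphas)
    mean_x = sum(alphas) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in alphas)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(alphas, ys))
    slope = sxy / sxx
    amplitude = math.exp(mean_y - slope * mean_x)
    return amplitude, -1.0 / slope

# Synthetic points generated from the quoted relationship (not catalog data).
demo_alphas = [0.94, 1.00, 1.08, 1.16]
demo_kfits = [kfit_from_alpha(a) for a in demo_alphas]
fitted_amplitude, fitted_scale = fit_exponential(demo_alphas, demo_kfits)

kfit_small_alpha = kfit_from_alpha(0.5)  # extrapolation well outside the fitted range
```

Because the relationship is so steep in α, extrapolating from α near 1 down to α = 0.5 changes Kfit by several orders of magnitude, so the resulting curves should be read as indicative only.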
Delaying the discussion on the estimation problem until the end of the section, we use (11) together with existing estimates of the percentage of aftershocks in seismic catalogs (equivalent to n [Helmstetter and Sornette, 2003b]) to constrain m0. We note, however, that different declustering techniques lead to different estimates. No consensus exists on which method should be trusted most. For example, Gardner and Knopoff found that about 2/3 of the events in the Southern California catalog are aftershocks. With another method, Reasenberg found that 48% of the events belong to a seismic cluster. Davis and Frohlich used the ISC catalog and, out of 47,500 earthquakes, found that 30% belong to a cluster, of which 76% are aftershocks and 24% are foreshocks. Recently, using different versions of the ETAS model, Zhuang et al. have performed a careful inversion of the JMA catalog for Japan using a magnitude md = 4.2 for the completeness of their catalog. They provide three estimates of the branching ratio for their best model: n = 0.42, 0.55, and 0.46.
Whether any of these methods estimate n correctly and without bias remains questionable. In particular, the branching ratios as calculated by Zhuang et al. and others may be significantly biased by the assumption that md = m0, which can be shown to lead to an apparent branching ratio modified by the impact of hidden seismicity below the catalog completeness [Sornette and Werner, 2005]. Moreover, there are problems with the maximum likelihood estimation [see, e.g., Helmstetter et al., 2005]. However, in the absence of better estimations, we nevertheless use the above values as rough estimates of n. Given the range of α and Kfit, m0 is still not very well constrained for values of n near one (see Figure 1). For example, for 85% aftershocks (n = 0.85), m0 ranges from −10 to an unrealistic 4 depending on the values of α and Kfit. This argument could be used to rule out the combination n = 0.85 and α = 0.5. In fact, for m0 to be smaller than md = 3, at least 65% of earthquakes should be aftershocks. For m0 to be smaller than the upper limit estimated from dc, at least 75% must be aftershocks. Both fractions must increase for a smaller α.
We can also use the values obtained by Felzer et al. to constrain m0. The authors also used finite space-time windows in which they fitted aftershock sequences with parameters for a global sequence according to
where CT is the total number of observed triggered events, t is the selected duration of the sequences, pT is the global Omori exponent, cT is the Omori constant, and AT is the productivity. Assuming that the local Omori exponent is p = pT = 1 + θT, expression (12) can be rewritten for the infinite time limit as
The values obtained are listed in Table 3 of their study: AT = 0.116 days^(1−pT), pT = 1.08, and cT = 0.014 days. These values hold for a typical California aftershock sequence of a magnitude M1 = 6.04 main shock, a detection threshold of md = 4.8, and α = b = 1.
 As before, we equate the ETAS model prediction (6) with the observation (13), eliminate k through expression (4) for n (where α = b) and obtain an equation for m0 as a function of n:
This expression for m0 is shown in Figure 3 (solid curve). As in equation (11), md remains artifactually in the equation because of a dependence of AT on the detection threshold. Since the parameters were obtained with α = b = 1, we do not alter the values of α and obtain only one curve.
Another estimate of m0 can be obtained from values estimated by Reasenberg and Jones. Their aftershock rate above md due to a main shock M is modeled as
 Again, we integrate over time, assuming p = 1 + θ > 1 to obtain
 As before, we equate expression (16) with the ETAS prediction (6), eliminate k through equation (4) for n (where α = b), and arrive at a third estimate of m0 as a function of n:
assuming α = b = 1.
We adopt here the values termed the “generic California model” according to Reasenberg and Jones: a = −1.67, θ = 0.08, c = 0.05 days, and we assume b = 1, mmax = 8.5 and md = 3. Expression (17) is drawn in Figure 3 (dotted line). For comparison, we include in Figure 3 two further curves: one for the case α = b that we obtained above, based on the fits by Helmstetter et al. (dashed curve), and another curve for the case α = b that results from using Båth's law (see next section) (dash-dotted curve). We observe the same characteristics as before in that m0 approaches mmax for small n and that it diverges to minus infinity for n going to one. Differences between the four curves arise only in the faster or slower decrease of m0 with n. For example, the (conservative) upper limit md = 3 for m0 constrains n to be larger than 60% according to the values obtained by Felzer et al., whereas the parameters of Helmstetter et al. and Reasenberg and Jones for the case α = b require n to be at least 70% and 80%, respectively. For the estimate obtained from Båth's law (see next section), n must be larger than about 45%. If we assume that the upper limit of m0 can be obtained from estimates of the critical slip distance dc, corresponding to m = 0.5, then n must be at least 55% according to the estimate from Båth's law, 70% according to Felzer et al., 80% according to the fit by Helmstetter et al. and 85% according to the fit by Reasenberg and Jones.
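The bounds on n for the case α = b can be reproduced approximately with the sketch below. It eliminates k between the branching ratio and the observed aftershock count for α = b, which gives mmax − m0 = [n/(1 − n)] 10^(b(M1 − mmax)) (10^(b(mmax − md)) − 1) / (b ln(10) Nobs). The assumptions specific to this example are: the Reasenberg and Jones rate is taken in its usual form 10^(a + b(M − md)) (t + c)^(−p) and integrated over all time; the Felzer et al. constants are read as applying directly to the M1 = 6.04, md = 4.8 sequence; and mmax = 8.5 is used for both. Function names are hypothetical.

```python
import math

def integrated_omori_count(rate_constant, theta, c):
    """Integral over all t >= 0 of rate_constant * (t + c) ** -(1 + theta)."""
    return rate_constant / (theta * c ** theta)

def _window(M1, md, b, mmax):
    """Magnitude-dependent factor shared by both functions below."""
    return 10.0 ** (b * (M1 - mmax)) * (10.0 ** (b * (mmax - md)) - 1.0)

def m0_alpha_equals_b(n, nobs, M1, md, b=1.0, mmax=8.5):
    """Smallest triggering magnitude implied by n and Nobs when alpha = b."""
    odds = n / (1.0 - n)
    return mmax - odds * _window(M1, md, b, mmax) / (b * math.log(10.0) * nobs)

def min_n_for_m0(m0_limit, nobs, M1, md, b=1.0, mmax=8.5):
    """Branching ratio at which m0 drops to the given upper limit."""
    odds = (mmax - m0_limit) * b * math.log(10.0) * nobs / _window(M1, md, b, mmax)
    return odds / (1.0 + odds)

# Reasenberg and Jones generic values; M1 = 6 is arbitrary (it nearly cancels for alpha = b).
M1_rj, md_rj = 6.0, 3.0
nobs_rj = integrated_omori_count(10.0 ** (-1.67 + 1.0 * (M1_rj - md_rj)), theta=0.08, c=0.05)
n_rj = min_n_for_m0(3.0, nobs_rj, M1_rj, md_rj)

# Felzer et al. values, read as applying to the M1 = 6.04, md = 4.8 sequence.
nobs_felzer = integrated_omori_count(0.116, theta=0.08, c=0.014)
n_felzer = min_n_for_m0(3.0, nobs_felzer, 6.04, 4.8)
```

Under these assumptions the sketch gives n of about 0.81 and 0.60 for the limit m0 = 3, close to the 80% and 60% quoted above, and about 0.86 and 0.68 for the limit m0 = 0.5.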
 Since the four expressions for m0 (for α = b) show the same functional dependence on key variables and differ only in the different estimates of a few constants, this consistency provides some confidence in our results. As for the difference in the four curves, they constitute four differently formulated, empirical estimates of the rate of events of typical aftershock sequences. Given the variability of the aftershock process, the discrepancy in the estimates is to be expected.
 We now point out difficulties for exploiting quantitatively the above ideas. Our conclusions for m0 and n are based on empirical parameter estimations that involve delicate technicalities. The constants Kfit defined in (8), AT defined in (13) and a defined in (15) are in principle measurable. Issues that may bias the estimation of these parameters include: (1) The rate of aftershock production is estimated empirically over an apparently complete subperiod of finite space-time windows. Missed events outside the spatial delimitation may act to decrease the rate estimate. (2) Stacking different sequences with different global Omori law decays may introduce errors. (3) The p exponent of the Omori law may intrinsically depend on the main shock magnitude [Sornette and Ouillon, 2005; Ouillon and Sornette, 2005]. (4) Background events may be falsely counted as aftershocks. (5) Magnitude and location uncertainties may bias the parameters. (6) Missing events in the catalog, especially after large events, may artificially change the parameter values. (7) Undetected seismicity may bias the estimated parameters [Sornette and Werner, 2005].
Recently, Sornette and Ouillon and Ouillon and Sornette argued for a dependence of the exponent p in the Omori law on the magnitude of the main shock. According to their calculations, p becomes zero, that is, no earthquakes are triggered, at a magnitude around −3. From Figure 1, their estimate of m0 = −3 constrains n to be larger than roughly 80% for α = 1 and n extremely close to one for α = 0.5.
4. Constraints on the Smallest Triggering Earthquake From Båth's Law
Finally, we use the empirical Båth's law to constrain m0 as a function of n. The law states that the average difference between a main shock of magnitude M1 and the magnitude ma of its largest aftershock is dm = M1 − ma = 1.2, regardless of the main shock magnitude [Båth, 1965]. Console et al. [2003a], Helmstetter and Sornette [2003a] and Saichev and Sornette showed that the law, deriving from the selection procedure used to define main shocks and aftershocks, is consistent with the ETAS model.
 Let Nobs be the total number of aftershocks generated by the main shock above the magnitude md of completeness of the catalog. Assuming that the magnitudes of the aftershocks are drawn from the Gutenberg-Richter law, the largest aftershock has an average magnitude given by a simple argument of extreme value theory:
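In its standard form, the argument can be sketched as follows. The sketch assumes unbounded Gutenberg-Richter magnitudes above md, so that a fraction 10^(−b(m − md)) of the Nobs aftershocks reaches magnitude m or larger; it is a reconstruction under that assumption, with b the Gutenberg-Richter exponent. The largest aftershock then sits at the magnitude ma at or above which one event is expected:

```latex
N_{\mathrm{obs}}\,10^{-b\,(m_a - m_d)} = 1
\quad\Longrightarrow\quad
m_a = m_d + \frac{1}{b}\,\log_{10} N_{\mathrm{obs}},
\qquad
N_{\mathrm{obs}} = 10^{\,b\,(m_a - m_d)} .
```

With b = 1, md = 3, and ma = 5.8, for instance, this gives Nobs = 10^2.8, or about 630 observed aftershocks.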
Solving this expression for Nobs, equating it with the ETAS prediction (6), and eliminating k through expression (4) for n provides an estimate of m0 as a function of n: expression (19) for α ≠ b and expression (20) for α = b. Note that, according to expression (19), if α differs from b and M1 − ma is constant with M1 (Båth's law), then m0 depends on M1. This shows the inconsistency of an argument based solely on the average number of events, as also explained by Saichev and Sornette. Only when α = b does the main shock magnitude disappear from the expression for m0, as shown in (20). The estimates of m0 for α < b thus depend on M1 and should be taken only as indications.
Figure 4 illustrates the behavior of m0 as a function of the branching ratio n (the average number of direct aftershocks per earthquake) for reasonable values of the other constants (mmax = 8.5, md = 3, b = 1, and α = 0.5, 0.6, 0.7, 0.8, 0.9, 1, plotted from light to dark), with main shock and largest aftershock magnitudes chosen according to Båth's law, M1 − ma = 7 − 5.8 = 1.2. Again, as n tends to one, m0 tends to minus infinity, while for n = 0, m0 = mmax, as expected. We also observe that m0 is almost constant over a wide range of n for comparatively small α, whereas m0 varies much faster for the case α = b.
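The behavior described for Figure 4 can be explored with a minimal sketch. It does not reproduce expressions (19) and (20) verbatim. It assumes a standard ETAS setup: a productivity k·10^(α(m − m0)), Gutenberg-Richter magnitudes between m0 and mmax, a total aftershock number equal to the direct productivity of the main shock divided by 1 − n, a largest aftershock fixed by Nobs = 10^(b(ma − md)), and terms of order 10^(−b(mmax − m0)) neglected. The function name and its signature are hypothetical.

```python
import math


def m0_from_n(n, alpha, b=1.0, m_max=8.5, M1=7.0, m_a=5.8):
    """Smallest triggering magnitude m0 implied by a branching ratio n.

    Sketch under assumed ETAS forms (not quoted from the paper):
      n    ~ k * b/(b - alpha) * (1 - 10**(-(b - alpha)*(m_max - m0)))  (alpha < b)
      n    ~ k * b * ln(10) * (m_max - m0)                              (alpha = b)
      Nobs ~ 10**(-b*(m_d - m0)) * k * 10**(alpha*(M1 - m0)) / (1 - n)
      Nobs = 10**(b*(m_a - m_d))   (largest-aftershock argument)
    In this approximation m_d cancels, so it is not an argument.
    """
    if not 0.0 <= n < 1.0:
        raise ValueError("n must lie in [0, 1)")
    if alpha > b and not math.isclose(alpha, b):
        raise ValueError("this sketch only covers alpha <= b")

    odds = n / (1.0 - n)  # number of triggered events per background event

    if math.isclose(alpha, b):
        # Only the difference M1 - m_a enters when alpha = b.
        return m_max - odds * 10.0 ** (b * (M1 - m_a)) / (b * math.log(10.0))

    d = b - alpha
    numerator = 10.0 ** (b * m_a)
    denominator = (odds * (d / b) * 10.0 ** (alpha * M1)
                   + 10.0 ** (b * m_a - d * m_max))
    return math.log10(numerator / denominator) / d


if __name__ == "__main__":
    for alpha in (0.5, 0.8, 1.0):
        row = ", ".join(f"n={n:.1f}: m0={m0_from_n(n, alpha):6.2f}"
                        for n in (0.0, 0.3, 0.5, 0.7, 0.9))
        print(f"alpha={alpha:.1f} -> {row}")
```

Under these assumptions the function returns mmax at n = 0 for any α ≤ b and decreases without bound as n approaches one. For α = b = 1 and n = 0.5 it gives m0 of about 1.6, and the curve for α = 0.5 is much flatter than the curve for α = b.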
 As anticipated in the last section, we obtain the same functional dependence as in the previous estimates. However, for α = b, the decrease of m0 with increasing n is even faster than when using the parameters of Felzer et al. Here, the upper limit md = 3 for m0 (upper horizontal line) constrains n to be larger than 45%, a much smaller value than the lower limit obtained previously. This discrepancy is due to the different ways of estimating the observed number of aftershocks in the four approaches. However, since all four estimates fall in a similar range, they provide a test of the consistency of the results. When applying the dc-derived upper limit of 0.5 (bottom horizontal line), n must be larger than 55% for α = b and larger than 85% for α = 0.8. If n = 0.5, m0 lies in the range from 0 to an unrealistic 5, while for n = 0.7, m0 lies between −9 and 5, depending on the value of α. Since m0 ≥ 3 is unrealistic, the entire region of combinations of α and n that fall above that value can be ruled out. For example, the case α = 0.8 leads to a reasonable m0 smaller than md = 3 only for n larger than about 65%.
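The lower bounds on n quoted above can be checked approximately with a short sketch that turns an upper limit on m0 into a minimum branching ratio. It is an illustration under assumed forms, not the paper's own expressions: productivity k·10^(α(m − m0)), Gutenberg-Richter magnitudes between m0 and mmax, total aftershock number equal to the direct productivity divided by 1 − n, Nobs = 10^(b(ma − md)) for the largest aftershock, and terms of order 10^(−b(mmax − m0)) neglected. The function name is hypothetical, and the defaults are the constants listed for Figure 4.

```python
import math


def min_branching_ratio(m0_upper, alpha, b=1.0, m_max=8.5, M1=7.0, m_a=5.8):
    """Smallest branching ratio n compatible with m0 <= m0_upper.

    Sketch only: inverts the assumed relation between m0 and n, using
    the fact that m0 decreases monotonically as n grows.
    """
    if m0_upper > m_max:
        raise ValueError("m0_upper cannot exceed m_max")
    if alpha > b and not math.isclose(alpha, b):
        raise ValueError("this sketch only covers alpha <= b")

    if math.isclose(alpha, b):
        # alpha = b: n/(1-n) = b ln(10) (m_max - m0) 10**(-b (M1 - m_a))
        odds = (b * math.log(10.0) * (m_max - m0_upper)
                * 10.0 ** (-b * (M1 - m_a)))
    else:
        # alpha < b: closed form obtained by solving the relation for n/(1-n)
        d = b - alpha
        odds = (b / d) * (10.0 ** (b * m_a - d * m0_upper)
                          - 10.0 ** (b * m_a - d * m_max)) / 10.0 ** (alpha * M1)

    return odds / (1.0 + odds)  # convert n/(1-n) back to n


if __name__ == "__main__":
    # Upper limits on m0: the completeness magnitude (3) and the
    # dc-derived value (0.5) discussed in the text.
    for alpha in (1.0, 0.8):
        for m0_upper in (3.0, 0.5):
            n_min = min_branching_ratio(m0_upper, alpha)
            print(f"alpha={alpha:.1f}, m0 <= {m0_upper:3.1f}: n >= {n_min:.2f}")
```

Under these assumptions the bounds come out at roughly 0.44 and 0.54 for α = b with m0 ≤ 3 and m0 ≤ 0.5, and at roughly 0.65 and 0.86 for α = 0.8, close to the values read from Figure 4.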
 We have shown that differentiating between the smallest triggering earthquake m0 and the detection threshold md within the ETAS model leads, together with four separate methods of estimating the observed numbers of “aftershocks” (defined as triggered events, independently of their magnitude), to four estimates of m0 as a function of the fraction n of triggered events in a catalog (also equal to the branching ratio). We used empirically fitted values for aftershock numbers and thereby eliminated one variable from the ETAS formalism in order to obtain an estimate of m0 as a function of n. The four estimates were obtained from the fits performed by Helmstetter et al., Felzer et al., and Reasenberg and Jones and from the empirical Båth's law. All four give the same functional dependence and similar values for m0. In particular, we can place bounds on m0 from estimates of the percentage of aftershocks in earthquake catalogs. Conversely, we can limit the range of n by observing that m0 must be less than the detection threshold md or, less conservatively, that m0 must be less than the magnitude corresponding to the rate and state critical slip dc = 100 μm estimated in laboratories. Besides yielding quantitative values for m0, the bounds limit the possible combinations of n and α and, in particular, indicate that at the very least 55% of all earthquakes are triggered events (“aftershocks”).
 It may appear surprising that the existence of a small-magnitude cutoff m0 for triggering should have observable consequences. However, the impact of a small-scale cutoff on “macroscopic” observables is not new in physics; it permeates particle physics, field theory, and condensed matter physics. In the present case, the existence of m0 has an observable impact especially when α ≤ b, for which the cumulative effect of tiny earthquakes equals or dominates that of large earthquakes with respect to the physics of triggering other earthquakes [Helmstetter, 2003; Helmstetter et al., 2005]. We hope that the present article, together with our companion paper [Sornette and Werner, 2005], will draw the attention of the community to the important problem of the distinction between md and m0. Moreover, it will perhaps encourage reanalyses of inversion methods for models of triggered seismicity, and in particular of maximum likelihood estimations, to take into account the bias due to the unobserved seismicity below the magnitude of catalog completeness.
 We acknowledge useful discussions with A. Helmstetter, K. Felzer, and J. Zhuang. This work is partially supported by Southern California Earthquake Center (SCEC) grant NSF-EAR02-30429, and SCEC is funded by NSF cooperative agreement EAR-0106924 and USGS cooperative agreement 02HQAG0008. The SCEC contribution number for this paper is 859. M.J.W. gratefully acknowledges financial support from a NASA Earth System Science Graduate Student Fellowship.