Journal of Geophysical Research: Solid Earth

Recurrence rates of large explosive volcanic eruptions



[1] A global database of large explosive volcanic eruptions has been compiled for the Holocene and analyzed using extreme value theory to estimate magnitude-frequency relationships. The database consists of explosive eruptions with magnitude (M) greater than or equal to 4. Two models are applied to the data, one assuming no underreporting of eruptions and the other taking underreporting into consideration. Results from the latter indicate that the level of underreporting is high and fairly constant from the start of the Holocene until about 1 A.D. and then decreases dramatically toward the present. Results indicate there is only a ∼20% probability that an explosive eruption of M = 6 occurring prior to 1 A.D. is recorded. Analysis of the data set in the time periods from 1750 A.D. and from 1900 A.D. to the present (assuming no underreporting) suggests that these periods are likely to be too short to give reliable estimates of return periods for explosive eruptions with M > 6. Analysis of the Holocene data set with corrections for underreporting bias provides robust magnitude-frequency relationships up to M = 7. Extrapolation of the model to greater magnitudes (M > 8) gives results inconsistent with geological data, predicting eruption size upper limits much smaller than known eruptions such as the Fish Canyon Tuff. We interpret this result as the consequence of different mechanisms operating for explosive eruptions with M > 7.

1. Introduction

[2] Human exposure to natural hazards is increasing as a consequence of population growth and environmental changes [Huppert and Sparks, 2006]. Extreme natural events are of particular concern because large areas can be affected and the consequences can have global repercussions. The magnitude-frequency (M-F) relationships of natural events need to be established to estimate return periods of extreme events and to aid assessment of global risk. In recent years, the use of statistics to help understand volcanic processes has greatly increased [e.g., Jones et al., 1999; Ammann and Naveau, 2003; Connor et al., 2003; Mason et al., 2004a, 2004b; Naveau and Ammann, 2005; Caniaux, 2005; Marzocchi and Zaccarelli, 2006; Coles and Sparks, 2006]. In this study, we apply extreme value theory to the record of large explosive Holocene volcanism to estimate the return rates of large explosive volcanic eruptions. This study expands on the work of Coles and Sparks [2006].

[3] Extreme value theory is a branch of statistics that uses the very large values of a data set (as opposed to its “normal” values) to model the extremes of the process from which the data come. There are numerous applications of extreme value theory, especially in fields such as engineering. For instance, it can be used to determine how strong and high a sea wall needs to be in order to withstand an extreme tidal event [Coles, 2001]. For natural processes, the M-F relationship that characterizes common events must eventually break down at the extreme tails of distributions, as there are physical limits to the size of events. Extreme value theory characterizes these tails and enables an assessment of the physical limits of the extremes. Extreme value theory can be applied to data sets that span a shorter time period than the period of interest, with the fundamental assumption that the underlying system behavior has not and will not change (i.e., the system is stationary). Extreme value theory has been applied to earthquake science since the 1980s [e.g., Kim, 1983; Gan and Tung, 1983]. More recently, it has been used to discriminate between large earthquakes and aftershock sequences [Lavenda and Cipollone, 2000], and also to determine the global earthquake energy distribution [Pisarenko and Sornette, 2003]. In volcanology, the application of extreme value statistics is still in the early stages. Mason et al. [2004b] applied it to a database they compiled of “supereruptions” (eruptions with an eruptive mass of at least 10¹⁵ kg) occurring from the Ordovician (490 million years ago) to the present to determine average occurrences of supereruptions. In another application, Naveau and Ammann [2005] used ice core data to determine the distributions of large climate-affecting eruptions. A key assumption they made was that volcanic eruptions that left no record in the ice cores would not affect global climate. Results from the application of extreme value theory indicated that the distribution of eruptions leaving a sulfate record in the ice sheets is close to being unbounded. If unbounded, this suggests there is no limit to the strength of the sulfate signal left in ice sheets resulting from a volcanic eruption.

[4] More recently, Coles and Sparks [2006] applied extreme value theory to the Hayakawa catalog (Y. Hayakawa, Hayakawa's 2000-year eruption catalog, 1997, available at∼hayakawa/catalog/2000W), a data set of large volcanic eruptions occurring in the last 2000 years, to estimate the global frequency of large explosive eruptions and to predict the size of the largest possible explosive eruption. This study took into consideration underreporting of eruptions, which increases back in time, by using a censoring model to assess recording bias. They found that recording bias is dependent on both eruption size, since larger events are more likely to be recorded than smaller events, and on timing, as events that happened a long time ago are less likely to be known about today than more recent eruptions. This finding is consistent with work in other fields; for example, it is well known that the completeness of the seismic record decreases as one goes back in time, with larger magnitude earthquakes more likely to be recorded than smaller earthquakes [e.g., Tinti and Mulargia, 1985; Albarello et al., 2001; Woessner and Wiemer, 2005; Leonard, 2008; Grünthal et al., 2009; Wang et al., 2009].

[5] Coles and Sparks [2006] demonstrated the utility of extreme value methods, but as the Hayakawa catalog only extends back 2000 years, could not make inferences on return periods over very long time scales. To address this issue we have compiled a new database of global Holocene explosive volcanism based principally on information in the Smithsonian Global Volcanism Program Database (GVPD) [Siebert and Simkin, 2002], supplemented by the literature. The database lists numerous parameters including tephra volumes, dense rock equivalent volumes (DRE), eruption age, intensity estimates, magnitude, column heights, and volcanic explosivity index (VEI) [Newhall and Self, 1982] (see Figure 1). Magnitude, M, and intensity are defined here following Pyle [2000]:

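$$M = \log_{10}[\text{erupted mass (kg)}] - 7 \tag{1}$$
$$\text{intensity} = \log_{10}[\text{mass eruption rate (kg/s)}] + 3 \tag{2}$$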

When available, uncertainties are listed. The database is used to compile a data set of magnitude estimates. Statistical methods are then applied to estimate underreporting, assess the extreme value characteristics, estimate magnitude-frequency relations of eruptions of different magnitudes (known as the inverse of the “return period”), and consider the issue of the upper physical limit to explosive volcanism on Earth.

Figure 1.

Description of volcanic explosivity index (VEI) [after Siebert and Simkin, 2002; Newhall and Self, 1982].

2. Database of Holocene Explosive Volcanism

[6] An extensive literature review was carried out, using the GVPD as a major resource, to create a database of large explosive Holocene eruptions (see auxiliary material). The aim was to include all known explosive Holocene eruptions with M ≥ 4.

[7] The choice of the Holocene (the last 10,000 years) is arbitrary but practical. There are many geological and tephrochronological studies that have investigated Holocene volcanism. Additionally, the ice age that ended at the start of the Holocene destroyed many Pleistocene and older geologic records, making eruptive history reconstruction difficult or impossible. To be included in the database, an explosive eruption had to meet the following conditions:

[8] 1. It must have an assigned date based on historical records or scientific dating techniques (e.g., radiocarbon dating, dendrochronology, ice core dating).

[9] 2. It must have an M or VEI ≥ 4. We have two criteria for assessing that the eruption met this condition. They are as follows: (1) The eruption was assigned an M or VEI ≥ 4 in the literature or by the GVPD. (2) There is evidence for or witnessed observation of either a Plinian-scale eruption or an explosive caldera-forming eruption.

[10] 3. In the absence of the first two criteria, there must be an erupted tephra volume or DRE of at least 0.1 km³.

[11] The choice of only including eruptions with M ≥ 4 is arbitrary but is chosen for two reasons. First, our interest in this study is in the extreme events where the magnitude-frequency relationship for common eruptions breaks down. Eruptions with M < 4 are well within the common distribution patterns. Figure 2 shows the number of reported volcanic eruptions against VEI for the Holocene [after Siebert and Simkin, 2002]. This frequency histogram does not reflect the true distribution because there are undoubtedly major reporting biases. However, Figure 2 indicates that qualitatively the tail can be considered as starting at M ≥ 4. Second, small eruptions (i.e., M < 3) are probably underreported relative to larger magnitude eruptions with this underreporting increasing with decreasing magnitude due to increasingly poor preservation potential.

Figure 2.

Distribution of VEI assignments of Holocene volcanic eruptions [after Siebert and Simkin, 2002].

[12] A comment on the relationship between the VEI scale of Newhall and Self [1982] and the magnitude scale based on mass is warranted. As noted by Mason et al. [2004b] and supported by our study, to a first approximation an eruption with VEI x has a magnitude within the range of x and x + 1. Table 1 lists the VEI scale and the range of M values for a given reported VEI in our database.

Table 1. Range of Eruption Magnitudes in the Database for Events Assigned a Given VEI Value
Assigned VEI    Magnitude Range in Compiled Database

[13] For each eruption, the following are noted in the database: (1) the source volcano (required), (2) eruption name (date or tephra/unit name; required), (3) eruption date (required), (4) error associated with eruption date, (5) method used to determine eruption date, (6) tephra volume (km³), (7) error associated with tephra volume (km³), (8) DRE (km³), (9) error associated with DRE (km³), (10) intensity, (11) magnitude, (12) column height (km), and (13) assigned VEI. One of items 6, 8, 11, or 13 is required for the eruption to be included. When possible, information from primary sources was used. However, because detailed studies of many volcanoes are published in local journals that are difficult to access, primary sources were not always available. In this situation information from the GVPD was used. A few eruptions listed as having a VEI 4 by the GVPD are excluded on the basis that no individual explosive event in the eruption had a VEI ≥ 4 (e.g., the 1943 eruption of Parícutin in the Michoacán-Guanajuato volcanic field, Mexico).

[14] For a number of eruptions, the GVPD indicates that, while the VEI or tephra volume has not been estimated, the eruption was assessed as Plinian in scale. For these cases, an arbitrary tephra volume of 0.5 km³ is put into the database. Such events are excluded from the quantitative analysis. Likewise, there are many GVPD entries of eruptions with a VEI assignment but no indication of tephra volume. For these eruptions, provisional tephra volumes of 0.1, 1, and 10 km³ are given for VEI 4, 5, and 6 eruptions, respectively. Uncertainties in such cases are large. We included data for the VEI 5 and 6 eruptions in the analysis (but excluded VEI 4 eruptions).

[15] For all eruptions, the DRE and magnitude are estimated when no value was found in the literature. When the DRE is not known, tephra and magma density are estimated to be 1000 kg/m³ and 2700 kg/m³, respectively [after Pyle, 2000]. To calculate DRE, the following equation is used:

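$$\text{DRE (km}^3) = \frac{\text{tephra volume (km}^3) \times \text{tephra density (kg/m}^3)}{\text{magma density (kg/m}^3)} \tag{3}$$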

When the DRE but not the magnitude of an eruption was found in the literature, the magnitude is estimated using:

$$M = \log_{10}[\text{DRE (m}^3) \times \text{magma density (kg/m}^3)] - 7 \tag{4}$$

When neither the DRE nor the magnitude was found in the literature, eruption magnitude is estimated by:

$$M = \log_{10}[\text{tephra volume (m}^3) \times \text{tephra density (kg/m}^3)] - 7 \tag{5}$$

One potential problem with equations (4) and (5) is that technically, the definition of magnitude involves all erupted mass (i.e., lava and tephra), not just tephra. However, for the majority of large explosive eruptions, the mass of erupted lava is trivial compared to the total erupted mass.
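As a rough illustration of the conversions described in paragraph [15], the following Python sketch computes DRE and magnitude from a tephra volume. It assumes the Pyle [2000] definition of magnitude as log10 of erupted mass in kilograms minus 7 and uses the densities quoted in the text; the function names are hypothetical and are not taken from the database itself.

```python
import math

TEPHRA_DENSITY = 1000.0  # kg/m^3, value quoted in the text [after Pyle, 2000]
MAGMA_DENSITY = 2700.0   # kg/m^3, value quoted in the text [after Pyle, 2000]
M3_PER_KM3 = 1.0e9       # cubic metres in one cubic kilometre


def dre_from_tephra(tephra_km3):
    """Dense rock equivalent volume (km^3) from a bulk tephra volume (km^3)."""
    return tephra_km3 * TEPHRA_DENSITY / MAGMA_DENSITY


def magnitude_from_mass(mass_kg):
    """Magnitude, assuming M = log10(erupted mass in kg) - 7."""
    return math.log10(mass_kg) - 7.0


def magnitude_from_dre(dre_km3):
    """Magnitude from a DRE volume (km^3) using the magma density."""
    return magnitude_from_mass(dre_km3 * M3_PER_KM3 * MAGMA_DENSITY)


def magnitude_from_tephra(tephra_km3):
    """Magnitude from a bulk tephra volume (km^3) using the tephra density."""
    return magnitude_from_mass(tephra_km3 * M3_PER_KM3 * TEPHRA_DENSITY)


if __name__ == "__main__":
    # Provisional volumes used in the text for entries lacking a volume.
    for label, volume in [("VEI 4", 0.1), ("Plinian", 0.5),
                          ("VEI 5", 1.0), ("VEI 6", 10.0)]:
        print(label, round(magnitude_from_tephra(volume), 1))
```

Under these assumptions, the provisional 0.1 km³ assigned to a VEI 4 entry corresponds to M = 4.0 and the 0.5 km³ assigned to a Plinian entry to M ≈ 4.7, which agrees with the magnitudes quoted for those cases in the footnote to Table 3.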

[16] In all, 576 eruptions from 227 volcanoes were included in the database. Figure 3 shows the distribution of all eruptions present in the database. Undoubtedly, more than 227 volcanoes had large explosive eruptions in the Holocene; Deligne [2006] classified 564 volcanoes worldwide as being capable of having large explosive eruptions based on eruptive histories and geomorphic features. While it is unlikely all the volcanoes identified by Deligne [2006] have erupted in the Holocene, undoubtedly more than 227 of them have.

Figure 3.

Distribution of eruptions in compiled database (see auxiliary material). “Assumed” indicates that only the VEI of the eruption, or the fact that it was Plinian in scale, is known. “GVPD” or “literature-based” eruptions are ones for which the magnitude was calculated as described by equations (4) and (5) from DRE or tephra volumes found in the GVPD or literature, respectively. “Literature” indicates that the eruption magnitude is stated in the literature.

3. Extreme Value Theory

[17] One way of building statistical models for the eruption process is as a two dimensional point process, with each event being characterized by a point with coordinates (t, x), where t is time and x is magnitude. Standard arguments from extreme value theory suggest that on regions above a high magnitude level (also known as the “threshold,” or u), the process will be approximately Poisson with an intensity function (the function that determines expected number of points per subregion) falling within a specified parametric family that has links with other representations of extremes. The assumption that the process is approximately Poisson presumes that no volcanic eruption will influence (in either timing or magnitude) any other eruption. While recent work suggests that some individual volcanoes in a closed conduit state may follow a Poisson distribution [Marzocchi and Zaccarelli, 2006], this assumption of independence generally does not hold at the individual volcano level and may also become suspect in a local region of volcanoes whose behavior might be correlated in either space or time through tectonic mechanisms (e.g., the Taupo volcanic zone, New Zealand). However, we consider that this is a reasonable assumption when studying global volcanism, in particular when examining “extreme” events.

[18] As discussed by Coles [2001] and Coles and Sparks [2006], threshold selection is an important and nontrivial matter. The more data there are, the less sampling variation there is, resulting in tighter confidence intervals. In this respect, a lower threshold is better. However, the models work best at higher thresholds; if there are too many values at the low end, the statistical models can induce bias. Put another way, data within the bulk of the distribution may contain no information about the distribution in the tail, so that an analysis with too much data from the bulk of the distribution may give a misleading or incorrect result. On the other hand, increasing the threshold to higher values will reduce the amount of data on extreme events, resulting in very large uncertainties and even meaningless results if there are very few data above the selected threshold.

[19] As previously discussed, each event (eruption) is characterized by a point with coordinates (t, x), where t is time and x is magnitude. The data set of eruptions is of the form {(t1, x1),… (ti, xi),… (tn, xn)}, where ti and xi are the date and magnitude, respectively, of eruption i, and the start and end of the time period from which the data come are t1 and tn, respectively. The process driving the generation of global eruptions is assumed to be homogeneous in time, and, as previously discussed, the process generating points above a high threshold u is assumed to follow a two-dimensional Poisson process. Denoting the region above this threshold u as space Au, with t1 ≤ ti ≤ tn and u ≤ xi < ∞, it remains to determine a suitable model for the intensity of the process. If a model with parameters θ provides a reasonable approximation for the intensity of points over Au, this approximation should still be valid at a higher threshold u* > u. Put another way, if the model over Au produces parameters θ at threshold u, then these parameters, with the same model, should be valid for all subspaces Au*, u* > u. Consequently, the optimal threshold is the lowest one that is consistent with higher thresholds, when estimation variability is taken into consideration [Coles and Sparks, 2006].

[20] Two models are applied: one that assumes that there is no underreporting of volcanic eruptions (i.e., all volcanic eruptions that occurred during the period covered by the data are accounted for), and a second that takes underreporting of volcanic eruptions into consideration. See Table 2 for a brief description of the variables and notation used.

Table 2. List and Description of Extreme Value Statistics Parameters
(ti, xi)    Time and magnitude of eruption i
Au          Space over which model is applied, with u ≤ xi < ∞, t1 ≤ ti ≤ tn
ny          Number of years spanned by the data, with ny = tn − t1
μ           Location of distribution parameter
σ           Scale of distribution parameter
ξ           Rate of decay of tail parameter
v           Extent of underreporting probability parameter
w           Importance of eruption magnitude on underreporting probability parameter
b           Importance of eruption timing on underreporting probability parameter

3.1. No Underreporting Suspected

[21] The asymptotic developments which lead to the Poisson process on suitably high regions also provide a parametric family for the corresponding intensity density function, λ(t, x), i.e., the rate of points per unit area as a function of position where t is time and x is eruption magnitude, which takes the form:

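$$\lambda(t, x) = \frac{1}{\sigma}\left[1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right]_{+}^{-1/\xi - 1} \tag{6}$$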

where μ determines the location of the distribution, σ determines the scale of distribution (σ > 0), and ξ determines the rate of decay of the tail end of the distribution [Coles and Sparks, 2006]. The subscript plus indicates that if the value for λ(t, x) is negative, λ(t, x) is set to 0. We note that the parametric family of equation (6) is the only family which can be shown to satisfy the stability property that parameters describing space Au are also valid for all subspaces Au*, u* > u [Coles, 2001]. When ξ < 0, the distribution is bounded at μ − σ/ξ; this upper bound corresponds to the predicted largest possible eruption of the system. If ξ ≥ 0, the system is unbounded. The likelihood function for the intensity density function (equation (6)) given an observed data set of the form {(t1, x1),… (tn, xn)} is:

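$$L(\mu, \sigma, \xi) = \exp\left\{-n_y\left[1 + \xi\left(\frac{u - \mu}{\sigma}\right)\right]^{-1/\xi}\right\}\prod_{i=1}^{n}\frac{1}{\sigma}\left[1 + \xi\left(\frac{x_i - \mu}{\sigma}\right)\right]^{-1/\xi - 1} \tag{7}$$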

The combination of parameters (μ, σ, ξ) that maximize equation (7) has the highest attached probability of describing the process that generated the data; these parameters are the maximum likelihood estimates.
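The maximization described above can be sketched in Python as follows. This is a minimal illustration, not the fitting procedure used in the study: it assumes the standard point-process form of the likelihood given by Coles [2001], in which the expected number of events above u over ny years is ny[1 + ξ(u − μ)/σ]^(−1/ξ), and it replaces a proper numerical optimizer with a coarse grid search. The function names, the grid, and the magnitudes are hypothetical.

```python
import math


def neg_log_likelihood(mu, sigma, xi, magnitudes, n_years, u):
    """Negative log of the Poisson-process likelihood above threshold u."""
    if sigma <= 0.0 or xi == 0.0:
        return math.inf  # the xi = 0 limit is not handled in this sketch
    base_u = 1.0 + xi * (u - mu) / sigma
    if base_u <= 0.0:
        return math.inf
    # Expected number of events above u during the n_years of record.
    total = n_years * base_u ** (-1.0 / xi)
    for x in magnitudes:
        base = 1.0 + xi * (x - mu) / sigma
        if base <= 0.0:
            return math.inf  # observation lies outside the model's support
        # Minus the log of the intensity density evaluated at x.
        total += math.log(sigma) + (1.0 / xi + 1.0) * math.log(base)
    return total


def grid_search(magnitudes, n_years, u, mus, sigmas, xis):
    """Return (nll, mu, sigma, xi) minimizing the negative log-likelihood."""
    best = None
    for mu in mus:
        for sigma in sigmas:
            for xi in xis:
                nll = neg_log_likelihood(mu, sigma, xi, magnitudes, n_years, u)
                if best is None or nll < best[0]:
                    best = (nll, mu, sigma, xi)
    return best


# Hypothetical magnitudes above u = 4.9 in a 259 year record (illustration only).
demo_magnitudes = [5.0, 5.0, 5.1, 5.2, 5.3, 5.5, 5.8, 6.0, 6.2, 6.9]
demo_fit = grid_search(demo_magnitudes, 259, 4.9,
                       mus=[1.0, 2.0, 3.0],
                       sigmas=[1.0, 1.5, 2.0, 2.5],
                       xis=[-0.5, -0.4, -0.3, -0.2])
```

Parameter combinations whose implied upper bound falls below an observed magnitude receive an infinite negative log-likelihood, which is how the subscript plus in the intensity density excludes them. In practice a gradient-based or simplex optimizer would replace the grid.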

[22] Once the maximum likelihood estimates (μ, σ, ξ) are determined, it is possible to estimate the annual rate at which threshold u is exceeded:

$$\text{annual rate of exceeding } u = \left[1 + \xi\left(\frac{u - \mu}{\sigma}\right)\right]^{-1/\xi} \tag{8}$$

and the return period of an eruption of size x > u:

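$$\text{return period of an eruption of size } x = \left[1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right]^{1/\xi} \tag{9}$$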
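To make these quantities concrete, the sketch below evaluates the annual exceedance rate, the return period, and the upper bound μ − σ/ξ for a given parameter set. It assumes the standard extreme value point-process form in which the annual rate of exceeding x is [1 + ξ(x − μ)/σ]^(−1/ξ) and the return period is its reciprocal. The function names are hypothetical; the parameter values are the rounded maximum likelihood estimates listed in Table 4.

```python
import math


def annual_rate(x, mu, sigma, xi):
    """Expected number of eruptions per year with magnitude >= x (xi != 0)."""
    base = 1.0 + xi * (x - mu) / sigma
    if base <= 0.0:
        if xi < 0.0:
            return 0.0  # x lies above the model's upper bound
        raise ValueError("x lies below the range described by the model")
    return base ** (-1.0 / xi)


def return_period(x, mu, sigma, xi):
    """Average number of years between eruptions with magnitude >= x."""
    rate = annual_rate(x, mu, sigma, xi)
    return math.inf if rate == 0.0 else 1.0 / rate


def upper_bound(mu, sigma, xi):
    """Largest magnitude allowed by the model; infinite when xi >= 0."""
    return mu - sigma / xi if xi < 0.0 else math.inf


if __name__ == "__main__":
    # Rounded estimates for the 1750 A.D. to present period (Table 4).
    fits = {4.0: (2.62, 1.47, -0.30), 4.9: (1.16, 2.38, -0.38)}
    for u, (mu, sigma, xi) in fits.items():
        print("u =", u,
              "limit =", round(upper_bound(mu, sigma, xi), 2),
              "events per year above u =", round(annual_rate(u, mu, sigma, xi), 3))
```

Because the estimates in Table 4 are rounded to two decimal places, values computed this way only approximate the published ones. Return periods close to the upper bound are especially sensitive to rounding, since the bracketed term approaches zero there.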

3.2. Underreporting Suspected

[23] When underreporting is suspected, the intensity density function (equation (6)) is modified to reflect that there is a probability that an eruption is recorded:

$$\tilde{\lambda}(t, x) = \lambda(t, x)\, p(t, x) \tag{10}$$

As discussed by Coles and Sparks [2006], the selected probability function should be in accord with reasonable assumptions regarding the probability of an eruption being recorded. Specifically, the function should meet the following conditions: (1) p(1, x) = 1 for all x, implying that any eruption with a magnitude greater than the threshold would be recorded today; (2) p(t, x) increases as t → 1 for fixed x, implying that the closer t is to the present, the more likely it is that an eruption of size x is known about today; and (3) p(t, x) is nondecreasing as x → ∞ for fixed t, implying that at any point in time, a larger eruption is more likely to be recorded than a smaller one.
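The three conditions can be checked numerically for any candidate recording probability. The Python sketch below does this on a grid of times and magnitudes, with time rescaled so that t = 1 is the present (as the conditions imply) and t = 0 taken, as an assumption, to be the start of the record. The function p_hypothetical is a stand-in chosen only because it satisfies conditions (1)-(3); it is not claimed to be the function used in this study or by Coles and Sparks [2006], and its parameter values are arbitrary.

```python
import math


def p_hypothetical(t, x, v=2.0, w=0.5, b=1.0, u=4.0):
    """HYPOTHETICAL recording probability, for illustration only."""
    return math.exp(-v * (1.0 - t) ** b * math.exp(-w * (x - u)))


def satisfies_conditions(p, times, magnitudes, tol=1e-12):
    """Check conditions (1)-(3) on ascending grids of times and magnitudes."""
    # (1) Every eruption above the threshold occurring today is recorded.
    cond1 = all(abs(p(1.0, x) - 1.0) <= tol for x in magnitudes)
    # (2) For fixed magnitude, recording probability grows toward the present.
    cond2 = all(p(t1, x) <= p(t2, x) + tol
                for x in magnitudes
                for t1, t2 in zip(times, times[1:]))
    # (3) For fixed time, larger eruptions are at least as likely to be recorded.
    cond3 = all(p(t, x1) <= p(t, x2) + tol
                for t in times
                for x1, x2 in zip(magnitudes, magnitudes[1:]))
    return cond1, cond2, cond3


if __name__ == "__main__":
    times = [0.0, 0.25, 0.5, 0.75, 1.0]
    magnitudes = [4.0, 5.0, 6.0, 7.0]
    print(satisfies_conditions(p_hypothetical, times, magnitudes))
```

A grid check of this kind cannot prove that a function meets the conditions everywhere, but it quickly rejects candidates that violate them.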

[24] The following probability function, which we applied, meets these conditions:

equation image

where v determines the extent of underreporting (if v = 0, p(t, x) = 1, i.e., there is no underreporting of eruptions), w indicates the importance of eruption magnitude for the eruption being recorded (if w = 0, p(t, x) has no dependence on magnitude), and b signifies the importance of timing for the eruption being recorded (if b = 0, p(t, x) has no dependence on the time of an eruption). The likelihood function becomes:

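$$L(\mu, \sigma, \xi, v, w, b) = \exp\left\{-\iint_{A_u}\lambda(t, x)\, p(t, x)\, dt\, dx\right\}\prod_{i=1}^{n}\lambda(t_i, x_i)\, p(t_i, x_i) \tag{12}$$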

The combination of parameters (μ, σ, ξ, v, w, b) that maximize equation (12) has the highest attached probability of describing the process that generated the data. The annual rate and return period are calculated in the same way as for the case where no underreporting is suspected. More detailed treatments of extreme value theory are given by Coles [2001] and Coles and Sparks [2006].

4. Model Applications

[25] Increasing underreporting of volcanic activity with age is a well-known problem in considering historical and geological data on volcanic records [Simkin, 1993; Simkin and Siebert, 2000; Coles and Sparks, 2006; Deligne, 2006]. This phenomenon is not limited to volcanism; it is a well-known problem in fields such as seismology [e.g., Tinti and Mulargia, 1985; Albarello et al., 2001; Woessner and Wiemer, 2005; Leonard, 2008; Grünthal et al., 2009; Wang et al., 2009]. The problem is made more complex because data biases also depend on magnitude, with larger eruptions being more likely to be recorded than small eruptions at any given age. Knowledge of eruptive histories of volcanoes worldwide is poor, with only a quarter of large volcanoes having a known history extending back through the first half of the Holocene. About a third of volcanoes only have an eruptive history extending back to the beginning of the 20th century at best [Deligne, 2006]. This suggests that the underreporting of volcanic activity extends well into the 20th century. The model assuming no underreporting is applied to data from two time periods: from 1750 onward and from 1900 onward. The first period is likely to be affected by underreporting while the second period does not contain very much data. These model applications are to establish how well a short record of relatively unbiased data can predict the extreme tail characteristics. We then analyze the entire Holocene data set with corrections for the underreporting bias. Table 3 summarizes the number of eruptions for each model run considered at each applied threshold.

Table 3. Number of Eruptions in Each Model Run^a
Threshold    1750 A.D. to Present    1900 A.D. to Present    Holocene

^a Note that events whose size is described solely as “VEI 4” (assigned M = 4) or “Plinian” (assigned M = 4.7) were excluded from the analysis.


5. Results

[26] In presenting the results from the magnitude-frequency (M-F) model predictions based on data assuming no underreporting of volcanic eruptions, we include empirical estimates based on the data. These empirical estimates are determined by taking the total number of observed eruptions at or above the specific magnitude and dividing it by the total time being considered. We stress that the models predict long-term behavior. The return periods of smaller eruptions (magnitude 5.0–5.5) are short enough that, within the 259 or 109 years from which the data come, the number of eruptions divided by the time period studied reflects their long-term return rates; the same is not true of the large eruptions. For instance, while the 1815 Tambora eruption (M = 6.9) happened during the time interval from 1750 to the present, and thus the empirical data indicate that one M = 6.9 eruption happened in 259 years, the model, predicting behavior over time, suggests a much longer return period than 259 years for such an eruption. We estimate upper and lower bounds for the M-F curves by calculating uncertainties at the 95% confidence level.
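The empirical estimate described above amounts to a simple count. The following Python sketch shows the calculation under the stated definition; the function names and the short list of magnitudes are hypothetical and do not reproduce the compiled database.

```python
import math


def empirical_rate(magnitudes, n_years, m):
    """Observed eruptions per year with magnitude >= m."""
    count = sum(1 for x in magnitudes if x >= m)
    return count / n_years


def empirical_return_period(magnitudes, n_years, m):
    """Years of record divided by the number of eruptions with magnitude >= m."""
    count = sum(1 for x in magnitudes if x >= m)
    return n_years / count if count else math.inf


if __name__ == "__main__":
    # Hypothetical magnitudes in a 259 year record (illustration only).
    record = [5.0, 5.2, 5.5, 6.1, 6.9]
    for m in (5.0, 6.0, 6.9, 7.5):
        print(m, empirical_return_period(record, 259, m))
```

A single large event in a short record, such as one M = 6.9 eruption in 259 years, yields an empirical return period equal to the record length, which is why such estimates are unreliable for the largest magnitudes.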

[27] In considering the results of our analysis we draw attention to physical and geological constraints on values of ξ. If ξ ≥ 0, the model predicts that there is no upper limit to the size of an eruption, implying that an eruption could erupt more than the mass of the earth. However, there is a physical upper magnitude limit to explosive volcanic eruptions as a consequence of the mechanisms of magma generation. Large bodies of silicic magma are generated by intracrustal igneous processes [Smith, 1979; Hildreth, 1981]. The volume of magma that can accumulate in a magma chamber is limited by parameters such as crustal thickness, crustal strength, and magma supply rates [Jellinek and DePaolo, 2003]. Our model takes none of these into consideration; rather than modeling the details of the process to predict system limitations, we are using results of the process to predict the overall behavior and limitations of the system. The data compilation of Mason et al. [2004b] on the largest explosive eruptions indicates that there have been at least 42 magnitude 8 or greater eruptions over the last 36 Myr, the largest of which is the Fish Canyon Tuff eruption with M = 9.1–9.2 [Lipman et al., 1997; Bachmann et al., 2000]. These observations indicate that the upper bound for global volcanism magnitude is at least 9.2. While an unbounded solution is physically impossible, such a result could indicate that the data do not support any particular value for the upper limit. If the maximum likelihood estimate has ξ ≥ 0, making the high end results certainly invalid, one should note that the data support this model with ξ ≥ 0 as the best fit model, so it is the model to use for other result interpretations, such as return periods for smaller eruptions still within the end tail of the distribution.

5.1. Eruptions From 1750 to Present

[28] Figure 4 shows the data in the 1750 A.D. onward period. Figure 5 shows the parameters (μ, σ, ξ) obtained at different thresholds, with error bars indicating the 95% confidence interval. The precision, as quantified by the 95% confidence interval, greatly decreases (i.e., the confidence interval increases) for both μ (location) and σ (scale) at threshold u = 4.7 and again at u = 5.0 (Figures 5a and 5b). While the model applied is more relevant at higher thresholds, the errors associated with thresholds u = 5.0–5.2 are too large for any meaningful conclusions to be drawn. We therefore chose the threshold u = 4.9 as the optimum for analysis and compared these with analyses with a threshold u = 4.0, where all data are used. For two thresholds (u = 4.5 and 4.6), the maximum likelihood value of ξ is greater than zero, predicting no upper limit to the size of volcanic eruptions (Figure 5c). For six other thresholds the 95% confidence interval for ξ includes values equal to or greater than zero.

Figure 4.

Distribution of eruptions in compiled database (see auxiliary material) in the period from 1750 A.D. to the present. The legend is as described in Figure 3.

Figure 5.

Parameters obtained from application of extreme value statistics to the 1750 A.D. to present period. The model assumes no underreporting of volcanic eruptions. For data where there are no obvious error bars, the 95% confidence interval is too tight for the bars to appear. (a) Results for μ, the location parameter. (b) Results for σ, the scale parameter. (c) Results for ξ, the rate of decay of the tail parameter. The dashed line shows where ξ = 0. At or above this line, the model predicts there is no upper limit to the size of a volcanic eruption.

[29] Figure 6 shows the return periods and predicted magnitude limit for thresholds u = 4.0 (Figure 6a) and 4.9 (Figure 6b) with calculations of upper and lower bounds of uncertainty along with empirical data. For relatively low magnitude eruptions in the range M = 5.0 to 6.1, both threshold models give M-F curves that match the empirical data relatively well. The agreement is better for u = 4.9 than for u = 4.0. This demonstrates the principle that adding more data from smaller eruptions (i.e., lowering the threshold) introduces more bias. Small eruptions are unrepresentative of the tail and may also have a more pronounced underreporting problem than large eruptions. Above M = 6.1 the empirical data and the curves depart, with the curves giving longer return periods than the empirical data. As previously discussed, this is because the model predicts behavior over time, while the empirical data simply report how many eruptions at or above a given magnitude happened in the last 259 years. Thus, while there happened to be an M = 6.9 eruption in the last 259 years (Tambora, 1815), the model predicts that the long-term average return period for such an eruption is 781 years (for u = 4.9). Table 4 lists the values for μ, σ, and ξ obtained for u = 4.0 and 4.9, in addition to the values obtained for the analysis for eruptions from 1900 A.D. to the present.

Figure 6.

Return periods calculated at thresholds (a) u = 4.0 and (b) u = 4.9 from model application to eruptions in the 1750 A.D. to present period (solid line), with 95% confidence interval indicated by the dashed lines. The solid red line indicates the predicted magnitude limit, with the 95% confidence interval indicated by the red dotted lines. The range of possible magnitudes for the Fish Canyon Tuff, the largest known explosive eruption, is shown by the blue bar. Solid diamonds are the return periods empirically calculated from the data. This is done by dividing the total number of years of observations (259 years) by the number of eruptions at or above a given magnitude in that time.

Table 4. Model Parameters Obtained Using the 1750 A.D. to Present and 1900 A.D. to Present Data Sets^a
Data Set                u Threshold    μ Location     σ Scale        ξ Shape          Limit
1750 A.D. to present    4.0            2.62 ± 0.56    1.47 ± 0.62    −0.30 ± 0.16     7.51 ± 1.19
                        4.9            1.16 ± 4.68    2.38 ± 4.06    −0.38 ± 0.47     7.38 ± 1.77
1900 A.D. to present    4.0            2.76 ± 0.84    1.68 ± 1.12    −0.40 ± 0.30     6.94 ± 1.19
                        4.5            3.08 ± 1.59    1.27 ± 1.81    −0.29 ± 0.57     7.47 ± 4.06

^a Assuming no underreporting of eruptions. Errors reported at the 95% confidence level.

[30] The models at thresholds u = 4.0 and 4.9 predict that the largest possible explosive eruption magnitudes are 7.5 ± 1.2 and 7.4 ± 1.8, respectively, with the error corresponding to the 95% confidence interval. These two predicted magnitude limits are very similar, although when error is considered the result obtained at u = 4.9 is close to containing the largest known eruption of M = 9.2. We note that the increase in error between u = 4.0 and 4.9 is due to the sparsity of data at the higher threshold (only 25 eruptions are considered at u = 4.9; see Table 3). It appears that while the period from 1750 A.D. to the present is sufficient for producing a return period model for thresholds below M ∼ 6, it is insufficient to characterize the M-F relationship for larger magnitudes.

[31] These results differ from those of Coles and Sparks [2006] for their application of the same model to the portion of the Hayakawa catalog from 1600 A.D. to the present. Quite similar parameters were determined for model application at a threshold u = 4.0, although at higher thresholds results are less similar. Additionally, none of their values for ξ exceeded zero. The main differences between the data used are that the Hayakawa catalog includes large effusive eruptions, whereas our study only relates to explosive eruptions, and that they assumed that there was no underreporting from 1600 A.D. onward, which is suspect.

5.2. Eruptions From 1900 to the Present

[32] The same procedures have been applied to the data set for the period since 1900 A.D. The advantage of this period is that underreporting should not be a major problem for large explosive eruptions, but the disadvantage is that the period is unlikely to sample the larger magnitude events well or at all. The optimum threshold of u = 4.5 was chosen and compared with the analysis at u = 4.0. The maximum likelihood M-F curves (Figure 7) predict a limit for eruption size that is far too low considering the geologic record. Agreement between the analysis and the empirical data is good up to about M = 5.7 for both thresholds. Taken together, these results suggest that the period is too short to be representative and to characterize the extreme tails.

Figure 7.

Return periods calculated at thresholds (a) u = 4.0 and (b) u = 4.5 from model application to eruptions in the 1900 A.D. to present period (solid line), with 95% confidence interval indicated by the dashed lines. The solid red line indicates the predicted magnitude limit with the 95% confidence interval indicated by the red dotted lines (note that for u = 4.5 the error is off the range of this plot; see Table 4). The range of possible magnitudes for the Fish Canyon Tuff, the largest known explosive eruption, is shown by the blue bar. Solid diamonds are the return periods empirically calculated from the data. This is done by dividing the total number of years of observations (109 years) by the number of eruptions at or above a given magnitude in that time.
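The empirical return periods described in the caption can be sketched as follows. This is a minimal illustration of the stated rule (years of observation divided by the number of eruptions at or above a given magnitude); the magnitude list is hypothetical and is not taken from the database, and the function name is an assumption made for this example.

```python
def empirical_return_periods(magnitudes, years_observed, levels):
    """Map each magnitude level to years_observed / (number of eruptions >= level).

    Levels with no eruptions at or above them map to None, since no
    empirical return period can be computed there.
    """
    result = {}
    for level in levels:
        count = sum(1 for m in magnitudes if m >= level)
        result[level] = years_observed / count if count else None
    return result


# Hypothetical magnitudes for illustration only (NOT the compiled database).
example_magnitudes = [4.0, 4.2, 4.5, 4.8, 5.0, 5.1, 5.6, 6.0, 6.1]

# 109 years corresponds to the 1900 A.D. to present period in the caption.
periods = empirical_return_periods(example_magnitudes, 109, [4.0, 5.0, 6.0, 7.0])
for level, period in periods.items():
    label = "no events" if period is None else f"{period:.1f} years"
    print(f"M >= {level}: {label}")
```

The sketch makes the limitation of short records explicit: once no eruption at or above a given magnitude falls inside the observation window, the empirical estimate is undefined rather than merely uncertain.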

5.3. Analysis of Holocene Data

[33] Figure 3 shows the Holocene data for magnitude versus age. Uncertainties vary greatly within the data set. Age uncertainties depend on geochronological methods and range from precise historical fixes or precise tree ring correlations to scattered ¹⁴C ages with both corrected and uncorrected results being reported. Typical age error estimates for nonhistorical age determination methods are ±250 years. Magnitude error estimates are on the order of ±0.2. Error bars are not shown, both for clarity and because in many cases they are not reported. In our analysis we assume that the uncertainties are nonsystematic and will not affect the overall statistical outcomes.

[34] Underreporting is evident from visual inspection, with the density of data points decreasing back in time. The change in data density is more pronounced for smaller events than for larger events, reflecting the increased probability of recording large eruptions. The plot includes data on Plinian eruptions for which there are no magnitude data but which are interpreted as M > 4. These data have been assigned an arbitrary magnitude of 4.7 and are excluded from the statistical analysis. Additionally, data with an assigned VEI = 4 which were arbitrarily assigned a tephra volume of 0.1 km³ are likewise excluded from analysis.

[35] We have estimated the parameters (μ, σ, ξ, v, w, b) obtained at different thresholds, where the last three parameters are those of the probability function (Figure 8). The error bars indicate the 95% confidence interval. The value of ξ is below zero at all thresholds (Figure 8c). For the parameters that describe underreporting we observe a modest effect of magnitude (Figure 8e) and a large effect of time (Figure 8f), reflected in large values of b for all thresholds. We choose u = 5.5 as the preferred threshold, which avoids the influence of small magnitude eruptions when estimating the tail of the M-F curve for extreme events. In addition, u = 5.5 is the highest threshold with reasonable error that includes all the higher thresholds within the uncertainty limits.

Figure 8.

Parameters obtained from application of extreme value statistics to the compiled database of large explosive Holocene eruptions. The model assumes underreporting of volcanic eruptions. Computation problems precluded the determination of parameters at threshold u = 5.6. For data where there are no obvious error bars, the 95% confidence interval is too tight for the bars to appear. (a) Results for μ, the location parameter. (b) Results for σ, the scale parameter. (c) Results for ξ, the rate of decay of the tail parameter. The dashed line shows where ξ = 0. At or above this line, the model predicts there is no upper limit to the size of a volcanic eruption. (d) Results for v, the extent of underreporting parameter. (e) Results for w, the magnitude parameter. (f) Results for b, the timing parameter.

[36] Figure 9 shows the calculated probability of a M ≥ 6 eruption being recorded from the parameters obtained at thresholds u = 4.0 and 5.5. At the preferred threshold (u = 5.5), the curves indicate that underreporting increases markedly going back from the present to about 2000 years ago, before which the probability of a M ≥ 6 eruption being recorded is roughly constant at about 20% (Figure 9b). These results give the probability that an eruption would be recorded were it to have occurred during the reporting period. Coles and Sparks [2006] obtained similar probability functions in their study of eruptions from the last 2000 years. Their probability function differs in that the sharp increase in probability started around 1200 A.D. (as opposed to around 1 A.D.). However, the initial probabilities for magnitude 6.0 are almost identical to our results.

Figure 9.

Probability function that an eruption of magnitude 6.0 is known about today, as determined at thresholds (a) u = 4.0 and (b) u = 5.5.

[37] Figure 10 shows the return periods for thresholds u = 4.0 and 5.5. For this much longer time period the upper limits are well below the expected value of about M = 9.2. In this case the lower threshold (u = 4.0) gives a value (M = 8.3 ± 1.0) closer to the expected upper limit than the higher threshold (u = 5.5), which gives M = 7.6 ± 0.4; error corresponds to the 95% confidence interval. Overall, the confidence intervals obtained are much tighter than those obtained for the analysis where no underreporting of eruptions is assumed, which is not surprising as there are many more data and a much longer time period has been considered. Table 5 lists the values for the parameters and predicted upper limit obtained for u = 4.0 and 5.5.

Figure 10.

Return periods calculated at thresholds (a) u = 4.0 and (b) u = 5.5 from model application to Holocene eruptions (solid line), with 95% confidence interval indicated by the dashed lines. The solid red line indicates the predicted magnitude limit, with the 95% confidence interval indicated by the red dotted lines. The range of possible magnitudes for the Fish Canyon Tuff, the largest known explosive eruption, is shown by the blue bar.

Table 5. Model Parameters Obtained Using the Holocene Data Set^a

u (Threshold)   μ (Location)   σ (Scale)      ξ (Shape)       v (Extent)     w (Magnitude)   b (Time)         Limit
4.0             7.70 ± 0.48    0.14 ± 0.08    −0.25 ± 0.09    1.30 ± 0.21    0.21 ± 0.11     16.03 ± 4.91     8.27 ± 0.99
5.5             7.39 ± 0.22    0.08 ± 0.07    −0.39 ± 0.22    1.09 ± 3.23    0.18 ± 1.65     11.76 ± 10.41    7.59 ± 0.43

^a Assuming underreporting of eruptions. These values were obtained by rescaling time in the Holocene from 0 to 1; thus, any estimates for return periods must be multiplied by 10,009 (i.e., the time between 8000 B.C. and 2009 A.D.) to obtain the return period in years. Errors reported at the 95% confidence level.
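The time rescaling noted for Table 5 can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard point process form for the tail, in which the expected number of events of magnitude at least m per unit (rescaled) time is [1 + ξ(m − μ)/σ]^(−1/ξ). This functional form is an assumption, since the model is defined earlier in the paper and not restated here; the function and variable names are illustrative.

```python
import math

HOLOCENE_YEARS = 10_009  # rescaling factor quoted for Table 5

# (mu, sigma, xi) from Table 5, keyed by threshold u
TABLE_5 = {
    4.0: (7.70, 0.14, -0.25),
    5.5: (7.39, 0.08, -0.39),
}


def return_period_years(m, mu, sigma, xi, years_per_unit_time=HOLOCENE_YEARS):
    """Return period in years for eruptions of magnitude >= m.

    Assumes the standard point process tail with xi < 0. Magnitudes at or
    beyond the upper limit mu - sigma / xi have an infinite return period.
    """
    if xi >= 0:
        raise ValueError("this sketch only handles a negative shape parameter")
    base = 1.0 + xi * (m - mu) / sigma
    if base <= 0.0:
        return math.inf
    rate_per_unit_time = base ** (-1.0 / xi)
    return years_per_unit_time / rate_per_unit_time


for u, (mu, sigma, xi) in TABLE_5.items():
    for m in (6.0, 6.5, 7.0, 8.0):
        rp = return_period_years(m, mu, sigma, xi)
        print(f"u = {u}, M >= {m}: return period ~ {rp:.0f} years")
```

Because the published parameters are rounded, values computed this way should be read as approximate. Under the stated assumption they are of the same order as the Holocene return periods tabulated in the Discussion.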

[38] As choice of threshold dictates the resulting return period function and predicted magnitude limit, we include Figure 11, which shows the predicted magnitude limit obtained at each threshold. Error bars indicate the 95% confidence interval. Six thresholds have a predicted magnitude limit greater than M = 9.2; these six predictions also have the greatest errors of any of the predicted limits obtained. Of the seventeen thresholds whose predicted magnitude limit is less than M = 9.2, the majority (ten) do not contain M = 9.2 within their error bars.

Figure 11.

Predicted magnitude limit of all thresholds with calculated Holocene parameters, with error bars showing the 95% confidence interval. For u = 4.6, the predicted magnitude limit is M = 22.6, and for u = 4.3, 4.5, 4.6, 4.7, and 6.0, the full range of the 95% confidence interval is greater than the range of the plot. The range of possible magnitudes for the Fish Canyon Tuff, the largest known explosive eruption, is shown by the blue bar.

6. Discussion

[39] The analysis of the Holocene data set on global explosive eruptions provides a statistically rigorous approach to assessing the return periods of large magnitude explosive volcanic eruptions. However, it is important to acknowledge that the model can only make predictions on the system that the data sample. Thus, if the data fail to sample a certain process, the model predictions will not apply to this process. Several issues arising from the results limit the ability to constrain accurately the magnitude-frequency relationship of very large magnitude eruptions, for which there are few or no data during the Holocene. Such eruptions may not have occurred over this time period, may have occurred but not yet been recorded by geological studies, or may involve processes that are not captured by modeling smaller magnitude eruptions and extrapolating the results. Here we discuss these issues, propose a current state of knowledge on return periods, and assess the uncertainties in these estimates.

[40] The study quantifies the degree of underreporting back in time. Starting about 2000 years ago there is a rapid increase in the probability that an event is recorded, which reflects the spread of historical written records and advances in science. Prior to 2000 years ago underreporting is statistically constant. This result reflects the fact that the data for volcanism prior to 2000 years ago are largely obtained by geological studies and, if a study is done, the Holocene record of tephra from large magnitude events for a particular volcano is fairly complete. If there is a decrease in preservation potential between the start of the Holocene and 2000 years ago, then our results do not detect this effect. For our preferred threshold value the underreporting is estimated at approximately 80% for M ≥ 6. This predicts that if a magnitude 6.0 or greater eruption (approximately 1991 Mt. Pinatubo size eruption or above) occurred prior to 1 A.D., there is only a one in five chance that we know about it today. As previously mentioned, the probability of a Holocene eruption happening prior to 1 A.D. and the scientific community not knowing about it today is the probability of it not being recorded times the probability of it happening during the length of time of interest (in this case, 10,000 years). These results concerning the state of knowledge are similar to those obtained by an independent method assessing how far back volcanic histories extend from an analysis of a global database of volcanoes [Deligne, 2006], which found that only a quarter of the world's volcanoes with potential for explosive volcanism have records extending back earlier than 3000 B.C. Underreporting is also dependent on magnitude, with recording probability increasing with magnitude.
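The practical meaning of a one in five recording probability can be shown with simple arithmetic. The sketch below is illustrative only. It assumes a model return period of about 49 years for M ≥ 6 (the u = 5.5 Holocene value in Table 6), a constant recording probability of 0.2 before 1 A.D., and a span of roughly 8000 years between 8000 B.C. and 1 A.D. The function names are hypothetical.

```python
RECORDING_PROBABILITY = 0.2   # about 20% for M >= 6 before 1 A.D. (from the text)
MODEL_RETURN_PERIOD = 49.0    # years; assumed from Table 6 (Holocene, u = 5.5, M = 6.0)
PRE_1AD_SPAN_YEARS = 8000     # approximate span from 8000 B.C. to 1 A.D.


def apparent_return_period(true_return_period, p_recorded):
    """Return period suggested by the record alone, if only a fraction
    p_recorded of eruptions is known today."""
    return true_return_period / p_recorded


def expected_counts(span_years, true_return_period, p_recorded):
    """Expected numbers of eruptions that occurred, were recorded, and were missed."""
    occurred = span_years / true_return_period
    recorded = occurred * p_recorded
    missed = occurred - recorded
    return occurred, recorded, missed


apparent = apparent_return_period(MODEL_RETURN_PERIOD, RECORDING_PROBABILITY)
occurred, recorded, missed = expected_counts(
    PRE_1AD_SPAN_YEARS, MODEL_RETURN_PERIOD, RECORDING_PROBABILITY
)
print(f"apparent return period in the pre-1 A.D. record: ~{apparent:.0f} years")
print(f"expected M >= 6 eruptions: ~{occurred:.0f} occurred, "
      f"~{recorded:.0f} recorded, ~{missed:.0f} missed")
```

Under these assumptions, a record read at face value would overstate the return period by a factor of five, which is why the underreporting correction matters for the magnitude-frequency estimates.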

[41] The choice of a threshold is important in characterizing the tail of the distribution and is a tradeoff between data quantity and data relevance. A low threshold increases the data quantity, decreasing apparent confidence intervals, but biases the analysis with data from relatively low magnitude events, which may be unrepresentative of the natural processes that govern the tail. A high threshold may be better at characterizing the tail, but the uncertainties increase markedly due to the reduction in data. Comparison of the results for different thresholds indicates that these effects are important in the explosive volcanism data set. Thus we propose that the best representation of the M-F relationship of the tail is for u = 5.5 for the Holocene data set. Even with this choice of threshold, which excludes the smaller events, the results suggest that the extreme region of the tail is not yet captured in the model. The maximum likelihood result with u = 5.5 gives an upper limit for an extreme event of only M = 7.6, with an upper bound slightly above M = 8. However, the geological record shows that M > 8 eruptions occur, albeit very rarely, and the largest known eruption has M = 9.2; this is probably close to the true maximum [Mason et al., 2004b]. Figure 11 shows that most thresholds with a predicted magnitude limit with reasonable error do not contain M = 9.2 within their 95% confidence intervals; the predicted magnitude limit is in such cases smaller. From this it is clear that the model based on Holocene data does not capture the very extreme tail.

[42] The failure to capture the extreme parts of the tail can be attributed to two possible causes. First, the largest magnitude eruption in the Holocene data set is M = 7.4. Extrapolation of best fit models well beyond the largest event that has been analyzed is problematic if the extreme events do not follow the same frequency distribution as the smaller events in the data set. This is not just an issue of the uncertainties increasing with increased extrapolation, since our analysis includes estimates of uncertainty and the upper bounds fall well short of the M > 8 events. This observation suggests the second possibility, that different physical mechanisms may be responsible for large and very large magnitude events.

[43] Such different mechanisms are known in other systems. For example, Sparks and Aspinall [2004] analyzed the statistical distribution of durations of 150 historic lava dome eruptions. The M-F curve for duration changed at about 5 years duration with a long-duration tail, which is quite different to the M-F curve for the majority of the data. Thus, the processes involved in short-duration (less than 5 years) lava dome eruptions appear to be different from the processes involved in long-duration events. In this case, Sparks and Aspinall [2004] suggested that dome eruptions lasting more than 5 years form thermally mature conduits which allow prolonged magma flow and eruption. Irrespective of the cause, the data support contrasting physical mechanisms between long- and short-duration dome eruptions.

[44] In the case of explosive volcanism we propose that there is a change in the physical mechanism between large and very large magnitude explosive eruptions. A clue to the nature of the change is that caldera formation becomes a major feature of explosive eruptions at about M ∼ 7. The six Holocene eruptions that exceed M ∼ 6.9 all formed calderas. Additionally, the data set of Mason et al. [2004b] for M > 8 events supports a different M-F relationship for caldera-forming eruptions. Many of the eruptions in our data set are Plinian style eruptions from stratovolcanoes or central vents and were not accompanied by caldera formation. The study of Jellinek and DePaolo [2003] provides a plausible explanation for a change of mechanism between smaller magnitude Plinian eruptions and larger magnitude caldera-forming eruptions. They propose that a change in behavior occurs when magma chambers reach a critical size. Below the critical size the chamber cannot sustain large pressures and conditions are repeatedly reached when dykes can propagate and eruptions can take place. In such systems the chamber erupts relatively frequently and does not grow to a size where very large magnitude eruptions are possible. Once the size threshold is crossed, however, the conditions allow the chamber to grow by progressive magma accumulation and prevent dyke propagation. The magmatic system can then grow to a large volume and other triggering mechanisms become important, such as failure of the crustal lid to the chamber [Jellinek and DePaolo, 2003]. As in the case of lava dome durations, the two styles of magma chamber evolution and associated volcanism might be expected to give different M-F relationships.

[45] If our interpretation of different mechanisms is correct then extrapolation of the curves to large magnitude is not justified. A major objective of future research should be to gather data for M > 7 explosive eruptions over a longer time period than the Holocene. There effectively is a data gap between the study of Mason et al. [2004b] for M > 8 explosive eruptions and our study which likely covers too short a period to sample eruptions in the M = 7 to 8 range.

[46] We are confident that the curves up to M ∼ 6.5 are robust representations of the global M-F relationships. In Table 6 we tabulate the predicted return periods for eruptions between magnitude 4.5 and 7 in intervals of 0.5. For M ≥ 7 the uncertainties remain large, and reducing them will require data sets covering more extensive time periods as well as investigation of the underlying mechanisms.

Table 6. Predicted Return Levels for Eruptions of Different Magnitudes According to Different Model Runs

                                                Predicted Return Levels (years)
                                                Holocene            1750 A.D. to present   1900 A.D. to present
Eruption Magnitude   Eruption Example           u = 4.0   u = 5.5   u = 4.0   u = 4.9      u = 4.0   u = 4.5
4.5                  Avachinsky, 1945           4.
5.0                  Somma-Vesuvius, 1631       7.9       -         9.2       12           6.7       7.3
5.5                  Shiveluch, 1964            15        24        19        23           14        16
6.0                  Quizapu, 1932              35        49        50        51           41        44
6.5                  Krakatau, 1883             96        129       191       163          268       184
7.0                  Kurile Lake, KO eruption   370       626       1850      1428         -         2249

7. Summary and Conclusions

[47] We compiled a database of large explosive Holocene eruptions, with the aim of including every known eruption of magnitude 4 or greater. A total of 576 eruptions from 227 volcanoes were included. Extreme value theory statistics were applied to the resulting database. A model assuming no underreporting of volcanic eruptions was applied to two data sets, one of eruptions from 1750 A.D. to the present and a second from 1900 A.D. to the present. A model taking underreporting into consideration was applied to the entire Holocene database. Results in all cases predict eruption size limits smaller than most supereruptions and considerably smaller than the largest known volcanic eruption, the Fish Canyon Tuff. Additionally, results from the Holocene analysis imply considerable underreporting of volcanic eruptions prior to 1 A.D., predicting that a magnitude 6 eruption only has a 20% chance of being recorded. We suggest that the models predict a smaller magnitude eruption size limit than the geologic record indicates due to sampling biases. It is likely that the data only sampled the high end of “ordinary” eruptions and that the mechanisms behind “supereruptions” are different. Our results hence predict the upper limit and recurrence rates for large Plinian-type eruptions but do not make predictions on size or recurrence rates of caldera-forming eruptions, the high end of which are “supereruptions.”


[48] This project is a product of the VOGRIPA project, supported by Munich Re and the European Research Council. VOGRIPA is part of the Global Risk Identification Programme (GRIP). R.S.J.S. acknowledges support of a Royal Society Wolfson Merit Award and a European Research Council Advanced Grant. We thank Matt Watson and Nick Tanushev for support to N.I.D. with preparation of the data and computing. This work reflects part of the work N.I.D. did for her Masters thesis at the University of Bristol. This manuscript was improved by comments from Warner Marzocchi, Patrick Taylor, and two anonymous reviewers.