We perform a statistical study of risk in nuclear energy systems. This study provides and analyzes a data set that is twice the size of the previous best data set on nuclear incidents and accidents, comparing three measures of severity: the industry standard International Nuclear Event Scale, the Nuclear Accident Magnitude Scale of radiation release, and cost in U.S. dollars. The rate of nuclear accidents with cost above 20 MM 2013 USD, per reactor per year, has decreased from the 1970s until the present time. Along the way, the rate dropped significantly after Chernobyl (April 1986) and is expected to be roughly stable around a level of 0.003, suggesting an average of just over one event per year across the current global fleet. The distribution of costs appears to have changed following the Three Mile Island major accident (March 1979). The median cost became approximately 3.5 times smaller, but an extremely heavy tail emerged, being well described by a Pareto distribution with parameter α = 0.5–0.6. For instance, the cost of the two largest events, Chernobyl and Fukushima (March 2011), is equal to nearly five times the sum of the 173 other events. We also document a significant runaway disaster regime in both radiation release and cost data, which we associate with the “dragon-king” phenomenon. Since the major accident at Fukushima (March 2011) occurred recently, we are unable to quantify an impact of the industry response to this disaster. Excluding such improvements, in terms of costs, our range of models suggests that there is presently a 50% chance that (i) a Fukushima event (or larger) occurs every 60–150 years, and (ii) that a Three Mile Island event (or larger) occurs every 10–20 years. Further—even assuming that it is no longer possible to suffer an event more costly than Chernobyl or Fukushima—the expected annual cost and its standard error bracket the cost of a new plant. 
This highlights the importance of improvements not only immediately following Fukushima, but also deeper improvements to effectively exclude the possibility of “dragon-king” disasters. Finally, we find that the International Nuclear Event Scale (INES) is inconsistent in terms of both cost and radiation released. To be consistent with cost data, the Chernobyl and Fukushima disasters would need to have an INES level between 10 and 11, rather than the maximum of 7.
The industry-standard approach to the evaluation of the risk of nuclear accidents is a top-down technique called probabilistic safety assessment (PSA). PSA consists of developing fault tree models that allow one to simulate accidents, with different triggers and event paths, and the severity and frequency of such accidents. Furthermore, within a plant, PSA may be an ongoing process where both the PSA, and plant operations and technology, evolve together with the purpose of improving plant safety. The basic PSA methodology works as follows. Initiating events, such as component failures, human errors, and external events, are enumerated and assigned probabilities. Next, a (typically deterministic) fault tree is defined to encode the causal links between events, allowing combinations of initiating events to form the ultimate resultant/system-level event. Such a model then allows one to determine the probability of such events, and potentially attach damage/consequence values to the event paths. Thus, a textbook PSA would require the complete and correct definition of initiating events, subsequent cascade effects, and their probabilities and consequences.
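The fault-tree logic described above can be illustrated with a minimal sketch. It assumes that initiating events are independent and combines them through AND/OR gates; the event names, the tree structure, and all probabilities are hypothetical and are not taken from any actual PSA.

```python
from math import prod

def p_and(probabilities):
    """AND gate: all independent inputs must occur."""
    return prod(probabilities)

def p_or(probabilities):
    """OR gate: at least one independent input occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)

# Hypothetical annual probabilities of initiating events (illustrative only).
initiating = {
    "pump_failure": 1e-2,
    "backup_pump_failure": 5e-2,
    "operator_error": 1e-3,
    "external_flood": 1e-4,
}

# Hypothetical tree: cooling is lost if both pumps fail, or if a flood occurs.
p_cooling_lost = p_or([
    p_and([initiating["pump_failure"], initiating["backup_pump_failure"]]),
    initiating["external_flood"],
])

# System-level event: cooling is lost AND the operator fails to recover.
p_top_event = p_and([p_cooling_lost, initiating["operator_error"]])

print(f"P(cooling lost) = {p_cooling_lost:.3e}")
print(f"P(top event)    = {p_top_event:.3e}")
```

Even in this toy tree, the result depends entirely on which initiating events and paths were enumerated, which is the completeness requirement noted above.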
It is therefore not surprising that the documentation for a plant-specific PSA often fills a bookshelf, and is a constant “work in progress.” Within PSA, three levels exist, delineating the depth/extent to which events are studied:[2-4] level 1 concerns core damage events, level 2 concerns radioactive releases from the reactor building given that an accident has occurred, and level 3 evaluates the impact of such releases on the public and the environment. Levels 1 and 2 are required by regulation. Level 3, which is the level considered in this study, is seldom done in PSA. Given that the reliability of PSA depends on the inclusiveness of scenarios, the correct modeling of cascade effects, and the handling of tremendous uncertainties, it is not surprising that PSA has failed to anticipate a number of historical accidents in civil nuclear energy.[5, 6] It has been found that the probability assessments were fraught with unrealistic assumptions, severely underestimating the probability of accidents. The chairman of the World Association of Nuclear Operators stated that the nuclear industry is overconfident when evaluating risk and that the severity of accidents is often underreported.
Instead of entering this quagmire, several studies have used a “bottom-up” approach, performing statistical analysis of historical data. These studies, [5, 9-12] and others, have almost universally found that PSA dramatically underestimates the risk of accidents. The International Atomic Energy Agency (IAEA) provides the International Nuclear Event Scale (INES) measure of accident severity, which is the standard scale used to measure the severity of nuclear accidents. However, the INES has been censured—for being crude, inconsistent, only available for a small number of events, etc.—not only in statistical studies, but by the industry itself.[8, 13] As noted by The Guardian newspaper, it is indeed remarkable (sic astonishing) that the IAEA does not publish a historical database of INES events. However, given that the IAEA has the dual objective of promoting and regulating the use of nuclear energy, one should not take the full objectivity of the INES data for granted. Independent studies are necessary to avoid possible conflicts of interest associated with misaligned incentives.
Presumably for lack of better data sources, a number of statistical studies[11, 12] have used the INES data to make statements about both the severity and frequency of accidents in nuclear energy systems. Here, we also perform a statistical analysis of nuclear incidents and accidents, but we avoid relying on the INES data. Instead, we use the estimated cost value in USD (U.S. dollars) as the common metric that allows one to compare often very different types of events. This database has over triple the number of events compared with most studies, providing a much better basis for statistical analysis and inference, and bringing into question the reliability of the other studies. Moreover, because radiation releases may translate into very different levels and spread of contamination of the biosphere, depending on local circumstances, the quantification of cost is more useful and provides a better comparative tool.
According to PSA specialists, the gaps between PSA-specific results and the global statistical data analysis mentioned above exist in the eyes of observers who ignore the limitations in scope that apply to almost all PSA—e.g., PSA is often restricted to normal operating conditions and internal initiating events. Indeed, PSA is a tool that serves many purposes that do not rely on the accurate absolute quantification of risks. However, PSA is used as the tool for discussing risks in nuclear energy systems, and has multiple shortcomings in this regard. That is, PSA applications need to better consider incompleteness, uncertainty,[6, 15-17] and be combined with bottom-up statistical approaches when discussing risks at many levels.
Moreover, because of the uniqueness of each reactor, some nuclear experts say that assigning risk to a particular nuclear power plant is impossible.[3, 19] A further argument is that the series of accidents form a nonstationary series; in particular because the industry has been continuously learning from past accidents, implementing procedures, and upgrading each time to fix the problem when a vulnerability was found, especially when revealed by an accident. For instance: the loss of criticality control in the fast breeder reactor EBR-I (1.7 MWe), which started operation in 1951 on a test site in the Idaho desert, led to a mandatory reactor design principle to always provide a negative power coefficient of reactivity when a reactor is producing power; the Windscale accident in 1957 catalyzed the establishment of the general concept of multiple barriers to prevent radioactive releases; the Three Mile Island accident in 1979 led to plant-specific full-scope control room simulators, plant-specific PSA models for finding and eliminating risks, and new sets of emergency operating instructions; the Chernobyl accident in 1986 led to the creation of the World Association of Nuclear Operators (WANO) through which participating operators exchange lessons learned and best practices; the Fukushima-Daiichi accident in 2011 is pushing toward designs that ensure heat removal without any AC power for extended times, etc. As a consequence, each new accident supposedly occurs at a nuclear plant that is not exactly the same as for the previous accident. This leads to the concept that nuclear risks are unknowable because one does not have a fixed reference frame to establish reliable statistics.
In contrast, we propose that it is appropriate—and important—to study the global risk of nuclear energy systems by performing statistical analysis of an independently compiled data set of the global pool of reactors. There is nothing invalid about modeling the overall risk of the heterogeneous global fleet, provided one takes sufficient care to control for nonstationarity, and does not draw inference beyond the “average reactor.” In particular, risk-theoretic stochastic models aiming at describing both the frequency and severity of events, as already used in some studies,[5, 9] offer very useful guidelines for such statistical analyses. This constitutes the standard approach that insurance companies rely upon when quoting prices to cover the risk of their clients, even when the estimation of risk appears very difficult and nonstationary. In this spirit, Burgherr et al. write that “the comparative assessment of accident risks is a key component in a holistic evaluation of energy security aspects and sustainability performance associated with our current and future energy system.”
In the next section, we describe the data used in our analyses, how severity of events in nuclear energy systems can be measured, and show that the INES values are a poor measure of severity when compared with the consequences of events measured in USD cost. Section 'UNCERTAINTY QUANTIFICATION OF RISKS IN NUCLEAR ENERGY SYSTEMS' discusses uncertainty quantification in nuclear risks. Section 'EVENT FREQUENCY' estimates the rate of events, and proposes simple models to account for the evolution of the nuclear plant industry. Section 'EVENT SEVERITY' analyzes the distribution of costs. Section 'RUNAWAY DISASTERS AS “DRAGON-KING” OUTLIERS' discusses a runaway disaster effect, where the largest events are outliers referred to as “dragon kings” (DK). Section 'MODELING AGGREGATE ANNUAL DAMAGE' combines the different empirical analyses of previous sections on the rate of events, the severity distribution, and the identification of the DK regime, to model the total future cost distribution, and to determine the expected annual cost. Section 'DISCUSSION AND POLICY CONCLUSIONS' concludes and discusses policy implications.
2. DATA AND THE MEASUREMENT OF EVENT SEVERITY
We define an “event” as an incident or accident within the nuclear energy system that had material relevance to safety, caused property damage, or resulted in human harm. The nuclear energy system includes nuclear power plants, as well as the facilities used in its fuel cycle (uranium mines, transportation by truck or pipeline, enrichment facilities, manufacturing plants, disposal facilities, etc.). Events are defined to be independent in the sense that one event does not trigger another one. For instance, three reactors melted down at Fukushima in 2011; however, we define this as a single accident due to the fact that the occurrences at the individual reactors were part of a dependent sequence, and linked to a common cause. Statistical changes due to industry responses to past accidents are controlled for in the modeling.
With this definition, we compiled an original database of as many events as possible over the period 1950 to 2014. To be included in the database, an accident had to be verified by a published source; some sources are from the peer-reviewed literature, while others are press releases, project documents, public utility commission filings, reports, and newspaper articles. Such an incremental approach to database building has been widely utilized in the peer-reviewed energy studies literature. Hirschberg et al. have constructed the ENergy-related Severe Accidents Database (ENSAD), the most comprehensive database worldwide covering accidents in the energy sector. Flyvbjerg et al. built their own sample of 258 transportation infrastructure projects worth about 90 billion USD.[24, 25] Ansar et al. built their own database of 245 large dams in 65 different countries to assess cost overruns. Also investigating cost overruns, Sovacool et al. [27, 28] compiled a database consisting of 401 electricity projects built between 1936 and 2014 in 57 countries, which constituted 325,515 megawatts (MW) of installed capacity and 8,495 kilometers of transmission lines.
The data set includes three different measures of accident severity: the industry standard measure, INES; a logarithmic measure of radiation release, NAMS (Nuclear Accident Magnitude Scale); and the consequences of accidents measured in 2013 U.S. dollars (USD). The industry standard measure is the discrete seven-point INES, defined as: level 0: events without safety significance, level 1: anomaly, level 2: incident, level 3: serious incident, level 4: accident with local consequences, level 5: accident with wider consequences, level 6: serious accident, and level 7: major accident. Levels 1–3 are considered to be “incidents,” and levels 4–7 “accidents.” The distinction between incidents and accidents is not clear, and thus somewhat arbitrary (e.g., see page 152 of the INES user manual). Incidents tend to concern degradation of safety systems, and may extend to include events where radiation was released and people were impacted. However, when the damage and impact to people and the environment become large enough, the event is deemed an “accident.” There are, however, rules about how many fatalities, or how much radiation release, are necessary to qualify for a specific INES level.
The second measure, NAMS, was proposed as an objective and continuous alternative to INES. The NAMS magnitude is $M = \log_{10}(20R)$, where $R$ is the amount of radiation released in terabecquerels. The constant 20 makes NAMS approximately match INES in terms of its radiation level definitions.
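As a small worked illustration, the sketch below evaluates the NAMS magnitude under the assumed definition $M = \log_{10}(20R)$, with $R$ in terabecquerels, which is consistent with the constant 20 mentioned above. The function name and the release values are hypothetical and chosen only to show the logarithmic scaling.

```python
import math

def nams_magnitude(release_tbq):
    """NAMS magnitude M = log10(20 * R), with R in terabecquerels."""
    if release_tbq <= 0:
        raise ValueError("NAMS is only defined for a positive release")
    return math.log10(20.0 * release_tbq)

# Hypothetical releases in TBq, chosen only to show the scaling.
for release in (0.05, 5.0, 500.0, 50_000.0):
    print(f"R = {release:>9} TBq -> M = {nams_magnitude(release):.1f}")
```

Each tenfold increase in the release raises the magnitude by exactly one unit, which is what makes the scale continuous and unbounded, unlike the capped INES levels.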
Finally, the main measure used here is the USD consequences/costs due to an event. This cost measure is intended to encompass total economic losses, such as destruction of property, emergency response, environmental remediation, evacuation, lost product, fines, court and insurance claims, etc. In the case where there was a loss of life, we added a lost “value of statistical life” of 6 MM USD per death. The 6 MM USD figure is chosen as a lower bound of the value of statistical life reported by various U.S. agencies (e.g., the Environmental Protection Agency, Food and Drug Administration, Transportation Department, etc.). Practically speaking, given that the costs are taken from different sources, it is unlikely that the data truly reflect all relevant costs. While imperfect and controversial, the cost measure has the advantage of leading to a single USD metric associated with each event that combines all possible negative effects of the accidents. The costs were standardized by using historical currency exchange rates, and adjusting for inflation to 2013 USD. Adjusting for differing price levels (e.g., because an equivalent event in Switzerland will cost more than one in Ukraine) was not done because the majority of events belong to countries with a similar price level (the United States, the United Kingdom, Japan, and Western Europe), and because the sample is so heavy tailed that adjusting cost within an order of magnitude has little impact on the statistics.
The result of this effort is a unique data set containing 216 events. Of these events, 175 have cost values, 104 have INES values, and 33 have NAMS values. The data sets of Sovacool from the energy studies literature (one of which has been studied previously [5, 9]) provided a starting point of around 100 events with cost values. For our data set, Table I lists the 15 most costly events. The data and severity measures are discussed in the following subsection. The data set has been published online, where the public is encouraged to review and recommend additions and modifications, with the intention of continually expanding and improving the quality of the data. We believe that this is very important for the proper understanding of risk in nuclear energy. The cost and INES scores are plotted over time in Fig. 1. The frequency with which events exceed the threshold of 20 MM USD, and the distribution of these excesses, will be studied in Sections 'EVENT FREQUENCY' and 'EVENT SEVERITY', respectively. Further, how these quantities have changed in response to the major accidents of Three Mile Island, 1979, and Chernobyl, 1986, will be studied. There are likely to be changes following the major accident at Fukushima in 2011. However, there has been little time to observe improvements, and the cost data are most incomplete in this area: the costs of 18 of the 29 post-Fukushima events contained in the data set are, as of yet, unknown. Thus, the industry response to Fukushima cannot be quantified in this study.
Table I. The 15 Largest Cost Events Since 1960, with Date, Location, Cost in MM 2013 USD, INES Value, and NAMS Value
The full data set is provided online. Unknown values are indicated with a dash.
[Table body not recovered. Surviving location entries: TMI, Pennsylvania, USA; Athens, Alabama, USA; Jaslovske Bohunice, Czechoslovakia; Plymouth, Massachusetts, USA.]
This data set dwarfs existing data sets from the academic literature, and is mature enough to justify an analysis. However, it is still important to consider the quality of the data, and what methodology is best to handle the limitations. Regarding data completeness, in a rare statistical statement, the IAEA stated: “During the period July 1995–June 1996, the Agency received and disseminated information relating to 73 events—64 at NPPs [nuclear power plants] and nine at other nuclear facilities. Of those 73 events, 32 were stated to be ‘below scale’ (i.e., safety-relevant but of no safety significance) and three to be ‘out of scale’ (i.e., of no safety relevance). Of the remaining 38 events, three were rated at INES level 3 and eight at level 2 (i.e., as ‘incidents’), and 27 at level 1 (i.e., as ‘anomalies’).” On the other hand, in the data set of this study, only six events fall within this period, none of whose INES values are known, rather than the 38. This statistic tells two important things. First, for increasingly small event sizes, our data are increasingly incomplete—the smaller the event, the less likely it is to be discovered, recorded, reported to the regulator, reported in the media, etc. Second, our data will remain incomplete for small events until the IAEA publishes historical INES data. The shortage of events of INES level 1, and even 2, prohibits a “near miss” analysis. That is, given that every event develops from having INES score 0, then to 1, and so on, it would be interesting to determine the probability of developing to the next level.
However, the statistics of the largest costs are much more interesting. For instance, the most costly event (Chernobyl) has a cost roughly equal to the sum of all other events together, the two most costly events (Chernobyl and Fukushima) have a cost roughly five times the sum of all other events together, and the sum of the 53 costs in excess of 100 MM USD is 99.4% of the total cost of all 175 cost values. This clearly implies that, if one wants to study the aggregate risk in terms of total cost, then one simply needs data for the largest events. Thus a typical approach taken in such cases is to study events that are in excess of a threshold, above which the data are thought to be reliable and well modeled. As in a former study by Hofert, we use a threshold of 20 MM USD, although results will be similar if a threshold of, e.g., 100 MM USD were used. For cost, there are 101 values above the threshold of 20 MM USD. Further, of the 41 events with unknown cost, 31 have known INES scores, of which half are level 2 or higher.
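The threshold argument can be sketched in a few lines: count the events above a threshold and compute their share of the total cost. The cost values below are hypothetical and serve only to mimic a heavy-tailed sample; they are not the data of this study.

```python
def tail_summary(costs, threshold):
    """Count and total-cost share of events strictly above a threshold."""
    excess = [c for c in costs if c > threshold]
    total = sum(costs)
    return len(excess), sum(excess) / total

# Hypothetical costs in MM USD (illustrative, not the paper's data).
costs = [2, 5, 15, 30, 80, 150, 900, 4_000, 160_000, 260_000]

for threshold in (20, 100):
    n, share = tail_summary(costs, threshold)
    print(f"threshold {threshold:>3} MM: {n} events, {share:.1%} of total cost")
```

In such a sample, raising the threshold discards several events while barely changing the cost share, which is why the choice between 20 MM and 100 MM matters little for aggregate risk.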
2.1. Comparing Severity Measures and Critiquing INES
There are many ways to quantify the size of a nuclear accident. Following Chernobyl, several authors proposed to use a monetary value of severity to make events comparable, and use a rate measure, normalized by the number of reactor operating years, to consider frequency.[34-36] This is what we have done. Since the IAEA uses INES, it is instructive to compare the two approaches. First, the INES is a discrete scale between 1 (anomaly) and 7 (major accident). Similarly to the Mercalli intensity scale for earthquakes (which has 12 levels, from I [not felt] to XII [total destruction]), each level in the INES is intended to roughly correspond to an order of magnitude in severity (in the amount of radiation released).
The INES has been criticized. Common criticisms include that the evaluation of INES values is not objective and may be misused as a public relations (propaganda) tool; moreover, the historical scores are not published, not all events have INES values assigned, no estimate of risk frequency is provided, and so on. Given confusion over the INES scoring of the Fukushima disaster, nuclear experts have stated that the “INES emergency scale is very likely to be revisited.” The Nuclear Accident Magnitude Scale (NAMS), a logarithmic measure of the radiation release, was proposed as an objective and continuous alternative to INES. This proposition, to go from the INES to the NAMS, is reminiscent of when the geophysics discipline replaced the discrete Mercalli intensity scale by the continuous Richter scale with no upper limit, which is also based on the logarithm of energy radiated by earthquakes. In the earthquake case, the Mercalli scale was invented more than 100 years ago as an attempt to quantify earthquake sizes in the absence of reliable seismometers. As technology evolved, the cumbersome and subjective Mercalli scale was progressively replaced by the physically based Richter scale. In contrast, the INES looks somewhat backward from a technical and instrumental point of view: it was created in 1990 by the International Atomic Energy Agency, in an effort to facilitate consistent communication on the safety significance of nuclear and radiological events, even though more quantitative measures are available.
Here, we perform a statistical back-test of the accuracy of INES values, in relation to costs and NAMS. Indeed, INES is not defined in terms of cost; however, if INES fails to capture the information that costs do, then the cost measure is important. In Fig. 1, we plot both the logarithm of cost, and NAMS versus INES. There is an approximate linear relationship between INES and log cost (intercept parameter at INES = 0 is 0.64 [0.3] and slope 0.43 [0.08] by linear regression). This is consistent with the concept that each INES increment should correspond to an order of magnitude in severity. However, cost grows approximately exponentially ($e^{\mathrm{INES}}$, since $10^{0.43} \approx 2.7$) rather than in multiples of 10 with each INES level. Further, the upper category (7) clearly contains events too large to be consistent with the linear relationship. For instance, the largest events (Fukushima and Chernobyl) would need to have an INES score of 10.6 to coincide with the fitted line. In addition, the cost of INES level 3 events does not appear to be statistically different from the cost of INES level 4 events. Finally, there is considerable uncertainty in the INES scores, as shown by the overlapping costs. There is also an approximate linear relationship between INES and NAMS (at INES = 3 the intercept is 1.8 [0.9] and slope 1.7 [0.2] by linear regression). One sees from the points, and from the fact that the slope of the line is greater than 1, that large radiation release events have been given an INES level that is too small. Furthermore, some INES level 3 events should be INES level 2. This illustrates the presence of significant inconsistency of INES scores, in terms of radiation release level definitions.
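The arithmetic behind the fitted line can be checked with a short sketch. It uses only the intercept (0.64) and slope (0.43) quoted above, and assumes that the logarithm is base 10 and that cost is expressed in MM 2013 USD; both assumptions, and the function names, are illustrative rather than confirmed by the text.

```python
import math

# Fitted line quoted in the text: log10(cost) = 0.64 + 0.43 * INES.
# Assumption: base-10 logarithm, cost in MM 2013 USD.
INTERCEPT = 0.64
SLOPE = 0.43

def fitted_cost(ines_level):
    """Cost (MM USD) on the fitted line at a given INES level."""
    return 10.0 ** (INTERCEPT + SLOPE * ines_level)

def implied_ines(cost_mm_usd):
    """INES level at which the fitted line reaches a given cost."""
    return (math.log10(cost_mm_usd) - INTERCEPT) / SLOPE

factor_per_level = 10.0 ** SLOPE  # multiplicative cost step per INES level
print(f"cost factor per INES level: {factor_per_level:.2f} (e = {math.e:.2f})")
print(f"fitted cost at INES 7:    {fitted_cost(7):,.0f} MM USD")
print(f"fitted cost at INES 10.6: {fitted_cost(10.6):,.0f} MM USD")
```

Under these assumptions the per-level factor is close to $e$ rather than 10, and the gap between the fitted cost at level 7 and at level 10.6 shows how far the largest events sit above the top of the scale.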
2.2. The Current Fleet of Reactors
One must judge the number of accidents relative to the so-called volume of exposure; in this case, the number of reactors in operation. These data were taken from the industry data set and are plotted in Panel I of Fig. 2. The number of reactors in operation grew sharply until 1990, after which it stabilized. The stable level has been supported by growth in Asia, compensating for a decline in Western Europe. A steep drop is observed in the Asian volume where, following Fukushima in 2011, all of Japan's reactors were shut down until further notice. On the topic of reactors, it is important to note that reactors are somewhat informally classified into generations: Generation I reactors were early prototypes from the 1940s to 1960s. Generation II reactors were developed and built from the 1960s to the 1990s, of which boiling water reactors (BWR) and pressurized water reactors (PWR) are common. Generation III reactors have been developed since the 1990s. These reactors, such as the advanced BWR and PWR reactors, were improvements upon their Generation II ancestors, replacing safety features that required power with passive ones, and having more efficient fuel consumption. Generation IV reactors concern new technologies, and are still being researched/developed. They are intended to further improve safety and efficiency, which were deemed still inadequate in the Generation III reactors. The vast majority of existing reactors are of Generation II, while most Generation I reactors have been decommissioned and few Generation III reactors have been constructed. Generation IV reactors are not expected to be deployed commercially until at least 2030 or even 2040.
3. UNCERTAINTY QUANTIFICATION OF RISKS IN NUCLEAR ENERGY SYSTEMS
Prior to moving ahead with data analysis and interpretation, reflection on the degree of uncertainty present, and how it is handled in the analysis, is warranted. Following Aven, two important questions are (i) whether (frequentist or Bayesian) probabilities are attainable, and (ii) whether the proposed probabilistic model is accurate/valid. Given a lack of relevant data (e.g., when talking about the future), or difficulty with justifying models, the answer to these questions may be no. If the answer to both questions is no, then one can be said to be in a state of deep uncertainty. Such considerations are relevant to the study of risk in nuclear energy systems, and are discussed below.
Fortunately for this study, the context is clear, as historical risks, and the current risk level, are being analyzed. Future risk is only being discussed insofar as the current state remains. Thus, the analysis does not need to deal with uncertain futures. Also, by studying risk at a global level, one avoids epistemic uncertainties associated with the specificities of a given plant, type of accident, or technology. Furthermore, relevant data are available and a simple and somewhat justifiable model is used. Thus, here a probabilistic approach is valid, where uncertainties include epistemic model uncertainty and aleatoric statistical uncertainty in the parameter estimates, as well as data uncertainty. The epistemic and aleatoric uncertainties are dealt with by imposing a relatively broad range of parameter estimates (for frequency, severity distribution, and maximum possible severity). Regarding data uncertainty, the data studied here are at a much higher quality level than those of previous studies on nuclear risks. Although the authors are committed to ongoing expansion and refinement of the data, this is a sensible point to provide an analysis. That is, it is unlikely that reasonable modifications of the cost estimates will substantially impact the high-level results provided.
Going beyond this analysis, one can look deeper into nuclear risk. For instance, as regulation requires, PSA is done for each individual plant.[2, 3] This necessitates that the probabilities of initiating events be specified and that the interaction of events be encoded in a fault tree. Further, if one wants to perform level 3 PSA, then the consequences of each event need to be specified. For this task, at a unique power plant, there is little data and thus the huge number of parameters must be specified based on belief/assumption rather than frequentist estimates. Thus, in addition to the aleatoric sampling uncertainty that is captured by simulating from the model, one should also consider the epistemic uncertainty in the model specification and its parameters.[17, 41] Such an approach has been suggested and a research project considering such methodology in studying the risk of a large loss of coolant is underway. However, standard PSA practice and regulations are not yet at this level of uncertainty quantification. Furthermore, needing to encode all possible events in the fault tree implies that the worst case is limited to the one that the modeler can imagine. Finally, the epistemic uncertainty present in the specification of the (typically deterministic) fault tree, which will practically always be incomplete, is not considered at all. These data and epistemological limitations imply that PSA exhibits deep uncertainty. These issues largely inhibit the ability of standard PSA to provide an adequate quantification of overall risk.
Nonetheless, PSA is an important and useful exercise. That is, it is a top-down technique that allows for the generation and prioritization of high-risk events that may not have been observed, and for common causes to be identified. It is thus instrumental in safety improvement, and risk-informed decision making. The statistical approach is a bottom-up technique, whose specificity (e.g., the risk of a specific reactor technology) is restricted by limited historical data, and whose instances from which one can learn are limited to those that have been observed. It is natural to combine top-down and bottom-up approaches, at least by comparing their results. It is clear that this should be done in nuclear energy systems as well.
Taking a broader perspective, the future risk of nuclear energy should be considered to support decision making, both within the nuclear energy industry and within the portfolio of energy source alternatives. The future risk of nuclear energy is deeply uncertain: it depends heavily on developments in reactor and disposal technology, plant build-out and decommissioning scenarios, the emerging risk of cyber-threats and attacks, etc. Furthermore, in making decisions about the holistic plans for future energy systems, the uncertainties of other energy sources also become relevant across multiple criteria. Many risks are reasonably well understood, such as reduced life expectancy, but the evaluation of terrorist threats, and the potentially severe environmental impacts of carbon emission, are deeply uncertain.[44, 45] Thus, energy system decisions should be supported by robust multiple criteria decision-making tools with adequate consideration of uncertainty. A probabilistic attempt could be scenario analysis with probability distributions being assigned to the scenarios. Alternatively, nonprobabilistic methods may be warranted, as has been done in large-scale engineering systems with multiple diverse stakeholders.
4. EVENT FREQUENCY
Regarding the frequency of events, we observe $N_t$ events each year among the $v_t$ nuclear reactors in operation in year $t$. The annual observed frequencies of accidents per operating reactor are $\hat{\lambda}_t = N_t / v_t$. The observed frequencies are plotted in Panel II of Fig. 2. The rate of events has decreased, and perhaps stabilized, since the end of the 1980s. The running rate estimate,

$$\hat{\lambda}_{t_0, t_1} = \frac{\sum_{t=t_0}^{t_1} N_t}{\sum_{t=t_0}^{t_1} v_t}, \qquad (1)$$

used by Hofert, is plotted for $t_0 = 1970$ and increasing $t_1$. In the presence of a decreasing rate of events, such a running estimate overestimates the rate of events for recent times. Furthermore, it is a smoothing method where the estimate is taken at the right-most edge of a constantly growing smoothing window, rather than in the center of a window with fixed width. To avoid this bias and to properly evaluate the trend, we consider another approach. We assume that the $N_t$ are independently distributed as $N_t \sim \mathrm{Pois}(\lambda_t v_t)$. The Poisson model features no interaction between events, which is sensible as separate nuclear events should occur independently, and is compatible with how we have defined our events in Section 'DATA AND THE MEASUREMENT OF EVENT SEVERITY'. The changing rate of events is accommodated by a log-linear model for the Poisson rate parameter,

$$\lambda_t = \exp\left(\beta_0 + \beta_1 (t - t_0)\right), \qquad (2)$$

for given $v_t$ and parameters $(\beta_0, \beta_1)$, where $t_0$ is the starting year of the estimation window. This is the so-called generalized linear model (GLM) for Poisson counts and may be estimated by maximum likelihood (for example, with glm in R). The GLM was estimated from 1970 until 1986 and from 1987 until 2014, with estimated parameters in Table II, and plotted in Panel II of Fig. 2. The first estimate suggests a significantly decreasing rate, which is in agreement with the decreasing running rate estimate. The second (approximately flat) GLM indicates that, from 1987 onwards, the rate has not been significantly different from constant. Clearly, the running rate estimate, starting at 1970, is unable to account for this. To further diagnose this difference, in Panel III of Fig. 2, the running estimate (Equation (1)) is computed both forward and backward from the Chernobyl event of 1986. From this it is clear that the rate prior to Chernobyl is larger than the rate after. Thus it is apparent that there was a significant reduction in the frequency of events following Chernobyl, likely due to a comprehensive industry response.
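The two estimators discussed above can be sketched in a few lines. This is a minimal illustration and not the authors' code (the text reports using glm in R): it computes the running rate of Equation (1) and fits a log-linear Poisson rate by Newton's method, assuming the model N_t ~ Pois(v_t exp(b0 + b1 (t - t0))). The function names, the toy event counts, and the toy reactor numbers are hypothetical and serve only to make the example run.

```python
import math

def running_rate(counts, exposure):
    """Cumulative events divided by cumulative reactor-years, year by year."""
    rates, n_cum, v_cum = [], 0.0, 0.0
    for n, v in zip(counts, exposure):
        n_cum += n
        v_cum += v
        rates.append(n_cum / v_cum)
    return rates

def fit_poisson_loglinear(counts, exposure, times, max_iter=100, tol=1e-12):
    """MLE of (b0, b1) in N_t ~ Pois(v_t * exp(b0 + b1 * (t - t0))).

    Uses Newton's method on the Poisson log-likelihood and returns the
    estimates together with their asymptotic standard errors.
    """
    s = [t - times[0] for t in times]          # years since start of window
    b0, b1 = math.log(sum(counts) / sum(exposure)), 0.0

    def information(b0, b1):
        mu = [v * math.exp(b0 + b1 * si) for v, si in zip(exposure, s)]
        h00 = sum(mu)
        h01 = sum(si * m for si, m in zip(s, mu))
        h11 = sum(si * si * m for si, m in zip(s, mu))
        return mu, h00, h01, h11

    for _ in range(max_iter):
        mu, h00, h01, h11 = information(b0, b1)
        g0 = sum(n - m for n, m in zip(counts, mu))                 # score for b0
        g1 = sum(si * (n - m) for si, n, m in zip(s, counts, mu))   # score for b1
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break

    _, h00, h01, h11 = information(b0, b1)
    det = h00 * h11 - h01 * h01
    return (b0, b1), (math.sqrt(h11 / det), math.sqrt(h00 / det))

# Hypothetical toy data (NOT the paper's data set).
years = list(range(1970, 1987))
reactors = [90 + 18 * i for i in range(len(years))]
events = [3, 2, 4, 2, 3, 3, 2, 4, 2, 3, 2, 3, 2, 2, 3, 2, 2]

(b0, b1), (se0, se1) = fit_poisson_loglinear(events, reactors, years)
print(f"intercept {b0:.2f} (SE {se0:.2f}), slope {b1:.3f} (SE {se1:.3f})")
print("running rate at end of window:", round(running_rate(events, reactors)[-1], 4))
```

With a decreasing rate, the running estimate at the end of the window sits above the fitted rate at that time, which is the bias described in the text.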
Table II. Parameter Estimate, Standard Error, and p-Value for GLM Estimates of Rate (Equation (2))
The two rows are two estimates. The first starts at 1970 and ends at 1986. The second starts at 1987 and ends at 2014. The intercept parameter is given for the starting time of each period.
Period | Intercept (SE) | p-value | Slope (SE) | p-value
1970–1986 | −3.87 (0.3) | 10^−16 | −0.06 (0.03) | 0.05
1987–2014 | −5.53 (0.3) | 10^−16 | −0.015 (0.02) | 0.45
It is interesting to note that such a change is not apparent following TMI. There are insufficient data to identify a change following Fukushima. That the data are incomplete implies that our rate estimates are underestimates. For instance, even within our data set, there are 40 events occurring after Chernobyl whose cost is unknown. Of these 40 events, 32 have INES values, and 17 of these have INES values of 2 or larger. Based on the known INES and cost values, the median cost of events with INES = 2 is 26 MM USD, i.e., more than half of INES = 2 events exceed the 20 MM USD threshold. Thus, based on these statistics, assuming that only nine of the 40 unknown events have cost in excess of 20 MM USD is conservative. With such an assumption, the rate estimate increases accordingly. Taking into account the above, and that the rate of events may actually still be decreasing, we consider a conservative range of current estimates between 0.0025 and 0.0035. This suggests an average of 1 to 1.5 events per year across the current fleet of reactors. Thus, despite having a data set twice as big as in Hofert, we find a similar rate estimate. Further, provided that the fleet does not undergo any major changes, we expect the rate to remain relatively stable.
As in Hofert, a significant difference between the frequency of events across regions is found. In Table III, one sees that the running estimate of the annual rate varies by as much as a factor of 3 across the regions. This is more likely due to a difference in reporting than to a difference in the true rate of events. This provides further evidence that our rate estimates are underestimates.
Table III. Statistics by Region: Number of Events (N) and Number of Reactor Years (v) from 1980 through 2014, the Rate of Events per Reactor Year, and the Poisson Standard Error of the Rate
Russia is included in Eastern Europe.
5. EVENT SEVERITY
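For the quantitative study of event severity, costs (measured in MM 2013 USD) are considered to be i.i.d. random variables with an unknown distribution function F. Here, we estimate the cost distribution. A common heavy-tailed model for such applications is the Pareto CDF,

$$F_P(x \mid \alpha, u_1) = 1 - \left(\frac{x}{u_1}\right)^{-\alpha}, \quad x \ge u_1 > 0, \ \alpha > 0, \qquad (3)$$

which may be restricted to a truncated support as,

$$F_{TP}(x \mid \alpha, u_1, u_2) = \frac{1 - (x/u_1)^{-\alpha}}{1 - (u_2/u_1)^{-\alpha}}, \quad u_1 \le x \le u_2, \qquad (4)$$

where $u_1$ and $u_2$ are lower and upper truncation points that define the smallest and largest observations allowed under the model. Extending further, truncated distributions may be joined together to model different layers of magnitude,

$$F(x) = \sum_{i=1}^{k} w_i \, F_{TP}(x \mid \alpha_i, u_i, u_{i+1}), \quad \sum_{i=1}^{k} w_i = 1, \qquad (5)$$

where layer $i$ covers $[u_i, u_{i+1}]$ with its own exponent $\alpha_i$ and probability mass $w_i$, and each $F_{TP}$ is taken to be 0 below its layer and 1 above it.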
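As an illustration of the Pareto and truncated Pareto models, the sketch below codes the two CDFs and draws samples by inverse transform. It assumes the standard forms F(x) = 1 - (x/u1)^(-alpha) for x >= u1 and its renormalized version on [u1, u2]. The function names are hypothetical, and alpha = 0.55 is only an assumed value inside the 0.5–0.6 range quoted in the text.

```python
import random

def pareto_cdf(x, alpha, u1):
    """Pareto CDF with lower threshold u1 (zero below u1)."""
    return 0.0 if x < u1 else 1.0 - (x / u1) ** (-alpha)

def truncated_pareto_cdf(x, alpha, u1, u2):
    """Pareto CDF renormalized to the support [u1, u2]."""
    if x < u1:
        return 0.0
    if x >= u2:
        return 1.0
    return pareto_cdf(x, alpha, u1) / pareto_cdf(u2, alpha, u1)

def truncated_pareto_quantile(p, alpha, u1, u2):
    """Inverse of truncated_pareto_cdf for 0 <= p < 1."""
    mass = pareto_cdf(u2, alpha, u1)           # probability kept by truncation
    return u1 * (1.0 - p * mass) ** (-1.0 / alpha)

def sample_truncated_pareto(n, alpha, u1, u2, seed=0):
    """Inverse-transform sampling with a seeded generator."""
    rng = random.Random(seed)
    return [truncated_pareto_quantile(rng.random(), alpha, u1, u2) for _ in range(n)]

# Assumed illustration: costs in MM USD, lower threshold 20, upper cutoff at
# the Fukushima cost estimate quoted in the text (166,089 MM USD).
alpha, u1, u2 = 0.55, 20.0, 166089.0
print("median cost under the truncated model:",
      round(truncated_pareto_quantile(0.5, alpha, u1, u2), 1))
print("five simulated costs:",
      [round(x, 1) for x in sample_truncated_pareto(5, alpha, u1, u2)])
```

A layered model in the spirit of Equation (5) could be sampled in the same way by first picking a layer according to its weight and then drawing from that layer's truncated Pareto.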
In Fig. 3, the severity measures are plotted according to their empirical complementary cumulative distribution functions (CCDFs). In Panel I, the sample of costs in excess of 20 MM USD is split into pre- and post-TMI periods, with 42 and 62 events, respectively. The distributions are clearly different. Indeed, the KS test, with the null hypothesis that the data for both subsets come from the same model, gives a p-value of 0.015. As can be seen from the lower inset of the first panel of Fig. 3, this p-value is much smaller than the p-values obtained for testing other change-times. For instance, there was no apparent change between the pre- and post-Chernobyl periods. The pre-1979 data, having a median cost of 283 MM USD, have a higher central tendency than the post-1979 data, having a median cost of 77 MM USD. However, the post-1979 distribution has a heavier tail, whereas the pre-1979 distribution decays exponentially. It is a rather well-known observation that improved safety and control in complex engineering systems tend to suppress small events, but often at the expense of more and/or larger occasional extreme events.[49-51] This raises the question of whether the observed change of regime belongs to this class, as a result of the improved technology and risk management introduced after TMI.
Thus, we focus on estimating the left-truncated Pareto (Equation (3)) for the post-1979 data. Over a range of lower thresholds, the estimate of α fluctuates in the range of 0.5–0.6, indicating that the data are consistent with the model. For smaller thresholds, the estimate of α is smaller, as is typical for data sets where small events are underreported. In the former study of Sornette, the estimated value was larger, while Hofert also found values between 0.6 and 0.8. With our more complete data set, the smaller value of α is qualitatively consistent with previous studies, but further emphasizes the extremely heavy tailed model (α < 1), where the mean value is mathematically infinite. In practice, this simply means that the largest event in a given catalog accounts for a major fraction of the total dollar cost of the whole. That is, the extremes dominate the total cost.
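The threshold scan described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it uses the closed-form maximum likelihood (Hill) estimator for a Pareto tail, alpha_hat = n / sum(log(x_i / u1)), with asymptotic standard error alpha_hat / sqrt(n). The function name and the simulated costs are hypothetical.

```python
import math
import random

def pareto_alpha_mle(costs, u1):
    """Hill/ML estimate of the Pareto exponent for observations >= u1.

    Returns (alpha_hat, approximate standard error, tail sample size).
    """
    tail = [x for x in costs if x >= u1]
    n = len(tail)
    alpha_hat = n / sum(math.log(x / u1) for x in tail)
    return alpha_hat, alpha_hat / math.sqrt(n), n

# Hypothetical simulated costs (MM USD), NOT the paper's data:
# Pareto with assumed alpha = 0.55 above a 20 MM USD threshold.
rng = random.Random(42)
costs = [20.0 * (1.0 - rng.random()) ** (-1.0 / 0.55) for _ in range(100)]

# Re-estimate alpha above a growing lower threshold, as in a stability scan.
for u1 in (20.0, 30.0, 50.0, 100.0):
    a, se, n = pareto_alpha_mle(costs, u1)
    print(f"u1 = {u1:6.1f}  n = {n:3d}  alpha_hat = {a:.2f} (SE {se:.2f})")
```

For exact Pareto data the estimate should fluctuate around the same value at every threshold, with growing standard error as the tail sample shrinks; a systematic drift at low thresholds is the kind of signature the text attributes to underreported small events.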
6. RUNAWAY DISASTERS AS “DRAGON-KING” OUTLIERS
In a complex system with safety features/barriers, once an event surpasses a threshold, it can become uncontrollable and develop into a “runaway disaster,” causing disproportionately more damage than other events. This is the type of phenomenon that is considered here. In Panel I of Fig. 3 one can see that (at least) the two most costly events since TMI (Chernobyl and Fukushima) lie above the estimated Pareto CCDF, and that Chernobyl, TMI, and Fukushima form a cluster of outliers in the sample of NAMS data. The NAMS data cannot be split into pre- and post-TMI samples due to insufficient data. The NAMS distribution is well described by the exponential distribution for values between 2 and 5, as reported by Smythe. The exponential rate was estimated for NAMS values above a lower threshold, with the three suspected outliers censored to avoid biasing the estimate. Since NAMS is a logarithmic measure of radiation, this corresponds to a Pareto distribution for the radiation released, which is valid over three decades.
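The correspondence between an exponential law for a logarithmic magnitude and a Pareto law for the underlying quantity can be made explicit with a short derivation. The notation here is an assumption for illustration: $M = \log_{10}(cR)$ is the magnitude for a release $R$ and some constant $c$, $\theta$ is the exponential rate above a threshold $m_0$, and $r_0 = 10^{m_0}/c$ is the corresponding release.

$$\Pr[M > m] = e^{-\theta (m - m_0)}, \quad m \ge m_0$$

$$\Pr[R > r] = \Pr\left[M > \log_{10}(c\,r)\right] = e^{-\theta \log_{10}(r / r_0)} = \left(\frac{r}{r_0}\right)^{-\theta / \ln 10}, \quad r \ge r_0$$

Under these assumptions the release follows a Pareto law with exponent $\alpha = \theta / \ln 10 \approx 0.434\,\theta$, and a magnitude range from 2 to 5 corresponds to releases spanning a factor of $10^{3}$, i.e., three decades.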
We relate this concept of runaway disasters to the concept of DK. The term DK has been introduced to refer to such situations where extreme events appear that do not belong to the same distribution as their smaller siblings.[49, 50] DK is a double metaphor for an event that is both extremely large in size or impact (a “king”) and born of unique origins (a “dragon”) relative to other events from the same system. For instance, a DK can be generated by a transient positive feedback mechanism. The existence of unique dynamics of DK events gives hope that they may be “to some extent” predictable. In this sense, they are fundamentally different from the a priori unpredictable “black swans.”[54, 55] Statistically speaking, given that extreme events tend to follow heavy-tailed distributions, DK can be specifically characterized as outlying large extremes within the population of extreme events. That is, DK live beyond power law tails, whereas black swans are often thought of as being generated by (unanticipated) heavy power law tails.
Let us now make the above observations more rigorous by testing the apparent DK points as statistical outliers. There are many tests available to determine if large observations are significantly outlying relative to the exponential (or Pareto) distribution.[56-59] A suitable approach to assess the NAMS outliers is to estimate a mixture of an exponential and a Gaussian density,

$$f(x) = (1 - \pi)\, \theta e^{-\theta (x - u)} + \pi\, \phi(x \mid \mu, \sigma^2), \quad x \ge u, \qquad (6)$$

where the Gaussian density $\phi$ provides the outlier regime, and $\pi \in [0, 1]$ is a weight. The test is done for the 14 points in excess of $u = 3.5$, where the exponential tail is valid. The maximum likelihood estimation of this model (Equation (6)) is done using an expectation maximization algorithm.[60, 61] We also consider a null model with no DK regime ($\pi = 0$). The alternative model has a significantly superior log-likelihood (the p-value of the likelihood ratio test is 0.04). Thus there is a statistically significant DK regime relative to the exponential.
That the amount of cost is related to the amount of radiation released suggests testing for a DK regime in cost. Not every runaway radiation release disaster produces commensurate financial damage (see Three Mile Island in Table I). But, given that the majority of nuclear power installations have surrounding population densities higher than Fukushima, the DK regime in radiation should amplify cost tail risks. From Panel II of Fig. 3 as many as the three largest points could be outlying. For this we consider the sum-robust-sum (SRS) test statistic,

$$T_r = \frac{\sum_{i=1}^{r} x_{(i)}}{\sum_{i=r+1}^{n} x_{(i)}}, \qquad (7)$$

for the ordered sample $x_{(1)} > x_{(2)} > \dots > x_{(n)}$, which compares the sum of the $r$ outliers to the sum of the nonoutliers. This test was performed for $r = 2$ and $r = 3$ outliers for a range of upper samples, i.e., the sample in excess of a growing lower threshold. For $r = 2$, the p-value fluctuates between 0.05 and 0.1, for samples ranging from the 10 to the 40 largest points. For $r = 3$, the p-value fluctuates between 0.1 and 0.2. Thus, there is evidence that the two largest events are indeed outliers, both in terms of radiation and cost.
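A sum-robust-sum style test can be sketched as follows: the statistic is the sum of the r largest observations divided by the sum of the rest, and a p-value is obtained by comparing it with the same statistic on samples simulated under a null model. This is a simplified illustration, not the authors' exact procedure: it assumes a pure Pareto null with a known exponent alpha and uses Monte Carlo, whereas the published test may calibrate the null differently. All names and the toy sample are hypothetical.

```python
import random

def srs_statistic(sample, r):
    """Sum of the r largest values divided by the sum of the remaining values."""
    ordered = sorted(sample, reverse=True)
    return sum(ordered[:r]) / sum(ordered[r:])

def srs_pvalue(sample, r, alpha, n_sim=2000, seed=0):
    """Monte Carlo p-value of the SRS statistic under a Pareto(alpha) null.

    The statistic is scale invariant, so the null samples use threshold 1.
    """
    rng = random.Random(seed)
    n = len(sample)
    observed = srs_statistic(sample, r)
    exceed = 0
    for _ in range(n_sim):
        sim = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
        if srs_statistic(sim, r) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)   # add-one correction keeps p > 0

# Hypothetical toy sample (NOT the paper's data); alpha is an assumed value.
toy = [900, 700, 60, 45, 30, 25, 18, 12, 9, 7, 5, 4, 3, 2.5, 2]
for r in (2, 3):
    print(f"r = {r}: T = {srs_statistic(toy, r):.2f}, "
          f"p = {srs_pvalue(toy, r, alpha=0.55):.3f}")
```

Because a Pareto null with alpha below 1 already produces very dominant maxima, p-values from such a test tend to be moderate even for visually striking outliers, which is consistent with the borderline values reported in the text.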
Given the suggestive evidence that the extreme tail of cost is heavier than the rest of the tail, perhaps due to a runaway disaster effect, it is important to include this in the risk modeling. For simplicity, we continue with the Pareto model, estimated by MLE on the top 5 points. To pursue a pleasant but nonrigorous argument, the resulting estimate appears to be consistent with the runaway effect that propels the NAMS values of the outliers upward. That is, transforming back from log scale, this same effect on the Pareto model would reduce the parameter α.
7. MODELING AGGREGATE ANNUAL DAMAGE
We now characterize the annual risk of nuclear events measured by cost, with three quantities: quantiles, return periods, and expected values. These characterizations are relevant for the current state of the operating nuclear fleet—excluding any potential improvements following Fukushima—and do not consider scenarios for the transition to more advanced reactor technologies, such as Generation III and beyond.
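The annual cost $Y_t$ is modeled as a compound Poisson process (CPP),

$$Y_t = \sum_{i=1}^{N_t} X_{i,t}, \qquad (8)$$

where, for each year $t$, there are a random number of events $N_t$, modeled by a Poisson process with annual rate $\lambda v_t$, and each event has a random size (in MM USD) $X_{i,t} \sim F$, where the condition $X_{i,t} > 20$ ensures that we consider only events with costs larger than 20 MM USD.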
There is a range of statistically valid parameter estimates that should be considered. From Section 'EVENT FREQUENCY', rate estimates λ ranging between 0.0025 and 0.0035 were suggested as conservative underestimates. From Section 'EVENT SEVERITY', the distribution of cost in excess of 20 MM USD was found to be well described by a Pareto distribution, with parameter α between 0.5 and 0.6. Further, in Section 'RUNAWAY DISASTERS AS “DRAGON-KING” OUTLIERS' it was found that the largest cost values are significantly larger than what would be expected under the Pareto model. An attempt to account for this was made by including a heavier tail (a DK regime), with a smaller α, for the top 10% of the mass (with lower threshold 1,100 MM USD).
First we characterize the risk level with return periods, defined within the CPP model by considering,

$$P_{\tau}(x) = 1 - \exp\left(-\lambda\, v\, \tau\, [1 - F(x)]\right), \qquad (9)$$

which is the probability of observing at least one event, at least as large as some size $x$ (e.g., given by an order statistic $x_{(j)}$), in a given time period τ. One sets Equation (9) to a given probability $p$ and solves for the return period τ of the jth largest event. Setting $p = 1 - e^{-1}$, one obtains the standard return period $\tau = 1 / (\lambda v [1 - F(x)])$. In Table IV, median and quartile estimates of return periods are given for combinations of the above range of parameter values. For each given set of parameter values, 100,000 samples of the data were simulated, parameters reestimated on these samples, and the return period computed. The median and quartiles were taken over these 100,000 return period estimates. In the lowest risk model (lowest rate λ and largest α), the median return periods for TMI and Fukushima are 17 and 154 years, respectively. In the most conservative case (highest rate λ, with the DK regime in the extreme tail), these values are 12 and 62. It is clear that including the DK effect does not inordinately amplify the risk. For instance, the return periods with high frequency risk and low severity risk are similar to those with low frequency risk and high severity risk (the DK regime in the extreme tail).
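The return period calculation can be illustrated with a short sketch. It assumes the usual Poisson exceedance form, in which the probability of at least one event of size x or larger within tau years is 1 - exp(-lam * v * tau * (1 - F(x))), and solves it for tau. The function names, the fleet size v = 400, and the plain Pareto survival function are assumptions for illustration; the values in Table IV come from a simulation and re-estimation procedure, so this direct formula is not expected to reproduce them.

```python
import math

def pareto_sf(x, alpha, u1):
    """Pareto survival function 1 - F(x) above the lower threshold u1."""
    return 1.0 if x < u1 else (x / u1) ** (-alpha)

def exceedance_prob(tau, lam, v, sf_x):
    """P(at least one event >= x in tau years) for a Poisson event stream."""
    return 1.0 - math.exp(-lam * v * tau * sf_x)

def return_period(p, lam, v, sf_x):
    """Time tau such that exceedance_prob(tau, ...) equals p."""
    return -math.log(1.0 - p) / (lam * v * sf_x)

# Assumed inputs: fleet of 400 reactors (hypothetical), threshold 20 MM USD,
# and the Fukushima cost estimate quoted in the text (166,089 MM USD).
v, u1, x = 400, 20.0, 166089.0
for lam in (0.0025, 0.0035):
    for alpha in (0.5, 0.6):
        tau = return_period(0.5, lam, v, pareto_sf(x, alpha, u1))
        print(f"lam = {lam}, alpha = {alpha}: 50% return period = {tau:.0f} years")
```

Setting p = 1 - exp(-1) in return_period gives the reciprocal of the annual exceedance rate, which is the conventional definition of a return period.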
Table IV. Median and Quartile Return Periods (Equation (9)) for Fukushima (x(2)) and TMI (x(4)) for Different Rate Parameters λ and Parameters α for the Pareto Distribution Above Lower Threshold u
The last column corresponds to the “dragon-king” regime. The median and quartiles are computed over 100,000 estimates of the parameter values, computed on data simulated using the parameter values provided in the table.
Next we provide quantiles. For F, we take the estimated Pareto cost distribution (Equation (3)). We also consider a two-layer model (Equation (5)) where the upper layer is for the DK regime. The first layer, from the lower threshold to 1,100 MM USD, is Pareto with α estimated by MLE. The second layer, from 1,100 MM USD onwards, is also Pareto, with a heavier tail. Given λ, $v_t$, and F, we can calculate the “aggregate” distribution G for the annual cost. We do this for the year 2014 with the Panjer algorithm by Monte Carlo. Quantiles of the estimated G are in Table V. The 0.99 quantile is highly sensitive to the choice of λ and distribution F: for a very low rate λ, and without considering the DK effect, the 0.99 quantile is 54,320 (MM USD), which is five times the cost of TMI. For a different choice of λ, we obtain a similar estimate to Hofert, who obtained 81,000 (MM USD). With the DK effect, this quantile is 331,610 (MM USD), which is double the estimated cost of Fukushima.
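To illustrate how quantiles of an aggregate annual cost can be obtained, the sketch below simulates a compound Poisson model by plain Monte Carlo: a Poisson number of events per year, each with a Pareto cost, summed within the year. It is a simplified stand-in, not the authors' computation: it uses a single Pareto layer with no DK regime and no upper cutoff, and the fleet size, rate, exponent, and function names are assumed for illustration.

```python
import math
import random

def poisson_draw(mean, rng):
    """Knuth's multiplication method; adequate for small means (~1 event/year)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def pareto_draw(alpha, u1, rng):
    """Inverse-transform draw from a Pareto with lower threshold u1."""
    return u1 * (1.0 - rng.random()) ** (-1.0 / alpha)

def simulate_annual_costs(lam, v, alpha, u1, n_years, seed=0):
    """Simulate n_years of aggregate annual cost for a fleet of v reactors."""
    rng = random.Random(seed)
    costs = []
    for _ in range(n_years):
        n_events = poisson_draw(lam * v, rng)
        costs.append(sum(pareto_draw(alpha, u1, rng) for _ in range(n_events)))
    return costs

def empirical_quantile(values, q):
    """Simple order-statistic quantile of a simulated sample."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]

# Assumed inputs: rate 0.003 per reactor-year, 400 reactors (hypothetical),
# alpha = 0.55, lower threshold 20 MM USD.
annual = simulate_annual_costs(lam=0.003, v=400, alpha=0.55, u1=20.0, n_years=20000)
for q in (0.95, 0.99):
    print(f"{q:.2f} quantile of annual cost: {empirical_quantile(annual, q):,.0f} MM USD")
```

With an exponent below 1, high quantiles estimated this way are noisy and very sensitive to the assumed rate and tail, which mirrors the sensitivity noted in the text.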
Table V. The Estimated 0.95 and 0.99 Quantiles, as Well as the Probability of the Annual Cost Exceeding the Cost of Fukushima (166,089 MM USD), Are Given for the Aggregate Distribution G
The Pareto model is with , and the Pareto DK model is with . The volume (number of active nuclear reactors) is taken to be . The quantiles are given in MM 2013 USD.
7.2. Expected Annual Damage
So far we have considered models without a limiting cost, in which the mean cost is mathematically infinite, since our various estimates of the Pareto exponent α all have values less than 1. Of course, the Earth itself is finite, thus there is an upper cutoff, u2, to the maximum possible cost. But this upper cutoff could be exceedingly large, and there is, as of yet, no evidence of a maximum being reached (i.e., no accumulation of observations at an upper limit in Fig. 3). Think, for instance, of the real-estate value of New York City, USA or Zurich, Switzerland, both of which are rather close to a nuclear plant, and would become uninhabitable in a worst-case scenario. Here, we would be speaking of up to tens of trillions of dollars of financial losses, not to speak of human ones. Thus, insurance and re-insurance companies introduce a maximum loss for their liabilities, which for them works as if there is a genuine upper cutoff: u2. Everything above such a cutoff is then the responsibility of the government(s) and society; for the truly extreme catastrophes, only the state can be the insurer of last resort.
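It is useful to put hard numbers behind these considerations by using scenarios. For the CPP (Equation (8)), the mean and variance of the annual cost $Y_t$ are:

$$\mathrm{E}[Y_t] = \lambda v_t \, \mathrm{E}[X], \qquad \mathrm{Var}[Y_t] = \lambda v_t \, \mathrm{E}[X^2]. \qquad (10)$$

Given lower and upper truncations, $u_1$ and $u_2$, the first two moments for the Pareto are,

$$\mathrm{E}[X^k] = \frac{\alpha\, u_1^{\alpha}}{1 - (u_1/u_2)^{\alpha}} \cdot \frac{u_2^{\,k-\alpha} - u_1^{\,k-\alpha}}{k - \alpha}, \quad k = 1, 2. \qquad (11)$$

Thus, for α < 1 and large $u_2$, the mean grows in proportion to $u_2^{1-\alpha}$ (and the variance faster, as $u_2^{2-\alpha}$). In Table VI, we compute these moments of the costs X when the maximum value, $u_2$, is equal to the present estimate of the cost of Fukushima, 10 times greater, and 100 times greater. Since the expected annual number of events is approximately 1, these values provide a rough estimate of the mean and standard deviation of annual cost in 2014 (Equation (10)).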
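The dependence of the expected cost on the upper cutoff can be checked numerically. The sketch below uses the closed-form k-th moment of a Pareto law truncated to [u1, u2] and the standard compound Poisson identities (mean = rate times E[X], variance = rate times E[X^2]). The exponent alpha = 0.55 is an assumed value inside the 0.5–0.6 range from the text, the expected number of events per year is set to 1 as the text suggests, and the function names are hypothetical. This single-layer calculation is only indicative and is not meant to reproduce Table VI.

```python
import math

def truncated_pareto_moment(k, alpha, u1, u2):
    """E[X^k] for a Pareto(alpha) law truncated to [u1, u2]."""
    norm = alpha * u1 ** alpha / (1.0 - (u1 / u2) ** alpha)
    if abs(k - alpha) < 1e-12:               # limiting case k == alpha
        return norm * math.log(u2 / u1)
    return norm * (u2 ** (k - alpha) - u1 ** (k - alpha)) / (k - alpha)

def cpp_mean_sd(events_per_year, alpha, u1, u2):
    """Mean and standard deviation of annual cost for a compound Poisson model."""
    m1 = truncated_pareto_moment(1, alpha, u1, u2)
    m2 = truncated_pareto_moment(2, alpha, u1, u2)
    return events_per_year * m1, math.sqrt(events_per_year * m2)

# Costs in MM USD; 166,089 is the Fukushima estimate quoted in the text.
alpha, u1, fukushima = 0.55, 20.0, 166089.0
for factor in (1, 10, 100):
    mean, sd = cpp_mean_sd(1.0, alpha, u1, factor * fukushima)
    print(f"cutoff = {factor:3d} x Fukushima: mean = {mean:,.0f}, sd = {sd:,.0f} MM USD")
```

Each tenfold increase of the cutoff multiplies the mean by roughly 10^(1 - alpha) and the standard deviation by roughly 10^((2 - alpha)/2), which is why the choice of maximum possible cost dominates the result.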
Table VI. The First Moment and the Square Root of the Second Moment of X Are Given by the First and Second Value, Respectively
The Pareto model is with , and three values for the maximum value u2 given by the column labels. The Pareto DK model is with and three values for the maximum value u3. The maximum values are 1, 10, and 100 times the cost of Fukushima, 166,089 MM USD. All units of costs are in MM USD.
If we accept that the Fukushima or Chernobyl events represent roughly the largest possible cost, then (see Table VI) the mean annual cost is approximately 1.5 billion USD, with a standard error of 8 billion USD. This brackets the construction cost of a large nuclear plant, suggesting that about one full equivalent nuclear power plant value could be lost each year, on average. However, the heavy tailed severity implies that in most years there is little cost, and once in a while an extreme event hits, driving the total cost up considerably. If we assume that the largest possible cost is about 10 times the estimated cost of Fukushima, then the average annual cost is about 5.5 billion USD, with a very large dispersion of 55 billion USD. Indeed, the outlook is even more dire for larger possible upper cutoffs. Such numbers do not appear to be taken into account in standard calculations on the economics of nuclear power. To be fair, we should also note that long-term effects, such as lung cancer risks and other deaths induced by particle pollution, are not taken into account in evaluating the costs and benefits of alternative sources of energy such as coal.
8. DISCUSSION AND POLICY CONCLUSIONS
Our study draws important conclusions about the risks of nuclear power. Regarding event frequency, we have found that the rate of incidents and accidents per civil nuclear installation decreased from the 1970s until the present time. Along the way, there was a significant drop in the rate of events after Chernobyl (April 1986). Since then, the rate has been roughly stable, implying a rate between 0.0025 and 0.0035 events per reactor per year in 2015. It is worth noting that the decrease in risk due to the reduced accident frequency per reactor from the 1960s onwards has been somewhat offset by an increasing number of reactors in operation.
Regarding event severity, we found that the distribution of cost underwent a significant regime change shortly after the Three Mile Island major accident. Moderate cost events were suppressed, but extreme ones became more frequent, to the extent that the costs are now well described by the extremely heavy tailed Pareto distribution with parameter α ≈ 0.5–0.6. We noted in the introduction that the Three Mile Island accident in 1979 led to plant-specific full-scope control room simulators, plant-specific PSA models for finding and eliminating risks, and new sets of emergency operating instructions. The change of regime that we document here may be the concrete embodiment of these changes catalyzed by the TMI accident. We also identify statistically significant runaway disaster (“dragon-king”) regimes in both NAMS and cost, suggesting that extreme events are amplified to values even larger than those explained under the Pareto distribution with α ≈ 0.5–0.6.
In view of the extreme risks, the need for better bonding and liability instruments associated with nuclear accident and incident property damage becomes clear. For instance, under the conservative assumption that the cost from Fukushima is the maximum possible, annual accident costs are on par with the construction costs of a single nuclear plant, with the expected annual cost being 1.5 billion USD with a standard deviation of 8 billion USD. If we do not limit the maximum possible cost, then the expected cost under the estimated Pareto model is mathematically infinite. Nuclear reactors are thus assets that can become liabilities in a matter of hours, and it is usually taxpayers, or society at large, who “pay” for these accidents rather than nuclear operators or even electricity consumers. This split of incentives separates those most responsible for an accident (the principals) from those suffering the cost of nuclear accidents (the agents). One policy suggestion is that we start holding plant operators liable for accident costs through an environmental or accident bonding system, which should work together with an appropriate economic model to incentivize the operators.
Looking to the future, our analysis suggests that nuclear power has inherent safety risks that will likely recur. With the current model, which does not quantify improvements from the industry response to Fukushima, there is, in terms of costs, a 50% chance that (i) a Fukushima event (or larger) occurs in 62 years, and (ii) a TMI event (or larger) occurs in 15 years. Further, smaller but still expensive (⩾20 MM 2013 USD) incidents will occur with a frequency of about one per year, under the assumption of a roughly constant fleet of nuclear plants. Curbing these risks would require sweeping changes to the industry, as perhaps triggered by Fukushima, including refinements to reactor operator training, human factors engineering, radiation protection, and many other areas of nuclear power plant operations. To be effective, any changes need to minimize the risk of extreme disasters. Unfortunately, given the shortage of data, it is too early to judge if the risk of events has significantly improved post-Fukushima. We can only draw attention to the fact that similar sweeping regime changes after both Chernobyl (leading to a decrease in frequency) and Three Mile Island (leading to a suppression of moderate events) failed to mitigate the very heavy tailed distribution of costs documented here.
A separate conclusion of our article concerns the nature of data about nuclear incidents and accidents. We found that the INES scale of the IAEA is highly inconsistent, and the scores provided by the IAEA incomplete. For instance, only 50% of the events in our database have INES scores. Further, for the costs to be consistent with the INES scores, the Chernobyl and Fukushima disasters would need to be between an INES level of 10 and 11, rather than the maximum level of 7. The INES scale was compared to the antiquated Mercalli scale for earthquake magnitudes, which was replaced by the continuous physically-based Richter scale. Clearly, an objective continuous scale such as the NAMS would be superior to the INES. However, while using INES, scores should be made available for all accidents. When such a framework is established, and data on incidents and accidents made more rigorous and transparent, accident risks can be better understood, and perhaps even minimized through positive learning.
Finally, our study opens a number of avenues for future research. Our results have been obtained for the current fleet, dominated in large part by Generation II reactors. Future research directions could be to investigate how much of the specific risks for each reactor type or design can be inferred from statistical analysis, with the goal of identifying which of the reactors are the safest. In addition to the role of technology, another natural extension would be to correlate accidents to the type of market or form of regulatory governance, restructured versus monopoly/state run, or limited liability versus no limited liability.
Speaking in comparative terms, our focus on the risks of civil nuclear power plants might give the impression that this technology is riskier than other competing technologies, such as coal or wind energy. However, due to the more diluted nature of the costs, and the quasi-hysterical focus on nuclear risks following the Fukushima disaster, an insidious villain may be hidden: it has been estimated that fine particle air pollution causes about 7 million premature deaths globally each year, including more than 1 million in China, and about 60,000 in Europe. Considering the value of life to be in the millions (10^6), these deaths alone account for a cost on the order of a trillion USD (10^12), not taking into consideration the billions also being required for healthcare. Coal, whose global use has soared by 50% from 2000 to 2010, is the leading source of fine particles, which are embedded in the lungs, causing cancers. Between 2010 and 2012, European coal consumption jumped 5%, or 50 million tons. Thus, performing a rigorous empirically-based comparative analysis of the risks of nuclear versus other forms of energy providers is absolutely essential to avoid falling into the traps of media hype and availability biases, with the goal of better steering our societies. Furthermore, such an analysis should, on one hand, take into consideration the costs of the disposal of nuclear wastes, while on the other hand recognize that humankind is confronted with a “nuclear stewardship curse,” whereby existing nuclear byproducts/waste need to be securely managed over immense time scales.