Over the past several decades, the Yellowstone caldera has experienced frequent earthquake swarms and repeated cycles of uplift and subsidence, reflecting dynamic volcanic and tectonic processes. Here we examine the detailed spatial-temporal evolution of the 2010 Madison Plateau swarm, which occurred near the northwest boundary of the Yellowstone caldera. To fully explore the evolution of the swarm, we integrated procedures for seismic waveform-based earthquake detection with precise double-difference relative relocation. Using cross correlation of continuous seismic data and waveform templates constructed from cataloged events, we detected and precisely located 8710 earthquakes during the 3 week swarm, nearly 4 times the number of events included in the standard catalog. This high-resolution analysis reveals distinct migration of earthquake activity over the course of the swarm. The swarm initiated abruptly on 17 January 2010 at about 10 km depth and expanded dramatically outward (both shallower and deeper) over time, primarily along a NNW striking, ~55° ENE dipping structure. To explain these characteristics, we hypothesize that the swarm was triggered by the rupture of a zone of confined high-pressure aqueous fluids into a preexisting crustal fault system, prompting release of accumulated stress. The high-pressure fluid injection may have been accommodated by hybrid shear and dilatational failure, as is commonly observed in exhumed hydrothermally affected fault zones. This process has likely occurred repeatedly in Yellowstone as aqueous fluids exsolved from magma migrate into the brittle crust, and it may be a key element in the observed cycles of caldera uplift and subsidence.
The Yellowstone Plateau Volcanic Field has been shaped over its youthful geologic history by an amalgamation of volcanic and tectonic processes, which remain active today. The Yellowstone hot spot has been active at least since 16 Ma, leaving a 700 km long path of magmatic activity along the Snake River Plain in southern Idaho and eastern Oregon as the North American plate has moved southwestward at ~2.5 cm/yr over the mantle hot spot. Over the past 2.1 million years, Yellowstone has erupted catastrophically 3 times, most recently 640,000 years ago in the eruption that formed the Yellowstone caldera [Christiansen, 2001]. Numerous smaller eruptions have occurred since then, the youngest 70,000 years ago [Christiansen, 2001]. Though magmatic eruptions are infrequent, the Yellowstone system poses major hazards from occasional hydrothermal explosions and large earthquakes [Christiansen et al., 2007], such as the 1959 Mw 7.3 Hebgen Lake earthquake that killed 28 people [Doser, 1985; Stover and Coffman, 1993].
A preponderance of evidence suggests that a large volume of partial melt underlies the caldera, extending to depths as shallow as 4–6 km. Conceptual models for large, silicic calderas envision repeated intrusion of basaltic magma into the lower crust, which supplies the heat to partially melt a volume of the overlying silicic crust [Hildreth, 1981]. This provides the large thermal source necessary to generate the extraordinarily high observed heat flow [Fournier et al., 1976; Morgan et al., 1977], recently estimated to average 1.4–2.8 W/m² over 2900 km² of the caldera [Hurwitz et al., 2012]. Maintaining this heat flow as well as high CO₂ flux [Werner and Brantley, 2003] over time is thought to require frequent input of new magma [Fournier, 1989; Lowenstern and Hurwitz, 2008]. Geophysical investigations also suggest the presence of relatively shallow melt. These include studies using seismic tomography [Husen et al., 2004], receiver function analysis [Chu et al., 2010], and a combination of gravity and heat flow data [DeNosaquo et al., 2009]. Luttrell et al.  have recently postulated that a zone of partial melt lies at depths as shallow as 3–6 km based on small spatial-temporal variations in strain generated by a seiche on Yellowstone Lake. Wicks et al.  and Chang et al. [2007, 2010] modeled deformation data from episodes of caldera uplift as resulting from the inflation of sills at depths of 15 and 9 km, respectively, associated with the intrusion of magma, exsolved magmatic fluids, or both. In the shallower subsurface, Husen et al.  suggested that pore space is filled by CO₂ at depths of less than ~2 km near the northwest edge of the caldera, based on low P wave velocities and low P/S velocity ratios.
 Earthquakes and fluid flow are thought to be highly coupled phenomena [e.g., Sibson, 1996; Yamashita, 1999], with aqueous fluids ubiquitous in the middle and upper crust [Cox, 2005]. Rising pore fluid pressure eventually triggers failure (shear, tensile, or a combination) [Sibson, 1990], which may simultaneously increase pore space [Yamashita, 1999; Sheldon and Ord, 2005] and permeability [Ingebritsen and Manning, 2010], causing fluid pressure to then drop. This process may also result in mineralization of the fault zone [Sibson, 1987; Weatherley and Henley, 2013], which over time could reduce permeability and effectively “self-seal” the fault. Because many areas of crust are critically stressed [e.g., Townend and Zoback, 2001], even an incremental increase in pore fluid pressure can potentially trigger earthquakes [Ellsworth, 2013].
Examples of earthquakes triggered by controlled injection of fluids at depth provide direct evidence for the effects of pore fluid pressure on fault strength [Healy et al., 1968; Raleigh et al., 1976; Shapiro et al., 1997; Rutledge et al., 2004; Julian et al., 2010]. In addition, many natural earthquake swarms appear to be triggered by transient fluid pressure increases [Parotidis et al., 2003; Vidale and Shearer, 2006; Chen et al., 2012]. Swarm seismicity commonly expands outward from the point of injection in proportion to the square root of time, consistent with a diffusive process [Shapiro et al., 1997; Hainzl, 2004; Chen et al., 2012]. In some cases, fluid injections may generate tensile or hybrid shear-tensile fractures, which are reflected by non-double-couple focal mechanisms [Dreger et al., 2000; Šílený et al., 2009; Julian et al., 2010; Taira et al., 2010], while, in other cases, fluids may trigger purely shear failure [Sibson, 2003].
 We evaluated the well-recorded 2010 Madison Plateau earthquake swarm, one of the three largest swarms recorded in the volcanic field since monitoring began in the 1970s. The 2010 swarm began 17 January at ~10 km depth beneath the northwest boundary of the Yellowstone caldera (Figure 1) [Massin et al., 2012, 2013], within a zone of frequent swarm activity extending out from the northwest corner of the caldera toward the Hebgen Lake fault zone [Farrell et al., 2009]. The 1985 swarm, the largest yet recorded in the volcanic field, also occurred in this zone, 10–15 km NNW of the 2010 swarm (Figure 1) [Waite and Smith, 2002]. The 2010 swarm began abruptly at 20:10 UTC (Figure 2), with the first cataloged event at 20:17 (M = 1.0). Over the following 3 weeks, ~2250 earthquakes were eventually cataloged, including 17 of M 3 or larger. The largest event had local magnitude (Ml) 3.9 and moment magnitude (Mw) 4.1. In addition, several small earthquakes occurred nearby on 15 and 16 January at 3–7 km depth, the largest being M 1.7. The waveforms for these earthquakes correlate poorly with the main swarm events, however, and their relationship to the main swarm remains unclear. Although coseismic offsets of the largest (M 3+) earthquakes were visible on the closest strainmeter (E. Roeloffs, personal communication, 2012), no deformation associated with the swarm was detected by GPS, though the closest GPS stations were 10–20 km from the swarm epicenters.
 Our approach in this study was to process the available continuous seismic data for this swarm to simultaneously increase the number of located earthquakes and the precision of their locations. This then allows us to more thoroughly examine the spatial-temporal evolution of the swarm activity, with the aim of using this information to constrain the physical processes driving the swarm. The additional earthquakes not only provide more complete coverage in space and time but also serve to increase the precision of the other event locations by increasing the quantity of data (the differential arrival times) available to constrain the location inversion [Waldhauser and Ellsworth, 2000]. This technique is particularly effective for the 2010 Madison Plateau swarm because of the high degree of waveform similarity among earthquakes of this swarm [Massin et al., 2013].
2 Data and Method
 Waveform cross correlation has been used effectively for both precise earthquake location [e.g., Poupinet et al., 1984] and for event detection [Gibbons and Ringdal, 2006]. Cross correlation can be used to precisely measure relative timing of similar waveforms, reducing uncertainty associated with phase arrival time picks, which translates into reduced location uncertainty. Cross correlation also allows for efficient detection of events similar to known “template” events, even in the case of low signal-to-noise ratio, allowing identification of events too small to be cataloged by typical phase-picking methods [Schaff and Waldhauser, 2010]. Correlation-based detection is especially effective in cases where high event rates result in overlapping seismograms, such as in tectonic tremor and other earthquake swarms [Shelly et al., 2007; Shelly and Hill, 2011].
Here, following the technique described in Shelly et al. , we combined correlation-based detection and location procedures, simultaneously identifying events and measuring precise differential times, as shown in Figure 3. We used ~2000 earthquakes cataloged by standard network processing of the University of Utah Seismograph Stations (UUSS) as template events. To increase hypocenter precision, we separately constructed P wave and S wave templates and began each template 0.2 s before the estimated phase arrival time. We used the catalog arrival time picks when available; however, we often inferred the arrival time of the S phase because only the P phase was available. We used a duration of 2.5 s for the P wave template and 4 s for the S wave template. For stations near the source, with a difference in P and S arrival times of less than 2.5 s (hypocentral distance within ~20 km), we truncated the P wave template to avoid overlapping with the S wave template. Both P and S templates were constructed on available vertical and horizontal seismograms. All data were band-pass filtered between 2 and 12 Hz to optimize the signal-to-noise ratio and correlations among events. An example template event is shown in Figure 3a.
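The core operation behind template matching can be illustrated with a minimal sketch of a sliding normalized correlation coefficient. This is not the processing code used in the study; the function name `normalized_xcorr`, the synthetic template, and the noise levels are assumptions chosen only for illustration, and the sketch omits the band-pass filtering and the per-phase, per-channel bookkeeping described above.

```python
import numpy as np

def normalized_xcorr(template, data):
    """Correlation coefficient of `template` against every same-length
    window of `data` (hypothetical helper, not from the study)."""
    template = np.asarray(template, dtype=float)
    data = np.asarray(data, dtype=float)
    n = template.size
    t = template - template.mean()
    t_norm = np.sqrt(np.sum(t ** 2))
    # One row per candidate start sample, each row one template length long.
    windows = np.lib.stride_tricks.sliding_window_view(data, n)
    w = windows - windows.mean(axis=1, keepdims=True)
    w_norm = np.sqrt(np.sum(w ** 2, axis=1))
    denom = t_norm * w_norm
    cc = np.zeros(windows.shape[0])
    ok = denom > 0  # guard against flat (zero-variance) windows
    cc[ok] = (w[ok] @ t) / denom[ok]
    return cc

# Synthetic check: bury a scaled copy of the template in noise.
rng = np.random.default_rng(0)
fs = 100.0                                      # samples per second (0.01 s sampling)
template = rng.standard_normal(int(2.5 * fs))   # stand-in for a 2.5 s P template
data = 0.1 * rng.standard_normal(2000)          # assumed background noise
onset = 700
data[onset:onset + template.size] += 0.5 * template
cc = normalized_xcorr(template, data)
best = int(np.argmax(cc))
print(best, round(float(cc[best]), 2))
```

Because the coefficient is normalized by the energy in each window, a small event with the same waveform shape as the template scores nearly as high as a large one, which is what makes the approach useful for events below the catalog threshold.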
 To initially identify the presence of a similar event, we summed the normalized correlation functions for P and S templates on all seismometer components. For times where the summed correlation exceeded 8 times the median absolute deviation (MAD) of the summed correlation function for the day, we then took the second step of attempting to measure the precise time of the correlation peak for P and S windows on each channel (Figures 3b–3d). In this case, we used a threshold correlation coefficient of either 7 times the MAD value for that particular phase/channel pair on that particular day or an absolute threshold of 0.8, whichever is lower. These thresholds were determined empirically to achieve a balance of measurement quality and quantity. We allowed a maximum differential time of 1.0 s for P waves and 1.73 s for S waves to avoid a possible bias from small bounds, though most measured differential times were much smaller. Events for which we could successfully measure at least 4 differential times were saved, and we enforced a minimum time separation between events of 4 s.
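The first-stage detection rule can be sketched as follows. The 8 × MAD threshold and the 4 s minimum separation come from the text; the function name `detect_events`, the choice to keep the strongest peak when two candidates fall within the separation window, and the synthetic trace are assumptions made for this illustration.

```python
import numpy as np

def detect_events(summed_cc, dt, n_mad=8.0, min_sep_s=4.0):
    """Return sample indices where the summed correlation exceeds
    n_mad times its median absolute deviation (MAD), keeping only the
    strongest peak within any min_sep_s window (assumed tie-break rule)."""
    summed_cc = np.asarray(summed_cc, dtype=float)
    med = np.median(summed_cc)
    mad = np.median(np.abs(summed_cc - med))
    threshold = n_mad * mad
    candidates = np.flatnonzero(summed_cc > threshold)
    # Visit candidates from strongest to weakest.
    order = candidates[np.argsort(summed_cc[candidates])[::-1]]
    min_sep = int(round(min_sep_s / dt))
    kept = []
    for idx in order:
        if all(abs(int(idx) - k) >= min_sep for k in kept):
            kept.append(int(idx))
    return sorted(kept), threshold

# Synthetic daily trace: unit-variance noise plus three spikes.
rng = np.random.default_rng(1)
trace = rng.standard_normal(10000)   # 100 s at 0.01 s sampling
trace[2000] = 20.0                   # strong detection
trace[2100] = 15.0                   # 1 s later: suppressed by the 4 s rule
trace[6000] = 12.0                   # separate detection
events, threshold = detect_events(trace, dt=0.01)
print(events, round(threshold, 2))
```

Using the MAD rather than the standard deviation keeps the threshold stable on days with many detections, since a robust statistic is only weakly affected by the correlation peaks themselves.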
 To achieve a balance between computational efficiency and differential time precision, we calculated the correlations at increments of 0.01 s, which corresponds to one sample in the seismic data. We then performed a simple quadratic three-point interpolation. In an ideal case, this gives ~1 ms timing precision, a time over which seismic waves travel ~3–7 m (velocities of 3–7 km/s). Thus, timing precision of a few milliseconds is required to locate event centroids with precision of ~10 m.
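The three-point quadratic interpolation amounts to fitting a parabola through the peak sample and its two neighbors and taking the vertex. A minimal sketch, assuming a correlation function sampled every 0.01 s (the function name `refine_peak` and the test parabola are illustrative, not from the study):

```python
import numpy as np

def refine_peak(cc, dt):
    """Subsample time and height of the maximum of `cc` from a parabola
    through the peak sample and its two neighbors."""
    cc = np.asarray(cc, dtype=float)
    i = int(np.argmax(cc))
    if i == 0 or i == cc.size - 1:
        return i * dt, float(cc[i])      # no neighbor on one side
    y0, y1, y2 = cc[i - 1], cc[i], cc[i + 1]
    curvature = y0 - 2.0 * y1 + y2
    if curvature == 0.0:
        return i * dt, float(y1)         # flat top: keep the sample time
    shift = 0.5 * (y0 - y2) / curvature  # vertex offset in samples, within +/-0.5
    peak = y1 - 0.25 * (y0 - y2) * shift
    return (i + shift) * dt, float(peak)

# A parabola whose true maximum lies between samples, at 5.3 samples.
x = np.arange(11)
cc = 1.0 - 0.01 * (x - 5.3) ** 2
t_peak, cc_peak = refine_peak(cc, dt=0.01)
print(t_peak, cc_peak)

# Distance traveled during 1 ms at the wave speeds quoted in the text.
for v_km_s in (3.0, 7.0):
    print(v_km_s, "km/s ->", v_km_s * 1000.0 * 1e-3, "m per ms")
```

For an exactly parabolic peak the vertex is recovered exactly; for real correlation functions the result is an approximation whose quality depends on how well the peak is sampled.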
 For event detection and relocation, we used continuous seismic data from 18 stations (22 channels) of the Yellowstone Seismograph Network, operated by the UUSS, shown in yellow in Figure 1. Owing to data archiving problems at this time (continuous data for broadband seismic channels were not archived), we used only short-period seismic stations. Because we required precise relative time among stations, we additionally confined our analysis to stations digitized on the same system by the UUSS, which share a common time base. This eliminated problems of subsample time slew, which could have become a significant source of error for event locations. These stations represent the bulk of the network; thus, this was the best configuration for achieving optimal location precision. In total, we derived ~11 million precise differential times from these stations using cross correlation, split almost evenly between P and S measurements.
 Finally, the correlation-derived differential times were input into the hypoDD location package [Waldhauser and Ellsworth, 2000] along with differential times derived from the catalog phase picks. We used the 1-D UUSS velocity model for this area. To appropriately emphasize the highest quality measurements, weights for the correlation-derived times were set as the square of the maximum correlation coefficient. In the first several iterations, catalog data were weighted most heavily to define the broad structure. In subsequent iterations, we weighted correlation data most strongly to refine the event centroid locations. We relied on the weighting and outlier elimination in hypoDD to mute the effects of occasional spurious measurements. Events analyzed here are those for which we consider the locations to be well constrained, in that they retain at least 20 P wave and 20 S wave correlation-derived differential times throughout the inversion. With the differential time linking employed in hypoDD, relative locations among closely spaced earthquakes are generally most precise, especially when many events are located within a small volume. Because the final locations are dominated by waveform cross correlations rather than phase arrival times, they represent centroid rather than hypocenter coordinates, though this distinction is minor for all but the largest events.
 Perhaps counterintuitively, we are often able to locate small to moderate-sized events more precisely than we can locate larger events. This is due both to clipping of seismograms and to fewer event pairs with similar waveforms for these larger events. Since larger magnitude earthquakes occur less frequently, even though they are recorded at more stations, they have fewer potential event pairs of similar-sized events for which cross correlations are generally most effective [Schaff et al., 2004].
3 Earthquake Detection and Location Results
 A comparison of the UUSS cataloged events versus the precisely located data set determined here is shown in Figure 4. In total, we are able to locate almost 4 times as many events as included in the UUSS catalog, adding 6460 earthquakes not previously identified. Though there is some variation, the relative event rates between this study and the standard catalog are approximately constant throughout the swarm. The maximum daily event rates peak at 329 catalog events (on 21 January) and 1231 precisely located events (on 18 January), though both data sets show high event rates for the entire period from 18 to 21 January (Figure 4). Note that we do not attempt to measure the magnitudes of newly detected events. Previous work has found that the catalog is complete down to magnitude 1.0–1.2 since 1995 [Farrell et al., 2009], so newly detected events are expected to be smaller. Future work to estimate magnitudes of newly detected events could provide more robust constraints on the b value of the Gutenberg-Richter relation, which is sometimes interpreted to reflect stress and fluid conditions in the source region [e.g., Farrell et al., 2009]; the b value of UUSS catalog events for this swarm is near the typical value of 1.0.
 The space-time progression of the swarm is illustrated in Figure 5. High-resolution event locations show that swarm earthquakes dominantly formed a NNW striking structure dipping ~55° to the ENE, with dimensions of about 3 × 3 km and depths of 8.5–11 km below the surface. In the later stages of the swarm, activity developed east of this structure and slightly shallower, near 8 km depth.
 Initial activity on 17 January 2010 was concentrated in a very small area (Figure 5a). Events expanded outward from the initial source with time, gradually illuminating a distinct structure dipping ~55° to the ENE. Initial source migration (within the first hour) was dominantly along the strike direction of this structure, toward the NNW, though some expansion also occurred updip and downdip. One hour after the swarm initiation, its dimensions were approximately 500 m along strike and 150 m in the dip direction (Figure 5a and Animation S1 in the supporting information).
 The first earthquake exceeding magnitude 3 was an Ml 3.1 event at 18:03 on 18 January, which occurred on the updip edge of the activity front. At this time, the primary zone of swarm activity had expanded to ~1.5 km along strike and ~1 km along dip. A secondary zone was offset to the west and slightly shallower, though ~150 m below the projection (eventually illuminated by seismicity) of the main structure (Figure 5c and Animation S1). The largest events of the sequence occurred on 21 January, with an Ml 3.7 earthquake at 06:01, followed 15 min later by an Ml 3.9 event. Shallower activity (7.9 km depth) well separated from this main structure followed ~7 h after these earthquakes. The shallower events might have been triggered by static and/or dynamic stresses from these largest events.
 On 24 January, the swarm proceeded with renewed vigor on the downdip edge of the structure, initiating with a series of small events starting at 04:00 at ~10.5 km depth and accelerating with an M 2.9 earthquake at 07:56. By midday on 25 January, the swarm had progressed to a depth of ~11 km, in the wake of an M 3.2 event at 06:09. For the next several days, the activity rate slowly declined, but remained high (Figure 4). In what could be described as the final stage of the swarm, activity on the shallower structures renewed on 2 and 3 February, including dramatic propagation of events from east to west (Figure 5l and Animation S1).
4.1 Swarm Migration
The most striking feature of the swarm was the pronounced migration of earthquake centroids outward from the initial source (Figures 5 and 6). The spatial-temporal expansion of the earthquake activity front can be well fit by a diffusive relationship $r = \sqrt{4\pi D t}$, where r is the distance from the injection point, t is time, and D is hydraulic diffusivity [Shapiro et al., 1997]. Note that this relationship assumes homogeneous, isotropic diffusivity in three dimensions from a point source held at constant pressure, conditions that are violated here to various extents. We apply this equation as a means of comparison with studies of other earthquake swarms, which have mostly adopted the same formulation. Swarms fitting this diffusive-like pattern are typically interpreted as related to fluid pressure propagation [e.g., Shapiro et al., 1997; Parotidis et al., 2003; Hill and Prejean, 2005; Hainzl and Ogata, 2005]. In this swarm, we find that an assumed diffusivity of ~1.5 m²/s fits the activity front well (Figure 6), using the location of the first event as a proxy for the point of injection. This estimate is somewhat higher than diffusivities of 0.2–0.8 m²/s estimated for the 1989 earthquake swarm at Mammoth Mountain, on the rim of Long Valley caldera [Hill and Prejean, 2005], as well as the ~1 m²/s estimated for the 2009 swarm at Mount Rainier [Shelly et al., 2013], and it is several times greater than diffusivities of ~0.3 m²/s from the 2000 and 2008 Vogtland/NW Bohemia swarms in central Europe [Hainzl and Ogata, 2005; Hainzl et al., 2012]. On the other hand, Parotidis et al.  examined shorter bursts during the 2000 Vogtland swarm and estimated diffusivities ranging from 0.3 to 10 m²/s. A generally lower range of diffusivities of 0.01–0.8 m²/s was found by Chen et al.  for 18 earthquake swarms in various fault zones in Southern California. Compared to past swarms in Yellowstone, migration rates for the 2010 swarm are similar. Waite and Smith  estimated a linear propagation rate of 150 m/d for the average position of events during the 1985 swarm, though the front of activity expanded somewhat faster at ~400 m/d. Farrell et al.  estimated that the 2008–2009 Yellowstone Lake swarm front propagated ~1000 m/d. This compares to the 2010 swarm front migrating ~3 km over the first 4 days of the swarm, or ~750 m/d; however, since the migration rate slows with time, linearly estimated rates become lower for the 2010 swarm if time periods longer than 4 days are considered (Figure 6). All of these rates are much slower than the 1989 Dobi earthquake sequence in Central Afar of ~50 km in ~50 hours, which was also hypothesized to be related to a fluid pressure pulse [Noir et al., 1997].
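As a rough numerical illustration of the diffusive front, the sketch below evaluates the Shapiro et al. [1997] form r = sqrt(4πDt) for the diffusivity of ~1.5 m²/s quoted above. The function names are hypothetical, and the calculation inherits all of the idealizations noted in the text (homogeneous, isotropic diffusivity and a constant-pressure point source).

```python
import math

SECONDS_PER_DAY = 86400.0

def front_radius_m(t_s, diffusivity_m2_s=1.5):
    """Distance of the triggering front from the injection point,
    r = sqrt(4 * pi * D * t)."""
    return math.sqrt(4.0 * math.pi * diffusivity_m2_s * t_s)

def mean_rate_m_per_day(t_days, diffusivity_m2_s=1.5):
    """Average front speed over the first t_days (a linear estimate,
    which decreases as the averaging window lengthens)."""
    return front_radius_m(t_days * SECONDS_PER_DAY, diffusivity_m2_s) / t_days

r_1h = front_radius_m(3600.0)
r_4d = front_radius_m(4 * SECONDS_PER_DAY)
rate_4d = mean_rate_m_per_day(4)
rate_10d = mean_rate_m_per_day(10)
print(round(r_1h), round(r_4d), round(rate_4d), round(rate_10d))
```

With these inputs the front sits a few hundred meters from the source after 1 h and roughly 2.5 km away after 4 days, the same order as the ~3 km expansion described above, and the averaged rate falls as the window lengthens because r grows only as the square root of time.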
 In addition to the overall expansion of earthquake hypocenters with time, numerous transient propagation episodes occur within the swarm, which we have attempted to highlight by the purple arrows in Figure 5. These episodes occur in all directions (updip, downdip, and both directions along strike), initially directed outward from the original source. This propagation may reflect cascading stress transfer coupled with fluid flow. While it may seem surprising to see downward migration induced by fluid flow (since the large-scale hydraulic gradient would generally drive fluid upward), we note that the difference between lithostatic and hydrostatic pressures at these depths (~10 km) is much greater than the hydrostatic gradient between the shallow and deep extents of the swarm (8–11 km). Therefore, the normal hydraulic gradient can be easily overcome by fluid pressures locally approaching lithostatic levels. A similar effect can be seen in cases of controlled fluid injection in boreholes, where triggered earthquakes often extend to depths greater than the depth of injection [e.g., Ake et al., 2005].
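The pressure argument can be checked with simple arithmetic. The sketch below assumes a crustal density of 2700 kg/m³ and a water density of 1000 kg/m³; neither value is given in the text, and both are used only to show orders of magnitude.

```python
G = 9.81              # gravitational acceleration, m/s^2
RHO_ROCK = 2700.0     # assumed bulk crustal density, kg/m^3
RHO_WATER = 1000.0    # assumed fluid density, kg/m^3

def pressure_mpa(density_kg_m3, depth_m):
    """Pressure of a uniform column of the given density, in MPa."""
    return density_kg_m3 * G * depth_m / 1e6

lithostatic_10km = pressure_mpa(RHO_ROCK, 10_000.0)
hydrostatic_10km = pressure_mpa(RHO_WATER, 10_000.0)
excess_mpa = lithostatic_10km - hydrostatic_10km

# Hydrostatic pressure difference between the top (8 km) and bottom (11 km)
# of the swarm volume.
hydro_span_mpa = pressure_mpa(RHO_WATER, 11_000.0) - pressure_mpa(RHO_WATER, 8_000.0)

print(round(excess_mpa, 1), round(hydro_span_mpa, 1))
```

Under these assumptions the gap between lithostatic and hydrostatic pressure at ~10 km is several times larger than the hydrostatic difference across the 8–11 km depth range, so fluid at near-lithostatic pressure could plausibly drive flow downward as well as upward.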
4.2 Swarm Structure and Source Mechanisms
 Earthquakes of the 2010 Madison Plateau swarm collectively form a structure approximating a plane striking NNW and dipping ~55° to the ENE, consistent with the orientation of the dominant Basin and Range normal faults known to exist in the vicinity [Christiansen, 2001]. A preexisting fault zone such as this might provide a natural pathway for fluids, with the high-permeability damage zone and the low-permeability fault core likely serving to effectively guide fluids parallel to the fault [Sibson, 1996; Cox, 2005; Wibberley et al., 2008; Ingebritsen and Appold, 2012].
 Rather than normal faulting, which formed the Basin and Range extensional structures, fault solutions based on moment tensor analysis and P wave first motions indicate that the swarm is dominated by strike-slip faulting mechanisms. Figure 7a shows the double-couple constrained first-motion mechanism for the M 3.9 event on 21 January, the largest event of the swarm and, thus, one of the best recorded. The solution indicates either right-lateral faulting on a roughly E-W striking plane or left-lateral faulting on an ~N-S plane. Also shown are the distributions of polarity observations for remaining swarm events, the vast majority of which are consistent with those for the largest event. Though either nodal plane could be the fault plane (indeed, the swarm might contain a mix of events with either orientation), alignments of event locations striking NNW suggest that the left-lateral orientation may dominate.
 Because of uncertainty in the dip of the nodal planes, faulting may occur either along the primary dipping structure or as a series of nearly vertical en echelon faults. Whereas strike-slip motion on a dipping fault is not mechanically optimal, it can occur if the fault is sufficiently weak relative to its surroundings, perhaps as a result of locally elevated pore fluid pressure [Sibson, 1990]. Alternatively, the fault zone may contain numerous en echelon vertical strike-slip segments in a fault-mesh-type geometry [Hill, 1977; Sibson, 1996].
 The UUSS standard cataloged P wave polarity observations for the swarm are highly skewed, with more than 2.5 times as many cataloged compressional first motions as dilational. While this could simply be an artifact of relatively uniform focal mechanisms of swarm earthquakes and an inhomogeneous station distribution, it opens the possibility of a volumetric component of the source. Similar observations of dominantly compressional first motions for the 1985 swarm at Yellowstone [Waite and Smith, 2002] and a 2009 swarm at Mount Rainier [Shelly et al., 2013], both of which are hypothesized to be fluid triggered, lend support to the idea that this may be more than an observational artifact. Despite the overall dominance of compressional first motions, the 22 polarity observations of the largest earthquake of the 2010 swarm (Ml 3.9) can be well fit by an assumed double-couple source (Figure 7a). As noted above, polarities of swarm events are remarkably consistent, with relatively few polarity observations differing from those observed for this largest event.
Moment tensor analysis also suggests a positive volumetric component for the largest earthquake of the 2010 Madison Plateau swarm. We used a set of broadband seismic waveforms in a frequency range of 0.02–0.05 Hz, recorded at 16 broadband stations with a good azimuthal coverage (Figure 7b). Green's functions were computed by the frequency–wave number code of Herrmann , with the same UUSS 1-D velocity model used for locating earthquakes in Yellowstone. We found that the largest swarm earthquake had 30% of the energy associated with an isotropic expansion component. Assuming a Poisson solid and Lamé parameters λ and μ of 10 GPa, we estimate a volume increase of 1.7–3.0 × 10⁴ m³, depending on whether crack or spherical geometry is assumed [see Müller, 2001]. The F test statistic for the significance of the volumetric component is 1.19, above the 95% confidence level of 1.18. This suggests that the improved fit is not merely an artifact of additional free parameters compared with the double-couple solution. The resulting fault planes are in good agreement with the focal mechanism derived from the P wave polarities (Figure 7a); however, the expected radiation pattern obtained from the moment tensor analysis is not completely consistent with the observed P wave polarities (e.g., station YWB). This may reflect remaining uncertainties in the volumetric component of the earthquake source process. Earthquakes with non-double-couple source mechanisms at similar depths have been observed previously in Yellowstone [Taira et al., 2010; Farrell et al., 2010] as well as in Long Valley caldera during unrest in 1997 [Dreger et al., 2000].
 Many studies have suggested a hybrid shear and dilatational mechanism of failure under high pore fluid pressures. Although hydraulic extension fracturing can occur if the fluid pressure exceeds the least principal compressive stress, in most cases, faulting with a shearing component is induced before the fluid pressure actually reaches this level [Sibson, 2003; Cox, 2010; Fischer and Guest, 2011]. Hybrid fractures reflect mixed tensile and compressive stress states [Ramsey and Chester, 2004], which may be manifested in a fractured mesh structure of linked shear and dilatational fault segments [Hill, 1977; Sibson, 1996]. Julian et al.  proposed a hybrid mechanism of hydraulic fracturing and associated shear slip on wing tip faults to explain waveforms recorded by a dense borehole seismic network in the vicinity of industrial fluid injection in the Coso Volcanic field. This type of faulting is also well documented in the geologic record [e.g., Sibson, 1987] and could result in non-double-couple mechanisms with a high percentage of compressional P wave polarities in fluid-driven swarms.
4.3 Source of Fluids and Relationship to Caldera Dynamics
The Yellowstone caldera has exhibited repeated episodes of uplift and subsidence (Figure 8), yet our understanding of the processes that drive surface deformation remains limited. Impressive heat flow and CO₂ emissions provide evidence for high intrusion rates of basaltic magma into the crust, a process that probably drives deformation in one form or another. Fournier , using an estimate of 1.8 W/m² over 2500 km² from the chloride-flux technique [Fournier et al., 1976], noted that about 0.2 km³/yr of crystallizing rhyolitic magma could provide this heat. Alternatively, the heat could be provided by cooling an equal volume of already crystallized magma by 300°C, or some combination of crystallization and cooling. Lowenstern and Hurwitz  estimated that an intrusion rate of ~0.3 km³/yr of basaltic magma would be required to supply the observed CO₂ flux in steady state. Dzurisin et al.  hypothesized that the apparent discrepancy in intrusion rate estimates might be explained by time variability in the CO₂ flux and/or intrusion rate.
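The scale of the heat budget can be illustrated by converting the quoted fluxes to total power and comparing with the heat released by cooling 0.2 km³/yr of rock by 300°C. The rock density (2400 kg/m³) and specific heat (1000 J kg⁻¹ K⁻¹) below are assumed round values, not taken from the cited studies, so the cooling figure is an order-of-magnitude check only.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def total_power_w(flux_w_m2, area_km2):
    """Total heat output for a mean flux over a given area."""
    return flux_w_m2 * area_km2 * 1e6   # 1 km^2 = 1e6 m^2

chloride_power = total_power_w(1.8, 2500)   # chloride-flux estimate
recent_low = total_power_w(1.4, 2900)       # lower bound, 2012 estimate
recent_high = total_power_w(2.8, 2900)      # upper bound, 2012 estimate

RHO = 2400.0   # assumed density of crystallized rhyolite, kg/m^3
CP = 1000.0    # assumed specific heat, J/(kg K)

def cooling_power_w(volume_km3_per_yr, delta_t_k):
    """Power released by cooling a yearly rock volume by delta_t_k."""
    mass_rate_kg_s = volume_km3_per_yr * 1e9 * RHO / SECONDS_PER_YEAR
    return mass_rate_kg_s * CP * delta_t_k

cooling_power = cooling_power_w(0.2, 300.0)

for label, p in [("chloride flux", chloride_power),
                 ("2012 low", recent_low),
                 ("2012 high", recent_high),
                 ("cooling 0.2 km^3/yr by 300 C", cooling_power)]:
    print(f"{label}: {p / 1e9:.2f} GW")
```

Under these assumed properties, all four figures fall in the range of several gigawatts, which is consistent with the statement that cooling or crystallizing roughly 0.2 km³/yr of magma could sustain the observed heat flow.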
The process of magma crystallization at depth may impact deformation observed at the surface. Fournier  estimated that crystallization of 0.2 km³/yr of rhyolitic magma with 2% water content, after accounting for the volume decrease of the magma, would result in a volume increase of 0.026 km³/yr, which he argued is more than sufficient to account for caldera uplift rates of 1.4–2.2 cm/yr observed between 1923 and 1985 (Figure 8). Following a period of subsidence from 1985 to 2004 (except for slow uplift during 1995–1997), a period of accelerated uplift occurred in the caldera beginning in 2004. Uplift rates averaged 7 cm/yr from 2004 to 2006, slowing slightly to a still high 5 cm/yr from 2006 to 2008 [Chang et al., 2007, 2010]. Interferometric synthetic aperture radar measurements showed that uplift was concentrated inside the ring fracture system (Figure 1), with little deformation on the surface above the 2010 swarm zone. Because these high uplift rates require volume increases larger than the expected exsolution rate of fluids from crystallizing magma, the accelerated uplift was probably driven by an increase in the intrusion rate of new magma into the midcrust from greater depth [Pelton and Smith, 1979; Chang et al., 2007].
 Although exsolved magmatic fluids may accumulate for some time in the ductile crust, eventually, they are likely to migrate into the upper crust. In the past few decades since monitoring began, major transitions from uplift to subsidence at Yellowstone have been accompanied by earthquake swarms [Smith et al., 2009; Dzurisin et al., 2012]. The 1985 swarm, the largest yet recorded at Yellowstone, heralded an end to a period of overall inflation between 1923 and 1985 [Waite and Smith, 2002]. Similarly, following the 2008–2009 Yellowstone Lake swarm [Farrell et al., 2010], caldera inflation slowed; with the 2010 Madison Plateau swarm examined here, the caldera resumed deflation (Figure 8). Geologic examination of Yellowstone Lake shorelines has shown that uplift and subsidence have been mostly balanced over postglacial times (since ~14,000 years ago), which could be explained by accumulation and release of exsolved fluids [Pierce et al., 2002]. Considering this and the association of earthquake swarms with the onset of subsidence, major earthquake swarms might be associated with fluids escaping from a lithostatically pressured ductile regime into a hydrostatically pressured brittle region [Dzurisin et al., 1994, 2012; Waite and Smith, 2002]. Once the fluids are within the brittle regime, they can induce faulting by lowering the effective normal stress, which aids further fluid propagation.
Fournier  argued that exsolved magmatic fluids are likely to accumulate in horizontally extensive, overlapping sill-like structures. He noted that vertically extensive fluid-filled fractures are not mechanically stable in plastic rock that is not capable of maintaining significant differential stress, where the stress magnitude will follow the lithostatic gradient. Therefore, in rocks that are more ductile, vertically extensive fluid fractures either will propagate rapidly upward or will spread out laterally from their top edge, as the lower part of the fracture is squeezed shut. The net result is that over the long term, horizontally extensive aqueous fluid “sills” will dominate in the ductile regime since vertically extensive “dikes” will be quickly destroyed. Therefore, exsolved fluids accumulating near the top of the magma reservoir below the ductile-brittle transition probably form in horizontal lenses [Fournier, 1999; Smith et al., 2009]. This structure could result in permeability anisotropy, favoring lateral flow much more readily than vertical flow [Fournier, 1999]. At ~10 km depth, fluid moving laterally outward from the caldera would encounter the ductile-brittle transition near the caldera boundary, without moving upward. In fact, the site of the 2010 swarm at the NW boundary of the caldera corresponds with an especially abrupt deepening of the maximum focal depth of earthquakes compared to within the caldera, which probably indicates a correspondingly abrupt deepening in the brittle-ductile transition [Smith et al., 2009]. Figure 9 schematically illustrates the hypothesized lateral transport of fluids and their role in swarm generation.
4.4 Interaction of Fluids and Faulting
 The association between fluid pressure fluctuations and earthquake swarms is supported by the observed spatial-temporal migration and the “swarm-like” character in which seismicity does not follow a decaying main shock–aftershock pattern [Mogi, 1963]. Seismic swarms are usually thought to reflect heterogeneous stresses and some type of external forcing, commonly either aseismic slip or fluid pressure increase [e.g., Vidale and Shearer, 2006]. For the 2010 Madison Plateau swarm, the spatial and temporal histories are most readily produced by a fluid pressure transient. In particular, an abrupt release of fluids into the fault zone could explain the abrupt initiation of the swarm and the fact that earthquake hypocenters eventually surround the point of initiation, both characteristics that deviate from patterns associated with earthquakes triggered by aseismic slip [e.g., Segall et al., 2006; Lohman and McGuire, 2007]. Though stress triggering from previous events is still a primary process, sustained earthquake rates and spatial expansion are well explained by a diffusing fluid pressure increase [Parotidis et al., 2005; Hainzl and Ogata, 2005].
 In particular, numerical modeling has suggested that the strong spatial spreading of hypocenters is indicative of fluid triggering rather than stress triggering [Hainzl, 2004]. We observe migration of the front of the swarm activity in a manner consistent with fluid diffusion (Figure 6), similar to the pattern seen with other swarms hypothesized to be fluid triggered [Parotidis et al., 2003; Hill and Prejean, 2005; Hainzl and Ogata, 2005; Chen et al., 2012; Shelly et al., 2013], as well as those triggered by controlled injection of fluids at depth [e.g., Shapiro et al., 1997]. Often, small earthquakes lead the propagation front, with larger events following only after weaker activity has become established in an area (Figure 6). This may reflect the heterogeneous rise of fluid pressure—initially, the increased pressure would be localized in areas with higher permeability, allowing very small earthquakes. Larger earthquakes, in contrast, may be delayed until fluid pressure has risen over most of the eventual source region. We also observe a lower density of earthquakes on the main dipping structure near the regions of larger earthquakes—this may be simply a reflection of the larger slip and rupture dimension that characterizes larger earthquakes. However, it might also indicate relative simplicity of the fault zone in the source regions of the larger earthquakes.
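 As a rough illustration of what migration "consistent with fluid diffusion" implies, the sketch below evaluates the triggering-front relation r = sqrt(4πDt) that is commonly used for a pressure perturbation diffusing from a point source in a homogeneous, isotropic medium [e.g., Shapiro et al., 1997]. It is an assumption here that this simple form is an adequate description of the front shown in Figure 6. The diffusivity of 1.5 m²/s is the approximate value reported for this swarm in this section; the function names are hypothetical.

```python
import math


def front_distance_m(diffusivity_m2_s, elapsed_s):
    """Distance (m) of the triggering front, r = sqrt(4 * pi * D * t)."""
    return math.sqrt(4.0 * math.pi * diffusivity_m2_s * elapsed_s)


def diffusivity_from_front(distance_m, elapsed_s):
    """Invert the same relation: D = r^2 / (4 * pi * t), in m^2/s."""
    return distance_m ** 2 / (4.0 * math.pi * elapsed_s)


D = 1.5  # m^2/s, approximate diffusivity describing the swarm migration

HOUR = 3600.0
DAY = 24.0 * HOUR

for label, seconds in [("1 hour", HOUR), ("1 day", DAY),
                       ("1 week", 7 * DAY), ("3 weeks", 21 * DAY)]:
    r = front_distance_m(D, seconds)
    print(f"{label:>8}: front at ~{r / 1000.0:.2f} km from the initiation point")
```

 Because the front distance grows with the square root of time, expansion is fastest early in the swarm and slows progressively, which is the qualitative behavior that distinguishes diffusive migration from a front advancing at constant velocity.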
 Geological observations and associated modeling studies may prove helpful for understanding the mechanics of the swarm at depth. Mineral deposits are often concentrated in zones associated with faulting under high fluid pressures [Sibson, 1987; de Ronde et al., 2001; Caine et al., 2010], which, owing to their economic importance, have been extensively studied. These and other studies of fault zones demonstrate that most faults, rather than being discrete planes, are complex zones of deformed rock [Wibberley et al., 2008]. This complexity naturally leads to the creation of new pore space during faulting, especially if that faulting is occurring under high (near-lithostatic) pore pressures [Hill, 1977]. Accordingly, as slip occurs in an earthquake, fluid pressure in the fault zone will abruptly drop, possibly serving as an arresting mechanism for dynamic events [Sibson, 1987]. Under some conditions, the pore pressure in the fault zone may decrease to the point of vaporizing pore fluids [Weatherley and Henley, 2013]. Because this process scales with the amount of slip, it is expected to be especially important for the larger magnitude events and is thought to be a major factor in fault mineralization [Sibson, 1987; Sheldon and Ord, 2005].
 After an earthquake, fluid would be drawn into the fault zone from the surrounding area to once again equilibrate the pressure, allowing the process to repeat [Sibson, 1987; Waite and Smith, 2002; Sheldon and Ord, 2005]. Over time, the fault pore space not filled by mineralization may contract again under lithostatic pressure. This zone could then act as a new source of fluids, similar to the mechanism proposed by Rutledge et al.  to explain the characteristics of seismicity induced by industrial fluid injection. In the first few hours of the swarm, there appears to be competition between updip and downdip propagation, where activity expands episodically in either direction (see Animation S1). This characteristic might result from a relatively steady fluid supply, where propagation of the swarm activity front in the updip direction reduces the overall pressure in the fault zone, slowing propagation of the downdip front.
 While we cannot absolutely rule out a magmatic intrusion as the direct trigger of the 2010 Madison Plateau swarm, we think that it is unlikely. Hydraulic diffusivity is inversely proportional to viscosity, assuming that permeability and other parameters are fixed [Ingebritsen and Manning, 2010]. The high diffusivity describing swarm migration (~1.5 m²/s, Figure 6), coupled with the lack of resolvable surface deformation, suggests triggering by propagation of a low-viscosity fluid. The viscosity of a single-phase supercritical aqueous fluid expected in the swarm source region [Lowenstern and Hurwitz, 2008] is ~10⁻⁴ Pa s, several orders of magnitude lower than that of basaltic magma (~10¹–10² Pa s) and many orders lower than rhyolitic magma (10⁴–10⁸ Pa s) [Rubin, 1995, and references therein]. Thus, diffusion of magma compared to aqueous fluid would require drastically higher permeabilities to progress at similar rates. This is not impossible in the case of dike opening, but magma-filled dikes might be expected to generate long-period earthquakes [Chouet, 1989], which are not observed during the swarm. Significant dike opening might also cause detectable surface deformation and stress changes large enough to override the background stress state. By contrast, a low-viscosity aqueous fluid could quickly traverse narrow fractures at depth, triggering earthquakes consistent with the background stress state with negligible surface deformation. Further evidence supporting a nonmagmatic origin of the swarm is the accompanying transition from uplift to subsidence in the caldera. This is the opposite of the result that would be expected if the intrusion were triggered by input of new magma from depth, suggesting that triggering fluids may occupy preexisting pore space and/or narrow fractures.
Finally, the broad consistency of waveforms and P wave polarities (Figure 7a) among swarm earthquakes suggests that they are occurring in a relatively uniform stress field, perhaps many of them on the same fault structure.
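 The viscosity argument above can be made concrete with a short calculation. The sketch below assumes only the stated proportionality (diffusivity proportional to permeability divided by viscosity, with all other parameters fixed) and uses the viscosity values quoted in the text. It computes the factor by which permeability would have to increase for magma to diffuse at the same rate as the aqueous fluid. The names are hypothetical, and the result is an order-of-magnitude illustration rather than a model of dike propagation.

```python
import math

AQUEOUS_VISCOSITY_PA_S = 1e-4  # supercritical aqueous fluid, value quoted in the text

# Viscosity ranges (Pa s) quoted in the text for the two magma types.
MAGMA_VISCOSITY_RANGES_PA_S = {
    "basaltic magma": (1e1, 1e2),
    "rhyolitic magma": (1e4, 1e8),
}


def permeability_factor(fluid_viscosity_pa_s,
                        reference_viscosity_pa_s=AQUEOUS_VISCOSITY_PA_S):
    """Factor by which permeability must rise to keep diffusivity unchanged.

    Assumes diffusivity is proportional to permeability / viscosity with all
    other parameters fixed, so the factor is simply the viscosity ratio.
    """
    return fluid_viscosity_pa_s / reference_viscosity_pa_s


for name, (low, high) in MAGMA_VISCOSITY_RANGES_PA_S.items():
    low_orders = math.log10(permeability_factor(low))
    high_orders = math.log10(permeability_factor(high))
    print(f"{name}: permeability must be ~10^{low_orders:.0f} to "
          f"10^{high_orders:.0f} times higher to match the aqueous fluid")
```

 Under this assumption, the required permeability increase is simply the viscosity ratio, which spans many orders of magnitude for either magma type.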
 High-resolution earthquake detection and location provide constraints for understanding the underlying driving processes in swarms, beyond those available from routine locations. The technique is applied retrospectively here, but with appropriate preparation, it has the potential to be implemented in near real time, utilizing automated phase picks and/or a limited subset of analyst-reviewed events. Such an implementation could potentially guide interpretation and response during future swarms.
 Our analysis of the 2010 Madison Plateau swarm demonstrates a dramatic outward expansion of swarm hypocenters with time, suggesting that the swarm was triggered by a fluid pressure pulse expanding within a preexisting fault zone. This provides additional support for the hypothesis that many earthquake swarms in Yellowstone, such as the 1985 Yellowstone swarm, relate to the expulsion of exsolved aqueous fluids from the caldera [Dzurisin et al., 1994, 2012; Waite and Smith, 2002]. Magma intruding beneath the caldera provides an obvious source of water-rich fluids containing CO2, sulfur species, and chloride. The 2010 swarm was likely triggered as these fluids crossed the brittle-ductile transition and, in doing so, moved from a near-lithostatic to a near-hydrostatic pressure regime. This, in turn, triggered faulting in response to the preexisting differential background stress. This hypothesized transfer of fluids outward from the caldera and into the upper crust is consistent with the observed transition to subsidence in the caldera. Though the accelerated uplift of the caldera observed in 2004–2008 probably requires an increase in magma flux into the midcrust [Chang et al., 2010], much of the observed vertical fluctuations of the caldera could be explained by corresponding fluctuations in the accumulation and discharge rates of exsolved magmatic fluids.
 Geologic observations and numerical modeling suggest that faulting under high pore fluid pressures will dilate the fault zone and dramatically increase the permeability [Ingebritsen and Manning, 2010]. Because rising fluid pressure will trigger earthquakes in response to the differential stress, earthquake swarms may be a natural consequence of fluid injection from overpressured crustal volumes into the surroundings. Preexisting fault zones would provide a natural path of weakness and potentially high permeability. Though faulting initially facilitates fluid flow, precipitation in fractures will eventually reseal the system. Thus, the “fault valve” cycle repeats, where fluid pressures rise until faulting is again triggered, fluid flows, and precipitation reseals the system [Sibson, 2003; Cox, 2005, 2010]. In the case of mineral-rich fluids, such as those exsolved from magmas, repeated activation of this process leads to the concentration of precipitates, including minerals of economic importance [Sibson, 1987; Fournier, 1989; Weis et al., 2012; Weatherley and Henley, 2013]. Swarms such as the 2010 Madison Plateau sequence may also serve as natural analogs to seismicity triggered by controlled fluid injection at depth [Ellsworth, 2013], providing insight into the interactions between fluids and faulting.
 The seismic data for this study were from the Yellowstone Seismic Network, operated by the University of Utah, and from the Plate Boundary Observatory, operated by UNAVCO for EarthScope and supported by the National Science Foundation (No. EAR-0350028 and EAR-0732947). Data were retrieved from the IRIS data center and from the UUSS. Support for the University of Utah authors F. Massin, J. Farrell, and R. B. Smith, respectively, was from The Brinson Foundation, the University of Utah, other private foundations, and the U.S. Geological Survey. The U.S. Geological Survey, Cooperative Agreement G10AC00124, and Yellowstone National Park supported the operation of the Yellowstone Seismic Network. We are grateful for comments and reviews of this manuscript in various states from Serge Shapiro, Ole Kaven, Chuck Wicks, Shaul Hurwitz, and Jake Lowenstern and for discussions with Andy Michael, Bob Christiansen, and Peter Cervelli, all of which greatly improved the manuscript.