For the first time, ambient noise tomography is used to clearly image the magma chamber beneath Lake Toba caldera, one of the largest Quaternary calderas on Earth. Using data from 40 seismic stations deployed between May and October 2008 around Lake Toba, empirical Green's functions are extracted from long-term cross-correlations of continuous records. These functions are dominated by Rayleigh waves, whose group velocities can be measured in the period range from 2.5 to 12 seconds. Arrival times of these waves are picked for a given period and inverted using 2-D tomography to calculate lateral variations in velocity for that period. This was done for six different periods, each corresponding to a different sampling depth. Thus the six 2-D models presented together provide information on velocity variations with depth. The results show a low-velocity body coincident with the Lake Toba caldera, representing the magma chamber under the volcano. The chamber is observed to have a complex 3-D geometry, with at least two separate sub-chambers underlying the caldera. Other results include a deep low-velocity body, possibly another magma chamber, south-west of the lake with an upper limit at ∼7 km depth. The maximum depth to which this body reaches could not be resolved. The Sumatra Fault marks a velocity contrast, but only to depths no greater than 5 km. The reliability of the results was further confirmed by checkerboard recovery tests.
1. Introduction
 The Toba caldera is located in north Sumatra, Indonesia. It is part of the volcanic arc associated with the subduction of the Australian Plate beneath the Southeast-Asian Plate. The subduction zone and the Sumatra Fault, a right-lateral strike-slip fault which marks the plate boundary, are seismically active. However, only two seismic networks have previously been deployed in the region. Between 1990 and 1993, BMKG (Meteorological, Climatological and Geophysical Agency) deployed a telemetered network of 10 stations. The data were used by Fauzi et al. to examine the geometry of the subducting slab by locating earthquakes. In 1995, 10 broad-band and 30 short-period IRIS-PASSCAL stations were installed for 4 months. Masturyono et al. inverted P-wave travel times to provide the first velocity model in the area. Koulakov et al. extended this work by computing an S-wave velocity model, as well as a map of the Poisson's ratio distribution. A crustal image beneath the caldera was constructed by Sakaguchi et al. using receiver functions.
 Between May and October 2008, a dense seismic network was installed around Lake Toba (Figure 1). The network comprised 40 continuously recording seismic stations, each equipped with a three-component, short-period seismic sensor with a natural frequency of 1 Hz. The GPS-synchronised data loggers recorded at 100 samples per second for the experiment's time span of 6 months. During this period, local and regional seismicity was recorded. In this study we present an analysis of the ambient noise part of the recorded data. Although we used short-period sensors, there was significant noise energy in the 0.06 to 1 Hz frequency band, which formed the basis for an ambient noise tomography study.
 In this manuscript, we investigate the uppermost crustal structure beneath the Toba caldera and its surrounding area. The Toba caldera is one of the largest Quaternary calderas on Earth, formed 75 ka ago by an eruption of 2500–3000 km3 of material (ignimbrites and tuffs) [Chesner et al., 1991; Rose and Chesner, 1987]. This volcanic event had a significant global impact on climate and the biosphere. However, major pyroclastic activity started much earlier (∼1.2 Ma [Chesner and Rose, 1991]).
 The presence of a crustal magma reservoir beneath the Toba caldera has been revealed by previous geophysical analysis that relied on an inversion of P-wave arrival times and gravity anomalies [Masturyono et al., 2001]. Two magma reservoirs were found in the northern and southern parts of the caldera in the 10–20 km depth range. The goal of our seismic study is to understand the present-day magma distribution by high-resolution mapping of crustal seismic velocities, derived tomographically from ambient noise records.
2. Ambient Noise Cross-Correlation
 The idea of extracting coherent signal by cross-correlation of noise was first applied to seismic waves in helioseismology [Duvall et al., 1993]. An acoustics study by Weaver and Lobkis extracted the Green's function by cross-correlating diffuse fields, and favourably compared the response to direct Rayleigh waves. This technique was extended to seismic recordings by Shapiro and Campillo, who cross-correlated vertical component records from seismic station pairs in North America to obtain Rayleigh waveforms. Since then the technique has generated a lot of interest across the globe, with applications in, e.g., Iceland [Gudmundsson et al., 2007], the Alps [Stehly et al., 2008], and Australia [Saygin and Kennett, 2010].
 The method used in this study involves calculating the cross-correlation integral of the daily time series records for all available station pairs in the 40-station array. If a particular station did not record continuously for a particular day (due to site maintenance or instrument failure), that day's recorded data were not considered. Each daily record's mean was removed prior to cross-correlation. For each available station pair, the resulting correlation function is a two-sided time function with both positive and negative correlation lags. Each correlation is twice as long as the input files, i.e., two days, but given a maximum station distance of 177 km it was sufficient to store the sections from −150 to 150 seconds. The daily correlation functions were stacked for each station pair to improve the signal-to-noise ratio. Figure 2 shows the presence of coherent signal in the correlation function. All data presented here are correlations of the vertical component. Analyses of the radial and transverse components did not yield coherent signals, leading us to conclude that the extracted Green's functions are indeed dominated by Rayleigh wave signals travelling between the two stations concerned.
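 The daily correlation and stacking steps described above can be sketched as follows. This is a minimal illustration, not the processing code used in this study: the function names, the FFT-based implementation, and the convention that `None` marks an incomplete day are assumptions made for the example, and no spectral or temporal normalisation is applied because none is described in the text.

```python
import numpy as np

def daily_cross_correlation(a, b, fs, max_lag_s=150.0):
    """Cross-correlate two equal-length daily records; keep lags within +/- max_lag_s."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("daily records must have the same length")
    n = a.size
    max_lag = int(round(max_lag_s * fs))
    if not 0 < max_lag < n:
        raise ValueError("max_lag_s must be positive and shorter than the record")
    # Remove each daily record's mean before correlating.
    a = a - a.mean()
    b = b - b.mean()
    # Zero-pad to twice the record length so the FFT product gives the
    # linear (not circular) correlation c[k] = sum_t a[t] * b[t + k].
    nfft = 2 * n
    spec = np.conj(np.fft.rfft(a, nfft)) * np.fft.rfft(b, nfft)
    cc = np.fft.irfft(spec, nfft)
    # Negative lags are stored at the end of the FFT output; move them in front.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    lags = np.arange(-max_lag, max_lag + 1) / fs
    return lags, cc

def stack_daily_correlations(days_a, days_b, fs, max_lag_s=150.0):
    """Sum the daily correlations of one station pair, skipping incomplete days (None)."""
    stack, lags = None, None
    for a, b in zip(days_a, days_b):
        if a is None or b is None:
            continue  # a station did not record continuously on this day
        lags, cc = daily_cross_correlation(a, b, fs, max_lag_s)
        stack = cc if stack is None else stack + cc
    if stack is None:
        raise ValueError("no complete days available for this station pair")
    return lags, stack
```

 With this sign convention, energy travelling from the first station to the second appears at positive lags, and energy travelling in the opposite direction at negative lags.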
 After stacking the daily records, the frequency-dependent group velocities of the Rayleigh waves were calculated using frequency-time analysis [e.g., Ritzwoller and Levshin, 1998]. Each Green's function was filtered in progressively longer period bands, and the envelope of each resulting filtered trace was calculated. These envelopes were then plotted together, and the arrival time at each frequency was picked. This was done by identifying the highest value among the envelopes on the time segment corresponding to group velocities between 1.5 and 3.5 km/s, and tracing the dispersion curve through the frequency range as long as the arrival was sufficiently prominent (Figure 2b). Periods between 2.5 and 12 seconds were used. Including periods above 12 seconds significantly reduced the number of measurable Green's functions. It was decided to obtain travel times for six different periods: 2.5, 3.3, 5, 7, 9 and 12 seconds. Different periods have different group velocities because they sample different depths [e.g., Ritzwoller and Levshin, 1998]. High-frequency (short-period) waves penetrate only the shallow sub-surface, while long-period waves penetrate deeper into the crust. The 2.5 s signal samples the uppermost 2–3 km, while the 12 s signal penetrates to a depth of 10–15 km [e.g., Saygin and Kennett, 2010].
 The positive and negative correlation lags were not stacked, as is often done to improve signal quality [e.g., Bensen et al., 2007]. Instead, two dispersion curves were drawn, one for each lag. For a number of station pairs the two opposite directions of propagation exhibited prominent arrivals in different period ranges. The frequency-time analysis between stations 4 and 12 (77 km apart), shown in Figure 2b, exemplifies this. This information might have been lost in the stacking of the two lags. The possible problem of two significantly different travel times between two stations at the same frequency was dealt with later, during the tomographic inversion.
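 The picking step of the frequency-time analysis can be illustrated with a simplified sketch. It assumes a Gaussian narrow-band filter applied in the frequency domain and an envelope taken from the analytic signal; the function name, the filter width parameter `alpha`, and the one-sided input trace (starting at zero lag) are assumptions for this example and are not taken from the processing used in this study. Only the 1.5 to 3.5 km/s search window comes from the text.

```python
import numpy as np

def group_arrival_time(trace, fs, distance_km, period_s,
                       alpha=20.0, vmin=1.5, vmax=3.5):
    """Pick the envelope maximum of a narrow-band filtered, one-sided correlation.

    trace       : samples of one correlation lag, starting at lag 0
    fs          : sampling rate in Hz
    distance_km : inter-station distance
    period_s    : centre period of the Gaussian filter
    alpha       : hypothetical filter-width parameter (larger = narrower band)
    Returns (arrival time in s, group velocity in km/s).
    """
    trace = np.asarray(trace, dtype=float)
    n = trace.size
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    f0 = 1.0 / period_s
    # Gaussian band-pass on positive frequencies only. Zeroing the negative
    # frequencies and doubling the rest yields the analytic signal.
    gauss = np.where(freqs > 0, np.exp(-alpha * ((freqs - f0) / f0) ** 2), 0.0)
    analytic = np.fft.ifft(2.0 * np.fft.fft(trace) * gauss)
    envelope = np.abs(analytic)
    t = np.arange(n) / fs
    # Search only the time segment corresponding to the allowed group velocities.
    idx = np.flatnonzero((t >= distance_km / vmax) & (t <= distance_km / vmin))
    if idx.size == 0:
        raise ValueError("trace does not cover the group velocity window")
    pick = idx[np.argmax(envelope[idx])]
    return t[pick], distance_km / t[pick]
```

 Calling this function for each of the six periods, and separately for the positive and negative lag of a station pair, would give one point per period on each dispersion curve. A check on how prominent the arrival is would still be needed before accepting a pick.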
3. Tomographic Inversion
 Once Rayleigh wave travel times between station pairs have been obtained for a given period, a 2-D tomographic inversion can be performed to estimate variations in surface wave group velocities in the study area.
 The Fast Marching Method (FMM) [Sethian, 1996] is a grid-based numerical algorithm for tracking the progress of monotonically advancing interfaces by seeking finite-difference solutions to the eikonal equation. The method uses implicit wavefront construction as opposed to conventional ray tracing, and furthermore contains entropy conditions, which increase the method's stability. This method has been successfully applied to seismic tomography [Rawlinson and Sambridge, 2004, 2005; Arroucau et al., 2010]. In this study we use the Fast Marching Surface Tomography (FMST) code developed by N. Rawlinson in the framework of the aforementioned studies.
 For each set of travel times, a number of 2-D inversions were performed using different starting models, grid densities, damping factors, and other inversion parameters. Furthermore, as picking of travel times was carried out automatically, some quality control needed to be imposed. After an inversion, synthetic travel times were computed through the resulting velocity model and compared to the picked input times. Where a difference of more than 5 seconds was observed, the particular travel time was discarded. The possibility of two different travel times between the same station pair was also investigated: where a significant difference between the two picked times existed, the time further from the synthetic travel time was discarded. The inversion was then repeated without the discarded travel times.
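 To illustrate the principle, the sketch below solves the eikonal equation |∇T| = 1/v on a regular Cartesian grid with a first-order fast marching scheme. It is a didactic example only and is not the FMST code: the function name, the Cartesian geometry, the first-order update, and the grid spacing `h` are assumptions made for this illustration.

```python
import heapq
import numpy as np

def fast_marching_times(velocity, src, h=1.0):
    """First-order fast marching travel times from grid node `src` (row, col)."""
    velocity = np.asarray(velocity, dtype=float)
    ny, nx = velocity.shape
    times = np.full((ny, nx), np.inf)
    accepted = np.zeros((ny, nx), dtype=bool)
    times[src] = 0.0
    heap = [(0.0, src)]

    def accepted_time(i, j):
        # Only nodes whose time is already final may act as upwind neighbours.
        inside = 0 <= i < ny and 0 <= j < nx
        return times[i, j] if inside and accepted[i, j] else np.inf

    while heap:
        _, (i, j) = heapq.heappop(heap)
        if accepted[i, j]:
            continue  # stale heap entry
        # The smallest tentative time is final: the front only moves outwards.
        accepted[i, j] = True
        for k, l in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            if not (0 <= k < ny and 0 <= l < nx) or accepted[k, l]:
                continue
            ta = min(accepted_time(k - 1, l), accepted_time(k + 1, l))
            tb = min(accepted_time(k, l - 1), accepted_time(k, l + 1))
            lo, hi = min(ta, tb), max(ta, tb)
            s = h / velocity[k, l]  # time to cross one cell at the local velocity
            if hi - lo < s:
                # Both axes contribute: solve the quadratic upwind update.
                new = 0.5 * (lo + hi + np.sqrt(2.0 * s * s - (hi - lo) ** 2))
            else:
                # Only one axis contributes: one-sided update.
                new = lo + s
            if new < times[k, l]:
                times[k, l] = new
                heapq.heappush(heap, (float(new), (k, l)))
    return times
```

 In a travel-time tomography setting, a field of this kind would be computed from each station acting as a source, and the predicted time to every other station read off the grid.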
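 The quality control rules above can be written down compactly. The sketch below is an illustration under stated assumptions: the data layout (dictionaries keyed by station pair), the function name, and in particular the 2 s threshold for what counts as a "significant" difference between the two lag picks are hypothetical, since the text does not give that value. Only the 5 s residual limit comes from the text.

```python
def select_travel_times(picks, synthetic, max_residual_s=5.0, significant_diff_s=2.0):
    """Apply the residual and duplicate-pick rules to picked travel times.

    picks     : dict mapping (station_a, station_b) -> list of one or two
                picked times in s (one per correlation lag)
    synthetic : dict mapping the same keys -> time predicted by the current model
    Returns a dict with the picks retained for the next inversion run.
    """
    kept = {}
    for pair, times in picks.items():
        t_syn = synthetic[pair]
        times = list(times)
        # Two picks that disagree significantly: keep the one closer to the
        # synthetic time (significant_diff_s is an assumed threshold).
        if len(times) == 2 and abs(times[0] - times[1]) > significant_diff_s:
            times = [min(times, key=lambda t: abs(t - t_syn))]
        # Discard any pick that differs from the synthetic time by more than 5 s.
        times = [t for t in times if abs(t - t_syn) <= max_residual_s]
        if times:
            kept[pair] = times
    return kept
```

 The retained picks would then be passed back to the inversion, as described in the paragraph above.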
 The final velocity models for different period ranges are shown in Figure 3. Each model was calculated on a 2° by 2° grid, separated into 30 nodes in each lateral direction. The models did not change significantly when the inversion parameters were varied, as long as these were kept reasonable. The rms residuals for each model are also shown in Figure 3.
 To investigate the resolution of the results shown in Figure 3, checkerboard tests were applied. Examples of how alternating anomalies can be recovered from synthetic travel times are given in the auxiliary material. The recoveries are good, but indicate the necessity to treat features in Figure 3 near the edge of the ray coverage, as well as features smaller than 0.2° (∼20 km), with caution.
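 A checkerboard input model of the kind used in such tests can be generated as sketched below. The anomaly size, background velocity, and perturbation amplitude are assumptions chosen for illustration (the actual test models are in the auxiliary material); with 30 nodes across 2°, a cell of 3 nodes is roughly 0.2° wide. The correlation coefficient is offered only as one possible way to summarise a recovery.

```python
import numpy as np

def checkerboard(n_nodes=30, cell_nodes=3, v0=2.5, perturbation=0.1):
    """Velocity grid (km/s) with alternating +/- perturbations around v0.

    All default values are illustrative assumptions, not values from this study.
    """
    i, j = np.indices((n_nodes, n_nodes))
    sign = np.where(((i // cell_nodes) + (j // cell_nodes)) % 2 == 0, 1.0, -1.0)
    return v0 * (1.0 + perturbation * sign)

def recovery_correlation(true_model, recovered_model, v0=2.5):
    """Correlation between input and recovered velocity perturbations."""
    a = (np.asarray(true_model) - v0).ravel()
    b = (np.asarray(recovered_model) - v0).ravel()
    return float(np.corrcoef(a, b)[0, 1])
```

 In a recovery test, synthetic travel times would be computed through the checkerboard for the same station pairs as in the real data set and inverted with the same parameters. Comparing the output with the input then shows where, and at which scale, anomalies can be resolved.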
4. Results and Discussion
 The most prominent feature in all velocity models (Figure 3) is the low-velocity zone coinciding with the location of Lake Toba, particularly Samosir Island. While no body can be uniquely defined based solely on its surface wave velocity, high temperatures and the presence of partial melts are considered to be responsible for reduced seismic velocities [e.g., Christensen and Mooney, 1995]. Our results are thus consistent with a region of high temperatures and possibly partial melt in the crust (magma chamber, pluton) underneath the volcanic island, forming the source of the voluminous Toba ignimbrites and tuffs [e.g., Aldiss and Ghazali, 1984]. Similar low-velocity anomalies were observed in Rayleigh group velocity maps beneath the Yellowstone caldera in North America [Stachnik et al., 2008] and in local earthquake tomography at the Altiplano-Puna volcanic complex (APVC) in the Chilean magmatic arc [Graeber and Asch, 1999; Haberland and Rietbrock, 2001]. Moderate seismic velocities in the crust beneath the active volcanoes of the magmatic arc north-west and south-east of this anomalous Lake Toba region (mainly erupting calc-alkaline basalt and andesite [Van Bemmelen, 1949]) indicate that here the crust does not show evidence of strong heating or the generation of voluminous partial melts. Page et al. related the unusually large amount of volcanic activity in the Toba region to a tear in the subducting plate (Investigator Ridge fracture zone) at this section of the subduction zone. However, the mechanisms responsible for the large-scale anatexis of the crust in the Toba region remain unclear.
 The exact geometry of the zone of partial melt or magma chamber appears to be complex. While the lowest velocities are observed beneath the southern part of Samosir Island (consistent with the body wave analysis of Masturyono et al.), in some sections of the model a second low-velocity region is clearly seen underneath the northern part of the lake. The existence of two separate low-velocity bodies under the caldera is also consistent with Masturyono et al. While Toba is often referred to as a caldera, gravity [Nishimura et al., 1984] and paleomagnetic [Knight et al., 1986] studies suggest it is in fact a complex of several calderas. Our study shows the caldera can be separated into two distinct parts. The size of these parts is close to the resolving capabilities of our velocity models, making it impossible for us to state whether these are single calderas or sub-complexes. A denser seismic network would be necessary to resolve this. However, the total volume of the low-velocity anomaly is at least 75 km × 30 km × 15 km (length × width × depth), or ∼34,000 km3, which corresponds very well to the estimated pluton size of 34,000 km3 underlying the Toba region, based on volumes of magma erupted over the last 1.2 Ma [Chesner and Rose, 1991]. It is possible that the volume of the anomaly is larger, because the signal extracted in our study does not contain energy at the frequencies that would yield information about the velocity distribution below a depth of ∼15 km.
 For the short-period models the Sumatra Fault loosely marks a clear shift in velocity, with higher values south-west of the fault. This is consistent with body wave results [Masturyono et al., 2001; Koulakov et al., 2009], as well as with the geological model of a subduction zone. However, the fault no longer marks a distinction for periods longer than 5 seconds (i.e., depths greater than ∼5 km). While the fault is known to extend to greater depth, at 5 km it ceases to have a resolvable seismic signature. A further significant feature is the low-velocity zone south-west of the lake in the 12-second model. It is also present, though less prominent, in the 9- and 7-second models. The size of this anomaly is well within the resolving capabilities of the inversion, and it is interpreted here as another magmatic body, extending from a depth of ∼7 km downwards. This body, together with the high-velocity regions north of Lake Toba, leads us to conclude that a simple bimodal velocity model defined by the Sumatra Fault is an oversimplification, and that the tectonics in the region are much more complex.
5. Conclusions
 This study extracted Rayleigh wave signal by stacking daily cross-correlations of seismic signal recorded by a temporary array of 40 stations deployed around Lake Toba on Sumatra between May and October 2008. These Rayleigh waves contained energy in the period range of 2.5–12 seconds. Arrival times of these surface waves were picked for the available station pairs at six frequencies in this range. For each frequency, the available travel times were inverted tomographically to estimate surface wave velocities at that frequency. As waves of different frequencies penetrate to different depths, these velocity models represent pseudo depth slices underneath the study area.
 The resultant velocity models identified a low-velocity zone very well correlated with the location of Samosir Island. While this is not surprising, it confirms the feasibility of noise cross-correlation for the data set used here. The velocity models confirm that the Toba caldera is underlain by at least two separate magma chambers. The volume of the Lake Toba low-velocity anomaly suggests a pluton size of at least 34,000 km3, which agrees with previous estimates based on the volume of magma erupted. Our analysis also identified another low-velocity body south of the Sumatra Fault, with its upper limit at a depth of approximately 7 km.
Acknowledgments
 The experiment was done in collaboration with the Indonesian Institute of Science (LIPI) and Indonesian Meteorological and Geophysical Agency (BMG). In particular we are grateful to Bambang Suwargadi (LIPI), as well as Frederik Tilmann and Andreas Rietbrock. Furthermore, we thank all field crews for their excellent work under difficult conditions and the landowners for hosting our stations. We thank Nick Rawlinson for making his tomography code available to us. The experiment was funded by the GeoForschungsZentrum Potsdam, with the Geophysical Instrument Pool Potsdam (GIPP) providing the equipment. JS is funded by the Helmholtz-Russia Joint Research Groups (project HRJRG-110). The manuscript benefited from reviews by S. Widiyantoro and an anonymous reviewer.