The oxygen isotopic ratio (δ18O) in tropical Pacific coral skeletons reflects past El Niño–Southern Oscillation (ENSO) variability, but the δ18O-ENSO relationship is poorly quantified. Uncertainties arise when constructing δ18O data sets, combining records from different sites, and converting between δ18O and sea surface temperature (SST) and salinity (SSS). Here we use seasonally resolved δ18O from 1958 to 1990 at 15 tropical Pacific sites to estimate these errors and evaluate possible improvements. Observational uncertainties from Kiritimati, New Caledonia, and Rarotonga are 0.12–0.14‰, leading to errors of 8–25% on the typical δ18O variance. Multicoral syntheses using five to seven sites capture the principal components (PCs) well, but site selection dramatically influences ENSO spatial structure: using sites in the eastern Pacific, western Pacific warm pool, and South Pacific Convergence Zone (SPCZ) captures “eastern Pacific-type” variability, while “Central Pacific-type” events are best observed by combining sites in the warm pool and SPCZ. The major obstacle to quantitative ENSO estimation is the δ18O/climate conversion, as demonstrated by the large errors on both δ18O variance and the amplitude of the first principal component that result from using commonly employed bivariate formulae to relate SST and SSS to δ18O. Errors likely arise from either the instrumental data used for pseudoproxy calibration or influences from other processes (δ18O advection/atmospheric fractionation, etc.). At some sites, modeling seasonal changes to these influences reduces conversion errors by up to 20%. This indicates that understanding of past ENSO dynamics using coral δ18O could be greatly advanced by improving δ18O forward models.
1 Introduction
 The El Niño-Southern Oscillation (ENSO) is the dominant source of interannual climate variability and influences atmospheric and oceanic conditions worldwide [Horel and Wallace, 1981; Ropelewski and Halpert, 1986]. This makes the issue of ENSO's response to future climate change a key question, but to date the scientific community has been unable to provide an answer. Model projections of 21st century ENSO strength disagree widely [Guilyardi et al., 2009], and the disagreement does not seem to be a function of overall model performance [Collins et al., 2010]. Given that large unforced modulations are seen in multicentury simulations [Wittenberg, 2009; Stevenson et al., 2010], the disagreement between 21st century ENSO projections created with different Intergovernmental Panel on Climate Change-class climate models may be due in large part to unforced internal variability [Stevenson et al., 2012].
 A major roadblock to constraining the ENSO response to climate change is the limited amount of available data on past ENSO variability. The modern instrumental record is too short to detect the climate change signal against the background of natural variability [Stevenson et al., 2010], and most instrumental data are used by modeling centers to “tune” the models during the development process. Even with a longer instrumental record, multimodel projections would therefore remain somewhat suspect, because the models could not be tested against out-of-sample data. To construct independent calibration and verification periods of sufficient length, the only option is to use paleoclimatic proxies to provide information predating the start of the modern record.
 The major difficulty with model ENSO validation using proxy evidence is the need for quantitative comparisons between proxy data and climate model output, which requires a proxy whose variability depends primarily on ENSO and whose behavior can be well described as a function of its environment. The oxygen isotopic ratio (δ18O) in the aragonite skeletons of tropical corals satisfies both of these requirements, providing seasonally resolved records throughout the tropical Pacific. Coral δ18O has therefore been used successfully to infer past ENSO variability [Cole et al., 1993; Dunbar et al., 1994; Charles et al., 1997; Evans et al., 1998b; Urban et al., 2000; Tudhope et al., 2001; Cobb et al., 2003; Lough, 2004; McGregor and Gagan, 2004; Lough, 2010; McGregor et al., 2011; Nurhati et al., 2011; McGregor et al., 2013]. But obstacles to using coral δ18O to evaluate model performance still remain, arising from our limited understanding of the details of the ENSO influence on δ18O at proxy sites.
 Coral δ18O depends primarily on two quantities: SST and the δ18O of seawater. The SST dependence is an inverse relationship, created by thermodynamic fractionation [Epstein et al., 1953], with estimated slopes generally between −0.18 and −0.21‰/°C (see the review of Grottoli and Eakin [2007]). The δ18O of local seawater is incorporated into the coral skeleton during growth [e.g., Gagan and Abram, 2011], which results in a direct proportionality between coral and seawater δ18O; seawater δ18O is then typically represented as proportional to SSS [LeGrande and Schmidt, 2006]. Although seawater δ18O is affected by precipitation, evaporation, and advection (see also section 7) [Fairbanks et al., 1997], enhanced rainfall normally leads to more negative δ18O values because the δ18O of precipitation tends to be much more negative than that of seawater. Thus, more negative coral δ18O values are associated with warm/wet conditions and more positive δ18O values with cold/dry conditions [Grottoli and Eakin, 2007]. However, it is possible for the influences of temperature and precipitation on δ18O to partially cancel one another in some locations [Cahyarini et al., 2008]. Other influences, such as small-scale circulations, river runoff, or orographic precipitation, may affect coral δ18O in some locations as well.
 The complexity of controls on coral δ18O makes it difficult to convert between model output and the proxy signal, a requirement for quantitative model validation. Statistical methodologies which reconstruct a climatic variable from multiple combined signals [Evans et al., 2002] are sometimes used to overcome this obstacle. However, these methods are limited by the assumption that large-scale climate variability has a stationary structure, as well as by the quality of instrumental data used to calibrate the reconstruction [Emile-Geay et al., 2013]. Thus, empirically derived relationships between coral δ18O and conditions local to each individual site are sometimes also used. In this case, the local climate-δ18O relationship is assumed constant, which places a weaker stationarity restriction on large-scale variability. Local conversions either use a “calibration,” where proxy data are converted into estimates of observations or model output, or (in the other direction) a “pseudoproxy/forward model” to estimate the proxy signal [Dunbar et al., 1994; Brown et al., 2008; Thompson et al., 2011; Carré et al., 2012; Smerdon, 2012; Phipps et al., 2013]. Although the true local relationship may not remain stationary, as the tree ring community has discovered [D'Arrigo et al., 2006], site-specific conversions incorporate the best available knowledge of controls on the proxy signal and have gained popularity recently for estimating past ENSO variability [Brown et al., 2008; Thompson et al., 2011]. However, to date, there has been no systematic quantification of the magnitudes of errors associated with pseudoproxy-based ENSO amplitude estimates—a critical task for determining the extent to which model/proxy disagreement is actually a result of model error.
 Here we use modern observations and coral records to evaluate sources of error in coral-based ENSO reconstruction. The conceptual framework is laid out in section 2 and data and methods in section 3. Errors in δ18O observations are addressed in section 4, and errors in linear pseudoproxies, including the impact of inaccuracies in observational SST and SSS products, in section 5. Section 6 then analyzes the degree to which single-site conversions can be improved by including site-specific information. Finally, suggestions for new directions for the paleoclimate community are presented in section 7.
2 Conceptual Framework
 In general, a proxy signal P (i.e., the δ18O time series) can be modeled as a function of a set of variables x (i.e., temperature and/or salinity):
$$P = f(\mathbf{x}) + \varepsilon \qquad (1)$$
where f represents the relation between P and x, and ε is the associated error (e.g., Gaussian white noise).
 Paleoclimate calibration/pseudoproxy studies often assume that the proxy signal P is a linear function of just one to two climate variables. Contributions from additional variables may be present as well and may lead to nonlinearities (for instance, advection of water masses carrying distinct δ18O signatures). The goal of quantitative model/proxy comparison is to determine the function f which describes as much of the signal P as possible.
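 The complexity of potential influences on a proxy signal can be mathematically visualized by applying a Taylor expansion to approximate the function f as a combination of its derivatives, which provides an arbitrarily accurate estimate of f by including more and more terms in the expansion. This is equivalent to deriving higher and higher-order calibration slopes for P as more becomes known about the generation of the signal. For a bivariate expansion, as is generally applied to coral δ18O, f can be estimated near the point $\mathbf{x}_0 = (x_0, y_0)$ (where $x_0, y_0$ represent mean conditions, such as SST and seawater δ18O values) using
$$\begin{aligned} f(x,y) \approx{} & f(x_0,y_0) + \left.\frac{\partial f}{\partial x}\right|_{\mathbf{x}_0}(x-x_0) + \left.\frac{\partial f}{\partial y}\right|_{\mathbf{x}_0}(y-y_0) \\ & + \left[\frac{1}{2}\left.\frac{\partial^2 f}{\partial x^2}\right|_{\mathbf{x}_0}(x-x_0)^2 + \left.\frac{\partial^2 f}{\partial x\,\partial y}\right|_{\mathbf{x}_0}(x-x_0)(y-y_0) + \frac{1}{2}\left.\frac{\partial^2 f}{\partial y^2}\right|_{\mathbf{x}_0}(y-y_0)^2 + \cdots\right] \qquad (2) \end{aligned}$$
 The upper line of (2) contains linear dependencies on x and y, and for a bivariate pseudoproxy, the derivatives $\partial f/\partial x$ and $\partial f/\partial y$ are equivalent to the SST and SSS calibration slopes. The Taylor expansion illustrates that even a bivariate relationship may contain higher-order nonlinear variability (bracketed terms in (2)), although this is typically neglected. Furthermore, x and y may not completely describe f; other variables might affect δ18O (i.e., advection, runoff).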
 In paleoclimate applications, data from multiple locations are typically combined to minimize nonclimatic influences. This can be represented mathematically as modifying (1) by an operator B:
$$P = B\left[f(\mathbf{x})\right] + \varepsilon \qquad (3)$$
where P is now the combined proxy signal. For instance, to obtain mean δ18O, B becomes an averaging operator. Alternately, to reconstruct NINO3.4 SST [Emile-Geay et al., 2013], B would then become an operator representing the “regularized expectation maximization” algorithm.
 The dominant sources of error in model/proxy conversions can be classified as follows:
 1. Nonlinearities in (2): i.e., analytical uncertainty in the proxy signal measurement, feedbacks between variables, or biological influences (section 4.1).
 2. Uncertainty in B: this might arise from temporal nonstationarity (changes to the relationship between variability at the proxy site and the signal of interest), i.e., true changes to the character of El Niño events, apparent changes due to undersampling of internal variability, or shifts in the relation between SSS and seawater δ18O due to changing water mass properties. The choice of sampling locations could also be interpreted as a change in B, where different ENSO signatures appear depending on the network of sites employed (section 4.2).
 3. Errors in observational products: uncertainties in individual in situ measurements, the gridding/interpolation process, or problems with the construction of a reanalysis product (applies only to pseudoproxies derived from gridded climate data; section 5).
 4. Too few or incorrect choices of variables used to constrain f(x): i.e., not accounting for important processes affecting the coral δ18O signal, such as local river runoff or changes to the δ18O value of precipitation (section 6).
 5. Other local influences affecting the variables x, such as upwelling or local reef circulations: i.e., the action of waves and/or tides, or the interaction of large-scale currents with subsurface island topography (insufficient information to estimate).
 Errors of types 1–4 will all contribute to noise ε; quantitative estimates of ε from various sources are therefore essential. Errors of type 5 cannot be constrained accurately at the moment, and future efforts to assess their influences are recommended.
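 In this study, δ18O is converted to climate variables via a linear pseudoproxy relationship, where the independent variables are sea surface temperature and salinity (proportional to seawater δ18O):
$$\delta^{18}\mathrm{O} = a_1\,\mathrm{SST} + a_2\,\mathrm{SSS} + \varepsilon_{\mathrm{lin}} \qquad (4)$$
Here $a_1$ and $a_2$ are the SST and SSS calibration slopes, and $\varepsilon_{\mathrm{lin}}$ refers to the uncertainties associated with the linear relationship.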
 Equation (4) follows previously adopted linear relationships [Thompson et al., 2011]. Its limitations are discussed in section 5, and potential improvements are then presented in sections 6 and 7.
3 Data and Methods
 Modern δ18O records from the tropical Pacific (23°S–23°N) are selected from the World Data Center for Paleoclimatology (WDCP); all records that meet the following criteria are included:
 Temporal resolution of ≥4 measurements/yr (seasonal),
 Correlation between coral δ18O and SST in the NINO3.4 region (5°S–5°N, 170–120°W) significant at or above the 90% confidence level.
 This leads to the selection of 15 proxy sites (Table 1). At some locations multiple records are available: two coral records were collected at Nauru [Guilderson and Schrag, 1999] and at Savusavu [Bagnato et al., 2004], three at New Caledonia [Stephans et al., 2004] and Rarotonga [Linsley et al., 2006], and six at Kiritimati [Evans et al., 1998a; Woodroffe and Gagan, 2000; Woodroffe et al., 2003; Nurhati et al., 2009; McGregor et al., 2011]. For Kiritimati, the published stack of McGregor et al. [2011] is used except where otherwise indicated. For other sites, a single core has been chosen to represent each location: core “92 PAA” from New Caledonia [Quinn et al., 1998], core “3R” from Rarotonga [Linsley et al., 2006], and core “LH” at Savusavu [Bagnato et al., 2004]. Our tests showed that results are insensitive to the choice of coral for a given site. Variations within a single site (section 4.1) are estimated using the records listed in Table 1 from New Caledonia and Rarotonga, as well as the six short δ18O records from Kiritimati Island used to construct the stacked Kiritimati time series (Table 2). For the network analysis of section 4.2 and the pseudoproxy calculations of section 5, the time period 1958–1990 is used to maximize simultaneous data availability.
Table 1. Modern (Twentieth Century) Coral Sites Used for Monte Carlo Error Estimation^a
^a The “Samples/Year” entry refers to the sampling resolution of the raw data; “N3.4/SST” and “N3.4/SSS” refer to the correlation coefficient (r) between NINO3.4 SST and SST or SSS closest to the coral location. “N3.4/δ18O” refers to the correlation between coral δ18O and NINO3.4 SST. All correlations are computed using the ERSSTv3b product (SST) or the Delcroix et al.  product (SSS). The Kiritimati record is a stacked time series, described in McGregor et al. [2011].
 Unless otherwise specified, all SST data are taken from the NOAA Extended Reconstructed SST product (ERSSTv3b) [Smith et al., 2008] and all SSS data are taken from the ship-of-opportunity database constructed by Delcroix et al. , since these products both supply grid point standard deviations. Grid points for SST and SSS data are chosen from the four nearest neighbors for each proxy site, such that the correlation with δ18O is maximized. A complete accounting of the grid points selected for each site and data product is provided in the supporting information.
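The grid point selection described above can be sketched as follows. This is a minimal illustration, not the procedure's actual implementation: the function name `pick_grid_point` and the array layout are hypothetical, and it assumes that "maximized" refers to the magnitude of the correlation, since the δ18O/SST relationship is inverse.

```python
import numpy as np

def pick_grid_point(d18o, candidates):
    """Return (index, r) of the candidate series most strongly correlated with d18o.

    d18o       : 1-D array, coral d18O on a common time axis
    candidates : 2-D array (n_candidates, n_time), e.g. SST or SSS at the
                 four grid points nearest the coral site
    Assumption: the strongest correlation is judged by |r|.
    """
    d18o = np.asarray(d18o, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    # Pearson correlation of d18O with each candidate grid point series
    r = np.array([np.corrcoef(d18o, c)[0, 1] for c in candidates])
    best = int(np.argmax(np.abs(r)))
    return best, float(r[best])
```

Calling the function once with SST candidates and once with SSS candidates gives the pair of grid points used for a site.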
 SST and SSS data were linearly interpolated to match the calendar dates associated with the coral age models. Table 1 provides correlation coefficients between NINO3.4 and grid point SST and SSS. For most sites the correlation with SST is above 0.4, and for some locations (e.g., Nauru and Tarawa) the correlation with SSS is substantial as well (see correlation maps in Figure S1). This demonstrates the feasibility of coral-based ENSO amplitude reconstructions from these sites (see section 4.2).
4 Observational Constraints
 Errors in the coral observations themselves are first examined, by separately considering issues arising from combining multiple measurements at a single site (section 4.1) and from constructing an ENSO amplitude estimate from multiple δ18O time series (section 4.2).
4.1 Signal/Age Model Errors
 Owing to the difficulty of collecting contemporaneous fossil corals from multiple locations, many coral paleoclimate studies rely on a single δ18O time series from each site. Thus, replication studies often compute estimates from present-day corals and assume that similar errors are present in fossil corals. Here an error analysis for modern corals is performed at the three tropical Pacific sites for which at least three simultaneous, seasonally resolved δ18O time series were available: Amedee Lighthouse in New Caledonia (22.5°S, 166.5°E) [Stephans et al., 2004], Rarotonga (21.2°S, 159.8°W) [Linsley et al., 2006], and Kiritimati Island (2°N, 157°W; Table 2).
 If the analytical δ18O measurement error is ignored, the dominant uncertainties in single-site δ18O variance are here referred to as the “signal” and “age model” errors; the δ18O time series contain a combination of both. Signal error consists of local (external) or biological (internal) processes unrelated to large-scale climate. External signal errors likely arise from small-scale climate influences near the reef [McGregor et al., 2011], or other factors acting on the coral (e.g., fish bites) [Linsley et al., 1999]. Internal signal errors might arise from coral vital effects (e.g., the “spawning spikes” of Evans et al. [1999]), growth rate influences [Gagan et al., 2012], or diagenesis [McGregor and Abram, 2008].
 Age model error is generated during the process of assigning calendar months to δ18O measurements within a given year [Evans et al., 1999; Felis et al., 2000; Cobb et al., 2003] and is well constrained in comparison to signal error. Age model construction is typically done by assigning at least one fixed “tie point” per year and interpolating between the tie points. Tie points may be assigned to the annual coldest month (maximum δ18O) or warmest month (minimum δ18O) [Linsley et al., 1994; Felis et al., 2000; Cobb et al., 2003]. Sr/Ca measurements are sometimes used for assigning times [DeLong et al., 2007], as are coral δ13C [Cole et al., 1993; Evans et al., 1999; Guilderson and Schrag, 1999; Tudhope et al., 2001]. During an El Niño year when the δ18O seasonal cycle is suppressed, either a constant growth rate is assumed or the density band structure is used (where available, e.g., Linsley et al. , Quinn et al. , Tudhope et al. , Cobb et al. , and McGregor et al. , among others). Season matching against instrumental data can then provide additional accuracy [Gagan et al., 1998].
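The tie-point procedure can be illustrated with a short sketch. It assumes a constant growth rate between consecutive tie points, so that dates follow from linear interpolation along the sampling track; the function name `age_model` and the use of sample position in millimeters are hypothetical choices for illustration.

```python
import numpy as np

def age_model(sample_pos, tie_pos, tie_time):
    """Assign a decimal-year date to every sample along a core.

    sample_pos : positions of the d18O samples along the sampling track
    tie_pos    : positions of the tie points (e.g. annual d18O maxima)
    tie_time   : calendar dates assigned to those tie points (decimal years)
    Assumption: growth rate is constant between consecutive tie points.
    Samples outside the tie-point range are clamped to the end dates.
    """
    sample_pos = np.asarray(sample_pos, dtype=float)
    tie_pos = np.asarray(tie_pos, dtype=float)
    tie_time = np.asarray(tie_time, dtype=float)
    order = np.argsort(tie_pos)          # np.interp needs increasing x values
    return np.interp(sample_pos, tie_pos[order], tie_time[order])

# Example: 10 mm of growth in the first year, 14 mm in the second
dates = age_model(np.arange(25), [0, 10, 24], [1980.0, 1981.0, 1982.0])
```

Because the two years contain different numbers of samples, the same tie points yield an uneven time step, which is why instrumental data are interpolated onto the coral dates rather than the reverse.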
 To combine records from a given site, the δ18O values for each record are adjusted by adopting a single δ18O time series as the “reference” and offsetting the others such that the means of the overlapping portions are identical. The adjusted time series of each individual δ18O record is shown in Figure 1 for the 1975–2005 period. The standard deviation σ measures signal and age model errors combined; it was computed by taking the variance across records as a function of time, for each month having at least two measurements, then taking the square root of the mean variance. The mean σ is 0.14‰ at Kiritimati, 0.14‰ at New Caledonia, and 0.12‰ at Rarotonga. The agreement in signal/age model error estimates between sites is remarkable and suggests that this result is quite robust. Notably, this error is of the same order of magnitude as the analytical uncertainty (0.05–0.08‰), which suggests that the signal error may be due in large part to the δ18O measurements themselves, or to sampling errors introduced during the construction of the δ18O time series [Alibert and Kinsley, 2008; McGregor et al., 2011; DeLong et al., 2013].
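A minimal sketch of the two steps in this paragraph (mean-offset alignment to a reference record, then the pooled σ) is given below. The function names and the NaN-padded array layout are hypothetical, and the use of the sample variance (`ddof=1`) is an assumption, since the text does not specify the normalization.

```python
import numpy as np

def align_to_reference(records, ref=0):
    """Offset each record so its mean over the overlap with the reference matches.

    records : 2-D array (n_records, n_months), NaN where a core has no data
    """
    records = np.asarray(records, dtype=float)
    out = records.copy()
    for i in range(records.shape[0]):
        overlap = ~np.isnan(records[i]) & ~np.isnan(records[ref])
        if i != ref and overlap.any():
            out[i] += records[ref, overlap].mean() - records[i, overlap].mean()
    return out

def intrasite_sigma(records):
    """Square root of the time-mean across-record variance.

    Only months with at least two measurements are used.
    Assumption: sample variance (ddof=1) across records.
    """
    records = np.asarray(records, dtype=float)
    n_obs = np.sum(~np.isnan(records), axis=0)
    keep = n_obs >= 2
    var_t = np.nanvar(records[:, keep], axis=0, ddof=1)
    return float(np.sqrt(var_t.mean()))
```

Applied to the aligned records from one site, `intrasite_sigma` returns a single value in ‰ that combines signal and age model scatter.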
 The remaining portion of the signal error most likely reflects the effects either of local circulation in the reef environment (generally minimized at the time of collection) or of other biological factors. Here both “head” and “microatoll” forms of the Porites species from Kiritimati are included (Table 2); microatolls tend to grow in much shallower water than head corals but are as sensitive as head corals to large-scale ocean conditions [McGregor et al., 2011]. The error estimates here are largely unaffected by exclusion of the microatoll corals from analysis (not pictured), suggesting that other effects, such as “spawning spikes” [Evans et al., 1999] or diagenesis [Nurhati et al., 2011; LaVigne et al., 2013], dominate. In addition, the Kiritimati δ18O records derive from all Porites growth forms, growing at different rates on different parts of the island, and have been analyzed at three different labs; the analyses are therefore quite independent. Although produced by a single team in each case, which might be expected to result in a slight reduction of error relative to Kiritimati, the Amedee and Rarotonga sites provide further verification of the Kiritimati estimates. Thus, the 0.12–0.14‰ value can be considered a reasonable first-order approximation of signal/age model uncertainties. A full attribution of sources of error would require a comprehensive comparison with in situ seawater δ18O measurements, which is not possible at present.
 We next consider the implications of intrasite δ18O offsets for the error on the “true” δ18O variance. For Kiritimati, Amedee, and Rarotonga, the site-specific signal/age model error is used; elsewhere, the mean of those three sites is adopted. The signal/age model error value is used as the uncertainty on each individual measurement, which is added as Gaussian noise to the δ18O time series to compute Monte Carlo samples (see section 5 for methods). The standard error is then calculated by taking the standard deviation of the Monte Carlo δ18O variances, which results in a range of 8–25% depending on the site (Table 3). This range approximates (but is not necessarily identical to) the true signal/age model errors, in the absence of a full replication analysis at other locations.
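The Monte Carlo propagation described here can be sketched as follows. The function name, the default number of samples, and the fixed seed are hypothetical; the noise model (independent Gaussian noise on every measurement, with standard deviation equal to the signal/age model error) follows the text.

```python
import numpy as np

def variance_standard_error(d18o, sigma, n_samples=1000, seed=0):
    """Standard error on the d18O variance from measurement-level noise.

    d18o  : 1-D array of d18O values (permil)
    sigma : signal/age model error applied to each measurement (permil)
    Returns (standard error of the variance, variance of the input series).
    """
    rng = np.random.default_rng(seed)
    d18o = np.asarray(d18o, dtype=float)
    # One row per Monte Carlo sample: the series plus independent Gaussian noise
    noise = rng.normal(0.0, sigma, size=(n_samples, d18o.size))
    mc_var = np.var(d18o + noise, axis=1, ddof=1)
    return float(np.std(mc_var, ddof=1)), float(np.var(d18o, ddof=1))

# Usage: express the standard error as a percentage of the variance
# se, var = variance_standard_error(series, 0.13)
# percent = 100.0 * se / var
```

The percentage depends on both the noise level and the variance of the series itself, which is consistent with the error varying from site to site.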
Table 3. Variance of δ18O From the Modern Samples, a Comparison of That Variance With Signal/Age Model Influences Expressed as a Percentage, and the Associated Standard Error on the Variance Resulting From Propagation of the Signal/Age Model Error (σage)^a
^a For Kiritimati, Amedee, and Rarotonga, the signal/age model error for that site is used (σage=0.14, 0.14 and 0.12‰, respectively). For all other sites, the mean of the three signal/age model error estimates is applied.
[Table 3 body not recovered. Surviving row labels: New Caledonia (Amedee); Vanuatu (Malo Channel); Vanuatu (Sabine Bank).]
4.2 Multiple Site Combination
 Combining the δ18O signal from multiple locations is often used to help mitigate local uncertainties; this is equivalent to applying the combination operator B in (3). A common choice of B is the first principal component, or PC1, as this captures the dominant covarying signal across sites. For the coral sites in Table 1, the relationship between PC1 of coral δ18O and ENSO is verified in Figure 2. The PC1 time series has a correlation coefficient of −0.62 with NINO3.4 SST, much stronger than the correlation between NINO3.4 and any other δ18O principal component (not pictured). Therefore, PC1 is presumed to contain the largest proportion of ENSO-related variability.
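As an illustration of using PC1 as the combination operator B, the sketch below computes the leading principal component of a multisite δ18O matrix with a singular value decomposition. The function name `pc1` is hypothetical, and the choice to remove only the time mean at each site (without standardizing) is an assumption; the text does not state the preprocessing.

```python
import numpy as np

def pc1(data):
    """Leading principal component of a (n_time, n_sites) data matrix.

    Returns (PC1 time series, fraction of total variance explained).
    Assumption: each site has its time mean removed but is not standardized.
    The sign of a principal component is arbitrary.
    """
    x = np.asarray(data, dtype=float)
    x = x - x.mean(axis=0)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return u[:, 0] * s[0], float(s[0] ** 2 / np.sum(s ** 2))

# Usage: correlate PC1 with an ENSO index sampled on the same time axis
# pc, frac = pc1(d18o_matrix)
# r = np.corrcoef(pc, nino34)[0, 1]
```

Because the sign of PC1 is arbitrary, only the magnitude of its correlation with NINO3.4 SST is meaningful unless the sign is fixed by convention.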
 First, the contribution of signal/age model errors (section 4.1) to uncertainties on PC1 is considered. This is done by drawing values randomly from a Gaussian PDF with zero mean and a standard deviation equal to the signal/age model error, then adding them to the input δ18O time series for each Monte Carlo sample; the PC1 is then recalculated. The red envelope in Figure 3a shows the resulting scatter in the PC1 power spectrum, and the major spectral features in δ18O PC1 remain clearly identifiable.
 Next, dating uncertainties are considered; these are negligible for modern (living) corals, which can be compared directly with observations [Evans et al., 1999], but can become large for corals that are dead when collected (for example, in situ fossil corals or storm-washed coral boulders). Unbroken fossil corals overlapping with observations can be dated as accurately as living corals, given additional constraints (e.g., U/Th dating). However, any gaps within a given core will create dating uncertainties which increase with time [Alibert and Kinsley, 2008; DeLong et al., 2012]. In such cases, as well as for cores which do not overlap with observations, radiometric dating must be used to estimate the absolute age of the coral. For the past decade, errors of ±1% [Cobb et al., 2003] have been achievable for high-resolution U/Th dating of “young” fossil corals, which leads to uncertainties of 5–10 years [Zhao et al., 2009]. However, new approaches are able to achieve smaller errors [Shen et al., 2008; Cheng et al., 2013].
 Here temporal offsets representing “typical” dating errors are applied to each δ18O time series according to a uniform distribution prior to performing the principal component analysis. Two values are used: 5 years, appropriate for young corals from the past few centuries, and 10 years, appropriate for older corals (e.g., mid-Holocene samples) [McGregor et al., 2013]. The resulting change in the PC1 spectrum is shown in Figure 3b and is relatively small, an encouraging indication for future reconstruction efforts. Note that a reduction in the mean interannual variance does occur (on the order of 11% for a 5 year error), but since not all simulated time series are offset by the maximum dating error, the majority of variance is retained. The variance reduction is larger for a 10 year error, an average of 14% less than the input δ18O PC1 variance, but the major spectral features remain clearly visible.
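One way to sketch this dating-error experiment is shown below. It is an illustration under stated assumptions rather than the analysis code: offsets are drawn as whole months from a uniform distribution on [−max_shift, +max_shift], each site is shifted independently, and the analysis window is shortened so that every shifted series stays within the available data. The function names are hypothetical.

```python
import numpy as np

def pc1_variance(x):
    """Variance of PC1: the largest eigenvalue of the site covariance matrix."""
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    return float(np.linalg.eigvalsh(cov)[-1])

def shifted_pc1_variance(data, max_shift, n_samples=200, seed=0):
    """PC1 variance after applying random dating offsets to each site.

    data      : (n_time, n_sites) monthly d18O series
    max_shift : largest dating offset, in months (uniformly distributed)
    Returns an array of n_samples PC1 variances.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n_time, n_sites = data.shape
    n_win = n_time - 2 * max_shift      # window valid for any allowed shift
    out = np.empty(n_samples)
    for i in range(n_samples):
        shifts = rng.integers(-max_shift, max_shift + 1, size=n_sites)
        cols = [data[max_shift + k: max_shift + k + n_win, j]
                for j, k in enumerate(shifts)]
        out[i] = pc1_variance(np.column_stack(cols))
    return out
```

Comparing the mean of the returned values with the PC1 variance of the unshifted window gives the fractional variance reduction attributable to dating error.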
 The number of sites used to reconstruct ENSO is also important, as using only a few locations will underestimate the amplitude of the basin-scale signal. Coral proxy network construction was studied in detail by Evans et al. [1998b], who found that central and eastern Pacific sites added the most skill and that the first six to seven sites achieved half of the total error reduction. A later study by Evans et al.  further illustrated that the covarying δ18O modes do represent both ENSO and the twentieth century global warming trend. But there has been relatively little analysis of how the amplitude and character of the covarying δ18O signal is reproduced using small subsamples of tropical Pacific locations, as may be the case for fossil coral applications. To that end, Figures 4a–4d show PC1 spectra for samples of varying sizes drawn from the set of tropical sites. The power in δ18O PC1 is systematically underestimated, but the accuracy increases rapidly with sample size. Using four to five corals captures roughly 30–40% of the total PC1 variability and 90–100% is captured when 9–10 of the 11 corals are retained in the subsample (Figure 4e).
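The subsampling experiment can be sketched by computing PC1 variance for every subset of a given size and comparing it with the full-network value. The function names are hypothetical, and enumerating all subsets (rather than drawing random subsamples) is a simplification that is practical only for small networks.

```python
import numpy as np
from itertools import combinations

def pc1_variance(x):
    """Variance of PC1: the largest eigenvalue of the site covariance matrix."""
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    return float(np.linalg.eigvalsh(cov)[-1])

def subsample_fraction(data, k):
    """Mean ratio of PC1 variance from k-site subsets to the all-site value.

    data : (n_time, n_sites) d18O matrix
    k    : number of sites retained in each subsample (1 <= k <= n_sites)
    """
    data = np.asarray(data, dtype=float)
    full = pc1_variance(data)
    n_sites = data.shape[1]
    ratios = [pc1_variance(data[:, list(idx)]) / full
              for idx in combinations(range(n_sites), k)]
    return float(np.mean(ratios))
```

The leading eigenvalue of a covariance submatrix cannot exceed that of the full matrix, so the returned fraction is at most 1 and does not decrease as sites are added. This is the systematic underestimate described in the text.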
 To illustrate the effect of site selection on the character of the reconstructed ENSO, the locations in Table 1 are split into several categories:
High CC: high correlation between NINO3.4 SSTA and δ18O (here “CC” refers to “correlation coefficient”). This sample includes Savusavu, Palmyra, Kiritimati, and Vanuatu (Malo Channel).
East/Central: locations directly influenced by the equatorial cold tongue in the eastern/central Pacific. This sample includes Clipperton, Secas, Palmyra, and Kiritimati.
SPCZ: locations directly influenced by the South Pacific Convergence Zone. This sample includes Vanuatu (both sites), Savusavu, New Caledonia, and Rarotonga.
Warm Pool: locations in the western Pacific warm pool. This sample includes Tarawa, Maiana, Laing, Madang, Nauru, and Bunaken.
 The PC1 of δ18O from each sample is correlated with basin-wide SST anomaly from HadISST [Rayner et al., 2003] and shown in Figure 5 (left column). The canonical El Niño “horseshoe” is detected by the Warm Pool, SPCZ, East/Central, and High CC samples and is relatively insensitive to sample size (not shown). Thus, reconstruction of eastern Pacific-type El Niño events seems feasible given four to five δ18O records.
 The correlation of SSTA with PC2 of δ18O is shown in Figure 5 (right column). Most samples show a correlation structure resembling PC1, indicating that eastern Pacific El Niño-like variability may be split between δ18O PC modes. But remarkably, in the High CC case (Figure 5d), a distinctive pattern resembling the “El Niño Modoki” of Ashok et al.  appears. The central Pacific sites Palmyra and Kiritimati appear in both the High CC and East/Central samples; the greater sensitivity of the High CC sample to “Modoki-like” ENSO variability thus likely derives from the SPCZ-influenced sites Savusavu and Vanuatu. The SPCZ migrates substantially during “Modoki-type” events, and the associated salinity anomalies are extremely large near Savusavu and Vanuatu [Singh et al., 2011]. In contrast, Modoki-like salinity signals at Palmyra are quite weak, which is thought to be a consequence of anomalous mixing due to enhanced wind stress [Nurhati et al., 2011]. Although the present analysis does not allow attribution of the temperature versus salinity-driven portions of δ18O variability, we hypothesize that such differences in salinity sensitivities could be responsible for the ability of the High CC sample to effectively identify ENSO variability centered near the dateline.
 These results indicate that effective reconstruction of the complete continuum of “canonical” to Modoki-like ENSO variability [Ray and Giese, 2012] requires a combination of records from both the central equatorial Pacific and other locations. An important caveat here is that only a limited subset of locations was used; extending this analysis to include additional sites will provide an improved understanding of the effects of sample construction.
5 Linear Pseudoproxies
 Excluding observational considerations, the most important factor in paleo-ENSO reconstruction is the conversion between δ18O and climate variables (e.g., SST and SSS). There is a wealth of literature on calibrating proxy data against local climate (see the review by Grottoli and Eakin [2007]), and for coral δ18O, the temperature fractionation effect [Epstein et al., 1953] and dependence on seawater δ18O [Fairbanks et al., 1997; LeGrande and Schmidt, 2006] have been studied in detail. What is still missing is a detailed examination of the degree to which uncertainties in climate calibrations introduce errors in forward-modeled proxy signals. The pseudoproxy calculations here are therefore designed to make maximal use of existing knowledge of the controls on coral δ18O, while also minimizing the errors in bivariate linear pseudoproxies of the type adopted by Brown et al. [2008] and Thompson et al. [2011].
5.1 Pseudoproxy Formulation
 The δ18O temperature dependence has been extensively examined in the literature [Epstein et al., 1953; Correge, 2006]. Likewise, the relationship of seawater δ18O with salinity has been previously studied [Fairbanks et al., 1997], and basin-scale calibrations have been derived for all world oceans [LeGrande and Schmidt, 2006]. We first take advantage of these existing calibration studies by applying specified regression coefficients to instrumental SST and SSS in (4), hereafter the “fixed-slope pseudo-δ18O.” This is the approach used by Thompson et al. [2011] and represents the best available knowledge of the temperature and salinity sensitivities of δ18O.
 The δ18O/SST and δ18O/SSS calibration slopes vary significantly from site to site (e.g., Correge [2006], for δ18O/SST). Here we adopt a δ18O/SST sensitivity of −0.18±0.04‰/°C, where the −0.18‰/°C coefficient represents the best estimate from the multicoral synthesis of Gagan et al.  and the 0.04‰/°C uncertainty reflects an average spread due to growth rate influences. This allows the calibration slope range to include the −0.21‰/°C slope sometimes adopted by other multisite analyses [Grottoli and Eakin, 2007].
 The errors on our δ18O/SSS slope are larger than those used in the Thompson et al. [2011] study, since δ18O/SSS slopes are derived from extremely sparse paired measurements of salinity and seawater δ18O, which may in turn reflect possible nonlinear effects from precipitation/evaporation and oceanic advection (see sections 6 and 7). The LeGrande and Schmidt [2006] data set gives δ18O/SSS slopes of 0.27 and 0.45‰/practical salinity unit (psu) for the tropical and South Pacific, respectively. In the absence of a well-constrained site-specific δ18O/SSS slope, the value for any given Pacific location may therefore be expected to lie somewhere between these values. To best approximate the true sensitivity range, a value of 0.36±0.09‰/psu is adopted.
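The fixed-slope calculation can be illustrated with a minimal sketch. It assumes that (4) is a linear combination of SST and SSS and that the inputs are anomalies, so that the intercept can be omitted; the function names and the input values are hypothetical and are not data from this study.

```python
# Adopted sensitivities quoted in the text: best estimate and 1-sigma uncertainty.
BETA_T, BETA_T_ERR = -0.18, 0.04  # permil per degC
BETA_S, BETA_S_ERR = 0.36, 0.09   # permil per psu


def fixed_slope_pseudo_d18o(sst_anom, sss_anom, beta_t=BETA_T, beta_s=BETA_S):
    """Return pseudo-d18O anomalies (permil) from SST (degC) and SSS (psu) anomalies.

    Assumes a linear form for equation (4); the intercept is omitted because
    the inputs are treated as anomalies.
    """
    if len(sst_anom) != len(sss_anom):
        raise ValueError("SST and SSS series must have the same length")
    return [beta_t * t + beta_s * s for t, s in zip(sst_anom, sss_anom)]


def slope_in_range(value, best, err):
    """True if a published slope falls inside the adopted best +/- err range."""
    return best - err <= value <= best + err


# Hypothetical anomalies, for illustration only.
sst = [1.0, -0.5, 0.0]
sss = [-0.2, 0.1, 0.0]
pseudo = fixed_slope_pseudo_d18o(sst, sss)
```

In this sketch a 1°C warming combined with a 0.2 psu freshening lowers pseudo-δ18O by 0.18 + 0.072 = 0.252‰, and `slope_in_range(-0.21, BETA_T, BETA_T_ERR)` confirms that the −0.21‰/°C slope lies within the adopted range.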
 Although the fixed-slope pseudo-δ18O calculation is “optimal” in the sense that it most strongly leverages known SST and SSS calibration data, this approach will not necessarily provide the most accurate climate/δ18O conversions on a site-by-site basis. Deriving site-specific relationships by least squares error minimization gives a smaller overall δ18O error than the fixed-slope method. For comparison, therefore, individual fits of (4) to each δ18O time series are performed, and the resulting values are referred to as the “best fit pseudo-δ18O” time series. The best fit coefficients are calculated using a stepwise linear regression algorithm [Venables and Ripley, 2002] and are presented in Table 4. The best fit approach gives the linear pseudoproxy approximation the “best chance” of capturing ENSO amplitude accurately. If even the linear relationship chosen to maximize the δ18O variance explained cannot provide an accurate prediction of that variance (as section 5.2 will demonstrate), then this is strong evidence that the relationship in (4) is insufficient for accurate δ18O/climate conversion.
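A site-specific fit of this kind can be sketched as follows. This is an ordinary least squares fit that always retains both SST and SSS; it is not the stepwise algorithm of Venables and Ripley [2002], which may drop a predictor. It also assumes that (4) has the linear form δ18O = β0 + βT·SST + βS·SSS. The function name and the demonstration values are hypothetical.

```python
import statistics


def fit_bivariate(d18o, sst, sss):
    """Ordinary least squares fit of d18O = b0 + bT*SST + bS*SSS.

    Returns (b0, bT, bS, adjusted_r2). Both predictors are always retained.
    """
    n = len(d18o)
    if not (n == len(sst) == len(sss)) or n < 4:
        raise ValueError("need three equal-length series with at least 4 points")
    my, mt, ms = statistics.fmean(d18o), statistics.fmean(sst), statistics.fmean(sss)
    # Centered sums of squares and cross products.
    s_tt = sum((t - mt) ** 2 for t in sst)
    s_ss = sum((s - ms) ** 2 for s in sss)
    s_ts = sum((t - mt) * (s - ms) for t, s in zip(sst, sss))
    s_ty = sum((t - mt) * (y - my) for t, y in zip(sst, d18o))
    s_sy = sum((s - ms) * (y - my) for s, y in zip(sss, d18o))
    det = s_tt * s_ss - s_ts ** 2
    if det == 0:
        raise ValueError("SST and SSS are collinear; slopes are not identifiable")
    # Solution of the 2x2 normal equations for the two slopes.
    b_t = (s_ss * s_ty - s_ts * s_sy) / det
    b_s = (s_tt * s_sy - s_ts * s_ty) / det
    b0 = my - b_t * mt - b_s * ms
    ss_tot = sum((y - my) ** 2 for y in d18o)
    if ss_tot == 0:
        raise ValueError("d18O series has zero variance")
    ss_res = sum((y - (b0 + b_t * t + b_s * s)) ** 2
                 for y, t, s in zip(d18o, sst, sss))
    r2 = 1.0 - ss_res / ss_tot
    n_predictors = 2
    # Adjusted R^2 penalizes the fit for the number of predictors used.
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return b0, b_t, b_s, adjusted_r2


# Hypothetical noise-free series, for illustration only.
sst_demo = [0.0, 1.0, 2.0, 3.0, 4.0]
sss_demo = [1.0, 0.0, 2.0, 0.5, 1.5]
d18o_demo = [0.1 - 0.2 * t + 0.3 * s for t, s in zip(sst_demo, sss_demo)]
b0, b_t, b_s, adj_r2 = fit_bivariate(d18o_demo, sst_demo, sss_demo)
```

Because the demonstration series contains no noise, the fit recovers the coefficients used to build it and the adjusted R² is effectively 1; with real coral data the adjusted R² would be lower.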
Table 4. Fit Statistics for Conversion From Climate Variables to δ18O
Fit parameters listed are the result of a stepwise regression of δ18O on SST and SSS (see equation (4) in the main text). The adjusted R² in the last column is a modified form of the coefficient of determination that penalizes fits containing additional independent variables. En dashes (–) indicate that a variable was not included in the best fit regression.
5.2 Pseudoproxy Error Estimation
 For both the fixed-slope and best fit pseudo-δ18O, errors are estimated using a Monte Carlo approach. The observational uncertainties are computed by sampling from a Gaussian distribution with the appropriate signal/age model error: for sites with only a single core, the average error from the Kiritimati, New Caledonia, and Rarotonga analyses is used, and elsewhere, the measured errors are applied. Uncertainties in SST and SSS are computed based on the error estimates supplied with the ERSSTv3b and Delcroix et al. [2011] products, respectively. Errors in δ18O, SST, and SSS are all added prior to the linear fit.
 Fits are computed for each biased Monte Carlo sample, and the associated residual errors are computed for each location individually. We allow the slope to vary: in the fixed-slope case, the slopes βT and βS follow a normal distribution with standard deviation equal to the adopted error for the SST and SSS slopes (0.04‰/°C and 0.09‰/psu, respectively). In the best fit case, the slopes are randomized according to the standard errors on βT and βS returned from the least squares algorithm. This approach may underestimate the magnitude of fit errors relative to more sophisticated “errors in variables” algorithms [Mann et al., 2008], but given that errors in the slope do not contribute substantially to the overall uncertainty (not pictured), this should not greatly impact the result.
 For each Monte Carlo sample, since the associated best fit intercept (β0′) will change, the regression intercepts for each location are recalculated based on the randomly generated slopes βT′ and βS′ by minimizing the least squares equation:
where N is the number of observations and δ18O, T, and S are the values obtained after applying observational uncertainties. Biased pseudo-δ18O time series are then constructed by adding the pseudo-δ18O values predicted from (5) to an error time series drawn from the probability density function of the fit residuals. The variance of each simulated time series is then computed, and the standard deviation of the simulated variances adopted as the error on the δ18O variance.
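The Monte Carlo procedure described above can be sketched as follows, under stated assumptions. The sketch assumes a linear form for (4); for fixed slopes, the intercept that minimizes the sum of squared residuals is then the mean of δ18O − βT′·T − βS′·S. Drawing the error time series by resampling the fit residuals with replacement is one possible reading of "drawn from the probability density function of the fit residuals". The function name, argument names, and default sample count are hypothetical.

```python
import random
import statistics


def mc_variance_error(d18o, sst, sss, beta_t, beta_t_err, beta_s, beta_s_err,
                      d18o_err, sst_err, sss_err, n_samples=1000, seed=0):
    """Return (mean, standard deviation) of simulated pseudo-d18O variances.

    The standard deviation is the quantity adopted as the error on the
    d18O variance in the procedure described in the text.
    """
    if n_samples < 2:
        raise ValueError("need at least two Monte Carlo samples")
    rng = random.Random(seed)
    variances = []
    for _ in range(n_samples):
        # Add Gaussian observational errors before fitting.
        d = [x + rng.gauss(0.0, d18o_err) for x in d18o]
        t = [x + rng.gauss(0.0, sst_err) for x in sst]
        s = [x + rng.gauss(0.0, sss_err) for x in sss]
        # Randomize the slopes about their adopted values.
        b_t = rng.gauss(beta_t, beta_t_err)
        b_s = rng.gauss(beta_s, beta_s_err)
        # For fixed slopes, the least squares intercept is the mean of the
        # partial residuals (assumes a linear form for equation (4)).
        partial = [di - b_t * ti - b_s * si for di, ti, si in zip(d, t, s)]
        b0 = statistics.fmean(partial)
        residuals = [p - b0 for p in partial]
        # Assumption: resample residuals with replacement as the error series.
        noise = rng.choices(residuals, k=len(residuals))
        simulated = [b0 + b_t * ti + b_s * si + e
                     for ti, si, e in zip(t, s, noise)]
        variances.append(statistics.pvariance(simulated))
    return statistics.fmean(variances), statistics.stdev(variances)
```

Dividing the returned standard deviation by the variance of the original δ18O series gives a fractional error of the kind reported in percent in the text. A parametric alternative would draw the noise from a Gaussian with the residual standard deviation instead of resampling.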
 Errors in pseudo-δ18O variance are reported in Figure 6b for the fixed-slope and best fit cases. These errors are quite large, ranging roughly from 37 to 75% of the original δ18O variance. The magnitude of the errors indicates that the corresponding ENSO amplitude estimates from linear pseudoproxies will also be highly uncertain, despite the large correlations between δ18O and ENSO [Brown et al., 2008; Thompson et al., 2011]. We next combine multiple δ18O signals using PC1 in an effort to mitigate the errors depicted in Figure 6; errors on δ18O PC1 calculated from the Monte Carlo δ18O time series are given in Figure 7. As for the single-site pseudo-δ18O variances, computing PC1 does not provide an accurate estimate of the total degree of variability. The pseudo-δ18O PC1 underestimates the magnitude of the peak in δ18O PC1 near 3.5 years, although the peak can still be visually identified. However, the large errors in δ18O PC1 indicate that there are systematic errors associated with the pseudoproxy conversion, which prevent averaging from eliminating errors in multisite ENSO amplitude estimation. Accounting for the conversion errors appropriately is the reason why the errors in Figure 6 are so much larger than previous error estimates in coral δ18O-based ENSO amplitude [Hereid et al., 2013].
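For readers unfamiliar with the multisite step, the leading principal component of a set of site time series can be sketched as below. Power iteration on the site covariance matrix is used here purely to keep the example dependency-free; it stands in for whatever PCA routine was actually used and is not taken from this study. The function name and inputs are hypothetical.

```python
import math


def leading_pc(series_by_site, n_iter=500):
    """Return (loadings, scores, explained_fraction) for PC1.

    series_by_site: list of equal-length lists, one time series per site.
    """
    k = len(series_by_site)
    n = len(series_by_site[0])
    if k < 2 or n < 2 or any(len(s) != n for s in series_by_site):
        raise ValueError("need at least two equal-length series")
    centered = []
    for s in series_by_site:
        mean = sum(s) / n
        centered.append([x - mean for x in s])
    # Sample covariance matrix between sites.
    cov = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
            for j in range(k)] for i in range(k)]
    # Asymmetric starting vector; a start exactly orthogonal to the leading
    # eigenvector would fail, which is unlikely but possible.
    v = [1.0 + 0.1 * i for i in range(k)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(k))
                     for i in range(k))
    total = sum(cov[i][i] for i in range(k))
    scores = [sum(v[i] * centered[i][t] for i in range(k)) for t in range(n)]
    explained = eigenvalue / total if total > 0 else 0.0
    return v, scores, explained
```

Applying this function to each set of Monte Carlo pseudo-δ18O series and examining the spread of the resulting PC1 scores is one way to propagate conversion errors into a multisite estimate.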
 The large errors shown in Figure 6 may result from the observational δ18O errors discussed in section 4. However, there is still the possibility that gridded SST and SSS products do not provide sufficiently accurate information to reconstruct the coral δ18O signal, either because of inaccuracies in the SST or SSS measurements or because of subgrid-scale influences on SST and/or SSS. A good example of the profound implications of differences between observational products is the contrast between two recent studies: Solomon and Newman, who removed the ENSO signal from twentieth century data and concluded that there was no trend in the residual Walker circulation, and Tokinaga et al., who showed that a twentieth century weakening of the Walker circulation could be masked by biases in surface wind data sets. Similarly, twentieth century Pacific SST trends were shown by Deser et al. to differ between data products, not only in magnitude but also in sign; the controversies over the twentieth century instrumental record clearly have yet to be resolved.
 A rough idea of the contribution of SST and SSS uncertainties to δ18O conversion errors may be gained by examining the effects of differences between observational SST and SSS products on the resulting pseudo-δ18O. This is shown in Figure 8 using the HadISST [Rayner et al., 2003] and ERSSTv3b [Smith et al., 2008] SST data sets, and the Delcroix ship-of-opportunity [Delcroix et al., 2011] and ORA-S4 reanalysis [Balmaseda et al., 2012] SSS data sets, which results in four combinations of data sets: HadISST/Delcroix, HadISST/ORA, ERSST/Delcroix, and ERSST/ORA. The magnitude of errors is estimated as the mean variance between the four pseudo-δ18O time series (Table 5). In the fixed-slope case, the values are on the order of 0.001–0.01‰² for most sites (or in standard deviation units, 0.03–0.1‰), comparable to signal/age model errors but small compared with the errors in pseudo-δ18O conversions.
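The comparison between data products can be sketched as follows. The sketch interprets "mean variance between the four pseudo-δ18O time series" as the variance across the four product combinations at each time step, averaged over time; that interpretation, the use of the population variance, the function name, and the input values are all assumptions made for illustration.

```python
from itertools import product
import statistics


def inter_product_variance(sst_products, sss_products, beta_t=-0.18, beta_s=0.36):
    """Return (series, mean_variance) for all SST/SSS product combinations.

    sst_products, sss_products: dicts mapping a product name to a list of
    anomalies; all lists must have the same length.
    """
    series = {}
    for (t_name, t), (s_name, s) in product(sst_products.items(),
                                            sss_products.items()):
        if len(t) != len(s):
            raise ValueError("all product series must have the same length")
        # Fixed-slope pseudo-d18O for this pair of products.
        series[f"{t_name}/{s_name}"] = [beta_t * a + beta_s * b
                                        for a, b in zip(t, s)]
    # Variance across combinations at each time step, then the time mean.
    per_time = [statistics.pvariance(vals) for vals in zip(*series.values())]
    return series, statistics.fmean(per_time)


# Hypothetical anomalies, for illustration only (not the actual products).
sst_demo = {"HadISST": [0.0, 1.0], "ERSST": [0.1, 1.1]}
sss_demo = {"Delcroix": [0.0, 0.0], "ORA": [0.0, 0.0]}
combos, mean_var = inter_product_variance(sst_demo, sss_demo)
```

In this toy case a constant 0.1°C offset between the two SST products yields a pseudo-δ18O offset of 0.018‰ and a mean variance of 8.1 × 10⁻⁵ ‰², which shows how small product differences map onto the variance units used in Table 5.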
Table 5. Mean Variances Between Pseudo-δ18O Computed With All Possible Combinations of Data Products
A higher σ represents a larger scatter between fits to different instrumental products (less confidence in pseudo-δ18O). Mean variances are listed separately for the best fit and the fixed-slope pseudo-δ18O time series. Units are ‰².
 The time series of pseudo-δ18O in Figure 8 suggest that the linear regression of δ18O on SST and SSS does not provide a complete description of δ18O variability. A more effective pseudo-δ18O conversion might include additional processes in the conversion relationship; this is investigated in the following section.
6 Improvements to Linear δ18O Pseudoproxies
 A simple linear dependence of δ18O on grid point SST and SSS seems to be insufficient for accurate ENSO amplitude estimation. One simple candidate improvement method is to account more accurately for the seasonal cycle, which may affect conditions near the proxy site by mechanisms other than variations in SST and SSS. Two good examples of locations which are strongly influenced by seasonality are Amedee Lighthouse in New Caledonia and Secas Island in Panama. At New Caledonia, the annual cycle in δ18O is caused by seasonal SPCZ migrations [Quinn et al., 1998], which create anomalies not only in SST but in δ18O advection and rainfall as well [Delcroix and Lenormand, 1997]. In the case of Secas, Linsley et al.  concluded that the coral δ18O seasonal cycle is caused by changes to precipitation δ18O due to seasonal migration of the ITCZ; in that study, the δ18O value of precipitation was linearly related to coral δ18O according to
and seasonal variations dominated the signal far in excess of the SST influence.
 In the absence of more detailed δ18O observations, the net seasonal cycle in δ18O may be fit using sinusoids; the formulation of the fixed-slope pseudoproxy in (4) then becomes
where t is the date expressed in years and both the sine and cosine terms are retained to allow the phase of the δ18O seasonal cycle to vary. This approach is numerically equivalent to fitting a sinusoidal function to the residuals from (4); if the resulting fit is able to describe a larger proportion of the variance, this is an indication that there is a seasonally varying signal in δ18O which is not described by SST and SSS alone, such as advective/source region effects.
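A minimal sketch of such a seasonal fit is given below. It assumes that the fixed-slope-seasonal relationship has the form δ18O = β0 + βT·T + βS·S + a·sin(2πt) + b·cos(2πt), with the slopes held at their adopted values so that only β0, a, and b are estimated; this amounts to fitting a sinusoid to the residuals from (4). The coefficient names a and b, the function names, and the small linear solver are introduced here for illustration and are not taken from this study.

```python
import math
import statistics


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_fixed_slope_seasonal(d18o, sst, sss, t_years, beta_t=-0.18, beta_s=0.36):
    """Fit intercept plus annual sine and cosine terms with fixed slopes."""
    # Remove the fixed-slope SST and SSS contributions first.
    partial = [d - beta_t * T - beta_s * S for d, T, S in zip(d18o, sst, sss)]
    basis = [[1.0, math.sin(2 * math.pi * t), math.cos(2 * math.pi * t)]
             for t in t_years]
    # Normal equations for the three free coefficients.
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
           for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(basis, partial)) for i in range(3)]
    b0, a_sin, a_cos = solve_linear(ata, aty)
    fitted = [b0 + beta_t * T + beta_s * S + a_sin * r[1] + a_cos * r[2]
              for T, S, r in zip(sst, sss, basis)]
    residual_variance = statistics.pvariance(
        [d - f for d, f in zip(d18o, fitted)])
    return {"beta_0": b0, "a_sin": a_sin, "a_cos": a_cos,
            "amplitude": math.hypot(a_sin, a_cos),
            "residual_variance": residual_variance}
```

Comparing `residual_variance` from this fit with the residual variance of the fixed-slope fit alone indicates how much of the unexplained δ18O variance is seasonal; assessing whether the reduction is statistically significant requires a separate test that is not shown here.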
 The “fixed-slope-seasonal” (FSS) pseudoproxy relationship in (7) was fit to all coral time series; 9 out of 15 sites show a statistically significant contribution from the seasonal cycle, as indicated by the bold entries in Table 6. As expected, Secas and New Caledonia are among those sites that are well fitted by (7), and the resulting FSS fits for these sites are shown in Figure 9. Notably, even locations where δ18O does not show a visually obvious sinusoidal pattern (Kiritimati, Palmyra, Laing, and Madang) see improvements to pseudo-δ18O. This is consistent with the finding of McGregor et al. [2011] that the annual cycle accounts for 15% of the modern Kiritimati δ18O variance, partly related to the westward propagation of SST anomalies near the equator.
The first column shows the total variance in the time series over 1958–1985, and errors are then listed in ‰² (and in percent) for the total error (residual plus age) in the fixed-slope pseudoproxy (FS + Age) and the total error for the fixed-slope pseudoproxy including the seasonal cycle (FSS + Age). Both the FS + Age and FSS + Age columns represent 1σ uncertainties on the δ18O variance in the first column. Entries in bold indicate sites where inclusion of the seasonal cycle term results in a significant reduction in δ18O variance error. As in Table 3, the signal/age model error calculated for individual sites is applied to those sites, and for all others, the mean of the site-specific calculations is used.
 The seasonal fits included here are a simple improvement to pseudoproxy conversions and indeed are a natural extension of the Thompson et al. [2011] pseudoproxies, which used the same SST and SSS dependencies. However, it is important to note that although (7) allows improvements in δ18O variance estimates, the errors still remain at 30% or more of the input value; future work will be required to further improve model/proxy conversion.
7 Discussion and Future Recommendations
 The present study is aimed at improving quantitative model ENSO validation against coral δ18O records through the use of empirical “pseudoproxy/forward model” conversions. Although such conversions are less mathematically sophisticated than multiproxy statistical methods [Mann et al., 2009; Wilson et al., 2010], there are nonetheless advantages, namely, the relaxation of the assumption of large-scale stationarity and of the requirement for high data density. The character of ENSO is known to change from event to event [Ashok et al., 2007], affecting the strength of the covariance between ENSO and any given proxy location; empirical conversions assume stationarity only in the sensitivity of the proxy to local climate, a less restrictive assumption. Empirical conversions also have the potential to allow accurate ENSO reconstruction given a far smaller sample size than field reconstructions, which typically require a minimum of several dozen records [Mann et al., 2009], far beyond what is currently achievable for subannually resolved, synchronous fossil corals. Better local climate/δ18O conversions should thus allow existing coral records to provide much more useful information.
 The ENSO estimates produced by linear approximations such as (4) have such large errors that they are consistent with pseudo-δ18O values derived from climate model output (not pictured). In this situation, it becomes nearly impossible to use coral δ18O measurements to motivate specific improvements to model ENSO behavior, or alternately to use simulations of past climates to examine the potential causes for shifts in δ18O signals. Through an improved understanding of the mechanisms generating coral δ18O anomalies, it will become possible to construct more detailed dynamical ENSO diagnostics, and thus to improve the utility of both climate models and fossil δ18O records.
 Deriving improved forward models for δ18O is complex. The true seawater δ18O will be affected by changes to advection past the proxy location, by changes to water mass properties, by the relative amounts of precipitation and evaporation, and by the δ18O values within local precipitation (itself a function of the source region, atmospheric water vapor transport, and other fractionation processes). These effects may sometimes be difficult to detect below a given “threshold,” as observed by McGregor et al. [2011] for the precipitative influence on seawater δ18O at Kiritimati; this may contribute to errors in the linear pseudoproxies of Brown et al. [2008], Thompson et al. [2011], and Phipps et al.
 The most accurate description of δ18O based on local conditions is the δ18O budget:
where ΦP and ΦE are the fluxes of precipitation and evaporation, is the advection of seawater δ18O, and Osw, OP, and OE are the δ18O values for seawater, precipitation, and evaporation, respectively. The complexity and coupled nature of the processes involved quickly lead to this approach becoming indistinguishable from the physics in an isotope-enabled GCM. Yet an approach like (8) is ultimately the only way to correctly account for the relative importance of various influences on δ18O, since many of these effects may dominate over SST and SSS influences. Indeed, a preliminary diagnosis of (8) using ORA-S4 reanalysis data shows qualitative similarities between changes to coral δ18O and upper layer ocean advection at a variety of sites (not pictured). This highlights the potential of GCMs to provide valuable physical insights into fossil coral δ18O records, as noted recently by Russon et al. , and should serve as motivation for improving the state of coral/model comparison techniques.
 This work has assessed the errors associated with ENSO reconstruction from coral δ18O, using modern corals from 15 sites covering the period 1958–1985. The δ18O observational error at a given site (the “signal/age model error”) is found to be 0.12–0.14‰ using records from three locations: Kiritimati Island [Evans et al., 1998b; Woodroffe and Gagan, 2000; Woodroffe et al., 2003; Nurhati et al., 2009; McGregor et al., 2011], Amedee Lighthouse in New Caledonia [Stephans et al., 2004], and Rarotonga [Linsley et al., 2006]. Such intercoral offsets lead to typical variance errors of roughly 8–25%. Although substantial, signal/age model errors do not preclude the accurate estimation of the PC1 power spectrum of δ18O. Errors from dating uncertainties of ±5 to 10 years are also fairly small. The required sample size for reconstructing the majority of ENSO-related variance seems to be roughly five to seven sites, based on an analysis of δ18O PC1. However, the choice of sampling location critically affects the character of δ18O PC1 and PC2. ENSO variability with largest loading in the eastern Pacific is detected as both PC1 and PC2 by sites in the cold tongue, warm pool, and SPCZ, while a pattern resembling the Modoki El Niño appears in PC2 when cores from the warm pool and SPCZ regions are combined. Regardless of the location of the ENSO center of action, a minimum of two to four corals seems to be required.
 A Monte Carlo error analysis on linear pseudoproxies of the form used by Thompson et al. [2011] shows that the major obstacle to quantitative ENSO amplitude estimation using δ18O is its conversion to climate variables. Instrumental/grid point uncertainties in observational SST and SSS products lead to a variance of roughly 0.001–0.01‰² between pseudo-δ18O calculated with different data products, an error of the same order of magnitude as signal/age model uncertainties. The offsets between pseudo-δ18O time series seem to be largest at sites where the temperature influence does not dominate, indicating that a detailed understanding of controls on seawater δ18O will be required to improve pseudo-δ18O accuracy. Fit residuals, both using regression slopes specified from prior knowledge of the δ18O relationships with SST and SSS and from site-specific least squares estimation, result in errors of 37–75% of the original δ18O variance. These errors are so large that they dominate even the covarying modes between locations, as demonstrated by calculating the first PC using multiple linear pseudo-δ18O time series biased according to the appropriate error distributions.
 The importance of correctly representing changes to seawater δ18O (or local-scale processes) is confirmed by including a sinusoidal component with a period of 1 year in the fixed-slope linear pseudoproxy conversion. This is intended to account for the combined influence of advective and atmospheric source effects on the seasonal cycle of δ18O and results in a statistically significant improvement in conversion performance at 9 of 15 sites. However, errors remain at or above 30% for all locations.
 To improve δ18O-based ENSO reconstructions, we recommend improving environmental monitoring at coral sites with a focus on SST, SSS, ocean velocities, evaporation/relative humidity, seawater δ18O, and δ18O of precipitation. By doing so, the contribution of all of these influences to coral δ18O variations may be more accurately quantified, allowing more accurate model/proxy conversions to better illustrate the dynamical linkages between ENSO characteristics and δ18O variations near the proxy site.
 S.S. was supported by a CIRES Innovative Research Project grant for travel in Australia. This work is supported by Australian Research Council Discovery Project grant DP1092945. H.V.M. is supported by an AINSE Research Fellowship. C. Charles, A. Timmermann and A. Tudhope are gratefully acknowledged for helpful discussions which greatly improved the manuscript.