Journal of Geophysical Research: Atmospheres

Climate variability and precipitation isotope relationships in the Mediterranean region

Authors

  • M. J. Fischer,

    Corresponding author
    1. Institute for Environmental Research, Australian Nuclear Science and Technology Organisation, Lucas Heights, New South Wales, Australia
      Corresponding author: M. J. Fischer, Institute for Environmental Research, Australian Nuclear Science and Technology Organisation, Locked Bag 2001, Kirrawee DC, NSW 2232, Australia. (mjf@ansto.gov.au)
  • D. Mattey

    1. Department of Earth Sciences, Royal Holloway University London, Egham, UK

Abstract

[1] This study investigates the links between Mediterranean precipitation δ18O and Mediterranean sea level pressure (SLP) anomalies during the winter months and over the years 1960–present. Previous studies have considered only the influence of the North Atlantic Oscillation (NAO) on rainfall δ18O at Mediterranean sites, but Mediterranean winter SLP variability evolves with at least three degrees of freedom, which means that other climate patterns may be equally important in influencing Mediterranean rainfall δ18O. In this study, Multivariate Linear Regression (MLR) is employed to identify the ‘coupled patterns’ in the Mediterranean winter SLP and δ18O fields. The multivariate linear model is estimated in two different ways, using Principal Components Regression (PCR) and regularized Canonical Correlation Analysis (regCCA), resulting in two different models, which are compared. In both models, two main patterns are identified that explain 50% of the shared variance in the SLP and δ18O fields. Subspace projection of various regional and Northern Hemisphere climate indices shows that the two main patterns are more closely related to local Mediterranean climate indices than to other Northern Hemisphere climate indices. Analysis of the predicted and residual fields from the two models suggests that the regCCA model provides better predictability for rainfall δ18O at central Mediterranean sites, while both models explain relatively less of the rainfall δ18O variance at eastern Mediterranean sites. These results can potentially aid the interpretation of the climate-isotope signal preserved in high-resolution natural archives from different parts of the Mediterranean.

1. Introduction

[2] Stable water isotopes in precipitation are influenced by both local and non-local factors. Local factors include condensation temperature and precipitation amount, while non-local factors include trajectory-type effects. Because interannual variation in large-scale climate patterns such as North Atlantic Oscillation (NAO) and Southern Oscillation Index (SOI) can drive changes in these local and non-local factors, these climate patterns can have a strong impact on the variation in precipitation isotopes [Baldini et al., 2008; Tindall et al., 2009].

[3] Knowledge of the relationships between the δ18O of precipitation and climate processes is required to interpret proxy records of precipitation δ18O, preserved for example as speleothem carbonates, in terms of past climate behavior. The main issues with linking climate variability to precipitation isotope proxies are:

[4] 1. The relationships between climate modes, precipitation amount and precipitation isotopes are complex. For example, Baldini et al. [2008], using monthly precipitation δ18O data from the Global Network of Isotopes in Precipitation (GNIP), found the strongest relationship between NAO and rainfall δ18O occurred at central European GNIP stations (a positive relationship), whereas the strongest relationship between NAO and rainfall amount occurred in the western Mediterranean (a negative relationship). There was little or no relationship between NAO and δ18O from circum-Mediterranean stations [Baldini et al., 2008, Table 1].

[5] 2. A second issue is that a climate pattern that explains much of the variance in one field is not necessarily the characteristic pattern that explains most of the variance in another field. Although the NAO accounts for about one fifth of the variance in the total monthly North Atlantic sea level pressure (SLP) field, most of the variance in the European monthly precipitation field is explained not by the NAO but by a North Sea pattern [Qian et al., 2000].

[6] 3. The environments in which precipitation isotopes are ‘captured’ act as impulse response filters [e.g., Baker and Bradley, 2010; Jones and Imbers, 2010].

[7] The focus of this study is on points 1 and 2, and its purpose is to extend the work of Baldini et al. [2008] by asking the question: if Mediterranean precipitation δ18O patterns are not linked to NAO, can they be linked to other patterns of climate variability?

[8] Identifying the links between climate fields and water isotope fields requires the use of methods that are specifically aimed at finding associated patterns in two fields [Tippett et al., 2008]. Such methods include: Principal Components Regression (PCR), Maximum Covariance Analysis (MCA), and Canonical Correlation Analysis (CCA). In this study, the relationship between winter SLP (NDJFM) and δ18O in the Mediterranean Basin is examined using a modern implementation of CCA that is better suited to deal with problems such as data sparsity and data incompleteness, issues that are pertinent to both climate and water isotope data sets [Schneider, 2001] (J. Emile-Geay et al., Imputation of missing values in climate data sets: Data-adaptive regularization methods and their applications, submitted to Journal of Climate, 2011).

[9] Feliks et al. [2010] recently showed that a 210-year tree ring width series from Golan Heights (eastern Mediterranean) contains interannual variation that can be linked to the east/west pressure differences in the Mediterranean, and they suggest that the Mediterranean Oscillation (MO) and NAO are quasi-oscillatory modes that become coupled on seasonal and interannual timescales. There have been no attempts to reconstruct Mediterranean pressure oscillation fields using high-resolution δ18O records from speleothems or tree ring cellulose [Roberts et al., 2010], and so our study highlights the potential of the δ18O proxy in reconstructing major oscillation patterns in the Mediterranean.

[10] The primary aim of this paper is to examine the relationship between SLP and rainfall δ18O in the Mediterranean Region, using linear models. The data and methods are described in sections 2 and 3. The models and their residual fields are discussed in section 4, and a comparison of the models' predicted fields is made in section 5.

2. Data

[11] This paper makes use of two data sets, containing mean sea level pressure and monthly precipitation δ18O data. Both data sets span the study years 1960–2010, but the precipitation δ18O data set has non-contiguous values. Monthly mean sea level pressure data were extracted from the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) reanalysis globally archived data set with a horizontal resolution of 2.5° × 2.5° [Kalnay et al., 1996], for the region 10°W–40°E, 30–45°N. The total number of grid points in this region is 147. These data contain no missing values. Monthly precipitation δ18O data for circum-Mediterranean stations were extracted from the Global Network of Isotopes in Precipitation [International Atomic Energy Agency, 2008]. Circum-Mediterranean stations with continuous δ18O records over these years are relatively scarce: there are only 26 stations that have δ18O time series with more than 40 values (non-contiguous) in the months NDJFM and years 1960–2010. These 26 stations form the primary δ18O data investigated in this paper, and their locations are shown in Figure 1. A matrix map showing the missing values for all 26 stations is found in auxiliary material Figure S1. Since stations along the Mediterranean coast receive predominantly winter rainfall, data for the months NDJFM were extracted from the above data sets after detrending and deseasonalizing the full data sets (see below). The methods that follow are especially designed to handle the missing values in the δ18O data.
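The grid size and the station screening rule described above can be checked with a short sketch. The function name `select_stations`, the array layout (stations as rows, months as columns, NaN for missing values) and the toy data are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Grid implied by the text: 2.5-degree spacing over 10W-40E, 30-45N
lon = np.arange(-10.0, 40.0 + 1e-9, 2.5)   # 21 longitudes
lat = np.arange(30.0, 45.0 + 1e-9, 2.5)    # 7 latitudes
n_grid = lon.size * lat.size               # 21 * 7 = 147 grid points

def select_stations(d18o, month, min_values=40):
    """d18o: (n_station, n_time) monthly values with NaN = missing.
    month: (n_time,) calendar month (1-12) of each column.
    Returns a boolean mask of stations with more than min_values NDJFM values."""
    winter = np.isin(month, [11, 12, 1, 2, 3])
    counts = np.sum(~np.isnan(d18o[:, winter]), axis=1)
    return counts > min_values

# Hypothetical toy data: 3 stations, 51 years of monthly values
rng = np.random.default_rng(1)
month = np.tile(np.arange(1, 13), 51)
d18o = rng.normal(-5.0, 2.0, size=(3, month.size))
d18o[1, 60:] = np.nan                      # station 1 reports for 5 years only
keep = select_stations(d18o, month)        # station 1 has 25 winter values and is dropped
```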

Figure 1.

Map of the study area showing the location of 26 GNIP stations. The inset (bottom right) shows five stations around the Dead Sea, and their names are centered on their location. Some station names have been shortened. A matrix map of the precipitation isotope data from the 26 stations is shown in auxiliary material Figure S1.

3. Methods

3.1. Statistical Models for Coupled Fields

[12] The main aim of this study is to construct a statistical model for predicting Mediterranean δ18O (Y) from SLP (X):

\[ \mathbf{Y} = \mathbf{B}\mathbf{X} + \mathbf{E} \tag{1} \]

where X and Y are space-time anomaly fields, B is a matrix of regression coefficients, and E is the residual field. In this paper, X has a total of 147 rows as grid points in the SLP data set, and Y has a total of 26 rows as stations in the GNIP data set. The conventional methods of solving equation (1) are reviewed by Tippett et al. [2008] but all are susceptible to the problem of sparsity, i.e., where the number of observations in the space domain is relatively large compared to the number of observations in the time domain. In this section, several methods of solving equation (1) that work with sparse matrices are discussed, including Principal Components Regression (PCR) and sparse Canonical Correlation Analysis. These two methods will provide a useful comparison. The subsections that follow briefly discuss parameter estimation, inference and prediction for PCR and sparse CCA, as well as methods to investigate the residual field E.

[13] First, it is necessary to briefly discuss data preprocessing (detrending and deseasonalizing). Detrending and deseasonalizing the data is important because we want to investigate the relationship between the anomaly fields of the predictor and response variables. If the data are not detrended and deseasonalized, then it is possible that the extracted coupled patterns and linking relationships will contain a mixed signal (e.g., that due to trend + seasonal variance + anomaly in X and Y), rather than a pure signal (due to the anomalies only). The seasonal and trend patterns in the data were simultaneously fitted and removed using Multivariate Linear Regression (MLR); see Appendix A. Thus, in all that follows, X and Y are anomaly fields.
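As a rough illustration of fitting and removing trend and seasonal terms in a single regression, the sketch below fits an intercept, a linear trend and annual harmonics to one monthly series by least squares. Appendix A is not reproduced here, so the choice of harmonics, the function name `remove_trend_and_season` and the synthetic series are assumptions made for this example only.

```python
import numpy as np

def remove_trend_and_season(y, n_harmonics=2):
    """Fit intercept + linear trend + annual harmonics to a monthly series by
    least squares (using only non-missing values) and return the anomalies."""
    y = np.asarray(y, dtype=float)
    t = np.arange(y.size, dtype=float)             # time in months
    cols = [np.ones_like(t), t]
    for k in range(1, n_harmonics + 1):            # annual cycle and its harmonics
        cols.append(np.sin(2 * np.pi * k * t / 12.0))
        cols.append(np.cos(2 * np.pi * k * t / 12.0))
    design = np.column_stack(cols)
    ok = ~np.isnan(y)                              # missing months are skipped in the fit
    coef, *_ = np.linalg.lstsq(design[ok], y[ok], rcond=None)
    return y - design @ coef                       # NaNs stay NaN in the anomalies

# Synthetic example: trend + seasonal cycle + noise, with some gaps
rng = np.random.default_rng(0)
t = np.arange(240)
series = 0.01 * t + 2.0 * np.cos(2 * np.pi * t / 12.0) + rng.normal(0, 0.1, t.size)
series[::17] = np.nan
anom = remove_trend_and_season(series)
```

Fitting both terms at once avoids the trend estimate being biased by an incomplete seasonal cycle at the ends of a gappy record, which is one plausible motivation for the simultaneous fit described above.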

[14] The general component model for both PCR and CCA is:

\[ \mathbf{X} = \sum_{m=1}^{M} \mathbf{b}_X^{(m)} \mathbf{a}_X^{(m)} \tag{2a} \]
\[ \mathbf{Y} = \sum_{m=1}^{M} \mathbf{b}_Y^{(m)} \mathbf{a}_Y^{(m)} \tag{2b} \]

where X, Y are the anomaly matrices from equations (A1) (Appendix A), bX and bY are fields (spatial patterns, written here as column vectors) that tend to occur together, and aX and aY are their associated time series of amplitude (written here as row vectors). There are M patterns. In both PCR and CCA, the time vectors share a maximum correlation for same m, and are orthogonal for different m. The difference between the two methods lies in how the time vectors and spatial patterns are estimated. Note that prior to PCR and CCA, the rows of X and Y are scaled to unit variance and area-weighted by the square root of cos(latitude) [Baldwin et al., 2009].
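The row scaling mentioned in the note above might look like the following sketch. The function name `standardize_and_weight` and the toy field are hypothetical; the only elements taken from the text are the unit-variance scaling and the sqrt(cos(latitude)) weight.

```python
import numpy as np

def standardize_and_weight(field, lat_deg):
    """field: (n_space, n_time) anomaly matrix, NaN = missing.
    lat_deg: latitude of each row in degrees.
    Scales each row to unit variance, then applies sqrt(cos(latitude)) weights."""
    field = np.asarray(field, dtype=float)
    sd = np.nanstd(field, axis=1, ddof=1, keepdims=True)    # per-row standard deviation
    w = np.sqrt(np.cos(np.deg2rad(lat_deg)))[:, None]       # area weights
    return field / sd * w

# Toy field with three rows of very different variance and one missing value
rng = np.random.default_rng(2)
lat = np.array([30.0, 37.5, 45.0])
field = rng.normal(0.0, [[1.0], [5.0], [0.2]], size=(3, 200))
field[0, 5] = np.nan
scaled = standardize_and_weight(field, lat)
```

After scaling, the variance of each row equals cos(latitude), so that covariance-based analyses weight each grid point by the area it represents.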

3.2. Principal Component Regression

[15] In PCR, the patterns and their amplitude time series are estimated using a two-step procedure. In the first step, bX and aX are found by the principal components analysis (PCA) of X; these spatial patterns and their amplitude time series are referred to as empirical orthogonal functions (EOFs) and principal components (PCs) respectively [e.g., Monahan et al., 2009]. In the second step, the patterns bY are found using MLR, with Y as the response matrix and by assuming aY = aX. Tests of the significance of field patterns are complicated by issues such as sparsity and incomplete data [DelSole and Yang, 2011]. A simple test is to employ the standard t-test for the coefficients of a single-response multipredictor but reduced-rank regression. Here the reduced-rank regressions are formed from equation (2b), for each s (e.g., s = 1, 2, 3 …). Although this test does not account for the spatial correlation in Y, or for the multiplicity of tests (for many stations), it does provide a guide to the importance of coefficients that can be applied in the case of sparse, incomplete data. The standard t-test for a single-response multipredictor regression is:

\[ t = \frac{b}{se_b}, \qquad se_b = \sqrt{\sigma^2 \, \mathrm{diag}\!\left[ (\mathbf{a}\mathbf{a}^{\mathsf{T}})^{-1} \right]} \tag{3} \]

where t is the test statistic, seb = standard error of a parameter, σ2 is the mean residual sum of squares, and a is the predictor matrix. Last, predicting the response field Y from the predictor field X is equivalent to multiplying X by the product bY WX (i.e., projecting X onto its EOFs, and multiplying the component time series by the MLR coefficients for each site).
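The two PCR steps and the t-test of equation (3) can be sketched for a single station as follows. This is a minimal illustration under stated assumptions: the PCs are obtained by SVD of a complete X, months with missing station data are simply dropped from the regression, an intercept is included (as in the tables of coefficients), and the function names and synthetic data are hypothetical.

```python
import numpy as np

def leading_pcs(X, n_pc):
    """PCA of X (n_grid, n_time) by SVD. Returns EOFs (n_grid, n_pc) and PCs (n_pc, n_time)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    eofs = U[:, :n_pc]
    return eofs, eofs.T @ X                            # project X onto its EOFs

def regress_with_t(pcs, y):
    """Regress one station series y (n_time,), NaN = missing, on the PCs.
    Returns coefficients (intercept first) and their t statistics."""
    ok = ~np.isnan(y)                                  # use only months with data
    A = np.column_stack([np.ones(ok.sum()), pcs[:, ok].T])
    coef, *_ = np.linalg.lstsq(A, y[ok], rcond=None)
    resid = y[ok] - A @ coef
    sigma2 = resid @ resid / (ok.sum() - A.shape[1])   # mean residual sum of squares
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    return coef, coef / se

# Synthetic check: a station that responds only to PC2
rng = np.random.default_rng(3)
X = rng.normal(size=(147, 255))
X -= X.mean(axis=1, keepdims=True)                     # anomalies in time
eofs, pcs = leading_pcs(X, n_pc=3)
y = 0.5 * pcs[1] + rng.normal(0.0, 1.0, 255)
y[::4] = np.nan                                        # incomplete station record
coef, t = regress_with_t(pcs, y)
```

In this synthetic case the coefficient on PC2 (`coef[2]`) should be close to 0.5 with a large t statistic. Converting t to a p-value would additionally require a Student-t distribution function, which is left out of the sketch.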

[16] One aspect of PCR that is not discussed in this paper is the use of cross-validation (CV) to determine the number of PCs to include in the PCR model [e.g., Hidalgo et al., 2000; Li and Smith, 2009]. Instead, here the PCs are chosen on the basis of physical relationships with other indices of Mediterranean climate variability (section 4), rather than by statistical cross-validation. While the physical and statistical approaches may well agree, the main reason for our physically based selection is to construct a relatively simple PCR model with which to compare the more complex CCA model.

3.3. Canonical Correlation Analysis and Sparse CCA

3.3.1. Conventional CCA

[17] CCA, which has a variety of implementations, is generally different from PCR in the following ways: (i) in CCA the canonical time series (or components) aX and aY are generally strongly correlated but are not exactly equal (as they are in PCR); (ii) the canonical components maximize the correlation between two space-time fields rather than variance within one of the space-time fields; (iii) the squared canonical correlations (Σcca) provide an estimate of the contribution that each canonical pair makes to the total shared variance (in PCR, this contribution is not defined); and (iv) in CCA, predicting the response field Y from the predictor field X is equivalent to multiplying X by the product bY Σcca WX (i.e., the canonical patterns of Y, by the canonical correlations, by the adjoint patterns of bX) [see Widmann, 2005, equation 17; Smerdon et al., 2010, equation (7); Lim et al., 2012, Appendix C1].

[18] Conventional CCA is not designed to work with sparse matrices, so several approaches to sparse CCA have been developed [Witten et al., 2009; Lim et al., 2012; Smerdon et al., 2010]. These approaches make use of either dimension reduction or regularization (penalization) to solve the problem of sparse matrices. Although tests for the significance of the canonical correlations exist for conventional CCA (Hotelling-Lawley, Wilks' Likelihood, Pillai-Bartlett, and Roy's largest root), these tests are not applicable for sparse CCA [DelSole and Yang, 2011]. In the following subsections, two approaches to sparse CCA are discussed.

3.3.2. Dimension Reduction CCA

[19] In dimension reduction CCA (drCCA), the data matrices are individually whitened (orthogonalized), followed by SVD (singular value decomposition) of the covariance matrix of the whitened X, Y components. The number of whitened components to retain is typically chosen using cross-validation [Smerdon et al., 2010]. This approach to sparse CCA is very stable, but if there are many missing values in X and Y, these need to be imputed due to the calculation of the whitened X, Y components [e.g., Schneider, 2001]. Here we are reluctant to use an approach to sparse CCA that requires data imputation, owing to the relatively large proportion of missing data in the response matrix (auxiliary material Figure S1).
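For comparison with the regularized approach used later, the whiten-then-SVD idea behind drCCA can be sketched as below. The sketch assumes complete, time-centered matrices (no imputation step is shown), and the function names `whiten` and `dr_cca` and the truncation levels are illustrative choices rather than anything specified in the text.

```python
import numpy as np

def whiten(F, k):
    """Return the k leading unit-variance PCs of F (rows = space, columns = time).
    Assumes each row of F has zero time mean."""
    _, _, Vt = np.linalg.svd(F, full_matrices=False)
    n = F.shape[1]
    return np.sqrt(n - 1) * Vt[:k]                 # (k, n), rows uncorrelated with unit variance

def dr_cca(X, Y, kx, ky):
    """CCA on truncated, whitened PCs: SVD of their cross-covariance matrix."""
    n = X.shape[1]
    Px, Py = whiten(X, kx), whiten(Y, ky)
    U, rho, Vt = np.linalg.svd(Px @ Py.T / (n - 1), full_matrices=False)
    return U.T @ Px, Vt @ Py, rho                  # canonical time series of X, Y and correlations

# Synthetic fields sharing one signal
rng = np.random.default_rng(7)
n = 150
z = rng.normal(size=n)
X = np.outer(rng.normal(size=40), z) + rng.normal(size=(40, n))
Y = np.outer(rng.normal(size=8), z) + rng.normal(size=(8, n))
X -= X.mean(axis=1, keepdims=True)
Y -= Y.mean(axis=1, keepdims=True)
aX, aY, rho = dr_cca(X, Y, kx=5, ky=3)
```

Because the retained PCs are whitened, the singular values of their cross-covariance matrix are the canonical correlations, and the correlation between `aX[0]` and `aY[0]` equals `rho[0]`.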

3.3.3. Regularized CCA

[20] In regularized CCA (regCCA), the canonical patterns are estimated by performing SVD on the product of the inverse regularized auto-covariance matrices and the cross-covariance matrix of X and Y:

\[ \boldsymbol{\Sigma}_{XX}^{-1/2} \, \boldsymbol{\Sigma}_{XY} \, \boldsymbol{\Sigma}_{YY}^{-1/2} = \mathbf{U} \, \boldsymbol{\Sigma}_{cca} \, \mathbf{V}^{\mathsf{T}} \tag{4a} \]
\[ \boldsymbol{\Sigma}_{XX} = \mathbf{C}_{XX} + \lambda_1 \mathbf{I}, \qquad \boldsymbol{\Sigma}_{YY} = \mathbf{C}_{YY} + \lambda_2 \mathbf{I} \tag{4b} \]

where ΣXX, ΣYY are the regularized covariance matrices, ΣXY is the cross-covariance matrix of X and Y, CXX and CYY are the sample auto-covariance matrices, I is the identity matrix, and λ1, λ2 are the regularization parameters. The regularization parameters make the sparse matrices invertible. Note that regCCA operates on the cross-covariance matrix of X and Y rather than on the cross-covariance matrix of the whitened X, Y components (as in drCCA). In calculating the cross-covariance matrix in equation (4a) and the auto-covariance matrices in equation (4b), missing values can be ignored, as long as there are sufficient pairwise data values for each pair of stations. Also in regCCA, the regularization parameters can be optimized using cross-validation, with the CV score set as the leading canonical correlation. The optimal values of λ1, λ2 are those that maximize the CV score.
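A compact sketch of regularized CCA with a cross-validated choice of λ1 and λ2 is given below. It is an illustration only: it assumes complete data (no pairwise handling of missing values), uses inverse symmetric square roots of the regularized covariance matrices, scores each (λ1, λ2) pair by the held-out correlation of the leading canonical time series, and all function names and synthetic fields are hypothetical.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return (vecs / np.sqrt(vals)) @ vecs.T

def reg_cca(X, Y, lam1, lam2):
    """Regularized CCA for complete anomaly matrices X (p, n) and Y (q, n).
    Returns weights Wx (p, r), Wy (q, r) and canonical correlations rho (r,)."""
    n = X.shape[1]
    Sxx = X @ X.T / (n - 1) + lam1 * np.eye(X.shape[0])   # regularized auto-covariance
    Syy = Y @ Y.T / (n - 1) + lam2 * np.eye(Y.shape[0])
    Sxy = X @ Y.T / (n - 1)                               # cross-covariance
    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, rho, Vt = np.linalg.svd(Kx @ Sxy @ Ky, full_matrices=False)
    return Kx @ U, Ky @ Vt.T, rho

def cv_score(X, Y, season, lam1, lam2):
    """Leave-one-season-out score: correlation between the held-out leading
    canonical time series of X and Y."""
    ax, ay = np.empty(X.shape[1]), np.empty(Y.shape[1])
    for s in np.unique(season):
        test = season == s
        Wx, Wy, _ = reg_cca(X[:, ~test], Y[:, ~test], lam1, lam2)
        ax[test] = Wx[:, 0] @ X[:, test]
        ay[test] = Wy[:, 0] @ Y[:, test]
    return float(np.corrcoef(ax, ay)[0, 1])

# Synthetic fields sharing one signal z; 40 "seasons" of 5 months each
rng = np.random.default_rng(4)
n, p, q = 200, 30, 6
season = np.repeat(np.arange(40), 5)
z = rng.normal(size=n)
X = 2.0 * np.outer(rng.normal(size=p), z) + rng.normal(size=(p, n))
Y = 2.0 * np.outer(rng.normal(size=q), z) + rng.normal(size=(q, n))
X -= X.mean(axis=1, keepdims=True)
Y -= Y.mean(axis=1, keepdims=True)

# Candidate regularization parameters (grid values chosen for illustration)
grid = [(l1, l2) for l1 in np.arange(0.001, 1.002, 0.1) for l2 in (0.0, 0.001)]
scores = {g: cv_score(X, Y, season, *g) for g in grid}
best = max(scores, key=scores.get)
Wx, Wy, rho = reg_cca(X, Y, *best)
aY_pred = rho[:, None] * (Wx.T @ X)     # canonical components of Y predicted from X
```

The last line follows the prediction rule quoted for CCA (projection of X onto the weights, scaled by the canonical correlations). With p larger than the number of time steps, a nonzero `lam1` is what keeps `Sxx` invertible.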

[21] A preliminary test of applying regCCA to the Mediterranean data demonstrated some general problems. Some GNIP stations (e.g., Har-Knaan, southeast Mediterranean) did not have sufficient pairwise data with other GNIP stations (matrix Y), and this problem was exacerbated by including too many sites in the response matrix, resulting in overall too few pairwise data values. For this reason the auto-covariance matrix of the data matrix Y (auxiliary material Figure S1) is not invertible. These issues were solved using a two-step process:
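The pairwise-data problem described above can be made concrete with a small sketch that computes a covariance matrix from pairwise-complete observations and counts the shared months for each pair of stations. It assumes the rows are already zero-mean anomalies (no per-pair mean is removed); the function name `pairwise_cov` and the toy matrix are hypothetical.

```python
import numpy as np

def pairwise_cov(Y):
    """Y: (q, n) anomalies with NaN = missing. Returns the covariance matrix and
    the number of shared months, each entry using only months where both
    stations have data. Pairs with fewer than two shared months give NaN."""
    ok = ~np.isnan(Y)
    Y0 = np.where(ok, Y, 0.0)
    counts = ok.astype(int) @ ok.astype(int).T       # shared months per station pair
    sums = Y0 @ Y0.T                                 # sums of products over shared months
    with np.errstate(divide="ignore", invalid="ignore"):
        cov = np.where(counts > 1, sums / (counts - 1), np.nan)
    return cov, counts

# Toy example: station 2 never overlaps with station 0
Y = np.array([[1.0, -1.0, 2.0, -2.0, np.nan, np.nan],
              [0.5, -0.5, 1.0, -1.0, 0.3, -0.3],
              [np.nan, np.nan, np.nan, np.nan, 1.0, -1.0]])
cov, counts = pairwise_cov(Y)
```

Here `counts[0, 2]` is zero, so the corresponding covariance is undefined. A matrix assembled this way is also not guaranteed to be positive definite, which is consistent with the invertibility problem noted in the text.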

[22] 1. The CCA model was estimated using a response matrix with reduced dimensions. The Mediterranean was divided into four regions (NW, NE, SW, SE), and from each region, the three sites with the fewest missing values (in the months NDJFM) were selected and concatenated into a new response matrix YR. These 12 sites were: Genoa, Penhas Douradas, Portalegre, Ankara, Antalya, Adana, Gibraltar, Faro, Tunis-Carthage, Sidi Barrani, Bet Dagan, and Ras Muneef. The dimension of X and YR was further reduced by selecting columns where less than half the YR values were missing (YR was then a 12 × 121 matrix, and XR a 147 × 121 matrix). The regularization parameter for YR, λ2, was chosen from the set { 0, 0.001 }. The regularization parameter for XR, λ1, was chosen from the set { 0.001–1.001 at 0.1 intervals }. The two regularization parameters were chosen from different optimization sets, owing to the difference in sparsity of XR and YR. Cross validation was performed by a leave-out-one-season method (e.g., a season = NDJFM). Once the CCA model was estimated, the predicted canonical components of Y were calculated by multiplying the full X matrix by the product Σcca WX.

[23] 2. The full Y matrix (with 26 stations) was then regressed on the predicted leading canonical components, using MLR. This is similar to the second step in PCR (section 3.2), so the significance of the MLR coefficients can be tested using equation (3), and this aids the comparison of the PCR and regCCA models. This step could be termed CCR (Canonical Component Regression), different from PCR because the method of calculating the leading components of X, Y is different.

3.4. Model Residuals

[24] The residual matrix E (equation (1)) from both the PCR and CCA models should be investigated. Spatial and temporal patterns in model residuals can help identify where the unexplained variance in the model lies, and may suggest ways of improving the overall model. Methods for investigating residuals in multiresponse models are not well developed [Zhu et al., 2008]. Here, the model residuals were investigated using two methods.

[25] In the first method, both the overall RMSE (Root Mean Square Error), and the RMSE by month, were calculated for each site. In the second method, the model residual matrix was decomposed using the component model:

\[ \mathbf{E} = \sum_{m} \mathbf{b}_E^{(m)} \mathbf{a}_E^{(m)} \tag{5} \]

where aE and bE are the time and space components of E. A typical expectation for the model residuals in a multiple-response model is that they form a multivariate normal distribution where the off-diagonal elements of the covariance matrix are ∼0. However, this may not always be true because of temporal and/or spatial bias in the multiresponse model. Model residuals may be large or small for particular months, or residuals may be correlated between sites from a particular area. Investigating E by PCA is the simplest way to identify common signals in the model residuals. Here the sparse PCA method of Witten et al. [2009] was employed.
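The first residual diagnostic (overall RMSE and RMSE by month for each site) is simple to compute even with missing values, as in the sketch below. The function name `rmse_by_site` and the toy residuals are illustrative; the sparse PCA of the residuals is not attempted here.

```python
import numpy as np

def rmse_by_site(E, month):
    """E: (n_site, n_time) residuals with NaN = missing.
    month: (n_time,) calendar month of each column.
    Returns the overall RMSE per site and a dict of per-site RMSE for each month."""
    overall = np.sqrt(np.nanmean(E**2, axis=1))
    by_month = {int(m): np.sqrt(np.nanmean(E[:, month == m]**2, axis=1))
                for m in np.unique(month)}
    return overall, by_month

# Toy residuals for two sites over two Novembers and two Decembers
E = np.array([[1.0, -1.0, 2.0, np.nan],
              [0.0, 3.0, np.nan, 4.0]])
month = np.array([11, 12, 11, 12])
overall, by_month = rmse_by_site(E, month)
```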

[26] The results of PCR and regCCA are presented in the following section.

4. Results

4.1. Principal Component Regression

4.1.1. PCA Step

[27] In the first step of PCR, the principal components and EOFs of X (aX and bX) were estimated. The EOFs of X are pictured in Figure 2 (left). In order to investigate the physical aspects of these patterns, their amplitude time series (PCs) were correlated with indices from other Mediterranean climate studies. These indices include NAO, MO (Mediterranean Oscillation index) and WeMO (Western MO index) (discussed below). The main similarities between these Mediterranean climate indices (MCIs) and our PCs are that they are each associated with a particular type of climate variability, and they are calculated from time series of standardized pressure anomalies. The main difference between the PCs and the climate indices is that each MCI is based on a 2-station pressure difference, whereas the PCs are calculated using multiple time series from gridded SLP data (in this case, from 147 SLP time series over the Mediterranean Region). Each PC is generally strongly correlated with one climate index (e.g., PC1 & NAO r = 0.73, PC2 & MOI r = 0.74, PC3 & WeMOI r = 0.64, Table 1).

Figure 2.

(left) The first three leading EOFs of NDJFM SLP over the study area (top to bottom: EOF1, EOF2, EOF3). Positive and negative SLP anomalies are shown by solid and dashed contours respectively. The triangles mark the locations of San Fernando (Spain) and Gibraltar (together), Padua (Italy) and Lod Airport (Israel) that are used in calculating station-based indices for: NAO (Gibraltar-Reykjavik), MOI (Gibraltar-Lod Airport), and WeMOI (San Fernando-Padua). (right) The first three leading patterns of NDJFM δ18O that are associated with the corresponding SLP patterns. The circles are the MLR coefficients bY (equation (2)), where blue (negative) and red (positive) circles correspond to the signs of the coefficients, and the circle areas are proportional to the magnitudes of the coefficients.

Table 1. Correlations Between the Leading Three Principal Components of Mediterranean Winter SLP, and Various Climate Indices, for the Months NDJFM and Years 1960–2010^a

        NAO        MOIig      WeMOI
PC1     0.73***    0.51***    −0.14
PC2     0.35***    0.74***    0.56***
PC3     0.06       −0.04      0.64***

^a Symbols: ***p < 0.001, **p < 0.01 significance levels. The statistical significance was tested after adjusting for time series autocorrelation using an effective number of degrees of freedom (Neff) estimated by Neff = N(1 − rx ry)/(1 + rx ry).
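The adjustment for autocorrelation in the footnote to Table 1 can be sketched as follows. The footnote does not define rx and ry; the sketch assumes they are the lag-1 autocorrelations of the two series, which is a common convention but an assumption here. It also treats the series as contiguous, whereas NDJFM series have gaps between seasons. Function names and the synthetic red-noise series are hypothetical.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation of a contiguous series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(x[:-1] @ x[1:] / (x @ x))

def corr_with_neff(x, y):
    """Correlation, effective sample size Neff = N(1 - rx*ry)/(1 + rx*ry),
    and the t statistic for the correlation using Neff - 2 degrees of freedom."""
    r = float(np.corrcoef(x, y)[0, 1])
    rx, ry = lag1_autocorr(x), lag1_autocorr(y)
    neff = len(x) * (1 - rx * ry) / (1 + rx * ry)
    t = r * np.sqrt((neff - 2) / (1 - r**2))
    return r, neff, float(t)

# Two red-noise (AR(1)) series that share part of their signal
rng = np.random.default_rng(5)
n, phi = 500, 0.7
x, y = np.zeros(n), np.zeros(n)
for i in range(1, n):
    x[i] = phi * x[i - 1] + rng.normal()
    y[i] = phi * y[i - 1] + rng.normal()
y += 0.5 * x
r, neff, t = corr_with_neff(x, y)
```

For positively autocorrelated series Neff is smaller than N, so the same correlation is judged less significant than it would be under an assumption of independent samples.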

[28] The first PC explains 70% of the variance in X and is related to the North Atlantic Oscillation (Table 1). A common NAO index is calculated as the standardized pressure difference between Gibraltar and Reykjavik [Jones et al., 1997]. As a result of the NAO, the air pressure anomalies over the Mediterranean all move in the same direction (Figure 2), and the effects of the NAO on temperature and precipitation patterns in southern Europe and north Africa are well known [Qian et al., 2000; Wanner et al., 2001; Trigo et al., 2002]. The influence of the NAO on water isotopologues in Mediterranean precipitation is relatively weak [Baldini et al., 2008].

[29] The second PC explains 15% of the variance in X and is related to the MO pattern (Table 1). The MO is a zonal pressure difference across the Mediterranean Sea, and the MO index is calculated using the station pairs Lod Airport and Gibraltar (MOig) or Algiers and Cairo (MOac) [Conte et al., 1989; Palutikof, 2003]. Positive MO phases (high pressure over the west Mediterranean, and low pressure over the eastern region) lead to below normal precipitation over much of the Mediterranean, except for the southeastern area (the coastal areas of Libya, Egypt and the Levant) [Dünkeloh and Jacobeit, 2003]. The upper-level trough extending from Iceland to the eastern Mediterranean pushes very cold air over the relatively warm sea, and the resulting instability leads to increased rainfall in the southeast. Negative MO phases are linked with above normal rainfall to most of the Mediterranean except the southeastern area, owing to strong westerly airflow and the advection of humid unstable air masses.

[30] The third PC explains 6% of the variance in X and is related to the WeMO (Table 1). The WeMO index is calculated as the standardised-pressure difference between the stations San Fernando and Padua, and is an index of cyclogenesis in the Mediterranean Basin [Martin-Vide and Lopez-Bustins, 2006]. Positive WeMO phases, marked by an anticyclone over the Azores and low-pressure in the Liguria Gulf, lead to decreased rainfall in southern Spain; while negative WeMO phases, characterized by a strong central European anticyclone, lead to increased rainfall. Although the spatial pattern of this PC has a tripolar structure (Figure 2), there has been little investigation of the influence of this WeMO-related pattern on rainfall in the central or eastern Mediterranean.

[31] The principal component analysis shows that 3 degrees of freedom are required to explain 90% of the variance in Mediterranean SLP. Figure 3a shows the SLP subspace formed by PC2 and PC3, and the inter-annual variation of those PCs, e.g., the winters of 1977 and 1981 were characterized by ‘extreme’ PC2 events. Figure 3 also shows the relationship that the second and third PCs have with other climate indices discussed above (NAO, MO, WeMO). These other indices were regressed on PC2 and PC3, e.g., MOIig = 0.14PC2 – 0.06PC3, with r2 = 0.56, and this linear combination is illustrated by the arrows in Figures 3a and 3b. Thus, in Figure 3 the correlation coefficient between any two indices is given by the cosine of the angle between the corresponding arrows or axes, so Figure 3 confirms that the MO index is correlated with PC2 and the WeMO index is correlated with PC3. These other indices (MO, WeMO), though, do not provide suitable replacements for PC2 and PC3 because those indices do not lie completely in the PC2–PC3 subspace. This is shown by the r2 values from the multiple regressions, which are not close to 100% (Figure 3a). Thus the vectors formed by the MO and WeMO index are pointing away from the PC2–PC3 subspace, and into other dimensions, so they do not properly describe the degrees of freedom and the evolution of Mediterranean winter SLP.
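The subspace argument above can be illustrated numerically. In the sketch below, an index is represented by an arrow whose components are its correlations with two uncorrelated PCs; the squared arrow length then equals the r2 of regressing the index on both PCs, and a value below 1 means the index partly points out of the plane. The function name `subspace_arrow` and the synthetic series are assumptions for illustration and are not the MO or WeMO data.

```python
import numpy as np

def subspace_arrow(index, pc_a, pc_b):
    """Arrow of a climate index in the plane of two uncorrelated PCs.
    Components are the correlations with each PC; the squared length is the
    r^2 of regressing the index on both PCs."""
    arrow = np.array([np.corrcoef(index, pc_a)[0, 1],
                      np.corrcoef(index, pc_b)[0, 1]])
    return arrow, float(arrow @ arrow)

rng = np.random.default_rng(6)
n = 100
raw = rng.normal(size=(n, 2))
raw -= raw.mean(axis=0)
pcs, _ = np.linalg.qr(raw)                 # two exactly uncorrelated, zero-mean series
pc2, pc3 = pcs[:, 0], pcs[:, 1]

index_in_plane = 0.8 * pc2 - 0.6 * pc3                           # lies fully in the plane
index_partly_out = index_in_plane + 0.1 * rng.normal(size=n)     # has an out-of-plane part
arrow_in, r2_in = subspace_arrow(index_in_plane, pc2, pc3)
arrow_out, r2_out = subspace_arrow(index_partly_out, pc2, pc3)
```

An index lying entirely in the plane has an arrow of unit length, whereas the second index has a shorter arrow, which is the situation described for the MO and WeMO indices.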

Figure 3.

Biplots showing (a) PC2 and PC3 of Mediterranean winter SLP (numbers refer to the year, e.g., 81 is the winter that includes January 1981); and (b) the MLR coefficients bY for PC2 and PC3 by site (the text colors refer to an area within the Mediterranean region: NW = black, NE = blue, SE = green, SW = orange). Some site names have been shifted far left of their corresponding points (Amman, Irbid, Sidi, Alexandria). In both Figures 3a and 3b, the vectors (arrows) show the correlation between the axes (PC2 and PC3) and other indices of Mediterranean climate variability (MCIs). The vectors were calculated by projecting the MCIs into the subspace formed by the principal components of Mediterranean SLP. The months January and February only were used for the vector calculations. In Figure 3a, the percentage for each dimension (on the plot axes) refers to the proportion of variance in the Mediterranean winter SLP field that each principal component explains. Also in Figure 3a, the percentages for each vector refer to the variance explained by regressing each MCI on Mediterranean winter SLP PC2 and PC3.

4.1.2. PCR Step

[32] In the second step of PCR, the patterns of Y (bY) that covary with the EOFs of X were estimated, and are plotted in Figure 2 (right). The statistical significance of the MLR coefficients is given in Table 2. SLP EOF1 has little influence on the δ18O field (Figure 2), and only two stations show a significant relationship with PC1 (Genoa and Har-Knaan, Table 2). This result extends the work of Baldini et al. [2008] to many more Mediterranean stations, and confirms that the NAO signal in Mediterranean rainfall δ18O is weak. SLP EOF2 has a stronger influence on Mediterranean rainfall δ18O, and positive phases of EOF2 are associated with rainfall isotopic depletion in the north-central Mediterranean (Monaco, Athens) and rainfall isotopic enrichment in the western Mediterranean (Porto, Penhas Douradas, Portalegre, Faro, Gibraltar, Fes-Saiss) (Table 2). Perhaps the strongest influence on rainfall δ18O comes from SLP EOF3. Positive phases of EOF3 are associated with rainfall isotopic depletion in the central Mediterranean (Monaco, Algiers, Sfax) and rainfall isotopic enrichment in the west and east Mediterranean (Figure 2). Overall, the PCR model explains up to ∼30% of the variance in Mediterranean rainfall δ18O (Table 2).

Table 2. Coefficients and r2 for the Regression of Y on the Principal Component Time Series of X^a

Station                   N     b1          b2          b3          c       r2

Northwest Mediterranean
GENOA..SESTRI             138   0.064**     0.047       0.008       −0.11   0.06*
PORTO                     71    −0.014      0.127*      0.247**     0.27    0.14*
PENHAS.DOURADAS           78    0.014       0.133*      0.191*      0.08    0.12*
PORTALEGRE                81    0.013       0.179**     0.209*      0.12    0.18**

Northeast Mediterranean
ATHENS.PENDELI            40    −0.012      −0.272**    0.394**     0.10    0.36**
ATHENS.THISSION           38    0.006       −0.232*     0.354*      0.15    0.29**
ANKARA..CENTRAL           184   −0.035      0.012       0.247***    0.00    0.07**
ADANA                     149   −0.008      −0.020      0.288***    −0.05   0.12***

Southwest Mediterranean
GIBRALTAR                 150   0.003       0.147***    −0.004      0.04    0.11***
FES.SAISS                 41    −0.020      0.130*      0.106       −0.02   0.20*
FARO                      77    −0.007      0.160***    0.115.      0.10    0.19**

Southeast Mediterranean
HAR.KNAAN..TIRAT.YAEL     72    −0.053**    −0.034      0.095       −0.03   0.15*
BET.DAGAN                 181   −0.012      0.038       0.121**     0.07    0.05*
IRBID                     56    −0.044.     −0.002      0.241**     0.19    0.17*
AMMAN.WAJ                 64    −0.024      0.021       0.256***    0.05    0.19**

^a Symbols: ***p < 0.001, **p < 0.01, *p < 0.05 significance levels. This table includes only sites for which the overall r2 is significant at the 0.1 level. The columns are as follows: N is the number of months in the regression, bm are linear coefficients from equation (2b), c is the regression intercept, and r2 is the multiple correlation coefficient.

4.2. Regularized CCA/CCR

[33] In the first step of CCA model estimation (section 3.3.3), the CCA model was estimated using XR and YR. The values of the regularization parameters that maximized the leading canonical correlation were λ1 = 1.001, λ2 = 0.001. The leading canonical components and canonical patterns are shown in Figure 4. The CCA model was used to predict the canonical components of the full X and Y matrices. The correlations between the full CCs (canonical components) and other MCIs are given in Table 3. CC1, CC2, and CC4 are correlated with at least one MCI (e.g., CC1 & MOI r = 0.68, CC2 & WeMOI r = 0.47, CC4 & NAO r = 0.41) (see below for further discussion). In the second step (the CCR step), Y was regressed on its predicted components, giving the canonical patterns of Y (Figure 4). Table 4 provides the statistical significance of the MLR coefficients. Note that of the 12 sites in YR used to calibrate the CCA model, there were 3 sites (Antalya, Sidi Barrani, Ras Muneef) for which the single-response reduced-rank regressions were not significant, and there were an additional 9 sites (not included in YR) for which the regressions were significant (Table 4).

Figure 4.

(left) Canonical time series, and (right) canonical patterns for the first four leading canonical pairs. Note r is the canonical correlation, and Vsh is the proportion of shared variance that each canonical pair explains. The canonical patterns of X, i.e., the positive and negative SLP anomalies, are shown by the solid and dashed contours. The canonical patterns of Y, i.e., the positive and negative rainfall δ18O anomalies, are shown by red and blue circles; the circle areas are proportional to the size of the δ18O anomaly. The canonical time series pictured span the temporal dimension of the matrix YR (section 3.3.3), but the canonical patterns of Y are derived from regressing the full Y matrix on the predicted aY (calculated using the full X matrix).

Table 3. Correlations Between the Four Leading Canonical Components of Mediterranean Winter SLP (From the CCA Model in the Text) and Various Climate Indices, for the Months NDJFM and Years 1960–2010 (a)

       NAO       MOIig     WeMOI
CC1    0.43***   0.68***   0.67***
CC2    −0.05     −0.11     0.47***
CC3    0.34***   0.38***   0.01
CC4    0.41***   0.09      0.1

(a) Symbols: ***p < 0.001, **p < 0.01 significance levels. Statistical significance was tested after adjusting for autocorrelation (see Table 1). The four leading canonical components were predicted using the full X matrix (section 3.3.3).

Table 4. Coefficients and r2 for the Regression of Y on the Predicted Canonical Components of Y (a)

Site                    N    b1         b2         b3         b4         c      r2

Northwest Mediterranean
GENOA..SESTRI           138  0.501*     −1.322***  −0.355     2.066***   −0.20  0.21***
MONACO                  41   −0.891     −3.159*    2.667      3.911*     0.00   0.28*
PORTO                   71   0.891*     1.003*     0.275      0.451      0.18   0.17*
PENHAS.DOURADAS         78   0.931**    0.736      1.154′     0.906      0.02   0.22**
PORTALEGRE              81   1.323***   0.529      1.031      0.533      0.10   0.28***

Northeast Mediterranean
ATHENS.PENDELI          40   −0.394     1.607*     −0.881     2.693**    0.00   0.37**
ATHENS.THISSION         38   −0.123     0.913      −1.656′    2.675**    0.16   0.31*
ANKARA..CENTRAL         184  0.209      1.221***   −0.534     0.422      −0.02  0.07**
ADANA                   149  0.210      1.189***   −1.513***  0.638      −0.04  0.17***

Southwest Mediterranean
GIBRALTAR               150  0.772***   −0.245     0.225      −0.675′    0.07   0.15***
FES.SAISS               41   0.867**    0.636      1.023.     −0.297     −0.14  0.33**
FARO                    77   1.071***   0.180      0.046      −0.611     0.05   0.24***
TUNIS..CARTHAGE         130  0.371′     −0.583′    −0.836*    0.519      0.07   0.09*
SFAX                    69   −0.409     −1.732*    −2.190′    1.166      0.05   0.15*

Southeast Mediterranean
HAR.KNAAN..TIRAT.YAEL   72   −0.143     0.122      −1.231**   0.231      −0.07  0.11′
BET.DAGAN               181  0.224      0.397′     −0.983**   −0.096     0.09   0.08**
IRBID                   56   0.322      0.993*     −1.005     0.537      0.04   0.15′
AMMAN.WAJ               64   0.232      1.152**    −1.179*    0.172      0.07   0.22**

(a) Symbols: ***p < 0.001, **p < 0.01, *p < 0.05, ′p < 0.1 significance levels. This table includes only sites for which the overall r2 is significant at the 0.1 level. The columns are as follows: N is the number of months in the regression, bm are linear coefficients from equation (2b), c is the regression intercept, and r2 is the squared multiple correlation coefficient.

[34] The first CC explains 13% of the variance in X, and 33% of the overall shared variance between the SLP and δ18O fields. Positive phases of CC1 are associated with rainfall isotopic enrichment in the west Mediterranean (Iberia and NW Africa) (Figure 4 and Table 4). Thus CC1 has similarities with the MO and PC2.

[35] The second CC explains 8.5% of the variance in X, and 18% of the overall shared variance between the predictor and response fields. Positive phases of CC2 are associated with rainfall isotopic enrichment in the east (including Greece) and west Mediterranean, and isotopic depletion in the central Mediterranean (Genoa, Monaco, Sfax) (Figure 4 and Table 4). Thus CC2 has similarities with the WeMO and PC3.

[36] The third CC explains 4% of the variance in X, and 15% of the overall shared variance between the X and Y fields. Positive phases of CC3 are associated with rainfall isotopic depletion in the central and east Mediterranean (Athens, Adana, and sites in Tunisia, Israel and Jordan) (Figure 4 and Table 4). CC3 has a significant but only weak correlation with other MCIs (Table 3); its relationship to other indices of Northern Hemisphere climate variability is discussed below.

[37] The fourth CC explains 31% of the variance in X, and 11% of the overall shared variance between the X and Y fields. Positive phases of CC4 are associated with rainfall isotopic enrichment in the north central Mediterranean: the CC4 coefficients for Genoa and Athens are significant (Table 4). CC4 shares similarities with NAO and PC1, but in addition it has a center of action in the southeast (Figure 4).

[38] Together, the leading four canonical components explain ∼56% of the variance in the Mediterranean winter SLP field and ∼75% of the shared variance between the SLP and δ18O fields. Like PCs, the canonical components are a subspace of the variability of Mediterranean SLP, but unlike PCs the canonical components are a subspace that directly relates to the variation in the δ18O field. To better understand the relationship that the canonical subspace has with other indices of Northern Hemisphere climate variability (NHCIs), these NHCIs were projected into the canonical subspace using MLR. To simplify interpretation, the NHCIs were projected into different parts of the subspace (e.g., NAO = 0.55CC1 − 0.14CC2, r2 = 22%; and NAO = 0.4CC3 + 0.55CC4, r2 = 29%). The results are illustrated in Figure 5.

Figure 5.

Biplots showing the correlation between the canonical components of Mediterranean winter SLP (the axes) and various indices of Northern Hemisphere Climate Variability (NHCIs, the vectors). The vectors (arrows) were calculated by projecting each NHCI into the subspace formed by the canonical variates of Mediterranean SLP. For vector calculation, only the January and February months were used. (a) The subspace formed by Mediterranean winter (JF) SLP CC1 and CC2. (b) The subspace formed by Mediterranean winter (JF) SLP CC3 and CC4. In Figures 5a and 5b, the percentage for each NHCI (on the arrows) refers to the variance explained by regressing each NHCI on the respective two canonical components. The percentage for each dimension (on the plot axes) refers to the proportion of variance in the Mediterranean winter SLP field that each canonical component explains.

[39] The NHCIs used here, in addition to the MCIs already discussed, were: the East Atlantic pattern (EA), East Atlantic-West Russia pattern (EA.WR), North Caspian pattern (NCP), Scandinavian pattern (SCA), and North Africa-West Asia pattern (NA.WA), which constitute the major NH Atlantic winter teleconnection patterns [Barnston and Livezey, 1987; Paz et al., 2003]. When calculating the subspace projections, only the January and February months from the canonical components and NHCIs were used, in order to limit bias due to seasonal changes. Each NHCI vector represents an axis of variability in the Mediterranean SLP space (arrows, Figure 5). The NHCI vectors that are most closely associated with the CC1–CC2 subspace are the MO indices (MOig, MOac and WeMO). These 2-station MO indices, though, do not define axes of variability which are best associated with variation in the δ18O field, because they do not lie completely within the canonical subspace (their r2 values are <100%). Nevertheless, this analysis shows that the CC1–CC2 subspace has a closer association with MO-type variability than with any other pattern of winter North Atlantic variability. In contrast, the NHCI vector that is most closely associated with the CC3–CC4 subspace is the EA index. The EA pattern generally resembles a meridionally shifted NAO pattern. Woollings et al. [2010] have shown that the NAO-EA subspace of North Atlantic winter pressure variability essentially describes regimes of the position and strength of the eddy-driven jet stream.
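The subspace projection described above amounts to a small multiple regression of each index on a pair of canonical components, restricted to January and February. The following is a minimal sketch under stated assumptions: the function name `project_index`, the standardization of both sides, and the synthetic data (including the coefficients 0.5 and −0.2) are hypothetical and do not reproduce the values reported in the text.

```python
import numpy as np

def project_index(index, cc_pair):
    """Regress a standardized index on two standardized canonical components.

    Returns the two regression coefficients (the arrow components in a biplot)
    and r2, the fraction of the index variance lying within the 2-D subspace.
    """
    def standardize(v):
        return (v - v.mean(axis=0)) / v.std(axis=0)

    y, C = standardize(index), standardize(cc_pair)
    b = np.linalg.lstsq(C, y, rcond=None)[0]
    r2 = 1.0 - np.sum((y - C @ b) ** 2) / np.sum(y ** 2)
    return b[0], b[1], r2

# Synthetic monthly series covering 50 years (values are illustrative only).
rng = np.random.default_rng(1)
months = np.tile(np.arange(1, 13), 50)
cc = rng.standard_normal((months.size, 2))
index = 0.5 * cc[:, 0] - 0.2 * cc[:, 1] + rng.standard_normal(months.size)

jf = np.isin(months, [1, 2])            # keep January and February only
b1, b2, r2 = project_index(index[jf], cc[jf])
```

An index with r2 well below 1 lies mostly outside the plane spanned by the two components, which is the sense in which the 2-station indices in the text do not lie completely within the canonical subspace.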

[40] Overall, the CCA model explains up to ∼30% of the variance in Mediterranean rainfall δ18O. This is similar to the PCR model above.

4.3. Model Residuals

[41] Figure 6 shows the RMSE and first EOF of the residual matrices of the PCR and CCA models.

Figure 6.

(top) RMSE of (left) PCR and (right) regCCA models for each site. The circle area is proportional to the magnitude of RMSE, and the circle color represents the month with the largest RMSE. The color key for the month (NDJFM) is at the right. (bottom) The leading EOF of the model residual matrix for the PCR and regCCA models. The circle color and area correspond to the signs (red = positive, blue = negative) and magnitudes of bE (equation (5)).

[42] The RMSE values show seasonal bias in both models. The month of maximum RMSE is either November or March for 17 stations (for the PCR model), and for 16 stations (for the CCA model), out of the 26 Mediterranean stations. This bias occurs because the covariability in the Mediterranean SLP and δ18O fields is strongest in the core winter months (DJF), whereas other months may be dominated by other patterns, noise, or seasonally varying parameters [Fischer and Baldini, 2011]. To further investigate these effects, the PCR or CCA model could be calculated separately for each season (DJF, MAM, JJA, SON), but this has the effect of making the X, Y matrices sparser. The seasonal bias is also region-dependent: for example, 5/11 stations (for the PCR model) and 5/9 stations (CCA model) that have maximum RMSE in November are found in the southeast Mediterranean.
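The seasonal diagnostic described above can be sketched as follows: compute the RMSE of the model residuals at each site, then find the calendar month in which that RMSE is largest. The function names and the synthetic residuals are assumptions for illustration; missing months are represented by NaN.

```python
import numpy as np

def rmse_by_site(resid):
    """RMSE of each column (site), ignoring missing values (NaN)."""
    return np.sqrt(np.nanmean(resid ** 2, axis=0))

def worst_month(resid, months, season=(11, 12, 1, 2, 3)):
    """For each site, the calendar month in `season` with the largest RMSE."""
    per_month = np.vstack([rmse_by_site(resid[months == m]) for m in season])
    return np.asarray(season)[np.nanargmax(per_month, axis=0)]

# Synthetic residuals: 40 extended winters (NDJFM) at 3 sites.
rng = np.random.default_rng(2)
months = np.tile(np.array([11, 12, 1, 2, 3]), 40)
resid = rng.standard_normal((months.size, 3))
resid[months == 11, 0] *= 3.0                      # site 0: poor fit in November
resid[months == 3, 1] *= 3.0                       # site 1: poor fit in March
resid[rng.random(resid.shape) < 0.2] = np.nan      # mimic missing months

rmse = rmse_by_site(resid)
worst = worst_month(resid, months)
```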

[43] For both the PCR and CCA models, the first EOF of the residual matrix shows a common signal of unexplained variance in the eastern Mediterranean (Figure 6). This EOF explains ∼24% of the unexplained variance for stations east of the longitude of Alexandria. This common signal strongly suggests that there are other predictors that influence rainfall δ18O in the east Mediterranean. Likely candidates for these other predictors could include temperature, rainfall amount, humidity or atmospheric pressure effects outside of the grid used in this study (e.g., perhaps the SLP field should be extended eastward). A generalized CCA model which can incorporate these other predictors is briefly discussed in the following section.
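A leading EOF of a gappy residual matrix can be approximated in several ways. The sketch below uses one simple option, which is an assumption and not necessarily the procedure behind equation (5): residuals are centered per site, missing values are set to zero (the anomaly mean), and the first right singular vector is taken as the EOF. The synthetic data place a shared signal at three hypothetical "eastern" sites.

```python
import numpy as np

def leading_eof(resid):
    """Leading spatial EOF of a residual matrix (rows = months, columns = sites).

    Assumption: missing residuals (NaN) are replaced by zero after centering.
    Returns the EOF loadings and the fraction of variance it explains.
    """
    R = resid - np.nanmean(resid, axis=0)
    R = np.where(np.isnan(R), 0.0, R)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    return Vt[0], s[0] ** 2 / np.sum(s ** 2)

# Synthetic residuals at 6 sites; columns 3-5 share an unexplained signal.
rng = np.random.default_rng(3)
resid = 0.5 * rng.standard_normal((200, 6))
common = rng.standard_normal(200)
resid[:, 3:] += common[:, None]
resid[rng.random(resid.shape) < 0.2] = np.nan

eof, frac = leading_eof(resid)
```

Loadings of the same sign concentrated on one group of sites indicate a common unexplained signal, which is the pattern the text reports for the eastern Mediterranean.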

5. Discussion and Conclusions

[44] In this section, the ability of the linear models to predict Mediterranean δ18O variability from SLP variability is discussed.

[45] In the PCR model, X was whitened, followed by a rank-m regression with m = 3 patterns. In the CCA model, XR and YR were prefiltered using a regularized CCA model, followed by a rank-m regression with m = 4 patterns, using the full X and Y. Owing to the different methods of estimation, the first m patterns of the PCR and CCA models may not be the same. Differences in the ability of the two models to predict Y will depend on whether or not the models span the same subspace of the predictor field. The results show that the PCR and CCA models capture two similar patterns (CC1/PC2, and CC2/PC3), which explain ∼50% of the shared variance between the predictor and response fields. The CCA model also includes two extra patterns (CC3 and CC4).
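For comparison with the CCA route, the PCR route summarized above (whiten X, then regress each site on the m leading components) can be sketched as follows. This is an illustrative outline only: the function name `pcr_fit`, the use of an SVD for whitening, and the synthetic data are assumptions, and details of the paper's PCR model may differ.

```python
import numpy as np

def pcr_fit(X, Y, m):
    """Principal components regression with missing values (NaN) in Y.

    X is whitened via an SVD, so the m leading principal components have unit
    variance and are mutually uncorrelated. Each site is then regressed on
    them using only the months available at that site.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    pcs = U[:, :m] * np.sqrt(X.shape[0] - 1)
    coefs = np.full((m + 1, Y.shape[1]), np.nan)   # m slopes, then intercept
    for j in range(Y.shape[1]):
        ok = ~np.isnan(Y[:, j])
        D = np.column_stack([pcs[ok], np.ones(ok.sum())])
        coefs[:, j] = np.linalg.lstsq(D, Y[ok, j], rcond=None)[0]
    return pcs, coefs

# Synthetic illustration: site 0 responds to the dominant pattern in X.
rng = np.random.default_rng(4)
X = rng.standard_normal((200, 8)) * np.array([5, 3, 2, 1, 1, 1, 1, 1])
Y = rng.standard_normal((200, 3))
Y[:, 0] += 0.3 * X[:, 0]
Y[rng.random(Y.shape) < 0.3] = np.nan

m = 3
pcs, coefs = pcr_fit(X, Y, m)
Y_hat = pcs @ coefs[:m] + coefs[m]                 # predictions for all months
```

Because the components here are chosen from the variance of X alone, a pattern that matters for Y but ranks low in X can be left out, which is one reason the subspaces of the two models may differ.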

[46] Figure 7 shows the correlation field between the predicted rainfall δ18O anomaly time series for the PCR and CCA models. If, for any individual site, the PCR and CCA models spanned the same subspace of the predictor field, then the predicted time series for the two models should be correlated. For comparison, it is useful to group the stations as follows: (i) stations where the predicted time series from the PCR and CCA models are correlated (r ≅ 0.7–0.95), but the individual regression models were not significant (i.e., the latter criterion includes sites that are missing from both Table 2 and Table 4), (ii) stations for which the predicted time series from the PCR and CCA models were uncorrelated (r < 0.4) and the individual regression models were not significant, (iii) stations for which the predicted time series were uncorrelated, and the individual regression models were significant for the CCA but not PCR model, and (iv) stations where the predicted time series from the PCR and CCA models are strongly correlated, and the individual regression models were significant.

Figure 7.

Correlation between the predicted time series for the PCR and regCCA models, for each site. The color shading is the r2 value of that correlation. This figure shows how well the PCR and regCCA models agree with each other.

[47] In the first group, there are six sites: Madrid (r = 0.72, n = 43), Patras (r = 0.94, n = 36), Antalya (r = 0.72, n = 170), Sidi Barrani (r = 0.68, n = 89), Alexandria (r = 0.72, n = 69), and Ras Muneef (r = 0.91, n = 76). These sites generally have a weak relationship with a single pattern that occurs in both the PCR and CCA models (such as CC1/PC2 or CC2/PC3), but at these sites the noise:signal ratio is high or non-SLP predictors are at work; hence the individual regressions do not attain statistical significance. In the second group, there are two sites: Avignon (r = 0.0, n = 53), and Algiers (r = 0.31, n = 40). At these sites there is no predictability using the estimated PCR or CCA models, which means that either the SLP patterns resolved by the two models have little influence on rainfall δ18O, or that there are unresolved data issues. In the third group, there are three central Mediterranean sites: Monaco (r = 0.73, n = 41), Tunis (r = 0.54, n = 130), and Sfax (r = 0.48, n = 69) where rainfall δ18O appears more predictable by including patterns CC3 (Tunis, Sfax) or CC4 (Monaco) in the regression, in addition to pattern CC2 (/PC3) (Table 4). As the PCR model only includes a CC2-like pattern (i.e., PC3), but not CC3 or CC4-like patterns, the PCR model does not attain significance for those sites (Table 2). The CCA model suggests that the second characteristic pattern that affects Mediterranean rainfall δ18O (after CC1) is a linear combination of CC2 + CC3 or CC2 + CC4. In the fourth group, of particular note are the sites Porto (r = 0.88), Athens (two sites: r = 0.91/0.86), Adana (r = 0.87), Gibraltar (r = 0.89), Faro (r = 0.96) and Amman Waj (r = 0.9). These sites all have strong relationships with CC1/PC2 or CC2/PC3, and the corresponding coefficients in Tables 2 and 4 are in the same direction (i.e., positive or negative). Here, the strong influence of a single pattern on δ18O means that for these sites, the PCR and CCA models span the same subspace, hence the predicted time series from the two models are strongly correlated.

[48] So far the physical mechanisms behind the Mediterranean winter SLP-δ18O relationships shown here have not been discussed. Baldini et al. [2008] attributed the strong positive impact of winter NAO on δ18O in central Europe to the higher frequency of cold easterly winds carrying δ18O depleted moisture during NAO-negative phases. The weak correlations observed between winter NAO and δ18O at some Mediterranean stations (Genoa, Ankara, Bet Dagan, Tunis, Gibraltar) were attributed to moisture recycling, although no direct moisture recycling data supporting this explanation were presented by Baldini et al. [2008]. But at least three degrees of freedom are required to explain variation in Mediterranean winter SLP, which means that NAO is not the only SLP pattern that may influence Mediterranean rainfall δ18O. Here we have shown that a large component of the variance in the winter δ18O field is explained by two to three other SLP patterns. Thus, our study has similar themes to that of Qian et al. [2000], who showed that while NAO is certainly a component of European rainfall variability, it may not be the most important component of rainfall variability. Other studies have suggested that Mediterranean rainfall is driven by the superposition of multiple pressure patterns [Dermody et al., 2012; Roberts et al., 2012]. Many previous climate-rainfall isotope studies have focused on only the first component of regional SLP variability (NAO, SOI), but our study suggests that climate-isotope studies can be expanded by, and can benefit from, using multivariate regression models for coupled fields. A future study in which regularized CCA techniques are applied to the Atlantic-European SLP and rainfall δ18O fields should provide a more general understanding of European pressure and δ18O dynamics, as has been illustrated here for the Mediterranean.

[49] Our study can also be extended by generalizing CCA to work with multiple predictor fields [Tenenhaus and Tenenhaus, 2011]. The advantage of a generalized CCA model is that the response variable at a site can be dependent upon multiple variables at non-local points in the predictor fields. This means, for example, that the rainfall isotopes at a particular site can be dependent upon spatial patterns of precipitation and/or humidity, which would capture the effect of local precipitation amount as well as ‘upstream’ rainout effects. This will lead to a better understanding of the physical mechanisms behind Mediterranean δ18O variability, and work on this is underway. Similarly, using Mediterranean SST and relative humidity as predictor fields and rainfall deuterium-excess as the response field should further elucidate the dynamics of Mediterranean precipitation.

[50] Principal Component Regression and regularized Canonical Correlation Analysis provide useful subspaces for investigating the relationships between climate fields and precipitation δ18O fields. Both methods are able to deal with the problems of sparse matrices and missing data in the response field, and hence are particularly suited to climate and precipitation δ18O data. Based on the results here, it is concluded that Mediterranean winter δ18O variability is determined by at least two important patterns, which locally have a dipole and tripole structure. As several sites across the Mediterranean have a relatively strong relationship with one or both of these patterns, and because both models span this subspace (despite the models having a different rank), the PCR and CCA models explain similar amounts of winter δ18O variance in the Mediterranean.

[51] For both the PCR and CCA models, the time vectors of the predictor patterns, like other climate indices, represent rotations in the space of the SLP anomaly field, but in the case of the CCA model they are rotations that are directly linked to the correlations between the predictor and response fields. Analysis of the predicted and residual fields from the two models suggests that the CCA model may provide better predictability for rainfall δ18O at a few central Mediterranean sites, while a substantial proportion of the unexplained variance in the PCR and CCA models lies temporally in the months of November and March, and spatially in the eastern Mediterranean. The results presented here will aid the interpretation of δ18O signals in tree rings, speleothems and other natural archives from the Mediterranean Region.

Appendix A: Detrending and Deseasonalizing Time Series

[52] Owing to the problem of missing values in the response matrix, the detrending and deseasonalizing methods were different for the predictor and response matrices.

[53] The SLP data was detrended and deseasonalized using the following model:

[Equation (A1a)]

where m = 1:12 (the rows of I), i = n mod 12, n = 1 to N, N = total number of months (the columns of I), and t is time, t = year + month/12 − 1/24.

[54] The coefficients of equation (A1a) were estimated from the data using multivariate linear regression (MLR).
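One model consistent with the description of equation (A1a) is a separate intercept and a separate linear trend for each calendar month, giving 24 parameters per series. The sketch below fits that form by least squares. The exact form is an assumption inferred from the indicator matrix I (12 rows) and the stated parameter count; the function name and the synthetic pressure series are hypothetical.

```python
import numpy as np

def deseasonalize_monthly(x, year, month):
    """Remove a per-month intercept and a per-month linear trend (24 parameters).

    Assumed form: x_n = sum_m I[m, n] * (a_m + b_m * t_n) + anomaly_n,
    where I is a 12 x N month-indicator matrix.
    """
    t = year + month / 12 - 1 / 24                 # decimal time, as in the text
    I = (month[None, :] == np.arange(1, 13)[:, None]).astype(float)
    D = np.vstack([I, I * (t - t.mean())]).T       # N x 24 design matrix
    beta = np.linalg.lstsq(D, x, rcond=None)[0]
    return x - D @ beta

# Synthetic monthly series, 1960-2010: seasonal cycle + trend + noise.
rng = np.random.default_rng(5)
year = np.repeat(np.arange(1960, 2011), 12)
month = np.tile(np.arange(1, 13), 51)
t = year + month / 12 - 1 / 24
x = (1015 + 5 * np.cos(2 * np.pi * (month - 1) / 12)
     + 0.02 * (t - 1985) + rng.standard_normal(t.size))

anom = deseasonalize_monthly(x, year, month)
monthly_means = np.array([anom[month == m].mean() for m in range(1, 13)])
```

With complete data, each calendar month contributes many observations to its own two parameters, so the large parameter count is not a concern for the SLP series.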

[55] The GNIP δ18O data was detrended and deseasonalized using the following model:

[Equation (A1b)]

where w1 = 2π radians yr−1, and w2 = 4π radians yr−1 (angular frequency).

[56] The coefficients of equation (A1b) were also estimated from the data using MLR. The advantage of using equation (A1b) for incomplete data is that there are only 6 parameters in equation (A1b) compared with 24 parameters in equation (A1a), per site. This is very important for sites where the number of months is small relative to the number of estimated parameters. Where this situation occurs, the use of equation (A1a) could lead to substantial bias in the Anomaly matrix.
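A six-parameter model consistent with the description of equation (A1b) is an intercept, a linear trend, and sine and cosine terms at the annual (w1) and semiannual (w2) frequencies. The sketch below fits that form to a gappy series using only the available months. The exact form is an assumption inferred from the parameter count and the two angular frequencies; the function name and the synthetic δ18O-like series are hypothetical.

```python
import numpy as np

def deseasonalize_harmonic(y, t):
    """Remove a trend plus annual and semiannual harmonics (6 parameters).

    Assumed form: y = a0 + a1*t + a2*sin(w1 t) + a3*cos(w1 t)
                      + a4*sin(w2 t) + a5*cos(w2 t) + anomaly.
    Missing values (NaN) in y are skipped in the fit and stay NaN in the output.
    """
    w1, w2 = 2 * np.pi, 4 * np.pi                  # radians per year
    D = np.column_stack([np.ones_like(t), t - t.mean(),
                         np.sin(w1 * t), np.cos(w1 * t),
                         np.sin(w2 * t), np.cos(w2 * t)])
    ok = ~np.isnan(y)
    beta = np.linalg.lstsq(D[ok], y[ok], rcond=None)[0]
    return y - D @ beta, beta

# Synthetic gappy series: annual cycle of amplitude 2, weak trend, noise.
rng = np.random.default_rng(6)
year = np.repeat(np.arange(1960, 2011), 12)
month = np.tile(np.arange(1, 13), 51)
t = year + month / 12 - 1 / 24
y = (-5 + 0.01 * (t - t.mean()) + 2 * np.sin(2 * np.pi * t)
     + 0.5 * rng.standard_normal(t.size))
y[rng.random(t.size) < 0.4] = np.nan               # 40% of months missing

anom, beta = deseasonalize_harmonic(y, t)
```

With only six parameters, the fit remains stable even when a site has few observed months, which is the advantage over the 24-parameter monthly form noted above.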

Acknowledgment

[57] The study was undertaken while the lead author was a visiting researcher at RHUL supported by NERC grant NE/G007292/1 (PI D. P. Mattey).