### Abstract

[1] This study reports on the 1871–2010 trends in significant wave heights (*H*_{s}) in the North Atlantic, as statistically reconstructed from the 20th century reanalysis (20CR) ensemble of mean sea level pressure (SLP) fields. The 20CR SLP data set for the North Atlantic has been reported to be homogeneous since 1871, although it has discontinuities before 1949 in other regions. A multivariate regression model with lagged dependent variable is used to represent the SLP-*H*_{s} relationship. It is calibrated and validated using the ERA-Interim reanalysis of *H*_{s} and SLP for the period 1981–2010. Trends in the reconstructed annual mean and maximum *H*_{s} are found to be consistent with those derived from two dynamical wave reanalysis data sets (MSC50 and ERA40), which indicates robustness of the trend estimates. The trend patterns of extreme *H*_{s} generally feature increases in the northeast North Atlantic with decreases in the mid-latitudes, but there are seasonal variations. The main features of the patterns of trends over the last half century or so are also seen in the last 140-yr period (1871–2010). However, the trend magnitudes are much greater in the last half century than in the 140 years.

### 1. Introduction

[2] Although ocean wave height is an important element of the climate system and could be affected by anthropogenic forcing, in situ observations of ocean wave heights are available only for the last few decades at limited buoy locations around the globe, in addition to some volunteer ship observations which are limited to major ship routes. Satellite data for wind speed and wave height have global coverage; however, they span only the last couple of decades [*Young et al.*, 2011, 2012], which hampers reliability of trend estimates, especially for extremes [*Young et al.*, 2012]. Thus, major analyses of historical trends in wave heights [e.g., *Wang and Swail*, 2001, 2002; *Wang et al.*, 2008] are based on a dynamical reanalysis of ocean waves for the last few decades, e.g., the AES40 for global oceans for 1958–1997 [*Cox and Swail*, 2001], the MSC50 for the North Atlantic for 1954–2010 [*Swail and Cox*, 2000; *Swail et al.*, 2006], and the ERA40 for global oceans for 1958–2001 [*Uppala et al.*, 2005; *Caires et al.*, 2004a, 2004b]. This is because, until recently, reanalyses of the atmosphere spanned only the second half of the 20th century. The 20th century reanalysis (20CR) [*Compo et al.*, 2011] is the first reanalysis data set that spans the past 140 years (1871–2010). It consists of a multi-member ensemble of analyses, providing an uncertainty estimate for each ensemble-mean analysis [*Compo et al.*, 2011]. A recent analysis of cyclone activity in the 20CR reveals that the 20CR SLP data are basically homogeneous over the North Atlantic, although they contain inhomogeneities in the pre-1950 era in other regions [*Wang et al.*, 2012]. Thus, this data set makes it possible for the first time to assess North Atlantic wave height trends on a centennial scale, which would be helpful for related decision making.
This study aims to make such an assessment by a statistical reconstruction of historical wave heights using the mean sea level pressure (SLP) fields of the 20CR ensemble. The reconstructed wave height data set and historical trend estimates have the potential to serve the increasing demands of the community concerned with offshore and coastal impacts (e.g., assessing coastal impacts of climate change, coastal and offshore engineering…). They would also improve our knowledge of historical ocean wave conditions.

[3] The rest of this paper is organized as follows. The data sets and methodology are briefly described in section 2. The reconstructed historical trends in the North Atlantic ocean wave heights are presented in section 3, in comparison with the trends derived from the MSC50 and ERA40 wave reanalyses. Finally, a summary and some discussion are given in section 4.

### 2. Data and Methodology

[4] The ERA-Interim Reanalysis [*Dee et al.*, 2011] of the atmosphere (SLP) and ocean wave heights (*H*_{s}) for the period 1981–2010 is used to calibrate and validate the statistical relationship between the predictand *H*_{s} and its SLP-based predictors. Here, the model calibration period is 1981–2000, and the evaluation period is 2001–2010; that is, the period of data used to calibrate a regression model does not overlap with the period of data used to evaluate the model. Having evaluated the performance of the models and chosen the best model (with the best set of predictors), we also use the 30-yr (1981–2010) data from the ERA-Interim Reanalysis to re-calibrate the best model, which is then used for the reconstruction of *H*_{s}. The re-calibration using 30-yr data, instead of 20-yr data, has the potential to provide more robust estimates of the model parameters.

[5] Then, the 20CR ensemble of SLP fields for the period 1871–2010 [*Compo et al.*, 2011] is used to derive time series of the predictors to hindcast/reconstruct the corresponding significant wave heights (*H*_{s}) in the North Atlantic. Since the 20CR ensemble of SLP fields is available on a 2°-by-2° lat/long grid, this study uses the ERA-Interim SLP data on this grid and the *H*_{s} data on a 1°-by-1° lat/long grid, although the ERA-Interim SLP data are available at higher spatial resolutions. All the SLP and *H*_{s} data are 6-hourly instantaneous values (here ‘instantaneous’ means as output from the corresponding model for the specific time step). The unit is hPa for SLP, and m for *H*_{s}.

[6] To account for seasonality of atmospheric circulation regimes, we model the 6-hourly *H*_{s} in each of the four seasons separately. The four seasons are defined as JFM (January-February-March), AMJ (April-May-June), JAS (July-August-September), and OND (October-November-December).
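The seasonal split can be illustrated with a short Python sketch. The function names `season_of` and `n_six_hourly_steps` are hypothetical, and the step count assumes four 6-hourly values per calendar day with leap days included; the study's own handling of leap days is not stated in the text.

```python
import calendar

# Season definitions as given in the text.
SEASONS = {
    "JFM": (1, 2, 3),
    "AMJ": (4, 5, 6),
    "JAS": (7, 8, 9),
    "OND": (10, 11, 12),
}


def season_of(month):
    """Return the season label for a calendar month (1-12)."""
    for name, months in SEASONS.items():
        if month in months:
            return name
    raise ValueError(f"invalid month: {month}")


def n_six_hourly_steps(season, first_year, last_year):
    """Count 6-hourly time steps of one season over an inclusive year range.

    Assumption: four values per calendar day, leap days included.
    """
    days = sum(
        calendar.monthrange(year, month)[1]
        for year in range(first_year, last_year + 1)
        for month in SEASONS[season]
    )
    return 4 * days
```

Under these assumptions, the JFM season of 1981–2000 contains 7220 six-hourly steps, so each seasonal model is fitted to several thousand samples.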

[7] The results of our analysis on selection of model and predictors, which is detailed in the auxiliary material (SM1 and SM2), indicate that, among the models we have tried, the best model for predicting 6-hourly *H*_{s} is a multivariate regression model of the form:

$$H_t = a + \sum_{k=1}^{K} b_k X_{k,t} + \sum_{p=1}^{P} c_p H_{t-p} + u_t \qquad (1)$$

where *H*_{t} is the Box-Cox transformed *H*_{s} (see SM1 of the auxiliary material) at a target wave grid point, *X*_{k,t} are the *K* SLP-based predictors that are retained for the wave grid point (detailed below), *P* is the order of lags of the dependent variable (the predictand), *a*, *b*_{k}, and *c*_{p} are the regression coefficients, and the residuals *u*_{t} can generally be modelled as an *M*-order autoregressive process, AR(*M*). *u*_{t} is a white noise process if *M* = 0. We also include the special case of *P* = 0:

$$H_t = a + \sum_{k=1}^{K} b_k X_{k,t} + u_t \qquad (2)$$

The Box-Cox transformation [*Box and Cox*, 1964] of *H*_{s} is necessary to bring the residuals of the model fit close to a normal distribution, as assumed in the regression analysis.
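The model just described regresses the Box-Cox transformed *H*_{s} on the *K* retained predictors and on *P* lagged values of itself. The following Python sketch shows one way such a fit could be set up with ordinary least squares. It is an illustration only: the function names, the coefficient ordering, and the use of a fixed Box-Cox parameter `lam` are assumptions, and the treatment of autocorrelated residuals is left out here.

```python
import numpy as np


def box_cox(x, lam):
    """Box-Cox transform of positive data for a given parameter lam."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x**lam - 1.0) / lam


def lagged_design(H, X, P):
    """Build the design matrix for rows t = P, ..., T-1.

    Columns: intercept, the K predictors X[t, :], then H[t-1], ..., H[t-P].
    """
    T = len(H)
    cols = [np.ones(T - P), *X[P:].T]
    cols += [H[P - p:T - p] for p in range(1, P + 1)]
    return np.column_stack(cols), H[P:]


def fit_lagged_ols(H, X, P):
    """OLS estimates ordered as (intercept, K predictor coefs, P lag coefs)."""
    A, y = lagged_design(H, X, P)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef
```

Setting `P = 0` drops the lag columns, which corresponds to the special case without lagged terms.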

[8] For each wave grid point, a pool of 62 *potential* predictors is derived from the fields of squared SLP gradients *G*_{t}^{o} and of SLP, following the findings of previous studies [*Wang and Swail*, 2006a, 2006b]. An improvement here is that a Box-Cox transformation is also sought and applied to *G*_{t}^{o} before they are used in this study (see SM1 and SM2 of the auxiliary material for how and why). Let *G*_{t} denote the anomalies of the transformed *G*_{t}^{o}, and *P*_{t}, the anomalies of SLP (here anomalies are relative to the 1981–2000 mean). In order to model the swell components of waves, the 30 leading principal components (PCs) of the *P*_{t} fields (*PCPs*), and of the *G*_{t} fields (*PCGs*) are also included in the pool of potential predictors, in addition to the local *P*_{t} and *G*_{t} (which are derived as in *Wang and Swail* [2006a, 2006b]). Considering that swells are generated remotely, these PCs are derived from the *P*_{t} and *G*_{t} fields that are for a larger area than the *H*_{s} fields (see Figure S1 of the auxiliary material). These *PCPs* (*PCGs*) are used here to represent large scale patterns of atmospheric circulation (of the geostrophic wind energy), explaining 88%–92% (43%–52%) of the total variance. We chose to include 30 leading PCs because we want to focus on large scale variations and we found that inclusion of higher order PCs in the pool has trivial effects on the resulting trend estimates.
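As a rough illustration of how leading PCs and their explained variance can be obtained from an anomaly field, the sketch below uses a singular value decomposition. It makes simplifying assumptions that differ from the study: anomalies are taken relative to the sample mean rather than the 1981–2000 mean, no area weighting is applied, and the function name `leading_pcs` is hypothetical.

```python
import numpy as np


def leading_pcs(field, n_pcs):
    """Leading principal components of a (time, gridpoint) field.

    Returns the PC time series, shape (time, n_pcs), and the fraction
    of total variance explained by those PCs.
    """
    field = np.asarray(field, dtype=float)
    anomalies = field - field.mean(axis=0)  # assumption: sample-mean anomalies
    U, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    pcs = U[:, :n_pcs] * s[:n_pcs]
    explained = float((s[:n_pcs] ** 2).sum() / (s**2).sum())
    return pcs, explained
```

With `n_pcs=30` applied to the SLP anomaly fields and to the transformed gradient fields, this would yield two sets of 30 PC series of the kind described above.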

[9] A forward model-selection procedure is used to determine which and how many of the 62 potential predictors need to be retained in the regression model for a target wave grid point, using the F test with the equivalent sample size [*von Storch and Zwiers*, 1999] (see SM2 of the auxiliary material for how and why). The F tests, and all other tests in this study, are conducted at the 5% significance level. As a result, different predictors are usually retained for different wave grid points, although the pool of potential predictors is almost identical for all grid points (except for the local *P*_{t} and *G*_{t}). Both the composition and the number *K* of the retained predictors could vary with location. The results show that 20 ≤ *K* ≤ 40 for over 95% of the wave grid points (30 ≤ *K* ≤ 35 for about half of the points), and that *K* is usually larger for points in the mid-latitudes than in the high latitudes (not shown).
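A generic forward-selection loop based on a partial F statistic might look like the sketch below. It is not the procedure of SM2: it uses the nominal sample size rather than the equivalent sample size, and the critical value `f_crit` is a user-supplied assumption (3.84 is approximately the 5% critical value of an F distribution with 1 and many degrees of freedom). The function names are hypothetical.

```python
import numpy as np


def _rss(y, X, cols):
    """Residual sum of squares of an OLS fit on an intercept plus X[:, cols]."""
    A = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def forward_select(y, X, f_crit=3.84):
    """Add predictors one at a time while the best partial F exceeds f_crit."""
    n, m = X.shape
    selected = []
    rss_now = _rss(y, X, selected)
    while len(selected) < m:
        best = None
        for j in range(m):
            if j in selected:
                continue
            rss_new = _rss(y, X, selected + [j])
            # Residual degrees of freedom: intercept + retained + candidate.
            dof = n - (len(selected) + 2)
            f_stat = (rss_now - rss_new) / (rss_new / dof)
            if best is None or f_stat > best[1]:
                best = (j, f_stat, rss_new)
        if best[1] < f_crit:
            break
        selected.append(best[0])
        rss_now = best[2]
    return selected
```

Replacing the nominal `n` with an equivalent sample size would lower the degrees of freedom and make the test more conservative for serially correlated data.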

[10] The *P* and *M* values in models (1) and (2) are also determined using the F test with the equivalent sample size (see SM2 of the auxiliary material). The results show that 2 ≤ *P* ≤ 4 for about 93% (85%) of the wave grid points in winter (summer), and that 1 ≤ *M* ≤ 4 for about 80% of the points.

[11] The ordinary least squares (OLS) parameter estimates and forecasts of model (1) will be biased and inconsistent even for large samples if *u*_{t} are serially correlated [*Ramanathan*, 2002]. In this case, the Cochrane-Orcutt (CO) algorithm [*Cochrane and Orcutt*, 1949] will give consistent estimates, and with a slight modification it also gives consistent standard errors [*Ramanathan*, 2002]. Thus, when *M* > 0 we used the slightly modified CO algorithm [*Ramanathan*, 2002] to provide the final estimates of parameters in model (1). However, the iterations in the CO algorithm might not converge, in which case we used the OLS estimates of parameters in model (1) and assumed that *M* = 0. Our results show that, even in this case, model (1) still has the best performance. This is probably because the sample size here is unusually large (about 7200 and 10800 when fitting with data for 20 and 30 years, respectively).
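For readers unfamiliar with the CO iteration, the sketch below shows the classical version for AR(1) residuals only. It is not the slightly modified algorithm of *Ramanathan* [2002] used in the study, which also handles higher orders *M* and corrects the standard errors. The function name, the tolerance, and the iteration cap are assumptions; the returned flag mirrors the convergence check mentioned in the text.

```python
import numpy as np


def cochrane_orcutt(y, A, n_iter=50, tol=1e-6):
    """Classical Cochrane-Orcutt iteration for AR(1) residuals.

    y: response, shape (T,). A: design matrix including an intercept column.
    Returns (beta, rho, converged).
    """
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # start from OLS
    rho = 0.0
    for _ in range(n_iter):
        u = y - A @ beta
        rho_new = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])
        # Quasi-difference the data, then re-estimate beta by OLS.
        y_star = y[1:] - rho_new * y[:-1]
        A_star = A[1:] - rho_new * A[:-1]
        beta, *_ = np.linalg.lstsq(A_star, y_star, rcond=None)
        if abs(rho_new - rho) < tol:
            return beta, rho_new, True
        rho = rho_new
    return beta, rho, False
```

When the flag comes back `False`, a caller could fall back to the OLS estimates, in the spirit of the fallback described above.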

[12] The observed values of *H*_{t} and *H*_{t−p} for all *p* ∈ {1, …, *P*} in the calibration period are used to estimate the parameters in model (1). However, *H*_{t−p} has to be replaced by the predicted value, *Ĥ*_{t−p}, when the fitted model (1) is used to make predictions. Even with such replacements, model (1) still has the best performance among all models we have tried so far (see SM2 of the auxiliary material). Thus, it is used in this study to predict ocean wave heights using SLP-based predictors.

[13] Note that predictions by model (1) are not available for the first *P* time points of each season because *Ĥ*_{t−P} does not exist for *t* ≤ *P* when *P* > 0. Similarly, predictions by model (1) or (2) are not available for the first *M* time points of each season when *M* > 0. For simplicity, we use the fitted model (2), assuming white noise (i.e., *M* = 0), to make predictions for the first (*P* + *M*) time points of each season when (*P* + *M*) > 0.
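The prediction step, in which lagged observations are replaced by earlier predictions, can be sketched as a simple recursion. The names `a`, `b`, `c`, and `H_init` are illustrative assumptions: `a` is an intercept, `b` holds the predictor coefficients, `c` holds the lag coefficients, and `H_init` holds the first *P* values obtained some other way (for example from the model without lagged terms). Residual autocorrelation (*M* > 0) is ignored in this sketch.

```python
import numpy as np


def predict_recursive(a, b, c, X, H_init):
    """Predict a transformed wave-height series using its own past predictions.

    X: predictors, shape (T, K). b: shape (K,). c: lag coefficients, shape (P,).
    H_init: the first P values of the series, supplied by the caller.
    """
    b = np.asarray(b, dtype=float)
    P = len(c)
    H_hat = [float(h) for h in H_init]
    if len(H_hat) != P:
        raise ValueError("H_init must contain exactly P values")
    for t in range(P, len(X)):
        # Lag term uses previously predicted values, not observations.
        lag_term = sum(c[p - 1] * H_hat[t - p] for p in range(1, P + 1))
        H_hat.append(float(a + X[t] @ b + lag_term))
    return np.array(H_hat)
```

For example, with `a = 1`, `b = [2]`, `c = [0.5]`, a constant predictor of 1, and `H_init = [0]`, the recursion returns 0, 3, 4.5, 5.25.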

[14] As would be expected, the performance of the chosen best model varies spatially. As partly shown in Figures 1a–1c, in general, the model has a smaller relative root mean squared error (RMSE) (i.e., RMSE expressed in percentage of the 1981–2000 mean *H*_{s}) and a higher prediction skill in the higher latitudes than in the lower latitudes. The model also has a higher prediction skill in winter than in summer, with the lowest skill being seen in the southeast corner of the domain (Figures 1a–1c). Thus, higher confidence can be placed on the reconstructed *H*_{s} for the high latitudes than for the mid-latitudes (30–45°N), and for winter than for summer (especially in the mid-latitudes).

[15] Time series of the predictors derived from each member of the 20CR ensemble are then fed to the fitted model (1) to reconstruct 6-hourly *H*_{s} in the North Atlantic for the period 1871–2010. Time series of annual and seasonal mean and maximal *H*_{s} are derived from the reconstructed 6-hourly *H*_{s} and are subjected to a trend analysis using the Mann-Kendall estimator/test as described in *Wang and Swail* [2001, Appendix A].
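A plain Mann-Kendall test with a Kendall (Theil-Sen) slope estimate can be sketched as follows. This is a simplified stand-in, not the procedure of *Wang and Swail* [2001, Appendix A]: it assumes no ties and no serial correlation in the series, and it uses the normal approximation for the test statistic. The function name is hypothetical.

```python
import math

import numpy as np


def mann_kendall_sen(y):
    """Return (slope per time step, z statistic, two-sided p-value).

    Assumptions: equally spaced series, no ties, no autocorrelation.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    i, j = np.triu_indices(n, k=1)  # all pairs with i < j
    diffs = y[j] - y[i]
    S = float(np.sign(diffs).sum())
    slope = float(np.median(diffs / (j - i)))
    var_S = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected normal approximation.
    z = 0.0 if S == 0 else (S - math.copysign(1.0, S)) / math.sqrt(var_S)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return slope, z, p
```

Applied to a series of annual maxima, `slope` is the trend per year, and the trend would be called significant at the 5% level when `p < 0.05`.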

[16] For comparison, the climatological fields and linear trends were also derived from the MSC50 and ERA40 wave data sets. The patterns of the reconstructed 1958–2001 climate of the annual mean and maximum *H*_{s} are similar to those derived from ERA40 and MSC50; the reconstructed values are larger than those of the ERA40 but smaller than those of the MSC50 (see Figures 1d–1f for annual means). This is because the 20CR reconstruction is based on the ERA-Interim reanalysis of 1.0° (0.75° native) resolution, which is higher than that of ERA40 (1.5°) but lower than that of MSC50 (0.5°); and the 20CR SLP is of about 2.0° resolution. Such a resolution effect can be seen in the 1979–2001 climate of annual mean *H*_{s} as derived from ERA-Interim, ERA40, and MSC50 (Figures 1g–1i; 1979–2001 is their common period). The trend maps are shown in Figure 2 and discussed in section 3.

### 3. Results of Trend Analysis

[17] Since the 20CR consists of 56 member analyses, for each grid point we obtained an estimate of trend and its statistical significance from each of the 56 member time series of the reconstructed 20CR wave heights. The average of the 56 trend estimates is taken as the trend estimate for the reconstructed 20CR waves. For the annual statistics (mean and maximum), and selected seasonal statistics, the resulting trend maps are shown in Figure 2.

[18] As shown in Figures 2a–2f, trends in the statistically reconstructed 20CR annual means and maxima of *H*_{s} are in reasonable agreement with the corresponding trends derived from the MSC50 and ERA40 dynamical wave reanalyses over their common period 1958–2001. For the annual maxima of *H*_{s}, the trend patterns are characterized by increases in the northeast North Atlantic with decreases in the mid-latitudes (especially in the southwest sector of the domain; Figures 2d–2f). 20CR also shows somewhat smaller increases in annual maximum *H*_{s} in the area off the northern European coast, but larger increases in the Bay of Biscay, than do both reanalyses (Figures 2d–2f). In terms of annual mean *H*_{s}, 20CR shows smaller increases that are more limited to the northeast sector than does either MSC50 or ERA40, with more extensive decreases in the mid-latitudes (Figures 2a–2c). In summary, the three wave data sets (20CR, MSC50 and ERA40) are in reasonable agreement in terms of trends in *H*_{s} in the North Atlantic. Over their common period 1979–2001 (23 years), these data sets also show *H*_{s} trends that are similar to those in the ERA-Interim wave reanalysis (not shown).

[19] Over the longer period 1954–2010 (the common period of MSC50 and 20CR), trends in the reconstructed 20CR annual maximum *H*_{s} are also in general agreement with the trends derived from MSC50, although the decreases in the area off Labrador and in the area southwest of Portugal are seen only in 20CR, and MSC50 shows larger increases in the North Sea-Norwegian Sea and in the area off the Canadian east coast (Figures 2g–2h). Such general agreement between 20CR and MSC50 is also seen in the trend patterns of seasonal maximum *H*_{s} for each of the four seasons of the year, and of the annual mean *H*_{s} (not shown).

[20] The main features of the patterns of trends in the last half century are also seen in the last 140-yr period (1871–2010). As shown in Figures 2h and 2j, however, stronger trends are seen in the last 57 years than in the last 140 years. On the centennial scale, the increases in the northeast North Atlantic are much weaker, with much stronger and more extensive decreases in the southwest sector of the domain (Figure 2j). The area off Labrador experienced decreases in the last 57 years but increases in the last 140 years (Figures 2h and 2j). The 1871–2010 trend pattern is very similar to the 1871–1953 trend pattern (Figures 2i–2j), indicating that the 140-yr trend pattern is dominated by the first 83-yr trend pattern. The last 57-yr and the first 83-yr trends are also similar in the North Sea-Norwegian Sea, the Bay of Biscay, and the area south of Iceland; but they are of the opposite signs in the area west of Ireland, the area off Labrador, and the subtropics (Figures 2h–2i). The increases in the Norwegian Sea and the area south of Iceland are larger in the last 57 years (Figure 2h) than in the first 83 years (Figure 2i).

[21] The centennial-scale trends in the reconstructed 20CR seasonal maximum *H*_{s} also show seasonal variations. In winter (JFM), small increases are seen in almost the entire domain, with the largest increases being seen in the area off the Norwegian coast and off the Canadian east coast (Figure 2k). In spring (AMJ), changes are the smallest among the four seasons, showing small decreases in the mid-latitudes with slight increases in the area off the North American east coast and off the northern European coast (not shown). The changes in summer (JAS) are the largest among the four seasons. The summer trend pattern is characterized by increases in the northeast North Atlantic with decreases in the southwest North Atlantic, showing large increases in an extensive area northwest of the British Isles (Figure 2l). In autumn (OND), the largest increases are seen in the area off the south coast of Greenland, accompanied by decreases in the southwest corner of the domain (not shown).