The aim of this paper is to present a statistical downscaling method in which the relationships between present-day daily weather patterns and local rainfall data are derived and used to project future shifts in the frequency of heavy rainfall events under changing global climate conditions. National Centers for Environmental Prediction and the National Center for Atmospheric Research (NCEP/NCAR) reanalysis data from wet season months (November to April) 1958–2010 are composited for heavy rain days at 12 rainfall stations in the Hawaiian Islands. The occurrence of heavy rain events (days with amounts above the 90th percentile estimated from all wet season rain days 1958–2010) was found to be strongly correlated with upper level cyclonic circulation anomalies centered northwest of Hawai‘i and south-to-north transport of water vapor in the middle troposphere. The statistical downscaling model (SD) developed in this study was able to reproduce the observed interannual variations in the number of heavy rain events based on cross-validation resampling during the more recent interval 1978–2010. However, multidecadal changes associated with the mid-1970s climate shift were not well reproduced by the SD using NCEP/NCAR reanalysis data, likely due to inhomogeneities in the presatellite period of the NCEP/NCAR reanalysis. Application of the SD to two model scenarios from the CMIP3 database indicates a reduction of heavy rain events in the mid- to late 21st century. Based on these models, the likelihood of a widespread increase in synoptic heavy rain events in Hawai‘i as a result of anthropogenic climate change is low over the remainder of the century.
 Studies with general circulation models (GCMs) unequivocally predict rising global mean temperatures in response to anthropogenic climate forcing in the coming decades [IPCC, 2007]. The GCM results from the Coupled Model Intercomparison Project (CMIP) provide the basic information for investigating and understanding regional climate changes such as future changes in the mean climate as well as changes in the daily weather statistics including extreme events [Kharin et al., 2007; Wehner et al., 2009]. However, precipitation information from CMIP3 models is susceptible to systematic biases due to cloud/precipitation parameterizations and difficulties in representing tropical ocean-atmosphere interactions [Dai, 2006, 2012]. In addition to these general problems in modeling precipitation in global climate models, smaller-scale features of the hydrological cycle, in particular extreme events, cannot be directly estimated from coarse-resolution models without further refinements by either statistical methods or nested high-resolution regional dynamical models [Wilby et al., 2004; Maraun et al., 2010]. In this study, the region of interest is the state of Hawai‘i, where rainfall is crucial for the freshwater supply and where stakeholders are in need of reliable projections of future precipitation changes which may alter the frequencies of extreme rain events, flash floods, and droughts. Hawai‘i is located under the descending northern branch of the Hadley circulation with predominant northeast trade winds [Sanderson, 1993]. The complex topography of the Hawaiian Islands, with volcanic mountains reaching elevations of 4000 m, however, interacts with the circulation and forms spatially complex rainfall patterns [Lyons, 1982; Chu et al., 1993; Schroeder, 1993; Esteban and Chen, 2008; Giambelluca et al., 2011, 2012], with desert-like precipitation minimum zones and extremely wet conditions within a few tens of kilometers of each other.
 As a consequence of these contrasting climates, it is very challenging to analyze statewide changes in extreme rainfall events. Studies describing recent trends in daily rainfall extremes have been hampered by gaps in most of the long-term operating rain gauges [Chu et al., 2010; Elison Timm et al., 2011; Frazier, 2012]. However, Chu et al.  reported that between the years 1950–1979 and 1980–2007 most stations in Hawai‘i experienced a negative trend in the frequency and intensity of heavy rain events using standard metrics such as annual number of days with precipitation >25.4 mm and the annual maximum consecutive 5 day precipitation amounts. The authors also indicated that the trend might show opposite signs in the southernmost island.
 Since the resolution of current GCMs is too coarse to represent the topographic features of the Hawaiian Islands, these models inevitably miss important aspects of the regional rainfall characteristics found over the Islands of Hawai‘i [Zhang et al., 2012]. Direct inferences about changes in statistics of heavy rain events are, therefore, unlikely to provide an accurate estimate of the frequency distribution of precipitation extremes in future climate. Statistical downscaling methods have been employed in many regional climate change studies to fill the gap between GCMs and their unresolved physical processes and local climatic changes. For Hawai‘i, two studies have reported statistically downscaled future trends in extreme rainfall. Norton et al.  found, for a selected number of stations on O‘ahu, trends toward more frequent extreme rains, but with an overall reduced intensity. Their results were based on one particular GCM (MPI-ECHAM5 A2 scenario 2011–2040). Elison Timm et al.  used six different climate models from the IPCC AR4 report (A1B and A2 emission scenarios) and found no statistically robust change in the number of heavy rain events across the main islands. However, in their statistical downscaling, the predictor information was limited to changes in the large-scale climate modes over the North Pacific El Niño/Southern Oscillation (ENSO) and the Pacific North American (PNA) mode. Global climate models are not in agreement on future changes in means and variances of ENSO and PNA modes, and hence, the simulation of these phenomena is highly uncertain [Collins, 2004; Deser et al., 2010; Vecchi and Wittenberg, 2010]. Deser et al.  showed that the internal variability of the North Pacific atmospheric circulation is masking the forced mean climate response [see also Bonfils and Santer, 2010]. Currently, the discrepancies in representation of ENSO among the GCMs constitute a major uncertainty in future projections. 
Although efforts are on-going to reduce multimodel spread and the resulting uncertainties [Mote et al., 2011; Annan and Hargreaves, 2011], we emphasize that the main purpose of statistical downscaling methods is to fill the gap between the unresolved local-scale processes and the general circulation features that are explicitly simulated in the GCMs.
 The main purposes of this study are (1) to identify the synoptic weather patterns that are associated with heavy rain events in Hawai‘i during the wet season (November to April), (2) to estimate changes in the frequency of these synoptic circulation patterns for future climate scenarios, and (3) to give statistical estimates for the expected changes in the frequency of heavy rain days at individual rain gauge stations.
 In section 2, the data and the statistical downscaling (SD) model will be described. Section 3 presents results of the synoptic weather pattern analysis, the calibration and cross-validation of the SD model, and the future projection of heavy rain frequencies. Section 4 summarizes the results with a critical review and conclusions on the broader implications of our findings.
2 Data and Methods
 For the diagnostic analysis of the relationship between large-scale circulation and local heavy rain events, the daily mean NCEP/NCAR reanalysis 1 data were used [Kalnay et al., 1996; Kistler et al., 2001]. Despite the fact that newer and potentially more accurate reanalysis products are available for years following 1978, we initially decided to use this data set because of its long temporal coverage from 1958 onward, in an attempt to test the downscaling method on multidecadal time scales.
 We will later discuss and reassess the long-term trends in NCEP/NCAR reanalysis data.
 We limited the spatial domain to the region of 10°S to 40°N, 120°E to 180°E (21 × 25 grid points), which encompasses the area whose synoptic features affect the weather in Hawai‘i. Daily mean data (with respect to 00:00 coordinated universal time (UTC)) of the variables listed in Table 2 (and Table 3) are used in this study to explore the relationship between large-scale weather patterns over the northern and tropical Pacific and high-intensity rainfall at specific stations in Hawai‘i [see also Timm and Diaz, 2009].
Table 1. Station Information and Basic High Rainfall Day Statistics^a
Column groups: 1958–1977 (November to April) | 1978–2008 (November to April)
^a Shown for each station are the 90th percentile daily rainfall amount (P90RF), data completeness (in percent of reported daily observations over the specified period), the number of days with rain greater than the P90RF (NP90), and the percentage of wet days (number of days with detectable rainfall divided by the total number of days). Note that the season year includes November and December of the preceding year (i.e., 1958 refers to November 1957 to April 1958).
Table 2. Large-Scale Predictor Variables Used in This Study
Air temperature difference (1000–500 hPa)
Geopotential height (500 hPa)
Meridional moisture advection (700 hPa)
Moisture advection magnitude (700 hPa)
Specific humidity (700 hPa)
Meridional wind (700 hPa)
Meridional wind (1000 hPa)
Table 3. Linear Regression Statistics: Calibration of the Linear Regression Model for the Prorated Number of Heavy Rain Events NP90 in the Wet Season (November to April)^a
Column groups: Jackknife Estimate 1977–2010 | Preshift Validation 1958–1976
^a Jackknife estimate of the explained variance (R2) and its standard deviation (in parentheses) during the calibration 1977–2010. Jackknife cross-validation refers to the correlation between the observed time series and the predicted values during the Jackknife calibration. Preshift validation 1958–1976 shows the correlation between the observed and calibrated MLR estimates. Prd_Mean gives the predicted 1958–1976 mean frequency, mean error shows the absolute differences between observed mean and predicted mean, and the last column “Persist. Err.” is the persistence error, the difference between unchanged number of events and observed number of events. Predictor variables are described in Table 2.
 The number of rainfall stations with high-quality, long-record daily rainfall data is limited in Hawai‘i. As described in previous studies of rainfall extremes [Chu et al., 2010; Elison Timm et al., 2011; Norton et al., 2011], only a few stations have nearly complete reports of daily rainfall in the available National Climatic Data Center (www.ncdc.noaa.gov) database. We selected 12 stations (Figure 1 and Table 1) with nearly complete daily records covering the years 1958–2010. In order to test the statistical downscaling method in a way that matches its application to long-term future climate change, we followed a calibration/cross-validation procedure as in Elison Timm et al. The temporal sample is divided into 1958–1976 and 1977–2010 subsets, representing the periods before and after the major climate shift in the Pacific during the mid-1970s [Graham, 1994; Trenberth, 1990; Meehl et al., 2009]. Note that we subsequently decided to use the 1977–2010 period for the statistical model calibration because of the higher-quality reanalysis data set for that period [Brands et al., 2012; also see discussion section].
 Heavy and extreme rain events in Hawai‘i show considerable differences in their amounts for different sites. Therefore, a threshold-based definition is preferred in this study to define heavy rain events. The choice of the threshold is based on the empirical cumulative distribution function (CDF) that is derived for each station from the sorted daily rainfall data using all years 1958–2010. Dry days (rain less than 0.01 inches (0.0254 cm)) were excluded. At each station we selected the 90th percentile (P90RF(k)) from the CDF for the study of rain events. Note that for dry stations with few rain days, the probability of a heavy rain event on a given day in the wet season is lower than the nominal 10% (see Table 1), since we based our heavy rain threshold on the cumulative distribution of nondry days only.
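The threshold definition above can be illustrated with a minimal sketch. The function names, the use of `None` for missing observations, millimeter units, and the linear-interpolation percentile convention are assumptions made for illustration; the text does not specify how the percentile was interpolated.

```python
import math

DRY_DAY_LIMIT_MM = 0.254  # 0.01 inch, the dry-day cutoff quoted in the text


def p90_threshold(daily_rain_mm, percentile=90.0):
    """Percentile of daily rain amounts, computed over nondry days only."""
    # Assumption: missing observations are encoded as None and skipped.
    wet = sorted(r for r in daily_rain_mm
                 if r is not None and r >= DRY_DAY_LIMIT_MM)
    if not wet:
        raise ValueError("no wet days in record")
    # Linear interpolation between order statistics (one of several conventions).
    pos = (percentile / 100.0) * (len(wet) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return wet[lo] + (wet[hi] - wet[lo]) * (pos - lo)


def heavy_rain_fraction(daily_rain_mm, threshold):
    """Fraction of all observed days (wet or dry) that exceed the threshold."""
    observed = [r for r in daily_rain_mm if r is not None]
    return sum(r > threshold for r in observed) / len(observed)
```

For a hypothetical dry station with 80 dry days and 20 wet days, the fraction of all days above the wet-day 90th percentile is well below 10%, which is the effect noted for dry stations.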
 Throughout the study only days during the wet season (November to December of the previous year plus January to April of the following year; e.g., year 1958 refers to November 1957 through April 1958) are considered, as the majority of heavy rain events occur during these months. The days with precipitation above the P90RF(k) were identified and counted, resulting in annually resolved time series of the numbers of heavy rain events NP90(t,k), where t and k denote the year and station index, respectively. Since the number of missing observations varies with time, frequency counts can have a time-dependent negative bias (i.e., heavy rain days are underestimated when records are incomplete). Assuming that the occurrence of a missing observation is independent of the rainfall amount, we correct the frequency counts with the factor
$$\frac{N_{\mathrm{complete}}(t,k)}{N_{\mathrm{complete}}(t,k) - N_{\mathrm{miss}}(t,k)} \qquad (1)$$
where Ncomplete(t,k) is the number of days in the season (181, or 182 in seasons with leap years) and Nmiss(t,k) is the number of days with missing rainfall records. This prorated event statistic is justified as long as no other information indicates the existence of any conditional dependence between the missing events and precipitation intensity. In the following paragraphs, we describe the details of the downscaling method. A schematic flow diagram of the statistical downscaling is given in Figure 2.
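As a rough illustration of the prorating step, the sketch below scales a raw seasonal count by the ratio of season length to observed days. This form of the factor is an assumption inferred from the definitions of Ncomplete and Nmiss and from the stated independence assumption; the function name and the choice to return `None` for seasons without observations are hypothetical.

```python
def prorated_count(n_heavy, n_complete, n_miss):
    """Scale a raw heavy-rain-day count to a full season.

    Assumed factor: n_complete / (n_complete - n_miss), i.e. missing days
    are taken to be as likely to be heavy rain days as observed days.
    """
    n_observed = n_complete - n_miss
    if n_observed <= 0:
        return None  # no observations in this season, nothing to prorate
    return n_heavy * n_complete / n_observed
```

With a complete record the count is unchanged; with half of a 182 day season missing, a raw count of 9 becomes 18.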
2.1 Composite Analysis
 Having identified days with rain exceeding the selected 90th percentile at a given station, the major weather anomalies contributing to the heavy rain days were analyzed. Let C+90(k) be the set of days with heavy rain events for a given station k. The composite anomaly pattern is the average of the NCEP/NCAR data fields taken over the subset C+90(k) minus the long-term mean of the calibration period. Areas with significant anomalies were identified for each composite using the t test (p ≤ 0.01), and only those regions were used in the subsequent projection procedure.
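A minimal sketch of the compositing step follows. It assumes each daily field has been flattened to a list of grid-point values and takes the anomaly as composite mean minus climatological mean (a sign convention assumed here). The significance test is omitted, and the function name is hypothetical.

```python
def composite_anomaly(fields, heavy_day_indices):
    """Mean field over heavy rain days minus the mean over all days.

    fields: list of daily fields, each a flat list of grid-point values.
    heavy_day_indices: positions in `fields` of the heavy rain days.
    """
    n_points = len(fields[0])
    # Long-term mean at every grid point.
    climatology = [sum(day[p] for day in fields) / len(fields)
                   for p in range(n_points)]
    # Mean over the heavy-rain subset only.
    subset = [fields[i] for i in heavy_day_indices]
    composite = [sum(day[p] for day in subset) / len(subset)
                 for p in range(n_points)]
    return [c - m for c, m in zip(composite, climatology)]
```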
 Having derived a representative anomaly in the multivariate large-scale circulation through the composite analysis, each daily synoptic anomaly pattern (with respect to the climatological mean) was correlated with the composite pattern. Note that only the significant areas were taken into account to calculate the spatial correlation r(i,j,k) (here i indicates the day, j and k are indices for the climatic variable and station, respectively). We found that the spatial correlation coefficient gives a well-defined measure of similarity with a skill level equivalent to the projection index, which measures the amplitude as well as the spatial pattern of the anomaly.
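The similarity measure can be sketched as a pattern correlation restricted to the significant grid points. A centered Pearson correlation without area weighting is assumed here; the text does not state which variant was used, and the names are illustrative.

```python
import math


def masked_pattern_correlation(daily_anomaly, composite, significant):
    """Pearson correlation between two fields over flagged grid points."""
    x = [a for a, keep in zip(daily_anomaly, significant) if keep]
    y = [c for c, keep in zip(composite, significant) if keep]
    n = len(x)
    if n < 2:
        raise ValueError("need at least two significant grid points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("field is constant over the significant area")
    return sxy / math.sqrt(sxx * syy)
```

Because the correlation is scale invariant, a daily anomaly with the same shape but ten times the amplitude yields the same value, which is the sense in which this index measures pattern similarity rather than amplitude.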
2.2 Multiple Linear Regression (MLR)
 The spatial correlation time series r(i,j,k) were passed through a peak-over-threshold analysis, where the threshold correlation (R90C(j,k)) is set such that the corresponding exceedance rate matches the expected number of heavy rain days during the calibration period. For each year t, each composite variable j, and each station k, the number of days N90C(t,j,k) was counted for which the spatial correlation r(i,j,k) was greater than R90C(j,k). Note that in this notation the daily time index i is limited to wet season days of year t. The annual-resolution time series N90C(t,j,k) are then used as predictors in a multiple linear regression model, where the predictand is the number of heavy rain events per season Ñ90P(t,k) for a station k:
$$\tilde{N}_{90P}(t,k) = a_0(k) + \sum_{j} a_j(k)\, N_{90C}(t,j,k) + \varepsilon(t,k) \qquad (2)$$
where a0(k) and aj(k) denote the station-specific regression coefficients and ε(t,k) is the residual.
 The time series was divided into two subintervals, of which the years 1977–2010 were used for the calibration of the multiple regression parameters.
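The peak-over-threshold counting that produces the regression predictors can be sketched as follows. The midpoint rule for the threshold and the function names are assumptions for illustration; any value between the n-th and (n+1)-th largest correlation gives the same exceedance count when there are no ties.

```python
def exceedance_threshold(correlations, n_heavy_days):
    """Threshold exceeded by exactly n_heavy_days values (absent ties)."""
    ranked = sorted(correlations, reverse=True)
    if not 0 < n_heavy_days < len(ranked):
        raise ValueError("n_heavy_days must be between 1 and len - 1")
    # Midpoint between the n-th and (n+1)-th largest correlation.
    return 0.5 * (ranked[n_heavy_days - 1] + ranked[n_heavy_days])


def seasonal_exceedance_counts(correlations, season_years, threshold):
    """Number of days per season year with correlation above the threshold."""
    counts = {}
    for r, year in zip(correlations, season_years):
        counts[year] = counts.get(year, 0) + (1 if r > threshold else 0)
    return counts
```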
 In order to find the optimal combination of predictors, we applied the Bayesian information criterion (BIC) first for the full calibration period (and kept the predictors unchanged in the Jackknife cross-validation procedure). Other measures such as the Akaike information criterion [Burnham, 2004; von Storch and Zwiers, 1999] or adjusted R2 statistics were also considered. Little difference in the overall statistical skill (during calibration and cross-validation) was found. The BIC usually selected the smallest number of significant predictors. Application of the BIC resulted in station-dependent combinations of predictors. We decided to use the specific predictor sets rather than attempting to further limit the predictors to those with significance among all stations, since the composite pattern differed from station to station, which is an indicator of locally variable influences on rainfall in Hawai‘i.
 Within the calibration interval 1977–2010, the Jackknife (cross-validation) method was used, in which the predictor variables were kept unchanged and 1 year of the data sample was withheld for each run while performing the MLR fit and calculating basic calibration statistics. This allowed for 33 MLR calibrations, and each calibrated MLR gave one predicted number of heavy rain events for the left-out year. We note that the autocorrelation in the heavy rain event numbers is small (range −0.25 to 0.25, with the exception of one station (0.5)). Furthermore, the withheld data from the years 1958–1976 were reserved for an independent cross-validation, in an attempt to evaluate the performance of the MLR in the presence of multidecadal changes.
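The leave-one-year-out procedure can be sketched as below. For brevity the example fits a single predictor by ordinary least squares, which is a simplification of the multiple regression used in the study; the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - slope * mx, slope


def leave_one_out_predictions(x, y):
    """Predict each year from a fit that excludes that year."""
    predictions = []
    for i in range(len(x)):
        # Withhold year i, refit on the remaining years.
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        predictions.append(intercept + slope * x[i])
    return predictions
```

Correlating the returned predictions with the observed series gives a cross-validated skill estimate that is not inflated by fitting and evaluating on the same years.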
 After estimating the regression parameters, the years 1958–1976 were used for cross-validation of the long-term mean changes. For each day the NCEP/NCAR anomaly fields were correlated against the composite pattern and the number of days with correlations above the fixed thresholds was counted in each wet season. The downscaled numbers of heavy rain days per season Ñ90P(t,k) were obtained by applying the regression model. The downscaled estimates were compared with observed numbers of heavy rain days NP90(t,k). The validation statistics include correlation and bias analysis. Furthermore, we exchanged the calibration and validation period to test the overall robustness of the regression parameters and regression skills.
2.3 Application to Future Climate Change Scenarios
 After the statistical regression model was fitted for the individual stations, the spatial correlation was calculated with daily mean data output fields from two members of the CMIP3 database: the CCCMA-CGCM3.1 and MPI-ECHAM5. We analyzed the emission scenario A1B for the mid- and late 21st century (2047–2064 and 2082–2100). The CCCMA-CGCM3.1 provided three ensemble simulations that we used to produce an average scenario from the three runs.
 These models performed reasonably well in representing seasonal mean sea level pressure fields in the historical 20th century simulations [Timm and Diaz, 2009]. We note that these GCMs generally lack an accurate representation of synoptic daily weather patterns. This directly affects the spatial correlation statistics and can lead to biases in the spatial correlation coefficient distribution (see Figure 6). We therefore estimated the ratio in probabilities above the R90C(j,k) threshold using the late 20th century (1961–2000) simulations as a standard against the 21st century model results.
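The ratio of exceedance probabilities described above can be sketched as follows, assuming the spatial correlations from the 20th century and 21st century simulations are available as plain lists. The function names are illustrative, and how the ratio enters the subsequent regression step is not shown.

```python
def exceedance_probability(correlations, threshold):
    """Empirical probability that a daily correlation exceeds the threshold."""
    return sum(r > threshold for r in correlations) / len(correlations)


def exceedance_ratio(r_future, r_historical, threshold):
    """Future exceedance probability relative to the model's own 20th century run.

    Using the same model for both periods is meant to cancel, to first order,
    model biases in the correlation distribution.
    """
    p_hist = exceedance_probability(r_historical, threshold)
    if p_hist == 0:
        raise ValueError("no exceedances in the historical simulation")
    return exceedance_probability(r_future, threshold) / p_hist
```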
3.1 Composite Pattern Analysis
 The majority of the heavy rain events are linked to anomalies in the synoptic weather pattern associated with an upper level low-pressure system destabilizing the air column across the Hawaiian Islands. Figure 3 shows the composite anomalies associated with rainfall events above the 90th percentile during the wet seasons for four stations from different islands. As the selection illustrates, a key feature for heavy rain events is a cyclonic circulation anomaly centered slightly to the west of the major islands. It is worth taking a closer look at the differences between the four stations, since opposing trends in rainfall extremes were reported for stations from the Big Island and Kaua‘i [Chu et al., 2010]. One possible cause is that more extratropical weather disturbances reach the northwestern islands (especially Kaua‘i) than the islands farther to the southeast. The composite pattern for Līhu‘e (Figure 3a) shows the center of the upper level disturbance on average located northwest of the islands, whereas the center is located farther south in the composites for stations on the Big Island and Maui. Furthermore, the inflow of moist air exhibits a more pronounced southeasterly component.
 The two nearby stations from the Island of Hawai‘i are presented here as an example of the spatial complexity of the rainfall pattern introduced by the topography within a short distance. Hilo airport is located in a region with predominantly trade-wind-induced rainfall as a result of the orographically forced convergence of the air masses. The intensity of the rainfall appears to be affected to a large degree by the vertical stability. With an upper level low disturbance located over Hawai‘i, Hilo experiences a higher frequency of intense precipitation events. A similar destabilization affects the frequency of heavy rain events farther south on the windward side of Hawai‘i at station Na‘alehu, but in contrast to the Hilo station, the heavy rain events depend more strongly on the surface low anomaly with southerly wind advecting moist tropical air masses toward the island.
3.2 Relation Between Heavy Rain Frequency and Large-Scale Predictors
 After the multivariate composite pattern had been derived for each of the 12 rain gauge stations, the spatial correlation time series r(i,j,k) were computed. As shown in Figure 4, the ability to “hindcast” individual heavy rain events from the daily mean weather anomalies is limited. It is clear that the threshold-based decision between heavy and nonheavy rainfall still has a large uncertainty in terms of “false alarms” and “missed events.” Several factors contribute to the limited success with daily data. First, small-scale weather extremes are, owing to their chaotic nature, predictable only to a certain degree. Under the same synoptic weather conditions, essentially random small-scale perturbations can lead to significant differences in rainfall amounts at any one specific site. Second, small-scale disturbances in the atmospheric stability and moisture distribution that cause short but intensive rain events are not fully represented by the 2.5° × 2.5° grid resolution of the reanalysis data used in this study. Third, the daily rainfall data and NCEP/NCAR reanalysis data do not in general integrate amounts over the exact same time interval. Daily means are based on 6 h reanalysis data in UTC, whereas local rain measurements are in general integrated over 24 h periods beginning and ending at a station-specific observation time.
 While prediction of individual events is beyond the goal of the statistical downscaling for future change studies, the method can be used to predict whole-season frequency counts for heavy rain days. The linear regression models derived from the annual time series are shown in Figure 5. The fitted regression models have explained variance (R2) values between 40% and 70% (see Table 3) in the Jackknife calibration. As expected, the calibration overestimates the explained variances. A more conservative measure of the regression skill is provided by the cross-validation using the correlation between the time series of heavy rain days predicted by the Jackknife approach and the observed time series in 1977–2010. From Table 3 it can be seen that 37%–66% of the interannual variability is reproduced by the individual regression models, resulting in an overall high correlation in the recent decades for all 12 stations pooled together (Figure 6).
 The downscaling models are applied during the earlier cross-validation period (1958–1976), and the regression-based time series with heavy rain counts are compared with the observed values. The results in Table 3 indicate that 7 of the 12 stations have significant correlations between regressed and observed time series; 5 failed the test at the chosen significance level of 5%. It is unlikely that our multivariate calibration has been overfitted, and we argue that the apparent drop in correlations may result from changes in the data quality in rainfall observations and/or reanalysis products in the presatellite period. For example, we compared the composite patterns of the pre- and postshift periods and found that for the Hilo station, while the specific humidity transport anomalies maintained their key pattern, the amplitude was reduced significantly in the preshift period. In particular, heavy rainfall frequency for the 1969 season, which recorded the highest number of events (30), was underestimated (8). Withholding this outlier year yielded an explained variance of 15% during the validation period. Moreover, the long-term mean changes between the pre- and postshift periods at station Hilo were not reproduced. In general, the downscaled mean changes during the preshift period had an error amplitude comparable to that of a persistence model estimate (Table 3).
 The data assimilation methods behind the development of reanalysis products are optimized for estimating the state of the atmosphere. Systematic changes in the observation network, such as incorporation of remote sensing products over otherwise sparsely observed oceans, inevitably change the quality of the reanalysis.
 In order to illustrate the problems with using the reanalysis data set for downscaling of long-term trends, we compared sea level pressure and specific humidity at 700 hPa between NCEP/NCAR and ERA-40 [Uppala et al., 2005] (seasonal averages 1958–2002) using principal component analysis. Whereas dominant modes of variability in the sea level pressure fields were highly correlated in their spatial pattern and in their principal component time series (not shown), the specific humidity fields at 700 hPa exhibit opposite trend behavior in the North Pacific sector (Figure 7). Furthermore, recent studies have shown that daily specific humidity reanalysis data have low correlations over the tropical Pacific region including Hawai‘i [Brands et al., 2012]. It is therefore likely that the accuracy of the predictor variables has improved with the beginning of satellite observations in the 1970s. As we rely on variables measuring moisture transports in the free atmosphere, it is equally likely that error-prone predictors contribute to the lower regression skills in the preshift period.
 We note that part of the problem with downscaling long-term changes could be due to a breakdown of the observed (calibrated) correlations between heavy rainfall events and the large-scale circulation coinciding with the climate shift in the mid-1970s. However, a more homogeneous reanalysis of the climate over the entire earlier period would be needed to quantify this effect. In the following, we will continue with the description of past and future frequency changes at the four selected stations.
3.3 Changes in Heavy Rain Frequency
 The observed decrease in heavy rain events (Table 4) over the last several decades indicates that systematic changes in the synoptic weather pattern around Hawai‘i have occurred [Giambelluca et al., 2008; Chu et al., 2010; see also Diaz et al., 2011]. As a measure of the potential changes in synoptic weather patterns, we estimated the empirical probability density functions (PDFs) of the spatial correlation samples for the two time periods using kernel-based density function estimators [Silverman, 1986]. For downscaling purposes, changes in the PDF are of importance at the high correlations as they determine the probability of the heavy rain events. In order to obtain qualitative estimates of whether the frequency of heavy rain will change with rising global mean temperatures, we analyze daily data from two models of the CMIP3 historical 20th century runs and SRES A1B emission scenarios. We performed the analysis for the 20th century period and two periods in the 21st century. The results are shown in Figure 8 for the key predictor variables at four selected stations.
Table 4. Prorated Number of Heavy Rain Events NP90 Observed in the Wet Season (November to April) Between 1958–1976 and 1977–2010 and Estimated Changes According to the CMIP3 SRES A1B Scenario^a
^a Boldfaced values indicate a decrease in the number of heavy rain events with respect to the 1977–2010 period.
 The examples of the four stations illustrated in Figure 8 show how changes in the synoptic circulation anomalies can be integrated into an index that measures the frequency of weather patterns with potential for heavy rain events. A comparison among the present-day and future scenarios suggests that Līhu‘e and Kailua are likely to experience decreases in the occurrence of heavy-rain-related weather patterns, whereas Hilo and Na‘alehu are projected to remain close to present-day conditions. Note that Figure 8 is only representative of one of the multiple predictor variables used in the downscaling process for a given station.
 The threshold-based discrimination of the synoptic circulation anomalies generates daily-resolved sequences of the binary variable, heavy and nonheavy rain days. After summation over individual seasons, we used the multiple linear regression (equation (2)) and translated the climate model information into heavy rain frequencies. The results are summarized for the model scenario simulations in Table 4. Overall, the changes in the synoptic circulation pattern indicate fewer heavy rain days. The small station number limits further classification of the projected changes in terms of mean rainfall climatologies. Grouped according to the average rainfall, no systematic relation was found between the projected amplitude of the anomalies and the average rainfall at the stations (Figure 9). The lack of clear evidence for a conditional dependence on the geographic location or on the seasonal mean precipitation amount is the basis for the statistical significance test described below.
 We consider the null hypothesis that the projected changes at the 12 stations were obtained by chance and ask what the probability is of obtaining, say, eight or nine out of 12 stations with a reduction in heavy rain events in the future scenarios. In such a random-guess downscaling, the sign of the change would be evenly distributed and independent of the large-scale climate changes. With a 50% chance for an increase or decrease, the random model would show on average 6 out of 12 stations with negative trends. In the calibration of our statistical downscaling, the temporal correlation among the stations' heavy rain frequencies is high, an indication that the large-scale circulation organizes the trends spatially. In this way the SD organizes the response to the climate change scenario, and we observe in three of the four independent scenarios that the number of negative trends is significantly higher than the average outcome of repeated Bernoulli trials (p < 0.10). Moreover, we see that at most stations the downscaled signs of the changes are consistent (see Table 4), another indication that between the two scenarios and two future time periods, a consistent pattern emerges in the mid- to late 21st century with fewer heavy rain events for Hawai‘i. Nevertheless, 3 (of the 12) stations in fact show an increase in the number of heavy rain events under the future climate change scenarios.
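The random-guess reference used in this sign test can be made concrete with a short sketch of the binomial upper-tail probability. It treats the 12 stations as independent trials with a 50% chance of a decrease, which is the random model described above and not a statement about the actual dependence between stations; the function name is illustrative.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))
```

Under this random model, `binomial_upper_tail(9, 12)` is about 0.073 and `binomial_upper_tail(8, 12)` is about 0.194, so nine or more decreasing stations out of 12 would fall below a 0.10 level, whereas eight or more would not.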
4 Discussion of the Results
 The results presented in the previous section need to be further discussed in light of the underlying statistical constraints and implicit assumptions made in the process of the downscaling. Initial tests showed that the statistical downscaling of heavy rain events with an event-by-event approach was not applicable (although significant skill was achieved compared with a purely random guess model). For each day the large-scale synoptic circulation anomalies and rainfall in Hawai‘i are linked through a number of “hidden” processes leading to a causal connection. However, it is clear that small differences in the synoptic conditions (e.g., wind direction, lifting condensation level, stratification of the ambient air, and entrainment processes in convective clouds) can lead to different outcomes due to the chaotic nature of the atmosphere. This limits the statistical downscaling skill. Furthermore, errors in the reanalysis data and rainfall observations are reducing the downscaling skill in practice. Daily rainfall amounts are measured over a 24 h period, but some heavy rain events could be the result of false reporting, when the accumulated amount of two or more consecutive days was reported for the last day of that period. Another problem results from the use of daily mean reanalysis products, which are aligned to the UTC time. The 10 h offset between local time and UTC and the additional shift due to the common practice of reporting rainfall amounts in the morning hours reduce the regression skill of the downscaling model for individual events. The purpose of statistical downscaling, however, is to simulate climatological mean statistics and their changes with time. The downscaling task was accordingly reduced to estimating seasonal frequency counts. As was shown in Figure 5, the linear relation between the seasonal number of heavy rain events and large-scale circulation anomaly events justifies this holistic approach.
 We have established an empirical statistical downscaling model that is based on the assumption of stationarity. Past observations in local rainfall were investigated for their conditional dependence on the large-scale synoptic circulation anomalies. In order to use the SD model for future predictions, it is important to consider under what conditions this assumption is justified. The predictor information used in the study was based on dynamic and thermodynamic variables as well as hydrological variables. We identified that most stations require a destabilization in the upper level atmosphere and increased moisture transport toward the islands. These key principles are expected to operate also under future climate change, despite discussions that the efficiency of the hydrological cycle within the Hadley or Walker circulation may change [Held and Soden, 2006; Vecchi et al., 2006; Vecchi and Soden, 2007; Tokinaga et al., 2012].
 The composite-based analysis technique measures the similarity of the synoptic pattern to the past observed heavy rain situations by means of spatial correlation indices; the amplitude of the pattern itself is not taken into account. Therefore, the SD model does not take into account the full magnitude of specific humidity changes, in particular the temperature-dependent increase in specific humidity indicated by the Clausius-Clapeyron equation. On the other hand, the trade wind inversion strength and height of the inversion layer can control the rainfall intensity to a degree that simple scaling laws for precipitation changes may not apply in Hawai‘i [Cao et al., 2007]. The SD model lacks the adaptive skill to capture new classes of high-intensity rain events, which in the past were associated with rainfall amounts of moderate magnitude. Testing the stationarity assumption would need a comparison between statistical and dynamical regional modeling results that include such nonstationary changes [Vrac et al., 2007]. For Hawai‘i, regional climate change scenario simulations are currently under development [Zhang et al., 2012].
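 The insensitivity of a spatial correlation index to pattern amplitude can be shown with a short example. The sketch below assumes an unweighted, centered Pearson correlation between two flattened anomaly maps; the study's exact index (e.g., any area weighting) may differ, and the function name and all numbers are hypothetical.

```python
from math import sqrt

def pattern_correlation(field, composite):
    """Centered, unweighted Pearson correlation between two flattened maps."""
    n = len(field)
    mean_f = sum(field) / n
    mean_c = sum(composite) / n
    cov = sum((f - mean_f) * (c - mean_c) for f, c in zip(field, composite))
    var_f = sum((f - mean_f) ** 2 for f in field)
    var_c = sum((c - mean_c) ** 2 for c in composite)
    return cov / sqrt(var_f * var_c)

# Hypothetical composite pattern and one day's anomaly field (arbitrary units).
composite = [1.0, 0.5, -0.2, -1.0, 0.3]
day = [0.8, 0.6, -0.1, -0.9, 0.1]

r = pattern_correlation(day, composite)
# Doubling the amplitude of the daily field leaves the index unchanged.
r_doubled = pattern_correlation([2.0 * x for x in day], composite)
```

 Because `r` and `r_doubled` are identical, a day with the same spatial structure but twice the moisture anomaly would be classified in the same way, which is the limitation discussed above.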
 Support for the validity of the stationarity assumption was sought from our cross-validation approach, during which we applied the statistical downscaling model to a period with different long-term mean climate conditions in the North Pacific [Elison Timm et al., 2011]. The composite patterns that were identified in these two periods exhibit high spatial coherence. Changes in heavy rain frequencies are therefore interpreted as changes in the probability of recurring synoptic weather types similar to the ideas of Palmer . Our SD model had only moderate success in reproducing long-term mean changes in the frequency counts. Comparison with persistence models showed no significant improvement of the SD estimates over the persistence estimates (measured by the root mean square error). We found no clear dependence of the bias on the correlation skill of the regression models, which raises the question to what extent long-term trends in the NCEP/NCAR reanalysis data are reliable. Changes in the quality and type of observational data available for assimilation by reanalysis models have an imprint on the daily synoptic variability. Moreover, Brands et al.  compared specific humidity and geopotential height data between NCEP/NCAR and ERA-40 reanalysis products. They found low correlations between the two products on daily time scales. Based on the results and the discussed data limitations, we consider the pattern-based statistical downscaling concept applicable to future climate change scenarios.
 It was shown in Figure 8 that the two GCMs produce different spatial correlation statistics for the 20th century compared with the results from the reanalysis data. Mismatches in the probability density distribution reflect biases in the synoptic-type weather variability in the GCMs. Careful evaluation of the synoptic weather pattern for the North Pacific region was beyond the scope of this analysis and requires a multimodel comparison with well-defined metrics. Such an analysis would need a basin-wide and regional analysis of how well the GCMs can simulate extratropical cyclone tracks in terms of location and frequency as well as the strength and position of blocking events. On the other hand, it is standard practice to remove model biases from climate change signals by analyzing only anomalies with respect to the models' 20th century climatology. We adopt this concept in a similar way by keeping the spatial correlation value that discriminates between heavy and nonheavy rain days fixed at the level that was derived from NCEP/NCAR reanalysis data. The modeled changes in the frequency of heavy rain counts were expressed as ratios between the modeled 21st century and 20th century values and multiplied with the frequency counts of reanalysis data.
 This nonparametric approach is applicable only if the probability to exceed the threshold is comparable to the nominal probability of the reanalysis data. In particular, if the models fail to simulate heavy-rain-like weather patterns, the probability estimates would be based on small-sample statistics in the upper tail of the distribution. Consequently, large uncertainties would accompany the future versus present probability ratios. We chose to work with 90th percentile heavy rain events to account for these factors.
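 The ratio-based adjustment described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: each day is represented by a single spatial correlation value, a day counts as heavy-rain-like when that value exceeds a fixed threshold, and exceedance frequencies are compared as fractions of days. The function names, the threshold of 0.4, and all data values are hypothetical.

```python
def exceedance_fraction(correlations, threshold):
    """Fraction of days whose pattern correlation exceeds the fixed threshold."""
    return sum(1 for r in correlations if r > threshold) / len(correlations)

def scaled_future_count(obs_count, corr_20c, corr_21c, threshold):
    """Scale the reanalysis-based count by the model's 21st/20th century ratio."""
    f20 = exceedance_fraction(corr_20c, threshold)
    f21 = exceedance_fraction(corr_21c, threshold)
    if f20 == 0.0:
        # No heavy-rain-like days in the model's present climate:
        # the ratio is undefined, mirroring the small-sample caveat.
        raise ValueError("no threshold exceedances in the 20th century sample")
    return obs_count * f21 / f20

# Hypothetical daily correlation values from one model.
corr_20c = [0.1, 0.5, 0.45, 0.2, 0.6, -0.1, 0.3, 0.7]   # 4 of 8 days exceed 0.4
corr_21c = [0.2, 0.1, 0.5, 0.3, -0.2, 0.45, 0.0, 0.1]   # 2 of 8 days exceed 0.4

projected = scaled_future_count(10, corr_20c, corr_21c, threshold=0.4)
```

 With these made-up values the modeled frequency halves, so a reanalysis-based count of 10 events is scaled to 5. The explicit error for an empty present-day sample reflects the point that the ratio is only meaningful when the model simulates enough heavy-rain-like days.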
 Our overall ability to obtain consistent projections from the CMIP climate models ultimately depends on the agreement in the large-scale circulation changes of the models. Here we presented only two CMIP3 model results. Forthcoming CMIP5 climate change scenarios will allow us to estimate the most likely scenario from a larger set of simulations.
5 Summary and Conclusions
 We have presented a statistical downscaling method that translates daily-mean synoptic circulation anomalies from NCEP/NCAR reanalysis data into frequency counts of heavy rain events at local rain gauge stations in Hawai‘i. The calibration and cross-validation showed that the statistical downscaling model can reproduce interannual variability in the November-April wet season heavy rain counts at most of the 12 analyzed stations. The well-known climate shift in the mid-1970s was associated with a notable downward trend in the number of heavy rain events in Hawai‘i. The statistical downscaling model was able to reproduce interannual variability driven by ENSO and PNA [Elison Timm et al., 2011], but the cross-validation had limited success in reproducing the mid-1970s shift toward fewer heavy rain events, likely because of poorer-quality reanalysis data prior to the satellite era. The statistical downscaling model was applied to estimate future changes in the number of heavy rain events during the wet season. The two models analyzed from the CMIP3 database show a reduced number of heavy-rain weather patterns in the emission scenario A1B for the mid- and late 21st century. This study is consistent with the previously identified trend toward drier wet-season conditions [Timm and Diaz, 2009; Elison Timm et al., 2011] and a likely reduction in heavy rain days.
 However, the results presented here are opposite to the conclusions of Norton et al.  who estimated an increase in the number of heavy rain events, but decreasing intensity for one particular region in Hawai‘i based on one particular global climate model. It is possible that nonlinear effects due to increased moisture content paired with the nonstationary shifts in the relationship between large-scale climate and rainfall could increase the intensity of heavy rain events [Min et al., 2011]. On regional levels, however, it is clear from other regional climate change studies in subtropical regions that a reduction in the formation of storms [Dowdy et al., 2012] could be an important dynamical constraint to the frequency of heavy rain events.
 A strong linkage between mean and heavy rain changes was recently shown to be a rather universal feature of precipitation around the globe [Haerter et al., 2010; Benestad et al., 2012], which could be a useful diagnostic tool for testing the consistency of past and future changes in downscaled rainfall changes, despite the intricate nature of extreme rain events [Vrac and Naveau, 2007; Maeda et al., 2012]. Future work with models from the CMIP5 database and a more detailed comparison of the different downscaling methods will help to assess hydroclimatic risks for Hawai‘i in a changing future climate. Given these results, we estimate that the anthropogenically induced risk for widespread increase in synoptic heavy rain events is low in the next few decades.
 We thank the three anonymous reviewers for their constructive comments on the statistical interpretation of the cross-validation results. O.E.T. is supported by the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) through its sponsorship of the International Pacific Research Center. H.F.D. received support from the NOAA Office of Climate Programs. The authors received support from the U.S. Dept. of Interior, the U.S. Fish and Wildlife Service through the Pacific Island Climate Change Cooperative (F10AC00077), the U.S. Army Corps of Engineers, Honolulu District, and the State of Hawai‘i Commission on Water Resource Management (W912HZ-11-2-0035). This is International Pacific Research Center contribution 961 and School of Ocean and Earth Science and Technology contribution 8895. The information and opinions represented in this article do not necessarily reflect the position or policy of the governmental institutions, and no official endorsement should be inferred.