This study uses the newly developed Climate extension of the Weather Research and Forecasting (CWRF) model nested in the National Centers for Environmental Prediction (NCEP) operational Climate Forecast System (CFS) to improve interannual prediction of cold season precipitation over the United States. An ensemble of 5 retrospective forecasts for 27 cold seasons (December–April) during 1982–2008 has been conducted to assess the predictive skill. The CWRF downscaling reduces CFS forecast errors of seasonal mean precipitation by 22% on average, increases the equitable threat score by 0.08–0.15, and produces greater skill for heavy rainfall events. The CWRF simulates a more accurate number of rainy days than the CFS over the northern and western U.S. due to the refined representation of orographic effects, shallow convection, and terrestrial hydrology. The CWRF also more realistically captures the broad region of extreme rainfall over the Gulf States and maximum dry spell length along the Great Plains, as well as their contrasts between El Niño and La Niña events. The results demonstrate the significant advantage of the CWRF downscaling for regional precipitation prediction, especially during years with weak planetary anomalies.
1. Introduction
 It has been well established that regional climate model (RCM) downscaling improves precipitation simulation from its driving global reanalysis or general circulation model (GCM) due to refinements in spatial resolution and physics representation [Giorgi et al., 2001; Roads et al., 2003; Liang et al., 2004a, 2004b, 2006, 2008; Diffenbaugh et al., 2005; Zhu and Liang, 2007; Yuan et al., 2008; Wang et al., 2009]. As the downscaling better resolves regional processes in the terrestrial hydrosphere (soil moisture, snow, aquifer) and coastal oceans that contain additional memory, the RCM is anticipated to produce higher predictive skill than the GCM on seasonal-interannual precipitation variations. Such skill enhancement can only be realized by a nesting system where the GCM correctly captures the large-scale forcing signals (e.g., El Niño-Southern Oscillation (ENSO)) while the RCM accurately represents the regional-local climate responses. This is challenging in both summer when the GCM climate predictability is low and winter when the RCM regional advantage diminishes due to the large-scale forcing dominance, as well as in spring and fall when key processes are in transition.
 Thus, few studies have actually achieved the RCM skill enhancement over the GCM in predicting U.S. seasonal precipitation. Fennessy and Shukla reported some success in hindcasting 15 winter and summer North American climate anomalies, in which the RCM downscaling reduced the driving GCM's systematic biases and produced better contrast between flood and drought years. However, these were not strictly seasonal predictions since sea surface temperature (SST) distributions were prescribed from observations. The first downscaling prediction, using an RCM embedded in a coupled atmosphere-ocean GCM, was made by Cocke and LaRow over the southeastern and western U.S. for the winters of 1987 and 1988, when significant El Niño events occurred. They found that the RCM downscaled precipitation was consistent with the GCM results, but with more realistic details around coastlines and mountains. Later, Cocke et al. conducted 1986–1997 winter predictions over the southeast U.S. and showed that the RCM downscaled precipitation anomaly pattern was somewhat better than that of the GCM.
 The present study uses the state-of-the-art CWRF (X.-Z. Liang et al., Regional Climate-Weather Research and Forecasting Model (CWRF), submitted to Bulletin of the American Meteorological Society, 2010) nested in the operational CFS [Saha et al., 2006] to improve interannual precipitation prediction of cold seasons (December–April) during 1982–2008 over the entire conterminous U.S. and adjacent Canadian and Mexican regions. This differs from previous studies not only by its use of a much longer forecast period and broader geographic domain to identify statistically robust signals, but also by its emphasis on the predictive skill of daily precipitation characteristics (rainy frequency, extreme events) that are more difficult and yet critical for many impact applications. We will demonstrate the clear advantage of the CWRF downscaling over use of the CFS alone in predicting U.S. precipitation seasonal distributions and daily characteristics.
2. Prediction System, Experimental Design, and Verification Data
 The CWRF has been developed as the Climate extension of the WRF version 3.1.1 by incorporating numerous improvements that are crucial to climate prediction, including interactions between land–atmosphere–ocean, convection–microphysics and cloud–aerosol–radiation, and system consistency throughout all process modules (X.-Z. Liang et al., submitted manuscript, 2010). An essential component is the state-of-the-art Conjunctive Surface-Subsurface Process model (CSSP) that exhibits significant advances over major land surface models in predicting soil temperature/moisture distributions, terrestrial hydrology variations, and land-atmosphere exchanges [Yuan and Liang, 2011].
 The CWRF downscaling requires inputs of initial, surface and lateral boundary conditions. They are provided by retrospective seasonal forecasts using the CFS, a fully coupled atmosphere-ocean GCM that has been producing operational climate predictions at the NCEP since August 2004. The CFS depicts important advances from previous dynamical forecast practices with a demonstrated skill comparable to statistical methods [Saha et al., 2006]. The CFS version used in this study consists of the NCEP Global Forecast System at T62L64 (∼1.875°) resolution and Geophysical Fluid Dynamics Laboratory Modular Ocean Model version 3.0 at 1/3–1° grid spacing.
 The downscaling experiment includes an ensemble of 5 retrospective forecasts of the cold seasons during 1982–2008. Each ensemble member consists of an integration of about five months that extends to April 30, with initial dates on the consecutive days November 29–December 3. The 3-month mean results for December-January-February (DJF), January-February-March (JFM), and February-March-April (FMA) are evaluated below and referred to as seasonal forecasts at lead times of 0, 1, and 2 months, respectively. The lateral boundary conditions for the CWRF are updated every 3 hours from the CFS forecasts. The study domain, centered at (37.5°N, 95.5°W), covers the conterminous U.S. and adjacent Canadian and Mexican regions with 30-km grid spacing, which facilitates skillful regional precipitation downscaling [Liang et al., 2004b, 2006, also submitted manuscript, 2010]. There are 36 vertical levels, with the model top at 50 hPa.
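 The ensemble layout described above can be sketched in a few lines of Python. This is only an illustration of the date arithmetic, not the job scripts used in the study; the function names and the `SEASON_LEADS` table are hypothetical, and counting the end date (April 30) as part of the integration is an assumption.

```python
from datetime import date, timedelta

def ensemble_start_dates(year, n_members=5):
    """Consecutive daily initial dates beginning November 29 of `year`."""
    first = date(year, 11, 29)
    return [first + timedelta(days=k) for k in range(n_members)]

# Three-month verification windows and their forecast lead times (months)
SEASON_LEADS = {"DJF": 0, "JFM": 1, "FMA": 2}

def integration_length_days(start):
    """Days from the initial date through April 30 of the next year, inclusive."""
    end = date(start.year + 1, 4, 30)
    return (end - start).days + 1

starts = ensemble_start_dates(1982)
for s in starts:
    print(s.isoformat(), integration_length_days(s), "days")
```

 For the 1982 cold season this yields five members starting November 29 through December 3, each running roughly five months.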
 For prediction verification, daily precipitation data are a composite of two sources (X.-Z. Liang et al., submitted manuscript, 2010). The primary one is constructed from daily measurements at the 7235 cooperative stations over the U.S. and is adjusted to correct the terrain-slope effect [Daly et al., 2008]. The other source is based on bilinear interpolation from the National Oceanic and Atmospheric Administration Climate Prediction Center (CPC) global 0.5° analysis of daily gauge measurements [Chen et al., 2008], which supplements the data over Canada and Mexico. The predictive skill is also distinguished between ENSO phases, where warm or cold episodes are identified when the Niño3.4 (5°N–5°S, 120°–170°W) SST 3-month running mean anomalies are above +0.5°C or below −0.5°C. Based on the CPC Niño3.4 data, there are 6 significant El Niño events (1982, 1986, 1991, 1994, 1997, 2002), 6 significant La Niña events (1984, 1988, 1995, 1998, 1999, 2007), and 14 relatively normal years during the study period of 1982–2008.
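 The ENSO phase criterion above can be illustrated with a minimal Python sketch. The anomaly values are made up for illustration and are not CPC data; the function names are hypothetical, and any persistence requirement (how many consecutive months must exceed the threshold) is not specified in the text and is not modeled here.

```python
def running_mean_3(monthly_anomalies):
    """Centered 3-month running mean; the first and last months are dropped."""
    a = monthly_anomalies
    return [(a[i - 1] + a[i] + a[i + 1]) / 3.0 for i in range(1, len(a) - 1)]

def classify_enso(anomaly_c, threshold=0.5):
    """Label a smoothed Nino3.4 SST anomaly (deg C) as warm, cold, or neutral."""
    if anomaly_c > threshold:
        return "warm"
    if anomaly_c < -threshold:
        return "cold"
    return "neutral"

# Made-up monthly Nino3.4 anomalies (deg C), November through March
sample = [1.2, 1.5, 1.4, 1.0, 0.6]
smoothed = running_mean_3(sample)          # centered on Dec, Jan, Feb
phases = [classify_enso(x) for x in smoothed]
print(phases)
```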
3. Predictive Skill Enhancement by the CWRF Downscaling
Figure 1a compares frequency distributions of root mean square errors (RMSE) of seasonal mean precipitation interannual variations predicted by the CFS and downscaled by the CWRF. The statistics are based on all land grids over the entire inner domain (excluding the buffer zones) for DJF, JFM, FMA, and DJFMA from the 5 ensemble realizations during 1982–2008. As a general rule, the peak frequency occurring more to the left of the horizontal axis indicates that the respective simulation has more grids of smaller RMSE. Although there is some spread among the different realizations, all CWRF results consistently reduce CFS forecast errors. The reduction is obvious at all forecast lead times, with the RMSE peak decreased by about 0.5 mm/day. Most of the error reductions occur over the northern and southwestern U.S. and the adjacent Canada and Mexico, where the CFS has large wet biases (see also Figure 2a). On average, the CWRF downscaling reduces the DJFMA mean CFS errors by 22%. The error growth resulting from the increase in forecast lead time remains similar for the CFS and CWRF, where the RMSE peaks respectively at 1 and 0.5 mm/day in DJF and at 1.5 and 1 mm/day in FMA. This feature is accompanied by the increase in the RMSE spread among the ensemble members, suggesting that the climate system becomes more chaotic. Note that the spread is smaller in the CWRF than CFS, especially for the range of large errors.
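 As a rough illustration of the error measure used here, the following Python sketch computes the RMSE of a forecast series against observations at a single grid point and the percentage reduction achieved by a downscaled forecast. The numbers are invented, the function names are hypothetical, and the exact preprocessing used in the study (for example, whether climatological means are removed before computing the RMSE) is not stated in this section and is not assumed.

```python
import math

def rmse(forecast, observed):
    """Root mean square error between two equal-length series."""
    n = len(forecast)
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / n)

def percent_reduction(rmse_driver, rmse_downscaled):
    """Relative error reduction of the downscaled forecast, in percent."""
    return 100.0 * (rmse_driver - rmse_downscaled) / rmse_driver

# Invented seasonal mean precipitation (mm/day) at one grid point, three years
obs = [2.0, 3.0, 1.0]
gcm = [3.0, 4.0, 2.0]
rcm = [2.5, 3.5, 1.5]

e_gcm, e_rcm = rmse(gcm, obs), rmse(rcm, obs)
print(e_gcm, e_rcm, percent_reduction(e_gcm, e_rcm))
```

 Repeating this at every land grid point and binning the resulting values would give frequency distributions of the kind compared in Figure 1a.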
Figure 1b illustrates the CWRF minus CFS differences in equitable threat scores (ETS) of the seasonal mean precipitation forecasts. The ETS, defined as the ratio of (hits minus hits expected by chance) over (hits plus false alarms plus misses minus hits expected by chance), is a standard measure to assess the skill dependence on rainfall intensity [Mesinger and Black, 1992]. On average for DJFMA, ETS values for low (<3 mm/day), medium (3–6 mm/day) and high (>6 mm/day) precipitation thresholds are respectively 0.16, 0.17 and 0.08 for the CFS, versus 0.31, 0.25 and 0.22 for the CWRF. Thus the CFS forecast skill decreases rapidly for heavy rainfall events, while the CWRF maintains a good level across the range. The CWRF skill is overall superior to the CFS, with notably higher ETS values, especially at the low and high rainfall ranks. At lead times of (0, 1, 2) months, the CWRF downscaling enhances the ETS from the CFS forecast by (0.18, 0.16, 0.12) for the low rank, by about 0.1 for the medium range, and by (0.1, 0.15, 0.13) for the high end. One possible reason for the low enhancement at the medium range is that the global model has more skill in predicting synoptic-scale precipitation than light or extreme events.
 Note that the CWRF minus CFS ETS differences, depicting the skill enhancements by the downscaling, are larger in ENSO-neutral years than in strongly anomalous years. For instance, smaller enhancements are identified in years with La Niña (1984, 1988) and El Niño (1986, 1991, 2002). During these abnormal years, significant ENSO signals were present in the planetary circulation, and thereby the CFS has higher seasonal climate predictability, especially in wintertime when global anomalies are more intense. As a result, the advantage of the CWRF downscaling over the CFS forecast is relatively weaker than in ENSO-neutral years. A similar study keyed to warm season precipitation prediction will be valuable to determine how much the downscaling can enhance the skill, especially when the planetary forcing is weak.
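 The ETS definition quoted above can be written out as a short Python sketch. The text does not spell out how "hits expected by chance" is computed; the sketch uses the standard form, (hits + false alarms) × (hits + misses) / total. The function name, the half-open interval convention for the rainfall category, and the sample values are illustrative assumptions rather than the verification code used in the study.

```python
def equitable_threat_score(forecast, observed,
                           lower=float("-inf"), upper=float("inf")):
    """ETS for the event `lower <= value < upper` (boundary choice is assumed)."""
    hits = false_alarms = misses = 0
    for f, o in zip(forecast, observed):
        f_yes = lower <= f < upper
        o_yes = lower <= o < upper
        if f_yes and o_yes:
            hits += 1
        elif f_yes:
            false_alarms += 1
        elif o_yes:
            misses += 1
    total = len(forecast)
    # Standard estimate of hits expected from a random forecast
    hits_random = (hits + false_alarms) * (hits + misses) / total
    denom = hits + false_alarms + misses - hits_random
    return (hits - hits_random) / denom if denom else 0.0

# Invented seasonal mean precipitation (mm/day) at six grid points
fcst = [1.0, 4.0, 7.0, 2.0, 8.0, 5.0]
obs = [2.0, 7.0, 8.0, 1.0, 4.0, 5.0]
ets_high = equitable_threat_score(fcst, obs, lower=6.0)  # heavy category
print(ets_high)
```

 A perfect forecast scores 1, while a forecast no better than chance scores 0, which is why the score is useful for comparing skill across rainfall categories with very different base rates.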
Figure 2 compares the number of rainy days (>1 mm/day, Figure 2a), maximum dry spell length (between consecutive rainy days, Figure 2b), and daily rainfall 95th percentile (Figure 2c) for JFM averaged during 1983–2008, predicted by the CFS and downscaled by the CWRF against observations. The CFS greatly overpredicts the number of rainy days in the northern and southwestern regions of the domain, while the CWRF downscaling produces more realistic amounts and sufficient geographic details (Figure 2a). Over the North Rockies, the CFS generates excessive orographic rainfall due to its low resolution, whereas the CWRF gives a much more reasonable distribution. Over the central Great Plains with flat terrain, shallow convection is overactive in the CFS [Higgins et al., 2008] but more appropriately resolved in the CWRF using an advanced scheme [Park and Bretherton, 2009]. Over the Great Lakes region, the CFS predicts warmer skin temperature that may cause heavier precipitation than observed (H. Juang, personal communication, 2010). In contrast, the CWRF couples an 11-layer lake model, a 5-layer snow model and a subsurface frozen soil parameterization to better predict skin temperature and terrestrial hydrology [Yuan and Liang, 2011], and consequently demonstrates substantial improvement over the CFS. Over the Southwest with complex terrain, resolution increase alone does not overcome the CFS deficiency [Yang et al., 2009], whereas the CWRF correction of the wet biases may be attributed to its advanced CSSP representation of the terrestrial hydrology, including the topography-induced lateral and vertical subgrid moisture fluxes and the thin soil column of water movements constrained by shallow bedrock depths [Yuan and Liang, 2011]. The area-averaged numbers of rainy days from observations, the CFS and the CWRF are 20, 38 and 25, respectively.
 In observations during JFM, the Great Plains, U.S. Southwest and Mexico experience the largest number of dry spells, with the maximum length exceeding 30 days (Figure 2b). The CFS greatly underpredicts the maximum dry spell length over most of these regions. The CWRF downscaling substantially reduces the CFS errors and well reproduces the observed dry spell pattern across the Great Plains. The observed dry spell length is generally short in the eastern U.S., with the maximum below 15 days. The CFS reasonably captures that, although the CWRF improvement is obvious in the Great Lakes region. The observed daily rainfall 95th percentile (Figure 2c) exhibits broad peaks over the Gulf States. The CFS roughly simulates this extreme precipitation pattern but systematically underestimates the magnitude. In contrast, the CWRF produces more accurate intensity, especially over the Gulf States. Therefore, the CWRF downscaling demonstrates clear advantages over the CFS seasonal-interannual forecasts not only in predicting the frequency of daily rainfall occurrence, but also in capturing the extreme events such as heavy rains or dry spells.
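 The three daily statistics discussed above can be sketched in Python as follows. The rainy-day threshold (>1 mm/day) comes from the text; everything else is an assumption made for illustration: the function names are hypothetical, the percentile is taken over all days using linear interpolation, and dry runs at the start or end of the series are counted as dry spells even though they are not bounded by rainy days on both sides.

```python
def rainy_day_count(daily_mm, wet_threshold=1.0):
    """Number of days with precipitation above the wet-day threshold."""
    return sum(1 for p in daily_mm if p > wet_threshold)

def max_dry_spell(daily_mm, wet_threshold=1.0):
    """Longest run of consecutive days at or below the wet-day threshold."""
    longest = current = 0
    for p in daily_mm:
        if p > wet_threshold:
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest

def percentile_95(daily_mm):
    """95th percentile with linear interpolation between order statistics."""
    values = sorted(daily_mm)
    pos = 0.95 * (len(values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(values) - 1)
    return values[lo] + (pos - lo) * (values[hi] - values[lo])

# Invented 10-day precipitation series (mm/day)
daily = [0.0, 2.5, 0.2, 0.0, 0.8, 5.0, 0.0, 12.0, 0.0, 0.0]
print(rainy_day_count(daily), max_dry_spell(daily), percentile_95(daily))
```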
 The predictable signals in U.S. precipitation interannual variations have been identified as regional responses to the ENSO forcing [e.g., Leung et al., 2003; Higgins et al., 2008]. Figure 2d compares the averaged differences in the number of rainy days for JFM between the El Niño (warm) and La Niña (cold) events during 1983–2008 (see Section 2 for the list of years) as predicted by the CFS and downscaled by the CWRF. Observations showed more rainy days in El Niño than La Niña winters over Mexico, the U.S. Southwest, Gulf coast, and Great Plains regions, and the opposite over the Northwest and Northeast. The CFS systematically exaggerates such ENSO contrasts, where the differences in rainy days of both signs are overwhelmingly larger than observations. The CWRF downscaling corrects the CFS errors to some extent and produces an overall more realistic geographic distribution. Similar CWRF improvements over the CFS are also seen in the ENSO differences of the maximum dry spell length and daily precipitation 95th percentile (not shown).
4. Conclusion and Discussion
 The nested CWRF-CFS system is evaluated on its predictive skill for U.S. cold-season precipitation variations during 1982–2008 using an ensemble of 5 retrospective forecasts that extend to April 30 from initial dates on the consecutive days November 29–December 3. It is demonstrated that the CWRF downscaling has substantially higher skill than the driving CFS in predicting geographic distributions of precipitation seasonal mean variations and daily statistical characteristics (rainy days, dry spells, extreme events) as well as their responses to the ENSO forcing. Such skill enhancements, over various regions, result from the CWRF refined representation of regionally-based physical processes, including orographic effects, shallow convection, and terrestrial hydrology.
 This study presents a very promising case for using advanced RCMs nested with coupled GCMs to significantly enhance the predictive skill for precipitation seasonal-interannual variations at regional and local scales. There exist a number of approaches that can further improve such downscaling skill. An obvious one is to refine the initial conditions to more realistically preserve the essential memory stored in land (soil, lake, snow) and coastal oceans. For example, Koster et al. found that realistic land surface initialization improves GCMs' summer rainfall forecast skill, with significant regional influences out to 45 days, especially when initial soil moisture anomalies are strong. The CWRF incorporates the CSSP, a state-of-the-art land surface model that can readily assimilate recently available high-quality satellite data such as snow cover and terrestrial water storage to improve the initial state memory, and thereby further extend seasonal climate predictability.
 Another appealing approach is to construct an ensemble forecast, which can be based on multiple initial conditions, multiple physics configurations of a single model [Liang et al., 2007] or multiple models. The present study uses an ensemble of 5 initial conditions, where the spread among the forecasts is not as large as one would expect (Figure 1a). On the other hand, Liang et al. (submitted manuscript, 2010) formed an ensemble of 26 physics configurations from alternative schemes for radiation, land surface, planetary boundary layer, cumulus and microphysics processes in the CWRF. They found that, for downscaling U.S. precipitation variations in 1993, the equally weighted ensemble mean dramatically outperformed all members in summer, spring and autumn, although it only marginally improved the performance in winter when differences among individual members are small. There exists, however, substantial room to further enhance the predictive skill through intelligent optimization of the ensemble that incorporates varying weights to account for regional skill dependences of individual members of either different physics configurations or multiple models. These issues will be the focus of our future investigations.
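 The difference between an equally weighted and a variably weighted ensemble mean can be illustrated with a minimal Python sketch. The function name, the member values, and the weights are all hypothetical; how weights should be chosen (for example, from regional skill) is left open in the text and is not addressed here.

```python
def ensemble_mean(members, weights=None):
    """Weighted mean across members at each grid point; equal weights by default."""
    if weights is None:
        weights = [1.0] * len(members)
    total = sum(weights)
    n_points = len(members[0])
    return [
        sum(w * m[i] for w, m in zip(weights, members)) / total
        for i in range(n_points)
    ]

# Two invented members, each with precipitation (mm/day) at two grid points
members = [[1.0, 2.0], [3.0, 4.0]]
equal = ensemble_mean(members)                        # equal weights
weighted = ensemble_mean(members, weights=[3.0, 1.0])  # favors the first member
print(equal, weighted)
```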
 We thank Henry Juang for providing the CFS seasonal prediction output, Tiejun Ling for helping develop job scripts, and Fengxue Qiao for compiling precipitation observational data. We appreciate constructive comments of James Angel and Nancy Westcott. We acknowledge NOAA/ESRL and UIUC/NCSA for the supercomputing support. The research is supported by the NOAA Climate Prediction Program for the Americas (CPPA) NA08OAR4310575 and NA08OAR4310875, and NASA NNX08AL94G. The views expressed are those of the authors and do not necessarily reflect those of the sponsoring agencies.