A series of climate extreme events affected many parts of the US during 2011, including the severe drought in Texas, the spring tornado outbreak in the southern states, and the week-long summer heat wave in the Central Plains. Successful prediction of such events can better inform and prepare the general public to cope with these extremes. In this study, we investigate the operational capability of the new NCEP Climate Forecast System (CFSv2) in predicting the 2011 summer heat wave. We found that, starting from April 2011, the operational CFSv2 forecasts consistently suggested an elevated probability of extremely hot days during the forthcoming summer over the Central Plains, and as the summer approached the forecasts became more certain about the heat wave's geographic location, intensity, and timing. This study demonstrates the capability of the new seasonal forecast system and its potential usefulness in the decision-making process.
 A series of climate extreme events affected many parts of the US during 2011. From the beginning of 2011, a drought developed in the south, especially around the Texas–Louisiana border. By the end of March, the drought had become so severe that the U.S. Drought Monitor (http://droughtmonitor.unl.edu/) [Svoboda et al., 2002] placed much of Texas, Oklahoma, Louisiana and Arkansas in the D4 drought category (exceptional drought). This drought then persisted throughout the year and finally started to weaken slightly during the winter.
 In April 2011, an extremely large and violent tornado outbreak took place in the Southern, Midwestern, and Northeastern US. This super outbreak lasted four days (April 25 to 28, 2011) and is the largest tornado outbreak on record, with a total of 359 tornadoes confirmed by the National Weather Service in 21 states from Texas to New York. It caused more than 300 fatalities and is one of the costliest natural disasters in US history, with total damage of nearly $11 billion (NOAA National Climatic Data Center, State of the Climate: Tornadoes for April 2011, available at http://www.ncdc.noaa.gov/sotc/tornadoes/2011/4).
 Later, during the second half of July 2011, a wave of intense heat affected much of the central and eastern US, with many places seeing temperatures above 40°C. The most intense period of the heat wave was July 16 to 23 over the Central Plains. Figure 1 shows the observed air temperature anomaly during the entire summer (JJA), July, and the second half of July of 2011, respectively. The air temperature data are from the second phase of the North American Land Data Assimilation System (NLDAS-2) [Xia et al., 2012; Cosgrove et al., 2003; Mitchell et al., 2004], and are aggregated to 1° spatial and daily temporal resolution. The climatological mean for each of the three periods is calculated from the daily temperatures during the same period over the 33 years 1979–2011. It is evident that most of the central and eastern US experienced a warmer than normal summer, with anomalies between 1°C and 5°C. July, especially its second half, contributed most of the overall warm anomaly. During the peak of the heat wave in the second half of July 2011, the average daily temperature was over 7°C above normal in part of the Central Plains.
 Successful prediction of events like these can better inform and prepare the general public to cope with climate extremes. Although it is not possible to accurately forecast the timing and location of individual thunderstorms and tornadoes weeks or months in advance, it is possible to make skillful forecasts of temperature variations at subseasonal to seasonal time scales [Koster et al., 2010; Huang et al., 1996].
 The National Centers for Environmental Prediction (NCEP) is currently moving towards a climate services paradigm, which will rest heavily on forecasts from the Climate Forecast System (CFS) [Saha et al., 2006, 2010], a coupled atmosphere–land–ocean modeling system. The model became the operational system for seasonal forecasting at NCEP in August 2004. During the past few years a new version of the CFS was developed; this new version, referred to as CFSv2 [Saha et al., 2010], has more advanced physics packages and runs at a higher resolution. CFSv2 became the operational model for seasonal climate forecasting at NCEP on March 30, 2011. To support this upgrade, NCEP also completed the new Climate Forecast System Reanalysis (CFSR) and Reforecast (CFSRR) with the CFSv2 model [Saha et al., 2010].
 In this study, we investigate the operational capability of the CFSv2 model in predicting the 2011 summer heat wave using both the real-time CFSv2 data stream and the CFSRR hindcast dataset. The question we would like to address is whether this state-of-the-art seasonal forecast system provided any clear indication of the forthcoming summer heat wave in an operational setting. If the event was successfully predicted with sufficient lead time, it would demonstrate the usefulness of the new system and provide added confidence to end users incorporating climate forecast information into their decision-making processes.
2. Data and Method
 The CFSv2 real-time operation produces four 9-month forecast runs per day from the 00, 06, 12 and 18 UTC cycles of the CFS real-time data assimilation system. Additionally, there are three 1-season forecast runs at the 00 UTC cycle, and three 45-day forecast runs at each of the 06, 12 and 18 UTC cycles. This configuration thus produces a total of 16 forecast runs each day with various integration lengths, all at T126 resolution (∼0.938°) with 64 vertical layers in the atmosphere. In this study, we use all the real-time 9-month forecast runs from the 00 UTC cycle on March 30, 2011 to the 18 UTC cycle on July 15, 2011, a total of 424 runs.
 Besides evaluating the forecasts of seasonal and monthly mean temperatures (not shown), this study focuses on the prediction of the summer heat wave, i.e., extremely hot days. Because there is no universal requirement for the number of consecutive extremely hot days that defines a heat wave event, we only examine the capability of the model in predicting the number of hot days during the forecast period. For a given location, let Ti,j be the daily temperature on day i in year j, with i = [1,N] and j = [1,M]. Then we define

$$P_j = \frac{100\%}{N} \sum_{i=1}^{N} \mathbf{1}\left(T_{i,j} > T_{\mathrm{threshold}}\right),$$
where Tthreshold is the selected temperature threshold and N is the number of days during the period. To define extremes, we select Tthreshold as the 90th, 95th, or 99th percentile of the climatological distribution of Ti,j during the N-day period over the M years. We can calculate Pj using daily temperatures from both observations and forecast runs; comparing the two tells us whether the forecast runs provide any skillful and useful information on future temperature extremes.
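For illustration, the metric above can be computed as in the following minimal sketch. The data here are synthetic placeholders, not the actual NLDAS-2 or CFSv2 processing code; only the percentile-threshold logic reflects the definition in the text.

```python
import numpy as np

def hot_day_percentage(daily_temp, clim_temp, percentile=90.0):
    """P_j: percentage of days in `daily_temp` whose temperature exceeds
    the given percentile of the pooled climatological sample `clim_temp`."""
    threshold = np.percentile(clim_temp, percentile)
    return 100.0 * np.mean(np.asarray(daily_temp) > threshold)

# Synthetic illustration (hypothetical values, not NLDAS-2 data):
rng = np.random.default_rng(0)
clim = rng.normal(26.0, 3.0, size=31 * 33)   # e.g. July: 31 days x 33 years
july_2011 = rng.normal(31.0, 3.0, size=31)   # an anomalously hot July
p_j = hot_day_percentage(july_2011, clim, percentile=90.0)
```

By construction, applying the metric to the climatological sample itself returns roughly (100 − p)%, which is the climatological expectation discussed later.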
Figure 2 shows the observed Pj for the three periods (JJA, July, and the second half of July) of 2011 in combination with the three thresholds. To obtain the temperature thresholds, we use the daily temperature data from NLDAS-2 during the 33 years (1979–2011) to construct the climatological distribution. Thus the sample size of the distribution for July is 31 (days) × 33 (years) = 1023 at each grid point. With this simple metric, Figure 2 depicts the 2011 summer heat wave clearly in terms of its intensity, geographic location, and timing. Take the eastern Oklahoma region, for example. Figure 2a shows that on over 50% of the days in JJA 2011 the region had daily temperatures higher than the 90th percentile of the JJA daily temperature distribution. On over 90% of the days in July 2011, the same region had daily temperatures higher than the 90th percentile of the July temperature distribution (Figure 2d). During the same period, about 50% of the days had daily temperatures higher than the 95th percentile of the distribution (Figure 2e), and about 5% of the days had daily temperatures higher than the 99th percentile (Figure 2f).
 The same calculation can be done for each of the 424 9-month forecast runs from CFSv2 using its daily temperature forecasts. It is possible to use the same thresholds defined by the NLDAS-2 data, but then we would first need to correct the bias in the CFSv2 forecasts, because global climate models like CFSv2 tend to have their own climatology that often differs from the observed climatology. An alternative approach is to define the thresholds with respect to the model climatology using the CFSRR hindcast runs, thus avoiding the bias correction. CFSRR includes four 9-month hindcast runs, four 1-season hindcast runs, and twelve 45-day hindcast runs every five days. These twenty hindcast runs start from the twenty CFSR reanalysis cycles during the five-day segment, with the four 9-month hindcast runs on the first day. The 9-month hindcast runs were produced every five days over the 29-year period from 1982 to 2010, and the shorter hindcast runs were produced only for 1999 to 2010. In this study, to form the model climatological distribution for a real-time forecast run from a given cycle, we use the four 9-month hindcast runs that start from the nearest cycles in each of the 28 years 1982–2009. (At the time of this study, the CFSRR hindcast runs for 2010 had not been made available.) For example, for the forecast run starting from the 06 UTC cycle on April 30, 2011, hindcast runs from the 00, 06, 12 and 18 UTC cycles on May 1 of 1982 to 2009 are used. With the model climatological distribution, the thresholds are still the 90th, 95th and 99th percentiles of the distribution, but the corresponding temperatures will likely differ from those estimated using the NLDAS-2 data.
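The cycle-matching rule described above can be sketched as follows. This is a simplified illustration that assumes the 9-month hindcast segments start every five days counted from January 1; the actual CFSRR calendar may differ, and the function names are ours, not part of any CFSRR tooling.

```python
from datetime import date, timedelta

HINDCAST_YEARS = range(1982, 2010)   # 1982-2009; 2010 hindcasts not yet available
CYCLES_UTC = (0, 6, 12, 18)          # the four 9-month runs on a segment's first day

def nearest_segment_start(target, step=5):
    """Hindcast segment start date nearest to `target`, assuming
    (as a simplification) that segments begin every `step` days from Jan 1."""
    jan1 = date(target.year, 1, 1)
    k = round((target - jan1).days / step)
    return jan1 + timedelta(days=k * step)

def climatology_cycles(target):
    """(year, month, day, hour) of the hindcast runs pooled to form the
    model climatological distribution for one real-time forecast run."""
    start = nearest_segment_start(target)
    return [(yr, start.month, start.day, hh)
            for yr in HINDCAST_YEARS for hh in CYCLES_UTC]

# The paper's example: a run initialized April 30, 2011 draws on the
# May 1 hindcast cycles of 1982-2009 (28 years x 4 cycles = 112 runs).
cycles = climatology_cycles(date(2011, 4, 30))
```

Under this simplified calendar, the April 30 example maps to May 1 hindcast starts, matching the pairing given in the text.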
3. Results
 Figure 3 shows the predicted Pj in the forecast run starting from the 06 UTC cycle on April 30, 2011. This run is selected from the 424 forecast runs as an example of how the predicted pattern compares to the observed pattern shown in Figure 2. In this case, there is considerable similarity between the predicted (Figure 3) and the observed (Figure 2) patterns, especially over Texas. This particular forecast run suggested a heat wave over Texas and the southern tier states during the second half of July 2011, because over 85% of the 16 days would have daily temperatures higher than the 90th percentile threshold (Figure 3d) and over 65% of the days would have daily temperatures higher than the 95th percentile threshold (Figure 3e). However, this forecast run also suggested warmer than normal conditions over the Pacific Northwest (Figures 3d, 3e, 3g, and 3h), which is the complete opposite of the observations (Figure 1).
 Obviously, forecast runs from different initial times look different, so it is necessary to determine whether they provide consistent predictions. We calculate Pj in all 424 forecast runs, then spatially average Pj within the Central Plains region indicated by the box in Figure 1. These averaged Pj are plotted against the initial time of forecast in Figure 4. Each plot is for one of the three periods, and within each plot the three colors correspond to the three thresholds. Within each group of lines of the same color, the solid line represents the forecast, the dotted line represents the observation from NLDAS-2, and the long-dashed line represents the expectation of a climatological forecast. By definition, a climatological forecast has no tendency towards either warm or cold, so the expected percentages of days with temperature higher than the 90th, 95th and 99th percentiles are 10%, 5% and 1%, respectively. If the forecast Pj is higher than the climatological expectation, the model suggests an increased chance of a warmer season with more warm days. If the predicted percentage is much higher than the climatological expectation, especially for the 95th and 99th percentile thresholds, a heat wave is predicted to occur. A perfect forecast would yield predicted Pj equal to the observed Pj (dotted lines).
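The climatological baseline quoted above (10%, 5%, and 1%) can be checked with a quick Monte Carlo sketch: when forecast days are drawn from the same distribution as the climatology, the expected exceedance of the p-th percentile is (100 − p)%. The numbers below are synthetic, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
clim = rng.normal(size=31 * 33)          # pooled climatological sample (arbitrary units)
thresholds = {p: np.percentile(clim, p) for p in (90, 95, 99)}

# Draw many 16-day "forecast" periods with no warm or cold tendency and
# measure how often their days exceed each threshold.
draws = rng.normal(size=(20_000, 16))
exceed = {p: 100.0 * np.mean(draws > t) for p, t in thresholds.items()}
# exceed[p] is close to (100 - p)% up to sampling noise in the thresholds
```

Any forecast Pj persistently above these baseline values therefore signals a warm tendency rather than sampling noise.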
Figure 4 reveals the following features of the real-time CFSv2 forecast runs in predicting the summer heat wave over the Central Plains.
 1) Almost all of the 424 forecast runs showed an increased percentage of days with temperatures higher than the thresholds for all three periods (solid lines vs. long-dashed lines), which means that the model consistently suggested a warmer than normal summer with many hot days over the region. In fact, since the beginning of April, the CFSv2 forecasts indicated nearly double the climatological percentage of extremely hot days (T > T99th) during July and the second half of July.
 2) Forecast runs made around April 26 and May 16, 2011, tended to show a normal or even slightly cooler summer. Since these forecasts verify with less skill against what actually happened later during the summer, this suggests that model forecast skill for a given extreme event does not necessarily increase monotonically as the initial time of forecast approaches the time of the event and the lead time shortens. These fluctuations can be a normal expression of the natural variability of the climate system. The fact that these less skillful forecast runs are clustered within those two short time periods leads us to speculate that certain large-scale atmospheric circulation patterns in the initial conditions might have significantly affected the evolution of the atmosphere, causing it to take a quite different path in the following months.
 3) The predicted percentage increased more quickly as the initial time of forecast approached the target period. For JJA (Figure 4a), the predicted percentage started to increase from the middle of May and continued rising over the next two months. For July (Figure 4b) and the second half of July (Figure 4c), the dramatic increase in the predicted percentage started around the end of June. This also suggests that climate anomalies at larger temporal scales are more predictable at longer lead times.
 4) Although the CFSv2 predicted percentage values are higher than the climatological expectations, they are still much lower than the observed percentage values most of the time. Only when the lead time becomes very short do the predicted percentage values start to increase more quickly, approaching the observed values.
 5) Forecast runs starting from the second week of July over-predicted the number of extremely hot days (T > T99th) for July and the second half of July in the region. But at such short lead times, the forecast is better regarded as a medium-range weather forecast rather than a climate forecast.
 It is also worth mentioning that the same analysis was done for other regions, such as the Pacific Northwest, where the predicted percentages are generally lower than the climatological expectation (not shown). This agrees well with the cooler than normal summer there (Figure 1). So it is evident that the real-time CFSv2 forecasts are not uniformly biased towards more warm days during this summer.
4. Summary and Discussion
 In this study, we evaluate the ability of the new NCEP Climate Forecast System (CFSv2) to predict the heat wave during summer 2011. Besides evaluating the forecasts of seasonal and monthly mean temperatures (not shown in this letter), we focus our study on the prediction of extremely hot days, defined as days with temperature higher than the 90th, 95th, and 99th percentiles of the climatological distributions. Our analysis shows that CFSv2 did give clear indications of a warmer than normal summer over the Central Plains several months in advance. Since the introduction of CFSv2 into operational forecasting in late March 2011, the vast majority of the forecast runs predicted that the number of days with temperatures higher than the 90th, 95th, and 99th percentile thresholds would be double the climatological expectation. As the initial time of forecast approached the summer, the forecasts became more certain about the summer heat wave's intensity, geographic location, and timing.
 The results not only demonstrate the ability of the CFSv2 model to capture this extreme event, but also demonstrate that climate extremes like the summer heat wave have predictability at the seasonal time scale. If the model can predict the warming almost four months in advance, the climate system must have provided the predictability in some way. The year 2011 was a La Niña year. During La Niña years, the southern part of the US tends to have warmer winters, and the Pacific Northwest tends to have cooler and wetter winters. However, it is unclear how much the La Niña condition contributed to the predictability of the summer heat wave in this case; a more detailed modeling study is necessary.
 Previous studies have associated seasonal predictability and prediction of temperature and heat waves with land–atmosphere interaction. Modeling experiments by Fischer et al. [2007a, 2007b] revealed that land–atmosphere coupling plays an important role in the evolution of heat waves in Europe. Koster et al. [2010] found in the second phase of the Global Land–Atmosphere Coupling Experiment (GLACE-2) that seasonal forecast models tend to be more skillful when forecasts start from strong soil moisture anomalies, which reflects a contribution of the initial conditions to prediction. The current study focuses on documenting the capability of the operational CFSv2 in successfully predicting this extreme event; determining the exact source of predictability and how initial soil moisture conditions and land–atmosphere interaction affect the prediction is beyond the scope of this letter. But it is our interest to investigate, in a follow-up study, whether and how the severe drought in Texas contributed to the predictability and successful prediction of the summer heat wave through both local and remote effects of land–atmosphere coupling as well as soil moisture initialization.
 We note that the evaluation presented here is conditioned on the fact that the summer heat wave took place. Such a conditional evaluation answers the question: how often does the model get it right when a climate extreme like a heat wave happens? Evaluations like this are necessary and useful for model improvement. But forecasters are also interested in the unconditional performance of the model, i.e., how often is the model right when it predicts a climate extreme? This requires a more comprehensive evaluation of many forecast runs, and the CFSRR hindcast dataset provides the opportunity to do so. Such work is currently underway by the authors and collaborators.
 This research is supported by grant NA10OAR4310246 from the NOAA Climate Program Office and Michigan State University.
 The Editor thanks the two anonymous reviewers for their assistance in evaluating this paper.