The Community Atmospheric Model 2 (CAM2) is run as a short-term (1–5 days) forecast model initialized with reanalysis data. The intent is to reveal model deficiencies before complex interactions obscure the root error sources. Integrations are carried out for three Atmospheric Radiation Measurement (ARM) Program intensive operational periods (IOPs): June/July 1997, April 1997, and March 2000. The ARM data are used to validate the model in detail for the Southern Great Plains (SGP) site for all the periods and in the tropical west Pacific for the March 2000 period. The model errors establish themselves quickly, and within 3 days the model has evolved into a state distinctly different from the ARM observations. The summer forecasts evince a systematic error in convective rainfall. This error manifests itself in the temperature and moisture profiles after a single diurnal cycle. The same error characteristics are seen in the March 2000 tropical west Pacific forecasts. The model performs well in the spring cases at the SGP. Most of the error is manifested during rainy periods. The ARM cloud radar comparison to the model reveals cloud errors which are consistent with the relative humidity profile errors. The cloud errors are similar to those seen in climatological integrations, but the state variable errors are different. Thus there is the possibility that some basic parameterization errors are obscured in the climatological integrations. The approach described here will facilitate parameterization experimentation, diagnoses, and validation. One way of reducing cloud feedback uncertainty is to make the physical processes behave in the most realistic manner possible. Paradoxically, perhaps the best way to reduce uncertainty in cloud feedback mechanisms is to evaluate the model processes with realistic forcing before such feedbacks have any significant effect.
 In the course of their normal operations, numerical weather prediction (NWP) centers compare their forecasts to the observational data on a daily basis. The model developers at the centers have made good use of the information obtained by comparing model output with observations. Over the years this activity has contributed to a substantial increase in the skill of present-day NWP forecasts. Climate models, on the other hand, are commonly validated against various statistics based on the observations. These comparisons have enabled the models to produce quite creditable simulations of the present-day statistics of some aspects of the mean atmosphere (climate). The complication in diagnosing climate simulations is that the models can achieve a reasonable mean state as the result of compensating errors, which are impossible to identify individually after months or years of integration. Such errors could compromise the validity of simulations of climate change scenarios in which the forcings are distinct from present-day conditions and thus might alter the mechanism of compensation.
 Merging of these validation exercises by running a climate model in NWP forecast mode could be expected to have some benefit. Confronted with actual weather events, the climate model will reveal possible shortcomings in the simulation of some physical processes. Causes of these misrepresentations of the weather would be difficult to diagnose in a climate statistical evaluation, but could be made manifest in a NWP ensemble forecast statistical setting. Thus NWP offers an avenue for possible improvement in the climate model's simulations of physical processes. NWP is not a panacea for all the ills in climate simulation, however. There are modes of interaction of the climate system which will only be realized after months or years and in some instances decades of integration, and the short-term NWP forecasts cannot address these very important climate components. However, it is not unreasonable to speculate that the accurate depiction of the diurnal cycle and daily weather might contribute to the proper simulation of longer-term climate modes. If, for example, the cloud formation in a GCM does not have a sound physical basis, then subsequent studies of cloud feedback in that GCM's climate simulation would not address the root of the observed problems and might actually be misleading. Cloud feedback in a climate simulation is a statistical result reflecting the simulated physical processes at work; if the physical processes are flawed then the statistic has limited value.
 A major stumbling block in the implementation of using a climate model in operational NWP is the lack of a data assimilation system for the climate model. The initial conditions for NWP forecasts are produced by a rather sophisticated data assimilation process in which the forecast model plays an integral role [Kalnay, 2003]. These systems are complex and at present it is far from a simple task to replace the native model with a “foreign” model. However, the ready availability of analyses from sophisticated data assimilation systems has altered the parameters of the problem. The progress made at the NWP centers permits the production of analyses which are accurate (true to the observations) and, of particular importance for the current project, which are in a dynamically balanced state. With careful interpolation, a model of sufficient quality is able to ingest these analyses and immediately produce useful forecasts without being plagued by startup noise and artifacts created by initial imbalances. Atmospheric analyses are shared by NWP centers on a regular basis and the respective models can be successfully initialized by these foreign analyses. This permits the centers to assess how much the differences in the atmospheric initial conditions are contributing to the differences seen between the forecasts.
 For the purposes of this work, the success of this analysis “transplant” by NWP centers offers a route to harnessing the powerful data assimilation systems of NWP for the problem of refining climate models. This paper presents techniques, closely modeled on the transplant methods of NWP centers, for driving a climate model with analyses from two reanalysis projects: the NCEP/DOE [Kanamitsu et al., 2002] and ERA40 [Simmons and Gibson, 2000].
 The goal is not primarily to produce good forecasts, but to examine the physical processes of the climate model when it is confronted with a known and realistic atmospheric state. Indeed, once the forecast begins to diverge significantly from the sequence of observed states, it is less able to provide insight as to shortcomings in the model physics. It is vital that deviations from observed clouds, for instance, can be attributed to parameterization shortcomings and not errors in the large-scale flow. The issue of validation data is key to the success of this endeavor, since it is desirable to assess variables beyond the usual state variables of temperature, moisture, and winds. Variables such as precipitation, sensible and latent heat flux, radiative fluxes, and clouds need to be validated in order to measure the success of the model's parameterizations. Although such variables are available from the database of the reanalyses projects, they are derived from the assimilation model's parameterizations. Comparing a model's output to these data can be criticized as merely an exercise of comparing one model's flawed representation of atmospheric processes to another's. As pointed out in the work of Dee and Todling , accurate and unbiased validation data are vital to diagnosing model systematic errors.
 In response to this data requirement, the Department of Energy's (DOE) Atmospheric Radiation Measurement (ARM) Program [Ackerman and Stokes, 2003] was established to furnish the necessary observations to enable the physical processes of atmospheric models to be verified in a detailed manner. At present the bulk of the available ARM measurements are from a site in the Southern Great Plains of the United States with less comprehensive data from sites in the tropical west Pacific. The procedures described in this paper are a specific thrust of a more ambitious DOE program, the Climate Change Prediction Program (CCPP)-ARM Parameterization Test Bed (CAPT, http://www-pcmdi.llnl.gov/capt/). CAPT is part of the DOE effort to improve climate simulations by addressing shortcomings in the modeling of the basic physical processes at work in the atmosphere.
 The idea of running a climate model driven by the observed atmospheric state is not new. Jeuken et al. used a nudging technique to enable them to run the Hamburg climate model for short forecasts. The POTENTIALS project [Kass, 2000] attempted to analyze the initial tendencies of climate models initialized by observed fields to detect errors in the physical parameterizations. Thus, while the research reported here is not entirely new, it is unique in leveraging the transplant technology of the weather centers in combination with the reanalysis data sets to produce forecasts for extended periods with fidelity to the observed atmosphere. The use of ARM data provides the vital validation link to the physical processes going on in the model.
 There is a strong single column model (SCM) component to the ARM research program [Xie et al., 2002]. The approach reported here should be viewed as a complement to the SCM work. The SCM integrations are computationally economical but suffer from the lack of feedback to the dynamical forcing which is prescribed. The current research removes this artificial restriction but at the cost of considerably more computation and data flow. The CAPT approach can be used in conjunction with an SCM. The CAPT can reveal shortcomings in the SCM approach, while the SCM would allow rapid testing of parameterizations before implementation in the full model. There is a fuller discussion comparing CAPT to the SCM approach in the work of Phillips et al. .
 The structure of the paper is as follows: sections 2 and 3 describe the model used and the methods employed to initialize it and to produce the forecasts. The data used for validation (mostly from the ARM database) are described in section 4. The analyses of the forecasts made for the three time periods determined by the ARM Intensive Observing Periods (IOPs) are presented in section 5. An attempt is made in section 6 to characterize the errors and to attribute these to possible shortcomings of the parameterizations. Possible links between the short-term forecast errors and the systematic errors seen in climate simulations by the same model are also discussed. The conclusions in section 7 reflect on future efforts and ways in which this methodology can contribute to better parameterized processes in GCMs.
2. CAM2 and CLM2 Models
 The model used here is the Community Climate System Model (CCSM) Atmosphere Model, version 2 (CAM2). The model was released in October 2002 and is described by Kiehl and Gent and Dai and Trenberth. The sea ice fraction and sea surface temperatures are prescribed using the observed monthly mean data distributed with the model. The CAM2 was run in its standard atmospheric configuration of 26 vertical levels and T42 spectral truncation in the horizontal, corresponding to a grid of 64 latitudinal and 128 longitudinal nodes with a grid spacing of about 2.8 degrees of longitude and latitude (360/128 ≈ 2.8). The model used in the current experiments is identical in dynamical and physical processes to the publicly released version. Dai and Trenberth provide a concise overview of the physical parameterizations used in CAM2. The land component of the system, the CCSM Land Model, version 2 (CLM2) [Bonan et al., 2002], has 10 levels below the surface.
 The central problem to be addressed in this section is the initialization of a climate model using an observed atmospheric state. The two main techniques pursued for the atmospheric model state variables were (1) nudging and (2) transplant analysis. Land and unobserved parameterization variables were initialized dynamically using a combination of the two atmospheric techniques.
 Details of the nudging procedure are described in Appendix A. Nudging entails adding a forcing term to the model equations that pushes the model integration toward the sequence of observed (reanalysis) states. The variables so forced are the temperature, specific humidity, the winds and surface pressure. The method worked quite well and produced a sequence of realistic states from which the model could be started in forecast mode (with the nudging term then being turned off). The vitiating aspect of nudging is that the states so produced during the course of the integration were sufficiently far removed from the observed state that deficiencies in the parameterizations, identified by comparison to observations, might have a component due to these biases and thus not represent a true shortcoming in the parameterization. The model can produce significant biases in short times (3–6 hours) which are comparable to the nudging relaxation timescale and such biases induce a departure from the observed atmosphere in the course of the nudging integration.
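 As a rough illustration of the nudging idea (the actual formulation is given in Appendix A and is not reproduced here), the sketch below adds a Newtonian relaxation term to a toy scalar model. The function names, the constant-drift "model", the time step, and the 6-hour relaxation timescale are all assumptions made for illustration; none of them is taken from CAM2.

```python
import numpy as np

def nudged_step(x, x_obs, model_tendency, dt, tau):
    """One forward-Euler step of dx/dt = model_tendency(x) + (x_obs - x) / tau."""
    return x + dt * (model_tendency(x) + (x_obs - x) / tau)

def run_nudged(x0, x_obs_series, model_tendency, dt, tau):
    """Integrate through a sequence of reference states, one per time step."""
    x = x0
    history = []
    for x_obs in x_obs_series:
        x = nudged_step(x, x_obs, model_tendency, dt, tau)
        history.append(x)
    return np.array(history)

# Toy example (hypothetical numbers): a model with a constant warm drift.
drift = lambda x: 0.5              # assumed model bias tendency, K per hour
obs = np.full(200, 280.0)          # reference (reanalysis-like) temperature, K
states = run_nudged(280.0, obs, drift, dt=0.25, tau=6.0)  # times in hours
steady_bias = states[-1] - obs[-1]
```

 In this toy setting the nudged state does not converge to the reference value; it settles at an offset of roughly the drift multiplied by the relaxation timescale (0.5 K/hour × 6 hours = 3 K). This mirrors the concern raised above: a model bias that grows on a timescale comparable to the relaxation timescale leaves a residual departure from the observed state.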
3.2. Transplant Analysis
 The transplant analysis technique used here mimics the procedures at operational weather prediction centers whereby weather forecasts are made by initializing the center's model using analysis from another center. The technique applied here entails a careful but straightforward interpolation of variables from one model's vertical and horizontal coordinates to that of another. The term “transplant” is used to refer to this methodology [Harrison et al., 1999]. In the normal sequence of NWP operations, the model produces a short term forecast (3–6 hours) which is then synthesized with all the observational data in the data assimilation procedures (analysis). The resulting analysis is then used for the subsequent forecast and the cycle continues [Kalnay, 2003]. This sequence is designated as the forecast/analysis cycle, hereafter F/A.
 There are two common difficulties with the transplant procedures. The first is accounting for the difference in representation of the Earth's topography between the models. The difference can be fairly substantial between a coarse resolution climate model and a relatively fine NWP analysis, so care must be taken to use reasonable extrapolations/truncations in regions of terrain mismatch. The second area of concern deals with the land model. The nature of the variables represented in land models does not permit a straightforward mapping of one model's variables into another. This mapping problem is an area of active research [Dirmeyer et al., 2004], but as yet no clear solution has emerged. Hence NWP centers use their own land model initial variables in their transplant experiments.
 In the experiments described here, the dynamical atmospheric variables were interpolated from the reanalyses grids to the CAM2 grid using methods closely modeled after the ECMWF “Full-pos” procedures for the Integrated Forecast System (IFS) [White, 2002]. These procedures involve a slightly different interpolation method for each of the dynamic state variables temperature, winds, specific humidity and surface pressure along with careful adjustments to account for the topographical differences between the reanalyses and CAM models. Some judicious smoothing is required to remove artificial gradients, which are especially evident when going from a finer to a coarser grid. The smoothing was accomplished using a spherical harmonic technique consistent with the CAM2 T42 truncation [Sardeshmukh and Hoskins, 1984]. The procedures are designed to enable the interpolated data to retain dynamic balance, and thus facilitate a smooth model initialization. These methods are used by the ECMWF in their transplant experiments and explicitly account for differences in the underlying topography in a dynamically consistent fashion. Variables such as prognostic cloud water are carried forth in the model unchanged. The resulting initial state allowed the model to start smoothly without notable oscillations in surface pressure, and the rainfall patterns established representative values within the first 3 hours of integration.
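 The effect of truncation-consistent smoothing can be sketched with a one-dimensional analogue. The example below is not the spherical harmonic procedure of Sardeshmukh and Hoskins; it is a simplified zonal Fourier filter along a single latitude circle, and the function name, grid size, and test signal are assumptions chosen for illustration.

```python
import numpy as np

def zonal_truncate(field, max_wavenumber=42):
    """Remove zonal wavenumbers above max_wavenumber along the last axis."""
    coeffs = np.fft.rfft(field, axis=-1)
    coeffs[..., max_wavenumber + 1:] = 0.0
    return np.fft.irfft(coeffs, n=field.shape[-1], axis=-1)

# Toy field on a fine latitude circle (hypothetical): a wavenumber-5 signal
# plus small-scale wavenumber-100 structure that a T42 grid cannot represent.
nlon = 360
lon = 2.0 * np.pi * np.arange(nlon) / nlon
large_scale = np.cos(5 * lon)
field = large_scale + 0.3 * np.cos(100 * lon)
smoothed = zonal_truncate(field)
```

 The filtered field retains the wavenumber-5 component and discards the wavenumber-100 component, which is the kind of fine-grid structure that would otherwise appear as an artificial gradient after interpolation to a coarser grid.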
3.3. Land Initialization
 Proper initialization of the land model is a significant concern to this study. Although it is perhaps less critical for diagnosing large errors that develop rapidly in the free troposphere, it is necessary for evaluating surface exchanges and PBL parameterization errors. While the atmospheric state parameters provided by the reanalyses can be used with confidence, there is no analogous source of data for the soil variables. The detailed nature of the soil types and their aggregation in a CAM grid box make it very difficult to use observations in a manner consistent with the model. A reasonable alternative is to spin up the land as described below.
 To generate land surface initial conditions for the CAM, a three-step process was used: (1) produce a climatological seasonal land data set by running the model for about ten years using climatological SSTs, with the output from this long run used to generate a land climatological data set for each of the 12 calendar months; (2) run the CAM2 in nudging mode (see Appendix A) for about six months starting from the climatology of the first step; (3) run the model in the Forecast/Analysis mode for a short period (two weeks) preceding the time of interest, since this method is preferred for the atmosphere as described above. The second and third steps produce soil moisture in the upper levels consistent with the observed atmospheric evolution. The first step provides deep soil moisture values consistent with the coupled land-atmosphere model. These deep values evolve slowly and have little effect on the surface fluxes for the duration of a short forecast. Experience at NWP centers indicates that the Southern Great Plains soil should be spun up through the wet season (winter and spring) before the gradual drying of the summer begins. Not having sufficient deep water could dry out the land too quickly and lead to excessive heating. In addition, we found from “perfect model” experiments (i.e., using CAM2 simulated data as input to the system) that the soil moisture spins up to “correct” values in a few months following the disappearance of snow [Phillips et al., 2004].
 Because step three involves starting and stopping the CAM2, which uses computer resources inefficiently, the nudging mode was used in step two. The nudging proceeds uninterrupted with only some extra data as input which affects the execution time in a minor way. Except for this efficiency issue, step two could be replaced by step three over the entire period. It should be noted that the initialization procedure outlined above makes no attempt to correct biases in the land simulation. Even if the atmospheric forcing of the land were to follow the observed weather exactly, the model would still produce a land state reflecting the shortcomings of the land model. However, this approach is still better than using climatological land initial conditions at least for the CAM2. Experiments performed using climatological land initial conditions displayed a partitioning between latent and sensible heat fluxes in the forecasts that resulted in somewhat larger temperature and moisture errors in the lower troposphere than those described below.
3.4. Initial Conditions
 Both the ERA40 and NCEP/DOE reanalyses were used to generate the atmospheric initialization data. Both data sets were available on their native grids (vertical and horizontal) which were used by the respective assimilation models. The ERA-40 data consisted of 60 model hybrid sigma levels on a Gaussian grid with 180 meridional and 360 zonal Gaussian nodes. The ERA-40 project is described by Simmons and Gibson . The NCEP-DOE reanalysis data were generated at 28 sigma levels with 94 meridional and 192 zonal Gaussian nodes. The NCEP-DOE reanalysis is described in the work of Kanamitsu et al. . The results discussed here are based on the integrations initialized with the ERA40 data, since in general these data produced slightly better forecasts than those using the NCEP/DOE [Phillips et al., 2004]. Williamson et al.  include some discussion of the effects of the different reanalyses upon the behavior of the model's parameterizations.
4. Validation Data
4.1. Atmospheric Radiation Measurement Program
 A critical aspect of the current work is to evaluate the results from the climate model's parameterizations during a short-term forecast. Considerable care was taken to ensure that the model started from as realistic a state as possible so that the subsequent evolution of the model would produce realistic large-scale flow conditions. A long-standing problem is to have sufficient data available to evaluate the model processes, including observations of radiation, clouds, precipitation, diabatic heating rates, etc. The DOE ARM program was designed to address these important data gaps. The ARM data and the ERA40 data are generated from largely independent sources, and thus ARM can provide an objective appraisal of the forecasts.
 For the Southern Great Plains (SGP) site, the ARM data archive provides a unique, comprehensive data collection. The ARM SGP central site is located at 36.61N, 97.49W and encompasses an instrumented area of approximately 3 × 3 degrees. Figure 1 shows the location and arrangement of this facility.
 ARM provides a 3-hour time resolution for the upper air parameters and higher frequency for the surface variables during IOPs. In addition, the Arkansas-Red Basin River Forecast Center (ABRFC) 4-km rain gauge adjusted WSR-88D radar measurements provide accurate hourly estimates of precipitation over the entire ARM SGP domain. The Surface Meteorological Observation Stations (SMOS) and the Oklahoma (OK) and Kansas mesonet stations provide measurements of precipitation, wind, temperature, humidity, sensible and latent heat fluxes and broadband net radiative flux. The Soil Water And Temperature System (SWATS), installed at twenty-one of the SMOS sites, is designed to provide information about the temperature of the soil and the status of water in the soil profile. Sensors installed at various depths below the soil surface provide hourly measurements of soil temperature and estimates of soil-water potential and volumetric water content. Satellite measurements from the Geostationary Operational Environmental Satellite (GOES) provide estimates of clouds and top-of-atmosphere broadband radiative fluxes.
 These measurements are combined using a variational technique with the vertical profiles of temperature, water vapor mixing ratio, and winds measured from 3-hour rawindsondes from the five ARM sounding stations and hourly winds taken by seven NOAA wind profiler stations. The variational analysis uses the SGP domain-averaged surface and top-of-atmosphere (TOA) fluxes as the constraints to adjust the balloon soundings and profiler data to conserve column-integrated mass, moisture, static energy, and momentum. The resulting balanced profiles are available every 3 hours along with estimates of the heating and moistening rates Q1 and Q2, respectively. The variational analysis of the ARM surface, upper air and satellite observations is described in detail by Zhang et al. The ARM Millimeter Microwave Cloud Radar (MMCR) and Micropulse Lidar provide high frequency continuous measurements of clouds. A value-added product using the MMCR data is the Active Remote Sensing of Clouds (ARSCL) [Clothiaux et al., 2000]. ARSCL algorithms allow the geometric extent (in the vertical) of clouds to be mapped and provide information on the size distribution of the cloud particles. The radar is upward pointing and only samples a narrow vertical beam; thus it cannot provide information on cloud-cover extent over the entire ARM region. Hence there is a model-observational data mismatch, in that the radar provides highly detailed time information at a point while the model provides information that is intrinsically spread over a grid box. Radar therefore can miss the presence of clouds of less than 100% coverage over a model grid box.
 This issue is addressed in a comparison with the ECMWF forecast model by Mace et al. They found that by using sufficiently time-averaged radar data there is enough time/space correspondence to make the radar data useful in diagnosing the model's cloud scheme. In this study, the radar data are averaged over 3 hours and a scoring method that is based on the presence of cloud rather than the specific cloud amount is used, following Mace et al.
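 A presence-based comparison of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the scoring procedure of Mace et al.: the 30-minute radar sampling, the occurrence thresholds, the function names, and all data values are hypothetical.

```python
import numpy as np

def block_mean(x, block):
    """Average consecutive blocks of `block` samples along the time axis (axis 0)."""
    ntime = (x.shape[0] // block) * block
    return x[:ntime].reshape(-1, block, *x.shape[1:]).mean(axis=1)

def occurrence_counts(obs_present, model_present):
    """Count hits, misses, false alarms and correct negatives for cloud presence."""
    hits = int(np.sum(obs_present & model_present))
    misses = int(np.sum(obs_present & ~model_present))
    false_alarms = int(np.sum(~obs_present & model_present))
    correct_negatives = int(np.sum(~obs_present & ~model_present))
    return hits, misses, false_alarms, correct_negatives

# Toy radar cloud mask (1 = cloud detected) every 30 minutes at 2 levels, 6 hours.
radar_mask = np.array([
    [1, 0], [1, 0], [1, 0], [0, 0], [0, 0], [0, 0],   # first 3-hour window
    [0, 0], [0, 0], [0, 0], [0, 1], [0, 0], [0, 0],   # second 3-hour window
], dtype=float)
radar_freq = block_mean(radar_mask, block=6)      # 3-hour occurrence frequency
obs_present = radar_freq >= 0.25                  # assumed occurrence threshold
model_cloud_fraction = np.array([[0.6, 0.2], [0.0, 0.0]])
model_present = model_cloud_fraction >= 0.1       # assumed model threshold
counts = occurrence_counts(obs_present, model_present)
```

 Scoring on presence rather than amount sidesteps part of the point-versus-grid-box mismatch, since a vertically pointing radar cannot measure fractional cover over the model grid box.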
4.2. ARM Tropical West Pacific
 Data from the ARM site in the tropical west Pacific (TWP) were also used to validate the model forecasts. Data were available from two sites, Manus (2.058S, 147.425E) and Nauru (0.5S, 167E); see Figure 2. The TWP data covered the period of March 2000, but were less comprehensive than the SGP data, encompassing only surface measurements and the ARSCL cloud estimates. As of the writing of this paper, there were no upper air data sets comparable to those of the SGP site available for the TWP sites. It is critical to the verification of model bias that a carefully quality-controlled and thus unbiased data set be used. Soundings have been taken and archived for both TWP sites but the data have not yet been subjected to the necessary quality control.
4.3. Global Precipitation Climatology Project (GPCP)
 The ARM data provide comprehensive measurements at specific sites, but for global verification of rainfall the GPCP global daily precipitation data were used. These data combine satellite estimates and gauge observations to produce a daily estimate on a 1 × 1 degree grid [Huffman et al., 1997]. The GPCP data do not allow insight into the diurnal cycle as do the ARM measurements, but the global aspect of the data sheds light on some larger-scale model biases. The GPCP data suffer from the fact that over water they consist of estimates made from satellite measurements, with all their concomitant uncertainties.
5. Forecast Errors in CAM2 at ARM Sites
 Three time periods, determined by the ARM IOPs shown in Table 1, are examined in detail. Use of IOP data, especially the results of the variational analysis using high frequency (3 hours) upper air observations at the SGP site, provides a great deal of supplementary verification information. The characteristics of the errors during the spring IOPs at the SGP site are quite similar, so the description of the March 2000 IOP for this site will be brief. The emphasis of this study is on errors that are averaged over the first 24 hours of the forecast over a number of forecasts. This is done to characterize the systematic errors of the model, rather than just a specific forecast failure.
Table 1. The Dates and Designators of the ARM Intensive Observation Periods Used in This Paper
Start and End Dates
18 June to 17 July 1997
2–23 April 1997
1–22 March 2000
5.1. June/July 1997 IOP: 18 June to 17 July 1997
5.1.1. Synoptic Forecasts
 Examination of synoptic weather maps over North America (not shown) indicates that the model forecasts the overall synoptic situation well enough that the parameterizations are forced in a manner consistent with observations. The model performance in forecasting the 500 hPa height field on the hemispheric scale is very credible as indicated by the mean anomaly correlation [Phillips et al., 2004].
 The sea level pressure and low level winds show that the forecast does quite well in depicting the location and strength of the pertinent systems. The temperature and winds also indicate that the frontal positions are accurate. For the most part the rain does line up with centers of observed activity but there is a tendency for persistent light rain over the southern United States. On a global scale there is a tendency in convective regimes to produce broad regions of light precipitation that do not correspond to the GPCP estimates. These same characteristics of persistent, light precipitation and accompanying clouds in convective regions are evident in the detailed measurements from the ARM sites.
Figure 3 shows the time series of precipitation at the SGP site during June/July 1997 for a series of model 24-hour forecasts initiated every six hours and observations from ARM. The values plotted are 6-hour averages for the last 6 hours of the 24-hour forecast. The results using the same sampling for 12- and 18-hour forecasts are similar. It is clear that the model rains nearly every day and fails to capture the episodic nature of the rain events seen in the observations. This behavior is associated with deep convection in the model, which is parameterized using the Zhang-McFarlane (ZM) [Zhang and McFarlane, 1995] scheme [Williamson et al., 2005]. A similar pattern of daily rainfall over the U.S. Great Plains has been observed with the ZM scheme in an SCM [Xie et al., 2002] and in a regional model [Dai et al., 1999]. Table 2 presents the joint distribution [Wilks, 1995] of the ARM observations and the model's 24-hour forecasts of precipitation. The marginal distributions of the forecasts show the ubiquity of the rainfall events in the model. In the model's defense, the frequency of predicted rain might be an artifact of the four grid cells that are used to generate the CAM estimates (see Figure 1) at the SGP site. These grid boxes occupy a larger area than the SGP region, and thus the model might be depicting actual rain events that are not encompassed by the ARM region. However, there are radar estimates of precipitation from ARM (ABRFC), calibrated with rain gauge data at the SWATS locations, which extend over the full CAM grid cells. In Figure 3, the average of these radar values over the larger region indicates that the rain events do indeed occur more often. Comparison of the CAM and radar time series indicates that the model nonetheless overestimates the frequency of precipitation. Averaging over the larger area also diminishes the magnitude of the peaks in the rain events, but here again the model is unable to match the reduced peaks of the radar data.
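 The construction of a joint distribution table and its marginals can be sketched as below. The rain/no-rain threshold, the function name, and the rain rates are made up for illustration; they are not the values underlying Table 2.

```python
import numpy as np

def joint_distribution(forecast, observed, threshold):
    """Return a 2x2 table of relative frequencies.

    Rows: forecast rain, forecast no rain.
    Columns: observed rain, observed no rain.
    """
    f = np.asarray(forecast) >= threshold
    o = np.asarray(observed) >= threshold
    table = np.array([
        [np.sum(f & o), np.sum(f & ~o)],
        [np.sum(~f & o), np.sum(~f & ~o)],
    ], dtype=float)
    return table / f.size

# Hypothetical 6-hour mean rain rates (mm/day); not data from the paper.
fcst = [2.0, 1.5, 0.8, 3.0, 1.2, 0.0, 2.2, 0.9]
obs = [0.0, 4.0, 0.0, 0.0, 6.0, 0.0, 0.0, 0.0]
table = joint_distribution(fcst, obs, threshold=0.5)
forecast_marginal = table.sum(axis=1)   # forecast rain, forecast no rain
observed_marginal = table.sum(axis=0)   # observed rain, observed no rain
```

 In this toy case the forecast marginal shows rain in 7 of 8 periods while the observed marginal shows rain in only 2 of 8, which is the kind of contrast the marginal distributions are meant to expose.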
Table 2. Joint Distribution Table of 24-Hour Forecast Precipitation at the SGP Site for the June/July 1997 IOP
5.1.3. SGP Column
 The rainfall is the result of many processes at work in the atmospheric column. The ARM SGP IOP data permit a further investigation of the evolution of the variables above the SGP. Figure 4 displays the error in relative humidity for a sequence of 24-hour forecasts initiated every 6 hours from 18 June to 17 July 1997 at the SGP. The error is defined as the model estimate less the ARM observed values. The persistence of the error through most of the time period is noteworthy. The evolution of the ensemble mean difference between the model and ARM relative humidity is shown in Figure 5. The figure shows the mean difference in relative humidity between 5-day model forecasts initiated at 0000 UT and ARM. The mean is over thirty 5-day forecasts. The errors are established within the first 24 hours and then level off to a roughly steady state. The diurnal pulses of differences are apparent in the figure as the convective processes are triggered each day in the early afternoon. The low level drying is established in the first 24 hours and only slightly increases for the next 4 days. For this variable, the model forecasts an atmospheric state significantly different from that observed after two days. Figure 6 shows the mean 24-hour forecast error in temperature, specific humidity and relative humidity for June/July 1997 at the SGP site. The model is too moist both above 500 hPa and in the boundary layer, and too dry between 500 and 900 hPa. Detailed model data (not shown) indicate that the lower tropospheric drying is primarily due to the ZM deep convection scheme, while the upper level moistening is from evaporation of rainfall primarily by ZM below 300 hPa and a combination of rainfall evaporation and detrainment above this level [Williamson et al., 2005]. An examination of the time evolution indicates that a large contribution to the error comes in the 0900 to 1500 LST time frame, consistent with convective sources, an error also noted by Dai and Trenberth.
Column temperature errors indicate that the model is much too warm in the upper troposphere between 200 and 600 hPa, too cold near the surface in the boundary layer and a bit warm between 900 and 600 hPa. The time evolution of the errors is similar to that of the relative humidity, Figure 5, in that the errors establish themselves early in the forecast. The warming appears to be probably due to the ZM scheme. As indicated by Dai and Trenberth , the ZM scheme implemented in the CAM2 is triggered too often and too early in the day during summer. The scheme appears to be overly sensitive to CAPE formation and proceeds with the convective processes without due regard for inhibitory aspects of the synoptic situation [Xie et al., 2004].
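The ensemble-mean error as a function of forecast lead time, as described for Figure 5, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name, the array layout, and the assumption that forecasts and observations share one regular time grid are all hypothetical.

```python
import numpy as np

def mean_error_by_lead(forecasts, observations, start_indices):
    """Average model-minus-observation error over forecasts, per lead time.

    forecasts:     array (n_forecasts, n_leads, n_levels), hypothetical layout
    observations:  array (n_times, n_levels) on the same regular time grid
    start_indices: index into observations of each forecast's first lead time
    """
    forecasts = np.asarray(forecasts, dtype=float)
    observations = np.asarray(observations, dtype=float)
    n_leads = forecasts.shape[1]
    errors = np.empty_like(forecasts)
    for i, start in enumerate(start_indices):
        # Observations valid at the same times as forecast i
        verifying = observations[start:start + n_leads]
        # Error is defined as model estimate less observed value
        errors[i] = forecasts[i] - verifying
    # Missing observations (NaN) are skipped in the ensemble mean
    return np.nanmean(errors, axis=0)
```

The result has one row per lead time and one column per level, so an error that is established in the first day and then levels off would appear as rows that stop changing after the first 24 hours.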
5.1.4. SGP Surface
Table 3 lists the 24-hour forecast errors for the surface sensible and latent heat fluxes over the SGP region, and shows that the sensible heating is systematically too small and the latent heating is systematically too large. These fluxes are in accord with the atmospheric temperature and moisture errors in the boundary layer, if they are considered as a source of the latter anomalies.
Table 3. Differences in Latent and Sensible Heat Fluxes Between 24-Hour CAM Forecasts and ARM Observations at the SGP Site
Columns: Sensible Heating (W/m2); Latent Heating (W/m2)
Rows: April 1997 (rain, dry, all days); March 2000 (rain, dry, all days)
 The very steep gradient below 900 hPa in the specific humidity of Figure 6 may indicate that the PBL transfer is not deep or efficient enough. The model's PBL depth would also be adversely affected by the deep convection occurring too frequently and too early in the day. The results of the April 1997 case, discussed below, indicate that the ubiquitous rainfall may inhibit the vertical growth of the PBL height in the model.
5.1.5. SGP Soil Temperature and Moisture
 As mentioned in section 3.3, proper initialization of the soil model is a significant concern to this project. There can be a fair amount of confidence in the atmospheric state parameters provided by the reanalyses, but there is no analogous source of data for the soil variables. The detailed nature of the soil types and the way soil types are aggregated in a CAM grid box make any use of observations problematic in view of the large variation of soil types, landforms and land use across the ARM site [Luo et al., 2003]. Thus model-observation comparisons of soil moisture and temperature are necessarily qualitative and are presented to provide some indication of the ability to achieve a reasonable initial state in the land model.
 Because soil moisture in the uppermost layers (5 cm) has the most impact on the short-term forecasts, the top layer of the SWATS temperature and volumetric soil moisture data (at 5 cm depth) is compared to the 24-hour forecasts of the land model's third soil layer (6.2 cm) in Table 4. Given the substantial uncertainty in these types of comparisons, the model-observed differences are not too large. The model land initialization (see section 3.3) seems to have generated a reasonable land state, but in general the soil is a little too dry and warm. In light of the lower-level errors in atmospheric temperature and in the surface sensible and latent heat fluxes, the surface-atmosphere exchange formulation in CAM2 might need attention. The lower atmosphere is too cold, the sensible heat flux is too small, and yet the land is still too warm. On the other hand, the latent heat fluxes are too large, which is consistent with the land drying and the atmosphere moistening. This is consistent with the differences between the observations and initial conditions. The relation between the observed and land model variables changes very little from the initial state over the duration of the 24-hour forecast. The surface radiative fluxes for this period are in good agreement with the observations [Williamson et al., 2005].
Table 4. Differences in Soil Temperature and Moisture Estimate Between 24-Hour CAM/CLM Forecasts and SWATS Observations at the SGP Site (a)
Columns: Soil Temperature (K); Volumetric Soil Water (mm3/mm3)
Rows: April 1997 (rain, dry, all days)
(a) The depth levels are 5 cm for the observations and 6.2 cm for the model.
Figure 7 presents the comparison of the ARSCL product to the model cloud for 24-hour forecasts across the IOP. The figure is a colored depiction of a contingency table, indicating the nature of the agreement of the model and observations as to the presence of cloud. The key to the figure is described in Table 5. The most obvious difference is in the low/middle cloud from 900 to 500 hPa where the model appears to systematically underestimate the presence of cloud. There is too much high cloud in the upper levels. The increased vertical resolution afforded by the radar does show that the traditional low (surface to 700 hPa), middle (700–400 hPa) and high (above 400 hPa) categories of cloud levels would mix together distinctly different types of model errors. This is especially true of the 700 to 400 hPa region. Aggregating the cloud fractions from the model levels through this layer could produce a conclusion somewhat different from that gathered from Figure 7. The very lowest levels (below 900 hPa) are not suitable for detailed comparison because the radar signal can be compromised by other aerosols (e.g., dust, insects) in the layer [Mace et al., 1998]. Except for marine stratus, cloud fraction in CAM2 is diagnosed using a scheme that involves relative humidity as a key variable. It is thus not surprising that the cloud error follows the relative humidity error in Figure 6.
Table 5. Summary of Model-ARSCL Comparison Conditions Color Coding
false alarm (red)
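The contingency classification behind Figure 7 can be illustrated with a short sketch. The four category names are the conventional forecast-verification terms and the threshold is a hypothetical parameter; neither the colour key of Table 5 nor the authors' actual criterion for cloud presence is reproduced here.

```python
import numpy as np

def cloud_contingency(model_cloud, obs_cloud, threshold=0.0):
    """Label each (time, level) cell by model/observation agreement on cloud.

    model_cloud, obs_cloud: arrays of cloud fraction with the same shape.
    threshold: hypothetical cutoff above which cloud counts as present.
    """
    model_has = np.asarray(model_cloud, dtype=float) > threshold
    obs_has = np.asarray(obs_cloud, dtype=float) > threshold
    category = np.empty(model_has.shape, dtype="U20")
    category[model_has & obs_has] = "hit"                 # both show cloud
    category[~model_has & obs_has] = "miss"               # observed only
    category[model_has & ~obs_has] = "false alarm"        # model only
    category[~model_has & ~obs_has] = "correct negative"  # both clear
    return category
```

Keeping the classification on individual model levels, rather than aggregating into low, middle and high layers first, preserves the distinction between the layers where the model misses cloud and those where it produces too much.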
5.2. April 1997 IOP: 2–23 April 1997
Figure 8 presents the 24-hour forecast and observed precipitation for the period 2–23 April 1997. There is a marked improvement over the performance in the summer case; see Figure 3. The model captures the episodic nature of the rain quite well but under-predicts intensity. The radar data indicate that this reduced rain intensity is at least in part due to the larger area encompassed by the four CAM2 grid boxes. Table 6 also indicates the general improvement over the summer case. The model success in discriminating between precipitation regimes allows a partitioning of the data into wet and clear periods in the subsequent analysis. In the following, “wet” denotes those times when both the model and observations show rainfall, and “clear” when both show no rainfall.
Table 6. Joint Distribution Table of 24-Hour Forecast Precipitation at the SGP Site for the April 1997 IOP
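The partition into "wet" and "clear" times, together with a 2 x 2 joint distribution of forecast and observed rain occurrence, can be sketched as below. The rain threshold and the table layout are assumptions for illustration; the binning actually used in Table 6 is not specified in the text.

```python
import numpy as np

def partition_wet_clear(model_rain, obs_rain, threshold=0.0):
    """Return wet/clear masks and a 2x2 joint count table.

    wet:   both model and observations show rain
    clear: neither model nor observations show rain
    table rows: model rain yes/no; columns: observed rain yes/no
    threshold: hypothetical rate above which rain is counted as occurring
    """
    model_wet = np.asarray(model_rain, dtype=float) > threshold
    obs_wet = np.asarray(obs_rain, dtype=float) > threshold
    wet = model_wet & obs_wet
    clear = ~model_wet & ~obs_wet
    table = np.array([
        [np.sum(model_wet & obs_wet), np.sum(model_wet & ~obs_wet)],
        [np.sum(~model_wet & obs_wet), np.sum(~model_wet & ~obs_wet)],
    ])
    return wet, clear, table
```

Times at which the model and the observations disagree fall into neither mask, which is why such a partition is only useful when the model discriminates between precipitation regimes reasonably well.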
5.2.2. SGP Column
Figure 6 (right panels) displays the mean 24-hour error in temperature, specific humidity and relative humidity at the SGP site for the April 1997 IOP during wet and clear periods. The figure shows that the wet periods generally contribute the bulk of the error for the chosen variables. The model is too warm and too moist below 700 hPa and the warm bias is larger for the wet periods. The wet relative humidity pattern is not unlike that of June/July 1997, but the profiles of temperature and moisture for the two cases are quite different. The terms of the atmospheric moisture budget [Williamson et al., 2005] indicate that the moist processes are not removing enough moisture in the lower layers to offset the vertical diffusion. In April the contribution of the ZM deep convective scheme is secondary to that of the Hack shallow convective parameterization of the model [Williamson et al., 2005].
5.2.3. SGP Surface
 The surface sensible and latent heat errors shown in Table 3 for April 1997 are much like those of June/July 1997 for the wet periods, but they change sign for the dry periods. Thus the sign of the flux errors is consistently tied to occurrences of rain in both cases. The fact that flux errors reverse sign indicates that the model's treatment of rain is not the only determinant of surface flux errors.
Table 4 compares the SWATS upper layer soil temperature and moisture data to the 24-hour CAM2 forecasts for April 1997. The model soil is drier than the observations and evinces somewhat more variability. The model shows rapid increases of soil moisture during rain events and a steeper drying over the nearly rainless second half of April. The model soil is systematically warmer than the observations, with maximum differences approaching 8°C. The CLM2 model used in CAM has a documented warm bias during the cold season in high and midlatitudes [Kiehl and Gent, 2004]. As in the summer case, however, it is difficult to construct a consistent picture of the error in the soil, surface fluxes and lowest level atmospheric temperature and moisture.
Figure 9 presents the ARSCL cloud forecast comparison for April 1997. The clouds track the transition from a rainy to dry period around 13 April. Before this time, the model does well in predicting cloud, and afterward it does well in simulating clear regions. In both periods the most consistent tendency is for the model to miss low/middle clouds, and overall the systematic error is to underpredict clouds. The association of relative humidity and cloud errors is not as clear-cut as in June/July 1997, although the underestimated low-level relative humidity is coincident with the reduced low/middle cloud.
5.3. March 2000 IOP: 1–22 March 2000
 For the March 2000 IOP the results at the SGP site closely resemble those for April 1997. Thus only aspects of this IOP related to the observations at the TWP site will be presented. Tables 7 and 3 provide some summary information on the March 2000 SGP results.
Table 7. Joint Distribution Table of 24-Hour Forecast Precipitation at the SGP Site for the March 2000 IOP
5.3.1. TWP Precipitation
Figures 10 and 11 present the precipitation time series during the March 2000 period for the Manus and Nauru ARM sites. In addition to the ARM and CAM data, the daily GPCP values for each location are also displayed. The ARM data are hourly values, which are averaged to the three-hour CAM sampling. The plots show a decrease in precipitation amounts in going eastward from Manus to Nauru for this period. The CAM fails to capture the episodic nature of the rainfall at either location. The CAM rainfall time series at both locations is reminiscent of the June/July 1997 SGP rain time series: a persistent light rain with a diurnal variation. This would indicate that the model error related to modeling convection might not be restricted to locations over land, since the model takes both TWP locations to be virtually all ocean. The agreement of the GPCP and gauge data is poor for both sites. This highlights the difficulty in using single island gauge measurements of rainfall in the tropics as being representative of a mostly oceanic region. However, in a qualitative sense both the GPCP and gauge data indicate the episodic nature of the rainfall at these locations. This indicates that the temporal variation of the CAM rainfall is in error.
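Averaging the hourly gauge values to the three-hour model sampling can be sketched as a simple block mean. The alignment of the three-hour windows with the model output times is an assumption here, as is the choice to drop an incomplete trailing window; the function name is hypothetical.

```python
import numpy as np

def block_average(hourly, window=3):
    """Average consecutive groups of `window` hourly values.

    Assumes the first hourly value starts a window; any incomplete
    trailing window is discarded.
    """
    hourly = np.asarray(hourly, dtype=float)
    n_blocks = hourly.size // window
    trimmed = hourly[:n_blocks * window]
    return trimmed.reshape(n_blocks, window).mean(axis=1)
```

For example, `block_average([0, 3, 6, 1, 1, 1, 9])` yields two three-hour means, 3.0 and 1.0, and ignores the final unpaired hour.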
5.3.2. TWP Column
 There were no quality controlled ARM upper air observations available for the TWP site for March 2000 at this time. Figure 12 presents the mean differences between the 24-hour forecasts and the ERA40 data. There is not a differentiation between clear and wet since the model rains almost all the time. Interestingly enough, the errors do bear some resemblance to those of June/July 1997 at the SGP site. This gives some evidence of a common origin in the convective parameterization which was active throughout June/July 1997. The comparison to the ERA40 data is somewhat questionable since these data were used to initialize the model. However, these differences do provide a measure of the magnitude and direction of the model drift from the initial conditions over the forecast time.
5.3.3. Clouds TWP
Figure 13 shows the ARSCL product comparison to the 24-hour cloud forecast at the TWP sites. For this IOP the two sites present an interesting contrast with respect to cloudiness. The ARSCL (not shown) indicates substantially more cloud activity throughout the period at Manus as compared to Nauru. At Manus the model systematically underestimates the low/middle clouds from 800 to 500 hPa, while capturing the presence of the high cloud. There is a slight indication of some overestimation of high cloud. At Nauru, the high cloud is overestimated somewhat more consistently, especially at the highest levels. This high-cloud bias is consistent with the relative humidity biases in Figure 12. The low/middle cloud at Nauru has a small tendency toward underestimation. The model has a systematic bias consisting of too much high cloud and too little low cloud. In those instances when the observations fit this pattern the model appears to do well, as seen by the high cloud at Manus and low cloud at Nauru.
6.1. Relation to AMIP Climatological Errors
 The motivation of CAPT is to put into place tools for identifying deficiencies in the physical parameterizations of climate models. It has been argued that an effective way of improving the physics of the models is by running short-term forecasts. Nonetheless, it is of some interest to investigate if the errors seen in the forecasts have any projection on the climate simulations. This section will present some of the systematic climate errors at the ARM sites. An emphasis will be on variables that relate to clouds and moisture since the proper depiction of cloud is an outstanding problem in climate simulations.
Figure 14 displays profiles of differences between CAM AMIP style simulations and the ERA40 and NCEP/DOE reanalyses at the ARM SGP and Nauru sites for the time of year corresponding to the IOPs used in this work. Only June data are shown for the SGP site, but the July results were quite similar. For the TWP the Nauru results are very much like Manus so only Manus is shown. The AMIP simulation used observed SSTs for the years 1979 to 1995. The CLIMO integration, also compared to the AMIP in the figure, used climatological SSTs, and the data are from years 10 to 29 of that integration. An indication of the robustness of the error across independent realizations of the model is provided by the AMIP-CLIMO differences. At the SGP site the difference pattern is surprisingly consistent across the seasons, although there is some indication that the summer error has a larger amplitude than the spring. There is also good agreement between the ERA40 and NCEP/DOE reanalyses.
 The model errors at the TWP are generally smaller in magnitude than at the SGP site. The prescribed SSTs at these ocean sites constrain the model, and at the SGP site the land may evolve away from the observed state and thus contribute an additional error source. Since the TWP sites are in a region that is directly affected by El Niño variations, the AMIP-CLIMO differences are a bit larger than at the SGP. It therefore is logical to use the AMIP as a measure of error since this simulation included observed SSTs. The temperature differences appear to be the opposite of those at the midlatitude site, with the two reanalyses showing fair agreement. For the moisture parameters, the reanalyses diverge; the lack of agreement is most manifest in the relative humidity. Few in situ observations are available in this region, so differences in assimilation of satellite data and assimilation model biases take on greater significance, producing differences in the reanalyses. The most consistent signal is a pronounced positive bias in the CAM2 relative humidity at the levels above 500 hPa, which is especially large relative to the NCEP/DOE reanalysis.
 Comparisons of the low, middle and high cloud fractions between the CAM simulations and the observations are presented in Figure 15. The verification data come from two sources: the International Satellite Cloud Climatology Project (ISCCP) of Rossow et al. and the land-based observational cloud climatology atlas (ATLAS) of Hahn and Warren. These two data sets are complementary: land observations are more reliable for the low clouds while the satellite data might have the advantage in observing the high clouds. The Hahn and Warren data provide seasonal means (March, April, May) and (June, July, August) for the low and middle cloud, but provide means for individual months for the high cloud. At the SGP site, the errors are similar across the spring and summer; the model underestimates low cloud and overestimates high cloud. This is especially true if the ATLAS data are used for the lower cloud and ISCCP for the highest. Middle clouds, however, do not display an egregious error. This consistency is not too surprising given the relative humidity curves in Figure 14 since the cloud fraction in CAM2 is diagnosed using the relative humidity as the key parameter. Over the TWP site the glaring error is in the overestimate of high clouds, although there is some indication of an underestimate in the low and middle cloud. Again, this is consistent with the relative humidity bias.
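Why a relative humidity bias maps so directly onto a cloud bias can be seen from a generic relative-humidity-threshold cloud diagnostic. The quadratic form and the threshold value below are illustrative assumptions of a kind commonly used in diagnostic schemes; they are not the CAM2 formulation, which is not given in the text.

```python
import numpy as np

def rh_cloud_fraction(rh, rh_min=0.8):
    """Diagnose cloud fraction from relative humidity (0 to 1).

    rh_min is a hypothetical threshold below which no cloud forms.
    Cloud fraction rises quadratically from 0 at rh_min to 1 at saturation.
    """
    rh = np.asarray(rh, dtype=float)
    scaled = np.clip((rh - rh_min) / (1.0 - rh_min), 0.0, 1.0)
    return scaled ** 2
```

With this form, a layer that is too dry sits at or below the threshold and produces no cloud at all, while a layer that is too moist produces cloud fraction that grows rapidly as saturation is approached.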
 The challenge is to find the deficiencies in the physical parameterizations that contribute to the systematic errors seen in Figures 14 and 15. There is no reason to expect that the errors in a short-term forecast (1–5 days) should necessarily resemble those in Figures 14 and 15. As the climatological simulation proceeds, the deficiencies in the various physical parameterizations interact in complex ways, such that the long-term error may bear little resemblance to its root cause. Indeed, during the first five days of a simulation, radiation plays only a minor role in forcing variability, but over the longer term its effects can be paramount. Yet, consideration of data such as that shown in Figures 14 and 15 is valuable in suggesting a starting point as to what aspects of the forecast to examine for the source of systematic error, and as a baseline for assessing future improvements in the climate simulation.
 If one only considers the ERA40 curves, Figures 12 and 14 show some commonality. The forecast error and climatological error thus may have some relation to each other in this region of the tropics. The tropical sites are both essentially oceanic (from the model's point of view), so some of the commonality in error might originate from the fact that both the climate (AMIP) and forecast simulations are driven by the same SSTs. At the SGP site, the land model presents different states to the climate and forecast integrations and this provides another source of differences. The model also evinces a climatological systematic error in overestimating high cloud in the TWP.
 Comparing the figures for the CAM climatological errors (Figure 14) and the 24-hour forecast errors (Figures 6 and 12) there is little correspondence between climate and forecast errors except for the relative humidity. The pattern of overestimating humidity at upper levels, and underestimating it at lower levels is seen across all the cases. However, the corresponding temperature and specific humidity profiles are somewhat more disparate. For example, in Figure 14 climatological relative humidity error for June is driven by the temperature profile while the forecast error for June/July 1997 in Figure 6 is mainly due to the specific humidity.
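The distinction between a temperature-driven and a moisture-driven relative humidity error can be made explicit with a first-order expansion. This is a textbook approximation rather than a diagnostic taken from the study: it assumes that relative humidity is approximated by the ratio of specific humidity q to its saturation value q_s, that q_s follows the Clausius-Clapeyron relation, and that δ denotes a small model-minus-observation difference.

```latex
\mathrm{RH} \approx \frac{q}{q_s(T,p)}, \qquad
\frac{1}{q_s}\frac{\partial q_s}{\partial T} \approx \frac{L_v}{R_v T^2}
\quad\Longrightarrow\quad
\frac{\delta \mathrm{RH}}{\mathrm{RH}} \approx \frac{\delta q}{q} - \frac{L_v}{R_v T^2}\,\delta T
```

With L_v ≈ 2.5 × 10^6 J/kg, R_v ≈ 461.5 J/(kg K) and T = 280 K, the coefficient of δT is roughly 0.07 per kelvin, so a 1 K warm error lowers relative humidity by about as much as a 7% dry error in specific humidity. The same relative humidity error profile can therefore arise from quite different combinations of temperature and moisture errors.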
 Thus there is a similarity in the relative humidity error for climatology and forecasts, but the etiology of the respective errors is distinct. This result is not at all surprising. Figure 5 demonstrates that the relative humidity error grows rapidly within the first five days of integration for the June/July 1997 SGP case. Over a climatological run the state of the atmosphere is thus sufficiently far from a realistic state that the error characteristics would proceed in a different manner than the forecasts. This would imply that attempting to invert the errors seen in the climatological integrations to errors in the physical processes would be fraught with uncertainty. The goal is to reduce the climatological errors but an important path to this reduction might well be the careful analysis of the short-term forecast. The error seen in the June/July 1997 forecasts is very systematic (Figure 4) yet its signature is transmogrified by other interactions so that it is unrecognizable in the climatology.
 The foregoing discussion indicates that similar caution must be exercised when considering the model-observational differences in clouds. Figure 15 shows the climatological differences and Figures 7, 9, and 13 display the forecast comparisons to observations of clouds. There is a fairly good correspondence between the climatology and the forecast errors. The forecast overestimate of high cloud in convective regimes is quite clear as is the underestimate of low cloud at almost all times and locations. This cloud error is consistent with the error in the relative humidity profile. It is not obvious, however, that the processes that spawn the high clouds in the first day of the forecast and those that maintain their existence for 15 years of integration are the same. What can be surmised is that there are feedbacks that maintain the high cloud in the climatological sense even after the state of the atmosphere has changed due to other effects not seen in the five-day forecasts. It follows that the model might have different paths to the same error pattern, and the close examination of short-term errors and some experimentation can reveal the true etiology of the error which is obscured in the climatology. The paradox is that cloud feedback uncertainties can be usefully addressed before cloud feedback can exert a dominant influence. The short integrations shown here do not have enough time for radiative feedbacks to become effective. Thus the physical errors can be identified before being entangled in feedbacks, placing the diagnoses of actual feedback loops on firmer ground.
6.2. Land and Surface
 An aspect of the current work that has no elegant solution is the initialization of the land. Validation of the land model is important in its own right and the state of the land upper levels will have an impact on the surface exchanges of heat and water. Table 4 indicates that the soil is in fair agreement with the observations and does track the time variations at the SGP site. There is apparently a consistent bias in that the land model is generally too warm and too dry. Table 3 indicates that despite the land being too warm, the sensible heat flux displays a negative difference with the observations. This negative value is seen even in the June/July 1997 case where the lower atmospheric temperature is too cold. The latent heat flux has a positive bias when it is raining, and a negative bias when it is dry. Thus it seems that the land biases are not dominating the sense of the surface exchanges. Nonetheless, increasing the accuracy of the land initialization is a prime goal of future work so as to allow a clear identification of the source of such flux errors.
 The CAM2 atmospheric model was initialized by state variables from reanalyses and run as a forecast model for short time periods (5 days). The land surface model, CLM2, was spun up by integrations forced by observed state variables for the period preceding the forecast. The goal was to analyze the model evolution in the short term starting from as realistic conditions as possible, so as to be able to isolate specific shortcomings in the modeling of the physical processes before other interactions obscure the actual error source.
 Forecasts were produced for three ARM IOPs, June/July 1997, April 1997 and March 2000. Forecasts were initiated every six hours and the state variables were updated at these times. This sequence mimics the forecast analysis cycle of NWP centers. The experiments indicate that the CAM2 is capable of producing high quality short-term forecasts of the large scales. Detailed validation of the model at the ARM Southern Great Plains and tropical west Pacific sites was carried out. The validation exercise shows systematic errors in convective regimes apparently related to the deep convective parameterization. The error in these regimes is quite systematic and manifests itself over the course of a forecast of a single diurnal cycle. In regimes where the deep convection is not invoked the errors are reduced, but the error during rainfall events is somewhat larger than during clear periods. There are some seeming inconsistencies between the state of the land, the surface air temperature and moisture, and the surface sensible and latent heat fluxes. These might indicate shortcomings in the modeling of the surface exchange layer.
 The land initialization is an issue. There are model-derived global soil data sets available, but there is not yet a consensus on how to map observations to the variables of a specific land model [Dirmeyer et al., 2004]. The current land spin-up technique appears to be adequate at this stage of the model development in the sense that it appears that other CAM2 biases are overwhelming the contribution from the land, although there is some evidence of persistent biases in the land state.
 Future work will involve the next release of the Community Atmospheric Model, CAM3. It is planned to run experiments for much longer periods in the forecast/analysis mode. The longer runs will permit a stratification of the observed conditions and model forecasts so as to permit a diagnosis of specific parameterizations. In the current work it was not possible to stratify the data according to clear versus cloudy conditions. In the current set of experiments the number of instances that the model and observations agreed with respect to the cloud distribution was very small and insufficient to get definitive results as to the nature of the cloud feedback processes in the model.
Appendix A: Nudging
 The CAM2 code was modified to permit the solution to be continuously “nudged” toward a target analysis during the integration of the model. The nudging technique has a rich history, mostly as a model initialization method to decrease startup noise [Hoke and Anthes, 1976]. The basic idea is to modify the prognostic equations to include a term that decreases the distance between the predicted value and a target analysis. The values are not forced to exactly match those of the analysis, and the degree of conformance is dictated by a relaxation constant. This value is usually expressed in terms of the e-folding time for the model to come into agreement with the analysis, all else being equal. This is depicted in the following equation:
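In a generic form, a relaxation of this kind can be written as below. The symbols X (a prognostic variable), F(X) (the unmodified model tendency) and X_a (the target analysis) are illustrative, and treating α as the inverse of the e-folding time is an assumption rather than something the text states.

```latex
\frac{\partial X}{\partial t} = F(X) + \alpha\,\bigl(X_a - X\bigr),
\qquad
X(t) - X_a = \bigl(X(0) - X_a\bigr)\,e^{-\alpha t}
\quad \text{when } F = 0 \text{ and } X_a \text{ is constant}
```

Under this reading, an e-folding time of 6 hours corresponds to α = 1/(21600 s), or about 4.6 × 10^-5 s^-1.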
 The value of α was chosen such that the variables would relax to the analysis in 6 hours. The variables so adjusted were the atmospheric prognostic variables: temperature, wind, moisture and surface pressure. The ARM IOPs considered here occurred in 1997 and 2000. For all the cases the CAM was run in the nudging mode from the first of January to the start of the IOP period. Two separate integrations were performed using the R2 and ERA40 data as nudging targets. These same data provided initial conditions for the subsequent experiments. The reanalysis data were available every 6 hours, and these were interpolated using a cubic polynomial to the 20 min time step of the model. This degree of nonlinearity was determined sufficient to adequately drive the model, after experimenting with interpolation schemes ranging from linear to quintic.
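The two ingredients described above, cubic interpolation of 6-hourly analyses to the model time step and relaxation toward the interpolated target, can be sketched as follows. This is an illustration under stated assumptions, not the CAM2 implementation: the four-point Lagrange cubic is one possible reading of "cubic polynomial", the forward Euler update stands in for the model's actual time stepping, and all function and parameter names are hypothetical.

```python
import numpy as np

def cubic_lagrange(times, values, t):
    """Evaluate the cubic through the four samples nearest the interval holding t."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    # Index k of the interval [times[k], times[k+1]] that contains t
    k = int(np.clip(np.searchsorted(times, t, side="right") - 1, 0, times.size - 2))
    # Four nodes centred on that interval, shifted inward at the ends of the record
    first = int(np.clip(k - 1, 0, times.size - 4))
    nodes = times[first:first + 4]
    samples = values[first:first + 4]
    result = 0.0
    for i in range(4):
        weight = 1.0
        for j in range(4):
            if j != i:
                weight *= (t - nodes[j]) / (nodes[i] - nodes[j])
        result += weight * samples[i]
    return result

def nudge_step(x, tendency, target, dt, tau):
    """Advance x by one step of length dt (forward Euler, an assumption).

    tendency: unmodified model tendency for x
    target:   analysis value interpolated to the current time
    tau:      e-folding relaxation time, in the same units as dt
    """
    alpha = 1.0 / tau
    return x + dt * (tendency + alpha * (target - x))
```

With a 20 min step (1200 s) and a 6 hour relaxation time (21600 s), 18 steps with zero model tendency reduce the departure from a fixed target to roughly 1/e of its initial value, which is what the e-folding interpretation implies.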
 We thank the ECMWF for early access to the ERA40 data which were essential for this study. We also thank Nils Wedi (ECMWF) for explaining some of the interpolation methods used operationally at the ECMWF. The ARM SGP site soil data were obtained from the Land Data Assimilation System (LDAS) validation data archives. This work was performed under the auspices of the U.S. Department of Energy (USDOE) Office of Science, Biological and Environmental Research (BER) program by the University of California, Lawrence Livermore National Laboratory under contract W-7405-Eng-48. This work also was partially supported at the National Center for Atmospheric Research (NCAR) by the Climate Change Prediction Program (CCPP), which is administered by the USDOE Office of Science, BER program. NCAR is sponsored by the National Science Foundation.