We survey the IPCC AR4 models' responses to SRES A1B forcing in order to evaluate a prediction of climate change common to all models and testable using GPS radio occultation data over the coming decades. Of the IPCC AR4 models that submitted runs of the SRES A1B forcing scenario, we select twelve because of the timeliness of their submission. Trends in the global average surface air temperature show better overall agreement over the first 50 years than in the IPCC Third Assessment Report, but the patterns of global surface air temperature trends show little improvement in intermodel agreement. These same twelve models show qualitatively better agreement in their patterns of upper air temperature trends. All show maintenance of a moist adiabatic temperature profile in the tropics, making tropospheric temperature trends the greatest at 200 hPa in the tropics. We test the climate models' predictions using optimal fingerprinting. In order to do so, we use long preindustrial control runs of four of the IPCC AR4 models. Simulating trends in the log of the vertically integrated microwave refractivity, or “dry” pressure, of the atmosphere is nearly the same as measuring trends in geopotential heights of constant pressure surfaces. The first four EOFs of interannual variability of log-dry pressure, as determined by two independent climate models, are ENSO, modes closely associated with the southern and northern annular modes, and a previously unidentified symmetric jet migration EOF. The latter is characterized by poleward migration of the eddy-driven midlatitude jet correlated between hemispheres. The ENSO mode and especially the symmetric jet migration EOF contribute most to optimal detection and are similarly predicted by the IPCC AR4 models. The common prediction of all climate models will be tested with 95% confidence with GPS radio occultation data in 7 to 13 years. The fingerprint is dominated by symmetric poleward migration of the eddy-driven midlatitude jets.
 That global surface air temperature has been increasing over the past century is beyond doubt; that the warming can be attributed to anthropogenic greenhouse gas increases is gaining in certainty; and yet accurate forecasting of future climates remains a distant goal. Nevertheless, it is a goal demanded by governments worldwide and called for in programmatic planning documents [Goody et al., 2002; Climate Change Science Program, 2003; National Research Council, Committee on Radiative Forcing Effects on Climate, 2005]. Two rigorous methodologies have been described to attain this goal through regimes of climate model testing and improvement using data [Goody et al., 1998]. They are second moment testing through comparison of lagged covariances of observations and climate model simulations and first moment testing by monitoring climate trends. Second moment testing is done within the context of the statistical fluctuation dissipation theorem [Leith, 1975] and has been pursued with satellite based measurements of high spectral resolution infrared spectra [Haskins et al., 1997, 1999], a data type particularly rich in information. First moment testing of climate models is an offshoot of the problem of climate signal detection and attribution, derived by putting that problem in its Bayesian context [Leroy, 1998]. Our intention is to explore the technique of first moment testing using profiles of atmospheric refractivity obtained by GPS radio occultation.
 The discipline of climate signal detection and attribution has as its specific goal estimating the probability with which observed changes in the climate over the past several decades can be attributed to specific external radiative forcings, particularly anthropogenic greenhouse gas increases. The data types used in these analyses are most frequently surface air temperature measurements by meteorological stations, upper air temperature measurements by radiosondes, and, more recently, microwave brightness temperatures as measured by the Microwave Sounding Units (MSU) on NOAA weather satellites. A series of studies using primarily surface air temperature measurements shows, with increasing certainty, that much of the warming observed over the past several decades is attributable to greenhouse gas increases [Hegerl et al., 2000; Tett et al., 1999, 2002; Stott et al., 2000a, 2000b, 2001]. Radiosonde and MSU measurements of upper air temperature are not well suited to climate signal detection and attribution studies because their calibration properties are questionable. Multiple attempts to calibrate radiosonde data in postprocessing have yielded significantly different results for upper air temperature trends [Seidel et al., 2004]. Two different but not wholly independent efforts to calibrate the MSU temperature record now yield results consistent with each other [Mears and Wentz, 2005].
 Climate benchmarks have three properties: they are traceable to international standards of measurement, they adequately sample the climate in space and time so that they truly represent what they are meant to represent, and they are independently determined. One of several measurement techniques that satisfy these criteria is radio occultation using the Global Positioning System (GPS) [Leroy et al., 2006]. The observed quantity of a radio occultation, one that is independent of observation geometry, is the profile of atmospheric microwave refractive index, n. The difference between n and unity is unambiguously proportional to atmospheric density in the troposphere and stratosphere with a contribution from water vapor significant only in the lower troposphere. With the absolute positioning information of GPS and the hydrostatic equation, it is possible to measure directly geopotential heights of constant pressure surfaces with near absolute accuracy [Kursinski et al., 1997, 2000; Leroy, 1997]. Trends in geopotential heights come about because of thermal expansion of the troposphere, a consequence of warming of the troposphere on global scales. With the deployment of COSMIC, GPS radio occultation should become the standard in climate benchmarking and climate monitoring [Rocken et al., 2000].
 We will explore how climate models can be tested, via first moment analysis, using synthetic GPS radio occultation data. The radiative forcing of climate over the next century is a major source of uncertainty in climate prediction, but we focus instead on how accurately and precisely climate models respond to a prescribed forcing. (Climate benchmarking of radiative forcing agents is also necessary.) The ensemble of climate model runs done for the Fourth Scientific Assessment of the IPCC (IPCC AR4) can be examined for the range of uncertainty in climate model response to a prescribed forcing, there being 21 contributing climate models. Because our intention is to estimate the accuracy with which GPS occultation must observe microwave refractive index and the amount of time needed before a trend emerges with which climate models can be tested, we use SRES A1B, a moderate estimate of future radiative forcing by greenhouse gases. With the SRES A1B output of several of the IPCC AR4 climate models, we simulate GPS radio occultation observables, and, using optimal fingerprinting, we determine how long before climate signals can be detected by GPS radio occultation. Thorough climate model testing first involves determining whether the most robust elements of predicted decadal climate change are borne out in data. Subsequent climate model testing involves rating models against each other according to their differences using data sufficient to distinguish between them.
 In this, the first section, we introduce the background to our work. In the second we survey decadal trend estimates of near-surface air temperature, upper air temperature, and geopotential height to gain an overview of the similarities and differences in climate model response to prescribed forcing. In the third we discuss how climate signals emerge in atmospheric microwave refractive index profiles. In the fourth we estimate how long it takes for the anthropogenic climate signal to emerge in GPS radio occultation data using optimal fingerprinting techniques, the IPCC AR4 models to simulate the patterns, and two models to simulate natural variability. In the fifth we discuss our results and summarize our findings.
2. Climate Models' Responses to SRES A1B
 In order to survey climate models' response to a fixed but realistic forcing in the upcoming decades, we choose the SRES A1B greenhouse gas forcing scenario. SRES A1B is characterized by rapid population and economic growth for the next 50 years with energy generated from a “balanced” mixture of fossil fuel and alternative sources [Intergovernmental Panel on Climate Change, 2001]. It is characterized by 1% yr−1 growth in carbon dioxide and stabilization after doubling. While the IPCC AR4 ensemble lists 21 contributing models, at the time of this analysis only 15 had contributed SRES A1B runs. Of those, three contained flaws that precluded their use in this analysis, leaving 12 climate models for us to compare. They are listed in Table 1 and represent a broad range of atmospheric gridding and numerical integration schemes. The MIROC 3.2 (medres) model is a medium resolution version of the Japanese Earth Simulator model.
These are the climate models of the IPCC AR4 used in this study. We give the names of the institutes responsible for each model, the horizontal resolution, the total number of vertical levels, number of levels in the troposphere, and the number of levels in the planetary boundary layer. When the horizontal resolution reads “Txx,” a spectral transform method is used. In these cases, the equivalent horizontal resolution at the equator is given in parentheses. Otherwise, a grid point scheme is indicated by the format “x × y” where x is the spacing in longitude and y is the spacing in latitude. For the MIROC model, the “(medres)” indicates that the output of their medium resolution model was used. (Information was obtained from the IPCC AR4 website.)
Model; Institute; Horizontal resolution
GFDL-CM2.0 and GFDL-CM2.1; U.S. Department of Commerce, NOAA, Geophysical Fluid Dynamics Laboratory; 2.5° × 2.0°
GISS-AOM; NASA Goddard Institute for Space Studies; 4.0° × 3.0°
GISS-EH and GISS-ER; NASA Goddard Institute for Space Studies; 5.0° × 4.0°
INM-CM3.0; Institute for Numerical Mathematics, Russia; 5.0° × 4.0°
IPSL-CM4; Institut Pierre Simon Laplace; 3.75° × 2.5°
MIROC 3.2 (medres); Center for Climate System Research, University of Tokyo, National Institute for Environmental Studies, and Frontier Research Center for Global Change (JAMSTEC)
ECHAM5/MPI-OM; Max Planck Institute for Meteorology, Hamburg
MRI-CGCM2.3.2; Meteorological Research Institute, Japan
CCSM3; National Center for Atmospheric Research
PCM; National Center for Atmospheric Research, U.S. Department of Energy
UKMO-HadCM3; Met Office, Hadley Centre for Climate Prediction and Research; 3.75° × 2.5°
 The most basic and relevant geophysical quantity to begin the comparison is global average surface air temperature. In Figure 1 we show the global annual average surface air temperature from years 2000 through 2100. Over this time frame, the various models predict globally averaged surface air temperature increases between 1.8 and 3.2 K. Roughly half the models show nearly linear (but noisy) trends, but the other half show distinct changes in growth rate between the beginning and the end of this 100 year period. Since our aim is to detect climate signals in the briefest period of time possible, we estimate trends in the climate system only over the first 50 years of the SRES A1B runs. If global surface air temperature is a good indicator of linearity in trends, then over this 50-year period we can safely assume that the trends in atmospheric variables are all linear.
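The trend estimate over the first 50 years can be illustrated with an ordinary least-squares fit. The following is a minimal sketch on a made-up temperature series; the series, its 0.25 K decade−1 slope, and the function name `linear_trend` are assumptions for illustration and are not model output.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope of values against years (units per year)."""
    n = len(years)
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    var = sum((t - mean_t) ** 2 for t in years)
    return cov / var

# Hypothetical global annual mean series for 2000-2049:
# a 0.025 K per year warming plus a small alternating wiggle standing in for noise.
years = list(range(2000, 2050))
temps = [287.0 + 0.025 * (y - 2000) + (0.05 if y % 2 else -0.05) for y in years]

# Convert the fitted slope from K per year to K per decade.
trend_per_decade = 10.0 * linear_trend(years, temps)
print(f"fitted trend: {trend_per_decade:.3f} K per decade")
```

Applied to each grid point instead of the global mean, the same fit would give trend maps of the kind the text describes for Figure 2.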
 While the claim can be made that climate models agree in their responses better than they did in the Third Assessment Report of the IPCC [Houghton et al., 2001], climate model response on regional scales remains highly uncertain. In Figure 2 we show maps of surface air temperature trends as determined by linear regression. On average, high latitudes show larger surface air temperature trends than the tropics, but even on continental scales climate models do not agree on the sign of surface air temperature trends. In some regions the differences are startling. For example, Siberia shows temperature trends ranging from −1 to +1 K decade−1 depending on the climate model. Substantial disagreement also exists for temperature trends in the immediate vicinity of the Pacific intertropical convergence zone (ITCZ).
 The differences in predicted patterns of surface air warming have severe consequences for climate signal detection and attribution when surface air temperature is the data set used. Recent works in signal detection distinguish anthropogenic greenhouse warming from natural variability under the assumption that the pattern of warming is known a priori while the integrated global warming is not. Figure 2 calls this assumption into question. Model predictions of surface air temperature trends depend not only on trends in the dynamical structure of the atmosphere but also on regional effects which are highly uncertain. For climate signal detection and attribution and climate model testing, it is important to find quantities more directly related to changes in the dynamical structure of the atmosphere that are clear in a climate benchmark data type such as GPS radio occultation.
 Upper air temperature reveals more about atmospheric structure. Figure 3 shows upper air temperature trends for 12 models. Features common to all models are readily apparent. Maximum warming occurs in the tropical upper troposphere, consistent with the tropospheric temperature structure remaining close to a moist adiabat [Manabe et al., 1965]. Maintenance of a moist adiabat produces a maximum in temperature trends at 200 hPa because latent heat per unit mass in the boundary layer, which is chiefly responsible for convective heating near the detrainment level at 200 hPa, increases much more rapidly than sensible heat per unit mass in the boundary layer. Also, the stratosphere shows strong cooling, consistent with the known radiative properties of carbon dioxide in the stratosphere. This pattern is well known and has been used in climate signal detection and attribution studies [Tett et al., 2002] even though it is not useful for evaluating the uncertainties in climate models' tropospheric predictive capabilities. Some but not all models also show near surface air warming in the Arctic. It is not clear whether this effect is caused by increasing poleward heat transport in the North Atlantic, by atmospheric eddies, by a stronger ice-albedo feedback, or by some combination of these, but in any case the effect is a regional one.
 Strongly related to upper air temperature trends but containing more information on dynamical structures in the atmosphere are geopotential height trends. The IPCC AR4 models produced geopotential heights on constant pressure surfaces in addition to temperature fields. In Figure 4 we show the trends in geopotential height as produced by the IPCC AR4 models subject to SRES A1B greenhouse forcing. Geopotential height is related to temperature through the hydrostatic equation:
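$$h(p) = h_s + \frac{R}{\mu g_0} \int_{p}^{p_s} T(p')\, d\ln p' \tag{1}$$
where h(p) is the geopotential height at pressure p, hs the surface geopotential height, ps the surface pressure, R the ideal gas constant, T(p′) the temperature profile, μ the mean molecular mass, and g0 the WMO standard of gravity (9.80665 m s−2). Thus the time rate of change of geopotential heights on constant pressure surfaces is
$$\frac{\partial h(p)}{\partial t} = \frac{R\, T_s}{\mu g_0} \frac{\partial \ln p_s}{\partial t} + \frac{R}{\mu g_0} \int_{p}^{p_s} \frac{\partial T(p')}{\partial t}\, d\ln p' \tag{2}$$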
in which Ts is the surface air temperature. Except for horizontal redistribution of atmospheric mass, trends in geopotential heights are the integral of trends in temperature in log-pressure from the surface upward. This is evident when comparing Figures 3 and 4. What stands out immediately is thermal expansion of the troposphere. The dominant feature in each panel of Figure 4 is the tropical maximum centered at about 70 hPa, which marks the level of maximum thermal expansion of the atmosphere below it. At 70 hPa in the tropics, models predict anywhere from 19 to 46 m decade−1 of expansion of the underlying atmosphere. Also noteworthy in the plots of Figure 4 are the anomalous Antarctic stratospheric trends in just two of the models: CCSM3 and PCM.
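The statement that height trends are the log-pressure integral of temperature trends can be checked numerically. The sketch below is illustrative only: the uniform 0.2 K decade−1 warming profile, the dry-air molar mass, the default surface temperature, and the function name `height_trend` are all assumptions, and the integral is approximated with the trapezoid rule.

```python
import math

R = 8.314462618   # J mol^-1 K^-1, molar gas constant
MU = 0.0289644    # kg mol^-1, mean molar mass of air (assumed dry-air value)
G0 = 9.80665      # m s^-2, standard gravity

def height_trend(p_levels, dT_dt, T_s=288.0, dlnps_dt=0.0):
    """Trend in geopotential height of the last (lowest-pressure) level.

    p_levels: pressures in hPa, ordered from the surface (first) to the target level (last).
    dT_dt:    temperature trend at each level, in K per decade.
    dlnps_dt: trend in log surface pressure per decade (the surface term).
    Returns metres per decade.
    """
    integral = 0.0
    for k in range(len(p_levels) - 1):
        dlnp = math.log(p_levels[k] / p_levels[k + 1])  # positive going upward
        integral += 0.5 * (dT_dt[k] + dT_dt[k + 1]) * dlnp
    surface_term = R * T_s / (MU * G0) * dlnps_dt
    return surface_term + R / (MU * G0) * integral

# Hypothetical profile: uniform warming of 0.2 K per decade between 1000 and 200 hPa.
levels = [1000.0, 850.0, 700.0, 500.0, 300.0, 200.0]
warming = [0.2] * len(levels)
trend_200 = height_trend(levels, warming)
print(f"200-hPa height trend: {trend_200:.2f} m per decade")
```

For this made-up profile the 200-hPa surface rises by roughly 9 m per decade; a profile with amplified warming aloft would give a larger value.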
 In Figure 5 we show the zonal average trends of geopotential height of the 200-hPa surface and the contribution from the surface term, first on the right of equation (2). All models show very nearly the same initial zonal average geopotential height at 200 hPa, but the trends differ dramatically, and those differences do not correlate with the surface term. All models, as noted previously, show thermal expansion of the tropical troposphere. Outside the tropics, the predicted trends in the surface pressure and the 200-hPa height fields contain no obvious common patterns. We understand this to represent the near complete uncertainty in the prediction of dynamical trends in the extratropics. In the Southern Hemisphere, that uncertainty appears tied to the uncertainty in the sensitivity of the tropics: those models which are more sensitive in the tropics show lesser increases in geopotential heights in southern high latitudes. No such clear correlation exists in northern high latitudes.
 The sensitivity of a climate model is generally couched in terms of the rate of global average surface air warming [Houghton et al., 2001]. Figures 3 and 4 suggest that zonally averaged upper air heights might be better indicators of bulk atmospheric response to greenhouse gas forcing. It is customary to describe global warming as a trend in global average surface air temperature despite the influence of surface-air interaction processes and the sampling difficulties encountered in measuring it. The global thickness of the troposphere, or height of the 200-hPa surface, is directly measurable from space by GPS radio occultation. For this reason, we describe global warming in terms of global tropospheric thickness. (For conversion, 60 m of tropospheric thickness increase corresponds to 1 K of tropical surface air warming.)
3. Climate Signals in Microwave Refractive Index
 The microwave index of refraction n is related to microwave refractivity N and pressure, temperature, and water vapor through
$$N = (n - 1) \times 10^{6} = a\,\frac{p}{T} + b\,\frac{p_W}{T^{2}} \tag{3}$$
wherein p is pressure, T is temperature, pW is partial pressure of water vapor, and a = 77.6 K hPa−1 and b = 373 × 10³ K² hPa−1 are empirically determined constants. The natural independent coordinate for refractivity is geopotential height rather than pressure because pressure cannot be unambiguously determined from the measurement whereas geopotential height can [Leroy, 1997]. The first term on the right of equation (3) is the “dry” component, proportional to density, and is about 300 near the surface. The second term is the “wet” component, proportional to specific humidity divided by temperature, and is about 60 near the surface. The second term falls off more rapidly with height than does the first term and consequently is only significant in the lowest few kilometers of the atmosphere, predominantly in the tropics.
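As a quick check of the magnitudes quoted for the dry and wet terms, the refractivity relation can be evaluated for near-surface conditions. This is a sketch only: the surface pressure, temperature, and water vapor partial pressure below are assumed round numbers, and `refractivity` is a hypothetical helper name.

```python
A = 77.6     # K hPa^-1, dry-term constant from the text
B = 3.73e5   # K^2 hPa^-1, wet-term constant from the text (373 x 10^3)

def refractivity(p_hpa, temp_k, pw_hpa):
    """Return (dry, wet) contributions to microwave refractivity N, in N-units."""
    dry = A * p_hpa / temp_k
    wet = B * pw_hpa / temp_k ** 2
    return dry, wet

# Assumed near-surface conditions: 1013 hPa, 288 K, 13 hPa of water vapor.
dry, wet = refractivity(1013.0, 288.0, 13.0)
n = 1.0 + (dry + wet) * 1e-6   # refractive index from N = (n - 1) * 10^6
print(f"dry = {dry:.1f}, wet = {wet:.1f}, n = {n:.6f}")
```

With these assumed inputs the dry term is a few hundred N-units and the wet term several tens, in line with the rough values given in the text.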
 Integrated refractivity, or “dry pressure,” is an ideal product derived from refractivity for monitoring global change [Leroy and North, 2000]. The dry pressure, pN, can be computed from refractivity by
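$$p_N(h) = \frac{\mu_d\, g_0}{a R} \int_{h}^{\infty} N(h')\, dh' \tag{4}$$
where μd is the molecular mass of dry air. Under the approximation that the mean molecular mass of wet air does not vary substantially from the molecular mass of dry air, the integrated refractivity, or dry pressure, is related to temperature, pressure, and humidity through
$$p_N(h) = p(h) + \frac{b\,\mu_d}{a\,\mu_w} \int_{0}^{p(h)} \frac{q}{T}\, dp' \tag{5}$$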
where μw is the molecular mass of water vapor and q is specific humidity. Dry pressure is pressure plus a water vapor term which contributes substantially only in the tropical lower troposphere.
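The claim that dry pressure reduces to pressure when water vapor is absent can be verified numerically. The sketch below integrates the dry refractivity of a hypothetical isothermal, dry atmosphere downward from 120 km and compares the result with the true pressure; the 250 K temperature, 1000 hPa surface pressure, 50 m grid, and the helper name `dry_pressure` are assumptions.

```python
import math

R = 8.314462618    # J mol^-1 K^-1, molar gas constant
MU_D = 0.0289644   # kg mol^-1, molar mass of dry air (assumed value)
G0 = 9.80665       # m s^-2, standard gravity
A = 77.6           # K hPa^-1, dry refractivity constant

def dry_pressure(heights_m, refractivity):
    """Integrate N from the top of the grid downward (trapezoid rule).

    Returns the dry pressure p_N in hPa at each height, taking p_N = 0 at the top.
    """
    factor = MU_D * G0 / (A * R)   # hPa per metre per N-unit
    p_n = [0.0] * len(heights_m)
    for k in range(len(heights_m) - 2, -1, -1):
        dh = heights_m[k + 1] - heights_m[k]
        p_n[k] = p_n[k + 1] + factor * 0.5 * (refractivity[k] + refractivity[k + 1]) * dh
    return p_n

# Hypothetical isothermal, dry atmosphere.
T0 = 250.0                       # K
P0 = 1000.0                      # hPa at the surface
H = R * T0 / (MU_D * G0)         # scale height in metres
heights = [50.0 * k for k in range(2401)]            # 0 to 120 km
pressure = [P0 * math.exp(-h / H) for h in heights]
N_dry = [A * p / T0 for p in pressure]               # dry term only

p_N = dry_pressure(heights, N_dry)
print(f"at 10 km: p = {pressure[200]:.3f} hPa, p_N = {p_N[200]:.3f} hPa")
```

Adding a wet term to `N_dry` in the lowest few kilometers would make `p_N` exceed `pressure` there, which is the water vapor contribution described in the text.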
 Above the lower troposphere, trends in log-dry pressure, nearly the same as atmospheric pressure, are strongly related to trends in the height of constant pressure surfaces. Simple manipulation of the hydrostatic equation yields
$$\left(\frac{\partial \ln p_N}{\partial t}\right)_{h} \approx \left(\frac{\partial \ln p}{\partial t}\right)_{h} = \frac{1}{H} \left(\frac{\partial h}{\partial t}\right)_{p} \tag{6}$$
with H the local scale height. Trends in log-dry pressure of air above the lower troposphere are directly proportional to thermal expansion of the atmosphere beneath it. Figure 6 shows the trends in log-dry pressure. They mirror the trends in geopotential height (Figure 4) as expected except in the lowest 4 km of the ITCZ, which show the influence of water vapor trends. The most sensitive models show maximum trends of 0.7% decade−1 near 20 km height, and the least sensitive show maximum trends of 0.3% decade−1. These numbers suggest that, if a climate signal is to be detectable in radio occultation over a time baseline of 10 years, then dry pressure should be measured with an accuracy of better than 0.1%. Since most of the contribution to dry pressure comes from one scale height above the height of interest, the accuracy requirement on microwave refractivity is better than 0.1% on vertical scales of a scale height.
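The quoted range of log-dry pressure trends can be related to the 19 to 46 m decade−1 height trends at 70 hPa by dividing by the local scale height. The sketch below does that arithmetic; the 215 K lower-stratospheric temperature used to compute the scale height, the dry-air molar mass, and the function name are assumptions.

```python
R = 8.314462618   # J mol^-1 K^-1, molar gas constant
MU = 0.0289644    # kg mol^-1, molar mass of air (assumed dry-air value)
G0 = 9.80665      # m s^-2, standard gravity

def log_pressure_trend_percent(height_trend_m, temperature_k):
    """Convert a height trend (m per decade) on a pressure surface to a
    log-pressure trend (% per decade) at fixed height, using trend / H."""
    scale_height = R * temperature_k / (MU * G0)   # metres
    return 100.0 * height_trend_m / scale_height

# Height trends at 70 hPa quoted in the text; 215 K is an assumed local temperature.
low = log_pressure_trend_percent(19.0, 215.0)
high = log_pressure_trend_percent(46.0, 215.0)
print(f"log-dry pressure trend: {low:.2f}% to {high:.2f}% per decade")
```

Under these assumptions the result is close to the 0.3% to 0.7% per decade range read from Figure 6.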
4. Detection of Climate Signals
 We seek to find the least amount of time which must elapse before a climate signal emerges in radio occultation data and the components of the emerging signal that are the strongest and most reliable indicators of climate change. The method we choose to employ is optimal fingerprinting [Hasselmann, 1993; North et al., 1995]. See Leroy and North [2000] for its generalization to arbitrary geometries.
 Optimal fingerprinting takes as its assumptions that the pattern of climate change is known but its overall scaling is not and that the uncertainty of detection is dominated by interannual natural variability of the climate which can be safely approximated by control runs of climate models. When actual data are used, the assumptions are known to break down at some sufficiently small spatial scale or low variance component of natural variability, so statistical tests are used to truncate the implementation of optimal fingerprinting [Allen and Tett, 1999]. From our survey of the differences between climate model forecasts of climate change, though, there is little justification for the assumption that the pattern of climate change can be known any more reliably than its scaling factor a priori. When undertaking a study such as this without data, statistical consistency checks are no longer possible, so the remaining alternative is subjective interpretation of the signals we wish to detect and the components of natural variability we wish to retain. That is the approach we take.
 We seek to find a signal s, known to an arbitrary scale factor α, in the difference d between data sets taken after a time interval Δt. In optimal fingerprinting, the optimal detection of α is given by
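$$\alpha = \frac{s^{T} N^{-1} d}{s^{T} N^{-1} s} \tag{7}$$
wherein N is the covariance of the data vector d due only to the influence of natural variability, typically simulated in a control run of a climate model. The uncertainty σ in the determination of α is
$$\sigma = \left(s^{T} N^{-1} s\right)^{-1/2} \tag{8}$$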
We assume the observed data are exactly those predicted by the trend rates $\dot{s}$ produced in the previous section: $d = s = \dot{s}\,\Delta t$. Under this assumption, α = 1 and the signal-to-noise ratio of detection (α/σ = SNR) is
$$\mathrm{SNR} = \left(\dot{s}^{T} N^{-1} \dot{s}\right)^{1/2} \Delta t \tag{9}$$
or the amount of time for a 1-sigma detection of climate change is $(\dot{s}^{T} N^{-1} \dot{s})^{-1/2}$. A two-sigma detection will take twice as long if the natural variability N remains the same over longer timescales.
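A toy calculation may help make the detection-time relation concrete. The sketch below uses a two-element data vector with a made-up covariance matrix and trend vector; none of the numbers come from the models, and the helper names are hypothetical.

```python
import math

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad_form(u, m_inv, v):
    """Compute u^T M^-1 v for two-element vectors."""
    return sum(u[i] * m_inv[i][j] * v[j] for i in range(2) for j in range(2))

# Hypothetical natural-variability covariance and signal trend rate (per year).
N = [[4.0, 1.0], [1.0, 2.0]]
s_dot = [1.0, 0.5]
N_inv = inverse_2x2(N)

def detect(d, s):
    """Optimal amplitude estimate alpha and its uncertainty sigma."""
    alpha = quad_form(s, N_inv, d) / quad_form(s, N_inv, s)
    sigma = 1.0 / math.sqrt(quad_form(s, N_inv, s))
    return alpha, sigma

# Idealized case from the text: the data equal the predicted signal after dt years.
dt = 10.0
s = [x * dt for x in s_dot]
alpha, sigma = detect(s, s)
snr = alpha / sigma

one_sigma_years = 1.0 / math.sqrt(quad_form(s_dot, N_inv, s_dot))
two_sigma_years = 2.0 * one_sigma_years
print(f"alpha = {alpha:.2f}, SNR after {dt:.0f} yr = {snr:.2f}")
print(f"two-sigma detection time = {two_sigma_years:.2f} yr")
```

Because the signal-to-noise ratio grows linearly with the elapsed time, the SNR after `dt` years equals `dt / one_sigma_years`.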
 In optimal fingerprinting, inversion of the natural variability covariance matrix N is complicated by the limited sampling used to construct the matrix. Efforts to identify modes of the climate system by EOF decomposition must determine whether an EOF is nondegenerate given limited sampling [North et al., 1982]. Such a determination is unimportant in optimal fingerprinting insofar as it only requires EOF decomposition for the sake of calculating a matrix inverse. What remains important is that the eigenvalues of the EOF decomposition are not so small as to be indistinguishable from zero and make $N^{-1}$ indeterminate. In evaluating $N^{-1}$, we retain EOFs according to the principle that the variance contributed by the rejected EOFs does not exceed the sampling error of North et al. [1982].
 We compute the interannual natural variability in zonal average annual average log-dry pressure using preindustrial control runs of the GFDL-CM2.0, ECHAM5/MPI-OM, UKMO-HadCM3 and MIROC 3.2 (medres) models. Using the IPCC AR4 present-day control runs would have been more appropriate, but those runs were not made available in as timely a manner as the preindustrial control runs. For each model the first EOF indicates a general expansion of the tropical troposphere and moistening of the ITCZ. Simple linear regression onto the 200-hPa height field produces a pattern like an El Niño/Southern Oscillation (ENSO) pattern as produced by GCMs [Spencer and Slingo, 2003], meaning that EOF 1 is the ENSO mode. Associating the remaining EOFs with previously described modes of the atmosphere is complicated by the fact that the domain used to define our EOFs is unusual. In defining the Northern Annular Mode (NAM) and Southern Annular Mode (SAM) [Thompson and Wallace, 1998, 2000], surface pressure from 20° latitude to the pole constituted the domain. The NAM and the SAM are characterized by a net migration of atmospheric mass from the polar region, poleward migration of the midlatitude eddy-driven jet, and intensification of the circumpolar vortex. Our EOF 2 of every model of Figure 7 shows the polar mass decrease and the poleward jet migration associated with the SAM. EOF 2 of GFDL-CM2.0 in Figure 7 has a correlation of 92% with the SAM on interannual timescales. Likewise, EOF 3 of GFDL-CM2.0 bears the signature of north polar atmospheric mass decrease and the poleward migration of the eddy-driven jet associated with the NAM and has a correlation of 77% with the NAM in GFDL-CM2.0 on interannual timescales. (The EOF most closely associated with the NAM in the UKMO-HadCM3 model is the fifth.)
 Every model exhibits an EOF which bears the signature of broad based poleward migration of the eddy-driven jets symmetrically between the hemispheres coupled neither to an atmospheric mass deficit in the polar regions nor to a strengthening of the circumpolar vortices (EOF 4 in GFDL-CM2.0 and MIROC 3.2 (medres); EOF 3 in ECHAM5/MPI-OM and UKMO-HadCM3). Jet migration associated with the NAM and SAM has a full width at half maximum of ∼15° in its height anomalies whereas this new symmetric jet migration EOF shows a full width at half maximum in its height anomalies of ∼30°. While the NAM and SAM exhibit stronger gradients associated with the eddy-driven jet migration, the new symmetric EOF shows a stronger geopotential anomaly. If the symmetric (poleward) motion of the eddy-driven jet which characterizes this EOF were the result of random motion of the northern and southern jets and a lack of correlation between them, then both a symmetric migration EOF and an antisymmetric migration EOF ought to be present and they ought to be degenerate. In the case of each of the models used in the analysis of Figure 7, an antisymmetric jet migration EOF does exist and is consistently less significant than the symmetric jet migration EOF. The probability of this occurring should the motions of the eddy-driven jets be uncorrelated is just one in eight. Nevertheless, the separation in their eigenvalues is ≈1.5% of the total variance, so the limited sampling of the IPCC AR4 control runs prohibits our concluding that this symmetric motion of the eddy-driven jet is a mode of the atmosphere. We call this the symmetric jet migration EOF. In Figure 8 we show the horizontal structure of the symmetric jet migration by regressing the 200-hPa height field onto the EOF's principal component time series.
 We apply the optimal detection methodology using natural variability as simulated by GFDL-CM2.0 and ECHAM5/MPI-OM. Since this is a sensitivity study for climate signal detection and model testing and we are not using data, the EOF truncation of Allen and Tett [1999] cannot be used. Instead, we examine the signal detection problem component by component, interpreting each according to physical phenomena. Equation (9) can be rewritten as
$$\mathrm{SNR}^{2} = (\Delta t)^{2} \sum_{\mu} \frac{\langle e_\mu, \dot{s} \rangle^{2}}{\lambda_\mu} \tag{10}$$
in which $e_\mu$ and $\lambda_\mu$ are the μth EOF and eigenvalue of natural variability, $\dot{s}$ is the signal trend rate, and ⟨·,·⟩ is the inner product. The inner product is computed by $\langle u, v \rangle = \sum_i u_i v_i w_{i,i}$, where w is the diagonal matrix of weights used in the EOF equation $N w e_\mu = \lambda_\mu e_\mu$. The signal-to-noise ratio of detection is dominated by those modes which project strongly onto the signal relative to the noise associated with those modes. It is these modes which enable signal detection in the least amount of time.
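The mode-by-mode view of the signal-to-noise ratio can be illustrated with a small made-up example. In the sketch below, the weights, the three EOFs (orthonormal under the weighted inner product), the eigenvalues, and the signal trend are all hypothetical values chosen so the arithmetic is easy to follow.

```python
# Hypothetical diagonal weights, EOFs, eigenvalues, and signal trend rate.
weights = [0.5, 1.0, 0.5]
eofs = [
    [1.0, 0.0, 1.0],    # EOF 1, largest natural variability
    [0.0, 1.0, 0.0],    # EOF 2
    [1.0, 0.0, -1.0],   # EOF 3, smallest natural variability
]
eigenvalues = [4.0, 1.0, 0.25]
s_dot = [2.0, 1.0, 1.0]   # signal trend per year (arbitrary units)

def inner(u, v):
    """Weighted inner product <u, v> = sum_i u_i v_i w_ii."""
    return sum(ui * vi * wi for ui, vi, wi in zip(u, v, weights))

# Projection of the signal trend onto each EOF, and each mode's share of SNR^2 per year^2.
projections = [inner(e, s_dot) for e in eofs]
contributions = [p * p / lam for p, lam in zip(projections, eigenvalues)]

snr_per_year = sum(contributions) ** 0.5
two_sigma_years = 2.0 / snr_per_year

for k, (p, lam, c) in enumerate(zip(projections, eigenvalues, contributions), start=1):
    print(f"EOF {k}: projection^2 = {p * p:.4f}, eigenvalue = {lam:.2f}, contribution = {c:.4f}")
print(f"two-sigma detection time = {two_sigma_years:.2f} yr")
```

In this toy case the third EOF has the smallest projection but also the smallest eigenvalue, so it contributes as much to detection as the second; the first EOF projects most strongly yet contributes least because its natural variability is large.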
 In Figure 9 we show spectra of square projections $\langle e_\mu, \dot{s} \rangle^{2}$ of the signal trend $\dot{s}$ onto each EOF and the natural variability eigenvalues $\lambda_\mu$ broken down by EOF. EOFs are sorted according to descending eigenvalue. In such plots, the greater a square projection is in comparison to its corresponding eigenvalue, the more that component of the signal will contribute to detection. When using an ensemble of different models to simulate the signal to be detected, we also gain insight into the certainty with which the pattern of change can be prescribed. When all models tend to cluster their squared projections for a single EOF, the component of the signal described by that EOF can be considered reliable. When the squared projections for a single EOF tend to scatter, that component of the signal can be considered uncertain and does not represent a prediction common to all models. In fact, the latter represent the cumulative uncertainties in forecasting climate change. In this case, the sensitivity mode and the symmetric jet migration EOF are certain patterns of climate change. Recall that the symmetric jet migration EOF is the fourth EOF in GFDL-CM2.0 prescribed variability, the third in ECHAM5/MPI-OM prescribed variability, the third in UKMO-HadCM3 prescribed variability, and the fourth in MIROC 3.2 (medres) prescribed variability. The trends in the spaces of the EOFs closely associated with the NAM and the SAM are uncertain: models disagree on the trends in these modes over the next half century. It so happens that these modes do not contribute significant signal-to-noise ratios in detection, so including them in optimal detection will have little impact. Because we understand the first four modes and because most modes beyond the first four show scatter in the squared projections, we truncate the detection at six EOFs, which captures variability associated with the NAM, the SAM, the symmetric jet migration EOF, and antisymmetric jet migration.
 With real data, the signal detection is accomplished by multiplying the trend data d by a set of coefficients c so that the detected signal amplitude α is
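$$\alpha = \langle c, d \rangle \tag{11}$$
The trend data d is the difference between two radio occultation data sets separated by time Δt, the expected signal pattern is $s = \dot{s}\,\Delta t$, the eigenvectors (EOFs) and eigenvalues of natural interannual variability are $e_\mu$ and $\lambda_\mu$, and the optimal set of coefficients is
$$c = \left[\sum_{\mu=1}^{m} \frac{\langle e_\mu, s \rangle^{2}}{\lambda_\mu}\right]^{-1} \sum_{\mu=1}^{m} \frac{\langle e_\mu, s \rangle}{\lambda_\mu}\, e_\mu \tag{12}$$
with the truncation m = 6. These coefficients c together form the optimal fingerprint.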
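A truncated fingerprint can be sketched on the same kind of toy problem. The weights, EOFs, eigenvalues, and signal below are hypothetical, the truncation is set to m = 2 rather than 6 because the example has only three modes, and the weighted inner product is assumed to be the one applied to the coefficients and the data.

```python
# Hypothetical diagonal weights, EOFs (orthonormal under the weights), and eigenvalues.
weights = [0.5, 1.0, 0.5]
eofs = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, -1.0]]
eigenvalues = [4.0, 1.0, 0.25]
signal = [2.0, 1.0, 1.0]   # expected signal pattern s (arbitrary units)

def inner(u, v):
    """Weighted inner product <u, v> = sum_i u_i v_i w_ii."""
    return sum(ui * vi * wi for ui, vi, wi in zip(u, v, weights))

def fingerprint(s, m):
    """Fingerprint coefficients c built from the leading m EOFs."""
    proj = [inner(eofs[k], s) for k in range(m)]
    norm = sum(p * p / eigenvalues[k] for k, p in enumerate(proj))
    c = [0.0] * len(s)
    for k, p in enumerate(proj):
        for i in range(len(s)):
            c[i] += p / eigenvalues[k] * eofs[k][i] / norm
    return c

c_truncated = fingerprint(signal, m=2)
c_full = fingerprint(signal, m=3)

# If the data match the expected signal exactly, the detected amplitude is one.
alpha_truncated = inner(c_truncated, signal)
alpha_full = inner(c_full, signal)
print("m = 2 fingerprint:", [round(x, 4) for x in c_truncated])
print("m = 3 fingerprint:", [round(x, 4) for x in c_full])
```

In this toy case the recovered amplitude is one at either truncation, but the coefficient pattern changes markedly, including its sign at one point, when the low-variance third mode is admitted.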
 In Figures 10 and 11 we show the optimal fingerprints for detecting climate change given four prescriptions of interannual natural variability and 12 prescriptions for the signal shape expected to emerge in log-dry pressure as determined by GPS radio occultation. The most prominent aspect of these fingerprints common to all models and variability prescriptions is the pattern of high coefficients in the optimal fingerprints centered at 45°N and S and at 14 km height with a meridional width of 30°. The large weighting pattern extends nearly vertically upward and downward. This is the most prominent feature of the symmetric jet migration EOF. On the other hand, there is minimal weighting in the tropics where most of the warming due to greenhouse gas increases is expected to occur. In testing climate models' predictions for upper air trends, the first test which will obtain a result with strong confidence in the face of naturally occurring interannual variability is symmetric poleward motion of the eddy-driven midlatitude jets. Not only is the midlatitude jet stream expected to migrate poleward in response to greenhouse gas forcing, but its natural meridional migration is also infrequent. Net mass migration from polar regions is less predictable in that models show less agreement in such trends and more interannual variability is associated with polar atmospheric mass.
 Another prominent feature of the optimal fingerprints is the set of negative coefficients in the lowest ∼4 km of the ITCZ. Seemingly, the fingerprint searches for a negative trend in ITCZ water vapor content, which would not be optimal. These negative coefficients are associated with the positive coefficients at 45° latitude; both are features of the jet migration mode, the mode which contributes most to the signal-to-noise ratio of detection. If the truncation took more EOFs into consideration (a larger m), at some point the fingerprint coefficients would become positive in the lowest 4 km of the ITCZ. With more restrictive truncations, the detection is made less optimal, and the negative coefficients in the lowest 4 km of the ITCZ are an example of this phenomenon.
 Two-sigma detection times are summarized in Table 2. (A two-sigma detection connotes a 95% probability that a signal is the result of greenhouse gas forcing rather than a natural fluctuation of the climate.) The range of two-sigma detection times is 7 to 13 years. Even though the models used to prescribe natural variability have different sensitivities, the detection times are nearly independent of which model is used to prescribe natural variability. ECHAM5/MPI-OM is more sensitive because its ENSO mode is associated with more variability; however, the symmetric jet migration EOFs have similar amounts of variance associated with them. Since the symmetric jet migration EOF dominates the optimal detection, the detection times are very similar. Those models which show stronger trends in tropospheric expansion have shorter two-sigma detection times. In general, a 95% confidence test will be possible after 10 m of tropospheric expansion.
Table 2. Two-sigma detection times for each of the twelve models under four prescriptions of interannual natural variability. Each row is for a different model prescription of the log-dry pressure signal; the second through fifth columns are for the different prescriptions for natural interannual variability of dry pressure. The sixth column gives the thermal expansion rate of the troposphere, defined as the global average trend in geopotential height of the 200-hPa surface. SRES A1B forcing is assumed: 1% yr−1 carbon dioxide growth.
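How a two-sigma detection time can follow from a trend pattern and a prescription of natural variability may be sketched as below. This is an illustration under stated assumptions, not the computation behind Table 2: the signal is taken to grow linearly in time, the eigenvalues are taken to describe the variance of the trend data d independently of Δt, and the function name and all numbers are hypothetical.

```python
import numpy as np

def detection_time(eofs, eigenvalues, trend_per_year, m, n_sigma=2.0):
    """Time (years) for the signal-to-noise ratio to reach n_sigma.

    Assumes the signal is s = trend_per_year * dt, so that
    SNR(dt) = dt * sqrt(sum over mu of <e_mu, trend>^2 / lambda_mu).
    """
    e = eofs[:, :m]
    lam = np.asarray(eigenvalues, dtype=float)[:m]
    proj = e.T @ trend_per_year                  # <e_mu, trend>
    snr_per_year = np.sqrt(np.sum(proj ** 2 / lam))
    return n_sigma / snr_per_year

# Toy basis: four EOFs given by unit vectors in a 5-point space.
eofs = np.eye(5)[:, :4]
eigenvalues = np.array([8.0, 4.0, 3.0, 1.0])

# Hypothetical trend lying entirely along the fourth EOF.
trend = 0.2 * eofs[:, 3]

t_two_sigma = detection_time(eofs, eigenvalues, trend, m=4)
print(f"two-sigma detection time: {t_two_sigma:.1f} years")
```

Under these assumptions the detection time depends on the variability prescription only through the eigenvalues of the modes onto which the trend projects, which is in line with the text's remark that similar variance in the symmetric jet migration EOF leads to similar detection times.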
5. Summary and Discussion
 Surface air temperature is not a strong candidate for testing climate models' common prediction for atmospheric change because its geographic pattern is highly uncertain. Many past climate signal detection and attribution studies utilized the surface air temperature record of the past ∼120 years. The many models of the IPCC AR4 effort which contributed SRES A1B runs show that the spatial pattern of surface air warming is highly uncertain. In regions such as Siberia, temperature trend estimates over the coming decades range from −1 K decade−1 to +1 K decade−1. Climate model testing studies must utilize other current or future data types with more certain patterns for climate signal detection and attribution.
 Even though optimal detection studies have been performed with postprocessed historical radiosonde temperatures, the radiosonde trends do not produce a pattern of warming like any exhibited by the IPCC AR4 SRES A1B models [cf. Tett et al., 2002]. Although the comparison of SRES A1B model runs to 20th century trends may not be valid, there is no evidence that the tropical troposphere behaves like anything other than a moist adiabat in any of the simulations. Radiosonde temperature trends do not unambiguously maintain a moist adiabat: their temperature trends are too small in the tropical upper troposphere with respect to lower tropospheric temperatures to be physically credible. Given also that multiple analyses of the historical radiosonde record produce different upper tropospheric temperature trends [Seidel et al., 2004], there is good reason to look toward data sets other than radiosonde data sets to detect upper air temperature trends. We instead estimate what to look for in radio occultation data and how long to wait in order to test climate models' common prediction for upper air temperature trends in the coming decades.
 The first four EOFs in both interannual geopotential height variability and log-dry pressure, a product easily derivable from GPS radio occultation data, can be explained as previously known atmospheric phenomena. The first EOF as simulated by the GFDL-CM2.0, ECHAM5/MPI-OM, UKMO-HadCM3, and MIROC 3.2 (medres) models is ENSO and is characterized by dramatic warming of the tropical troposphere, especially in the Pacific basin. The second EOF is most closely associated with the Southern Annular Mode. The third EOF of GFDL-CM2.0 and MIROC 3.2 (medres) and the fourth EOF of ECHAM5/MPI-OM is the Northern Annular Mode. The fourth EOF of GFDL-CM2.0 and the third of ECHAM5/MPI-OM is the symmetric jet migration EOF. It is characterized by poleward wander of the eddy-driven midlatitude jets in the Northern and Southern hemispheres, and the coupling is statistically significant against the hypothesis of independent jet migration in the two hemispheres. While the phenomenon has been identified separately in the Northern and Southern hemispheres as the first EOF of zonal mean zonal winds [Lorenz and Hartmann, 2001, 2003], we have found that the migration is actually coupled between hemispheres.
 A diagnostic analysis of optimal climate change detection using log-dry pressure shows that the features common to all models' predictions for 21st century climate change can be tested with 95% confidence in 7 to 13 years and that the strongest indicator of climate change is poleward motion of the midlatitude jet. Because we have assumed 1% yr−1 carbon dioxide growth, slower carbon dioxide increases would increase detection times in inverse proportion. While the symmetric jet migration EOF is one of the top four modes of interannual variability of log-dry pressure (and geopotential height), its associated variance is comparatively small with respect to the projection of the global warming signal onto this mode. As a consequence, poleward migration of the eddy-driven midlatitude jets is the most robust prediction of climate change in the tropospheric upper air.
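As a worked illustration of the inverse proportionality stated above (the 0.5% yr−1 rate is a hypothetical value chosen only for the arithmetic), if the detection time scales as the inverse of the carbon dioxide growth rate r, then

$$t_{2\sigma}(r) = t_{2\sigma}(1\%\ \mathrm{yr}^{-1}) \times \frac{1\%\ \mathrm{yr}^{-1}}{r},$$

so that at r = 0.5% yr−1 the 7 to 13 year range would become 14 to 26 years.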
 It may already be possible to test climate models using the difference between the GPS radio occultation missions GPS/MET (1995–1997) [Ware et al., 1996] and CHAMP (2001–present) [Wickert et al., 2001]. The GPS/MET mission was a proof of concept for the technique of GPS radio occultation. It proved successful, but because of technical limitations at the time, it collected small numbers of accurate soundings in just four three-week periods between 1995 and 1997. The CHAMP GPS radio occultation mission has been collecting significantly more occultation data, nearly continuously, since July 2001. Depending on whether GPS/MET collected enough data to form sufficiently accurate snapshots of the atmosphere, the existing archive of GPS radio occultation data may suffice to test climate models according to their common predictions.
 Rigorous testing of climate models is still possible on timescales shorter than 7 to 13 years. We have restricted this sensitivity study to GPS radio occultation, 95% confidence levels, and first moment testing. Other benchmarks besides GPS radio occultation are available, for example high spectral resolution infrared interferometry [Dykema and Anderson, 2006]. Joining multiple climate benchmarks together in the statistical evidence function might increase the sensitivity of detection and reduce detection times. Finally, as was mentioned in the introduction, second moment testing of climate models remains as an alternative to first moment testing, and it might provide an authoritative testing regimen for climate models in a shorter period of time.
 We acknowledge Brian Farrell and Richard Goody for their advice. We also acknowledge the international modeling groups for providing their data for analysis, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) for collecting and archiving the model data, the JSC/CLIVAR Working Group on Coupled Modelling (WGCM) and their Coupled Model Intercomparison Project (CMIP) and Climate Simulation Panel for organizing the model data analysis activity, and the IPCC WG1 TSU for technical support. The IPCC Data Archive at Lawrence Livermore National Laboratory is supported by the Office of Science, U.S. Department of Energy. This work was supported by grant ATM-0450288 of the National Science Foundation.