Considerations when using pre-1979 NCEP/NCAR reanalyses in the southern hemisphere



[1] NCEP/NCAR reanalysis data are used widely in the atmospheric sciences. Although these data are generated using homogeneous techniques, the effect of changes in the atmospheric observing system is unavoidable. One prominent change is the introduction of satellite data in 1978, which matters particularly over the southern hemisphere where conventional observations are sparse. This paper attempts to quantify the extent to which the introduction of satellite data impacts both daily and inter-annual scales of variability, using a Self-Organizing Map (SOM) analysis technique. It is clear that daily circulation statistics are quite different before and after 1979, and are generally more typical of model climatology before 1979. Inter-annual variability also appears to be reduced before 1979 in the mid-latitudes of the southern hemisphere. These caveats need to be borne in mind when performing studies over the southern hemisphere pre-1979.

1. Introduction

[2] The NCEP/NCAR reanalysis project [Kalnay et al., 1996] has provided the meteorological community with a valuable dataset that has prompted a flourish of research projects investigating atmospheric variability. The open availability of these data has proven to be a great asset to the wider research community. Comprehensive documentation has also been provided in two papers [Kalnay et al., 1996; Kistler et al., 2001]. These highlight some of the difficulties that have been encountered with the reanalysis data and suggest appropriate use of the data in atmospheric studies.

[3] The research community has also performed many assessments of the data. These include, inter alia, an evaluation of the moisture and hydrological cycle [Trenberth and Guillemot, 1998], mean sea-level pressure in the South Pacific [Marshall and Harangozo, 2000] and the quality of reanalysis data in the tropics [Trenberth et al., 2001].

[4] Reanalysis datasets are designed to reduce inconsistencies in the assimilation of atmospheric observations into a numerical model, thus providing a near-homogeneous set of data. Unfortunately, the sparse observation network in the southern hemisphere, particularly before the advent of TIROS (Television Infrared Observation Satellite) Operational Vertical Sounder (TOVS) satellite data in 1978 [Kalnay et al., 1996], remains a problem. As shown in various papers, the introduction of satellite data in 1978 has produced a discontinuity in the data, particularly in the upper troposphere south of 50°S [Kistler et al., 2001; Sturaro, 2003] and in the stratosphere [Huesmann and Hitchman, 2003]. As a result, Kistler et al. [2001] suggest that climatologies should be constructed using the 1979-present period and that trend studies should be avoided. They consider the reanalysis data suitable for studies covering daily to seasonal and inter-annual time-scales. The effect of changes to the observing system on analyses at these time-scales, however, does require further exploration. This is becoming especially pertinent as more and more studies probe sub-seasonal variability and as reanalysis data are used to force regional climate models. Furthermore, the development of seasonal to inter-annual climate prediction [Goddard et al., 2001] may also be vulnerable to inconsistencies in the inter-annual variability of the reanalysis data.

[5] The objective of this letter is to investigate the effect of the introduction of satellite data in 1978 on daily and inter-annual variability of the reanalysis data in the southern hemisphere.

2. Methodology

[6] Daily 500 hPa geopotential height data (500Z) for December-January-February (DJF) and June-July-August (JJA) seasons from 1970 to 2002 were used in this study. 500Z data were chosen as they provide a useful indicator of synoptic variability of atmospheric column temperature and surface pressure patterns. Only data at 00Z analysis times were used, to exclude variability related to the diurnal cycle and to observations that may be taken only at certain synoptic times.
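The selection of 00Z fields by season can be illustrated with a minimal Python sketch. The function names and the synthetic 6-hourly time axis below are assumptions made purely for illustration; they do not describe the reanalysis archive format or the processing actually used in this study.

```python
from datetime import datetime, timedelta

def season_of(ts):
    """Return 'DJF', 'JJA' or None for a timestamp."""
    if ts.month in (12, 1, 2):
        return "DJF"
    if ts.month in (6, 7, 8):
        return "JJA"
    return None

def select_00z(timestamps, season):
    """Keep only 00Z analysis times that fall in the requested season."""
    return [ts for ts in timestamps if ts.hour == 0 and season_of(ts) == season]

# Hypothetical 6-hourly analysis times covering calendar year 1970.
start = datetime(1970, 1, 1, 0)
times = [start + timedelta(hours=6 * i) for i in range(4 * 365)]

djf = select_00z(times, "DJF")
jja = select_00z(times, "JJA")
```

Note that this sketch groups January, February and December of the same calendar year; whether December is assigned to the following year's DJF season is a choice the text does not specify.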

[7] The self-organizing map (SOM) algorithm was used to reduce the dimensionality of the daily data into a number of archetypal patterns. This technique is proving to be a useful tool in meteorological data analysis and visualization [Hewitson and Crane, 2002; Tennant, 2003]. In a nutshell, SOMs produce a two-dimensional array of archetypal patterns (nodes) from the input data. This technique differs from traditional clustering in that the nodes span the full continuum of the data space, and similar nodes are positioned near each other such that the variation in archetypal patterns follows a smooth transition across the SOM. The SOM also places a higher number of nodes in areas of increased variability. The user subjectively chooses the number of nodes, depending on the amount of generalization required. In addition, the array of nodes is preferably rectangular to ensure stability during the training process. This study is concerned with changes in archetypal patterns of daily circulation over relatively large areas (synoptically speaking) and so a SOM of 12 × 10 nodes was chosen to allow for this high level of variability. With 120 nodes, each day of a 90-day season could in principle map to its own archetypal pattern for the region.
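To make the description of the SOM concrete, the following is a minimal NumPy sketch of a generic online SOM training loop on a rectangular grid. It is not the software used in this study: the function name, the linear decay of the learning rate and neighbourhood radius, the Gaussian neighbourhood, the random initialisation and the synthetic input are all assumptions chosen for brevity.

```python
import numpy as np

def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a rectangular SOM on data of shape (n_samples, n_features).

    Returns node weights of shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = data.shape
    # Initialise every node from a randomly drawn input field.
    picks = rng.integers(0, n_samples, size=rows * cols)
    weights = data[picks].astype(float).reshape(rows, cols, n_features)
    # (row, col) coordinates of each node, used for neighbourhood distances.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius shrinks
        x = data[rng.integers(0, n_samples)]
        # Best-matching node: smallest Euclidean distance to the sample.
        dist = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dist), dist.shape)
        # A Gaussian neighbourhood pulls nearby nodes towards the sample,
        # which keeps similar patterns adjacent on the map.
        grid_d2 = ((grid - np.array(bmu)) ** 2).sum(axis=2)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, :, None] * (x - weights)
    return weights

# Synthetic stand-in for flattened daily 500 hPa height fields (not real data).
rng = np.random.default_rng(1)
fields = rng.normal(size=(300, 40))
som = train_som(fields, rows=12, cols=10, n_iter=500)
```

Each row of `fields` stands for one daily field flattened to a vector; in practice the number of features would equal the number of grid points in the study area. Which of the two SOM dimensions corresponds to 12 and which to 10 is not stated in the text, so the orientation here is arbitrary.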

[8] Once the SOM has been trained using the daily input data, each input field is then associated with a best-matching node. In this way a two-dimensional histogram of archetype frequencies can be built. If a particular period is chosen, e.g., pre-1979 or post-1979, any shift in the statistics of circulation archetype frequencies becomes clear. This approach is then used to determine the impact of satellite observations on the statistics of daily circulation fields in the southern hemisphere.
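The mapping of fields to best-matching nodes and the construction of period-specific frequency histograms can be sketched as follows. This is an illustrative sketch only: the helper names, the toy 2 × 2 SOM with a single feature, and the use of Euclidean distance as the matching criterion are assumptions, not details taken from the study.

```python
import numpy as np

def bmu_indices(data, weights):
    """Return (row, col) index arrays of the best-matching node per field."""
    rows, cols, n_features = weights.shape
    flat = weights.reshape(rows * cols, n_features)
    # Squared Euclidean distance between every field and every node.
    d2 = ((data[:, None, :] - flat[None, :, :]) ** 2).sum(axis=2)
    return np.unravel_index(d2.argmin(axis=1), (rows, cols))

def node_frequencies(data, weights):
    """Fraction of the given fields that map to each node."""
    r, c = bmu_indices(data, weights)
    counts = np.zeros(weights.shape[:2])
    np.add.at(counts, (r, c), 1)   # accumulate repeated indices correctly
    return counts / len(data)

# Toy 2 x 2 SOM with one feature per node, and four hypothetical "days".
weights = np.array([[[0.0], [1.0]], [[2.0], [3.0]]])
fields = np.array([[0.1], [0.9], [1.1], [2.9]])
years = np.array([1975, 1976, 1985, 1990])

pre = node_frequencies(fields[years < 1979], weights)
post = node_frequencies(fields[years >= 1979], weights)
shift = post - pre   # positive where a node is visited more often after 1979
```

For a full-size problem the distance array `d2` grows as days × nodes × grid points, so the fields would normally be processed in batches.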

[9] Two areas of the southern hemisphere over the South Pacific and South Atlantic Oceans from 20°S to 70°S (Figure 1) were considered because of the paucity of radiosonde observations in these areas. The effect of satellite data would be greatest here.

Figure 1.

Number of radiosonde observations per 2.5° × 2.5° grid-box per month for the period January 1970 to December 1997. Rectangular boxes denote area of study over the South Pacific and South Atlantic Oceans.

3. Results and Discussion

[10] The SOM based on the DJF months of the reanalysis data from 1970 to 2002 over the South Pacific Ocean shows archetypes of daily circulation that include zonal patterns of strong meridional gradient towards the lower part of the SOM, weakening and implying a more wavelike structure for the 500 hPa flow further up and ultimately closed cells in the upper corners of the SOM (Figure 2). The SOM nodes are plotted as the meridional gradient of 500 hPa heights to assist in the visualization of the data, as these fields can be used as a proxy for the 500 hPa geostrophic wind speed. The frequency of an aggregate of nodes (4 × 5) during the different decades is shaded in Figure 3. Numbers show the percentage shift in frequency of days mapping to those nodes during the specific decade, relative to the full period. The aggregation of nodes assists in determining more significant general shifts in circulation patterns.
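The link between the meridional height gradient and the geostrophic wind follows from the standard relation u_g = -(g/f) ∂Z/∂y, where f is the Coriolis parameter. A minimal sketch of both quantities on a latitude-longitude grid is given below. The function names, the constants and the idealised height field are assumptions for illustration, and the text does not say how the plotted gradients were actually computed.

```python
import numpy as np

EARTH_RADIUS = 6.371e6   # m
OMEGA = 7.292e-5         # Earth's rotation rate, rad s^-1
G0 = 9.80665             # standard gravity, m s^-2

def meridional_gradient(z, lat_deg):
    """dZ/dy for heights z[lat, lon] (m) on latitudes lat_deg (degrees)."""
    y = EARTH_RADIUS * np.deg2rad(lat_deg)   # northward distance, m
    return np.gradient(z, y, axis=0)

def geostrophic_u(z, lat_deg):
    """Zonal geostrophic wind u_g = -(g/f) dZ/dy in m s^-1."""
    f = 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))   # negative in the SH
    return -(G0 / f)[:, None] * meridional_gradient(z, lat_deg)

# Idealised field: heights rising linearly towards the equator (not real data).
lat = np.linspace(-70.0, -20.0, 21)
y = EARTH_RADIUS * np.deg2rad(lat)
z = 5000.0 + 2e-4 * (y - y[0])[:, None] * np.ones((1, 8))

grad = meridional_gradient(z, lat)
u_g = geostrophic_u(z, lat)   # westerly (positive) everywhere in this example
```

Because f is negative in the southern hemisphere, heights that increase towards the equator give westerly geostrophic flow, so a stronger meridional gradient corresponds to stronger westerlies.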

Figure 2.

Every other node of a 12 × 10-node SOM of daily 00Z 500 hPa height fields during DJF from 1970 to 1999, with the meridional gradient shaded from light (low values) to dark (high values). The area covers the South Pacific Ocean from 20°S to 70°S and 170°W to 70°W as shown in Figure 1.

Figure 3.

Change in frequencies of days mapping to the nodes of the 12 × 10 SOM (see Figure 2) shown as an average for an aggregate of nodes (4 × 5) for the 1970s (top left), 1980s (bottom left) and 1990s (bottom right) relative to the full period from 1970 to 1999. The data are for daily 500 hPa height data during DJF months over the South Pacific Ocean. Circulation archetypes typical of the indicated SOM aggregate areas are shown in the top right panel.
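The node aggregation used for this kind of display can be sketched as below. Two points are assumptions here rather than statements from the text: that "4 × 5" denotes the size of each block in nodes (the alternative reading, a 4 × 5 array of blocks, corresponds to a block shape of 3 × 2), and that the shift is expressed as a difference in percentage points between the block-averaged decade frequency and the block-averaged full-period frequency. The function names and the synthetic frequencies are hypothetical.

```python
import numpy as np

def aggregate_blocks(freq, block_rows, block_cols):
    """Average node frequencies over non-overlapping blocks of nodes."""
    rows, cols = freq.shape
    if rows % block_rows or cols % block_cols:
        raise ValueError("SOM shape must be divisible by the block shape")
    blocks = freq.reshape(
        rows // block_rows, block_rows, cols // block_cols, block_cols
    )
    return blocks.mean(axis=(1, 3))

def percentage_shift(freq_decade, freq_full, block_rows, block_cols):
    """Block-averaged decade frequency minus full-period frequency, in %."""
    agg_decade = aggregate_blocks(freq_decade, block_rows, block_cols)
    agg_full = aggregate_blocks(freq_full, block_rows, block_cols)
    return 100.0 * (agg_decade - agg_full)

# Synthetic frequencies on a 12 x 10 SOM (not results from the study):
# the full period is uniform, the "decade" maps only to one corner block.
freq_full = np.full((12, 10), 1.0 / 120.0)
freq_decade = np.zeros((12, 10))
freq_decade[:4, :5] = 1.0 / 20.0

shift = percentage_shift(freq_decade, freq_full, block_rows=4, block_cols=5)
```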

[11] It becomes clear that during the 1970s the 500Z fields in the reanalysis data are more characteristic of the weaker meridional gradients towards the upper corners of the SOM and less characteristic of the stronger gradients extending across the ocean basin, which are positioned in the lower part of the SOM. The frequency differences between the 1980s and 1990s, possibly related to inter-decadal variability, are smaller and more evenly spread across the SOM. During JJA the situation is similar, with fewer days of an active South Pacific Convergence Zone (SPCZ) during the 1970s (not shown).

[12] Archetype frequencies during DJF over the South Atlantic Ocean also show marked changes before and after 1979, but the patterns are somewhat different to those in the South Pacific Ocean. Here there are increased days of the most northward placed meridional gradients during the 1970s (top and bottom left corner of the SOM) (Figure 4). Additionally there are decreased days with a wave-like structure (bottom right corner). During JJA the reverse situation becomes apparent. There is a clear reduction in days with a strong meridional gradient in 500Z fields during the 1970s (Figure 5).

Figure 4.

As per Figure 3 but for a SOM derived for DJF months over the South Atlantic Ocean.

Figure 5.

As per Figure 3 but for a SOM derived for the JJA months over the South Atlantic Ocean.

[13] Storm tracks in the Southern Hemisphere are determined largely by meridional temperature gradients [Trenberth, 1991]. Without sufficient observations, these circulation features tend to be more characteristic of the reanalysis model climate than reality [Marshall and Harangozo, 2000]. The reanalysis biases in the South Atlantic Ocean are consistent with model climates determined from extended simulations of three different GCMs using observed SST fields [Tennant, 2003]. Summer westerly wind systems were generally simulated further north and more frequently along the westerly belt than observed, but during winter the deepest systems were often missed. In the South Pacific Ocean storm tracks are more variable than in the South Atlantic [Trenberth, 1991] owing to the complex interaction of systems in the SPCZ that lead to cyclogenesis [Jones and Simmonds, 1993]. This difference may explain why the circulation changes in reanalysis data are greater in the South Pacific than in the South Atlantic Ocean, as the reanalysis model climate may be more distinct from reality in areas of greater uncertainty such as this.

[14] The changes to the daily circulation archetype frequencies also have an impact on inter-annual variability. This is demonstrated in fields of vertically integrated meridional flux of dry static energy during DJF (Figure 6). The magnitudes here match those found by Michaud and Derome [1991] using ECMWF analyses. However, inter-annual variability of dry static energy transport, shown as a standard deviation including the extremes, south of 45°S during the 1970s in the NCEP/NCAR reanalysis is small compared to the 1980s and 1990s and other latitudes. This is consistent with constrained daily circulation characteristics.
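For reference, dry static energy is conventionally defined as s = c_p T + g z, and its zonally and vertically integrated meridional flux at a given latitude can be written as 2πa cos φ (1/g) ∫ [v s] dp, with square brackets denoting the zonal mean. The sketch below computes this quantity and an inter-annual standard deviation from it. It is an illustration under stated assumptions: the function names, the trapezoidal pressure integration, the use of the sample standard deviation (ddof=1) and the uniform synthetic fields are all hypothetical, and the text does not specify how the fluxes in this study were actually evaluated.

```python
import numpy as np

CP = 1004.0              # specific heat of dry air, J kg^-1 K^-1
G0 = 9.80665             # standard gravity, m s^-2
EARTH_RADIUS = 6.371e6   # m

def dse_flux(v, temp, z, p_pa, lat_deg):
    """Zonally and vertically integrated meridional flux of DSE, in W.

    v (m/s), temp (K), z (m) have shape (level, lat, lon);
    p_pa holds the pressure levels in Pa, in increasing order.
    """
    s = CP * temp + G0 * z                  # dry static energy, J kg^-1
    vs = (v * s).mean(axis=2)               # zonal mean, shape (level, lat)
    dp = np.diff(p_pa)[:, None]
    # Trapezoidal integral over pressure, divided by g: W per metre of latitude circle.
    column = (0.5 * (vs[1:] + vs[:-1]) * dp).sum(axis=0) / G0
    circumference = 2.0 * np.pi * EARTH_RADIUS * np.cos(np.deg2rad(lat_deg))
    return column * circumference

def interannual_std(seasonal_flux):
    """Sample standard deviation across years; input shape (year, lat)."""
    return np.std(seasonal_flux, axis=0, ddof=1)

# Uniform synthetic fields, used only to check units and magnitudes.
p = np.array([10000.0, 50000.0, 100000.0])
lat = np.array([-60.0, -45.0, -30.0])
shape = (p.size, lat.size, 4)
flux = dse_flux(np.ones(shape), np.full(shape, 250.0), np.full(shape, 5000.0), p, lat)
flux_pw = flux / 1e15   # expressed in units of 10^15 W
```

In practice `dse_flux` would be applied to each season's mean fields, and `interannual_std` to the resulting (year, latitude) array for each decade.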

Figure 6.

One standard deviation above and below the mean (shaded) and extremes (dotted) of the zonally- and vertically-integrated meridional flux of dry static energy during DJF for the 1970s (top), 1980s (middle) and 1990s (bottom), in units of 10^15 W.

4. Conclusions

[15] The scarcity of observation data over the Southern Ocean prior to 1979 has a potentially serious effect on any attempt to re-create atmospheric analyses for this period. The evidence presented suggests that the altered daily and inter-annual circulation statistics between the 1970s and the 1980s and 1990s in the reanalysis data appear to be largely due to changes in the observing system, and that this effect exceeds natural inter-decadal variability. The regions studied here are relatively large, and these potential problems may also affect many other parts of the Southern Ocean where observation data are sparse; the artificial surface pressure trends near Antarctica found by Hines et al. [2000] are one such example. Studies using the reanalysis data should be restricted to areas or periods where sufficient observations were made or where the weather systems affecting the area of interest are well captured by the reanalysis model.