E-mail: steven.mauget@ars.usda.gov
Time series analysis based on running Mann-Whitney Z Statistics
Article first published online: 22 AUG 2010
DOI: 10.1111/j.1467-9892.2010.00683.x
Published 2010. This article is a US Government work and is in the public domain in the USA.
Additional Information
How to Cite
Mauget, S. (2011), Time series analysis based on running Mann-Whitney Z Statistics. Journal of Time Series Analysis, 32: 47–53. doi: 10.1111/j.1467-9892.2010.00683.x
Publication History
- Issue published online: 12 DEC 2010
- Article first published online: 22 AUG 2010
- First version received November 2009 Published online in Wiley Online Library: 22 August 2010
Keywords:
- Mann–Whitney U statistics;
- moving window method;
- ranking based analysis;
- non-parametric statistics
Abstract
A time series analysis method based on the calculation of Mann–Whitney U statistics is described. This method samples data rankings over running time windows, converts those samples to Mann–Whitney U statistics, and then normalizes the U statistics to Z statistics using Monte-Carlo generated null parameters. Based on the Z statistics’ magnitudes, this algorithm can identify time windows containing significant incidences of low or high data rankings, where the window length is determined by the sample size. By repeating this process with sampling windows of varying duration, ranking regimes of arbitrary onset and duration can be objectively identified in a time series. The simplicity of the procedure's output – a time series’ most significant non-overlapping ranking sequences – makes it possible to graphically identify common temporal breakpoints and patterns of variability in the analyses of multiple time series. This approach is demonstrated using United States annual temperature data during 1896–2008.
1. Introduction
A general principle of exploratory data analysis (EDA) is that strong statistical methods make few assumptions about the behaviour or distribution of data (Tukey, 1977; Hoaglin et al., 1983). However, many time series analysis methods are based on assumptions regarding how data vary over time. Economic (Kitov and Kitov, 2008), climate (Lins and Slack, 1999; Groisman et al., 2005; Trenberth et al., 2007) and epidemiological (Baxter et al., 1998; Hall et al., 2003) data have been subjected to linear trend analysis, but such analysis may not be suitable when data clearly depart from a simple trend during the time over which trends are fitted. Although data variability may be abruptly transitional or quasi-cyclic, trend analysis can interpret nonlinear behaviour as a simple, if unrepresentative, linear trend. Fourier analysis (Bloomfield, 2000) can detect cyclic variation over time, but assumes that it occurs somewhat continuously over the entire period of data record. Windowed Fourier (Harris, 1978), wavelet (Lau and Weng, 1995; Percival and Walden, 2000) and chirplet analyses (Mihovilovic and Bracewell, 1991) detect intermittent cyclic variation by projecting harmonic transforms onto the data over moving time windows, but still assume that data vary in a cyclic manner. Even so, moving window methods are able to detect a wide range of data behaviour over time. The scheme described here extends the generality of these methods by making a less limiting assumption about how data variability occurs; specifically, that such variation consists of simple non-cyclic ranking regimes that might exist over a range of time scales and have arbitrary onset times. This approach is similar to the change point detection procedures of Siegel and Castellan (1988) and Lanzante (1996) in that it samples a time series’ rankings and converts those samples to Mann–Whitney Z (MWZ) statistics, but differs in other respects.
While their emphasis was on detecting single or multiple regime shifts, the goal of the moving window method described here is to identify the most significant high and low ranking regimes in a time series. This method's overall approach reflects the resistance, re-expression and revelation themes of EDA as described by Hoaglin et al. (1983). Because it is based on the analysis of running samples of rankings, it is resistant to the presence of outliers and to assumptions of distribution. The method re-expresses those samples as MWZ statistics to test for significant concentrations of high or low rankings at a fixed sample size, and also to compare the significance of samples of varying size. Finally, graphic techniques are used to reveal coherent spatial patterns of significant ranking regimes over time. The result is a method that provides a new and robust way to conduct time series analysis.
2. The running MWZ Method
The basic test statistic in the running MWZ method is the Mann–Whitney U statistic, which is typically described as a non-parametric test for difference in location between two data samples (Wilcoxon, 1945; Mann and Whitney, 1947; Conover, 1971; Mendenhall et al., 1990). In such tests, the data from both samples are pooled, and the pooled data are then ranked. The rankings associated with the two samples are then used to form U statistics, which can be used to determine significant differences in the overall magnitude of the two distributions. However, in this demonstration, U statistics will be used to detect the most significant intra- to multi-decadal (6–30 year) sequences of high or low data rankings in centennial-length temperature time series.
This approach will be described using United States climate division temperature data during 1896–2008. Although these data are normally defined at monthly time resolution over 344 climate divisions by the U.S. National Climatic Data Center (Guttman and Quayle, 1996), the version used here was further averaged into a 102 division data set by the National Oceanic and Atmospheric Administration's Climate Prediction Center. These 102 regions (Figure 1) provide a more spatially uniform division of continental United States area and a more concise temperature data set. The monthly temperature values for each division during January 1896 to December 2008 were calculated as the average of monthly mean temperatures measured at meteorological stations within the division.
Assuming a calendar year averaging period, 113 years of annual temperature means (atmp) can be derived for each of the 102 climate divisions in Figure 1 during 1896–2008. Using those yearly averages and the areas of the corresponding climate division, a spatial average of national temperature (NTMP) can be calculated for each year.
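The area-weighted national average described above can be sketched as follows. This is a minimal illustration, not the author's code: the function name `national_mean`, the array layout and the toy numbers are assumptions made for the example.

```python
import numpy as np

def national_mean(atmp, areas):
    """Area-weighted average across divisions for each year.

    atmp  : array of shape (n_years, n_divisions), annual division means
    areas : array of shape (n_divisions,), division areas (any consistent unit)
    """
    atmp = np.asarray(atmp, dtype=float)
    areas = np.asarray(areas, dtype=float)
    # Weight each division by its share of the total area
    return atmp @ areas / areas.sum()

# Toy input: 2 years, 3 divisions (made-up values, not real temperature data)
atmp = np.array([[50.0, 60.0, 70.0],
                 [52.0, 61.0, 69.0]])
areas = np.array([1.0, 1.0, 2.0])
ntmp = national_mean(atmp, areas)
```

With the real data, `atmp` would have 113 rows and 102 columns, and `ntmp` would be the 113-value NTMP series.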
The first step in running the MWZ algorithm (Figure 2a) is to rank the values in the 113 year NTMP time series (Figure 3a). Ties are resolved by averaging the ranks that would have been assigned to the tied NTMP values, and assigning that average rank to each value (Conover, 1971; Mendenhall et al., 1990). Thus if the two lowest values were tied, they would both be assigned a rank of 1.5, and the next highest value would be assigned a rank of 3.0. Those rankings are then sampled over moving time windows and converted to Mann–Whitney U statistics. For example, the NTMP rankings might be divided into a class consisting of a sequence of years being tested (class I, of, for example, size nI = 10) and the rankings of the remaining years (class II, of size nII = 113 − nI). In practice, expressions such as that found in Conover (1971) can be used to calculate the U statistic of the class I 10-year sample window,
$$U_I = R_I - \frac{n_I (n_I + 1)}{2}$$
where RI is the class I sample's rank sum,
$$R_I = \sum_{i=1}^{n_I} \mathrm{Rank}\,I_i$$
and Rank Ii is the rank of the ith member of class I. Alternatively, the UI statistic is equal to the total number of non-sample data values that precede each sample value when all data values are arranged by rank (Hollander and Wolfe, 1999). That is,
$$U_I = \sum_{i=1}^{n_I} \sum_{j=1}^{n_{II}} \varphi(\mathrm{Rank}\,I_i, \mathrm{Rank}\,II_j)$$
where Rank IIj is the rank of the jth member of class II and φ(Rank Ii,Rank IIj) = 1 if Rank Ii > Rank IIj, 0 otherwise. Thus, the maximum UI statistic in this example would occur when the sample accounts for the 10 highest rankings (UI = 103 × 10) in the NTMP series, while the smallest statistic would result when it accounts for the 10 lowest (UI = 0 × 10). More general sampling outcomes result in U statistics that are proportional to the incidence of high rankings in the sample, but bounded by those extreme values. If the 113!/(103! 10!) possible sampling outcomes of class I rankings are equally probable, the resulting distribution of UI statistics is approximately Gaussian with a mean equal to the average of the maximum and minimum values [e.g. μ = 1/2 × nI × nII = 0.5 × (103 × 10)]. For moderately large sample sizes (nI and nII ≥ 10) the distribution's standard deviation can be approximated by the expression (Mendenhall et al., 1990),
$$\sigma = \sqrt{\frac{n_I\, n_{II}\,(n_I + n_{II} + 1)}{12}} \qquad (5)$$
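As a rough illustration of the ranking and U statistic calculation described above, the following sketch computes average ranks for ties and then the U statistic of one sample window in two ways: from the window's rank sum, and by direct counting. The function names are hypothetical. In the counting version, a tied sample/non-sample pair is counted as 1/2; that convention is an assumption added here so the two forms agree when average ranks are used, and it is not stated in the text.

```python
import numpy as np

def midranks(x):
    """1-based ranks, with tied values sharing the average of their ranks."""
    x = np.asarray(x, dtype=float)
    s = np.sort(x)
    lo = np.searchsorted(s, x, side="left")   # count of values strictly smaller
    hi = np.searchsorted(s, x, side="right")  # count of values smaller or equal
    return (lo + hi + 1) / 2.0

def window_u(values, start, n_i):
    """U statistic of the window values[start:start+n_i] via its rank sum."""
    ranks = midranks(values)
    r_i = ranks[start:start + n_i].sum()
    return r_i - n_i * (n_i + 1) / 2.0

def window_u_by_counting(values, start, n_i):
    """Same statistic by counting non-sample values ranked below each sample value."""
    ranks = midranks(values)
    in_window = np.zeros(len(ranks), dtype=bool)
    in_window[start:start + n_i] = True
    sample = ranks[in_window][:, None]
    others = ranks[~in_window][None, :]
    # Ties counted as 1/2 (assumption, see lead-in)
    return (sample > others).sum() + 0.5 * (sample == others).sum()

# A strictly increasing 113-value series: the last 10 values hold the 10 highest ranks
series = np.arange(113.0)
u_max = window_u(series, 103, 10)   # sample holds the 10 highest ranks
u_min = window_u(series, 0, 10)     # sample holds the 10 lowest ranks
```

For this synthetic series the two windows reproduce the bounds quoted in the text, 103 × 10 and 0.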
Figure 2. (a) Flow chart diagram for the running Mann–Whitney Z time series analysis algorithm. (b) Flow chart diagram for the generation of Mann–Whitney U statistic null distribution parameters. Roman numerals I–VI indicate the corresponding step in the Monte Carlo algorithm described in Section 2
Figure 3. (a) Time series of nationally averaged annual temperature over the continental U.S. during 1896–2008. Gray shaded years mark the 10 most highly ranked years. (b) Mann–Whitney Z statistics of ranked NTMP values sampled over running 10-year time windows. Horizontal lines indicate two-sided 95% (Z = ± 1.96), 99% (Z = ± 2.575) and 99.9% (Z = ± 3.29) confidence intervals. (c) As in (a) with horizontal extent of coloured bars showing significant 10-year cool and warm periods as indicated in (b). Vertical placement of bars show corresponding Z values as marked by right axis. Colour scheme on left axis shows positive and negative significance at 95%, 99% and 99.9% confidence levels. (d) Significant cool and warm periods indicated by running Mann–Whitney Z analyses with 6,7,…30 year sampling windows. (e) The most significant cool and warm periods in (d) occurring over non-overlapping time windows
These two null parameters can be used to Z-transform the Gaussian U statistics, with significantly high (low) Z values indicating a significant incidence of high (low) sample rankings relative to a null hypothesis that assumes random and independent sampling.
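The Z-transform just described amounts to Z = (UI − μ)/σ. A minimal sketch of running Z statistics for a fixed window length is given below, using the analytic null parameters (μ = nI nII/2 and the large-sample σ) rather than the Monte Carlo parameters the paper goes on to use. The function names are hypothetical.

```python
import numpy as np

def midranks(x):
    """1-based ranks, with tied values sharing the average of their ranks."""
    x = np.asarray(x, dtype=float)
    s = np.sort(x)
    lo = np.searchsorted(s, x, side="left")
    hi = np.searchsorted(s, x, side="right")
    return (lo + hi + 1) / 2.0

def running_mwz(values, n_i):
    """Z statistics for every window of length n_i, using analytic null parameters."""
    ranks = midranks(values)
    n = len(ranks)
    n_ii = n - n_i
    # Rank sums of all running windows from a cumulative sum
    csum = np.concatenate(([0.0], np.cumsum(ranks)))
    r_i = csum[n_i:] - csum[:-n_i]
    u = r_i - n_i * (n_i + 1) / 2.0
    mu = 0.5 * n_i * n_ii
    sigma = np.sqrt(n_i * n_ii * (n_i + n_ii + 1) / 12.0)
    return (u - mu) / sigma

# Synthetic, steadily increasing series of the same length as NTMP
z = running_mwz(np.arange(113.0), 10)
```

Element `z[k]` corresponds to the window that starts at index `k`, so a 113-value series with 10-value windows yields 104 statistics.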
For a specified sample size, the analytic solutions for μ and σ assume that rankings within a sample are serially independent. However, sampling outcomes consistent with persistent, or ‘red’, variation are more likely in a sequence of years sampled from a 113 year temperature time series (Thiebaux and Zwiers, 1984). Also, the large sample approximation for σ (eqn 5) places a lower limit on the length of time windows that might be considered. As a result, those null parameters were calculated here via Monte Carlo simulations consistent with a null hypothesis that does not assume independent sampling, but allows for year-to-year persistence. This hypothesis (H0) holds that the NTMP time series represents semi-random climate variation with inter-annual persistence, but is essentially stationary and trendless. Given the goal of detecting significant 6- to 30-year ranking regimes during 1896–2008, the hypothesis also holds that the time series contains no such regimes. The parameters of U null distributions consistent with H0 were derived here via the following autoregressive (AR) modeling and Monte Carlo protocol (Figure 2b):
- I As H0 assumes no low-frequency temperature regimes, the associated low-frequency variation in the NTMP record was derived via a low-pass Lanczos filter (Duchon, 1979) and then subtracted from the data. As the shortest sampling window considered here was 6 years, with a corresponding cyclic period of ∼12 years, this filter was assigned a half-power cutoff frequency of ν = 10⁻¹ years⁻¹.
- II Calculate AR(1), AR(2) and AR(3) regression coefficients from the autocorrelation values of the high-passed data resulting from (I), and select the AR model yielding the minimum Akaike Information Criterion score (Akaike, 1974).
- III From the results of step (II), form autoregressive red noise processes.
- IV Adjust the mean and variance of the red noise process resulting from step (III) and truncate the number of significant digits to agree with that of the data. Then, select red noise series with lengths equal to that of the time series being tested – in the case of Figure 3a’s NTMP series, 113 – and rank those values.
- V From the ranked noise processes resulting from step (IV) calculate appropriate null statistics, which in the current NTMP example would be UI statistics derived from non-overlapping 10 element segments of each red noise series.
- VI Repeat (III–V) until 10,000 independent null statistics are calculated, and determine the parameters of the resulting UI null statistics.
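Steps III–VI of the protocol above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the author's implementation: it assumes an AR(1) model with a given coefficient `phi` (the Lanczos filtering of step I and the AR order selection of step II are omitted), and the mean, standard deviation and coefficient used in the example call are made-up placeholders rather than values estimated from the NTMP data.

```python
import numpy as np

def midranks(x):
    """1-based ranks, with tied values sharing the average of their ranks."""
    x = np.asarray(x, dtype=float)
    s = np.sort(x)
    lo = np.searchsorted(s, x, side="left")
    hi = np.searchsorted(s, x, side="right")
    return (lo + hi + 1) / 2.0

def ar1_noise(n, phi, rng, burn=100):
    """AR(1) red noise of length n; the first `burn` values are discarded."""
    shocks = rng.standard_normal(n + burn)
    x = np.empty(n + burn)
    x[0] = shocks[0]
    for t in range(1, n + burn):
        x[t] = phi * x[t - 1] + shocks[t]
    return x[burn:]

def mc_null_params(n, n_i, phi, data_mean, data_std,
                   decimals=1, n_stats=10000, seed=0):
    """Monte Carlo mean and standard deviation of U for windows of length n_i."""
    rng = np.random.default_rng(seed)
    u_null = []
    while len(u_null) < n_stats:
        noise = ar1_noise(n, phi, rng)                      # step III
        noise = (noise - noise.mean()) / noise.std()
        # Step IV: match the data's mean, variance and numerical resolution
        series = np.round(data_mean + data_std * noise, decimals)
        ranks = midranks(series)
        # Step V: U statistics from non-overlapping n_i-element segments
        for start in range(0, n - n_i + 1, n_i):
            r_i = ranks[start:start + n_i].sum()
            u_null.append(r_i - n_i * (n_i + 1) / 2.0)
    u_null = np.array(u_null[:n_stats])                     # step VI
    return u_null.mean(), u_null.std(ddof=1)

# Placeholder inputs (hypothetical, not fitted to NTMP); fewer draws to keep it quick
mu_mc, sigma_mc = mc_null_params(113, 10, phi=0.3, data_mean=52.0,
                                 data_std=0.8, n_stats=2000)
```

In this sketch, a positive `phi` tends to widen the null distribution relative to the serially independent case, which is the motivation the text gives for the Monte Carlo parameters.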
Using the null distribution parameters derived from these Monte Carlo simulations (μMC, σMC), the Z statistics of rankings sampled from the NTMP time series can be used to test H0.
The MWZ statistics for running 10-year samples of ranked NTMP values can be found in Figure 3b. Thirty-seven cold periods are identified as significant at a 95% confidence level (Z < −1.96), with the most significant concentrations of low-ranked cold years occurring during 1911–1920 (Z = −5.033) and 1964–1973 (Z = −4.155). Twenty-two 10-year periods were significantly warm at a 95% confidence level. Seven of those warm decades occurred during the 1920s–1940s, with a period of peak warmth during 1930–1939 (Z = 2.957). The final decades of the 113-year temperature record were the most significantly warm, with the eight overlapping 10-year periods that end in the years 2001–2008 marked as positively significant at a 99.9% confidence level (Z > 3.29).
Figure 3b’s running Mann–Whitney Z statistics identify significant cool or warm periods occurring over any 10-year period during 1896–2008. To extend the test to a wider range of time scales, the algorithm repeats the calculation of running U statistics over increasing sampling window sizes. In this example, those time windows range in length between 6 and 30 years. The (μMC, σMC) parameter pairs were calculated independently for each of the 25 sample sizes in the course of the Monte Carlo simulations, given the dependence of those parameters on sample size. The use of these parameter pairs to normalize U statistics into Z statistics allows for significance testing of a particular window size, and also allows for comparing the significance of Z statistics derived from different window sizes. After the running U statistics from each analysis are normalized by the appropriate null parameters, the positive and negative Z statistics from all 25 tests that exceed a two-sided 95% confidence threshold are combined (Figure 3d), and then screened for those periods resulting in the greatest significance over non-overlapping time windows (Figure 3e). In Figure 3e, the most significant negative Z statistic (−6.262) from all 25 analyses of NTMP rankings is found in the 19-year period 1902–1920. The period 1964–1979 was also notably cold nationally (Z = −5.362). The most significant positive Z statistic (7.878) in all the analyses, and the greatest magnitude statistic overall, results from the 1986–2007 temperature rankings. The magnitude of that Z statistic is strongly influenced by the fact that 6 of the 10 warmest years in Figure 3a’s 113-year NTMP record occurred during that 23-year period. Although Conover (1971) states that Mann–Whitney U statistics can be calculated from data with a moderate number of ties, these results suggest that they might be calculated from data with a fairly high percentage of ties. Because of their low numerical resolution (0.1°F), of the 113 NTMP values in Figure 3a only 12 are not tied. The remaining 101 values are tied in 22 groups containing as few as 2 and as many as 13 years.
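The text does not spell out how the combined Z statistics are screened, so the following is only one plausible sketch: a greedy pass that accepts candidate windows in order of decreasing |Z| and rejects any window overlapping one already accepted. Treating warm and cool windows as mutually exclusive is an assumption. The first six candidates use periods and Z values quoted in the text; the last one is a made-up non-significant window added to show the threshold.

```python
def select_non_overlapping(candidates, z_crit=1.96):
    """Greedy screening of (start_year, end_year, z) windows, years inclusive."""
    significant = [c for c in candidates if abs(c[2]) > z_crit]
    # Consider the most significant windows first
    significant.sort(key=lambda c: abs(c[2]), reverse=True)
    chosen = []
    for start, end, z in significant:
        # Keep the window only if it overlaps none of those already kept
        if all(end < s or start > e for s, e, _ in chosen):
            chosen.append((start, end, z))
    return sorted(chosen)

candidates = [
    (1902, 1920, -6.262),
    (1911, 1920, -5.033),
    (1964, 1979, -5.362),
    (1964, 1973, -4.155),
    (1930, 1939, 2.957),
    (1986, 2007, 7.878),
    (1945, 1950, 1.2),      # hypothetical, below the 95% threshold
]
regimes = select_non_overlapping(candidates)
```

Here the shorter 1911–1920 and 1964–1973 windows are dropped because they overlap more significant longer windows, leaving four regimes.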
3. Spatio-temporal analysis of U.S. temperature data
The results of the running MWZ method are relatively simple because it identifies only the most significant non-overlapping regimes of low and high rankings in a time series. If a shading scheme for negative and positive significance is chosen, those results can be plotted on a single horizontal axis, as the national cold and warm periods of Figure 3e have been plotted in Figure 4a. Similar horizontal plots resulting from the analysis of numerous time series of equal length can be combined on one plot. If those time series are formed from measurements at different locations, as in the annual streamflow analyses of Mauget (2003, 2004), the resulting plots represent a spatio-temporal analysis of the rankings of the data.
Figure 4. (a) Peak significant national warm and cool periods as marked in Figure 3e. Positive and negative significance is marked by the shading scheme at the top and in Figure 3b. (b) As in (a) for peak significant warm and cool periods shown by running Mann–Whitney Z analyses of the annual temperature time series for each of Figure 1’s 102 climate divisions. The vertical axis marks the corresponding climate division number. (c) Locations of the climate divisions corresponding to the vertical axis index number in (b)
Figure 4b is such a plot for the running MWZ analyses of the annual temperature (atmp) time series for each of Figure 1’s 102 climate divisions. As shown in Figure 4c, those divisions are identified with one of six colour-coded U.S. regions. The analyses of Figure 4b are arranged such that, as the climate division number increases from 1 to 102, the results of the southeast (green), northeast (beige), north-central (red), central (blue), interior-west (yellow) and western (violet) regions are plotted.
Diagrams like Figure 4b can expose common temporal breakpoints in the data. For example, in the interior-west (climate divisions 69–86) and western regions (divisions 87–102) 1986 marks a common beginning for late 20th century warm periods. By contrast, recent warm periods in the central (divisions 47–57), northeast (divisions 16–31), and southeast (divisions 1–15) regions commonly begin in 1998. In the southeast and northeast, an abrupt shift from warm to cool conditions is evident in the mid-1950s, with 1957 indicated as the end of warm regimes in a number of cases (e.g. divisions 3–7, 9, 10, 13–18) and 1958 as the common beginning of cool periods (e.g. divisions 2–12, 24–29). In western climate divisions, 1957 also marks the end of cool regimes in divisions 89, 91, 93, 94, 98, 99 and 101.
Although spatio-temporal plots like Figure 4b can reveal common shifts in data, their arrangement can also introduce artificial spatial break points. As noted earlier, east of the Rocky Mountains (divisions 1–67 in the southeast, northeast, north-central and central regions) 1998 marks the beginning of many late century warm periods. An exception is divisions 42–45 in the north-central region, which show warm periods beginning in 1986 that are similar to those in the interior-west and west. Figure 4c shows that divisions 42–45 border the interior-west and west regions, thus their appearance as outliers in Figure 4b can be traced to the arbitrary decisions made in ordering the figure's results. In future applications ordering might be defined more objectively, e.g. through an algorithm that clusters time series based on similarities in the timing and significance of their ranking regimes.
4. Summary
The running MWZ approach is a robust option in time series analysis. This method can detect a wide range of data variation because it makes relatively few limiting assumptions about how data vary over time. The basic assumption is that time series might contain significant ranking regimes, but a wide range of data variability can be expressed in terms of such regimes. For example, a simple linear trend over the series’ duration might be marked by significant low-rank periods at the series’ beginning, and significant high-rank periods at the end. Intermittent cyclic regimes might result in alternating high and low ranked periods. Thus by detecting what might be considered a basic ‘building block’ of data variation over time, i.e. ranking regimes of arbitrary onset and duration, the method can detect arbitrary patterns of how data vary over time. Although the method does not provide direct information about the magnitude of data fluctuations in time, it can isolate periods over which magnitudes can be calculated and compared. The method is resistant because it is based on a test of data rankings that is insensitive to whether a time series’ values contain outliers or are distributed non-normally. A test based on Mann–Whitney Z statistics is also objective in that it can identify runs of extreme rankings, but imposes no arbitrary thresholds that define extreme rankings. The simplicity of the method's results also allows for the graphic comparison of the analysis of many data records (e.g. Figure 4b), and the identification of common regimes and temporal breakpoints.
Acknowledgement
The author thanks the preliminary reviewers Pat Brown and John Zhang for their helpful comments, and also David Unger of the NOAA Climate Prediction Center. All figures were produced using Generic Mapping Tools (Wessel and Smith, 1995).
References
- Akaike (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19, 716–23.
- Baxter et al. (1998) Time trend analysis and variations in prescribing lipid lowering drugs in general practice. British Medical Journal 317, 1134–35.
- Bloomfield (2000) Fourier Analysis of Time Series: An Introduction, 2nd edn. New York: Wiley.
- Conover (1971) Practical Nonparametric Statistics, 1st edn. New York: Wiley.
- Duchon (1979) Lanczos filtering in one and two dimensions. Journal of Applied Meteorology 18, 1016–22.
- Groisman et al. (2005) Trends in intense precipitation in the climate record. Journal of Climate 18, 1326–50.
- Guttman and Quayle (1996) A historical perspective of U.S. climate divisions. Bulletin of the American Meteorological Society 77, 293–303.
- Hall et al. (2003) Association between antidepressant prescribing and suicide in Australia, 1991–2000: trend analysis. British Medical Journal 326, 1008–11.
- Harris (1978) On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE 66, 51–83.
- Hoaglin et al. (1983) Understanding Robust and Exploratory Data Analysis. New York: Wiley and Sons.
- Hollander and Wolfe (1999) Nonparametric Statistical Methods, 2nd edn. New York: Wiley and Sons.
- Kitov and Kitov (2008) Long-term linear trends in consumer price indices. Journal of Applied Economic Sciences 3, 101–12.
- Lanzante (1996) Resistant, robust and non-parametric techniques for the analysis of climate data. Theory and examples, including applications to historical radiosonde data. International Journal of Climatology 16, 1197–226.
- Lau and Weng (1995) Climate signal detection using wavelet transform: how to make a time series sing. Bulletin of the American Meteorological Society 76, 2391–402.
- Lins and Slack (1999) Streamflow trends in the United States. Geophysical Research Letters 26, 227–30.
- Mann and Whitney (1947) On a test of whether one of two random variables is stochastically larger than the other. The Annals of Mathematical Statistics 18, 50–60.
- Mauget (2003) Multidecadal regime shifts in U.S. streamflow, precipitation, and temperature at the end of the twentieth century. Journal of Climate 16, 3905–16.
- Mauget (2004) Low frequency streamflow regimes over the central United States: 1939–1998. Climatic Change 63, 121–44.
- Mendenhall et al. (1990) Mathematical Statistics with Applications, 4th edn. Boston: PWS-Kent.
- Mihovilovic and Bracewell (1991) Adaptive chirplet representation of signals in the time frequency plane. Electronics Letters 27, 1159–61.
- Percival and Walden (2000) Wavelet Methods for Time Series Analysis, 4th edn. Cambridge: Cambridge University Press.
- Siegel and Castellan (1988) Nonparametric Statistics for the Behavioural Sciences. New York: McGraw Hill.
- Thiebaux and Zwiers (1984) The interpretation and estimation of effective sample size. Journal of Climate and Applied Meteorology 23, 800–11.
- Trenberth et al. (2007) Observations: surface and atmospheric climate change. In Climate Change 2007: The Physical Science Basis. Working Group I Contribution to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change (eds S. Solomon, D. Qin and M. Manning). Cambridge: Cambridge University Press, 235–336.
- Tukey (1977) Exploratory Data Analysis. Reading, Massachusetts: Addison-Wesley.
- Wessel and Smith (1995) New version of the generic mapping tools released. EOS, Transactions of American Geophysical Union 76, 329.
- Wilcoxon (1945) Individual comparisons by ranking methods. Biometrics Bulletin 1, 80–3.







