An extended cumulative logit model for detecting a shift in frequencies of sky-cloudiness conditions

Authors

QiQi Lu (corresponding author)
Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, PO Box 843083, 1015 Floyd Ave., Richmond, VA 23284, USA (qlu2@vcu.edu)

Xiaolan L. Wang
Climate Research Division, Science and Technology Branch, Environment Canada, Toronto, Ontario, Canada

Abstract

[1] In Canada, sky-cloudiness (or cloud cover) condition is reported in terms of tenths of the sky dome covered by clouds and hence has 11 categories (0/10 for clear sky, 1/10 for one tenth of the sky dome covered by clouds, …, and 10/10 for overcast). The cloud cover data often contain temporal discontinuities (changepoints) and exhibit a large amount of observational uncertainty. Detecting changepoints in a sequence of continuous random variables has been extensively explored in both the statistics and climatology literature. However, changepoint analyses of multinomial sequence data with extra variability are relatively sparse. This study develops a likelihood ratio test for detecting a sudden change in the parameters of a cumulative logit model for a multinomial sequence. The extra-multinomial variation is accounted for by allowing an overdispersion parameter in the model fitting. Moreover, the empirical distribution of the estimated changepoint is approximated by a bootstrap method. An application of this new technique to real sky-cloudiness data in Canada is presented.

1. Introduction

[2] Clouds are among the most obvious and influential features of Earth's climate system. As stated in Trenberth et al. [2007], "Clouds play an important role in regulating the flow of radiation at the top of the atmosphere and at the Earth's surface. They are also integral to the atmospheric hydrological cycle via their integral influence on the balance between radiative and latent heating." Surface observations made at weather stations and onboard ships provide the longest available records of cloud cover changes. In Canada, sky-cloudiness (or cloud cover) condition is reported in terms of tenths of the sky dome covered by clouds and hence has 11 categories (0/10 for clear sky, 1/10 for one tenth of the sky dome covered by clouds, …, and 10/10 for overcast). These observations often contain temporal discontinuities due to changes in the observing location, environment, instrumentation, and observing practices/procedures over the period of data record. For example, Karl and Steurer [1990] questioned the homogeneity of cloudiness time series in North America. Therefore, it is of critical importance to test cloudiness data series for homogeneity prior to analyzing cloudiness trends and variations (and using them in other applications).

[3] A statistical changepoint analysis can quantitatively identify temporal sudden shifts in a climate data series. It can directly address the question of when the change most likely occurred. In an intuitive sense, a changepoint is a time point such that the observations follow one distribution up to that point and another distinct distribution thereafter. The distributions before and after the changepoint usually belong to the same parametric family but have different values of the parameter(s). The change times and causes are well documented in some cases but are unknown or undocumented in others due to incomplete and inaccurate metadata. In practice, one should use all available metadata along with appropriate statistical test(s) to detect all artificial changepoints (both undocumented and documented) and adjust the data series to diminish their effects.

[4] Changepoint detection in categorical data is an active research area. Hinkley and Hinkley [1970] and Fu and Curnow [1990] studied an abrupt change in a parameter of binomial data using maximum likelihood methods. A cumulative sum (CUSUM) type statistic was applied to binomial data by Pettitt [1980] and to multinomial data by Wolfe and Chen [1990]. Bayesian approaches were proposed by Smith [1975], Carlin et al. [1992], Qian et al. [2004], and Girón et al. [2005]. Braun et al. [2000] examined changepoints in a multinomial sequence within a quasilikelihood framework. Recently, the χ²max test was developed by Robbins et al. [2011] for multinomial and Poisson data. In most of these studies the means were assumed to be constant between adjacent changepoints, which is usually not the case for climate time series. Temporal trends in sky-cloudiness condition data are often observed in Canada [Milewska, 2004] and should not simply be ignored in changepoint analysis. Even if a good reference series is available, it is not clear how to apply the reference series to the target categorical data in order to remove the nonstationarity. Moreover, cloud cover data frequently exhibit a large amount of observational variation, and this extra variability (overdispersion) is rarely accounted for in changepoint analyses of multinomial data.

[5] The objective of this article is to develop a changepoint detection method for in-situ observations of sky-cloudiness conditions, using a cumulative logit regression model that accounts for the prominent statistical features of the data. These features include the non-Gaussian nature of count data, temporal trends, and the observational uncertainty present in cloudiness data.

[6] The rest of this article is organized as follows. Section 2 overviews the sky-cloudiness condition data in Canada that we study. Section 3 introduces the cumulative logit model, which provides a natural framework for ordered categorical data. A likelihood ratio test for detecting a changepoint in a sequence of overdispersed multinomial variables is developed in Section 4; the importance of estimating the extra variability and the detection power of the test are also addressed there. Section 5 approximates the distribution of the detected changepoint by a bootstrap method. An application of the test to sky-cloudiness observations at one station in Canada is presented in Section 6. Finally, several concluding remarks are given in Section 7.

2. Data

[7] Sky-cloudiness condition in Canada is reported hourly in 11 categories. Hourly observations for the 1953–2011 period can be extracted from the National Climate Archives of the Meteorological Service of Canada. The quality of these observations and the potential causes of inhomogeneity were carefully considered and discussed in Milewska [2004].

[8] Traditionally, sky conditions are visually estimated by trained human observers. Individual visual observations are subjective measures of cloud cover with large uncertainties that may vary from one observer to another. The variation between observers is considerable around the middle of the cloud-amount range (5/10) and disappears as the cloud amount approaches 0/10 or 10/10 [Galligan, 1953]. In particular, the quality of cloud observations can be greatly affected by bad weather, such as windy, foggy, or extremely cold conditions. Failure to incorporate the large observational uncertainty of cloud data in a changepoint detection method will significantly increase the false alarm rate.

[9] Around the mid-1990s, Automated Weather Observing Systems (AWOS) were widely introduced in Canada. Unlike human observations, AWOS reports are automatically derived from ceilometer data. An AWOS is used either to replace the human observer or as an observer's aid. Mixed human and AWOS observations are likely to be merged in the archives into one data record at some stations. This instrument change may induce inhomogeneities in the cloud cover data. Other changes in observation location and observational practice may also introduce changepoints into cloud cover series. Some, but not all, of these changes have been documented in the available metadata.

[10] Many studies have reported increasing or decreasing trends in cloud cover [Milewska, 2004; Karl and Steurer, 1990; Groisman et al., 2004]. Because of the subjective nature of the observations and potential inhomogeneities, long-term trend studies based on surface cloud records must be treated cautiously. On the other hand, the detection of changepoints is confounded with the presence of temporal trends in cloud cover series. Therefore, trend components need to be considered in changepoint detection methods. In this paper, we include a distinct trend component for each cumulative category in our model for changepoint analysis of cloudiness frequency data. This is necessary because, if a trend exists in the data, at least two of the 11 categories must have trends of opposite signs. For instance, if the occurrence of the overcast (10/10) condition at a location has become more frequent, at least one of the other 10 categories must have occurred less frequently at this location during the same time period, because the occurrence frequencies over the 11 categories in any year sum to a fixed number, the number of observations in that year. For the same reason, a jump in the occurrence frequency of the overcast condition must be accompanied by a drop in one or more of the other categories. This is why we also allow for different step changes in different categories in our model, presented in Section 3.

[11] In addition, due to poor visibility in the dark, nighttime observations could be less accurate than daytime observations. We group the 24 hours in a day into (i) daytime (9 AM, 10 AM, …, 4 PM local time), (ii) nighttime (9 PM, 10 PM, …, 4 AM local time), and (iii) others (the rest of the hours). Each of these three groups has 8 observations per day if none are missing, for a total of 2920 observations in a year of 365 days. The annual frequency series are the focus of this study. Here the annual daytime (or nighttime) frequencies are calculated as the sum of all the daytime (or nighttime) occurrences of each category condition in each year, as sketched below. These annual frequency data are assumed to be independent over years. Although one might question this assumption, it simplifies the changepoint detection problem for categorical data and provides a reasonable first-order approximation for annual series. In fact, it has been common practice in changepoint analysis to assume that annual climate data (one value per year) are independent. As an example application, cloud frequency data for one station are tested for changepoints in Section 6.
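To make the grouping concrete, the following is a minimal R sketch (ours, not the authors' code) that builds the annual daytime frequency table; the data frame obs and its columns year, hour (0–23, local time), and tenths (0–10, the reported category) are hypothetical names.

    # Hour groups used in this study
    daytime   <- 9:16             # 9 AM, 10 AM, ..., 4 PM local time
    nighttime <- c(21:23, 0:4)    # 9 PM, 10 PM, ..., 4 AM local time

    # n x K contingency table of annual daytime frequencies O_{t,k}, K = 11
    day <- subset(obs, hour %in% daytime)
    O   <- unclass(table(day$year, factor(day$tenths, levels = 0:10)))

The nighttime table is built the same way from the nighttime hours.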

3. The Cumulative Logit Model

[12] A cumulative logit model is often preferred when the categorical response can be ordered in some reasonable way, especially when the categorical response is merely a categorized version of a continuous variable [Agresti, 2002; Fahrmeir and Tutz, 2001]. For instance, the sky-cloudiness condition in Canada has 11 categories with a natural ordering from clear sky to overcast. In principle, it can be viewed as a categorized version of the continuous proportion of the sky dome covered by clouds. This section describes an extended cumulative logit model for annual sky-cloudiness condition frequencies in Canada, with a distinct temporal trend for each cumulative category.

[13] Consider an ordinal categorical response variable Y with K categories labeled {1,…, K}. Let Ot,k denote the number of times that Y falls into category k during the t-th time period (individual hourly observations are grouped over the t-th time period, which includes all the daytime or nighttime hours in a year in this study). Then for each t = 1,…, n, Ot = (Ot,1,…, Ot,K)′ follows a multinomial distribution with Nt trials and probability vector πt = (πt,1, …, πt,K)′. Here Nt = Ot,1 + ⋯ + Ot,K, and the frequencies Ot,k can be arranged in an n × K contingency table. Throughout the paper, we assume that the Ot are independent for all t = 1,…, n. This is a reasonable and commonly accepted assumption for annual climate data series.

[14] Let γt,k (k = 1,…, K − 1; t = 1,…, n) denote the cumulative probability that Y falls into categories 1 to k. For example, for the sky-cloudiness conditions in Canada, γt,1 is the probability of the clear sky condition, γt,2 is the probability that the sky is clear or one tenth of the sky dome is covered by clouds, and so on. These cumulative probabilities can be expressed as

γt,k = P(Y ≤ k) = πt,1 + πt,2 + ⋯ + πt,k,  k = 1, …, K − 1;  t = 1, …, n.  (1)

The corresponding cumulative logits are defined as

logit(γt,k) = log[γt,k/(1 − γt,k)],  k = 1, …, K − 1.  (2)

Each cumulative logit uses all K response categories. The term [γt,k/(1 − γt,k)] inside the log is called the cumulative odds.

[15] Suppose τ ∈ {1,…, n − 1} is an admissible changepoint time; the cumulative logit model then has the form

logit(γt,k) = αk + βkt + Δk1[t>τ],  k = 1, …, K − 1;  t = 1, …, n.  (3)

For each cumulative category k = 1,…, K − 1, αk is the fixed intercept, βk is the category-specific temporal trend parameter, and Δk is the changepoint parameter. The changepoint at time τ introduces a sudden change of Δk in the logits of γt,k. Here, the changepoint indicator 1[t>τ] is unity for t > τ and zero for t ≤ τ. This study focuses on the case in which the changepoint time τ is unknown. Given homogeneity of the series (Δk = 0), the temporal trend βk in model (3) can be expressed as

βk = logit(γt+1,k) − logit(γt,k) = log{[γt+1,k/(1 − γt+1,k)] / [γt,k/(1 − γt,k)]},  (4)

the logarithm of the cumulative odds ratio of responding in categories 1 to k between times t + 1 and t. In other words, within each regime, the cumulative odds for the response to fall in categories 1 to k at time (t + 1) is exp(βk) times the cumulative odds at time t. In total, for a given changepoint time τ, model (3) has 3 × (K − 1) parameters: α1,…, αK−1, β1,…, βK−1, and Δ1,…, ΔK−1. The cumulative probabilities γt,k can be expressed as

γt,k = exp(αk + βkt + Δk1[t>τ]) / [1 + exp(αk + βkt + Δk1[t>τ])],

and the categorical probabilities πt,k can then be calculated from γt,k as πt,1 = γt,1 and πt,k = γt,k − γt,k−1 for k = 2,…, K (with γt,K ≡ 1).
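As a sanity check on this parameterization, here is a small R sketch (our illustration, with our own names alpha, beta, Delta, and tau) mapping the model (3) parameters to the categorical probabilities:

    # Categorical probabilities pi_{t,k} implied by model (3) at time t;
    # alpha, beta, Delta are (K - 1)-vectors, tau is the changepoint time
    cat_probs <- function(t, alpha, beta, Delta, tau) {
      eta   <- alpha + beta * t + Delta * (t > tau)  # cumulative logits
      gamma <- exp(eta) / (1 + exp(eta))             # cumulative probabilities
      diff(c(0, gamma, 1))                           # pi_{t,1}, ..., pi_{t,K}
    }

With the Iα intercepts of Table 2, zero trends, and no changepoint, cat_probs() should reproduce the πt vector used in Section 4.2 (e.g., exp(−3.50)/(1 + exp(−3.50)) ≈ 0.0293 for category 1).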

[16] In the simple case where there is no changepoint and the temporal trends are the same for all categories (i.e., β1 = … = βK−1), model (3) reduces to the well-known proportional odds model [McCullagh, 1980]. However, in the present case, the category-specific trends and changepoint parameters considerably complicate the estimation procedure through the restriction (α1 + β1t + Δ11[t>τ]) < … < (αK−1 + βK−1t + ΔK−11[t>τ]). Currently, this extended cumulative logit model can be fitted with the R package VGAM [Yee, 2010] using iteratively reweighted least squares (IRLS).
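A hedged sketch of such a fit with VGAM's vglm() follows; the object names are ours, O is the n × K frequency matrix of Section 2, and convergence under the ordering restriction is not guaranteed at every candidate τ.

    library(VGAM)

    n   <- nrow(O)                 # series length
    tt  <- seq_len(n)              # time index t = 1, ..., n
    tau <- 15                      # an example candidate changepoint time
    step <- as.numeric(tt > tau)   # changepoint indicator 1_[t > tau]

    # parallel = FALSE gives category-specific intercepts, trends, and steps
    fit_a <- vglm(O ~ tt + step, family = cumulative(parallel = FALSE))
    fit_0 <- vglm(O ~ tt,        family = cumulative(parallel = FALSE))  # null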

[17] Before we construct the likelihood function, one issue should be mentioned: overdispersion, or extra variation, is often observed in applications with categorical variables. That is, the actual variance/covariance of the data is larger than the nominal variance/covariance implied by the distributions for categorical data. In the literature on modeling overdispersed multinomial data, the simplest way to account for the extra variability is to introduce a multiplicative overdispersion parameter ϕ > 1 and to assume that Var(Ot) = ϕΣ(πt), where Σ(πt) represents the nominal variance-covariance matrix of the multinomial distribution.

[18] In the case of overdispersion in logit models, the maximum likelihood estimates (MLEs) of the model parameters remain asymptotically unbiased. However, the variances/covariances of the MLEs will be underestimated if ϕ > 1 is ignored. To adjust for overdispersion, we define the log likelihood function as ℓ*(θ) = ℓ(θ)/ϕ̂, which is the usual log likelihood for the multinomial data, ℓ(θ), divided by a consistent estimate ϕ̂ of ϕ. Here θ = (α1, …, αK−1, β1, …, βK−1, Δ1, …, ΔK−1)′ is the 3(K − 1)-dimensional model parameter vector for a given changepoint time τ. Thus the adjusted log likelihood is ℓ*(θ) = ℓ(θ)/ϕ̂, where, up to an additive constant,

ℓ(θ) = Σ_{t=1}^{n} Σ_{k=1}^{K} Ot,k log πt,k.  (5)

To estimate ϕ, define the q = K − 1 dimensional vector Otq = (Ot,1, …, Ot,q)′, whose variance-covariance matrix is given by ϕΣtq. Under the multinomial assumption, Σtq = [σij(t)] with σij(t) = Ntπt,i(1 − πt,i) for i = j and σij(t) = −Ntπt,iπt,j for i ≠ j. Let

ut = Σ̂tq^{−1/2}(Otq − Ôtq),  (6)

for t = 1,…, n, where the expected frequencies are Ôt,k = Ntπ̂t,k and π̂t,k is obtained from the estimates of γt,k in model (3). Then

ϕ̂ = Σ_{t=1}^{n} ut′ut / [n(K − 1) − 3(K − 1)],

which is the Pearson statistic Σ_{t=1}^{n} ut′ut divided by its degrees of freedom, n(K − 1) − 3(K − 1). This adjustment for overdispersion becomes important for changepoint detection using the likelihood ratio test statistic in the next section.
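A direct moment-based R sketch of this estimator (assuming O, the n × K count matrix, and P, the n × K fitted probability matrix from the model; npar = 3 parameters per cumulative category, as in model (3)):

    # Moment estimate of phi: Pearson statistic over its degrees of freedom
    phi_hat <- function(O, P, npar = 3) {
      n <- nrow(O); q <- ncol(O) - 1
      X2 <- 0
      for (t in 1:n) {
        N   <- sum(O[t, ])
        p   <- P[t, 1:q]
        Sig <- N * (diag(p) - tcrossprod(p))           # nominal covariance of O_tq
        r   <- O[t, 1:q] - N * p                       # observed minus expected
        X2  <- X2 + drop(crossprod(r, solve(Sig, r)))  # adds u_t' u_t
      }
      X2 / (n * q - npar * q)
    }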

4. The λmax Test

[19] This section introduces a changepoint detection statistic for overdispersed multinomial data. Since changes in the categorical probabilities imply changes in the mean, variance, and distribution, one should not expect standard methods for mean shifts to work well for multinomial data.

4.1. The Likelihood Ratio Statistic

[20] Detecting an overall changepoint in model (3) is equivalent to testing the null hypothesis H0 : Δk = 0 for all k ∈ {1, …, K − 1} against the alternative hypothesis Ha : Δk ≠ 0 for some or all k ∈ {1, …, K − 1}. Here, the null model is nested within the alternative model (3); thus, the objective is to find the "best" model for the observed series {Ot} by comparing the nested models. The likelihood ratio statistic is most directly useful for such comparisons [McCullagh and Nelder, 1989; Agresti, 2002].

[21] Let θ̂0 and θ̂a denote the maximum likelihood estimates of the parameter vector under the null model and under the alternative model with a changepoint at time τ, respectively. The log likelihood ratio statistic comparing the null and alternative models is defined as

λ(τ) = 2[ℓ*(θ̂a) − ℓ*(θ̂0)] = 2[ℓ(θ̂a) − ℓ(θ̂0)]/ϕ̂,

where ϕ̂ is obtained from the larger model (i.e., the model under Ha). This statistic is large when the null model fits poorly compared with the alternative model. McCullagh and Nelder [1989] advise that the likelihood ratio statistic for comparing nested models can be well approximated by a chi-square distribution with degrees of freedom equal to the difference between the numbers of parameters in the two models, based on fixed-cell asymptotics. For example, at a fixed changepoint time τ, λ(τ) has approximately a chi-square null distribution with (K − 1) degrees of freedom. However, as the total sample size N = Σ_{t} Nt increases, the number of cells in the contingency table increases while the marginal row totals Nt stay fixed. Hence, the chi-square theory for λ(τ) must be used with caution; an asymptotic normal distribution would be more appropriate, under some conditions, in the increasing-cells case [Osius and Rojek, 1992].

[22] The MLE of the changepoint time, τ̂, is the value that maximizes λ(τ) over τ = 1,…, n − 1. Equivalently, the model-based log likelihood ratio test statistic for detecting the unknown changepoint time τ is

λmax = max_{1 ≤ τ ≤ n−1} λ(τ).  (7)

H0 is rejected when λmax is excessively large and retained when λmax is small enough to be explained by random variation.
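Putting the pieces together, here is a sketch of the λmax scan under the assumptions of the earlier snippets (logLik() on a vglm object returns the multinomial log likelihood):

    # Scan all admissible changepoint times and keep the largest ratio
    lambda <- function(tau) {
      step  <- as.numeric(tt > tau)
      fit_a <- vglm(O ~ tt + step, family = cumulative(parallel = FALSE))
      phi   <- phi_hat(O, fitted(fit_a))    # phi estimated from the larger model
      2 * as.numeric(logLik(fit_a) - logLik(fit_0)) / phi
    }

    lam        <- sapply(seq_len(n - 1), lambda)
    tau_hat    <- which.max(lam)            # MLE of the changepoint time
    lambda_max <- lam[tau_hat]              # compare with the Table 1 percentiles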

[23] The empirical percentiles of λmax are reported in Table 1 for different sample sizes n with K = 11. The null hypothesis of no changepoint is rejected if the λmax value exceeds the percentile corresponding to the nominal level of the test. Each entry in Table 1 was obtained from 100,000 independent simulations of λmax. The regression parameters αk and βk (k = 1,…, K − 1) used to generate this table are (−3.50, −2.35, −1.75, −1.39, −1.13, −0.91, −0.66, −0.36, 0.08, 1.21) and (0.0235, 0.0106, 0.0054, 0.0050, 0.0046, 0.0046, 0.0034, 0.0028, 0.0016, −0.0051), respectively. The overdispersion parameter is set to ϕ = 1, and the marginal total Nt is set to 2920 (the total number of observations for a year of 365 days with eight daytime observations per day). Importantly, further simulation studies with different sets of regression parameters and different values of ϕ show that, within simulation error, the empirical distribution of the λmax statistic does not depend heavily on the unknown regression parameters or on the value of ϕ, for a fixed series length n, a fixed number of categories K, and a reasonably large Nt. This is illustrated by the 95th percentiles of λmax under different simulation settings in Table 2, with n = 30 and K = 11. Overall, the differences in the 95th percentiles of λmax appear in the second decimal place. Therefore, the Table 1 percentiles are reasonably accurate for general use and provide very good guidance.

Table 1. The λmax Percentiles for 11-Categorical Responses

n    λmax,0.90   λmax,0.95   λmax,0.99
20   25.8        28.1        33.4
30   26.2        28.5        33.5
40   26.5        28.8        33.7
50   26.8        29.0        33.9
60   27.0        29.2        34.0
70   27.2        29.4        34.2
80   27.3        29.6        34.4
Table 2. The 95th Percentiles of λmax for K = 11 and n = 30

Nt     ϕ   αk       βk       λmax,0.95
2920   1   Iα(a)    Iβ(b)    28.55
2920   1   IIα(c)   IIβ(d)   28.56
2920   1   Iα       0        28.53
2920   5   Iα       Iβ       28.53
1460   1   Iα       Iβ       28.46

(a) Iα = (−3.50, −2.35, −1.75, −1.39, −1.13, −0.91, −0.66, −0.36, 0.08, 1.21).
(b) Iβ = (0.0235, 0.0106, 0.0054, 0.0050, 0.0046, 0.0046, 0.0034, 0.0028, 0.0016, −0.0051).
(c) IIα = (−3.57, −2.44, −1.80, −1.43, −1.14, −0.90, −0.67, −0.35, 0.06, 1.16).
(d) IIβ = (0.0535, 0.0318, 0.0209, 0.0180, 0.0151, 0.0123, 0.0115, 0.0093, 0.0076, −0.0029).

[24] In addition, the limiting distribution of maxima of chi-square statistics for detecting a changepoint in multinomial data is derived in Robbins et al. [2011] in terms of the supremum of squared Brownian bridges. The maximization of the chi-square statistic in Robbins et al. [2011] is taken over a truncated set of admissible changepoint times ℓ ≤ τ ≤ h. Their reported 90th, 95th, and 99th percentiles are 27.0, 29.2, and 34.1, respectively, for ℓ = 0.05n and h = 0.95n. These percentiles are similar to the empirical percentiles presented in our Table 1 and can be used to approximate the percentiles of λmax when there are no temporal trends (βk = 0) in model (3) and λ(τ) is maximized over ℓ ≤ τ ≤ h. As is well known in the changepoint detection literature, if the maximization is taken over all possible times τ ∈ {1,…, n − 1}, the λmax statistic diverges to infinity at a very slow rate. This is likely why the percentiles in Table 1 increase only slowly with the series length n.

[25] Having identified a significant changepoint at time τ̂ in the series being tested, one would like to know which categories are significantly affected by the sudden change at time τ̂ and which categories change the most at that changepoint. The change may not affect all K categories, but it must affect at least two of them: because the probabilities of outcomes over all categories sum to 1, a sudden change in one category must be accompanied by a change in at least one other category at the same time and from the same cause. For instance, a station relocation could affect all categories, although some might be affected more significantly than others. However, a change in the definition of the overcast condition from "over 90%" to "over 95%" of the sky dome covered by clouds would only cause a sudden change in categories 10 and 11.

[26] To assess the effects of an identified change on the individual categories, one might analyze the marginal binomial sequence separately for each category. Instead, in the spirit of Girón et al. [2005], who in a Bayesian analysis used posterior distributions of the logarithm of the ratio of category probabilities estimated before the changepoint τ̂ to those after it, we partition the test statistic λmax at the identified changepoint τ̂ into K components λk(τ̂), k = 1,…, K. Each λk(τ̂) is the contribution of category k to λmax given τ̂ and can be expressed as

λk(τ̂) = 2[ℓk(θ̂a) − ℓk(θ̂0)]/ϕ̂,  k = 1, …, K.

Here, θ̂a is estimated at the detected changepoint τ̂, and ℓk(θ) = Σ_{t=1}^{n} Ot,k log πt,k is the category-k term of (5). Since λmax for model (3) has (K − 1) degrees of freedom, the K components λk(τ̂) are not independent of each other. The larger the value of λk(τ̂), the more relevant the corresponding category is for the purpose of segmenting the cloud cover series.
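In R, under the same assumptions as the earlier snippets, the partition amounts to evaluating the per-category terms of (5) under both fitted models (P_a and P_0 denote the fitted probability matrices of the alternative model at τ̂ and of the null model):

    # Per-category log likelihoods l_k(theta) = sum_t O_{t,k} * log(pi_{t,k})
    ell_k <- function(O, P) colSums(O * log(P))

    # Contribution of each of the K categories to lambda_max
    lambda_k <- 2 * (ell_k(O, P_a) - ell_k(O, P_0)) / phi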

4.2. Effect of Overdispersion

[27] This section presents simulations that quantify how the λmax test performs when overdispersion is ignored in computing the test statistic (7).

[28] We consider series of length n = 50; series of other lengths behave similarly. As the overdispersion parameter ϕ increases, the variances and covariances of the data increase, and hence the signal-to-noise ratio decreases, making the change (signal) harder to detect. When ϕ = 1, the data have exactly the variability expected under a multinomial distribution. For simplicity, each series is generated without temporal trends from the overdispersed multinomial distribution with Nt = 2920 and πt = (0.0293, 0.0578, 0.0610, 0.0514, 0.0448, 0.0428, 0.0537, 0.0702, 0.1090, 0.2503, 0.2297)′ for a range of ϕ values. Here πt is computed from the same αk used in generating Table 1, with βk = 0.
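The counts in this study are generated sequentially from conditional binomial distributions (Section 4.3). An alternative that matches the first two moments exactly, sketched below, is a Dirichlet-multinomial draw, since choosing the concentration A = (Nt − ϕ)/(ϕ − 1) makes its covariance equal ϕΣ(πt). This is our substitute scheme, not necessarily the authors'.

    # Draw one count vector with mean N * pi and variance phi * Sigma(pi)
    rdirmult <- function(N, pi, phi = 1) {
      if (phi <= 1) return(drop(rmultinom(1, N, pi)))
      A <- (N - phi) / (phi - 1)           # concentration matching the target phi
      g <- rgamma(length(pi), shape = A * pi)
      drop(rmultinom(1, N, g / sum(g)))    # normalized gammas ~ Dirichlet(A * pi)
    }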

[29] Table 3 reports the empirical probabilities of erroneously rejecting H0 (false alarm rates) at the 5% level for different values of ϕ. Each false alarm rate was obtained from 100,000 independent simulations. In each simulation run, a sequence of multinomial counts with the specified ϕ is first generated. Two λmax statistics are then computed: one that accounts for the overdispersion parameter (Case I) and one that ignores overdispersion (Case II). In both cases, the null hypothesis of no changepoint is rejected if λmax exceeds 29.0 (the 95th percentile of λmax for n = 50 in Table 1).

Table 3. Effects of Overdispersion on the False Alarm Rate of the Changepoint Detection Method

ϕ     Case I   Case II
1.0   0.0498   0.0396
1.2   0.0492   0.1768
1.5   0.0507   0.5549
1.8   0.0511   0.8499
2.0   0.0490   0.9428
3.0   0.0509   0.9999

[30] The false alarm rate, or Type I error rate, should be close to 0.05 for a well-designed 5% level test. The results in Table 3 show that the Type I error rates are extremely high when overdispersion is ignored (Case II), while the λmax statistic that accounts for overdispersion performs as it should for all values of ϕ (Case I). This is as expected: when the overdispersion parameter ϕ is ignored, the variances/covariances in the series are underestimated, which makes rejection easier when there is in fact no changepoint. Hence, one should be extremely careful about the modeling choices in changepoint problems with overdispersed count data. Ignoring overdispersion will degrade changepoint detection procedures.

4.3. Detection Power Assessment

[31] The detection power of the test described in Section 4.1 is evaluated through Monte Carlo simulations in this section. Specifically, we estimate how frequently the test detects a changepoint when one actually exists.

[32] In each simulation run, a random multinomial sequence of length n = 50 was generated with K = 11 categories and Nt = 2920, representing a yearly cloud cover frequency series. The regression parameters αk and βk were set to the same values used in generating πt in Section 4.2. Three factors were varied in the simulations: the true changepoint time, τ = 10, 25, and 40; the overdispersion parameter, ϕ = 1.0, 1.5, 2.0, and 3.0; and a common change Δ = Δk placed on all K − 1 cumulative logits, with levels Δ = 0.1, −0.15, and 0.2. These changes are equivalent to changes of 0.059, 0.093, and 0.116, respectively, on the logistic scale of the categorical probability πt. For each combination of levels, 10,000 replicates were drawn. The sets of categorical probabilities before and after the changepoint time τ are tabulated in Table 4 for the different changes Δ. The multinomial frequencies Ot were then simulated sequentially from conditional binomial distributions based on the categorical probabilities in Table 4 for the specified ϕ value.

Table 4. Simulated Categorical Probabilities Before and After the Changepoint

      Before    After
k     Δ = 0     Δ = 0.1   Δ = −0.15   Δ = 0.2
1     0.0293    0.0323    0.0253      0.0356
2     0.0577    0.0631    0.0505      0.0688
3     0.0610    0.0658    0.0543      0.0708
4     0.0514    0.0547    0.0464      0.0582
5     0.0448    0.0472    0.0410      0.0497
6     0.0428    0.0448    0.0398      0.0467
7     0.0537    0.0557    0.0506      0.0574
8     0.0702    0.0718    0.0673      0.0731
9     0.1090    0.1095    0.1073      0.1095
10    0.2503    0.2426    0.2602      0.2342
11    0.2297    0.2125    0.2573      0.1962

[33] The λmax test is applied to each simulated series at the 5% level of significance. The detection power of the test is estimated as the rate of correct detection, reported in Table 5. For example, when ϕ = 2.0 and Δ = −0.15, the probability that the test correctly identifies the true changepoint at τ = 10 is 81.5%. Here, a correct detection is counted when the λmax value exceeds its 95th percentile (29.0 in this case) and the estimated changepoint time is exactly the actual changepoint time. Therefore, the numbers in Table 5 are a very strict measure of detection power.

Table 5. Detection Power Estimates

          τ = 10                     τ = 25                     τ = 40
ϕ         Δ=0.1   Δ=−0.15  Δ=0.2     Δ=0.1   Δ=−0.15  Δ=0.2     Δ=0.1   Δ=−0.15  Δ=0.2
1.0       0.763   0.964    0.996     0.671   0.969    0.997     0.756   0.970    0.995
1.5       0.503   0.911    0.981     0.389   0.887    0.982     0.515   0.902    0.985
2.0       0.337   0.815    0.955     0.222   0.750    0.949     0.330   0.810    0.957
3.0       0.154   0.595    0.875     0.096   0.466    0.834     0.146   0.606    0.877

[34] As reported in Table 5, the detection power of the λmax test increases as the magnitude of change |Δ| increases and decreases as the overdispersion parameter ϕ increases. Although the test loses power as ϕ increases, it is still very powerful when |Δ| = 0.2 and fairly powerful when |Δ| = 0.15. The detection power for a very small level change, |Δ| = 0.1, drops very fast for large ϕ values due to the large variation in the series. In addition, the simulation results for τ = 10 and τ = 40 are symmetric about the center of the series, n/2 = 25, as expected. The power of detecting a changepoint (for a fixed |Δ|) increases as the changepoint moves away from the center.

[35] So far we have shown that the λmax test can effectively detect a changepoint in an overdispersed multinomial sequence. To evaluate the accuracy of the detected changepoint, we derive reasonable limits for the changepoint time τ in the next section.

5. Bootstrap Approximation of the Distribution of Changepoint

[36] The bootstrap method [Efron, 1979] is a data-based simulation method for statistical inference. Hinkley and Schechtman [1987] applied a conditional bootstrap method in the mean shift changepoint model for continuous variables to approximate the probability distribution of the estimated changepoint time. Instead of the bootstrapping-residuals approach used by Hinkley and Schechtman [1987], we use the unconditional bootstrapping-pairs method [Freedman, 1981] in this paper. The bootstrap by pairs consists of resampling the regressand and the regressors together from the original data. As shown in Efron and Tibshirani [1993], bootstrapping pairs is less sensitive to regression model assumptions than bootstrapping residuals. In particular, there is no explicit function connecting the count response variables to random errors in our model, so resampling residuals would be neither direct nor easy.

[37] Let τ̂ denote the observed value of the maximum likelihood estimate of τ for a given series Ot, t = 1,…, n. Once the changepoint time has been detected, the series can be broken into two segments, one on each side of the changepoint, and the bootstrap pairs are generated separately for each segment. More formally, define the bootstrap pairs (O*t, t*) for t = 1,…, τ̂ to be a random sample with replacement from the original pairs (Ot, t), t = 1,…, τ̂. Similarly, the bootstrap pairs (O*t, t*) for t = τ̂ + 1,…, n are defined to be a random sample with replacement from the original pairs (Ot, t), t = τ̂ + 1,…, n. The bootstrap estimate τ̂* of τ̂ is then obtained from (O*t, t*), t = 1,…, n. This process is repeated B times, yielding B estimates τ̂*1,…, τ̂*B. The empirical distribution of τ̂* can be used to estimate the distribution of τ̂. For example, if τ̂* equals τ̂ in a of the B bootstrap samples, the empirical probability of τ̂* = τ̂ is a/B. Moreover, the approximate 1 − 2α bootstrap percentile interval is given by [τ̂*(Bα), τ̂*(B(1−α))], where τ̂*(Bα) is the B × αth value of the ordered τ̂* estimates.
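A sketch of the bootstrap-pairs loop, reusing the hypothetical routines of Section 4 (the repeated refits make this slow; it is illustrative only):

    # Resample pairs within each segment, then re-estimate the changepoint
    B <- 2000
    refit_tau <- function(Ob, tb) {        # tau maximizing lambda(tau) on (Ob, tb)
      which.max(sapply(seq_len(n - 1), function(tau) {
        step <- as.numeric(tb > tau)
        fa <- vglm(Ob ~ tb + step, family = cumulative(parallel = FALSE))
        f0 <- vglm(Ob ~ tb,        family = cumulative(parallel = FALSE))
        2 * as.numeric(logLik(fa) - logLik(f0)) / phi_hat(Ob, fitted(fa))
      }))
    }
    tau_star <- replicate(B, {
      idx <- c(sample(1:tau_hat, replace = TRUE),         # pairs from segment 1
               sample((tau_hat + 1):n, replace = TRUE))   # pairs from segment 2
      refit_tau(O[idx, ], tt[idx])
    })

    table(tau_star - tau_hat) / B          # empirical distribution (cf. Table 6)
    quantile(tau_star, c(0.025, 0.975))    # approximate 95% percentile interval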

[38] In this study, B = 2000 is used and Ot is generated using the method described in Section 4.3 with Δ = −0.15, ϕ = 1.5, and τ = 10, 25, and 40. For each τ, we simulate 50 Ot series, and the average empirical probabilities of τ̂* − τ̂ are reported in Table 6. The empirical distribution of τ̂* − τ̂ in Table 6 provides an approximate confidence range for τ. For example, [τ̂ − 1, τ̂ + 1] is a 98.4% confidence range for τ = 10, a 96.5% range for τ = 25, and a 98.3% range for τ = 40. This estimated confidence range for τ helps determine how accurate the estimated changepoint time τ̂ is. In the next section we present an application of this method to real sky-cloudiness frequency data for one station in Canada.

Table 6. Distributions of τ̂* − τ̂ Estimated by the Bootstrap Method

         τ̂* − τ̂
         ≤−3      −2       −1       0        1        2        ≥3
τ = 10   0.0028   0.0049   0.0247   0.9264   0.0326   0.0039   0.0047
τ = 25   0.0077   0.0065   0.0395   0.8777   0.0482   0.0080   0.0125
τ = 40   0.0032   0.0055   0.0350   0.9176   0.0306   0.0055   0.0027

6. An Application to Sky-Cloudiness Data in Canada

[39] According to the related metadata, the weather office at Fort St. John Airport in Canada (Climate ID: 1183000) was relocated on 7 November 1979, accompanied by a change in ownership and station type (the ownership change possibly indicates an observer change). In this section, we apply the proposed test to sky-cloudiness conditions observed at this station during the period 1965–94 (i.e., n = 30) to assess the effect of these changes. Here, we treat this changepoint as unknown, to see whether the new test can detect it.

[40] The annual daytime frequencies of sky-cloudiness conditions are shown in Figure 1 for each of the 11 categories. It is clear that the temporal trends of the cloud cover frequencies vary with category; they are fitted separately in model (3). For this set of cloudiness data, the λ(τ) statistic is computed for each admissible τ ∈ {1,…, n − 1} and plotted in Figure 2 along with the 95th percentile of λmax. The largest value of λ(τ), 146.3, corresponds to τ = 14 (year 1978) and greatly exceeds the 95th percentile (28.5 in this case), suggesting a significant change between 1978 and 1979. This indicates that the station relocation and ownership change in 1979 did cause a significant sudden change in the cloud cover observations. The estimated year of change is one data interval (one year in this case) earlier than the actual time of change. The estimated overdispersion parameter ϕ̂ reflects the very large variability in the observed cloud cover data. With this extra variability accounted for in our model, we can detect the true changepoint with a large probability. The bootstrap estimate of the probability of τ = 14 is 0.998. Additionally, the average of the absolute changes |Δ̂k| (k = 1,…, K − 1) before and after τ̂ is 0.42 in the logits of the cumulative probabilities.

Figure 1. Annual daytime frequencies of sky-cloudiness conditions at Fort St. John Airport, along with the frequencies estimated using the null model with no changepoint (dotted lines) and the alternative model with a changepoint in year 1978 (dashed lines).

Figure 2. Log likelihood ratio statistics at Fort St. John Airport.

[41] The estimated frequencies Ôt,k of sky-cloudiness conditions under the alternative model with τ̂ = 14 and under the null model with no changepoint are also plotted in Figure 1 for each category. The alternative model fit with a changepoint at τ̂ = 14 is obviously better than the null model fit with no changepoint. We then use the Pearson residuals to verify the assumption of independence over time. The Pearson residual vector is defined as rt = ut/ϕ̂^{1/2} (i.e., ut standardized by the estimated overdispersion) to account for the overdispersion effect. Here, ut is given by (6) in Section 3, and the expected frequencies Ôt,k are obtained using model (3) with τ̂ = 14. The autocorrelations of the Pearson residuals up to lag 20, together with the 95% pointwise confidence bounds for white noise, are plotted in Figure 3 for categories k = 1,…, K − 1. Since no more than one of the residual autocorrelations in Figure 3 lies outside the 95% confidence bounds, the year-to-year autocorrelations in the annual sky-cloudiness cover series at Fort St. John Airport can be statistically ignored; the independence assumption seems reasonable for this example.

Figure 3. Sample autocorrelations of Pearson residuals at Fort St. John Airport.
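The diagnostic itself is a one-liner per category with base R's acf(); a sketch assuming U, the n × (K − 1) matrix whose rows are the ut vectors of (6), phi, the overdispersion estimate, and K = 11:

    # Pearson residuals standardized for overdispersion, checked for serial
    # correlation category by category (cf. Figure 3)
    R <- U / sqrt(phi)
    for (k in 1:(K - 1)) acf(R[, k], lag.max = 20, main = paste("category", k))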

[42] Further, the test statistic λmax = 146.3 is partitioned into 11 components to assess the effect of the identified change at τ̂ = 14 on the individual categories. The components λk(τ̂) are plotted in Figure 4. Clearly, the category 1 (clear sky) frequencies of sky-cloudiness conditions changed the most as a result of the relocation and the change in ownership and station type in 1979. Categories 5, 8, 9, 10, and 11 are also strongly affected by this changepoint. All other categories appear to be less significantly affected.

Figure 4. Component log likelihood ratio statistics at Fort St. John Airport.

[43] The results for the annual nighttime frequencies of sky-cloudiness conditions, shown in Figure 5, are similar to those for the corresponding daytime series. For this set of nighttime cloudiness data, λmax = 104.8 is attained at τ̂ = 12 (year 1976), suggesting a significant change between 1976 and 1977. The 95% bootstrap percentile interval for τ is [12, 14], corresponding to the years 1976–1978, which includes the year 1978 at which the daytime series shows the shift between 1978 and 1979. The estimate ϕ̂ indicates that the variability in the nighttime series is slightly larger than that in the daytime series. In addition, the average of the |Δ̂k| values before and after τ̂ is 0.22 in the logits of the cumulative probabilities. This smaller change makes the changepoint harder to detect than in the daytime series. The component analysis, obtained by partitioning λmax = 104.8, shows that the identified change in 1976 affects most of the categories, including categories 1, 2, 4, 5, 6, 8, and 10; among these, category 2 changed the most in the nighttime series.

Figure 5. The same as in Figure 1 but for the nighttime sky-cloudiness conditions, with a changepoint in 1976.

7. Concluding Remarks

[44] In this study we have developed a likelihood ratio statistic for detecting a single changepoint in frequency time series of multinomial response variables, using the cumulative logit model. Since the immediate objective is to test the homogeneity of Canadian sky-cloudiness data series, which are 11-categorical responses, we focused on 11-categorical variables in this study. However, the new changepoint detection method can be used to study K-category variables for any K > 2, given the empirical percentiles corresponding to the specific number of response categories K. The R programs used in this study can be found at http://cccma.seos.uvic.ca/ETCCDMI/software.shtml.

[45] The assumption of at most one changepoint (AMOC) made in model (3) may not be realistic in practice for longer time series. This work is, of course, only a first step in detecting changepoints in multicategory variables. The extension to the multiple-changepoint case is a subject for future research, and we would also like to extend the method to account for autocorrelation and seasonality, which are inherent features of most climate variables.

Acknowledgments

[46] The Climate Research Division of the Atmospheric Science and Technology Directorate of Environment Canada is acknowledged for supporting QiQi Lu through the research contracts KM040-06-0067 and KM040-07-0141. The authors would like to thank Lucie Vincent for her helpful internal review comments on an earlier version of this manuscript. The authors would also like to thank the editor and three anonymous reviewers for their valuable comments and suggestions.
