We propose a methodology to assess the noise characteristics in time series of position estimates for permanent Global Positioning System (GPS) stations. Least squares variance component estimation (LS-VCE) is adopted to cope with any type of noise in the data. LS-VCE inherently provides the precision of (co)variance estimators. One can also apply statistical hypothesis testing in conjunction with LS-VCE. Using the w-test statistic, a combination of white noise and flicker noise turns out in general to best characterize the noise in all three position components. An interpretation for the colored noise of the series is given. Unmodelled periodic effects in the data will be captured by a set of harmonic functions for which we rely on the least squares harmonic estimation (LS-HE) method and parameter significance testing developed in the same framework as LS-VCE. Having included harmonic functions into the model, practically only white noise can be shown to remain in the data. Remaining time correlation, present only at very high frequencies (spanning a few days only), is expressed as a first-order autoregressive noise process. It can be caused by common and well-known sources of errors like atmospheric effects as well as satellite orbit errors. The autoregressive noise should be included in the stochastic model to avoid the overestimation (upward bias) of power law noise. The results confirm the presence of annual and semiannual signals in the series. We also observed significant periodic patterns with a period of 350 days and its fractions 350/n, n = 2, …, 8, that resemble the repeat time of the GPS constellation. Neglecting these harmonic signals in the functional model can lead to a serious overestimation of the rate uncertainty.
 Continuous Global Positioning System (GPS) measurements have now been used for nearly 15 years to estimate crustal deformation. Station positions are determined with respect to an earth-fixed terrestrial reference system. Geophysical studies using geodetic measurements of surface displacement or strain require not only accurate estimates of these parameters but also accurate error estimates. The precision of these estimates is often assessed by their repeatability, defined as the mean squared error of individual coordinate components (i.e., north, east, and vertical) about a linear trend. Except for significant episodic deformation, such as that caused by large earthquakes, a linear trend can be a good representation of the deformation behavior. The site velocities are usually determined by linear regression of individual coordinate components. The least squares technique is used to estimate the line parameters, i.e., the intercept and the slope (site velocity).
 In the ideal case, it is desired that the time series possess only white noise and all functional effects are fully understood. The noise in GPS coordinate time series turns out not to be white. Several geodetic data sets have provided evidence for error sources that introduce large temporal correlations into the data. The ultimate goal of noise studies is to come up with a stochastic model that allows one to process the coordinate time series such that the “best” solution (most precise solution together with proper precision description) of the station positions and site velocities can be determined. An intermediate goal is therefore to better understand and to identify the various noise components of the stochastic model.
 Two techniques have generally been employed to assess the noise characteristics of geodetic time series, namely, the power spectral method and the maximum likelihood estimation (MLE) method. The former examines the data in the frequency domain, while the latter examines the data covariance matrix in the time (space) domain. In contrast to the classical power spectral techniques, MLE can estimate the parameters of a noise model effectively. In this contribution, we will not make use of the spectral techniques. The MLE method is generally used to compute the amount of white noise, flicker noise, and random walk noise in the time series [see, e.g., Zhang et al., 1997; Langbein and Johnson, 1997; Mao et al., 1999; Williams et al., 2004; Langbein, 2004]. In this paper, we introduce and use a different variance component estimation method based on the least squares principle. Our motivation is given in the next section.
2. Previous Work and Outline
Zhang et al.  processed 19 months of continuous GPS coordinates from 10 sites in southern California. Using MLE with integer spectral indices, they found that the noise in the GPS time series was best described as a combination of white noise and flicker noise. This combination suggested that the velocity uncertainties should be three to six times larger than those obtained from a pure white noise model. Using the power spectra, the noise was characterized by a fractal noise process with a spectral index of −0.4. Neglecting this fractal white noise model, site velocity uncertainties could be underestimated by a factor of 2–4. In an analogous way, Calais , Mao et al. , and Williams et al.  found that GPS position time series best fitted a noise model consisting of both white noise and flicker noise. Higher frequency (1–30 s) GPS position time series have also been shown to contain white plus flicker noise [Bock et al., 2000; Langbein and Bock, 2004].
 Several studies have also recognized random walk noise in geodetic data. Random walk noise was detected in continuous measurements of strainmeters as well as very short baseline GPS data at Piñon Flat Observatory in southern California [Wyatt, 1982, 1989; Wyatt et al., 1989; Johnson and Agnew, 2000]. Langbein and Johnson [1995, 1997] showed that the noise in the electronic distance measuring (EDM) data is well characterized by a combination of white and random walk noise. The random walk amplitude for a very short baseline at Piñon Flat Observatory is only 0.4 mm/yr$^{1/2}$ [Johnson and Agnew, 2000]. Beavan  shows that the noise properties of GPS time series for concrete pillar monuments are very similar to those of deeply drilled braced monuments. Using two-color EDM measurements in California, Langbein  shows that the random walk noise model is valid for about 30% of the data. In some cases, a combination of random walk and band-pass-filtered noise best characterizes the data.
 Site position time series obtained from continuous GPS arrays show significant seasonal variations with annual and semiannual periods. Such seasonal deformation is present in both global and regional GPS coordinate time series [see van Dam et al., 2001; Dong et al., 2002]. Kusche and Schrama  show that after removing the atmospheric pressure loading effect, estimated annual variations of continental scale mass redistribution exhibit patterns similar to those obtained with Gravity Recovery and Climate Experiment (GRACE). Ding et al.  used time series of daily positions of eight colocated GPS and very long baseline interferometry (VLBI) stations to assess the seasonal signals using the wavelet transform. Blewitt and Lavallée  showed that annual signals can significantly bias the site velocity if they are not estimated in the model. Another important systematic error in GPS time series is the presence of offsets (jumps). Williams [2003b] and Perfetti  discuss offset detection and estimation strategies. Kenyeres and Bruyninx  estimate offsets for coordinate time series in the EUREF permanent network.
 This study differs in several ways from previous work. We use the least squares variance component estimation (LS-VCE) method, which has attractive and unique features that we point out in this paper. First, LS-VCE is generally applicable and can cope with any type of noise (and with any number of noise components) in the data series. The method can be implemented in a relatively simple and straightforward manner. Second, using LS-VCE one can obtain the covariance matrix of the estimators describing the uncertainty of the (co)variance components. Third, the general formulation of LS-VCE is applied to a special case to estimate time correlation, assuming that the time series are stationary in time. Fourth, we introduce the w-test statistic, with which one can simply test the “contribution” of single noise components. One can thus determine which noise combination best characterizes the noise of GPS position time series. In the same framework as LS-VCE, we then introduce the least squares harmonic estimation (LS-HE) method. The goal is to introduce harmonic functions to capture unmodelled effects in the time series. It is shown that practically only white noise remains, which is attractive from the data processing point of view. Such a duality between the stochastic and the functional model helps one to judge correctly the amount and the behavior of the noise.
3. Variance Component Estimation (VCE)
 In this section, we briefly explain the power spectrum of a power law noise process, the maximum likelihood estimation (MLE) of variance components, the least squares variance component estimation (LS-VCE), and the modeling approach and misspecification of a noise process.
 The power spectrum of a power law noise process has the form
$$P_y(f) = P_0 \left(\frac{f}{f_0}\right)^{\kappa} \qquad (1)$$
where f is the temporal frequency, P0 and f0 are the normalizing constants, and κ is the spectral index [see, e.g., Mandelbrot and van Ness, 1968]. Typical spectral index values lie within [−3, 1]; for stationary processes −1 < κ < 1 and for nonstationary processes −3 < κ < −1. A smaller (more negative) spectral index implies a more strongly correlated process with relatively more power at lower frequencies. Special cases within this stochastic process occur at the integer values for κ. Classical white noise has a spectral index of 0, flicker noise has a spectral index of −1, and random walk noise has a spectral index of −2. The power spectral method can be employed to assess the noise characteristic of GPS time series.
 The functional and stochastic models are written as
$$E\{\underline{y}\} = Ax, \qquad D\{\underline{y}\} = Q_y = Q_0 + \sum_{k=1}^{p} \sigma_k Q_k \qquad (2)$$
where an underscore indicates that a quantity is a random variable and $E\{\cdot\}$ and $D\{\cdot\}$ are the expectation and dispersion operators, respectively. In the preceding functional model, y is the m-vector of observables, x is the n-vector of parameters of interest, and the m × n design matrix A is assumed to be of full column rank. The data covariance matrix $Q_y$ is expressed as an unknown linear combination of known m × m cofactor matrices $Q_k$. $Q_0$ is the known part (if any) of the stochastic model; it also allows one to use a nonlinear stochastic model (see section 6.5). The unknown (co)variance components $\sigma_k$, k = 1, …, p, are to be estimated.
 To apply the least squares method to the (co)variance component model, one can reformulate the second part of equation (2) as a model of observation equations, $E\{\mathrm{vh}(\underline{t}\,\underline{t}^T - B^T Q_0 B)\} = A_{vh}\,\sigma$, where $\sigma = [\sigma_1, \ldots, \sigma_p]^T$, $\underline{t} = B^T \underline{y}$, and $A_{vh} = [\mathrm{vh}(B^T Q_1 B), \ldots, \mathrm{vh}(B^T Q_p B)]$. Here the vh (vector-half) operator applies to symmetric matrices, and B is an m × (m − n) matrix whose m − n linearly independent columns span the null space of $A^T$, i.e., $A^T B = 0$ or $B^T A = 0$.
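 The least squares estimator for the p-vector of unknown (co)variance components can then be obtained as
$$\hat{\underline{\sigma}} = (A_{vh}^T Q_{vh}^{-1} A_{vh})^{-1} A_{vh}^T Q_{vh}^{-1}\, \mathrm{vh}(\underline{t}\,\underline{t}^T - B^T Q_0 B) = N^{-1}\,\underline{l} \qquad (3)$$
where $Q_{vh}$ is the covariance matrix of the observables $\mathrm{vh}(\underline{t}\,\underline{t}^T)$. One can show that the entries of the p × p normal matrix N and of the p-vector l are obtained as
$$n_{kl} = \tfrac{1}{2}\,\mathrm{tr}\!\left(Q_k Q_y^{-1} P_A^{\perp} Q_l Q_y^{-1} P_A^{\perp}\right) \qquad (4)$$
and
$$\underline{l}_k = \tfrac{1}{2}\,\hat{\underline{e}}^T Q_y^{-1} Q_k Q_y^{-1} \hat{\underline{e}} - \tfrac{1}{2}\,\mathrm{tr}\!\left(Q_k Q_y^{-1} P_A^{\perp} Q_0 Q_y^{-1} P_A^{\perp}\right) \qquad (5)$$
where k, l = 1, 2, …, p and the least squares residual vector is $\hat{\underline{e}} = P_A^{\perp}\,\underline{y}$, in which the orthogonal projector $P_A^{\perp}$ is given as
$$P_A^{\perp} = I - A (A^T Q_y^{-1} A)^{-1} A^T Q_y^{-1} \qquad (6)$$
with I an identity matrix. Since the estimators are based on the least squares method, the inverse of the normal matrix N automatically gives the covariance matrix of the estimated (co)variance components
$$Q_{\hat{\sigma}} = N^{-1} \qquad (7)$$
Therefore this equation offers us measures of precision for the estimators.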
 To implement the method, one starts with an initial guess for the (co)variance components ($\sigma_k^0$, k = 1, …, p). Using these values, one computes $Q_y = Q_0 + \sum_{k=1}^{p} \sigma_k^0 Q_k$. Equation $\hat{\underline{\sigma}} = N^{-1}\underline{l}$ with equations (4) and (5) gives estimates for the $\sigma_k$, k = 1, …, p, which in the next cycle are considered as an improved initial guess for those (co)variance components. This iterative procedure is repeated until the estimated (co)variance components do not change with further iterations. In this section, we considered a linear stochastic model. LS-VCE can also be applied to a nonlinear stochastic model, namely, $Q_y = Q(\sigma)$. To overcome the nonlinearity, one can expand the model into a Taylor series for which one needs the initial values of the unknown vector σ, namely, $\sigma^0$. After linearization, one obtains a linear form of the (co)variance component model and thus equation (3) can be used [see Amiri-Simkooei, 2007].
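The iterative procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ls_vce`, its arguments, and the convergence tolerance are hypothetical, the normal matrix and right-hand side follow the trace expressions usually quoted for normally distributed data, and dense matrix inversion is used for clarity rather than efficiency.

```python
import numpy as np

def ls_vce(y, A, cofactors, sigma0, Q0=None, max_iter=50, tol=1e-10):
    """Iterate sigma_hat = N^{-1} l for Qy = Q0 + sum_k sigma_k * Q_k.

    Hypothetical helper: returns the estimated components and N^{-1},
    the covariance matrix of the estimators from the last iteration.
    """
    m, p = len(y), len(cofactors)
    Q0 = np.zeros((m, m)) if Q0 is None else Q0
    sigma = np.asarray(sigma0, dtype=float)
    for _ in range(max_iter):
        Qy = Q0 + sum(s * Q for s, Q in zip(sigma, cofactors))
        Qy_inv = np.linalg.inv(Qy)
        # Orthogonal projector P = I - A (A^T Qy^-1 A)^-1 A^T Qy^-1
        P = np.eye(m) - A @ np.linalg.solve(A.T @ Qy_inv @ A, A.T @ Qy_inv)
        e_hat = P @ y                      # least squares residuals
        u = Qy_inv @ e_hat
        M = [Q @ Qy_inv @ P for Q in cofactors]
        M0 = Q0 @ Qy_inv @ P
        # Normal matrix N and right-hand side l (trace expressions)
        N = np.array([[0.5 * np.trace(M[k] @ M[j]) for j in range(p)]
                      for k in range(p)])
        l_vec = np.array([0.5 * u @ cofactors[k] @ u
                          - 0.5 * np.trace(M[k] @ M0) for k in range(p)])
        sigma_new = np.linalg.solve(N, l_vec)
        done = np.max(np.abs(sigma_new - sigma)) <= tol * np.max(np.abs(sigma_new))
        sigma = sigma_new
        if done:
            break
    return sigma, np.linalg.inv(N)
```

For a single white noise component (one cofactor matrix equal to the identity), this sketch reduces to the familiar variance estimator, the residual sum of squares divided by m − n, which is a convenient sanity check.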
 In contrast to MLE which gives biased estimators, LS-VCE provides unbiased and minimum variance estimators. The unbiasedness property is independent of the (unspecified) distribution of the data. LS-VCE is also faster than MLE since it iterates in a Newton–Raphson scheme toward a solution rather than using the downhill simplex which can be extremely slow [see Press et al., 1992]. With LS-VCE one can thus efficiently incorporate any number of noise components in the stochastic model. Using hypothesis testing, one can also simply judge in an objective manner which noise components are likely to be present in the series (see section 5.2).
3.3. Misspecification in Functional and Stochastic Model
 Misspecifications (errors) in the functional model and/or stochastic model will, in general, affect the optimality properties of the estimators. This also holds true for variance component estimation. It is therefore of interest to understand how such misspecifications affect the result of estimation. Without treating this topic in detail, it is relevant to briefly mention some of these effects. Concerning the number of parameters in the model $E\{\underline{y}\} = Ax$ and $Q_y = Q_0 + \sum_{k=1}^{p} \sigma_k Q_k$, two types of misspecifications can occur: overparametrization and underparametrization.
 Underparametrization in the functional model $E\{\underline{y}\} = Ax$ will generally lead to biases in the estimation of x and thus also in the results of variance component estimation (aliasing). Such biases in the results of VCE may have the side effect that they become misinterpreted as an underparametrization of the stochastic model (see section 5.3). In contrast to underparametrization, overparametrization in the functional model does not lead to biases in the estimation of x. Here, however, one has to be aware that overparametrization reduces the redundancy and therefore also the precision with which the results can be obtained. This is true for the estimation of x, as well as for the estimation of the (co)variance components.
 Misspecifications in the stochastic model will not lead to biases in the estimation of x. However, underparametrization in the stochastic model will lead to biases in the estimation of the (co)variance components, as a consequence of which an incorrect precision description is obtained for the estimator of x. To discuss this effect, let $Q_y$ be the correct covariance matrix and $Q'_y$ be the incorrect one. The least squares estimator of x, based on $Q'_y$, is then given as $\hat{\underline{x}} = (A^T {Q'_y}^{-1} A)^{-1} A^T {Q'_y}^{-1}\,\underline{y}$. This estimator is still unbiased [see Teunissen et al., 2005]. Its covariance matrix follows from an application of the error propagation law as
$$Q_{\hat{x}} = (A^T {Q'_y}^{-1} A)^{-1} A^T {Q'_y}^{-1}\, Q_y\, {Q'_y}^{-1} A\, (A^T {Q'_y}^{-1} A)^{-1} \qquad (8)$$
If one believes that $Q'_y$ is the correct covariance matrix, while it is not, one will use the matrix $Q'_x = (A^T {Q'_y}^{-1} A)^{-1}$ to describe the precision of $\hat{\underline{x}}$. This matrix, however, gives an incorrect precision description, which can either be too optimistic ($Q_{\hat{x}} \ge Q'_x$) or too pessimistic ($Q_{\hat{x}} \le Q'_x$). Comparisons of precision descriptions for different stochastic models are given in section 6.6.
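The difference between the claimed and the actual precision can be made concrete with a short numerical sketch. The function name `compare_precision` is hypothetical, and the exponentially correlated covariance used in the usage lines below is only an illustrative stand-in for time-correlated noise, not a model taken from this paper.

```python
import numpy as np

def compare_precision(A, Qy_true, Qy_assumed):
    """Return (actual, claimed) covariance matrices of x_hat.

    x_hat is computed with the assumed covariance matrix, while the
    data actually follow the true one (error propagation law).
    """
    W = np.linalg.inv(Qy_assumed)
    Qx_claimed = np.linalg.inv(A.T @ W @ A)   # what one believes
    G = Qx_claimed @ A.T @ W                  # x_hat = G y
    Qx_actual = G @ Qy_true @ G.T             # what actually holds
    return Qx_actual, Qx_claimed

# Illustrative use: positively correlated noise treated as white noise
m = 100
idx = np.arange(m)
Q_true = 0.8 ** np.abs(idx[:, None] - idx[None, :])
A = np.column_stack([np.ones(m), idx / m])
actual, claimed = compare_precision(A, Q_true, np.eye(m))
rate_ratio = np.sqrt(actual[1, 1] / claimed[1, 1])  # > 1: too optimistic
```

In this illustrative setting the white noise assumption gives a rate uncertainty that is too small, which is the "too optimistic" case described above.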
4. GPS Coordinate Time Series
 This section demonstrates how to estimate the time correlation of GPS coordinate time series using LS-VCE. We rely on a commonly accepted structure of the functional and stochastic model and eventually arrive at a simple expression.
4.1. Functional Model
 We restrict ourselves to the problem of time correlation estimation for an individual component of GPS coordinate time series. In equation (2), y is the m-vector of time series observations, for example, daily GPS positions of one component. Hereinafter it is denoted by y(t) where t refers to the time instant. When a linear trend describes the deformation behavior, the functional model will read: $E\{\underline{y}(t)\} = y_0 + rt$. When there are in addition q periodic signals in the data series, the functional model is extended to
$$E\{\underline{y}(t)\} = y_0 + rt + \sum_{k=1}^{q} \left(a_k \cos \omega_k t + b_k \sin \omega_k t\right) \qquad (9)$$
The two trigonometric terms cos and sin together represent a sinusoidal wave with, in general, a nonzero initial phase. The structure introduced above has the advantage of being linear. The unknown vector x consists of the intercept $y_0$, the slope r, and the coefficients $a_k$ and $b_k$. In case of a linear trend and annual and semiannual signals (q = 2), the design matrix A is of size m × 6. Its ith row at time instant $t_i$ is given as
$$\left[\,1 \quad t_i \quad \cos 2\pi t_i \quad \sin 2\pi t_i \quad \cos 4\pi t_i \quad \sin 4\pi t_i\,\right] \qquad (10)$$
where $t_i$ is expressed in years. In section 5.1, we show how to obtain an appropriate functional model.
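As a small illustration of this functional model, the design matrix for a linear trend plus periodic terms can be assembled as follows. The function name `design_matrix` and its `periods_years` argument are hypothetical; time is assumed to be expressed in years, so the annual and semiannual terms have periods 1 and 0.5.

```python
import numpy as np

def design_matrix(t_years, periods_years=(1.0, 0.5)):
    """Design matrix with columns [1, t, cos(w_k t), sin(w_k t), ...]."""
    t = np.asarray(t_years, dtype=float)
    columns = [np.ones_like(t), t]            # intercept and slope
    for period in periods_years:
        omega = 2.0 * np.pi / period          # angular frequency in rad/yr
        columns += [np.cos(omega * t), np.sin(omega * t)]
    return np.column_stack(columns)

# Ten years of daily epochs: A has shape (3650, 6) for q = 2
t = np.arange(3650) / 365.25
A = design_matrix(t)
```

Because the model is linear in the intercept, slope, and the cosine and sine coefficients, the amplitude and phase of each periodic term never have to be estimated directly.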
4.2. Stochastic Model
 If the time series of GPS coordinates is composed of white noise, flicker noise, and random walk noise with variances $\sigma_w^2$, $\sigma_f^2$, and $\sigma_{rw}^2$, respectively, the covariance matrix of the time series can then be written as ($Q_0 = 0$)
$$Q_y = \sigma_w^2 I + \sigma_f^2 Q_f + \sigma_{rw}^2 Q_{rw} \qquad (11)$$
where I is the m × m identity matrix and $Q_f$ and $Q_{rw}$ are the cofactor matrices relating to flicker noise and random walk noise, respectively. The structure of the $Q_y$ matrix is known through I, $Q_f$, and $Q_{rw}$, but the contributions through $\sigma_w$, $\sigma_f$, and $\sigma_{rw}$ are unknown. In section 5.2, we show how to improve an existing stochastic model.
 The elements of the flicker noise cofactor matrix Qf can be approximated by [Zhang et al., 1997]
where τ = ∣tj − ti∣. For evenly spaced data, the matrix Qf is a symmetric Toeplitz matrix that contains constant values along negative-sloping diagonals. It is important to note that the Hosking flicker noise covariance matrix, which was introduced and used by Williams [2003a], Langbein , Williams et al. , and Beavan , can also be used. The main difference is a scaling of the amplitudes. Therefore the flicker noise variances we use here are roughly one half the size of those quoted in these papers.
 A random walk process is derived by integrating white noise. Random walk noise is supposed to be zero at initial time t0. For evenly spaced data, the random walk cofactor matrix Qrw is expressed as
 The variance components $\sigma_w^2$, $\sigma_f^2$, and $\sigma_{rw}^2$ can now be estimated using the LS-VCE method.
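To illustrate how such a covariance matrix is assembled, the sketch below builds a random walk cofactor matrix by integrating (cumulatively summing) unit-variance white noise that starts from zero, which gives entries min(i, j) for evenly spaced data. Any scaling by the sampling interval is deliberately left out, so this is an assumption of the sketch rather than the exact cofactor matrix used in this paper. The flicker noise cofactor matrix is taken as a given input, and the function names are hypothetical.

```python
import numpy as np

def random_walk_cofactor(m):
    """Cofactor matrix of integrated unit white noise (entries min(i, j)).

    Assumes evenly spaced samples and a process that is zero at the
    initial time; sampling-interval scaling is omitted.
    """
    L = np.tril(np.ones((m, m)))   # cumulative-sum (integration) operator
    return L @ L.T

def covariance_matrix(sigma_w, sigma_f, sigma_rw, Qf, Qrw):
    """Qy = sigma_w^2 I + sigma_f^2 Qf + sigma_rw^2 Qrw."""
    m = Qrw.shape[0]
    return sigma_w**2 * np.eye(m) + sigma_f**2 * Qf + sigma_rw**2 * Qrw

Qrw = random_walk_cofactor(5)
```

Writing the cofactor matrix as a product of the integration operator with its transpose makes the statement that random walk is integrated white noise explicit, and it guarantees a positive definite matrix.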
4.3. Estimation of Time Correlation
 Let us now consider a stationary noise process. This process does not contain the random walk noise, i.e., $\sigma_{rw} = 0$ in equation (11). We consider a side-diagonal structure for the covariance matrix $Q_y$. This implies that the correlation between time series observations is only a function of time-lag $\tau = |t_j - t_i|$, i.e., $\sigma_{ij} = \sigma_\tau$. This structure is in fact similar to that of flicker noise introduced in equation (12). The only difference is that for flicker noise we have one variance as a scale, namely, $\sigma_f^2$, but here we employ one (unknown) covariance for each time-lag, namely, m components altogether. The covariance matrix can then be written as a linear combination of m cofactor matrices
$$Q_y = \sum_{\tau=0}^{m-1} \sigma_\tau Q_\tau \qquad (14)$$
where, for each value of τ, the m × m cofactor matrix $Q_\tau$ has only two parallel side-diagonals of ones located on both sides of the main diagonal and τ steps away.
 We can now apply the general LS-VCE approach to the special case of estimating time correlation in the time series. If we measure a functionally known quantity ($E\{\underline{y}\} = \mu_y$ in equation (2)), it can be shown that the (co)variances are simply estimated as [Teunissen and Amiri-Simkooei, 2007]
$$\hat{\underline{\sigma}}_\tau = \frac{\sum_{i=1}^{m-\tau} \hat{\underline{e}}_i\, \hat{\underline{e}}_{i+\tau}}{m - \tau}, \qquad \tau = 0, 1, \ldots, m-1 \qquad (15)$$
where the scalar $\hat{\underline{e}}_i$ is the ith element of the residual vector $\hat{\underline{e}} = P_A^{\perp}\,\underline{y}$. When τ = 0, the preceding equation gives the well-known expression for the estimator of the variance, $\hat{\underline{\sigma}}_0 = \hat{\underline{\sigma}}^2$. Equation (15) is identical to the so-called unbiased estimator for the autocorrelation function (ACF) of stationary zero-mean least squares residuals [Priestley, 1981]. The biased estimate obtained from MLE uses m instead of m − τ in the denominator of equation (15).
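 One can also compute the correlation coefficients that together represent the empirical autocorrelation function (ACF)
$$\hat{\underline{\rho}}_\tau = \frac{\hat{\underline{\sigma}}_\tau}{\hat{\underline{\sigma}}_0}, \qquad \tau = 1, \ldots, m-1 \qquad (16)$$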
Application of the error propagation law to the linearized form of the preceding equation gives
which shows that the precision of the autocorrelation function gets poorer as τ increases. This means that the correlation of long-memory processes is poorly estimable for large time-lags τ. Therefore, when a predefined noise structure such as power law noise is available, one may still prefer to use that structure, which is readily possible with LS-VCE. The above formulation of time correlation then serves to give a general impression of the noise behavior (see section 6.3.1).
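The lag-wise estimator described in this section, with m − τ in the denominator, is simple enough to write out directly. The following is a minimal sketch with hypothetical function names; it takes the least squares residuals as input and returns the autocovariances and the empirical ACF.

```python
import numpy as np

def autocovariance(residuals, max_lag=None):
    """Unbiased lag-tau autocovariances: sum(e_i * e_{i+tau}) / (m - tau)."""
    e = np.asarray(residuals, dtype=float)
    m = e.size
    max_lag = m - 1 if max_lag is None else max_lag
    return np.array([e[:m - tau] @ e[tau:] / (m - tau)
                     for tau in range(max_lag + 1)])

def autocorrelation(residuals, max_lag=None):
    """Empirical ACF: autocovariances divided by the lag-0 value."""
    s = autocovariance(residuals, max_lag)
    return s / s[0]

# Tiny worked case: an alternating series is perfectly anticorrelated at lag 1
acf = autocorrelation([1.0, -1.0, 1.0, -1.0])
```

Only m − τ products enter the estimate at lag τ, which is why the estimates at large lags rest on very few data and become imprecise.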
5. Model Identification
5.1. Least Squares Harmonic Estimation (LS-HE)
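 In this section, we aim to determine an adequate design matrix, A, for the functional model through parameter significance testing. For a time series $y^T = [y_1, y_2, \ldots, y_m]$ defined on $\mathbb{R}^m$, we assume that it can be expressed as a linear trend plus a sum of q individual trigonometric terms, i.e., $E\{\underline{y}(t)\} = y_0 + rt + \sum_{k=1}^{q} (a_k \cos \omega_k t + b_k \sin \omega_k t)$ (see equation (9)). In matrix notation, we may write
$$E\{\underline{y}\} = Ax + \sum_{k=1}^{q} A_k x_k, \qquad D\{\underline{y}\} = Q_y \qquad (18)$$
where the design matrix A contains two columns of the linear regression terms and the matrix $A_k$ consists of two columns corresponding to the frequency $\omega_k$ of the sinusoidal function
$$A_k = \begin{bmatrix} \cos \omega_k t_1 & \sin \omega_k t_1 \\ \cos \omega_k t_2 & \sin \omega_k t_2 \\ \vdots & \vdots \\ \cos \omega_k t_m & \sin \omega_k t_m \end{bmatrix}, \qquad x_k = \begin{bmatrix} a_k \\ b_k \end{bmatrix} \qquad (19)$$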
with ak, bk, and ωk being (un)known real numbers. On the one hand, if the frequencies ωk are known, one will deal with the most popular (linear) least squares problem to estimate amplitudes ak and bk's. Petrov and Ma  studied harmonic position variations of 40 VLBI stations at 32 known tidal frequencies. They found that the estimates of station displacements generally agree with the ocean loading computed on the basis of modern ocean tide models for the main diurnal and semidiurnal tides. On the other hand, if the frequencies ωk are unknown, the problem of finding these unknown parameters is the task of least squares harmonic estimation.
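 The problem now is to find the set of frequencies $\omega_1, \ldots, \omega_q$, and in particular the value q, in equation (18). The following null and alternative hypotheses are put forward (to start, set i = 1):
$$H_0: \; E\{\underline{y}\} = Ax + \sum_{k=1}^{i-1} A_k x_k \qquad (20)$$
versus
$$H_a: \; E\{\underline{y}\} = Ax + \sum_{k=1}^{i-1} A_k x_k + A_i x_i \qquad (21)$$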
The detection and validation of ωi is completed through the following two steps:
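 Step I: The goal is to find the frequency $\omega_i$ (and correspondingly $A_i$) by solving the following minimization problem:
$$\omega_i = \arg\min_{\omega_j} \left\| P^{\perp}_{[\bar{A}\; A_j]}\, \underline{y} \right\|^2_{Q_y^{-1}} = \arg\min_{\omega_j} \left\| \hat{\underline{e}}_a \right\|^2_{Q_y^{-1}} \qquad (22)$$
where $\|\cdot\|^2_{Q_y^{-1}} = (\cdot)^T Q_y^{-1} (\cdot)$, $\bar{A} = [A \; A_1 \; \ldots \; A_{i-1}]$, and $\hat{\underline{e}}_a$ is the vector of least squares residuals under the alternative hypothesis. The matrix $A_j$ has the same structure as $A_k$ in equation (19); the one that minimizes the preceding criterion is set to be $A_i$. The above minimization problem is equivalent to the following maximization problem [Teunissen, 2000, p. 96]:
$$\omega_i = \arg\max_{\omega_j} \left\| P_{\bar{A}_j}\, \underline{y} \right\|^2_{Q_y^{-1}}, \qquad \bar{A}_j = P^{\perp}_{\bar{A}} A_j \qquad (23)$$
where $P^{\perp}_{\bar{A}} = I - \bar{A} (\bar{A}^T Q_y^{-1} \bar{A})^{-1} \bar{A}^T Q_y^{-1}$ and $P_{\bar{A}_j} = \bar{A}_j (\bar{A}_j^T Q_y^{-1} \bar{A}_j)^{-1} \bar{A}_j^T Q_y^{-1}$. The preceding equation simplifies to
$$\omega_i = \arg\max_{\omega_j} \; \hat{\underline{e}}_0^T Q_y^{-1} A_j \left( A_j^T Q_y^{-1} P^{\perp}_{\bar{A}} A_j \right)^{-1} A_j^T Q_y^{-1} \hat{\underline{e}}_0 \qquad (24)$$
with $\hat{\underline{e}}_0 = P^{\perp}_{\bar{A}}\, \underline{y}$ the least squares residuals under the null hypothesis. In case the time series contains only white noise, namely, $Q_y = \sigma^2 I$, it follows that
$$\omega_i = \arg\max_{\omega_j} \; \hat{\underline{e}}_0^T A_j \left( A_j^T P^{\perp}_{\bar{A}} A_j \right)^{-1} A_j^T \hat{\underline{e}}_0 \qquad (25)$$
 Analytical evaluation of the above maximization problem is complicated. In practice, one has to be satisfied with numerical evaluation. A plot of spectral values $\|P_{\bar{A}_j}\, \underline{y}\|^2_{Q_y^{-1}}$ versus a set of discrete values for $\omega_j$ can be used as a tool to investigate the contribution of different frequencies in the construction of the original time series. That is, we can compute the spectral values for different frequencies using equations (24) or (25). The frequency at which $\|P_{\bar{A}_j}\, \underline{y}\|^2_{Q_y^{-1}}$ achieves its maximum value is used to construct $A_i$.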
where $\bar{A}_i = P^{\perp}_{\bar{A}} A_i$ and the estimator for the variance, $\hat{\underline{\sigma}}_a^2$, has to be computed under the alternative hypothesis. Under H0, the test statistic has a central Fisher distribution
The above hypothesis testing is in fact the parameter significance test because the test statistic T2 can also be expressed in terms of $\hat{\underline{x}}_i$ in equation (21) and its covariance matrix. If the null hypothesis is rejected, we can increase i by one step and perform the same procedure for finding yet another frequency. As a generalization of the Fourier spectral analysis, the method is neither limited to evenly spaced data nor to integer frequencies.
 Our application of harmonic estimation, in the first place, is to find any potential periodicities in the series. The remaining unmodelled effects (for example, power law noise) will also be interpreted and captured by a set of harmonic functions. Once we compensate for these effects in the functional model, the remaining noise characteristics of the series will be assessed. A nearly white noise combined with autoregressive noise can be shown to remain in the data series.
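The numerical evaluation of the spectral values in Step I can be sketched for the white noise case as follows. The function name `ls_he_spectrum`, the trial-frequency grid, and the simulated series in the usage lines are illustrative assumptions. The sketch only covers the frequency search, not the subsequent significance test, and it uses a QR factorization to project out the columns already in the model instead of forming the m × m projector.

```python
import numpy as np

def ls_he_spectrum(y, t, A_bar, omegas):
    """Spectral values ||P_{A_bar_j} y||^2 for white noise (Qy ~ I).

    A_bar holds the columns already in the model (trend and any
    previously detected harmonics); omegas are trial frequencies.
    """
    Qb, _ = np.linalg.qr(A_bar)               # orthonormal basis of range(A_bar)
    def project_out(X):
        return X - Qb @ (Qb.T @ X)            # applies P_perp without forming it
    spectrum = np.empty(len(omegas))
    for j, w in enumerate(omegas):
        Aj = np.column_stack([np.cos(w * t), np.sin(w * t)])
        Abar_j = project_out(Aj)
        u = Abar_j.T @ y                      # equals A_j^T e_0
        spectrum[j] = u @ np.linalg.solve(Abar_j.T @ Abar_j, u)
    return spectrum

# Illustrative series: trend, annual and semiannual signals, white noise
rng = np.random.default_rng(1)
t = np.arange(3650) / 365.25
y = (1.0 + 0.5 * t + 2.0 * np.sin(2 * np.pi * t + 0.3)
     + 1.0 * np.sin(4 * np.pi * t) + 5.0 * rng.standard_normal(t.size))
periods = np.linspace(0.3, 2.0, 341)          # trial periods in years
A_bar = np.column_stack([np.ones_like(t), t])
S = ls_he_spectrum(y, t, A_bar, 2 * np.pi / periods)
best_period = periods[np.argmax(S)]
```

To search for a second frequency, one would append the detected cosine and sine columns to `A_bar` and repeat the scan, mirroring the step-by-step increase of i described above.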
5.2. The w-Test Statistic
 Here we aim to determine the appropriate covariance matrix $Q_y$ through significance testing of the stochastic model. One advantage of LS-VCE over other methods is that one can use statistical hypothesis testing in the stochastic model (similar to what is done with the functional model). When there is no misspecification in the functional part of the model $E\{\underline{y}\} = Ax$, the following two hypotheses, as an example, are considered:
$$H_0: \; Q_y = \sigma_w^2 I \qquad \text{versus} \qquad H_a: \; Q_y = \sigma_w^2 I + C_y \delta \qquad (28)$$
where Cy is a known cofactor matrix, for example, Qf or Qrw, and δ is an unknown (co)variance parameter. We can use the generalized likelihood ratio test for testing H0 against Ha. If we do so, the following w-test statistic can be obtained [Amiri-Simkooei, 2007]:
with b = m − n the redundancy of the functional model and $\hat{\underline{e}}$ the least squares residuals under the null hypothesis. The orthogonal projector $P_A^{\perp}$ is also given under the null hypothesis.
 The expectation and the variance of the w-test statistic are 0 and 1, respectively. The distribution of this statistic, for large m, can be approximated by the standard normal distribution. The goal now is to compute the w-test statistic values for different alternative hypotheses, i.e., different $C_y$'s in the preceding equation, and to select the one that gives the maximum value for the w-test. In fact, equation (29) provides us with an objective measure to judge whether, and which, additional noise processes are likely to be present in the data at hand. Because of the special structure of the above hypotheses, the numerical evaluation of the preceding test statistic is very simple. We do not need, for instance, to invert a full covariance matrix since it is diagonal under the (assumed) null hypothesis, namely, $Q_y = \sigma_w^2 I$.
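A statistic of this kind can be sketched numerically under the white noise null hypothesis. The form used below is an assumption of this sketch and may differ in detail from equation (29): it compares the quadratic form of the residuals in $C_y$, scaled by the estimated variance, with its expectation, and standardizes the difference with a first-order (delta-method) variance. The function name `w_test_white_null` and the simulated series are hypothetical.

```python
import numpy as np

def w_test_white_null(y, A, C):
    """Standardized test of an extra noise component with cofactor C.

    Assumes Qy = sigma^2 I under the null hypothesis, with sigma^2
    estimated from the residuals (first-order variance approximation).
    """
    m, n = A.shape
    b = m - n                                 # redundancy
    Qb, _ = np.linalg.qr(A)
    P = np.eye(m) - Qb @ Qb.T                 # projector for Qy ~ I
    e = P @ y                                 # residuals under the null
    CP = C @ P
    tr1 = np.trace(CP)
    tr2 = np.trace(CP @ CP)
    numerator = b * (e @ C @ e) / (e @ e) - tr1
    denominator = np.sqrt(2.0 * tr2 - 2.0 * tr1**2 / b)
    return numerator / denominator

# Illustrative check: random walk cofactor, with and without a random walk
rng = np.random.default_rng(2)
m = 300
t = np.arange(m) / 365.25
A = np.column_stack([np.ones(m), t])
idx = np.arange(1, m + 1)
Qrw = np.minimum.outer(idx, idx).astype(float)
white = rng.standard_normal(m)
walk = np.cumsum(rng.standard_normal(m))
w_white = w_test_white_null(white, A, Qrw)
w_rw = w_test_white_null(white + walk, A, Qrw)
```

No full covariance matrix is inverted here, only the identity-weighted projector is needed, which reflects the computational simplicity noted above. In practice one would evaluate the statistic for several candidate cofactor matrices and compare the resulting values.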
5.3. Demonstration Using Simulated Data
 To illustrate how the proposed LS-HE and LS-VCE work, we simulated a 10-year time series (daily samples) containing only white noise with a standard deviation of 5 mm. Two sinusoidal functions with amplitude of 2 and 1 mm, respectively, for the annual and semiannual term, have then been added to the data. We now use LS-HE to find the frequencies (or periods) of these signals. This was repeated 100 times and it follows that the empirical standard deviation of the detected periods is 1.4 and 0.7 days for the annual and semiannual term, respectively. Figure 1 shows one typical example of application of the method to find the periods of harmonic functions. In the first step, the annual term is detected and in the second step the semiannual term.
 A correlogram portrays the autocorrelation versus time-lag τ. Figure 2 shows the typical example of the simulated data corresponding to Figure 1. In each graph, the top window is the time series itself, the middle shows the running averages, and the bottom gives the autocorrelation coefficients obtained from LS-VCE using equation (16). In case of pure white noise, the autocorrelation function behaves randomly around zero. When both annual and semiannual terms are added, the autocorrelation function (ACF) shows a periodic behavior that resembles the periodicity of the annual signal. This makes sense because the ACF of a sinusoidal wave is again a sinusoidal wave with the same frequency. But the amplitude is proportional to the square of the amplitude of the original signal [Priestley, 1981]. If one removes the annual term, the ACF will still show a periodicity which is due to the presence of the semiannual signal. When one also removes this signal, the ACF becomes very similar to the case of pure white noise.
 One can also compute the values of the w-test statistic using equation (29) for the different cases mentioned above. The cofactor matrix is chosen as that of flicker noise Cy = Qf. Based on the simulation of 20 data sets, the w-test values on average become as follows: in presence of annual and semiannual signals w = 15.3; removed annual signal w = 1.9; and removed both annual and semiannual signals w = 0.3. Using LS-VCE, white and flicker noise amplitudes were estimated. The amplitudes on average are as follows: in presence of both signals σw = 4.85 mm and σf = 2.84 mm; removed annual term σw = 4.98 mm and σf = 1.35 mm; and removed both terms σw = 5.00 mm and σf = 0.07 mm. These all together simply express that if there exist unmodelled effects in data, they can be misinterpreted as time correlation (here flicker noise). One should therefore take care of these signals in the functional model.
6. Numerical Results and Discussions
6.1. Data and Model Description
 Global time series of site positions are supposed to have more noise than those from a regional solution [Williams et al., 2004]. In this study, the daily GPS global solutions of different stations processed by the GPS Analysis Center at JPL are adopted. The data were processed using the precise point positioning method in the GIPSY software [Zumberge et al., 1997]. The satellite orbits, satellite clocks, and Earth rotation parameters (ERP) used for the daily solutions were estimated with data from 42 globally distributed IGS tracking stations [see Beutler et al., 1999]. In addition, corrections for geophysical effects such as pole and ocean tide effects have been applied. The reader is referred to the JPL Web site [http://sideshow.jpl.nasa.gov/mbh/series.html].
 The estimated coordinates of a site are uncorrelated with those of the other sites if the effects of the common errors in the satellite orbits and clocks and ERP on the estimated coordinates are insignificant. To make a proper statement, one can rely on multivariate time series analysis methods. The time series are processed on a component-by-component basis in this study.
 Most of the results given are based on five stations, namely, KOSG, WSRT, ONSA, GRAZ, and ALGO. Four stations are in Europe: KOSG and WSRT in the Netherlands, ONSA in Sweden, and GRAZ in Austria. ALGO is in Canada. We have used 10 years of daily solutions for all sites except WSRT, which covers only 6.5 years. To justify some of the statements that we will make, 71 globally distributed GPS stations were also processed. Our point of departure is the original time series and its linear model of observation equations $E\{\underline{y}(t)\} = y_0 + rt$. In most cases, the annual and semiannual signals have been considered as well. At times, we have included a set of harmonic functions to compensate for (parts of) unmodelled effects in the series.
6.2. Variance Component Analysis
 Three stochastic models have been chosen to describe the noise characteristics of GPS coordinate time series. They include the pure white noise model (I), the white plus flicker noise model (IIa), and the white plus random walk noise model (III). We employed LS-VCE to estimate the white noise, flicker noise, and random walk noise amplitudes (see equation (11)). Williams' investigations (S. D. P. Williams, Proudman Oceanographic Laboratory, personal communication, 2006) show that LS-VCE gives the same results as MLE. This holds in fact if m ≫ n, which is usually the case in time series analysis. Table 1 gives the noise amplitudes of different components for different stochastic models. The table also provides the precision (standard deviation) of the estimates using equation (7). We find, for different noise components, that the horizontal components are less noisy than the vertical components by a factor of 2–4. Compared to the white noise model only, the amplitude of white noise for the white plus flicker noise model is 30% smaller, while this reduction for white plus random walk noise model is about 20%.
Table 1. White Noise, Flicker Noise, and Random Walk Noise Amplitude Estimates for North, East, and Vertical Components of Site Time Series in Three Different Stochastic Models (White Noise Only (Model I); White Noise Plus Flicker Noise (Model IIa); and White Noise Plus Random Walk Noise (Model III)); Functional Model Is Linear Regression Model
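The estimation step can be sketched as follows. This is a minimal illustration of the LS-VCE normal equations for a covariance model Qy = Σ σk Qk, written from the general form of the method rather than copied from equations (7) and (11); the function names are hypothetical, and only white noise and random walk noise are used because the flicker noise cofactor matrix of equation (12) is not reproduced here.

```python
import numpy as np

def random_walk_cofactor(t):
    """Cofactor matrix with elements min(t_i, t_j): a random walk started at t = 0."""
    t = np.asarray(t, dtype=float)
    return np.minimum.outer(t, t)

def ls_vce(A, y, cofactors, n_iter=20):
    """Iterate the normal equations N * sigma = l for Qy = sum_k sigma_k * Q_k.

    Returns the variance components and inv(N), which approximates the
    covariance matrix of the estimators.
    """
    m, p = len(y), len(cofactors)
    sigma = np.ones(p)  # initial guess (assumption: unit variances)
    for _ in range(n_iter):
        Qy = sum(s * Q for s, Q in zip(sigma, cofactors))
        Qy_inv = np.linalg.inv(Qy)
        # Orthogonal projector P = I - A (A^T Qy^-1 A)^-1 A^T Qy^-1
        P = np.eye(m) - A @ np.linalg.solve(A.T @ Qy_inv @ A, A.T @ Qy_inv)
        R = Qy_inv @ P            # R @ y equals Qy^-1 times the residual vector
        r = R @ y
        RQ = [R @ Q for Q in cofactors]
        N = np.array([[0.5 * np.trace(RQ[k] @ RQ[j]) for j in range(p)]
                      for k in range(p)])
        l_vec = np.array([0.5 * r @ Q @ r for Q in cofactors])
        sigma = np.linalg.solve(N, l_vec)
    return sigma, np.linalg.inv(N)

# Synthetic example: white noise (variance 1) plus random walk (variance 0.09).
rng = np.random.default_rng(0)
m = 300
t = np.arange(1.0, m + 1.0)
A = np.column_stack([np.ones(m), t])
y = 2.0 + 0.01 * t + rng.normal(0.0, 1.0, m) + np.cumsum(rng.normal(0.0, 0.3, m))
sigma_hat, cov = ls_vce(A, y, [np.eye(m), random_walk_cofactor(t)])
print("variance components:", sigma_hat)
print("their standard deviations:", np.sqrt(np.diag(cov)))
```

For a single component Qy = σ²I the scheme reduces to the familiar estimator ê'ê/(m − n), which is a convenient check of an implementation.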
 In a similar manner to Zhang et al. and Williams et al., we produced the difference in the log likelihood values for each site, each component, and each error model. The results are given in Table 2. The values given in this table are normalized such that the pure white noise model has a log likelihood of zero. These results indicate that the white noise plus flicker noise model is preferred over both the pure white noise model and the white noise plus random walk noise model, which coincides with the findings of Williams et al. for global solutions. We will give the results of the w-test statistic in section 6.4.2 and show to what extent they differ from those obtained by MLE.
Table 2. Difference in Log-Likelihood Values for Model With White Noise Plus Flicker Noise (WN + FN) Versus White Noise Plus Random Walk Noise (WN + RW), Both Compared With Pure White Noise (Log = 0)
IIa: WN + FN
III: WN + RW
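The kind of comparison reported in Table 2 can be sketched as follows, assuming the standard Gaussian log likelihood evaluated at the weighted least squares residuals. The function names and the synthetic data are ours. In this toy example the noise amplitudes of the competing model are taken from the simulation rather than estimated, so the printed difference only illustrates the normalization against the pure white noise model.

```python
import numpy as np

def gaussian_log_likelihood(A, y, Qy):
    """ln L = -0.5 * (ln det Qy + e^T Qy^-1 e + m ln 2 pi), with e the WLS residuals."""
    m = len(y)
    Qy_inv_A = np.linalg.solve(Qy, A)
    x_hat = np.linalg.solve(A.T @ Qy_inv_A, Qy_inv_A.T @ y)
    e_hat = y - A @ x_hat
    _, logdet = np.linalg.slogdet(Qy)
    quad = e_hat @ np.linalg.solve(Qy, e_hat)
    return -0.5 * (logdet + quad + m * np.log(2.0 * np.pi))

def ols_variance(A, y):
    """Residual variance of an unweighted fit, used for the pure white noise model."""
    e = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return e @ e / (len(y) - A.shape[1])

# Synthetic series with white noise plus random walk noise (hypothetical amplitudes).
rng = np.random.default_rng(1)
m = 200
t = np.arange(1.0, m + 1.0)
A = np.column_stack([np.ones(m), t])
y = 0.01 * t + rng.normal(0.0, 1.0, m) + np.cumsum(rng.normal(0.0, 0.5, m))

ll_white = gaussian_log_likelihood(A, y, ols_variance(A, y) * np.eye(m))
ll_white_rw = gaussian_log_likelihood(A, y, np.eye(m) + 0.25 * np.minimum.outer(t, t))
print("log likelihood difference relative to white noise:", ll_white_rw - ll_white)
```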
 The number of iterations in VCE methods is in fact an indication of (the lack of) appropriateness of the selected stochastic model. Figure 3 gives typical examples of estimated variance components at each iteration step for two variance component models. The graphs show that the flicker and random walk noise variances systematically converge to their final estimates from one side. This confirms the presence of misspecification in the model, which will in fact result in overestimated (biased upward) flicker and random walk noise variances. This coincides with the findings of Langbein  on EDM data (conversely white noise is biased downward). This is mainly due to the presence of ignored noise at high frequencies which leaks, in a systematic way, into lower frequencies (see section 6.4).
 The overestimation can also be verified when we compute the position errors and compare them with the scatter of the time series themselves (see section 6.6). The overestimation of random walk noise is more significant than that of flicker noise. This also means that the white plus flicker noise model is the preferred model in these circumstances. In other words, if the correlated noise in the time series is flicker noise and one tries to estimate the amplitude of the random walk noise, then the result will be biased (Williams, personal communication, 2006; see also section 3.3).
 These discussions essentially mean that there are still misspecifications (in fact underparameterization) in the model. In the functional model for instance, one should take care of any potential periodicities in the series (see section 6.3). Also, a better stochastic model may include in addition to power law noise other noise models like autoregressive noise (see section 6.4). For EDM observations, Langbein  proposed to use a combination of power law noise and band-pass-filtered noise. The upward bias of power law noise and downward bias of white noise can thus be circumvented by introducing a more sophisticated functional and stochastic model.
6.3. Functional Model
6.3.1. Simple and Intuitive Technique
 Seasonal variations in site positions consist of signals from various geophysical sources and systematic modeling errors [Dong et al., 2002]. The weekly, monthly, and yearly mean residuals calculated from averaging the daily residuals are shown in Figure 4 (left) for ALGO. Running averages naturally remove the high-frequency noise and leave the lower frequency signals. Annual and seasonal variations can be observed in the running averages. For example, the vertical component shows a clear annual signal. They should be eliminated from the time series in order to obtain a more realistic assessment of the noise behavior. We can also see some high- and low-frequency fluctuations, which can likely be captured by flicker noise and random walk noise, respectively (see east and north components, respectively). When parts of these variations have a deterministic behavior, they should be compensated for in the functional rather than the stochastic model. Ding et al.  tried to interpret this behavior as some interannual signals.
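The weekly, monthly, and yearly means mentioned above can be sketched with a simple moving average. The window lengths of 7, 30, and 365 days and the synthetic residuals are our assumptions for illustration.

```python
import numpy as np

def running_mean(x, window):
    """Moving average over `window` samples; only fully covered positions are returned."""
    kernel = np.ones(window) / window
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="valid")

# Synthetic daily residuals: an annual signal buried in white noise (hypothetical, mm).
rng = np.random.default_rng(2)
t = np.arange(3650.0)
residuals = 2.0 * np.sin(2.0 * np.pi * t / 365.25) + rng.normal(0.0, 5.0, t.size)

weekly = running_mean(residuals, 7)
monthly = running_mean(residuals, 30)
yearly = running_mean(residuals, 365)
print(len(weekly), len(monthly), len(yearly))
```

The longer the window, the more high-frequency noise is suppressed; a yearly window also averages out most of an annual signal, leaving only the slower fluctuations.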
 We now focus on time correlation in the series and estimate one covariance for each time-lag τ using equations (15) and (16). Figure 5 (top) shows the autocorrelation coefficients for the time series of each component of two sites. The annual and seasonal variations as well as long-term fluctuations can be seen in the correlograms. The variations are clearer here than those for the running averages in Figure 4. For example, the periodic behavior in the ALGO vertical correlogram shows the annual signal in the series. When the annual signal was included in the functional model, the annual periodicity of the correlogram disappeared. However, this was not the case for the KOSG and ALGO east components, which show an annual-like signal. This implies that there might still be some hidden periodicities in the data series.
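A correlogram of this kind can be sketched with the ordinary sample autocorrelation. Note that this is a stand-in: the covariance estimators of equations (15) and (16) are not reproduced here, and the normalization by the series length is our assumption.

```python
import numpy as np

def autocorrelation(e, max_lag):
    """Sample autocorrelation coefficients for lags 0..max_lag (biased normalization)."""
    e = np.asarray(e, dtype=float)
    e = e - e.mean()
    n = len(e)
    c0 = np.dot(e, e) / n
    return np.array([np.dot(e[:n - k], e[k:]) / n / c0 for k in range(max_lag + 1)])

# A series with an annual signal produces a correlogram that oscillates with the same period.
rng = np.random.default_rng(3)
t = np.arange(3650.0)
series = 3.0 * np.sin(2.0 * np.pi * t / 365.25) + rng.normal(0.0, 2.0, t.size)
rho = autocorrelation(series, 400)
print("lag 0:", rho[0], " lag 183:", rho[183], " lag 365:", rho[365])
```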
6.3.2. Harmonic Estimation
Figure 6 (left) shows the test statistic values given by equation (26) to find the first 15 frequencies. The step size used for the trial periods Tj is taken small at high frequencies and gets larger at lower frequencies. We can see that the value for the test statistic levels off quickly. With 6–10 harmonic functions, it gets close to the critical value. In all subsequent results, the number of harmonic functions q in equation (18) was set to 10, starting with just the offset and slope model. The combination of all 10 harmonic functions included in the functional model of the series is given in Figure 6 (right). The periods of the 10 harmonic functions are given in Table 3. Our opinion is that the periodic functions detected by the LS-HE method are due to the following four reasons:
Table 3. Periods (Days) of the First Ten Harmonic Functions Obtained From Least Squares Harmonic Estimation for North, East, and Vertical Components Using Equation (25) With $Q_y = \sigma_w^2 I$ and a Linear Regression Model
 1. Unmodelled periodic ground motion: The site is actually moving periodically in this case. Annual and semiannual signals, for instance, can be specified into this category. Except for a few components, both annual and semiannual signals can be seen in the series. A good example is the first period obtained for ALGO vertical component (366 days) that reveals the annual signal.
 2. Periodic variation of the estimated time series: The site is apparently moving periodically. This is known as the aliasing effect. Unmodelled periodic systematic (for example, tidal) errors present at a station will result in spurious longer periodic systematic effects in the resultant time series [see Penna and Stewart, 2003; Stewart et al., 2005]. A harmonic function with a period of 13.63 days is detected in the north components of WSRT and ALGO. In the east component of KOSG and WSRT, a period of 14.2 days is seen. There are also periods ranging from 170 to 180 days that coincide with those given by Penna et al.  for unmodeled S2 ocean tide loading effect at different globally distributed sites.
 3. Aliased multipath effect (still a challenging problem): We observe periodic patterns with periods of roughly 350, 175, 117, 88, 70, 59, 50, and 44 days. To justify this, the time series of 71 GPS stations were processed. Figure 7 shows the stacked (weighted) power spectra for these stations after including the annual and the semiannual signals. The peaks shown in the spectrum coincide well with the numbers given above. The set of stations was split into two parts and the same conclusion could again be drawn. The results also confirm Ray's findings. Two possibilities which may lead to this effect are as follows. (a) Agnew and Larson show that the repeat time of the GPS constellation, through which multipath can repeat at permanent stations, averages 247 s less than a solar day (24 h). However, daily GPS position estimates are based on a full solar day. The difference will alias to a frequency of 0.0028565 cycles/day or 1.04333 cycles/year (a period of 350 days). The periods found fit with this frequency and its harmonics. (b) Periodic variations of the range residuals with maxima at the eclipse seasons indicate orbit modeling deficiencies for the GPS satellites [Urschl et al., 2005, 2006]. The periods found above also coincide with the period of one draconitic GPS year (about DJ = 351 days) and its fractions DJ/n, n = 2, …, 8 [see Beutler, 2006].
 4. Presence of power law noise: There are still many numbers in the table (about 50%) that do not fit into the previous categories. Williams (personal communication, 2006) argues that the 10 harmonic functions uniformly distributed in log-frequency space would be sufficient to simulate power law noise. The higher frequency effects are likely due to flicker noise. Long-term period (for example, larger than 1000 days) effects are observed for most of the series. Parts of these effects can likely be considered as random walk noise. Note also that undetected offsets in the time series can mimic random walk noise [Williams, 2003b]. We used so far equation (25) based on the pure white noise model to detect the frequencies. Therefore some of the detected periods are most likely due to the presence of colored noise in the data that has been ignored in the results of Table 3 and Figure 7 (left). The graph shows that white noise is mainly present at high frequencies, flicker noise at medium frequencies, and random walk noise at low frequencies. To justify this, in the harmonic estimation, we used equation (24) with a more sophisticated noise model given by equation (30). Most of the lower frequency effects that were detected in the white noise model could not be detected here. Figure 7 (right) shows the stacked power spectra of 71 stations using this new stochastic model. The spectrum looks more or less flat and thus does not contradict our statement.
 Therefore to avoid biases in the estimate of x and also the amplitude of noise components (see section 3.3), one should take good care of any potential periodicities in the GPS position time series. We can at least mention the annual and the semiannual signals, signals with the periods of 13.66, 14.2, and 14.8 days, and most likely signals with the periods of 350 days and its fractions.
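The detection step discussed in this section can be sketched as a search over trial periods. The sketch below assumes Qy = σ²I (so the scale factor 1/σ² is dropped) and evaluates, for each trial period, the reduction in the residual sum of squares obtained by adding one cosine/sine pair to the linear model. It follows the general idea of least squares harmonic estimation, but the function names, the coarse grid of trial periods, and the synthetic data are our assumptions, and equations (24) to (26) themselves are not reproduced.

```python
import numpy as np

def project_out(A, V):
    """Part of V orthogonal to the columns of A (orthogonal projector for Qy = I)."""
    coef, *_ = np.linalg.lstsq(A, V, rcond=None)
    return V - A @ coef

def harmonic_power(A, y, t, trial_periods):
    """Spectral value per trial period: e0^T Ak (Ak^T P Ak)^-1 Ak^T e0."""
    e0 = project_out(A, y)                      # residuals under the null model
    power = []
    for period in trial_periods:
        omega = 2.0 * np.pi / period
        Ak = np.column_stack([np.cos(omega * t), np.sin(omega * t)])
        g = Ak.T @ e0
        Nk = Ak.T @ project_out(A, Ak)
        power.append(g @ np.linalg.solve(Nk, g))
    return np.array(power)

# Noise-free synthetic series with one hidden annual signal (hypothetical values).
t = np.arange(3650.0)
A = np.column_stack([np.ones_like(t), t])
y = 1.0 + 0.002 * t + 3.0 * np.sin(2.0 * np.pi * t / 365.25)
trial_periods = np.array([100.0, 182.625, 250.0, 365.25, 500.0, 1000.0])
power = harmonic_power(A, y, t, trial_periods)
print("detected period (days):", trial_periods[int(np.argmax(power))])
```

In practice the detected harmonic would be appended to A and the search repeated, which is how a set of 10 functions can be built up one at a time.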
6.4. Stochastic Model
6.4.1. Simple and Intuitive Technique
Figure 4 (right) shows the weekly, monthly, and yearly running average of residuals when 10 harmonic functions are included to describe unmodelled effects (site ALGO). Most of the signals and fluctuations have now been removed. One can also plot the autocorrelation coefficients for the corrected functional model. The correlograms of the time series are given in Figure 5 (middle). At first sight, they appear to represent white noise. The graphs at the bottom provide a “zoom-in” on the first part of the graphs at the top and in the middle, i.e., the correlograms over the first 100 days. Unlike for the original data, the autocorrelation coefficients already become small after a few days (at most 10 days). Our impression is that this remaining high-frequency correlation can be caused by common and well-known sources of errors like atmospheric effects and satellite orbit errors. Table 4 provides the numerical results over the first 5 days. The correlation coefficients decrease approximately exponentially, for example, as $e^{-\alpha\tau}$, which is known as a first-order autoregressive noise process AR(1). More results on this model are presented in the next section for the cases α = 1 and α = 0.25.
Table 4. Estimated Time Correlation Over the First Five Time-Lags τ (Days) for Time Series of North, East, and Vertical Components Before (Left) and After (Right) Removing Harmonic Functions; Standard Deviation of All Estimates Is 0.02
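The exponential decay of the correlation coefficients can be quantified by a straight-line fit to their logarithms. The sketch below is an illustration only: it allows a scale factor b in front of the exponential (the coefficient at lag zero is always 1, so a white noise contribution shows up as b < 1), it requires positive coefficients, and the input values are synthetic rather than those of Table 4.

```python
import numpy as np

def fit_exponential_decay(lags, rho):
    """Fit rho(tau) ~ b * exp(-alpha * tau) for tau > 0; returns (alpha, b)."""
    lags = np.asarray(lags, dtype=float)
    rho = np.asarray(rho, dtype=float)   # must be strictly positive
    slope, intercept = np.polyfit(lags, np.log(rho), 1)
    return -slope, np.exp(intercept)

def ar1_cofactor(t, alpha):
    """Cofactor matrix with elements exp(-alpha * |t_j - t_i|)."""
    t = np.asarray(t, dtype=float)
    return np.exp(-alpha * np.abs(np.subtract.outer(t, t)))

# Synthetic correlation coefficients over the first five lags (hypothetical values).
lags = np.arange(1.0, 6.0)
rho = 0.3 * np.exp(-0.25 * lags)
alpha_hat, b_hat = fit_exponential_decay(lags, rho)
print("alpha:", alpha_hat, " scale:", b_hat)

Qa = ar1_cofactor(np.arange(5.0), alpha_hat)
print("correlation at lag 1:", Qa[0, 1])
```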
 The results of the w-test statistic are presented to find the most appropriate noise model for the global GPS coordinate time series. The larger (absolute value) the w-test statistic is, the more powerfully the null hypothesis tends to be rejected, and hence the more likely the alternative model will be as a candidate for the noise in the time series. Six stochastic models were tested using the hypotheses as in equation (28). The results are given in Table 5. For the original data (before removing harmonic functions), the maximum values are obtained for flicker noise and random walk noise models (columns w3 and w4). In addition, except for a few components (for example, ALGO north), flicker noise is preferred to random walk noise. When the values are very close (for example, ALGO vertical), both noise components (in addition to white noise) are likely to be present in the series. In other words, the power law noise is well described with a spectral index between −1 and −2.
Table 5. The w-Test Statistic Values for Time Series of Position Estimate, Before (Left) and After (Right) Removing 10 Harmonic Functions Using Equation (25), for Different Alternative Hypotheses (Different Cy's)a
Before Removing Harmonic Functions
After Removing Harmonic Functions
w1: $C_y = Q_\tau$ with τ = 1, which contains only ones on the two side-diagonals next to the main diagonal; w2: $C_y = \mathrm{diag}(Q_{rw})$, only the diagonal elements of $Q_{rw}$ in equation (13); w3: $C_y = Q_f$, flicker noise structure introduced in equation (12); w4: $C_y = Q_{rw}$, random walk noise structure introduced in equation (13); w5: $C_y = [c_{ij}]$, full matrix extracted from an exponential function of the form $e^{-\tau}$; w6: $C_y = [c_{ij}]$, full matrix extracted from an exponential function of the form $e^{-0.25\tau}$.
 Usually white noise is estimated along with either flicker noise or random walk noise. To confirm the w-test results, we included all three variances in the stochastic model, namely, white noise, flicker noise, and random walk noise as in equation (11). If a variance is negative, it is an indication that this noise model is most likely not the preferred model and can be excluded from the stochastic model. Table 6 shows the signs of the estimated variances using LS-VCE. In 53% of the cases, the random walk noise variance is negative. These correspond to the cases in which the w-test values for flicker noise given in Table 5 are significantly larger than those for random walk noise. In 44% of the cases both flicker and random walk noise variances are positive. These correspond to the cases in which the w-test values of flicker noise are approximately identical to those of random walk noise. Only for the north component of ALGO is the flicker noise variance negative, which agrees with the fact that the w-test value for flicker noise is smaller than that for random walk noise.
Table 6. Sign of White Noise, Flicker Noise, and Random Walk Noise Variances for Different Components
Sign of Variance Component
 In columns 5 and 6, we have respectively used the following matrices for $C_y$: $c_{ij} = e^{-\tau}$ and $c_{ij} = e^{-0.25\tau}$ with i, j = 1, 2, …, m. The corresponding w-test statistic values are mostly significantly smaller than those for flicker and random walk noise. However, after removing 10 harmonic functions (individually per component) from the original data series, the largest values for the w-test statistic are obtained for the $e^{-0.25\tau}$ stochastic model (column 6). Note also that the results given in columns 5 and 6 are not much different. This therefore confirms the existence of remaining correlation at very high frequencies that is believed to be due to common sources of errors that last only a couple of successive days (see Figure 5, bottom). A relatively large value of the w-test statistic (values on the right) for flicker and random walk noise is most likely due to this correlation. A significant decrease in the w-test values for flicker and random walk noise implies that most parts of the power law noise have now been captured by the harmonic functions. This statement was also justified when the white and the flicker noise amplitudes were estimated using the extended functional model (with 10 harmonics), which led to small positive or negative values for flicker noise amplitudes.
 Let us now turn our attention to the second column (w2) in Table 5. The goal is to test the stationarity of the white noise amplitude in the series. For this purpose, we selected $C_y = \mathrm{diag}(Q_{rw})$. Most of the w-test values are negative, implying that the white noise variance of the daily position estimates decreases toward the end of the series, for example as $\sigma_{w_i}^2 = \sigma_w^2 + t_i\delta$ with δ negative. This reflects the improvements in analysis products (for example, satellite orbits and Earth orientation parameters), as equipment improves and our knowledge of error sources such as the atmosphere and satellite orbits continues to grow. The reduction in noise amplitudes with time was shown by Williams et al. They showed that such a reduction of noise also holds true for the flicker noise amplitude.
 A large value for the w-test leads to the rejection of the null hypothesis. One can obtain the w-test values for different alternative hypotheses. The one that gives the largest absolute value is considered as a superior candidate for describing the noise characteristics of the data. In our case, in general, the flicker noise model was preferred. After introducing 10 harmonic functions to the model, the largest values were obtained for the AR(1) noise process. The w-test statistic is considered to be a powerful tool to decide on the preferred noise model. Based on simulated data, Williams (personal communication, 2006) concluded that the w-test statistic and the difference in the log likelihood values give very similar results. Note, however, that the w-test statistic can simply be used while the MLE method needs successive inversion of the covariance matrix.
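To make the testing idea concrete, the sketch below standardizes the quadratic form ê'Qy⁻¹CyQy⁻¹ê by its expectation and standard deviation under the null hypothesis. It is derived under the assumption that Qy under the null hypothesis is fully known, and it is not a transcription of equation (28); when the variance factor is itself estimated, the statistic takes a modified form. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def w_test(A, y, Qy, Cy):
    """Standardized test value for the alternative Qy + nabla * Cy (Qy assumed known)."""
    m = len(y)
    Qy_inv = np.linalg.inv(Qy)
    # Orthogonal projector P = I - A (A^T Qy^-1 A)^-1 A^T Qy^-1
    P = np.eye(m) - A @ np.linalg.solve(A.T @ Qy_inv @ A, A.T @ Qy_inv)
    R = Qy_inv @ P
    r = R @ y                      # Qy^-1 times the least squares residuals
    M = Cy @ R
    numerator = r @ Cy @ r - np.trace(M)          # quadratic form minus its expectation
    denominator = np.sqrt(2.0 * np.trace(M @ M))  # its standard deviation under H0
    return numerator / denominator

# Synthetic series containing random walk noise on top of unit white noise.
rng = np.random.default_rng(4)
m = 200
t = np.arange(1.0, m + 1.0)
A = np.column_stack([np.ones(m), t])
y = rng.normal(0.0, 1.0, m) + np.cumsum(rng.normal(0.0, 0.5, m))

Q_rw = np.minimum.outer(t, t)
Q_ar = np.exp(-0.25 * np.abs(np.subtract.outer(t, t)))
print("w (random walk alternative):", w_test(A, y, np.eye(m), Q_rw))
print("w (exponential alternative):", w_test(A, y, np.eye(m), Q_ar))
```

Each alternative needs only one evaluation with the null-hypothesis covariance matrix, which is the practical advantage over repeated likelihood maximization noted above.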
6.5. Remarks and Discussions
 We would like to point out here the duality between harmonic functions and time correlated noise. If one could compensate for all unmodelled effects in the functional model, this would be the best way to do so. Otherwise, they will be misinterpreted as if the data were time correlated; see section 3.3 and the simulated example in section 5.3. Examples of such hidden periodicities are the annual and the semiannual signals, signals with the periods of 13.66, 14.2, and 14.8 days, and likely signals with the periods of 350/n (n = 1, …, 8).
 If the unmodelled variations cannot be considered as deterministic signals to be compensated for in the functional model (for example, by harmonic functions), they can be captured as power law noise process (for example, flicker noise or random walk noise) through the stochastic model. Although not fully physically justified, we captured unmodelled effects by a set of harmonic functions (in this study 10 individual functions for each series). We observed a duality between the functional and the stochastic model. When unmodelled effects were removed by these harmonic functions, a short-memory process is left in the time series. In other words, most parts of the power law noise have been captured by the harmonic functions. However, this does not necessarily mean that there exist 10 individual hidden periodic functions in each series.
 The remaining minor time correlation, as a short-memory process, exponentially vanishes within a few days and can be expressed for instance as an AR(1) noise process. This essentially means that to avoid biases in the power law noise amplitude due to underparameterization, one will have to include also the AR(1) noise process $\sigma_a^2 Q_a$ in equation (11), namely, $Q_y = \sigma_w^2 I + \sigma_a^2 Q_a + \sigma_f^2 Q_f$ (when one ignores random walk noise). This holds indeed also for any potential periodicities in the functional part of the model.
 Our investigations show that the time series are not yet long enough to separately estimate one variance for each noise component. Therefore first the LS-HE method was employed to include a set of harmonic functions to compensate for the power law noise. The remaining noise is now expressed as a combination of white and autoregressive noise. The unknowns in this case are the amplitudes of white and autoregressive noise (σw and σa) and the timescale α of the noise process. In other words, the short-memory process is expressed as $Q_y = Q(\alpha; \sigma_w^2, \sigma_a^2) = \sigma_w^2 I + \sigma_a^2 Q_a$, where $q_{ij}^a = e^{-\alpha\tau}$ and $\tau = |t_j - t_i|$. This is in fact a nonlinear stochastic problem that can again be solved by LS-VCE.
 The method was applied to 71 globally distributed GPS stations. The average value for the timescale is α ≈ 0.25. The mean amplitudes of white noise and autoregressive noise are σw = 2.3, 3.3, and 6.3 and σa = 1.3, 1.8, and 4.0 (all in mm) for north, east, and vertical components, respectively. Suppose now that we do not include the 10 harmonic functions to compensate for flicker noise. The covariance matrix is again of the form $Q_y = \sigma_w^2 I + \sigma_a^2 Q_a + \sigma_f^2 Q_f$. In practice, it is more convenient to combine white noise and autoregressive noise into one “short-memory” process using the average values obtained above. Based on these results, if one assumes that the timescale α and also the relative magnitude of noise components σa/σw are known, this covariance matrix can be reformulated as
$$Q_y = \sigma_s^2 Q_s + \sigma_f^2 Q_f \qquad (30)$$
where $\sigma_s^2 = \sigma_a^2 + \sigma_w^2$ is the variance of the short-memory noise process and $Q_s$ is given as
$$Q_s = (1 - \beta)\, I + \beta\, Q_a, \qquad q_{ij}^a = e^{-\alpha\tau}, \quad \tau = |t_j - t_i| \qquad (31)$$
with α ≈ 0.25 and $\beta = \sigma_a^2 / (\sigma_w^2 + \sigma_a^2) \approx 0.25$. Figure 8 shows the weighted mean autocorrelation function of 71 permanent GPS stations and its approximation based on equation (31).
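The combination of white and autoregressive noise into one short-memory process can be checked numerically. The sketch below assumes the form Qs = (1 − β)I + βQa, which follows from the definitions of σs² and β given above, and verifies that σs²Qs reproduces σw²I + σa²Qa. The function name is hypothetical; the amplitudes are the mean values quoted in the text.

```python
import numpy as np

def short_memory_cofactor(t, alpha=0.25, beta=0.25):
    """Qs = (1 - beta) * I + beta * Qa with Qa[i, j] = exp(-alpha * |t_j - t_i|)."""
    t = np.asarray(t, dtype=float)
    Qa = np.exp(-alpha * np.abs(np.subtract.outer(t, t)))
    return (1.0 - beta) * np.eye(len(t)) + beta * Qa

# Mean amplitudes (mm) for north, east, and vertical as quoted in the text.
sigma_w = np.array([2.3, 3.3, 6.3])
sigma_a = np.array([1.3, 1.8, 4.0])
beta = sigma_a**2 / (sigma_w**2 + sigma_a**2)
print("beta per component:", np.round(beta, 2), " mean:", round(float(beta.mean()), 2))

# Consistency check for the north component over ten daily epochs.
t = np.arange(10.0)
Qa = np.exp(-0.25 * np.abs(np.subtract.outer(t, t)))
sigma_s2 = sigma_w[0]**2 + sigma_a[0]**2
lhs = sigma_s2 * short_memory_cofactor(t, alpha=0.25, beta=beta[0])
rhs = sigma_w[0]**2 * np.eye(10) + sigma_a[0]**2 * Qa
print("models agree:", np.allclose(lhs, rhs))
```

Fixing α and β in this way leaves a single variance component for the short-memory process, which is what keeps the estimation problem linear.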
 Stochastic model (30) is referred to as the short-memory noise and flicker noise model (model IIb), for which two variance components $\sigma_s^2$ and $\sigma_f^2$ need to be estimated by LS-VCE. We now consider this equation to estimate the magnitude of the short-memory (combined WN and AR(1)) and flicker noise processes. A correct functional model consisting of annual and semiannual signals, a period of 13.66 days, and a period of 350 days and its fractions 350/n (n = 2, …, 8) was also used. The results are given in Table 7. Compared to the results given in Table 1 for the white plus flicker noise model (model IIa), flicker noise shows a reduction of 40% whereas white (in fact short-memory) noise increases by about 20%. The differences in the log likelihood values have also been computed, and they show an increase of about 10% compared to the values given in Table 2 for model IIa. With this strategy, not only can one obtain a better precision for the parameters of interest, but one will also be able to increase the log likelihood values.
Table 7. Short-Memory (WN plus AR(1)) Noise and Flicker Noise Amplitude Estimates for North, East, and Vertical Components in Modified Functional and Stochastic Model
 The goal here is to estimate and compare the error of four parameters of interest, namely, intercept, slope (rate), position in the middle, and position at the far end of the time series, for different stochastic models. Model I, IIa, IIb, and III are the pure white noise, white noise plus flicker noise, short-memory (combined white and autoregressive) noise plus flicker noise, and white noise plus random walk noise, respectively. Model IIb also includes the annual and the semiannual signals, a period of 13.66 days, and periods of 350/n days, n = 1, …, 8 in the functional model.
 We show how an incorrect stochastic model will result in a too optimistic (or too pessimistic) precision description of the parameters of interest (see section 3.3). The results given in Table 8 are based on $Q'_x = (A^T Q'^{-1}_y A)^{-1}$, where $Q'_y$ is a (in)correct covariance matrix obtained from model I, IIa, IIb, or III. Model I gives the most optimistic results and model III generally gives the least precise results. The error of parameters for different models compared to those for the pure white noise model is larger by the coefficients given in Table 9. For example, if a white plus flicker noise model is used instead of a pure white noise model, the velocity error obtained can be larger by factors of 9–16. Among different models, the error estimates of site velocity and position for model III are considerably larger (about one order of magnitude) than those for other models.
Table 8. Error Estimate (Formal Standard Deviation) of Slope, Intercept, and Position for Different Stochastic Modelsa
I: White noise only; IIa: white noise plus flicker noise; IIb: short-memory noise plus flicker noise with proper functional model; and III: white noise plus random walk noise.
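The dependence of the formal rate uncertainty on the assumed stochastic model can be illustrated with the propagation law (A'Qy⁻¹A)⁻¹. The sketch below compares a pure white noise model with a white plus random walk noise model; the noise amplitudes and the series length are hypothetical and are not those of Table 8.

```python
import numpy as np

def parameter_covariance(A, Qy):
    """Q_x = (A^T Qy^-1 A)^-1 for a given (assumed) covariance matrix Qy."""
    return np.linalg.inv(A.T @ np.linalg.solve(Qy, A))

m = 1000
t = np.arange(1.0, m + 1.0)                   # epochs in days
A = np.column_stack([np.ones(m), t])          # intercept and slope
Q_white = 4.0 * np.eye(m)                     # white noise, sigma_w = 2 mm (hypothetical)
Q_white_rw = Q_white + 0.01 * np.minimum.outer(t, t)   # plus random walk (hypothetical)

rate_std_white = np.sqrt(parameter_covariance(A, Q_white)[1, 1])
rate_std_rw = np.sqrt(parameter_covariance(A, Q_white_rw)[1, 1])
print("rate std, white noise only  :", rate_std_white)
print("rate std, white + random walk:", rate_std_rw)
print("ratio:", rate_std_rw / rate_std_white)
```

Adding a positive definite noise component to Qy can only increase the formal variances, which is why the pure white noise model gives the most optimistic values.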
 Compared to the white noise magnitude of the series (scatter of the series), the standard deviations of positions in the middle of the time series are 2%, 80%, 50%, and 400% for models I, IIa, IIb, and III, respectively. These values increase at the end of the series to 4%, 90%, 55%, and 800%. For all models, except model III, the minimum error estimate of the position is obtained in the middle and the largest values are given at both ends of the series. The results of model I are too optimistic. The error contributions of the intercept and the slope to the position seem to be the same (2% in the middle and 4% at both ends). The results obtained from models IIa and IIb appear to be more realistic. In model IIa, the slope contributes only 10% to the position error estimate; the error in the intercept plays the main role. This holds also for model IIb, but with an improvement by a factor of 1.6. The behavior of model III is somewhat different. The results are too pessimistic and the error in the slope plays the main role.
 In the geophysical literature, the site velocity uncertainty is usually of interest and not directly the position error. However, in geodetic applications (for example, realization of ITRF), the goal of the site velocity studies is both the intercept and the slope, from which the position and its uncertainty can be directly propagated. Figure 9 illustrates the effect of site velocity only and site velocity plus intercept on the position error for different stochastic models. The plot shows how important correct error propagation is for applications such as the realization of ITRF. When interested in position error and relying only on the site velocity error, it seems that models IIa and IIb are more appropriate for long-term accuracy but give optimistic results over short periods. However, if one propagates both the slope and the intercept errors, these models will always be preferred. On the other hand, model III is likely suitable for short-term accuracy but yields pessimistic results over long periods.
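The two ways of propagating the position error can be sketched as follows for the pure white noise model. The full propagation uses the 2 × 2 covariance matrix of intercept and slope; the rate-only variant multiplies the rate uncertainty by the time elapsed since a reference epoch, which is taken here as the start of the series (our assumption). For white noise the full position error at the ends of the series is about twice that in the middle, in line with the 2% and 4% quoted for model I.

```python
import numpy as np

def position_std(Qx, t_eval):
    """Std of y0 + r * t propagated from the covariance matrix Qx of (y0, r)."""
    a = np.column_stack([np.ones_like(t_eval), t_eval])
    return np.sqrt(np.einsum("ij,jk,ik->i", a, Qx, a))

def position_std_rate_only(Qx, t_eval, t_ref=0.0):
    """Std of the position when only the rate uncertainty is propagated from t_ref."""
    return np.sqrt(Qx[1, 1]) * np.abs(t_eval - t_ref)

m = 101
t = np.arange(float(m))                       # epochs in days
A = np.column_stack([np.ones(m), t])
sigma_w = 2.0                                 # white noise amplitude in mm (hypothetical)
Qx = sigma_w**2 * np.linalg.inv(A.T @ A)

full = position_std(Qx, t)
rate_only = position_std_rate_only(Qx, t)
print("epoch of minimum position error:", t[int(np.argmin(full))])
print("end/middle ratio:", full[-1] / full[m // 2])
```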
7. Conclusions and Recommendations
 We assessed the noise characteristics in global time series of daily position estimates by LS-VCE. The method is easily understood, generally applicable, and very flexible. The LS-VCE estimators are unbiased and of minimum variance. This method provides the precision of the (co)variance estimators. Based on the results given, the following conclusions can be drawn:
 The w-test statistic is a powerful tool to recognize the data noise characteristics in order to construct an appropriate stochastic model. Using the w-test, a combination of short-memory (white plus autoregressive) noise and flicker noise was in general found to best describe the noise characteristics of the position components; we hardly observed that the strict white plus random walk was the preferred noise model. These results have also been verified using correlograms of the time series, the frequencies of harmonic functions, and the signs of estimated flicker and random walk noise variance components.
 The least squares harmonic estimation method was used to find and consequently remove a set of harmonic functions from the data. These harmonic functions captured unmodelled effects. The results confirm the presence of annual and semiannual signals in the series. We could also observe other periodic effects; for example, a period of 13.66, 14.2, and 14.8 days. We observed also significant periodic patterns with a period of 350 days and its fractions 350/n, n = 2, …, 8, which are likely due to aliased multipath effects in permanent stations. When such variations have underlying physical phenomena (or modeling error), their effects can be considered as systematic periodic signals. It may not be appropriate to capture their effects by a power law noise process in the stochastic model. They may mistakenly mimic flicker or random walk noise if we neglect them in the functional model. Therefore neglecting such effects, which may be best described by a deterministic model rather than a power law noise model, can seriously affect the error estimate of the site velocity and the position.
 There are also some effects in the series that are not of periodic nature. They can most likely be expressed as power law noise. We however employed the harmonic estimation method to find more frequencies in the series. A significant decrease in the w-test values for power law noise implies that most parts of this noise are captured by the harmonic functions. This led us to see a duality between the stochastic and the functional model. In fact, what is not captured in the functional model is captured in the stochastic model. When we include the harmonic functions, almost exclusively white noise remains in the data. Only at very high frequencies a significant time correlation appeared to be present, which can be expressed as an exponential function (for example, a first-order autoregressive noise process). This noise can be caused by common sources of errors like atmospheric effects as well as satellite orbit errors that last over only a few successive days. Based on the value of the timescale α ≈ 0.25, Williams (personal communication, 2006) suggested atmospheric loading as a candidate for this noise process.
 The overestimation of the power law noise was due to the presence of the autoregressive noise and also the justified hidden periodic effects in the series. This means that neither the white noise plus flicker noise model nor the white noise plus random walk noise model is the preferred model. The best model includes in addition to power law noise also other noise models like autoregressive noise or as Langbein  used band-pass-filtered noise on EDM data. Instead of a strict white noise model, a short-memory noise process was introduced which led to the reduction of the flicker noise magnitude.
 Colleague H. van der Marel is acknowledged for his suggestion and recommendations on the use of time series of permanent GPS stations. We would like to thank S.D.P. Williams and an anonymous reviewer for their meticulous reviews and valuable comments on an earlier version of this paper. We are also thankful to the editors, R.J. Arculus, T.H. Dixon, and P. Tregoning, for their recommendations.