Journal of Geophysical Research: Solid Earth

Effect of annual signals on geodetic velocity

Errata

This article is corrected by:

  1. Errata: Correction to “Effect of annual signals on geodetic velocity” by Geoffrey Blewitt and David Lavallée Volume 108, Issue B1, ETG 4-1, Article first published online: 9 January 2003

Abstract

[1] Our analysis of Global Positioning System (GPS) site coordinates in a global reference frame shows annual variation with typical amplitudes of 2 mm for horizontal and 4 mm for vertical, with some sites at twice these amplitudes. Power spectrum analysis confirms that GPS time series also contain significant power at annual harmonic frequencies (with spectral indices 1 < α < 2), which indicates the presence of repeating signals. Van Dam et al. [2001] showed that a major annual component is induced by hydrological and atmospheric loading. We show that annual signals, unless accounted for, can significantly bias estimation of site velocities intended for high-accuracy purposes such as plate tectonics and reference frames. For such applications, annual and semiannual sinusoidal signals should be estimated simultaneously with site velocity and initial position. We have developed a model to calculate the level of bias in published velocities that do not account for annual signals. Simultaneous estimation might not be necessary beyond 4.5 years, as the velocity bias rapidly becomes negligible. Minimum velocity bias is theoretically predicted at integer-plus-half years, as confirmed by tests with real data. Below 2.5 years, the velocity bias can become unacceptably large, and simultaneous estimation does not necessarily improve velocity estimates, which rapidly become unstable due to correlated parameters. We recommend that 2.5 years be adopted as a standard minimum data span for velocity solutions intended for tectonic interpretation or reference frame production and that we be skeptical of geophysical interpretations of velocities derived using shorter data spans.

1. Introduction

[2] Velocities derived from geodetic coordinate time series are now routinely used as input to geophysical models [Segall and Davis, 1997], with many applications including plate boundary dynamics, postglacial rebound, surface mass loading, and global sea level change. It has recently emerged that GPS coordinate time series have significant annual signals [e.g., Van Dam et al., 2001], which might significantly bias published velocity estimates. This paper systematically investigates the effect of annual signals on geodetic velocities and complements recent research on power law noise of coordinate time series [Zhang et al., 1997; Mao et al., 1999]. Indeed, seasonally driven signals with an annually repeating component would include power at all annual harmonic frequencies, which would affect both spectral and time domain characterizations of GPS errors.

[3] A major component of annual signals is now known to be true physical site motion [Blewitt et al., 2001]. The dominant cause for annual signals with respect to a global reference frame is surface loading due to hydrology and atmospheric pressure. Van Dam et al. [2001, p. 651] report that in loading models of continental water storage, “vertical displacements have a root mean square (RMS) values as large as 8 mm and are predominantly annual in character.” These hydrological models (also accounting for atmospheric loading) were shown to correlate strongly with the same GPS coordinate time series used here, with the variance reduction in GPS height residuals being approximately equal to the variance of the model. Hence seasonal variation, which is best described by a deterministic model (rather than a power law noise model), is likely to contribute to velocity error for globally referenced coordinates, especially over short data spans. Until physical models of annual signals adequately describe the observed variation, a reasonable solution to this problem is to estimate an annual signal (amplitude and phase) simultaneously with site velocities and initial positions. Another strategy is to reference coordinates regionally (e.g., by spatial filtering [Wdowinski et al., 1997]), which would not be effective for large regions such as the North American–Pacific plate boundary, or for stability tests of major plates.

[4] There are several concerns motivating our research. First, there are numerous recent examples of published estimates of tectonic velocities with data spans as short as 1–2 years, where annual signals have not been taken into account in the estimated velocities and errors. Their effects have often been ignored completely or have been subject to incorrect intuitive speculation. For example, Dixon and Mao [1997, p. 536] state “the influence of annual errors … on velocity estimates would be minimal for an integer number of years but would affect velocity estimates for the 2.5 year time span used here” (which our research here proves to be wrong). If annual signals are not taken into account, it is shown here that they typically dominate velocity errors during the first 2.5 years of coordinate time series, with the nonobvious theoretical result supported by our data that the bias drops rapidly between 2 and 2.5 years. This is consistent with anecdotal evidence that GPS velocity solutions tend to be unstable until a 2-year data span is exceeded. Our research here can be used as a guide to set criteria for publishing new results and can also be used to assess the level of errors in previously published results.

[5] Second, geodetic investigations almost always rely on the availability of an accurate global reference frame, either directly or indirectly. For example, such a frame is essential for accurate orbits produced by the International GPS Service (IGS), Earth rotation parameters produced by the International Earth Rotation Service (IERS), globally referenced site coordinates, and global site velocities to define kinematics relative to stable plate interiors for deformation studies. Current procedures to produce the IERS Terrestrial Reference Frame (ITRF) do not account for annual signals in deriving site velocities (though future versions may do so if this type of research demonstrates the benefits). To consider the effects on current ITRF and IGS procedures, it should be kept in mind that such analyses assume no time correlation between epoch solutions; in fact this class of solution currently dominates the tectonophysics literature. It should also be noted that even if power law stochastic models (e.g., flicker noise) were used, they alone would not account properly for time domain behavior due to annual signals.

[6] Third, annually repeating signals generally contain not only an annual sinusoidal component but also the annual harmonics. Estimation of only the annual amplitude and phase will therefore not mitigate the entire effect of an annually repeating signal. It is not immediately obvious how many extra terms should be included.

[7] Fourth, while estimation of annual harmonics may reduce systematic error, it will also introduce a greater random error in velocity due to the increased number of parameters. We can expect this to be a problem for shorter time series, when correlations between the estimated parameters become increasingly significant. We might expect that below some minimum data span the systematic bias we are attempting to mitigate would be less harmful than the dilution of velocity precision.

[8] Guided by these concerns, we formulate the following research questions. First, there are fundamental questions: What are the temporal characteristics of velocity bias in the presence of an annually repeating signal when the velocity estimation assumes no deterministic model or interepoch correlations? Can such temporal characteristics be used to advantage? Second, there are questions of interpretation of published results: How should we interpret errors of published velocity solutions (and hence the significance of the research findings) that have not accounted for annual signals? What is the minimum data span at which one should accept velocity estimates (e.g., as input to ITRF) that have not accounted for annual signals? Third, there are questions of implementation. What is the data span beyond which the degradation in precision arising from estimation of extra annual signal parameters is smaller than typical systematic bias? What is the data span beyond which negligible gain is to be made by estimating annual (and harmonic) signals?

[9] We begin by systematically developing a theoretical foundation for the analysis of annually repeating signals and their effect on velocity. We proceed to develop an error model that might be used to reinterpret errors of published results; this is tested using our own data. On the basis of the theory and data we then answer the research questions posed above.

2. Theoretical Foundation

2.1. Previous Work

[10] Black and Scargle [1982] analyzed the apparent motion of a star that is assumed to have secular proper motion plus a sinusoidal perturbation. They pointed out that the residual motion to the apparent secular motion would represent a distorted version of the sinusoidal perturbation because the secular motion used to form the residuals is itself biased by the perturbation. This is relevant to our geodetic problem; while Black and Scargle [1982] focused on the detection and estimation of the sinusoidal signal, our emphasis is on faithful recovery of secular motion and on modeling its error. Furthermore, we generalize our theoretical foundation to accommodate not only pure sinusoidal signals but also repeating signals of arbitrary form.

2.2. Velocity Bias Due to a Sinusoidal Signal

[11] Consider a time series of n coordinate data ri (i = 1,2,…, n), which are regularly spaced with time interval Δt. Let the coordinates be modeled as a linear function of time plus an arbitrary sinusoidal signal:

equation image

where initial position s and velocity u are unknown, vi represents data noise, and the signal is characterized by frequency f, amplitude a, and phase lag φ.

[12] Now consider using least squares to estimate simultaneously the velocity and initial position, where we erroneously disregard the sinusoidal signal. Appendix A1 derives an equation (equation (A5)) that can be used to compute the bias due to an unmodeled signal in the limit of small data intervals, which in this case implies Δt ≪ 1/f. Inferring the appropriate partial derivatives for equation (A5) from equation (1), the bias vector for the estimated parameters is

equation image

where T is the time spanned by the data. The biases from equation (2) apply only when the inversion uses a data covariance matrix with no temporal correlations (which is the case for most published results). For example, equation (2) applies even if there is real colored noise in the data, provided no temporal correlations are assumed in the data covariance matrix. Note that equation (2) represents an incremental bias, so the total velocity error would generally be some superposition of equation (2) plus an error induced by the real noise.
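
The bias described in paragraph [12] can be illustrated numerically. The following is a minimal sketch, not the authors' code: it fits initial position and velocity by unweighted least squares to a noise-free sinusoid and reports the resulting velocity error. The signal convention a cos(2πft − φ), the weekly sampling, and the names `fitted_velocity` and `velocity_bias` are assumptions made for illustration and are not taken from equations (1) and (2).

```python
import numpy as np

def fitted_velocity(t, r):
    """Least-squares velocity for a model with initial position and velocity only."""
    A = np.column_stack([np.ones_like(t), t])  # columns: initial position, velocity
    coef, *_ = np.linalg.lstsq(A, r, rcond=None)
    return float(coef[1])

def velocity_bias(T, a=1.0, f=1.0, phi=0.0, dt=1.0 / 52.0):
    """Velocity error (mm/yr) from an unmodeled sinusoid over data span T (yr).

    Assumed signal form: a * cos(2*pi*f*t - phi), sampled every dt years.
    The true velocity is zero, so the fitted slope is the bias itself.
    """
    n = int(round(T / dt)) + 1
    t = np.arange(n) * dt
    signal = a * np.cos(2.0 * np.pi * f * t - phi)
    return fitted_velocity(t, signal)

# Bias for a 1-mm annual cosine and sine signal at several data spans.
for T in (1.0, 1.5, 2.0, 2.5, 3.5, 4.5):
    b_cos = velocity_bias(T, phi=0.0)
    b_sin = velocity_bias(T, phi=np.pi / 2.0)
    print(f"T = {T:3.1f} yr  cosine: {b_cos:+.3f}  sine: {b_sin:+.3f} mm/yr")
```

Because the data covariance is taken as uncorrelated and the problem is linear, the same incremental bias would be added to whatever error real noise induces.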

[13] As a brief aside, equation (A6) can be used to compute the formal covariance matrix

equation image

Hence the formal error in velocity is the well-known formula [e.g., Zhang et al., 1997]

equation image

This is not intended to be realistic or to take into account systematic error and is only used as a reference error function further into this paper.
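
As a hedged numerical check on the reference error function, the sketch below computes the formal velocity error directly from the design matrix for uncorrelated data of standard deviation σw and compares it with the standard continuous-sampling approximation σw·sqrt(12Δt/T³). That approximation is a textbook result stated here from memory, not a transcription of equation (4); the function names and the column ordering are illustrative assumptions.

```python
import numpy as np

def formal_velocity_sigma(T, sigma_w, dt=1.0 / 52.0):
    """Formal velocity error from sigma_w^2 * (A^T A)^-1 for a position+velocity model."""
    n = int(round(T / dt)) + 1
    t = np.arange(n) * dt
    A = np.column_stack([np.ones(n), t])        # columns: initial position, velocity
    cov = sigma_w**2 * np.linalg.inv(A.T @ A)   # formal covariance, no time correlation
    return float(np.sqrt(cov[1, 1]))

def approx_velocity_sigma(T, sigma_w, dt=1.0 / 52.0):
    """Continuous-sampling approximation (assumed form): sigma_w * sqrt(12 dt / T^3)."""
    return float(sigma_w * np.sqrt(12.0 * dt / T**3))

# Example: 4-mm white noise, weekly sampling.
for T in (1.0, 2.5, 4.5):
    exact = formal_velocity_sigma(T, 4.0)
    approx = approx_velocity_sigma(T, 4.0)
    print(f"T = {T:3.1f} yr  exact: {exact:.3f}  approx: {approx:.3f} mm/yr")
```

The two values agree to within a few percent for weekly data, with the difference shrinking as the number of samples grows.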

[14] Now, taking the upper matrix element of equation (2), the velocity bias is

equation image

A priori, we generally do not know the phase lag, so it is useful to factor together terms dependent on φ. We have discovered that equation (5) reduces to the lucid expression

equation image

Clearly, the velocity bias is a zero-crossing oscillatory function of data span T, which tends to zero for large T. This function is plotted in Figure 1 for f = 1/yr and a = 1 mm using example values of phase lag φ = 0, π (for a pure ±cosine signal) and φ = ±π/2 (for a pure ±sine signal). Most importantly, equation (6) tells us that for arbitrary phase lag φ the velocity bias is zero when T satisfies

equation image

This is a well-known transcendental equation with no closed-form solution. Numerical solution of equation (7) for positive data spans T yields the following list:

equation image

Therefore we have discovered the “zero-bias theorem” that the velocity is unbiased near integer-plus-half (m + 1/2) cycles, for positive integer m. These nodes can be clearly seen in Figure 1.
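
The zero-bias data spans can be explored with a short sketch. The closed form used here was derived independently for this illustration, in the continuous-sampling limit and under the assumed signal convention a cos(2πft − φ); it may differ in sign or phase convention from equations (5) to (7), which are not reproduced in this text. In this form the phase-independent factor vanishes where tan(πfT) = πfT, and the roots are found by bisection.

```python
import math

def bias_closed_form(T, a=1.0, f=1.0, phi=0.0):
    """Velocity bias for an unmodeled a*cos(2*pi*f*t - phi), continuous-sampling limit.

    Assumed form, derived for this sketch:
        6a / (pi^2 f^2 T^3) * (x cos x - sin x) * sin(x - phi),  x = pi f T
    """
    x = math.pi * f * T
    factor = 6.0 * a / (math.pi**2 * f**2 * T**3)
    return factor * (x * math.cos(x) - math.sin(x)) * math.sin(x - phi)

def phase_independent_factor(x):
    """Vanishes where tan(x) = x, i.e. where the bias is zero for every phase lag."""
    return x * math.cos(x) - math.sin(x)

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_lo < 0.0) == (f_mid < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def zero_bias_spans(m_max=5, f=1.0):
    """Data spans T (in units of 1/f) with zero bias, one per interval (m, m + 1/2)."""
    return [
        bisect(phase_independent_factor, m * math.pi, (m + 0.5) * math.pi) / (math.pi * f)
        for m in range(1, m_max + 1)
    ]

print([round(T, 4) for T in zero_bias_spans()])
```

Under these assumptions the roots fall slightly below m + 1/2 and approach it as m increases, consistent with the statement that the velocity is unbiased near integer-plus-half cycles.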

Figure 1.

Velocity bias from an annual sinusoidal signal versus data span, shown specifically for cosine and sine signals. The velocity bias scales with the amplitude, taken here to be 1 mm.

[15] Owing to the independence of phase lag, we can also conclude that equation (8) applies to an arbitrary superposition of signals (at frequency f), which is applicable to annual geodetic signals that arise from a combination of many constituents. It is useful to quantify the magnitude of the velocity bias as a root-mean-square (RMS) quantity, averaged over all possible phase lags. From equation (6),

equation image

Equation (9) is plotted in Figure 2 along with the maximum possible bias ûmax from equation (6). In the case where we have a set of sites, we can replace amplitude a with the RMS amplitude σa. Clearly, the RMS bias also has zero values at data spans given by equation (8).

Figure 2.

Velocity bias from an annual sinusoidal signal versus data span, showing maximum possible bias (max), and root mean square bias (RMS). The velocity bias scales with amplitude, taken here to be 1 mm.
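
The relation between the maximum and RMS bias can be checked numerically without any closed form. The sketch below, which assumes weekly sampling and the signal convention a cos(2πft − φ), fits a straight line to the sinusoid for a grid of phase lags in one least-squares call and then takes the RMS and the maximum over phase. All names are illustrative.

```python
import numpy as np

def bias_over_phases(T, a=1.0, f=1.0, dt=1.0 / 52.0, n_phase=360):
    """Velocity bias for each of n_phase equally spaced phase lags in [0, 2*pi)."""
    n = int(round(T / dt)) + 1
    t = np.arange(n) * dt
    phases = np.linspace(0.0, 2.0 * np.pi, n_phase, endpoint=False)
    # One column of synthetic data per phase lag: shape (n, n_phase).
    signals = a * np.cos(2.0 * np.pi * f * t[:, None] - phases[None, :])
    A = np.column_stack([np.ones(n), t])  # initial position, velocity
    coef, *_ = np.linalg.lstsq(A, signals, rcond=None)
    return coef[1]  # fitted velocity (= bias) for every phase lag

def rms_and_max_bias(T, **kwargs):
    """RMS and maximum absolute velocity bias over all phase lags."""
    b = bias_over_phases(T, **kwargs)
    return float(np.sqrt(np.mean(b**2))), float(np.max(np.abs(b)))

for T in (1.0, 2.0, 2.5, 3.0, 3.5):
    rms, mx = rms_and_max_bias(T)
    print(f"T = {T:3.1f} yr  RMS: {rms:.3f}  max: {mx:.3f} mm/yr")
```

Because the bias is linear in the cosine and sine parts of the signal, it varies sinusoidally with phase lag, so the maximum exceeds the RMS by a factor of √2 at every data span.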

2.3. Velocity Bias Due to a Repeating Signal

[16] An arbitrary repeating time series discretely sampled at intervals Δt can be expanded as a Fourier series:

equation image

where f is the fundamental frequency. (The k = 0 term is not needed, as it is absorbed by the initial position parameter). There are 1/(fΔt) samples within one fundamental period. Therefore the sampled signal can be described by a unique set of 1/(fΔt) Fourier parameters ak and φk; hence the limited summation to k = 1/(2fΔt). Each of the terms from equation (10) will contribute to the velocity bias according to equation (6), where the frequency must be replaced by each harmonic of the fundamental frequency. Therefore we can expand the velocity bias for a general repeating signal over an arbitrary data span T as

equation image

If we assume that the phase lags φk are randomly distributed, then the RMS velocity bias is given by quadratic summation over all contributing Fourier components:

equation image

where by analogy with equation (9),

equation image

Here ak is the amplitude for Fourier component k.

[17] For the types of physical processes that might give rise to annual signals, the power spectrum should be approximately proportional to f^(−α) [Agnew, 1992], where α is the spectral index appropriate to the process. We now assume (to be tested later) that the harmonics contributing to the repeating signal obey this power law. As power is proportional to the square of amplitude, and as harmonic frequency is proportional to k, we can substitute into equation (13)

equation image

Therefore equation (12) becomes

equation image

For annually repeating signals we set f = 1 and specify T in years. Function (15) is plotted in Figure 3 showing that velocity bias is relatively insensitive to values of α for data spans beyond 2 years. As we show later, a conservative value to assume is α = 1. Thus, with data sampled every week, equation (15) becomes

equation image

This is the key equation of our paper, as it represents an additional error in published velocities that do not account for annual errors and have assumed no correlations between epoch solutions. To apply equation (16) requires only knowledge of the data span used T and an assumed typical value for the annual signal amplitude a1. We note that the formal errors of published velocities often include a covariance scaling factor based on the observed variance of the coordinate time series. For reference, Figure 3 also plots the RMS velocity that would arise from white noise given by equation (4), where the level of white noise is set to be equal to the RMS signal.

Figure 3.

Velocity bias versus data span for annually repeating signals with spectral index α = 1 and fundamental amplitude of 1 mm. Upper and lower bounds assume that α ranges from 0.5 to 2, respectively. Also shown is the white noise-equivalent error with the same 1.4-mm standard deviation as the annually repeating signal.
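
The quadratic summation over harmonics can be sketched as follows. The per-harmonic RMS bias is an assumed continuous-sampling expression derived for this illustration, not a transcription of equations (13) to (16), so the numbers need not match Figure 3 exactly. Amplitudes follow the power law stated in the text, a_k = a_1 k^(−α/2), and weekly sampling limits the sum to k = 1/(2fΔt) = 26 terms.

```python
import numpy as np

def rms_bias_harmonic(T, a, f):
    """RMS (over phase lag) velocity bias from one sinusoid of amplitude a, frequency f.

    Assumed continuous-sampling form: 6a |x cos x - sin x| / (sqrt(2) pi^2 f^2 T^3),
    with x = pi f T.
    """
    x = np.pi * f * T
    return 6.0 * a * abs(x * np.cos(x) - np.sin(x)) / (np.sqrt(2.0) * np.pi**2 * f**2 * T**3)

def power_law_amplitudes(a1=1.0, alpha=1.0, f=1.0, dt=1.0 / 52.0):
    """Harmonic numbers k and amplitudes a_k = a1 * k**(-alpha/2), up to k = 1/(2 f dt)."""
    k_max = int(round(1.0 / (2.0 * f * dt)))
    k = np.arange(1, k_max + 1, dtype=float)
    return k, a1 * k ** (-alpha / 2.0)

def rms_bias_repeating(T, a1=1.0, alpha=1.0, f=1.0, dt=1.0 / 52.0):
    """Quadratic sum of per-harmonic RMS biases, assuming random phase lags."""
    k, a_k = power_law_amplitudes(a1, alpha, f, dt)
    terms = [rms_bias_harmonic(T, a, kk * f) for kk, a in zip(k, a_k)]
    return float(np.sqrt(np.sum(np.square(terms))))

def signal_std(a1=1.0, alpha=1.0, f=1.0, dt=1.0 / 52.0):
    """Standard deviation of the repeating signal: sqrt(sum(a_k^2) / 2)."""
    _, a_k = power_law_amplitudes(a1, alpha, f, dt)
    return float(np.sqrt(np.sum(a_k**2) / 2.0))

print(f"signal std for a1 = 1 mm, alpha = 1: {signal_std():.2f} mm")
for alpha in (0.5, 1.0, 2.0):
    row = ", ".join(f"{rms_bias_repeating(T, alpha=alpha):.3f}" for T in (2.0, 2.5, 3.5, 4.5))
    print(f"alpha = {alpha}: RMS bias at 2.0, 2.5, 3.5, 4.5 yr = {row} mm/yr")
```

With α = 1 and a 1-mm fundamental, the computed signal standard deviation is about 1.4 mm, in agreement with the value quoted in the Figure 3 caption.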

2.4. Diluted Precision Due to Signal Estimation

[18] An approach that appears to be gaining favor is to estimate sinusoidal signals directly from geodetic data. However, this should be carefully considered because formal velocity precision is necessarily diluted by the introduction of the extra parameters; the question is, to what extent? Estimation of the extra parameters would only be justifiable if the dilution of precision is more than compensated by removal of the otherwise expected velocity bias. Estimation might not even be necessary for longer data spans if the velocity bias becomes insignificant.

[19] We now address this by deriving an analytical expression for velocity precision, again under the common assumption of no temporal correlations. Consider our original equation (1) where, in addition to initial position and velocity, we also estimate the amplitude and phase of the sinusoidal signal. In this case, a transformation of variables is desirable to linearize the equations:

equation image

where the estimated parameters are u, s, p, and q. From equation (A6) the formal covariance matrix for the estimated parameters becomes

equation image

The prime indicates that this is the covariance for our new set of four parameters. Analytical inversion of this gives (after meticulous algebraic manipulation) the following elegant result for the formal error in velocity:

equation image

where equation (4) gives the reference error function σf(T), and we define the “dilution of precision” as

equation image

Equation (19) has been written in a form to emphasize several important properties, which can be seen graphically in Figure 4 for the case f = 1/yr. First, comparing equation (19) with the linear motion model (velocity plus initial position) given by equation (4), we see that D(T) precisely quantifies the dilution of precision caused by the introduction of the two extra parameters. The dilution of precision blows up rapidly for T < 1.5 cycles.

Figure 4.

Velocity formal error versus data span assuming 4-mm white noise for two cases: (1) estimation of linear motion model (velocity plus initial position) and (2) linear motion model plus estimated annual sine and cosine amplitudes. Also shown on the right axis is the dilution of precision D versus data span.

[20] Second, the dilution of precision rapidly approaches unity as the data span is increased. The increase in formal error due to the estimation of sine and cosine amplitudes becomes negligible after 2.5 cycles. As a corollary, estimation of Fourier amplitudes for higher harmonics will also have negligible effect on formal velocity error after 2.5 cycles. Numerical tests confirm this.

[21] Third, the dilution of precision is unity at exactly the data spans given by equation (8). That is, sinusoidal estimation does not degrade the formal precision of velocities for precisely the same data spans that produce zero velocity bias. Intuitively, this is because any error in the sine and cosine term at these data spans has no consequence on the estimate of velocity. This confirms that our theoretical development is self-consistent and suggests that an acceptable alternative to sinusoidal estimation is to select data spans of integer-plus-half cycles.
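
The dilution of precision can also be evaluated numerically, which avoids relying on the analytical form of equation (20). In this sketch, D(T) is taken as the ratio of the formal velocity error with annual cosine and sine amplitudes estimated to the formal error of the linear motion model alone, both under uncorrelated unit-variance noise. Dense sampling (n = 2001 points) is an assumption used to approximate the continuous limit; the function names are illustrative.

```python
import numpy as np

def velocity_sigma(T, estimate_sinusoid, f=1.0, sigma_w=1.0, n=2001):
    """Formal velocity error with or without estimated cosine and sine amplitudes."""
    t = np.linspace(0.0, T, n)
    columns = [np.ones(n), t]  # initial position, velocity
    if estimate_sinusoid:
        columns += [np.cos(2.0 * np.pi * f * t), np.sin(2.0 * np.pi * f * t)]
    A = np.column_stack(columns)
    cov = sigma_w**2 * np.linalg.inv(A.T @ A)
    return float(np.sqrt(cov[1, 1]))

def dilution_of_precision(T, f=1.0):
    """Ratio of four-parameter to two-parameter formal velocity error (always >= 1)."""
    return velocity_sigma(T, True, f) / velocity_sigma(T, False, f)

for T in (1.0, 1.5, 2.0, 2.459, 3.0, 3.471, 4.5):
    print(f"T = {T:5.3f} yr  D = {dilution_of_precision(T):.4f}")
```

In this numerical evaluation D is close to 1.6 at a 1-year span, falls below 1.1 by 2 years, and is indistinguishable from unity at 2.459 years, a span where tan(πfT) = πfT and the time column is orthogonal to both sinusoidal columns.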

2.5. Theoretical Findings

[22] The RMS velocity bias expressed by equation (16) (and shown in Figure 3) and the dilution of precision expressed by equation (20) (and shown in Figure 4) can be used to draw some theoretical conclusions. First, the summation of equation (16) converges rapidly, with terms k = 1,2 typically accounting for ∼90% of the resulting bias. This therefore suggests that to adequately mitigate velocity bias, only the annual and semiannual amplitudes and phases need to be estimated simultaneously along with velocity and initial position. However, estimation should only be applied for data spans T > 2.5 years to avoid problems of correlated parameters and becomes rapidly unnecessary for T > 4.5 years. In almost any realistic circumstance, estimation of annually repeating signals will dilute the precision by a negligible amount for T > 2.5 years.

[23] Second, solutions tend to be minimally biased at data spans of integer-plus-half years (contrary to the unfounded assumption quoted in section 1). This would approximately cancel the velocity bias from all Fourier terms where k is odd (including the fundamental frequency) because equation (8) shows that such data spans approximately satisfy zero-bias solutions. Note that such a cancellation of odd terms is independent of spectral assumptions; therefore this is a robust and general approach. Moreover, estimation of an annual sinusoid is not necessary at integer-plus-half year data spans. In this case, the dominant remaining bias would be due to the smaller semiannual signals, which can be computed from equation (13):

equation image

For example, taking our typical amplitudes of 2 mm (annual) and 1 mm (semiannual) would give rise to a velocity error of 0.1 mm/yr at 2.5 years (semiannually dominated) in contrast to 0.7 mm/yr at 2.0 years (annually dominated).
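
The two figures quoted above can be reproduced with a few lines of arithmetic. The per-harmonic expression is an assumed continuous-sampling form of the RMS bias derived for this sketch (it is not copied from equation (13) or (21)); the annual and semiannual contributions are combined in quadrature.

```python
import math

def rms_bias_harmonic(T, a, f):
    """RMS velocity bias (mm/yr) from a sinusoid of amplitude a (mm), frequency f (1/yr).

    Assumed form: 6a |x cos x - sin x| / (sqrt(2) pi^2 f^2 T^3), with x = pi f T.
    """
    x = math.pi * f * T
    return 6.0 * a * abs(x * math.cos(x) - math.sin(x)) / (math.sqrt(2.0) * math.pi**2 * f**2 * T**3)

def annual_plus_semiannual_bias(T, a_annual=2.0, a_semiannual=1.0):
    """Quadrature sum of the annual (f = 1) and semiannual (f = 2) contributions."""
    return math.hypot(rms_bias_harmonic(T, a_annual, 1.0),
                      rms_bias_harmonic(T, a_semiannual, 2.0))

for T in (2.0, 2.5):
    annual = rms_bias_harmonic(T, 2.0, 1.0)
    semiannual = rms_bias_harmonic(T, 1.0, 2.0)
    total = annual_plus_semiannual_bias(T)
    print(f"T = {T} yr  annual: {annual:.3f}  semiannual: {semiannual:.3f}  total: {total:.2f} mm/yr")
```

At 2.0 years the annual term dominates (about 0.68 mm/yr against 0.17 mm/yr), whereas at 2.5 years the semiannual term dominates (about 0.11 mm/yr against 0.06 mm/yr), giving totals that round to 0.7 and 0.1 mm/yr.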

[24] Third, theory suggests that the velocity bias due to annually repeating signals would be reduced significantly when extending data spans from 2 to 2.5 years. Hence 2.5 years might be taken as a minimum data span for accepting velocity estimates for geophysical interpretation. These three main conclusions are of course theoretical and depend on how annually repeating signals should realistically be characterized, which we now address by testing with actual data.

3. Experimental Verification

3.1. Data Description

[25] To test the various assumptions and conclusions of the preceding theoretical framework, we analyzed data from a set of IGS sites for which we have velocity solutions with data spans of 3.5 years. The coordinate time series was produced by analysis of weekly IGS Analysis Center solution files in Software Independent Exchange (SINEX) format, using the fiducial-free methodology of Davies and Blewitt [2000]. For a number of reasons, most coordinate time series have a finite data gap. Of the 55 sites originally analyzed, 23 satisfied the strict criteria that (1) there be no additional parameters needed to estimate instantaneous coordinate offsets (e.g., arising from hardware changes or coseismic displacement); and (2) any gap in the time series be <8 weeks, which is short enough to capture expected seasonal variations without excluding many sites. Figure 5 shows a map of these sites.

Figure 5.

Map of the 23 GPS sites of the International GPS Service used for this analysis, which passed strict criteria on data spans and data outages.

3.2. Spectrum of Annual Repeating Signal

[26] First, we characterized the range of annually repeating signals present in our globally referenced coordinate time series by simultaneous estimation of annual and semiannual amplitudes and phases at each site along with initial position and velocity. Table 1 summarizes the results, showing the range (and RMS) of values for annual and semiannual amplitudes. Thus for approximate calculations we might take typical values of 2 mm for horizontal annual, 4 mm for vertical annual, 1 mm for horizontal semiannual, and 2 mm for vertical semiannual, with worst-case values a factor of 2 larger. This implies that the previous theoretical calculations should be scaled accordingly. For example, Figure 3 assumes a 1-mm sinusoidal signal; therefore the worst-case effect of an annual horizontal signal would require scaling of the curves by a factor of 4, leading to a velocity bias of ∼1.6 mm/yr at 2 years and ∼0.5 mm/yr at 2.5 years.

Table 1. Estimated Amplitudes by Least Squares

Velocity Component   Annual RMS, mm   Annual Range, mm   Semiannual RMS, mm   Semiannual Range, mm
Up                   4.4              1.1–10.9           1.5                  0.2–3.6
East                 1.8              0.3–4.4            0.5                  0.1–2.0
North                1.5              0.2–2.9            0.7                  0.1–1.2

[27] Equation (16) assumed that annually repeating signals could be characterized by a spectral index α = 1, although Figure 3 illustrates the lack of sensitivity to a broad range of α values. To test the validity of this assumption, modified periodograms were computed for each site [Scargle, 1982]. The stacked results can be interpreted as the power density distribution averaged over all sites. Figure 6 shows the results of this procedure applied to height time series. The plots for east and north (not shown) are very similar in character, except for overall magnitude. A value of α = 1 (solid curve in Figure 6) approximately characterizes the overall average power spectrum. Figure 6 shows the power at annual harmonics with darker bars. For the height component, the average power at annual period is equivalent to amplitude 3.7 mm [Scargle, 1982], and at semiannual period the equivalent amplitude drops to 1.5 mm. The equivalent annual amplitudes in east and north are 1.5 mm and 1.2 mm, respectively. (This is consistent with the simultaneous estimation of annual and semiannual signals shown previously). Particularly noteworthy is that annually repeating signals show up as peaks above the background spectrum at the annual frequency and its harmonics. This empirically proves the significant presence of annually repeating signals in our data. As previously noted, Van Dam et al. [2001] have positively identified a major component of these signals as being due to hydrological and atmospheric loading.

Figure 6.

Power spectral density distribution, created by stacking periodograms from all 23 sites. Curves are shown for spectral indices α = 0.5 (long dashed line), α = 1 (solid line), and α = 2 (short dashed line). Darker bars are at harmonic frequencies and can be seen to peak above the background spectrum until 5 cycles per year. While the background falls within 0.5 < α < 1, harmonic power is better fit by larger values 1 < α < 2. If there is leakage of harmonic power into the background, then the true background might be closer to white noise.
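
For readers wishing to experiment with the spectral analysis, the following sketch implements the classical Lomb–Scargle periodogram in its commonly stated form and converts power to an equivalent sinusoidal amplitude using P ≈ N a²/4, which holds for a well-sampled sinusoid. It is applied to synthetic data only: the amplitudes, phases, noise level, gap position, and random seed are invented for illustration, and the normalization is an assumption that may differ from the one used to produce Figure 6.

```python
import numpy as np

def lomb_scargle_power(t, y, freqs):
    """Classical Lomb-Scargle periodogram for unevenly sampled data (mean removed)."""
    y = y - y.mean()
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        w = 2.0 * np.pi * f
        # Time offset tau makes the cosine and sine terms orthogonal on the sample times.
        tau = np.arctan2(np.sum(np.sin(2.0 * w * t)), np.sum(np.cos(2.0 * w * t))) / (2.0 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return power

def equivalent_amplitude(power, n):
    """Amplitude of a sinusoid that would give this power with n samples (P ~ n a^2 / 4)."""
    return np.sqrt(4.0 * power / n)

# Synthetic weekly height series over 3.5 years with a short gap (illustrative values).
rng = np.random.default_rng(0)
t_all = np.arange(183) / 52.0
keep = (t_all < 1.50) | (t_all > 1.60)
t = t_all[keep]
y = (4.0 * np.cos(2.0 * np.pi * t - 1.0)      # annual, 4 mm
     + 1.5 * np.cos(4.0 * np.pi * t - 2.0)    # semiannual, 1.5 mm
     + rng.normal(0.0, 1.5, t.size))          # white noise, 1.5 mm

freqs = np.arange(2, 61) / 10.0               # 0.2 to 6.0 cycles per year
amp = equivalent_amplitude(lomb_scargle_power(t, y, freqs), t.size)
amp_annual = float(amp[np.argmin(np.abs(freqs - 1.0))])
amp_semiannual = float(amp[np.argmin(np.abs(freqs - 2.0))])
print(f"equivalent amplitude at 1 cpy: {amp_annual:.1f} mm, at 2 cpy: {amp_semiannual:.1f} mm")
```

Stacking, as described for Figure 6, would amount to averaging such power estimates over sites at each frequency.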

[28] The presence of real, significant annually repeating signals is now conclusive. An interesting question is to what extent do these signals leak into power law spectral analyses? Mao et al. [1999] found the value α = 1 ± 0.4 describes globally referenced GPS coordinate time series and were unable to remove the annual term and achieve consistent results. Zhang et al. [1997] found the value α = 0.4 ± 0.1 best fits their regionally referenced time series, but their data span was too short to remove annual signals. While our background spectrum can be characterized by 0.5 < α < 1, the annual harmonics are better fit by larger values 1 < α < 2. The results of our study suggest that power spectral analyses not accounting for annual signals would tend to be biased toward higher values of α. Moreover, conversion of the power spectrum into time domain errors would be inaccurate if a significant part of the spectral power were due to annually repeating signals. It is not conclusive whether currently published power laws would therefore overestimate velocity errors at longer data spans, but it is a conjecture deserving attention beyond the scope of this paper.

3.3. Test of Velocity Bias Theory

[29] Ideally, we could test our theory of velocity bias by assessing the accuracy of estimated velocities as a function of data span. Obviously, we have no absolute truth to assess accuracy; however, we can compare velocity estimates for a variety of data spans with the solution using the longest data span. We have therefore developed the “velocity stability test,” which incrementally decreases the data span from 3.5 years for all site solutions and computes the RMS difference in velocity at each data span from the velocity at 3.5 years. Apart from providing an appropriate test of our theory, this approach has practical utility for assessing how velocity solutions stabilize in time and converge to acceptable values. Taking the test data span as T1 and the reference 3.5-year data span as T2, this experimental RMS is computed as

equation image

where uj(T) is the estimated velocity for site j and data span T and N is the number of sites. We call this the “observed velocity stability.”
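
The velocity stability test can be sketched on synthetic data as follows. The sketch assumes that the shorter span T1 is obtained by keeping the first T1 years of each series, and that the statistic is the RMS over sites of u_j(T1) − u_j(T2), as described in the text; the site velocities, the 2-mm annual amplitude, the 1-mm noise level, and the random seed are invented for illustration.

```python
import numpy as np

def fit_velocity(t, r):
    """Least-squares velocity for a model with initial position and velocity only."""
    A = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(A, r, rcond=None)
    return float(coef[1])

def observed_velocity_stability(t, series, T1, T2):
    """RMS over sites of the velocity difference between data spans T1 and T2.

    series has one row per site; spans are taken from the start of each series
    (an assumption of this sketch).
    """
    m1 = t <= T1 + 1e-9
    m2 = t <= T2 + 1e-9
    diffs = [fit_velocity(t[m1], r[m1]) - fit_velocity(t[m2], r[m2]) for r in series]
    return float(np.sqrt(np.mean(np.square(diffs))))

# Synthetic weekly series for 23 sites over 3.5 years (illustrative values only).
rng = np.random.default_rng(1)
n_sites = 23
t = np.arange(183) / 52.0
velocities = rng.normal(0.0, 10.0, n_sites)            # mm/yr
phases = rng.uniform(0.0, 2.0 * np.pi, n_sites)
series = (velocities[:, None] * t[None, :]
          + 2.0 * np.cos(2.0 * np.pi * t[None, :] - phases[:, None])
          + rng.normal(0.0, 1.0, (n_sites, t.size)))

stability = {T1: observed_velocity_stability(t, series, T1, 3.5)
             for T1 in (1.5, 2.0, 2.5, 3.0, 3.5)}
for T1, s in stability.items():
    print(f"T1 = {T1} yr  RMS velocity difference: {s:.2f} mm/yr")
```

With an unmodeled annual signal present, the RMS difference is markedly larger at a 2.0-year span than at 2.5 years, which is the undulating pattern the test is designed to reveal.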

[30] Appendix A2 derives equation (A16), a model of the expected variance in the change of parameter estimates when increasing the data span from T1 to T2:

equation image

where the appropriate inputs are given by equations (4) and (13). We call this the “modeled velocity stability.” Equation (23) accounts for both the formal covariance and the variance arising from annually repeating signals. There are two free parameters in this model: the level of white noise σw as input to equation (4) and the amplitude of the fundamental (annual) frequency a1 as input to equation (14); hence equation (13). In this case, a1 should be interpreted as the RMS amplitude averaged over sites. The most interesting feature of equation (23) is the theoretical prediction of temporal undulations in the observed velocity stability. This is a characteristic consequence of the presence of annual signals. By comparing the modeled with the observed velocity stability we can assess whether the model satisfactorily explains the pattern observed in real data.

[31] An advantage of this method is that it uses the time domain characteristics of real data through the observed velocity stability. A possible criticism is that the modeled velocity stability does not assume power law noise (for the nonrepeating components). It should, however, be noted that the observed velocity stability through equation (22) clearly shows the predicted undulations independent of any stochastic assumptions, which implies that any colored noise that might be present does not significantly mask the predicted effect. Another possible limitation is that equation (23) assumes a continuous time series and might not be valid for time series with long data gaps. Our analysis shows that equation (23) does appear to be reasonably valid for the data set from 23 IGS sites (Figure 5) that satisfy the previously described time gap criterion of <8 weeks.

[32] Figure 7 plots the observed RMS stability using up, north, and east coordinate time series from these sites. The curves represent the modeled RMS stability, with best fitting values for the annual sinusoidal amplitude a1 of 5 ± 1 mm for up and 2 ± 1 mm for east and north. An important feature is the evidence for undulations in the velocity bias, which can be clearly seen with annual period. The predicted flattening of the curves is evident near 2.5 years. As expected, the annually repeating signals are larger for the vertical component (5-mm RMS amplitude) than the horizontal (2-mm RMS amplitude), again consistent with both our direct estimation and spectral analyses. This consistency, together with the observed undulations predicted by the model, lends credence to the theoretical findings on the importance of integer-plus-half year data spans. In particular, Figure 7 verifies that 2.5-year estimates of velocity (our recommended minimum data span) are reasonably stable in comparison with 3.5-year estimates.

Figure 7.

Root-mean-square velocity bias for data spans with respect to 3.5-year data span for the (top) vertical component and (bottom) horizontal components. Real data are plotted. Curves represent models for each component.

4. Discussion

[33] We have developed, from first principles, a theory of errors in velocity for coordinate time series that contain annually repeating errors. The velocity biases predicted by the theory are relatively insensitive to the wide variation in power spectral index of coordinate time series reported in the literature. The model does explain well the stability of real velocity estimates versus data span and shows the same essential undulating features that have hitherto been absent from models of velocity error. We are now in the position of being able to answer our stated research questions.

  1. What are the temporal characteristics of velocity bias in the presence of an annually repeating signal when the velocity estimation assumes no deterministic model or temporal correlations? The velocity bias undulates in time with minimum values near integer-plus-half year spans and peak values near integer years. The undulations die out quickly after 4.5 annual cycles, after which they may be considered negligible. Figure 3 shows the velocity bias as a function of data span for a 1-mm amplitude annual signal for a range of spectral indices (applied to power at harmonic frequencies). Equation (16) expresses this bias assuming a power spectral index α = 1. Our data indicate that the power spectrum of annual harmonics can be adequately described by a range of spectral indices 1 < α < 2. Figure 3 shows that velocity bias is not very sensitive to the assumed spectral index.
  2. Can such temporal characteristics be used to advantage? Yes. To almost eliminate velocity bias, only use data spans greater than 4.5 years. Otherwise, select data spans of 3.5 years or 2.5 years, where the bias is minimum. The advantage of this recommendation is that it is extremely simple for any investigator to implement.
  3. How should we interpret errors of published velocity solutions (and hence the significance of the research findings) that have not accounted for annual signals? One approach is to add in quadrature to the published error the expected velocity bias based on the theory presented here. For example, take typical (or extreme) values of annual and semiannual amplitudes from Table 1, and insert them into equation (16) or scale Figure 3. Alternatively, estimate the amplitudes from published time series, keeping in mind that amplitudes would appear systematically smaller in residual time series due to secular bias [Black and Scargle, 1982].
  4. What is the minimum data span at which one should accept a velocity estimate (e.g., as input to ITRF) that has not accounted for annual signals? This clearly depends on the application. As a practical rule, the minimum acceptable data span for such velocity estimates is 2.5 years.
  5. What is the data span beyond which the degradation in precision arising from estimation of extra annual signal parameters is smaller than typical systematic bias? Degradation in precision due to the extra parameters is negligible for data spans beyond 2.5 years. Annual signals and their harmonics should not be estimated for shorter data spans due to correlated parameters, which cause random instability in velocity estimates.
  6. What is the data span beyond which negligible gain is to be made by estimating annual (and harmonic) signals? According to our data and models, rapidly decreasing gain is to be made by estimating annual signals for data spans beyond 4.5 years.
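
The undulation described in answer 1 and the quadrature rule in answer 3 can be reproduced numerically. The following Python sketch is not the authors' code and does not implement equation (16). It assumes weekly sampling, a noise-free annual sinusoid of 1-mm amplitude, an offset-plus-rate model fitted by ordinary least squares, and an RMS taken over evenly spaced phases. The function names `rms_velocity_bias` and `inflate_sigma` are hypothetical.

```python
import numpy as np

def rms_velocity_bias(span_years, amplitude_mm=1.0, dt_years=1.0 / 52, n_phases=72):
    """RMS over phase of the rate fitted to a pure annual sinusoid (true rate is zero)."""
    n = int(round(span_years / dt_years))
    t = np.arange(n) * dt_years                      # epochs in years (assumed weekly)
    design = np.column_stack([np.ones(n), t])        # offset and rate only
    phases = np.linspace(0.0, 2.0 * np.pi, n_phases, endpoint=False)
    # One column of synthetic data per phase of the annual signal.
    signals = amplitude_mm * np.sin(2.0 * np.pi * t[:, None] + phases[None, :])
    coeffs, *_ = np.linalg.lstsq(design, signals, rcond=None)
    return float(np.sqrt(np.mean(coeffs[1] ** 2)))   # mm/yr

def inflate_sigma(published_sigma, expected_bias):
    """Add the expected velocity bias to a published velocity error in quadrature."""
    return float(np.hypot(published_sigma, expected_bias))

if __name__ == "__main__":
    for span in (1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5):
        print(f"{span:.1f} yr: {rms_velocity_bias(span):.3f} mm/yr")
```

Under these assumptions the printed bias should dip near integer-plus-half year spans and peak near integer years. Scaling `amplitude_mm` to an amplitude read from a published time series gives a rough bias to pass to `inflate_sigma`.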

[34] While we certainly expect these findings to be useful to many users, there are some additional points to consider. First, equation (20) is not generally applicable for estimators that assume a colored noise stochastic model; however, the estimation of annual signals should still be effective. Second, annual signals might not be a dominant effect for a specific data set (and analysis procedure); however, we recommend that annual signals be initially assumed unless there is evidence to the contrary. Signal amplitudes and spectral indices should ideally be assessed for specific network and analysis procedures. Our amplitudes, however, should be generally applicable to globally referenced coordinates and may be taken as upper bounds for regionally referenced coordinates. Third, our velocity bias model might underestimate errors for time series with significant data outages (>8 weeks) and for sites where coordinate offsets require estimation due to equipment changes. Fourth, other types of error might also bias velocity. For example, hydrological loading is known to induce significant interannual signals and can even cause secular coordinate variation over several years [Van Dam et al., 2001].

5. Conclusions

[35] For precise geophysical applications such as tectonics, annual and semiannual sinusoidal signals should be estimated simultaneously with site velocity and initial position. Simultaneous estimation rapidly becomes unnecessary beyond 4.5 years, as we have shown that the velocity bias becomes negligible. Minimum velocity bias is theoretically predicted at integer-plus-half years, as confirmed by tests with real data. Below 2.5 years the velocity bias can become unacceptably large, and simultaneous estimation does not necessarily improve velocity estimates, which rapidly become unstable due to correlated parameters. We recommend that 2.5 years be adopted as a standard minimum data span for velocity solutions intended for tectonic interpretation or reference frame production and that we be skeptical of geophysical interpretations of velocities derived using shorter data spans.
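
As an illustration of the simultaneous estimation recommended above, the following Python sketch fits initial position, velocity, and annual plus semiannual sinusoids in one least squares solution. It is a minimal example under stated assumptions, not the authors' processing: the synthetic series (weekly samples, 10 mm/yr rate, 4-mm annual and 1-mm semiannual terms, optional white noise) and the names `design_matrix`, `fit_velocity`, and `synthetic_series` are all hypothetical.

```python
import numpy as np

def design_matrix(t_years, seasonal=True):
    """Columns: offset, rate, then cos/sin pairs at 1 and 2 cycles per year."""
    columns = [np.ones_like(t_years), t_years]
    if seasonal:
        for harmonic in (1, 2):
            arg = 2.0 * np.pi * harmonic * t_years
            columns += [np.cos(arg), np.sin(arg)]
    return np.column_stack(columns)

def fit_velocity(t_years, z_mm, seasonal=True):
    """Ordinary least squares; returns the estimated rate in mm/yr."""
    x_hat, *_ = np.linalg.lstsq(design_matrix(t_years, seasonal), z_mm, rcond=None)
    return float(x_hat[1])

def synthetic_series(span_years=3.0, dt_years=1.0 / 52, velocity=10.0,
                     noise_mm=0.0, seed=0):
    """Hypothetical coordinate series: offset + rate + annual + semiannual + white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(round(span_years / dt_years))) * dt_years
    z = (2.0 + velocity * t
         + 4.0 * np.sin(2.0 * np.pi * t + 0.7)
         + 1.0 * np.sin(4.0 * np.pi * t + 0.2)
         + noise_mm * rng.standard_normal(t.size))
    return t, z

if __name__ == "__main__":
    t, z = synthetic_series(noise_mm=1.0)
    print("rate, offset + rate only :", fit_velocity(t, z, seasonal=False))
    print("rate, with seasonal terms:", fit_velocity(t, z, seasonal=True))
```

With a 3-year span (an integer number of years), the offset-plus-rate fit is expected to show a visible bias, while the six-parameter fit should recover the simulated rate to within the noise.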

[36] Finally, we note the difficulty in characterizing the entire velocity error (not just the bias) in the time domain given that the power spectrum represents some combination of deterministic behaviors and various stochastic behaviors. This work is one step toward this goal. To conclude, we have established that a more complete time domain description of velocity errors must incorporate annually repeating signals; we have provided a means to quantify bias in published velocities and have recommended criteria for future publication of velocity results.

Appendix A:

A1. Least Squares in the Data Continuum Limit

[37] It would be possible to formulate this problem by defining the minimization functional as an integral over time of the difference between the “data function” describing the data continuum limit and the “model function,” which is linear in the estimated parameters. However, we take the point of view that fundamentally we wish to derive what happens to the results of the classic discrete least squares algorithm as the data are sampled at increasingly small intervals. While both approaches provide the same result, the discrete approach has the advantage that it more intuitively relates to the results produced by the actual algorithm used for data processing.

[38] Let us therefore consider the observation equations:

$$z(t_i) = A(t_i)\,{\bf x} + v(t_i), \qquad i = 1, \ldots, n \tag{A1}$$

where the terms are written as explicit functions of the n data epochs; z is the observed minus computed data; the matrix A contains partial derivatives of the observation model with respect to the m parameters x; and the vector v represents the data noise. The classic least squares normal equations are

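$$\left[\sum_{i=1}^{n} A^{\rm T}(t_i)\,A(t_i)\right]{\bf\hat x} = \sum_{i=1}^{n} A^{\rm T}(t_i)\,z(t_i) \tag{A2}$$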

where ${\bf\hat x}$ is the least squares parameter vector. Now let us assume that data are collected at regular intervals Δt, over the time period t = 0 to t = T. We can multiply both sides by Δt without affecting the computation of ${\bf\hat x}$:

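$$\left[\sum_{i=1}^{n} A^{\rm T}(t_i)\,A(t_i)\,\Delta t\right]{\bf\hat x} = \sum_{i=1}^{n} A^{\rm T}(t_i)\,z(t_i)\,\Delta t \tag{A3}$$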

Consider the case where the sampling interval is sufficiently small such that each summation in (A3) approaches the definite Riemann integral, which by definition gives

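$$\lim_{\Delta t \to 0}\sum_{i=1}^{n} A^{\rm T}(t_i)\,A(t_i)\,\Delta t = \int_0^T A^{\rm T}(t)\,A(t)\,dt, \qquad \lim_{\Delta t \to 0}\sum_{i=1}^{n} A^{\rm T}(t_i)\,z(t_i)\,\Delta t = \int_0^T A^{\rm T}(t)\,z(t)\,dt \tag{A4}$$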

Therefore as we decrease the data interval, the least squares estimate vector converges to

$${\bf\hat x} = \left[\int_0^T A^{\rm T}(t)\,A(t)\,dt\right]^{-1}\int_0^T A^{\rm T}(t)\,z(t)\,dt \tag{A5}$$

Since (A1) is linear, (A5) can be interpreted as a residual equation. For example, in the case that the model is inadequate, ${\bf\hat x}$ can be interpreted as systematic error in the estimated parameters, and z(t) can be interpreted as an assumed function representing systematic error in the data minus model. Equation (A5) is used in this way to derive systematic bias in velocity estimation arising from sinusoidal signals in the time series of position. For this purpose, the data interval must be much less than the sinusoidal period for equation (A5) to be a valid approximation.
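
The convergence of the discrete solution to the continuum limit can be checked numerically. The Python sketch below is an illustration only, under these assumptions: an offset-plus-rate model, a unit-amplitude annual sinusoid with an arbitrary phase of 0.7 rad as z(t), and samples placed at interval midpoints. The integrals in `continuum_estimate` are closed forms worked out for this specific case, and both function names are hypothetical.

```python
import numpy as np

OMEGA = 2.0 * np.pi  # annual angular frequency in rad/yr

def discrete_estimate(span_years, n_samples, phase=0.7):
    """Classic least squares on n_samples regularly spaced epochs over [0, T]."""
    dt = span_years / n_samples
    t = (np.arange(n_samples) + 0.5) * dt            # midpoints of each interval
    design = np.column_stack([np.ones(n_samples), t])
    z = np.sin(OMEGA * t + phase)
    normal = design.T @ design * dt                  # sum of A^T A times dt
    rhs = design.T @ z * dt                          # sum of A^T z times dt
    return np.linalg.solve(normal, rhs)

def continuum_estimate(span_years, phase=0.7):
    """Same estimate with the sums replaced by integrals over [0, T]."""
    T = span_years
    normal = np.array([[T, T**2 / 2.0], [T**2 / 2.0, T**3 / 3.0]])
    rhs = np.array([
        # integral of sin(wt + p) dt
        (np.cos(phase) - np.cos(OMEGA * T + phase)) / OMEGA,
        # integral of t sin(wt + p) dt
        -T * np.cos(OMEGA * T + phase) / OMEGA
        + (np.sin(OMEGA * T + phase) - np.sin(phase)) / OMEGA**2,
    ])
    return np.linalg.solve(normal, rhs)

if __name__ == "__main__":
    limit = continuum_estimate(2.5)
    for n in (10, 100, 1000):
        print(n, np.abs(discrete_estimate(2.5, n) - limit).max())
```

The printed differences should shrink as the sampling interval decreases, which is consistent with the requirement that the data interval be much shorter than the sinusoidal period.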

[39] The formal covariance matrix for the estimated parameters (which is not intended to account for systematic error) can be written

$$C_{\hat x} = \sigma_w^2\,\Delta t\left[\int_0^T A^{\rm T}(t)\,A(t)\,dt\right]^{-1} \tag{A6}$$

where $\sigma_w^2$ is the assumed variance of noise in data sampled at interval Δt. Note that the covariance matrix is proportional to the data interval only because the data interval is inversely proportional to the number of data. The true utility of equation (A6) lies in the fractional increase in the covariance matrix as extra parameters are added to the model (in which case, the assumed data interval is irrelevant).
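
The fractional increase in velocity variance from adding seasonal parameters can be sketched as follows. This Python example is an illustration under assumptions, not the authors' computation: it uses discrete, regularly sampled epochs, white noise of unit variance (which cancels in the ratio), and a six-parameter model with annual and semiannual cosine and sine terms. The function name `velocity_variance_ratio` is hypothetical.

```python
import numpy as np

def velocity_variance_ratio(span_years, dt_years=1.0 / 52):
    """Formal velocity variance with seasonal terms divided by that without them."""
    n = int(round(span_years / dt_years))
    t = np.arange(n) * dt_years
    base = np.column_stack([np.ones(n), t])                      # offset, rate
    seasonal = np.column_stack([
        func(2.0 * np.pi * harmonic * t)
        for harmonic in (1, 2)
        for func in (np.cos, np.sin)
    ])
    full = np.hstack([base, seasonal])
    # The noise variance and the data interval multiply both covariances equally.
    var_base = np.linalg.inv(base.T @ base)[1, 1]
    var_full = np.linalg.inv(full.T @ full)[1, 1]
    return float(var_full / var_base)

if __name__ == "__main__":
    for span in (1.0, 1.5, 2.0, 2.5, 3.5, 4.5):
        print(f"{span:.1f} yr: variance ratio {velocity_variance_ratio(span):.3f}")
```

Because the ratio depends only on the shape of the design matrix over the data span, changing `dt_years` should alter it only slightly, in line with the remark that the assumed data interval is irrelevant.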

A2. Variance in Temporal Change of Estimate With Time-Dependent Bias

[40] From least squares theory we know that the following two equations (which are analogous to weighted mean computations) describe how to update an estimate ${\bf\hat x}_{1}$ and covariance C1, which uses data up to time T1, by adding the estimate ${\bf\hat x}_{\Delta}$ from new, independent data with covariance CΔ from time T1 to T2, to produce a new estimate ${\bf\hat x}_{2}$ with covariance C2:

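$$C_2 = \left(C_1^{-1} + C_\Delta^{-1}\right)^{-1}, \qquad {\bf\hat x}_{2} = C_2\left(C_1^{-1}\,{\bf\hat x}_{1} + C_\Delta^{-1}\,{\bf\hat x}_{\Delta}\right) \tag{A7}$$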

“Independent data” of course amount to an assumption of data uncorrelated in time, which is a common assumption in data processing. We can write the change in estimate

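$${\bf\hat x}_{2} - {\bf\hat x}_{1} = C_2\,C_\Delta^{-1}\left({\bf\hat x}_{\Delta} - {\bf\hat x}_{1}\right) \tag{A8}$$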

Using the property that ${\bf\hat x}_{1}$ and ${\bf\hat x}_{\Delta}$ are independent, the covariance matrix for the change in estimates is given by

$$C_2\,C_\Delta^{-1}\left(C_\Delta + C_1\right)C_\Delta^{-1}\,C_2 = C_1 - C_2 \tag{A9}$$

Hence the variance of the difference in velocity estimates at times T1 and T2 is

$$\sigma_f^2(T_1) - \sigma_f^2(T_2) \tag{A10}$$

where the formal standard deviations in velocity σf (T) are given by equation (4).
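
The relation between the variance of the change in velocity and the formal variances of the two nested solutions can be verified without simulation. The Python sketch below assumes white noise, an offset-plus-rate model, and weekly sampling, and it assumes that (A10) reduces to the difference of the formal velocity variances at T1 and T2, as follows from (A7) when the longer solution contains all data of the shorter one. It does not use equation (4) itself. The names `velocity_weights` and `nested_variances` are hypothetical.

```python
import numpy as np

def velocity_weights(n_used, n_total, dt_years):
    """Weights mapping the data to the rate estimate that uses the first n_used epochs."""
    t = np.arange(n_used) * dt_years
    design = np.column_stack([np.ones(n_used), t])
    rate_row = np.linalg.solve(design.T @ design, design.T)[1]
    # Epochs after n_used do not contribute to the shorter solution.
    return np.pad(rate_row, (0, n_total - n_used))

def nested_variances(n1, n2, dt_years=1.0 / 52, sigma_mm=2.0):
    """Formal rate variances at T1 and T2, and the variance of the rate change."""
    w1 = velocity_weights(n1, n2, dt_years)
    w2 = velocity_weights(n2, n2, dt_years)
    var_f1 = sigma_mm**2 * float(w1 @ w1)
    var_f2 = sigma_mm**2 * float(w2 @ w2)
    var_change = sigma_mm**2 * float((w2 - w1) @ (w2 - w1))
    return var_f1, var_f2, var_change

if __name__ == "__main__":
    var_f1, var_f2, var_change = nested_variances(104, 182)   # about 2.0 and 3.5 yr
    print(var_change, var_f1 - var_f2)
```

The two printed numbers should agree to rounding error, since the covariance between nested white-noise solutions equals the covariance of the longer one.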

[41] In the presence of a time-dependent bias in velocity equation image, the variance in change of estimate includes both the formal component (A10), which accounts for precision, plus a systematic component, which is uncorrelated with the formal component:

equation image

where E denotes the statistical expectation operator.

[42] In the case of an annually repeating signal, equation image is given by equation (15), and the expectation operator E has the effect of averaging over all possible phases for each of the signal harmonics, as was performed in equation (9). The variance due to bias is therefore

equation image

where

equation image

In equation (A12) we have assumed that the phases for the various harmonics are uncorrelated. Equation (A12) can be expanded as a function of sines and cosines of the phase angles, which then allows us to apply the expectation operator:

equation image

Therefore (A11) becomes

equation image

Comparing equations (13) and (A13), we can express this alternatively as

equation image

where σk(T) is the theoretical RMS velocity bias for Fourier component k at data span T.

[43] In conclusion, equation (A16) can be used to compute the expected standard deviation in the difference of estimated velocity at times T1 and T2 (“modeled velocity stability”) using formal errors from equation (4) and the theoretical RMS velocity biases given by equation (13).

Acknowledgments

[44] G.B. gratefully acknowledges a Visiting Professorship from the University of Newcastle upon Tyne supported by an International Activities Grant from the University of Nevada, Reno (UNR). D.L. performed a portion of this work while at the University of Newcastle upon Tyne. The research was funded by the Department of Energy under the Yucca Mountain GPS Investigation (P.I., Jonathan Price), the National Aeronautics and Space Administration under subcontract to the State University of New York (P.I., William Holt), and by a grant and studentship from the Natural Environment Research Council (P.I., GB). D.L.'s travel to UNR was supported by the National Science Foundation through the University NAVSTAR Consortium, at the University Corporation for Atmospheric Research. As always, the authors are deeply grateful to the International GPS Service (IGS) and its contributors for all the data used in our work.
