The Stratospheric Diurnal Cycle in COSMIC GPS Radio Occultation Data: Scientific Applications

The diurnal cycle throughout the stratosphere is analyzed by applying Bayesian interpolation to Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC) Global Positioning System radio occultation (RO) data and three scientific applications of the analysis are introduced. First, the migrating thermal tides are analyzed with unprecedented accuracy and precision, with an uncertainty in the analysis of the vertically propagating tides ranging from 0.1 in the lower stratosphere to 0.6 K in the upper stratosphere for an individual month of RO data and with an uncertainty in a 10‐year climatological diurnal cycle a factor of 10 less. Moreover, the midlatitude trapped tide is found to be smaller than what is produced by an atmospheric model and lags the model in phase, a likely consequence of a faulty parameterization of eddy diffusivity in the upper stratosphere. Second, a clear signal of solar cycle influence on the diurnal cycle is evident in this analysis, but whether the cause is the systematic bias of ionospheric residual associated with RO retrieval or it is an actual atmospheric phenomenon is less clear. Third, RO satellites and missions that obtain inadequate coverage of the diurnal cycle will be biased by under‐sampling it, whether or not subsampling weather forecasts is used to removal sampling error. The analysis of the diurnal cycle in COSMIC RO data can be used to diagnose the systematic sampling error incurred by incomplete coverage of the diurnal cycle, which is of the order of 0.2 K for a Metop‐based RO climatology.


of 19
First, the diurnal cycle has been studied by atmospheric scientists for over a century, beginning with its manifestation in the surface air, then in the troposphere and lower stratosphere, and then up through the mesosphere (Chapman & Lindzen, 1970). Almost all of the observed diurnal variability in the atmosphere is due to the thermal tides, a form of global-scale internal gravity wave modified by the Earth's rotation and forced by the absorption of solar radiation by ozone in the upper stratosphere and by latent heating and absorption of solar radiation by water vapor in the troposphere. At the surface the tides are manifested as fluctuations in surface pressure as measured by meteorological stations; in the troposphere and stratosphere they can be measured in situ by radiosonde; and in the stratosphere and mesosphere by remote sensing. Each entails its own limitation. Surface pressure measurements are extremely limited in global coverage; radiosonde measurements require more than the usual twice daily or four-times daily sampling and are also limited in global coverage; and measurements from space require non-sun-synchronous sampling, which is only obtained by satellites in rapidly precessing orbits. The latter was obtained with measurements by the SABER instrument on the TIMED satellite (Sakazaki et al., 2012;Zhang et al., 2006) and radio occultation measurements from the CHAMP satellite (Pirscher et al., 2010). Yet SABER analysis of the tides was limited to the uppermost stratosphere and the mesosphere, leaving most of the stratosphere unsampled. Because both the TIMED and CHAMP satellites sampled diurnal time by rapid regression of nodes, the analyses of the tides from these data are subject to bias by aliasing of the seasonal cycle and planetary-scale waves to diurnal time-scales and are limited in their ability to quantify long time-scale variability of the tides.
RO data obtained by the COSMIC mission (Anthes et al., 2008) provides a unique opportunity to study the thermal tides in the stratosphere because it was constituted of six satellites spaced by 30° in right ascension of ascending node for a period of over 10 years. The data are global in distribution and span the diurnal cycle, albeit unevenly, at all latitudes for most of the lifetime of the COSMIC mission. RO soundings of stratospheric temperature and pressure as functions of geopotential height are precise and accurate. If a physical optics RO retrieval of bending angle is performed, the vertical resolution is approximately 100 m; if a geometric optics retrieval is performed-as is most often the case in the stratosphere-the vertical resolution is closer to 1.5 km, easily sufficient to resolve the thermal tides. In theory, little to no bias is incurred in retrievals of temperature and pressure in day-time versus night-time soundings except ionospheric residual; see below. The data are ideally suited to studies of the global structure of atmospheric tides in the stratosphere and their interannual variability. As a result, no thorough, continuous observational study of the tides in the global stratosphere was possible until COSMIC. More recently, the COSMIC-2 mission obtained dense diurnal coverage, thus enabling the study of tides and other equatorial waves (Randel et al., 2021). COSMIC-2 RO only covers the Tropics, however.
Second, systematic bias relevant to climatologies of RO and long-term trend analysis is incurred in day-time versus night-time RO sounding because of the ionosphere's influence on RO sounding. The microwave signals of the GNSS satellites are refracted in the ionosphere because free electrons contribute significantly to the microwave index of refraction. In order to eliminate the influence of the ionosphere on retrievals of neutral atmospheric variables, the ionosphere's contribution is eliminated by a linear combination of GNSS signals at two different carrier frequencies, which itself is possible because the ionosphere's contribution to microwave refractivity is dispersive. The dispersion leads to a complication, however, because the two signals do not follow the same path through the ionosphere and atmosphere, meaning any linear combination will be imperfect, leaving a residual second order error after eliminating more significant first order effects (Vorob'ev & Krasil'nikova, 1994). The effect was first identified by Kursinski et al. (1997). A method has been proposed to reduce this second order ionospheric residual in RO (Healy & Culverwell, 2015), but it has proven exceedingly difficult to isolate ionospheric residual in data because of naturally occurring atmospheric variability on the same time-scales and because other sources of error in RO retrieval have similar envelopes in their vertical structure.
RO data obtained by COSMIC are well suited to detecting the signature of ionospheric residual, because of COSMIC's complete coverage of the diurnal cycle over a complete solar cycle. The ionospheric residual is a greater systematic error in RO retrieval when the ionosphere, especially the F-layer, is more active, as it is during the day rather than at night, at solar maximum rather than at solar minimum. Analysis of the diurnal cycle in COSMIC RO data should show inter-annual variability in the amplitude and phase of the diurnal cycle that correlates with solar activity. A study like this is only possible with a data set that is longer than the solar cycle and completely covers solar time over the period of the data set. COSMIC RO data satisfy these conditions.

of 19
Third, much of the RO data that have been obtained in the last 25 years have not covered all solar times, making the climatologies that are constructed from them subject to aliasing and biasing effects. RO data collected by a sun-synchronous orbiter sample only two solar times, aliasing the semidiurnal cycle in the atmosphere to a systematic sampling bias. Such orbiters include the three Metop satellites, and NASA satellites including TerraS-AR-X and TanDEM-X, and FengYun-3C. Similarly, climatologies constructed from RO data collected from rapidly precessing orbiters are also biased by the atmospheric semidiurnal cycle, but because the regression of nodes of the satellites' orbits proceeded on seasonal time-scales, the incurred bias appears as intra-annual to interannual variability of the atmosphere. Such orbiters include CHAMP, C/NOFS, and GRACE. A method to remove biases due to uneven sampling, including sampling in solar time, has been proposed for RO data that works by subsampling the forecasts of a numerical weather prediction system at RO data geolocations, estimating a bias by subtracting a climatology formed by the interpolated forecasts from the monthly average climatology published by the numerical weather prediction system, and subtracting that bias from the RO climatology constructed from the RO data themselves (Foelsche et al., 2008(Foelsche et al., , 2011. Such approaches rely upon the atmospheric forecasts not creating systematic biases of their own, however. RO data obtained by COSMIC provide an opportunity to investigate the bias incurred by RO sampling patterns that do not fully cover the diurnal cycle. Herein, we do so for the Metop-A and Metop-B RO sounders, both in sun-synchronous orbits. An analysis of the diurnal cycle obtained from COSMIC data can be expanded at the solar times of Metop RO soundings to obtain the bias that would be incurred in a climatology formed from binning and averaging Metop RO data. Similarly, an analysis of the diurnal cycle of the difference of COSMIC data and an atmospheric analysis at the solar times of Metop RO soundings would give the bias that would be incurred in climatologies of RO data obtained by Metop even with sampling error correction applied. The consequences are expected to be significant for periods of RO when the diurnal cycle was not fully covered or when RO satellites covered the diurnal cycle only by regression of nodes, such as the era of CHAMP (Wickert et al., 2002).
The studies mentioned above are enabled by applying Bayesian interpolation to map the RO data obtained by COSMIC and Metop, and this paper focuses on the scientific applications made possible by this type of analysis. The applications introduced above, except an empirical analysis of ionospheric residual, have been undertaken elsewhere. Relevant work will be addressed in the body of this paper. In no case, however, has Bayesian interpolation been used in the analyses, and those works have suffered from a lack of error analysis and a lack of detection sensitivities as a consequence. This paper is organized as follows. The second section contains a description of the data and how they are analyzed. The third section presents the scientific applications introduced above are addressed by Bayesian interpolation applied to RO data. The fourth section presents a discussion.

Data and Method
The data sets and the way in which they are analyzed are presented. First comes an introduction to the RO data used. Second comes a brief introduction to a weather analysis and its forecasts that are used in this work. Finally, a brief overview of Bayesian interpolation is given.

Radio Occultation Data
The first Constellation Observing System for Meteorology, Ionosphere, and Climate (COSMIC; Anthes et al., 2008) obtained radio occultation (RO) data using the transmitters of the (GPS) from April 2006, through April 2020. The six COSMIC satellites were deployed from the same launch vehicle, and mutual regression of nodes was used to spread the satellites by 30° in ascending node in order to obtain approximately uniform coverage of the diurnal cycle. The constellation achieved its final configuration in November 2007, but with one of the satellites (F3) never achieving final orbit (Fong et al., 2008).
Microwave refractivity and dry temperature are retrieved by standard retrieval techniques (e.g., Hajj et al., 2002;Gorbunov et al., 2011). The excess phase data is obtained from the COSMIC DAAC (https://cdaac-www.cosmic. ucar.edu/). In the first step of retrieval, canonical transform of the second type is used to unwrap the effects of diffraction and atmospheric multipath below 25 km Gorbunov & Lauritsen, 2004) and geometric optics above 25 km to recover bending angle as a function of impact parameter. In the second step of retrieval, the Abel integral inversion is used to retrieve a refractivity profile from a bending angle profile. The integral is initialized between 40 and 120 km altitude with the assistance of an upper atmosphere bending 4 of 19 angle climatology (BAROCLIM) (Scherllin-Pirscher et al., 2015). In the third step of retrieval, the hydrostatic integral is initialized at 120 km under the assumption of an exponential profile of refractivity above that. Profiles of microwave refractivity, dry pressure, and dry temperature as functions of geopotential height (Leroy, 1997;Scherllin-Pirscher et al., 2017) result.
Implementation of the BAROCLIM climatology to aid in initializing the Abel inversion integral was formulated so as to minimize any bias in retrieval incurred by the initialization. BAROCLIM is a climatology of bending angle data between 40 and 120 km impact height derived solely from actual COSMIC RO data bending angle data. It is computed by averaging bending angle data in 10° × 10° longitude-latitude bins by month. Each monthbin contains approximately 11,000 soundings that are averaged, thus yielding an ensemble of smooth bending angle profiles, one per month-bin. In a retrieval, the BAROCLIM ensemble of smoothed bending angle profiles is searched for the best match to the observed bending angle profile, where best match is determined after a two-parameter tuning. The two-parameter tuning itself is a simple shift and slope adjustment in the logarithm of the bending angle. The search-tune-match algorithm is specifically designed so that no bias is introduced into the Abel inversion while preserving climatological realism above 40 km. While the ensemble of BAROCLIM smooth bending angle profiles may contain less variance than an ensemble of single soundings, the tuning should theoretically restore the single-sounding variance, including the variance associated with the diurnal cycle. Any diurnal cycle that exists in the measurements should persist through BAROCLIM initialization of the Abel inversion without increase or decrease. This is the same data set used in Gleisner et al. (2020) and described in algorithm theoretical baseline documents (http://www.romsaf.org) except that no attempt is made to account for ionospheric residual by "kappa-correction." Figure 1 shows the timeline of the number of COSMIC RO data considered monthly. At its peak performance, COSMIC generated ≈60,000 soundings monthly tapering off to approximately 20,000 monthly by 2016.

ERA-Interim
In order to evaluate the performance of sampling error correction as a technique for constructing climatologies of RO data, the forecasts of a numerical weather prediction system are compared to the RO data. Here, the forecasts of ERA-Interim (Dee et al., 2011) are interpolated to the time and location of COSMIC and Metop-A,B RO data. The forecasts of ERA-Interim are interpolated rather than the analyses because ERA-Interim has assimilated COSMIC and Metop-A,B RO data. Because the analyses nearly perfectly replicate the RO soundings themselves at the times and locations of the soundings, an RO climatology constructed using the sampling error correction technique would simply recreate the climatology of ERA-Interim. ERA-Interim generated forecasts at 3-hr intervals from the analyses at 00 and 12 UTC hr. Consequently, in order to cover all UTC times, the 3-hr, 6-hr, 9-hr, and 12-hr forecasts were interpolated. Microwave refractivity is computed from the model forecasts of temperature, pressure, and water vapor using the Smith-Weintraub formula. The sampling error correction technique itself is described elsewhere (Foelsche et al., 2008(Foelsche et al., , 2011Gleisner et al., 2020).

Bayesian Interpolation
Bayesian interpolation fits data with a priori unknown error characteristic that are nonuniformly distributed without overfitting the data. It does so by fitting a superposition of basis functions to the data and penalizing the fit. In a second level of inference, the weighting between the misfit and the penalty is determined optimally. Originally, a thorough exposition of the method was given for fitting data with one independent coordinate (MacKay, 1992). It was adapted to mapping RO data by considering spherical harmonics as the basis functions (Leroy, 1997) and the penalty, or regularizer, tuned rigorously and its performance evaluated later (Leroy et al., 2012). Because RO climatologies depend strongly on how the RO soundings sample the diurnal cycle, the technique was expanded by multiplying the spherical harmonic basis function by sinusoids in diurnal time, by optimizing the regularizer, and evaluating the performance (Leroy et al., 2021). The approach has been presented in depth in all of those previous works; Leroy et al. (2021) can be considered a companion work.
The RO retrievals of temperature and refractivity are interpolated onto surfaces of constant geopotential height ranging from 10 to 50 km at intervals of 0.5 km. The dry temperature (Danzer et al., 2014) of the RO soundings and ERA-Interim output are interpolated linearly in geopotential height as are the logarithms of microwave refractivity for both the RO profiles and ERA-Interim output. ERA-Interim is interpolated bilinearly in longitude and latitude to the position of each RO sounding. ERA-Interim is interpolated linearly in time to the time of each RO sounding. Bayesian interpolation is then applied up to spherical harmonic degree 14 and to the fourth harmonic of the diurnal cycle. Even though Leroy et al. (2021) recommends that RO data be collected into threeto five-day bins before applying Bayesian interpolation, in part to account for temporal synoptic variability as a bias in a climatology, Bayesian interpolation is applied to monthly collections of RO data. While doing so would create biases in a climatology because of uneven daily sampling, it is not expected to impact climatologies of the diurnal cycle, which is the goal of this work.
Not only are the RO refractivity and dry temperature mapped by Bayesian interpolation, so are the differences between RO data and atmospheric model forecasts of the same fields. The difference between data and model forecasts bears on monthly average assemblages of RO data, also referred to as climatologies. Traditionally, most atmospheric science uses atmospheric reanalyses, and research on phenomena that occur on seasonal to multi-decadal time-scales involve monthly average gridded reanalysis products. RO climatologies are desired because RO data are understood to be far better calibrated than reanalyses and hence more useful for climate trend studies. RO climatologies are noisy, however, because RO data are sparse and greatly under-sample atmospheric variations associated with synoptic variability. It has become common to attempt to remove the noise in RO climatologies by estimating subsampling biases and subtracting those from RO data-only bin-and-average climatologies. The biases are estimated by a bin-and-average calculation applied to reanalysis fields interpolated in time and location to the times and locations of the RO soundings and then subtracting from that calculation the gridded monthly average fields of the reanalysis. Finally, the published RO climatology is the bin-and-average RO data with the subsampling bias subtracted. This is the sampling error removal algorithm.
Mathematically, the sampling error removal algorithm can be written as in which y is the variable to be mapped, y(r i ) is the value of that variable as retrieved by RO at location r i , and y M (r i ) is the model-interpolated value of that variable at locations r i .  ( ) is a gridded monthly average published by the model, ( ) is the published RO climatology,  ( ( )) is the bin-and-average of RO data, and  ( ( )) is the bin-and-average of the model output fields interpolated to the times and locations r i of the RO data. The term in square brackets is the under-sampling bias of RO data. This can be rearranged to obtain because of linearity of the bin-and-average operation. Equation 2 provides a new interpretation of the sampling error removal algorithm: the published RO climatology is just the gridded reanalysis monthly average  ( ) with a model bias  ( ( ) − ( )) removed. The model bias-the second term on the right of Equation 2-is mapped by Bayesian interpolation as well as maps of RO data alone. As is apparent in Equation 2, the difference maps are expected to yield insights into the performance of the sampling error removal technique: if systematic errors exist in the model, which is 6 of 19 ERA-Interim in this case, then the sampling error removal technique can also introduce unwanted biases into the published climatology ( ) . The same regularizer as is used in mapping RO data by themselves is used in mapping the differences between RO data and model-interpolated data.
The analysis by Bayesian interpolation allows for research of migrating and nonmigrating tides alike, but this work concerns only the migrating tidal features. Bayesian interpolation makes such an analysis simple. Recall from Leroy et al. (2021) that the basis functions are defined as with P lm (x) the associated Legendre polynomial of degree l and order m, λ, φ, τ are longitude, latitude, time, all in radians, and μ a proxy index for unique combinations of degree l, order m, and diurnal cycle harmonic n. For the analyses presented below, one only needs to perform the Bayesian interpolation with time τ defined as diurnal time and retain only the m = 0 terms in the expansion of the basis functions. Sampling error analysis then requires only the consideration of the rows and columns of the error covariance matrix of the expansion coefficients corresponding to the retained terms.

Example Map
An example of Bayesian interpolation applied to COSMIC data is shown in Figure 2. It is an analysis of the diurnal average dry temperature at 15 km geopotential height for the month of January 2010. This level lies in the stratospheric middle-world, with 15 km lying in the upper troposphere in the Tropics and in the lower stratosphere at higher latitudes. The enormous meridional gradient in temperature can be considered a demarcation of the extent of the Tropics. The warm summer (southern) stratosphere is prominent as is a winter (northern) midlatitude stationary eddy with zonal wavenumber 1, the latter of which is anticyclonic over the Pacific with an amplitude of ≈20 K. To form such a map, only the terms of the basis function expansion using diurnal cycle harmonic n = 0 are retained. Maps such as this are formed for every month between September 2006 and December 2016 for all geopotential heights between 10 and 50 km at 0.5 km intervals. The same is done for the difference between COSMIC RO data and the forecasts of ERA-Interim; see Figure 8 and Section 3.5.

Migrating Thermal Tides and Uncertainty
Bayesian interpolation extended to account for the diurnal cycle lends itself ideally to studies of atmospheric thermal tides. Figure 3 is an animation of the migrating tides averaged over the entire extent of the COSMIC RO record and for all seasons. It shows the temperature anomaly associated with the migrating tides for diurnal harmonics 1 through 4 at 3-hr intervals in solar time. These plots can be interpreted as combinations of two distinct phenomena: the vertically propagating migrating tides near the equator, and a vertically trapped oscillation that occurs in the upper stratosphere in mid-latitudes. The vertically propagating migrating tides show downward vertical propagation beginning at the stratopause (50 km height) and ending at the tropical tropopause (17 km). The positive temperature anomaly first appears at the stratopause at ∼03 hr solar time. After 24 hr it descends to 25 km height, and after 33 hr it reaches the tropical tropopause. The negative temperature anomalies lag by ≈12 hr, as the tides are dominated by the n = 1 diurnal harmonic. The amplitude of the migrating tides varies with altitude. In addition to the vertically propagating mode within 10° latitude of the equator, a pulsing mode appears in the upper stratosphere in mid-latitudes, approximately between 35 and 50 km height and between 20° and 60° latitude in both hemispheres. It is sinusoidal in solar time without any signature of vertical propagation. It reaches peak amplitude at ≈18 hr solar time and is also dominated by the n = 1 diurnal harmonic.
The vertically propagating and vertically trapped modes are known from other studies. Until the COSMIC project, they were seen in remote sensing data from satellites in non-sun-synchronous orbits, such as LIMS (Hitchman & Leovy, 1985), UARS (Wu et al., 2008), CRISTA (Ward et al., 1999), and TIMED/SABER (Mukhtarov et al., 2009); see Sakazaki et al. (2012) for discussion and other references. The study of thermal tides using these satellites' data examined the tides' structure in the mesosphere and lower thermosphere primarily, mostly because the tides' amplitudes were large enough to be easily detected. The precision and sensitivity of RO in the stratosphere enabled their study in the stratosphere, first with the CHAMP mission (Xie et al., 2010;Zeng et al., 2008)  and later using COSMIC RO data (Pirscher et al., 2010). Before COSMIC, all of these studies were done with data from rapidly precessing orbits, begetting the concern that the precision could alias seasonal and other long term variability to diurnal timescales.
It is customary in studies of migrating thermal tides to contour wave amplitude and phase separately, as is done here in Figure 4 for the first harmonic of the diurnal cycle. With the output expansion coefficients of Bayesian interpolation, the n'th harmonic migrating thermal tidal temperature fluctuation at latitude φ, height h and for month i can be written as with τ being solar time in radians. The amplitude A n and phase τ n of the n'th migrating tidal harmonic are computed by in which arctan(⋯ , ⋯) is a four-quadrant arctangent. The error analysis for the inference of the amplitude and phase of the migrating tides derives from RO sampling error and from inter-annual variability in the amplitude and phase of the migrating tides and is complicated by the strong seasonality in the tides, typified by meridional motion in the peak of the tides according to the sub-solar latitude. In this error analysis, the monthly time-series of cosine coefficients C n (φ, h, i) and sine coefficients S n (φ, h, i) are de-seasonalized, yielding ̃( ) and ̃( ) for month i in the time-series. Anomalies in the time-series of these coefficients are caused by inter-annual variability in the tides and sampling error due to under-sampling of the atmosphere by COSMIC RO data. The de-seasonalized time-series of the cosine and sine terms of the tidal expansion can then be used to estimate the uncertainty in each of those terms along with their inter-correlation ρ: COSMIC RO data for November, 2007, through December, 2016, are used in this analysis because the COSMIC constellation achieved its final formation only in November, 2007; hence, M = 110 months. The reason 12 degrees of freedom are removed in the computation of root-mean-squares is because the monthly seasonal cycle (12 months) is computed and subtracted from the time series of cosine and sine coefficients in the construction of the anomaly time series ̃ and ̃ . If the annual average climatology of the cosine and sine terms is just the average of the mean-monthly climatologies, resulting in C n (φ, h, i) and S n (φ, h, i), and, assuming stationarity of the anomaly time series ̃( ) and ̃( ) , the uncertainties in the amplitude σ amplitude and in the phase σ phase become 2 amplitude = ( 2 ( ) 2 + 2 ( The coefficients C n (φ, h), S n (φ, h), and A n (φ, h) are averages of the de-seasonalized coefficients. An explanation and derivation of these equations are given in Appendix A. An extra factor of M is included in these equations because σ amplitude and σ phase are standard errors rather than standard deviations such as σ c and σ s .
In Figure 4, downward phase propagation over the equator is evident. The amplitude (plot a) is ≈1 K, which occurs at 42 km height, with a secondary maximum of ≈0.7 K at 30 km height. Multiple maxima in amplitude are likely caused by the background zonal wind shear; see Hagan et al. (1999), for example. Downward phase propagation is evident as phase increasing with depth over the equator (Figure 4c). Most prominent, though, is the tidal feature in the midlatitude upper stratosphere at heights above 37 km and between 20° and 70° latitude in each hemisphere. The feature has an amplitude of 2 K and shows no sign of vertical propagation and only very weak propagation poleward in each hemisphere. Figure 4 shows the uncertainty associated with the amplitude as Figure 4b. Over the equator, the uncertainty in the amplitude reaches 0.05 K at 44 km height and decreases to less than 0.02 K beneath 38 km. The signal-to-noise ratio of detection of the n = 1 migrating thermal tide over 10 of 19 the COSMIC-1 record is in excess of 20. Uncertainty in the amplitude is maximized in the mid-latitude upper stratosphere, especially in the northern hemisphere, reaching 0.08 K at 65°N and 44 km height. The maxima in both hemispheres overlie the poleward edge of the midlatitude non-propagating tidal feature. Detection of this non-propagating feature has a signal-to-noise ratio in excess of 10. Finally, Figure 4 also shows the uncertainty in the phase of the n = 1 migrating thermal tides. Figure 4d shows that the uncertainty in the phase over the equator is maximized at 48 km height is 12 min. Beneath 40 km, the uncertainty in the phase is less than 6 min. Figure 5 shows the annual cycle of the n = 1 migrating thermal tide as sampled by COSMIC RO data. Only the amplitudes are shown for each month, but they are masked where the noise exceeds one-third the amplitude. The uncertainties are computed using the same formalism of equations 6 and 7. Because 10 years of COSMIC RO data are used, however, the equations are exercised independently for each month with M = 10 and a denominator (inside the square root) of M − 1, there being M − 1 independent degrees of freedom in the calculations of the standard deviations and correlations. While this is not the first time the seasonality of the stratospheric tides has been analyzed using RO data, it is the first time it has been done with an error analysis, showing the power of Bayesian interpolation. Figure 6 shows the amplitude and phase, along with their uncertainties, for the n = 2 harmonic of the diurnal cycle. The tidal amplitude is maximized in the upper stratosphere in latitudes that span the Tropics, reaching 0.6 K at 45 km height. As opposed to the n = 1 harmonic, the phase plot indicates upward phase propagation. Figures 4, 5, and 6 are possible up to n = n max . In this analysis, n max = 4.
The climatology of the diurnal cycle constructed here is monthly, giving expansion coefficients for every month from late 2006 through the end of 2016. When considering the diurnal cycle in just one month, the only uncertainties come from the leakage of synoptic (weather) variability and measurement error into the inference of the coefficients. In Bayesian interpolation, this sampling error is computed directly under the assumption that

of 19
it is spatially and temporally uniform. Instructions on computing this uncertainty from the output of Bayesian interpolation can be found in Leroy et al. (2021), Equation 6. Figure 7 shows the impact of sampling error on the determination of the amplitude of four harmonics of the diurnal cycle. It is the average of the sampling error over all of the months from January, 2007, through December, 2016, having neglected the last few months of 2016 because the COSMIC RO did not yet fully sample the diurnal cycle. For the vertically propagating equatorial tides, the uncertainty in amplitude due to sampling error decreases with harmonic n and increases with height. For the diurnal harmonic (n = 1), the uncertainty in amplitude ranges from 0.01 K in the lower stratosphere to 0.06 K at the stratopause over the equator. The uncertainty decreases by approximately 30% for the semi-diurnal harmonic (n = 2). The uncertainties always decrease from the Tropics to the midlatitudes but see strong maxima over both poles, where wintertime synoptic variability is greatly enhanced; see Figure 5 of Leroy et al. (2021), for example. The reason the analyses of Figures 4 through 6 are much more precise than for Figure 7 is that they involved averaging over many months whereas Figure 7 shows uncertainties for just 1 month.

Midlatitude Oscillation
The vertically propagating migrating thermal tides over the equator have been documented elsewhere, but the trapped midlatitude upper stratospheric structure for the n = 1 harmonic seen in Figure 4 has not been thoroughly investigated. For this reason, annual average analyzed time series of dry temperature in the peak of the feature are shown in Figure 8. This figure shows an expansion of the Bayesian interpolation at 45°N latitude and 48 km height for both the analyzed RO data and ERA-Interim. For each year, the average expansion coefficients over the 12 months of that year are computed before the expansion is performed. The diurnal cycle of ERA-Interim is computed by expanding the Bayesian interpolation fit of the RO data and adding the fit of model less data, the latter being the last term on the right of Equation 2. All harmonics of the Bayesian interpolation analysis are included in the expansion.
The analysis of the mid-latitude trapped feature in Figure 8 reveals two new findings. First, RO data and the model disagree on the amplitude and phase of the trapped, upper stratospheric midlatitude feature. RO data show the maximum amplitude to be ≈2 K while ERA-Interim shows the amplitude to be ≈3 K. The time series of the feature in RO data also lags the time-series in ERA-Interim by ≈2 hr. Second, the RO data show more interannual variability in this feature than does ERA-Interim. There are notable departures from a mean diurnal cycle in the analysis of RO data especially for years 2012 through 2015. The departures are particularly noticeable in Figure 8b. The years 2012 through 2015 are years of maximum activity in the solar cycle. This is evident in Figure 9, which shows the f10.7 cm flux that is commonly used as an indicator of solar activity.
The difference in the diurnal cycle between data and model in solar maximum years is very similar to the same difference in solar minimum years as seen in Figure 8b: the model bias is robust. The change in the data-model difference from solar minimum to solar maximum is much smaller than the mean data-model bias but is robust nevertheless. We conclude both that the upper stratospheric trapped feature is poorly analyzed in ERA-Interim and that a further bias is incurred by the solar cycle. The former is a much larger effect than the latter.

Ionospheric Residual?
Two explanations are possible for the correlation between solar activity and the diurnal time series of the midlatitude upper stratospheric trapped feature. The first is that it is an actual feature of the atmosphere, that changing forcing by solar ultraviolet changes the stratospheric thermal tides. The second is that solar activity induces a bias in RO temperature sounding. The first is not without merit, as solar ultraviolet forcing of the stratosphere and the tropospheric climate through downward influence has been hypothesized and discussed for over two decades (Lean, 2010). The second, however, is possible as well, as residual ionospheric error in RO remote sensing is a well known yet elusive phenomenon.
It is exceedingly difficult to disentangle the different contributors to error in RO retrieval in the stratosphere using data alone. Estimating the contributors theoretically is possible and has been done elsewhere (Danzer et al., 2020) but never empirically. In order to explore the impact of the 11-year solar cycle on the stratospheric diurnal cycle, a triple-difference analysis is formed, one that reduces sampling error, isolates the solar cycle, and analyzes the change in the diurnal cycle. Figure    14 of 19 quantity of interest, which is the diurnal cycle. The second difference serves to remove noise due to sampling error, following Foelsche et al. (2008); Foelsche et al. (2011), and thus yield a less noisy analysis. The third difference serves to attribute results of this analysis to the 11-year solar cycle. The best interpretation of this analysis and the resulting figure is that it shows the influence of the solar cycle on the diurnal cycle in dry temperature in the stratosphere as it is mis-analyzed by ERA-Interim. It could represent either mis-modeling of the atmosphere's response to forcing by the solar cycle as seen in the diurnal cycle or ionospheric residual associated with RO.
The triple-difference of Figure 10 shows a pattern that is symmetric about the equator, is strongest in the upper stratosphere, and tapers off with depth and at high latitudes. The dominant extrema occur in midlatitudes in the upper stratosphere in both hemispheres, specifically at the stratopause (50 km) and 40° latitude in both hemispheres, and the amplitude of the extrema is a change of approximately −1 K. The response in the Deep Tropics, again at the stratopause, is less but still approximately −0.3 K. Further analysis (not shown) shows the pattern to be robust. The pattern remains largely unchanged in all combinations of individual solar maximum years (2012, 2013, 2014, and 2015) less solar minimum years (2007, 2008, and 2009), which forms 12 independent combinations. Even the positive response over in the upper stratosphere over the Antarctic is robust in these combinations. Figure 10 cannot be easily explained theoretically. A method for removing ionospheric residual from RO climatologies has been proposed that is based on the intensity of solar activity, making it possible to generate RO climatologies with and without the correction applied. The difference between those would be a theoretical estimate for the influence of ionospheric residual in temperature. Figure 7 of Danzer et al. (2020) shows the impact of ionospheric residual correction in the diurnal average, zonal average temperature response for 1 month at low solar activity and for another month at high solar activity. The extremum impact of ionospheric residual is approximate −3.5 K at high solar activity and −0.5 K at low solar activity. The theoretical signature is symmetric about the equator and peaked at the stratopause over the equator. The same study does not present the diurnal signature of ionospheric residual, however. While the theoretical response is similar in form to that of Figure 10, its magnitude is substantially larger than that in Figure 10 and it does not show extrema at the midlatitude stratopause. Figure 8 shows clear evidence that ERA-Interim mis-models the stratospheric diurnal cycle, especially the trapped midlatitude stratopause feature, and Figure 10 hints at mis-modeling elsewhere near the stratopause. The mis-modeling is systematic in the diurnal cycle; thus, the model can be expected to induce errors in sampling error-adjusted climatologies of RO data if it, or another model that includes such a bias, is used to correct for sampling errors. Recall that, because RO data still greatly under-sample synoptic variability in the atmosphere, one frequently used method to remove the sampling error that results from under-sampling in order to yield less noisy climatologies is to subsample model forecasts to the geolocations of the RO soundings and compare monthly average model fields to climatologies of the subsampled model constructed in the same way as climatologies of RO data. The computed sampling bias is then subtracted from the RO climatology. While intended to reduce the random error normally associated with undersampling of synoptic variability, the technique of sampling error removal can also introduce biases in climatologies due to biases in model realizations of the diurnal cycle, which is possible only when the distribution of RO data under-samples the diurnal cycle.

Sampling Error
The Metop-A, Metop-B, and Metop-C satellites fly in sun-synchronous orbits, with the ascending nodes at 20:46, 21:30, and 21:30 solar time, respectively. Each of the Metop satellites carries a GPS RO instrument on board, and the high latitude orbits of Metop can deliver RO distributions that cover the globe. The distribution of RO soundings, however, severely under-samples the diurnal cycle, however, and a mis-modeled diurnal cycle will yield sampling-error-corrected climatologies that are biased. The resulting bias is computed as an expansion of the Bayesian interpolation fit to the COSMIC-ERA-Interim difference-the last term on the right of Equation 2-at 9:30 and 21:30 solar time. The result is Figure 11.
The biases in Metop climatologies of RO data after sampling error removal are generally small in the upper troposphere and the lower and middle stratosphere but become more pronounced in the upper stratosphere, as seen in Figure 11. The incurred bias is less than 0.04 K beneath 37 km height. Above that level, however, the bias reaches an amplitude of approximately −0.5 K in the midlatitude upper stratosphere in the Southern Hemisphere and approximately −0.2 K in the Northern Hemisphere. The largest biases correspond to the trapped midlatitude migrating tide seen in the upper stratosphere seen in Figure 4a. Mis-modeling of this feature is as large as 1.5 K as seen in Figure 8, much larger than the incurred bias in Figure 11. This is a fortuitous consequence of Metop sampling at solar times that correspond to nulls in the mis-modeling of the diurnal cycle; see Figure 8b. If the Metop satellites crossed the equatorial plane at 3:00 solar time instead of 9:30, the incurred temperature bias would instead have been approximately 1.0 K.

Discussion
The theory and observation of migrating thermal tides in the upper atmosphere have been written about extensively in the past, but only recently has ionospheric residual bias in RO begun to be explored, and little work has been done to examine sampling biases incurred in RO climatologies when the diurnal cycle is under-sampled. Each of these three topics is addressed above by analyzing the diurnal cycle as observed by COSMIC RO using Bayesian interpolation and is discussed in the context of previous work below.
For thorough descriptions of previous research on thermal tides in the upper atmosphere, stratosphere through thermosphere and ionosphere, see a recent review (Pancheva & Mukhtarov, 2011) or some of the most recent papers on the topic (Mukhtarov et al., 2009;Sakazaki et al., 2012Sakazaki et al., , 2018. As discussed in the Introduction, most observational work on atmospheric tides uses the observations of the SABER instrument on the TIMED satellite (Russell et al., 1999), but the analyses are focused on the mesosphere and thermosphere with little analysis in the stratosphere and without coverage poleward of 50° latitude in both hemispheres. Prior to TIMED/SABER, observational work on tides was based on data of the Microwave Limb Sounder on the UARS satellite. Both satellites sampled the diurnal cycle, which is necessary for diagnosing tides, by undergoing rapid regression of nodes.
The work above showed how complete coverage of the diurnal cycle obtained by the COSMIC RO mission can be analyzed for migrating tides throughout the stratosphere. Certainly the diurnal and semidiurnal tides have been analyzed in satellite data previously. What is particularly noteworthy here is the accuracy obtained when analyzing the COSMIC RO data by Bayesian interpolation as extended into the diurnal cycle. Bayesian interpolation is 16 of 19 used in the previous section to establish a climatology of the migrating tides that is accurate and precise to less than 0.1 K in amplitude for each harmonic of the diurnal cycle and each month of the annual cycle. Only over the polar regions can the amplitude not be inferred with a signal-to-noise ratio greater than 3. The uncertainty analysis considers both the random error resulting from under-sampling of synoptic variability and interannual variability of the diurnal cycle. The accuracy and precision extends to estimates of the phase of migrating tides. The diurnal harmonic migrating tide (Figure 4) shows predominantly downward phase propagation while the semidiurnal harmonic ( Figure 6) shows upward phase propagation in the tropical stratosphere. Because upward phase propagation corresponds to downward group velocity for tides and other forms of atmospheric gravity waves, the analysis of the previous section reveals that the dominant forcing of the diurnal migrating tide is tropospheric and the that of the semidiurnal tide is upper stratospheric. The forcing mechanism in the upper stratosphere is radiative heating by the absorption of solar ultraviolet by ozone, and the forcing in the troposphere is latent heating by deep convection. For explanations of the forcing, see, for example, Forbes (2002, 2003). The detail and precision of the phase plots of Figures 4 and 6 enables such discernment.
The midlatitude nonpropagating feature in the upper stratosphere has been diagnosed previously in TIMED/ SABER data (Mukhtarov et al., 2009;Sakazaki et al., 2012Sakazaki et al., , 2018 but with some question. The previous work identified it as a predominantly diurnal feature with maximum at 16:00 solar time but cast some doubt on the diagnosis because the TIMED/SABER data coverage was obtained by nodal regression. This analysis, based on complete continuous coverage of the diurnal cycle, diagnoses it also as a predominantly diurnal feature, but its maximum occurs later in the day, at approximately 18:30 solar time. The feature corresponds to a tidal mode long-known as a solution to the tidal equations, namely the Θ(−1, 2) symmetric Hough mode (Chapman & Lindzen, 1970;Mukhtarov et al., 2009). It does not propagate meridionally or vertically and is thus a trapped mode, one most likely forced by solar ultraviolet absorption by ozone because of its proximity to that forcing. Propagating modes are dissipated by propagation, but a trapped mode such as this can be dissipated only by eddy dissipation or long-wave radiative relaxation. Atmospheric models in the form of reanalyses do reproduce this mode (Sakazaki et al., 2018), but this analysis shows that at least one of them, ERA-Interim, overestimates its amplitude and produces a maximum in phase too early in the day. Both the increased amplitude and the phase lead of this mode produced by the model can be explained by mis-modeling of eddy diffusivity in the upper stratosphere. An increase in the amount of eddy diffusivity would both decrease the amplitude of the midlatitude trapped mode and lag the maximum to later in the day. If so, such analyses could prove a useful diagnostic for the physics of diffusion in the upper atmosphere.
The same analysis also shows that the midlatitude trapped feature shows systematic differences in solar maximum versus solar minimum years. While it is possible that the large changes in solar ultraviolet associated with solar activity could induce a temperature change in the upper stratosphere associated with the diurnal cycle, it is more likely that the change in the diurnal cycle seen in Figure 10 is associated with ionospheric residual error in RO retrieval. It comes about because of the divergence of the paths of the two GNSS signals tracked during an RO sounding, a consequence of the dispersive nature of the ionosphere. The greater the electron density in the ionospheric F-layer, the more the GPS L1 and L2 signals diverge, and the greater the error in the retrieval of temperature in RO sounding. Moreover, an increase in ultraviolet solar radiation can amplify the tidal response but it cannot change the phase, which is clearly seen in Figure 8. The BAROCLIM climatology of bending angles used in initializing the Abel inversion for refractivity is as subject to ionospheric residual as each RO sounding. Because it is searched and tuned to individual RO soundings, it cannot introduce any ionospheric residual error in RO that is not already present in each RO sounding. A more thorough study is needed, however, to disentangle a real atmospheric temperature change in the diurnal cycle versus a signature of ionospheric residual. This could be accomplished using an RO data set that has been corrected to eliminate ionospheric residual, following approaches recently described (Danzer et al., 2020;Fan et al., 2017;Healy & Culverwell, 2015).
Finally, few remote sensing techniques possess the absolute accuracy that is needed for climate monitoring, but RO is one of them. RO possesses this property because its fundamental observable is traceable to the international definition of the second, which is presently known to approximately one part in 10 15 , as opposed to a radiance standard, which is known much more poorly. In the stratosphere, however, three issues must be considered when using RO for climate monitoring: (a) the possibility of local multipath, (b) ionospheric residual, and (c) nonuniform sampling. (Local multipath occurs when the GNSS receiver tracks a GNSS signal reflected from an element on board the satellite in addition to the same GNSS signal that propagates directly from the transmitter to the 17 of 19 receiving antenna.) Factors (a) and (b) were described in Kursinski et al. (1997); factor (a) can be mitigated by systematic intercomparison between missions, and factor (2) as discussed in the previous paragraph. The third factor, though, may not be easily corrected if the RO data systematically under-sample the diurnal cycle. The method most commonly used to correct for sampling error is the sampling error removal technique, wherein the sampling error corresponding to an RO sampling pattern is computed by subsampling atmospheric forecasts to the times and locations of RO soundings. This technique was originally intended to address the random noise associated with synoptic variability and so make time series of temperature appear smoother and a better approximation of reality, but it can induce a bias if it is applied to RO data that systematically under-sample the atmosphere where the forecast model contains systematic biases. This study showed the bias that would result for a temperature climatology that would be assembled from Metop RO data when sampling-error-corrected using the forecasts of ERA-Interim. As above, ERA-Interim mis-models the diurnal cycle in the upper stratosphere, and the climatology that results would be biased negative in that region. The bias would have been worse if the Metop orbits did not coincide with nulls in the difference between the RO-inferred diurnal cycle and the model diurnal cycle.
It is crucial that RO obtain a sampling pattern that completely covers the diurnal cycle even if non-uniformly. Sampling error removal cannot completely eliminate the bias incurred by under-sampling of the diurnal cycle. As of the year 2021, the COSMIC-2 RO mission is obtaining complete coverage of the diurnal cycle in low latitudes. While the Metop RO instruments obtain global coverage, they do not obtain complete coverage of the diurnal cycle, leaving gores in the coverage of the diurnal cycle at high latitudes. It is a problem best fixed with a targeted deployment of a sun-synchronous RO satellite near a nodal crossing time of 3:30/15:30 solar time. Absent such an RO satellite, the best way to address the gap in coverage is careful use of a climatology of the diurnal cycle established using COSMIC RO data.

Appendix A: Uncertainty in Amplitude and Phase of a Tide
In Bayesian interpolation, uncertainty in the posterior coefficients of expansion is returned as a fully populated covariance matrix. Any particular tidal mode is characterized by just two terms in the expansion of the basis functions: a cosine term with amplitude C n and a sine term with amplitude S n . The covariance matrix corresponding to these terms is Recall that diurnal time ranges from 0 to 2π over the course of 24 hr. The diurnal cycle can be written also as with A n the amplitude of the mode and ϕ n the phase. These can be derived from C n and S n according to 2 = 2 + 2 (A4a) = arctan ( , ) ∕ with arctan(⋯ , ⋯ ) a four-quadrant arctangent defined such that tan ϕ n = S n /C n . Ordinary error propagation assuming Gaussian uncertainties is performed following the standard formulas: The partial derivatives are computed from Equations A4a and A4b and inserted into Equations A5a and A5b to obtain the following uncertainties in A n and ϕ n : 2 = ( 2 2 + 2 2 + 2 ) ∕ 2 (A6a) 2 = ( 2 2 + 2 2 − 2 ) ∕ ( 2 4 ) .
With determinations of , , and ρ, whether by Bayesian interpolation or by computing interannual variability, these equations can be applied to estimate the uncertainty in amplitude and phase of a tide.