One‐Dimensional Variational Ionospheric Retrieval Using Radio Occultation Bending Angles: 2. Validation

Culverwell et al. (2023, https://doi.org/10.1029/2023SW003572) described a new one‐dimensional variational (1D‐Var) retrieval approach for ionospheric GNSS radio occultation (GNSS‐RO) measurements. The approach maps a one‐dimensional ionospheric electron density profile, modeled with multiple “Vary‐Chap” layers, to bending angle space. This paper improves the computational performance of the 1D‐Var retrieval using an improved background model and validates the approach by comparing with the COSMIC‐2 profile retrievals, based on an Abel Transform inversion, and co‐located (within 200 km) ionosonde observations using all suitable data from 2020. A three or four layer Vary‐Chap in the 1D‐Var retrieval shows improved performance compared to COSMIC‐2 retrievals in terms of percentage error for the F2 peak parameters (NmF2 and hmF2). Furthermore, skill in retrieval (compared to COSMIC‐2 profiles) throughout the bottomside (∼90–300 km) has been demonstrated. With a single Vary‐Chap layer the performance is similar, but this improves by approximately 40% when using four‐layers.


COSMIC-2
The FORMOSAT-7/COSMIC-2 constellation is a constellation of six identical satellites in a low inclination orbit currently at a nominal altitude of 520-550 km with an inclination of 24° (Anthes & Schreiner, 2019).However, during 2020, the time period of data used in this paper, all the satellites were not yet in their final orbit with some still being maneuvered from their launch altitude of 720 km.The final satellite (FS706) was lowered to its nominal altitude in February 2021 (Weiss et al., 2022).COSMIC-2 follows the successful COSMIC-1 program, which launched in April 2006 (Ho et al., 2020).The COSMIC-2 constellation's primary aim is to observe the atmosphere from a low latitude orbit using RO, supporting operational global weather prediction, tropical weather and climate research, space weather forecasting, and ionospheric research.
Whilst COSMIC-2 do not provide L1 and L2 bending angles, here we use the derivative of slantwise TEC S, with respect to the impact parameter a, ∂S/∂a.Culverwell et al. (2023) showed that, to a good approximation, this is proportional to the difference between the L2 and L1 bending angles, plus an electron density term at the LEO satellite.Here we fit to bending angles with impact heights (impact parameter − radius of curvature) between 100 and 500 km.In contrast COSMIC-2 retrievals use data from 0 up to a height of ∼700 km at a vertical resolution of 1 Hz (≈2-3 km).

Background Model
In this implementation of a 1D-Var system the a priori (background) values are effectively used as a first guess to start the minimization process rather than a strong constraint on the final 1D-Var solution.For example, using 143 COSMIC-1 measurements provided by the Institut D'Estudis Espacials de Catalunya (IEEC) in Spain (the same data set as used by Culverwell et al. (2023)) using a one-layer Vary-Chap and a variety of background parameters gives the analysis results and the number of iterations until convergence shown in Table 1.
What is clear from Table 1 is that varying the background model parameters has almost no impact on the final analysis (the only difference at all is in the third block with an H 0,F2 value of 49.9 where in all other instances it is 50.1).The main difference between varying the parameters is the impact on the number of iterations required to arrive at the analysis results.This varies from 14 iterations to 6.This suggests that an improved background guess may reduce the number of iterations required for the model to converge.
Noting this increased performance when using a more accurate background guess, rather than a fixed set of values, a simple model for these parameters is defined below.Whilst a more complicated background model could be used, this increases the computational cost and run times.Since the a priori values are not a strong constraint in this system, these cost increases provide no advantages.This peak parameter (NmF2, NmF1, NmE, hmF2, hmF1, and hmE) model is based on Nava et al. (2008) and Angling et al. (2018).

NmE
The peak density of the ionospheric E-region, NmE, is well modeled by a simple function based on the a seasonal relationship with F10.7 and solar zenith angle χ (Leitinger & Kirchengast, 1997;Nava et al., 2008): where a e is a seasonal term given by  (2) where ϕ is latitude in radians, s = −1 if the month is January, February, November or December, s = 0 if March, April, September or October and s = 1 if May, June, July or August.Finally, χ eff is the effective solar zenith angle given by where χ 0 is the zenith angle at night-day transition, 86.23° as given in ITU-R P.2297-1 (2019).

hmE
Across a range of models including NeQuick and the IRI hmE, the height of the E-region peak density, is usually set to a fixed height.In this model as per the updated NeQuick model from Angling et al. (2018).

NmF2
NmF2, the peak density of the F2-layer, is dependent on a number of factors including geographical longitude, latitude, time, season/day of year and solar activity.A number of approaches can be used to described solar activity, for this simple model the sunspot number is used.The NeQuick (Nava et al., 2008) and the Ionospheric Reference Model (IRI; e.g., Bilitza & Reinisch, 2008) use the Committee Consultative for Ionospheric Radiowave propagation (CCIR) files to compute NmF2 and M(3000)F2 (the ratio of the maximum useable frequency at a distance of 3,000 km to the F2 layer critical frequency, foF2).The CCIR maps for the F2 parameter consist of 988 coefficients for each month.CCIR provides two sets of coefficients, one for low sunspot numbers and one for high (Haralambous et al., 2021).The coefficients for intermediate levels of solar activity are determined by linear interpolation (European Commission, 2016).
As per (Nava et al. (2008); European Commission (2016) Equations 61-68 and 77) the CCIR coefficients are interpolated for the current date and solar activity levels and then vectors of coefficients for Legendre polynomials are calculated.The NmF2 (electrons/m 3 ) is specified in terms of foF2 (Hz): where foF2 itself is defined as the sum of nine intermediate terms: where and for n = 2, …, 9 where

hmF2
The calculation of the height of the maximum F2 density, hmF2, is based on the NmF2, NmE and the M(3000) F2 (European Commission, 2016, Equations 69-76).The M(3000)F2 is calculated similarly to NmF2 from Section 3.3 as the sum of seven intermediate terms: where and where for n = 2, …, 7. u j are the interpolated CCIR coefficients (European Commission, 2016, Equations 40-51; note that u is written as cm3 in those equations).

NmF1
The maximum density of the F1-layer, NmF1, is defined in this model as in NeQuick (Nava et al., 2008), in terms of the critical density of the E-Region.Specifically ELVIDGE ET AL. 10.1029/2023SW003571 5 of 11

hmF1
The height of the peak density of the F1-layer, hmF1, is defined here simply as the average of the height of peak density of the E and F2 layer:

Using the New Background Model
Table 1 demonstrated the impact of the background model values on the number of iterations needed to converge to a solution with the 1D-Var.Rather than using a fixed set of values for the initial guesses for the Vary-Chap layers but instead using values from the model described in the previous sections has a major impact on the iteration rate and the total number of successful convergences (a successful convergence is defined as a convergence which takes less than 50 iterations), Table 2.
The four-layer Vary-Chap model in Table 2 is an E-, F1-, F2-and topside-layer model.The D-region is excluded as it has very little impact on the overall results (see Culverwell et al. (2023)).It can be seen from Table 2 that using the new background model not only reduces the average number of iterations required for convergence in each case but also improves the overall percentage of observations which do converge.The average reduction in the number of observations is 5 iterations and the increased percentage of observation convergence means that all test versions of the model have at least a 73.4% convergence rate (improved from 58.7%).

Statistical Performance of the 1D-Var Retrieval
To undertake a rigorous statistical analysis of the 1D-Var retrieval technique two things are needed: 1. a comparison to independent observations, 2. sufficient retrievals to reduce uncertainty in the analysis.

Radio Occultation Observations
The current most abundant, and freely available, source of recent RO observations is from the COSMIC-2 satellites (see Section 2).To perform a rigorous statistical analysis of the 1D-Var retrieval technique all available Level 1B (TEC observations) and Level 2 (ionospheric profiles) products from 2020 have been used (UCAR, 2019).COSMIC-2 ionospheric retrievals use the calibrated TEC data derived from the L1 and L2 phase differences, which is then calibrated using the method described in CDAAC (Strauss et al., 2020).
In total there are 1,838,920 Level 1B occultations (each containing several hundred TEC observations) and an associated 966,358 Level 2 derived ionospheric profiles available to download in 2020.

Independent Observations
By stepping through a range of HF frequencies transmitted vertically upwards and measuring the return echoes ionosondes can image the vertical profile of the ionosphere up to the peak density.These observations are both widely assimilated by ionospheric models and also commonly used as reference observations (often incorrectly called "truth") for comparative studies for example, (Elvidge et al., 2014;Feltens et al., 2011;Scherliess et al., 2011).In this work every ionosonde observation (profiles) from within 200 km of the tangent point of an occultation in 2020 has been used to validate both the COSMIC-2 profile reconstruction as well as the 1D-Var retrieval.Usually ionosonde profiles are "autoscaled" to get the true height information from the observations, but this can, and does, give rise to "autoscaling errors."Ionogram autoscaling is the process of automatically detecting the traces on a graph of time-of-flight against transmitted frequency which can then be used to infer the electron densities.
The most commonly used autoscaling software is the Automatic Real-Time Ionogram Scaler with True height (ARTIST) (Galkin et al., 2008) which uses a hyperbolic trace fitting method.The best way to overcome these autoscaling errors is to manually scale the ionograms (as was done in Lin et al. ( 2020)), however that is a time consuming process.Another way to address the problem is by looking at the confidence scores that autoscaling software, such as ARITST, uses to assess the success of the inversion.Only profiles with maximum confidence (100) have been used.That results in 10,935 profiles used in this study.
However, even profiles with the maximum confidence score can still contain errors (Themens et al., 2022).To address this issue, each ionosonde profile has been examined and removed from the analysis if they contain obvious errors.Overall this results in 10,612 profiles which are used in the analysis, far more than used in previous studies.The number of observations are summarized in Table 3.

Analysis Results
Initial analysis of the 1D-Var retrieval, in a similar approach to Lin et al. ( 2020), is to look at the performance of the F2 peak in terms of density and altitude compared to the ionosondes.These results can then be compared to the COSMIC-2 electron density profiles.However, before looking at the statistical performance of the two approaches, some quality control of the peak parameters is required.
An easy first approximation to find the F2 peak parameters is to take the maximum density of the electron density profile and associated altitude.Figures 1 and 2 show the resulting probability density plots for the maximum density and the associated altitudes respectively.By comparing the COSMIC-2 height of maximums to the probability distribution function of hmF2's using data from the Chilton, UK ionosonde between 2000 and 2019 (Figure 2) it is clear that the main bulk of the distributions agree closely.However the second and third peaks centered at 100 km and just greater than 0 in the COSMIC-2 data are likely not actually hmF2 values.These likely included both observations of sporadic-E and other errors in the COSMIC-2 profiles (e.g., see examples in Figure 3).In this analysis hmF2 values are defined to be between 200 and 500 km; if a value is outside of this range both the COSMIC-2 and associated 1D-Var retrieval is excluded from the analysis.
As described in Culverwell et al. (2023) and in Section 3.7 the 1D-Var does not result in a 100% convergence rate.
To make the comparison fair any 1D-Var retrieval which does not converge is also removed from the COSMIC-2 data set before any statistics are calculated.

Statistical Performance
To provide a statistical comparison between the F2 peak parameters the relative percentage error (ΔX(%) = 100(X − X ionosonde )/X ionosonde ) of the specification of both NmF2 and hmF2 between these ionosonde observations with both the 1D-Var retrieval with up to four-layers and COSMIC-2 retrieval is shown in Table 4.
From these results it is clear that the 1D-Var retrieval outperforms COSMIC-2 when using three or four layers.
There is a 7% relative error in the COSMIC-2 retrieval of NmF2 (which is consistent with the results of Cherniak et al. ( 2021)) compared with −5.3% and −4.2% when using 1D-Var with three and four layers respectively.However, when only using one-layer the 1D-Var has a −14.8% error in the retrieval of the parameter.For hmF2 the COSMIC-2 error is 2.6% and all versions of the 1D-Var retrieval result in a smaller error decreasing from 1.3% to just −0.1%.However it should be noted that errors in hmF2 specification from autoscaled ionosonde observations can be large (∼±10%) and this should be kept in mind when comparing the parameters.
A more detailed view of how the number of layers improves the F2 specification can be seen by looking at probability density functions of the error in NmF2 specification. Figure 4a shows the probability density function for COSMIC-2 and a single F2 layer, overall the distributions are similar, albeit with a slight negative bias in the 1D-Var retrieval.Moving around in Figure 4 the 1D-Var retrieval has more layers added to the solution, from one to four.With increasing layers the bias and standard deviation is reduced, with overall the four layers in Figure 4d performing the best, with a significantly reduced standard deviation compared to the COSMIC-2 retrieval.

Bottomside Profile
Whilst the F2-peak is an important parameter and a good indicator of how well the 1D-Var retrieval is working, one reason for using the Vary-Chap layers is to reconstruct the full ionospheric profile.To assess the bottomside (the altitudes region below that of the peak ionospheric density) performance all of the ionospheric profiles (from the ionosonde, COSMIC-2 and the 1D-Var) are interpolated onto the same altitude grid, at 1 km resolution, and the root mean square error (RMSE) with respect to the ionosonde at each altitude is then calculated.The resulting RMSE altitude profile is shown in Figure 5.
This shows that the 1D-Var retrieval with one-layer (F2) performs very similarly to the COSMIC-2 profiles, and above 225 km there is no statistically significant differences in the results, however below 200 km the COSMIC-2 retrieval is clearly superior.A two-layer model (F2 + F1) shows an improvement over COSMIC-2 above 200 km, which is consistent with the fact of adding a lower-altitude layer.The four-layer (F2 + F1 + E + Topside) 1D-Var retrieval shows an excellent performance throughout the altitude range, with   show comparable/improved performance relative to the four-layer model, although the number of data points at these altitudes is relatively small and the results are not statistically distinguishable.

Discussion and Conclusions
The statistical performance of the 1D-Var retrieval technique has been thoroughly evaluated in the current work, comparing it to independent observations and leveraging a substantial set of retrievals to reduce uncertainty in the analysis.The method was applied to the Level 1B and Level 2 data obtained from COSMIC-2 satellites and validated using ionosonde observations, constituting a rigorous and expansive analytical process.
From the analysis, it was clear that the 1D-Var retrieval technique demonstrates a robust performance, outperforming COSMIC-2 retrievals, particularly when employing three or four layers.The relative percentage errors in the specification of both NmF2 and hmF2 were found to be lower for the 1D-Var retrieval.This demonstrates the significant potential of the 1D-Var technique in the field of ionospheric profile reconstruction.However, it is worth noting that the 1D-Var retrieval's performance was significantly lower when only one layer was used, indicating the importance of multi-layer modeling for improved accuracy.This is an important finding and suggests that future work should focus on refining and utilizing multi-layer models to improve the retrieval of ionospheric parameters.Furthermore, the comprehensive analysis of the full ionospheric profiles showed that the four-layer 1D-Var retrieval exhibited excellent performance throughout the altitude range.Notably, there was an average improvement over COSMIC-2 by approximately 40%, showcasing the promising potential of this technique.Although the COSMIC-2 results showed comparable performance to the four-layer model above 280 km, the number of data points at these altitudes was relatively small, and the results were not statistically distinguishable.This suggests that further research should be dedicated to improving the modeling and retrieval at these higher altitudes.
In conclusion, the 1D-Var retrieval technique, particularly when using three or four layers, offers a significant advancement in ionospheric profile reconstruction.While there are still areas for improvement, particularly in the retrieval at higher altitudes, the findings of this work provide a strong foundation for further research in this field.

Data Availability Statement
COSMIC-1 data are freely available from UCAR (2014).The processed files from AVHIRO were provided by Dr Hernández-Pajares, Universitat Politécnica de Catalunya (UPC), and are available from IEEC (2023).COSMIC-2 data and ionosonde observations are available from UCAR (2023) and National Centers for Environmental Information (2023), respectively.
The MODIP and CCIR files required to reconstruct the a priori model described in this work are available from the International Telecommunications Union (https://www.itu.int/md/R07-WP3L-C-0094/en).
The background model described in this paper is coded in Python 3.x, as is the 1DVar retrieval code that uses it.An officially supported Fortran95 implementation of the latter has been part of the Radio Occultation Processing Package (ROPP) since version 11.0 (released January 2022).ROPP is maintained, developed, and supported by EUMETSAT, through the Radio Occultation Meteorology Satellite Applications Facility (ROM SAF), and freely available to download from its website (ROM SAF, 2023).
Lin et al. (2020) andCherniak et al. (2021) have previously validated the COSMIC-2 profiles, in terms of peak density/height, by comparison to ionosonde profiles at eight locations across one month in late 2019 and two months in early 2020 respectively.TheLin et al. (2020) study resulted in the comparison of 135 RO profiles and the Cherniak et al. (2021) study used ∼2,200 profiles.

Figure 2 .
Figure 2. Probability distribution function of COSMIC-2 maximum densities overlaid with hmF2 data from Chilton, UK ionosonde using data from 2000 to 2019.

Figure 3 .
Figure 3. Sample of COSMIC-2 electron density profiles, showing occasional sporadic E-layers and spurious results, COSMIC-2 filenames are above each subplot.

Figure 4 .
Figure 4. Panels show comparisons of probability density functions of the error in NmF2 from COSMIC-2 compared to the 1D-Var solution with (a) an F2 Layer, (b) F2 + F1 layers, (c) F2 + F1 + E layers, and (d) F2 + F1 + E + Topside layers.Vertical dashed line is marked at 0, and the mean (μ) and standard deviation (σ) for each distribution shown in the corresponding color along side each plot.

Table 1 Background
Nava et al. (2008)and Analysis Parameters for a One-Layer Vary-Chap With Different Background Conditions ELVIDGE ET AL.10.1029/2023SW003571 3 of 11

Table 2
Culverwell et al. (2023)ns Which Converge Within 50 Iterations and Statistics of the Rate of Convergence for up to Four Vary-Chap Layers Using the First-Guess Background Conditions Described in Table1ofCulverwell et al. (2023)and the Background Model Described Here