New Methods for Linking Science Objectives to Remote Sensing Observations: A Concept Study Using Single‐ and Dual‐Pair Satellite Gravimetry Architectures

In this manuscript, we present a new analysis tool, called space‐time‐accuracy grid (STAG) analysis, to simultaneously assess the performance of an observing system architecture across space and time. Such an analysis tool is useful to directly link science objectives (typically expressed via a targeted spatial resolution, temporal resolution, and accuracy) to the expected performance of the observing system architecture. As a proof of concept, we apply STAG analysis to analyze three potential future observing systems for mass change in the Earth system: a single pair of polar orbiting satellites (heritage Gravity Recovery and Climate Experiment and Gravity Recovery and Climate Experiment Follow‐On), two polar pairs of satellites, and a polar pair of satellites coupled with an inclined (70°) pair of satellites. Here, we demonstrate the use of STAG analysis to quantify the relative performance of each architecture across space (200–1,800 km) and time (1–30 days), offering a significantly more comprehensive assessment of performance than previous studies. Results show that the polar pair coupled with the inclined pair reduces errors (after state‐of‐the‐art post‐processing for each architecture is accounted for) relative to the single pair of satellites by 40–60% in medium spatial scales (500–1,200 km), with the greatest benefit being for longer solution (monthly) timespans. Overall, the results from this case study highlight the importance of increasing the isotropy of the observable over simply increasing the sampling frequency. Some demonstrated benefits of STAG analysis include the ability to incorporate state‐of‐the‐art post‐processing methods into the analysis and also tailor the analysis to specific geographic regions to address targeted scientific objectives.


Introduction
The continuous monitoring of global geophysical processes from space is critical to acquire a comprehensive knowledge and understanding of the Earth system. Satellites play a key role in observing and investigating climate-induced geophysical phenomena such as sea level change, ice sheet and glacier ablation, depletion of aquifers, and hydrologic events such as floods and droughts. In this context, observations of mass flux within the Earth system, that is, mapping the evolution of the global water cycle, are crucial. Such observations are best made by measuring temporal variations in the Earth's gravitational field.
Past satellite gravity missions such as Challenging Minisatellite Payload (Reigber et al., 1999;Reigber et al., 2003) and the Gravity Recovery and Climate Experiment (GRACE; Tapley et al., 2004) revolutionized our understanding of global mass redistribution (Tapley et al., 2019). The GRACE mission was particularly foundational, establishing a near-continuous 15-year time series (2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015)(2016)(2017) of global mass transport. The success of the GRACE mission concept and the need for continuous monitoring of mass flux variations led to the launch of a successor mission, GRACE Follow-On (GRACE-FO; Flechtner et al., 2019), in May 2018. GRACE-FO, with a nominal mission lifetime of 5 years, has now successfully established continuity after GRACE, albeit with a 1-year data gap between the two missions. The importance of continuing the time series of global mass transport after the end of life of GRACE-FO was articulated in the 2017-2027 Decadal Survey for Earth Science and Applications from Space (National Academies, 2018), where mass change was listed as a Designated Observable. This designation has paved the way for the development of a future mission to continue the mass change data record as established by GRACE and GRACE-FO.
The importance of not only continuing but also improving upon measurements of mass change (improved spatial resolution, temporal resolution, and accuracy) was recently acknowledged in two reports by international panels of scientists representing the main fields of application, that is, continental hydrology, cryosphere, oceanography, solid Earth, and atmosphere (IGSWG, 2016;Pail et al., 2015). Due to the diversity of applications of mass flux information, user needs vary significantly. For instance, certain applications require high temporal sampling (drought monitoring and flood potential indicators), while others require lower temporal sampling but higher spatial resolution (monitoring changes in ocean transport). The desired spatial resolution, temporal resolution, and accuracy vary significantly depending upon the area of application (e.g., Table 1 from Pail et al., 2015;Appendix B of National Academies, 2018). Such a diverse set of science objectives demands improved analysis tools from which to rapidly assess the utility of potential future observing system architectures.
In past studies of future observing system architectures, traditional analysis has relied heavily on degree variance analysis, where expected errors are assessed in the spectral domain, per spherical harmonic (SH) degree of the gravity field, for a given temporal resolution (Elsaka et al., 2013;Wiese et al., 2009). Expanded studies have included error assessments using empirical orthogonal functions, as well as assessments of error in the spatial domain for dedicated regions of interest and in some cases have included the use of post-processing techniques Wiese et al., 2012). In short, there are a variety of effective tools that can be used in coordination to assess the performance of future observing system architectures in a fairly comprehensive manner. However, none of these tools singularly offer a straightforward way to directly, and simultaneously, assess the utility of an observing system architecture to satisfy a diverse set of science objectives (typically expressed as a set of targeted spatial resolutions, temporal resolutions, and accuracies). Further, no previous studies have assessed the performance of future observing system architectures simultaneously across the space-time continuum.
In this manuscript, we develop a new metric, which we call space-time-accuracy grid (STAG) analysis, with which we assess the performance capability of any observing system architecture. This analysis quantifies the expected error, or accuracy, across a continuum of spatial scales (200 to 1,800 km in our analysis) and temporal scales (daily to monthly in our analysis), and allows for the inclusion of state-of-the-art post-processing methods if desired. Additionally, the analysis can be adapted to analyze errors regionally or targeted toward specific applications. STAG analysis allows for a direct relation between science objectives and predicted performance of an observing system architecture.
Here, we demonstrate the first application of STAG analysis to future gravity mission architectures, by comparing the performance of three viable architectural options: a single pair of polar satellites, two polar satellite pairs, and a "Bender formation" (Bender et al., 2008) consisting of a polar pair of satellites coupled with a pair at 70°inclination. Each architecture has been studied extensively in the literature (Daras & Pail, 2017;Elsaka et al., 2013;Visser et al., 2010;Wiese, Nerem, & Han, 2011;Wiese, Visser, & Nerem, 2011;Wiese et al., 2012); our goal is to quantify the relative performance of each architecture more comprehensively than done in the studies listed above through the use of STAG analysis. This manuscript is organized as follows: section 2 describes in detail the three architectures studied, section 3 describes the methods, including a description of STAG analysis, and sections 4 and 5 are dedicated to results and conclusions, respectively.

Observing System Architectures
Three potential future observing system architectures that have been studied fairly extensively in the literature are further assessed in this manuscript in the context of STAG analysis. Each architecture mimics the basic measurement system of GRACE and GRACE-FO, consisting of a low-low satellite-to-satellite tracking concept, where a pair of satellites separated by some distance in the along-track direction, continuously tracks the inter-satellite separation distance as they orbit the Earth. In the GRACE and GRACE-FO architectures, the satellites are nominally separated by 220 km in a near-polar orbit at an altitude of~500 km, and the tracking is performed by a K-band microwave ranging system with a precision to the micrometer level. The GRACE-FO architecture additionally has a technology demonstration laser ranging interferometer that has demonstrated an increase in the precision of the inter-satellite link by two orders of magnitude relative to the K-band link (Abich et al., 2019). Furthermore, each satellite is equipped with a GPS receiver to determine inertial position, has star cameras for attitude knowledge, and contains a high-precision accelerometer located at the center of mass of the satellite to measure nongravitational accelerations.
Each architecture studied here represents an improvement relative to GRACE and GRACE-FO, in that we assume the satellites are equipped with a drag-compensation system that allows the satellites to fly at a lower altitude (~350 km) to increase sensitivity to short-wavelength variations in the gravity field. Further, all inter-satellite tracking is performed with a laser ranging interferometer, and the satellites are assumed to have a separation distance of 100 km. The specific performance of each instrument (laser interferometer, star cameras, accelerometer, and GPS receiver) is given in section 3.1.
It is well understood that the expected limiting source of error for the GRACE-FO mission architecture is not due to the measurement system; rather, it is due to undersampling of high-frequency geophysical signals, namely, ocean tides and nontidal atmosphere and ocean variability (Wiese, Nerem, & Han, 2011), which alias into the gravity solution. The most straightforward way to improve upon performance relative to GRACE-FO is to sample the gravity field more frequently by adding more pairs of satellites (Wiese, Visser, & Nerem, 2011). Such an increase in sampling is required to meet more ambitious science objectives put forth by the community (IGSWG, 2016;Pail et al., 2015). Furthermore, since GRACE and GRACE-FO sample the gravity field solely in the north-south direction, both architectures have poor isotropy, and the resulting gravity fields are contaminated with meridional stripes that obscure the geophysical signals. Several studies have pointed out that increasing the isotropy of the observable, that is, sampling in multiple directions, drastically reduces or eliminates the striping artifact, resulting in increased spatial resolution in the final gravity products, decreasing reliance on post-processing procedures (Elsaka et al., 2013;Wiese et al., 2009;Wiese et al., 2012). The three architectures studied here are chosen to isolate the impact of each architectural improvement.
The first architecture studied (Case A) can be considered as a baseline scenario consisting of a single pair of satellites in a near-polar orbit similar to the GRACE and GRACE-FO architectures at 342-km altitude. The second architecture (Case B) consists of the same pair of satellites in Case A coupled with a second pair of satellites in a near-polar orbit. This architecture essentially increases the temporal sampling by a factor of two but does not improve the observation isotropy. The third architecture (Case C) consists of Case A coupled with a second pair of satellites in an inclined orbit (70°), a so-called Bender formation. This architecture both increases the temporal sampling by a factor 2 (except for small polar cap areas) and increases the observation isotropy; because of these characteristics, it has been studied fairly extensively (Wiese, Visser, & Nerem, 2011;Wiese et al., 2012;Elsaka et al., 2013). It further has "self-de-aliasing" properties, allowing for co-estimation of low resolution gravity fields at high temporal frequencies to further improve the solution (Wiese, Nerem, & Han, 2011;Daras & Pail, 2017).
The orbit parameters for each case listed in Table 1 (each row is related to one pair of satellites) are derived from European Space Agency's (ESA) Additional Constellation & Scientific Analysis of the Next Generation Gravity Mission Concept ) study, where a multitude of dual-pair constellations was investigated in terms of performance. The orbits are assumed to be circular, and the dedicated parameters were chosen in such a way that the homogeneity of the ground track pattern of Case C provides the densest possible ground track coverage for a repeat period of 7 days. This relatively short repeat period is necessary to ensure optimal ground track coverage over shorter timescales; these particular orbit parameters additionally have fairly homogeneous ground track coverage even over 3 days. The orbits are additionally designed as near-repeat orbits with a long repeat of~45 days for both polar pairs and~42 days for the inclined pair. Ensuring homogeneity of ground track coverage over multiple timescales should allow for a diversity of science objectives to be met, to the best extent possible. For Case C, the altitude of the inclined orbit is chosen such that ground tracks of the polar and inclined satellite pairs drift at the same rate within the sub-cycle period (1.3°per 7 days), which aims to achieve consistency in data product quality over time. Further, these drifting orbits have the additional advantage that they produce a dense ground track distribution over longer periods of time, which might improve the spatial resolution for the gravity retrieval of long-term static gravity solutions. For Case B, the initial mean anomaly of the second polar pair was shifted by 180°in order to avoid a collision of the satellite vehicles at the poles. The initial right ascension of ascending node of the second pair is offset in such a way that the configuration ensures optimal ground track coverage for 3 days, providing near equidistant ground track crossings at the equator over this time frame. Such a design ensures the best quality gravity solutions over short time intervals. We explicitly note that since it is assumed that the satellites feature a drag-compensation system, this implies that the mean altitude is constant and the sub-cycles do not change over time.

Simulation Setup
The simulations were executed with a full numerical mission simulator (Daras, 2016;Daras et al., 2015), which has already been successfully applied to recover satellite-only gravitational field models from Gravity Field and Steady-State Ocean Circulation Explorer (Drinkwater et al., 2003) data (Yi, 2012). Compared to real data processing, the advantage of these closed-loop simulations, which are performed in a completely synthetic simulation world, is that the truth is perfectly known. Therefore, the effect of any error source can be quantified by the deviation of the final result from the "true world." The simulation environment is based on numerical orbit integration, following a multistep method for the numerical integration according to Shampine and Gordon (1975), which applies a modified divided difference form of the Adams predict-evaluate-correct-evaluate formulas and local extrapolation.
The geophysical signal models listed in Table 2 give an overview of the force models used in the processing for the "true" and the "reference" world. The static gravity field model is represented by the GOCO05s model (Mayer-Gürr, 2015), which combines data from the GRACE, Gravity Field and Steady-State Ocean Circulation Explorer, and Laser Geodynamics Satellite (Spencer, 1977) missions. The time variable gravity field is defined by a tidal and a nontidal gravity field component. For the latter, ESA's updated Earth System Model  has been used, which contains the five main geophysical signal components atmosphere (A), ocean (O), hydrology (H), ice (I), and solid Earth (S) with a time resolution of 6 hr, linearly interpolated to the 5-s observation epochs. The Earth system model covers the time period 1995-2006 and contains plausible variability and trends in both low-degree coefficients and the global mean eustatic sea level. It depicts reasonable mass variability all over the globe at a wide range of frequencies including multi-year trends, year-to-year variability, and seasonal variability, even at very fine spatial scales (up to 110 km), which is important for a realistic representation of spatial aliasing and leakage. Current state-of-the-art gravity field retrieval methods applied to GRACE and GRACE Follow-On sensor data require a priori background information on high-frequency mass variability in atmosphere and ocean, which is typically taken from the Atmosphere and Ocean Level-1B De-aliasing Product (Dobslaw et al., 2017). The updated ESA Earth System Model provides a set of AOD error products used in the simulation environment as well, consisting of a "DEAL" component, which differs from "AO" of the source model only by means of the physical processes currently omitted in Atmosphere and Ocean Level-1B De-aliasing Product, and an "AOerr" component, representing a series of errors as the sum of both large-scale and small-scale errors with zero mean and stationary variance at a variety of frequencies. The impact of ocean tide model errors is assessed by taking the difference of two tide models, EOT11a (Savcenko & Bosch, 2012) and GOT4.7 (Ray, 1999).
The adopted gravity field solution approach is based on a modification of the integral equation approach from Schneider (1969) where the orbit is divided into continuous short arcs of 6-hr length (where the neighboring arcs share the same node point making the orbit continuous), and the position vectors at the arc node points are set up as unknown parameters, which are estimated together with the gravity field coefficients.
The functional model follows the typical formulation used for the GRACE and GRACE-FO missions, which comprises a high-low SST and a low-low SST component. Absolute position information of a satellite is used for the computation of the reference values for the high-low SST part of the observation system, whereas the reference values for the low-low SST part are derived by projecting position and velocity differences between two satellites onto the line of sight, leading to the computation of inter-satellite range rates.
Realistic correlated noise models for each instrument (laser interferometer, GNSS receiver, accelerometer, and star cameras) are introduced. The error assumptions used for the laser ranging instrument are derived from the ESA-Assessment of Satellite Constellations for Monitoring the Variations in Earth's Gravity Field (SC4MGV; Iran Pour et al., 2015) project, provided from the consultancy support of Thales Alenia Space Italia, Turin. The root-mean-square (RMS) of the noise on the laser is approximately 7.36 nm/s and is expressed by the amplitude spectral density: The nongravitational forces are sensed by the on-board accelerometers located in the center of mass of the satellite. For all satellite vehicles, a GRACE-FO-like electrostatic accelerometer is assumed with two highly sensitive axes oriented in the flight direction and in the radial direction and one low-sensitive axis in the cross-track direction. The accuracy level in terms of accelerations is expressed by: (3) with x denoting along-track, y cross-track, and z (close to) the radial direction.
Geo-location of satellite observations, as well as gravity retrieval, require highly accurate continuous orbit determination, making GNSS space receivers on all satellites obligatory. In our simulations, we assume an absolute kinematic positioning on a centimeter level, as 1-cm white noise is added to these observations.
Using a laser ranging instrument as the main measurement system requires exact pointing of the tracking antenna on the order of 10 μrad or less, and therefore the implementation of systems for attitude determination and control. We assume star camera sensor errors for all satellites represented as rotation angles around the along-track (roll), cross-track (pitch), and radial (yaw) axes, expressed by the amplitude spectral density of the following analytical noise models (Iran Pour, 2015): The total stochastic model for the satellite observations is approximated by means of a cascade of digital Butterworth autoregressive moving average filters (Pail et al., 2011;Siemes, 2008). Filter coefficients are chosen in such a way that the cascade's frequency response optimally matches the inverse of the amplitude spectrum of the previously generated pre-fit residuals. They are estimated as a result of the computation of the linearized normal equations, which include differences between the "true" (only the static GOCO05s gravity field model and sensor noise according to equations (1)-(5) is included) and reference observations (only the static GOCO05s gravity field model is included), such that the error sources from the sensors are considered exclusively. Assuming uncorrelated GNSS and laser interferometer observations, weighting matrices are set up for all observation components separately.
The software environment uses nonlinear observation equations to retrieve the SH coefficients, and the "reference" observations have to be reduced from the "true" observations as a result of the linearization process. The gravity field parameters are estimated by solving full normal equations of a least squares system based on a standard Gauss-Markov model using weighted least squares with stochastic models in accordance with the simulated instrument noise levels. Wiese, Visser, and Nerem (2011) proposed an extended processing method for double-pair architectures primarily, which enables the mitigation of temporal aliasing effects due to nontidal time-varying signals with periods shorter than the temporal resolution of the satellite mission, as they alias into the long-term (e.g., 30 days) solutions. This technique demonstrated the benefit of the co-parameterization of additional short-term (e.g., daily) low degree and order gravity field coefficients and is applied during the processing for Cases B and C in the simulations. The maximum degree of the co-estimated daily gravity fields was chosen independently for Cases B and C, dependent on the influence on the long-term solution, which is highly correlated with the short-term solutions.
The goal of this study is to demonstrate the ability of STAG analysis to describe the performance of singleand double-pair satellite gravity mission architectures in a more comprehensive manner than previous studies. This requires a multitude of gravity field solutions of different timescales and sufficient spatial scales. For this purpose, gravity field solutions in terms of SH coefficients are retrieved for time and spatial scales according to Table 3 consecutively, from observations sampled every 5 s for the year 2002. The decision of the maximum SH degree and order of gravity field retrieval, especially for Case A, is primarily driven by stability of the normal equation system, which is directly linked to the observation geometry, and also to achieve consistency between cases and alleviate computational efforts. Thus, the used force models (Table 2) are truncated at a maximum SH degree and order that is consistent with the maximum SH degree and order of the gravity field retrieval.

STAG Analysis
The primary motivation for the development of the STAG analysis is to map error across a continuum of space-time scales. It is desired to map both "raw" retrieval error without the application of de-striping, filtering, and scaling methods, which can be described as post-processing in general, as well as to map retrieval error after post-processing techniques are applied. First, the retrieved gravity field solutions described in section 3.1 are analyzed in the frequency domain (Figure 1), expressed in terms of the degree RMS error of equivalent water height (EWH), which is computed from fully normalized coefficients of a SH series expansion C nm ; S nm È É of degree n and order m, by where ρ w and ρ e represent the average densities of water and Earth, a the semi-major axis of the Earth, and k n the load Love number of degree n. C nm ; S nm È É can represent either the full signal or the differences between estimated and true coefficients. This has been done in many studies before, representing the signal/error information per degree. The retrieval error is derived by removing the average of the true mass transport model (HIS signal) from the recovered signal for a dedicated retrieval period. Figure 1 shows the expected error behavior where Case C benefits due to its observation geometry allowing the combination of two anisotropic measurements taken in different directions, which increases the isotropy of the combined system. Case A offers the worst performance, suffering from strong aliasing effects coupled with poor

Earth and Space Science
HAUK AND WIESE observation isotropy. Case B doubles the number of observations compared to Case A, but the consistent polar observation geometry reduces only errors in the longer wavelengths (lower SH degrees) of the time-varying gravity field signal and does not lead to significant enhancement of spatial resolution.
The effect of the different error spectra of the three satellite concepts is even more visible when computing spatial covariance functions shown in Figures 2a-2c. They describe the correlation of the computation point with its neighborhood in the normal equation system due to the stochastic instrument noise model and the observation geometry. The spatial characteristics and the pattern of the covariances provide information about the spatial behavior of the retrieved signals. The figures show the typical stripes for Case A and Case B caused by the north-south observation direction that are known from the GRACE temporal gravity models. Due to the additional polar pair, Case B offers a slightly lower error level compared to Case A. The highest isotropy and the lowest error level are represented by Case C providing an improved observation geometry due to the cross-track component of the inclined pair. Figures 2d-2f reflect the spatial error behavior by showing the logarithm of the error in each SH coefficient for the three cases in the frequency domain. Case A in particular provides less well-determined near-sectorial coefficients (SH degree equal to SH order) than near-zonal coefficients (SH order equal to zero), which is slightly diminished in Case B. However, only Case C offers a strongly improved estimation performance of the sectorial coefficients leading to an enhanced spatial resolution in the derived gravity fields.
In order to compare "raw" retrieval errors in a fair way across mission architectures, the retrieved gravity field coefficients are truncated at SH degree and order where the signal-to-noise ratio (SNR) in Figure 1 is equal to one, represented by the crossing between the signal (black curve) and the error (colored curves). The remaining coefficients (SNR > 1) are synthesized on a global 0.2°× 0.2°grid of longitude and latitude in units of cm EWH, together with the full true mass transport model (up to SH degree and order 180), which is averaged up to a dedicated retrieval period, via SH synthesis. The resulting grids are subtracted from each other and form a global representation of gravity field retrieval errors in the spatial domain. In the next step, a conversion of the grids from the spatial domain into the equal-area mascon domain (Watkins et al., 2015) is performed. The mascon approach is currently used as a state-of-the-art data processing approach for GRACE and GRACE-FO to extract more information at smaller spatial scales but serves as a transformation into

Earth and Space Science
HAUK AND WIESE exact spatial scales (e.g., 2°is equal to 222 km) in this study. Finally, global error RMS values are computed from the mascons averaged over 1 year of synthetic simulations for spatial scales of 222, 444, 888, and 1,776 km and temporal resolutions of 1, 3, 7, 14, and 30 days. For space-time scales in between these computation points, error is assessed via linear interpolation, which we claim is valid given the linear nature of the results.
To demonstrate mission architecture performance for post-processed gravity field data, for Cases A and B, we applied de-striping and filtering methods, which are typically used in real data post-processing from GRACE and GRACE-FO. Here, the de-striping algorithm of Swenson and Wahr (2006) is used, de-striping gravity field coefficients in the frequency domain from SH degree and order 14, and Gaussian smoothing with a radius of 300 km starting from degree 2 was applied in a second iteration. Due to the inclined satellite pair, Case C offers a different error isotropy compared to Cases A and B (Figure 2). Tests have shown that the treatment of the gravity field coefficients retrieved by Case C with the method described above (de-striping and smoothing parameters adapted to Case C were used) reduces the aliasing errors up to a certain extent but does not present the optimal solution due to the fact that this method is designed for treating gravity field data retrieved by a single polar pair. Therefore, an alternative post-processing method was applied for Case C, represented by the time variable decorrelation filters (Horvath et al., 2018) derived from covariance information on normal equations and signal variance information from the mass transport model, showing a slightly better performance (~5% overall improvement; not shown) in terms of signal recoverability compared to a tailored de-striping method by Swenson and Wahr (2006) and tailored Gaussian smoothing. In contrast to the "raw" errors, for all mission architectures, the estimated gravity field coefficients to be post-processed were not truncated and were considered up to the dedicated maximum SH expansion listed in Table 3.
The application of the described de-striping and filtering methods leads to an undesirable and inevitable side effect of also filtering the truth HIS signal included in the gravity field retrieval. In order to account for this effect, scale factor maps can be generated by applying the same filtering procedures to the truth HIS signal, as described in Landerer and Swenson (2012). The scale factor map is convolved with the recovered,

Earth and Space Science
HAUK AND WIESE post-processed gravity fields to then restore damped signal due to the implementation of the post-processing. This type of treatment of the gravity field signals represents the best possible method to recover gravity fields and reduce errors and is accommodated in the STAG analysis if desired.
The resulting grids from STAG analysis depict a global representation of errors derived from satellite gravity observations for a large bandwidth of spatial and temporal scales and can be linked directly to science objectives. The general method can be tailored toward specific questions of interest and has the ability to accommodate any desired post-processing procedures by the user, as discussed above. In section 4, we additionally demonstrate the ability of STAG analysis to be adopted for specific scientific objectives and applications and analyze errors only over the land area as an example.

Results
The figures presented in this section describe and compare the performance of dedicated satellite gravity mission architectures from different points of view, but the structure of the STAG analysis stays the same. It is arranged by spatial scale on the horizontal axis, which includes scales between 200 and 1,776 km, by the temporal resolution (1 to 30 days) on the vertical axis, and by the error RMS of the retrieved gravity field signal as the third dimension represented by different colors and contour lines. Additionally, we show proportionalities by computing the relative improvement of one dedicated mission architecture versus another in terms of percentages, based on error RMS values of two dedicated STAG plots. We restricted the lower end of the spatial scale correspondent to the maximum SH degree and order of retrieved gravity field signals (SH degree and order 100 is equal to 200 km). In general, values for smaller spatial scales can be calculated, but there is no information from our simulations except errors of omission. STAG analysis is based on computation points corresponding to the dedicated time periods of gravity field retrieval and dedicated spatial scales as described in section 3.2 and are presented as black dots in Figures 3a-3c for purposes of demonstration. All plots include a white box representing areas where independent gravity field retrieval is not feasible according to the modified Colombo-Nyquist rule, which describes the maximum resolvable geopotential SH degree is equal to the number of orbital revolutions the satellite completes in one repeat period (Visser et al., 2012). In Case A, this box encompasses the area between 3 and 1 days due to the fact that a daily co-parameterization is not feasible (see section 3.1), and no daily gravity field solutions are available. The gray boxes included in the proportionality plots represent an area with additional new information from the second polar or inclined satellite pair. They are defined by the modified Colombo-Nyquist rule as well and by the additional daily gravity field information due to the daily co-parameterization in Cases B and C. In some instances, our STAG representations include error information where no information from our simulations results exists, for example, 7 days at 200 km, but the 7-day gravity field solutions were solved up to maximum SH degree and order 80, which corresponds to a maximum spatial scale of 250 km. Here, we define these regions based on the modified Colombo-Nyquist rule and show that in those areas, a higher maximum resolvable SH degree and order is theoretically feasible but was not performed due to reasons stated in section 3.1. Figure 3 depicts the "raw" gravity field retrieval errors for (a) Case A, (b) Case B, and (c) Case C, together with the resulting relative improvement of (d) Case B versus Case A, (e) Case C versus Case A, and (f) Case C versus Case B. All plots contain a white hashed sector that represents the area where the SNR is smaller than 1 according to Figure 1. Cases A and B have the same SNR values (Figures 3a and 3b), while Case C enables a movement of the SNR line to higher SH degrees ( Figure 1) and smaller spatial scales (Figure 3c), which is further expressed by red hashes in the proportionality plots of Figures 3e and 3f. Results demonstrate that the improvement of Case B over Case A is primarily seen in spatial scales greater than 800 km (long wavelengths) and temporal scales greater than 1 week, while there is only a small reduction of errors at small spatial scales due to the presence of stripes, which were not treated at this stage. Case C offers the largest improvement at medium spatial scales (500-1,200 km) due to the fact that the retrieved signals are clean (almost free of errors, Figure 2f) and do not necessitate any post-processing up to SH degree and order 40 for monthly solutions, for instance. In general, the benefit of Case C is greatest when the signal is observed directly and post-processing is not needed. Furthermore, there is still significant benefit at smaller spatial scales for Case C compared to Case A (about the square root of the number of observations), as well as compared to Case B, even in the presence of systematic errors. Comparing Figures 3d-3f indicates that Case C

10.1029/2019EA000922
Earth and Space Science benefits primarily due to its observation geometry and shows that tackling error isotropy is first-order importance.
The STAG analysis presented in Figure 3 is based on retrieved gravity field signal truncated at points where SNR is equal to one according to Figure 1 in order to ensure a fair comparison among the mission architectures. The computed error RMS values therefore appear relatively small. Another comparison method would be to truncate the solutions of all three architectures at SH degree and order 40 (the lowest common denominator because of the 3-day retrieval of Case A-see Table 3). In this case (not shown here), the error in Figure 3a would increase substantially. For instance, at 500-km spatial scales and 3-day temporal scales, the error in Case A would increase from~3 to~6.5 cm EWH, while the error in Case C would stay the same as what is seen in Figure 3c. As a consequence, the relative improvement ( Figure 3e) would change from 35% to~70% at this node point, and the interpretation would be even more relative improvement from Case A to Case C at smaller spatial scales in the raw error case. Hence, it is important to realize that the exact error RMS values and interpretation of the STAG analysis are subject to user decisions on how it is desired to accumulate errors for each architecture. Figure 4 represents the mission architecture performance after the application of post-processing techniques as described in section 3.2. All architectures show reduced error levels relative to Figure 3, which implies a realistic separation of some components of noise and signal in the domain where SNR < 1. The comparison of Figure 4a with Figure 3a shows only a small decrease in the level of error of Case A (e.g.,~2.2 to~2.0 cm EWH for 444 km at 30 days) and also for Case C (e.g.,~1 to~0.8 cm EWH for 444 km at 30 days), which is due to the choice of truncation in Figure 3; the improvement would be bigger if different choices were made regarding the degree of truncation as discussed above. This improvement seen here indicates that post-processing is effective and enables a reduction of errors where SNR < 1. Compared to Figure 3e, the relative improvement between Case C and Case A (Figure 4e) slightly decreases, primarily in medium spatial scales (500-1,200 km) because post-processing on polar architectures is somewhat effective in this domain, but there is still a large improvement in Case C relative to Cases A and B (~45-60% in medium spatial scales) overall. The relative improvement between Case B and Case A (Figure 4d) goes down compared to the values in the non-post-processed case due to the post-processing being equally effective on each architecture since the correlated error structure is similar. This again points to the importance of observation isotropy first and foremost.
The STAG analysis presented in Figures 3 and 4 depict a global view on the performance of a satellite mission architecture. Such analysis can also be performed for certain applications, such as the analysis of continental hydrological mass flux changes. As a demonstration of a regional assessment, Figure 5 refers exclusively to post-processed errors over land, which were achieved by the application of an ocean-land mask. For all cases, the error increases because the signal is higher and can lead to larger errors in these regions (stripes smear out signals), and errors of omission are larger due to increased signal strength over land. The relative improvement of Case C over other cases increases slightly (Figures 5e  and 5f), which means that Case C offers more improvement for land hydrology applications, for instance, than if the assessment is done on a global scale. Over land, the difference in performance between 7-and 30-day solutions is small for Cases A and B due to similarities in the error structure and the overall uniformity in the effectiveness of post-processing techniques for polar orbiting pairs of satellites. Figure 6 shows STAG analysis with scale factors applied to the results in Figure 5, as discussed in section 3.2. The application of scale factors is performed in an effort to restore signals of interest that were removed or contaminated during the post-processing. Figure 6 shows reduced error RMS values for all three mission architectures compared to the non-scaled case in Figure 5. This version of error RMS represents the "best case" since the scaling is derived from the truth signal. Scaling is extremely effective for Cases A and B, as can be seen in the reduction of errors from Figures 5a and 5b to Figures 6a and 6b. Scaling is less important for Case C because this architecture does not rely on the post-processing techniques as much as Cases A and B, since the signals are observed with more accuracy. Due to the effectiveness of scaling on Cases A and B, the relative improvement for Case C decreases (e.g., about 10% for 200 km at 30 days) but is still in the range of 40-55% for medium spatial scales. The discrepancy between solutions over different timespans in Cases A and B is larger after scaling compared to the non-scaled case and shows a recovered signal that is more realistic.

10.1029/2019EA000922
Earth and Space Science

Conclusions
In this paper, we present a new method of demonstrating the performance capability of remote sensing observing system architectures, with a focus on satellite gravimetry architectures. We term this analysis method STAG and find it useful to establish a direct link between architecture performance and science objectives, which are typically expressed in targeted spatial resolution, temporal resolution, and accuracy. As a demonstration of the STAG analysis, three possible satellite gravimetry mission architectures are examined in terms of error RMS and relative improvement respectively: one pair of polar orbiting satellites, two pairs of polar orbiting satellites, and a pair of polar orbiting satellites coupled with an inclined pair of satellites. STAG analysis is performed for different representations of error (non-post-processed, post-processed, and regional), where all results show strongly reduced errors across space and time for the polar pair coupled with the inclined pair (about 30-35% for spatial scales 200-400 km and up to 50-55% for spatial scales 800-1,600 km) due to a more isotropic observation geometry. In general, the benefit of this architecture is largest at medium spatial scales (500-1,200 km) where the retrieved signal is almost free of errors and does not necessitate post-processing. STAG analysis reveals that the double polar pair architecture modestly improves the long wavelengths signals (spatial scales greater than 800 km) compared to the single-pair constellation (about 15%) due to the fact that the gravity field signal is observed predominantly in along-track direction in both architectures, and the only benefit is from doubling the number of observations. Furthermore, STAG analysis shows a slight decrease of the relative improvement in case of the polar pair coupled with the inclined pair once state-of-the-art post-processing methods are taken into account. This demonstrates the effectiveness of the de-striping and smoothing techniques for both mission architectures including only polar satellite pairs. In a gross sense, these results agree with those from previous studies (Wiese, Nerem, & Han, 2011) but offer significantly more insight into the relative merits of each architecture across space and time than has previously been published.
STAG analysis represents a tool to simultaneously map architecture performance across space and time, facilitating a direct relation of performance to science objectives. The method can be applied to any observing system architecture of interest, be represented across different areas of geophysical application and tailored to specific geographical regions, can include state-of-the-art post-processing methods, and thus can provide a fundamental overview regarding capability and performance of remote sensing observations.