An Interpolation Method to Reduce the Computational Time in the Stochastic Lagrangian Particle Dispersion Modeling of Spatially Dense XCO2 Retrievals

Abstract A growing constellation of satellites is providing near‐global coverage of column‐averaged CO2 observations. Launched in 2019, NASA’s OCO‐3 instrument is set to provide XCO2 observations at a high spatial and temporal resolution for regional domains (100 × 100 km). The atmospheric column version of the Stochastic Time‐Inverted Lagrangian Transport (X‐STILT) model is an established method of determining the influence of upwind sources on column measurements of the atmosphere, providing a means of analysis for current OCO‐3 observations and future space‐based column‐observing missions. However, OCO‐3 is expected to provide hundreds of soundings per targeted observation, straining this already computationally intensive technique. This work proposes a novel scheme to be used with the X‐STILT model to generate upwind influence footprints with less computational expense. The method uses X‐STILT generated influence footprints from a key subset of OCO‐3 soundings. A nonlinear weighted averaging is applied to these footprints to construct additional footprints for the remaining soundings. The effects of subset selection, meteorological data, and topography are investigated for two test sites: Los Angeles, California, and Salt Lake City, Utah. The computational time required to model the source sensitivities for OCO‐3 interpretation was reduced by 62% and 78% with errors smaller than other previously acknowledged uncertainties in the modeling system (OCO‐3 retrieval error, atmospheric transport error, prior emissions error, etc.). Limitations and future applications for future CO2 missions are also discussed.

• Determining sources of spatially dense XCO 2 observations with LPDM techniques can become time intensive and strain computational resources • Presented in this work is an interpolation scheme that eases the computational burden of spatially dense XCO 2 source determination studies • Evaluating the efficiency of this interpolation scheme revealed reductions of >50% in computational time at two testing locations Understanding the relationship between CO 2 fluxes and atmospheric warming is a crucial step toward mitigating global climate change (Le Quéré et al., 2009;Matthews et al., 2014). Efforts to bolster CO 2 monitoring have led to innovative methods for measuring atmospheric mixing ratios. Ground-based stationary networks are becoming more dense and mobile measurement platforms are increasingly utilized to provide high-resolution spatiotemporal CO 2 measurements (Apte et al., 2017;Bush et al., 2015;Mitchell, Crosman, et al., 2018;Shusterman et al., 2016). Likewise, the number of space-based instruments is set to increase in the near future. NASA's Orbiting Carbon Observatory mission has recently added an additional instrument to its arsenal of space-based observation platforms. Originally built as a spare to the OCO-2 instrument, OCO-3 was launched in May of 2019 and was successfully installed on the International Space Station (ISS). With its unique orbital path and "pointing mirror assembly" (PMA), this instrument is capable of performing dense scans of column-averaged CO 2 (XCO 2 ) over specific urban areas (so called "snapshot area mapping," or SAM). These SAMs consist of discretized atmospheric soundings in a rectangular region (∼100 km × ∼100 km) over key areas (Eldering et al., 2019). Compared to the 16-days revisit time of OCO-2, the revisit time of OCO-3 is erratic. Its unique flight path and PMA provide revisits ranging from a few per year to multiple per day.
As with ground-based measurements, space-based observations may be coupled with gridded emission inventories to provide a means of validation through model and observation agreement (Hedelius et al., 2018;Janardanan et al., 2016;Labzovskii et al., 2019;Wu et al., 2018;Ye et al., 2020). This coupling often leverages Eulerian and Lagrangian atmospheric modeling techniques, which link emissions' sources with observations of the atmosphere. In previous analyses, Wu et al. (2018) (referred to hereafter as W18) built upon the Stochastic Time-Inverted Lagrangian Transport (STILT) model by extending the traditional surface-based receptor scheme to a column-based receptor scheme Lin et al., 2003;Wu et al., 2018). This new configuration was applied to aggregated XCO 2 values provided by OCO-2 observations, producing a sensitivity matrix defining the upstream area contributing to mole fractions observed at the receptor location. This matrix, known as a "footprint," was generated for multiple sounding locations and the contribution of nearby urban areas to detected XCO 2 enhancements was quantified.
The W18 method couples satellite observations and emission inventories using modeling processes. The increase in observation density and more frequent revisit times provided by OCO-3 is expected to make this method of analysis inefficient, as the amount of computational time will be greater than in previous work.
To accommodate the increase in simulations required to interpret OCO-3 measurements, this work presents an interpolation method capable of reducing resource requirements. This new technique is designed to work in tandem with the X-STILT model or other inverse modeling framework where influence footprints are generated. In situations where a large number of spatially distributed, column-based receptors are to be processed, this method systematically subsets the receptor field, generates a fraction of the influence footprints, then interpolates the remaining influence footprints. The interpolation method provides a faster means to footprint generation relative to the traditional X-STILT method.
Presented here is a description of the interpolation process and an evaluation of its effectiveness. At the time of writing, preliminary OCO-3 data were not fully quality assured; therefore, the presented methodology used a series of simulated SAMs (s-SAMs) over two cities in the United States: Los Angeles, CA and Salt Lake City, UT. At both locations, each s-SAM was compared to a set of interpolated counterparts in which the dependency on the interpolation scheme was systematically increased. Each iteration was compared to the original s-SAM by quantifying the spatial distribution and point-wise agreements between individual s-SAM values and their respective footprints. Additional consideration was given to the influence of large CO 2 sources (power generating and manufacturing plants, etc.) on this method's ability to accurately produce XCO 2 values using limited information. For improvements, a large point source detection algorithm was applied to s-SAMs to reduce cases of large error. Since the X-STILT model can be driven by a variety of meteorological data, the effects of this data's resolution on the interpolation were also addressed. The overall utility of this method is then discussed in the context of monitoring urban CO 2 emissions, the OCO-3 instrument, and future space-based missions.

Coupling X-STILT and ODIAC
The X-STILT model developed by W18 requires several user-supplied parameters, three of which include: (1) the geolocation of each column-based receptor, (2) meteorological fields to drive the model, and (3) the amount of time to propagate a backwards trajectory. SAMs provided by OCO-3 are a set of near-uniformly distributed soundings, each with a unique longitude and latitude (λ, ϕ). When applying the X-STILT model, a column-averaged influence footprint is generated for each sounding location. Each footprint, denoted as F(λ, ϕ, t 0 ), is constructed by distributing a series of individual X-STILT receptors at interval altitudes above the location (λ, ϕ) as prescribed by W18. Collectively, these vertical distributions constitute a column receptor. Incorporating turbulence, ensembles of particles released at each receptor location act as air parcels and follow the supplied meteorological fields backwards in time (beginning at time t 0 ) across a grid (x, y) independent of the s-SAM. W18's full calculation required for the influence footprint is summarized in Equation 1: (1) The contribution of each particle/parcel released from the column-based receptor is summed. This contribution is defined as the amount of time a particle's (p) trajectory propagates below a mixing layer (z ≤ h) for each backward time step, ξ, over each grid space (x i , y j ). This calculation is performed for all particles, N.
Other parameters are the dry-air mass, m air , and the mean atmospheric density (for z ≤ h), , at grid cell (x i , y j ).
Equation 1 represents a column-based footprint and therefore requires a vertical weighting throughout the atmospheric column. A weighting scheme is given to the X-STILT model by providing the averaging kernel and pressure weighting function A n and P n from OCO-2 sounding data. Vertical release locations within the column receptor are indicated by n. After temporal integration and vertical averaging, each footprint is represented by a matrix of values reflecting the amount of influence (in ppm) per surface flux (μmol/m 2 /s) a particular location and start time has on the column-averaged value. For simplicity of notation in the discussions that follow, F(λ, ϕ, t 0 ) will represent the footprint associated with a particular column receptor at (λ, ϕ) and f (x i , y j , t 0 ) will represent its distributed elements of influence values.
As an example, Figure 1 presents a column-based footprint generated by the X-STILT model within the Salt Lake Valley. Backwards-in-time trajectories were calculated for 12 hours beginning at 18:00:00 UTC on January 12, 2017. The column receptor from which the particle trajectories originate is located at the black point (left panel) and is identified as F(λ, ϕ, t 0 ). The cells making up the footprint lie on the (x, y) grid and represent each location's influence (per surface flux) on the "observation" made at the receptor. These values are denoted as f (x i , y j , t 0 ). As released particles propagate backwards in time, the sensitivity to surface fluxes typically decays as the distance from the column receptor is increased. The footprint's depiction on a regional scale (right panel) demonstrates the northerly winds experienced by the region during this time. The s-SAMs generated in this work consist of many column receptors equally spaced on a grid to emulate OCO-3's measurement capabilities. Column receptors and their footprints are independently generated and do not influence one another.
A priori column-based receptor values can be calculated by convolving footprints with surface emission inventories. This work used the 2019 release of the Open-source Data Inventory for Anthropogenic CO 2 (ODIAC), a spatially explicit 1 × 1 km inventory that uses the locations of power plants and nighttime light density to approximate anthropogenic CO 2 fluxes Oda & Maksyutov, 2011). This inventory is resolved at the month level and therefore is represented as a temporally static field, Φ(x i , y j ), as the length of typical backwards trajectories span from a few hours to a few days. Calculating an XCO 2 enhancement due to this anthropogenic release can be performed by an element-wise multiplication (Hadamard product) of these two fields followed by an element-wise summation (Equation 2): ROTEN ET AL.
10.1029/2020EA001343 4 of 28 Figure 1. The X-STILT model was used to produce a footprint for a column-based receptor placed within the Salt Lake Valley (black point; left panel) and was initialized for time 18:00:00 UTC on January 12, 2017. Using 3 km HRRR data, particles were driven backwards in time for 12 hours and temporally integrated. Examining the footprint on a regional scale (right panel) reveals strong northerly winds. It should be noted that only footprint values ≥ 10 −5 ppm/(μmol/m 2 /s) were included. Nonfiltered footprints can be more expansive than what is presented here. (X-STILT, Stochastic Time-Inverted Lagrangian Transport).
The calculated XCO 2 enhancement, ΔXCO 2 , represents the amount added to the regional background XCO 2 value and typically falls between 0ppm and 10ppm. A simplified application of X-STILT is presented in Figure 2 in which a single column-averaged sounding is analyzed. Also depicted is a representation of how ODIAC fluxes reflect urban density and human-related emissions. For further details on the evaluation and functionality of X-STILT, the reader is referred to the following works: Lin (2003), Fasoli et al. (2018), and Wu et al. (2018).
For this work, X-STILT parameter values were selected from W18 such that each column-based receptor initially consisted of individual receptors distributed vertically upward and spaced 100 m apart. Receptors with this spacing characteristic covered 3 km above ground level (AGL). Beyond 3 km, receptors maintained a 500 m spacing. The maximum height for all column-based receptors was 6 km AGL. Each receptor in the column released 100 particles per 2 min time step, driven by the 3 km High-Resolution Rapid Refresh (HRRR) model (Benjamin et al., 2016;Rolph et al., 2017) for a total of 12 hours backwards in time. The result was integrated in time and the footprints from each vertically placed receptor were spatially averaged to construct the column-based footprint. W18 investigated the output's sensitivity to the selection of these parameters and reported ∼4% error when compared to observations. Values for A n and P n were gathered from the closest available OCO-2 sounding in both space and time (OCO-2 data can be downloaded at https://disc.gsfc.nasa.gov/).

Snapshot Area Mapping and Target Locations
Two locations within the western United States were selected as testing sites for the evaluation of the inter- to the northwest is the Great Salt Lake. U.S. Census estimates of the greater metropolitan population of Salt Lake City is 1.2 million (2017). Satellite images of both target locations are included in Figure 3 (Kahle & Wickham, 2013).
To reduce the computation required for the evaluation process and focus the analysis on urban CO 2 emissions, this work uses a domain smaller than a typical SAM. Figure 3 (top) depicts a full SAM domain (∼100 km × ∼80 km) with the smaller inner domains representing the areas used for this study. Dimensions are 48 × 41 km and 52 × 41 km for the inner domains of LA and SLC, respectively. To simplify the distribution, analyses, and discussions of soundings within each s-SAM, their spacing was constrained to 0.019 ° × 0.019 ° (roughly 2.1 km), keeping the length scales in the longitudinal and latitudinal directions consistent. It was assumed that the swaths making up each SAM were adjacent. Included in Figure 4 is a depiction of the s-SAM sounding distribution over SLC. Accompanying this distribution is an OCO-2 overpass that coincides with the target location. Assuming that OCO-3 soundings will have similar spacing to that of OCO-2, it is clear that the intended s-SAM distribution is reflective of the proper latitudinal spacing yet over estimates longitudinal spacing. Given the dimensions of inner domains and intersounding spacing, the domain over SLC accommodates 500 soundings. The domain over LA accommodates 460. Although it is expected that the SAMs provided by the OCO-3 instrument may vary in orientation and/or coverage of target locations, it is assumed that they will largely cover the same area as the inner domains presented here.
As described by Eldering et al. (2019), OCO-3's SAM mode provides scanned geographic regions on the order of 100 × 100 km with a sounding spacing of roughly 1.6 × 2.2 km; furthermore, the orbit of the ISS allows SAMs to be collected at various times of day. To accommodate Eldering et al.'s specifications in the interpolation method's evaluation, the s-SAMs generated for this work are sampled from a 6 month period (January to June of 2017). Roughly 3 days were selected from each month. On each selected day, three s-SAMs were generated, reflecting morning, noon, and afternoon times of day. Specific dates and times are included in Table 1. This sampling encapsulates the transition from winter months to summer months while the generation of three s-SAMs on each day will reflect unique diurnal events that may be found at each location.
In this work, only the effects of anthropogenic CO 2 emissions are considered; therefore, ODIAC was used to represent the urban areas and their contribution to the ΔXCO 2 found in each s-SAM. ODIAC is reflective of both the difference in population between the two cities and their number of large industrial CO 2 sources. In Figure 3 (bottom), ODIAC fluxes have been averaged from January to June of 2017 and are displayed in the context of each test location's inner s-SAM domain. Large CO 2 fluxes (>50 μmol/m 2 /s) are indicated by red points. Although the spatial extent of the s-SAMs is contained within the inner domains, their footprints were generated on the larger (x, y) grid (20° × 20° ). This includes any influences from the regions outside of the urban areas. The spatial extent of ODIAC used for ΔXCO 2 calculations was identical to the (x, y) grid's extent. A total of 90 s-SAMs were generated by passing the sets of column receptors and appropriate times to the X-STILT model. A footprint was generated for each receptor and then convolved with ODIAC (Equation 2). These ΔXCO 2 values were treated as simulated OCO-3 enhancement values.

Selecting Subsets
Using the PMA, OCO-3 creates SAMs by conducting multiple adjacent transects over targets within a ∼2 min interval, providing 100s of spatially dense soundings. The mechanics and geometry of data collection may cause the inter-sounding spacing to vary across these transects. Thus, soundings may not fall on a uniform grid. Furthermore, the spatial orientation of OCO-3 provided SAMs may vary. These subtle variations between soundings do not permit grid indexing based strictly on latitude and longitude coordinates. Since the interpolation method requires spatial consistency across SAMs, soundings are indexed on a regular m × n grid, S. Here, m and n represent the number of rows and columns of soundings in a SAM. This two-dimensional scheme allows each sounding (or missing sounding) to be referred to using indices i and j rather than latitude and longitude. This work does not attempt to anticipate the possible orientations of OCO-3 generated SAMs nor the potential variations in intersounding distances; therefore, as noted in Section 2.2, all s-SAMs are identically oriented. These consistencies simplify the mapping to S, analyses, and discussions of results.
ROTEN ET AL.
10.1029/2020EA001343 7 of 28 The interpolation method is applied to subsets of S. Subsets are defined by constraining the indices of soundings such that subset S IJ is given by: Here, s ij represents individual soundings. Sounding indices, i and j, are constrained by the indices of the subset of interest: I and J. The scaling parameter α reflects the extent of each subset such that α x and α y indicate the columns and rows of soundings in each S IJ . Figure 5 demonstrates the selection of subset S 1,1 from a SAM analog. This subset (black rectangle) has four rows and four columns of soundings, thus α x = α y = 4. Soundings that are a part of this subset are thus defined as: The interpolation method presented here requires preexisting X-STILT footprints for soundings located at the corners of each subset. These soundings are indicated as black points in Figure 5.  . The distribution of soundings (0.019° × 0.019° ) that make up the s-SAMs over Salt Lake City (black points) are accompanied with an example OCO-2 overpass that coincides with the target location (green points). Assuming that OCO-3 soundings will have a distribution similar to OCO-2 soundings, it is evident that the prescribed s-SAM distribution reflects the latitudinal spacing but overestimates the longitudinal spacing of simulated soundings. (SAM, snapshot area mapping; OCO, Orbiting Carbon Observatory). This subset of days is reflective of the winter-to-spring transition experienced at both evaluation locations. Further stratification is introduced by generating three s-SAMs for each day. These s-SAMs are generated at morning, noon, and evening local times.

Month
The remainder of soundings within the interior of the subset S IJ are withheld, not being passed to the X-STILT model. Equation 5 below prescribes the construction of this repressed set.
In the construction of each S IJ , C IJ , and W IJ , it should be noted that only subsets where |C IJ | = 4 are considered. For the purposes of this work, requiring each C IJ to contain four soundings ensures that only interpolated data is included and extrapolated data is excluded. The selection of the subset size in tandem with the dimensions of S may result in soundings near the periphery of the SAM/s-SAM to not be assigned to a subset. These elements are also passed to the X-STILT model and footprints are generated using the traditional method. An example of soundings that are not assigned to a subset is demonstrated by the uppermost row and rightmost column of the SAM analog presented in Figure 5.

Applying the Interpolation Method to Subsets
The geolocations associated with the soundings in each S IJ become column receptors for X-STILT and the interpolation method. Using X-STILT footprints associated with each control receptor location (Equation 4), a synthetic footprint is generated for each withheld receptor in W IJ . A spatial shift and inverse square weighting are applied to the X-STILT footprints. This approach applies the highest weighting value to nearby footprints while reducing the influences of more distant footprints. Calculating the elements of an interpolated footprint, 0 )( , , i j f x y t , is achieved by the following: ROTEN ET AL. 10.1029/2020EA001343 9 of 28 Figure 5. Depicted here is an analog to a full SAM (64 soundings). A subset is selected such that α x = α y = 4. In this scenario, a footprint for withheld receptor w 5 is to be interpolated (red point). This requires each footprint generated for the c k s (black points) to be translated by  k d and averaged using an inverse weighting scheme. For any withheld receptors that fall directly between two control receptors (dashed green box) only the two closest control receptors are used in the interpolation method. All other interior receptors (dashed blue box) are constructed using all available control receptors. (SAM, snapshot area mapping).
For a particular withheld receptor, w ∈ W IJ , its synthetic footprint is determined by taking the individual values making up control footprints from C IJ receptors and averaging them according to Equation 6. The haversine distance between w and each c k ∈ C IJ is used as the weighting: The collection of synthetic footprint elements, 0 )( , , i j f x y t , constitutes a complete synthetic footprint for a repressed receptor such that Qualitatively, this method translates control footprints from C IJ locations to the location of each withheld receptor (W IJ ), performing a point-wise averaging to generate synthetic footprints. Figure 5 depicts the construction of a subset using a SAM analog consisting of 64 soundings. Inside the subset identified by the black box, a synthetic footprint is generated for a withheld receptor (red). The distances,  k d , are used as weights in Equation 6 and determine where control footprints are to be translated. A subtlety is introduced when Equation 6 is applied. If the withheld receptor of interest shares a longitude or latitude value with two of the control receptors, then only the two closest control footprints will be used in the interpolation. This is indicated in Figure 5 by the dashed green box. Here, only control footprints from c 3 and c 4 are used. For interior withheld receptors (dashed blue box), all control footprints are used.

Metrics for the Comparison of Footprints
Serving as controls for each comparison in this study, the series of s-SAMs described in Section 2.2 was generated using X-STILT. Each control s-SAM had an accompanying series of s-SAMs constructed by using the interpolation scheme, increasing the subset extent to α x = α y = 3, 4, 5, 6, and 7. These extent parameters corresponded to length scales of 4 , 6, 8, 10, and 12 km, respectively. To evaluate the interpolation scheme's effectiveness, each interpolated s-SAM was compared to its corresponding control. Comparisons were drawn between (1) each column-based receptor value with and without convolution with ODIAC (without convolution, F tot = ∑ ij f (x i , y j , t 0 ); with convolution, see Equation 2), and (2) each spatial distribution of control and interpolated footprint elements. In the first case, changes in the Pearson correlation coefficient (r), root mean squared error (RMSE), and mean bias error (MBE) as a function of subset extent were investigated. For footprint-to-footprint comparisons, two aspects were considered. A threat score (TS), commonly used in forecast verification, was computed to quantify the spatial agreement between an interpolated footprint and its associated control footprint (Jolliffe & Stephenson, 2003). Conversely, a spatially weighted mean absolute error (WMAE) calculation was used to quantify the agreement between corresponding footprint element values. Specific methodology regarding the TS and WMAE metrics is discussed in Section A1.

Comparison of XCO 2 Errors
Generally, errors in ΔXCO 2 are minimized when differences in interpolated footprints are reduced yet scenarios exist where subtle disagreements within an interpolated footprint can be magnified by the distribution of fluxes in Φ(x i , y j ). Errors associated with interpolated ΔXCO 2 values can be addressed in the context of OCO-2,3 error; however, OCO-3 SAMs were not fully released at the time of writing. Since these two observational platforms were constructed with identical instrumentation, it was assumed that OCO-3's error characteristics will be similar to those of OCO-2. The notable difference between these two instruments is in the spatial coverage that OCO-2 provides. In standard operating mode, OCO-2 observes a narrow transect of the atmosphere (Figure 4). For this work, OCO-2 transects were selected such that their swath width was within the s-SAM domain at each testing site. Errors reported with quality assured soundings were used to characterize the distribution of observational errors.
Every XCO 2 sounding (in ppm) in an OCO-2 transect is reported with a error value such that XCO 2,τ ± δX-CO 2,τ , were τ indicates a particular sounding in the transect. The individual error values, ɛ, within these distributions were defined as: XCO is randomly selected from a uniform distribution, U(−δXCO 2,τ , δXCO 2,τ ). Five thousand values of ɛ were generated for each site. Constructing errors with this method provided customized distributions at both testing locations whose characteristics were constrained by realistic observations. The transects used are specified in Table 2.

Large Point Source Detection Algorithm
When applying this interpolation scheme, the magnitude and number of induced errors can be driven by local meteorology, topography, and large point sources. Significant topography can reduce atmospheric transport and allow for the buildup of XCO 2 without a large surface-based CO 2 source acting as a driver. These features are coupled with meteorological characteristics and are variable in both space and time; Conversely, large point sources of CO 2 are better anticipated. Although the magnitude and dispersion of XCO 2 enhancements may vary due to meteorology, variations in point source plumes are constrained by their locations and consistent output. Large point sources (LPSs) contribute a significant fraction of national CO 2 emission totals and form localized XCO 2 enhancements (Nassar et al., 2017;Singer et al., 2014).
A key feature of this interpolation method that is affected by large point sources lies in the hypernear-field (HNF) of synthetic footprints. This area typically covers length scales of 1-10 km and timescales of 0.1-1 hours. In this area, surface fluxes are weakly diluted and thus more strongly influence the receptor . This causes a dense area of particles near the receptor that tapers off over time and space. As nearby footprints are selected for the interpolation process, the lack of mixing in their concentrated HNFs causes any interfootprint variations to be averaged into a single, smoothed footprint with a larger spatial HNF distribution. Thus, a smoothed synthetic footprint may interact with a nearby point source that would have been missed by an X-STILT generated footprint at the same location.
In this work, the potential errors associated with large point sources are addressed by implementing a large point source detection algorithm (LPS-DA). This algorithm is tuned to detect XCO 2 enhancement characteristics within an s-SAM that are associated with large point sources. Any soundings identified by this algorithm were not interpolated and instead passed to the X-STILT model. This detection process constructed a Moore neighborhood around each sounding of each s-SAM, with the extent encapsulating immediately adjacent soundings. The XCO 2 value of each central sounding was compared to the average value of the adjacent soundings. If the difference was greater than 1ppm then no soundings in the Moore neighborhood were interpolated. An example of a Moore neighborhood can be found in Figure 6.

Investigating Effects of Meteorological Resolution
The final parameter investigated was the grid spacing of the meteorological fields. Data from the Weather Research and Forecasting (WRF) model (Nehrkorn et al., 2010) were available for September and October of 2015 for SLC at three different spatial resolutions: 1.33, 4, and 12 km (Kunik et al., 2019;Lin et al., 2017;Mallia et al., 2015). Applying the same methodology as described in Section 2.2, 5 days were selected and s-SAMs were constructed at three different times per day. The specific days and times are included in Table 3. Using WRF, it is possible to ensure that the only variable being changed is the resolution, keeping all other model physics the same. Errors were calculated for all interpolated XCO 2 enhancements (ΔXCO 2,interpolated − ΔXCO 2,control ). For each subset length scale, differences in the groups of errors generated by the 1.33, 4, and 12 km WRF data were investigated using a Kruskal- The number of soundings in each transect is given in the fourth column after quality filtering has been applied. Soundings provided by the second transect associated with the Los Angeles test site, indicated by (*), were taken in OCO-2's "target" mode. means of errors associated with the 1.33, 4, and 12 km WRF resolutions. A discussion on the Kruskal-Wallis H test is provided in Section A2. Since WRF is used solely as a means to modify meteorological input resolution, comparisons between WRF and HRRR are beyond the scope of this paper.

Comparison of Summed Footprint Values
Typically, X-STILT footprints are used in tandem with an emission inventory to determine a priori column-averaged CO 2 enhancements (Equation 2); however, the evaluation of this interpolation method first considered only the summed values of footprint elements to remove any potential influences from ODIAC. In Figure 7, summed footprint values (F tot = ∑ ij f (x i , y j , t 0 )) are compared to their interpolated counterparts, stratified only by subset length scales. Comparisons of all values revealed strong correlations across length scales with r > 0.9 in all cases. The correlation coefficient was inversely proportional to the subset length scale which reflected the loss of accuracy when larger subsets were used for interpolation. This loss was also reflected in RMSE values. From a 4 km to a 12 km length scale, RMSE increased by 50% (0.2-0.3) and 100% (0.2-0.4) at LA and SLC, respectively.
Between both testing locations, 99.8% of all F tot, control values fell below 0.7 ppm/μmol/m 2 /s. The remaining 0.2%, indicated in Figure 7, were all attributed to a single day (January 12, 2017) at the Salt Lake City location. Spatially, these particular receptor locations were constrained to the area over the Oquirrh Mountains to the west of the city on a day where higher elevations experienced little atmospheric transport. As the subset length scale was increased these interpolated soundings began to underestimate larger modeled values. This demonstrates that uncertainties in the interpolated results are sensitive to large F tot values and complex terrain. These uncertainties decreased when smaller subset length scales were used.
An ordinary least squares regression was performed on the comparisons in Figure 7 (top) and the coefficients were included in each panel. Although the linear model assembled here is not used, the calculated values of slope and y-intercept offer insights into the interpolation method's ability to reproduce F tot values. At both locations, slope values decreased as the subset length scale increased; conversely, y-intercept values increased as the length scale increased. Slopes lying below the 1:1 line reflected a tendency to underestimate F tot values due to averaging in the HNF. The trend of this parameter indicated that HNF smoothing effects increased as the subset length scale increased.
Although the systematic error suggested by each y-intercept was relatively large, the MBE was driven by the majority of F tot values lying between 0.0 and 0.3 ppm/μmol/m 2 /s. The density of values in this range is shown in Figure 7 by the high count of the weighted scatter plot (top). Additionally, the percentage of points within this range is provided above the corresponding box plots (bin [0.0, 0.3); bottom). The amount of values within this range was consistently ≥86% for both locations and all length scales. Coefficients from the linear regression and accompanying boxplots demonstrate that this interpolation scheme has a tendency to overestimate small F tot values and more significantly underestimate larger F tot values. For the 10 and 12 km subset length scales, MBE values at the SLC test location were significantly higher than MBE values for length scales at LA. This demonstrated the geographic sensitivity of the method; however, given that at least 86% of values lie below 0.3 ppm/μmol/m 2 /s for both locations, it is likely that any interpolated footprints in future work will have a bias on the order of 10 −4 ppm/μmol/m 2 /s.

Spatial Comparison of Summed Footprint Values
In Figure 7   These data were available from previous analyses performed for the year 2015 and therefore was only available for Salt Lake City. Each s-SAM was created using three different WRF resolutions. averaged. At small length scales, errors are smaller and no clusters exist. As the length scales increase, clusters of larger error appear.
Over Los Angeles, three clusters of large error exist: an area to the south that is on the land-sea border, an area to the northwest at the base of a small mountain range, and an area near a small mountain range to the northeast. It is assumed that column-based receptors generated over bodies of water are propagated with different model physics than those generated over land. These differences were exaggerated at larger length scales as more distant footprints were used for interpolation. In Figure 8 ( error indicates that the interpolated footprint was overestimating the correct summed value. In addition to proximity to bodies of water, close proximity to mountains can influence near-surface winds. As the subset length scale increased, the interpolation increasingly overestimated values around the base of the northwestern mountain range. Conversely, there was an area of underestimation that appeared around the base of the smaller northeastern mountain range. Of these three areas of higher/lower error, the receptors at the land-sea border were consistently the largest value and highest concentrated. At SLC, an increasing tendency of overestimation appeared to the southeast along the Traverse Mountains, stretching northward into the Wasatch Mountains. At the larger length scales, overestimates occurred along the left side of the s-SAM domain over the Oquirrh Mountains. Viewed through the element-wise summation of footprints, the method's ability to interpolate these values was broadly demonstrated. Comparisons showed that small F tot values were overestimated while larger, less-frequent values were underestimated by this method. The amount of under/over-estimation was dependent on the subset length scale used in the interpolation process. The method's sensitivity to significant geographic features was also evident as relative errors increased near land-sea borders and mountainous terrain. Although similar influences exist at both test locations, error values are not comparable in magnitude. This is due to the unique meteorological and topographical characteristics found at each location. LA experienced a maximum relative error of 18.2% along the land-sea border when the largest subset length scale (12 km) was used. Similarly, the testing domain across SLC contained a maximum relative error of ROTEN ET AL.

10.1029/2020EA001343
14 of 28 33.4% atop the Traverse Mountains while using the same length scale. Regardless of errors associated with geographical features, values are relatively small (<10%) when interpolating across smaller length scales.

Footprint-To-Footprint Comparison
Error clustering, found in Figure 8, was investigated with methods described in Section 2.4. First, the threat score (TS) from Equation A3 was applied to all interpolated footprints. Spatial characteristics of interpolated footprints were compared to the control footprint associated with the same receptor location. Results of this comparison were included in Figure 9. In addition to the comparison of spatial characteristics, the interpolation method's ability to reproduce values was also considered. The weighted mean absolute error (WMAE) calculation of Equation A4 was applied to characterize the difference between control footprint element values and their interpolated counterparts. The results of this comparison are presented in Figure 10.
Stratified only by the length scale used to generate them, TS values were assigned to their respective receptor locations and spatially averaged. The mean of these averaged threat scores (TS) was greater than 70% in all panels of Figure 9. Smaller scales (4 and 6 km) produced TS values that were uniform across the s-SAM domain. Among these smaller scales, the largest range in values (TS max − TS min ) was 10% and attributed to the 6 km length scale at the LA location. Beyond the 6 km subset length scale, systematic differences began ROTEN ET AL.  to appear. At these larger subsets, it was clear that interior TS values disagreed considerably with TS values along the periphery of subsets. As the subset length scale increased, so did the variation among control footprints used in the interpolation process. Using only two control footprints to synthesize a third footprint introduces less variation than using four; therefore, synthetic footprints along the periphery of each subset had better spatial agreement to control footprints than those requiring four X-STILT generated footprints.
Overall, the spatial accuracy of synthetic footprints generated at LA was more sensitive to the subset extent when compared to the SLC location. At the 12 km length scale, the minimum threat score value at the LA location (TS min = 68.19%) corresponded to an interior receptor whereas the maximum value (TS max = 83.43%) corresponded to a receptor on the periphery of a subset. While this location had a 15% difference, the 12 km length scale at SLC had a difference of 9%. Across the range of length scales, the values of TS at LA experienced roughly twice the decrease relative to SLC; Any geographic contributions to spatial mismatch were dominated by the variation in meteorology used to generate each footprint.
Spatial agreement between synthetic and control footprints was driven primarily by variations in the meteorology used in their construction; however, when comparing the values of footprint elements, influences of geographic features were evident. Applying Equation A4   this averaging are presented in Figure 10. In the southwest corner of the s-SAM domain over LA, a spatial cluster of higher WMAE values was present across all subset length scales. As the length scale increased, the errors in this region intensified and became more evident along the land-sea border to the south. This region of higher WMAE occurs in the same location as the area of higher F tot error in Figure 8 (top). The mountain ranges surrounding SLC demonstrate a similar effect on the WMAE of synthetic footprint elements. At the smallest possible implemented length scale (4 km), subtle clustering exists along the east and west edges of the s-SAM domain. These errors increased as the subset size increased.
Both Figures 9 and 10 indicate the locations of column-based receptors whose footprints were used in the generation of synthetic footprints. Since TS values are not applicable for these receptor locations, they are represented by black tiles in Figure 9. Likewise, these locations were represented in Figure 10 as locations with no error. At the largest length scale (12 km), two footprints from control receptors along the Los Angeles land-sea border were averaged with two footprints that were 12 km inland. This difference was likely driving the clustered WMAE values in this region. Over the mountain ranges in the SLC domain, the 12 km length scale required the footprints from mountaintop receptors to be averaged with footprints generated 12 km away in the Salt Lake Valley. The difference in elevation across this large subset was the likely driver of the errors found along the Oquirrh and Wasatch mountain ranges. Given the characteristics of the TS and WMAE values in Figures 9 and 10, it appears that meteorological influences specific to the domain location introduce a uniform error into synthetic footprints while geographic features introduced localized errors.
ROTEN ET AL.

ΔXCO 2 Comparison to OCO-2
Each footprint was convolved with the ODIAC emission inventory. Considering interpolation errors in "XCO 2 space" provided an intuitive context for interpreting results and a means of selecting an upper threshold for appropriate subset length scales. Errors are presented in Figure 11 with noticeably different ranges between testing locations. Control values associated with SLC were predominantly constrained between 0 and 5 ppm whereas many values associated with LA were above this range. The difference in these ranges was reflective of the larger emissions and dense collection of large point sources in the Los Angeles metropolitan region; furthermore, RMSE and MBE values at the Los Angeles location were larger across all length scales than values found in the Salt Lake City area. RMSE values in LA were 100-200% greater than the associated SLC values. MBE values differed by an order of magnitude. Across both locations, underestimation by the interpolation method was still present. Slope values reported from linear least squares regression decreased as subset length scales increased. This trend was likely propagated by the mechanisms responsible for the systematic underestimation of larger F tot, interpolated values (described in Section 3.1). Table 4. Although the spread of simulated OCO-2 errors was constrained between ±1 ppm at both locations, it is possible for these errors to be larger (Connor et al., 2016). Therefore, s-SAM errors are investigated using two different thresholds:

A collection of simulated OCO-2 errors is presented with s-SAM errors in
(1) the percentage of values lying outside the range of simulated OCO-2 error and (2) the percentage of values lying outside of the ±1 ppm range. Of the two testing locations, the distribution of s-SAM errors at LA had a larger spread than values at SLC. At the LA location, s-SAM errors associated with the smallest possible length scale (4 km) had a standard deviation of SD = 0.36ppm. This was larger than the SD of the OCO-2 error (SD = 0.28 ppm). Like LA, the SD of s-SAM errors for the SLC location increased as the length scale increased; however, the SD of s-SAM errors for SLC was less than the SD of associated OCO-2 error (<0.42 ppm) across all subset length scales. LA featured the largest min/max range across all length scales with the maximum range being 36.43 ppm (10 km length scale). Roughly 12% of the s-SAM errors within this range were outside of the OCO-2 error range with 3.5% lying outside of the ±1 ppm range. Unlike the LA location, the 10 km length scale at SLC had an associated s-SAM error range of 9.53 ppm and <1% of the values were outside of the ±1 ppm range.
At both locations, the SD increased with the subset length scale; conversely, the range of errors was not strongly correlated with the length scale but was driven predominantly by a few anomalous values created from large point sources and subset geometry. Across all length scales, SD values at the LA location were ROTEN ET AL.  at least a factor of two greater than their SLC counterparts. These results demonstrate the interpolation scheme's sensitivity to geographic location. The presence of both natural topography and anthropogenic CO 2 emissions drive interpolation error. An additional constraint that was considered was the accuracy of the instrument collecting atmospheric soundings. Coarse measurements with large error will allow for the selection of larger subset length scales when the interpolation is applied. Overall error will be dominated by the instrument; however, as resolution/accuracy of the instrument is increased, the interpolation error must be further constrained to be smaller or equal to the instrument error. In the two locations presented in this work, usable length scales at LA were more restricted by instrument error. The s-SAM errors at this location had a large spread while the available OCO-2 error was small. Usable length scales at the SLC location were less constrained as there was less spread in s-SAM errors yet larger spread in OCO-2 error.

Applying the Large Point Source Detection Algorithm
The large point source detection algorithm (LPS-DA) implemented in this work was selected to address large point sources (LPSs) and their influence on interpolated ΔXCO 2 values. LPSs are included in ODIAC but, as with any emission inventory, errors in their locations may exist (Hogue et al., 2016(Hogue et al., , 2017Hutchins et al., 2016;Oda et al., 2019); therefore, rather than constructing a detection method that relied on potentially incorrect location data from emission inventories, the Moore neighborhood approach (Section 2.6) was guided solely by the characteristics of s-SAM values, anticipating source-induced signals within them. When an assumed signal was detected, its distance from the nearest LPS was recorded (LPSs were defined as any ODIAC emission > 50 μmol/m 2 /s of CO 2 ). The magnitude and frequency of error values were then binned by their respective distances to LPS values in ODIAC. These relationships are presented in Figure 12. It is important to note that OCO-3 soundings were not used in this proximity-based evaluation.
Using ODIAC to generate s-SAMs allowed LPS locations within the inventory to be treated as "truth." Although LPS signals were "assumed" in each s-SAM, Figure 12 demonstrates a clear trend between large ΔXCO 2 discrepancies and their proximity to LPSs in ODIAC. The distributions in the top panel of this figure represent the frequency of large errors and their distance from the nearest LPS. Not only did anomalous errors outside the ±1 ppm range occur more frequently near LPS locations, the magnitude of the error also decayed as distance increased. For LA, all errors >5 ppm occurred within the HNF of a LPS. The number of anomalous errors at SLC was smaller due to the low number of LPSs identified within the Salt Lake Valley. Furthermore, as the subset length scale increased at both locations, the number of anomalous errors near LPSs also increased. Figure 12 also presents anomalous errors after the LPS-DA was applied (bottom panel). This reduced the magnitude and number of errors in the HNF of LPS locations. Most notable was the complete removal of ±1 ppm HNF errors in the 6 km interpolated s-SAMs over SLC. After the application of the LPS-DA, the magnitude and number of errors still increased with length scale; however, the number of anomalous errors was reduced across all length scales and the spatial distribution of (2 ppm, 3 ppm) errors was constrained to the HNF of LPS locations.
Error statistics after the LPS-DA was applied are presented in Figure 13. As with the previous data, there was a difference in the ranges of ΔXCO 2 values between the two testing locations; however, the detection algorithm reduced the ranges at both sites. At the LA location, the largest ΔXCO 2 values were removed with the remaining values only approaching 8 ppm. The SLC location experienced no profound reduction but ΔXCO 2 values were constrained under 4ppm. Furthermore, correlation coefficients increased after the detection algorithm was applied. LA was most benefited as the correlation coefficients across length scales were increased by 0.02-0.09. The SLC location was less affected, with increases in correlation ranging from 0.00 -0.02.
After applying the detection algorithm, RMSE values at LA were reduced by roughly half, reinforcing the notion that overall error is heavily influenced by LPS-driven ΔXCO 2 values. Conversely, applying the algorithm to the already small ΔXCO 2 values and low number of LPSs present at the SLC had less of an impact on RMSE. The detection algorithm also reduced MBE in the two smallest length scales at the LA location. For larger length scales, the algorithm's influence on MBE was ambiguous. The SLC location experienced a slight increase in MBE across all length scales. This is due to the removal of large, underestimated ΔXCO 2 values during the detection process, increasing the influence of the over-estimated yet smaller error values.
Increasing slope values suggested that many of the receptors identified by the algorithm were large ΔXCO 2 values that were underestimated by the interpolation method. Removing these points from the interpolation scheme applied more weight to the overestimated error associated with smaller ΔXCO 2 values.
Results after applying the detection algorithm are presented in Table 5. Notable reductions in range were demonstrated as minimum and maximum values across all length scales were drastically reduced. Since most error values fall within the OCO-2 error range and are centered near 0 ppm, their high-density dominated the mean. Although most errors were relatively confined, the spread of these values was heavily influenced by outliers. After removal from the interpolation scheme, the SDs associated with the LA errors were reduced by roughly half and SD values at SLC were reduced by ∼0.02 ppm. The ROTEN ET AL.
10.1029/2020EA001343 20 of 28    W18 addresses errors from a variety of other sources. In their Riyadh test case, the per sounding error in modeled ΔXCO 2 from atmospheric transport was reported to be 0.07-2.87 ppm for areas of low urban enhancement with an occasional value >5 ppm in areas of localized high urban enhancement. Using several CO 2 emission inventories, W18 calculated a per sounding error range of 0.04-2.82 ppm for prior emission estimates (ODIAC). Additionally, the selection of parameters in X-STILT was associated with ∼4% error in modeled ΔXCO 2 . Applying the LPS-DA, the errors associated with the interpolation method in this work fell closely in line W18's estimates of external errors. Across both locations, ≤2% of all error values had the potential of being out of the range associated with transport and prior emissions error. Given the analyses presented here, the most constraining source of error is the per sounding OCO-2 errors.

Dependency on the Grid Spacing of Driving Meteorology
Using s-SAMs generated from available WRF data for SLC (September and October of 2015), the influence of meteorology resolution was investigated. A Kruskal-Wallis H test (see Section A2) was applied and the results are presented in Figure 14. The errors associated with the 1.33 km WRF resolution demonstrated the smallest increase in spread as the subset length scale increased; furthermore, these values maintained the smallest spread when compared to other resolutions at each length scale. With a resolution this fine, meteorological variables were interpolated across multiple points within all subset length scales. This allowed for potentially smoother changes in variable values across length scales and better agreement among control footprints used in the interpolation process. Conversely, when the most-coarse meteorology field (12.0 km) was used, no subset extent was capable of containing more than one point of WRF interpolation. This coarse gridding likely induced abrupt changes in topography and meteorology.
Another feature of the data presented in Figure 14 was the tendency for the interpolation scheme to overestimate ΔXCO 2 when a higher resolution WRF and larger subset length scale were used while interpolated values relying on the low resolution WRF data were underestimated. This reinforced topography's influence on the interpolation results. Using the HRRR data, the use of large length scales at the SLC location resulted in a tendency of overestimation along mountain ranges (Figure 8). 1.33 and 4.0 km WRF data are also capable of resolving many of the features associated with these mountain ranges. The 12.0 km WRF data is less likely to capture these features, reducing their effects on interpolation results. Overall, when considering only 4-8 km length scales there were no statistically significant differences between the means of interpolation errors. Only at the larger length scales were differences significant. These differences were driven predominantly by the coarsening of the meteorological grid and the smoothing of topological features. p-Values from the Kruskal-Wallis test are included in Figure  The distribution of errors presented here were generated using three different resolutions of WRF output for the Salt Lake City region: 1.33, 4.0, and 12.0 km. With each WRF domain, the interpolation method was applied using several subset length scales (4-12 km). The results were then stratified by WRF resolution and length scale. (WRF, Weather Research and Forecasting). Figure 15 presents the efficiency of the interpolation method throughout the two case studies. When the LPS-DA was not applied, the 4 km length scale interpolation scheme consistently required ∼30% of the X-STILT generated footprints to construct a full s-SAM via interpolation. By increasing the length scale to 6 km, ∼25% of the receptors required an X-STILT generated footprint to interpolate across the remainder of the s-SAM. Without the LPS-DA, these values remained unchanged across all s-SAMs, as the number of receptors passed to X-STILT did not change. These values are represented by the gray bars in the figure. Typically, as the subset length scale increases, the percentage of required X-STILT generated footprints decreases. This is shown in the 4 and 6 km length scales but a different pattern exists in the 8 and 10 km length scales. Referring back to Figure 5, the subsets are defined from the bottom left corner; thus, the rightmost column and topmost row cannot fall within a complete subset. These receptors are passed to the X-STILT model and footprints are generated using the traditional method. Similarly, all s-SAMs in this work are broken into equal subsets beginning from the bottom left element of a full domain.

Interpolation Efficiency
Due to certain geometries, it is possible for large sections on the edge of an s-SAM to not be assigned a subset. The 8 and 10 km subset length scales highlight the importance of the s-SAM domain and subset interaction and its consideration when implementing this process; however, in an idealized case where all soundings within a SAM domain are part of a subset, the 4, 6, 8, 10, and 12 km require 25.00%, 11.11%, 6.25%, 4.00%, and 2.78% of the footprints to be generated by X-STILT. The remainder can be interpolated.
When the LPS-DA was applied to each s-SAM, the number of soundings removed from the interpolation process varied depending on meteorological factors. The footprints for soundings selected by the algorithm were generated by X-STILT rather than interpolation. Applying the LPS-DA yielded results that were dependent upon meteorological conditions. Across a duration of low atmospheric transport, emissions from LPSs are weakly diluted, increasing the likelihood of being detected by the LPS-DA. In this case, more soundings will be identified by the algorithm and passed to X-STILT, reducing the number of footprints to be interpolated. Conversely, over periods of strong transport, a smaller number of signals from LPSs may be detected. This will increase the number of footprints that can be generated via interpolation. These two scenarios are represented in Figure 15 by the black bars. The least efficient scenario (weak transport) is marked by the high end of the black bar. This scenario required the most X-STILT generated footprints, reducing the speed gained from the interpolation scheme. The low end of the black bar represents the most efficient scenario (strong mixing/transport) in which the LPS-DA was applied. The average efficiency across all s-SAMs is represented by the red bar.
Computations in this work were distributed across a bank of 8-core Intel Xeon E5-2670 2.6 GHz processors; however, small time trials were performed on sequential calculations to assess the difference in run time between X-STILT generated and interpolated footprints. A single X-STILT generated footprint using 100 particles per vertical level averaged a run time of 11 min and 37 s (n = 7). The generation of interpolated footprints averaged a run time of 23 s (n = 12). Thus, each interpolated footprint was generated in ∼3% of the time X-STILT required. There was considerable variance among run times for the interpolation process due to the number of footprints required in each interpolation. Interpolated footprints along the periphery of subsets required only two control footprints whereas interior locations required four. This doubled the number of raster files to be manipulated in the interpolation process, leading to longer run times.
For an estimation of the computational time required to interpolate a full SAM, an approximation was made with the following assumptions: (1) five nodes consisting of 10 cores (E5-2670 2.6 GHz processors) were ROTEN ET AL.  available, (2) X-STILT footprints are created first, followed by interpolations, and (3) both X-STILT and the interpolation scheme were capable of running in parallel. Given the distribution of errors and density of LPSs at each location, a 4 km subset length scale was selected for LA. A 6 km length scale was selected for SLC. After applying the LPS-DA across all s-SAMs, the average number of receptors passed to X-STILT at the LA location was 171 (of 460) while SLC only required an average of 95 (of 500). Assuming computations were distributed evenly, the estimated time required to generate footprints for an entire s-SAM was reduced by 62% and 78%, respectively.

Discussion
The number of space-based CO 2 observation platforms is projected to increase and subsequent increases in instrument resolution is expected to follow. With the anticipation of spatially dense XCO 2 measurements on the horizon, the development of an interpolation scheme is needed, aimed to reduce the burden on the X-STILT model as these data are digested and analyzed. Estimates suggested that, on average, reductions of 62% and 78% in computational time can be attained at the Los Angeles and Salt Lake City test locations. Errors accrued by this method remained largely within the range of external errors (instrument retrieval error, prior emissions error, atmospheric transport error, etc.) at both test sites. Regional meteorology and topography influenced the accuracy of the method but most effects could be mitigated by careful selection of a subset length scale. These effects and the effects of LPSs were mitigated by selecting a smaller length scale that also constrained interpolation errors within the associated instrument error. A 6 km subset length scale was selected for the SLC location and a 4 km subset length scale was selected for LA.
Currently, this interpolation method must be optimized on a site-by-site basis. Recommendations for the two case studies in this work may not be broadly applicable. For new locations of interest, the generation of several preliminary s-SAMs will be required. The generated s-SAMs should cover several times of day and multiple seasons. Modeled values can be systematically removed to test several subset length scales. Based on (1) the distributions of LPS locations, (2) sensitivity to topographic features, (3) interpolated ΔXCO 2 errors, and (4) the error of the instrument(s) used for XCO 2 observation, an appropriate subset length scale can be selected. Future users may generate an initial coarse sampling of s-SAMs to calculate performance statistics, adding additional s-SAMs as needed. The required sensitivity analyses and subsequent "tuning" of the interpolation scheme for each location of interest makes broad implementation difficult and limits the use of this method to areas of intensive and long-term study.
Moving forward, a dynamic detection algorithm can be implemented. In this work, the detection algorithm aimed to detect large point sources and remove nearby receptors from the interpolation scheme. Although results show reasonable effectiveness, influences from meteorology or topography are not considered. More rigorously designed algorithms may incorporate methods to detect areas of potentially large error based on these factors. Schemes where the subset length scale is varied across SAMs may also be implemented. Expanding X-STILT to a regional model would allow for the dissection of regional XCO 2 domains by investigating the individual footprints corresponding to key ΔXCO 2 values. Although a coarse analysis was presented in this work, future investigations can integrate this interpolation scheme with emission inventories of high temporal resolution (Nassar et al., 2013) and biospheric CO 2 source/sink inventories. Additional inputs will provide computationally efficient and accurate simulated XCO 2 values for comparison with OCO-3 SAMs or future high-density XCO 2 retrieval missions such as NASA's Geostationary Carbon Observatory (Geo-Carb) (Moore et al., 2018) or the ESA's Copernicus Anthropogenic Carbon Dioxide Monitoring instrument (CO2M) (Kuhlmann et al., 2019).

Conclusions
Space-based XCO 2 observations are projected to increase in the coming years. One of the most recent sensors, NASA's OCO-3 instrument, is set to provide unprecedented spatial and temporal coverage of targeted areas around the Earth. Consequently, a strain on current analysis methods is expected.
In preparation for unparalleled data availability, this work presents a novel interpolation method for influence footprints derived from the column-averaged ("X") Stochastic Time-Inverted Lagrangian Transport (X-STILT) model. The traditional X-STILT approach requires rigorous computation to determine the upwind influences associated with each OCO-3 sounding. Here, a series of simulated OCO-3 observations were generated to evaluate the effectiveness of the interpolation scheme. The methodology used a key subset of OCO-3 soundings and their associated influence footprints generated by X-STILT. Using a nonlinear weighted averaging, synthetic footprints were created for the remaining sounding locations. The relationship between decreasing the number of X-STILT generated footprints and interpolation accuracy was thoroughly investigated along with the influences of meteorology, topography, and meteorology resolution. The influences of large point sources were given special attention via a filtering scheme. In two test cases: Los Angeles, CA and Salt Lake City, UT, time trials revealed an estimated 62% and 78% reduction in computational time when compared to the traditional X-STILT modeling approach. Errors associated with the interpolation method remained within the range of external errors (instrument error, atmospheric transport error, prior emissions estimates, etc.), demonstrating the viability of this method to reduce the computational time required of dense observations of space-based XCO 2 and other space-based trace gas measurements.
Including the weighting term removes any error calculations where there is spatial mismatch between the control and interpolated footprint, allowing for strict evaluation of the interpolation method's ability to produce similar element values at control points.

A2. Kruskal-Wallis H Test
This test serves as a nonparametric alternative to the Analysis of Variance (ANOVA) test using a rank-based approach (Kruskal & Wallis, 1952). The initial step in using this test required errors within each group (WRF resolutions) to be ordered in ascending order. An H value was then calculated by       C is the total number of groups (C = 3), n i is the number of elements in group, and R i is the average rank of the values in each group. In a Kruskal-Wallis H test, the rank of each value is determined by listing all samples, N, in ascending order and assigning a numerical rank (1, 2, …, N) to each value. H approximately follows a χ 2 distribution, allowing for the calculation of a p-value.

Data Availability Statement
Relevant OCO-2 data were publicly provided by NASA's OCO-2 project and retrieved from the Goddard Earth Science Data and information Services Center (https://disc.gsfc.nasa.gov/datasets/OCO2_L2_Lite_ FP_9r/summary, DOI: 10.5067/W8QGIYNKS3JC). The HRRR data used in this work were downloaded from the National Oceanic and Atmospheric Administration's (NOAA) Air Resources Laboratory (ARL) and is also publicly available at https://www.ready.noaa.gov/archives.php. Additionally, current and previous versions of the ODIAC dataset are maintained by the Center for Global Environmental Research and can be accessed via: http://db.cger.nies.go.jp/dataset/ODIAC/. Details are provided in Oda et al. (2018). Source code for X-STILT can be found at https://doi.org/10.5281/zenodo.2556989. Lastly, the authors of this work declare no conflicts of interest.