Global Three‐Dimensional Water Vapor Feature‐Tracking for Horizontal Winds Using Hyperspectral Infrared Sounder Data From Overlapped Tracks of Two Satellites

The lack of measurements of three‐dimensional (3D) distribution of horizontal wind vectors is a major challenge in atmospheric science. Here, we develop an algorithm to retrieve winds for nine pressure levels at 1° grid spacing from 70°N to 70°S. The retrieval is done by tracking water vapor from the hyperspectral Cross‐track Infrared Sounder aboard two polar satellites (NOAA‐20 and Suomi‐NPP) that have overlapped tracks separated by 50 min. We impose a gross error check by flagging retrievals that are too different from ERA‐5 reanalysis. Testing the algorithm for the first week of January and July 2020 indicates that our algorithm yields 104 wind profiles per day and these 3D winds qualitatively agree with ERA‐5. Compared with radiosonde data, the errors are within the range of reported errors of cloud‐tracking winds.

include representativeness error produced by the radiosonde comparison; Martin et al., 2021). Finally, forecast models also assimilate wind measurements from commercial aircrafts and dropsondes (Stoffelen et al., 2020).
More recently, it has become possible to calculate AMVs using water vapor fields derived from hyperspectral infrared (IR) sounders that measure the atmospheric column with vertical resolution of ∼1 km. Being able to derive vertical winds from observed water vapor fields has the potential of ameliorating the midtroposphere wind data gap, since multiple pressure levels of water vapor can be derived at each horizontal coordinate through radiative transfer calculations, in contrast to the single-level retrieval of cloud-tracking winds. However, assigning the hyperspectral 3D winds to the middle of each ∼1 km vertical layer may contain some errors if the winds vary nonlinearly with height within this layer. Regardless, hyperspectral IR makes it possible to derive a multiple pressure level vertical wind profile per horizontal coordinate, in contrast to traditional single-level cloud-tracking retrievals.  derived hyperspectral, 3D AMVs from the Atmospheric Infrared Sounder (AIRS) carried by the Aqua satellite on LEO, exploiting the satellite's track convergence that appears near the poles in order to extract AMVs from sequential, overlapping swaths at high latitudes (in a similar way to the cloud-tracking AMVs derived from the MODIS instrument). While these results are very encouraging over the polar regions, they are not applicable to lower latitudes due to the orbital track separation. Recently, Hautecoeur et al. (2020) presented some preliminary efforts with the Infrared Atmospheric Sounding Interferometer (IASI) data to derive 3D winds poleward of 45° using the swath overlap of Metop-A and Metop-B. In this letter, we describe the method we developed for calculating 3D AMVs from IR sounders on two satellites: NOAA-20 and Suomi-NPP, which are separated by ∼50 min in the early afternoon orbit with 1:30 a.m./p.m. local overpass time. Both LEO satellites carry the hyperspectral Cross-track Infrared Sounder (CrIS). We used the overlap of the satellites orbital tracks to extract the sequential moisture fields necessary for the 3D AMVs. In contrast to only using one LEO satellite, this method can extract 3D AMVs for the tropics and midlatitudes.

Data
We originally retrieved AMVs for 49 pressure levels between surface and 100 hPa (although we later coarsen the AMVs from 49 to 9 pressure levels, as detailed in Section 2.2), using the hyperspectral water vapor product from Community Long-term Infrared Microwave Combined Atmospheric Product System (CLIMCAPS). CLIMCAPS is the NASA continuity retrieval product for the AIRS and CrIS instruments (N. Smith & Barnet, 2019). CLIMCAPS employs MERRA-2 reanalysis (Gelaro et al., 2017) as a priori for its temperature and water vapor retrievals. CLIMCAPS uses "cloud clearing" to remove the radiative effects of clouds from the IR measurements.
CLIMCAPS cloud-cleared retrievals range in horizontal resolution from 50 to 150 km. To allow direct match-ups between CLIMCAPS water vapor retrievals from the CrIS instruments on SNPP and NOAA-20, we gridded the values onto an equal-angle 1° × 1° grid. We found that the calculation of AMVs from water vapor fields benefits from having no data gaps, even if clouds introduce error. We, therefore, did not apply any quality filtering ahead of gridding. The 1° resolution is coarser than the effective resolution of CLIMCAPS retrieval at nadir (∼50 km) to take into account sampling errors.
To test our algorithm, we calculated AMVs for 1-7 January 2020 and 1-7 July 2020. Both ascending and descending orbits were included. The region where the CrIS swaths from SNPP and NOAA-20 overlap has a width of approximately ∼900 km (e.g., see Figure S1 in Supporting Information S1).
The ERA5 reanalysis data (Hersbach et al., 2020) were used to qualitatively assess our 3D wind retrievals. We also evaluated AMVs by collocating them with radiosondes from Integrated Global Radiosonde Archive (IGRA) version 2 database (full measurement resolution; Durre et al., 2018; see Figure S2 in Supporting Information S1 for locations).

Algorithm
To retrieve AMVs, we divided the overlapping track for each satellite into patches of 5 min in observation time (or ∼1,700 km along the track). The width of the overlapped swath is ∼900 km. For each patch, F NOAA20 in NOAA-20's track beginning at observation time t NOAA20 , there is an overlapping patch F SNPP from SNPP roughly beginning at observation time t SNPP = t NOAA20 + 50 min. We practiced quality control by rejecting patches with areas smaller than 100 (1° × 1°) pixels, so that the wind retrieval algorithm has enough pixels to process. Although each patch has ≥100 pixels, the optical flow algorithm retrieves winds at pixel-level resolution (1° × 1°). There are about 20 potential patches of 5 min per ∼100 min orbital track, but only about 12 per orbit (6 for each ascending/descending part) make the quality control on average (see Figure S1 in Supporting Information S1).
Although the patches have irregular shapes, they are stored as rectangular images of the smallest dimension necessary to fit the patches. This is because the wind retrieval algorithm requires as input rectangular images, whereas the patches themselves have irregular shapes. The rectangular images will produce gaps in the areas that are not filled by the irregular shaped patch. Furthermore, the CLIMCAPS fields themselves have gaps, even without applying quality filters due to gaps in orbital tracks at low latitudes. Given that the optical flow algorithm cannot process gaps, the gaps are filled with a nearest neighbor interpolation algorithm. The filled gaps are only used temporarily, however, since once the algorithm retrieves the winds, the winds from the gaps are removed, only leaving the winds from pixels that have real moisture information.
For each patch, we quantify the gap area by calculating the ratio of pixels with information over the total pixels of the patch (which include informative pixels and empty pixels). We tested this relationship for AMVs of the first day of January 2020, using all 49 levels. The average ratio is 0.47 with a standard deviation of 0.16. To test the deterioration of the retrieved wind, we calculated the correlation of magnitude of vector difference (VD) between AMVs and ERA-5 winds versus the pixel ratio which gives a near zero correlation, implying that error is nearly independent of the gap size. The reason may be that we only temporarily use the interpolated pixels to calculate the AMVs, and then we remove them.
We used Dense Optical Flow (DOF), specifically the OpenCV (Bradski, 2000) implementation of the TV-L 1 algorithm (Zach et al., 2007), to compute AMVs from the quality controlled (rectangular) patches F NOAA20 and F SNPP . TV-L 1 computes the displacement of every pixel by minimizing a function that penalizes against nonconservation of pixel brightness, leading to subpixel accuracy.
For a gross error check, we computed the VD magnitude between our AMVs and the ERA-5 reanalysis (Hersbach et al., 2020), which is commonly used to get rid of outliers (Holmlund, 1998;Holmlund et al., 2001;Lewis et al., 2020;Li et al., 2020). Winds with VD over a threshold δ = 10 m s −1 are flagged. We avoid the traditional model-independent techniques for quality control since they require a triplet sequence of images (Holmlund, 1998;Holmlund et al., 2001;, whereas only two images separated by ∼50 min are available to calculate 3D winds here. For a gross error check, the ERA-5 horizontal winds were collocated with the overlapping pixels of each pair of 5 min patches (F NOAA20 , F SNPP ) by using a nearest neighbor algorithm that takes as input the latitude and longitude coordinates of the overlapping pixels, and the date that is the midpoint between t NOAA20 and t SNPP .
Finally, after the above step, we further postprocess the AMVs and the collocated ERA-5 winds by reducing the original 49 pressure levels by calculating the median of every five vertically consecutive AMVs located at each horizontal coordinate, leading to nine pressure levels (hPa): 151. 62, 201.34, 260.36, 329.08, 407.9, 497.04, 596.72, 707.0, and 827.8. This pressure coarsening is done for two reasons: the retrieved soundings from CrIS have only between three and seven degrees of freedom depending on ambient conditions (N. Smith & Barnet, 2020; W. L. Smith et al., 2021), and coarsening decreases the root mean square vector difference (RMSVD) by removing some of the random noise.

Results
Our most significant result is deriving 3D AMVs that cover the tropics and midlatitudes. Figure 1 shows global AMV maps at the P = 827.8, 707.0, 497.04, and 407.9 hPa pressure levels and the descending orbit of 1 July 2020 (for ascending orbit, see Figure S3 in Supporting Information S1). The AMVs are qualitatively similar to the corresponding ERA-5 fields, showing that the AMVs are capturing complex synoptic-scale circulation features such as the 707.0 hPa anticyclonic circulation in the Atlantic Ocean near northern Africa. To quantify the similarity between AMVs and ERA-5, we also calculated the RMSVD for each plot. The error does not vary 10.1029/2022GL101830 4 of 9 much at different levels with a minimum of 5.52 m s −1 at the 707.0 hPa level and a maximum of 6.03 mm s −1 at 407.9 hPa level. These RMSVDs using 1° resolution data are comparable to those calculated by Santek, Dworak, et al. (2019) between cloud-tracking AMVs and ERA-Interim reanalysis, which are 2.9-7.1 m s −1 , with the caveat that the quality control analysis they did is different than ours. We retrieved about 10 4 wind vectors per pressure level per day (see the line of δ = 10 m s −1 in Figure S6 in Supporting Information S1).
Our 3D AMVs have good coverage not only horizontally but also vertically. As an example, Figure 2 shows a cross section across one of the descending orbital tracks for 1 July 2020, depicting the nine derived levels. The 3D AMVs are similar to the ERA-5 fields even at altitudes above the 500 hPa pressure level, where capturing 3D AMVs is more difficult due to lower water vapor content (Ouyed et al., 2021;Posselt et al., 2019). The RMSVD of AMVs in this cross section is 5.78 m s −1 which is within the bound of errors of between cloud-tracking AMVs and ERA-Interim (Santek, Dworak, et al., 2019).
Although ERA-5 fields do assimilate radiosonde winds, the more accurate baseline for comparison is the radiosonde wind measurements themselves. Figure 3 shows pressure versus RMSVD or the average speed difference (or speed bias) between AMVs and radiosonde winds, and the shaded region shows the ranges extracted from Santek, Dworak, et al. (2019). The AMVs are collocated within 1.5 hr and 1° of radiosondes. We used 144 radiosonde profiles for 1-7 January 2020 and 174 profiles for 1-7 July 2020 (See Figure S2 in Supporting Information S1), leading to the total number of individual collocated wind vectors at nine pressure levels for both Figure 1. Retrieved horizontal wind for the descending orbit of 1 July 2020 for different pressure levels in left panels. The corresponding results from ERA-5 are shown in right panels. Resolution averaged to 5° to make the plot readable. Root mean square vector difference (RMSVD) between Atmospheric Motion Vectors (AMVs) and ERA-5 winds in left panels computed at the original horizontal resolution of 1°. January (N = 1,278) and July (N = 1,851), as provided in the panels. Figure 3 also shows that the 3D AMV errors lie mostly within the GEO AMV error bounds (5.83-8.97 m s −1 ) with an average (calculated as the average of the RMSVDs at each pressure level) of 7.58 m s −1 for January and 7.39 m s −1 for July. Another interesting fact is that for the collocated ERA-5 winds nearest to the radiosondes (averaged to 1° resolution for comparison), the RMSVD, although smaller than for the AMVs, is still significant, since the averaged RMSVD is 3.51 m s −1 in January, and 4.19 ms −1 in July, most probably reflecting the representativeness error in using point data to evaluate gridded products (as these radiosonde data were assimilated in ERA-5).  Another glaring aspect of Figure 3 is the low speed bias of the 3D AMVs, which is −2.76 m s −1 for January and −3.01 m s −1 for July (calculated as the average of the speed biases at each pressure level). These quantities exceed the maximum GEO AMV slow speed bias, which is −1.79 m s −1 . This is not surprising, since slow speed bias in wind tracking is increased when smaller features are not resolved (Bresky et al., 2012), which applies to the relatively coarse resolution (i.e., 1° grids) of the retrieved moisture fields. To evaluate the correlation between speed bias and resolution, we calculated AMVs from GEOS5 Nature Run (G5NR) moisture fields as used in Ouyed et al. (2021), by coarsening the original resolution (0.0625°), into 0.125°, 0.25°, and 1° grids. We use 1-3 January 2006 and 1-3 July 2006 for P = 500 hPa, and for each day we calculated AMVs at 0:00, 6:00, 12:00, and 18:00 UTC. We use the OpenCV implementation of deepflow as in Ouyed et al. (2021) since our experiments show it works better than TV-L 1 for the large dimensions (2,881 × 5,760) of the nature run image. The results are in Figure S4 in Supporting Information S1, and as expected, they show that slow speed bias decreases with higher resolution. For example, for 1-3 January 2006, the slow speed bias for a grid size of 0.0625° is just a third of that for 1° grids. Note that the large slow speed bias (of −5.27 m s −1 ) for 1° grids in Figure S4 in Supporting Information S1 (compared with those in Figure 3) may be caused by the use of the same algorithm parameters for all resolutions in this sensitivity test (rather than using parameters optimized for each resolution).

Discussion
A significant result is that we, not only retrieved vertical profiles of winds, but that our 3D AMVs lie within the typical error ranges of GEO AMVs. Although GEO AMVs use a quality control algorithm (Santek, Dworak, et al., 2019; that cannot be used here due to the unavailability of image triplets, they still can act as benchmarks for acceptable error values. Given that the 3D AMVs lie within accepted error ranges, the retrieved wind profiles can have immediate research and operational benefits (e.g., used in addition to other products, like GEO AMVs and Aeolus winds).
One weakness of our 3D AMVs derived from CLIMCAPS water vapor fields is the coarse horizontal resolution of 1° with only two AMVs per day for locations near the equator. In contrast, GEO AMVs have a higher horizontal resolution of a few kilometers and resolve the diurnal cycle (but with the disadvantage of being available at only one pressure level for a given location and time). The temporal and spatial restrictions of our 3D AMVs limit their usability for mesoscale events (Hristova-Veleva et al., 2021). Furthermore, as argued in Section 3.1, this low horizontal resolution leads to considerably low speed bias.
Another issue with our 3D AMVs is that they are not independent of reanalysis data, as CLIMCAPS uses MERRA-2 fields as the first guess in its moisture field retrieval, as quantified in N. Smith and Barnet (2020). This is not desirable for the assimilation of 3D AMV data for weather forecasting. In future, we will test our approach using retrievals from the NOAA-unique combined atmospheric processing system (NUCAPS; Berndt et al., 2020) that runs in near real time and is independent of reanalysis models.
Furthermore, the error check itself may introduce correlations of the AMVs with ERA-5 fields. To address this issue, we computed a random wind vector for every available AMV for all nine levels, drawing the random u and v components from a random uniform distribution from −20 to 20 m s −1 , and then we applied the δ = 10 m s −1 error check to the random vectors. In order for 3D AMVs to be informative, they must perform better than the correlated random vectors.
First, we computed the statistics in Figure 3 but for the random vector, which we display in Figure S5 in Supporting Information S1. Indeed, the errors are larger than for the 3D AMV equivalent in Figure 3. For example, the mean RMSVD of the random vectors for July is 1.2 m s −1 higher than for the 3D AMV equivalent. Furthermore, the sample size for random vectors is much smaller (e.g., N = 499 for January vs. N = 1,278 for 3D AMVs), signifying that more AMVs survive the quality control since they contain more meaningful information. In Figures  S6 and S7 in Supporting Information S1, we compare random vectors and AMVs to ERA-5 fields, confirming qualitatively the error analysis above, since the error of random vectors is ∼1 m s −1 larger than for the 3D AMV. For example, random vectors for July have an average RMSVD of 7.04 m s −1 (vs. 6.05 m s −1 for the 3D AMVs). Furthermore, the error check outputs a much larger yield for 3D AMV than random vectors since the former is more informative. For example, the daily yield for July is 1.2 × 10 5 AMVs (at all nine levels) per day, whereas for random vectors it is 2.9 × 10 4 only. Finally, we calculated the correlation coefficients between the speed of AMVs and random vectors versus ERA-5 winds, the latter a proxy for ground truth, since the correlation coefficient is a proxy for the mutual information shared between two variables. Figure S8 in Supporting Information S1 shows the correlation coefficients and the corresponding histograms for 1-7 January 2020. Indeed, the error check introduces a correlation between random vectors and ERA-5 winds. However, AMVs are more informative due to its higher correlation of 0.84 than that (0.69) for random vectors.
The nontrivial information content of 3D AMVs retrieved from CrIS data suggests that future satellite missions with water vapor sounders of higher spatial and temporal resolutions and coverage (e.g., by a constellation of small satellites) can revolutionize our wind retrieval capacities. Furthermore, a higher horizontal resolution would decrease the considerable slow speed bias (see Figure S4 in Supporting Information S1), and a higher vertical resolution would reduce "height assignment" errors when winds vary nonlinearly with height within each layer.
Our 3D AMVs have some advantages and disadvantages compared to Aeolus winds. The 3D AMVs have more global coverage since CrIS onboard two satellites have overlapping swath widths of about 900 km (e.g., Figure S1 in Supporting Information S1), whereas the lidar retrieves a "thin curtain" of winds. Furthermore, the 3D AMVs retrieve the whole horizontal wind vector, whereas Aeolus can only retrieve the LOS wind component. The 1° resolution of our 3D AMVs is comparable to that (90 km) of Aeolus for Rayleigh scattering (for which most of the Aeolus wind is available) but it is much coarser than that (10 km) for Mie scattering (for which Aeolus data are frequently not available; Martin et al., 2021). However, Aeolus winds have much lower errors (4-5 m s −1 ) and in contrast to our 3D winds, they are model independent. Furthermore, its main advantages are its high vertical resolution (0.25 km near the surface), and its ability to retrieve winds when the air is dry, in contrast to 3D AMVs that require moisture. If AMVs and lidar winds are collocated over some areas, an optimal way might be to use the more accurate lidar winds to do the bias correction of the AMVs, as demonstrated in Ouyed et al. (2021).
Another issue is the quality control of the moisture patches themselves. Right now, using a minimum area of 100 pixels for the overlapping patch rejects virtually all high high-latitude patches. This also implies that the number of patches yielded by the retrieval algorithm might be suboptimal, since more patches, and therefore a higher AMV yield, might be possible by tweaking the algorithm. We could also potentially double the number of 5-min patches per track (and therefore double the yield of AMVs) by making the edges of patches overlap over 2.5 min in length/time. For example, if one patch ends at 00:05:00 UTC, the next patch could begin at 00:02:30 UTC (rather than at 00:05:00 UTC at present). The overlapped patches will generate overlapping AMVs. However, two overlapping AMVs will not be the same since the pixel neighborhoods used to compute them will differ, increasing the amount of AMVs that could survive the quality controls of operational centers. Furthermore, there is an element of spatial representativeness error (Mile et al., 2021), since the computation of an AMV is a function of the position of the patches' spatial boundaries.
Note that the optical flow algorithm works through a (regularized with extra terms) pixel brightness continuity equation, and therefore when applied to moisture fields, it becomes a continuity equation of moisture. However, moisture is not necessarily conserved in the atmosphere. In other words, the temporal evolution of moisture also depends on its sources and sinks, and this dependence introduces errors in the retrieved winds. The significance of sources and sinks depends on their temporal and spatial scales, relative to the 3D AMVs' spatial (100 km) and temporal scales (∼50 min). Even in an extreme scenario like convection at the mesoscale (100 km) level, the shortest time scale is about 5 hr (Fujita, 1986), which is relatively "slow" compared to horizontal advection (∼50 min separation time of SNPP and NOAA-20). Therefore, the impact of moisture sources and sinks should be relatively small for our results.
Finally, we must acknowledge that winds can also be retrieved from four-dimensional data assimilation of hyperspectral radiances (4D Var). Although both 4D Var and our 3D wind retrieval method produce winds by processing radiances, the methods are different. The only assumptions behind 3D AMVs are the assumptions inherent in the radiance retrievals and the variational algorithm for wind retrieval, whereas winds from data assimilation are a function of more complex factors such as specific numerical models, cost functions for error minimization, and other assimilated observations (e.g., radiosondes). Therefore, our 3D AMVs add a different source of information that can complement 4D Var.
Future studies should address the optimal way to retrieve 3D winds (e.g., using our algorithm plus the use of additional data, such as those from radiosondes and wind lidar, for bias correction) and the optimal way to assimilate wind information in numerical weather prediction (e.g., based on the assimilation of radiances, 3D winds, or combination of both).

Conclusion
We retrieved 3D AMVs from the hyperspectral instrument CrIS onboard two satellites for nine pressure levels between surface and 100 hPa. Our method consists of dividing the LEO overlapping swathes into 5-min-long patches and then processing each individual patch with a DOF algorithm. We applied this AMV algorithm to 49 pressure levels retrieved from CLIMCAPS moisture fields and then further coarsen the AMVs into 9 levels. We used a simple but effective gross error check algorithm that uses ERA5 fields to flag AMV outliers. The algorithm outputs about 10 4 vertical profiles of 3D winds that lie within the error range of existing operational AMV for the analyzed periods.
For future work, we will modify our 3D retrieval algorithm to derive AMVs from the polar regions. Future efforts should also look into designing a quality control algorithm that is more independent of model data and does not require triplet images.

Data Availability Statement
The AMV retrieval code we developed is released under the MIT License and can be found in https://github. com/aouyed/3d-amvs. The CLIMCAPS data products can be retrieved through the NASA Earth Observing System Data and Information System website. For Suomi-NPP: https://disc.gsfc.nasa.gov/datasets/SNDRSNIM-L2CCPRET_2/summary. For NOAA-20: https://disc.gsfc.nasa.gov/datasets/SNDRJ1IML2CCPRET_2/ summary.