Scale‐Dependent Ocean Vertical Correlations in the California Current System

We present observation and model estimates of temperature and salinity depth‐depth cross‐correlations in two different horizontal scale regimes. Glider data from the 2021 S‐MODE pilot campaign were assimilated into an ocean simulation using a three dimensional variational algorithm. We map the glider time series into distance allowing a Fourier analysis of model errors in wavenumber space. Power spectra indicate model error variance is less than the observed variance at scales larger than approximately 250 km. Based on this result, a Gaussian filter partitions the glider and model data into larger and smaller scale series at each depth. The larger and smaller scale temperature and salinity cross‐correlations between depths are compared and contrasted. Glider and model cross‐correlations are found to be similar, implying that the model physics at scales not constrained by the observations are similar to the true world.

For brevity, consider a widely used ocean data assimilation technique: three dimensional variational (3DVAR). The algorithm assumes time-invariant observations and uses prescribed background error covariances to spread observations, leading to model corrections horizontally and vertically. This prescribed covariance is defined by users, who have historically focused on geostrophically balanced, baroclinic adjustments due to the coarse nature of the available observations that have horizontal scales based on the Rossby radius of deformation (e.g., Jacobs et al., 2014). Vertical relations between variables over depth are a critical factor. Previous efforts have developed methods for computing this information from available in situ observations (Helber et al., 2013). The historical data have mainly been used to estimate the larger mesoscale vertical cross-covariance structures between different depths, for example, the cross-covariance between temperature and salinity expressed as ⟨ ( 1) , ( 2)⟩ and referred to as a depth-depth cross-covariance. However, the historical in situ ocean profile information is relatively limited for computing statistical relations over the globe at horizontal scales smaller than the mesoscale. Numerical ocean model output is another source for computing statistics (Oke et al., 2010), and there are far fewer limitations in data coverage or quantity. However, the validity of the model-derived statistics can be impacted by model biases in which, for example, the depth of the thermocline variability may be biased thereby leading to biased statistical relations.
Multi-scale 3DVAR assimilation approaches have been developed and demonstrated to utilize horizontally dense data to correct smaller features (Li et al., 2015b;Souopgui et al., 2020), however, the appropriate vertical Abstract We present observation and model estimates of temperature and salinity depth-depth cross-correlations in two different horizontal scale regimes. Glider data from the 2021 S-MODE pilot campaign were assimilated into an ocean simulation using a three dimensional variational algorithm. We map the glider time series into distance allowing a Fourier analysis of model errors in wavenumber space. Power spectra indicate model error variance is less than the observed variance at scales larger than approximately 250 km. Based on this result, a Gaussian filter partitions the glider and model data into larger and smaller scale series at each depth. The larger and smaller scale temperature and salinity cross-correlations between depths are compared and contrasted. Glider and model cross-correlations are found to be similar, implying that the model physics at scales not constrained by the observations are similar to the true world.

Plain Language Summary
Historical observations around the globe have enabled insight into the vertical variability of temperature and salinity within features having horizontal extent larger than about 200 km. The paper exploits a collection of high-density data that provides the opportunity to observe vertical variability within structures at smaller scales. There are two applications. The first allows us to verify that numerical models, representing our theoretical understanding of the ocean, are consistent with observations. The second application is to improve regular ocean forecasts. As we seek to forecast smaller scale features through upcoming observing systems, we must be able to correct both the large-and small-scale features and their underlying vertical variability.

D'ADDEZIO AND JACOBS
Published 2022. This article is a U.S. Government work and is in the public domain in the USA. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
• The assimilative nonlinear model has skill to 250 km using available observations • Temperature and salinity vertical cross-correlations have unique profiles in each corresponding horizontal regime • Observation-and model-based depth-depth cross-correlations are similar

Supporting Information:
Supporting Information may be found in the online version of this article.
relations to use at the smaller horizontal scales is not yet clear. Model-derived estimates of small-scale vertical cross-covariances exist (D'Addezio et al., 2020), but have not been validated against observations. We use profiling glider data from the 2021 Sub-Mesoscale Ocean Dynamics Experiment (S-MODE) (Farrar et al., 2020) pilot campaign ( Figure 1) to produce observation-based, small-scale vertical temperature and salinity cross-correlations and compare these to model-derived estimates. Here, small-scale is relative to the horizontal scales constrained by the 3DVAR used to assimilate data during the campaign. The term "sub-mesoscale" is intentionally avoided because, as the reader will see, the model constrained scales are still quite large (>200 km wavelength).
The article is organized as follows: Section 2 describes the model, assimilation, in situ data, and the data analysis used to separate scales and thereby derive scale-dependent depth-depth temperature and salinity cross-correlations.
In Section 3 we compare and contrast the temperature and salinity depth-depth cross-correlations between the two horizontal regimes based on the glider observations and corresponding data sampled from a numerical model. Finally, in Section 4 we summarize the results, consider the implications for numerical model abilities to represent vertical structures of smaller scales, and discuss their utility for potentially improving ocean multi-scale state estimation and forecasting.

Modeling and Data Assimilation
Ocean nonlinear modeling was performed using the Navy Coastal Ocean Model (NCOM) (Barron et al., 2006). NCOM integrates the primitive equations using leapfrog time stepping and includes tides, river inflow, and realistic surface forcing from a global atmospheric model. The area modeled was off the coast of California with lon/ lat bounds of (233°E-244.5°E; 28°N-39°N). The simulation has a 1 km horizontal grid and 100 sigma/z layers. Therefore, the simulation can resolve ocean features of approximately 10 km and greater. The model vertical grid transitions from 39 sigma levels near the surface to 60 z levels at a depth of 42 m. Tidal forcing from the global Oregon Tidal Inverse Solution (Egbert & Erofeeva, 2002) was added along the open boundaries, which were taken from the Global Ocean Forecast System version 3.1 run operationally by Fleet Numerical Meteorology and Oceanography Center (Metzger et al., 2014). A monthly climatology river database (Barron & Smedstad, 2002) was used to provide fresh water inflow to the simulation. Surface forcing was provided by the NAVy Global Environment Model (Hogan et al., 2014) and included: surface wind stress, latent heat flux, sensible heat flux, solar radiation, and precipitation. Boundary conditions were introduced through a double nesting procedure starting with global 1/12° HYbrid Coordinate Ocean Model (Metzger et al., 2014(Metzger et al., , 2017) that fed boundary conditions to a 3 km NCOM simulation that ultimately fed boundary conditions to the final 1 km model grid used in the experimentation. The 3 km NCOM simulation covered (228°E-244.9°E; 25°N-45°N), providing several mesoscale eddies worth of space between the 3 km open boundary and where boundary conditions were introduced into the final 1 km NCOM simulation. Finally, at the beginning of the S-MODE experiment time period (31 August 2021), the 1 km NCOM simulation had been running with data assimilation turned on since 1 January 2019.
During the experiments, the Navy Coupled Ocean Data Assimilation (NCODA) system assimilated observations every 00z of each simulation day. Specifically, we used NCODA-3DVAR, a thorough review of which can be found in Jacobs et al. (2014). All regular observations were assimilated: in situ Argo profiles, satellite sea surface temperature (SST), and sea surface height anomaly from nadir altimeters. Additionally, nine Slocum gliders deployed during the S-MODE pilot campaign were assimilated. These nine gliders are also used to perform our data analysis throughout the paper. NCODA-3DVAR uses a horizontal correlation dictated by a second order autoregressive (SOAR) function with a defined length scale that is dependent on the local Rossby radius of deformation multiplied by a constant. The area-averaged horizontal correlation scale was 29 km, a value found to minimize assimilation errors through empirical methods . While this value appears relatively small-scale, the SOAR power spectral density (PSD) quarter power point is approximately 11 times greater than the decorrelation length scale. Using these settings, NCODA-3DVAR is making large-scale corrections with quarter power at about 320 km.

Data Analysis
We endeavor to calculate the horizontal scale down to which the model has skill: the so-called constrained scale . This is the scale at which the PSD of errors is equal to the PSD of the ocean variability.  Souopgui et al., 2020), which demands that the data be on a regularly spaced grid. The glider data collected during the S-MODE pilot are not regularly spaced. Therefore, we first interpolate the glider data to a regularly spaced sampling interval. At the same time, we replicate the glider sampling process and interpolation using the model data.
Each of the nine glider time series provide longitude, latitude, and time positions at which the glider returned underwater temperature and salinity measurements (see the "Data Availability Statement" section for links to the glider netcdf files). We first convert this data as a function of time to a function of distance traversed. By starting with the first reported data at distance = 0, calculating the distance traveled between successive measurements, and adding to the prior distance, we derive a series as a function distance from the location of the first measurement taken (Figure 2a). The distance between profiles is not constant due to ocean currents affecting glider speed over ground leading to different distance traveled between successive profiles. This distance-depth data set was then interpolated to have a grid spacing of 1 km in distance and to match the set vertical grid output within the available model files (see the "Open Research" section for links to the model netcdf files). Interpolation was performed by organizing temperature, distance, and depth into corresponding one-dimensional arrays and passing them through MATLAB's "scatteredInterpolant" algorithm (e.g., F = scatteredInterpolant (distance, depth, temperature)), which uses Delaunay triangulation to perform interpolation (Amidror, 2002). "F" represents a function that is used to project and interpolate the temperature array into our desired two-dimensional structure.
For the interpolation, we used the "natural" setting, which is a natural nearest neighbor interpolation. For any required extrapolation at the boundaries, we used the "nearest" setting, which simply sets any extended data to the last available data near the boundary. An example of the resulting interpolation is shown in Figure 2b. The interpolation is largely consistent with the features apparent in the original data set (Figure 2a). A similar process was performed using the model data: model data were taken at the nearest model grid point and interpolated in time to match that of the corresponding glider measurement. The full profile on the model output vertical grid was taken as to avoid interpolating model information in the vertical. The model data were then interpolated to the same 1 km distance points as the interpolated observations. Features shown in the interpolated model data are consistent with those in the non-interpolated distance-depth volume (results not shown). Finally, the two interpolated distance-depth volumes were subtracted from one another to produce distance-depth volumes of model error (results not shown).
For each glider data set, the matching model data set, and the matching model error data set, one-dimensional wavenumber Fourier analyses were performed independently on each depth level within the distance-depth volumes. We pause here to stress that the spatial scales derived from the Fourier analysis of the distance series do not correspond directly to the true ocean spatial scales as evaluated in a Eulerian frame. The S-MODE region only covers a couple of hundred kilometers in each direction (Figure 1), and yet our distance series extends to at least 1,000 km ( Figure 2). Evaluating model errors in the distance series frame is simply a tool for deriving the band of spatial scales over which the model has skill in the distance series frame. We will return to this frame-of-reference issue in Section 4 when we discuss implications this work has for implementing a multi-scale 3DVAR that uses disparate vertical physics in each of the two assimilation steps.
When computing the Fourier transform, we excluded the first 200 km of the series, which is approximately 10 days (10 assimilation cycles) (shown by the vertical dashed white line in Figure 2b), to allow the model time to assimilate the glider data and gain skill. The squared amplitude of those Fourier coefficients produced wavenumber PSD (i.e., spectrum) of the glider temperature data, the model temperature data, and the model temperature error. A single spectrum for each was generated by averaging across the nine realizations (i.e., one for each of the nine gliders). With these mean spectra, we evaluate the function below following D'Addezio et al. (2019): where model is the mean spectrum of the model temperature error, glider is the mean spectrum of the glider temperature data, model is the mean spectrum of the model temperature data, and the brackets denote the mean of the two spectra. This formulation creates values ranging from 0 (perfect skill) to 2 (no skill) as a function of wavelength. An example at 200 m depth is shown in Figure 3a. For larger wavelengths, the error spectrum is smaller than the two other spectra. These are wavelengths over which the model has skill. At smaller wavelengths, the error spectrum is larger than the two other spectra. The variance of the summation of two uncorrelated variables is the simple sum of the variance of each of the individual variables; hence, the maximum value of 2 in Equation 1. We choose the value of 1 to define skill (i.e., a correlation of 0.5). Plotted as a function of depth (Figure 3b), we can determine the approximate constrained scales over which the model has skill in that the error PSD is less than the averaged model and glider PSD, or the values at which Equation 1 is less than 1. Unconstrained scales are those at which the value in Equation 1 is greater than 1. The contoured value of 1 suggests that the depth-average constrained scale is approximately 250 km (denoted by the vertical dashed black line). The same analysis was performed for salinity, and a depth-averaged constrained scale closer to at least 300 km is observed (results not shown). Here, we focus primarily on temperature and thus proceed using the 250 km depth-averaged constrained scale found using that variable. Ultimately, this difference in the constrained scales for temperature and salinity is irrelevant for implementation in the multi-scale 3DVAR as the scale separation will be a function of the assimilation scales set by the quarter power point of the SOAR function. This discussion is expanded upon in Section 4.
We can now construct a spatial filter to separate the glider and model temperature volumes into constrained and unconstrained bands. A Gaussian e-folding scale of 66 km corresponds to a PSD quarter power point at wavelength 250 km. This Gaussian filter is convolved over each of the depth levels in the glider and model distance-depth volumes. At the boundaries, where the stencil size becomes larger than the remaining data, the function is rescaled so that the reduced set of coefficients sum to 1; essentially forcing the filtered set to converge to the last value at each boundary of the original data set. An example of the resulting filtering is shown in Figure 2c. Clearly, much of the small-scale variability in Figure 2b has been removed. This filtered field is now considered the variability in the constrained band. The residual (original minus filtered) is now considered the variability in the unconstrained band. The following results examine the depth-depth temperature cross-correlations in each of these horizontal regimes.

Results
We compute ⟨ ( 1) , ( 2)⟩ and normalize by the standard deviation along the diagonal to produce depth-depth cross-correlation of temperature using data from all nine gliders and the corresponding nine model matchups. The computations are conducted separately for the constrained and unconstrained temperature. Those results are shown in Figure 4. The top row is the glider data and the bottom row is the model data. The left column is the constrained results and the right column is the unconstrained results. The glider-observed and model constrained cross-correlations look very similar with distinct vertical levels. The levels are identified by the approximately square red blocks along the diagonal. Variability within these blocks is positively correlated. These ranges are 0-30 m, 30-70 m, and deeper than 70 m. Off-diagonal blocks indicate how these levels vary relative to one another. The 30-70 m depth range is negatively correlated to the upper layer and negatively correlated to waters below about 150 m. From Figure 2, the thermocline is primarily at approximately 50 m depth, and these relations reflect variability above and below the thermocline being in opposite directions, as expected. These profiles are roughly consistent with mesoscale depth-depth temperature cross-correlations derived from large-scale in situ data taken over the last several decades ( Figure S1 in Supporting Information S1; Helber et al., 2013). Deviations are expected because the glider data set encompasses a much shorter time period and thus are potentially biased by synoptic ocean features.
The observed and modeled unconstrained cross-correlations are much more depth limited. In the constrained cross-correlations, the off-diagonal values are relatively large indicating strong cross-correlations between wide ranges of depth. In the unconstrained band, however, the off-diagonal values are much smaller indicating variations are much more localized vertically. While generally similar, the observation-and model-based cross-correlations of the unconstrained variability show some differences. The observations show a negative correlation between 20 m depth and depths deeper than 100 m. The model results indicate the negative correlation occurs over a broader depth range with 20-100 m having negative correlation with depths below 200 m.
Substantive differences between the constrained and unconstrained depth-depth temperature cross-correlations are observed. This is consistent with a known transition of vertical structure from larger to smaller scales (e.g., McWilliams, 2016). This is further highlighted by the standard deviations along the diagonal of each cross-covariance matrix (Figure 4e). The constrained and unconstrained bands show larger differences near the surface. The constrained band contains more energy in this vertical region, presumably due to diurnal heating effects that would be large-scale. However, the subsurface profiles differ less, with the unconstrained band only showing a slight shift of higher energy toward shallower depths. Below approximately 50 m, the standard deviation depth profiles look very similar in the constrained and unconstrained bands. The model unconstrained energy is higher than the observed in the 50-100 m depth range. The model vertical resolution in this depth range is between 5 and 10 m, suggesting that model resolution is not a limiting factor in this band. Rather, it may instead be associated with suboptimal mixed layer parameterizations, though, a definitive conclusion is beyond the scope of this work.
The same depth-depth cross-correlations were calculated for salinity ( Figure S2 in Supporting Information S1). In the constrained band, the salinity cross-correlations have a three-layer structure: one layer from the surface to approximately 75 m, another layer from approximately 100-175 m, and a final layer from approximately 200 m to at least 500 m. This three-layer structure is observed in both the glider and model data, though, the third layer has stronger off-diagonal correlation in the glider data than the model data. There is also a low correlation (∼0) band between 25 and 100 m that extends from approximately 200 m to at least 500 m. Finally, this band of low cross-correlation is slightly broader in the model data than in the glider data. Overall, like with temperature, the constrained glider and model salinity cross-correlations compare favorably.
In the unconstrained band, the salinity cross-correlations have a two-layer structure: the first is very shallow near the surface between 0 and 25 m and the second extends from approximately 100 m to at least 500 m. Though, the latter has weak off-diagonal magnitude; this effect also having been observed in the temperature unconstrained band. There is also a band of low correlation (∼0) between 0 and 100 m that extends from approximately 100 m to at least 500 m. This feature is much more evident in the model data. Overall, like in the constrained band, the glider and model unconstrained results compare favorably.
Differences between the constrained and unconstrained salinity cross-correlation profiles are highlighted in Figure S2e in Supporting Information S1. As shown for temperature (Figure 4e), the standard deviation along the diagonal of the cross-covariance matrix was calculated and plotted with depth. Like with temperature, the salinity standard deviations for the constrained band are surface-enhanced with the glider data having greater energy than the model data. In the unconstrained band, the energy is still surface-enhanced, but less so than in the constrained band, and in this case, the model data have more energy than the glider data. At approximately 100 m, the constrained and unconstrained bands have similar magnitudes to at least 500 m. Overall, these results are similar to temperature: both the constrained and unconstrained bands are surface-enhanced, with lower energy in the unconstrained bands, and the constrained and unconstrained bands have similar energy at depth.

Conclusions
We have demonstrated a unique technique to derive model skill as a function of wavelength using dense glider data. By deriving the model constrained scale in the distance series frame, we were able to develop a filter that allowed us to partition temperature and salinity depth-depth cross-correlations into two different horizontal regimes. For temperature, the constrained band has a two-layer structure: within and below the mixed layer. In the unconstrained band, only one layer is observed, with weak off-diagonal elements with depth, suggesting that the smaller scale vertical structure has weak correlation over many depth layers. In both the constrained and unconstrained bands, the cross-correlations and standard deviations are surface-enhanced and then decay with depth. For salinity, a three-layer structure in the constrained band is observed. The first is shallow, likely approximating the mixed layer. The second is confined to approximately 100 and 175 m. The third and final layer begins at approximately 200 m and extends to at least 500 m. In the unconstrained band, a two-layer structure is observed and, like for temperature, the off-diagonal elements at depth have low magnitude. For both temperature and salinity, substantial differences between the constrained and unconstrained cross-correlations are observed, particularly in the near surface region. However, glider-and model-derived cross-correlations in each band were found to be similar. Two-step, multi-scale 3DVAR assimilation methods currently assimilate congruent vertical physics in each horizontal scale regime. In the future, we can use scale-dependent depth-depth cross-correlations to prescribe disparate vertical physics in each assimilation step. Future work will explore the utility of such an approach using an Observing System Simulation Experiment (OSSE). Similarity between the observation-and model-derived depth-depth cross-correlations suggest that model simulations can be used to derive the vertical cross-correlations in a Eulerian frame of reference. For the multi-scale 3DVAR implementation of the scale-dependent vertical physics, this change of reference from distance series to Eulerian is likely important as we observe that the scale-dependent vertical cross-correlations as derived in the distance-depth frame differ slightly from those derived in a Eulerian frame (results not shown). Therefore, when applying the scale-dependent cross-correlations to the multi-scale 3DVAR system, we advise against trying to directly apply the distance series specific cross-correlations shown in this manuscript. Instead, in the controlled OSSE experiments, we can filter the "truth" in a Eulerian frame to correspond to the scales used in the multi-scale assimilation. The first, large-scale step of the multi-scale will use the 320 km quarter power point of the SOAR function used in the assimilation, and we can use this value to partition the "truth" into large-and small-scale depth-depth cross-correlations. The Eulerian scale-dependent vertical cross-correlations can then be used in each independent step of the multi-scale analysis. The results of this multi-scale, multi-physics 3DVAR run can be compared to the current multi-scale 3DVAR approach as described and validated by Souopgui et al. (2020). This paper provides a first estimate of what those scale-dependent vertical physics should qualitatively look like, based on the glider observations collected during the S-MODE pilot experiment.