Diagnosing the Thickness‐Weighted Averaged Eddy‐Mean Flow Interaction From an Eddying North Atlantic Ensemble: The Eliassen‐Palm Flux

The thickness‐weighted average (TWA) framework, which treats the residual‐mean flow as the prognostic variable, provides a clear theoretical formulation of the eddy feedback onto the residual‐mean flow. The averaging operator involved in the TWA framework, although in theory being an ensemble mean, in practice has often been approximated by a temporal mean. Here, we analyze an ensemble of North Atlantic simulations at mesoscale‐permitting resolution (1/12°). We therefore recognize means and eddies in terms of ensemble means and fluctuations about those means. The ensemble dimension being orthogonal to the temporal and spatial dimensions negates the necessity for an arbitrary temporal or spatial scale in defining the eddies. Eddy‐mean flow feedbacks are encapsulated in the Eliassen‐Palm (E‐P) flux tensor and its convergence indicates that eddy momentum fluxes dominate in the separated Gulf Stream. The eddies can be interpreted to contribute to the zonal meandering of the Gulf Stream and a northward migration of it in the meridional direction. Downstream of the separated Gulf Stream in the North Atlantic Current region, the interfacial form stress convergence becomes leading order in the E‐P flux convergence.


Model Description
We use the model outputs from the realistic runs described in Jamet et al. (2019b), Jamet et al. (2020) and Uchida, Jamet, et al. (2021), which are an air-sea partially coupled, 48-member ensemble of the North Atlantic ocean at mesoscale-permitting resolution (1/12°; or sometimes referred to as "eddy rich") using the hydrostatic configuration of the Massachusetts Institute of Technology general circulation model (MITgcm; J. Marshall et al., 1997). We have 46 vertical levels increasing from 6 m near the surface to 250 m at depth. Harmonic, biharmonic horizontal and vertical viscosity values of h2 = 20 m 2 s −1 , h4 = 10 10 m 4 s −1 and v = 10 −5 m 2 s −1 were used respectively. For completeness, we provide a brief summary of the configuration below. Figure 1 shows the bathymetry of the modeled domain extending from 20°S to 55°N. In order to save computational time and memory allocation, the North Atlantic basin was configured to zonally wrap around periodically. Open boundary conditions are applied at the north and south boundaries of our domain and Strait of Gibraltar, such that oceanic velocities (u) and potential temperature and practical salinity (Θ , S) are restored with a 36 min relaxation time scale toward a state derived by an ocean-only global Nucleus for European Modeling of the Ocean (NEMO) simulation (Molines et al., 2014, ORCA12. L46-MJM88 run in their paper, hereon referred to as ORCA12). The open boundary conditions are prescribed every 5 days from the ORCA12 run and linearly interpolated in between. A sponge layer is further applied to two adjacent grid points from the open boundaries where model variables are restored toward boundary conditions with a 1-day relaxation time scale. In total, relaxation is applied along three grid points from the boundaries with it being the strongest at the boundary along with radiation conditions at the northern/southern most boundary. Although relatively short, no adverse effects were apparent upon inspection in response to these relaxation time scales; for example, changes in the open boundary conditions were seen to induce a physically consistent Atlantic Meridional Overturning Circulation response inside the domain (Jamet et al., 2020).
The 48-member ensemble was constructed as follows: 48 oceanic states separated by 48 hr each were taken during an initial 96-day-long integration beginning 14 November 1962. Simulations initialized with these states were then run under yearly repeating 1963 atmospheric and boundary conditions for a year, that is, the atmospheric state and boundary conditions are cyclic for this year. After the 1 year of integration from the 48 states, the last time step from each simulation was taken as the initial condition for the ensuing ensemble members; each spun-up initial oceanic state is physically consistent with the atmospheric and boundary conditions of 1 January 1963 (details are given in Jamet et al., 2020). At the surface, the ocean is partially coupled to an atmospheric boundary layer model (CheapAML; Deremble et al., 2013). In CheapAML, atmospheric surface temperature and relative humidity respond to ocean surface structures by exchanges of heat and humidity computed according to the Coupled Ocean-Atmosphere Response Experiment (COARE3; Fairall et al., 2003) flux formula, but are strongly restored toward prescribed values over land; there are no zonally propagating signals of climate teleconnection. The prescribed atmospheric state is taken from the Drakkar forcing set and boundary forcing from the ORCA12 run (details are given in Jamet et al., 2019a). The ensemble members are integrated forward in time for 5 years (1963)(1964)(1965)(1966)(1967), and exposed to the same prescribed atmospheric state above the boundary layer and relaxation at the north/south boundaries across all ensemble members. (Note that the forcing and relaxation are no longer cyclic after the 1-year spin-up phase.) During this interval, the oceanic state and the atmospheric boundary layer temperature and humidity evolve in time. In the following, we interpret the ensemble mean as the ocean response to the atmospheric state prescribed above the atmospheric boundary layer as well as the oceanic conditions imposed at the open boundaries of the regional domain, while the ensemble spread is attributed to intrinsic ocean dynamics that develop at mesoscale-permitting resolution (Jamet et al., 2019b;Leroux et al., 2018;Sérazin et al., 2017).
The model outputs were saved as 5-day averages. In the context of mesoscale dynamics, which is the focus of this study, some temporal averaging is appropriate in order to filter out temporal scales shorter than the mesoscale eddies themselves. From a probabilistic perspective, the 5-day averaging results in more Gaussian-like eddy statistics (based on the central-limit theorem). From a dynamical point of view, this does not allow us to close the residual-mean and eddy budgets (cf. Stanley, 2018, Section 4.4). Nevertheless, the ensemble dimension of our data set provides an unique opportunity to examine the TWA eddy-mean flow interaction. In the following analysis, we exclude the northern and southern extent of 5° from our analysis to avoid effects from the open boundary conditions and sponge layer ( Figure 1) and to maximize the signal of intrinsic variability amongst the ensemble members. We also use the last year of output (1967) for the same reasons.

Theory and Implementation of Thickness-Weighted Averaging
The ocean is a stratified fluid, and the circulation and advection of tracers tend to align themselves along the stratified density surfaces. Hence, a natural way to understand the circulation is to consider the variables in a buoyancy framework and the residual-mean flow rather than the Eulerian mean flow. We leave the detailed derivation of the TWA framework to Young (2012, and references therein) and here, only provide a brief summary; the primitive equations in geopotential coordinates are first transformed to buoyancy coordinates upon which a thickness weighting and ensemble averaging along constant buoyancy surfaces are applied to obtain the TWA governing equations. Following the notation by Young (2012) and Ringler et al. (2017), the TWA horizontal momentum equations in the buoyancy coordinate system (̃̃̃̃) are:̃+̂̂̃+̂̂̃+̂̂̃−̂+̃= where (⋅) and ( ⋅) def = −1 (⋅) are the ensemble averaged and TWA variables respectively, (=̃) the specific thickness and ζ the depth of an iso-surface of buoyancy. The subscripts denote partial derivatives. The Montgomery potential is =̆−̃ where ̆ is the dynamically active part of hydrostatic pressure (the meaning of ( ⋅) will become clearer later). is the dia-surface velocity across buoyancy contours, which we detail below for a realistic equation of state (EOS) for density. The vectors 1 = +̃ and 2 = +̃ form the basis vectors spanning the buoyancy horizontal space where i, j and k are the Cartesian geopotential unit vectors, and E is the E-P flux tensor described in detail in Section 4.1. Although each ensemble member has an individual basis ( 1, 2) , the E-P flux divergence yields no cross terms upon averaging as the TWA operator commutes with the divergence of E (for mathematical details, see Section 3.4 in Maddison & Marshall, 2013); this allows for the tensor expression in Equations 1 and 2.
 and  are the viscous and forcing terms.

Journal of Advances in Modeling Earth Systems
UCHIDA ET AL.
10.1029/2021MS002866 5 of 25 One subtle yet important point involves the buoyancy coordinate (̃) for a realistic, non-linear EOS (Jackett & McDougall, 1995). The analysis in Young (2012) implicitly assumes a linear EOS. With a realistic EOS, defining the vertical coordinate using potential density introduces errors. However, what constitutes a better buoyancy variable is the subject of some debate (e.g., de Szoeke & Springer, 2009;Jackett & McDougall, 1997;Klocker et al., 2009;Lang et al., 2020;McDougall & Jackett, 2005;Tailleux, 2016). Although other choices are possible, we argue for the use of in-situ density anomaly ( def = −̆( ) where ρ is the in-situ density and is a function of only depth; Montgomery, 1937;Stanley, 2018Stanley, , 2019. With in-situ density anomaly, buoyancy can be defined as: where ρ 0 = 999.8 kg m −3 the Boussinesq reference density prescribed in MITgcm.
∼ is used to denote a thermodynamic function and ̃ denotes the buoyancy at a point in space-time. The question becomes how to choose ( ) so that monotonicity is maintained ( . if the stratification is statically stable). The vertical derivative of the in-situ density anomaly can be decomposed as: where Φ = − 0 is the dynamically non-active part of hydrostatic pressure, and c s is the sound speed. We remind the reader that a Boussinesq fluid is not strictly incompressible and a finite sound speed can be diagnosed (Olbers et al., 2012;Vallis, 2017). For simplicity, we can write ̆d ef = − 0  −2 where  =  ( ) is a function of only depth, which yields: Denoting  = + Δ where −1 Δ ≪ 1 , the right-hand side (RHS) of Equation 5 becomes: Hence, so long as  ≳ , monotonicity is assured while removing a large portion of compressibility, that is, the iso-surfaces of ∼ become close to neutral surfaces. In practice, we chose  to be larger than the maximum sound speed at each depth by 10 −5 m s −1 over the entire ensemble in order to avoid a singularity (viz. [ ∼ ] Θ, = 0 ). With  determined, integrating for gives:̆= which reduces to ̆ | =0 = 0 . The buoyancy equation using the in-situ density anomaly becomes: where  def = ∼ ΘΘ + ∼̇ , and Θ and ̇ are the net diabatic contributions on potential temperature and practical salinity respectively, which we approximate by diagnosing off-line the sum of harmonic and biharmonic diffusion below the mixed layer using the 5-day averaged outputs of Θ and . We summarize the RHS of (9) as the dia-surface velocity A further requirement of the TWA framework is that the pressure anomaly defined by such buoyancy coordinate translates into a body force in the buoyancy coordinate where the subscript (⋅) h represents the horizontal gradient and ∇ h = (̃,̃) . Using in-situ buoyancy anomaly, the pressure anomaly becomes:̆( The ( ⋅) is used to denote that the pressure anomaly is defined by the in-situ buoyancy anomaly. The pressure anomaly for a Boussinesq hydrostatic fluid, on the other hand, is: Since is only a function of depth, the horizontal gradient of the two remain identical and Equation 10 holds (We note that Equation 10 does not hold for pressure anomaly defined by potential density when the EOS is non-linear, and while more elaborate techniques may improve the neutrality of , the relation to the dynamics is non-trivial for other density variables such as neutral and orthobaric densities.). The use of in-situ density anomaly to define the buoyancy coordinate maintains the desirable properties of a unique, statically stable vertical coordinate and a simple hydrostatic balance ( =̃= −̃̃) while removing roughly 99% of the effect of compressibility basin wide at each depth . For a non-linear EOS, a material conservation of potential vorticity (PV) and non-acceleration conditions do not exist (cf. Vallis, 2017, Chapter 4). Discussion regarding the energetics are given in Appendix A.
The raw simulation outputs were in geopotential coordinates so we first remapped all of the variables in Equations 1 and 2 onto 55 buoyancy levels spread across the range of ̃∈ (−0.196 where 0 = 20 kg m −3 , = 9.2 kg m −3 , and ∈ [ 0, 2 ) in order to account for the abyssal weak stratification): using the fastjmd95 Python package to compute the in-situ density and its partial derivatives (Abernathey, 2020), and the xgcm Python package Jones et al., 2020) which allows for coordinate remapping consistent with the finite-volume discretization of MITgcm. The horizontal velocity vector becomes + ⟼ 1 + 2 . For the horizontal pressure anomaly gradient, we re-computed the pressure anomaly using the 5-day averaged outputs and have invoked the identity (10). In the case where the buoyancy contour outcrops for some members, we treat it by making the layer thickness vanish (Δ = 0 ) and carry on with our TWA analysis. This is consistent with the boundary treatment of Young (2012) where he notes that buoyancy contours intersecting the boundary to be continued just beneath the surface.

Results
We start by showing the time series of domain-averaged horizontal kinetic energy (KE) and potential temperature, and an arbitrary buoyancy iso-surface ( Figure 2). Figure 2a shows the simulation has a prominent seasonal cycle with the KE and temperature both peaking in summer. In Figure 2, we also show the residual-mean fields on 3 January 1967, the first day of the year of output we analyze. The depth of the buoyancy level shown in Figure 2c is below the ensemble-mean mixed-layer depth (MLD; Figure 2b) basin wide where diabatic effects are small, but is shallow enough to capture the imprint of the Gulf Stream; the iso-surface shoals drastically across the latitude of ∼38°N where the separated Gulf Stream is situated ( Figure 2d). The ensemble-mean MLD was computed as the depth at which the potential density computed from ensemble-mean temperature and salinity fields increased

The Eliassen-Palm Flux
The E-P flux tensor (E) in the TWA framework (Equations 1 and 2) is: where (⋅) ′′ = (⋅) −(⋅) and (⋅) ′ = (⋅) − (⋅) are the residual from the thickness-weighted and ensemble averages respectively (Aoki, 2014;Maddison & Marshall, 2013;Ringler et al., 2017). The two are related via the (eddy-induced) bolus velocity (Greatbatch, 1998;McDougall & McIntosh, 2001): We show each term in Equation 14 in Figure 3. The eddy momentum flux ′′ ′′ is often associated with barotropic processes in analogy to atmospheric jets ( Figure 3a; Aoki et al., 2016;Chan et al., 2007;Jamet et al., 2021;Vallis, 2017, Chapter 15). The zonal and meridional eddy momentum flux (′ ′ 2 ,′ ′ 2 ) exchange momentum between the eddies and mean flow, that is, to accelerate or decelerate the Gulf Stream as they affect the horizontal shear upon taking their gradients. The term due to the vertical displacement of buoyancy layer is related to the eddy potential energy (EPE; cf. Equations A15-A17). The interfacial form stress ( ′∇ h ′ ; Figures 3e and 3f) often associated with baroclinic instability is "deceivingly" orders of magnitude smaller than the other terms. However, it is the divergence of the E-P flux and not the flux itself that goes into the momentum equations, and the horizontal (∇ h ) and vertical gradient (̃) differ by roughly O (10 6 ). The contribution from the diabatic and compressibility effects (i.e., the terms with ϖ) were smaller than the interfacial form stress by another order of magnitude or more in the subtropics (not shown). It is quite surprising that the signals in the equatorial undercurrent region, although having relatively high KE (Figure 2d), are significantly smaller than in the Gulf Stream and North Brazil Current regions, virtually not visible in Figure 3. This implies that the mean flow dominates over the eddies in the equatorial region.
Writing out the E-P flux divergence in Equations 1 and 2 gives: As the signal in the North Atlantic basin is the largest in the separated Gulf Stream region (Figure 3), we show each term in the E-P flux divergence north of 25°N ( Figure 4). The large signal is consistent with Jamet et al. (2021) where they found the subtropical gyre to be a Fofonoff-like inertial circulation (Fofonoff, 1981), and that the separated jet was where the energy input to the gyre from surface winds was predominantly lost to becomes larger than the convergence of the eddy momentum flux terms due to cross correlation in the zonal and meridional momentum ( 10 , 01 ) , which are the smallest amongst the three terms in the E-P flux convergence (Figures 4b and 4c). The contribution from the terms with dia-surface velocity (ϖ′′) was roughly two-orders of magnitude smaller than the other terms in    (Figures 4a and 4e). The repeating positive and negative features further downstream are roughly on the scales of the Rossby deformation radius, consistent with  where they diagnosed the E-P flux convergence from a 101-member quasi-geostrophic (QG) double-gyre ensemble. In the meridional direction, the eddy momentum flux and potential energy convergence also tend to smooth out the Gulf Stream (decelerate the jet in the subpolar gyre by injecting northward momentum, and southward momentum in the subtropical gyre) while the interfacial form stress convergence tends to sharpen it (Figures 4d and 4f). The similar order of magnitude between 00 , 11 and 20 , 21 is in contrast, however, from a fully developed QG jet within a wind-driven double-gyre circulation where the interfacial form stress convergence dominated the E-P flux convergence . While this does not provide as proof that the Gulf Stream in primitive equation models deviates from quasi geostrophy, the disagreement is consistent with previous studies arguing that western boundary currents, which are on the order of O (100 km) in the across-jet direction but O (1000 km) in the along-jet direction, may not be well approximated by QG dynamics, which is isotropic in its formulation (Grooms et al., 2011;Jamet et al., 2021). Further examinations, however, are required to quantify the level of deviation.
We now examine further details in the separated Gulf Stream, a region where eddies have been shown to modulate the mean flow structure (e.g. Chassignet & Xu, 2021;Cronin, 1996), as seasonal means in order to capture representative features. Winter is defined as the months of January, February, March, and summer as July, August, September. Upon separation, the zonal E-P flux convergence tends to decelerate the Gulf Stream. The repeating features of positive and negative values for the zonal component of the E-P flux convergence persist and are likely associated to the jet meandering (Figures 5a and 5c). In the meridional direction, we again see positive values on the northern flank of the separated Gulf Stream and negative on its southern flank (Figures 5b and 5d). This north-south dipole feature is likely associated with the gradient of the eddy energy, and may be trivial as the energy naturally maximizes near the center of the jet. The zonal and meridional component of the E-P flux convergence can jointly be interpreted to force the Gulf Stream to migrate northwards (decelerate the jet north-  Figure 6 where the sign convention in Equation 17 is reversed from ours for the eddy forcing term and their units are in [m s −1 day −1 ]) where they diagnosed an idealized zonally re-entrant jet. It is interesting to note, however, that the vertical structure of the E-P flux convergence is much smoother and barotropic during the summer with a consistent deceleration of the jet on its northern flank and acceleration on its southern flank (Figures 5g, 6e, and 6f). We note that such seasonal features may be specific to the year of 1967, and the temporal evolution of the E-P flux convergence should be addressed in a dedicated study. We leave this for further work, focusing here on the TWA implementation for a realistic model.
In Figure 6, we show the vertical profile of the seasonal E-P flux convergence along with each component in Equations 17 and 20 area averaged over the zonal extent of 290°E−305°E. The E-P flux convergence closely follows that of the interfacial form stress convergence (i.e., baroclinic instability) with the Reynolds stress due to cross correlation between the zonal and meridional eddy momentum ( 10 , 01 ; orange lines) taking the smallest magnitude. The amplitude of interfacial form stress convergence is larger near the surface (viz. larger buoyancy values), which is expected from the seasonal surface forcing affecting the isopycnal tilt and hence baroclinicity of the surface flow. The meridional smoothing of the separated Gulf Stream is also apparent from the vertical profiles with the meridional E-P

The Ergodic Assumption
In this section, we replace the averaging operator with the temporal mean of the 50 years of output ( from a single arbitrary realization (realization 00 to be specific) to examine the ergodic assumption and compare with our TWA results. Realization area averaging is separated area averaging is separated 00 was taken from a 24-member ensemble originally designed for a different study (Jamet et al., 2019b). The 48 members discussed above were constructed by adding 24 members to the first 5 years of this data set. The TWA operator now becomes ( ⋅) def = −1 (⋅) and eddies (⋅) ′′ def = (⋅) −(⋅) . The maximum sound speed per depth ( ) was recomputed for the 50 years of realization 00 in remapping the coordinate system. Although the averaging operator is now along the time dimension, we note that this is different from the Temporal-Residual Mean (TRM) framework developed by McDougall and McIntosh (2001) in the sense that we proceed with our analysis  in buoyancy coordinate. The hope of applying the ergodic assumption to a temporally varying system, as we have shown in previous sections, is that for a sufficiently long time series, such sub-and inter-annual variability will cancel out with only the stationary feature being extracted in the "mean" flow.
In Figure 7, we show the climatological E-P flux convergence from realization area averaging is separated area averaging is separated 00. In other words, all time scales shorter than 50 years are now relegated to the eddies. While having similar spatial structures to Figures 4 and 5a-5d, they are more spread out with less detail. In particular, the seasonality is obscured by the climatological mean of 50 years and becomes similar to the summertime of the 48-member ensemble (Figures 5c and 5d). In other words, the wintertime signal seen with the ensemble diagnostics (Figures 5a and 5b) are not well captured by the climatological E-P fluxes convergence. This could either suggest that such signal are peculiar to the year 1967 we analyzed with our 48-member ensemble,  (Figures 5e-5h, and 8) where the eddies tend to zonally decelerate the separated Gulf Stream on its northern flank and accelerate it on its southern flank (Figure 8a). In the meridional direction, the eddies tend to decelerate the subpolar gyre on the northern flank of the separated Gulf Stream and the subtropical gyre on its southern flank (Figure 8b).
Taking the climatological time mean of 50 years of output is perhaps the most conservative definition of the mean flow under ergodicity. We, therefore, now loosen the temporal averaging to a climatological annual cycle in defining the residual mean flow. In doing so, we chunk the 50 years into 50 annual segments and take their average to produce a single segment of ∼365 days. Namely, we treat each year as an individual realization of the ocean, generating a pseudo 50-member year-long ensemble (hereon pseudo-ensemble for short). The eddies are now defined as fluctuations about this climatological annual cycle. In Figure 9, we show the MKE on a buoyancy level on January 3 with similar depths diagnosed from the ensemble and pseudo-ensemble. While the maximum MKE amplitudes are similar, the mean flow is more spread out in the pseudo-ensemble. This likely comes from the different paths the Gulf Stream takes resulting as a response to different yearly atmospheric states, which get averaged all together. In other words, while the degrees of freedom are similar between the ensemble (48 members) and pseudo-ensemble (50 members assuming a decorrelation time scale of a year), the ensemble mean captures the oceanic response to the atmospheric state specific to 1967. The pseudo-ensemble, on the other hand, implies that 50 years are not sufficient for the "eddies" to emerge as a coherent signal upon averaging for a climatological annual cycle and the mean flow incorporates the signal of atmospheric interannual, decadal and low-frequency variability.
The imprint of fluctuations from each year onto the MKE domain averaged over the depths of ∼50-500 m (̃∈ (−0.25, −0.26) ) result in its seasonality to differ from the ensemble mean; the pseudo-ensemble takes its maximum around March while the ensemble around August (black solid and dashed lines in Figure 9c respectively). However, the seasonality in the area averaged MKE from the pseudo-ensemble on ̃= −0.26 m s −2 shows a summertime maximum (black dot-dashed line in Figure 9c). This implies that the discrepancy between K # and K #t results from the surface ocean being sensitive to the atmospheric state while being less so in the interior. Indeed, the domain averaged eddy KE (EKE; see Appendix A for definition) diagnosed from the ensemble shows a maximum during winter when the surface ocean is more susceptible to baroclinic instability due to atmospheric cooling (red line in Figure 9c; Uchida et al., 2017). We conclude that in the process of creating a climatological annual cycle, we convolute the oceanic response to different atmospheric states (i.e., interannual variability) and contaminate the eddy-mean flow decomposition. The oceanic mean flow conflated with atmospheric variability also imprints itself onto the E-P flux convergence for the climatological winter and summer as we show in

Discussion and Summary
By running a 48-member ensemble run of the North Atlantic Ocean at mesoscale-permitting resolution (1/12°) partially coupled to the atmosphere, we have shown that the TWA framework can be employed successfully in diagnosing eddy-mean flow interactions in a realistic ocean simulation. In doing so, we have introduced a new buoyancy variable for a realistic EOS, which is approximately neutral and dynamically consistent; both characteristics are necessary for the TWA analysis (Stanley, 2018). The ensemble approach negates the necessity for any temporal averaging in defining the residual-mean flow; we are able to exclude any temporal variability, such as seasonal and interannual fluctuations, from the eddy term and extract the intrinsic variability of the ocean. We show that the Eliassen-Palm (E-P) flux convergence (i.e., negative divergence), which encapsulates the eddy feedback onto the mean flow (Maddison & Marshall, 2013), tends to accelerate the Gulf Stream northwards on its northern flank and decelerate it on its southern flank (− 2 ⋅ (∇ ⋅ ) < 0 ; Figures 5b, 5d, 5f, and 5h); that is, the eddies can be interpreted to force the Gulf Stream to migrate northwards on 3 January 1967. However, a more detailed examination of the mechanism of poleward jet migration will likely necessitate studies using idealized simulations where each dynamical mechanism is easier to parse out (cf. Chemke & Kaspi, 2015). Modeling studies with varying spatial resolution have shown that the Gulf Stream tends to overshoot northwards and the North Atlantic Current (NAC) flows too zonally in coarse resolution models (e.g., Chassignet & Xu, 2017Lévy et al., 2010). The overshooting may partially be attributable to eddy feedback being insufficiently resolved at mesoscale-permitting resolutions, in addition to unresolved submesoscale boundary layer processes (e.g., Renault et al., 2016). In particular, it would be interesting to see whether further increasing the model resolution would increase the amplitude of baroclinic instability near the surface ( 20 , 21 ) and convergence of eddy momentum flux and potential energy in the interior ( 00 , 11 ) , which tend to accelerate the jet southward in the subpolar gyre and decelerate it southward in the subtropical gyre upon the Gulf Stream separation west of 290°E (i.e., shift the jet southwards) as we see from their annual means ( Figure 11). The same could be said for a better representation of the NAC path where the eddies in our model tend to flux northward momentum into the mean flow and hence allow for its north-eastward turn near the continental rise of the Grand Banks (Figures 4 and 5). Although it is beyond the scope of this study, the significance of baroclinic processes will likely increase with resolution as mixed-layer instability becomes better resolved (Boccaletti et al., 2007;Capet et al., 2008aCapet et al., , 2008bSu et al., 2018;Uchida et al., 2019;Yang et al., 2021).
We have also examined the often assumed ergodicity in decomposing the eddy and mean flow by replacing the averaging operator with a 50-year time mean for a single realization within the ensemble. To some extent, the agreement between Figures 4, 5, 7, and 11 implies that the ensemble size of 48 is able to extract the eddy signals that emerge at mesoscale-permitting resolution. The difference between the ensemble and 50-year climatology of an arbitrary realization amongst the ensemble (realization 00), on the other hand, likely comes from seasonal, interannual and decadal variability, and transient eddies, which are obscured in the climatological view. Loosening the time mean to a climatological annual cycle for the mean flow, on the other hand, convolutes the oceanic response to interannual variability in the atmospheric forcing and contaminates the eddy-mean flow decomposition ( Figure 9). This is consistent with Aiki and Richards (2008) where they found the energy stored in the mean and eddy flow to change depending on the duration of the temporal averaging applied. While it is not our intention to claim whether defining the mean flow via a time mean is appropriate or not for realistic simulations, our results imply that one should be mindful of what goes into defining the mean flow and consequently the eddies.
Lastly, ensemble modeling has shown us that a small perturbation such as eddies to the non-linear system can lead to very different states of the ocean and climate (e.g., Bessières et al., 2017;Fedele et al., 2021;Jamet et al., 2019b;Lorenz, 1963;Maher et al., 2019;. In light of this, we argue that it is important to consider the full spatiotemporal variability of the ocean. The ensemble framework allows one to capture the space-time varying eddy-mean flow interaction and not just its climatological state.

Appendix A: Energetics Under a Non-Linear Equation of State
In this Appendix, we derive the energetics in a similar manner to Aiki et al. (2016) but in a framework consistent with the ensemble formalism and a realistic equation of state (EOS). The thickness-weighted average (TWA) residual-mean horizontal momentum equation in geopotential coordinates neglecting dissipation is (Ringler et al., 2017;Young, 2012): where # def =̂+̂+ # and # def = (̃̃̃# ( ) ) + # are the residual-mean velocity and hydrostatic pressure anomaly. It is important to keep in mind that the " " here is the ensemble averaged depth of an iso-surface of buoyancy, viz. = . The residual-mean kinetic energy (MKE; # = |̂| 2 ∕2 ) budget becomes: We can now define the dynamic enthalpy for the mean state in a similar manner to McDougall (2003) and Young (2010): where Φ # = Φ 0 − gz is the dynamically non-active part of the hydrostatic pressure to be consistent with the Boussinesq approximation. Note that h # is not a function of the TWA temperature and salinity (Θ ,̂) due to non-linearities in the EOS, that is, While there exist a temperature and salinity variable to evaluate the material derivative of # since an EOS exists for # , it is unclear whether they can be analytically expressed for a non-linear EOS. We, therefore, express the material derivative of # as: where  # carries the net sum of the diabatic and non-linear effects. Thus, the residual-mean total energy equation becomes: where we have invoked ∇ ⋅ v # = 0.

Journal of Advances in Modeling Earth Systems
UCHIDA ET AL.

10.1029/2021MS002866
20 of 25 On the other hand, the total KE budget remapped onto buoyancy coordinate is: is the three-dimensional divergence. Unlike the residual-mean dynamic enthalpy, the definition of the total dynamic enthalpy is straight forward (Young, 2010): yielding: where  def = ℎΘ Θ + ℎ . Terms due to non-linearity in the EOS do not emerge in the definition of  as Equation A8 is not averaged. Ensemble averaging after thickness weighting Equation A8 gives: The total KE can be expanded as: so plugging in Equation A10, and keeping in mind that ( ⋅) =(⋅) and (⋅) ′′ = 0 , each term on the left-hand side (LHS) of Equation A9 can be written as: where  is the eddy kinetic energy (EKE), and def =′ ′  1 +′ ′  2 +′ ′  3 , def =′ ′ 2 1 +′ ′ ′′ 2 +′ ′ ′′ 3 , def =′ ′ ′′ 1 +′ ′ 2 2 +′ ′ ′′ 3 are the eddy fluxes of kinetic energy, eddy zonal and meridional velocities respectively, and where ℎ def =′ ′ ℎ ′′ 1 +′ ′ ℎ ′′ 2 +′ ′ ℎ ′′ 3 is the eddy flux of fluctuations in dynamic enthalpy, and we have used the relation = Subtracting Equations A5 from A13 yields the eddy energy budget: Equations A5 and A14 are the relations derived by Aoki (2014) but for a non-linear EOS and non-zero dia-surface velocity where the residual-mean flow and eddies exchange energy via the E-P flux divergence and residual vertical buoyancy flux due to non-linearities in the EOS. It is perhaps interesting to note that h′′ is not the eddy potential energy (EPE;  def =ĥ − ℎ # in Equation A14) and they are related to one another as ′′ = − ( # + ) .
For a linear EOS, the EPE can be rewritten as: . Equation A15 provides the physical intuition of EPE being defined as the difference between potential energy at the TWA depth (̂) and ensemble-mean depth ( ) . In a similar manner, we can also derive: and hence, ℎ ′′ = − . Assuming the background buoyancy frequency can be defined as the inverse of ensemble-mean thickness (viz. −1 ∼ 2 ) leads to further manipulation of EPE: where the last term in Equation A17 further reduces to the available potential energy under quasi-geostrophic approximation ( ′ ∼ 2 ′ ). The first-term on the RHS of Equation A17 vanishes upon volume integration pending on boundary conditions (i.e., rigid lid and a flat bottom).

Appendix B: Kinematics of Discretization
As in Figure B1, imagine u 1 and u 2 are on the same buoyancy contour. The relation between the two is: 2 ≈ 1 + Δ + Δ (B1) Figure B1. Schematic of discretized gradients.

Journal of Advances in Modeling Earth Systems
UCHIDA ET AL.
so once all of the variables are remapped onto the buoyancy coordinate from geopotential, the discretized horizontal gradients can be taken along the original Cartesian grid. The gradients on the model outputs were taken using the xgcm Python package . In order to minimize the computational cost, we took the ensemble mean first whenever possible, for example, =̃=̃ , ∇ h =̃∇h etc. The gradient operators commuting with the ensemble mean is also the case for the perturbations, that is,∇ Hence, ∇ h ′ = (∇ h ) ′ (cf. Maddison & Marshall, 2013, Section 2.3 in their paper).