Nonlinear inviscid damping and shear-buoyancy instability in the two-dimensional Boussinesq equations

We investigate the long-time properties of the two-dimensional inviscid Boussinesq equations near a stably stratified Couette flow, for an initial Gevrey perturbation of size $\varepsilon$. Under the classical Miles-Howard stability condition on the Richardson number, we prove that the system experiences a shear-buoyancy instability: the density variation and velocity undergo an $O(t^{-1/2})$ inviscid damping while the vorticity and density gradient grow as $O(t^{1/2})$. The result holds at least until the natural, nonlinear timescale $t \approx \varepsilon^{-2}$. Notice that the density behaves very differently from a passive scalar, as can be seen from the inviscid damping and slower gradient growth. The proof relies on several ingredients: (A) a suitable symmetrization that makes the linear terms amenable to energy methods and takes into account the classical Miles-Howard spectral stability condition; (B) a variation of the Fourier time-dependent energy method introduced for the inviscid, homogeneous Couette flow problem developed on a toy model adapted to the Boussinesq equations, i.e. tracking the potential nonlinear echo chains in the symmetrized variables despite the vorticity growth.


INTRODUCTION
This article is concerned with the long-time dynamics of a 2D incompressible and nonhomogeneous fluid under the Boussinesq approximation near a stably stratified Couette flow in the infinite periodic strip $\mathbb{T} \times \mathbb{R}$. The background density profile is taken to be affine, thus we study a simple equilibrium, with $g$ being the gravitational constant. The parameter $\beta$ is the Brunt-Väisälä frequency, which is the characteristic frequency of the oscillations of vertically displaced fluid parcels, and hence provides a measure of the strength of the buoyancy force [44,54]. We write the system (1.1) in vorticity-stream formulation as (1.2).
Density stratification is a common feature of geophysical flows; under appropriate averaging, most of the Earth's ocean is well-approximated as an incompressible, stably stratified fluid, so that its dynamics are well described by fluctuations around a mean background density profile which increases with depth (namely $\bar{\rho}' < 0$, usually referred to as a stable stratification profile [18,21,44]). The system (1.2) under investigation models a stably stratified fluid with the additional Boussinesq assumption, according to which density is assumed constant except when it directly causes buoyancy forces [24,43]. The Boussinesq system gained the interest of the mathematical community thanks to its wide range of applications, especially in oceanography [21,44], and many mathematical works have been dedicated to it [1, 15-17, 19, 20, 27, 29, 47, 57]. It also holds mathematical interest through a connection with the 3D axisymmetric Euler equations for homogeneous fluids [45], where the term multiplied by $\beta^2$ in (1.2) plays the role of the vortex stretching.
Perturbations of the equilibrium state in a stably stratified fluid induce two related mechanisms as consequences of gravity's restoring effect (Archimedes' principle) and the shearing transport of the equilibrium. The first one is a buoyancy force generated by the pressure gradient of the stable stratification as a response to gravity, which pushes the higher density fluid downwards. The second one is vorticity production due to the horizontal density gradient, which acts as a source term in the vorticity equation of (1.2). These two mechanisms are coupled even at the linear level, in such a way that their interplay may lead to an overall instability of the system [44]. Note that gravity's restoring effect also manifests itself as radiation of internal gravity waves, whose propagation is supported by stably stratified fluids as a remarkable feature: understanding the dynamics of internal waves is in fact of crucial importance to many geophysical applications [21,44,56]. The non-trivial underlying dynamics have been observed in laboratory experiments [13,40] and investigated in the physics literature [14,31,34].
In the case of the Couette flow, linear stability is ensured by the so-called Miles-Howard criterion [35,49], which requires the Richardson number $\mathrm{Ri} = \beta^2$ to be greater than $1/4$. Under this condition, precise quantitative estimates can be extracted from the linear dynamics. In 1975, Hartman [34] observed an enstrophy Lyapunov instability with a growth of $O(t^{1/2})$, despite the fact that the velocity field undergoes an $O(t^{-1/2})$ time decay. This phenomenon persists for more general, stably stratified fluids without the Boussinesq approximation, as shown by Case [14]. A decay of $O(t^{-1/2})$ for both the velocity and the density has been proved rigorously for the Couette flow in [61] and extended to shears near Couette in [11]. In addition, a vorticity and density gradient growth with rate $O(t^{1/2})$, which confirms the observation of [34], has been rigorously proved in [11]. Due to the nature and the origin of such growth, we will refer to it as a shear-buoyancy instability. It is worth pointing out that the enstrophy growth is in striking contrast with the 2D homogeneous and inviscid Couette flow, which is Lyapunov stable in the enstrophy norm for both the linear and nonlinear problem (in fact, the enstrophy of the perturbation is conserved in both).
The decay of the velocity field of the perturbation, called inviscid damping, is due to the mixing of vorticity and is a key dynamical property of shear flows and vortices. This was first noticed by Orr [52] and later studied by Case and Dikiȋ [14,26] for a 2D homogeneous fluid, where the velocity field decays as $O(t^{-1})$. In particular, inviscid damping occurs when the shear transfers enstrophy to high frequencies. This is a fundamental mechanism of inviscid fluids, intimately connected with the stability of coherent structures [55,62] and the theory of 2D turbulence [12]. Its first mathematically rigorous study in the full 2D homogeneous Euler equations was carried out in [8] for the Couette flow. It bears remarking that due to transient unmixing effects, the Couette flow is in fact Lyapunov unstable in the kinetic energy norm (a consequence of [8]).
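The transient unmixing and the subsequent decay can be seen in a minimal numerical sketch for the linearized homogeneous problem. It assumes the standard fact that, in the frame moving with the Couette flow, the vorticity Fourier coefficient at frequency $(k, \eta)$ is constant in time, while the stream function is obtained by dividing by $k^2 + (\eta - kt)^2$. The function names and the values of `k` and `eta` are illustrative assumptions, not notation taken from the article.

```python
import math

def p(t, k, eta):
    """Symbol of minus the Laplacian in the frame moving with the Couette flow."""
    return k * k + (eta - k * t) ** 2

def velocity_amplitudes(t, k, eta):
    """Return (|u^x|, |u^y|) Fourier amplitudes for a unit vorticity coefficient."""
    pt = p(t, k, eta)
    return abs(eta - k * t) / pt, abs(k) / pt

k, eta = 2.0, 40.0          # illustrative frequencies (assumed values)
t_crit = eta / k            # Orr critical time: the mode is fully unmixed here

# Transient growth of the vertical velocity between t = 0 and t = eta / k.
_, uy0 = velocity_amplitudes(0.0, k, eta)
_, uy_crit = velocity_amplitudes(t_crit, k, eta)
growth = uy_crit / uy0      # equals 1 + (eta / k)**2

# Decay exponents long after the critical time, measured over a doubling of t.
ux1, uy1 = velocity_amplitudes(1000.0, k, eta)
ux2, uy2 = velocity_amplitudes(2000.0, k, eta)
rate_x = math.log(ux1 / ux2) / math.log(2.0)   # close to 1
rate_y = math.log(uy1 / uy2) / math.log(2.0)   # close to 2

print(f"growth = {growth:.1f}, rate_x = {rate_x:.3f}, rate_y = {rate_y:.3f}")
```

In this sketch the horizontal velocity decays like $t^{-1}$ and the vertical one like $t^{-2}$, while the amplification before the critical time $\eta/k$ is of size $1 + (\eta/k)^2$, which is the transient growth responsible for the kinetic energy instability mentioned above.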

The main result
The purpose of this article is to provide the first rigorous study on the long-time dynamics of the Couette flow for the 2D inviscid Boussinesq system (1.1). We prove that the nonlinear system undergoes a shear-buoyancy instability and nonetheless the velocity field experiences nonlinear inviscid damping, confirming that the linear dynamics extend to the nonlinear setting at least on a natural timescale $O(\varepsilon^{-2})$. Fix > 1∕2 and define the Gevrey norm of class 1∕ as ‖ ‖ (1.4). The main result of this article is stated in the next theorem.
The above result describes the long-time dynamics of the Boussinesq system (1.2) in the perturbative regime near the linearly stratified Couette flow, and it is the first of its kind describing such behavior in a fully inviscid coupled system which has both wave propagation and phase mixing. The works [28,47,64] study nonlinear systems with both phase mixing and wave propagation, but these problems all contain dissipative effects, whereas the works [11,61] are all linear. The inviscid damping due to vorticity mixing is encoded in (1.7)-(1.8).
One of the main novelties here is the quantification of the shear-buoyancy instability given by (1.9). The linearized dynamics of (1.2) predict exactly the decay rates (1.7)-(1.8) and the instability (1.9) for all times [11,34,61], see also Theorem 2 below. Therefore, in a nonlinear perturbative regime such as the one studied here, the time-scale $O(\varepsilon^{-2})$ appears naturally. As another manifestation of the instability, the rates in (1.6)-(1.8) are $\langle t \rangle^{1/2}$ slower compared to the constant density case studied in [8]. This is due to the creation of vorticity in the perturbation by interaction with the density stratification.
The proof of Theorem 1, described in detail in Section 2, crucially uses the specific linear coupling of $\omega$ and $\theta$ via a suitable symmetrization of the unknowns. Specifically, the scaled density $\theta$ is not simply transported by the Couette flow, as this would imply a growth rate of order $\langle t \rangle$ for $\nabla \theta_{\neq}$, rather than the $\langle t \rangle^{1/2}$ appearing in (1.9).
The need for an infinite regularity (Gevrey) space is by now classical in phase mixing problems, both for Landau damping in plasma physics [9,25,30,33,50] and for inviscid damping in fluid mechanics [8,36-38,48]. This is strictly connected with the loss of derivatives as a price to pay for the control of transient growths or echoes: further discussions on this aspect can be found in the course of the paper. The regularity requirement on the initial data (1.5) is the same as in the constant density case [8,37,48], and it is likely to be sharp [22]. This can be heuristically understood by a toy model that estimates the worst possible growths due to the nonlinear interactions. Despite predicting the same total loss of regularity as in the constant density case, the model is tailored specifically to the Boussinesq system and displays crucial differences in terms of the regularity imbalance between resonant and non-resonant modes (see Section 2.3). The picture may change with the addition of thermal diffusivity and/or viscosity. When viscosity is added in the vorticity equation, the Gevrey index can be relaxed to $s = 1/3$ as in [47], while when diffusivity is also present in the density equation one can work in Sobolev regularity [28,64].
The restriction on the parameter $\beta$ in Theorem 1 is sharply consistent with the classical Miles-Howard criterion for linear spectral stability [35,49] mentioned above. The role of this restriction is very explicit in the coercivity of the main energy functional used to prove Theorem 1, but it also implicitly appears in many of the constants hidden by the symbol $\lesssim$, which blow up as $\beta \to 1/2$. The linear dynamics when $\beta \le 1/2$ were studied in [61]. In this case the vorticity grows with faster rates (and the density decays with slower rates). Reproducing the results of [61] by means of an energy method like the one used in [11] could lead to further insight at the nonlinear level as well.
Finally, we do not expect that the linear dynamics persist to leading order after times $O(\varepsilon^{-2})$, but rather that a secondary instability engages to carry the solution into a fully nonlinear regime. Specifically, after this time, we expect that mixing creates large adverse vertical density gradients, resulting in an overturning instability. There are some analogies between Theorem 1 and the work on subcritical transition in 3D Couette [5]: both study a spectrally stable problem with an algebraic instability and show that the only way to trigger a secondary instability is through the underlying destabilizing mechanism (at least in Gevrey class). The possible secondary instability, the 3D case, and the case of stably stratified fluids without the Boussinesq approximation will be studied in future work.

Organization of the article
Section 2 describes the main ideas needed for the proof of Theorem 1, including the symmetrized variables, the weighted energy functionals and the fundamental bootstrap Proposition 2.8. In Section 3 we prove Theorem 1 assuming Proposition 2.8. The rest of the article is dedicated to the proof of Proposition 2.8. The construction of the time-dependent Gevrey weights is carried out in Section 4, while Section 5 is dedicated to the proof of the elliptic estimates crucial to control the nonlinear terms. The heart of the article is contained in Section 6, where we prove the energy estimate on the symmetric variables. These require direct bounds on the vorticity and the gradient of the density, which are carried out in Section 7. Finally, Section 8 contains the control of the nonlinear change of coordinates.

Notations and conventions
We use the notation $f \lesssim g$ when there exists a constant $C > 0$, independent of the parameters of interest, such that $f \le C g$. Similarly, $f \approx g$ means that there exists $C > 0$ such that $C^{-1} f \le g \le C f$. We will denote by a generic positive constant smaller than 1. Given a vector $(k, \eta)$, we indicate by $|k, \eta| = |k| + |\eta|$ its norm. We will use the symbol $\langle a \rangle = \sqrt{1 + |a|^2}$ for either scalars or vectors. Given a normed space $X$, its norm is denoted by $\| \cdot \|_X$, omitting the subscript when $X = L^2$. We recall also that (1.3) and (1.4) are used throughout the article.

OUTLINE OF THE PROOF
In this section, we outline the proof of Theorem 1. There are a number of different ideas that go into it, some arising from the inviscid damping result for the homogeneous problem [8], others arising from the study of the linearized problem [11], and others which are new and specific to this nonlinear problem.

Change of coordinates
Given the incompressibility of the flow, we know $u_0 = (u_0^x, 0)$, which implies
Due to the inviscid damping, we expect the non-zero $x$-frequencies to decay and hence it is natural to treat the last term as a perturbation. However, there is no decay mechanism for $u_0^x$ and so this term could be treated perturbatively on an $O(\varepsilon^{-1})$ time-scale at most. To deal with this difficulty, [8] introduced a change of coordinates that depends on $u_0^x(t)$, and for the same reason, we use the same coordinate change. We briefly recall it here; see [8] for more details. Define the change of coordinates (2.1)-(2.2). Provided that $u_0^x$ is sufficiently small, this coordinate change can be inverted; we assume this is the case for now. The corresponding unknowns written in the new variables (writing $z = z(t, x, y)$, $v = v(t, y)$) are given by
In this way we obtain (we write the change of variables only for Ω but similar relations hold for the other functions)
The Biot-Savart law also gets transformed as
In the new coordinates, the original system (1.2) is now expressed as (2.6), where $\nabla = \nabla_{z,v}$. Notice that the zero mode in $x$ and in $z$ are the same, and therefore we use the same symbol as in (1.4) to denote the projection of Ψ off the zero mode in $z$.
To control the coordinate system itself, as in [8], we introduce the auxiliary variables .
Propagating smallness for $h$ is enough to invert the change of coordinates and it will be crucial to handle new nonlinear problems appearing in (2.6). For instance, the piece of the velocity field with $v' \nabla^{\perp} \Psi_{\neq}$ can be split into a velocity field in the standard form $\nabla^{\perp} \Psi_{\neq}$ plus $h \nabla^{\perp} \Psi_{\neq}$. We treat this last piece as a "perturbation" of $\nabla^{\perp} \Psi_{\neq}$ by proving $h$ is small in an appropriate sense. The term  instead arises when deriving the equation satisfied by $h$. Indeed, notice that from (2.4) we have
Since from (2.4) one has that ( ( ′ − 1)) = 0 , we have from (2.3) that
Taking the average of the first equation in (2.6), we similarly derive
Finally, we also record the equation satisfied by $\dot{v}$ in the ( , ) coordinates, namely

The linearized dynamics: Symmetric variables
Unlike [8], the linear dynamics are non-trivial. The linearized dynamics associated with (2.6) are best understood by passing to Fourier variables $(z, v) \mapsto (k, \eta)$. Since at the linear level we have $v = y$, the differential operators in these coordinates read as in (2.10). We denote the symbols associated to $-\Delta_L$ as
$p(t, \eta) = k^2 + (\eta - kt)^2$, $p'(t, \eta) = -2k(\eta - kt)$. (2.11)
The explicit dependence on $k$ of the above quantities will often be omitted. The linearized equations are obtained from (2.6) by neglecting all nonlinear terms including the one arising from the nonlinear change of coordinates (hence $\Delta_t$ is formally replaced by $\Delta_L$). On the Fourier side, they take the form (2.12). While the decoupling in $k$ is a general feature of stratified flows near shears and general background density profiles $\bar{\rho}(y)$, the linear nature of the Couette flow and of $\bar{\rho}$ ensures the decoupling in $\eta$ as well. While the zero-mode is clearly conserved, the nonzero modes exhibit an interesting behavior which has been studied in the applied mathematics literature since the 1950s; we refer to [34] for a detailed literature review. In [34], the system (2.12) is investigated by a method involving hypergeometric functions, made mathematically rigorous and precise in [61]. For our purposes, it is more convenient to recall the energy method used in [11], originally introduced to deal with the linear stability of the Couette flow in a compressible fluid [3]. The idea is to symmetrize the system (2.12) via time-dependent Fourier multipliers and use an energy functional for the new auxiliary variables. Compared to [11], we slightly change the symmetrized variables by modifying powers of $k$, defining them here as in (2.13), for which (2.12) takes the particularly amenable form (2.14). Throughout the article, we will often omit subscripts and arguments when no confusion arises.
The presence of the $k^2$ factors in (2.13) only modifies the linearized equations by changing $k$ to $|k|$; however, the adjustment to the definition of the symmetrized variables will be important to treat the nonlinear problem later. Define the following energy functional point-wise in frequency (2.15). Since $|p'/(|k| p^{1/2})| \le 2$, the energy functional is coercive for $\beta > 1/2$, see (2.16), and can be shown to satisfy (2.17).
The Miles-Howard condition is only sufficient, since in the constant density case $\bar{\rho} = 1$ (the homogeneous 2D Euler equations) every shear flow without any inflection point is spectrally stable by Rayleigh's criterion for homogeneous fluids (which is a necessary and sufficient condition), see [44] for further details.
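The energy argument can be illustrated numerically at a single frequency. The sketch below is hedged: it assumes that the symmetrized system has the form $\partial_t Z = -\tfrac{p'}{4p} Z - \beta |k| p^{-1/2} Q$, $\partial_t Q = \tfrac{p'}{4p} Q + \beta |k| p^{-1/2} Z$, with $p = k^2 + (\eta - kt)^2$, and that the energy is $\tfrac12(|Z|^2 + |Q|^2) + \tfrac{1}{4\beta} \tfrac{p'}{|k| p^{1/2}} \mathrm{Re}(Z \bar{Q})$. These forms, the names `Z`, `Q`, and all parameter values are assumptions chosen to be consistent with the coercivity threshold $\beta > 1/2$ quoted above; real-valued data are used for simplicity.

```python
import math

def p(t, k, eta):
    return k * k + (eta - k * t) ** 2

def dp(t, k, eta):
    """Time derivative of p."""
    return -2.0 * k * (eta - k * t)

def rhs(t, z, q, k, eta, beta):
    """Assumed symmetrized linear system for (Z, Q) at frequency (k, eta)."""
    a = dp(t, k, eta) / (4.0 * p(t, k, eta))
    b = beta * abs(k) / math.sqrt(p(t, k, eta))
    return -a * z - b * q, a * q + b * z

def energy(t, z, q, k, eta, beta):
    """Assumed energy: diagonal part plus a cross term with coefficient <= 1/(2 beta)."""
    cross = dp(t, k, eta) / (abs(k) * math.sqrt(p(t, k, eta)))  # |cross| <= 2
    return 0.5 * (z * z + q * q) + cross * z * q / (4.0 * beta)

def integrate(z, q, t_end, steps, k, eta, beta):
    """Classical fourth-order Runge-Kutta from t = 0 to t = t_end."""
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, z, q, k, eta, beta)
        k2 = rhs(t + h / 2, z + h / 2 * k1[0], q + h / 2 * k1[1], k, eta, beta)
        k3 = rhs(t + h / 2, z + h / 2 * k2[0], q + h / 2 * k2[1], k, eta, beta)
        k4 = rhs(t + h, z + h * k3[0], q + h * k3[1], k, eta, beta)
        z += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        q += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    return z, q

k, eta, beta, t_end = 1.0, 10.0, 1.0, 200.0   # illustrative values
z0, q0 = 1.0, 0.0
e0 = energy(0.0, z0, q0, k, eta, beta)
z, q = integrate(z0, q0, t_end, 40000, k, eta, beta)
e1 = energy(t_end, z, q, k, eta, beta)

norm_ratio = (z * z + q * q) / (z0 * z0 + q0 * q0)   # stays of order one
vorticity_weight = (p(t_end, k, eta) / k**2) ** 0.25  # grows like t**(1/2)
print(f"E(T)/E(0) = {e1 / e0:.3f}, norm ratio = {norm_ratio:.3f}")
```

Under these assumptions the time derivative of the energy only involves the derivative of the cross coefficient, which is integrable in time, so $(Z, Q)$ stays bounded above and below; the vorticity, recovered by multiplying $Z$ by $(p/k^2)^{1/4}$, then inherits the $t^{1/2}$ growth.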

The nonlinear growth mechanism
The full nonlinear system corresponding to (2.6) in the ( , ) variables (2.13) reads where in (2.23) we have isolated the linear part identical to that in (2.14) and we have used the identity Δ Ψ = Ω − (Δ − Δ )Ψ.
In inviscid damping around Couette flow, the unmixing of enstrophy causes transient growth of the velocity, called the Orr mechanism, with analogous transient growth effects in other phase mixing problems [4,8,50,53]. As discussed in [4,8,22,50,58], when studying nonlinear phase mixing problems, a key effect to look for are "echoes", wherein well-mixed enstrophy, through nonlinear interactions, transfers back to frequencies which will be un-mixed at a future time and hence cause growth in the velocity field by the Orr mechanism, possibly repeating the process into a chain of nonlinear oscillations. Echo chains were captured in experiments for plasmas modeled by the Vlasov equations [46] and in plasmas modeled by the 2D Euler equations near a vortex in [62]. This can be considered a kind of "resonance" associated with the linear transient growth mechanism that appears at the second iterate of linearization (i.e., if one linearizes around the linear dynamics) [4,8,22,23,50,58]. It is the primary reason that proving nonlinear inviscid damping (or Landau damping) type results is challenging and why such results have generally required very high regularity; see for example [5, 8, 36-38, 47, 48, 50]. In order to account for the echo resonances, [8] introduced a time-dependent Fourier multiplier method which builds a norm carefully designed to match exactly the worst-case estimates of these resonances. In fluid mechanics, there does not yet exist an alternative to this method for studying nonlinear inviscid damping problems. In order to adapt these ideas to the system (2.22)-(2.23), we need to derive a "toy model" that captures the worst possible growth caused by nonlinear interactions. As we will see, though we proceed in the spirit of [8] and [5], the toy model has significant differences with previous works.
As Ω is the unstable quantity, the worst possible nonlinear term appears in the equation for the symmetrized vorticity variable. As in [8], we derive a formal toy model by a paraproduct decomposition of the nonlinearity, which can be thought of as a secondary linearization of the evolution of the high frequencies about the low frequency linear dynamics. To obtain the toy model, we first observe that interacts with which could then excite through the linear and nonlinear interactions. However, for the symmetrized variables, the linear semigroup is bounded and we therefore ignore linear terms and the fact that the two variables are coupled (but the linear growth mechanism will still be seen in some pieces of the nonlinearity). Assuming that $\Delta_t^{-1}$ can be well-approximated by $\Delta_L^{-1}$, we want to write a good model for the nonlinear interactions of the scalar equation
where $\eta$ is to be considered as a fixed parameter. Similarly to [8], the dangerous scenario is when $\eta k^{-2} > 1$ and there is a high-to-low cascade in which the $k$ mode has a strong effect at time $\eta/k$ that excites the $k - 1$ mode, which itself has a strong effect at time $\eta/(k-1)$ that excites the $k - 2$ mode and so on. This physically corresponds to an echo chain [4,8,22,23,62]. Therefore, we focus near one critical time $\eta/k$ on a time interval of length roughly $\eta/k^2$, so that $\eta/(k-1)$ is not critical, and consider the interaction between the $k$ mode and a nearby mode. Calling the mode $k$ and the mode $k - 1$ at frequency $\eta$ the resonant and non-resonant dominant modes, respectively, keeping only the leading order terms and taking absolute values, we obtain the coupled toy system (2.24), where we also included that 1 (0) ≈ , as we are assuming linear dynamics to leading order. Since $\eta/(k-1)$ is not critical and $k^2 \le \eta$, we have that $p_{k-1}(t) \approx (\eta/k)^2$. Therefore, using (2.11) and that 1 (0) 1 4 ≈ 1 2 ≲ 1 for our purposes, the toy model that we finally consider is (2.25). In Section 4 we construct a weight based on this model that takes into account a regularity imbalance between the resonant and non-resonant modes.
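The bookkeeping of critical times used above can be checked with a short sketch. It is only an illustration under stated assumptions: the symbol is taken to be $p_k(t, \eta) = k^2 + (\eta - kt)^2$, and the window around each critical time $\eta/k$ is a hypothetical choice whose endpoints are the midpoints between consecutive critical times; the function names and the value of `eta` are not from the article.

```python
import math

def p(t, k, eta):
    return k * k + (eta - k * t) ** 2

def resonant_interval(k, eta):
    """Hypothetical window around eta / k, cut at midpoints between critical times."""
    left = 0.5 * (eta / k + eta / (k + 1))
    right = 0.5 * (eta / k + eta / (k - 1)) if k > 1 else 2.0 * eta
    return left, right

eta = 400.0                          # illustrative frequency
k_max = math.isqrt(int(eta))         # only modes with k**2 <= eta are considered

rows = []
for k in range(2, k_max + 1):
    t_crit = eta / k
    left, right = resonant_interval(k, eta)
    # Window length compared with the heuristic size eta / k**2.
    length_ratio = (right - left) / (eta / k**2)
    # Symbol of the neighbouring mode k - 1 at the critical time of mode k,
    # compared with (eta / k)**2.
    p_ratio = p(t_crit, k - 1, eta) / (eta / k) ** 2
    rows.append((k, t_crit, left, right, length_ratio, p_ratio))

for row in rows[:3]:
    print(row)
```

With this choice the windows tile the time axis, each contains exactly one critical time, their length is $\eta/(k^2 - 1)$, and the neighbouring symbol stays within a factor of two of $(\eta/k)^2$ as long as $k^2 \le \eta$.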
Some remarks are in order.
Remark 2.2 (On the Gevrey-$2^-$ regularity). As in previous works [8], one can deduce that the maximal possible growth for the resonant and non-resonant modes is of order $(\eta/k^2)^{c}$ for some constant $1 < c < 16$. If this growth accumulates for all the frequencies $k = 1, \dots, \lfloor \sqrt{\eta} \rfloor$, Stirling's formula implies a growth of order $\exp(C\sqrt{\eta})$. This is consistent with the loss of Gevrey-$2^+$ regularity in the inviscid, homogeneous case [8,22].
The key practical difference between the two toy models is that for the homogeneous model (2.26) the power 1∕2 in (2.24)-(2.25) is replaced by the power 1. This implies that while both models predict the same regularity loss (Gevrey-$2^+$), we are going to impose a smaller regularity imbalance between resonant and non-resonant modes, and hence are going to lose fewer derivatives when measuring the effect of non-resonant modes on resonant modes and gain fewer when measuring the effect of resonant modes on non-resonant modes. The toy model used to build the norm in [47] for the Boussinesq equations with viscosity (but not thermal diffusivity) near Couette flow is more significantly different. However, the derivation and use of the model depend crucially on the presence of viscosity.
2 ≲ 1 (analogous to the way the lift-up effect time-scale of $O(\varepsilon^{-1})$ dictated the toy model in [5]). For times $t \ge \varepsilon^{-2}$, the 1∕2 in (2.24)-(2.25), due to the structure of the system, would lead to an exponential growth for the two modes which could not be controlled in any regularity class. Instead, for the toy model (2.24)-(2.25) we show that the growth is at most polynomial, see Proposition 4.1, and accumulates only to Gevrey-$2^+$ losses.
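The Stirling argument in Remark 2.2 can be checked numerically. The sketch below assumes a growth factor $(\eta/k^2)^{c}$ at each critical time for $k = 1, \dots, \lfloor \sqrt{\eta} \rfloor$, and compares the logarithm of the accumulated product with $2c\sqrt{\eta}$; the exponent name `c`, its value, and the value of `eta` are illustrative assumptions.

```python
import math

def log_growth_direct(eta, c):
    """log of prod_{k=1}^{floor(sqrt(eta))} (eta / k**2)**c, summed term by term."""
    n = math.isqrt(int(eta))
    return sum(c * math.log(eta / k**2) for k in range(1, n + 1))

def log_growth_factorial(eta, c):
    """Same quantity written as c * (n * log(eta) - 2 * log(n!))."""
    n = math.isqrt(int(eta))
    return c * (n * math.log(eta) - 2.0 * math.lgamma(n + 1))

def log_growth_stirling(eta, c):
    """Leading-order Stirling prediction 2 * c * sqrt(eta)."""
    return 2.0 * c * math.sqrt(eta)

eta, c = 1.0e6, 2.0   # illustrative values
direct = log_growth_direct(eta, c)
factorial_form = log_growth_factorial(eta, c)
ratio = direct / log_growth_stirling(eta, c)
print(f"log growth = {direct:.1f}, ratio to 2*c*sqrt(eta) = {ratio:.4f}")
```

The logarithm of the total growth is proportional to $\sqrt{\eta}$, up to lower-order logarithmic corrections, which is the heuristic reason a weight of the form $\exp(C|\eta|^{1/2})$ is the borderline.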

Weights and energy functionals
Ultimately, the main step in the proof of Theorem 1 is to obtain the following uniform estimate for some $\lambda > 0$ and $s > 1/2$. Using this estimate (and suitable estimates on the change of coordinates), it is not too difficult to complete the proof of Theorem 1; see Section 3. However, we cannot obtain such an estimate directly; instead, several additional ingredients are required, involving three energy functionals:
• To obtain uniform bounds in the presence of the linear term, we need to estimate the symmetrized variables with an energy based on the linear analysis of [11]; we will call this the linear-type energy functional.
• The symmetrized variables break the natural energy structure of the quadratic transport nonlinearities, hence requiring an energy which estimates Ω and ∇ Θ directly; this estimate is at the highest level of regularity, so it controls the highest frequencies, but due to the linear instability, it necessarily grows in time. We call this the nonlinear-type energy functional.
• Both of these estimates are in turn coupled to an energy that controls the coordinate system, which can be considered to be an estimate on the evolving shear $u_0^x$; we call this the coordinate change energy functional.
The control of these three energies (and associated time-integrated quantities) forms the main bootstrap argument, detailed below in Section 2.5. This general scheme is common in perturbative quasilinear problems such as scattering in dispersive PDEs (see e.g., [32,39] and related references) and in Landau damping in kinetic theory (e.g., [9,30]). Here there is the additional complication of requiring estimates on a coordinate system that is coupled to the other unknowns; this same additional complication arises in certain dispersive PDEs (see e.g., [39,41,51]).
The key idea of the Fourier multiplier method of [8] is to introduce time-dependent Fourier multipliers that allow us to capture the possible growth mechanisms by suitably weakening the norms in a time and frequency dependent way. All three energies are based on such Fourier multiplier norms. As in other methods based on time-dependent norms, weakening the norm generates artificial damping terms in the equations that can be used to absorb terms in the energy method. We remark that this method is also reminiscent of Alinhac's ghost weight method [2]; however (aside from being on the Fourier side), this method necessitates the norm losing a significant amount of regularity in an anisotropic way as time proceeds, which significantly increases the complexity. The main weight is defined as a time-dependent Fourier multiplier of the form (2.27), where > 16 is a fixed constant, $\lambda(t)$ is the bulk Gevrey regularity index and the remaining factors are suitable Fourier multipliers to be defined in the sequel. The function $\lambda(t)$ is assumed to satisfy
where $\lambda_0$, $\lambda'$ are those of Theorem 1, ≈ 0 − ′ is a small parameter to ensure that 2 ( ) > 0 + ′ , and 1∕2 < ≤ 1∕4 + ∕2 is a parameter chosen by the proof. The function $\lambda(t)$ allows a loss of the radius of regularity, by a finite amount, in a continuous way. As discussed in [8], it suffices to consider the case close to 1∕2 as higher regularities can be treated by adding an additional factor exp( ( )| , | ) for any < ≤ 1 which would play little role in the energy estimates that follow.
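To make the phrase "a loss of the radius of regularity, by a finite amount, in a continuous way" concrete, here is a minimal sketch under a purely hypothetical decay law, $\dot{\lambda}(t) = -\delta \langle t \rangle^{-2q} (1 + \lambda(t))$ with $2q > 1$. The law itself, the names `lam0`, `delta`, `q`, and their values are assumptions made for illustration and are not claimed to be the choice made in the article.

```python
import math

def mixing_integral(t, q, n=4000):
    """Trapezoid rule for int_0^t <tau>**(-2q) dtau, after substituting tau = sinh(u)."""
    u_max = math.asinh(t)
    h = u_max / n

    def f(u):
        # <sinh u>**(-2q) * cosh(u) = cosh(u)**(1 - 2q)
        return math.cosh(u) ** (1.0 - 2.0 * q)

    total = 0.5 * (f(0.0) + f(u_max)) + sum(f(i * h) for i in range(1, n))
    return h * total

def radius(t, lam0, delta, q):
    """Solution of the hypothetical law: log(1 + lambda) decreases by delta * integral."""
    return (1.0 + lam0) * math.exp(-delta * mixing_integral(t, q)) - 1.0

lam0, delta, q = 1.0, 0.05, 0.75      # illustrative values with 2q > 1
times = [0.0, 1.0, 10.0, 100.0, 1.0e4]
values = [radius(t, lam0, delta, q) for t in times]

# Lower bound from int_0^infty <tau>**(-2q) dtau <= 1 + 1 / (2q - 1).
floor = (1.0 + lam0) * math.exp(-delta * (1.0 + 1.0 / (2.0 * q - 1.0))) - 1.0
print(values, floor)
```

Because the time weight is integrable, the radius decreases monotonically but never drops below a positive floor, which is the qualitative property the weight needs.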

The linear weight
As we have seen in Section 2.2, the error term appearing in (2.17) can be integrated in time at any fixed frequency $(k, \eta)$. However, the nonlinear case cannot simply be treated point-wise in $(k, \eta)$, and we are forced to introduce the bounded Fourier multiplier (2.30). Such a multiplier creates the artificial damping term (see (2.35) below) that controls the analogue of the linear error term in (2.17). This multiplier (or similar ones) has been used previously in, for example, [10,42,63].

The nonlinear weight
The remaining multiplier to complete the definition of the main weight in (2.27) is given by (2.31), where 1 < < 23. The weight is extremely important and is constructed using the toy model (2.24)-(2.25) in Section 4. In particular, it is used to distinguish between the resonant and non-resonant behavior of the system (see Section 4.2 for all of its properties). For the moment we can think of it as a correction to the main exponential factors that mimics the behavior of the toy model (2.24)-(2.25) near the critical times $t = \eta/k$. Most importantly, it assigns more regularity to the "resonant" frequencies $(k, \eta)$ than to the "non-resonant" frequencies $(k', \eta)$. It is analogous to the corresponding weight in [8]; however, the weight here is different from the one in [8] due to the different toy model. Finally, for technical reasons it is convenient to define ˜( , ) = e | |

The coordinate system weight
It turns out that the energy functional that controls the coordinate system needs to use a stronger weight (compared to the one above) of a similar form, given in (2.33). Here, the nonlinear factor plays a similar role as in (2.31), and is defined in terms of a weight below in (4.6). However, it is constructed from the toy model for the homogeneous 2D Euler equations (2.26) used in [8], making it essentially the same as the weight used in [8]. Due to the different toy model being used here, this implies we will be propagating a relatively large amount of additional regularity on the coordinate system (relative to the $z$-dependent unknowns). This additional regularity is crucial to closing the estimates below.

The linear-type energy functional
The energy functional that will permit uniform bounds is designed by taking inspiration from the linearized energy (2.15). It is defined as
Notice that, for $\beta > 1/2$, we have the same coercivity bounds as in (2.16). Through a careful computation of its time derivative (carried out in Section 6.1) and using the definition of the main weight (see (2.27), (2.30) and (2.31)) we arrive at an inequality of the type (2.35). Here we denote
The terms above are also called Cauchy-Kovalevskaya terms since they come from the weakening of the norms caused by the Fourier multipliers. Those are good terms since they have a definite sign and can be used to control the other error terms in the identity (2.35), which we divide as follows: a linear error term, analogous to that in (2.17) and defined precisely in (6.6); a term containing the main nonlinear errors that come from the transport structure of the equations (see (6.7)); and two simpler error terms (labelled div and Δ) that arise as a consequence of the nonlinear change of coordinates (2.1), defined in (6.8) and (6.9), respectively.

The coordinate change energy functional
The control on the change of coordinates, described by Equations (2.7)-(2.8) is achieved via the energy functionals and where the weight is defined in (2.33) and 1 = 1 ( , 0 , ) > 1 is a constant chosen in the proof. In Section 8 we derive the energy inequalities for the two functionals above, where it will be more convenient to treat each term in separately. Due to the presence of multiplier in the definition of , when computing the time derivative of we get as good terms (2.38) Remark 2.5. The structure of the energy functionals for the change of coordinates is heavily inspired by [8], indeed, the coordinate change and the associated Equations (2.7)-(2.9) are exactly the same. However, the control we have on the quantities under study is significantly different. First, we need an additional, even higher regularity control on ℎ, namely the last term in . Moreover, while is essentially the weight used in [8], the norm is not the same, and so we have a significantly different regularity gap between the estimate on  and those on ℎ. For this reason, we cannot rely completely on the proofs given in [8].

The nonlinear-type energy functional
To control high frequencies we also need a direct control on the vorticity Ω and the gradient of the density ∇ Θ that is consistent with the linear prediction. From (2.21), a natural quantity to control is
which satisfies the following inequality (see Section 7.1)
Note that the structure is very similar to (2.35), with the good Cauchy-Kovalevskaya terms, a linear error, a nonlinear error and the errors due to the change of coordinates. Crucially, the linear term involves precisely the symmetrized variables and a bounded multiplier, thanks to the definition (2.13), and it will be treated thanks to the energy above for the symmetrized variables. All the error terms involved in this energy balance are analyzed in Section 7.
Remark 2.6 (On the necessity of the symmetric variables). The results stated in Theorem 1 can also be obtained by ( ) ≲ 2 and coordinate system estimates. The reason why it is necessary to control the symmetric variables , is the term does not have a definite sign and is positive for > ∕ . The weight cannot be used to control this term for all the frequencies, and any other weight on Ω and ∇ Θ would have to be of order −1∕4 (at best), hence leading back to the symmetric variables. Instead, , have a nice structure at the linear level and, once we have a control on them, the bound on (2.40) is immediate. Error terms containing ∕ have been previously handled in the literature for linear inviscid [11] or viscous problems (e.g., [5][6][7]42]).

The bootstrap proposition
To control the energy functionals , , and (2.37), we rely on a continuity argument. Hence, we first state the local well-posedness result. We omit the proof since it follows by standard reasoning for 2D Euler in Gevrey spaces (see [8] for discussion).

Proposition 2.7.
For all > 1∕2, 0 > 0, there exists a constant ′ 0 > 0 with the following property: for every > 0 and every ′ < ′ 0 ,
Thanks to Proposition 2.7, the rest of the proof will only deal with times ≥ 1. By a standard approximation argument, we may work with regularized solutions, for which the quantities on the left-hand side take values continuously in time (see [8]). We now introduce the time ⋆ as the supremum of the set of times within which the following bootstrap hypotheses are assumed to be satisfied.

Immediate consequences of the bootstrap hypotheses
The bounds in the bootstrap hypotheses imply a control on several other quantities that we need to prove Proposition 2.8 and Theorem 1. We first show the bounds we have for the unweighted variables in lower regularity spaces. In fact, we only need the following bounds to prove the main Theorem 1.

Lemma 2.10. Under the bootstrap hypotheses, the following inequalities hold
The proof of the lemma above is straightforward from the definition of and Proposition 5.1, which shows that by paying Sobolev regularity, decay on Ψ follows as in the case Δ = Δ ; see [8] for more details.
From the definition of Δ and , we also need control on 1 − ( ′ ) 2 , ′′ and v̇ in the proper regularity classes. These coefficients are controlled by ℎ, and bounds on v̇ are recovered from . We collect the estimates in the following.

Lemma 2.11. Under the bootstrap hypothesis, the following inequalities hold
The proofs of the bounds (2.44)-(2.45) resemble the ones providing the analogous estimates of [8]; they are sketched briefly in Section 8.

PROOF OF THE MAIN THEOREM
In this section we prove Theorem 1, under the assumption that Proposition 2.8 holds. We need the bounds of Lemma 2.10 (which follow directly from the bootstrap hypotheses (H1)-(H4)). We remark again that the main part of this paper is the proof of Proposition 2.8.
The first step of the proof of Theorem 1 is undoing the nonlinear coordinate transform. Instead of the change of coordinates (2.1)-(2.2), we want to use as the second spatial coordinate and define where is still given by (2.1). Define ∞ = ( 2 −2 ). By following the arguments in [ for some 0 < ′ ∞ < ∞ . The estimates on , , and ≠ stated in Theorem 1 now follow immediately. Taking the -average of the momentum equations (1.1) in the coordinates we have (using that where we denote * ( , , ) = ( , , ). Using (3.1), it then follows that This takes care of the uniform estimates on 0 stated in Theorem 1. The bound on 0 follows similarly.
Next, we are interested in proving the instability result, which requires a more detailed analysis of the dynamics. First, we observe that (2.6) in the new Fourier variables becomes As before, it is convenient also to define namely, the analogues of (2.10)-(2.11) in the ( , ) coordinates (with the -Fourier variable), and the new auxiliary variables (as in (2.13)) Similarly to (2.22)-(2.23), we have Let us view the system as the vector ODE pointwise-in-frequency where = ( ⋆ , ⋆ ) and the linear part ( ) is given by the time-dependent matrix Calling Φ ( , ) the associated solution operator, we may re-write (3.4) as In light of (3.5)-(3.6), we therefore have that there exists ′ > 0 such that
The rest of this section is devoted to providing a suitable upper bound for the nonlinear term above. Precisely, we prove the following bound: where , are as in Theorem 1. In fact, there holds for all ′′ ∞ < ′ ∞ , Assuming now Lemma 3.1, there exists some > 1 such that (3.7) becomes for every ≤ 2 −2 , which completes the proof of Theorem 1. It now suffices to prove Lemma 3.1.
Proof of Lemma 3.1. We will simply prove (3.8), as (3.9) is a straightforward extension and is not required for the statement of Theorem 1. From (3.2)-(3.3) and the fact that 3 , is an algebra, we find that In view of (3.1), which includes the estimates on the change of coordinates, it follows that and together with the rest of the estimates of (3.1), Lemma 3.1 follows. □
This concludes the proof of Theorem 1.
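For orientation, the rewriting of the pointwise-in-frequency vector ODE through its solution operator, as used in the proof above, is an instance of Duhamel's formula. The sketch below uses generic placeholder notation: the symbols $X$, $L$, $N$, $\Phi_L$ and the initial time $t_0$ are assumptions made for this illustration and need not coincide with the notation of (3.4)-(3.6).

```latex
% Generic Duhamel formula for a linear time-dependent system with forcing.
% X(t): vector unknown at a fixed frequency; L(t): time-dependent matrix;
% N(t): nonlinear term treated as a forcing; \Phi_L(t,s): solution operator.
\partial_t X(t) = L(t)\,X(t) + N(t)
\quad\Longrightarrow\quad
X(t) = \Phi_L(t,t_0)\,X(t_0) + \int_{t_0}^{t} \Phi_L(t,s)\,N(s)\,\mathrm{d}s,
\qquad
\partial_t \Phi_L(t,s) = L(t)\,\Phi_L(t,s),
\quad
\Phi_L(s,s) = \mathrm{Id}.
```

In this form, a lower bound on the linear evolution $\Phi_L(t,t_0)X(t_0)$ combined with an upper bound on the integral term yields a lower bound on $X(t)$, which is the structure of the argument leading to (3.7).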

THE MAIN WEIGHTS AND THEIR PROPERTIES
This section is dedicated to the construction of the Fourier multipliers which will play the role of weights in our energy functional. As anticipated in Section 2.4, the Fourier modes with horizontal frequency ≠ 0 need two different weights. We call the first one the linear weight, as it allows us to control linear terms; it has already been defined in (2.29). We also introduced the nonlinear weight ( , ) in (2.27), which encodes the dynamics of the nonlinear toy model derived in the previous sections. Here we provide a construction of the multiplier ( , ) in (2.31). Finally, the treatment of the zero mode = 0 requires a slightly different nonlinear weight, introduced in (2.33), which we define now.

Construction of the weight
As the nonlinear weight ( , ) in (2.31) actually encodes the dynamics of the toy model, we start with a more detailed description of its growth.

Definition 4.2. For any
The critical intervals are then defined as (4.1) We also introduce the resonant intervals as
We now follow the construction of [8], using (2.24)-(2.25) as the reference toy model. In particular, for ∈ , we choose ( , ) such that We assume ( , ) = ( , ) = 1 for ≥ 2 and we construct the weight backward in time, by gluing all the growths of Proposition 4.1. For simplicity, we assume , ≥ 0, but the construction below easily applies to the case , ≤ 0 (when they have different signs we take ( , ) ≡ 1).
We start our construction with the non-resonant part of the weight. Let be such that ( , ) = 1 for ≥ 2 or | | ≤ 2. Assume that ( | |−1, , ) is known. Motivated by Proposition 4.1, for any 1 ≤ ≤ ⌊ √ ⌋, we define where , , , have been introduced in (4.1), while , and , satisfy In particular, we have Thanks to this choice, notice that By the expressions of , , , , we also have that
The main weight ( , ) is finally given by
The weights and are defined in terms of in (2.27) and (2.31), respectively. Notice that for = 0 the weight is always non-resonant and the linear multiplier is ≡ 1. Therefore, 0 ( , ) always encodes non-resonant regularity.
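To make the critical intervals of (4.1) concrete, the following minimal sketch computes them numerically. It assumes the convention of [8], which the construction above follows: the endpoints are t_{0,η} = 2|η| and, for m ≥ 1, t_{m,η} is the midpoint between the consecutive resonant times η/(m+1) and η/m, with I_{k,η} = [t_{k,η}, t_{k−1,η}] for 1 ≤ k ≤ ⌊√η⌋. The function names `critical_time` and `critical_intervals` are hypothetical and introduced only for this illustration.

```python
import math


def critical_time(m: int, eta: float) -> float:
    """Endpoint t_{m,eta}, assuming the convention of [8].

    t_{0,eta} = 2|eta|; for m >= 1, t_{m,eta} is the midpoint between
    the consecutive resonant times |eta|/(m+1) and |eta|/m.
    """
    eta = abs(eta)
    if m == 0:
        return 2.0 * eta
    return eta * (2 * m + 1) / (2 * m * (m + 1))


def critical_intervals(eta: float) -> dict[int, tuple[float, float]]:
    """Map k -> I_{k,eta} = [t_{k,eta}, t_{k-1,eta}] for 1 <= k <= floor(sqrt(|eta|))."""
    eta = abs(eta)
    k_max = math.floor(math.sqrt(eta))
    return {
        k: (critical_time(k, eta), critical_time(k - 1, eta))
        for k in range(1, k_max + 1)
    }
```

Under this convention the intervals tile [t_{⌊√η⌋,η}, 2η] without overlap, and each I_{k,η} contains exactly one resonant time η/k, which is why the weight can be built backward in time by gluing one growth factor per interval.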
The change of coordinates requires a stronger weight. More precisely, we need to propagate the same regularity as in the homogeneous case treated in [8], where the weight of the coordinate system always assigns the resonant regularity given by the toy model (2.26). Hence, we define The weight ( , ) is given by Finally, we underline the following useful inequalities:

Properties of the weights
Here we present technical results which will be used throughout the paper to deal with the weights. First, we recall the trichotomy lemma from [8, Lemma 3.2]. We state here some useful inequalities, whose proofs can be found in [8].

Properties of the main weight
We collect the properties of the main weight ( , ), which in most cases will be analogous to [8], while the substantial differences will be carefully highlighted. First, we note that the maximal growth of the weight dictates the Gevrey-2 − regularity requirements. The proof is essentially the same as in [8] and is hence omitted here. The weight we constructed is not tracking the Gevrey regularity losses in an optimal way. In particular, for times ≲ √ | | we have no decay on −1 , whereas in principle for these short times before the resonances we could gain something. This refinement was implemented by Ionescu-Jia in [36] to obtain the results in Gevrey-2 instead of 2 − as we do here. We believe such an improvement is possible also in our case but we do not consider this issue in this paper.
Our weights also have the analogous property of [8,Lemma 3.3]; the proof is similar and is hence omitted. . (4.10) We now state two crucial results, which allow us to exchange frequency when dealing with . This is completely analogous to [8,Lemma 3.4] and the proof is hence omitted. For all ≥ 1 and , , , such that for some ≥ 1, −1 | | ≤ | | ≤ | | one has (4.14) If ∈ , ∩ , , ≠ then ]. However, estimate (4.15) is due to the specific structure of our weight. This is a consequence of the fact that our weight is slightly weaker than the one used in [8].
When is small enough, , attain their maximal growth and behave like exponential Fourier multipliers, thus allowing us to gain half a derivative from a commutator term. This is the content of the next result, which is the analogue of [8, Lemma 3.7].

Properties of
We also need to exchange frequencies in the multiplier . Lemma 4.14. Let , , , be given. We have the following: , if ∈ , ∩ , , 1, in all the other cases.
The proof is over. □

ELLIPTIC ESTIMATES
This section is devoted to some elliptic estimates that play a crucial role for the nonlinear bounds. The first inequality is a lossy elliptic estimate, as it allows us to gain time decay at the price of regularity.
Thanks to this result, we can treat Δ as a perturbation of Δ in a lower regularity class. Its proof is identical to that of [8, Lemma 4.1]; we summarize it below for the reader's convenience.
Proof of Proposition 5.1. To write Δ as a perturbation of Δ we introduce the notation By definition of Δ in (2.5), we get By the algebra properties of Gevrey spaces and the bound (2.44), since As is small, we can absorb the last term into the left-hand side. Since −1 ( ) ≲ (⟨ ⟩∕⟨ ⟩) 2 , we have hence proving the proposition. □
We now provide a more precise elliptic control, which plays a central role in the rest of the paper. In fact, if one knows a priori that 0 ≡ 0 then Δ = Δ and the following proposition would not be needed.
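The mechanism behind the lossy estimate of Proposition 5.1 can be checked numerically at the level of Fourier multipliers. The sketch below assumes, as in [8], that the linearized Laplacian has symbol −(k² + (η − kt)²) for integer k ≠ 0, and tests the pointwise bound 1/(k² + (η − kt)²) ≤ 4 ⟨η⟩²/⟨t⟩² on a finite grid. The constant 4 and the function names are choices made for this illustration, not taken from the paper.

```python
import math


def bracket(x: float) -> float:
    """Japanese bracket <x> = sqrt(1 + x^2)."""
    return math.sqrt(1.0 + x * x)


def inv_delta_L_symbol(k: int, eta: float, t: float) -> float:
    """Absolute value of the inverse symbol, 1 / (k^2 + (eta - k t)^2),
    assuming the linearized Laplacian has symbol -(k^2 + (eta - k t)^2)."""
    return 1.0 / (k * k + (eta - k * t) ** 2)


def worst_ratio(ks, etas, ts) -> float:
    """Largest value on the grid of
    inv_delta_L_symbol(k, eta, t) / (<eta>/<t>)^2."""
    worst = 0.0
    for k in ks:
        for eta in etas:
            for t in ts:
                lhs = inv_delta_L_symbol(k, eta, t)
                rhs = (bracket(eta) / bracket(t)) ** 2
                worst = max(worst, lhs / rhs)
    return worst
```

At a resonant time t = η/k the inverse symbol equals 1/k² and carries no decay in t by itself; the factor ⟨η⟩² on the right-hand side is the regularity paid to recover the ⟨t⟩⁻² decay.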

Proposition 5.2. Under the bootstrap hypotheses, for small enough,
Also the following inequality holds true
Proof. The proof of (5.3) is exactly the same as the one of [8, Proposition 2.4], up to a replacement of with (and the use of | |≤| | ≲ when needed), so it is omitted. In turn, we present the detailed proof of (5.4) which, although heavily inspired by [8, Proposition 2.4], shows some substantial differences. Taking the Fourier transform of (5.2), we have Multiplying by Now, define the following multipliers Hence, from (5.8) we have where the last inequality relies on the fact that ˜≤ . To prove (5.4), we need to control and ′′ . Taking into account the decoupling with respect to the frequencies, we make a paraproduct decomposition of and ′′ only in the variable as = + + , ′′ = ′′ + ′′ + ′′ , with the notation introduced in (1.10).
We start with the low-high interactions. On the support of the integral | | ≈ | |. From the paraproduct decomposition in (1.10), | − | ≤ 3∕16| |, so that Lemma 6.3 applies and e | , | ≤ e | , | + | − | for some ∈ (0, 1). In addition, since every term of (5.10) has the same horizontal frequency , we can appeal to (4.21), (4.17), thus obtaining that Altogether, since in (2.29) is bounded, this implies that Turning to  ( ), we use the same frequency exchanges as before, together with (4.12), to get where in the last line we used that 1 2 ≤ . Hence, we can absorb this term into the left-hand side of (5.9) provided is sufficiently small. The bound for the term with ′′ is analogous and we omit it.
• Bounds on and ′′ . Exchanging the role of − and in (5.10), on the support of the integrand we have | | ≈ | |. Notice that Ψ could be at high frequencies in ; therefore, we further split these terms as follows where ∈ { , ′′ }. When 16| | > | |, we claim that there is some ∈ (0, 1) such that This can be proved thanks to (4.8), by considering separately the cases Thus, we can use (4.9) which gives (5.11) and argue as in the low-high case before. 
Indeed, by (5.11) we can pay regularity on , ′′ and conclude by applying Young's convolution inequality. This way, which can be absorbed in the left-hand side of (5.9).
We now turn our attention to the terms where 16| | ≤ | |. Here, since the coefficients are at high frequencies, we cannot absorb these terms into the left-hand side, but we can exploit the integrability properties of Ψ. The most difficult term is ′′ , since, in view of the bounds (2.44)-(2.45), we need to recover some derivatives for ′′ . This term is explicitly given by On the support of the integrand | − | ≤ In addition, from 16| | ≤ | | we also get ≲˜. It is now crucial to exploit the definition of the weight given in (2.33). In particular, if ∈ , , by (4.19), from Remark 4.4 and Lemma 4.6, we deduce that and ≲ when 16| | ≤ | |. Therefore, appealing again to (4.12)-(4.17), in general we have where we have absorbed all the low-frequency Sobolev regularity in the exponential term. As remarked, we need to bound ⟨ ⟩ −1 ′′ , while up to now we have only recovered half a derivative. If | | ≤ | |, observe that When | | ≥ | |, since > 1∕2, we argue as follows Combining the bounds (5.15)-(5.16) with (5.14) and since  (( − )Ψ) ≤ 1 2Ψ , we have that Then, using Proposition 5.1 and (2.41) we get Hence, from (5.17) and (5.18) we obtain
The control of , when 16| | < | | follows by a similar argument. The only difference is the case ∈ , . Indeed, we do not have to recover derivatives for but extra time decay is necessary because one has to deal with the analogue of (5.18) with 1 2Ψ replaced withΨ. To overcome this problem, it is enough to split according to the relative size of with respect to . When | | ≤ | |∕2, we have that When | |∕2 ≤ | | ≤ 2| | or | | ≥ 2| |, we can exchange the factor | | −1∕2 in (5.14) with ⟨ ⟩ −1∕2 . Therefore, we always recover a factor −1∕2 which is necessary to close the estimate. In particular, one has
• Bounds on and ′′ . For these terms it is easy to show Finally, to prove (5.5), for the first term we can follow the arguments used to prove (5.4), since no specific properties of |∇| ∕2 ∕⟨ ⟩ or ∕ have been used. 
Analogously, the proof for the second term is obtained from the arguments for (5.3). The bound then follows by the bootstrap hypotheses (H1)-(H3).

BOUND ON THE ENERGY FUNCTIONAL
In this section, we aim at proving the first part (B1) of the bootstrap Proposition 2.8. In general, we have to estimate nonlinear terms of the type ⟨ ⋅ ∇ , ⟩. To do so, we use a paraproduct decomposition, see (1.10), where we decompose the nonlinear term into transport, reaction and remainder contributions (with terminology from [8, 50]). If we write, for example, it is important to note that on the support of the integral we have In particular, if (6.2) holds, thanks to Lemma 4.6 we have In what follows, = ( , 0 , ) ∈ (0, 1) will denote a generic constant, independent of , . It will be mainly used in terms of the form e | − , − | to absorb Sobolev or exponential weights such as the one in (6.3).
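The splitting into transport, reaction and remainder contributions can be illustrated on a one-dimensional periodic toy problem. The sketch below is a dyadic (Bony-type) paraproduct acting on Fourier coefficients stored in dictionaries. It is a stand-in under simplifying assumptions: the paper's decomposition (1.10) uses its own cutoffs in two frequency variables, and the names `block`, `paraproduct` and `product` are hypothetical.

```python
from collections import defaultdict


def block(k: int) -> int:
    """Dyadic block index: -1 for k = 0, else j with 2**j <= |k| < 2**(j+1)."""
    return -1 if k == 0 else abs(k).bit_length() - 1


def paraproduct(u: dict[int, complex], f: dict[int, complex]):
    """Split the Fourier coefficients of the product u*f into three pieces.

    transport: u at low frequency, f at high frequency
    reaction:  u at high frequency, f at low frequency
    remainder: u and f at comparable frequencies
    """
    transport = defaultdict(complex)
    reaction = defaultdict(complex)
    remainder = defaultdict(complex)
    for k1, a in u.items():
        for k2, b in f.items():
            gap = block(k1) - block(k2)
            if gap <= -2:
                target = transport
            elif gap >= 2:
                target = reaction
            else:
                target = remainder
            # The product of modes k1 and k2 lands on frequency k1 + k2.
            target[k1 + k2] += a * b
    return dict(transport), dict(reaction), dict(remainder)


def product(u: dict[int, complex], f: dict[int, complex]) -> dict[int, complex]:
    """Fourier coefficients of the full product (discrete convolution)."""
    out = defaultdict(complex)
    for k1, a in u.items():
        for k2, b in f.items():
            out[k1 + k2] += a * b
    return dict(out)
```

Every pair of input frequencies falls into exactly one of the three pieces, so the pieces sum to the full product. In the transport piece the output frequency is comparable to that of the transported factor, which is what allows weights to be moved between the two with a controlled error.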
We also need to distinguish between short (S), intermediate (I) and long (L) times via the cutoffs

The energy inequality
Recalling the definition of given in (2.34), we obtain the following result.
Lemma 6.1. For every ≥ 0 we have the energy inequality where the [⋅] are defined in (2.36) and the error terms are given by Proof. The proof follows from the cancelations observed in [11] together with the definition of . Commutators have been introduced to better handle the transport structure. We recall briefly from [11] that the Miles-Howard condition arises from using | | ≤ 2| |

Enumerating nonlinear terms
We decompose the nonlinear term , in (6.7) as in (6.1), where the transport nonlinearity is Since | | ≤ 4 | | 1 2 for > 1∕2, it is enough to show how to deal with Ω,1 and Θ,1 , as Ω,2 and Θ,2 are completely analogous ( will be dealt with separately). Similarly, the reaction nonlinearity is given by (6.14) Finally, the remainder reads as In this section we prove the following.

Transport nonlinearities
In this section, we control the transport nonlinearities, defined in Section 6.2, to prove (6.16). First, by the bootstrap hypothesis (H2), Proposition 5.1 and (H4) we get (6.20) As mentioned already, we present only the proof for the terms Ω,1 , Θ,1 and .
• Bound on Ω, . Writing down this term and using that | | ≈ 1, we have where we define We claim that  ,1 and  ,2 are bounded in a way that is consistent with (6.16). To control  ,1 , by the elementary identity , we deduce For  ,2 , by the mean value theorem, there is ′ between and such that Therefore, the most dangerous term will appear in  ,1 , since there is a loss of order | |∕| |.
Hence, we only deal with  ,1 . By means of (6.22) and (4.21), we have . (6.23) We now have to consider different cases, depending on intermediate, short and long times.
• Bound on . Turning to , we write the commutator in each of its components as and bound where That  1 and  2 satisfy bounds that comply with (6.16) follows from an argument analogous to [8,Section 5], thanks to (4.20) and the fact that 1 2 ≤ . We therefore only focus on the more problematic term  . It is convenient to split  as where the difficult domain is defined as which is consistent with (6.16). We now turn our attention to the terms where ≥ The term with ∈ is the most delicate one. In this case, we cannot gain anything from the commutator. Notice that in this interval we may have a loss in the bound (4.19) and (4.15). Combining (4.19) with (4.15) and Lemma 4.8 we get where in the last line we have used that ≈ | ∕ |. Then, using (4.7) and (5.1) we have which works well with (6.16). For the remaining terms we need to consider two subcases, namely | | > 100| | and | | < 100| |. In the first scenario, using (4.7) and (4.20), we can repeat the argument in [8, Section 5] and obtain When | | ≤ 100| |, we can again ignore any gain from the commutator. Indeed, for the terms we are considering we can always apply (4.17). Then, if ∈ , ∩ , , by (4.19) and since where the last equality follows by trigonometric identities. Now we claim that can now obtain an estimate as in (6.23) and argue as done to control  ,1 to get The terms Θ, and Θ, can be bounded with exactly the same arguments used for Ω, and Ω, . The term Θ, is equivalent to Ω, with the role of ( , ) and ( , ) switched. Just notice that the extra factor of out of the commutator can be easily moved onto the high-frequency part by paying Sobolev regularity on . In addition, we need to replace the bounds on Ω with the ones for ∇ Θ. We do not detail more the bounds for these three terms. On the other hand, we present the bounds for Θ, . Here, we again need to use the bounds available for ∇ Θ.
• Bound on Θ, . Writing explicitly this term in the Fourier space and using that ≈ 1 we have Again, our goal is to bound the above term as in (6.16).
Besides the distinction between intermediate, short and long times, among the intermediate times we need to separate the resonant (R) versus the non-resonant (NR) interactions. As we shall see, the hardest terms to treat are those of the form Ω, in (6.10)-(6.11), on which the toy model has been constructed. The terms Θ, in (6.12)-(6.13) will be simpler to handle. The goal of this section is to prove that the reaction term satisfies the bound (6.17). Recall that throughout this section, on the support of the integral we have (6.2).

6.4.1 Bound on Ω,
The bounds for Ω,1 and Ω,2 are analogous since | |∕(| | 1 2 ) ≤ 2. We will then consider just the first one. We split this term as where we use that = ′ ∇ ⟂ Ψ ≠ + (0, v̇) and we define The main contribution will be the one given by Ω ,Ψ and the term Ω , can be considered, roughly speaking, as a perturbation of it. The term Ω , comes from the commutator which we had to introduce to deal with the transport nonlinearities. When the velocity is at high frequencies, we do not need to gain anything from the commutator and we can deal with this term separately.
• Bound on Ω ,Ψ . In view of the notation introduced in (6.36), we split the term as . First of all, observe that on the support of the integral, we have Using this and appealing to (4.14), (6.39) we deduce Since ∈ , , we observe that Appealing to Lemma 4.8 and the fact that Combining the inequality above with (4.11), using the bootstrap hypothesis (H2) and the elliptic estimate (5.4) we get Appealing to (4.22), the bootstrap hypothesis (H2) and (5.4), from the inequality above we then deduce that as needed for (6.17). ⋄ Bound on Ω,( , ) ,Ψ . When ∈ , ∩ , we necessarily have 4| | ≤ | | and 4| | ≤ | | Indeed, Then, on the support of the integrand | , | ≈ | , | so that | | ≈ | |. Therefore we can apply the trichotomy Lemma 4.5. If case (b) holds, then we repeat the same argument done for Ω, ( , ) ,Ψ and we omit the details. If we are in the case (a), namely = , then appealing to (4.21) and (4.17) we get Now observe that since ≥ 2 max{ √ | |, √ | |}, (4.10) implies We also have ( ) ≲˜( ) and ( ) ≲˜( ). Using (4.11) and (H2) we deduce In case (c) of Lemma 4.5 one has Therefore we can repeat the same argument as above.
as we wanted.
Hence, by the definition (6.37) we have One would like to directly treat these terms as a -perturbation of Ω ,Ψ , however, this is not true in general. More precisely, as done in the proof of Proposition 5.2, we first consider the following paraproduct decomposition in the -variable (since ′ does not depend on ): has the cut-off = | |≤16| ′ | . With a slight abuse of notation we omitted the subscript .
⋄ Coefficients in (relatively) low frequencies. In the proof of Proposition 5.2 we have seen that we can treat in the same way the low-high case or high-low with | | ≥ 16| |. This is because we can always pay derivatives on the coefficients. Therefore, the most problematic term will be 1 , since more derivatives are hitting Ψ. However, the case under consideration can be treated by reasoning as done for the term Ω ,Ψ . More precisely, we first claim that the following inequality holds true where ( , ) = | |≥16| | . Indeed, first observe that since does not depend on , the factor | | 1 2 is always on the stream function. To prove (6.45), one can always use (4.21), (4.17) to move the multipliers onto Ψ by paying regularity on ℎ. We omit the details of this argument since it has been done in the proof of Proposition 5.2. Hence, from Young's convolution inequality, the bootstrap hypothesis (H3) and Proposition 5.2 we infer where in the last inequality we used (5.3). Using the bounds done for Ω ,Ψ , we see that also in this case we get a bound consistent with (6.17).
⋄ Coefficients (truly) at high frequencies. We now have to deal with the high-low case when | | ≤ 16| |. In this case, as evident from 2 , we need to recover some derivative for the term ℎ. We will treat only 2 , since the term 1 , is analogous in this high-low regime. Writing down the term explicitly one has 2, , where is the cut-off of the paraproduct decomposition as defined in (1.10). We now have to exploit the fact that we control the coefficients with a stronger norm. More precisely, by the definition of , see (2.33), reasoning as in (5.12)-(5.13) we have where we also used the fact that ≲˜ when | | ≤ 16| ′ |. Hence, since > 1∕2, appealing to Proposition 5.1 and the bootstrap hypothesis (H2), we infer .
• Bound on Ω , v̇. To control this term, we are going to exploit the consequences of the bootstrap hypotheses (2.46)-(2.48). Indeed, we recall that we control v̇ via (H4) and bounds on . Notice that we need to recover -derivatives but this will be balanced by the extra decay in time available for  (or v̇). From (6.38) we have ,̇( ∈ , ∩ , + (1 − ∈ , ∩ , )).
Since Ω is at low frequencies we can always move all the factors | | to this term. Here, the most dangerous case is when ∈ , ∩ , . Indeed, we know that v̇ always has non-resonant regularity since does not depend on , whereas the weight is at resonant regularity. Due to the regularity gap between and , we will lose 1∕2 derivatives in . In particular, when ∈ , ∩ , and 2 max{ √ | |, √ | |} ≤ ≤ 2 min{| |, | |}, from (4.15), Lemma 4.8 and the definition of the weight , see (4.5), we have Since ∈ , (and < 1) we know that | | 1∕2+ ∕| | 3∕2 ≲ 1∕2+ . Hence, combining the inequality above with (4.11) and (2.48) we get as required by (6.17). For the remaining term, thanks to (4.17) and (4.13) we know that we never lose derivatives from ( )∕ 0 ( ). In particular, we have This way, we conclude that
• Bound on Θ ,Ψ . Following the notation introduced in (6.36), we split the term as We now control each term separately.
• Bound on Θ 0 ,Ψ . For this term, since Ψ is at the same frequency , we can move the multiplier 1 4 without losing derivatives in the high-frequency part. In addition, we only need to recover one derivative in . Observe that, appealing to (4.21) and (4.17), we have We are then left with the high-low case when the coefficients are truly at high frequencies. In analogy with the notation used in (6.46) we have to control For this term we can always move derivatives in onto the stream function. Then, from the definition of the weight , see (2.33), since 16| | ≤ | ′ | we know that on the support of the integral ( ) ≲ ( ) ≲ ( ′ ). Hence, appealing to (4.21) and (4.17) we have Since in general we do not have ≠ , we can only use the worst bound (6.48). However, combining the two bounds above with the bootstrap hypotheses and Proposition 5. ,̇( ∈ , ∩ , + (1 − ∈ , ∩ , )).
This concludes the proof of (6.17).
Hence, we can always pay regularity to move the multipliers. Arguing as in [8,Section 7], we deduce that
For ℎ

Control of 
To complete the proof of (B3), we start from the energy inequality where [⋅] is defined in (2.36) and the transport and forcing terms are given respectively by In this case  in (8.7) is similar to the transport terms of Section 6; in (8.8) describes the nonlinear feedback of the non-zero frequencies onto the zero one. Bounds on  are obtained as for (8.5), giving .

(8.9)
We focus on the forcing term, which contains ′ = 1 + ( ′ − 1), so that we can write = 0 + , where As argued in [8, Section 8], it is enough to consider 0 , treating separately the low-high, high-low and remainder interactions, namely 0 = 0 + 0 + 0  , with There are various similarities with [8, Section 8] in the treatment of all the non-resonant contributions, as the weight in our case is comparable to that in [8]. In particular, as in [8], appealing to (4.22) and the usual arguments for short and long times, taking the case of 0