[7] To build a non-orographic GW parameterization, we generally assume that disturbances with horizontal scales below some subgrid scales Δ*x*, Δ*y* cannot be explicitly resolved by the model and need to be parameterized. GW theory also indicates that these disturbances have life cycles whose duration Δ*t* can be around 1 day. This rough estimate corresponds, for instance, to the time a mid-frequency GW takes to travel through the neutral atmosphere. It is also roughly the characteristic time scale of the peaks in GW drag measured during field experiments [*Scavuzzo et al.*, 1998]. Therefore, it seems a priori reasonable to represent the unresolved GWs at each grid point by a spectrum specified via a triple discrete Fourier series, in time and in the two horizontal directions, over the subgrid-scale volume Δ*x*Δ*y*Δ*t*. In reality, however, none of these scales is well known: while Δ*x* and Δ*y* may be comparable to the model grid scales *δx* and *δy*, the temporal scale Δ*t* can largely exceed the time step *δt* of the model. This, added to the uncertainties about the mesoscale dynamics that produce the waves, suggests that a stochastic formalism is better suited than pure Fourier series. We will therefore consider that at each time *t* the vertical velocity field can be represented by a sum of GWs *w*′_{n},

where the *C*_{n}'s are normalization coefficients such that

Up to this point, this representation is very close to the Fourier formalism, which can be recovered by a suitable choice of the *w*′_{n}. Nevertheless, and for the reasons given above, we will choose them partly at random.

[9] In the following, we apply this formalism to a very simple multiwave parameterization. To specify the *w*′_{n} we actually consider monochromatic waves,

where the wavenumbers *k*_{n}, *l*_{n}, and the frequency *ω*_{n} are chosen randomly. In (3), *H* = 7 km is a characteristic vertical scale of the middle atmosphere and *z* is the log-pressure altitude, *z* = *H* ln(*P*_{r}/*P*), where *P*_{r} = 1023 mb. To evaluate the vertical profile of each wave, we impose its amplitude randomly at a given launching altitude *z*_{0} and then iterate from one model level, *z*_{1}, to the next, *z*_{2}, using a WKB approximation,

In (4), we have dropped the index *n* for conciseness, *m* is the vertical wavenumber, and the minus sign in the exponential ensures that the wave propagates upward. Still in (4), we have explicitly introduced a constant vertical viscosity *μ* acting on the GWs only. It controls the vertical distribution of the GW drag near the model top. The efficiency of the dissipative attenuation in (4) is actually related to the kinematic viscosity *ν* = *μ*/*ρ*, which increases rapidly with altitude since the density *ρ* = *ρ*_{r}*e*^{−z/H} decreases with altitude, where *ρ*_{r} = 1 kg m^{−3}. We have also made the hydrostatic approximation, and we will take the WKB non-rotating approximation for *m* in the limit *H* → ∞,

In (5), Ω is the intrinsic frequency, measured relative to the large-scale horizontal wind, and *N* is the Brunt-Väisälä frequency. We then follow *Lindzen* [1981] and limit the prediction in (4) to amplitudes that do not exceed the breaking amplitude *w*_{s},

or to zero when Ω changes sign, to treat critical levels. In (6), the amplitude *w*_{s} is that beyond which the waves convectively overturn, and the last term on the right accounts for the fact that each individual wave is not supposed to occupy the entire domain Δ*x*Δ*y*, but only a fraction of it. We consider that this fraction is related to the ratio between a minimum horizontal wavenumber (for instance *k*^{∗} ≈ 1/ ) and the amplitude of the wave horizontal wavenumber. From the WKB expression (4) and the polarization relations (not shown), we can deduce the EP flux:

It does not vary with altitude if we take for the wave amplitude its WKB approximation in (4), but it varies if we take the saturated value in (6). To treat a large number of waves at a given time *t*, we launch a finite number of waves *M* at each time step *δt* and compute the tendency due to each of them, for *n*′ = 1, …, *M*. As they are independent realizations, the averaged tendency they produce is the average of these *M* tendencies. We then redistribute this averaged tendency over the longer time scale Δ*t*, first by rescaling it by *δt*/Δ*t* and second by using a first-order autoregressive (AR-1) relation between the GW tendencies at two successive time steps:

In other words, at each time step we promote *M* new waves by giving them the largest probability to represent the GW field, and we degrade the probabilities of all the others by the multiplicative factor (Δ*t* − *δt*)/Δ*t*. If we write out the cumulative sum underlying the AR-1 relation in (8), we recover the formalism for an infinite superposition of stochastic waves in (1) by taking

where *p* is the integer part of (*n* − 1)/*M*, that is, the largest integer not exceeding it.
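
The redistribution over Δ*t* described above can be illustrated with a minimal Python sketch. It assumes the AR-1 update takes the form "previous tendency times (Δ*t* − *δt*)/Δ*t*, plus *δt*/Δ*t* times the average of the *M* new tendencies", which is how the text describes it. The function names `ar1_update` and `implied_weights`, the variable names, and the numerical values are hypothetical and chosen only for illustration. The second function lists the weight each wave then carries in the cumulative sum, with waves launched *p* steps in the past sharing the same weight.

```python
import numpy as np


def ar1_update(prev_tendency, new_wave_tendencies, dt_model, dt_gw):
    """Advance the GW tendency by one model time step (assumed AR-1 form).

    prev_tendency       : tendency retained from the previous time step
    new_wave_tendencies : tendencies of the M waves launched at this step
    dt_model, dt_gw     : model time step (delta t) and GW time scale (Delta t)
    """
    decay = (dt_gw - dt_model) / dt_gw            # factor degrading older waves
    fresh = np.mean(new_wave_tendencies, axis=0)  # average over the M new waves
    return decay * prev_tendency + (dt_model / dt_gw) * fresh


def implied_weights(n_steps, n_waves, dt_model, dt_gw):
    """Weight of each wave in the cumulative sum after n_steps time steps.

    Waves are ordered from the most recent launch (p = 0) to the oldest,
    and the M waves of one launch share the same weight.
    """
    decay = (dt_gw - dt_model) / dt_gw
    p = np.arange(n_steps)                        # launch age in time steps
    per_wave = (dt_model / (n_waves * dt_gw)) * decay**p
    return np.repeat(per_wave, n_waves)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dt_model, dt_gw, n_waves = 1800.0, 86400.0, 8  # illustrative values only
    tendency = 0.0
    for _ in range(2000):
        # Random numbers stand in for the M individual wave tendencies.
        tendency = ar1_update(tendency, rng.normal(size=n_waves),
                              dt_model, dt_gw)
    print("tendency after 2000 steps:", tendency)
    print("sum of weights:",
          implied_weights(2000, n_waves, dt_model, dt_gw).sum())
```

Under these assumptions the weights form a geometric series whose sum tends to 1 as the number of time steps grows, so the scheme acts as a weighted average over an ever-growing population of waves with a memory of order Δ*t*.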