A Baseline for Global Weather and Climate Simulations at 1 km Resolution

In an attempt to advance the understanding of the Earth's weather and climate by representing deep convection explicitly, we present a global, four‐month simulation (November 2018 to February 2019) with ECMWF's hydrostatic Integrated Forecasting System (IFS) at an average grid spacing of 1.4 km. The impact of explicitly simulating deep convection on the atmospheric circulation and its variability is assessed by comparing the 1.4 km simulation to the equivalent well‐tested and calibrated global simulations at 9 km grid spacing with and without parametrized deep convection. The explicit simulation of deep convection at 1.4 km results in a realistic large‐scale circulation, better representation of convective storm activity, and stronger convective gravity wave activity when compared to the 9 km simulation with parametrized deep convection. Comparison of the 1.4 km simulation to the 9 km simulation without parametrized deep convection shows that switching off deep convection parametrization at a too coarse resolution (i.e., 9 km) generates too strong convective gravity waves. Based on the limited statistics available, improvements to the Madden‐Julian Oscillation or tropical precipitation are not observed at 1.4 km, suggesting that other Earth system model components and/or their interaction are important for an accurate representation of these processes and may well need adjusting at deep convection resolving resolutions. Overall, the good agreement of the 1.4 km simulation with the 9 km simulation with parametrized deep convection is remarkable, despite one of the most fundamental parametrizations being turned off at 1.4 km resolution and despite no adjustments being made to the remaining parametrizations.


Introduction
One of the key sources of model error in weather forecasts and climate projections is limited spatial and temporal resolution. The Earth system exhibits variability on a wide range of temporal and spatial scales, from the microphysics of clouds up to the synoptic scale of weather systems. In order to remain computationally feasible, Earth system models truncate the governing equations temporally and spatially. Any processes occurring on scales finer than the truncation scale are represented by "parametrization schemes," which describe the statistical effect of subgrid-scale processes on the mean flow, expressed in terms of the resolved-scale parameters. However, the uncertainty in the key parameters underlying such parametrizations and the simplifying assumptions made when deriving them mean that some fundamental processes that drive the global circulation are imperfectly represented in weather and climate models.

10.1029/2020MS002192

Key Points:
• A unique simulation with 1.4 km average grid spacing is presented for model development and process evaluation
• The 1.4 km simulation shows remarkable fidelity with respect to the well-calibrated simulation at 9 km with parametrized deep convection
• Switching off deep convection at a too coarse resolution (9 km) generates too strong convective gravity waves

Whereas limited-area numerical weather prediction (NWP) models and some regional climate downscaling models are able to explicitly simulate convection (e.g., Leutwyler et al., 2017; Prein et al., 2015; Termonia et al., 2018), global NWP models, which currently use grid spacings of 9 km or coarser, and global climate models, which typically use 25-200 km grid spacing, rely on convection parametrizations (e.g., Bechtold et al., 2014). If global models were able to resolve convection explicitly, it is hypothesized that their accuracy would improve (e.g., Palmer & Stevens, 2020; Satoh et al., 2019; Stevens et al., 2019).
Thanks to increases in computing power and advances in scalability (e.g., Bauer et al., 2020), storm-resolving simulations with 3-5 km grid spacing in which deep convection is explicitly simulated (but not necessarily fully resolved) have become possible. A comprehensive overview of the history of global storm-resolving models can be found in Satoh et al. (2019), and only a few examples are highlighted here. Pioneering global storm-resolving simulations down to 3 km grid spacing were first performed as part of the Nonhydrostatic ICosahedral Atmospheric Model (NICAM) project (Fudeyasu et al., 2008; Miura et al., 2007; Satoh et al., 2008; Tomita et al., 2005). Recently, the first intercomparison of global storm-resolving simulations was performed in the framework of the DYnamics of the Atmospheric general circulation Modeled On Non-hydrostatic Domains (DYAMOND; Stevens et al., 2019) project, in which nine atmospheric models were integrated for 40 days with grid spacings between 2.5 and 4 km. In a follow-up initiative, DYAMOND-Winter, the first coupled storm-resolving simulations are being produced and compared on time scales from months to years.
The storm-resolving simulations performed so far have given support to the hypothesis that resolving deep convection explicitly improves model accuracy. For example, the representation of the Madden-Julian Oscillation (MJO; Madden & Julian, 1971, 1972) and tropical precipitation improved in global convection-permitting NICAM simulations with 3.5 km grid spacing (Miura et al., 2007; Miyakawa et al., 2014; Tomita et al., 2005). By performing storm-resolving simulations with approximately 3 km grid spacing over large domains, Stevens et al. (2020) found improvements to the distribution of precipitation, in particular over tropical oceans. Using global models with grid spacing down to 2.5 km, Hohenegger et al. (2020) found improvements to the net shortwave radiation, which increase with resolution due to the reduction in low cloud amount over the subtropical oceans. The role of the level of parametrized convection in the context of climate change feedbacks was recently explored by Retsch et al. (2019). They performed global aqua-planet simulations with grid spacings between 2.5 and 5 km, finding a remarkably stable climate sensitivity regardless of the resolution.
However, at 3-5 km grid spacing, deep convection is unlikely to be fully resolved. For example, by performing short 12 hr NICAM simulations at subkilometer resolution, Miyamoto et al. (2013) found that the nature of simulated convection changed significantly between 3.5 and 1.7 km resolutions. In idealized squall line simulations with a limited-area model, Weisman et al. (1997) found an overestimation of convective mass flux and precipitation for grid spacings coarser than 4 km. Therefore, to demonstrate the impact of resolving deep convection globally and to assess if this can address the uncertainty in cloud-radiative feedback on climate change, longer, coupled global simulations at horizontal resolutions finer than 2-3 km are desirable.
Global simulations at a grid spacing of around 1 km, at which deep convection is expected to be fully resolved, may become possible for routine use in the next decade, provided significant progress is made in model development and the efficient use of modern, heterogeneous hardware (Neumann et al., 2018; Schulthess et al., 2019). In addition to Miyamoto et al. (2013), some short simulations at these resolutions have already started to emerge; for example, Fuhrer et al. (2018) presented a near-global simulation at 930 m grid spacing with the COSMO model. A simulation of the resolution and length presented here is unprecedented and requires substantial computing resources.
Here, we use Summit, the fastest supercomputer in the world as of November 2019, accessed through an Innovative and Novel Computational Impact on Theory and Experiment (INCITE) award.
Our choice of IFS model configuration follows earlier sensitivity experiments described in Dueben et al. (2020). Notably, we perform all simulations using the hydrostatic approximation. Although it is commonly believed in dynamical meteorology that the hydrostatic assumption should break down at resolutions as high as used here, the nonhydrostatic configuration of the IFS is simply too computationally expensive for performing a seasonal simulation. However, the hydrostatic configuration of the IFS still performs well even at 1.4 km grid spacing. The importance of nonhydrostatic effects remains an open question, and we hope the simulation presented here can provide a baseline (building on over 40 years of NWP experience) against which future nonhydrostatic simulations can be compared.
The paper is organized as follows. Section 2 describes the model setup, and section 3 highlights the technological/computational breakthrough of the 1.4 km simulation. The realism of this simulation, which was performed with no changes to the operational IFS (apart from the resolution), is explored in section 4. To do this, we compare the results to a simulation performed with the well-tested and calibrated version of the IFS at 9 km resolution and to the latest reanalysis of the Copernicus Climate Change Service, ERA5 (Hersbach et al., 2020), that is, the most accurate reconstruction of the atmospheric state, created by blending short-range forecasts and observations with ECMWF's NWP system. Demonstrating the realism of the 1.4 km seasonal simulation can be considered a breakthrough in itself, given that for this initial attempt at producing such a simulation, no changes were made to the IFS to adapt it to these scales (e.g., turbulence, microphysics, and shallow convection parametrizations; land-atmosphere coupling; and land surface description may all need adapting to fully exploit the benefits of the resolution increase). In section 4 we also explore whether, even in this first ever attempt at a seasonal integration, we can see glimpses of what such a high resolution could bring for aspects that are not well represented at current resolutions, that is, convective storms and convectively generated gravity waves (GWs). Finally, conclusions are drawn in section 5.

Model Setup
We compare the global 1.4 km simulation with explicit deep convection (i.e., deep convection parametrization is switched off) to a simulation performed with the default version of ECMWF IFS with deep convection parametrization (Bechtold et al., 2014; Tiedtke, 1993) at 9 km grid spacing, the resolution of ECMWF's operational high-resolution 10 day forecasts. To make sure that the differences between the 1.4 and 9 km simulations are not merely due to the deep convection parametrization being switched off in the 1.4 km simulation, we also perform a 9 km simulation with explicit representation of deep convection. We will refer to the 9 km simulation with parametrized deep convection as "9 km parametrized" and to the one without parametrized deep convection as "9 km explicit." All three simulations are performed with the global atmospheric model of the IFS (based on Science Version 45r1; ECMWF, 2020). The 1.4 km simulation uses 7,999 wave numbers in the spherical harmonic expansion and a cubic-octahedral (TCo) grid, resulting in the highest resolution ever used for a global spectral atmospheric model. The 9 km simulations use 1,279 wave numbers, also on a TCo grid. The simulations make use of the fast Legendre transform method (Wedi et al., 2013). All simulations are uncoupled (no ocean or wave model), have 137 vertical levels distributed between the surface and 0.01 hPa (80 km), and use single precision arithmetic (Vana et al., 2017).
The IFS is run at full complexity with the configuration used for operational weather forecasts at ECMWF, except that the deep convection parametrization is switched off in the 1.4 and 9 km explicit simulations. A detailed description of the parametrization package can be found in the IFS documentation available online (ECMWF, 2020). In brief, turbulent diffusion and exchange with the surface are represented by Monin-Obukhov similarity theory in the surface layer and an Eddy-Diffusivity Mass-Flux (EDMF) framework above it. The IFS further includes a mass-flux shallow-convection scheme; a multilayer, multitiled land-surface scheme (HTESSEL); a five-species cloud microphysics scheme; and shortwave and longwave radiation schemes including cloud-radiation interactions. Orography and land use fields are specially prepared for the 1.4 km resolution. The subgrid-scale orographic and nonorographic GW drag parametrizations are both designed to diminish with increasing horizontal resolution and have zero contribution at 1.4 km grid spacing. The turbulent orographic form drag remains active in all simulations.
All simulations are integrated for 4 months and initialized on 1 November 2018 00 UTC from the ECMWF operational analysis (at 9 km horizontal resolution), suitably interpolated horizontally using the integrated interpolation and postprocessing software of Arpege/IFS (https://www.umr-cnrm.fr/gmapdoc). The 1.4 km simulation uses a time step of 60 s, with full radiative transfer calculations every 30 min on the 1.4 km grid. The 9 km simulations use a time step of 450 s, with full radiative transfer calculated hourly on a reduced grid with a grid spacing of 25 km. There is only a single time step advancing all (physics and dynamics) processes in the model, apart from radiation. The simulations are forced by the 1/20° OSTIA sea surface temperature (SST) data and sea ice (SI) concentrations (Donlon et al., 2012), regridded to the target resolution using ECMWF's SST/SI analysis software.
For the analysis in section 4 we truncate spectral fields in the 1.4 km simulation at the total wave number l = 1,279. This is because handling output from 7,999 wave numbers poses its own challenges. For grid-point fields, such as precipitation, we conservatively interpolate the field to the TCo1279 grid with 9 km grid spacing. Furthermore, zonal-mean fields are interpolated to the 0.5° × 0.5° grid in order to compare to ERA5 reanalysis, which is produced on a TL639 grid with 31 km grid spacing.
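Conceptually, the spectral truncation applied here amounts to discarding all spherical-harmonic coefficients above the cutoff total wave number. A minimal Python sketch (not the IFS post-processing itself; the flat coefficient layout is a simplification for illustration):

```python
import numpy as np

def truncate_spectrum(coeffs, l_values, l_max=1279):
    """Zero all spherical-harmonic coefficients with total wave number l > l_max.

    coeffs   : 1-D complex array of spectral coefficients
    l_values : 1-D int array, total wave number l of each coefficient
    """
    out = coeffs.copy()
    out[l_values > l_max] = 0.0
    return out

# Toy example: coefficients up to l = 3, truncated at l = 1.
c = np.array([1.0, 2.0, 3.0, 4.0], dtype=complex)
l = np.array([0, 1, 2, 3])
print(truncate_spectrum(c, l, l_max=1))  # [1.+0.j 2.+0.j 0.+0.j 0.+0.j]
```

Transforming the truncated coefficients back to grid-point space then yields fields directly comparable to the 9 km output.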

I/O and Scalability
The default postprocessing frequency with output at full simulation resolution (1.4 or 9 km) is 3 hr, storing a wide range of surface, pressure, and model level fields (≈450 TB for the 1.4 km simulation and ≈10 TB for each 9 km simulation). We have also stored restart files (1.7 TB each) every 2 days to have the possibility to perform additional simulations with high-frequency output in the future. Simulations have been conducted at a speed of one simulated week per day (and per job submission) using 960 Summit nodes with 5,760 MPI tasks × 28 threads (SMT4), which translates into 5.4 s per model time step. Other technical details pertaining to the model output can be found in the supporting information.
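The quoted throughput figures are mutually consistent, as a quick back-of-the-envelope check shows (one simulated week at a 60 s model time step, with 5.4 s of wall-clock time per step):

```python
dt_model = 60.0           # model time step (s)
wall_per_step = 5.4       # wall-clock seconds per model time step
sim_span = 7 * 86400.0    # one simulated week (s)

n_steps = sim_span / dt_model              # model time steps per job
wall_hours = n_steps * wall_per_step / 3600.0
print(int(n_steps), round(wall_hours, 1))  # 10080 15.1
```

Each one-week job therefore occupies roughly 15 hr of wall-clock time, which fits the one-job-submission-per-day cadence described above.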
The scalability of the IFS atmospheric model has been tested on Summit CPUs and compared with results on PizDaint, the fastest European supercomputer (Top500, November 2019). This is shown in Figure 1. To meaningfully compare to PizDaint results, which were obtained with 62 vertical levels, the scalability simulations on Summit were also performed with 62 vertical levels at a time step of 120 s and without model I/O. However, for the rest of the paper results with 137 vertical levels are discussed. Note that there is an approximately linear scaling with an increasing number of vertical levels in IFS.

Results
The realism of our simulations, and a first impression of the impact of resolution and of explicit versus parametrized deep convection, is given in Figure 2, which shows simulated and observed visible satellite images. In the figure, the impact of resolution and convection parametrization is especially apparent in the tropics. Animations of the simulated satellite images for the 1.4 and 9 km parametrized simulations can be found in the supporting information.
We begin the scientific assessment of the 1.4 km simulation by comparing the realism of the seasonally averaged (December 2018 to February 2019, DJF2019) large-scale circulation to the 9 km simulations, ERA5, and independent observations in subsection 4.1. The impact of explicitly resolving deep convection on the large-scale tropical variability is then assessed in subsection 4.2, which discusses the MJO. We then proceed to evaluate smaller-scale processes in the 1.4 km simulation, starting with resolved GWs in subsection 4.3 and moving on to convective storms in subsection 4.4. We conclude our assessment by evaluating energy at all scales and its transfer between scales in subsection 4.5, where kinetic energy spectra are examined together with spectral fluxes.

Realism of the Large-Scale Circulation
Figure 3 shows the time-mean zonal-mean zonal wind for the three simulations and ERA5. Despite no changes made to the IFS to adapt it to km scales (apart from the orographic and land surface fields and switching off the deep convection and GW drag parametrizations), the 1.4 km simulation compares as well to ERA5 as the 9 km parametrized simulation, which is well calibrated and tested (cf. Figures 3a and 3c to 3d). There are small hints of improvement in the 1.4 km simulation in the strength of the Northern Hemisphere tropospheric eddy-driven jet, which is too broad and located too far poleward compared to ERA5 in the 9 km parametrized simulation. The poleward tilt of the polar night stratospheric jet also appears to be better captured in the 1.4 km simulation than in the two 9 km simulations. There are also small hints of degradation at 1.4 km: the Northern Hemisphere subtropical jet is too strong, and there are larger biases in the tropical winds in the stratosphere than in the 9 km parametrized simulation (cf. Figures 3e to 3f). However, similar hints of degradation are also found in the 9 km explicit simulation (cf. Figure 3b to Figure 3d). The stronger subtropical jets are also observed in nonhydrostatic aqua-planet simulations with the ICON model when the deep convection parametrization, which is similar to the one used in the IFS, is switched off (Retsch et al., 2019).
The distribution of humidity in the tropical troposphere is closely tied to deep convection. To assess the impact of explicitly resolving deep convection, the time-mean zonal-mean specific humidity bias against ERA5 for the three simulations, together with the difference between the 1.4 and 9 km parametrized simulations, is shown in Figure 4. To complement the specific humidity evaluation, the impact on the zonal-mean temperature is shown in Figure 5 and on the zonal-mean relative humidity in Figure S2. When the deep convection parametrization is switched off at 9 km, the tropical troposphere is drier than when the deep convection parametrization is on, exacerbating the dry tropical troposphere bias against ERA5 (cf. Figures 4a to 4b). This finding is consistent with Retsch et al. (2019), who find that an overall drier tropical atmosphere is simulated when deep convection is explicitly simulated in aqua-planet experiments. However, when deep convection is resolved at 1.4 km, a moistening of the tropical upper troposphere and a drying of the lower tropical troposphere occur when compared to the 9 km parametrized simulation (Figure 4d). The moistening in the tropical upper troposphere is consistent with the warming there (as warmer air can hold more moisture; Figure 5d). The warmer tropical troposphere in the 1.4 km simulation in comparison to the 9 km parametrized simulation is consistent (by thermal wind balance) with the stronger subtropical jet in the 1.4 km simulation (see Figure 3). However, other indirect effects of the explicitly simulated deep convection on the tropical environment may also play a role. It is also possible that other parametrizations do not work very well when the deep convection parametrization is switched off.
In the lowermost stratosphere, the 1.4 km simulation exhibits a larger moist bias against ERA5 over the polar regions than both 9 km simulations (cf. Figures 4a and 4b to 4c). A moist bias in the polar lowermost stratosphere is common to many models including the IFS and results in a cold bias there (Figure 5) due to too strong longwave cooling resulting from excess moisture (Boer et al., 1992; Hogan et al., 2017; Stenke et al., 2008). While stratospheric humidity in ERA5 reanalysis is not reliable due to the lack of assimilated moisture observations, this moist bias has been confirmed in comparisons to Aura MLS (Hogan et al., 2017), aircraft (Dyroff et al., 2015), and lidar (Woiwode et al., 2020) specific humidity observations. It has been suggested that adequately resolving mesoscales might alleviate this bias (e.g., Shepherd et al., 2018). However, it is interesting to note that even at 1.4 km resolution, this bias is still there and appears larger than at 9 km resolution (Figure 4).
The Intertropical Convergence Zone (ITCZ) is also known to depend sensitively on the explicit versus parametrized representation of convection (e.g., Nolan et al., 2016; Retsch et al., 2019; Tomita et al., 2005). Average tropical precipitation rates are often taken as a proxy for the ITCZ. The zonal-mean distributions of average precipitation rates for the three simulations, ERA5, and the Integrated Multi-satellitE Retrievals for GPM (IMERG) observations (Huffman et al., 2020) are shown in Figure 6a for DJF2019. A sharper ITCZ structure with much stronger tropical precipitation rates can be seen with explicitly simulated deep convection at both the 1.4 and 9 km resolutions compared to the 9 km parametrized simulation, ERA5, and IMERG observations (Figure 6), and also compared to the Tropical Rainfall Measuring Mission (TRMM) observations, which indicate average precipitation rates of 6-7 mm/day in the ITCZ (Kummerow et al., 2000). Therefore, the simulations with explicit deep convection have a precipitation rate bias of about 4 mm/day, maximizing in the Northern Hemisphere. The 9 km parametrized simulation better matches observations and ERA5. The increase in tropical precipitation in simulations with explicit representation of deep convection is consistent with what is found for the nonhydrostatic aqua-planet simulations of Retsch et al. (2019). Comparing the distribution of precipitation events over 24 hr intervals in DJF2019 (Figure 6b) shows that a higher probability of more extreme rainfall events occurs in the simulations with explicitly resolved deep convection. A similar finding was reported by Stephan et al. (2019b), who in the context of the DYAMOND intercomparison analyzed IFS simulations at 9 km grid spacing with parametrized deep convection and at 4 km grid spacing with explicitly simulated deep convection. More intense, localized convection in explicitly convection resolving simulations can also be seen in the ITCZ in the satellite image in Figure 2 and in the animations shown in the supporting information.
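The two precipitation diagnostics used in this subsection can be sketched in Python as follows; the array shapes and the assumption of eight 3-hourly records per 24 hr interval are ours, not a description of the paper's processing chain:

```python
import numpy as np

def zonal_mean_precip(precip):
    """Time- and zonal-mean precipitation rate (mm/day).
    precip: array (time, lat, lon) of rates; returns one value per latitude."""
    return precip.mean(axis=(0, 2))

def daily_event_histogram(precip_3h, bins):
    """Histogram of 24 hr precipitation events from 3-hourly rates (mm/day).
    precip_3h: array (time, npoints); eight consecutive 3-hourly records are
    averaged into one daily rate before binning."""
    n_days = precip_3h.shape[0] // 8
    daily = precip_3h[:n_days * 8].reshape(n_days, 8, -1).mean(axis=1)
    return np.histogram(daily, bins=bins)
```

With such histograms, the heavier tail in the explicit-convection simulations (Figure 6b) would appear as larger counts in the high-rate bins.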

The Madden-Julian Oscillation
The MJO is a dominant mode of tropical intraseasonal atmospheric variability, characterized by eastward propagating planetary-scale disturbances with a period of 30-60 days. A realistic representation of the MJO is often considered a major milestone of climate model fidelity, closely associated with the ability to simulate tropical convection and cloud-radiative effects well (e.g., Bony & Emanuel, 2005; Tomita et al., 2005). Following Wheeler and Hendon (2004) and Vitart (2017), we use the real-time multivariate MJO (RMM) index, obtained by projecting outgoing longwave radiation (OLR) and zonal wind at 200 and 850 hPa (U200 and U850) onto the leading pair of combined empirical orthogonal functions (EOFs) to yield the principal components. These are then shown in the MJO phase diagram to illustrate the representation of the MJO in the different simulations. Figure 7 shows the evolution of OLR, averaged between 15°N and 15°S (top panels), and the RMM index (bottom panels) over the 120 day simulation period. Surprisingly, the 9 km simulation with parametrized deep convection shows the most realistic MJO propagation for this period. Although it would clearly be desirable to have more ensemble members for confirmation, these results indicate that more resolution may be desirable but alone does not seem sufficient to simulate a realistic MJO. Therefore, other parametrizations (e.g., boundary layer turbulence, shallow convection, land-atmosphere coupling, land surface representation, and cloud microphysics) and their interaction with the resolved flow might need adjusting for storm-resolving simulations in order to fully exploit the benefits brought by explicitly resolving deep convection. This conclusion, albeit at coarser resolutions, was also reached by Pilon et al. (2016), who analyzed simulations with and without explicit convection at 60, 15, and 3 km grid spacing.
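The mechanics of the RMM projection can be sketched as below; the EOF patterns, per-field normalization factors, and longitude resolution are placeholders (in practice they come from the precomputed Wheeler and Hendon EOFs), so this only illustrates how the index is assembled:

```python
import numpy as np

def rmm_index(olr, u850, u200, eof1, eof2, norms):
    """Project 15N-15S-averaged anomaly profiles onto the two leading combined
    EOFs (after Wheeler & Hendon, 2004).

    olr, u850, u200 : 1-D longitude profiles of anomalies
    eof1, eof2      : concatenated EOF patterns (length 3 * nlon)
    norms           : per-field normalization factors (OLR, U850, U200)
    """
    state = np.concatenate([olr / norms[0], u850 / norms[1], u200 / norms[2]])
    rmm1, rmm2 = state @ eof1, state @ eof2
    amplitude = np.hypot(rmm1, rmm2)      # distance from origin in the
    phase_angle = np.arctan2(rmm2, rmm1)  # MJO phase diagram
    return rmm1, rmm2, amplitude, phase_angle
```

The (RMM1, RMM2) trajectory traced over the 120 days is what the phase diagrams in Figure 7 display.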

Resolved GWs
Atmospheric GWs play an important role in driving the upper atmosphere circulation and variability (e.g., Shepherd, 2000) and can therefore influence stratosphere-troposphere coupling (e.g., Polichtchouk et al., 2018). Deep convection is a major source of GWs in the tropics and in the summer midlatitudes (e.g., Preusse et al., 2001). Such convectively generated GWs are important drivers of the quasi-biennial oscillation (QBO), which not only influences stratosphere-troposphere exchange of trace gases but also drives polar stratosphere variability (e.g., Baldwin et al., 2001). Therefore, it is of interest to assess what impact explicitly resolving deep convection has on convective GWs. Moreover, finer-scale orographic features are better represented at 1.4 km, so it is also of interest to establish the impact of 1.4 km resolution on GWs over orography. To assess resolved GWs in our three simulations, we compute the absolute vertical GW momentum flux

M = \rho \sqrt{\overline{u'w'}^{2} + \overline{v'w'}^{2}},

where u, v, and w are the zonal, meridional, and vertical wind components, respectively, and ρ is density. The wind fluctuations (i.e., primes) represent the physical space representation (on the 0.2° × 0.2° grid) of total wave numbers 42 ≤ l ≤ 1,279 in the spherical harmonic expansion (i.e., they are deviations from the large-scale flow with scales 0 ≤ l ≤ 41). M is calculated from 3-hourly simulation output. In what follows, it is important to bear in mind that only waves up to l = 1,279 are considered for the 1.4 km simulation. To illustrate the geographical regions that experience the largest differences in the absolute GW momentum flux, its latitude-longitude distribution at 50 hPa for the 1.4 km simulation is shown in Figure 9a, together with the differences in M between the three simulations (Figures 9b-9d). Again, November only is shown. A similar analysis at 350 hPa can be found in Figure S3.
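This diagnostic can be sketched in Python as follows. Two simplifications are ours: the large-scale (l ≤ 41) fields are passed in precomputed rather than obtained by spectral filtering, and the averaging operator is taken as a zonal mean, which need not match the paper's exact choice:

```python
import numpy as np

def abs_gw_momentum_flux(u, v, w, rho, large_scale):
    """Absolute vertical GW momentum flux M = sqrt(<rho u'w'>^2 + <rho v'w'>^2).

    u, v, w     : wind components on a (lat, lon) grid
    rho         : density (scalar or broadcastable array)
    large_scale : tuple (u_ls, v_ls, w_ls) of the large-scale part of each
                  field; primes are deviations from these
    Returns one flux value per latitude (zonal-mean averaging assumed here).
    """
    up = u - large_scale[0]
    vp = v - large_scale[1]
    wp = w - large_scale[2]
    uw = (rho * up * wp).mean(axis=-1)  # zonal mean of rho u'w'
    vw = (rho * vp * wp).mean(axis=-1)  # zonal mean of rho v'w'
    return np.hypot(uw, vw)
```

Applied at a fixed pressure level (e.g., 50 hPa) and averaged in time, this yields maps and zonal means of M like those discussed below.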
There are clear differences between all the simulations over the regions dominated by deep convection, namely, above the ITCZ and over South America and southern Africa. In particular, convective GW activity increases when convection is explicitly simulated at 1.4 km resolution in comparison to the 9 km parametrized simulation (Figure 9c). GWs of much stronger amplitude are generated when the deep convection parametrization is switched off at 9 km in comparison to both the 9 km parametrized (Figure 9d) and the 1.4 km (Figure 9b) simulations. This is because explicitly resolved convection at 9 km resolution occurs at too large a scale (see animations in the supporting information). A similar conclusion is reached at 350 hPa (Figure S3). This result can be compared to Stephan et al. (2019a), who also find that in the ICON model with 5 km grid spacing, convective GWs are stronger when the deep convection parametrization (which is the same as in the IFS) is switched off than when it is on. It is likely that at 5 km resolution deep convection is still not well resolved and occurs at too large a scale, therefore also unduly exciting GWs of too large a scale. Note that because convective GWs drive the QBO, the differences in the tropical winds in the stratosphere between the three simulations shown in Figure 3 are likely due to the differences in convective GWs. Over orography, there is an increase in the absolute GW momentum flux at 50 hPa over the Southern Andes, the Middle East mountain chain, the Rockies, and the tip of Greenland in the 1.4 km simulation in comparison to both 9 km simulations (Figure 9). The increase in M over the Southern Andes in the 1.4 km simulation also contributes to an increase in M in the zonal mean, as illustrated in Figure 8b.
However, at 350 hPa (Figure S3) no clear differences in GW momentum flux between the three simulations can be identified over orographic regions, likely due to small-scale tropospheric synoptic waves obfuscating the signal at this altitude. It is also interesting to note that no clear signal can be seen over the Himalayas or the Antarctic Peninsula at 50 or 350 hPa. We emphasize, however, that our diagnostics do not capture GWs with total wave numbers 1,280 ≤ l ≤ 7,999 in the 1.4 km simulation. The contribution of these scales to the momentum flux remains to be assessed.

Convective Storms
In order to illustrate the potential arising from better resolving convective storms in the 1.4 km seasonal simulation, we investigate the occurrence of convective storms emanating from supercells (e.g., mesocyclones and tornadoes). While tornadoes and mesocyclones are not expected to be resolved at 1.4 km grid spacing, we use updraft helicity (Kain et al., 2008) as a surrogate for convective storm hazards emanating from supercells (e.g., Sobash et al., 2016). In pressure coordinates, the updraft helicity is computed as

-\int_{850}^{500} \frac{\zeta w}{g} \, dp,

where ζ is relative vorticity, g is gravitational acceleration, and p is pressure (in hPa). We only consider cyclonic updrafts (ζ > 0, w > 0) with updraft helicity >5 m² s⁻² for the convective storm analysis (we also calculated updraft helicity for anticyclonic updrafts and for cyclonic and anticyclonic downdrafts but found their contributions to be much smaller than for cyclonic updrafts).
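A minimal Python sketch of this updraft-helicity diagnostic follows. The choices of a trapezoidal rule on the given levels, pressure converted to Pa for the integration, and the array layout are our assumptions for illustration:

```python
import numpy as np

G = 9.80665  # gravitational acceleration (m s^-2)

def updraft_helicity(zeta, w, p_hpa):
    """Updraft helicity, masked to cyclonic updrafts (zeta > 0, w > 0).

    zeta, w : arrays (nlev, npoints) of relative vorticity (s^-1) and
              vertical velocity (m s^-1) on pressure levels
    p_hpa   : 1-D pressure levels in hPa, ordered from 850 down to 500
    """
    mask = (zeta > 0) & (w > 0)
    integrand = np.where(mask, zeta * w, 0.0) / G          # (nlev, npoints)
    dp = np.diff(np.asarray(p_hpa, dtype=float) * 100.0)   # negative upward
    mid = 0.5 * (integrand[1:] + integrand[:-1])           # trapezoidal rule
    return -np.sum(mid * dp[:, None], axis=0)              # minus -> positive
```

The >5 m² s⁻² threshold used in the analysis is then applied to the returned values.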
We focus on the convective storm activity in the Gulf of Mexico and southern U.S. region. Although this region experiences the most severe convective storm activity (with frequent occurrence of tornadoes and mesocyclones) in the spring and summer seasons, occurrences in DJF are also reported. The convective storm activity for the three simulations averaged over all available time steps is shown in Figure 10. A much weaker convective storm activity is observed in the 9 km parametrized simulation in comparison to the 9 km explicit and 1.4 km simulations (cf. Figures 10a to 10b and 10c). This is also verified by calculating the 95th percentile of the updraft helicity (with >5 m² s⁻²), which is 15, 23, and 21 m² s⁻² for the 9 km parametrized, 9 km explicit, and 1.4 km simulations, respectively. In addition, convective storms with updraft helicity >5 m² s⁻² are five times more likely in the 1.4 km simulation and 5.2 times more likely in the 9 km explicit simulation than in the 9 km parametrized simulation. Both the 1.4 km simulation and the 9 km explicit simulation clearly map out the danger zone over the southern area of the United States, which is qualitatively consistent with statistical maps of Gulf Coast significant tornado events during DJF (1981-2016) based on the significant tornado parameter (STP; Molina et al., 2018). However, the convective storm activity in the 9 km explicit simulation appears stronger, especially in the Gulf of Mexico and over the Atlantic, than in the 1.4 km simulation (cf. Figures 10b to 10c), which is likely due to deep convection occurring at too large a scale in the 9 km explicit simulation.
While a detailed comparison of mesoscale convective systems for the three simulations will be done in a future study, we note that the increase in updraft helicity for cyclonic updrafts in the 1.4 km simulation compared to the 9 km parametrized simulation is mostly due to an increase in the vertical velocity. For the 9 km explicit simulation, however, both the vertical velocity increase and the relative vorticity increase contribute to the increase in updraft helicity. This is illustrated in Figure S4, where vertical profiles (at 500, 700, and 850 hPa) of the median and the 95th percentile of w and ζ for all cyclonic updrafts with updraft helicity >5 m² s⁻² are shown. Note that the vertical velocities in all our simulations are comparable to those observed in cumulus convection (Xu & Randall, 2001).

Kinetic Energy Spectrum and Nonlinear Spectral Energy Fluxes
The shape of the global kinetic energy spectrum in the IFS is known to depend sensitively on the physical parametrizations. Malardel and Wedi (2016) found that if the deep convection parametrization is switched off at 9 or 5 km resolution, a shallower-slope mesoscale energy spectrum emerges for total wave numbers larger than 80. To investigate how the shape of the spectrum changes (up to total Wave Number 1,279) with both the increase in resolution to 1.4 km and the explicitly simulated deep convection, Figure 11 shows tropospheric divergent and rotational kinetic energy for the three simulations. Up to total Wave Number 500, there is remarkable agreement in both the divergent and the rotational kinetic energy spectra between the 9 km parametrized and the 1.4 km simulations (Figure 11a). For wave numbers 500 < l ≤ 1,279, a shallower-slope spectrum is seen in the 1.4 km simulation, which is due to the more realistic variance of the orography and of all the surface fields, the more accurate variability of the resolved and subgrid nonlinear processes, and the reduction of numerical diffusion (both explicit and implicit) at these scales. In contrast, when comparing the 1.4 km simulation to the 9 km explicit simulation, the 9 km explicit simulation produces too much energy in Wave Numbers 40-800, especially in the divergent kinetic energy (cf. dashed to solid blue line in Figure 11b). This finding is consistent with our analysis of GWs in section 4.3, indicating too strong mesoscale GWs caused by convection occurring at the wrong (too large) scales. Therefore, if the deep convection parametrization is switched off at too coarse a resolution, at which the convection is not entirely resolved, the model simulates an unrealistic (i.e., too strong) level of variability at these scales. Malardel and Wedi (2016) also found that the nonlinear spectral energy fluxes (Augier & Lindborg, 2013) depend sensitively on the deep convection parametrization.
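Statements about "shallower-slope" spectra can be made quantitative by fitting a power law to the spectrum over a chosen wave-number band. The sketch below shows one simple way to do this (a least-squares fit in log-log space); the function name and inputs are illustrative, not taken from the paper's analysis code.

```python
import numpy as np

def spectral_slope(l, E, l_min, l_max):
    """Estimate the power-law slope of a kinetic energy spectrum E(l)
    over total wave numbers l_min <= l <= l_max, assuming E ~ l**slope,
    via a least-squares fit of log E against log l.
    """
    l = np.asarray(l, dtype=float)
    E = np.asarray(E, dtype=float)
    band = (l >= l_min) & (l <= l_max) & (E > 0)
    # np.polyfit returns coefficients highest degree first: [slope, intercept]
    slope, _intercept = np.polyfit(np.log(l[band]), np.log(E[band]), 1)
    return slope

# A canonical -5/3 mesoscale spectrum gives slope ~ -1.67; a "shallower"
# spectrum has a slope closer to zero.
```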
These fluxes illustrate how energy is redistributed between different wave numbers. A positive flux at wave number l indicates a down-scale transfer of energy from wave numbers ≤ l to larger wave numbers; a negative flux represents an up-scale energy transfer (for a schematic and a more detailed explanation, see Malardel & Wedi, 2016). Therefore, assessing the difference in nonlinear energy fluxes between the three simulations gives insight into the impact of resolution and of explicitly simulated convection on the energy redistribution across the spectrum of waves. Figure 12 shows the cumulative nonlinear spectral energy fluxes (see Augier & Lindborg, 2013, for the definition) for the total sum Π as well as for its two components, the available potential energy flux Π_A and the kinetic energy flux Π_K. The fluxes sum to zero when integrated over all wave numbers (from right to left in Figure 12; cf. Augier & Lindborg, 2013). The figure also shows the internal conversion from available potential energy to kinetic energy, C, which sums to about 4 W m⁻² in the IFS, with similar values for the 1.4 and 9 km simulations.
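The bookkeeping described above can be sketched as follows. This is a schematic only, assuming a precomputed spectral transfer T(l) (the nonlinear energy tendency per total wave number); it is not the Augier and Lindborg (2013) formulation itself, which is derived from the spherical-harmonic model fields.

```python
import numpy as np

def cumulative_flux(T):
    """Cumulative nonlinear spectral energy flux Pi(l) from a spectral
    transfer T(l).

    Following the sign convention in the text, Pi(l) > 0 means a down-scale
    transfer of energy across wave number l, and Pi(l) < 0 an up-scale
    transfer. Because the nonlinear terms only redistribute energy, T must
    sum to (approximately) zero, so Pi vanishes at the end of the spectrum.
    """
    T = np.asarray(T, dtype=float)
    # Pi(l) = sum of T over all wave numbers larger than l, which equals
    # minus the cumulative sum of T up to and including l when sum(T) = 0.
    return -np.cumsum(T)
```

For example, a transfer that removes energy at small wave numbers and deposits it at large wave numbers yields a positive Pi in between, i.e., a down-scale cascade.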
Overall, the nonlinear energy fluxes are remarkably similar for the 1.4 km and 9 km parametrized simulations (Figures 12a and 12c), the only noticeable difference being that the 1.4 km simulation shows an increase in mesoscale energy conversion (C) and a reduction in conversion at larger scales. This difference may be due to changes in the circulation in the 1.4 km simulation, possibly resulting from small changes in jet position and amplitude (see section 4.1) or from better resolved mesoscale GW activity (see section 4.3). The 9 km explicit simulation, however, shows a much larger conversion at mesoscales, which accumulates with increasing wave number (blue lines in Figures 12a and 12c). Moreover, compared to the 1.4 km simulation, the 9 km explicit simulation shows an upscale available potential energy (APE) cascade over a larger range of wave numbers, from l = 20 to l = 900 (cf. dashed to solid orange lines in Figure 12b), which dominates the change in the total energy flux (red solid and dashed lines in Figures 12b and 12d). A similar finding was reported by Malardel and Wedi (2016), who found a larger upscale APE cascade in IFS simulations at 5 km resolution with explicitly resolved deep convection than with parametrized deep convection, and a much larger forcing of APE at the mesoscale when the deep convection parametrization is off. However, the marked similarity of the APE fluxes in the 1.4 km simulation and the 9 km parametrized simulation (cf. dashed to solid orange lines in Figure 12a) indicates that the APE response to switching the deep convection parametrization off at 9 and 5 km grid spacing is not realistic. Therefore, if the resolution is not appropriate for switching off deep convection, the model simulates not only a physically unrealistic level of variability but also physically unrealistic nonlinear interactions at larger scales.
GW momentum fluxes and the nonlinear spectral energy fluxes thus appear to be useful diagnostics for assessing the large-scale feedback of different realizations of convection in weather and climate models. In particular, for "more affordable" global resolutions with 4-5 km grid spacing that will soon become available for operational use, the divergent energy component could be used to indicate the level of deep convection parametrization required to avoid the excitation of convectively driven GWs at too large a scale. It would also be interesting to assess whether storm-resolving simulations with 2.5-3 km grid spacing and without parametrized deep convection generate a physically realistic level of variability and nonlinear interactions when compared to simulations at 1.4 km grid spacing.

Conclusions
The INCITE award allowed us to perform, for the first time, a global 4-month integration (November 2018 to February 2019) at convection-resolving 1.4 km grid spacing with 137 vertical levels on Summit. The scalability progress in the IFS in recent years, combined with Summit's computing potential, allowed us to run at an impressive speed of 112 forecast days per day. With I/O, and beyond the benchmark setup (i.e., more frequent radiation calls, a smaller time step, and more than double the vertical levels), we averaged approximately seven forecast days per day on 960 nodes. In performing the 1.4 km simulation, the focus was placed on the efficient use of CPUs on Summit. A GPU acceleration of the spectral transforms that has already shown great promise (Müller et al., 2019) was not yet available for the simulations presented here. With the use of GPUs in the future, we anticipate further reducing the cost of the transforms and thereby increasing the throughput of the 1.4 km simulations.
The 1.4 km simulation is performed at, and designed for, grid spacings well beyond those of any contemporary climate model (e.g., Haarsma et al., 2016) or global NWP model. Past experience suggests that a reduction in grid spacing by more than a factor of 6 from a well-tested and finely tuned model configuration (9 km → 1.4 km) typically requires readjustment of model parameters to make optimal use of the additional resolution and to obtain a realistic large-scale circulation not only on seasonal but also on medium-range time scales. Although in this case the only change to the IFS was to turn off the deep convection parametrization (without tuning anything else), the 1.4 km simulation shows remarkable fidelity in its representation of the large-scale circulation when compared to ERA5 and to a well-tested and finely tuned 9 km simulation with parametrized deep convection (Figures 3 and S2). On seasonal time scales some differences begin to emerge, some of which are actually improvements (e.g., the width of the Northern Hemisphere eddy-driven jet and the poleward tilt of the polar night jet in the stratosphere shown in Figure 3). That the global circulation remains realistic when simply switching off the deep convection parametrization (as well as the parametrized GW drag, which is designed to switch off at this resolution in the IFS) is in itself remarkable and highlights the enormous potential of these simulations. It also testifies that the parametrizations in the IFS behave well, even up to resolutions never tested until recently.
Apart from presenting the unique 1.4 km data set, which we invite the community to use for future model development and more detailed process evaluation, our aim was to test the hypothesis that the accuracy of global models improves if they are able to resolve deep convection explicitly. Based on our initial evaluation, we find that, owing to its added detail (Figure 2), the 1.4 km simulation does show improvement in some aspects of the modeled atmosphere, such as the representation of convective storms and the eddy-driven jet, when compared to the 9 km simulations (Figures 3 and 10). The impact of the 1.4 km resolution on the generation of GWs can also be clearly identified (Figures 8 and 9c): The increase in horizontal resolution enables the explicit initiation and simulation of GWs, leading to stronger GW activity compared to the 9 km simulation with deep convection parametrization. However, other aspects, such as the representation of the tropical rainfall magnitude or the MJO, are not improved, or are even degraded, when compared to the well-tested 9 km simulation with parametrized deep convection (Figures 6 and 7). In this regard, these results are somewhat disappointing but confirm earlier conclusions (e.g., Pilon et al., 2016) that other Earth system model components and/or the interaction of these components with the resolved deep convection and the resolved tropical flow likely need revising to fully benefit from the increased 1.4 km resolution. For example, further revision of the turbulence, shallow convection, and microphysics parametrizations and their interaction is likely necessary. Making firm statements about the representation of the MJO would also require more ensemble members and a longer time series than the four months available here.
Additionally, the comparison of the 1.4 km to the 9 km simulations with and without parametrized convection allowed us to examine coarser resolution simulations in a new light. Switching off the convection parametrization at too coarse a resolution (i.e., 9 km) generates too strong convective GWs (Figures 8 and 9) and too much divergent kinetic energy at mesoscales (Figure 11) when compared to both the 1.4 and 9 km simulations with parametrized convection. This implies that the simulated level of variability is unrealistic at 9 km resolution if one of the fundamental parametrizations is switched off while deep convection is not yet entirely resolved. This corroborates Stephan et al. (2019a), who find that the nonhydrostatic ICON model at 5 km grid spacing produces much stronger convective GW activity when the deep convection parametrization is switched off. It also corroborates Stephan et al. (2019b), who find that convective GW sources are stronger in the IFS simulation at 4 km grid spacing with explicitly simulated deep convection than in the IFS simulation at 9 km grid spacing with parametrized deep convection. It is of course possible that deep convection is still underresolved even at 1.4 km resolution (e.g., Prein et al., 2015). However, the similarity of the global energy redistribution between the 1.4 km and 9 km parametrized simulations is remarkable (Figures 11 and 12) and suggests that it is appropriate to switch off the deep convection parametrization at 1.4 km resolution.
We consider ensemble simulations to be essential for representing uncertainties, both for medium-range and extended-range predictions of weather and, equally, for longer time scales (cf. Roberts et al., 2020). For our examples, ensembles will be important to obtain robust MJO statistics and to assess predictive skill. However, questions remain about how to initialize, perturb, and design global ensembles (Leutbecher & Ben Bouallègue, 2020; Leutbecher et al., 2017) with an explicit representation of convection, given the increased variability (see Necker et al., 2020) and the fact that some of the parametrizations whose tendencies are used to perturb the ensemble at the current 9 km resolution are switched off. Undoubtedly, an ensemble at 1.4 km grid spacing will be needed, but this substantially adds to future computational constraints.
Finally, the fine details represented by the 1.4 km simulation (Figure 2) suggest that this simulation could also play a role in future observing system simulation experiments (OSSEs), continuing the history of using ECMWF model output for this purpose (e.g., McCarty et al., 2012). OSSEs make it possible to explore the potential benefits of future satellite observation platforms by simulating the observations from the planned satellites in a "nature run," which our 1.4 km simulation can represent. In summary, we believe that the 1.4 km storm- and convection-resolving seasonal simulation presented in this paper not only allows a glimpse into a promising future for modeling Earth's weather and climate but can also aid future satellite mission planning.

Acknowledgments
This research used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. Access to the simulation output can be requested by contacting the OLCF Help Desk via email to help@nccs.gov and referring to Project CLI900. This paper benefited from the close collaboration between high-resolution simulation model benchmarking and advanced methodologies presently being developed for heterogeneous high-performance computing platforms at ECMWF in the ESCAPE-2 (No. 800897), MAESTRO (No. 801101), EuroEXA (No. 754337), and ESiWACE-2 (No. 823988) projects funded by the European Union's Horizon 2020 future and emerging technologies and research and innovation programmes. The authors thank the three anonymous reviewers for their comments, which have significantly improved this manuscript.