Confronting the convective gray zone in the global configuration of the Met Office Unified Model

In atmospheric models with kilometer-scale grids the resolution approaches the scale of convection. As a consequence the most energetic eddies in the atmosphere are partially resolved and partially unresolved. The modeling challenge to represent convection partially explicitly and partially as a subgrid process is called the convective gray zone problem. The gray zone issue has previously been discussed in the context of regional models, but the evolution in regional models is constrained by the lateral boundary conditions. Here we explore the convective gray zone starting from a defined global configuration of the Met Office Unified Model using initialized forecasts and comparing different model formulations to observations. The focus is on convection and turbulence, but some aspects of the model dynamics are also considered. The global model is run at nominal 5km resolution and thus contributions from both resolved and subgrid turbulent and convective fluxes are non-negligible. The main conclusion is that in the present assessment, the configurations which include scale-aware turbulence and a carefully reduced and simplified mass-flux convection scheme outperform both the configuration with fully parameterized convection and a configuration in which the subgrid convection parameterization is switched off completely. The results are more conclusive with regard to convective organization and tropical variability than extratropical predictability. The present study thus endorses the strategy to further develop scale-aware physics schemes and to pursue an operational implementation of the global 5km-resolution model to be used alongside other ensemble forecasts to allow researchers and forecasters to further assess these simulations.


Introduction
Coarse-resolution atmospheric models do not resolve any of the main energy-containing turbulence scales, and therefore turbulence is essentially unresolved and has to be parameterized. Wyngaard (2004) discussed models with resolution in between coarse mesoscale models (grid length 10km and larger) and large-eddy resolving models (grid length 100m and smaller) for which the resolution is similar to the length scale of the most energetic eddies in the atmosphere. The range of values for the length scale of these energetic eddies and coherent structures depends on the prevailing meteorological situation, but can vary from less than 100m in shallow convection to several tens of kilometers in organized convective systems. This has led to the articulation of the "gray zone" problem, the question about the adequate atmospheric model formulation in situations where model resolution approaches the scale of turbulence and convection (Tomassini et al., 2017; Field et al., 2017; de Roode et al., 2019; Honnert et al., 2020).
A distinction is often made between the gray zone of turbulence and the gray zone of convection, but in practice there is no clear-cut separation and the representation of turbulence has a significant impact on, for example, the modelling of organised deep convective systems (Honnert et al., 2020). Turbulent eddies and convective motions usually occur simultaneously and across a wide range of scales. Ultimately, all atmospheric scales are deeply intertwined. The terms "convection-permitting" or "turbulence-permitting" to designate models operating in the gray zone of turbulence and convection (in contrast to "convection-resolving" and "turbulence-resolving") are adequate as they reflect the fact that convection and turbulence are partly resolved and partly unresolved in these regimes.
The term "convection-permitting" does not imply that the subgrid convection scheme is switched off completely in the model (Kendon et al., 2021). On the contrary, it suggests that part of the convective motion is still unresolved and needs to be parameterized, either through an adequately formulated (ideally scale-aware) and calibrated convection scheme or a non-local term in the turbulence parameterization, or both. In one of the earliest explorations of the convective gray zone problem, Roberts (2003) suggested using a CAPE-dependent (CAPE for Convective Available Potential Energy) CAPE closure time scale in the convection scheme (see also Lean et al. (2008)). The idea was that in high-CAPE environments convective systems are more organized and exhibit a larger spatial extent and that the model should be able to resolve these larger convective structures, whereas smaller convection would still need to be parameterized. Gerard and Geleyn (2005) proposed to address the gray zone problem with a prognostic closure in the representation of subgrid convection, involving prognostic updraught vertical velocities and a prognostic fractional area of a model grid box covered by convective clouds. In related earlier work Pan and Randall (1998) argued that the distinction between convective and large-scale processes is ambiguous anyway. Arakawa et al. (2011) and Arakawa and Wu (2013) framed the problem in a similar way, and various subsequent studies followed an analogous approach (Grell & Freitas, 2014; Sakradzija et al., 2016; Zheng et al., 2016; Kwon & Hong, 2017; Su et al., 2021; W. Wang, 2022).
Moreover, Gerard et al. (2009) and Gerard (2015) highlighted the importance of the interaction between convection parameterization, microphysics, and the cloud scheme in the context of the gray zone issue. Specific modifications to the convection scheme such as a better coupling between subgrid convection and the resolved mesoscale circulation (Becker et al., 2021), or the representation of low-CAPE convection (McTaggart-Cowan et al., 2020) can also prove important and beneficial in convection-permitting simulations even if these aspects do not address the issue of scale-awareness of the convection parameterization directly from a conceptual perspective.
Convection-permitting regional models have been used with considerable success for more than two decades (Prein et al., 2015; Clark et al., 2016; Kendon et al., 2021), but the dynamics in regional models are constrained and influenced by the prescribed lateral boundary conditions (Kendon et al., 2010; Radermacher & Tomassini, 2012; Dipankar et al., 2020). In the context of the current modelling system at the Met Office, the Met Office Unified Model, the present study reports on a first thorough exploration of convection-permitting global modelling, besides the preliminary study of Tomassini (2018). In this work we focus on some of the main and most basic model configuration choices concerning convection, boundary layer processes, turbulence, and to some extent model dynamics. These choices address the gray zone problem in one way or another and are part of what is sometimes discussed under the headline of "gray zone model physics".
The development of global convection-permitting models is more than just an incremental enhancement in resolution. Such models allow for, at least partly, resolving fundamentally new turbulent and convective phenomena in the atmosphere, studying interactions between convection and the atmospheric circulation across scales, and thus addressing important and novel science questions (Tomassini, 2020; Senior et al., 2021; Slingo et al., 2022; Tomassini & Yang, 2022). This step change in fidelity and realism of weather and climate models implies also a step change in forecasting severe convective storms and in supporting and informing climate change mitigation and adaptation measures for the benefit of societies around the world (Slingo et al., 2021). In the presented work, however, the focus is on developing an appropriate tool for these exciting scientific endeavors, and on better understanding the strengths, limitations, and sensitivities of this tool.
The outline of the paper is as follows. In Section 2 the main model configuration options that are studied in the present work are introduced and briefly described. The results from simulations of different forecasts covering varying meteorological conditions are contained in Section 3. The process-based analysis focusses on different locations, times of the year, and particular phenomena. Section 4 highlights some important aspects of the convective gray zone problem and discusses a few additional sensitivity experiments. Two whole months, July 2016 and January 2018, were simulated based on a particular convection-permitting configuration with a view towards assessing statistics on climate time scales, and the analysis is showcased in Section 5. Finally, in Section 6 the main findings are summarized and an outlook on future planned work is given.

Description of model configurations
In the present study the reference configuration is Global Atmosphere version 7.0 (Walters et al., 2019), GA7 in short, as used in deterministic forecasts at the Met Office, i.e. with climatological aerosols, without the stochastic physics package, and using 70 levels in the vertical. The simulations are based on the Unified Model code version 11.7. The configurations described in this paper include some of the changes or specifications described in Table 1.
The main model configurations considered are a combination of the configuration options shown in Table 1, namely GA7, MidLevShConv25, MidLevShConv25RAturb, MidLevShConv15RAturb, and ConvOffMoistConsRAturb. The latter configuration, ConvOffMoistConsRAturb, is chosen in such a way that it shares primary convection, dynamics, and turbulence model formulation features with some of the Unified Model Regional Atmosphere configurations (Bush et al., 2020). A few important sensitivities and additional configuration tests will be discussed in Section 4.
All simulations are run at global N2560 (nominal 5km) resolution unless stated otherwise. N2560 refers to a latitude-longitude grid with 1920 regular latitude lines between the pole and equator, and 5120 longitude points along each latitude line. The grid length is 5km in the latitude direction everywhere; in the longitude direction the grid length is about 7.8km at the equator, 5km in the midlatitudes, and 4m near the poles. In the following the term "5km-resolution" will be used for the sake of simplicity. In all configurations except GA7, the deep convection parameterization is switched off. In the configurations that include the MidLev option the midlevel convection scheme is allowed to start from the top of the boundary layer and not lower. The mixing in the boundary layer is left to the boundary layer turbulence and shallow convection schemes.
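The quoted grid lengths follow directly from the grid dimensions. The short sketch below reproduces them, assuming a spherical Earth with a mean radius of 6371 km (an assumption not stated in the text); the function names are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0          # assumed mean Earth radius
N_LAT_POLE_TO_EQUATOR = 1920      # latitude lines between pole and equator (from the text)
N_LON = 5120                      # longitude points per latitude line (from the text)


def lat_spacing_km():
    """Grid length in the latitude direction (constant on a regular lat-lon grid)."""
    quarter_meridian_km = 0.5 * math.pi * EARTH_RADIUS_KM
    return quarter_meridian_km / N_LAT_POLE_TO_EQUATOR


def lon_spacing_km(lat_deg):
    """Grid length in the longitude direction at a given latitude."""
    circle_km = 2.0 * math.pi * EARTH_RADIUS_KM * math.cos(math.radians(lat_deg))
    return circle_km / N_LON


if __name__ == "__main__":
    print(f"latitude spacing: {lat_spacing_km():.2f} km")
    for lat in (0.0, 50.0):
        print(f"longitude spacing at {lat:4.1f} deg: {lon_spacing_km(lat):.2f} km")
```

Under these assumptions the latitude spacing comes out at roughly 5.2 km, and the longitude spacing at roughly 7.8 km at the equator and 5.0 km at 50 degrees latitude, consistent with the nominal values given above.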
The rationale is to more realistically represent nonequilibrium convection such as the diurnal cycle of convection and the convection tied to advective boundary layers. The approach combines a reduced subgrid convection approach based on a CAPE closure for the free troposphere and a separate representation of boundary layer processes, acknowledging that free-tropospheric adjustment time scales can differ from boundary layer time scales and that there often is an imbalance between boundary layer heating and deep convective overturning (Bechtold et al., 2014). The choice of a CAPE closure time scale of 2700 seconds is aimed at reducing the subgrid convective mass flux according to the model resolution, allowing convection to be partly explicit, and at the same time to account for the fact that vertical motion is not fully resolved by a model with 5km grid spacing. Sensitivity experiments with regard to the CAPE closure time scale are presented in Section 4.1. One should stress that even when the deep convection parameterization is switched off in some of the configurations it is still possible for at least some of the surface-based deep convective processes in the model to be handled by the subgrid schemes.
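The role of the closure time scale can be illustrated with a deliberately idealized sketch. It assumes that the subgrid scheme relaxes CAPE towards zero exponentially with time scale tau and that nothing regenerates CAPE; this is a textbook simplification, not the Unified Model closure itself. The 1800 s comparison value is hypothetical and chosen only to contrast with the 2700 s used here.

```python
import math


def cape_removal_rate(cape_j_per_kg, tau_s):
    """Instantaneous CAPE consumption rate (J/kg/s) under a relaxation closure."""
    return cape_j_per_kg / tau_s


def fraction_removed(elapsed_s, tau_s):
    """Fraction of the initial CAPE consumed after elapsed_s seconds,
    assuming pure exponential relaxation and no CAPE generation."""
    return 1.0 - math.exp(-elapsed_s / tau_s)


if __name__ == "__main__":
    for tau in (1800.0, 2700.0):  # 1800 s is a hypothetical comparison value
        print(f"tau = {tau:6.0f} s: "
              f"rate for 1000 J/kg = {cape_removal_rate(1000.0, tau):.3f} J/kg/s, "
              f"removed after 1 h = {fraction_removed(3600.0, tau):.2f}")
```

A longer time scale lowers the rate at which the parameterization consumes instability, which leaves more of the overturning to the resolved dynamics.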
The use of a prognostic entrainment rate (Willett & Whitall, 2017) takes into account changing convective entrainment in different stages of the life cycle of convective systems and varying degrees of convective organization. Convective increment time-smoothing in addition improves the coupling between subgrid convection and the resolved circulation, and reduces the time intermittency of the convection scheme and undesirable dynamical effects of this intermittency such as spurious gravity waves. One particular advantage of unifying the subgrid convection treatment in the free troposphere is the avoidance of a late afternoon "tea break" in convective rainfall, especially over tropical land. In previous model versions an erroneous minimum in modelled convective rainfall featured in the evening when mainly surface-based convection handled by the deep convection scheme was passed over to the midlevel scheme which treated elevated convection.
In convection-permitting Unified Model simulations, single grid-column updraughts can become unrealistically intense and persistent because a stagnation point forms at the base of the updraught. Equal convergent velocities at either side of an updraught column mean that the back-trajectories of the semi-Lagrangian (SL) advection scheme from the cell centre go straight down, and so fail to advect in what will typically be drier air from neighbouring columns. This allows single-point updraught columns to keep creating their own moisture by copying a near-surface moist anomaly upwards. The spuriously created moisture feeds condensation and latent heating, generating stronger ascent and convergence, leading to a positive feedback. To address the problem, all simulations, except GA7 and ConvOffMoistConsRAturb, use the "fountain buster", an implementation of a modification to the SL advection scheme, aimed at making a local correction to the lack of conservation. The scheme is called directly after the SL advection increment has been calculated by interpolation of the virtual potential temperature and moisture variables to departure points. It works by identifying grid points where the horizontal winds (on the grid cell faces) are converging, and adds onto the standard SL increment a simple linear upwind advection increment arising from just the locally convergent part of the flow. In this way it adds in the effects of just the convergent inflow that will have been missed by the SL advection (because SL advection interpolates the cell-face winds to the cell centre in order to find the departure point).
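The idea can be sketched in one horizontal dimension. The code below is a minimal illustration written for this description, not the Unified Model implementation: the decomposition of the face winds into a cell-centre mean and a convergent residual, the periodic boundaries, and the function name are all assumptions made for the example.

```python
import numpy as np


def convergent_upwind_increment(q, u_face, dx, dt):
    """Extra advective increment from the locally convergent part of the flow.

    q      : cell-centre values, shape (n,)
    u_face : winds on cell faces, shape (n + 1,); face i is the west face of cell i
    dx, dt : grid length (m) and time step (s)

    The cell-centre wind seen by the SL scheme is taken as the mean of the two
    face winds. What remains on each face is an inflow of speed c from both
    sides when the faces converge (assumed decomposition for this sketch).
    """
    u_west = u_face[:-1]
    u_east = u_face[1:]
    # Residual inflow speed; zero wherever the face winds do not converge.
    c = np.maximum(0.0, 0.5 * (u_west - u_east))
    # Upwind neighbours for the inflow from either side (periodic domain assumed).
    q_from_west = np.roll(q, 1)
    q_from_east = np.roll(q, -1)
    # First-order upwind advection by the inflow on each side.
    return dt / dx * c * ((q_from_west - q) + (q_from_east - q))


if __name__ == "__main__":
    q = np.array([1.0, 1.0, 3.0, 1.0, 1.0])               # moist anomaly in cell 2
    u_face = np.array([0.0, 0.0, 1.0, -1.0, 0.0, 0.0])    # convergence onto cell 2
    print(convergent_upwind_increment(q, u_face, dx=5000.0, dt=100.0))
```

In this example the cell-centre wind at the anomaly is zero, so a trajectory-based scheme would see no horizontal advection there, while the correction term pulls the anomaly towards the drier neighbouring values.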
The label ShConv refers to a multiplicative scaling of the shallow convective subgrid mass flux which is included in its default setting also in the reference configuration GA7 and is used in some of the convection-permitting configurations to limit the influence of the subgrid shallow convection scheme. The configuration option RAturb includes the turbulence blending scheme (Boutle et al., 2014) which blends the one-dimensional turbulence parameterization as used in lower-resolution versions of the global model with a three-dimensional Smagorinsky-Lilly representation of subgrid turbulence as typically applied in large-eddy simulation models. The two schemes are combined via the mixing length and the key parameter is the ratio of grid length to boundary layer depth. The SHARPEST scheme for stable boundary layers as described in Derbyshire (1997) and used in the Regional Atmosphere configuration (Bush et al., 2020) is employed. The fraction of the maximum allowed value of the diffusion coefficient is set to 0.75 (Hanley et al., 2015). Moreover, the free atmospheric turbulent mixing in RAturb is based on interactively diagnosed turbulent layer depths throughout the atmospheric column (as in the tropical version of the Regional Atmosphere configuration described in Bush et al. (2020)).
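The structure of such a blending can be sketched as follows. The weight function used here (a simple tanh of the ratio of grid length to boundary layer depth) is an illustrative placeholder and not the function published in Boutle et al. (2014); the Smagorinsky constant of 0.2 and all function names are likewise assumptions for this example.

```python
import math


def blend_weight(grid_length_m, bl_depth_m, beta=1.0):
    """Weight of the 1D scheme: near 0 when the grid is much finer than the
    boundary layer depth, approaching 1 when it is much coarser.
    Placeholder functional form, not the published one."""
    return math.tanh(beta * grid_length_m / bl_depth_m)


def blended_mixing_length(l_1d_m, grid_length_m, bl_depth_m, c_s=0.2):
    """Blend the 1D mixing length with a Smagorinsky-type length c_s * dx."""
    w = blend_weight(grid_length_m, bl_depth_m)
    l_smag_m = c_s * grid_length_m  # assumed Smagorinsky constant c_s
    return w * l_1d_m + (1.0 - w) * l_smag_m


if __name__ == "__main__":
    for dx in (100.0, 1000.0, 5000.0):
        w = blend_weight(dx, bl_depth_m=1000.0)
        l = blended_mixing_length(150.0, dx, bl_depth_m=1000.0)
        print(f"dx = {dx:6.0f} m: weight of 1D scheme = {w:.2f}, mixing length = {l:.1f} m")
```

The point of the sketch is only that a single non-dimensional ratio controls a smooth transition between the two limits, so the same formulation can be used across resolutions and across shallow and deep boundary layers.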
All simulations, except GA7 and ConvOffMoistConsRAturb, use the parameter choice puns=1.0 in the non-linearity setting for the boundary layer solver in the case of unstable boundary layers, which makes the implicit solver more stable (Wood et al., 2007, see Appendix A). All configurations use the multigrid solver (Maynard et al., 2020).
For the Met Office Global Model Evaluation and Development (GMED) tickets associated with the different changes, and some more information on the tickets, see Appendix A.

Case studies and results
A case study approach is taken in which testbed cases are defined and the model is evaluated against observations and reanalysis data. Five model configurations are further investigated in greater detail: GA7, MidLevShConv25, MidLevShConv25RAturb, MidLevShConv15RAturb, and ConvOffMoistConsRAturb. The focus of the study is on model forecasts with lead times of up to 10 days because this way the simulations can be directly compared to observations. Note that some errors in the forecast will come from deficiencies in the initial conditions or a lack of predictability not captured due to running only one forecast per configuration per initialization time. The different cases aim at covering different, important meteorological conditions and phenomena in different parts of the world and at various times during the year with a focus on high-impact weather.

Africa
An African easterly wave disturbance is detectable starting from 18:00 UTC on July 7, 2010, over North Africa (Tomassini et al., 2017; Tomassini, 2018). The dynamics of the wave is rather weak over the first 30 hours after detection, i.e. until about 00:00 UTC on July 9. Starting around July 9 03:00 UTC a crucial strengthening phase of the wave occurs, which lasts for about 2 days. Tropical Rainfall Measuring Mission (TRMM) precipitation (Huffman et al., 2007) shows distinct organized precipitation ahead of the trough at around 12 to 18 degrees North where the main center of the wave disturbance is located (Figure 1, panel a). The case is of interest because it allows for assessing the coupling between moist convection and atmospheric circulations over tropical land in the different model configurations. The simulations are initialized on July 7, 2010, 00:00 UTC and on July 11, 2010, 00:00 UTC with European Centre for Medium-Range Weather Forecasts (ECMWF) operational analyses, and two 5-day forecasts are performed.
GA7 is not able to develop and predict the mesoscale convective systems associated with the wave in the early, developing stage (Figure 1). [...] radiative transfer code (Saunders et al., 2018).
In GA7 the convective systems over Africa are not represented realistically (Figures 2 and 3, top right panel), although it should be mentioned that the satellite simulator does not consider cloud water or ice that is held within the convection scheme. MidLevShConv25RAturb and MidLevShConv15RAturb develop well organized and realistic mesoscale convective systems at the wave trough and over the African continent. The same could be said of MidLevShConv25, however, as in the Hovmöller diagrams of [...]
In order to better understand the representation of convective systems over the wider Africa-Atlantic region, convective storms are tracked in the area 40W-40E and 25S-25N over the period from July 8 to July 15, 2010, both for the different model configurations and Global Precipitation Measurement (GPM) observations (Huffman et al., 2019). The tracking is based on half-hourly precipitation data, a threshold of 3 mm/hour is chosen to identify convective systems, and an overlap of 50% is required from time stamp to time stamp for propagating systems (Stein et al., 2014; Crook et al., 2019). The histogram of convective storm lifetimes (Figure 4a) indicates that there are overall too many convective storms in all considered model configurations compared to GPM. GA7 overestimates the number of short-lived storms and underestimates the number of long-lived storms. For the other considered, convection-permitting configurations, the long-lived convective storms tend to be too persistent. For the mean size (over the storm life cycle) all models exhibit too many small storms, and too few very large storms compared to GPM (Figure 4b). This holds true for GA7 in particular. For the number of medium-sized storms it is hard to make a robust statement. The distributions of mean speeds (over the storm life cycle) suggest that the storms in GA7 move too fast compared to GPM. In the convection-permitting configurations the storms tend to move too slowly, most distinctly so in ConvOffMoistConsRAturb. The distributions confirm the visual impression obtained from the Hovmöller plots (Figure 1) that the fraction of slower-moving storms, as part of the total number of storms, is slightly larger in MidLevShConv25 than in MidLevShConv25RAturb. It is to be expected that the characteristics of storms in GPM will have biases over the region, particularly with regard to their areas, and the results should be interpreted with some caution.
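The two tracking criteria quoted above, a rain-rate threshold and a minimum overlap between consecutive time stamps, can be sketched as follows. This is not the tracking code of Stein et al. (2014) or Crook et al. (2019). In particular, treating the threshold as inclusive and measuring the overlap relative to the smaller of the two storm areas are assumptions made for this example, and each mask is assumed to contain a single storm.

```python
import numpy as np

RAIN_THRESHOLD_MM_PER_HR = 3.0   # from the text
MIN_OVERLAP_FRACTION = 0.5       # from the text


def storm_mask(precip_mm_per_hr):
    """Boolean mask of grid points counted as part of a convective system."""
    return np.asarray(precip_mm_per_hr) >= RAIN_THRESHOLD_MM_PER_HR


def overlap_fraction(mask_prev, mask_next):
    """Overlapping area as a fraction of the smaller storm area (assumed definition)."""
    smaller_area = min(mask_prev.sum(), mask_next.sum())
    if smaller_area == 0:
        return 0.0
    return np.logical_and(mask_prev, mask_next).sum() / smaller_area


def is_same_storm(precip_prev, precip_next):
    """True if the storm at the later time stamp continues the earlier one."""
    frac = overlap_fraction(storm_mask(precip_prev), storm_mask(precip_next))
    return frac >= MIN_OVERLAP_FRACTION


if __name__ == "__main__":
    prev = np.zeros((4, 6))
    prev[1:3, 1:4] = 5.0             # storm covering 6 grid points
    shifted_by_one = np.zeros((4, 6))
    shifted_by_one[1:3, 2:5] = 5.0   # moved one column east
    shifted_by_two = np.zeros((4, 6))
    shifted_by_two[1:3, 3:6] = 5.0   # moved two columns east
    print(is_same_storm(prev, shifted_by_one), is_same_storm(prev, shifted_by_two))
```

With half-hourly data, an overlap criterion of this kind implicitly limits how far a storm of a given size may move between time stamps and still be counted as the same system, which is one reason why small, fast storms are harder to track robustly.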
The diurnal cycle of precipitation over the Sahel depends on the exact region considered, and it is not possible to assess the issue very robustly based on just two 5-day forecasts. Nevertheless, choosing the region 1E to 11E and 5N to 10N over the period from July 7 to July 15, 2010, the picture that emerges is plausible (Figure 5). According to the analysis GA7 shows an erroneous, well-known peak in precipitation around local noon and too little rainfall during night time. The convection-permitting model configurations fare much better and exhibit an afternoon maximum in agreement with TRMM. In the MidLev configurations the rainfall persists somewhat too strongly during the evening, whereas in ConvOffMoistConsRAturb rainfall is underestimated during the night. Note that mean rainfall over the course of the day is not generally overpredicted in the model simulations over the area.
The latter observation is confirmed also when it comes to the simulation of heavy rainfall over Africa. Box plots for both TRMM observations and model configurations of 3-hourly mean rainfall rates above the threshold of 30mm/hr over the region 20W to 45E and 35S to 35N, comprising the whole of the African continent, and the period July 7 to July 15, 2010, reveal that the convection-permitting configurations do not overestimate heavy rainfall events over Africa on the 3-hourly time scale (Figure 6). On the contrary, they tend to underestimate extremes compared to TRMM. The box plots show the lower, middle, and upper quartile of the data above the threshold, and the length of the whiskers is 1.5 times the interquartile range. GA7 does not have any data points above the threshold when considering 3-hourly mean rainfall rates.
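The box plot statistics described above can be computed as in the following sketch. The function name and the use of a strict inequality for "above the threshold" are assumptions for illustration; quartiles use NumPy's default linear interpolation, which may differ slightly from the convention used for Figure 6.

```python
import numpy as np


def box_stats_above_threshold(rain_rates_mm_per_hr, threshold_mm_per_hr):
    """Quartiles and whisker ends for the values above a threshold.

    Returns None if no value exceeds the threshold, as is the case for a
    configuration without any sufficiently heavy 3-hourly mean rainfall.
    """
    values = np.asarray(rain_rates_mm_per_hr, dtype=float)
    values = values[values > threshold_mm_per_hr]
    if values.size == 0:
        return None
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        # Whiskers of length 1.5 times the interquartile range, as in the text.
        "whisker_low": q1 - 1.5 * iqr,
        "whisker_high": q3 + 1.5 * iqr,
    }


if __name__ == "__main__":
    sample = [10.0, 31.0, 32.0, 33.0, 34.0, 35.0, 50.0]  # made-up values
    print(box_stats_above_threshold(sample, 30.0))
    print(box_stats_above_threshold([1.0, 2.0, 3.0], 30.0))
```

Because the statistics are conditional on exceeding the threshold, they describe the intensity of heavy events when they occur and say nothing about how often they occur; the two aspects need to be read together.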
Overall the model configurations MidLevShConv25RAturb and MidLevShConv15RAturb show the best convection-circulation coupling, speed of convective systems, and degree of convective organisation over the African region in the analyses presented in this section. Impressively, the respective simulations can predict the evolution of some individual convective systems several days ahead, and it is these convective systems that impact the livelihoods of people in the area most.

Hurricane Dorian and Typhoon Goni
The simulation and prediction of tropical cyclones are important applications of weather forecasting because of the devastation these phenomena can cause. Furthermore, changes in tropical cyclones under a warming climate may deliver some of the most significant impacts. Here we present results of forecasts of two tropical cyclones, hurricane Dorian initialized on August 30, 2019, 12:00 UTC, and typhoon Goni initialized on October 28, 2020, at 12:00 UTC. The runs use ECMWF operational analyses as initial conditions, except for the operational model runs shown in Figures 7 and 8. The then-operational model is GA6.1 (Walters et al., 2017) in the case of Dorian, and GA7.2 in the case of Goni, both run at 10km resolution and using Met Office analyses as initial conditions.
The predictions of central pressure in the simulations of tropical cyclones Dorian and Goni are significantly improved in the MidLev model configurations compared to the operational models, GA7 at 5km resolution, and also ConvOffMoistConsRAturb (Figure 7). In particular, spells of rapid intensification for both tropical cyclones are well captured by the MidLev model configurations.
The simulations of tropical cyclone tracks, however, are quite consistently degraded in the 5km-resolution simulations compared to the operational models (Figure 8 in the case of Goni, but also in results that are not shown here). This is true, though, also for GA7 at 5km resolution. We do not expect this to be due to the different initial conditions. The reason is not clear at this point and might be related to subtle degradations in regional synoptic scale conditions that affect the steering flow rather than the representation of the local cyclone processes themselves.

EUREC4A case over the Atlantic trades
One reason why tropical cyclone tracks are not particularly well simulated by the 5km-resolution models might be related to the structure of the atmosphere in the subtropics which influences the steering flow away from the cyclone center. To get a sense of this issue and to investigate shallow cumulus clouds over the trade wind region, a topic of particular importance in climate change studies, a case during the EUREC4A field campaign is considered (Stevens et al., 2021). The models are initialized on February 1, 2020, at 00:00 UTC with ECMWF ERA5 reanalysis (Hersbach et al., 2020), and evaluated on February 2.
The reflectance observed by the GOES16 satellite and the simulated reflectance demonstrate how trade wind cumulus clouds can be controlled in the model by scaling the shallow convective subgrid mass flux (Figure 9). A reduced subgrid mass flux leads to more, and brighter, shallow clouds. A difficulty here is that, as also known from regional models when simulating low clouds explicitly, the structure and organization of the clouds becomes more realistic but the cloud amount or the optical thickness of the clouds is often overestimated.
In the particular snapshot shown in Figure 9 [...] The moisture profile in the cloud layer is probably most realistic in MidLevShConv15RAturb while GA7 tends to be too dry. The case shows quite clearly how none of the configurations is perfect, and a subjective, informed judgment has to be made when assessing the different model configurations with regard to various aspects and features of the atmosphere.

Darwin mesoscale convective systems case
To conclude the analysis of tropical and subtropical cases, advantage is taken of the C-band radar at Darwin on the northern tip of Australia. The models are initialized on February 17, 2014, at 00:00 UTC with ECMWF operational analysis (Franklin et al., 2016).
All models show substantial deficiencies in the profiles of radar reflectivities (Figure 11). As with the simulated satellite imagery previously, the simulated radar reflectivity does not include cloud liquid and ice that are held within the convection scheme, somewhat penalizing the model with parameterized convection (GA7). Most model configurations tend to underestimate radar reflectivity at upper levels. The configuration with convection switched off, ConvOffMoistConsRAturb, reaches highest, but seems to overestimate high radar reflectivities at middle levels. All models simulate an unrealistically strong outflow around the freezing level. The shortcomings of the models will at least partly be related to the representation of microphysics, an aspect that has not been considered in detail in the present study and could in principle be tuned for a better agreement with observations. The comparison makes obvious the high-dimensional nature of the model development problem, and the extent of the challenge that arises because only a limited number of sensitivity experiments and cases can be run and assessed with a high-resolution global model. The profiles in Figure 11 show an area mean over 12 hours, so part of the errors may also be caused by inaccuracies in the exact location of the convective systems and not only by their structure.
Despite the challenges in the assessment, there are robust features across the different tropical cases, for example the structure of precipitation, here shown around Darwin (Figure 12). Our judgement is that the MidLev configurations overall exhibit the most realistic characteristics in terms of structure and organization (Figure 12d). The GPM product likely overestimates the extent of the rainfall areas of the largest clusters. The model ConvOffMoistConsRAturb exhibits a too scattered and blobby rainfall field and misses lighter rainfall. GA7 shows too widespread areas of light rainfall and very little precipitation over land. Similar conclusions can be drawn also when investigating other cases like the African easterly wave case (Section 3.1) over tropical land (not shown).

NAWDEX cases: the Atlantic extratropics
To understand the behavior of the convection-permitting model configurations in the midlatitudes, cases from the North Atlantic Waveguide and Downstream Impact Experiment (NAWDEX) field campaign (Schäfler et al., 2018) are examined. The results from two sets of forecasts are presented here. For one set the models are initialized on September 22, 2016, 00:00 UTC with ECMWF operational analysis; for the other set the initialization time is October 5, 2016, 00:00 UTC, again using ECMWF operational analysis.
The ex-tropical storm Karl reintensified on September 26, 2016, as it approached western Europe. In the subsequent development the jet stream was unusually strong on its southern flank, forming a jet streak that propagated ahead from Karl. The impact on European weather occurred through the formation of a new cyclone, Walpurga. Moisture-laden air was drawn around the Atlantic subtropical high. A particularly moist boundary layer was observed in this atmospheric river-type flow that extended to Norway, where it caused heavy, persistent rainfall (Schäfler et al., 2018).
A similar moisture transport pattern was also involved in the second case considered. On October 5, 2016, there was a high-pressure block over Europe. Two days later midlatitude cyclone Sanchez developed over the Atlantic. Nine days after forecast initialisation, on October 13, Sanchez caused heavy rainfall over southeastern France (Schäfler et al., 2018).
One could argue that the 5-day rainfall forecast for the September case is most accurate in the GA7 configuration (Figure 13). Although the front-like structure in rainfall moved slightly too fast over northern Europe in GA7, the heavy rainfall over Norway is well reproduced. In the convection-permitting MidLevShConv25RAturb configuration, the main rainband has moved slightly too slowly, although some of the rainfall reaches the west coast of Norway. ConvOffMoistConsRAturb shows more deficiencies and places the main precipitation area further inland.
Both GA7 and MidLevShConv25RAturb reproduce the river-like moisture band from the subtropics towards Scandinavia in the daily-mean vertically integrated water vapor fields (Figure 14, left column for the September case). MidLevShConv25RAturb shows a stronger gradient in moisture in the northern part of Great Britain which could be responsible for the rainfall in this region. The differences in rainfall over the Norwegian coast seem to be related also to differences in the dynamical fields where GA7 exhibits a stronger bend in the upper-level potential vorticity (PV) in the area (not shown).
In the October case, GA7 performs distinctly worse than MidLevShConv25RAturb in terms of rainfall prediction 9 days after initialization (Figure 15). The rainfall field on October 13, when the heavy rainfall event over southern France occurred (red circles in the panels of Figure 15), shows that GA7 moves the rainfall too quickly over the continent. Only the MidLev configurations predict the location of the heavy rainfall area correctly. Comparing the integrated water vapor fields on October 10 (Figure 14, right column) suggests that MidLevShConv25RAturb manages to draw the moisture further north. However, it is not obvious whether this difference is key in this case.
It is interesting to examine the PV pattern at the 320K isentrope in the different configurations (Figure 16). It would be difficult to derive the differences in rainfall prediction from the differences in the upper-level PV field. What is striking is how different the upper-level PV fields are in MidLevShConv25RAturb compared to MidLevShConv25 (Figure 16, panels c and d). The two configurations differ only in the representation of turbulence, suggesting that the subgrid turbulence scheme can have a strong impact on upper-level dynamics. However, as noted before, the predictions of rainfall over Europe disagree less, although there are some differences.
Based on the two presented cases it is possible that the convection-permitting configurations tend to move the fronts related to extratropical cyclones somewhat too slowly compared to observations, perhaps due to increased cumulus friction or differences in meridional geopotential height gradients resulting in changes in geostrophic jet strength. However, this does not necessarily translate into worse performing rainfall forecasts compared to the model with parameterized convection. The key features in large-scale dynamical fields that lead to better rainfall prediction can be difficult to identify, and over the whole of the Atlantic area the differences in these fields are typically large in individual forecasts. This poses significant challenges in the assessment of the performance of the various configurations over the midlatitudes, which will require a much more extensive set of statistics, and the results of such an assessment will depend on the considered quantity and the specific application.

Important sensitivities and additional configuration tests
A few additional sensitivity tests regarding model configuration options are briefly described here. They are generally based on a rather limited amount of evidence, but some of them concern important aspects directly related to the convective gray zone problem and are not incidental. Not all of the sensitivity experiments that were conducted are presented and discussed in this section.

CAPE closure time scale
The CAPE closure time scale of the convection scheme in the MidLev configurations is a key parameter (see Table 1) that controls the amount of subgrid convective mass flux relative to the resolved, explicit vertical motion and related rainfall in the model. The effect of changing the CAPE closure time scale can also be identified in global mean values of total rainfall versus convective rainfall. For the MidLevShConv25 configuration and a mean over the days July 12 to July 15 (the second forecast of the African easterly wave case) the global mean of total precipitation is 3.689, 3.666, and 3.638 mm/day for the simulations with CAPE time scales 3000s, 2700s, and 2400s, respectively. In other words, there is relatively little change in global mean total rainfall. The global mean convective rainfall, however, increases from 0.909 to 1.032 to 1.178 mm/day, implying that the percentage of convective rainfall as part of the total rainfall increases from 24.7% to 28.1% to 32.5% when reducing the CAPE closure time scale.
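The quoted percentages follow from dividing the global-mean convective rainfall by the global-mean total rainfall. A minimal Python sketch of that arithmetic is given here, using only the rounded values stated above; the variable and function names are illustrative and not taken from any model code. Because the inputs are rounded to three decimals, the computed fractions can differ from the quoted percentages by roughly 0.1 percentage points.

```python
# Global-mean rainfall (mm/day) quoted in the text for MidLevShConv25,
# July 12-15 mean, keyed by CAPE closure time scale in seconds.
total_rain = {3000: 3.689, 2700: 3.666, 2400: 3.638}
convective_rain = {3000: 0.909, 2700: 1.032, 2400: 1.178}


def convective_fraction(convective, total):
    """Return convective rainfall as a percentage of total rainfall."""
    return 100.0 * convective / total


# Percentage of convective rainfall for each CAPE closure time scale.
fractions = {
    tau: convective_fraction(convective_rain[tau], total_rain[tau])
    for tau in total_rain
}

for tau in sorted(fractions, reverse=True):
    print(f"CAPE time scale {tau} s: {fractions[tau]:.1f}% convective")
```

The sketch makes the main point of the paragraph explicit: the total changes by about 0.05 mm/day across the three runs, while the convective share rises monotonically as the time scale is shortened.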

Initial perturbations in the convection scheme
It is common in convection schemes to use initial perturbations in order to represent subgrid variability, where some regions will be buoyant enough to trigger convection not achievable with the grid-box mean profiles, and this is also the case for the convection scheme used in the MidLev configurations. The original and default option is to apply perturbations in the temperature field, but in the deep convection scheme used in GA7 humidity perturbations are added. Whether temperature or humidity perturbations are used in the MidLev configurations turns out to have quite a substantial effect. Using humidity perturbations makes the MidLev simulations look somewhat more similar to GA7, the model with fully parameterized convection (Figure 18). The buoyancy perturbation is fixed when shifting from temperature to humidity perturbations, but the perturbation in terms of moist static energy becomes somewhat larger, a circumstance that could also play a role.
In some of the diagnostics used in the African easterly wave case (Section 3.1), for example, one can see that the coupling between rainfall and wave becomes weaker in the early phase of the wave when applying humidity perturbations, and the clouds have a somewhat more blurred appearance compared to the simulations using temperature perturbations (Figure 18). A similar impression, namely that the simulations tend to become a bit more similar to GA7 when using humidity perturbations, results also from the extratropical NAWDEX cases (Section 3.5, not shown).

Sensitivity to turbulence formulation and miscellanea
Given that in Section 3.5 it was shown that the turbulence scheme has a substantial influence on the large-scale dynamical fields in the troposphere, two more turbulence options were tested and investigated for the NAWDEX cases on top of the MidLevShConv25RAturb configuration. In one test, instead of relaxing the mixing length towards the Smagorinsky length scale away from turbulent layers, a background mixing length scale of 40m is used if the layer is stable or weakly turbulent. In another test a different stability function is used for unstable Richardson numbers, which produces significantly more mixing and which has previously been employed in the tropical version of the Unified Model regional configuration (Bush et al., 2020). In both tests the rainfall forecasts look somewhat degraded. For example, in the October case (see Section 3.5), on day nine of the forecast the rainfall is placed mainly over the Alps in the first test and mainly east of the Alps in the second test, thereby showing deficiencies similar to those of GA7 (not shown).
A few additional targeted, but limited, sensitivity tests have been conducted also with regard to the fountain buster, the time-smoothing of midlevel convective increments, the convective momentum transport settings, and different moisture conservation schemes (not shown). For example, the impact of the convective increment smoothing and the convective momentum transport settings on the speed of convective systems was assessed in the African easterly wave case, but the impact was found to be small. The use of the fountain buster was judged to be overall beneficial. The effect of applying different available local moisture conservation schemes was concluded to be minor on the considered time scales.

Months of July 2016 and January 2018
To estimate features that are of importance in climate studies, in particular the top-of-the-atmosphere energy budget, two full months, July 2016 and January 2018, are covered by 5-day forecasts with one overlapping day between forecasts using the convection-permitting MidLevShConv25RAturb configuration. The first 24 hours of each forecast are not included in the analysis. The first initial times are June 30, 2016, 00:00 UTC, and December 31, 2017, 00:00 UTC, respectively. Due to a model failure, all forecasts that cover January 2018 were run with a so-called polar cap for advection in which the semi-Lagrangian advection is replaced by an interpolation in an area close to the poles (see Appendix A). Low-resolution tests have shown that on time scales of days the impact of the polar cap scheme on the quality of forecasts is neutral.
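The forecast chaining described above can be sketched in a few lines of Python. The sketch assumes that "one overlapping day" means successive forecasts start four days apart, so that the discarded first 24 hours of each forecast coincide with the last day of the previous one; the function name and its arguments are hypothetical and serve only as illustration.

```python
from datetime import datetime, timedelta


def forecast_schedule(first_init, month_end, length_days=5, spinup_days=1):
    """List (init, used_start, used_end) for forecasts covering a month.

    used_start is the init time plus the discarded spin-up period;
    used_end (exclusive) is the end of the forecast. Successive
    forecasts are assumed to start length_days - spinup_days apart.
    """
    stride = timedelta(days=length_days - spinup_days)
    schedule = []
    init = first_init
    # Keep adding forecasts while their analysed period starts before month end.
    while init + timedelta(days=spinup_days) < month_end:
        used_start = init + timedelta(days=spinup_days)
        used_end = init + timedelta(days=length_days)
        schedule.append((init, used_start, used_end))
        init += stride
    return schedule


# Initial times stated in the text; month_end is the first instant after the month.
july_2016 = forecast_schedule(datetime(2016, 6, 30), datetime(2016, 8, 1))
january_2018 = forecast_schedule(datetime(2017, 12, 31), datetime(2018, 2, 1))

for init, used_start, used_end in july_2016:
    print(f"init {init:%Y-%m-%d}: analysed {used_start:%Y-%m-%d} to {used_end:%Y-%m-%d}")
```

Under this assumption the analysed periods tile each month without gaps, since each forecast's analysed period begins exactly where the previous one ends.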

Mean rainfall and top-of-the-atmosphere radiative fluxes
Figures 19 and 20 show mean top-of-the-atmosphere outgoing shortwave (top row) and longwave (middle row) radiation for both months alongside the corresponding CERES EBAF 4.1 observations (Loeb et al., 2018). Moreover, monthly mean precipitation is depicted for the model simulations and GPM rainfall observations (bottom rows). Overall, the clouds look very realistic in the simulations and it would be difficult to distinguish between simulations and observations by eye. There are, however, some quite distinct biases in the precipitation fields, particularly over the tropical oceans. Rather strong rainfall biases can be identified over the tropical Pacific and Indian Ocean during boreal winter (Figure 20) and the southern branch of the ITCZ in the western tropical Pacific during boreal summer (Figure 19). Over tropical land and over the extratropics the biases are much less pronounced. Similar tropical rainfall biases have also been described in other convection-permitting global atmosphere models (Caldwell et al., 2021; Hohenegger et al., 2022).
In terms of the global-mean top-of-the-atmosphere radiation budget, the outgoing longwave radiation (OLR) is fairly close to observations without any tuning, while the model is too reflective when it comes to the top-of-the-atmosphere outgoing shortwave radiative fluxes, especially during boreal summer. The results are summarized in Table 2. Some cloud tuning would be required before coupling the MidLevShConv25RAturb configuration to an ocean model.

Madden-Julian Oscillation
A strong Madden-Julian Oscillation (MJO) passed from the Indian Ocean over the Maritime Continent to the West Pacific during the month of January 2018 (Figure 21).
The MidLevShConv25RAturb model reproduces the propagation of the rainfall very well and the 3-hourly mean rainfall rates are in good agreement with GPM observations. The then-operational GA6.1 global model with parameterized convection shows much weaker eastward propagation, with a significant reduction of rainfall when it reaches the Maritime Continent land, which is often referred to as the barrier effect (Zhang & Ling, 2017). The propagation of the MJO does not appear to suffer from the barrier effect over the Maritime Continent in the convection-permitting simulations and its regions of impact are fairly well captured.

The West Pacific subtropical high
The West Pacific subtropical high (WPSH), a distinct anticyclone in the middle and lower troposphere over the western North Pacific, is a key component of the East Asian summer monsoon system. It affects the regional hydrological cycle and has a significant influence on tropical cyclone activity in the Western North Pacific (Yihui & Chan, 2005; B. Wang et al., 2013). Correct simulations of the WPSH are important for seasonal forecasting and climate projections. In low-resolution versions of the Unified Model, as in many other models, the mean-state summer WPSH is too weak and located too far east (Rodríguez et al., 2017). These biases lead to an underestimation of the southwesterly monsoon flow over East Asia, affecting the representation of the seasonal water cycle in the area. Studies of systematic errors in climate models have shown that they can develop in the first few days of a simulation, and then persist to climate time scales (Martin et al., 2021). We can therefore gain an understanding of the emergence of such biases by analyzing errors in initialized hindcasts (Rodríguez & Milton, 2019).
Here forecasts show an improvement in the representation of the WPSH mainly in terms of the intensity and the westward extension, important aspects in the context of the regional weather and climate.However, biases in the region of the Kamchatka Peninsula increase.

Extratropical cyclones
Extratropical cyclones are tracked in the MidLevShConv25RAturb configuration simulations for both months, July 2016 and January 2018, using the TRACK (Hodges, 1995) algorithm on the 850hPa relative vorticity field. The same tracking algorithm is also applied to ECMWF analyses and the then-operational Met Office Unified Model for both July 2016 and January 2018. The operational model used by the Met Office in July 2016 and January 2018 was the GA6 configuration with fully parameterized convection (Walters et al., 2017), at N768 resolution for July 2016 and at N1280 resolution for January 2018.
Cyclones identified in the MidLevShConv25RAturb simulation are matched (location and intensity) against those in ECMWF analyses using the method of Froude et al. (2007a, 2007b). Then, for comparison, the matching is re-run on the then-operational
The investigation of the aforementioned questions always takes place in a certain context, with regard to a particular chosen reference configuration and taking specific representations of subgrid processes into account. For example, if the considered representation of subgrid convection is distinctly deficient, then in practice it might be advisable to switch off such a subgrid convection scheme in kilometer-scale models completely, even though in theory convection is not well resolved. Moreover, it is clear that the particular application of the model and the quantities of interest are an important factor in the assessment. There is not one model configuration which performs better than any other configuration in all meteorological situations and geographical locations.
The particular metrics used in the evaluation will always influence the conclusions. In numerical weather prediction, for instance, an issue is how to weight more traditional error indices that target large-scale fields in the middle troposphere against more impact-based, near-surface measures.
Here a case study approach is taken in which selected test cases are compared to observations. A robust conclusion of the assessment in the present study, conditional on the assumed framework, is that the convection-permitting configurations which include scale-aware turbulence and a carefully reduced and simplified mass-flux convection scheme outperform both the model with fully parameterized convection and a configuration in which the subgrid convection parameterization is switched off completely with regard to the organization of convection and related features of tropical variability. The key idea with regard to the representation of subgrid convection in the convective gray zone is to use a unified scheme in the free troposphere which is allowed to trigger from the top of the boundary layer and not below, and to limit the subgrid convective mass flux, both in the free troposphere and the shallow-convective mass flux in the boundary layer. The assessment in the extratropics is more ambiguous and challenging than in the tropics, and better statistics are needed. The configuration with fully parameterized convection is more competitive in the midlatitudes, whereas for tropical variability it shows distinct deficiencies. The results are in broad agreement with the study by Judt and Rios-Berrios (2021). Moreover, as already mentioned, the conclusion might differ depending on the considered variable and metric, for example rainfall or upper-tropospheric wind and temperature. Also the comparison of longer-term mean statistics can be less straightforward because the model with parameterized convection is rather resolution insensitive and can therefore be tuned using low-resolution versions of the model. This does not apply the same way to the convection-permitting configurations, especially given that physical parameterizations are generally not scale-aware. Nevertheless, for the convection-permitting configurations a further effort in cloud tuning will be required, for instance, to bring global-mean top-of-the-atmosphere radiative fluxes more in line with observations, an aspect that is important in climate applications. A process-based approach to evaluation, however, as taken in the present study, is certainly a valuable and integral part of any assessment.
Based on the presented work the further development of scale-aware physical parameterizations is endorsed, particularly with regard to convection (Holloway et al., 2014),
The model data are interpolated to the TRMM grid using an area-weighted regridding scheme. The data are averaged over the latitudes 10N to 20N.
, panel b), and also in ConvOffMoistConsRAturb the predictive skill related to the convective systems at and ahead of the wave trough, and the imprint of the diurnal cycle signal, are not satisfactory (Figure 1, panel f). In the MidLevShConv configurations (panels c, d, and e) the coupling between convection and the wave dynamics is well represented, and moist convection supports the dynamical development of the wave (not shown). However, in MidLevShConv25 the speeds of the convective systems around the wave trough seem to be slightly too slow in the first forecast (Figure 1, panel c), whereas this feature is better represented in MidLevShConv25RAturb and MidLevShConv15RAturb (Figure 1, panels d and e, respectively). These characteristics are also confirmed in observed and simulated satellite imagery (Figures 2 and 3), showing reflectance based on the visible channel at 0.8µm on July 10, 12:00 UTC (Figure 2), and brightness temperature based on the IR channel at 10.8µm on July 10, 18:00 UTC (Figure 3), i.e., almost 4 days after forecast initialization. The dashed red lines in the plots indicate the approximate position of the wave trough. The satellite simulated fields from the global model simulations are derived using an offline

Figure 1, in relation to the wave the main convective systems are located behind the trough (Figures 2 and 3, left panel in the middle row), suggesting that they move somewhat too slowly. For ConvOffMoistConsRAturb (Figures 2 and 3, bottom right panel), the convective systems are too scattered and widespread across the land area, a feature that is evident even more clearly in the precipitation field (not shown). Moreover, clouds and rainfall are not very sensitive to the presence of the wave and do not show a distinct response to the wave circulation around the trough in this configuration.
low cloud cover is distinctly underestimated in the GA7 configuration, especially over the Pacific (Figure 9, top right panel). However, lower-resolution versions of the GA7 model have been tuned to produce realistic global-mean fluxes, whereas in radiation budget estimates presented in Section 5 MidLevShConv25RAturb exhibits a distinctly overpredicted global-mean top-of-the-atmosphere outgoing shortwave radiative flux in the absence of tuning. This shows how seemingly more realistic process representation does not always translate into better agreement with observations, at least not in every respect. Observed profiles from HALO research aircraft dropsondes sampled over the EUREC4A campaign observation area east of Barbados over a time window of 17:50 to 18:10 UTC (George et al., 2021) show that the increased trade wind cloudiness in the MidLevShConv15RAturb configuration is associated with a cooler and moister lower troposphere (Figure 10). The overall structure of the lower troposphere is quite well represented, including by GA7. The slight cold bias in the subtropical trade wind cloud layer is consistent with anecdotal evidence from other forecasts and might at least partially be due to the somewhat overestimated cloud amount or optical thickness in the lower troposphere of the subtropics. The most notable feature is a dry bias in the subcloud layer of the models, which is present in all configurations despite the different turbulence schemes used.

Figure 17 shows the rainfall Hovmöller plots of total rainfall and convective rainfall for
Figure 21 also shows evidence of faster eastward Kelvin-like disturbances and westward Rossby or Mixed Rossby-Gravity wave features that are better captured in the convection-permitting model configuration MidLevShConv25RAturb compared to the parameterized convection model. The result shows that the convection-permitting model configuration MidLevShConv25RAturb is able to accurately predict MJO-related high-impact weather in the region.
, a comparison of the development of circulation biases associated with the WPSH in global 5km-resolution MidLevShConv25RAturb configuration simulations and low-resolution N216 (about 80km-resolution) GA6 configuration forecasts initialised from Met Office analyses is made for July 2016 (Figure 22). The 850hPa geopotential height fields at 5 days lead time indicate that in the 5km-resolution MidLevShConv25RAturb simulations the WPSH is developed more strongly and extends further to the west and south, in better agreement with ERA5 compared to the low-resolution GA6 forecasts (Figure 22, panels a-c). This is confirmed by comparing three indices, I_in, I_W and I_N, in ERA5 and the model forecasts at a lead time of 5 days, following Chen et al. (2010) and Lu and Dong (2001). The indices represent, respectively, the intensity, westward extension and northern edge of the WPSH (Figure 22, panel d). The 5km-resolution MidLevShConv25RAturb

GA6.1 forecast data and ECMWF analyses. The mean error in extratropical cyclone location (top row) and intensity (bottom row) for MidLevShConv25RAturb (black lines) and GA6 (red lines), relative to ECMWF analyses, is shown for the NH winter (Figures 23a and b) and SH winter (Figures 23c and d). The performance of the MidLevShConv25RAturb simulation is almost identical to that of both resolutions of the GA6 operational model, relative to ECMWF analyses (Figure 23). Therefore, in the context of extratropical cyclone mean location and intensity, there is no improvement from running the MidLevShConv25RAturb configuration at 5km grid spacing compared to the then-operational model at 10km resolution, but also no degradation. Nevertheless, the structure of individual cyclones (e.g. fronts) may be simulated better at 5km resolution and will be an area of future work.

Summary and conclusions
In the present work the challenges of the convective gray zone are discussed in the context of a 5km-resolution global atmospheric model, and a few fundamental ideas concerning the representation of turbulence and convection are proposed and assessed. Different questions regarding the formulation of subgrid turbulence, convection, and model dynamics are considered: should the convection parameterization be switched off at a particular grid length in convection-permitting models or is a more seamless approach to be preferred? Are one-dimensional turbulence parameterizations sufficient in kilometer-scale models or is a three-dimensional representation of turbulence beneficial, or a compromise between the two? What are the impacts of non-conservation characteristics of the advection scheme and physical parameterizations? Most of these questions are not in themselves binary in nature (convection parameterization switched on or off). On the contrary, as the model resolution is increased continuously, the resolved convective and turbulent fluxes are supposed to increase continuously and the subgrid, parameterized fluxes should decrease continuously.
only because of the expected better performance, fidelity of process representation, and agreement with observations, but also because it enables better traceability and seamlessness of the model development process across resolutions, time scales, and applications. In the context of the Met Office and its partner institutions, we plan to operationally implement the 5km-resolution global model as an experimental forecast to supplement the main lower-resolution global ensemble. This could contribute to a limited number of operational outputs, likely focussed on near-surface variables and high-impact weather, as well as to the ongoing assessment and development, and act as a stepping-stone towards a future convection-permitting global ensemble. This step-by-step approach allows for better identifying and understanding the strengths and limitations of a convection-permitting global model under various meteorological conditions. Future plans also include season- and year-long simulations and the coupling to the NEMO ocean model. But as the radiative flux estimates in the case of the MidLevShConv25RAturb configuration show, some cloud tuning will be required as a prerequisite.
The Met Office modeling system will be overhauled over the next few years with the introduction of a new dynamical core, a new two-moment microphysics scheme, a new convection scheme, and improvements to the cloud scheme. The present work is therefore not intended to be the final word on convection-permitting global model development. Nevertheless, the results of the study will help to guide some fundamental choices in the future development of an adequate convection-permitting global coupled modeling system for use across weather and climate time scales.

Appendix A
Met Office GMED (Global Model Evaluation and Development) tickets associated with the discussed model configurations are listed and described in some more detail in

Figure 2. Observed (top left) and simulated reflectance at 0.8µm for the MSG satellite for July 10, 2010, 12:00 UTC, based on the same configurations as in Figure 1, presented in the same order.

Figure 3. Observed (top left) and simulated brightness temperature at 10.8µm for the MSG satellite for July 10, 2010, 18:00 UTC, based on the same configurations as in Figure 1, presented in the same order.

Figure 4. Results of convective storm tracking over the Africa-Atlantic region (40W-40E, 25S-25N) and the period July 8 to July 15, 2010. Top left panel: histograms of convective storm lifetimes; top right panel: histograms of convective storm mean areas; bottom panel: histograms of convective storm mean speeds. The model data are interpolated to the GPM grid using an area-weighted regridding scheme before the storm tracking is performed.

Figure 5. Diurnal cycle of precipitation for July 7 to July 15, 2010, averaged over the region 1E to 11E and 5N to 10N. The arrangement of the plots is as in Figure 1 with the top left panel showing TRMM observations.

Figure 6. Boxplot of TRMM observations and model simulations of 3-hourly mean rainfall rates above the threshold of 30mm/hr over the region 20W to 45E and 35S to 35N, comprising the whole of the African continent, and the period July 7 to July 15, 2010. The box plots show the lower, middle, and upper quartile of the data above the threshold, and the length of the whiskers is 1.5 times the interquartile range. Individual values are indicated by dots; the points outside the whiskers are also shown as bold dots.
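The box-plot statistics named in the caption can be sketched in Python as follows. This is an illustration under stated assumptions, not the plotting code used for the figure: the quartile interpolation method (`method="inclusive"`), the function name `box_stats`, and the sample values are all hypothetical, and plotting libraries differ in how they draw whiskers within the 1.5 times interquartile range limits.

```python
from statistics import quantiles


def box_stats(rates, threshold=30.0, whisker_factor=1.5):
    """Quartiles and whisker limits of the rainfall rates above a threshold.

    At least two values must exceed the threshold, otherwise
    statistics.quantiles raises StatisticsError.
    """
    exceed = sorted(r for r in rates if r > threshold)
    q1, median, q3 = quantiles(exceed, n=4, method="inclusive")
    iqr = q3 - q1
    low_limit = q1 - whisker_factor * iqr
    high_limit = q3 + whisker_factor * iqr
    # Values beyond the whisker limits are reported separately.
    outliers = [r for r in exceed if r < low_limit or r > high_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "low_limit": low_limit,
        "high_limit": high_limit,
        "outliers": outliers,
    }


# Synthetic 3-hourly mean rainfall rates in mm/hr (not data from the study).
sample_rates = [5.0, 31.0, 32.0, 12.0, 34.0, 36.0, 40.0, 45.0, 80.0]
stats = box_stats(sample_rates)
print(stats)
```

Applying the threshold before computing the quartiles matters: the statistics then describe only the heavy-rainfall tail, which is the part of the distribution the figure compares between observations and configurations.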

Figure 7. Observed and modelled central pressure developments for tropical cyclones Dorian (top panel) and Goni (bottom panel).The operational model is GA6 in the case of Dorian and GA7 in the case of Goni, both run with fully parameterized convection at 10km resolution.The observations are from the IBTrACS data base of the National Oceanic and Atmospheric Administration (NOAA) National Climate Data Center (NCDC).

Figure 8. Cyclone track predictions for tropical cyclone Goni in the vicinity of the Philippines.The operational model (OP; red lines) is GA7 run with fully parameterized convection at 10km resolution.Black lines are observations and red lines the different 5km-resolution configurations.The observations are from the IBTrACS data base of the National Oceanic and Atmospheric Administration (NOAA) National Climate Data Center (NCDC).

Figure 9. Observed (top left) and simulated reflectance at 0.64µm for the GOES16 satellite for February 2, 2020, 18:00 UTC, based on the same configurations as in Figure 1, presented in the same order.

Figure 11. Observed (top left) and simulated radar reflectivity profiles (February 18, 12-24 UTC mean) over the area within reach of the Darwin C-band radar. Color shaded contours represent the frequency of occurrence.

Figure 12. Precipitation rate for February 18, 2014, 18:00 UTC, over the wider Darwin region based on the same configurations as in Figure 1, presented in the same order. Top left is for GPM observations. The model data are interpolated to the GPM grid using an area-weighted regridding scheme.

Figure 13. NAWDEX case precipitation for September 27, 2016, 18:00 UTC, based on the same configurations as in Figure 1, presented in the same order. Top left is for GPM observations. The model data are interpolated to the GPM grid using an area-weighted regridding scheme.

Figure 14. Daily mean integrated water vapor path for September 27, 2016 (left), and October 10, 2016 (right). Top row: Satellite Application Facility on Climate Monitoring (CMsaf) observations. Middle row: GA7 configuration; bottom row: MidLevShConv25RAturb. The model data are interpolated to the CMsaf grid using an area-weighted regridding scheme.

Figure 15. NAWDEX case precipitation for October 13, 2016, 18:00 UTC, based on the same configurations as in Figure 1, presented in the same order. Top left is for GPM observations. The red circle highlights the area of heavy rainfall in southern France. The model data are interpolated to the GPM grid using an area-weighted regridding scheme.

Figure 16. NAWDEX case PV at 320K for October 13, 2016, 18:00 UTC, based on the same configurations as in Figure 1, presented in the same order. Top left is for ERA5.

Figure 17. Model convective rainfall for MidLevShConv25 with CAPE time scale 3000 seconds (top left), MidLevShConv25 with CAPE time scale 2700 seconds (top middle), and MidLevShConv25 with CAPE time scale 2400 seconds (top right). Bottom row: Total precipitation for the same configurations. Although the convective rainfall is only moderately increased in the simulation with CAPE time scale 2400 seconds compared to the other ones, the more active convection scheme starts to disrupt the prediction of the mesoscale convective systems during the early strengthening phase of the African easterly wave. The model data are interpolated to the TRMM grid using an area-weighted regridding scheme.

Figure 19. Mean outgoing shortwave radiation for July 2016 from the MidLevShConv25RAturb simulation (top left) and CERES EBAF 4.1 (top right). Mean outgoing longwave radiation for July 2016 from the MidLevShConv25RAturb simulation (middle left) and CERES EBAF 4.1 (middle right). Mean precipitation for July 2016 based on the MidLevShConv25RAturb simulation (bottom left) and GPM (bottom right).

Figure 20. Mean outgoing shortwave radiation for January 2018 from the MidLevShConv25RAturb simulation (top left) and CERES EBAF 4.1 (top right). Mean outgoing longwave radiation for January 2018 from the MidLevShConv25RAturb simulation (middle left) and CERES EBAF 4.1 (middle right). Mean precipitation for January 2018 based on the MidLevShConv25RAturb simulation (bottom left) and GPM (bottom right).

Figure 21. Left panel: Hovmöller plot of 3-hourly mean rainfall rate based on the MidLevShConv25RAturb configuration simulations for January 2018. Middle panel: Hovmöller plot of 3-hourly mean rainfall rate based on the then-operational model GA6.1 (Walters et al., 2017) with fully parameterized convection. Right panel: Hovmöller plot of rainfall based on the GPM rainfall data averaged to 3-hourly mean values. For all plots the rainfall was averaged over the latitudes 10 South to 10 North; units are mm/day.
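The latitude-band averaging behind such a Hovmöller plot can be sketched in Python as below. The sketch assumes a regular latitude-longitude grid and a simple unweighted mean over the rows between 10 South and 10 North (close to the equator, area weights vary only weakly with latitude); the function name, the nested-list data layout `field[time][lat][lon]`, and the sample numbers are hypothetical and are not taken from the study.

```python
def latitude_band_mean(field, lats, lat_min=-10.0, lat_max=10.0):
    """Average field[time][lat][lon] over the latitude rows inside a band.

    Returns hov[time][lon], the time-longitude array that a Hovmöller
    plot displays. The mean is unweighted (an assumption of this sketch).
    """
    rows = [j for j, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    if not rows:
        raise ValueError("no grid rows inside the latitude band")
    hov = []
    for time_slice in field:
        n_lon = len(time_slice[0])
        hov.append(
            [sum(time_slice[j][i] for j in rows) / len(rows) for i in range(n_lon)]
        )
    return hov


# Synthetic rainfall (mm/day): 2 times, 4 latitude rows, 3 longitudes.
lats = [-15.0, -5.0, 5.0, 15.0]
rain = [
    [[9.0, 9.0, 9.0], [1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]],
    [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0], [4.0, 4.0, 4.0], [0.0, 0.0, 0.0]],
]
hovmoller = latitude_band_mean(rain, lats)
print(hovmoller)
```

Only the two rows at 5 South and 5 North contribute in this example, so values outside the band leave the result unchanged.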

Figure 22. Panel a: July 2016 mean 850hPa geopotential height and horizontal wind in ERA5. Panels b and c: Difference of July 2016 mean 850hPa geopotential height and horizontal wind to ERA5 for the 5km-resolution MidLevShConv25RAturb configuration and GA6 at N216 resolution, respectively. Geopotential height is in m and wind in m s⁻¹. WPSH intensity, west extension and northern edge indices in ERA5 and model hindcasts are shown in panel d. Axes in the scatter plot are as follows: the abscissa displays values of the west extension index, I_W (°E), and the ordinate shows values of the northern edge index, I_N (°N). The radii of scatter circles are proportional to the intensity index, I_in. Indices are shown for July 2016 ERA5 (light green), 5km-resolution (blue) and N216-resolution (red). To place 2016 data in context, the following information has been added: dotted lines indicate maximum and minimum values of July ERA5 I_W and I_N for the 1979-2019 period, and a dark green dot shows the period mean value.

Figure 23. Results of extratropical storm tracking for January 2018 in the northern hemisphere (two left panels) and July 2016 in the southern hemisphere (two right panels). Panel a: Mean storm track position error of the global 5km-resolution MidLevShConv25RAturb forecasts against ECMWF analysis (black line), and the then-operational Unified Model GA6 at 10km (N1280) resolution against ECMWF analysis (red line) for January 2018 and the northern hemisphere. Panel b: The same models and comparisons as in panel a, but for the mean intensity error of extratropical cyclones. The intensity measure is 850hPa vorticity. Panel c: As for panel a, but for July 2016 and the southern hemisphere. Here the then-operational model is run at N768 resolution. Panel d: The intensity error results for July 2016 over the southern hemisphere.

Table 1. Description of different model configuration options. See also main body of text in Section 2 and additional information in Appendix A.
Daily varying Operational Sea Surface Temperature and Ice Analysis (OSTIA) sea surface temperatures and sea ice are prescribed.The model time step is 90 seconds.Soil moisture is initialised from Met Office Unified Model analysis.

Table 2. Observed and modeled global-mean top-of-the-atmosphere (TOA) radiative fluxes for July 2016 and January 2018.

Table A1. The information is supplementary to Table 1 and the configuration description in Section 2. A more in-depth description of the tickets is accessible only to regis-