Sensitivity of Convection Permitting Simulations to Lateral Boundary Conditions in Idealized Experiments

Limited‐area convection‐permitting climate models (CPMs) with horizontal grid‐spacing less than 4 km and not relying on deep convection parameterisations (CPs) are being used more and more frequently. CPMs represent small‐scale features such as deep convection more realistically than coarser regional climate models (RCMs) with deep CPs. Because of computational costs, CPMs tend to use smaller horizontal domains than RCMs. Like all limited‐area models (LAMs), CPMs suffer from issues with lateral boundary conditions (LBCs) and nesting. We investigated these issues using idealized Big‐Brother (BB) experiments with the LAM COSMO‐CLM. Grid‐spacing of the reference BB simulation was 2.4 km. Deep convection was triggered by idealized hills with driving data from simulations with different spatial resolutions, with/without deep CP, and with different nesting frequencies and LBC formulations. All our nested idealized 2.4‐km Little‐Brother (LB) experiments performed worse than a coarser CPM simulation (4.9 km) which used a four times larger computational domain and yet incurred only half the computational cost. A boundary zone of >100 grid‐points of the LBs could not be interpreted meteorologically because of spin‐up of convection and boundary inconsistencies. Hosts with grid‐spacing in the so‐called gray zone of convection (ca. 4–20 km) were not advantageous to the LB performance. The LB's performance was insensitive to the applied LBC formulation and updating (if ≤3‐hourly). Therefore, our idealized experiments suggest opting for a larger domain instead of a higher resolution, even if coarser than usual (∼5 km), as a compromise between the harmful boundary problems, computational cost and improved representation of processes by CPMs.


10.1029/2021MS002519
(Coordinated REgional Climate Downscaling Experiment), for example, a default grid-spacing of about 50 km was suggested to be used in multiple domains covering all global continents (https://cordex.org/; Giorgi et al., 2009), but finer grid-spacing was already suggested and later on used (e.g., 12 km in EURO-CORDEX, https://www.euro-cordex.net, or CORDEX-CORE, Sørland et al., 2021). This, however, results in RCMs being used in the so-called gray zone of convection, that is, in a grid-spacing range of about 4-20 km. Here, the assumptions of the deep convection parameterisations (CPs), which are used in climate models, are not well fulfilled (Weisman et al., 1997).
Recently, limited-area convection-permitting climate models (CPMs) with grid-spacing below 4 km were developed and successfully applied (Ban et al., 2014, 2021; Kendon et al., 2012; Prein et al., 2015; Purr et al., 2021). These CPMs resolve much of the deep convective processes and do not use any deep CP, but are otherwise applied similarly to RCMs. They too rely on driving data from coarser-grid models or analysis data sets. However, because of their very high spatiotemporal resolution, the CPMs are computationally very expensive and at present feasible only over smaller domains and/or shorter climate periods than RCMs. For example, in Ban et al. (2021), CPM simulations performed by 23 institutions are evaluated with grid-spacings between 2.2 and 4 km. All simulations were driven by a reanalysis with ca. 80 km grid-spacing, with 21 institutions opting for an intermediate nest with grid-spacing of 12 or 15 km and only two institutions applying direct nesting. Consequently, the resolution jumps were factors between ca. 35 and 3, with the intermediate nests in the gray zone of convection. Three groups applied European-scale computational domains. Most of the institutions chose to set up smaller domains with extents of about 1,000 km.
In a convection-permitting simulation with grid-spacing of 2.8 km, Brisson et al. (2016) found an extended spatial spin-up zone at the primary lateral inflow boundary, which the simulated convective systems needed to fully develop. They investigated the nesting strategy, and concluded that an additional nesting step in the gray zone of convection with 7 km grid-spacing is not beneficial for the CPM simulation compared to a direct nesting into a driving simulation with grid-spacing of 25 km (i.e., with a resolution jump of about a factor of 10). Liang et al. (2019) conclude it might be reasonable to accept a resolution jump by a factor 30 in CPM nesting, if an intermediate nest in the gray zone of convection can be avoided.
Coarse-grid CPM experiments by Panosetti et al. (2019) have shown that convective processes are climatically well represented in case of strong orographic forcing (in a domain over the European Alps), but less well in case of weaker forcing in hilly terrain (in a domain over Central Germany). They concluded that a coarse-grid CPM with grid-spacing of 4.4 km might be sufficient in the mid-latitudes in cases with strong forcing. Typical grid-spacings in real-data applications, however, are 3 km and finer (see, e.g., Ban et al., 2021, which evaluates several CPMs over the European Alps). Figure 1 illustrates the CPM nesting challenge we want to investigate here. The simulated precipitation amounts shown are from RCM and CPM simulations discussed in Purr et al. (2019). The RCM was driven by the European Center for Medium-Range Weather Forecast Interim Reanalysis (ERA-Interim) from 1979 to 2015 using a European-scale domain with horizontal grid spacing of 0.22° (≈ 25 km). The CPM with a domain over Germany with grid-spacing of 0.025° (≈ 2.8 km) was nested into the RCM simulation and laterally nudged toward the driving data using Davies relaxation (H. C. Davies, 1976) with hourly updates of the LBCs provided by the RCM. The CPM simulated about 40% more precipitation on the 309 convectively most active days in the simulation period. Yet, the CPM simulates less precipitation in a spin-up zone along the primary inflow boundary from the South-West. How does the extent of this spin-up zone depend on the resolution jump between driving RCM and driven CPM, or on the nesting strategy? In Germany, there is mainly hilly or flat terrain, that is, terrain without strong orographic forcing. Nevertheless, is a European-scale coarser-resolution CPM, for example, with grid-spacing of 5 km, without intermediate nest better and even computationally cheaper?
This study investigates the challenge of nesting CPM into RCM simulations by using idealized experiments following a Big-Brother experiment (BBE) design (Denis et al., 2002). BBEs can rule out inconsistencies due to different model grids, physics parameterisations, or reference data. The first step in BBEs consists in realizing a simulation, nicknamed the Big-Brother (BB) simulation, on a sufficiently large domain at the desired finest resolution, to serve as reference data set. Such BBEs have been successfully applied in investigations about the optimal nesting strategies for RCM simulations (Denis et al., 2002; Leps et al., 2019; Matte et al., 2016). Here, idealized simulation experiments following a BBE variant described in Leps et al. (2019) explore the dependence of the added value of CPM simulations nested into coarser simulations on resolution jumps (which implies decreasing quality of the coarser driving simulation), LBC update frequencies, and LBC formulation.
We were not able to implement spectral nudging in our idealized CPM experiments. Spectral nudging is an often used technique to reduce inconsistencies between RCM and driving data (von Storch et al., 2000), which can also reduce the issues developing due to LBC treatment (Omrani et al., 2012). In spectral nudging, the large scales of the nested model are nudged toward the driving fields. However, this approach can also be seen critically (e.g., Mesinger & Veljovic, 2013; Leps et al., 2019), as it imprints the driving model's deficits on the nested simulation.
Motivated by the discussion above, we focus on hilly and flat terrain, that is, on terrain with weak or no orographic forcing. This study aims to increase CPM users' awareness of the nesting challenge and to provide additional guidance in planning CPM climate simulations.
The following section introduces the idealized simulation experiments applied using a modified Big-Brother experiment design (Leps et al., 2019) and the applied limited-area climate model with its set-up. Section 3 presents and discusses the idealized simulation results. Finally, we summarize and draw conclusions.

Method, Model, and Experiments
In this study, we used the modified Big-Brother-Experiment protocol as introduced in Leps et al. (2019). First, an idealized simulation was performed using a large domain with a high, convection-permitting resolution and deep-convection parameterization switched off. This idealized simulation is called the Big-Brother (BB) simulation. The BB simulation drove, that is, provided lateral boundary and initialization conditions for, a simulation on a smaller domain but otherwise the same set-up as the BB set-up. The small domain simulation is called the Little-Brother (LB) simulation and is chosen to have a typical domain size as in studies with realistic simulations (e.g., in Brisson et al., 2021; Purr et al., 2021). So-called Coarse-Brother (CB) simulations were performed on the BB domain with a coarser resolution to represent input data from a coarser model. CB simulations also drove LB simulations, and the BB simulation was used as the reference for the LB and CB simulations. With this protocol, it was possible to show the impact of nesting, the update frequency, U, of LBCs, and the resolution jump, J, from CB to LB set-ups. The simulation domains are illustrated in Figure 2, and the following subsections give the details.

Model and Big-Brother Set-Up
The nonhydrostatic LAM COSMO-CLM (e.g., Rockel et al., 2008) in version COSMO5.0-CLM7 was applied in the idealized test configurations. COSMO-CLM has been used successfully in many climate studies with typical grid-spacings from ∼50 km to convection-permitting scales with grid-spacing of order 1 km (Sørland et al., 2021). Necessary initial and lateral boundary data were compiled with the pre-processor INT2LM2.0-CLM4.
The reference set-up which was used to perform the reference simulation for later sensitivity experiments is called Big-Brother (BB) set-up. We used a one-moment microphysics scheme and shallow convection is parameterized using the convection scheme after Tiedtke (1989). In the reference simulation, no deep CP was used. The used radiation scheme follows Ritter and Geleyn (1992), and the lower boundary conditions were provided by the sub-model TERRA (with homogeneous land cover: short grass, roughness length 0.01 m) and turbulence scheme as documented in Doms et al. (2018). The Coriolis force term was switched off in all simulations aiming at a more zonal idealized flow, which reduced the impact of the meridional lateral boundary conditions and eased the discussions.
The BB set-up used a horizontal grid-spacing of 0.022° (≈ 2.4 km), 50 vertical levels, numerical time steps of 20 s, and a Cartesian simulation domain of 1,006 × 452 grid points (domain area: ≈ 2430 × 1100 km²). This domain size is large enough to host two non-overlapping domains with an order of size typical in CPM studies (e.g., Brisson et al., 2016, or Panosetti et al., 2019). The BB simulation was run for 24 hr with periodic LBCs (with overlapping boundary zones six grid points wide). The simulation orography is mainly flat with 12 Gaussian hills (height = 450 m, half-width = 25 km) in the western part of the domain. These hills are planted into the domain to trigger deep convection in the simulation, but are rather smooth so as not to provide too strong a forcing. They resemble hilly terrain in, for example, central Germany but not alpine terrain. Panosetti et al. (2019) have shown that simulations with km-scale grid spacing are more robust with strong orographic forcing from the European Alps than with weaker central German orographic forcing. The BB domain with the locations of the hills is sketched in Figure 2.
The simulation was initialized with a Weisman and Klemp (1982) wind shear profile as implemented by Blahak (2015) with a mean zonal wind speed of 20 m/s above 6 km, and with potential temperatures/relative humidities of 300 K/1 and 343 K/0.25 at the profile base at 0 m and at the tropopause at 12 km, respectively. The initial simulation profiles are shown in Figure 3. The zonal wind speed implies a parcel advection time of ≈ 34 h in the upper atmosphere from the inflow to the outflow boundary.
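The quoted advection time follows from the zonal domain extent and the upper-level wind speed. The short check below uses the rounded values given in the text (2430 km and 20 m/s); the function and variable names are illustrative only and not part of the model set-up.

```python
# Rounded values quoted in the text; all names are illustrative.
domain_length_km = 2430.0  # zonal extent of the BB domain
wind_speed_ms = 20.0       # mean zonal wind speed above 6 km


def advection_time_hours(length_km, speed_ms):
    """Time in hours for a parcel to cross length_km at constant speed_ms."""
    return length_km * 1000.0 / speed_ms / 3600.0


t_adv = advection_time_hours(domain_length_km, wind_speed_ms)
print(f"advection time: {t_adv:.2f} h")
```

With these inputs the crossing time is 33.75 h, in line with the ≈ 34 h stated above and longer than the 24-hr simulation period.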

Coarse-Brother Set-Up
Five different Coarse-Brother (CB) 24-hr simulations were performed with COSMO-CLM and covering the BB domain. Due to the overlapping zone for the periodic boundary conditions, the CB domains are slightly larger than the BB domain (interior domains are identical). Three grid spacings frequently used in RCMs, that is, 0.11°, 0.22°, and 0.44°, and additionally 0.044° were used. Thus, the CB simulations were 2, 5, 10, and 20 times coarser than the reference BB simulation. Table 1 summarizes the domain set-ups. The idealized hills were smoothed (yielding lower heights and larger half-widths, but keeping the same volume as in the BB set-up) as is usually the case with coarser model grids. Figure 3 shows the different hill profiles.
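The scale jumps J quoted above are simply the ratios of the CB grid spacings to the BB grid spacing. The sketch below recomputes them and converts the spacings to kilometres; the conversion factor of about 111.2 km per degree is an assumption made for this illustration, and all names are hypothetical.

```python
# Assumption for illustration: about 111.2 km per degree of grid spacing.
KM_PER_DEGREE = 111.2

bb_spacing_deg = 0.022                      # reference BB grid spacing
cb_spacings_deg = [0.044, 0.11, 0.22, 0.44]  # CB grid spacings from the text


def scale_jump(coarse_deg, fine_deg=bb_spacing_deg):
    """Ratio of the coarse grid spacing to the fine one (rounded against float noise)."""
    return round(coarse_deg / fine_deg, 6)


jumps = [scale_jump(d) for d in cb_spacings_deg]
spacings_km = [round(d * KM_PER_DEGREE, 1) for d in cb_spacings_deg]

for d, j, km in zip(cb_spacings_deg, jumps, spacings_km):
    print(f"{d:.3f} deg  ->  J = {j:g},  about {km} km")
```

Under this assumption the four spacings correspond to roughly 4.9, 12, 24, and 49 km, matching the values used throughout the text.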
Following, for example, Weisman et al. (1997) and Brisson et al. (2017), these CB simulations resolved deep convection partly at best, and therefore deep convection processes were usually parameterized here using the Tiedtke (1989) scheme in addition to the shallow convection processes. This is as it usually is in real-data applications: the CPMs are driven by RCMs (here represented by the CBs) that have to rely on deep and shallow convection parameterizations. Later we show results with deep convection switched on and switched off to explore the behavior of the simulations in the gray zone of convective parameterisations. In addition, we show some results with CP triggering by CAPE threshold instead of low-level moisture convergence threshold, which is the default in COSMO-CLM. LBCs and initialization were done as in the BB set-up. Table 1 gives approximate values of the relative computational processing times for the different CB simulations. The 12-km CB simulation needed only about 1% computing time compared to the BB reference simulation. Additionally, the CB simulations were also much cheaper in terms of necessary memory resources. The difference in cost of switching on or off the deep CP was negligible.
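A rough scaling argument makes these relative costs plausible. The sketch below is not the authors' cost accounting; it assumes that the cost is proportional to the number of horizontal grid points times the number of time steps, that the time step grows linearly with grid spacing, and that the number of vertical levels is unchanged, which gives a relative cost of about J⁻³ on the same domain.

```python
def relative_cost(jump):
    """Idealized cost relative to the BB run for a grid `jump` times coarser.

    Assumes cost ~ (horizontal grid points) x (time steps) with the time
    step proportional to the grid spacing, i.e. cost ~ jump**-3.
    """
    return 1.0 / jump**3


for jump in (2, 5, 10, 20):
    print(f"J = {jump:2d}: about {100.0 * relative_cost(jump):.3g}% of the BB cost")
```

Under these assumptions J = 5 gives 0.8%, close to the "about 1%" quoted for the 12-km CB simulation, and J = 2 gives 12.5%, close to the 13% reported for the 4.9-km CB simulation later in the text.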

Little-Brother Set-Up
The Little-Brother (LB) simulations were driven by BB and CB simulations in order to quantify the impact of typical scale jumps J between driving and driven simulations (see Table 1) and of the update frequency U (i.e., the frequency of availability of driving data per day). The set-up of numerics and physics of the LB simulations was the same as in the BB simulations, but   (Figure 2). The western LB domain includes the hills and thus represents a region where orographic triggering of deep convection occurs. The eastern domain, in contrast, represents a region where convective cells are advected into the domain through its lateral boundaries.
We chose typical driving frequencies U ∈ {96, 24, 8, 4}/day (every 15 min, hourly, 3- and 6-hourly). The available driving data were interpolated linearly in time to provide the necessary LBCs for the LB simulation for every numerical time-step. By default, COSMO-CLM uses the H. C. Davies (1976) relaxation approach. This approach prescribes all driving variables at all lateral boundaries, which means the problem is over-specified (too much information is given at the lateral boundaries). A sponge zone, in which the internal model solution is relaxed toward the driving data, is introduced to buffer any spurious noise developing at the lateral boundary. Leps et al. (2019) implemented another approach based on Mesinger (1977) which prescribes less information at the outflow boundaries. We call this approach the Mesinger approach. More details on the formulation of the LBCs are given in Leps et al. (2019). If not noted otherwise, the experiments discussed here used the Davies relaxation approach.
As Table 1 shows, each of the LB simulations costs about 26% of the reference BB simulation and about twice as much as the 4.9-km CB simulation in terms of processing time.
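The two ingredients described above, linear interpolation of the driving data in time and relaxation toward the driving data in a boundary zone, can be sketched generically as follows. This is a minimal one-dimensional illustration and not COSMO-CLM's implementation: the cosine-shaped weight profile, the zone width, and all function names are assumptions chosen for this example, and the relaxation is written as a simple blending step rather than as a tendency term.

```python
import numpy as np


def interpolate_lbc(field_t0, field_t1, t0, t1, t):
    """Linear interpolation in time between two available driving fields."""
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * field_t0 + w * field_t1


def relaxation_weights(n_points, zone_width):
    """Weights along one dimension: 1 at both boundaries, 0 in the interior.

    The cosine ramp is an illustrative choice, not the model's profile.
    """
    idx = np.arange(n_points)
    dist = np.minimum(idx, idx[::-1])  # distance to the nearest boundary
    frac = np.clip(dist / zone_width, 0.0, 1.0)
    return 0.5 * (1.0 + np.cos(np.pi * frac))


def relax_toward_driving(model_field, driving_field, weights):
    """Blend the model solution toward the driving data in the boundary zone."""
    return (1.0 - weights) * model_field + weights * driving_field


# Hourly driving data (U = 24/day), evaluated 15 min after the first field.
drive_0 = np.full(20, 280.0)
drive_1 = np.full(20, 282.0)
driving = interpolate_lbc(drive_0, drive_1, t0=0.0, t1=3600.0, t=900.0)

weights = relaxation_weights(20, zone_width=5)
model = np.full(20, 285.0)
blended = relax_toward_driving(model, driving, weights)
print(blended.round(2))
```

In this toy example the blended field equals the driving value at the two boundary points and the unmodified model value in the interior, with a smooth transition across the assumed five-point zone.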

Statistics
The simulations of BB, CBs, and LBs were compared using simple statistics of simulated 15-min precipitation P(x, y, t): (a) the grid-point sum in time, sum(x, y), and (b) the standard deviation of the grid-point anomaly time series, tsd(x, y). Both were related to the BB values as reference, which yields the one-value statistics sumr and tsdr used in the comparisons (as in Ahrens et al., 1998). Thus, simulations yielding sumr and tsdr values of one match the reference BB perfectly as measured by these statistics. Values larger/smaller than one over-/underestimate the precipitation amount or the spatially averaged temporal variability. With positively skewed precipitation values, a simulation that underestimates sumr often tends to underestimate tsdr.
If not mentioned otherwise, the comparisons were done for each of the LB domains separately on the common BB/LB subdomain grids. The CB simulations were interpolated to the BB grid before calculation of statistics using simple bilinear interpolation. This led to slight smoothing of the interpolated CB fields and to underestimation of the CB tsd values by interpolation (which adds to the expected smoothing by coarser numerical CB grids). The evaluation domains were reduced by 15 LB grid-points along the boundaries to avoid measuring direct nesting effects (through relaxation or filtering) in the lateral boundary zones.
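One possible reading of these statistics is sketched below. It is an illustration under stated assumptions rather than the evaluation code used for the paper: the anomalies are taken relative to each grid point's temporal mean, sumr and tsdr are formed as ratios of domain averages, and the function name and the synthetic test data are hypothetical.

```python
import numpy as np


def precipitation_scores(p_sim, p_ref, trim=15):
    """Return (sumr, tsdr) for precipitation arrays of shape (time, y, x).

    Assumptions: both arrays share the same grid, anomalies are relative
    to the temporal mean at each grid point, and the scores are ratios of
    domain averages. `trim` grid points are dropped along each boundary.
    """
    ny, nx = p_sim.shape[1:]
    sim = p_sim[:, trim:ny - trim, trim:nx - trim]
    ref = p_ref[:, trim:ny - trim, trim:nx - trim]

    sum_sim, sum_ref = sim.sum(axis=0), ref.sum(axis=0)  # sum(x, y)
    tsd_sim, tsd_ref = sim.std(axis=0), ref.std(axis=0)  # tsd(x, y)

    sumr = sum_sim.mean() / sum_ref.mean()
    tsdr = tsd_sim.mean() / tsd_ref.mean()
    return sumr, tsdr


# Synthetic, positively skewed data: 96 time steps of 15 min on a small grid.
rng = np.random.default_rng(0)
p_ref = rng.gamma(shape=0.5, scale=1.0, size=(96, 60, 80))
p_sim = 0.7 * p_ref  # a "simulation" that uniformly underestimates by 30%

sumr, tsdr = precipitation_scores(p_sim, p_ref)
print(f"sumr = {sumr:.2f}, tsdr = {tsdr:.2f}")
```

For the uniformly scaled test field both scores equal the scaling factor, which illustrates the remark above that underestimating sumr tends to go along with underestimating tsdr.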

Results and Discussion
We show and discuss the reference BB and coarse driving CB simulations first, and then the LB simulations driven by BB and CB simulations with different scale jumps J between the simulation grids, LBC update frequencies U, and LBC formulations. Figure 4 shows the precipitation sums of one simulation day for the reference BB and different coarser CB simulations. The reference BB simulation shows precipitation largely orographically triggered by the Gaussian hills. The impact of the periodic boundary conditions in meridional direction can be seen too. Precipitating systems were not advected to or triggered near the outflow boundary within the simulated 24 hr. The figure shows two CB simulations with the deep CP switched off. With a grid-spacing twice as coarse as in the BB (J = 2, CP = off), the CB pattern looks similar to the reference yet rougher (with intensified precipitation tracks). As Table 2 and Figure 5 show, this CB simulation reduced the precipitation sum and temporal variability by about 15% in the orographic subdomain and by less than 5% in the inflow subdomain. The five times coarser CB simulation (J = 5, CP = off) shows delayed precipitation triggering and further reduced precipitation amounts and variability, especially in the orographic domain (Table 2). For J = 5, that is, with grid-spacing of ca. 12 km, the mountain drag of the hills with a half-width of 25 km (cf. Figure 3) is already largely underestimated by the numerical scheme following Davies and Brown (2001). This underestimation is worse for the even coarser CBs. It should be noted that with COSMO-CLM's subgrid-scale orography parameterization the degradation of simulation quality with increased grid-spacing would be smaller (Obermann-Hellhund & Ahrens, 2018).

BB and CB Simulations
The coarse CB simulations with CP switched on and using low-level moisture convergence triggering produced only up to 56% (J = 2, CP = on) and as little as 22% (J = 20, CP = on) precipitation and even less variability (Figures 4 and 5, and Table 2). Thus, the simulation with grid-spacing J = 2, that is, ≈ 4.9 km, with CP = on performed much worse than with CP = off. Obviously, the CP reduced instability too much and suppressed grid-scale convective precipitation.
The simulations with the gray-zone grid-spacings of 12 and 24 km were similar, with further degradation of simulation quality when increasing grid-spacing to ≈ 49 km. The CB quality was slightly better with orographic forcing than in the inflow evaluation domain without the orographic forcing. Interestingly, simulations with CAPE triggering of convection were better in terms of amount and variability than with moisture convergence triggering in our test set-up, but convective activity was still strongly suppressed, as the underestimation of amount and variability by more than 50% in the inflow domain shows in case of J = 2 and CP = on. Additionally, the characteristic precipitation tracks as simulated in the CP = off simulation are not visible in the CAPE simulations (not shown). Overall, there was a decrease of simulation quality with increasing grid-spacing, and especially without internal forcing by hills. Here, the limitations of the Tiedtke-like CP will not be further discussed, but its weakness shows less with strong forcing.
The results show that the CB simulations are useful idealized coarse-grid host simulations for the nested LB simulations to be discussed in the following.

Driven LB Simulations
Next to the quality of the driving simulations, Figure 5 summarizes the quality, as measured with sumr and tsdr, of LB simulations driven by BB and CB simulations with different LBC update frequencies U. The quality of the LB simulations driven by the reference BB (with identical grid in the LB domain) was substantially degraded in comparison to the BB data. The precipitation sum was underestimated by about 30% and more in both the orographic and the inflow LB domains. The transient-eddy variability was underestimated by about 10% in the orographic domain and up to about 30% in the inflow domain by the LB simulations with LBC updates only every 6 or 3 hr (U = 4/day or 8/day, respectively), and much better represented with hourly or 15-min updates (U = 24/day or 96/day, respectively). The LB results were less sensitive to the update frequency in the orographic than in the inflow domain, with orographic precipitation triggered by the hills in the orographic domain and not well inherited from the BB simulation at the inflow boundary. Figure 6 shows the precipitation sums as simulated in the two LB domains with hourly LBC update (U = 24/day). The LB simulation driven by the BB simulation in the orographic domain underestimates the impact of the hills in or near the western inflow-boundary zone. This generates a substantial spin-up zone with a depth of about 80-100 grid points. This deep spin-up zone can be seen for all U values and is largest for 6-hourly updates (Figure 7).
The inflow-domain simulation shows the deep spin-up zone too. Additionally, the inflow-domain simulations show for most experiments too much precipitation next to the eastern outflow boundary in an area which is more than 50 grid points deep (Figures 6 and 7). But here, because of small absolute values, the small absolute errors generated large relative errors. Still, this backwatering of inconsistencies and subsequent precipitation near the outflow boundary was observed in real-data regional climate modeling experiments too (see T. Davies, 2014).
The Figures indicate that even in case of using perfect BB driving data with temporally dense 15-min LBC update, only the inner ca. 50% of the domain in zonal direction provided good simulation results in both LB domains. If there are additionally inconsistencies between the driven and driving models' physics parameterisations, this might add to the challenge in real-data experiments (e.g., Yang et al., 2012).
Nesting into the 0.044°, that is, 4.9 km and J = 2, CB simulation with deep CP switched off provided quality comparable to nesting into the reference simulation (Figures 5, 6, and 7) in the orographic domain. In the inflow domain, there was stronger precipitation overestimation in a deeper zone at the outflow boundary (Figure 7). Nesting the LB into the 4.9-km CB simulation with deep CP switched on gave the worst results of all nesting experiments (Figure 5). This LB's precipitation processes were strongly suppressed (Figures 6 and 7). For illustration, Figure 8 shows the mean potential temperature and relative humidity profiles the inflow-domain simulations inherit in the Davies relaxation zone from driving simulations. The driving simulations with 0.022° and 0.044° grid-spacing and CP = off drove the LB simulation with less relative humidity than with 0.044° and 0.22° grid-spacing but CP = on. However, with CP = on the relative humidity decreases less with height and thus the LB simulations were driven by a more stable air mass.
Sensitivity to the update frequency is again small in the orographic domain compared to the inflow domain. Averaged over the evaluation domains, all the nested LB simulations performed worse than the 0.044° CB simulation with deep CP switched off. Additionally, the CB simulation with J = 2 spent only 13% of the computing time of a BB simulation, while an LB simulation needed 26%.
Interestingly, LB simulations nested into the CB domain with ≈ 12 km grid-spacing (0.11°, scale jump J = 5, and deep CP switched on) did not add value to the average precipitation amount results in the orographic domain (Figure 5). As Figure 7 shows, the LB simulations suffered damaging spin-up at the inflow boundaries of more than 150 grid-points (about 40% of the zonal domain extent). The results in the inflow domain are slightly better than in the orographic domain, probably because enough disturbances were provided at the inflow boundary to generate precipitation. Beyond the spin-up region the precipitation amounts are comparable to those obtained by nesting into the BB simulation.
The results with CB J = 10 are better on average. The domain average results are even comparable to the results by nesting into the BB simulation. But, as Figure 7 shows, the underestimation of precipitation in a somewhat smaller spin-up zone than in case J = 5 is compensated by an overestimation deeper into the domain. For the inflow domain with U = 24/day and 96/day, precipitation is overestimated substantially (up to 100%) in a zone of more than 100 grid points at the zonal outflow boundary (Figure 7).
Surprisingly, in the orographic domain the LB nested into the coarsest CB simulation with a scale jump of J = 20 produced the best total precipitation amount (Figure 5). But there is an extended spin-up zone with underestimation, which is later on compensated by overestimation (50% in the central region of the domain, Figure 7). The mean quality in the inflow nesting experiment was comparable to the other experiments. They all show the degraded quality at the outflow boundary. Still, Figure 6 gives the impression that the simulated precipitation pattern with J = 20 deviates the strongest from the BB pattern. The pattern is dominated by artifacts at the boundaries (compensated by an underestimation in the domain center, Figure 6), which can clearly be seen in the J = 10 simulation too, though to a weaker extent. Therefore, the error compensation wrongly ranks the LB results with a scale jump from about 50 to 2.4 km at the boundaries best on average.
All the LB simulations systematically improved on tsdr compared to their driving CBs with CP = on (Figure 5). This is partly because of the known RCM deficiency of too large areas of precipitation and generally too frequent weak intensities (the "drizzle" problem, Lind et al., 2020), which added to the tsdr underestimation by the CBs and were reduced by the LBs.
Given the shown nesting challenge, we tested, as in Leps et al. (2019) at coarser nesting grid-scales, the Mesinger approach as an alternative to the Davies relaxation approach for LBC specification. As illustrated in Figure 9, the LB simulations with Mesinger LBCs tended to show less deep spin-up zones in the orographic domain, which fits to a smaller boundary zone and thus better representation of the western hills. But total precipitation underestimation in the evaluation domain was increased by 5%-10% compared to simulations with the Davies relaxation approach. In the inflow domain the total precipitation amount was generally even more underestimated (by ca. 15%). This might be an indication of smaller disturbances near the domain boundaries, which did not trigger convection. Near the outflow boundary, the effects of driving and driven simulation inconsistencies were simulated in a narrower zone with the Mesinger than with the Davies relaxation approach. Overall, the Mesinger approach performed comparably to the Davies relaxation approach.

Summary and Conclusions
This paper presented idealized CPM nesting experiments using a modified Big-Brother (BB) experiment design as it was used before for RCMs in Leps et al. (2019). The model applied, COSMO-CLM, was used before successfully in many real-data simulations at both the RCM and CPM scales. Our reference BB simulation used a convection-permitting grid-spacing of 2.4 km. Coarse-Brother (CB) simulations with increased grid-spacing by factors J = 2-20 using the BB domain showed the expected degradation of simulation quality in terms of precipitation sum and temporal variability in two sub-domains, the Little-Brother (LB) domains with and without idealized hills. The CB simulation with J = 2 (i.e., grid-spacing of 4.9 km) and with deep convection parameterization switched off (CP = off) performed very well compared to the CB with the same set-up but with CP = on and compared to the coarser CB simulations. In the discussed idealized set-up, even the 12-km J = 5 simulation performed better without than with deep CP.
The LB simulations nested into the BB or the J = 2 & CP = off CB simulations produced up to about 30% less precipitation than the driving simulations with best results using hourly or 15-min update frequency of the LBCs.
In the domain with hills, that is, with orographic forcing, the LB nesting could not improve the driving CB simulations with J = 2 or 5 in terms of precipitation amount, but with J = 10 and 20 in domain average. All LB simulations showed a large spin-up zone with precipitation underestimation near the inflow boundaries. The hilly domain LB simulations driven by the coarsest CBs compensated spin-up underestimation by overestimation in the inner parts of the domain resulting in unwanted error compensation. The nested LB simulations in the flat domain, that is, without internal orographic forcing of convection, did not inherit convective disturbances from the driving CBs with J = 10 and 20 at the inflow boundary. Their relatively good evaluation results were probably due to disturbances because of inconsistencies between driving and driven simulations at the inflow boundary.
These results lead to the conclusion that in our idealized LB set-up at best only the inner 50% of the domain in main flow direction, that is, the inner 200 grid points of 400 grid points in zonal direction, provided useful information. In other words, a buffer zone which is at least 100 grid points deep along the lateral boundaries has to be accepted in CPM simulations. This suggests that the useful domain fraction for CPMs is at least as small as for RCMs (cf. Warner et al., 1997).
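The arithmetic behind this estimate can be written as a small helper, which may be convenient when planning the extent of a nested domain. The function name is hypothetical, and the conversion to kilometres uses the rounded LB grid spacing of 2.4 km from the text.

```python
def useful_fraction(n_points, buffer_points):
    """Fraction of grid points along one direction left after removing
    a buffer zone of `buffer_points` at both lateral boundaries."""
    inner = max(n_points - 2 * buffer_points, 0)
    return inner / n_points


zonal_points = 400     # zonal extent of the LB domain in grid points
buffer_points = 100    # minimum buffer depth suggested by the experiments
grid_spacing_km = 2.4  # rounded LB grid spacing

fraction = useful_fraction(zonal_points, buffer_points)
buffer_km = buffer_points * grid_spacing_km
print(f"useful zonal fraction: {fraction:.0%}, buffer depth: about {buffer_km:.0f} km")
```

For the LB set-up this reproduces the inner 50% stated above, with a buffer of roughly 240 km at each zonal boundary.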
The LB results are slightly better for hourly or 15-min than for 3-hourly LBC update frequencies. Six-hourly updates (the lower limit at RCM scales as suggested by Leps et al., 2019) systematically yielded the worst results. Therefore, a 3-hourly or better lateral update frequency should be applied. Using either the Davies relaxation or the Mesinger approach in preparing the LBCs had only a negligible impact on results. However, it should be noted that better tuning of the Davies and Mesinger LBC approaches and a broader Davies relaxation zone than the model's defaults used here (cf. Beck et al., 2004 at RCM scales) might somewhat reduce the inconsistencies between the nested and driving simulations (especially at the outflow boundaries).
In our set-up, the large-domain CB simulation with grid-spacing of 4.9 km & CP = off performed better than all LB simulations. A grid-spacing of 4.9 km is coarser than usually suggested for CPMs applied in the midlatitudes (Brisson et al., 2017). The forcing by the hilly terrain was well seen in the 4.9-km simulation and was well advected into the flat sub-domain. Following Panosetti et al. (2019), an even stronger orographic forcing would further improve the performance in comparison with the BB simulation. Such an additional forcing would improve the LBs' performance too. However, the 4.9-km convection-permitting CB simulation is computationally about two times cheaper than one of the small-domain LB simulations.
Thus, our results suggest that opting for a larger domain and ca. 5 km grid-spacing is better than opting for higher-resolution CPM simulations on domains that are smaller by factors and yet still computationally expensive. But the optimal compromise will be application and model dependent (e.g., on the convection parameterization in the driving model).
And there are CPM applications, like investigations of future convective cell properties, which favor the higher CPM resolutions. But these investigations have to evaluate the useful fraction of the simulation domain carefully. We expect that in real-world applications with additional forcings like surface heterogeneity and frontal systems the relative quality of the LBs, but also of the 4.9-km CB without convection parameterization, is better compared to the BB simulation. This expectation is supported by the promising evaluation results of a large ensemble of CPM simulations in an evaluation domain covering the European Alps presented in Coppola et al. (2020).
In any case, it is recommended to use a driving model with grid-spacing scales not too deep in the gray zone of its convection parameterization. Direct nesting of a CPM with grid-spacing of 4.9 km or finer into, for example, global ERA5 re-analysis data (Hersbach et al., 2020), global HighResMIP (Haarsma et al., 2016), or regional CORDEX-CORE (Sørland et al., 2021) simulations with about 25-30 km grid-spacing is sensible given the results shown here. Nesting into an intermediate gray-zone nest is not advised. This is also concluded in Liang et al. (2019) after doing sensitivity experiments with real-data CPM set-ups.
Finally, developing methods for better preconditioning of convective activity at the CPM domain's inflow boundary (like preconditioning of eddies in large-eddy simulations, Tabor & Baba-Ahmadi, 2010) might help to decrease the depth of the observed spin-up zone.

Data Availability Statement
COSMO-CLM is the community model of the regional climate modelling community, which is freely available for community members (https://www.clm-community.eu). Namelists for reproducing the simulations and the data used for evaluation are available online (http://doi.org/10.5281/zenodo.4553188).

tional facilities at the DMRZ (Deutsches Meteorologisches Klimarechenzentrum). They thank C. Purr, GUF, for help with Figure 1, and D. Risto, GUF, and anonymous reviewers for their careful reading of the manuscript and for valuable comments and suggestions.