The Fractal Nature of Clouds in Global Storm-Resolving Models

Clouds in observations are fractals: they show self-similarity across scales ranging from one to 1000 km. This includes individual storms and large-scale cloud structures typical of organised convection. It is not known whether global storm-resolving models reproduce the observed fractal scaling laws for clouds and organised convection. We compute the fractal dimension of clouds using Himawari satellite data and compare this to global storm-resolving model simulations completed as part of the DYAMOND intercomparison project. We find cloud fields in these simulations are indeed fractal, and reproduce the observed fractal dimension to within 10\%. We find the fractal dimension is sensitive to the choice of boundary layer parametrisation scheme used in each model simulation, and not to the convection parametrisation as might have been expected. The fractal dimension is independent of cloud area distributions, providing a complementary metric to assess the multi-scale structure of convection and convective organisation in model simulations.


I. Introduction
Numerical models are a useful tool to understand the climate system. Low-resolution climate models are routinely used for long simulations to address questions related to long-term climate change, the physics of low-frequency climate variability, and the likelihood of rare events. However, persistent biases remain in these models (e.g. Flato et al. 2013). Many biases and inconsistencies between models can be traced to the approximations used when developing model parametrisation schemes (e.g. Zelinka et al. 2020).
Recent years have seen an increase in the production of global storm-resolving (also called convection-permitting) atmospheric simulations. Global storm-resolving models directly simulate important processes that must be parametrised in conventional climate models, most notably deep convective clouds (Palmer & Stevens 2019). This substantially improves the fidelity of clouds and precipitation in those models (Stevens et al. 2020). This enables us to use these models as a digital laboratory to explore fundamental unsolved questions in atmospheric physics, such as the physics of convective organisation (Wing 2019).
Before using storm-resolving models in this way, we must quantify and understand the differences and similarities between the models and the real world. This reveals what questions can and cannot be addressed using those models. Certainly, global storm-resolving models appear realistic when compared to observations. An example of this is the visual similarity between images of simulated cloud condensate fields and satellite images of the Earth ). Assessing a model simulation using a visual comparison with observed fields as described here has been coined the 'Palmer-Turing test' (Palmer 2016) by analogy with the Turing test for Artificial Intelligence. While it can be difficult at first glance to distinguish between model and observed data, on closer inspection, the satellite image can often be identified.
It has long been known that clouds exhibit self-similarity across scales ranging from one to 1000 km (Lovejoy 1982) -i.e. they are fractals. This behaviour can provide insights into the physics underlying convective aggregation (Haerter 2019) and precipitation processes (Peters & Neelin 2006). In particular, scaling relationships imply emergent laws and universal behaviour (Lovejoy & Schertzer 2013). The fractal behaviour of clouds has been used to assess the fidelity of large eddy simulations (Siebesma & Jonker 2000). However, the approach has not been used to assess simulations with domain sizes large enough to simulate convective aggregation. In particular, it is not known whether convection in global storm-resolving models reproduces the observed behaviour. However, in order to use such models to understand the physics of convective aggregation, we must be confident that they represent this process well.
In this paper, we take the Palmer-Turing test as a starting point to assess global stormresolving models. However, we seek to go further and quantify the extent to which a simulated cloud field appears similar to that observed. We achieve this by computing the fractal dimension of clouds. We use this approach to assess the ensemble of global storm-resolving simulations produced for the DYAMOND project. In Section II, we start by describing the model and observational datasets used in this paper. In Section III we describe how to identify comparable cloud objects in satellite and model data, and the definition of the fractal dimension. In Section IV we compare the fractal dimensions thus computed for observed and simulated cloud fields. The DYAMOND models differ from each other in terms of resolution, parametrisation schemes, and other modelling choices: we assess which of these choices impacts convective aggregation as measured by the fractal dimension. Finally we discuss the significance of our results and draw conclusions in Section V.

II. Data i. DYAMOND Global Storm-Resolving Models
We evaluate the global storm-resolving simulations produced for the DYAMOND summer ensemble . Each model simulation was initialised at 00:00 UTC on 1st August 2016 and spans 40 days. The models were initialised using the ECMWF 9 km analysis. Daily observed sea-surface temperatures and sea ice concentrations were used as boundary conditions. Top of atmosphere outgoing longwave radiation (OLR) is available for each model simulation, regridded to a consistent 0.1 degree grid. We discard the first ten-days of each simulation as 'spin-up' as suggested in Stevens et al. (2019).
The DYAMOND protocol was deliberately simple and unprescriptive, to encourage as many groups to participate as possible. The model simulations therefore differ in their choice of vertical resolution, model top, cumulus parametrisation scheme, and boundary layer scheme, among many other modelling choices . The linear resolution (square root of the maximum grid box size) ranges from 2.5 km to 7.8 km for the core ensemble. In addition to the core ensemble, we analyse six simulations from the 'incidental ensemble'; these simulations differ from the core simulations in their horizontal resolution and choice of cumulus parametrisation, allowing us to test the sensitivity of fractal dimension to these modelling choices.

ii. Observed cloud fields
We compare the global storm-resolving simulations to satellite derived cloud fields. The Japan Meteorological Agency Himawari 8 satellite was the most sophisticated geostationary satellite in orbit in 2016, and was used in Stevens et al. (2019) to validate the DYAMOND simulations: we also use Himawari 8 data for consistency with that paper. Himawari 8 has been operational since July 2015 (Bessho et al. 2016). The satellite is positioned at 140.7 o E, with local solar noon at 02:41 UTC. The derived cloud-top temperature and total cloud optical thickness products used in this analysis are provided on a 5 km grid, with 10 minute resolution, in daytime regions only.

iii. Region
We restrict our attention to an area below the footprint of the Himawari satellite. Throughout the analysis we will approximate pixels as squares. This limits the latitudinal extent of the domain we can consider. We use data between 25 o N and 25 o S, such that the maximum fractional distortion in the longitudinal arclength is cos(25 o ) = 0.91. The longitudinal range is 80 − 180 o E.

i. Defining cloud objects
We follow the methodology presented in Lovejoy (1982). We first compute a binary cloud field from the observed cloud-top temperature data using a threshold of 215 K: any pixel with cloudtop temperature below this threshold is defined as a cloudy pixel. This selects cold, deep clouds. Setting the threshold colder restricts the number of clouds analysed, while choosing a warmer threshold makes it harder to compare satellite to model data. A warmer threshold also reduces the number of clouds, because cloud objects that intersect the edge of the domain are discarded. We found our analysis was robust to changes to this threshold -indeed, our observed estimate of the fractal dimension is not significantly different to that reported in Lovejoy (1982), despite our chosen thresholds differing by 50 K.
To compute equivalent binary cloud fields for each simulation, we must map between cloudtop temperature and OLR fields. This is carried out in a two-stage process. First, we map between the OLR and the effective radiating temperature of the cloud, T rad . The relationship is not simply that of a black body flux, because of absorption of radiation by the atmosphere above cloud top. We therefore use the empirically measured relationship of Vaillant De Guélis et al. (2017), where the OLR, F TOA , is in Wm −2 , α = 2.0 Wm −2 K −1 and β = −310 Wm −2 . Note that this linear relationship between OLR and cloud radiating temperature was originally identified by Ramanathan (1977).
The effective radiating temperature of the cloud will typically reflect the temperature some depth below the cloud-top. To convert between cloud radiating and cloud top temperature, we compare the satellite cloud-top temperature field to the cloud radiating temperature derived from one of the DYAMOND simulations following equation 1. We use the ICON 2.5 km simulated field for this mapping, regridded to the same 5 km grid as the satellite data. We use model data from 15 minutes after initialisation to ensure a strong equivalence between the simulated and observed cloud fields. Despite the short lead time, there are likely to be cloudy pixels in the model simulation where no cloud was observed in satellite data, and vice versa. To ensure we only compare locations where there is a high, thick cloud in both model and satellite data, we only compare pixels for which the Himawari cloud optical thickness is at least 2.5 and the ICON column integrated ice and water content exceeds 0.005 kg m −2 and 0.0 kg m −2 respectively. A total least squares fit on the retained pixels produced where the cloud-top temperature, T CT , and radiating temperature, T rad , are in K. Combining equations 1 and 2, the cloud-top temperature threshold of 215 K corresponds to a threshold in OLR of 132 Wm −2 . We use the same OLR threshold for all model simulations.
ii. Area-Perimeter fractal dimension The fractal dimension of an object can be measured in several different ways depending on the nature of the object in question, including the similarity dimension (Strogatz 2015) or boxcounting dimension (Walsh & Watterson 1993). For two-dimensional objects, the area-perimeter relation is a natural choice (Lovejoy 1982). This relation predicts a fractal dimension of the perimeter through the equation P ∝ A D/2 . The fractal dimension, D, thus derived is a measure of the complexity, or 'crinkliness', of the perimeter, since a smooth perimeter of given length encloses more area than a complex one.
We compute an instantaneous fractal dimension for each observed cloud top temperature and simulated OLR field. After applying the relevant threshold to generate a binary cloud field, the images are cleaned by removing holes in each cloud object. We discard any clouds with size smaller than 24 pixels as it is not possible to accurately estimate the cloud perimeter for objects of this size and smaller. Note that this pixel threshold means that the lower resolution model fields have a physically larger minimum size of cloud than the satellite fields. We finally remove any clouds which overlap the edge of the field, as the perimeters of these clouds would be anomalously smooth. IV. Results

i. Fractal dimension of cloud objects
The area and perimeter of each cloud object in each binary field is measured. Figure 1(a) shows the measurements from 0200 UTC on 11 August 2016 of observed cloud objects identified from the satellite cloud top temperature field. As demonstrated in Lovejoy (1982), the observed cloud objects are fractals, exhibiting the expected scaling behaviour. Linear regression provides an estimate of the instantaneous fractal dimension, D, of 1.36, consistent with the value of 1.35 reported in Lovejoy (1982). Figures 1(b-c) show the equivalent area-perimeter measurements for simulated cloud fields from three of the models. We find that the cloud objects in the stormresolving simulations also exhibit fractal behaviour, showing a clean scaling relation similar to that observed in satellite data. The instantaneous fractal dimensions computed for each simulated field are close to the estimate from the satellite-derived image. The linear relationship shown for observed and simulated fields supports monofractal behaviour, as opposed to a multifractal scaling law. The fractal dimension is computed separately for each day between 11 August and 9 September 2016 as the average of the dimension computed for 0200, 0300 and 0400 UTC. These times are close to local noon, for which satellite estimates of cloud top temperature are available. This analysis is performed for the satellite data and for each model simulation. The distribution of the computed fractal dimension D over this 30-day period is indicated in Figure 2. All the global storm-resolving models simulate cloud fields that exhibit fractal behaviour. The fractal dimension measured for the simulated cloud fields varies from model to model. For example, the ICON 2.5 and 5 km simulations have a notably higher fractal dimension than the other models (i.e. they have more crinkly clouds), while the IFS 4.8 km and NICAM 7 km simulations have a notably lower fractal dimension (i.e. they have smoother clouds). However, in general, the fractal dimension of the simulated cloud fields are all remarkably similar to that measured using satellite data, with the median dimension for all models falling within 10% of the observed value.
Within the DYAMOND ensemble, the model set up differs substantially between simulations . We consider whether any of the key choices made by the modelling groups contribute to the observed differences in simulated fractal dimension. Figure 3 shows the relationship between fractal dimension and the model resolution in both the horizontal and vertical. A small negative correlation (-0.32) is measured for horizontal resolution, while a small positive correlation (0.35) is measured for vertical resolution, though neither correlation is significant at the 95% level. The colour of the data points in Figure 3 also indicates the choice of convection scheme and boundary layer scheme. Interestingly, there is no systematic relationship between the choice of convection scheme and the simulated fractal dimension. However, sensitivity is  observed in the choice of boundary layer scheme. Models which use a turbulent kinetic energy (TKE) scheme, in which closure involves solving a prognostic equation for TKE, systematically have a higher fractal dimension than models which use a diagnostic eddy-diffusivity scheme.
One simulation analysed uses a 3-dimensional Smagorinsky-type scheme, as is commonly used in Large Eddy Simulations: the fractal dimension of cloud fields in that simulation falls between the dimension estimated for the models using a TKE or eddy-diffusivity scheme. We also considered the role of the height of the model top, the height of the sponge layer, and the number of vertical levels: none of these showed a significant impact on the simulated fractal dimension. While the satellite derived binary cloud fields are only available during daylight hours, the simulated cloud fields are also available during the night. Figure 4 shows the fractal dimension computed for the ICON 2.5 km simulation as a function of time, including both instantaneous values and a 6-hour rolling average. Here we consider the entire 40-day simulation. An initial spin-up period of 4-5 days is evident, during which the fractal dimension increases before converging on a stable value. This is shorter than the spin up period of ten-days highlighted by Stevens et al. (2019) for these simulations, suggesting ten-days is overcautious. Substantial variability is observed in the fractal dimension. This includes both a diurnal cycle and lower frequency variations.

ii. Distribution of cloud areas
We finally measure the distribution of cloud object areas. This complementary diagnostic also provides an independent assessment of whether the OLR and cloud top temperature thresholds are consistent with each other. Figure 5 shows the number of cloud objects as a function of cloud object area for the satellite and model data sets. On average across the models, the simulated distributions match the observed distribution, suggesting the thresholds in OLR and cloud top temperature are indeed consistent. However there are substantial differences between simulations produced using different models. When the same model is used but the resolution is varied, the resolution tends to have a large impact on the simulated distribution, except for the MPAS* simulations. This pair of simulations are from the 'incidental ensemble' and include the same version of the MPAS convection scheme that is used at coarser resolutions. This indicates 0 1 -A u g 0 5 -A u g 0 9 -A u g 1 3 -A u g 1 7 -A u g 2 1 -A u g 2 5 -A u g 2 9 -A u g 0 2 -S e p 0 6 -S e p 1 0 -S e p the full MPAS convection scheme is scale aware. We considered a range of metrics to summarise the fidelity of the distribution of simulated cloud objects captured in figure 5, including the average total area of cloud objects, and measures of the difference between modelled and observed cloud distributions. We found no correlation between any of these metrics and the fractal dimension (not shown): these two diagnostics are independent of each other.

V. Discussion and Conclusions
We describe a novel methodology for examining the fidelity of cloud fields in high resolution model simulations. We define a binary cloud field using a threshold in OLR or cloud top temperature, before examining the fractal nature of the resultant cloud objects. We demonstrate that global storm-resolving simulations produced for the DYAMOND project ) are able to reproduce the observed fractal nature of clouds. In other words, the area and perimeter of cloud objects are related by the scaling law P ∝ A D/2 where D is the fractal dimension. The fractal nature of clouds is an emergent property of models. It gives us confidence that these global storm-resolving models are representing the key aspects of the physics of clouds, convection, and convective organisation and aggregation.
In addition to reproducing the fractal nature of clouds, the fractal dimension of clouds in these simulations is close (within 10%) to that of cloud fields observed by the Himawari 8 geostationary satellite. To explain differences in the fractal dimension between models, we considered the modelling choices made when producing each of the DYAMOND simulations. The different simulations vary substantially in their model setups, including the model horizontal and vertical resolutions, the type of boundary layer scheme used, and whether any part of the model's convection parametrisation scheme is activated. Of these choices, the only significant predictor of fractal dimension was the choice of boundary layer scheme. Models which included a prognostic equation for TKE had a higher fractal dimension than those which use an eddy diffusivity scheme. This highlights the importance of the boundary layer scheme for convective organisation as opposed to the convection scheme. The comparative role of these two parametrisation schemes in organising convection is poorly known, though some single-model studies have highlighted the importance of the choice of boundary layer scheme for phenomena such as the Madden Julian Oscillation (MJO) (Holloway et al. 2013), consistent with the Convective-Dynamic-Moisture triointeraction theory of the MJO (Zhang et al. 2020). More generally, it is known that the different approximations used in boundary layer parametrisations can significantly impact the climate in models, including the model's climate sensitivity (Garratt 1993, Davy & Esau 2014, 2016.
We noted substantial temporal variability in the fractal dimension of clouds as simulated by a single model. Future work will seek to understand this variability, including aspects related to the diurnal cycle and lower frequency modes, as an indicator of changes in convective organisation.
One of the motivations behind this study was to quantify the 'Palmer-Turing test' (Palmer 2016). This test suggests that we can assess the fidelity of climate models by visually comparing instantaneous maps of model output to satellite observations. If we cannot tell which image is simulated and which is observed, then the model has passed the Palmer-Turing test. We suggested that the fractal dimension of cloud objects would be a useful tool to quantify and explain the visual similarities and differences noted when performing the Palmer-Turing test, which are otherwise difficult to enunciate. It is instructive to return at this point to the images of the simulated condensate fields from the DYAMOND simulations , Figure 2). The model whose instantaneous fractal dimension is closest to Himawari at 0400 UTC on 4 August 2016 is FV3 (D = 1.38 compared to the observed D=1.37), followed by NICAM and the UK Met Office. We note that these three models are visually different to each other. The fractal dimension alone does not fully characterise the image. In particular, low cloud and the distribution of cloud object sizes also influence our impression. Nevertheless, the fractal dimension is a useful additional tool for quantifying the ability of models to simulate deep convection and convective organisation.