Estimation of vertical structure of latent heat generated in thunderstorms using CloudSat radar

The Earth's atmosphere is highly coupled between the vertical layers and the surface. An understanding of circulations in the atmosphere is important for developing models and improving weather forecasting. The latent heat produced in the atmosphere is one of the key driving forces of these circulations. It is therefore very important to estimate the latent heat in the atmosphere accurately, especially in thunderstorm clouds, which have proved to be one of the major sources of gravity waves in tropical regions. The current space‐based latent heat retrievals are limited to precipitation‐based estimation which cannot define the complete structure of a thunderstorm where precipitation is not the main indicator of the severity. A novel method is proposed in this study which retrieves the latent heat profiles of thunderstorm clouds using CloudSat W‐band radar profiles. A realistic database of simulated thunderstorm events developed using the Regional Atmospheric Modelling System – Cloud Resolving Model (RAMS‐CRM) is compared with observations using the Bayesian Monte Carlo method to derive an estimate with an uncertainty analysis for each estimate. The method is validated in the southeast Asian region with European Centre for Medium‐Range Weather Forecasts driven RAMS‐CRM profiles. The algorithm performance on observation data using linear fit, regression and bias analysis is discussed. A case study retrieval is also performed to demonstrate the retrieval on real CloudSat data.

weather through waves (Thies and Bendix, 2011;Tomlinson et al., 2011;Lee et al., 2014). The atmospheric circulations are sensitive to the vertical distribution of heating associated with convection in tropical regions (Hartmann et al., 1984). The latent heat also differs with cloud type: (a) convective clouds deal with total effective heating (condensation and freezing) throughout the vertical extent of the cloud; (b) stratiform clouds have higher total cooling (evaporation). The deep convective clouds are a major source of waves in tropical regions, leading to a number of studies focusing on understanding the coupling between troposphere and upper atmosphere at different scales.
Methods have been developed to derive the latent heat from both ground-based and space-based measurements. To name a few, studies with the Tropical Ocean Global Atmospheres/Coupled Ocean Atmosphere Response Experiment data retrieved diabatic heating fluxes using a combination of radiosondes and radar data etc. Thompson et al. (1979) used a network of ship observation data to relate the wave activity and diabatic heating during the Global Atmospheric Research Programme Atlantic Tropical Experiment. Many other ground-based and airborne campaign data were used to derive heating profiles inside clouds for a range of cloud types (Anagnostou, 2004;Schumacher et al., 2007;2008; Guimond and Reisner, 2012). Apart from regional studies, there is a need for consistent large-scale or global estimates of latent heat, which are only possible using satellites. The launch of the Tropical Rainfall Measuring Mission (TRMM) provided essential data in tropical regions for developing methods for latent heat retrieval. Various algorithms have been developed for TRMM satellite based latent heat retrieval (Tao et al., 2016). Over the years, the TRMM satellite has provided valuable data which have helped atmospheric heating to be better understood (Wang and Zhang, 1999;Shige et al., 2004;Schumacher et al., 2007;Zhang et al., 2008;Rosa et al., 2010;Indu and Nagesh Kumar, 2015;Tao et al., 2016). Yet, there is lack of a complete picture because the TRMM satellite was particularly designed to detect only the precipitation in the atmosphere (Schumacher et al., 2007); therefore, there is a chance that TRMM retrieval underestimates the latent heat in thunderstorms. In order to obtain the heating profiles of deep convective clouds accurately, measurements of the complete cloud structure are necessary. Recent studies have highlighted that the smaller particles that freeze while rising up also generate a considerable amount of latent heat leading to severe thunderstorms (Khain et al., 2005;Koren et al., 2010;Li, Li, et al., 2011a;Li, Niu, et al., 2011b;Lee, 2012). These smaller particles are not detected by the Ku-band radar in the TRMM. Therefore, methods need to be developed for W-band cloud radars which can detect the structure of thunderstorms more accurately.
CloudSat contains the cloud profiling radar (CPR) operating at 94 GHz that has revolutionized the satellite remote sensing of clouds. It can detect a wide range of hydrometeors of different sizes from smaller supercooled cloud droplets to large rain drops and precipitation Sassen and Wang, 2008;Stengel et al., 2015). The use of CPR to retrieve latent heat profiles can provide an advantage over the TRMM as it can detect a wider size range of hydrometeors which undergo phase change . This is significant in thunderstorms as the cloud top can extend to 18 km in tropical regions with a variety of hydrometeors in all size ranges such as rain, graupel, snow, hail and ice (Dodson et al., 2013). Nelson and L'Ecuyer (2016) suggested an algorithm to retrieve the latent heat profiles of warm rain clouds using the Monte Carlo method of estimation. This method proved that CPR profiles can be used to obtain latent heat profiles with reasonable accuracy. Since the structure of a thunderstorm is different from warm rain clouds, the effects of multiple scattering become significant especially for W-band radar (Hogan, 2006;Hogan and Battaglia, 2008). The previous methods have not considered the effects of multiple scattering, which is a major novelty in the present study.
In this work an algorithm developed to estimate the latent heat profiles of thunderstorms in particular is described. Using the Bayesian Monte Carlo (BMC) method, CloudSat radar observations are compared with a simulated database of thunderstorm events to estimate latent heat vertical profiles of thunderstorm events in the region near Singapore. The estimations are validated with simultaneous European Centre for Medium-Range Weather Forecasts (ECMWF) based cloud-resolving model simulations. The paper is organized as follows. Section 2 describes the algorithm overview and mathematics involved. Section 3 describes the CloudSat data and the process of generating the database followed by results in Section 4 and conclusions in Section 5.

| ALGORITHM DESCRIPTION
This study proposes a method to estimate the latent heat of thunderstorm clouds using CloudSat W-band CPR. The CPR reflectivity profiles are used as the basis for all the processes related to cloud microphysics as they are the only measurable quantity. The basic idea in this method is that the latent heat release is directly related to drop size and mass which is undergoing a phase change, as described in Cotton et al. (1986). Therefore, using model-based comparisons, the latent heat release can be indirectly related to reflectivity as reflectivity represents the size of a reflecting or scattering object. This has been proved to produce good results through TRMM studies (L'Ecuyer and Stephens, 2002;Wilks, 2006;Schumacher et al., 2007;Tao et al., 2016).

| Overview
The CPR reflectivity observations and the corresponding temperature profiles from the ECMWF are compared with a simulated database of reflectivity and temperature profiles using the BMC method. The simulated database was developed using the Regional Atmospheric Modelling System -Cloud Resolving Model (RAMS-CRM). The RAMS is initiated using the ECMWF ERA-Interim data which provide atmospheric profiles of general variables such as pressure, temperature and humidity. The RAMS is capable of simulating cloud microphysical processes reasonably well (the details are provided in Section 3). The simulation outputs various cloud variables such as hydrometeor concentration and mixing ratio, latent heat, surface and radiation variables.
Since the observational quantities are reflectivity and temperature profiles, the database that is developed should also consist of reflectivity and temperature profiles. Therefore, the output from the RAMS is converted into W-band radar reflectivity using scattering models (more details given in Section 3).
Once the database is created, individual observations are compared with the database using the BMC method to obtain estimates of latent heat. The algorithm flowchart is shown in Figure 1.

| Bayesian Monte Carlo method
The algorithm to estimate the latent heating of thunderstorms is based on the BMC method. Numerous studies have applied this method for retrieval of various atmospheric variables (Olson et al., 1999;Hogan, 2006;Grecu et al., 2009;Battaglia and Tanelli, 2011;Nelson and L'Ecuyer, 2016). An advantage of using the BMC method is that it facilitates incorporation of variables of different types (even from different instruments) for the estimation. It also provides the uncertainty in each estimation, which was developed by Lécuyer and Stephens (2002). The method does not need any initial guesses regarding the variables involved, unlike the optimal estimation method. The BMC method takes an a priori distribution and updates it to an a posteriori distribution by reducing the possible solution set to include only those states that provide a close match to the observations (Nelson and L'Ecuyer, 2016).
To explain the algorithm mathematically, the algorithm requires two vectors which are compared with the BMC method to get the latent heat estimation. The vectors contain the available cloud variables from the CloudSat products and the corresponding cloud variables representing each member of the database developed using the RAMS. The vector y s from the database and the vector y 0 from the CloudSat measured event comprise the same quantities and x s is the database of latent heat profiles corresponding to y s : where Z, T and LH refer to reflectivity (dBZ), temperature (K) and latent heat (K). The subscript values 1, …, q refer to altitude levels with q being the highest level, and i refers to the ith member of the database.
The method is based on the theory that the probability density function pdf(x) of the variable x to be estimated is proportional to the probability that x is the true set of parameters, x true , given that y s corresponding to x is equal to y 0 , pdf(x) / P(x = x true |y s = y 0 ). The probability distribution function for the database members, p i , can be expressed as: where N is the number of database members, S is the matrix representing the error covariance corresponding to the elements of y s and the prime symbol refers to transpose of the matrix. The simulated error covariance matrix S consists of variances of the elements of y s in the diagonal components and correlation with other elements multiplied by standard deviations of both elements in the off-diagonal components (L'Ecuyer and Stephens, 2002), as shown in Equation (3), where r ij is the correlation between the ith and jth elements: . . . r 1q σ 1 σ q r 2q σ 2 σ q ÁÁÁ σ 2 q 2 6 6 6 6 6 6 4 3 7 7 7 7 7 7 5 The method to calculate the variance and correlation is shown in Equations (4) and (5): where σ is the standard deviation, ϵ is the error and an overbar represents mean. The error covariance matrix is one of the main aspects of the algorithm as this directly defines the accuracy of the estimation. The key to producing accurate estimations using the BMC method is dependent on the accuracy of the database members to represent the thunderstorm events close to measured profiles by CloudSat and the variance of y s which represents the error covariance matrix. According to Bayes's theorem as discussed earlier, the estimated vector x r of the latent heat is given by: The uncertainty σ x in each estimate is given by considering the probability of each database member which is a measure of the posterior distribution variance:

| DATA PRODUCTS AND METHODS
The latent heat generated in the atmosphere is a quantity which is not directly measurable with the current state of instrument technology (Nelson and L'Ecuyer, 2016). Therefore, dynamic models are used to derive the heat from measurable quantities. This approach for immeasurable variable estimation is established and common in atmospheric studies (Marzano et al., 2003;Haynes et al., 2009;Battaglia and Tanelli, 2011). It is important that the model simulations are close to reality which ensures the accuracy of the algorithm. In this section the CloudSat data products and the models used for generating the database of thunderstorm events are described.

| CloudSat data
The CloudSat mission was launched in 2007 as a part of the A-Train constellation by joint efforts of the National Aeronautics and Space Administration and the Japan Aerospace Exploration Agency to collect data regarding clouds and aerosols. CloudSat contains a CPR operating at 94 GHz which is actively collecting cloud data on a global level while orbiting in polar low earth orbit (Tanelli et al., 2008;Delanoë and Hogan, 2010). Over the years, various algorithms have been developed to retrieve different cloud variables such as liquid/ice water content, cloud phase, precipitation rates and liquid/ice water path using radar Haynes et al., 2009;Delanoë and Hogan, 2010;Mitrescu et al., 2010;Yao et al., 2010;Liu et al., 2012;Pounder et al., 2012;Ceccaldi et al., 2013;Josset et al., 2013;Stengel et al., 2015;Mathew et al., 2016). The existing data products corresponding to CPR provide profiles of liquid water content and ice water content which are derived from reflectivity and backscatter measurements. The reflectivity is associated with the hydrometeors based on scattering information. Using the optimal estimation approach, the particle size distribution is retrieved, followed by calculating different moments of distribution which give water content and effective radius etc. (Austin et al., 2009). Since the latent heat is not directly related to the particle size distribution, the optimal estimation method does not suit this application.
The main variables which are directly related to latent heat are the amounts of hydrometeors present which can undergo a phase change. Since the hydrometeors are detected using backscatter measurements, this study uses the reflectivity measured by the CPR along with the temperature profile obtained by the ECMWF ERA-Interim model as auxiliary information. The reflectivity profiles are obtained from the Level 2 GEOPROF data product and the collocated temperature profiles are obtained from the ECMWF-AUX product; both products are available on the CloudSat data processing centre website. The use of the temperature profile is helpful in retrieval as the phase change is partially dependent on the ambient temperature of the atmosphere. The CloudSat reflectivity profiles which are available in the Level 2 GEOPROF data product have a vertical resolution of 240 m.

| Cloud-resolving model
Cloud resolving model (CRM) simulations are a wellestablished method of analysing cloud growth and other microphysical processes involved. In the last few decades, CRMs have evolved and are able to simulate the microphysical processes reasonably well; even processes such as aerosol-cloud interactions are represented well (Tao, 2007;Tao et al., 2012). The RAMS is the CRM used in this study which is developed by Colorado State University. It employs a full nonhydrostatic compressible system representing the dynamics and thermodynamics of the atmosphere with conservation equations for moisture budgets. These equations are supplemented with parameterizations for turbulent diffusion, solar and terrestrial radiation, moist processes including the formation and interaction of clouds and precipitating liquid and ice hydrometeors, kinematic effects of terrain, cumulus convection, and sensible and latent heat exchange between the atmosphere and the surface, consisting of multiple soil layers, vegetation, snow cover, canopy air and surface water. The RAMS is a regional model, designed for mesoscale or higher resolution scale grids. A two-way interactive grid nesting allows fine mesh grids to resolve smallscale atmospheric systems such as thunderstorms while simultaneously modelling the large-scale environment of the systems on a coarser grid (Cotton and Pielke, 2000). Various studies have implemented the RAMS for simulating the real events of moist convective events such as thunderstorms, hurricanes and cyclones (Altaratz et al., 2005;van den Heever et al., 2006;van den Heever and Cotton, 2007;Herbener et al., 2014). These studies have shown that the RAMS can simulate the events very realistically and is quite reliable for simulating thunderstorms.
In this study, four two-way nested C-grids are used with horizontal spacings of 25 km, 6.25 km, 1.5 km and 375 m covering an area of approximately 62,500 km 2 in the outer grid and 1,463 km 2 in the innermost grid (grid top view is shown in Figure S4). The model consists of 41 vertical levels with variable spacing; the lower altitudes are assigned to a higher resolution grid and the model top extends to 20 km. The basic radiative condition was applied to the normal velocity components at the lateral boundaries of the grids (Klemp and Wilhelmson, 1978). The surface processes were parameterized using the Land Ecosystem Atmosphere Feedback 3 (LEAF-3). Convection was explicitly resolved on the inner grids while parameterized on the outer grid using a modified Kuo cumulus scheme (Kuo, 1974). The mixing ratios and number concentration of the various hydrometeors were predicted through a two-moment bulk microphysical scheme (Meyers et al., 1997), with all the available water species activated (pristine ice, cloud water, snow, aggregates, graupel, hail and rain). Since the Singapore region does not have instrumental data of vertical aerosol profiles, the default aerosol initiation and vertical profile inbuilt in the RAMS was used; it is shown in Table 1 which lists the complete model configuration.
The model was initiated using ECMWF ERA-Interim model data with 0.3 grid resolution. The study was conducted in the Singapore region covering an area of approximately 250,000 km 2 . The initiation was supplemented with local radiosonde data whenever possible which are available at 12 hr intervals (http://weather. uwyo.edu/upperair/soun-ding.html). The initiation data were collected for events that were identified as thunderstorms through rain rate and lightning intensity data from Meteorological Services Singapore (MSS). The database contains simulations to collect approximately 100,000 profiles of the concerned variables. The RAMS simulations were validated in the Singapore region using CloudSat reflectivity profiles, which seek the RAMS closeness to the truth values. Since there are no hydrometeor profile measuring instruments in the region, the reflectivity profiles were compared as shown in Figure S1. The details are provided in the supplementary material. The nonprognostic variables such as temperature, potential temperature and mixing ratios were compared with the local radiosonde profiles for validation, as shown in Figures S2  and S3. These may be used to validate the ECMWF data accuracy.

| Scattering models
The database is populated separately for latent heat and the corresponding measurable quantities. It has to be populated with quantities which are observable with CPR, i.e. radar reflectivity (Z). The RAMS simulations can produce hydrometeor profiles (corresponding to latent heat profiles) in terms of mixing ratios (gÁm -3 ) and diameter (m). These quantities can be converted into Z. A model to simulate the radar/LIDAR measurable quantities from the hydrometeor scattering information was developed by Hogan and Battaglia (2008) and shows good performance in estimating backscatter information for thunderstorm profiles. This fast model calculates the time-dependent multiple scattering returns from LIDAR or radar. It uses a hybrid approach with the small-angle multiple scattering returns of LIDAR, calculated by the photon variancecovariance method (Hogan, 2008), and the wide-angle multiple scattering that occurs for both radar and LIDAR, calculated using the time-dependent two-stream approximation (Hogan and Battaglia, 2008). The basic schematic showing the parameters used in the multiscatter algorithm is shown in Figure 2. The cloud profile is discretized into layers of thickness Δr that are assumed to be homogeneous in the orthogonal direction to the quasi-direct beam. A small fraction of photons in the quasi-direct beam that are backscattered toward the receiver can return to the instrument along the quasi-direct path, allowing small-angle  Boundary conditions Radiative lateral boundary (Klemp and Wilhelmson, 1978) Turbulence scheme Deformation-based parameterization (Lilly, 1962;Hill, 1974) Aerosol initiation Exponential profile with maximum concentration C = C max exp(-z/7000), where z is in meters and C is the aerosol concentration Radiation scheme Harrington (Harrington, 1997) Surface scheme LEAF-3 F I G U R E 2 Schematic diagram illustrating the essential parameters for the multiscatter algorithm. The thick arrow depicts the propagation of a transmitted beam toward a two-layer cloud. Some of the radiation is backscattered toward the receiver (solid open-headed arrows), and its travel time (distance on the abscissa) may be interpreted directly in terms of range to the layer. Some is scattered into diffuse streams inside the cloud, with the possibility of being subsequently scattered back toward the receiver (dashed open-headed arrows). In this case the travel time cannot be interpreted directly as a depth of penetration into the cloud forward scattering. The photon covariance method accounts for the photons that have been forward-scattered an arbitrary number of times on the outgoing or return journeys by calculating the variance and covariance of the photon position and direction as a function of range (Hogan, 2008). The photons that experience wide-angle scattering in the quasi-direct beam within the cloud and then enter the diffuse distribution are modelled at each range gate by an outgoing and incoming stream travelling in two discrete directions (Hogan and Battaglia, 2008). This method needs the extinction coefficient, extinctionto-backscatter ratio, asymmetry factor and scattering albedo information for each hydrometeor profile at each altitude level in order to get the corresponding apparent radar reflectivity. The extinction area and backscatter area can be calculated using scattering theories such as the Mie and Rayleigh theories. Methods have been developed which use versions of these theories for determining the scattering properties of cloud hydrometeors (Battaglia et al., 2010). The method used in this study to calculate the scattering properties uses the density based approximation because, in the larger ice particles, the internal structure becomes important and the assumption of spherical shapes leads to error in estimation of the scattering properties (Hogan et al., 2017). The density of large aggregates is generally low (for the size they project); in such cases the Rayleigh-Gans approximation has been found to be valid. In this approximation, the electric field experienced at any point in the particle is approximated by the incident field, thereby neglecting interaction between dipoles. The backscatter crosssection may then be estimated numerically from a 1D description of the structure of the particle in the direction of the incident wave (Hogan et al., 2017). Hogan and Westbrook (2014) showed that, for realistic aggregates, this 1D function has a self-similar structure and hence its power spectrum can be represented by a power law. Therefore, it is referred to as the self-similar Rayleigh-Gans approximation (SSRGA). Detailed discussion regarding the method is beyond the scope of this study but interested readers should refer to Hogan et al. (2017). Therefore, to determine individual particle scattering the Mie theory is used for cloud, rain, drizzle and hail and the SSRGA is used for unrimed ice, graupel and snow particles. Both these methods are available as a ready-to-run program developed by Robin Hogan (http://www.met.rdg.ac.uk/~swrhgnrj/). Once the crosssectional area is determined using the scattering model, it is possible to calculate the extinction coefficient ζ ext using (Seinfeld and Pandis, 2006): where N c is the particle number concentration (m −3 ), which is based on the size distribution function used in the RAMS model for the particle hydrometeor, and A ext is the extinction cross-sectional area (m 2 ), which is obtained from the SSRGA model output. Primarily, the multiscatter algorithm was developed for calculating LIDAR backscatters and it was later extended to radar (Hogan, 2006;Hogan and Battaglia, 2008). From the apparent backscatter coefficient, the apparent radar reflectivity can be calculated as described by Hogan and Battaglia (2008): where jκj is a reference dielectric factor of water and, at millimetre wavelengths, is temperature dependent (Hogan, 2006). From this, the calculated reflectivity values are populated in the database for each altitude level (simulated by the RAMS) along with the temperature profiles obtained from the ECMWF-AUX product.

| Database properties
The database has to cover the most commonly occurring profiles of thunderstorms without having a large inherent variance (which will lead to higher estimation errors as explained in Section 2.2). Since this is a regional study, it is expected that the thunderstorm profiles will have similar profiles due to similar landscape and water bodies. In order to decide on the acceptable variance, databases with different variances (varying standard deviation of the reflectivity profiles from 10% to 50% of the mean values) were applied to retrieve the latent heat. Based on trial and error with different standard deviation values, the database with 34.6% standard deviation produced the optimal result. The database developed here contains two different sections: (a) the reflectivity and temperature profiles of simulated thunderstorms, and (b) the corresponding latent heat profiles of simulated thunderstorms. Ten different RAMS simulations were performed around the Singapore region at different times at which thunderstorms were identified using lightning intensity maps produced by MSS which coincided with the CloudSat orbital path. This was done in order to make sure that CPR can detect the thunderstorms seen in the lightning maps. Since this study concerns only thunderstorms, only those profiles with identified thunderstorms were selected and the corresponding backscatter calculations were performed. The thunderstorm identification follows exactly the method used by the CloudSat CLDCLASS data product which classifies the cloud type Sassen and Wang, 2008). This ensures that the database will be populated with thunderstorm events which are recognized by the CloudSat scene classification algorithm. The CPR footprint is roughly 1.5 km which is coarser than the innermost grid of the RAMS from which the database profiles are taken. In order to accommodate the differences, the database was downgraded by selecting the profiles with 1.5 km horizontal spacing and 240 m vertical resolution. The data in the lower atmosphere which have very high vertical resolution (below 1 km) in the RAMS are binned into four equally spaced grids to simulate the same effects of ground clutter as observed in the CPR.
The input components of the database consist of reflectivity and temperature profiles. As explained in previous sections, the reflectivity profiles were calculated from the RAMS hydrometeor profiles using scattering models (Hogan and Battaglia, 2008;Hogan et al., 2017). The maximum of the mean reflectivity profile occurs between approximately 6 and 8 km as shown in Figure 3, which agrees with previous studies in the southeast Asian region (Dodson et al., 2013). It is also seen that the latent heat peaks in that altitude range indicating that reflectivity is indeed indirectly representative of latent heat. It should be noted that the lowest reflectivity value that can be detected by the CPR is roughly −30 dBZ which refers to light cirrus clouds with small hydrometeors which are not dense (Tanelli et al., 2008). Therefore, the reflectivity values in the database at the top (above 17 km), which are very low, will not be highly matched with the CPR data.

| Quantitative analysis
The estimated latent heat profiles were calculated for the CloudSat identified thunderstorm events during 2016 near Singapore which are not part of the database. The CloudSat CLDCLASS product was used to detect the thunderstorms with particular criteria to identify the pixels . Approximately 150 profiles from four different events in 2016 were used to test the algorithm. The four events are April 22, April 29, June 2 and November 14, all around 0600 UTC or 1400 local time. Once the estimations were obtained for the CloudSat profiles, validation of the profiles was done using simultaneous RAMS simulations initiated with ECMWF ERA-Interim atmospheric profiles at the particular location of the event. It should be noted that the RAMS based validation is the only option in this case because the nearest radiosonde station collects data every 12 hr at 0000 and 1200 UTC and the CloudSat orbital pass time over this region is approximately 0600 and 1800 UTC. Therefore, the radiosonde cannot be used for validation, and there are no other instruments that are able to measure the vertical profile of the thunderstorm in the region. To find the closeness of the RAMS data to the truth values, the RAMS simulations were validated using reflectivity profiles (details are provided in the supplementary document). The performance of the algorithm is judged based on the error in the vertical profile estimation and linear regression analysis between the retrieval and validation values of column integrated heating. Figure 4 shows the mean percentage error in the retrieved latent heat profiles. It is seen that the error increases with altitude. The reason may be the RAMS vertical resolution which does not allow a resolution higher than 1 km at altitudes higher than 5 km. This may lead to an inaccurate representation of microphysics at higher altitudes. Also, the reflectivity differences between the database and CPR at altitudes higher than 16 km can be high as CPR can only detect a minimum reflectivity of −30 dBZ, while some of the simulated reflectivity profiles are up to −40 dBZ. It is also seen that the largest error below 16 km occurs between 8 and 10 km, which is analogous to the peak standard deviation of the latent heat database. The database may need to be improved in order to obtain better results. There are issues that need to be addressed in the future: (a) the effect of different microphysics schemes used; (b) improving the database may solve some issues, but other estimation schemes may also be implemented to compare and optimize; and (c) errors due to the scattering models need to be studied.
F I G U R E 3 Mean and standard deviation of reflectivity and latent heat profiles in the database It is also sensible to compare the integrated values of the heating in a column to judge the performance of the algorithm better since the accuracy of estimation at each altitude level may not be required for some applications. Figure 5 shows the linear regression plots for column integrated latent heating. The plots show that the column integrated values have reasonably good linear fit with consistent negative bias; however, due to some similar estimated values (estimated values below 2 K), the regression has a low value of 0.8022. The consistent negative bias may be due to the limitation of the current scattering models which can represent the reflectivity of a W-band radar. The CPR itself gets attenuated heavily in heavy precipitation and this attenuation may not always be represented accurately by the multiscatter models. Therefore, the algorithm will underestimate the latent heat exchanges involving large particles such heavy rain or growth of graupel and hail.
Similar estimation values may be due to overgeneralization of the method when deriving the probability distribution. This happens when the observation profile used does not find a similar profile in the database leading to a more general outcome. The few ways to resolve this issue are as follows: (a) increase the quantity and quality of samples in the database which can cover more scenarios similar to the observation (this is one of the future tasks for this study); (b) implement other estimation methods such variational analysis, machine learning methods and least-square averaging etc.; and (c) conclude that the algorithm does not work consistently for latent heat values under 2 K, and avoid estimated values below 2 K. To add further confidence in the retrievals, the algorithm was also used to estimate rain rates which are validated using the rain data from MSS. Rain rates are estimated in a similar way to latent heat using Equations (6) and (7) and also add value since rain rates are measurable at the surface. Since rain data are available only from the station collecting the data, the validation is performed by comparing the rain rate estimations from CloudSat profiles which are within 1 km of the station. Figure 6 shows the linear regression comparison between the estimations and measurements of rain rates for the four events mentioned earlier in this section. Since the stations are limited and not as dense as the CloudSat horizontal resolution, the comparison is limited to 17 data points. The rain rate estimation shows lower performance compared to the latent heat estimation. This is expected as CPR is heavily attenuated during heavy rain, which is reflected in the underestimation of rain rates at values higher than 20 mmÁhr -1 . The reason for the underestimation is the same as explained previously where the representation of reflectivity values might be inaccurate for heavy rain cases which can contain particles with complicated shapes that are difficult to model accurately.

| Case study
A case study was performed to demonstrate the results and retrieval limitations. The profile discussed in the case is a thunderstorm event from April 29, 2016 which was detected by CloudSat CPR. As seen in Figure 7, the retrieval algorithm was applied for a CLDCLASS identified thunderstorm event. It can be seen that the actual cloud extends beyond the identified thunderstorm (the anvil of the storm), yet the algorithm is restricted to perform the retrieval only in the regions of the storm tower. The reason for this is mainly because the towering structure of the thunderstorm (shown as the red patch in the cloud scenario panel of Figure 7) is believed to be the container for all the rapid heating, updrafts and downdrafts and the mass (Beres, 2004;van den Heever and Cotton, 2007;Tao et al., 2012;Jewtoukoff et al., 2013;Koren et al., 2014). As this study is mainly concerned with estimating the thunderstorm strength, the estimation is performed only for the limited identified area. The latent heat profiles are exactly aligned with pixels where the reflectivity values are near the maximum.
The maximum reflectivity values for the clouds are approximately 5 dBZ. This aligns with the fact that bigger drops or a higher drop concentration (higher reflectivity) undergoing a phase change release a large amount of heat which is seen as bright red patch in Figure 7. Also, having a temperature profile supports the retrieval which will help in better probability distribution.

| CONCLUSIONS
Thunderstorms are one of the largest disturbances in the atmosphere and are the source of gravity waves and changes in circulation. They can be considered as one of the phenomena in which a large amount of energy exchange is involved. The hydrometeors and ambient environment undergo complex interactions which generate heat, precipitation and waves of different frequencies (sound, light and gravity waves) which affect human life. It is important to develop methods to observe and monitor the energies and heat released during a thunderstorm event which will be helpful for understanding the following processes and developing weather models.
This study suggests a method for estimating the latent heat release of a thunderstorm using CloudSat cloud profiling radar (CPR) measurements in tropical regions. The method is based on the Bayesian Monte Carlo approach which calculates the probability of any particular event using a sample database. The sample database is generated using the Regional Atmospheric Modelling System -Cloud Resolving Model (RAMS-CRM) (Cotton and Pielke, 2000) in the present study. The hydrometeor profiles from the model simulations are converted into radar backscatter values using scattering models which also include multiple scattering effects (Hogan, 2006;Hogan and Battaglia, 2008;Hogan et al., 2017). The CPR reflectivity profiles along with European Centre for Medium-Range Weather Forecasts (ECMWF) temperature profiles are compared with the corresponding database profiles using the Bayesian Monte Carlo method to calculate the probability of each database member for the particular observation (Lécuyer and Stephens, 2002).
The algorithm performance was evaluated using approximately 150 CPR observations in 2016 near Singapore which are identified as thunderstorm profiles. The mean vertical error profile for latent heat has a peak value of less than 20% at higher altitudes where the F I G U R E 7 Case study profiles: (a) cloud scenario profiles obtained from the CLDCLASS product which shows the different types of clouds (here the number 8 represents a thunderstorm which is seen as the red patch in the plot); (b) the corresponding reflectivity (dBZ) profiles obtained from the GEOPROF product; (c) the retrieved latent heat (K) RAMS has poor vertical resolution and also the differences between the reflectivity database and the CPR reflectivity may differ at that altitude. The performance was also analysed by comparing column integrated values of latent heat release which show a consistent negative bias. The bias may be due to inaccurate and underestimation of reflectivity profiles in the database. The representation of scattering effects at 94 GHz in thunderstorms is a study that still requires significant improvements as the difficulty increases due to the various shapes and sizes of hydrometeors that occur in thunderstorms. To give reasonable confidence in the method, the rain rates were also estimated along with the latent heat and compared with local measurements of the rain rate in Singapore. These estimations show reasonable performance in low rain while underestimating at higher rates above 25 mmÁhr -1 .
The algorithm still needs improvements in many regards such as vertical profile error reduction and increasing the sample database to cover a wide range of scenarios while maintaining the quality and verifying the performance in other tropical regions. Some future work identified for this study is as follows: (a) improve the database quality and quantity; (b) study the scattering model errors; and (c) apply other estimation methods for comparison. Overall, the study has proved that the latent heat can be estimated using CPR reflectivity profiles with reasonable accuracy for thunderstorms (for which CPR is subject to multiple scattering effects). Gradually, the method will be extended to a global scale and other types of convective clouds. Developing such methods is important as weather models are in constant need of input from observations of different aspects of the atmosphere. The CloudSat and the Tropical Rainfall Measuring Mission have been able to provide valuable information regarding cloud structure and climatology, and the information regarding the latent heat will help for further development of models that are addressing the impact of thunderstorms on local and global weather.