Classification of convective/stratiform echoes in radar reflectivity observations using a fuzzy logic algorithm
Key Laboratory for Semi-Arid Climate Change of the Ministry of Education, Key Laboratory of Arid Climatic Changing and Reducing Disaster of Gansu Province, College of Atmospheric Sciences, Lanzhou University, Lanzhou, China
Corresponding author: Y. Yang, Key Laboratory for Semi-Arid Climate Change of the Ministry of Education, Key Laboratory of Arid Climatic Changing and Reducing Disaster of Gansu Province, College of Atmospheric Sciences, Lanzhou University, No.222 TianShui South Road, Lanzhou City, Gansu Province, 730000, China. (email@example.com)
 In radar reflectivity observations, convective and stratiform rain types often have poorly defined boundaries, which causes problems for rain classification. A fuzzy logic (FL) algorithm is developed to classify convective and stratiform rainfall from radar reflectivity observations in three steps. First, the algorithm is calibrated for the Hefei Doppler radar site in China. Four features are selected based on a calibration dataset spanning the period from 29 June to 23 July 2003; the features represent a subjective choice of characteristics that are expected to distinguish the rain types. Second, membership functions are used to determine the degree to which each feature belongs to each rain type in the fuzzification process. Finally, the degree of fuzzification for each input feature is multiplied by a predetermined weighting coefficient, and the weighted degrees are aggregated to produce a single value for each rain type. The aggregated value represents the possibility of each rainfall type, and a larger value indicates a higher potential for a particular class. The FL algorithm is applied to four typical independent cases collected by the Hefei Doppler radar that were not included in the calibration database. Results show that the classification produced by the proposed FL algorithm is physically reasonable according to an analysis of three-dimensional radar reflectivity patterns, implying that the algorithm has great potential for precipitation classification.
 The classification of precipitation into convective and stratiform types is useful in a variety of meteorological applications [Anagnostou, 2004; Biggerstaff and Listemaa, 2000; Steiner et al., 1995]. First, it is important for understanding cloud physics, as the two precipitation systems are characterized by different precipitation growth mechanisms [Gao and Stensrud, 2011; Gao et al., 2006; Houze, 1993; Li and Pu, 2008; Sun and Crook, 1998]. Second, it is important from the thermodynamic standpoint in determining the vertical distribution of diabatic processes [Houze, 1989; Mapes and Houze, 1993, 1995]. Finally, it is important in quantitative precipitation estimation from both ground- and space-based instruments [Anagnostou and Krajewski, 1998; Zhang and Qi, 2010; Qi et al., 2013].
 A number of methods for partitioning rainfall from precipitating clouds into convective and stratiform components have been proposed. Many of these methods originated from studies of rain gauge data [Austin and Houze, 1972; Houze, 1973] and classify precipitation as convective whenever the rain rate exceeds the background level by a certain threshold amount. This background-exceedance technique (BET) is generally adequate for identifying the core of convective precipitation. The vertical and horizontal features of radar echoes provide useful information about the structure of precipitating cloud systems [Houze et al., 1976; Houze et al., 1990]. Churchill and Houze used observations of radar reflectivity to extend BET into two dimensions. However, the fixed convective radius that they used was inadequate. Adopting an approach similar to that used by Adler and Negri, Steiner et al. (hereinafter SHY95) modified the BET method to allow the influence radius of each convective core to vary in size. In their approach, the influence radius and the threshold are functions of the area-averaged background reflectivity. Biggerstaff and Listemaa (hereinafter BL00) improved the SHY95 method by applying additional information based on the three-dimensional hydrometeor field inferred from radar reflectivity (the vertical lapse rate of reflectivity, a bright band fraction, and the magnitude of the two-dimensional horizontal gradient of radar reflectivity). Compared to the original SHY95 method, BL00 estimated larger convective areas but smaller convective rain volumes for all systems other than scattered, isolated convective cells. However, this method needs the height of the freezing level (0°C) to calculate the bright band fraction. This requirement may limit the general usefulness of the method.
 From a classification standpoint, most of the approaches discussed above seek to designate mutually exclusive classes with well-defined boundaries to discriminate between rain classes. These approaches are termed “hard” classifications. However, because the true boundaries are uncertain or imprecise, hard classifications are prone to misclassification: the properties of stratiform and convective areas may overlap in many respects. The flexibility of classification schemes could potentially be improved through the use of more advanced methods that work with uncertain or imprecise information, such as neural network (NN) and fuzzy logic approaches. For example, Anagnostou developed a classification algorithm using the NN approach. The classification results showed great improvements when compared to the SHY95 and BL00 methods. However, NN algorithms require a large amount of computation. By comparison, fuzzy logic algorithms are much more computationally efficient.
 In contrast to hard classifiers, the fuzzy logic approach assigns each observation to every class with a strength evaluated by membership functions. It avoids the application of a strict threshold until all available information has been combined. The advantage of this approach is its ability to systematically address natural ambiguities in measurement data, classification categories, and pattern recognition. Moreover, the fuzzy logic approach avoids the challenges of establishing complex relationships between features and of calibrating the different thresholds required in the decision-tree scheme of a hard classifier. Although fuzzy logic algorithms have been successfully applied in the engineering sciences, their use in the atmospheric sciences has so far been limited to a few areas [Albo, 1994; Baum et al., 1997; Gourley et al., 2007; Key et al., 1989; Vivekanandan et al., 1999].
 In addition, most of the algorithms discussed above produce a black-and-white (i.e., convective or stratiform) classification. Clearly, nature is not black-and-white, and there is a lot of gray in between, i.e., precipitation transitioning from convective to stratiform in space and/or time. In this respect, memberships in the fuzzy logic approach are similar to probabilities, so the approach can be used to express the classification in a probabilistic way.
 This paper develops a fuzzy logic (FL) algorithm for precipitation classification with radar reflectivity observations and evaluates its performance. The major purpose of this study is not to compete with other radar-based precipitation classification algorithms. Instead, we aim to demonstrate the potential of the FL algorithm for precipitation classification, a problem in which it is difficult to define precise class boundaries when a “hard” classification method is applied. The paper is organized as follows: section 2 describes the radar data; section 3 describes the proposed FL algorithm; section 4 details the performance of the FL algorithm applied to four typical independent cases; and section 5 provides a summary of the results.
2 Doppler Radar Data
 The calibration dataset and the four typical independent cases used to evaluate the FL algorithm were all collected by the Hefei Doppler radar (117.716°E, 31.883°N) in China, which is a Chinese New Generation S-band radar (CINRAD WSR-98D/SA).
 The WSR-98D/SA is a 10 cm wavelength Doppler radar with a 1° half-power beamwidth. The data consist of volume scans of radar reflectivity, radial velocity, and spectrum width collected in polar coordinates at increasing elevation angles. The radar is operated in a 360° azimuthal volume scan mode during periods of precipitation, with the elevation angle increasing in steps from 0.5° to 19.5°. The number of elevation angle steps and the temporal resolution of the data depend on the operational mode of the radar. The bin spacing is 250 m in the radial direction; however, reflectivity values are averaged over four bins to increase the number of independent measurements collected for each recorded value. Accordingly, reflectivity values are recorded at 1 km intervals along the radar beam, while velocity parameters are recorded at 250 m intervals. Each volume scan takes approximately 5 min. The rain classification presented in this study uses only the radar reflectivity observations.
 Following data quality control, the radar data are interpolated from polar coordinates to Cartesian coordinates using the three-dimensional Barnes algorithm [Barnes, 1964]. The horizontal domain is a 300 × 300 km grid centered at the radar with a horizontal resolution of 1 km (301 × 301 grid points). The vertical domain consists of 73 layers extending to 18 km altitude (0.25 km resolution).
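 The grid and the interpolation step can be illustrated with a minimal sketch. The grid dimensions follow the text; everything else is an assumption for illustration: the function name `barnes_estimate`, the single-pass Gaussian weighting, and the smoothing parameter `kappa_km2` are hypothetical and are not the settings used in this study.

```python
import numpy as np

# Cartesian analysis grid as described in the text: 301 x 301 points at 1 km
# spacing centered on the radar, and 73 levels at 0.25 km spacing (0 to 18 km).
x_km = np.linspace(-150.0, 150.0, 301)
y_km = np.linspace(-150.0, 150.0, 301)
z_km = np.linspace(0.0, 18.0, 73)


def barnes_estimate(obs_xyz_km, obs_values, point_km, kappa_km2=4.0):
    """Single-pass Barnes (Gaussian distance-weighted) estimate at one grid point.

    obs_xyz_km : (M, 3) Cartesian positions of the radar bins, in km
    obs_values : (M,) observed values at those bins
    point_km   : (3,) position of the grid point, in km
    kappa_km2  : smoothing parameter; 4.0 is a hypothetical value
    """
    obs_xyz_km = np.asarray(obs_xyz_km, dtype=float)
    obs_values = np.asarray(obs_values, dtype=float)
    # Squared distance from every observation to the grid point.
    d2 = np.sum((obs_xyz_km - np.asarray(point_km, dtype=float)) ** 2, axis=1)
    # Gaussian weights: nearby observations dominate the estimate.
    w = np.exp(-d2 / kappa_km2)
    return float(np.sum(w * obs_values) / np.sum(w))
```

 In practice only observations within some cutoff radius of the grid point would be passed in, which also avoids the case where all weights underflow to zero.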
3.1 Overview of the Fuzzy Logic Algorithm
 The theory of fuzzy logic is based on approximate reasoning and has proven to be flexible and capable of combining a variety of different features. Fuzzy sets display a strength of membership for all potential classes, and the degree of membership in any particular class is provided by a membership function. A larger membership value indicates a greater potential for a particular class to be present in the sample.
 The algorithm uses a joint analysis of a number of features to assign each measurement a set of values in the range [0, 1]. Each such value provides an assessment of the possibility that the measurement belongs to a specific class. The analysis is accomplished through a set of user-defined curves $\mu_{k,e}(x)$ (known as membership functions), which quantify the extent to which the feature $X_k = x$ helps classify the measurement. The subscript $k = 1, 2, \ldots, N$ corresponds to the input features, which are defined below. The subscript $e$ corresponds to the classification category, which in this case is either convective ($e = C$) or stratiform ($e = S$). The membership functions of the feature fields $X_k$ for each category yield a set of strengths of membership $P_{k,e} = \mu_{k,e}(X_k)$ at each grid point. The total strength of membership $P_e$ for the grid point is then obtained for each category by calculating the weighted average of the individual $P_{k,e}$ using a set of weighting factors $w_k$. It can be expressed as:
 \[ P_e = \frac{\sum_{k=1}^{N} w_k P_{k,e}}{\sum_{k=1}^{N} w_k} \]
 The weighting factors account for the uncertainty in the relative importance of different features to the overall classification. For simplicity, this study does not differentiate the importance of the features and assigns them equal weights. Grid points for which $P_C > P_S$ are classified as convective rainfall; otherwise, grid points are classified as stratiform rainfall. In this study, the sum of $P_C$ and $P_S$ is 1, so the value of $P_C$ is used to express the convective classification in a probabilistic way: it gives an assessment of the possibility that convective rain exists at a radar pixel.
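 The aggregation and decision rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `total_membership` and the membership strengths on the tiny 1 × 2 grid are hypothetical, and equal weights are used as in the text.

```python
import numpy as np


def total_membership(p_k, weights=None):
    """Weighted average over the feature axis (axis 0) of the strengths P_{k,e}."""
    p_k = np.asarray(p_k, dtype=float)
    n_features = p_k.shape[0]
    # Equal weights by default, as assumed in this study.
    w = np.ones(n_features) if weights is None else np.asarray(weights, dtype=float)
    return np.tensordot(w, p_k, axes=(0, 0)) / w.sum()


# Hypothetical convective strengths for N = 4 features on a 1 x 2 grid.
p_k_conv = np.array([
    [[0.2, 0.9]],
    [[0.1, 0.8]],
    [[0.4, 0.7]],
    [[0.3, 1.0]],
])
p_k_strat = 1.0 - p_k_conv          # stratiform strengths are the complement

p_c = total_membership(p_k_conv)    # P_C at each grid point
p_s = total_membership(p_k_strat)   # P_S at each grid point
is_convective = p_c > p_s           # grid points that fail this test are stratiform
```

 Because the two totals sum to 1 at every grid point, the test `p_c > p_s` is equivalent to `p_c > 0.5`, which is why a single map of the convective value is enough to present the result.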
3.2 Feature Selection and Membership Function
 The features represent a subjective choice of characteristics that are expected to discriminate between the two rain types. Figure 1a gives a sketch of the stratiform and convective frequency distributions of a feature X. The two frequency distributions overlap in the range [a, b]. The feature selection process identifies a subset of features that minimizes the computational expense and maximizes the accuracy of the classification.
 The membership function is one of the most important components of any fuzzy logic algorithm. Here, a linear function is used as the membership function for convection (see the dashed line in Figure 1b). The linear function is defined as:
 \[ \mu_{k,C}(x,a,b) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a < x < b \\ 1, & x \ge b \end{cases} \]
where the subscript k = 1, 2, 3, 4 denotes the input feature, a is the left breaking point, and b is the right breaking point.
 The corresponding membership function for stratiform rain (see the solid line in Figure 1b) is:
 \[ \mu_{k,S}(x,a,b) = \begin{cases} 1, & x \le a \\ \dfrac{b - x}{b - a}, & a < x < b \\ 0, & x \ge b \end{cases} \]
It is apparent that $\mu_{k,S}(x,a,b) = 1 - \mu_{k,C}(x,a,b)$.
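 A linear membership function of this kind can be written in a few lines. The sketch below assumes that the convective membership is 0 at or below the left breaking point, 1 at or above the right breaking point, and linear in between; the function names are hypothetical.

```python
import numpy as np


def mu_convective(x, a, b):
    """Linear convective membership: 0 for x <= a, 1 for x >= b, a ramp in between."""
    x = np.asarray(x, dtype=float)
    return np.clip((x - a) / (b - a), 0.0, 1.0)


def mu_stratiform(x, a, b):
    """Stratiform membership is the complement of the convective membership."""
    return 1.0 - mu_convective(x, a, b)
```

 For example, with breaking points a = 20 and b = 45, a value of 32.5 lies halfway along the ramp, so both memberships equal 0.5, while any value of 45 or more is assigned entirely to the convective class.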
 The results of the FL algorithm depend on the definitions of the two breaking points for each input feature. Ideally, the statistics used to constrain the input features should be derived from a large number of samples specific to the site and season, with the “true” convective and stratiform rain areas known in advance. The truth can be obtained by having an experienced radar meteorologist manually inspect the three-dimensional radar reflectivity pattern and classify radar echoes into convective and stratiform elements. However, manual classification is difficult to achieve for a large number of samples. Since the FL algorithm tolerates uncertain or imprecise information, this study uses the rain classification generated by the BL00 algorithm with Hefei radar data as the “observation” of precipitation type, although these “observations” contain errors. A large number of statistical samples from the Hefei Doppler radar are used to constrain the relationships between the input features and the distribution of convective and stratiform rain identified by the BL00 algorithm. The calibration samples span the period from 29 June to 23 July 2003. Volume scans with precipitation echoes were selected at a 15 min interval, giving a total of 1263 volume scans for the statistics. Because the height of the freezing level was not available, the bright-band fraction condition of the BL00 algorithm has not been used in this study.
 The primary purpose of the feature selection process is to find features that maximize the classifier accuracy. The relationships between seven candidate features and the distribution of convective and stratiform rain were examined initially. Six features are those used in Anagnostou, and the seventh is the vertically integrated liquid water content (VIL) [Zhang and Qi, 2010; Qi et al., 2013]. After careful examination, only four features were selected for the data analysis in the next section. These four features have shown the ability to characterize the two rain types (see Figure 2). The distributions of the three discarded features are disorganized, and their relationships with the distribution of convective and stratiform rain are not evident. By combining the four selected features in the fuzzy logic algorithm, we were able to discriminate convective rain from stratiform rain. Figure 2 presents histograms of the four selected features and their association with the stratiform and convective types generated by the BL00 algorithm using a total of 1263 volume scans.
 The four features basically represent a subjective choice of characteristics that are expected to discriminate between two rain types. The features, which are denoted as F1–F4, are described as follows:
1. F1: Reflectivity value at 2 km elevation [Anagnostou, 2004]. The value of F1 (units in dBZ) is expected to be higher in convective systems than in stratiform systems, as is apparent from the histogram (Figure 2a). However, when ice crystals or snowflakes fall into warm surroundings below the freezing level, two effects may change the reflectivity of the particles: (1) the change from ice to water increases the scattering of the particles, so the radar reflectivity increases, and (2) the fall velocity of the snowflakes is less than that of the resulting water drops, so the number of particles per unit volume decreases continuously. Together, the two effects produce a so-called bright band (BB) in the radar reflectivity field. Because the BB can produce high reflectivity values, a classification based on F1 alone may be wrong. Combining F1 with the other features is expected to minimize this issue.
2. F2: VIL (units in kg/m2) is a measure of the amount of liquid water in a column of atmosphere. The VIL within a weather system depends on features of the surrounding environment and is a sensitive indicator of the potential for strong convection associated with the weather system. In the work of Zhang and Qi, if the VIL at any range-azimuth bin is greater than a threshold of 6.5 kg/m2, the gate is classified as convective; otherwise, it is classified as stratiform. As shown in Figure 2b, higher values of VIL are more likely to be associated with convective systems.
 VIL can be calculated from radar reflectivity [Amburn and Wolf, 1997] according to the equation:
 \[ \mathrm{VIL} = \sum_{k=1}^{N_z - 1} 3.44 \times 10^{-6} \left( \frac{Z_k + Z_{k+1}}{2} \right)^{4/7} \Delta h_k \]
where $N_z$ is the number of layers in the vertical, $\Delta h_k$ is the depth of the current layer (units in m), and $Z_k$ and $Z_{k+1}$ (units in mm6/m3) are the values of radar reflectivity at the kth and (k + 1)th layers, respectively. VIL takes the three-dimensional reflectivity into account, so combining it with F1 can mitigate the effect of the BB.
3. F3: Standard deviation (STD, units in dBZ) of F1 in the horizontal [Anagnostou, 2004]. This is the standard deviation of reflectivity within a radius of 11 km of the pixel in question. For the convective profiles that are associated with stronger horizontal gradients, higher values of standard deviation are expected in contrast to the stratiform type (as shown in Figure 2c).
4. F4: Product of radar echo top height and reflectivity value at 2 km (units in km · dBZ). Here, the radar echo top is defined as the height where the reflectivity value of a pixel becomes greater than 18.3 dBZ. A higher radar echo top is generally associated with convective systems and a lower one with stratiform systems. F4 can also reduce the effect of the BB on F1. Preliminary analysis has shown that this feature, as shown by the histograms in Figure 2, magnifies the distinction between the stratiform and convective rain types.
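 The three derived features F2 to F4 can be computed from a gridded reflectivity volume along the following lines. This is a sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the VIL coefficient and exponent are the widely quoted 3.44e-6 and 4/7, the echo top is taken here as the highest level exceeding 18.3 dBZ, and the neighborhood standard deviation uses a simple brute-force loop.

```python
import numpy as np


def vil_kg_m2(dbz, dh_m):
    """F2: VIL from a (Nz, ny, nx) reflectivity volume in dBZ.

    dh_m is the layer depth in meters (a scalar or an array of length Nz - 1).
    """
    z_lin = 10.0 ** (np.asarray(dbz, dtype=float) / 10.0)   # dBZ -> mm^6/m^3
    z_bar = 0.5 * (z_lin[:-1] + z_lin[1:])                  # mean of adjacent levels
    dh = np.asarray(dh_m, dtype=float).reshape(-1, 1, 1)
    return np.sum(3.44e-6 * z_bar ** (4.0 / 7.0) * dh, axis=0)


def local_std(field, radius_px):
    """F3: standard deviation within a circular neighborhood of each pixel."""
    field = np.asarray(field, dtype=float)
    r = int(radius_px)
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    disk = (yy ** 2 + xx ** 2) <= r ** 2
    padded = np.pad(field, r, mode="constant", constant_values=np.nan)
    out = np.empty_like(field)
    for j in range(field.shape[0]):
        for i in range(field.shape[1]):
            window = padded[j:j + 2 * r + 1, i:i + 2 * r + 1]
            out[j, i] = np.nanstd(window[disk])   # NaN padding is ignored
    return out


def echo_top_km(dbz, z_km, threshold_dbz=18.3):
    """Highest level whose reflectivity exceeds the threshold (0 where none does)."""
    above = np.asarray(dbz, dtype=float) > threshold_dbz
    heights = np.asarray(z_km, dtype=float).reshape(-1, 1, 1)
    return np.max(np.where(above, heights, 0.0), axis=0)


def f4_product(dbz, z_km, level_km=2.0):
    """F4: echo-top height times the reflectivity at the 2 km level (km * dBZ)."""
    k2 = int(np.argmin(np.abs(np.asarray(z_km, dtype=float) - level_km)))
    return echo_top_km(dbz, z_km) * np.asarray(dbz, dtype=float)[k2]
```

 On the 1 km grid used in this study, the 11 km radius for F3 corresponds to `radius_px=11`. The double loop is slow on a 301 × 301 grid; a convolution-based version would be the natural replacement for operational use.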
 Combining Figure 1a and the results of the statistical analysis summarized in Figure 2, the two breaking points of the membership function for each feature are determined. The breaking points for F1 are set to a = 20 dBZ and b = 45 dBZ, those for F2 to a = 0.5 kg/m2 and b = 5.0 kg/m2, those for F3 to a = 1 dBZ and b = 14 dBZ, and those for F4 to a = 100 km · dBZ and b = 500 km · dBZ.
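 As a worked example, the breaking points listed above can be applied to a single pixel. The feature values of the pixel are hypothetical and chosen only to make the arithmetic easy to follow; the function name, the equal weights, and the clipped linear ramp are assumptions consistent with the description in section 3.

```python
import numpy as np

# Breaking points (a, b) for F1-F4 as listed in the text.
BREAKS = {
    "F1": (20.0, 45.0),    # reflectivity at 2 km, dBZ
    "F2": (0.5, 5.0),      # VIL
    "F3": (1.0, 14.0),     # horizontal standard deviation, dBZ
    "F4": (100.0, 500.0),  # echo top x reflectivity at 2 km, km * dBZ
}


def convective_value(features, breaks=BREAKS, weights=None):
    """Equal-weight (by default) average of the linear convective memberships."""
    names = list(breaks)
    mu = np.array([
        np.clip((features[n] - breaks[n][0]) / (breaks[n][1] - breaks[n][0]), 0.0, 1.0)
        for n in names
    ])
    w = np.ones(len(names)) if weights is None else np.asarray(weights, dtype=float)
    return float(np.dot(w, mu) / w.sum())


# Hypothetical pixel: memberships are 0.4, 0.2, 0.5, and 0.5.
pixel = {"F1": 30.0, "F2": 1.4, "F3": 7.5, "F4": 300.0}
p_c = convective_value(pixel)
label = "convective" if p_c > 0.5 else "stratiform"
```

 For this pixel the four memberships average to 0.4, so the pixel is labeled stratiform even though two of the features sit exactly at the midpoint of their ramps.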
 In the next section, the calibrated FL algorithm with the chosen breaking points is applied to four independent cases from the Hefei Doppler radar and evaluated.
 In this section, typical independent individual cases from the Hefei WSR-98D are selected to examine the classification performance of the proposed FL algorithm.
4.1 Widespread Stratiform
 A widespread stratiform rain event was captured by the Hefei WSR-98D radar in China on 8 June 2003. The features and results from the FL algorithm applied to this case at 1600 UTC are shown in Figure 3. The reflectivity field (Figure 3a) shows broad precipitation with reflectivity less than 40 dBZ. There are several regions with reflectivity greater than 30 dBZ. There is a very tight relationship between the VIL field (Figure 3b) and the reflectivity (Figure 3a), and the values of VIL are much less than the right breaking point. The STD field (Figure 3c) is also well associated with reflectivity. The peak value, located in region A, is about 7 dBZ, which is only half of the right breaking point. The product of echo top and reflectivity value at 2 km (Figure 3d) shows that its peak value of about 350 km · dBZ is also located in region A. The result from the FL algorithm (Figure 3e) shows that the probability of convective precipitation is less than 0.5 in the entire rainfall area, indicating that this precipitation system is identified entirely as stratiform rain. Although region A has relatively high values of each feature, stratiform rainfall is more likely to exist in this region according to the classification. Stratiform and convective precipitation have very different vertical structures, which are normally associated with different microphysical processes [Cao et al., 2012], and the bright band is only well defined when the stratiform rainfall is well developed [Zhang and Qi, 2010; Qi et al., 2013]. The type of precipitation in region A can therefore be examined through the vertical structure of reflectivity. Figure 3f shows the vertical cross-section of reflectivity at Y = 155 through region A. There are bright bands at about X = 40 and X = 110, which further indicate that region A is stratiform rainfall. Based on the horizontal and vertical reflectivity structures, the classification given by the FL algorithm is physically reasonable.
4.2 Meso-Scale Convective System
 Figure 4 shows a meso-scale convective system (MCS) event at 1408 UTC on 5 June 2009 and the corresponding results from the FL algorithm. The reflectivity at 2 km (Figure 4a) shows two highlighted bow echoes (A and B) oriented from the northwest to the southeast. The leading convective line in the south (echo A) is composed of a series of intense reflectivity cells with a strong reflectivity gradient in front of the line. The second line (echo B), located to the rear of the first convective line (echo A), contains a secondary peak of reflectivity values. The VIL field (Figure 4b) shows that the leading convective line (echo A) generally has high VIL values, while echo B has much smaller VIL values. The pattern of STD (Figure 4c) is quite consistent with that of reflectivity, and the two highlighted bow echoes are related to high STD values. The pattern of the product of echo top and reflectivity value at 2 km (Figure 4d) also matches the reflectivity well in the two highlighted echoes (regions A and B). The values in region A are much larger than those in region B. The result from the FL algorithm (Figure 4e) shows that a high probability of convective rainfall occurs mostly in region A. The left end of region B (around (45,190)) indicates a high probability of convective rainfall as well. The three-dimensional structure of reflectivity (not shown) also indicates apparent convective characteristics in this region. In addition, two small areas in B have a probability of convection between 0.5 and 0.6. Figure 4f shows the vertical cross-section of reflectivity at X = 140 (through echoes A and B). Convective characteristics are apparent from Y = 45 to 65 in region A. The enhanced area of reflectivity around X = 100 indicates a developed bright band. The vertical structure of reflectivity around X = 135 implies a strong bright band, which has been misidentified as convective rainfall by the FL algorithm, although with a low classification value of about 0.5 to 0.6.
4.3 Stratiform Rain with Embedded Convection
 The storm of 22 July 2009 illustrates the performance of the FL algorithm for a stratiform event with embedded convection. The reflectivity field (Figure 5a) shows a large area of precipitation with enhanced reflectivity (regions A and B) to the west of the radar. The peak of VIL is located in regions A and B, where the reflectivity is larger than 40 dBZ (Figure 5b). The STD field (Figure 5c) matches the reflectivity field well except in region C. Although the reflectivity in region C is greater than 30 dBZ, the value of STD in region C is low, which is characteristic of stratiform rain. The pattern of the product of echo top and reflectivity value at 2 km (Figure 5d) is generally consistent with the reflectivity pattern. The output from the FL algorithm (Figure 5e) shows that the area with a probability of convection greater than 0.5 just covers regions A and B, and a much lower probability is found in region C. Figure 5f shows the vertical cross-section of reflectivity at Y = 155 across regions A and C. Convective characteristics are evident from X = 20 to X = 60 (in region A), and bright band signatures appear from X = 100 to X = 130 (in region C). All of these suggest that the classification by the FL algorithm is reasonable.
4.4 Unorganized Convection with Little Stratiform Rain
 An unorganized convective storm system with little stratiform rain was observed by the Hefei WSR-98D radar on 22 July 2008. Results from the FL algorithm applied to this precipitation system at 0803 UTC are shown in Figure 6. This precipitation system contained several isolated convective cells distributed across the west and north of the radar site, and many of them are quite small, with an extent of less than 20 km and a reflectivity peak of less than 50 dBZ (Figure 6a). High VIL is mainly associated with the regions where the reflectivity is greater than 40 dBZ (Figure 6b versus Figure 6a). The pattern of STD (Figure 6c) is very similar to the reflectivity pattern. Regions with high reflectivity correspond to high STD values, and the STD peak in region A is greater than 7 dBZ. Figure 6d shows the product of echo top and reflectivity value. Its intensity and pattern are well associated with those of VIL. High values correspond well to strong reflectivity, especially where it is larger than 40 dBZ. Region A is associated with low values of the product. The result from the FL algorithm (Figure 6e) shows that a high probability of convection is found mainly over regions with strong reflectivity exceeding 40 dBZ. The probability of convection in region A is low, although the STD value in this region is relatively high. A vertical cross-section of reflectivity at Y = 145 (through region A) is shown in Figure 6f. A strong bright band can be seen from X = 90 to X = 110 in region A. Therefore, region A should be stratiform precipitation. The above analysis further shows that the FL algorithm can provide a physically reasonable classification of stratiform and convective rainfall.
 A fuzzy logic algorithm for partitioning radar data into convective and stratiform components has been developed and tested with Hefei WSR-98D radar data in China. The fuzzy logic algorithm, which can tolerate uncertain or imprecise information, avoids the application of strict boundaries until all available information has been combined. This approach is conceptually straightforward but effective, and it avoids the difficulties of establishing complex relationships between features and of calibrating the various thresholds that are required in decision-tree schemes. In addition, the fuzzy logic algorithm can express the classification in a probabilistic way.
 First, four features and the breaking points of the membership function for each feature are determined from a large sample dataset against the “observations” of precipitation type generated by the BL00 method. The four features are the reflectivity at 2 km, the vertically integrated liquid water content, the standard deviation of reflectivity in the horizontal, and the product of radar echo top height and reflectivity value at 2 km. These features are found to best discriminate between convective and stratiform precipitation based on their physical characteristics. In the fuzzification process, linear membership functions are used to determine the degree to which each input feature belongs to each rain type. The fuzzification gives an assessment value between 0 and 1 for each input feature and each rain type. Finally, the results of fuzzification for each input feature are multiplied by predetermined weighting coefficients and aggregated. The aggregated value is related to the probability of convective or stratiform rainfall. Four typical independent cases have been chosen in this study to evaluate the performance of the proposed fuzzy logic algorithm. Results show that its classification is reasonable according to the analysis of three-dimensional radar reflectivity, implying its great potential for precipitation classification.
 Because the “truth” of precipitation type is lacking, it is difficult to give a quantitative evaluation of the proposed fuzzy logic algorithm in the current study. Similarly, the sensitivity of the results to the weights (and thus to the features: if the weight for a feature equals 0, the feature is not used), the grid resolution, and the breaking points of the membership functions has not been tested. All these issues deserve further investigation and will be addressed in our follow-up work.
 This research was supported by the National Science Foundation of China (41175092, 40805044, and 41205025), the Natural Science Foundation of Gansu Province (1010RJZA118), and the University graduate student's Scientific Research Innovation of Jiangsu Province (CX10B_285Z). The authors wish to thank the Weather Service Forecast Office of Anhui Province for providing the radar data. The authors would also like to thank Dr. Qing Cao and Dr. Xiaoming Hu from the University of Oklahoma and three anonymous reviewers for their useful comments and language editing, which have greatly improved the manuscript.