A photo-based quality assessment model for the estimation of PM2.5 concentrations

Rapid economic growth has caused severe environmental pollution, which has aroused great concern. This pollution affects public health and impairs visibility; therefore, it should be given greater consideration. In this paper, a photo-based PM2.5 concentration predictor is proposed based on natural scene statistics, without artificial assistance or extra information. Given that the quality of PM2.5 concentration images is determined by many factors, three types of influencing factors are analysed: the colourfulness, the contrast and the structural degradation. The first feature consists of the hue, saturation and colour descriptors, which measure the colourfulness of the PM2.5 concentration images. The second feature is based on the contrast, which can effectively portray the quality of the PM2.5 concentration images. The third feature is extracted based on the natural scene statistics model, which measures the local and global structural degradation information and the naturalness of the PM2.5 concentration images. Finally, the three features are used to train a random forest model that can be used to predict the concentration of PM2.5. Experimental results illustrate that the proposed model performs better than popular competitors on AQID.


INTRODUCTION
With the rapid development of urban industrialization and electronic business, environmental pollution, particularly air pollution, is becoming increasingly serious. Fine particulate matter (PM2.5), with an aerodynamic diameter less than or equal to 2.5 μm, is the major air pollutant. PM2.5 brings many hazards, mainly in the following aspects. First, it affects human health: long-term exposure to PM2.5 particles can cause cardiovascular disease, respiratory disease and lung cancer [1][2][3][4]. Second, PM2.5 that stays in the atmosphere for a long time forms haze, resulting in traffic jams. Third, the haze weather formed by PM2.5 can make people experience dyspnea. Finally, PM2.5 can affect cloud formation and rainfall processes, and can thereby indirectly affect climate change. PM2.5 increases the number of concretions and the number of raindrops in the sky, meaning rainstorms may occur in extreme cases. Hence, increasing attention has been paid to the study of PM2.5. At present, the commonly used methods of monitoring PM2.5 concentrations mainly include the micro-oscillation balance method, the beta ray attenuation method, the gravimetric method and so on, which mainly measure the weight of the PM2.5 and then give its concentration [5]. However, these methods often have the disadvantages of poor detection repeatability, easily accumulated errors, media consumption and so on. Therefore, increasing attention has been paid to monitoring PM2.5 concentrations, and it is urgently necessary to develop a new method to replace traditional PM2.5 measurement methods.

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © 2021 The Authors. IET Image Processing published by John Wiley & Sons Ltd on behalf of The Institution of Engineering and Technology
At present, smart phones have become indispensable tools for everyone. In addition to their simple communication functions, smart phones also have more functions, especially the widely used photography function. With a smart phone, people can record and describe their activities anytime and anywhere as well as share them with others. Therefore, using the photos taken by smart phones to estimate PM2.5 concentrations may become a simple and efficient method of monitoring these concentrations.
So far, research on the prediction of the PM2.5 concentration based on photos is very limited [6][7][8][9]. Gu et al. [...] and spatial domains and then measured their degree of deviation from the naturalness statistics model to give the PM2.5 concentration [6]. In [7], the authors proposed a photograph-based method based on the saturation map appearance for low and high PM2.5 concentrations. Zhang et al. designed a PM2.5 estimator based on the statistics and saliency of PM2.5 pictures [10].

FIGURE 1
Sample realistic PM2.5 images from the AQID database [6]
In addition, few efforts have been made on benchmarking for photo-based prediction of PM2.5 concentrations [6]. Gu et al. built a database for evaluating PM2.5 concentrations, denoted as AQID [6]. This database contains 750 pictures, whose contents include buildings, roads, temples, lakes, cars, parks, squares etc. The PM2.5 concentrations of the images in this database range from 1 to 423 μg/m³, and the resolutions range from 500 × 261 to 978 × 550. Sample images chosen from the AQID database [6] are illustrated in Figure 1. These images were captured outdoors under different PM2.5 concentrations.
In this paper, a photo-based PM2.5 concentration estimator is proposed based on the natural scene statistics (NSS) principle without artificial assistance or extra information. Because the appearance of PM2.5 concentration images is determined by several factors, measuring only structural distortions is not sufficient for PM2.5 images: colour information also plays a very important role. According to previous research, an increase in the PM2.5 concentration makes an image contain less information and less colour. Hence, we analyse three aspects of the images: the colourfulness, structural degradation, and contrast. In total, 18 features are used to train the model to predict the concentration of PM2.5 from images. Experimental results on the AQID database illustrate the effectiveness of the proposed model.
The remainder of this paper is arranged as follows. The details of the estimator algorithm are presented in Section 2. In Section 3, we provide performance measures and comparisons of the proposed algorithm on the AQID database [6] dedicated

PROPOSED PM2.5 CONCENTRATION ESTIMATION METHOD
In this section, the proposed photo-based PM2.5 concentration estimation method will be described in detail. The proposed method utilises three types of features, including the colourfulness, structural degradation, and contrast. The flowchart of the proposed prediction method is outlined in Figure 2.

Hue feature
Hue distortions impact the visual experience [11], and so they play an important role in image quality assessment (IQA). In order to establish a naturalness statistics model based on the hue, we must first investigate the general characteristics of the hue data. Figure 3 shows the joint distribution of the adjacent hue values for high fidelity images and their distorted versions. From Figure 3, we can see that the hue values of two neighbouring pixels are highly correlated, and the joint distributions of the two high fidelity images are altered when the images are subject to distortions. Considering the impact of the image contents, we adopt the relative hue of a PM2.5 image [11] to measure the hue distortion of a PM2.5 image.
In order to calculate the hue, the opponent colour space is adopted to decorrelate the RGB colour channels [12]. The red-green channel RG and the yellow-blue channel YB are defined in the opponent colour space, and the Hue, which is related to the dominant wavelength of the colour signal, is calculated from RG and YB as in [12].

The relative hue ΔHue(i, j) in the horizontal direction is defined as the angular difference between the hue values of horizontally adjacent pixels [12], where the angular difference operator returns values in the range [−π, π].

The ΔHue values of natural pictures follow unimodal circular distributions. Hence, wrapped Cauchy distribution models are used to fit the ΔHue histograms [12]. Let $r_h$ denote a random variable for the relative hue; its probability density function is computed as:

$$f(r_h; \mu_h, \zeta_h) = \frac{1}{2\pi}\,\frac{1-\zeta_h^2}{1+\zeta_h^2-2\zeta_h\cos(r_h-\mu_h)}$$

where $\zeta_h$ is the scale parameter and $\mu_h$ is the location parameter. In addition to these two parameters, we also calculate the circular kurtosis $k_h$ of the input angular samples as a feature. Therefore, we use $\mu_h$, $\zeta_h$ and $k_h$ to handle the chromatic distortions for every orientation. In this paper, the horizontal and vertical directions are the main considerations. Hence, in total, 6 features are used to measure the chromatic distortions based on the hue properties.
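The hue statistics described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the opponent channels are assumed to be RG = R − G and YB = 0.5(R + G) − B, the wrapped Cauchy parameters are estimated by the method of moments (the first trigonometric moment of a wrapped Cauchy variable has the location as its angle and the scale as its length), and the circular kurtosis uses one common definition, the mean of cos 2(θ − mean direction). All function names are hypothetical.

```python
import cmath
import math


def wrap_angle(x):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (x + math.pi) % (2.0 * math.pi) - math.pi


def hue_map(rgb):
    """Per-pixel hue angle from assumed opponent channels.

    Assumption: RG = R - G and YB = 0.5 * (R + G) - B; the exact channel
    definitions used in the paper may differ.
    """
    return [[math.atan2(0.5 * (r + g) - b, r - g) for (r, g, b) in row]
            for row in rgb]


def relative_hue(hue, horizontal=True):
    """Wrapped hue differences between neighbouring pixels, as a flat list."""
    rows, cols = len(hue), len(hue[0])
    if horizontal:
        return [wrap_angle(hue[i][j + 1] - hue[i][j])
                for i in range(rows) for j in range(cols - 1)]
    return [wrap_angle(hue[i + 1][j] - hue[i][j])
            for i in range(rows - 1) for j in range(cols)]


def circular_features(samples):
    """Return (location, scale, circular kurtosis) of angular samples."""
    # First trigonometric moment: its angle estimates the location and its
    # length estimates the wrapped Cauchy scale (method of moments).
    m1 = sum(cmath.exp(1j * a) for a in samples) / len(samples)
    location = cmath.phase(m1)
    scale = abs(m1)
    kurtosis = sum(math.cos(2.0 * wrap_angle(a - location))
                   for a in samples) / len(samples)
    return location, scale, kurtosis


def hue_features(rgb):
    """Six hue features: three per direction (horizontal, then vertical)."""
    hue = hue_map(rgb)
    return (circular_features(relative_hue(hue, True))
            + circular_features(relative_hue(hue, False)))
```

Here `rgb` is a nested list of `(R, G, B)` tuples with at least two rows and two columns. A maximum-likelihood fit of the wrapped Cauchy distribution could replace the moment estimate where higher accuracy is needed.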

Saturation feature
According to the literature [6,7,10], an increase in the PM2.5 concentration would make an image contain less information and less colour. Through many experiments, we found that the saturation map presents a Gaussian distribution, and moreover, the distribution is changed when the PM2.5 concentration increases, as illustrated in Figure 4. Hence, the saturation can reflect the change of the PM2.5 concentration.
We found that the image intensity distribution in the HSV colour space is more effective than that in the RGB colour space for reflecting the quality of PM2.5 concentration images. Hence, we transform the PM2.5 concentration images from the RGB colour space to the HSV colour space. The colour space conversion formula to obtain the saturation map is defined as follows [6]:

$$S(m,n) = \frac{X(m,n) - Y(m,n)}{X(m,n)}$$

where X(m, n) and Y(m, n) are computed by the maximum operator and the minimum operator among the R(m, n), G(m, n) and B(m, n) channels, respectively, and m and n denote the pixel indices in the horizontal and vertical directions.
In this paper, we use the saturation intensity distribution as the first saturation feature to measure distortion. It is defined as:

$$\bar{S} = \mathrm{mean}(S)$$

where mean is the sample mean operator. Entropy is most commonly used to measure the amount of information in an image. Typically, a high-quality image has a larger entropy value, which will be changed by the presence of PM2.5. Here the saturation entropy is used as the final saturation feature:

$$E_S = -\sum_{i=1}^{M}\sum_{j=1}^{N} P(i,j)\log P(i,j)$$

where M and N represent the size of the saturation map and P(i, j) denotes the frequency of the intensity value in S. These two features reflect the image intensity distribution, and a method built upon this information is useful for capturing the changes in the colourfulness and naturalness of images experiencing PM2.5 pollution.
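A minimal sketch of the two saturation features is given below. It assumes that the entropy is computed over a histogram of the saturation map quantised to 256 levels and uses a base-2 logarithm; neither the quantisation nor the log base is stated in the text, and the function names are hypothetical.

```python
import math
from collections import Counter


def saturation_map(rgb):
    """HSV saturation per pixel: (max - min) / max, taken as 0 for black pixels."""
    sat = []
    for row in rgb:
        out = []
        for r, g, b in row:
            x, y = max(r, g, b), min(r, g, b)
            out.append(0.0 if x == 0 else (x - y) / x)
        sat.append(out)
    return sat


def saturation_features(rgb, bins=256):
    """Return (mean saturation, saturation entropy in bits)."""
    values = [v for row in saturation_map(rgb) for v in row]
    n = len(values)
    mean_s = sum(values) / n
    # Assumption: quantise saturation in [0, 1] into `bins` levels before
    # counting level frequencies.
    levels = Counter(min(int(v * bins), bins - 1) for v in values)
    entropy = -sum((c / n) * math.log2(c / n) for c in levels.values())
    return mean_s, entropy
```

For instance, an image made of one pure-red pixel and one grey pixel has saturations 1 and 0, so the mean is 0.5 and the entropy is exactly one bit.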

Colour descriptor feature
Most existing image quality methods are based on structural distortions [13,24,35]. However, measuring only structural distortions is not sufficient for PM2.5 images, since colour information plays a very important role. From Figure 1, it is easy to see that an image captured at a low PM2.5 concentration has a higher quality than an image captured at a high PM2.5 concentration. For low PM2.5 concentration images, the colour is natural and consistent. In this paper, the colour descriptor is used to measure the colour of the PM2.5 concentration images. The dark channel prior measures the naturalness based on the statistics of outdoor high quality images [14]. Most local non-sky regions of a high quality image have very low intensities in at least one of the three colour channels. That is, the intensity of the dark channel is low and tends to zero: $I^{dark} \to 0$. However, the intensity of these dark pixels changes with the concentration of atmospheric particles, such as those due to haze, fog, smoke and so on. Hence, using the concept of a dark channel, if an image is polluted by PM2.5, its dark channel $I^{dark}(S)$ will be changed. It is calculated as [14]:

$$I^{dark}(S) = \min_{x \in \Omega(S)} \Big( \min_{c \in \{R,G,B\}} I^{c}(x) \Big)$$

where the dark channel $I^{dark}(S)$ is the outcome of two minimum operators, which are commutative. $\min_{c \in \{R,G,B\}}$ operates on each pixel across the colour channels, and $\min_{x \in \Omega(S)}$ is a minimum filter. $I^{c}$ is a colour channel of the image I and Ω(S) is a local patch centred at S. Finally, the mean of the dark channel $I^{dark}(S)$ is computed to measure the naturalness of a PM2.5 concentration image.
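The dark channel computation can be sketched as two nested minimum operations followed by a mean. The patch radius below is an illustrative assumption (the text does not give the patch size), and the function names are hypothetical.

```python
def dark_channel(rgb, radius=1):
    """Dark channel of an image given as nested lists of (R, G, B) tuples.

    Step 1: per-pixel minimum over the colour channels.
    Step 2: minimum filter over a (2 * radius + 1)-wide square patch,
            clipped at the image borders.
    """
    rows, cols = len(rgb), len(rgb[0])
    pixel_min = [[min(p) for p in row] for row in rgb]
    dark = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            dark[i][j] = min(
                pixel_min[a][b]
                for a in range(max(0, i - radius), min(rows, i + radius + 1))
                for b in range(max(0, j - radius), min(cols, j + radius + 1))
            )
    return dark


def dark_channel_mean(rgb, radius=1):
    """Mean of the dark channel, used here as the colour descriptor feature."""
    dark = dark_channel(rgb, radius)
    return sum(sum(row) for row in dark) / (len(dark) * len(dark[0]))
```

A hazier photo tends to have a brighter dark channel, so this mean is expected to grow with the particle concentration; the exact relationship is left to the regression stage.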

Contrast feature
In the perception of PM2.5 concentration images, contrast plays a very important role [6,9]. In our proposed method, the contrast energy is used to describe the contrast characteristic [15]. Gaussian second-order derivative filters are utilised to separate an image. All the filter responses are adjusted with rectification and divisive normalisation to model the process of nonlinear contrast gain control in the visual cortex. We compute the contrast energy on three channels as in [16]. The filter response $Y(I_f)$ of a channel $I_f$ is defined as follows [16]:

$$Y(I_f) = \sqrt{(I_f * f_h)^2 + (I_f * f_v)^2}$$

where $f_h$ and $f_v$ denote the horizontal and vertical second-order derivatives of the Gaussian function, respectively, and $*$ denotes convolution. Finally, $C_{GR}$, $C_{RG}$ and $C_{YB}$ are defined as the contrast-aware features in our proposed method.
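As a sketch of how the contrast energy on each channel can be written, the form below follows the contrast-energy formulation commonly used in the literature that [15] and [16] build on. The symbols α, κ and τ_f and the three channel definitions are assumptions, not confirmed by the text.

```latex
\begin{align}
  C_f &= \frac{\alpha \, Y(I_f)}{Y(I_f) + \alpha \kappa} - \tau_f,
        \qquad f \in \{GR,\, YB,\, RG\}, \\
  Y(I_f) &= \sqrt{(I_f * f_h)^2 + (I_f * f_v)^2}
\end{align}
% Assumed channel definitions:
%   GR = 0.299 R + 0.587 G + 0.114 B   (greyscale)
%   YB = 0.5 (R + G) - B               (yellow-blue)
%   RG = R - G                         (red-green)
% Assumed constants:
%   \alpha = \max Y(I_f)               (maximum filter response)
%   \kappa                             (contrast gain)
%   \tau_f                             (per-channel noise threshold)
```

The division by $Y(I_f) + \alpha\kappa$ is the divisive normalisation step mentioned above, and subtracting $\tau_f$ suppresses responses below the noise level.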

Structure feature
Structural distortion has been used extensively in image quality assessment [18,24,36,37]. In this section, the structural features are based on global and local histograms to measure the losses of naturalness in PM2.5 photographs. The local structural naturalness statistics features are based on the structure degradation measurement and the free energy entropy [17]. To facilitate their calculation, the internal generative model M for visual perception is assumed to be parametric [28]. The parameter vector s is used to infer the perceived scenes. Given an input visual signal V, the differential information is inferred by integrating the joint distribution p(V, s) over the space of the model parameters s:

$$-\log p(V) = -\log \int p(V, s)\, ds \qquad (15)$$

However, it is difficult to calculate the joint distribution p(V, s) in the light of our present knowledge. Hence, we add an auxiliary posterior distribution of the model parameters q(s|V) to both the denominator and the numerator, so Equation (15) is rewritten as

$$-\log p(V) = -\log \int q(s|V)\, \frac{p(V, s)}{q(s|V)}\, ds \qquad (16)$$

Then, we apply Jensen's inequality to Equation (16):

$$-\log p(V) \le -\int q(s|V) \log \frac{p(V, s)}{q(s|V)}\, ds \qquad (17)$$

and the free energy is defined as the right side of Equation (17):

$$N(s) = -\int q(s|V) \log \frac{p(V, s)}{q(s|V)}\, ds \qquad (18)$$

Equation (18) expresses the free energy N(s) as energy minus entropy. The free energy of the visual signal V is then derived from N(s) as in [17]. Considering the ease of implementation and its effectiveness, the linear AR model is chosen as the generative model in this paper to measure natural scenes [17,19].
After low-pass filtering, distorted images exhibit different degrees of spatial frequency decrease. The difference information between the original and distorted images is used in the structural degradation model (SDM) [17] to evaluate the similarity. Through extensive experiments, it is found that there is an approximately linear relationship between the free energy feature and the structural degradation information [17] of the original images in the LIVE database [29]. In this paper, we use this relationship and the free energy [17] as the local structural NSS-based features to describe the PM2.5 photos.
The global NSS-based features come from the classical spatial domain model [20,21]. The logarithmic contrast is used to remove the local average displacement so as to estimate the decorrelation effect and normalise the local variance of the logarithmic contrast. As a rule, the normalised luminance coefficients of high quality images follow a Gaussian distribution, and image distortions violate this distribution. The generalised Gaussian distribution (GGD), however, can effectively capture a wider spectrum of the statistics of distorted images. The probability density function of the GGD is calculated as follows [20]:

$$f(x; \alpha, \mu, \sigma^2) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)} \exp\!\left(-\left(\frac{|x-\mu|}{\beta}\right)^{\alpha}\right)$$

where α controls the shape of the distribution, μ is the mean, $\beta = \sigma\sqrt{\Gamma(1/\alpha)/\Gamma(3/\alpha)}$, and the gamma function Γ(⋅) is defined as [33]:

$$\Gamma(a) = \int_0^{\infty} t^{a-1} e^{-t}\, dt, \qquad a > 0$$

where σ is the standard deviation. In the proposed method, the mean subtracted contrast normalised (MSCN) coefficients are fitted by the zero mean GGD, because the distributions of the MSCN coefficients are symmetric and have global characteristics [20]. The zero mean GGD is defined as follows:

$$f(x; \alpha, \sigma^2) = \frac{\alpha}{2\beta\,\Gamma(1/\alpha)} \exp\!\left(-\left(\frac{|x|}{\beta}\right)^{\alpha}\right)$$

For every PM2.5 photo, one pair of parameters (α, σ²) is extracted from a GGD fit of the MSCN coefficients. The parameters (α, σ²) are defined as the global NSS-based features, which are utilised to capture the global distortions of PM2.5 images.
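One way to obtain the pair (α, σ²) is moment matching: for a zero mean GGD, the ratio (E|x|)² / E[x²] equals Γ(2/α)² / (Γ(1/α) Γ(3/α)), so α can be found by searching a grid for the closest match. The sketch below assumes this estimator and a grid from 0.2 to 10 in steps of 0.001; the paper does not state which fitting procedure it uses, and the function names are hypothetical.

```python
import math
import random


def ggd_ratio(alpha):
    """(E|x|)^2 / E[x^2] of a zero mean GGD with shape parameter alpha."""
    return math.gamma(2.0 / alpha) ** 2 / (
        math.gamma(1.0 / alpha) * math.gamma(3.0 / alpha))


def fit_zero_mean_ggd(samples, alpha_grid=None):
    """Moment-matching estimate of (alpha, sigma^2) for zero mean samples."""
    if alpha_grid is None:
        # Assumed search grid: 0.2, 0.201, ..., 10.0
        alpha_grid = [0.2 + 0.001 * k for k in range(9801)]
    n = len(samples)
    sigma_sq = sum(x * x for x in samples) / n
    mean_abs = sum(abs(x) for x in samples) / n
    target = mean_abs ** 2 / sigma_sq
    alpha = min(alpha_grid, key=lambda a: abs(ggd_ratio(a) - target))
    return alpha, sigma_sq


if __name__ == "__main__":
    rng = random.Random(0)
    gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    # A Gaussian is a GGD with alpha = 2, so the estimate should be close to 2.
    print(fit_zero_mean_ggd(gaussian))
```

In practice `samples` would be the flattened MSCN coefficients of the photo; synthetic Gaussian draws are used here only to show that the estimator recovers a known shape.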

FIGURE 5
The schematic diagram of random forest

PM2.5 estimation based on the random forest
After the features are extracted, a proper method is needed to map the feature space to the PM2.5 concentration. In this paper, we use a random forest (RF) model to generate a proper mapping to predict the PM2.5 concentration [22]. This method is based on ensemble learning, and the schematic diagram is shown in Figure 5.
A feature vector f = { f_1 , … , f_18 } of extracted features is given, and S is the PM2.5 concentration of the test PM2.5 photo. The training objective function of the ith node of the tth decision tree, t ∈ {1, … , T}, is defined in terms of a set T_i, which governs the randomness of training node i, and a gain term G_i. G_i is computed from the training data P_i of node i and its left and right partition sets P_i^L and P_i^R, using the conditional covariance matrix derived by probabilistic linear fitting. Then, the predicted score Ŝ is calculated by averaging the outputs of the T regression trees as:

$$\hat{S} = \frac{1}{T} \sum_{t=1}^{T} \hat{S}_t$$

where $\hat{S}_t$ denotes the output of the tth regression tree.
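For orientation, the node objective and gain can be sketched in the form used in the standard regression-forest literature. This is an assumed reconstruction: the symbols θ_i (split parameters) and Λ_s (conditional covariance matrix) are not confirmed by the text, and the paper's exact equations may differ.

```latex
\begin{align}
  \theta_i^{*} &= \arg\max_{\theta_i \in T_i} G_i, \\
  G_i &= \sum_{f \in P_i} \log \lvert \Lambda_s(f) \rvert
        - \sum_{j \in \{L, R\}} \Bigl( \sum_{f \in P_i^{j}}
          \log \lvert \Lambda_s(f) \rvert \Bigr), \\
  \hat{S} &= \frac{1}{T} \sum_{t=1}^{T} \hat{S}_t
\end{align}
% Assumed symbols:
%   \theta_i          split parameters of node i, drawn from the random subset T_i
%   \Lambda_s(f)      conditional covariance of the target given features f,
%                     from a probabilistic linear fit
%   P_i^L, P_i^R      left and right partitions of the node's training data P_i
%   \hat{S}_t         prediction of the t-th regression tree
```

Under this reading, a split is preferred when the children fit the target with lower uncertainty (smaller covariance determinant) than the parent node.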

Evaluation criteria
In this section, the air quality image database (AQID), specifically dedicated to PM2.5 concentrations, is adopted as the benchmark to test the effectiveness of the proposed method. The AQID database consists of 750 photographs, and its PM2.5 concentration values range from 1 to 423 μg/m³. A higher PM2.5 concentration indicates worse air quality, whereas a lower PM2.5 concentration represents better air quality.
According to the suggestion of the video quality experts group (VQEG) [34], a five-parameter nonlinear fitting function is utilised to map objective quality scores to PM2.5 concentrations:

$$f(s) = \beta_1 \left( \frac{1}{2} - \frac{1}{1 + \exp(\beta_2 (s - \beta_3))} \right) + \beta_4 s + \beta_5 \qquad (27)$$

where s denotes the predicted PM2.5 concentration value, f(s) denotes the mapped value that is compared with the corresponding actual monitored PM2.5 concentration value, and β_i (i = 1, 2, 3, 4, 5) are the parameters to be fitted. Then four widely used criteria, RMSE, KRCC, SRCC and PLCC, are adopted for the performance test. The accuracy of IQA models is measured by the RMSE and PLCC. The RMSE is calculated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum (S_d - S_o)^2}$$

where S_d denotes the predicted value of the PM2.5 concentration, S_o denotes the real PM2.5 concentration value, and N is the total number of pairs of estimated and real PM2.5 values. PLCC is computed as follows:

$$\mathrm{PLCC} = \frac{\sum_i (o_i - \bar{o})(q_i - \bar{q})}{\sqrt{\sum_i (o_i - \bar{o})^2 \, \sum_i (q_i - \bar{q})^2}}$$

where o_i and ō are the ith photo's real PM2.5 value and the mean of all o_i, and q_i and q̄ are the ith photo's converted estimated PM2.5 value after nonlinear regression and the mean of all q_i. KRCC and SRCC are utilised to evaluate the monotonicity of the proposed method. The KRCC is calculated as follows:

$$\mathrm{KRCC} = \frac{2 (N_c - N_d)}{N (N - 1)}$$

where N_c and N_d indicate the total numbers of consistent (concordant) and inconsistent (discordant) image pairs in the database. SRCC is another criterion for evaluating the prediction monotonicity. It is defined as follows:

$$\mathrm{SRCC} = 1 - \frac{6 \sum_{n=1}^{N} d_n^2}{N (N^2 - 1)}$$

where N is the total number of pairs of estimated and real PM2.5 values, and d_n represents the rank difference between the estimated PM2.5 value and the corresponding real PM2.5 value of the nth pair. A good PM2.5 concentration estimator is expected to obtain high SRCC, PLCC and KRCC values, as well as a low RMSE value.
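The four criteria and the five-parameter mapping can be sketched in plain Python as follows. The rank-based SRCC formula assumes there are no tied values, the KRCC counts tied pairs as neither concordant nor discordant, and fitting the β parameters (normally done with a nonlinear least-squares routine) is left out. All function names are hypothetical.

```python
import math


def logistic5(s, b1, b2, b3, b4, b5):
    """Five-parameter logistic mapping of a predicted score s."""
    return b1 * (0.5 - 1.0 / (1.0 + math.exp(b2 * (s - b3)))) + b4 * s + b5


def rmse(pred, real):
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, real)) / len(pred))


def plcc(pred, real):
    mp, mr = sum(pred) / len(pred), sum(real) / len(real)
    num = sum((p - mp) * (r - mr) for p, r in zip(pred, real))
    den = math.sqrt(sum((p - mp) ** 2 for p in pred)
                    * sum((r - mr) ** 2 for r in real))
    return num / den


def _ranks(values):
    """Ranks 1..N by ascending value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def srcc(pred, real):
    n = len(pred)
    d_sq = sum((a - b) ** 2 for a, b in zip(_ranks(pred), _ranks(real)))
    return 1.0 - 6.0 * d_sq / (n * (n ** 2 - 1))


def krcc(pred, real):
    n = len(pred)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (pred[i] - pred[j]) * (real[i] - real[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return 2.0 * (concordant - discordant) / (n * (n - 1))
```

As a small check, for predictions `[1, 2, 3, 4]` against real values `[10, 30, 20, 40]`, only one of the six pairs is out of order, which gives an SRCC of 0.8 and a KRCC of 2/3.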

Performance comparison
In our experiments, the proposed method is compared with PPPC [6], Yue [7] and IPPS [10], which are dedicated to PM2.5 concentration prediction. It is also compared with state-of-the-art general-purpose IQA metrics, including NIQE [21], NFERM [17] and BQIC [19], and with popular IQA metrics devoted to contrast, including NIQMC [27], CDIQA [16] and BIQME [15]. Finally, four sharpness metrics are also included for comparison: RISE [13], ARISM [26], BIBLE [23] and FISH [25]. Since our proposed method adopts the RF regression model for PM2.5 concentration estimation, AQID is randomly divided into two parts: 80% of the images are used for model training, and the remaining 20% are used for model testing. To be fair, the random split is performed 1000 times and the median values are reported. All source codes for the compared IQA metrics come from their authors or web sites. Table 1 lists the experimental results on AQID, and the top two performance values are marked in bold. Among the state-of-the-art general-purpose IQA methods, the BQIC metric achieves the best performance on the AQID database. In addition, the FISH metric acquires the highest values among the sharpness metrics, and the BIQME method achieves the best performance among the compared contrast methods. The PLCC, KRCC, SRCC and RMSE values of our proposed method are 0.8082, 0.6115, 0.8177 and 51.5973, respectively. It can be clearly observed from Table 1 that the proposed method outperforms the state-of-the-art general-purpose IQA metrics and the prevailing sharpness and contrast metrics compared in this paper. Even compared with the specialized estimation methods for PM2.5 concentrations, the performance of the proposed model is among the top two, and it achieves the best prediction monotonicity. The KRCC value of our results shows that the proposed model
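The repeated 80/20 protocol described in this section can be sketched as a small harness that reports the median of a score over many random splits. The `evaluate` callback is a hypothetical placeholder: it is assumed to train a regressor on the training indices and return one score (for example the SRCC) on the test indices.

```python
import random
import statistics


def median_over_random_splits(n_items, evaluate, n_splits=1000,
                              train_fraction=0.8, seed=0):
    """Median of evaluate(train_idx, test_idx) over repeated random splits.

    `evaluate` is a user-supplied callable (hypothetical here) that fits a
    model on the training indices and returns a single score on the test
    indices.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    n_train = int(round(train_fraction * n_items))
    scores = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        train_idx, test_idx = indices[:n_train], indices[n_train:]
        scores.append(evaluate(train_idx, test_idx))
    return statistics.median(scores)
```

With the 750 AQID images this yields 600 training and 150 test images per split. Reporting the median rather than the mean keeps a few unlucky splits from dominating the result.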

Analysis of different components
Considering that the proposed model consists of three types of features, it is necessary to know the contribution of each type. The first feature measures the colour naturalness, including the hue, saturation and colour descriptor, using a total of nine factors. The second feature describes the contrast characteristics and consists of three factors. The last feature is composed of six factors, which measure the local and global structure degradation based on NSS models. In order to show how well the features are correlated with the PM2.5 concentrations, the values of PLCC, SRCC, KRCC and RMSE are listed in Table 2. Important findings can be drawn from the performance comparisons in Table 2. The three types of features used in the proposed method perform well. For example, the KRCC of the first feature is 0.5912, that of the second feature is 0.4085, and that of the third feature is 0.3041. In comparison, the KRCC of the combination of the three features reaches 0.6115. Moreover, the three features in the proposed model consider different aspects. The first and the second features measure the non-structural information based on naturalness statistics models. It can be seen from Table 2 that these two features contribute more to the overall model performance; furthermore, colour information plays a more important role in evaluating PM2.5 photos. The third feature uses the local and global histograms to quantify the possible losses of naturalness in PM2.5 photographs. For low PM2.5 concentration images, the colour is natural and consistent, and the colour changes as the PM2.5 concentration increases. In addition, structural distortion has been extensively used in image quality assessment. Contrast distortions impact the visual experience, and so they play an important role in evaluating PM2.5 photos.
Thus, the combination of all three types of features results in better performance, and this verifies the effectiveness of the proposed model.
In addition, we test the impact of the regression model on the performance of the proposed method, and experiments are conducted using different training methods. In this section, support vector regression (SVR) [30], random subspace (RS) [31,32] and RF [22] are used as regression models to train the model. Table 3 lists the experimental results. Table 3 shows that the results do not differ much across regression models. However, RF has better accuracy and stability than SVR and RS. Hence, RF is adopted in the proposed method to train the regression model.

CONCLUSION
At present, environmental pollution has received extensive social attention. In this work, we concentrate on photo-based PM2.5 concentration estimation. Through observations and experiments, it is found that the quality of PM2.5 concentration photos is affected by a variety of factors. On this basis, we extract the features of PM2.5 photos, including a total of 18 features in the three aspects of colourfulness, contrast and structure. Then, the random forest method is used to train a regression model to estimate the PM2.5 concentration. A comparison of our proposed method with photo-based PM2.5 concentration predictors, popular distortion specific methods and state-of-the-art general-purpose IQA methods is conducted on AQID. The experimental results have illustrated that the proposed metric is better on AQID than the popular general purpose IQA, contrast and sharpness metrics. In addition, the superior performance of our proposed PM2.5 concentration estimator is verified on AQID. Via observations and analyses, the performance of our proposed metric can be further enhanced by introducing saliency features [10], which is our future work.