Mathematical models for the improvement of detection techniques of industrial noise sources from acoustic images

In this paper, a procedure for the detection of the sources of industrial noise and the evaluation of their distances is introduced. The above method is based on the analysis of acoustic and optical data recorded by an acoustic camera. In order to improve the resolution of the data, interpolation and quasi interpolation algorithms for digital data processing have been used, such as the bilinear, bicubic, and sampling Kantorovich (SK). The experimental tests show that the SK algorithm allows to perform the above task more accurately than the other considered methods.


INTRODUCTION
Industrial plants play a fundamental role in the society from the economic and social point of view; on the other hand, their environmental impact can produce detrimental effects on the quality of people's lives. Indeed, the environment noise is one among the most common problems arising from industrial plants, and it usually arises from many sound sources whose emissions vary in space and time, having different frequency spectra and characteristics.
The set of these sources contributes to the global noise level and, being received by the surrounding buildings, workplaces, and social services (such as schools or hospitals), must comply with the national limits indicated by laws.
In order to plan a reduction of the described environmental noise, it can be crucial to detect all the individual noise sources.
In last years, a new useful tool has become available in order to individuate the main noise sources from industrial plants: the acoustic camera (AC). This is essentially a planar array of microphones which allows to obtain graphical maps (acoustic images) of sound pressure levels (or other acoustic descriptors) of a selected area, providing an approximated localization of different sound sources which does not result accurate for our aim.
The above spatial localization is given by a suitable post-processing software, based on the technique known with the name of beamforming (BF). However, the above localization algorithm is affected by some intrinsic limits, due both to technical (AC) and algorithmic (post-processing) reasons.
In Asdrubali et al, 1 a procedure for the detection of the industrial sources of noise from acoustic images has been introduced. A crucial step in the above detection process was given by the enhancement of the quality of acoustic images achieved using standard reconstruction algorithms, such as the bilinear and bicubic interpolation methods (which are typically provided with the BF software of the AC).
In this way, starting from any given acoustic image, we become able to automatically compute some parameters, such as the distance between various noise sources, and also their distance from the AC, which allow to completely and accurately detect the main noise emitters.
From some recent results on digital image processing, it is known that quasi-interpolation algorithms are more performing in image enhancement (in terms of peak signal-to-noise ratio [PSNR]) than interpolation methods (see Blu et al 2 ). For this reason, if we replace the bilinear and bicubic methods in the procedure developed in Asdrubali et al 1 by a suitable quasi-interpolation algorithm, we expect to enhance the accuracy of the detection of the noise source, so improving the localization method given in Asdrubali et al. 1 In the present paper, we consider the so-called sampling Kantorovich (SK) quasi-interpolation algorithm for image reconstruction and enhancement, in place of the bilinear and bicubic methods. This choice is motivated by two main reasons. Firstly, in Costarelli et al, 3 it has been proved that the SK algorithm is more performing than the above considered interpolation methods, and it is also better than some well-known quasi-interpolation method, such as the quasi-finite impulse response (quasi-FIR) and quasi-infinite impulse response (quasi-IIR) filters. 2,4 Secondly, the SK algorithm has been recently employed in some concrete applications. [5][6][7] The improvement that can be achieved by the main novelty here introduced, namely, the application of the SK algorithm in place of bilinear and bicubic methods, has been showed by numerical experiments. In Section 2, the experimental setup is described; in Section 3, the mathematical background behind the implementation of the SK algorithm is outlined, and finally, in Section 4, numerical results are presented.

EXPERIMENTAL SETUP
Here, we give a detailed description of the experimental setup for the detection of industrial noise sources from acoustic images. In particular, we propose an improvement of the post-processing procedure first introduced in Asdrubali et al, 1 based on the application of the well-know SK algorithm for image reconstruction and enhancement (see, e.g., Asdrubali et al 5 ). The main techniques available for the realization of ACs can be classified in BF and acoustic holography. BF uses the hypothesis of far field in the acoustic signal processing, assuming the signal propagating on a plane front from an infinitely diffused source. Compared to acoustic holography, BF is more suitable for investigation of big sized industrial implants. For this reason, considered the aim of this work, BF is preferable compared to acoustic holography. A detailed description of the mathematical model for BF is provided in previous studies. [8][9][10] In order to recall the procedure, 1 we consider the following experimental setup, built using two noise sources. The two sources, two speakers S 1 and S 2 , respectively, on the left and on the right of the camera optical axis O A , have been fed with to sinusoidal signals of frequency of 1 = 1733 Hz and 2 = 1000 Hz. S 1 and S 2 are considered two point sources with sound emitting element in the center of each speaker. Having two signals whose frequencies are not integer multiples each other avoids any summation effects that could result from spurious harmonics superposition.
In Figure 1, the disposition of the two speakers is shown. The horizontal distance of S 1 and S 2 from O A is 3000 mm, the height of S 1 , h S 1 = 1500 mm, the height of S 2 , h S 2 = 700 mm. The Euclidean distance between the sources is d(S 1 , S 2 ) ≃ 6053 mm.

FIGURE 1 Experimental setup: planar view
The AC used for the measurement is composed of a matrix of microphones rigidly connected with a single optical camera in the center of the system, so that all the sensors (microphones and optical camera) lie on the same plane Q. 11 We denote by P the (vertical) plane passing for S 1 and S 2 and parallel to Q (see Figure 1).
The distance between the AC and the plane P is Z = 12 250 mm. Then, two measurements M 1 and M 2 , of both sound pressure and corresponding visual scene, have been performed by the AC. The data obtained from the above measurements have been stored in two matrices for each, the ones with apex O (standing for Optical) containing the optical data expressed by gray level 8-bit coded images, the other ones with apex A (standing for Acoustic) containing acoustic data expressed in dB. The measurement M 1 has been acquired by the AC at a height of h M 1 = 1350 mm and O A equidistant from the projection of S 1 and S 2 on the ground baseline (as in Figure 1), while the measurement M 2 has been acquired horizontally shifting the AC on the right of 230 mm. Figure 2 shows the shifting procedure, and Figure 3 shows the frontal view of the system to be measured.
The ground of the experimental setup is made of asphalt, with an acoustic reflectivity coefficient of 0.95. The size of the matrices M A 1 , M A 2 is 16 × 21 according with the data provided by the AC, and the data have been stored in Technical Data Management Solution by National Instruments™(TDMS) files and expressed in dB with a scaling factor of 2 · 10 5 Pa. A normalization by this factor has been operated as follows: Starting from the matrices M A 1 , M A 2 , we can select from them, using certain software filters, the spectral components contained in a band of 100 Hz tuned to f 1 and f 2 , obtaining from each one two new matrices that we denote by M A The images M O 1 and M O 2 consist of gray scale images of 1024 × 780 pixels; an acoustic "pixel" of the acoustic matrices M A , = 1, 2, corresponds to a square of 48 × 48 optical pixels (see Figure 4). The AC guarantees a rigid superposition of the acoustic data with the optical ones.
In Asdrubali et al, 1 the whole post-processing procedure for the detection of the noise sources (the speakers S 1 and S 2 ) is based on the above greyscale images in which the correspondence "1 acoustic pixel = 48 × 48 optical pixels" holds.
In the above paper, in order to improve the resolution of the acoustic data, such that to each pixel of the optical acquisition can be assigned a single value of sound pressure, two standard bivariate interpolation methods were used: bilinear and bicubic interpolation.
In this paper, the results obtained by the above standard methods have been improved using the quasi-interpolation method based on the theory of the SK operators.

INTERPOLATION AND QUASI-INTERPOLATION METHODS
For what concerns the bilinear and bicubic method, we refer to Keys and Lehmann et al. 12,13 [Colour figure can be viewed at wileyonlinelibrary.com] [Colour figure can be viewed at wileyonlinelibrary.com] FIGURE 4 Acoustic resolution over the optical one. The colored square represents the acoustic "pixel" size Now, we recall some basic facts concerning the so-called SK (quasi-interpolation) algorithm. It is well-know (see, e.g., Asdrubali et al 5 ) that the above algorithm is based on a numerical and optimized implementation of the following sampling series: where ∶ R n → R is a locally integrable image (multivariate signal) such that the above series is convergent for every x ∈ R n , and are the multivariate intervals in which we consider the averages of the given signal f.. 14 The series S w are known with the name of SK operators and can be considered as the L 1 -version of the generalized sampling operators of previous works. [15][16][17][18][19] The optimized-numerical implementation of the SK algorithm has been pointed out in Costarelli et al and Asdrubali et al 7,20 (where also a pseudo-code is available). Moreover, in a recent paper (see Costarelli et al 3 ), a numerical comparison concerning the efficiency and the accuracy in the image reconstruction and enhancement has been carried out in term of the so-called PSNR between the SK algorithm and some other interpolation and quasi-interpolation methods (among them the bilinear and bicubic interpolation). In particular, in Costarelli et al, 3 a deep comparison between quasi-FIR, quasi-IIR, and other interpolation and quasi interpolation methods widely used in literature has been performed, also taking into account different kernels for the application of the SK algorithm. Moreover, the numerical results by means of the well-known PSNR index, as well as the need of a suitable execution time, have together determined the choice for the values of the parameters and the kernels used in the present paper. In particular, both the computation time and the reconstruction quality increase when N decreases, due to the bigger size of the kernel support (see, e.g., Asdrubali et al 5 for technical details), while higher values of w improve the quality of the reconstruction, but in the meantime, they increase the computation time. These considerations have determined the choices of the values N and w and of the kernel functions.
In Definition (1), the function ∶ R n → R is called a kernel, and it satisfies suitable assumptions (see, e.g., Costarelli and Vinti 21 ), typically used in literature. 22 Some well-known and important classes of kernels which satisfy the above mentioned assumptions and that can be used in order to implement (1) are listed below (for more details, see Costarelli and Vinti 21 again).
First of all, we recall the definition of the one-dimensional central B-spline of order N (see e.g., Cantarini et al and Costarelli and Vinti 23,24 ): where (·) + denotes the positive part. The corresponding multivariate version of central B-spline of order N is given by [Colour figure can be viewed at wileyonlinelibrary.com] Another important class of kernels is given by the so-called Jackson type kernels of order N, defined in the one-dimensional case case by x ∈ R, (4) with N ∈ N, ≥ 1, and c N is a nonzero normalization coefficient, given by We recall that the sinc-function is defined as sin( x)∕ x, if x ≠ 0, and 1 if x = 0; see, for example, previous studies. [24][25][26] As in case of the central B-splines, multivariate Jackson type kernels of order N are defined by the following product-type expression: For several examples of kernels (e.g., of radial type), see, for example, other works. 3,21,24,[27][28][29][30][31] We recall that, one of the main advantages of the SK algorithm is that, it allows to reconstruct a given image I with any fixed resolution (sampling rate). For instance, we can reconstruct I with its original resolution, or we can also consider higher resolution than the original one. In the latter case, it is possible to obtain an enhancement of the original image I which has been proved to be useful for the applications. 5,20

NUMERICAL RESULTS
In order to quantify the Euclidean distance between S 1 and S 2 , namely, d(S 1 , S 2 ), the estimation of the distance Z (which in general is unknown) is needed and can be achieved from the available data by a triangulation procedure, as the one described below. [32][33][34] The knowledge of Z is necessary to quantify the corresponding dimension, in world units (expressed in mm), of a single pixel at that distance. Accordingly to 3-D scene reconstruction from multiple measurements techniques, we want to find a pointwise correspondence between the values of M A 1 S and M A 2 S , = 1, 2. In our case, it is sufficient only to find a correspondence between the maximum points of the above matrices in which S 1 and S 2 are obviously localized. This is due to the fact that the main noise sources, which are the ones to be identified, are located at the points of maximum intensity of the noise itself. Thus, The Euclidean distance d ∶= d( 1 ,  2 ), = 1, 2, in 3-D scene reconstruction from multiple measurements, is called disparity; see, for example, Barnard and Thompson. 35 The estimation of Z can be obtained as follows: where f is the focal length of the camera, a characteristic constructive parameter, d is the disparity, and b is the displacement of the AC position in M 1 and M 2 . In our case b = 230 mm. For the calculation of f, we have used the method introduced by Zhang 36 and available for camera calibration in MATLAB™. Using this technique, together with the available data, we obtain that = 2173.6 pixel. We stress that, since the value of the disparity changes on the bases of the used interpolation and quasi interpolation methods for the processing of the above acoustic matrices, we can obtain different estimation of Z.
Moreover, once Z is determined, we can, using the knowledge of the dimension of some reference objects inside the image, calculate the size of a single pixel, using the linear equation: Note: The columns have the same meaning that in Table 1 except that now, the introduced relative errors (7)-(10) are shown. where D is a depth for which we know the size p size (D) of a single pixel in mm; p size (Z) is the size of a pixel in mm at distance Z. Then, the numerical results achieved for each one of the used interpolation and quasi-interpolation methods are given in Table 1. Here, we show the results of the estimations of Z (expressed in mm); d(S 1 , S 2 ), that is, the distance between S 1 and S 2 ; E d , that is, the number of wrong pixels between the reference of d(S 1 , S 2 ) (expressed in pixel) and its estimations; p size (Z), that is, the size (expressed in mm) of one single pixel at distance Z; E p size (Z) (expressed in mm), that is, the error between the reference value and its estimations for each single pixel; E TOT ∶= E d · E p size (Z) .
In particular, for what concerns the SK quasi-interpolation algorithm, we considered reconstructions performed with w = 20 and based upon the two-dimensional Jackson type kernel with parameter N = 12 according to the results obtained in Costarelli et al, 3 obtaining an enhanced image with size increased by a scale factor equal to 48.
The simplest way to convert the error E d , originally expressed in pixel, is to multiply it by the estimated pixel size p size (Z). This procedure is not accurate because it does not take into account the error on the estimation of p size (Z). In order to avoid this inaccuracy, we introduce E TOT as the most correct index expressing and evaluating the goodness of the results. In this case, the SK algorithm performs better than the other proposed methods according to Costarelli et al. 3 The same considerations can be done in the case of the following relative errors: where E r d is compared to the reference distance d(S 1 , S 2 ), E r p size (Z) to the reference size p size (D) (expressed in mm), while E r Z is compared to Z re = 12250 (expressed in mm). Results are shown in Table 2.
In Figures 5 and 6, the planar distributions of sound pressures are shown before and after the application of the interpolation and quasi-interpolation methods.
In Figures 7 and 8, the three-dimensional distributions of sound pressures are shown before and after the application of the interpolation and SK quasi-interpolation methods.
For what concerns S 2 , the results are different. Due to the reduced height of the speaker h S 2 ≃ h S 1 ∕2 in the experimental setup, to the nondirectivity of S 2 and to the reflectivity coefficient of the asphalt, the estimation of Z results wrong. This error is due to the reflections occurring on the ground at a distance that is half of that one of S 1 ; see Table 3. In the case of S 1 , the contributions of the reflections are negligible, being the sound waves enough attenuated, and they do not influence the estimation of Z. Moreover, in accordance with the dependence from the height of the speakers, the estimation of Z for S 2 is approximately one half of that one resulted for S 1 .  Note: Z (expressed in mm) is the achieved depth of the plane of the source S 2 .

CONCLUSIONS
Using some well-known techniques in the field of image processing and applying them to the above contest, we are able to individuate the location of the sources of noise in a 3-D orthogonal system. Due to poor resolution of the microphonic array, interpolation or quasi-interpolation methods are needed. To overpass the problem of the low resolution of acoustic data, standard bilinear and bicubic interpolation methods together with the SK algorithm have been used. Numerical results show that by means of the data enhanced with the SK algorithm, a more accurate detection of noise sources can be obtained with respect to that performed by means of the standard bilinear and bicubic interpolation methods (see Asdrubali et al 1 ). This conclusion can be derived from the values of E TOT shown in the previous tables of Section 4.