An image denoising algorithm based on adaptive clustering and singular value decomposition

Self-similarity, a prior of natural images, has attracted much attention. This property means that low-rank group matrices can be constructed from similar image patches. For low-rank approximation denoising methods based on singular value decomposition (SVD), the keys are the ability to construct group matrices accurately in the presence of noise and the handling of singular values. Here, combining image priors, a two-stage clustering method is designed to construct group matrices adaptively. The method is robust to noise: when noise levels are high, its group matrices are more accurate than those constructed by other algorithms. Then, according to the significance of singular values and singular vectors, the singular vectors of the low-rank estimations are corrected so that the residual noise in the low-rank estimations is further suppressed. For back projection, the authors use the original noise level and the residual image to adaptively determine the projection parameter and the new noise level. The authors' back projection therefore provides a good foundation for the two-stage denoising method, removing noise more effectively while preserving image details. Experimental results show that, compared with existing state-of-the-art denoising algorithms, the proposed algorithm achieves competitive denoising performance in terms of quantitative metrics and detail preservation. In particular, the competitiveness of the authors' algorithm increases as the noise level increases.


INTRODUCTION
Image denoising is an important pre-processing step in computer vision and image processing, and it has been widely studied in the past decade [1-14]. Many applications place high demands on image denoising, such as visual enhancement, medical diagnosis [15] and target recognition [16,17]. In general, image denoising algorithms can be divided into two categories. The first type of algorithm exploits the inherent prior attributes of images in the spatial domain, such as [8,18,19]. The removal of salt and pepper noise is also a typical application in the spatial domain. Thanh et al. [20] propose an adaptive total variation (TV) regularization model. Erkan and Gökrem [21] propose a filter based on pixel density (BPDF), and Erkan et al. [22] propose a different applied median filter (DAMF) to remove salt and pepper noise. Enginoglu et al. [23] propose an adaptive Riesz mean filter (ARmF) that operationalizes pixel similarity for salt and pepper noise removal. Erkan et al. [24] propose an iterative mean filter (IMF) for salt and pepper noise. In [25], a new method is proposed to find the parameters of a non-linear diffusion denoising method by ridge analysis. Jin et al. [26] establish convergence theorems for the non-local means (NLM) filter in removing additive Gaussian noise. Cheng et al. [27] propose to model the gradient distribution of natural images as a spatially variant hyper-Laplacian; the model is free from tedious tuning of trade-off parameters. Following the NLM approach, Jin et al. [28] propose an adaptive estimator based on the weighted average of observations taken in a neighbourhood, with weights depending on the similarity of local patches. The second type of algorithm works in a transform domain, including the wavelet transform, dictionary learning (DL), principal component analysis (PCA) and so on. Bayesian least squares with Gaussian scale-mixture (BLS-GSM) [29] is a typical wavelet algorithm.
However, fixed global wavelet bases cannot accurately represent image structure and often introduce pseudo textures into the denoised images. To overcome the disadvantages, denoising algorithms based on DL use a set of adaptive redundant representation bases called a dictionary to make image representations more flexible and sparse. The most typical dictionary-based denoising algorithm is K-SVD [2], which is a generalization of the k-means algorithm. PCA is another dimension reduction technique that is widely used in denoising algorithms. Muresan and Parks [30] use PCA and linear minimum mean square error (LMMSE) to denoise overlapping patches. Using Marchenko-Pastur law (MP-law) based on random matrix theory, Zhao et al. [31] proposed the adaptive patch clustering with progressive PCA thresholding (ACPT) algorithm to improve the performance of clusterwise PCA threshold denoising algorithm. Zhang et al. [32] further improved the performance by adding a patch selection mechanism. Advanced denoising algorithms often combine several denoising techniques. There are some advanced denoising algorithms based on non-local image patches [1-3, 5, 29]. Prior to this, there have been various neighbourhood filtering techniques based on variational and partial differential equations, but their performances are not as good as [1-3, 5, 29]. Some algorithms use the prior information, such as the redundancy and sparse representation of images in the transform domain, to improve the algorithms [1-3, 5, 29]. It turns out that these priors can effectively improve the performance of denoising algorithms. Variational regularization methods are widely used for removing noise without destroying edges that are important visual cues. Pang et al. [33] propose an adaptive weighted TV p regularization-based denoising model. Liu et al. 
[34] present a general Retinex model to effectively and robustly restore images degraded by both illumination and noise, and propose a novel variational model that incorporates appropriate regularization for the reflectance component and the illumination component. Thanh et al. [35] propose an adaptive image restoration method based on a combination of first- and second-order TV regularizations with an inverse-gradient-based adaptive parameter. Because the adaptive parameter is estimated from the inverse gradient, the method avoids the staircasing artefacts associated with TV regularization and its variants. Prasath and Thanh [36] provide an adaptive version of the TV regularization model that incorporates structure tensor eigenvalues for better edge preservation without creating the blocky artefacts associated with gradient-based approaches. Block-matching and 3D filtering (BM3D) [3] is widely regarded as one of the most effective denoising algorithms. Since the geometric features of natural images can be sparsely decomposed in suitable transform domains, BM3D uses a pre-set dictionary to collaboratively filter three-dimensional image groups composed of non-local similar image patches, which enhances the sparse representation. Although BM3D achieves good quantitative metrics and its execution time is extremely short, the pre-set dictionary limits its sparse representation, and pseudo textures visible to the naked eye are often introduced in its denoised results. LSSC [5] avoids the drawback of using a pre-set dictionary. An improved BM3D filter [37] exploits adaptive-shape patches and applies PCA to the shape-adaptive patch groups in the spectral contracted 3D transform domain, which improves BM3D's detail retention capability. However, these algorithms do not protect the irregular textures in the image and introduce pseudo textures.
Shi et al. [38] enhance structural details in image restoration by regarding global boundary context and residual context as complementary information. Besides redundancy priors and sparsity priors, low-rank priors are also effectively utilized in a variety of computer vision applications, such as image segmentation, facial recognition and especially image denoising [7,8,18,19,39]. One way to exploit the low-rank prior is to construct matrices by stacking non-local similar patches from a given noisy image; these patches lie in a low-dimensional subspace of a high-dimensional space, so the matrices satisfy the low-rank criterion. Gu et al. [8] use weighted nuclear norm minimization (WNNM) to exploit the low-rank prior. They take the meaning and importance of singular values into account and treat them differently according to their respective values. Guo et al. [19] propose the LRA-SVD (low-rank approximation, LRA) algorithm, which finds the L image patches most similar to the target image patch in a neighbourhood window centred on the target patch to construct similar patch matrices, and uses singular value truncation to obtain the low-rank estimations of these matrices. LRA-SVD achieves effective denoising performance, but it also lacks protection for image textures.
To address the above problems, we propose an image denoising algorithm based on adaptive clustering and SVD. The main contributions are as follows: (1) We propose an adaptive clustering method with better anti-noise ability. It classifies image patches and constructs low-rank similar patch matrices adaptively and accurately. The size of the image patches is the only parameter that affects the number of clusters. Following the idea of divide and conquer, we first divide the image patches into four categories according to the colour, spatial, shape and texture features of the image, and then subdivide the patches in each category to obtain the low-rank matrices to be denoised. (2) We argue that the assumption behind singular value truncation, namely that image information is concentrated on the large singular values and noise is concentrated on the remaining small singular values, is not rigorous. After performing the SVD low-rank approximation on similar patch matrices, we correct their singular vectors to further suppress the residual noise in the low-rank approximation matrices. (3) Based on the explicit and implicit information contained in the residual image and the noise level, we define how to calculate the projection parameter and the noise level to be estimated in back projection, so that both adapt to the image information. Experimental results show that the proposed algorithm is superior to some existing denoising techniques. The rest of the paper is structured as follows.
In the second part, we briefly review and introduce some basic knowledge involved in our algorithm. In the third part, the proposed algorithm is introduced in detail. In the fourth part, the effectiveness of the proposed algorithm is verified by comparing the proposed denoising algorithm with other advanced denoising algorithms. Finally, we summarize the proposed algorithm and indicate the direction of future work.

DEPENDENT WORK
Here, we introduce some previous theories and algorithms that our method depends on. In particular, we describe LRA-SVD [19] in more detail, because our method is mainly an improvement on LRA-SVD. Our method removes additive white Gaussian noise (AWGN), which is usually introduced during image acquisition. Restoring images corrupted by AWGN has been an active research topic in image denoising, and this kind of denoising problem is usually described as

$$Y = X + E, \tag{1}$$

where $Y$ is the noisy observation image, $X$ is the original noise-free image and $E$ is white Gaussian noise with a mean of 0 and a standard deviation of $\sigma$. To restore the original image from its noisy version as accurately as possible while preserving edges, details and other important information, it is important to choose a suitable noise model and suitable prior information about natural images.
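The noise model in (1) can be illustrated with a minimal NumPy sketch. The function name `add_awgn`, the seed and the constant test image are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def add_awgn(clean, sigma, seed=0):
    """Return Y = X + E, where E is zero-mean Gaussian noise with std sigma."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=sigma, size=clean.shape)
    return clean.astype(np.float64) + noise

# Hypothetical 256 x 256 test image with a constant grey level of 128.
X = np.full((256, 256), 128.0)
Y = add_awgn(X, sigma=30.0)
E = Y - X  # the realised noise; its sample std should be close to 30
```

The noisy image is kept in floating point and is not clipped to [0, 255], so that the residual `Y - X` remains exactly Gaussian.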

Singular value decomposition
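For the matrix $Y$, the singular value decomposition theorem [40] shows that the singular value decomposition of $Y$ can be expressed as

$$Y = U_Y \Sigma_Y V_Y^{T}, \tag{2}$$

where $U_Y$ and $V_Y$ are called the left singular matrix and the right singular matrix of $Y$, respectively. The diagonal elements of $\Sigma_Y$ are non-negative and in descending order, $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0$. If $Y$ is low rank and $k$ is its rank, then $k < n$ and the trailing block of singular values vanishes, $\Sigma_{Y2} = \mathrm{diag}(\sigma_{k+1}, \dots, \sigma_n) = \mathrm{diag}(0, 0, \dots, 0)$. When $Y$ is a matrix composed of similar image patches, it is generally considered to be low rank.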
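As a small numerical illustration of the low-rank statement, the sketch below builds a hypothetical group matrix whose columns are all combinations of three basis patches and checks that only three singular values are non-zero. The sizes (64 pixels per patch, 40 patches) are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical group matrix: 40 vectorized patches of 64 pixels each,
# every column being a linear combination of the same 3 basis patches.
basis = rng.normal(size=(64, 3))
coeffs = rng.normal(size=(3, 40))
Y = basis @ coeffs

# Thin SVD: Y = U diag(s) Vt, with s sorted in descending order.
U, s, Vt = np.linalg.svd(Y, full_matrices=False)

# Numerical rank: singular values that are not negligible relative to the largest.
rank = int(np.sum(s > 1e-10 * s[0]))
```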

Image patch grouping
Non-local self-similarity characterizes the recurring non-local textures or structures of images. Let $Y$ be a noisy image containing AWGN with a mean of 0 and a standard deviation of $\sigma$. Image patches of size $\sqrt{d} \times \sqrt{d}$ are extracted from $Y$ with a certain step size, and each image patch is vectorized to obtain $p_i$. In [19], the similarity between two patches is measured by the simple Euclidean distance in the spatial domain, defined by

$$d(p_i, p_c) = \| p_i - p_c \|_2, \tag{3}$$

where $\|\cdot\|_2$ represents the Euclidean distance, $p_i$ is the candidate patch and $p_c$ is the reference patch. The $L$ patches most similar to $p_c$ are then used to construct the low-rank similar patch matrix $P_c$. Each selected patch $p_{c,j}$ is a column of $P_c$, so the similar patch matrix can be expressed as

$$P_c = [\,p_{c,1}, p_{c,2}, \dots, p_{c,L}\,]. \tag{4}$$

If the number $L$ of similar patches is too small, there are not enough patches in each similar patch matrix and the SVD denoising algorithm is not robust. If $L$ is too large, dissimilar patches are grouped together, so the correlation between the columns of the similar patch matrix is low and the low-rank estimation becomes inaccurate. We therefore improve the grouping process with our two-stage clustering, in which $L$ is decided adaptively from image information rather than set by hand.
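The fixed-L grouping described above can be sketched as follows. This is a simplified illustration and not the LRA-SVD implementation: the helper names `extract_patches` and `group_similar` are hypothetical, and the search runs over all patches of the image instead of a neighbourhood window.

```python
import numpy as np

def extract_patches(image, patch, step):
    """Vectorize every patch x patch block taken with the given step (one patch per column)."""
    rows, cols = image.shape
    columns = []
    for r in range(0, rows - patch + 1, step):
        for c in range(0, cols - patch + 1, step):
            columns.append(image[r:r + patch, c:c + patch].reshape(-1))
    return np.stack(columns, axis=1)

def group_similar(patches, ref_index, L):
    """Build P_c from the L columns with the smallest Euclidean distance to the reference patch."""
    ref = patches[:, [ref_index]]
    dist = np.linalg.norm(patches - ref, axis=0)
    nearest = np.argsort(dist, kind="stable")[:L]
    return patches[:, nearest], nearest

rng = np.random.default_rng(0)
image = rng.normal(size=(32, 32))              # stand-in for a noisy image
patches = extract_patches(image, patch=8, step=4)
P_c, idx = group_similar(patches, ref_index=0, L=10)
```

The reference patch has distance zero to itself, so it is always the first column of `P_c`.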

Eckart-Young-Mirsky theorem
As shown in (4), since $P_c$ is composed of noisy image patches, it can be expressed as

$$P_c = Q_c + N_c, \tag{5}$$

where $Q_c$ represents a noise-free similar patch matrix and $N_c$ represents the noise contained in $P_c$. $P_c$ is the object to be denoised. In the rest of the paper, for simplicity of description, $P_c$, $Q_c$ and $N_c$ are replaced by $P$, $Q$ and $N$, respectively. The task of denoising the similar patch matrix $P$ is to estimate the noise-free similar patch matrix $Q'$ as accurately as possible from $P$. Ideally, the estimated $Q'$ should satisfy

$$\| P - Q' \|_F^2 = t\sigma^2, \tag{6}$$

where $\|\cdot\|_F$ is the Frobenius norm, $\sigma$ is the standard deviation of the noise and $t$ is the number of pixels in the matrix $P$.
The similarity between image patches in the noise-free image $X$ gives a high correlation between these patches, that is, $Q$ is a low-rank matrix, which has been confirmed in many algorithms, such as [19,41]. In the least squares sense, the estimation of $Q$ can be obtained by low-rank approximation, that is, $Q'$ can be estimated from $P$ by solving the optimization problem

$$Q' = \arg\min_{B} \| P - B \|_F^2 \quad \text{s.t.} \quad \mathrm{rank}(B) = k, \tag{7}$$

where $\mathrm{rank}(\cdot)$ represents the rank of a matrix. After singular value decomposition, $P$ can be expressed as

$$P = U \Sigma V^{T}. \tag{8}$$

Let

$$P_k = U \Sigma' V^{T}, \tag{9}$$

where $\Sigma'$ is obtained from the matrix $\Sigma$ by setting all diagonal elements to zero except the first $k$ singular values, i.e. $\Sigma' = \mathrm{diag}(\sigma_1, \dots, \sigma_k, 0, \dots, 0)$.
The Eckart-Young-Mirsky theorem [42,43] states that (9) is the solution of (7). It also states that for any real matrix $P$, if the rank of the matrix $Q$ is $k$, then

$$\| P - Q \|_F^2 \ge \sum_{i=k+1}^{n} \sigma_i^2, \tag{10}$$

where $\sigma_i$ $(i = 1, \dots, n)$ are the singular values of $P$, and equality is attained when $Q = P_k$ as defined by (9). From the theorem, we get

$$\| P - P_k \|_F^2 = \sum_{i=k+1}^{n} \sigma_i^2. \tag{11}$$

In particular, the $Q'$ obtained when $\sum_{i=k+1}^{n} \sigma_i^2 = t\sigma^2$ is the ideal estimation [19]. Using these results, Guo et al. derive an effective threshold $k$ for singular value truncation: $k$ is chosen so that the discarded energy $\sum_{i=k+1}^{n} \sigma_i^2$ is as close as possible to $t\sigma^2$ [19]. Our method continues to use this conclusion. However, singular value truncation assumes that the signal energy is completely concentrated on the first few large singular values, while the noise energy is concentrated on the remaining singular values. Because noise energy is also distributed over the large singular values and signal energy can be distributed over the small singular values, the assumption is not rigorous, and the performance of these denoising algorithms degrades when the noise level is high. In our method, we therefore add a singular vector correction step after singular value truncation, which compensates for the weakness of the assumption to some extent.
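The truncation rule can be sketched in a few lines of NumPy. This is an illustration under assumptions rather than the rule of [19] verbatim: `truncate_svd` is a hypothetical helper that picks the k whose discarded energy is closest to $t\sigma^2$, and the synthetic rank-2 test matrix is invented for the example.

```python
import numpy as np

def truncate_svd(P, sigma):
    """Low-rank estimate P_k of P; k makes the discarded energy closest to t * sigma^2."""
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    t = P.size                      # number of pixels in the similar patch matrix
    energy = s ** 2
    # tail[k] = energy discarded when only the first k singular values are kept.
    tail = np.concatenate([np.cumsum(energy[::-1])[::-1], [0.0]])
    k = int(np.argmin(np.abs(tail - t * sigma ** 2)))
    k = max(k, 1)                   # always keep at least one component
    P_k = (U[:, :k] * s[:k]) @ Vt[:k, :]
    return P_k, k

rng = np.random.default_rng(0)
Q = 10.0 * rng.normal(size=(64, 2)) @ rng.normal(size=(2, 40))  # noise-free, rank 2
P = Q + rng.normal(scale=1.0, size=Q.shape)                     # AWGN with sigma = 1
P_k, k = truncate_svd(P, sigma=1.0)
```

On this synthetic example the truncated matrix is closer to `Q` than the noisy `P` is, which is the behaviour the Eckart-Young-Mirsky argument leads one to expect when the signal is strongly low rank.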

Back projection
Usually, the back projection mechanism [44,45] is used to iterate the basic processes and enhance the denoising performance of an algorithm. Its basic idea is to generate a new noisy image by adding filtered noise back to the denoised image, i.e.

$$\hat{Y} = \hat{X} + \delta\,(Y - \hat{X}), \tag{14}$$

where $\delta \in (0, 1)$ is a constant projection parameter and $\hat{X}$ is the denoised image produced by the previous stage. In LRA-SVD [19], $\delta = 0.5$ is assigned by hand and fixed, whereas in our method the parameter is determined adaptively from image information. The noise standard deviation $\hat{\sigma}$ of the new noisy image must then be updated for the next denoising stage.
In LRA-SVD, $\hat{\sigma}$ is computed with a formula that involves a scaling factor [19]. In our method, the calculation of $\hat{\sigma}$ is improved to make more use of image information, which lets our method perform better in detail protection.
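The fixed-parameter back projection step is simple enough to state directly in code. The sketch below uses the value 0.5 that the text attributes to LRA-SVD; the function name `back_project` and the two-pixel arrays are assumptions for the illustration, and the adaptive parameter of the proposed method is not modelled here.

```python
import numpy as np

def back_project(noisy, denoised, delta=0.5):
    """Next-stage input: the denoised image plus a fraction delta of the residual."""
    return denoised + delta * (noisy - denoised)

Y = np.array([[10.0, 20.0]])       # original noisy pixels (toy values)
X_hat = np.array([[12.0, 16.0]])   # denoised pixels from the previous stage
Y_next = back_project(Y, X_hat)    # delta = 0.5 gives the midpoint of Y and X_hat
```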

Quantitative metrics and test images
Peak signal-to-noise ratio (PSNR) is often inconsistent with human visual perception, but it is the most widely used quality measure in the literature. PSNR is calculated as

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n} \left( x(i,j) - \hat{x}(i,j) \right)^2}, \tag{16}$$

where $m$ is the number of rows and $n$ is the number of columns in the image, $x(i,j)$ represents the value of the pixel at position $(i,j)$ in the noise-free image and $\hat{x}(i,j)$ represents the value of the pixel at position $(i,j)$ in the denoised image. However, image details (low-level features) are very important when the visual system acquires information. The feature similarity (FSIM) index [46] is an image quality metric based on low-level features, and it measures the similarity between two images by phase congruency (PC) and gradient magnitude (GM). FSIM shows a high degree of consistency with subjective visual assessment. Here, we briefly introduce the calculation of FSIM. Let $f_1$ and $f_2$ denote the two images to be compared, and let $PC_1(x)$ and $PC_2(x)$ be the PC of $f_1$ and $f_2$ at position $x$, respectively. The similarity measure for $PC_1(x)$ and $PC_2(x)$ is defined as

$$S_{PC}(x) = \frac{2\,PC_1(x)\,PC_2(x) + T_1}{PC_1^2(x) + PC_2^2(x) + T_1}, \tag{17}$$

where $T_1$ is a positive constant that increases the stability of $S_{PC}$. Let $G_1(x)$ and $G_2(x)$ be the GM of $f_1$ and $f_2$ at position $x$, respectively. Their similarity is defined as

$$S_{G}(x) = \frac{2\,G_1(x)\,G_2(x) + T_2}{G_1^2(x) + G_2^2(x) + T_2}, \tag{18}$$

where $T_2$ is a positive constant depending on the dynamic range of GM values. Then $S_{PC}(x)$ and $S_G(x)$ are combined to get the similarity $S_L(x)$ of $f_1(x)$ and $f_2(x)$:

$$S_L(x) = [S_{PC}(x)]^{\alpha}\,[S_G(x)]^{\beta}, \tag{19}$$

where $\alpha$ and $\beta$ are parameters used to adjust the relative importance of the PC and GM features. The FSIM index between $f_1$ and $f_2$ is defined as

$$\mathrm{FSIM} = \frac{\sum_{x \in \Omega} S_L(x)\,PC_m(x)}{\sum_{x \in \Omega} PC_m(x)}, \tag{20}$$

where $\Omega$ is the whole image spatial domain and $PC_m(x) = \max(PC_1(x), PC_2(x))$ weights the importance of $S_L(x)$ in the overall similarity between $f_1$ and $f_2$. More details can be found in [46].
In this paper, we use both the PSNR index and the FSIM index to reflect the performance of the denoising algorithms. The parameters used to calculate FSIM are the same as in [46]. In our experiments, natural greyscale images of size 256 × 256 are used for simulation. These images have been commonly used to verify the effectiveness of many state-of-the-art denoising algorithms. The noisy images are generated by adding zero-mean white Gaussian noise with different standard deviations to the test images. The noise standard deviation ranges from 30 to 150. The greyscale images used in the experiments are shown in Figures 1 and 2. To rule out results that hold only by chance, we also test each algorithm on the BSD100 database. Since BSD100 is a colour image set and our method depends on colour features, we add the same level of noise to each of the RGB channels and denoise the channels separately.
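For reference, the PSNR computation described in this section can be sketched as below, assuming 8-bit images with a peak value of 255. The function name `psnr` and the toy arrays are illustrative; FSIM is not sketched because it requires the phase congruency maps of [46].

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """PSNR in dB between a noise-free reference image and a denoised estimate."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

clean = np.zeros((4, 4))
off_by_one = clean + 1.0            # every pixel differs by one grey level, MSE = 1
value = psnr(clean, off_by_one)     # 10 * log10(255^2), about 48.13 dB
```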

INTRODUCTION OF THE METHOD PROPOSED IN THIS PAPER
Non-local self-similarity is a very important feature of natural images. It characterizes recurring non-local textures or structures, which can be used to effectively maintain the edges and details of images. Based on the non-local self-similarity of images, we propose an image denoising algorithm that effectively removes AWGN and achieves better denoising performance. The basic process of the algorithm is as follows. First, the image is divided into patches. Next, clustering and the similarity between image patches are used to adaptively construct similar patch matrices. Then the singular values and singular vectors obtained by the singular value decomposition of the similar patch matrices are processed to remove the noise. Finally, the denoised similar patch matrices are used to reconstruct the denoised image. Combined with back projection, the algorithm iterates this basic process to achieve two-stage denoising and further improve its performance. The complete algorithm

Adaptive clustering
The essence of utilizing non-local self-similarity is to find similar image patches and construct low-rank similar patch matrices. Cluster analysis is a multivariate statistical method for classifying research objects according to certain characteristics, and it has been successfully applied in economics, medicine, meteorology and other fields. It does not consider causal relationships between features and variables. After clustering, individual differences between categories are large, while individual differences within the same category are relatively small. The difference between clustering and classification is that the number of classes produced by clustering is not known in advance. This freedom makes it feasible to divide image patches adaptively according to image content. An existing classical method of constructing similar patch matrices is to find the L image patches most similar to the target image patch in a neighbourhood window centred on the target patch, as described in LRA-SVD [19]. The size of L is crucial for the ranks of the similar patch matrices and for the denoised results. The similar patches are searched in a fixed-size search window centred on the target patch. When the search windows of different target patches overlap, the same image patch can be similar to multiple target patches and appear in multiple similar patch matrices. Other algorithms, such as the traditional k-means clustering algorithm, cannot adaptively determine the number of clusters. A fixed number of clusters places multiple image features in the same cluster, which makes it difficult to identify some features after noise removal. Building on the divide-and-conquer k-means algorithm [47], ACPT [31] uses an over-clustering-and-iterative-merging strategy to obtain the final clustering results and improves the clustering method.
Usually, when a cluster corresponds to an image feature in a small range, the cluster is considered to be a better and more accurate partition. However, the iterative-merging in [31] overly merges some subtle different clusters representing different details into one class, so that the low-rank feature of the similar patch matrices is not guaranteed and the image details are lost in the denoising phase. At the same time, the initial state of the cluster is also crucial for the clustering results.
Inspired by the above methods, we propose an adaptive two-stage clustering algorithm that classifies image patches over the whole image, so that the low-rank property of the similar patch matrices is guaranteed and improved. The adaptive two-stage clustering algorithm makes the features of the image patches contained in each cluster as consistent as possible. In the first stage, since image features can be divided into colour features, texture features, shape features and spatial relationship features, we set the number of clusters to 4 so that each cluster corresponds to one type of image feature after clustering. In the second stage, each feature cluster from the first stage is further divided into $\max\{\lfloor L_j / M \rfloor, 1\}$ clusters, where $L_j$ is the number of image patches in the $j$th cluster of the first stage and $M = d^2$, where $d$ is the size of image patches. In the extreme case where the second stage keeps a first-stage cluster as a single cluster, the image patches in that cluster are highly similar and the cluster does not need to be divided. The clusters obtained after the second stage are the similar patch matrices we want. The adaptability is reflected in two points. First, the number of clusters is determined adaptively from image information. Second, the number of image patches in each similar patch matrix is determined adaptively from image information. The relationship between the features of the two stages can be illustrated as follows: if a first-stage cluster captures colour features, the second stage distinguishes, for example, whether that colour is light grey or dark grey. In this paper, we use the Euclidean distance between an image patch and the cluster centres to identify the cluster to which the image patch belongs. As shown in (3), with $p_i$ now representing a cluster centre, the image patch belongs to the cluster whose centre is at the smallest distance from it.
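The two-stage procedure can be sketched with a plain k-means in NumPy. This is a minimal sketch under assumptions: the paper does not state which clustering routine or initialization it uses, so the Lloyd-style `kmeans` helper, the random initialization and the function name `two_stage_clustering` are all hypothetical. Only the cluster counts (4 in the first stage, $\max\{\lfloor L_j / M \rfloor, 1\}$ in the second) and the nearest-centre assignment follow the text.

```python
import numpy as np

def kmeans(data, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd k-means on the columns of `data`; returns one label per column."""
    rng = np.random.default_rng(seed)
    n = data.shape[1]
    n_clusters = min(n_clusters, n)
    centres = data[:, rng.choice(n, size=n_clusters, replace=False)].copy()
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every patch to every cluster centre.
        dist = ((data[:, :, None] - centres[:, None, :]) ** 2).sum(axis=0)
        labels = dist.argmin(axis=1)
        for c in range(n_clusters):
            members = data[:, labels == c]
            if members.shape[1] > 0:
                centres[:, c] = members.mean(axis=1)
    return labels

def two_stage_clustering(patches, M, first_stage=4):
    """Return a list of index arrays, one per similar patch matrix."""
    groups = []
    stage1 = kmeans(patches, first_stage)
    for j in range(first_stage):
        idx_j = np.flatnonzero(stage1 == j)
        if idx_j.size == 0:
            continue
        n_sub = max(idx_j.size // M, 1)   # max{floor(L_j / M), 1}
        stage2 = kmeans(patches[:, idx_j], n_sub)
        for s in range(n_sub):
            members = idx_j[stage2 == s]
            if members.size > 0:
                groups.append(members)
    return groups

rng = np.random.default_rng(1)
patches = rng.normal(size=(16, 400))      # 400 hypothetical vectorized 4 x 4 patches
groups = two_stage_clustering(patches, M=16)
```

Each patch index ends up in exactly one group, which matches the statement that every image patch belongs to a unique cluster.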
Compared with the construction method of similar patch matrices in LRA-SVD [19], the adaptive clustering method used in this paper determines the number L of image patches in each similar patch matrix adaptively according to the image content, so our method avoids the limitation of setting L by hand. The clustering process searches all image patches for similar ones and groups them into one cluster. Each image patch appears in exactly one cluster, which forms a similar patch matrix. Figure 4 shows the distribution of similar patches on the original image obtained by LRA-SVD and by our clustering method with the same patch size. Image patches with the same colour are similar and belong to the same similar patch matrix. Figures 4(a) and (c) are the results of LRA-SVD, which finds a fixed number L of similar patches, while Figures 4(b) and (d) are the results of our clustering method, where L is adaptively determined. From Figure 4, we can see that regardless of the noise standard deviation, the adjacency between patches in our clustering algorithm is better. Compared with the construction method of similar patch matrices in ACPT [31], the number of similar patch matrices in our clustering algorithm is determined adaptively according to the image information, and the similar patch matrices constructed by our algorithm better satisfy the low-rank requirement. Figure 5 shows the distribution of similar patches found by our adaptive two-stage clustering algorithm and by the over-clustering-and-iterative-merging algorithm in [31]. It can be seen that the clustering results of the two methods are similar for the house in the foreground, but for the sky in the background, the clustering result of [31] is smoother than that of our algorithm.
The clustering results in [31] may be better for other image processing methods, such as image segmentation, but for image denoising, it fully clusters the sky part and iterative-merging loses the image details. Therefore, for denoising algorithms, the clustering results obtained by our algorithm are more inclusive of image details in the case of improving the low-rank feature of the similar patch matrices. From Figure 5, we can also find that the clustering algorithm proposed in this paper has stronger anti-noise performance.
The advantage of the low rank in our algorithm can be further observed from Figure 6. As shown in Figure 6, the right side of each figure is an enlarged view of the box on the original curve. From Figure 6(c), we can get that compared with LRA-SVD finding a fixed number of similar patches and the clustering algorithm in ACPT, the similar patch matrices formed by our clustering method not only maintain the low-rank characteristic, but also have more concentrated signal energy. Although the energy of ACPT is more concentrated in Figures 6(a, b) and (d), its low-rank feature is obviously inferior to low-rank feature of the other two methods. And the energy of the similar patch matrices constructed by our algorithm is more concentrated than that of LRA-SVD. In summary, the similar patch matrices constructed by our clustering method are more advantageous and have strong anti-noise performance, which provides a very good foundation for the subsequent clusterwise denoising process.

Low rank approximation based on SVD (singular value truncation)
As described in Sections 2.2 and 2.3, $P$ denotes the noisy cluster and $Q'$ denotes the noise-free cluster estimated from $P$. The ideal $Q'$ should satisfy (6). The Eckart-Young-Mirsky theorem states that (9) is the solution of (7). As described in [19], comparing (6) and (11), we find that when $\sum_{i=k+1}^{n} \sigma_i^2 = t\sigma^2$, $P_k$ is an ideal estimate of $Q$, that is, $Q' = P_k$. Therefore, the truncation threshold $k$ can be approximated by (21), which selects $k$ so that $\sum_{i=k+1}^{n} \sigma_i^2$ is as close as possible to $t\sigma^2$. Equation (21) is also applicable to the low-rank estimation of the similar patch matrices obtained by our clustering algorithm. It is worth noting that the derivation in [19] gives the form (22), which does not contain $t$, although the parameter $t$ appears in its implementation. We have therefore made a slight modification here, only because we think that such an expression is more rigorous.

Correction of singular vectors
After the singular value truncation of the previous step, a low-rank approximation of each cluster is obtained, that is, $Q' = U' \Sigma' V'^{T}$, where $U' = (u_1, u_2, \dots, u_k)$ and $V' = (v_1, v_2, \dots, v_k)$ contain the first $k$ left and right singular vectors. At this point, the image obtained by reconstructing $Q'$ has had part of the noise removed compared with the original noisy image. However, the SVD low-rank approximation is based on the assumption that the noise energy is distributed only over the smaller singular values, that is, the assumption ignores the noise distributed over the larger singular values.
The fact is that image information dominates in the larger singular values and noise dominates in the smaller singular values, but neither is confined to them. Therefore, the accuracy of the estimation based on the SVD low-rank approximation decreases as the noise increases. From the viewpoint of PCA, the axis with the largest variance is the first singular vector, and the axis with the second largest variance is the second singular vector. Let $U_c$ and $V_c$ denote the left singular matrix and the right singular matrix obtained by singular value decomposition of $Q$. The two denoised images obtained by reconstructing $Q_c$ and reconstructing $Q'$ are denoted as cs-v-cd and s-v-d, respectively. Figure 7 shows the mean of the quantitative metrics of the eight greyscale images in the two cases, and Figure 8 compares the visual quality. From Figures 7 and 8, we find that in terms of quantitative metrics, image sharpness and detail retention, the denoised image obtained by reconstructing $Q_c$ is much better than the one obtained by reconstructing $Q'$, and it contains no pseudo textures. Therefore, following the denoising methods based on threshold shrinkage in the PCA domain, we design a method to correct the singular vectors $u_i$ $(i = 1, 2, \dots, k)$, so that the distribution of the corrected singular vectors $\hat{u}_i$ $(i = 1, 2, \dots, k)$ is as close as possible to the distribution of $u_{ci}$ $(i = 1, 2, \dots, k)$, as shown in Figure 9. The correction of the singular vectors $u_i$ can be regarded as a further elimination of the noise in $Q'$. The correction compensates for the defect of the SVD low-rank approximation to some extent and improves the denoising performance when the noise is large.
Obviously, $U'\Sigma' = (\sigma_1 u_1, \sigma_2 u_2, \dots, \sigma_k u_k)$ and $V'\Sigma' = (\sigma_1 v_1, \sigma_2 v_2, \dots, \sigma_k v_k)$. Let $\hat{u}_i$ and $\hat{v}_i$ represent the estimations of $u_i$ and $v_i$, respectively. It is easy to see that $\sigma_i u_i$ and $\sigma_i v_i$ are equivalent to weighting $u_i$ and $v_i$, that is, weighting the spatial direction vectors. The correction of the singular vectors is given by (24), where $\mathrm{sign}(\cdot)$ is the sign function and $\sigma$ is the original noise standard deviation. Based on the original noise level, Equation (24) sets a different correction threshold for each weighted coordinate. Let $\hat{U}' = (\hat{u}_1, \hat{u}_2, \dots, \hat{u}_k)$ and $\hat{V}' = (\hat{v}_1, \hat{v}_2, \dots, \hat{v}_k)$; then the estimation of $Q'$ obtained by singular vector correction is $Q'' = \hat{U}' \Sigma' \hat{V}'^{T}$.

Image reconstruction
During clustering, each image patch belongs to exactly one cluster. However, because the image patches overlap, the same pixel appears in multiple patches and therefore has multiple estimates. We obtain the final pixel estimate by a weighted average of these estimates, which also suppresses noise further. Here, we use the rank k of the cluster to determine the weight of the estimate $Q''$. The weight of $Q''_j$ is defined in terms of $k_j$ and $g_j$, where $g_j$ is $\max\{d^2, L_j\}$, d is the size of the image patches and $L_j$ is the number of image patches in $P_j$. The smaller k is, the higher the correlation between the image patches in the cluster and the more accurate the low-rank estimate. Therefore, such a cluster should be assigned a higher weight. In the extreme case $k_j = \min\{d^2, L_j\}$, the cluster is full rank and there is almost no correlation between its image patches; the weight is then minimal.

Note to all tables: bold font indicates that the method achieves the best performance under the same conditions, and underlined font indicates the second-best performance under the same conditions. Method 1 is the complete method proposed in this paper, combining our basic process with (28), (29) and (30). Method 2 combines our basic process with (14) and (15).
Since an image patch exists in only one cluster, the weight of the image patch is the weight of the cluster in which it is located. The denoised estimate $\hat{x}_i$ of the ith pixel of the image is the weighted average of its patch-wise estimates, where $\Gamma(x_i)$ denotes the index set of all image patches containing the pixel $x_i$, $\hat{x}_{i,j}$ denotes the denoised estimate of the ith pixel in the jth image patch and W is the normalizing factor of the weighted average.
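The weighted averaging of overlapping patch estimates can be sketched as follows. The function name, the argument layout (top-left patch positions, one scalar weight per patch) and the handling of uncovered pixels are assumptions of this sketch; the weights themselves would come from the rank-based cluster weight described above.

```python
import numpy as np


def aggregate_patches(patches, positions, weights, image_shape):
    """Weighted average of overlapping denoised patches.

    patches   : list of 2-D arrays (denoised patch estimates)
    positions : list of (row, col) top-left corners, one per patch
    weights   : list of scalars, the weight of the cluster each patch belongs to
    """
    acc = np.zeros(image_shape, dtype=float)   # sum of weight * estimate
    norm = np.zeros(image_shape, dtype=float)  # sum of weights (normalizing factor)

    for patch, (r, c), w in zip(patches, positions, weights):
        h, wd = patch.shape
        acc[r:r + h, c:c + wd] += w * patch
        norm[r:r + h, c:c + wd] += w

    # Pixels not covered by any patch are left at zero.
    out = np.zeros(image_shape, dtype=float)
    covered = norm > 0
    out[covered] = acc[covered] / norm[covered]
    return out
```

For instance, a pixel covered by one patch estimating 1 with weight 1 and another estimating 4 with weight 2 receives (1·1 + 2·4) / 3 = 3.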

Back projection
Although most of the noise is removed by the above procedures, the denoising results are still inaccurate because noise interferes with the clustering accuracy. Like most algorithms, we use a back projection mechanism [44,45] to iterate the basic processes of our algorithm and further improve the denoising performance. Back projection has been widely used as a post-processing step: the noisy image of the next stage, $\hat{Y}$, is formed by adding a fraction of the residual $Y - \hat{X}$ back to the current estimate $\hat{X}$, as in (28), where Y is the original noisy image, $\hat{X}$ is the denoised result produced in the current stage and the fraction is the projection parameter. In this paper, the projection parameter is jointly determined by the residual image and the noise standard deviation, rather than being specified manually. It is calculated as in (29), where N is the number of pixels in Y, $G_g$ denotes the $G_g$-th iteration of the algorithm, and the remaining two quantities are the initial noise standard deviation and an adjustment factor. Intuitively, the squared Frobenius norm of the residual image, $\|Y - \hat{X}\|_F^2$, roughly represents the noise level in the residual image, and (29) is designed to determine the projection parameter from its ratio to the original noise level. The algorithm in this paper corrects the assumption that noise is distributed only on the smaller singular values: we consider that noise energy is also present on the larger singular values and that the smaller singular values also contain image information. Therefore, back projection reduces the loss of the image details contained in the residual image, because it re-injects the residual image in a certain ratio, and so enhances the algorithm's ability to preserve image details. In addition, the new noisy image obtained by back projection contains much less noise than the original noisy image, so the interference of noise with the clustering process is greatly reduced in the next denoising pass.
As a result, the denoising ability of the whole algorithm is enhanced and the denoised results are more accurate. In short, back projection not only further suppresses the noise and enhances the detail preservation of the algorithm, but also reduces the number of iterations and avoids the large computational cost of many iterations. From (21) and (24), we can see that the value of the noise standard deviation is especially important for noise filtering, but the noise level in the new noisy image is no longer the original noise standard deviation; it is smaller. Therefore, another important task is to estimate the noise standard deviation of the new noisy image. Following the noise estimation method proposed in [19] and the design idea of (29), we propose (30) to obtain an adaptive estimate of this standard deviation.
In (30), the additional parameter is a scaling factor. Since the residual image contains most of the noise and is re-injected in proportion to the projection parameter, the first term of (30) roughly represents the noise level in the new noisy image, and the second term calibrates the estimate.
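The back projection step itself can be sketched in a few lines. This sketch takes the projection parameter `delta` as an input rather than computing it, because it does not reproduce (29) or (30); the names `back_project`, `residual_rms` and `delta` are assumptions of the sketch. The helper computes only the root-mean-square residual, $\sqrt{\|Y - \hat{X}\|_F^2 / N}$, which is the quantity the text describes as a rough indicator of the noise level in the residual image.

```python
import numpy as np


def back_project(noisy, denoised, delta):
    """Re-inject a fraction delta of the residual into the current estimate.

    noisy    : original noisy image Y
    denoised : current denoised estimate X_hat
    delta    : projection parameter (supplied by the caller in this sketch)
    """
    return denoised + delta * (noisy - denoised)


def residual_rms(noisy, denoised):
    """Root-mean-square of the residual image, sqrt(||Y - X_hat||_F^2 / N)."""
    residual = np.asarray(noisy, dtype=float) - np.asarray(denoised, dtype=float)
    return np.sqrt(np.sum(residual ** 2) / residual.size)
```

Setting `delta = 0` keeps the current estimate unchanged and `delta = 1` returns the original noisy image, so intermediate values trade residual noise against recovered detail.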
To verify the effectiveness of our back projection, the basic process shown in the dashed box in Figure 3 is combined, in turn, with our back projection equations and with the back projection equations of LRA-SVD, which are given in (14) and (15). Method 1 combines our basic process with (28), (29) and (30); that is, Method 1 is the complete method proposed in this paper. Method 2 combines our basic process with (14) and (15). We keep the number of iterations of the two methods the same. Tables 1 and 2 show the PSNR and FSIM comparisons of the two methods on Figures 1 and 2, respectively, and Table 3 shows the mean PSNR and FSIM on BSD100. The bold result in these tables is the better of the two methods under the same conditions. From Tables 1 and 2, Method 1 is significantly better than Method 2. In Table 3, the mean PSNR of Method 1 is much higher than that of Method 2, while the mean FSIM of Method 2 is higher than that of Method 1. We therefore also compare the two methods visually. Figure 10 is a visual comparison of Method 1 and Method 2; the noise level in all panels of Figure 10 is 70. The experimental results on other images show the same phenomenon. Therefore, we have reason to think that Method 1 is better than Method 2, which means that the back projection designed in this paper is successful and that applying it to the basic processes of our algorithm greatly improves the denoising performance. In summary, the complete procedure of our algorithm is described in Algorithm 1.

EXPERIMENTAL RESULTS
To demonstrate the efficacy of the proposed denoising algorithm, we present our experimental results in this section.

Description of the parameters
The tunable parameters involved in this paper are the size d of the image patches, the two adjustable factors in the back projection and the number of iterations G. Since the image patches contain both image information and noise, a value of d that is too small makes it hard to distinguish the noise. If d is too large, the image patches easily contain too much geometric information, so that the differences between patches are large; this reduces the accuracy of clustering, makes the similar patch matrices lose their low-rank property and thus degrades the denoising performance. Therefore, d is an important parameter in our denoising algorithm.

Experimental comparison
To quantitatively evaluate the denoising performance of our algorithm, we compare it with five state-of-the-art and related image denoising algorithms: WNNM [8], BM3D [3], LRA-SVD [19], ACPT [31] and MR [41]. All of these algorithms exploit the self-similarity of natural images to suppress noise. In the experiments, the source codes of these algorithms are those provided by their authors, and the parameters are the default values in the source codes, without any modification. All six algorithms, including ours, are implemented in MATLAB. Table 4 shows the quantitative performance of the six competing algorithms on Figure 1, and Table 5 shows the results on Figure 2. Bold font indicates that the algorithm achieves the best performance under the same conditions, and underlined font indicates the second-best performance under the same conditions. Since clustering in both our algorithm and ACPT is based on the k-means algorithm, the experimental results depend on the selection of the initial seed positions. The initial seed points in our implementation are generated by the random function in MATLAB. To ensure stable results, the results of our algorithm and ACPT in Tables 4 and 5 are averages over multiple executions. These two tables show that the proposed denoising algorithm is robust to noise; that is, the algorithm becomes more competitive as the noise increases. In particular, the FSIM of our algorithm is significantly better than that of the other algorithms. For example, for Figures 1(c), 2(a), 2(b) and 2(d), the results of our algorithm are the best for both PSNR and FSIM when the noise standard deviation is 100, 120 and 150.
Specifically, when the noise level is 120, for Figure 2(b), the PSNR of our algorithm is higher than that of BM3D, WNNM, LRA-SVD, ACPT and MR by 0.69, 0.36, 0.77, 1.24 and 0.54 dB, respectively, and the FSIM of our algorithm is higher than that of these algorithms by 0.0545, 0.0431, 0.0565, 0.0517 and 0.0609, respectively. When the noise level is 150, for Figure 1(b), the PSNR of our algorithm is higher than that of BM3D, WNNM, LRA-SVD, ACPT and MR by 0.95, 0.27, 0.47, 1.29 and 0.25 dB, respectively, and the FSIM of our algorithm is higher than that of these algorithms by 0.0168, 0.0264, 0.0149, 0.0339 and 0.0172, respectively. The advantage of our algorithm is more obvious when the noise is large, but this does not mean that it is uncompetitive when the noise is small. For Figure 1(a), when the noise level is 30, the PSNR of our algorithm differs from that of BM3D, WNNM, LRA-SVD, ACPT and MR by 0.27, 0.06, −0.05, 0.29 and −0.08 dB, respectively (negative values mean that our PSNR is lower), and the FSIM of our algorithm is higher than that of these algorithms by 0.0454, 0.0303, 0.0142, 0.02 and 0.0041, respectively. Figure 11 shows the mean of the quantitative metrics of each algorithm on the eight greyscale images. Figure 11 makes it clear that the quantitative performance achieved in this paper is very competitive with state-of-the-art algorithms, and that the competitiveness of our algorithm gradually increases with the noise. Table 6 shows the mean of the quantitative metrics on BSD100. As before, bold in the table denotes the best performance under the same conditions and underline the second best. From Table 6, we can see that the PSNR of MR is very competitive with that of our method, but at medium and high noise levels its FSIM is much lower than ours. It is also worth noting that the FSIM of ACPT is excellent.
However, its PSNR is quite unsatisfactory. ACPT may therefore be appropriate in some applications, whereas the quantitative metrics of our method are both excellent and balanced, which makes our method more widely applicable.
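For readers who want to reproduce the PSNR figures quoted above, a minimal sketch of the standard PSNR definition is given below. The peak value of 255 assumes 8-bit greyscale images, which is an assumption of this sketch rather than something stated in the text; FSIM is not sketched here because it requires a considerably longer implementation.

```python
import numpy as np


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a clean reference image and
    a denoised estimate. peak=255 assumes 8-bit greyscale images."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A gain of 0.69 dB, as reported for Figure 2(b) at noise level 120, corresponds to a reduction of the mean squared error by a factor of about $10^{0.069} \approx 1.17$.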
In terms of visual quality, our algorithm is also comparable and even superior to the state-of-the-art denoising algorithms. Figure 12 displays a visual comparison of the results obtained by denoising Figure 1(a) with a noise standard deviation of 50, where Figures 12(a-f) are the denoised results of BM3D, WNNM, LRA-SVD, ACPT, MR and our algorithm, respectively. For the nose region in the red frame, the results of WNNM, BM3D, LRA-SVD and MR are too smooth, the results of MR and WNNM introduce obvious pseudo textures and the result of ACPT still contains some noise points, whereas our algorithm achieves the result closest to the original image, especially in the shape of the nostrils. Figure 13 is a visual comparison of the results obtained by denoising an image in Figure 2. From Figure 13, we can see that our algorithm and BM3D achieve relatively good visual quality, and that BM3D introduces slightly fewer pseudo textures. Figure 14 shows a visual comparison of the results obtained by denoising Figure 1(c). To further demonstrate our performance, we apply our algorithm to real noisy images. Because no clean images are available as a reference, the comparison of denoised results becomes more subjective. Figure 15 displays the denoised images yielded by our algorithm and the competing algorithms, using the noise estimation algorithm in [48]. From the above objective data and subjective visual analysis, the denoising performance of our algorithm is comparable to that of WNNM and MR, especially in feature similarity. The quantitative metrics of our algorithm far exceed those of BM3D and LRA-SVD.
Visually, our algorithm preserves richer textures and details than the other five algorithms and achieves the denoised results closest to the original images.

Computational cost
To evaluate the computational cost of the six denoising algorithms, we compare their average running times on the eight greyscale images in Figures 1 and 2 at different noise levels. All experiments are performed on a platform with an Intel(R) Core(TM) i5-347 CPU at 3.2 GHz and 8 GB of memory. The execution time of each algorithm for denoising 256 × 256 greyscale images is shown in Table 7. The average execution times of BM3D, WNNM, LRA-SVD, ACPT, MR and our algorithm are roughly 0.62, 164, 17, 11, 57 and 26 s, respectively. Although the denoising performances of WNNM, MR and our algorithm are all very competitive, our algorithm has the lowest computational cost of the three. Our algorithm has two main computational components: the adaptive clustering and the calculation of the SVD for each cluster. The first component takes approximately 75% of the whole execution time and the second takes only 20%.

CONCLUSION
In this paper, we propose a denoising algorithm that constructs similar patch matrices by adaptive clustering and cooperatively filters the noise contained in the singular values and singular vectors of these matrices. To this end, we carried out various denoising experiments based on SVD and thresholding, and finally chose the scheme described in the paper, which achieves the desired denoising results. Adaptive clustering reduces the disadvantages of constructing similar patch matrices from a fixed number of image patches. Although it adds considerable computational cost, good clustering results directly lead to high correlation between the image patches in the similar patch matrices, which in turn improves the accuracy of the subsequent denoising process. In future work, we will try to adjust the adaptive clustering to reduce the execution time of our algorithm, and further explore the relationship between the number of clusters and the pixel values. We will also further improve the accuracy of the singular vector correction, so that the distribution of the denoised singular vectors is closer to that of the noise-free version. Studying and using the singular values discarded by the low-rank approximation is another direction for our future work.