Non-linear unmixing of hyperspectral images using multiple-kernel self-organising maps

: The spatial pixel resolution of common multispectral and hyperspectral sensors is generally not sufficient to avoid that multiple elementary materials contribute to the observed spectrum of a single pixel. To alleviate this limitation, spectral unmixing is a by-pass procedure which consists in decomposing the observed spectra associated with these mixed pixels into a set of component spectra, or endmembers, and a set of corresponding proportions, or abundances, that represent the proportion of each endmember in these pixels. In this study, a spectral unmixing technique is proposed to handle the challenging scenario of non-linear mixtures. This algorithm relies on a dedicated implementation of multiple-kernel learning using self-organising map proposed as a solver for the non-linear unmixing problem. Based on a priori knowledge of the endmember spectra, it aims at estimating their relative abundances without specifying the non-linear model under consideration. It is compared to state-of-the-art algorithms using synthetic yet realistic and real hyperspectral images. Results obtained from experiments conducted on synthetic and real hyperspectral images assess the potential and the effectiveness of this unmixing strategy. Finally, the relevance and potential parallel implementation of the proposed method is demonstrated.


Introduction
With the recent and rapid development of hyperspectral imaging technology, hyperspectral images have been widely used in various scientific fields, such as environmental mapping, risk prevention, urban planning, pollution monitoring and mining exploration [1]. Even if classical processing tasks of hyperspectral image analysis include dimensionality reduction, object detection and image segmentation/classification, spectral unmixing certainly remains the most investigated technique to extract relevant information from the observed scene while facing with the generally limited spatial resolution of the sensors. Indeed, the raw spatial resolution of standard hyperspectral sensors usually leads to observing mixed pixels, i.e. which are composed of more than one elementary material. Separating these mixed pixels into a set of elementary material spectral signatures (or endmembers) and estimating their corresponding fractions (or abundances) in each pixel is an automated procedure known as spectral unmixing or spectral mixture analysis [2].
There are two main classes of mixing models commonly advocated to relate the observed pixel spectrum to the endmember spectra and abundances. The first one, referred to as a linear mixing model (LMM), assumes that the observed spectra result from linear combinations of the endmember signatures. Conversely, a wide class of non-LMMs (NLMMs) covers a large variety of applicative contexts where LMM fails to accurately describe the mixing process underlying the observations. Although LMM is easy to implement and to handle in common scenarios, considering NLMM may be inevitable when analysing more complex scenes, e.g. characterised by multi-layer structures, such as vegetated areas [3] or urban canyons [4], or by intimate sandlike mixtures [5]. The interested reader is invited to consult [6][7][8] for comprehensive discussions on both classes of models.
Most of the unmixing strategies are based on two-step procedures. Firstly, the spectral signatures associated with the endmembers are extracted from the whole set of image pixels. Secondly, in an inversion step, the abundances are estimated in each pixel based on the estimated endmembers and the adopted LMM or NLMM. Popular endmember extraction algorithm includes N-FINDR [9] and simplex growing algorithm (SGA) [10] which search for a simplex with the maximum volume inscribed in the dataset. Similarly, pixel purity index (PPI) [11] and vertex component analysis (VCA) [12] recovers this simplex through geometrical projections. Regarding the inversion step, in the past decades, many algorithms based on LMM have been developed to recover the abundances associated with each pixel measurement. Heinz and Chang [13] introduced the popular fully constrained least squares (FCLS) unmixing method based on LMM. It is based on a least-squares approach that simultaneously exploits the socalled abundance of sum-to-one and non-negativity constraints. More recently, Bioucas-Dias and Figueiredo [14] introduced two algorithms, referred to as Sunsal and C-Sunsal, to solve a class of optimisation problems derived from the LMM-based spectral unmixing formulation. The proposed algorithms are based on the alternating direction method of multipliers, which decomposes the initial problem into a sequence of easier ones, with the great advantage of being much more computationally efficient than FCLS.
When dealing with non-linear mixtures, dedicated unmixing techniques should be considered. Some of them proposed in the literature rely on parametric formulations of the non-linearities that may occur during the mixing process. When these non-linearities result from multiple scattering effects, various bilinear mixing models have been designed [15] for which specific unmixing techniques have been developed [4,16,17]. Conversely, nonparametric model-based unmixing strategies have been derived to address a larger variety of non-linear mixtures. When the nonlinearities are considered as additional contributions of the LMM, robust models allow these non-linear terms to remain unspecified [18,19]. Additionally, some significant contributions exploit the versatility of the machine learning framework to handle the variety of non-linear effects, in particular when training dataset are available to learn the non-linear mapping from the abundances to the measured pixel spectra. For instance, a neural network approach has been proposed in [20] to tackle the unmixing problem by two successive stages: the first one reduces the dimension of the input data, and the second stage performs the mapping from the reduced input data to the abundance coefficients. In [21], extreme learning machines are implemented to train a regression model subsequently resorted to recovering the abundances of target classes.
Besides, multiple-kernel learning (MKL), which is known as learning a kernel machine with a set of basis kernels, has been developed in machine learning area [22] to solve optimisation problems arising when conducting an SVM-classification task. MKL can also be envisaged to address unmixing problems. Such a framework has several advantages: (i) it can simultaneously use numerous features (computed from the different bands of the hyperspectral images) to enrich the data similarity representations, (ii) it is an intermediate combination of data which means that the mixed pixel is preserved during the process without loss of information in its early combinations [22] and (iii) it can be easily parallelised to speed up the computation [23]. To this end, Liu et al. [24] have proposed a framework called multiple-kernel learning-based spectral mixture analysis (MKL-SMA) that integrates an MKL method into the training process of linear spectral mixture analysis. They derived a closed-form solution of the kernel combination parameters, unlike most MKL approaches. Similarly, Chen et al. [25] have formulated the problem of abundance estimation resulting from non-linear mixtures based on an MKL paradigm. Also, Gu et al. [26] have shown that integrating reproducing kernel Hilbert spaces (RKHSs) spanned by a series of different basis kernels in multiple-kernel Hilbert space can empower handling general non-linear problems more easily than traditional single-kernel learning. All these methods are based on MKL with an SVM-based solver. Thus, per se, they are supervised learning since they need a priori knowledge on the estimated endmembers and abundances. Unfortunately, in most applicative context, this prior knowledge is not available. To overcome this limitation, this paper suggests replacing the usual solver by the neural network, the Kohonen's self-organising maps (SOMs) [27]. The non-linear unmixing model proposed in this paper boils down to an appropriate implementation of multiple-kernel SOM (MK-SOM) specifically designed to solve the unmixing problem. This simple and intuitive strategy opens promising routes since, in particular, it is highly parallelisable, which can lead to an easy VLSI implementation based on systolic arrays or FPGAs, and it is easily expendable to a high number of dimensions [28].
The paper is organised as follows. Section 2 presents the proposed MK-SOM unmixing strategy, focusing on the training and testing algorithms. Section 3 provides the experimental results we have obtained after applying the proposed methodology to real and synthetic hyperspectral data. Different parameters in the training stage have been considered in order to obtain a detailed description of their influence on the final results. Section 4 describes and illustrates the potential computational gain, which can be expected by a parallel implementation of the proposed algorithm on a graphics processing unit (GPU). Finally, Section 5 concludes this work.

MK-SOM unmixing
This section details the dedicated implementation of MK-SOM to perform hyperspectral unmixing of possibly non-linearly mixed pixels. The proposed strategy assumes the endmember spectra have been identified beforehand, based on a priori knowledge regarding the imaged scene of interest, or extracted by a dedicated algorithm (such as VCA).

General framework: from SOM to MK-SOM
The measurements x ℓ, p p ∈ P with P ≜ 1, …, P , i.e. the reflectance of all the pixels observed in the band #ℓ with ℓ ∈ ℒ ≜ 1, …, L , are assumed to live in a given set X ℓ : ∀ℓ ∈ ℒ, ∀p ∈ P, x ℓ, p ∈ X ℓ .
Each band ℓ ∈ ℒ is associated with a known kernel, i.e. a function K ℓ : X ℓ × X ℓ → ℝ such that K ℓ ( ⋅ , ⋅ ) is symmetric and positive, i.e. and Popular kernel functions include the linear, Gaussian and polynomial kernels. Then, a new kernel where the set of coefficients α ℓ ℓ ∈ ℒ ensures a convex combination As noticed in [29], K( ⋅ , ⋅ ) is also symmetric and positive. As a consequence, according to the RKHS framework, a function ϕ: X → ℍ, referred to as the feature map, can be introduced such that and ℍ is a Hilbert space referred to as the feature space. Traditionally, SOM consists of searching for M so-called neurons or, equivalently, prototypes p m ∈ ℍ (m ∈ ℳ ≜ 1, …, M ), to represent the input data X. In other words, an SOM represents the data by a set of prototypes (such as the centroids identified by an unsupervised clustering technique, e.g. K-means), which are topologically organised on a lattice structure. According to the general framework of kernel SOM described in [30], all prototypes are written as convex combinations of the input data in the feature space The proposed MK-SOM unmixing algorithm consists of two main steps. First, the training step recovers the pure pixels to obtain the weight map. Then, the testing step computes the abundance vector for each pixel in the image. These two steps are briefly recalled in what follows.

MK-SOM training
Following the ideas of [29,31], we adopt an online implementation of MK-SOM algorithm detailed in Algorithm 1 (see Fig. 1 standard (i.e. affectation and representation) steps of the conventional SOM are iteratively applied here for a given total number T of epochs. In particular, the affectation step consists in finding the index k p of the so-called best-matching unit, i.e. of the closest prototype p m of the considered input data x p . Note that the MK-counterpart of SOM relies on a distance computed in the feature space ℍ thanks to the associated Gram matrix. Line 8 of Algorithm 1 (Fig. 1), the updating rule involves a neighbourhood function h ⋅ , ⋅ chosen as where σ 2(t) is an algorithmic parameter decreasing along with the iterations according to with σ 2(0) = 20 and λ σ 2 = 0.05. Moreover, μ (t) is a learning rate defined as a decreasing function with μ (0) = 0.1 and λ μ = 0.05. Finally, the δ( ⋅ ) function stands for the Kronecker operator, i.e. δ( j) = 1 if j = 0 and δ( j) = 0 otherwise. In the proposed implementation, the output weight vectors are normalised between the range (0, 255) as the input training and testing vectors have been also normalised [32]. It is worth noting that the output prototypes recovered during this training step can be interpreted as virtual endmember spectra or class representatives for each material present in the observed scene.

MK-SOM testing
The MK-SOM testing step of the proposed unmixing algorithm aims at recovering the abundance vector associated with each pixel y n (n = 1, …, N) of the image to be unmixed. Following a commonly accepted approach [20,33,34], the degrees of membership d kn k = 1 K of a testing pixel y n to all neurons p m are resorted as surrogates of abundances. As in [20], these degrees are finally rescaled to ensure that the final abundance vectors satisfy the widely admitted non-negativity and sum-to-one constraints [2]. The overall process is described in Algorithm 2 (see Fig. 2).

Initialisation
Before beginning the unmixing process, two initialisation tasks should be conducted. There are discussed in what follows. First, the spectral signatures of the endmembers to be used in the unmixing process, as well as their number, should be selected. These spectral signatures can be chosen by the practitioner in the case of reliable prior knowledge regarding the scene of interest. Otherwise, one may make use of an endmember extraction algorithm already proposed in the literature [35][36][37][38]. Note that, as highlighted in [7,8], some of these algorithms can provide reliable results to recover endmember from non-linearly mixed pixels, although they initially rely on an implicit LMM. This is, in particular, the case for the geometrical extraction algorithms, such as VCA [12] and N-FINDR [9]. Moreover, note that, in the proposed implementation of the MK-SOM, the number of endmembers is the squared value of the dimension of the SOM weight vectors. However, when this constraint can be fulfilled, any dimension of the SOM weight vectors can be still employed and the resulted map can be post-processed by any unsupervised clustering algorithm (e.g. K-means) to reach the desired number of endmembers [39]. Then, the second task consists in selecting training samples (i.e. hyperspectral pixels) necessary for the training step of the MKL. In [20,40], the authors propose to resort to only pure pixels in the training phase, based on the assumption that each pure pixel can be associated with a unique endmember by a degree of membership equals to 1 and with a degree of membership to the other endmembers equal to 0. Instead, in this work, we propose to define this training set as the outputs of an endmember extraction algorithm, e.g. VCA. Indeed, VCA is able to recover the P most distinct pixels in the considered hyperspectral image. In addition to being unsupervised, this strategy has the great advantage of allowing the learning step of the MK-SOM to be conducted more efficiently since the training set is not limited to actual existing pure pixels in the image. In other words, the training pixels are selected as the P purest pixels extracted by VCA.

Experimental results
To validate the proposed MK-SOM-based unmixing algorithm, its performance has been compared to those obtained by the linear unmixing methods FCLS [13] and SUNSAL [14] and the nonlinear unmixing methods PPNM [16], rLMM [18], K-Hype and its generalised counterpart (SK-Hype) [41]. The experimental datasets consist of synthetic and real images described in what follows.

Datasets
Firstly, to assess the performance of the unmixing procedures, we consider a hyperspectral image of size 100 × 100 pixels and composed of nine endmembers representative of minerals which have been extracted from the U.S. Geological Survey (USGS) spectral library [42]. The abundance maps have been choosing as freely available dataset referred to as Fractal 1 and available online [43] has been used. They have been generated according to fractal patterns to ensure realistic distribution of the materials other the scene. In particular, it is constructed such that the pixels nearby the region centres are more spectrally pure than the pixels in the transition areas [44]. Based on these endmembers and abundance maps, the mixed pixels have been generated according to the nonlinear model, namely the so-called Fan bilinear model introduced in [45]. An additive white Gaussian noise has been considered with a corresponding signal-to-noise ratio chosen as SNR=40 dB. To illustrate Fig. 3 depicts the synthetic image in a band #1. Secondly, the real so-called Cuprite dataset acquired by the airborne visible/infrared imaging spectrometer (AVIRIS) has been considered [46]. It consists of a 614 × 512 pixel image and 224 bands with a spectral range of 0.4 − 2.5 μm. A set of 33 bands have been removed prior to analysis since they correspond to water absorption band or low SNR. Fig. 4 shows the Cuprite image observed in band #10.

Quality metrics
For the synthetic dataset for which ground-truth (i.e. in terms of actual abundance maps) is available, the performances of the unmixing procedures are evaluated through the mean square error of the abundances [47] where a n and â n are the actual and estimated abundance vectors associated with the pixel #n (n ∈ 1, …, N ), respectively, and K is the number of endmembers used during the unmixing process. For the real datasets for which no ground-truth is available, the reconstruction error between the original and recovered pixel signatures [47] where y n and ŷ n are the measured and reconstructed spectra associated with the pixel #n (n ∈ 1, …, N ) and L is the number of spectral bands.

Results
In the conducted experiments, the proposed MK-SOM unmixing procedure has been implemented with a Gaussian kernel of width equal to 1 since it has shown to outperform linear and polynomial kernels [26]. During the training step, various numbers T epoch of epochs have been considered T epoch ∈ 20, 35, 50, 100, 200, 500 with the largest size of training sets, which equals the number of bands. Tables 1 and 2 report the performance of the different considered algorithms for a number of endmembers equal to K = 4 and K = 9, respectively and an SNR = 40 dB. These results show that, in the case of a number of endmembers equal to 4, the proposed MK-SOM unmixing algorithm consistently outperforms all the other algorithms for any training step parameters (e.g. number T epoch of epochs). The best MSE value (0.0625) is obtained by MK-SOM for a number of epochs equals to T epoch = 500. When the number of endmembers is equal to K = 9, the MK-SOM consistently performs better than all other methods. The best MSE value (0.0123) is reached when MK-SOM is run with the number of epochs equal to T epoch = 500.
To illustrate, the corresponding estimated abundance maps for a synthetic image are depicted in Fig. 5 for K = 4.
As complementary results, Figs. 6 and 7 report the reconstruction errors for different levels of noise for K = 4 and a different number of endmembers for SNR = 40, respectively.
Similarly, Table 3 shows the performance of compared algorithms when applied to the real AVIRIS Cuprite dataset for a number of endmembers equal to K = 9. The proposed MK-SOM consistently outperforms the concurrent methods and, in particular, provides the lowest error for a number of epochs equal to T epoch = 20. To illustrate, the corresponding estimated abundance maps are depicted in Fig. 8 for K = 4.

Parallel implementation of the MK-SOM algorithm
This section presents a parallel implementation of the MK-SOM algorithm described in Section 2. This implementation is developed on a GPU using a CUDA programming language. The experiments are conducted on an NVidia Tesla N2090 architecture and show that the parallelisation provides a significant speedup of the required computational time. Per se, MK-SOM learning is a highly parallelised problem as it is based on an SOM learning approach. The complexity of the calculation is greatly affected by the SOM map size and number of features (number of bands in the   considered unmixing problem). A simple SOM learning can be summarised as Algorithm 3 (see Fig. 9). The most computationally intensive steps are Steps 3 and 4 of Algorithm 3 (Fig. 9) described below: • Computing the Euclidean distance between the selected training sample and the SOM map, then obtaining the best matching unit (BMU) which correspond to the minimum distance, • Updating the SOM map weights according to the distance between the selected training sample and BMU yet updates are limited to only the BMU neighbours.
The degree of parallelism of Step 3 is limited to the dimension of the training set (number of features). On the other hand, the complexity of the MK-SOM testing step (see Section 2.3 and Algorithm 2 (Fig. 2)) mainly depends on the image size, the SOM map size and the number of endmembers. Therefore, for a given image, the expected parallelisation speedup directly increases with the SOM map size. As an illustrative example of the possible computational gain, the implemented CUDA version has been evaluated in the real AVIRIS Cuprite Image of size 512 × 614 × 189 with different SOM map sizes. The resulting computational times, compared to those obtained on crude (i.e. serial) or multi-threading MATLAB implementations, are reported in Table 4. These results show a significant speedup ranging from × 185 to × 6252 with respect to its MATLAB counterpart. It clear appears that this computation speedup increases as the SOM map size increases. We could also note that the execution time of the serial MATLAB version is exponentially increasing with SOM map size.

Conclusion
This paper introduced a new unmixing method based on simple MKL with SOM as a solver. The proposed non-linear unmixing approach succeeded in reducing the estimation error when applied to synthetic and real hyperspectral images. Moreover, it seemed to be robust with respect to the choice of the parameters required by  the training stage. At the price of a slightly more computationally intensive complexity, the proposed MK-SOM framework appears as a relevant alternative of state-of-the-art non-linear unmixing methods. Finally, this paper showed that a parallel implementation of the proposed algorithm was possible. This parallelisation was implemented using a graphics processing unit and the resulting computational gain was shown to be highly significant when compared to serial and multi-threaded MATLAB implementations.