Specific radar emitter identification based on two stage multiple kernel extreme learning machine

To make full use of the discriminative information contained in the whole ambiguity function (AF) plane, a novel two stage multiple kernel extreme learning machine (TSMKELM) method for specific radar emitter identification is proposed. Firstly, the AF plane is segmented into non-overlapping Doppler shift stripes and each stripe is encoded as a kernel. Next, the discrimination of these stripes is evaluated and sorted according to the kernel discriminant ratio (KDR) criterion, which is in line with the large margin principle of KELM. Then, only the stripes with large KDRs are kept and the combined kernel is calculated by directly using the normalized KDRs as combination weights. Finally, the KELM classifier is employed to fulfil the individual identification task. The proposed algorithm, named KDR-TSMKELM, solves for the kernel combination weights and the kernel classifier parameter separately, which makes it efficient in practice. Experiments on two real radar datasets validate the proposed algorithm.

✉ Email: shiyaworld@163.com
Introduction: Specific emitter identification (SEI) distinguishes individual radio emitters of interest using extracted signal features [1]. We focus here on radar emitters, i.e. the specific radar emitter identification problem, whose most challenging case is that the radars come from the same production line and are thus of identical type [2,3]. To meet this challenge, unintentional modulation on pulse (UMOP) has attracted great attention over the past thirty years or so [1-5]. The individual differences between radars originate from the inevitable UMOP caused by the production technique of the transmitter, which provides a physical basis for feature extraction and individual identification [2,3].
Pulse envelope [2,6] and frequency and phase modulation profiles [4,5] were verified to be related to the phase noise of transmitters, so they can represent the subtle differences between radars. Further, variational mode decomposition was used to decompose the envelope or instantaneous frequency into many modes to extract secondary features [3]. In [7], the Fourier spectrum was regarded as the primary feature, and a one-dimensional convolutional neural network was then applied to obtain deep features. In [8], the image-like bispectrum feature was fed into a discriminative adversarial network to magnify the unintentional modulation information. Based on cyclostationarity, [6] presented the zero frequency slice of cyclic spectrum (CS0) feature. However, the above features are less comprehensive than the joint time-frequency ones [9-11], among which the ambiguity function (AF) performs well. In terms of accuracy, the representative Doppler shift slice of AF (AFR) feature is of great use, but its location needs to be determined by a heavy searching process. To handle this issue, [11] proposed the integrated kernel canonical correlation analysis (IKCCA) algorithm to fuse a few Doppler shift slices. IKCCA is better than AFR, but it still ignores much potentially useful information in the two-dimensional AF plane and, moreover, is not practical for mining the whole AF plane.
In this letter, we aim to fuse the informative Doppler shift stripes of AF by two stage multiple kernel learning (MKL) [12]. In the first stage, one quarter of the AF plane is segmented into Doppler shift stripes, which are encoded by multiple kernels; then, the kernel discriminant ratio (KDR) is used to evaluate the discrimination and informativeness of each stripe, and the stripes with large KDRs are kept to calculate the combined kernel. In the second stage, the kernel extreme learning machine (KELM) is chosen as our base learner to identify the individual radar emitter, owing to its superior performance and its large margin principle [13].
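ELM and KELM: ELM is an efficient algorithm for single hidden layer feedforward neural networks. Assume we have N training data $\{(x_i, y_i)\}_{i=1}^{N}$ with $x_i \in \mathbb{R}^{D}$ and $y_i \in \mathbb{R}^{U}$, where D is the data dimension, U is the number of classes and $y_i$ is the label vector of $x_i$. If $x_i$ belongs to the u-th class, the u-th element of $y_i$ is set to one and all the others are set to zero. Given L hidden neurons, the input weights and biases are $\{(a_l \in \mathbb{R}^{D}, b_l \in \mathbb{R})\}_{l=1}^{L}$. Given the activation function $G(\cdot)$, the output vector of the hidden layer is thus $h(x_i) = [G(a_1, b_1, x_i), \ldots, G(a_L, b_L, x_i)] \in \mathbb{R}^{1 \times L}$. ELM aims to minimize the training error and the norm of the output weight matrix $\beta \in \mathbb{R}^{L \times U}$, i.e. the objective function is:

$$\min_{\beta, \xi} \ \frac{1}{2}\|\beta\|_F^2 + \frac{C}{2}\sum_{i=1}^{N}\|\xi_i\|^2 \quad \text{s.t.} \quad h(x_i)\beta = y_i^T - \xi_i^T, \quad i = 1, \ldots, N \tag{1}$$

where $\|\cdot\|_F$ is the Frobenius norm, $\xi_i \in \mathbb{R}^{U}$ is the training error vector of $x_i$ and $C > 0$ controls the trade-off between model complexity and training error. Based on the Karush-Kuhn-Tucker (KKT) theorem, the optimal solution of (1) is

$$\beta^* = H^T (HH^T + I/C)^{-1} Y \tag{2}$$

where $H = [h(x_1)^T, \ldots, h(x_N)^T]^T \in \mathbb{R}^{N \times L}$ is the hidden layer output matrix and $Y = [y_1, \ldots, y_N]^T \in \mathbb{R}^{N \times U}$ is the label matrix. The term $HH^T \in \mathbb{R}^{N \times N}$ can be regarded as a kernel matrix K. Originally, the input weights and biases of ELM are randomly generated, i.e. $h(x_i)$ is known to users. Alternatively, $h(x_i)$ can be an implicit feature mapping, in which case K becomes the familiar Gram kernel matrix; this leads to KELM. For any testing data point $x^{te}$, the output of KELM is:

$$f(x^{te}) = [k(x^{te}, x_1), \ldots, k(x^{te}, x_N)] (K + I/C)^{-1} Y \tag{3}$$

where $k(\cdot, \cdot)$ is the kernel function and $(K + I/C)^{-1} Y = \alpha^*$ is the learned classifier parameter of KELM. Finally, the predicted label of $x^{te}$ is the index of the maximum element of $f(x^{te})$.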
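The KELM training and prediction steps described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' MATLAB implementation: the function names, the toy data and the Gaussian kernel form exp(-||x - z||^2 / (2 sigma^2)) are assumptions made here for illustration.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A (n x D) and B (m x D).
    The exp(-d^2 / (2 sigma^2)) parameterization is an assumption."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kelm_train(K, labels, n_classes, C=100.0):
    """Return alpha* = (K + I/C)^{-1} Y for an N x N training kernel K."""
    Y = np.eye(n_classes)[labels]              # one-hot label matrix, N x U
    return np.linalg.solve(K + np.eye(K.shape[0]) / C, Y)

def kelm_predict(K_te, alpha):
    """K_te is n_test x N; the label is the argmax of each output row."""
    return np.argmax(K_te @ alpha, axis=1)

# Hypothetical toy data: two well-separated 2-D classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.5, (20, 2)), rng.normal(2.0, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)

K_tr = gaussian_kernel(X, X, sigma=1.0)
alpha = kelm_train(K_tr, y, n_classes=2)
pred = kelm_predict(K_tr, alpha)               # predictions on the training set
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse, which is usually preferable numerically.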
Proposed method: MKL was initially devised for kernel selection, but it has since become a useful tool for fusing multiple information sources. Inspired by MKL, KELM was extended to the multiple KELM (MKELM) algorithm [14]. However, the optimal kernel combination weights and classifier parameter are solved by alternating optimization, so MKELM needs many iterations to converge. By contrast, two stage MKL finds the optimal kernel weights using independent criteria (e.g. kernel target alignment [12]) in the first stage and then trains a standard kernel method using the combined kernel in the second stage, which is more efficient. Therefore, the KDR-based two stage MKELM (KDR-TSMKELM) algorithm is proposed in this letter. KDR is the ratio of between-class scatter to within-class scatter in a kernel-induced feature space, i.e. KDR measures the class separability of the training data [15]. Generally, good separability means high accuracy, so we can use the KDR criterion to obtain the kernel weights in the first stage. Moreover, it has been proved that ELM's minimal-norm output weight property is in line with the large margin theory [13]. Hence, our algorithm is theoretically well founded.
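As an illustration of the KDR idea, the sketch below computes a between-class to within-class scatter ratio directly from a kernel matrix. The exact scatter definitions used by the authors are not reproduced in this text, so the trace-based scatters here (class-size-weighted distances of class means to the global mean, and distances of samples to their class mean, both in feature space) are an assumption, and the function name is hypothetical.

```python
import numpy as np

def kernel_discriminant_ratio(K, labels):
    """Ratio s_b / s_w computed from an N x N kernel matrix K.

    Assumed definitions (in the kernel-induced feature space):
      s_b = sum_u n_u * ||m_u - m||^2
      s_w = sum_u sum_{i in class u} ||phi(x_i) - m_u||^2
    """
    labels = np.asarray(labels)
    N = K.shape[0]
    class_term = 0.0                       # sum_u n_u * ||m_u||^2
    for u in np.unique(labels):
        idx = np.flatnonzero(labels == u)
        class_term += K[np.ix_(idx, idx)].sum() / idx.size
    s_b = class_term - K.sum() / N         # between-class scatter
    s_w = np.trace(K) - class_term         # within-class scatter
    return s_b / s_w

# Toy check with a linear kernel: the same points, labeled two ways.
labels = [0, 0, 1, 1]
X_sep = np.array([[-1.0], [-1.1], [1.0], [1.1]])   # classes well separated
X_mix = np.array([[-1.0], [1.0], [-1.1], [1.1]])   # classes overlapping
kdr_sep = kernel_discriminant_ratio(X_sep @ X_sep.T, labels)
kdr_mix = kernel_discriminant_ratio(X_mix @ X_mix.T, labels)
```

With these definitions a well-separated labeling gives a much larger ratio than an overlapping one, which matches the intuition that a large KDR indicates good class separability. The sketch does not guard against a zero within-class scatter.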
Based on Equations (4) and (5), the KDRs are $\{s_r^b / s_r^w\}_{r=1}^{R}$. Hence, the kernel combination weights $\{\mu_r^*\}_{r=1}^{R}$, given by the normalized KDRs, and the combined kernel are as follows:

$$\mu_r^* = \frac{s_r^b / s_r^w}{\sum_{j=1}^{R} s_j^b / s_j^w}, \qquad K_{comb}^{tr} = \sum_{r=1}^{R} \mu_r^* K_r^{tr} \tag{6}$$

In the second stage, the optimal classifier parameter can be directly calculated by $\alpha^* = (K_{comb}^{tr} + I/C)^{-1} Y$. In the testing period, the kernel matrices between any testing data point $\{x_r^{te}\}_{r=1}^{R}$ and all the training data are $\{K_r^{te} = [k_r(x_r^{te}, x_r^{1,1}), \ldots, k_r(x_r^{te}, x_r^{U,n_U})] \in \mathbb{R}^{1 \times N}\}_{r=1}^{R}$. Consequently, the final output of KDR-TSMKELM is $(\sum_{r=1}^{R} \mu_r^* K_r^{te}) \alpha^*$. Since $\alpha^*$ and $\{\mu_r^*\}_{r=1}^{R}$ are closed-form solutions, our proposed algorithm is faster than MKELM and more suitable for specific radar emitter identification. In general, to mine the whole AF plane, all the Doppler shift slices or Doppler shift stripes (see below) are regarded as multiple homogeneous feature representations of the radar signals and are encoded by multiple kernels; next, the AF slices or stripes are automatically sorted by informativeness according to their KDRs, and the kernels with large KDRs are chosen to calculate the combined kernel; finally, the standard KELM is used.
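The first-stage selection and combination can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name is hypothetical, and normalizing the KDRs so that the weights of the kept kernels sum to one is an assumption about what "normalized KDRs" means.

```python
import numpy as np

def combine_kernels(kernels, kdrs, d):
    """Keep the d kernels with the largest KDRs and combine them.

    kernels: list of R kernel matrices of identical shape
    kdrs:    sequence of R kernel discriminant ratios
    Returns the combined kernel, the kept indices and their weights.
    """
    kdrs = np.asarray(kdrs, dtype=float)
    keep = np.argsort(kdrs)[::-1][:d]          # indices of the d largest KDRs
    weights = kdrs[keep] / kdrs[keep].sum()    # assumed sum-to-one normalization
    K_comb = sum(w * kernels[r] for w, r in zip(weights, keep))
    return K_comb, keep, weights

# Hypothetical toy kernels: scaled identity matrices.
kernels = [1.0 * np.eye(3), 2.0 * np.eye(3), 3.0 * np.eye(3)]
kdrs = [0.5, 3.0, 1.0]
K_comb, keep, weights = combine_kernels(kernels, kdrs, d=2)

# Second stage: closed-form KELM parameter on the combined kernel.
C = 100.0
Y = np.eye(3)                                  # toy one-hot labels
alpha = np.linalg.solve(K_comb + np.eye(3) / C, Y)
```

At test time the same `keep` indices and `weights` would be applied to the test-versus-training kernel rows before multiplying by `alpha`, so only the kept kernels need to be evaluated.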
To better illustrate our idea, a real single-tone radar dataset (U = 20) is used. Figure 1 shows a typical sample and its corresponding AF feature representation. Using the KELM classifier with a Gaussian kernel (see the Experiments section for details), Figure 2a shows the recognition accuracy of the AF slices when 6% of the data are randomly selected for training (N = 60). Further, the KDRs of the AF slices are shown in Figure 2b.
As can be seen from Figure 1b, the AF plane of single-tone signals has three salient regions (i.e. bright regions). Figure 2a shows that the AF slices located in the salient regions perform better than those in other regions, which is in line with our intuition. Comparing Figure 2a with Figure 2b, the KDRs of the AF slices are consistent with their recognition accuracy. Therefore, the discrimination and informativeness of AF slices can be well evaluated by KDRs, which justifies Equation (6). Figure 2 also shows that many AF slices are uninformative, so we sort the KDRs and keep only the slices with large KDRs, which helps to save testing time. However, there will be too many kernels if slices are used as feature representations. Hence, we can further segment the AF plane into stripes. Because of symmetry, only a quarter of the AF plane is considered, and Figure 3 shows the detailed segmentation strategy (taking part of the AF plane as an example). Using stripes reduces the total number of kernels and shortens the training time when the training proportion is large. Of course, the stripe width should not be too large, since the performance of adjacent slices in salient regions fluctuates greatly.
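The non-overlapping segmentation into Doppler shift stripes can be sketched with a simple reshape. The array layout (rows are Doppler shift slices, columns are time delays), the function name and the handling of leftover slices (dropped when they do not fill a whole stripe) are assumptions made for illustration; the strategy in Figure 3 may differ in these details.

```python
import numpy as np

def doppler_stripes(af_quarter, width):
    """Split a quarter AF plane into non-overlapping Doppler shift stripes.

    af_quarter: 2-D array, shape (n_doppler_slices, n_delays) (assumed layout)
    width:      number of adjacent Doppler shift slices per stripe
    Returns an array of shape (n_stripes, width * n_delays); row r is the
    flattened r-th stripe, usable as one feature vector for kernel r.
    """
    n_stripes = af_quarter.shape[0] // width
    trimmed = af_quarter[: n_stripes * width]   # drop slices that do not fit
    return trimmed.reshape(n_stripes, -1)

# Hypothetical toy AF quarter plane with 6 Doppler slices and 4 delays.
af = np.arange(24, dtype=float).reshape(6, 4)
slices = doppler_stripes(af, width=1)    # width 1 recovers individual slices
stripes2 = doppler_stripes(af, width=2)  # three stripes of two slices each
```

Applying this to every training sample gives, for each stripe index r, an N-row feature matrix from which the r-th kernel matrix can be computed.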
Experiments: Data1 (U = 20, N = 1000) and Data2 (U = 30, N = 3000) are the real radar datasets used in the experiments (see Figure 1a for the waveform). A laptop with an Intel Core i7 CPU, 20 GB of memory and the MATLAB R2020b platform is used. The compared methods are pulse envelope, CS0 and AFR with the KELM classifier, and ten AF slices [11] with KDR-TSMKELM. The Gaussian kernel parameter is empirically set to the mean of the pairwise distances between training samples, and C = 100. We randomly select 6%, 8%, 10%, 20% and 50% of the data for training, and every method is repeated 10 times. For KDR-TSMKELM, AF slices (stripe width = 1) and AF stripes of widths 2 and 3 are tested.
To determine how many kernels should be kept (denoted by d), Figure 4 plots the curve of sorted KDRs and the recognition accuracy for different d when the training proportion is 6%. The best performance is obtained around the points where the KDR curves begin to flatten. Thus, d = 120, 60 and 40 for stripe widths 1, 2 and 3 respectively on Data1, and d = 50, 20 and 10 on Data2.
Table 1 gives the recognition results on Data1, and the training and testing times on Data1 are shown in Figure 5. As shown in Table 1, among the single features, AFR performs best and the envelope is almost useless when there are many classes. The fusion methods outperform the single-feature-based methods, and mining the whole AF plane is indeed better than fusing only several AF slices, especially when the training proportion is small. Because the total amount of information is fixed, the recognition accuracy does not differ significantly across stripe widths. In detail, KDR-TSMKELM with width 1 has the best accuracy; the reason is that adjacent AF slices have very different discriminative ability, so the minimum width eliminates the interference of redundant information to the greatest extent. In terms of time efficiency, width 2 is preferable. To further verify these observations, Table 2 shows the results on Data2. We can see that KDR-TSMKELM with width 1 still performs best. The training time is 17.2, 11.0 and 8.8 s (50% proportion) for widths 1, 2 and 3 respectively. In summary, from the point of view of both recognition accuracy and time efficiency, the best choice is KDR-TSMKELM with width 2.
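The kernel parameter heuristic mentioned in the experimental setup (the mean of the pairwise distances between training samples) can be sketched as below. The function name is hypothetical, and using Euclidean distance over distinct pairs, as well as interpreting the result as the Gaussian kernel width, are assumptions.

```python
import numpy as np

def mean_pairwise_distance(X):
    """Mean Euclidean distance over all distinct pairs of rows of X."""
    sq = (X**2).sum(1)[:, None] + (X**2).sum(1)[None, :] - 2.0 * X @ X.T
    dist = np.sqrt(np.maximum(sq, 0.0))       # clip tiny negative round-off
    iu = np.triu_indices(X.shape[0], k=1)     # each unordered pair once
    return dist[iu].mean()

# Hypothetical toy training set: pairwise distances are 5, 8 and 5.
X_train = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 8.0]])
sigma = mean_pairwise_distance(X_train)
```

In a multiple kernel setting, this value would be computed separately for each slice or stripe, since the feature scale can differ between them.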

Conclusion: We propose the KDR-TSMKELM algorithm under the framework of MKL. To identify individual radars, the AF plane of the radar signals is transformed into AF stripes by non-overlapping segmentation, and the normalized KDRs are used as the kernel combination weights to fuse the stripes. The validity of our method is tested on two real single-tone radar datasets; more complex signals will be tested in future work.