Joint likelihood estimation and model order selection for outlier censoring

Seyed Mohammad Karbasi, Department of Electrical Engineering, Sharif University of Technology, Azadi Avenue, Tehran, Iran. Email: m.karbasi@sharif.edu

Abstract

This study deals with the problem of censoring outliers from the secondary data in a radar scenario, where the number of outliers is unknown. To this end, a procedure consisting of joint likelihood estimation and statistical model order selection (MOS) is proposed. Since the maximum likelihood (ML) estimation of the outlier subset requires solving a combinatorial problem, an approximate ML (AML) method is employed to reduce the complexity. Then, to determine the number of outliers, different MOS criteria based on the likelihood function are applied. At the analysis stage, the performance of the proposed methods is assessed on simulated data. The results highlight that the devised algorithms exhibit satisfactory performance at a modest computational complexity.


| INTRODUCTION
Accurate statistical analysis of the interference is crucial for radar signal processing applications such as adaptive radar detection [1][2][3][4], passive emitter source enumeration [5], direction-of-arrival estimation [6], spectral analysis [7], and space-time adaptive processing [8][9][10]. The statistical information, which is fully described via the unknown interference covariance matrix for the case of Gaussian distribution, is usually obtained via sample covariance matrix (SCM) of the secondary (training) data. The secondary samples are selected from range cells spatially adjacent to the cell under test and are assumed to be statistically independent and identically distributed (IID) [1,3,11].
It is necessary to consider the practical limitations of SCM estimation connected with the lack of sufficient samples and the occurrence of outliers. In fact, the convergence of the SCM estimate is achieved when the number of analysed samples is higher than two times the system degrees of freedom. The lack of sufficient training samples, which is inevitable in many practical situations, has been accounted for in several studies [12][13][14][15][16][17], where the problem is treated by exploiting knowledge-aided as well as Bayesian approaches to estimate the SCM. Besides, in many real situations, the observed environment is non-homogeneous and contains outliers due to the presence of clutter discretes, clutter power variations, transition interfaces, undesired outlier signals of different types, and so on [10] (the so-called heterogeneous environment). One way to circumvent the damaging effects of heterogeneous environments is to employ structured covariance matrix estimators, which require less training data than the SCM to achieve convergence. To this end, several studies have exploited the structure of the covariance matrix [12], [18][19][20][21][22].
Due to the presence of outliers, the IID assumption for training data is often violated. Hence, to estimate the covariance matrix of the interference, it is important to purify the secondary data from possible outliers. In this respect, several methods have been proposed in the open literature. For instance, in [23][24][25], the generalized inner product (GIP) metric has been introduced to detect the outliers. Therein, the samples sharing large GIP values (via an estimated covariance matrix) have been labelled as non-homogeneous. Moreover, in [26], a reiterative censored GIP method has been devised. Each iteration eliminates a fixed number of samples in a reiterative scheme to converge to the most likely outlier subset. Additionally, knowledge-aided methods have been proposed in [27][28][29], via exploiting the a priori knowledge to select secondary data as well as the censoring algorithm. Furthermore, in [30], the training data selection has been interpreted as a parameter estimation problem and has been solved via a sparsity-based approach. Besides, in [31], the maximum likelihood (ML) estimate of the outlier subset has been considered, and an approximate ML (AML) procedure has been devised to reduce the computational complexity. Moreover, in [32], the generalized regularized likelihood function criterion has been introduced to estimate the outlier subset. Particularly, a suitable regularization function necessary to ensure well-conditioned estimates has been used.
In many parametric approaches to signal processing applications, the parameter estimation problem is coupled with the model order estimator to determine the number of parameters to be estimated [33][34][35][36]. As the cardinality of the outlier subset is not a priori known, a possibility to handle the mentioned outlier censoring problem is based on its formulation in terms of a model order selection (MOS) problem, since each possible choice for the outliers represents a model with a given number of parameters.
Here, a censoring method based on the joint use of MOS and ML estimation is presented. Namely, the outlier subset that exhibits the best MOS metric is chosen as the candidate for censoring. To this end, different available MOS criteria are discussed and utilized to determine the order of the outlier subset [34]. The MOS methods described all share common characteristics and are based on minimizing a criterion over the model order. The criteria are composed of the negative of the log-likelihood function (NLLF) plus a penalty term for the model order. The minimization of the MOS criteria is performed for each given model order by obtaining the minimum NLLF, which is equivalent to solving the ML estimation problem. As the exhaustive search for the ML problem has combinatorial complexity, the more efficient AML approach [31] is employed. Moreover, to further reduce the computational burden of the proposed joint AML and MOS (JAM) method, an efficient JAM (EJAM) implementation is suggested that achieves almost the same performance with a significant complexity reduction via recursive calculation of the NLLF. At the analysis stage, several numerical experiments utilizing different existing MOS criteria are given to reveal the effectiveness of the proposed algorithms. The results are compared with the AML approach with a known number of outliers. Additionally, a heuristic box-and-whisker plot (BWP)-based censoring approach is analysed for comparison purposes [33]. The results highlight that the JAM (as well as the EJAM) solution, without a priori knowledge, selects outliers comparably to the AML method that is given the number of outliers.
The study is organized as follows. The signal model and the secondary data selection problem are formulated in Section 3. Section 4 is devoted to the description of the proposed censoring algorithm, while Section 5 discusses some design issues. In Section 6, the performance of the proposed method is assessed via several simulation analyses. Finally, Section 7 provides concluding remarks as well as future research tracks.

| NOTATION
Throughout the study, we use boldface lowercase, boldface uppercase, and uppercase letters for vectors, matrices, and index sets, respectively. We denote the transpose and conjugate transpose by $(\cdot)^T$ and $(\cdot)^\dagger$, respectively. The identity matrix is $\mathbf{I}$, and its dimension is determined by the context. The notations $\det(\cdot)$ and $\mathrm{tr}(\cdot)$ denote the determinant and the trace of the matrix argument, respectively. Moreover, $|\cdot|$ is the cardinality of the set argument. The number of $M$-combinations of a set with cardinality $K$ is denoted by $\binom{K}{M}$.

| SIGNAL MODEL AND PROBLEM FORMULATION
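Let us assume that the sensing system has gathered the observation vectors $\mathbf{x}_k \in \mathbb{C}^N$, for $k = 1, 2, \ldots, K$, from $K$ range cells (secondary data) to estimate the interference covariance matrix. Let us also suppose that among these $K$ samples there exist $M$ outliers whose indices belong to the set $\Omega_0 = \{i_1, i_2, \ldots, i_M\}$, whose specific entries and cardinality $M = |\Omega_0|$ are unknown. Specifically, the $i$th observation vector is expressed as

$$\mathbf{x}_i = \begin{cases} \mathbf{c}_i, & i \in \Omega - \Omega_0, \\ \mathbf{c}_i + \mathbf{p}_i, & i \in \Omega_0, \end{cases} \qquad (1)$$

where $\Omega = \{1, 2, \ldots, K\}$ (see Figure 1) and the $\mathbf{c}_i$s denote the homogeneous interference components, modelled as zero-mean circularly symmetric complex Gaussian vectors with covariance matrix $\mathbf{R}$ (a positive definite matrix modelled as an unknown parameter). Finally, the $\mathbf{p}_i$s are the non-homogeneous interference components, representing the outliers and modelled as deterministic unknown parameters. The problem considered in the following is to determine $\Omega_0$ together with its cardinality $M$.

[Figure 1: Diagram representing $\Omega$ and $\Omega_0$]

Denote by $\mathbf{C}$ the $N \times K$ matrix of the homogeneous components, that is, $\mathbf{C} = [\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_K]$, whose probability density function (pdf) is

$$f(\mathbf{C} \mid \mathbf{R}) = \frac{1}{\pi^{NK} \det(\mathbf{R})^{K}} \exp\left\{ -\mathrm{tr}\left( \mathbf{R}^{-1} \mathbf{C} \mathbf{C}^\dagger \right) \right\}. \qquad (2)$$

Moreover, let $\mathbf{X}$ be the $N \times K$ observation matrix, that is,

$$\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_K]. \qquad (3)$$

According to Equation (2), the LLF of $\mathbf{X}$ is given by

$$\log f(\mathbf{X} \mid \mathbf{R}, \mathbf{P}, \Omega_0) = -NK \log \pi - K \log \det(\mathbf{R}) - \mathrm{tr}\left( \mathbf{R}^{-1} \mathbf{X}_1 \mathbf{X}_1^\dagger \right) - \mathrm{tr}\left( \mathbf{R}^{-1} (\mathbf{X}_2 - \mathbf{P}) (\mathbf{X}_2 - \mathbf{P})^\dagger \right), \qquad (4)$$

where $\mathbf{X}_1$ and $\mathbf{X}_2$ are the matrices containing the homogeneous observations $\{\mathbf{x}_i, i \in \Omega - \Omega_0\}$ and the outlier-contaminated observations $\{\mathbf{x}_i, i \in \Omega_0\}$, respectively, and $\mathbf{P} = [\mathbf{p}_{i_1}, \ldots, \mathbf{p}_{i_M}]$ is the matrix containing the outlier vectors.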
The censoring procedure is aimed at determining the outlier subset $\Omega_0$, so as to excise its elements from the secondary data. Given $M$, the ML estimate of $\Omega_0$ is obtained by solving the following optimization problem:

$$\hat{\Omega}_0 = \arg\max_{\Omega_0 :\, |\Omega_0| = M} \; \max_{\mathbf{R}, \mathbf{P}} \; \log f(\mathbf{X} \mid \mathbf{R}, \mathbf{P}, \Omega_0). \qquad (5)$$

The external maximization over $\Omega_0$ is equivalent to choosing $M$ indices among $K$ possible choices. Namely, the exact solution to $\arg\max_{\Omega_0}(\cdot)$ is the optimal set among the $\binom{K}{M}$ cardinality-$M$ subsets of $\Omega$, that is, the one that maximizes the inner argument. This is a combinatorial problem with an extremely high computational cost, particularly for large $K$. Hence, approximate approaches are of interest to come up with practically implementable solutions. In this respect, a significant reduction in computational complexity can be obtained via the AML method proposed in [31]. However, since the number of outliers $M$ is unknown, it is necessary to extend the method in [31]. To solve this problem, let us consider a predefined upper bound $|\Omega_0|_{\max}$ for the unknown parameter $M$, that is to say

$$M \in \{0, 1, \ldots, |\Omega_0|_{\max}\}. \qquad (6)$$

The idea is to jointly select the outlier indices and the cardinality of the set $\Omega_0$. To this end, we use the AML approach to estimate the outlier subset in combination with MOS criteria [34]. Specifically, for each $0 \le M \le |\Omega_0|_{\max}$, the most likely outlier subset is determined utilizing the AML method. Then, among the AML-estimated outlier subsets with cardinality in the range $0 \le M \le |\Omega_0|_{\max}$, the subset that attains the minimum MOS metric is selected. In what follows, this procedure is described in detail.
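To give a feeling for the size of the search space, the short sketch below counts the candidate subsets an exhaustive ML search would have to score. The function name `ml_search_space` is hypothetical, and the values K = 50 and an upper bound of 6 are simply the ones used later in the simulations.

```python
from math import comb


def ml_search_space(K: int, max_outliers: int) -> int:
    """Number of candidate outlier subsets visited by an exhaustive search
    when every order M = 0, ..., max_outliers has to be tried."""
    return sum(comb(K, M) for M in range(max_outliers + 1))


# Subsets of a single order (M = 4) and of all orders up to 6, for K = 50.
single_order = comb(50, 4)
all_orders = ml_search_space(50, 6)
print(single_order, all_orders)
```

Each candidate also requires an SCM determinant, which is why the approximate search is attractive even for moderate K.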

| PROPOSED CENSORING METHOD
For the problem at hand, the number of unknown outliers has to be determined. In this respect, we aim at optimizing general MOS criteria whose metrics are defined via the maximum of the likelihood found for the outlier selection problem. In this section, a joint outlier censoring approach based on the AML and MOS methods is presented.

| Model order selection description
In this subsection, different MOS criteria are discussed with emphasis on their use to establish the size of the outlier set $\Omega_0$. The different criteria are based on the minimization of a specific objective function over the unknown model order $M$. The objective is the sum of the NLLF and a penalty term that characterizes each MOS criterion [34]. The penalty term is a monotonically increasing function of the model order parameter $M$ and is chosen so that the algorithm is able to determine the correct model order [37].
Let us consider a general order selection problem with a $p$-dimensional observation vector $\mathbf{y}$ that depends on an $n$-dimensional real-valued unknown parameter vector $\boldsymbol{\theta}_n$, where $n = 1, \ldots, M$. Denoting by $\hat{\boldsymbol{\theta}}_n$ the ML estimate of the parameter vector $\boldsymbol{\theta}_n$, the order-selection rules considered in this analysis share the common form

$$\ell_\alpha^n = -2 \log f(\mathbf{y} \mid n; \hat{\boldsymbol{\theta}}_n) + n\, \eta_\alpha(n, p), \qquad (8)$$

where $\alpha$ indicates a specific method and the corresponding penalty term $\eta_\alpha(n, p)$ is given in Table 1.
Notice that the order selection rule $\alpha$ chooses the order $n$ that minimizes $\ell_\alpha^n$. It is worth noting that the Akaike information criterion (AIC) is founded on information-theoretic arguments [38]. Besides, extensive simulations and studies (see, e.g. [39]) have empirically revealed that the generalized information criterion (GIC) with the scalar parameter $\nu > 2$ may outperform AIC under various performance measures. Precisely, depending on the scenario and on the size of the available data, values of $\nu \in [2, 6]$ usually guarantee better performance than AIC [34]. Therefore, in our analysis, the AIC is not discussed further as a separate method, and a corrected version of AIC, called AICc, is exploited (see [40], [39] for more details). Furthermore, the Bayesian information criterion (BIC) rule selects the order according to a fully Bayesian approach (see [41], [42] for more details and examples). Additionally, in the statistical literature, the Hannan-Quinn information criterion (HQC) [43] is often used for model selection as an alternative to AIC and BIC. Finally, the method of exponentially embedded families (EEFs) [44], [45] is also addressed. In particular, the EEF extends the generalized likelihood ratio test to the case of multiple alternative hypotheses with differing numbers of unknown parameters and is found to be superior for cases of practical interest. In the next subsection, the joint use of Equation (8) and the AML method is discussed for the task of censoring an unknown number of outliers.

TABLE 1 Order selection criteria penalties
EEF: $\approx \log\left( -2 \log f(\mathbf{y} \mid n; \hat{\boldsymbol{\theta}}_n) / n \right)$
Abbreviations: AIC, Akaike information criterion; AICc, corrected version of AIC; BIC, Bayesian information criterion; EEF, exponentially embedded family; GIC, generalized information criterion; HQC, Hannan-Quinn information criterion.
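As an illustration of how such rules are evaluated, the following sketch codes the penalties in their usual textbook forms (AIC: 2, AICc: 2p/(p − n − 1), GIC: ν, BIC: log p, HQC: 2 log log p) and adds them to twice the NLLF as n times the penalty. These forms and the "n times penalty" structure are assumptions taken from the general MOS literature, not a transcription of Table 1; the function names `mos_penalty` and `mos_criterion` are hypothetical, and the EEF rule is left out because its penalty depends on the likelihood value itself.

```python
import math


def mos_penalty(rule: str, n: int, p: int, nu: float = 4.0) -> float:
    """Per-parameter penalty eta(n, p) in its usual textbook form (assumed)."""
    if rule == "AIC":
        return 2.0
    if rule == "AICc":
        return 2.0 * p / (p - n - 1)
    if rule == "GIC":
        return nu
    if rule == "BIC":
        return math.log(p)
    if rule == "HQC":
        return 2.0 * math.log(math.log(p))
    raise ValueError(f"unknown rule: {rule}")


def mos_criterion(two_nllf: float, n: int, p: int, rule: str, nu: float = 4.0) -> float:
    """Twice the negative log-likelihood plus n times the penalty."""
    return two_nllf + n * mos_penalty(rule, n, p, nu)


# Example: the same fit scored by two rules; BIC penalizes more when p is large.
print(mos_criterion(100.0, 10, 800, "GIC"), mos_criterion(100.0, 10, 800, "BIC"))
```

The order with the smallest criterion value is the one the rule selects.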

| AML method
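To find the maximum of the likelihood function required in Equation (8), let us first maximize Equation (4) over $\mathbf{P}$. The maximum is attained at $\hat{\mathbf{P}} = \mathbf{X}_2$, which gives

$$\max_{\mathbf{P}} \log f(\mathbf{X} \mid \mathbf{R}, \mathbf{P}, \Omega_0) = -NK \log \pi - K \log \det(\mathbf{R}) - \mathrm{tr}\left( \mathbf{R}^{-1} \mathbf{X}_1 \mathbf{X}_1^\dagger \right). \qquad (9)$$

Moreover, optimizing Equation (9) over $\mathbf{R}$ yields

$$\max_{\mathbf{R}, \mathbf{P}} \log f(\mathbf{X} \mid \mathbf{R}, \mathbf{P}, \Omega_0) = -NK (1 + \log \pi) - K \log \det(\hat{\mathbf{R}}), \qquad (10)$$

where $\hat{\mathbf{R}} = \mathbf{X}_1 \mathbf{X}_1^\dagger / K$ is the scaled SCM computed from $\mathbf{X}_1$. Hence, the ML estimate of $\Omega_0$ for a given outlier order $M$ is the solution to

$$\hat{\Omega}_0(M) = \arg\min_{\Omega_0 :\, |\Omega_0| = M} \det\left( \mathbf{X}_1 \mathbf{X}_1^\dagger \right). \qquad (11)$$

According to Equation (11), the problem reduces to finding the subset $\Omega_0$ of $\Omega$ whose complement yields the minimum SCM determinant. To approximately solve Equation (11), the iterative approximation algorithm discussed in [31] is exploited. Particularly, according to Theorem 1 in [31], given a starting subset $H_1$ with cardinality $h$ and the corresponding SCM $\mathbf{S}_1$, it is possible to obtain a subset $H_2$ whose SCM has a lower determinant. Setting $h = K - M$ and exploiting this construction procedure, which is referred to in the following as the Concentration-step (C-Step), it is possible to obtain an iterative algorithm that produces a non-increasing sequence of SCM determinants (i.e. of the objective in Equation (11)). Specifically, each iteration of the C-Step, given the cardinality-$h$ subset $H_1$ and the corresponding SCM $\mathbf{S}_1$, suggests the subset $H_2$ with corresponding SCM $\mathbf{S}_2$, where $\det(\mathbf{S}_2) \le \det(\mathbf{S}_1)$. Algorithm 1 describes the steps of the C-Step procedure.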

Algorithm 1 C-Step algorithm
Due to the finite search space for cardinality-$h$ subsets, the iterations must converge. As to the stopping criterion, the iterations can be stopped when $\det(\mathbf{S}_2) = \det(\mathbf{S}_1)$ or when a maximum number of iterations, $N_c$, is reached. It is important to mention that convergence to the global optimum is not guaranteed for this algorithm. However, it is possible to start the algorithm many times, say $N_i$, from different randomly selected initial subsets, run the C-Step until convergence, and then choose the SCM with the minimum determinant among the results. In this way, the result is more likely to reach the global optimum. Algorithm 2 describes the steps of the AML method. Following the AML algorithm, it is straightforward to determine the subset $\hat{\Omega}_0(M)$. In the next part, the joint order selection and outlier censoring procedure is described.

Algorithm 2 AML method
Ensure: An AML estimate for $\Omega_0$.
1: Select $N_i$ random cardinality-$h$ subsets from $\Omega$;
2: Run C-Step for each input subset until convergence in order to obtain $N_i$ candidates for $\Omega_0$;
3: Among the candidates, report the one that attains the lowest SCM determinant as the output.
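A minimal NumPy sketch of the two procedures is given below. It assumes the C-Step takes the classical concentration form (recompute the quadratic forms $\mathbf{x}_i^\dagger \mathbf{S}_1^{-1} \mathbf{x}_i$ for all samples and keep the $h$ smallest), which is an interpretation of the description above rather than a transcription of Algorithm 1 in [31]. The names `scm`, `c_step`, and `aml` are hypothetical, and the subset size is assumed to be at least N so that the SCM is invertible.

```python
import numpy as np


def scm(X, idx, K):
    """Scaled sample covariance matrix of the columns of X listed in idx."""
    Xs = X[:, idx]
    return Xs @ Xs.conj().T / K


def c_step(X, subset):
    """One concentration step: keep the h samples with the smallest
    quadratic forms x_i^H S1^{-1} x_i, where S1 is the SCM of `subset`."""
    K = X.shape[1]
    S1 = scm(X, subset, K)
    d = np.real(np.einsum("ik,ik->k", X.conj(), np.linalg.solve(S1, X)))
    return np.sort(np.argsort(d)[: len(subset)])


def aml(X, M, n_init=40, n_iter=5, rng=None):
    """Approximate ML outlier subset of size M via random restarts of the C-Step."""
    rng = np.random.default_rng(rng)
    K = X.shape[1]
    h = K - M
    best_subset, best_logdet = None, np.inf
    for _ in range(n_init):
        subset = np.sort(rng.choice(K, size=h, replace=False))
        for _ in range(n_iter):
            new_subset = c_step(X, subset)
            if np.array_equal(new_subset, subset):
                break  # fixed point reached: the determinant can no longer decrease
            subset = new_subset
        logdet = np.linalg.slogdet(scm(X, subset, K))[1]
        if logdet < best_logdet:
            best_subset, best_logdet = subset, logdet
    outliers = np.setdiff1d(np.arange(K), best_subset)
    return outliers, best_logdet
```

Working with the log-determinant instead of the determinant avoids overflow for large N and leaves the ranking of the candidates unchanged.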

| Joint AML-MOS procedure
Leveraging the approximate maximum of the LLF in Equation (8), the MOS criterion for the outlier estimation problem is expressed as

$$\ell_\alpha^M = 2K \log \det(\hat{\mathbf{R}}_M) + 2NK (1 + \log \pi) + n\, \eta_\alpha(n, p), \qquad (12)$$

where $\hat{\mathbf{R}}_M$ is the scaled SCM of the samples outside the AML estimate $\hat{\Omega}_0(M)$, and the number of real-valued unknown parameters in the matrices $\mathbf{P}$ and $\mathbf{R}$ is chosen as

$$n = 2NM + N^2. \qquad (13)$$

Moreover, the number of real-valued available data in the matrix $\mathbf{X}$ is given by $p = 2NK$.
For each penalty method $\alpha$ defined in Table 1, the estimated order of the outlier set is obtained by taking the minimum over all possible values of $M$, viz.

$$\hat{M}_\alpha = \arg\min_{M \in \{0, \ldots, |\Omega_0|_{\max}\}} \ell_\alpha^M. \qquad (14)$$

Once the order $\hat{M}_\alpha$ is determined, the corresponding AML estimate of the outlier subset, $\hat{\Omega}_0(\hat{M}_\alpha)$, is chosen for censoring (following Equation (11)), and the estimated covariance matrix is derived from the resulting outlier-free secondary data. Algorithm 3 describes the steps of the JAM outlier censoring procedure.

Algorithm 3 JAM algorithm for outlier censoring
Require: $\mathbf{X}$, $|\Omega_0|_{\max}$, penalty $\alpha$.
Ensure: An optimal selection for $\Omega_0$.

It is worth noting that, for the considered application, the ratio between the number of parameters, $n$, and the number of samples, $p$, approaches zero as the number of secondary data, $K$, increases. This property, called the large-samples assumption, is important for some of the discussed MOS criteria. However, this situation might not be realizable in practical experiments, with the consequence that the large-samples assumption would no longer be valid [46]. Hence, it is worthwhile investigating the considered MOS rules to determine which one performs better than the others (to be discussed in the numerical analysis section).
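The order-selection step of the procedure can be sketched as follows, taking as input the log-determinants of the estimated covariance matrices already computed for each candidate order. The sketch assumes that twice the NLLF equals $2K \log\det(\hat{\mathbf{R}}_M)$ plus a constant, that the parameter count is $n = N^2 + 2NM$ (one Hermitian $N \times N$ matrix and $M$ complex outlier vectors), and that $p = 2NK$; the function name `select_order` and the penalty callable `eta` are hypothetical.

```python
import math


def select_order(logdets, N, K, eta):
    """Return the order M minimizing the MOS criterion, plus all criterion values.

    logdets[M] is log det of the estimated covariance for outlier order M;
    eta(n, p) is the per-parameter penalty of the chosen rule."""
    p = 2 * N * K
    const = 2 * N * K * (1.0 + math.log(math.pi))  # same for every order
    criteria = []
    for M, logdet in enumerate(logdets):
        n = N * N + 2 * N * M  # real parameters in R and in the M outlier vectors
        criteria.append(2 * K * logdet + const + n * eta(n, p))
    best_M = min(range(len(criteria)), key=criteria.__getitem__)
    return best_M, criteria


# Toy example: removing one sample lowers log det a lot, a second one barely.
best, values = select_order([5.0, 3.0, 2.9], N=2, K=10, eta=lambda n, p: 2.0)
print(best)
```

The constant term does not affect the selected order and is kept only so that the values match twice the NLLF plus the penalty.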

| DISCUSSIONS
In this section, the computational complexity of the proposed JAM method is addressed. To this end, an efficient implementation of the JAM method is proposed. Moreover, the computational burden of the ML, JAM, and EJAM methods is analysed and compared. Finally, to give a benchmark for the performance analysis of the JAM, a heuristic solution to the problem is described.

| Efficient implementation
To reduce the computational burden of the JAM method, an efficient implementation (called EJAM) is proposed. Recall that the JAM consists of executing the C-Step algorithm for $N_i$ random initial subsets, followed by the MOS criteria calculation, for each $M \in \{0, \ldots, |\Omega_0|_{\max}\}$. The idea behind EJAM is the following:
• Judiciously choose the initial point for the C-Step so as to reduce the number of random initial subsets $N_i$;
• Run AML once for $M = |\Omega_0|_{\max}$ to obtain the optimal LLF, and for $M = 0, \ldots, |\Omega_0|_{\max} - 1$ find the corresponding LLF values in a recursive manner;
• Use the efficient formula for the LLF calculation of a rank-one updated covariance matrix.
In the AML method, the initial subset is randomly drawn $N_i$ times, which makes the overall computational complexity of AML $N_i$ times that of the C-Step algorithm. To remove the randomization step, an appropriate initial subset selection is necessary. The idea is to order the samples in $\mathbf{X}$ according to their GIP computed via the SCM $\mathbf{S} = \mathbf{X}\mathbf{X}^\dagger / K$ and to take the $K - M$ samples with the smallest GIP values as the initial guess for the homogeneous subset, since outliers tend to exhibit the largest GIP values. Let us compute the GIP values corresponding to $\mathbf{S}$, that is,

$$e_j = \mathbf{x}_j^\dagger \mathbf{S}^{-1} \mathbf{x}_j, \quad j = 1, \ldots, K, \qquad (15)$$

with the ordered sequence

$$e_{(1)} \le e_{(2)} \le \cdots \le e_{(K)}. \qquad (16)$$

The initial subset for the C-Step is chosen as the set of indices of the $K - M$ smallest GIP values,

$$H_1 = \{(1), (2), \ldots, (K - M)\}. \qquad (17)$$

It is shown in the performance analysis part that such a selection leads to results of comparable quality with no randomization.

Footnote: To make the plots comparable, a bias is added to the EEF plot, which is denoted as its modified version.
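Moreover, to run AML recursively, start by running the algorithm for $M = |\Omega_0|_{\max}$. Perform AML to obtain the estimated covariance matrix $\hat{\mathbf{R}}_{|\Omega_0|_{\max}}$, whose determinant gives the LLF used in the MOS criteria. Additionally, construct the GIP values corresponding to the SCM $\hat{\mathbf{R}}_{|\Omega_0|_{\max}}$ as

$$\hat{e}_j = \mathbf{x}_j^\dagger \hat{\mathbf{R}}_{|\Omega_0|_{\max}}^{-1} \mathbf{x}_j, \quad j = 1, \ldots, K, \qquad (18)$$

with the ordered sequence

$$\hat{e}_{(1)} \le \hat{e}_{(2)} \le \cdots \le \hat{e}_{(K)}. \qquad (19)$$

According to the AML method, the estimated outlier subset for $M = |\Omega_0|_{\max}$ collects the indices of the $|\Omega_0|_{\max}$ largest values in Equation (19):

$$\hat{\Omega}_0(|\Omega_0|_{\max}) = \{(K - |\Omega_0|_{\max} + 1), \ldots, (K)\}. \qquad (20)$$

To complete the procedure, for $M \in \{0, \ldots, |\Omega_0|_{\max} - 1\}$, the idea is to choose the estimated outlier subset as

$$\hat{\Omega}_0(M) = \{(K - M + 1), \ldots, (K)\}. \qquad (21)$$

Here, it is approximately assumed that the order of the GIPs in Equation (19) remains the same for lower values of $M$, which makes it sufficient to run the AML method only for $M = |\Omega_0|_{\max}$. In this respect, the estimated covariance matrix for any $M < |\Omega_0|_{\max}$ is recursively derived by adding back one sample at a time:

$$\hat{\mathbf{R}}_M = \hat{\mathbf{R}}_{M+1} + \frac{1}{K} \mathbf{x}_{(K-M)} \mathbf{x}_{(K-M)}^\dagger. \qquad (22)$$

Furthermore, to compute the determinant for the MOS criterion in Equation (12), the recursion in Equation (22) leads to the efficient formula for the LLF calculation of a rank-one updated covariance matrix [47]. Namely, the updated LLF term is expressed as

$$\log \det(\hat{\mathbf{R}}_M) = \log \det(\hat{\mathbf{R}}_{M+1}) + \log\left( 1 + \frac{1}{K} \mathbf{x}_{(K-M)}^\dagger \hat{\mathbf{R}}_{M+1}^{-1} \mathbf{x}_{(K-M)} \right), \qquad (23)$$

followed by the recursive inverse matrix update

$$\hat{\mathbf{R}}_M^{-1} = \hat{\mathbf{R}}_{M+1}^{-1} - \frac{\hat{\mathbf{R}}_{M+1}^{-1} \mathbf{x}_{(K-M)} \mathbf{x}_{(K-M)}^\dagger \hat{\mathbf{R}}_{M+1}^{-1}}{K + \mathbf{x}_{(K-M)}^\dagger \hat{\mathbf{R}}_{M+1}^{-1} \mathbf{x}_{(K-M)}}. \qquad (24)$$

Summarizing, the MOS criteria are derived in a recursive manner for $0 \le M \le |\Omega_0|_{\max}$, starting from $\hat{\mathbf{R}}_{|\Omega_0|_{\max}}$. Algorithm 4 describes the steps of the EJAM method. Throughout the simulations, the performance and speed of the EJAM method are compared with those of JAM.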
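The rank-one recursion can be illustrated with the matrix determinant lemma and the Sherman-Morrison identity, both standard results. The sketch below assumes the covariance estimate is scaled by 1/K, so that re-admitting a sample x adds x x^H / K; the function names `add_sample` and `recursive_logdets` are hypothetical and this is not a transcription of Algorithm 4.

```python
import numpy as np


def add_sample(R_inv, logdet, x, K):
    """Update inverse and log-determinant when R <- R + x x^H / K."""
    v = R_inv @ x
    q = np.real(np.vdot(x, v)) / K                 # x^H R^{-1} x / K
    new_logdet = logdet + np.log1p(q)              # matrix determinant lemma
    new_inv = R_inv - np.outer(v, v.conj()) / (K * (1.0 + q))  # Sherman-Morrison
    return new_inv, new_logdet


def recursive_logdets(X, R_max, readd_order):
    """Log-determinants for orders M = 0, ..., max_M, obtained by re-admitting
    the censored samples one at a time in the order given by readd_order."""
    K = X.shape[1]
    R_inv = np.linalg.inv(R_max)
    logdet = np.linalg.slogdet(R_max)[1]
    values = [logdet]                              # order M = max_M
    for j in readd_order:
        R_inv, logdet = add_sample(R_inv, logdet, X[:, j], K)
        values.append(logdet)
    return values[::-1]                            # index M = 0, ..., max_M
```

Each update costs a matrix-vector product and an outer product, that is, on the order of N^2 operations, instead of the N^3 needed to recompute a determinant from scratch.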

| Computational complexity
The proposed JAM can provide a significant computational complexity reduction as compared with the exhaustive search. Specifically, the JAM algorithm consists of $|\Omega_0|_{\max}$ steps, and in each step, the AML method is performed. The overall computational complexity of the AML method is approximately $O(N_i N_c K N^2)$ floating point operations (FLOPs) [31]. Consequently, the computational burden of the JAM method is $O(|\Omega_0|_{\max} N_i N_c K N^2)$ plus the complexity required to compute the determinant of the SCM for each $M$, which is $O(|\Omega_0|_{\max} N^3)$. By contrast, the exact ML approach has to evaluate all $\binom{K}{M}$ candidate subsets for each order $M$, so its cost grows combinatorially with $K$. Furthermore, the EJAM implementation is computationally more efficient than JAM. Specifically, the procedure consists of running the AML step once for $M = |\Omega_0|_{\max}$, with computational cost $O(N_c K N^2)$, and, for each value of $0 \le M < |\Omega_0|_{\max}$, of evaluating Equations (23) and (24), with overall computational complexity $O(|\Omega_0|_{\max} N^2)$. Hence, the computational burden of the EJAM is approximately $O(N_c K N^2)$, which is clearly more efficient.

| BWP method
To assess the performance of the proposed method, a heuristic outlier censoring approach based on box plot, also known as the BWP [33], is presented as a benchmark. The BWP method represents both the summary statistics and the distribution of the primary data, which enables visualization of the minimum, lower quartile, median, upper quartile, and maximum of any data set. The plot contains a box plot-like graph which defines a range bar to show the median and inter-quartile range. Moreover, the whiskers outside the box allow for identification of outliers in the data set.
Let us consider the GIP values, $e_j$, given in Equation (15), which are derived from the total-sample SCM, $\mathbf{S}$. Since heterogeneous components (outliers) statistically exhibit higher GIP values than homogeneous components [31], the idea is to analyse the BWP of the GIP values to identify the outliers. Notice that this method approximates the homogeneous SCM by the total-sample SCM, $\mathbf{S}$. In Figure 3, the BWP analysis of the GIP values versus the number of outliers $M$ is given for a special case study. Notice that the plus-sign points indicate the outliers determined utilizing the BWP.
This simple method is further assessed in the numerical analysis section.
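A sketch of such a heuristic is shown below. It assumes the conventional box-plot rule, in which a point is flagged when it lies above the upper quartile by more than 1.5 times the inter-quartile range; the text does not state the whisker length used in the study, so the factor 1.5 and the names `gip` and `bwp_outliers` are assumptions.

```python
import numpy as np


def gip(X):
    """Generalized inner products x_j^H S^{-1} x_j with S = X X^H / K."""
    K = X.shape[1]
    S = X @ X.conj().T / K
    return np.real(np.einsum("ik,ik->k", X.conj(), np.linalg.solve(S, X)))


def bwp_outliers(values, whisker=1.5):
    """Indices of values above the upper whisker of a box plot (assumed rule)."""
    q1, q3 = np.percentile(values, [25, 75])
    upper = q3 + whisker * (q3 - q1)
    return np.flatnonzero(values > upper)


# Toy example on plain numbers: only the last value exceeds the upper whisker.
print(bwp_outliers(np.array([1.0, 2.0, 3.0, 4.0, 5.0, 100.0])))
```

Only the upper whisker is used here because outliers are expected to inflate the GIP, not to reduce it.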

| NUMERICAL ANALYSIS
In this section, the performance of the different proposed outlier selection methods is analysed for a variety of MOS metrics and is also compared with the BWP method. To assess the effectiveness of the proposed outlier identification algorithms, four benchmark measures are defined:
• The mean masking probability (fraction of undetected true outliers), $P_m = P\{\Omega_0 - \hat{\Omega}_0\}$;
• The mean swamping probability (fraction of homogeneous samples labelled as outliers), $P_s = P\{\hat{\Omega}_0 - \Omega_0\}$;
• The correct outlier detection probability (fraction of simulations with zero masking), $P_c = P\{\Omega_0 \subseteq \hat{\Omega}_0\}$;
• The mean jointly correct detection probability (fraction of simulations with no errors), $P_d = P\{\hat{\Omega}_0 = \Omega_0\}$.
In the process of outlier detection, masking is more important than swamping. The former can cause gross distortions, whereas the latter is often just a matter of lost efficiency. Hence, in this analysis, we focus on P m rather than P s . Moreover, it is worth noting that the jointly correct detection event is a subset of the correct outlier detection, and as a result, we always have P d ≤ P c .
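The four measures can be estimated from Monte Carlo trials as sketched below. The normalizations are assumptions read off the verbal definitions (masking is divided by the number of true outliers, swamping by the number of homogeneous samples), and the function name `censoring_metrics` is hypothetical.

```python
def censoring_metrics(true_sets, est_sets, K):
    """Monte Carlo estimates of (P_m, P_s, P_c, P_d) over paired trials."""
    trials = len(true_sets)
    p_m = p_s = p_c = p_d = 0.0
    for truth, est in zip(true_sets, est_sets):
        truth, est = set(truth), set(est)
        missed = truth - est          # true outliers left in the data (masking)
        false_alarms = est - truth    # homogeneous samples censored (swamping)
        p_m += len(missed) / len(truth) if truth else 0.0
        p_s += len(false_alarms) / (K - len(truth))
        p_c += float(not missed)
        p_d += float(not missed and not false_alarms)
    return p_m / trials, p_s / trials, p_c / trials, p_d / trials


# Three toy trials with true outliers {1, 2} among K = 10 samples:
# an exact hit, one swamped sample, and one masked outlier.
print(censoring_metrics([{1, 2}] * 3, [{1, 2}, {1, 2, 3}, {1}], K=10))
```

In the toy example the second trial counts towards the correct outlier detection probability but not towards the jointly correct one, which illustrates why the latter can never exceed the former.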
In the NSIR expression, $\mathbf{s}$ is the steering vector corresponding to the tested target with the normalized Doppler frequency $f_t$, and $\hat{\mathbf{w}}$ is the estimated normalized weight vector for the estimated outlier subset $\hat{\Omega}_0$. Owing to the lack of closed-form expressions for the aforementioned metrics, the analysis is performed by resorting to Monte Carlo (MC) simulation based on 1000 independent trials.

[Figure 3: BWP analysis of the GIP values versus the number of outliers $M$. The red plus signs denote the outliers. BWP, box-and-whisker plot; CNR, clutter-to-noise ratio; GIP, generalized inner product]
We consider a Doppler processing scenario (the same results can be obtained for an array processing setup). Moreover, throughout the simulations, the secondary data are modelled as zero-mean complex Gaussian vectors with the covariance matrix as in [48,49]. Precisely, the covariance matrix $\mathbf{R}$ consists of two terms: a scaled identity matrix accounting for the noise variance and a coloured matrix representing the clutter covariance matrix, $\mathbf{R}_c$, viz.

$$\mathbf{R} = \sigma^2 \mathbf{I} + \mathbf{R}_c,$$

where $\sigma^2$ is the additive white Gaussian noise power level. In what follows, without loss of generality, $\sigma^2$ is assumed to be equal to 0 dB. As to the clutter, $\mathbf{R}_c$ is assumed exponentially shaped, based on the model in [48,49], that is,

$$\mathbf{R}_c(m, n) = \sigma_c^2\, \rho^{|m - n|}\, e^{\,j 2 \pi f_c (m - n)}, \quad m, n = 1, \ldots, N,$$

where $\sigma_c^2$ is the clutter power, $\rho$ denotes the one-lag clutter correlation coefficient, and $f_c$ is the normalized clutter Doppler frequency. In the sequel, unless otherwise stated, these parameters are set to $\sigma_c^2 = 20$ dB, $\rho = 0.95$, and $f_c = 0.05$. As to the outliers, $M$ outlier vectors are added to randomly selected secondary data. The $i$th outlier is modelled as a Doppler steering vector at the normalized Doppler frequency $f_{o,i}$ scaled by the complex amplitude $\alpha_i$. It is assumed that the outliers share the same power, so that a single outlier-to-noise ratio (ONR), denoted by $\sigma_o^2$, characterizes all of them. For the Doppler frequencies $f_{o,i}$, the same fixed value is assumed for all the outliers, that is, $f_{o,i} = 0.15$ for $i = 1, \ldots, M$. Notice that such a situation can occur when an outlying moving target is present in more than one range cell, that is, an extended range target.
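A data generator in the spirit of this setup might look as follows. Several details are assumptions made for illustration only: the exponentially shaped clutter entries $\sigma_c^2 \rho^{|m-n|} e^{j 2\pi f_c (m-n)}$, an unnormalized steering vector with entries $e^{j 2\pi f k}$, outlier amplitudes of modulus $\sqrt{\mathrm{ONR}}$ with uniformly random phase, and unit noise power. The function names are hypothetical.

```python
import numpy as np


def clutter_plus_noise_cov(N, cnr_db=20.0, rho=0.95, f_c=0.05, noise_power=1.0):
    """Noise-plus-clutter covariance with exponentially shaped clutter (assumed form)."""
    diff = np.arange(N)[:, None] - np.arange(N)[None, :]
    Rc = 10 ** (cnr_db / 10) * rho ** np.abs(diff) * np.exp(2j * np.pi * f_c * diff)
    return noise_power * np.eye(N) + Rc


def steering(N, f):
    """Doppler steering vector at normalized frequency f (unnormalized, assumed)."""
    return np.exp(2j * np.pi * f * np.arange(N))


def simulate(N, K, M, onr_db, f_o=0.15, rng=None):
    """Secondary data X (N x K) with M outliers at random cells; returns X, indices, R."""
    rng = np.random.default_rng(rng)
    R = clutter_plus_noise_cov(N)
    L = np.linalg.cholesky(R)
    W = (rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))) / np.sqrt(2)
    X = L @ W                                        # columns distributed as CN(0, R)
    idx = np.sort(rng.choice(K, size=M, replace=False))
    amp = 10 ** (onr_db / 20) * np.exp(2j * np.pi * rng.random(M))
    X[:, idx] += steering(N, f_o)[:, None] * amp[None, :]
    return X, idx, R
```

For instance, `simulate(8, 50, 4, 30.0, rng=0)` produces one trial with the dimensions used in the first experiment.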
As to the AML method, unless otherwise stated, its parameters are set to $N_c = 5$ and $N_i = 40$. Finally, the clutter-to-noise ratio (CNR) is defined as $\sigma_c^2 / \sigma^2 = \sigma_c^2$. As to the censoring schemes, the two proposed outlier detection approaches, that is, JAM and EJAM, are analysed. Moreover, for each of these approaches, the MOS penalties BIC, AICc, GIC (for $\nu \in \{4, 5, 6\}$ with corresponding labels GIC, GIC5, and GIC6), HQC, and EEF are reported. Furthermore, as a benchmark for the performance of the proposed methods, the detection approach that assumes an a priori known $M$ is used with the label AML. Besides, the heuristic BWP approach is included for comparison.

| P c analysis for JAM
In this section, the performance of the proposed JAM outlier censoring scheme is analysed in terms of $P_c$. Figure 4 depicts $P_c$ versus $\sigma_o^2$ for a scenario with the parameters $K = 50$, $N = 8$, $M = 4$, and $|\Omega_0|_{\max} = 6$. The plot highlights that the performance of the censoring methods improves as $\sigma_o^2$ increases. Additionally, the performance of the BWP method is not satisfactory, and its $P_c$ saturates at values around 0.3 for $\sigma_o^2$ above 30 dB. As to the order selection criteria, the AICc method performs better than the other criteria and has approximately the same performance as the upper bound. Moreover, the results also indicate the superiority of the GIC with $\nu = 4$ over the GIC curves with $\nu = 5, 6$. In sum, a performance gap of about 12 dB appears between the best-performing (AICc) and the worst-performing (BIC) MOS criteria.
In Figure 5, the same analysis is conducted for K = 50 and different values of M = 1, …, 6. As the plots highlight, the performance of the censoring methods deteriorates as M increases. Otherwise stated, increasing the number of outliers requires a higher ONR to achieve the same P c . Moreover, the performance gap between the curves increases as M increases. It is interesting to note that the BWP approach has a reasonable performance for M = 1, 2, but it fails when the number of outliers increases.
Additionally, Figure 6 depicts $P_c$ averaged over $M$ uniformly distributed on the set $\{1, 2, \ldots, |\Omega_0|_{\max}\}$ versus $\sigma_o^2$, with $|\Omega_0|_{\max} = 6$. The plots recommend AICc as the MOS penalty for the outlier censoring problem.

| Censoring approaches comparison
In this section, the performance of the proposed outlier censoring schemes is analysed in terms of $P_c$, $P_m$, and $P_d$. Figure 7 depicts $P_c$ for the two approaches, JAM and EJAM. As the plots highlight, EJAM attains the same performance as JAM. Interestingly, this occurs while EJAM enjoys a significant computational complexity reduction as compared with JAM (to be discussed shortly). Moreover, among the MOS penalties, the corrected AIC metric outperforms all the other criteria. Figure 8 depicts the $P_m$ curves versus ONR. As is also observed in these plots, the JAM curves outperform the other detection approaches, and among the penalties, AICc performs best. Again, EJAM has the same performance as JAM. Figure 9 illustrates the $P_d$ curves versus ONR. As discussed before, the $P_d$ metric is stricter than $P_c$, which is reflected in the plots. As can be seen, AICc loses its lead among the penalties. It is observed that the penalties with better performance at lower ONRs have lower performance at higher ONRs. The AML approach with known $M$ outperforms the other competitors. As also observed in these plots, the JAM approach achieves higher $P_d$ values than the other detection approaches.

| Data length analysis
This section is devoted to the performance analysis of the proposed method as a function of the number of secondary data samples $K$. Figure 10 shows the behaviour of $P_c$ versus $K$ for the proposed censoring method obtained using different MOS penalties. For the same situation, Figure 11 depicts the NSIR versus $K$ for ONR values from 20 dB to 35 dB. As the plot highlights, the NSIR improves as $K$ increases.

| Speed analysis
This subsection is devoted to the speed analysis of the proposed outlier censoring methods. The analysis is performed based on the CPU time in seconds. The reported values are obtained on a standard PC with an Intel® Core™ i7-10510U CPU @ 1.80 GHz and 16 GB of installed memory (RAM). Figure 12 shows the elapsed time in seconds versus the number of data samples $K$. Moreover, Figure 13 gives the elapsed time versus the number of sensors $N$. As discussed in Section 5.2, the processing time of the proposed method increases with $K$ and/or $N$. It is interesting to highlight that the EJAM method executes 10 times faster than the other methods. As shown in the previous plots, EJAM has nearly the same performance as JAM.

| CNR analysis
This section is devoted to the analysis of the effect of the CNR, $\sigma_c^2$. Figure 14 depicts the $P_c$ curves versus CNR for a scenario with an ONR of 20 dB. As expected, increasing the CNR lowers the performance of the outlier censoring approach.
Moreover, the superior performance of the AICc penalty as well as the JAM approach is highlighted.

| Outlier number analysis
In this section, the effect of ONR, σ 2 o is assessed on the performance of the proposed censoring methods versus the number of outliers M. Figure 15 illustrates the P c curves versus M for a scenario with different ONRs from 20 dB to 35 dB. As expected, increasing the ONR improves the performance of the outlier censoring approaches. Moreover, the plots show the superior performance of the EEF and AICc metrics as compared with other MOS penalties.

| CONCLUSIONS
An effective procedure for joint outlier order selection and ML censoring has been proposed. Specifically, to overcome the problem of an unknown number of outliers, an order selection algorithm relying on different available MOS criteria has been coupled with the existing AML scheme [31] (called JAM). Moreover, an efficient implementation with almost the same performance (called EJAM) has been proposed to reduce the computational burden. The JAM and EJAM procedures have been analysed on simulated data, also in comparison with the BWP method. The numerical analyses have revealed the effectiveness of the proposed methods in terms of different performance metrics. Specifically, for both methods, the corrected AIC outperforms the other criteria in terms of correct outlier detection. Additionally, EJAM achieves nearly the same performance as JAM with a significant complexity reduction.
Possible future research tracks might concern the analysis of the proposed techniques on real data [50]. Additionally, evaluating the performance of adaptive detectors equipped with the proposed data selector might be of interest. Finally, compressed sensing approaches for outlier censoring deserve attention [51], considering the sparse nature of outlier occurrence.