Direction-Based Jamming Detection and Suppression in mmWave Massive MIMO Networks

In this paper, we study physical layer security in the uplink of millimeter-wave (mmWave) massive multiple-input multiple-output (MIMO) networks and propose a jamming detection and suppression method. The method relies on directional information of the signals received at the base station antenna array. The detection stage accurately identifies both the presence and the direction of the jammer from the pilot signals received in the training phase. This information is then exploited to develop a channel estimator that excludes the jammer's angular subspace from the received training signals. The estimated channels are in turn used to design a combiner at the base station that effectively cancels the deliberate interference of the jammer. Through numerical simulations, we evaluate the detection method in terms of correct detection probability and false alarm probability and show that it remains effective even when the jammer's power is substantially lower than the users' power. Our results also show that the proposed suppression method achieves a spectral efficiency very close to that of a network with no jamming.


I. INTRODUCTION
Recently, the advent of new mobile and wireless applications has caused a fast-growing demand for very high data rate communications in fifth-generation (5G) and beyond wireless networks. Among the many techniques proposed to address this demand, massive multiple-input multiple-output (MIMO) and millimeter-wave (mmWave) communications are the most promising [1], [2]. In addition to the larger bandwidth available in the mmWave bands, their shorter wavelength makes it possible to employ a large number of antennas at the base stations (BSs), which makes mmWave a natural fit for massive MIMO systems [3]. One of the inherent weaknesses of wireless networks is their vulnerability to security attacks on the physical channel, including jamming and eavesdropping. Security can be provided in different layers of the network, among which physical layer security is a powerful technique that has attracted much attention in recent years [4]. Massive MIMO systems are naturally resistant to passive eavesdropping attacks because they can form very narrow beams toward the legitimate users, which reduces signal leakage to illegitimate terminals. However, an active eavesdropping attack that disrupts the training phase by transmitting a jamming signal can reduce the secrecy rate of massive MIMO systems [5], [6]. In addition to the training phase, a jamming attack can also occur in the data transmission phase with the goal of decreasing the spectral efficiency of the system [7]. Consequently, one of the crucial problems in massive MIMO systems is detecting jamming attacks and then suppressing them or alleviating their effects. This is especially important for operation in hostile environments.
arXiv:2104.01856v1 [cs.IT] 5 Apr 2021
The problems of jamming detection and suppression have been discussed in many prior works on physical layer security in massive MIMO systems [8]-[19]. In [8], a jamming attack detection scheme is proposed based on pilots drawn randomly from a known constellation. The authors in [9], [10] use additional signaling and cooperation between the BS and the users to design jamming attack detection methods. Two approaches based on random matrix theory have been proposed in [11], [12], in which the authors use the rank of the received signal's covariance matrix and the eigenvalues of the received signal matrix. In [13], intentionally unallocated pilots are leveraged to propose a jamming detection technique. The authors of [14], [15] propose two similar approaches based on double channel training, where the signals received during two training phases in a single coherence block are compared to determine the presence of a pilot spoofing attack. None of the aforementioned studies or other related works utilizes directional information for detecting the presence of a jamming attack. On the jamming suppression problem, a mitigation method is proposed in [11] in which the users' eigen-subspace is estimated and the received signals are then projected onto this subspace. Zhao et al. in [16] propose a jamming rejection technique that uses cooperation between the transmitter and the receiver together with interference alignment. In [17] and [18], two approaches are suggested that leverage unused pilots to estimate the jammer's channel and then use that information to null out the jammer's signal. Also, a framework for estimating the legitimate users' channels and maximizing spectral efficiency under a jamming attack is proposed in [19]; it is based on statistical information of the channels and can be implemented in spatially correlated channels.
High angular resolution is one of the most important properties of massive MIMO systems and can be used to combat different challenges of these networks [20], [21]. The accurate directional information provided by high angular resolution can also be utilized to mitigate the undesirable effects of jamming and eavesdropping attacks [22]-[25]. For example, a secure downlink transmission scheme exploiting angular information is suggested in [22], where the authors assume that the BS has perfect knowledge of the eavesdropper's directional information and propose three precoding methods based on this information. In [23], a beam domain secure transmission is proposed which optimizes the power allocated to each angular path to maximize the secrecy sum-rate of the downlink transmission in a massive MIMO system where a passive multi-antenna eavesdropper is present. The authors in [24] propose a technique to estimate the angular information of the users and the eavesdropper in airborne massive MIMO systems and use this information to enhance channel estimation performance. Xu et al. in [25] suggest a hybrid beamforming technique that uses statistical angular information to transmit confidential data towards a legitimate user's dominant directions and an artificial noise signal towards all other directions. All of the abovementioned works that rely on directional information assume that the BS is aware of the presence of the adversarial terminal; therefore, they do not discuss the detection of the adversary. Moreover, none of the above or other related works addresses the problem of suppressing a jamming attack on the uplink transmission of a massive MIMO system using directional information.
Motivated by the above, in this paper, we address the problem of jamming detection and suppression in mmWave massive MIMO networks using directional (angular) information of the users and the jammer. The network consists of a BS with a large number of antennas that serves a number of single-antenna users. There is also a jammer in the network that transmits interfering signals in both the training and data transmission phases to sabotage the system's performance. A discrete channel model comprising a number of spatially resolvable paths (RPs) is presented to describe the mmWave massive MIMO channel in the angular domain. Because of the limited scattering in the environment, the signal from each terminal (i.e. a user or the jammer) arrives at the BS array through only a few RPs, which are referred to as the active RPs of that terminal. First, for jamming detection, we propose a method that uses the pilot signals received in the training phase to estimate the set of RPs through which a particular pilot is received. Then, we check whether there are active RPs that are common among a large number of pilots and accordingly detect the presence as well as the direction of the jammer. This information, along with the directional information obtained about the users, is then used for jamming suppression in the next phase. For jamming suppression, we use the above directional information to propose a channel estimation scheme based on projecting the received pilot signals onto the orthogonal complement of the jammer's angular subspace. The estimated channels are then used to design combining vectors that are orthogonal to the jammer's channel and cancel out the jamming signal. Note that the directional information obtained in the channel training phase can be used for jamming suppression over several consecutive coherence intervals. The reason is that the spatial characteristics of a terminal's channel, such as its active RPs, change more slowly than the small-scale fading parameters, such as the gains along the active RPs.
The key contributions of this paper can be summarized as follows:
• A jamming detection and suppression method for mmWave massive MIMO networks is proposed which relies on directional (angular) information obtained during the channel training phase. The energy received along each RP over several sub-carriers (sub-channels) is leveraged to determine whether an RP is active.
• In contrast to other works, the proposed jamming detection technique does not require any prior knowledge of the users' or the jammer's channel state information, nor any precondition such as unused pilots in the network. Also, it does not introduce any additional signaling overhead.
• The proposed detection technique is capable of detecting the jammer even if its power is very low. Besides, the false alarm probability of this detection method is substantially lower than that of other similar techniques. The performance of the proposed scheme can be further enhanced by increasing the number of antennas at the BS or the number of sub-carriers (sub-channels) in the system. The jamming detection method directly yields the directional information of the jammer, which can be utilized to suppress the jamming attack over several coherence intervals.
• The proposed jamming suppression method efficiently cancels out jamming attacks. We show that its performance is close to that of the case where no jamming is present in the network. Also, its performance does not depend on the jammer's power, and it remains effective even under relatively high-power jamming attacks.
II. SYSTEM MODEL
As depicted in Fig. 1, we consider the uplink of a single-cell massive MIMO system consisting of a BS with a large array of M antennas that serves K single-antenna legitimate users in the presence of a single-antenna jammer, all of which are randomly located in the cell. The network operates in the mmWave bands, and multi-carrier transmission with N sub-carriers (or sub-channels) is considered. We also consider a block-fading channel model where channel gains

A. Angular Domain Channel Modeling
Since our proposed scheme is based on the directional information of the channels, we need a channel model that properly describes signals in the angular domain. We assume a uniform linear array (ULA) at the BS. For each direction of arrival (DOA) $\theta \in [-\pi/2, \pi/2]$, the BS has the following array steering vector
$$\mathbf{a}(\Theta) = \frac{1}{\sqrt{M}}\left[1,\ e^{-j2\pi\frac{d}{\lambda}\Theta},\ \ldots,\ e^{-j2\pi\frac{d}{\lambda}(M-1)\Theta}\right]^T, \tag{1}$$
where $d$ and $\lambda$ are the array element spacing and the carrier wavelength, respectively. It is assumed that $d = \lambda/2$. Also, $\Theta = \sin(\theta)$ denotes the directional sine, and since $\theta$ is in the interval $[-\pi/2, \pi/2]$, there is a one-to-one mapping between $\theta$ and $\Theta$. We assume that the DOAs from which the signals of the $k$th user are received are uniformly distributed in the interval $I_k = [\bar{\theta}_k - \Delta_k/2,\ \bar{\theta}_k + \Delta_k/2]$, where $\bar{\theta}_k$ and $\Delta_k$ are the average incident angle and the angular spread, respectively. We call the interval $I_k$ the angular span of the $k$th user.
The angular resolution of massive MIMO depends on the array's normalized length $L = Md/\lambda$ [26]. If the directional sines of two paths differ by less than $1/L$, these paths are not resolvable by the array [26]. Hence, the angular domain can be sampled at fixed angles with a spacing of $1/L$ in the directional sine. Steering vectors whose $\Theta$ values are spaced by $1/L$ are orthonormal; thus, they form an orthonormal basis for channel expansion that can be represented as [27]
$$\mathbf{U} = [\mathbf{a}(\sin\phi_1),\ \mathbf{a}(\sin\phi_2),\ \ldots,\ \mathbf{a}(\sin\phi_M)], \tag{2}$$
where $\phi_i$, $i = 1, \ldots, M$, are the sampled angles. With these definitions, we can decompose the channels into $M$ different physical directions, which we refer to as resolvable paths (RPs). The gain of the $i$th RP can be represented as the superposition of all paths whose $\Theta$ lies within a window of width $1/L$ around $\sin(\phi_i)$. Subsequently, we define a virtual channel representation (VCR) model which samples the angular domain using the spatial orthonormal basis $\mathbf{U}$. The VCR channel vector of the $k$th user in the $n$th sub-carrier can be denoted by
$$\mathbf{h}_k^n = \sqrt{\frac{M\beta_k}{C_k}}\,\mathbf{U}\tilde{\mathbf{g}}_k^n, \tag{3}$$
where $\tilde{\mathbf{g}}_k^n = [\tilde{g}_{k,1}^n, \tilde{g}_{k,2}^n, \ldots, \tilde{g}_{k,M}^n]^T$ is a complex gain vector and $\tilde{g}_{k,i}^n$ indicates the small-scale gain of the $i$th RP. If an RP is located in the angular span of the $k$th user, i.e. $\phi_i \in I_k$, we call it an active RP of that user. Since each active RP includes several physical paths, we model the small-scale gain of an active RP as a complex Gaussian random coefficient with zero mean and unit variance, i.e. $\tilde{g}_{k,i}^n \sim \mathcal{CN}(0, 1)$. If an RP is not active, i.e. $\phi_i \notin I_k$, its gain is equal to zero, i.e. $\tilde{g}_{k,i}^n = 0$. Furthermore, $\beta_k$ and $C_k$ are the large-scale fading coefficient and the number of active RPs of the $k$th user, respectively. Due to the limited scattering of mmWave channels, the received power is concentrated in a narrow interval of the angular domain. Thus, the angular spread is relatively small and only a few RPs are located within it. As a result, $C_k$ is considerably smaller than the number of array antennas, i.e. $C_k \ll M$ [22].
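The angular-domain model above can be illustrated with a short numerical sketch. The code below is a minimal example, not the authors' implementation: the function names, the choice of sampling grid (starting at directional sine −1) and the sign convention of the steering-vector phase are assumptions made for illustration. It builds M unit-norm steering vectors spaced by 1/L in the directional sine for d = λ/2 and draws a channel that is non-zero only along a chosen set of active RPs.

```python
import cmath
import math
import random

def steering_vector(M, Theta, d_over_lambda=0.5):
    """Unit-norm ULA steering vector for directional sine Theta."""
    return [cmath.exp(-2j * math.pi * d_over_lambda * m * Theta) / math.sqrt(M)
            for m in range(M)]

def angular_basis(M, d_over_lambda=0.5):
    """M steering vectors spaced by 1/L in Theta, with L = M * d / lambda.
    The grid offset (first sample at Theta = -1) is an illustrative assumption."""
    L = M * d_over_lambda
    return [steering_vector(M, -1.0 + i / L, d_over_lambda) for i in range(M)]

def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def vcr_channel(U, active, beta=1.0, rng=random):
    """h = sqrt(M * beta / C) * sum over active RPs i of g_i * U[i], g_i ~ CN(0, 1)."""
    M, C = len(U), len(active)
    scale = math.sqrt(M * beta / C)
    h = [0j] * M
    for i in active:
        g = complex(rng.gauss(0, math.sqrt(0.5)), rng.gauss(0, math.sqrt(0.5)))
        for m in range(M):
            h[m] += scale * g * U[i][m]
    return h
```

Projecting such a channel back onto the basis (computing `inner(U[i], h)` for every `i`) returns non-zero values only at the active indices, which is the sparsity that the detection scheme relies on.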
The angular span of each user is constant over different sub-carriers since the spatial propagation characteristics are unchanged within the system's bandwidth. Therefore,
$$\Omega_k = \mathrm{supp}(\tilde{\mathbf{g}}_k^1) = \mathrm{supp}(\tilde{\mathbf{g}}_k^2) = \cdots = \mathrm{supp}(\tilde{\mathbf{g}}_k^N), \tag{4}$$
where supp denotes the support set of a vector. This equation implies that the active RPs of each user are unchanged over different sub-carriers. The set $\Omega_k$ is also referred to as the spatial signature. Fig. 2 illustrates an example of the described channel model in the angular domain for $M = 18$ antennas at the BS. Solid arrows show the spatially resolvable paths; three of them are located within the angular span of the user and are therefore its active RPs. These active RPs are shown by red arrows. Thus, in this example, the set of active RPs for this user consists of the indices of these three RPs. Similar to the users' channels, the jammer's channel in the $n$th sub-carrier can be denoted by $\mathbf{h}_w^n = \sqrt{M\beta_w/C_w}\,\mathbf{U}\tilde{\mathbf{g}}_w^n$, where $\beta_w$ and $C_w$ denote the large-scale fading coefficient and the number of active RPs of the jammer, respectively. The angular span of the jammer is denoted by $I_w = [\bar{\theta}_w - \Delta_w/2,\ \bar{\theta}_w + \Delta_w/2]$, where $\bar{\theta}_w$ and $\Delta_w$ are the average incident angle and the angular spread of the jammer, respectively. Also, similar to (4), the set of indices of active RPs (spatial signature) of the jammer is denoted by $\Omega_w$.
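We assume that the users and the jammer are randomly distributed in the cell; thus, we consider the average incident angles of the $k$th user and the jammer to be uniformly distributed in the intervals $[-\pi/2 + \Delta_k/2,\ \pi/2 - \Delta_k/2]$ and $[-\pi/2 + \Delta_w/2,\ \pi/2 - \Delta_w/2]$, respectively. The $k$th user's and the jammer's channels can be alternatively defined as
$$\mathbf{h}_k^n = \sqrt{\frac{M\beta_k}{C_k}}\sum_{i\in\Omega_k}\tilde{g}_{k,i}^n\,\mathbf{a}(\sin\phi_i), \tag{5}$$
$$\mathbf{h}_w^n = \sqrt{\frac{M\beta_w}{C_w}}\sum_{i\in\Omega_w}\tilde{g}_{w,i}^n\,\mathbf{a}(\sin\phi_i). \tag{6}$$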

B. Pilot and Data Transmission Phases
Transmission from the users to the BS in the uplink occurs in two phases. First, in the channel training phase, each user sends its allocated orthogonal pilot sequence to the BS so that the BS can estimate that user's channel. In the second phase, which is the data transmission phase, the user sends its data to the BS. The jammer transmits jamming signals during both the training and data phases with powers $q_t$ and $q_d$, respectively.
1) Channel Training: We assume a block-fading channel in which $N_c$ adjacent sub-channels lie within the coherence bandwidth. As a result, only one sub-carrier out of every $N_c$ needs to be estimated using orthogonal pilot sequences [1], [3]. We denote the number of estimated sub-carriers by $N_e = N/N_c$. The set of orthogonal pilots is $\mathbf{S} = [\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_\tau] \in \mathbb{C}^{\tau\times\tau}$, where $\tau$ is the length of the orthogonal pilots, and we assume $\tau = K$. The pilots are considered to have unit power, i.e. $\mathbf{S}^H\mathbf{S} = \mathbf{I}$. The $k$th user transmits the $k$th pilot over $N_e$ predefined sub-carriers. In this phase, the received signal in the $n$th sub-carrier can be represented as
$$\mathbf{Y}_t^n = \sum_{k=1}^{K}\sqrt{\tau p_{t,k}}\,\mathbf{h}_k^n\mathbf{s}_k^H + \sqrt{\tau q_t}\,\mathbf{h}_w^n(\mathbf{s}_w^n)^H + \mathbf{Z}_t^n, \tag{7}$$
where $p_{t,k}$ is the pilot transmission power of the $k$th user and $\mathbf{s}_w^n \in \mathbb{C}^{\tau\times 1}$ is the jamming pilot signal sent by the jammer in the $n$th sub-carrier. $\mathbf{Z}_t^n \in \mathbb{C}^{M\times\tau}$ is additive white Gaussian noise (AWGN) whose elements are $[\mathbf{Z}_t^n]_{i,j} \sim \mathcal{CN}(0, \sigma_z^2)$. In order to estimate the $k$th user's channel, the BS de-spreads the received signal in (7) with the pilot $\mathbf{s}_k$ as
$$\mathbf{y}_k^n = \mathbf{Y}_t^n\mathbf{s}_k = \sqrt{\tau p_{t,k}}\,\mathbf{h}_k^n + \sqrt{\tau q_t}\,\gamma_k^n\,\mathbf{h}_w^n + \mathbf{z}_{t,k}^n, \tag{8}$$
where $\mathbf{z}_{t,k}^n = \mathbf{Z}_t^n\mathbf{s}_k$ is the correlation between the noise and the $k$th pilot. Also, $\gamma_k^n = (\mathbf{s}_w^n)^H\mathbf{s}_k$ indicates the correlation between the jamming signal and the $k$th pilot. It is clear that the interference from other users has been eliminated by the orthogonality of the pilots; however, the deliberate interference of the jammer still remains.
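The effect of de-spreading can be checked numerically. The sketch below is illustrative only: it uses the columns of a unitary DFT matrix as orthonormal pilots (an assumed choice, the paper does not specify the pilot family), omits the noise so that the identity is exact, and folds any pilot-length scaling into the hypothetical power parameters `p` and `q_t`. After multiplying the received block by the kth pilot, the other users vanish and only the kth channel plus a scaled copy of the jammer's channel remain.

```python
import cmath
import math
import random

def dft_pilots(tau):
    """tau orthonormal pilots taken from a unitary DFT matrix (assumed choice)."""
    return [[cmath.exp(-2j * math.pi * t * k / tau) / math.sqrt(tau)
             for t in range(tau)] for k in range(tau)]

def cn(var, rng):
    """One sample of a circular complex Gaussian CN(0, var)."""
    s = math.sqrt(var / 2)
    return complex(rng.gauss(0, s), rng.gauss(0, s))

def received_pilots(H, h_w, S, s_w, p, q_t):
    """Noise-free Y = sum_k sqrt(p) h_k s_k^H + sqrt(q_t) h_w s_w^H (M x tau)."""
    M, tau = len(h_w), len(s_w)
    Y = [[0j] * tau for _ in range(M)]
    for m in range(M):
        for t in range(tau):
            users = sum(math.sqrt(p) * H[k][m] * S[k][t].conjugate()
                        for k in range(len(H)))
            Y[m][t] = users + math.sqrt(q_t) * h_w[m] * s_w[t].conjugate()
    return Y

def despread(Y, s_k):
    """y_k = Y s_k: correlate every antenna's received block with pilot s_k."""
    return [sum(row[t] * s_k[t] for t in range(len(s_k))) for row in Y]

def pilot_correlation(s_w, s_k):
    """gamma_k = s_w^H s_k, the jammer's leakage onto pilot k."""
    return sum(a.conjugate() * b for a, b in zip(s_w, s_k))
```

Drawing `s_w` with i.i.d. CN(0, 1/τ) entries spreads the jamming power evenly over all pilots, so every de-spread signal carries a jammer component along the same angular directions. This is what later makes the jammer's RPs appear in many pilots at once.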
A jammer usually aims to limit the spectral efficiency (SE) of the massive MIMO network rather than to target a specific user. The reason is that, in massive MIMO systems, if a random pilot hopping method is used for assigning pilots to the users [6], [17], [18], it is difficult for a jammer to acquire a particular user's pilot. However, it is reasonable to assume that the jammer knows the pilot set $\mathbf{S}$.
In addition, a necessary condition for a jamming signal that aims to degrade the performance of the system is that $\mathrm{Prob}(\gamma_k^n = 0) = 0$, which can be achieved by spreading the jamming power over all the pilots. A known method for designing the jamming signal in the training phase is to require $E(|(\mathbf{s}_w^n)^H\mathbf{s}_i|^2) = 1/\tau$. One of the signals that satisfies this property is [17]
$$\mathbf{s}_w^n \sim \mathcal{CN}\left(\mathbf{0}, \tfrac{1}{\tau}\mathbf{I}_\tau\right). \tag{9}$$
To restrain the jamming attack on the training phase, some extra information about the users' and the jammer's channels is needed, which is the directional information in our work. In Sec. III-A and III-B, we investigate how to obtain this information and then utilize it in Sec. IV to propose the channel estimation technique.
2) Data Transmission: In the uplink data transmission phase, each user $k$ sends its data with power $p_{k,d}$. Simultaneously, the jammer sends its jamming signal to the BS. The received signal at the BS over the $n$th sub-carrier can be expressed as
$$\mathbf{y}_d^n = \sum_{k=1}^{K}\sqrt{p_{k,d}}\,\mathbf{h}_k^n x_k^n + \sqrt{q_d}\,\mathbf{h}_w^n x_w^n + \mathbf{z}_d^n, \tag{10}$$
where $\mathbf{z}_d^n \sim \mathcal{CN}(\mathbf{0}, \sigma_z^2\mathbf{I})$ is the AWGN, and $x_k^n$ and $x_w^n$ are the $k$th user's data and the jamming signal, respectively. These signals are considered to be i.i.d. Gaussian random variables with zero mean and unit variance, i.e. $x_k^n, x_w^n \sim \mathcal{CN}(0, 1)$. The BS receives this signal and decodes it using the information obtained in the training phase. Suppose $\mathbf{V}^n$ is the decoding (combining) matrix at the BS and $\mathbf{v}_k^n$ is its $k$th column, which is the receive combining vector of the $k$th user. By calculating the inner product of $\mathbf{v}_k^n$ and $\mathbf{y}_d^n$, the signal of the $k$th user in the $n$th sub-carrier can be decoded as
$$\hat{x}_k^n = (\mathbf{v}_k^n)^H\mathbf{y}_d^n = \sqrt{p_{k,d}}\,(\mathbf{v}_k^n)^H\mathbf{h}_k^n x_k^n + \sum_{l\neq k}\sqrt{p_{l,d}}\,(\mathbf{v}_k^n)^H\mathbf{h}_l^n x_l^n + \sqrt{q_d}\,(\mathbf{v}_k^n)^H\mathbf{h}_w^n x_w^n + (\mathbf{v}_k^n)^H\mathbf{z}_d^n. \tag{11}$$
Since the BS has only imperfect CSI of the users' channels, we utilize a lower bound on the capacity called the use-and-then-forget bound, which is widely used in massive MIMO [1], [3]. The name originates from the fact that the estimated channels are used for the design of the receive combining vectors and then disregarded in the signal detection. This achievable rate is a function of the effective SINR $\rho_k^n$ of the $k$th user in the $n$th sub-carrier, defined in (13). Note that this SINR is conditioned on $\Psi$, which is the directional information and will be discussed in the following sections.

III. JAMMING DETECTION
In this section, we discuss the proposed jamming detection scheme, which consists of two stages. In the first stage, the signals in (8) are used to estimate the active RPs from which any particular pilot is received. To this end, we exploit the training signals received in several sub-carriers to distinguish the active RPs (i.e. those that include a signal) from the RPs that only include noise. Then, in the second stage, we present our proposed detection method. In this method, after determining the active RPs, we count the number of pilots that are received through each RP and in this way determine whether there is a jammer in the network.

A. Active RP Estimation
As we explained before, the $k$th pilot is transmitted by both the $k$th user and the jammer. The union of the sets of active RP indices of the jammer and the $k$th user is defined as $\Omega_{k,w} = \Omega_k \cup \Omega_w$.
In fact, this set shows the spatial directions from which the kth pilot is received.
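Calculating the inner product of $\mathbf{U}$ in (2) with $\mathbf{y}_k^n$ in (8) maps the received signals to the angular domain. Therefore,
$$\tilde{\mathbf{y}}_k^n = \mathbf{U}^H\mathbf{y}_k^n = \sqrt{\frac{\tau p_{t,k} M\beta_k}{C_k}}\,\tilde{\mathbf{g}}_k^n + \sqrt{\frac{\tau q_t M\beta_w}{C_w}}\,\gamma_k^n\,\tilde{\mathbf{g}}_w^n + \tilde{\mathbf{z}}_k^n, \tag{14}$$
where $\tilde{\mathbf{z}}_k^n = \mathbf{U}^H\mathbf{z}_{t,k}^n$ has the same distribution as $\mathbf{z}_{t,k}^n$ since $\mathbf{U}$ is a unitary matrix, i.e. $\tilde{\mathbf{z}}_k^n \sim \mathcal{CN}(\mathbf{0}, \sigma_z^2\mathbf{I})$. We need a sensing scheme that can distinguish between the active RPs and the RPs that only contain noise. For this purpose, we first define a hypothesis test for the $i$th RP as
$$H_{0,i}^{RP}: i \notin \Omega_{k,w}, \qquad H_{1,i}^{RP}: i \in \Omega_{k,w}, \tag{15}$$
where $H_{1,i}^{RP}$ covers three cases in which a signal is received along the $i$th RP: (a) both the $k$th user and the jammer transmit along it, (b) only the user transmits along it, and (c) only the jammer transmits along it. $H_{0,i}^{RP}$ denotes the opposite case, in which no signal is received from the jammer or the $k$th user and only noise is present along the $i$th RP.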
Furthermore, the energy received along a certain RP can be utilized to determine whether it is an active RP. This stems from the fact that receiving a signal along an RP increases its energy level compared to the case in which it only receives noise. Thus, it is judicious to infer that an RP is active if its energy exceeds a predefined threshold. To this end, for the $i$th RP, we compute the sum of the energy received over $N_d$ sub-carriers as
$$W_{i,k} = \sum_{n=1}^{N_d}\left|\tilde{y}_{k,i}^n\right|^2, \tag{16}$$
where $\tilde{y}_{k,i}^n$ is the $i$th element of $\tilde{\mathbf{y}}_k^n$. We then use this energy to determine the active RPs through the decision rule
$$W_{i,k} \underset{H_{0,i}^{RP}}{\overset{H_{1,i}^{RP}}{\gtrless}} \epsilon_k, \tag{17}$$
where $\epsilon_k$ is the decision threshold, which can be chosen to satisfy various criteria. Here, we propose a condition for selecting the threshold that guarantees that the probability of false alarm (i.e. declaring a non-active RP to be active) is less than a given value $\eta$. The condition is defined as
$$\Pr\left(W_{i,k} > \epsilon_k \mid H_{0,i}^{RP}\right) \le \eta. \tag{18}$$
By choosing $\eta$ close to zero, e.g. $10^{-3}$, and determining $\epsilon_k$ accordingly, the RPs selected by the decision rule in (17) belong to $\Omega_{k,w}$ with high probability. Note that when the $i$th RP contains only noise, $W_{i,k}$ is a Gamma-distributed random variable with shape $N_d$ and scale $\sigma_z^2$, i.e. $W_{i,k} \sim \mathrm{Gamma}(N_d, \sigma_z^2)$, which can be used to calculate $\epsilon_k$. In order to estimate the active RPs efficiently, the users' and the jammer's powers have to be sufficiently large compared to the noise power. It is reasonable to assume that the users' powers have been adjusted to be adequately higher than the noise power. However, there is no control over the jammer's power. In the case of low-power jamming, the BS can use a larger number of sub-carriers. This enhances the RP detection performance since the gains of the active RPs in different sub-channels are independent, and it is highly probable that at least some of these gains are large enough to make the RP detectable. In general, increasing the number of received samples in the summation (16) always helps to improve the detection performance. This can be done either by increasing the number of considered sub-channels or by considering the training signals received over several consecutive coherence intervals, since the directional characteristics of the channels change more slowly in time. In other words, we can sum the energy received along an RP over several sub-channels and over several consecutive coherence intervals. The impact of increasing the number of sub-carriers is the same as that of increasing the number of coherence intervals; thus, we only consider one coherence interval and adjust the number of sub-channels.
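The threshold selection described above can be sketched numerically. The example below is a minimal illustration under the stated noise-only model, W ~ Gamma(N_d, σ_z²) with integer shape, for which the tail probability has a closed form (the Erlang survival function). The function names and the bisection search are illustrative choices, not part of the original method.

```python
import math

def erlang_tail(x, n, scale):
    """P(W > x) for W ~ Gamma(shape=n, scale) with n a positive integer."""
    r = x / scale
    term, total = 1.0, 1.0
    for m in range(1, n):
        term *= r / m          # term equals r**m / m!
        total += term
    return math.exp(-r) * total

def energy_threshold(n_sub, noise_var, eta, iters=200):
    """Threshold whose noise-only false-alarm probability equals eta (bisection)."""
    lo, hi = 0.0, noise_var * n_sub
    while erlang_tail(hi, n_sub, noise_var) > eta:
        hi *= 2.0              # grow the bracket until the tail drops below eta
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if erlang_tail(mid, n_sub, noise_var) > eta:
            lo = mid
        else:
            hi = mid
    return hi
```

For a single sub-carrier the threshold reduces to −σ_z² ln η. With σ_z² = −25 dBW (that is, `10 ** -2.5`) and η = 10⁻³ this gives about 0.022, and for 20 sub-carriers the same procedure gives a value between 0.11 and 0.12, both in line with the thresholds quoted in the simulation section.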
Finally, the estimated set of RPs for the $k$th pilot is described as follows:
$$\hat{\Omega}_{k,w} = \left\{ i : W_{i,k} > \epsilon_k \right\}. \tag{19}$$
Having estimated the active RP sets, we exploit them to propose a jamming detection scheme in the next section.
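As a small sketch of this step, the code below sums the energy of each RP over the available sub-carriers and keeps the indices that exceed a given threshold. The input layout (`y_tilde[n][i]` holding the angular-domain sample of RP `i` on sub-carrier `n`) and the function names are assumptions made for illustration.

```python
def received_energy(y_tilde):
    """W_i = sum over sub-carriers n of |y_tilde[n][i]|^2, for every RP i."""
    M = len(y_tilde[0])
    return [sum(abs(row[i]) ** 2 for row in y_tilde) for i in range(M)]

def estimate_active_rps(y_tilde, threshold):
    """Indices of the RPs whose summed energy exceeds the threshold."""
    return {i for i, w in enumerate(received_energy(y_tilde)) if w > threshold}
```

Running `estimate_active_rps` once per pilot yields one estimated RP set per pilot, which is the input required by the detector in the next section.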

B. Jamming Detection
In this section, we propose a jamming detection scheme based on the RP sets estimated in the previous section. We first discuss a key characteristic of the users' active RP sets $\{\Omega_1, \Omega_2, \ldots, \Omega_K\}$, which we then use to propose a jamming detection method.
Remark 1. Suppose that the average incident angles of all $K$ users are uniformly distributed in the interval $(-\frac{\pi}{2} + \frac{\Delta}{2},\ \frac{\pi}{2} - \frac{\Delta}{2})$, and that the angular spread is equal for all users, i.e. $\Delta_1 = \ldots = \Delta_K = \Delta$. Then, the probability that $g$ out of $K$ users have at least one common active path, $P_g^K$, is upper-bounded by a quantity denoted $\bar{P}_g^K$. Proof. Please refer to Appendix VII-A.
It should be noted that when $g$ is small, e.g. $g = 2$, the upper bound is close to one, i.e. $\bar{P}_g^K \sim 1$. However, as $g$ increases to $K$, $\bar{P}_g^K$ decreases exponentially to a very low value. For instance, for $K = 10$, $\Delta = \pi/18$ and $g = 6, 8, 10$, the upper bound is on the order of $\bar{P}_6^{10} \sim 10^{-3}$, $\bar{P}_8^{10} \sim 10^{-5}$ and $\bar{P}_{10}^{10} \sim 10^{-9}$, respectively. Thus, it is reasonable to assume that a large number of users are unlikely to share common active RPs. It follows that an active RP shared by many pilots can indicate the existence of a jammer.
Based on this property, we define another hypothesis test in which the hypotheses $H_0^{JD}$ and $H_1^{JD}$ denote the absence and presence of the jammer, respectively. Then, we define the following set, which contains the indices of the active RPs that are common to at least $g$ pilots:
$$Q_g = \left\{ i : R(i) \ge g \right\}, \tag{22}$$
where $R(i)$ is a function that returns the number of pilots received along the $i$th RP according to the estimated sets $\{\hat{\Omega}_{1,w}, \hat{\Omega}_{2,w}, \ldots, \hat{\Omega}_{K,w}\}$. As illustrated in Remark 1, in the absence of the jammer, we expect only a small subset of pilots to have common active RPs, since the pilots are then received only through the users' RPs. On the other hand, when a jammer is present in the network, its active RPs are common to many of the pilots, and hence the number of pilots with common RPs is relatively large. Based on this fact, we infer that a jammer is present in the network if the number of pilots that have common RPs is at least a predefined threshold, and that it is absent otherwise. Therefore, we can rewrite the detector as
$$Q_g = \emptyset \ \Rightarrow\ H_0^{JD}, \qquad Q_g \neq \emptyset \ \Rightarrow\ H_1^{JD}. \tag{23}$$
This decision rule states that if $Q_g$ is empty, where $g$ is the predefined threshold, the number of pilots that share any RP is less than $g$ and thus there is no jammer in the network. On the other hand, when $Q_g$ is non-empty, there are active RPs common to at least $g$ pilots and therefore a jammer is present in the network. The parameter $g$ should be selected properly to avoid a high false alarm probability (FAP). If $g$ is not large enough, the set $Q_g$ can be non-empty even when there is no jammer, which leads to a high FAP. If $g$ is large, the FAP is low but the correct detection probability (CDP) is also lower, especially when the jammer's power is low. The reason is that the jammer's RPs may not appear in a large number of the estimated pilot RP sets. Finally, note that the set $Q_g$ gives us an estimate of the jammer's RPs, since the common RPs are the jammer's paths with high probability.
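The counting rule above is simple enough to sketch directly. In the example below, the function names and the representation of each estimated RP set as a Python set of integer indices are illustrative assumptions. The detector counts, for every RP, how many pilots were received along it, collects the RPs shared by at least g pilots, and declares a jammer when that collection is non-empty.

```python
def count_pilots_per_rp(rp_sets, M):
    """R(i): number of pilots whose estimated RP set contains RP i."""
    return [sum(1 for s in rp_sets if i in s) for i in range(M)]

def detect_jammer(rp_sets, M, g):
    """Return (jammer_present, Q_g), where Q_g = {i : R(i) >= g}."""
    R = count_pilots_per_rp(rp_sets, M)
    Q_g = {i for i in range(M) if R[i] >= g}
    return len(Q_g) > 0, Q_g
```

For instance, with six pilots whose estimated sets are `{2, 7}`, `{3, 7}`, `{7, 11}`, `{5, 7}`, `{7, 9}` and `{1, 2}`, and with `g = 4`, RP 7 is shared by five pilots, so the detector reports a jammer and returns `{7}` as the estimate of its spatial signature.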
IV. JAMMING SUPPRESSION
In this section, we utilize the directional information obtained during the jamming detection phase to design a signal detection scheme that mitigates the intentional interference of the jammer in the data transmission phase. The proposed method relies on a modified channel estimator that takes the effect of the jammer into account. This estimator projects the received pilot signals onto the orthogonal complement of the jammer's angular subspace and is based on linear minimum mean squared error (LMMSE) estimation in the training phase. Afterward, in the data transmission phase, a maximum ratio combining (MRC) receiver is constructed from the estimated channels. We show that the combining vector is orthogonal to the jammer's channel and can effectively suppress the effect of jamming.

A. Channel Estimator
As discussed in Section II, the channel of a user is composed of two types of information: 1) a set of active RPs and 2) the gains of these active RPs. In Section III, we obtained an estimate $\hat{\Omega}_{k,w}$ of the RPs along which the $k$th pilot was received, which also includes the paths of the jammer. The proposed jamming detection scheme yields an estimate of the jammer's spatial signature. By subtracting the jammer's paths from $\hat{\Omega}_{k,w}$, we obtain a set that only comprises the $k$th user's RPs, i.e. $\hat{\Omega}_k = \hat{\Omega}_{k,w} \setminus Q_g$. Thus, by estimating the gains along these paths, an estimate of the user's channel is obtained.
The signal received in the training phase along the $i$th RP, for $i \in \hat{\Omega}_k$, can be written as
$$\tilde{y}_{k,i}^n = \sqrt{\tau p_{k,t}\mu_k}\,\tilde{g}_{k,i}^n + \tilde{z}_{k,i}^n.$$
Therefore, according to [28], we can define the LMMSE estimate of the $i$th RP's gain as
$$\hat{g}_{k,i}^n = \frac{\sqrt{\tau p_{k,t}\mu_k}}{\tau p_{k,t}\mu_k + \sigma_z^2}\,\tilde{y}_{k,i}^n,$$
where $\mu_k = M\beta_k/C_k$. The mean square of the estimated gain is
$$E\{|\hat{g}_{k,i}^n|^2\} = \frac{\tau p_{k,t}\mu_k}{\sigma_z^2 + \tau p_{k,t}\mu_k}.$$
Note that this estimator is the same for all the RPs of a user, since they all have the same distribution. Moreover, the estimation error is defined as $e_{k,i}^n = \hat{g}_{k,i}^n - \tilde{g}_{k,i}^n$ and its mean square is $E\{|e_{k,i}^n|^2\} = \frac{\sigma_z^2}{\sigma_z^2 + \tau p_{k,t}\mu_k}$. Given the RP set of a user and the gains of these RPs, the channel of the $k$th user can be estimated as
$$\hat{\mathbf{h}}_k^n = \sqrt{\mu_k}\sum_{i\in\hat{\Omega}_k}\hat{g}_{k,i}^n\,\mathbf{a}(\sin\phi_i).$$
A great advantage of this estimation technique is that it makes the estimated channels approximately orthogonal to the jammer's channel, i.e. $(\hat{\mathbf{h}}_k^n)^H\mathbf{h}_w^n \approx 0$. This stems from the fact that, by eliminating the jammer's active RPs, the angular subspace of the jammer's channel is excluded and the estimated users' channels lie in its null space. Although some of the jammer's RPs with insignificant gains may remain, especially when the jammer's power is very low, their impact on the estimation accuracy and on the orthogonality to the jammer's channel is negligible because of their small gain. We should also note that the accuracy of channel estimation decreases for users that share some RPs with the jammer. However, this is unlikely and does not affect the majority of users.
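The estimator can be sketched in the angular domain as follows. The example assumes the per-RP observation model y = sqrt(τ p μ) g + noise with g ~ CN(0, 1), which is the model consistent with the error expression σ_z² / (σ_z² + τ p μ) stated above. The function names and argument layout are illustrative. Gains are estimated only on the RPs that remain after removing the detected jammer RPs, and every other entry is set to zero.

```python
import math

def lmmse_gain(y_tilde_i, tau, p, mu, noise_var):
    """LMMSE estimate of a CN(0, 1) gain from y = sqrt(tau * p * mu) * g + noise."""
    a = math.sqrt(tau * p * mu)
    return a / (a * a + noise_var) * y_tilde_i

def estimate_channel_gains(y_tilde, rp_set_kw, Q_g, tau, p, mu, noise_var):
    """Angular-domain channel estimate, zero outside rp_set_kw minus Q_g."""
    user_rps = set(rp_set_kw) - set(Q_g)
    return [lmmse_gain(y, tau, p, mu, noise_var) if i in user_rps else 0j
            for i, y in enumerate(y_tilde)]

def error_mse(tau, p, mu, noise_var):
    """Mean-square error of the LMMSE gain estimate."""
    return noise_var / (noise_var + tau * p * mu)
```

Because the basis U is unitary, the inner product between the estimated channel and the jammer's channel is proportional to the inner product of their angular-domain gain vectors. When the jammer's active RPs are all contained in `Q_g`, the two gain vectors have disjoint supports and this inner product is exactly zero, which is the orthogonality property used by the combiner.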

B. MRC Decoder
As mentioned before, in this paper we adopt MRC, which is a simple and efficient combiner for massive MIMO systems. Hence, in the uplink data transmission phase, the decoding matrix is $\mathbf{V}^n = \hat{\mathbf{H}}^n$, where $\hat{\mathbf{H}}^n = [\hat{\mathbf{h}}_1^n, \hat{\mathbf{h}}_2^n, \ldots, \hat{\mathbf{h}}_K^n]$. Canceling out the jammer's RPs eliminates its deliberate interference because the combining vectors become orthogonal to the jammer's channel, which can be interpreted as a kind of spatial filtering. Using MRC, the terms of the SINR in (13) for the $k$th user in the $n$th sub-carrier are expressed in terms of $C_{k/w} = \mathrm{card}\{\hat{\Omega}_k\}$ and $C_{k,l} = C_{l,k} = \mathrm{card}\{\hat{\Omega}_k \cap \hat{\Omega}_l\}$. The derivation of these terms is given in more detail in Appendix VII-B. Substituting these terms into (13) gives the effective SINR of the $k$th user in the $n$th sub-carrier.
V. SIMULATION RESULTS
In this section, we evaluate the proposed jamming detection and suppression scheme by numerical simulations. In our simulations, we assume $K = 10$ and equal user transmit power in both the training and data transmission phases, with $p_{k,d} = p_{k,t} = 0$ dBW for all users, and the noise power is $\sigma_z^2 = -25$ dBW. The large-scale fading coefficients are assumed to be equal to 0 dB for the users and the jammer, i.e. $\beta_k = \beta_w = 1$. The length of the pilot signals is taken as $\tau = K = 10$ and the length of the coherence block is set to $T = 200$. The users' and the jammer's average incident angles are considered to be uniformly distributed in the interval $[-\pi/2 + \Delta/2,\ \pi/2 - \Delta/2]$, and the angular spread is the same for all terminals. The number of sub-carriers is $N_d = 1, 20$ and the threshold in (17) is selected as $\epsilon_k = 0.02, 0.11$, respectively.
First, we evaluate the performance of the proposed jamming detection method in terms of its CDP and FAP. Fig. 3 shows the CDP versus the jammer's power $q_t$ for different values of $g = 6, 8, 10$. It is apparent that the CDP improves with the jammer's power, as the jammer's RPs become easier to estimate. Furthermore, this figure shows that using a larger number of subcarriers $N_d$ leads to better CDP performance. For example, using $N_d = 20$ subcarriers helps the BS to detect a jammer with 10 dB lower power. Besides, a lower value of $g$ results in a better detection probability.

Fig. 4 demonstrates the relation between the FAP and the angular spread $\Delta$. As shown in this figure, the FAP increases with $\Delta$. This is because, with a larger angular spread, the probability that an RP is common among a large subset of users rises. In addition, a smaller value of $g$ results in a higher FAP. In other words, the FAP of the proposed detection method can be decreased by a more careful selection of $g$.

Fig. 5 depicts how the jammer's power affects the sum-SE of the network for three different scenarios: i) no jammer in the network, ii) jamming with the proposed jamming detection and suppression scheme, and iii) jamming without suppression. In this simulation, we considered $M = 200$, $\Delta = \pi/18$, $g = 6$ and $N_d = 20$. We can see that the proposed suppression technique is able to effectively cancel out the jamming attack, and the resulting sum-SE is very close to the case in which there is no jammer in the network. It is worth mentioning that, for jamming power below −5 dBW, using the proposed suppression scheme is undesirable; in this case, however, the effect of the jamming on the system's performance is negligible.

Fig. 6 illustrates how the sum-SE changes with the number of array antennas. In this figure, we again consider the three aforementioned scenarios and set the parameters as $q_t = q_d = 0$ dB, $g = 6$, $N_d = 20$ and $\Delta = \pi/18$. As expected, the sum-SE of the system increases with the number of array antennas. It can also be deduced from this figure that the proposed strategy successfully rejects the jammer's attack and achieves a sum-SE close to the case in which there is no jamming in the network. The insignificant loss of SE is the result of excluding the RPs that are common to the jammer and the users.

VI. CONCLUSION
In this paper, we investigated a direction-based (angular) strategy for detecting and mitigating jamming attacks on the uplink transmission of mmWave massive MIMO networks. The proposed scheme first utilizes the received signals in the training phase to estimate the active RPs (i.e. directions) from which a pilot signal is received. Then, by using this information, the presence of the jammer and its direction are detected. Finally, using the information obtained in the previous step, a suppression technique is applied that counters the jammer's attacks in the data transmission phase. Our numerical analysis showed the effectiveness of the proposed strategy against jamming attacks under different scenarios.
VII. APPENDIX
A. Proof of Remark 1
There are $\binom{K}{g}$ combinations of $g$ users from $K$ users. We denote the set of chosen users by $M_i$, $i = 1, \ldots, \binom{K}{g}$. Therefore, we can define

where $M_1 = \{1, \ldots, g\}$, (a) is derived from the Fréchet inequality for the union of events, and (b) holds from the fact that $\mathrm{Prob}(\bigcap_{j \in M_i} \Omega_j)$ is constant for different combinations of users, since no distinctive characteristics are considered for different users. Now, we investigate the probability $\mathrm{Prob}(\bigcap_{j=1}^{g} \Omega_j)$. First, we define $I_{\mathrm{int}}$ as the intersection of $I_j$, $j = 1, \ldots, g$. Now, we can write

where $P_\Psi$ is the probability that there is at least one RP in $I_{\mathrm{int}}$ and (c) holds since $P_\Psi \le 1$. For $I_{\mathrm{int}}$ to be non-empty, all $I_j$ should overlap with each other, which is achieved if, for any $i, j \in M_1$, $i < j$, the distance between $\bar{\theta}_i$ and $\bar{\theta}_j$ is less than $\Delta$. We define these events as

$$E_{i,j} = \{|\bar{\theta}_i - \bar{\theta}_j| < \Delta\},$$

and $\mathrm{Prob}(E_{i,j}) = 1 - \left(\frac{\pi - 2\Delta}{\pi - \Delta}\right)^2$, since $\bar{\theta}_i$, $i = 1, \ldots, K$, are independent and uniformly distributed in the interval $[-\frac{\pi}{2} + \frac{\Delta}{2}, \frac{\pi}{2} - \frac{\Delta}{2}]$. Now, we can write

$$\mathrm{Prob}(I_{\mathrm{int}} \neq \emptyset) = \mathrm{Prob}\Big(\bigcap_{i<j} E_{i,j}\Big). \quad (37)$$

Although these events are not independent, they can be divided into independent events, e.g. $E_{i,j} \perp\!\!\!\perp E_{i,k}$, $i, j, k = 1, \ldots, g$; $j \neq k$; $i < j, k$. We define new events as intersections of independent events

$$E_i = \bigcap_{j=i+1}^{g} E_{i,j}, \quad i = 1, \ldots, g - 1. \quad (38)$$

These new events are not independent, but their probability is given by $\mathrm{Prob}(E_i) = \left(1 - \left(\frac{\pi - 2\Delta}{\pi - \Delta}\right)^2\right)^{g-i}$. By substituting (38) into (37), we obtain

$$\mathrm{Prob}\Big(\bigcap_{i<j} E_{i,j}\Big) = \mathrm{Prob}\Big(\bigcap_{i=1}^{g-1} E_i\Big) \overset{(d)}{\le} \min_i \mathrm{Prob}(E_i),$$

where (d) is derived according to the Fréchet inequality for the intersection of events, and $\min_i \mathrm{Prob}(E_i) = \mathrm{Prob}(E_1) = \left(1 - \left(\frac{\pi - 2\Delta}{\pi - \Delta}\right)^2\right)^{g-1}$, since it has the largest exponent. Thus, according to (34)-(39), the upper bound of (20) is derived.
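
The pairwise probability and the resulting bound can be checked with a small Monte Carlo experiment. The sketch below is only a sanity check under the stated uniform-angle model, not part of the original proof. It draws $g$ angles uniformly from $[-\pi/2 + \Delta/2, \pi/2 - \Delta/2]$, estimates the probability that two of them lie within $\Delta$ of each other, and estimates the probability that all of them do (equivalently, that their range is below $\Delta$). The function names, trial count and seed are illustrative choices.

```python
import math
import random

def pair_prob(delta):
    """Closed form 1 - ((pi - 2*delta) / (pi - delta))^2 for Prob(E_ij)."""
    return 1.0 - ((math.pi - 2.0 * delta) / (math.pi - delta)) ** 2

def simulate(g, delta, trials, seed=0):
    """Estimate Prob(E_12) and Prob(all pairwise distances < delta)."""
    rng = random.Random(seed)
    lo = -math.pi / 2 + delta / 2
    hi = math.pi / 2 - delta / 2
    pair_hits = 0
    all_hits = 0
    for _ in range(trials):
        theta = [rng.uniform(lo, hi) for _ in range(g)]
        if abs(theta[0] - theta[1]) < delta:
            pair_hits += 1
        # All pairwise distances are below delta iff the range is below delta.
        if max(theta) - min(theta) < delta:
            all_hits += 1
    return pair_hits / trials, all_hits / trials

g = 3
delta = math.pi / 18
pair_est, all_est = simulate(g, delta, trials=200_000)
bound = pair_prob(delta) ** (g - 1)
print(pair_est, pair_prob(delta), all_est, bound)
```

With these settings the empirical pairwise frequency should sit close to the closed-form value, and the empirical probability that all angles are mutually within $\Delta$ should stay below the bound.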

B. The derivation of SINR terms using MRC decoder
When MRC is utilized, the terms of achievable rate mentioned in (13) can be evaluated as follows.
• Interference exerted by the jammer, $E\{|(\hat{h}^n_k)^H h^n_w|^2\}$: As mentioned in Section IV, since the subspace of the jammer is excluded from the estimated channels, they are approximately orthogonal to the jammer's channel. Thus $E\{|(\hat{h}^n_k)^H h^n_w|^2\} \approx 0$.