Energy and spectral efficient relay selection and resource allocation in mobile multi-hop device to device communications

The expansion of smart devices and growing throughput requirements have triggered the integration of multi-hop Device to Device (D2D) communication as an underlay to the fifth generation (5G) network. This paradigm extends the coverage area and enables ultra-low latency and low cost-per-bit communications. While previous 3GPP releases address only the coverage problem via two-hop D2D, this paper focuses on multi-hop D2D as defined in 3GPP release 15 and beyond. It proposes a distributed relay selection and resource allocation algorithm based on a mixed strategy game and a novel amplify and forward factor. The proposed scheme maximizes the joint spectral and energy efficiency. In addition, it determines the optimal amount of data and energy transferred by each device. The efficiency of this approach against other state-of-the-art solutions is evaluated and verified through computer simulations.


INTRODUCTION
With the ever-increasing number of connected devices, the traditional uplink/downlink cellular mode has become unable to meet the requirements of 5G users. To address this issue, a set of technologies has been integrated, including D2D communications, heterogeneous networks and millimetre waves. D2D has drawn particular attention for its potential to improve system performance, enhance the user experience and enable more cellular applications [1].
D2D was first specified in 3GPP release 12 for public safety, to allow direct transmissions outside the coverage area. It was then used to offload traffic from eNodeB and to improve the total cellular capacity and energy efficiency (EE) inside the cell [2].
Release 13 opened the possibility of two-hop communication between the user equipment (UE) and the core network. An out-of-coverage device can access the network through the assistance of a fixed relay using outband or inband spectrum. In the former, devices use unlicensed spectrum such as Bluetooth and WiFi to communicate with the relay [3], whereas the latter uses the cellular spectrum in overlay or underlay mode. The underlay mode enhances the cellular capacity and the overall spectral efficiency. Yet, compared to the overlay scheme, it requires efficient management of dynamic resource allocation and relay selection [4].
Later, efficient solutions based on 3GPP release 14 were provided to maximize the benefits of D2D communications in terms of implementation, interference management, and spectrum reuse. Various schemes were proposed to develop adaptive solutions for ultra-dense and interference-rich networks [5].
Recently, releases 15 and 16 upgraded two-hop fixed relaying to mobile relaying and enabled multi-hop communication between devices inside the cell. This solution was proposed to overcome the limitations of short-range direct transmissions, which are constrained by interference and by the device's battery. In a multi-hop D2D network, a UE can act as a relay either between eNodeB and a UE or between two UEs. A variant of the first scenario is a cooperative cluster of UEs in which eNodeB transmits a data item to the cluster head, which then group-casts it to the others. Accordingly, it can efficiently support local data services such as file, audio, and video streaming through group-cast and broadcast transmissions. This approach can reduce latency and cost and offload more traffic from eNodeB. Moreover, through multi-hop D2D, a UE with limited processing capability or a low energy budget may offload computation to a nearby capable UE. Multi-hop D2D also enhances the Internet-of-Things (IoT) paradigm, which involves autonomous connectivity and communication among low-power devices. It is used to establish machine-to-machine (M2M) communication since it affords ultra-low latency and real-time responses [2].
IET Commun. 2021; 1-17. wileyonlinelibrary.com/iet-com
A particular application is vehicle-to-vehicle (V2V) communication, where D2D links are used to share information between neighbouring vehicles and offload more traffic. Multi-hop D2D can also support many other use cases, such as proximity services for public safety and operation in disaster-hit areas where all the base stations (BSs) are damaged. It can be applied to enhance sustainability and productivity in e-agriculture through smart devices and sensors [6]. It is a promising solution in mobile health (m-health) and Telecare Medicine Information Systems (TMIS), which require high-performance and efficient transmission in networks with billions of connected devices. It provides real-time responses, which matter given the sensitivity of shared e-health data. Moreover, it accounts for the scarcity of bandwidth in wireless networks [7].
The recent releases opened many horizons for 5G to expand its application areas. However, several questions arise regarding the integration of multi-hop D2D as an underlay to dense networks. How should the optimal channel be assigned for a direct D2D communication? How should the best relay be chosen to maintain the Quality of Service in a mobile multi-hop scenario? And how can this process be self-organized and generalized over a network with limited resources?
In this paper, we address these questions through a distributed energy- and spectral-efficient algorithm. It considers both direct and multi-hop D2D scenarios. In the first, the D2D pair selects the cellular channel that maximizes the SE/EE tradeoff while taking mobility into account. In the second, the relay selection considers the joint SE/EE, the UEs' mobility, and a novel amplify and forward factor. This algorithm combines the advantages of distributed approaches with minimal information exchange between the devices.
The rest of the paper is organized as follows. Section 2 analyses the previous research in D2D resource allocation, power consumption, and relay selection. Section 3 presents the system model applied to the proposed approach. Section 4 models the resource management and relay selection problem as a mixed strategy non-cooperative game. Section 5 describes the distributed spectral and energy-efficient algorithm. In Section 6, the joint SE/EE performance is analyzed and compared to former state-of-the-art solutions. Finally, Section 7 concludes the paper and proposes some perspectives for future work.

D2D relay types
Since 3GPP release 12, mobile devices can communicate in pairs via a direct D2D link. However, connectivity at the cell edge is still challenging. To address this limitation, a relaying technology called ProSe UE-to-network was introduced in 3GPP release 13. A static relay acts as a wireless backhaul to the base station and as an access provider for the remote UEs. This solution provides coverage to devices with poor connectivity. Yet, relays must support some eNodeB features and provide particular UL/DL interfaces. From 3GPP Rel-14, a UE can access the core network using a direct UE-to-UE relay interface. Recently, 3GPP Rel-15 enabled licensed multi-hop D2D; the remote UEs can communicate in pairs, forming a multi-hop cluster. Therefore, more data is transferred over a wider range with higher throughput and lower power consumption.

Related work
The joint EE/SE resource allocation and relay selection problem has not been well studied for release 15 multi-hop D2D. Most previous work focuses only on the release 13 and 14 coverage extension scenarios. In this context, the authors of [8] proposed a relay selection algorithm based on interference minimization. Paper [9] used a static cellular network environment with a single-cell scenario. It relies on the distance between the relay and the user equipment (UE) to form a cluster of candidates. Then, the most appropriate candidate is selected based on the bit error rate and outage probability metrics. Other studies considering power control are inspired by ad-hoc and sensor networks. Reference [10] investigates energy-efficient relay selection. It decreases the power consumption by minimizing the distance between the D2D transmitter and the relay. However, this approach generates interference in a dense cellular environment. The authors of [11] propose a similar scheme that selects the relay with the highest remaining battery energy. Consequently, more data is transferred to remote destinations. Yet, a problem arises when different transmitters select the same relay: it may quickly run out of battery and drop the communication. In [12], a power-aware algorithm is investigated to draw the multi-hop path.
In [13,14], to reduce the energy consumption, a transmitter selects the relay that requires the lowest transmission power. This approach favours the nearest relay. However, it increases the number of hops to the final destination, leading to higher end-to-end errors and delays.
Other works on two-hop D2D are deployed as an underlay to 4G networks in a centralized fashion. In [8], a semi-base station approach is investigated: the devices calculate the useful parameters and broadcast them to the base station, which selects the optimal one. In paper [15], a cluster of multiple idle UEs acting as relays is proposed, and the most appropriate relay is picked by eNodeB. In [16], the authors analyze the outage probability of cooperative D2D communication. In [17], a clustering method is proposed for out-of-band D2D to enhance spectrum reuse, while in [18], the base station selects the relay that minimizes the energy consumption. Paper [19] also considers centralized joint power control and resource allocation. The authors propose a multi-antenna base station employing ADCs with different resolutions. In reference [5], two-hop relaying is proposed to extend the cellular coverage. The authors develop an overhearing-based protocol that uses the overheard interference in the receiver design to enhance the cell spectral efficiency. Reference [36] studied energy-efficient joint source-relay power allocation. It investigates a centralized multi-input multi-output (MIMO) amplify-and-forward relaying system. It relies on a high signal-to-noise ratio (SNR) approximation and transforms the problem into a pseudo-convex optimization.
However, like other centralized approaches with high complexity, these schemes must restrict the number of iterations and settle for approximate solutions.
Clustering methods brought several advantages for multi-hop D2D. However, they assume knowledge of the channel state information (CSI). This knowledge is not always available in practice and requires significant signaling overhead. Besides, all the UEs must estimate their communication channels and feed them back to eNodeB, whereas D2D technology is self-organized and should not rely on the base station except in the pair discovery step.
Consequently, distributed approaches have recently received more attention in resource block allocation and relay selection schemes. In paper [20], the problem is formulated as a bipartite graph between the UE and a set of fixed relays to minimize the power consumption. In [21], a Stackelberg game is proposed. The relay determines the price of the power allocated by the transmitters. Then, the best relay is selected based on interference mitigation. The authors of [22] focused on a low-complexity algorithm to choose the nearest relay, whereas in [23], a distributed machine learning approach is proposed.
In [24], the authors propose a two-level Stackelberg game for relay selection based on power mitigation. Similarly, in [25], the approach favours energy harvesting relays in decode-and-forward mode.
In [26], a game theory model is investigated. It limits the relay location area to the transmitter's radius. Then, it analyses the candidate's capability to forward the data.
The authors of paper [24] propose an adaptive learning algorithm that allows transmitters to learn the optimal relay coordinates. However, if several transmitters share the relay simultaneously, their utilities decrease. In paper [27], the relay choice is broadcast to all D2D transmitters to allocate the optimal channel. This solution requires excessive message transmission to inform all DUEs about the relay selection. Consequently, it can increase signaling and interference.
Similar approaches are proposed to select the optimal resource block (RB) in a direct D2D communication. Paper [28] investigates an interference minimization mechanism based on UE locations; cellular UEs listen and measure the received signal to interference plus noise ratio (SINR) from a control channel. If it is higher than the maximum threshold, a report is sent to eNodeB, which then stops allocating RBs occupied by D2D pairs to cellular users. Paper [29] focuses on spectral efficiency (SE) optimization through resource allocation in a static environment. In [30], the authors used an analytic approach to evaluate the D2D system capacity based on auction theory.
Paper [31] investigates EE maximization through joint time allocation and relay selection while considering the UEs' SINR constraints. During this process, the energy collection and communication times are randomly allocated. Besides, the EE maximization problem is divided into two sub-problems: relay selection and time optimization. However, both the cellular and D2D UEs are considered static, and each D2D transmitter selects the best relay without relaying this choice to other devices. Consequently, the best relay is allocated to the first connected D2D pair, and the total SE decreases as the number of UEs increases.
Various techniques have been proposed to solve the interference problem and maximize the spectral efficiency in direct D2D communication. However, more challenging issues in 5G networks are energy consumption and mobility management. UEs are typically handheld devices with limited battery capacity. They can quickly drop the communication if mobility and energy consumption are ignored in the system design. Accordingly, more analysis is needed to provide a spectral- and energy-efficient multi-hop communication scheme and to preserve the performance gain as UEs move within and across the cell.

MAIN CONTRIBUTIONS
Based on the aforementioned schemes, we extend the previous research on two-hop coverage extension with an inter-cell multi-hop D2D approach that enhances both the SE and EE in a mobile scenario.
We take the optimization of the joint SE/EE as the main criterion of our algorithm. Multiple D2D pairs can be allocated to the same cellular channel and relay to maximize spectrum reuse. The device's mobility generates instantaneous fading, channel variation, and temporal correlation. Therefore, both the resource allocation and the relay selection consider the Doppler effect. The optimization problem is modelled as a mixed strategy non-cooperative game, and the relay amplification is expressed by a novel amplify and forward factor. The proposed algorithm provides the optimal transmission energy and amount of data to be transferred by each D2D transmitter and relay.
Besides, it requires only a reduced amount of data exchange between devices, which lowers the complexity and the convergence time.

SYSTEM MODEL
We assume a single-cell scenario with multiple UEs and an eNodeB located at the cell centre (Figure 1). We suppose that each UE has already selected its communication mode. Cellular UEs, denoted Nc, communicate with eNodeB through UL/DL channels, while the D2D set, denoted Nd, communicates in pairs reusing the same cellular links. Each source node $S_i$ ($i \in [1, Nd]$) aims to transfer its data symbol $x_i$ to the destination $D_i$ with the transmission power $P_i$. If the distance between the pair is less than the threshold $d_{max}$ and the channel condition supports the direct

Channel model
We consider time division duplexing with two time slots and an amplify and forward relay factor. The source transmits the $x_i$ symbols in the first time slot. In the second, the relay broadcasts a scaled version of the received signal. Finally, the destination cancels the self-interference and extracts the target signal. Figure 2 represents the channels between the source S, the relay R and the destination D, denoted f, g and h.
In a static environment, we can model the channel gain $h$ between the source and destination as a circularly symmetric complex Gaussian variable with zero mean and a given variance. Considering the large-scale fading effect, $g$ can be expressed as:

where $d_{i,j}$ is the distance between the pair and $h_{i,j}$ is the complex Gaussian channel coefficient satisfying $h_{i,j} \sim \mathcal{CN}(0,1)$. $|h_{i,j}|^2$ represents the log-normal shadowing and fading, and $\alpha$ is the path loss exponent. The device's mobility generates time variation and causes errors in the channel estimation. Accordingly, we can model the channel $f$ between the source and the relay as:

where $0 < a_i \le 1$, $i \in [1,3]$, is the indicator factor of the channel variation rate, related to the Doppler shift by the zero-order Bessel function; $\Delta h(n)$, $\Delta f(n)$ and $\Delta g(n)$ are the time-varying components of the channels. The $\Delta_i$ are independent and identically distributed (iid), zero-mean complex Gaussian random processes with variances $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, respectively, that is, $\mathcal{CN}(0, \sigma_i^2)$. In a direct D2D scenario, the channel estimate $\hat{h}$ between the source and the destination is a complex Gaussian random process expressed by:

Thus, the instantaneous SNR at the receiver is:

where $P_s$ is the source node transmission power. Similarly, in a multi-hop scenario, we can model the channel $f(n)$ between the source and the relay and the channel $g(n)$ between the relay and the destination by:

$\hat{f} = f_0$ and $\hat{g} = g_0$ at the beginning of transmission or when there is no channel variation.
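The mobility model described above can be illustrated with a minimal sketch. It assumes a first-order autoregressive update $h(n) = a\,h(n-1) + \Delta h(n)$ with $a = J_0(2\pi f_d T)$ and $\Delta h(n) \sim \mathcal{CN}(0, \sigma^2)$. This update rule, the choice $\sigma^2 = 1 - a^2$, the function names and the numerical values are assumptions made for illustration and are not taken from the paper's equations.

```python
import math
import random


def bessel_j0(x, steps=2000):
    """Zero-order Bessel function of the first kind, from its integral form
    J0(x) = (1/pi) * integral over [0, pi] of cos(x * sin(t)) dt (trapezoid rule)."""
    width = math.pi / steps
    total = 0.5 * (math.cos(x * math.sin(0.0)) + math.cos(x * math.sin(math.pi)))
    for k in range(1, steps):
        total += math.cos(x * math.sin(k * width))
    return total * width / math.pi


def complex_gaussian(variance, rng):
    """Draw one sample from CN(0, variance)."""
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def evolve_channel(h0, a, sigma2, n_steps, rng):
    """Assumed AR(1) update: h(n) = a * h(n-1) + delta(n), delta ~ CN(0, sigma2)."""
    h = h0
    trace = [h]
    for _ in range(n_steps):
        h = a * h + complex_gaussian(sigma2, rng)
        trace.append(h)
    return trace


rng = random.Random(0)
doppler_hz, slot_s = 50.0, 1e-3            # hypothetical Doppler shift and slot length
a = bessel_j0(2.0 * math.pi * doppler_hz * slot_s)
sigma2 = 1.0 - a ** 2                       # assumption: keeps the channel variance at 1
trace = evolve_channel(complex_gaussian(1.0, rng), a, sigma2, 100, rng)
```

A value of `a` close to 1 corresponds to a slowly varying channel, while a smaller `a` gives the innovation term more weight.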

Relay amplifying factor
In the literature, relaying strategies fall into two main categories: Decode-and-Forward (DF) and Amplify-and-Forward (AF). DF maximizes the spectrum efficiency. Yet, it raises the probability of errors, especially when the devices are in motion. AF is simple to implement. It outperforms DF in coverage extension and capacity, especially in dense networks with significant shadowing [17]. AF can have a fixed or a variable gain. The first applies a constant coefficient to the signal received over the source-relay hop, while the second ensures an accurate output power by inverting the channel. However, the latter requires knowledge of the instantaneous channel gains.
To satisfy the average energy, we bound the amplifying factor according to the following constraint:

To adjust the energy constraint of the channel $f$ between the source and the relay, $\beta(k)$ is expressed as:

Most AF approaches set the constant to 1 or to $P_r$. The $\beta$ factor then cannot capture the combined effects of fading and noise, as it depends only on the channel variance $\sigma_2^2$. To give the amplifying factor more accuracy, we use the following formula:

$d$ and ɣ represent the distance and the path loss exponent of the channel $f$, respectively. Using this knowledge, we calculate the amplification factor $\beta_k^2$ of the $k$th symbol duration and obtain:

When the CSI is not available, that is, $a_1^2 = 0$.
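To make the fixed-gain and variable-gain distinction concrete, the sketch below uses the common textbook AF gains, $\beta^2 = P_r / (P_s \sigma_f^2 + N_0)$ and $\beta^2 = P_r / (P_s |f|^2 + N_0)$. These are assumed standard forms, not the novel factor proposed in this paper, and all function names and parameters are hypothetical.

```python
def fixed_gain_sq(p_r, p_s, f_variance, n0):
    """Textbook fixed gain: uses only the average gain of the source-relay channel."""
    return p_r / (p_s * f_variance + n0)


def variable_gain_sq(p_r, p_s, f_gain, n0):
    """Textbook variable gain: inverts the instantaneous gain |f|^2 (needs CSI)."""
    return p_r / (p_s * f_gain + n0)


def af_end_to_end_snr(p_s, f_gain, g_gain, beta_sq, n0):
    """SNR at the destination over the relayed path source -> relay -> destination.

    The relay amplifies both the useful signal and its own receiver noise,
    so the noise term is n0 * (beta^2 * |g|^2 + 1).
    """
    signal = p_s * f_gain * beta_sq * g_gain
    noise = n0 * (beta_sq * g_gain + 1.0)
    return signal / noise


# Hypothetical link budget: |f|^2 = 0.5, |g|^2 = 0.25, noise power 0.1.
beta_sq = variable_gain_sq(p_r=2.0, p_s=1.0, f_gain=0.5, n0=0.1)
snr = af_end_to_end_snr(p_s=1.0, f_gain=0.5, g_gain=0.25, beta_sq=beta_sq, n0=0.1)
```

With the variable gain, the result matches the well-known closed form $\gamma_1 \gamma_2 / (\gamma_1 + \gamma_2 + 1)$, where $\gamma_1$ and $\gamma_2$ are the per-hop SNRs.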
In a relay-assisted communication, we can use the Maximum Ratio Combiner (MRC). It combines the signals received over the source-destination and the source-relay-destination links.
Given the signal, the channels at the beginning of the transmission, and their indicators, we can express the maximum likelihood estimation of the received signal by Equations (14) and (15). The optimal detection involves maximum ratio combining of the received signals scaled by their variances. The SNR $\gamma$ at the receiver can be expressed as defined in Equation (18), where $\bar{\gamma}_h$ and $\bar{\gamma}_g$ are the average SINR at the destination coming from the source node and the relay, respectively. $|f_0|^2$, $|g_0|^2$ and $|h_0|^2$ represent the log-normal shadowing and fading corresponding to the given channels.
In a direct D2D scenario, the destination receives the following interference:

where $a_1$ represents the indicator factor of the source-destination channel, $P_c$ is the transmission power of the co-channel cellular UE, $|h_{0,c}|^2$ is the cellular instantaneous gain and $\sum_{j=1, j \neq i}^{D} a_1^2 P_j |h_{0,j}|^2$ represents the interference caused by all other co-channel D2D pairs. We can express the SINR at the receiver as:

whereas in a relay-assisted D2D communication, the interference received at the relay can be expressed as:

and at the final destination as:

Accordingly, the final SINR at the receiver is defined in Equation (23).
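As a rough numerical illustration of the direct D2D case, the sketch below adds the co-channel cellular interference and the interference of the other co-channel D2D pairs, then forms the SINR. The exact placement of the $a_1^2$ factor on each term and the noise power `n0` are assumptions; the function name and values are hypothetical.

```python
def direct_d2d_sinr(p_i, gain_i, p_c, gain_c, other_pairs, a1, n0):
    """SINR at the D2D receiver of pair i on one shared cellular channel.

    other_pairs: iterable of (p_j, gain_j) for the co-channel D2D pairs j != i.
    Assumption: the channel indicator a1 scales the useful signal and the
    D2D interference terms, but not the cellular interference term.
    """
    d2d_interference = sum(a1 ** 2 * p_j * gain_j for p_j, gain_j in other_pairs)
    interference = p_c * gain_c + d2d_interference
    return (a1 ** 2 * p_i * gain_i) / (interference + n0)


# Hypothetical values: one cellular UE and two other D2D pairs share the channel.
sinr = direct_d2d_sinr(
    p_i=1.0, gain_i=1.0,
    p_c=0.5, gain_c=0.2,
    other_pairs=[(1.0, 0.1), (0.5, 0.2)],
    a1=1.0, n0=0.1,
)
```

Each additional D2D pair reusing the channel adds a term to the denominator, which is why spectrum reuse and per-link SINR pull in opposite directions.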

Average capacity
We define the average SINR for the $h$, $f$ and $g$ channels as

where $I_h$, $I_f$ and $I_g$ are the interferences sensed by the receiver on the given channel.
Considering the AF factor, the instantaneous capacity at the receiver can be expressed by Equation (25).
$B$ is the system bandwidth and $\gamma$ is the SINR given by Equations (20) and (23). The division by 2 indicates that the transmission protocol takes two time slots to transmit a packet from the source to the destination via a relay. In a fading channel, the SINR becomes a random variable. To obtain the averaged capacity $C_{erg}$, we average $C_{ins}$ over many Monte Carlo realizations.
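The Monte Carlo averaging described above can be sketched as follows, assuming a Shannon-type capacity $B \log_2(1 + \gamma)$, halved for the two-slot relayed case, and Rayleigh fading (unit-mean exponential power gain). The fading law, the function names and the parameter values are assumptions made for illustration.

```python
import math
import random


def instantaneous_capacity(bandwidth_hz, sinr, relayed):
    """Shannon-type capacity; the factor 1/2 accounts for the two time slots
    used when the packet goes through a relay."""
    slot_factor = 0.5 if relayed else 1.0
    return slot_factor * bandwidth_hz * math.log2(1.0 + sinr)


def ergodic_capacity(bandwidth_hz, mean_sinr, relayed, trials, rng):
    """Average the instantaneous capacity over random fading realizations."""
    total = 0.0
    for _ in range(trials):
        # Assumption: Rayleigh fading, so the power gain is exponential with mean 1.
        fading_gain = rng.expovariate(1.0)
        total += instantaneous_capacity(bandwidth_hz, mean_sinr * fading_gain, relayed)
    return total / trials


c_direct = ergodic_capacity(1.0, 10.0, False, 20000, random.Random(1))
c_relayed = ergodic_capacity(1.0, 10.0, True, 20000, random.Random(1))
```

By Jensen's inequality the averaged capacity stays below $B \log_2(1 + \bar{\gamma})$, which is a quick sanity check on any such simulation.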

Total consumed power
Let $p_i^k$ be the transmission power of the $i$th DUE in the $k$th channel. In a direct D2D communication, the total consumed power $p_{i,total}^{d}$ is composed of the power of the $i$th D2D transmitter, the $k$th cellular UE, the $j$th D2D transmitters in the $k$th cellular channel, and the circuit power of the D2D transmitter and receiver, denoted $p_{crt}^{t}$. We assume that the circuit power is the same for any UE.
$\eta$ is the power amplifier (PA) efficiency, with $0 < \eta < 1$. In a relay-assisted scenario, we consider both the transmit power and the circuit power of the $r$th relay, denoted $p_r^t$ and $p_{crt}$, respectively.
Then, the EE of the $i$th DUE can be expressed as:
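As a rough numerical illustration of the power and EE definitions in this subsection, the sketch below assumes that the transmit powers are divided by the PA efficiency (called `eta` here), that the circuit power is counted once for the transmitter and once for the receiver, and that the EE is the ratio of spectral efficiency to total consumed power. These forms and all names are assumptions, not the paper's exact expressions.

```python
def total_power_direct(p_i, p_c, p_others, eta, p_crt):
    """Assumed total power of a direct D2D link on one shared channel.

    p_i: D2D transmitter power, p_c: co-channel cellular UE power,
    p_others: powers of the other co-channel D2D transmitters,
    eta: PA efficiency in (0, 1), p_crt: circuit power of one UE.
    """
    if not 0.0 < eta < 1.0:
        raise ValueError("PA efficiency must lie strictly between 0 and 1")
    return (p_i + p_c + sum(p_others)) / eta + 2.0 * p_crt


def energy_efficiency(spectral_efficiency, total_power):
    """EE as spectral efficiency per unit of consumed power."""
    return spectral_efficiency / total_power


# Hypothetical values in watts.
p_total = total_power_direct(p_i=0.2, p_c=0.2, p_others=[0.1], eta=0.5, p_crt=0.05)
ee = energy_efficiency(2.2, p_total)
```

The circuit power acts as a fixed cost, so lowering the transmit power improves the EE only up to the point where the loss in capacity dominates.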

THE JOINT EE/SE MAXIMIZATION PROBLEM
In this section, we formulate the problem of joint SE/EE resource allocation and relay selection in a D2D scenario. Our main goal is to maximize the spectrum reuse and minimize the total consumed power in the network. The cellular links are allocated different RBs, while the D2D pairs reuse the same channels. Thus, interference is added to every link occupying an RB shared with others.
Let D1 be the set of direct D2D pairs in the cell, D2 the set of D2D pairs requiring relay assistance, and R the set of relays. Nd1, Nd2, and Nr are their respective cardinalities.
The EE/SE maximization problems of all direct and multi-hop D2D pairs in the cell can each be formulated as a mixed-integer programming problem:

max

Subject to:

where the first capacity term is the total capacity of the $i$th D2D pair over all the cellular channels, and $C_{j,erg}^{k}$ denotes the total capacity of the $k$th relay over all the D2 pairs. The constraint (29) specifies the minimum SE requirement for every D2D communication in the cell, and (30) limits the devices' power consumption.
The above problem aims to maximize the total energy efficiency of all links in the cell. It is centralized and NP-hard [8]; eNodeB must know the full channel state information (CSI) of all the links. This assumption is not always feasible, especially in dense networks. To address this issue, we adopt a distributed approach based on approximation methods.

Game theory in cellular networks
In situations of conflicting decisions with known payoffs, game theory can efficiently determine the best outcome for each player. The result takes all the players' decisions into account.
A game can be cooperative or non-cooperative. The first describes how a coalition of players allocates the payoff, while the second defines how each player deals with the others to achieve its goal. Recent research has widely applied pure strategy approaches in cellular networks. However, the equilibrium of the game is not always guaranteed, especially with a high number of players.

Mixed strategy game model
According to Nash's theorem, every finite game has at least one equilibrium in mixed strategies. Hence, we model the distributed resource allocation and relay selection problem as a mixed strategy non-cooperative game $G = \{N, S, U\}$. Each player acts independently of the others and attempts to select the strategy that maximizes its utility function. The game reaches equilibrium when no player can increase its payoff by changing only its own decision while the others keep theirs.
N, the players: the UEs inside the cell, that is, the cellular UEs denoted Nc, the D1 direct D2D pairs, the D2 pairs requiring relay assistance, and the relays R.
S, the strategies: each UE tries to choose the strategy that maximizes its payoff, namely a cellular RB or a relay according to the D2D communication type (Table 1). Mathematically, $S = [S^{D1}, S^{R}]$ represents all the pure strategies of the game; $S^{D1}$ refers to the pure strategies of the D1 pairs and $S^{R}$ to the pure strategies of the R relays.
In a direct D2D communication, the strategy set of the transmitter $i \in D1$ is denoted $s_i^{D1}$, while that of any other D2D transmitter in $D1 \setminus \{i\}$ is denoted $s_{-i}^{D1}$. In a multi-hop communication, the strategies of the relay $k \in R$ and of the other relays in $R \setminus \{k\}$ are denoted $S_k^{R}$ and $S_{-k}^{R}$, respectively.
The EE of the $i$th D2D pair and of the $k$th relay, denoted $U_{i,EE}^{D1}$ and $U_{k,EE}^{R}$ (bits/Hz/J), respectively, depend on both the given player's strategy and all the strategies taken by the other players in D1 and R, respectively, that is,

Similarly, the SE (bits/s/Hz) of the $i$th pair from the D1 set and of the $k$th relay from the R set depends not only on its own strategy $S_i^{D1}$ and $S_k^{R}$ but also on the strategies taken by the other D2D pairs.

The pure strategy payoff of the $i$th player for the profile $\sigma$ is:

The players mixed strategies
Let $s_i^{j_i}$ be the pure strategy played by the $i$th player. In a mixed strategy game, at least one player's strategy is a random variable.
We denote by $u_i(\sigma) = \sum_{s \in S} u_i(s)\,\sigma(s)$ the player's payoff over the pure strategy space $S$, where $\sigma$ is the combination of its own strategy $\sigma_i$ and the other players' strategies, denoted $\sigma_{-i}$. If a mixed strategy combination $\sigma$ is played, then the probability that the pure strategies combination $s = (s_1, \dots, s_n)$ occurs is $\sigma(s) = \prod_{i \in N} \sigma_i(s_i)$. Accordingly, the payoff assigned to the $i$th player is given by $u_i(\sigma) = \sum_{s \in S} u_i(s)\,\sigma(s)$, where $u_i(s)$ represents its payoff at the pure strategies combination $s$.
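The expected payoff under a mixed strategy combination can be computed directly from its definition, as in the sketch below. It assumes that the players randomize independently, so the probability of a pure profile is the product of the individual probabilities. The two-pair, two-channel payoff values are hypothetical and serve only as an example.

```python
from itertools import product


def profile_probability(mixed, profile):
    """Probability of a pure profile when players mix independently."""
    prob = 1.0
    for player, action in enumerate(profile):
        prob *= mixed[player][action]
    return prob


def expected_payoff(payoff, mixed, player):
    """u_i(sigma) = sum over pure profiles s of u_i(s) * sigma(s).

    payoff: dict mapping a pure profile (tuple of action indices) to the
            tuple of payoffs of all players.
    mixed:  list with one probability vector per player.
    """
    action_sets = [range(len(strategy)) for strategy in mixed]
    return sum(
        profile_probability(mixed, s) * payoff[s][player]
        for s in product(*action_sets)
    )


# Hypothetical game: two D2D pairs choose between two channels; sharing a
# channel yields 0.4 each, using different channels yields 1.0 each.
payoff = {
    (0, 0): (0.4, 0.4), (0, 1): (1.0, 1.0),
    (1, 0): (1.0, 1.0), (1, 1): (0.4, 0.4),
}
mixed = [[0.5, 0.5], [0.5, 0.5]]
u0 = expected_payoff(payoff, mixed, player=0)
```

Here each pure profile has probability 0.25, so the expected payoff is the plain average of the four outcomes.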
In such a game, the payoff of a player depends on all the strategies taken by the other players. The strategies combination $\sigma$ can then be represented by $(\sigma_{-i}, \sigma_i)$.
The main concern of our algorithm is to calculate the mixed strategy Nash equilibrium (MSNE) of the game, denoted $\sigma^*$, that verifies:

In a non-cooperative game, each player aims to maximize its payoff over the possible mixed strategy combinations.

Nash equilibrium
Nash Theorem: Every finite non-cooperative game has at least one mixed strategy Nash equilibrium.
Accordingly, (P) has at least one solution in $S$ verifying:

The existence of the equilibrium does not imply that it is necessarily optimal. There may be other choices of the players that lead, for each of them, to a higher gain. The equilibrium means that no player can attain a better payoff than that at the Nash equilibrium by changing only its own mixed strategy while all the others remain unchanged. Yet, if the utility function is convex, the MSNE denoted $s^*$ represents the optimal combination of all the players' strategies in the game. It verifies:

It represents the optimal SE/EE resource block and relay selection for all D2D pairs in the cell.

Mathematical formulation
The objective function defined in (24) is non-convex. It can be transformed into a concave function using the nonlinear fractional programming developed in [33]. We define the maximum EE of the $i$th D2D pair as:

$b_i$ is the best response of the $i$th D2D transmitter given the other UEs' strategies. It represents its maximum payoff.
Here, the sense of optimization for each player is to maximize its payoff when its opponents play their Nash equilibrium strategies. Consequently, player $i$ attempts to minimize the gap between its optimal payoff $b_i$ and the payoff obtained by a possible mixed strategies combination. The following theorem can be proved:
Theorem 1. The maximum payoff of the $i$th player is achieved if and only if $b_i$ − max (eq 28) = 0.
This theorem shows that the transformed problem with an objective function in subtractive form is equivalent to the non-convex problem in fractional form, which means they lead to the same optimum solution $\sigma^*$:

The optimization problem of this player can be modeled as follows:

where $(\sigma_{-i}, s_i^j)$ denotes the mixed strategies combination in which the $i$th player plays its $j$th pure strategy, that is, the probability assigned to the $j$th pure strategy is 1.
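The equivalence between the fractional and the subtractive forms can be illustrated with a Dinkelbach-type iteration, sketched below for a single link. The rate and power models, the candidate power grid and the function names are hypothetical; the sketch only shows that the ratio $q$ is optimal exactly when $\max_p \{R(p) - q\,P(p)\} = 0$.

```python
import math


def dinkelbach(rate, power, candidates, tol=1e-9, max_iter=50):
    """Maximize rate(p) / power(p) over a finite candidate set.

    Each iteration solves the subtractive problem max_p rate(p) - q * power(p)
    and then updates q to the ratio achieved by the maximizer. power(p) must
    be strictly positive for every candidate.
    """
    q = 0.0
    best = candidates[0]
    for _ in range(max_iter):
        best = max(candidates, key=lambda p: rate(p) - q * power(p))
        gap = rate(best) - q * power(best)
        if gap < tol:          # subtractive optimum is (numerically) zero
            break
        q = rate(best) / power(best)
    return best, q


# Hypothetical link: SE = log2(1 + 10 p), consumed power = p + 0.1 (circuit power).
rate = lambda p: math.log2(1.0 + 10.0 * p)
power = lambda p: p + 0.1
grid = [k / 1000.0 for k in range(1, 1001)]
p_star, ee_star = dinkelbach(rate, power, grid)
```

On a finite grid the result can be checked against a brute-force search over the ratio, which is what makes this form convenient for validating an implementation.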
($P_i$) is a nonlinear programming problem. Applying the Karush-Kuhn-Tucker (KKT) optimality conditions, we can obtain a local optimum satisfying the KKT first-order necessary and sufficient conditions, and derive the solution.
Lemma 1. A necessary and sufficient condition for $\sigma$ to be a Nash equilibrium of the game $G$ is:

If such a $\sigma$ exists, then it is nothing other than the optimal solution of the nonlinear programming problems ($P_i$) for $i \in N$, with a global optimal value equal to 0.
In the rest, we show that the Nash equilibrium is an optimal solution of a single optimization problem.
Theorem 2. A necessary and sufficient condition for $\sigma$ to be a Nash equilibrium of the game $G$ is that it is an optimal solution of the following minimization problem:

The optimal value of this problem is 0. The value of $b_i$ at the optimal point gives the expected payoff of the player $i$.
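The statement of Theorem 2 can be checked numerically on a small two-player game. The sketch below assumes that the minimized quantity is the sum over players of $b_i - u_i(\sigma)$, with each $b_i$ set to the smallest value satisfying $u_i(\sigma_{-i}, s_i^j) \le b_i$ for all pure strategies $j$. This form is an assumption, since the problem statement is not reproduced here, and the payoff matrices are hypothetical.

```python
def pure_payoffs(payoff_rows, opponent_mixed):
    """Expected payoff of each pure action against the opponent's mixed strategy."""
    return [sum(u * q for u, q in zip(row, opponent_mixed)) for row in payoff_rows]


def nash_gap(A, B, x, y):
    """Sum over both players of b_i - u_i(sigma) for a two-player game.

    A[i][j], B[i][j]: payoffs of players 1 and 2 when they play actions i and j.
    x, y: mixed strategies of players 1 and 2.
    b_i is taken as the best pure-strategy payoff against the opponent, so the
    gap is non-negative and vanishes exactly at a Nash equilibrium.
    """
    u1 = pure_payoffs(A, y)
    u2 = pure_payoffs([list(col) for col in zip(*B)], x)
    b1, b2 = max(u1), max(u2)
    v1 = sum(p * u for p, u in zip(x, u1))
    v2 = sum(p * u for p, u in zip(y, u2))
    return (b1 - v1) + (b2 - v2)


# Hypothetical channel-selection game: sharing a channel yields 0.4, otherwise 1.0.
A = [[0.4, 1.0], [1.0, 0.4]]
B = [[0.4, 1.0], [1.0, 0.4]]
gap_mixed = nash_gap(A, B, [0.5, 0.5], [0.5, 0.5])   # mixed equilibrium
gap_pure = nash_gap(A, B, [1.0, 0.0], [0.0, 1.0])    # pure equilibrium
gap_clash = nash_gap(A, B, [1.0, 0.0], [1.0, 0.0])   # both on the same channel
```

A general-purpose constrained solver would minimize this gap over the probability simplices; the sketch only evaluates it at given strategy profiles.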
Proof of Theorem 2: In light of Lemma 1, the feasible set of ($P_i$) is nonempty, since at least one Nash equilibrium exists for every finite non-cooperative game.
Thus, if $\sigma^*$ is a Nash equilibrium, it is feasible for ($P_i$):

yielding that $\sigma^*$ is an optimal solution of ($P_i$).
By Nash's existence theorem, there must exist at least one $(\sigma, b_1, \dots, b_n)$ that is a global minimum for ($P_i$), verifying:

Consequently, $\sigma^*$ is a Nash equilibrium of the game. On account of Lemma 1, the payoff $b_i^*$ is the optimal expected payoff of the player $i$.
To solve the problem ($P_i$), we reformulate it as a vector in the $(m + n)$-dimensional Euclidean space.
Let $x$ be a vector of length $m + n$ described as follows. Arranging the strategies of players 1 to $n$ in order, we have a total of $m$ strategies, where

Performing this transformation, the optimization problem ($P_i$) is converted to the following form:

The resulting problem is a non-linear optimization model in a space $S$ of dimension equal to the total number of pure strategies in the game plus the number of players. $S = C \times D1$ in the resource allocation stage and $S = R \times D2$ in the relay selection stage.

Algorithm presentation
We coded a sub-algorithm in MATLAB to calculate the MSNE. Computing the solution of the formulated n-person non-cooperative finite game is equivalent to solving the non-linear optimization problem ($P_i$), whose constraints and objective function are polynomials. Moreover, the number of its variables equals the sum of the number of players and the total number of available pure strategies. To solve this problem, we used a sequential quadratic programming-based quasi-Newton method; see reference [32].
The input of the sub-algorithm is the matrix $M$ representing the payoffs $u_i^j$.
$M$ :
The main goal of the proposed approach is to maximize the EE of each player subject to its SE and power constraints. Since D2D communication is self-organizing, each DUE executes the following "Energy and spectral efficient RA/RS" algorithm steps without relying on eNodeB except for pair discovery.
Before starting a communication session, any UE that wants to communicate in D2D mode sends a request to eNodeB, which assigns a tag containing the transmitter's Id to the D2D transmitter and broadcasts it in the coverage area of the cell. If a D2D receiver satisfies the transmission radius and channel condition constraints, it replies with an Ack containing its Id.
Consequently, the list of the possible receivers of any $i$th D2D transmitter is formed as below:

where $ROI$ represents the cell. $R_i \neq 0$, $d(i,j)$ and $d_{th}$ represent, respectively, the transmission radius of UE $i$, the distance between the $i$th and the $j$th DUE, and the maximum distance threshold between them. $SINR_{i,j}$ is the signal to interference plus noise ratio at the $j$th receiver, and $SINR_{threshold}$ represents the minimum SINR required to guarantee the quality of the communication.
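The receiver list described above can be sketched as a simple filter. It assumes that a candidate must lie within both the transmission radius and the distance threshold and must meet the SINR threshold; the data layout (a dictionary of planar coordinates), the `sinr` callback and the numerical values are hypothetical.

```python
import math


def candidate_receivers(tx_id, positions, radius, d_th, sinr, sinr_min):
    """Ids of the UEs that can serve as direct D2D receivers for tx_id.

    positions: dict mapping UE id -> (x, y) coordinates inside the cell.
    sinr:      callable (tx_id, rx_id) -> SINR measured at the receiver.
    """
    tx_pos = positions[tx_id]
    selected = []
    for ue_id, pos in positions.items():
        if ue_id == tx_id:
            continue
        distance = math.dist(tx_pos, pos)
        in_range = distance <= radius and distance <= d_th
        if in_range and sinr(tx_id, ue_id) >= sinr_min:
            selected.append(ue_id)
    return selected


# Hypothetical layout (metres) and a toy SINR model for illustration only.
positions = {"a": (0.0, 0.0), "b": (3.0, 4.0), "c": (30.0, 40.0), "d": (6.0, 8.0)}
toy_sinr = lambda tx, rx: 1.0 if rx == "d" else 20.0
receivers = candidate_receivers("a", positions, radius=20.0, d_th=15.0,
                                sinr=toy_sinr, sinr_min=5.0)
```

Pairs that fail either test would fall into the relay-assisted set and build a candidate relay list in the same way.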
If the distance between the D2D pair exceeds the maximum threshold, or if the transmitter experiences poor channel conditions, the pair belongs to the D2 list and forms the list of its candidate relays. At the equilibrium, every D2D pair belonging to D1 selects its optimal channel, its transmission power, and the amount of data to transfer in each RB. Then, the source sends its signal through the allocated spectrum, and the destination turns to the same resource block to receive it. Similarly, every relay from R selects the best D2D candidates from D2 to maximize their SE/EE, and determines its optimal transmission energy and the amount of data to relay.
DUEs move randomly across the cell. The direction and velocity of every UE are independent of the time slot. We set random a_i factors for each channel in the cell. These factors represent the channel variation rates, which are related to the Doppler shift through the zero-order Bessel function. Δh(n), Δf(n) and Δg(n) represent the time-varying components of the channels; they are independent and identically distributed (iid), zero-mean complex Gaussian random processes with variances $\sigma_1^2$, $\sigma_2^2$ and $\sigma_3^2$, respectively, that is, $\mathcal{CN}(0, \sigma_i^2)$, i ∈ [1, 3], as described in (4).
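A common way to tie a channel variation rate to the Doppler shift is a = J0(2π f_D T), where J0 is the zero-order Bessel function of the first kind, f_D the maximum Doppler shift and T the slot duration. The sketch below assumes this relation together with a first-order update h(n) = a h(n−1) + Δh(n); Equation (4) may scale the terms differently, and the carrier frequency, slot length and function names are illustrative assumptions.

```python
import math
import random


def bessel_j0(x, terms=30):
    """Zero-order Bessel function of the first kind (power series, moderate x only)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((k + 1) ** 2)
    return total


def variation_rate(speed_mps, carrier_hz=2.0e9, slot_s=1.0e-3):
    """a = J0(2*pi*f_D*T) with f_D = v*f_c/c (assumed Jakes-type relation)."""
    doppler_hz = speed_mps * carrier_hz / 3.0e8
    return bessel_j0(2.0 * math.pi * doppler_hz * slot_s)


def evolve_channel(h_prev, a, sigma2, rng):
    """One assumed update h(n) = a*h(n-1) + delta, with delta ~ CN(0, sigma2)."""
    std = math.sqrt(sigma2 / 2.0)  # variance split over real and imaginary parts
    delta = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
    return a * h_prev + delta


rng = random.Random(0)
a_slow = variation_rate(3.0)    # pedestrian speed, a close to 1
a_fast = variation_rate(30.0)   # vehicular speed, noticeably smaller a
h_next = evolve_channel(1 + 0j, a_slow, 1.0 - a_slow ** 2, rng)
```

Under these assumptions a faster device yields a smaller a_i, that is, a channel that decorrelates more quickly between slots.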
All these parameters are integrated into the utility function (24) through the instantaneous SINR given by Equation (23). We then investigate the impact of UE mobility on the resource allocation and relay selection via the UE coordinates and the a_i factors.
In this algorithm, we focus on a distributed Nash equilibrium computation mechanism that uses only local information. We assume that all cellular links have already been allocated their RBs and that every D2D link knows its coordinates in the cell. Any D2D pair can derive the interference caused by co-channel cellular UEs and distinguish it from that of other D2D pairs. This information is obtained from the cellular RB allocation broadcast by the eNodeB.
it_max defines the maximum number of iterations. MSNEA and MSNER are the final matrices of the optimal strategies for each player from D1 and D2, respectively. In every time slot, a new UE can access the cell sequentially and attempt to maximize its payoff.
D1 devices communicate directly in pairs. Each pair competes for the optimal RBs that maximize its utility function, taking the strategies of the others into account.
The D2 set represents the pairs requiring relay assistance. Each pair builds the list of its candidate relays DR, which represent its pure strategies, and then the payoff matrix M. To obtain the MSNE that maximizes the payoffs of all devices, this process repeats until no D2D pair can strictly improve its utility or until it_i = it_max. Nash's theorem guarantees that an MSNE exists in this finite game.
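The stopping rule (iterate until no pair can strictly improve or the iteration cap is reached) can be illustrated with a deliberately simplified stand-in. The sketch below uses pure strategies and a toy utility equal to minus the number of co-channel pairs on the chosen RB; it is not the mixed-strategy computation of the paper, and all names and values are hypothetical.

```python
def best_response_rounds(num_pairs, num_rbs, it_max=50):
    """Sequential best responses over RB choices; returns (choices, rounds used)."""
    choice = [0] * num_pairs  # every pair starts on RB 0
    for it in range(1, it_max + 1):
        improved = False
        for i in range(num_pairs):
            # Count the other pairs on each RB (the toy interference measure).
            load = [0] * num_rbs
            for j, rb in enumerate(choice):
                if j != i:
                    load[rb] += 1
            best_rb = min(range(num_rbs), key=lambda rb: load[rb])
            if load[best_rb] < load[choice[i]]:  # move only on strict improvement
                choice[i] = best_rb
                improved = True
        if not improved:  # no pair can strictly improve: stop
            return choice, it
    return choice, it_max


choices, rounds = best_response_rounds(num_pairs=4, num_rbs=2)
```

In this toy game the pairs spread evenly over the RBs and the loop exits well before it_max.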
At the equilibrium, the maximum achievable EE/SE is reached for all the players. The equilibrium accounts for both the devices' mobility and the novel relay amplification factor, which are managed in the same utility function. Our approach operates with sequential UE access, while other solutions require all the D2D communications to be established before executing the algorithm. Two advantages make the proposed algorithm of particular interest. The first is that it exploits the benefits of mixed strategies: it provides the optimal amount of data and energy transferred by each player and relay. The second is the reduced data exchange between devices. In the existing distributed approaches, the strategy of each player is broadcast to all the UEs. In this algorithm, each D2D pair only has to estimate the sensed interference on the available channels, and then optimizes its power without knowing the exact strategies of the others.
The proposed solution considers the devices' mobility, their transmission and circuit power, and all the players' strategies. It can be applied to the earlier 3GPP two-hop wireless scenarios [5] and to MIMO systems such as those in [36]. First, it allows a D2D pair to be allocated several RBs simultaneously according to its bandwidth requirement and its number of antennas (the x_i variable in Equation (47)). Second, it can extend the coverage to mobile devices experiencing poor channel conditions. This can be achieved by considering the eNodeB as a transmitter in the communication process and updating the related interference factor and SINR in Equations (22) and (23).

The complexity and the convergence of the proposed algorithm
The proposed algorithm presented in Table 2 is based on the nonlinear fractional programming developed in [33]. It solves the convex problem (24) for each player and seeks the optimal solution of the optimization model (P_i), whose optimal value is zero. To obtain the solution, we use the sequential quadratic programming (SQP) based quasi-Newton method; for a detailed description of this approach, we refer to [32]. This method is proved to converge to the optimum at a superlinear rate [34], [37]. The complexity of this algorithm is O(n^2), where n is the number of variables. In our case, n = D1 + m1 (cellular RBs) for the first-stage resource allocation and n = D2 + m2 (relays) for the relay selection, where m1 and m2 denote the total number of pure strategies of the D1 and D2 players, respectively.
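For intuition on the fractional-programming step, the sketch below applies a Dinkelbach-type iteration, a classic technique for ratio objectives, to a single interference-free link. The rate model log2(1 + g p), the circuit power term and all names are simplifying assumptions; the utility of Equation (24) and the exact method of [33] are not reproduced.

```python
import math


def dinkelbach_ee(g, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Maximize EE(p) = log2(1 + g*p) / (p + p_circuit) over 0 <= p <= p_max.

    Each round solves the concave inner problem
        max_p  log2(1 + g*p) - q * (p + p_circuit)
    in closed form, then updates q to the achieved ratio.
    """
    q = 0.0
    p = p_max
    for _ in range(max_iter):
        if q > 0.0:
            # Stationary point of the inner problem, clipped to the feasible range.
            p = 1.0 / (q * math.log(2.0)) - 1.0 / g
            p = min(max(p, 0.0), p_max)
        else:
            p = p_max  # with q = 0 the rate alone is maximized
        rate = math.log2(1.0 + g * p)
        residual = rate - q * (p + p_circuit)
        q = rate / (p + p_circuit)
        if abs(residual) < tol:  # zero residual marks the optimal ratio
            break
    return p, q


p_opt, ee_opt = dinkelbach_ee(g=10.0, p_circuit=0.1, p_max=1.0)
```

The residual of the inner problem reaching zero plays the same role as the zero optimal value mentioned above: it certifies that q equals the best achievable ratio.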
Taking the i-th D2D pair from the D1 set as an example, it accesses the cell at a given time slot and plays the game with the already connected UEs (1,…, i − 1) to obtain the best RB. The optimal MSNE is obtained through the zero optimal value given by the quasi-Newton method applied to the convex form of Equation (27), and similarly for the D2 set.
In each time slot, the devices requiring cellular RBs or relay assignment solve the optimization model with at most max(D1 + m1, D2 + m2) variables.
Convergence time: After executing our algorithm on a set of random combinations, an average of 14 runs was needed to reach convergence. After that, all UEs in the cell keep the optimal mixed strategy probabilities obtained in the previous iterations. The total average time from generating the random PPP distribution to reaching the final MSNE matrix is 48.693143 s.
Resource consumption: The algorithm presented above has been coded in MATLAB. We tested the software on an Intel Core i7 machine running Microsoft Windows.
We provide a set of PPP distributions with intensities $\lambda_{D1}$, $\lambda_{D2}$ and $\lambda_c$, as shown in the table below, to vary the number of players and strategies in the simulations. The randomly generated games are solved in each case, and the average time taken is presented in Table 3.
The eNodeB needs to decide on the RB allocation at every time slot. Thus, the reduced amount of data transferred between pairs, the low complexity, the small number of iterations, and the short convergence time are highly desirable characteristics of this algorithm.

NUMERICAL AND SIMULATION RESULTS
We use MATLAB to investigate the results of the proposed algorithm. We model the communication channels between the UEs and the eNodeB via small-scale Rayleigh fading with path loss and log-normal shadowing; the channel gain g combines these components. To evaluate the results of the proposed approach, we present in Table 4, Appendix 1 the default and the optimal RB allocation of the D1 set. Table 6, Appendix 2 presents the random and optimal relay selection for D2.
In the default allocation, the D2D pairs mostly select the nearest co-channel UE and relay. Whereas, at the convergence,

Spectral and energy efficiency tradeoff
In this section, the SE and EE of the network are investigated through computer simulations and compared with other state-of-the-art solutions. First, we fix the variation rates of all the channels to a_i = 1. The displayed results are averaged over 1000 Monte Carlo simulations and normalized by the maximum value.
Figure 4 shows the normalized EE of all D2D links in the cell according to the game iteration number. It compares the proposed 'Joint SE/EE RA and RS' algorithm with the 'nearest' and 'pure' strategy energy-efficient approaches. Our solution outperforms the others in EE maximization: its averaged EE converges to 0.586, while the energy-efficient algorithm converges to 0.422 and the nearest algorithm fluctuates between 0.228 and 0.251. These results are expected: in relay-assisted communication, the relay nearest to the transmitter is the farthest from the receiver. It therefore requires higher transmit power from the relay and generates more end-to-end errors.
The EE algorithm fluctuates around the mean. It performs better than the nearest algorithm because reducing the transmission power reduces the interference between co-channel communications. However, in pure strategy games, each transmitter is self-interested: it competes to select the optimal RB or relay without synchronizing with the others. Thus, multiple transmitters can simultaneously choose the same RB or relay, which causes harmful interference and drains their batteries. Also, the gain obtained by reducing the energy at the D2D transmitter cannot always compensate for the loss caused by the relay. The same reasoning applies to direct D2D communication: the nearest co-channel CUE causes higher interference and more errors, which leads to SE and EE loss since all co-channel transmitters share the same RB.
Contrary to these approaches, the proposed mixed strategy game considers both energy and spectrum optimization. Each source and relay calculates the optimal amount of data and power to transfer. Besides, the relay chooses the pairs that it will serve. The algorithm converges to the MSNE after only 14 iterations and 48.693143 s owing to the reduced data broadcast between UEs.
Figures 5 and 6 show the SE gains of direct and relay-assisted D2D communications, respectively. The displayed results represent the SE of each link, averaged over 1000 Monte Carlo simulations. To obtain these results, we vary the device locations and energies randomly according to the given PPP distributions. Take the first point of the blue plot, x(2; 0.5), as an example: we take 1000 random locations for $\lambda_c$ = 15 and $\lambda_{D1}$ = 2, and then calculate the average spectral efficiency. For the second point, x(3; 0.96), we generate 1000 random PPP distributions for $\lambda_c$ = 15 and $\lambda_{D1}$ = 3 and average the SE results. The same procedure applies to the D2 set. We evaluate the SE for the input values and connect the resulting points.
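The averaging procedure can be sketched as follows. The snippet assumes that each intensity is read as the expected number of devices in a circular cell, draws the count with Knuth's Poisson method, places the devices uniformly on the disc, and averages a user-supplied metric over the realizations; the cell radius, the metric and all function names are hypothetical.

```python
import math
import random


def poisson_sample(mean, rng):
    """Poisson variate by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def ppp_points(mean_count, radius, rng):
    """Homogeneous PPP on a disc: Poisson count, then uniform positions."""
    points = []
    for _ in range(poisson_sample(mean_count, rng)):
        r = radius * math.sqrt(rng.random())  # sqrt keeps the density uniform in area
        theta = 2.0 * math.pi * rng.random()
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def monte_carlo_average(metric, mean_count, radius, runs, rng):
    """Average metric(points) over independent PPP realizations."""
    total = sum(metric(ppp_points(mean_count, radius, rng)) for _ in range(runs))
    return total / runs


rng = random.Random(42)
# Sanity check with the device count as the metric; an SE function would go here.
avg_devices = monte_carlo_average(len, mean_count=15, radius=500.0, runs=1000, rng=rng)
```

Replacing `len` with a function that computes the spectral efficiency of a realization would give one plotted point per intensity value.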
The SE gain of the proposed algorithm increases with the number of devices. It reaches 3.8 bits/s/Hz at the 15th D2D link, compared with 2.6 for the EE RA/RS and 2.1 for the nearest RA/RS. The curves show the superiority of the proposed approach in dense networks.
The interference to and from the co-channel D2D links decreases as the distance between them increases. The algorithm applies the same reasoning to cooperative communications: the optimal relay location for both the source and the destination is almost in the middle, which reduces the interference and enhances the system capacity and the EE.
The numerical results in Appendices 1 and 2 support the same analysis. At the equilibrium, the useful signal of each pair is considerably higher than the interference channel gains. In direct D2D communication, we obtain the optimal EE/SE values when the D2D transmitter and receiver (final D2D receiver or relay) are close to each other and far from other interference sources. In relay-assisted communication, the EE/SE tradeoff improves when the relay is in the middle of the D2D pair.
Figure 7 shows the EE and SE tradeoff corresponding to the D2D links. The obtained values are averaged over 1000 simulations. Simulation results show that the proposed approach provides a gain of 7 bits/Hz/J. We also observe that the EE/SE curves of the three approaches have a similar shape. The EE first increases with the SE as long as SE ≤ SE_max, and then decreases as the transmission energy of the source nodes and relays increases. There is always an optimal power p_opt that preserves the spectral and energy efficiency gains. At convergence, the algorithm provides the value of p_opt and the optimal amount of data transferred by each source and relay (see Table 4, Appendix 1 and Tables 7 and 8, Appendix 2).
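To see why a single optimal power can be expected, consider a simplified single-link model. This is an assumption made for illustration, not the utility of Equation (24): g denotes the channel-to-noise gain, P_c a constant circuit power, and the SE is taken as log2(1 + g p).

```latex
\begin{align}
\mathrm{EE}(p) &= \frac{\log_2(1 + g\,p)}{p + P_c}, \\
\frac{d\,\mathrm{EE}}{dp} &= \frac{N(p)}{(p + P_c)^2},
\qquad
N(p) = \frac{g\,(p + P_c)}{(1 + g\,p)\ln 2} - \log_2(1 + g\,p), \\
N(0) &= \frac{g\,P_c}{\ln 2} > 0,
\qquad
N'(p) = -\frac{g^2\,(p + P_c)}{(1 + g\,p)^2 \ln 2} < 0, \\
N(p_{opt}) = 0 &\iff \frac{g\,(p_{opt} + P_c)}{1 + g\,p_{opt}} = \ln(1 + g\,p_{opt}).
\end{align}
```

Since N starts positive and decreases strictly, the EE of this simplified model rises up to p_opt and falls beyond it, which matches the rise-then-fall shape described above.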
Take relay R7 as an example: 19.3329 dBm is its optimal transmission energy, and 56.83% and 43.17% are the shares of data that it will transmit for the selected D7 and D12 transmitters. The energy control decreases the interference between co-channel communications and consequently increases the EE and SE gains. However, below the optimal value, it cannot compensate for the SE and EE loss caused by the drop in signal gain. That is why the algorithm achieves the best EE/SE values when the D2D transmitter and receiver are close to each other but far from other interference sources.
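As a small worked example of how these R7 figures can be read, the snippet converts the dBm level to milliwatts and splits a relay payload according to the two shares. The payload size and the helper names are hypothetical; only the dBm value and the percentages come from the text.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)


def split_payload(total_bits, shares):
    """Split a relay payload across transmitters according to their shares."""
    return {name: total_bits * share for name, share in shares.items()}


r7_power_mw = dbm_to_mw(19.3329)              # roughly 86 mW
r7_shares = {"D7": 0.5683, "D12": 0.4317}     # shares quoted in the text
r7_split = split_payload(1000, r7_shares)     # 1000 bits is a hypothetical payload
```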

Impact of mobility on the network capacity
Figure 8 shows the impact of mobility on the system outage probability for different channel variation rates (a_i = 0.998, 0.899, 0.799). Figure 9 presents the impact of the channel variation rates on the resource allocation and relay selection process, showing the maximum achievable SE and EE for a_i ∈ [0.7, 0.8] and a_i ∈ [0.89, 0.99]. The outage probability increases sharply as the values of a_i approach zero. These results are expected since a high velocity impacts the convergence process and requires more effort to predict another optimal relay. Again, all DUEs update their default allocations to maximize their SE/EE tradeoff. Each pair selects a channel preallocated to a cellular UE, or a relay, that is relatively far from it and has a high a_i coefficient (approaching 1). Accordingly, the outage probability, the energy consumption, and the interference decrease according to the distance and the a_i factors.
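For reference, the outage probability of a single Rayleigh-faded link without interference has the closed form 1 − exp(−γ_th / γ_avg). The sketch below checks it against a Monte Carlo estimate; it does not model the a_i factors or the co-channel interference of the paper, and the names and numeric values are illustrative assumptions.

```python
import math
import random


def outage_closed_form(avg_snr, snr_th):
    """P(SNR < snr_th) when the SNR is exponential with mean avg_snr."""
    return 1.0 - math.exp(-snr_th / avg_snr)


def outage_monte_carlo(avg_snr, snr_th, trials, rng):
    """Empirical outage rate; |h|^2 is unit-mean exponential under Rayleigh fading."""
    hits = sum(1 for _ in range(trials) if avg_snr * rng.expovariate(1.0) < snr_th)
    return hits / trials


rng = random.Random(2024)
p_exact = outage_closed_form(avg_snr=10.0, snr_th=2.0)
p_sim = outage_monte_carlo(avg_snr=10.0, snr_th=2.0, trials=20000, rng=rng)
```

Any effect that lowers the average SINR, such as a faster-varying channel, moves the link toward higher outage in this simplified picture.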
For example, for a_i = 0.799, the decrease in both EE and SE is expected. The device's mobility limits the maximum achievable SINR, which agrees with Equations (20) and (23).
An acceptable EE and SE cannot be achieved for some a_i values because, in a high-velocity regime, the optimal relay and RB allocation change rapidly. This may also seriously affect the convergence of the algorithm.
In the remainder of this section, we investigate the impact of node mobility on both direct and multi-hop communication. Figure 10 presents three cases: source, destination, and relay mobility.
Intuitively, all scenarios would be expected to perform identically owing to the D2D symmetry. However, for the same a_i factor, source mobility affects the SE more than destination mobility, which agrees with the SINR Equations (20) and (23). The same figure analyses the impact of time variation on mobile relaying. 5G networks are dense, and the channel variation rates between the D2D transmitter and the relay and between the relay and the destination can change rapidly.
The relay employs the AF amplification factor given by Equation (11). We consider the case where only the relay is in motion. For the same a_i factor, relay mobility impacts the SE/EE performance less than transmitter or receiver mobility. As the direct path is time-variant, the system performs better under relay mobility than under source or destination mobility.

Impact of relay amplification factor on the system capacity
In this section, we assume random transmit power for all UEs in the cell. Then, we study the impact of the relay amplification factor on the multihop scenario and the overall system capacity. The relay employs the amplification factor given by Equation (11). Owing to the novel parameter given in Equation (10), the proposed AF achieves the best average capacity compared with other relaying approaches. Figure 11 shows that the new strategy increases the devices' SINR. It favours a relay in the middle of the source-destination pair, following the same principle as in the static SE/EE analysis, which reduces the interference signal gains.
AF1 achieves the worst capacity since its relay power P_R uses a fixed scaling parameter. It amplifies more noise than AF2.
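For background, the textbook variable-gain AF relay scales its received signal so that its output power equals P_R, which gives an end-to-end SNR of γ1 γ2 / (γ1 + γ2 + 1). The sketch below verifies this identity numerically. It is the standard scheme only, not the proposed factor of Equations (10) and (11) nor the exact AF1/AF2 definitions, and the symbols and sample values are assumptions.

```python
def af_gain_sq(p_s, p_r, h_sr_sq, n0):
    """Squared gain G^2 = P_R / (P_S*|h_SR|^2 + N0), so the relay emits power P_R."""
    return p_r / (p_s * h_sr_sq + n0)


def e2e_snr_from_signal(p_s, p_r, h_sr_sq, h_rd_sq, n0):
    """End-to-end SNR computed from the amplified signal and noise powers."""
    g_sq = af_gain_sq(p_s, p_r, h_sr_sq, n0)
    signal = g_sq * h_rd_sq * p_s * h_sr_sq
    noise = g_sq * h_rd_sq * n0 + n0  # forwarded relay noise plus destination noise
    return signal / noise


def e2e_snr_closed_form(p_s, p_r, h_sr_sq, h_rd_sq, n0):
    """Same quantity via the per-hop SNRs: g1*g2 / (g1 + g2 + 1)."""
    g1 = p_s * h_sr_sq / n0
    g2 = p_r * h_rd_sq / n0
    return g1 * g2 / (g1 + g2 + 1.0)


snr_signal = e2e_snr_from_signal(1.0, 1.0, 0.5, 0.8, 0.01)
snr_closed = e2e_snr_closed_form(1.0, 1.0, 0.5, 0.8, 0.01)
```

The forwarded-noise term in the denominator is what a poorly chosen scaling inflates, which is consistent with the remark that a fixed scaling parameter amplifies more noise.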
In Figure 12, the proposed AF achieves a lower outage probability. The AF1 outage probability increases with the total transmit power. Thus, to achieve acceptable results with AF1, the transmitters would have to raise their energy beyond 30 dBm. This conflicts with the power minimization objective: the energy efficiency is negatively affected when the relay power acts as a scaling constant. Again, the optimal relay location obtained with the proposed AF factor is almost in the middle of the pair, which further improves the overall network capacity and energy efficiency.

CONCLUSION
In this paper, we provided a review of key literature on direct and relay-assisted D2D communications. We proposed a new resource allocation and relay selection algorithm for multi-hop D2D communication, considering both the devices' mobility and a novel AF schema. This solution brings significant improvements in SE and EE and outperforms other state-of-the-art approaches.
As future work, we plan to enable multihop communication between DUEs from different radio cells and to further analyze high mobility through machine learning.

FIGURE 1 System model of the proposed algorithm
Where  d s and  d d are the direct neighbours of the source and the destination, respectively.

FIGURE 3 Random PPP distribution of D2D and cellular equipments generated in one simulation

FIGURE 4 The average energy efficiency of D2D links corresponding to the number of game iterations

FIGURE 5 The averaged SE corresponding to the D1 D2D links

FIGURE 7 The averaged energy efficiency corresponding to the spectral efficiency

FIGURE 8 The outage probability corresponding to the averaged SINR of different a_i factors

FIGURE 9 The joint EE/SE corresponding to the averaged SINR of different a_i factors

FIGURE 10 Source, destination and relay mobility corresponding to the SINR threshold
Only the variance can change in the relay gain. Thus, it can increase the combined effects of fading and noise. In our schema, we set $\sigma^2 = (1 - d^{-k}) P_s \sigma_{SR}^2 + N_0$ to provide an accurate expression in the amplifying process.

FIGURE 11 Average system capacity corresponding to the source transmit power

TABLE 1 The mixed strategy game parameters

TABLE 2 Energy and spectral efficient RA/RS algorithm

TABLE 4 Default versus optimal RB allocation of D1 pairs

TABLE 5 Optimal D1 transmission energy in dBm

TABLE 6 Default versus optimal relay selection of D2 pairs

TABLE 8 Optimal relays transmission energy and data