Contribution of non-orthogonal multiple access signalling to practical multibeam satellite deployments

Summary This work explores the contribution of non-orthogonal multiple access (NOMA) signalling to improve some relevant metrics of a multibeam satellite downlink. Users are paired to exploit signal-to-noise ratio (SNR) imbalances coming from the coexistence of different types of terminals, and they can be flexibly allocated to the beams, thus relaxing the cell boundaries of the satellite footprint. Different practical considerations are accommodated, such as a spatially non-uniform traffic demand, non-linear amplification effects and the use of the DVB-S2X air interface. Results show how higher traffic volumes can be channelized by the satellite, thanks to the additional bit rates which are generated for the strong users under the superposition of signals, with carefully designed power levels for DVB-S2X modulation and coding schemes in the presence of non-linear impairments.

considered, with an arbitrary traffic demand across beams. The use of NOMA in DVB-S2X 8 is evaluated for the forward link at the system level, using a new specific superframe (SF) profile, first presented in Ramírez et al. 3 The exchange of resources across beams is also explored, in an effort to exploit NOMA to provide additional flexibility to resource allocation; under this beam-free approach, users can be paired across beam boundaries and served with NOMA, so traffic asymmetry in different beams can be leveraged to reap some benefits from a system perspective. This is expected to contribute on top of the advantages provided by NOMA when serving users with strong asymmetries in received SNR, due to the use of different front-ends. For completeness, non-linear impairments from the power amplification are also considered in the analysis, as they are expected to degrade the NOMA performance. In this paper, we will show how the operation point of the power amplifier (PA) should be jointly designed with NOMA power allocation to harness the potential NOMA improvement. Both flexible and non-flexible PAs will be considered 9 ; in the former case, the operation point of the amplifier, that is, the input back-off (IBO), can be adjusted for each frame, following the specific waveform requirements. If no flexibility is available, a fixed IBO will be used.
The rest of the paper is organized as follows. Section 2 presents the satellite system model. Next, the pairing of users and power allocation is addressed in Section 3, whereas the effect of non-linearities is discussed in Section 4. Numerical results and conclusions are presented in Sections 5 and 6, respectively.

| SYSTEM MODEL
We consider a multibeam satellite system with M beams and K users across the coverage, with K > M. Conventional four-colour frequency reuse across the beams is assumed. With two orthogonal polarizations, the beam bandwidth is half of the total bandwidth, with the duration of the time-frequency resource slots equal to V. To evaluate the potential of the free scheduling of the users to the beams, we simplify the resource allocation process and assume that all the available bandwidth W per beam is allocated to a single carrier. Consequently, the number of beams in the coverage sets the available frequency slots. The objective is to carefully assign the users to the beam slots to optimize a given system metric in both OMA and NOMA cases, as showcased in Figure 1. The flexibility in the resource assignment is such that users can be freely served by any beam in the coverage. The received power from non-dominant beams can be exploited, for example, by precoding schemes, 10,11 or to balance the traffic load of different beams. 12 Thus, resource allocation entails the user scheduling in both time and frequency dimensions, together with the optimization of the user rates. Finally, two terminal classes coexist on the satellite footprint, namely, strong and weak receivers, whose different front-ends ‡ give rise to a signal-to-interference-plus-noise ratio (SINR) imbalance between them. In the case of NOMA, only two users will be served by each carrier at a given time instant, with successive interference cancellation (SIC) performed only at the stronger receiver.
As a practical air interface, we choose the DVB-S2X standard. 8 The embedding of PD-NOMA requires some extensions, for instance, the SF profile proposed in Ramírez et al. 3 Under this SF profile, the generation of the corresponding NOMA-PLFRAME payload is depicted in Figure 2, where the symbols of the combined DVB-S2X XFECFRAMEs (complex symbol frames) are aligned § and summed after being allocated a fraction of the total transmit power. If the kth and pth users are mapped to the mth beam, and $\lambda_{kp}^{m}$ and $1 - \lambda_{kp}^{m}$ denote the power fractions allocated to them, the transmit signal at the mth beam reads as
$$x^{m} = \sqrt{\lambda_{kp}^{m} P}\, s_k + \sqrt{(1 - \lambda_{kp}^{m}) P}\, s_p, \tag{1}$$
with P the beam transmit power, $s_k$ and $s_p$ the corresponding signals for users k and p, and the received signals expressed, for the linear case, as
$$y_k = h_k^{m} x^{m} + n_k, \qquad y_p = h_p^{m} x^{m} + n_p,$$
where $(h_k^{m}, n_k)$ and $(h_p^{m}, n_p)$ are the complex channel gains in beam m and noise values at the kth and pth terminals, respectively. If $\mathrm{SNR}_k^{m} > \mathrm{SNR}_p^{m}$, the rates for both users are given by
$$r_k^{m} = \Pi\left(\lambda_{kp}^{m}\, \mathrm{SNR}_k^{m}\right), \qquad r_p^{m} = \Pi\left(\frac{(1 - \lambda_{kp}^{m})\, \mathrm{SNR}_p^{m}}{1 + \lambda_{kp}^{m}\, \mathrm{SNR}_p^{m}}\right),$$
where $\Pi$ is a function which maps the SINR to the spectral efficiency provided by DVB-S2X MODCODs. Note that the weak user suffers from the interference of the strong user, which grows with $\lambda_{kp}^{m}$. In this work, we will study the impact of a non-linear PA on the spectral efficiency.
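As an illustration of the power split and SIC described above, the following minimal sketch evaluates the rates of one NOMA pair for a given power fraction. The function names (`spectral_efficiency`, `noma_rates`) and the staircase table are assumptions made for this example only; in particular, the thresholds are placeholder values and not the DVB-S2X MODCOD table, and SIC is taken as ideal in a linear channel.

```python
import math

# Hypothetical (threshold in dB, spectral efficiency in bit/s/Hz) pairs.
# Placeholder values for illustration only, NOT the DVB-S2X MODCOD table.
MODCOD_TABLE = [(-2.0, 0.5), (1.0, 1.0), (4.0, 1.5), (7.0, 2.0), (10.0, 2.5), (13.0, 3.0)]


def spectral_efficiency(sinr_linear):
    """Staircase stand-in for Pi: highest efficiency whose threshold is met."""
    if sinr_linear <= 0:
        return 0.0
    sinr_db = 10.0 * math.log10(sinr_linear)
    best = 0.0
    for threshold_db, efficiency in MODCOD_TABLE:
        if sinr_db >= threshold_db:
            best = efficiency
    return best


def noma_rates(snr_strong_db, snr_weak_db, lam):
    """Return (strong rate, weak rate); lam is the strong user's power fraction."""
    snr_strong = 10 ** (snr_strong_db / 10)
    snr_weak = 10 ** (snr_weak_db / 10)
    # Strong user: the weak message is removed by ideal SIC before decoding.
    sinr_strong = lam * snr_strong
    # Weak user: the strong message is treated as additional noise.
    sinr_weak = (1 - lam) * snr_weak / (1 + lam * snr_weak)
    return spectral_efficiency(sinr_strong), spectral_efficiency(sinr_weak)
```

Sweeping `lam` between 0 and 1 with such a function traces a staircase rate region, which is the kind of trade-off the power allocation has to resolve.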
Note that additional interference caused by the non-linear amplification, parameterized by IBO, will reduce the SINR and, consequently, the rates, which are now a more involved function $\Pi_{\mathrm{NL}}$ of the operating point.
As a remark, let us mention that the allocation of a time-frequency resource provided by a given beam to an arbitrary user has practical limitations, because those users located far away from its footprint do not receive any significant power from the beam. We will exploit this to simplify the search for the optimal mapping between users and beams, extending the potential user locations only to the first ring surrounding a given central beam. For notation purposes, we define $S_m$ as the set of users which can be served by the mth beam, with size $|S_m|$.

| Problem formulation
With a fair sharing of resources in mind, we select proportional fair scheduling (PFS) to drive the resource allocation process with the beam-free approach as in Ramírez and Mosquera 13 ; the PFS maximizes the geometric mean of the rates in the long run. 14 Under this policy, a long-term averaged rate is tracked for each user k. The instantaneous rate of the kth user at time index t, $r_k(t)$, can be written as a function of the achievable rate by user k at time index t when served by the mth beam, $r_k^{m}(t)$:
$$r_k(t) = \sum_{m=1}^{M} u_k^{m}(t)\, r_k^{m}(t),$$
where $u_k^{m}(t)$ is a binary scheduling variable that is equal to 1 when the mth beam serves the kth user at time index t. With this notation, the PFS system metric for a given time slot can be reformulated as a weighted sum-rate (WSR) problem and expressed as
$$\max \sum_{k=1}^{K} w_k(t)\, r_k(t),$$
where $\{w_k(t)\}$ are the weights of the WSR problem, which are inversely proportional to the long-term rates. If the time index is dropped to keep the notation simple, the resulting WSR problem for the resource allocation can be expressed as
$$\max_{\{u_{kp}^{m}\},\, \{r_k^{m},\, r_p^{m}\}} \; \sum_{m=1}^{M} \sum_{k, p \in S_m} u_{kp}^{m} \left( w_k r_k^{m} + w_p r_p^{m} \right) \quad \text{subject to A1, A2, A3}, \tag{10}$$
where $u_{kp}^{m}$ is a scheduling variable that is active when both kth and pth users are paired and assigned to the mth beam. The constraints A1, A2 and A3 ensure that users can only be served by one beam at a time, and each beam can only serve two users in a given time slot. The user scheduling $u_{kp}^{m}$, together with the rates $r_k^{m}$ and $r_p^{m}$, will be driven by the maximization of WSR.
FIGURE 1 Example of the flexible resource assignment. Each box displays the user indexes that are served by the corresponding beam.
Interestingly, the power per feed constraint allows us to decouple the problem in (10) and focus on the maximization of the user rates in each beam. For a given pair of users served by the mth beam, we can solve the following optimization problem:
$$\max_{0 \le \lambda_{kp}^{m} \le 1} \; f(\lambda_{kp}^{m}) = w_k r_k^{m} + w_p r_p^{m}, \tag{11}$$
where the allocated fraction of resources $\lambda_{kp}^{m}$, either time (OMA) or power (NOMA), is omitted in the user rate description to keep the notation simple. With this, the optimization follows different paths for OMA and NOMA, as detailed next:
• OMA: In the OMA case, the function $f(\lambda_{kp}^{m})$ in (11) is monotonic with $\lambda_{kp}^{m}$. Therefore, one of the users will take the whole slot. With this, problem (10) boils down to a matching problem, which is expressed as
$$\max_{\{u_k^{m}\}} \; \sum_{m=1}^{M} \sum_{k \in S_m} u_k^{m}\, w_k r_k^{m} \quad \text{subject to (A1), (A2)},$$
where (A1) and (A2) in Appendix S1 ensure that a carrier beam is only allocated to one user at a time. The matching problem can be optimally solved by the Hungarian algorithm. 15
• NOMA: In the case of NOMA, the rates of the user pairs are obtained by selecting the best pair of DVB-S2X MODCODs which optimize (11).
On the other hand, the optimal user scheduling requires an exhaustive search exploring all possible solutions. As a practical implementation, an ad hoc algorithm is presented in the next section.
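The OMA matching step can be sketched on a toy instance as follows. This is a minimal illustration, not the method used in the paper: it replaces the Hungarian algorithm with a brute-force search over all assignments, which is only viable for a handful of beams and users, and the names `pfs_weights` and `best_oma_matching` are hypothetical. The weights are taken as the inverse of the long-term rates, and every beam is assumed to serve exactly one distinct user (so K must be at least M).

```python
from itertools import permutations


def pfs_weights(avg_rates, eps=1e-9):
    """PFS weights, inversely proportional to the long-term averaged rates."""
    return [1.0 / max(rate, eps) for rate in avg_rates]


def best_oma_matching(rates, weights):
    """Brute-force stand-in for the Hungarian algorithm.

    rates[m][k] is the achievable rate of user k on beam m (0 if not servable).
    Returns (assignment, wsr), where assignment[m] is the user served by beam m.
    """
    n_beams = len(rates)
    n_users = len(weights)
    best_assignment, best_wsr = None, -1.0
    # Each permutation assigns a distinct user to every beam.
    for users in permutations(range(n_users), n_beams):
        wsr = sum(weights[k] * rates[m][k] for m, k in enumerate(users))
        if wsr > best_wsr:
            best_assignment, best_wsr = users, wsr
    return best_assignment, best_wsr
```

For instance, with two beams and three users, a user with a low long-term rate receives a large weight and is favoured by the matching even if its instantaneous rate is not the highest.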

| RESOURCE ALLOCATION ALGORITHM FOR PD-NOMA
Because the maximization of the WSR with NOMA signalling is known to be NP-hard, 16 a heuristic algorithm has been developed to avoid an exhaustive search. ¶ This algorithm is inspired by many-to-one matching theory 16,17 and is outlined below. First, let $C_m$ be a set of user indexes which indicates the candidates served by the mth beam. This set will be labelled as a candidate set and satisfies $C_m \subseteq S_m$. Under this notation, the WSR of the candidates selected by the mth beam can be expressed as $\mathrm{WSR}(C_m)$. Then, the overall WSR of the system can be expressed as
$$\mathrm{WSR} = \sum_{m=1}^{M} \mathrm{WSR}(C_m). \tag{13}$$
Furthermore, we consider single-carrier terminals, so that users can only be served by one beam at a time. To indicate this situation, we state a user conflict when two candidate sets intersect and aim to serve the same user or group of users. For example, two candidate sets $C_m$ and $C_p$ present a user conflict if $A_{m,p} = C_m \cap C_p$ and $|A_{m,p}| \ge 1$, with $A_{m,p}$ the set of user indexes from the user conflict. Thus, maximization of the metric in (13) consists of finding the adequate sets $C_m$ without any user conflict. With this, the heuristic algorithm is split into two phases:
• Initialization phase: The algorithm starts by obtaining all possible pairs in $S_m$ and maximizing the associated metric rate in (11), by selecting the best possible pair of DVB-S2X MODCODs. As a result of this initial phase, the results of pair combination and WSR are stored, and the pair with the highest WSR for each beam becomes the corresponding candidate set $C_m$.
• Candidature approval phase: If the proposed candidates $\{C_m\}$ from the initialization phase do not pose any user conflict, then an optimal solution is achieved because each beam serves its best candidates. In general, guaranteeing the optimal solution is prohibitively expensive in terms of computational complexity. As a consequence, we resort to an ad hoc algorithm with affordable cost, and without optimality guarantees, although the achieved solutions have been tested to be quite effective. In case of conflict, alternative candidate sets must be proposed, with the algorithm addressing two user candidate sets at a time. As a side effect, loops can appear in the algorithm, and further elaboration is needed. The detailed algorithm is presented in Appendix S1.
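The two phases above can be illustrated with a short sketch of the initialization phase followed by the detection of user conflicts. It is only a simplified reading of the description: the function names, the dictionary layout and the `pair_wsr` callback (assumed to return the best WSR of a pair on a beam after MODCOD selection) are hypothetical, and the resolution of the conflicts themselves is not covered here.

```python
from itertools import combinations


def initialization_phase(servable, pair_wsr):
    """Pick the best candidate pair for every beam.

    servable[m] is the set S_m of user indexes that beam m can serve.
    pair_wsr(m, k, p) returns the best WSR of the pair (k, p) on beam m.
    Returns {m: (wsr, (k, p))}; beams with fewer than two users are skipped.
    """
    candidates = {}
    for m, users in servable.items():
        scored = [(pair_wsr(m, k, p), (k, p))
                  for k, p in combinations(sorted(users), 2)]
        if scored:
            candidates[m] = max(scored)
    return candidates


def find_conflicts(candidates):
    """Return {(m, p): shared users} for beams whose candidate sets intersect."""
    conflicts = {}
    for m, p in combinations(sorted(candidates), 2):
        shared = set(candidates[m][1]) & set(candidates[p][1])
        if shared:
            conflicts[(m, p)] = shared
    return conflicts
```

If `find_conflicts` returns an empty dictionary, every beam keeps its best pair; otherwise, each reported entry has to be resolved before the candidate sets can be approved.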

| NON-LINEAR DISTORTION
Due to the limited power available in the satellite payload, the satellite channel usually presents a non-linear behaviour caused by the PAs, which are operated close to saturation to achieve acceptable onboard power efficiency. Because a larger IBO reduces the non-linear distortion of the amplifier at the cost of power efficiency, a middle ground between both must be found. Non-linear effects are expected to be more detrimental for signals with higher dynamic range, which makes it especially relevant to address their impact on PD-NOMA, where the transmit signal is the addition of two signals, in our case belonging to the family of DVB-S2X MODCODs. To this end, physical layer simulations have been performed for a hard-limiter TWTA model as in the DVB-S2X standard. This model, shown in Figure 3, presents a linear region where the output back-off (OBO) equals the IBO values until the output power saturates. This hard-limiter model can embody the response of a conventional TWTA after applying an appropriate digital predistortion technique. The induced level of distortion is set by the PA operation point, through both $P_{\mathrm{sat}}$ and IBO, and the peak-to-average power ratio (PAPR) of the input signal. In the case of superimposed signals, the latter depends on the power allocation as presented in Figure 4. It is clear that the non-linear degradation, as an increasing function of PAPR, will vary with the relative contribution of each message to the NOMA signal.
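The dependence of the PAPR on the power allocation can be checked with a small symbol-level sketch for two superimposed QPSK layers, together with an idealized hard limiter. Several simplifications are assumed here: pulse shaping is ignored, so the values are not those of Figure 4; the input is normalized to unit average power; and the limiter clips the envelope while preserving the phase. All names are hypothetical.

```python
import math
from itertools import product

# Unit-power QPSK constellation.
QPSK = [complex(a, b) / math.sqrt(2) for a, b in product((1, -1), repeat=2)]


def superposed_constellation(lam):
    """All symbols of two QPSK layers with power fractions lam and 1 - lam."""
    return [math.sqrt(lam) * s + math.sqrt(1 - lam) * w
            for s, w in product(QPSK, QPSK)]


def papr_db(points):
    """Symbol-level peak-to-average power ratio in dB (no pulse shaping)."""
    powers = [abs(x) ** 2 for x in points]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def hard_limiter(x, ibo_db):
    """Idealized hard limiter for a unit-average-power input.

    The response is linear up to the saturation amplitude, fixed by the IBO,
    and the envelope is clipped to that amplitude beyond it.
    """
    a_sat = 10 ** (ibo_db / 20)
    magnitude = abs(x)
    return x if magnitude <= a_sat else x * (a_sat / magnitude)
```

Under these assumptions the symbol-level PAPR equals $1 + 2\sqrt{\lambda(1-\lambda)}$, so it peaks at about 3 dB for an even power split and vanishes when a single QPSK layer takes all the power.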
Furthermore, each layer of the superimposed signal can be affected differently by the non-linear distortion. For instance, let us consider the reception of the NOMA signal (1), QPSK modulated at both layers, under a linear and non-linear channel; in both cases, the noiseless extracted symbols by the strong receiver are presented in Figures 5 and 6, before and after removing the weak user interference, respectively. Note that we consider the strong receiver because it has to decode both layers of the superimposed signal, whereas the weak receiver does not decode the strong message.
We can note the interference caused by the non-linear distortion.
By measuring the average error vector magnitude (EVM) of the received signal with respect to an ideal reception for the different messages, we can observe the different degradation in each layer. In the case of the weak message, we obtain −5.6 dB for the linear case and −6.2 dB for the non-linear case. Note that the EVM is not null in the linear case, even in the absence of white noise, because the strong message is treated as additional noise. Under perfect removal of the linear part of the weak message in both linear and non-linear cases, the received signal for the demodulation of the strong message is presented in Figure 6. Now, the average EVM for the linear case becomes zero, whereas the non-linear case yields an average EVM of −15.5 dB, because the non-linear components are not suppressed by the SIC process. As a consequence, the power allocation λ in (1) must also be adjusted to compensate for the different signal degradation in each layer. In general, we can expect a higher impact on the strong user performance due to the non-linear behaviour of the channel.
¶ This algorithm was included in the PhD thesis of the co-author Tomás Ramírez and reproduced here for completeness.
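To see why the EVM of the weak message is not null in a linear, noiseless channel, the sketch below computes it for two superimposed QPSK layers. The EVM normalization (error power over reference power, in dB) and the function names are assumptions of this example, and the power fraction used in the paper for the reported values is not stated, so the numbers are not meant to reproduce them.

```python
import math
from itertools import product

# Unit-power QPSK constellation.
QPSK = [complex(a, b) / math.sqrt(2) for a, b in product((1, -1), repeat=2)]


def evm_db(received, reference):
    """Average EVM in dB: error power relative to the reference power."""
    error_power = sum(abs(r - s) ** 2 for r, s in zip(received, reference))
    reference_power = sum(abs(s) ** 2 for s in reference)
    return 10 * math.log10(error_power / reference_power)


def weak_layer_evm_db(lam):
    """EVM of the weak layer before SIC, linear noiseless channel, 0 < lam < 1.

    lam is the power fraction of the strong layer, which acts as interference.
    """
    received, reference = [], []
    for s, w in product(QPSK, QPSK):
        weak = math.sqrt(1 - lam) * w
        received.append(math.sqrt(lam) * s + weak)
        reference.append(weak)
    return evm_db(received, reference)
```

In this idealized setting the result reduces to $10\log_{10}\big(\lambda/(1-\lambda)\big)$, that is, the interference-to-signal power ratio seen by the weak layer.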
A general procedure for the selection of the PA operation point, that is, saturation power and IBO, is to minimize the required saturation power that results in successful demodulation of a selected MODCOD in the non-linear channel. 18 In addition, for the PD-NOMA case, the power allocation λ also has to be considered. Now, the goal is the joint selection of the PA operation point and power allocation λ to ensure a target frame error rate (FER) for both layered modulations. To this end, a metric Δ is defined from the required $E_S/N_0$ at the saturation point to achieve the target FER. Under this metric, the iterative method in Algorithm 1 describes the selection of the PA operation point and the power allocation λ. It should be remarked that this iterative process can be run offline beforehand. In practice, a look-up table will be generated by mapping the MODCOD pairs, SNR gap, required saturation power, IBO and power allocation, giving rise to the function $\Pi_{\mathrm{NL}}$ in (6). For illustration purposes, an example is depicted in Figure 7. Note that we can find an optimal IBO value that minimizes the required transmitted power. For high IBO values, the non-linear distortion is negligible, and additional transmit power is needed to compensate for the SNR loss. As we move closer to saturation, the non-linear distortion grows, so that additional power is also required to compensate for the non-linear degradation. The physical layer simulations under this iterative approach have been performed with ideal time and frequency synchronization and ideal SINR estimation. Although the non-linear effects could be reduced with the SIC process, 19 only the linear components are suppressed and non-linear components are left as cancellation residue.
Exhaustive simulations, with transmission and reception stages as in Figure 8, have been performed to minimize the required saturation power for different combinations of DVB-S2X MODCODs. An operation point is considered to be valid when the NOMA FECFRAME error is equal to or below $10^{-3}$ on average. # As evidence of the practical relevance of the described method, the growth of the NOMA rate region from the optimization of the power coefficient is presented in Figure 9, as an example making use of DVB-S2X MODCODs in the presence of non-linearities. As we can observe, the joint optimization of IBO and power allocation λ can extend the achievable rates for the strong and weak users in the NOMA operation and increase the improvement over OMA.
FIGURE 7 Degradation due to non-linear operation with respect to IBO. MODCODs: QPSK 2/5 (weak), QPSK 5/6 (strong). SNR gap = 8 dB. FER = $10^{-3}$.
FIGURE 8 Scheme of the simulation process for the analysis of the non-linear distortion for PD-NOMA.
FIGURE 9 NOMA rate region without (red)/with (green) joint optimization of the PA operation point and power allocation λ. SINR = 17 dB for the strong user and 10 dB power imbalance.
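The existence of an optimal IBO can be illustrated with a toy grid search. This is not Algorithm 1, whose steps are not reproduced in this text: the additive dB model (linear requirement plus back-off plus a distortion penalty), the exponential penalty and all names and numbers below are hypothetical placeholders, chosen only to mimic the qualitative trade-off described above.

```python
import math


def toy_distortion_penalty_db(ibo_db):
    """Hypothetical distortion penalty, decaying as the back-off grows."""
    return 6.0 * math.exp(-ibo_db / 1.5)


def required_saturation_snr_db(ibo_db, linear_snr_db, penalty_db):
    """Toy model: SNR referred to saturation needed to meet the target FER.

    The back-off costs power one-to-one, whereas the distortion penalty
    shrinks when moving away from saturation.
    """
    return linear_snr_db + ibo_db + penalty_db(ibo_db)


def best_ibo(ibo_grid_db, linear_snr_db, penalty_db):
    """Grid search for the IBO minimizing the required saturation SNR."""
    return min(ibo_grid_db,
               key=lambda ibo: required_saturation_snr_db(ibo, linear_snr_db, penalty_db))


# Candidate IBO values from 0 dB to 6 dB in 0.5 dB steps.
IBO_GRID_DB = [0.5 * i for i in range(13)]
```

In an actual design, the penalty would come from physical layer simulations for each MODCOD pair and power allocation, and the minimizing entries would populate the look-up table mentioned above.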

| NUMERICAL RESULTS
The beam-free approach for both NOMA and OMA has been tested in a satellite scenario, with an antenna radiation pattern covering Europe with 200 beams, provided by the European Space Agency (ESA). 20 To keep the complexity of the simulations manageable, a set of M = 16 beams at the centre of the coverage is selected to serve K = 320 users. The corresponding beam footprints are represented in Figure 10. For simplicity, we assume the same statistical distribution of strong and weak users across the beams. To model the user distribution, we resort to the Dirichlet distribution $\mathrm{Dir}(K, \alpha)$, which lends itself to shape different traffic demands across cells through the parameters $\alpha = [\alpha_1, \ldots, \alpha_M]$. In particular, we set $\alpha_i = 1$, $i = 1, \ldots, M$, so that we explore a homogeneous case where every possible distribution of users among beams has equal probability. This allows us to average out the impact of the traffic profile on the performance of PD-NOMA.
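One possible way to draw such a user distribution is sketched below. It assumes a Dirichlet-multinomial reading of the model: beam probabilities are drawn from a Dirichlet distribution (built from normalized gamma variates) and the K users are then assigned independently to beams with those probabilities. The function name is hypothetical, and the exact sampling procedure of the paper may differ.

```python
import random


def sample_users_per_beam(n_users, alphas, rng):
    """Draw the number of users per beam from a Dirichlet-multinomial model.

    alphas holds one concentration parameter per beam; rng is a random.Random.
    """
    # A Dirichlet sample is a vector of gamma variates normalized to sum to 1.
    gammas = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(gammas)
    probabilities = [g / total for g in gammas]
    # Assign every user to a beam according to the sampled probabilities.
    beams = rng.choices(range(len(alphas)), weights=probabilities, k=n_users)
    counts = [0] * len(alphas)
    for beam in beams:
        counts[beam] += 1
    return counts
```

For example, `sample_users_per_beam(320, [1.0] * 16, random.Random(0))` returns one realization of the number of users in each of the 16 beams; repeating the draw over many Monte Carlo runs averages out the traffic profile.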
The number of transmission slots, V, has been set to 500 for both OMA and NOMA, high enough to include multiple transmissions for each user. The system parameters presented in Table 1 were used for the simulations, with 500 Monte Carlo realizations for the optimal (adaptive) IBO case, and 8000 realizations for each fixed IBO value. As mentioned in previous sections, the spectral efficiency will be measured based on the use of DVB-S2X MODCODs and taking into account the non-linear distortion of the PAs. The benchmarking metrics are the geometric mean, minimum rate and sum rate, when comparing OMA and NOMA. As the adaptation of IBO on a frame basis may not be feasible in some practical settings, both adaptive and fixed IBO cases will be analysed. In the former, the optimal PA operation point will be adjusted for each frame by modulating the amount of back-off (IBO), whereas in the latter the IBO is kept constant over time. As a reminder, note that SIC will be considered ideal as to the removal of the linear component of the weak message, although the non-linear interference will not be cancelled.

| Optimal IBO
The overall system improvements of NOMA over OMA are presented for different ratios L of strong to weak users per beam in Figure 11a.
Cumulative distribution functions of the total, strong and weak user rates are also displayed in Figures 11b and 12, respectively. These average rates are computed at each Monte Carlo simulation and measured as
$$\bar{r}_k = \frac{1}{V} \sum_{t=1}^{V} r_k(t),$$
where the variable $r_k(t)$ keeps track of the kth user rate at the time slot t. From the results, NOMA clearly outperforms OMA in both geometric mean and sum rate according to Figure 11a. On the other hand, Figure 12 reveals that the NOMA gain applies mainly to strong users, for which the overall improvement on the average rate is around 30% (see Table 2). Additionally, higher gains appear for higher L values, with a more noticeable presence of strong users. There is a minor penalization for those weak users with lower rates, although the performance is quite aligned according to Figure 12b. With PD-NOMA, strong users can be served more frequently thanks to the non-orthogonal access, with a minimum impact when servicing weak users.
FIGURE 10 Beam footprints used in the simulations.
# The magnitude of the FECFRAME error has been selected to limit the simulation duration.
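The benchmarking metrics can be computed from the per-slot rates as in the following sketch, where the average rate of each user is the mean of its rates over the V slots. The function names and the data layout are assumptions of this example; the geometric mean is reported as zero when some user is never served.

```python
import math


def average_rates(rates_per_slot):
    """rates_per_slot[t][k] is the rate of user k in slot t; returns per-user means."""
    n_slots = len(rates_per_slot)
    n_users = len(rates_per_slot[0])
    return [sum(slot[k] for slot in rates_per_slot) / n_slots
            for k in range(n_users)]


def benchmark_metrics(avg_rates):
    """Geometric mean, minimum rate and sum rate of the average user rates."""
    if min(avg_rates) > 0:
        # Computed through logarithms to avoid overflow with many users.
        geometric_mean = math.exp(sum(math.log(r) for r in avg_rates) / len(avg_rates))
    else:
        geometric_mean = 0.0
    return {
        "geometric_mean": geometric_mean,
        "min_rate": min(avg_rates),
        "sum_rate": sum(avg_rates),
    }
```

The geometric mean is the quantity that proportional fairness targets, whereas the minimum and sum rates expose the fairness and throughput extremes of the same allocation.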

| Fixed IBO
The lack of dynamic adjustment of the PA operating point causes an unavoidable impact on the system performance, although it is perhaps of more interest to understand the relative behaviour of OMA and NOMA in this more rigid setting. To this end, Figure 13 displays the relative weight of optimal IBOs for all the MODCODs which are eligible in our setting and an SNR gap of 10 dB. In this regard, note that some DVB-S2X MODCODs are not used in NOMA; in particular, both VL-SNR (very low SNR) and 8PSK MODCODs are excluded due to their irregular associated XFECFRAME lengths, 3 whereas 64APSK is not considered for insufficient SNR. The conspicuous absence of content at the leftmost side of the NOMA graph means that no single QPSK constellation is used, and only a small set of combinations of QPSKs gives rise to a peak for lower IBO values. As an extension of Figure 9, let us consider the impact of different fixed IBO values on the rate regions in Figure 14. In both NOMA and OMA cases, the outer boundary corresponds to the adaptive IBO case, with a rate region which shrinks as the system is constrained to a fixed PA operation point. Interestingly, a noticeable gap still exists between the NOMA and OMA performance. In this regard, each signalling scheme is expected to require a different IBO for maximum performance, as discussed next from a system perspective.
FIGURE 11 Results for NOMA and OMA. Optimal IBO. L indicates the ratio between the number of strong and weak users within the beams.
FIGURE 12 Cumulative distribution of average user rates. Optimal IBO.
TABLE 2 Average rate improvements of NOMA over OMA for weak and strong users.
Next, we plot the rate performance metrics for different values of fixed IBO in Figure 15. In this figure, the optimal IBO values in terms of geometric mean are showcased with square (circle) markers for PD-NOMA (OMA). The corresponding improvement of PD-NOMA over OMA for the optimal fixed IBO is displayed in Figure 16a for the minimum, sum and geometric mean of the rates. Additional information is provided by the cumulative distribution functions of the overall rates, strong rates and weak rates in Figures 16b and 17.
From the results, it is clear that the optimal IBO value differs between OMA and NOMA, with the latter keeping the advantage with respect to OMA by resorting to a higher IBO value. The improvement in the aggregated rates (geometric mean and sum rate) is lower when compared with the adaptive IBO case and higher in terms of minimum rates. From the results, we can infer that a fixed IBO alters the rate allocation, and hence the PFS weights, not necessarily in the same vein when comparing OMA and NOMA. Thus, results in Figure 15b,c are anchored with respect to the optimal geometric mean (Figure 15a), which is the target under proportional fairness scheduling. The aggregated spectral efficiency curves (geometric mean and sum rate) display a relatively smooth behaviour with respect to IBO, which is good news from a system design perspective. However, the minimum-rate performance decreases significantly as IBO grows, as seen in Figure 15c. As expected, the use of geometric mean as optimization criterion poses a trade-off between the sum-rate maximization (higher IBO) and the minimum-rate performance (lower IBO). Finally, let us remark that the optimal operation point depends on the specific system settings and can be obtained only after statistical analysis. Interestingly, from observation of Figure 15, we see that small IBO changes have a minor effect on aggregated performance metrics, as also concluded for more conventional systems, 21 which alleviates the impact of the non-linearities on the design requirements when implementing non-orthogonal schemes, keeping their superiority with respect to orthogonal signalling.
FIGURE 16 Results for NOMA and OMA. Best IBO cases for NOMA and OMA in terms of geometric mean. L indicates the ratio between the number of strong and weak users within the beams.

| CONCLUSIONS
System-level studies were presented for a multibeam scenario under a traffic agnostic model, where all terminal distribution profiles across beams displayed the same probability. Two different types of terminals coexisted, with a significant power imbalance in their respective link qualities.
The potential gains of NOMA with respect to more conventional orthogonal allocation schemes were evaluated, after considering the use of specific DVB-S2X MODCODs, along with the impact of non-linearities, for which superimposed signals are specially sensitive due to their higher PAPR. The fair allocation of resources to the different users across time was also embedded into the study. It was concluded that the power allocation in NOMA needs to be carefully optimized jointly with the IBO of the PAs to yield sum-rate gains in the order of 20% to 25% for the strong users, while keeping similar overall rates for the weak users. If PAs lack flexibility, and IBO is to remain constant, NOMA still outperforms OMA by operating with a higher fixed IBO. However, the incremental performance depends on the specific settings. For design purposes, we can conclude that the overall spectral efficiency depicts a very smooth behaviour with respect to IBO, so that small IBO changes have a minor impact on the system performance.

SUPPORTING INFORMATION
Additional supporting information can be found online in the Supporting Information section at the end of this article.

APPENDIX A: USER SELECTION AND PAIRING: CONFLICT RESOLUTION
Let $C_m$ and $C_p$ be two candidate sets that present a user conflict, with the users in conflict represented by a set $A_{m,p}$. Furthermore, let us define $V_m$ and $V_p$ as alternative candidates that avoid any user conflict, with $V_m \cap P_m \cap A_{m,p} = \emptyset$ and $V_p \cap P_p \cap A_{m,p} = \emptyset$. Here, $P_m$ and $P_p$ are counter-measures to prevent loops; more details are given later. Under this formulation and taking into account the optimization metric of the algorithm, we also define
$$U_m = \mathrm{WSR}(V_m) + \mathrm{WSR}(C_p), \qquad U_p = \mathrm{WSR}(C_m) + \mathrm{WSR}(V_p).$$
The user conflict resolution is dictated by comparing $U_m$ and $U_p$. For instance, if $U_m \ge U_p$, the resolution goes in favor of the set $C_p$, with the alternative set $V_m$ becoming the new candidate for the m-th beam. Moreover, the loop counter-measure $P_m$ is updated as $P_m = P_m \cup A_{m,p}$. Thus, this set keeps track of previous user conflicts and precludes forthcoming conflicts of the m-th beam with the already resolved user conflict. In addition, the loop counter-measure $P_p$ of the winner set is also updated by deleting the indexes of the conflicted users $A_{m,p}$. If $U_m < U_p$, the same process applies in favor of the set $C_m$.
As an example, Figure A1 illustrates the process of user conflict resolution. Alternative sets are proposed after encountering a user conflict, which will be resolved in favor of one of the sets. Unfortunately, the ad-hoc algorithm does not guarantee an optimal solution and, in fact, candidate solutions with better performance than the final output of the algorithm might be discarded during the user conflict resolution process. As a consequence, the algorithm also stores the discarded solutions if they do not present any user conflict. The final achieved solution from the algorithm will be compared against the best solution across the discarded pool, and the best solution in terms of weighted sum-rate will be selected.
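A single resolution step, as described above, can be sketched as follows. The function and argument names are hypothetical, the WSR values of the current and alternative candidate sets are assumed to be already available, and ties are resolved in favor of the p-th beam; the search for the alternative sets and the pool of discarded solutions are not covered.

```python
def resolve_conflict(wsr_c_m, wsr_c_p, wsr_v_m, wsr_v_p, conflict, blocked_m, blocked_p):
    """Resolve one user conflict between the candidate sets of beams m and p.

    wsr_c_*: WSR of the current candidate sets; wsr_v_*: WSR of the alternatives.
    conflict: users in conflict; blocked_*: loop counter-measure sets.
    Returns (winner, new_blocked_m, new_blocked_p); the winner keeps its
    candidate set and the other beam switches to its alternative set.
    """
    u_m = wsr_v_m + wsr_c_p  # beam m yields and adopts its alternative set
    u_p = wsr_c_m + wsr_v_p  # beam p yields and adopts its alternative set
    if u_m >= u_p:
        # Beam p wins: block the conflicting users for beam m, release them for p.
        return "p", blocked_m | conflict, blocked_p - conflict
    # Beam m wins: the symmetric update applies.
    return "m", blocked_m - conflict, blocked_p | conflict
```

Iterating this step over all detected conflicts, while storing conflict-free solutions that get discarded along the way, gives the overall flavour of the procedure.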