A low complexity mechanism for congestion notification in rural IPSec-enabled heterogeneous backhaul networks

simulations and implementing the notification mechanism. Our low complexity approach offers 2% accuracy and backhaul update latency lower than 10 ms during 80% of the time, which makes the solution appropriate for admission control and scheduling intervals in small cells.


INTRODUCTION
3G and 4G networks are common in urban areas worldwide. However, most rural areas, especially in developing countries, lack decent connectivity infrastructure. Operators usually provide coverage only as far as revenues compensate for their infrastructure investment, which leaves many sparsely populated areas without internet connectivity or even 2G telephony. For these cases, new approaches have been proposed to offer Information and Communications Technology (ICT) services to these regions, such as community infrastructures, [1][2][3] where users are the owners of the network. However, these approaches are mainly focused on household connectivity and mobile 2G. For providing 3G and 4G connectivity, one of the most cost-effective approaches is to use shared low-cost private infrastructures where the operator offers services adjusted to the real needs of the target population. 4,5 These infrastructures may consist of a heterogeneous wireless backhaul (BH) network that interconnects a number of small cells deployed as access networks (ANs). In many real setups, the BH network is composed of a set of long-distance wireless links (eg, WiFi, WiMAX, and VSAT). An example of such a network is illustrated in Figure 1. ANs are usually small cells built with inexpensive, lightweight, low-power Home NodeBs (HNBs). This approach makes it possible to provide guaranteed 3G and 4G services in specific isolated areas at a lower cost than traditional wide-range cellular networks operating with nonconstrained BH networks.
However, small cells require stable and low-latency BH networks in order to provide guaranteed QoS, and therefore intelligent BH management becomes crucial. When small cells are used in urban environments, 6 where xDSL or fiber technologies are widely available, BH management among the small cells is not particularly problematic. [7][8][9] However, when the BH is a highly constrained, heterogeneous set of wireless links, managing the BH becomes a challenge, since the provided capacity may be limited and highly variable, and the latency and throughput perceived by the AN may be seriously affected by the level of saturation in the BH. For example, if a WiFi link is reaching its saturation point, an increase of 1% in the load can increase the latency up to 20 times without even increasing the offered throughput. 10 Usually, this issue is solved at the edges of the network with traffic engineering and QoS rules. 11 However, additional mechanisms are needed to notify the HNBs so that they can modify their admission control policy (ie, blocking current and future excess connections), ultimately improving the overall user experience.
The dependency of the AN on the BH makes it crucial to jointly control the operation of both networks, meaning that AN admission control and scheduling should take into account the actual instantaneous QoS provided by the BH. Since small cells are not designed to perform these tasks and are not even able to communicate with the BH nodes, a new approach for low complexity lightweight real-time monitoring of the BH is required. This approach must provide the AN-node with specific information about its own BH, no matter what happens in the rest of the transport network. However, the 3GPP ANs are designed to be BH technology-agnostic, and there is no standardized interface for communication between both network segments. The BH network supports the Iur or Iuh 3GPP interface by encapsulating the user data and signaling from the AN as TCP or RTP/UDP traffic, 13,14 which is always secured using IPSec tunnels. 15 In this work, we propose a simple approach that uses the Explicit Congestion Notification (ECN) bits of the outer IP header in the IPSec tunnels to provide each AN with instantaneous feedback about the congestion state of the BH network segment that supports its Iuh interface. This way, ANs can adjust their admission control and scheduling policies in order to reduce congestion in the BH or increase the service load when resources are available in the BH.
The proposed mechanism has very low complexity (lightweight), is distributed (it only affects the interface between the AN-node and the edge node in the BH), and is highly scalable. Also, it is compatible with the standard usage of ECN bits, since the original ECN encoding is carried in the inner IP headers. Our results show that the proposed method provides a simple but effective way for notifying congestion from the BH to the AN in the targeted scenarios.
The remainder of this paper is organized as follows. Section 2 provides an overview of the different existing monitoring solutions for BH networks. Section 3 introduces the ECN bits and their different usages, and Section 4 describes our proposed ECN-based solution, addressing compatibility and security details. Finally, Section 5 validates the proposed solution for different scenarios and configurations.

RELATED WORK
Since the emergence of Software-Defined Networking (SDN), 16,17 most state-of-the-art network management mechanisms follow this approach, 18,19 and with a centralized controller, the BH congestion problem would become trivial. A number of works study how small cells can be optimized while being aware of interference, 20 energy efficiency, 21 dynamicity, 22 and latency. 23,24 These studies place the intelligence in a centralized controller that runs an optimization algorithm or a heuristic to solve the problem. However, applying SDN principles in rural BH networks is not straightforward. First, moving away from distributed routing protocols, which are scalable and robust, to an SDN controller with comparable scalability and reliability can be challenging. Second, most current wireless BH networks in rural areas are not SDN-enabled, are equipped with old management tools, and have human operators who are not familiar with the technology. To address this, a number of SDN architectures support evolutionary upgrades to SDN. 25,26 These architectures make it possible to keep the legacy management tools and to gradually update the network with SDN capabilities. Another alternative is to use hybrid SDN networks, which combine non-SDN with SDN equipment and techniques. [27][28][29] Despite this, we think that the proposed ECN-based mechanism is still simpler and more cost-effective for the given task.
In addition, techniques to measure congestion inside the BH are needed. Traditional flooding approaches to measure the available throughput 30 are not suitable for BH networks, since they significantly degrade the network operation while the tests are being performed. This implies that they do not accurately provide the actual nonsaturated throughput of a link, ie, the throughput offered with bounded delay. 10 There are also a number of passive methods that provide an accurate measurement of the actual traffic in an interface in almost real time. 31,32 Although these approaches are simple and lightweight, they require extra information about the actual maximum capacity of a given link in order to estimate its actual congestion and how far its current load is from the saturation point. The maximum capacity of a link is usually pre-established when designing the network by a given modulation and coding scheme; however, this figure is not accurate enough, since other variables such as interference and momentary fading are not considered. More elaborate approaches involve a minimum of signaling overhead 33 while efficiently identifying the level of congestion of the link regardless of its nominal capacity. In the present work, we take an equivalent approach for measuring the link congestion level, as described in Section 4.6.

EXPLICIT CONGESTION NOTIFICATION
The use of ECN with TCP is described in Ramakrishnan et al, 34 and its usage with RTP/UDP in Westerlund et al. 35 TCP's congestion control algorithms are based on the idea that the network is a black box. This means that end nodes measure the network state without any interaction with the intermediate nodes. With basic TCP mechanisms, if the network is congested and some packets are dropped in any intermediate router, TCP detects the packet drops and reacts by reducing the sender bit rate. However, if active queue management is used, eg, Random Early Detection (RED), 36 alternatives to packet dropping exist. The main one is to use the Congestion Experienced (CE) codepoint in the IP header to explicitly notify congestion. This is possible since active queue management strategies aim to detect congestion before it happens (ie, before the queue is full). When a network is ECN-capable, it can modify the CE codepoint, and therefore it is possible to notify congestion to the transport layer in the sender node without dropping packets in the path.
ECN is based on two bits that are placed in the IP header, as illustrated in Figure 2. Specifically, the IPv4 Type of Service octet and the IPv6 Traffic Class octet are divided into two parts. The first six bits are used as the Differentiated Services Code Point (DSCP), and the last two bits (7 and 8) are used for the ECN field. Table 1 shows the meaning of each of the four ECN values.
If, during the negotiation of the transport connection, sender and receiver choose to use ECN, the sender sets an ECT codepoint (ECT(0) or ECT(1)) in all packets, establishing an ECN-enabled connection. In general, routers treat the ECT(0) and ECT(1) codepoints as equivalent. 34 If an intermediate router detects congestion and the ECT codepoint is set, instead of dropping the packet, it simply sets the ECN bits to CE. When an intermediate router receives a CE packet, it forwards the packet as usual without changing the CE codepoint. The roles of the receiver and the sender vary depending on the transport layer. The use of ECN has been defined for TCP, SCTP, 37 DCCP, 38 and RTP with UDP. 35 In the Iuh interface, the main transport protocols are TCP and RTP over UDP, the latter optionally controlled by the RTP Control Protocol (RTCP). 13,39
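As an illustration of the field layout just described, the following Python sketch splits the octet into its DSCP and ECN parts and maps the ECN value to the codepoint names of RFC 3168. The function names (`split_octet`, `join_octet`) are hypothetical and are not part of the paper's implementation.

```python
# ECN codepoints as defined in RFC 3168 (two low-order bits of the octet).
ECN_NAMES = {0b00: "Not-ECT", 0b01: "ECT(1)", 0b10: "ECT(0)", 0b11: "CE"}


def split_octet(octet: int) -> tuple[int, int]:
    """Split an IPv4 ToS / IPv6 Traffic Class octet into (dscp, ecn)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in the range 0..255")
    return octet >> 2, octet & 0b11


def join_octet(dscp: int, ecn: int) -> int:
    """Rebuild the octet from a 6-bit DSCP and a 2-bit ECN field."""
    if not (0 <= dscp <= 0x3F and 0 <= ecn <= 0b11):
        raise ValueError("dscp must fit in 6 bits and ecn in 2 bits")
    return (dscp << 2) | ecn


# Example: the octet 0xBB carries DSCP 46 together with the CE codepoint.
dscp, ecn = split_octet(0xBB)
print(dscp, ECN_NAMES[ecn])
```

Keeping DSCP and ECN as separate values makes it easy to rewrite only the two ECN bits while leaving the QoS marking untouched.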

ECN with TCP
When a TCP receiver detects the CE codepoint, it sets the ECN-Echo (ECE) flag in the next TCP ACK. When the sender receives a packet with this indication, it reacts in the same way as when detecting a packet drop. Then, the sender sets the Congestion Window Reduced (CWR) flag in the TCP header to acknowledge the reception of the ECN-Echo.

ECN with RTP/UDP
Congestion control for real time applications over RTP/UDP is particularly important, since there is no retransmission when a packet is lost. Before the transport connection is established, RTCP initializes the ECN mechanism by ensuring that all nodes are ECN-capable. When the receiver detects a packet with the CE codepoint, it notifies this to the sender using RTCP feedback mechanisms.
When an IPSec tunnel is used between the sender and the receiver (or in a part of the path), the TCP or UDP sender may not take the ECN bits in the external IP header into account, since they are not secured. 34 In this case, the ECN bits in the external header may simply be discarded. This is called limited functionality. The alternative is full functionality, which takes these bits into account together with the ECN field of the internal IP header (which is secured). In that case, an OR operation is performed between the congestion notifications (there is congestion if it is marked in the inner or the outer header). Table 2 represents all the possible values for the ECN field in the inner and outer headers at the tunnel ingress and egress. ECT(0) or ECT(1) indicates an ECN-capable network with no congestion detected, and CE indicates congestion detected in an ECN-capable network. This means that when the outer header is removed at the tunnel egress, the CE codepoint is set in the ECN field of the inner header if it is set in the outer header. Otherwise, the inner header is not changed and contains the value it had before the tunnel ingress point.
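The egress behaviour described above can be sketched in a few lines of Python. This is only an illustration under two assumptions that go beyond the text: the function name `egress_inner_ecn` is hypothetical, and an inner header marked Not-ECT is left untouched even if the outer header carries CE, following our reading of RFC 3168.

```python
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def egress_inner_ecn(inner: int, outer: int, full_functionality: bool = True) -> int:
    """Return the ECN value of the inner header after decapsulation."""
    if not full_functionality:
        # Limited functionality: the unsecured outer bits are discarded.
        return inner
    if outer == CE and inner != NOT_ECT:
        # Full functionality: congestion marked in the outer header is
        # propagated (logical OR of inner and outer notifications).
        return CE
    # Otherwise the inner header keeps the value it had at the ingress.
    return inner
```

With full functionality, `egress_inner_ecn(ECT0, CE)` yields CE, whereas with limited functionality the same call returns the inner value unchanged.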

Reference scenario
We consider the reference scenario depicted in Figure 3. The BH is divided into two parts: one between the edge node and the gateway router (GW), called managed BH, and another between the GW and the Core Network Gateway (CN-GW), called not-managed BH. This split aims to consider a general scenario by including a part of the network, which is controlled by the operator (eg, a multi-hop wireless BH) and another that is not under its control (eg, a leased transport network). If the connection between the AN and the CN-GW uses ECN, the nodes in the not-managed BH may notify congestion using the CE codepoint. Our solution shall consider this situation to preserve that congestion notification, if desired.
The main objective of our approach is to use the ECN bits to notify not only the existence of congestion in the managed BH but also its severity, while preserving congestion notifications that may come from the not-managed BH and supporting both TCP and RTP/UDP traffic.

Overview of the solution
The edge node is assumed to know the congestion level of the managed BH at any time instant. In our scenario, the mechanism to achieve this is described in Section 4.6. The nodes in the managed BH are configured to be not-ECN capable, so they do not modify the ECN bits by default. We propose to change the meaning of the outer ECN bits only in the interface between the edge node and the AN-node (blue dashed vertical line in Figure 3). This interface is a wired connection between each edge node and each AN-node (eg, HNB). The new usage of the bits is:
1. First bit ($b_0$) is used to notify congestion in the not-managed BH. This bit is set to one if the packet is marked with the external-header CE codepoint.
2. Second bit ($b_1$) is used to notify the congestion level in the managed BH.
To use $b_1$, the following method is applied. Let $\rho$ denote the percentage of congestion. Then, to code a congestion value of $\rho$, bit $b_1$ is set to one in only 1 of every $100/\rho$ packets. In other words, $b_1$ is set to 1 or 0 over a certain period, and the proportion of ones and zeroes during that period is used to calculate the congestion level. For example, if we set five levels of congestion (0%, 25%, 50%, 75%, and 100%), the ECN field of at least four packets must be read in the AN-node in order to estimate the congestion level. Then, for a congestion of 25%, the edge node of the managed BH sets bit $b_1$ to 1 in one of every four packets. For a congestion of 50%, two of every four packets have their bit $b_1$ set to 1, and so on. The granularity of the congestion notification determines the minimum number of packets $N_w$ that have to be received before obtaining $\rho$, and hence the update period with which the HNB is notified of the congestion level.
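A minimal Python sketch of the marking rule is given below. The text only fixes the proportion of marked packets within a window; spreading the ones evenly across the window is an assumption made here for illustration, and the function name `b1_pattern` and its parameters are hypothetical.

```python
def b1_pattern(rho: float, n_w: int) -> list[int]:
    """Return the second-bit values for one window of n_w packets.

    rho is the congestion percentage (0..100) to be coded.
    """
    if not 0 <= rho <= 100 or n_w <= 0:
        raise ValueError("rho must be in 0..100 and n_w must be positive")
    # Number of marked packets, rounded half up to the nearest integer.
    n_1 = int(rho * n_w / 100 + 0.5)
    pattern, acc = [], 0
    for _ in range(n_w):
        # Accumulate n_1 per packet and emit a one each time a full
        # window's worth has been collected (even spreading, assumed).
        acc += n_1
        if acc >= n_w:
            acc -= n_w
            pattern.append(1)
        else:
            pattern.append(0)
    return pattern


print(b1_pattern(25, 4))  # one marked packet out of four
print(b1_pattern(75, 4))  # three marked packets out of four
```

Any other placement of the ones inside the window would code the same level, since only their proportion is used by the reader.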

Compatibility with standard ECN usage
If the AN-node establishes a connection with full-functionality ECN with the CN-GW, after reading $b_0$ and $b_1$, the outer ECN field must be set back to the original value. Table 3 shows all the possible combinations of ECN values at the IPSec tunnel ingress and egress. Three states are possible for the outer ECN field (ECT, Not-ECT, and CE), but only two values can be stored (0 or 1 in $b_0$). However, if the outer ECN field at the egress of an IPSec tunnel is ECT or Not-ECT, the receiver must ignore it, so in fact only two values are relevant in the outer ECN field at the receiver: CE and any other value. Hence, after reading the congestion information for both the managed BH and the not-managed BH, the outer ECN field is rewritten and set to CE (ie, congestion is notified to the receiver transport layer) if $b_0$ is 1, or to ECT otherwise. Comparing the first two and the last columns in Tables 1 and 2, we observe that standard usage of the ECN bits is not compromised.
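The local re-coding and the restoration step can be sketched as follows. This is an illustrative Python fragment, not the authors' code: the function names are hypothetical, the first bit is assumed to be the more significant of the two ECN bits, and ECT(0) is assumed as the restored value when no congestion is signalled.

```python
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11


def edge_encode(outer_ecn: int, b1: int) -> int:
    """Edge node: re-code the outer ECN field for the local wired link."""
    b0 = 1 if outer_ecn == CE else 0  # congestion in the not-managed BH
    return (b0 << 1) | (b1 & 1)       # assumed layout: b0 high bit, b1 low bit


def an_decode(local_ecn: int) -> tuple[int, int, int]:
    """AN-node: read both bits and restore a standard outer ECN value."""
    b0 = (local_ecn >> 1) & 1
    b1 = local_ecn & 1
    restored = CE if b0 else ECT0     # any non-CE value is ignored at egress
    return b0, b1, restored
```

For example, `an_decode(edge_encode(CE, 0))` returns `(1, 0, CE)`, so a CE mark set in the not-managed BH survives the local re-coding.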

Implementation details
The congestion level ($\rho$) is coded as the fraction of packets with $b_1 = 1$ in a window of length $N_w$ (packets). Two types of window are considered: nonoverlapping window (NW) and sliding window (SW).

Nonoverlapping window
With an NW of length $N_w$ packets, a new estimate of the congestion level is obtained every $N_w$ packets. The value of $N_w$ depends on the precision required for the congestion level at the HNB, which is necessarily quantized. Let $e_{max}$ be the maximum error allowed for the congestion level estimate (in %). Then,

$$N_w \geq \frac{100}{2\, e_{max}}. \qquad (1)$$

For example, applying Equation 1 to a window with congestion level $\rho \in \{0, 25, 50, 75, 100\}$%, the minimum value for $N_w$ is 4, with $e_{max} = 12.5$%. On the other hand, the congestion is estimated in the HNB by reading $b_1$ during $N_w$ packets and taking the mean value of the stored vector:

$$\hat{\rho} = \frac{100}{N_w} \sum_{k=1}^{N_w} b_1[k], \qquad (2)$$

where $b_1[k]$ is an indicator function that equals 1 if $b_1 = 1$ in packet $k$, and the window of $N_w$ packets spans a time duration $T_w$.
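The relation between window length and precision can be checked with a short Python sketch. It assumes that the representable levels are multiples of 100 divided by the window length, so that the worst-case quantization error is half a step, which matches the example above (a window of 4 packets gives a maximum error of 12.5%). The function names are hypothetical.

```python
import math


def min_window_length(e_max: float) -> int:
    """Smallest window length whose quantization error stays within e_max (%)."""
    if e_max <= 0:
        raise ValueError("e_max must be positive")
    return math.ceil(100 / (2 * e_max))


def nw_estimate(bits: list[int]) -> float:
    """Congestion estimate (%) from the second-bit values of one full window."""
    if not bits:
        raise ValueError("the window must contain at least one packet")
    return 100 * sum(bits) / len(bits)


print(min_window_length(12.5))    # window length for the five-level example
print(nw_estimate([0, 1, 0, 1]))  # two marked packets out of four
```

Halving the allowed error doubles the window length, and therefore doubles the number of packets the HNB must wait for before each update.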

Sliding window
The SW approach can be used to reduce the latency of the congestion notification. This approach is useful when the scheduling period is short (eg, 2 ms in HSPA) and when sudden changes in the BH are frequent. In order to compute and update the congestion estimate $\hat{\rho}[n]$ each time a packet arrives, a general windowing function can be applied:

$$\hat{\rho}[n] = 100 \sum_{k=0}^{N_w-1} w[k]\, b_1[n-k],$$

where $w[k]$ are the window weights, which sum to one. The edge node then selects the value of $b_1[n]$ that brings the estimate closest to the actual congestion:

$$b_1[n] = \arg\min_{i \in \{0,1\}} \left| \rho[n] - \hat{\rho}[n]|_i \right|,$$

where $\rho[n]$ is the true congestion at time instant $n$ and $\hat{\rho}[n]|_i$ is the congestion level estimated by the AN-node if a packet with $b_1[n] = i$ is received. Note that, since the AN-node and the edge node are linked by a short dedicated wired connection, we can assume that all packets are successfully received without loss.
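One way the sliding-window scheme could work is sketched below in Python. The sketch rests on several assumptions that are not stated in the text: a rectangular window over the most recent packets, a window initialised with zeros, and an edge node that mirrors the reader's window (possible because no packets are lost on the wired link) and picks the bit that leaves the reader's estimate closest to the true congestion. The class and function names are hypothetical.

```python
from collections import deque


class SlidingWindowEstimator:
    """Rectangular sliding window over the most recent n_w second-bit values."""

    def __init__(self, n_w: int) -> None:
        self.bits = deque([0] * n_w, maxlen=n_w)

    def peek(self, bit: int) -> float:
        """Estimate (%) that would result if `bit` were the next packet."""
        total = sum(self.bits) - self.bits[0] + bit  # oldest bit drops out
        return 100 * total / len(self.bits)

    def push(self, bit: int) -> float:
        """Store the bit of a newly arrived packet and return the estimate (%)."""
        self.bits.append(bit)
        return 100 * sum(self.bits) / len(self.bits)


def choose_b1(mirror: SlidingWindowEstimator, rho: float) -> int:
    """Edge node: pick the bit whose resulting estimate is closest to rho."""
    bit = min((0, 1), key=lambda i: abs(rho - mirror.peek(i)))
    mirror.push(bit)  # keep the mirror in step with the reader
    return bit


tx, rx = SlidingWindowEstimator(4), SlidingWindowEstimator(4)
estimates = [rx.push(choose_b1(tx, 50)) for _ in range(8)]
print(estimates)
```

In this sketch the reader's estimate is refreshed on every packet, so a change in congestion starts to be reflected immediately instead of at the end of a full window.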

Security issues
The proposed mechanism is designed to cope with a general scenario where the transport network may have a segment that is not controlled by the operator. The method notifies not only the congestion level in the managed BH but also the existence of congestion in the not-managed BH. This part of the network may not be secured, so the ECN bits of the external IP header may be changed by a malicious attacker if ECN with full functionality is used. In that case, a false congestion notification would be generated in the AN. This is a general limitation of the use of full-functionality ECN, with or without our approach. 40 Hence, if security in the not-managed BH cannot be assured, bit $b_0$ should be ignored (limited functionality), and then no information about congestion in this part of the network would be provided. The value of $b_1$ is based on internal congestion measurements, and it is used only in the link between the AN-node and the edge node, which is typically secure.

Measuring congestion
As stated in Section 2, a way to estimate the actual congestion of the link regardless of its nominal capacity is required.
On the basis of the work of Rattaro and Belzarena, 33 we have implemented a statistical descriptor of a common wireless link state. This descriptor is the mean value of the variable $K_n$, which is computed from $\Delta_j$, the jitter at time instant $j$ (the difference between the delays of packet $j$ and packet $j-1$), and $K_0$, an initialization parameter. According to Rattaro and Belzarena, 33 the correlation coefficient between $E\{K_n\}$ and the predicted throughput is 0.86. Since statistical learning tools are used, we take a set of samples with known capacity ($C$), in which each sample is represented as $x = (E\{K_n\}, C)$. With this set of samples, a support vector machine (SVM) has been trained, which learns the relationship between the input variables and the congestion level. In the test phase, a short train of packets is periodically transmitted to estimate the BH state. Using the model created by the SVM and the jitter information read at the receiver, an estimate of the actual network throughput is obtained. In our setup, we run different experiments to compute and validate the correlation between jitter and capacity in a testbed with two IEEE 802.11n links, as described in Figure 4.
From node A, traffic is injected towards node C in order to congest the network, using D-ITG. 41 Once the flooding traffic is in the network, additional measurement bursts of 200 packets are sent, and 1500 bursts are periodically transmitted for each experiment, progressively increasing the congestion level in the network. Nodes 1 and 3 are synchronized with NTP 42 with a maximum accuracy of 2 s. Measurements are made in both directions (alternating nodes A and C as burst transmitters). The receiver measures the unidirectional delay, round trip time, packet loss, and received throughput. For all cases, we compute the mean jitter, the standard deviation of the jitter, and the packet loss for different congestion levels. In order to obtain more robust results, we also experiment with different setups. Each experiment consists of 16 different configurations with 1500 samples each. Then, the correlation coefficients between mean jitter and input rate and between standard deviation of jitter and input rate are computed. In all cases, the correlation coefficient is higher than 0.65 and goes up to 0.75 for the mean jitter and 0.85 for the standard deviation of the jitter. As an example, Figures 5, 6, and 7 show how the mean jitter, its standard deviation, and the packet loss vary with the congestion level, respectively. In this example, 1500 samples were taken at each load point for a wireless link with MCS1 (QPSK-1/2), injecting unidirectional traffic without VoIP and congesting only the A-B segment. As can be seen, the packet loss and the mean jitter suddenly increase at a congestion level of approximately 80%. This is because the link always reaches the saturation point before reaching its actual maximum capacity. 10 For this reason, wireless BH links should always work below the saturation point: although not all the available throughput is used, the delay is kept bounded.
* Although our figures are lower than the ones obtained in Rattaro and Belzarena, 33 they allow us to be confident enough to use this measurement congestion mechanism in order to estimate the congestion in our ECN-enabled BH tests.
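The per-burst statistics and the correlation coefficient used in this validation can be computed as in the Python sketch below. It is not the authors' measurement code: the function names are hypothetical, the jitter is taken as the absolute difference between consecutive one-way delays (the text does not state whether the sign is kept), and a plain Pearson coefficient is assumed.

```python
from math import sqrt
from statistics import mean, pstdev


def jitter(delays: list[float]) -> list[float]:
    """Absolute delay difference between consecutive packets of a burst."""
    return [abs(b - a) for a, b in zip(delays, delays[1:])]


def burst_stats(delays: list[float]) -> tuple[float, float]:
    """Mean and (population) standard deviation of the jitter of one burst."""
    j = jitter(delays)
    return mean(j), pstdev(j)


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

In a setup like the one described, `pearson` would be applied to the list of input rates and the list of mean jitters (or jitter standard deviations) obtained with `burst_stats` for each burst.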

Basic functionality
We have tested the basic functionality of the proposed procedure in a real network similar to the one described in Figure 4 and have implemented two procedures: one for modifying the ECN field in the edge node (ECN writer, ecn_injector) and another for reading and decoding the ECN field in the HNB (ECN reader, ecn_reader). † For this test, the injector generates its own secured IP packets and codes the congestion level $\rho$% by setting $b_1 = 1$ in $N_1$ packets during a window of $N_w$ packets, with $N_1 = \lfloor \rho N_w / 100 \rceil$, where $\lfloor \cdot \rceil$ is the rounding operator. The ECN reader reads all IP packets that come from a specific interface and estimates the congestion level at the end of each window as $\hat{\rho} = \frac{100}{N_w} \sum_{k=1}^{N_w} b_1[k]$, which corresponds to using a rectangular NW (ie, the congestion estimate is updated every $N_w$ packets). Figure 8 shows the real and estimated congestion using these scripts. The blue line represents the actual load of a UDP flow, and the colored lines represent the estimated congestion level for different values of $N_w$ (5, 10, 20, and 40). It can be observed that the estimated congestion follows the real value, although, since the inter-arrival time between packets is random, the delay when updating congestion during transitions varies with time.

Numerical simulations
To provide more insight into the performance of the proposed mechanism, a simulator was implemented in Matlab. The simulator has three main elements:
1. Traffic generator. The packet inter-arrival time is simulated with a Pareto distribution. With $C$ (in bps) the BH capacity and $L$ (in bits) the packet length, the mean packet inter-arrival time is set from the shape parameter of the Pareto distribution and the desired mean packet rate $\bar{r} = \frac{\rho \cdot C}{100 \cdot L}$.
2. ECN writer. Only the bit $b_1$ is considered. Equation (3) is applied at the transmitter. An NW and an SW are used and compared. In the latter case, congestion is updated each time a packet arrives.
3. ECN reader. The HNB reads the bit $b_1$ in each packet and generates an estimate of congestion using Equation (2).
Two experiments are run. First, a set of packets is generated during 100 seconds with the congestion level increasing in steps of 10%, capacity $C = 10$ Mbps, and $L = 1400$ bytes. A window length $N_w = 20$ is used. Figure 9 shows the result of the estimation process.
Since $N_w$ is high, the congestion levels are accurately notified, and the mean square error (MSE) values are zero for most time instants. However, as shown in detail in Figure 9B, during the transients, the delay in the response produces error peaks. As illustrated, when NW is used, the delay in estimating the congestion becomes higher, since the receiver must wait until the end of the window. This problem is partially solved by using SW.
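A traffic generator of this kind can be sketched in Python as follows. The mean packet rate follows the definition given above (congestion percentage times capacity, divided by 100 times the packet length). The scaling of the Pareto samples so that their mean equals the inverse of that rate is an assumption of this sketch and is not necessarily the expression used in the authors' simulator; the function names and the shape value used in the example are also hypothetical.

```python
import random


def mean_packet_rate(rho: float, capacity_bps: float, packet_bits: int) -> float:
    """Desired mean packet rate (packets/s) for a congestion level rho (%)."""
    return rho * capacity_bps / (100 * packet_bits)


def pareto_interarrivals(n: int, rate: float, alpha: float,
                         rng: random.Random) -> list[float]:
    """Draw n Pareto inter-arrival times (s) whose mean is 1/rate.

    alpha is the Pareto shape parameter and must exceed 1 for the
    mean to exist.
    """
    if alpha <= 1:
        raise ValueError("alpha must be greater than 1")
    # paretovariate() has minimum value 1 and mean alpha / (alpha - 1),
    # so this scale factor yields a mean of 1 / rate (assumed scaling).
    scale = (alpha - 1) / (alpha * rate)
    return [scale * rng.paretovariate(alpha) for _ in range(n)]


rate = mean_packet_rate(50, 10e6, 1400 * 8)  # 50% load, 10 Mbps, 1400-byte packets
samples = pareto_interarrivals(5, rate, alpha=2.5, rng=random.Random(1))
print(round(rate, 2), samples)
```

Passing an explicit `random.Random` instance keeps repeated simulation runs reproducible.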
In the second experiment, 10 periods of 200 ms are generated, each one with a constant random congestion level. Window lengths of 5, 10, and 20 packets are tested with both NW and SW; 1000 repetitions of the experiment are run. The cumulative distribution functions (CDFs) of the relative mean square error ($e$) and the updating delay ($\Delta$) are represented in Figure 10. $\Delta$ is computed as the time needed for $\hat{\rho}$ to reach $\rho$ when there is a variation in $\rho$. As expected, $\Delta$ is lower for shorter windows. Since errors happen mainly at transitions, shorter windows also provide more accurate results. Equivalent results are shown for different $C$ and $N_w$ in Figure 11. As expected, Figure 11A shows that the delay and the mean squared error increase for higher values of $N_w$. This is because with the lowest value, $N_w = 10$, the congestion levels used in the experiment (10% and 90%) can already be exactly represented. This means that the minimum window length that provides enough accuracy should always be used. Also, as can be seen in Figure 11B, the mean squared error and the mean delay decrease when congestion increases. This is because with higher rates, the number of packets (with congestion information in the ECN bit) received per second in the HNB is higher, and hence the congestion level is updated faster. Since the tests have been performed for a capacity of 10 Mbps, it is expected that for higher-capacity BH networks, delay and error will decrease significantly.
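The dependence of the update delay on load and capacity can be made concrete with a back-of-the-envelope Python sketch. It computes the mean time needed to receive one full window of packets, assuming packets arrive at the mean rate given by the congestion percentage, the capacity, and the packet length; the function name is hypothetical, and the result is an average, not the distribution shown in the figures.

```python
def window_duration(n_w: int, rho: float, capacity_bps: float,
                    packet_bits: int) -> float:
    """Mean time (s) needed to receive n_w packets at congestion level rho (%)."""
    if rho <= 0:
        raise ValueError("rho must be positive for packets to arrive")
    rate = rho * capacity_bps / (100 * packet_bits)  # packets per second
    return n_w / rate


# 10-packet window, 10 Mbps link, 1400-byte packets.
for rho in (10, 90):
    print(rho, round(1000 * window_duration(10, rho, 10e6, 1400 * 8), 1), "ms")
```

With these values, a 10-packet window takes about 112 ms to fill at 10% load and about 12 ms at 90% load, and the time scales inversely with the capacity, which is consistent with the trend described above.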

CONCLUSION
In this work, we have presented a low complexity method to notify the BH congestion to the AN, conceived for rural cellular networks that use a multi-hop IPSec wireless network as BH. The proposal is based on a nonstandard local usage of the ECN bits between the edge node and the HNB, which does not compromise the standard usage of ECN bits in the end-to-end transport connection. The method makes it possible to inform the HNB about the congestion level in the BH, and even about the existence of congestion in a not-managed part of the BH if this part of the network is secure. The proposed solution is focused on BH networks such as multi-hop wireless networks that serve 3G/4G small cells, where the resources are limited and variable, and where SDN techniques cannot be applied. In these scenarios, the AN has to perform access control and scheduling procedures that depend on the BH congestion level. Our solution aims to solve this issue in an easy and efficient manner.
In order to validate the approach, two functions, ECN read and ECN write, have been implemented and tested in a testbed network. Also, extensive simulations have been performed to test the solution for a wide range of configurations. First, the results show that the solution is simple, lightweight, and compatible with previous ECN IPSec standards, making it suitable for implementation in a wide variety of devices. Second, the results for different configurations show that using small SWs provides good performance, with delays lower than 10 ms and accuracies of 2% during 80% of the time. These figures make the solution suitable for admission control and scheduling intervals in current 3G/4G mobile technologies. The integration of this ECN-based approach into future hybrid SDN network designs remains an interesting direction for future work.