LoRaSync: Energy efficient synchronization for scalable LoRaWAN

Low-power wide-area networks connect a large number of battery-powered wireless devices over long distances. Among them, long range wide area networks (LoRaWAN) implement a pure ALOHA medium access scheme to save device energy by minimizing radio usage. However, frame collisions restrain the network scalability when the traffic load increases. In this context, synchronization can be used to exploit the available bandwidth more efficiently by controlling the timing of frame transmissions and reducing the collision probability. Such a strategy increases the network throughput at the cost of an extra energy demand due to the inherent synchronization overhead, even though the targeted network scenario is still expected to serve low-power applications. This article therefore presents LoRaSync, an energy-efficient synchronization scheme designed for LoRa networks of any size. An accurate clock drift model was established from measurements made on real low-cost devices, and leveraged to support the design of LoRaSync. Our mechanism has been used to evaluate the same ALOHA-based random access on a time-slotted basis, thus increasing the maximum achievable throughput compared to the legacy access. Throughput and energy efficiency models are established to evaluate the performance of a LoRaSync-operated network. These models are validated with a simulation environment mimicking large-scale deployments, and then used to determine the most energy-efficient slot size for any traffic load. As a final proof of concept, LoRaSync has been implemented and tested on a LoRa testbed to demonstrate the feasibility of our solution on real hardware.


INTRODUCTION
By providing cheap devices with low-power and long-range communication capabilities, 1 low power wide area networks (LPWAN) keep gaining relevance in industrial and research environments. In particular, long range wide area networks (LoRaWANs) have emerged as a very promising technology to support a wide range of distributed sensing applications. 2][5][6][7][8][9] The reason behind LoRaWAN's scalability limitations lies in the simplicity of its medium access scheme. Indeed, to send information through the internet, a node forges and immediately transmits a link layer frame without checking if the radio channel is free. This simple access scheme is the well-known pure ALOHA MAC (medium access control) protocol. 10 Herein, the channel throughput is limited to a maximum of 18% of the available bandwidth due to frame collisions.
Given that this strategy yields very poor network performance under high traffic loads, synchronization mechanisms have been explored as a means to reduce the frame collision probability and increase the maximum LoRaWAN throughput. 11,12 As a matter of fact, leveraging time-synchronous slots to set up a very simple slotted ALOHA random access scheme 13 doubles the maximum achievable throughput compared to its asynchronous version (i.e., pure ALOHA). More interestingly, a synchronization mechanism allows the development of even more sophisticated MAC layer schemes capable of handling various traffic requirements. For instance, a scheduled access could be designed to ensure collision-free communications. In order to share a common time reference across the network, all devices need to periodically receive timing information to cope with the natural drift of their embedded clock. To do so, they need to frequently switch on their radio to listen to incoming frames. This technique consumes a considerable amount of the energy available in the device batteries. Therefore, energy efficiency should be prioritized when designing the synchronization mechanism.
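The 18% bound and the doubling mentioned above match the classical infinite-population ALOHA results, which are not derived in this article. The sketch below assumes those textbook expressions (throughput G·e^(−2G) for pure ALOHA and G·e^(−G) for slotted ALOHA, with G the offered load in frames per frame time); the function names and the load grid are illustrative only.

```python
import math


def pure_aloha(load: float) -> float:
    """Textbook pure ALOHA throughput for an offered load in frames per frame time."""
    return load * math.exp(-2.0 * load)


def slotted_aloha(load: float) -> float:
    """Textbook slotted ALOHA throughput for the same offered load."""
    return load * math.exp(-load)


# Scan offered loads from 0.001 to 2.0 frames per frame time.
loads = [i / 1000 for i in range(1, 2001)]
best_pure = max(loads, key=pure_aloha)
best_slotted = max(loads, key=slotted_aloha)

print(f"pure ALOHA peak:    {pure_aloha(best_pure):.4f} at G = {best_pure}")
print(f"slotted ALOHA peak: {slotted_aloha(best_slotted):.4f} at G = {best_slotted}")
print(f"ratio of peaks:     {slotted_aloha(best_slotted) / pure_aloha(best_pure):.2f}")
```

Under these assumptions the pure ALOHA peak is 1/(2e), roughly 18% of the channel capacity, and the slotted peak is 1/e, exactly twice as much.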
With this approach in mind, we introduced Class S in a previous contribution 14 as an extension of the LoRaWAN Class B, leveraging a beacon-based synchronization scheme to set up uplink transmission timeslots. Simulations showed that this solution could enable a slotted access over LoRa. However, a beacon-skipping mechanism was required to limit the device radio usage and ensure energy efficiency. In this sense, we present the LoRaSync synchronization scheme as a means to implement Class S on real low-cost, low-power devices. More specifically, we measured the clock skew of typical LoRa hardware to evaluate how fast devices drift away from a given time reference. Based on this preliminary experimental evaluation, we designed LoRaSync with slots large enough to cope with clock errors. The slot size and the maximum clock drift are then used to compute the maximum number of beacons that may be safely skipped while keeping devices synchronized. Such a beacon-skipping mechanism is crucial to minimize the power consumption due to frame receptions, and eventually maximize the battery lifetime.
In order to assess the performance of large-scale LoRaSync-enabled networks, throughput and energy efficiency models have been established. These models notably account for slot size variations, reception slot enlargements and the proposed beacon-skipping mechanism. They have additionally been validated through the LoRaWAN-sim simulation environment. Furthermore, the impact of the LoRaSync slot size on the overall energy efficiency of the network has been evaluated. Remarkably, the slot size has a twofold impact on the network behavior. On the one hand, increasing the slot length reduces the number of slots that fit into the slotframe, thus reducing the maximum achievable throughput. On the other hand, using large slots also allows devices to stay synchronized for longer time periods without receiving beacons, thus reducing their overall power consumption. As a consequence, choosing the proper slot size is key to reaching the best trade-off between throughput and power consumption. We therefore provide a methodology to determine the most energy-efficient slot size for any traffic load.
In order to prove the soundness of the LoRaSync mechanism, a real-world testbed has been set up to implement our synchronization scheme. This proof of concept demonstrates that the timing error can successfully be controlled despite the low-quality clocks found on such devices. In addition, this implementation allows us to highlight that the capture effect impacts the real-world throughput. We have therefore studied this phenomenon in more detail in a separate contribution. 15
From these premises, this contribution is organized as follows. First, technical background on the LoRa technology and the LoRaWAN MAC layer is provided in Section 2. Section 3 describes related works about access scheme improvements for LoRaWAN. It justifies the need for synchronization to enhance the network performance, and highlights the novelty of our beacon-skipping approach, which has been implemented on real hardware for the first time. Then, Section 4 introduces the LoRaSync design, based on real clock skew measurements. Section 5 presents and validates models for the throughput, power consumption and energy efficiency in LoRaSync-enabled networks, which are then leveraged to determine the most energy-efficient slot sizes. For the sake of completeness, a preliminary evaluation of LoRaSync in real LoRa networks is presented in Section 6 through the description of a proof-of-concept implementation. Finally, Section 7 provides concluding remarks and research perspectives.

BACKGROUND ON LORA AND LORAWAN
The LoRa LPWAN technology is well known for its capability to gather data over wide deployment areas. The term LoRa specifically designates the physical layer, while LoRaWAN is the MAC layer that was designed on top of it. The proprietary LoRa modulation relies on a chirp spread spectrum technique, which has proven to be resilient to multi-path interference and channel fading. 16 It allows transmissions to reach ranges of up to 10 km under ideal conditions, 3 with relatively low-power operation. A key parameter of this technique is the spreading factor (SF), which can be associated with the chirp sweeping time. 17 A higher SF results in a longer transmission range, but also in a longer time on air (ToA). High SFs are therefore characterized by a low data rate and a high energy consumption. Later in this contribution, we will focus on using the smallest SF (SF7) because it results in the shortest ToA, and therefore in the minimal energy consumption. The LoRa physical layer modulation is proprietary; however, the upper layers are open and well documented. 18 In particular, the LoRa Alliance provides a common view on the MAC layer implementation through the LoRaWAN specification. This standard presents a network topology in which nodes communicate with gateways via LoRa communications, and gateways interact with servers through an IP backhaul. This architecture is represented in Figure 1.
LoRaWAN defines several device classes with specific communication modes able to fit different types of applications. Class A is the default scheme that all devices must implement. In this mode, the uplink (device to server) transmissions are carried out in a pure ALOHA manner, and followed by two 30 ms reception slots (i.e., RX1 and RX2), providing the server with an opportunity to return a downlink message. The reception slot length corresponds to the time a device needs to detect a LoRa preamble. When such a preamble is identified, the device maintains its radio in a listening state until the end of the transmission in order to demodulate the data symbols. This behavior is depicted in Figure 2A, where a downlink frame is received during the second reception slot RX2. When the server has a pending message for a Class A node, it thus has to wait until this node sends an uplink message to have a downlink transmission opportunity. As a result, Class B was introduced to reduce and bound the downlink delay. Class B devices operate like Class A ones for uplink transmissions, but they also periodically open additional reception slots, offering frequent downlink opportunities. To do so, they need to synchronize to the network by listening to beacons broadcast by the gateways, as shown in Figure 2B. Finally, Class C devices (cf. Figure 2C) must continuously listen to possible incoming frames when they are not transmitting. This mode obviously consumes a significant amount of power and is reserved for certain critical actuators.

RELATED WORKS
A LoRaWAN network relies on gateways to gather frames transmitted by large numbers of low-power end devices and forward them to a data collection server. 4][5][6][7][8][9] Indeed, frame collisions experienced with this simplistic access scheme result in a poor network capacity. In this section, we explore the alternative access strategies that have been proposed by the research community to enhance the performance of the LoRa MAC layer.
Aside from synchronization, a common MAC strategy to increase the network capacity is the use of carrier sense multiple access (CSMA) schemes. 27 Approaches based on CSMA rely on listen-before-talk features to alleviate collisions. However, their application to LoRa communications is still in its infancy in the current state of the art. First of all, LoRa transceivers prior to the SX126x generation were only capable of detecting preamble symbols transmitted at the beginning of a transmission, and not the data symbols that compose the rest of the frame. Semtech then implemented a channel activity detection feature 28 allowing transceivers to detect the presence of data symbols as well. This improvement suggests that a clear channel assessment (CCA) procedure, a crucial requirement for the design of a CSMA access, could be implemented. However, it was found in Reference 29 that this CCA quickly becomes unreliable when the distance between the competing nodes increases. Despite this weakness, several listen-before-talk schemes have been proposed [30][31][32][33] to enhance the network performance. Yet, their application to real-life networks where a large number of widely spread nodes are in competition remains uncertain. Indeed, if no reliable sensing can be performed past a certain distance, it is expected that the hidden terminal problem 34 will drastically weigh on the protocol performance. For this reason, we believe that access schemes relying on synchronization are more promising. In fact, even a very simple slotted ALOHA access such as the one explored in this article nearly doubles the network capacity with very little impact on the end-to-end delay. Furthermore, synchronization will later make it possible to set up scheduling algorithms performing even better than CSMA strategies, taking advantage of the centralized server already present in the LoRaWAN topology. In the rest of this section, we therefore focus on contributions that leverage a synchronization strategy to exploit the available bandwidth more efficiently.
A first approach to seamlessly synchronize devices without impacting the LoRa communications is the use of out-of-band communications. In Reference 19, Piyare et al. propose an on-demand time division multiple access scheme that leverages another low-power communication technology (i.e., micro-Watt wake-up receivers) to activate all devices and provide them with a common time reference before starting LoRa data exchanges. Similarly, Beltramelli et al. 20 propose the use of FM-RDS signals to achieve synchronization in LoRa networks. The main drawback of these strategies is that they need LoRa devices to be equipped with additional hardware components, thus making such solutions unusable with current products available on the market.
Another popular strategy is to use individual acknowledgments to synchronize devices upon demand. For instance, in Reference 11, Polonelli et al. implemented a slotted ALOHA scheme on real hardware, using Class A reception windows to share time information through acknowledgments. In Reference 12, the synchronization frame additionally carries a timeslot assignment schedule to organize the transmission of all devices based on their traffic needs. Notably, Bloom filters are used to reduce the size of this extended downlink frame and enhance the efficiency of this mechanism. This work is then leveraged in Reference 21 to set up a transmission scheduling protocol for LoRaWAN. These contributions account for the clock drift of devices and are supported by real-hardware implementations. This shows the feasibility of the strategy, but also reveals that a re-synchronization is periodically required. In this regard, the gateway's duty cycle limitations bound the network size that can be handled by this mechanism. 35 Besides, most LoRa gateways are half-duplex, 36 meaning that they cannot transmit and receive frames at the same time. As a result, the downlink traffic will considerably increase if many nodes require an ack-based synchronization. The gateway will therefore be unreachable during the transmission of these messages, which will eventually affect the uplink throughput. Another proof-of-concept implementation of an ack-based synchronization has been proposed in Reference 22, and leveraged to set up a channel-hopping scheme. However, the authors focus on using a single slave node, and therefore do not encounter the scaling difficulties presented above.
These issues are avoided when using beacons to advertise the network and synchronize end devices. Such an approach scales with any network size, since the reception of a beacon frame can serve as a time source for all devices listening to it on the air. That is why we chose the LoRaWAN Class B beaconing mechanism as a basis to design Class S, 14 an access scheme allowing slotted uplink communications. One of the key conclusions of this contribution was that a beacon-skipping mechanism was highly desirable in order to save battery on the low-power LoRa devices. Several contributions [23][24][25][26] have explored beacon-based synchronization, but until now the idea of fine-tuning the beacon reception periodicity in order to foster energy efficiency has not been explored in depth. These other approaches are detailed below, and for each we state which key differences have been identified with regard to LoRaSync.
In Reference 23, beacons are leveraged to share the settings of a fine-grained scheduling algorithm designed for bulk data collection. The problem of bulk data collection differs from the continuous traffic management studied in our article, as it targets networks in which the base station is not always available. It thus corresponds to use cases in which gateways are carried by drones or satellites, and periodically hover over sensors deployed in very remote areas. Alternatively, a beaconing mechanism is used in Reference 24 to group transmissions with similar RSSI and SF within the same timeslots. The objective of that article is to reduce the impact of the capture effect, notably improving the network fairness. Its goal is therefore different from ours, because we rather focus on extending the network capacity while maximizing energy efficiency. With TS-LoRa, 25 Zorbas et al. tackle the problem of autonomous slot assignment in synchronized LoRa networks with acknowledged traffic. SACK (synchronization/acknowledgement) packets are introduced to serve both as synchronization beacon and group acknowledgement. The viability of this protocol is demonstrated with a real-hardware implementation, yet the reception of SACK packets increases the energy consumption by 50%. The authors claim that they offer a better energy efficiency than legacy LoRaWAN, but in fact they compare their implementation to the LoRaWAN acknowledged mode, in which devices already have to spend energy for ACK receptions. This approach can be questioned since, as explained above, it has been shown in Reference 35 that bidirectional traffic heavily affects the uplink throughput. That is why we focus on unconfirmed traffic in the scope of this article, which makes our approach completely different from TS-LoRa. Finally, beacon synchronization was once again used in Reference 26 with a focus on industrial networks characterized by periodic, predictable traffic. Clock drift measurements and corrections are notably emphasized in this last contribution. However, the targeted protocol is an offline-scheduling algorithm designed for industrial sensor networks with predictable and periodic traffic. Conversely, LoRaSync is designed for general-purpose LoRa deployments with unpredictable traffic and possible layout changes.
In this landscape, LoRaSync is introduced as a scalable beacon-based synchronization approach that focuses on energy efficiency thanks to its beacon-skipping mechanism. It is built on top of the LoRaWAN Class B beacon window structure, and can therefore be easily implemented on typical LoRa hardware with just a few firmware adjustments (cf. Section 6.2). This beacon-skipping mechanism relies on real clock drift measurements in order to space out the synchronization events as much as possible without creating inter-slot transmission overlaps, thus keeping the device power consumption to a minimum. To support this claim, other synchronization schemes found in the literature are compared against LoRaSync in Table 1. Its columns refer to the main features of the different synchronization strategies mentioned in this section. In detail, we want the synchronization to be in-band so that additional hardware components do not need to be added to the LoRa devices. We also mark the acknowledgment-based approaches as non-scalable, because we have seen that the gateway's half-duplex characteristic and duty cycle (DC) regulations limit their ability to handle a large number of devices without weighing on the uplink throughput. Moreover, we highlight the approaches that have demonstrated the feasibility of their solution on real hardware. Finally, the papers relying on the existing LoRaWAN Class A transactions and on the Class B beacon window structure have been indicated as well. For these, implementing the proposed protocols in existing networks requires fewer changes to the device firmware.

LORASYNC REQUIREMENTS AND DESIGN
With LoRaSync, we propose an energy-efficient and scalable synchronization approach tailored for LoRa deployments. It is designed to support the development of time-slotted access schemes for such networks, with an effort to minimize the device power consumption. This section first provides an experimental evaluation of the clock drift experienced with low-cost LoRa devices. Second, we detail the reasons why we recommend configuring all transmissions with SF7, and why we concentrate on using the largest payload length allowed in the scope of this article. Based on these premises, we then present the LoRaSync design, which includes (i) the synchronization strategy, (ii) a slotframe structure resilient to the device clock skew, and (iii) a beacon-skipping mechanism designed to save battery power.

TABLE 1 Comparison of related works about LoRa synchronization.

Column headings: Synchronization strategy; In-band synchronization; LoRaWAN compliant.
Piyare et al. 19: Low-power wake-up receivers; x x
Beltramelli et al. 20: FM-RDS signals; x x
Polonelli et al. 11: Acknowledgements; x x x
Haxhibeqiri et al. 12 and Garrido et al. 21

FIGURE 3 Device drift measurements.

Device drift measurement and modeling
The LoRaSync synchronization mechanism aims at efficiently correcting the natural clock drift of cheap wireless devices to allow slotted uplink communications. Therefore, it is necessary to understand how this drift can be modeled before tackling the LoRaSync design. To do so, we use the testbed presented in Section 6, and program LoPy devices* to periodically transmit their local time. By comparing the intervals between subsequent transmissions as seen by the gateway (which benefits from a very accurate GPS time) with the estimation of the same intervals made by the devices, we are able to evaluate the variations of the timing error. This clock skew has been plotted in Figure 3 for two devices.
In most related works found in the literature, 11,20,26 the clock drift is modeled as a linear function with the assumption that the sensor temperature is constant. This is referred to as the simple skew model, 37 and it is verified in our experiment, as we can see in Figure 3 that the device drift follows a linear trend in the long term. For cheap devices equipped with low-quality clock crystals, a worst-case drift coefficient of 20 parts per million (ppm) is typically assumed.
However, it should also be noted that the clock skew displays a substantial noise around this overall trend, which must be estimated and accounted for when designing the LoRaSync margins. This jitter has been mentioned in Reference 38, but the authors chose not to include it in their modeling for simplicity reasons. Such a noise is indeed rarely considered in practice, even though for such cheap hardware it can result in significant errors over short time spans. As a result, we define a clock drift interval model in which $d$ is the drift coefficient in ppm, and $\varepsilon$ the maximum deviation from the linear model due to noise. Therefore, a device characterized by such parameters which has not been synchronized for a duration $t$ will experience a drift comprised within the interval $(d \cdot t) \pm \varepsilon$. The value of $\varepsilon$ has been experimentally obtained by observing the maximum deviation from the linear trend found in the samples collected by the two devices and rounding it up. Following this methodology, we find $\varepsilon = 11$ ms. Such a model has been plotted next to the real drift in Figure 3, with $d$ adapted to the drift coefficient of each of the two devices, respectively −1.5 and 3.6 ppm. This preliminary drift evaluation will later be used to fine-tune the LoRaSync parameters. Besides, we note that the worst-case 20 ppm drift hypothesis holds for the two considered devices.
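As an illustration of the clock drift interval model, the sketch below evaluates the bounds (d · t) ± noise margin for the two measured drift coefficients and for the 20 ppm worst case. The function name is hypothetical, and the 128 s elapsed time is an assumption chosen for the example (it corresponds to the Class B beacon period of the LoRaWAN specification, which is not restated in this section).

```python
def drift_interval_ms(d_ppm: float, elapsed_s: float, noise_margin_ms: float = 11.0):
    """Return the (lower, upper) drift bounds in milliseconds.

    d_ppm: linear drift coefficient in parts per million.
    elapsed_s: time since the last synchronization, in seconds.
    noise_margin_ms: maximum deviation from the linear trend (11 ms in the text).
    """
    linear_ms = d_ppm * 1e-6 * elapsed_s * 1e3
    return linear_ms - noise_margin_ms, linear_ms + noise_margin_ms


# Assumed elapsed time: one Class B beacon period (128 s).
for d_ppm in (-1.5, 3.6, 20.0):
    low, high = drift_interval_ms(d_ppm, elapsed_s=128.0)
    print(f"d = {d_ppm:5.1f} ppm -> drift within [{low:.3f}, {high:.3f}] ms")
```

Over such a short time span the linear term stays below 3 ms even in the worst case, so the 11 ms noise margin dominates the interval.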

Spreading factor selection
We have seen in Section 2 that the SF is a key parameter of the LoRa modulation, which has an impact on the range and ToA of frame transmissions. In this part, we justify our choice to recommend using only the smallest SF, SF7. First of all, using the other SFs is detrimental in terms of energy efficiency. In detail, a higher SF results in a lower data rate and a longer ToA, the only benefit being the improved transmission range. The consequence of having a longer ToA is that devices will have to use their radio longer to transmit the same amount of data, and will thus consume a greater amount of energy. Indeed, given $SF$ the transmission spreading factor and $BW$ its bandwidth, the symbol duration $T_{Sym}$ can be computed as follows 39 :

$$T_{Sym} = \frac{2^{SF}}{BW} \quad (1)$$

As a result, the ToA magnitude grows proportionally to $2^{SF}$, meaning that high SFs drastically affect the device's energy efficiency. In this article, we have designed LoRaSync on top of SF7, because it showcases the best performance in terms of both data rate and power consumption. Yet, the same design logic proposed throughout this article for SF7 can be applied to work with other SFs: for each additional SF, a new timeslot duration must be defined to accommodate the maximum frame size related to that SF. The resulting slotted structures (each on a separate SF) can then coexist on a given channel. SFs are presented as orthogonal by Semtech, 39 meaning that transmissions occurring on the same channel but with different SFs should be able to overlap without interference. However, this claim has been questioned by the research community. 40 Specifically, Croce et al. measure a 16 dB co-channel inter-SF rejection threshold. Such an RSSI difference is very likely to occur in near-far conditions, which are commonly experienced in large-scale LoRaWANs, especially if high SFs characterized by long transmission ranges and frame lengths are used. Multi-SF LoRa deployments can therefore not be studied as the simple superposition of independent networks. This research finding leads us to prefer deploying multi-gateway single-SF LoRaWANs over a single-gateway multiple-SF LoRaWAN, if the goal is to cover a wider geographical area. Yet, this hypothesis is out of scope for this article, and its validity and application scope will be investigated in future works. When the use of several non-orthogonal SFs is still the preferred option, the availability of a slotted structure for each SF (characterized, of course, by an SF-specific timeslot size) would be beneficial to the definition of access schemes (ranging from slotted ALOHA to collision-free scheduling) able to bound possible inter-SF collisions. This is also a research topic to be investigated in future works.
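To make the exponential growth of the symbol duration with the SF concrete, the sketch below evaluates the symbol duration, 2^SF divided by the bandwidth, for SF7 to SF12. The 125 kHz bandwidth is an assumption made for the example (a common LoRaWAN channel bandwidth), and the function name is illustrative.

```python
def symbol_duration_s(sf: int, bw_hz: float = 125_000.0) -> float:
    """Duration of one LoRa symbol in seconds: 2**SF chips sent at BW chips per second."""
    return (2 ** sf) / bw_hz


# Assumed bandwidth: 125 kHz.
for sf in range(7, 13):
    t_sym_ms = symbol_duration_s(sf) * 1e3
    ratio = symbol_duration_s(sf) / symbol_duration_s(7)
    print(f"SF{sf}: T_sym = {t_sym_ms:7.3f} ms ({ratio:.0f}x SF7)")
```

Each SF increment doubles the symbol duration, so a symbol at SF12 lasts 32 times longer than at SF7 for the same bandwidth.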

Payload size standardization
In this contribution, the timeslot length is derived from the maximum frame ToA allowed in the network, $ToA_{max}$ (cf. Equation (4)). This parameter depends on the SF and payload size. We consider the maximum payload size authorized by the LoRaWAN regional parameters 41 (i.e., 250 bytes in Europe). LoRaSync can still be used with heterogeneous payload sizes, but optimal performance is obtained when every frame has the maximum ToA.
The advantage of such a design in our situation is twofold. First, it allows us to use the available bandwidth more efficiently when a slotted access is utilized. Indeed, having frames with a length smaller than the maximum would create gaps between slotted transmissions, thus wasting a portion of the bandwidth. In the worst case, a slotted access could even showcase a lower throughput than pure ALOHA. Conversely, a homogeneous frame length ensures that the gaps between frames are kept as small as possible, and guarantees that the bandwidth is used more efficiently.
Additionally, this hypothesis is also beneficial to the energy efficiency of devices. In fact, when the maximum payload length is selected, the proportion of frame overhead compared to the amount of payload bytes is minimized. Therefore, a larger portion of the device energy is spent transmitting useful bytes, thus improving the network energy efficiency from the application layer perspective. Such a requirement can be easily fulfilled by forcing devices to buffer their messages, and concatenate them into a single payload once a sufficient amount of data has been accumulated. The detrimental aspect of such an approach is that the delay between frames increases because of the buffering mechanism. However, LoRa networks are typically not used for time-critical applications, so we consider this hypothesis to be reasonable.
In a nutshell, enforcing the transmission of frames with the maximum allowed size is beneficial to both extend the network capacity and foster device energy efficiency.
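The overhead argument can be illustrated numerically. The sketch below uses the time-on-air expression published in the Semtech SX127x datasheet, which is not reproduced in this article, and assumes typical settings: SF7, 125 kHz bandwidth, coding rate 4/5, an 8-symbol preamble, explicit header and CRC enabled. The function name and the 50-byte comparison payload are hypothetical, and the lengths are treated as PHY payload lengths for simplicity.

```python
import math


def lora_time_on_air_s(payload_bytes: int, sf: int = 7, bw_hz: float = 125_000.0,
                       coding_rate: int = 1, preamble_symbols: int = 8,
                       explicit_header: bool = True, crc: bool = True,
                       low_data_rate_opt: bool = False) -> float:
    """Time on air of a LoRa frame, following the Semtech SX127x datasheet formula.

    coding_rate is 1..4 for rates 4/5..4/8. All default values are assumptions.
    """
    t_sym = (2 ** sf) / bw_hz
    ih = 0 if explicit_header else 1
    de = 1 if low_data_rate_opt else 0
    numerator = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * ih
    denominator = 4 * (sf - 2 * de)
    payload_symbols = 8 + max(math.ceil(numerator / denominator) * (coding_rate + 4), 0)
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym


# Compare the airtime spent per byte for a short and a maximum-size frame.
for size in (50, 250):
    toa = lora_time_on_air_s(size)
    print(f"{size:3d} bytes: ToA = {toa * 1e3:.3f} ms, {toa / size * 1e3:.3f} ms per byte")
```

Because the preamble and header symbols are paid once per frame, the airtime per byte decreases as the payload grows, which is the effect exploited by enforcing the maximum frame size.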

LoRaSync design
Building a slotted access requires signaling messages to be exchanged in the network to share a common time reference among all devices. To do so, LoRaSync exploits the LoRaWAN Class B beaconing mechanism. Therefore, this scheme can run on typical LoRa hardware with a few adjustments to the firmware (cf. Section 6.2). A synchronization event is defined as a tuple $(ts_{bcn}, cnt_{bcn})$, where $ts_{bcn}$ is the beacon's UTC timestamp and $cnt_{bcn}$ is the device's internal counter value at the beginning of the beacon reception. Such an event is saved by the device at each beacon reception. UTC timestamps are defined as the total number of microseconds elapsed since the Unix epoch. Device counters are incremented every microsecond as well, therefore counter values can be added to and subtracted from timestamps. Hence, the timestamp $ts$ associated with any counter value $cnt$ may be computed with the following equation:

$$ts = ts_{bcn} + (cnt - cnt_{bcn}) \quad (2)$$

The precision of this estimator depends on both the internal clock quality and the time elapsed since the last synchronization event. Indeed, denoting the device's worst-case clock drift by $d$ (expressed in parts per million, ppm) and the noise margin by $\varepsilon$, the worst-case timing error $err$ may be computed for any counter value $cnt$ with the following equation:

$$err = d \cdot (cnt - cnt_{bcn}) + \varepsilon \quad (3)$$

This worst-case error is notably used to enlarge the width of the reception slot each time a device needs to receive a beacon. This feature allows the radio to be in a listening state when the beacon is sent, regardless of its current clock misalignment.
LoRaSync implements Class S, 14 meaning that the beacon window is divided into timeslots dedicated to uplink transmissions. As a means to preserve bidirectional communications, Class S transmissions are followed by the RX1 and RX2 reception slots just like Class A ones. In order to adapt Class S to real hardware constraints, a maximum clock offset threshold $\delta_{max}$ is defined. The slot size computation thus accounts for $\delta_{max}$ and the maximum frame ToA allowed:

$$slot_{size} = ToA_{max} + 2 \cdot \delta_{max} \quad (4)$$

With such a slot size, a device cannot trigger a transmission that overlaps two different slots as long as the current clock skew of this device is smaller than $\delta_{max}$. The associated slotframe layout within the BEACON_PERIOD is pictured in Figure 4. It is divided into three intervals: BEACON_RESERVED, which is dedicated to the beacon reception, BEACON_WINDOW, in which slots are laid out, and BEACON_GUARD, onto which the last slot overlaps. These intervals are identical to the ones used in LoRaWAN Class B. According to the scheme pictured so far, the number of slots in the slotframe is:

A beacon-skipping mechanism has been implemented in order to improve the energy efficiency of the protocol. In particular, $n_{skip}$ is defined as the maximum number of beacons that can be skipped by a device before it needs to get synchronized again. Given the worst-case drift coefficient $d$, the noise margin $\varepsilon$ and the maximum offset allowed $\delta_{max}$, $n_{skip}$ is equal to the biggest $k \in \mathbb{N}$ that fits the following inequality:

This means that increasing the maximum offset allowed (and thus the slot size) reduces the frequency of synchronization beacon receptions, eventually decreasing the device power consumption induced by beacon receptions.
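The timestamp estimator and the beacon-skipping rule can be sketched as follows. This is a minimal illustration, not the firmware described in Section 6: all names are hypothetical, the 128 s beacon period is taken from LoRaWAN Class B, the 30 ms maximum offset is an arbitrary example value, and the skipping rule rests on one assumed reading of the design, namely that a device skipping k beacons stays unsynchronized for (k + 1) beacon periods and that its worst-case error over that time must not exceed the maximum allowed offset.

```python
from dataclasses import dataclass

BEACON_PERIOD_US = 128_000_000  # Class B beacon period (128 s), in microseconds


@dataclass(frozen=True)
class SyncEvent:
    ts_bcn: int   # beacon UTC timestamp, microseconds since the Unix epoch
    cnt_bcn: int  # device counter at the start of the beacon reception, microseconds


def estimate_timestamp(cnt: int, sync: SyncEvent) -> int:
    """UTC timestamp estimated for a counter value, from the last synchronization event."""
    return sync.ts_bcn + (cnt - sync.cnt_bcn)


def worst_case_error_us(cnt: int, sync: SyncEvent, d_ppm: float, noise_us: float) -> float:
    """Worst-case timing error: linear drift since the last beacon plus the noise margin."""
    return d_ppm * 1e-6 * (cnt - sync.cnt_bcn) + noise_us


def max_skippable_beacons(d_ppm: float, noise_us: float, max_offset_us: float,
                          beacon_period_us: int = BEACON_PERIOD_US) -> int:
    """Largest k such that the worst-case error after (k + 1) beacon periods
    stays within max_offset_us (ASSUMED reading of the skipping rule)."""
    if d_ppm <= 0:
        raise ValueError("the worst-case drift coefficient must be positive")

    def error_after(periods: int) -> float:
        return d_ppm * 1e-6 * periods * beacon_period_us + noise_us

    if error_after(1) > max_offset_us:
        raise ValueError("max_offset_us is too small to listen to every beacon")
    k = 0
    while error_after(k + 2) <= max_offset_us:
        k += 1
    return k


sync = SyncEvent(ts_bcn=1_700_000_000_000_000, cnt_bcn=5_000_000)
print(estimate_timestamp(6_500_000, sync))
print(max_skippable_beacons(d_ppm=20.0, noise_us=11_000.0, max_offset_us=30_000.0))
```

Under these assumptions, a larger maximum offset directly raises the number of beacons that may be skipped, which mirrors the trade-off between slot size and beacon reception energy discussed in the text.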

MODELING AND OPTIMIZATION
This section presents the throughput, power consumption and energy efficiency models used to evaluate the network performance. For the sake of comparison, both Class A (i.e., pure ALOHA) and a slotted ALOHA scheme running on top of LoRaSync are assessed. It has been shown that bidirectional traffic weighs heavily on the throughput of LoRa networks.35 For this reason, transmissions do not require acknowledgments in the scope of this article. It is also well known that the capture effect occurs in LoRa deployments and has a strong impact on the network throughput.4,42 Including this effect in the throughput model is not a trivial task. We therefore tackle this problem in a separate contribution,15 in which more advanced models are introduced and validated with real performance measurements performed on our testbed. In this article, we therefore assume that overlapping frames are necessarily lost.

Let us first recall the usual ALOHA throughput model with a finite number of devices $n$, as presented in Reference 43. The number of frames generated by any device during a duration $t$ is modeled by a Poisson distribution of parameter $\lambda$. Besides, we make the hypothesis that all frames in the network have the same ToA (cf. Section 4.3) and do not require an acknowledgment. Additionally, pure ALOHA devices do not use any kind of buffering, and slotted ones use a buffer of size 1 that simply allows them to wait for the start of the next slot. Taking this ToA as the elementary time unit of our modeling, we then define the probability $p$ that a pure ALOHA device generates at least one frame during such a time unit. Since a buffer of size 1 is assumed, terminals ignore new packet arrivals while they are busy trying to transmit a packet. Therefore, $p$ follows the cumulative distribution function of the exponential law:44

$$p = 1 - e^{-\lambda}$$

The probability that exactly $k$ among $n$ devices generate a frame during a time unit can then be derived with the binomial coefficient:

$$P(k) = \binom{n}{k}\, p^{k}\, (1 - p)^{n-k}$$

For pure ALOHA, the average throughput $T_p$ (in erlangs) is equal to the probability that exactly one device generates a frame during a time unit, and that none of the other $(n - 1)$ devices do during the previous one:

$$T_p = n\,p\,(1 - p)^{n-1} \cdot (1 - p)^{n-1} = n\,p\,(1 - p)^{2(n-1)}$$

In order to adapt this modeling to LoRaSync, we introduce $q$ as the probability that a device generates a frame during the duration of a slot. We once again leverage the exponential law, this time over the slot duration. Besides, the $k_s$ coefficient is introduced as a means to represent the actual transmission time available per slotframe. Using once again the binomial coefficient to compute the probability that exactly one device generates a frame during the duration of a slot, the average slotted ALOHA throughput $T_s$ (in erlangs) is derived from this probability and from $k_s$.

In order to model the energy efficiency of the devices, their power consumption ($P$) must first be computed. This is done by considering its value in the transmission, reception, and sleep states separately. We note $P_{TX}$, $P_{RX}$, and $P_{SLEEP}$ the power consumption of a typical SX1276 LoRa transceiver in the transmission, reception and sleep states respectively, when supplied with a 3.3 V battery voltage.45 Each value is computed from the transceiver's supply current in the considered state, respectively 20, 10.8, and $0.2 \times 10^{-3}$ mA. Our modeling accounts for the LoRaWAN 30 ms reception slots following the uplink transmissions (i.e., RX1 and RX2). For this purpose, we introduce a slot listening rate, and the overall network power consumption is obtained by weighting the power of each state by the share of time spent in it.

For the Class S model, we additionally need to consider the time spent in the receiving state for beacon reception purposes. We define the device beacon reception period $T_{bcn}$ as:

$$T_{bcn} = (n_{skip} + 1) \cdot 128\ \text{s}$$

The beacon ToA is noted $ToA_{bcn}$. With LoRaSync, the beacon reception window is enlarged to cope with the worst-case drift that can occur during $T_{bcn}$ with a drift coefficient $d$ and a noise margin $\varepsilon$. Indeed, the device transmission may be shifted by an offset ranging from $-d \cdot T_{bcn} - \varepsilon$ to $d \cdot T_{bcn} + \varepsilon$. At the end of the beacon reception, the device switches its radio to sleep mode, so that the elapsed listening time may range from $ToA_{bcn}$ to $ToA_{bcn} + 2 \cdot (d \cdot T_{bcn} + \varepsilon)$ depending on the transmission offset. We assume that the drift coefficient of all devices is uniformly distributed in the interval $\pm d$, therefore the average time spent in listening mode in addition to $ToA_{bcn}$ is $d \cdot T_{bcn} + \varepsilon$. The beacon listening rate is derived from this average listening time and from $T_{bcn}$, and is added to the power consumption model of the slotted access.

The energy efficiency ($E$) is expressed in bytes per joule and can be obtained by dividing the throughput by the power consumption. The throughput must however be expressed in bytes per second first, which is achieved by multiplying it by the term $bytes_{pkt} / ToA_{pkt}$. The energy efficiency models for pure and slotted ALOHA, respectively $E_p$ and $E_s$, are obtained in this way from $T_p$ and $T_s$ and the corresponding power consumption.
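As a numerical companion to the models described above, the following Python sketch evaluates the finite-population throughput expressions and the bytes-per-joule conversion. It is only an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the per-device rate is assumed to be expressed in frames per ToA, and the slotted variant takes the slot length (in ToA units) and the k_s coefficient as caller-supplied parameters because their exact definitions are not reproduced in this section.

```python
import math

# State powers from the supply currents quoted in the text (3.3 V supply).
SUPPLY_V = 3.3
P_TX = SUPPLY_V * 20e-3       # 20 mA in transmission, in watts
P_RX = SUPPLY_V * 10.8e-3     # 10.8 mA in reception, in watts
P_SLEEP = SUPPLY_V * 0.2e-6   # 0.2e-3 mA in sleep, in watts


def pure_aloha_throughput(n, rate):
    """Pure ALOHA throughput in erlangs for n devices.

    rate is the per-device generation rate in frames per ToA (assumption).
    One device transmits in the current time unit and the other n - 1
    devices stay silent during the current and the previous time unit.
    """
    p = 1.0 - math.exp(-rate)
    return n * p * (1.0 - p) ** (2 * (n - 1))


def slotted_aloha_throughput(n, rate, slot_in_toa=1.0, k_s=1.0):
    """Slotted ALOHA throughput in erlangs for n devices.

    slot_in_toa and k_s are hypothetical parameters standing in for the
    slot duration and the usable share of the slotframe.
    """
    q = 1.0 - math.exp(-rate * slot_in_toa)
    return k_s * n * q * (1.0 - q) ** (n - 1)


def energy_efficiency(throughput, power_w, payload_bytes=250, toa_s=0.389376):
    """Energy efficiency in bytes per joule: throughput converted to
    bytes per second, divided by the power consumption in watts."""
    return throughput * payload_bytes / toa_s / power_w


if __name__ == "__main__":
    n = 2000
    for load in (0.25, 0.5, 1.0):
        rate = load / n  # network load shared evenly between devices
        print(load, pure_aloha_throughput(n, rate),
              slotted_aloha_throughput(n, rate))
```

With margins and beacon overhead ignored (slot_in_toa = 1 and k_s = 1), the sketch reproduces the classical maxima of roughly 18% for pure ALOHA and 37% for slotted ALOHA, which is the "nearly double" behavior discussed in the validation.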

Model validation and analysis
LoRaSync targets large-scale deployments that experience scalability issues with the legacy Class A. Given the difficulty of reproducing such a large-scale scenario on a research testbed, the models presented above have been validated using the LoRaWAN-Sim simulation environment to instantiate a network of 2000 devices. This analysis is completed by a proof-of-concept implementation on a few devices (cf. Section 6) to demonstrate the feasibility of our approach. The LoRaWAN Class A and a slotted ALOHA access over LoRaSync have been replicated in the simulator. Additionally, LoRaWAN-Sim integrates a linear clock drift feature that allows us to replicate the LoRaSync clock correction. The SF has been set to 7 for the reasons stated in Section 4.2. All transmissions occur on a single channel, and are parameterized with a coding rate of 4/5 with explicit header and cyclic redundancy check enabled. As specified in Section 4.3, all frames carry the maximum payload size allowed by the LoRaWAN specification (i.e., 250 bytes) to minimize the amount of overhead transmitted by the devices, which results in a ToA of 389.376 ms. LoRaSync was set up with a worst-case drift coefficient d = 20 ppm, which is a typical value for the low-quality crystals found in such cheap hardware. Throughout this analysis, error bars represent 99% confidence intervals along the x and y axes computed with Student's t law. In order to facilitate the reproduction of our experiments, detailed simulation parameters are provided in Table 3.
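The ToA quoted above can be cross-checked with the time-on-air formula from the Semtech SX1276 datasheet. The sketch below is a minimal illustration: the function name is hypothetical, and the 125 kHz bandwidth and 8-symbol preamble are assumptions that are not stated in this section (they are common LoRaWAN defaults). Under these assumptions, a 250-byte PHY payload at SF7 with coding rate 4/5, explicit header and CRC enabled yields the 389.376 ms figure.

```python
import math


def lora_time_on_air_ms(payload_bytes, sf=7, bandwidth_hz=125_000,
                        coding_rate=1, preamble_symbols=8,
                        explicit_header=True, crc=True,
                        low_data_rate_optimize=False):
    """Time on air of a LoRa frame in milliseconds (SX1276 datasheet formula).

    coding_rate is 1 for 4/5, 2 for 4/6, 3 for 4/7 and 4 for 4/8.
    bandwidth_hz and preamble_symbols are assumed defaults.
    """
    t_sym_ms = (2 ** sf) / bandwidth_hz * 1000.0
    ih = 0 if explicit_header else 1          # implicit-header flag
    de = 1 if low_data_rate_optimize else 0   # low data rate optimization
    numerator = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * ih
    payload_symbols = 8 + max(
        math.ceil(numerator / (4 * (sf - 2 * de))) * (coding_rate + 4), 0)
    # The preamble lasts its programmed length plus 4.25 symbols.
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym_ms


if __name__ == "__main__":
    print(lora_time_on_air_ms(250))
```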
The simulated throughput for pure and slotted ALOHA is compared to the models in Figure 5, and the energy efficiency equivalent is displayed in Figure 6. In all cases, the model curves match the simulation results, showing the consistency between the modeling and the protocol implementation on LoRaWAN-Sim. In each plot, Class A is compared to LoRaSync for several slot sizes. This comparison notably shows that the models faithfully capture the impact of the transmission margin size on the overall performance, which will be relevant for the analysis of Section 5.2. Specifically, the chosen slot sizes correspond to $\delta_{\max}$ values of 2.56, 12.8, 28.16, and 53.76 ms. Since the worst-case drift coefficient d has been fixed to 20 ppm and the simulator drift follows a linear trend without noise, the $n_{skip}$ values can be easily deduced. Indeed, any device is subject to a drift of at most $20 \times 10^{-6} \cdot 128\,000\ \text{ms} = 2.56$ ms over the duration of a single beacon window. Therefore, we have in this case $n_{skip} = \lfloor \delta_{\max} / 2.56 \rfloor - 1$. Interestingly, $\delta_{\max}$ values lower than 2.56 ms cannot be allowed, because this is the drift experienced over the smallest possible interval between two beacon receptions. The selected $\delta_{\max}$ values then result in $n_{skip}$ values of 0, 4, 10, and 20.
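The mapping between the margin and the number of skipped beacons described above can be sketched as follows. This is an illustration only: the function and constant names are hypothetical, integer microseconds are used to avoid floating-point rounding in the floor operation, and the optional noise margin argument is an assumed generalization (the simulator drift is noise-free, so it defaults to zero).

```python
BEACON_PERIOD_S = 128  # beacon window duration used in the text


def n_skip(margin_us, drift_ppm=20, noise_margin_us=0):
    """Number of beacons a device may skip for a given margin.

    margin_us is the maximum drift allowed, in microseconds. A drift of
    1 ppm accumulates 1 microsecond per second, so the worst-case drift
    over one beacon window is drift_ppm * BEACON_PERIOD_S microseconds.
    """
    drift_per_beacon_us = drift_ppm * BEACON_PERIOD_S
    usable_us = margin_us - noise_margin_us
    if usable_us < drift_per_beacon_us:
        raise ValueError("margin smaller than the drift over one beacon window")
    return usable_us // drift_per_beacon_us - 1


def beacon_reception_period_s(skipped):
    """Time between two beacon receptions when `skipped` beacons are skipped."""
    return (skipped + 1) * BEACON_PERIOD_S


if __name__ == "__main__":
    for margin_us in (2560, 12800, 28160, 53760):
        k = n_skip(margin_us)
        print(margin_us / 1000, "ms ->", k, "skipped,",
              beacon_reception_period_s(k), "s between receptions")
```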
Focusing on the observed performance, Figure 5 also confirms the correct operation of the LoRaSync synchronization, since the slotted access nearly doubles the maximum achievable throughput, as expected. Regarding energy efficiency, we find the result already presented in Reference 46, which is that Class A performs better for low generation rates and the slotted access becomes preferable when the traffic rate is higher than the threshold $\lambda_{p/s1}$. With this network configuration, and if we consider 53.76 ms to be the upper bound of $\delta_{\max}$, $\lambda_{p/s1}$ is equal to 0.34 erlangs. This value strongly depends on the hypothesis that devices exhibit a 20 ppm drift, which was checked with our devices as we saw in Figure 3. Using devices of higher quality will naturally increase the slotted ALOHA efficiency and move the crossing point towards a lower rate. More interestingly, Figure 6 shows that the value of $\delta_{\max}$, and therefore the slot size, has a strong impact on the network energy efficiency. Indeed, for a generated traffic range of 0.34 to 0.6 erlangs the maximum energy efficiency is attained when $\delta_{\max} = 53.76$ ms ($n_{skip} = 20$). Then the $\delta_{\max} = 28.16$ ms ($n_{skip} = 10$) curve is the best until 1.2 erlangs, and finally $\delta_{\max} = 12.8$ ms ($n_{skip} = 4$) prevails above that point. These intervals are represented by the $\lambda_{s1/s2}$ and $\lambda_{s2/s3}$ thresholds in Figure 6. We remark that the curve associated with $\delta_{\max} = 2.56$ ms ($n_{skip} = 0$) is never optimal in terms of energy efficiency. This is consistent with the preliminary results presented in Reference 14, which already showed that the beacon-skipping mechanism is essential when trying to optimize energy efficiency. We also notice that the optimal number of skipped beacons decreases as the traffic load increases. In order to further explore this finding, the next part sheds some light on the relationship between slot size and energy efficiency.

Slot size optimization
This part evaluates the impact of the LoRaSync slot size on the network performance. Indeed, the slot margin length $\delta_{\max}$ has an effect on the maximum achievable throughput, but also on the synchronization periodicity, which in turn alters the device power consumption. This makes it a relevant parameter to consider when optimizing energy efficiency. In order to represent a realistic scenario, the models presented in the previous section have been used to analyze the same large-scale deployment of 2000 devices.
From Equation (4) we know that the slot size depends on $\delta_{\max}$, which represents the maximum device drift allowed. Increasing $\delta_{\max}$ results in larger slots, ultimately decreasing the number of slots per slotframe and the overall throughput. On the other hand, the bigger $\delta_{\max}$, the more devices are allowed to drift. This means that they are able to skip more beacons. Each additional beacon skipped reduces the overall device power consumption. Energy efficiency, defined as the ratio between the throughput and the power consumption, thus strongly depends on the slot size. It has been plotted as a function of the slot size and generated traffic in Figure 7.
At each rate, the slot size associated with the maximum energy efficiency is represented by the black dashed line. The line is only traced for generation rates higher than the $\lambda_{p/s1}$ threshold introduced in Figure 6. Indeed, the asynchronous pure ALOHA access should be preferred in terms of energy efficiency for rates lower than this threshold, so it is pointless to consider the slot size there. This line shows that the optimal slot size decreases as the network load increases. With the same maximum ToA, smaller slots mean that devices are allowed to drift by a smaller amount. As a result, a shorter synchronization period is required, which implies consuming more power in beacon receptions. Such short margins are therefore only suitable for high transmission rates. Conversely, wide slots make it possible to space out synchronization events, but reduce the number of slots available per slotframe. This saves power at the cost of a reduced maximum achievable throughput, which is not a problem when the traffic load is low. In a nutshell, the more traffic there is, the more the margins should be reduced.
All in all, this result shows that the slot size should ideally be adapted to the traffic load in order to maximize energy efficiency. We presented in Reference 46 the TREMA mechanism to dynamically switch between pure and slotted ALOHA depending on the probed traffic conditions. Interestingly, this work could easily be extended to dynamically adapt the slot size to the generated load in order to further optimize the energy efficiency of the network.

PROOF OF CONCEPT ON REAL HARDWARE
In order to complement the previous theoretical analysis, LoRaSync was implemented on a small-scale research testbed. This proof of concept demonstrates that our mechanism is capable of bounding the synchronization error despite real hardware constraints. We additionally provide hands-on insights on how to handle the hardware and software challenges faced when synchronizing low-cost LoRa devices. Finally, we provide a performance evaluation of the pure and slotted ALOHA access schemes implemented on the testbed with 10 devices, which reveals a discrepancy between the models and reality. In that, the hardware implementation allows us to demonstrate the impact of the capture effect on the channel throughput.

Architectural challenges
A LoRa testbed was set up according to the typical LPWAN architecture, that is, a server, a gateway and devices, as depicted in Figure 8. The server side mimics the ChirpStack architecture, in which an MQTT broker centralizes the communications of all gateways. The ChirpStack gateway bridge translates the MQTT streams into UDP datagrams understandable by the gateways. Finally, a custom LoRaSync server was developed to control and monitor the network with MQTT messages. The gateway is composed of a Raspberry Pi 3 running the Semtech Packet Forwarder software, used in conjunction with an IMST IC880A LoRa concentrator. The testbed devices are LoPy 4 chips developed by Pycom, offering quick prototyping capabilities for a relatively low cost.
By default, LoRa gateways are only capable of triggering downlink transmissions (i) immediately, or (ii) after a short delay following an uplink transmission (Class A RX windows). In order to allow timestamp-based downlink transmissions, an Adafruit Ultimate GPS chip was connected to the Raspberry Pi through a serial peripheral interface. It provides time and location information to the gateway program, and also sends a direct pulse per second signal to the concentrator, enabling the triggering of transmissions with a 10 ns precision.47 This GPS chip requires a direct line of sight with the sky; as a result, our gateway has been set up outdoors. This module ultimately allows the periodic broadcasting of the LoRaSync synchronization beacons.
FIGURE 9 lock_wait_execute flowchart.

Software challenges
This section details the firmware adjustments that had to be made to implement LoRaSync on LoPy 4 chips. It must be pointed out that this hardware does not support Class B, and that a large portion of its behavior first had to be reproduced. The main challenge faced when programming LoRaSync on such devices was to schedule event executions with absolute timestamps, even though the embedded time library only allows callbacks after relative delays. To achieve this goal, the device logic was first split into four threads, each assigned to a specific task: (i) message generation, (ii) frame reception, (iii) handling of received messages, and (iv) frame transmission. In this way, events requiring a precise timing such as frame transmissions and receptions may lock the CPU when needed, while the other tasks may proceed freely during non-critical time. Additionally, a software overlay was developed in order to handle critical method executions. Once synchronized, a node is able to estimate its current timestamp with its internal counter according to Equation (2). When a timestamp event is scheduled, relative callbacks can then be used to trigger a call to the lock_wait_execute method, capable of handling time-critical events. This method is in charge of locking the hardware resources for the current thread, and waiting until the internal counter falls within a precise confidence interval around the target timestamp (cf. Figure 9). It then triggers the event execution with a ∼100 µs precision. In the case where resources could not be locked in time for the critical method call, a backup event passed as a parameter is executed in order to correct the failure.
Once the device is synchronized, this triggering process makes it possible to track beacons by turning the radio on just before the next broadcast. It also enables timestamp-based uplink transmissions, which ultimately makes it possible to implement the Class S timeslots.
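The behavior described for lock_wait_execute can be illustrated with the following Python sketch. It is not the firmware running on the LoPy 4 and does not reproduce the flowchart of Figure 9: the signature, the `now` clock argument and the `tolerance` parameter are assumptions, and the standard CPython `threading` and `time` modules stand in for their MicroPython counterparts. The sketch locks a shared resource, busy-waits until the clock enters a window around the target timestamp, then runs the event, and falls back to the backup event if the lock cannot be obtained in time or the window has been missed.

```python
import threading
import time


def lock_wait_execute(lock, target, event, backup,
                      now=time.monotonic, tolerance=100e-6):
    """Run `event` as close as possible to the absolute timestamp `target`.

    lock      : threading.Lock protecting the time-critical resources
    target    : absolute timestamp on the clock returned by `now()`
    event     : callable executed inside the confidence window
    backup    : callable executed if the deadline cannot be honored
    tolerance : half-width of the confidence window, in seconds (assumed)
    Returns True if `event` was executed, False if `backup` was.
    """
    remaining = target - now()
    # Try to lock the resources before the target timestamp is reached.
    acquired = lock.acquire(timeout=remaining) if remaining > 0 else False
    if not acquired:
        backup()
        return False
    try:
        # Busy-wait until the clock enters the confidence window.
        while now() < target - tolerance:
            pass
        if now() > target + tolerance:
            backup()  # the window was missed while waiting
            return False
        event()
        return True
    finally:
        lock.release()


if __name__ == "__main__":
    radio_lock = threading.Lock()
    deadline = time.monotonic() + 0.05
    lock_wait_execute(radio_lock, deadline,
                      event=lambda: print("transmit frame"),
                      backup=lambda: print("deadline missed"))
```

A busy-wait is used here because relative callbacks alone are too coarse to hit a sub-millisecond window; the callback only needs to fire early enough for the lock to be taken before the target timestamp.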

Clock correction demonstration
In order to demonstrate the correct operation of LoRaSync, the clock drift of a periodically synchronized device is plotted in Figure 10. In order to facilitate the experiments, the node has been placed indoors, and is not in direct line of sight with our outdoor gateway (cf. Section 6.1). This notably ensures that the temperature is relatively constant for the duration of the experiment. This single device is set up with $\delta_{\max} = 39.16$ ms, a worst-case drift coefficient d = 20 ppm and a noise margin $\varepsilon = 11$ ms. This setting results in $n_{skip} = 10$, meaning that a beacon is received every 11 × 128 = 1408 s. Beacon receptions are represented on the graph by black vertical dashed lines.
Here it is clear that the device never crosses the maximum drift allowed line $\delta_{\max} = 39.16$ ms. The margin here appears particularly wide because the actual drift coefficient of the chosen device is d = −1.5 ppm, which is very small compared to the worst-case estimation of 20 ppm. The drift model interval has been plotted over the actual drift in order to highlight the synchronization events. It shows that in this particular case, the noise plays a bigger role in clock errors than the linear component of the drift.
All in all, this proof of concept shows that the synchronization error is kept below the worst-case drift threshold $\delta_{\max}$. LoRaSync therefore operates as expected in a realistic context.
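The parameters of this experiment can be checked with a short worked computation. Writing $\delta_{\max}$ for the maximum drift allowed and $\varepsilon$ for the noise margin (notation assumed here), and assuming that the margin is the sum of the worst-case linear drift over the beacon reception period and the noise margin, the values quoted above are mutually consistent:

```latex
\begin{align*}
T_{bcn} &= (n_{skip} + 1) \cdot 128\ \text{s} = 11 \cdot 128\ \text{s} = 1408\ \text{s} \\
d \cdot T_{bcn} + \varepsilon &= 20 \times 10^{-6} \cdot 1408\ \text{s} + 11\ \text{ms}
  = 28.16\ \text{ms} + 11\ \text{ms} = 39.16\ \text{ms} = \delta_{\max} \\
|d_{measured}| \cdot T_{bcn} &= 1.5 \times 10^{-6} \cdot 1408\ \text{s} \approx 2.11\ \text{ms}
\end{align*}
```

The last line uses the measured coefficient of −1.5 ppm: the linear drift accumulated between two beacon receptions is about 2.11 ms, well below the 11 ms noise margin, which is in line with the observation that noise dominates the clock error for this device.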

Impact of the capture effect on the testbed performances
In addition to this proof of concept with one device, we have evaluated the performance of the testbed under pure and slotted ALOHA access schemes for various generation rates with 10 devices. All devices are set up indoors at the same location. For the slotted ALOHA scheme, we use LoRaSync parameterized with $\delta_{\max} = 39.16$ ms, which results in the same parameters as in Section 6.3. For each generation rate the network runs for 30 min, and the throughput is sampled every minute. Error bars represent 95% confidence intervals computed with Student's t law. Results are plotted in Figure 11, along with the ALOHA models presented in Section 5 for the sake of comparison. Here, the models clearly underestimate the real-world throughput. As a result, we have thoroughly investigated this finding in a separate contribution,15 which presents experimental models able to accurately picture the testbed throughput. One of the topics explored in this complementary study is that the impact of the capture effect on the network heavily depends on the deployment layout.
Even though we were able to find a model that applies to our experimental setup, further studies are needed to generalize this result. Regardless, the increased performance experienced with the slotted access proves the viability of the synchronization approach in a real deployment.

CONCLUSION AND PERSPECTIVES
In this article, we introduced LoRaSync, a synchronization scheme designed for LoRa networks that aims at maximizing the device energy efficiency. Synchronization was used to set up a slotted ALOHA access providing capacity improvements compared to the legacy pure ALOHA scheme. LoRaSync slots may however be leveraged to implement any slotted MAC protocol. Synchronization is achieved through beacon broadcasts, and is therefore scalable to any network size.
The LoRaSync slots contain margins to cope with the worst-case device clock drift, ensuring a reliable operation even on low-cost hardware. A beacon skipping mechanism is implemented so that devices only realign their internal clock when their current drift is likely to exceed the margin length. This way, the energy devices spend listening to incoming beacon broadcasts is minimized. Throughput, power consumption and energy efficiency models were established to assess the performance of the slotted ALOHA scheme built on top of LoRaSync. This access scheme was compared to the asynchronous pure ALOHA access, and the LoRaWAN-Sim simulator was used to verify the consistency of the results. Such modeling ultimately makes it possible to find the most efficient slot size for any traffic load. Results show that the more traffic is generated, the smaller the margins should be in order to maximize energy efficiency at all times. LoRaSync was implemented on an experimental testbed, showing its capability to successfully correct the timing errors and keep the clock drift below the maximum threshold allowed even on a real hardware setup. This proof of concept also reveals the impact of the capture effect in a realistic setup, and this phenomenon has been studied in depth in a separate contribution.15
All in all, LoRaSync provides a robust and energy-efficient basis for the development of slotted schemes over LoRaWAN. It must be pointed out that the simple slotted ALOHA scheme used in this article to demonstrate the viability of the mechanism is not flawless. For instance, severe competition occurs during the first timeslot due to traffic accumulating during the beacon reserved and guard intervals. In that, more advanced schemes such as scheduling techniques should be developed, and LoRaSync has been designed as a robust cornerstone able to support these future improvements. Besides, the TREMA switching mechanism, which selects synchronous or asynchronous access schemes depending on the traffic load, should be implemented and improved to also adapt the slot size. Finally, all these contributions should be extended to operate in multi-gateway scenarios as well.

FIGURE 2 Access schemes defined by the different LoRaWAN classes. (A) Class A. (B) Class B. (C) Class C.

FIGURE 5 Throughput model and simulation results.

FIGURE 6 Energy efficiency model and simulation results.

FIGURE 7 Energy efficiency as a function of the slot size (2000 devices).

FIGURE 8 Experimental testbed infrastructure.

FIGURE 10 Device drift measurements.

FIGURE 11 Experimental pure and slotted ALOHA throughput.

TABLE 3 Simulation parameters.