Delayed mobile data offloading scheme for quality of service traffic: Design and analysis

As the world is entering the era of 5G, we are witnessing exponential growth in the volume of data traffic generated and consumed by mobile devices. Mobile data offloading handles the surge in mobile data traffic by offloading part of the traffic onto the Wi-Fi network. In this context, we design and analyse a quality of service (QoS) enhancement scheme for delayed mobile data offloading. We first consider prioritised queuing of traffic as a scheme for QoS provisioning. Using the concepts of virtual waiting time, renewal process, and level crossing arguments, we derive the average transmission delay of the offloaded and reneged packets of high priority. Next, we mathematically validate the benefit of balking all such packets whose expected virtual waiting time exceeds their deadline. Using a three-dimensional Markov chain model, we derive the required balking probability. Moreover, we merge the technique of balking with prioritisation for improving QoS in delayed offloading. Through an extensive simulation, we validate our analysis and demonstrate how the proposed scheme reduces the transmission delay without sacrificing offloading efficiency. Our investigation has the potential to be adopted in future mobile data offloading standards for improving QoS.


| INTRODUCTION
In recent years, there has been exponential growth in the use of wireless devices such as smartphones and tablets. The ever-increasing use of the Internet over these handheld devices has created a surge in data traffic. Global mobile data traffic was 4.4 exabytes per month in 2015; it grew at the rate of 63% and reached 7.2 exabytes per month at the end of 2016. Furthermore, it was expected to grow to 49 exabytes per month by 2021, a sevenfold increase over 2016 [1].
To meet the surge in mobile data demand, using Wi-Fi as a supplementary network to offload cellular traffic has been found to be a promising approach. Wi-Fi operates in unlicensed spectrum, which makes it attractive to telecom operators. Dual-mode phones contain a cellular radio, such as Global System for Mobile Communications (GSM), as well as an IEEE 802.11 (Wi-Fi) radio. From the user's perspective, data offloading is a viable and cost-effective way to obtain a higher data rate. Apart from the high data rate, for all transfer sizes, data transfer in WLAN is significantly more power efficient than 3G [2]. The higher data rate reduces the transmission time and hence the transmission power.
Two types of mobile data offloading have been discussed in the literature: on-the-spot and delayed offloading [3,4]. On-the-spot offloading takes place only when the user is within the Wi-Fi coverage area. When the user moves out of the Wi-Fi coverage area, offloading is stopped. In delayed mobile data offloading, offloading occurs whenever Wi-Fi is available. When the user moves out of the Wi-Fi coverage area, the packets are not immediately routed through the cellular network. Transmission of data packets through the cellular network occurs only if the user does not re-enter the Wi-Fi coverage area within a deadline. An efficient mobile data offloading scheme should have a low transmission delay and high offloading efficiency.
Offloading data traffic via Long-Term Evolution wireless local area network (LTE-WLAN) interworking has been specified for 3GPP release 10 [5]. Until that release, LTE-WLAN interworking was supported only in the core network. Because there was no tight integration between LTE and WLAN, offloading was mainly mobile-controlled. However, this helps the operator to reduce congestion in the backbone network. In 3GPP release 12 [6], Radio Access Network (RAN)-assisted offloading was proposed. The focus of 3GPP release 13 [7] is operator-controlled mobile data offloading. LTE-WLAN Aggregation (LWA) and LTE-WLAN Radio Level Integration with IPsec Tunnel (LWIP) [8], as specified in release 13, allow the simultaneous combination of cellular and Wi-Fi technologies to increase user equipment throughput. LWA standardises the aggregation of RAN and WLAN at the Packet Data Convergence Protocol (PDCP) layer, whereas in LWIP, the aggregation is done above the PDCP layer, at the Internet Protocol layer.
The motivation for the proposed work is as follows. Although much standardisation has been done in LTE-WLAN integration, the policy that governs offloading decisions is left to the operator. Therefore, many efficient offloading schemes have been presented in the literature, and the current discussion falls into this category. To the best of our knowledge, prioritisation and the advantage of balking in delayed mobile data offloading have not been fully investigated in the literature.
In the present work we have designed and analysed a quality of service (QoS) enhancement scheme for delayed mobile data offloading. Our scheme is based on (1) prioritising QoS traffic, and (2) an intelligent balking decision, both geared towards reducing packet latency.
The major contributions are as follows:
1. We have developed a preemptive priority queuing model for a mobile data offloading scheme catering to traffic with different QoS requirements.
2. By analysing the stochastic process characterising the virtual waiting time $V_1(t)$ of the packets, we have derived a Volterra integral equation for the probability density function of $V_1(t)$.
3. Using the Laplace-Stieltjes transform technique, we then solved the integral equation and arrived at the average transmission time of the offloaded packets and the reneged packets. Using these results, we motivate the advantage of balking.
4. We derive the optimal balking probability of a packet by setting it equal to the probability that its virtual waiting time exceeds its deadline.
5. We develop a three-dimensional (3D) Markov chain model for delayed mobile data offloading with balking and priority. Solving the model, we derive the offloading efficiency of packets belonging to different priority classes.
6. Through extensive simulation, we validate our analysis and demonstrate how the proposed scheme reduces transmission delay without sacrificing much offloading efficiency.
7. The effect of finite cellular delay on our model is also analysed.
The rest of the article is organised as follows: In Section 2, we present a brief literature survey of analytical work related to delayed mobile data offloading and then highlight the unique contributions of our research. In Section 3, we validate the importance of balking for QoS enhancement in delayed offloading with multiple traffic classes. We study delayed offloading with balking for multiple traffic classes in Section 4.
Results and discussions are presented in Section 5. Conclusions are drawn in Section 6.

| LITERATURE SURVEY
Insight into the potential benefits of mobile data offloading using trace-driven simulation was given by Lee et al. [9]. Traces show that 65% of traffic can be offloaded by incorporating on-the-spot offloading. Further, an additional 29% of traffic can be offloaded for a deadline of one hour. Sok-Ian Sou developed an analytical model to quantify the amount of 3G resources saved by offloading and deadline assurance for measuring the quality of user experience with policy and charging control support [10]. Mobile data offloading for LTE in the unlicensed spectrum has also been proposed [11]. A detailed survey on mobile data offloading can be found in Rebecchi et al. [3] and Zhou et al. [12] and references therein. Literature on the problem of mobile data offloading can be classified based on (1) the system model, (2) system parameters, (3) performance metrics, and (4) the analytical framework adopted, such as game theory [13][14][15], optimisation [16][17][18][19], artificial intelligence/learning-based methods [20,21], and stochastic/queuing theory-based methods [9][23][24][25]. Table 1 provides the classification of some of the literature in the area of mobile data offloading.
Abbreviations (Table 1): APs, Access points; BS, Base station; CSMA/CA, Carrier-sense multiple access with collision avoidance; MU, Mobile user; QoE, Quality of experience; SINR, Signal-to-interference-plus-noise ratio.
We now confine our literature survey to analytical models dealing with mobile data offloading. Analytical models based on queueing theory are prevalent in the literature. Lee et al. [9] developed a queueing model for delayed mobile data offloading along with service interruption and reneging. They derived offloading efficiency using matrix-geometric techniques [22]. Mehmeti et al. analysed delayed mobile data offloading using Z-transforms and derived optimal values of the deadline [23]. Ajith et al. [24] derived offloading efficiency and packet transmission delay for a delayed mobile data offloading scheme using the concept of virtual waiting time.
Yu et al. [4] explained operator-initiated offloading and user-initiated offloading schemes. The deployment of Wi-Fi can be treated as an independent Poisson point process and the user's movement as a semi-Markov process. Effective data offloading based on these processes was modelled in Hu et al. [25]. Xu et al. [26] analysed a delayed offloading scenario with multiple priority classes and derived packet delay and offloading efficiency. Cheung and Huang [16] posed single-user, user-initiated network selection as a finite-horizon sequential decision problem to minimise cost for the case of a delayed Wi-Fi offloading scheme.
Sok-Ian Sou and Yi-Ting Peng showed that multipath Wi-Fi offloading has better resource use compared with opportunistic offloading [27]. Although delayed mobile data offloading has been widely investigated, the literature still lacks the design and performance analysis of an offloading scheme that ensures QoS for different types of traffic. The purpose of this article is to fill this gap.
The unique contributions of our work, and how it differs from existing related literature, are as follows:
1. Mehmeti et al. [23] analysed delayed mobile data offloading, but they do not consider the case of multiple-class traffic. We deal with multiclass traffic and show its benefit for QoS enhancement.
2. Compared with reference [24], we additionally derive the average transmission time of offloaded and reneged packets and provide a mathematical validation of balking. Furthermore, Ajith and Venkatesh [24] deal with only single-class packets with no concept of priority. In the current work, using Markov chain and priority queuing theory, we show how prioritisation along with balking improves QoS.
3. Our proposed idea of balking appropriate packets is equivalent to the optimal solution derived for the optimisation problem in Cheung and Huang [16].
4. Our work differs from that of Xu et al. [26] because we consider multiple-class traffic in a delayed offloading scenario and also show that balking can further improve QoS.
5. Unlike previous work in the literature, we derive the transmission delay of both offloaded and reneged packets. For the first time, the concept of a virtual waiting time and the level crossing technique are used to analyse the formation of a queue in mobile data offloading.

| DELAYED OFFLOADING
In this section, we consider prioritising the traffic of a user as a technique to improve QoS in delayed offloading. We first derive and critically analyse the factors contributing to the average transmission delay of high-priority packets. This analysis should help us arrive at a QoS enhancement scheme that reduces transmission delay.

| System model
We consider a mobile node that roams randomly and therefore enters and leaves zones with Wi-Fi coverage randomly. We consider that the network traffic can be categorised into two classes based on QoS requirements. The packet generation processes of class 1 and class 2 traffic are modelled as Poisson processes with rates $\lambda_1$ and $\lambda_2$, respectively. Class 1 is given preemptive priority over class 2. We assume packet size to be exponentially distributed with mean $\kappa$ for both classes. Each packet has a deadline that is exponentially distributed with parameter $\gamma_1$ and $\gamma_2$ for class 1 and class 2 packets, respectively. All packets wait in the queue for transmission over WLAN. If transmission has not started before the deadline expires, the packet reneges and is transmitted over the cellular network. Furthermore, we assume there is no queuing delay for transmission over the cellular network, as is usually assumed in the literature [3]. The effect of a finite cellular delay is examined in Subsection 5.2. Let the Wi-Fi data rate be denoted as $R_W$. The availability of the Wi-Fi network is modelled as an ON-OFF alternating renewal process with ON and OFF periods being exponentially distributed with parameters $\alpha$ and $\beta$, respectively [28].
The whole system can be modelled as a queue with the following characteristics: (1) preemptive priority for class 1 over class 2, (2) packets renege on deadline expiration, (3) server breakdown is governed by the ON-OFF process, and (4) the server serves both class 1 and class 2 at rate $\mu = R_W/\kappa$ when Wi-Fi is ON. Transmission delay and offloading efficiency are the metrics of interest in delayed offloading. Transmission delay ($T$) is defined as the average queueing delay before packet transmission begins via either the cellular network or WLAN. Offloading efficiency ($\eta$) is the fraction of packets that are transmitted over WLAN.
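The two derived quantities of the system model, the service rate and the long-run Wi-Fi availability, can be computed as in the minimal sketch below. The 2 Mbps rate and 2.5 MB packet size are the values used in the simulations reported later; the ON and OFF means (300 s and 100 s) and the reading of 1 MB as 10^6 bytes are assumptions made only for illustration.

```python
def service_rate(wifi_rate_bps: float, mean_packet_bits: float) -> float:
    """Packets per second served while Wi-Fi is ON: mu = R_W / kappa."""
    return wifi_rate_bps / mean_packet_bits


def availability_ratio(alpha: float, beta: float) -> float:
    """Long-run fraction of time Wi-Fi is ON.

    ON periods are Exp(alpha) with mean 1/alpha and OFF periods are
    Exp(beta) with mean 1/beta, so the ratio is beta / (alpha + beta).
    """
    mean_on, mean_off = 1.0 / alpha, 1.0 / beta
    return mean_on / (mean_on + mean_off)


# 2 Mbps link and 2.5 MB mean packet size (1 MB taken as 1e6 bytes, an assumption).
mu = service_rate(2e6, 2.5e6 * 8)
# Hypothetical ON/OFF means of 300 s and 100 s, chosen to give AR = 0.75.
ar = availability_ratio(alpha=1 / 300, beta=1 / 100)
print(f"mu = {mu:.3f} packets/s, availability ratio = {ar:.2f}")
```

With these numbers a packet needs 10 s of uninterrupted Wi-Fi service on average, which is why deadlines of tens of seconds lead to noticeable reneging.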

| Derivation of transmission delay
In this section, we analyse the stochastic process characterising the virtual waiting time of class 1 packets using the level crossing method [29]. The analysis will help us to derive formulas for (1) the average transmission delay of offloaded class 1 packets, $T_{S_1}$, and (2) the average transmission delay of reneged class 1 packets, $T_{R_1}$.
Virtual waiting time [30] is defined as the time for which a hypothetical packet arriving at time $t$ would have to wait before commencing service. Figure 1 shows typical realisations of $V_1(t)$, the virtual waiting time of class 1 packets, and $K(t)$, the Wi-Fi availability status. $V_1(t)$ has a value of 0 when the queue is empty of class 1 packets and the server status is ON. The time interval during which $V_1(t)$ has a value of 0 is called the idle period (IP). While the queue is empty of class 1 packets ($V_1(t) = 0$), an initial jump in $V_1(t)$ can be due to either server breakdown (time instant $m$ of Figure 1) or a packet arrival while the server is ON (time instant $a_1$ of Figure 1). The time interval when $V_1(t)$ has a value greater than 0 is called the busy period (BP). After the initial jump, the virtual waiting time decreases at rate $V_1'(t) = -1$ until it reaches 0 if no packet arrives within that duration. If a new packet arrives before $V_1(t)$ reaches 0, the virtual waiting time jumps by a value corresponding to the completion time of the new packet. The completion time of a packet, $C$, is defined as the time from the instant service of the packet starts until the completion of the service, including all service unavailability periods as a result of server breakdown (denoted as $C_i$ for the $i$th class 1 packet). Because we follow preemptive priority for class 1 over class 2, the Laplace-Stieltjes transform (LST) of the density function of $C$ is the one derived in Ajith and Venkatesh [24] (Equation (1)). Because the completion time includes service unavailability periods, the virtual waiting time does not jump for an OFF duration. However, the initial jump in virtual waiting time can be the result of service unavailability periods. The virtual waiting time of class 1 does not jump for the arrival of class 2 packets ($b_1$, $b_2$ and $b_3$) because class 1 packets have preemptive priority over class 2. Also, the virtual waiting time does not jump for class 1 packets that renege before the BP ($a_2$ and $a_7$), because they do not contribute to the virtual waiting time of future packet arrivals.

Figure 1 shows that the whole system can be modelled as an alternating regenerative process that alternates between BP and IP. The probability density function of the virtual waiting time of the class 1 packet, $f_{V_1}(x)$, can be written as:

$$f_{V_1}(x) = P(IP)\, f_{V_1}(x \mid IP) + P(BP)\, f_{V_1}(x \mid BP), \tag{2}$$

based on whether a packet arrival event occurs during BP or IP. Here, $P(IP)$ and $P(BP)$ are the probabilities that an arrival takes place in an IP or a BP, respectively. According to the Poisson arrivals see time averages property [31], $P(IP)$ and $P(BP)$ can be formulated as:

$$P(IP) = \frac{E[IP]}{E[IP] + E[BP]}, \tag{3}$$

$$P(BP) = \frac{E[BP]}{E[IP] + E[BP]}. \tag{4}$$

Because an IP ends owing to a packet arrival or service interruption, $E[IP] = \frac{1}{\lambda_1 + \alpha}$. Here, $E[BP]$ has been derived in Equation (3) of Ajith and Venkatesh [24]. Taking the Laplace transform on both sides of Equation (2), the LST of $f_{V_1}(x)$ is given by:

$$\mathcal{L}\{f_{V_1}(x)\}(s) = P(IP)\, \mathcal{L}\{f_{V_1}(x \mid IP)\}(s) + P(BP)\, \mathcal{L}\{f_{V_1}(x \mid BP)\}(s). \tag{5}$$

Because any packets arriving during IP are served immediately,

$$\mathcal{L}\{f_{V_1}(x \mid IP)\}(s) = 1. \tag{6}$$

To find the LST of $f_{V_1}(x)$, we have to derive the LST of $f_{V_1}(x \mid BP)$.
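The completion time $C$ defined above can be explored numerically. The sketch below samples $C$ for an exponential service requirement (rate $\mu$) interrupted by exponential ON/OFF switching, and compares the sample mean with the closed form $E[C] = \frac{1}{\mu}\left(1 + \frac{\alpha}{\beta}\right)$. That closed form is our own renewal-argument check under the system-model assumptions, not a formula quoted from [24], and the numerical rates are hypothetical.

```python
import random


def sample_completion_time(mu, alpha, beta, rng):
    """Draw one completion time for a packet whose service starts with Wi-Fi ON.

    Because service is exponential (memoryless), the remaining service
    requirement can be redrawn after every interruption.
    """
    t = 0.0
    while True:
        service = rng.expovariate(mu)    # remaining service requirement
        up = rng.expovariate(alpha)      # time until the next breakdown
        if service <= up:
            return t + service           # finished before Wi-Fi went OFF
        t += up + rng.expovariate(beta)  # served until breakdown, then wait out OFF


def mean_completion_time(mu, alpha, beta):
    """Closed-form mean under these assumptions: (1/mu) * (1 + alpha/beta)."""
    return (1.0 / mu) * (1.0 + alpha / beta)


rng = random.Random(1)
mu, alpha, beta = 0.1, 1 / 300, 1 / 100   # hypothetical rates (per second)
n = 200_000
estimate = sum(sample_completion_time(mu, alpha, beta, rng) for _ in range(n)) / n
print(f"simulated E[C] = {estimate:.2f} s, "
      f"closed form = {mean_completion_time(mu, alpha, beta):.2f} s")
```

The gap between $E[C]$ and the bare service time $1/\mu$ is the average OFF time absorbed into each packet's completion, which is exactly why the virtual waiting time does not need a separate jump for OFF periods inside a busy period.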
For further analysis, we use the level crossing method [29]. Consider the particular value $V_1(t) = x$ as a level, as shown in Figure 1. A downcrossing of level $x$ is a left-continuous hit of level $x$ from above [29]. An upcrossing of level $x$ is defined as a jump that starts at a value below level $x$ and ends at a value above level $x$. For example, an instant of downcrossing and upcrossing of level $x$ can be seen at time $h$ and time $a_1$, respectively, in Figure 1. The downcrossing rate of level $x$ is the number of downcrossings of level $x$ per unit time. Similarly, the upcrossing rate of level $x$ is the number of upcrossings of level $x$ per unit time. The basic level crossing argument states that for every level $x$ and every sample path, in the long run, the total downcrossing rate is equal to the total upcrossing rate [29].
The downcrossing rate of level $x$ during BP is given by the probability density function (PDF) of $V_1$ conditioned on BP, $f_{V_1}(x \mid BP)$. The upcrossing rate of level $x$ during BP can be calculated as the sum of the upcrossing rate of level $x$ from level 0 and the upcrossing rate of level $x$ from any level $y \in (0, x)$. Let $IJ$ be a random variable representing the initial jump of virtual waiting time from IP, and $F_{IJ}(x)$ be its distribution function. While the queue is empty, an initial jump in $V_1(t)$ can either result from server breakdown or be due to a packet arrival while the server is ON; $F_{IJ}(x)$ is formulated accordingly in Equation (7). The upcrossing rate of level $x$ during BP from level 0 is the rate at which the initial jump exceeds $x$. $\bar{F}_{IJ}(x)$ is the probability with which the initial jump exceeds level $x$. Because the initial jump occurs only once during BP, the upcrossing rate of level $x$ during BP from level 0 is given by $\bar{F}_{IJ}(x)/E[BP]$. In BP, the virtual waiting time jumps by a value corresponding to the completion time of the packet. Given $v$ as the initial virtual waiting time (the starting point of the jump), for the jump to cross level $x$, the completion time must be greater than $x - v$. The probability of upcrossing level $x$ from level $v$ owing to a packet arrival is therefore given by $\bar{F}_C(x - v)$. A packet will stay in the queue if its deadline, $d$, is greater than the virtual waiting time. $P(d \ge v)$ is the probability with which the packet will not renege, which is the complementary CDF of the deadline distribution evaluated at $v$, that is, $e^{-\gamma_1 v}$. Only packets that are not reneged change $V_1(t)$ upon their arrival. Therefore, $\bar{F}_C(x - v)\, e^{-\gamma_1 v}$ is the probability that an upcrossing from level $v$ crosses level $x$, and $f_{V_1}(v \mid BP)$ is the probability density of the virtual waiting time at value $v$ over BP.
Integrating over $v$ from 0 to $x$ gives the probability of crossing level $x$ from a level in $(0, x)$ over BP. Multiplying by the arrival rate gives the required upcrossing rate of level $x$ during BP from any level $y \in (0, x)$, namely $\lambda_1 \int_0^x \bar{F}_C(x - v)\, e^{-\gamma_1 v} f_{V_1}(v \mid BP)\, dv$.
Using the level crossing method [29], on equating the downcrossing rate to the upcrossing rate, we get:

$$f_{V_1}(x \mid BP) = \frac{\bar{F}_{IJ}(x)}{E[BP]} + \lambda_1 \int_0^x \bar{F}_C(x - v)\, e^{-\gamma_1 v} f_{V_1}(v \mid BP)\, dv. \tag{8}$$

This is a Volterra integral equation of the second kind [32], which can be solved by recursion. Similar to the analysis in Iravani and Balcıoglu [33], we obtain the LST of $f_{V_1}(x \mid BP)$ in Equation (9), where $\mathcal{L}\{\bar{F}_{IJ}(x)\}(s)$ and $\mathcal{L}\{\bar{F}_C(x)\}(s)$ are the LSTs of the complementary distributions of the initial jump and completion time, respectively, $m$ is the number of recursions, and $\Phi_i(s)$ is the associated recursion term. Substituting Equations (3), (4), (6) and (9) into Equation (5) yields the LST of $f_{V_1}(x)$. The LST of the distribution function of the transmission delay of the offloaded/served packet can then be derived according to Stanford [34]; it involves $\eta_1$, the offloading efficiency of class 1 packets, which can be derived according to Ajith and Venkatesh [24]. The average transmission delay of the offloaded/served packet, $T_{S_1}$, follows from this LST and is given in Equation (14). The average transmission delay of class 1 packets in the mobile data offloading scheme, $T_1$, can be calculated as in [34]. Because every class 1 packet is either offloaded (with probability $\eta_1$) or reneged, the relationship between $T_{S_1}$, $T_{R_1}$ and $T_1$ can be written as:

$$T_1 = \eta_1 T_{S_1} + (1 - \eta_1)\, T_{R_1}.$$

Therefore, the average transmission delay of the reneged packet, $T_{R_1}$, is:

$$T_{R_1} = \frac{T_1 - \eta_1 T_{S_1}}{1 - \eta_1}. \tag{17}$$

Equations (14) and (17), which give the average transmission delay of offloaded and reneged packets, respectively, are the final results of the section.
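The level crossing balance leads to a Volterra integral equation of the second kind of the form $f(x) = g(x) + \lambda \int_0^x K(x - v)\, e^{-\gamma v} f(v)\, dv$. The paper solves it by recursion in the transform domain; as an alternative illustration only, the sketch below solves the same form numerically with the trapezoidal rule. The function name `solve_volterra` and the test inputs are hypothetical; for the equation in the text one would take $g(x) = \bar{F}_{IJ}(x)/E[BP]$ and $K = \bar{F}_C$.

```python
import math


def solve_volterra(g, kernel, lam, gamma, x_max, steps):
    """Solve f(x) = g(x) + lam * int_0^x kernel(x - v) * exp(-gamma*v) * f(v) dv
    on [0, x_max] with the trapezoidal rule, marching forward in x."""
    h = x_max / steps
    xs = [i * h for i in range(steps + 1)]
    f = [g(0.0)]                                  # the integral vanishes at x = 0
    for i in range(1, steps + 1):
        # Trapezoid weights: 1/2 at both end points, 1 in the interior.
        acc = 0.5 * kernel(xs[i]) * f[0]
        for j in range(1, i):
            acc += kernel(xs[i] - xs[j]) * math.exp(-gamma * xs[j]) * f[j]
        # The end-point term contains the unknown f[i]; move it to the left side.
        diag = 0.5 * lam * h * kernel(0.0) * math.exp(-gamma * xs[i])
        f.append((g(xs[i]) + lam * h * acc) / (1.0 - diag))
    return xs, f


# Sanity check with a known solution: g = 1, kernel = 1, gamma = 0, lam = 1
# reduces to f(x) = 1 + int_0^x f(v) dv, whose solution is exp(x).
xs, f = solve_volterra(lambda x: 1.0, lambda u: 1.0,
                       lam=1.0, gamma=0.0, x_max=1.0, steps=200)
print(f"f(1) = {f[-1]:.5f}, exp(1) = {math.e:.5f}")
```

The forward march works because the integral at $x$ only involves values of $f$ at or below $x$, which is the defining property of a Volterra equation.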

| Results
To validate the analytical model, extensive simulations were carried out in MATLAB [35]. A node is simulated by generating Poisson traffic of class 1 and class 2. The Wi-Fi data rate is assumed to be 2 Mbps. Because delayed offloading could afford to transmit a larger packet than on-the-spot offloading, a packet length of 2.5 MB is assumed in the simulation. Wi-Fi connectivity is assumed to be available three-fourths of the time (AR = 0.75), as obtained in measurement studies in [9]. Because class 1 has preemptive priority over class 2, the arrival process of class 2 is immaterial in the analysis of transmission delay for class 1. In Figures 2-6, lines represent analytical results and the corresponding markers denote simulation results.
The transmission delay of offloaded class 1 packets, $T_{S_1}$, the transmission delay of reneged class 1 packets, $T_{R_1}$, and the transmission delay of class 1 packets, $T_1$, are plotted in Figure 2 for different deadlines of class 1. The transmission delay of reneged class 1 packets is larger than the transmission delay of offloaded class 1 packets. $T_1$ is a weighted sum of $T_{R_1}$ and $T_{S_1}$. Thus, to reduce $T_1$ when $T_{R_1}$ is higher than $T_{S_1}$, there is a need to balk appropriate packets. The reneged packets do not contribute to offloading efficiency. However, they increase the average queuing delay. If we could identify packets that are likely to be reneged, and if they are made to balk and are routed to the cellular network directly, the performance of the delayed offloading scheme can be improved.
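The weighted-sum relationship can be made concrete with a small helper. The sketch assumes the weights are the offloading efficiency $\eta_1$ for served packets and $1 - \eta_1$ for reneged packets, which holds when every packet is either offloaded or reneged; the numbers in the example are hypothetical and are not results from the paper.

```python
def reneged_delay(t_all: float, t_served: float, eta: float) -> float:
    """Solve T_1 = eta * T_S + (1 - eta) * T_R for the reneged-packet delay T_R."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must lie in [0, 1) so that some packets renege")
    return (t_all - eta * t_served) / (1.0 - eta)


# Hypothetical values: overall delay 10 s, served-packet delay 6 s, eta = 0.5.
t_r = reneged_delay(10.0, 6.0, 0.5)
print(f"reneged-packet delay = {t_r:.1f} s")
```

Whenever the reneged delay exceeds the served delay, removing packets that would renege anyway pulls the overall average down, which is the motivation for balking.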

| DELAYED OFFLOADING WITH BALKING AND PRIORITY
In this section, we analyse the system that includes prioritising traffic and appropriate balking to improve QoS in a delayed offloading scenario.

| System model
The system model is the same as the one in Section 3.1. The only change is that not all packets wait in the queue for Wi-Fi transmission; instead, the class $j$ packet joins the queue with probability $1 - b^{(j)}_{(n_1,n_2,k)}$, where $n_1$ and $n_2$ are the number of class 1 and class 2 packets in the queue, and $k$ is the Wi-Fi service status at the time of the arrival of the corresponding packet. $b^{(j)}_{(n_1,n_2,k)}$ is the balking probability of the packet, that is, the probability with which the packet, upon arrival, leaves without waiting in the queue.

| Derivation of balking probability of a packet
To derive the balking probability, $b^{(j)}_{(n_1,n_2,k)}$, we first derive the distribution function of the virtual waiting time, $V_1(t)$ and $V_2(t)$, of class 1 and class 2, respectively.
Consider a packet whose arrival time is $t_0$ with deadline $d$. Let $T^{(1)}_i$ be the time from the instant the previous packet leaves the system until the $i$th class 1 packet leaves the system, $1 \le i \le n_1$. Similarly, $T^{(2)}_i$ is the time from the instant the previous packet leaves the system until the $i$th class 2 packet leaves the system, $1 \le i \le n_2$. Here, $H_{j,i} \sim \mathrm{Exp}(i\gamma_j)$ is the time until the next reneging of a class $j$ packet, which occurs at rate $i\gamma_j$ when the number of class $j$ packets waiting in the queue is $i$.

Given $N(t_0) = [n_1\ n_2]$ and $K(t_0) = k$, the virtual waiting times $V_1(t_0)$ and $V_2(t_0)$ are expressed in terms of these inter-departure times in Equations (18) and (19). Taking the Laplace transform on both sides of Equations (18) and (19) gives the LST of the virtual waiting time, $V_j$, of a class $j$ packet, $j \in \{1, 2\}$, given $N(t_0) = [n_1\ n_2]$ and $K(t_0) = k$ (Equations (20) and (21)), in terms of $L^{(j)}_{T_i}(s)$, the LST of $T^{(j)}_i$. The conditional CDF of the virtual waiting time, $V_j$, of the class $j$ packet is obtained from this LST (Equation (22)). Because the number of class 1 and class 2 packets in the queue are $n_1$ and $n_2$, respectively, and the Wi-Fi service status is $k$, the probability that $V_j$ is less than $d'$ is given by $F_{V_j}(d' \mid N(t_0) = [n_1\ n_2], K(t_0) = k)$. Then, the balking probability of the class $j$ packet, $b_{(j,n_1,n_2,k)}$, can be formulated as:

$$b_{(j,n_1,n_2,k)} = 1 - F_{V_j}\big(d' \mid N(t_0) = [n_1\ n_2],\ K(t_0) = k\big). \tag{24}$$

Algorithm 1, Balking, is the algorithm that every mobile node executes to make a balking decision. The algorithm takes as input (1) the class of the packet, (2) the deadline of the packet, (3) the queue length of the high- and low-priority queues, and (4) the Wi-Fi status at the time of arrival. Upon execution of the algorithm, the packet either joins the queue corresponding to its class or balks.
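The balking decision can be illustrated for a class 1 packet with the sketch below. It is not Algorithm 1 itself: instead of inverting the LST of the virtual waiting time, it estimates the probability that the wait exceeds the deadline by Monte Carlo simulation, and it makes the simplifying assumption that each of the packets ahead may renege at rate $\gamma$. The function names and numerical rates are hypothetical.

```python
import random


def sample_wait(n_ahead, wifi_on, mu, gamma, alpha, beta, rng):
    """Sample the time until all n_ahead packets have left and Wi-Fi is ON."""
    t, n, on = 0.0, n_ahead, wifi_on
    while n > 0 or not on:
        leave = (mu if on else 0.0) + n * gamma  # service (ON only) plus reneging
        switch = alpha if on else beta           # breakdown or recovery
        t += rng.expovariate(leave + switch)
        if rng.random() * (leave + switch) < leave:
            n -= 1                               # a packet ahead is served or reneges
        else:
            on = not on                          # Wi-Fi status flips
    return t


def balking_probability(n_ahead, wifi_on, deadline, mu, gamma, alpha, beta,
                        samples=4000, seed=0):
    """Monte Carlo estimate of P(virtual waiting time > deadline)."""
    rng = random.Random(seed)
    late = sum(sample_wait(n_ahead, wifi_on, mu, gamma, alpha, beta, rng) > deadline
               for _ in range(samples))
    return late / samples


def join_or_balk(n_ahead, wifi_on, deadline, rng, **rates):
    """Balk with the estimated probability, otherwise join the Wi-Fi queue."""
    b = balking_probability(n_ahead, wifi_on, deadline, **rates)
    return "balk" if rng.random() < b else "join"


rates = dict(mu=0.1, gamma=1 / 30, alpha=1 / 300, beta=1 / 100)  # hypothetical
for n in (0, 1, 8):
    print(n, balking_probability(n, True, 30.0, **rates))
```

A packet that finds an empty queue with Wi-Fi ON never balks, while the balking probability rises with the number of packets ahead, matching the intuition behind Equation (24).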

| Derivation of transmission delay and offloading efficiency
Based on the system model in Section 4.1, delayed offloading with balking and priority can be modelled as a 3D Markov chain with states $(n_1, n_2, k)$, as shown in Figure 3. Here, $n_1 \in \{0, 1, \ldots\}$ denotes the number of class 1 packets waiting in the queue, $n_2 \in \{0, 1, \ldots\}$ denotes the number of class 2 packets waiting in the queue, and $k \in \{c, w\}$ denotes the status of the Wi-Fi service. When the system is in state $(n_1, n_2, k)$, class $j$ packets arrive at the Wi-Fi queue at rate $\lambda_j (1 - b_{(j,n_1,n_2,k)})$, where $b_{(j,n_1,n_2,k)}$ is as given in Equation (24).
Let $\pi_{(n_1,n_2,k)}$ denote the steady-state probability of state $(n_1, n_2, k)$. Values of $\pi_{(n_1,n_2,k)}$ can be found as follows. Let $R$ be the rate matrix corresponding to the Markov chain, and $\pi$ be the steady-state vector for $R$ with elements $\pi_{(n_1,n_2,k)}$. To find the steady-state vector, notice that $\pi R = 0$ [28]. Therefore, $R^T \pi^T = 0$, and to solve for $\pi$, we have to obtain the null space of $R^T$. After finding a vector in the null space of $R^T$, the scalar multiple of it that satisfies $\sum_{k \in \{c,w\}} \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} \pi_{(n_1,n_2,k)} = 1$ gives $\pi$. Given the steady-state probabilities of the states, we can determine the average number of class $j$ packets in the queue by weighting $n_j$ with $\pi_{(n_1,n_2,k)}$ and summing over all states (Equation (25)). The reneging rate of class $j$ packets is the mean number of class $j$ packets that get reneged per unit time (Equation (26)). The balking rate of class $j$ packets, $B_j$, is the mean number of class $j$ packets that do not join the queue per unit time:

$$B_j = \lambda_j \sum_{k \in \{c,w\}} \sum_{n_1=0}^{\infty} \sum_{n_2=0}^{\infty} b_{(j,n_1,n_2,k)}\, \pi_{(n_1,n_2,k)}. \tag{27}$$

The reneging and balking probabilities of a class $j$ packet, $P_{R_j}$ and $P_{B_j}$, follow from these rates (Equations (28) and (29)). Finally, the offloading efficiency of class $j$ is given by Equation (30) and the transmission delay of class $j$ by Equation (31), where we assume balked packets contribute zero to transmission delay and use Little's law [36] with effective arrival rate $\lambda_j - B_j$.
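The null-space step described above can be sketched for a finite chain as follows. The infinite chain of the paper would first have to be truncated, which is an assumption of this illustration; the helper `steady_state` is hypothetical and uses plain Gauss-Jordan elimination, replacing one redundant balance equation by the normalisation condition.

```python
def steady_state(Q):
    """Solve pi Q = 0 with sum(pi) = 1 for a finite, irreducible generator Q.

    Q[i][j] is the transition rate from state i to state j (i != j) and each
    row sums to zero.
    """
    n = len(Q)
    # Transpose so the unknowns form a column vector: Q^T pi^T = 0.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    # One balance equation is redundant; replace it by the normalisation.
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(n)]


# Two-state ON/OFF chain with hypothetical rates: ON -> OFF at alpha, OFF -> ON at beta.
alpha, beta = 1 / 300, 1 / 100
pi = steady_state([[-alpha, alpha], [beta, -beta]])
print(f"P(ON) = {pi[0]:.3f}, P(OFF) = {pi[1]:.3f}")
```

For the full model, the states $(n_1, n_2, k)$ would be enumerated up to chosen caps on $n_1$ and $n_2$, and the resulting vector would feed the sums for the queue length, reneging rate and balking rate.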

| Exact analysis to virtual waiting time of class 2
In deriving Equations (18) and (19), we made an exponential approximation for the PDF of the completion time. In addition, $V_2(t_0)$ of the tagged class 2 packet as given in Equation (19) must include a term that accounts for high-priority class 1 packets that arrive after the tagged class 2 packet but are served before it. This can be accounted for by the term $G$, which is the BP generated by class 1 packets during $V_2(t_0)$. Let $V_2(t_0) = x$. The Laplace transform of $G$ given $x$ can be derived as follows. Let $N$ be the number of class 1 packets that arrive during the time duration $x$ and are still present at the end of $x$ (the number of packets arrived minus the number of packets reneged). According to Stanford et al. [34], $N$ can be treated as the occupancy of an M/M/∞ queue: arrivals occur at rate $\lambda_1$ according to a Poisson process and move the process from state $i$ to $i + 1$; service times have an exponential distribution with parameter $\gamma_1$, and there are always sufficient servers such that every arriving job is served immediately, so transitions from state $i$ to $i - 1$ occur at rate $i\gamma_1$. The PMF of $N$ given $x$ then follows as in [28] (Equation (32)). The Laplace transform of $G$ given $x$ (Equation (33)) is expressed through $L_j(s)$, the LST of the BP started with $j$ class 1 packets, as given by Equation (2) of Iravani and Balcıoglu [33]. Therefore, the Laplace transform of $G$ is obtained by removing the conditioning on $x$ (Equation (34)). Because calculation of Equation (34) is computationally complex, a simpler correction term, $W_{add}$, is used instead.
Let $W_{add}$ be the average time required to serve class 1 packets that arrive after the tagged class 2 packet but are served before it owing to their preemptive priority. Let $n_1$ and $n_2$ be the number of packets in the queue at the time of the tagged class 2 arrival. Then, $W_{add}$ can be approximated, without considering reneging of class 1 packets, by Equation (36). We then apply the correction to the deadline $d$ as $d' = d - W_{add}$, where $W_{add}$ accounts for the class 1 packets that arrive after the tagged class 2 packet. $W_{add}$ is as given by Equation (36) for $j = 2$ and is equal to zero for $j = 1$.
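For the M/M/∞ view of later-arriving class 1 packets, a standard transient result states that, starting from an empty system, the number present after time $x$ is Poisson distributed with mean $\frac{\lambda_1}{\gamma_1}\left(1 - e^{-\gamma_1 x}\right)$. The sketch below evaluates that PMF; treating the system as empty at the start of $x$ is an assumption of this illustration, and the function name and rates are hypothetical.

```python
import math


def pmf_class1_present(n: int, x: float, lam1: float, gamma1: float) -> float:
    """P(N = n | x) for an M/M/inf queue that starts empty.

    Arrivals are Poisson(lam1); each packet leaves (reneges) after an
    Exp(gamma1) time, so N is Poisson with the transient mean below.
    """
    mean = (lam1 / gamma1) * (1.0 - math.exp(-gamma1 * x))
    return math.exp(-mean) * mean ** n / math.factorial(n)


# Hypothetical rates: lam1 = 0.05 packets/s, mean deadline 30 s, window x = 30 s.
probs = [pmf_class1_present(n, 30.0, 0.05, 1 / 30) for n in range(60)]
print(f"P(N = 0) = {probs[0]:.3f}, total mass = {sum(probs):.6f}")
```

As $x$ grows, the mean saturates at $\lambda_1/\gamma_1$, so the number of interfering class 1 packets stays bounded even for long class 2 waits.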

| Results and discussion
To validate the analytical model, extensive simulations were carried out in MATLAB [35]. A node is modelled to generate the Poisson traffic of class 1 and class 2. The Wi-Fi data rate is assumed to be 2 Mbps. A packet length of 2.5 MB is assumed. Wi-Fi connectivity is assumed to be available three-fourths of the time (AR = 0.75). We consider the case in which the high-priority class has a short deadline, in the order of seconds, and the low-priority class has a deadline in the order of minutes. Note that MDO corresponds to a standard variant of delayed mobile data offloading, and MDOB corresponds to the proposed scheme, mobile data offloading with balking.
The transmission delay for class 1 and class 2 for different $\lambda_2$ is plotted in Figures 4a,b, respectively. Figure 4a shows that class 1 transmission delay is not affected by the class 2 arrival rate for both schemes because class 1 has preemptive priority over class 2. Although class 1 has a delay tolerance of 30 s, the transmission delay is much lower when class 1 traffic is prioritised, as seen in Figure 4a. However, class 2 transmission delay increases with an increase in the class 2 arrival rate for both schemes, as seen in Figure 4b. The low-priority class 2 packets have a delay tolerance of 5 min, and their transmission delay remains below this tolerance over the range of $\lambda_2$ shown in Figure 4b.
The transmission delay for MDOB is lower than that for MDO for both classes and for all values of $\lambda_2$. Hence, the transmission delay of both classes can be reduced by including balking. Our results are supported by the conclusion drawn in Cheung and Huang [16]. The optimal solution derived in that work [16] showed that when there are many packets waiting to be transmitted or when the deadline is close, the user should start transmitting via a cellular network immediately to minimise the overall cost. Figure 5 shows the offloading efficiency of class 1 and class 2 packets. Although class 1 is given preemptive priority over class 2, the offloading efficiency of class 1 is not high. This is because class 1 has a shorter deadline of 30 s. Upon increasing $\lambda_2$, the offloading efficiency of class 1 is not affected because it has preemptive priority over class 2. However, the offloading efficiency of class 2 decreases. The offloading efficiency curve for MDOB is slightly lower than that of the MDO scheme for class 1. However, the offloading efficiency curve for MDOB is slightly higher than the corresponding curve of MDO for class 2. This shows that when appropriate packets are balked, the slight decrease in offloading efficiency for class 1 is accompanied by an increase in offloading efficiency for class 2.
The reneging and balking probabilities of class 1 and class 2 are plotted in Figure 6. The reneging probability of class 1 is unaffected by $\lambda_2$, whereas the reneging probability of class 2 increases with $\lambda_2$ for both MDO and MDOB. For both classes, the reneging probability is lower for MDOB than for MDO.

| Comparison with related works
Although a number of works are reported in the literature, as Table 1 shows, they differ from one another in (1) the system model and system parameters considered, (2) the performance metrics analysed, (3) the analytical framework adopted, and (4) the overall goal of the paper. Therefore, we confine the comparison of our work to two closely related works [23,26]. Even so, the system models and parameters of those works [23,26] differ from ours in some respects. As a result, we can make only qualitative comparisons against them and draw general inferences. The average delay of high-priority data in our model (10.8 s) is lower than both the delay of high- and low-priority data (25 s) of Xu et al. [26] and the average delay of single-class data (23 to 42 s) of Mehmeti and Spyropoulos [23]. This is achieved through prioritisation; as a result, the delay of low-priority data in our model is larger. Our model achieves a higher offloading efficiency η (44 and 36) than that of Xu et al. [26], which is 18. Although the availability ratio in that work [26] is only 0.1 (which reduces η), its deadline of 30 to 90 min favours a higher η. Most importantly, the reneging rate of our model (0.23 for class 1 and 0.4 for class 2) is much lower than the 0.867 of Mehmeti and Spyropoulos [23].

| Effect of finite cellular delay
We now investigate the effect of considering a finite cellular delay. Assume that files transmitted over the cellular network incur a fixed delay $D_{\mathrm{cellular}}$ (the ratio of the packet size $\kappa$ to $R_{\mathrm{Cellular}}$), capturing queueing delays over the cellular network. Let $T_{j,\mathrm{MDO}}$, $j \in \{1, 2\}$, denote the transmission delay of a class $j$ packet in the Wi-Fi queue without considering the cellular delay. Then, in standard delayed offloading, the transmission delay is given by $T_{j,\mathrm{MDO}} + P_{R_j} D_{\mathrm{cellular}}$. In the proposed mobile data offloading with balking, the transmission delay is given by $T_{j,\mathrm{MDOB}} + (P_{R_j} + P_{B_j}) D_{\mathrm{cellular}}$. Figure 4 shows, both in simulation and analytically, that $T_{j,\mathrm{MDOB}}$ is smaller than $T_{j,\mathrm{MDO}}$. Furthermore, Figure 6 shows that $P_{R_j}$ for standard delayed offloading is almost equal to $P_{R_j} + P_{B_j}$ for our scheme; thus, offloading efficiency is not reduced even after incorporating balking. Therefore, $T_{j,\mathrm{MDOB}} + (P_{R_j} + P_{B_j}) D_{\mathrm{cellular}}$ is smaller than $T_{j,\mathrm{MDO}} + P_{R_j} D_{\mathrm{cellular}}$. Thus, even if we consider a finite cellular delay, our proposed scheme has a lower transmission delay.
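The comparison above can be illustrated numerically. The following is a minimal Python sketch, not the simulation code used in this work: the function names are hypothetical, $R_{\mathrm{Cellular}}$ is assumed to be a cellular data rate in bits per second, and all numerical inputs are made-up values chosen only so that the reneging probability under MDO roughly equals the sum of the reneging and balking probabilities under MDOB.

```python
def cellular_delay(packet_size_bits, cellular_rate_bps):
    """Fixed cellular delay: packet size divided by the cellular rate.

    Assumption: R_Cellular is interpreted as a data rate in bit/s.
    """
    if cellular_rate_bps <= 0:
        raise ValueError("cellular rate must be positive")
    return packet_size_bits / cellular_rate_bps


def total_delay(wifi_queue_delay, p_renege, p_balk, d_cellular):
    """Transmission delay once a finite cellular delay is included.

    Implements T_j + (P_R_j + P_B_j) * D_cellular. For standard delayed
    offloading (MDO) there is no balking, so pass p_balk = 0.
    """
    p_cellular = p_renege + p_balk  # fraction of packets sent over cellular
    if not 0.0 <= p_cellular <= 1.0 + 1e-12:
        raise ValueError("probabilities must sum to a value in [0, 1]")
    return wifi_queue_delay + p_cellular * d_cellular


if __name__ == "__main__":
    # Hypothetical inputs for illustration only; not results from this paper.
    d_cell = cellular_delay(packet_size_bits=8e6, cellular_rate_bps=2e6)

    # MDO: longer Wi-Fi queue delay, reneging only.
    mdo = total_delay(wifi_queue_delay=12.0, p_renege=0.30, p_balk=0.0,
                      d_cellular=d_cell)

    # MDOB: shorter Wi-Fi queue delay, reneging plus balking.
    mdob = total_delay(wifi_queue_delay=10.0, p_renege=0.20, p_balk=0.10,
                       d_cellular=d_cell)

    print(f"D_cellular = {d_cell:.2f} s")
    print(f"MDO  total delay = {mdo:.2f} s")
    print(f"MDOB total delay = {mdob:.2f} s")
```

Because the cellular term is nearly identical for the two schemes when the fraction of packets sent over the cellular network is almost the same, the ordering of the total delays is determined by the Wi-Fi queue delays alone, which is the argument made in the paragraph above.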

| Conclusion
We presented a delayed mobile data offloading scheme to support QoS traffic. We considered a set of mobile nodes under the coverage of cellular and Wi-Fi technology that carry packets with two priority levels. We modelled the mobile nodes as an M/G/1 queue with an ON-OFF server along with reneging and balking. We validated the need for balking for QoS enhancement in delayed offloading by deriving the transmission delay of offloaded and reneged packets separately. A balking probability was derived. Finally, delayed offloading with balking along with a multiclass prioritisation scheme was analysed. We demonstrated that the average transmission delay of the high-priority packets can be substantially reduced without sacrificing offloading efficiency. Our investigation has the potential to improve QoS in future mobile data offloading standards.
FIGURE 5 Offloading efficiency for class 1 and class 2 for different $\lambda_2$. Lines show analytical results; symbols show simulation results. (a) $T_1$ and (b) $T_2$
FIGURE 6 Reneging and balking probability for class 1 and class 2 for different $\lambda_2$. Lines show analytical results; symbols show simulation results. (a) $T_1$ and (b) $T_2$