Pay to activate service in vacation queues

We study a vacation queueing model where an arriving customer, upon finding the server to be on vacation, is offered an opportunity to pay a fee to instantaneously end the server's vacation, which is referred to as pay‐to‐activate‐service (PTAS). If no one utilizes PTAS, the service will automatically resume when the system's workload reaches a critical level. We investigate customers' equilibrium strategies: (i) joining or balking and (ii) if joining, accepting PTAS or rejecting PTAS, in response to such a mechanism; we show that customers' equilibrium strategies exhibit both avoid‐the‐crowd (ATC) and follow‐the‐crowd (FTC) types of behavior. Our results indicate that the adoption of PTAS is efficient in improving the system performance (e.g., revenue and throughput) when the demand volume is intermediate. We also discover that, upon selecting the appropriate queue‐length information disclosure policy, the service provider has to trade off between collecting a higher revenue through PTAS and improving the system throughput, because revealing the queue‐length information will impact the aforementioned two performance metrics in opposing directions. Finally, we compare our new setting to other common mechanisms including regular vacation queues and pay‐for‐priority queues.


INTRODUCTION
In service systems, allowing servers to take vacations when the congestion level is low can help reduce the system's operating cost and arouse servers' enthusiasm; see Tian and Zhang (2006). An efficient vacation mechanism in practice is to let the server resume service once the system's workload reaches a critical level. For example, in make-to-order production systems with high setup costs, the production line usually begins to operate only when the number of tasks reaches a critical level; see Guo and Hassin (2011) and Li et al. (2016) for more detailed discussions of make-to-order production systems.
This type of vacation termination rule is referred to as the N-policy (Yadin & Naor, 1963); that is, the vacation continues as long as the total number of waiting customers is below a threshold N and terminates otherwise. In a vacation queue with a relatively low threshold N, customers' waiting times vary little because it takes only a few more arrivals to activate the server (if it is not already active). For instance, the special case N = 1 is equivalent to the conventional queueing model without vacations. On the other hand, when the vacation termination threshold is high, customers' delays can fluctuate significantly because an arrival at the beginning of the server's vacation has to wait passively for N − 1 additional arrivals before service eventually resumes, while the last arrival immediately activates the service. This may give rise to an issue of service unfairness (the variance of customers' waiting times has proven to be a useful metric for service fairness; see, for example, Cao et al. (2021) and the references therein); also see Liu and Whitt (2014) and Aras et al. (2018) for analyses of the variance of waiting times.
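The spread in delays under a high threshold can be made concrete with a minimal sketch. It assumes Poisson arrivals at a rate `lam`, no balking, and no PTAS; the function name and parameters are illustrative and not part of the paper.

```python
def expected_wait_for_activation(N, lam):
    """Mean time the k-th vacation arrival (k = 1..N) waits until the server restarts.

    Assumption: Poisson arrivals at rate lam, every arrival joins, no PTAS.
    The k-th arrival then needs N - k further arrivals, each taking 1/lam on average.
    """
    return [(N - k) / lam for k in range(1, N + 1)]


# With N = 5 and two arrivals per time unit, the first arrival waits four
# inter-arrival times on average, while the fifth activates the server at once.
waits = expected_wait_for_activation(5, 2.0)
```

The list is linear in the arrival position, so the gap between the first and the last arrival grows in proportion to N − 1.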
In this paper, we consider a new mechanism in a vacation queueing system, where each arriving customer, upon finding the server to be on vacation, is offered an opportunity to pay a fee to instantaneously end the server's vacation; we refer to this option as pay-to-activate-service (PTAS).

WANG ET AL.
Production and Operations Management

Ideas similar to PTAS have already been implemented in practice. One relevant application is manufacturing systems for high-end products, such as new energy vehicles (NEVs). Since the Chinese government terminated the NEV subsidy policy in 2020, NEV production has experienced a significant slowdown. Manufacturers intend to adopt more cautious production plans; they begin producing new cars only when the number of pending orders reaches a certain level, unless consumers are willing to pay a premium; see Li and Xu (2020). Another example is the production line of labor-intensive products (e.g., fashion clothing and designer handbags), where the products are made exclusively by one designer. Due to the labor-intensive nature of the work, the designer often resumes work only when there is a sufficient number of orders or when a considerable premium is paid by some highly delay-sensitive consumers; see Guo and Hassin (2011) and Li et al. (2016). Other applications close to the PTAS mechanism include online group buying of beauty products (Hu et al., 2021) and car-pooling in ride-sharing platforms (Jacob & Roet-Green, 2021), in which the service may be initiated either by a sufficient number of requests or by fees paid by impatient consumers.
PTAS enables customers to gain proactive control of their own service experience because, if they deem PTAS to be worthwhile, customers no longer need to wait for other (future) customers to help advance the service process. In some sense, PTAS can help address the fairness issue from the customers' perspective: it turns the control of the server's state from passive to active. In addition, the impact of PTAS extends beyond the individual customer. A customer adopting PTAS may help improve the service experience of other customers, including (i) old customers already waiting in the queue for the server to return to work and (ii) new customers arriving in the near future before the server takes another vacation.
In our vacation queueing model endowed with PTAS, customers are delay-sensitive and strategic. They make the following one-time decisions immediately upon arrival: (i) to join or to balk and (ii) if joining, to pay (for PTAS) or not to pay, in anticipation of their expected individual welfare. If no one utilizes PTAS and the server is on vacation, the service will automatically resume when the queue length reaches some designated threshold N. At first glance, the idea of using PTAS to reduce delay may seem similar to paying for service priority (e.g., the FastPass service in Disney World for reduced waiting times and the Amazon Prime service for quick deliveries). Nevertheless, we draw a distinction: In conventional priority service models, purchasing a higher priority only reduces the delay cost for that particular customer, while the PTAS proposed here helps advance the service process for the entire waiting line, which benefits all customers present in the system. We study customers' equilibrium joining-and-purchasing strategies in response to such a new mechanism under two main information policies: observable and unobservable queue length. From the perspective of the service provider, we aim to answer the following questions: Does the implementation of PTAS help generate higher revenue and system throughput? If yes, when is the improvement most significant? What is the optimal information disclosure policy when PTAS is in effect?

Literature review
Our analysis has points of contact with four extant streams of research: (i) strategic behavior in vacation queues, (ii) strategic behavior in priority queues, (iii) two-dimensional customer strategies, and (iv) information provision policies.

Strategic behavior in vacation queues
The research on strategic customers in queues was pioneered by Naor (1969), where arriving customers decide whether to join an M/M/1 queue based on the observed queue length. The case of an unobservable M/M/1 queue was developed by Edelson and Hildebrand (1975). Following Naor (1969), strategic customer behavior in queueing systems has been widely studied in the literature; see Hassin and Haviv (2003), Stidham Jr (2009), and Hassin (2016) for comprehensive reviews. We hereby focus on reviewing works on vacation queues. The first work on vacation queues with an N-threshold policy dates back to Yadin and Naor (1963), where the service is resumed whenever N or more customers are present in the waiting line. Following Yadin and Naor (1963), various vacation queue models have been studied during the past decades. Interested readers can refer to the comprehensive monograph of Tian and Zhang (2006) and the references therein. In particular, Economou and Kanta (2008) studied an observable queue with server vacations due to breakdowns and developed the equilibrium joining strategy for customers. Guo and Hassin (2011) was the first work to study customer equilibrium behavior in an N-policy vacation queue; they discovered that customers may prefer to join a longer queue, in anticipation that the service will start sooner. Such an effect is referred to as a positive externality in the queueing game literature. The case of heterogeneous customers was investigated by Guo and Hassin (2012).

Strategic behavior in priority queues
Allowing customers to pay for priority has proven to be an efficient way to increase service profit and social welfare in queueing systems. Adiri and Yechiali (1974) were the first to develop the pure equilibrium priority purchasing strategy in an observable queueing model. Their results were later extended by Hassin and Haviv (1997) to allow for mixed strategies in the same setting. Gavirneni and Kulkarni (2016) studied the equilibrium strategy in unobservable queues with heterogeneous customers. Wang et al. (2019) conducted a comparative analysis of the equilibrium performance of a priority queue under different information structures. A partial priority scheme was proposed by Yang et al. (2022) for Covid-19 testing queues. Offering PTAS to customers looks in a way similar to allowing them to purchase a higher priority over others. Nevertheless, we draw a major distinction: In priority queues, the purchasing behavior is shown to be of pure follow-the-crowd (FTC) type, that is, a customer is more inclined to purchase priority when others do so as well; see, for example, Hassin and Haviv (1997, 2003). However, in our vacation queue model endowed with PTAS, as we will show later, FTC and avoid-the-crowd (ATC) behavior often coexist when the threshold N is neither too large nor too small. Such a distinction is due to the fact that a PTAS-purchasing customer not only reduces her own delay cost but also benefits all other customers present in the system.

Two-dimensional customer strategies
In contrast to previous queueing game literature that focuses on either the customers' joining strategy or their purchasing strategy (e.g., in priority queues), our framework allows customers' strategy to be a combination of both. At the heart of our equilibrium analysis is the two-dimensional joining-and-purchasing strategy. To the best of our knowledge, only a few papers in the queueing game literature have investigated this type of two-dimensional equilibrium strategy. Hassin and Roet-Green (2017) studied a queueing model where customers first decide whether to inspect the queue length and then make a second join-or-balk decision. This work was later extended by Hassin and Roet-Green (2018) to a two-server queue with inspection costs. Wang et al. (2019) studied the joining and priority purchasing strategy in a priority queueing model. Besides the joining decisions, Cui et al. (2020) and Yang et al. (2021) considered queueing models where customers can choose to pay to improve their queueing positions. Other two-dimensional settings can be found in referral priority models (Yang & Debo, 2019), online retailing queueing models (Wang et al., 2021a), multichannel service models with product exchange (Sun et al., 2022a), and restaurant models allowing orders to be placed ahead and then picked up later (Sun et al., 2022b). Motivated by cloud services, Dierks and Seuken (2021) ... We emphasize that the consideration of a two-dimensional customer strategy adds significant complexity to the equilibrium analysis.

Information revelation policies
There is a stream of queueing literature that studies the impact of information provision on queueing outcomes. By studying social welfare under both full and no queue information, Hassin (1986) discovered that the revelation of real-time queue length improves social welfare because such information helps better match service capacity with customer demand. Chen and Frank (2004) investigated the system throughput under the two aforementioned information provision policies and discovered that delayed information may have both positive and negative effects on the system throughput. Simhon et al. (2016) considered the optimal information disclosure problem in an M/M/1 queue and concluded that the commonly adopted threshold policy is never optimal. Hassin and Koshman (2017) proposed a new profit-maximizing mechanism in which customers are notified whether the queue length is below a certain threshold. Hu et al. (2018) found that throughput and social welfare can be unimodal in the fraction of informed customers; their findings imply that creating the "right" amount of information heterogeneity among customers may lead to improved outcomes. Similar results can be found in retrial and priority queueing models; see Wang and Wang (2019) and Wang and Fang (2022). Anunrojwong et al. (2020) studied the effective design of information policies with the objective of reducing congestion in social services. Recently, Lingenbrink and Iyer (2019) solved a long-standing open problem on the optimal signaling mechanism in unobservable queues; their illuminating findings suggested that such a signaling mechanism can be effective in achieving the optimal revenue in settings where state-dependent pricing is infeasible.

Contributions and organization
In summary, we make the following contributions.
• Benefit of PTAS. To the best of our knowledge, the present work is the first to study a vacation queueing model endowed with PTAS, which can be viewed as an extension of regular vacation queues operated under the N-policy. The ingenuity of PTAS lies in its ability to allow the server (when on vacation) to be activated immediately by arriving customers, giving them more control over their service experience. We study customers' equilibrium joining-and-purchasing strategies and the corresponding system performance measures. Our results show that, from the service provider's perspective, the model with PTAS can achieve a higher system-level performance than that without PTAS; and from the customers' perspective, they also benefit from receiving additional welfare through the utilization of PTAS.
• Information provision policies. We study two base information policies. When the queue length is unobservable, we discover that the equilibrium PTAS purchasing behavior shifts from FTC to ATC as the threshold N increases. Specifically, the equilibrium is FTC when N is small (e.g., N = 2) and is ATC when N is large (e.g., N ≥ 5). Interestingly, when N is intermediate (e.g., N = 3, 4), the equilibrium exhibits both FTC and ATC behavior. When the queue length is observable, we establish a subgame-perfect equilibrium (SPE) strategy, which, on an equilibrium path, reduces to a parameter-dependent threshold policy. We also consider a third information policy, called the "no-information" case, where neither the queue length nor the server's state is available.
• Nonmonotonic performance functions. Under all information policies, we show that the equilibrium system throughput is not necessarily increasing in the service reward, and that the PTAS revenue is in fact decreasing in the service reward. These results run counter to general intuition. Another interesting result is that the PTAS revenue is a unimodal function of the demand volume (it is close to 0 when the demand volume is either too small or too large). The performance under different information policies is studied and compared. To gain an understanding of these results, we conduct numerical experiments and provide in-depth discussions.

Organization of the paper
The rest of the paper is structured as follows. The model description is given in Section 2. In Section 3, we study an unobservable vacation queue model endowed with PTAS. We conduct the equilibrium analysis in three steps: We first report the equilibrium joining strategy under a fixed PTAS purchasing probability (Propositions 2 and 3); we next develop the equilibrium PTAS purchasing strategy with exogenous arrival rates (Proposition 4); and finally, we integrate the results of the previous two steps to establish the joint joining-and-purchasing strategy (Theorem 1). In Section 4, we study an observable vacation queue endowed with PTAS and characterize the SPE strategies (Theorems 2 and 3); we also provide the system performance in equilibrium. In Section 5, we compare the system performance under the two base information policies, investigate the revenue/pricing implications, and contrast our PTAS setting with other common mechanisms.
We develop some extensions of our base models in the Supporting Information and draw concluding remarks in Section 6. All proofs are given in the Supporting Information.

MODEL DESCRIPTION
We consider a production system with N-policy, where an arriving customer, upon finding the server to be on vacation, is offered a one-time opportunity to pay a fee to instantaneously end the server's vacation. Such a mechanism is called PTAS. Specifically, we study an M/M/1 queue with arrivals according to a Poisson process with rate Λ, and independent and identically distributed (i.i.d.) service times that are exponentially distributed with rate $\mu$. We denote by $\rho \equiv \Lambda/\mu$ the system's workload. The server alternates between two states: active (i.e., at service) and inactive (i.e., on vacation). When the server is active, waiting customers are served under the first-come first-served (FCFS) discipline; when no customer is present, the server becomes inactive (takes a vacation). The server's vacation ends either (i) when the total number of waiting customers reaches a critical level N or (ii) when an arriving customer adopts PTAS. The fee of PTAS is P > 0.
Customers are homogeneous and delay-sensitive. They incur a delay cost at rate C during their total sojourn time and receive a reward R upon completion of their services. All arriving customers are informed of the state of the server (because it would be unreasonable to ask customers to pay for PTAS when the server is already active). Knowing the server's state, each customer needs to decide whether to join the queue or to balk; if the server is on vacation (inactive), a joining customer also has to decide whether to accept or reject PTAS (in the latter case she relies on future customers to activate the service, either by increasing the queue length to N or by adopting PTAS). In summary, arrivals finding the server to be on vacation have three pure strategies: (i) balking; (ii) joining and accepting PTAS; (iii) joining and rejecting PTAS. We assume that customers are risk-neutral and aim to maximize their expected utilities conditional on the system state observed upon arrival. We first study two main information policies: (1) unobservable queue (so customers' behavior relies on their anticipation of the expected mean delay) and (2) observable queue (so customers can make strategic decisions using the real-time queue length). We conduct equilibrium analysis in both cases and study which one provides more benefits from the service provider's perspective. To model the system dynamics as a continuous-time Markov chain (CTMC), we track the two-dimensional process {(B(t), X(t)), t ≥ 0}, where B(t) = 1 (B(t) = 0) if the server is active (inactive) at time t, and X(t) is the total number of customers in the system at t. Under the N-policy, the state space of this CTMC is
$$\{(0, n) : 0 \le n \le N - 1\} \cup \{(1, n) : n \ge 1\}. \tag{1}$$

UNOBSERVABLE QUEUE
In this section, we establish customers' equilibrium strategy when the queue length is unobservable. In Section 3.1, we first study the steady-state system performance under an arbitrary (mixed) strategy. In Sections 3.2-3.4, we fully describe the joint joining-and-purchasing equilibrium strategy.

Preliminaries
Because the server's state is observable, we let $\lambda_0$ and $\lambda_1$ be the effective customer arrival rates when the server is inactive and active, respectively, and let p be the probability that a customer finding an inactive server accepts PTAS. Therefore, customers' strategy can be described by the triplet $(p, q_0, q_1)$, where $q_0 = \lambda_0/\Lambda$ and $q_1 = \lambda_1/\Lambda$. We also define $\rho_0 \equiv \lambda_0/\mu$ and $\rho_1 \equiv \lambda_1/\mu$ as the effective traffic intensities in the two cases.

FIGURE 1 State transition diagram under strategy $(p, q_0, q_1)$

Steady-state performance
The state transition diagram of the CTMC {(B(t), X(t)), t ≥ 0} is depicted in Figure 1. To read Figure 1, consider, for example, state (1, 2). One of two events may occur next: a service completion with rate $\mu$, because the server is active (in which case state (1, 1) follows), or a new customer arrival with rate $\lambda_1$ (in which case state (1, 3) follows). On the other hand, since a joining customer who sees B(t) = 0 activates the server with probability p, the CTMC moves from states (0, 1), (1, 1), and (1, 3) to state (1, 2) with rates $\lambda_0 p$, $\lambda_1$, and $\mu$, respectively. Transitions for other states are similar.
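The transition structure just described can be written down as a small lookup function. The following is a minimal sketch, not the authors' code; the function name `transitions` and its argument names are assumptions made for illustration, and the rule for states (0, n) other than (0, 1) is inferred from the N-policy and the PTAS description in the model section.

```python
def transitions(state, N, p, lam0, lam1, mu):
    """Outgoing rates {next_state: rate} from state = (b, n).

    b = 0: server on vacation, b = 1: server active; n = customers in system.
    lam0 / lam1 are the effective arrival rates seen in each server state.
    """
    b, n = state
    if b == 0:
        if n == N - 1:
            # The joining customer brings the queue to N: the N-policy ends the vacation.
            return {(1, N): lam0}
        out = {}
        if p > 0:
            out[(1, n + 1)] = lam0 * p          # the arrival pays for PTAS
        if p < 1:
            out[(0, n + 1)] = lam0 * (1 - p)    # the arrival joins without paying
        return out
    # Active server: either an arrival or a service completion occurs next.
    out = {(1, n + 1): lam1}
    out[(0, 0) if n == 1 else (1, n - 1)] = mu  # emptying the system starts a vacation
    return out


# State (1, 2) from the text: service completion to (1, 1), arrival to (1, 3).
example = transitions((1, 2), N=4, p=0.3, lam0=0.5, lam1=0.4, mu=1.0)
```

Collecting, for each state, all entries that point into it gives the inflow side of that state's balance equation.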
Let $\pi_{i,n}$ denote the steady-state probability of state $(i, n)$ of the process $(B, X)$; these probabilities satisfy the balance equations of the CTMC in Figure 1. For any given strategy $(p, q_0, q_1)$, the steady-state probabilities and expected queue length, along with other system performance measures, are given below.
Proposition 1 (Steady-state system performance under a given strategy $(p, q_0, q_1)$). Consider an unobservable M/M/1 vacation queue with PTAS. Assume that all customers follow strategy $(p, q_0, q_1)$.
(i) The steady-state probabilities are , x ∧ y ≡ min(x, y), and where .
(iii) The conditional expected waiting times of an arriving customer seeing an inactive server and an active server are
(iv) The system throughput is
The service provider's revenue collected by selling PTAS is

Remark 1 (Decomposition of steady-state queue length). According to (8), the mean steady-state queue length $Q(p, q_0, q_1)$ can be separated into two parts: $Q(q_0, q_1)$ and $Q_N(p)$. The first term $Q(q_0, q_1)$, which is independent of the threshold N and the purchasing probability p, is the mean queue length in a conventional M/M/1 queue with state-dependent arrivals (with arrival rates $\lambda_0$ and $\lambda_1$ when the server is inactive and active, respectively); the second term $Q_N(p)$, which is independent of $\lambda_0$ and $\lambda_1$, can be interpreted as the extra queue length due to the vacation mechanism. This term becomes smaller as p increases (it vanishes if everyone adopts PTAS). Because $Q_N(p)$ plays an important role in characterizing customers' equilibrium strategies, we next establish the structural properties of $Q_N(p)$.
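The qualitative claim in Remark 1, that the mean number in system shrinks as p grows with the arrival rates held fixed, can be checked numerically. The sketch below is an illustration under stated assumptions rather than the closed form of Proposition 1: it truncates the active states at a hypothetical level `K`, uses the transition rates described for Figure 1, and finds the stationary distribution by power iteration on the uniformized chain. All function and parameter names are invented for this example.

```python
def steady_state(N, p, lam0, lam1, mu, K=60, iters=3000):
    """Approximate stationary distribution {(b, n): prob} of the PTAS chain.

    Assumptions: active states are truncated at n = K (needs K > N and
    lam1 < mu so the neglected tail is tiny); iters uniformized steps suffice.
    """
    states = [(0, n) for n in range(N)] + [(1, n) for n in range(1, K + 1)]

    def moves(b, n):
        if b == 0:
            if n == N - 1:
                return [((1, N), lam0)]               # threshold N reached
            return [((1, n + 1), lam0 * p),           # arrival buys PTAS
                    ((0, n + 1), lam0 * (1 - p))]     # arrival joins, no PTAS
        out = [((0, 0) if n == 1 else (1, n - 1), mu)]  # service completion
        if n < K:
            out.append(((1, n + 1), lam1))            # arrival to an active server
        return out

    unif = lam0 + lam1 + mu      # uniformization constant, above every exit rate
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iters):
        new = dict.fromkeys(states, 0.0)
        for s in states:
            stay = pi[s]
            for target, rate in moves(*s):
                flow = pi[s] * rate / unif
                new[target] += flow
                stay -= flow
            new[s] += stay       # probability mass that does not move this step
        pi = new
    return pi


def mean_number_in_system(pi):
    return sum(n * w for (_, n), w in pi.items())


low_p = steady_state(N=4, p=0.2, lam0=0.5, lam1=0.4, mu=1.0)
high_p = steady_state(N=4, p=0.8, lam0=0.5, lam1=0.4, mu=1.0)
# Remark 1 suggests mean_number_in_system(high_p) < mean_number_in_system(low_p).
```

Power iteration is chosen only to keep the sketch dependency-free; a linear solve of the balance equations would serve equally well.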

Remark 2 (Important special cases). To make contact with existing results in the literature, we note that our model is general and covers several previously studied queueing models. If no one adopts PTAS (i.e., p = 0), we have and which coincide with the waiting time formulas for the N-policy vacation queue model (Guo & Li, 2013). On the other hand, if everyone accepts PTAS (i.e., p = 1), we have which are the conditional expected delays of customers finding the server to be inactive and of those finding the server to be active in a standard M/M/1 queue, respectively. Finally, if we take N → ∞, we have lim which degenerate to the delays in a vacation queue with a Bernoulli schedule (Gao & Liu, 2013), where the system can be activated by each arriving customer with a certain probability p.
Using the results derived so far, we can investigate the equilibrium joining-and-purchasing strategies. When all customers adopt strategy $(p, q_0, q_1)$, the expected utilities of an arriving customer seeing an inactive server and an active server are which are increasing and decreasing in $\lambda_0$ and $\lambda_1$, respectively. Then the ex ante expected utility is given by where $(1 - \rho_1)/(1 + \rho_0 - \rho_1)$ and $\rho_0/(1 + \rho_0 - \rho_1)$ are the steady-state probabilities that the server is inactive and active, respectively.
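The two server-state probabilities quoted above are easy to sanity-check: in steady state, the rate at which customers join must equal the rate at which an active server completes services. A minimal sketch follows; the function name is an assumption, and the rate-conservation check is an observation about the formulas rather than a statement taken from the paper.

```python
def server_state_probabilities(lam0, lam1, mu):
    """Return (P(inactive), P(active)) from the expressions in the text."""
    rho0, rho1 = lam0 / mu, lam1 / mu
    if not 0 <= rho1 < 1:
        raise ValueError("stability requires rho1 in [0, 1)")
    denom = 1 + rho0 - rho1
    return (1 - rho1) / denom, rho0 / denom


p_off, p_on = server_state_probabilities(lam0=0.5, lam1=0.4, mu=1.0)
# Rate conservation: joins per unit time equal departures per unit time.
inflow = 0.5 * p_off + 0.4 * p_on
outflow = 1.0 * p_on
```

The same identity shows why these weights do not depend on the purchasing probability p or on the threshold N.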
Although customer arrivals arise from a homogeneous Poisson stream, we assign one of two "labels" to each arrival immediately upon arrival, according to the server state (busy working or on vacation) that the customer observes. In particular, customers seeing an active server (i.e., B(t) = 1), thus assigned the label "$B_1$," need to determine the joining probability $q_1$, while those finding the server on vacation (B(t) = 0), thus assigned the label "$B_0$," first determine their joining probability $q_0$ and next the PTAS purchasing probability p. To characterize customers' best response functions, we let $U_0((p, q_0); (p', q_0', q_1'))$ be the expected utility of a tagged $B_0$-customer who adopts $(p, q_0)$, while assuming that all other $B_0$-customers adopt $(p', q_0')$ and all $B_1$-customers adopt $q_1'$. Similarly, let $U_1(q_1; (p', q_0', q_1'))$ be the expected utility of an individual $B_1$-customer who adopts $q_1$, while assuming that all $B_0$-customers adopt $(p', q_0')$ and all other $B_1$-customers adopt $q_1'$. Below we define a symmetric Nash equilibrium.
Definition 1 (Symmetric Nash equilibrium). A strategy profile $(p^e, q_0^e, q_1^e)$ is a symmetric Nash equilibrium strategy if and only if
$$(p^e, q_0^e) \in \arg\max_{(p, q_0)} U_0\big((p, q_0); (p^e, q_0^e, q_1^e)\big) \quad \text{and} \quad q_1^e \in \arg\max_{q_1} U_1\big(q_1; (p^e, q_0^e, q_1^e)\big). \tag{17}$$
Throughout the paper, we restrict our attention to symmetric Nash equilibria. Similar definitions of state-dependent symmetric equilibria can be found in (3.3) and (3.4) of Wang and Wang (2019).
As will soon become clear in the subsequent analysis, there often exist multiple equilibria. To identify those that are most relevant, we resort to the notion of utility dominance.
In what follows, we adopt the notion of Pareto dominance to identify, among multiple equilibria (whenever they exist), the most efficient equilibrium strategy, that is, the one that maximizes the ex ante expected utility of customers. We first characterize the equilibrium strategy for $B_1$-customers while assuming that the $B_0$-customer strategy is held fixed.

Equilibrium strategy for $B_1$-customers
The following result guarantees the uniqueness of the equilibrium for $B_1$-customers for any given strategy of $B_0$-customers.

Proposition 2. For any given strategy $(p, q_0)$ of $B_0$-customers, the unique equilibrium strategy of $B_1$-customers is given by where and $q_1^e(p)$ is independent of $q_0$.

Proposition 2 indicates that $B_1$-customers' equilibrium joining probability is independent of $q_0$ (it depends only on the PTAS purchasing probability). To see this, note that if the server is already active, no $B_0$-customer will arrive as long as the queue is nonempty, so $B_0$-customers' strategy has no bearing whatsoever on $B_1$-customers' joining behavior. In the subsequent subsections, we develop the equilibrium strategy for $B_0$-customers with that of $B_1$-customers held fixed.

Equilibrium strategy for $B_0$-customers
Because $B_0$-customers make two decisions, the characterization of their equilibrium strategy is less straightforward. We describe our roadmap in three steps. First, we derive the equilibrium joining probability $q_0^e$ for any given p (Section 3.2.1). Next, we obtain the equilibrium PTAS purchasing strategy with all other strategies held fixed (Section 3.2.2). Last, we fully characterize the joint equilibrium strategy, building on the results in Section 3.1 and Section 3.2 (Section 3.3).
It should be noted that in the subsequent analysis, the Pareto-dominance criterion is no longer helpful in distinguishing between two mixed equilibria because both can induce a zero expected utility. In these cases, we turn to the so-called evolutionarily stable strategy (ESS) (Hassin & Haviv, 2003) in the face of multiple mixed equilibria.

Definition 3 (ESS). A two-dimensional equilibrium strategy
is said to be an ESS if U 0 (; ) > U 0 (; ) for all  ≠ .
ESS is useful in excluding unstable mixed equilibria: If an equilibrium is stable, the system dynamics, when facing a small perturbation in customer behavior, are guaranteed to return to that equilibrium point. This is not true for an unstable equilibrium. Following the steps used to establish Proposition 2, we first obtain the best response of $B_0$-customers for a given $q_1 \in [0, 1]$ and investigate its stability.
(i) According to Propositions 2 and 3, $q_0^e$ and $q_1^e$, the equilibrium joining probabilities of $B_0$- and $B_1$-customers, exhibit opposite monotonicity in the arrival rate. Intuitively, when the server is inactive, more frequent arrivals shorten the server's vacation and thereby mitigate congestion, which encourages more customers to join the queue. On the other hand, if the server is already active, a higher congestion level incurs a larger waiting cost, leading more customers to balk.
(ii) For a given p, a larger service reward R attracts more customers to join the queue; hence both $q_0^e(p)$ and $q_1^e(p)$ are increasing in R.

PTAS purchasing strategy
In this subsection, we develop $B_0$-customers' equilibrium purchasing strategy with the arrival rates $\lambda_0$ and $\lambda_1$ (or equivalently $q_0$ and $q_1$) held fixed. As will become clear in the next section, integrating the results in Sections 3.2.1 and 3.2.2 establishes the joint joining-and-purchasing strategy for $B_0$-customers. Note that when N = 1, p = 0 is a dominant strategy because it is never optimal for an arriving customer to accept PTAS. Hence, we focus on the case N ≥ 2. When all other customers adopt the strategy $(p, q_0, q_1)$, consider a "tagged" customer who arrives and finds the system in state (0, i) with 0 ≤ i < N. Suppose she rejects PTAS, and let $N_i$ be the number of additional future arrivals until the server becomes active. Then $N_i$ is a geometric random variable with parameter p truncated at N − i. If the tagged customer adopts PTAS with probability p′, her expected delay is given by (20).
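A truncated geometric variable of the kind used for $N_i$ is simple to tabulate. The sketch below assumes that "truncated at m" means min(G, m), where G is geometric on {1, 2, ...} with success probability p; the paper's exact indexing of the truncation point is not recoverable from the text here, so `m` is left as a free, hypothetical parameter.

```python
def truncated_geometric_pmf(p, m):
    """pmf {k: prob} of min(G, m), with G geometric on {1, 2, ...} (assumed convention)."""
    pmf = {k: (1 - p) ** (k - 1) * p for k in range(1, m)}
    pmf[m] = (1 - p) ** (m - 1)      # all remaining mass sits at the cap
    return pmf


def truncated_geometric_mean(p, m):
    """Mean of min(G, m) by direct summation over the pmf."""
    return sum(k * w for k, w in truncated_geometric_pmf(p, m).items())


# Closed form for comparison: sum_{j=0}^{m-1} (1 - p)^j = (1 - (1 - p)^m) / p for p > 0.
p, m = 0.3, 4
closed_form = (1 - (1 - p) ** m) / p
```

At p = 0 the variable equals the cap m with certainty, which corresponds to a pure N-policy vacation in which only the threshold can restart the server.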

When all other customers adopt strategy $(p, q_0, q_1)$, by the Poisson arrivals see time averages (PASTA) property, the system state probabilities as observed by an arriving customer are identical to the steady-state probabilities, which are given in Proposition 1. By switching from rejecting PTAS to accepting PTAS, the tagged customer can reduce her expected delay cost by an amount denoted $\Delta W_N(p)$.
Therefore, the best response of the tagged customer is to purchase PTAS if and only if $P \le \Delta W_N(p)$. Since the equilibrium strategies largely depend on the structural properties of $\Delta W_N(p)$, we next provide a careful analysis of $\Delta W_N(p)$.
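The purchase rule above turns equilibrium finding into a one-dimensional search once the delay-cost reduction is available as a function of p. The sketch below is illustrative only: `delta_w` is any user-supplied callable standing in for $\Delta W_N(p)$ (the toy functions used here are hypothetical and are not the paper's formula), and the ATC/FTC labels reflect the local slope at a mixed equilibrium, which is a heuristic reading of stability rather than the formal ESS test.

```python
def ptas_equilibria(P, delta_w, grid=1000):
    """Symmetric equilibrium purchase probabilities for fee P.

    A customer buys iff P <= delta_w(p). Interior roots lying exactly on a
    grid point are skipped by the strict sign-change test (sketch limitation).
    """
    gain = lambda p: delta_w(p) - P          # net benefit of buying PTAS
    found = []
    if gain(0.0) <= 0:
        found.append((0.0, "pure: nobody buys"))
    for j in range(grid):
        lo, hi = j / grid, (j + 1) / grid
        g_lo, g_hi = gain(lo), gain(hi)
        if g_lo * g_hi >= 0:
            continue                          # no strict sign change in this cell
        # Decreasing gain: more buyers make buying less attractive (ATC-type).
        kind = "mixed, ATC-type" if g_lo > 0 else "mixed, FTC-type"
        for _ in range(60):                   # bisection on the indifference point
            mid = (lo + hi) / 2
            if gain(lo) * gain(mid) <= 0:
                hi = mid
            else:
                lo = mid
        found.append(((lo + hi) / 2, kind))
    if gain(1.0) >= 0:
        found.append((1.0, "pure: everybody buys"))
    return found


# Hypothetical toy shapes: a decreasing benefit and an increasing benefit.
atc_case = ptas_equilibria(1.3333, lambda p: 2.0 - p)
ftc_case = ptas_equilibria(1.5005, lambda p: 1.0 + p)
```

With the decreasing toy function the search returns a single mixed equilibrium, whereas the increasing one yields both pure equilibria plus a mixed one, mirroring how FTC behavior produces multiple equilibria.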
(i) For any given p

Remark 4 (On structural properties of $\Delta W_N(p)$). Part (i) of Lemma 2 is intuitive because a higher threshold N is more difficult to reach, which makes PTAS more effective in activating the server and thus reducing the delay cost. Using Part (i) of Lemma 2, we can show that $\Delta W_N(p) > 0$ for any N ≥ 2 and p ∈ [0, 1] when the system is stable (i.e., $\rho_1 \in [0, 1)$), because PTAS brings no delay reduction when N = 1 (any joining customer then activates the server). Part (ii) of Lemma 2 characterizes the impact of p on the customers' best responses. At first glance, offering PTAS is similar to offering a "higher priority" to customers who are willing to pay. Nevertheless, unlike priority queueing models, where customers' equilibrium strategy always exhibits FTC behavior (Hassin & Haviv, 1997), Part (ii) of Lemma 2 implies the coexistence of both FTC and ATC behavior. We provide some intuition in the following three cases:
• When N is small (e.g., N = 2), PTAS is beneficial only when an arriving customer finds the system in state (0, 0) (because if the state is (0, 1), the server is automatically activated). When more customers adopt PTAS (i.e., p is larger), the server's vacation time is reduced, so it is more likely for the tagged customer to find an empty system (i.e., $\pi_{0,0}$ increases). In this situation, the adoption of PTAS gives a larger delay reduction for the tagged customer. Therefore, the equilibrium exhibits FTC behavior.
• When N is large (e.g., N ≥ 5), if all other customers choose PTAS with a higher probability, there is a higher chance that the server will be activated by future customer arrivals, making it unnecessary for the tagged customer to pay for PTAS. Therefore, $\Delta W_N(p)$ decreases in p, indicating ATC behavior.
• When N is intermediate (e.g., N = 3, 4), the equilibrium exhibits both FTC and ATC behavior. If p is small, adopting PTAS is an efficient way to reduce the delay, as the probability $\pi_{0,i}$ (0 ≤ i ≤ N − 2) increases in p; but if p is large, so that PTAS is already adopted by many other customers, it becomes less effective for the tagged customer to pay for PTAS.

FIGURE 2 The function $\Delta W_N(p)$ with 0 ≤ p ≤ 1, 2 ≤ N ≤ 6, $\lambda_0 = \lambda_1 = 0.5$, and $C = \mu = 1$
See Figure 2 for a graphical illustration of these three cases.
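The FTC intuition for small N can be checked numerically with a short renewal-reward sketch. This is not the paper's derivation: it assumes that customers who find the server on vacation join at a fixed rate `lam0`, that customers who find it active join at a fixed rate `lam1`, that service is exponential with rate `mu`, and that each customer joining during a vacation pays for PTAS independently with probability `p`. All function and parameter names are illustrative. Under these assumptions every cycle visits state (0,0) exactly once, so the long-run fraction of time spent there is the mean sojourn in (0,0) divided by the mean cycle length.

```python
def expected_vacation_arrivals(p, N):
    """Mean number of customers who join during one vacation.

    Arrival number i + 1 (i = 0, ..., N - 1) happens only if the first i
    joiners all declined PTAS, which has probability (1 - p) ** i.
    """
    return sum((1.0 - p) ** i for i in range(N))


def prob_state_00(p, N, lam0, lam1, mu):
    """Long-run fraction of time in state (0,0): server on vacation, no customers."""
    if not 0.0 <= p <= 1.0 or N < 1 or lam0 <= 0.0 or lam1 >= mu:
        raise ValueError("need 0 <= p <= 1, N >= 1, lam0 > 0 and lam1 < mu")
    k = expected_vacation_arrivals(p, N)
    vacation = k / lam0        # each vacation state lasts 1 / lam0 on average
    busy = k / (mu - lam1)     # mean M/M/1 busy period started by k customers
    return (1.0 / lam0) / (vacation + busy)


if __name__ == "__main__":
    for p in (0.0, 0.5, 1.0):
        print(p, prob_state_00(p, N=2, lam0=0.5, lam1=0.5, mu=1.0))
```

With N = 2, `lam0 = lam1 = 0.5`, and `mu = 1`, the value rises from 0.25 at p = 0 to 0.5 at p = 1, which matches the direction described in the first case above.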
In the presence of multiple equilibria, the ESS is underlined.
Remark 5 (PTAS vs. priority: mixed ESS). First, it is the FTC behavior that gives rise to multiple equilibria (as in Cases (i)-(iii)). In this sense, the structure appears to be somewhat similar to the priority-purchasing strategy in priority queueing models. Nevertheless, a major distinction here is that a mixed strategy, which is never an ESS in priority queues (Hassin & Haviv, 1997), can in fact be an ESS in the present PTAS model (Cases (ii) and (iii) in Proposition 4 when N = 3, 4). Such a result is due to the coexistence of both FTC and ATC; also see Remark 4 and Figure 2.

Joint equilibrium strategy
We are now ready to derive the joint equilibrium of $B_0$ and $B_1$ customers. We denote by the triplet $\sigma = (p^e, q_0^e, q_1^e)$ the joint joining-and-purchasing equilibrium strategy. By Definition 1 and Propositions 2-4, a strategy $(p, q_0, q_1)$ is an equilibrium if and only if it satisfies $p \in p^e(q_0, q_1)$, $q_0 \in q_0^e(p)$, and $q_1 \in q_1^e(p)$, where $q_1^e(p)$, $q_0^e(p)$, and $p^e(q_0, q_1)$ are identified in Propositions 2, 3, and 4, respectively. For $p, q_1 \in [0, 1]$, we call $\mathbf{0}^e \equiv (p, 0, q_1)$ the "zero" equilibrium strategy, under which no customer will join the system (because $q_0^e = 0$ means that no $B_0$-customer joins for service, so the system is never activated regardless of the values of $p^e$ and $q_1^e$). Characterizing the explicit form of equilibrium is challenging because (1) the joining-and-purchasing behavior depends on the values of all model parameters (such as the PTAS fee P and traffic intensity ρ) and (2) multiple equilibria can exist. We first focus on the case 5 ≤ N ≤ Rμ∕C − 1. This case is relatively straightforward because the customer utility is a monotone function when N ≥ 5, which warrants the uniqueness of equilibrium, and the condition N ≤ Rμ∕C − 1 ensures an active service even without PTAS.⁵

Theorem 1 (Joint joining-and-purchasing equilibrium strategy). Consider the unobservable M/M/1 vacation queue with PTAS. Assume 5 ≤ N ≤ Rμ∕C − 1. The joint equilibrium strategy is given below:
where all relevant parameters are given by (S33)-(S35) in the Supporting Information.
In Theorem 1, all nonzero equilibria can be classified into three categories: $p^e = 0$, $p^e \in (0, 1)$, and $p^e = 1$, which correspond to the three regions in the tables. As P increases, the probability of adopting PTAS decreases, and eventually $p^e = 0$ when P is large enough, resulting in a regular vacation queue operated under the N-policy. When P is small, it is never optimal for an arriving customer to wait for future customers to activate the server ($p^e = 1$); this case reduces to a standard M/M/1 queue. When P is large, PTAS is almost never in effect, so the only way to activate the service is by accumulating enough waiting customers. If, in addition, ρ is small (so interarrival times are long), it will take a long time for the server's vacation to end. As a result, the overall delay cost becomes so high that no one is willing to join the queue. See the lower left part of the tables in Theorem 1 with $\sigma = \mathbf{0}^e$.
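The two limiting regimes can be compared through their mean sojourn times. The sketch below uses the textbook formulas for the M/M/1 queue and for the N-policy M/M/1 queue. Unlike the equilibrium model, it assumes that every customer joins, so the arrival rate `lam` is held fixed; the function names are illustrative.

```python
def mm1_sojourn(lam, mu):
    """Mean sojourn time in a standard M/M/1 queue (the regime with p^e = 1)."""
    if not 0.0 < lam < mu:
        raise ValueError("need 0 < lam < mu")
    return 1.0 / (mu - lam)


def n_policy_sojourn(lam, mu, N):
    """Mean sojourn time in an N-policy M/M/1 queue (the regime with p^e = 0).

    The mean number in system is rho / (1 - rho) + (N - 1) / 2 with
    rho = lam / mu; dividing by lam (Little's law) gives the sojourn time.
    """
    if N < 1:
        raise ValueError("need N >= 1")
    return mm1_sojourn(lam, mu) + (N - 1) / (2.0 * lam)


if __name__ == "__main__":
    for lam in (0.05, 0.5, 0.9):
        print(lam, mm1_sojourn(lam, 1.0), n_policy_sojourn(lam, 1.0, N=5))
```

The extra term (N − 1)/(2·lam) grows without bound as lam approaches 0, which is the effect behind the zero equilibrium when the fee is high and the traffic intensity is low.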
We next present a numerical example with Λ = μ = C = 1, N = 2, 5, 0 ≤ R ≤ 8, and 0.1 ≤ P ≤ 0.7 to investigate the impact of the service reward on the equilibrium outcomes. In Figure 3, we graph the equilibrium purchasing probability $p^e$, throughput $\lambda_e^u$, and PTAS revenue $\Pi^u$. We summarize our observations as follows. First, both $p^e$ and $\lambda_e^u$ are (weakly) decreasing in P: as P increases, fewer customers adopt PTAS (i.e., $p^e$ ↓). Consequently, the server's vacation time increases, discouraging future customers from joining the queue (i.e., $\lambda_e^u$ ↓). Next, panels a and b of Figure 3 show that $p^e$ decreases in R.
To see this, we point out that, when R is small, the equilibrium effective arrival rate is small, so the queue size almost never reaches the threshold N. Hence, the PTAS purchasing probability needs to increase in order to achieve an acceptable delay. On the other hand, a bigger R leads to a bigger effective arrival rate, which in turn reduces the individual PTAS purchasing probability. Hence, the PTAS purchasing probability is nonincreasing in R. This result stands in sharp contrast to the equilibrium strategy in priority queues, where customers are more inclined to purchase priority when R increases (Wang et al., 2019). Furthermore, we observe from panel c of Figure 3 that the throughput $\lambda_e^u$ is not always increasing in the service reward R. According to Remark 5, there exist multiple equilibrium purchasing probabilities when N < 5; thus the Pareto-dominant purchasing probability $p^e$ may shift from one equilibrium to another when R varies (this explains why $p^e$ is discontinuous in R, as shown in panel a of Figure 3). In particular, when R is neither too large nor too small, $p^e$ drops from a positive value to 0 as R increases, leading to a surge in vacation times and hence a sharp fall in the throughput. However, when R is sufficiently small or large, the equilibrium PTAS purchasing probability $p^e$ is unique and remains continuous in R. In fact, increasing R leads to two opposite effects: on the one hand, it reduces the PTAS purchasing probability $p^e$, and on the other hand, it attracts more customers to join the queue. In this case, the latter effect outweighs the former, so the throughput $\lambda_e^u$ is increasing in R.

OBSERVABLE QUEUE
In this section, we investigate the M/M/1 vacation queue with PTAS when the real-time queue length is revealed to all arriving customers. Unlike the unobservable case, where customers join the queue with a probability that is independent of the queue length, their joining-and-purchasing decisions are now based on the real-time system state; see Naor (1969). When a customer (if joining) is indifferent between accepting and rejecting PTAS upon arrival, we assume for simplicity that she will choose to pay for PTAS. A similar assumption can be found in observable priority queues; see Wang et al. (2021b). Next, we characterize the equilibrium strategy, and then compute the system performance in equilibrium.
Since state-dependent decisions are made in the observable case, we consider this model with infinitely many decision makers (customers), each facing a state sampled from the state space $\mathcal{S}$, where B denotes "balking," and $J_0$ and $J_1$ denote "joining without purchasing PTAS" and "joining and purchasing PTAS," respectively. Then a pure strategy σ specifies an action $\sigma(s) \in A(s)$ for every $s \in \mathcal{S}$. Using the preceding notation, we next give the formal definition of an SPE strategy; also see Fudenberg and Tirole (1991) and Hassin and Haviv (2002) for discussions of SPE.
Definition 4 (SPE). A strategy $\sigma^e$ is an SPE if $a \in \arg\max_{a \in A(s)} U_s(a; \sigma^e)$ and $a = \sigma^e(s) \in A(s)$ for every $s \in \mathcal{S}$.
To characterize an SPE, one needs to consider all the system states in $\mathcal{S}$; specifically, customers' best responses need to be determined in all scenarios (including transient states). SPE is especially useful in describing the dynamic evolution of the system, for example, when a customer deviates from her optimal strategy for some reason. It should be noted that an SPE specifies the best-response actions in all states, even though not all of them can occur in equilibrium. Hence, an SPE strategy must be an equilibrium strategy, but the converse is not necessarily true (Hassin & Haviv, 2002). To characterize the equilibrium strategy, we first need to identify customers' best responses to every system state, and then refine the unique SPE on the equilibrium path.

Equilibrium analysis
Suppose a tagged customer arrives and finds an active server, along with n existing customers waiting in line (excluding herself), that is, s = (1, n). For any strategy σ, her expected utility from joining is
$$R - \frac{(n + 1)C}{\mu}. \tag{28}$$
Hence, the tagged customer will join the system if and only if R − (n + 1)C∕μ ≥ 0, or equivalently, n < ⌊Rμ∕C⌋,⁶ which gives the Naor threshold strategy (29).

Therefore, it remains to characterize the equilibrium strategy of a customer finding the server to be on vacation (i.e., s = (0, n) for 0 ≤ n ≤ N − 1), which is what we shall do in the rest of this subsection. We consider two cases: (1) P ≤ C∕Λ (low PTAS fee) and (2) P > C∕Λ (high PTAS fee). Suppose the server is inactive when the tagged customer arrives. When P ≤ C∕Λ, that is, when the PTAS fee is no higher than the expected cost of waiting for the next arrival (who may or may not activate the server), it is optimal for the customer to adopt PTAS (i.e., it is not worth waiting for even a single arrival), provided that she decides to join the queue.
If, in addition, R ≥ P + C∕μ, then the tagged customer must join the system because doing so guarantees a nonnegative utility. Hence, under the two conditions P ≤ C∕Λ and R ≥ P + C∕μ, the system reduces to a regular M/M/1 model (the server will be activated by the very first arriving customer in equilibrium). On the other hand, if R < P + C∕μ, joining and adopting PTAS will induce a negative utility, so it is optimal for all arrivals to balk (the system is never active). Below we formally describe the equilibrium strategy when P ≤ C∕Λ.
Theorem 2 (Observable queue with low PTAS fee). Consider an observable M/M/1 vacation queue with P ≤ C∕Λ. The SPE on the equilibrium path is given as follows:
• Low reward: If R < P + C∕μ, then $\sigma^e(0, n) = B$ for n ∈ ℕ.
According to Theorem 2, when the PTAS fee is sufficiently small, any joining customer purchases PTAS (if seeing an inactive server), so the system will be activated by the first arriving customer in equilibrium. Moreover, the system reduces to a standard work-conserving queue in which customers join if and only if the queue length is below some threshold.
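The low-fee case can be summarized as a small decision rule. The sketch below is a hedged illustration rather than the paper's formal statement: it handles the active-server states via the Naor rule and, for the vacation state, only the empty system (0,0), which is the only vacation state reached on the equilibrium path when P ≤ C∕Λ. The function name, argument names, and string labels are illustrative.

```python
def low_fee_decision(server_active, n, R, P, C, mu, Lam):
    """Action of an arriving customer when the PTAS fee is low (P <= C / Lam).

    Returns "B" (balk), "J0" (join without PTAS) or "J1" (join and pay for PTAS).
    n is the number of customers already present, excluding the arrival.
    """
    if P > C / Lam:
        raise ValueError("this sketch covers only the low-fee case P <= C / Lam")
    if server_active:
        # Naor rule: join iff the reward covers the expected delay cost.
        return "J0" if R - (n + 1) * C / mu >= 0 else "B"
    if n != 0:
        raise ValueError("only the on-path vacation state (0, 0) is covered")
    # On vacation: waiting for the next arrival costs at least C / Lam >= P,
    # so a joining customer pays for PTAS; she joins iff R >= P + C / mu.
    return "J1" if R >= P + C / mu else "B"


if __name__ == "__main__":
    print(low_fee_decision(False, 0, R=3.0, P=0.5, C=1.0, mu=1.0, Lam=1.0))
    print(low_fee_decision(True, 3, R=3.0, P=0.5, C=1.0, mu=1.0, Lam=1.0))
```

With R = 3 and C = μ = 1, an arrival joins an active server only while fewer than three customers are present, which is the threshold behavior described above.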

Production and Operations Management
When the PTAS fee is higher than the cost of waiting for one future arrival, the tagged customer's best response has to take into account the behavior of future arrivals. Note that it may be worthwhile to wait for one arrival, but there is no guarantee that this arrival will activate the server. Let I ≡ min{n : nC∕Λ > P} = ⌈ΛP∕C⌉ be the minimum number of future arrivals a tagged customer awaits until her cumulative waiting cost exceeds P. We next present our results for the case P > C∕Λ (where I plays a critical role).
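As a quick check on the quantity I, the following sketch evaluates it directly from the definition min{n : nC∕Λ > P}. The function name is illustrative. Computed this way, the result agrees with ⌈ΛP∕C⌉ whenever ΛP∕C is not an integer.

```python
import math


def min_arrivals_exceeding_fee(P, C, Lam):
    """Smallest integer n >= 1 with n * C / Lam > P.

    n * C / Lam is the expected cost of waiting for n future arrivals
    when interarrival times have mean 1 / Lam.
    """
    if P < 0 or C <= 0 or Lam <= 0:
        raise ValueError("need P >= 0, C > 0 and Lam > 0")
    n = 1
    while n * C / Lam <= P:
        n += 1
    return n


if __name__ == "__main__":
    # Fee of 2.5 with unit waiting cost and unit arrival rate.
    print(min_arrivals_exceeding_fee(2.5, 1.0, 1.0), math.ceil(1.0 * 2.5 / 1.0))
```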
Theorem 3 (Observable queue with high PTAS fee). Consider an observable M/M/1 vacation queue with P > C∕Λ. The SPE on the equilibrium path is given as follows:
where n = N − 1 if R ≥ max{u(i)} and n = ⌊(R − P)μ∕C⌋ − 1 otherwise; and

Remark 6. According to Theorem 3, the SPE on the equilibrium path is of a threshold type: when the server is on vacation, arriving customers join without purchasing PTAS until the queue size reaches the threshold mod(n, I), at which point the system is activated by an arriving customer via PTAS. Afterwards, it remains active, and all future arrivals follow the standard Naor threshold strategy (see (29)). Unlike standard vacation queues, where the service-resumption threshold is exogenous, the threshold of the PTAS queue depends on the system parameters. We provide additional explanations regarding the structure of the threshold mod(n, I) in Section D of the Supporting Information.

System performance
Next, we derive the system performance under the equilibrium strategy in the observable case. In particular, we compute the system throughput and the PTAS revenue, which is the rate at which customers pay for PTAS, multiplied by the PTAS fee P.
Theorem 4 (Throughput and PTAS revenue). Consider an observable M/M/1 vacation queue with PTAS. The steady-state probabilities, throughput, and revenue are given below.
The system throughput and PTAS revenue are given by (1 − )(n 2 + 1) +  n 1 +1 ( −  −n 2 ) and

Remark 7 (Queueing dynamics in equilibrium). The server, whenever on vacation, will be activated as soon as the queue length reaches a certain level. Unlike standard N-policy vacation queues, which have a designated threshold N, the activating threshold of our PTAS queue depends on several model parameters (i.e., R, P, μ, C, and Λ). Specifically, when P is small (Case (i)), it is optimal to purchase PTAS whenever the server is on vacation, and customers will join as long as the queue length is less than the Naor threshold ⌊Rμ∕C⌋, so the model reduces to the Naor model (Naor, 1969). When P is large (Case (ii)), the server remains inactive until an arriving customer finds $n_2$ = mod(n, I) existing customers in the queue, so the model reduces to an N-policy vacation queue with N = $n_2$.

We consider a numerical example to visualize the results in Theorem 4. In Figure 4 we plot the throughput $\lambda_e^o$ and PTAS revenue $\Pi^o$ for N = 2, 5, 10. Intuitively, a bigger R drives more customers to join the system (so a bigger $\lambda_e^o$), making it less necessary for customers to adopt PTAS (so a smaller $\Pi^o$). However, Figure 4 indicates that neither $\lambda_e^o$ nor $\Pi^o$ is monotone in the service reward. Unlike regular vacation queues, where the threshold N is independent of R, the equilibrium threshold of our PTAS queue is a function of the service reward; in particular, the equilibrium threshold $n_2(R)$ = mod(⌊(R − P)μ∕C⌋ − 1, I) is itself not monotone in R. This explains the cyclic "up-and-down" behavior of $\lambda_e^o$ and $\Pi^o$ (see cases N = 5 and N = 10). By contrast, when N = 2, $n_2$ does not vary much ($n_2$ = 1 or 2), so both $\lambda_e^o$ and $\Pi^o$ are monotone in R.
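The non-monotonicity of the threshold is easy to see by tabulating it. The sketch below evaluates mod(⌊(R − P)μ∕C⌋ − 1, I) on a grid of rewards. It covers only the branch in which the floor expression applies, treats the service rate `mu` as an explicit parameter (an assumption about how the formula is normalized; `mu = 1` in the example, so the factor is immaterial there), and computes I directly from its definition. All names are illustrative.

```python
import math


def waiting_horizon(P, C, Lam):
    """I = min{n >= 1 : n * C / Lam > P}, computed from the definition."""
    n = 1
    while n * C / Lam <= P:
        n += 1
    return n


def equilibrium_threshold(R, P, C, mu, Lam):
    """Threshold n_2(R) = mod(floor((R - P) * mu / C) - 1, I)."""
    horizon = waiting_horizon(P, C, Lam)
    # Python's % returns a value in {0, ..., horizon - 1} for a positive modulus.
    return (math.floor((R - P) * mu / C) - 1) % horizon


if __name__ == "__main__":
    for R in (4, 5, 6, 7, 8):
        print(R, equilibrium_threshold(R, P=2.5, C=1.0, mu=1.0, Lam=1.0))
```

For P = 2.5 and C = μ = Λ = 1, the horizon is I = 3 and the threshold cycles through 0, 1, 2, 0, 1 as R runs over 4, ..., 8, which is the kind of sawtooth pattern that can produce the "up-and-down" behavior described above.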

COMPARISONS AND IMPLICATIONS
In this section, we first compare the system performance (e.g., throughput and PTAS revenue) and pricing implications under two information disclosure policies. Next, we benchmark the performance of our PTAS vacation model against that of a regular vacation queue without PTAS. Finally, we study how our PTAS model differs from pay-for-priority queues.

Impact of service reward
Theorem 5. For any fixed PTAS fee P, there exists a threshold R for the service reward such that

In the observable case, the expected customer utility depends on their queueing positions. So a smaller service reward R discourages more customers from joining the queue, leading to a smaller throughput. In contrast, hiding the queueing position becomes an advantage in an unobservable queue because customers make their joining decisions based on the average queue length. As a result, the unobservable setting yields a higher revenue.
In Figure 5, we plot the PTAS revenue under two information policies as a function of the service reward. Consistent with Theorem 5, Figure 5 shows that, when R is small, a higher revenue can be achieved by hiding the queue-length information; on the other hand, when R is large, more customers join the system in the unobservable case, making it less necessary to purchase PTAS (hence a lower PTAS revenue).

Impact of congestion level
Theorem 6. For any fixed PTAS fee P, there exists a threshold on the congestion level Λ below which $\lambda_e^u > \lambda_e^o$.
Results in Theorem 6 are consistent with the general consensus: when the potential arrival rate is sufficiently small, all customers in the unobservable model join the system; but in the observable case, balking can still happen when customers observe a longer queue upon arrival. Next, we proceed to compare the system performance measures under two information levels relative to the case without PTAS. Note that the server in a standard vacation queue can never be activated if R < CN∕μ. To avoid triviality, we focus on the case R ≥ CN∕μ in the rest of this section. Let $\Pi^u(\Lambda)$ ($\Pi^o(\Lambda)$) be the maximum revenue collected from PTAS with demand volume Λ in the unobservable (observable) case. The following result reveals the impact of the congestion level on the system revenue.
Theorem 7. Under both information policies, the PTAS revenue is nonmonotonic in the congestion level Λ, with

At first glance, the fact that the PTAS revenue is not monotonically increasing in the potential demand seems to run counter to conventional wisdom. In fact, the market size Λ impacts the revenue in two opposite directions. On the one hand, increasing Λ helps create bigger customer demand for purchasing PTAS; on the other hand, when Λ is sufficiently large, the high congestion level almost always warrants an active server, which discourages customers from paying for PTAS. When Λ is small, increasing the demand size yields a higher revenue because the first effect dominates. However, when Λ is already large enough, the server hardly has any vacation time, so the second effect prevails.
In the unobservable (observable) case, the revenue reaches its peak at some finite $\Lambda^u$ ($\Lambda^o$). Let $P^u(\Lambda)$ ($P^o(\Lambda)$) be the optimal PTAS fee in the unobservable (observable) case with a demand volume Λ. We have $P^x(\Lambda) \in [0, R - C/\mu]$ for x = u, o; otherwise no one will ever purchase PTAS. The following theorem compares the optimal PTAS fees under the two information policies.
Theorem 8 (Optimal PTAS fee: observable queue vs. unobservable queue). The optimal prices satisfy $P^u(0) = R - C/\mu$ and $P^u(\infty) = 0$. Furthermore, there exists a threshold Λ such that $P^o(\Lambda)$

When the demand volume is sufficiently low, the only way to activate the server is through purchasing PTAS (because the queue length almost never reaches N). This motivates the service provider to raise the PTAS fee in order to increase revenue. We next discuss the case of high demand volume (with a large Λ). In the unobservable model, customers anticipate a low expected delay (because the average system size should not be far below N), so most customers are reluctant to purchase PTAS. As a result, the service provider needs to lower the PTAS fee in order to increase revenue. In the observable case, customers who observe a shorter queue length (due to the stochastic nature of the queueing system) will likely use PTAS to mitigate their waiting costs. This should explain why the observable model has a higher PTAS fee. In Figure 6 we use a numerical example to illustrate the PTAS revenue (left panel) and optimal PTAS fee (right panel) under the two information policies. Consistent with the results in Theorem 8, Figure 6 shows that a higher PTAS fee should be set in the observable case. In addition, the PTAS revenue is unimodal in the demand volume, and the optimal PTAS fee is weakly decreasing in Λ (with $P^o \geq P^u$).

Advantage of PTAS in vacation queues
In this subsection, we investigate how the PTAS mechanism benefits vacation queues. In particular, we provide a comparison of throughput in two models: an M/M/1 vacation queue with PTAS and an N-policy M/M/1 vacation queue without PTAS.

Theorem 9 (PTAS improves throughput). PTAS achieves improved system throughput for the M/M/1 vacation queue in both the observable and unobservable cases.

PTAS improves the system throughput by allowing customers to activate service immediately upon arrival rather than awaiting future arrivals to raise the queue length to level N. Indeed, a customer adopting PTAS not only reduces her own waiting time but also mitigates the delay cost for other customers (those present in the system and those yet to arrive). These effects work collectively to improve the system throughput.
In support of Theorem 9, we give a numerical example to compare the system throughput for vacation models with and without PTAS for different service rewards R (see Figure 7a for the unobservable case and Figure 7b for the observable case). These results not only confirm that PTAS is useful in improving the system throughput but also reveal some additional insights: PTAS is especially effective when the service reward R is relatively small. A bigger service reward attracts more customers to join for service so that queue size N is easily attained, which discourages joining customers from purchasing PTAS (Figure 7a confirms that, in the unobservable case, the system throughput of the PTAS model coincides with that of the vacation model under the N-policy when R is large enough). In addition, Figure 7b shows that the superiority of PTAS remains in effect for both large and small Λ in the observable case.

PTAS versus pay-for-priority
In Section 3, we gave a brief discussion of how the equilibrium strategy of our PTAS queue differs from that of the priority queue. To reiterate, a major distinction is that the priority queue exhibits pure FTC behavior, while PTAS shows a more sophisticated behavior that is a hybrid of both FTC and ATC. To further this discussion, we next compare the optimal revenue (i.e., revenue under the optimal fee) of these two models. We restrict our attention to the unobservable setting (i.e., the server's state is observable but the queue length is not).⁷ Denote by $\Pi^p(\Lambda)$ the optimal revenue from selling priorities. We have the following result.
When the demand volume is low, customers in a priority queue anticipate a smaller expected delay, so they tend not to pay for priority service, whereas in our PTAS model, customers are more inclined to purchase PTAS, because otherwise the server's vacation may last for a longer time. When the demand volume is high, customers are incentivized to mitigate their delay via the purchase of priority, while PTAS becomes less necessary because the queue is already long enough to reach level N (see Theorem 7). See Figure 8 for a numerical example, which shows a distinct structure of the two revenue functions: $\Pi^u$ is unimodal in Λ while $\Pi^p$ is increasing in Λ. In addition, there exists a cutoff point for the market size Λ, below (above) which the PTAS model yields a higher (lower) revenue than the priority model.

CONCLUSION
In this paper, we study the equilibrium performance of a vacation queueing model with strategic customers. Unlike standard vacation queues in the extant literature, where the server's vacation ends whenever the queue length reaches a critical level, we introduce a new mechanism in which a customer, upon finding the server to be on vacation, may choose to pay a fee to end the server's vacation. This mechanism is referred to as PTAS. The ingenuity of PTAS lies in its ability to allow the server (when on vacation) to be activated immediately by arriving customers, which gives customers more active control over the server's state (so earlier arrivals no longer need to passively wait for future customers to reach the critical queue threshold).
In the present model, customers seeing an inactive server need to make two decisions: (i) whether to join the queue and (ii) if yes, whether to pay for PTAS. We investigate customers' equilibrium joining-and-purchasing strategies, and study their responses to this mechanism under three information cases: (i) observable queue and server state, (ii) unobservable queue and observable server state, and (iii) unobservable queue and server state. Our theoretical analysis reveals results that are seemingly contrary to conventional wisdom. For example, due to the coexistence of FTC and ATC behavior, a higher service reward does not always guarantee a higher system throughput. In addition, the PTAS revenue is a nonmonotone function of the demand volume (a higher potential demand may even yield a lower revenue). These findings provide quantitative and qualitative insights into the design of vacation queue systems. We also conduct a careful performance comparison of different information policies.
There are several avenues for future research. One interesting direction is to study customers' rational abandonment behavior in response to the new PTAS mechanism. Another potential topic is to allow the service provider to dynamically adjust the PTAS fee based on the real-time queue length in order to further improve the system revenue.

ACKNOWLEDGMENTS
The authors thank the review team for their constructive comments that helped improve the paper. Zhongbin Wang acknowledges the financial support from the National Natural Science Foundation of China (Grants 72001118 and 72132007).