Optimizing resource allocation in service systems via simulation: A Bayesian formulation

The service sector has become increasingly important in today's economy. To meet rising expectations of high-quality service, efficient resource allocation is vital for service systems to balance service quality with cost. In particular, this paper focuses on a class of resource allocation problems in which the service-level objective and constraints take the form of probabilistic measures. Further, process complexity and system dynamics in service systems often make their performance evaluation and optimization challenging, so that simulation models must be relied upon. To this end, we propose a generalized resource allocation model with probabilistic measures and subsequently develop an optimal computing budget allocation (OCBA) formulation to select the optimal solution subject to random noise in simulation. The OCBA formulation minimizes the expected opportunity cost, which penalizes based on the quality of the selected solution. Further, the formulation takes a Bayesian approach to incorporate prior knowledge and potential performance correlations across candidate solutions. The asymptotic optimality conditions of the formulation are then derived, and an iterative algorithm is developed accordingly. Numerical experiments and a case study inspired by a real-world problem in a hospital emergency department demonstrate the effectiveness of the proposed algorithm for solving the resource allocation problem via simulation.

cies and dynamics. Further, many activities in service systems deal directly with humans rather than machines, thereby introducing various forms of variability and challenges in measuring system performance. In view of such operational complexity, process nonlinearity, and performance variability, this paper focuses on service systems without tractable mathematical structures, which therefore rely on simulation models for system evaluation and optimization.
A notable challenge for the service systems under study is to meet rising expectations on services without overspending on relevant resources. On the one hand, service providers are required or incentivized to meet and improve service quality as measured by certain metrics, such as waiting time (WT) in call centers and hospitals, on-time ratio in online food ordering and delivery platforms, and availability percentage in IT services. These service goals are expressly provisioned in a service contract or perceived as crucial to customer satisfaction and market share. On the other hand, service providers work with a given set of resources, such as personnel with specific skills or training, space and equipment availability, and an operating budget. With the limitation on resources, they are motivated to improve service by intelligently allocating resources in place (e.g., functional areas) and time (e.g., shifts). For example, a hospital can improve care delivery by managing bed assignments and nurse schedules (Best et al., 2015; Kim & Mehrotra, 2015); a retail store can optimize product variety and quantity by properly utilizing storage and shelf spaces (Ton & Raman, 2010). This paper studies a class of service problems where one service level is optimized, and several others are imposed as constraints, subject to given resource limitations.
A case in point is a real-world resource allocation problem in a hospital emergency department (ED) (see Section 3.1 for the formulation and Section 5.2 for the case study). Common resources in an ED include specialized medical staff, beds, medical equipment, and salary budgets. The ED has established a set of achievable service goals as measured by WTs for different categories of patients. While the target service levels on urgent patients are high and satisfiable, the ED has recognized the need to improve the service quality on less-urgent patients. Therefore, it is desirable to seek resource allocation scenarios that improve the "nice-to-have" service goal while continuing to meet all other "must-have" requirements. Furthermore, the imposed service levels on WTs are expressed in the form of probabilistic measures; that is, a service level is defined as the percentage of customers not waiting longer than a given threshold.
Such probabilistic constraints (or chance constraints) are often used in measuring quality of service (Hong et al., 2015), and thus problems with similar structures appear in other service systems. Call centers may operate under a delay-percentile contract, where delays for a certain percentage of contract customers should be within a fixed delay bound, and noncontract customers have no guarantees on delay (Milner & Olsen, 2008). A call center may seek an operator schedule to minimize the delay for noncontract customers while satisfying the requirement for contract customers. Such problems are also prevalent in public service offices, such as governmental departments that issue passports, driver's licenses, and ID cards. They need to determine the staffing level for a specific time slot such that at least a certain percentage of customers would wait less than a given time limit (Taigel et al., 2018). In this paper, we will use the term "resource allocation" for the aforesaid staffing schedules, equipment assignments, and other allocations of resources in place and time.
To study these service systems with probabilistic measures, simulation models can be deployed to evaluate the performance of different resource allocation scenarios so as to identify the best one. However, one notable drawback of performing analysis using simulation is the existence of random noise and thus the need to run a large number of simulation replications in order to evaluate each scenario accurately. With a sizable set of alternatives, it is very time consuming to run sufficient replications for every scenario, and thus an efficient strategy is necessary to search for the optimal one subject to the computing budget (i.e., the total number of simulation replications). This line of research falls under the domain of ranking and selection (R&S). To this end, we will develop an optimal computing budget allocation (OCBA) formulation using the expected opportunity cost (EOC). Further, the proposed formulation will follow the Bayesian framework and impose a correlated prior belief, as treated in Frazier et al. (2009), on the performance of resource allocation scenarios. As such, it utilizes prior knowledge accumulated by the service provider on these scenarios and captures the performance correlation of "neighboring" scenarios.
The remainder of this paper is organized as follows. Section 2 summarizes related literature on applications of stochastic service systems and methodologies of simulation optimization. Section 3 introduces the resource allocation model and the OCBA formulation, with the solution methodology developed in Section 4. Section 5 presents the numerical results, and Section 6 concludes this paper. Proofs of lemmas and theorems, as well as additional analysis and numerical results, are enclosed in the Supporting Information.

LITERATURE REVIEW
In this section, we review relevant literature from both application and methodology viewpoints. As the problem under study considers probabilistic measures, we limit the review to works dealing with stochastic systems.
From the application point of view, most modeling papers studying stochastic resource allocation in service systems focus on problems with tractable mathematical structures. These well-structured problems can be modeled and solved as stochastic optimization problems, such as stochastic programming (Bodur & Luedtke, 2017), robust optimization (Mattia et al., 2017), and Markov decision processes (Huh et al., 2013). These problems typically have one or more stochastic constraints and/or objective, with the others being deterministic. While the stochastic constraints/objective are often in the form of expectations (see, e.g., Bodur & Luedtke, 2017; Gans et al., 2015), some problems study probabilistic constraints like those formulated in this paper. For example, Beraldi et al. (2004) developed a stochastic programming framework to determine the emergency medical service site locations and the number of emergency vehicles, where the service-level constraints are probabilistic constraints; Robbins and Harrison (2010) studied a call center scheduling problem subject to a service constraint that a proportion of calls must be answered within a fixed time.
For stochastic service systems that do not possess tractable structures, simulation has been the tool of choice for analysis. Simulation can be used to evaluate stochastic feasibility requirements, such as passenger processing targets at an airport (Mason et al., 1998) and average WT requirements for patients in EDs (Izady & Worthington, 2012). Simulation can also be combined with optimization procedures to search for the optimal solution, known as simulation(-based) optimization, or optimization via simulation. For example, Atlason et al. (2008) and Cezik and L'Ecuyer (2008) solved call center staffing problems with non-closed-form service-level functions evaluated via simulation; Tsai and Zheng (2013) addressed a two-echelon inventory problem, where one constraint requires the expected response time at each depot to be within a threshold; Guo et al. (2017) and Chen et al. (2020) studied staffing problems in hospital EDs while satisfying stochastic service quality requirements. This paper contributes to the above literature by studying a generalized resource allocation problem in service systems, where two unique structures exist: (1) The performance of neighboring resource allocation scenarios may be correlated, and the modeler can obtain prior knowledge on their joint distribution; and (2) the objective and multiple constraints are stochastic and in the form of probabilistic measures.
To tackle these special structures, this paper contributes to the R&S literature from the methodology point of view, particularly to constrained R&S methods for solving problems with stochastic constraints and/or objective. One stream of research on constrained R&S is based on the sequential indifference-zone (IZ) framework and aims to identify the optimal solution with a probability exceeding a stipulated threshold (Andradóttir & Kim, 2010; Batur & Kim, 2010; Healey et al., 2014; Hong et al., 2015). Another type of formulation is based on the OCBA framework and aims to maximize the probability of correct selection (PCS) (Lee et al., 2012). A heuristic procedure was developed in Choi et al. (2021) by considering the precision of the sample means when maximizing PCS, so as to improve the algorithm performance when large simulation noise exists. Finally, large-deviation approaches have also been investigated, where Hunter and Pasupathy (2013) and Pasupathy et al. (2015) seek to maximize the rate of decay of the probability of false selection (PFS). Gao et al. (2019) further incorporated quadratic regression metamodels in the large-deviation framework to improve the search efficiency.
While these existing methods optimize the PCS or PFS measure, this paper develops a new Bayesian OCBA formulation using the EOC measure, motivated by the aforesaid structures of the resource allocation problem under study. First, the OCBA formulations with the PCS or PFS measure aim to identify the best solution with the maximum probability, treating all nonbest solutions equally. However, for the service systems under study, the industry is more interested in the economic value of the selection than in its statistical significance. In other words, the decision makers are more concerned with the loss incurred in selecting a solution than with the statistical significance of selecting exactly the best one. As such, the OCBA formulation proposed in this paper uses the EOC measure as the objective, which penalizes based on the quality of the selection. While EOC has been used as an objective in unconstrained R&S problems (Chick & Wu, 2005; Gao et al., 2017), it has rarely been studied in the constrained R&S counterparts. Second, in order to capture the prior knowledge on solution performances and the correlation of neighboring solutions, the proposed formulation uses the Bayesian framework. While Pujowidianto et al. (2013) proposed a frequentist version of EOC for a constrained R&S problem, to the best of our knowledge, this is the first Bayesian EOC formulation developed in the constrained R&S literature that accounts for the prior belief and correlation of solutions. Third, the probabilistic measures in the objective and constraints of the problem facilitate the formulation and its theoretical analysis, including a strong finite-time convergence property of the proposed algorithm. We mention that although the probabilistic measures can be written as expectations of an indicator function following a Bernoulli distribution, exploiting the Bernoulli structure makes our method more efficient and robust, as pointed out in Hong et al. (2015).
Finally, this paper is more tangentially related to the literature on stochastically constrained optimization via simulation (COvS) and stochastically constrained Bayesian optimization (CBO). Both COvS and CBO are constrained optimization problems where samples of the objective and constraint measures can be treated as being generated from a black box: the simulation model for COvS and the real system for CBO. In COvS and CBO, the set of candidate solutions is much larger, and search-based heuristic methods need to be developed to find optimal or near-optimal solutions. Such heuristics can be based on penalty functions (Park & Kim, 2015), gradients (Luo & Lim, 2013; Nagaraj & Pasupathy, 2013), random search strategies (Chen et al., 2020; Gao & Chen, 2016), or acquisition functions (Gardner et al., 2014; Letham et al., 2019; Ungredda & Branke, 2021). A valuable future research direction is to extend the methodology developed in this paper to solve COvS and CBO problems.

PROBLEM DESCRIPTION
In this section, we first introduce a resource allocation problem in a hospital and then generalize the model to other service systems. A Bayesian OCBA model is subsequently formulated to efficiently identify the optimal solution of the model via simulation.

A resource allocation model with probabilistic measures
A motivating example of our model is a resource allocation problem in the ED of a large public hospital in Hong Kong. This hospital had large patient flows, with an average of 350 to 430 patients per day arriving at the ED. Sixty-five percent of the patients were walk-in patients, while the others were brought in by ambulance. All patients were labeled by triage categories based on their clinical conditions, from category I to category V, with the expected time limits and desired service levels summarized in Table 1. Patients in category I should be treated immediately with any available resources. The WT and service-level requirements for patients in categories II, III, and IV become less stringent as the urgency level drops, while no requirement is imposed on category V patients alone. The ED also maintains a goal of treating at least 98% of patients in all categories within 4 h, measured by length of stay (LOS). While these requirements were established based on benchmarks and attainability, the focus was mainly on urgent patients. The hospital leadership believed that, with an efficient allocation of resources, it is possible to improve the service level for category IV patients (semiurgent and 48.76% of the total) while meeting all other service requirements. Indeed, different categories of patients undergo different treatment procedures and occupy different medical resources, such as manpower, machines, and spaces. By properly assigning medical staff to each service area and each shift, the ED can find a scenario that improves the WT of category IV patients while maintaining the service for urgent patients. More specifically, define $\mathrm{WT}_2$, $\mathrm{WT}_3$, $\mathrm{WT}_4$, and $\mathrm{LOS}$ as the daily average patient WT for categories II, III, IV, and the daily average LOS for all patients, respectively. Then, the aforementioned problem can be modeled as follows.

$$\min_{x}\ P\{\mathrm{WT}_4(x, \xi) > d_4\} \qquad (1)$$
$$\text{s.t.}\ P\{\mathrm{WT}_2(x, \xi) > d_2\} \le \alpha_2, \qquad (2)$$
$$P\{\mathrm{WT}_3(x, \xi) > d_3\} \le \alpha_3, \qquad (3)$$
$$P\{\mathrm{LOS}(x, \xi) > 4\ \mathrm{h}\} \le 0.02, \qquad (4)$$
$$x \in \Omega, \qquad (5)$$

where $d_h$ denotes the WT limit and $1 - \alpha_h$ the desired service level for the corresponding patient category (Table 1), $\xi$ represents the random noise caused by uncertainties in patient arrivals and service durations, and $x$ is a vector of decision variables defining a staff assignment scenario that satisfies all the deterministic (nonprobabilistic) constraints defined by $\Omega$. For the above example, such deterministic constraints include the total budget constraint on personnel, and the minimum and maximum numbers of specialists required at each service area during each shift (e.g., no more than two nurses but no less than one nurse at the admission desk). Note that although the service level of category I patients is not explicitly modeled in the above formulation, any incoming category I patient should be treated immediately with available resources, thereby reducing the resource availability for other patients and their attendant service levels.
The aforementioned decision problem can be generalized to other service systems, such as call centers and public service offices. To this end, we define a stochastic performance measure $F_h(x, \xi)$ for $h = 0, 1, \dots, H$, where $H$ is the total number of constraints and $0$ indicates the objective function. Specifically, each $F_h(x, \xi)$ is a stochastic metric given a resource allocation scenario $x$. For example, $F_h(x, \xi)$ may capture the average daily WT or peak daily LOS in the ED as required, where the observed value of this metric varies due to the random noise $\xi$. Correspondingly, each performance goal is defined as $d_h$ for $h = 0, 1, \dots, H$, and the desired service level as $1 - \alpha_h$ for $h = 1, \dots, H$. Without loss of generality, we assume that (1) the lower $F_h(x, \xi)$ is, the better the performance; and (2) the higher $1 - \alpha_h$ is, the better the service level the system provides. Hence, the generalized resource allocation problem for service systems can be formulated as follows:

$$\mathcal{P}:\quad \min_{x}\ P\{F_0(x, \xi) > d_0\} \qquad (6)$$
$$\text{s.t.}\ P\{F_h(x, \xi) > d_h\} \le \alpha_h, \quad h = 1, \dots, H, \qquad (7)$$
$$x \in \Omega, \qquad (8)$$

where $\Omega$ is a finite discrete solution space of $x$, determined by the deterministic constraints of the service system, such as budget and capacity constraints. In the remainder of this paper, we will focus on the problem $\mathcal{P}$ defined by Equations (6)-(8), where $\Omega$ is finite, and will refer to a resource allocation scenario $x$ as a solution of $\mathcal{P}$.
A simulation model will be used to evaluate the above probabilistic measures, $P\{F_h(x, \xi) > d_h\}$, for every solution $x$. Since most simulation models are expensive and time consuming to run and the decision may need to be made within a given time period, the challenge of solving problem $\mathcal{P}$ using a simulation model in this context is that of allocating the limited computing budget so as to select the optimal solution under observation noise, thereby falling under the OCBA regime. As mentioned in Section 2, the characteristics of problem $\mathcal{P}$ warrant a new Bayesian OCBA formulation using the EOC measure, to be introduced in the sequel.
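To make this concrete, the following minimal sketch estimates a probabilistic measure $P\{F(x, \xi) > d\}$ by a sample proportion over simulation replications; `simulate_once` is a hypothetical stand-in for one replication of the service-system simulator, and the exponential surrogate in the usage example is only illustrative.

```python
import numpy as np

def estimate_service_measure(simulate_once, x, d, n_reps, rng):
    """Estimate P{F(x, xi) > d} by Monte Carlo over n_reps simulation replications.

    simulate_once(x, rng) is assumed to return one realization of the performance
    measure F(x, xi), e.g., a daily average waiting time for a staffing scenario x.
    """
    exceed = 0
    for _ in range(n_reps):
        f = simulate_once(x, rng)      # one simulation replication
        exceed += (f > d)              # Bernoulli indicator 1{F > d}
    return exceed / n_reps             # sample proportion estimating p = P{F > d}

# Usage with a toy surrogate for the simulator (exponential "waiting times"):
rng = np.random.default_rng(0)
toy_sim = lambda x, rng: rng.exponential(scale=1.0 / (1.0 + x))
p_hat = estimate_service_measure(toy_sim, x=2.0, d=1.0, n_reps=10_000, rng=rng)
print(f"estimated exceedance probability: {p_hat:.3f}")
```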

A Bayesian OCBA formulation
Suppose there is a total of $K$ alternative solutions (resource allocation scenarios) in $\Omega$, and $x_i$ represents one such solution for $i \in \Theta = \{1, \dots, K\}$. For a given $x_i$, each simulation replication $j$ provides one estimate, $F_h(x_i, \xi_{ij})$, for each performance measure $h$, where $\xi_{ij}$ is the random noise observed in the $j$-th replication. Here, we assume that for any solution $x_i$ and probabilistic measure $h$, the $F_h(x_i, \xi_{ij})$'s are identically distributed for $j = 1, 2, \dots$ Further define an indicator function
$$X_{hij} = \mathbf{1}\{F_h(x_i, \xi_{ij}) > d_h\}. \qquad (9)$$
It is seen that $X_{hij}$ follows a Bernoulli distribution with success probability $p_{hi} = P\{F_h(x_i, \xi) > d_h\}$. Apparently, $p_{hi}$ is unknown in practice and can be estimated by its sample average $\hat{p}_{hi}$. To solve problem $\mathcal{P}$ via simulation given the computing budget $T$, one needs to determine the number of replications allocated to evaluate each solution $x_i$ (denoted by $N_i$), so as to best identify the true optimal solution $x_b$, where $b = \arg\min_{\{i \in \Theta:\ p_{hi} \le \alpha_h,\ h = 1, \dots, H\}} p_{0i}$.

To capture the prior knowledge on solutions and the correlated samples of neighboring solutions, we proceed to develop an OCBA formulation from a Bayesian perspective. Specifically, since $p_{hi}$ is practically unknown, the algorithm starts with a prior belief over each $p_{hi}$ and updates its posterior distribution as simulation samples are gathered. To this end, let $\Psi_{hi}$ denote the Bayesian random variable associated with $p_{hi}$, let $\psi_h = (\psi_{h1}, \dots, \psi_{hK})$, and let $\pi_h^t(\psi_h)$ be the Bayesian belief over $(p_{h1}, \dots, p_{hK})$ for service level $h$ after $t$ simulation replications are gathered. Thus, $\pi_h^0(\psi_h)$ and $\pi_h^T(\psi_h)$ represent the prior distribution before simulation and the posterior distribution after the computing budget $T$ is exhausted, respectively. Note that $\pi_h^0(\psi_h)$ allows $\Psi_{hi}$ and $\Psi_{hi'}$ to be correlated based on prior knowledge on $x_i$ and $x_{i'}$. Here, we first make a mild assumption on the prior distribution (Russo, 2020) in Assumption 1.
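For intuition, the sketch below shows the simplest, uncorrelated special case: if each $p_{hi}$ is given an independent Beta prior, the posterior under the Bernoulli indicator samples is conjugate and its mean is available in closed form. This is only an illustration; the correlated prior used in this paper does not admit such a closed form. The Beta(1, 1) default and the toy indicator data are assumptions for the example.

```python
import numpy as np

def beta_posterior_mean(successes, n, a0=1.0, b0=1.0):
    """Posterior mean of a Bernoulli success probability under a Beta(a0, b0) prior.

    After observing `successes` ones out of `n` indicator samples X_hij, the
    posterior is Beta(a0 + successes, b0 + n - successes); its mean serves as an
    estimate of p_hi in this uncorrelated special case.
    """
    return (a0 + successes) / (a0 + b0 + n)

# Toy indicator samples X_hij = 1{F_h(x_i, xi_ij) > d_h} for one (h, i) pair
rng = np.random.default_rng(1)
x_samples = rng.random(50) < 0.12          # indicators drawn with true p_hi = 0.12
p_hat = beta_posterior_mean(x_samples.sum(), x_samples.size)
print(f"posterior mean estimate of p_hi: {p_hat:.3f}")
```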
Assumption 1. The prior joint distributions $\pi_h^0(\psi_h)$ ($h = 0, 1, \dots, H$) are continuous and uniformly bounded away from $0$ and $\infty$; that is, there exist constants $0 < c \le C < \infty$ such that $c \le \pi_h^0(\psi_h) \le C$ for all $\psi_h \in [0, 1]^K$.

Remark 1. Assumption 1 is a mild assumption that covers a large class of distributions, such as the multivariate normal distribution (Howard, 1998) that we will use in the numerical experiments in Section 5, and, more generally, priors built from any continuous function $r(\cdot)$ that satisfies Assumption 1. When no prior knowledge of the system is available, the modeler can use an uncorrelated uniform prior instead of the correlated prior, or estimate the prior distribution by maximum likelihood through initial samples (Rasmussen & Williams, 2006). In addition, it should be noted that the lower bound $c$ in Assumption 1 does not have to hold over the entire region $[0, 1]^K$ for $\psi_h$. In fact, we can relax this condition and only require $c \le \pi_h^0(\psi_h)$ to hold on certain subregions of $[0, 1]^K$ depending on $h$ and the solution $x_i$. This will become more apparent in the proof of Theorem 1, and we only mention this fact here without stating the detailed conditions. However, since such tighter conditions would overcomplicate the elaboration of Lemma 2, we prefer the succinct statement in Assumption 1 and will stick to it throughout the paper.

Let $N_i^{(t)}$ be the number of simulation replications allocated to solution $x_i$ when $t$ replications have been gathered, and let $L_t(\psi_{hi})$ be the corresponding likelihood function (of the observations). Note that for the Bernoulli indicator samples,
$$L_t(\psi_{hi}) = \psi_{hi}^{\,s_{hi}(t)}\,(1 - \psi_{hi})^{\,N_i^{(t)} - s_{hi}(t)}, \quad \text{where } s_{hi}(t) = \sum_{j=1}^{N_i^{(t)}} X_{hij}.$$
The posterior distribution can be updated according to Bayes' rule as follows:
$$\pi_h^t(\psi_h) = \frac{\pi_h^0(\psi_h) \prod_{i=1}^{K} L_t(\psi_{hi})}{\int_{[0,1]^K} \pi_h^0(\psi_h) \prod_{i=1}^{K} L_t(\psi_{hi})\, d\psi_h}. \qquad (10)$$
The posterior distribution in Equation (10) may not have a closed form if the prior distribution is correlated. In this case, we need numerical integration methods to evaluate the integrals in Equation (10). For high-dimensional integrals, which arise when the total number of solutions $K$ is large, Monte Carlo estimation and importance sampling can provide robust estimates with high accuracy.
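As a concrete illustration of such numerical integration, the sketch below approximates the posterior means $\mathbb{E}^t[\Psi_{hi}]$ by self-normalized importance sampling, using the prior itself as the proposal distribution. This is a minimal sketch, not the paper's exact computational procedure; `prior_sampler` is a hypothetical routine that draws from the (possibly correlated) prior, and `counts` summarizes the observed indicator data.

```python
import numpy as np

def posterior_means_via_is(prior_sampler, counts, n_draws=20_000, rng=None):
    """Approximate posterior means E[Psi_hi | data] for one measure h via importance sampling.

    Draws psi ~ prior (proposal), weights each draw by the Bernoulli likelihood of
    the observed indicator counts, and returns the self-normalized weighted average.
    prior_sampler(n, rng) returns an (n, K) array of prior draws of (psi_h1, ..., psi_hK);
    counts is a list of (successes_i, n_i) pairs, one per solution.
    """
    rng = rng or np.random.default_rng()
    psi = prior_sampler(n_draws, rng)                        # (n_draws, K) prior draws
    log_w = np.zeros(n_draws)
    for i, (s_i, n_i) in enumerate(counts):                  # Bernoulli log-likelihood
        log_w += s_i * np.log(psi[:, i]) + (n_i - s_i) * np.log(1.0 - psi[:, i])
    w = np.exp(log_w - log_w.max())                          # stabilize before normalizing
    w /= w.sum()
    return w @ psi                                           # posterior mean per solution

# Toy usage with an independent uniform prior over K = 3 solutions:
prior = lambda n, rng: rng.uniform(0.01, 0.99, size=(n, 3))
means = posterior_means_via_is(prior, counts=[(2, 20), (5, 20), (9, 20)])
print(np.round(means, 3))
```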
We are now ready to introduce the OCBA formulation that minimizes the Bayesian EOC. More specifically, based on $T$ simulation samples, the selected solution $x_{\hat b}$ is the one estimated to have the minimum posterior mean in the objective function while satisfying all constraints; that is,
$$\hat b = \arg\min_{\{i \in \Theta:\ \mathbb{E}^T[\Psi_{hi}] \le \alpha_h,\ h = 1, \dots, H\}} \ \mathbb{E}^T[\Psi_{0i}],$$
where $\mathbb{E}^T[\Psi_{hi}]$ is the posterior mean. The Bayesian EOC, $\mathrm{EOC}_{\mathrm{Bayes}}$, is then defined in Equation (12).

Here, the opportunity cost terms $OC_{\hat b}^0$ and $OC_{\hat b}^h$ are specified in Equation (13), and the weight $0 < \lambda_h < \infty$ for $h = 1, \dots, H$ reflects the unit cost of violating constraint $h$ relative to the unit cost of the objective. Specifically, the opportunity cost $OC_{\hat b}^0$ is the penalty on the objective of the selected solution $x_{\hat b}$, and the cost $OC_{\hat b}^h$ penalizes the loss incurred if $x_{\hat b}$ violates the constraint on service level $h$. Note that the arg min defining $\hat b$ may not exist, that is, no solution may be estimated to be feasible; a solution $x_{\hat b}$ still needs to be recommended in this case. Hence, we set the attendant opportunity cost to 1 to penalize such a selection, noting that the opportunity cost $\max\{\Psi_{0\hat b} - \Psi_{0b}, 0\} \le 1$ when $\hat b$ exists. A large weight $\lambda_h$ can be imposed to penalize the selection of an infeasible solution without affecting the asymptotic results to be derived later. Subsequently, the Bayesian OCBA formulation of problem $\mathcal{P}$ is to allocate the computing budget $N_1, \dots, N_K$ (with $\sum_{i=1}^{K} N_i = T$) so as to minimize $\mathrm{EOC}_{\mathrm{Bayes}}$. We assume that in problem $\mathcal{P}$ the constraint measures are not exactly equal to the constraint limits on the right-hand side; that is, $p_{hi} \ne \alpha_h$ for all constraints and all solutions. This is a common assumption made in the OCBA literature, which ensures that $\mathrm{EOC}_{\mathrm{Bayes}}$ approaches 0 as the computing budget increases.
The - formulation distinguishes from other OCBA models, such as the one in Lee et al. (2012) for constrained R&S, not only in the use of the EOC objective, but also in capturing the performance correlation of solutions rather than assuming independence in solution measures.Intuitively speaking, capturing such correlation enables more effective learning of solution performances from samples, since the performance of a solution is not only learned using its own samples, but also using samples from its neighboring solutions.On the other hand, it brings challenges in developing theoretical results, which will be addressed next.

SOLUTION METHODOLOGY
Although $\mathrm{EOC}_{\mathrm{Bayes}}$ in Equation (12) is very complex to analyze (Chick et al., 2010), we will show that as $T \to \infty$, $\hat b$ exists almost surely, and $\mathrm{EOC}_{\mathrm{Bayes}}$ is well defined and decays at a deterministic rate. Therefore, we will next solve the Bayesian OCBA formulation by optimizing the approximated rate of decay of $\mathrm{EOC}_{\mathrm{Bayes}}$.

Rate of decay of Bayesian EOC
Considering the definition of the selected best $\hat b$, $\hat b$ does not always exist due to simulation noise, even if the optimal solution exists for problem $\mathcal{P}$. In particular, if $N_i$ remains fixed ($N_i < \infty$) for some solution $x_i$, the set of solutions estimated to be feasible, $\{i \in \Theta:\ \mathbb{E}^T[\Psi_{hi}] \le \alpha_h,\ h = 1, \dots, H\}$, could be empty. To avoid this unfavorable situation, we make the following assumption (Assumption 2).

Remark 2. In fact, under Assumption 2, we will show in the sequel that $\hat b$ exists given that $\mathcal{P}$ is feasible; in contrast, an allocation with $\lim_{T \to \infty} \frac{1}{T} \log \mathrm{EOC}_{\mathrm{Bayes}} = 0$ is an inferior allocation compared to the allocation under Assumption 2.
To facilitate the analysis, we first divide the set of indices of the nonbest solutions into two exhaustive and mutually exclusive sets, $\Theta_O$ and $\Theta_F$, such that $\Theta = \Theta_O \cup \Theta_F \cup \{b\}$. $\Theta_O$ is the set of indices of nonbest solutions that are feasible, whereas $\Theta_F$ is the set of indices of nonbest solutions with at least one constraint in Equation (7) violated. Further, define the set of constraints violated by any solution $x_i$ as $\Pi_i = \{h \in \{1, \dots, H\}:\ p_{hi} > \alpha_h\}$. Furthermore, for any $i \in \Theta_F$, we define $h_i^* = \arg\max_{h \in \Pi_i} (p_{hi} - \alpha_h)$ as the "most violated" constraint. Lemma 1 then provides an approximation and upper bound of $\mathrm{EOC}_{\mathrm{Bayes}}$ (called $\mathrm{AEOC}_{\mathrm{Bayes}}$).
Lemma 1. $\mathrm{EOC}_{\mathrm{Bayes}}$ in Equation (12) is bounded from above by the following $\mathrm{AEOC}_{\mathrm{Bayes}}$ almost surely:

Proof. The proof is given in the Supporting Information EC.1.1 to this paper. □

Remark 3. Although $\mathrm{AEOC}_{\mathrm{Bayes}}$ is an upper bound of $\mathrm{EOC}_{\mathrm{Bayes}}$, it is constructed to be as tight as possible so that it closely approximates $\mathrm{EOC}_{\mathrm{Bayes}}$. The proof of Lemma 1 in the Supporting Information EC.1.1 shows why this bound is tight: the Bonferroni inequality is applied to reach Equation (EC.4), and a subspace is constructed to provide a tight bound on the EOC in Equation (EC.5) for both cases of $i \in \Theta_O$ and $i \in \Theta_F$. Further, $\mathrm{AEOC}_{\mathrm{Bayes}}$ becomes tighter as the total simulation budget becomes larger, and $\mathrm{AEOC}_{\mathrm{Bayes}} \to 0$ as $T \to \infty$ (Russo, 2020).
We now proceed to analyze the rate of decay of $\mathrm{AEOC}_{\mathrm{Bayes}}$. We first present Lemma 2 and Lemma 3, and then develop Theorem 1 on the rate of decay using the lemmas.

Proof. The proof of Lemma 2 is given in the Supporting Information EC.1.2 to this paper. □

Lemma 3 (Ganesh et al. (2004)). Consider positive sequences $a_j(n)$, $j = 1, \dots, m$. If $\lim_{n \to \infty} \frac{1}{n} \log a_j(n)$ exists for each $j$, then $\lim_{n \to \infty} \frac{1}{n} \log \sum_{j=1}^{m} a_j(n) = \max_{j} \lim_{n \to \infty} \frac{1}{n} \log a_j(n)$.

Theorem 1. The rate of decay of $\mathrm{AEOC}_{\mathrm{Bayes}}$ in Equation (16) is bounded by the following AEOC-B, given in Equation (19), where $p_{0b} < c_i < p_{0i}$ for any $i \in \Theta_O$, and $I_{hi}(x)$ is given by Equation (18) for any $h = 0, 1, \dots, H$ and $i = 1, \dots, K$.
Proof. By Equation (16) and Lemma 3, we obtain Equations (20)-(22). Consider the term in Equation (20) for any $h$. Next, consider the term in Equation (21) for any $i \in \Theta_O$. For the domain $\{\psi_0:\ \psi_{0i} < \psi_{0b}\}$, there exists a $\delta > 0$ such that $\delta \le \psi_{0b} - \psi_{0i} \le 1$. Following a proof similar to the above, we obtain Equation (25). Considering the integrand on the right-hand side of Equation (26), define a constant $c_i$ with $p_{0b} < c_i < p_{0i}$.

From Lemma 2, the limit of the corresponding posterior integral over $[a_1, a_2]$ exists for any $0 \le a_1 < a_2 \le 1$. Thus, by Equation (26), Equation (27), and Lemma 3, we obtain Equation (28). Substituting Equations (25) and (28) into Equations (20)-(22), we arrive at Equation (29). Applying Lemma 2 to the right-hand side of Equation (29), Equation (19) readily follows, since $I_{hi}(x)$ is a convex function with its minimum achieved at $x = p_{hi}$. □

We mention that the only approximation used in Theorem 1 to arrive at AEOC-B is Equation (28). The value of $c_i$ should depend only on $p_{0b}$ and $p_{0i}$ and affects the tightness of AEOC-B. We will discuss the choice of $c_i$ in the Supporting Information EC.2 to this paper.

Optimality conditions
Since neither minimizing $\mathrm{EOC}_{\mathrm{Bayes}}$ in the Bayesian OCBA formulation nor maximizing its rate of decay is mathematically tractable, we instead optimize the bound on the rate of decay of $\mathrm{AEOC}_{\mathrm{Bayes}}$, that is, AEOC-B defined on the right-hand side of Equation (19). Letting $z$ denote the minimum of the decay-rate terms appearing in AEOC-B, the resulting problem can be expressed as in Equations (30)-(35). This is a linear optimization problem, whose optimal solution, stated in Theorem 2, provides an asymptotic optimality condition for the Bayesian OCBA formulation.
Theorem 2. Let $\eta_i$ be defined as in Equation (36), where $I_{hi}(\cdot)$ is given by Equation (18). As $T \to \infty$, AEOC-B is minimized if the condition in Equation (37) holds, and thus the asymptotically optimal sample allocation strategy is given by Equation (38).

Proof. Since $\omega_i \doteq \lim_{T \to \infty} N_i / T$ exists by Assumption 2 and $\sum_{i=1}^{K} \omega_i = 1$ with $\omega_i \ge 0$, we can treat the $\omega_i$ as continuous variables between 0 and 1. Therefore, the problem in Equations (30)-(35) becomes a linear program with continuous variables. As such, the optimal solution is obtained when the right-hand sides of all constraints (31)-(33) are equal. To see why, $z$ can only be decreased further from this solution if all $\omega_i$ for $i \in \Theta$ are decreased simultaneously so as to decrease all right-hand sides in Equations (31)-(33). However, decreasing one $\omega_i$ will increase another because of Equation (34). Hence, the equal-rate condition can be expressed through the definition in Equation (36). Note that $I_{hi}(\psi_{hi})$ in Equation (18) is a convex function with its minimum value of 0 achieved at $\psi_{hi} = p_{hi}$. By the definition of the problem, $p_{hi} \ne \alpha_h$ for all $h$ and $i$, and $p_{0b} < c_i < p_{0i}$ by choice, so $\eta_i > 0$. Thus, the optimality condition in Equation (37) readily follows, and so does the asymptotically optimal sample allocation strategy in Equation (38), considering the constraint (34). □

Intuitively speaking, the simulation replications allocated to solution $x_i$ should be proportional to the inverse of its rate function $\eta_i$, as given by Equation (38). Observe that the rate function in Equation (18) is a convex function with its minimum achieved at $\psi_{hi} = p_{hi}$: the closer $\psi_{hi}$ is to $p_{hi}$, the smaller $I_{hi}(\psi_{hi})$ is. Therefore, the intuition is that we should increase the sample size for a solution if either its optimality or its feasibility is harder to detect. More specifically, the sample size allocated to solution $x_i$ should become larger as $c_i$ gets closer to $p_{0i}$ for any $i \in \Theta_O$, and as the constraint violation becomes less substantial for any $i \in \Theta_F$; for the best solution $x_b$, its sample size should be increased if one of the attendant service levels $p_{hb}$ is close to the required service level $\alpha_h$, or if one of the $c_i$ gets close to $p_{0b}$. This intuition is consistent with the standard OCBA allocation strategy for normally distributed random variables in Chen et al. (2000), where more samples are allocated to solutions whose optimality is harder to decide, or whose variance is larger.
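Since Equations (18), (36), and (38) are not reproduced above, the following sketch is only a hedged illustration of the allocation logic described in the preceding paragraph: it assumes the rate function is the Bernoulli Kullback-Leibler divergence (which matches the stated properties of Equation (18): convex with minimum 0 at $p_{hi}$) and assumes a form of $\eta_i$ suggested by the qualitative discussion, namely an optimality rate for $i \in \Theta_O$, a most-violated-constraint rate for $i \in \Theta_F$, and the minimum of the relevant rates for the best solution. The paper's exact Equations (18) and (36) may differ.

```python
import numpy as np

def bernoulli_kl(x, p):
    """Assumed rate function I(x): KL divergence between Bernoulli(x) and Bernoulli(p)."""
    x = np.clip(x, 1e-12, 1 - 1e-12)
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return x * np.log(x / p) + (1 - x) * np.log((1 - x) / (1 - p))

def allocation_ratios(p, alpha, b, theta_O, theta_F, c=None):
    """Hypothetical per-solution rates eta_i and allocation N_i proportional to 1/eta_i.

    p is an (H+1, K) array of success probabilities with row 0 the objective measure;
    alpha is the list of constraint tolerances. The forms below mirror the qualitative
    discussion after Theorem 2 and are assumptions, not the paper's Equation (36).
    """
    H1, K = p.shape
    if c is None:
        c = (p[0, b] + p[0]) / 2.0                   # recommended midpoint choice of c_i
    eta = np.ones(K)
    for i in theta_O:                                # feasible nonbest: optimality rate
        eta[i] = bernoulli_kl(c[i], p[0, i])
    for i in theta_F:                                # infeasible: most-violated constraint rate
        h_star = max(range(1, H1), key=lambda h: p[h, i] - alpha[h - 1])
        eta[i] = bernoulli_kl(alpha[h_star - 1], p[h_star, i])
    rates_b = [bernoulli_kl(alpha[h - 1], p[h, b]) for h in range(1, H1)]
    rates_b += [bernoulli_kl(c[i], p[0, b]) for i in theta_O]
    eta[b] = min(rates_b) if rates_b else 1.0        # best solution: hardest comparison
    inv = 1.0 / np.maximum(eta, 1e-12)
    return inv / inv.sum()                           # fraction of the budget per solution

# Toy usage: K = 3 solutions, one constraint with tolerance 0.1
p = np.array([[0.10, 0.15, 0.30],                    # objective exceedance probabilities
              [0.05, 0.20, 0.02]])                   # constraint measure
print(np.round(allocation_ratios(p, alpha=[0.1], b=0, theta_O=[2], theta_F=[1]), 3))
```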
Theorem 2 provides a simple closed-form formula for allocating the total computing budget to all solutions such that AEOC-B is asymptotically optimized. That is, such an allocation strategy optimizes the Bayesian EOC of selecting the estimated best solution $\hat b$ given a sufficiently large computing budget. The allocation strategy in Theorem 2 may not be optimal when $T$ is relatively small; however, its performance improves as $T$ becomes larger.
Remark 4. It may be of interest to consider two degenerate cases of problem $\mathcal{P}$. The first case is an unconstrained version of problem $\mathcal{P}$, in which the stochastic constraints in Equation (7) no longer exist. In this case, Theorem 2 still holds, except that $H = 0$, $\Theta_F = \emptyset$ (there is no infeasible solution for the unconstrained problem), and $\Theta_O = \Theta \setminus \{b\}$, resulting in a degenerate case of Equation (36). The second case is a feasibility-detection variant of problem $\mathcal{P}$, in which the objective function in Equation (6) no longer exists. The goal is to find all feasible solutions meeting the service targets. Let $\mathcal{F} = \{i \in \Theta:\ p_{hi} \le \alpha_h \text{ for } h = 1, \dots, H\}$ represent the set of true feasible solutions and let $\hat{\mathcal{F}} = \{i \in \Theta:\ \mathbb{E}^T[\Psi_{hi}] \le \alpha_h \text{ for } h = 1, \dots, H\}$ represent the set of feasible solutions estimated from the samples. The Bayesian EOC for this feasibility-detection problem, $\mathrm{EOC}_{f:\mathrm{Bayes}}$, can be defined analogously. Using an analysis similar to Theorem 1, we can show that the rate of decay of $\mathrm{EOC}_{f:\mathrm{Bayes}}$ is bounded as in Equation (43). Maximizing Equation (43), the asymptotically optimal sample allocation strategy has the same form as Equation (38), where $\eta_i$ in Equation (36) is replaced by the corresponding feasibility-only rate.

Iterative algorithm and convergence analysis
Recall that $\eta_i$ in Theorem 2 is computed using the true success probabilities of the Bernoulli variables, $p_{hi}$. In practice, $p_{hi}$ is unknown but can be estimated by the posterior mean. Therefore, we propose an iterative algorithm (Algorithm 1) that implements Theorem 2 to sequentially allocate simulation replications to the solutions under consideration.
In Step 5 of Algorithm 1, for $i \in \hat{\Theta}_O$, $\hat{c}_i$ can be chosen between $\hat{p}_{0\hat b}$ and $\hat{p}_{0i}$, and we will show in the Supporting Information EC.2 that $\hat{c}_i = (\hat{p}_{0\hat b} + \hat{p}_{0i})/2$ is a recommended choice. Further, the parameter $\eta_i$ is strictly positive by definition, and thus $\hat{\eta}_i^r$ is bounded below by a sufficiently small positive number in Step 5. Finally, in Step 6, it may be necessary to round the values computed by Equation (47) to integers, for example, by preserving the total sum while minimizing the maximum relative deviation incurred by the rounding.
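One simple way to realize such a rounding is the largest-remainder rule sketched below; this is an assumption for illustration, since the text only requires that the total sum be preserved while keeping the deviation from the fractional targets small.

```python
import numpy as np

def round_preserving_sum(targets):
    """Round fractional replication counts to integers while preserving their total.

    Largest-remainder rounding: floor every entry, then give the remaining
    replications to the entries with the largest fractional parts.
    """
    targets = np.asarray(targets, dtype=float)
    floors = np.floor(targets).astype(int)
    remainder = int(round(targets.sum())) - int(floors.sum())
    order = np.argsort(-(targets - floors))          # largest fractional parts first
    floors[order[:remainder]] += 1
    return floors

print(round_preserving_sum([3.6, 2.2, 4.2]))          # -> [4 2 4]; total of 10 preserved
```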
While the asymptotic performance of Algorithm 1 is guaranteed by Theorem 2, a more important practical concern is its finite-time performance. Specifically, two questions are critical: (1) How close is the posterior mean $\hat{p}_{hi}$ to the true success probability $p_{hi}$? (2) How close is the empirical sample allocation ratio in Equation (47) to the asymptotically optimal one in Equation (38)?

ALGORITHM 1 Iterative Sample Allocation Algorithm for Solving the Bayesian OCBA Formulation

0. Initialize: Set the iteration counter $r = 0$ and the incremental budget $\Delta$. Set the initial sample size $N_i^r = n_0$ and the incremental sample size $\tau_i^r = N_i^r$ for all $i \in \Theta$. Specify the prior $\pi_h^0(\psi_h)$ for $h = 0, 1, \dots, H$.
1. Simulate: Run the simulation for each solution $x_i$, $i \in \Theta$, for $\tau_i^r$ replications. For the $j$-th simulation replication of $x_i$, record the simulation output $F_h(x_i, \xi_{ij})$ and compute $X_{hij}$ using Equation (9) for all $h$.
2. Estimate: Update the posterior distributions $\pi_h^t(\psi_h)$ using Equation (10) for all $h$, where $t = \sum_{i=1}^{K} N_i^r$. Estimate the success probability of service level $h$ for solution $x_i$ as $\hat{p}_{hi} = \mathbb{E}^t[\Psi_{hi}]$ using Equation (11) for all $h$ and $i \in \Theta$.
3. Divide: Compute the set of violated constraints for any $i \in \Theta$ as $\hat{\Pi}_i = \{h \in \{1, \dots, H\}:\ \hat{p}_{hi} > \alpha_h\}$. Then, find the sample best solution $\hat b$ and divide the nonbest solutions into the two sets $\hat{\Theta}_O$ and $\hat{\Theta}_F$.
4. Stop: If $t \ge T$, terminate and output $\hat b$ as the best solution; otherwise, continue.
5. Update: For each $i \in \Theta$, compute $\hat{\eta}_i^r$ using Equation (36), where $\Theta_F$ and $\Theta_O$ are approximated by $\hat{\Theta}_F$ and $\hat{\Theta}_O$, respectively, and $I_{hi}(\cdot)$ is computed using Equation (18) with $p_{hi}$ approximated by $\hat{p}_{hi}$. If $\hat{\eta}_i^r = 0$, bound $\hat{\eta}_i^r$ below by a sufficiently small positive number $\epsilon$: $\hat{\eta}_i^r \leftarrow \epsilon$.
6. Allocate: Increase the computing budget by $\Delta$, determine the new allocation using the allocation ratio in Equation (47), set $r \leftarrow r + 1$, and go to Step 1.

For the latter question, we define $\rho_i^*$ as the asymptotic optimal ratio of samples allocated to solution $x_i$, and $\hat{\rho}_i^r$ as the corresponding empirical ratio at iteration $r$. To this end, Theorem 3 characterizes the finite-time performance of Algorithm 1, with the proof supported by Lemma 4.
Lemma 4. Suppose the prior joint distributions $\pi_h^0(\psi_h)$ ($h = 0, 1, \dots, H$) satisfy Assumption 1. Further, suppose that at iteration $r$ of Algorithm 1, $N_i^r$ samples have been allocated to $x_i$ out of a total of $t$ samples, and define the estimated success probability $\hat{p}_{hi} \doteq \mathbb{E}^t[\Psi_{hi}]$ from Equation (11). Then, for any $\delta > 0$, the bound in Equation (48) holds.

Proof. The proof is given in the Supporting Information EC.1.3 to this paper. □

Theorem 3. Suppose Assumption 1 holds. For any sufficiently small $\delta > 0$, let $R_\delta$, defined in Equation (49), be the earliest iteration such that for all iterations $r \ge R_\delta$ in Algorithm 1, the posterior means $\hat{p}_{hi}$ and the empirical allocation ratios $\hat{\rho}_i^r$ are sufficiently close to their asymptotic counterparts, $p_{hi}$ and $\rho_i^*$, respectively. Then, $R_\delta$ has finite expectation and variance.

Proof. The proof is given in the Supporting Information EC.1.4 to this paper. □

Remark 5. By the Chebyshev inequality, the finite expectation and variance indicate that, with very high probability, $\mathrm{EOC}_{\mathrm{Bayes}}$ will converge at the almost optimal convergence rate after a finite number of iterations. For example, choosing $r^*$ appropriately, the Chebyshev inequality tells us that $|\hat{p}_{hi} - p_{hi}| \le \delta$ and $|\rho_i^* - \hat{\rho}_i^r| \le \delta$ for all $i \in \Theta$, $h = 0, 1, \dots, H$, and $r > r^*$ with a probability greater than 90%.
Note that by Equation (19), the empirical value of AEOC-B provided by Algorithm 1 converges to its theoretically optimal counterpart as the sample allocation converges; the gap decreases to zero as $\delta \to 0$. In other words, Algorithm 1 achieves an almost optimal value of AEOC-B when $r \ge r^*$.
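To tie the pieces together, the sketch below shows one possible main loop in the spirit of Algorithm 1. It is a simplified illustration, not the paper's implementation: it uses independent Beta(1, 1) priors (so that posterior means are available in closed form) instead of the correlated prior, and it reuses the hypothetical `allocation_ratios` and `round_preserving_sum` sketches given earlier; `simulate_once(x, rng)` is assumed to return the $H+1$ performance measures for one replication.

```python
import numpy as np

def iterative_allocation(simulate_once, solutions, d, alpha, T, n0=20, delta=1000, rng=None):
    """Simplified iterative allocation loop: simulate, estimate, divide, stop/allocate."""
    rng = rng or np.random.default_rng()
    K, H1 = len(solutions), len(d)
    succ = np.zeros((H1, K))
    n = np.zeros(K, dtype=int)
    target = np.full(K, n0)
    while True:
        for i, x in enumerate(solutions):                        # Steps 1-2: simulate and estimate
            for _ in range(int(target[i]) - n[i]):
                f = np.asarray(simulate_once(x, rng))
                succ[:, i] += (f > np.asarray(d))
            n[i] = int(target[i])
        p_hat = (1.0 + succ) / (2.0 + n)                         # Beta(1, 1) posterior means
        feasible = np.all(p_hat[1:] <= np.asarray(alpha)[:, None], axis=0)
        b = int(np.argmin(np.where(feasible, p_hat[0], np.inf))) # Step 3: sample best solution
        theta_F = [i for i in range(K) if not feasible[i]]
        theta_O = [i for i in range(K) if feasible[i] and i != b]
        if n.sum() >= T:                                         # Step 4: stop
            return b, p_hat
        ratios = allocation_ratios(p_hat, alpha, b, theta_O, theta_F)             # Step 5 (sketch)
        target = round_preserving_sum(np.maximum(n, ratios * (n.sum() + delta)))  # Step 6
```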

NUMERICAL EXPERIMENTS
In this section, we provide numerical results that demonstrate the effectiveness of the proposed algorithm. To this end, we test randomly generated test cases and a case study inspired by the real-world staff allocation problem introduced in Section 3.1.

Random test cases
We first test the performance of Algorithm 1 (short for PB in the following) using randomly generated test cases. Three benchmark algorithms are selected for comparison, namely, the OCBA for constrained optimization in Lee et al. (2012) (short for CO), the stochastically constrained R&S via SCORE in Pasupathy et al. (2015) (short for SC), and the heuristic procedure based on statistical hypothesis tests in Choi et al. (2021) (short for HE). Although the CO, SC, and HE algorithms do not consider the performance correlations among solutions as PB does, they are the most relevant algorithms developed to solve the constrained R&S problems under study in this paper. As such, we test the PB algorithm with an uncorrelated uniform prior (short for PB_u) as a direct comparison to the three benchmarks. Further, we test the PB algorithm with a correlated prior (short for PB_c) in order to quantify the benefit of using a correlated prior in the PB implementation. To this end, we generate random test cases with different parameter settings and compute the optimal solutions using the aforesaid five algorithms under varying computing budgets. Note that the true optimum for each random test case can be analytically computed (as will become apparent once we introduce the generation procedure below) and will be used as ground truth for performance comparison. Note further that our implementations of all algorithms strictly use sample approximations without leveraging knowledge of the true optimum.
We compare the algorithm performance under different values of $K$, $H$, and $T$. Recalling that the CO and HE algorithms were developed under the assumption that samples follow a normal underlying distribution, we further compare the performance under different distributions, including normal, uniform, and exponential. Given each parameter setting ($K$, $H$, $T$, and an underlying distribution), 1,000 random test cases were generated based on the following procedure:
1. Generate the value of $d_h$ for $h = 0, 1, \dots, H$, each following a discrete uniform distribution $\mathcal{U}(1, 200)$.
2. Generate the value of $\alpha_h$ for $h = 1, \dots, H$, each following a uniform distribution $\mathcal{U}(0.01, 0.1)$.
3. For each solution $x_i \in \Omega$, $i = 1, \dots, K$, generate the value of $F_h(x_i, \xi)$ for $h = 0, 1, \dots, H$, each following the chosen underlying distribution.
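The sketch below generates one such random test case. Steps 1 and 2 follow the procedure directly; for step 3, the text does not specify exactly how the distribution parameters depend on $i$, so the sketch simply draws normal means that increase with $i$ on average (mimicking the property, noted next, that $\mathbb{E}_s[\mathbb{E}[F_h(x_i, \xi)]]$ increases with $i$) with an assumed standard deviation.

```python
import numpy as np
from math import erf, sqrt

def generate_test_case(K, H, sigma=30.0, rng=None):
    """Generate one random test case (normal underlying distribution)."""
    rng = rng or np.random.default_rng()
    d = rng.integers(1, 201, size=H + 1)                       # step 1: d_h ~ discrete U(1, 200)
    alpha = rng.uniform(0.01, 0.1, size=H)                     # step 2: alpha_h ~ U(0.01, 0.1)
    mu = np.vstack([np.cumsum(rng.uniform(0.0, 2.0, size=K))   # step 3: means increasing in i
                    + rng.uniform(0.0, 50.0) for _ in range(H + 1)])
    # Ground-truth p_hi = P{F_h(x_i, xi) > d_h} for the normal case with std dev sigma.
    z = (d[:, None] - mu) / sigma
    p = np.array([[1.0 - 0.5 * (1.0 + erf(v / sqrt(2.0))) for v in row] for row in z])
    return d, alpha, mu, p

d, alpha, mu, p = generate_test_case(K=10, H=1, rng=np.random.default_rng(0))
print(np.round(p, 3))
```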
For each random test case, the value of $\mathbb{E}[F_h(x_i, \xi)]$ depends on the randomly generated distribution parameters, while $\mathbb{E}_s[\mathbb{E}[F_h(x_i, \xi)]]$ increases with $i$, where $\mathbb{E}_s[\cdot]$ denotes the expectation over the randomness caused by the distribution parameters. As seen, the test cases were generated to mimic the correlation in solution performances, in the sense that the performance of a solution tends to be more correlated with that of its neighboring solutions.
In the PB_c implementation, we set the prior distribution of $\log(\Psi_{hi}/(1 - \Psi_{hi}))$, $i \in \Theta$, to be multivariate normal (Howard, 1998). The covariance function is squared exponential (Rasmussen & Williams, 2006); that is, $\Sigma(x_i, x_{i'}) = \tau \exp(-\|x_i - x_{i'}\|^2 / (2\ell^2))$ with $\tau > 0$ and $\ell > 0$. Let $\mathrm{Cov}$ be the covariance matrix whose $(i, i')$ entry is $(\mathrm{Cov})_{ii'} = \Sigma(x_i, x_{i'})$. Then, the prior belief has a density $\pi_h^0(\psi_h)$ proportional to $\exp\{-\frac{1}{2}(\mathrm{logit}(\psi_h) - y)^\top \mathrm{Cov}^{-1} (\mathrm{logit}(\psi_h) - y)\}$, where $\mathrm{logit}(\psi_h) = (\log(\psi_{h1}/(1-\psi_{h1})), \dots, \log(\psi_{hK}/(1-\psi_{hK})))^\top$ and $y$ is the expectation of this random vector. In the experiments, we set $\tau = 1$, $\ell = 1$, and $y = (y_1, y_2, \dots, y_K)^\top$ with $y_i = \log(\mathbb{E}_s[p_{hi}]/(1 - \mathbb{E}_s[p_{hi}]))$, where $p_{hi} = P\{F_h(x_i, \xi) > d_h\}$. Here, the choice of $y$ mimics practical situations in which decision makers have some prior knowledge of $p_{hi}$, but the prior is inaccurate in that $p_{hi}$ is predicted only by $\mathbb{E}_s[p_{hi}]$. We mention that the posterior mean does not have a closed form and is approximated using a Markov chain Monte Carlo approach.
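The sketch below builds such a correlated prior and draws samples from it; the squared-exponential kernel is written in the standard GP parametrization, which is an assumption since the exact scaling of the two kernel parameters is not reproduced above. The returned sampler can be plugged into the importance-sampling sketch given earlier.

```python
import numpy as np

def squared_exp_cov(X, tau=1.0, ell=1.0, jitter=1e-9):
    """Squared-exponential covariance matrix over solution encodings X (K x dim)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return tau * np.exp(-sq / (2.0 * ell ** 2)) + jitter * np.eye(len(X))

def sample_correlated_prior(X, y, n_draws, rng, tau=1.0, ell=1.0):
    """Draw (psi_h1, ..., psi_hK) from a logit-multivariate-normal prior.

    The logits are multivariate normal with mean y and squared-exponential
    covariance; the logistic map takes them back to success probabilities.
    """
    cov = squared_exp_cov(X, tau, ell)
    logits = rng.multivariate_normal(mean=y, cov=cov, size=n_draws)
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(0)
X = np.arange(5, dtype=float).reshape(-1, 1)           # toy 1-D encodings of 5 solutions
y = np.full(5, np.log(0.1 / 0.9))                      # prior logit means around p = 0.1
print(np.round(sample_correlated_prior(X, y, 3, rng), 3))
```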
For each random case, its true (theoretical) optimal solution for the attendant problem $\mathcal{P}$ can be analytically computed from the distribution information. Then, the same problem is solved using the five algorithms given a computing budget $T$, and the solution obtained is compared to the true optimal one. Three metrics are computed based on the 1,000 cases:
• Probability of correct selection (PCS): the percentage of cases in which a true optimal solution is identified by the chosen algorithm;
• Probability of selecting an infeasible solution (PIF): the percentage of cases in which an infeasible solution is selected as the optimal solution by the chosen algorithm;
• Expected opportunity cost (EOC): the expected performance gap between the selected solution and the true optimum; specifically, the frequentist opportunity cost of selecting $x_i \ne x_b$ is defined as the frequentist counterpart of the opportunity cost introduced above.
Here, PCS measures an algorithm's expected performance (i.e., how good it is at selecting the optimal solution); PIF measures its worst-case performance (i.e., how likely it is to select an infeasible solution); and EOC measures the expected performance gap between the selected solution and the true optimum (i.e., the overall quality of the selected solution).
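The sketch below computes the three metrics over a batch of solved cases. The exact frequentist opportunity-cost formula is not reproduced above, so the sketch assumes it is the objective gap plus $\lambda_h$-weighted constraint violations, mirroring the structure of the Bayesian opportunity cost; the paper's definition may differ.

```python
import numpy as np

def frequentist_oc(p, alpha, lam, i_sel, b):
    """Assumed opportunity cost of selecting solution i_sel when the true best is b."""
    oc = max(p[0, i_sel] - p[0, b], 0.0)                     # objective gap
    for h in range(1, p.shape[0]):                           # weighted constraint violations
        oc += lam[h - 1] * max(p[h, i_sel] - alpha[h - 1], 0.0)
    return oc

def summarize(cases, lam):
    """cases: list of (p, alpha, i_selected, b_true) tuples, one per test case."""
    pcs = np.mean([i == b for (_, _, i, b) in cases])
    pif = np.mean([np.any(p[1:, i] > np.asarray(a)) for (p, a, i, _) in cases])
    eoc = np.mean([frequentist_oc(p, a, lam, i, b) for (p, a, i, b) in cases])
    return pcs, pif, eoc
```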
Table 2 exhibits the performance comparison results for the normal underlying distribution. Specifically, the table shows the PCS, PIF, and EOC metrics for three groups of settings, $(K = 200, H = 1)$, $(K = 500, H = 3)$, and $(K = 500, H = 1)$, where the difficulty of the problems increases in the aforesaid order. It is straightforward to see that group $(K = 500, H = 1)$ is much more challenging than group $(K = 200, H = 1)$, while it may be counterintuitive that group $(K = 500, H = 3)$ is also easier than $(K = 500, H = 1)$. The latter is due to the fact that each problem in group $(K = 500, H = 1)$ is expected to have more feasible solutions than one in group $(K = 500, H = 3)$. Since identifying an infeasible solution is typically less computationally expensive than verifying a feasible solution, group $(K = 500, H = 1)$ is much harder than group $(K = 500, H = 3)$. In summary, we characterize the three groups of problems as follows: $(K = 200, H = 1)$ has a small set of candidate solutions; $(K = 500, H = 3)$ has a small set of feasible solutions and a large set of infeasible ones; and $(K = 500, H = 1)$ has large sets of both feasible and infeasible solutions. Note that, for all algorithms, we set $n_0 = 20$, $\Delta = 4{,}000$ for $(K = 200, H = 1)$, and $\Delta = 10{,}000$ for both $(K = 500, H = 3)$ and $(K = 500, H = 1)$. For each group, we further vary the computing budget $T$ to show the convergence trend.
From Table 2, it is seen that, for the PCS metric, PB_u clearly outperforms the three benchmarks for all three groups of problems; in particular, for the hardest (third) group, PB_u reaches a PCS of 77.5% when $T = 60{,}000$, a value that the benchmark algorithms fail to reach even when $T = 100{,}000$. For the PIF metric, PB_u shows good control in avoiding infeasible solutions, especially as $T$ becomes sufficiently large, while the performance of the benchmarks is not as satisfactory. For the EOC metric, PB_u again outperforms the benchmarks by a substantial margin for the harder problems (second and third groups), indicating a strong overall performance when the quality of the selected solution is taken into account. Further, PB_c outperforms PB_u for all test cases in terms of PCS and EOC, showing the benefit of considering a correlated prior when performance correlations indeed exist. In terms of PIF, PB_c performs better on the harder problems, while PB_u is superior on the easy problems. Note that the advantage of PB_c over PB_u tends to diminish as $T$ increases.
To check the robustness of PB across different underlying distributions, Tables 3 and 4 exhibit performance comparisons with uniform and exponential underlying distributions, respectively. It is seen that PB is superior to all three benchmarks robustly across the various underlying distributions. We also point out that the advantage of PB is more significant for the hardest group of cases, indicating that PB is a better choice when the problem size becomes large.
In the Supporting Information EC.3, we modified the procedure for generating random test cases by setting the values of $\alpha_h$ ($h = 1, \dots, H$) to follow a uniform distribution $\mathcal{U}(0.01, 0.3)$ instead of $\mathcal{U}(0.01, 0.1)$, capturing a wider range of service-level requirements. The results are reported in Tables EC.1, EC.2, and EC.3 in the Supporting Information, and the observations are quite similar to those in Tables 2, 3, and 4, showing the robustness of PB across different settings of problem $\mathcal{P}$ obtained by varying $\alpha_h$. Further, we varied the incremental budget $\Delta$ in the PB algorithm implementation; the impact of $\Delta$ on performance and suggestions on the choice of $\Delta$ are also discussed in the Supporting Information EC.3. In summary, these numerical results show that, compared to CO, SC, and HE, PB is more likely to correctly select the optimal solution, and less likely to select an infeasible one; even if an optimal solution is not selected, PB is much more likely to select a near-optimal one. All of these are desired characteristics for an R&S algorithm. Further, the performance improvement is quite significant even when the computing budget is relatively small. This will benefit many applications, as each simulation replication can be expensive and time consuming to run and the decision-making time frame may be short (e.g., within hours or half a day). Finally, when prior knowledge is available for a problem, it is beneficial to incorporate a correlated prior.

An ED case study
In this section, we demonstrate how the proposed model and algorithm can be used to improve services in the ED example introduced in Section 3.1. A discrete-event simulation model is used to evaluate the performance of the highly complex and dynamic processes in the ED. The data used in this case study were collected from July 1, 2009, to June 30, 2010, and the service process and simulation model used have been verified in Guo et al. (2017) and Chen et al. (2020).
From the historical data obtained, the ED received between 350 and 430 patients per day, and the patients fall into five categories with the following average percentages (from categories I to V): 0.87%, 1.45%, 42.09%, 48.76%, and 6.83%. In the simulation model, the patient arrivals follow a nonhomogeneous Poisson process, whose hourly arrival rates were calculated empirically from the historical data and are shown in the Supporting Information EC.4.
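For readers unfamiliar with nonhomogeneous Poisson arrival generation, the sketch below produces one day of arrival times by thinning. The hourly rates used here are illustrative placeholders only; the ED's empirical rates are given in the Supporting Information EC.4.

```python
import numpy as np

def nhpp_arrival_times(hourly_rates, horizon_hours, rng):
    """Generate arrival times (in hours) from a nonhomogeneous Poisson process by thinning.

    hourly_rates[k] is the arrival rate (patients/hour) during hour k of the day.
    """
    lam_max = max(hourly_rates)
    t, arrivals = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)           # candidate event from a rate-lam_max process
        if t >= horizon_hours:
            return np.array(arrivals)
        lam_t = hourly_rates[int(t) % len(hourly_rates)]
        if rng.random() < lam_t / lam_max:            # accept with probability lam(t) / lam_max
            arrivals.append(t)

rng = np.random.default_rng(0)
rates = [8, 6, 5, 5, 6, 9, 14, 18, 22, 24, 24, 23,    # placeholder hourly rates, not ED data
         22, 21, 21, 20, 20, 19, 18, 17, 15, 13, 11, 9]
times = nhpp_arrival_times(rates, horizon_hours=24, rng=rng)
print(f"simulated {len(times)} arrivals in one day")
```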
The discrete-event simulation model follows the actual service processes (Chen et al., 2020); the detailed process map and the distributions used to generate the service times at each process are displayed in the Supporting Information EC.4. Specifically, an incoming patient, depending on the situation, may go through different service areas, including registration and triage, resuscitation (for critical patients), consultation, laboratory testing and imaging investigation, and observation. The main medical resources required include different types of nurses, doctors with different seniority levels and expertise, different medical devices (e.g., electrocardiography [ECG] and ultrasonography [USG]), as well as different spaces (offices, rooms, and labs). Medical staff worked three shifts around the clock: the morning shift (from 8 a.m. to 4 p.m.), the evening shift (from 4 p.m. to midnight), and the night shift (from midnight to 8 a.m.).
The simulation model was implemented in Arena and was validated using the one year of data collected. For the model validation, we set the length of each simulation run to 1 month with a 7-day warm-up period and ran the simulation model for 500 replications. We used three indicators, waiting time at the triage (WTT), waiting time for physicians (WTP), and length of stay (LOS), to assess the validity of the outputs, as shown in Table 5. Note that the data on WTT and WTP for category I patients are not available from the collaborating hospital. It is seen that the simulation can closely mimic the real data.
Next, we experiment with the simulation model to identify the optimal solution $x^*$ for the resource allocation problem defined in Equations (1)-(5). Recall that each candidate solution $x$ consists of the medical staff assignment for each service area and shift, covering junior nurses (JN), senior nurses (SN), junior doctors (JD), and senior doctors (SD). The staffing cost is measured in cost units (CUs), at … CUs per JN, 2 CUs per SN, 3 CUs per JD, and 4 CUs per SD. For the actual solution implemented in the ED, the total staffing cost is 85.5 CUs. By applying the following filters to all possible solutions, we identified 73 promising candidate solutions for the R&S problem ($|\Omega| = 73$):
• Solutions were filtered by the deterministic constraints specified in Equation (5), including the personnel and budget constraints. Specifically, the personnel constraint imposes practical bounds on the number of medical staff of each type in each shift. Any solution resulting in a staffing cost of more than 85.5 CUs was also filtered out, since the ED preferred to maintain the current budget.
• Solutions dominated by others were filtered out. For example, consider solution A with 2 SDs for the night shift and solution B with 1 SD for the night shift, all other staff assignments being equal. We expect solution A to dominate solution B without evaluating them, since the objective is to find a solution that maximizes the service level of category IV patients; solution B is therefore excluded from the solution space.
• Solutions were further filtered by consulting domain experts, who removed nonpromising solutions based on their prior experience with staff schedules or on preliminary simulation runs.
The full set of 73 candidate solutions is included in the Supporting Information EC.4. We mention that the above filtering process substantially reduces the solution space, making it suitable for the R&S formulation studied in this paper, while the original problem without filtering is more suitable for a search-based algorithm, such as the one developed in Chen et al. (2020).
We compare the performance of PB to two benchmarks: one is CO in Lee et al. (2012) and the other is equal allocation (EA). The EA approach is a simple strategy that allocates the computing budget equally among all candidate solutions and is often used in the literature as a baseline. The computations were conducted in Arena 14.0 via VBA code. The prior belief was set to have a density

CONCLUSIONS
Simulation models are known to be capable of capturing complex processes and system dynamics, which are hard to model in mathematical programs but are often present in service systems. However, a major challenge in optimizing systems using simulation models, known as the research field of simulation optimization or optimization via simulation, is the trade-off between simulation accuracy and the affordable computing budget. To this end, we formulate an OCBA model to efficiently balance the effort of exploring the solution space and exploiting given solutions with more simulation replications. From the methodological point of view, the OCBA problem formulated in this paper possesses two unique features compared to the literature: (1) Instead of maximizing the PCS, the problem under study aims to minimize the EOC, considering that the service provider is typically more concerned with the quality of the selected solution than with the statistical significance of selecting the true optimal one; and (2) the proposed model considers the prior knowledge accumulated on solutions, as well as the performance correlations among solutions, and hence uses a Bayesian modeling approach. To the best of our knowledge, such a Bayesian OCBA model has not been studied in the literature. Next, based on the properties of the probabilistic measures, we derive the asymptotic optimality conditions of the OCBA formulation. Then, an efficient iterative algorithm is developed to sequentially allocate the simulation budget to the most necessary solutions, guided by the optimality conditions. The efficiency of the algorithm is shown via a theoretical finite-time convergence analysis, as well as the numerical experiments.
From a practical vantage point, the generalized resource allocation problem is motivated by a real-life resource allocation problem stemming from a hospital ED, and it is widely applicable to service systems, such as hospitals, call centers, and public service offices. The proposed algorithm can expedite the identification of an optimal resource allocation scenario using a simulation model in the presence of probabilistic constraints and a probabilistic objective. Extensive numerical examples with known true optima conducted in this paper have demonstrated the superiority of the proposed algorithm compared to the benchmarks, evaluated by the PCS, PIF, and EOC metrics. Further, we applied the algorithms to the aforesaid staff allocation problem in the ED in Hong Kong, whose simulation model is driven by real data. This case study has further validated the practicality of the proposed model and algorithm.
Several future research directions are available to explore. Sometimes, the service system may want to guarantee the PCS (e.g., 90%) and minimize the required computing budget (Chen et al., 2014); thus, a future research direction is to explore alternative simulation optimization methods, such as IZ approaches. Further, this research considers problems with a solution space of manageable size. In the proposed algorithm, every solution is simulated for a minimum number of replications and evaluated to obtain an initial assessment. For a large-scale problem, it may not be computationally feasible to provide an initial evaluation for every solution. As such, a metamodel-based or random search-based method needs to be developed to balance the exploration of unknown solutions and the exploitation of known solutions with more simulation replications. Finally, when the resource constraints in Equations (1)-(5) or Equations (6)-(8) can be expressed as mathematical constraints, it would be interesting to explore a combined simulation optimization and mathematical programming approach.

ACKNOWLEDGMENTS
We gratefully thank the department editor, the senior editor, and the two reviewers for their valuable comments, which have greatly improved the paper. Siyang Gao acknowledges the support of the City University of Hong Kong (Grants 7005269 and 7005568). Jianzhong Du acknowledges the support of the National Natural Science Foundation of China (Grant 72091211).

REFERENCES
