Packing returning secretaries

We study online secretary problems with returns in combinatorial packing domains with n candidates that arrive sequentially over time in random order. The goal is to determine a feasible packing of candidates of maximum total value. In the first variant, each candidate arrives exactly twice. All 2n arrivals occur in random order. We propose a simple 0.5‐competitive algorithm. For the online bipartite matching problem, we obtain an algorithm with ratio at least 0.5721 − o(1), and an algorithm with ratio at least 0.5459 for all n ≥ 1. We extend all algorithms and ratios to k ≥ 2 arrivals per candidate. In the second variant, there is a pool of undecided candidates. In each round, a random candidate from the pool arrives. Upon arrival a candidate can be either decided (accept/reject) or postponed. We focus on minimizing the expected number of postponements when computing an optimal solution. An expected number of Θ(n log n) is always sufficient. For bipartite matching, we can show a tight bound of O(r log n), where r is the size of the optimum matching. For matroids, we can improve this further to a tight bound of O(r′ log(n/r′)), where r′ is the minimum rank of the matroid and the dual matroid.


INTRODUCTION
The secretary problem is a classic approach to study online optimization problems: A sequence of n candidates arrives in uniform random order. Each candidate reveals its value upon arrival and must be decided (accept/reject) before seeing any further candidate(s). Every decision is final: once a candidate gets accepted, the process is over. Moreover, no rejected candidate can be accepted later on. The goal is to accept the best candidate. An optimal solution is to discard the first (roughly) n/e candidates. From the subsequent ones, we accept the first one that is the best among all candidates seen so far. The probability to hire the best candidate approaches 1/e ≈ 0.37 when n tends to infinity.
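The threshold rule described above can be sketched in a few lines. This is only an illustrative simulation under the assumption of distinct values; the function names `classic_secretary` and `success_rate` are hypothetical and not part of the original text.

```python
import math
import random

def classic_secretary(values):
    """Threshold rule: skip about n/e candidates, then take the first record."""
    n = len(values)
    cutoff = int(n / math.e)                      # length of the observation phase
    best_seen = max(values[:cutoff], default=float("-inf"))
    for v in values[cutoff:]:
        if v > best_seen:                         # first candidate beating the sample
            return v
    return None                                   # the best candidate was in the sample

def success_rate(n=100, trials=20000, seed=0):
    """Empirical probability of hiring the best of n candidates."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        values = list(range(n))
        rng.shuffle(values)                       # uniform random arrival order
        hits += classic_secretary(values) == n - 1
    return hits / trials
```

For moderate n, the empirical success rate returned by `success_rate` should be close to 1/e.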
The secretary problem and its variants have been popular since the 1960s. Significant interest in computer science emerged about a decade ago due to new applications in e-commerce and online advertising markets [3,14]. For example, the classic secretary problem can be used to model a seller that wants to give away a single item, buyers arrive sequentially over time, and the goal is to assign the item to the buyer with highest value. More generally, online budgeted matching problems arise when search queries arrive over time, and the goal is to show the most profitable ads on the search result pages. The goal here is to design algorithms with good competitive ratio.
More recently, progress has been made towards a general understanding of online packing problems with random-order arrival, including matching [1,18,21], integer packing programs [19,23], or independent set problems [13]. Most prominently, the matroid secretary problem has attracted a large amount of interest [3,6]. Here the elements of a matroid arrive in uniform random order, and the goal is to construct an independent set with as high a value as possible. A central open problem in the area is the matroid secretary conjecture: is there a constant-competitive algorithm for every matroid in the random order model? The conjecture has been proved for a variety of subclasses of matroids [6]. Currently, the best-known algorithms for the general problem are 1/O(log log rank)-competitive [9,22].
A strong assumption in the secretary problem is that every decision about a candidate must be made immediately without seeing any of the future candidates. Instead, in many natural admission scenarios candidates appear more than once, or they arrive and stay in the system for some time, during which a decision can be made. An interesting variant that captures this idea is the returning secretary problem [26]. Here each candidate is assigned two random time points from a bounded time interval. The earlier becomes the arrival time, the later the departure time. Equivalently, we can assume that each candidate arrives exactly twice, and all 2n arrivals occur in random order. The decision about acceptance of a candidate can be made between the first and the second arrival. More generally, for k ≥ 2 each candidate chooses k random points, arrives at the earliest and leaves at the latest point. In this case, there are kn arrivals in random order. Vardi [26] showed an optimal algorithm for the returning secretary problem with k = 2, for which the probability of accepting the best candidate is about 0.768. For matroid secretary with k = 2 arrivals, a competitive ratio of 0.5, and for matching secretary a ratio 0.5625 − o(1) (with asymptotics in n) were shown.
In this paper, we significantly extend and broaden the results on the returning secretary problem to general packing domains. We provide a simple algorithm that can be combined with arbitrary α-approximation algorithms and yields competitive ratios of 0.5 ⋅ α for all subadditive packing problems, including matroids, matching, knapsack, independent set, etc. Moreover, we improve the guarantees for matching secretary and provide bounds that hold in expectation for all n. We extend all our bounds to arbitrary k ≥ 2. In addition, we study a complementary variant in which the decision maker is allowed to postpone the decision about a candidate. In this case, the goal is to minimize the number of postponements to guarantee an optimal or near-optimal solution in the end. These problems can be cast as a set of novel coupon collector problems, and we provide guarantees and trade-offs for matroid, matching and knapsack postponement.

Results and contribution
In the secretary problem with k arrivals in Section 3, each candidate arrives exactly k times. We propose a simple approach for general subadditive packing problems with returns, which can be combined with arbitrary offline α-approximation algorithms. It yields a competitive ratio of α ⋅ 1/2 for k ≥ 2. For general packing problems with XOS (maximum of sums) objective we show a competitive ratio of α ⋅ (1 − 2^{−(k−1)}).
For additive bipartite matching, we obtain a new algorithm with an improved competitive ratio of 0.5721 − o(1) for k = 2 with asymptotics in n. Moreover, we present an algorithm with ratio 0.5459 for k = 2 for every n. Both algorithms rely on the exact solution of partial matching problems. The algorithms can be combined with faster α-approximations for partial matchings, by spending at most an additional factor α in the competitive ratio. For the previous algorithm in [26], the algorithm description and proof of the ratio in the full version are slightly ambiguous. Our algorithm clarifies and slightly improves upon this by including the twice-arrived and rejected candidates during a sampling phase when computing partial matchings. Their removal yields free nodes in the offline partition for matching in later rounds.
In the postponing secretary problem in Section 4, there is a pool of n undecided candidates. In each round, a random candidate from the pool arrives. Upon arrival a candidate can be either decided (accept/reject) or postponed and returned into the pool. We strive to minimize the expected number of postponements to compute an optimal or near-optimal solution. Postponing everyone until all candidates are observed at least once is the coupon collector problem. Hence, with an expected number of O(n log n) postponements one can reduce the problem to the offline optimization variant. For general XOS packing and an α-approximation algorithm, a simple trade-off shows a (1 − ε) ⋅ α-approximation using O(n ln(1/ε)) postponements. Based on a property we term exclusion-monotonicity, we show significantly improved results when the desired optimal solution has small cardinality. A bound of O(r log n) for the expected number of postponements holds when obtaining optimal solutions of size at most r in additive matroids and bipartite matching, and greedy 2-approximations for knapsack. For matroids, we can further improve the bound to O(r′ ln(n/r′)), where r′ = min(r, n − r). This upper bound is at most n, and the worst case is attained for uniform matroids. We fully characterize the expected number of postponements of every candidate in uniform matroids when the optimal solution is to be obtained. Finally, we conclude the paper with a lower bound: in general we might need Ω(n log log n) postponements to obtain an optimal solution, even when this optimal solution has cardinality O(log n).

Further related work
The literature on secretary online variants of packing problems and online stochastic optimization has grown significantly over the last decade. We restrict the review to the most directly related results. For a survey of classic variants of the secretary problem, see [10].
The bipartite secretary matching problem was first studied in the context of transversal matroids [1], where a decision about accepting an arriving vertex into the matching needs to be taken directly, but matching edges can be decided in the end. Later works required that the edges must also be decided upon arrival [21]. The best algorithm for both variants obtains a competitive ratio of 1/e [18]. Most work in computer science has been devoted to the matroid secretary problem. Currently, the best algorithms obtain a competitive ratio 1/O(log log rank) [9,22]. It is an open problem if a constant competitive ratio can be shown. For a survey of work on classes of matroids and further developments see [6].
While above results are all for maximizing additive objective functions, recent work has started to consider submodular ones. For cardinality and matching constraints, constant-competitive algorithms exist for submodular secretary variants [17]. For matroids, there is a general technique to extend algorithms for additive objectives to submodular ones, which preserves constant competitive ratios [8].
Beyond matroids and matching, there are constant-competitive algorithms for knapsack secretary [2]. Prominent graph classes in networking applications allow good secretary algorithms for independent set [13]. The techniques for bipartite matching have been extended to secretary variants of combinatorial auctions and integer packing programs [19,23]. Moreover, there are 1/O(log n)-competitive algorithms even in a general packing domain [24].
Additional model variants that have found interest are, for example, local secretary [5] (several decision makers try to simultaneously hire candidates based on local feedback), temp secretary [11] (candidates are hired only for a bounded period of time), or ordinal secretary [15,25] (information available to the decision maker is only the total order of the candidates but not their numerical values).
Secretary postponement can be seen as a combinatorial extension of the coupon collector problem, a classic problem in applied probability. The elementary problem and its analysis are standard and discussed in many textbooks. The problem has many applications, and there is a plethora of variants that have been studied (see, e.g., [4,12,20]). To the best of our knowledge, however, the results for combinatorial packing problems derived in this paper have not been obtained in the literature before.
An extended abstract of this paper has appeared in the proceedings of the 29th International Symposium on Algorithms and Computation (ISAAC 2018) [16].

PACKING PROBLEMS
We consider a packing problem, in which there is a set $N$ of $n$ candidates, and a set $\mathcal{S} \subseteq 2^N$ of feasible solutions. $\mathcal{S}$ is downward-closed, that is, $S \in \mathcal{S}$ and $T \subseteq S$ implies $T \in \mathcal{S}$. For most parts, we assume that the objective function $w : 2^N \to \mathbb{R}_{\ge 0}$ is additive, that is, there is a non-negative value $w : N \to \mathbb{R}_{\ge 0}$ for each candidate, and $w(S) = \sum_{e \in S} w(e)$ for all $S \subseteq N$. More generally, we will sometimes assume the objective function $w$ is in the class XOS. An XOS function is defined as $w(S) = \max_{i = 1, \dots, k} w_i(S)$, the maximum over some number $k$ of additive functions $w_i(S) = \sum_{e \in S} w_i(e)$, for all $S \subseteq N$ and $i = 1, \dots, k$. We also consider functions that are monotone ($w(S) \le w(T)$ for $S \subseteq T \subseteq N$) and subadditive ($w(S) + w(T) \ge w(S \cup T)$ for all $S, T \subseteq N$). If a packing problem has an $\alpha$-approximation algorithm, then for any $N' \subseteq N$ the algorithm guarantees an approximation ratio $\alpha \le 1$ for maximizing $w$ over $\mathcal{S} \cap 2^{N'}$.
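To make the function classes concrete, the following sketch builds an XOS function as a maximum of additive functions and checks subadditivity by brute force on a tiny ground set. The helper names `additive`, `xos` and `is_subadditive` are hypothetical, and the exhaustive check is only feasible for a handful of elements.

```python
from itertools import combinations

def additive(weights):
    """Additive objective: w(S) is the sum of the element weights in S."""
    return lambda S: sum(weights[e] for e in S)

def xos(weight_tables):
    """XOS objective: the maximum over several additive functions."""
    parts = [additive(w) for w in weight_tables]
    return lambda S: max(f(S) for f in parts)

def is_subadditive(w, ground):
    """Brute-force check of w(S) + w(T) >= w(S | T) over all pairs of subsets."""
    subsets = [frozenset(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    return all(w(S) + w(T) >= w(S | T) for S in subsets for T in subsets)
```

With non-negative weights, every function built by `xos` passes the subadditivity check, in line with XOS being a subclass of the subadditive functions.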
In a secretary variant, we know the number $n$ upfront, and the candidates arrive in random order. Suppose a set $N_i$ of candidates has arrived in rounds $1, \dots, i$ and candidate $e \in N \setminus N_i$ arrives in round $i + 1$. Then $e$ reveals all new feasible solutions with previously arrived candidates, $(\mathcal{S} \cap 2^{N_i \cup \{e\}}) \setminus (\mathcal{S} \cap 2^{N_i})$, and their corresponding weight. In the additive case, this reduces to simply revealing the solutions and the weight $w(e)$.
We consider several specific variants based on specific constraints. In matroid secretary, the set of candidates and the set of feasible solutions form a matroid. Upon arrival, a candidate reveals the new feasible solutions and their weights. In the additive variant with known matroid, all candidates and feasible solutions are known upfront. Candidates only reveal their weight upon arrival.
In (bipartite) matching secretary, there is an undirected bipartite graph $(N \cup V, E)$. The nodes in the offline partition $V$ are present upfront. The candidates in the online partition arrive sequentially. The feasible solutions are the matchings in the arrived subgraph. Upon arrival, a candidate reveals its incident edges and weights of the new feasible solutions. In the additive version, the arriving candidate reveals a weight per edge, and the weight of a matching $M$ is $w(M) = \sum_{e \in M} w(e)$, that is, the sum of edge weights of edges in $M$. Upon accepting a candidate, the algorithm also has to decide which matching edge to include into $M$.
For (additive) knapsack secretary, an arriving candidate e reveals its weight w(e) and a size b(e) ≥ 0. The size B of the knapsack is known upfront. The feasible solutions are all subsets of candidates such that their total size does not exceed B.

SECRETARIES WITH k ARRIVALS
Suppose that each candidate arrives exactly $k$ times, and all these $kn$ arrivals occur in uniformly random order. Consider a secretary packing problem and the following simple algorithm. In the beginning, flip $kn$ fair coins. The number of heads is the length of an initial sampling phase. Reject all candidate arrivals during the sampling phase. Consider the set $T$ of candidates that have appeared at least once and at most $k - 1$ times in the sampling phase. Apply the $\alpha$-approximation algorithm to the instance based on $\mathcal{S} \cap 2^T$ to choose a feasible solution. Accept each candidate in the solution by the time of its $k$th arrival.
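The sampling step of this algorithm can be sketched as follows. The function names and the callable `approx_alg` (standing in for an arbitrary offline approximation routine that maps a candidate set to a feasible subset) are assumptions made for illustration only.

```python
import random

def sample_set_T(n, k, rng):
    """Candidates seen at least once and at most k-1 times in the sampling phase."""
    arrivals = [c for c in range(n) for _ in range(k)]
    rng.shuffle(arrivals)                                  # kn arrivals in random order
    sample_len = sum(rng.random() < 0.5 for _ in range(k * n))  # heads among kn fair coins
    counts = [0] * n
    for c in arrivals[:sample_len]:                        # all of these are rejected
        counts[c] += 1
    return {c for c in range(n) if 1 <= counts[c] <= k - 1}

def k_arrival_packing(n, k, approx_alg, rng):
    """Run the (hypothetical) offline routine approx_alg on the sampled set T."""
    T = sample_set_T(n, k, rng)
    # Every candidate in T has at least one arrival left, so it can still be accepted.
    return approx_alg(T)
```

Averaged over many runs, the fraction of candidates landing in `T` should be close to $1 - 2^{-(k-1)}$, since a candidate is excluded only if none or all of its $k$ arrivals fall into the sampling phase.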

Proposition 1. For any XOS packing problem with an $\alpha$-approximation algorithm, the secretary problem with $k$ arrivals allows an algorithm with approximation ratio $\beta = \alpha \cdot (1 - 2^{-(k-1)})$.
For any monotone subadditive packing problem, the secretary problem with $k \ge 2$ arrivals allows an algorithm with approximation ratio $\beta = \alpha \cdot \frac{1}{2}$.
Proof. Due to random order of arrival, we can simulate generation of $T$ by attaching each of the $kn$ coins to one arrival of one candidate. The arrival is in the sampling phase if and only if the coin turns up heads. Then, the probability is $1/2^k$ for each of the following events: (1) a given candidate never appears in the sampling phase, and (2) a given candidate appears $k$ times in the sampling phase. $T$ is distributed as if we would include each candidate independently with probability $1 - 2 \cdot 2^{-k} = 1 - 2^{-(k-1)}$.
Consider XOS packing problems. Once $T$ is created, we apply the $\alpha$-approximation algorithm to the instance based on $\mathcal{S} \cap 2^T$ to choose a feasible solution. Note that every candidate in $T$ will appear at least once after the sampling phase and therefore is available for acceptance by our algorithm. Each element in $T$ is sampled independently from $N$. Linearity of expectation shows that the optimum $S^*$ restricted to $T$ has expected value $\mathbb{E}[w(S^* \cap T)] \ge (1 - 2^{-(k-1)}) \cdot w(S^*)$. By applying the $\alpha$-approximation algorithm to $T$, we obtain a feasible solution $S$ of expected value $\mathbb{E}[w(S)] \ge \alpha \cdot (1 - 2^{-(k-1)}) \cdot w(S^*)$.
Now consider a monotone subadditive packing problem. Once $T$ is created, for the sake of the analysis we assume a second, hypothetical sampling step: For each candidate $e \in T$ flip an independent coin to remove $e$ from $T$, where candidate $e \in T$ remains in $T$ with a probability of $2^{k-2}/(2^{k-1} - 1)$. Hence, at the end of the hypothetical sampling step, each surviving element in $T$ is sampled independently from $N$ with probability $(1 - 2^{-(k-1)}) \cdot 2^{k-2}/(2^{k-1} - 1) = 1/2$. Due to subadditivity (see [7, Proposition 2]), the value of the best feasible solution among the surviving elements is in expectation at least $w(S^*)/2$. Obviously the same bound holds when applying the $\alpha$-approximation algorithm to the set $T$ directly without the hypothetical sampling step. ▪
The factor 1/2 in the analysis of our algorithm for monotone subadditive functions is almost tight. The following example shows a deterioration by almost a factor of 2. Consider a subadditive function $w$ where $w(N) = 2$ and $w(S) = 1$ for all $S \subset N$, $N = \{1, \dots, n\}$, where all subsets $S \subseteq N$ are feasible. The optimum is $S^* = N$. However, when rejecting all candidates in the sample phase, there is a high probability to reject at least one candidate $k$ times, at which point a reduction by a factor of 2 becomes unavoidable.
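The effect in this example can be quantified directly. The sketch below assumes an exact offline algorithm (approximation factor 1), so the algorithm obtains value 2 exactly when the sampled set contains all of N and value 1 otherwise; the function name is hypothetical.

```python
def expected_ratio_in_example(n, k):
    """Example with w(N) = 2 and w(S) = 1 for every proper subset S of N.

    Each candidate is kept in the sampled set independently with probability
    1 - 2^{-(k-1)}, so the full set N is obtained with probability q.
    """
    q = (1 - 0.5 ** (k - 1)) ** n          # probability that no candidate is lost
    expected_value = 2 * q + 1 * (1 - q)   # value 2 on the full set, else 1
    return expected_value / 2              # the optimum has value w(N) = 2
```

For fixed k the ratio drops towards 1/2 exponentially fast in n, while for fixed n it increases with k.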
Based on this observation, we can very slightly improve the analysis for subadditive functions to obtain a bound that increases monotonically with $k$. With probability $(1 - (1/2)^{k-1})^n$ the set $T$ contains all elements of $N$. In this rare case, the solution of the $\alpha$-approximation algorithm does not suffer from another 1/2-factor decrease in the approximation ratio. By incorporating this insight into our analysis, the ratio becomes $\alpha \cdot \frac{1}{2} \cdot \left(1 + (1 - (1/2)^{k-1})^n\right)$, and this is tight for the algorithm in the example above.
For secretary matching, we improve upon this with a slightly more elaborate approach. The algorithm again samples and rejects a number of candidates that is determined by $kn$ independent coin flips with a suitable probability $p < 1$ (determined below). Hence, the length of the sampling phase is distributed according to $\mathrm{Binom}(kn, p)$. At the end of the sampling phase it computes a matching $M_s$ using an $\alpha$-approximation algorithm for all known candidates and offline vertices $V$. It accepts into $M$ the edges of $M_s$ incident to candidates with at most $k - 1$ arrivals in the sampling phase. Each of them can be accepted upon their last arrival after the sampling phase. The algorithm drops the edges from $M_s$ incident to candidates that arrived $k$ times in the sampling phase. Let $V_s \subseteq V$ be the unmatched offline nodes.
In the second phase, the algorithm follows ideas from [18,26]. Upon arrival of a new candidate $e$, the algorithm computes an $\alpha$-approximate matching $M_e$ among $V_s$ and all candidates with first arrival after the sampling phase. If $M_e$ contains an edge $(e, v)$ incident to $e$, this edge is added into $M$ if $v$ is still unmatched. Otherwise the edge is discarded.
Since the algorithm can be combined with arbitrary $\alpha$-approximation algorithms for matching, it also applies to many additional model variants, such as, for example, a k-arrival variant of the ordinal secretary matching problem [15].

Theorem 2. For secretary matching with 2 arrivals and any $\alpha$-approximation algorithm for offline matching with $\alpha \le 1$, there is an algorithm with approximation ratio of $0.5721 \cdot \alpha - o(1)$. For $k$ arrivals, the ratio becomes at least $\alpha \cdot \left(1 - 2^{-(k-1)} + 2^{-k} \ln \frac{1}{1 - 2^{-k}}\right) - o(1)$.
Proof. By similar arguments as above, for each arrival of a secretary we can assume to flip a coin independently with probability $p < 1$ that determines if the arrival happens in the sampling phase. Hence, each candidate has probability $p^k$ to arrive exactly $k$ times in the sampling phase and $(1 - p)^k$ to never arrive in the sampling phase. Let $M$ be the matching computed by the algorithm, $M_1$ the matching obtained right after the sampling phase, $M_2$ the matching composed in the second phase and $M^*$ the optimum matching.
For $M_1$ we interpret the random coin flips as a two-step process. First, for each candidate in $N$ we flip a coin independently with probability $1 - (1 - p)^k$ whether the candidate arrives at least once in the sampling phase. Then, we flip another independent coin with probability $p^k/(1 - (1 - p)^k)$ whether the candidate arrives $k$ times in the sampling phase. The first set of coin flips determines the matching $M_s$ that evolves when we apply the $\alpha$-approximation algorithm right after the sampling phase. Since every candidate is included independently we have
$$\mathbb{E}[w(M_s)] \ge \alpha \cdot (1 - (1 - p)^k) \cdot w(M^*).$$
Afterwards, the second set of coin flips determines the candidates that are dropped from $M_s$. They are determined independently of $M_s$, so
$$\mathbb{E}[w(M_1)] \ge \left(1 - \frac{p^k}{1 - (1 - p)^k}\right) \cdot \mathbb{E}[w(M_s)] \ge \alpha \cdot \left(1 - (1 - p)^k - p^k\right) \cdot w(M^*).$$
We denote by $X$ the random number of candidates that arrived at least once during the sampling phase. In the second phase of the algorithm, we consider all $n - X$ candidates that have not arrived during the sampling phase. Standard arguments [17,18,26] show that each of these newly arriving candidates contributes in expectation a value of $\alpha \cdot w(M^*)/n$. For the $\ell$th first arrival of a new candidate, the probability that the edge $(e, v)$ suggested by the algorithm survives is the probability that the offline node $v \in V$ was not matched earlier, which is lower bounded by
$$\frac{p^k}{1 - (1 - p)^k} \cdot \frac{X}{X + \ell - 1}.$$
Hence, the expected value for $M_2$ is at least
$$\mathbb{E}[w(M_2) \mid X] \ge \sum_{\ell = 1}^{n - X} \frac{p^k}{1 - (1 - p)^k} \cdot \frac{X}{X + \ell - 1} \cdot \frac{\alpha \cdot w(M^*)}{n} \ge \alpha \cdot \frac{p^k}{1 - (1 - p)^k} \cdot \frac{X}{n} \ln\frac{n}{X} \cdot w(M^*).$$
For constants $p$ and $k$, standard Hoeffding bounds imply that $X = n(1 - (1 - p)^k) \pm o(n)$ with probability at least $1 - 1/n^c$ for suitable constant $c$ (see, e.g., [26]). Hence,
$$\mathbb{E}[w(M)] = \mathbb{E}[w(M_1)] + \mathbb{E}[w(M_2)] \ge \left(\alpha \cdot \left(1 - (1 - p)^k - p^k + p^k \ln \frac{1}{1 - (1 - p)^k}\right) - o(1)\right) \cdot w(M^*),$$
where the asymptotics are in $n$. Numerical optimization shows that for $k = 2$ and $p \approx 0.49085$, the ratio becomes at least $0.57212 \cdot \alpha - o(1)$. See Table 1 for more numerical results. Intuitively, the algorithm benefits from the unseen candidates after the sampling phase and has a tendency to reduce the sample size. On the other hand, the candidates that arrive $k$ times within the sampling phase create the set of free nodes in $V$ available for matching to later candidates.
Overall, optimizing this trade-off leads to a small reduction in the sample size. For larger $k$ this effect vanishes since the number of candidates that appear never or $k$ times during the sampling phase both become exponentially small. The optimal sampling parameter quickly approaches $p \to 0.5$. This maximizes the profit from candidates that are available for optimization immediately after the end of the sampling phase. Thereby, the improvement over the simple procedure in Proposition 1 becomes smaller. More formally, we use a bound on the logarithm (1) and obtain […]. Note that $\ln(1 + x) \le x$, so we deteriorate the expression only by the last negative term. For growing $k$, the optimal value of $p$ approaches 0.5 very quickly. We see […].
In contrast to [26], our algorithm computes an optimal (or $\alpha$-approximate) matching after the sampling phase for the set of all candidates that arrived during that phase (instead of the ones that arrived only once). All candidates that arrived $k$ times are dropped. This creates free nodes of $V$ to be matched to subsequently arriving candidates. The ratios depend asymptotically on $n$, since the guarantee in the second phase relies on concentration bounds for $X$, the number of candidates that arrive at least once in the sampling phase.
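The trade-off in the sampling probability can be explored numerically. The sketch below assumes that the asymptotic ratio (for approximation factor 1 and ignoring the o(1) term) takes the form $1 - (1-p)^k - p^k + p^k \ln(1/(1-(1-p)^k))$; this closed form is an assumption chosen because it reproduces the values quoted in the text, and the function names are hypothetical.

```python
import math

def matching_ratio(p, k):
    """Assumed asymptotic ratio for sampling probability p and k arrivals."""
    q = 1 - (1 - p) ** k                 # probability of at least one sample arrival
    first_phase = q - p ** k             # sampled, but not all k arrivals in the sample
    second_phase = p ** k * math.log(1 / q)
    return first_phase + second_phase

def best_p(k, grid=100000):
    """Grid search for the sampling probability maximizing the assumed ratio."""
    return max((i / grid for i in range(1, grid)),
               key=lambda p: matching_ratio(p, k))
```

Under this assumption, `matching_ratio(0.49085, 2)` evaluates to roughly 0.5721, and the grid search places the maximizer for k = 2 slightly below 1/2.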
Alternatively, one can replace the second phase by recursively applying the sampling phase. More formally, after the sampling phase is done and matching $M_1$ is added to $M$, we apply the same sampling phase to $V_s$ and the candidates that have not arrived so far. In this way, we can iterate the sampling step and re-apply it to the unseen candidates and left-over nodes of the offline partition. The resulting ratios do not require concentration bounds.

Corollary 3. For secretary matching with 2 arrivals and any $\alpha$-approximation algorithm for offline matching with $\alpha \le 1$, there is an algorithm with approximation ratio of $0.5459 \cdot \alpha$ for every $n \ge 1$. For $k$ arrivals, the ratio becomes at least $\frac{1 - 2^{-(k-1)}}{1 - 2^{-2k}/(1 - 2^{-k})} \cdot \alpha$ for every $n \ge 1$.
Proof. We apply the sampling phase recursively on the unknown candidates that have not arrived and the nodes $V_s$ that are still unmatched. In this way, we obtain more phases, and we denote the matching edges added in phase $i$ by $M_i$. In total, the matching $M$ computed by the algorithm is composed of $M = \bigcup_{i \ge 1} M_i$. For the first phase, we already argued above that $\mathbb{E}[w(M_1)] \ge \alpha \cdot (1 - (1 - p)^k - p^k) \cdot w(M^*)$. For the second phase, we consider the set of candidates that have not arrived during the first phase. For each such candidate and each of its $k$ arrivals, we can again assume to throw another random coin with probability $p$ if at least one arrival is in the second sampling phase. Thus, every candidate arrives at least once in the second sampling phase with probability $(1 - p)^k \cdot (1 - (1 - p)^k)$. For the offline partition, we can assume that each node survives the first phase independently with a probability of at least $p^k/(1 - (1 - p)^k)$. This value is exactly the probability of matching the node to a candidate with $k$ arrivals in the first phase. Thus, the $\alpha$-approximate matching $M_{s,2}$ based on the candidates that arrive for the first time in the second sampling phase has value at least
$$\mathbb{E}[w(M_{s,2})] \ge \alpha \cdot (1 - p)^k (1 - (1 - p)^k) \cdot \frac{p^k}{1 - (1 - p)^k} \cdot w(M^*) = \alpha \cdot p^k (1 - p)^k \cdot w(M^*).$$
Iterating this argument for the subsequent recursions, we see that
$$\mathbb{E}[w(M)] \ge \alpha \cdot \left(1 - (1 - p)^k - p^k\right) \cdot \sum_{i \ge 0} \left(\frac{p^k (1 - p)^k}{1 - (1 - p)^k}\right)^i \cdot w(M^*) = \alpha \cdot \frac{1 - (1 - p)^k - p^k}{1 - \frac{p^k (1 - p)^k}{1 - (1 - p)^k}} \cdot w(M^*).$$
Numerical optimization shows that for $k = 2$ and $p = 0.48638$ we obtain a ratio of at least 0.5459. More numerical results are shown in Table 2.
For large $k$, the optimal value for $p$ rapidly approaches 1/2. For $p = 1/2$ we obtain a ratio of at least $\alpha \cdot \frac{1 - 2^{-(k-1)}}{1 - 2^{-2k}/(1 - 2^{-k})}$.
Note that these ratios do not require concentration bounds. They apply for the expected value of the matching for any number n of candidates. ▪
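The value 0.5459 can be checked numerically. The sketch below assumes that the recursion yields a geometric series whose first term is the first-phase guarantee $1 - (1-p)^k - p^k$ and whose ratio is $p^k(1-p)^k/(1-(1-p)^k)$; this form is an assumption consistent with the probabilities stated in the proof, and the function name is hypothetical.

```python
def recursive_ratio(p, k):
    """Assumed ratio of the recursive sampling scheme (approximation factor 1)."""
    q = 1 - (1 - p) ** k                   # candidate seen in a given sampling phase
    first_phase = q - p ** k               # seen, but not with all k arrivals
    shrink = (1 - p) ** k * p ** k / q     # factor lost from one phase to the next
    return first_phase / (1 - shrink)      # closed form of the geometric series
```

Under this assumption, `recursive_ratio(0.48638, 2)` is approximately 0.5459, and `recursive_ratio(0.5, 2)` equals 6/11.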

POSTPONING SECRETARIES
In the postponing secretary problem, for each arriving candidate the algorithm can decide (accept/reject) or postpone it. The goal is to compute an optimal or near-optimal solution with a small expected number of postponements. Consider any algorithm for the postponement problem. We cluster the execution into rounds. Round $i$ consists of the arrivals from and including the $i$th unique arrival (i.e., the $i$th time a candidate arrives for the first time) and before the $(i + 1)$th unique arrival. Clearly, there are always $n - 1$ rounds in the execution of any algorithm. If we simply postpone every candidate until we have seen all $n$ candidates, we have full information to make accept/reject decisions for all candidates. Then the problem reduces to the classic coupon collector problem, and the expected number of returns is $\Theta(n \log n)$. Our goal is to examine how we can improve upon this baseline.
We first consider a general result for XOS packing. To reduce the expected number of postponements to $\Theta(n)$, it is sufficient to sacrifice a small constant factor in the approximation ratio. We obtain the following FPTAS-style trade-off between postponements and solution quality.
Proposition 4. For any XOS packing problem with an $\alpha$-approximation algorithm and any $\varepsilon \in (0, 1)$, there is an algorithm with approximation ratio $\alpha \cdot (1 - \varepsilon)$ using an expected number of $O(n \ln(1/\varepsilon))$ postponements.
Proof. We postpone every candidate until round $\lceil n(1 - \varepsilon) \rceil$. Then, we run the $\alpha$-approximation algorithm on the subset of arrived candidates to compute an approximate solution $S$. By the same arguments as in Proposition 1, this yields an $\alpha \cdot (1 - \varepsilon)$-approximation. Subsequently, there is no more postponement: for each candidate we decide upon arrival to accept (if contained in $S$) or reject (otherwise).
Let $R_i$ be the number of postponements in round $i$. Clearly, by linearity of expectation, $\mathbb{E}[R] = \sum_i \mathbb{E}[R_i]$. In each round, the number of postponements is the number of arrivals until we see the next unique arrival. Hence, the number of postponements is distributed according to a negative binomial distribution. The expected number is at most $n/(n - i)$ in round $i$, and summing over the rounds $i < \lceil n(1 - \varepsilon) \rceil$ yields $\mathbb{E}[R] = O(n \ln(1/\varepsilon))$. ▪
A similar (but not exactly FPTAS-style) analysis applies to monotone subadditive packing problems. We can sample candidates until round $\lceil n/k \rceil$, for an integer $k = 2, 3, 4, \dots$. Applying the $\alpha$-approximation algorithm to the subset of arrived candidates we compute the approximate solution $S$. Based on [7, Proposition 2], this solution represents an $\alpha/k$-approximation. The expected number of postponements is $O(n \ln(k/(k - 1)))$.
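The coupon-collector calculation behind this trade-off can be sketched as follows. The sketch assumes that every arrival before the $\lceil n(1-\varepsilon) \rceil$th unique arrival counts as one postponement and that the pool keeps all n candidates during this time; the function names are hypothetical.

```python
import math
import random

def expected_postponements(n, eps):
    """Sum of n/(n-i) over the rounds in which every arrival is postponed."""
    last = math.ceil(n * (1 - eps))
    return sum(n / (n - i) for i in range(1, last))

def simulate_postponements(n, eps, rng):
    """Draw uniformly from the full pool until `last` distinct candidates were seen."""
    last = math.ceil(n * (1 - eps))
    seen, count = set(), 0
    while True:
        c = rng.randrange(n)           # pool stays at n while everyone is postponed
        seen.add(c)
        if len(seen) >= last:
            return count               # from here on, candidates are decided on arrival
        count += 1                     # this arrival was postponed
```

For example, with n = 100 and ε = 1/2 the expected count is about 68, slightly below n ln(1/ε) ≈ 69.3.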

Exclusion-monotone algorithms
Significantly better guarantees can be obtained for packing problems and algorithms with a monotonicity property. Consider a packing problem and any algorithm $\mathcal{A}$. We denote by $\mathcal{A}(T)$ the solution computed by $\mathcal{A}$ when applied to $T \subseteq N$.
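An algorithm $\mathcal{A}$ is called r-exclusion-monotone if for every inclusion-monotone sequence $N_1 \subseteq N_2 \subseteq \dots \subseteq N$ there is a sequence of subsets $D_1, D_2, \dots$ with $D_i \subseteq N_i$ and $|D_i| \le r$ such that $\mathcal{A}(N_i) \subseteq D_i$ and $D_{i+1} \subseteq D_i \cup (N_{i+1} \setminus N_i)$ for all $i$.
Intuitively, to determine its solution for any subset of available elements $N_i$, an r-exclusion-monotone algorithm can restrict attention to a set $D_i$ of at most $r$ elements. Moreover, $\mathcal{A}$ is such that any element $e \in N_i \setminus D_i$ that is discarded must never be reconsidered when more elements become available.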
This property is fulfilled in a variety of important packing domains. For these problems we can obtain more fine-grained and significantly improved guarantees based on solution size.

Proposition 5. The following algorithms are r-exclusion-monotone.
• Optimal algorithm Opt for matroids. r is the rank of the matroid.
• Optimal algorithm Opt for bipartite matching. r is the maximum cardinality of any matching.
• Greedy 0.5-approximation algorithm for knapsack. Here r = |S| + 1 with S a feasible packing of the knapsack with maximum cardinality.
Proof. For matroids, upon arrival of an additional element, the new element forms a fundamental circuit with respect to the current optimal basis. A new optimal basis can be computed by discarding the element of smallest weight from the circuit. Since the new optimum can be computed from the old optimum and the newly arrived element, we never have to return discarded elements into the optimal solution. We can use $D_i$ as the set of elements in the optimum.
For matching, upon arrival of an additional node $v$, consider the symmetric difference between the old optimal matching $M$ and the new optimal matching $M'$. There is exactly one augmenting path starting with the newly arrived node and ending with a node from the offline partition or a node from the online partition that is in the current optimal matching. Hence, we can use $D_i$ as the set of online nodes in the optimal matching.
For knapsack, upon arrival of an additional element $e$, Greedy composes a feasible subset $S$ of elements by greedily packing elements in non-increasing order of ratio weight/size. In the end, the solution is either $S$ or the element of maximum weight $e_{\max}$, whichever gives more value. Hence, $D_i$ can be restricted to the new element $e$, the arrived element of maximum weight $e_{\max}$, and the greedy set $S$. ▪
Now consider candidates arriving in random order with postponements. Obviously, the set of arrived candidates forms an inclusion-monotone sequence. In our algorithm Maintain-$\mathcal{A}$, we apply the r-exclusion-monotone algorithm $\mathcal{A}$ in the beginning of round $i$ to the set $N_i$ of arrived candidates. Maintain-$\mathcal{A}$ immediately rejects any candidate as soon as it is not contained in $D_i$. It keeps postponing the candidates in $D_i$. Finally, Maintain-$\mathcal{A}$ accepts the candidates in $\mathcal{A}(N)$ after the last round. Note that for the following result, Maintain-$\mathcal{A}$ does not have to know $n$, $r$ or any properties of the unseen candidates. The following guarantee significantly improves over the simple bound given in Proposition 4 when the solution is drawn from a small subset of elements.
Proof. Consider the execution of the algorithm in rounds as discussed above. In each round, let $U_i$ denote the number of candidates that are still undecided (i.e., either have not arrived or have been left undecided in earlier rounds). In round $i$ we have seen exactly $i$ candidates. Thus, given a number of $U_i$ undecided candidates, the expected number of postponements $R_i$ in round $i$ is given by a negative binomial distribution and amounts to […].
To bound $U_i$ we note that, trivially, $U_i \le n$. Moreover, the number of candidates that have arrived and are undecided is $U_i - (n - i)$. Since Maintain-$\mathcal{A}$ postpones only candidates in the set $D_i$, we have that $U_i - (n - i) \le r$. This implies $U_i \le \min(n, n - i + r)$ and yields […]. Clearly, the first term in the bracket is at most 1. For $r \ge n - r + 1$, the second term is larger than the third term and amounts to O(r ln n/r′). For $r \le n - r + 1$, we upper bound […]. Thus, the asymptotics are dominated by the third term, and E[R] = O(r ln n/r′). A similar calculation using elementary lower bounds shows that E[R] = Ω(r ln n/r′). ▪
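The maintenance scheme can be illustrated for the simplest exclusion-monotone setting, the uniform matroid of rank r, where the optimum is the set of the r largest weights. The sketch below is a minimal simulation under the assumptions of distinct weights and of counting an arrival as a postponement whenever the arriving candidate stays undecided; the function name is hypothetical.

```python
import random

def maintain_top_r(weights, r, rng):
    """Keep the r best arrived candidates undecided and reject all others at once."""
    n = len(weights)
    pool = list(range(n))            # undecided candidates, seen or unseen
    seen, keep = set(), set()        # keep plays the role of the set D_i
    postponements = 0
    while len(seen) < n:
        c = rng.choice(pool)         # a random candidate from the pool arrives
        if c not in seen:
            seen.add(c)
            keep.add(c)
            if len(keep) > r:        # exclusion step: drop the lightest element
                worst = min(keep, key=weights.__getitem__)
                keep.discard(worst)
                pool.remove(worst)   # rejected and never reconsidered
            if len(seen) == n:
                break                # full information: accept everything in keep
        if c in keep:
            postponements += 1       # c remains undecided
    return keep, postponements
```

The returned set always equals the r heaviest candidates, while the number of postponements depends on the random arrival sequence.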

Matroids
We adjust Maintain-$\mathcal{A}$ for known matroids, that is, when the structure of the matroid is known upfront (only the weights of the elements are revealed). In this case, we can assume $r \le n/2$, since for $r \ge n/2$ we can consider finding a minimum-weight basis in the dual matroid. We adjust algorithm Maintain-$\mathcal{A}$ in the following way and denote the resulting algorithm by MaintainOPT. Instead of postponing all elements in the current optimum until the end, we can accept some elements earlier. In particular, we can directly accept an element $e$ as soon as there is no unseen element that can force $e$ to leave the optimum solution. This allows us to reduce the number of returns significantly, to below $n$ for any rank of the matroid.

Proof. Again we analyze the process in rounds as above. Given $U_i$ undecided candidates, the expected number of postponements $R_i$ in round $i$ is given by a negative binomial distribution. We obtain the bounds $U_i \le n$ and $U_i - (n - i) \le r$ as in the proof of Theorem 6. Due to the matroid property, if there are at most $x$ unseen candidates, they can cause at most a set of $x$ candidates to leave the optimum solution. This set can be determined in polynomial time (see Proposition 10 in the Appendix for a proof of this fact). Hence, MaintainOPT must keep at most $n - i$ arrived candidates undecided, that is, $U_i \le 2(n - i)$. This upper bound can be used to divide the process into three phases. Phase 1 consists of the rounds $i = 1, \ldots, r - 1$, where $U_i \le n$. Phase 2 consists of the rounds $i = r, \ldots, n - r$, where $U_i \le n - i + r$. Finally, Phase 3 consists of the rounds $i = n - r + 1, \ldots, n - 1$, where $U_i \le 2(n - i)$.

Observe that all three upper bounds on $U_i$ are tight when we apply MaintainOPT in the uniform matroid. In Phase 1, no candidate can be accepted or rejected and all arriving candidates get postponed. The number of undecided candidates stays $U_i = n$. In Phase 2, only the $r$ candidates in the current optimum solution are both arrived and undecided, so $U_i = n - i + r$. In Phase 3, upon each arrival of a previously unseen candidate, the algorithm makes accept/reject decisions. It can accept exactly one candidate and reject exactly one other candidate in each of the rounds of Phase 3, and thus $U_i = 2(n - i)$.
This proves that the uniform matroid results in the largest expected number of postponements.

Note that for any postponement problem, a simple calculation shows that the expected number of postponements of any single candidate can always be upper bounded by $\ln n + 1$. In the worst case, a candidate arrives in the very first round, gets accepted or rejected only at the very end, and no candidate is accepted or rejected before the last round (so for every round the number of undecided candidates is $U_i = n$). From the postponements in round $i$, candidate $j$ receives in expectation an equal share. Hence, the expected number of postponements of $j$ in this case is at most $\ln n + 1$. In contrast, the previous theorem shows that, on average, we need fewer than one postponement per candidate to compute even an optimal solution in matroids. However, the postponements can be quite unbalanced across the candidates. We fully characterize the expected number of postponements in the uniform matroid with $r \le n/2$. The worst candidate in the optimal solution (i.e., the $r$th best candidate) asymptotically gets the largest expected number of postponements. The expected number decreases quickly for better and worse candidates.
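The worst-case bound of $\ln n + 1$ for a single candidate can be retraced with a short calculation. It is a sketch under two assumptions that the text states only informally: a round with $n$ undecided candidates, $n - i$ of them unseen, lasts $n/(n-i)$ draws in expectation, and candidate $j$ receives a $1/n$ share of these draws.

```latex
E[R_j] \;\le\; \sum_{i=0}^{n-1} \frac{1}{n}\cdot\frac{n}{n-i}
       \;=\; \sum_{k=1}^{n} \frac{1}{k}
       \;=\; H_n \;\le\; \ln n + 1 .
```

Here $H_n$ denotes the $n$th harmonic number, and the last step is the standard estimate $H_n \le 1 + \int_1^n \frac{dx}{x}$.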

Theorem 8. For MaintainOPT with known uniform matroid of rank $r \le n/2$, the expected number of postponements $R_j$ of the $j$th best candidate is bounded by a logarithmic term plus $O(1)$.

Proof. We again consider the process in rounds. Let us first concentrate on the top candidates.
Top-r candidates: For the top r candidates, we again consider three phases (phase 1: every candidate is postponed, rounds 1, … , r − 1; phase 2: exactly the top r candidates that have arrived so far are postponed, rounds r, … , n − r; phase 3: one candidate is accepted and one is rejected, rounds n − r + 1, … , n − 1).
For simplicity, we number the candidates according to their value, that is, the $j$th best candidate is simply candidate $j$, for $j = 1, \ldots, r$. None of the best $r$ candidates will ever be rejected, since they are part of the optimum that is accepted in the end. Hence, they will just be postponed or accepted. Let $R_j^i$ be the number of postponements of candidate $j$ in round $i$. By linearity of expectation, $E[R_j] = \sum_{i=1}^{n-1} E[R_j^i]$.

We start by analyzing each of the rounds $i = 1, \ldots, r - 1$ in Phase 1. If $j$ has not arrived until round $i$, then $E[R_j^i] = 0$. If $j$ arrives before round $i$, then $E[R_j^i]$ is a fair share of the expected number of non-unique arrivals of round $i$. If $j$ is the $i$th unique arrival, it will be postponed once more at the time of its unique arrival. Clearly, $j$ is the $i$th unique arrival with probability $1/n$, for every $i = 1, \ldots, r - 1$.

Let us now analyze the rounds $i = r, \ldots, n - r$ in Phase 2. Candidate $j$ will not be accepted during any of these rounds. If $j$ has not arrived until round $i$, then $E[R_j^i] = 0$. If $j$ arrives before round $i$, then $R_j^i$ is a fair share of the non-unique arrivals of round $i$. If $j$ is the $i$th unique arrival, it will be postponed once more at the time of its unique arrival, which again happens with probability $1/n$.

In Phase 3, matters get slightly more complicated. Candidate $j$ gets accepted at the beginning of round $i$ if there are at most $k = i - (n - r) - 1$ strictly better candidates than $j$ that have arrived so far. Hence, candidate $j$ gets accepted at the latest in the beginning of round $n - r + j$, and therefore $E[R_j^i] = 0$ for every $i = n - r + j, \ldots, n - 1$. However, $j$ is quite likely to be accepted earlier, especially if $j$ is bounded away from 1 and $r$.
In particular, the probability that $j$ has arrived before round $i$ is again $(i - 1)/n$. Conditioned on the fact that $j$ has arrived in this way, let $X_j$ be the number of candidates from the set of the best $j - 1$ candidates that are among the first $i$ unique arrivals. $X_j$ is distributed according to a hypergeometric distribution: we draw $j - 1$ times (positions of the $j - 1$ top candidates) out of an urn of $n - 1$ balls (the unique arrivals except the one chosen for $j$), where we have $i - 1$ blue balls (unique arrivals up to and including round $i$, excluding the one chosen for $j$) and $(n - 1) - (i - 1) = n - i$ remaining red balls. $X_j$ is the number of blue balls we draw. If $X_j \le k$, then $j$ does not get postponed in round $i$, that is, $R_j^i = 0$. Otherwise, $j$ is undecided in round $i$ and gets postponed further. In round $i$ there are $n - i$ arrived but undecided candidates and $n - i$ unseen ones. Thus, the number of postponements until the $(i + 1)$th unique candidate arrives is distributed according to a negative binomial distribution with probability $1/2$. From these postponements, candidate $j$ is drawn a fair share of times. The case in which $j$ arrives newly in round $i$ is handled similarly to the arguments above.

For some rounds, it is quite likely that $X_j \ge k + 1$ is true. Intuitively, this is true for the rounds in which $i$ is smallest, that is, for rounds $i$ with $E[X_j] \ge k + 1$ or, equivalently, $i \le n - r + \ell^*$, where $\ell^* = \frac{n - r - 1}{n - j} \cdot (j - 1)$.

For the rounds $i = n - r + 1, \ldots, n - r + \lfloor \ell^* \rfloor$ we upper bound the probability $\Pr(X_j \ge k + 1)$ by 1, and hence Phases 1, 2 and the first part of Phase 3 can be combined into a single sum, whose first term is then bounded separately.

For the last term, in rounds $n - r + \lfloor \ell^* \rfloor + 1, \ldots, n - r + j - 1$ we note that the improvement is small since $j \le r \le n/2$. By simply upper bounding $\Pr(X_j \ge k + 1) \le 1$ again, we obtain an overall logarithmic bound.

Note that the logarithmic bound for the combined first part is arguably more accurate, since the rounds $i > n - r + \lfloor \ell^* \rfloor$ have significantly decreased probability for $j$ to get postponed until round $i$. The rather direct bound differs from it only by an additive term of at most $\ln 2 < 1$.
Bottom-$(n - r)$ candidates: The $j$th best candidate with $j = r + 1, \ldots, n$ is never accepted, only postponed and rejected. We again consider the algorithm in phases. The first phase is composed of rounds $i = 1, \ldots, r$, in which no candidate gets rejected. By repeating the analysis above, we obtain the same bounds for these rounds.

The second phase now consists of the remaining rounds $r + 1, \ldots, n$. If it has arrived, then candidate $n$ will definitely get rejected in the beginning of round $r + 1$, candidate $n - 1$ in round $r + 2$, and candidate $j$ in round $(n - j) + (r + 1)$. However, candidate $j$ is much more likely to get rejected earlier. Here our analysis must take this fact into consideration and extend the arguments made for Phase 3 above. For candidate $j$, we condition on the fact that $j$ has arrived in the first $i$ rounds. Then it gets rejected before or in the beginning of round $i$ if at least $r$ strictly better candidates have arrived. Let $Y_j$ be the number of candidates from the set of the best $j - 1$ candidates that are among the first $i$ unique arrivals. $Y_j$ is again distributed according to a hypergeometric distribution: we draw $j - 1$ times out of an urn of $n - 1$ balls, where we have $i - 1$ blue balls and $n - i$ red balls. $Y_j$ is the number of blue balls we draw.
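The urn model for $Y_j$ can be evaluated exactly for small parameters. The following is a minimal sketch, assuming the parametrization described above (an urn of $n - 1$ balls, $i - 1$ of them blue, and $j - 1$ draws without replacement); the function names `hypergeom_pmf` and `prob_still_undecided` are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_pmf(x, population, blue, draws):
    """Pr(exactly x blue balls) when drawing `draws` balls without replacement
    from an urn with `population` balls, `blue` of which are blue."""
    red = population - blue
    if x < 0 or x > blue or draws - x < 0 or draws - x > red:
        return 0.0
    return comb(blue, x) * comb(red, draws - x) / comb(population, draws)


def prob_still_undecided(n, r, j, i):
    """Pr(Y_j <= r - 1): fewer than r of the j-1 better candidates are among
    the i-1 other arrivals (urn of n-1 balls, i-1 blue, j-1 draws)."""
    return sum(hypergeom_pmf(x, n - 1, i - 1, j - 1) for x in range(r))
```

For $r = 1$ the value `prob_still_undecided(n, 1, j, i)` is the probability of drawing no blue ball, which can be compared with the simpler with-replacement expression `((n - i) / (n - 1)) ** (j - 1)` as a sanity check.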
If $Y_j \ge r$, then $j$ does not get postponed in round $i$, that is, $R_j^i = 0$. Otherwise, $j$ is undecided in round $i$ and gets postponed. Following the analysis for Phase 2 above, the bound on $E[R_j^i]$ for rounds $i = r + 1, \ldots, n - r$ carries the additional factor $\Pr(Y_j \le r - 1)$. Moreover, following the analysis for Phase 3 above, the same holds for rounds $i = n - r + 1, \ldots, n$.

For some rounds, it is quite likely that $Y_j \le r - 1$ is true. Intuitively, this is true for the rounds in which $i$ is smallest, that is, for rounds with $E[Y_j] \le r - 1$ or, equivalently, $i \le r + \ell^*$, where $\ell^* = \frac{n - j}{j - 1} \cdot (r - 1)$. Note that this implies $r + \ell^* < n - j + r$. For the rounds $i = r + 1, \ldots, r + \lfloor \ell^* \rfloor$, we upper bound the probability by 1, which allows us to combine Phase 1 and the first part of Phase 2.

For the rounds $i = r + \lfloor \ell^* \rfloor + 1, \ldots, n - j + r$, we first consider $r = 1$. In this case, $\Pr(Y_j \le r - 1) = \Pr(Y_j = 0) \le \left(\frac{n - i}{n - 1}\right)^{j - 1}$, where the latter is the probability of getting no blue balls when we draw with replacement. This probability is obviously higher, since we replace the red balls upon drawing them. Now we know $j \ge r + 1 = 2$, which bounds the remaining sum. This proves the theorem for $r = 1$.
When $r \ge 2$ we bound the sum using a tail bound for the hypergeometric distribution. Here we use $p = \frac{i - 1}{n - 1}$ and $t = \frac{i - 1}{n - 1} - \frac{r - 1}{j - 1}$, which implies for round $i$ that $r - 1 = E[Y_j] - t(j - 1)$. Note that $t < p$ since $r \ge 2$. Now we define $\alpha = (r - 1)/(j - r)$. With $j \ge r + 1$ and $r \ge 2$ we have $\frac{1}{n - 2} \le \alpha \le n - 2$. To provide a constant upper bound on (3), we first consider the case when $j = r + 1$. Observe that in this case $j - r = 1$ and $\alpha = r - 1$, so the formula simplifies. This proves the theorem for $r \ge 2$ and $j = r + 1$.

Finally, for $r \ge 2$ and $j \ge r + 2$, we split up the sum using values $n\alpha/(1 + \alpha) = a_0 \le a_1 \le \cdots \le a_k \le a_{k+1} \le \cdots$. Note that the $a_k$ are chosen such that over the interval $[a_k, a_{k+1}]$ the function $\frac{1}{n - i}$ differs by at most a factor of $e$. For the rounds $i \in [a_0, a_1]$ we again use the upper bound $\Pr(Y_j \le r - 1) \le 1$. For the remaining sum, we bound each term for $k = 1, 2, \ldots$ separately, using a standard upper bound via the integral. Hence, overall for the case $j - r \ge 2$ the remaining term is constant and the theorem follows. For the sake of clarity in the analysis we did not attempt to optimize any constants. Based on our experiments (see Figure 1), it seems possible to reduce the term $O(1)$ significantly by increasing the technical overhead in the analysis. ▪

Figure 1. The x-axis is the index of the candidate in the sorted order. The y-axis shows the average number of postponements over 5000 runs. The $O(1)$-terms in Theorem 4.2 turn out to be small. They appear to be maximal for candidates $r$ and $r + 1$, but seem to vanish for growing $n$.

Based on our experiments in Figure 1, the $O(1)$ terms are small and even seem to vanish for large $n$. The logarithmic function captures the number of postponements rather precisely.
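An experiment of the kind summarized in Figure 1 can be sketched as follows. This is not the authors' experimental code, only a minimal simulation under stated assumptions: candidates are indexed $0$ (best) to $n - 1$ (worst), each draw is uniform over all undecided candidates, a candidate is rejected once $r$ better candidates have arrived, and it is accepted once the number of better arrived candidates plus the number of unseen candidates is at most $r - 1$. The function names are hypothetical.

```python
import random


def maintain_opt_uniform(n, r, rng):
    """One run of the postponement process in a uniform matroid of rank r.
    Candidate 0 is the best and candidate n-1 the worst.
    Returns a list with the number of postponements of each candidate."""
    unseen = set(range(n))
    arrived = set()
    undecided = set()          # arrived, but neither accepted nor rejected yet
    counts = [0] * n
    while unseen or undecided:
        c = rng.choice(sorted(unseen | undecided))
        if c in unseen:        # unique arrival: new information, re-decide
            unseen.remove(c)
            arrived.add(c)
            undecided.add(c)
            for d in sorted(undecided):
                better = sum(1 for a in arrived if a < d)
                if better >= r:                       # can never be in the top r
                    undecided.remove(d)
                elif better + len(unseen) <= r - 1:   # certainly in the top r
                    undecided.remove(d)
        if c in undecided:     # c was drawn but stays undecided: one postponement
            counts[c] += 1
    return counts


def average_postponements(n, r, runs, seed=0):
    """Average number of postponements per candidate over `runs` simulations."""
    rng = random.Random(seed)
    totals = [0] * n
    for _ in range(runs):
        for c, x in enumerate(maintain_opt_uniform(n, r, rng)):
            totals[c] += x
    return [t / runs for t in totals]
```

Plotting `average_postponements(n, r, runs)` against the candidate index should give a curve peaking around candidates $r$ and $r + 1$, in line with the qualitative behavior described for Figure 1.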
For matroids, the number of postponements of MaintainOPT with known matroid is always at most $n$. In contrast, for bipartite matching the number of postponements of MaintainOPT must grow to $\Theta(n \log n)$ when $r$ becomes large, even if the graph is known.

Example 1. Consider a simple cycle of length $2n$ and number the vertices consecutively around the cycle. Suppose the $r = n$ even vertices form the offline partition $V$, and the $n$ odd vertices arrive in random order. The edge weights can be arbitrary, but an adversary chooses them to be in $[1, 1 + \varepsilon]$. Then, unless we see all vertices, we cannot decide which of the two perfect matchings will be the optimal one. MaintainOPT needs to see all vertices to be able to decide the matching edges. We recover the coupon collector problem.

The example also applies when the edges of the bipartite graph are candidates that arrive in random order (rather than the vertices). In order to guarantee that an optimal solution is returned with probability 1 in the end, all $2n$ candidate edges need to remain undecided until the last unique arrival. This shows, in particular, that the bound of $O(r' \ln(n/r'))$ for MaintainOPT for known matroids cannot be extended to known intersections of matroids.
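The coupon collector behavior in Example 1 can be made concrete with a few lines of code. This is a sketch under the assumption that nothing can be decided before every candidate has been seen, so each draw is uniform over all $n$ candidates; `expected_draws` and `simulate_draws` are hypothetical helper names.

```python
import random
from fractions import Fraction


def expected_draws(n):
    """Exact coupon collector expectation n * H_n as a fraction."""
    return n * sum(Fraction(1, k) for k in range(1, n + 1))


def simulate_draws(n, rng):
    """Number of uniform draws until each of the n candidates was seen once."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws
```

Since $n H_n = \Theta(n \log n)$, averaging `simulate_draws` over many runs reproduces the growth rate claimed in the example.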

Exclusion-monotonicity and solution size
For $r$-exclusion-monotone algorithms $\mathcal{A}$, the algorithm Maintain-$\mathcal{A}$ needs at most $O(r \ln n)$ postponements. One might hope that for any $r$-exclusion-monotone algorithm the parameter $r$ is tied closely to the solution size of the algorithm. Then a large number of returns in Maintain-$\mathcal{A}$ would be caused by returning a solution with many elements. This, however, is not the case: even if we are guaranteed that the size of the optimal solution is $\Theta(\log n)$, an expected number of $\Omega(n \log \log n)$ postponements for MaintainOPT can be required.

Theorem 9. There is a class of instances of the independent set problem with every optimal solution of size $|I^*| = 3 \ln n$, for which the expected number of postponements $R$ in MaintainOPT is $E[R] = \Omega(n \ln \ln n)$.
Proof. For any $\varepsilon > 0$, consider a complete $m$-partite graph with $m = n/(3 \ln n)$. The nodes arrive sequentially in random order, and every node has an arbitrary positive but otherwise unknown value. Clearly, the optimum independent set $I^*$ consists of the nodes of exactly one partition, and $|I^*| = n/m = 3 \ln n$.

The adversary assigns the values of nodes in a small but unknown interval $[1, 1 + \varepsilon]$. Thus, unless every node of a partition has arrived, the nodes in that partition cannot be rejected by MaintainOPT, since the last node could result in that partition becoming the optimal one. The nodes of the optimal partition can only be accepted at the very end when all nodes have been seen. A partition of nodes can be rejected only if it has fully arrived, and only if there is at least one other partition for which the arrived nodes have larger total value.

We again analyze MaintainOPT in rounds based on the unique arrivals as in the previous proofs. Consider the status at the end of round $k = n - 1 - m$. There are $m$ unseen candidates. Our idea is to determine these candidates by throwing $m$ balls (unseen candidates) into $m$ bins (partitions). Then in each partition we pick the subset of candidates from that partition randomly. Note that the equivalence between our arrival process and the balls-and-bins process breaks only if we throw too many balls into a bin. After all, each partition (bin) contains only $3 \ln n$ candidates (space for balls). However, if we throw $m = n/(3 \ln n)$ balls into $m$ bins, standard calculations show that with probability at most $2/m$ there is at least one bin with load at least $(3 \ln m)/(\ln \ln m)$. Hence, with probability $1 - \frac{12 \ln n}{n}$, there are undecided candidates throughout the first $k$ rounds, in which case the expected number of postponements in rounds $i = 1, \ldots, k$ is bounded from below by a multiple of $\ln(3 \ln n - 2)$ that is $\Omega(n \ln \ln n)$. ▪
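The balls-into-bins step of the proof can be explored numerically. The following sketch throws $m$ balls into $m$ bins uniformly at random and reports the maximum load, which can be compared with the threshold $(3 \ln m)/(\ln \ln m)$ quoted above. The function names are hypothetical, and `load_threshold` assumes $m \ge 3$ so that $\ln \ln m$ is positive.

```python
import math
import random
from collections import Counter


def max_load(m, rng):
    """Throw m balls into m bins uniformly at random; return the fullest bin's load."""
    loads = Counter(rng.randrange(m) for _ in range(m))
    return max(loads.values())


def load_threshold(m):
    """The threshold (3 ln m) / (ln ln m); requires m >= 3."""
    return 3 * math.log(m) / math.log(math.log(m))
```

Counting how often `max_load(m, rng)` reaches `load_threshold(m)` over many seeded runs gives an empirical estimate of the failure probability used in the argument.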

CONCLUSION AND OPEN PROBLEMS
In this paper, we have studied two variants of secretary problems with multiple arrivals per candidate. For the secretary problem with 2 arrivals, we propose a simple 0.5-competitive algorithm that can be combined with arbitrary approximation algorithms in general subadditive packing domains. For the prominent case of bipartite matching, we propose algorithms that improve upon this guarantee. Moreover, the algorithms are easily applicable to the case with $k > 2$ arrivals, and for growing $k$ the competitive ratio approaches 1 exponentially fast. For secretary postponement, we study algorithms that obtain the optimal solution by making only a small number of postponements. This problem can be seen as a combinatorial generalization of the coupon collector problem. We consider a natural property of exclusion monotonicity that is fulfilled in a number of prominent applications. This property can be exploited to reduce the expected number of postponements to $O(r' \log n)$ or even $O(r' \log(n/r'))$, where $r$ is the cardinality of the solution and $r' = \min(r, n - r)$.

There are a number of interesting open problems that stem from our work. It remains a fascinating open problem whether the ratio of 0.5 can be improved in the general subadditive packing case with 2 arrivals. Moreover, the optimal ratio for the classic secretary problem with 2 arrivals is around 0.768 [26]. It would be interesting to see if there are algorithms that achieve this ratio even for bipartite matching. In a different direction, are there better algorithms for other secretary problems with 2 arrivals based on, for example, special classes of matroids or independent sets? For secretary postponement, we identify exclusion monotonicity as a property that allows us to drastically decrease the expected number of postponements compared with the simple guarantee stemming from coupon collection. Are there other structural properties that allow similar results? Can improved results be shown in other domains, such as maintaining approximate solutions in variants related to independent set or integer programming?