# online_task_assignment_problems_with_reusable_resources__650fb363.pdf Online Task Assignment Problems with Reusable Resources Hanna Sumita,1 Shinji Ito,2 Kei Takemura,2 Daisuke Hatano,3 Takuro Fukunaga,4 Naonori Kakimura,5 Ken-ichi Kawarabayashi 6 1 Tokyo Institute of Technology 2 NEC Corporation 3 RIKEN AIP 4 Chuo University 5 Keio University 6 National Institute of Informatics sumita@c.titech.ac.jp, {kei takemura, i-shinji}@nec.com, daisuke.hatano@riken.jp, fukunaga.07s@chuo-u.ac.jp, kakimura@math.keio.ac.jp, k keniti@nii.ac.jp We study online task assignment problem with reusable resources, motivated by practical applications such as ridesharing, crowdsourcing and job hiring. In the problem, we are given a set of offline vertices (agents), and, at each time, an online vertex (task) arrives randomly according to a known time-dependent distribution. Upon arrival, we assign the task to agents immediately and irrevocably. The goal of the problem is to maximize the expected total profit produced by completed tasks. The key features of our problem are (1) an agent is reusable, i.e., an agent comes back to the market after completing the assigned task, (2) an agent may reject the assigned task to stay the market, and (3) a task may accommodate multiple agents. The setting generalizes that of existing work in which an online task is assigned to one agent under (1). In this paper, we propose an online algorithm that is 1/2competitive for the above setting, which is tight. Moreover, when each agent can reject assigned tasks at most times, the algorithm is shown to have the competitive ratio /(3 1) 1/3. We also evaluate our proposed algorithm with numerical experiments. Introduction Online task assignment problem has attracted extensive attention recently in combinatorial optimization and artificial intelligence. The problem models practical situations that assign agents to tasks arriving online. For example, in a rideshare platform (Dickerson et al. 2021; Dong et al. 2021; Lowalekar, Varakantham, and Jaillet 2020; Nanda et al. 2020) such as Uber and Lyft, we match drivers to riders where requests from riders arrive one by one. Other applications include crowdsourcing (Assadi, Hsu, and Jabbari 2015; Ho and Vaughan 2012; Xu et al. 2017) and job hiring (Anagnostopoulos et al. 2018; Dickerson et al. 2019). In this paper, we study online task assignment problem under known adversarial distributions. In the problem, we are given a time horizon T and a bipartite graph G = (U, V ; E), where U corresponds to a set of agents and V represents types of tasks. A type of a task is usually defined by attributes of the task, e.g., pick-up/drop-off locations of Copyright 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. a rider in a rideshare platform and language skills required to complete a task in a crowdsourcing platform. An edge (u, v) E means that u has ability to process an online task v. In the problem, the set U of agents is known in advance (offline vertices), while we are given a vertex with a type v V (an online vertex) at each time, randomly according to a time-dependent distribution Dt over V . We identify an online vertex with its type v. On arrival of v, we immediately and irrevocably either assign some agents to v, or discard the chance of processing v. Then we obtain a profit produced by an assigned agent if she completes a task. The goal is to design an online algorithm that maximizes the total profit. We here assume that Dt is known in advance, which is called a Known Adversarial Distribution (KAD) model (Dickerson et al. 2021); see also (Alaei, Hajiaghayi, and Liaghat 2012). If Dt is the same for all t, it is called a Known I.I.D. (KIID) model. Motivated by practical applications, it has been studied to generalize the online task assignment problem with additional specific conditions. An example is reusability of agents (Dickerson et al. 2021). In the standard online task assignment problem, when an agent is assigned a task, she will leave immediately from the market. However, if agents are reusable, an agent comes back to the market after completing the task, and she can be matched to a new task again. Such situation especially arises in a rideshare platform with drivers and riders. Another example is rejections by agents (Nanda et al. 2020). It is natural that an agent is allowed to reject an assigned task if it is not satisfactory. Moreover, each agent may have an upper bound on the number of rejections, that is, when she rejects an assigned task times, then she must leave from the market. It should be noted that the previous work mentioned above studied the online bipartite matching where only one agent can be assigned to an online task, and it is not straightforward to extend1 the results to the online task assignment problem. 1We can consider a simple reduction that replaces an online vertex v with capacity bv as bv vertices with capacity 1 arriving sequentially. Such reduction, however, would not work in the KIID/KAD model because an online vertex is chosen independently at each time. The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22) In this paper, we introduce the online task assignment problem in the KAD model with all the constraints mentioned above, that is, the problem with the following constraints: A. (Reusability) an agent comes back to the market after she completes a task, where the occupation time is drawn from a known distribution, B. (Rejections) an agent u rejects an assignment with some probability, and has to leave after rejecting u times, and C. (Task capacity) an online vertex v can accommodate at most bv offline vertices. See the next section for a formal problem definition. Our Contribution Our main result is to propose an online algorithm for the above problem. We prove that our algorithm has the following theoretical guarantees: 1/2-competitive for the unlimited rejection case, i.e., when u is + for all u (Theorem 1), /(3 1)-competitive for the general case (Theorem 2), where = maxu U u. We here evaluate the performance of online algorithms with the common measure called competitive ratio. We say that an algorithm is α-competitive if the expected profit obtained by the algorithm is at least α times the offline optimal value. The formal definition is given in the next section. Our results and existing results are summarized in Table 1. Note that our algorithm is useful for the problem without reusability. In fact, we prove that our algorithm has the competitive ratio (1 1/e2)/2 > 0.432 in the KIID model without reusability, which improves on the competitive ratio 1/e (Nanda et al. 2020) for the same setting (Theorem 3). In addition, we show that no (adaptive) algorithm can achieve the competitive ratio better than 1/2. This is implied by a well-known example for the prophet inequality problem (Krengel and Sucheston 1977, 1978). We note that this hardness result is applied even in the setting of (Dickerson et al. 2021). Thus our result complements their result, which considers only non-adaptive algorithms. Due to the space limitation, we omit some results and the details of the proofs; see the full version (Sumita et al. 2022). Below let us describe technical highlights of the proof of our algorithms. Technique To design an online algorithm, we first construct an offline linear program (LP) that gives an upper bound on the offline optimal value. This is a similar approach to the online bipartite matching in the KIID/KAD model (Alaei, Hajiaghayi, and Liaghat 2012; Dickerson et al. 2021; Nanda et al. 2020). Our LP has variables xuv,t for each (u, v) E and t, and has linear inequalities that incorporate the constraints (A) (C) above. See the next section for the formal definition of our LP. Intuitively, each variable xuv,t corresponds to a probability that v arrives at time t and u is assigned to v. We shall use an LP optimal solution x uv,t to determine how to assign agents to v at each time. There are, however, several difficulties to obtain our results as described below. The first difficulty is to handle the task capacity condition, i.e., that we choose a set of agents each time. Given LP optimal solution x e,t for each edge e, we would like to have a distribution of feasible sets so that the probability of choosing e at time t is x e,t. For the online bipartite matching, this is easy; we just choose an edge e with probability proportional to x e,t at time t. However, for the online task assignment problem, such independent choice of edges may violate the task capacity condition. To avoid it, we use Carath eodory s theorem in convex analysis. The theorem allows us to decompose x e,t to a polynomial number of feasible sets, which can be regarded as a probability distribution on feasible sets. Another difficulty is that the LP optimal value may give a loose upper bound on the offline optimal value. In this case, finding a feasible set based on LP is not sufficient; we may select an agent which should not be chosen. In fact, there exists a problem instance such that an algorithm similarly to Nanda et al. (2020) that just assigns an agent according to an LP optimal solution would fail to obtain the competitive ratio more than 1/3 (See the full version for a specific example). To overcome this difficulty, we adapt the idea of Alaei, Hajiaghayi, and Liaghat (2012), who analyzed a different online matching problem in the KAD model. Specifically, after finding a feasible set S based on x e,t, we decide whether to assign a task v to u for each agent u S. We compute the expected profit that u earns at and after the current time t by assigning u to v, and, if it is larger than the one when not assigning, then we decide to assign u to v. To prove that the proposed algorithm admits desired competitive ratio, we evaluate the expected profit that each vertex u earns. Let Rd u,t be the expected profit that u earns at and after time t, when u has a remaining budget d. Then Rd u,t can be represented recursively with respect to t and d. Since the recursive equation is linear, we can bound Rd u,t by introducing another linear program and its dual. We note that when u is infinite, the recursive equation becomes simpler depending on only t, that gives a better competitive ratio. Let us describe the differences from (Dickerson et al. 2021; Nanda et al. 2020). The algorithm of Dickerson et al. (2021) assumes that the algorithm can access a probability that u U is available at time t, i.e., u is not occupied by a task at time t. This assumption may be problematic because it is not easy to obtain such information precisely. In contrast, we adopt expected profit after t to decide the assignment, which can be computed deterministically and efficiently by dynamic programming. Our algorithm is also different from Nanda et al. (2020), as their algorithm just uses x e,t as a probability mentioned above, which works only for the KIID model. Our algorithm is shown to admit a better competitive ratio in their setting, which improves on the competitive ratio 1/e by Nanda et al. (2020) (Theorem 3). Experiments We evaluate the performance of our algorithm through experiments. We use a synthetic dataset and the real-world dataset of taxi trip records, similarly to previous work (Dickerson et al. 2021; Nanda et al. 2020). Our algorithm performs the best in most cases, and runs practi- Reusability # Rejections # Assigned agents Online vertices Comp. ratio Hardness This work + 1 KAD 1/2 1/2 This work 1 KAD /(3 1) (Dickerson et al. 2021) NA 1 KAD 1/2 (1/2) This work 1 KIID > 0.432 (Nanda et al. 2020) 1 KIID 1/e 1 1/e Table 1: Summary of our results and previous work. The algorithm (Dickerson et al. 2021) requires to know in advance the probability that an agent is available at each time. cally fast enough. The results imply the superiority of our algorithm. Related Work The online task assignment problem is a generalization of the online bipartite matching (where each vertex can match to at most one neighbor). Nowadays there exist a large body of literature on online matching. We here mention some of them closely related to our problem. See Mehta (2013) for the detailed survey. The online bipartite matching was introduced by Karp, Vazirani, and Vazirani (1990). They considered the adversarial input model, that is, the model that online vertices arrive in an adversarial order, and proposed a randomized (1 1/e)-approximation algorithm for the unweighted case. It is known that the ratio is tight (Birnbaum and Mathieu 2008; Mehta et al. 2007). When online vertices arrive independently according to a distribution (i.e., the KIID model), Manshadi, Gharan, and Saberi (2012) proposed a 0.702competitive algorithm and showed that the ratio cannot be better than 0.823. For the edge-weighted case, a 0.667competitive algorithm is known (Haeupler, Mirrokni, and Zadimoghaddam 2011). Online stochastic matching, introduced by Mehta and Panigrahi (2012), is the problem that an offline vertex accepts an assignment with a probability. This can be viewed as that infinite number of rejections are allowed in the process. For the problem in the adversarial input model, Mehta, Waggoner, and Zadimoghaddam (2015) presented a 0.534competitive algorithm for the unweighted case when edge probabilities go to 0. For the KIID model, Brubach et al. (2016) gave a (1 1/e)-competitive algorithm, which works also in the edge-weighted case. Online bipartite matching in the KAD model was introduced by Alaei, Hajiaghayi, and Liaghat (2012) under a name of prophet-inequality matching, as the problem includes the prophet inequality problem (Krengel and Sucheston 1977, 1978) as a special case. Alaei, Hajiaghayi, and Liaghat (2012) extended the problem so that an offline vertex has a capacity. When offline vertices have capacities, the online bipartite matching is often called the Ad Words problem (Mehta et al. 2007). This variant is also studied extensively (Devanur and Hayes 2009; Lowalekar, Varakantham, and Jaillet 2020). Recently, online task assignment with the reusability condition receives attention. Dickerson et al. (2021) gave a 1/2-competitive algorithm for the KAD model, and variants of the problem have been studied (Dong et al. 2021; Gong et al. 2021; Goyal, Iyengar, and Udwani 2021; Rusmevichientong, Sumida, and Topaloglu 2020); see also references therein. Our problem can be viewed as a general framework that unifies the problem of Dickerson et al. and online stochastic matching with limited number of rejections. We also mention that our condition of the task capacity is close to the setting of the online assortment optimization (see e.g. (Gong et al. 2021)). In this problem, an online vertex is offered a set of offline vertices (an assortment), and the online vertex selects one of them. Nanda et al. (2020) discussed the online bipartite matching problem that maximizes fairness, instead of the total profit. Here, the fairness means the smallest acceptance ratio in tasks. Note that our results can easily be extended to their setting to balance a trade-off between profit and fairness. In this section, we describe a formal definition of our problem. For a positive integer k, we denote [k] = {1, . . . , k}. We are given a bipartite graph G = (U, V ; E) with edge weight we 0, where U is the set of offline vertices and V is the set of types of online vertices. We are also given a time horizon T. Each offline vertex u U has a positive integer u, which is a budget of allowed rejections in the process. In addition, each edge (u, v) E has a random variable Ce [T] that represents the occupation time for u to complete the task v. In other words, when u accepts v, u is absent from the market, and will be available at time t + Ce, where Ce is drawn from a given distribution. We say that an offline vertex u is available at time t if u is in the market (i.e., not being occupied by some task) and u has not rejected online vertices u times. For each time t [T], an online vertex v arrives according to a probability distribution {pv,t}v.2 Upon arrival of a vertex v, we immediately and irrevocably either assign at most bv neighbors to v that are available, or discard v. When u U is assigned, u either accepts v with probability qe or rejects v with probability 1 qe, where e = (u, v). When u accepts v, we obtain a profit we and u becomes absent from the market during the occupation time Ce. We note that we may assume u > 0 for all u U. When no rejection is allowed for u, i.e., u = 0, only edges e with qe = 1 can be matched to u. Therefore, we can remove all the edges e such that e is incident to u and qe < 1, and set u to 1. 2Nothing arrives at time t with probability 1 P Our goal is to design an online (randomized) algorithm that maximizes the expected total profit. The performance of an online algorithm is evaluated by the competitive ratio. In the subsequent sections, we define the offline optimal value and the competitive ratio. We note that our model includes the online bipartite matching problem studied in (Dickerson et al. 2021; Nanda et al. 2020) as special cases. We assume that bv = 1 for each v V . When u = + for any u U and the acceptance probability qe is one for each e E, we can ignore the constraint on rejections, and hence our model is identical to the one in (Dickerson et al. 2021). On the other hand, when probability distribution {pv,t}v is the same for all t [T] and Pr[Ce = T] = 1 for any e E, our model is the exactly one in the KIID model without reusable resources, which was studied in (Nanda et al. 2020). We also note that our model and the problem of Alaei, Hajiaghayi, and Liaghat (2012) have no inclusion relationships. Offline Optimal Algorithms Given a problem instance, a sequence of online vertices is determined according to probability distributions {pv,t}v,t. We denote by I the probability distribution over all input sequences of online vertices. In the offline setting, we suppose that we know the sequence I of online vertices in advance. An offline algorithm that maximizes the expected profit is called an offline optimal algorithm for I. We note that the algorithm does not know whether each offline vertex accepts or rejects an online vertex at each time. We denote the expected profit of the offline optimal algorithm for I by OPT(I). Define the offline optimal value by EI I[OPT(I)]. Thus we consider the best algorithm for each I in the offline optimal value. We denote the set of edges incident to u by Eu = {(u, v) | (u, v) E} for each u U, and similarly Ev for v V . We introduce an offline LP whose optimal value is an upper bound of the offline optimal value: e E weqexe,t e Eu xe,t qe Pr[Ce t t + 1] e Eu xe,tqe 1 (u U, t [T]) (1) e Eu xe,t (1 qe Pr[Ce T t]) u (u U) (2) X e Ev xe,t pv,tbv (v V, t [T]) (3) 0 xe,t pv,t (v V, e Ev, t [T]) (4) For each edge e = (u, v) and time t, a variable xe,t corresponds to a probability that e is chosen at time t. Intuitively, constraints (1) (3) corresponds to the constraints (A) (C) mentioned in the introduction, respectively. Lemma 1. The optimal value of LP (Off) is at least the offline optimal value EI I[OPT(I)]. Proof. For each input sequence I, an offline optimal algorithm for I determines a probability that, when an online vertex v arrives at time t, v is assigned to offline vertices u for each e = (u, v) E and t [T]. By taking expectations over I, we set xe,t to be the probability that an online vertex v arrives at t and an offline vertex u is assigned to v for each e = (u, v) E and t [T]. We show that x defined above is a feasible solution to the LP. By definition, xe,t is nonnegative and at most the probability pv,t, and hence (4) is satisfied. Let us see that the first constraint (1) is valid. Fix an input sequence I, u U, and t [T]. Let vt be an online vertex arriving at time t < t (if exists). For any realization of acceptances/rejections and Ce s, we have any one of the following: u has no assignment, or u is occupied by vt for some t < t with occupation time at least t t + 1. Since acceptances/rejections and Ce s are sampled independently, the expected number of online vertices occupying u is Pr[u accepts vt | I] + P t T. If we assigned an online vertex v to u with remaining budget d > 1 at time t, then the expected profit that u earns at and after t would be Qd e,t := qe we + PT t ℓ=1 Pr[Ce = ℓ]Rd u,t+ℓ + (1 qe)Rd 1 u,t+1. Algorithm 1: Proposed online algorithm 1: Solve LP (Off) to obtain an optimal solution x 2: for t = 1, . . . , T do 3: Let v be a vertex that arrives at time t (if none, skip) 4: Choose Sv k with probability λv,t k by (5) 5: for u Sv k do 6: Let d be the remaining budget of u 7: if Qd uv,t Rd u,t+1 and u is available then 8: Assign u to v 9: else 10: Do nothing for u Otherwise, the expected profit will be Rd u,t+1. In our algorithm, for each vertex u Sv k, we compare Qd uv,t and Rd u,t+1. If the former is larger, then this means that assigning u to v makes larger profit than not assigning, and thus we assign u to v. Otherwise, i.e., if assigning u to v does not make enough profit, then we do not assign u to v. Thus the expected profit for a vertex u Sv k is max Qd uv,t, Rd u,t+1 . Since the probability that v arrives at time t and u is chosen by v is x e,t for each e = (u, v) Ev and t [T], Rd u,t can be represented recursively by Rd u,t = P e Eu x e,t max Qd e,t, Rd u,t+1 + 1 P e Eu x e,t Rd u,t+1. (6) Our algorithm needs to compute Rd u,t for each d, u U and t [T]. This can be done efficiently in advance by dynamic programming with the above recursive equation (6). Modifying Instance We first observe that the total expected profit of the algorithm is equal to P u U R u u,1. Since LP (Off) gives an upper bound on EI I[OPT(I)] by Lemma 1, we aim to determine the ratio between P u U R u u,1 and the LP optimal value P u U P e Eu P t weqex e,t. To this end, we fix a vertex u in U, and we evaluate the ratio αu between R u u,1 and P e Eu P t weqex e,t. Then the competitive ratio of Algorithm 1 is at least minu αu. To obtain a lower bound of R u u,1, we modify a given instance to a simpler one with LP optimal solution x . We will show that the expected profit for the modified instance gives a lower bound on that for the original instance. The new instance is defined as follows. We define G = (U , V ; E ) where U = {u}, V = {vt | t [T]}, and E = {(u, vt) | t [T]}. In this instance, we have only one offline vertex u, and, at each t [T], only one vertex vt may arrive with edge (u, vt). We set the parameters3 for vt in the new instance as follows: the probability of arrival: p t = P e Eu x e,t; the probability of acceptance: q t = P e Eu x e,tqe/p t if p t > 0, and q t = 0 otherwise; 3For ease of notation, we use simpler subscripts, e.g., p t instead of puvt,t, as we have only one online vertex at time t. the profit: w t = P e Eu x e,tqewe/(p tq t) if p tq t > 0 and 0 otherwise; the distribution of occupation time (denoted by Ct): Pr[Ct = ℓ] = P e Eu x e,tqe Pr[Ce = ℓ] p tq t (ℓ [T]) if p t, q t > 0, and otherwise, Pr[Ct = T] = 1; the capacity: b t = 1. The budget u of u is set the same value as in the original instance. Since x is a feasible solution to LP (Off), it holds that p t 1 and q t 1 by (4), and moreover, ℓ=1 Pr[Ct = ℓ] = P e Eu x e,tqe P ℓPr[Ce = ℓ] p tq t = 1. Note also that P t [T ] w tq tp t = P e Eu P t weqex e,t. Let us execute Algorithm 1 for the modified instance, where we use p t instead of LP optimal solution x . Let Rd t be the expected profit that u earns on and after t when u has a remaining budget d at time t. Similarly to (6) for Rd u,t, it holds that Rd t = p t max n Qd t , Rd t+1 o + (1 p t) Rd t+1, (7) Qd t = q t w t + PT t ℓ=1 Pr[Ct = ℓ] Rd t+ℓ + (1 q t) Rd 1 t+1 . (8) We show that the modification does not increase the profit. Lemma 2. Rd u,t Rd t for all d [ u] {0} and t [T]. By this lemma, it suffices to lower-bound R u 1 to obtain a lower bound on R u u,1 for the original instance. Analysis for the Unlimited Rejection Case In this section, we assume that the number of allowed rejections is unlimited, that is, u = + for u U. This means that we can ignore the constraint (2) in LP (Off). The main result of this section is to prove that Algorithm 1 is 1/2-competitive for this case. For that purpose, we first modify the instance as in the previous section, and we will bound R u 1 from below. We observe that, when executing the algorithm for the modified instance, the remaining budget d of the vertex u is infinite in the whole process. For simplicity, we denote Rt = R t and Qt = Q t . Then, by (8), it holds that ℓ=1 Pr[Ct = ℓ]Rt+ℓ + (1 q t)Rt+1. Moreover, it follows from (7) that Rt = max {p t Qt + (1 p t)Rt+1, Rt+1} . (9) We note that RT = p T q T w T . We can make the first term in (9) simpler as Btw t + At Rt+1 + Bt PT t ℓ=2 Pr[Ct = ℓ]Rt+ℓ, where Bt = p tq t and At = p tq t Pr[Ct = 1] + p t(1 q t) + (1 p t) for each t [T]. We note that At = p tq t Pr[Ct = 1] + 1 p tq t = 1 Bt Pr[Ct 2]. A lower bound on R1 is obtained by minimizing R1 subject to the condition (9). The minimization problem can be formulated as a linear programming problem as below. Note that we do not need to solve the LP, but use it for analysis.4 min R1 s.t. Rt Btw t + At Rt+1 + Bt PT t ℓ=2 Pr[Ct = ℓ]Rt+ℓ(t [T 1]) Rt Rt+1 (t [T 1]) RT BT w T Rt 0 (t [T]), (10) Then the optimal value of the above LP gives a lower bound of R1. The dual of LP (10) is given by max PT t=1 Btw tαt s.t. α1 + β1 1 α2 + β2 A1α1 + β1 αt + βt Pt 2 ℓ=1 αℓBℓPr[Cℓ= t ℓ] + At 1αt 1 + βt 1 (3 t T 1) αT PT 2 ℓ=1 αℓBℓPr[Cℓ= T ℓ] + AT 1αT 1 + βT 1 αt 0 (t [T]) βt 0 (t [T 1]). (11) To show a lower bound on LP (10), we construct a feasible solution to the dual LP (11). Let γ 1/2. We set αt = γ for t [T], β1 = 1 γ, β2 = β1 γ + A1γ, and βt = βt 1 γ + At 1γ + Pt 2 ℓ=1 BℓPr[Cℓ= t ℓ]γ (3 t T 1). Lemma 3. For 0 γ 1/2, αt and βt defined as above are feasible to (11). Therefore, αt and βt defined as above are feasible to (11). The objective value for this solution is P t Btw tγ = γ P t p tq tw t. It follows from the LP duality theorem that the optimal value of LP (10) is at least γ P t p tq tw t. This is maximized when γ = 1/2, and in this case, R1 is lower-bounded by 1 2 P t p tq tw t. Since P t p tq tw t = P t P e Eu weqex e,t, we obtain R1 1 2 P t P e Eu weqex e,t. Summarizing, we have P u U R u 1 1 2E[OPT(I)] by Lemma 1, which implies the following theorem. Theorem 1. Algorithm 1 is 1/2-competitive for the problem with unlimited rejections. Analysis for the Limited Rejection Case In this subsection, we prove that Algorithm 1 is /(3 1)- competitive for the general case with = maxu U u. As in the analysis for the unlimited rejection case (the previous subsection), we present a lower bound on Rd t . We first show the following lemma. Lemma 4. Rd 1 t d 1 d Rd t for each d [ u] and t [T]. 4Such LP is called a factor-revealing LP. By (7) with the above lemma, we have the following relationships: Rd t max n pt ˆQd t + (1 pt) Rd t+1, Rd t+1 o , (12) where ˆQd t = qt(wt + P ℓ 1 Pr[Ct = ℓ] Rd t+ℓ) + d 1 d (1 qt) Rd t+1. Recall that it suffices to give a lower bound on R u 1 to obtain the competitive ratio of Algorithm 1. By (12), we can construct a linear program to bound R u 1 in a similar way to the previous subsection. See the full version for the details. Theorem 2. Algorithm 1 is a /(3 1)-competitive algorithm, where = maxu U u. We also show that our algorithm can be 1 2 1 1 e2 - competitive in the KIID model with non-reusable agents. Theorem 3. There is a ( 1 2 1/ 1 1 e2 1/ )-competitive algorithm for the problem in the KIID model with nonreusable resources, where = maxu U u. Note that 1 2 1/ 1 1 e2 1/ 1 e. Thus, our algorithm has a better competitive ratio than the 1 ecompetitive algorithm by Nanda et al. (2020). Experiments In this section, we present experimental results to evaluate the performance of Algorithm 1. We describe the experimental environment in the full version. Instance Setting We focus on the following four settings (a) (d); see also the full version for other settings. The setting (a) is the KIID model with non-reusable agents (i.e., Pr[Ce = T] = 1 for all e E) and u < T ( u U). This coincides with that of Nanda et al. (2020). The settings (b) (d) are the KAD model with reusable agents. In the setting (b), which is that of Dickerson et al. (2021), offline vertices always accept assignments (i.e., qe = 1 for all e E), while they may reject in (c) and (d). We set u < T ( u U) in (c) and u = + in (d). For each setting, we generate one instance and 1000 input sequences, and focus on the average profits and runtime for evaluation. Baseline Algorithms We compare our algorithm with the following baseline algorithms: (1) Random chooses offline vertices uniformly at random. (2) Greedy assigns at most bv available offline vertices u to v in the order of w(u,v)q(u,v). (3) NAdap* is a simple extension of NAdap proposed by Nanda et al. (2020). NAdap* solves LP (Off) and to assign a set of offline vertices with probability λv,t k . When bv = 1 for all v, NAdap* coincides with NAdap in the setting (a), and is also used in experiments in (Dickerson et al. 2021). In our experiments, we do not evaluate the algorithm by Dickerson et al. (2021) due to its impractical assumption. Synthetic Dataset We generate synthetic datasets similarly to (Nanda et al. 2020) as follows. We set |U| = 30, |V | = 100, and T = 200. For each u U and v V , an edge (u, v) exists in E with probability 0.1. For each e E, we set 2 4 6 8 10 bv Profit / opt. val. of LP Setting (a) of synthetic dataset Random Greedy NAdap* Proposed (a) KIID without reusability 2 4 6 8 10 bv Profit / opt. val. of LP Setting (b) of synthetic dataset (b) KAD without rejection 2 4 6 8 10 bv Profit / opt. val. of LP Setting (c) of synthetic dataset (c) KAD with u 3 2 4 6 8 10 bv Profit / opt. val. of LP Setting (d) of synthetic dataset (d) KAD with u = + Figure 1: The rates of average profits for synthetic datasets. qe U(0.5, 1) (uniform distribution) and we U(0, 1), respectively. For each u U and e Eu, the distribution of Ce for reusability is defined as a binomial distribution B(20, ηu), where ηu U(0, 1). For the settings (a) and (c), u is drawn uniformly at random from {1, 2, 3} for each u U. We set the same bv for all v from {2, 4, 6, 8, 10}. Figure 1 plots the ratios between average profits and the LP optimal values. We see that our algorithm (the black solid line) performs the best in almost all cases. Moreover, it obtains more than half of the LP optimal value, while Random and Greedy sometimes fail. NAdap* performs almost the same as us in (a), but is worse in (b) (d). All the algorithms run within less than 1 second for processing online vertices. Our algorithm needs additionally around 6 seconds on average because of the preprocess of solving LP (Off) and computing the table of Rd u,t s. For the detailed results, see the full version. Real-world Dataset We evaluate the performance for instances generated from the New York City yellow cabs dataset5, which is used also in existing work such as (Dickerson et al. 2021; Nanda et al. 2020). Using the data in January 2013 we set up a bipartite graph with |U| = 30 and |V | = 100 and parameters similarly to (Nanda et al. 2020). The detailed set-up is described in the full version. We set T = 100k (k [4]) in setting (a) and T = 288k (k [4]) in others. Figure 2 shows the ratios between average profits and the LP optimal values. Note that our algorithm earns more than half of the LP optimal value in any cases. In settings (a) and (c), Random and Greedy cannot obtain even half of the LP optimal value for a large T. NAdap* performs as well as ours in settings except (c), in which it performs worse than our algorithm. Our algorithm may be too careful in (b) and 5http://www.andresmh.com/nyctaxitrips/ 100 200 300 400 T 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Profit / opt. val. of LP Setting (a) of real-world dataset Random Greedy NAdap* Proposed (a) KIID without reusability 288 576 864 1152 T 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Profit / opt. val. of LP Setting (b) of real-world dataset (b) KAD without rejection 288 576 864 1152 T 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Profit / opt. val. of LP Setting (c) of real-world dataset (c) KAD with u 3 288 576 864 1152 T 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 Profit / opt. val. of LP Setting (d) of real-world dataset (d) KAD with u = + Figure 2: The rates of average profits for real-world datasets. (d), in which there is no need to care about the rejection constraint, and has less performance. On the other hand, since offline vertices easily become unavailable in (a) and (c), our algorithm, which considers a long-term effect of the current assignment, performs the best. All the algorithms run in less than 1 second for processing even 1152 online vertices. The preprocess in our algorithm completes in 400 seconds. Since the preprocess is done before arrival of online vertices, our algorithm makes decision as fast as others. For the detailed results, see the full version. Conclusion In this paper, we studied the online task assignment problem with reusable resources in the KAD model, which generalizes the online bipartite matching in the KAD model. Our problem incorporates practical conditions arising in applications such as ridesharing and crowdsourcing. We proposed an online algorithm that is 1/2-competitive for the unlimited rejection case, which is tight, and /(3 1)-competitive for the general case. Practical usefulness of our algorithm is confirmed by numerical experiments. For future work, as solving LP (Off) is time-consuming, it would be interesting to obtain a constant competitive ratio without solving the LP. One direction is to use an approximate solution to the LP in our algorithm. Another future work is to consider a model that, when a task is rejected, it is allowed to re-assign some other agents. Acknowledgments We would like to thank the AAAI anonymous reviewers for valuable comments to improve the paper. HS was supported by JSPS KAKENHI Grant Numbers JP17K12646, JP21K17708, and JP21H03397, Japan. SI was supported by JST, ACT-I, Grant Number JPMJPR18U5, Japan. TF was supported by JSPS KAKENHI Grant Numbers JP20H05965, JP21K11759, and JP21H03397, Japan. NK was supported by JSPS KAKENHI Grant Numbers JP20H05795, JP18H05291, and JP21H03397, Japan. KK was supported by JSPS KAKENHI Grant Number JP18H05291, Japan. Alaei, S.; Hajiaghayi, M.; and Liaghat, V. 2012. Online Prophet-Inequality Matching with Applications to Ad Allocation. In Proceedings of the 13th ACM Conference on Electronic Commerce (EC), 18 35. Anagnostopoulos, A.; Castillo, C.; Fazzone, A.; Leonardi, S.; and Terzi, E. 2018. Algorithms for Hiring and Outsourcing in the Online Labor Market. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 1109 1118. Assadi, S.; Hsu, J.; and Jabbari, S. 2015. Online Assignment of Heterogeneous Tasks in Crowdsourcing Markets. In Proceedings of the 3rd AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 12 21. Birnbaum, B.; and Mathieu, C. 2008. On-Line Bipartite Matching Made Simple. SIGACT News, 39(1): 80 87. Brubach, B.; Sankararaman, K. A.; Srinivasan, A.; and Xu, P. 2016. New Algorithms, Better Bounds, and a Novel Model for Online Stochastic Matching. In Proceedings of the 24th Annual European Symposium on Algorithms (ESA), volume 57, 24:1 24:16. Devanur, N. R.; and Hayes, T. P. 2009. The Adwords Problem: Online Keyword Matching with Budgeted Bidders under Random Permutations. In Proceedings of the 10th ACM Conference on Electronic Commerce (EC), 71 78. Dickerson, J. P.; Sankararaman, K. A.; Sarpatwar, K. K.; Srinivasan, A.; Wu, K.-L.; and Xu, P. 2019. Online Resource Allocation with Matching Constraints. In Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1681 1689. Dickerson, J. P.; Sankararaman, K. A.; Srinivasan, A.; and Xu, P. 2021. Allocation Problems in Ride-Sharing Platforms: Online Matching with Offline Reusable Resources. ACM Transactions on Economics and Computation, 9(3). Dong, Z.; Das, S.; Fowler, P.; and Ho, C.-J. 2021. Efficient Nonmyopic Online Allocation of Scarce Reusable Resources. In Proceedings of the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 447 455. Gong, X.-Y.; Goyal, V.; Iyengar, G. N.; Simchi-Levi, D.; Udwani, R.; and Wang, S. 2021. Online Assortment Optimization with Reusable Resources. Management Science. Forthcoming. Goyal, V.; Iyengar, G.; and Udwani, R. 2021. Asymptotically Optimal Competitive Ratio for Online Allocation of Reusable Resources. In Proceedings of the 17th Conference on Web and Internet Economics (WINE), 543. Available at ar Xiv:2002.02430. Haeupler, B.; Mirrokni, V. S.; and Zadimoghaddam, M. 2011. Online Stochastic Weighted Matching: Improved Approximation Algorithms. In Proceedings of the 7th International Workshop on Internet and Network Economics (WINE), 170 181. Ho, C.-J.; and Vaughan, J. W. 2012. Online Task Assignment in Crowdsourcing Markets. In Proceedings of the 26th AAAI Conference on Artificial Intelligence (AAAI), 45 51. Karp, R. M.; Vazirani, U. V.; and Vazirani, V. V. 1990. An Optimal Algorithm for On-Line Bipartite Matching. In Proceedings of the 22nd Annual ACM Symposium on Theory of Computing (STOC), 352 358. Krengel, U.; and Sucheston, L. 1977. Semiamarts and finite values. Bulletin of the American Mathematical Society, 83(4): 745 747. Krengel, U.; and Sucheston, L. 1978. On semiamarts, amarts, and processes with finite value. Probability on Banach spaces, 4: 197 266. Lowalekar, M.; Varakantham, P.; and Jaillet, P. 2020. Competitive Ratios for Online Multi-Capacity Ridesharing. In Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 771 779. Manshadi, V. H.; Gharan, S. O.; and Saberi, A. 2012. Online Stochastic Matching: Online Actions Based on Offline Statistics. Mathematics of Operations Research, 37(4): 559 573. Mehta, A. 2013. Online Matching and Ad Allocation. Foundations and Trends in Theoretical Computer Science, 8(4): 265 368. Mehta, A.; and Panigrahi, D. 2012. Online matching with stochastic rewards. In Proceedings of the 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science (FOCS), 728 737. Mehta, A.; Saberi, A.; Vazirani, U.; and Vazirani, V. 2007. Ad Words and Generalized Online Matching. Journal of the ACM, 54(5): 22:1 22:19. Mehta, A.; Waggoner, B.; and Zadimoghaddam, M. 2015. Online Stochastic Matching with Unequal Probabilities. In Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1388 1404. Nanda, V.; Xu, P.; Sankararaman, K. A.; Dickerson, J.; and Srinivasan, A. 2020. Balancing the Tradeoff between Profit and Fairness in Rideshare Platforms during High-Demand Hours. In Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI), 2210 2217. Rusmevichientong, P.; Sumida, M.; and Topaloglu, H. 2020. Dynamic Assortment Optimization for Reusable Products with Random Usage Durations. Management Science, 66(7): 2820 2844. Sumita, H.; Ito, S.; Takemura, K.; Hatano, D.; Fukunaga, T.; Kakimura, N.; and ichi Kawarabayashi, K. 2022. Online Task Assignment Problems with Reusable Resources. ar Xiv:2203.07605. Xu, P.; Srinivasan, A.; Sarpatwar, K. K.; and Wu, K.-L. 2017. Budgeted Online Assignment in Crowdsourcing Markets: Theory and Practice. In Proceedings of the 16th Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1763 1765.