# scheduling_under_uncertainty_a_querybased_approach__0c26ab99.pdf

Scheduling under Uncertainty: A Query-based Approach

Luciana Arantes1, Evripidis Bampis1, Alexander Kononov2,3, Manthos Letsios1, Giorgio Lucarelli4, Pierre Sens1

1 Sorbonne Universit e, CNRS, Laboratoire d Informatique de Paris 6, LIP6, F-75005 Paris, France 2 Sobolev Institute of Mathematics, Novosibirsk, Russia 3 Novosibirsk State University, Department of Mechanics and Mathematics, Russia 4 Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LIG, 38000 Grenoble, France

We consider a single machine, a set of unit-time jobs, and a set of unit-time errors. We assume that the time-slot at which each error will occur is not known in advance but, for every error, there exists an uncertainty area during which the error will take place. In order to ﬁnd if the error occurs in a speciﬁc time-slot, it is necessary to issue a query to it. In this work, we study two problems: (i) the error-query scheduling problem, whose aim is to reveal enough error-free slots with the minimum number of queries, and (ii) the lexicographic errorquery scheduling problem where we seek the earliest error-free slots with the minimum number of queries. We consider both the off-line and the online versions of the above problems. In the former, the whole instance and its characteristics are known in advance and we give a polynomial-time algorithm for the error-query scheduling problem. In the latter, the adversary has the power to decide, in an on-line way, the time-slot of appearance for each error. We propose then both lower bounds and algorithms whose competitive ratios asymptotically match these lower bounds.

1 Introduction

Scheduling problems under uncertainty have been studied in different contexts such as on-line optimization, robust optimization, and stochastic optimization. A common assumption in all these models is that the uncertain information cannot be known before the construction of the schedule. However, in some applications it is possible to know the exact input information at some extra cost [Erlebach and Hoffmann, 2015]. In this paper, we consider scheduling with faults/errors and we introduce a new non-probabilistic model with explorable (query-able) uncertainty. Each unit-time error is characterized by an uncertainty area during which the error will occur, and it is possible to learn the exact slot at which it will appear by issuing a query operation of unit cost. Our goal is to reveal enough error-free slots by querying the

smallest number of errors in order to be able to successfully execute n unit-time jobs. Note that a query can be seen as an unsuccessful attempt of execution. In Figure 1 (left), we give a simple instance with 6 errors: error e1 with uncertainty area (0, 4], error e2 with (0, 2], and so on. More formally, assume that the time-line is split into unitsize slots. For simplicity, we refer to the slot (t 1, t], for any t 1, by its right limit, i.e., t. We are given a single machine, n unit-time jobs and a set E of k unit-time errors. The execution of each job takes place into a single slot. Each error e E is associated with an uncertainty area Ae = (se, fe] and its actual slot of appearance t e (se, fe]. We assume that, for any two errors e, e E, it holds that t e = t e . A slot t is called error-free if for each error e E where t Ae it holds that t = t e. In order to reveal an error-free slot we need to query all errors whose uncertainty area includes it. We propose to study the following two problems: (i) The error-query scheduling problem: Find n error-free slots by performing the minimum number of queries; (ii) The lexicographic errorquery scheduling problem: Find the earliest n error-free slots by performing the minimum number of queries. We consider two versions of the above problems. In the off-line version the whole instance and its characteristics are known in advance. In the on-line version the set of errors and their uncertainty areas are known to the algorithm, but the adversary can decide in an on-line way the slot of appearance t e for each e E. In other words, there is an iterative game between the on-line algorithm which queries an error and the adversary which reveals the slot of appearance of this error, and so on. Even-though the off-line version seems of marginal interest, it can be used as a lower bound on the number of queries for the on-line case. As a measure of efﬁciency for the on-line problems we use the competitive ratio. An algorithm is ρ-competitive if it makes at most ρ times the optimum number of queries for any instance of the problem. In what follows, we deﬁne two special families of instances concerning the uncertainty areas of the errors. Both families are widely studied in scheduling and graph theory literature. Agreeable instances: A set of uncertainty areas is agreeable, if for any two errors e and e with se se , it holds that fe fe . In other words, there is a unique ordering of the uncertainty areas of the errors with respect to both their left

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

and their right limits, i.e., if se1 se2 . . . sek then fe1 fe2 . . . fek. Note that agreeable instances correspond to proper interval graphs. Laminar instances: A set of uncertainty areas is laminar, if for any two errors e and e with se se , it holds that either fe se or fe fe . The laminar instances have a tree-like structure and they can be represented as a forest. In order to see such a structure, consider the following construction. Each error eℓ, 1 ℓ k, is assigned to a node vℓ. For any two nodes vℓand vℓ with Aeℓ Aeℓ, an edge connects them if there is no other error eℓ such that Aeℓ Aeℓ Aeℓ. In the case of multiple errors with the same uncertainty area, we consider an arbitrary order of them and edges between nodes of any two consecutive errors are added to the tree based on this order. Note that the trees created by this construction are considered to be rooted. Speciﬁcally, the root of each tree is the node that corresponds to its error with the largest uncertainty area. Moreover, there is an order among the children of a given node which is determined by the order of the uncertainty areas of the corresponding errors in the time-line. An example of a laminar instance along with its tree representation can be seen at Figure 1.

0 1 2 3 4 5 6 7 8

Figure 1: A laminar instance and its representation as a forest.

Related work: The work in the area of explorable uncertainty has been initiated in the seminal paper of Kahan [Kahan, ] for selection problems. Since then, a series of papers study many other problems in this context: ﬁnding the k-th smallest value in a set of uncertainty intervals [Feder et al., 2003; Gupta et al., 2011; Kahan, ], caching problems in distributed databases [Olston and Widom, ], the shortest path problem [Feder et al., 2007], the knapsack problem [Goerigk et al., 2015], the minimum spanning tree problem [Hoffmann et al., ; Megow et al., 2015]. In most of these works the aim is to minimize the number of queries to guarantee an exact optimum solution, while some other study the trade-off between the number of queries and the precision of the returned solution [Olston and Widom, ]. Recently, explorable uncertainty has been introduced in scheduling [D urr et al., ]. The uncertain information concerns the processing time of each job for which an upper bound is known in advance. It is possible to learn the exact processing time by testing at the price of a unit cost. This line of research has similarities with models that have been extensively studied in AI for computation decision support [Boutilier et al., 2006], especially with models treating adaptive preference elicitation to combinatorial domains [Benabbou and Perny, b; a; Drummond and Boutilier, ; Gelain et al., 2010].

However, our model is also related to search theory [Angelopoulos, 2016]. For instance, consider an environment where the search domain is modeled by a star, i.e., consists of a set of rays which have a common intersection point, and there are more than one targets. The goal of the searcher is to locate some of these targets by adopting the expanding search paradigm in which the cost of re-exploration is negligible [Alpern and Lidbetter, 2013]. It is not difﬁcult to see that the instance of Figure 2 can be interpreted as a star with b rays and one target (corresponding to one error-free slot) in every ray. The goal is to ﬁnd some among these targets, while a query corresponds to the exploration of a ray by an additional unit of distance. It has to be pointed out here that star search offers an abstraction for modeling different applications such as interruptible algorithms [Angelopoulos, ], or database query optimization [Condon et al., 2009].

Our contribution: We ﬁrst give an optimal polynomialtime algorithm for the off-line error-query scheduling problem. For the on-line version of the problem, we prove that the competitive ratio of any on-line algorithm is Ω(

k) even if the instance is laminar and agreeable and even if we search for only one free slot. Then, we propose a O(

k)-competitive algorithm for laminar instances. Finally, in Section 4, we deal with the lexicographic error-query scheduling problem by matching lower and upper bounds of the competitive ratio for the on-line version. Table 1 summarizes our results.

Error-query scheduling

Lexicographic error-query scheduling Lower Bound

Upper Bound

Lower Bound

Upper Bound general

- Ω(log2 k) O(log2 k) agreeable - laminar O(

Table 1: Summary of results.

2 Preliminaries Let s consider the set T = {t1, t2, . . . , t|T |} of all si s and fi s sorted in increasing order (we eliminate all duplicate values). We deﬁne the set I = {I1, I2, . . . , I|I|} of elementary intervals where |I| = |T | 1 and each Ii, 1 i |I|, spans in [ti, ti+1]. Note that the number of elementary intervals is at most twice the number of errors, i.e., |I| = O(k). Let E(Ii) be the set of errors whose span includes the elementary interval Ii. Moreover, let F(E ) be the set of error-free slots that are revealed if we query the errors of the set E E. The elementary interval is an important concept in most of our proofs because of the following property: Property 1. Given an elementary interval Ii, all errors in E(Ii) must be queried in order to reveal an error-free slot during Ii. Moreover, if all errors in E(Ii) are queried then all error-free slots of the set F(E(Ii)) are revealed. Based on the above property, it is meaningless to ask fewer than |E(Ii)| errors of E(Ii) when targeting some error-free

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

slots of Ii. Moreover, there is no reason for not using all error-free slots in F(E(Ii)) when all errors in E(Ii) are already queried, unless there is not a sufﬁcient number of unscheduled jobs. Therefore, a set of elementary intervals is also sufﬁcient to represent a solution to our problem. More speciﬁcally, given a set of elementary intervals, we need to query all errors whose uncertainty area intersects them. If the number of error-free slots that are revealed is greater than n, then we select arbitrarily n of them in the case of the errorquery scheduling problem, or the earliest n of them in the case of the lexicographic error-query scheduling problem. Finally, we give some additional notation regarding the laminar instances. Given a vertex vℓof depth dℓ(i.e., the distance in hops from vℓto the root of its tree is equal to dℓ), we deﬁne the level of the error eℓto be equal to dℓ+1. Moreover, for any node vℓin the tree-like representation of a laminar instance, let T(vℓ) be the subtree rooted at vℓ. For any (sub)tree T, let E(T) be the set of errors corresponding to the nodes of T. Due to the laminar structure, an algorithm does not have any interest to query an error eℓif it has not already queried all errors corresponding to the nodes in the path from vℓto the root. Given a set E of already queried errors which follows this observation, we say that a subtree T(vℓ) is maximal with respect to E if eℓ E and either vℓhas no ancestors or the errors corresponding to all ancestors of vℓare in E . For example, for the instance depicted in Figure 1, if E = {e4} then the subtrees T(v1) and T(v5) are maximal.

3 The Error-query Scheduling Problem

In this section, we ﬁrstly present a polynomial-time algorithm for the off-line version of the error-query scheduling problem. Then, we consider the on-line version of the problem and we present a lower bound to the competitive ratio as well as a O(

k)-competitive algorithm for laminar instances.

3.1 A Polynomial-time Algorithm

We present here a polynomial-time algorithm for ﬁnding an optimal off-line solution for the error-query scheduling problem. The proposed algorithm uses the representation of a solution to the problem by just indicating a set of elementary intervals, as this is implied by Property 1 and it is explained in preliminaries. Such a representation can be also used for partial solutions, i.e., solutions that reveal less than n errorfree slots. Moreover, each (partial) solution S corresponds to a triplet (i, j, ℓ), where i is the index of the latest elementary interval used, j is the number of error-free slots that are revealed and ℓis the number of errors that are queried. Note that a triplet (i, j, ℓ) may correspond to more than one solutions. By slightly abusing the notation, we denote by E(S) the set of errors that are queried by the solution S. Our algorithm keeps a set of partial solutions and it iterates on the elementary intervals: at iteration i, 1 i |I|, the elementary interval Ii is considered. Let Si 1 be the set of partial solutions at the end of the iteration i 1. We extend each S Si 1 to the partial solution S by including to it the elementary interval Ii and we add both S and S to Si: if S corresponds to the triplet (i , j, ℓ) with i i 1, then S =

S {Ii} corresponds to the triplet (i, j +|F(E(Ii))|, |E(S) E(Ii)|), where ℓ |E(S) E(Ii)| ℓ+ |E(Ii)|. Note that several partial solutions can be dominated by others. We say that a partial solution S1 corresponding to (i1, j1, ℓ1) is dominated by the partial solution S2 corresponding to (i2, j2, ℓ2) if i1 i2, j1 j2 and ℓ1 ℓ2. At the end of the iteration i we eliminate all dominated partial solutions from Si. Observe that this elimination phase removes also all but one partial solutions that correspond to the same triplet (i, j, ℓ). A formal description of the algorithm is given below.

Algorithm 1

1: Initialization: S (which corresponds to (0, 0, 0)), S0 {S}, A S Ii 2: for i = 1 to |I| do 3: Si Si 1 4: for each S Si 1 corresponding to (i , j, ℓ) do 5: Create a new partial solution S S {Ii} corresponding to (i, j + |F(E(Ii))|, |E(S) E(Ii)|) 6: if |F(E(S ))| n and |E(S )| < |E(A)| then 7: A S

8: else if |F(E(S ))| < n then 9: Si Si {S } 10: for each pair S1, S2 Si where S1 corresponds to (i1, j1, ℓ1) and S2 corresponds to (i2, j2, ℓ2) do 11: if i1 i2 and j1 j2 and ℓ1 ℓ2 then 12: Si Si \ {S1} 13: return A

Theorem 1. Algorithm 1 ﬁnds an optimal solution for the offline error-query scheduling problem in time polynomial in k.

Proof. Note that, if we remove the elimination step in Lines 10 12 of the algorithm, then it will create all possible solutions, i.e., all possible combinations of elementary intervals. Hence, in order to prove the correctness of our algorithm, it is sufﬁcient to show that the elimination step does not remove any partial solution which can lead to a better ﬁnal solution. For the sake of contradiction, assume that a partial solution S1 corresponding to (i1, j1, ℓ1) is eliminated by the algorithm due to its domination by the partial solution S2 corresponding to (i2, j2, ℓ2) in iteration i. Moreover, let I1 be the set of elementary intervals that can be added to S1 and get the ﬁnal solution S1 I1 which queries strictly less errors than the solution obtained by the algorithm. Note that the solution S1 I1 reveals j1+|F(E(I1))| error-free slots. Consider now the solution S2 I1 which reveals j2 +|F(E(I1))| error-free slots. By the deﬁnition of the elimination rules we have that j1 j2, and hence S2 I1 reveals at least as many error-free slots as S1 I1. For the errors that are queried by S1 I1 we have that |E(S1 I1)| = |E(S1) E(I1)| = |E(S1)| + |E(Ii)| |E(S1) E(I1)| = ℓ1+|E(Ii)| |E(S1) E(I1)|. Similarly, for S2 I1 it holds that |E(S2 I1)| = ℓ2 + |E(Ii)| |E(S2) E(I1)|. Since i1 i2, we have that |E(S1) E(I1)| |E(S2) E(I1)|. By taking into account the elimination rule ℓ1 ℓ2, we have that |E(S1 I1)| |E(S2 I1)|, which is a contradiction to the assumption that S1 I1 queries strictly less errors than the solution obtained by the algorithm.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

For the complexity of the algorithm, note ﬁrst that it iterates O(|I|) = O(k) times. By the deﬁnition of the elimination rules, in each iteration we keep only one value of j for each pair of i and ℓ. Thus, the parameter j can take at most min{n, k2} different values. Therefore, the number of partial solutions in each iteration is at most min{n, k2} k |I| < k4. Finally, the elimination step considers all pairs of partial solutions, whose number is polynomial to k, and hence the theorem follows.

3.2 A Lower Bound for the On-line Problem We next consider the on-line version of the error-query scheduling problem. Recall that we assume that the errors and their uncertainty areas are known to the on-line algorithm, but not their slots of appearance. In this section we present a lower bound to the competitive ratio of any on-line algorithm. Note that this lower bound is valid for particularly simple instances, i.e., there is only one job to execute as well as the uncertainty areas of the errors belong in both laminar and agreeable families of instances.

Theorem 2. The competitive ratio of any on-line algorithm for the error-query scheduling problem is Ω(

k), even in the case where n = 1 and the instance is both laminar and agreeable.

Proof. Consider the instance shown in Figure 2 which consists of k = b2 errors. The errors are partitioned into b groups, E1, E2, . . . , Eb, each one containing b errors. The uncertainty area of the error eℓ,i, i.e., the i-th error in Eℓ, spans into the interval ((ℓ 1)(b+1)+i 1, ℓ(b+1)], where 1 i b and 1 ℓ b. The uncertainty areas of any two errors belonging in different groups do not intersect. Speciﬁcally, in the interval ((ℓ 1)(b+1), ℓ(b+1)], 1 ℓ b, there exist only the uncertainty areas of the b errors of Eℓ. Since this interval spans into b + 1 slots, there is a unique error-free slot available in it. The goal is to ﬁnd one error-free slot in the whole instance in order to execute one job. Note that the constructed instance is both laminar and agreeable.

0 1 2 b 1 b b + 1

(ℓ 1)(b + 1) ℓ(b + 1)

(b 1)(b + 1) b(b + 1)

. . . . . . . . . . . . . . .

Figure 2: An instance of the error-query scheduling problem for which the adversary queries only one error while it can oblige any on-line algorithm to query

Given the above instance, we claim that any on-line algorithm can be forced by the adversary to query at least b errors, while the optimal solution will query only one. According to Property 1 there is no reason to query error eℓ,i without also querying all errors in {eℓ,1, . . . , eℓ,i 1}. In particular, if the

error-free slot of the set Eℓappears in position ℓ(b + 1), then any algorithm must query all errors in {eℓ,1, . . . , eℓ,b} in order to reveal it. On the other hand, if the error-free slot appears in position (ℓ 1)(b + 1) + 1 one request is sufﬁcient to ﬁnd it. Assume now that an on-line algorithm has queried any b 1 errors. We say that Eℓis unimpaired if the on-line algorithm queries no errors from this group. Let us denote the set of unimpaired groups by E0 and let E1 = {E1, E2, . . . , Eb} \ E0. We note that both sets are non-empty and each group Ei contains at least one not updated error. If Eℓ E1 the adversary sets the unique error-free slot in the position ℓ(b+1). Otherwise, the adversary sets the unique error-free slot in the position (ℓ 1)(b + 1) + 1. It follows that the on-line algorithm ﬁnd no error-free slot after b 1 queries and must update at least one more error. In its turn, the optimal solution asks the error el,1 for Eℓ E0 and reveal the error-free slot (ℓ 1)(b + 1) + 1. Concluding, the adversary can work in such a way that forces the on-line algorithm to query at least b errors, while the constructed instance contains an error-free slot that can be revealed by only one query. By recalling that b =

k, the theorem follows.

k)-competitive Algorithm for Laminar Instances In this section we give an O(

k)-competitive algorithm for laminar instances that exploits their tree-like structure. In each iteration, the algorithm updates the set of already queried errors E and considers the set of all maximal subtrees with respect to E as these are deﬁned at the end of Section 2. We say that the subtree T(vℓ) is big if it contains more than

k nodes, that is the uncertainty area of eℓincludes the uncertainty areas of more than

k errors. Otherwise, we say that T(vℓ) is small. For simplicity, we call also big and small the errors corresponding to the root of big and small subtrees, respectively. Then, given a set of maximal subtrees, our algorithm at each iteration queries the following errors. For each big maximal subtree T(vℓ) the error eℓis queried. Moreover, the algorithm queries the errors corresponding to all nodes of the small maximal subtree that contains the maximum number of error-free slots. Note that the number of error-free slots in a maximal subtree can be easily computed. Algorithm 2 presents a formal description of the procedure.

Theorem 3. Algorithm 2 is O(

k)-competitive for the errorquery scheduling problem on laminar instances.

Proof. A big subtree cannot be part of a small subtree, that is if vℓis the parent of vℓ and T(vℓ ) is big, then T(vℓ) is also big. Based on this, we observe that the algorithm at iteration i considers all big subtrees of level i. Since these subtrees are node-independent and each of them contains more than

k errors, their number cannot be greater than

k. Hence, the algorithm handles at most

k big subtrees per iteration in Lines 5 10, and for each subtree it queries only one error. Moreover, by the deﬁnition of small subtrees, the algorithm queries at most

k errors in Line 12. Therefore, at each iteration the algorithm queries at most 2

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

Algorithm 2

1: i 0; E 2: S0: the set of trees in the forest representation of the initial instance (the initial set of maximal subtrees) 3: while n > 0 do 4: i i + 1; Si Si 1 5: for each big subtree T(vℓ) Si 1 do 6: Query the error eℓand set E E {eℓ} 7: Reduce n by the number of the error-free slots that are revealed due to the query of eℓ 8: Si Si \ {T(vℓ)} 9: for each children vℓ of vℓdo 10: Si Si {T(vℓ )} 11: Let T i Si 1 be the small maximal subtree that can reveal the maximum number of error-free slots 12: Query all errors in E(T i) and set E E E(T i) 13: Si Si \ {T i} 14: Reduce n by the number of the error-free slots that are revealed due to the query of the errors in E(T i) 15: return E

In order to ﬁnish the proof, we need to show the following claim: the error-free slots revealed by the algorithm up to the end of iteration i, are at least equal to the error-free slots revealed by any solution after querying exactly i errors. This claim can be rephrased as: if the algorithm performs i iterations in total then the optimal solution should query at least i errors, and hence the competitive ratio is at most 2

k). In order to prove the claim, consider a solution of any algorithm A that has queried i errors. Let ibig and ismall be the number of the big and the small errors, respectively, in this solution, i.e., i = ibig + ismall. As mentioned above, there is no interest to query an error without previously querying its parent in the forest representation, since the instance is laminar. Hence, all i errors queried by A are of level at most i. Then, all ibig big errors queried by A are also queried by our algorithm, since the later one has queried all big errors of levels 1, 2, . . . , i at the end of iteration i. Thus, it remains to show that the ismall errors queried by A cannot reveal more error-free slots than the errors in T 1 T 2 . . . T i queried by our algorithm. This is true, since ismall i and these i small subtrees are selected by our algorithm in order to contain the maximum number of error-free slots.

4 The Lexicographic Error-query Scheduling Problem

In this section we consider the lexicographic variant of our problem in which we need to query the minimum number of errors such that to reveal the n earliest error-free slots. The off-line case for this problem is trivial, since the algorithm knows the slot of appearance of each error, and hence the n earliest error-free slots. Thus, it sufﬁces to query all errors whose uncertainty areas contain these slots. In the following, we deal only with the on-line case. We ﬁrst present a lower bound which holds also for agreeable instances, and then we propose a competitive algorithm for general instances which matches asymptotically this lower bound.

Note that the lexicographic error-query scheduling problem is easy for laminar instances and an optimal on-line algorithm can be designed. This algorithm is based on maximal subtrees and the fact that the number of error-free slots in a maximal subtree can be immediately calculated. Indeed, given the set E of the already queried errors and a maximal subtree T(vℓ) with respect to E , the number of error-free slots in (seℓ, feℓ] is equal to

(feℓ seℓ) |E(T(vℓ))| a(E , T(vℓ))

where E(T(vℓ)) is the set of errors corresponding to the nodes of T(vℓ) and a(E , T(vℓ)) is the number of errors in E whose appearance slot is in (seℓ, feℓ]. Then, the algorithm for laminar instances queries at each iteration the error that corresponds to the root of the leftmost maximal subtree which contains at least one error-free slot. This procedure terminates when the n earliest error-free slots are revealed. The optimality of the algorithm comes from the fact that the queried errors reveal the n earliest error-free slots, while the algorithm queries an error only if it can reveal at least one of these error-free slots. Theorem 4. There is an optimal on-line algorithm for the lexicographic error-query scheduling problem on laminar instances.

4.1 A Lower Bound We present here a lower bound to the competitive ratio of any on-line algorithm. This lower bound is valid for instances where there is only one job to execute, one error-free slot available as well as the uncertainty areas of the errors form an agreeable instance. Note that the lexicographic property is implied by the uniqueness of the error-free slot. Theorem 5. The competitive ratio of any on-line algorithm for the lexicographic error-query scheduling problem is Ω(log2 k), even if n = 1 and the instance is agreeable.

Proof. We consider the instance shown in Figure 3 which spans into the time interval (0, k + 1]. In this instance there are k errors, while the uncertainty area of the error eℓspans in (ℓ 1, ℓ+ 1]. Note that there exists exactly one errorfree slot in this instance which should be found by the on-line algorithm in order to execute the n = 1 job.

0 1 2 3 4 ℓ 2

Figure 3: An agreeable instance of the lexicographic error-query scheduling problem for which the adversary queries at most two errors while it can oblige any on-line algorithm to query log2 k errors.

Consider any on-line algorithm A and assume that it initially queries the error eℓ. The adversary has two choices: the slot of appearance of eℓis set to be either (ℓ 1, ℓ] or (ℓ, ℓ+1]. In the ﬁrst case A immediately learns that the error-free slot

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

cannot be in the interval (0, ℓ]. Indeed, there are ℓ 1 errors whose uncertainty area entirely spans in this interval plus eℓ whose appearance slot is (ℓ 1, ℓ]. Hence, there are ℓerrors whose appearance slots are in (0, ℓ]. Due to the assumption that two errors cannot have the same uncertainty slot, we conclude that there is not an error-free slot in (0, ℓ]. In a similar way, if the adversary sets the slot of appearance of eℓto be (ℓ, ℓ+ 1] then A immediately learns that the error-free slot cannot be in the interval (ℓ, k + 1]. In both above cases, the algorithm A can eliminate a part of the instance and continue the search for the error-free slot by querying a new error to the other part. We assume that A is a reasonable algorithm in the sense that it does not query errors for which it is easy to verify that cannot reveal any error-free slot. Note that the subinstance in which A will continue its search, has exactly the same form as the initial instance, but it has a smaller spanning interval and less errors. Based on the above observation, we consider that the adversary applies always the following policy: given a subinstance that spans in the interval (ℓ1, ℓ2 + 1] and the query for an error eℓ, where ℓ1 ℓ ℓ2, set t eℓ= (ℓ 1, ℓ] if ℓ ℓ1 ℓ2 + 1 ℓ, and t eℓ= (ℓ, ℓ+ 1] otherwise. In this way, the spanning interval of the remaining sub-instance which will be considered by the reasonable algorithm A is maximized. Speciﬁcally, the new sub-instance spans to an interval of size max{ℓ ℓ1, ℓ2 + 1 ℓ} and it contains max{ℓ ℓ1, ℓ2 + 1 ℓ} 1 errors. On the other hand, the algorithm A should select the error eℓsuch that the new sub-instance is of minimum size, which can be achieved if ℓ= ℓ2 ℓ1

2 . In other words, at each step the spanning interval of the sub-instance is decreased by half, and hence A is forced to query O(log2 k) errors before revealing the errorfree slot. Since all slots are included in the uncertainty area of at most two errors, the adversary needs at most two queries to reveal the error-free slot, and the theorem follows.

4.2 An O(log2 k)-competitive Algorithm In this section we propose an on-line algorithm for the lexicographic error-query scheduling problem whose main idea is to do a pruning of the intervals in which either no errorfree slots exist or an error-free slot exists in an earliest interval. In order to safely decide if an interval contains an error-free slot, we adapt the deﬁnition of maximal subtrees used for laminar instances as follows: given a set E of already queried errors, we say that a time interval I = (ta, tb] is maximal with respect to E if for each e E \ E either (se, fe] (ta, tb] or (se, fe] (ta, tb] = . Let E(I) be the set of errors in E \ E that satisfy the ﬁrst condition, i.e., e E(I) if (se, fe] (ta, tb]. Then, the number of error-free slots in I is equal to (tb ta) |E(I)| c, where c is the number of errors in E whose slot of appearance occurs in I. In what follows, we consider the slots partitioned into groups: a slot t belongs to group Gi, i 2, if the number of errors whose uncertainty area contain t is between 2i 2 + 1 and 2i 1. For convenience, let G0 and G1 be the groups of slots that are contained to the uncertainty area of zero and one errors, respectively. The total number of groups is in O(log2 k). Given two slots t, t with t t , let Gi[t, t ] be the set of slots of group Gi that appear in the interval (t 1, t ].

We deﬁne the middle slot of a set Gi[t, t ] to be the slot in position |Gi[t, t ]| /2 if we consider the slots in Gi[t, t ] ordered from left to right with respect to the time-line. In each iteration, the goal of our algorithm is to discover exactly one error-free slot: the earliest one which is not yet revealed. To do this, it performs a kind of binary search in the slots of the time horizon, starting from the slots of G1, then of G2, and so on. Speciﬁcally, the algorithm ﬁrst considers the middle slot in G1 and queries the error which contains it. Depending on the existence of at least one error-free slot on the left of the middle slot, we eliminate the interval on the right or on the left of this slot. The middle slot of the new interval that belongs to G1 is sought, and we repeat the above procedure until the reduced interval does not contain any other slot of G1. Then, we continue the procedure by choosing the middle slot of the reduced interval that belongs to G2. If such a slot exists, we query all errors that contain it, and so on. Note that at the end of each step of the binary search, all errors containing the current middle slot are queried. However, we do not have to query again already queried errors. Algorithm 3 describes formally the above procedure.

Algorithm 3

1: E , SOL 2: for j 1 to n do 3: left 1, right last slot, found False, i 1 4: while found = False do 5: if there is an error-free slot t SOL with left t right such that there is no error-free slot t SOL with left t t 1 and all errors containing t are in E

then 6: SOL SOL {t}, found True 7: Update the groups Gi with respect to the new errors added in E during current iteration 8: else if |Gi[left, right]| 2 then 9: i i + 1 10: else 11: mid the middle slot of Gi[left, right] 12: Query all errors in E \ E that contain mid and add them to E

13: if there is an error-free slot t SOL with left t right then 14: right mid 15: else 16: left mid

Note that the intervals (left 1, right] in Lines 5 and 13 of Algorithm 3 are maximal with respect to the current set of errors E , and hence these two conditions can be justiﬁed as described in the beginning of the section. Theorem 6. Algorithm 3 is O(log2k)-competitive for the lexicographic error-query scheduling problem.

Proof. By the deﬁnition of the problem, the algorithm and the adversary ﬁnd exactly the same set of n error-free slots, let {tj1, tj2, . . . , tjn}. We consider these slots ordered with respect to the time-line. Note that the algorithm reveals them according to this order. Without loss of generality, we can assume that the adversary uses also the same order. Let E ℓ, 1 ℓ n, be the set of errors queried by the adversary

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

in order to reveal tjℓ, without querying again the errors in E 1 E 2 . . . E ℓ 1. Then, the optimal solution queries Pn ℓ=1 |E ℓ| errors. In a similar way, let Eℓ, 1 ℓ n, be the set of errors queried by the algorithm in iteration ℓ, i.e., the algorithm queries Pn ℓ=1 |Eℓ| errors. Since both the algorithm and the adversary reveal the slots in the same order and do not query again already queried errors, we can assume that after revealing the slot tjℓthe instance is reduced by eliminating the queried errors (see Line 7 of the algorithm). Moreover, we consider that the groups Gi are appropriately updated. Assume that the slot tjℓ, 1 ℓ n, belongs to the group Giℓat the iteration during which it is revealed. If iℓ= 0, then both algorithm and adversary does not query any additional error. If iℓ 1, then the adversary queries at least 2iℓ 2 + 1 errors, while the algorithm at most Piℓ i=1 2i 1 log2 |Gi| 2iℓlog2 k additional errors. Therefore, the algorithm queries at most a factor of 4 log2 k more errors than the adversary for revealing each error-free slot, and hence the theorem follows.

5 Conclusion

In this paper, we gave a polynomial-time algorithm for the off-line error-query scheduling problem. For the on-line version of the problem, we proved that the competitive ratio of any on-line algorithm is Ω(

k) even if the instance is laminar, or agreeable and even if we search for only one free slot. Then, we proposed on O(

k)-competitive algorithm for laminar instances. Finally, we studied the on-line lexicographic error-query scheduling problem and we proved that its competitive ratio is in Θ(log2 n). A research question that remains open is the development of efﬁcient on-line algorithms for general instances for the error-query scheduling problem. Going further, studying non-preemptive jobs of arbitrary processing times is an interesting direction.

Acknowledgments

This work has been supported by the French ANR-14-CE240007-01 project Co Co RICo-Co De C and by a LIP6 project.

[Alpern and Lidbetter, 2013] Steve Alpern and Thomas Lidbetter. Mining coal or ﬁnding terrorists: The expanding search paradigm. Operations Research, 61:265 279, 2013.

[Angelopoulos, ] Spyros Angelopoulos. Further connections between contract-scheduling and ray-searching problems. In IJCAI 2015, pages 1516 1522.

[Angelopoulos, 2016] Spyros Angelopoulos. Deterministic searching on the line. In Encyclopedia of Algorithms, pages 531 533. 2016.

[Benabbou and Perny, a] Nawal Benabbou and Patrice Perny. Adaptive elicitation of preferences under uncertainty in sequential decision making problems. In IJCAI 2017, pages 4566 4572.

[Benabbou and Perny, b] Nawal Benabbou and Patrice Perny. Incremental weight elicitation for multiobjective state space search. In AAAI 2015, pages 1093 1099. [Boutilier et al., 2006] Craig Boutilier, Relu Patrascu, Pascal Poupart, and Dale Schuurmans. Constraint-based optimization and utility elicitation using the minimax decision criterion. Artif. Intell., 170:686 713, 2006. [Condon et al., 2009] Anne Condon, Amol Deshpande, Lisa Hellerstein, and Ning Wu. Algorithms for distributional and adversarial pipelined ﬁlter ordering problems. ACM Trans. Algorithms, 5:24:1 24:34, 2009. [Drummond and Boutilier, ] Joanna Drummond and Craig Boutilier. Elicitation and approximately stable matching with partial preferences. In IJCAI 2013, pages 97 105. [D urr et al., ] Christoph D urr, Thomas Erlebach, Nicole Megow, and Julie Meißner. Scheduling with explorable uncertainty. In ITCS 2018, pages 30:1 30:14. [Erlebach and Hoffmann, 2015] Thomas Erlebach and Michael Hoffmann. Query-competitive algorithms for computing with uncertainty. Bulletin of the EATCS, 116, 2015. [Feder et al., 2003] Tom as Feder, Rajeev Motwani, Rina Panigrahy, Chris Olston, and Jennifer Widom. Computing the median with uncertainty. SIAM J. Comput., 32:538 547, 2003. [Feder et al., 2007] Tom as Feder, Rajeev Motwani, Liadan O Callaghan, Chris Olston, and Rina Panigrahy. Computing shortest paths with uncertainty. J. Algorithms, 62:1 18, 2007. [Gelain et al., 2010] Mirco Gelain, Maria Silvia Pini, Francesca Rossi, Kristen Brent Venable, and Toby Walsh. Elicitation strategies for soft constraint problems with missing preferences: Properties, algorithms and experimental studies. Artif. Intell., 174:270 294, 2010. [Goerigk et al., 2015] Marc Goerigk, Manoj Gupta, Jonas Ide, Anita Sch obel, and Sandeep Sen. The robust knapsack problem with queries. Computers & OR, 55:12 22, 2015. [Gupta et al., 2011] Manoj Gupta, Yogish Sabharwal, and Sandeep Sen. The update complexity of selection and related problems. In FSTTCS, pages 325 338, 2011. [Hoffmann et al., ] Michael Hoffmann, Thomas Erlebach, Danny Krizanc, Mat us Mihal ak, and Rajeev Raman. Computing minimum spanning trees with uncertainty. In STACS 2008, pages 277 288. [Kahan, ] Simon Kahan. A model for data in motion. In ACM STOC 1991, pages 267 277. [Megow et al., 2015] Nicole Megow, Julie Meißner, and Martin Skutella. Randomization helps computing a minimum spanning tree under uncertainty. In ESA, pages 878 890, 2015. [Olston and Widom, ] Chris Olston and Jennifer Widom. Offering a precision-performance tradeoff for aggregation queries over replicated data. In VLDB 2000, pages 144 155.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)