# explaining_softgoal_conflicts_through_constraint_relaxations__a39b0afc.pdf

Explaining Soft-Goal Conﬂicts through Constraint Relaxations

Rebecca Eiﬂer1 , Jeremy Frank2 and J org Hoffmann1,3

1Saarland University, Saarland Informatics Campus, Germany 2NASA Ames Research Center, Mountain View, CA, USA 3German Research Center for Artiﬁcial Intelligence (DFKI), Saarbr ucken, Germany {eiﬂer, hoffmann}@cs.uni-saarland.de, Jeremy.D.Frank@nasa.gov,

Recent work suggests to explain trade-offs between soft-goals in terms of their conﬂicts, i. e., minimal unsolvable soft-goal subsets. But this does not explain the conﬂicts themselves: Why can a given set of soft-goals not be jointly achieved? Here we approach that question in terms of the underlying constraints on plans in the task at hand, namely resource availability and time windows. In this context, a natural form of explanation for a soft-goal conﬂict is a minimal constraint relaxation under which the conﬂict disappears ( if the deadline was 1 hour later, it would work ). We explore algorithms for computing such explanations. A baseline is to simply loop over all relaxed tasks and compute the conﬂicts for each separately. We improve over this by two algorithms that leverage information conﬂicts, reachable states across relaxed tasks. We show that these algorithms can exponentially outperform the baseline in theory, and we run experiments conﬁrming that advantage in practice.

1 Introduction Imagine planning the next Mars Rover Mission. Due to the rovers limited resources and timing constraints for data collection and uploads, only some of the rovers tasks can be planned. Recent work by Eiﬂer et al. [2020a; 2020b] suggests explaining trade-offs between such soft goals in terms of conﬂicts, i. e., minimal unsolvable soft-goal subsets. However, this does not give further insights into why the soft-goals can not be jointly achieved. Understanding the cause of these conﬂicts and possible resolutions is crucial to reasoning about different options and ﬁnding the best trade-offs. Here, we explore the question of what causes soft-goal conﬂicts in tasks with constraints such as resource availability and time windows. In this context, soft-goal conﬂicts can naturally be explained by identifying the minimal constraint relaxations under which the conﬂict disappears. For example the conﬂict {x-ray image, soil sample} could be explained by: The rover needs 2 more units of energy or the upload window to relay X needs to be 3 time units longer, to perform both tasks . A similar approach is used in constraint programming [Lauffer and Topcu, 2019;

Senthooran et al., 2021], where they introduce soft constraints, to provide suggestions on how to modify an unfeasible subset of constraints to make it feasible. We investigate algorithms for computing such explanations based on minimal relaxations in a set of given relaxations, which we instantiate with resource and time window constraint relaxations. Eiﬂer et al. [2020b] introduced an algorithm that computes all minimal unsolvable goal subsets of a task by expanding the whole search space while tracking all maximal solvable goal subsets. To reduce the search space size they prune all states from which, according to a given heuristic, no superset of the incumbent solution is reachable. The basic adaption of this procedure is to iteratively call it for each relaxed task separately and compute the minimal relaxed task where each conﬂict disappears in a post-processing step. We introduce two algorithms that improve over this baseline by exploiting the fact that information like reachable goal subsets and states can be propagated from one relaxed task to another if the latter is more relaxed. The ﬁrst algorithm, Internal Constraint Reuse (ICR), iteratively computes the conﬂicts for each increasingly relaxed planning task and reuses the reachable subgoals from less relaxed tasks. This provides the pruning function with a growing set of reachable subgoals that it can use to prune parts of the search space that do not contain any subgoals that have not yet been achieved. The second algorithm, Search Space Reuse (SSR), reduces duplicate work by iteratively increasing one search space instead of generating a new one per relaxed task. This is done by storing the search frontier for each task, and using it as the starting point for more relaxed tasks. Thus for each relaxed task, only the newly reachable states are generated. We show that these algorithms can exponentially outperform the baseline, with respect to the number of generated states, in theory. Experiments on 4 resource-centric domains and 3 domains with time windows show that both algorithms perform signiﬁcantly better than the baseline in practice, and that they are complementary to each other in terms of ﬁnding explanations on resourceand time-centric domains.

2 Preliminaries

2.1 Planning Formalism

A ﬁnite-domain representation (FDR) [B ackstr om and Nebel, 1995] planning task with soft-goals is a tuple τ =

Proceedings of the Thirty-First International Joint Conference on Artiﬁcial Intelligence (IJCAI-22)

(V, A, I, Ghard, Gsoft), where V is a ﬁnite set of state variables v with domain D(v), A is a ﬁnite set of actions, and I is a complete assignment to V called initial state. Ghard and Gsoft are disjoint partial assignment to V called hard and soft-goal. A state is a complete assignment to V . Variablevalue pairs v = d are referred to as facts, and (partial) variable assignments are identiﬁed by sets of facts. The value of v in the (partial) variable assignments s is referred to as s(v). Each action consists of a precondition and effect (prea, effa) deﬁned as partial assignments to V . An action a is applicable in state s (appl(a, s)) if prea s. Applying a to s, denoted by s[[a]] = s , changes the values of s to s (v) := effa(v) if effa(v) is deﬁned and s (v) := s(v) otherwise. The resulting state of an iteratively applicable action sequence π is denoted by s[[π]]. A plan is an action sequence π where Ghard I[[π]]. It achieves the soft-goals Gsoft I[[π]]. Instead of a cost-function with an upper bound, as in oversubscription planning by [Smith, 2004; Domshlak and Mirkis, 2015], here we consider constraints such as resources and time as a limiting factor for achievable soft-goals. The preﬁx a0 ai of plan π = a0 ai, aj an is denoted by preﬁx(π, aj).

Running Example Our running example is based on the IPC Rovers domain. A rover must collect up to three samples S0, S1, S2 and upload the data to a relay satellite. The rover can perform three different actions: move between locations, take a sample when it is at the corresponding location, and upload the collected data at l0 or l3. The road map and the initial location of the rover are depicted on the left in Figure 1.

δB = 1 δt = 2

δB = 1 δt = 1

δB = 1 δt = 2

Si : δB = 1 δt = 1

0 1 2 3 4 5 6 7 8 9 10

resource initρb ρbmax

0 1 2 3 4 5 6 7 8 9 10

time window tmax o c

Figure 1: Running Example

2.2 Planning with Consumed Resources A consumed resource ρ with domain D(ρ) = [0, ρmax] N has an initial value initρ D(ρ) and a function δρ : A 7 N that maps each action to the amount of resource consumed by that action. A state represents a complete assignment to V {ρ}. Action a is applicable in state s if prea s and the remaining value of ρ is sufﬁcient to execute the action s(ρ) δρ(a). Applying a in s decreases the resource by δρ(a): s[[a]](ρ) = s(ρ) δρ(a). The amount of resource ρ consumed by an action sequence π is given by con(π) = P a π δρ(a). An extension to multiple resources is deﬁned accordingly, where the set of all resources is denoted by R. In our running example, there is battery ρb as a resource.

2.3 Planning with Simple Time Windows We restrict ourselves to a concept of time that can be compiled to classical planning. This means discrete time units and no parallel execution of actions. A (start) time window is a tuple W = (AW , o, c) with 0 o c tmax. The application

of the actions in AW A is constrained by the opening time o and closing time c. The function δt : A 7 N maps each action to its execution duration. The passed time units are represented by the variable t with domain D(t) = [0, tmax]. A state represents a complete assignment to V {t}. Action a is applicable in state s if prea s and if a AW then o s(t) c. Applying a in s increases the passed time by δt(a): s[[a]](t) = s(t) + δt(a). The execution duration of an action sequence π is given by dur(π) = P a π δt(a) and the execution time point of action a in π by exec(π, a) = dur(preﬁx(π, a)). An extension to multiple time windows is deﬁned accordingly, where the set of all time windows is denoted with W. In our example, there is one time window WU = ({upload(Si) | i {0, 1, 2}}, 4, 6) which allows to upload data to the relay only between the time points 4 and 6.

2.4 Explanation Framework In [Eiﬂer et al., 2020a; Eiﬂer et al., 2020b] Gsoft represents a set of plan properties, speciﬁcally LTLf plan-preference formulas compiled into soft-goal facts [Baier and Mc Ilraith, 2006; Edelkamp, 2006]. The framework uses conﬂicts between these plan properties to generate answers to the users question. The soft-goals X, Y Gsoft conﬂict each other if all plans of τ that achieve all g X, do not achieve all g Y . The strongest dependencies of this kind are given by the minimal unsolvable goal subsets (MUGS) X Y = G Gsoft where G cannot be achieved but every G G can. The set of all MUGS for a task τ is denoted by MUGS(τ).

3 Conﬂict Explanation Through Relaxations

We provide explanations for a soft-goal conﬂict based on minimal constraint relaxations under which the conﬂict disappears. In the following sections, we deﬁne the relaxations that are considered and how we identify explanations based on a given set of relaxed tasks.

3.1 Relaxation Orders The most general property of an abstraction or relaxation T of a planning task T is that all plans of T are preserved [Culberson and Schaeffer, 1998; Edelkamp, 2001; Seipp and Helmert, 2013]. This gives the most general deﬁnition of a relaxed task as:

Deﬁnition 1 (Relaxed Task). Let T be a planning task and Π(T) the set of all possible plans for task T. Then T is a relaxed task of T (denoted by T T ) iff Π(T) Π(T ).

Our explanation approach and algorithms make no further assumptions about the speciﬁc implementation of relaxation. To compute the explanation for a conﬂict C MUGS(T), we assume a ﬁnal set of relaxed tasks T for T, where T T and for all T T : T T , is given. For T the relation Ti Tj represents a partial order, which we will use to deﬁne a minimal relaxed task that resolves C. The partially ordered set ˆT = (T, ) we call in the following a relaxation order for T. The functions CU(T ) and CL(T ) denote the upper and lower covers of T within ˆT . Given a partially ordered set S then the upper cover of an element e S is the set CU(e) = {e S | e > e e S : e > e > e}, and

Proceedings of the Thirty-First International Joint Conference on Artiﬁcial Intelligence (IJCAI-22)

the lower cover the set CL(e) = {e S | e < e e S : e < e < e}. One of our algorithms additionally assumes, that ˆT has a supremum.

3.2 Resource and Time Constraint Relaxations Next we instantiate the above with resource and time window constraint relaxations.

Resource Constraint Relaxations A task with consumed resources can be relaxed by increasing the initial resource value. Deﬁnition 2 (Resource Relaxed Task). Let T = (τ, R) be a planning task τ with resources R. Then a resource relaxed task for resource ρ R is deﬁned as T = (τ, R ) where ρ is replaced by resource ρ with D(ρ) = D(ρ ) = [0, ρmax], δρ = δρ and ρmax initρ initρ. A resource relaxed task indeed represents a relaxed task according to Deﬁnition 1: Proposition 1. Let T be a resource relaxed task of T. Then, Π(T) Π(T ). Proof sketch: Π(T) Π(T ), because every action sequence π = a0 an applicable in I of T is also applicable in I of T . For all actions ai π with πi = preﬁx(π, ai), s = I[[πi]], s = I [[πi]] and c = con(πi), ai is applicable in s because ai is applicable in s and s(V ) = s (V ) and initρ c = s(ρ) < s (ρ ) = initρ c. Making the application of an action cheaper by reducing δρ is another way to relax a task with respect to a resource. This is almost equivalent to increasing the initially available resource, given that the resource is exhaustible and the action appears once in the plan. We use increasing resource availability as a proxy for any reduction in resource consumption. Using the set Tρ of all resource relaxed tasks of task T for ρ, we get a well-deﬁned relaxation order ˆTρ = (Tρ, ) for T. Since all Ti Tρ are exclusively distinguished by initρi we have T T iff initρ < initρ , which results in a total order for Tρ. The task T where initρ = ρmax represents the supremum of ˆTρ. The upper/lower cover of T Tρ is the relaxed task where the initial resource value is one unit larger/smaller than in T . For our running example, we have four relaxed tasks for the battery Tρb = {Ti | i {7, 8, 9, 10}}, where in Ti the initial battery level is i.

Time Constraint Relaxations A task with simple time windows can be relaxed by increasing the time window, either by decreasing the opening time or by increasing the closing time. Deﬁnition 3 (Time-Window Relaxed Task). Let T = (τ, W) be a planning task τ with simple time windows W. Then a relaxed task for time window W = (AW , o, c) W is deﬁned as T = (τ, W ) where W is replaced by W = (AW , o , c ) with 0 o o c c tmax. A time-window relaxed task indeed represents a relaxed task according to Deﬁnition 1: Proposition 2. Let T be a time-window relaxed task of T. Then, Π(T) Π(T ).

Proof sketch: Π(T) Π(T ), because every action sequence π = a0 an applicable in I of T is also applicable in I of T . For all actions ai π, exec(π, ai) is the same in both tasks and with πi = preﬁx(π, ai), s = I[[πi]] and s = I [[πi]], ai is applicable in s because ai is applicable in s and if ai AW then o o exec(π, ai) c c . An alternative approach to relax a task with respect to time constraints is the reduction of the execution time of an action by decreasing δt. However, in addition to affecting multiple time windows, handling the explosion of possible relaxed tasks is not trivial, which is why we leave this for future work. The subsumption relation of the intervals [o , c ] for time window W yields the partial order for ˆTW = (TW , ), where TW is the set of all time-window relaxed tasks of T with respect to W. The task with W = (AW , 0, tmax) is the supremum of ˆTW . The upper/lower cover of T TW are the relaxed tasks where either o is decreased/increased or c is increased/decreased by one compared to T . For our running example we have 25 different relaxed tasks for the upload window TWU = {Ti,j | i {0, 1, 2, 3, 4} j {6, 7, 8, 9, 10}}, where in Ti,j the opening time is at i and the closing time at j.

3.3 Conﬂict Explanation We aim to generate explanations for the conﬂicts in MUGS(T). Given a relaxation order, we can now deﬁne for each conﬂict whether a task is minimally relaxed for it.

Deﬁnition 4 (Minimally Relaxed Task). Let ˆT = (T, ) be a relaxation order for task T and C MUGS(T) a conﬂict. Then T T is minimally relaxed for C / MUGS(T ) if for all T T : T < T C MUGS(T ). Thus, a minimally relaxed task for conﬂict C is one in which C is not a conﬂict, but for all less relaxed tasks it is. All conﬂicts in MUGS(T), for which T ˆT is minimally relaxed are denoted by mr-MUGS( ˆT , T ). The explanation for a conﬂict in MUGS(T) can then be deﬁned as: Deﬁnition 5 (Conﬂict Explanation). Let T be a task with conﬂict C MUGS(T) and ˆT a relaxation order for T. Then the set of all minimally relaxed tasks for C, E( ˆT , C) = {T | C mr-MUGS( ˆT , T )}, is the conﬂict explanation for C. To illustrate the explanation for conﬂict C = {S0, S2} in our running example, we use the diagram in Figure 2. The minimal relaxed tasks and therefore the explanations are given as E = {T1,6, T4,7}: Sample S0 and S2 can not both be uploaded, because the upload window, needs either to start 3 time units earlier or to end 1 time unit later .

C MUGS(Ti,j) C / MUGS(Ti,j)

Ti,j E( ˆT , C)

Figure 2: Part of hasse diagram for the time relaxed tasks of the running example. [i1, j1] [i2, j2] means Ti1,j1 Ti2,j2. T3,7 / E( ˆT , C) because T4,7 < T3,7.

Proceedings of the Thirty-First International Joint Conference on Artiﬁcial Intelligence (IJCAI-22)

Algorithm 1 Internal Constraint Reuse (ICR)

1: Given: relaxation order ˆT , heuristic h 2: function COMPUTEMSGS( ˆT , h) 3: M {} map of MSGSs 4: while HASNEXT( ˆT ) do 5: ˆT NEXT( ˆT ) current relaxed task 6: M[ ˆT ] S ˆ T CL( ˆ T ) M[ ˆT ] propagate MSGS 7: O {INIT( ˆT )} initial state of relaxed task 8: while |O| = 0 do 9: s NEXT(O, h) next state acording to expansion order 10: if Ghard s then update MSGS 11: M[ ˆT ] EXTEND(M[ ˆT ], s Gsoft) 12: if (Ghard Gsoft) s then check and propagate solvability 13: T ˆT ˆT T : M[T ] Ghard Gsoft 14: break 15: O O {s SUCC( ˆT , s) | PRUNE( ˆT , M[ ˆT ], h, s )} 16: return M

4 Internal Constraint Reuse (ICR)

In the following, we introduce two algorithms that given a relaxation order ˆT compute the maximal solvable goal subsets (MSGS) for each task. The MSGS can then be used to compute the mr-MUGS of each task as follows.

From MSGS to mr-MUGS A MSGS is a soft-goal subset G Gsoft where G can be achieved but every G G cannot. Given MSGS(T ) for T T we compute mr-MUGS( ˆT , T ) in two steps. First, we compute MUGS(T ) by performing a bottom-up tree search over all subsets of Gsoft and use the MSGS as a fast solvability check as introduced by [Eiﬂer et al., 2020b]. Then, the MUGS for which T is minimally relaxed are computed as mr-MUGS( ˆT , T ) = MUGS(T) ((T T CL(T ) MUGS(T )) \ MUGS(T )).

MSGS Computation Eiﬂer et al. [2020b] compute the MSGS for a task by exhaustively exploring the state space while tracking all reached MSGS. To reduce the search space size they introduce a pruning function, which prunes all states from which no superset of the current MSGS is reachable. Extending this algorithm, given a relaxation order ˆT = (T, ) for T, we compute the MSGS for all T T by iterating over T according to the partial ordering, starting with T, and computing the MSGS for each task individually. Since all plans are preserved, for all Ti, Tj T with Ti Tj, all softgoals G Gsoft that are reachable in Ti are also reachable in Tj. Thus, the MSGS of Ti can be propagated to Tj.

Pseudo Code of ICR The pseudo code of the Internal Constraint Reuse (ICR) algorithm is given in Algorithm 1. The underlying search algorithm for the state space exploration of one task is depth ﬁrst search (DFS) guided by a heuristic (see [Eiﬂer et al., 2020b]). This is abstracted by the function NEXT(O, h), which mimics the expansion order of DFS. M (line 3) is a map from task to a set of soft-goal subsets storing the MSGS for each task. The aforementioned propagation of MSGS is realized by initializing M[ ˆT] with all MSGS reached in the lower cover of ˆT (line 6). Iterating over the relaxed tasks according to the partial ordering is represented by the functions HASNEXT( ˆT ) and NEXT( ˆT ). The order of incomparable elements is resolved randomly.

Algorithm 2 Search Space Reuse (SSR)

1: Given: relaxation order ˆT , heuristic h 2: function COMPUTEMSGS( ˆT , h) 3: Ts SUPREMUM( ˆT ) maximal relaxed task 4: M {} map of MSGSs 5: F {} map of search frontiers 6: O INIT(Ts) initial state of maximally relaxed task 7: ˆT NEXT( ˆT ) current relaxed task 8: while True do 9: while |O| = 0 do 10: s NEXT(O, h) next state acording to expansion order 11: if Ghard s then update MSGS 12: M[ ˆT ] EXTEND(M[ ˆT ], s Gsoft) 13: if (Ghard Gsoft) s then check and propagate solvability 14: T ˆT ˆT T : M[T ] Ghard Gsoft 15: break 16: Ssuc {s SUCC(Ts, s) | PRUNE(Ts, M[ ˆT ], h, s )} 17: F[ ˆT ] F[ ˆT ] {s , Ssuc | APPL( ˆT , π(s ))} 18: O O {s Ssuc | APPL( ˆT , π(s ))} 19: if HASNEXT( ˆT ) then 20: return M 21: ˆT NEXT( ˆT ) current relaxed task 22: M[ ˆT ] S ˆ T CL( ˆ T ) M[ ˆT ] propagate MSGS 23: FC S ˆ T CL( ˆ T ){s F[ ˆT ] | PRUNE(Ts, M[ ˆT ], h, s )} 24: F[ ˆT ] {s FC | APPL( ˆT , π(s ))} 25: O {s FC | APPL( ˆT , π(s ))}

M[ ˆT ] is only updated (line 11) in case no superset has already been reached: EXTEND(M, G) returns M if there is a G M : G G and {G M|G G} {G} otherwise. The generation of successor states of state s according to the semantics of task T (SUCC(T, s)) is based on standard progression (see Sections 2.2 and 2.3). States are pruned (line 15) if no superset of soft-goals or the hard goal cannot be reached: PRUNE(T , M, h, s) returns true if Ghard R G M : R Gsoft G, where R is the set of facts reachable from state s in task T according to heuristic h and false otherwise. This can for example be realized by checking for each fact whether its hmax [Haslum and Geffner, 2000] estimation is ﬁnite, for more details we refer to [Eiﬂer et al., 2020b]. If a state satisﬁes all hard and soft goals the search for the current task can be terminated early (line 12). All more relaxed tasks are also solvable, so their MSGS are updated accordingly (line 13). Tasks whose MSGS have already been determined are skipped by HASNEXT/NEXT.

5 Search Space Reuse (SSR)

ICR causes overhead, because equivalent states are generated multiple times in the separate search spaces. In addition to the MSGS, it can be beneﬁcial to reuse the search space too. Since all plans are preserved in the relaxations T of task T, for all T T and the most relaxed task Ts = supremum( ˆT ) holds (S T CL(T ) ST ) ST , where ST are the states reachable from the initial state Is of Ts by plans of T . Thus, we can base the computation of the MSGS for all relaxed tasks on the search space of Ts. We begin by exploring the reachable state space for T. All states that are generated, but which are not reachable in T, are stored in a search frontier. To decide whether a state s is reachable in a task T , we check whether the action sequence π(s) leading to s is applicable in T (APPL(T , π(s))). In our example, states reached by

Proceedings of the Thirty-First International Joint Conference on Artiﬁcial Intelligence (IJCAI-22)

paths consuming more than 7 energy units are not reachable in T7 and are therefore stored in the frontier. In subsequent iterations, the search frontiers of less relaxed tasks are further extended for more relaxed tasks. For example, for T8 the frontier states of T7 are further extended. This limits the states generated for each task to the newly reachable states.

Pseudo Code of SSR The pseudo-code of the Search Space Reuse (SSR) algorithm is depicted by Algorithm 2. Unless explicitly mentioned, the algorithm parts work as described for Algorithm 1. The map F (line 5) stores for each task ˆT the states which were generated during the search for ˆT but were not reachable (line 17). In the ﬁrst iteration, the openlist is initialized with the initial state of the most relaxed task Ts (line 6). In each subsequent iteration, it is initialized with the states in F of all tasks in the lower cover of ˆT that are reachable in ˆT (line 23-25). States are pruned by following the same approach as in Algorithm 1 (line 16/23). However, instead of basing the pruning on the current relaxed task ˆT it is based on the most relaxed task Ts, since otherwise states that might be reachable in more relaxed tasked, would be pruned.

6 Theoretical Comparison

The propagation of MSGS can improve the pruning function, which is beneﬁcial to both ICR and SSR. Reusing the search space in SSR reduces duplicate work, but states are only pruned based on the reachability in the most relaxed task, not the current task. We compare the overall number of generated states by each algorithm as a measure to decide whether they are exponentially separated. As the baseline algorithm, we consider ICR without the propagation of MSGS.

Deﬁnition 6 (Exponential Separation). Let {T n|n N} be a family of planning tasks of size (number of facts and actions) polynomially related to n and S(X) the number of states generated by search method X. Then, search method X is exponentially separated from search method Y iff |S(Y ) S(X)| is exponential in n.

To give a family of planning tasks to prove the exponential separations of the algorithms we consider a planning task, where a robot has to visit different locations. The robot s movement is restricted by the resource ρ, which can have the values {0, 1, 2}, with initial value 1. Moving between connected locations consumes the amount of resources depicted in the maps in Figure 3. There is one location annotated with K which holds a set of n keys. The robot can pick up one key at a time (without using any resources) if it is in the same location as the key. To take the dashed connection the robot has to hold all keys. Since the robot can pick up any combination of keys, there can be exponentially many search states. In the following examples, the pruning function uses the h2 heuristic [Haslum and Geffner, 2000] to decide reachability.

Theorem 1. ICR and SSR are exponentially separated from the baseline.

Example Consider the map depicted on the left in Figure 3. In the ﬁrst iteration with initρ = 1 ICR and the baseline generate 2 states (R at L0 and L1). SSR, with initρ = 2, generates the same two states and 2 additional states (R at L2 and L3),

Figure 3: left: Map of separation of ICR and SSR from baseline, initial location: L0, goal: visit L1 and L2. right: Map of separation of ICR from SSR, initial location L0 , goal: visit L2.

which are not reachable and stored in the frontier. For both ICR and SSR MSGS = {{L1}} is propagated. In the next iteration of ICR (initρ = 2), moving to L3 is pruned because no new locations are reachable from there. The same holds for SSR. This leads to 2 + 3 and 4 + 1 states for ICR and SSR respectively. For the baseline, the reachability of L1 is not propagated and moving to L3 is not pruned. Thus, we get 2 + 3 + 2 2n states, for picking up any combination of keys. Theorem 2. ICR is exponentially separated from SSR. Example Consider the map depicted on the right in Figure 3. In the ﬁrst iteration with initρ = 1, ICR generates only one state. Moving to L1 is pruned because h2 recognizes that L2 is not reachable with ρ = 1. In SSR, the weaker constraint initρ = 2 prevents pruning L1 and picking up any combination of keys. Thus, 1 + 2n reachable and 2n (at L2 with any combination of keys) unreachable states are generated. In the last iteration in ICR with initρ = 2 visiting L2 via the upper connection and extending the MSGS to {{L2}} leads to early termination. The same holds for SSR. This results in 1 + 3 states for ICR and 1 + 2 2n for SSR.

7 Experiments We implemented both algorithms in the Fast Downward planning system [Helmert, 2006], extending the code base of Eiﬂer et al. [2020b] and using hmax as a base heuristic for the pruning function1. The experiments were run on Intel E52660 machines running at 2.20 GHz, with a time (memory) limit of 2h (4GB) per benchmark instance. Benchmark Our benchmark consists of 4 resourceconstraint domains (Blocksworld, No Mystery, Rovers R, TPP) and 3 domains (Parent s Afternoon, Rovers T, Satellite) with time constraints. The former part builds on the resource constraint benchmark by Eiﬂer et al. [2020b]. In each instance there are two individual resources R. For each resource ρ R we generated one benchmark instance, scaling initρ between 0 and two times the initial value in the original instance. Rover T and Satellite are extension of the IPC domains with data upload windows for Rovers and time windows to take the images for Satellite. Parents Afternoon [Eiﬂer et al., 2022] models a parent s afternoon routine, including shopping and family member activities. The execution of these activities is constraint by time windows. For each time window W W we generated one benchmark instance. For Parent s Afternoon and Satellite, each time window is relaxed between its original size and the maximal value of the time

1The source code and the benchmark are available at: https:// github.com/XPP-explainable-planning

Proceedings of the Thirty-First International Joint Conference on Artiﬁcial Intelligence (IJCAI-22)

variable domain. For Rovers the relaxation of an upload window is additionally bounded by the other upload windows. Each benchmark instance has up to 5 plan properties that, for example, restrict the order in which two goal facts are to be achieved. All plan properties and the original goal facts of the instance are soft goals. There are no hard goals.

7.1 Evaluation The coverage results are shown in Table 1. An instance is considered to be solved, when the MUGS for all relaxed tasks are computed. Comparing the ICR to the baseline shows propagating the MSGS increases the coverage in 5 domains, while not decreasing it in any. SSR solves more instances in 4 domains, while it is worse than the baseline in 2. ICR clearly has the advantage over SSR in the resource domains, while it is the opposite in the time constraint domains.

domain # base ICR SSR

Blocksworld 40 18 18 19 No Mystery 50 12 24 9 Rovers R 40 20 20 15 TPP 30 11 19 9

Parent s A. 72 35 37 53 Rovers T 138 47 53 96 Satellite 198 123 130 144

Table 1: Coverage (number of instances solved); ICR and SSR: algorithms introduced in Section 4 and 5; base: ICR without propagation of MSGS. Best result for each domain is highlighted in bold.

The increase in reachable states caused by relaxing a time window is usually much smaller than for a resource. Increasing a time window only adds few more times at which a single action a AW could start. However, as a is also constrained by all other time dependent actions, there may not be many added reachable states. In contrast, relaxing a resource allows you to add new actions and increases the number of action orderings. This is in favor for SSR, because it only considers the newly reachable states. A comparison of the number of expansions each algorithm requires per relaxed task, as depicted in Figure 4, conﬁrms this assumption. In the time constraint domains SSR expands more states than ICR in the ﬁrst task, but has many fewer expansions than ICR thereafter. In the resource-constraint task, the stronger pruning function in ICR is advantageous for a wider span of relaxed tasks, such that SSR only needs fewer expansions in more relaxed tasks. Problems may not be solved either due to the exhaustion of the time or the memory limit. For Blocksworld and all time constraint domains, all algorithms ran out of time. For the other resource constraint domains SSR failed due to the memory limit. In TPP ICR failed due to the time limit and in rovers due to the memory. In Nomystery failure of ICRwas cased by about 25% timeouts and 75% memory limit exhaustion. Overall, timeout is most common. This could be addressed by parallelization of tasks without a strict order.

8 Related Work Sreedharan et al. [2019] explain the unsolvability of a task by identifying necessary subgoals of relaxed tasks, that are unachievable in the original task. However, due to use of relaxations based on projections on subsets of variables, this

0 2 4 6 8 10 12 14 16 100

avg #expansions

resource constraint

0 2 4 6 8 10 12 100

avg #expansions

time constraint

Blocksworld No Mystery Rovers R TPP Parent s A. Rovers T Satellite

baseline ICR SSR

Figure 4: Comparison of average number of expansions over commonly solved task. Error bars represent the 95% conﬁdence intervals. top: resource constraint domains, x value corresponds initρ; bottom: time constraint domains, x value corresponds to size difference of the relaxed time window to the original one.

approach is not suitable for quantifying the relaxation necessary to make the task solvable. An excuse for unsolvability, as deﬁned by [G obelbecker et al., 2010], is a series of value changes in the initial state and additional objects to make a task solvable. Their approach does not provide an explanation for a speciﬁc conﬂict, but explains why the task is not solvable and focuses on pointing out errors in the model description. The scheduling system by Agraval et al. [2020] provide information about constraint relaxations for not scheduled activities. Their main focus is to identify all unmet constraints of an activity and present them alongside the schedule to facilitate user review. Their analysis does not yet include any reasoning on the extent to which a constraint needs to be relaxed to schedule the activity. The unsolvability certiﬁcates provided by the proof system of Eriksson et al. [2017; 2018] are not intended to be human-readable and do not provide information on how the task could be made solvable. The resource and time window constraints we consider here, can be compiled to classical planning and relaxation can be represented by domain abstractions [Domshlak et al., 2009]. Resources constraint relaxation could additionally be simulated by cost bound relaxation in classical planning with multiple cost functions [Katz et al., 2019; Altman, 1999].

9 Conclusion

Our approach addresses the question why soft-goal conﬂicts exist by identifying the minimal relaxation under which a conﬂict disappears. Combined with the work of Eiﬂer et al.[2020a; 2020b], this provides an explanation framework that can explain trade-offs between soft goals by identifying not only conﬂicts, but also options for resolving them. This not only helps to better understand why a conﬂict exists, but also whether it can be resolved. In addition it enables the user to evaluate the trade-offs and beneﬁts of a relaxation. Future work includes the evaluation in an application setting and the automatic identiﬁcation of relevant relaxations for a user and conﬂict.

Proceedings of the Thirty-First International Joint Conference on Artiﬁcial Intelligence (IJCAI-22)

Acknowledgments This material is based upon work supported by the Air Force Ofﬁce of Scientiﬁc Research under award number FA955018-1-0245, and by the German Research Foundation (DFG) under grant 389792660 as part of TRR 248 (see https:// perspicuous-computing.science).

References [Agrawal et al., 2020] Jagriti Agrawal, Amruta Yelamanchili, and Steve Chien. Using explainable scheduling for the mars 2020 rover mission. In ICAPS XAIP, 2020. [Altman, 1999] Eitan Altman. Constrained Markov Decision Processes. CRC Press, 1999. [B ackstr om and Nebel, 1995] Christer B ackstr om and Bernhard Nebel. Complexity results for SAS+ planning. Computational Intelligence, 11(4):625 655, 1995. [Baier and Mc Ilraith, 2006] Jorge A. Baier and Sheila A. Mc Ilraith. Planning with ﬁrst-order temporally extended goals using heuristic search. In Proc. AAAI, pages 788 795, 2006. [Culberson and Schaeffer, 1998] Joseph C. Culberson and Jonathan Schaeffer. Pattern databases. Computational Intelligence, 14(3):318 334, 1998. [Domshlak and Mirkis, 2015] Carmel Domshlak and Vitaly Mirkis. Deterministic oversubscription planning as heuristic search: Abstractions and reformulations. JAIR, 52:97 169, 2015. [Domshlak et al., 2009] Carmel Domshlak, J org Hoffmann, and Ashish Sabharwal. Friends or foes? on planning as satisﬁability and abstract cnf encodings. JAIR, 36:415 469, 2009. [Edelkamp, 2001] Stefan Edelkamp. Planning with pattern databases. In A. Cesta and D. Borrajo, editors, Proceedings of the 6th European Conference on Planning (ECP 01), pages 13 24. Springer-Verlag, 2001. [Edelkamp, 2006] Stefan Edelkamp. On the compilation of plan constraints and preferences. In ICAPS, pages 374 377, 2006. [Eiﬂer et al., 2020a] Rebecca Eiﬂer, Michael Cashmore, J org Hoffmann, Daniele Magazzeni, and Marcel Steinmetz. A new approach to plan-space explanation: Analyzing plan-property dependencies in oversubscription planning. In AAAI, 2020. [Eiﬂer et al., 2020b] Rebecca Eiﬂer, Marcel Steinmetz, Alvaro Torralba, and J org Hoffmann. Plan-space explanation via plan-property dependencies: Faster algorithms & more powerful properties. In IJCAI, pages 4091 4097, 2020. [Eiﬂer et al., 2022] Rebecca Eiﬂer, Martim Brandao, Amanda Coles, Jeremy Frank, and J org Hoffmann. Evaluating plan-property dependencies: A web-based platform and user study. In ICAPS, 2022. [Eriksson et al., 2017] Salom e Eriksson, Gabriele R oger, and Malte Helmert. Unsolvability certiﬁcates for classical planning. In ICAPS, pages 88 97, 2017.

[Eriksson et al., 2018] Salom e Eriksson, Gabriele R oger, and Malte Helmert. A proof system for unsolvable planning tasks. In ICAPS, volume 28, 2018. [G obelbecker et al., 2010] Moritz G obelbecker, Thomas Keller, Patrick Eyerich, Michael Brenner, and Bernhard Nebel. Coming up with good excuses: What to do when no plan can be found. In Proc. ICAPS, pages 81 88, 2010. [Haslum and Geffner, 2000] Patrik Haslum and Hector Geffner. Admissible heuristics for optimal planning. In Proc. AIPS, pages 140 149, 2000. [Helmert, 2006] Malte Helmert. The Fast Downward planning system. JAIR, 26:191 246, 2006. [Katz et al., 2019] Michael Katz, Emil Keyder, Dominik Winterer, and Florian Pommerening. Oversubscription planning as classical planning with multiple cost functions. In ICAPS, pages 237 245, 2019. [Lauffer and Topcu, 2019] Niklas Lauffer and Ufuk Topcu. Human-understandable explanations of infeasibility for resource-constrained scheduling problems. In Proceedings of the 2nd Workshop on Explainable Planning (XAIP 19), 2019. [Seipp and Helmert, 2013] Jendrik Seipp and Malte Helmert. Counterexample-guided Cartesian abstraction reﬁnement. In Proc. ICAPS, pages 347 351, 2013. [Senthooran et al., 2021] Ilankaikone Senthooran, Matthias Klapperstueck, Gleb Belov, Tobias Czauderna, Kevin Leo, Mark Wallace, Michael Wybrow, and Maria Garcia de la Banda. Human-centred feasibility restoration. In 27th International Conference on Principles and Practice of Constraint Programming (CP 2021). Schloss Dagstuhl Leibniz-Zentrum f ur Informatik, 2021. [Smith, 2004] David E. Smith. Choosing objectives in oversubscription planning. In ICAPS, pages 393 401, 2004. [Sreedharan et al., 2019] Sarath Sreedharan, Siddharth Srivastava, David Smith, and Subbarao Kambhampati. Why can t you do that HAL? explaining unsolvability of planning tasks. In IJCAI, pages 1422 1430, 2019.

Proceedings of the Thirty-First International Joint Conference on Artiﬁcial Intelligence (IJCAI-22)