# classical_planning_with_avoid_conditions__5c70bd2c.pdf

Classical Planning with Avoid Conditions

Marcel Steinmetz1, J org Hoffmann1, Alisa Kovtunova2, Stefan Borgwardt2

1 Saarland University, Saarland Informatics Campus, Saarbr ucken, Germany 2 Institute of Theoretical Computer Science, Technische Universit at Dresden, Germany steinmetz@cs.uni-saarland.de, hoffmann@cs.uni-saarland.de, alisa.kovtunova@tu-dresden.de, stefan.borgwardt@tu-dresden.de

It is often natural in planning to specify conditions that should be avoided, characterizing dangerous or highly undesirable behavior. PDDL3 supports this with temporal-logic state trajectory constraints. Here we focus on the simpler case where the constraint is a non-temporal formula ϕ the avoid condition that must be false throughout the plan. We design techniques tackling such avoid conditions effectively. We show how to learn from search experience which states necessarily lead into ϕ, and we show how to tailor abstractions to recognize that avoiding ϕ will not be possible starting from a given state. We run a large-scale experiment, comparing our techniques against compilation methods and against simple state pruning using ϕ. The results show that our techniques are often superior.

Introduction

It is often natural in planning to specify conditions that should be avoided. Work along these lines has so far focused on temporal-logic formulas that must be true in the state sequence induced by the plan. One prominent early approach used such formulas as control knowledge for effective handtailored planning (Bacchus and Kabanza 2000; Doherty and Kvarnstr om 2001). The PDDL3 language (Gerevini et al. 2009) features temporal formulas (among others) in the role of state trajectory constraints. Work since then has devised compilations back into classical tasks (Edelkamp 2006; Baier and Mc Ilraith 2006; De Giacomo, De Masellis, and Montali 2014; Torres and Baier 2015; Bonassi et al. 2021), compilations into SAT (Mattm uller and Rintanen 2007), and approaches handling soft-goal plan preferences effectively (Baier, Bacchus, and Mc Ilraith 2007, 2009). Here we focus on the special case where the state trajectory constraint is a state formula ϕ that must remain false in all states along the plan. We refer to such constraints as avoid conditions. This special case is relevant, for example, to express dangerous or undesirable situations, such as risky states in a deterministic approximation of a probabilistic planning application. While the avoidance of such situations could in principle be enforced via additional action preconditions, it is typically easier and much more natural to model

Copyright 2022, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

them directly as conditions to avoid. Moreover, avoid conditions can be used as a tool to specify domain-speciﬁc control knowledge, e. g., characterizing states the user knows to be dead ends. The question then becomes how to make use of the provided knowledge in the best possible way. A straightforward way to handle avoid conditions is the compilation into preconditions. However, this incurs a large overhead and is, as our experiments illustrate, often not effective. Our contribution consists in advanced algorithmic methods. Apart from several compilation techniques, we adapt prior work in classical planning to design a method learning from the avoid condition during search, and a method using abstraction, to predict states starting from which ϕ can no longer be avoided. Our learning method is based on traps (Lipovetzky, Muise, and Geffner 2016). Traps are goal-free sets of states that, once entered, cannot be left again. Represented compactly, traps yield an effective method to recognize dead-end states during forward search. By initiating trap reﬁnements from unrecognized dead ends encountered in search, one can incrementally extend the representation as a form of nogood learning (Steinmetz and Hoffmann 2017). We extend the trap deﬁnition to take into account ϕ. We show how these changes ﬁt into the trap learning approach, and introduce an adaption of Trapper (Lipovetzky, Muise, and Geffner 2016) as an alternative trap generation method. Our abstraction method is a form of state abstraction, a wide-spread method used to design heuristic functions in planning (Edelkamp 2001; Helmert et al. 2014; Seipp and Helmert 2018). Abstract state spaces group concrete states s into block states A. Observe that, given such an abstract state space, we know that s A can be pruned if all paths from A to a goal block traverse a block A that implies ϕ, i.e., where s |= ϕ for all s A . In other words, given an abstraction, we can predict that every plan for a state s will necessarily traverse the avoid condition. The question remains how to tailor abstractions for this purpose. To this end, we leverage so-called Cartesian abstractions and their associated counter-example guided abstraction reﬁnement (CEGAR) process (Seipp and Helmert 2013, 2018). We show how one can test A ϕ for Cartesian states. We modify the CEGAR process to incorporate ϕ as an additional source of counter-examples and, therewith, of reﬁnement steps. We run experiments on satisﬁcing planning, optimal plan-

The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

ning, and proving unsolvability, evaluating compilations, traps, and abstraction. We do so on (1) reformulated standard benchmarks that incorporate aspects (more) naturally formulated as avoid conditions; and (2) benchmarks involving road maps, where we systematically impose avoid conditions of the form do not use particular combinations of road segments . The results show that our new methods can be superior, in particular for proving unsolvability. Technical details, including full proofs and detailed benchmark descriptions, are provided in online appendix (Steinmetz et al. 2022).

Preliminaries

We consider classical planning tasks in FDR notation (B ackstr om and Nebel 1995; Helmert 2006). An FDR planning task is a tuple Π = V, A, I, G . V is a set of variables, each v V has a ﬁnite domain Dv. A fact is a variable assignment p = v, d for v V and d Dv. The initial state I is a complete assignment to V. The goal G is a partial assignment to V. For a partial variable assignment P, V(P) V denotes the set of variables v for which P(v) is deﬁned. For two partial variable assignments P1 and P2, we denote by P1[P2] the update of P1 by P2, i.e., P1[P2](v) = P2(v) for v V(P2), and P1[P2](v) = P1(v) for v V(P1) \ V(P2). A is the set of actions. Each action a A has a precondition prea and an effect eﬀa, both partial variable assignments, and a non-negative cost ca R+ 0 . The states S of Π are all complete variable assignments. A state s G is called goal state if s G(v) = G(v) for all v V(G). An action a is applicable in a state s if s(v) = prea(v) for all v V(prea). The application results in the state s Ja K = s[eﬀa]. These deﬁnitions are extended to sequences of actions in a straightforward manner. A sequence of actions π is a plan for s if π is applicable in s and s JπK is a goal state. An optimal plan is a plan with minimal summed up action cost. s is called a dead end if there is no plan for s. A plan for Π is a plan for I. Π is called unsolvable if I is a dead end. We denote Boolean formulae over facts by φ. We consistently use ψ to denote conjunctions of facts. Ψ and Φ denote negation-free formulae in disjunctive normal form (DNF). Partial variable assignments, conjunctions of facts, and sets of facts are used interchangeably. Ψ and Φ are treated like sets of conjunctions. We denote by s |= φ that state s satisﬁes φ. By [φ] S we denote the set of all states that satisfy φ. For action a, Progress(φ, a) denotes the progression of φ by a, and states the condition that holds after applying a to any s [φ]. The computation of Progress(φ, a) for general φ is shown by Rintanen (2008). For a conjunction ψ, Progress(ψ, a) = (ψ prea)[eﬀa] if ψ(v) = prea(v) for all v V(ψ) V(prea), and Progress(ψ, a) = otherwise. An avoid condition ϕ is an arbitrary Boolean formula. A plan a1, . . . , an for Π is called ϕ-compliant if I [ϕ], and it holds for all 1 i n that IJa1, . . . , ai K [ϕ]. An optimal ϕ-compliant plan is a ϕ-compliant plan with minimal action cost. We say that a state s is ϕ-unsolvable if there is no ϕ-compliant plan for s.

Compilations Compiling avoid conditions into the planning task is straightforward in principle, but the na ıve method is not very effective. Furthermore, compilations for temporal plan constraints are well known and we address a special case here. Hence, we evaluate three compilation methods in our experiments. They all operate at the PDDL input level.

Conditions Compilation The ﬁrst compilation ensures ϕ-compliance by conjoining ϕ to the preconditions of all actions and the goal. We denote by Π ϕ the resulting FDR planning task. Trivially, the plans of Π ϕ are the ϕcompliant plans of Π.

LTL Compilation Our second method uses existing tools for compiling temporal formulas into planning tasks (Edelkamp 2006; Baier and Mc Ilraith 2006). The compilation proceeds in two steps: 1) building an automaton representation of the formula, and 2) encoding this automaton into the planning task via additional variables and actions. The LTL formula in our case is G ϕ (always not ϕ), which translates into an automaton with exactly two locations. The initial location is accepting and has a self-loop conditioned by ϕ. The other location is not accepting, and is reached from the initial location if ϕ is satisﬁed. We denote by ΠLTL the result of encoding this automaton into Π. ΠLTL enforces an update of the automaton location in between applications of actions from Π. The automaton blocks as soon as it leaves its accepting state. Discarding the automaton-related actions, the plans of ΠLTL are exactly the ϕ-compliant plans of Π. Moreover, plan optimality is not affected provided that the newly introduced actions have 0 cost.

Axiom Compilation Our last compilation exploits derived predicates, aka axioms (Hoffmann and Edelkamp 2005). Axioms are deﬁned by rules of the form p φp. The fact p must not be affected by any action, i.e., its truth value must be exclusively determined by the axioms. In the simple (non-recursive) form of axioms that we need for our compilation, p is true in a state iff the state satisﬁes one of its associated rule conditions φp. To enforce ϕ with axioms, we introduce a rule (avoid) ϕ, and conjoin (avoid) to the precondition of every action and to the goal. We denote by ΠX the resulting FDR task with axioms.

Avoid Prediction In theory, the compilations induce a blow-up that is polynomial in the original task and ϕ. However, PDDL grounding typically involves formula normalization, which if not done carefully can actually result in an exponential overhead depending on ϕ. We now turn to our advanced techniques that handle ϕ without the detour via compilation. The base algorithm is forward search on Π. ϕ-compliance is assured by pruning all states that satisfy ϕ. We enhance this basic pruning condition by ϕ-predictors, functions u : S 7 {0, } that may identify ϕ-unsolvable states, u(s) = , where ϕ is not satisﬁed directly. Provided that u(s) = indeed only holds if s is ϕ-unsolvable, these states can be pruned in addition without affecting the search s completeness (returning a ϕ-compliant plan if one exists) and optimality (returning an

optimal ϕ-compliant plan) properties. In what follows, we devise two such predictors u from well-known techniques.

Avoid-Prediction via Traps There is an obvious methodological relation between ϕprediction and dead-end detection. Both methods attempt to prove the absence of goal-leading paths. Yet, while the latter requires universal absence, the former makes exceptions according to ϕ. This section shows how to incorporate this exception into traps (Lipovetzky, Muise, and Geffner 2016).

Background: Traps A trap is a set of states T S that (T1) does not contain goal states, and (T2) is closed under transitions. The conditions guarantee that all states in T are dead ends. Lipovetzky, Muise, and Geffner (2016) considered DNF formulae Ψ, for which T = [Ψ] satisﬁes (T1) and (T2) iff it holds for every ψ Ψ that (t1) ψ G , i.e., ψ and G disagree on some variable; and (t2) for every action a, Progress(ψ, a) Ψ, i.e., ψ Progress(ψ, a) for some ψ Ψ or else Progress(ψ, a) = . Trapper uses these syntactic conditions to build Ψ from conjunctions of a ﬁxed size k, called k-trap. Algorithm 1 sketches the procedure, which, starting from all size-k conjunctions, iteratively removes conjunctions from Ψ until (t1) and (t2) are satisﬁed.

Algorithm 1: k-trap computation. Adaptions to ϕtraps are highlighted in bold.

1 Ψ {ψ | |ψ| = k} Φ

2 foreach ψ Ψ s.t. ψ G ψ G Φ do Ψ Ψ\{ψ}

3 while there are ψ Ψ and a A s.t. Progress(ψ, a) Ψ and ψ prea Φ do

4 Ψ Ψ \ {ψ}

k-traps are limited in practice to small k. Steinmetz and Hoffmann (2017) presented a method to construct Ψ without bounding conjunction size. They interleave search s exploration with Ψ reﬁnements. Search starts with the empty trap, Ψ := , and terminates as soon as a goal state is found. Ψ is updated by learning from dead ends ˆS that were explored in search. Such ˆS is identiﬁed whenever all nonˆS successors of the states in ˆS were pruned due to Ψ. The reﬁnement computes a new trap ˆΨ := Ψ W

s ˆS ψs, where ψs s for every s ˆS. This is possible because, as per the identiﬁcation of ˆS, ˆΨ is a trap if ψs = s. Every state newly represented by ˆΨ besides those in ˆS can lead to additional pruning in the remainder of the search. To achieve this generalization, Steinmetz and Hoffmann use the greedy procedure sketched in Algorithm 2, which adds facts from s to ψs only as necessary to prevent the violation of (t1) (lines 1 3) and (t2) (lines 4 5).

Tailoring To Avoid Condition We call a set of states T S a ϕ-trap if (Tϕ1) every goal state in T satisﬁes ϕ, and (Tϕ2) every transition that leaves

Algorithm 2: Computation of ψs for the trap update ˆΨ as discussed in the text. Adaptions to ϕ in bold.

1 foreach s ˆS do

2 select v V(G) s.t. s(v) = G(v)

3 ψs v, s(v)

4 while there are s ˆS and a A s.t. Progress(ψs, a) ˆΨ and ψs prea Φ do

5 ψs ψs v, s(v) for some v V(ψs)

T either originates in a state that satisﬁes ϕ, or goes into one that does. Every state that satisﬁes ϕ is ϕ-unsolvable by deﬁnition. (Tϕ2) additionally ensures that leaving T is possible only by making ϕ true. In summary, it is not possible to reach the goal from any state in T without satisfying ϕ:

Theorem 1. If T is a ϕ-trap, then every state in T is ϕunsolvable.

To operationalize on this notion, we extend the construction methods from above. This is possible, in principle, because deciding whether [Ψ] satisﬁes (Tϕ1) and (Tϕ2) can still be decomposed into individual conditions on each ψ:

Theorem 2. Let Ψ be a DNF formula over facts without negation. [Ψ] is a ϕ-trap if it holds for all ψ Ψ that (tϕ1) (ψ G) ϕ, and (tϕ2) it holds for all a A that Progress(ψ ϕ, a) (Ψ ϕ).

The proof is straightforward and provided in the appendix. Unfortunately, the appearance of ϕ signiﬁcantly increases the complexity of verifying the conditions. Both conditions involve deciding propositional formula tautology, which without assumptions on ϕ is co NP-complete. Testing (tϕ1) and (tϕ2) could be cast into appropriate calls to an off-the-shelf SAT solver. This is however bound to generate a large overhead, given the overall number of such tests required. Instead, we reformulate (tϕ1) and (tϕ2) into simple trap membership tests. Consider the DNF transformation Φ of ϕ. Plugging Φ into (tϕ1) gives (ψ G) Φ, which boils down to ﬁnding a member ψ Φ with ψ (ψ G), as above. To reformulate (tϕ2), we split it into two cases: ϕ must be true before or it must be true after the application of a. Formally, (tϕ2a) Progress(ψ, a) (Ψ ϕ) or (tϕ2b) (ψ prea) ϕ. (tϕ2a) is clearly a sufﬁcient condition of (tϕ2). Moreover, if (tϕ2b) is satisﬁed, then Progress(ψ ϕ, a) = , i. e., (tϕ2) is also satisﬁed. By replacing ϕ by Φ, both conditions can again be veriﬁed via trap membership tests. We need to insert two remarks. (tϕ2a) and (tϕ2b) are sufﬁcient but not necessary, as removing ϕ entirely from the progression comes with a loss of information about a s application context that can be necessary to conclude (tϕ2). Secondly, the DNF transformation may come at the cost of an exponential blow-up in formula size. Nevertheless, the Φ approach offers another advantage: it straightforwardly integrates into the trap construction methods from before. The bold parts in Algorithm 1 and 2 show the necessary changes. In both methods, we make sure that Φ is included in Ψ at all times. This is

possible because [Φ] itself trivially constitutes a ϕ-trap. Since Progress(ψ, a) (Ψ Φ) then becomes equivalent to Progress(ψ, a) Ψ, (tϕ2a) is readily handled by both algorithms. (tϕ2b) maps into additional loop conditions. (tϕ1) replaces (t1) in the k-trap construction procedure. This is not necessary in Algorithm 2, because ˆS can never contain goal states, i. e., the initialization of ψs can be done as before. Consider the adapted Algorithm 1. (tϕ1) and (tϕ2) will be satisﬁed upon termination. Given Theorem 2, hence Theorem 3. If Ψ is is a k-ϕ-trap, then [Ψ] is a ϕ-trap. Algorithm 2 also guarantees that (tϕ1) and (tϕ2) hold upon termination. Yet, termination here is guaranteed only for ˆS such that ([Ψ] ˆS) remains a ϕ-trap. That this is always the case can be shown via the exact same arguments as already given by Steinmetz and Hoffmann (2017). Theorem 4. Every reﬁnement of the ϕ-trap learning procedure terminates with a ϕ-trap ˆΨ.

Abstraction for Avoid-Prediction We recall Cartesian abstractions and show how to tailor them to the identiﬁcation of ϕ-unsolvable states.

Background: Cartesian Abstractions An abstraction for Π is an equivalence relation between the states S. The abstract states S of are given by its equivalence classes. For state s, we denote by [s] the equivalence class that contains s, and omit if it is clear from the context. The abstract state space associated with is the transition system Θ = S , T , s I , S G with abstract initial state s I = [I] and abstract goal states S G = {[s] | s S, G s}. The abstract transitions are given by T = { [s], a, [s Ja K] | s S, a A applicable in s}. Let the variables of Π be V = {v1, . . . , v N}. Cartesian abstractions (Seipp and Helmert 2018) are abstractions whose abstract states are of the form A1 A2 AN, where Ai Dvi for all i. This structure makes Cartesian abstractions particularly suitable for a counter-example guided abstraction reﬁnement loop (CEGAR): The construction starts with the trivial abstraction that contains just a single abstract state. One then iteratively splits an abstract state into two until the abstraction provides enough information, or some size limit is reached. Each reﬁnement step starts with the extraction of an abstract solution, i.e., an abstract path [s0], a1, [s1], . . . , an, [sn] from the abstract initial state [s0] = s I to some abstract goal state [sn] S G . If no such path exists, then Π must be unsolvable, and the reﬁnement terminates. Otherwise, the corresponding concrete path s0, a1, s1, a2, . . . is computed by applying the actions successively, starting from s0 = I. The computation is stopped when one of the following conditions is satisﬁed:

(C1) Action ai is not applicable in si 1.

(C2) Concrete and abstract state do not match: [si] = [si]. (C3) sn does not satisfy the goal.

If not stopped, we have found a plan for Π and the reﬁnement terminates. Otherwise, the violated condition is used to

split an abstract state, guaranteeing that the same error cannot occur in future iterations ( denotes disjoint set union): (C1) [si 1] is split into [t1] [t2] such that si 1 [t2] and [t2] has no abstract transition via ai. (C2) [si 1] is split into [t1] [t2] such that si 1 [t2] and [t2] no longer has an abstract transition to [si] via ai. (C3) [sn] is split into [t1] [t2] such that sn [t2] and [t2] is no longer an abstract goal state. The selection of [t1] and [t2] is done via simple syntactic checks. During the entire construction, a full representation of the abstract state space is maintained. After each split, this representation is updated by rewiring transitions to [t1] and [t2]. For full details, we refer to the work by Seipp and Helmert (2018). Once the abstract state space has been updated, a new abstract solution is extracted, and the whole process starts anew.

Tailoring to Avoid Conditions An abstract state [s] implies ϕ, written [s] ϕ, if s |= ϕ holds for all represented concrete states s [s]. Since the abstract state space is path-preserving, its analysis with respect to this property yields information for ϕ-prediction: Theorem 5. Let [t] be any abstract state. If every path from [t] to any abstract goal state visits some [s] s.t. [s] ϕ, then every state represented by [t] is ϕ-unsolvable. Intuitively, the ϕ-unsolvable abstract states are exactly the abstract dead ends after pruning all [s] ϕ. We next show how to use this observation in Cartesian abstractions.

Implication Test Unfortunately, deciding whether [s] ϕ for Cartesian abstractions is co NP-hard in general. Suppose all variables are Boolean. Then, for the full Cartesian product [s] = { , }N, [s] ϕ holds exactly if ϕ is a tautology, deciding which is known to be co NP-complete. Yet despite the worst-case complexity, the implication check was usually not the bottleneck in our experiments. Our implementation runs a simple backtracking search for a state t [s] such that t |= ϕ, as depicted in Algorithm 3. We assume a positive DNF representation Φ of ϕ. This allows us to easily identify situations where branching is not required (line 3).

Algorithm 3: Contains(i, Φ): Tests whether t [s] = A1 AN s.t. t |= Φ. Initially, i = 0.

1 if Φ = then return true

2 if i = N + 1 then return false

3 if di Ai: vi, di ψ for all ψ Φ then

4 return Contains(i + 1, {ψ Φ | vi V(ψ)})

6 foreach di Ai do

7 if Contains(i + 1, {ψ Φ | vi V(ψ), or ψ(vi) = di} then return true

8 return false

CEGAR We propose two CEGAR variants that incorporate ϕ. Abstract goal paths are generally restricted such that

[si] ϕ holds at all times. If such a path does not exist, then I must be ϕ-unsolvable, and we can terminate. Otherwise the reﬁnement proceeds as follows.

Integrated: The ﬁrst variant introduces an additional error condition into the original analysis procedure:

(C4) The concrete state si satisﬁes ϕ.

Assume (C4) is satisﬁed. Since [si] ϕ by assumption, [si] must hence contain states that satisfy ϕ as well as ones that do not. We split [si] such that [si] ϕ holds after the reﬁnement. In line with the previous error conditions, this is sufﬁcient to ensure that the same concrete path cannot be subject to any future reﬁnement iteration. Concretely, [si] is split into abstract states [t1] [tk] such that [tk] ϕ, and si [tk]. Contrary to the original error conditions, a split into exactly two abstract states (k = 2) is not possible in general. To illustrate this, let x and y be two binary variables, and ϕ = (x = 1 y = 1), and consider the abstract state [si] = ({0, 1} {0, 1}). [tk] ϕ can only be satisﬁed for [tk] = ({1} {1}). However, every possible split of [si] into this [tk] requires k 3 abstract states. We use the following procedure to ﬁnd [t1] [tk]. We start with [t] = [si] and j = 1. We continue to split [t] into abstract states [tj] and [t ] with si [t ] until [t] ϕ is satisﬁed. [t ] replaces [t] in the next round, j + 1. Let v be any variable whose value set A in [t] is not a singleton. Such a variable must exist. Otherwise [t] = {si}, therefore [t] ϕ as per assumption (C4), and we would have terminated the reﬁnement already. Given such v, [t] is split by dividing A into (A \ {si(v)}) and {si(v)} respectively. This method guarantees that [t] will eventually only contain si. Termination with the desired result hence follows. Since we kept splitting abstract states into pairs, the abstract state space can be updated via the same efﬁcient methods as before.

Detached: To prioritize reﬁnements based on ϕ, our second CEGAR variant checks, prior to the original analysis steps, whether any abstract state [si] along the considered abstract goal path contains a concrete state s |= ϕ. If so, then [si] is split following the procedure just described. Afterwards, we directly proceed to the next reﬁnement iteration, skipping the original conditions altogether. Since the ϕ error condition is tested here before constructing the concrete path, the state s [si] with s |= ϕ must be searched actively. This is computationally more expensive than the simple check in (C4). To ﬁnd s, we follow a backtracking search similar to the one shown in Algorithm 3. We close this section with the remark that the original conditions (C1) (C3) still play a central role for identifying ϕ-unsolvable states. Consider the example in Figure 1. As the (spurious) path [s0], a3, [s1] shows, paths in the abstraction can simply bypass ϕ even if the concrete paths cannot. Note that this abstract path violates (C1). The corresponding reﬁnement will split [s0] by dividing the values of x into {0} and {1}. After the reﬁnement, every abstract goal path from [s0] needs to go through [s2]. Hence, since [s2] ϕ, via the reﬁnement due to (C1), the abstraction becomes able to prove that no ϕ-compliant plan exists.

[s0]: {0, 1} {0} {0} [s1]: {0, 1} {0} {1}

[s2]: {0, 1} {1} {0} [s3]: {0, 1} {1} {1}

Figure 1: Example abstract state space. The planning task consists of binary variables x, y, z, initially all 0, goal z=1, and three actions with pre/eﬀ: (a1) y=0/y=1; (a2) y=1/x=1; and (a3) x=1/z=1. The abstract states are depicted in terms of Ax Ay Az. The avoid condition is ϕ = (y=1). Abstract states that imply ϕ have dashed borders. Goal states have double borders.

Experiments

We implemented all described methods in Fast Downward (FD) (Helmert 2006). The avoid condition is speciﬁed as an additional input ﬁle in the full PDDL condition syntax. The compilations and DNF transformations are implemented as part of FD s translator component. Source code and benchmarks are available online1. The experiments were run on machines with Intel Xeon E5-2660 @ 2.20GHz CPUs, and 30 minutes time and 4 GB memory cutoffs. We conducted experiments in optimal and satisﬁcing planning, as well as proving unsolvability. For each category, we chose a canonical base planner conﬁguration: optimal planning via A search with LM-cut (Helmert and Domshlak 2009); satisﬁcing planning via greedy best-ﬁrst search with two open lists and preferred operators using h FF (Hoffmann and Nebel 2001); and proving unsolvability via depth-ﬁrst search with hmax (Haslum and Geffner 2000) for dead-end detection. We extended these base conﬁgurations by the following ϕ-predictors: prune-ϕ no prediction, only prune by ϕ; k-trap k-ϕ-traps; ˆS-trap ϕ-trap learning; a Ori Cartesian abstraction constructed via the original CEGAR approach; a Int our integrated CEGAR variant; respectively a Det the detached variant. For the k-ϕ-traps, we experimented with k {2, 3, 4, 5}. To terminate CEGAR, we enforced an upper limit N on the number of abstract states, N {25k, 50k, 100k, 150k, 200k, 300k}. In addition, we ran a ϕ-trap learning variant ˆS-k-trap that uses the k-ϕ-trap with k = 2 as kick-start. For the compilations Π ϕ and ΠLTL, we also considered traps and Cartesian abstractions for pruning dead-ends (not for ΠX as neither of them supports axioms).

Benchmark Design

Benchmarks with avoid conditions already appeared in IPC 2006 (Dimopoulos et al. 2006), encoded via state trajectory constraints. But hard constraints appeared only in benchmarks of the temporal track, which makes them unsuited for our experiments. Instead we created a new benchmark set, including solvable as well as unsolvable instances. By synthetically increasing the impact of ϕ, the unsolvable part pinpoints and evaluates the capabilities of the different prediction methods to generalize from ϕ. We design two cate-

1https://doi.org/10.5281/zenodo.6338021

COVERAGE SEARCH REDUCTION FACTORS Compil.

ϕ-Prediction ϕ-Prediction vs. prune-ϕ

k-trap ˆS-trap a Ori a Int a Det ˆS-k-trap a Ori a Int Domain # 2 3 k2 25k 100k 25k 100k 25k 100k k=2 100k 100k Satisﬁcing Cave Diving-REDONE 20 0 7 7 7 7 7 6 6 7 7 7 7 0 0 1.4 1.5 1.6 3.1 1.0 1.1 Fridge-REDONE 24 1 6 20 21 21 11 21 21 21 21 21 21 13 11 1 1 1 1 1 1 Miconic-REDONE 178 0 25 20 117 117 117 120 120 117 117 111 111 72 67 2.5 415 1 1 1.3 145 Nurikabe-REDONE 20 0 2 12 11 11 8 11 11 9 9 7 6 4 0 2.1 86.1 1 1 1.0 1.1 Openstacks-REDONE 60 0 4 12 13 13 13 13 13 13 13 13 13 8 7 1 1 1 1 1 1 Trucks-REDONE 30 0 7 8 18 18 18 18 18 18 18 18 18 15 15 1.2 1.7 1.1 2.2 1.0 1.2 Driverlog-ROAD 21 8 4 9 11 11 11 11 11 11 11 11 11 6 5 1.0 1.3 1 1 1.0 1.0 Rovers-ROAD 64 4 2 7 12 14 17 12 14 12 12 12 12 12 12 3.8 310.3 1 1 1.4 75.0 TPP-ROAD 40 4 4 8 9 9 7 10 10 9 8 10 9 9 8 2.4 258.2 1 1 1.3 9.9 Transport-ROAD 116 26 32 49 52 58 64 47 55 50 50 52 52 23 14 3.5 76.0K 1 1 1.3 6.2 P 573 43 93 152 271 279 273 269 279 267 266 262 260 162 139 2.2 76.0K 1.0 3.1 1.2 145 Optimal Cave Diving-REDONE 20 0 7 7 7 7 7 7 7 7 7 7 0 0 1.0 1.0 1.3 2.0 1.0 1.0 Fridge-REDONE 24 1 6 10 10 10 10 10 10 10 10 10 2 2 1 1 1 1 1.0 1.0 Miconic-REDONE 178 0 25 68 68 68 70 70 62 61 62 62 28 23 1.9 415 1 1 1.3 145 Nurikabe-REDONE 20 0 2 10 10 8 10 10 9 9 7 6 3 0 1.4 2.1 1 1 1.0 1.3 Openstacks-REDONE 60 0 4 25 25 25 25 25 25 24 25 25 20 19 1 1 1 1 1 1 Trucks-REDONE 30 0 10 11 11 11 11 11 11 11 11 11 8 8 1.1 1.2 1.0 1.1 1.0 1.1 Driverlog-ROAD 21 7 12 13 13 13 13 13 13 13 13 13 8 7 1.1 1.5 1 1 1.0 1.0 Rovers-ROAD 64 3 3 3 3 3 3 3 3 3 3 3 3 3 1.5 2.5 1 1 1.2 1.4 TPP-ROAD 40 0 0 0 0 0 0 0 0 0 0 0 0 0 Transport-ROAD 116 23 41 45 49 49 45 49 45 45 46 46 16 7 2.0 1.4K 1 1 1.2 3.2 P 573 34 110 192 196 194 194 198 185 183 184 183 88 69 1.5 1.4K 1.0 2.0 1.2 145 Unsolvability Cave Diving-REDONE 20 0 4 7 7 20 16 7 20 7 7 7 7 16 16 17.4K 29.5K 1.1 1.2 17.4K 29.5K Miconic-REDONE 28 0 0 0 18 19 18 16 16 18 18 16 16 16 16 10.0 15.4 1 1 15.0 46.6K Driverlog-ROAD 22 8 4 8 8 9 12 5 8 8 8 8 8 8 8 0.45M 36.4M 1 1 52.2K 36.4M Rovers-ROAD 64 0 0 0 0 16 22 14 21 0 0 2 2 1 1 TPP-ROAD 40 4 0 4 4 8 8 6 10 4 4 5 5 4 4 3.2K 0.45M 1 1 0.10M 0.45M Transport-ROAD 116 25 32 46 46 53 61 59 64 46 45 48 46 55 56 65.7 28.7M 1 1 35.0 46.8K P 290 37 40 65 83 125 137 107 139 83 82 86 84 100 101 215.9 36.4M 1.0 1.2 154.4 36.4M

Table 1: Left half: coverage results, best in bold. Results for the compilations are shown for the base conﬁgurations only. The conﬁguration names are described in the text. Right half: ratio of states visited by prune-ϕ versus states visited with a ϕpredictor on top (K for thousand, M for million). Larger values indicate more pruning. For each method per-domain geometric mean (left) and maximum values (right) are shown. Values between different conﬁgurations are not directly comparable.

gories of benchmarks.

REDONE. Several well-known benchmarks feature avoid conditions, not modeled explicitly but instead encoded into complex precondition and/or effect-condition formulas. We have identiﬁed 6 such domains, and manually separated the avoid condition from the actions descriptions in an equivalence-preserving manner. In summary, we use Cave Diving (IPC14), hiring a diver may preclude hiring other divers; Fridge, constraints on fridge components; Miconic, elevator moves are restricted by constraints on boarded passengers of numerous kinds; Nurikabe (IPC18), illegal groupings of board cells; Openstacks (IPC08), production and delivery must follow a particular order; Trucks (IPC06), relationship between the occupancy of and legal accesses to truck storage areas. An explicit avoid condition is a natural model for all of these, and partly actually more natural than the original PDDL. Additionally to the exist-

ing (solvable) instances, we created unsolvable instances for Cave Diving by introducing a cycle in the divers preclude relationships2; and solvable and unsolvable Miconic instances in which passengers must be served one at a time to not eventually violate ϕ. In the remaining domains, creating instances unsolvable due to ϕ is either difﬁcult (Nurikabe), or not possible at all (Fridge, Openstacks, Trucks).

ROAD. Our second category encompasses a set of controlled benchmarks, in which the avoid condition is generated in a systematic fashion. We adapted standard transportation-like benchmarks without dead ends: Driverlog, Rovers, TPP, and Transport. We add avoid conditions that forbid the usage of certain combinations of con-

2The Cave Diving instances of the UIPC instead make restrictions to the available resources. Diver preclude relationships are not used at all.

nections in the road-map graphs. The avoid conditions are computed individually per base instance. The computation is parameterized in the size n of combinations to be added to ϕ. Given a base instance, ϕ is built by: 1) computing a ϕ-compliant plan (for the current ϕ), and 2) extending ϕ by a random selection of n connections from this plan. This is repeated until no ϕ-compliant plan is left. The result is added to the unsolvable part. To obtain a solvable instance, we drop the last entry of ϕ, skipping an instance if this would make the condition empty. The ϕ-compliant plans were computed by domain-speciﬁc solvers. We considered n = 2 and n = 3, and aborted the generation of the conditions for an instance after 10 minutes. Note that the avoid conditions here are DNF formulas.

Results for Compilations Consider Table 1. For the compilations, the results in the different categories (satisﬁcing, optimal, and unsolvability) are qualitatively similar. Using additional dead-end detectors on top of the compilations turned out to be detrimental in all cases, so we omit these results. Both Π ϕ and ΠLTL cause a signiﬁcant overhead in grounding for almost all domains. This was to be expected for the ROAD part, as grounding in both compilations requires the conversion of the CNF ϕ back into DNF, which with the standard FD translator method is exponential in the size of ϕ. That said, the results are not much better on the REDONE benchmarks either. This is because, after the elimination of existential quantiﬁers, the avoid conditions there turn into big disjunctions too. The results for Π ϕ are significantly worse than for ΠLTL because the former needs to do the DNF conversion for every action, while the automaton construction in ΠLTL requires this only once. The axiom compilation ΠX was designed to avoid these problems (ΠX is missing in the optimal part since axioms are not supported by the optimal planner conﬁguration). Grounding was indeed much less of an issue, with the exception of Miconic, whose complex avoid condition caused problems to FD s axiom normalization. Nevertheless, planning performance does not beneﬁt from having the avoid condition encoded directly in the model. ΠX is dominated almost universally by prune-ϕ.

Results for ϕ-Prediction For the ϕ-prediction methods, Table 1 also shows search space size reduction statistics. For space reasons, we included a subset of the conﬁgurations only. Additional details are available in the appendix. The k-trap construction for k 4 was often too expensive, causing a signiﬁcant drop in performance in all domains but Transport. Differences between the abstraction conﬁgurations for N 150k diminishes, as the reﬁnements for 150k already exceeded the resource limits in many cases. The results in the optimal and satisﬁcing parts are similar. However, differences between the ϕ-predictors tend to be larger in the satisﬁcing part, where the base planner conﬁguration is able to solve more instances. In general, the impact of ϕ-prediction varies between the different domains. It turns out that Fridge, Openstacks, and Trucks actually do not

contain ϕ-unsolvable states besides ones that already satisfy ϕ. In these domains, ϕ-prediction becomes pointless, while still adding an overhead. Search could be reduced in Trucks due to dead ends, which can be identiﬁed by the ϕ-predictors as a byproduct. Overall, the performance of the ϕ-prediction conﬁgurations was worse than the baseline prune-ϕ only if too many resources were dedicated to the ϕ-predictor construction. In particular, the smallest k-trap conﬁguration performed as well, or better than prune-ϕ in all domains. Of the remaining three REDONE domains, ϕ-prediction could increase coverage in just Miconic, yet search reduction can be observed in all three. By design, reasoning over ϕ is central in the ROAD benchmarks and in the unsolvability part. Here, ϕ-prediction turned out advantageous throughout, and improvements over the prune-ϕ conﬁguration were largest. a Det performs signiﬁcantly worse than the other two abstraction variants in the solvable part, but has an edge for proving unsolvability. a Det is more prone to generating unreachable abstract states, which resulted in longer abstraction construction times. Moreover, in the solvable part, a Ori and a Int sometimes found concrete solutions, and therefore could terminate the reﬁnement before the state limit was reached (the former more so than the latter, the former not checking ϕ-compliance for the reﬁnements). This did not happen in a Det at all. Vice versa, a Det was slightly better in proving the initial state ϕ-unsolvable, causing early termination there. Taking into account ϕ for the reﬁnement has proved necessary for ϕ-prediction. a Ori could identify additional ϕ-unsolvable states in (almost) no domain, while a Int achieved notable search reduction in all domains, with the aforementioned exceptions. The ϕ implication checks can slow down the abstraction construction though. This was a particular issue in Miconic and Nurikabe. In the solvable part, the impact of ˆS-trap was limited, only few sets ˆS could be identiﬁed to start trap learning. k-trap was able to identify additional ϕ-unsolvable states almost throughout. Both construction methods showed different strengths in different domains. This has been exploited effectively by ˆS-k-trap, even surpassing the performance of the individual methods in some domains. Compared to the abstractions, the trap conﬁgurations offered a better tradeoff between ϕ-prediction and overhead.

State trajectory constraints are a natural modeling construct in planning, and have so far been considered mostly in temporal form. Here we consider the non-temporal special case of avoid conditions ϕ that must be false throughout the plan. We have designed methods predicting states unsolvable due to ϕ, and our experiments show that they can pay off. While our benchmarks are mostly designed having in mind a human modeler who speciﬁes the avoid condition, an interesting avenue for future research is to instead leverage this modeling construct to connect to ofﬂine domain analyses. Under-approximations of unsafe or dangerous regions of states naturally form avoid conditions. It may then make sense to consider non-deterministic or probabilistic planning, and to directly handle BDD representations of ϕ.

Acknowledgments

This work was funded by DFG Grant 389792660 as part of TRR 248 (CPEC, https://perspicuous-computing.science).

Bacchus, F.; and Kabanza, F. 2000. Using temporal logics to express search control knowledge for planning. Artiﬁcial Intelligence, 116(1 2): 123 191. B ackstr om, C.; and Nebel, B. 1995. Complexity Results for SAS+ Planning. Computational Intelligence, 11(4): 625 655. Baier, J. A.; Bacchus, F.; and Mc Ilraith, S. A. 2007. A Heuristic Search Approach to Planning with Temporally Extended Preferences. In Veloso, M. M., ed., Proceedings of the 20th International Joint Conference on Artiﬁcial Intelligence (IJCAI 07), 1808 1815. Hyderabad, India: Morgan Kaufmann. Baier, J. A.; Bacchus, F.; and Mc Ilraith, S. A. 2009. A heuristic search approach to planning with temporally extended preferences. Artiﬁcial Intelligence, 173(5-6): 593 618. Baier, J. A.; and Mc Ilraith, S. A. 2006. Planning with ﬁrstorder temporally extended goals using heuristic search. In Gil, Y.; and Mooney, R. J., eds., Proceedings of the 21st National Conference of the American Association for Artiﬁcial Intelligence (AAAI 06), 788 795. Boston, Massachusetts, USA: AAAI Press. Bonassi, L.; Gerevini, A. E.; Percassi, F.; and Scala, E. 2021. On Planning with Qualitative State-Trajectory Constraints in PDDL3 by Compiling them Away. In Biundo, S.; Do, M.; Goldman, R.; Katz, M.; Yang, Q.; and Zhuo, H. H., eds., Proceedings of the Thirty-First International Conference on Automated Planning and Scheduling, ICAPS 2021, Guangzhou, China (virtual), August 2-13, 2021, 46 50. AAAI Press. De Giacomo, G.; De Masellis, R.; and Montali, M. 2014. Reasoning on LTL on ﬁnite traces: Insensitivity to inﬁniteness. In Brodley, C. E.; and Stone, P., eds., Proceedings of the 28th AAAI Conference on Artiﬁcial Intelligence (AAAI 14), 1027 1033. Austin, Texas, USA: AAAI Press. Dimopoulos, Y.; Gerevini, A.; Haslum, P.; and Saetti, A. 2006. The benchmark domains of the deterministic part of IPC-5. In IPC 2006 planner abstracts, 14 19. Doherty, P.; and Kvarnstr om, J. 2001. TALplanner: A Temporal Logic Based Planner. The AI Magazine, 22(3): 95 102. Edelkamp, S. 2001. Planning with Pattern Databases. In Cesta, A.; and Borrajo, D., eds., Proceedings of the 6th European Conference on Planning (ECP 01), 13 24. Springer Verlag. Edelkamp, S. 2006. On the Compilation of Plan Constraints and Preferences. In Long, D.; and Smith, S., eds., Proceedings of the 16th International Conference on Automated Planning and Scheduling (ICAPS 06), 374 377. Ambleside, UK: AAAI Press.

Gerevini, A.; Haslum, P.; Long, D.; Saetti, A.; and Dimopoulos, Y. 2009. Deterministic planning in the ﬁfth international planning competition: PDDL3 and experimental evaluation of the planners. Artiﬁcial Intelligence, 173(5-6): 619 668. Haslum, P.; and Geffner, H. 2000. Admissible Heuristics for Optimal Planning. In Chien, S.; Kambhampati, R.; and Knoblock, C., eds., Proceedings of the 5th International Conference on Artiﬁcial Intelligence Planning Systems (AIPS 00), 140 149. Breckenridge, CO: AAAI Press, Menlo Park. Helmert, M. 2006. The Fast Downward Planning System. Journal of Artiﬁcial Intelligence Research, 26: 191 246. Helmert, M.; and Domshlak, C. 2009. Landmarks, Critical Paths and Abstractions: What s the Difference Anyway? In Gerevini, A.; Howe, A.; Cesta, A.; and Refanidis, I., eds., Proceedings of the 19th International Conference on Automated Planning and Scheduling (ICAPS 09), 162 169. AAAI Press. Helmert, M.; Haslum, P.; Hoffmann, J.; and Nissim, R. 2014. Merge & Shrink Abstraction: A Method for Generating Lower Bounds in Factored State Spaces. Journal of the Association for Computing Machinery, 61(3): 16:1 16:63. Hoffmann, J.; and Edelkamp, S. 2005. The Deterministic Part of IPC-4: An Overview. Journal of Artiﬁcial Intelligence Research, 24: 519 579. Hoffmann, J.; and Nebel, B. 2001. The FF Planning System: Fast Plan Generation Through Heuristic Search. Journal of Artiﬁcial Intelligence Research, 14: 253 302. Lipovetzky, N.; Muise, C. J.; and Geffner, H. 2016. Traps, Invariants, and Dead-Ends. In Coles, A.; Coles, A.; Edelkamp, S.; Magazzeni, D.; and Sanner, S., eds., Proceedings of the 26th International Conference on Automated Planning and Scheduling (ICAPS 16), 211 215. AAAI Press. Mattm uller, R.; and Rintanen, J. 2007. Planning for Temporally Extended Goals as Propositional Satisﬁability. In Veloso, M. M., ed., IJCAI 2007, Proceedings of the 20th International Joint Conference on Artiﬁcial Intelligence, Hyderabad, India, January 6-12, 2007, 1966. Rintanen, J. 2008. Regression for Classical and Nondeterministic Planning. In Ghallab, M., ed., Proceedings of the 18th European Conference on Artiﬁcial Intelligence (ECAI 08), 568 572. Patras, Greece: Wiley. Seipp, J.; and Helmert, M. 2013. Counterexample-guided Cartesian Abstraction Reﬁnement. In Borrajo, D.; Fratini, S.; Kambhampati, S.; and Oddi, A., eds., Proceedings of the 23rd International Conference on Automated Planning and Scheduling (ICAPS 13), 347 351. Rome, Italy: AAAI Press. Seipp, J.; and Helmert, M. 2018. Counterexample-Guided Cartesian Abstraction Reﬁnement for Classical Planning. Journal of Artiﬁcial Intelligence Research, 62: 535 577. Steinmetz, M.; and Hoffmann, J. 2017. Search and Learn: On Dead-End Detectors, the Traps they Set, and Trap Learning. In Sierra, C., ed., Proceedings of the 26th International Joint Conference on Artiﬁcial Intelligence (IJCAI 17), 4398 4404. AAAI Press/IJCAI.

Steinmetz, M.; Hoffmann, J.; Kovtunova, A.; and Borgwardt, S. 2022. Classical Planning with Avoid Conditions: Technical Report. https://fai.cs.uni-saarland.de/steinmetz/ aaai22 tr.pdf. Torres, J.; and Baier, J. A. 2015. Polynomial-time reformulations of LTL temporally extended goals into ﬁnal-state goals. In Yang, Q., ed., Proceedings of the 24th International Joint Conference on Artiﬁcial Intelligence (IJCAI 15), 1696 1703. AAAI Press/IJCAI.