Simulation-Based Admissible Dominance Pruning

Álvaro Torralba and Jörg Hoffmann
Saarland University, Saarbrücken, Germany
torralba@cs.uni-saarland.de, hoffmann@cs.uni-saarland.de

Abstract

In optimal planning as heuristic search, admissible pruning techniques are paramount. One idea is dominance pruning, identifying states "better than" other states. Prior approaches are limited to simple dominance notions, like "more STRIPS facts true" or "higher resource supply". We apply simulation, well-known in model checking, to compute much more general dominance relations based on comparing transition behavior across states. We do so effectively by expressing state-space simulations through the composition of simulations on orthogonal projections. We show how simulation can be made more powerful by intertwining it with a notion of label dominance. Our experiments show substantial improvements across several IPC benchmark domains.

1 Introduction

Heuristic search is the predominant approach to cost-optimal planning. But the number of states that must be explored to prove optimality often grows exponentially even when using extremely well-informed heuristics [Helmert and Röger, 2008]. Therefore, recent years have seen substantial effort devoted to identifying and exploiting structure that allows pruning redundant parts of the state space. Known techniques of this kind pertain to symmetries (e.g., [Fox and Long, 1999; 2002; Domshlak et al., 2012]), partial-order reduction based methods like expansion-core [Chen and Yao, 2009] or strong stubborn sets [Valmari, 1989; Wehrle and Helmert, 2012; Wehrle et al., 2013; Wehrle and Helmert, 2014], and dominance pruning [Hall et al., 2013]. We follow up on the latter here.

Dominance pruning is based on identifying states "better than" other states. For example, consider a Logistics task where one truck must carry several packages to location G. Consider the position of any one package p. All other state variables having equal values, the best is to have p at G, and it is better for p to be in the truck than at any location other than G. We refer to this kind of relation between states as a dominance relation. (Hall et al. use the term "partial-order", which we change here to avoid ambiguity with, e.g., partial-order reduction.) Two main questions need to be answered: (1) How to discover the dominance relation? (2) How to simplify the search (and/or the planning task) given a dominance relation?

Hall et al. answered (2) in terms of an admissible pruning method, pruning state s if a dominating state t, with an at-most-as-costly path, has already been seen. We follow that idea here, contributing a BDD implementation. Our main contribution regards (1). Hall et al. use dominance relations characterized by consumed resources: state t dominates state s if s and t are identical except that t(r) ≥ s(r) for all resources r.[1] Herein, we instead find the dominance relation through simulation, used in model checking mainly to compare different system models [Milner, 1971; Gentilini et al., 2003]. A simulation is a relation ⪯ on states where, whenever s ⪯ t, for every transition s → s' there exists a transition t → t' using the same action, such that s' ⪯ t'. In words, t simulates s if "anything we can do in s, we can also do in t, leading to a simulating state". (For the reader familiar with the use of bisimulation in merge-and-shrink [Helmert et al., 2014]: simulation is one half of bisimulation.) A simulation clearly qualifies as a dominance relation.
But how to find a simulation on the state space? We employ a compositional approach, obtaining our simulation relation on the state space from simulation relations on orthogonal projections, i.e., projections whose variable subsets do not overlap. We enhance simulation with a concept of dominance between labels (actions), in addition to dominance between states. In our Logistics example above, e.g., for each package this detects the described relation (being at G is better than being in the truck, which is better than being at any location other than G). This yields a very strong dominance relation that allows us to ignore any state in which a package is unnecessarily unloaded at an irrelevant location. Empirically, we find that our pruning method indeed often substantially reduces the number of expanded nodes. For space reasons, we omit some proofs. Full proofs, and more examples, will be made available in a TR.

[1] Precisely, Hall et al. consider numeric state variables r and analyze whether higher r is always good, or is always bad, or neither. In Metric-FF's [Hoffmann, 2003] linear normal form, this is equivalent to the formulation above. Hall et al. also handle STRIPS facts, as variables with domain {0, 1}. But, there, their notions trivialize to "t dominates s if t ⊇ s".

2 Background

A planning task is a 4-tuple Π = (V, A, I, G). V is a finite set of variables v, each v ∈ V being associated with a finite domain D_v. A partial state over V is a function s on a subset V(s) of V, so that s(v) ∈ D_v for all v ∈ V(s); s is a state if V(s) = V. The initial state I is a state. The goal G is a partial state. A is a finite set of actions, each a ∈ A being a pair (pre_a, eff_a) of partial states, called its precondition and effect. Each a ∈ A is also associated with its non-negative cost c(a) ∈ ℝ₀⁺.

A labeled transition system (LTS) is a tuple Θ = (S, L, T, s_0, S_G) where S is a finite set of states, L is a finite set of labels each associated with a label cost c(l) ∈ ℝ₀⁺, T ⊆ S × L × S is a set of transitions, s_0 ∈ S is the start state, and S_G ⊆ S is the set of goal states. The state space of a planning task Π is the LTS Θ_Π where: S is the set of all states; s_0 is the initial state I of Π; s ∈ S_G iff G ⊆ s; the labels L are the actions A, and s →^a s' is a transition in T if s complies with pre_a, and s'(v) = eff_a(v) for v ∈ V(eff_a) while s'(v) = s(v) for v ∈ V \ V(eff_a). A plan for a state s is a path from s to any s_G ∈ S_G. The cost of a cheapest plan for s is denoted h*(s). A plan for s_0 is a plan for Π, and is optimal iff its cost equals h*(s_0). As defined by Wehrle and Helmert [2014], a plan for s is strongly optimal if its number of 0-cost actions is minimal among all optimal plans for s. We denote by h*_0(s) the number of 0-cost actions in a strongly optimal plan for s.

Abstractions and abstract state spaces are quite common in planning (e.g., [Helmert et al., 2007]). We build on this work, but only indirectly. We use merge-and-shrink abstractions as the basis from which our simulation process starts. That process itself will be described in a generic form not relying on these specific constructs. Hence, in what follows, we provide only a summary view that suffices to present our contribution.

Say we have a task Π = (V, A, I, G) with state space Θ_Π = (S, A, T, I, S_G), and a variable subset W ⊆ V. The projection onto W is the function π_W : S → S^W from the states S over V into the states S^W over W, where π_W(s) is the restriction of s to W. The projected state space Θ^W_Π is the LTS (S^W, A, T^W, π_W(I), S^W_G), where T^W := {(π_W(s), l, π_W(s')) | (s, l, s') ∈ T} and S^W_G := {π_W(s_G) | s_G ∈ S_G}.
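The following is a minimal illustrative sketch (not the authors' Fast Downward code) of the projection π_W, assuming states are represented as Python dicts from variables to values; the variable names and values in the example are made up.

```python
# Illustrative sketch only: pi_W as restriction of a state to a variable subset W,
# and its image on a set of labeled transitions. The state representation is assumed.

def project_state(state, W):
    """pi_W(s): the restriction of state s (a dict) to the variables in W."""
    return frozenset((v, val) for v, val in state.items() if v in W)

def project_transitions(transitions, W):
    """T^W: the set {(pi_W(s), l, pi_W(s')) | (s, l, s') in T}."""
    return {(project_state(s, W), l, project_state(t, W))
            for (s, l, t) in transitions}

# Hypothetical example: truck at A, package in the truck ("T"),
# projected onto the package variable only.
s = {"truck": "A", "package": "T"}
print(project_state(s, {"package"}))    # frozenset({('package', 'T')})
```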
Given two variable subsets W and U, (the projections onto) W and U are orthogonal if W ∩ U = ∅. Orthogonality ensures the following reconstruction property: Θ^W_Π ⊗ Θ^U_Π = Θ^(W ∪ U)_Π for orthogonal W and U, where ⊗ is the synchronized product operation. Namely, given any two labeled transition systems Θ_1 = (S_1, L, T_1, s^1_0, S^1_G) and Θ_2 = (S_2, L, T_2, s^2_0, S^2_G) that share the same set L of labels, Θ_1 ⊗ Θ_2 is the labeled transition system with states S_1 × S_2, labels L, transitions (s_1, s_2) →^l (s'_1, s'_2) iff s_1 →^l s'_1 ∈ T_1 and s_2 →^l s'_2 ∈ T_2, start state (s^1_0, s^2_0), and goal states {(s^1_G, s^2_G) | s^1_G ∈ S^1_G, s^2_G ∈ S^2_G}.

Merge-and-shrink abstractions [Helmert et al., 2007; 2014] construct more general abstraction functions, and the corresponding abstract state spaces, by starting from atomic abstractions (projections onto single state variables), and interleaving merging steps (replacing two abstractions with their synchronized product) with shrinking steps (replacing an abstraction with an abstraction of itself). It has been shown that, if every shrinking step replaces the selected abstraction with a bisimulation of itself, then the final abstraction is a bisimulation of the overall state space Θ_Π. It has also been shown that such shrinking can be combined with exact label reduction [Sievers et al., 2014]. A label reduction is a function τ from the labels L into a set L^τ of reduced labels, preserving label cost, i.e., c(l) = c(τ(l)). Given an LTS Θ, denote by τ(Θ) the LTS identical to Θ except that all labels l have been replaced by τ(l). Given a set {Θ_1, ..., Θ_k} of LTSs sharing labels L, a label reduction τ is exact if τ(Θ_1) ⊗ ... ⊗ τ(Θ_k) = τ(Θ_1 ⊗ ... ⊗ Θ_k). Reducing labels in this way, and using bisimulation shrinking, merge-and-shrink delivers a bisimulation of τ(Θ_Π).

3 Simulation Relations

Given a planning task with states S, a dominance relation is a binary relation ⪯ ⊆ S × S where s ⪯ t implies h*(t) ≤ h*(s) and, if h*(t) = h*(s), then h*_0(t) ≤ h*_0(s). This is exactly what is needed for admissible pruning during search, as discussed in the next section. To find dominance relations in practice, we focus on the special case of simulation relations. These are well known in model checking (e.g., [Grumberg and Long, 1994; Loiseaux et al., 1995]). Here we use a variant adapted to planning (only) in making explicit the distinction between goal states and non-goal states:

Definition 1 (Simulation) Let Θ = (S, L, T, s_0, S_G) be an LTS. A binary relation ⪯ ⊆ S × S is a simulation for Θ if, whenever s ⪯ t (in words: t simulates s), for every transition s →^l s' there exists a transition t →^l t' s.t. s' ⪯ t'. We call ⪯ goal-respecting for Θ if, whenever s ⪯ t, s ∈ S_G implies that t ∈ S_G. We call ⪯ the coarsest goal-respecting simulation if, for every goal-respecting simulation ⪯', we have ⪯' ⊆ ⪯.

A unique coarsest goal-respecting simulation always exists and can be computed in time polynomial in the size of Θ [Henzinger et al., 1995]. Note that the coarsest simulation is always reflexive, i.e., s ⪯ s; the same is true of all simulation relations considered here. Intuitively, "every state dominates itself".
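As an illustration of Definition 1, here is a small sketch (with an assumed representation, not the authors' implementation) that tests whether a candidate relation is a goal-respecting simulation for an LTS whose transitions are given as (state, label, state) triples.

```python
# Illustrative sketch only: test the condition of Definition 1 for a candidate
# relation R, given as a set of pairs (s, t) meaning "t simulates s".

def is_goal_respecting_simulation(R, transitions, goals):
    succ = {}
    for s, l, s2 in transitions:
        succ.setdefault((s, l), set()).add(s2)
    for (s, t) in R:
        if s in goals and t not in goals:
            return False                      # violates the goal-respecting condition
        for (src, l, s2) in transitions:
            if src != s:
                continue
            # t must answer s -l-> s2 with some t -l-> t2 such that (s2, t2) is in R
            if not any((s2, t2) in R for t2 in succ.get((t, l), ())):
                return False
    return True
```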
Observe that s ⪯ t implies h*(t) ≤ h*(s), because any plan for s can also be applied to t. Hence a goal-respecting simulation over states is a dominance relation. But, for obtaining that property, goal-respecting simulation is unnecessarily strict. It suffices to preserve, not the label l of the transition s → s', but only its cost:

Definition 2 (Cost-Simulation) Let Θ = (S, L, T, s_0, S_G) be an LTS. A binary relation ⪯ ⊆ S × S is a cost-simulation for Θ if, whenever s ⪯ t, s ∈ S_G implies that t ∈ S_G, and for every transition s →^l s' there exists a transition t →^l' t' s.t. s' ⪯ t' and c(l') ≤ c(l).[2]

[2] Hall et al. [2013] include an equivalent definition, calling it "compatibility" and not relating it to simulation.

A cost-simulation still is a dominance relation, and any goal-respecting simulation is a cost-simulation, but not vice versa. However, in our compositional approach, where individual dominance relations are computed on orthogonal projections, we need to preserve labels for synchronization across these projections. Hence, to ensure we obtain a dominance relation of the state space, we must use goal-respecting simulation (rather than cost-simulation) on each projection.

For our notion of label dominance, it will be important to consider LTSs with NOOPs added. For any LTS Θ, we denote by Θ_noop the same LTS but with a new additional label noop where c(noop) = 0 and, for every state s, a new transition s →^noop s. Obviously, any dominance relation over Θ_noop is a dominance relation over Θ. So for our purposes it suffices to find a dominance relation over the state space (Θ_Π)_noop with NOOPs added.

4 Admissible Dominance Pruning

By A* with dominance pruning, we refer to the following modification of A*: Whenever a node N with state s(N) and path cost g(N) is generated, check whether there exists a node N' in the open or closed lists, with state s(N') and path cost g(N'), so that s(N) ⪯ s(N') and g(N') ≤ g(N). If so, prune N, i.e., do not insert it into the open list, nor into the closed list. As s ⪯ t implies h*(t) ≤ h*(s), any plan through N costs at least as much as an optimal plan through N', so A* with dominance pruning guarantees optimality. In the presence of 0-cost actions, one must be careful not to prune s if this eliminates all possible plans for t. However, s cannot belong to the strongly optimal plan of t, because s ⪯ t and h*(t) = h*(s) implies h*_0(t) ≤ h*_0(s).

Dominance pruning can reduce A*'s search space, but comes with a computational overhead. First, checking cost: the runtime required for checking whether there exists a node N' in the open or closed lists with the mentioned properties. Second, maintenance cost: the runtime and memory required for maintaining whichever data structure is used to keep the checking cost (which is excessive in a naïve implementation) at bay. Depending on which of these two costs tends to be higher, different variants of dominance pruning make sense.

Hall et al.'s [2013] dominance relations are characterized by resources r. They maintain, for each r, the set S(r) of seen states with a positive value for r. Given a new state s, their check iterates over the states in the intersection of S(r) for those r where s(r) is positive. This implementation has high checking cost but low maintenance cost. Hence Hall et al. perform the check not at node-generation time, but at node-expansion time, reducing the number of checks that will be made.

To deal with our much more general dominance relations, we developed a BDD-based [Bryant, 1986] implementation. This has low checking cost but high maintenance cost. Hence we perform the check at node-generation time, but only against the closed list, reducing the number of maintenance operations needed. We maintain a BDD B_g for the set of states simulated by any s(N') where N' is a previously expanded node with g(N') = g. This is done for every g-value of the expanded nodes so far. Every time a node N' is expanded, we determine the set S_s(N') of states simulated by s(N'), and add S_s(N') into B_g(N'). The checking operation then is very fast: when a node N is generated, test membership of s(N) in B_g for all g ≤ g(N). Each such test takes time linear in the size of the state.
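To make the bookkeeping concrete, here is a minimal sketch of this pruning scheme, using plain Python sets of (hashable) states in place of the BDDs B_g; `simulated_by(state)` is an assumed helper returning the set of states simulated by `state` under the precomputed dominance relation. This is an illustration, not the authors' C++/BDD implementation.

```python
# Illustrative sketch: dominance pruning at node-generation time against expanded
# nodes, with plain sets standing in for the per-g-value BDDs B_g.

from collections import defaultdict

class DominancePruner:
    def __init__(self, simulated_by):
        self.simulated_by = simulated_by    # assumed: state -> set of simulated states
        self.B = defaultdict(set)           # g-value -> states dominated by some expanded node

    def on_expand(self, state, g):
        # Maintenance: record every state simulated by the expanded node's state.
        self.B[g] |= self.simulated_by(state)

    def should_prune(self, state, g):
        # Checking: prune a generated node if some expanded node with g' <= g dominates it.
        return any(state in self.B[g2] for g2 in self.B if g2 <= g)
```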
5 The Compositional Approach

As hinted, our approach is compositional, constructing the dominance relation over the state space Θ_Π as the composition of simulation relations over orthogonal projections thereof. Stating this in a generic manner (and simplifying to the atomic case of two orthogonal projections), we have an LTS Θ_12 which equals the synchronized product Θ_1 ⊗ Θ_2 of two smaller LTSs. We obtain a simulation for Θ_12 from simulations for Θ_1 and Θ_2:

Definition 3 (Relation Composition) Let Θ_1 = (S_1, L, T_1, s^1_0, S^1_G) and Θ_2 = (S_2, L, T_2, s^2_0, S^2_G) be LTSs sharing the same labels. For binary relations ⪯_1 ⊆ S_1 × S_1 and ⪯_2 ⊆ S_2 × S_2, the composition of ⪯_1 and ⪯_2, denoted ⪯_1 ⊗ ⪯_2, is the binary relation on S_1 × S_2 where (s_1, s_2) (⪯_1 ⊗ ⪯_2) (t_1, t_2) iff s_1 ⪯_1 t_1 and s_2 ⪯_2 t_2.

Proposition 1 Let Θ_12 = Θ_1 ⊗ Θ_2, and let ⪯_1 and ⪯_2 be goal-respecting simulations for Θ_1 and Θ_2, respectively. Then ⪯_1 ⊗ ⪯_2 is a goal-respecting simulation for Θ_12.

The proof is direct by definition, and is almost identical to that of a similar result concerning bisimulation, stated by Helmert et al. [2014].

Our basic idea can now be described as follows. Say we have a planning task Π = (V, A, I, G) with state space Θ_Π, a partition V_1, ..., V_k of the task's variables, and a goal-respecting bisimulation abstraction α_i of each τ(Θ^Vi_Π), where τ is an exact label reduction. This is precisely the input we will get from merge-and-shrink abstraction. We will henceforth refer to this input as our initial abstractions. Say we construct a goal-respecting simulation ⪯_i for each abstract state space Θ^αi. Because bisimulation is a special case of simulation, ⪯_i is a goal-respecting simulation for τ(Θ^Vi_Π). Applying Definition 3 and Proposition 1 iteratively, ⊗_i ⪯_i is a goal-respecting simulation for ⊗_i τ(Θ^Vi_Π). Because τ is exact, ⊗_i τ(Θ^Vi_Π) = τ(⊗_i Θ^Vi_Π), and by the reconstruction property τ(⊗_i Θ^Vi_Π) = τ(Θ_Π). Because the label reduction is cost-preserving, ⊗_i ⪯_i is a cost-simulation for Θ_Π, and hence a dominance relation as desired.

One can use this result and method as-is, obtaining a new dominance pruning method as a corollary of suitably assembling existing results and methods. However, empirically, this method's ability to find interesting dominance relations is quite limited. We now extend the simulation concept to overcome that problem.
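For illustration, the two operations underlying Definition 3 and Proposition 1 can be sketched as follows (a minimal sketch with an assumed set-of-pairs and set-of-triples representation; only the transition part of the synchronized product is shown).

```python
# Illustrative sketch only.

def compose_relations(R1, R2):
    """Definition 3: (s1,s2) (R1 (x) R2) (t1,t2) iff s1 R1 t1 and s2 R2 t2."""
    return {((s1, s2), (t1, t2)) for (s1, t1) in R1 for (s2, t2) in R2}

def synchronized_product(T1, T2):
    """Transitions of Theta1 (x) Theta2: components move jointly on equal labels."""
    return {((s1, s2), l, (t1, t2))
            for (s1, l, t1) in T1 for (s2, l2, t2) in T2 if l == l2}
```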
6 Label-Dominance Simulation

Sievers et al. [2014] introduce "label subsumption", where l' subsumes l if it labels all transitions labeled by l. To intertwine dominance between labels with dominance between states, we extend that concept as follows:

Definition 4 (Label Dominance) Let Θ be an LTS with states S, let ⪯ ⊆ S × S be any binary relation on S, and let l, l' be labels. We say that l' dominates l in Θ given ⪯ if c(l') ≤ c(l), and for every transition s →^l s' there exists a transition s →^l' t' s.t. s' ⪯ t'.

The relation ⪯ here is arbitrary, but will be a simulation in practice. Hence, intuitively, a label dominates another one if "it applies to the same states and always leads to an at least as good state".

To give a simple example, consider the LTS corresponding to a single vehicle's position, and say we have a 0-cost "beam" action which takes us from any position to the vehicle's goal. Provided that every position is, per ⪯, simulated by the goal position, "beam" dominates all other labels.

In IPC benchmarks, typically Definition 4 is important not for regular actions, but for NOOPs. For illustration, say we have a truck variable v_T, two locations A and B, and a package variable v_P whose goal is to be at B. Our variable partition is the trivial one, V_1 = {v_T} and V_2 = {v_P}. Bisimulation using exact label reduction will return LTSs as shown in Figure 1. The load and unload actions get reduced in a way allowing to synchronize with the correct truck position; the distinction between the truck drive actions is irrelevant, so these are reduced to the same label. Clearly, no label dominates any other in either of these two LTSs. However, consider Θ_1 and Θ_2 with NOOPs added. The new label noop dominates the load/unload actions in Θ_1, and dominates the drive actions in Θ_2, provided ⪯_1 and ⪯_2 are reflexive, as will be the case in practice.

[Figure 1 (drawing omitted): Label-reduced bisimulations, i.e., the input to our simulation process, in the Logistics example. Θ_1 is the truck LTS over locations A and B with the reduced drive label; Θ_2 is the package LTS over states A, T (in the truck), and B with the reduced load/unload labels.]

This behavior allows us, e.g., to conclude that, in Θ_2, B dominates T: While T has an outgoing transition to B, labeled l_B, B itself has no such outgoing label. However, B has the outgoing label noop, leading to B. The transition B →^noop B simulates T →^l_B B, except that it uses label l' = noop instead of label l = l_B. This is admissible (only) if l' dominates l in all other LTSs involved. In our case here, the only other LTS is Θ_1, and indeed the label l' = noop dominates l = l_B in that LTS. We exploit this kind of information as follows:

Definition 5 (Label-Dominance Simulation) Let T = {Θ_1, ..., Θ_k} be a set of LTSs sharing the same labels. Denote the states of Θ_i by S_i. A set R = {⪯_1, ..., ⪯_k} of binary relations ⪯_i ⊆ S_i × S_i is a label-dominance simulation for T if, whenever s ⪯_i t, s ∈ S^i_G implies that t ∈ S^i_G, and for every transition s →^l s' in Θ_i, there exists a transition t →^l' t' in Θ_i such that c(l') ≤ c(l), s' ⪯_i t', and, for all j ≠ i, l' dominates l in Θ_j given ⪯_j. We call R the coarsest label-dominance simulation if, for every label-dominance simulation R' = {⪯'_1, ..., ⪯'_k} for T, we have ⪯'_i ⊆ ⪯_i for all i.

A unique coarsest label-dominance simulation always exists, and can be computed in time polynomial in the size of T. We will prove this in the next section, as a corollary of specifying our algorithm for doing this computation. In the example, T ⪯_2 B holds because s →^l s' in Θ_2 is T →^l_B B, and the simulating t →^l' t' in Θ_2 is B →^noop B, which works because c(noop) = 0 ≤ 1 = c(l_B), B ⪯_2 B, and noop dominates l_B in Θ_1. In the same fashion, provided that A ⪯_2 B, the transition T →^l_A A is simulated by B →^noop B. Note that neither of these two inferences could be made with the standard concept of simulation (even after exact label reduction), because that concept insists on using the same labels, not dominating ones.
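The label-dominance test of Definition 4, which Definition 5 uses as a subroutine ("l' dominates l in Θ_j given ⪯_j"), can be sketched as follows (again an illustrative sketch with an assumed representation; the NOOP label is assumed to have been added to the transition set as 0-cost self-loops by the caller).

```python
# Illustrative sketch only: does l_prime dominate l in this LTS, given relation R?
# transitions: set of (s, label, s') triples; cost: dict label -> cost;
# R: set of pairs (s, t) meaning "t simulates s".

def label_dominates(l_prime, l, transitions, cost, R):
    if cost[l_prime] > cost[l]:
        return False
    by_label = {}
    for s, lab, s2 in transitions:
        by_label.setdefault((s, lab), set()).add(s2)
    for (s, lab, s2) in transitions:
        if lab != l:
            continue
        # every s -l-> s2 must be answered by some s -l'-> t2 with (s2, t2) in R
        if not any((s2, t2) in R for t2 in by_label.get((s, l_prime), ())):
            return False
    return True
```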
We now prove soundness of label-dominance simulation, i.e., that label-dominance simulations R yield cost-simulations of the original state space. Similarly to before, we iteratively compose R's element relations, as captured by the following lemma:

Lemma 1 Let T = {Θ_1, ..., Θ_k} be a set of LTSs sharing the same labels, and let R = {⪯_1, ..., ⪯_k} be a label-dominance simulation for T. Then {⪯_1 ⊗ ⪯_2, ⪯_3, ..., ⪯_k} is a label-dominance simulation for {Θ_1 ⊗ Θ_2, Θ_3, ..., Θ_k}.

Proof Sketch: The claim regarding ⪯_3, ..., ⪯_k is simple. For ⪯_1 ⊗ ⪯_2, consider states (s_1, s_2) ⪯_12 (t_1, t_2) and a transition (s_1, s_2) →^l (s'_1, s'_2). We identify a dominating transition (t_1, t_2) →^l' (t'_1, t'_2) as follows: (1) As s_1 ⪯_1 t_1, obtain a transition t_1 →^l_tmp t^tmp_1 dominating s_1 →^l s'_1 in Θ_1. (2) As l_tmp dominates l in Θ_2, obtain a transition s_2 →^l_tmp s^tmp_2 dominating s_2 →^l s'_2 in Θ_2. (3) As s_2 ⪯_2 t_2, obtain a transition t_2 →^l' t'_2 dominating s_2 →^l_tmp s^tmp_2 in Θ_2. (4) As l' dominates l_tmp in Θ_1, obtain a transition t_1 →^l' t'_1 dominating t_1 →^l_tmp t^tmp_1 in Θ_1.

Theorem 1 Let T = {Θ_1, ..., Θ_k} be a set of LTSs sharing the same labels, and let R = {⪯_1, ..., ⪯_k} be a label-dominance simulation for T. Then ⊗_i ⪯_i is a cost-simulation for ⊗_i Θ_i.

Proof: Applying Lemma 1 iteratively, we get that ⊗_i ⪯_i is a label-dominance simulation for the singleton set {⊗_i Θ_i}. Now, for such a singleton set of LTSs, the requirements on the transition t →^l' t' replacing s →^l s' are that c(l') ≤ c(l), and s' ⪯ t' (for ⪯ = ⊗_i ⪯_i). Hence label-dominance simulation simplifies to cost-simulation, and the claim follows.

To clarify the overall process, assume now again the initial abstractions provided by merge-and-shrink abstraction, i.e., a partition V_1, ..., V_k of the variables, and a goal-respecting bisimulation abstraction α_i, with abstract state space Θ^αi, of each τ(Θ^Vi_Π), where τ is an exact label reduction. We compute the coarsest label-dominance simulation R = {⪯_1, ..., ⪯_k} for T := {(Θ^α1)_noop, ..., (Θ^αk)_noop}. As adding NOOPs does not affect bisimulation and is interchangeable with the synchronized product, with Theorem 1 we have that ⊗_i ⪯_i is a cost-simulation for [⊗_i τ(Θ^Vi_Π)]_noop, and hence a cost-simulation for Θ_Π as desired.

7 Computing R

We now show how to operationalize Definition 5: Given T = {Θ_1, ..., Θ_k}, how do we compute the coarsest label-dominance simulation R for T? It is well known that the coarsest simulation can be computed in time polynomial in the size of the input LTS [Henzinger et al., 1995]. The algorithm starts with the most generous relation possible, then iteratively removes pairs s ⪯ t that do not satisfy the simulation condition. When no more changes occur, the unique coarsest simulation has been found. This method extends straightforwardly to label-dominance simulation.

Proposition 2 Let T = {Θ_1, ..., Θ_k} be a set of LTSs sharing the same labels. Then a unique coarsest label-dominance simulation for T exists.

Proof: The identity relation is a label-dominance simulation. If R = {⪯_1, ..., ⪯_k} and R' = {⪯'_1, ..., ⪯'_k} are label-dominance simulations, then {⪯_1 ∪ ⪯'_1, ..., ⪯_k ∪ ⪯'_k} also is a label-dominance simulation.

Denote the states of Θ_i by S_i. Define the Boolean function Ok(i, s, t), where s ⪯_i t, to return true iff the condition for label-dominance simulation holds, i.e., iff s ∈ S^i_G implies that t ∈ S^i_G, and for every transition s →^l s' in Θ_i there exists a transition t →^l' t' in Θ_i such that c(l') ≤ c(l), s' ⪯_i t', and, for all j ≠ i, l' dominates l in Θ_j given ⪯_j. Our algorithm proceeds as follows:

    For all i, set ⪯_i := {(s, t) | s, t ∈ S_i, s ∉ S^i_G or t ∈ S^i_G}
    while ex. (i, s, t) s.t. not Ok(i, s, t) do
        Select one such triple (i, s, t)
        Set ⪯_i := ⪯_i \ {(s, t)}
    endwhile
    return R := {⪯_1, ..., ⪯_k}
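For concreteness, here is a deliberately naïve Python transcription of the algorithm above, reusing `label_dominates` from the earlier sketch. The (states, transitions, goals) tuple representation and the `cost` dictionary are illustrative assumptions, not the authors' Fast Downward implementation; NOOP self-loops are assumed to have been added to each transition set beforehand.

```python
# Illustrative sketch of the fixpoint computation of the coarsest label-dominance simulation.

def coarsest_label_dominance_simulation(ltss, cost):
    # ltss: list of (states, transitions, goals); transitions are (s, label, s') triples.
    # Initialization: the most generous goal-respecting relations.
    R = [{(s, t) for s in states for t in states if s not in goals or t in goals}
         for (states, transitions, goals) in ltss]

    def ok(i, s, t):
        states, transitions, goals = ltss[i]
        succ = {}
        for a, l, b in transitions:
            succ.setdefault(a, []).append((l, b))
        # The goal condition holds by initialization; check the transition condition.
        for l, s2 in succ.get(s, []):
            if not any(cost[l2] <= cost[l]
                       and (s2, t2) in R[i]
                       and all(label_dominates(l2, l, ltss[j][1], cost, R[j])
                               for j in range(len(ltss)) if j != i)
                       for (l2, t2) in succ.get(t, [])):
                return False
        return True

    changed = True
    while changed:                        # iterate until no pair violates Ok(i, s, t)
        changed = False
        for i in range(len(ltss)):
            for (s, t) in list(R[i]):
                if (s, t) in R[i] and not ok(i, s, t):
                    R[i].discard((s, t))
                    changed = True
    return R
```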
Proposition 3 Let T = {Θ_1, ..., Θ_k} be a set of LTSs sharing the same labels. Our algorithm terminates in time polynomial in the size of T, and returns the coarsest label-dominance simulation for T.

Proof: Each iteration reduces one ⪯_i by one element. This gives a polynomial bound on the number of iterations, and every iteration takes polynomial time. The returned R is a label-dominance simulation, as that is the termination condition. R is coarsest, as every label-dominance simulation must refine the initial relations {(s, t) | s, t ∈ S_i, s ∉ S^i_G or t ∈ S^i_G}, and every time we remove a pair (s, t) we know that s ⪯_i t cannot hold in any label-dominance simulation.

8 Experiments

Our techniques are implemented in Fast Downward (FD) [Helmert, 2006]. We ran all optimal-track STRIPS planning instances from the international planning competitions (IPC'98 through IPC'14). All experiments were conducted on a cluster of Intel E5-2660 machines running at 2.20 GHz, with time (memory) cut-offs of 30 minutes (4 GB). We run A* with FD's blind heuristic, and with LM-cut [Helmert and Domshlak, 2009]. We perform an ablation study of label-dominance simulation vs. standard simulation (neither NOOPs nor label dominance) vs. bisimulation as computed by merge-and-shrink (not doing any work on top of merge-and-shrink, just using its output for the pruning). To represent the state of the art in alternative pruning methods, we include the best-performing partial-order reduction, based on strong stubborn sets, which dominates other partial-order pruning approaches such as expansion-core [Wehrle et al., 2013].

Our initial abstractions are obtained using merge-and-shrink with exact label reduction, bisimulation shrinking, and the non-linear merge strategy DFP [Dräger et al., 2006; 2009; Sievers et al., 2014]. We impose two bounds on this process, namely a time limit of 300 seconds, as well as a limit M on the number of abstract transitions. When either of these limits is reached, the last completed abstractions form the starting point for our simulation process, i.e., are taken to be the initial abstractions. With this arrangement of parameters, the trade-off between merge-and-shrink overhead incurred vs. benefits gained is relatively easy to control. The bound on transitions works better than the more usual bound on abstract states, because the same number of abstract states may lead to widely differing numbers of transitions, and thus actual effort. A reasonably good "magic" setting for M, in our current context, is 100k. For M = 0, i.e., computing the component simulations on individual state variables only, performance is substantially worse. For M = 200k, the overhead becomes prohibitive. In between, overall coverage undergoes relatively small changes only (on the order of 5 instances).

Consider Table 1. With our pruning method, nodes are first generated and then checked for pruning, so the evaluated states are exactly the non-pruned generated ones. Hence the number of evaluated states assesses our pruning power, and the ratio between generated nodes and search time assesses the average time per node. The "safety belt" disables pruning if, after 1000 expansions, no node has been pruned. This is a simple yet effective method to avoid runtime overhead in cases where no, or not much, pruning will be obtained.

Compared to partial-order reduction, simulation-based pruning tends to be stronger on its own, but less complementary to LM-cut. Consider first the blind heuristic, which assesses the pruning power of each technique "on its own". Simulation-based pruning typically yields much stronger evaluation reductions, the only clear exception being ParcPrinter, where partial-order reduction excels. This results in much better coverage in many domains and overall.
With LM-cut, on the other hand, while simulation-based pruning still applies more broadly (there are 14 test suites where it reduces evaluations but partial-order reduction does not), the extent of the reduction is dramatically diminished.

Table 1: Experiments. "A*": A* without pruning. "L0", "L": label-dominance simulation; "S": simulation; "B": bisimulation; "L0" is without the safety belt (see text), all others are with it. "P": partial-order reduction. Domains where no changes in coverage occur anywhere are omitted. Coverage entries for L0, L, S, B, P give the difference in coverage relative to A*. "Eval." is the factor by which the per-domain summed-up number of evaluated states, relative to A*, decreases. "Gen/s" is the factor by which the per-node runtime (summed-up number of generated nodes divided by summed-up search time), relative to A*, increases. "-" marks blank cells.

Domain        | #    | Blind: A* L0 L S B P   | Eval. L S B P          | Gen/s L P  | LM-cut: A* L0 L S B P  | Eval. L S B P         | Gen/s L P
Airport       | 50   | 22 -7 -7 -7 0 -1       | 1.2 1.2 1 4.4          | 341 11.3   | 28 -3 -1 -1 -1 +1      | 1 1 1 4.7             | 1.1 2
Driverlog     | 20   | 7 +2 +2 0 0 0          | 15.8 2 2 1             | 4.8 2.8    | 13 0 0 0 0 0           | 1.9 1.2 1.2 1         | 1.1 1.2
Elevators08   | 30   | 14 -1 0 0 0 0          | 1 1 1 1.1              | 0.9 4.1    | 22 0 0 0 0 0           | 1 1 1 1.3             | 1 1.2
Elevators11   | 20   | 12 -1 0 0 0 0          | 1 1 1 1.1              | 1 4.2      | 18 0 0 0 0 0           | 1 1 1 1.2             | 1 1.2
Floortile11   | 20   | 2 +4 +4 +4 0 0         | 177 177 1.8 1.3        | 5.7 3.7    | 7 +1 +1 +1 0 0         | 6.4 6.4 1 1           | 1.1 1.1
Floortile14   | 20   | 0 +5 +5 +5 0 0         | -                      | -          | 6 +2 +2 +2 0 0         | 6.3 6.3 1 1           | 1.3 1.1
FreeCell      | 80   | 20 -7 0 0 0 -6         | 1 1 1 1                | 1 31.2     | 15 -1 0 0 0 0          | 1 1 1 1               | 0.9 1.4
Gripper       | 20   | 8 +6 +6 +6 +6 0        | 53968 53968 28353 1    | 292 3.1    | 7 +7 +7 +7 +7 0        | 14662 14662 10049 1   | 31.9 1.3
Hiking14      | 20   | 11 0 0 0 0 -3          | 2.4 1.9 1.8 1          | 3.1 30.5   | 9 0 0 0 0 0            | 1.7 1.5 1.5 1         | 1.9 1.8
Logistics00   | 28   | 10 +7 +6 0 0 0         | 32.7 3.1 1.2 1         | 9.3 3      | 20 0 0 0 0 0           | 1.9 1.1 1.1 2.9       | 0.8 1.4
Logistics98   | 35   | 2 +1 +1 0 0 0          | 6.7 1.2 1.2 1.5        | 4 4.4      | 6 0 0 0 0 0            | 1.3 1 1 4.3           | 0.9 1.3
Miconic       | 150  | 55 +6 +6 -1 0 -5       | 58.3 8.7 3.4 1         | 15.6 5.5   | 141 0 0 0 0 0          | 2.1 1.5 1.1 1         | 0.6 1.1
Mprime        | 35   | 20 -1 -1 0 0 -1        | 1.1 1 1 1              | 18 18.5    | 22 0 0 0 0 0           | 1.1 1 1 1             | 1 1.1
Mystery       | 30   | 15 -3 -3 -3 0 0        | 1.9 1.9 1 1.1          | 29.2 14.2  | 17 0 0 0 0 0           | 3.5 3.5 1 1.4         | 3.4 1.8
NoMystery     | 20   | 8 +10 +10 +1 +1 0      | 2497 128 29.1 1.1      | 46.4 23    | 14 +6 +6 +3 0 0        | 6.5 3.1 1 1           | 0.6 1.2
Openstacks08  | 30   | 22 +2 +2 +2 +1 0       | 2.1 2 1.8 2            | 8.1 9.3    | 21 0 0 0 0 0           | 2.5 2.4 2.1 1.8       | 2.3 2
Openstacks11  | 20   | 17 +2 +2 +2 +1 0       | 2.1 2 1.8 2            | 7.8 9.3    | 16 0 0 0 0 0           | 2.5 2.4 2.1 1.8       | 2.3 2.1
Openstacks14  | 20   | 3 0 0 0 0 +1           | 2.8 2.8 2.5 1.8        | 7 7.8      | 3 0 0 0 0 0            | 2.9 2.8 2.5 1.8       | 2.4 2.1
ParcPrinter08 | 30   | 10 +6 +5 +3 +1 +20     | 862 10 1.5 18349       | 13.6 532   | 18 0 0 0 0 +12         | 5 1.2 1.1 1028        | 2.2 20.3
ParcPrinter11 | 20   | 6 +6 +5 +3 +1 +14      | 869 10 1.5 21826       | 11.2 371   | 13 0 0 0 0 +7          | 5 1.2 1.1 1246        | 2.4 17.8
PegSol08      | 30   | 27 0 0 0 0 0           | 1 1 1 1                | 1 3.8      | 28 -1 0 0 0 -1         | 1 1 1 1               | 1 1.4
PegSol11      | 20   | 17 0 0 0 0 0           | 1 1 1 1                | 1 3.8      | 18 -1 0 0 0 -1         | 1 1 1 1               | 1 1.4
PipesNoTank   | 50   | 17 -8 -1 -1 0 -3       | 1 1 1 1                | 1.1 10.3   | 17 -3 0 0 0 0          | 1 1 1 1               | 1 1.1
PipesTank     | 50   | 12 -1 0 0 0 -3         | 1.1 1.1 1.1 1          | 16.8 25.2  | 12 0 0 0 0 -1          | 1.8 1.8 1.8 1         | 1.1 1.2
Rovers        | 40   | 6 +2 +2 +1 0 +1        | 33.4 9.6 1.7 2         | 20.6 3     | 7 +2 +2 +1 +1 +2       | 6.1 3.8 1.2 4.4       | 1.8 1.8
Satellite     | 36   | 6 0 0 0 0 0            | 72.9 35.3 9.9 10.7     | 8.4 3.8    | 7 +3 +3 +3 +3 +4       | 4.8 1.8 1.7 21.5      | 0.9 2.3
Scanalyzer08  | 30   | 12 0 0 0 0 -4          | 1 1 1 1                | 1 8.7      | 15 -1 -1 -1 0 0        | 1 1 1 1               | 1 1.2
Scanalyzer11  | 20   | 9 0 0 0 0 -4           | 1 1 1 1                | 1 8.7      | 12 -1 -1 -1 0 0        | 1 1 1 1               | 1 1.2
Sokoban08     | 30   | 22 -9 0 0 0 -1         | 1 1 1 1                | 1.7 8.2    | 29 -7 -1 0 0 0         | 1 1 1 1               | 1.1 1.2
Sokoban11     | 20   | 19 -9 0 0 0 -1         | 1 1 1 1                | 1.6 8.1    | 20 -2 0 0 0 0          | 1 1 1 1               | 1.1 1.2
Tetris        | 17   | 9 -6 -1 -1 -1 -4       | 1 1 1 1                | 5.2 52.2   | 6 -3 -2 -2 -2 -1       | 1 1 1 1               | 1 1.3
Tidybot11     | 20   | 9 -8 -7 -7 -1 -2       | 5.5 5.5 1 1.8          | 59.4 8.5   | 14 -2 -2 -2 0 0        | 6.8 6.8 1 1.5         | 2.6 1.3
Tidybot14     | 20   | 2 -2 -2 -2 -1 -2       | -                      | -          | 9 -7 -7 -7 -1 -1       | 3.9 3.9 1 1.7         | 3.1 1.4
TPP           | 30   | 6 0 0 0 0 0            | 6.5 3.4 1 1            | 22.7 3.3   | 6 +1 +1 +1 +1 0        | 1.2 1.1 1 1           | 1.3 1.1
Transport14   | 20   | 7 0 0 0 0 -1           | 1 1 1 1                | 1 9.1      | 6 0 0 0 0 0            | 1.4 1.4 1.4 1         | 1.4 1.2
Trucks        | 30   | 6 +2 +2 0 0 0          | 24.8 21.9 2.8 1        | 13.8 6     | 10 0 0 0 0 0           | 2.7 2.3 1 1           | 1.2 1.2
VisitAll11    | 20   | 9 0 0 0 0 0            | 30 25.5 1 1            | 104 3.5    | 10 +1 +1 +1 0 0        | 7 6.8 1 1             | 1.5 1.1
VisitAll14    | 20   | 3 +1 +1 +1 0 0         | 27.8 23.4 1 1          | 92.8 3.5   | 5 0 0 0 0 0            | 5.2 5.1 1 1           | 1.6 1.1
Woodwork08    | 30   | 8 +10 +10 +5 +4 +7     | 981 112 87.8 488       | 7.6 7.5    | 17 +7 +7 +5 +5 +10     | 91.4 23.7 16.9 762    | 1.8 3.1
Woodwork11    | 20   | 3 +9 +9 +5 +4 +6       | 1059 116 92.2 514      | 6.7 6.3    | 12 +5 +5 +4 +4 +7      | 91.6 23.8 17 772      | 1.8 2.9
Zenotravel    | 20   | 8 +1 +1 0 0 0          | 41.6 1.5 1.1 1         | 4.3 6.2    | 13 0 0 0 0 0           | 3.6 1.6 1 1           | 1 1.2
Σ             | 1271 | 605 +19 +57 +16 +16 +8 | -                      | -          | 833 +3 +20 +14 +17 +38 | -                     | -
Partial-order reduction suffers from this as well, but retains much of its power in ParcPrinter and Woodworking, and consistently causes very little runtime overhead relative to this slow heuristic function. Thus partial-order reduction has better overall coverage. It does not dominate simulation-based pruning, though, which yields better coverage in Floortile, Gripper, NoMystery, TPP, and VisitAll.

Label-dominance simulation clearly pays off against standard simulation as well as bisimulation. The latter already is very helpful in some domains, like Gripper and Woodworking. Simulation does add over this, but suffers in some domains, like Tidybot, from the runtime overhead. Label-dominance simulation has such issues as well, but makes up for them by more pronounced gains on other domains.

The per-node runtime overhead in simulation-based pruning is almost consistently outweighed by the search-space size reduction (compare the respective Gen/s vs. Eval. columns in Table 1). The most substantial runtime overhead stems from computing the simulation relations. Our current implementation of that process is largely naïve. We experimented with ideas from model checking for doing this more effectively, but with limited success due to the different context (especially, label dominance). It remains an important open topic to improve this part of our machinery.

9 Conclusion

The idea of pruning states based on some form of dominance is old, but has previously been incarnated in planning with simple special cases ("more facts true", "more resources available") only. Simulation relations are the natural framework to move beyond this. Our work constitutes a first step towards leveraging the power of simulation relations in, as well as extending them for, admissible pruning in planning. The method is orthogonal to existing pruning methods, and empirically exhibits complementary strengths relative to partial-order reduction, so there is potential for synergy. A major challenge in our view is how to intelligently control initial-abstraction size, investing a lot of overhead where simulation pruning is promising and, ideally, avoiding any overhead altogether where it is not.

Acknowledgments. We thank the anonymous reviewers, whose comments helped to improve the paper. This work was partially supported by the EU FP7 Programme under grant agreement 295261 (MEALS).

References

[Bryant, 1986] Randal E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, 35(8):677-691, 1986.
[Chen and Yao, 2009] Yixin Chen and Guohui Yao. Completeness and optimality preserving reduction for planning. In Proc. 21st International Joint Conference on Artificial Intelligence (IJCAI 2009), pages 1659-1664, 2009.
[Domshlak et al., 2012] Carmel Domshlak, Michael Katz, and Alexander Shleyfman. Enhanced symmetry breaking in cost-optimal planning as forward search. In Proc. 22nd International Conference on Automated Planning and Scheduling (ICAPS'12), 2012.
[Dräger et al., 2006] Klaus Dräger, Bernd Finkbeiner, and Andreas Podelski. Directed model checking with distance-preserving abstractions. In Proceedings of the 13th International SPIN Workshop (SPIN 2006), volume 3925 of Lecture Notes in Computer Science, pages 19-34, 2006.
[Dräger et al., 2009] Klaus Dräger, Bernd Finkbeiner, and Andreas Podelski. Directed model checking with distance-preserving abstractions. STTT, 11(1):27-37, 2009.
[Fox and Long, 1999] Maria Fox and Derek Long. The detection and exploitation of symmetry in planning problems. In Proc. 16th International Joint Conference on Artificial Intelligence (IJCAI-99), pages 956-961, 1999.
[Fox and Long, 2002] Maria Fox and Derek Long. Extending the exploitation of symmetries in planning. In Proc. 6th International Conference on Artificial Intelligence Planning and Scheduling (AIPS-02), pages 83-91, 2002.
[Gentilini et al., 2003] Raffaella Gentilini, Carla Piazza, and Alberto Policriti. From bisimulation to simulation: Coarsest partition problems. Journal of Automated Reasoning, 31(1):73-103, 2003.
[Grumberg and Long, 1994] Orna Grumberg and David E. Long. Model checking and modular verification. ACM Transactions on Programming Languages and Systems (TOPLAS), 16(3):843-871, 1994.
[Hall et al., 2013] David Hall, Alon Cohen, David Burkett, and Dan Klein. Faster optimal planning with partial-order pruning. In Proc. 23rd International Conference on Automated Planning and Scheduling (ICAPS'13), 2013.
[Helmert and Domshlak, 2009] Malte Helmert and Carmel Domshlak. Landmarks, critical paths and abstractions: What's the difference anyway? In Proc. 19th International Conference on Automated Planning and Scheduling (ICAPS'09), pages 162-169, 2009.
[Helmert and Röger, 2008] Malte Helmert and Gabriele Röger. How good is almost perfect? In Proc. 23rd National Conference of the American Association for Artificial Intelligence (AAAI-08), pages 944-949, 2008.
[Helmert et al., 2007] Malte Helmert, Patrik Haslum, and Jörg Hoffmann. Flexible abstraction heuristics for optimal sequential planning. In Proc. 17th International Conference on Automated Planning and Scheduling (ICAPS'07), pages 176-183, 2007.
[Helmert et al., 2014] Malte Helmert, Patrik Haslum, Jörg Hoffmann, and Raz Nissim. Merge & shrink abstraction: A method for generating lower bounds in factored state spaces. Journal of the Association for Computing Machinery, 61(3), 2014.
[Helmert, 2006] Malte Helmert. The Fast Downward planning system. Journal of Artificial Intelligence Research, 26:191-246, 2006.
[Henzinger et al., 1995] Monika Rauch Henzinger, Thomas A. Henzinger, and Peter W. Kopke. Computing simulations on finite and infinite graphs. In 36th Annual Symposium on Foundations of Computer Science, pages 453-462, 1995.
[Hoffmann, 2003] Jörg Hoffmann. The Metric-FF planning system: Translating "ignoring delete lists" to numeric state variables. Journal of Artificial Intelligence Research, 20:291-341, 2003.
[Loiseaux et al., 1995] Claire Loiseaux, Susanne Graf, Joseph Sifakis, Ahmed Bouajjani, and Saddek Bensalem. Property preserving abstractions for the verification of concurrent systems. Formal Methods in System Design, 6(1):11-44, 1995.
[Milner, 1971] Robin Milner. An algebraic definition of simulation between programs. In Proc. 2nd International Joint Conference on Artificial Intelligence (IJCAI'71), pages 481-489, 1971.
[Sievers et al., 2014] Silvan Sievers, Martin Wehrle, and Malte Helmert. Generalized label reduction for merge-and-shrink heuristics. In Proc. 28th AAAI Conference on Artificial Intelligence (AAAI'14), pages 2358-2366, 2014.
[Valmari, 1989] Antti Valmari. Stubborn sets for reduced state space generation. Volume 483 of Lecture Notes in Computer Science, pages 491-515, 1989.
[Wehrle and Helmert, 2012] Martin Wehrle and Malte Helmert. About partial order reduction in planning and computer aided verification. In Proc. 22nd International Conference on Automated Planning and Scheduling (ICAPS'12), 2012.
[Wehrle and Helmert, 2014] Martin Wehrle and Malte Helmert. Efficient stubborn sets: Generalized algorithms and selection strategies. In Proc. 24th International Conference on Automated Planning and Scheduling (ICAPS'14), 2014.
[Wehrle et al., 2013] Martin Wehrle, Malte Helmert, Yusra Alkhazraji, and Robert Mattmüller. The relative pruning power of strong stubborn sets and expansion core. In Proc. 23rd International Conference on Automated Planning and Scheduling (ICAPS'13), 2013.