# totsat__totallyordered_hierarchical_planning_through_sat__849b3e86.pdf tot SAT Totally-Ordered Hierarchical Planning through SAT Gregor Behnke, Daniel H oller, Susanne Biundo Institute of Artificial Intelligence, Ulm University, D-89069 Ulm, Germany {gregor.behnke, daniel.hoeller, susanne.biundo}@uni-ulm.de In this paper, we propose a novel SAT-based planning approach for hierarchical planning by introducing the SATbased planner tot SAT for the class of totally-ordered HTN planning problems. We use the same general approach as SAT planning for classical planning does: bound the problem, translate the problem into a formula, and if the formula is not satisfiable, increase the bound. In HTN planning, a suitable bound is the maximum depth of decomposition. We show how totally-ordered HTN planning problems can be translated into a SAT formula, given this bound. Furthermore, we have conducted an extensive empirical evaluation to compare our new planner against state-of-the-art HTN planners. It shows that our technique outperforms any of these systems. Introduction Hierarchical planning provides a natural and expressive, but nevertheless easily usable means for planning. Its most common formalism is Hierarchical Task Network (HTN) planning (Erol, Hendler, and Nau 1996), which extends classical STRIPS-based planning in two ways. It introduces first the concept of abstract tasks tasks which cannot be executed directly, and second decomposition methods, which relate abstract tasks to primitive STRIPS-like actions by providing a course of (primitive and/or abstract) tasks that needs to be performed to achieve the abstract task. The various decomposition methods for an abstract task define the allowed portfolio of ways to achieve that task. They can, e.g., be gained from domain experts, whose knowledge can be encoded and exploited this way. This makes HTN planning particularly useful for planning in complex real-world domains. Its topdown and decomposition-based reasoning also bears similarities with human thought processes (Byrne 1977). Although HTN planning is often used in real-world applications (Nau et al. 2005; Champandard, Verweij, and Straatman 2009; Hartanto and Hertzberg 2008; Morisset and Ghallab 2002; Dvorak et al. 2014; Bercher et al. 2015), there has so far only been little research in developing fast, domain-independent HTN planners. Systems are either guided by instructional information, hard-coded in the domain model, use blind search, or heuristic search in the plan space, relying on quite simple heuristics (Bercher et al. Copyright c 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. 2017). Recently, a new HTN planning technique based on a bounded translation into classical planning as been proposed (Alford et al. 2016a). Most search processes of HTN planners consider both parts of the planning problem the hierarchy and the primitive action model separately and do not analyse interactions between them. This is especially true for heuristics, which have so far either been based solely on the hierarchy or solely on the primitive actions. The work by Alford et al. (2016a) provides an integration of both worlds. The planner FAPE (Dvorak et al. 2014) does this too (Bit-Monnot, Smith, and Do 2016), but restricts the decomposition hierarchy to be non-recursive, which greatly reduces the expressiveness of HTN planning. In this paper, we present a new technique for HTN planning based on a translation of the planning problem into a propositional satisfiability problem. It enables a unified view on both parts of the problem and leads to a significantly better performance compared to previous approaches. We restrict ourselves to the class of totally-ordered HTN planning, thereby easing the SAT formula s construction. This restriction is justified by various observations. First, we retain most of the expressive power of HTN planning and are still more expressive than STRIPS-based planning (H oller et al. 2014; H oller et al. 2016). Second, HTN domains often have a high degree of total order (Alford et al. 2016a), so that planners have even been specifically designed for this problem class (Nau et al. 1999; Alford, Kuter, and Nau 2009). Third, restricting models to be totally-ordered enables the creation of complete planning algorithms, as partially ordered HTN planning is undecidable (Erol, Hendler, and Nau 1996), while totally-ordered planning is EXPTIME-complete (Alford, Bercher, and Aha 2015). Finally, studying the structurally simpler class of totally-ordered HTN planning problems is a suitable first step towards creating a SAT-based planner for general HTN planning problems. The remainder of the paper is organised as follows. Next, we introduce the totally-ordered HTN formalism and survey related approaches. We then describe our planning algorithm tot SAT and present its empirical evaluation. HTN Planning In this paper, we use an adaptation of the formulation of HTN planning of Geier and Bercher (2011), in which we omit everything related to partial order. Hierarchical plan- The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18) ning distinguishes between two types of tasks: primitive and abstract. Primitive tasks a are identical to standard STRIPS actions: they can be executed directly and are specified by prec, add, and del lists with the usual state transition semantics. In contrast, abstract tasks must be refined through methods into primitive actions before they can be executed. To describe the allowed refinements, HTN planning problems use task networks, which describe a partially ordered multiset of both primitive and abstract tasks. In the totallyordered case a task network is merely a sequence of tasks. Definition 1. Given a set of tasks T, a task network tn is any element of T . We define tn(i) to be the ith task of tn. A totally-ordered HTN planning problem is defined as: Definition 2. A totally-ordered HTN planning problem is a 6-tuple P = (L, C, O, M, c I, s I), s.t. L is a finite set of proposition symbols C is a finite set of abstract (or compound) tasks O is a finite set of primitive actions (or operators) M C (C O) is a set of decomposition methods c I C is the initial abstract task s I L is the initial state. We define M(t) = {(t, tn) | (t, tn) M} to be the set of methods for the abstract task t and M(T) = t T M(t). Decomposition methods are essentially rules that describe how an abstract task can be replaced by a sequence of (primitive or abstract) tasks, defined as follows. Definition 3. Let P = (L, C, O, M, c I, s I) be a planning problem, tn = ω1cω2 a task network where c C, and m = (c, ϕ) M a decomposition method. Then the application of m to tn results in the task network tn = ω1ϕω2, written tn m D tn . Likewise we write tn D tn if any such method exists, and tn D tn if any sequence of decompositions exists, which transforms tn into tn . We define a solution to an HTN planning problem as: Definition 4. Let P = (L, C, O, M, c I, s I) be a planning problem. A task network tn O is a solution to P if and only if c I D tn and tn is executable in s I. The set of all such task networks is defined as S(P). In this paper, we consider the problem of satisficing planning, i.e., the decision problem to decide whether a task network tn S(P) exists. We want to emphasise that the HTN part of the planning problem is much more than a mere heuristic guide to solve some (in its core) classical planning problem it is a restriction on the allowed plans. This additional way to restrict the model makes HTN planning both more expressive than classical planning and computationally more difficult. Our formalism differs from the one used by the HTN planning system SHOP (Nau et al. 1999) in that we do not allow for methods to have preconditions. This is not a restriction, as SHOP-style domains can be compiled into our formalism by introducing tasks with each method s precondition, which are ordered before all other tasks in their respective methods. Three of our evaluation domains contain method preconditions, which are compiled by the planner. The reduced formalism bears a striking similarity to context-free grammars (CFGs). In fact, the set of solutions S(P) can be described by a CFG and for all CFGs there is a planning problem generating the same language (H oller et al. 2014). In theory, the task of planning boils down to finding a word generated by a context-free grammar, which can be done in linear time. The planning problem is a compact representation of the corresponding grammar, as it defines the solutions as the intersection of two properties: being a valid decomposition and being executable. Constructing the grammar explicitly would lead to an exponential blow-up, which makes any such approach to planning useless. Related Work The idea of translating hierarchical planning problems into logical formulae is not entirely new. Mali and Kambhampati (1998) have proposed a translation of HTN planning problems into a SAT formula. Their notion of HTN planning significantly differs from the (by now) established HTN formalism, making their formula much simpler and structurally different from ours. First, their formalism allows inserting additional tasks and does not have an initial task, which makes it equivalent to STRIPS planning. Second, they allow task sharing (Alford et al. 2016b), which leads both to significantly shorter or even new solutions. Consider a planning problem with a single abstract task A which decomposes into two instances of the primitive task a, which can only be executed once. Using standard HTN semantics this problem has no solution, while with task sharing it does have one of length one. Lastly, their technique is also far less powerful than ours, as it cannot handle recursive domains, which greatly reduces the expressive power of hierarchical planning. Such domains can always be translated into an equivalent STRIPS planning problem, which is not the case for general or totally-ordered domains (H oller et al. 2014). Dix, Kuter, and Nau (2003) have proposed an encoding of totally-ordered HTN planning into answer set programming (ASP), based on the mechanics of SHOP. Their empirical evaluation shows that the translated domain performs significantly worse than the standard SHOP algorithm (up to a factor of 1.000). We assume that this is due to the high expressive power needed for their models it requires numbers. Also, ASP is commonly solved by a translation into SAT, thus we deem a direct encoding to be more efficient. We have not included their approach in our evaluation as no code for their translation is publicly available and (as stated) its performance is worse than SHOP, which we did include. In recent work, we presented a SAT-based technique for verifying HTN plans (Behnke, H oller, and Biundo 2017), a task which was shown to be NP-complete (Behnke, H oller, and Biundo 2015). In plan verification, the objective is, given a sequence of actions, to determine whether this sequence can be obtained via decomposition from the initial abstract task. In theory, our proposed encoding could also be used as the basis for a planner, but its size and complexity make it impractical for that purpose. Also, the encoding we present here differs significantly from the verification encoding in that it represents decomposition in a tree-like structure rather than as a process iterating over sequences of actions. As noted, the problem of finding a plan for a totallyordered HTN planning problem is equivalent to finding a word of a CFL given in a highly compressed form. For this test, it would however be necessary to construct the respective grammar G explicitly, which would require to construct the state space explicitly which is not feasible. Due to the specific nature of our input, there has been no work in the formal languages community as far as we know. There has been work relating to the usage of SAT techniques when analysing properties of CFLs, like universality (Axelsson, Heljanko, and Lange 2008), but their techniques are not applicable here, as they need the expanded CFG as their input. Height-bounded Decomposition Trees Our translation of HTN planning problems into SAT formulae is based on the notion of decomposition trees (DTs), which were originally introduced by Geier and Bercher (2011). A DT is a witness that a task network is a solution to a given planning problem. We adapt their definition and simplify it for the case of totally-ordered domains as follows: Definition 5. Let P = (L, C, O, M, c I, s I) be an HTN planning problem. A valid totally-ordered decomposition tree T is a 4-tuple T = (V, E, α, β), where V are the nodes of a directed tree, whose edges are given by the function E : V V 1, mapping each node in the tree to a ordered list of vertices its children. Let r be the root-node of this tree. α : V C O assigns each inner node an abstract task and each leaf a primitive task. β : V M assigns each inner node a method. α(r) = c I for all inner nodes v V with β(v) = (c, tn) and children E(v) = c1, . . . , cn, it holds that c = α(v) and α(c1) . . . α(cn) = tn. The function E implicitly defines an order of all nodes of the tree. Two nodes a and b are ordered a b, if any only if for the lowest common ancestor c of a and b in the tree, the child a of c which is the ancestor of a occurs in E(c) before the respective ancestor b of b. The solution represented by a DT T are the tasks assigned to its leafs in the order implicitly induced by the function E. Geier and Bercher (2011) showed that for every solution tn S(P) a DT exists, whose leafs are assigned the tasks of tn in the correct order and vice versa. Thus, instead of generating a formula that describes all solutions, we can generate one that describes all possible DTs. This, however, is not possible in general, since the HTN can contain recursive methods, requiring to describe infinitely many DTs in one formula. We take the same approach as SAT-planning does for classical planning: Restrict the solution by some bound and increase the bound until a solution has been found. We propose to use the height of the DTs as this bound in con- 1Let X be the set of all sequences over X. trast to the plan length in classical planning2. I.e., given a height-bound K, we generate a SAT formula that describes all possible decomposition trees whose height is at most K. Here the question arises, which values should be tried for K? As the lower bound, we use the minimal height necessary to derive a task network solely containing primitive tasks from c I. Formally this height is the smallest K s.t. there exists an DT with height K whose leafs are assigned only primitive tasks. This bound can be computed inductively: For any primitive task it is 0, for every method it is the maximum of any contained task, and for any abstract task it is 1 plus the minimum over its decomposition methods. For cycles in the decomposition hierarchy, we can iterate until convergence. The following theorem provides an upper bound, showing that a planning procedure based on iteratively increasing the allowed height will terminate in finite time. Similar to classical planning, where there is also such an upper bound (2|L|), the determined bound is quite high, but plays no practical role, as solutions are usually found before that bound is reached. Theorem 1. Let P = (L, C, O, M, c I, s I) be an HTN planning problem. If S(P) = , then S(P) also contains a solution whose decomposition height is at most |C| 2|L| 2. Proof. Assume that S(P) is not empty, let tn be a solution with the minimal decomposition height, and K > |C| 2|L| 2 be this height. Then the decomposition tree of tn contains a path of length K. Label every node a in the tree with the states immediately before and after executing the primitive tasks resulting from a (i.e. the leafs below a). Since this path is longer than |C| 2|L| 2, at least two nodes will be labelled with the same abstract task and states. We can now remove the part of the tree between these two occurrences (including one of the two duplicated nodes). The resulting tree still represents a task network, which is a solution to P, as it only contains valid decompositions and the resulting task network will be executable. However the height of this path in the tree has decreased. If we repeat this process for all paths with length K, the resulting tree will represent a solution with a decomposition height