# reshaping_diverse_planning__623a41d0.pdf

The Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI-20)

Reshaping Diverse Planning

Michael Katz, Shirin Sohrabi IBM T.J. Watson Research Center 1101 Kitchawan Rd, Yorktown Heights, NY 10598, USA

The need for multiple plans has been established by various planning applications. In some, solution quality has the predominant role, while in others diversity is the key factor. Most recent work takes both plan quality and solution diversity into account under the generic umbrella of diverse planning. There is no common agreement, however, on a collection of computational problems that fall under that generic umbrella. This in particular might lead to a comparison between planners that have different solution guarantees or optimization criteria in mind. In this work we revisit diverse planning literature in search of such a collection of computational problems, classifying the existing planners to these problems. We formally deﬁne a taxonomy of computational problems with respect to both plan quality and solution diversity, extending the existing work. We propose a novel approach to diverse planning, exploiting existing classical planners via planning task reformulation and choosing a subset of plans of required size in post-processing. Based on that, we present planners for two computational problems, that most existing planners solve. Our experiments show that the proposed approach signiﬁcantly improves over the best performing existing planners in terms of coverage, the overall solution quality, and the overall diversity according to various diversity metrics.

1 Introduction Many applications of planning require generating multiple plans rather than one. Some examples include malware detection (Boddy et al. 2005), automated analysis of streaming data (Riabov et al. 2015), and risk management (Sohrabi et al. 2018). Planners that produce multiple plans were also found useful in the context of re-planning and plan monitoring (Fox et al. 2006), user preferences (Myers and Lee 1999; Nguyen et al. 2012), as well as the engine for plan recognition and its related applications (Sohrabi, Riabov, and Udrea 2016). All these applications justify the need for ﬁnding a diverse set of plans while keeping quality in mind. Many diverse planners were developed over the last decade, each one focused on addressing a particular diversity metric. For example, while DLAMA focuses on ﬁnding a set of plans by considering a landmark-based diver-

Copyright c 2020, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

sity measure (Bryce 2014), LPG-d and DIV focus on ﬁnding a set of plans with a particular minimum action distance (Nguyen et al. 2012; Coman and Mu noz-Avila 2011). Goldman and Kuter (2015) propose a diversity metric based on information retrieval literature. Roberts, Howe, and Ray (2014) suggest another diversity metric, introducing several planners, such as it A and MQA, which, in addition to the diversity metrics, consider plan quality. Recently, Vadlamudi and Kambhampati (2016) suggested cost-sensitive diverse planners, ﬁrst ﬁnding all cost sensitive plans and then ﬁnding a diverse set of plans among these. Top-k planners (e.g., Katz et al. 2018b) or top-quality planners (Katz, Sohrabi, and Udrea 2020) can also be viewed as diverse planners, purely addressing the quality metric.

Despite the large number of existing tools and diversity metrics, there is no adopted collection of computational problems in diverse planning, making the comparison of different approaches challenging. Further, mixing quality and diversity creates an additional challenge for comparing various planners, especially if they have different optimality guarantees. Even for the same computational problem, planner comparison can be challenging. Every planner can have a different implementation of the same diversity metric, and many planners produce a collection of plans without specifying the metric used, or the solution value under that metric. To the best of our knowledge, there exists no external validation tool for a collection of plans, producing the solution value under a given diversity metric. Additionally, most of the diverse planning approaches compute the set of plans by repeatedly solving the same task. To obtain a different behavior, planner s heuristic guidance is modiﬁed to account for already found plans, with a speciﬁc focus on a particular metric. This requires (a) having an intimate familiarity with the way a particular planner works, and (b) creating a separate modiﬁcation for each metric. However, the outcome is not always as intended. Tweaking the heuristic function does not necessarily result in a different plan and planners have to discard many equal plans and repeat unnecessary iterations.

In this work, we address the computational problems in diverse planning as well as the diverse planner construction paradigm. Similarly to the separation in classical planning, we distinguish between optimal, bounded, and satisﬁcing

diverse planning and map the existing planners to their respective categories. We propose a new quality metric for a set of plans, measuring how close the plans are to the best subset of all known plans. We create an external validation tool for the metrics considered in this paper, allowing us to compute the diversity values of the solutions produced by existing planners. We introduce an alternative planner construction paradigm, a diverse planning algorithm that instead of modifying a planner, modiﬁes a planning task. Following the ideas of Katz et al. (2018b), we suggest reformulating the planning task after each iteration, forbidding sets of plans. Next, we post-process the found plans to derive a subset of plans of the required size, according to the given metric. Our approach, Forbid Iterative (FI), is not restricted to any planner and can exploit the recent advances in classical planning. To demonstrate this advantage, we experiment with one of the recent best-performing approaches to agile planning, heuristic novelty of the red-black planning heuristic (Katz et al. 2017; Katz, Hoffmann, and Domshlak 2013; Domshlak, Hoffmann, and Katz 2015), a core component for several participants of the recent International Planning Competition (IPC) 2018 (Katz et al. 2018a; Katz 2018). Based on this approach, we create planners for two of the introduced computational problems. We show that the same approach outperforms the dedicated planners built for speciﬁc metrics on these metrics and on their linear combinations, for both computational problems.

2 Preliminaries and Related Work A SAS+ planning task (B ackstr om and Nebel 1995) is given by a tuple V, A, s0, s , where V is a set of state variables, A is a ﬁnite set of actions. Each state variable v V has a ﬁnite domain dom(v). A pair v, ϑ with v V and ϑ dom(v) is called a fact. A (partial) assignment to V is called a (partial) state. Often it is convenient to view partial state p as a set of facts with v, ϑ p if and only if p[v] = ϑ. Partial state p is consistent with state s if p s. We denote the set of states of a planning task by S. s0 is the initial state, and the partial state s is the goal. Each action a is a pair pre(a), eﬀ(a) of partial states called preconditions and effects. An action cost is a mapping C : A R0+. An action a is applicable in a state s S if and only if pre(a) is consistent with s. Applying a changes the value of v to eﬀ(a)[v], if deﬁned. The resulting state is denoted by s a . An action sequence π = a1, . . . , ak is applicable in s if there exist states s0, , sk such that (i) s0 = s, and (ii) for each 1 i k, ai is applicable in si 1 and si = si 1 ai . We denote the state sk by s π . π is a plan iff π is applicable in s0 and s is consistent with s0 π . We denote by P(Π) (or just P when the task is clear from the context) the set of all plans of Π. The cost of a plan π, denoted by C(π) is the summed cost of the actions in the plan. The pairwise plan distance is δ(π, π ) = 1 sim(π, π ), where the similarity measure sim is between 0 (unrelated) and 1 (equivalent). The diversity of a set of plans, D(P), P P is then deﬁned as some aggregation (e.g., min or average) of the pairwise distance within the set P. While some domain-dependent similarity measures exist (e.g., Myers and Lee 1999; Coman and Mu noz-Avila 2011), recent re-

search has focused on domain-independent measures, comparing plans based on their actions, states, causal links, or landmarks (Nguyen et al. 2012; Bryce 2014). Stability similarity (inverse of the plan distance (Fox et al. 2006; Coman and Mu noz-Avila 2011)) measures the ratio of the number of actions that appear on both plans to the total number of actions on these plans, referring to plans as action sets, ignoring repetitions. Given two plans π, π , it is deﬁned as simstability(π, π ) = |A(π) A(π )|/|A(π) A(π )|, where A(π) is the set of actions in π. Uniqueness similarity (Roberts, Howe, and Ray 2014) is another measure that considers plans as action sets. It measures whether two plans are permutations of each other, or one plan is a partial plan (subset) of the other plan. State similarity measures similarity between two plans based on representing the plans as a sequence of states, where each state is a set of predicates. While there are multiple ways to deﬁne state similarity, we adapt the following deﬁnition from (Nguyen et al. 2012), modifying it based on use of similarity rather than distance between plans. Let (s0, s1, . . . , sk) and (s 0, s 1, . . . , s k ) be the sequences of states traversed by the plans π and π , respectively. Let Δ(s, s ) = |s s |/|s s | be the similarity between two states. Assuming k k, the state similarity measure is deﬁned as follows: simstate(π, π ) = k

i=1 Δ(si, s i) k. Note, each state sk +1, ..., sk is considered to not contribute to the similarity measure (i.e., zero is considered). The combination of the state and uniqueness measures address some of the major weaknesses of the stability measure raised by recent research (Goldman and Kuter 2015). As our focus in this work is not on metrics, we thus omit the landmark-based distance description (Bryce 2014). While there seems to be no widely adopted deﬁnitions of diverse planning problems, previous work has introduced some restrictions on the sets of plans that constitute a solution to diverse planning. Our work was inspired by the following two deﬁnitions (d and c are thresholds on the distance and plan cost, respectively). The variant introduced by Nguyen et al. (2012) requires the distance between every pair of plans in the solution to be of bounded diversity. Formally, the search problem is depicted as follows:

d DISTANTk SET : ﬁnd P with P P,

|P| = k, min π,π P δ(π, π ) d. (1)

Another variant, by Vadlamudi and Kambhampati (2016) extends the previous search problem by requiring each plan in the solution to be of bounded quality. Formally:

c COSTd DISTANTk SET : ﬁnd P with P P,

|P| = k, min π,π P δ(π, π ) d, C(π) c π P. (2)

While Eq. 2 considers plan costs, Eq. 1 only considers the distance between plan pairs. Note that both deﬁnitions require ﬁnding k distinct plans. We denote the diversity of a set of plans P, computed as an average over the pairwise dissimilarity of the set P, under the similarity measures of stability, uniqueness, and state by

Da, Du, and Ds, respectively, dropping P for readability. Also, Dma denotes the diversity metric computed as minimum over the pairwise stability dissimilarity.

3 Quality Metric While most work in diverse planning focused on the diversity metrics, not much was done for quality metrics. One possible quality metric is based on the summed cost of plans. To normalize this value, as with the International Planning Competition (IPC) quality metric for individual plans, we can divide the best known solution value by the value of the given planner. One downside of such a metric is that a single plan can have a large effect on the overall quality. For example, a set of plans, where all plans are optimal except for one, of a much higher cost, may get a quality score worse than a set where all plans are not optimal. Thus, we suggest a quality metric that will allocate a score to each plan in the set, aggregating these scores into a single value. Given n diverse planners, let P = P1 P2... Pn be the set of all plans found by these planners. Let π1, . . . πk be k plans with the lowest cost, ordered by their cost from smallest to largest and let ci = C(πi). For a planner j, the quality of the solution Pj is measured relatively to the best known k plan costs c1, . . . , ck as follows. Let πj 1, . . . πj k be an ordering of plans in Pj according to their costs and let cj i = C(πj i ). The quality metric is deﬁned as follows.

ci cj i . (3)

Note that cj i ci, since Pj P, and thus πj i has at least i 1 plans of no larger cost in P. Thus, each sum component is between 0 and 1, and thus the whole score is a value between 0 and 1. Further, a solution Pj will get the score 1 if and only if it consists of k cheapest plans found by any planner. In other words, if there exists no plan in P \ Pj (found by any of the other planners) that is cheaper than a plan in Pj. The suggested metric is similar in spirit to the parsimony ratio (Roberts, Howe, and Ray 2014). The parsimony ratio is deﬁned as s(πk, πl) = |πk|/|πl|, where for each πl (|πl| = l) we need to ﬁnd an optimal plan, πk (|πk| = k), such that πk πl, k l. This can be challenging by itself, since it requires ﬁnding optimal plans. The parsimony ratio also only considers unit cost plans. Both these limitations do not exist in our suggested metric: it can handle general costs and the computation is relative to the set of known plans.

4 Diverse Planning Revisited In this section, we deﬁne a collection of computational problems in diverse planning for two optimization criteria, quality and diversity. Following previous deﬁnitions, depicted in Eqs. 1 and 2, we deﬁne a solution to a diverse planning problem as a set of plans of a required size. In contrast to previous deﬁnitions, in case there exist fewer plans than requested, the set of all plans is also considered to be a valid solution.

Deﬁnition 1 (Diverse planning solution) Let Π be a planning task and P be the set of all plans for Π. Given a nat-

ural number k, P P is a k-diverse planning solution if |P| = k or P = P if |P| < k. Restricting our attention to two optimization criteria, quality and diversity, let us introduce some terminology. We say that a solution is quality-optimal (diversity-optimal) if there exists no solution of better quality (diversity). In other words, given solution quality mapping Q (diversity mapping D), a solution P is quality-optimal (diversity-optimal) if for all solutions P we have Q(P ) Q(P) (D(P ) D(P)). Given a bound b, we say that a solution P is quality-bounded (diversity-bounded) if Q(P) b (D(P) b). For both quality and diversity, one could either strive to ﬁnd optimal or bounded solutions, or impose no restriction on solution quality. Unfortunately, these two optimization criteria can interfere with each other. Thus, in what follows, we deﬁne various search and optimization problems.

4.1 Satisﬁcing Diverse Planning We start with imposing no restrictions. Thus, the Satisﬁcing Diverse Planning problem can be deﬁned as follows.

sat-k : Given k, ﬁnd a k-diverse planning solution.

Note that the objective is to ﬁnd any set of k plans without any restrictions on either quality or diversity. This is the category under which most diverse planners fall (e.g., Bryce 2014; Roberts, Howe, and Ray 2014). To compare planners in this category, it is sufﬁcient to compare the quality and diversity of their solutions. Note, many of the satisﬁcing diverse planners incorporate the distance measure into their search and focus on ﬁnding diverse plans with respect to that particular distance measure in mind. Hence, while they may perform well for one diversity metric, they may do poorly in another one.

4.2 Bounded Diverse Planning Continuing now by restricting either quality or diversity by imposing a bound, we introduce a Bounded Quality (Diversity) Diverse Planning. We do that by restricting the set of feasible solutions. Deﬁnition 2 (Diversity-bounded solution) Let Π be a planning task, D be some diversity metric, b be some bound, and P be the set of all Π s plans. Given a natural number k, P P is a b-diversity-bounded k-diverse planning solution if it is a k-diverse planning solution and D(P) b. Deﬁnition 3 (Quality-bounded solution) Let Π be a planning task, Q be some quality metric, c be some bound, and P be the set of all Π s plans. Given a natural number k, P P is a c-quality-bounded k-diverse planning solution if it is a k-diverse planning solution and Q(P) c. Given the deﬁnitions above, we can now deﬁne the following search problems:

b D-k : Given k and b, ﬁnd a b-diversity-bounded k-diverse planning solution,

b Q-k : Given k and c, ﬁnd a c-quality-bounded k-diverse planning solution.

The search problem b D-k generalizes the deﬁnition in Eq. 1 by Nguyen et al. (2012), for a diversity score Dma deﬁned as the minimum over the pairwise stability dissimilarity. Note that this measure differs from Da, that averages over the pairwise stability dissimilarity. For bounded diverse planning, Dma dominates Da in the sense that solutions to the diversity-bounded diverse planning under Dma are necessarily solutions to the diversity-bounded diverse planning under Da with the same bound, but not the other way around. The planner LPG-d implements the approach of Nguyen et al. (2012), for a variant of Dma, where the stability similarity is computed over multisets, instead of sets. We denote this diversity metric by Dmma. Thus, LPG-d can be thought of as a diversity-bounded diverse planner for Dmma but not for any of the other metrics. Further, while LPG-d is a sound planner, it is not complete, since it can only add plans to the collection of previously found plans, and never reconsiders the decision to add a plan. Thus, in principle LPG-d might not be able to ﬁnd a solution to the diversitybounded diverse planning problem when a solution exists. For the search problem b Q-k, note that it can be solved by post-processing solutions to the top-quality planning problem (Katz, Sohrabi, and Udrea 2020). Restricting both quality and diversity results in an additional search problem, one we call Bounded Quality and Diversity Diverse Planning.

b Qb D-k : Given k, b, and c, ﬁnd a c-quality-bounded and b-diversity-bounded k-diverse planning solution.

The search problem b Qb D-k generalizes the deﬁnition in Eq. 2 by Vadlamudi and Kambhampati (2016), for diversity score that uses min as the aggregation method and quality score deﬁned as a maximum over the individual plan costs. It is worth noting here that in all these deﬁnitions, as in classical planning, if the bound is super-optimal, the search problem is considered to be unsolvable.

4.3 Optimal Diverse Planning Restricting now either the quality or diversity to be optimal, we deﬁne two optimization problems, Optimal Quality (Diversity) Diverse Planning.

opt Q-k : Given k, ﬁnd a quality-optimal k-diverse planning solution.

opt D-k : Given k, ﬁnd a diversity-optimal k-diverse planning solution.

Top-k planners (e.g., Riabov, Sohrabi, and Udrea 2014; Katz et al. 2018b) can be viewed as planners for opt Q-k, optimizing the quality metric Q =

π P C(π). To the best of our knowledge, there are no existing planners for the opt D-k optimization problem. In fact, it is not clear how to create such non-trivial planners, without the need to generate the set of all plans. If we further restrict the other optimization function, this results in additional optimization problems. The ﬁrst two

b Dopt Q-k b Qopt D-k opt Q-k opt D-k

Figure 1: Hierarchy between the computational problems.

are Optimal Quality Bounded Diversity Diverse Planning and Optimal Diversity Bounded Quality Diverse Planning, as follows.

b Dopt Q-k : Given k and b, ﬁnd a quality-optimal among b-diversity-bounded k-diverse planning solutions.

b Qopt D-k : Given k and c, ﬁnd a diversity-optimal among c-quality-bounded k-diverse planning solutions.

Note that the solutions to the optimization problems b Dopt Q-k and b Qopt D-k are relative to the restricted set of solutions as in Deﬁnitions 2 and 3, respectively. This means that a solution to, e.g., b Qopt D-k is not necessarily a solution to opt D-k. One possible way to obtain solutions to the b Qopt D-k optimization problem is by using a top-quality planner to generate a set of all plans of bounded quality and then select an optimal subset of size k from the generated set according to some diversity metric. We can further restrict a set of feasible solutions to quality-optimal (diversity-optimal) diverse planning solutions and choose the best according to the diversity (quality) metric among those. Instead, our last optimization problem we simply call Optimal Diverse Planning. The objective of optimal diverse planning is to ﬁnd a solution that is pareto-optimal, that is for all solutions P we have either Q(P ) Q(P) and for all P with Q(P) = Q(P ) we have D(P ) D(P) or D(P ) D(P) and for all P with D(P) = D(P ) we have Q(P ) Q(P). In words, optimal solutions are solutions on the pareto frontier of quality and diversity. We denote the optimization problem stated above by opt-k. The hierarchy between the presented computational problems is depicted in Figure 1. Edges represent solution set inclusion, i.e., whether a solution for one problem is necessarily a solution for another, assuming a solution exists. For example, a pareto-optimal solution is a solution to either opt D-k or opt Q-k, but not necessarily to either b Qopt D-k or b Dopt Q-k, since the latter two optimize over the solutions that are of bounded quality and diversity, respectively. The diagram does not reﬂect the transitive inclusion, which, in this case, means that solutions to all problems are solutions to the satisﬁcing diversity planning problem.

5 Satisﬁcing Diverse Planning with a Satisﬁcing Classical Planner Previous work has focused on modifying existing planners, either heuristic search or local search based ones, to come up with plans that differ from previously found ones. These modiﬁed planners were then applied to the same planning task, over and over again. We suggest a different approach, using possibly the same planner, iteratively modifying the planning tasks to forbid plan sets (Katz et al. 2018b). Below, we list some of the beneﬁts to such an approach: (1) it allows us to exploit any classical planner, effortlessly switching to better ones, taking advantage of the progress in classical satisﬁcing planning; (2) it removes the need for modifying the behaviour of the existing planners, allowing these planners to work as intended; (3) this allows us to take the selection of a subset of plans that is diverse according to a speciﬁc metric and postprocess them, thus also allowing us to deﬁne and use more sophisticated metrics.

5.1 Forbidding a Plan as a Multiset of Actions Existing literature suggests one such task reformulation, forbidding exactly the given set of plans (Katz et al. 2018b). This was done in the context of top-k planning, where plans could not be safely discarded from consideration. In satisﬁcing diverse planning, there is no such limitation. As a result, it is possible to forbid additional plans. One could envision a metric-dependent reformulation, forbidding also the plans that are similar according to the given metrics. With the stability metric in mind, we suggest a reformulation that ignores orders between actions in a plan and thus, also forbids all possible reorderings of a given plan. Below, we present the detailed description of such a reformulation.

Deﬁnition 4 Let V, A, s0, s be a planning task and X be a multiset of actions. The task Π X = V , A , s 0, s is deﬁned as follows. V = V {v} {va | a X}, with v being a binary variable, and dom(va) = {0, . . . , ma}, where ma is the number of occurences of a in X, A = {ae | a A \ X} {ar, ad | a X} ma i=1{af i | a X}, where

ae = pre(a), eﬀ(a) { v, 0 } , ar = pre(a) { v, 0 }, eﬀ(a) ,

ad = pre(a) { v, 1 , va, ma }, eﬀ(a) { v, 0 } ,

af i = pre(a) { v, 1 , va, i-1 }, eﬀ(a) { va, i } ,

C (ae)=C (ar)=C (ad)=C (af)=C(a), s 0[v] = s0[v] for all v V, s 0[v] = 1, and s 0[va] = 0 for all a X, and s [v]=s [v] for all v V s.t. s [v] deﬁned, and s [v]=0.

Let us explain the semantics of the reformulation in Definition 4. By Xπ we denote the multiset of actions in a plan π. The variable v starts from the value 1 and switches to 0 when an action is applied that is not from the multiset X = Xπ. Once a value 0 is reached indicating a deviation from plan π, it cannot be switched back to 1. Variables va

Algorithm 1 Iterative diverse planning scheme. Input: Planning task Π, number of diverse plans k, number of total plans for search phase K, diversity metric D P Π Π while |P| < K do π some solution to Π P P {π | π is symmetric to π} X

π P Xπ Π Π X according to Deﬁnition 4 end while return choose k diverse plans from P, according to D

encode the number of applications of the action a. The actions ar and ad are copies of the action a in X for the cases when π is already discarded from consideration (variable v has switched its value to 0) and for discarding π from consideration (switching v to 0), respectively. The latter happens if the action a was already applied as many times as it appears in X. af i are copies of the action a in X, counting the number of applications of a, as long as the number is not higher than the number of times it appears in X. These actions are applicable only while the plan is still followed. As mentioned above, ignoring plan reorderings sits well with the stability metric, but also with the uniqueness metric. For the state metric, note that although different reorderings of the same plan produce different sequences of states, these sequences will mostly be quite similar. Thus, we believe that it is more beneﬁcial to spend the time on ﬁnding additional plans that are set -different instead of ﬁnding additional reorderings of the found plans. Note that in principle we could do both, if time permits. When a set of plans is available, obtained, e.g., by applying structural symmetries (Shleyfman et al. 2015), one option would be to reformulate via a series of reformulations as in Deﬁnition 4. Another option is to forbid possibly more than just that set of plans by exploiting Deﬁnition 4 for forbidding a multiset of actions that is a superset of all plans in the set. In our implementation, we decided to follow the latter approach, depicted in Algorithm 1. Each iteration starts from the original task and forbids all plans found so far. In the last step, the algorithm selects a diverse subset of plans out of the set of plans found so far. In what follows, we discuss how such a selection can be done.

5.2 Selecting a Diverse Subset of Plans The idea of selecting a set of plans in a post-processing phase is not new. A basic ﬁltering and then clustering was performed over the set of plans for a top-k planning problem (Sohrabi et al. 2016; 2018). These approaches, however, may become time consuming when metric computation is computationally expensive. Hence, in this work, we instead apply a simple greedy algorithm, with a negligible computational overhead. We ﬁrst order the found plans by their cost. Then, going from the cheapest plans to the more expensive ones, we ﬁnd a pair of plans with the largest diversity score. Starting with the found pair of plans, we iteratively construct

the set by greedily choosing the next plan to add to the set, maximizing the diversity of the resulting set at that iteration step. We stop once the set reaches the requested size k. We note that the quality of the solution obtained by such an algorithm may be considerably improved. However, as we see next, even such a naive algorithm produces quite encouraging results.

6 Diversity-Bounded Diverse Planning

As previously mentioned, LPG-d as described by Nguyen et al. (2012) is a sound diversity-bounded diverse planner, although not complete. Similarly, our suggested approach can be used to produce a sound diversity-bounded diverse planner by post-processing the obtained plans differently. In general, such a post-processing procedure should ﬁnd a collection of plans that adhere to certain constraints and that often corresponds to solving an NP-hard computational problem. For Dmma, that corresponds to ﬁnding a clique of size at least k, for a graph over vertices that correspond to plans found during the search phase and edges that correspond to pairs of plans of stability dissimilarity of at least d. Such cliques can be found using, e.g., mixed-integer linear program tools. In what follows, we use binary variables, one for each graph vertex to encode whether the vertex is a part of the selected clique. For each pair of vertices that are not connected by an edge, at most one of these vertices can belong to a clique. Thus, we introduce a constraint stating that if there is no edge between two vertices, then the sum of the two corresponding binary variables cannot exceed 1. An additional constraint requires the sum of all binary variables to be greater or equal to k, the number of the requested plans. Thus, valid assignments to the binary variables correspond exactly to cliques of size at least k. As a result, any optimization criteria can be chosen. Here, we choose to minimize the size of the obtained clique, ﬁnding a clique of size exactly k. This is done by minimizing the sum of all variables. Note that, while it is not required by the diversity-bounded diverse planning problem, one can optimize other criteria while keeping the same set of constraints, e.g., maximizing the sum of pairwise stability measures.

7 Experimental Evaluation

In order to evaluate the feasibility of our suggested approach for deriving diverse sets of plans according to various existing metrics, we have implemented our approach on top of the Fast Downward planning system (Helmert 2006). Our planners, Forbid Iterative (FI) diverse planners are publicly available as part of the collection of Forbid Iterative planners (Katz, Sohrabi, and Udrea 2019). Further, we have implemented an external component, that given a set of plans and a metric returns the score of the set under that metric and made the code publicly available (Katz and Sohrabi 2019). We compared our approach for satisﬁcing diverse planning to existing satisﬁcing diverse planners, namely DLAMA planner (Bryce 2014), DIV (Coman and Mu noz Avila 2011), it A , RWS, MQAd, MQAs, MQAtd, and MQAts (Roberts, Howe, and Ray 2014), on state, stability, uniqueness, as well as a uniform linear combination over

FI DIV DLAMA it A* MQAts RWS

coverage 1143 95 178 611 615 51 Qc 1095.66 84.69 127.26 539.13 533.03 39.25 Da 736.88 33.65 123.07 271.77 322.17 30.62 Ds 585.34 45.01 96.35 200.58 229.62 25.32 Du 1093.70 53.1 176 527.7 539.4 41.20 Ds Da 640.46 39.33 108.6 236.17 275.9 27.97 Ds Du 837.18 49.06 136.19 364.14 384.51 33.26 Du Da 915.87 43.37 149.5 399.74 430.79 35.91 Da Du Ds 791.81 43.92 131.11 333.35 363.73 32.38

coverage 1113 1 133 422 430 31 Qc 1060.08 0.93 92.43 376.64 363 23.68 Da 681.08 0.48 88.79 191.66 219.3 21.38 Ds 534.93 0.52 71.32 164.53 175.29 15.84 Du 1054.53 1 132.62 353.76 365.4 28.91 Ds Da 590.98 0.5 79.38 178.1 197.29 18.61 Ds Du 792.08 0.76 101.92 259.14 270.34 22.37 Du Da 868.10 0.74 110.68 272.71 292.35 25.15 Da Du Ds 745.40 0.67 97.09 236.65 253.33 22.04

coverage 909 0 11 37 78 15 Qc 849.21 0 7.28 31.8 66.48 11.61 Da 492.55 0 6.89 22.12 51.24 11.53 Ds 404.63 0 5.78 17.9 34.09 7.70 Du 834.18 0 11 32.7 76.74 14.95 Ds Da 438.70 0 6.29 20.01 42.67 9.61 Ds Du 617.02 0 8.39 25.3 55.42 11.33 Du Da 661.18 0 8.94 27.41 63.99 13.24 Da Du Ds 569.55 0 7.86 24.24 54.02 11.39

coverage 552 0 0 0 0 7 Qc 543.14 0 0 0 0 5.96 Da 263.58 0 0 0 0 5.38 Ds 206.54 0 0 0 0 3.66 Du 490.44 0 0 0 0 7.00 Ds Da 233.82 0 0 0 0 4.52 Ds Du 348.47 0 0 0 0 5.33 Du Da 375.79 0 0 0 0 6.20 Da Du Ds 318.92 0 0 0 0 5.35

Table 1: Overall summed scores for various metrics, for k {5, 10, 100, 1000}. Da stands for stability, Ds for state, and Du for uniqueness diversity metrics. Qc stands for cost quality metric. Rows that correspond to a linear combination of diversity metrics are marked with all combined metrics.

all subsets of these metrics (seven diversity metrics overall), shown in Table 1. Our diversity-bounded diverse planner is compared to the only existing diversity-bounded diverse planner LPG-d (Nguyen et al. 2012), on the Dmma metric, varying the diversity parameter d to obtain values 0.15, 0.25, and 0.5 (see Table 2). We also varied the value of k, the number of required plans, for k {5, 10, 100, 1000}. To compare to all selected existing planners, we restrict our benchmark set to STRIPS domains with uniform action costs from the International Planning Competitions (IPC). This results in 1276 tasks in 40 domains. The experiments were performed on Intel(R) Xeon(R) CPU E7-8837 @2.67GHz machines, with time and memory limits of 30min and 2GB, respectively. Our suggested approach iteratively solves a planning task, ﬁnds a set of plans, and creates a new task that forbids a superset of the plans found so far. Considering plans as multisets, ignoring the order between the actions, this superset is deﬁned as the union

of all plans found so far. Thus, we forbid reorderings of found plans, but also, possibly additional plans, corresponding to a union of multiple found plans. We are restricting the number of found plans to 1000. For solving the (original and reformulated) classical planning tasks, we use an existing state-of-the-art agile planner. The planner that was chosen is MERWIN (Katz et al. 2018a). It performs a greedy best-ﬁrst search (GBFS), alternating between four queues, novelty of the red-black heuristic, landmark count, preferred operators from the red-black heuristic, and preferred operators from the landmark count heuristic. The conﬁguration has shown an exceptionally good performance on the IPC domains in our benchmark set (Katz et al. 2017). Note that while we report only results for MERWIN, we have also experimented with LAMA (Richter and Westphal 2010). The results were similar; therefore, we report here only the results for MERWIN. A minor restriction in our choice of an external planner is the ability to work directly on SAS+ representation, since our task reformulation is performed directly on SAS+ and results in a SAS+ task. While most state-of-the-art planners do work on the grounded SAS+ representation, in some cases, an adaptation might be required, since our implementation uses the input format of the Fast Downward planning system (Helmert 2006). For both satisﬁcing and diversity-bounded diverse planning, the ﬁnal solution is chosen from a larger set of plans in a postprocessing step. Focusing ﬁrst on satisﬁcing diverse planning, if the desired number of plans k is lower, we then greedily1 choose a subset of size k according to the given diversity metrics, as described in Section 5.2. Note that this can result in different subsets of plans chosen for different metrics. The algorithm is implemented as part of the external component (Katz and Sohrabi 2019). Each technique gets a score between 0 and 1 for each task and each metric, as described in previous sections. If not enough unique plans were found by some planner on a task, the planner gets the score of 0 for that task. Table 1 depicts the summed scores for all planners on all metrics, for various values of k, from 5 to 1000. For space reasons, we show only the best variant of MQA planner. First, note that our approach excels on all metrics, for both diversity and quality. This is due in part to a dramatically increased coverage, but that is not the only source of improved performance. As an evidence, for k = 5, even when averaging the diversity score by the coverage, our approach outperforms all competitors except for DLAMA on all diversity metrics considered. For the quality metric, we dramatically improve over all competitors, even when averaging over the solved instances. Note that while for smaller k values there are several techniques that are somewhat comparable in their performance to ours, larger k values seem to be challenging for most tested techniques. Moving now to diversity-bounded diverse planning, we increased the bound on the number of plans found in the ﬁrst phase to 2000, to give the planner some choice for k = 1000. The solution is obtained by solving the binary linear program, as described in Section 4.2 with the CPLEX solver in

1We have experimented with exact techniques, based on mixedinteger linear programs, but found them to be prohibitively slow.

0.15 0.25 0.5 k b FI LPG-d Dom b FI LPG-d Dom b FI LPG-d Dom 5 1011 675 28/6 974 671 25/8 890 652 24/10 10 946 632 26/7 912 623 27/9 771 586 25/10 100 569 532 17/13 454 517 15/15 213 433 8/16 1000 152 396 7/17 80 359 3/18 3 234 1/15

Table 2: Comparison of bounded-diversity score (total number of solved tasks) for k=5, 10, 100, and 1000 for the stability metric (Dmma). Best results are bolded. Dom shows # of domains with superior performance for b FI/LPG-d.

its default conﬁguration. The code is available as part of the external component (Katz and Sohrabi 2019). While in general these programs have up to 2K binary variables and up to 4M constraints, we observe that the run time of the solver is rarely above 10 seconds, with the peak reaching 47 seconds. If the binary linear program was solved by the solver (feasible solution found), the planner gets 1, and otherwise (infeasible) 0. We post-process the set of plans from both our approach and LPG-d in the same way. Table 2 shows the overall summed scores over all instances, as well as the number of domains where each approach exhibits superior performance. As a reminder, our approach chooses k plans out of the found plans with Dmma above the given threshold. Thus, our approach has a clear disadvantage when there is little or no choice, as in the case of the largest k values in our experiment. For smaller k values (k = 5, 10), there is a clear advantage to our approach, for all tested bounds on Dmma.

8 Summary and Future Work

We have presented various diverse planning computational problems and classiﬁed the existing diverse planners with their respective problems. Key contributions of this paper include: (1) characterization of optimal, bounded, and satisﬁcing diverse planning problems; (2) introducing an external validation component for diverse planning; (3) addressing the satisﬁcing and bounded-diversity diverse planning problems by iteratively solving a modiﬁed planning task using existing classical planners, escaping the need to adapt a planner to each new diversity metric. We have empirically demonstrated the beneﬁts of using such an approach, considerably improving the state-of-the-art in satisﬁcing diverse planning and favorably competing with the state-of-the-art in bounded-diversity diverse planning. For future work, in satisﬁcing diverse planning, we intend to explore alternative ways of reformulating a planning task, aiming at tackling a speciﬁc diversity metric. For various optimal diverse planning computational problems, it is often not clear how to create a non-trivial planner for that problem at all. For example, an opt D-k optimization problem, requires generating a set of plans that is diversity-optimal. A naive solution might require generating all possible plans ﬁrst, which might be infeasible, especially in cases when the set of all plans is inﬁnite. Focusing on such planning problems is a promising research direction.

References B ackstr om, C., and Nebel, B. 1995. Complexity results for SAS+ planning. Computational Intelligence 11(4):625 655. Boddy, M.; Gohde, J.; Haigh, T.; and Harp, S. 2005. Course of action generation for cyber security using classical planning. In Biundo, S.; Myers, K.; and Rajan, K., eds., Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling (ICAPS 2005), 12 21. AAAI Press. Bryce, D. 2014. Landmark-based plan distance measures for diverse planning. In Chien, S.; Fern, A.; Ruml, W.; and Do, M., eds., Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling (ICAPS 2014), 56 64. AAAI Press. Coman, A., and Mu noz-Avila, H. 2011. Generating diverse plans using quantitative and qualitative plan distance metrics. In Burgard, W., and Roth, D., eds., Proceedings of the Twenty Fifth AAAI Conference on Artiﬁcial Intelligence (AAAI 2011), 946 951. AAAI Press. Domshlak, C.; Hoffmann, J.; and Katz, M. 2015. Red-black planning: A new systematic approach to partial delete relaxation. Artiﬁcial Intelligence 221:73 114. Fox, M.; Gerevini, A.; Long, D.; and Serina, I. 2006. Plan stability: Replanning versus plan repair. In Long, D.; Smith, S. F.; Borrajo, D.; and Mc Cluskey, L., eds., Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling (ICAPS 2006), 212 221. AAAI Press. Goldman, R. P., and Kuter, U. 2015. Measuring plan diversity: Pathologies in existing approaches and a new plan distance metric. In Proceedings of the Twenty-Ninth AAAI Conference on Artiﬁcial Intelligence (AAAI 2015), 3275 3282. AAAI Press. Helmert, M. 2006. The Fast Downward planning system. Journal of Artiﬁcial Intelligence Research 26:191 246. Katz, M., and Sohrabi, S. 2019. Diversity score computation for diverse planning. https://doi.org/10.5281/zenodo.2691996. Katz, M.; Lipovetzky, N.; Moshkovich, D.; and Tuisov, A. 2017. Adapting novelty to classical planning as heuristic search. In Barbulescu, L.; Frank, J.; Mausam; and Smith, S. F., eds., Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling (ICAPS 2017), 172 180. AAAI Press. Katz, M.; Lipovetzky, N.; Moshkovich, D.; and Tuisov, A. 2018a. Merwin planner: Mercury enchanced with novelty heuristic. In Ninth International Planning Competition (IPC9): planner abstracts, 53 56. Katz, M.; Sohrabi, S.; Udrea, O.; and Winterer, D. 2018b. A novel iterative approach to top-k planning. In de Weerdt, M.; Koenig, S.; R oger, G.; and Spaan, M., eds., Proceedings of the Twenty-Eighth International Conference on Automated Planning and Scheduling (ICAPS 2018). AAAI Press. Katz, M.; Hoffmann, J.; and Domshlak, C. 2013. Who said we need to relax all variables? In Borrajo, D.; Kambhampati, S.; Oddi, A.; and Fratini, S., eds., Proceedings of the Twenty-Third International Conference on Automated Planning and Scheduling (ICAPS 2013), 126 134. AAAI Press. Katz, M.; Sohrabi, S.; and Udrea, O. 2019. Forbid Iterative planners for top-k, top-quality, and diverse planning problems. https://doi.org/10.5281/zenodo.3246773.

Katz, M.; Sohrabi, S.; and Udrea, O. 2020. Top-quality planning: Finding practically useful sets of best plans. In Proceedings of the Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI 2020). AAAI Press. Katz, M. 2018. Cerberus: Red-black heuristic for planning tasks with conditional effects meets novelty heuristic and enchanced mutex detection. In Ninth International Planning Competition (IPC-9): planner abstracts, 47 51. Myers, K. L., and Lee, T. J. 1999. Generating qualitatively different plans through metatheoretic biases. In Proceedings of the Sixteenth National Conference on Artiﬁcial Intelligence (AAAI 1999), 570 576. AAAI Press. Nguyen, T. A.; Do, M. B.; Gerevini, A.; Serina, I.; Srivastava, B.; and Kambhampati, S. 2012. Generating diverse plans to handle unknown and partially known user preferences. Artiﬁcial Intelligence 190:1 31. Riabov, A. V.; Sohrabi, S.; Sow, D. M.; Turaga, D. S.; Udrea, O.; and Vu, L. H. 2015. Planning-based reasoning for automated large-scale data analysis. In Brafman, R.; Domshlak, C.; Haslum, P.; and Zilberstein, S., eds., Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling (ICAPS 2015), 282 290. AAAI Press. Riabov, A. V.; Sohrabi, S.; and Udrea, O. 2014. New algorithms for the top-k planning problem. In ICAPS 2014 Scheduling and Planning Applications wo RKshop, 10 16. Richter, S., and Westphal, M. 2010. The LAMA planner: Guiding cost-based anytime planning with landmarks. Journal of Artiﬁcial Intelligence Research 39:127 177. Roberts, M.; Howe, A. E.; and Ray, I. 2014. Evaluating diversity in classical planning. In Chien, S.; Fern, A.; Ruml, W.; and Do, M., eds., Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling (ICAPS 2014), 253 261. AAAI Press. Shleyfman, A.; Katz, M.; Helmert, M.; Sievers, S.; and Wehrle, M. 2015. Heuristics and symmetries in classical planning. In Proceedings of the Twenty-Ninth AAAI Conference on Artiﬁcial Intelligence (AAAI 2015), 3371 3377. AAAI Press. Sohrabi, S.; Riabov, A. V.; Udrea, O.; and Hassanzadeh, O. 2016. Finding diverse high-quality plans for hypothesis generation. In Kaminka, G. A.; Fox, M.; Bouquet, P.; H ullermeier, E.; Dignum, V.; Dignum, F.; and van Harmelen, F., eds., Proceedings of the 22nd European Conference on Artiﬁcial Intelligence (ECAI 2016), 1581 1582. IOS Press. Sohrabi, S.; Riabov, A. V.; Katz, M.; and Udrea, O. 2018. An AI planning solution to scenario generation for enterprise risk management. In Proceedings of the Thirty-Second AAAI Conference on Artiﬁcial Intelligence (AAAI 2018), 160 167. AAAI Press. Sohrabi, S.; Riabov, A. V.; and Udrea, O. 2016. Plan recognition as planning revisited. In Kambhampati, S., ed., Proceedings of the 25th International Joint Conference on Artiﬁcial Intelligence (IJCAI 2016), 3258 3264. AAAI Press. Vadlamudi, S. G., and Kambhampati, S. 2016. A combinatorial search perspective on diverse solution generation. In Proceedings of the Thirtieth AAAI Conference on Artiﬁcial Intelligence (AAAI 2016), 776 783. AAAI Press.