# bounding_quality_in_diverse_planning__fa7003ba.pdf

Bounding Quality in Diverse Planning

Michael Katz1, Shirin Sohrabi1, Octavian Udrea2

1 IBM Research, Yorktown Heights, NY, USA 2 Dataminr, New York, NY, USA michael.katz1@ibm.com, ssohrab@us.ibm.com, oudrea@dataminr.com

Diverse planning is an important problem in automated planning with many real world applications. Recently, diverse planning has seen renewed interest, with work that deﬁnes a taxonomy of computational problems with respect to both plan quality and solution diversity. However, despite the recent advances in diverse planning, the variety of approaches and the number of available planners are still quite limited, even nonexistent for several computational problems. In this work, we aim to extend the portfolio of planners for various computational problems in diverse planning. To that end, we introduce a novel approach to ﬁnding solutions for three computational problems within diverse planning and present planners for these three problems. For one of these problems, our approach is the ﬁrst one that is able to provide solutions to the problem. For another, we show that top-k and top quality planners can provide, albeit naive, solutions to the problem and we extend these planners to improve the diversity of the solution. Finally, for the third problem, we show that some existing diverse planners already provide solutions to that problem. We suggest another approach and empirically show it to compare favorably with these existing planners.

Introduction Diverse planning is an important problem in AI Planning with many practical applications that require generating multiple plans rather than one. Example applications include automated machine learning (Mohr, Wever, and H ullermeier 2018; Katz et al. 2020), risk management (Sohrabi et al. 2018), automated analysis of streaming data (Riabov et al. 2015), and malware detection (Boddy et al. 2005). Diverse planning is also important in the context of re-planning and plan monitoring (Fox et al. 2006), under-speciﬁed user preferences (Myers and Lee 1999; Nguyen et al. 2012), as well as plan recognition and its related applications (Sohrabi, Riabov, and Udrea 2016). In all these applications, the importance of generating multiple diverse plans is on par with being able to control solution quality, in part due to the way many of these applications are modelled. As a prominent example, in domains that exploit a variant of soft goals compilation (Keyder and Geffner 2009), plans of low quality (high cost) may choose to discard many otherwise reachable soft goals (e.g., (Katz et al. 2020)).

Copyright 2022, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

Most diverse planners developed over the last decade are focused on addressing a particular diversity metric. For example, while the diverse planner DLAMA focuses on ﬁnding a set of plans by considering a landmark-based diversity measure (Bryce 2014), other diverse planners such as LPG-d, DIV, DFAA/DFAM, and A AA/A AM focus on ﬁnding a set of plans with a particular minimum action distance (Nguyen et al. 2012; Coman and Mu noz-Avila 2011; Vadlamudi and Kambhampati 2016). Goldman and Kuter (2015) propose a diversity metric based on information retrieval literature. Roberts, Howe, and Ray (2014) suggest another diversity metric, introducing several planners, such as it A and MQA, which, in addition to the diversity metrics, consider plan quality. While all these planners implement the chosen diversity metric and switching to another metric is not trivial, the planners DFAA/DFAM and A AA/A AM work in two phases: ﬁnd a set of plans and choose a proper subset of the found set. That is, plan subset selection is independent of the diversity metric. In 2020, Katz and Sohrabi have proposed a set of planners FI-diverse that separate the phase of ﬁnding candidate plans from choosing a diverse subset of these plans (Katz and Sohrabi 2020). Further, they provide a tool for selecting a subset of plans for a variety of metrics and computational problems (Katz and Sohrabi 2019).

The work by Katz and Sohrabi has also introduced a taxonomy of computational problems and classiﬁed existing planners according to the problems they tackle. Most existing planners, according to that taxonomy, solve Satisﬁcing Diverse Planning (sat-k), where any sufﬁciently large set of plans, k, is a solution, and the aim is to improve solution diversity. The planners LPG-d (Nguyen et al. 2012) and b FI (Katz and Sohrabi 2020) tackle Bounded Diversity Diverse Planning (b D-k), where a set of plans, k, is a solution only if its diversity is above a certain speciﬁed bound b. Top-k planners (Katz et al. 2018b; Speck, Mattm uller, and Nebel 2020) and top-quality planners (Katz, Sohrabi, and Udrea 2020), while usually are not considered diverse planners, according to the aforementioned taxonomy, return, albeit naive, solutions to the Bounded Quality Diverse Planning (b Q-k) problem, where plan set quality is required to be above a speciﬁed bound. The planners DFAA/DFAM and A AA/A AM (Vadlamudi and Kambhampati 2016) tackle Bounded Quality and Diversity Diverse Planning (b Qb D-k), where both the quality and the diversity of plan sets is bounded.

The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

Despite these recent advances in diverse planning, the pool of existing tools is still quite limited. The planners DFAA/DFAM and A AA/A AM by Vadlamudi and Kambhampati (2016) are the only existing planners for b Qb D-k. Topquality planners (Katz, Sohrabi, and Udrea 2020; Katz and Sohrabi 2022), although technically solving b Q-k, do not tackle solution diversity. For Optimal Diversity Bounded Quality Diverse Planning (b Qopt D-k), where a solution corresponds to a set of plans of best diversity among the sets of bounded quality, no planners currently exist. In this work, we expand the pool of available planners for diverse planning. We introduce novel planners for the three aforementioned computational problems, b Q-k, b Qb D-k, and b Qopt D-k, exploiting the recently introduced top-quality planners. To that end, we introduce a novel quality metric that reﬂects bounded plan costs, making a connection between the costs of plans in a set and quality of a set of plans. Focusing on the most popular diversity metric (Nguyen et al. 2012; Coman and Mu noz-Avila 2011; Vadlamudi and Kambhampati 2016), for all three planners we generate a subset of all plans of bounded quality as a ﬁrst step. In the second step, we select a subset of plans found in the ﬁrst step that constitutes a solution to the respective computational problem. For b Q-k, as any sufﬁciently large subset of plans is, albeit naive, a solution to the problem, we extend these planners by using a previously suggested greedy algorithm to choose a subset of plans of higher diversity (Katz and Sohrabi 2020). For b Qb D-k, we show that the decision problem that corresponds to the second step is NP-complete and suggest using a previously proposed integer linear programming formulation. As the formulation was not previously formally described, we describe it here and prove that it provides a solution to b Qb D-k. For b Qopt D-k, as the optimization problem in second step is NP-hard, we propose using a novel mixed integer linear program to solve it. We formally describe the program and prove that its solution can be used for solving b Qopt D-k. Our approach is the ﬁrst one that is able to provide solutions to the problem. Finally, we perform an empirical evaluation of our proposed planners. For b Qb D-k, we compare favorably to the existing planners. For b Q-k, as no previous planners exist, we test the quality of our solution by comparing it to the quality of the optimal solution, obtained by our proposed planner for b Qopt D-k, where no previous planners exist either. We show that the greedy algorithm works very well on tested domains, producing results close to the optimum. Our novel contributions, thus, include (i) the introduction of the new quality metric that allows us to connect between the cost of plans and quality of a set of plans, (ii) a concrete algorithmic scheme that uses top quality planners for the ﬁrst step of ﬁnding a set of plans of bounded cost, (iii) computational complexity investigation of the second step, choosing a proper subset from the found set, for various computational problems, and (iv) introduction of the new mixed integer linear program for the resulting optimization problem.

Preliminaries and Related Work In this work we follow the notation of Katz and Sohrabi (2020). A SAS+ planning task (B ackstr om and Nebel 1995)

is given by a tuple V, A, s0, s , with V being a set of state variables and A being a ﬁnite set of actions. Each state variable v V has a ﬁnite domain dom(v) of values. A pair v, ϑ with v V and ϑ dom(v) is called a fact. A (partial) assignment to V is called a (partial) state. Often it is convenient to view a partial state p as a set of facts with v, ϑ p if and only if p[v] = ϑ. A partial state p is consistent with state s if p s. s0 is the initial state, and the partial state s is the goal. Each action a is a pair pre(a), eﬀ(a) of partial states called preconditions and effects. An action cost is a mapping C : A R0+. An action a is applicable in a state s if and only if pre(a) is consistent with s. Applying a changes the value of v to eﬀ(a)[v], if deﬁned. The resulting state is denoted by s Ja K. An action sequence π = a1, . . . , ak is applicable in s if there exist states s0, , sk such that (i) s0 = s, and (ii) for each 1 i k, ai is applicable in si 1 and si = si 1Jai K. We denote the state sk by s JπK. π is a plan iff π is applicable in s0 and s is consistent with s0JπK. We denote by P(Π) (or just P when the task is clear from the context) the set of all plans of Π. The cost of a plan π, denoted by C(π) is the summed cost of the actions in π. In regard to reasoning about sets of plans rather than individual plans, there are two main measures deﬁned on sets of plans, quality and diversity. Previous work has introduced one deﬁnition of quality, mirroring the International Planning Competition (IPC) quality metric for individual plans (Katz and Sohrabi 2020). Deﬁnition 1 Let P be the set of known plans of Π and let P P be a subset of plans. The relative quality of P with respect to P is deﬁned as

QP (P ) := 1 |P |

C(πi) C(π i),

where π1, . . . , π|P | and π 1, . . . , π |P | are plans in P and P , respectively, sorted in ascending order of their costs. The relative quality of a set of plans is always between 0 and 1, being 1 if and only if there is no plan in P \ P that is cheaper than any plan in P . Switching now our attention to diversity metrics, pairwise plan distance is deﬁned by δ(π, π ) = 1 sim(π, π ), where sim is a similarity measure, a value between 0 (unrelated) and 1 (equivalent). The diversity of a set of plans, D(P), P P is then deﬁned as some aggregation (e.g., min or average) of the pairwise distance within the set P. In this work, we focus on one of the most popular similarity measures, stability (Fox et al. 2006; Coman and Mu noz-Avila 2011). Stability similarity measures the ratio of the number of actions that appear on both plans to the total number of actions on these plans, referring to plans as action multisets (sets with repetitions). Given two plans π, π , it is deﬁned as simstability(π, π ) = |A(π) A(π )|/|A(π) A(π )|, where A(π) is the multi-set of actions in π. In what follows, by Dma we denote the diversity metric computed as minimum over the pairwise plan distance under stability similarity, the diversity metric implemented by multiple existing diverse planners (Nguyen et al. 2012; Coman and Mu noz Avila 2011; Vadlamudi and Kambhampati 2016).

There is a variety of computational problems that fall under the umbrella of diverse planning. In our work, focusing on bounded quality problems, we follow the recently introduced taxonomy (Katz and Sohrabi 2020). Deﬁnition 2 (Diverse planning solution) Let Π be a planning task and P be the set of all plans for Π. Given a natural number k, P P is a k-diverse planning solution if |P| = k or P = P if |P| < k. Deﬁnition 3 (Quality-bounded solution) Let Π be a planning task, Q be some quality metric, c be some bound, and P be the set of all Π s plans. Given a natural number k, P P is a c-quality-bounded k-diverse planning solution if it is a k-diverse planning solution and Q(P) c. Bounded Quality Diverse Planning computational problem is deﬁned as follows. b Q-k : Given k and c, ﬁnd a c-quality-bounded k-diverse planning solution. Deﬁnition 4 (Diversity-bounded solution) Let Π be a planning task, D be some diversity metric, b be some bound, and P be the set of all Π s plans. Given a natural number k, P P is a b-diversity-bounded k-diverse planning solution if it is a k-diverse planning solution and D(P) b. Bounded Quality and Diversity Diverse Planning computational problem is deﬁned as follows. b Qb D-k : Given k, b, and c, ﬁnd a c-quality-bounded and b-diversity-bounded k-diverse planning solution. Note that the deﬁnition above generalizes the previously suggested search problem described in Equation 1 below and implemented for the diversity metric Dma by Vadlamudi and Kambhampati (2016).

c COSTd DISTANTk SET : ﬁnd P with P P,

|P| = k, min π,π P δ(π, π ) d, C(π) c π P. (1)

Finally, Optimal Diversity Bounded Quality Diverse Planning optimization problem is deﬁned as follows. b Qopt D-k : Given k and c, ﬁnd a diversity-optimal among c-quality-bounded k-diverse planning solutions, where given a diversity metric D, a diversity optimal solution P is one that maximizes D(P). In the presence of 0-cost actions, the set of c-quality-bounded k-diverse planning solutions can sometimes be inﬁnite and might not even have a maximal element, making the optimization problem not well-deﬁned. In order to make the problem well-deﬁned, the set of candidate plans might be restricted in a way that will make the set of c-quality-bounded k-diverse planning solutions be ﬁnite. In what follows we suggest one such restriction. Let mπ(s) denote the number of instances of a state s traversed by the plan π. A plan π is called C-cycle limited if for all states s we have mπ(s) C + 1. Cycle-free plans (analogously to cycle-free paths in graphs) are therefore 0cycle limited. Since the number of states for any planning task is ﬁnite, given a constant C, a planning task Π has a ﬁnite number of C-cycle limited plans. Therefore, in what follows we restrict the set of all plans P to be the set of all C-cycle limited plans, for a large constant C.

Bounded Quality in Diverse Planning As stated above, in this work, we focus on the three computational problems in diverse planning taxonomy of Katz and Sohrabi (2020) that deal with bounded quality, b Q-k, b Qb D-k, and b Qopt D-k. Our proposed solutions to these three problems all have in common the ﬁrst step ﬁnding a set of plans of bounded quality. While existing planners for bounded quality diverse planning took the same approach (Vadlamudi and Kambhampati 2016), they used planners for the top-k planning problem (Riabov, Sohrabi, and Udrea 2014). Speciﬁcally, A AA/A AM apply the m-A algorithm (Flerova, Marinescu, and Dechter 2016), while DFAA/DFAM apply the well-known branch-and-bound algorithm. We suggest using a different approach, generating plans with a planner for a recently proposed unordered top-quality problem (Katz, Sohrabi, and Udrea 2020). Switching to top-quality allows to ensure that all plans of bounded cost are found. Unordered top-quality allows disregarding plans that are reorderings of the found plans. For some diversity metrics, this is highly beneﬁcial, with pairwise diversity between a plan and its reordering might be very low. Further, ignoring plan orders reduces the computational effort required for ﬁnding all plans of bounded cost. The second step, after ﬁnding a set of plans of bounded quality, is different for the different computational problems that we consider in this work. For b Q-k, although any set of k plans is a solution, we strive to obtain solutions of higher diversity. Therefore, we apply a greedy algorithm that iteratively increases the set of plans by adding at each step the candidate plan that increases the overall diversity score the most, the same algorithm that was used for satisﬁcing diverse planning (Katz and Sohrabi 2020). For b Qb D-k and b Qopt D-k, we cast the problem of ﬁnding the subset of plans as (mixed) integer linear programs. We describe these programs in detail in what follows. We start with the discussion of the quality metric that we consider in this work.

Quality Metric While we consider the planners DFAA/DFAM and A AA/A AM to be solving the Bounded Quality and Diversity Diverse Planning problem as deﬁned by Katz and Sohrabi (2020), the quality metric they maximize is not obvious. These planners consider the set P of plans of cost smaller or equal than a given absolute bound value c, or maxπ P C(π) c. Alternatively, the criterion can be expressed via Qa(P) c for the quality measure

maxπ P C(π) (2)

where c is the task s optimal plan cost and c = c

c [0, 1]. Thus, the planners above solve the b Qb D-k problem for the quality metric Qa as in Eq. 2. Note, this quality measure is different from the measure described in Deﬁnition 1, as introduced by Katz and Sohrabi (2020), where the quality of the set of plans is affected by the costs of all plans, not only by the most expensive ones. The above proposed quality metric makes it possible to connect the quality metric to the cost bound. Note also that while it is sufﬁcient to require

c to denote the cost of the cheapest known reference plan, in some cases such measure might not satisfy the independence of irrelevant alternatives (IIA) criteria (Seipp 2019). Thus, we require c to denote the task s optimal plan cost.

Bounded Quality Planning

Let us consider now the (unordered) top quality planners (Katz, Sohrabi, and Udrea 2020), that, given a multiplier qm 1, return the set of all plans P such that π P we have C(π) qm c , or a subset thereof with a single representative for plans that differ only in the order of their actions for the unordered case. For such sets, we have Qa(P) 1 qm , and therefore these planners can be used to derive solutions of bounded quality according to the quality metric Qa. Furthermore, top quality planners produce a set of plans that is a super-set of the sets of plans that constitute solutions to all three computational problems of interest, b Q-k, b Qb D-k, and b Qopt D-k. Therefore, in what follows, we will focus on ﬁnding subsets of plans out of a given set of plans, according to the relevant solution deﬁnition for the corresponding computational problem. Focusing ﬁrst on b Q-k, while any subset of required size of the set of plans returned by top quality planners is a solution to b Q-k, different subsets can vary signiﬁcantly in their diversity measure score. Since b Q-k does not pose any restrictions on these subsets beyond the desired size, one possible way of coming up with subsets of high diversity is to employ the same greedy selection algorithm that was used for Satisﬁcing Diverse Planning (Katz and Sohrabi 2020). The algorithm iteratively constructs a set of plans by greedily adding a plan that will contribute the most to already added plans.

Bounding Diversity

Switching now our attention to b Qb D-k, ﬁrst, note that for a set of plans and a number k, the decision problem of whether there exists a subset of bounded diversity of size k is NPcomplete. To show that, we ﬁrst formally deﬁne the decision problem. For simplicity, assume the diversity metric Dma and a rational constant bound d = x y (0, 1], for some integers x and y. For each such bound, we deﬁne a separate decision problem1 as follows.

Input: A set of plans P and an integer k. Prop: P has a subset P of size k s.t. Dma(P ) d.

The membership in NP is trivial. We show the hardness by a polynomial reduction from the clique problem (Garey and Johnson 1979). Given a graph G = (V, E) and an integer k, let N = x|V | and n = 2(y x)|V |. We construct a set of plans PG = {πv | v V } as follows. First, for each (u, v) E, actions au,v are added to both πu and πv. Second, we add unique actions ai v to the plan πv for all v V until we get |πv| = N. Third, we add n actions ai to all plans πv, v V . Since the size of each plan is linear in |V |, the reduction is indeed polynomial.

1Thus, x and y are not part of the input to the decision problem.

For (u, v) E we then have πu πv = {a1, . . . , an} and thus δ(πu, πv) = 1 n 2N+n. For (u, v) E, on the other hand, πu πv = {au,v, a1, . . . , an}, and therefore δ(πu, πv) = 1 n+1 2N+n 1 < 1 n 2N+n. To conclude our construction, we now show that 1 n 2N+n = d:

1 n 2N + n = 2N 2N + n = 2x|V | 2x|V | + 2(y x)|V | = x

Theorem 1 Given a set of plans PG and a number k, there is a subset of PG with diversity bounded by d of size k if and only if there is a clique in G of size k.

Proof: Let P PG be a subset of plans such that |P| = k and D(P) d. Then for all πu, πv P we have δ(πu, πv) = d. Thus, it must be the case that (u, v) E for all πu, πv P and thus the set V = {v | πv P} is a clique, of size |V | = |P|. Let V V be a clique in G of size k and let P = {πv | v V } PG be a subset of plans. Then, (a) |P| = k, and (b) for all πu, πv P we have (u, v) E and thus δ(πu, πv) = d, giving us D(P) = d.

Next, we describe the mixed integer linear program that is used for ﬁnding a subset of plans of bounded diversity. While the program is not novel,2 its description was not presented in previous work. Here, we describe the program in detail. Given a set of plans P and a bound on the diversity d, the variables are as follows. A binary variable vπ per plan π P, describing whether the plan is selected for the subset. The constraints are as follows. (i) π, π P, s.t. δ(π, π ) < d : vπ + vπ 1, stating that if the pairwise diversity of π and π is below d, then at most one of these plans can be selected for the subset, and (ii) P

π P vπ k, forcing the size of the subset be at least k.

The objective of the program is to minimize P π P vπ. In words, the program encodes a subset selection and restricts the selected subset to not have pairs of plans with diversity outside of the provided bound. In what follows, we prove that the program can be used for devising solutions for b Qb D-k.

Theorem 2 For a planning task Π with a set of all plans of bounded quality P such that |P| k, and a bound d, the binary program ﬁnds a subset of size k with the bounded by d diversity score, for diversity metrics maximizing minimal pairwise diversity, if such subset exists. Otherwise, the program is infeasible.

Proof: We ﬁrst show that a solution to b Qb D-k corresponds to a feasible assignment. Let P P be a solution to b Qb D-k for the bound d, with |P | = k. Then, let v assign 1 to plans π P and 0 otherwise. Since |P | = k, constraint (ii) holds. For π, π P, if either of the plans is

2The mixed integer linear program was previously used for bounded diversity diverse planning (Katz and Sohrabi 2020).

not in P , then vπ + vπ 1. If both plans are in P , then δ(π, π ) d. Thus, constraint set (i) holds for v and the program is feasible. Now, let v be some feasible solution and let P = {π P | vπ = 1} be the corresponding subset of P. Then, (a) from constraint (ii) we have |P | k, and (b) for all π, π P , since vπ + vπ = 2, we know that the corresponding constraint is not in the constraint set (i), and therefore δ(π, π ) d. Therefore, P (or any of its subset of size k) is a solution to b Qb D-k.

We now switch our attention to the next computational problem, b Qopt D-k.

Optimizing Diversity Due to the NP-completeness of the decision problem of selecting a subset of plans of bounded diversity, the corresponding optimization problem is NP-hard. To solve it efﬁciently, we encode it in mixed integer linear programming. We present a novel mixed integer linear program that we use to ﬁnd a subset of size k that optimizes the diversity metric. Given a set of plans P, we deﬁne the variables as follows.

A binary variable vπ per plan π P, describing whether the plan is selected for the subset, and a single continuous variable vd for bounding the pairwise diversity.

The constraints are as follows. (i) P π P vπ = k, stating that the size of the subset is exactly k, and (ii) π, π P : vd + vπ + vπ δ(π, π ) + 2, stating that d is bounded by the diversity of each chosen pair, if the pair is chosen. The objective of the program is then to maximize vd. In words, as in the previous case, the program encodes a subset selection, but in this case all subsets of size k correspond to valid assignments. We additionally have a continuous variable vd that is bounded by the diversity score of the selected subset. In the case of diversity metrics that correspond to minimal pairwise diversity, this would mean to require the variable vd to be bounded by the diversity of each selected pair. In other words, if a pair of plans is selected, then vd should be no greater than their diversity score. If a pair is not selected, there is no such restriction, but since there is a natural upper bound of 1 on the overall diversity, vd can be required to be upper bounded by any value that is larger or equal to 1. If at least one of vπ, vπ gets 0 assigned to it, the constraint vd + vπ + vπ δ(π, π ) + 2 is then satisﬁed. Therefore, the constraint is valid whether the variables vπ and vπ are assigned 0 or 1. In what follows, we prove that the program can be used for devising solutions for b Qb D-k.

Theorem 3 For a planning task Π with a set of all plans of bounded quality P such that |P| k, the mixed integer program ﬁnds a subset of size k with the optimal diversity score, for diversity metrics maximizing minimal pairwise diversity.

Proof: Let v, vd be a feasible assignment to the variables of the mixed integer program (such an assignment always

exists) and let P = {π P | vπ = 1} be the corresponding subset of P. Then, from the constraint set (i) we have |P | = k and from constraint set (ii) we have vd δ(π, π ) for all π, π P . Further, for a plan π P \P and a plan π P we have vd 1 + δ(π, π ), which does not pose additional constraints on the values of vd since all pairwise distances δ(π, π ) are upper-bounded by 1. Similarly, for π, π P \ P , we have vd 2 + δ(π, π ), which also does not pose additional constraints on the values of vd. Therefore we have vd δ(π, π ) for all π, π P and maximizing vd without changing v would lead to vd = minπ,π P δ(π, π ). Thus, the linear program ﬁnds a subset of size k with maximum minimal pairwise diversity.

Experimental Evaluation To empirically evaluate the feasibility of our suggested approach, we have implemented our diverse planners on top of the Diversity Score Computation component (Katz and Sohrabi 2019), using CPLEX v12.8.0 for solving the mixed integer linear programs. The code is available at https://github.com/IBM/diversescore. The experiments were performed on Intel(R) Xeon(R) CPU E7-8837 @2.67GHz machines, with the time and memory limit of 30min and 2GB, respectively. The benchmark set consists of all STRIPS benchmarks from optimal tracks of International Planning Competitions (IPC) 1998-2018, a total of 1797 tasks in 64 domains. For Bounded Quality and Diversity Diverse Planning (b Qb D-k), we compare to the existing planners for that computational problem DFAM and A AM (Vadlamudi and Kambhampati 2016). Since these planners are implemented for the diversity metric Dma, we focus our experimental evaluation on Dma, although our approach works with any metric. Further, since these planners require an absolute bound on the solution cost to be provided as a parameter, and in order to ensure that our quality metric Qa (Eq. 2) satisﬁes IIA (Seipp 2019), we further restrict the benchmark set to tasks where optimal costs could be found with a state-of-the-art cost-optimal planner. For that, we used the 17 single cost-optimal planners from the portfolio of Delﬁ1 (Katz et al. 2018a). As a result, for the b Qb D-k computational problem, the benchmark set consists of 1192 tasks. As a ﬁrst step, we generate a set of plans of bounded quality. Focusing on Dma allows us to use unordered topquality planners (Katz, Sohrabi, and Udrea 2020) to derive all plans (modulo reorderings) of bounded cost. This is due to the fact that two plans that differ only in the order of their actions would produce pairwise diversity of 0 and thus any set of plans P that includes two such plans would get Dma(P) = 0. For other diversity metrics we might need to produce the set of all plans of bounded cost (Katz, Sohrabi, and Udrea 2020). Note that some top-k planners, such as K -based (Katz et al. 2018b) and symbolic search based (Speck, Mattm uller, and Nebel 2020) can be easily adapted to produce solutions for top-quality planning, modifying their stopping criteria. Further, these two planners can be rather naively adapted to produce unordered toq-quality solutions, by performing a duplicate check and skipping plans for which a reordering was previously found. Such a modiﬁcation can turn extremely beneﬁcial in practice, reducing

qm =1.00 qm =1.05 qm =1.10 qm =1.20

DFA A A K FI Sym DFA A A K FI Sym DFA A A K FI Sym DFA A A K FI Sym

DFA 26 39 14 19 24 40 13 18 30 47 16 22 35 50 22 21

A A 25 42 13 14 22 44 11 16 19 46 12 18 14 46 12 19

K 14 7 2 0 14 8 0 0 9 3 0 0 4 4 0 0

FI 37 37 53 22 37 38 54 26 35 39 55 27 30 39 55 28

Sym 41 42 50 30 42 42 52 25 39 40 51 24 35 39 51 24

Table 1: Domain-level performance, K -b Qb D, FI-b Qb D, and Sym-b Qb D vs. DFAM (DFA for short) and A AM (A A for short), k = 5, diversity bound 0.15, and four quality bounds.

the amount of disk write operations and the need for storing plans. In our experiments, we have performed the ﬁrst step with each of these three planners, namely FI, K , and Sym. We run these planners with a 29min time bound, to allow at least one minute for the second step. In all cases, the overall time bound for both steps is 30min. Further, to avoid generating a larger amount of plans, the overall bound on the number of generated plans for the ﬁrst step is set to 10000. Since b Qopt D-k requires the ﬁrst step to produce all plans of bounded quality, we report failure on the tasks where the bound on the number of plans was reached. As a second step, we select a subset of plans according to the computational problem of interest. For b Q-k, we use the greedy approach suggested by Katz and Sohrabi (2020). For b Qb D-k and b Qopt D-k, we solve a mixed-integer linear program, as described in the previous section. This results in three conﬁgurations for each problem of interest. Table 1 presents a pairwise comparison of our planners, K -b Qb D, FI-b Qb D, and Sym-b Qb D and the existing planners DFAM and A AM (Vadlamudi and Kambhampati 2016), for k = 5. Note, the preﬁxes FI, K , and Sym refer to the unordered top-quality planners used in the ﬁrst step while the sufﬁxes b Qb D, b Q, and b Qopt D refer to the computational problem. We use the diversity bound of 0.15 and experiment with four quality bounds, deﬁned by multipliers of the optimal plan cost, from qm = 1.0 (optimal plans only), to qm = 1.05, qm = 1.10, and to qm = 1.20 (up to 120% of the optimal plan cost). The results for these four quality bound multipliers are depicted in the four parts of the table. Each part compares the planners in rows to planners in columns in terms of overall coverage in a domain, computed as follows. A planner gets a coverage of 1 on a planning task if it was able to either ﬁnd a solution of size k or prove that no such solution exists. Otherwise, the planner gets coverage 0. The coverage of a domain is a sum over coverages of all tasks in the domain. Each entry depicts the number of domains, out of the total of 64 domains, where the planner in row achieves strictly larger coverage than the planner in column. Best results are highlighted in bold. For example, for qm = 1.0, Sym-b Qb D solves more tasks than DFAM in 41 domains, while DFAM solves more tasks than Sym-b Qb D in 19 domains. Consequently, they solve the same number of tasks in 4 domains. First, note that K -b Qb D performs quite poorly compared to all other techniques. We conjecture that our rather naive extension of the K -based planner results in an unordered top-quality planner that is not suf-

qm =1.00 qm =1.05 qm =1.10 qm =1.20 k DFA A A FI Sym DFA A A FI Sym DFA A A FI Sym DFA A A FI Sym 5 453 449 594 631 452 433 582 603 484 424 573 574 548 411 564 548 10 390 373 530 594 389 354 511 559 426 338 494 535 505 318 480 499 102 189 238 433 491 194 216 403 448 229 185 365 396 291 123 302 352 103 79 238 380 435 82 216 345 386 105 185 287 321 124 123 203 257

Table 2: Overall coverage, FI-b Qb D and Sym-b Qb D vs. DFAM and A AM. Diversity bound 0.15, four quality bounds.

ﬁciently competitive with the other two techniques. Therefore, we exclude K -b Qb D from further experiments. While both FI-b Qb D and Sym-b Qb D outperform the existing approaches in terms of the number of domains where they achieve superior performance, it is worth mentioning that for each of the approaches there are multiple domains where that approach excels. DFAM performs especially well in MICONIC, OPENSTACKS, and VISIT-ALL14, while A AM excels in MPRIME, MYSTERY, and SOKOBAN. To show how these planners scale with larger values of k, Table 2 presents aggregated overall coverage for the values of k = 5, 10, 100, 1000. Going deeper into the coverage results, note that DFAM does not prove unsolvability. A AM, on the other hand, for large values of k = 100 and k = 1000, did not ﬁnd any solutions, and all the instances reported in the table for A AM and these values of k correspond to unsolvable cases. For our suggested approach, both FI-b Qb D and Sym-b Qb D are able to cope with both, for any value of k. It is worth mentioning that the performance of DFAM often improves, sometimes signiﬁcantly, with larger quality bounds. We conjecture that this increase is due to the nature of the branch-and-bound algorithm, that does not necessarily produce plans in the order of their costs. Proving unsolvability, however, still requires to produce all plans of bounded cost and our experimental evaluation shows this to be challenging for DFAM, for larger values of k. Switching to b Q-k and b Qopt D-k, to evaluate the quality of the solution obtained using the greedy algorithm, we compare the diversity metric score of the subset chosen by FI-b Q (respectively, Sym-b Q) to the best possible score, obtained with the FI-b Qopt D (respectively, Sym-b Qopt D) planner. Figure 1 depicts the comparison for k = 5, for all four quality multipliers 1.0, 1.05, 1.1, and 1.2, for tasks where both planners were able to ﬁnd a solution, and the ﬁrst step produced at least k + 1 plans. Note that the greedy approach works surprisingly well. On these tasks, in most cases it has reached the optimum (nodes on the diagonal): 80 out of 121, 90 out of 126, 71 out of 105, and 50 out of 105 for FI-b Q and 106 out of 176, 99 out of 162, 67 out of 118, and 63 out of 129 for Sym-b Q (for the four quality multipliers, respectively). When it hasn t reached the optimum, the scores are still mostly below the y = x + 0.1 line. There are only 21, 18, 17, and 26 tasks for FI-b Q and 26, 23, 20, and 30 tasks for Sym-b Q for the four quality multipliers 1.0, 1.05, 1.1, and 1.2, respectively above the y = x + 0.1 line, and only 21 tasks for FI-b Q and 16 tasks for Sym-b Q in total for all quality multipliers above the y = x + 0.2 line. While our experiments show that the greedy approach of-

0 0.2 0.4 0.6 0.8 1 0

FI-b Qopt D

qm=1.00 qm=1.05 qm=1.10 qm=1.20

0 0.2 0.4 0.6 0.8 1 0

Sym-b Qopt D

qm=1.00 qm=1.05 qm=1.10 qm=1.20

Figure 1: Diversity score, greedy vs. optimal subset selection for k = 5, (a) FI-b Q vs. FI-b Qopt D, and (b) Sym-b Q vs. Sym-b Qopt D.

ten produces solutions of diversity close to optimum, the question remains how these algorithms compare in their run time. Figure 2 presents such run time comparison between the greedy and the optimal approaches. The greedy algorithm always ﬁnished in under 1.2 seconds, while solving mixed integer linear program takes signiﬁcantly longer on these tasks, up to 500 seconds for FI-b Qopt D and 1760 seconds for Sym-b Qopt D. Finally, note that an inherent limitation of our approach to solving b Qopt D-k is that the ﬁrst step must produce a solution to the (unordered) top quality planning problem. There is no such limitation when solving b Q-k. As a result, FI-b Q successfully solves b Q-k in 617, 617, 615, and 613 tasks for the four quality multipliers, while FI-b Qopt D solves b Qopt D-k in only 369, 333, 273, and 191 tasks. Similarly, Sym-b Q successfully solves b Q-k in 702, 675, 648, and 624 tasks for the four quality multipliers, while Sym-b Qopt D solves b Qopt D-k in only 379, 338, 262, and 196 tasks.

Discussion and Future Work

In this work, we extend the portfolio of existing tools for various computational problems in diverse planning by introducing three new such tools. We follow the recently introduced taxonomy and, focusing on b Qb D-k, map existing planners DFAA/DFAM and A AA/A AM to that problem. For that, we introduce a novel quality metric under which these planners can be considered to solve b Qb D-k. The metric also allows us to use top quality planners as a basis for our proposed planners, for b Qb D-k as well as for other computational problems, choosing a subset of plans from the solution for the top quality problem. We show that it is NP-complete to ﬁnd a solution to b Qb D-k, given the set of all plans of bounded cost and suggest using a previously proposed integer linear programming based approach, which is experimentally shown to favorably compete with existing planners. As the integer linear program was not previously detailed in the literature, we present it in detail and formally prove that it can be used for solving b Qb D-k. Switching from bounding to optimizing diversity, we suggest a novel mixed integer linear program and formally prove that this program solves b Qopt D-k. For another computational problem, b Q-k, we use

0 0.2 0.4 0.6 0.8 1 0

FI-b Qopt D

qm=1.00 qm=1.05 qm=1.10 qm=1.20

0 0.2 0.4 0.6 0.8 1 1.2 0

Sym-b Qopt D

qm=1.00 qm=1.05 qm=1.10 qm=1.20

Figure 2: Diversity score computation time, greedy vs. optimal subset selection for k = 5, (a) FI-b Q vs. FI-b Qopt D, and (b) Sym-b Q vs. Sym-b Qopt D.

an existing greedy approach of selecting a subset of plans, and empirically show that such a simple approach is able to often achieve the optimum in practice.

Our suggested approach is similar to the one of Vadlamudi and Kambhampati (2016), in that it is also separated into two steps: (i) ﬁnding a set of plans of bounded cost, and (ii) choosing a proper subset from the found set. There are two major differences. The ﬁrst one is the stopping criteria for step (i): while Vadlamudi and Kambhampati (2016) iterate until enough plans are found or no more plans exist, and can stop before ﬁnding all plans of bounded cost, we are using an existing (unordered) top-quality planner as is; thus, we will produce the set of all plans of bounded cost in step (i). While it is possible to adapt the top quality planner that we used to terminate earlier, we decided not to do so, to allow for easily replacing the top quality planner with a different one. The second difference is that instead of trying to construct a feasible solution during the execution of step (i), we perform step (ii) after the ﬁrst step is ﬁnished, as a post-processing. Further, instead of implementing a dedicated algorithm, we cast the problem of choosing the proper subset as an integer linear program, allowing us to use existing solvers. Thus, our solution is highly modular, allowing us to easily replace the solvers when better ones become available.

While there has been signiﬁcant progress in the ﬁeld of diverse planning recently, there are still several interesting computational problems for which no planners currently exist. For example, in this work we show how to optimize diversity when the set of candidate plans is given. However, if the quality restriction is alleviated, it is not clear how to choose a set of maximal diversity. It is not even clear whether all plans must be considered while searching for such a set. Another possible problem of interest is ﬁnding a subset of optimal quality among the bounded diversity ones. Focusing on these planning problems is an interesting research direction. Further, in this work we have shown the computational complexity of choosing a subset of plans of bounded diversity for the minimal pairwise stability metric, while for other metrics it remains an open problem.

References B ackstr om, C.; and Nebel, B. 1995. Complexity Results for SAS+ Planning. Computational Intelligence, 11(4): 625 655. Boddy, M.; Gohde, J.; Haigh, T.; and Harp, S. 2005. Course of Action Generation for Cyber Security Using Classical Planning. In Biundo, S.; Myers, K.; and Rajan, K., eds., Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling (ICAPS 2005), 12 21. AAAI Press. Bryce, D. 2014. Landmark-Based Plan Distance Measures for Diverse Planning. In Chien, S.; Fern, A.; Ruml, W.; and Do, M., eds., Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling (ICAPS 2014), 56 64. AAAI Press. Coman, A.; and Mu noz-Avila, H. 2011. Generating Diverse Plans Using Quantitative and Qualitative Plan Distance Metrics. In Burgard, W.; and Roth, D., eds., Proceedings of the Twenty-Fifth AAAI Conference on Artiﬁcial Intelligence (AAAI 2011), 946 951. AAAI Press. Flerova, N.; Marinescu, R.; and Dechter, R. 2016. Searching for the M Best Solutions in Graphical Models. Journal of Artiﬁcial Intelligence Research, 55: 889 952. Fox, M.; Gerevini, A.; Long, D.; and Serina, I. 2006. Plan Stability: Replanning versus Plan Repair. In Long, D.; Smith, S. F.; Borrajo, D.; and Mc Cluskey, L., eds., Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling (ICAPS 2006), 212 221. AAAI Press. Garey, M. R.; and Johnson, D. S. 1979. Computers and Intractability A Guide to the Theory of NP-Completeness. W. H. Freeman & Co. Goldman, R. P.; and Kuter, U. 2015. Measuring Plan Diversity: Pathologies in Existing Approaches and A New Plan Distance Metric. In Bonet, B.; and Koenig, S., eds., Proceedings of the Twenty-Ninth AAAI Conference on Artiﬁcial Intelligence (AAAI 2015), 3275 3282. AAAI Press. Katz, M.; Ram, P.; Sohrabi, S.; and Udrea, O. 2020. Exploring Context-Free Languages via Planning: The Case for Automating Machine Learning. In Beck, J. C.; Karpas, E.; and Sohrabi, S., eds., Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling (ICAPS 2020), 403 411. AAAI Press. Katz, M.; and Sohrabi, S. 2019. Diversity Score Computation for Diverse Planning. https://doi.org/10.5281/zenodo. 2691996. Katz, M.; and Sohrabi, S. 2020. Reshaping Diverse Planning. In Conitzer, V.; and Sha, F., eds., Proceedings of the Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI 2020). AAAI Press. Katz, M.; and Sohrabi, S. 2022. Who Needs These Operators Anyway: Top Quality Planning with Operator Subset Criteria. In Thi ebaux, S.; and Yeoh, W., eds., Proceedings of the Thirty-Second International Conference on Automated Planning and Scheduling (ICAPS 2022). AAAI Press. Katz, M.; Sohrabi, S.; Samulowitz, H.; and Sievers, S. 2018a. Delﬁ: Online Planner Selection for Cost-Optimal Planning. In Ninth International Planning Competition (IPC-9): planner abstracts, 57 64.

Katz, M.; Sohrabi, S.; and Udrea, O. 2020. Top-Quality Planning: Finding Practically Useful Sets of Best Plans. In Conitzer, V.; and Sha, F., eds., Proceedings of the Thirty Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI 2020). AAAI Press. Katz, M.; Sohrabi, S.; Udrea, O.; and Winterer, D. 2018b. A Novel Iterative Approach to Top-k Planning. In de Weerdt, M.; Koenig, S.; R oger, G.; and Spaan, M., eds., Proceedings of the Twenty-Eighth International Conference on Automated Planning and Scheduling (ICAPS 2018). AAAI Press. Keyder, E.; and Geffner, H. 2009. Soft Goals Can Be Compiled Away. Journal of Artiﬁcial Intelligence Research, 36: 547 556. Mohr, F.; Wever, M.; and H ullermeier, E. 2018. ML-Plan: Automated Machine Learning via Hierarchical Planning. Machine Learning, 107(8): 1495 1515. Myers, K. L.; and Lee, T. J. 1999. Generating qualitatively Different Plans through Metatheoretic Biases. In Proceedings of the Sixteenth National Conference on Artiﬁcial Intelligence (AAAI 1999), 570 576. AAAI Press. Nguyen, T. A.; Do, M. B.; Gerevini, A.; Serina, I.; Srivastava, B.; and Kambhampati, S. 2012. Generating diverse plans to handle unknown and partially known user preferences. Artiﬁcial Intelligence, 190: 1 31. Riabov, A. V.; Sohrabi, S.; Sow, D. M.; Turaga, D. S.; Udrea, O.; and Vu, L. H. 2015. Planning-Based Reasoning for Automated Large-Scale Data Analysis. In Brafman, R.; Domshlak, C.; Haslum, P.; and Zilberstein, S., eds., Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling (ICAPS 2015), 282 290. AAAI Press. Riabov, A. V.; Sohrabi, S.; and Udrea, O. 2014. New Algorithms for The Top-K Planning Problem. In ICAPS 2014 Scheduling and Planning Applications wo RKshop, 10 16. Roberts, M.; Howe, A. E.; and Ray, I. 2014. Evaluating Diversity in Classical Planning. In Chien, S.; Fern, A.; Ruml, W.; and Do, M., eds., Proceedings of the Twenty-Fourth International Conference on Automated Planning and Scheduling (ICAPS 2014), 253 261. AAAI Press. Seipp, J. 2019. Planner Metrics Should Satisfy Independence of Irrelevant Alternatives. In ICAPS 2019 Workshop on the International Planning Competition (WIPC), 40 41. Sohrabi, S.; Riabov, A. V.; Katz, M.; and Udrea, O. 2018. An AI Planning Solution to Scenario Generation for Enterprise Risk Management. In Proceedings of the Thirty-Second AAAI Conference on Artiﬁcial Intelligence (AAAI 2018), 160 167. AAAI Press. Sohrabi, S.; Riabov, A. V.; and Udrea, O. 2016. Plan Recognition as Planning Revisited. In Kambhampati, S., ed., Proceedings of the 25th International Joint Conference on Artiﬁcial Intelligence (IJCAI 2016), 3258 3264. AAAI Press. Speck, D.; Mattm uller, R.; and Nebel, B. 2020. Symbolic Top-k Planning. In Conitzer, V.; and Sha, F., eds., Proceedings of the Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI 2020). AAAI Press. Vadlamudi, S. G.; and Kambhampati, S. 2016. A Combinatorial Search Perspective on Diverse Solution Generation. In Schuurmans, D.; and Wellman, M., eds., Proceedings of the Thirtieth AAAI Conference on Artiﬁcial Intelligence (AAAI 2016), 776 783. AAAI Press.