# complexity_results_in_epistemic_planning__cc81fb69.pdf

Complexity Results in Epistemic Planning

Thomas Bolander DTU Compute Tech. University of Denmark Copenhagen, Denmark

tobo@dtu.dk

Martin Holm Jensen DTU Compute Tech. University of Denmark Copenhagen, Denmark

martin.holm.j@gmail.com

Francois Schwarzentruber IRISA ENS Rennes Rennes, France

francois.schwarzentruber@ens-rennes.fr

Epistemic planning is a very expressive framework that extends automated planning by the incorporation of dynamic epistemic logic (DEL). We provide complexity results on the plan existence problem for multi-agent planning tasks, focusing on purely epistemic actions with propositional preconditions. We show that moving from epistemic preconditions to propositional preconditions makes it decidable, more precisely in EXPSPACE. The plan existence problem is PSPACE-complete when the underlying graphs are trees and NP-complete when they are chains (including singletons). We also show PSPACE-hardness of the plan veriﬁcation problem, which strengthens previous results on the complexity of DEL model checking.

1 Introduction

An all-pervading focus of artiﬁcial intelligence (AI) is the development of rational, autonomous agents. An important trait of such an agent is that it is able to exhibit goal-directed behaviour, and this overarching aim is what is studied within the ﬁeld of automated planning. At the same time, such goal-directed behaviour will naturally be conﬁned to whatever model of the underlying domain is used. In automated planning the domain models employed are formulated using propositional logic, but in more complex settings (e.g. multiagent domains) such models come up short due to the limited expressive power of propositional logic. By extending (or replacing) this foundational building block of automated planning we obtain a more expressive formalism for studying and developing goal-directed agents, enabling for instance an agent to reason about other agents. For the above reasons automated planning has recently seen an inﬂux of formalisms that are colloquially referred to as epistemic planning [Bolander and Andersen, 2011; L owe et al., 2011; Aucher and Bolander, 2013; Yu et al., 2013; Andersen et al., 2012]. Common to these approaches is that they take dynamic epistemic logic (DEL) [Baltag et al., 1998] as the basic building block of automated planning, which greatly surpasses propositional logic in terms of expressive power. Brieﬂy put, DEL is a modal logic with which we

can reason about the dynamics of knowledge. In the singleagent case, epistemic planning can capture non-deterministic and partially observable domains [Andersen et al., 2012]. An even more interesting feature of DEL is the inherent ability to reason about multi-agent scenarios, lending itself perfectly to natural descriptions of multi-agent planning tasks. In [Bolander and Andersen, 2011] it is shown that the plan existence problem (i.e. deciding whether a plan exists for a multi-agent planning task) is undecidable, and this remains so even when factual change is not allowed, that is, when we only allow actions that changes beliefs, not ontic facts [Aucher and Bolander, 2013]. Allowing for factual change, a decidable fragment is obtained by restricting epistemic actions to only have propositional preconditions [Yu et al., 2013] (in the full framework, preconditions of actions can be arbitrary epistemic formulas). The computational complexity of this fragment belongs to (d + 1)-EXPTIME for a goal whose modal depth is d [Maubert, 2014]. In this work we consider exclusively the plan existence problem for classes of planning tasks where preconditions are propositional (as in most automated planning formalisms) and actions are non-factual (changing only beliefs). We show this problem to be in EXPSPACE in the general case, but also identify fragments with tight complexity results. We do so by using the notion of epistemic action stabilisation [van Benthem, 2003; Miller and Moss, 2005; Sadzik, 2006], which allows us to put an upper bound on the number of times an action needs to be executed in a plan. This number depends crucially on the structural properties of the graph underlying the epistemic action. To achieve our upper bound complexity results we generalise a result of [Sadzik, 2006] on action stabilisation. We also tackle lower bounds, thereby showing a clear computational separation between these fragments. Our contributions to the complexity of the plan existence problem are summarised in Table 1 (second column from the left), where we ve also listed related contributions. The fragments we study have both a conceptual and technical motivation. Singleton epistemic actions correspond to public announcements of propositional facts, chains and trees to certain forms of private announcements, and graphs capture any propositional epistemic action. Possible applications of such planning fragments could e.g. be planning in games like Clue/Cluedo where actions can be seen as purely epistemic; or synthesis of protocols for secure communication (where

Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence (IJCAI 2015)

Types of epistemic actions Underlying graphs of actions

Non-factual, propositional preconditions

Factual, propositional preconditions

Factual, epistemic preconditions SINGLETONS NP-complete (Theorem 5.1)

PSPACE-hard [Jensen, 2014]

PSPACE-hard [Jensen, 2014]

NP-complete (Theorem 5.2) ? (open question) ? (open question)

TREES PSPACE-complete (Theorem 5.3, Theorem 5.4)

? (open question) ? (open question)

in EXPSPACE (Theorem 5.8)

in NONELEMENTARY [Yu et al., 2013]

Undecidable [Bolander and Andersen, 2011]

Table 1: Complexity results for the plan existence problem.

the goal speciﬁes who are allowed to know what). Technically our fragments increase in complexity as we loosen the restrictions put on the underlying graph. Some planning tasks might therefore be simpliﬁed during a preprocessing phase so that a better upper bound can be guaranteed. As a case in point, we can preprocess any planning task and replace each graph action with a tree action that is equivalent up to a predetermined modal depth k (by unravelling). Letting k be the modal depth of the goal formula we obtain an equivalent planning task that can be solved using at most space polynomial in k and the size of the planning task. In automated planning such preprocessing is often used to achieve scalable planning systems. Our results also allow us to prove that the plan veriﬁcation problem (a subproblem of DEL model checking) is PSPACEhard, even without non-deterministic union, thereby improving the result of [Aucher and Schwarzentruber, 2013]. Sections 2 and 3 present the core notions from epistemic planning. In Section 4 we improve the known upper bounds on the stabilisation of epistemic actions. This is put to use in Section 5 where we show novel results on the complexity of the plan existence problem. Section 6 presents our improvement to the plan veriﬁcation problem, before we conclude and discuss future work in Section 7.

2 Background on Epistemic Planning For the remainder of the paper we ﬁx both an inﬁnitely countable set of atomic propositions P and a ﬁnite set of agents Ag.

2.1 Dynamic epistemic logic Deﬁnition 2.1 (Epistemic models and states). An epistemic model is a triple M = (W, R, V ) where the domain W is a non-empty set of worlds; R : Ag 2W W assigns an epistemic (accessibility) relation to each agent; and V : P 2W assigns a valuation to each atomic proposition. We write Ra for R(a) and w Rav for (w, v) Ra. We often write W M for W, RM a for Ra and V M for V . For w W, the pair (M, w) is called an epistemic state whose actual world is w. (M, w) is ﬁnite when W is ﬁnite. Epistemic states are typically denoted by symbols such as s and s0. The language of propositional logic over P is referred to as LProp, or sometimes simply the propositional language.

Deﬁnition 2.2 (Propositional action models and epistemic actions). A propositional action model is a triple A = (E, Q, pre) where E is a non-empty and ﬁnite set of events called the domain of A; Q : Ag 2E E assigns an epistemic (accessibility) relation to each agent; and pre : E LProp assigns a precondition of the propositional language to each event. We write Qa for Q(a) and e Qaf for (e, f) Qa. We often write EA for E, QA a for Qa and pre A for pre. For e E, the pair (A, e) is called an epistemic action whose actual event is e. Epistemic actions are typically denoted α, α , α1, etc.

Propositional action models are deﬁned to ﬁt exactly our line of investigation here, though other presentations consider preconditions of more complex languages and postconditions that allow for factual (ontic) change [Bolander and Andersen, 2011; Yu et al., 2013]. The dynamic language LD is generated by the BNF:

ϕ := p | ϕ | (ϕ ϕ) | aϕ | α ϕ

where a Ag, p P and α is an epistemic action. Here a denotes the knowledge (or, belief) modality where aϕ reads as a knows (or, believes) ϕ , and α is the dynamic modality where α ϕ reads as α is applicable and ϕ holds after executing α . The epistemic language LE is the sublanguage of LD that does not contain the dynamic modality. As usual we use aϕ := a ϕ, and deﬁne by abbreviation , and the boolean connectives , , . Lastly, we deﬁne α 0ϕ := ϕ and α kϕ := α ( α k 1ϕ) for k > 0.

Deﬁnition 2.3 (Semantics). Let (M, w) be an epistemic state where M = (W, R, V ). For a Ag, p P and ϕ, ϕ LD we inductively deﬁne truth of formulas as follows, omitting the propositional cases:

(M, w) |= aϕ iff (M, v) |= ϕ for all w Rav (M, w) |= α ϕ iff (M, w) |= pre(e) and (M A, (w, e)) |= ϕ

where α = (A, e) is an epistemic action s.t. A = (E, Q, pre), and the epistemic model M A = (W , R , V ) is deﬁned via the product update operator by:

W = {(v, f) W E | (M, v) |= pre(f)}, R a = {((v, f), (u, g)) W W | v Rau, f Qag}, V (p) = {(v, f) W | v V (p)} for p P.

For any epistemic state s = (M, w) and epistemic action α = (A, e) satisfying (M, w) |= pre A(e), we deﬁne s α = (M A, (w, e)). The epistemic state s α represents the result of executing α in s. Note that we have s |= α ϕ iff (M, w) |= pre A(e) and s α |= ϕ. Two formulas ϕ, ϕ of LD are called equivalent (written as ϕ ϕ ) when s |= ϕ iff s |= ϕ for every epistemic state s.

Example 2.4. Consider the epistemic state s1 of Figure 1. It represents a situation where p holds in the actual world (w), but where the two agents, a and b, don t know this: s1 |= p ap bp. Consider now the epistemic action α1 = (A, e) of the same ﬁgure. It represents a private announcement of p to agent a, that is, agent a is told that p holds (the actual event, e), but agent b thinks that nothing is

s1 a, b a, b a, b

α1 a a, b b (w, e): p

Figure 1: (Top left) An epistemic state s1. We mark each world (circle) with its name and the atomic propositions that are true. The actual world is coloured black. Edges show epistemic relations of the agents. (Bottom left) An epistemic action α1. We use the same conventions as for epistemic states, except an event (square) is marked by its name and its precondition. (Right) The epistemic model to the right is the result of execution of α1 in s1, that is, s1 α1.

happening (event f). The dynamic modality allows us to reason about the result of executing α1 in s1, so for instance we have s1 |= α1 ( ap bp b ap): After agent a has been privately informed about p, she will know it, but b will still not know p, and will believe that a doesn t either. This fact can be veriﬁed by observing that ap bp b ap is true in the epistemic state s1 α1 of Figure 1.

2.2 Plan existence problem Deﬁnition 2.5 (Planning tasks). An (epistemic) planning task is a triple T = (s0, L, ϕg) where s0 is a ﬁnite epistemic state called the initial state, L is a ﬁnite set of epistemic actions called the action library and ϕg LE is called the goal. A plan for T is a ﬁnite sequence α1, . . . , αj of epistemic actions from L s.t. s0 |= α1 αj ϕg. The sequence α1, . . . , αj can contain any number of repetitions, and can also be empty. We say that T is solvable if there exists a plan for T. The size of a planning task T = (s0, L, ϕg) is given as follows. Following [Aucher and Schwarzentruber, 2013], for any α = (A, e) in L we deﬁne |α| = |Ag| |EA|2 +P e E |pre(e)| as the size of α, where |pre(e)| denotes the length of the (propositional) formula pre(e). The size of an epistemic action is always a ﬁnite number, since the domain of any propositional action model and Ag are both ﬁnite. Let P P be the ﬁnite set of atomic propositions that occur either in some precondition of an α L or in ϕg. The size T is then |T| = |P | |Ag| |W M|2 + P α L |α| + |ϕg| where s0 = (M, w). Note that a plan is nothing more than a sequence of epistemic actions leading to a goal. It is not hard to show that this deﬁnition is equivalent to the deﬁnition of a solution [Aucher and Bolander, 2013] and an explanatory diagnosis [Yu et al., 2013], which are both special cases of a solution to a classical planning task as deﬁned in [Ghallab et al., 2004] (for the relation to classical planning tasks, see [Aucher and Bolander, 2013]). Example 2.6. Consider again Figure 1. We ll use α2 to refer to the private announcement of p to b, obtained simply by swapping the epistemic relations of a and b in α1. Consider the planning task T = (s, {α1, α2}, ϕg) with ϕg = ap

bp a bp b ap. It is a planning task in which the only available actions are private announcements of p to either a or b, and the goal is for both a and b to know p, but without knowing that the other knows. A plan for T is α1, α2, since s |= α1 α2 ϕg. In other words, ﬁrst announcing p privately to a and then privately to b will achieve the goal of them both knowing p without knowing that each other knows. Deﬁnition 2.7 (Plan existence problem). Let X denote a class of planning tasks. The plan existence problem for X, called PLANEX(X) is the following decision problem: Given a planning task T X, does there exists a plan for T?

3 Background on Iterating Epistemic Actions To get to grips on the plan existence problem, we now consider the result of iterating a single epistemic action. We then proceed to derive a useful characterisation of exactly when a planning task is solvable. Deﬁnition 3.1 (n-ary product). Let α = (A, e) be an epistemic action where A = (E, Q, pre). We denote by An = (En, Qn, pren) the n-ary product of A. We deﬁne E0 = {e}, e Q0 ae for each a Ag, and pre0(e) = . For n > 0 we deﬁne En = {(e1, . . . , en) | ei E for all i = 1, . . . , n}, Qn a = {((e1, . . . , en), (f1, . . . , fn)) | ei Qafi for all i = 1, . . . , n} for each a Ag, and pren((e1, . . . , en)) = V i=1,...,n pre(ei).

The n-ary product of α is deﬁned as αn = (An, en), where en denotes (e, e, . . . , e) | {z } n

This is not the standard deﬁnition of the n-ary product of an action model, which instead goes via a deﬁnition of the product update operator on action models. Deﬁnition 3.1 is equivalent to the standard deﬁnition when preconditions are of LProp. The following lemma is derived from the axiomatization of [Baltag et al., 1998] (relying in particular on action composition), and is here stated for the case of the n-ary product and utilising that preconditions are of LProp. Lemma 3.2. For any epistemic action α and any ϕ LE we have that α nϕ αn ϕ. This lemma expresses that executing an epistemic action n times is equivalent to executing its n-ary product once.

3.1 Bisimilarity and Stabilisation Concerning n-ary products of epistemic actions, an interesting case is when executing the n-ary product is equivalent to executing the (n + 1)-ary product. This puts an upper bound on the number of times the action needs to occur in a plan since epistemic actions with propositional preconditions commute [L owe et al., 2011]. To analyse this, we introduce notions of bisimulation and n-bisimulation on action models (slightly reformulated from [Sadzik, 2006]). Deﬁnition 3.3 (Bisimilarity). Two epistemic actions α = (A, e) and α = (A , e ) are called bisimilar, written α α , if there exists a (bisimulation) relation Z EA EA containing (e, e ) and satisfying for every a Ag:

[atom] If (f, f ) Z then pre A(f) pre A (f ), [forth] If (f, f ) Z and f QA a g then there is a g EA

such that f QA a g and (g, g ) Z, and [back] If (f, f ) Z and f QA a g then there is a g EA such that f QA a g and (g, g ) Z. Deﬁnition 3.4 (n-bisimilarity). Let α = (A, e) and α = (A , e ) be epistemic actions. They are 0-bisimilar, written α 0 α , if pre A(e) pre A (e ). For n > 0, they are nbisimilar, written α n α , if for every a Ag:

[atom] pre A(e) pre A (e ), [forth] If e QA a f then there is an f EA such that e QA a f and (A, f) n 1(A , f ), and [back] If e QA a f then there is an f EA such that e QA a f and (A, f) n 1(A, f ). The modal depth md(ϕ) of a formula ϕ is deﬁned as: md(p) = 0; md( ϕ) = md(ϕ); md(ϕ ψ) = max{md(ϕ), md(ψ)}; md( aϕ) = 1 + md(ϕ); md( α ϕ) = md(ϕ). As epistemic actions have only propositional preconditions, α -operators do not count towards the modal depth. This deﬁnition of modal depth, Lemma 3.5 and Deﬁnition 3.6 are all due to [Sadzik, 2006] (slightly reformulated). Lemma 3.5. Let α, α be two epistemic actions and ϕ LD. 1) If α α , then α ϕ α ϕ. 2) If md(ϕ) n and α n α , then α ϕ α ϕ. Deﬁnition 3.6 (Stabilisation). Let α be an epistemic action. 1) α is -stabilising at stage i if αi αi+k for all k 0. 2) α is n-stabilising at stage i if αi n αi+k for all k 0. Example 3.7. The 2-ary product α2 1 of α1 of Figure 1 is:

(e, e): p p (f, f):

(e, f): p a a, b

It is easy to check that α1 α2 1, using Z = {(e, (e, e)), (f, (f, f))}, This argument can be extended to show that α1 is indeed -stabilising at stage 1. Since any epistemic action is ﬁnite, we have: Lemma 3.8. If two epistemic actions are n-bisimilar for all n, then they are bisimilar.

3.2 Bounding the Number of Iterations We re now ready to present our characterisation of when a planning task is solvable. We note that Proposition 3.9 below echoes the sentiment of [Yu et al., 2013, Theorem 5.15], in that it states the conditions under which we can restrict the search space when looking for a plan. Proposition 3.9. Let T = (s0, {α1, . . . , αm}, ϕg) be a planning task and B N. Suppose one of the following holds: 1) Every αi is -stabilising at stage B, or

2) md(ϕg) = n and every αi is n-stabilising at stage B. Then T is solvable iff there exists k1, . . . , km B s.t. s0 |= αk1 1 αkm m ϕg.

Proof. Assume 2) holds (the case of 1 is similar). Assume T is solvable, and let αi1, . . . , aij be a plan for T. Due to commutativity of propositional action models so is any permutation of αi1, . . . , αij [Yu et al., 2013]. We therefore have s0 |= α1 k 1 αm k mϕg for some choice of k i 0. Using

Lemma 3.2, it follows that s0 |= αk 1 1 αk m m ϕg. We now let ki = min(k i, B) for all i. By assumption, md(ϕg) = n and so by deﬁnition md( α ϕg) = n for any epistemic action α. Combining this with the assumption that every αi is nstabilising at stage B ki, we apply 2) of Lemma 3.5 m times to conclude that s0 |= αk1 1 αkm m ϕg, as required. The proof of the other direction follows readily from Lemma 3.2 and the deﬁnition of α k.

Let T = {s0, L, ϕg} be a planning task with md(ϕg) = n. Given the proposition above, to show that T is solvable we only need to ﬁnd the correct number of times to iterate each of the actions in L, and these numbers never have to exceed B for actions that are n-stabilising at stage B. The following result, due to [Sadzik, 2006], shows that such a bound B exists for any epistemic action.

Lemma 3.10. Let α = (A, e) be an epistemic action and n a natural number. Then α is n-stabilising at stage |EA|n.

4 Better Bounds for Action Stabilisation

In this section, we prove an original contribution, Lemma 4.2, that generalises Sadzik s Lemma 3.10 by giving a better bound for action stabilisation. The overall point is this: Sadzik gets an unnecessarily high upper bound on when an epistemic action (A, e) stabilises by considering it possible that any event can have up to |EA| successors. We get a better bound by counting paths.

Deﬁnition 4.1 (Underlying graphs). Let (A, e) be an epistemic action. We deﬁne QA = a Ag QA a . The underlying graph of (A, e) is the directed graph (A, QA) with root e.

Let (A, e) denote an epistemic action. Note that (e, f) QA iff there is an edge from e to f in A labelled by some agent. Standard graph-theoretical notions carry over to epistemic actions via their underlying graphs. For instance, we deﬁne a path of length n in (A, e) as a path of length n in the underlying graph, that is, a sequence (e1, e2, . . . , en+1) of events such that (ei, ei+1) QA for all i = 1, . . . , n (we allow n = 0 and hence paths of length 0). A path of length n is a path of length at most n. A maximal path of length n is a path of length n that is not a strict preﬁx of any other path of length n. We use mpathsn(e) to denote the number of distinct maximal paths of length n rooted at e. If all nodes have successors, this number is simply the number of distinct paths of length n. Note that mpathsn(e) is always a positive number, as there is always at least one path rooted at e (even if e has no outgoing edges, there is still the path of length 0). Note also that for any n > 0 and any event e having at least one successor in the underlying graph:

mpathsn(e) = P

e QAf mpathsn 1(f). (1)

In the epistemic action α1 of Figure 1 we have mpaths2(e) = 3, since there are three paths of length 2, (e, e, e), (e, e, f) and (e, f, f), and no shorter maximal paths.

Lemma 4.2. Let α = (A, e0) be an epistemic action and n any natural number. Then α is n-stabilising at stage mpathsn(e0).

Proof. When f = (f1, . . . , fm) EAm and e A, we use occ(e, f) to denote the number of occurrences of e in f1, . . . , fm. For instance we have occ(e, (e, e, f, f)) = 2. We now prove the following property P(n) by induction on n.

P(n): If e EAk+1 and e EAk only differ by some event e occurring at least mpathsn(e )+1 times in e and at least mpathsn(e ) times in e , then (Ak+1, e) n(Ak, e ).

Base case P(0): Since mpaths0(e ) = 1, e and e as described above must contain exactly the same events (but not necessarily with the same number of occurrences). By deﬁnition of the n-ary product of an epistemic action we get pre Ak+1(e) pre Ak(e ). This shows (Ak+1, e) 0(Ak, e ). For the induction step, assume that P(n 1) holds. Given e and e as described in P(n), we need to show (Ak+1, e) n(Ak, e ). [atom] is proved as P(0). [forth]: Let a and f be chosen such that e QAk+1

a f. We need to ﬁnd f such that e QAk

a f and (Ak+1, f) n 1(Ak, f ). Claim. There exists an f such that e QA

a f and occ(f , f) mpathsn 1(f ) + 1. Proof of Claim. By contradiction: Suppose occ(f, f) mpathsn 1(f) for all f with e QA

a f. Since e QAk+1

a f, the number of occurrences of e in e is equal or less than the number of occurrences of Qa-successors of e in f. Hence we get

occ(e , e) P

a f occ(f, f) P

a f mpathsn 1(f) (by assumption) P

e QAf mpathsn 1(f) (by QA = a Ag QA

a ) = mpathsn(e ) (by equation (1)).

However, this directly contradicts the assumption that e occurs at least mpathsn(e ) + 1 times in e, and hence the proof of the claim is complete. Let f be as guaranteed by the claim. Now we build f to be exactly like f, except we omit one of the occurrences of f (we do not have to worry about the order of the elements of the vectors, since any two vectors only differing in order are bisimilar [Sadzik, 2006]). Since f and f now only differ in f occurring at least mpathsn 1(f ) + 1 times in f and at least mpathsn 1(f ) times in f , we can use the induction hypothesis P(n 1) to conclude that (Ak+1, f) n 1(Ak, f ), as required. [back]: This is the easy direction and is omitted.

Now we have proved P(n) for all n. Given n, from P(n) it follows that (Ak+1, ek+1

0 ) n(Ak, ek

0) for all k mpathsn(e0). And from this it immediately follows that (A, e0) is n-stabilising at stage mpathsn(e0).

w2 : p2 w3 : p3 wm : pn a a a a

Figure 2: Initial state and actions used in Theorem 5.1.

procedure Plan Exists

(s0, {α1, . . . , αm}, ϕg), B

a) Guess a vector (k1, . . . , km) {0, . . . , B}m. b) Accept when s0 |= α1 k1 αm kmϕg.

Figure 3: Non-deterministic algorithm for the plan existence problem.

5 Complexity of the Plan Existence Problem

5.1 Singleton and Chain Epistemic Actions We deﬁne SINGLETONS as the class of planning tasks (s0, L, ϕg), where every α = (A, e) in L is a singleton; i.e. EA contains a single event.

Theorem 5.1. PLANEX(SINGLETONS) is NP-complete.

Proof. For any singleton epistemic action there is at most one maximal path of length n for all n. Hence, by Lemma 4.2 and 3.8, such actions are -stabilising at stage 1. In NP: Follows from Theorem 5.2 below as SINGLETONS is contained in CHAINS. NP-hard: We give a polynomial-time reduction from SAT. Let ϕ(p1, . . . , pm) be a propositional formula where p1, . . . , pm are the atomic propositions in ϕ. We construct T = (s0, {α1, . . . , αm}, ϕg) s.t. s0 and each αi are as in Figure 2 and ϕg = ϕ( ap1, . . . , apm) is the formula ϕ in which each occurrence of pi is replaced by api. For any propositional valuation ν, let sν be the restriction of s0 s.t. there is an a-edge from r to wi in sν iff ν |= pi. This means ν |= pi iff sν |= api, and so from our construction of ϕg we have ν |= ϕ iff sν |= ϕg. Observe now that s0 αi is exactly the restriction of s0 so that there is no aedge from r to wi, and that αi is the only action affecting this edge. Let ν(pi) = 0 if ν |= pi and ν(pi) = 1 otherwise. We now have that ϕ is satisﬁable iff there is a ν s.t. ν |= ϕ iff sν |= ϕg iff s0 |= α1 ν(p1) αm ν(pm)ϕg iff T is solvable, where the last equivalence follows from Proposition 3.9 since ν(pi) {0, 1} and each αi is -stabilising at stage 1. This shows that ϕ is satisﬁable iff T is solvable.

For an epistemic action α = (A, e) we say that α is a chain if its underlying graph (A, QA) is a 1-ary tree whose unique leaf may be QA-reﬂexive. We deﬁne CHAINS as the class of planning tasks (s0, L, ϕg) where every epistemic action in L is a chain.

Theorem 5.2. PLANEX(CHAINS) is NP-complete.

Proof. In NP: For any chain epistemic action there is at most one maximal path of length n for all n, hence any such action is -stabilising at stage 1 using Lemmas 4.2 and 3.8. It therefore follows from Proposition 3.9 that, for any T CHAINS, Plan Exists(T, 1) of Figure 3 is accepting iff T is solvable. We must show step b) to run in polynomial

time. Now if α is a chain and s an epistemic state, then the number of worlds reachable from the actual world in s α is at most the number of worlds in s. By only keeping the reachable worlds after each successive product update, we get the required, as the goal is in LE.1 NP-hard: Follows from Theorem 5.1 as SINGLETONS is contained in CHAINS.

5.2 Tree Epistemic Actions We now turn to epistemic actions whose underlying graph is a any tree. Formally, an epistemic action (A, e) is called a tree when the underlying graph (A, QA) is a tree whose leaves may be QA-reﬂexive. We call TREES the class of planning tasks (s0, L, ϕg) where all epistemic actions in L are trees.

Theorem 5.3. PLANEX(TREES) is in PSPACE.

Proof. Consider any tree action α = (A, e) and let l(α) denote its number of leaves. As α is a tree, we get mpathsn(e) l(α) for any n. Using Lemma 4.2 and 3.8, any tree epistemic action α is -stabilising at stage l(α). From Proposition 3.9 we therefore have, for any T TREES, that Plan Exists(T, max(l(α1), . . . , l(αm))) of Figure 3 is accepting iff T is solvable. Step b) can be done in space polynomial in the size of the input [Aucher and Schwarzentruber, 2013]. Hence, the plan existence problem for TREES is in NPSPACE and therefore in PSPACE by Savitch s Thm.

We now sketch a proof of PSPACE-hardness of PLANEX(TREES), by giving a polynomial-time reduction from the PSPACE-hard problem QSAT (satisﬁability of quantiﬁed boolean formulas) to PLANEX(TREES). For any quantiﬁed boolean formula Φ = Q1p1 Qnpnϕ [p1, . . . , pn] with Qi { , }, we deﬁne the planning task TΦ = (s0, {α1, . . . , αn}, ϕsat ϕall) where s0 and each αi are as in Figure 4 (every edge implicitly labelled by a), ϕsat = O1 Onϕ[ 1 a a , . . . , n a a ], and ϕall = n+1 a a 2n a a , where Oi = a if Qi = and Oi = a if Qi = . Then |TΦ| is polynomial in |Φ| and TΦ TREES. By Lemmas 5.6 and 5.7 below we get TΦ is solvable iff Φ is true. Hence:

Theorem 5.4. PLANEX(TREES) is PSPACE-hard.

bi i 1 : ti i :

f i n 1 : ti n : f i n :

Figure 4: Initial state and actions used in Theorem 5.4.

1Observe that even if each action in α1, . . . , αm is -stabilising at stage 1, this is not a sufﬁcient condition for membership in NP as we must also be able to verify the plan in polynomial time.

(w0, b1 0, b2 0)

(w1, t1 1, b2 1) (w1, f 1 1 , b2 1)

(w2, t1 2, t2 2) (w2, t1 2, f 2 2 ) (w2, f 1 2 , t2 2) (w2, f 1 2 , f 2 2 )

Non ci i-chain

c2 1, c2 2-chain

Figure 5: Binary decision tree simulated by s0 α1 α2 (n = 2).

The reduction is based on the idea that we can simulate a (complete) binary decision tree using s = s0 α1 αn. Each world at depth n of s simulates a valuation, using the convention that pi is true iff there is a maximal chain of length i in this world. By nesting belief modalities we can check if such a chain exists. Each action αi makes two copies of every node between depth i and n, which is how we can simulate every valuation. A world w at depth i n of s is called an i-world. It can now be veriﬁed that any i-world is of the form (wi, v1 i , . . . , vi i, bi+1 i , . . . , bn i ) where vj i {tj i, f j i }. See also Figure 5. For any i-world w, we deﬁne a propositional valuation νw on {p1, . . . , pi} by νw |= pj iff tj i occurs in w. We use w0 = (w0, b1 0, . . . , bn 0) to denote the single 0-world in s (the actual world of s ), and deﬁne M so that s = (M , w0).

Lemma 5.5. Let w be any n-world. Then (M , w) |= ϕ[ 1 a a , . . . , n a a ] iff νw |= ϕ[p1, . . . , pn] is true.

Proof sketch. Due to the ci 1, . . . , ci i chain in each αi, we have for any n-world w and i n that (M , w) |= i a a iff ti n occurs in w, from which the result readily follows.

We say that an n-world w is accepting if (M , w) |= ϕ[ 1 a a , . . . , n a a ], and for i < n we say that the iworld w is accepting if some (every) child w of w is accepting and Oi = a (Oi = a).

Lemma 5.6. TΦ is solvable iff w0 is accepting.

Proof sketch. As acceptance for i < n exactly corresponds to the O1 On preﬁx, we use Lemma 5.5 to show that (M , w0) |= ϕsat iff w0 is accepting. Now we must show: 1) (M , w0) |= ϕall, and then 2) TΦ is solvable iff α1, . . . , αn is plan for TΦ. We omit proofs of both 1) and 2).

Lemma 5.7. Φ is true iff w0 is accepting.

Proof sketch. Let w denote any i-world. Let νw(pi) = if νw |= pi and νw(pi) = otherwise. We deﬁne Φw = Qi+1pi+1 Qnpnϕ[νw(p1), . . . , νw(pi), pi+1, . . . , pn]. By induction on k we now show: If k n and w is an (n k)-world, then Φw is true iff w is accepting. For the base case, k = 0 and w is an n-world, hence ϕ[νw(p1), . . . , νw(pn)] (= Φw) is true iff w is accepting by Lemma 5.5. For the induction step we assume that for any (n (k 1))-world w , Φw is true iff w is accepting. Let w be an (n k)-world. By construction, w has two children v and u. We can then show that Φv and Φu are as Φw, except the Qn k+1pn k+1 preﬁx and that one sets pn k+2 true and the other sets pn k+2 false. Thus Φw is true iff Qn k+1 = (or, Qn k+1 = ) and Φw is true for some (every) child w

of w. Using the induction hypothesis, we get that Φw is true iff w is accepting for some (every) child w of w. Hence, Φw is true iff w is accepting, by deﬁnition. This concludes the induction proof. For k = n it follows that Φw0 is true iff w0 is accepting. Since Φ = Φw0 we are done.

5.3 Arbitrary Epistemic Actions We call GRAPHS the class of planning tasks (s0, L, ϕg) where all event models in L are arbitrary graphs. In this case, the original result by Sadzik (Lemma 3.10) is sufﬁcient.

Theorem 5.8. PLANEX(GRAPHS) is in EXPSPACE.

Proof. We consider (s0, {α1, . . . , αm}, ϕg) GRAPHS with αi = (Ai, ei) and md(ϕg) = d. By Lemma 3.10, each αi is d-stabilising at stage |EAi|d. It now follows from Proposition 3.9 that Plan Exists(T, max{|EA1|d, . . . , |EAm|d}) of Figure 3 is accepting iff T is solvable. The algorithm runs in NEXPSPACE = EXPSPACE.

6 Complexity of the Plan Veriﬁcation Problem

The plan veriﬁcation problem is deﬁned as the following decision problem: Given a ﬁnite epistemic state s0 and a formula of the form α1 αj ϕg, does s0 |= α1 αj ϕg hold? The plan veriﬁcation problem can be seen as a restriction of the model checking problem in DEL. A similar reduction as for Theorem 5.4 gives that:

Theorem 6.1. The plan veriﬁcation problem (restricted to propositional action models that are trees) is PSPACE-hard.

Model checking in DEL with the non-determinism operator included in the language has already been proved PSPACE-hard [Aucher and Schwarzentruber, 2013]. Theorem 6.1 implies that model checking in DEL is PSPACE-hard even without this operator. A similar result has been independently proved in [van de Pol et al., 2015].

7 Future Work

We remind the reader that an overview of our contributions are found in Table 1 and proceed to discuss future work. Since propostional STRIPS planning is PSPACE-complete [Bylander, 1994], efﬁcient planning systems have used relaxed planning tasks in order to efﬁciently compute precise heuristics. For instance, the highly inﬂuential Fast-Forward planning system [Hoffmann and Nebel, 2001] relaxes planning tasks by ignoring delete lists. Our contributions here show that restrictions on the graphs underlying epistemic actions crucially affect computational complexity. This, in combination with restrictions on preconditions and postconditions (factual change), provides a platform for investigating (tractable) relaxations of epistemic planning tasks, and hence for the development of efﬁcient epistemic planning systems.

References [Andersen et al., 2012] Mikkel Birkegaard Andersen, Thomas Bolander, and Martin Holm Jensen. Conditional epistemic planning. In Luis Fari nas del Cerro, Andreas Herzig, and J erˆome Mengin, editors, JELIA, volume 7519

of Lecture Notes in Computer Science, pages 94 106. Springer, 2012. [Aucher and Bolander, 2013] Guillaume Aucher and Thomas Bolander. Undecidability in epistemic planning. In Francesca Rossi, editor, IJCAI. IJCAI/AAAI, 2013. [Aucher and Schwarzentruber, 2013] Guillaume Aucher and Franc ois Schwarzentruber. On the complexity of dynamic epistemic logic. In TARK, 2013. [Baltag et al., 1998] Alexandru Baltag, Lawrence S. Moss, and Slawomir Solecki. The logic of public announcements, common knowledge, and private suspicions. In TARK, 1998. [Bolander and Andersen, 2011] Thomas Bolander and Mikkel Birkegaard Andersen. Epistemic planning for single and multi-agent systems. Journal of Applied Non-Classical Logics, 21(1):9 34, 2011. [Bylander, 1994] Tom Bylander. The computational complexity of propositional strips planning. Artiﬁcial Intelligence, 69(1-2):165 204, 1994. [Ghallab et al., 2004] Malik Ghallab, Dana S. Nau, and Paolo Traverso. Automated Planning: Theory and Practice. Morgan Kaufmann, 2004. [Hoffmann and Nebel, 2001] J org Hoffmann and Bernhard Nebel. The ff planning system: Fast plan generation through heuristic search. Journal of Artiﬁcial Intelligence Resesearch, 14(1):253 302, May 2001. [Jensen, 2014] Martin Holm Jensen. Epistemic and Doxastic Planning. Ph D thesis, Technical University of Denmark, 2014. DTU Compute PHD-2014. [L owe et al., 2011] Benedikt L owe, Eric Pacuit, and Andreas Witzel. Del planning and some tractable cases. In Hans van Ditmarsch, J erˆome Lang, and Shier Ju, editors, Logic, Rationality, and Interaction, volume 6953 of Lecture Notes in Computer Science, pages 179 192. Springer Berlin Heidelberg, 2011. [Maubert, 2014] Bastien Maubert. Logical foundations of games with imperfect information: uniform strategies. Ph D thesis, 2014. Ph D thesis directed by Pinchinat, Sophie and Aucher, Guillaume. Universit e de Rennes 1. [Miller and Moss, 2005] Joseph S Miller and Lawrence S Moss. The undecidability of iterated modal relativization. Studia Logica, 79(3):373 407, 2005. [Sadzik, 2006] Tomasz Sadzik. Exploring the Iterated Update Universe. ILLC Publications PP-2006-26, 2006. [van Benthem, 2003] Johan van Benthem. One is a lonely number: On the logic of communication. 2003. [van de Pol et al., 2015] Iris van de Pol, Iris van Rooij, and Jakub Szymanik. Parameterized complexity results for a model of theory of mind based on dynamic epistemic logic. In TARK, 2015. [Yu et al., 2013] Quan Yu, Ximing Wen, and Yongmei Liu. Multi-agent epistemic explanatory diagnosis via reasoning about actions. In Francesca Rossi, editor, IJCAI. IJCAI/AAAI, 2013.