# multiagent_epistemic_planning_with_common_knowledge__831e865b.pdf

Multi-agent Epistemic Planning with Common Knowledge

Qiang Liu and Yongmei Liu Dept. of Computer Science, Sun Yat-sen University, Guangzhou 510006, China liuq226@mail2.sysu.edu.cn, ymliu@mail.sysu.edu.cn

In the past decade, multi-agent epistemic planning has received much attention from both dynamic logic and planning communities. Common knowledge is an essential part of multi-agent modal logics, and plays an important role in coordination and interaction of multiple agents. However, existing implementations of multi-agent epistemic planning provide very limited support for common knowledge, basically static propositional common knowledge. Our work aims to extend an existing multi-agent epistemic planning framework based on higher-order belief change with the capability to deal with common knowledge. We propose a novel normal form for multi-agent KD45 logic with common knowledge. We propose satisﬁability solving, revision and update algorithms for this normal form. Based on our algorithms, we implemented a multi-agent epistemic planner with common knowledge called MEPC. Our planner successfully generated solutions for several domains that demonstrate the typical usage of common knowledge.

1 Introduction Reasoning about knowledge and beliefs and their change plays an important role in many intelligent tasks, since actions may have preconditions involving agents knowledge and beliefs, which may be changed by agents actions. In some applications, even higher-order knowledge and beliefs, i.e., knowledge and beliefs about other agents knowledge and beliefs, turn out to be insufﬁcient, and common knowledge is needed. We say that φ is common knowledge of a group of agents if everybody knows φ, everybody knows everybody knows φ, and so on to inﬁnity. For example, common knowledge is needed for agreement and coordination. To illustrate, suppose that Alice and Bob are trying to coordinate their actions. This involves the agents agreeing on when to perform the actions, which we represent by φ. We expect that if Alice and Bob agree on something, then each of them knows that they have agreed on that. By induction, Alice and Bob have common knowledge of φ. The common knowledge modality adds a great deal of expressive power to

multi-agent modal logics. As a result, deciding satisﬁability becomes EXPTIME-complete [Halpern and Moses, 1992]. In the past decade, multi-agent epistemic planning has received much attention from both dynamic logic and planning communities. On the theory side, Bolander and Andersen [2011] formalized multi-agent epistemic planning (MEP) based on dynamic epistemic logic [Van Ditmarsch et al., 2007], where both states and actions are represented as Kripke models. Very recently, Huang et al. [2017] proposed a general representation framework for MEP, where the initial knowledge base (KB) and the goal, the preconditions and effects of actions can be arbitrary multi-agent epistemic formulas, progression of KBs wrt actions is achieved through higher-order belief revision or update based on effects of actions, and the solution is an action tree branching on sensing results. On the implementation side, Kominis and Geffner [2015] and Muise et al. [2015] solved restricted versions of MEP problems by compiling them into classical planning, and Huang et al. [2017] implemented a contingent MEP planner based on AND/OR forward search. However, the implementation of [Muise et al., 2015] does not support common knowledge, and those of [Kominis and Geffner, 2015] and [Huang et al., 2017] provide very limited support for common knowledge, basically static propositional common knowledge. To the best of our knowledge, Lo TREC [del Cerro et al., 2001], a theorem prover for S5, is the only implementation capable of reasoning with common knowledge. In this paper, we extend the MEP framework proposed by Huang et al. to incorporate general common knowledge. To support efﬁcient reasoning in multi-agent KD45 logic, they made use of a normal form called alternating cover disjunctive formulas [Hales et al., 2012]. However, this normal form cannot be directly generalized to support reasoning with common knowledge. Thus we propose a novel normal form for multi-agent KD45 with common knowledge. This normal form makes use of a new common knowledge modality Caφ which means that it is common knowledge that everybody except agent a knows φ. We propose a novel algorithm for checking satisﬁability for this normal form. Also, we propose revision and update algorithms for this normal form. The essential idea is to change agents common knowledge before changing agents knowledge, and carry the changed common knowledge to change knowledge to ensure the consistency between knowledge and common knowledge. Based on our

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

reasoning, revision, and update algorithms, we implemented a multi-agent epistemic planner called MEPC. Our planner successfully generated solutions for several domains that demonstrate the typical usage of common knowledge.

2 Preliminaries

In this section, we introduce the background work of our paper, i.e., the multi-agent modal logic and the epistemic planning modeling framework proposed by Huang et al.

2.1 Multi-agent Modal Logic We ﬁx a ﬁnite set of atoms P and a ﬁnite set of agents A.

Deﬁnition 2.1. The language LKC of multi-agent modal logic with common knowledge is generated by the BNF:

φ ::= p | φ1 | φ1 φ2 | Kaφ1 | Cφ1,

where p P, a A, φ1, φ2 LKC.

Intuitively, Kaφ means agent a knows φ, and Cφ means all agents commonly know φ. We let Laφ and Dφ abbreviate for Ka φ and C φ, respectively. Intuitively, Laφ means agent a thinks φ is possible, and Dφ means all agents commonly think φ is possible. We use φ and ψ to represent formulas, Φ and Ψ to represent sets of formulas, and to denote true and false, respectively. W Φ stands for the disjunction of members in Φ, while LaΦ (resp. DΦ) stands for the conjunction of Laφ (resp. Dφ) where φ Φ. The modal depth of a formula φ LKC is the depth of nesting of modalities in φ.

Deﬁnition 2.2. A frame is a pair (W, R), where W is a nonempty set of possible worlds; for each agent a A, Ra is a binary relation on W, called the accessibility relation for a.

When w Raw , we say w is an a-child of w. We say Ra is serial if for any w W, there is a w W s.t. w Raw ; we say Ra is transitive if w Rau and u Rav imply w Rav; we say Ra is Euclidean if w Rau and w Rav imply u Rav. A KD45n frame is a frame whose accessibility relations are serial, transitive and Euclidean.

Deﬁnition 2.3. A Kripke model is a triple M = (W, R, V ), where (W, R) is a frame, and V : W 2P is a valuation map. A pointed Kripke model is a pair s = (M, w), where M is a Kripke model and w is a world of M.

Deﬁnition 2.4. Let s = (M, w) be a Kripke model where M = (W, R, V ). We interpret formulas in LKC inductively:

M, w |= p iff p V (w);

M, w |= φ iff M, w φ;

M, w |= φ ψ iff M, w |= φ and M, w |= ψ;

M, w |= Kaφ iff for all v s.t. w Rav, M, v |= φ;

M, w |= Cφ iff for all v s.t. w RAv, M, v |= φ, where RA is the transitive closure of the union of Ra for a A.

Consider the context of KC n or KD45C n . We say φ is satisﬁable if there is a Kripke model (M, w) s.t. M, w |= φ. We say φ entails ψ, written φ |= ψ, if for any Kripke model (M, w), M, w |= φ entails M, w |= ψ. We say φ and ψ are equivalent, written φ ψ, if φ |= ψ and ψ |= φ.

Theorem 2.5. [Halpern and Moses, 1992] The satisﬁability problems for KC n and KD45C n are EXPTIME-complete.

Halpern and Moses presented an algorithm for checking satisﬁability in KC n . Given a formula φ LKC, their algorithm ﬁrst generates a set S consisting of all subformulas of φ and their negations. Then it computes a set W consisting of all subsets A of S that are propositionally consistent and maximal, i.e., either ψ A or ψ A for each ψ S. Finally it iteratively builds a model for φ with worlds in W. We now deﬁne a normal form for KC n .

Deﬁnition 2.6. The set of modal terms is inductively deﬁned:

A propositional term, i.e., a conjunction of propositional literals, is a modal term;

A formula of the form φ0 V a A(Kaφa LaΨa) Cµ DΛ is a modal term, where φ0 is a propositional term, Ψa, Λ are sets of modal terms, and φa, µ are conjunctions of disjunctions of modal terms.

A formula φ is in DNF if it s a disjunction of modal terms. A formula φ is in CDNF if it s a conjunction of DNFs.

So the knowledge and common knowledge parts of a modal term are in CDNF.

Proposition 2.7. In KC n , any formula in LKC can be transformed to an equivalent DNF whose length is at most singly exponential in the length of the original formula.

Proof. For an arbitrary modal formula φ LKC, we ﬁrst put it into negation normal form. Then we treat modal atoms, ie, formulas of the form Kψ, Lψ, Cψ and Dψ, as propositional atoms and transform the whole formula into propositional DNF. Then for the ψ in each modal atom, we repeat this process. By induction on the modal depth of φ, we can show that the length of the resulting DNF is at most singly exponential in the length of φ.

In the rest of the paper, the logic we use is KD45C n , except that in Section 3.1, we discuss satisﬁability for KC n .

2.2 Epistemic Planning Modeling Framework We follow the modeling framework for multi-agent epistemic planning (MEP) in [Huang et al., 2017], and extend it with the support of general common knowledge. We illustrate the framework with the classic muddy children example [Fagin et al., 1995]. There are n children playing together and m of them get mud on their foreheads. Each can see the mud on others but not on his own forehead. The father announces At least one of you has mud on your forehead . The father then keeps asking Does any of you know whether you have mud on your own forehead? It turns out the ﬁrst m 1 times he asks the question, they will all say No , but then the mth time the children with muddy foreheads will all answer Yes .

Deﬁnition 2.8. A MEP problem is a tuple A, P, D, S, I, G , where A is a set of agents, P is a set of atoms, D is a set of deterministic actions, S is a set of sensing actions, I LKC is the initial KB, and G LKC is the goal.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

Example 1. Assume that there are three children (A = {a, b, c}) and all of them are muddy. The atoms are: mi, indicating the ith child is muddy. The deterministic actions are: announce, the father announces that at least one of them is muddy; askno, the father asks whether anyone knows that he is muddy, but no child answers Yes . The initial KB is ma mb mc V i A(Ki V j =i mj). The goal is V i A(mi Kimi).

Deﬁnition 2.9. A deterministic action is a pair pre, eff , where pre LKC is the precondition; eff is a set of conditional effects, each of which is a pair con, cef , where con, cef LKC indicate the condition and the effect, respectively. Deﬁnition 2.10. A sensing action is a triple pre, pos, neg , where pre, pos, neg LKC indicate the precondition, the positive result, and the negative result, respectively. For example, announce = , {eff} , where eff = , C W i mi . askno = pre, {eff1, eff2} , where: pre = V i( Kimi Ki mi), eff1 = C W i mi, C W i =j(mi mj) ,

eff2 = C W i =j(mi mj), C V i mi .

Here pre says that no child knows if he is muddy, eff1 says: under the condition that the children commonly know that at least one child is muddy, they commonly know at least two of them are muddy, and eff2 says: under the condition that the children commonly know that at least two of them are muddy, they commonly know that all three children are muddy. An action a is executable wrt a KB φ LKC if φ |= pre(a). Suppose a is executable wrt φ. The progression of φ wrt a is deﬁned by resorting to belief change operators. Two main types of belief change are revision and update: revision concerns belief change about static environments due to partial and possibly incorrect information, whereas update concerns belief change about dynamic environments due to the performance of actions. We use a revision operator and an update operator for LKC. We use update for progression wrt deterministic actions and revision for sensing actions. Deﬁnition 2.11. Let φ LKC and a a deterministic action where eff(a) = { φ1, ψ1 , . . . , φn, ψn }. Suppose φi1, . . . , φim are all the φi s s.t. φ |= φi. Then the progression of φ wrt a is deﬁned as ((φ ψi1) . . . ) ψim. Deﬁnition 2.12. Let φ LKC and a a sensing action. Then the progression of φ wrt a with positive (resp. negative) result is φ+ = φ pos(a) (resp. φ = φ neg(a)). The progression of φ wrt a sequence of actions (with sensing results for sensing actions) is inductively deﬁned as follows: prog(φ, ϵ) = φ; prog(φ, (a; σ)) = prog(prog(φ, a), σ) if φ |= pre(a), and undeﬁned otherwise. A solution of a MEP problem is an action tree branching on sensing results, such that the progression of the initial KB wrt each branch in the tree entails the goal.

3 Checking Satisﬁability In this section, we ﬁrst present an algorithm for checking satisﬁability in KC n , and then extend it to KD45C n .

3.1 KC n Satisﬁability The main idea of our algorithm for checking satisﬁability in KC n is as follows. The input formula is in DNF, and we attempt to build a model for it recursively. When checking a model term δ = φ0 V a A(Kaφa LaΨa) Cµ DΛ, if φ0 is unsatisﬁable, we immediately return , otherwise we proceed as follows: For each agent a and each possibility ψa Ψa, we conjoin it with knowledge φa and common knowledge µ Cµ, and check if the resulting formula is satisﬁable. If not, we immediately return , otherwise we check λ φa µ Cµ and Dλ φa µ Cµ for each agent a and each common possibility λ Λ. Each of the above formulas is in CDNF, we call it a child of δ. When checking a CDNF φ, we put it into a DNF W , and recursively check each δ . We call δ a derivant of φ. However, the complication is that due to the presence of common knowledge, some newly generated modal terms might be the same as previous ones. To handle this issue, we maintain a graph whose nodes are modal terms. When a new modal term is generated, we mark it by , and when we know it is unsatisﬁable, mark it by . Also, when a new modal term δ is derived from a child of a modal term δ, we add an edge from δ to δ , mark the edge by 1 if δ is derived from ψa µ Cµ, and 2 otherwise. Now, when checking a modal term δ which already exists, we return its current mark. A formula δ may not be satisﬁable if it s marked by the above methods. We further update each modal term s mark by checking whether its common possibilities are satisﬁed. For each conjunct Dλ in each modal term δ, we check whether there is a modal term δ reachable from δ s.t. λ is a conjunct of δ and δ is marked by . If not, we mark δ by and further propagate to its ancestors through 1-edges. We repeat this procedure until no mark is updated. Deﬁnition 3.1. Let δ be a modal term. For a A, let γa denote a s knowledge, i.e., γa = φa µ Cµ. We deﬁne the set of δ s children, Gen K(δ), as the union of the following: S1 = {γa ψa | a A, ψa Ψa}; Sλ1 = {γa λ | a A}, and Sλ2 = {γa Dλ | a A}, where λ Λ. Proposition 3.2. Let δ be a satisﬁable modal term. Then the following hold for Gen K(δ): for all φ S1, φ is satisﬁable; for all λ Λ, there is φ Sλ1 Sλ2 s.t. φ is satisﬁable.

Algorithm 1: Check K(φ)

input: φ LKC is in DNF W output: /

set G = (W, R) to the empty graph; if Sat K(φ, , a, 1) = then return ; call Update; foreach δ in do if Mark(δ) = then return ; return ;

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

Function Sat K(φ, φ , i, n)

input: φ is in CDNF, φ is a modal term, i A, n = 1, 2

if φ is not a modal term then transform φ into DNF W ; foreach δ in do if Sat K(δ, φ , i, n) = then return ; return ; else if φ = then Ri Ri {(φ , φ)n}; if φ W then return Mark(φ); W W {φ}, Mark(φ) ; Gen K(φ) = S1 S λ Λ(Sλ1 Sλ2); if one of the following conditions holds: 1. φ0 is propositionally unsatisﬁable; 2. δa S1: Sat K(δa, φ, a, 1) = . then Mark(φ) , return ; foreach δa in S λ Λ Sλ1 Sλ2 do call Sat K(δa, φ, a, 2); return ;

Procedure Update

foreach δ in W and λ s.t. Dλ is a conjunct of δ do let Child(δ, λ) be a set of modal terms δ where: 1. δ is reachable from δ; 2. λ is a conjunct of δ ; 3. Mark(δ ) = . Updated ; while Updated = do Updated ; Updated Nodes ; foreach δ in W and λ s.t. Dλ is a conjunct of δ do if Mark(δ) = and Child(δ, λ) = then Mark(δ) ; Updated ; Updated Nodes Updated Nodes {δ}; foreach δ in Updated Nodes do call Propagate(δ); foreach δ in W and λ do delete δ in Child(δ, λ) s.t. Mark(δ ) = ;

We demonstrate the procedure of Check K(φ) by Figure 1. Let φ = Cp D p and A = {a}.

1. We calculate Gen K(φ) = {δ1, δ2}, where δ1 = p p Cp and δ2 = D p p Cp, and set W = {φ, δ1, δ2} and Ra = {(φ, δ1)2, (φ, δ2)2}. 2. We check δ1 next. Since its propositional part is unsatisﬁable, we set Mark(δ1) = . 3. We turn to δ2, Gen K(δ2) = {δ1, δ2}, Ra Ra {(δ2, δ1)2, (δ2, δ2)2}, Mark(δ2) = . Thus Mark(φ) = . 4. Since D p is a conjunct of φ and δ2 but there isn t a reachable modal term δ s.t. p is a conjunct of δ and Mark(δ ) = , we set Mark(φ) = Mark(δ2) = . Eventually Check K(φ) returns .

Theorem 3.3. The complexity of Check K(φ) is O(4(d+1)cl+l), where l is the length of φ, d is the modal depth of φ, and c is the depth of nesting of C modalities in φ.

Figure 1: Procedure of Check K(Cp D p)

Procedure Propagate(δ)

input: δ is a modal term

foreach δ in W s.t. (δ , δ)1 R do if Mark(δ ) = then Mark(δ ) ; call Propagate(δ );

Proof sketch. First consider φ without C. For any modal term δ generated during the algorithm, δ corresponds to a substring of φ. Hence the graph has at most 2l nodes. Now consider that φ contains C modalities. Let Cν be the conjunction of all subformulas Cµ in φ. The maximal length of Cν is cl. The children generated from Cν are in the form ν1 Cν, where ν1 doesn t contain common knowledge since Cν is maximal. For 1 i < d, we name the children generated from νi as νi+1. All νi corresponds to a substring of ν, thus the length of νi is less than cl. Since the children of Cν without common knowledge will be reduced to propositional formulas after at most (d 1) recursions, the maximal length of the form V 1 i d νi Cν in a modal term generated during the algorithm is cl + dcl. Hence the maximum number of nodes is 2(d+1)cl+l. Let n be the number of nodes. Cycle checking and updating marks requires O(n2) time. Thus we obtain the complexity result O(4(d+1)cl+l). We proceed to prove soundness and completeness of Check K. For soundness, when Check K(φ) returns , we build a model Mφ for φ by modifying the frame G = (W, R) as follows: remove δ W s.t. Mark(δ) = ; rename δ W to wδ; and create a valuation map V s.t. V (wδ) satisﬁes φ0 in δ. We show that for each wδ in Mφ, Mφ, wδ |= δ. To facilitate the proof, we introduce the notion of implicants.

Deﬁnition 3.4. If the conjunction of a CDNF φ and another CDNF φ is equivalent to a DNF W , we call each δ an implicant of φ, and write δ |.= φ.

Obviously, if δ |.= φ, then δ |= φ. If a modal term δ is a subterm of a modal term δ, i.e., the conjuncts of δ is a subset of those of δ, then δ is an implicant of δ . In Algorithm 1, when a modal term δ is added as an a-child of a modal term δ = φ0 V a A(Kaφa LaΨa) Cµ DΛ, δ is an implicant of φa, µ and Cµ; similarly, δ is an implicant of

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

some ψa Ψa, or an implicant of λ or Dλ for some λ Λ.

Lemma 3.5. When Check K(φ) returns , for all wδ Mφ, we have Mφ, wδ |= ψ if δ |.= ψ.

Proof. We prove by structural induction on ψ. Consider ψ in the following forms:

A propositional literal. Then ψ is a conjunct of the propositional term of δ. By the deﬁnition of V (wδ), we have Mφ, wδ |= ψ. ψ1 ψ2. Then δ |.= ψ1 and δ |.= ψ2. By induction, Mφ, wδ |= ψ1 and Mφ, wδ |= ψ2. So Mφ, wδ |= ψ1 ψ2. ψ1 ψ2. Then δ |.= ψ1 or δ |.= ψ2. By induction, Mφ, wδ |= ψ1 or Mφ, wδ |= ψ2. So Mφ, wδ |= ψ1 ψ2. Kaψ . By Algorithm 1, for all wδ s.t. wδRawδ , δ |.= ψ . By induction, Mφ, wδ |= ψ . So Mφ, wδ |= Kaψ . Laψ . By Algorithm 1, there is wδ s.t. wδRawδ and δ |.= ψ . If not, δ will be marked by Propagate. By induction, Mφ, wδ |= ψ . So Mφ, wδ |= Laψ . Cψ . By Algorithm 1, for all wδ reachable from wδ, δ |.= ψ . By induction, Mφ, wδ |= ψ . So Mφ, wδ |= Cψ . Dψ . By Algorithm 1, there is wδ reachable from wδ s.t. δ |.= ψ . If not, δ will be marked by Update. By induction, Mφ, wδ |= ψ . So Mφ, wδ |= Dψ .

For completeness, we prove a modal term δ is unsatisﬁable if it s marked . The difﬁcult case is when δ is marked during update. We introduce a lemma to handle this case.

Lemma 3.6. Let δ be a modal term s.t. δ |.= Dλ. When δ is satisﬁable, there is a modal term δ reachable from δ s.t. δ |.= λ and δ is satisﬁable.

Proof. Since δ is satisﬁable, there is a model (M, w) s.t. M, w |= δ. Since δ |.= Dλ, there is a path w, w1, . . . , wm, w in M s.t. M, w |= λ. Assume that w Ra1w1, wm Ra w and wi Rai+1wi+1 for 1 i < m. By Algorithm 1, there is a path δ, δ1, . . . , δm, δ in W where δi |.= Dλ for each i and δ |.= λ. Moreover, we have (δ, δ1) Ra1, (δm, δ ) Ra and (δi, δi+1) Rai+1. By induction, we can show that for each i = 1, . . . , m, M, wi |= δi. So M, w |= δ .

Theorem 3.7. Check K(φ) = iff φ is satisﬁable in KC n .

Proof. : Suppose Check K(φ) returns . Then there is a world wδ Mφ s.t. δ |.= φ. By Lemma 3.5, Mφ, wδ |= φ. Thus φ is satisﬁable. : Suppose Check K(φ) returns . Let φ = W . Then each modal term in is marked . We prove by induction on k that if δ is the kth modal term that Check K marks , δ is unsatisﬁable. There are 3 cases:

The propositional term of δ is unsatisﬁable. There is δ S1 marked by . By induction, δ is unsatisﬁable. By Proposition 3.2, δ is unsatisﬁable. δ contains a conjunct Dλ and for each δ reachable from δ where δ |.= λ, δ is marked . By induction, δ is unsatisﬁable. By Lemma 3.6, δ is unsatisﬁable.

3.2 KD45C n Satisﬁability The main idea of our algorithm for checking satisﬁability in KD45C n is as follows. We require the input formula φ to be in a certain normal form. We apply to φ the algorithm for checking satisﬁability in KC n . When the algorithm returns and a model for φ, we add edges to the model to make it serial, transitive, and Euclidean. The normal form we use ensures that after adding edges, the model still satisﬁes φ. To motivate our normal form, let s consider two examples. Example 2. Let A = {a, b}, P = {p}, and φ = DKap. Consider M = (W, R, V ) where W = {0, 1, 2}, Ra = {(0, 1), (1, 2)}, Rb = , V (0) = V (1) = , and V (2) = {p}. Then (M, 0) is a model for φ in KC n , but it can t be made a model for φ in KD45C n by adding edges. Now suppose we have an modality Daφ meaning φ holds in some world reachable by a path whose last edge is not of agent a. Let M be the same as M except Ra = {(1, 2)} and Rb = {(0, 1)}. Then (M , 0) is a model for φ = Da Kap in KC n , and it can be made a model for φ in KD45C n by adding edges. Example 3. Let φ = La(p Ka p). Then φ is satisﬁable in KC n but not in KD45C n , thus KC n algorithm obtains the wrong answer. Note that φ entails Ka p in KD45C n . If we add Ka p to φ, KC n algorithm could answer false correctly. In the following, we introduce our normal form. Deﬁnition 3.8. The semantics of the subscripted common knowledge modality Ca where a A is deﬁned as follows: M, w |= Caφ iff for all u, v s.t. w R Au and u Rbv where b = a, we have M, v |= φ, where R A is the reﬂexive transitive closure of the union of Ra for a A. We let Daφ abbreviate for Ca φ. We use LKCS to refer to LKC where the Ca modality is used instead of C. Deﬁnition 3.9. The set of modal S-terms (resp. normal Sterms) is inductively deﬁned as follows: A propositional term is a modal (resp. normal) S-term; A formula of the form φ0 V a A(Kaφa LaΨa Caµa DaΛa) is a modal S-term (resp. normal S-term), where φ0 is a propositional term, Ψa, Λa are sets of modal S-terms (resp. normal S-terms), Ψa is not empty, and φa, µa are conjunctions of disjunctions of modal Sterms (resp. φa, µa are disjunctions of normal S-terms). A formula φ is in SDNF (resp. normal SDNF) if it s a disjunction of modal S-terms (resp. normal S-terms). In the above deﬁnition, note the requirement that Ψa is not empty. Due to seriality, we can always replace an empty Ψa with the set { } in KD45C n .

Proposition 3.10. The following hold in KD45C n : Cφ V a A(Kaφ Ca Kaφ); Dφ W a A(Laφ Da Laφ).

Proof. We only prove Item 1. The direction is obvious. For the direction, suppose M, w |= V a A(Kaφ Ca Kaφ). Let v be reachable from w and p a shortest path from w to v. If p is of length 1, since M, w |= V a A Kaφ,

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

M, v |= φ. If p is of length > 1, then the last two edges on the path must be of different agents; otherwise, by transitivity, there is a shorter path from w to v. Let a be the agent of the last edge. Since M, w |= Ca Kaφ, M, v |= φ. Thus M, w |= Cφ.

By Proposition 3.10, in KD45C n , every formula is equivalent to a formula in SDNF and a formula in normal SDNF. As motivated by Example 3, we present the following rules for unfolding nested knowledge.

Proposition 3.11. The following hold in KD45C n :

Ka(Kaφ φ+ ψ) Ka(φ Kaφ φ+ ψ); Ka(Cbφ φ+ ψ) Ka(φ Cbφ φ+ ψ); La(Kaφ φ+) Kaφ La(Kaφ φ+); La(Cbφ φ+) Ka(φ Cbφ) La(Cbφ φ+).

Here φ LKCS, φ+ is a modal S-term and ψ is in SDNF.

Deﬁnition 3.12. Let φ be in SDNF. We apply the rules for unfolding knowledge to φ, from inside to outside, and then put the resulting formula into an SDNF φ . We say that φ is in unfolded SDNF. Unfolded normal SDNF is similarly deﬁned.

For example, La Kaφ Kaφ La Kaφ, and Ka Ka Kaφ Ka(φ Ka(φ Kaφ)).

Proposition 3.13. Let φ be in unfolded SDNF. Given any formula ψ and agent a, if φ |= Kaψ in KD45C n , then we also have φ |= Kaψ in KC n .

Now we deﬁne the set of children of a modal S-term. Note that due to transitivity, Kaφa entails Ka Kaφa in KD45C n .

Deﬁnition 3.14. Let δ = φ0 V a A(Kaφa LaΨa Caµa DaΛa) be a modal S-term. For a A, we let γa denote the knowledge of a, i.e., γa = φa Kaφa Caµa V b =a(µb Cbµb). We deﬁne the set of δ s children, written Gen KD45(δ), as the union of the following sets:

S1 = {γa ψa | a A, ψa Ψa}, Sbλ1 = {γa λ | a A, a = b}, and Sbλ2 = {γa Dbλ | a A}, where b A, λ Λb.

Proposition 3.15. Let δ be a satisﬁable modal S-term. Then the following hold for Gen KD45(δ):

for all δ S1, δ is satisﬁable; for all b A and λ Λb, there is δ Sbλ1 Sbλ2 s.t. δ is satisﬁable.

Algorithm 2: Check KD45(φ)

input: φ is in unfolded SDNF output: /

Check K(φ) where Gen K is replaced by Gen KD45

When Check KD45(φ) returns , we ﬁrst build a model Mφ for φ as we do for Check K(φ), and then set each accessibility relation to its transitive and Euclidean closure. Note that seriality is ensured by requiring Ψa non-empty. The normal form we use and the deﬁnition of Gen KD45 ensure that the resulting KD45C n model M φ still satisﬁes φ.

Lemma 3.16. Let a, b A. When Check KD45(φ) returns , for all wδ, wδ M φ s.t. wδRawδ , we have:

when δ |.= Kaφ, δ |.= φ Kaφ; when δ |.= Kaφ, δ |.= Kaφ; when δ |.= Cbφ, δ |.= Cbφ; when δ |.= Cbφ, either δ |.= Cbφ or δ |.= Ka Cbφ.

Proof. We prove by induction on the order (wδ, wδ ) is added to Ra. The induction cases are straightforward. We prove the base case where (wδ, wδ ) is in Ra before edges are added. Item 1 and Item 3 follow from the deﬁnition of Gen KD45. From δ |.= Kaφ we can infer that δ is an implicant of Kaφ, Ka Kaφ or La Kaφ. Since δ is in unfolded SDNF, we have δ |.= Kaφ. From δ |.= Cbφ we can infer that δ is an implicant of Cbφ, Ka Cbφ or La Cbφ. If δ is an implicant of La Cbφ, we have δ |.= Ka Cbφ by unfolding rules.

Lemma 3.17. When Check KD45(φ) returns , for all wδ M φ, we have M φ, wδ |= ψ if δ |.= ψ.

Proof. We only prove the cases whose proofs are different from those for KC n . Consider ψ in the following forms:

Kaψ . By Lemma 3.16, for all wδ s.t. wδRawδ , δ |.= ψ . By induction, Mφ, wδ |= ψ . So Mφ, wδ |= Kaψ . Caψ . By Lemma 3.16, for all wδ s.t. wδR Awδ , δ

|.= Caψ . Then by the deﬁnition of Gen KD45, for all wδ s.t. wδ Rbwδ where b = a, we have δ |.= ψ . By induction, M φ, wδ |= ψ . So M φ, wδ |= Caψ .

Theorem 3.18. Check KD45(φ) returns iff φ is satisﬁable in KD45C n .

The proof for soundness follows from Lemma 3.17, while the proof for completeness is identical to that in Theorem 3.7.

Theorem 3.19. The complexity of Check KD45(φ) is O(4(d+1)(c+d)nl+l), where n is the number of agents, l, c, and d are as in Theorem 3.3.

Proof sketch. The complexity follows from Theorem 3.3 with two increasing from two factors: the common knowledge operator C is transformed into n copies of the subscripted version; knowledge of each agent is preserved when generating children of a modal S-term.

4 Belief Revision and Update

In this section, we introduce our algorithms for higher-order belief change involving common knowledge. As Huang et al. [2017], we reduce change of epistemic formulas to that of lower-order epistemic formulas, and as basis we resort to change of propositional formulas. The essential difference between revision and update is: revision satisﬁes the conjunction property that when φ φ is satisﬁable, φ φ φ φ , while update satisﬁes the distribution property that when both φ1 and φ2 are satisﬁable, (φ1 φ2) φ φ1 φ φ2 φ . We illustrate our main ideas with three examples.

Example 4. Revise Ka( p Ka p) with Cp. When we recursively revise p Ka p, we cannot simply revise it with p, which will give us the incorrect result Ka(p Ka p) Cp;

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

we will have to revise it with p and carry Cp, which gives us the correct result Ka(p Kap) Cp. Example 5. Revise Ka( p q) with Ka q Lap. We cannot simply revise old knowledge with new knowledge, which gives us p q, inconsistent with new possibility p. We will take the disjunction of p q and the revision of old knowledge with the conjunction of new knowledge and new possibility. This gives us p q p q, equivalent to q. The same idea applies to the revision of common knowledge. Example 6. Revise La(p q) La( p q) with Kap Lar. Since there are old possibilities consistent with new knowledge, we will only keep such old possibilities and revise them with new knowledge. Thus we get La(p q) Kap Lar. However, consider revising La(p q) La( p q) with Kaq Lar. Since all old possibilities are inconsistent with new knowledge, we revise all of them with new knowledge. The result is La(p q) La( p q) Kaq Lar. The same idea applies to the revision of common possibilities. Motivated by Example 4, we deﬁne a revision with carry operator φ γ φ where γ is the carry. The difference between φ and γ is that the revision result must entail φ while the result only need to be consistent with γ. Deﬁnition 4.1. Let be a revision operator. The revision of φ with φ and carry γ, written φ γ φ is deﬁned as follows: φ γ φ = φ φ , if φ γ is satisﬁable; φ γ φ = φ (φ γ), otherwise. Similarly, we can deﬁne the update with carry operator. Motivated by Example 6, we deﬁne the operator to restrict attention to consistent pairs of formulas if possible. Deﬁnition 4.2. Φ Φ = {(φ, φ ) | φ Φ, φ Φ , φ φ is satisﬁable}, if there are φ Φ and φ Φ s.t. φ φ is satisﬁable; {(φ, φ ) | φ Φ, φ Φ }, otherwise. Below is the formal deﬁnition of our revision operator. Item 4 needs some explanation. In order to obtain a satisﬁable result, we revise different parts of the old S-term in the following order, using the ideas behind the three motivating examples: 1. When old common knowledge is consistent with new knowledge and new possibilities, revise it with new common knowledge; otherwise simply assume new common knowledge. 2. Revise knowledge using revised common knowledge. 3. Revise possibilities using revised common knowledge and knowledge. 4. Keep old common possibilities that are consistent with revised common knowledge, knowledge and possibilities. Deﬁnition 4.3. Let φ and φ be in unfolded normal SDNF. The revision of φ with φ , written φ φ , is deﬁned recursively: 1. When φ and φ are propositional formulas, φ φ = φ s φ , where s is Satoh [1988] s revision operator; 2. When φ = W and φ = W , φ φ = W{δ δ | (δ, δ ) };

3. When φ and φ are normal S-terms and φ φ is satisﬁable, φ φ is φ φ converted to a normal S-term; 4. Otherwise, φ and φ are normal S-terms, and φ φ is not satisﬁable. Let φ = φ0 V a A(Kaφa LaΨa Caµa DaΛa), and φ = φ 0 V a A(Kaφ a LaΨ a Caµ a DaΛ a). Then φ φ = φ = φ 0 V a A(Kaφ a LaΨ a Caµ a DaΛ a) where:

(a) φ 0 = φ0 φ 0; (b) If Caµa V b =a(Kbφ b LbΨ b) is satisﬁable, µ a = (µa γ1 µ a) W λ a Λ a(µa γ2 (µ a λ a)) where γ1 = Caµ a and γ2 = Caµ a Daλ a; otherwise µ a = µ a; (c) φ a = (φa γ1 (φ a V b =a µ b )) W ψ a Ψ a(φa γ2 (ψ a φ a V b =a µ b )), where γ1 = Kaφ a V a A Caµ a and γ2 = Kaφ a Laψ a V a A Caµ a; (d) Ψ a = {ψ γ ψ | (ψ, ψ ) Ψa {φ a V b =a µ b }} Ψ a, where γ = Kaφ a V a A Caµ a; (e) Λ a = {λ | λ Λa and Daλ V a A(Kaφ a LaΨ a Caµ a) is satisﬁable} Λ a.

We now state properties of our revision operator.

Deﬁnition 4.4. The set of disjunct-wise satisﬁable (d-sat) normal SDNFs is inductively deﬁned:

A disjunction W of normal S-terms is d-sat if each δ is d-sat; A normal S-term φ0 V a A(Kaφa LaΨa Caµa DaΛa) is d-sat if it is satisﬁable and each disjunct in each φa or µa is d-sat.

Proposition 4.5. An unfolded normal S-term without D modalities φ = φ0 V a A(Kaφa LaΨa Caµa) is satisﬁable if the following hold:

1. φ0 is propositionally satisﬁable; 2. For all a A and for all ψa Ψa, φψa = ψa φa Kaφa V b =a µb V a A Caµa is satisﬁable.

Proof. We construct a model (M, w) for φ. By Item 1, we create a world w where V (w) satisﬁes φ0. By Item 2, for each a A and ψa Ψa, there is a KD45C n model (Mψa, wψa) satisfying φψa. We add each (Mψa, wψa) to M and let w Rawψa. Then (M, w) is a KD45C n model for φ after calculating the transitive and Euclidean closures.

Proposition 4.6. Let φ and φ be two d-sat unfolded normal SDNFs. Then φ φ is a d-sat normal SDNF, and φ φ |= φ . Moreover, when φ φ is satisﬁable, φ φ φ φ .

Proof. The conjunction property follows from Deﬁnition 4.3 Item 3 directly. Now we assume that φ φ is unsatisﬁable. We ﬁrst prove φ φ |= φ by induction on the modal depth of φ. Let md(φ) denote the modal depth of φ. When md(φ) = 0, we consider φ in two cases:

md(φ ) = 0. Then φ s φ |= φ by Satoh s revision.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

md(φ ) > 0. Let φ = W and φ = W . By Deﬁnition 4.3 Item 2, φ φ = W{δ δ | (δ, δ ) }. For each (δ, δ ) , since higher-order subformulas in δ are directly conjoined into δ δ , we have δ δ |= δ . Let φ φ = W . For each δ , there is a δ s.t. δ |= δ . Therefore, φ φ |= φ . When md(φ) = n + 1, we consider φ and φ in two cases: φ and φ are normal S-terms. Let φ = φ φ . By Items (a), (d) and (e), we have φ 0 |= φ 0, V a A LaΨ a |= V a A LaΨ a and V a A DaΛ a |= V a A DaΛ a. By Item (b) and induction, for each agent a we have µ a |= µ a, thus V a A Caµ a |= V a A Caµ a. Similarly, by Item (c) we have V a A Kaφ a |= V a A Kaφ a. Take the conjunction of the above results, we have φ |= φ . Otherwise, we follow the proof in the base case. Let γ be a modal S-term s.t. φ γ is d-sat and φ = φ γφ . Now we prove that φ and φ γ are d-sat by induction on the modal depth of φ. When md(φ) = 0, φ γ is d-sat and φ γ φ = φ φ . We consider φ in two cases: md(φ ) = 0. Then φ s φ is d-sat by Satoh s revision. md(φ ) > 0. Let φ = W and φ = W . We have φ φ = W{δ δ | (δ, δ ) }. For each (δ, δ ) , let δ = φ0 and δ = φ 0 V a A(Kaφ a LaΨ a Caµ a DaΛ a). Thus δ δ = (φ0 sφ 0) V a A(Kaφ a LaΨ a Caµ a DaΛ a). Again by Satoh s revision, δ δ is d-sat. Therefore, φ and φ γ are d-sat. When md(φ) = n + 1, we consider φ and φ in two cases: φ and φ are normal S-terms. By Item (a) and Satoh s revision, φ 0 is d-sat. By Item (b) and induction, µ a is dsat. By deﬁnitions of revision with carry and unfolded formulas, we can show that Caµ a is d-sat. Similarly, we have that φ a and Kaφ a are d-sat. Since φ a |= V b =a µ b , φ a V b =a µ b V a A Caµ a is d-sat. By Item (d) and induction, ψ a φ a Kaφ a V b =a µ b V a A Caµ a is d-sat for each ψ a Ψ a. By Proposition 4.5 and Item (e), φ is d-sat. By Deﬁnition 4.1, we can show that φ γ is d-sat. Otherwise, we follow the proof in the base case.

Finally, we present the formal deﬁnition of our update operator. For Item 4, the difference with that of the revision operator lies with Item (d) where we update each possibility. Recall that for Item 4 (d) of the revision operator, when there are possibilities consistent with new common knowledge and knowledge, we only keep them and revise them. Deﬁnition 4.7. Let φ and φ be in unfolded normal SDNF. The update of φ with φ , written φ φ , is deﬁned recursively: 1. When φ and φ are propositional formulas, φ φ = φ w φ , where w is Winslett [1988] s update operator; 2. When φ = W , φ φ = W δ δ φ ; 3. When φ is an S-term, and φ = W , φ φ = W{φ δ | (φ, δ ) {φ} }; 4. Otherwise, φ and φ are normal S-terms, φ φ = φ where φ, φ , φ have the form as in Deﬁnition 4.3, and:

(a) φ 0 = φ0 φ 0; (b) If Caµa V b =a(Kbφ b LbΨ b) is satisﬁable, µ a = (µa γ1 µ a) W λ a Λ a(µa γ2 (µ a λ a)) where γ1 = Caµ a and γ2 = Caµ a DaΛ a; otherwise µ a = µ a; (c) φ a = (φa γ1 (φ a V b =a µ b )) W ψ a Ψ a(φa γ2 (ψ a φ a V b =a µ b )), where γ1 = Kaφ a V a A Caµ a and γ2 = Kaφ a Laψ a V a A Caµ a; (d) Ψ a = {ψ γ (φ a V b =a µ b ) | ψ Ψa} Ψ a, where γ = Kaφ a V a A Caµ a; (e) Λ a = {λ | λ Λa and Daλ V a A(Kaφ a LaΨ a Caµ a) is satisﬁable} Λ a.

Proposition 4.8. Let φ and φ be two d-sat unfolded normal SDNFs. Then φ φ is a d-sat normal SDNF, and φ φ |= φ . Moreover, (φ1 φ2) φ φ1 φ φ2 φ .

The distribution property can be obtained from Deﬁnition 4.7 Item 2 directly. We omit the proofs for the other properties of update since they are similar to those of revision.

5 Implementation and Experimental Results

Based on our reasoning, revision and update algorithms, we implemented a planner called MEPC for multi-agent epistemic planning with common knowledge. Our planning algorithm supports contingent planning by extending breadthﬁrst search to AND/OR graphs. We evaluate MEPC with ﬁve domains which use common knowledge in different ways. Collaboration-and-Communication: CC(n). There are 4 rooms, 2 agents and n boxes. Agents can enter rooms and sense boxes in it. Also, agents can share information. The goal is to let agents know the positions of boxes. The common knowledge is that each box is in exactly one room. Muddy-Children: MC(n, m). There are n children and m of them are muddy. Public-Announcing: PA(n). There are n agents in room 1. Agent 1 can sense whether the book is in room 2 and take it away. Each agent can share his belief about the book to others, while agent n can make a public announcement. The goal is to achieve common knowledge that agent 1 belives the book is missing. Selective-Communication: SC(n). There are n rooms and n agents in different rooms. A secret is false, but initially it s common knowledge that agent 1 believes the secret is true. Agent 1 can ﬁnd out that the secret is false. Each agent can move to a neighboring room and tell the secret to others in the room. The goal is to let all agents except agent 1 believe that the secret is false, while agent 1 believes that it is true. Prisoners-and-Lightbulb: PL(n). This domain is adapted from a puzzle in [van Ditmarsch and Kooi, 2015]. There are n prisoners in the prison. Every day one of them is interrogated in a room furnished with a light bulb. The goal is to let one of them know that all the prisoners have been interrogated. Our experiments were run on a Windows machine with 3.50GHz CPU and 8GB RAM. The results are shown in Table 1. The 2nd-4th columns indicate the number of agents, the

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

Domain A P |S + D| MEPC CC(2) 2 16 6+16 44.831-33.830(5/10165) CC(3) 2 20 14+16 1953.0-1703.0(5/35351) MC(2,2) 2 3 0+2 0.014-0.001(2/3) MC(3,3) 3 4 0+2 0.101-0.013(3/4) MC(4,2) 4 5 0+2 0.201-0.027(2/3) MC(4,3) 4 5 0+2 2.603-0.435(3/4) MC(4,4) 4 5 0+2 5.101-0.895(4/5) MC(5,5) 5 6 0+2 108.682-20.187(5/6) PA(2) 2 1 1+3 0.011-0.001(4/11) PA(3) 3 1 1+4 0.013-0.001(5/15) PA(4) 4 1 1+5 0.019-0.003(6/20) PA(5) 5 1 1+6 0.027-0.008(7/25) SC(2) 2 5 0+13 0.103-0.018(0/42) SC(3) 3 10 0+25 9.198-1.597(9/2610) SC(4) 4 17 0+41 738.566-61.571(10/70951) PL(2) 2 12 0+6 0.061-0.002(5/21) PL(3) 3 15 0+9 0.416-0.050(7/93) PL(4) 4 18 0+12 4.040-0.443(9/385) PL(5) 5 21 0+15 30.944-3.582(11/1493)

Table 1: Experimental Results

number of atoms, and the number of sensing and deterministic actions. In the last column, A-B(C/D) indicates A seconds of total time, B seconds spent on satisﬁability solving, depth C of solution tree (C=0 means the problem is unsolvable), and D nodes searched. The results show that our planner is capable of solving these problems of planning with common knowledge. However, our planner doesn t scale well, due to the exponential time complexity of the satisﬁability solving algorithm and the naive search method we use.

6 Conclusion

In this paper, we have extended an existing framework for multi-agent epistemic planning with the capability to deal with general common knowledge. We propose a novel normal form for multi-agent KD45 with common knowledge which makes use of the subscripted common knowledge operator and unfolds knowledge in a certain way. We propose satisﬁability solving, revision and update algorithms for this normal form. We implemented a planner MEPC, and it is capable of solving several domains involving typical usage of common knowledge. Despite the current limitations of our work, we have made a signiﬁcant ﬁrst step towards multiagent epistemic planning with common knowledge. In the future, we are interested in extending our work to handle common knowledge of a subset of agents. Proposing more efﬁcient algorithms for reasoning about common knowledge is another important future work.

Acknowledgments

We acknowledge support from the Natural Science Foundation of China under Grant Nos. 61572535 and 61463044.

References [Bolander and Andersen, 2011] Thomas Bolander and Mikkel Birkegaard Andersen. Epistemic planning for

single and multi-agent systems. Journal of Applied Non-Classical Logics, 21(1):9 34, 2011. [del Cerro et al., 2001] Luis Fari nas del Cerro, David Fauthoux, Olivier Gasquet, Andreas Herzig, Dominique Longin, and Fabio Massacci. Lo TREC : The generic tableau prover for modal and description logics. In Proceedings of the First International Joint Conference on Automated Reasoning, (IJCAR 2001), pages 453 458, 2001. [Fagin et al., 1995] Ronald Fagin, Joseph Y Halpern, Moshe Y Vardi, and Yoram Moses. Reasoning about knowledge. MIT Press, 1995. [Hales et al., 2012] James Hales, Tim French, and Rowan Davies. Reﬁnement quantiﬁed logics of knowledge and belief for multiple agentsc. In Proceedings of the Ninth Conference on Advances in Modal Logic, (Ai ML 2012), pages 317 338, 2012. [Halpern and Moses, 1992] Joseph Y. Halpern and Yoram Moses. A guide to completeness and complexity for modal logics of knowledge and belief. Artiﬁcial Intelligence, 54(2):319 379, 1992. [Huang et al., 2017] Xiao Huang, Biqing Fang, Hai Wan, and Yongmei Liu. A general multi-agent epistemic planner based on higher-order belief change. In Proceedings of the Twenty-Sixth International Joint Conference on Artiﬁcial Intelligence, (IJCAI 2017), pages 1093 1101, 2017. [Kominis and Geffner, 2015] Filippos Kominis and Hector Geffner. Beliefs in multiagent planning: From one agent to many. In Proceedings of the Twenty-Fifth International Conference on Automated Planning and Scheduling, (ICAPS 2015), pages 147 155, 2015. [Muise et al., 2015] Christian J. Muise, Vaishak Belle, Paolo Felli, Sheila A. Mc Ilraith, Tim Miller, Adrian R. Pearce, and Liz Sonenberg. Planning over multi-agent epistemic states: A classical planning approach. In Proceedings of the Twenty-Ninth AAAI Conference on Artiﬁcial Intelligence, (AAAI 2015), pages 3327 3334, 2015. [Satoh, 1988] Ken Satoh. Nonmonotonic reasoning by minimal belief revision. In Proceedings of the Second International Conference on Fifth Generation Computer Systems, (FGCS 1988), pages 455 462, 1988. [van Ditmarsch and Kooi, 2015] Hans van Ditmarsch and Barteld Kooi. One Hundred Prisoners and a Light Bulb. Springer, 2015. [Van Ditmarsch et al., 2007] Hans Van Ditmarsch, Wiebe van Der Hoek, and Barteld Kooi. Dynamic epistemic logic, volume 337. Springer, 2007. [Winslett, 1988] Marianne Winslett. Reasoning about action using a possible models approach. In Proceedings of the Seventh National Conference on Artiﬁcial Intelligence, (AAAI 1988), pages 89 93, 1988.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)