# concurrent_games_in_dynamic_epistemic_logic__4a97c4b5.pdf

Concurrent Games in Dynamic Epistemic Logic

Bastien Maubert1 , Sophie Pinchinat2 , Franc ois Schwarzentruber2 and Silvia Stranieri1

1Universit a degli Studi di Napoli Federico II , Italy 2Universit e de Rennes, IRISA, CNRS, France

Action models of Dynamic Epistemic Logic (DEL) represent precisely how actions are perceived by agents. DEL has recently been used to deﬁne inﬁnite multi-player games, and it was shown that they can be solved in some cases. However, the dynamics being deﬁned by the classic DEL update product for individual actions, only turn-based games have been considered so far. In this work we deﬁne a concurrent DEL product, propose a mechanism to resolve conﬂicts between actions, and deﬁne concurrent DEL games. As in the turn-based case, the obtained concurrent inﬁnite game arenas can be ﬁnitely represented when all actions are public, or all are propositional. Thus we identify cases where the strategic epistemic logic ATL K can be model checked on such games.

1 Introduction The discipline of distributed synthesis seeks solutions for automating the design of artiﬁcial agents that interact and make decisions in order to achieve some service. The main challenge is to synthesize strategies for each agent of a team collaborating towards a common goal, against opponents, usually in an imperfect information setting. As advocated in the Dynamic Epistemic Logic (DEL) approach to epistemic planning, we consider multi-agent systems described symbolically by DEL presentations, which consist in an initial epistemic state (actual world and worlds considered possible by some agents), and epistemic actions that agents may perform. An epistemic action, besides the standard precondition and effects of the action, also describes the agents ability to perceive its execution. Synthesising strategies was studied in the single-agent case [Bolander and Andersen, 2011; Aucher and Bolander, 2013; Bolander et al., 2015; Bolander, 2017], and in a multi-player turn-based game setting [Maubert et al., 2019; Maubert et al., 2020]. However, concurrent execution of actions has not been considered so far in epistemic planning. Some works consider concurrent execution of abstract actions (as in concurrent game structures [de Alfaro and Henzinger, 2000]), or concurrent execution of purely epistemic actions without effects on the world [van Eijck et al., 2011; Agotnes and van

Figure 1: The two-robot coordination example.

Ditmarsch, 2011; Benevides and Lima, 2019] or only public [Lima, 2014]. But to our knowledge, concurrent execution of arbitrary epistemic actions with explicit effects has never been studied. And yet they are essential for modelling realistic situations. Consider the two-robot coordination example of Figure 1. Some container can be placed either left or right. Two robots (or agents), i and j, can either wait or push the container, which incidentally is energy-consuming. Both agents waiting, the container stays in place. Otherwise, if i pushes while j waits, the container moves right (if not already there), and, symmetrically, if j pushes while i waits, it moves left (if not already there). Agents pushing concurrently results in a conﬂict. To determine the effect of such conﬂicting actions, one solution is to consider that they are blocking, as done for instance in [Fox and Long, 2003] for in temporal planning, or facts involved in conﬂifcts are just maintained [Lima, 2014]. Another option consists in selecting a maximal subset of non-conﬂicting actions, as done for instance in [Eshuis and Wieringa, 2001] in UML work-ﬂow modelling. In this paper, we adopt the latter solution, which is more challenging. A major additional difﬁculty with respect to this latter UML setting arises from epistemic features of actions of the DEL framework. In our robot example, while robots i and j perfectly perceive pushing and waiting actions, the two other agents k and ℓcannot distinguish them. Note that this issue is not tackled in [Lima, 2014] since actions are public. Our ﬁrst contribution is to deﬁne a DEL concurrent update product that provides the dynamics of concurrent epistemic actions. This new product yields a non-deterministic dynamics controlled by a scheduler, whose role is to resolve conﬂicts. On the basis of this product we show how ﬁnite DEL presentations can generate inﬁnite concurrent game arenas. Our second contribution is a proof that distributed synthesis can be solved for such games for two cases: the case where all actions are public (perfectly observed by everyone), and the case where all actions have propositional preand post-

Proceedings of the Twenty-Ninth International Joint Conference on Artiﬁcial Intelligence (IJCAI-20)

w u ijkℓ ijkℓ

Figure 2: The epistemic state of Figure 1.

conditions and information among the agents is hierarchical (Theorems 4 and 7). To prove this we extend to the concurrent setting some results from [Maubert et al., 2019; Maubert et al., 2020], namely, that the inﬁnite concurrent game arenas arising from DEL game presentations can be ﬁnitely represented when actions are all public or all propositional. We then transfer existing results on the model-checking problem for the epistemic strategic logic ATL K [Van der Hoek and Wooldridge, 2003] to obtain, in particular, decidability of distributed synthesis for rich temporal epistemic objectives. Remark that our work is unrelated to the concurrent dynamic logic from [van Benthem et al., 2007], where several games are played concurrently. Our work also differs from concurrent dynamic epistemic logic [van Ditmarsch et al., 2003], where concurrency represents some form of alternation [Chandra and Stockmeyer, 1976].

Outline: In Section 2, we provide all the necessary material on DEL presentations. In Section 3 we present the concurrent update product, and we use it in Section 4 to deﬁne DEL games and establish our main results.

2 Background on DEL

We ﬁx a ﬁnite set of agents Ag = {1, . . . , N} and a countable set of atomic propositions AP. The language of epistemic logic, noted LK, is deﬁned by the grammar ϕ ::= true | p | ϕ | ϕ ϕ | Kiϕ where p AP and i Ag. Formula Kiϕ reads as agent i knows that ϕ . Ruling out modality Ki yields propositional logic.

2.1 Epistemic Models and States Epistemic models are relational structures that represent agents knowledge.

Deﬁnition 1 (Epistemic models). An epistemic model is a structure M = (W, ( i)i Ag, λ) where W is a ﬁnite nonempty set of worlds, for each agent i Ag, the equivalence relation i W W is an epistemic relation that reﬂects agents i s information (or knowledge), and λ : W 2AP is a valuation function. An epistemic state is a pair (M, w) where M is an epistemic model and w is a world of M representing the actual world.

An epistemic state (M, w) will often be written w when the model M is clear from the context.

Example 1. The epistemic state of Figure 2 represents the situation of Figure 1. Proposition left holds when the container is on the left. In the actual world w, left is false since the container is on the right and we assume that agents i, j and k know this fact, but that ℓdoes not: he does not distinguish between world w and the world u where the container is on the left.

The semantics of LK is based on epistemic states.

pre: low Ei

post: low Ei := left left := false

pre: true post: low Ei := false

pre: low Ej

post: low Ej := left left := true

pre: true post: low Ej := false

Figure 3: The action model of the two-robot example.

Deﬁnition 2 (Semantics of LK). Let (M, w) be an epistemic state. We deﬁne the sentence Formula ϕ holds in (M, w) , written (M, w) |= ϕ or w |= ϕ, by induction on ϕ: w |= true; w |= p whenever p λ(w); w |= ϕ whenever w |= ϕ; w |= ϕ1 ϕ2 whenever w |= ϕ1 and w |= ϕ2; and w |= Kiψ whenever (M, u) |= ψ, for all worlds u such that w i u.

2.2 Action Models and Epistemic Actions

Action models ([Baltag et al., 1998], [Van Ditmarsch et al., 2007]) gather several concrete actions (with preconditions and effects [van Ditmarsch and Kooi, 2006]) with epistemic relations describing how the concrete actions are perceived.

Deﬁnition 3 (Action models). An action model is a structure A = (A, ( i)i Ag, pre, post) where A is a nonempty ﬁnite set of actions, i A A is the epistemic relation of agent i Ag, the precondition function pre : A LK speciﬁes the guard for an action to be executable, and the postcondition is a partial function post : A AP LK that describes the effect of actions over atomic propositions; when post(a, p) is undeﬁned, p is not affected by a. An epistemic action is a pair (A, a) where a A denotes the action actually executed.

Example 2. Figure 3 describes the action model for the tworobot coordination example. Action pushi represents agent i pushing the container to the right. Agent i can execute pushi whenever its energy is high, i.e., when proposition low Ei is false (low Ei and low Ej are initially false); this action consumes energy if it is effective (i.e., the container was not already on the right): in that case the postcondition makes proposition low Ei true. Action pushj for agent j is deﬁned similarly. Finally, the waiting action of robots, represented by waiti and waitj respectively, have the true precondition and the effect of reﬁlling energy (i.e. turning low Ei to false). To keep our example simple, we suppose that all agents but ℓ distinguish between pushi and waiti even if the box is on the right, and similarly for pushj and waitj.

We exhibit two particular kinds of epistemic actions.

Deﬁnition 4 (Propositional, public action). An action model A is propositional if for every concrete action a of A and each proposition p, formulas pre(a) and post(a)(p) are propositional. An epistemic action (A, a) is public if for all agents i, a i a if and only if a = a.

Proceedings of the Twenty-Ninth International Joint Conference on Artiﬁcial Intelligence (IJCAI-20)

w, pushi u, pushi

w, waiti u, waiti

Figure 4: Epistemic state after agent i pushed the container.

2.3 Update Product Execution of epistemic actions may change the world but also agents information. It is formalised by the update product between epistemic models and action models. Deﬁnition 5 (Update product). The update product of epistemic model M = (W, ( i)i Ag, λ) with action model A = (A, ( i)i Ag, pre, post) is the epistemic model M A := (W , ( i )i Ag, λ ), where: W = {(w, a) W A | (M, w) |= pre(a)}; (w, a) i (u, b) whenever w i u and a i b; p λ (w, a) whenever (M, w) |= post(a, p). We lift the product to epistemic states and epistemic actions: whenever M, w |= pre(a), we let (M, w) (A, a) be the epistemic state (M A, (w, a)). Example 3. Figure 4 shows (M A, (w, pushi)), that describes the epistemic state after execution of action pushi from the epistemic state of Figure 1 (strictly speaking, it only displays worlds connected to the actual one). The physical situation has not changed: the container is still on the right. However agent ℓ, who initially considered it possible that the container is on the left, and cannot distinguish between pushi and waiti, still considers it possible that the container is on the left. Some information has changed though: initially, agent ℓknew that robots had energy, but now he considers it possible that robot i s energy has become low.

3 Concurrent Actions Fix an epistemic state (M, w) and an action model A. From now on, atomic propositions are partitioned into shared propositions (APs) that all agents can modify, and private ones: APp i is the set of private propositions of agent i. As a result, AP = APs U i Ag APp i , where is the disjoint union. An agent can play any action that does not modify private propositions of others. We gather in set Ai those actions, namely those whose postconditions are undeﬁned on j =i APp j. A joint action is a tuple a = a1, . . . , a N Q i Ag Ai, and we let aj denote action aj. Finally, a joint action a is available in w when every individual action aj can be executed in w: (M, w) |= pre( a1) pre( an).

3.1 Conﬂicts We deﬁne a formula noconﬂict( a)(p) expressing that all individual actions of a joint action a agree on their effect (if any)

on proposition p. We ﬁrst introduce the set Ag( a, p) of agents whose individual action in a has an effect on p. Formally, Ag( a, p) is the set {i = 1..N | post( ai)(p) is deﬁned}. We then let noconﬂict( a)(p) be the formula V i Ag( a,p) post( ai)(p) V i Ag( a,p) post( ai)(p).

Now, in a situation where all individual actions of the joint action a agree on their effect on each proposition, the effect of executing them concurrently can be determined unambiguously. Such a situation is expressed by the formula noconﬂict( a) := V p AP noconﬂict( a)(p).

Deﬁnition 6. A joint action a is non-conﬂicting in w if M, w |= noconﬂict( a). Otherwise a is conﬂicting in w. Example 4. In our running example, pushi, pushj is conﬂicting in both worlds w and u. To resolve conﬂicts, we resort to a scheduler.

3.2 The Role of the Scheduler Given a conﬂicting available joint action, the scheduler selects a maximal subset of consistent individual actions, and inhibits the remaining actions by cancelling their effects on shared variables. An inhibited action of agent i may still have effects on APp i , which may differ from the original ones. The inhibiting mechanism is implemented by a ghost mapping : A A with the following properties: for every agent i Ag and every action a Ai, the ghost (a) (written a ) must be in Ai and post(a ) is undeﬁned on APs. This assumption ensures that no conﬂict arises when inhibiting actions. We also assume that (a) is executable as long as a is, i.e., that pre(a ) is a logical consequence of pre(a). Example 5. For the robots, APs = {left}, APp i = {low Ei}, APp j = {low Ej}, and APp k = APp ℓ= . Facing the conﬂicting joint action pushi, pushj , the scheduler can choose between pushi , pushj and pushi, pushj , and we set pushi := waiti and pushj := waitj. The epistemic relations from a ghost can be set in many different ways, and the choice is a design matter. For example, assume that ghost actions are distinct from those that can be played by agents, and that ghosts are not epistemically related to those playable actions. Then the scheduler s selection is public: indeed, when for joint action a scheduler inhibits the action of agent i, all agents are implicitly informed that agent i is discarded. Deﬁnition 7. If A = A A is split between playable actions A and ghosts ones A , and A and A are not related by any i, we say that the scheduler is public. In the sequel, we consider that action models A are implicitly equipped with a ghost mapping.

3.3 Concurrent Update Product To deﬁne the concurrent execution of possibly conﬂicting actions, we introduce a concurrent update product that generalises the classic update product (Deﬁnition 5). Before deﬁning this product, we formalise the possible choices for the scheduler. We say that joint action b is a subaction of joint action a, written b a, whenever b is obtained

Proceedings of the Twenty-Ninth International Joint Conference on Artiﬁcial Intelligence (IJCAI-20)

low Ej, left left

w, (pushi, waitj) u, (pushi, waitj)

w, (waiti, waitj) u, (waiti, waitj)

w, (waiti, pushj) u, (waiti, pushj)

Figure 5: The model (M A, w, (pushi, waitj) ) (for readability, all reﬂexive edges are missing).

from a by replacing some individual actions by their ghosts. Formally, b a if bi { ai, ( ai) }, for every 1 i N. Remark that, by deﬁnition of the ghost mapping, if a is available in w, so are its sub-actions. Given a joint action a, we let Max( a, w) be the set of joint actions composed of the -maximal sub-actions of a nonconﬂicting in w. Elements of Max( a, w) are therefore joint actions non-conﬂicting in w where a minimal number of actions are inhibited. Notice that Max( a, w) = { a} as soon as a is non-conﬂicting in w. Example 6. The set Max( pushi, pushj , w) is { waiti, pushj , pushi, waitj }. We now deﬁne the concurrent update product, that implements the execution of joint actions. Deﬁnition 8 (Concurrent update product). The concurrent update product of M and A is the Kripke model M A = (W , ( i )i Ag, λ ), where:

W = {(w, b) W AN | b Max( a, u) for some joint action a available in w};

(w, b) i (u, c) if w i u and bj i cj for all j;

p λ (u, b) if (M, u) |= V

i Ag( b,p) post(bi, p).

We can now deﬁne the set of epistemic states that may result from executing a joint action a in (M, w):

(M, w) a := {(M A, (w, b)) | b Max( a, w)}.

This set is a singleton when a is non-conﬂicting in w. Remark that the component-wise deﬁnition of concurrent epistemic relations i in Deﬁnition 8 makes the identity of who performs an action common knowledge. Example 7. The pointed concurrent update product (M A, w, (pushi, waitj) ) of Figure 5 belongs to (M, w) (pushi, pushj). Action pushj was inhibited, but agent ℓconsiders it possible that low Ej is now true.

4 Concurrent DEL Games In this section, we show how DEL can represent inﬁnite concurrent game arenas, in which players act simultaneously. We then show how, as for turn-based games [Maubert et al., 2019; Maubert et al., 2020] these inﬁnite game arenas can in some cases be folded back into ﬁnite ones, when actions are all public, or when they are all propositional. Finally, we show that

in these cases, one can model check the epistemic strategic logic ATL K on DEL-represented concurrent games.

4.1 Concurrent Game Arenas Here we recall (nondeterministic) concurrent game arenas with imperfect information and related notions. Deﬁnition 9. A concurrent game arena is a tuple G = (Act, V, vini, (Acti)i Ag, , ( i)i Ag, λ), where Act is a non-empty set of actions, V is a non-empty set of positions, vini V is an initial position, Acti : V 2Act \ { } is a repertoire function for i, V Act N V is a transition relation, i V V is an indistinguishability equivalence relation for agent i, and λ : V 2AP is a valuation function. In a position v each agent i chooses an available action ai Acti(v), yielding a joint action a = a1, . . . , a N . The set of joint actions available in v is written JAct(v). The game then moves nondeterministically to some position v (v, a). If, for every position v and joint action a, {v | (v, a, v ) } is a singleton, then G is deterministic and we may represent the transition relation as a function δ : V Act N V . A play π = v0v1v2 . . . is an inﬁnite sequence of positions such that for all k N, there exists a Act N such that vk+1 (vk, a). We let πk = vk and π k = v0v1 . . . vk. A history h = v0v1 . . . vn is a ﬁnite non-empty preﬁx of a play, last(h) = vn is the last position in h, |h| = n + 1 is its length, and Hist G is the set of histories in G. We write λ(h) for λ(last(h)). As usual for agents with synchronous perfect recall (see for instance [Fagin et al., 1995]), the indistinguishability relation of each agent i is lifted to histories as follows: h i h if |h| = |h | and hk i h k for every k < |h|. A (uniform) strategy σ for an agent i is a function σi that associates an action to any history and such that, for every pair of indistinguishable histories h i h , σ(h) = σ(h ). Classically, we assume that Acti(v) = Acti(v ) whenever v i v . A strategy proﬁle for a coalition of agents C Ag is a tuple σC = σi i C, and ΣC is the set of strategy proﬁles for coalition C. Finally, an outcome of a strategy proﬁle σC from a position v0 is a play starting in position v0 and in which agents in C follow the strategies in ΣC: formally, a play π = v0v1 . . . is an outcome of σi i C from v0 if, for every k N, there is a joint action a such that vk+1 (vk, a) and ( a)i = σi(v0 . . . vk), for all i C. We let out(h, σC) be the set of outcomes of σC from position v. Deﬁnition 10. Given a concurrent game arena G = (Act, V, vini, (Acti)i Ag, , ( i)i Ag, λ), we deﬁne the unfolding of G as the game arena Gunf = (Act, V , v ini, (Act i)i Ag, , ( i)i Ag, λ ) where V = Hist G, v ini = vini, for every h Hist G, Act i(h) = Acti(last(h)), λ (h) = λ(last(h)), i is the synchronous perfect-recall lifting of i to histories, and = {(h, a, h v) | (last(h), a, v) }. We say that two game structures G and G are equivalent whenever their unfoldings are isomorphic.

Proceedings of the Twenty-Ninth International Joint Conference on Artiﬁcial Intelligence (IJCAI-20)

4.2 DEL Presentations of Game Arenas We use DEL models to deﬁne inﬁnite concurrent arenas:

Deﬁnition 11. A DEL game presentation, or DEL game, is a tuple G = M, w, A where (M, w) is an initial epistemic state and A is an action model.

The game starts in the initial epistemic state (M, w). In each round, each agent i chooses an action a Ai available in the current state (M , w ), resulting in a joint action a (we assume that each agent always has at least one available action; if needed we may add a dummy action in the model). The next epistemic state is nondeterministically chosen by the scheduler among (M , w ) a (this choice is nontrivial if a is conﬂicting), and the game goes on. After n rounds in which the players chose joint actions a1, . . . , an, the epistemic state is of the form MAn, (w, b1, . . . , bn), where MAn is deﬁned by letting MA0 = M and MAn+1 = MAn A, and each bk+1 is a maximal sub-action of ak+1 consistent in MAk, (w, b1, . . . , bk). In the following we may write w b1 . . . bn instead of (w, b1, . . . , bn), and call it a history. The length of a history in MAn is deﬁned as |w b1 . . . bn| = n. Since the length of a history determines the epistemic model to which it belongs, we may omit it and write w b1 . . . bn |= ϕ instead of MAn, w b1 . . . bn |= ϕ. We require that players know their available actions: if w b1 . . . bn i u c1 . . . cn, then the same actions are available to i in both worlds. A DEL presentation denotes a concurrent game arena, where positions are epistemic states attainable from the initial one,and moves are obtained by applying the concurrent update product of Deﬁnition 8. Because this product yields several epistemic states, this concurrent game arena is nondeterministic.

Deﬁnition 12. Given a DEL game presentation G = M, w, A we deﬁne the concurrent game arena GG = (A, V, vini, (Acti)i Ag, , ( i)i Ag, λ) where, writing MAn = (Wn, ( n i )i Ag, λn) for every n:

A is the set of actions in A, the set of positions is V = n NWn, the initial position is vini = w, Acti(v) = {a Ai | v |= pre(a)}, = {(v, a, v ) | v v a}, v i v if |v| = |v | = n and v n i v , and λ(v) = λ|v|(v).

We want to reason about strategic abilities and knowledge on inﬁnite concurrent games given as DEL game presentations. To do so we consider the logic ATL K.

Deﬁnition 13. The syntax of ATL K is given by the following grammar:

ϕ ::= p | ϕ | ϕ1 ϕ2 | Kiϕ | C ψ ψ ::= ϕ | ψ | ψ1 ψ2 | ψ | ψ1Uψ2

with p AP, C Ag and i Ag.

Formula C ψ reads as coalition C has a strategy proﬁle to ensure ψ , and the meaning of other operators is as usual. We let C = i C i, and deﬁne the semantics of ATL K.

Deﬁnition 14. Let G be a concurrent game arena, v a position and π a play. The semantics of ATL K is deﬁned as follows, where h is a history in G, π is a play and i N is a point in time (we omit boolean cases):

G, h |= p if p λ(h) G, h |= Kiϕ if h Hist G s.t. h i h , G, h |= ϕ G, h |= C ψ if σC ΣC s.t. h C h, π out(h , σC), G, π |= ψ G, π |= ϕ if π[0] |= ϕ G, π |= ψ if G, π 1 |= ψ G, π |= ψ1Uψ2 if i 0 s.t. G, π i |= ψ2 and 0 j < i, G, π j |= ψ1

Remark 1. We use the uninformed semantics for the knowledge operators, which does not depend on the strategies used by the agents, and is the usual one used in epistemic planning (see [Puchala, 2010; Maubert, 2014; Maubert et al., 2020] for more details). Also, we use the subjective semantics for strategic operators, in contrast with the objective one that only asks strategies to be winning from the actual current history (see [Bulling and Jamroga, 2014]).

Given a game arena G with initial position vini, we write G |= ϕ if G, vini |= ϕ. We study the following modelchecking problem: given a DEL game presentation G and a formula ϕ ATL K, do we have GG |= ϕ?

4.3 Decidability for Public Actions

A deterministic game arena with transition function δ is said to have only public actions if, for all positions v, v and joint actions a, a such that a = a , we have δ(v, a) i δ(v , a ). The following is known for ATL K on such arenas:

Theorem 1 ([Belardinelli et al., 2017]). Model checking ATL K on ﬁnite deterministic concurrent game structures with only public actions is 2-EXPTIME-complete.

First we prove that DEL games with public actions can be ﬁnitely represented. Our proof yields nondeterministic game arenas, that we will then transform into deterministic ones that have only public actions in the sense of [Belardinelli et al., 2017]. We will need the following result on the concurrent update product with public action models and a public scheduler (see Deﬁnition 7):

Lemma 2. Let M be an epistemic model and A an event model. If the scheduler is public, A has only public actions and its ghost mapping : A A is injective, then for all worlds w, w M, all joint actions a, a , if a = a then for all (w, b) (M, w) a and (w , b ) (M, w ) a , for all agents i, (w, b) i (w , b ).

The injectivity assumption means that two different played actions cannot become the same after being inhibited, in which case the actions of DEL games could not be public in the sense of [Belardinelli et al., 2017].

Proceedings of the Twenty-Ninth International Joint Conference on Artiﬁcial Intelligence (IJCAI-20)

The proof that the inﬁnite game arena GG induced by a DEL game G with public actions can be ﬁnitely represented is very similar to that in [Maubert et al., 2020]. It relies on the fact that updating an epistemic state with a joint action made of public actions can only decrease the size of the obtained epistemic states (when removing their disconnected components) as long as their ghosts are public too. It follows that only ﬁnitely many different epistemic models can be generated from a given initial state, up to isomorphism.

Proposition 3. Given a DEL game G = M, w, A with only public actions, one can build a ﬁnite game arena G equivalent to GG and of size exponential in |G|.

Because of the scheduler, the resulting arena is nondeterministic. To obtain Theorem 4 below it essentially remains to show that we can transform these nondeterministic arenas into deterministic ones that have only public actions in the sense of [Belardinelli et al., 2017].

Theorem 4. Model checking ATL K on DEL concurrent games with public actions, public scheduler and injective ghost mapping is 2-EXPTIME-complete.

Proof sketch. The lower bound comes from LTL synthesis [Pnueli and Rosner, 1989]. For the upper bounds, let G be a DEL presentation and Φ an ATL K formula. First, by Proposition 3 we build in exponential time a ﬁnite nondeterministic game arena G = (Act, V, vini, (Acti)i Ag, , ( i)i Ag, λ) equivalent to GG. We build a deterministic game arena Gd by adding a special agent, the scheduler s, who has perfect information and whose possible actions Acts allow him to resolve the non-determinism. The agents choose their actions, and then the scheduler resolves nondeterminism. To do so we introduce intermediary positions of the form (v, a): in each round, in a position v, the agents in Ag choose a joint action a (the scheduler can only play some dummy action), and the game moves deterministically to (v, a). Now all normal agents can only choose the dummy move, and the scheduler can choose a maximal consistent sub-action b of a, which determines the new position v . Indistinguihability relations for all normal agents are as in G for positions v, and we let (v, a) i (v , a ) iff v i v and a = a . This deﬁnition is correct because, by Lemma 2, when the agents choose different joint actions, they distinguish the resulting epistemic states no matter the choices of the scheduler. One can check that Gd has only public actions in the sense of [Belardinelli et al., 2017]. We then duplicate all next operators in Φ to skip artiﬁcial positions of the form (v, a). To check Φ on Gd, the procedure from [Belardinelli et al., 2017] takes time doubly exponential in the size of Φ and exponential in the size of Gd, which is itself of size exponential in the size of the original DEL game presentation G.

4.4 Decidability for Propositional Actions A DEL game presentation presents hierarchical information if the set of agents Ag can be totally ordered (i1 < . . . < i N) so that ii ii+1 for each 1 i < N, and similarly for concurrent game arenas. We prove that when, in a DEL game presentation, all actions are propositional actions and information is hierarchical, model checking ATL K is decidable.

We ﬁrst generalise to the concurrent setting a result from [Maubert et al., 2019] which states that inﬁnite turnbased DEL games induced by propositional models can be ﬁnitely represented. As for public actions, the game arena that we obtain is nondeterministic. Proposition 5. Let G = M, w, A be a DEL game where A is propositional. One can build a ﬁnite nondeterministic game arena G equivalent to GG, of size exponential in |G|. We will invoke the following result, that was stated for deterministic game arenas: Theorem 6 ([Berthon et al., 2017]). Model checking ATL K with uninformed semantics is decidable on deterministic game arenas with hierarchical information. The result in [Berthon et al., 2017] is for ATL without knowledge operators, and for the objective semantics of strategic operators. But their bottom-up algorithm can be easily extended to deal with knowledge operators and subjective semantics: one ﬁrst performs a powerset construction on the game arena to include sufﬁcient information to evaluate knowledge operators positionally (for instance using ktrees [van der Meyden, 1998]), and then one evaluates knowledge and strategic modalities in a bottom-up fashion. Thanks to Proposition 5 and a reduction to deterministic arenas similar to the one presented in the proof of Theorem 4, we can use Theorem 6 and obtain that: Theorem 7. Model checking ATL K on DEL game presentations with propositional actions and hierarchical information is decidable. However the complexity is nonelementary, as it is already the case for multiplayer reachability games with hierarchical information [Peterson et al., 2001]. The number of exponentials will be the maximum between the number of agents with different observation power and the alternation of knowledge operators in the formula.

5 Conclusion We proposed a setting that permits the succinct speciﬁcation of concurrent game arenas with epistemic actions. Conﬂicts are resolved by a scheduler that nondeterministically selects consistent subsets of actions, and inhibits the discarded ones. We established that on such games one can perform strategic reasoning, or more precisely one can model check the strategic epistemic logic ATL K, in two cases: when actions are public (Theorem 4) and when actions are propositional and information is hierarchical among the agents (Theorem 7). Note that we could tune the scheduler differently: in our robot example, a third choice for pushi, pushj could be waiti, waitj , which would capture blocking conﬂicts. Technically, we could relax the selection to other sub-actions than -maximal ones, and the complexities established in our main theorems would still hold for such different schedulers. One could also consider fairness requirements to ensure that the scheduler does not favor some agents in resolving conﬂicts, for instance requiring that if an agent inﬁnitely often enters a conﬂict, then inﬁnitely often his action is selected by the scheduler. Fairness constraints can often be expressed directly in ATL K, in which case our main results still hold.

Proceedings of the Twenty-Ninth International Joint Conference on Artiﬁcial Intelligence (IJCAI-20)

References [ Agotnes and van Ditmarsch, 2011] Thomas Agotnes and Hans van Ditmarsch. What will they say? - public announcement games. Synthese, 179(Supplement-1):57 85, 2011. [Aucher and Bolander, 2013] G. Aucher and T. Bolander. Undecidability in epistemic planning. In IJCAI 13, 2013. [Baltag et al., 1998] Alexandru Baltag, Lawrence S. Moss, and Slawomir Solecki. The logic of public announcements and common knowledge and private suspicions. In Itzhak Gilboa, editor, Proceedings of the 7th Conference on Theoretical Aspects of Rationality and Knowledge (TARK-98), Evanston, IL, USA, July 22-24, 1998, pages 43 56. Morgan Kaufmann, 1998. [Belardinelli et al., 2017] Francesco Belardinelli, Alessio Lomuscio, Aniello Murano, and Sasha Rubin. Veriﬁcation of multi-agent systems with imperfect information and public actions. In AAMAS 17, pages 1268 1276, 2017. [Benevides and Lima, 2019] Mario Roberto Folhadela Benevides and Isaque Macalam Saab Lima. Dynamic epistemic logic with communication actions. ar Xiv preprint ar Xiv:1902.01164, 2019. [Berthon et al., 2017] Rapha el Berthon, Bastien Maubert, and Aniello Murano. Decidability results for ATL* with imperfect information and perfect recall. In AAMAS 17, pages 1250 1258, 2017. [Bolander and Andersen, 2011] T. Bolander and M.B. Andersen. Epistemic planning for single and multi-agent systems. J. Appl. Non-Class. Log., 21(1):9 34, 2011. [Bolander et al., 2015] T. Bolander, M.H. Jensen, and F. Schwarzentruber. Complexity results in epistemic planning. In IJCAI, pages 2791 2797, 2015. [Bolander, 2017] Thomas Bolander. A gentle introduction to epistemic planning: The DEL approach. In M4M@ICLA 2017, Indian Institute of Technology, Kanpur, India, 8th to 10th January 2017., pages 1 22, 2017. [Bulling and Jamroga, 2014] Nils Bulling and Wojciech Jamroga. Comparing variants of strategic ability: how uncertainty and memory inﬂuence general properties of games. Autonomous agents and multi-agent systems, 28(3):474 518, 2014. [Chandra and Stockmeyer, 1976] Ashok K Chandra and Larry J Stockmeyer. Alternation. In 17th Annual Symposium on Foundations of Computer Science (sfcs 1976), pages 98 108. IEEE, 1976. [de Alfaro and Henzinger, 2000] Luca de Alfaro and Thomas A Henzinger. Concurrent omega-regular games. In LICS 00, pages 141 154. IEEE, 2000. [Eshuis and Wieringa, 2001] Rik Eshuis and Roel J. Wieringa. A real-time execution semantics for UML activity diagrams. In FASE 01, pages 76 90, 2001. [Fagin et al., 1995] Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi. Reasoning about knowledge, volume 4. MIT press Cambridge, 1995.

[Fox and Long, 2003] Maria Fox and Derek Long. PDDL2.1: an extension to PDDL for expressing temporal planning domains. J. Artif. Intell. Res., 20:61 124, 2003. [Lima, 2014] Tiago De Lima. Alternating-time temporal dynamic epistemic logic. J. Log. Comput., 24(6):1145 1178, 2014. [Maubert et al., 2019] Bastien Maubert, Sophie Pinchinat, and Franc ois Schwarzentruber. Reachability games in dynamic epistemic logic. In IJCAI 19, pages 499 505, 2019. [Maubert et al., 2020] Bastien Maubert, Aniello Murano, Sophie Pinchinat, Franc ois Schwarzentruber, and Silvia Stranieri. Dynamic epistemic logic games with epistemic temporal goals. In ar Xiv preprint ar Xiv:2001.07141, 2020. [Maubert, 2014] B. Maubert. Logical foundations of games with imperfect information : uniform strategies. Ph D thesis, University of Rennes 1, France, 2014. [Peterson et al., 2001] Gary Peterson, John Reif, and Salman Azhar. Lower bounds for multiplayer noncooperative games of incomplete information. CAMWA, 41(7):957 992, 2001. [Pnueli and Rosner, 1989] Amir Pnueli and Roni Rosner. On the synthesis of a reactive module. In POPL, pages 179 190, 1989. [Puchala, 2010] Bernd Puchala. Asynchronous omegaregular games with partial information. In MFCS 10, pages 592 603, 2010. [van Benthem et al., 2007] Johan van Benthem, Sujata Ghosh, Fenrong Liu, et al. Modelling simultaneous games with concurrent dynamic logic. Institute for Logic, Language and Computation (ILLC), University of Amsterdam, 2007. [Van der Hoek and Wooldridge, 2003] Wiebe Van der Hoek and Michael Wooldridge. Cooperation, knowledge, and time: Alternating-time temporal epistemic logic and its applications. Studia logica, 75(1):125 157, 2003. [van der Meyden, 1998] Ron van der Meyden. Common knowledge and update in ﬁnite environments. Inf. Comput., 140(2):115 157, 1998. [van Ditmarsch and Kooi, 2006] Hans P. van Ditmarsch and Barteld P. Kooi. Semantic results for ontic and epistemic change. Co RR, abs/cs/0610093, 2006. [van Ditmarsch et al., 2003] Hans P. van Ditmarsch, Wiebe van der Hoek, and Barteld P. Kooi. Concurrent dynamic epistemic logic for MAS. In AAMAS 03, pages 201 208, 2003. [Van Ditmarsch et al., 2007] Hans Van Ditmarsch, Wiebe van Der Hoek, and Barteld Kooi. Dynamic epistemic logic, volume 337. Springer Science & Business Media, 2007. [van Eijck et al., 2011] Jan van Eijck, Floor Sietsma, and Yanjing Wang. Composing models. Journal of Applied Non-Classical Logics, 21(3-4):397 425, 2011.

Proceedings of the Twenty-Ninth International Joint Conference on Artiﬁcial Intelligence (IJCAI-20)