# boolean_observation_games__538266a4.pdf

Journal of Artiﬁcial Intelligence Research 79 (2024) 307-357 Submitted 07/2023; published 01/2024

Boolean Observation Games

Hans van Ditmarsch hans.van-ditmarsch@irit.fr CNRS, University of Toulouse, IRIT France

Sunil Simon simon@cse.iitk.ac.in Department of CSE, IIT Kanpur India

We introduce Boolean Observation Games, a subclass of multi-player ﬁnite strategic games with incomplete information and qualitative objectives. In Boolean observation games, each player is associated with a ﬁnite set of propositional variables of which only it can observe the value, and it controls whether and to whom it can reveal that value. It does not control the given, ﬁxed, value of variables. Boolean observation games are a generalization of Boolean games, a well-studied subclass of strategic games but with complete information, and wherein each player controls the value of its variables. In Boolean observation games, player goals describe multi-agent knowledge of variables. As in classical strategic games, players choose their strategies simultaneously and therefore observation games capture aspects of both imperfect and incomplete information. They require reasoning about sets of outcomes given sets of indistinguishable valuations of variables. An outcome relation between such sets determines what the Nash equilibria are. We present various outcome relations, including a qualitative variant of ex-post equilibrium. We identify conditions under which, given an outcome relation, Nash equilibria are guaranteed to exist. We also study the complexity of checking for the existence of Nash equilibria and of verifying if a strategy proﬁle is a Nash equilibrium. We further study the subclass of Boolean observation games with knowing whether goal formulas, for which the satisfaction does not depend on the value of variables. We show that each such Boolean observation game corresponds to a Boolean game and vice versa, by a diﬀerent correspondence, and that both correspondences are precise in terms of existence of Nash equilibria.

1. Introduction

Reasoning about strategic agents is an important problem in the theory of multi-agent systems and game-theoretic models and techniques are often used as a tool in such analysis. Strategic games (Osborne & Rubinstein, 1994) is a classic and well-studied framework that models one-shot multi-player games where agents make their choice simultaneously. It forms a simple and intuitive formalism to analyse and reason about the strategic behaviour of agents. From the perspective of computer science and artiﬁcial intelligence, one of the main drawbacks of strategic games is that the explicit representation of the payoﬀ(or utility) function is exponential in the number of players and the strategies available for each player. In many applications, compact representation of the underlying game model is highly desirable. Various approaches have been suggested to achieve compact representation of games and these mainly involve imposing restrictions on the payoﬀfunctions. For instance, constrain-

2024 The Authors. Published by AI Access Foundation under Creative Commons Attribution License CC BY 4.0.

Van Ditmarsch & Simon

ing the payoﬀfunctions to be pairwise separable (Janovskaya, 1968; Cai & Daskalakis, 2011) results in the well-studied class of games with a compact representation, called polymatrix games. Additively separable hedonic games (Aziz & Savani, 2016) form another subclass of strategic games with pairwise separable payoﬀfunctions which can be used to analyse coalition formation in multi-agent systems. It is also possible to achieve compact representation by explicitly restricting the dependency of payoﬀfunctions to a small number of other agents (or neighbourhood) as done in graphical games (Kearns, Littman, & Singh, 2001). An alternative approach to imposing quantitative constraints on payoﬀs is to restrict the payoﬀs to qualitative outcomes which are presented as logical formulas. For example, extensive games played on graphs where the goal formulas can specify the evolution of play with a combination of temporal and epistemic speciﬁcations. Although originally deﬁned as two-player perfect information games motivated by questions in automata theory and logic, these models are now sophisticated to reason about multi-player games and imperfect information (Chatterjee, Doyen, Henzinger, & Raskin, 2007; Apt & Gr adel, 2011; Gutierrez, Murano, Perelli, Rubin, & Wooldridge, 2017). Boolean games (Harrenstein, van der Hoek, Meyer, & Witteveen, 2001), a subclass of strategic games with complete information where objectives are expressed as Boolean formulas, is also a well-studied framework with such qualitative outcomes. In Boolean games, each player controls a disjoint subset of propositional variables where their strategies correspond to choosing values for these variables and each player s goal is speciﬁed by a Boolean formula over the set of all variables. While the model was originally deﬁned to analyse two-player games, the framework has been extended in many directions. Multi-player, non-zero-sum Boolean games are studied in (Harrenstein, 2004; Bonzon, Lagasquie-Schiex, Lang, & Zanuttini, 2006). In (Harrenstein et al., 2001; Harrenstein, 2004) Boolean games are modelled as imperfect information games by taking the uncertainty over the other player s actions as an information set, as in (van Benthem, 2001). In (Dunne & van der Hoek, 2004; Bonzon et al., 2006; Dunne & Wooldridge, 2012) the computational properties of Boolean games are adressed, in (Bonzon, Lagasquie-Schiex, & Lang, 2009) graphical dependency structures for Boolean games and their implications for various structural and computational properties, and in (Ianovski & Ong, 2014) mixed strategy Nash equilibria and related computational questions. The issue of equilibrium selection is considered in ( Agotnes, Harrenstein, van der Hoek, & Wooldridge, 2013a). Iterated Boolean games (Gutierrez, Harrenstein, & Wooldridge, 2015; Gutierrez, Harrenstein, Perelli, & Wooldridge, 2016) model repeated interaction between players with temporal goals speciﬁed in linear time temporal logic (LTL). Partial ordering of the run-time events in terms of a dependency graph on propositions is studied in (Bradﬁeld, Gutierrez, & Wooldridge, 2016). Epistemic Boolean games, wherein goal formulas may be epistemic, were proposed in ( Agotnes, Harrenstein, van der Hoek, & Wooldridge, 2013b; Herzig, Lorini, Maﬀre, & Schwarzentruber, 2016). Both works combine the control of variables with the observation of variables (or formulas), where some of this is strategic and some is given with the game. This hybrid setting allows the authors to continue to analyse these epistemic Boolean games as complete information strategic form games. Realizing epistemic objectives depends on the valuation of variables resulting from strategic action. In this paper, we introduce Boolean observation games as a qualitative model to analyse and reason about a subclass of strategic games with incomplete information. In Boolean

Boolean Observation Games

observation games, players control whether and to whom they reveal (announce) the value of propositional variables that can only be observed by them. This constitutes a multiplayer game model with concise representation where players have (qualitative) epistemic objectives. It is incomplete information because realizing the objectives depends on a given ﬁxed valuation that the players cannot control. Players do not know what that valuation is and therefore do not know what game they play. Realizing epistemic objectives depends on the unknown valuation of variables that is independent from strategic action. (We should note that such incomplete games of imperfect information can also be modelled as complete games of imperfect information by assuming an initial random move of a player nature determining the valuation.) Since Boolean observation games deﬁne a subclass of strategic games, they form an ideal framework to analyse interactive situations that incorporate aspects of both imperfect as well as incomplete information games. Please consider the following examples.

Example 1 (A West Side Story). Tony and Maria (or was Romeo and Juliet? or Shanbo and Jingtai?) are in love with each other. But they have not declared their love to each other yet. This is risky business, as they are both uncertain about the feelings of the other one. Surely, given that they both love each other, their objective is to get to know that. But they consider it possible that the other person does not love them, in which case they might prefer not to declare their love. Their personalities are diﬀerent in that respect. What Tony wants to know, depends on how his feelings (being in love / not being in love) relate to the other person s: if they match, he wants the other person to know, otherwise, he doesn t. Whereas what Maria wants to know only depends on the other person s feelings: if the other one is in love, she wants the other one to know her true feelings and otherwise not. Given their state of mind and their personalities, should they declare their love to each other? Let Tony be player 1 and Maria be player 2, and let p1 represent Tony is in love and p2 represent Maria is in love . Propositions p1 and p2 are both true and remain so forever after. They cannot be controlled. The objectives (goals) denoted γi for player i, and where Kipj means player i knows pj , are:

γ1 = γ2 = p1 p2 K1p2 K2p1 p1 p2 K1 p2 K2p1 p1 p2 K1p2 K2 p1 p1 p2 K1 p2 K2 p1

They each have two strategies: declare their feelings (revealing the value of pi), or not. We succinctly explain that in this game, whatever the facts are, there is a strategy proﬁle in which both players win by satisfying their goal formulas, but that they can never know that they win. It is not so clear whether there (hopefully) is an equilibrium strategy proﬁle allowing them to declare their love to each other. As p1 and p2 are true, it is an equilibrium when they both announce that, as K1p2 K2p1 is then true and they both win (the other three strategy proﬁles result in both losing, including another equilibrium namely when both don t declare). But Tony considers it possible that p2 in which case announcing p1, and Maria s behaviour being equal, goal K1 p2 K2p1 will fail. In that case he should have kept his mouth shut to have them win. Given the uncertainty over the game he has to reason

Van Ditmarsch & Simon

about not two but four strategies for Maria: depending on whether she is in love or not, whether she would show her feelings or not. What he will do given this information set of two indistinguishable outcomes, also depends on his risk aversity. If he s an optimist, he might still go for it. But if he s a pessimist, maybe better not. Maria s considerations are not dissimilar, but recall that she has a diﬀerent personality (the goals are a diﬀerent function from their value of pi, in other words, permuting all occurrences of 1 and 2 in the goal results in a diﬀerent goal). Example 11 on page 321 will reveal it all.

Example 2 (A game of pennies that do not match). Consider two players Odd and Even both having a penny. They also both have a dice cup wherein they put their penny, shake the cup, and then put it on the table and watch privately whether their penny is heads or tails. Now they decide whether to inform the other player of the result, or not. If they both do or if they both don t, Even wins. So, Even wins if either 2 players know or 0 players know, so that one might therefore say that their state of knowledge is even . Otherwise, Odd wins. For the outcome it only matters whether they know that the penny is heads or tails, it does not matter whether it is heads or tails. What should they do? We let Odd be player 1 and Even be player 2, and we let p1 represent Odd s penny is heads , whereas p2 stands for Even s penny is heads . The goals are therefore (where Kwipj abbreviates Kipj Ki pj and means player i knows whether pj):

γ1 = Kw1p2 Kw2p1 γ2 = Kw1p2 Kw2p1

On ﬁrst sight it seems quite straightforward what they should do, as the outcome does not depend on the valuation of p1 and p2. If Odd and Even both announce the result of their throw with the penny, Odd would then have done better not to make that announcement. But if that were to have happened, Even would have done better not to announce either. And so on. There is no equilibrium. Or is there? Yes, there is. And it is pure. Example 12 on page 321 will reveal it all.

Our framework of Boolean observation games clearly builds upon ( Agotnes et al., 2013b; Herzig et al., 2016) but a main diﬀerence is that these are complete information games whereas ours are incomplete information games. Thus we have very diﬀerent strategies. Players do not control the values of variables, but they control whether they reveal the ﬁxed values of variables that only they can observe. In that respect our framework also builds on the public announcement games of ( Agotnes & van Ditmarsch, 2011; Agotnes, van Benthem, van Ditmarsch, & Minica, 2011). They only allow strategies that are public announcements wherein the same information is revealed to all players. However, they permit announcing any epistemic formula, not merely propositional variables. A more detailed comparison with all these approaches is only possible after having given our framework in detail and is therefore in a later Section 3.3. Our games are strictly qualitative and thus abstract from truly Bayesian approaches (Harsanyi, 1968) with probabilities. To determine equilibrium we compare information sets, called expected outcomes . As the expected outcome may not be a value and the relation may not be a total order, our work is therefore in ordinal game theory (Durieu, Haller, Qu erou, & Solal, 2008; Cruz & Simaan, 2000; Amor, Fargier, & Sabbadin, 2017).

Boolean Observation Games

Our Contributions. We analyse structural and computational properties of Boolean observation games. We deﬁne Boolean observation games as incomplete and imperfect information games, a novel perspective in Boolean games. We show that Boolean observation games form a fragment of strategic games with compact representation. We determine equilibria based on four diﬀerent proﬁtable deviations from information sets, namely deﬁned as: the worst outcome is better, the best outcome is better, the expected outcome is better, and the outcome without uncertainty is better (the outcome is better even if all information sets are singleton, so that the game is one of complete information). We also provide existence results for such equilibria, which highly depends on what is considered a proﬁtable deviation. We identify various fragments of Boolean observation games including one where your goal may be to keep others ignorant but not to keep yourself ignorant, the self-positive goals, and another one where the goals are knowing whether formulas of which the realization does not depend on the valuation. The latter we call knowing-whether games. We provide an embedding of the standard Boolean games into a fragment of the knowing-whether games, and we also provide an embedding of the knowing-whether games into the Boolean games. Employing these embeddings we show that the knowing-whether games correspond to Boolean games in terms of existence of equilibrium outcomes. We also provide complexity results for the natural questions of veriﬁcation and checking of emptiness of equilibrium outcomes in Boolean observation games, for most of the proﬁtable deviations considered, and stretching the results as much as possible to also include fragments with ignorance goals. An overview of these complexity results is found in the conclusions in Table 3. Overview of Contents. Section 2 provides technical preliminaries needed to deﬁne Boolean observation games, that are then deﬁned in the subsequent Section 3, of which the ﬁnal Subsection 3.3 compares our proposal to other epistemic Boolean games. Section 4 presents the correspondence between Boolean games and Boolean observation games. Section 5 provides various results for the existence of Nash equilibria and Section 6 contains the results on the computational complexity of determining whether a strategy proﬁle is an equilibrium, and whether equilibria exist.

2. Preliminaries

In this section we introduce an auxiliary notion that is a complete information strategic game, which is played with strategies that are epistemic actions, that has epistemic formulas as goals and for which we propose a greatly simpliﬁed epistemic logic, and where outcomes are the truth values of those goals. Boolean observation games, that are incomplete information strategic games with more complex strategies and outcomes, will then be deﬁned in the next section. The logic is simple in order to ensure a compact representation allowing to obtain complexity results comparable to those for Boolean games. Some logical details that are fairly elementary but that might distract from the game theoretical content that is our focus, are deferred to the Appendix.

2.1 Strategies Consisting of Players Revealing Observations

Let N = {1, . . . , n} be a ﬁnite set of players i and P a ﬁnite set of (propositional) variables such that (Pi)i N deﬁnes a partition of P. The set Pi is the set of variables pi observed by

Van Ditmarsch & Simon

player i (that is, of which player i, and only player i, observes the value). A valuation is a subset v P, where pi v means that pi is true and pi / v means that pi is false. The set P(P) of all valuations is denoted V . A strategy for player i is a function si : N P(Pi) that assigns to each player j the set si(j) Pi of variables that player i reveals (announces) to player j. We require that si(i) = Pi. Let Si denote the set of all strategies of player i. A strategy proﬁle is a member s of S = S1 Sn. The set Pi(s) = {pj P | pj sj(i)} consists of the variables revealed to i in s. As si(i) Pi, Pi Pi(s). For i N, we denote the n-tuple s as (si, s i) where s i represents the (n 1)-tuple of the strategies of other players. Strategy s i is such that for all j N with j = i, si(j) = . This means that player i does not reveal anything to anyone. Strategy s i is such that for all i, j N, si(j) = Pi. This means that player i reveals everything and to everyone. Given i N and strategy proﬁle s, the observation relation s i on V is deﬁned as, for v, w V :

v s i w iﬀ v Pi(s) = w Pi(s).

Observation relation s i encodes the informative eﬀect of s. For s i we write i. This is the initial observation relation. We further note that Pi(s ) = V for any player i, so that s i is the the identity relation =. A s i equivalence class, deﬁned as [v]s i := {w V | w s i v} (where [v]s i is denoted [v]i), is also called an information set (of player i given valuation v and observation relation s i). Inasfar as strategies consist of each player i selecting a subset P i of her variables Pi, these are like the strategies in Boolean games. However we interpret this diﬀerently: player i does not make the variables in P i true, but reveals the value of the variables in P i according to a ﬁxed valuation v. Another departure (or generalization) from Boolean games is that diﬀerent variables are revealed to diﬀerent agents. This is because we felt that more interesting game theoretical results could be obtained for such a generalization, and because more interesting communicative scenarios could then be treated with the game theoretical machinery.

Example 3. We assume a strategy proﬁle to take place in some instantaneous, synchronous, fashion, such as, when s1(2) = {p1, q1}, s1(3) = {p1, q1}, and s1(4) = , player 1 informing player 2 and player 3 that p1 and q1 are both true, and such that player 4 observes this without being party to the message content (for example, 1 whispering to 2 and 3). In other words, player 4 knows that player 1 informs player 2 and player 3 whether p1 and q1, but player 4 remains uncertain of the value of p1 and q1, so does not know that 1 informs 2 and 3 that p1 and q2.1

Now consider s 1 that is like s1 except that s1(4) = {p1, q1} as well. This is the public announcement of p1 and q1 by player 1 to all players. What if for example s 1(2) = {p1} but s 1(3) = {p1, p2}? And what about s2, s3 and s4? This cannot be done instantaneously. But we can ensure independence: all players commit

1. In a diﬀerent semantics for strategies, less informative to the players, each player only learns what variables have been revealed by others to herself, and what variables she reveals to others. Applied to Example 3, this would also leave player 4 uncertain whether player 1 has informed player 2 and player 3. See Appendix A.3.

Boolean Observation Games

to their si before they execute it, and not after they see what variables are revealed to other players before it is their turn to reveal. Instead of whispering we can all have prepared closed envelopes adressed to all others on which is written for example, from player 1 to player 2: contains the truth about p1 and p2 . All envelopes are collected blindly and then put on the table for all to see and are then handed out.

Such forms of communication are known as semi-public announcement (van Ditmarsch, 2002), see Appendix A.2 on dynamic epistemic logic for details.

2.2 Goals that are Epistemic Formulas

The language of epistemic logic is deﬁned as follows, where i N and pi Pi.

LK α := pi | α | α α | Kiα

Here, is negation, is disjunction, and Kiϕ stands for player i knows ϕ. Other propositional connectives are deﬁned by abbreviation, and also ˆKiα := Ki α (player i considers α possible), and Kwiα := Kiα Ki α (player i knows whether α). The members of LK

are goals and may as well be called, suiting our purposes formulas. The following fragments of LK also play a role, where i, j N and pi Pi.

LB α := pi | α | α α LK nnf α := pi | pi | α α | α α | Kiα | ˆKiα L+ α := pi | pi | α α | α α | Kiα LKw α := Kwjpi | α | α α LKw nnf α := Kwjpi | Kwjpi | α α | α α

The language LB of the Booleans is the fragment of LK without Ki modalities. In the language LKw of knowing whether formulas (Kw formulas) the constructs Kwjpi play the role of propositional variables. The fragments LK nnf and LKw nnf are those of the negation normal form (nnf) of respectively LK and LKw, where the language L+ of the positive formulas is the fragment of LK nnf without ˆKi modalities (corresponding to a universal fragment of ﬁrst-order logic). Note that LKw and LKw nnf are really propositional languages, not modal languages. A goal is guarded if it has shape γi = Kiα. Apart from the above fragments yet another fragment plays a role in our contribution, namely that of the self-positive goals. The self-positive goal formulas are deﬁned as Lself+ := S j N Lj+, where each Lj+ is given by the following BNF, wherein i, k N and k = j.

Lj+ αj ::= pi | pi | αj αj | αj αj | Kjαj | Kkαj | ˆKkαj

Here, αj is the goal for player j. Note that L+ is a fragment of Lj+, namely the fragment where all occurrences of Kk are positive, and that Lj+ is a fragment of LK nnf, namely the fragment wherein all occurrences of Kj are positive. In a self-positive goal for agent j, j s objective is to (get to) know others variables and others knowledge and ignorance, although other players may either know or remain ignorant of j s knowledge. This implies that j s goal also cannot be for others to know j s ignorance. A larger number of communicative scenarios seem to have self-positive goals than merely positive goals: it seems

Van Ditmarsch & Simon

fairly typical that you wish others to remain ignorant even when you are only interested in obtaining (factual) knowledge. The inductively deﬁned semantics of LK formulas are relative to a valuation v and a strategy proﬁle s, where i N and pi Pi.

v, s |= pi iﬀ pi v v, s |= α iﬀ v, s |= α v, s |= α1 α2 iﬀ v, s |= α1 or v, s |= α2 v, s |= Kiα iﬀ w, s |= α for all w such that v s i w

For v, s |= α we write v |= α. This is a bit sneaky: by deﬁnition this represents what players know after the strategy proﬁle is executed wherein nobody reveals anything, but we can therefore just as well let it stand for what players initially know, before anything has been revealed. We let s |= α denote for all v V , v, s |= α, and |= α denote for all s S, s |= α . In our semantics, Kipi, Ki pi, and Kwipi are always true (equivalent to the trivial assertion ). We therefore informally assume that they do not occur in goal formulas. We note that our epistemic semantics is not the usual one for the epistemic language, interpreted on arbitrary Kripke models, but a greatly simpliﬁed epistemic semantics dedicated to reason about strategies that are joint revelations of observed variables. We do not even use the word model . And we do not allow announcements (revelations) of other information than variables. In Appendix A.2 we show how (valuation, strategy) pairs induce multi-agent Kripke models. All these simpliﬁcations are in order to obtain a smooth comparison with Boolean games and with comparable complexities, unlike the higher complexities common in multi-agent epistemic reasoning. We continue with some elementary properties of this simple logical semantics, in the form of propositions.

Proposition 4. Each formula in LK is equivalent to a formula in LK nnf. Similarly, each formula in LKw is equivalent to a formula in LKw nnf .

Proof. This well-known result in modal logic for LK is shown by induction on formula structure, using the equivalences α α, (α β) ( α β) and Kiα ˆKi α. For LKw, as this is essentially a propositional and not a modal language, we only need to use the ﬁrst equivalence.

Proposition 5. For all ϕ LKw, valuations v, and strategy proﬁles s: v, s |= ϕ iﬀs |= ϕ.

The basic but lengthy proof of this proposition is in Appendix A.1. Prop. 5 says in other words, that if v, s |= ϕ for some v V , then v, s |= ϕ for all v V .

Proposition 6. For any α LKw, |= α Kiα.

Proof. Let valuation v and strategy proﬁle s be given. Assume v, s |= α. Then from Prop. 5 it follows that for all w V , w, s |= α. Therefore, in particular, w, s |= α for all w s i v, which is by deﬁnition v, s |= Kiα. Now assume v, s |= Kiα. From v s i v and the semantics of knowledge now directly follows v, s |= α. As v and s were arbitrary, we have shown |= α Kiα.

Boolean Observation Games

As a consequence each formula in the fragment Kwipj | α | α α | Kiα is equivalent to a formula in LKw, in other words, knowledge can then be eliminated. This explains why we deﬁned the fragment LKw without an inductive clause for knowledge. Knowledge cannot generally be eliminated from a language with knowing whether variables. For example, Anne (1) may know whether Bill (2) passed the exam (p2), but Bill may be uncertain whether she knows. So we have Kw1p2 K2Kw1p2. Props. 5 and 6 (and the subsequent Prop. 7) do not hold for knowing whether fragments on arbitrary Kripke models.

Proposition 7. For all i, j, k N: |= Kwi Kwjpk.

Proof. Formula Kwi Kwjpk is by deﬁnition equivalent to Ki Kwjpk Ki Kwjpk. From Prop. 6 it follows that this is equivalent to Kwjpk Kwjpk which is a tautology.

Therefore, in our very simple epistemic logic it is common knowledge whether a player knows a variable. This reﬂects the dynamics of revealing variables. Suppose all players hold cards named p1, q1, p2, . . . on the back side and the value 0 or 1 on the front (face) side. You may not know that your neighbour has shown to your other neighbour that the value of the card p1 is 1 (true). But you know whether your neighbour has shown card p1 to your other neighbour. You saw it happen.

2.3 Pointed Boolean Observation Games

A pointed Boolean observation game (pointed observation game) is a pair (G, v), denoted G(v), where v V and where G is a triple (N, (Pi)i N, (γi)i N), where all γi LK. The players strategies in the pointed observation game are the strategies si Si. The players goals in the pointed observation game are the γi LK. Given i N, the outcome function ui : V S {0, 1} of a pointed observation game is deﬁned as:

ui(v, s) = 1 if v, s |= γi and ui(v, s) = 0 if v, s |= γi.

A strategy proﬁle s is a Nash equilibrium of G(v) iﬀfor all i N and s i Si we have ui(v, s) ui(v, (s i, s i)). That is, no player has a proﬁtable deviation from s in G(v), which would therefore be a s i Si such that ui(v, s) < ui(v, (s i, s i)). Observe that a player can only make a proﬁtable deviation from s if her goal is not satisﬁed in s. Let NE(G(v)) denote the set of Nash equilibria of G(v). The pointed observation game is an auxiliary notion, matching the intuition that after revealing variables a player wins when her goal has become true. The game is one of complete information because the valuation is known to you, the reader. But the valuation is typically not known to the players. It already uses the parameters of the Boolean observation game that we will now deﬁne in the next section.

Example 8. We recall Example 1. We summarily describe a pointed Boolean observation game and its equilibria, where a fuller development is only given in Example 11. Consider pointed game G(v) with G = ({1, 2}, ({p1}, {p2}), ({γ1, γ2}) where γ1, γ2 are as in Example 1, and where valuation v = {p1, p2} (both are in love). The strategies are to reveal nothing or to reveal all, that is: s 1, s 1, s 2, and s 2.

Van Ditmarsch & Simon

The strategy proﬁle (s 1, s 2) is an equilibrium strategy proﬁle of the pointed game G(v), with outcome 1 for both players. This is the only way to make K1p2 K2p1 true. However, both players not announcing their variable is also an equilibrium with outcomes 0. The pointed game G(w) for valuation w = {p1} (only Tony is in love) has equilibrium (s 1, s 2). We now need to make K1 p2 K2p1 true. (Another equilibrium (s 1, s 2) is when both get outcome 0.)

3. Deﬁning Boolean Observation Games

We will now deﬁne the Boolean observation game. A Boolean observation game is an incomplete information strategic form game with uniform strategies (uniform functions from valuations to strategies) and expected outcomes (information sets of outcomes), whereas the auxiliary notion of a pointed observation game is a complete information strategic form game with strategies and with (Boolean-valued) outcomes.

3.1 Boolean Observation Games

This section contains the crucial game theoretical notions of our contribution.

Boolean Observation Game. A Boolean observation game (or observation game) is a triple G = (N, (Pi)i N, (γi)i N), where all γi LK. Formula γi is the goal (objective) of player i. It is played with uniform strategies and the payoﬀs are expected outcomes. Both will now be deﬁned.

Uniform Strategy. A uniform strategy for player i N is a function si : V Si such that for all v, w with v i w, si(v) = si(w). It is globally uniform iﬀfor all v, w V , si(v) = si(w). So, uniform means the same for all indistinguishable valuations, which is diﬀerent from globally uniform, which means the same for all valuations. Let Si denote the set of uniform strategies of player i, and S = S1 Sn the set of uniform strategy proﬁles. Let Sg i and Sg denote the set of globally uniform strategies of player i and the set of globally uniform strategy proﬁles respectively. Given a valuation v, a uniform strategy proﬁle s determines a strategy proﬁle s(v) = (s1(v), . . . , sn(v)). Note that (s(v)i, s(v) i) = (si, s i)(v). For i N and si Si, we deﬁne si Sg i as: for all v V , si(v) = si. Similarly for s S we deﬁne s Sg as the globally uniform strategy proﬁle such that for all v V , s(v) = s. It follows from the deﬁnition that every globally uniform strategy proﬁle s Sg is of the form s for some strategy proﬁle s S.

Expected Outcome. Given i N, the expected outcome function is a function ui : V S {0, 1} that is uniform in V , and deﬁned as ui(v, s) = (ui(w, s(w)))w iv. So, expected outcome ui(v, s) is a vector of outcomes ui(w, s(w)) for each valuation w in the information set of player i. In our setting where outcomes are 0 (lose) or 1 (win) this vector is a bitstring. As far as nomenclature is concerned, we are putting the reader on the wrong foot, as a uniform strategy is not a kind of strategy (as deﬁned in the previous section), nor is expected outcome a kind of (binary valued) outcome. However, we are in good company: an artiﬁcial brain is not a brain, and a cable car is not a car. So we hope the reader will allow us this slight abuse of language.

Boolean Observation Games

Outcome Relation and Nash Equilibrium. To deﬁne the notion of an equilibrium in observation games, we need to ﬁrst deﬁne a comparison relation between uniform strategy proﬁles. Note that unlike in classical strategic games, the expected outcome function in observation games generates a vector of outcomes. Therefore, there is no canonical deﬁnition for the comparison relation. We deﬁne an outcome relation > over vectors of outcomes and write ui(v, s) > ui(v, s ) for player i prefers si over s i in the information set containing v ; we also say that si is a proﬁtable deviation from s i. This outcome relation may not be a total order. We therefore prefer not to use notation to compare the bitstrings that are outcome sets, as it is ambiguous whether x y means (x < y or x = y) or x y (and even when deﬁned as either one or the other, it seems unkind to the reader). Given an outcome relation >, a uniform strategy proﬁle is a Nash equilibrium if no player has a proﬁtable deviation.

A uniform strategy proﬁle s is a Nash equilibrium of G iﬀfor all i N, s i Si and v V , we have that ui(v, (s i, s i)) ui(v, s).

Given an observation game G, NE(G) denotes its Nash equilibria, and among those NE g(G) denotes the globally uniform Nash equilibria. Also, a uniform strategy si Si is dominant if for all s S with s = (si, s i), for all s i Si, and for all v, ui(v, (s i, s i)) ui(v, s).2

Four Outcome Relations. It remains to deﬁne the outcome relation. We propose four.

optimist : ui(v, s) >opt ui(v, s ) iﬀ max ui(v, s) > max ui(v, s ) pessimist : ui(v, s) >pess ui(v, s ) iﬀ min ui(v, s) > min ui(v, s ) realist : ui(v, s) >real ui(v, s ) iﬀ Σui(v, s) > Σui(v, s ) maximal : ui(v, s) >max ui(v, s ) iﬀ ui(w, s(w)) > ui(w, s (w)) for some w i v

The optimist, pessimist and realist outcome relations are (strict) total orders, as it suﬃces to assign a number to the information set constituting an expected outcome. The maximal outcome relation is not a total order. We let NE pess(G), NE opt(G), NE real(G), and NE max(G) denote the Nash equilibria under the pessimist, optimist, realist and maximal outcome relation, respectively. The optimist, pessimist and realist outcome relations are (strict) total orders, as it suﬃces to assign a number to the information set constituting an expected outcome. The maximal outcome relation is not a total order as illustrated in Example 9. However, deﬁning this relation is useful since NE max(G) has an interesting interpretation which we discuss below.

Example 9. Let us consider an abstract example where a player has to choose between expected outcomes (bitstrings) 00, 10, 01, 11. We then get (where clustered bitstrings means equally preferred): {01, 10, 11} >opt 00 11 >pess {00, 01, 10} 11 >real {01, 10} >real 00 ij >max kl iﬀi > k or j > l

2. This is weak dominance of the kind always at least as good where we emphasize that we do not deﬁne it as always at least as good and sometimes strictly better , which is also common in game theory.

Van Ditmarsch & Simon

The >max relation is neither antisymmetric nor transitive. For instance, in Example 9 we have that 10 >max 01 but also 01 >max 10, and it is not transitive because 01 >max

10 >max 01 however 01 max 01. Thus the >max relation is neither a total order nor a preorder. However, it has a maximum and a minimum: the expected outcome where the player always wins is preferred over all other expected outcomes, and the expected outcome where the player always loses is less preferred than all other expected outcomes. The maximal outcome relation also satisﬁes the important property that all outcomes can be compared and therefore, the notion of a Nash equilibrium is well-deﬁned. If ui(v, s) = ui(v, s ), then ui(v, s) >max ui(v, s ) or ui(v, s ) >max ui(v, s). The disjunction in the consequent is inclusive, both may hold (we recall that 10 >max 01 as well as 01 >max 10, as in Example 9). To require this property is common in ordinal game theory (Durieu et al., 2008). The outcome relations that we have proposed are qualitative versions of well-known criteria in decision theory and Bayesian reasoning. None assume a probability distribution, however, all assume a strictly positive probability for each valuation.

The optimist outcome relation is the max instantiation (as there is only one maximal value) of the minimax regret decision criterion (Savage, 1951). With respect to the highest possible outcome in the information set, a lower possible outcome in the information set (which can only be 0 instead of 1) would cause regret if this were to happen.

The pessimist outcome relation is the min instantiation of the maximin or Wald decision criterion (Wald, 1945). We then choose the information set with the best worst outcome. This outcome relation has been used to model uncertainty in voting (with similar considerations involving Nash equilibria and dominance) in (Conitzer, Walsh, & Xia, 2011; van Ditmarsch, Lang, & Saﬃdine, 2013; Bakhtiari, van Ditmarsch, & Saﬃdine, 2019).

The realist outcome relation is a qualititative version (lack of justiﬁcation to rule out any outcome) of a random decision in Bayesian terms, also known as the insuﬃcient reason or Laplace decision criterion, or as the principle of indiﬀerence (Keynes, 1921, Chapter IV).

Instead of taking the sum of the outcomes in the information set we could of course have normalized this so it adds up to 1, suggesting an even distribution of probability mass. Such scaling is irrelevant for our purposes of determining Nash equilibria and dominance, wherein we only need to compare outcomes. That comparison relation remains the same.

This outcome relation was used in ( Agotnes & van Ditmarsch, 2011; Agotnes et al., 2011) to determine equilibria of similar incomplete information games, but where more complex formulas than mere variables could be revealed (however, they could only be publicly announced). An issue for the realist outcome relation is whether bisimilar game states (that therefore satisfy the same goals for all players) should be counted once or twice.3 On the one hand, if two game states are bisimilar this is

3. Personal communication by Martin Otto.

Boolean Observation Games

justiﬁcation / suﬃcient reason to rule out one of them, according to Laplace. On the other hand these bisimilar game states might have originated from playing strategy proﬁles (executing epistemic actions) in initial game states that were non-bisimilar. It is relevant to observe this as we note that this phenomenon cannot occur in our simpler setting involving observation relations.

The notion of Nash equilibrium for the maximal outcome relation has an interesting interpretation. A maximal Nash equilibrium is a uniform strategy proﬁle where no player has a proﬁtable deviation even if the player has complete information about the game. There is an equivalent formulation of maximal Nash equilibrium as a qualitative version of ex-post equilibrium (Apt, 2011), which we show in Proposition 10.

Various of the above outcome relations have also been considered in (Parikh, Tasdemir, & Witzel, 2013).

Proposition 10. A uniform strategy proﬁle s is a maximal Nash equilibrium for G iﬀfor all v V , s(v) is a Nash equilibrium for G(v).

Proof. Suppose s NE max(G). Then there exist v V , i N, and s i Si such that ui(v, (s i, s i)) >max ui(v, s). It follows that there is w i v such that ui(w, (s i, s i)(w)) > ui(w, s(w)), so ui(w, (s i, s i)(w)) = 1 and ui(w, s(w)) = 0. Therefore s(w) / NE(G(w)). Suppose s(w) / NE(G(w)) for some valuation w. Then there exist i N, s i Si such that ui(w, (s i, s i(w)) > ui(w, s(w)). Let s i Si be the uniform strategy such that for all v i w, s i(v) = s i (so in particular, s i(w) = s i), and for all v i w, s i(v) = si(v). By the maximal relation, from ui(w, (s i, s i)(w)) = ui(w, (s i, s(w) i) > ui(w, s(w)) it follows that ui(w, (s i, s i)) >max ui(w, s). Therefore s / NE max(G).

In the remaining sections we focus on optimist, pessimist and maximal Nash equilibrium and not on realist Nash equilibrium. We use the operational deﬁnition of maximal Nash equilibrium given by the correspondence in Proposition 10. It is easy to see that a maximal Nash equilibrium is also an optimist, pessimist, and realist Nash equilibrium. In that sense the maximal outcome relation is the strongest notion, resulting in the smallest number of equilibria for a game (if any).

3.2 Various Classes of Observation Games

With all the technical tools now at our disposal, very diﬀerent observation games are of speciﬁc interest. We can distinguish them by which outcome relation they employ, and independently by the shape of the epistemic goals. Concerning goals it is useful to distinguish the following.

In two-player zero-sum games, γi = γj and in two-player symmetric games γi = γj, where |N| = 2, i = j, and i, j N. In cooperative games V i N γi is consistent. Example 11 below is symmetric, and Example 12 is zero-sum (and therefore not consistent). Communicative scenarios obeying the Gricean cooperative principle are clearly consistent observation games (and might still be considered games inasfar as people want to outdo each other in being informative). Whereas security protocol settings with eavesdroppers (consider observing an SMS code that you were sent to

Van Ditmarsch & Simon

conﬁrm a bank transfer) tend to be zero-sum; that is, a generalization of zero-sum: the objectives of the principals are the opposite of those of the eavesdroppers.

We do not have theoretical results for zero-sum or symmetric games.

In knowing-whether observation games (knowing-whether games, Kw games) all goals γi are in LKw. In knowing-whether games the outcome does not depend on the valuation. Whether some Kwipj is true only depends on player j revealing pj to player i, and does not depend on the valuation, because the truth of pj does not depend on the value of pj.

Section 4 is entirely devoted to knowing-whether games, and Section 5 contains results on existence of equilibria. They relate well to the usual Boolean game. Not surprisingly, as the outcome does not depend on the valuation, they also score better on the computational complexity of determining whether a uniform strategy proﬁle is a Nash equilibrium, or whether Nash equilibria exist, than other classes of observation game. That will be investigated in Section 6.3.

In observation games with guarded goals (of shape γi = Kiα, see Subsection 2.2) the players know whether they have achieved their objective after playing the game. Whereas in games where the goals are not guarded they may not and need an oracle to inform them of the outcome (such as, when standing in front of an ATM teller, the bank s interface informing them). If goals are guarded, Nash equilibria always exists for the optimist and the pessimist outcome relation, as formulated and shown in Theorem 28 in Section 5.

In games where all γi are positive formulas (in the fragment L+ where negations do not bind Kj modalities, see Subsection 2.2), a player s goal is never to remain ignorant of a fact, or even for other players to remain ignorant. Under such circumstances revealing all you know is a dominant strategy. This is therefore rather restricted.

More interesting than positive goals are the observation games with self-positive goals wherein your goal is to become less ignorant yourself although you may wish to keep other players ignorant (see again Subsection 2.2). We provide a result for self-positive goals in Corollary 29 in Section 5.

For all these, results on existence of equilibria and complexity also depend on which outcome relation is used, as already occasionally listed above. Last but not least one can consider iterated observation games with temporal eventuality goals, where players successively reveal more and more of their observed variables. An example are (successive) question-answer games wherein the strategic aspect is what variable(s) to ask another player(s) to reveal, which seems of particular interest for strategic negotiation (if you give me this, I ll give you that, and so on). All these come with speciﬁc questions on compact representation and existence of equilibria. We defer the investigation of iterated games and question-answer games to future research. In this work we focus on knowing-whether games and their relation to Boolean games, on the existence of equilibria for various outcome relations (where the realist outcome relation plays no role), and on complexity results for some of our variations. We now continue with some detailed examples.

Boolean Observation Games

Example 11. Recall Example 1 (page 309) and Example 8. We now give full details. Consider the observation game G where N = {1, 2}, P1 = {p1}, P2 = {p2} and the (symmetric) goals:

γ1 = γ2 = p1 p2 K1p2 K2p1 p1 p2 K1 p2 K2p1 p1 p2 K1p2 K2 p1 p1 p2 K1 p2 K2 p1

As there are only two players and each player observes a single variable the strategies are to reveal nothing or to reveal all, that is: s 1, s 1, s 2, and s 2. For each valuation v the pointed observation game G(v) has an equilibrium where both players get outcome 1. For example, if p1 and p2 are both true, then both players revealing (announcing) that is an equilibrium with outcome 1 for both players. However, both players not announcing their variable is also an equilibrium with outcome 0. Let us now determine equilibria for G, with uniform strategies instead of strategies, and let us consider the diﬀerent outcome relations.

pessimist. Player 1 cannot distinguish between the valuations {p1, p2} and {p1}. Thus, for all s S and for all v V , min u1(v, s) = 0. The situation is symmetric for player 2. Therefore, for all s S, s NE pess(G).

optimist. Similarly, for all s S, for all v V and for all i {1, 2}, max ui(v, s) = 1. Therefore, for all s S, s NE opt(G).

realist. In this example, whatever the valuation v, Σui(v, s) = max ui(v, s) = 1, so that also, for all s S, s NE real(G).

maximal. NE max(G) = . There are no maximal Nash equilibria, because every information set for both players always contains a win and a lose, so if they were to know the real valuation, one of those is not an equilibrium for the pointed game.

Possibly, the equilibria depend on what we called the personalities of Tony and Maria , that is on the shape of the goals? We considered two diﬀerent personalities that therefore allow four diﬀerent goals, but (the reader can check that) none makes a diﬀerence for any of the four outcome relations, as the property that each information set contains a win and a lose persists throughout such transformations. The best is always win, and the worst is always lose. However for other personalities (for lack of a better term) this need not be, for example, change K1p2 K2 p1 in the third conjunct into K1p2 K2 p1 (we removed one negation symbol). It is now dominant for player 1 to announce the value of p1 in the information set wherein p1 is false.

Example 12. Recall Example 2 on page 310 about the pennies that do not match. We can now model this as a knowing-whether Boolean observation game G where N = {1, 2}, P = P1 P2 with P1 = {p1}, P2 = {p2}, and

γ1 = Kw1p2 Kw2p1 γ2 = Kw1p2 Kw2p1

Van Ditmarsch & Simon

For i = 1, 2, player i has strategy s i wherein she reveals nothing ( hide pi ) and strategy s i wherein she reveals the value of pi. Irrespective of the valuation, in the strategy proﬁles (s 1, s 2) and (s 1, s 2), player 1 has a proﬁtable deviation in the corresponding pointed observation game. Similarly, in (s 1, s 2) and (s 1, s 2), player 2 has a proﬁtable deviation. Thus it can be veriﬁed that NE max(G) = . Also, within the set of all globally uniform strategy proﬁles, G does not have a Nash equilibrium for the pessimist and optimist outcome relation. However, this game has a Nash equilibrium with uniform strategies that are not globally uniform, for the pessimist and for the optimist outcome relation. Consider the uniform strategy proﬁle s = (s1, s2) where in s1, player 1 reveals p1 to 2 when p1 is true and hides p1 from 2 when p1 is false, and in s2, player 2 reveals p2 to 1 when p2 is true and hides p2 from 1 when p2 is false. Thus min u1(v, s) = min u2(v, s) = 0. It can then be veriﬁed that no player has proﬁtable deviation from s and therefore s NE pess(G). Similarly, it can be noted that max u1(v, s) = max u2(v, s) = 1. Therefore, s NE opt(G).

3.3 Comparison to Related Work

In this section we compare in more detail our epistemic Boolean games to the two prior proposals in the literature known to us ( Agotnes et al., 2013b; Herzig et al., 2016), that were already mentioned in the introductory section. We recall that these are imperfect information games (they feature epistemic objectives), however they are not incomplete information games. We also succinctly compare our proposal to an incomplete information game, that is however not a Boolean game ( Agotnes & van Ditmarsch, 2011).

Comparison to Boolean Games with Epistemic Goals . In Boolean games with epistemic goals ( Agotnes et al., 2013b) the set of variables P is partitioned into |N| = n mutually disjoint subsets of variables Pi, for i N, such that the variables in Pi can only be controlled by player i. This is as usual in Boolean games, and therefore the strategies played are also as usual, so that a strategy proﬁle is a valuation of all variables. However, the goals are diﬀerent from the usual in Boolean games, and like ours: these are not merely Boolean goals (formulas in the language LB) whose satisfaction depends on this valuation but these are epistemic goals (the language LK) whose satisfaction depends on what the players know about this valuation. This is where another parameter of their games comes into play: apart from a set Pi of controlled variables each player i also has a ﬁnite visibility set consisting of Boolean formulas, that is, some ﬁnite subset of the language LB: those are the propositions whose value that player can observe of the outcome valuation. Such Booleans may involve variables not controlled by player i but by other players j. Already, this seems to beg some questions on logical closure, for example if p q is in the visibility set but neither variable p nor variable q (where we note that the epistemic goal formulas have the usual compositional semantics, so Ki(p q) is true if and only if Kip and Kiq are true). However, a special case is when the visibility set consists of variables only, which ( Agotnes et al., 2013b) call atomic games, and this suﬃces for a comparison with our results. The visibility set determines what is known by the players and thus which epistemic goals are satisﬁed in a valuation. Because the players altogether control the value of all variables the game is not one of incomplete information (strategies do not depend on an unknown initial valuation) although it is one of imperfect information (over the outcome valuation). The authors then determine that model checking goal formulas is PSPACE-complete and that the existence of

Boolean Observation Games

Nash equilibria is in PSPACE, although they do not show a lower bound. They also provide an interesting embedding of their epistemic Boolean games into the standard Boolean games by observing that an epistemic goal corresponds to an exponentially larger Boolean goal that is the disjunction of all valuations over which the epistemic goal is uncertain. For example, in some given game, Kip may abbreviate (p q) (p q). This is therefore a rather diﬀerent embedding from our embedding of knowing-whether Boolean observation games into Boolean games wherein the goals remain the same but the set of variables (and thus valuations) is larger: we recall that a Kw game G for variables pi is transformed into a Boolean game BG for variables Kwjpi: the knowing-whether formulas are now considered atomic propositions. The goals remain the same in our approach, because knowing-whether goals are Booleans in the language wherein Kwjpi are atomic propositions.

Comparison to Epistemic Boolean Games Based on a Logic of Visibility and Control . The authors of this work (Herzig et al., 2016) propose a very expressive logical language and semantics for players controlling the value of propositional variables or observing the value of propositional variables. They also axiomatize this logic. They then use the logic to formalize game theoretical primitives, in particular the existence of equilibria, in an epistemic extension of Boolean games. This formalization allows them to determine the complexity of these games. The problems of determining whether a proﬁle is Nash equilibrium as well as the existence of Nash equilibrium are both in PSPACE. Their language extension includes knowledge, common knowledge, and for control or observation of propositional variables they propose additional propositional variables. We not only have, for example, a variable p, but also Sip, for player i observes the value of p and Cip for player i controls the value of p . But also variables like Cj Sip, for player j controls whether player i observes p , and so on for any stack of Cj or Si predicates. The interest of these complex propositional variables is that they induce relational Kripke models or can be used to formalize strategies in Boolean games. In the epistemic Boolean games of (Herzig et al., 2016) the strategies assign values to variables that are stacks of Si binding some atom p (so without any Ci or Ki), as in Si Sjp, saying that i can see whether j can see the value of p, whereas the goals are epistemic formulas in the language for such atoms p (so without Si or Ci), as in Kip Kjp.4 One might say that their epistemic Boolean games essentially remain Boolean games, because the players still only control the value of variables, but this is only by a (quite smart) stretch of the modeling imagination, because their Boolean variables hard-code arbitrarily complex higher-order multi-agent observations. However, these are not games of incomplete information. The focus of (Herzig et al., 2016) is the axiomatization of their logic of visibility and control (it also contains program modalities with primitive operations assigning values to variables). The game-theorical contribution is mainly proof of concept .

4. As Si stacks are arbitrarily long, there is an inﬁnite set of such atoms to consider. However, the partition among players controlling variables is of a ﬁnite subset only of that inﬁnite set. This permits Sip but not p to be in that ﬁnite subset, which would rule out to determine the value of a goal Kip (as no player gives a value to p, that is, no player controls p). In their accompanying examples, the ﬁnite subset jointly controlled by all players is always subformula closed. This therefore seems an omitted requirement.

Van Ditmarsch & Simon

Comparison to Public Announcement Games . Public announcement games which are studied in ( Agotnes & van Ditmarsch, 2011) and the related question answer games ( Agotnes et al., 2011) also present incomplete games of imperfect information. Expected outcomes are compared with the realist outcome relation. The value of variables is not controlled in any way in ( Agotnes & van Ditmarsch, 2011; Agotnes et al., 2011), the valuations are ﬁxed. Public announcement games are not Boolean games, because the players strategies are revelations of any formula, not merely of Booleans. Of course one could consider a class of public announcement games wherein the strategies are restricted to announcing propositional variables only. However, we recall that public announcements are revelations of the same information to all players simultaneously, so this is not as general as our proposal.

4. Knowing-Whether Boolean Observation Games

In this section we show a correspondence between knowing-whether Boolean observation games (Kw games) and Boolean games. We provide polynomial time reductions that convert a Boolean game to a Kw game and vice-versa. We ﬁrst recall the deﬁnition of Boolean game. We then show that every Boolean game deﬁnes a Kw Boolean observation game, and that every Kw Boolean observation game deﬁnes a Boolean game. These embeddings are diﬀerent, the ﬁrst is not the converse of the second. We further show a utility preserving equivalence between strategies in Boolean games and equivalence classes of globally uniform strategies in Kw games (Lemmas 18, 22). As a consequence, we prove a correspondence between the existence of Nash equilibria in Boolean games and the existence of maximal Nash equilibria in Kw games, and for both reductions (Theorems 19, 24). We ﬁnally show that there always exists a pessimist equilibrium for 2-player Kw games, but not for Kw games in general: we give an 8-player Kw game without a Nash equilibrium (where we do not know if such games exist for between 3 and 7 players). Recall that for any v V and α LKw, v, s |= α iﬀs |= α (Prop. 5). This justiﬁes writing ui(s) for the outcome ui(v, s) of a pointed Kw game. Now consider a globally uniform strategy proﬁle s Sg. As ui(v, s) = ui(v, s(v)) = ui(v, s), this justiﬁes writing ui( s) for the expected outcome of a such a Kw game.

4.1 Boolean Games

Boolean games have the same parameters as Boolean observation games but simpler strategies. A Boolean game is denoted B to distinguish it from a Boolean observation game G. A Boolean game is a tuple B = (N, (Pi)i N, (γi)i N) where all γi LB (all goals are Boolean). For i N, a strategy vi for player i is a (local) valuation vi Pi, where, slightly abusing notation, we identify a strategy proﬁle v = (v1, . . . , vn) with a valuation v = (v1 . . . vn) V . For Boolean games, the outcome function is denoted u B to distinguish it from the outcome function u of pointed Boolean observation games. We deﬁne u B i (v) = 1 if v |= γi and u B i (v) = 0 if v |= γi. Equilibrium is as for pointed observation

Boolean Observation Games

games: a strategy proﬁle v V is a Nash equilibrium in B if for all i N and v i Pi, u B i (v) u B i (v i, v i). Given B, its Nash equilibria are denoted NE(B).

Let us emphasize the diﬀerence between Boolean games and Boolean observation games. In Boolean observation games, as in Boolean games, a player i selects a subset vi of her local variables Pi. However, in Boolean observation games this subset may be a diﬀerent subset si(j) Pi for each other player j. Also, in Boolean games, excuting strategy vi means that the pi vi become true whereas the pi Pi \ vi become false. Whereas in Boolean observation games, executing strategy with component si(j) means that the pi si(j), that already have an observed truth value, are revealed (to j).

4.2 Boolean Games to Knowing-Whether Games

We construct a Kw game denoted GB from a Boolean game B as follows. Let B = (N, (Pi)i N, (γi)i N). Then GB := (N, (Pi)i N, (βi)i N) where each βi := λ(γi) is deﬁned as follows. Let i+ := i + 1 for i = 1, . . . , n 1 and n+ := 1. Then λ : LB LKw is inductively deﬁned as: for all i, pi Pi, λ(pi) := Kwi+pi, and (trivially) λ( α) := λ(α) and λ(α1 α2) := λ(α1) λ(α2). Note that B and GB are deﬁned for the same players and variables. Given a strategy proﬁle v V for B, we deﬁne globally uniform strategy proﬁle sv Sg

for GB such that for all i N and pi Pi: pi sv i (i+) if pi v; sv i (i) = Pi; and for all j N with j = i, i+, sv i (j) = . Note that for all valuations w, including v, sv(w) = sv. Notation sv is therefore not to be confused with notation s(v) for uniform proﬁles s. In this section we will show how the v strategy for B corresponds to the sv strategy for GB. Note that for all i N, we have |βi| = O(|γi|) where |βi| and |γi| denote the size of (number of symbols in) βi and γi respectively. Thus given B, the associated Kw game GB can be constructed in polynomial time.

Example 13. We illustrate how to construct a Kw game GB from a Boolean game B. (We will not analyze the equilibria of the game, if any.) Consider

B = ({1, 2, 3}, ({p1}, {p2}, {p3}), (p1 p3, p3 p1, p1 p2))

Then GB has the same variables p1, q1, p2, p3 but diﬀerent goals, namely Kw2p1 Kw1p3 for player 1, Kw1p3 Kw2p1 for player 2, and Kw2p1 Kw3p2 for player 3. In the Boolean game, for player 1 to obtain her goal γ1 = p1 p3, player 1 has to make p1 true, it does not matter whether player 2 makes p2 true or false, and player 3 has to make p3 true. In the Kw game, in order to achieve the goal β1 = Kw2p1 Kw1p3, player 1 has to reveal p1 to player 2, it does not matter whether player 2 reveals p2 to player 3, and player 3 has to reveal p3 to player 1, and all three do this independently from the valuation. Because in fact, for example, player 1 reveals the value of p1 to player 2, but what the value is does not matter as the outcome of a Kw game is independent from the valuation. So they players execute globally uniform strategies. More precisely, in order to ensure β1 globally uniform strategy s is required such that s1(2) = {p1}, s3(2) does not matter, and s3(1) = {p3}.

Lemma 14. Let GB be the Kw game associated with the Boolean game B. For all i N, for all w V , sw |= λ(γi) iﬀw |= γi.

Van Ditmarsch & Simon

Proof. This is shown by induction where only the base case is not-trivial. For that, we have that sw |= Kwi+pi iﬀw |= pi by deﬁnition of the embedding.

Therefore, for any i N and v V : ui(v, sw) = ui(v, sw) = ui(sw) = u B i (w). This correspondence allows us to relate Nash equilibria in B to Nash equilibria in GB. The result uses an interesting property of NE max in Kw games (this property does not hold for, e.g., the pessimist outcome relation).

Lemma 15. Let Kw game G be given. Let s NE max(G) and v V . Let s = s(v). Then s NE max(G).

Proof. Consider an arbitrary v V and let s = s(v). Suppose that s NE max(G), we claim that the globally uniform strategy proﬁle s NE max(G). Suppose not. Then there exists i N, w V and s i Si such that ui(w, (s i, s i(w))) > ui(w, s(w)). This implies that w, (s i, s i(w)) |= γi and w, ( s(w)) |= γi. Since for all j N, sj(w) = sj(v), we have w, (s i, s i(v)) |= γi. Since γi LKw, we have v, (s i, s i(v)) |= γi. Also, since w, s(w) |= γi, we have v, s(w) |= γi and by deﬁnition of s, we have v, s(v) |= γi. Therefore, s NE max(G) which gives the required contradiction.

Corollary 16. Let Kw game G be given. If NE max(G) = then NE g max(G) = .

An Equivalence Relation over Global Strategy Proﬁles. Recall that every s Sg

is of the form s where s S(GB). We deﬁne an equivalence relation over Sg in GB as follows. For i N, si i ti iﬀsi(i+) = ti(i+). For s, t Sg, we deﬁne s t iﬀfor all i N, si i ti. Let Sg/ denote the set of equivalence classes and [s] denote the equivalence class containing s Sg.

Lemma 17. Given s Sg, for all t [s], for all i N, for all v V , ui(v, s(v)) = ui(v, t(v)).

Proof. Let s = s and t = t. For all i N, since t [s], we have si(i+) = ti(i+). By induction of the structure of γi, we can prove the following: for all v V , for all i N and for all γi LKw, we have v, ( s(v)) |= γi iﬀv, ( t(v)) |= γi. This implies that for all i N, for all v V , ui(v, s(v)) = ui(v, t(v)).

An Outcome Preserving Bijection. We now show that there is an outcome preserving bijection χ between strategy proﬁles in B and equivalence classes in Sg/ . For a Boolean game B, and v V , χ(v) = [ sv].

Lemma 18. Given a Boolean game B, the function χ : V Sg/ is a bijection.

Proof. Given s Sg, consider v V deﬁned as follows: for all i N and pi Pi, pi v iﬀ pi si(i+). We then have χ(v) = [ s] and therefore χ is onto. For v, w V such that v = w, there exists i N , there exists pi Pi such that pi v and pi w. Thus, for χ(v) = [ s] and χ(w) = [ t], we have s i t, which implies that s t. Therefore, χ is a bijection.

Consequently, we can prove a correspondence between Nash equilibria existence.

Theorem 19. Let B be a Boolean game. Then NE max(GB) = iﬀNE(B) = .

Boolean Observation Games

Proof. ( ) We argue that if w NE(B) then sw NE max(GB). Suppose not, then there exists i N, v V and ti Si such that ui(v, (ti, sw i(v))) > ui(v, sw(v)). Let w = χ 1([ ti, s i]) From Lemmas 14, 17 and 18 it follows that u B i (w ) = ui(v, ( ti, s i)(v)) > ui(v, sw(v)) = u B i (w) for all v V . Therefore w NE(B) which is a contradiction. ( ) Suppose NE max(GB) = . By Lemma 15 there exists s Sg such that s NE max(GB). Let w = χ 1([s]). We claim that w NE(B). Suppose not, then there exists i N and w i such that u B i (w i, w i) > u B i (w). Let w = (w i, w i). Note that by deﬁnition w = w . From Lemma 14 it follows that for all v we have u B i (w ) = ui(v, sw (v)). From Lemmas 14, 17 and 18 it follows that u B i (w) = ui(v, s(v)) for all v V . Therefore for all v V , ui(v, sw (v)) = u B i (w ) > u B i (w) = ui(v, s(v)) which contradicts the fact that s NE max(GB).

Other ways to get a Kw Game from a Boolean Game. Let us imagine our n players sitting round a table numbered in clockwise fashion. In the embedding λ : LB LKw with basic clause λ(pi) := Kwi+pi,

every player i reveals the value of her observed variable pi to her left neighbour (while other players observe her doing that). There are many other embeddings that would serve equally well to obtain our results. For example, every player i could reveal her variable to her right neighbour. This would be a λ with basic clause

λ (pi) := Kwi pi

where i is i 1 except for 1 := n. A more interesting embedding would be every player publicly announcing pi to all other players. We then have a λ for which

4.3 Knowing-Whether Games to Boolean Games

A Kw Game to a Boolean Game. We now construct a Boolean game denoted BG from a knowing-whether Boolean observation game G. Let G = (N, (Pi)i N, (γi)i N). Assume that the goals γi do not contain trivial constituents Kwipi.5 Then BG := (N, (Qi)i N, (γi)i N) where for all i N, Qi = {Kwjpi | pi Pi, i = j}. We view Kwjpi, for each i and j with i = j, as atomic propositions in BG. Let Q = S i N Qi.

Observation. Both G and BG are deﬁned over the same set of players and goal formulas. The number of variables in BG for each i N is |Qi| = (n 1)|Pi|. Thus given G, the associated Boolean game BG can be constructed in polynomial time. Also, note that BGB = B and GBG = G, the constructions are unrelated. Let us give an example of that.

5. As such Kw ipi are always true, this would otherwise cause a problem in the translation, because the players in the constructed Boolean game would then be able to control the value of propositional variables Kw ipi (which is undesirable), unlike the players in the given Kw game. One can also address this formally, without assumptions, with an inductively deﬁned translation mapping Kw ipi to .

Van Ditmarsch & Simon

Example 20. As an illustration to construct a Boolean game from a Kw game, let us take the Kw game just constructed in Example 13. We recall that

GB = ({1, 2, 3}, ({p1, q1}, {p2}, {p3}), (γ1, γ2, γ3))

γ1 = Kw2p1 Kw1p3,

γ2 = Kw1p3 Kw2p1,

γ3 = Kw2p1 Kw3p2.

The Boolean game BGB constructed from that has the same goals but has more variables, namely Kwipj for all i, j N with i = j and for all pj Pi, that is: Kw1p2, Kw1p3, Kw2p1, Kw2q1, Kw2p3, Kw3p1, Kw3q1, Kw3p2. Therefore BGB has more variables than B. The constructions are not each other s converse. However, in order to realize the goals of BGB the players only need to assign a value to variables Kwipj occurring in the goal formulas, so with respect to playing this game the extra variables do not play a role. After replacing Kw2p1 by p1, etcetera for other variables ocurring in goal formulas, we recover the original Boolean game for, however, far more variables that are not used in goals.

Let W = P(Q) be the set of valuations over Q. We deﬁne a function η : Sg W and argue that it is a bijection which is outcome equivalent. Given s Sg, deﬁne w = η( s) as follows: for i N, Kwjpi η( s)i iﬀpi si(j).

Lemma 21. Let BG be the Boolean game associated with the Kw game G. For all i N, for all s S and for all γi, s |= γi iﬀη( s) |= γi.

Proof. This is shown by induction using as the base case that s |= Kwjpi, iﬀη( s) |= Kwjpi The other cases are trivial.

It therefore also follows, similarly to the above, that ui( s) = ui(s) = u B i (ws).

Lemma 22. Given a Kw game G, let BG be the associated Boolean game. The function η : Sg W is a bijection.

Proof. For an arbitrary w W, consider s Sg deﬁned as follows. For all i N, and for all pi Pi, pi si(j) iﬀi = j or Kwjpi wi. By deﬁnition, η( s) = w and thus η is onto. Consider s, t Sg where s = t. Then there exists i, j N with i = j and there exists pi Pi such that pi si(j) and pi ti(j). This implies that Kwjpi η( s)i and Kwjpi η( t)i. Therefore η is a bijection.

Non-global Uniform Strategies as Mixed Strategies for Boolean Games. We allow ourselves a little detour. We can straightforwardly adjust the function η mapping globally uniform strategy proﬁles of the Kw game to valuations that are strategy proﬁles of the Boolean game, to a function mapping arbitrary uniform strategy proﬁles of the Kw game G to mixed strategy proﬁles of the Boolean game BG. We simply deﬁne the revised η on the level of strategy proﬁles s S. Given a uniform strategy proﬁle s S, for each s S such that s(v) = s for some v V , we let π(s) := |{v V | s(v) = s}|/2|P|. Note that

Boolean Observation Games

2|P| = |V |. So π(s) is the probability that a valuation is mapped in s, given s. We can now deﬁne a mixed strategy proﬁle ws of the Boolean game BG as the one executing each s S with probability π(s). We defer the investigation of embeddings into a mixed equilibrium to future research and for now restrict ourselves to a relevant example, ﬁnally closing the loop with matching pennies.

Example 23. Once more we recall Example 2 on page 310 about the pennies that do not match, already further developed in Example 12, wherein it was shown that this game does not have a Nash equilibrium with globally uniform strategies for the pessimist and optimist outcome relation, but has a Nash equilibrium with uniform strategies that are not globally uniform: the uniform strategy proﬁle s = (s1, s2) where in s1, player 1 reveals p1 to 2 when p1 is true and hides p1 from 2 when p1 is false, and in s2 player 2 reveals p2 to 1 when p2 is true and hides p2 from 1 when p2 is false. When translating this game into a Boolean game, we can now observe that this uniform strategy s becomes a mixed strategy η(s) where player 1 randomly chooses between revealing or hiding her propositional variable Kw2p1 and where player 2 randomly chooses between revealing or hiding his propositional variable Kw1p2. To realize that this is indeed random it is important to observe that in Example 2 the probability of observing p1 or p1 was determined by Odd ﬂipping its penny before privately watching the outcome under the dice cup, and similarly for p2 and Even. So after all, for those who may have wondered, there was a reason for setting up the experiment just like that.

We continue with a relevant result for the maximal outcome relation.

Theorem 24. Let G be a Kw game. Then NE(BG) = iﬀNE max(G) = .

Proof. ( ) Suppose NE max(GB) = . By Lemma 15 there exists s Sg such that s NE max(GB). Let w = η(s), we argue that w is a Nash equilibrium. Suppose not, there exists i N, there exists w i such that u B i (w i, w i) > u B i (wi, w i). Consider the globally uniform strategy proﬁle t = η 1(w i, w i) (this is well deﬁned by Lemma 22). From Lemmas 21 and 22 it follows that ui(v, t(v)) = u B i (w i, w i) > u B i (w) = ui(v, s(v)). This implies that s NE max(GB) which is a contradiction. ( ) Suppose w NE(BG). Let s = η 1(w), we claim that s NE max(G). Suppose not, then there exists i N, v V and ti Si such that ui(v, (ti, s i(v))) > ui(v, s(v)). Let w = η( ti, s i). From Lemmas 21 and 22 it follows that u B i (w ) = ui(v, ( ti, s i)(v)) > ui(v, s(v)) = u B i (w). This implies that w NE(BG) which is a contradiction.

5. Existence of Nash Equilibrium

In this section we focus on the question of existence of Nash equilibria for observation games and identify various subclasses in which a Nash equilibrium is guaranteed to exist.

5.1 Existence of Pessimist Nash Equilibrium in Knowing-Whether Games

Example 12 shows that in the Kw fragment a maximal Nash equilibrium is not guaranteed to exist even for two-player games. It is natural to ask if a similar observation holds for pessimist Nash equilibrium. We ﬁrst show that for two-player Kw games, a pessimist Nash

Van Ditmarsch & Simon

equilibrium always exists (Proposition 25). However, for general Kw games, existence is not guaranteed. Example 26 gives an 8-player Kw game without a Nash equilibrium.

Proposition 25. All two-player Kw games have a pessimist Nash equilibrium.

Proof. We construct a uniform strategy proﬁle (s 1, s 2) as follows. For i {1, 2}, let ı denote the player such that ı = i. If i {1, 2} has a uniform strategy si that is dominant, then set s i (v) = si for all v V and let s

ı be the best response to s i . It can be veriﬁed that s as deﬁned above is a Nash equilibrium. If neither player has a uniform strategy that is dominant, then we have the following

For all s1 S1, there exists s2 S2 such that v, (s1, s2) |= γ1 for all v V .

For all t2 S2, there exists t1 S1 such that v, (t1, t2) |= γ2 for all v V .

For two player games there is a bijection between the set of strategies Si and the set of (local) valuations Vi (one can think of a strategy as deciding for each proposition whether to reveal to the other player). Therefore for each t1 and s2 as described above, we can set s 1(v1) = t1 and s 2(v2) = s2 appropriately for some v1 and v2. To see that (s 1, s 2) is a Nash equilibrium, note that for all i {1, 2} and for all v V , min ui(v, s ) = 0. Also, for all v V , min ui(v, (s i, s i)) = 0 due to the above condition.

However, for more than two players a Kw game need not have a pessimist Nash equilibrium. We present a counterexample for eight players.

Example 26. Consider the observation game G where N = {1, 2, . . . , 8} and Pi = {pi} for i N. Player 8 acts as an observer whose goal γ8 = . To specify the goals of the other players we use the following formulas.

A = Kw3p1 Kw4p1 B = Kw3p1 Kw4p1 C = Kw3p1 Kw4p1 D = Kw3p1 Kw4p1

The main idea is to exploit that player 1 controls a single variable. Therefore in any uniform strategy player 1 can choose to satisfy at most two of A, B, C, D. E.g., when p1 is true reveal p1 to 3 and 4 (A), when p1 is false reveal p1 to 4 but not to 3 (C).

Deﬁnition of the Goal Formulas. For players {2, . . . , 7} the goal formulas are as follows.

γ2 = ((C A) Kw8p2) ( (C A) Kw8p2), γ3 = (((B D) Kw8p3) ( (B D) Kw8p3)) ((A D) Kw8p2 Kw8p3), γ4 = ((D C) Kw8p4) ( (D C) Kw8p4) (((B C) Kw8p5) Kw8p4), γ5 = (((A B) Kw8p5) ( (A B) Kw8p5)) ((A C) Kw8p7 Kw8p5), γ6 = (((A D) Kw8p6) ( (A D) Kw8p6)) ((A B) (Kw8p3 Kw8p7) Kw8p6), γ7 = ((B C) Kw8p7) ( (B C) Kw8p7)

Boolean Observation Games

A 2, 5, 6 D, C, B B 3, 5, 7 A, C, A C 2, 4, 7 D, B, A D 3, 4, 6 A, B, B

Table 1: Uniform strategies for player 1. Explanations are given in the text.

The goal of player 1 is deﬁned as γ1 := W6 j=1 αj where α1 = Kw8p2 D, α2 = Kw8p3 A, α3 = Kw8p4 B, α4 = Kw8p5 C, α5 = Kw8p6 B and α6 = Kw8p7 A. We will now verify that NE pess(G) = . The goals of the players (except 8) involve assertions about whether players 2, . . . , 7 reveal the proposition that they control to player 8 along with whether 1 reveals p1 to players 3 and 4. For the purpose of this example, note that for all j {2, . . . , 7}, for all sj Sj and k = 8, the value of sj(k) is irrelevant. We now argue that NE pess(G) = . To simplify the presentation, we split the reasoning into two parts. First we argue that no globally uniform strategy of player 1 can be part of a pessimist Nash equilibrium in G. In the second part we extend this to cover all uniform strategy proﬁles.

Globally Uniform Strategies of Player 1. We show that for all uniform strategy proﬁles s S, if s1 is globally uniform then s NE pess(G). In other words, no uniform strategy proﬁle in G with a globally uniform strategy for player 1 can be a pessimist Nash equilibrium. Consider an arbitrary uniform strategy proﬁle s S where s1 = s1 Sg 1. The uniform strategy s1 satisﬁes exactly one of the formulas A, B, C, D. From the goal formulas we can see that there exists a non-empty subset of players X {2, . . . , 7} such that for all j X, min uj(v, ( s1, s j , s N {1,j})) = 1 for all v. Thus if s NE pess(G) then uj(v, s) = 1 for all j X and for all v. From the goal formulas it also follows that there exists s 1 = s1 such that min u1(v, ( s 1, s j , s N {1,j})) = 1 for all j X and for all v. In Table 1 we list all such possibilities. The ﬁrst column in Table 1 lists the formula in A, B, C, D that is satisﬁed by a globally uniform strategy s1. The second column lists the players j {2, . . . , 7} who can ensure an outcome 1 with s j given s1. The third column gives the corresponding formulas in A, B, C, D that player 1 should satisfy to achieve an outcome 1. For example suppose s1 satisﬁes A (ﬁrst row), players 2, 5 and 6 can reveal their proposition to player 8 to ensure an outcome of 1 given s1. Player 1 can then choose to satisfy D (corresponding to α1), C (corresponding to α4), B (corresponding to α5), respectively to achieve an outcome of 1. Using Table 1 it can be veriﬁed that any s S where s1 is a globally uniform strategy is not a pessimist Nash equilibrium.

Arbitrary Uniform Strategies of Player 1. Next, note that since player 1 controls a single proposition p1, any uniform strategy (of player 1) can satisfy at most two of A, B, C, D. For example, consider the uniform strategy s1: when p1 is true reveal p1 to 3 but not to 4 (B), when p1 is false do not reveal p1 to 3 and do not reveal p1 to 4 (D). Using an argument similar to the one above we can show that if s NE pess(G) then min u1(v, s) = 1 for all v V .

Van Ditmarsch & Simon

Now consider the uniform strategy s1 which is mentioned above for player 1 that satisﬁes B or D. We can argue that there is no uniform strategy s with s 1 = s1 such that s NE pess(G). Given the uniform strategy s1 of player 1, we have the following.

Player 2 can ensure an outcome of 1 (for all v V ) by not revealing p2 to player 8.

Player 3 can ensure an outcome of 1 (for all v V ) by revealing p3 to player 8. But this would in turn imply that player 1 can satisfy α2 by deviating to a uniform strategy that satisﬁes A.

If player 4 reveals p4 to player 8 then player 1 can satisfy α3 by deviating to a uniform strategy that satisﬁes B.

If player 5 reveals p5 to player 8 then player 1 can satisfy α4 by deviating to a uniform strategy that satisﬁes C.

If player 6 reveals p6 to player 8 then player 1 can satisfy α5 by deviating to a uniform strategy that satisﬁes B.

If player 7 reveals p7 to player 8 then player 1 can satisfy α6 by deviating to a uniform strategy that satisﬁes A.

Recall that the goal of player 1 is γ1 := W6 j=1 αj. Now consider any uniform strategy proﬁle s where s 1 = s1, players 2 and 3 are playing their best responses and players 4,5,6 and 7 do not reveal the proposition that they control to player 8. Then α3, α4, α5 and α6 are not satisﬁed by s . From items 1 and 2 above, we have that in s 3, player 3 reveals p3 to player 8 and subsequently player 1 has a proﬁtable deviation to a uniform strategy that satisﬁes A. Thus s NE pess(G). If at least one of the players 4, . . . , 7 reveal the proposition that they control to player 8, then for player 1 to ensure the outcome 1 for all v, it need not necessarily have to satisfy α2 but can choose to satisfy the corresponding formula α3, . . . , α6. But this would also imply deviating from s1 as listed in items 3-6 above. Thus we can conclude that for all uniform strategy s with s 1 = s1 we have that s NE pess(G). In Table 2 we enumerate all possibilities. In the ﬁrst column we list all the possible combinations of the formulas in A, B, C, D that player 1 can possibly satisfy in any uniform strategy. Corresponding to each row which denotes a uniform strategy s1 of player 1, in the second column in Table 2, we list the minimal set of players X which satisfy the following conditions. Given the uniform strategy s1 of player 1,

For all j X, player j cannot ensure the outcome 1 (for all v V ) by not revealing its proposition to player 8 (assuming player 1 chooses s1).

If for all j X, sj is a uniform strategy of player j that reveals pj to player 8 then in the resulting uniform strategy proﬁle s, we have u1(v, s) = 1 for all v V .

In other words, if all the players in X reveal their proposition to player 8 then the outcome for player 1 under the strategy s1 is 1 for all v. For example, consider the uniform strategy s 1 of player 1 deﬁned as follows: when p1 is true reveal p1 to 3 and 4 (A), when p1 is false reveal p1 to 4 but not to 3 (C) . Given s 1, player 3 violates the ﬁrst condition above as

Boolean Observation Games

C A {5, 7} B D D C A B {3, 6}, {6, 7} A D {2, 3} B C {4, 5}

Table 2: Uniform strategies for player 1. Explanations are given in the text.

player 3 can ensure the outcome 1 by not revealing p3 to player 8. If both the players 5 and 7 reveal p5 and p7, respectively, to player 8, then in the resulting uniform strategy proﬁle s

we have u1(v, s ) = 1 for all v V . However, note that then s NE pess(G). In s since player 7 reveals p7 to player 8 and s 1 satisﬁes C A, player 5 can deviate to not reveal p5 and ensure an outcome of 1 (for all v V ). Thus player 5 has a proﬁtable deviation from s and therefore, s NE pess(G). A similar reasoning applies to the other rows in Table 2. From the goal formulas γ1, . . . , γ7 we can verify that for every such set X, there is a player k X who can ensure an outcome of 1 by not revealing pk to player 8. Therefore NE pess(G) = .

5.2 Existence of Nash Equilibrium in the General Case

Following up on Example 26 we now determine more generally for which fragments of observation games the existence of a Nash equilibrium is guaranteed. An initial step would be to consider observation games where the goal formulas for all players are restricted to the positive fragment of LK. For this fragment, the following result is straightforward.

Proposition 27. Let G = (N, (Pi)i N, (γi)i N) be an observation game where γi L+ for all i N. Then NE(G) = for all outcome relations.

Proof. Observe that when γi L+ for all i N, the globally uniform strategy s i (public announcement by player i of Pi) is dominant for all i N. Thus s NE(G) for any outcome relation.

In this section, we present a more general structural result that identiﬁes a class of observation games in which a Nash equilibrium is guaranteed to exist. Our results show that the existence of equilibrium crucially depends on the combination of positive/negative epistemic assertions made by players in their goal formulas. Observation games where the goal formulas are restricted to the positive fragment of LK can be viewed as a particular simple case in this setting. We assume that the goal formulas are in negation normal form. For i, j N (where j may be i) and γi in LK nnf, we ﬁrst deﬁne xj i(γi) for x {+, }. Intuitively, +j i(γi) and j i(γi) encode the fact that player i makes a positive and negative (respectively) epistemic assertion about a variable assigned to player j in the goal formula γi. Formally, xj i(γi) is deﬁned as follows.

For γi = pj and γi = pj we have +j i(pj) and +j i( pj).

Van Ditmarsch & Simon

γi = Kkϕ (where k N): +j i(Kkϕ) iﬀ+j i (ϕ) and j i(Kkϕ) iﬀ j i (ϕ).

γi = ˆKkϕ (where k N): +j i( ˆKkϕ) iﬀ j i (ϕ) and j i( ˆKkϕ) iﬀ+j i (ϕ).

+j i(ϕ ψ) iﬀ+j i (ϕ) or +j i (ψ) and j i(ϕ ψ) iﬀ j i (ϕ) or j i (ψ).

+j i(ϕ ψ) iﬀ+j i (ϕ) or +j i (ψ) and j i(ϕ ψ) iﬀ j i (ϕ) or j i (ψ).

Note that the deﬁnition of +j i(γi) is intended to encode the fact that player i makes a positive epistemic assertion about a variable assigned to player j in the goal formula γi. So in item 4, we have that +j i(ϕ ψ) holds iﬀthe same holds for at least one of the subformulas ϕ or ψ. A similar comment applies to deﬁnition of j i(γi). For every player i, we deﬁne type(i) {+, , c+, c } as follows. For x {+, },

x type(i) if there is a player j = i such that xj i(γi),

cx type(i) if xi i(γi).

In other words, + and are in type(i) if there exists some player j with j = i such that player i makes a positive and negative (respectively) epistemic assertion about a variable assigned to player j in γi. Likewise, c+ and c is in type(i) if player i makes a positive and negative (respectively) epistemic assertion about its own variable in γi. For example, for i N, consider the goal formula γi given in Example 12, under its translation to negation normal form (NNF). We have the following for player 1.

K1p2 occurs as a subformula in the NNF of γ1 and therefore + type(1). ˆK1p2 occurs as a subformula in the NNF of γ1 and therefore type(1).

K2p1 occurs as a subformula in the NNF of γ1 and therefore c+ type(1). ˆK2p1 occurs as a subformula in the NNF of γ1 and therefore c type(1).

For player 2, the reasoning is similar and therefore, we have that type(i) = {+, , c+, c } for all i N. In fact, Theorem 32 given below shows that it is crucial that |type(i)| > 3 in this example. Based on the notion of type, we deﬁne the following subsets of N. Let

X+ = {i N | c+ type(i)}, X = {i N | c type(i)},

Wl = {i N | type(i) = {c+, c }},

W+ = {i N | type(i) = {+, c+, c }}, W = {i N | type(i) = { , c+, c }}.

For the proofs in this section, we also ﬁnd it useful to deﬁne an ordering over the set of strategy proﬁles. Let X N and s X, t X SX. We say that s X t X if for all i X and j N, ti(j) si(j). We can then show the following existence result.

Boolean Observation Games

Algorithm 1:

Input: G = (N, (Pi)i N, (γi)i N). Output: A uniform strategy proﬁle s NE pess(G).

1 Let Wo := X+ X Wl;

2 i X+, v V , set si(v) := s i ; /* a dominant uniform strategy */

3 i X \ X+, v V , set si(v) := s i ; /* a dominant uniform strategy */

4 i Wl, v V , if s S such that w : w i v, ui(w, s) = 1 then w : w i v, set si(w) := si else si(w) := s i ; /* ui does not depend on others choice */

5 i W+, j W , v V set si(v) := s i ; sj(v) := s j ; /* initialisation */

6 v V , set Y (v) := ; Z(v) := ;

8 v V , set Y (v) := Y (v); Z (v) := Z(v); /* process players who make positive assertions about variables controlled by others */

9 while v V , i W+ \ Y (v), si, such that w : w i v, s W \Z(w), we have (si, s W+\{i}(w), s Z(w)(w), s W \Z(w), s Wo(w)), w |= γi do

10 w : w i v, set si(w) := si; Y (w) := Y (w) {i};

/* process players who make negative assertions about variables controlled by others */

11 while v V , i W \ Z(v), si such that w : w i v, s W+\Y (w), we have (si, s W \{i}(w), s Y (w)(w), s W+\Y (w), s Wo(w)), w |= γi do

12 w : w i v, set si(w) := si; Z(w) := Z(w) {i};

13 until v V , Y (v) = Y (v) and Z(v) = Z (v);

14 i W+ \ Y (v), j W \ Z(v) and v V , set sj(v)(i) := ;

15 i W \ Z(v), j W+ \ Y (v) and v V , set sj(v)(i) := Pj;

16 return s;

Theorem 28. Let G = (N, (Pi)i N, (γi)i N) be an observation game where all goals γi are guarded. If for all i N, |type(i)| 3, then

1. NE pess(G) = .

2. NE opt(G) = .

Proof. Part 1. NE pess(G) = . Consider the procedure described as Algorithm 1. We argue that Algorithm 1 always terminates and constructs a uniform strategy proﬁle s NE pess(G). First, we note that the sets X+, X \ X+, Wl, W+ and W form a partition of N. In each iteration of the outer loop in Algorithm 1 (steps 7 - 13), the size of the set Y (v) or Z(v) strictly increases for some v V . We also have that for all v V , 0 |Y (v)| |N| and 0 |Z(v)| |N|. It follows that Algorithm 1 always terminates. Let s be the strategy proﬁle constructed by Algorithm 1. From the description of the procedure, it can also be veriﬁed that s is a uniform strategy proﬁle. Thus to prove the claim, it suﬃces to show that s NE pess(G).

Van Ditmarsch & Simon

Note that for all i X+, s i is a dominant uniform strategy and for all i X , s i is a dominant uniform strategy. Therefore, for all v V , for all i X+ X and for all s i Si, ui(v, s(v)) ui(v, (s i, s i(v))).

For all i Wl, for all v V , we have ui(v, (ti, t i)) = ui(v, (ti, t i)) for all v V , ti Si and for all t i, t i S i. Therefore, by the choice of si made in line 4 of Algorithm 1, we have for all i Wl, for all v V , for all s i Si, ui(v, s) ui(v, (s i, s i)).

Now consider a player i W+. For v V , suppose si(v) is assigned a value in the while loop (steps 9-10). Let sk denote the resulting strategy proﬁle after this assignment (step 10) and Zk denote the value of Z in the corresponding iteration. By deﬁnition of the while loop, for all w with v i w, for all s W \Zk(w), (si, sk W+\{i}(w), s Zk(w), s W \Zk(w), s Wo(w)), w |= γi. Since i W+, this implies that for all t W+\{i} SW+\{i} such that t W+\{i} sk W+\{i}(w), for all s W \Zk(w), (si, t W+\{i}, s Zk(w)(w), s W \Zk(w), s Wo(w)), w |= γi. By deﬁnition, we have s W+\{i}(w) sk W+\{i}(w) and for all j Zk(w), sk j (w) = sj(w). Therefore, it follows that (si, s W+\{i}, s Z(w), s W \Z(w), s Wo(w)), w |= γi and ui(v, s) = 1.

Consider a player i W . For v V , suppose si(v) is assigned a value in the while loop (steps 11-12). Let sk denote the resulting strategy proﬁle after this assignment (step 12) and Y k denote the value of Y in the corresponding iteration. By deﬁnition of the while loop, for all w with v i w, for all s W+\Y k(w), (si, sk W \{i}(w), s Y k(w)(w), s W+\Y k(w), s Wo(w)), w |= γi. Since i W , this implies that for all t W \{i} SW \{i} such that sk W \{i}(w) t W \{i}, for all s W+\Y k(w), (si, t W \{i}, s Y k(w)(w), s W+\Y k(w), s Wo(w)), w |= γi. By deﬁnition, sk W \{i}(w) s W \{i}(w). Thus (si, s W \{i}, s Y (w), s W+\Y (w), s Wo(w)), w |= γi. Therefore, ui(v, s) = 1.

Now suppose there exists v V and i W+ such that i / Y (v) (on termination of the repeat loop, steps 7-13). By deﬁnition, for all si, there exists w with v i w and there exists t W \Z(w) such that (si, s W+\{i}(w), s Z(w)(w), t W \Z(w), s Wo(w)), w |= γi. Since i W+ and sj(v)(i) = for all j W \ Z(v), it follows that for all si, there exists a w with v i w such that (si, s W+\{i}(w), s Z(w), s W \Z(w), s Wo(w)), w |= γi. Therefore, for all s i Si, ui(v, s) ui(v, (s i, s i)).

Suppose there exists v V and i W+ such that i / Z(v). Using a similar proof as above and using the fact that sj(v)(i) = for all j W \Z(v) we can argue that for all si, there exists a w with v i w such that (si, s W \{i}(w), s Y (w)(w), s W+\Y (w), s Wo(w)), w |= γi. Therefore, for all s i Si, ui(v, s) ui(v, (s i, s i)).

Part 2. To show that NE opt(G) = , we modify Algorithm 1 to reﬂect the optimist decision rule. This is achieved by changing the conditional in both the While loops (line 9 and line 11) as described below. Note that the only change is a switch to existential quantiﬁcation over the valuations w in order to capture the deﬁnition of the optimist decision rule.

Line 9. While v V , i W+ \ Y (v), si, such that w : w i v, s W \Z(w), we have (si, s W+\{i}(w), s Z(w)(w), s W \Z(w), s Wo(w)), w |= γi do.

Line 11. While v V , i W \ Z(v), si such that w : w i v, s W+\Y (w), we have (si, s W \{i}(w), s Y (w)(w), s W+\Y (w), s Wo(w)), w |= γi do.

Boolean Observation Games

The result in Theorem 28 is tight in the sense that there exist observation games where |type(i)| = 4 for i N and NE pess(G) = . This is illustrated in Example 26 where for players i {2, . . . , 7}, type(i) = {+, , c+, c }. An interesting corollary of Theorem 28 is for self-positive goals: my objective is never to remain ignorant of others variables even when it may be for others to remain ignorant. (We recall their deﬁnition in Section 2.2.)

Corollary 29. Let G = (N, (Pi)i N, (γi)i N) be an observation game where all (γi)i N are guarded and self-positive, then NE pess(G) = and NE opt(G) = .

Proof. Follows from Theorem 28, since i, type(i).

For NE max(G) we show a weaker result (Theorem 30) which can be strengthened for Kw games (Theorem 32).

Theorem 30. Let G = (N, (Pi)i N, (γi)i N) be an observation game where the goal formulas (γi)i N are guarded. If for all i N, |type(i)| 2 then NE max(G) = .

Proof. Let G = (N, (Pi)i N, (γi)i N) be an observation game. Let X+ = {i N | c+ type(i)} and X = {i N | c type(i)}. Consider the uniform strategy proﬁle s deﬁned as follows.

For i X+ X , we deﬁne si using the iterative procedure: for v V where si(v) is not deﬁned, if there exists s S such that ui(v, s) = 1 then set si(w) = si for all w : v i w. Otherwise set si(w) = s i for all w : v i w.

For all i X+, for all v V , let si(v) = s i .

For all i N \ [(X+ X ) X+], for all v V , let si(v) = s i .

Note that for all i X+, s i is a dominant uniform strategy. For all i X , s i is a dominant uniform strategy and for all i (X+ X ), both s i and s i are dominant uniform strategies. Since the goal formulas are guarded, we have for all i N and for all v V , v, s(v) |= γi iﬀw, s(v) |= γi for all w : v i w. Also, for all i X+ X , with type(i) 2, we have that ui(v, s(v)) = ui(v, (si(v), s i)) for all s i S i. It then follows that s NE max(G).

5.3 Existence of Maximal Nash Equilibrium in Knowing-Whether Games

For the subclass of Kw games, we show that Theorem 30 can be strengthened. We argue that if G is a Kw game where for all i N, |type(i)| 3 then the output of Algorithm 2 is a globally uniform strategy proﬁle s such that s NE max(G).

Lemma 31. Algorithm 2 always terminates and it satisﬁes the following properties.

After each iteration of the while loops, steps 9-10 and steps 11-12, the strategy proﬁle s constructed is a globally uniform strategy proﬁle.

The strategy proﬁle s which is the output of Algorithm 2 is a globally uniform strategy proﬁle.

Van Ditmarsch & Simon

Algorithm 2:

Input: A Kw game G = (N, (Pi)i N, (γi)i N). Output: A uniform strategy proﬁle s NE max(G).

1 Let Wo := X+ X Wl;

2 i X+, v V , set si(v) := s i ; /* a dominant uniform strategy */

3 i X \ X+, v V , set si(v) := s i ; /* a dominant uniform strategy */

4 i Wl, if s S, and v V such that ui(v, s) = 1 then w V set si(w) := si else set si(w) := s i ; /* ui does not depend on others choice */

5 i W+, j W , v V set si(v) := s i ; sj(v) := s j ; /* initialisation */

6 Set Y := ; Z := ;

8 set Y := Y ; Z := Z; /* process players who make positive assertions about variables controlled by others */

9 while i W+ \ Y (v), si and v V such that s W \Z, we have v(si, s W+\{i}(v), s Z(v), s W \Z, s Wo(v)) |= γi do

10 w V , set si(w) := si; Y := Y {i};

/* process players who make negative assertions about variables controlled by others */

11 while i W \ Z, si, v V , such that s W+\Y , we have v(si, s W \{i}(v), s Y (v), s W+\Y , s Wo(v)) |= γi do

12 w V , set si(w) := si; Z := Z {i};

13 until Y = Y and Z = Z ;

14 i W+ \ Y , j W \ Z, v V , set sj(v)(i) := ;

15 i W \ Z, j W+ \ Y , v V , set sj(v)(i) := Pj;

16 return s;

Proof. First, note that in each iteration of the outer loop in Algorithm 2 (steps 7 - 13), the size of the set Y or Z strictly increases. Therefore Algorithm 2 always terminates. At the end of the initialization steps (2 - 6), s Sg by deﬁnition. So it suﬃces to argue that at the end of each iteration of the two While loops (steps 9 - 10 and 11 - 12), the following invariant is maintained: s Sg. We can argue by induction on the number of iterations of the while loops (steps 7 - 13). The claim follows from the deﬁnition of the assignment statements: steps 10 and 12. Thus on termination of the outer loop (steps 7 - 13) we have that s Sg. It follows from the deﬁnition of lines 14 and 15 that the output of Algorithm 2, s Sg.

Theorem 32. Let G = (N, (Pi)i N, (γi)i N) be a Kw game. If for all i N, |type(i)| 3 then NE max(G) = .

Proof. We argue that the output of Algorithm 2 is a globally uniform strategy proﬁle s such that s NE max(G). As in the case of Theorem 28, note that the sets X+, X \ X+, Wl, W+ and W form a partition of N.

Boolean Observation Games

By Lemma 31, Algorithm 2 always terminates. Let s be the proﬁle constructed by Algorithm 2. To prove the claim, it suﬃces to argue that s NE max(G). Note that for all i X+, s i is a dominant uniform strategy and for all i X , s i is a uniform strategy that is dominant. Therefore, for all v V , for all i X+ X , ui(v, s(v)) ui(v, (s i, s i(v))). For all i Wl we have that ui(v, s(v)) = ui(v, (si(v), s i)) for all s i S i. Since the goals are knowing whether formulas, we have if there exists v V and there exists s S such that ui(v, s) = 1 then for all w V , for all s i S i, ui(w, (si, s i)) = 1. Therefore, for all v V , for all i Wl, ui(v, s(v)) ui(v, (s i, s i(v))). Now consider a player i W+. For v V , suppose si(v) is assigned a value in the while loop (steps 9-10). Let sk denote the resulting strategy proﬁle after this assignment (step 10). Let Zk denote the value of Z in the corresponding iteration. By Lemma 31, we have sk Sg and by deﬁnition of the while loop, there exists v such that for all s W \Zk, v, (si, sk W+\{i}(v), s Zk, s W \Zk, s Wo(v)) |= γi. By Lemma 31 and the fact that γi LKw

it follows that for all w, for all s W \Zk, v, (si, sk W+\{i}(v), s Zk, s W \Zk, s Wo(v)) |= γi. Since i W+, this implies that for all w V , for all t W+\{i} SW+\{i} such that t W+\{i} s W \Zk, w, (si, t W+\{i}, s Zk(w), s W \Zk(w), s Wo(w)) |= γi. By deﬁnition, s W+\{i}(w) sk W+\{i}(w) and for all j Zk, sk j (w) = sj(w). Thus we have w, (si, s W+\{i}, s Z(w), s W \Z, s Wo(w)) |= γi. Therefore, ui(w, s) = 1 for all w V . Consider a player i W . For v V , suppose si(v) is assigned a value in the while loop (steps 11-12). Let sk denote the resulting uniform strategy proﬁle after this assignment (step 12) and Y k denote the value of Y in the corresponding iteration. By Lemma 31, sk Sg. By deﬁnition of the while loop, there exists v such that for all s W+\Y k, v, (si, sk W \{i}(v), s Y k(v), s W+\Y k, s Wo(w)) |= γi. By Lemma 31 and the fact that γi LKw, we have for all w such that for all s W+\Y k, w, (si, sk W \{i}(w), s Y k(w), s W+\Y k, s Wo(w)) |= γi. Since i W , this implies that for all t W \{i} SW \{i} such that sk W \{i}(w) t W \{i}, for all s W+\Y k(w), w, (si, t W \{i}, s Y k(w)(w), s W+\Y k(w), s Wo(w)) |= γi. By deﬁnition, sk W \{i}(w) s W \{i}(w). Thus w, (si, s W \{i}, s Y (w), s W+\Y (w), s Wo(w)) |= γi. Therefore, ui(w, s) = 1 for all w W. Now suppose there exists i W+ such that i / Y (on termination of the repeat loop, steps 7-13). By deﬁnition, for all si, for all v V , there exists t W \Z such that v, (si, s W+\{i}(v), s Z(v), t W \Z, s Wo(v)) |= γi. Since i W+ and sj(v)(i) = for all j W \ Z, it follows that for all si, for all v V v, (si, s W+\{i}(v), s Z(v), s W \Z, s Wo(v)) |= γi. Suppose there exists v V and i W+ such that i / Z. Using a similar proof as above and using the fact that sj(v)(i) = for all j W \Z(v) we can argue that for all si, for all v V , v, (si, s W \{i}(w), s Y (w), s W+\Y , s Wo(w)) |= γi. It follows that s NE max(G).

Examples 33 and 34 show that Theorems 30 and 32 are tight.

Example 33. Consider the two-player game where N = {1, 2}, P1 = {p} and P2 = {q1, q2, q3}. Let γ1 = (Kw1q2 Kw2p) (Kw1q3 Kw2p) and γ2 = (q1 Kw1q2) ( q1 Kw1q3) ( Kw1q2 Kw1q3). Note that in this game, |type(1)| = 3 and |type(2)| = 2. The goal of player 1 is a Kw formula. It can be veriﬁed that NE max(G) = .

Van Ditmarsch & Simon

Example 34. Consider the two-player game where P1 = {p1, q1} and P2 = {p2}. Let the goal formulas be: γ1 = ( Kw1p2 (Kw2p1 Kw2q1)) (Kw1p2 ( Kw2p1 Kw2q1)) and γ2 = (Kw2p1 Kw1p2) (Kw2q1 Kw1p2). In this game, both goals are Kw formulas. We have |type(1)| = 4 and |type(2)| = 3. It can be veriﬁed that NE max(G) = .

6. Representation and Complexity

For an observation game G = (N, (Pi)i N, (γi)i N), let |N| = n, |P| = k and maxi N |γi| = m (where |γi| denotes the number of symbols in γi). For i N, every strategy si : N P(Pi), can be represented in size O(nk). That is, both observation games and strategies have compact representations linear in n, k and m. On the other hand, each uniform strategy si : V Si can be encoded as a tuple of Boolean functions (sj i(pi))j N,pi Pi where each sj i(pi) : P(P) { , }. Here sj i(pi)(v) = is viewed as player i revealing the variable pi to player j under the valuation v. We assume that the Boolean function sj i(pi) is represented as a propositional formula βj i (pi) over the propositions P. It is well known that every such Boolean function can be represented as a propositional formula, in the worst case the size of sj i(pi) can be exponential in k. In this section we address the computational complexity of the following two basic algorithmic questions.

Veriﬁcation. Given an observation game G and a uniform strategy proﬁle s S, is s NE x(G) for an outcome relation x {pess, opt, max}?

Emptiness. Given an observation game G is NE x(G) = for an outcome relation x {pess, opt, max}?

We show that the veriﬁcation and emptiness questions are PSPACE-complete and NEXPTIME-complete respectively for the maximal outcome relation. We also show that for the pessimist and optimist outcome relations, the veriﬁcation and emptiness questions are in PSPACE and NEXPTIME respectively. To obtain these results it is crucial to establish the complexity of the model checking problem of the logic LK. The following result shows that the model checking problem is PSPACE-complete. It follows directly from Proposition 2 in ( Agotnes et al., 2013b)6.

Theorem 35. Given α LK along with a strategy proﬁle s S and a valuation v V , checking if v, s |= α is PSPACE-complete.

It is well known that the model checking problem for epistemic logic formulas over Kripke structures (for example, formulas of multi-agent S5) can be solved in polynomial time (Fagin, Halpern, Moses, & Vardi, 1995a; Halpern & Vardi, 1991). Note that in our setting, a Kripke structure is not explicitly part of the input, rather the underlying relational structure is compactly presented in terms of the valuation v and strategy s. This is the reason for PSPACE-hardness of the model checking problem.

6. We thank Paul Harrenstein for providing us an unpublished full version of ( Agotnes et al., 2013b) which includes a proof of Proposition 2. For the sake of completeness, we give a full proof of Theorem 35 in the Appendix.

Boolean Observation Games

6.1 Veriﬁcation

In the rest of this section, we refer to valuations over various sets of variables and therefore ﬁnd it convenient to use the following notations. Let A be a ﬁnite set of variables. We use V (A) to denote the set of all valuations over A.

Theorem 36. Given an observation game G = (N, (Pi)i N, (γi)i N) and a uniform strategy proﬁle s S, checking if s NE max(G) is PSPACE-complete.

Proof. We can argue that the complement of the problem is in PSPACE. That is, given G and s S, the problem is to verify if s NE max(G). This can be solved with the following two steps:

1. Guess i N, v V and si Si.

2. Verify that ui(v, (si, s(v) i)) > ui(v, s(v)).

For step 1 note that the size of a strategy |si| = O(nk). So the triple (i, v, s) which forms a possible witness to the fact that s NE max(G) has a polynomial representation. By Theorem 35 it follows that step 2 can be solved in PSPACE. Since PSPACE is closed under complementation and NPSPACE = PSPACE due to Savich s Theorem, the membership in PSPACE follows. To show PSPACE-hardness, we give a reduction from the model checking problem for LK. That is, given α LK, a strategy proﬁle s S and a valuation v V we construct an observation game G and a uniform strategy s as follows. Let P(α) denote the set of variables occurring in α and p1 P(α) be an arbitrary ﬁxed variable. Let q be a variable such that q P(α). The set of players N = {1, 2}. P1 = P(α) and P2 = {q}. To deﬁne the goal formulas we make use of the following notations. Let δv denote the Boolean formula over P1 which uniquely characterises the valuation v. That is, δv := V p v p V p/ v p. For the (ﬁxed) variable p1 P1, we deﬁne the formula ﬂip(p1) as follows.

( Kw2q if p1 s1(2), Kw2q if p1 s1(2). The goal formulas are then deﬁned as:

γ1 = δv (α ﬂip(p1))

Let s be any uniform strategy proﬁle such that for all w V (P1 {q}) with w P1 = v we have s(w) = s. Now consider a w V (P1 {q}) such that w P1 = v. Suppose w, s |= α. By the deﬁnition of ﬂip(p1), we have that w, s(w) |= ﬂip(p1) and thus w, s(w) |= γ1. Again, by the deﬁnition of ﬂip(p1), there exists s 1 such that w, (s 1, s 1(w)) |= ﬂip(p1) and therefore u1(w, (s 1, s 1(w))) > u1(w, s(w)). Thus s NE max(G). Conversely, suppose w, s |= α. Then for player 1, w, s(w) |= γ1. For all w V (P1 {q}) such that w P1 = v, for all s 1 S1, w , (s 1, s 1(w) |= δv and therefore, w , (s 1, s 1(w )) |= γ1. For player 2, for all valuations u V (P1 {q}), we have u, s(u) |= γ2. Therefore s NE max(G). By Theorem 35 PSPACE-hardness follows, which gives the desired result.

Van Ditmarsch & Simon

In the case of pessimist and optimist outcome relations, the following computational upper bounds for the veriﬁcation question are relatively straightforward. Whether matching lower bounds can be shown is an interesting question.

Theorem 37. Given an observation game G = (N, (Pi)i N, (γi)i N) and a uniform strategy proﬁle s S, checking if s NE x(G) is in PSPACE where x {pess, opt}.

Proof. Observe that by Theorem 35, for i N, s S and v V , checking if ui(v, s(v)) = 1 can be done in PSPACE. It follows that checking if max ui(v, s) = 1 (respectively, if min ui(v, s) = 1) can be checked in PSPACE. Therefore, to check if s NE pess(G) (respectively, if s NE opt(G)), it suﬃces to perform the following two steps.

1. Guess a player i, a valuation v and a strategy s i Si.

2. Verify if min ui(v, s) < min ui(v, ( s i, s i)), (respectively, if max ui(v, s) < max ui(v, ( s i, s i))).

For step 1 note that the size of the strategy |si| = O(nk). Thus the triple (i, v, si) that forms a possible witness to the fact that s NE pess(G) (respectively, s NE opt(G)), has a polynomial representation. By the observation above, step 2 can be solved in PSPACE. Since PSPACE is closed under complementation and NPSPACE = PSPACE, the membership in PSPACE follows.

6.2 Emptiness

Next we address the complexity of checking for emptiness of maximal Nash equilibria in observation games. We ﬁnd it useful to introduce the following deﬁnitions. Let A = {a1, . . . , al} and B = {b1, . . . , bl} be two ﬁnite sets of variables where |A| = |B| and let ζ : A B be a bijection. For valuations v1 V (A) and v2 V (B), we say that consζ(v1, v2) holds if for all j : 1 j l, aj v1 iﬀζ(aj) v2. We also deﬁne the formula Cζ(A, B) := l j=1(aj ζ(aj)). Given a uniform strategy si and a set Z Pi, we say that si is globally Z-uniform if for all v, v V , if v Z = v Z then si(v) = si(v ). For i N, let SZ i = {si Si | si is globally Z-uniform}. Note that SZ i can be viewed as a natural generalisation of Sg i by parameterising the uniform strategies on the set Z.

An NEXPTIME-complete Problem. We now show that given an observation game G, checking if NE max(G) is empty is NEXPTIME-complete. To prove the hardness, we give a reduction from the Dependency quantifier Boolean formula game (Dqbfg) (Hearn & Demaine, 2009, p.87). Dqbfg involves a three player game with players 1, 2 and 3. There are four ﬁnite sets of variables which are mutually disjoint, X2, X3, A2 and A3 along with a Boolean formula ϕ over the variables X2 X3 A2 A3. Let X = X2 X3 and A = A2 A3. For the rest of this section we use LB to denote the set of Boolean formulas over the variables X A. Players strategies are deﬁned as follows.

Player 1: a strategy t1 V (X).

Player 2: a strategy t2 : V (X2) V (A2).

Boolean Observation Games

Player 3: a strategy t3 : V (X3) V (A3).

In other words, a strategy for player 1 is to select a valuation for variables in X. Player 2 chooses a valuation for variables in A2 and his strategy can depend on the valuation for variables in X2. Similarly, a strategy for player 3 is to choose a valuation for variables in A3 which can depend on the valuation of variables in X3. For player i {1, 2, 3} let Ti denote the set of strategies of player i and T the set of strategy proﬁles. It is easy to observe that a strategy proﬁle t = (t1, t2, t3) deﬁnes a valuation over the set of variables X A. For a formula α LB we then have the natural interpretation for t |= α. Given strategies t2 T2 and t3 T3, we say that the pair (t2, t3) is a winning strategy for the coalition of players 2 and 3 if for all t1 T1, (t1, t2, t3) |= ϕ. An instance of Dqbfg is then given by the tuple H = ((Xi)i {2,3}, (Ai)i {2,3}, ϕ) and the associated decision problem is to check if the coalition of players 2 and 3 have a winning strategy in H.

Theorem 38 ((Hearn & Demaine, 2009)). Dqbfg is NEXPTIME-complete.

The reduction. Given an instance of Dqbfg H = ((Xi)i {2,3}, (Ai)i {2,3}, ϕ), we construct an observation game GH = (N, (Pi)i N, (γi)i N) as follows. The set of players N = {1, 2, 3}. For i {2, 3}, let Yi be a copy of the variables in Xi, so |Yi| = |Xi| and let Y = Y2 Y3. Let P1 = X, P2 = A2 Y2 {q} and P3 = A3 Y3 {r}. For the rest of this section, we use V and LK to denote the set of all valuations and the set of all formulas over the variables in the observation game GH respectively (so V = V (X Y A {q, r})). We also deﬁne the bijection ζ : X Y as the function that maps each variable in Xi to its corresponding copy in Yi. Formally, let X1 = {x1 1, . . . , xl 1}, Y1 = {y1 1, . . . , yl 1}, X2 = {x1 2, . . . , xh 2} and Y2 = {y1 2, . . . , yh 2}. Then ζ(xj 1) = yj 1 for all j {1, . . . , l} and ζ(xj 2) = yj 2 for all j {1, . . . , h}. To simplify notation, we denote consζ by cons and Cζ by C for this ﬁxed bijection ζ. In order to deﬁne the goal formulas, we ﬁrst inductively deﬁne a function λ : LB LK

that transforms ϕ to a formula in LK as follows.

For p X, λ(p) := p.

For p A2, λ(p) := Kw3p.

For p A3, λ(p) := Kw2p.

λ( α) := λ(α).

λ(α1 α2) := λ(α1) λ(α2).

Let ψ2 = (Kw2r Kw3q) and ψ3 = (Kw2r Kw3q). Recall that Example 12 shows that already for the Kw fragment of observation games, NE max need not always exist. Observe that the formulas ψ2 and ψ3 precisely correspond to γ1 and γ2 respectively as used in Example 12. We deﬁne the players goal formulas as follows.

For i {2, 3}, γi = (λ( ϕ) ψi) C(X2, Y2) C(X3, Y3).

Van Ditmarsch & Simon

Properties of GH. It is easy to see that the resulting observation game GH is polynomial in the size of H. We ﬁrst make the following observations about GH.

Lemma 39. Let GH be the observation game corresponding to H and let s S. If there exists v V such that cons(v X1, v Y1), cons(v X2, v Y2) and v, s(v) |= λ(ϕ) then s / NE max(GH).

Proof. Suppose there exists v V such that cons(v X1, v Y1), cons(v X2, v Y2) and v, s(v) |= λ(ϕ). Then v, s(v) |= C(X2, Y2) C(X3, Y3). By Example 12, we have that there exists i {2, 3} such that v, s(v) |= ψi and there exists si Si such that v, (si, s i(v)) |= ψi. Therefore we have ui(v, s(v)) < ui(v, (si, s i(v))). Thus s NE max(GH).

Lemma 40. For i {2, 3}, for all s S, for all v, v V such that v (X Y ) = v (X Y ) we have v, s |= γi iﬀv , s |= γi.

Proof. For i {2, 3}, the claim clearly holds for formulas ψi and C(Xi, Yi). Thus for γi, the claim follows by a simple induction on ϕ.

Next, we show that if the set of maximal Nash equilibria in GH is non-empty then this set contains certain restricted types of maximal Nash equilibria. Let R denote the set of uniform strategy proﬁles s S that satisfy the following conditions:

for i {2, 3}, si SYi i .

In other words, R consists of the set of all uniform strategy proﬁles s such that s1 is globally uniform and for i {2, 3}, si is globally Yi-uniform.

Lemma 41. If NE max(GH) = then there exists s NE max(GH) such that s R.

Proof. For players i {2, 3} we deﬁne an equivalence relation =i V V as follows. For v, v V , v =i v if v Yi = v Yi. For v V , let [v]i denote the equivalence class containing the valuation v and ci v [v]i denote a ﬁxed valuation which is interpreted as the canonical representative element in the equivalence class [v]i. Suppose s NE max(GH). Consider the uniform strategy proﬁle s R deﬁned as follow.

For player 1, ﬁx a valuation w V and let s 1(v) = s1(w) for all v V .

For players i {2, 3}, for all v V , s i (v) = si(ci v).

We claim that s NE max(GH). Suppose not, then there exists i {2, 3}, there exists w V , there exists si Si such that ui(w, (si, s i(w))) > ui(w, s (w)). Then w, (si, s i(w)) |= γi and w, s (w) |= γi. Now consider the valuation u deﬁned as follows: u P1 = w P1 and for i {2, 3}, u Pi = ci w Pi. By deﬁnition of u we have that u (X Y ) = w (X Y ) and therefore, w =i u. From the deﬁnition of s it follows that s (w) = s(u).

Boolean Observation Games

Since w, (ai, s i(w)) |= γi we have that w, (ai, s i(u)) |= γi. By Lemma 40 we have that u, (ai, s i(u)) |= γi. Since w, s (w) |= γi we have that w, s (u) |= γi. By Lemma 40 we have that u, s (u) |= γi. However, this implies that s / NE max(GH) which is a contradiction.

Strategy Translation. Note that by the construction of GH, the strategies of player 1 are irrelevant in terms of existence of maximal Nash equilibria. Player 1 can ensure a utility of 1 by choosing any strategy. We now deﬁne two functions which translate strategies of players 2 and 3 between H and GH. For the rest of the section we make use of the following concise notation. For i = 2, let i+ = 3 and for i = 3, let i+ = 2. For i {2, 3}, let χi : Ti SYi i be the function that translates every strategy ti of player i in H to a globally Yi-uniform strategy si = χi(ti) in GH as deﬁned below.

For all v V , if cons(v Xi, v Yi) then si(v)(i+) = ti(v Xi) and si(v)(i+) = otherwise. For all v V , si(v)(1) = and si(v)(i) = Pi.

For i {2, 3}, let µi : SYi i Ti be the function that translates every globally Yi-uniform strategy si of player i in H to a strategy ti = µi(si) in GH as deﬁned below.

For all v V , such that cons(v Xi, v Yi), ti(v Xi) = si(v)(i+).

Note that since si SYi i , µi is well deﬁned.

Lemma 42. For all i {2, 3} and for all si SYi, let s i = χi(µi(si)). For all s1, s 1 S1, for all i {2, 3} and for all v V such that cons(v Xi, v Yi) we have v, (s1, s2, s3)(v)) |= γi iﬀv, (s 1, s 2, s 3)(v)) |= γi.

Proof. For i {2, 3}, the claim clearly holds for the formulas ψi and C(Xi, Yi). Thus for γi, the claim follows by induction on ϕ.

Lemma 43. For all α LB, for all t T, for all i {2, 3} and for all v V such that t X = v X and cons(v Xi, v Yi) we have t |= α iﬀv, (s1, χ2(t2), χ3(t3))(v)) |= λ(α) for all s1 S1.

Proof. For i {2, 3}, let si = χi(ti). The proof is by induction on the structure of α where the interesting cases involve the three base cases.

α = p X. Then we have λ(p) = p and the following sequence of equivalences. t |= p iﬀp t1 iﬀp v (since t1 X = v X) iﬀv, (s1, s2, s3)(v)) |= p for all s1 S1.

α = p A2. Then we have λ(p) = Kw3p and the following sequence of equivalences. t |= p iﬀp t2(t1 X2) iﬀp s2(v)(3) (since cons(v X2, v Y2)) iﬀv, (s1, s2, s3)(v)) |= Kw3p for all s1 S1.

α = p A3. Then we have λ(p) = Kw2p and the following sequence of equivalences. t |= p iﬀp t3(t1 X3) iﬀp s3(v)(2) (since cons(v X3, v Y3)) iﬀv, (s1, s2, s3)(v)) |= Kw2p for all s1 S1.

For α = α1 and α = α1 α2 the claim follows by a direct application of the induction hypothesis.

Van Ditmarsch & Simon

Lemma 44. Let H = ((Xi)i {2,3}, (Ai)i {2,3}, ϕ) be an instance of Dqbfg and GH the associated observation game. The coalition of players 2 and 3 have a winning strategy in H iﬀNE max(GH) = .

Proof. Let (t2, t3) be a winning strategy for the coalition of players 2 and 3 in H. By deﬁnition of a winning strategy, for all t1 T1, we have (t1, t2, t3) |= ϕ. Let t = (t1, t2, t3) and consider the observation game GH. Note that in GH, by the deﬁnition of player 1 s goal γ1, we have for all v V and for all s S, u1(v, s(v)) = 1. Now consider an arbitrary valuation v V . There are two cases to consider.

Case 1. Suppose there exists i {2, 3} such that cons(v Xi, v Yi) does not hold. By semantics, for all s S and for all i {2, 3} we have v, s(v) |= C(X2, Y2) C(X3, Y3) and thus v, s(v) |= γi. Therefore, ui(v, s(v)) = 0.

Case 2. Suppose for all i {2, 3}, cons(v Xi, v Yi) holds. By semantics we have for all s S, for all i {2, 3}, v, s(v) |= C(X2, Y2) C(X3, Y3). Let t 1 = v X and t = (t 1, t2, t3). By deﬁnition of t , we have t X = v X. Since (t2, t3) is a winning strategy for players 2 and 3 in H, we have (t 1, t2, t3) |= ϕ. By Lemma 43 v, (s1, χ2(t2), χ3(t3))(v) |= λ( ϕ).

Since the choice of v was arbitrary, we can conclude that (s1, χ2(t2), χ3(t3)) NE max(GH). In fact, note that the argument shows a stronger claim - for all s 1 S1, the uniform strategy proﬁle (s 1, χ2(t2), χ3(t3)) NE max(GH). ( ) Suppose NE max(GH) = . By Lemma 41, there exists a s R such that s NE max(GH). Let (t2, t3) = (µ2(s2), µ3(s3)). We argue that (t2, t3) is a winning strategy for the coalition of players 2 and 3 in H. Suppose not, then there exists t 1 T1 such that for the strategy proﬁle t = (t 1, t2, t3), we have t |= ϕ. Consider the pair of strategies (s 2, s 3) = (χ2(t2), χ3(t3)) and a valuation v V such that v X = t X and for all i {2, 3}, cons(v Xi, v Yi). By Lemma 43 we have that v, (s 1, s 2, s 3) |= λ(ϕ) for all s 1 S1. In particular, v, (s1, s 2, s 3) |= λ(ϕ). Since s i = χi(µi(si)) for i {2, 3}, by Lemma 42 we have that v, (s1, s2, s3) |= λ(ϕ). By Lemma 39, s NE max(GH) which is a contradiction.

Theorem 45. Given an observation game G, checking if NE max(G) = is NEXPTIMEcomplete.

Proof. Recall that for each player, a uniform strategy si can be encoded as a tuple of Boolean functions (sj i(pi))j N,pi Pi each of which can be represented by a propositional formula βj i (pi) whose size is at most exponential in k. To show that the problem is in NEXPTIME, we ﬁrst guess a uniform strategy proﬁle s. This involves guessing n2k formulas each of which can be exponential in k. Membership in NEXPTIME then follows from Theorem 36. By Lemma 44, it follows that checking if NE max(G) = is NEXPTIME-hard. Thus the claim follows.

Boolean Observation Games

In the case of pessimist and optimist outcome relations, an argument similar to that given in the proof of Theorem 45 along with Theorem 37 immediately gives us an upper bound on the complexity of emptiness problem.

Theorem 46. Given an observation game G, checking if NE x(G) = is in NEXPTIME where x {pess, opt}.

6.3 Knowing-Whether Observation Games

In the case of Kw games, we show that both the veriﬁcation problem and the emptiness problem have better complexity bounds which match the known complexity results for the corresponding questions in Boolean games. We ﬁrst recall the relevant results for Boolean games.

Theorem 47 ((Harrenstein, Turrini, & Wooldridge, 2017)). (Veriﬁcation) Given a Boolean game B along with a strategy proﬁle v checking if v NE(B) is co-NP-complete.

Theorem 48 ((Bonzon et al., 2006)). (Emptiness) Given a Boolean game B, checking if NE(B) = is Σp 2-complete.

In the context of Kw games, as an immediate consequence of Proposition 5 we get that the model checking question for the fragment LKw is in polynomial time.

Corollary 49. Given α LKw along with a strategy proﬁle s S and a valuation v V , checking if v, s |= α is in PTIME.

We then have the following results for the complexity of veriﬁcation and emptiness in Kw games.

Theorem 50. Given a Kw game G = (N, (Pi)i N, (γi)i N) and a uniform strategy proﬁle s S, checking if s NE max(G) is co-NP-complete.

Proof. Membership in co-NP follows immediately from Corollary 49. For hardness, we show a reduction from the corresponding veriﬁcation problem in Boolean games which is: given a Boolean game B and a strategy proﬁle v in B, to check if v NE(B). By Theorem 47 this problem is known to be co-NP-complete. Given a Boolean game B and a strategy proﬁle w in B, let GB and sw be the corresponding observation game and the globally uniform strategy proﬁle in GB as deﬁned in Section 4.2. We argue that w NE(B) iﬀ sw NE max(GB).

( ) This direction is exactly the same as the ﬁrst part of the proof of Theorem 19. Suppose w NE(B) and sw NE max(GB). Then there exists i N, v V and ti Si such that ui(v, (ti, sw i(v))) > ui(v, sw(v)). Let w = χ 1([ ti, s i]) From Lemmas 14, 17 and 18 it follows that u B i (w ) = ui(v, ( ti, s i)(v)) > ui(v, sw(v)) = u B i (w) for all v V . Therefore w NE(B) which is a contradiction.

( ) Suppose sw NE max(GB) and w NE(B). Then there exists i N and w i such that u B i ((w i, w i) > u B i (w). Let w = (w i, w i). From Lemma 14 we have that ui(v, sw (v)) = u B i (w ) > u B i (w) = ui(v, sw(v)). This implies that sw NE max(GB) which is a contradiction.

Van Ditmarsch & Simon

Theorem 51. Given a Kw game G, checking if NE max(G) = is ΣP 2 -complete.

Proof. Membership in ΣP 2 follows immediately from Corollary 16 and Theorem 50. For ΣP 2 -hardness, notice that the translation from observation games to Boolean games that we provide in Section 4.3 is polynomial time computable. Thus given an instance of an observation game G, we can construct a Boolean game BG in polynomial time. By Theorem 24, NE max(G) = iﬀNE(BG) = . From Theorem 48 it follows that checking if NE max(G) = is ΣP 2 -complete.

7. Discussion and Conclusion

Summary. We introduced Boolean observation games as a qualitative model which combines aspects of imperfect and incomplete information games. For these games we studied Nash equilibria based on diﬀerent ways to compare sets of outcomes, that result in diﬀerent expectations of outcomes. Our main technical contributions are for the existence of Nash equilibria, for the computational analysis of Nash equilibria, as well as for identifying knowing-whether games, a fragment of observation games that precisely corresponds to Boolean games in terms of existence of Nash equilibria. A summary of our results are listed in Table 3.

Existence Complexity |type(i)| 2 |type(i)| 3 |type(i)| > 3 Veriﬁcation Emptiness

Observation games NE pess Yes (Theorem 28) Yes (Theorem 28) No (Example 26) PSPACE (Theorem 37)

NEXPTIME (Theorem 46) NE opt Yes (Theorem 28) Yes (Theorem 28) PSPACE (Theorem 37)

NEXPTIME (Theorem 46) NE max Yes (Theorem 30) No (Example 33) No (Example 33) PSPACEcomplete (Theorem 36)

NEXPTIMEcomplete (Theorem 45)

Kw games NE pess Yes (Theorem 28) Yes (Theorem 28) NE opt Yes (Theorem 28) Yes (Theorem 28) NE max Yes (Theorem 32) Yes (Theorem 32) No (Example 34) Co-NPcomplete (Theorem 50)

Σp 2-complete (Theorem 51)

Table 3: Summary of results.

Complexity and Existence. Note that in Boolean observation games, the underlying relational structure (the Kripke model) is not explicit. It is implicitly presented in terms of a valuation v and strategy proﬁle s. Therefore, even the basic model checking problem is PSPACE-complete given the compact presentation. This in turn is one of the main reasons for the high complexity bounds that we obtain for the computational analysis of this model. An alternative would be to explicitly have a Kripke model as part of the input. Suppose the Kripke model is deﬁned over a set of worlds W. Then a uniform strategy can be thought of as a uniform function from W to the set Si of strategies for player i, which would have a polynomial representation in terms of the number of worlds |W|, the number of agents n, and the number of variables (atoms) |P|. Computing ui(v, s(v)) can then also be done in polynomial time. As a consequence it can be shown that the veriﬁcation problem is in

Boolean Observation Games

co-NP and the emptiness problem is in ΣP 2 . However, the size of the Kripke structure can in the worst case be exponential in |P|. Clearly, the computational complexity of the model requires further analysis. There are two approaches which are interesting. The ﬁrst is to try and identify fragments of the model which provide better complexity bounds. In the subclass of knowing-whether games we obtain bounds which match the known bounds for the corresponding questions in Boolean games. It is also known that in two player Boolean games where the goal formulas are restricted to Horn-renamable DNF, 2CNF or monotone CNF, the emptiness of Nash equilibrium can be checked in polynomial time (Bonzon et al., 2006). By modifying the arguments appropriately, we can identify subclasses of knowing-whether games in which the corresponding emptiness question can be solved in polynomial time. In general, any natural restriction of the logical speciﬁcation language which results in the corresponding model checking question to have better complexity is a promising fragment. The second approach would be to identify the speciﬁc parameters within the model which contribute to the exponential complexity bounds. Some of the natural candidates are the number of players and the number of variables used in the goal formulas. Since the hardness results given in Theorem 36 and Theorem 44 are for two and three players respectively, bounding the number of players alone is not suﬃcient. Analysis of fragments where the number of variables in the goal formulas are bounded appears to be a promising research direction which require more careful study. Analysing the lower bounds in the case of pessimist and optimist outcome relations is another question which is relevant. In Section 5.1, we analyse existence of Nash equilibria in knowing-whether games, and in Section 5.2 we identify conditions based on positive/negative epistemic assertions which ensure existence of Nash equilibria. Identifying other fragments where Nash equilibria are guaranteed to exist is an obvious direction of future research. It would be particularly interesting if the existence result can be related to structural properties of the underlying game. Extensions of the Model. There are many extensions of the model which are interesting for further research. One could imagine a whole and ever widening range of qualitative incomplete information games of imperfect information. For the strategies, instead of merely revealing the value of propositional variables, we could consider revealing the value of any epistemic proposition, as already considered in ( Agotnes & van Ditmarsch, 2011; Agotnes et al., 2011) for more complex, arbitrary, Kripke models. Instead of having merely partitions (exhaustive and exclusive) of all variables, one could consider overlapping sets of variables (exhaustive but not exclusive, so more than one player may observe the same variables)7 as for example employed in (Belardinelli, Grandi, Herzig, Longin, Lorini, Novaro, & Perrussel, 2017). Doing the same for Boolean games would create the possibility of conﬂict, as not more than one player can control the value of a variable. But as many agents as you wish can make the same observation. Another interesting extension to explore would be to consider iterated Boolean observation games, wherein players can gradually reveal more and more of their variables. This would be a generalization similar to that already studied for Boolean games in (Gutierrez

7. Kindly suggested by Paul Harrenstein.

Van Ditmarsch & Simon

et al., 2015, 2016). It would involve epistemic temporal goals or dynamic epistemic goals. Diﬀerent from iterated Boolean games, in iterated Boolean observation games one can only reveal more and more variables in every round, until all have been revealed. This should therefore considerably reduce the complexity of iterated Boolean observation games with respect to otherwise comparable iterated Boolean games. Yet another relevant direction is (epistemic) incentive engineering in Boolean observation games, similar to what is studied in Boolean games (Wooldridge, Endriss, Kraus, & Lang, 2013; Turrini, 2013; Harrenstein et al., 2017).

Acknowledgments

We thank the reviewers for their valuable comments which helped improve the presentation of the paper. Hans van Ditmarsch gratefully acknowledges a stay at IIT Kanpur initiating this research collaboration, during his temporary CNRS aﬃliation in 2018 at UMI Re La X, Chennai, India. Sunil Simon was partially supported by the grant CRG/2022/006140.

Appendix A. Dynamic Epistemic Logic

A.1 Proof in Section 2.2

Proposition 5. For all ϕ LKw, valuations v, and strategy proﬁles s: v, s |= ϕ iﬀs |= ϕ.

Proof. The proof is by induction on the structure of Kw formulas in negation normal form (LKw nnf ). The direction from right to left is by deﬁnition. For the direction from left to right we proceed as follows. Case atom: v, s |= Kwipj, iﬀ(for all w s i v, w, s |= pj iﬀv, s |= pj), iﬀ(for all w with w Pi(s) = v Pi(s), w, s |= pj iﬀv, s |= pj), iﬀpj Pi(s). As v no longer appears in the ﬁnal statement, v is arbitrary. Therefore, the initial statement v, s |= Kwipj is equivalent to for all w V , w, s |= Kwipj, in other words, to s |= Kwipj. Case negated atom: v, s |= Kwipj, iﬀ(there are w, x V with w s v and x s v and such that w, s |= pj and x, s |= pj), iﬀ(there are w, x V with w Pi(s) = x Pi(s) = v Pi(s) and such that w, s |= pj and x, s |= pj), iﬀpj / Pi(s). As in the previous case, the ﬁnal statement is independent from v and therefore the initial statement is equivalent to s |= Kwipj. Case conjunction: v, s |= α β, iﬀv, s |= α and v, s |= β, iﬀ(IH) s |= α and s |= β, iﬀ s |= α β. Case disjunction: v, s |= α β, iﬀv, s |= α or v, s |= β, iﬀ(IH) s |= α or s |= β, iﬀ s |= α β.

A.2 Strategies as Epistemic Actions

In this section we compare our modelling and our results with related work in epistemic logic. We model strategy proﬁles as epistemic actions in a dynamic epistemic logic, where we also discuss an alternative semantics of strategies resulting in far larger models. The alternatives can be compared on their game theoretical implications, which may help to motivate our preference.

Boolean Observation Games

The situation wherein each player only observes the value of its own variables, corresponds to a Kripke model where the accessibility relation is the initial observation relation, and a strategy proﬁle corresponds to an action model that, when executed in this Kripke model, results in an updated model wherein the accessibility relation is the observation relation (for that strategy proﬁle). In this section we make precise how. It may serve to illustrate that our setting is very simple. This was why we were able to obtain modelling and computational results for Boolean observation games that are close or analogous to those for Boolean games. An epistemic model (Kripke model) M is a triple (W, , π) where W is an (abstract) domain of worlds or states, where is a collection of equivalence relations on W, one for each agent, denoted a (also known as indistinguishability relations), and where π is a valuation (function) mapping each state w W to the subset of the propositional variables P that are true in that state. A pointed epistemic model (M, w) is a pair consisting of an epistemic model and a state w W. Now consider the situation in our observation games where each of n players 1, . . . , n only observes the value of its own variables Pi, but before they enact/play a strategy si. We have implicitly modelled this as the strategy proﬁle s wherein no player reveals any variable. We can identify this situation with the following epistemic model.

The initial observation model (IM , v), where IM = (V, , π), is such that:

domain V is the set of valuations of P (V = P(P));

for each player i N and valuations v, w V , v i w iﬀv Pi = w Pi;

for each v V , π(v) = v.

Note that the relations are exactly as in interpreted systems (Fagin, Halpern, Moses, & Vardi, 1995b). Similarly, the result of playing strategy proﬁle s S given valuation v V of observed variables, corresponds to an updated epistemic model.

The observation model (IM s, v), where IM s = (V, s, π), is such that V and π are as for IM , whereas in this case v s i w iﬀv Pi(s) = w Pi(s).

We recall that Pi(s) = {p P | there is a j N with p sj(i)}, the variables revealed to i in s, where by deﬁnition Pi(i) = Pi so that always Pi Pi(s). Surely more interestingly, we can model a strategy proﬁle as an independent semantic primitive namely as an action model U such that

v, s |= ϕ iﬀIM U, (v, s) |= ϕ

where the former is the satisfaction relation in our logical semantics for LK and the latter is the satisfaction relation in action model semantics. In order to establish that we ﬁrst need to deﬁne action models and their execution (following details as in (Baltag, Moss, & Solecki, 1998; van Ditmarsch, van der Hoek, & Kooi, 2008; Moss, 2015)). An action model U is a triple (E, , pre) where E is a domain of actions, for each player i = 1, . . . , n, i is an equivalence relation on E, and pre is a precondition function mapping each action e E to an executability precondition pre(e) that is a formula in some logical

Van Ditmarsch & Simon

language L. The execution of an action model in an epistemic model M = (W, , π) is then deﬁned as the restricted modal product M U = (W , , π ) where W = {(w, e) | w W, e E, M, w |= pre(e)}, where (w, e) i (w , e ) iﬀw i w and e i e , and where π (w, e) = π(w). In the case of strategy proﬁles for observation games, the logical language of action model preconditions can be restricted to LB, the Booleans (the language required to describe preconditions is therefore simpler than the language LK to describe epistemic goals), and a rather simple action model corresponds to a strategy proﬁle s. A strategy proﬁle can be identiﬁed with the following action model. In the deﬁnition, δv LB is the description of the valuation v, deﬁned as δv := V p v p V p/ v p.

A strategy proﬁle action model Us is a triple (V, s, pre) where the set of actions is the set of valuations V , where for each i = 1, . . . , n, v s i w iﬀv Pi(s) = w Pi(s), and where for each action v V , pre(v) = δv.

The domain of the strategy proﬁle action model is therefore the same as the domain of an observation model, namely the set of all valuations. In can be veriﬁed that

IM Us is isomorphic to IM s.

This is fairly elementary. We note that each action can only be executed in a single world IM , v |= δv, so that the size of IM s is the same as the size of IM . Then, (v, v) i (w, w) iﬀ, by deﬁnition of action model execution, v i v (in IM ) and v s i w (in Us), iﬀ, by deﬁnition of these relations, v Pi = w Pi and v Pi(s) = w Pi(s). As the latter is a reﬁnement of the former, the desired result that v Pi(s) = w Pi(s) follows. Finally, π (v, v) = π(v) = v. And the valuations π do not change. In fact, already Us is isomorphic to IM s (slightly abusing the notion, but when we identify valuations with their description). It should be noted that it is common that action models are isomorphic to updated models when executed in initial models consisting of all valuations (and representing some sort of initial maximal ignorance over those valuations). As a word of warning: the actions that are the points in our action model Us do not correspond to the strategies, that are sometimes also called actions. The action model action combines the strategies of all players simultaneously, so they rather correspond to strategy proﬁles.

More Succinct Action Models. A slightly more succinct modelling of strategy proﬁles as action models is conceivable, that is a quotient of the action model Us deﬁned above with respect to variables that are not revealed by any player. Let us call this set P s, that is therefore deﬁned as the complement of the set P s := {p P | i, j [1..n], i = j, p si(j)}. We can now redeﬁne Us small as (P(P s), s, pre) where in this case for any v, w P s (so for partial valuations of atoms revealed by some agent only), v i w iﬀv Pi(s) = w Pi(s). This looks the same as before, but note that Pi(s) may involve far more variables, namely in P s, than v and w, that are both restricted to P s. Also, still pre(v) = v for all v P s

(and where pre( ) = in case P s = ). Again, it is elementary to show that IM Us small is isomorphic to IM s. We now have that IM , w |= pre(v) iﬀv w. But in this case Us small is typically smaller than the resulting

Boolean Observation Games

updated model IM s. The resulting IM s, as before, has the same domain as the initial model IM . We now have, for example, that the action model corresponding to the reveal nothing strategy proﬁle s is the trivial singleton action model Us small with precondition (as P s = ), and in this case IM Us small is isomorphic to the initial observation model IM again: the relations i have not changed.

A.3 Strategies for Weaker Observations give Bigger Models

In our modelling, it is common knowledge to all players what variables have been revealed by who and to whom: the strategy proﬁle s is common knowledge after the fact . But, although I therefore know what variables are revealed by other players to yet other players, I still have not learnt the values of these variables.

For example: After player 1 reveals atom p1 to player 2 and atom q1 to player 3, player 2 knows whether p1 and player 3 knows whether q1. Also, player 2 knows that player 3 knows whether q1, and player 3 knows that player 2 knows whether p1.

In a diﬀerent modelling, each player only learns what variables have been revealed by other players to herself, and what variables she reveals to others.

For example: After player 1 reveals atom p1 to player 2 and atom q1 to player 3, player 2 knows whether p1 and player 3 knows whether q1. However, player 2 does not know that player 3 knows whether q1, and player 3 does not know that player 2 knows whether p1. Player 2 also considers it possible that no variable has been revealed to 3, in which case 3 does not know whether q1. And similarly for player 3.

So, clearly, depending on which modelling one prefers, diﬀerent goal formulas γ of observation games may be satisﬁed, and it will therefore aﬀect the existence of Nash equilibria and what the optimal strategies are. Let us ﬁrst formalize this as an action model, and let us be explicit about the (rather diﬀerent) updated model as well. The strategies si and proﬁles s = (s1, . . . , sn) remain the same, and thus also the Pi(s), the set of atoms revealed to agent i. However, we can no longer deﬁne an updated observation model as one wherein only the indistinguishability relations have been changed, namely as v s i w iﬀv Pi(s) = w Pi(s), while keeping the domain (and the valuation). Instead of models consisting of valuations (domain V ) we now need much larger models consisting of pairs (v, t) for valuations v and proﬁles t (domain V S) and deﬁne:

For all v, v V and for all s, t, t S and for all players i N: (v, t) s i (v , t ) if v Pi(s) = v Pi(s) [same valuation inasfar observed], ti = t i = si [same variables revealed to others], and Pi(s) = Pi(t) = Pi(t ) [same variables revealed by others to you].

Van Ditmarsch & Simon

As a consequence, we cannot describe the initial observation model as the one wherein s is executed, because that would still blow up the model and introduce maximal uncertainty about what is revealed by who. So the initial observation model IM needs to be given separately (namely as the model already deﬁned in Appendix A.2). However, once this is done, that is all. An action model can also be given for this modelling. In this alternative modelling the players would remain far more ignorant about other players: optimist expected outcome would be more optimist, pessimist expected outcome would be more pessimist, realist expected outcome would quantify over a far larger set of possible outcomes. Basically, any epistemic feature is diluted. It therefore appeared to us that our preferred modelling provides more interesting results and variations. Beyond that, the envisaged iterated Boolean observation games would become less meaningful for such strategies encoding weaker observations, as a player remains unaware of other players increasing knowledge over such iterations, unless as a consequence of that player informing those other players.

Appendix B. Representation and Complexity

Theorem 35. Given α LK along with a strategy proﬁle s S and a valuation v V , checking if v, s |= α is PSPACE-complete.

Proof. The membership in PSPACE is straightforward. For PSPACE-hardness, we give a reduction from Quantified Boolean Formula (QBF) which is a canonical PSPACEcomplete problem (Papadimitriou, 1994). A QBF instance consists of a formula of the form Q1x1Q2x2 . . . Qnxn ψ(x1, x2, . . . , xn) where every Qi is either a or quantiﬁer, every xi is a propositional variable and ψ(x1, x2, . . . , xn) is a Boolean formula over the variables x1, . . . , xn. From the deﬁnition, it follows that every QBF instance is either true or false (irrespective of the valuation under which it is evaluated). Given an instance ϕ = Q1x1Q2x2 . . . Qnxn ψ(x1, x2, . . . , xn) of QBF, we associate with each variable xi, a player i (thus N = {1, . . . , n}) and let P = {x1, . . . , xn}. We use the following notation introduced in Section 4.2: for i = 1, . . . , n 1 let i+ := i+1 and n+ := 1. For all i N, let Pi = {xi+} and let s i denote the strategy where player i reveals xi+ to all players except player i+. That is, s i (i+) = and s i (j) = Pi for all j = i+. Let αϕ LK be the formula obtained from ϕ by replacing all occurrence of xi by Ki and all occurrence of xi by Ki . Let v = denote the valuation that assigns all variables the value false. We show that the QBF instance ϕ is true iﬀv , s |= αϕ. We ﬁrst argue that for all QBF instances ϕ and for all valuations v over P, v |= ϕ iﬀ v, s |= αϕ. The proof is by induction on the structure of ϕ and the non-trivial cases involve quantiﬁers. Suppose ϕ = xiψ so that αϕ = Kiαψ, then

v |= xiψ iﬀ for all valuations u where u (P \ {xi}) = v (P \ {xi}), u |= ψ iﬀ for all valuations u where u (P \ {xi}) = v (P \ {xi}), u, s |= αψ iﬀ for all u where u s i v we have u, s |= αψ iﬀ v, s |= Kiαψ.

Since all variables in the QBF instance ϕ are bound, we have the following. ϕ is true iﬀv |= ϕ iﬀv , s |= αϕ. The claim then follows from the PSPACE-completeness of QBF.

Boolean Observation Games

Agotnes, T., Harrenstein, P., van der Hoek, W., & Wooldridge, M. (2013a). Veriﬁable equilibria in Boolean games. In Proc. of 23rd IJCAI, pp. 689 695.

Agotnes, T., Harrenstein, P., van der Hoek, W., & Wooldridge, M. (2013b). Boolean games with epistemic goals. In Proc. of 4th LORI, pp. 1 14. LNCS 8196.

Agotnes, T., van Benthem, J., van Ditmarsch, H., & Minica, S. (2011). Question-Answer games. Journal of Applied Non-Classical Logics, 21(3-4), 265 288.

Agotnes, T., & van Ditmarsch, H. (2011). What will they say? - Public announcement games. Synthese, 179(S.1), 57 85.

Amor, N., Fargier, H., & Sabbadin, R. (2017). Equilibria in ordinal games: A framework based on possibility theory. In Proc. of the 26th IJCAI, pp. 105 111.

Apt, K. R., & Gr adel, E. (Eds.). (2011). Lectures in Game Theory for Computer Scientists. Cambridge University Press.

Apt, K. (2011). A primer on strategic games. In Apt, K. R., & Gr adel, E. (Eds.), Lectures in Game Theory for Computer Scientists, pp. 1 37. Cambridge University Press.

Aziz, H., & Savani, R. (2016). Hedonic games, chap. 15, pp. 356 376. Handbook of Computational Social Choice. Cambridge University Press.

Bakhtiari, Z., van Ditmarsch, H., & Saﬃdine, A. (2019). How does uncertainty about other voters determine a strategic vote?. Studies in Logic, 12 (3), 32 56.

Baltag, A., Moss, L., & Solecki, S. (1998). The logic of public announcements, common knowledge, and private suspicions. In Proc. of 7th TARK, pp. 43 56. Morgan Kaufmann.

Belardinelli, F., Grandi, U., Herzig, A., Longin, D., Lorini, E., Novaro, A., & Perrussel, L. (2017). Relaxing exclusive control in boolean games. In Lang, J. (Ed.), Proc. of 16th TARK, Vol. 251 of EPTCS, pp. 43 56.

Bonzon, E., Lagasquie-Schiex, M.-C., & Lang, J. (2009). Dependencies between players in Boolean games. Int. J. Approx. Reasoning, 50(6), 899 914.

Bonzon, E., Lagasquie-Schiex, M., Lang, J., & Zanuttini, B. (2006). Boolean games revisited. In Proc. of 17th ECAI, pp. 265 269. IOS Press.

Bradﬁeld, J., Gutierrez, J., & Wooldridge, M. (2016). Partial-order boolean games: informational independence in a logic-based model of strategic interaction. Synthese, 193, 781 811.

Cai, Y., & Daskalakis, C. (2011). On minmax theorems for multiplayer games. In Proceedings of the SODA 11, pp. 217 234. SIAM.

Chatterjee, K., Doyen, L., Henzinger, T., & Raskin, J. (2007). Algorithms for omega-regular games with imperfect information. Logical Methods in Computer Science, 3(4).

Conitzer, V., Walsh, T., & Xia, L. (2011). Dominating manipulations in voting with partial information. In Proc. of AAAI.

Van Ditmarsch & Simon

Cruz, J., & Simaan, M. (2000). Ordinal games and generalized nash and stackelberg solutions. Journal of Optimization Theory and Applications, 107, 205 222.

Dunne, P., & van der Hoek, W. (2004). Representation and complexity in Boolean games. In Proc. of JELIA, pp. 347 359. LNCS 3229.

Dunne, P., & Wooldridge, M. (2012). Towards tractable Boolean games. In Proc. of 11th AAMAS, p. 939 946.

Durieu, J., Haller, H., Qu erou, N., & Solal, P. (2008). Ordinal games. IGTR, 10(2), 177 194.

Fagin, R., Halpern, J., Moses, Y., & Vardi, M. (1995a). Reasoning About Knowledge. The MIT Press.

Fagin, R., Halpern, J., Moses, Y., & Vardi, M. (1995b). Reasoning about Knowledge. MIT Press.

Gutierrez, J., Harrenstein, P., Perelli, G., & Wooldridge, M. (2016). Expressiveness and Nash equilibrium in iterated Boolean games. In Proc. of AAMAS, pp. 707 715. ACM.

Gutierrez, J., Harrenstein, P., & Wooldridge, M. (2015). Iterated Boolean games. Inf. Comput., 242, 53 79.

Gutierrez, J., Murano, A., Perelli, G., Rubin, S., & Wooldridge, M. (2017). Nash equilibria in concurrent games with lexicographic preferences. In Proceedings of the Twenty-Sixth International Joint Conference on Artiﬁcial Intelligence, IJCAI-17, pp. 1067 1073.

Halpern, J., & Vardi, M. (1991). Model Checking vs. Theorem Proving: A Manifest, chap. 10, p. 151 176. Academic Press Professional, Inc.

Harrenstein, P. (2004). Logic in Conﬂict. Logical Explorations in Strategic Equilibrium. Ph.D. thesis, Utrecht University.

Harrenstein, P., Turrini, P., & Wooldridge, M. (2017). Characterising the manipulability of Boolean games. In Proc. of 26th IJCAI, pp. 1081 1087.

Harrenstein, P., van der Hoek, W., Meyer, J.-J., & Witteveen, C. (2001). Boolean games. In van Benthem, J. (Ed.), Proc. of the 8th TARK, pp. 287 298, San Francisco. Morgan Kaufmann.

Harsanyi, J. (1967 1968). Games with incomplete information played by Bayesian players, Parts I, II, and III. Management Science, 14, 159 182, 320 334, 486 502.

Hearn, R., & Demaine, E. (2009). Games, Puzzles and Computation. CRC Press.

Herzig, A., Lorini, E., Maﬀre, F., & Schwarzentruber, F. (2016). Epistemic Boolean games based on a logic of visibility and control. In Proc. of 25th IJCAI, pp. 1116 1122.

Ianovski, E., & Ong, L. (2014). guaranteenash for Boolean games is NEXP-Hard. In Proc. of 14th KR, pp. 208 217. AAAI Press.

Janovskaya, E. (1968). Equilibrium points in polymatrix games. Litovskii Matematicheskii Sbornik, 8, 381 384.

Kearns, M., Littman, M., & Singh, S. (2001). Graphical models for game theory. In Proc. Seventeenth Conference on Uncertainty in Artiﬁcial Intelligence, pp. 253 260.

Keynes, J. (1921). A Treatise on Probability. Macmillan and Co., London.

Boolean Observation Games

Moss, L. (2015). Dynamic epistemic logic. In van Ditmarsch, H., Halpern, J., van der Hoek, W., & Kooi, B. (Eds.), Handbook of epistemic logic, pp. 261 312. College Publications.

Osborne, M., & Rubinstein, A. (1994). A Course in Game Theory. MIT Press.

Papadimitriou, C. (1994). Computational Complexity. Addison-Wesley.

Parikh, R., Tasdemir, C., & Witzel, A. (2013). The power of knowledge in games. IGTR, 15(4).

Savage, L. (1951). The theory of statistical decision. Journal of the American Statistical Association, 46 (253), 55 67.

Turrini, P. (2013). Endogenous Boolean games. In Proc. of 23rd IJCAI, p. 390 396. AAAI Press.

van Benthem, J. (2001). Games in dynamic epistemic logic. Bulletin of Economic Research, 53(4), 219 248.

van Ditmarsch, H. (2002). Descriptions of game actions. Journal of Logic, Language and Information, 11, 349 365.

van Ditmarsch, H., Lang, J., & Saﬃdine, A. (2013). Strategic voting and the logic of knowledge. In Proc. of 14th TARK Chennai.

van Ditmarsch, H., van der Hoek, W., & Kooi, B. (2008). Dynamic Epistemic Logic, Vol. 337 of Synthese Library. Springer.

Wald, A. (1945). Statistical decision functions which minimize the maximum risk. The Annals of Mathematics, 46(2), 265 280.

Wooldridge, M., Endriss, U., Kraus, S., & Lang, J. (2013). Incentive engineering for Boolean games. Artiﬁcial Intelligence, 195, 418 439.