# Computing Social Behaviours Using Agent Models

Paolo Felli, Tim Miller, Christian Muise, Adrian R. Pearce, Liz Sonenberg
Department of Computing and Information Systems, University of Melbourne
{paolo.felli,tmiller,christian.muise,adrianrp,l.sonenberg}@unimelb.edu.au

Agents can be thought of as following a social behaviour, depending on the context in which they are interacting. We devise a computationally grounded mechanism to represent and reason about others in social terms, reflecting the local perspective of an agent (first-person view), to support both stereotypical and empathetic reasoning. We use a hierarchy of agent models to discriminate which behaviours of others are plausible, and to decide which behaviour for ourselves is socially acceptable, i.e. conforms to the social context. To this aim, we investigate the implications of considering agents capable of various degrees of theory of mind, and discuss a scenario showing how this affects behaviour.

## 1 Introduction

Performance in many complex systems depends on the coordinated activity of a team of individuals, often characterised by complex decision tasks in rapidly evolving environments where explicit communication may be limited. Research in team performance suggests that human teams coordinate activities more effectively and achieve better overall task performance when team members manage to track each other's beliefs, intentions and task-related states [Klimoski and Mohammed, 1994; Mohammed et al., 2010; Espevik et al., 2011]. This intuition was captured by the notions of shared mental models and team mental models [Johnson-Laird, 1983; Cannon-Bowers et al., 1993; Bolstad and Endsley, 1999]. The idea is that information is organised in structured patterns and processed in a rapid and flexible manner, to describe, explain and predict the system behaviour as well as the ramifications of potential decisions prior to action. Moreover, various degrees of rich interaction depend not only on the ability to represent mental models (to attribute beliefs, desires, pretending, etc., to oneself and others), but also on understanding that others have mental states that are different from one's own. This is often called theory of mind (ToM) [Premack and Woodruff, 1978; Scassellati, 2002; Isaac and Bridewell, 2014], and it is a widely studied phenomenon in many disciplines [Bergwerff et al., 2014; Ficici and Pfeffer, 2008]. The ability to take the perspective of others is crucial, and studies of human-robot interaction and social robotics identify the need for human-oriented perception [Lemaignan et al., 2010; Warnier et al., 2012]. In the context of agent systems, too, we have seen in recent years a growing demand for more realistic social behaviour [Kaminka, 2013; Dignum et al., 2014], so one critical feature becomes the ability to represent others in social terms. Giving agents an awareness of their social reality will enable more seamless interdependent collective behaviour [Dignum et al., 2014; Johnson et al., 2014], where interdependency informally means that one agent's deliberation depends on what another agent does (or intends to do), and vice versa. We therefore need to investigate computational structures that allow agents to reason not just about themselves, but also about the so-called social reality [Dignum et al., 2014].
There has been considerable work on the design of intelligent agents and on reasoning about their own knowledge and belief as well as that of others, e.g., [Levesque, 1984; Lakemeyer, 1986; Fagin et al., 1995; Wooldridge and Lomuscio, 2001; Ditmarsch et al., 2007], typically for devising some form of strategy, or plan, to achieve goals. In real-life scenarios, however, agents must deal with a high degree of uncertainty, and although humans routinely interact successfully in limited-cue conditions, this process requires complex, often multimodal, exchange of information. A critical perspective has thus to be assumed, one that can discern what is plausible from what is not. Existing work, e.g. [Bulling and Jamroga, 2007; Andersen et al., 2014], assumes that plausible traces are given as part of the model, rather than constructed. The contribution of this paper is:

- a representation of the model that agents have of each other, and of their nested beliefs. An agent can use its own model for itself, yet use different representations and inference mechanisms for others. It can simulate others to deliberate, empowering interdependence and awareness;
- a computational mechanism to use these representations to discriminate which behaviours of others are plausible, given the context, and to decide which behaviour for ourselves is socially acceptable (conforms to the context).

Crucially, we preserve an agent's local perspective (first-person view), instead of considering an omniscient observer (third-person view). We support two types of reasoning about others: stereotypical reasoning, using simple social rules, and empathetic reasoning, in which the agent casts itself into the mind of another agent and reasons as if it were them. In Section 2 we present a running scenario. In Section 3 we introduce our notion of agent models, and in Section 4 we model our scenario in this framework. In Section 5 we formally define the two types of reasoning above, and in Section 6 we give our definition of acceptable behaviour, given a social context. In Section 7 we comment on future work.

## 2 The Wumpus Quest

In recent work (e.g., [Bergwerff et al., 2014; Ficici and Pfeffer, 2008]), authors studied agents with the cognitive ability to use ToM in (possibly iterated) one-step games. Here, we present a scenario attempting to bring together some strategic and social features, inspired by the Wumpus Hunt:

The lord of a castle is informed by a peasant that a Wumpus is dwelling in a dungeon nearby. It is known that the Wumpus can be killed by one hunter alone only if asleep; if awake, two hunters are required. The lord then tasks the peasant to go and fetch the White Knight, his loyal champion, and hunt down the beast together. The White Knight is known for being irreprehensible, trustworthy and brave; however, the peasant does not know any knight, nor what they look like. While looking for the White Knight, he runs into the Black Knight and, believing him to be the White Knight, tells him about the quest.

There is some additional information that needs to be taken into account: on the one hand, the knight knows how a Wumpus can be killed by two hunters, but he is aware that a simple peasant may get scared by the thought of confronting an awake Wumpus. Also, the peasant cannot hunt and is unable to see whether the Wumpus is awake (he cannot approach unnoticed), but the knight can.
Therefore it is not clear to him whether the peasant can be of any help to the quest. On the other hand, the knight is aware of the misunderstanding: he knows that the peasant attributes to him all the good qualities of the White Knight, so the peasant is confident that the knight won't put him in danger whenever possible. The implicit and explicit information of this scenario does not allow a unique understanding of the context, and it is not clear how each agent can use such information. While on the road, they agree on a protocol: they will enter the dungeon from two sides, and the knight will use a whistle to signal whether the Wumpus is awake; then they will attack.

## 3 Agent models

To allow one agent to reason about others in a social context, we provide agents with agent (mental) models. An agent is able to assign such models to others and to itself (from some fixed collection), so that, when considering all possible eventualities, it is capable of determining its behaviour based on plausible estimates of others' behaviour. These models can be used in orthogonal ways: they can describe either specific agents, or agents whose role, in the present context, is more characterising than their intimate understanding. This is the case, for example, of a bank clerk or a policeman, who can be modelled as members of a reference group (role or archetype) [Dignum et al., 2014]. This latter representation is akin to the stereotypical reasoning of humans, who do not necessarily engage in deep cognitive thinking about others, but rely on habits and social practices [Brooks, 1991; Dignum et al., 2014]. Manipulations of these models and stereotypes enable shortcuts to be taken [Pfau et al., 2014]. Models are therefore partial and task-specific: a complete description of an individual, even when possible, would force us to consider numerous irrelevant details [McCarthy, 1992]. Instead, models can be specialised to characterise different levels of approximation, depending on the context.

Although we do not address a specific application domain for our framework, we nonetheless identify its applicability to human-agent teamwork, in which the social context can be considered to design agents capable of using, or simulating, human-like bounded reasoning, for example in domains where plausible estimates of others come from training (e.g. firefighters), previous interactions (e.g. cognitive assistants for pilots), or social practices (e.g. sports events).

Consider a set of concrete agent labels Ag = {1, . . . , n}. Let A be the set of virtual agent labels, obtained by concatenating labels in Ag, such that the first is unique. For instance, if Ag = {1, 2, 3}, then A can be a subset of {1, 11, 12, 13, 111, . . .}, as informally represented in Figure 1. We will use these indices to refer to the representation that each agent has of others; e.g., agent 121 denotes the representation, according to agent 1, that 2 has of 1 itself. For simplicity, we use a tree representation, and we will make use of tree terminology. We will informally refer to the agent at the root as the reasoning agent. Given i, j ∈ A, we write j ∈ ch(i) iff j ∈ i·Ag (j is a child of i), and i = pa(j) denotes that i is the parent of j. We regard concatenated labels as regular agent labels, i.e., we refer to A instead of Ag, and we assume that A is prefix-closed (i.e., if j ∈ A then any ancestor of j is in A as well). For simplicity, but without loss of generality, we exclude introspective agents such as 11. Finally, l(i) denotes the last concrete agent label of i; e.g., l(123) = 3 and l(121) = 1.

Figure 1: Example of the set A for concrete labels Ag = {1, 2, 3}.
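To make the label bookkeeping concrete, the following is a minimal sketch (in Python, not part of the paper) of the virtual-agent label tree, with labels encoded as strings of concrete labels; the helper names children, parent and last are ours and correspond to ch(·), pa(·) and l(·).

```python
# A minimal sketch of the virtual-agent label tree of Section 3, using
# strings of concrete labels such as "1", "12", "121".

def children(i, Ag):
    """ch(i): labels i·k with k a concrete agent, excluding introspection (k == l(i))."""
    return [i + k for k in Ag if k != i[-1]]

def parent(j):
    """pa(j): drop the last concrete label; the root has no parent."""
    return j[:-1] if len(j) > 1 else None

def last(i):
    """l(i): the last concrete agent label of i, e.g. l('121') = '1'."""
    return i[-1]

def is_prefix_closed(A):
    """Check that every ancestor of every virtual label is also in A."""
    return all(parent(j) is None or parent(j) in A for j in A)

Ag = {"1", "2", "3"}
A = {"1", "12", "13", "121"}          # a hypothetical tree rooted at agent 1
assert is_prefix_closed(A)
assert parent("121") == "12" and last("121") == "1"
assert "12" in children("1", Ag)
```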
Intuitively, an agent model is a finite-state machine, whose states are called configurations and whose transitions are labelled with observation symbols. This kind of modelling is in line with much of the literature on knowledge representation for multi-agent systems [Fagin et al., 1995; Parikh and Ramanujam, 1985], in which the notion of computational local states is given prominence. All the information an agent has at its disposal (facts, observations, etc.) is captured in the state relating to the agent in question. Although we opted for an explicit representation, a rule-based one is also possible. We denote the unique set of all possible configurations with C. As we want an agent to be able to represent and reason about a group Ag, we assume a set of equivalence relations ∼j ⊆ C × C, one for each j ∈ Ag, plus ∼0 ⊆ C × C, such that if c ∼j c′ then these configurations hold the same information about agent j, and if c ∼0 c′ then they hold the same local information, i.e. about self and the world. Hence, c is said to be locally indistinguishable from c′ iff c ∼0 c′. These arbitrary relations depend on the specific set of configurations (e.g. internal language) considered.

Definition 1. Given C, an agent model is a tuple Mi = ⟨Ci, O, τi, ωi, pri, Act, prei⟩ where:

- Ci ⊆ C is a finite set of configurations;
- O is a finite alphabet of observations;
- τi ⊆ Ci × Ci is a transition relation: c′ ∈ τi(c) means that the agent can move from c to c′ by reasoning;
- ωi : Ci × O → Ci is a transition function: c′ = ωi(c, o) models the act of registering the received observation symbol o; therefore it is such that c ≁0 c′;
- pri : C → Ci is a function, termed projection function, that will be discussed later (see Definition 2);
- Act is a finite set of action labels, and prei : Ci → 2^Act is a function such that α is plausible in c iff α ∈ prei(c).

The function prei is used to model action preconditions, but also to express plausibility with respect to goals and known plans, as in BDI agents [Rao and Georgeff, 1991]. Given A and a library M, i.e. a finite set of agent models, a model assignment is a function R : A → M that assigns a model to each agent. We can imagine R to capture the present situation, and we assume it to be fixed. Given R and A, we call Γ = {Mj}j∈A the context, with Mj = R(j) for each j. A context captures the perspective of the reasoning agent. An agent model Mi, together with a configuration c ∈ Ci, is called a mental state, denoted Si = ⟨Mi, c⟩.

Consider a propositional setting (that we will use in our scenario) describing an agent's internal logic with objective language P. Let L be the language with grammar ϕ ::= ψ | Belj(ϕ) | ¬ϕ, where j ∈ Ag and ψ ∈ P. By writing ϕ, we represent the fact that the agent in question (say i) believes that formula ϕ is true, whereas Belj(ϕ) denotes the fact that the agent believes that agent j believes ϕ. Here, belief refers to a syntactic object denoting a fact regarded as true, with no assumed semantic properties. We do not require a belief base to be consistent or closed under logical implication, as such automaticity may be an overly optimistic representation of real believers. Then C can be taken as the infinite set of all possible belief bases over L and, e.g., we can define ∼j to be such that c ∼j c′ iff {ϕ | Belj(ϕ) ∈ c} = {ϕ | Belj(ϕ) ∈ c′}. Similarly, c ∼0 c′ iff the sets of formulas in c and c′ not of the form Belj(ϕ), for any j ∈ ch(i) in A, are the same.
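As an illustration of this propositional setting, here is a small sketch of configurations as belief bases together with the relations ∼j and ∼0; the tuple encoding of Belj(ϕ) and the helper names are assumptions made for this example, not the paper's implementation.

```python
# Configurations as belief bases over L, with Bel_j(φ) encoded as ("Bel", j, φ).

def about(c, j):
    """The formulas of c that concern agent j, i.e. {φ | Bel_j(φ) ∈ c}."""
    return {f[2] for f in c if isinstance(f, tuple) and f[:2] == ("Bel", j)}

def local_part(c, children):
    """The formulas of c not of the form Bel_j(φ) for any child j."""
    return {f for f in c
            if not (isinstance(f, tuple) and f[0] == "Bel" and f[1] in children)}

def sim_j(c1, c2, j):
    """c1 ∼j c2: same information about agent j."""
    return about(c1, j) == about(c2, j)

def sim_0(c1, c2, children):
    """c1 ∼0 c2: same local information (about self and the world)."""
    return local_part(c1, children) == local_part(c2, children)

c  = {"WA", ("Bel", "2", "W")}
c2 = {"WA", "ba", ("Bel", "2", "W")}
assert sim_j(c, c2, "2")                      # same beliefs ascribed to 2
assert not sim_0(c, c2, children={"2"})       # local information differs (ba)
```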
For the agent model Mi, Ci can be the set of allowed belief bases according to some syntactic restriction, e.g. the size of belief bases (memory). Also, τi and ωi can model a reasoning machinery for i, i.e. a set of axioms and deductive rules that, together with L, characterise a deductive system for i. This has some similarity with the notion of multilanguage systems [Giunchiglia and Serafini, 1994] and, in general, with resource-bounded agents [Alechina and Logan, 2009]. Similarly to the latter, these models can be used to capture agents with limited computational resources (humans), in which the cost of deliberation is considered [Isaac and Bridewell, 2014].

Definition 2. Consider two agents i, j such that j ∈ ch(i). Given a mental state Si = ⟨Mi, c⟩ for i, the mental state ascribed to agent j by i is Sj = ⟨Mj, prj(c)⟩.

We extend the definition to the case where j is not a child of i by trivially applying a chain of projections. This projection function allows us to retrieve the configuration that is currently considered by agent j (its current internal state, according to c). For this reason, we impose a constraint on each prj to agree with the notion of indistinguishability over C: prj(c) = prj(c′) if c ∼j c′, for each j ∈ A. In Section 5 we will make use of ascribed mental states to define the ability of reasoning as others. Considering the propositional setting above, we may choose to have prj(c) := {ϕ | Belj(ϕ) ∈ c} for each j, or we may define these functions to encode different representations of beliefs due to terminological or cognitive differences.

Definition 3. A context Γ is complete iff, for each i ∈ A: (1) for each vector of configurations ⟨cj⟩j∈ch(i), with cj ∈ Cj for each child j of i, there exists c ∈ Ci such that prj(c) = cj; and (2) for each c, c′ ∈ Ci, there exists c″ ∈ Ci holding the same information about children as c′, and about i as c; formally, c″ ∼0 c and c″ ∼j c′ for any j ∈ ch(i).

Γ is complete when any possible vector of configurations, one for each child, can be captured by a single configuration of the parent. From now on, we assume a complete Γ.
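Continuing the same sketch, the projection of Definition 2 can be realised, for the propositional setting, as prj(c) = {ϕ | Belj(ϕ) ∈ c}; chaining two projections yields the mental state ascribed to a grandchild such as 121. The configuration used below is hypothetical, chosen to resemble those of Figure 2.

```python
# A sketch of pr_j for belief-base configurations, with Bel_j(φ) encoded as
# ("Bel", j, φ). This encoding is an illustration, not the paper's implementation.

def project(c, j):
    """pr_j(c): the belief base ascribed to child j, i.e. {φ | Bel_j(φ) ∈ c}."""
    return frozenset(f[2] for f in c
                     if isinstance(f, tuple) and f[:2] == ("Bel", j))

# A hypothetical configuration of agent 1: the Wumpus is awake, and 1 believes
# that 2 believes that 1 believes the Wumpus is sleeping (cf. c24 in Figure 2).
c = frozenset({"WA", ("Bel", "2", ("Bel", "1", "WS")), ("Bel", "2", "W")})

assert project(c, "2") == frozenset({("Bel", "1", "WS"), "W"})
assert project(project(c, "2"), "1") == frozenset({"WS"})   # chained projection to 121
```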
## 4 Modeling the Wumpus Quest

Let us consider again the scenario from Section 2. As customary in the Wumpus World, the dungeon is represented by a grid. The Wumpus occupies one cell, and each cell adjacent to it has a stench. When the Wumpus is killed, a scream can be heard throughout the dungeon. For simplicity, we do not consider pits and breezes. The set of observation symbols is O = {s, a, bs, ba, d, sm}. The first two correspond to observing the state of the Wumpus (sleeping and awake), the second two to the signal from the knight (whether the Wumpus is awake); d is the observation of the Wumpus screaming, and sm of its smell. The set of actions is Act = {Mv, nil, At, Bs, Ba}, namely move, wait, attack, and signal that the Wumpus is asleep or awake. Assuming the propositional setting as in Section 3, we model the agents' beliefs with a set of propositions {bs, ba, W, WA, WS, dead, scared, dec, att}, where bs and ba represent beliefs about the fact that the agent in question received observations bs and ba; W represents the belief that the position of the Wumpus is known; WA, WS and dead represent beliefs about the state of the Wumpus (awake, sleeping or dead, respectively); scared, dec and att are believed if the agent is scared, intends to deceive the other, or is ready to attack the Wumpus, respectively.

Assume that the reasoning agent is the Black Knight (agent 1). He assigns to himself the model MBK, to agent 12 the model MP for the peasants' reference group, and to agent 121 the model MWK. These models are depicted in Figure 2, where transitions in each ωi do not have labels, and loops of the form ⟨c, o, c⟩ ∈ ωi are omitted.

Figure 2: The agent models MBK, MP, MWK, and the set of environment states.

The table below lists the configurations in C used in the agent models (where CB(·) stands for a syntactic representation of common belief); projections are defined accordingly.

- c1: ∅
- c2: WA
- c3: WS
- c4: WA, dec
- c5: WA, dec
- c6: WA, bs
- c7: WA, Bel2(CB(WS)), CB(att)
- c8: WA, ba
- c9: CB(dead)
- c10: CB(WA), CB(att)
- c11: WA, Bel2(Bel1(WA)), Bel2(W)
- c12: W, Bel1(WS), bs
- c13: W, Bel1(WS)
- c14: W, bs
- c15: W, ba
- c16: scared
- c17: WA, Bel2(scared), ba
- c18: WA, Bel2(Bel1(WS)), Bel2(W), bs
- c19: W, Bel1(WA)
- c20: WA, Bel2(Bel1(WA)), Bel2(W), ba
- c21: W, Bel1(WA), ba
- c22: WA, ba, Bel2(W), Bel2(ba), Bel2(Bel1(CB(att))), Bel2(Bel1(CB(WA)))
- c23: W, ba, Bel1(CB(WA)), Bel1(CB(att))
- c24: WA, Bel2(Bel1(WS)), Bel2(W)
- c25: W
- c26: WA, Bel2(Bel1(WS)), Bel2(W), dec
- c27: WA, Bel2(W)
- c28: WA, Bel2(W), dec
- c29: WA, Bel2(W), dec
- c30: WA, bs, Bel2(W), Bel2(bs)
- c31: WA, ba, Bel2(W), Bel2(ba)

Each configuration corresponds to a state, and each state is labelled with the set of plausible actions (here only singletons). Consider for example MP. In c1 the peasant has no relevant belief, and only action Mv is plausible. An agent adhering to this model will keep moving until the smell observation is received (transition ⟨c1, sm, c25⟩). When this happens, the peasant will wait for a signal. If ba is received, then the next configuration c15 contemplates two courses of thought: ⟨c15, c16⟩ and ⟨c15, c10⟩. In the first case the peasant will wait indefinitely (scared); otherwise it will attack. The rest of the configurations are omitted for brevity. Unreachable configurations will become reachable via empathetic reasoning, as we see in the next section. For instance, if the current mental state of agent 1 is S1 = ⟨MBK, c11⟩, then S12 = ⟨MP, c19⟩ and S121 = ⟨MWK, c2⟩.

To model any concrete example, we first need to define our notion of environment. A (nondeterministic) environment is a tuple E = ⟨E, e0, γ, perc⟩ where E is a finite set of environment states; e0 ∈ E is the initial state; γ ⊆ E × Act^|Ag| × E is a transition relation; and perc : E × Ag → O is an observation function that returns an observation o for each agent in Ag. As customary [Fagin et al., 1995; Parikh and Ramanujam, 1985; Wooldridge and Lomuscio, 2001], E is used to represent the physical world. Assume that the dungeon is as in the right side of Figure 2 (states e0-e8; we restrict its size for simplicity, and we consider only one position of the Wumpus). E.g., e0 represents the situation when both hunters are outside; e3 the one in which the Wumpus is asleep and the knight signalled so. The transition relation is defined such that the effect of actions is reflected in the position of the hunters and in the communication channel. To model the fact that the state of the Wumpus is unknown, we consider two transitions from e0 with actions α = ⟨Mv, Mv⟩, namely ⟨e0, α, e1⟩ and ⟨e0, α, e2⟩.
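A toy sketch of the environment tuple E = ⟨E, e0, γ, perc⟩ for the fragment just described: from e0, the joint action ⟨Mv, Mv⟩ nondeterministically leads to a state where the Wumpus is asleep or awake. Which of e1 and e2 is the asleep case, and the perception values assigned (the knight sees s or a, the peasant only smells sm), are our assumptions for illustration.

```python
# A minimal sketch of the nondeterministic environment of Section 4.

E_states = {"e0", "e1", "e2"}
e0 = "e0"

# γ ⊆ E × Act^|Ag| × E, here as a map from (state, joint action) to successor sets.
# The state of the Wumpus is unknown, so (Mv, Mv) from e0 has two outcomes.
gamma = {("e0", ("Mv", "Mv")): {"e1", "e2"}}

# perc : E × Ag → O. Assumption: e1 is the asleep case, e2 the awake one; only
# the knight (agent 1) can see the Wumpus, the peasant (agent 2) smells it.
perc = {
    ("e1", "1"): "s",  ("e1", "2"): "sm",
    ("e2", "1"): "a",  ("e2", "2"): "sm",
}

for succ in sorted(gamma[("e0", ("Mv", "Mv"))]):
    print(succ, perc[(succ, "1")], perc[(succ, "2")])
```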
## 5 Reasoning as and about others

We now characterise two types of reasoning: (1) stereotypical, when an agent represents and reasons about other agents and their beliefs; and (2) empathetic, when the agent casts itself into the mind of another, reasoning as the other would. The former is similar to that of standard Epistemic Logic [Fagin et al., 1995], in which every agent is homogeneous, while the latter uses different perspectives and inference mechanisms for others. Recalling the notion of ascribed mental state (Definition 2), we can now state these concepts formally:

1. An agent i with mental state ⟨Mi, c⟩ performs a stereotypical reasoning step about agent j ∈ ch(i) when it performs a transition ⟨c, c′⟩ ∈ τi with c ≁j c′. The transition captures the application of a stereotype about j.
2. An agent i performs an empathetic reasoning step as agent j whenever it performs a transition ⟨prj(c), c′⟩ ∈ τj, i.e., it progresses the ascribed mental state ⟨Mj, prj(c)⟩.

These are depicted in Figure 3 (left), where mental states in black may have changed, while gray ones are updated with respect to one child. In the first (above), agent 1 performs a transition c′1 ∈ τ(c1) and, as a consequence, the (implicit) mental state ascribed to 12 may change. In the second, agent 1 computes c12 = pr12(c1), then c′12 ∈ τ(c12), then c′1, as we will see. Computing such a c′1 in the case of empathetic reasoning is a nontrivial step: c′1 needs to hold the updated information about 12 as well as preserve the rest of the preexisting information from c1 that is about 1 or other children.

Figure 3: (left) Representation of stereotypical and empathetic steps. (right) Depiction of an expansion from c0 to cm.

With these two simple concepts at hand, we describe how an agent reasons by applying either or both of these strategies. Of importance here is that we are describing the state-space in which an agent reasons, but not its algorithmic process. In other words, an agent that reasons with a context Γ needs its own heuristics for deciding when and how to apply either form of reasoning, but we are agnostic with respect to this implementation. Instead, we define here the outcome of the reasoning through the notion of expansion. A mental path for i is a sequence of agent labels j0 · · · jm such that j0 = jm = i and, for every 0 ≤ ℓ < m, either jℓ+1 = jℓ, jℓ+1 ∈ ch(jℓ), or jℓ+1 = pa(jℓ). That is, a mental path is a path in the tree of A that starts and ends at the root. A path σ represents the mental steps of the reasoning agent when it directs its attention towards virtual agents, projecting the corresponding mental states, to identify a possible representation of the result of this simulated reasoning. Finally, an expansion of ⟨Mi, ci⟩ is a sequence of mental states σ = ⟨Mj0, c0⟩, ⟨Mj1, c1⟩, . . . , ⟨Mjm, cm⟩ such that c0 = ci, j0 · · · jm is a mental path for i and, for every ℓ < m, either:

- (a1) jℓ+1 = jℓ and cℓ+1 ∈ τjℓ(cℓ), i.e. a local transition;
- (a2) jℓ+1 ∈ ch(jℓ) and cℓ+1 = prjℓ+1(cℓ);
- (a3) jℓ+1 = pa(jℓ) and cℓ+1 ∼jℓ c for any c s.t. prj(c) = prj(lastj(σ, ℓ)) for each j ∈ ch(jℓ+1), where lastj(σ, ℓ) is the last configuration of agent j in the prefix of σ of length ℓ.

Point (a2) represents a top-down projection from jℓ to jℓ+1, and (a3) represents an inverse projection, which computes a configuration cℓ+1 holding the preexisting information about the parent and other children, plus the updated information about child jℓ. Points (a1)-(a3) can always be computed, as we assumed Γ to be complete. For example, a possible expansion from ⟨MBK, c22⟩ is σ = c22, c23, c16, c17, whose path is 1, 12, 12, 1. Figure 3 (right) depicts an expansion (dashed edges are local transitions, point (a1) above). Expansions allow an agent to retrieve a new configuration (like cm in Figure 3) through the thinking that happens along the path, but they do not consider observations.
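The mental-path condition above is easy to check mechanically; the sketch below does so for label strings, and reproduces the path 1, 12, 12, 1 of the example expansion σ = c22, c23, c16, c17.

```python
# A sketch of the mental-path condition of Section 5: the path starts and ends
# at the reasoning agent and each step either stays on the same virtual agent,
# moves to a child, or moves back to the parent.

def parent(j):
    return j[:-1] if len(j) > 1 else None

def is_mental_path(path, root):
    if not path or path[0] != root or path[-1] != root:
        return False
    for a, b in zip(path, path[1:]):
        stays = (b == a)
        down  = (len(b) == len(a) + 1 and b.startswith(a))   # b ∈ ch(a)
        up    = (parent(a) == b)                              # b = pa(a)
        if not (stays or down or up):
            return False
    return True

assert is_mental_path(["1", "12", "12", "1"], "1")   # path of the example expansion
assert not is_mental_path(["1", "121", "1"], "1")    # skips the intermediate level 12
```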
We now define a function that computes the configurations resulting from mental expansions, by first considering received observations. Given Γ and a model Mi, we compute the relation nexti ⊆ Ci × O^|A| × Ci such that ⟨c, ō, cm⟩ ∈ nexti iff:

- c′i = ωi(c, oi) and, for each j ∈ A with j ≠ i, c′j = ωj(prj(c′i), oj);
- cm is the last configuration of an expansion from ⟨Mi, c′⟩, where c′ is such that prj(c′) = c″j with c″j ∼0 c′j, for each j ∈ A.

The first item progresses each model separately; the second selects a new c′ for i capturing all the mental states by only looking at their local information, then returns an expansion.

## 6 Social autism and ToM

In this section we illustrate different degrees of social behaviours, and how they affect computational aspects of the collective execution. We imagine that the agent can simulate executions in its mind to foresee plausible evolutions. Given Γ and E, the set of global states for agent i is such that each global state g = ⟨e, c⟩ holds the (current) environment state and a configuration for agent i. If we define an action vector as a tuple of size |Ag| comprising an action for each concrete agent, then a sequence ρ = g0 ᾱ0 g1 ᾱ1 ..., alternating global states and action vectors, is termed a run for agent i on E. In what follows, to ease the notation, we assume that i is the reasoning agent, and we use j to quantify over A.

### Zero-order ToM: autistic agents

In the most basic setting, an agent is not socially intelligent: it does not consider models of other agents (although it may attribute beliefs to them), thus it is not able to reason as them. The typical approach for group strategies is to synthesise a plan where the group collectively achieves some objective. For this plan to be successful, each agent does not need to represent others: the synthesis of the plan is done externally, from a third-person view. Even if these agents may agree on a protocol, they are incapable of devising a new strategy autonomously when failure happens. Here A = {i} and Γ = {Mi}. Given a mental state ⟨Mi, c0⟩ for i, we say that ρ = g0 ᾱ0 g1 ᾱ1 ... is a feasible run for agent i over E iff g0 = ⟨e0, c0⟩ and, for each ℓ ≥ 0, gℓ+1 = ⟨eℓ+1, cℓ+1⟩ is such that:

- ᾱℓ_i is plausible in ⟨Mi, cℓ⟩;
- eℓ+1 ∈ γ(eℓ, ᾱℓ);
- cℓ+1 ∈ nexti(cℓ, ō), with ō = perc(eℓ+1, i).

The agent only considers its own actions and received observations, ignoring the actions and observations of others.

Example 1 (Wumpus Quest). If we consider all feasible runs as possible, then a strategy for agent i guaranteed to kill the Wumpus does not exist. The knight does not take into consideration the fact that the other may get scared at the thought of facing an awake Wumpus: there are two distinct courses of thought ⟨c15, c16⟩ and ⟨c15, c10⟩ in MP. Instead, MBK does not contemplate this, as the only path from c31 leads to the decision to attack: the knight, modelled as a zero-order ToM agent, is not able to foresee this.

### First-order ToM: socially aware agents

The second category of social behaviours arises when the agent has its own representation of others, and can reason as them. A socially aware agent is aware of the fact that others have different mental states, but it only assigns a model to the concrete agents: A is the set {i} ∪ i·Ag, i.e. i plus a virtual child for each other concrete agent. We say that ρ = g0 ᾱ0 g1 ᾱ1 ... is a plausible run of Γ over E for agent i iff g0 = ⟨e0, c0⟩ and, for each ℓ ≥ 0, gℓ+1 = ⟨eℓ+1, cℓ+1⟩ is such that:

- ᾱℓ_i is plausible in ⟨Mi, cℓ⟩;
- ᾱℓ_{l(j)} is plausible in ⟨Mj, prj(cℓ)⟩, for any j ∈ ch(i);
- eℓ+1 ∈ γ(eℓ, ᾱℓ);
- cℓ+1 ∈ nexti(cℓ, ō), with each oj = perc(eℓ+1, l(j)).

Note how l(j) is used to assign observations to virtual agents.
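As a schematic illustration (not the paper's algorithm), the plausible-run conditions for a first-order ToM agent can be checked as follows; the callables pre, project, gamma, next_i and perc stand for prei, prj, γ, nexti and perc, and are assumed to be supplied with the context Γ.

```python
# A sketch of the plausible-run check for a first-order ToM agent i: i's action
# must be plausible in its own configuration, the action of each concrete agent
# l(j) must be plausible in the mental state ascribed to the child j, the
# environment step must be allowed by γ, and the configuration must progress
# via next_i under the observations assigned through l(j).

def is_plausible_run(run, i, children, pre, project, gamma, next_i, perc):
    """run = [(e0, c0), a0, (e1, c1), a1, ...]; action vectors a are dicts keyed by concrete labels."""
    states, actions = run[0::2], run[1::2]
    for step, act in enumerate(actions):
        (e, c), (e_next, c_next) = states[step], states[step + 1]
        if act[i[-1]] not in pre(i, c):
            return False                                   # own action not plausible
        for j in children:                                 # j ∈ ch(i)
            if act[j[-1]] not in pre(j, project(c, j)):
                return False                               # implausible for the state ascribed to j
        if e_next not in gamma(e, act):
            return False                                   # impossible environment step
        obs = {j: perc(e_next, j[-1]) for j in [i] + children}
        if c_next not in next_i(c, obs):
            return False                                   # mental update not supported
    return True
```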
Example 2 (Wumpus Quest). A strategy now exists, and the runs it generates are plausible. Being able to assign MP to 12, agent 1 has a strategy: if he sees the Wumpus sleeping he will kill it without help. Otherwise, he will signal that it is instead asleep, deceiving the other into cooperation. Indeed, MP assumes that the peasant would not be reluctant to help in this case: the only course of thought (⟨c14, c10⟩ in MP) leads to common belief about the state of the Wumpus.

Figure 4: Two runs of the Wumpus Quest example.

### Second-order ToM agents

Although socially aware agents may show social behaviours and attribute mental states to others, they lack an evolved theory of mind, i.e. the ability to acknowledge that others reason in the same way, and therefore have expectations and models of them: socially aware agents do not model virtual agents. For second-order ToM agents, instead, A (and therefore Γ) includes two levels of the hierarchy: agent i, all of its children j ∈ ch(i), and (some of) their children j′ ∈ ch(j). However, to keep track of what others regard as plausible, we need to understand what it means to behave in an acceptable way. Our view is that the runs generated by the collective actions should appear plausible to each concrete agent. This requires (a) extending the size of action vectors to the size of A, and (b) a new definition of global states, in which we keep track of the environment state that each concrete agent regards as both possible and acceptable: G+i = E1 × · · · × E|Ag| × Ci, with one environment component per concrete agent. A run ρ is an acceptable run for i of Γ over E iff g0 = ⟨e0_1, . . . , e0_n, c0⟩ and, for each ℓ ≥ 0, gℓ+1 = ⟨eℓ+1_1, . . . , eℓ+1_n, cℓ+1⟩ is such that:

- each ᾱℓ_j is a plausible action in cℓ, for any j ∈ A;
- eℓ+1_i ∈ γ(eℓ_i, act(ᾱℓ, i));
- eℓ+1_j ∈ γ(eℓ_j, ⟨α1, . . . , α|Ag|⟩), where αi = ᾱℓ_{j·i} and αl(j) = ᾱℓ_j, for each j ∈ ch(i);
- perc(eℓ+1_i, l(j)) = perc(eℓ+1_j, l(j)) for each j ∈ ch(i);
- cℓ+1 ∈ nexti(cℓ, ō), such that oi = perc(eℓ+1_i, i) and oj = perc(eℓ+1_{j′}, l(j)), with j′ = pa(j), for any j ≠ i.

First, each action component is plausible in the ascribed mental state (first item). In the second item, act(ᾱ, i) is the vector of actions computed from ᾱℓ by only taking the plausible actions for i and each child j ∈ ch(i) in A, ordered by l(j); see the sketch below. E.g., if i = 1 and A = {1, 12, 121}, then act(⟨a1, a12, a121⟩, 1) = ⟨a1, a12⟩ and act(⟨a1, a12, a121⟩, 12) = ⟨a121, a12⟩. This is the environment state that agent i considers as plausible (and, as E is nondeterministic, more than one can exist). Similarly, in the third item, a new environment component is computed for each agent j ∈ ch(i), but the only actions forced to be plausible are those of j and i (indeed, we are not interested in assuming acceptable behaviours for agents other than i; in that case, the condition can be replaced with eℓ+1_j ∈ γ(eℓ_j, act(ᾱℓ, j))). In the fourth item, we compare the observations received if we take the environment state eℓ+1_i that the reasoning agent i considers as the next one with the state eℓ+1_j that each other agent j is expecting, i.e. what would be observed if their prediction (about the action selected by agent i) were correct. That is, each eℓ+1_j justifies perc(eℓ+1_i, l(j)) to agent j. Note that, in the last item, virtual agents observe the environment component of their parents. One such explanation is selected for each gℓ+1 and, at each future step, a new one needs to be found.
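As referenced above, here is a small sketch of the act(ᾱ, i) operator, with labels encoded as strings as in the earlier sketches; the assertions reproduce the example given in the text for A = {1, 12, 121}.

```python
# act(ᾱ, i): from a vector of actions indexed by the virtual labels in A, keep
# the action of i and of each child j ∈ ch(i), ordered by the concrete label l(j).

def act(alpha, i, A):
    """alpha: dict from virtual label to action; returns a tuple ordered by l(j)."""
    group = [i] + [j for j in A if len(j) == len(i) + 1 and j.startswith(i)]
    group.sort(key=lambda j: j[-1])            # order by the last concrete label l(j)
    return tuple(alpha[j] for j in group)

A = ["1", "12", "121"]
alpha = {"1": "a1", "12": "a12", "121": "a121"}
assert act(alpha, "1", A) == ("a1", "a12")
assert act(alpha, "12", A) == ("a121", "a12")
```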
When one such run exists, the observations received by all concrete agents, at each step, are justifiable; otherwise the behaviour of i violates Γ (in this case, it is unmasked). Therefore, acceptability depends on the observations that others receive. In the case of observable actions, this requires equality, but this is not true for private actions.

Example 3 (Wumpus Quest). The informal strategy in Example 2 does not always generate acceptable runs. Indeed, M121 = MBK tells (c3) that the White Knight would not ask for help if not needed. Hence, as the whistle is blown to signal that the Wumpus is sleeping when in reality it is awake (eℓ+1_1 = e5), the peasant can detect a misalignment with his expectations (a next environment state eℓ+1_12 = e7 in which the scream was heard), and hence will fail to find a justification for the signal: perc(e5, 2) ≠ perc(e7, 2), thus unmasking the knight. This run is depicted in Figure 4 (the run above; the other is one that leads the peasant to get scared, and the rest are omitted). If the knight shows enough empathetic attitude, he could realise, before acting, that his action may appear unjustifiable to 12. As we said, we are agnostic on his decision, but we are now capable of formalising this notion.

Finally, note that acceptable runs capture the view stated earlier. Given one acceptable run ρ for i, the run over G observed by j ∈ ch(i) is the sequence ρj = g0_j act(ᾱ0, j) g1_j act(ᾱ1, j) ..., with gℓ_j = ⟨eℓ_{l(j)}, prj(cℓ)⟩ for each ℓ ≥ 0.

Theorem 1. Given an acceptable run ρ for i, the run of Γ over E observed by each agent j ∈ ch(i) is plausible for j.

## 7 Conclusions and Future Work

Using agent models, we devised a computational mechanism to discriminate which behaviours of others are plausible, and to decide which behaviour for ourselves is acceptable. In future work, we will define different notions of acceptability, and investigate how to build ATL (Alternating-time Temporal Logic [Alur et al., 2002]) games to verify strategic abilities in this setting, and to synthesise strategies that conform to Γ, i.e. induce acceptable runs, also in scenarios of deception.

Acknowledgements. This research is partially funded by Australian Research Council Discovery Grant DP130102825, "Foundations of Human-Agent Collaboration: Situation-Relevant Information Sharing".

## References

- [Alechina and Logan, 2009] Natasha Alechina and Brian Logan. A logic of situated resource-bounded agents. Journal of Logic, Language and Information, 18(1):79–95, 2009.
- [Alur et al., 2002] Rajeev Alur, Thomas A. Henzinger, and Orna Kupferman. Alternating-time temporal logic. Journal of the ACM, 49(5):672–713, 2002.
- [Andersen et al., 2014] Mikkel Birkegaard Andersen, Thomas Bolander, and Martin Holm Jensen. Don't plan for the unexpected: Planning based on plausibility models. Logique et Analyse, 1(1), 2014.
- [Bergwerff et al., 2014] Gerben Bergwerff, Ben Meijering, Jakub Szymanik, Rineke Verbrugge, and Stefan M. Wierda. Computational and algorithmic models of strategies in turn-based games. In Proceedings of CogSci, 2014.
- [Bolstad and Endsley, 1999] Cheryl A. Bolstad and Mica R. Endsley. Shared mental models and shared displays: An empirical evaluation of team performance. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, volume 43, pages 213–217, 1999.
- [Brooks, 1991] Rodney Brooks. Intelligence without representation. Artificial Intelligence, 47:139–159, 1991.
- [Bulling and Jamroga, 2007] Nils Bulling and Wojciech Jamroga. Agents, beliefs, and plausible behavior in a temporal setting. In AAMAS, page 146, 2007.
- [Cannon-Bowers et al., 1993] Janis A. Cannon-Bowers, Eduardo Salas, and Sharolyn Converse. Shared mental models in expert team decision making. In Individual and Group Decision Making: Current Issues, page 221, 1993.
- [Dignum et al., 2014] Frank Dignum, Gert Jan Hofstede, and Rui Prada. From autistic to social agents. In AAMAS, pages 1161–1164, 2014.
- [Ditmarsch et al., 2007] Hans van Ditmarsch, Wiebe van der Hoek, and Barteld Kooi. Dynamic Epistemic Logic. Springer, 1st edition, 2007.
- [Espevik et al., 2011] Roar Espevik, Bjørn Helge Johnsen, and Jarle Eid. Outcomes of shared mental models of team members in cross training and high-intensity simulations. Journal of Cognitive Engineering and Decision Making, 5(4):352–377, 2011.
- [Fagin et al., 1995] Ronald Fagin, Joseph Y. Halpern, Yoram Moses, and Moshe Y. Vardi. Reasoning about Knowledge. MIT Press, Cambridge, MA, 1995.
- [Ficici and Pfeffer, 2008] Sevan G. Ficici and Avi Pfeffer. Modeling how humans reason about others with partial information. In AAMAS, pages 315–322, 2008.
- [Giunchiglia and Serafini, 1994] Fausto Giunchiglia and Luciano Serafini. Multilanguage hierarchical logics, or: How we can do without modal logics. Artificial Intelligence, 65(1):29–70, 1994.
- [Isaac and Bridewell, 2014] Alistair Isaac and Will Bridewell. Mindreading deception in dialog. Cognitive Systems Research, 28:12–19, 2014.
- [Johnson et al., 2014] Matthew Johnson, Jeffrey M. Bradshaw, Paul J. Feltovich, Catholijn M. Jonker, M. Birna van Riemsdijk, and Maarten Sierhuis. Coactive design: Designing support for interdependence in joint activity. Journal of Human-Robot Interaction, 3(1):43–69, 2014.
- [Johnson-Laird, 1983] Philip N. Johnson-Laird. Mental Models: Towards a Cognitive Science of Language, Inference, and Consciousness. Harvard University Press, 1983.
- [Kaminka, 2013] Gal A. Kaminka. Curing robot autism: A challenge. In AAMAS, pages 801–804, 2013.
- [Klimoski and Mohammed, 1994] Richard Klimoski and Susan Mohammed. Team mental model: Construct or metaphor? Journal of Management, 20(2):403–437, 1994.
- [Lakemeyer, 1986] Gerhard Lakemeyer. Steps towards a first-order logic of explicit and implicit belief. In Proceedings of the 1986 Conference on Theoretical Aspects of Reasoning about Knowledge, pages 325–340, 1986.
- [Lemaignan et al., 2010] S. Lemaignan, R. Ros, L. Mösenlechner, R. Alami, and M. Beetz. ORO, a knowledge management module for cognitive architectures in robotics. In IROS 2010, 2010.
- [Levesque, 1984] Hector J. Levesque. A logic of implicit and explicit belief. In AAAI, pages 198–202, 1984.
- [McCarthy, 1992] John McCarthy. Overcoming an unexpected obstacle. Technical report, Computer Science Department, Stanford University, 1992.
- [Mohammed et al., 2010] Susan Mohammed, Lori Ferzandi, and Katherine Hamilton. Metaphor no more: A 15-year review of the team mental model construct. Journal of Management, 2010.
- [Parikh and Ramanujam, 1985] R. Parikh and R. Ramanujam. Distributed processes and the logic of knowledge. In Logic of Programs, pages 256–268, 1985.
- [Pfau et al., 2014] Jens Pfau, Yoshihisa Kashima, and Liz Sonenberg. Towards agent-based models of cultural dynamics: A case of stereotypes. In Perspectives on Culture and Agent-based Simulations, pages 129–147. Springer, 2014.
- [Premack and Woodruff, 1978] David Premack and Guy Woodruff. Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1:515–526, 1978.
- [Rao and Georgeff, 1991] Anand S. Rao and Michael P. Georgeff. Modeling rational agents within a BDI-architecture. In KR, pages 473–484, 1991.
- [Scassellati, 2002] Brian Scassellati. Theory of mind for a humanoid robot. Autonomous Robots, 12(1):13–24, 2002.
- [Warnier et al., 2012] Matthieu Warnier, Julien Guitton, Séverin Lemaignan, and Rachid Alami. When the robot puts itself in your shoes: Managing and exploiting human and robot beliefs. In RO-MAN, pages 948–954. IEEE, 2012.
- [Wooldridge and Lomuscio, 2001] M. Wooldridge and A. Lomuscio. A computationally grounded logic of visibility, perception, and knowledge. Logic Journal of the IGPL, 9(2):273–288, 2001.