# on_computational_tractability_for_rational_verification__9e701771.pdf On Computational Tractability for Rational Verification Julian Gutierrez1 , Muhammad Najib1 , Giuseppe Perelli2 , Michael Wooldridge1 1Department of Computer Science, University of Oxford, UK 2Department of Informatics, University of Leicester, UK {julian.gutierrez, mnajib, michael.wooldridge}@cs.ox.ac.uk, giuseppe.perelli@leicester.ac.uk Rational verification involves checking which temporal logic properties hold of a concurrent/multiagent system, under the assumption that agents in the system choose strategies in game theoretic equilibrium. Rational verification can be understood as a counterpart of model checking for multiagent systems, but while model checking can be done in polynomial time for some temporal logic specification languages such as CTL, and polynomial space with LTL specifications, rational verification is much more intractable: 2EXPTIMEcomplete with LTL specifications, even when using explicit-state system representations. In this paper we show that the complexity of rational verification can be greatly reduced by restricting specifications to GR(1), a fragment of LTL that can represent most response properties of reactive systems. We also provide improved complexity results for rational verification when considering players goals given by mean-payoff utility functions arguably the most widely used quantitative objective for agents in concurrent and multiagent systems. In particular, we show that for a number of relevant settings, rational verification can be done in polynomial space or even in polynomial time. 1 Introduction The formal verification of systems using temporal logics such as LTL and CTL [Emerson, 1990] is a major research area, which has led to the development of an impressive number of industrial-strength verification tools and techniques. Arguably the most successful technique within formal verification is model checking, which can be done in polynomial space for LTL specifications and even in polynomial time for CTL specifications [Clarke et al., 2018]. In the context of multiagent systems, rational verification forms a natural counterpart of model checking [Gutierrez et al., 2015b; Wooldridge et al., 2016; Gutierrez et al., 2017a]. This is the problem of checking whether a given property ϕ, expressed as a temporal logic formula, is satisfied in a computation of a system that might be generated if agents within the system choose strategies for selecting actions that form a game- theoretic (e.g., Nash) equilibrium. Unlike model checking, rational verification is still in its infancy: the main ideas, formal models, and reasoning techniques underlying rational verification are under development, while current tool support is limited and cannot yet handle systems of industrial size [Toumi et al., 2015; Gutierrez et al., 2018a]. One key difficulty is that rational verification is computationally much harder than model checking, because checking equilibrium properties requires quantifying over the strategies available to players in the system. Rational verification is also different from model checking in the kinds of properties that each technique tries to check: while model checking is interested in correctness with respect to any possible behaviour of a system, rational verification is interested only in behaviours that can be sustained by a Nash equilibrium, when a multiagent system is modelled as a multi-player game. This, in particular, adds a new ingredient to the verification problem, as it is now necessary to take into account the preferences of players with respect to the possible runs of the system. Typically, in rational verification, such preferences are given by associating an LTL goal γi with each player i in the game. In this case, rational verification with respect to a specification ϕ is 2EXPTIME-complete, regardless of whether the representation of the system is given succinctly [Gutierrez et al., 2017a; Gutierrez et al., 2015b] or explicitly simply as a finite-state labelled transition graph [Gutierrez et al., 2015a]. In this paper, we address this issue and provide complexity results that greatly improve on the 2EXPTIME-complete result of the general case. In particular, we consider games where the goals of players are represented as either GR(1) formulae (an important fragment of LTL that can express most response properties of a concurrent and reactive system [Bloem et al., 2012]), or mean-payoff utility functions (one of the most studied reward and quality measures used in games for automated formal verification). In each case, we study the rational verification problem for system specifications ϕ given as GR(1) formulae and as LTL formulae, with respect to system models that are formally represented as concurrent game structures [Alur et al., 2002]. Our main results, summarised in Table 1, show that in the cases above mentioned, the 2EXPTIME result can be dramatically improved, to settings where rational verification can be solved in polynomial space, NP, or even in polynomial time if the number of players in the game is assumed to be fixed. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) Players goals Specification E-NASH LTL LTL 2EXPTIME-complete GR(1) LTL PSPACE-complete (Corollary 1) GR(1) GR(1) FPT (Theorem 3) mp LTL PSPACE-complete (Corollary 2) mp GR(1) NP-complete (Theorem 5) Table 1: Summary of main complexity results. Related Work: Rational verification has been studied for a number of settings, including iterated Boolean games, reactive modules games, and concurrent game structures [Gutierrez et al., 2015b; Gutierrez et al., 2017a; Gutierrez et al., 2015a; Gutierrez et al., 2017b]. In all cases, the problem is 2EXPTIMEcomplete. Rational verification is also closely related to rational synthesis, which is also 2EXPTIME-complete both in the Boolean case [Fisman et al., 2010] and with rational environments [Kupferman et al., 2016]. All of the above cases only consider perfect information. In settings with imperfect information, the problem has been shown to be undecidable both for games with succinct and explicit model representations [Gutierrez et al., 2018b; Filiot et al., 2018]. Our work also relates to LTL and mean-payoff (mp) games at large. While the former are already 2EXPTIME-complete even for two-player games (and in fact already 2EXPTIMEhard for many LTL fragments [Alur and La Torre, 2004]), the latter are NP-complete for multi-player games [Ummels and Wojtczak, 2011] and in NP co NP for two-player games [Zwick and Paterson, 1996], and in fact solvable in quasipolynomial time since they can be reduced to two-player perfect-information parity games [Calude et al., 2017]. 2 Preliminaries Linear Temporal Logic. LTL extends propositional logic with two operators, X ( next ) and U ( until ), for expressing properties of paths [Pnueli, 1977; Emerson, 1990]. The syntax of LTL is defined with respect to a set AP of atomic propositions as follows: ϕ ::= | p | ϕ | ϕ ϕ | Xϕ | ϕ U ϕ where p AP. As usual, we define ϕ1 ϕ2 ( ϕ1 ϕ2), ϕ1 ϕ2 ϕ1 ϕ2, Fϕ U ϕ, and Gϕ F ϕ. We interpret LTL formulae with respect to pairs (α, t), where α (2AP)ω is an infinite sequence of sets of atomic proposition that indicates which propositional variables are true in every time point and t N is a temporal index into α. Formally, the semantics of LTL is given by the following rules: (α, t) |= (α, t) |= p iff p αt (α, t) |= ϕ iff it is not the case that (α, t) |= ϕ (α, t) |= ϕ ψ iff (α, t) |= ϕ or (α, t) |= ψ (α, t) |= Xϕ iff (α, t + 1) |= ϕ (α, t) |= ϕ U ψ iff for some t t : (α, t ) |= ψ and for all t t < t : (α, t ) |= ϕ . If (α, 0) |= ϕ, we write α |= ϕ and say that α satisfies ϕ. General Reactivity of rank 1. The language of General Reactivity of rank 1, denoted GR(1), is the fragment of LTL of formulae written in the following form [Bloem et al., 2012]: (GFψ1 . . . GFψm) (GFϕ1 . . . GFϕn), where each subformula ψi and ϕi is a Boolean combination of atomic propositions. Mean-Payoff value. For an infinite sequence β Rω of real numbers, let mp(β) be the mean-payoff value of β, that is, mp(β) = lim inf n avgn(β) where, for n N, we define avgn(β) = 1 n Pn 1 j=0 βj. Arenas. An arena is a tuple A = N, Ac, St, s0, tr, λ where N, Ac, and St are finite non-empty sets of players (write N = |N|), actions, and states, respectively; s0 St is the initial state; tr : St Ac St is a transition function mapping each pair consisting of a state s St and an action profile a Ac = Ac N, one for each player, to a successor state; and λ : St 2AP is a labelling function, mapping every state to a subset of atomic propositions. We sometimes call an action profile a = (a1, . . . , an) Ac a decision, and denote ai the action taken by player i. We also consider partial decisions. For a set of players C N and action profile a, we let a C and a C be two tuples of actions, respectively, one for all players in C and one for all players in N\C. We also write ai for a{i} and a i for a N\{i}. For two decisions a and a , we write ( a C, a C) to denote the decision where the actions for players in C are taken from a and the actions for players in N \ C are taken from a . A path π = (s0, a0), (s1, a1) is an infinite sequence in (St Ac)ω such that tr(sk, ak) = sk+1 for all k. Paths are generated in the arena by each player i selecting a strategy σi that will define how to make choices over time. We model strategies as finite state machines with output. Formally, for arena A, a strategy σi = (Qi, q0 i , δi, τi) for player i is a finite state machine with output (a transducer), where Qi is a finite and non-empty set of internal states, q0 i is the initial state, δi : Qi Ac Qi is a deterministic internal transition function, and τi : Qi Aci an action function, Aci Ac for all i N. Let Stri be the set of strategies for player i. A strategy profile σ = (σ1, . . . , σn) is a vector of strategies, one for each player. As with actions, σi denotes the strategy assigned to player i in profile σ. Moreover, by ( σB, σ C) we Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) denote the combination of profiles where players in disjoint B and C are assigned their corresponding strategies in σ and σ , respectively. Once a state s and a strategy profile σ are fixed, the game has an outcome, a path in A, which we denote by π( σ, s). Because strategies are deterministic, π( σ, s) is the unique path induced by σ, that is, the sequence s0, s1, s2, . . . such that sk+1 = tr(sk, (τ1(qk 1), . . . , τn(qk n))), and qk+1 i = δi(qk i , (τ1(qk 1), . . . , τn(qk n))), for all k 0. Arenas define the dynamic structure of games, but lack a central aspect of a game: preferences, which give games their strategic structure. A multi-player game is obtained from an arena A by associating each player with a goal. We consider multi-player games with GR(1) and mp goals. A multi-player GR(1) game is a tuple GGR(1) = A, (γi)i N where A is an arena and γi is the GR(1) goal for player i. A multi-player mp game is a tuple Gmp = A, (wi)i N , where A is an arena and wi : St Z is a function mapping every state of the arena into an integer number. When it is clear from the context, we refer to a multi-player GR(1) or mp game as a game and denote it by G. In any game with arena A, a path π in A induces a sequence λ(π) = λ(s0)λ(s1) of sets of atomic propositions; if, in addition, A is the arena of an mp game, then, for each player i, the sequence wi(π) = wi(s0)wi(s1) of weights is also induced. For a GR(1) game and a path π in it, the payoff of a player i is payi(π) = 1 if λ(π) |= γi and payi(π) = 0 otherwise. Regarding an mp game, the payoff of player i is payi(π) = mp(wi(π)). Moreover, for a GR(1) game and a path π, by Win(π) = {i N : λ(π) |= γi} and Lose(π) = {j N : λ(π) |= γj} we denote the set of winners and losers, respectively, over π, that is, the set of players that get their goal satisfied and not satisfied, respectively, over π. With an abuse of notation, we sometime denote Win( σ, s) = Win(π( σ, s)) and Lose( σ, s) = Lose(π( σ, s)), respectively, the set of winners and losers over the path generated by strategy profile σ when starting the game from s. Furthermore, we simply write π( σ) for π( σ, s0). Nash equilibrium. Using payoff functions, we can define the concept of Nash equilibrium [Osborne and Rubinstein, 1994]. For a game G, a strategy profile σ is a Nash equilibrium of G if, for every player i and strategy σ i Stri, we have payi(π( σ)) payi(π(( σ i, σ i))) . Let NE(G) be the set of Nash equilibria of G. E-NASH and rational verification. In rational verification, a key question/problem is E-NASH, which is concerned with the existence of a Nash equilibrium that fulfils a given temporal specification ϕ. Formally, E-NASH is defined as follows: Definition 1 (E-NASH). Given a game G and a formula ϕ: Does there exist σ NE(G) such that π( σ) |= ϕ? This problem can be instantiated in many ways. For instance, in [Gutierrez et al., 2015b], E-NASH was investigated over iterated Boolean Games with specifications and players goals in LTL, and was proved to be 2EXPTIME-complete. Iterated Boolean games is a very natural framework, but it is computationally intractable. Motivated by this computational limitation, in this paper, we study E-NASH for a number of relevant instantiations of the problem, which we show to have better computational complexity. In particular, we study cases where Specifications ϕ are LTL and players goals are GR(1); Specifications ϕ are LTL and players have mp goals; Both the specification ϕ and the goals are GR(1); Specifications ϕ are GR(1) and players have mp goals. Automata. Some of the algorithms we present for the ENASH problem use techniques from automata theory. Specifically, we use deterministic automata on infinite words with Streett acceptance conditions. Formally, a deterministic Streett automaton on infinite words (DSW) is a tuple A = (Σ, Q, q0, δ, Ω) where Σ is the input alphabet, Q is a finite set of states, δ : Q Σ Q is a transition function, q0 is an initial state, and Ωis a Streett acceptance condition. A Streett condition Ωis a set of pairs {(E1, C1), . . . , (En, Cn)} where Ek Q and Ck Q for all k [1, n]. A run ρ is accepting in a DSW A with condition Ωif ρ either visits Ek finitely many times or visits Ck infinitely often, i.e., if for every k either inf (ρ) Ek = or inf (ρ) Ck = . 3 Games of General Reactivity of Rank 1 As indicated before, we solve GR(1) games in two cases: the first one is when the specification formula is expressed in LTL, while the goals are in GR(1); the second one when the specification formula as well as the goals belong to GR(1). First, we provide a general result about a characterization of Nash Equilibrium for GR(1) given in terms of punishments. We first require some notation. For a GR(1) game G, player j N, and state s St, the strategy profile σ j is punishing for player j in s if π(( σ j, σ j), s) |= γj, for every possible strategy σ j of player j. We say that a state s is punishing for j if there exists a punishing strategy profile for j on s. Moreover, we denote by Punj(G) the set of punishing states in G. A pair (s, a) St Ac is punishing-secure for player j, if tr(s, ( a j, a j)) Punj(G) for every action a j. Theorem 1. In a given GR(1) game G, there exists a Nash Equilibrium if and only if there exists an ultimately periodic path π such that, for every k N, the pair (sk, ak) of the k-th iteration of π is punishing secure for every j Lose(π). Proof sketch. From left to right, let σ NE(G) and π be the ultimately periodic path generated by σ. Assume by contradiction that π is not punishing secure for some j N, that is, there is k N and action a j such that tr(sk, ( a j, a j)k) / Punj(G). Thus, j can deviate at sk and satisfy γj, which is a contradiction to σ being a Nash equilibrium. From right to left, recall that π can be generated by a finite transducer, say T. Moreover, for every losing player j, there is a punishing strategy profile for j in every s Punj(G). Combining T with such punishment strategies, we build a profile σ that follows the actions prescribed by T, until a losing player j deviates. In such a case, σ would start punishing player j. Observe that GR(1) objectives are prefix-independent, which Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) is not true for general LTL objectives. That means that the punishment from the k-th iteration takes effect no matter what prefix π k has been played so far. Thus, there is no beneficial deviation for j and σ is a Nash equilibrium. At this point, solving E-NASH can be done as follows: 1. Guess a set W N of winners; 2. For each player j L = N \ W, a loser in the game, compute its punishment region Punj(G); 3. Remove from G the states that are not punishing for players j L and the edges (s, s ) that are labelled with an action profile a such that (s, a) is not punishingsecure for some j L, thus obtaining a game G L; 4. Check whether there exists an ultimately periodic path π in G L such that π |= ϕ V i W γi holds. The four steps described in the above procedure yield Algorithm 1, which solves the problem at hand. Algorithm 1: E-NASH of GR(1) games. 1 Input: A game GGR(1) and a specification formula ϕ. 2 for i N do 3 Compute Puni(G) 4 for W N do 5 Compute G L 6 if π |= (ϕ V i W γi) for some π G L then 7 return Accept 8 return Reject While line 6 requires solving the model checking problem for an LTL formula, which can be done in polynomial space, line 5 can be done in polynomial time. Line 4, on the other hand, makes the procedure run in exponential time in the number of players, but still in polynomial space. We then only need to check line 3: this step can be done in polynomial time, as we now show. Theorem 2. For a given GR(1) game G over the arena A = N, Ac, St, s0, tr, λ and a player j N, computing the winning region Punj(G) of player j can be done in polynomial time with respect to the size of both G and γj. Proof. We reduce the problem to computing the winning region of a suitably defined Streett game of index k = 1, whose complexity is known to be O(mnk+1kk!) [Piterman and Pnueli, 2006]. Given that in our case we have k = 1, we obtain a polynomial time algorithm. Recall that the goal of player j is of the form: r=1 GFθj r, where ψj l s and θj r s are boolean combinations of atomic propositions. Then, consider the arena A = N, Ac, St , s 0, tr 1 where 1We omit the definition of labelling function, as not needed here. St = St {0, . . . , mj} {0, . . . , nj}; s 0 = (s0, 0, 0); tr ((s, ι1, ι2), a) = (tr(s, a), ι 1, ι 2) where 1, if ι1 = 0 ι1, if ι1 = 0 and s |= ψj ι1 (ι1 (mj+1) 1), otherwise , and 1, if ι2 = 0 ι2, if ι2 = 0 and s |= θj ι2 (ι2 (nj+1) 1), otherwise Intuitively, arena A mimics the behaviour of A and carries two indexes, ι1 and ι2. Index ι1 is increased by one every time the path visits a state that satisfies ψj ι1 and resets to 0 every time the path visits a state that satisfies ψj mj. Clearly, ι1 is reset infinitely many times if and only if the path satisfies every ψj l infinitely many times, and so if and only if it satisfies the temporal specification Vmj l=1 GFψj l . The same argument applies to index ι2, but with respect to the boolean combinations θj r s. Now, consider the sets Cj = St {0} {0, . . . , nj} and Ej = St {0, . . . , mj} {0}. Clearly, the Streett pair (Cj, Ej) is satisfied by all and only the paths in A that satisfy γj. Therefore, the winning region of γj can be computed as the winning set of the Streett game of index 1 with (Cj, Ej) being the only Streett pair. As this can be done in polynomial time, we proved the statement. Based on Theorem 2, we have the following result. Corollary 1. The E-NASH problem for GR(1) games with an LTL specification is PSPACE-complete. Proof. The upper-bound follows from the procedure described above. Regarding the lower-bound, note that modelchecking an LTL formula ϕ against a Kripke structure K can be easily encoded as an instance of E-NASH where G is played over a Kripke structure K, taken to be its arena, players goals being tautologies, and the specification being ϕ. In such a case, we have that K |= ϕ if and only if E-NASH for the pair (G, ϕ) has a negative answer. Corollary 1 sharply contrasts with the same result in case the goals of the players are general LTL formulae. In this more general case, E-NASH is 2EXPTIME-complete. The special case of GR(1) specifications. One of hardest parts of Algorithm 1 is line 6, where an LTL model checking problem has to be solved, making the running time of the whole procedure exponential in the size of the specification and goals of the players. As we show in this section, a way to drastically reduce the complexity of our decision procedure is to let the specification be in GR(1) too. In such a case, the LTL model checking procedure in line 6 of Algorithm 1 can be avoided, leading to a much simpler construction, which runs in polynomial time for every fixed number of players. In this section, we provide precisely such a simpler construction. Recall that every GR(1) specification ϕ can be regarded as a Streett condition of index 1 over an arena A suitably 2By k we denote the addition modulo k. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) constructed from the original arena A. Thus, by denoting (Cϕ, Eϕ) and (Ci, Ei) the Streett pairs corresponding to the GR(1) conditions ϕ and γi, respectively, the problem of finding a path in A satisfying the formula ϕ V i W γi amounts to deciding the emptiness of the Streett automaton A = Ac, St , s 0, tr, Ω where Ω= {(Cϕ, Eϕ), (Cγi, Eγi)i W }. Note that the size of A is polynomial in the size of the GR(1) formulae involved, polynomial in the number of states and actions in the original arena A, and exponential in the number of players. More specifically, we have that |St | = |St| |γ||N| and so the number of edges is at most |St | 2. Moreover, the emptiness problem of a deterministic Streett word automaton can be solved in time that is polynomial in the automaton s index and its number of states and transitions [Rauch Henzinger and Telle, 1996; Kupferman, 2015]. The complexity of the E-NASH problem takes 2|N| times a procedure for computing at most |N| punishing regions (that is polynomial in the size of both G and ϕ, γ1, . . . , γN) plus the complexity of the emptiness problem for a Streett automaton whose size is polynomial in G ϕ, γ1, . . . , γN, and exponential in the number of players. Based on the constructions described above, we can show the following (fixed-parameter tractable) complexity result. Theorem 3. For a given GR(1) game G and a GR(1) formula ϕ, the E-NASH problem can be solved in time that is polynomial in |St|, |Ac|, and |ϕ|, |γ1|, . . . , |γN| and exponential in the number of players |N|. Therefore, the problem is fixedparameter tractable, parametrized in the number of players. 4 Mean-Payoff Games We now focus on multi-player mean-payoff (mp) games. As in the previous case, we first characterise the Nash Equilibria of a game in terms of punishments and then reduce E-NASH to a suitable path-finding problem in the underlying arena. To do this, we first need to recall the notion of secure values for mean-payoff games [Ummels and Wojtczak, 2011]. For a player i and a state s St, by puni(s) we denote the punishment value of i over s, that is, the maximum payoff that i can achieve from s, when all other players behave adversarially. Such a value can be computed by considering the corresponding two-player zero-sum mean-payoff game [Zwick and Paterson, 1996]. Thus, it is in NP co NP, and note that both player i and coalition N \ {i} can achieve the optimal value of the game using memoryless strategies. For a player i and a value z R, a pair (s, a) is z-secure for i if puni(tr(s, ( a i, a i))) z for every a i Ac. Theorem 4. For every mp game G and ultimately periodic path π = (s0, a0), (s1, a1), . . ., the following are equivalent 1. There is σ NE(G) such that π = π( σ, s0); 2. There exists z RN, where zi {puni(s) : s St} such that, for every i N (a) for all k N, the pair (sk, ak) is zi-secure for i, and (b) zi payi(π). Proof sketch. (1) (2): We prove by contraposition. Let zi be the largest value player i can get by deviating from π, and let k N be such that zi = puni(tr(sk, ( a i, a i))), i.e., i can get as much as zi by going alone at sk. Suppose further that payi(π) < zi. Thus, player i would deviate at sk since i can get better payoff by not following π which is a contradiction to π being a path induced by a Nash equilibrium. (2) (1): Define strategy profile σ that follows π as long as no-one has deviated from π. In such a case where player i deviates on the k-th iteration, the strategy profile σ i starts playing the zi-secure strategy for player i that guarantees the payoff of player i to be less than zi. Therefore, we have payi(π( σ i, σ i)) zi payi(π), for every possible strategy σ i of player i (the second inequality is due to condition 2(b)). Thus, there is no beneficial deviation for player i and π is a path induced by a Nash equilibrium. The characterization of Nash Equilibria provided in Theorem 4 allows us to turn the E-NASH problem for mp games into a path finding problem over G. Similarly to the case of GR(1) games, we have the following procedure. 1. For every i N and s St, compute the value puni(s); 2. Guess a vector z RN of values, each of them being a punishment value for a player i; 3. Compute the game G[z] by removing the states s such that puni(s) zi for some player i and the transitions (s, a) that are not zi secure for some player i; 4. Find an ultimately periodic path π in game G[z] such that π |= ϕ and zi payi(π) for every player i N. Step 1 can be done in NP for every pair (i, s), step 2 can be done in exponential time and polynomial space in the number of z-secure values, and step 3 can be done in polynomial time, similar to the case of GR(1) games. Regarding the last step, its complexity depends on the specification language. For the case of ϕ being an LTL formula, consider the formula ϕE-NASH := ϕ i N (mp(i) zi), written in the language LTLLim, an extension of LTL where statements about mean-payoff values over a given weighted arena can be made [Boker et al., 2014]. Observe that formula ϕE-NASH corresponds exactly to requirement 2(b) in Theorem 4. Moreover, since every path in G[z] satisfies condition 2(a) by construction, every path that satisfies ϕE-NASH is a solution of the E-NASH problem and viceversa. We can solve the latter problem by model checking the formula against the arena underlying G[z]. Since this can be done in PSPACE [Boker et al., 2014], we have the following result. Corollary 2. The E-NASH problem for mp games with an LTL specification formula ϕ is PSPACE-complete. As for the case of GR(1) games, we can summarize the procedure in the following algorithm (Algorithm 2). The special case of GR(1) specifications. As in the case of GR(1) games, here we show that restricting the specification language to GR(1) lowers the complexity also for mp games. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) Algorithm 2: E-NASH of mp games. 1 Input: A game Gmp and a specification formula ϕ. 2 for i N and s St do 3 Compute puni(G) 4 for z {puni(s) : s St}N do 5 Compute G[z] 6 if π |= ϕE-NASH for some π G[z] then 7 return Accept 8 return Reject The reason is that the path finding problem for GR(1) specifications can be done while avoiding model-checking of an LTLLim formula. In order to do this, we follow a different approach. Using an mp game G and a GR(1) specification ϕ we define a linear program such that the linear program has a solution if and only if the pair (G, ϕ) is an instance of E-NASH. In particular, this approach is similar to the technique used in [Gutierrez et al., 2017c, Theorem 2], where Linear Programming is used to find the complexity of solving a variant of E-NASH. Formally, we have the following result. Theorem 5. The E-NASH problem for mp games with a GR(1) specification ϕ is NP-complete. Proof. We will define a linear program of size polynomial in G having a solution if and only if there exists an ultimately periodic path whose payoff for every player i is at least a minimum threshold zi and satisfies the GR(1) specification. In order to do that, first recall that ϕ has the following form and let V (ψl) and V (θr) be the subset of states in G that satisfy the Boolean combinations ψl and θr, respectively. Observe that property ϕ is satisfied over a path π if, and only if, either π visits every V (θr) infinitely many times or visits some of the V (ψl) only a finite number of times. For the game G[z], let V, E be the underlying graph, and for every edge e E introduce a variable xe. Informally, the value xe is the number of times that the edge e is used on a cycle. Formally, let src(e) = {v V : w e = (v, w) E}; trg(e) = {v V : w e = (w, v) E}; out(v) = {e E : src(e) = v}; and in(v) = {e E : trg(e) = v}. Consider ψl for some 1 l m, and define the linear program LP(ψl) with the following inequalities and equations: Eq1: xe 0 for each edge e a basic consistency criterion; Eq2: Σe Exe 1 ensures that at least one edge is chosen; Eq3: for each a A, Σe Ewa(src(e))xe 0 this enforces that the total sum of any solution is positive; Eq4: Σsrc(e) V (ψl) = xe = 0 this ensures that no state in V (ψl) is in the cycle associated with the solution; Eq5: for each v V , Σe out(v)xe = Σe in(v)xe this condition says that the number of times one enters a vertex is equal to the number of times one leaves that vertex. By construction, it follows that LP(ψl) admits a solution if and only if there exists a path π in G such that zi payi(π) for every player i and visits V (ψl) only finitely many times. In addition, consider the linear program LP(θ1, . . . , θn) defined with the following inequalities and equations: Eq1: xe 0 for each edge e a basic consistency criterion; Eq2: Σe Exe 1 ensures that at least one edge is chosen; Eq3: for each a A, Σe Ewa(src(e))xe 0 this enforces that the total sum of any solution is positive; Eq4: for all 1 r n, Σsrc(e) V (θr) = xe 1 this ensures that for every V (θr) at least one state is in the cycle; Eq5: for each v V , Σe out(v)xe = Σe in(v)xe this condition says that the number of times one enters a vertex is equal to the number of times one leaves that vertex. In this case, LP(θ1, . . . , θn) admits a solution if and only if there exists a path π such that zi payi(π) for every player i and visits every V (θr) infinitely many times. Since the constructions above are polynomial in the size of both G and ϕ, we can conclude it is possible to check in NP the statement that there is a path π satisfying ϕ such that zi payi(π) for every player i in the game if and only if one of the two linear programs defined above has a solution. For the lower bound, we use [Ummels and Wojtczak, 2011] and observe that if ϕ is true, then the problem is equivalent to checking whether the mp game has a Nash equilibrium. 5 Other Rational Verification Problems E-NASH is, arguably, the most fundamental problem in the rational verification framework, but it is not the only one. The two other key problems are A-NASH and NON-EMPTINESS. The former is the dual problem of E-NASH, which asks, given a game G and a specification ϕ, whether ϕ is satisfied in all Nash equilibria of G. The latter simply asks whether the game G has at least one Nash equilibrium, and it is the special case of E-NASH where the specification ϕ is any tautology. We can conclude from (the proofs of) the results presented so far, which are summarised in Table 1, that while A-NASH for GR(1) games is also PSPACE and FPT, respectively, in case of LTL and GR(1) specifications, for mp games the problem is, respectively, PSPACE and co NP, in each case. In addition, we can also conclude that whereas NON-EMPTINESS for GR(1) games is FPT, for mp games is NP-complete. These results contrast with those when players goals are general LTL formulae, where all problems are 2EXPTIME-complete since LTL synthesis, which is 2EXPTIME-hard [Pnueli and Rosner, 1989], can be encoded. These results also contrast with those presented in [Gao et al., 2017], where it is shown that, in succinct model representations given by iterated Boolean games or reactive modules, all problems in the rational verification framework can be reduced to NON-EMPTINESS, which clearly cannot be the case here, unless the whole polynomial hierarchy collapses. Acknowledgments Najib acknowledges the support of the Indonesia Endowment Fund for Education (LPDP), and Perelli the support of the ERC project d Syn MA (grant agreement No 772459). Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) References [Alur and La Torre, 2004] Rajeev Alur and Salvatore La Torre. Deterministic Generators and Games for LTL Fragments. ACM Transactions on Computational Logic, 5(1):1 25, 2004. [Alur et al., 2002] Rajeev Alur, Thomas Henzinger, and Orna Kupferman. Alternating-Time Temporal Logic. Journal of the ACM, 49(5):672 713, 2002. [Bloem et al., 2012] Roderick Bloem, Barbara Jobstmann, Nir Piterman, Amir Pnueli, and Yaniv Sa ar. Synthesis of reactive(1) designs. Journal of Computer and System Sciences, 78(3):911 938, 2012. [Boker et al., 2014] Udi Boker, Krishnendu Chatterjee, Thomas Henzinger, and Orna Kupferman. Temporal Specifications with Accumulative Values. ACM Transactions on Computational Logic, 15(4):27:1 27:25, 2014. [Calude et al., 2017] Cristian S. Calude, Sanjay Jain, Bakhadyr Khoussainov, Wei Li, and Frank Stephan. Deciding parity games in quasipolynomial time. In STOC, pages 252 263. ACM, 2017. [Clarke et al., 2018] Edmund M. Clarke, Orna Grumberg, Daniel Kroening, Doron Peled, and Helmut Veith. Model Checking (2nd edition). MIT Press, 2018. [Emerson, 1990] E. Allen Emerson. Temporal and modal logic. In Handbook of Theoretical Computer Science Volume B: Formal Models and Semantics, pages 996 1072. Elsevier, 1990. [Filiot et al., 2018] Emmanuel Filiot, Raffaella Gentilini, and Jean-Franc ois Raskin. Rational Synthesis Under Imperfect Information. In LICS, pages 422 431. ACM, 2018. [Fisman et al., 2010] Dana Fisman, Orna Kupferman, and Yoad Lustig. Rational Synthesis. In TACAS, volume 6015 of LNCS, pages 190 204. Springer, 2010. [Gao et al., 2017] Tong Gao, Julian Gutierrez, and Michael Wooldridge. Iterated Boolean Games for Rational Verification. In AAMAS, pages 705 713. ACM, 2017. [Gutierrez et al., 2015a] Julian Gutierrez, Paul Harrenstein, and Michael Wooldridge. Expresiveness and Complexity Results for Strategic Reasoning. In CONCUR, volume 42 of LIPIcs, pages 268 282. Schloss Dagstuhl, 2015. [Gutierrez et al., 2015b] Julian Gutierrez, Paul Harrenstein, and Michael Wooldridge. Iterated Boolean Games. Information and Computation, 242:53 79, 2015. [Gutierrez et al., 2017a] Julian Gutierrez, Paul Harrenstein, and Michael Wooldridge. From Model Checking to Equilibrium Checking: Reactive Modules for Rational Verification. Artificial Intelligence, 248:123 157, 2017. [Gutierrez et al., 2017b] Julian Gutierrez, Paul Harrenstein, and Michael Wooldridge. Reasoning about Equilibria in Game-like Concurrent Systems. Annals of Pure and Applied Logic, 168(2):373 403, 2017. [Gutierrez et al., 2017c] Julian Gutierrez, Aniello Murano, Giuseppe Perelli, Sasha Rubin, and Michael Wooldridge. Nash Equilibria in Concurrent Games with Lexicographic Preferences. In IJCAI, pages 1067 1073, 2017. [Gutierrez et al., 2018a] Julian Gutierrez, Muhammad Najib, Giuseppe Perelli, and Michael Wooldridge. EVE: A Tool for Temporal Equilibrium Analysis. In ATVA, volume 11138 of LNCS, pages 551 557. Springer, 2018. [Gutierrez et al., 2018b] Julian Gutierrez, Giuseppe Perelli, and Michael Wooldridge. Imperfect Information in Reactive Modules games. Information and Computation, 261(Part):650 675, 2018. [Kupferman et al., 2016] Orna Kupferman, Giuseppe Perelli, and Moshe Vardi. Synthesis with Rational Environments. Annals of Mathematics and Artificial Intelligence, 78(1):3 20, 2016. [Kupferman, 2015] Orna Kupferman. Automata Theory and Model Checking. Handbook of TCS, 2015. [Osborne and Rubinstein, 1994] Martin J. Osborne and Ariel Rubinstein. A Course in Game Theory. MIT Press, 1994. [Piterman and Pnueli, 2006] Nir Piterman and Amir Pnueli. Faster Solutions of Rabin and Streett Games. In LICS, pages 275 284, 2006. [Pnueli and Rosner, 1989] Amir Pnueli and Roni Rosner. On the synthesis of a reactive module. In POPL, pages 179 190. ACM Press, 1989. [Pnueli, 1977] Amir Pnueli. The Temporal Logic of Programs. In FOCS, pages 46 57. IEEE, 1977. [Rauch Henzinger and Telle, 1996] Monika Rauch Henzinger and Jan Telle. Faster Algorithms for the Nonemptiness of Streett Automata and for Communication Protocol Pruning. In SWAT, pages 16 27, 1996. [Toumi et al., 2015] Alexis Toumi, Julian Gutierrez, and Michael Wooldridge. A Tool for the Automated Verification of Nash Equilibria in Concurrent Games. In ICTAC, volume 9399 of LNCS, pages 583 594. Springer, 2015. [Ummels and Wojtczak, 2011] Michael Ummels and Dominik Wojtczak. The Complexity of Nash Equilibria in Limit-Average Games. In CONCUR, pages 482 496, 2011. [Wooldridge et al., 2016] Michael Wooldridge, Julian Gutierrez, Paul Harrenstein, Enrico Marchioni, Giuseppe Perelli, and Alexis Toumi. Rational Verification: From Model Checking to Equilibrium Checking. In AAAI, pages 4184 4191. AAAI Press, 2016. [Zwick and Paterson, 1996] Uri Zwick and Mike Paterson. The Complexity of Mean Payoff Games on Graphs. Theoretical Computer Science, 158(1):343 359, 1996. Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19)