# probabilistic_alternatingtime_µcalculus__87992c7c.pdf The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) Probabilistic Alternating-Time µ-Calculus Fu Song,1 Yedi Zhang,1 Taolue Chen,2,5 Yu Tang,3 Zhiwu Xu4 1Shanghai Tech University, 2University of London 3East China Normal University, 4Shenzhen University 5Nanjing University Reasoning about strategic abilities is key to an AI system consisting of multiple agents with random behaviors. We propose a probabilistic extension of Alternating µ-Calculus (AMC), named PAMC, for reasoning about strategic abilities of agents in stochastic multi-agent systems. PAMC subsumes existing logics AMC and PµTL. The usefulness of PAMC is exemplified by applications in genetic regulatory networks. We show that, for PAMC, the model checking problem is in UP co-UP, and the satisfiability problem is EXPTIME-complete, both of which are the same as those for AMC. Moreover, PAMC admits the small model property. We implement the satisfiability checking procedure in a tool PAMCSolver. Introduction Temporal logics play a key role in specification and verification of ICT systems. There are two fundamental decision problems of temporal logics: satisfiability checking and model checking. Given a temporal logic formula, the former asks whether a system satisfying the formula exists, while the latter asks whether a further given system satisfies the formula. Temporal logics for reasoning about strategic abilities in Multi-Agent Systems (MAS) have been proposed, typically in the form of alternating-time temporal logics (e.g., ATL, ATL and AMC) (Alur, Henzinger, and Kupferman 2002), and strategic logics (Chatterjee, Henzinger, and Piterman 2010; Mogavero et al. 2014; 2017). Both model checking and satisfiability checking for these logics have been extensively investigated (Schewe and Finkbeiner 2006; Walther et al. 2006; Schewe 2008). In practice, agents or the environment may exhibit random behaviors because of unpredictable physical conditions. Hence, it is vital to reason about strategic abilities of agents in a stochastic setting. From the modeling perspective, this gives rise to stochastic MAS, which consist of a set of agents operating concurrently in a stochastic environment. Probabilistic variants of ATL and ATL , for instance PATL, PATL (Chen and Lu 2007; Chen et al. 2013) and SGL (Baier et al. 2007), This work is partially supported by NSFC (No. 61532019, 61761136011, and 61502308), EPSRC (EP/P00430X/1) and ARC (DP160101652, DP180100691). Copyright c 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. have been proposed, for which the model checking problem was also studied. In contrast, the satisfiability problem for these logics turns out to be much more difficult. Indeed, the satisfiability problem for PCTL, a (proper) fragment of these logics, is a long-standing open problem in the formal verification community (Chakraborty and Katoen 2016). This paper aims to propose a temporal logic with decidable satisfiability checking for reasoning about stochastic MAS. Apart from being theoretical appealing, admitting decidability for satisfiability is useful in practice. Examples include debugging specifications (as writing formal specifications is often error-prone (Rozier and Vardi 2010)), social procedure or mechanism design (Pauly 2011), and assertion-based design (i.e., verifying consistency of all requirements by formalizing designer s intention by assertions, and checking the full set of assertions to be satisfied before the model is ready for examination (Foster, Krolnik, and Lacey 2004)). To this end, we propose a probabilistic variant of AMC, probabilistic alternating-time µ-calculus (PAMC), which is acquired by equipping the coalition modalities with (qualitative) probability quantifiers A k in AMC. For instance, νZ(a {1, 2} 0.9Z) states that the agent group {1, 2} has a coalition strategy at each step such that the opponent coalition can escape the a-region in one step with probability less than 0.1. PAMC subsumes both AMC and PµTL (Liu et al. 2015), but is incomparable with PATL and PATL mentioned before. Remark that we do not extend AMC in a quantitative way such as (Mio 2012), in order to retain decidability. We investigate the model checking and the satisfiability checking problems of PAMC. The former can be done by an adaptation of the algorithm for AMC. As a result, the model checking problem of PAMC is shown in UP co-UP. In a sharp contrast, for satisfiability checking the presence of coalition modalities and probabilistic quantifiers pose new challenges. In particular, the decision procedures for (non-probabilistic) alternating-time logics (Schewe and Finkbeiner 2006; Walther et al. 2006; Schewe 2008) and PµTL (Chakraborty and Katoen 2016) cannot be directly adapted. To solve this problem, we propose a novel reduction from the satisfiability problem to solving (turn-based two-player) parity games, resulting in an EXPTIME decision procedure and a small model property in the sense that every satisfiable formula φ has a model of size exponential in |φ| (cf. Theorem 3). Remarkably, the com- plexities of the model checking and satisfiability problems of PAMC lie in the same complexity classes of AMC. We implement a satisfiability solver for PAMC, which is, to our best knowledge, the first satisfiability checker for (probabilistic) alternating-time temporal logics. Due to space restriction, proofs and experimental results are given in the full version, which, together with our tool PAMC and benchmarks, is available at http://faculty.sist. shanghaitech.edu.cn/faculty/songfu/Projects/PAMCSolver. Probabilistic Concurrent Game Structures For a natural number k N, let [k] denote the set {1, ..., k}. Given a probability distribution Pr : X [0, 1], let supp(Pr) denote the set {x X | Pr(x) > 0}. We denote by D(X) the set of probability distributions on X. Probabilistic concurrent game structures are a model of concurrent stochastic multi-agent systems, in which the transition probability function gives a probability distribution over the successor states after considering a joint action from the agents. In this work, we consider stochastic multi-agent systems with perfect information only. Definition 1. Fix a finite set AP of atomic propositions (i.e., observations). A probabilistic concurrent game structure (PCGS) is a tuple M = (Ag, Act, Q, Γ, δ, λ, q0), where: Ag = [n] is a finite set of agents. Act = S i Ag Acti and Acti is a non-empty set of actions of agent i Ag. A decision d = a1, ..., an is a joint action of all the agents in which ai, denoted by d(i), is the action chosen by agent i. We denote by D the set of decisions. Q is a finite set of states and q0 Q is the initial state. Γ = (Γi)i Ag with Γi : Q 2Acti assigning to agent i at a state q a finite set Γi(q) Acti of actions available at the state q. We denote by D(q) the set Q i Ag Γi(q) of available decisions at q. δ : Q D D(Q) is a (partial) probability transition function which associates each pair (q, d) Q D such that d D(q) with a probability distribution over Q. λ : Q 2AP is the labeling function assigning to each state q Q a set of atomic propositions. Given a state q Q and a decision d D(q), M moves to a next state q Q with probability δ(q, d)(q ), which is also denoted as δ(q, d, q ). We indicate by supp(q, d) the set supp(δ(q, d)). The outdegree of M is the maximum of |supp(q, d)| for q Q and d D(q). Paths and tracks. A path π Qω (a.k.a. play) is an infinite sequence of states q0q1... such that for all j 1, qj supp(q j 1, d j 1) for some d j 1 D(qj 1). A track is a finite prefix of a path. Given a track π = q0q1...qm (resp. a path π = q0q1...) and 0 i m for track, let πi denote the state qi and π i denote q0...qi. Strategies. A (mixed) strategy of agent i Ag is a function θi : Q+ D(Acti) that assigns to each track q0 . . . qm, representing the past history of the game, a probability distribution on its available actions Γi(qm) Acti. Therefore, the choice of the next action can be history-dependent and mixed. A strategy θi is pure if for all π Q+, θi(π) is a Dirac distribution, i.e., |supp(θi(π))| = 1. A strategy θi is memoryless (i.e., positional or imperfect recall) if for all π, π Q+ and q Q, θi(πq) = θi(π q), namely, the agent takes an action only depending on the last state. We denote by Θi the set of all strategies for agent i and let Θ = S i Ag Θi. A coalition is a set of agents A Ag. A coalition strategy for A is a function υA : A Θ assigning to each agent i A a strategy υA(i) Θi. Let ΥA denote the set of all coalition strategies for the coalition A. We denote by A the set Ag \ A. Outcomes. Given a track π Q+ such that the last state of π is q and a decision d D(q), we denote by PrυA,υA(π, d) the probability Q i A υA(i)(π)(d(i)) Q i A υA(i)(π)(d(i)), that is, the probability of the decision d chosen by the agents Ag with respect to π, υA and υA. We say that a path π is compatible with respect to υA and υA, if for all j 0, there is a decision dj D(π j) such that δ(π j, dj, πj+1) > 0 and PrυA,υA(π j, dj) > 0. We denote by O υA,υA q the set of paths starting from q that are compatible with respect to υA and υA, which is the set of paths that can be followed by the game when agents enforce strategies υA and υA. Probability Space. A σ-algebra over a set Ωis a set E 2Ω such that E contains and is closed under countable union and complement. An element of E is called an event. A probability space is a triple (Ω, E, Pr), where Ωis a sample space, E is a σ-algebra over Ω, Pr : E [0, 1] is a probability measure such that Pr(Ω) = 1 and the countable additivity property is satisfied, i.e., Pr(U V) = Pr(U)+Pr(V) whenever U V = . Given a coalition A Ag, a state q, two coalition strategies υA and υA, one can construct a probability space over the set of paths O υA,υA q with the probability measure Pr υA,υA q defined in the standard way (Vardi 1985), where an event is a measurable set of paths. CGS, MDP and MC. A concurrent game structure (CGS) is a PCGS such that all the probability distributions involved in the PCGS (i.e., probability transition function and strategies) are Dirac distributions. A Markov decision process (MDP) is a PCGS such that |Ag| = 1. A Markov chain (MC) is an MDP such that |Act| = 1. Probabilistic Alternating-Time µ-Calculus Probabilistic alternating-time µ-calculus is a simple and succinct, but natural, probabilistic extension of AMC (Alur, Henzinger, and Kupferman 2002). In this logic, coalition modalities A φ from AMC are replaced with probabilistic coalition modalities A kφ, which probabilistically quantify over the strategic choices of a group A of agents. Definition 2. Let Z be a finite set of propositional variables. The syntax of probabilistic alternating-time µ-calculus (PAMC for short) is defined as follows: φ ::= p | φ | Z | φ φ | φ φ | µZ.φ | νZ.φ | A kφ | [[A]] kφ where p AP, Z Z, { , >, <, }, k [0, 1] is a rational constant, A Ag, and for each µZ.φ and νZ.φ, each occurrence of Z in φ is under the scope of an even number of negations in φ. Let φ be a PAMC formula. Z Z is a free variable in φ if an occurrence of Z is not in the scope of a fixed-point operator µZ or νZ. φ is closed if φ does not contain any free variables. A closed formula is called a sentence. W.l.o.g., we assume hereafter that each variable Z is quantified by either µ or ν at most once in each PAMC sentence. We denote by p p and p p. For simplifying the presentation, formulae like {i1, ..., im} kφ may be written as i1, ..., im kφ. Given two PAMC sentences ηZ.φ for η {µ, ν} and ϕ, let φ[ϕ/Z] be the sentence obtained from φ by replacing every occurrence of Z with ϕ. The Fisher-Ladner closure FL(φ) of a PAMC sentence φ contains all the sub-sentences of φ with the rule: if ηZ.ϕ FL(φ) for η {µ, ν}, then ϕ[ηZ.ϕ/Z] FL(φ). The size of FL(φ) is at most double that of φ. We denote by FL (φ) FL(φ) and FL (φ) FL(φ) the set of sentences of the form A kϕ and [[A]] kϕ, respectively. Example 1. Let φ = νZ1.( 1, 2 1 2 Z1 φ ) where φ = µZ2.(p 1, 3 1 3 Z2), we have FL(φ) = {φ, 1, 2 1 2 φ φ , 1, 2 1 2 φ, φ , p 1, 3 1 3 φ , p, 1, 3 1 The semantics of PAMC is defined w.r.t. PCGSs and a valuation ξ : Z 2Q. Let ξ[S/Z] denote the valuation such that ξ[S/Z](Z) = S and ξ[S/Z](Z ) = ξ(Z ) for Z , Z . Definition 3. Given a PCGS M = (Ag, Act, Q, Γ, δ, λ, q0), the semantics of PAMC is defined via the denotation function ξ M, which is defined as follows: p ξ M = {q Q | p λ(q)}; Boolean connectors are defined in a standard way; µZ.φ ξ M = T{Q Q | φ ξ[Q /Z] M Q }; νZ.φ ξ M = S{Q Q | φ ξ[Q /Z] M Q }; A kφ ξ M = ( q Q | υA ΥA, υA ΥA : Pr υA,υA q ({π O υA,υA q | π1 φ ξ M}) k [[A]] kφ ξ M = ( q Q | υA ΥA, υA ΥA : Pr υA,υA q ({π O υA,υA q | π1 φ ξ M}) k We sometimes drop the superscript ξ from φ ξ M if φ is a PAMC sentence. When it is clear from context, the subscript M is also dropped from φ ξ M and φ M. AMC (resp. PµTL (Liu et al. 2015)) is an extension of µcalculus in which the next-modalities are replaced by A φ and [[A]]φ (resp. [Xφ] k) and the semantics is defined over CGSs (resp. MCs). PATL/PATL (Chen and Lu 2007) are probabilistic variants of ATL/ATL (Alur, Henzinger, and Kupferman 2002) and the semantics is defined over PCGSs. Theorem 1. PAMC subsumes AMC and PµTL. PAMC and PATL/PATL are incomparable. Proof sketch. Each AMC (resp. PµTL) formula can be encoded as a PAMC formula, which can be shown easily by structural induction. For instance, A φ in AMC (resp. [Xφ] k in PµTL) can be encoded as A 1φ (resp. kφ ), where φ denotes the encoding of φ in respective cases. Some PAMC formulae (e.g., A >0.5φ) cannot be expressed in AMC or PµTL. Since PµTL and PCTL are incomparable on Markov chains (Liu et al. 2015), there exists a PCTL formula (and thus a PATL formula) which is inexpressible in PAMC. Moreover, we note that AMC is more expressive than ATL , by which a formula in PAMC can be constructed which is inexpressible in PATL . Proposition 1 is directly from the PAMC semantics. To simplify the notation, we write > for <, < for >, for and for , and write b for <, b> for , b for > and b< for . Proposition 1. Given a PAMC formula A kφ or [[A]] kφ for k [0, 1], { , >, , <}, a PCGS M and a valuation ξ, we have the following deduction rules. 1. A kφ ξ = A 1 k φ ξ. 2. [[A]] kφ ξ = [[A]] 1 k φ ξ. 3. A kφ ξ = [[A]]b 1 kφ ξ. 4. [[A]] kφ ξ = A b 1 kφ ξ. 5. Ag kφ ξ = [[ ]] k φ ξ. 6. [[Ag]] kφ ξ = k φ ξ. A PAMC sentence is in negation normal form (NNF) if only appear in front of atomic propositions. Using Proposition 1 and the rule µZ.φ νZ. φ[ Z/Z], every PAMC sentence can be equivalently transformed into NNF. Hereafter, we assume that all PAMC sentences are in NNF. In this work, for a given PAMC sentence φ, we consider the following two fundamental problems. Model checking is to determine whether q0 φ M for the initial state q0 of a given PCGS M. Satisfiability checking is to determine whether there exists a PCGS M with the initial state q0 such that q0 φ M. In PAMC, only probabilistic next-time coalition modalities A kφ and [[A]] kφ are allowed, hence it suffices to consider memoryless strategies for the coalition A. By leveraging the model checking algorithm for AMC (Alur, Henzinger, and Kupferman 2002) that computes φ recursively, we can get a model checking algorithm for PAMC. The key difference lies in the handling of A kφ and [[A]] kφ, which can be done in polynomial time by linear programming. The model checking problem for AMC lies in UP co-UP, and can be solved in exponential time in the alternation depth of the formula. We have that Theorem 2. The model checking problem for PAMC is in UP co-UP and can be decided in O((|φ| |M|)c d) time for some constant c, where d denotes the alternation depth of φ. An application: Genetic Regulatory Networks To demonstrate the usage of PCGSs and PAMC, we present an example from precision medicine based on the probabilistic Boolean network (PBN) model for genetic regulatory networks (Shmulevich and Dougherty 2010). Consider a PBN with n genes, each of which has two local states {0, 1} where 0 and 1 indicate that the corresponding gene is not expressed and expressed, respectively. A (global) state of the PBN is a Boolean vector of local states with width n. Each gene i has a finite set of predictor functions Fi = { f i 1, ..., f i ki} denoting the inter-gene relationship, where each function f i j determines the local state of gene i at the next step when the j-th element of Fi (i.e., f i j) is chosen, and ci j is the probability that f i j is selected (so Pki j=1 ci j = 1). The probability of the PBN evolving from a state q = [q1, ..., qn] to the state q = [q 1, ..., q n] is P f1 F1: f1( q)=q 1 P fn Fn: fn( q)=q n Qn i=1 ci, where ci is the probability that fi is selected by the gene i. For intervention purposes, control inputs are introduced into PBN to reason about treatment strategies. For instance, in cancer therapy, control inputs (e.g. radiation, chemotherapy) may be employed to move the state probability distribution vector away from one associated with uncontrolled cell proliferation or markedly reduced apoptosis (Shmulevich and Dougherty 2010, Section 5.2). Suppose there are x1, ..., xm control inputs ranging over binary values {0, 1}, which control the probabilities of predictor functions. The control input being 1 indicates that a corresponding predictor function (intuitively a treatment) is applied; 0 otherwise. Under the values v = v1, ..., vm of control inputs, the probability of the predictor function f i j Fi becomes ci j, v with Pki j=1 ci j, v = 1. (The specific definitions of ci j, v are irrelevant here.) The main problem is to synthesize values for the control inputs at each state to ensure that the network behaves as desired. The control of PBN can be naturally modelled as a PCGS M = (Ag, Act, Q, Γ, δ, λ, q0), where Ag = [n+m] with i [n] denotes the gene i, and n + i {n + 1, ...n + m} denotes the control input i; Q = {0, 1}n corresponding to states of the network; Acti = Fi for i [n] denoting the predictor functions and Actn+i = {0, 1} for n + i {n + 1, ...n + m} denoting treatment choices; Γi is the function such that Γi( q) = Acti for every agent i and state q; q0 is the initial state of the network which is determined by the patient s physiology; δ is the probability transition function such that for every q Q, f i ji Fi and v = v1, ..., vm {0, 1}m: δ( q, f 1 j1, ..., f n jn, v1, ..., vm , [ f 1 j1( q), ..., f n jn( q)]) := Qn i=1 ci ji, v. Note that we do not specify λ explicitly as this is usually application-specific. With PCGS defined above, we can formulate the achievement property as: µZ.(p A 0.8Z) for some A {n + 1, ..., n + m}. This formula expresses that some desired-region is reachable, and there is treatment strategy using at most treatments in A at each step such that it has at least probability 0.8 to go on with the right direction. The maintenance property can be formulated as νZ.(p A >0.9Z) stating that there is a treatment strategy using at most treatments in A at each step such that the network has less than 0.1 probability to escape from the desired-region. Deciding Satisfiability In this section, we present a reduction from the satisfiability problem of a PAMC sentence φ to solving a (turned-based, two-player) parity game Gφ such that φ is satisfiable iffPlayer0 has a winning strategy in Gφ. Intuitively, as in the classic decision procedure of µ-calculus, the fixpoint is handled by parity games. In particular, the two players are model constructor and spoiler . The constructor intends to show that there a model witnessing the satisfiability of φ, while the spoiler tries to defeat the constructor by demonstrating a path in the game to show that a model cannot exist. Each state controlled by the constructor consists of a set of subsentences of φ for which the constructor strives to show that it is satisfiable. In addition, the coalition strategies for the set of sub-sentences are tackled by the notions of intersection graph and maximally independent set (cf. Definition 5), which are used to make sure that the strategies of an agent in different coalitions are consistent. To handle probabilities, weighted covers are used which would guarantee that there exist distributions over covers satisfying constraints k. To illustrate the reduction, the PAMC sentence Ψ = 1, 2 >0.5p1 3 >0.5( p1 p2) [[1, 3]]>0.4( p1 p2) will be used as an example. We start with some concepts. Parity automata. A parity automaton (PA) is a tuple P = (S, Σ, δ, S 0, F), where S is a finite set of states, Σ is the input alphabet, δ : S Σ 2S is a transition function, S 0 S is the set of initial states and F : S {0, ..., k} is a rank function. A run ρ of P over an ω-word α0α1... Σω is an infinite sequence of states s0s1 such that s0 S 0, and for every i 0, si+1 δ(si, αi). Let inf(ρ) be the set of states visited infinitely often in ρ. ρ is accepting iffmins inf(ρ) F(s) is even. An infinite word is accepted by P iffP has an accepting run over this word. We denote by L(P) Σω the set of all infinite words accepted by P. A B uchi automaton (BA) is a special PA (S, Σ, δ, S 0, F) in which F : S {0, 1}. A PA P = (S, Σ, δ, S 0, F) is deterministic if |S 0| = 1 and for all (s, α) S Σ, |δ(s, α)| 1. δ and S 0 in a deterministic parity automaton (DPA for short) are simplified as the function δ : S Σ S and a state s0. Turned-based two-player parity games. A (turned-based two-player) parity game G is a tuple (V = V0 V1, E, vinit, Ξ), where V0 is a finite set of states (i.e., vertices) of Player-0, V1 is a finite set of states (i.e., vertices) of Player-1, E V V is a finite the set of edges, vinit V0 is the initial state, and Ξ : V {0, ..., k} is a rank function. An infinite play ρ of G is an infinite sequence of states v0v1... such that v0 = vinit, and for every i 0, (vi, vi+1) E. A finite play ρ of G is a finite sequence of states v0v1...vn such that v0 = vinit, and for every i [n], (vi 1, vi) E. A strategy of Player-i is a partial function θ : V Vi V such that for every ρ V and v Vi, if θ(ρ v) is defined, then (v, θ(ρ v)) E. Given a strategy θ0 for Player-0 and a strategy θ1 for Player-1, let Gθ0,θ1 be the play such that Player-0 and Player-1 enforce their strategies θ0 and θ1 during the play. Player-0 wins on a finite play ρ iff the last state of ρ is controlled by Player-1 and it cannot move to the next state anymore. Player-0 wins on an infinite play ρ iffmins inf(ρ) F(s) is even. Player-0 has a winning strategy iff Player-0 has a strategy θ0 such that Player-0 wins on the play Gθ0,θ1, for each strategy θ1 Player-1 chooses. Covers and weighted covers. Given a set of sentences Φ, a cover c of Φ is a set c 2Φ such that S v c v = Φ. A weighted cover of Φ is a pair (c, w) such that c is a cover of Φ and w : c (0, 1] is a probability function such that P v c w(v) = 1. The width of a cover c is the cardinality of c. Given a weighted cover (c, w) of Φ = {φ1, , φm}, let Φ(φi) := {v c | φi v} and w(φi) := P v Φ(φi) w(v). We write Cb(Φ) for the set of covers of Φ with width at most b. Given b and a set Φ, |Cb(Φ)| is at most 2|Φ| (b+1) 2|Φ| 2|Φ| 1 (Chakraborty and Katoen 2016). Intuitively, a weight cover (c, w) of Φ can be seen as a distribution w over the cover c. Definition 4. A set v of sentences is transitive if the following conditions hold: (1) If φ1 φ2 v, then φ1, φ2 v; (2) If φ1 φ2 v, then φ1 v or φ2 v; (3) If ηZ.ϕ v for η {µ, ν}, then ϕ[ηZ.ϕ/Z] v; and (4) There exists at least one sentence of the form A kϕ or [[A]] kϕ in v. Let Vt denote the set of transitive sets of sentences. For every v Vt, we denote by v and [[v]] the sets { A kϕ v} and {[[A]] kϕ v}, respectively. Intuitively, consider a set v that is controlled by the constructor in the parity game Gφ. If v is transitive, then the constructor should give successor states to satisfy sentences in v and [[v]]. Example 2. The set v = { 3 >0.5( p1 p2), 1, 2 >0.5p1, [[1, 3]]>0.4( p1 p2)} is a transitive vertex. We have: v = { 1, 2 >0.5p1, 3 >0.5( p1 p2)}, and [[v]] = {[[1, 3]]>0.4( p1 p2)}. Definition 5. An intersection graph Gv = ( v , Ev) formed from v Vt is an undirected graph such that ( A1 1k1ϕ1, A2 2k2ϕ2) Ev iffA1 A2 , . A maximal independent set (MIS) M of Gv is a maximal set of vertices in Gv such that there are no edges between each pairs of vertices from M. Given a set B Ag, a B-dominant MIS (B-MIS) M of Gv is a maximal set of vertices in Gv such that (1) there is no edge between each pairs of vertices from M and (2) for all A kϕ M, A B. Let misv (resp. mis B v ) denote the set of MIS (resp. BMIS) in Gv. Note that misv and mis B v are sets whose members are sets of sub-sentences. Let MISv := misv \ S [[B]] kϕ [[v]] mis B v n M {[[B]] kϕ} | M mis B v , [[B]] kϕ [[v]] o . As mentioned before, MIS and B-dominant MIS are used to ensure the consistency of the strategies of an agent in different coalitions. Intuitively, consider a transitive set v controlled by the constructor. To satisfy sentences in v and [[v]], probabilistic constraints in each set M of MISv should be ensured simultaneously by one distribution, but this distribution need not satisfy the probabilistic constraints outside of M by choosing proper actions. Example 3. Consider the vertex v defined in Example 2, we have: misv = n 1, 2 >0.5p1, 3 >0.5( p1 p2) o , mis{1,3} v = n 3 >0.5( p1 p2) o , MISv = n 1, 2 >0.5p1, 3 >0.5( p1 p2) , 3 >0.5( p1 p2), [[1, 3]]>0.4( p1 p2) o . Proposition 2. (Moon and Moser 1965) Given a set v of sentences, the size of misv (resp. mis B v ) is at most 3 |v| 3 and can be computed in time O(3 |v| Definition 6. Given a set M MISv, the objective of M is the set OM := {ϕ | A kϕ M or [[A]] kϕ M}. A weighted cover (c M, w M) of OM such that c M Cb(OM) satisfies M, denoted by (c M, w M) |= M, if for all A kϕ, [[A]] kϕ M, w M(ϕ) k, where b = |M| + 1. Example 4. Consider the vertex v from Example 2. We have MISv = {M, M }, where M = { 1, 2 >0.5p1, 3 >0.5( p1 p2)} and M = { 3 >0.5( p1 p2), [[1, 3]]>0.4( p1 p2)}. Furthermore, OM = {p1, p1 p2}, and OM = { p1 p2, p1 p2}. OM has one cover c1 = {p1, p1 p2} such that (c1, w1) |= M for some weight w1, e.g., w1({p1, p1 p2}) = 1. M have two covers c2 = { p1 p2, p1 p2} and c 2 = { p1 p2}, { p1 p2} such that (c2, w2) |= M and (c 2, w 2) |= M for some weights w2 and w 2, e.g. w2({ p1 p2, p1 p2} = 1), w 2({ p1 p2}) = 0.55 and w 2({ p1 p2}) = 0.45). Definition 7. Given a PAMC sentence φ, the two-player game Gφ is a triple (V = V0 V1, E, vinit), where V0 and V1 form a partition of V, Vi for i {0, 1} is the finite set of vertices for Player-i, vinit = {φ} V0 is the initial vertex of the game and E V V is a finite set of edges. V0 and V1 are defined below. E V V is the least set of edges satisfying the following conditions. 1. (v, v {φi}) E if φ1 φ2 v and φi < v, for i {1, 2}. 2. (v, v {ϕ[ηZ.ϕ/Z]}) E if ηZ.ϕ v and ϕ[ηZ.ϕ/Z] < v, for η {µ, ν}. 3. (v, v {φ1, φ2}) E if φ1 φ2 v and {φ1, φ2} v. 4. (v, ) E if both p v and p v for some p AP. 5. (v, ) E if v < Vt and Items 1-4 cannot be applied to v. 6. For each v Vt, let MISv = {M1 v, , Mkvv }, C j := {c M | (c M, w M) |= Mj v}, and C = S j [kv] C j. If C j = for some j [kv], then (v, ) E; otherwise (v,C) E, (C,C j) E for j [kv], (C j, c) E for each c C j, and (c, v ) E for each v c. As a result, V0 := 2FL(φ) { } {C j | (C,C j) E} and V1 := V1,0 V1,1 { }, where V1,0 = {C | (v,C) E}, V1,1 = {c | (c, v ) E}, C,C j and c are defined as above. The intuition of Items 1-5 are straightforward. To see the intuition behind Item 6. Let us begin with v = { A1 1k1φ1, , An nknφn}. If Ai A j = for some i, j [n], then for each pair of possible strategies of Ai and A j, there always exist some common decisions when Ai and Aj enforce their strategies, hence their probabilistic constraints should be simultaneously ensured by one distribution. Suppose misv = {M1, , Mm} (cf. Definition 5), then, (1) for each pair of A kφ and A k φ in Mi, A A = , (2) for each sentence A kφ from Mi, for every set Mj , Mi, there exists a sentence A k φ in Mj with A A , , and (3) each sentence in v must occur at least in one set Mi. This allows us to consider each set Mi individually by assigning the action aj to players in Aj. For each set Mj, we compute its weighted covers, which check existence of distributions that satisfying the probabilistic constraints in Mj. For the general case v = { A1 1k1ϕ1, , Am mkmϕm, [[B1]] 1h1ψ1, , [[Bn]] nhnψn}, we assume there exists a player j < Sn i=1 Bi (cf. Remark 1). By assigning the action bi to the player j for each sentence [[Bi]] ihiψi, it suffices to check individually each of {[[B1]] 1h1ψ1, , [[Bn]] nhnψn}. For each Bi, let mis Bi v = {Mi 1, , Mi mi}, then (1) for each sentence A kϕ from mis Bi v , A Bi, and (2) Mi ℓfor 1 ℓ mi must be a subset of a set in misv. By assigning proper actions to players (cf. previous paragraph), for each [[Bi]] ihiψi, it suffices to check existence of a distribution for sub-sentences in Mi ℓand [[Bi]] ihiψi simultaneously, which is done by computing weighted covers. Moveover, each MIS M misv such that M M for some M mis Bi v is omitted to avoid double checking. Overall, for each v Vt with MISv = {M1 v, , Mkvv }, if C j = for some j [kv], then no distribution satisfies the v1 = { 1, 2 >0.5p1 3 >0.5( p1 p2) [[1, 3]]>0.4( p1 p2)} v2 = { 1, 2 >0.5p1 3 >0.5( p1 p2), [[1, 3]]>0.4( p1 p2)} v3 = { 1, 2 >0.5p1, 3 >0.5( p1 p2), [[1, 3]]>0.4( p1 p2)} C = n {p1, p1 p2} , { p1 p2, p1 p2} , { p1 p2}, { p1 p2} o v7 = {p1, p2} { p1, p1 p2} {p2, p1 p2} { p1} v9 = { p1, p2} v8 = {p2} C1 = n {p1, p1 p2} o C2 = n { p1 p2, p1 p2} , { p1 p2}, { p1 p2} o c1 = {p1, p1 p2} c2 = { p1 p2, p1 p2} c3 = { p1 p2}, { p1 p2} v4 = {p1, p1 p2} { p1 p2, p1 p2} v5 = { p1 p2} v6 = { p1 p2} Figure 1: The game GΨ for Ψ, where C, , c1, c2, and c3 are Player-1 vertices, others are Player-0 vertices. probabilistic constraints of Mj v, we added (v, ) into E. i.e., Player-0 loses the game. Otherwise, the play goes to C and let Player-1 to choose one Mj v to dissatisfy, i.e., (C,C j) E. At C j, Player-0 chooses one cover c C j such that (c, w) |= Mj v, i.e., the distribution w satisfies the probabilistic constraints of Mj v. Next, Player-1 chooses one set of sub-sentences v c to dissatisfy. Example 5. Recalling the sentence Ψ = 1, 2 >0.5p1 3 >0.5( p1 p2) [[1, 3]]>0.4( p1 p2), the corresponding game is shown in Figure 1, in which c1, c2, c3, and C are Player-1 vertices, others are Player-0 vertices. 1, 2 >0.5p1 and 3 >0.5( p1 p2) have some common decisions, as well as 3 >0.5( p1 p2) and [[1, 3]]>0.4( p1 p2). While, 1, 2 >0.5p1 and [[1, 3]]>0.4( p1 p2) can have disjoint decisions by assigning proper actions to Player-1. We have: MISv = {M, M }, where M = { 1, 2 >0.5p1, 3 >0.5( p1 p2)} and M = { 3 >0.5( p1 p2), [[1, 3]]>0.4( p1 p2)}. If Player-1 chooses C1 (resp. C2) to dissatisfy, then Player0 can choose c1 (resp. c3) as the witness of M (resp. M ). It is easy to verify that Player-0 has a winning strategy in the game GΨ by choosing c1, c3, v7, v8 and v9. The model construction from a winning strategy is given in Example 6. Finally, in order to prevent from regeneration sequences for each µ-sentence ψ (i.e., derivation sequences that derive ψ infinitely often), we construct a DPA which accepts all terminating regeneration sequences. By constructing the cross-product of the two-player game Gφ and the DPA, we get the resulting parity game Gφ such that φ is satisfiable iff Player-0 has a winning strategy in the parity game Gφ. Definition 8. Given a PAMC sentence φ, we define a B uchi automaton Pµ = (S µ, Σ, δµ, S 0 µ, Fµ) where S µ = S 0 µ = {ψ FL(φ) | ψ contains some µ-sentence}, Σ = 2FL(φ), δµ(q, α) = q , if q is derived from q and q α, and Fµ(s) = 0 for all s S µ. Pµ precisely accepts all the sequences in which some µ-sub-sentence of φ is derived infinitely often. Let Pφ = (S φ, Σ, δφ, s0 φ, Fφ) be a DPA such that L(Pφ) = Σω \ L(Pµ). Therefore, Pφ accepts all the sequences for which all the µ-sentences in FL(φ) are regenerated finitely often. Definition 9. Based on Definition 7 and Definition 8, we define a turn-based two-player parity game Gφ = (V = V 0 V 1, E , v init, Ξ) where V 0 = V0 (S φ { }), V 1 = V1 (S φ { }), v init = (vinit, s0 φ), for every v V such that v < Vt, s S φ and (v, v ) E: then (v, s), (v , δφ(s, v)) E ; for every v Vt and s S φ: (v, s), ( , ) E if (v, ) E; (v, s), (C, ) E if (v,C) E; (C, ), (C j, ) E for every (C,C j) E; (C j, ), (c, ) E for every c C j; (c, ), (v , δφ(s, v)) E for every v c. Ξ : V {0, , k} such that Ξ(v, s) = Fφ(s) for all s S φ, where is a placeholder, C,C j and c are the vertices as in Definition 7. Lemma 1. Player-0 has a winning strategy in the game Gφ for every satisfiable sentence φ. To show the other direction of Lemma 1, we shall show how to construct a model from a winning strategy of Player-0. W.l.o.g., we assume that {1, , g} is the set of all players appeared in φ. It is well-known that, for parity games, if Player-0 has some winning strategies, then Player-0 has a pure memoryless winning strategy (Gurevich and Harrington 1982). Let ζ : V 0 V be a winning strategy of Player-0 in Gφ. Since the DPA Pφ is deterministic, we can directly extract from ζ : V 0 V a winning strategy ζ : V0 V for Player-0 by projecting from V onto V. The winning strategy ζ and the game Gφ together yield the digraph Gζ φ in which only the edges specified by the strategy ζ are retained. Let Π be the set of all finite paths v = v1 vr V+ 0 (V1,0 { }) in Gζ φ such that v1 is either the initial vertex vinit or there is an edge (v, v1) in Gζ φ such that v V1,1. Note that Π is a finite set. We denote by fst( v) and lst( v) the vertex v1 and vr, respectively. Given a vertex v V0, let Πv Π be the set of finite paths starting from v. Let AP be the set of atomic propositions appearing in φ, Mφ = (Ag, Act, Q, Γ, δ, λ, q0) be a PCGS obtained from Gζ φ, where Ag := {1, , g + 1}, Act := S i Ag Acti, Acti := S q Q Γi(q), Q := {q v | v Π} {q }, q0 := q v such that fst( v) = vinit, λ(q v) := (S i [r 1] vi) AP, for v = v1 vr Π and λ(q ) := . Γ := (Γi)i Ag and δ are defined as follows. Consider a path v = v1 vr Π such that vr , , then vr 1 Vt, we assume that vr 1 := { A1 1k1ϕ1, , Am mkmϕm} [[vr 1]] := {[[B1]] 1h1ψ1, , [[Bn]] nhnψn}. Let ι( Aj jk jϕk) (resp. ι([[Bj]] jhjψ j)) denote the index j. Case 1: If [[vr 1]] = S i [m] Ai = , then we use one action a and let Γi(q v) := {a} for every i Ag. Case 2: Otherwise, we consider m + n distinct actions a1, ..., am, b1, ..., bn. Let Γg+1(q v) := {b1, , bn} and Γi(q v) be the least set of actions satisfying the following conditions: 1. for every j [m] and i Aj, a j Γi(q v) 2. for every j [n] and i Bj, bj Γi(q v). In the sequel, we define δ. We first extract essential information from v. Suppose mis Bj vr 1 := {M j 1, , Mj ℓj}, for j [n] and misvr 1 \ S j [n] mis Bj vr 1 := {M0 1, , M0 ℓ0}. Note that in Case 1, there is only one MIS that is vr 1 . By the construction of Gφ, for every M MISvr 1, there is a weighted cover (c, w) such that |c| |M| + 1, (c, w) |= M. Let w M be the distribution such that w M(q v ) = w(fst( v )) for all v S v c Πv, w be the distribution such that w M(q ) = 1. Moreover, vr {c | (c, w) |= M, M MISvr 1}. To define δ, we introduce some auxiliary notations. For a sentence ψ = A kϕ vr 1 , let D[ψ] := {d D(q v) | i A, d(i) = aι(ψ)}. For an MIS M {M0 1, , M0 ℓ0}, let D[M] := T ψ M D[ψ] and D := S M {M0 1, ,M0 ℓ0} D[M] . For a sentence ϕ = [[B]] hψ [[vr 1]], let D[ϕ] := {d D(q v) | i B, d(i) = bι(ϕ)}. For a sentence ϕ = [[B]] hψ [[vr 1]] and a B-MIS M mis B vr 1, let D[M, ϕ] := D[M] D[ϕ] and D := S [[B]] hψ [[vr 1]],M mis B vr 1 D[M, [[B]] hψ] . The following proposition reveals the property of action assignments of agents. Proposition 3. The following statements hold: 1. For every M {M0 1, , M0 ℓ0}, [[B]] hψ and M mis B vr 1: D[M] D[M , [[B]] hψ] = . 2. For every M1, M2 {M0 1, , M0 ℓ0}, if M1 , M2, then D[M1] D[M2] = . 3. For every [[B]] hψ, [[B ]] h ψ , M mis B vr 1 and M mis B vr 1, if M , M or [[B]] hψ , [[B ]] h ψ , then D[M, [[B]] hψ] D[M , [[B ]] h ψ ] = . 4. For every ϕ, ϕ vr 1 , if ϕ , ϕ , then (D[ϕ] \ (D D )) (D[ϕ ] \ (D D )) = . 5. For every ϕ, ϕ [[vr 1]], if ϕ , ϕ , then (D[ϕ] \ (D D )) (D[ϕ ] \ (D D )) = . 6. For every ϕ vr 1 and ϕ [[vr 1]], (D[ϕ] \ (D D )) (D[ϕ ] \ (D D )) = . We now define δ as follows. 1. For M {M0 1, , M0 ℓ0} and d D[M] , δ(q v, d) := w M. 2. For all ϕ = [[B]] hψ [[vr 1]], M mis B vr 1 and d D[M, ϕ] , δ(q v, d) := w M. 3. For all ϕ vr 1 and d D[ϕ] \ (D D ), for some M MISvr 1 such that ϕ M, δ(q v, d) := w M. Note that such M always exists. 4. For all ϕ = [[B]] hψ [[vr 1]] and d D[ϕ] \ (D D ), for some M MISvr 1 such that ϕ M, δ(q v, d) := w M. Note that such M always exists as well. 5. For all the other d D(q v), δ(q v, d) = w . Example 6. Consider the game in Figure 1 and the strategy ζ with ζ(C2) = c3, ζ(v4) = v7 and ζ(v5) = v8. Then, ζ is a winning strategy for Player-0 from which we can construct the digraph Gζ Ψ. Gζ Ψ is the subgraph of the game shown in Figure 1 consisting of the vertices {v1, , v9,C,C1,C2, c1, c3, } and related edges between them. Π consists of four finite paths {v1v2v3C, v4v7 , v5v8 , v6v9 } (blue paths in Figure 1). From Π, using the weights defined in Example 4, we get the model of φ as shown in Figure 2, where players are {1, 2, 3, 4} and actions are {a1, a2, b1} with Γ1(q0) = {a1}, Γ2(q0) = {a1, b1} and Γ3(q0) = {a2}, λ(q1) = {p1, p2}, λ(q2) = {p2} and λ(q3) = . a1, a1, a2, b1 , 1 a1, b1, a2, b1 , 0.45 a1, b1, a2, b1 , 0.55 q0 = qv1v2v3C q1 = qv4v7 q2 = qv5v8 q3 = qv6v9 Figure 2: The model of Ψ. By the construction (in particular, the function δ and finite path set Π), we can verify that Lemma 2. Mφ satisfies φ. Theorem 3. The satisfiability problem for PAMC is EXPTIME-complete. Moreover, if φ is satisfiable, we can construct a model of φ in exponential size of |φ| such that the number of players is bounded by k + 1, where k is the number of the players occurring in φ, the number of actions is bounded by |FL (φ)|+|FL (φ)|+1, the out-degree is bounded by |FL (φ)| + 2. Remark 1. Without adding the player g + 1, our reduction still works if Ag , (B B ) for each distinct pair of [[B]] hψ and [[B ]] h ψ in [[v]]. The case Ag = B B could be solved by adapting the reduction by adding new MISs M into MISv, which takes into account the common decisions for [[B]] hψ and [[B ]] h ψ . We leave it as future work. Related Work Probabilistic temporal logics such as PCTL, PCTL , PLTLK, µ-calculi and their variants (Vardi 1985; Morgan and Mc Iver 1997; Huth and Kwiatkowska 1997; de Alfaro and Majumdar 2001; Mc Iver and Morgan 2002; 2007; Mio 2012; Cleaveland, Iyer, and Narasimha 2005; Huang and Kwiatkowska 2016; Fu et al. 2018) can express probabilistic properties, rather than coalitions of players. The most relevant work includes PATL and PATL (Chen and Lu 2007; Huang, Su, and Zhang 2012), their variants (Chen et al. 2013; Huang and Luo 2013) and SGL (Baier et al. 2007). These papers mainly focus on the model checking problem (on different game structures or interpretations of strategies). The satisfiability problem, which is the main focus of the current paper, is a long-standing open problem for PATL and PATL , notwithstanding with some progress (Br azdil et al. 2008; Chakraborty and Katoen 2016). Satisfiability checking of (non-probabilistic) alternatingtime temporal logics has been considered (Goranko and Vester 2014). Our decision procedure shares some similarity, specifically, MIS plays a similar role as the distributed control therein. Modulo the probabilistic aspects and fixpoints, the main difference is that, to handle coalition strategies we are considering maximally independent sets rather than all subsets of distributed controls. As a result, our construction leads to potentially smaller models, which is important for implementation, but is at the cost of a considerably more complicated construction. Much research has been dedicated to algorithms of solving stochastic games, e.g., (Chatterjee 2007; Chatterjee and Henzinger 2012; Br azdil et al. 2006) in which objectives are usually given by ω-regular properties or PCTL. Stochastic games are useful for synthesis and verification of reactive systems, but hard to express cooperation of players. Conclusion and Future Work We proposed a temporal logic PAMC for reasoning about strategic abilities of agents in stochastic multi-agent systems. We showed that the complexities of its model checking and satisfiability checking problems lie in the same complexity classes of the respective problems of modal µ-calculus and AMC. We also implemented a satisfiability solver for PAMC. Several questions are left open for PAMC, such as (complete) axiomatization and extensions with epistemic operators. We leave them as future work. Alur, R.; Henzinger, T. A.; and Kupferman, O. 2002. Alternatingtime temporal logic. J. ACM 49(5):672 713. Baier, C.; Br azdil, T.; Gr oßer, M.; and Kucera, A. 2007. Stochastic game logic. In QEST, 227 236. Br azdil, T.; Brozek, V.; Forejt, V.; and Kucera, A. 2006. Stochastic games with branching-time winning objectives. In LICS, 349 358. Br azdil, T.; Forejt, V.; Kret ınsk y, J.; and Kucera, A. 2008. The satisfiability problem for probabilistic CTL. In LICS, 391 402. Chakraborty, S., and Katoen, J. P. 2016. On the satisfiability of some simple probabilistic logics. In LICS, 56 65. Chatterjee, K., and Henzinger, T. A. 2012. A survey of stochastic ω-regular games. J. Comput. Syst. Sci. 78(2):394 413. Chatterjee, K.; Henzinger, T. A.; and Piterman, N. 2010. Strategy logic. Information and Computation 208(6):677 693. Chatterjee, K. 2007. Stochastic Omega-Regular Games. Ph.D. Dissertation, University of California, Berkeley. Chen, T., and Lu, J. 2007. Probabilistic alternating-time temporal logic and model checking algorithm. In FSKD, 35 39. Chen, T.; Forejt, V.; Kwiatkowska, M. Z.; Parker, D.; and Simaitis, A. 2013. Automatic verification of competitive stochastic systems. Form. Method. Syst. Des. 43(1):61 92. Cleaveland, R.; Iyer, S.; and Narasimha, M. 2005. Probabilistic temporal logics via the modal µ-calculus. Theor. Comput. Sci. 342(23):316 350. de Alfaro, L., and Majumdar, R. 2001. Quantitative solution of omega-regular games. In STOC, 675 683. Foster, H.; Krolnik, A.; and Lacey, D. 2004. Assertion-based design (2. ed.). Kluwer. Fu, C.; Turrini, A.; Huang, X.; Song, L.; Feng, Y.; and Zhang, L. 2018. Model checking probabilistic epistemic logic for probabilistic multiagent systems. In IJCAI, 4757 4763. Goranko, V., and Vester, S. 2014. Optimal decision procedures for satisfiability in fragments of alternating-time temporal logics. In AIML, 234 253. Gurevich, Y., and Harrington, L. 1982. Trees, automata, and games. In STOC, 60 65. Huang, X., and Kwiatkowska, M. 2016. Model checking probabilistic knowledge: A PSPACE case. In AAAI, 2516 2522. Huang, X., and Luo, C. 2013. A logic of probabilistic knowledge and strategy. In AAMAS, 845 852. Huang, X.; Su, K.; and Zhang, C. 2012. Probabilistic alternatingtime temporal logic of incomplete information and synchronous perfect recall. In AAAI. Huth, M., and Kwiatkowska, M. Z. 1997. Quantitative analysis and model checking. In LICS, 111 122. Liu, W.; Song, L.; Wang, J.; and Zhang, L. 2015. A simple probabilistic extension of modal mu-calculus. In IJCAI, 882 888. Mc Iver, A., and Morgan, C. 2002. Games, probability and the quantitative µ-calculus qmµ. In LPAR, 292 310. Mc Iver, A., and Morgan, C. 2007. Results on the quantitative µ-calculus qmµ. ACM Trans. Comput. Log. 8(1):3. Mio, M. 2012. Game semantics for probabilistic modal µ-calculi. Ph.D. Dissertation, The University of Edinburgh. Mogavero, F.; Murano, A.; Perelli, G.; and Vardi, M. Y. 2014. Reasoning about strategies: On the model-checking problem. ACM Trans. Comput. Log. 15(4):34:1 34:47. Mogavero, F.; Murano, A.; Perelli, G.; and Vardi, M. Y. 2017. Reasoning about strategies: on the satisfiability problem. Log. Meth. Comput. Sci. 13(1). Moon, J., and Moser, L. 1965. On cliques in graphs. Israel J. Math. 3(1):23 28. Morgan, C., and Mc Iver, A. 1997. A probabilistic temporal calculus based on expectations. In FMP, 4 22. Pauly, M. 2011. Logic for Social Software. Ph.D. Dissertation, University of Amsterdam. Rozier, K. Y., and Vardi, M. Y. 2010. LTL satisfiability checking. STTT 12(2):123 137. Schewe, S., and Finkbeiner, B. 2006. Satisfiability and finite model property for the alternating-time mu-calculus. In CSL, 591 605. Schewe, S. 2008. ATL* satisfiability is 2EXPTIME-complete. In ICALP, 373 385. Shmulevich, I., and Dougherty, E. R. 2010. Probabilistic Boolean Networks - The Modeling and Control of Gene Regulatory Networks. SIAM. Vardi, M. Y. 1985. Automatic verification of probabilistic concurrent finite-state programs. In FOCS, 327 338. Walther, D.; Lutz, C.; Wolter, F.; and Wooldridge, M. 2006. ATL satisfiability is indeed EXPTIME-complete. J. Logic Comput. 16(6):765 787.