# parameterised_resourcebounded_atl__b3468956.pdf

The Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI-20)

Parameterised Resource-Bounded ATL

Natasha Alechina Utrecht University Utrecht, The Netherlands n.a.alechina@uu.nl

St ephane Demri LSV, CNRS, ENS Paris-Saclay, University Paris-Saclay Cachan, France demri@lsv.fr

Brian Logan University of Nottingham Nottingham, UK brian.logan@nottingham.ac.uk

It is often advantageous to be able to extract resource requirements in resource logics of strategic ability, rather than to verify whether a ﬁxed resource requirement is sufﬁcient for achieving a goal. We study Parameterised Resource-Bounded Alternating Time Temporal Logic where parameter extraction is possible. We give a parameter extraction algorithm and prove that the model-checking problem is 2EXPTIMEcomplete.

Introduction

There has been a considerable amount of work on logics of strategic ability interpreted over structures where agents actions consume resources, or both produce and consume resources. Examples include an extension of Coalition Logic where actions consume resources and coalitional modalities are annotated with resource bounds ( agents in coalition A have a strategy of cost at most b to achieve φ ) (RBCL) (Alechina et al. 2009; 2011), a similar extension for Alternating Time Temporal Logic ATL, Resource-Bounded ATL (RB-ATL) (Alechina et al. 2010), extensions of Computation Tree Logic and Alternating Time Temporal Logic with both consumption and production of resources (RTL, RAL) (Bulling and Farwer 2010a; 2010b), a variant of Resource-Bounded ATL where all resources are convertible to money and the amount of money is bounded (PRB-ATL) (Della Monica, Napoli, and Parente 2011; 2013), an extension of PRB-ATL to μ-calculus (Della Monica and Lenzi 2012), a version of ATL with more general numerical constraints (QATL ) (Bulling and Goranko 2013), and a version of RB-ATL where unbounded production of resources is allowed (RB ATL) (Alechina et al. 2017b). These logics can express properties such as: a coalition of agents with n units of resource 1 and m units of resource 2 can reach a state where they can maintain property φ forever without spending any resources . Essentially, such logics enable formulating and solving multi-agent conditional planning problems with resources, where the planning problems may be nested and involve invariant properties.

Copyright c 2020, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

However, with one exception, these logics assume that the amount of resources available to the agents is ﬁxed and known in advance. Using these logics, it is possible to ask whether a strategy to achieve some goal exists if the agents start with e.g., 100 units of energy. It is not possible to ask what is the minimal amount of energy that makes some goal achievable (unless it is already known that the goal is achievable for some ﬁxed cost). In other words, it is not possible to extract the values of resource parameters using model-checking procedures for these logics. The only exception is the logic Par RB ATL (Parameterised RB ATL , Resource Bounded Alternating Time Temporal Logic) introduced in (Alechina et al. 2018) that allows concrete values for resource parameters to be synthesised. Using Par RB ATL , we can ask what is the minimal amount of energy needed for achieving some goal, or for which values of x and y does it hold that coalition A has a strategy requiring at most x units of resource to enforce some temporal goal, and coalition A does not have a strategy to enforce another temporal goal with y units of resource. The function form of the model-checking problem for Par RB ATL (return a set of constraints on the resource variables for which the formula is true) was shown to be decidable in (Alechina et al. 2018), but the upper bound on the complexity of the problem was left open.

In this paper we study the model-checking problem for the logic Par RB ATL (Parameterised RB ATL). Par RB ATL is a fragment of Par RB ATL with the same syntax as RB ATL (Resource Bounded ATL), but with variables instead of concrete values for resource amounts. We show that the complexity of the model-checking problem for Par RB ATL is 2EXPTIME-complete, and give an explicit algorithm to compute the set of assignments satisfying the formula. For positive formulas, this algorithm computes the set of Pareto optimal (non-dominated) resource bounds satisfying the formula.

Our results build on results and techniques developed in games on vector addition systems with states (Courtois and Schmitz 2014) and energy games (Jurdzinski, Lazi c, and Schmitz 2015). In particular, we use techniques for solving reachability problems with unknown initial credit for vector addition systems with states for model checking formulas

with the Until temporal operator, and techniques for solving non-termination problems with unknown initial credit for energy games for model checking invariant properties. The crux of establishing a 2EXPTIME upper bound for the Par RB ATL model-checking problem is showing that there is a maximal value of the amount of resource needed to reach a particular state or to enter and execute a non-resourceconsuming loop, and this bound depends only on the model and not on the property to be checked. In the remainder of this section, we brieﬂy recall other research in multi-weighted games related to our work. Juhl, Larsen and Raskin (2013) consider the problem of computing minimal winning vectors in multi-weighted games. They state three decision problems, but focus on the second and third (computing minimal upper bounds b such that on all inﬁnite computations starting from 0, values stay below b), which are less relevant for us. Their ﬁrst problem, computing minimal winning vectors for multiweighted games when all the vectors in the inﬁnite runs have non-negative values, is the most related to model checking Par RB ATL. At the time of (Juhl, Larsen, and Raskin 2013), the best known approach to this problem was that of Br azdil, Janˇcar and Kucera (2010), which has upper bound (k-1)-EXPTIME, where k is the number of resource types. Jurdzinski, Lazi c and Schmitz (2015) signiﬁcantly improved on the results in (Br azdil, Janˇcar, and Kucera 2010), which is why we use the results from (Jurdzinski, Lazi c, and Schmitz 2015) in our paper. The main contribution of our paper is to show that the maximal value given in (Jurdzinski, Lazi c, and Schmitz 2015) transfers to Par RB ATL, and to show how it can be used for nested temporal goals that include negations and a mixture of reachability and non-termination goals.

Syntax and Semantics of Par RB ATL

In this section, we introduce the syntax and semantics of Par RB ATL. Let Agt = {a1, . . . , an} be a set of n 1 agents and Res = {res1, . . . , resr} be a set of r 1 resource types, Π denote a set of propositions, and V ar a countably inﬁnite set of variables. We write Par RB ATL(n,r) for the logic Par RB ATL with n agents and r resources, interpreted over models with the same numbers of agents and resources. When talking about all possible n and r, we write Par RB ATL. Formulas of Par RB ATL(n,r) are deﬁned by the following syntax

ϕ ::= p | ϕ | ϕ ψ | Ax ϕ | Ax ϕ | Ax ϕ U ψ

where p Π is a proposition, A Agt, and x is a sequence of variables x1, . . . , xr from V ar of length r, intuitively placeholders for amounts of resource 1 to resource r. Since formulas contain variables, formulas are evaluated relative to a model, a state and an assignment of values to variables. Informally, a state s and assignment x1 b1, . . . , xr br where the bi s are in N (which we will write as a mapping between vectors, x b) satisﬁes Ax ϕ

if A has a joint action with cost at most b such that all outcomes of this action satisfy ϕ. Similarly, Ax ϕ is satisﬁed if A has a strategy to ensure that ϕ is always true, and the cost of any computation generated by this strategy is at most b; Ax ϕ U ψ is satisﬁed if A has a strategy to enforce ψ while maintaining the truth of ϕ, and the cost of this strategy is at most b. The models of the logic Par RB ATL are resourcebounded concurrent game structures (RB-CGS) introduced in (Alechina et al. 2010), which are also the models of RB ATL.

Deﬁnition 1. A resource-bounded concurrent game structure is a tuple M = (Agt, Res, S, Π, π, Act, d, c, δ) where:

Agt is a non-empty set of n agents, Res is a non-empty set of r resources and S is a ﬁnite non-empty set of states. Π is a ﬁnite set of propositional variables and π : Π (S) is a truth assignment which associates each proposition in Π with a subset of states where it is true. Act is a non-empty set of actions which includes idle, and d : S Agt (Act) is a function which assigns to each s S a non-empty set of actions available to each agent a Agt. For every s S and a Agt, idle d(s, a). We denote joint actions by all agents in Agt available at s by D(s) = d(s, a1) d(s, an). c : S Agt Act Zr is a partial function which maps a state s, and agent a and an action α d(s, a) to a vector of integers where the integer in position i indicates consumption or production of resource resi by the action (positive value for consumption and negative value for production). We stipulate that c(s, a, idle) = 0 for all s S and a Agt where 0 = (0, . . . , 0) Nr. δ : (s, σ) S is a function that for every s S and joint action σ D(s) gives the state resulting from executing σ in s.

Given an RB-CGS M, we denote the set of all inﬁnite sequences of states (computations) by Sω and the set of nonempty ﬁnite sequences of states by S+. For a computation λ = s0s1 . . . Sω, we use the notation λ[i] = si and λ[i, j] = si . . . sj where j i 0. Given an RB-CGS M and a state s S, a joint action by a coalition A Agt is a tuple σ = (σa)a A such that σa d(s, a). The set of all joint actions for A at state s is denoted by DA(s). Given a joint action by the grand coalition σ D(s), σA denotes the joint action executed by A: σA = (σa)a A. The set of all possible outcomes of a joint action σ DA(s) at state s is:

out(s, σ) = {δ(s, σ ) | σ D(s) σ = σ A}

The cost of a joint action σ DA(s) is deﬁned as:

cost(s, σ) =

a A c(s, a, σa)

Given an RB-CGS M, a strategy for a coalition A Agt is a mapping FA : S+ Act|A| such that, for every λs S+, FA(λs) DA(s). A computation λ Sω is consistent with a strategy FA iff, for all i 0, λ[i + 1]

out(λ[i], FA(λ[0, i])). We denote by out(s, FA) the set of all computations λ consistent with FA that start from s. We use the usual point-wise notation for vector comparison and addition. In particular, (b1, . . . , br) (d1, . . . , dr) iff bi di for all i {1, . . . , r}, and (b1, . . . , br) + (d1, . . . , dr) = (b1 + d1, . . . , br + dr). For convenience, let us denote the set of possible resource bounds by B = Nr. Given a bound b B, a computation λ out(s, FA) is b-consistent with FA iff, for every i 0,

j=0 cost(λ[j], FA(λ[0, j])) b

Note that this deﬁnition implies that if agents start with resource allocation b, then on every preﬁx of the computation the amount of resources they have is never below 0. Furthermore, the constraint on b-consistent computations involves only action costs for agents from the coalition A (and not on the agents outside A). The set of all b-consistent computations of FA starting from state s is denoted by out(s, FA, b). FA is a b-strategy iff out(s, FA) = out(s, FA, b) for any state s. Given an RB-CGS M, a state s of M, the truth of a Par RB ATL formula ϕ with respect to M, s and assignment v from the set of variables to N is deﬁned inductively on the structure of ϕ as follows: M, s, v |= p iff s π(p); M, s, v |= φ iff M, s, v |= φ; M, s, v |= φ ψ iff M, s, v |= φ or M, s, v |= ψ; M, s, v |= Ax φ iff there exists a v(x)-strategy FA such that for all λ out(s, FA): M, λ[1], v |= φ; M, s, v |= Ax φ iff there exists a v(x)-strategy FA such that for all λ out(s, FA) and i 0: M, λ[i], v |= φ; M, s, v |= Ax φ U ψ iff there exists a v(x)-strategy FA such that for all λ out(s, FA), there exists i 0 such that M, λ[i], v |= ψ and M, λ[j], v |= φ for all j {0, . . . , i 1}. Note that along the evaluation of formula, the assignment v does not vary, and the formulae are evaluated on states and not on pairs of the form (s, b). The two decision forms of the model-checking problem for Par RB ATL are as follows. Deﬁnition 2. The following problem is the unknown initial assignment model-checking problem for Par RB ATL. Input: n, r 1 (in unary), a Par RB ATL(n,r) formula ϕ, a ﬁnite model M, and a state s. Question: Is there an assignment v such that M, s, v |= ϕ? Deﬁnition 3. The following problem is the known initial assignment model-checking problem for Par RB ATL. Input: n, r 1 (in unary), a Par RB ATL(n,r) formula ϕ, a ﬁnite model M, a state s, and an assignment v. Question: Is it true that M, s, v |= ϕ? The function form of the model-checking problem is to compute the set of all satisfying assignments.

Deﬁnition 4. The following problem is the function form of the model-checking problem for Par RB ATL(n,r).

Input: n, r 1 (in unary), a Par RB ATL(n,r) formula ϕ, a ﬁnite model M, and a state s. Output: Compute a formula γ describing the set of assignments v such that M, s, v |= ϕ, where γ is of the following form (for b N):

γ := x b | | | γ | γ γ | γ γ

Clearly, if the function form of the problem is solvable, then both model-checking problems for Par RB ATL are decidable. The following is an immediate consequence1 of Theorem 8 in (Alechina et al. 2018):

Theorem 1. The function form of the model-checking problem for Par RB ATL is decidable.

However, the upper bound on complexity for this problem is not known. The method for proving decidability of the function form of the model-checking problem for Par RB ATL is by reduction to parity games on singlesided VASS (Abdulla et al. 2013). While this can be used for Par RB ATL model checking, it leads to a huge complexity bound (probably non primitive recursive, as structures close to Karp-Miller trees need to be constructed). In the next section, we give a direct algorithm for model checking Par RB ATL.

Theorem 2. The lower bound on the complexity of the known initial assignment model-checking problem for Par RB ATL is 2EXPTIME.

Proof. From the result that the model-checking problem for RB ATL is 2EXPTIME-hard (Theorem 3, (Alechina et al. 2018)) when r is an input parameter.2 Note that the proof of Theorem 3 in (Alechina et al. 2018) does not require inﬁnite bounds, so the result also holds for RB ATL without inﬁnite bounds. It is straightforward to reduce the modelchecking problem for RB ATL to the known initial assignment model-checking problem for Par RB ATL.

Upper Bound on the Complexity of the Function Form of the Par RB ATL Model-Checking Problem In this section, we ﬁrst introduce the key concepts used in the model-checking algorithm for Par RB ATL, and in the process show that Until and Box formulas of Par RB ATL that do not contain nested modalities can be model-checked

1The set of assignments considered in (Alechina et al. 2018) is not of the form V ar N but of the form V ar N { }, and the constraint formula may include x = . This is motivated by the connection between Par RB ATL and RB ATL where modalities have bounds which may contain . That in turn was motivated by the desire to make RB ATL a conservative extension of ATL. However, the proof of Theorem 8 in (Alechina et al. 2018) goes through for the case when assignments are of the form V ar N, which is what we are interested in this paper. 2It is EXPTIME-hard for ﬁxed r 4, see Corollary 1 in (Alechina et al. 2018).

in 2EXPTIME. We then state the model-checking algorithm and the general result on the upper bound on complexity of the function form of the model-checking problem for Par RB ATL.

Preliminaries and Deﬁnitions

The proof of a 2EXPTIME upper bound for the function form of the Par RB ATL model-checking problem builds on results for reachability games with unknown initial credit for alternating vector addition systems with states (Courtois and Schmitz 2014) and non-termination energy games with unknown initial credit (Jurdzinski, Lazi c, and Schmitz 2015). These results roughly correspond to model checking formulas of the form Ax φ U ψ and Ax φ, respectively, where φ and ψ do not in turn contain variables, i.e. these are propositional formulae. First we introduce a notion of a witness for a successful A-strategy in enforcing a formula of the form Ax φ U ψ. Informally, a witness is a ﬁnite tree with the root labelled (s, b), representing the result of executing a successful Astrategy from s under assignment x b until ψ states are reached on all branches.

Deﬁnition 5. Given a formula of the form Ax φ U ψ, a state s in an RB-CGS M, and a vector b Nr, a ﬁnite tree with root (s, b) is a witness for M, s, x b |= Ax φ U ψ if

all nodes of the tree are of the form (s , b ) where s is a state (resulting from executing the joint action in the state of the parent node) and b the vector of resource amounts A have in s as a result, and b 0, in all leaf nodes (s , b ) of the tree, s satisﬁes ψ (since ψ is propositional, this can be determined using only the truth assignment π), in all non-leaf nodes (s , b ) of the tree, s satisﬁes φ (again, only the truth assignment π is needed here).

In the ﬁrst point of the deﬁnition, an edge (s, b) (s , b ) is present in the tree because of some joint action σ D(s), such that δ(s, σ) = s , and for all i [1, r], we have b [i] = b[i] (Σa A c(s, a, σ(a))[i]). Note that in the expression above, the value c(s, a, σ(a))[i] occurs negatively. Observe that a witness corresponds to a perfect recall bstrategy for A, because after reaching the leaf nodes satisfying ψ, the coalition can forever execute the idle action that consumes no resources. A witness for M, s, x b |= Ax φ is deﬁned similarly.

Deﬁnition 6. A witness for M, s, x b |= Ax φ is a ﬁnite tree with root (s, b) where in every node the state satisﬁes φ, all resource allocations in the nodes are nonnegative, and for each leaf node (s , b ) there is exactly one other node (s , b ) on the same branch with the same state s and b b .

In other words, a witness for Ax φ describes the initial preﬁx of a successful strategy for satisfying Ax φ. It can be extended to a perfect recall b-strategy for A since

it describes how to enforce a non-consuming loop on any computation. Clearly, if there is a witness for M, s, x b |= φ, then there is also a witness for M, s, x b |= φ, where b b (the same strategy will work for a greater initial resource allocation). We will also talk about w being a witness for φ being true in M and s if there is some assignment v such that w is a witness for M, s, v |= φ, and about w being a witness for φ being true in M if there is some state s and assignment v such that w is a witness for M, s, v |= φ. A norm of a vector b is the largest component of the vector (by absolute value), max(|b[i]|). This notion can be extended to the norm of a ﬁnite set of vectors. The largest value occurring in a description of an RB-CGS M is the norm of the set of costs of actions. We will denote this number by max M. Similarly, we can talk about the norm of a witness w, w . Let us call a witness w norm-minimal for some model-checking instance if there is no witness w for the same model-checking instance with w < w .

Proposition 1. If a witness for Ax φ being true in M, s exists, then there exists a witness w with w (4|S| max M)2(r+2)3.

Proof. The proof uses two results from (Jurdzinski, Lazi c, and Schmitz 2015). The ﬁrst result (Lemma 3.1) is that if Player 1 has a winning strategy in a bounded energy game (ensuring non-termination, with counters remaining nonnegative and below a certain bound), then Player 1 has a strategy where the norm of any computation is bounded by (4|S| max M)2(r+2)3. The second result (Proposition 2.4) is that Player 1 has a winning strategy in a non-termination energy game if and only if Player 1 has a winning strategy in a bounded energy game on the same graph with resource-decrementing loops added to every state. To construct the winning strategy for the bounded game, resourcedecrementing loops can be used to keep the resource amounts below the upper bound, while the transitions in the non-termination energy game simply ignore decrementing loops (in particular, they may involve incrementing some resource forever in a loop). If all useful (non-resourcedecrementing-loop transitions) in the bounded game can be accomplished without more than (4|S| max M)2(r+2)3 resources on any counter up to the point the ﬁnal loop is repeatedly executed, then the same applies to the original energy game, and hence to a witness for Ax φ being true in M, s.

Proposition 2. If a witness for Ax ψ1 U ψ2 being true in M, s exists, then there exists a witness w with w (4(|S| + 1) max M)2(r+3)3.

Proof. By contradiction, assume that in some model M, a norm-minimal witness w for Ax ψ1 U ψ2 being true in M, s has norm w > (4(|S| + 1) max M)2(r+3)3. Produce a model M from M as follows. If any joint action by A which is involved in w is composed of only idle actions by all agents in A, add to M another 0 cost action α and replace idle with α in w. Add an extra resource type r+1, and let all

actions by agents in A involved in w (apart from the idle action) consume one unit of r+1. Redirect all other transitions (not involved in w) to an additional new state s not satisfying ψ2. Note that M contains a witness w corresponding to w but with extra costs on the (r + 1)th resource, and w is still a minimal (and the only) witness for Ax ψ1 U ψ2. This is because the only state in M that is not part of w is s , and a ψ2 state is not accessible from s . Note also that w does not contain any non-resource-consuming loops on the branches leading to ψ2 states because of the costs on the (r + 1)th resource. Now to complete the transformation of M into M , we add a new propositional variable p that holds in all w states but not in s , and add 0 cost self-loops by the idle action to all ψ2 states. Consider the formula Ax p. The only way to satisfy it in M , s is to use a strategy corresponding to w . (This is because the only way to reach a nonresource-consuming loop satisfying p is to reach a ψ2 state, and w is a minimal witness to reaching a ψ2 state while satisfying ψ1, hense while satisfying p). The norm of w is still the same as the norm of w (without loss of generality, we assume that the costs on r +1 are less than w ), so the normminimal witness for Ax p has the norm strictly greater than (4(|S|+1) max M)2(r+3)3, that is, strictly greater than (4|S |) max M )2(r +2)3, where |S | = |S| + 1 is the number of states in M and r = r + 1 is the number of resource types in M . This contradicts Proposition 1.

Given that the norm of a norm-minimal witness is bounded by BM = (4(|S| + 1) max M)2(r+3)3, the height of a witness is bounded by an exponential in r and max M (since the model is encoded in binary). This means that all such potential witnesses can be generated and checked by an alternating Turing Machine in EXPSPACE. As pointed out in (Courtois and Schmitz 2014), this gives a naive algorithm for computing the Pareto frontier (all non-dominated initial resource allocations): all potential witnesses with resource allocations at the root starting from (0, . . . , 0) and up to (BM, . . . , BM) can be explored in turn in AEXPSPACE, and the non-dominated values b saved as constraints x b. Since ASPACE(f(n)) = DTIME(2O(f(n))), AEXPSPACE=2EXPTIME, and this algorithm produces a description of satisfying assignments for non-nested Box and Until formulas in 2EXPTIME. In the next section, we show that this upper bound applies to the whole language.

Model-Checking Algorithm In this section, we present a 2EXPTIME algorithm to solve the function form of the Par RB ATL model-checking problem. Given a model M and a Par RB ATL formula φ with variables x1, . . . , xn, the algorithm returns a set of pairs φ M of the form (s, γ) where s is a state and γ a constraint formula deﬁning the set of assignments v to x1, . . . , xn such that M, s, v |= φ. Note that γ could be equivalent to (unsatisﬁable), in which case there is no assignment v such that M, s, v |= φ. We will say that γ is satisﬁable iff there is at least one assignment v satisfying the constraints in γ. To formulate the cases for Ax ψ1 U ψ2 and Ax ψ, we need to redeﬁne the notion of a witness relative to

ψ1 M, ψ2 M, and ψ M. A constrained witness for Ax ψ1 U ψ2 being true in M, s is a ﬁnite tree where each leaf is a node of the form (s , b ) (for some b ) where s is such that (s , γ ) ψ2 M and γ is satisﬁable. All other nodes are of the form (s , b ) where (s , γ ) ψ1 M and γ is satisﬁable. A constrained witness w for Ax ψ1 U ψ2 being true in M, s is exactly like a (non-constrained) witness for Ax ψ1 U ψ2 being true in M, s, but instead of requiring states in the nodes to satisfy ψ1 or ψ2, we require them to be in ψi M with a satisﬁable constraint formula γ . The formula γ describes constraints on resources required to satisfy ψ1 or ψ2. If ψi is a propositional formula, then γ can only be of the form , otherwise it may be any satisﬁable constraint, for example, y < 1. These γ constraints are going to be used to construct the formula γ(w) deﬁned below.3

Similarly, a constrained witness for Ax ψ is obtained from a witness by replacing the notion of a state s satisfying ψ with a requirement that (s , γ ) ψ M and γ is satisﬁable. We call a constrained witness w with the root (s, b) nondominated if it is norm-minimal and there is no constrained witness w with root (s, b ) such that b < b. We denote the set of non-dominated constrained witnesses for (M, s, ψ) by W(M, s, ψ). This set is ﬁnite and each witness has height bounded by |S| (BM + 1)r by Propositions 1 and 2 (because no two nodes on the same branch have the same state and resource vector, and the maximal value in resource vector is BM). Given a constrained witness w, we deﬁne a formula γ(w) describing constraints on the nodes in w. Intuitively, for formulas of the form Ax ψ1 U ψ2 and Ax ψ, the constraint on x is x b if the root of the witness is (s, b). There are however additional provisos. If some of the variables in x occur in proper subformulas of that formula, then there will be some more constraints on x arising from the constraints on the occurrences in proper subformulas. Also, constraint x b is only tight (in the sense of M, s, v |= Ax ψ1 U ψ2 iff v |= x b), if the witness is non-dominated. In addition to x, ψ1, ψ2 and ψ may have other strategic modalities with variables y. The purpose of the formula γ(w) is to extract the constraint on y from the witness, by combining constraint formulas for the states in nodes of the witness. For a constrained witness w for Ax ψ1 U ψ2 with a root (s, b), γ(w) = x b

s leaves(w), (s , γ ) ψ2 M

s leaves(w), (s , γ ) ψ1 M

where s leaves(w) if (s , b ) is a leaf node of w and s leaves(w) if (s , b ) is a non-leaf node of w.

3Note that there is no connection between γ and b . As in the (non-constrained) witness, b corresponds to values remaining on the original resource assignement to x, that is, the resources used by the A-strategy to satisfy the formula Ax ψ1 U ψ2.

Algorithm 1 Computing φ0

1: function PARRB ATL-MC(M, φ0) 2: for φ Sub(φ0) do 3: case φ = p 4: p M {(s, ) : s π(p)} {(s, ) : s π(p)}

5: case φ = ψ 6: ψ M {(s, γ) : (s, γ) ψ M}

7: case φ = ψ1 ψ2 8: ψ1 ψ2 M {(s, γ1 γ2) : (s, γ1) ψ1 M (s, γ2) ψ2 M} 9: case φ = ψ1 ψ2 10: ψ1 ψ2 M {(s, γ1 γ2) : (s, γ1) ψ1 M (s, γ2) ψ2 M}

11: case φ = Ax ψ 12: Ax ψ M {(s, γ) : γ =

σ DA(s)(x cost(σ, s)

s out(s,σ),(s ,γ ) ψ M γ )}

13: case φ = Ax ψ1 U ψ2 14: Ax ψ1 U ψ2 {(s, γ) : γ =

w W (M,s, Ax ψ1 U ψ2) γ(w)}

15: case φ = Ax ψ 16: Ax ψ {(s, γ) : γ =

w W (M,s, Ax ψ) γ(w)}

17: return φ0 M

For a constrained witness w for Ax ψ with a root (s, b),

s nodes(w), (s , γ ) ψ M

where s nodes(w) if (s , b ) for some b is a node of w. For each subformula φ of φ0 in increasing order of complexity, Algorithm 1 computes φ M.

Theorem 3. On the input of M, φ0, Algorithm 1 returns a set of pairs φ0 M such that (s, γ) φ0 M if, and only if: for all v M, s, v |= φ0 iff v |= γ.

Proof. The proof is by induction on the complexity of φ0. The inductive hypothesis is that (s, γ) φ M iff (M, s, v |= φ iff v |= γ). Clearly this holds for atomic propositions: (s, ) p M iff M, s |= p, by line 4 of Algorithm 1. For booleans, the conditions on assignments are also straightforward (lines 6, 8, 10). Consider Ax ψ. M, s, v |= Ax ψ iff v assigns to x the amount of resources sufﬁcient to execute an action σ such that all outcomes of that action satisfy ψ under v. (This σ is the ﬁrst action in a successful strategy, but just as in the case of Until, all subsequent actions in the strategy can be chosen to be idle, which means the strategy will not require more than v(x) resources.) By the inductive hypothesis, all outcomes of σ satisfy ψ under v iff for every state s in out(s, σ) such that (s , γ ) ψ M, v |= γ , that is, iff v |=

s out(s,σ),(s ,γ ) ψ M γ . Hence M, s, v |= Ax ψ iff for some executable σ, v |= s out(s,σ),(s ,γ ) ψ M γ . This is exactly what line 12 of Algorithm 1 requires for Ax ψ M. M, s, v |= Ax ψ1 U ψ2 iff there is a strategy such that each computation on this strategy satisﬁes ψ1 under v until it satisﬁes ψ2 under v. This holds iff there is a

witness w such that initial amount of resources is sufﬁcient to execute the corresponding strategy and all nodes in the witness satisfy the corresponding formulas under v. By the inductive hypothesis, this holds iff for some nondominated witness rooted in (s, b) with b v(x), v |=

w W (M,s, Ax ψ1 U ψ2) γ(w), which is the constraint formula on line 14 of Algorithm 1. M, s, v |= Ax ψ holds iff there is a strategy such that each computation on this strategy reaches a loop where no resources are consumed and every state on this computation satisﬁes ψ under v. By the inductive hypothesis, this holds iff for some non-dominated witness w rooted in (s, b) with b v(x), v satisﬁes γ(w), iff v |=

w W (M,s, Ax ψ1)

s nodes(w),(s ,γ ) ψ M γ , which is the constraint formula on line 16 of Algorithm 1.

Observe that if φ is a positive formula (does not contain modalities in the scope of negations), then the corresponding constraint formula γ only has subformulas of the form x b. Since the constraint formula describes nondominated witnesses, it speciﬁes the optimal values of resource parameters (the Pareto frontier).

Theorem 4. The upper bound on the complexity of the function form of the model-checking problem for Par RB ATL is 2EXPTIME.

Proof. The set of non-dominated witnesses W(M, s, φ ) for each subformula φ of φ0 can be computed in AEXSPACE. Hence the whole algorithm can be executed in AEXSPACE which is equivalent to 2EXPTIME.

Conclusions and Future Work Algorithm 1 returns a set of states satisfying a Par RB ATL formula together with a constraint formula describing the set of assignments to resource variables for which the

Par RB ATL formula is true in that state. The algorithm can be easily modiﬁed to return the witness (ﬁnite representation of the strategy). The complexity of the algorithm is very high (2EXPTIME). However, only the number of resource types and the log of the largest cost in the model (log max M) are in the exponent; the running time is polynomial in the number of states and transitions in the model. In future work we plan to implement the algorithm. Another direction of future work is a parametrised version of the fragment of Resource Agent Logic (RAL) with a decidable model-checking problem that was described in (Alechina et al. 2017a). In Par RB ATL, the resources needed for satisfying each strategic modality are independent, even if nested subformulas share variables. For example, Ax U (q Ax U p) means that there is a value for x, e.g., 5, such that given 5 units of the resource, A can enforce a state where q holds, and from where, given 5 units of the resource, A can enforce p. The second, nested, strategy does not use the amount of resource remaining from the execution of the ﬁrst strategy, but starts from scratch with x resources, whatever value x happens to have. On the other hand, in RAL, there is an extra annotation for strategic modalities that means with the remaining resources . For example, Ax U (q A U p) means that there is a value for x, such that if A have this amount of resources, then they could enforce a state where q holds, and with remaining resources, they could enforce p. This logic can be used to solve model-checking problems that involve nested goals that share a single allocation of resources.

Acknowledgements We thank the anonymous reviewers for their helpful comments.

Abdulla, P.; Mayr, R.; Sangnier, A.; and Sproston, J. 2013. Solving Parity Games on Integer Vectors. In CONCUR 13, volume 8052 of Lecture Notes in Computer Science, 106 120. Springer.

Alechina, N.; Logan, B.; Nguyen, H. N.; and Rakib, A. 2009. A logic for coalitions with bounded resources. In Boutilier, C., ed., Proceedings of the Twenty First International Joint Conference on Artiﬁcial Intelligence (IJCAI 2009), volume 2, 659 664. AAAI Press.

Alechina, N.; Logan, B.; Nga, N. H.; and Rakib, A. 2010. Resource-bounded alternating-time temporal logic. In van der Hoek, W.; Kaminka, G.; Lesp erance, Y.; Luck, M.; and Sen, S., eds., Proceedings of the Ninth International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 481 488. IFAAMAS.

Alechina, N.; Logan, B.; Nguyen, H. N.; and Rakib, A. 2011. Logic for coalitions with bounded resources. Journal of Logic and Computation 21(6):907 937.

Alechina, N.; Bulling, N.; Logan, B.; and Nguyen, H. N. 2017a. The virtues of idleness: A decidable fragment of resource agent logic. Artiﬁcial Intelligence 245:56 85.

Alechina, N.; Logan, B.; Nguyen, H. N.; and Raimondi, F. 2017b. Model-checking for resource-bounded ATL with production and consumption of resources. Journal of Computer and System Sciences 88:126 144.

Alechina, N.; Bulling, N.; Demri, S.; and Logan, B. 2018. On the complexity of resource-bounded logics. Theoretical Computer Science 750:69 100. Br azdil, T.; Janˇcar, P.; and Kucera, A. 2010. Reachability games on extended vector addition systems with states. In ICALP 10, volume 6199 of Lecture Notes in Computer Science, 478 489. Springer. Bulling, N., and Farwer, B. 2010a. Expressing properties of resource-bounded systems: The logics RBTL and RBTL . In Computational Logic in Multi-Agent Systems - 10th International Workshop, CLIMA X, Revised Selected and Invited Papers, volume 6214 of Lecture Notes in Computer Science, 22 45. Bulling, N., and Farwer, B. 2010b. On the (un-)decidability of model checking resource-bounded agents. In Coelho, H.; Studer, R.; and Wooldridge, M., eds., Proceedings of the 19th European Conference on Artiﬁcial Intelligence (ECAI 2010), 567 572. IOS Press. Bulling, N., and Goranko, V. 2013. How to be both rich and happy: Combining quantitative and qualitative strategic reasoning about multi-player games (extended abstract). In Mogavero, F.; Murano, A.; and Vardi, M. Y., eds., Proceedings of the 1st International Workshop on Strategic Reasoning (SR 2013), volume 112 of Electronic Proceedings in Theoretical Computer Science, 33 41. Courtois, J., and Schmitz, S. 2014. Alternating vector addition systems with states. In 39th International Symposium on Mathematical Foundations of Computer Science (MFCS 14), volume 8634 of Lecture Notes in Computer Science, 220 231. Springer. Della Monica, D., and Lenzi, G. 2012. On a priced resourcebounded alternating μ-calculus. In Filipe, J., and Fred, A. L. N., eds., Proceedings of the 4th International Conference on Agents and Artiﬁcial Intelligence (ICAART 2012), 222 227. Sci Te Press. Della Monica, D.; Napoli, M.; and Parente, M. 2011. On a logic for coalitional games with priced-resource agents. Electronic Notes in Theoretical Computer Science 278:215 228. Della Monica, D.; Napoli, M.; and Parente, M. 2013. Model checking coalitional games in shortage resource scenarios. In Puppis, G., and Villa, T., eds., Proceedings of the Fourth International Symposium on Games, Automata, Logics and Formal Veriﬁcation (Gand ALF 2013), volume 119 of Electronic Proceedings in Theoretical Computer Science, 240 255. Juhl, L.; Larsen, K.; and Raskin, J.-F. 2013. Optimal bounds for multiweighted and parametrised energy games. In Theories of Programming and Formal Methods - Essays Dedicated to Jifeng He on the Occasion of His 70th Birthday, volume 8051 of Lecture Notes in Computer Science, 244 255. Springer. Jurdzinski, M.; Lazi c, R.; and Schmitz, S. 2015. Fixed-dimensional energy games are in pseudo-polynomial time. In Proceedings of ICALP 2015, 260 272.