# constrained_phiequilibria__23c58c30.pdf Constrained Phi-Equilibria Martino Bernasconi 1 Matteo Castiglioni 1 Alberto Marchesi 1 Francesco Trov o 1 Nicola Gatti 1 Abstract The computational study of equilibria involving constraints on players strategies has been largely neglected. However, in real-world applications, players are usually subject to constraints ruling out the feasibility of some of their strategies, such as, e.g., safety requirements and budget caps. Computational studies on constrained versions of the Nash equilibrium have lead to some results under very stringent assumptions, while finding constrained versions of the correlated equilibrium (CE) is still unexplored. In this paper, we introduce and computationally characterize constrained Phi-equilibria a more general notion than constrained CEs in normal-form games. We show that computing such equilibria is in general computationally intractable, and also that the set of the equilibria may not be convex, providing a sharp divide with unconstrained CEs. Nevertheless, we provide a polynomial-time algorithm for computing a constrained (approximate) Phiequilibrium maximizing a given linear function, when either the number of constraints or that of players actions is fixed. Moreover, in the special case in which a player s constraints do not depend on other players strategies, we show that an exact, function-maximizing equilibrium can be computed in polynomial time, while one (approximate) equilibrium can be found with an efficient decentralized no-regret learning algorithm. 1. Introduction Over the last years, equilibrium computation problems have received a terrific attention from AI and ML research (Brown & Sandholm, 2019; Bakhtin et al., 2022), as game-theoretical equilibrium notions provide a principled framework to deal with multi-player decision-making 1Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan, Italy. Correspondence to: Martino Bernasconi . Proceedings of the 40 th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s). problems. Most of the works on equilibrium computation problems focus on classical solution concepts such as the well-known Nash equilibrium (NE) (Nash, 1951) and correlated equilibrium (CE) (Aumann, 1974) , thus neglecting the presence of constraints entirely. However, in most of the real-world applications, the players are usually subject to constraints that rule out the feasibility of some of their strategies, such as, e.g., safety requirements and budget caps. Thus, addressing equilibrium notions involving constraints is a crucial step needed for the operationalization of game-theoretic concepts into real-world settings. The study of equilibrium notions involving constraints was initiated by Arrow & Debreu (1954), who define the concept of generalized NE (GNE). The GNE can be interpreted as an NE of a game where players strategies are subject to some constraints, which must be satisfied at the equilibrium and also determine which are the feasible players deviations. However, given that computing a GNE is clearly PPADhard (Daskalakis et al., 2009), all the works dealing with the computation of GNEs (see, e.g., (Facchinei & Kanzow, 2010)) provide efficient algorithms only in specific settings that require very stringent assumptions. Most of the computationally challenges in finding GNEs are inherited from the NE. In settings in which constrained are not involved, the computational issues of NEs are usually circumvented by considering weaker equilibrium notions. Among them, those that have received most of the attention in the literature are the CE and its variations, which have been shown to be efficiently computable in several settings of interest (Papadimitriou & Roughgarden, 2008; Celli et al., 2020). Surprisingly, with the only exception of (Chen et al., 2022) (see Section 1.2 for a detailed discussion on it), no work has considered the problem of computing CEs in constrained settings. Thus, investigating whether the CE retains its nice computational properties when adding constraints on players strategies is an open interesting question. 1.1. Original Contributions In this paper, we introduce and computationally characterize constrained Phi-equilibria, starting, as it is customary, from the setting of normal-form games. Our equilibria include the constrained versions of the classical CE and all of its variations as special cases, by generalizing the notion of Phi- Constrained Phi-Equilibria equilibria introduced by Greenwald & Jafari (2003) to constrained settings. In particular, constrained Phi-equilibria are defined as Phi-equilibria, but in games where players are subject to some constraints. Such constraints must be satisfied at the equilibrium, and, additionally, players are only allowed to undertake safe deviations, namely those that are feasible according to the constraints. Crucially, the set of safe deviations of a player does not only depend on the strategy of that player, but also on those of the others. We start by showing that one of the most appealing computational properties of Phi-equilibria, namely that the set of the equilibria of a game is convex, is lost when moving to their constrained version. This raises considerable computational challenges in computing constrained Phi-equilibria. Indeed, we formally prove a strong intractability result: for any factor α > 0, it is not possible, unless P = NP, to find in polynomial time a constrained (approximate) Phiequilibrium which achieves a multiplicative approximation α of the optimal value of a given linear function. Then, in the rest of the paper, we show several ways in which such a negative result can be circumvented. We prove that a constrained approximate Phi-equilibrium which maximizes a given linear function can be found in polynomial time, when either the number of constraints or that of players actions is fixed. Our results are based on a general algorithm that employs a non-standard Lagrangification of the constraints defining the set of safe deviations of a player. Moreover, the algorithm needs a way of dealing with the non-convexity of the set of the equilibria, which we provide in the form of a clever discretization of the space of the Lagrange multipliers. Finally, we focus on the special case in which the constraints defining the safe deviations of a player do not depend on the the strategies of the other players, but only on the strategy of that player. This includes constrained Phi-equilibria identifying a particular constrained version of the coarse CE by Moulin & Vial (1978a), in which the players strategies are subject to marginal cost constraints. These arise in several real-world applications in which the players have bounded resources, such as, e.g., budget-constrained bidding in auctions. In such a special case, we prove that a constrained (exact) Phi-equilibrium maximizing a given linear function can be computed in polynomial time, and we provide an efficient decentralized no-regret learning algorithm for finding one constrained (approximate) Phi-equilibrium. 1.2. Related Works GNEs Rosen (1965) initiated the study of the computational properties of GNEs. After that, several other works addressed the problem of computing GNEs by mainly exploiting techniques based on quasi-variational inequalities (see (Facchinei & Kanzow, 2010) for a survey). More re- cently, some works (Kanzow & Steck, 2016; Bueno et al., 2019; Jordan et al., 2022; Goktas & Greenwald, 2022) also studied the convergence of iterative optimization algorithms to GNEs. In order to provide efficient algorithms, all these works need to introduce very stringent assumptions, which are even stronger than those required for the efficient computation of NEs. Constrained Markov Games Equilibrium notions involving constraints have also been addressed in the literature on Markov games, with (Altman & Shwartz, 2000; Alvarez Mena & Hern andez-Lerma, 2006) being the first works introducing GNEs in such a field. More recently, Hakami & Dehghan (2015) defined a notion of constrained CE in Markov games. However, the incentive constraints in their notion of equilibrium only predicate on pure deviations, which, in presence of constraints, may lead to empty sets of safe deviations. Very recently, Chen et al. (2022) generalize the work of Hakami & Dehghan (2015) by considering mixed deviations. However, their algorithm provides rather weak convergence guarantees, as it only ensures that the returned solution satisfies incentive constraints in expectation. Indeed, as we show in Proposition 3.1, the set of constrained equilibria may not be convex (it is easy to see that Example 1 also applies to the setting studied by Chen et al. (2022)), and, thus, the fact that incentive constraints are only satisfied in expectation does not necessarily imply that the true incentive constraints defining the equilibrium are satisfied. We refer the reader to Appendix A for additional details on these aspects. 2. Preliminaries In this section, we introduce all the preliminary definitions and results that are needed in the rest of the paper. 2.1. Cost-constrained Normal-form Games In a normal-form game, there is a finite set N := {1, . . . , n} of n players. Each player i N has a finite set Ai of actions available, with s := |Ai| for i N being the number of players actions.1 We denote by a A := i N Ai an action profile specifying an action ai for each player i N. Moreover, for i N, we let a i A i := j =i N Ai be an action profile of all players other than i, while (a, a i) is the action profile obtained by adding a Ai to a i. Finally, we let ui : A [0, 1] be the utility function of player i N, with ui(a) being the utility perceived by that player when the action profile a A is played. We extend classical normal-form games by considering the 1For ease of presentation, in this paper we assume that all the players have the same number of actions. All the results can be easily generalized to the case of different numbers of actions. Constrained Phi-Equilibria case in which each player i N has mi cost functions, namely ci,j : A [ 1, 1] for j [mi].2 Each player i N is subject to mi constraints, which require that all player i s costs are less than or equal to zero.3 For ease of notation, we assume w.l.o.g. that all players have the same number of constraints, namely m := mi for all i N. Moreover, we encode the costs of player i N by a vectorvalued function ci : A [ 1, 1]m such that, for every a A, the j-th component of the vector ci(a) is ci,j(a). Correlated Strategies In this paper, we deal with solution concepts defined by correlated strategies. A correlated strategy z A is a probability distribution defined over the set of actions profiles, with z[a] denoting the probability assigned to a A.4 With an abuse of notation, for every player i N, we let ui(z) be player i s expected utility when the action profile played by the players is drawn from z A. In particular, it holds ui(z) := P a A ui(a)z[a]. Similarly, we let ci(z) := P a A ci(a)z[a] be the vector of player i s expected costs, so that player i s constraints can be compactly written as ci(z) 0. Finally, we define S A as the set of safe correlated strategies, which are those satisfying the cost constraints of all players. Formally: S := {z A | ci(z) 0 i N} . In the following, we assume w.l.o.g. that S = . 2.2. Constrained Phi-equilibria We generalize the notion of Phi-equilibria (Greenwald & Jafari, 2003) to cost-constrained normal-form games. Such equilibria are defined as correlated strategies z A that are robust against a given set Φ of players deviations, in the sense that, if a mediator draws an action profile a A according to z and recommends to play action ai to each player i N, then no player has an incentive to deviate from their recommendation by selecting a deviation in Φ. For every i N, we let Φi be the set of player i s deviations, i.e., linear transformations φi : Ai Ai that prescribe a probability distribution over player i s actions for every possible action recommendation. For ease of notation, we encode a deviation φi by means of its matrix representation. Formally, an entry φi[b, a] of the matrix represents the probability assigned to action a Ai by φi(b). We denote the set of all the possible deviations by Φ := {Φi}i N . Given a correlated strategy z A and a deviation φi Φi, we define φi z as the modification of z induced by φi, 2In this paper, given some x N>0, we let [x] := {1, . . . , x} be the set of the first x natural numbers. 3Since z A, we can assume w.l.o.g. that all the constraints are of the form 0, as any constraint can always be cast in such a form by suitably manipulating the cost function ci,j. 4In this paper, given a finite set X, we denote by X the set of all the probability distributions defined over the elements of X. which is a linear transformation that can be expressed as follows in terms of matrix representation: (φi z)[ai, a i] := X b Ai φi[b, ai]z[b, a i], for every ai Ai and a i A i. Moreover, given a set Φi of deviations of player i N, in the following we denote by ΦS i (z) := {φi Φi | φi z S} the set of safe deviations for player i at a given correlated strategy z A. We are now ready to provide our definition of constrained Phi-equilibria in cost-constrained normal-form games. Definition 2.1 (Constrained ϵ-Phi-equilibria). Given a set Φ := {Φi}i N of deviations and an ϵ > 0, a constrained ϵ-Phi-equilibrium is a safe correlated strategy z S such that, for all i N, the following holds: ui(z) ui(φi z) ϵ φi ΦS i (z). A constrained Phi-equilibrium is defined for ϵ = 0. 2.3. Computing Constrained Phi-equilibria In the following, we formally introduce the computational problem that we study in the rest of the paper. We denote by I := (Γ, Φ) an instance of the problem, where the tuple Γ := (N, A, {ui}i N , {ci,j}i N,j [m]) is a costconstrained normal-form game and Φ := {Φi}i N is a set of deviations. Moreover, we let |I| be the size (in terms of number of bits) of the instance I. We assume that the number n of players is fixed, so that |I| does not grow exponentially in n.5 We also make the following assumption on how the sets of deviations are represented: Assumption 1. For every i N, the set Φi is a polytope encoded by a finite of linear inequalities.6 Let us remark that, in games without constraints, this assumption is met by all the sets Φ which determine the classical notions of Phi-equilibria (Greenwald & Jafari, 2003). Next, we formally define our computational problem: Definition 2.2 (APXCPE(α, ϵ)). For any α, ϵ > 0, we define APXCPE(α, ϵ) as the problem of finding, given an instance I := (Γ, Φ) and a linear function ℓ: A R as input, a constrained ϵ-Phi-equilibrium z A such that ℓ(z) αℓ(z ) for all constrained Phi-equilibria z A. 5Notice that the size of the representation of a normal-form game is O(sn), and, thus, exponential in n. Any algorithm that runs in time polynomial in such instance size is not computationally appealing, as even its input has size exponential in n. For this reason, we focus on the case in which n is fixed, and, thus, the instance size does not grow exponentially with n. 6Notice that, since each φi Φi is represented as a matrix, a linear inequality is expressed as P b,a Ai M[b, a]φi[b, a] d, for some matrix M and scalar d. Constrained Phi-Equilibria Intuitively, APXCPE(α, ϵ) asks to compute a constrained ϵ-Phi-equilibrium whose value for the linear function ℓis at least a fraction α of the maximum value which can be achieved by an (exact) constrained Phi-equilibrium. In order to ensure that an instance of our problem is well defined, we make the following Slater-like assumption on how the players cost constraints are defined. Assumption 2. For every correlated strategy z A, player i N, and index j [m], there exists φ i ΦS i (z): ci,j(φ i z) ρ, where ρ > 0 and 1/ρ is O(poly(|I|)), with poly(|I|) being a polynomial function of the instance size |I|. In Assumption 2, the condition ρ > 0 is required to guarantee the existence of a constrained Phi-equilibrium (see Theorem 2.1) and that the sets ΦS i (z) are non-empty (otherwise our solution concept would be ill defined). Moreover, the second condition on ρ in Assumption 2 is equivalent to requiring that our algorithms run in time polynomial in 1 Assumption 2 also allows us to prove the existence of our equilibria, by showing that the constrained Nash equilibria introduced by Altman & Shwartz (2000), which always exist under Assumption 2, are also constrained Phi-equilibria. Theorem 2.1. Given a cost-constrained normal-form game Γ and a set Φ of deviations, if Assumption 2 is satisfied, then Γ admits a constrained Phi-equilibrium. 2.4. Relation with Unconstrained Phi-equilibria We conclude the section by discussing the relation between our constrained Phi-equilibria and classical equilibrium concepts for unconstrained games. Correlated Equilibrium When there are no constraints, the correlated equilibrium (CE) (Aumann, 1974) is a special case of Phi-equilibrium. As shown by Greenwald & Jafari (2003), the CE is obtained when the sets Φi contain all the possible deviations. Formally, the CE is defined by the set ΦALL := {Φi,ALL} of deviations such that: a Ai φi[b, a] = 1 b Ai Coarse Correlated Equilibrium The coarse correlated equilibrium (CCE) (Moulin & Vial, 1978b) is a special (unconstrained) Phi-equilibrium whose set of deviations is ΦCCE := {Φi,CCE}i N such that: Φi,CCE := n φi h Ai : φi[b, a] = h[a] b, a Ai o . Intuitively, such sets contain all the possible deviations that prescribe the same probability distribution independently of the received action recommendation. Thus, our constrained Phi-equilibria include the generalization of CEs and CCEs to cost-constrained games. Our definition of constrained Phi-equilibrium needs to employ mixed deviations that map action recommendations to probability distributions over actions. This is necessary in presence of constraints. Instead, without them, one can simply consider pure deviations that map recommendations to actions deterministically (Greenwald & Jafari, 2003). 3. Challenges of Constrained Phi-equilibria In this section, we show that, in cost-constrained normalform games, Phi-equilibria loose the nice computational properties that they exhibit in unconstrained settings. This is crucially determined by the fact that the set of constrained Phi-equilibria may not be convex in general. Proposition 3.1. Given any instance I := (Γ, Φ), the set of constrained Phi-equilibria may not be convex. Proposition 3.1 is proved by the following example. Example 1. Let ΦALL be the set of all the possible deviations in a two-player game in which each player has two actions, namely A1 = A2 = {a0, a1}. The first player s utility is such that u1(a, a ) = 0 for all a A1 and a A2, while the second player s utility is such that u2(a0, a1) = 1, and 0 otherwise. Both players share the same single cost constraint (m = 1). Their cost functions are defined as ci(a0, a1) = 1, ci(a0, a0) = 1 2, and ci(a1, a) = 1 for all a A2. Notice that the instance defined above satisfies Assumption 2 for ρ = 1/2. It is easy to see that the correlated strategy z1 A such that z1[a0, a0] = 2 3 and z1[a0, a1] = 1 3 is a constrained Phi-equilibrium. Moreover, the pure correlated strategy z2 A such that z2[a1, a0] = 1 is also a constrained Phi-equilibrium. However, the combination z3 = 1 2(z1 +z2) is not a constrained Phi-equilibrium. Indeed, the second player has an incentive to deviate by using a deviation φ2 such that φ2[a0, a1] = 1 and φ2[a1, a1] = 1. Such a deviation prescribes to play action a1 when a0 is recommended, and to play action a1 when the recommendation is a1. Straightforward calculations show that, for every a A1: (φ2 z3)[a, a ] = ( 1 2 if a = a1 0 otherwise, and u2(φ2 z3) = 1 2 > u2(z3) = 1 6. Moreover, the deviation is safe, since φ2 ΦS 2 (z3) as c2(φ2 z3) = 0. In order to formally asses the computational challenges of computing constrained Phi-equilibria, we prove the following strong inapproximability result: Theorem 3.1 (Hardness). For any constant α > 0, the problem APXCPE(α, (α/s)2) is NP-hard, where s is the number of players actions in the instance given as input. Constrained Phi-Equilibria Intuitively, Theorem 3.1 states that, for every multiplicative approximation factor α > 0, it is not possible to find a constrained ϵ-Phi-equilibrium having value of ℓat least a fraction α of its optimal value in time polynomial in 1 ϵ . Moreover, as a byproduct of Theorem 3.1, we also get the inapproximability up to within any factor of the problem of computing an optimal constrained (exact) Phi-equilibrium. Notice that the hardness result in Theorem 3.1 cannot hold for values of ϵ that are independent from the instance size. Indeed, as we prove in Corollary 4.3 in Section 4, problem APXCPE(1, ϵ) can be solved in quasi-polynomial time in the instance size whenever ϵ > 0 is a given constant. Thus, any NP-hardness result for APXCPE(α, ϵ) would contradict the exponential-time hypothesis.7 4. Computing Optimal Constrained ϵ-Phi-equilibria Efficiently In this section, we show how to circumvent the negative result established by Theorem 3.1. In particular, we prove that, when the number of cost constraints is fixed, problem APXCPE(1, ϵ) can be solved in time polynomial in the instance size and 1 ϵ for ϵ > 0 (Corollary 4.2). Moreover, we also prove that, in general, for any constant ϵ > 0 problem APXCPE(1, ϵ) admits a quasi-polynomial-time algorithm, whose running time becomes polynomial when the number of players actions is fixed (Corollary 4.3). First, in Section 4.1, we provide a general algorithm that is at the core of the two main results of this section. Then, in Section 4.1, we show how the algorithm can be suitably instantiated in order to prove each result. In the rest of this section, unless stated otherwise, we always assume that an ϵ > 0 has been fixed, and that I := (Γ, Φ) and ℓ: A R are the inputs of a given instance of problem APXCPE(1, ϵ). 4.1. General Algorithm The main technical tool that we employ in order to design our algorithm is a Lagrangification of the constraints defining the sets ΦS i (z) of safe deviations. First, we prove the following preliminary result, which shows that strong duality holds for the problem maxφi ΦS i (z) ui(φi z) of finding the best safe deviation for player i N at z A. Lemma 4.1. For every z A and i N, it holds max φi ΦS i (z)ui(φi z) = sup φi Φi inf ηi Rm + ui(φi z) η i ci(φi z) = inf ηi Rm + sup φi Φi ui(φi z) η i ci(φi z) . 7The exponential-time hypothesis conjectures that solving 3SAT requires at least exponential time. Then, by exploiting Lemma 4.1, we can prove that, under Assumption 2, strong duality continues to hold even when restricting the Lagrange multipliers ηi to have ℓ1-norm less than or equal to 1/ρ. Formally: Lemma 4.2. Let D := η Rm + | ||η||1 1/ρ . Then, for every z A and i N, it holds: max φi ΦS i (z)ui(φi z) = max φi Φi min ηi D ui(φi z) η i ci(φi z) = min ηi D max φi Φi ui(φi z) η i ci(φi z) . Lemma 4.2 allows us to write player i s incentive constraints in the definition of constrained ϵ-Phi-equilibria as ui(z) min ηi D max φi Φi ui(φi z) η i ci(φi z) ϵ. (1) This crucially allows us to show the following result: solving problem APXCPE(1, ϵ) is equivalent to computing max(η1,...,ηn) Dn Fϵ(η1, . . . , ηn), where Fϵ(η1, . . . , ηn) is the optimal value of a suitable maximization problem parameterized by tuples of Lagrange multipliers ηi D, one per player i N. Such a problem asks to compute a safe correlated strategy maximizing the linear function ℓsubject to players incentive constraints that are re-formulated by means of Lemma 4.2. Formally, we define Fϵ(η1, . . . , ηn) as the maximum of ℓ(z) over those z S that additionally satisfy the following constraint for every i N: ui(z) max φi Φi ui(φi z) η i ci(φi z) ϵ. (2) Notice that the min operator that appears in the right-hand side of Constraints (1) is dropped by adding the outer maximization over the tuples (η1, . . . , ηn) Dn, as the maximum of ℓis always achieved when the right-hand side of such constraints is as small as possible. Next, we show that Fϵ(η1, . . . , ηn) can be computed in polynomial time by means of the ellipsoid algorithm. Lemma 4.3. For every tuple (η1, . . . , ηn) Dn, the value of Fϵ(η1, . . . , ηn) can be computed in time polynomial in the instance size |I| and 1 Proof. We show that Fϵ(η1, . . . , ηn) can be solved in polynomial time by means of the ellipsoid algorithm. Let us notice that Constraints (2) can be equivalently encoded by a set of linear inequalities, one for each player i N and deviation φi vert(Φi), where vert(Φi) denotes the set of vertexes of the polytope Φi (recall Assumption 1). Thus, solving Fϵ(η1, . . . , ηn) is equivalent to solving an LP with a (possibly) exponential number of constraints, but polynomially-many variables. Such an LP can be solved Constrained Phi-Equilibria in polynomial time by means of the ellipsoid algorithm, provided that a polynomial-time separation oracle for the linearized version of Constraints (2) is available. Such an oracle can be implemented by solving the maximization in the right-hand side of Constraints (2) for a correlated strategy z A given as input. Formally, the separation oracle solves the following problem for each player i N: φ i arg max φi Φi ui(φi z) η i ci(φi z) , which can be done efficiently thanks to Assumption 1. Then, if the separation oracle finds any φ i such that: ui(z) ui(φ i z) η i ci(φ i z), it outputs the above inequality as a separating hyperplane to be used in the ellipsoid algorithm. Lemma 4.3 is not enough to complete our algorithm, since we need an efficient way of optimizing Fϵ(η1, . . . , ηn) over all the tuples of Lagrange multipliers. This problem is nontrivial, since Fϵ(η1, . . . , ηn) is non-concave in ηi. Nevertheless, we show that, by restricting the domain D of the Lagrange multipliers to a suitably-defined finite small subset, we can still find a constrained ϵ-Phi-equilibrium whose value of ℓis at least as large as that of any constrained (exact) Phi-equilibrium. This is enough to solve APXCPE(1, ϵ). In particular, we need a finite subset of good Lagrange multipliers, in the sense of the following definition. Definition 4.1. Given any δ > 0, a set D D is δ-optimal if, for every z A and i N, the following holds: min ηi D max φi Φi ui(φi z) η i ci(φi z) max φi ΦS i (z) ui(φi z) + δ. Intuitively, thanks to Lemma 4.2, if we optimize the Lagrange multipliers over a δ-optimal set D D, instead of optimizing them over D, then we are allowing the players to violate incentive constraints by at most δ. In the following, we assume that a finite δ-optimal set D D is available. In Section 4.2, se show how to design two particular δ-optimal sets that allow to prove our main results. For ease of presentation, we let L D,ϵ := max (η1,...,ηn) Dn Fϵ(η1, . . . , ηn) be the optimal value of Fϵ(η1, . . . , ηn) when the Lagrange multipliers are constrained to be in a δ-optimal set D D. Next, we show that, given any δ-optimal set D with δ ϵ, the value of L D,ϵ is at least that achieved by constrained (exact) Phi-equilibria, namely LD,0. Formally: Lemma 4.4. Given any 0 < δ ϵ and a δ-optimal set D D, the following holds: L D,ϵ LD,0. Intuitively, Lemma 4.4 is proved by noticing that, provided that δ ϵ, the incentive constraints violation introduced by using D instead of D is at most ϵ. Moreover, the set of feasible correlated strategies can only expand by allowing incentive constraints to be violated, and, thus, the value of the objective ℓcan only increase. Lemma 4.4 suggests a way of solving APXCPE(1, ϵ). Indeed, given a finite δ-optimal set D D with δ ϵ, by enumerating over all the tuples of Lagrange multipliers ηi D, one per player i N, we can find the desired constrained ϵ-Phi-equilibrium. The following theorem shows that this procedure gives an algorithm for APXCPE(1, ϵ) that runs in time polynomial in the instance size, | D|, and 1 Theorem 4.1. Given a finite δ-optimal set D D with δ ϵ, there exists an algorithm that solves APXCPE(1, ϵ) and runs in time polynomial in the instance size |I|, the number | D| of elements in D, and 1 ϵ for every ϵ > 0. Proof. The algorithm works by enumerating over all the possible tuples of Lagrange multipliers ηi D, one per player i N. These are polynomially many in the size | D| when the number of players n is fixed. For every tuple (η1, . . . , ηn) Dn, the algorithm solves Fϵ(η1, . . . , ηn), which can be done in time polynomial in |I| and 1 ϵ thanks to Lemma 4.3. Finally, the algorithm returns the correlated strategy z A with the highest value of ℓamong those computed while solving Fϵ(η1, . . . , ηn). It is easy to see that the returned solution solves problem APXCPE(1, ϵ) by applying Lemma 4.4. This concludes the proof. 4.2. Instantiating the General Algorithm Next, we show how to build δ-optimal sets D that, when they are plugged in the algorithm in Theorem 4.1, allow us to derive our results. In particular, we consider the set: Dτ := n η D ηj = kτ, k {0, . . . , 1/τρ } j [m] o , which is a discretization of D with a regular lattice of step τ R+ (notice that ηj is the j-th component of η). By a simple stars and bars combinatorial argument, we have that |Dτ| = 1/τρ +m m . Thus, since it holds that |Dτ| = O((1/τρ)m), if the number of constraints m is fixed, |Dτ| is bounded by a polynomial in 1/τρ. Moreover, simple combinatorial arguments show that |Dτ| (1 + m) 1/τρ .8 Thus, it also holds that |Dτ| = O(m 1/τρ). Notice that the two bounds on |Dτ| are non-comparable, and, thus, they give rise to two distinct results, as we show in the following. 8See Appendix D for a formal proof. Constrained Phi-Equilibria By using the first bound on |Dτ|, we can show that the set Dτ is δ-optimal for δ = mτ. Formally: Lemma 4.5. For any τ > 0, the set Dτ is (τm)-optimal. Thus, whenever the number m of cost constraints is fixed, Lemma 4.5, together with Theorem 4.1, allows us to provide a polynomial-time algorithm. Indeed, it is sufficient to apply Theorem 4.1 for the (τm)-optimal set Dτ with τ := ϵ/m to obtain the following first main result: Corollary 4.2. There exists an algorithm that solves problem APXCPE(1, ϵ) in time polynomial in |I| and 1 ϵ for every ϵ > 0, when the number m of cost constraints is fixed. On the other hand, by using the second bound on |Dτ|, we can show that Dτ is δ-optimal for δ depending logarithmically on the number of players actions. Formally: Lemma 4.6. For any τ > 0, the set Dτ is δ-optimal for δ = 2 p 2τ log s/ρ, where s is the number of players actions. Lemma 4.6 (together with Theorem 4.1) immediately gives us a quasi-polynomial-time for solving APXCPE(1, ϵ) for a given constant ϵ > 0. Moreover, its running time becomes polynomial when the number of players actions is fixed. Corollary 4.3. For any constant ϵ > 0, there exists an algorithm that solves APXCPE(1, ϵ) in time O(|I|log s). Moreover, when the number s of players actions is fixed, the algorithm runs in time polynomial in |I|. Notice that it is in general not possible to design an algorithm that runs in time polynomial in 1 ϵ , since this would contradict the hardness result in Theorem 3.1. 5. A Special Case: Deviation-dependent Costs We complete our computational study of constrained Phiequilibria by considering a special case in which player i s costs associated to a deviation φi only depend on φi and not on the (overall) modified correlated strategy φi z. We consider instances satisfying the following assumption. Assumption 3. For every player i N and player i s deviation φi Φi, there exists a function ci : Φi [ 1, 1]m such that ci(φi) := ci(φi z) for every z A. Notice that, whenever Assumption 3 holds, the set ΦS i (z) of safe deviations does not depend on z. Thus, in the rest of this section, we write w.l.o.g. ΦS i rather than ΦS i (z). A positive effect of Assumption 3 is that it recovers the convexity of the set of constrained Phi-equilibria, rendering them more akin to unconstrained ones. Formally: Proposition 5.1. For instances I := (Γ, Φ) satisfying Assumption 3, the set of constrained ϵ-Phi-equilibria is convex. Proposition 5.1 suggests that constrained Phi-equilibria are much more computationally appealing under Assumption 3 than in general, as we indeed show in the rest of this section. First, in Section 5.1, we show that APXCPE(1, 0) admits a polynomial-time algorithm under Assumption 3. Then, in Section 5.2, we design a no-regret learning algorithm that efficiently computes one constrained ϵ-Phi equilibrium with ϵ = O(1/ T) as the number of rounds T grows. Finally, in Section 5.3, we provide a natural example of constrained Phi-equilibria satisfying Assumption 3. 5.1. A Poly-time Algorithm for Optimal Equilibria We prove that, whenever Assumption 3 holds, the problem of computing an (exact) Phi-equilibrium maximizing a given linear function can be solved in polynomial time. This is done by formulating the problem as an LP with polynomially-many variables and exponentially-many constraints, which can be solved by means of the ellipsoid method, similarly to how we compute Fϵ(η1, . . . ηn) in Section 4 (see the proof of Lemma 4.3). Formally: Theorem 5.1. Restricted to instances I := (Γ, Φ) which satisfy Assumption 3, APXCPE(1, 0) admits a polynomialtime algorithm. 5.2. An Efficient No-regret Learning Algorithm Next, we show how Assumption 3 allows us to find a constrained ϵ-Phi-equilibrium by means of a polynomial-time decentralized no-regret learning algorithm. Our algorithm is based on the Phi-regret minimization framework introduced by Greenwald & Jafari (2003), which needs to be extended in order to be able to work with polytopal sets ΦS i of safe deviations, rather than finite sets of pure deviations. Algorithm 1 Learning a Constrained ϵ-Phi-equilibria Require: Regret minimizers Ri for the sets ΦS i , for i N 1: Initialize the regret minimizers Ri 2: for t = 1, . . . , T do 3: for each player i N do 4: φi,t Ri.RECOMMEND() 5: Play according to a distribution xi,t Ai s.t. xi,t[a] = X b Ai φi,t[b, a]xi,t[b] a Ai 6: end for 7: zt i N xi,t 8: Ri.OBSERVE(φi 7 ui(φi zt)) 9: end for 10: return z T := 1 T PT t=1 zt Algorithm 1 outlines our no-regret algorithm. It instantiates a regret minimizer Ri for the polytope ΦS i for each i N. Ri is an object that, at each round t [T], recommends a Constrained Phi-Equilibria safe deviation φi,t ΦS i to player i (Line 4 of Algorithm 1), and, then, observes a function φi 7 ui(φi zt) that specifies the utility that would have been obtained by selecting any safe deviation φi ΦS i at round t (Line 8 of Algorithm 1). Ri guarantees that the regret RT i cumulated by player i over [T] grows sublinearly, i.e., RT i = o(T), where: RT i := max φi Φi t=1 ui(φi zt) t=1 ui(φi,t zt), which is how much player i loses by selecting φi,t at each t rather than choosing the same best-in-hindsight deviation at all rounds. Notice that, by taking inspiration from the Phi-regret framework (Greenwald & Jafari, 2003), given a recommended deviation φi,t, player i actually plays according to a probability distribution xi,t Ai, which is a stationary distribution of the matrix representing φi,t. This is crucial in order to implement the algorithm in a decentralized fashion and to provide convergence guarantees to constrained ϵ-Phi-equilibria (see Theorem 5.2). All the distributions xi,t jointly determine a correlated strategy zt A at each round t [T], defined as zt := i N xi,t, where denotes the product among distributions; formally, zt[a] := Q i N xi,t[ai] for all a A. Algorithm 1 provides the following guarantees: Theorem 5.2. Given an instance I := (Γ, Φ) satisfying Assumption 3, after T N>0 rounds, Algorithm 1 returns a correlated strategy z T A that is a constrained ϵT -Phiequilibrium with ϵT = O(1/ T). Moreover, each round of Algorithm 1 runs in polynomial time. Let us remark that the crucial property which allows us to design Algorithm 1 is that the sets ΦS i of safe deviations do not depend on players other than i. Finally, from Theorem 5.2, the following result follows: Corollary 5.3. In instances I := (Γ, Φ) satisfying Assumption 3, a constrained ϵ-Phi-equilibrium can be computed in time polynomial in the instance size and 1 ϵ by means of a decentralized learning algorithm. 5.3. Marginally-constrained CCE We conclude the section by introducing a particular (natural) notion of constrained ϵ-Phi-equilibrium for which Assumption 3 is satisfied. This is a constrained version of the classical CCE in cost-constrained normal-form games where a player s costs only depend on the action of that player. We call it marginally-constrained ϵ-CCE. Formally, such an equilibrium is defined for games in which, for every player i N, it holds ci(a) = ci(a ) for all a, a A such that ai = a i, and for the set ΦCCE of CCE deviations that we have previously introduced in Section 2.4. Next, we prove that, with the definition above, Assumption 3 is satisfied. Theorem 5.4. For instances I := (Γ, ΦCCE) such that ci(a) = ci(a ) for every player i N and action profiles a, a A : ai = a i, Assumption 3 holds. Thanks to Theorem 5.4, we readily obtain the two following corollaries of Theorems 5.1 and 5.1. Corollary 5.5. The problem of computing a marginallyconstrained (exact) CCE that maximizes a linear function ℓ: A R can be solved in polynomial time. Corollary 5.6. A marginally-constrained ϵ-CCE can be computed in time polynomial in the instance size and 1 ϵ by means of a decentralized learning algorithm. 6. Discussion and Open Problems The main positive results that we provide in this paper (Corollaries 4.2 and 4.3) show that a constrained ϵ-Phi equilibrium maximizing a given linear function can be computed in time polynomial in the instance size and 1 ϵ , when either the number of constraints or that of players actions is fixed. Clearly, this implies that, under the same assumptions, a constrained ϵ-Phi-equilibrium can be found efficiently. Moreover, in Section 5, we designed an efficient no-regret learning algorithm that finds a constrained ϵ-Phi-equilibrium in settings enjoying special properties (Corollary 5.3). However, the problem of efficiently computing a constrained ϵ-Phi-equilibrium remains open in general. Formally: Definition 6.1 (Open Problem). Given any instance I := (Γ, Φ), find a constrained ϵ-Phi-equilibrium in time polynomial in the instance size and 1 Solving the problem above is non-trivial. Proposition 3.1 in Section 3 proves that the set of constrained ϵ-Phi-equilibria is non-convex, and, thus, solving the problem in Definition 6.1 is out of scope for most of the known equilibrium computation techniques. On the other hand, it is unlikely that such a problem is NP-hard. Indeed, a constrained ϵ-Phiequilibrium always exists and, given any z A, it is possible to verify whether z is an equilibrium or not in polynomial time. Formally, such a problem is said to belong to the TFNP complexity class, and, thus, standard arguments show that, if the problem is NP-hard, then NP = co NP (Megiddo & Papadimitriou, 1991). Thus, one should try to reduce the problem in Definition 6.1 to problems in TFNP, such as that of computing a Nash equilibrium. However, while the problem in Definition 6.1 shares some properties with that of computing a Nash equilibrium, such as the non-convexity of the set of the equilibria, the former is fundamentally different from the latter, since it exhibits correlation among the players. Thus, a reduction from such a problem to that of computing Nash equilibria would require a gadget to break the correlation among the players, and doing that is highly non-trivial as cost constraints are expressed by linear functions. Constrained Phi-Equilibria Acknowledgements This paper is supported by FAIR (Future Artificial Intelligence Research) project, funded by the Next Generation EU program within the PNRR-PE-AI scheme (M4C2, Investment 1.3, Line on Artificial Intelligence). Altman, E. and Shwartz, A. Constrained markov games: Nash equilibria. In Advances in dynamic games and applications, pp. 213 221. Springer, 2000. Alvarez-Mena, J. and Hern andez-Lerma, O. Existence of nash equilibria for constrained stochastic games. Mathematical Methods of Operations Research, 63(2):261 285, 2006. Arrow, K. J. and Debreu, G. Existence of an equilibrium for a competitive economy. Econometrica: Journal of the Econometric Society, pp. 265 290, 1954. Aumann, R. J. Subjectivity and correlation in randomized strategies. Journal of mathematical Economics, 1(1): 67 96, 1974. Bakhtin, A., Brown, N., Dinan, E., Farina, G., Flaherty, C., Fried, D., Goff, A., Gray, J., Hu, H., Jacob, A., et al. Human-level play in the game of diplomacy by combining language models with strategic reasoning. Science (New York, NY), pp. eade9097 eade9097, 2022. Brown, N. and Sandholm, T. Superhuman ai for multiplayer poker. Science, 365(6456):885 890, 2019. Bueno, L. F., Haeser, G., and Rojas, F. N. Optimality conditions and constraint qualifications for generalized nash equilibrium problems and their practical implications. SIAM Journal on Optimization, 29(1):31 54, 2019. Celli, A., Marchesi, A., Farina, G., and Gatti, N. No-regret learning dynamics for extensive-form correlated equilibrium. Advances in Neural Information Processing Systems, 33:7722 7732, 2020. Chen, Z., Ma, S., and Zhou, Y. Finding correlated equilibrium of constrained markov game: A primal-dual approach. In Advances in Neural Information Processing Systems, 2022. Daskalakis, C., Goldberg, P. W., and Papadimitriou, C. H. The complexity of computing a nash equilibrium. SIAM Journal on Computing, 39(1):195 259, 2009. Ekeland, I. and Temam, R. Convex analysis and variational problems. SIAM, 1999. Facchinei, F. and Kanzow, C. Generalized nash equilibrium problems. Annals of Operations Research, 175(1):177 211, 2010. Goktas, D. and Greenwald, A. Exploitability minimization in games and beyond. Advances in Neural Information Processing Systems, 2022. Greenwald, A. and Jafari, A. A general class of no-regret learning algorithms and game-theoretic equilibria. In Learning theory and kernel machines, pp. 2 12. Springer, 2003. Hakami, V. and Dehghan, M. Learning stationary correlated equilibria in constrained general-sum stochastic games. IEEE Transactions on Cybernetics, 46(7):1640 1654, 2015. H astad, J. Clique is hard to approximate within n1 ϵ. Acta Mathematica, 182(1):105 142, 1999. Hazan, E. et al. Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4):157 325, 2016. Jordan, M. I., Lin, T., and Zampetakis, M. First-order algorithms for nonlinear generalized nash equilibrium problems. ar Xiv preprint ar Xiv:2204.03132, 2022. Kanzow, C. and Steck, D. Augmented lagrangian methods for the solution of generalized nash equilibrium problems. SIAM Journal on Optimization, 26(4):2034 2058, 2016. Megiddo, N. and Papadimitriou, C. H. On total functions, existence theorems and computational complexity. Theoretical Computer Science, 81(2):317 324, 1991. Moulin, H. and Vial, J.-P. Strategically zero-sum games: the class of games whose completely mixed equilibria cannot be improved upon. INT J GAME THEORY, 7(3): 201 221, 1978a. Moulin, H. and Vial, J.-P. Strategically zero-sum games: the class of games whose completely mixed equilibria cannot be improved upon. International Journal of Game Theory, 7(3):201 221, 1978b. Nash, J. Non-cooperative games. Annals of mathematics, pp. 286 295, 1951. Papadimitriou, C. H. and Roughgarden, T. Computing correlated equilibria in multi-player games. Journal of the ACM (JACM), 55(3):1 29, 2008. Rosen, J. B. Existence and uniqueness of equilibrium points for concave n-person games. Econometrica: Journal of the Econometric Society, pp. 520 534, 1965. Zuckerman, D. Linear degree extractors and the inapproximability of max clique and chromatic number. Theory of Computing, 3(6):103 128, 2007. Constrained Phi-Equilibria A. On the Weaknesses of the Guarantees of the Algorithm of Chen et al. (2022) The Algorithm of Chen et al. (2022) finds a distribution µ A over correlated strategies such that: max φi ΦS i (z) ui(φi z) ui(z) 0. (3) However, here we claim that this solution concept inherits some weaknesses from the non-convexity of the equilibria set that we proved in Theorem 5.1. Indeed, consider the same instance of Theorem 5.1 and consider the uniform distribution µ over {z1, z2}. In Theorem 5.1 we proved that maxφi ΦS i (z1) ui(φi z1) ui(z1) 0 for all i {1, 2} and maxφi ΦS i (z2) ui(φi z2) ui(z2) 0 for all i {1, 2} and thus Equation (3) holds over the distribution µ. However we show that the expected correlated strategy z3 derived from distribution µ, i.e., z3 = Ez µ[z] = 1 2z2, it is not a feasible equilibrium, or an approximate one. Indeed, in Theorem 5.1, we proved that maxφ2 ΦS 2 (z3) u2(φ2 z3) u2(z3) 1 3, showing that the average correlated strategies returned by their Algorithm is not an equilibrium nor close to it. This comes from the peculiar fact about Constrained Phi-equilibria that exhibit non-convex set of solutions, which is in striking contrast with the unconstrained case. Indeed the guarantees of Equilibria (3) would imply that Ez µ[z] is a equilibrium in the unconstrained case in which the set of equilibria is convex. B. Proofs Omitted from Section 2 Theorem 2.1. Given a cost-constrained normal-form game Γ and a set Φ of deviations, if Assumption 2 is satisfied, then Γ admits a constrained Phi-equilibrium. Proof. With assumption 2 Altman & Shwartz (2000, Theorem 2.1) proves the existence of a constrained Nash equilibrium. In our setting this is equivalent to a product distribution z = i [N]xi so that it is a Constrained Phi-equilibrium for any set of deviations Φi.9 This is easily seen by observing that a Constrained Nash Equilibria is defined as: j [N] xj(aj) j [N]\{i} xi(ai) for all xi (Ai) s.t. xi x i S. On the other hand it easily seen that for all φi Φi(z) there exists some xi (Ai) such that φi j [N]xj = xi x i and xi x i S. This is proved by the following calculations: φi j [N]xj [ai, a i] := X b Ai φi[b, ai]xi(b)x i(a i) (4) = xi(ai) x i(a i), (5) where xi(ai) := P b Ai φi[b, ai]xi(b) and xi (Ai) since, by definition, P ai Ai φi[b, ai] = 1 for all b Ai. This proves that a Constrained Nash Equilibrium is a Phi-Constrained Equilibrium for all Φ. C. Proofs Omitted from Section 3 Theorem 3.1 (Hardness). For any constant α > 0, the problem APXCPE(α, (α/s)2) is NP-hard, where s is the number of players actions in the instance given as input. 9As common in the normal form game literature, for any distribution x (X) and y (Y ), x y (X Y ) is the product distribution defined as (x y)[a, b] = x[a]y[b] for a X and b Y . Constrained Phi-Equilibria Proof. We reduce from GAP-INDEPENDENT-SET, which is a promise problem that formally reads as follows: given an δ > 0 and a graph G = (V, E), with set of nodes V and set of edges E, determine whether G admits an independent set of size at least |V |1 δ or all the independent sets of G have size smaller than |V |δ. GAP-INDEPENDENT-SET is NP-hard for every δ > 0 (H astad, 1999; Zuckerman, 2007). Let ℓ= |V | and α > 0 be the desired approximation factor. Given an instance of GAP-INDEPENDENT-SET, we build an instance such that if there exists an independent set of size ℓ1 δ, then there exists a Constrained Phi-equilibrium with social welfare 1. Otherwise, if all the independent sets have size at most ℓδ, all the Constrained ϵ-Phi-equilibria have social welfare at most α/2. We can use any δ > 0, since we simply need ℓδ < ℓ1 δ. Moreover, we take ϵ = α2 128ℓ2 . As we will see, ℓwill be smaller than the number of action of the players, satisfying the condition in the statement. Construction. The first player has a set of actions A1 that includes actions a0, a1, a2 and an action av for each v V . Moreover, the first player has an action a F .10 The second player has a set of actions A2 that includes actions av and av for each v V . Moreover, the second player has an action a F . Let γ = η = α/8. The utility of the first agent is as follows: u1(a0, a) = γ + 1 2η for all a A2 \ {a F }, u1(a1, av) = γ + η and u1(a2, av) = γ for all v V . u1(a1, av) = γ and u1(a2, av) = γ + η for all v V . u1(av, av) = u1(av, av) = γ for all v V u1(av, av ) = γ and u1(av, av ) = γ + ℓ ℓ1 δ ℓ ℓ1 δ 1η for all v = v. u1(a F , a) = 0 for each a A2. u1(a, a F ) = 0 for each a A1. The utility of the second agent is u2(a0, a) = 1 for each a A2 \ {a F } and 0 otherwise. There is a cost function cv for each v V , which is common to both the agents. For each v V , the cost function cv is such that cv(av, av ) = 1 for each v = v, (v, v ) E, cv(av, av ) = 0 for each v = v, (v, v ) / E, cv(av, av) = 1 for each v V . cv(a F , a) = 1 4ℓ2 for each a A2. cv(a, a F ) = 1 4ℓ2 for each a A1. For every other action profile the cost is 0. We dropped the player index from the cost functions c as they are equal to both players. Moreover, we set of deviations Φi = Φi,ALL for both players i {1, 2}. Notice that the instance satisfies Assumption 2. Indeed, the deviation φi such that φi[a, a F ] = 1 for all a Ai for i {1, 2}, that deviates deterministically to a F is always strictly feasible for both player 1 and player 2. Moreover, its cost is polynomial in the instance size. 10This action is needed only to satisfy the strictly feasibility assumption. Constrained Phi-Equilibria Completeness. We show that if there exists an independent set of size ℓ1 δ, then the social welfare of an optimal Constrained Phi-equilibria is at least 1. Let V be an independent set of size ℓ1 δ. We build a Constrained Phi-equilibria z with social welfare at least 1. Consider the correlated strategy such that z[a0, av] = 1 2ℓ1 δ for all v V , while z[a0, av] = 1 2(ℓ ℓ1 δ) for all v / V . All the other action profiles have probability 0. It is easy to see that the correlated strategy has social welfare at least 1 since player 1 always plays action a0 and u2(a0, a) = 1 for all a A2. Moreover, it is easy to verify that it is safe since cv(a0, a) 0 for each a A2. Hence, to show that z is an Constrained Phi-equilibria we only need to prove that it satisfies the incentive constraints. The incentive constraints of the second player are satisfied since they obtain the maximum possible utility, i.e., 1. Consider now a possible deviation of the first player φ1 Φ1. As a first step, we compute the expected utility of a strategy φ1. Let us define the following quantities: v V φ1[a0, av] h z[a0, av] + z[a0, av] + P v =v z[a0, av ] γ + γ + ℓ ℓ1 δ ℓ ℓ1 δ 1η P v =v z[a0, av] i v / V φ1[a0, av] h z[a0, av] + z[a0, av] + P v =v z[a0, av ] γ + γ + ℓ ℓ1 δ ℓ ℓ1 δ 1η P v =v z[a0, av] i T 3 = γ + η 2 φ1[a0, a0] + γ+η 2 (φ1[a0, a1] + φ1[a0, a2]) + γ 2 (φ1[a0, a1] + φ1[a0, a2]) We bound each component individually. v V φ1[a0, av] z[a0, av] + z[a0, av] + X v =v z[a0, av ] γ + γ + η ℓ ℓ1 δ v =v z[a0, av] v V φ1[a0, av] 1 γ + η ℓ ℓ1 δ v V φ1[a0, av] γ + η v V φ1[a0, av](γ + η), where in the last inequality we use ℓ ℓ1 δ ℓ ℓ1 δ 1 2 for ℓlarge enough. while v / V φ1[a0, av] z[a0, av] + z[a0, av] + X v =v z[a0, av ] γ + γ + η ℓ ℓ1 δ v =v z[a0, av] v / V φ1[a0, av] 1 2 + 1 2(ℓ ℓ1 δ) 2 1 2(ℓ ℓ1 δ) γ + η ℓ ℓ1 δ v / V φ1[a0, av] γ + η ℓ ℓ1 δ 1 1 ℓ ℓ1 δ 1 v / V φ1[a0, av] γ + η T 3 = [a0, a0] γ + η 2 ([a0, a1] + [a0, a2]) + γ 2 ([a0, a1] + [a0, a2]) ([a0, a0] + φ1[a0, a1] + φ1[a0, a2]) Finally, the utility of a deviation φ1 is X a1 A1,a2 A2 a A1 φ1[a1, a]z[a1, a2]u1(a, a2) Constrained Phi-Equilibria a A1,a2 A2 φ1[a0, a]z[a0, a2]u1(a, a2) = T 1 + T 2 + T 3 v V φ1[a0, av] + γ + η v / V φ1[a0, av] + γ + η (φ[a0, a0] + φ1[a0, a1] + φ1[a0, a2]) v V φ1[a0, av] + γ + η (1 φ1[a0, a F ]) Now, we show that no deviation φ1 Φ1 is both safe and increases player 1 utility. In particular, we show that if a strategy φ1 increases the utility than it is not safe. Indeed, if φ1 increases the utility, then X a1 A1, a2 A2 a A1 φ1[a1, a]z[a1, a2]u1(a, a2) > γ + η This implies that v V φ1[a0, av] + γ + η (1 φ1[a0, a F ]) > γ + η v V φ1[a0, av] > 1 2φ1[a0, a F ] (6) Next, we show that any φ1 that increases the utility (and hence that satisfies Eq (6)) is not a feasible deviation. First, notice that equation (6) implies that there is a v V such that φ1[a0, a v] > 1 2ℓφ1[a0, a F ]. (7) Then, we show that the deviation φ1 violates the constraint c v. In particular, a1 A1,a2 A2 a A1 φ1[a1, a]z[a1, a2]cv(a, a2) = φ1[a0, a v]z[a0, a v]1 1 4ℓ2 φ1[a0, a F ] X v V :(v, v) E φ1[a0, a v]z[a0, av]1 = φ1[a0, a v]z[a0, a v] 1 4ℓ2 φ1[a0, a F ] ℓφ1[a0, a v] 1 4ℓ2 φ1[a0, a F ] > φ1[a0, a v] 1 where the second inequality holds since V is an independent set, and the second-to-last inequality by Equation (7). Hence, there is no deviation φ1 that increases players 1 utility and that is safe. This concludes the first part of the proof. Soundness. We show that if there exists a Constrained w-Phi-equilibria with social welfare α/2, then there exists an independent set of size strictly larger than ℓδ, reaching a contradiction. Suppose by contradiction that there exists a Constrained ϵ-Phi-equilibrium z with social welfare strictly greater than α/2. Thus, X a A2\{a F } z[a0, a ] 1 + X a A1,a A2 (γ + η) X a A1,a A2 z[a, a ](u1(a, a ) + u2(a, a )) α/2, where the first inequality comes from u2(a0, a ) = 1 for each a A2 \ {a F } and 0 otherwise, and u1(a, a ) γ + η for each a A1 and a A2. This implies X a A2 z[a0, a ] α/4. (8) Constrained Phi-Equilibria Then, we show that z assigns similar probabilities on the set of action profiles {a0, av}v V and {a0, av}v V Given an a A1, let φa Φ1 be a deviation of the first player such that φa[a0, a] = 1 and φa[a , a ] = 1 for each a = a0. Since z is an Constrained ϵ-Phi-equilibrium there is no feasible deviation φa that increases the utility of player 1 by more than ϵ. This implies that v V z[a0, av] X v V z[a0, av] v V z[a0, av] > X v V z[a0, av] + 2ϵ then the deviation φa1 has utility at least X v V z[a0, av]φa1[a0, a1](γ + η) + z[a0, av]φa1[a0, a1]γ + X a A2 z[a, a ]φa1[a, a]ui(a, a ) v V z[a0, av] + γ X v V (z[a0, av] + z[a0, av]) + X a A2 z[a, a ]φa1[a, a]ui(a, a ) v V (z[a0, av] + z[a0, av]) v V (z[a0, av] + z[a0, av]) a A2 z[a, a ]φa1[a, a]ui(a, a ) v V (z[a0, av] + z[a0, av]) + X a A2 z[a, a ]φa1[a, a]ui(a, a ) where the first inequality comes from adding P v V z[a0, av] to both sides of Equation (10). Moreover, φa1 is feasible since for each constraint c v, v V , it has cost X v V (z[a0, av]φa1[a0, a1]c v(a1, av) + z[a0, av]φa1[a0, a1]c v(a1, av)) a A2 z[a, a ]φa1[a, a]c v(a, a ) v V (z[a0, av]φa1[a0, a1]c v(a0, av) + z[a0, av]φa1[a0, a1]c v(a0, av)) a A2 z[a, a ]φa1[a, a]c v(a, a ) = c v(z) 0. A similar argument shows that if P v V z[a0, av] < P v V z[a0, av] 2ϵ η then the deviation φa2 is safe and increases the utility. As a consequence of Equation (9), it holds v V z[a0, av] X v V (z[a0, av] + z[a0, av]) ϵ a A2\{a F } z[a0, a] 2ϵ where the first inequality comes from adding P v V z[a0, av] to both sides of P v V z[a0, av]| P v V z[a0, av] 2ϵ η The next step is to show that it is if there is no safe deviation φav, v V , that increases the utility, then there exists an independent set of size larger than ℓδ. Since z is an Constrained ϵ-Phi-equilibrium, for each av, v V one of the following two conditions holds: i) φav / ΦS 1 (z) or ii) u1(φav z) u1(z) + ϵ. Let V 1 V be the set of vertexes v such that φav Constrained Phi-Equilibria is not safe, i.e., φav / ΦS 1 (z), and V 2 = V \ V 1 be the set of v such that φav does not increase the utility by more than ϵ and are not in V 1, i.e., u1(φav z) u1(z) and φav ΦS 1 (z). We show that |V 2| ℓ ℓ1 δ. Indeed, for each v V 2, deviation φav does not increase the utility and hence it holds: a A2\{a F } z[a0, a] + η ℓ ℓ1 δ v =v z[a0, av ] + X a A2 z[a, a ]φa1[a, a]ui(a, a ) v V z[a0, a] + z[a0, av] φ[a0, av]γ + X v =v φ[a0, av]z[a0, av ] γ + η ℓ ℓ1 δ a A2 z[a, a ]φa1[a, a]ui(a, a ) a A2\{a F } z[a0, a] + X a A2 z[a, a ]ui(a, a ) + ϵ, where the inequality holds since the lhs is the utility of the deviation φav. This implies X v z[a0, av ] z[a0, av] a A2\{a F } z[a0, a] + ϵ η X v V z[a0, av] + 2ϵ, where the last inequality holds by Equation (11). Hence, z[a0, av] ℓ ℓ1 δ ℓ ℓ1 δ 1 ℓ ℓ1 δ ℓ ℓ1 δ 1 1 X v z[a0, av ] 2ϵ/η, z[a0, av] 1 ℓ ℓ1 δ X v z[a0, av ] 2ϵ/η. (12) Suppose that |V 2| > ℓ ℓ1 δ, and hence Equation (12) is satisfied by at least |V 2| ℓ ℓ1 δ + 1 vertexes. We need the following inequality. v z[a0, av ] 1 a A2\{a F } z[a0, a] 2ϵ where the first inequality comes from Equation (11), and the second one by Equation (8). Then, summing over the |V 2| inequalities we get v V 2 z[a0, av] (ℓ ℓ1 δ + 1) v z[a0, av ] 2ϵ/η v z[a0, av ] + 1 v z[a0, av ] 2ℓϵ v z[a0, av ], where the last inequality follows from equation (13). Hence, we reach a contradiction and |V 2| ℓ ℓ1 δ. Constrained Phi-Equilibria To conclude the proof, we show that V 1 is an independent set. Since |V 1| |V | |V 2| = ℓ1 δ we reach a contradiction. Let v and v be two nodes in V 1 and w.l.o.g. let z[a0, av] z[a0, av ]. We show that (v, v ) / E. Since v V 1, φav is not a safe deviation for player 1 with respect to constraint cv . if (v, v ) E, then X a1 A1,a2 A2 a A1 φ[a1, a]z[a1, a2]cv(a, a2) = z[a0, a v] X v :(v ,v ) E z[a0, av ] 1 4ℓz[a0, a F ]cv(a, a2) a1 A1\{a0},a2 A2 a A1 φ[a1, a]z[a1, a2]cv(a, a2) z[a0, a v] z[a0, av] 1 4ℓz[a0, a F ]cv(a, a2)+ a1 A1\{a0},a2 A2 a A1 φ[a1, a]z[a1, a2]cv(a, a2) 4ℓz[a0, a F ]cv(a, a2) + X a1 A1\{a0},a2 A2 a A1 φ[a1, a]z[a1, a2]cv(a, a2) Hence, (v, v ) / E. Since V 1 is an independent set of size at least ℓ1 δ we reach a contradiction. This concludes the proof. D. Proofs Omitted from Section 4 Lemma 4.1. For every z A and i N, it holds max φi ΦS i (z)ui(φi z) = sup φi Φi inf ηi Rm + ui(φi z) η i ci(φi z) = inf ηi Rm + sup φi Φi ui(φi z) η i ci(φi z) . Proof. First, it is easy to see that sup φi ΦS i (z) ui(φi z) = sup φi Φi inf ηi Rm + ui(φi z) η i ci(φi z) . Indeed, for every φi / ΦS i (z), it holds that the vector ci(φi z) has at least one positive component, and, thus, the vector of Lagrange multipliers ηi can be selected so that ui(φi z) η i ci(φi z) goes to . This implies that the supremum over Φi cannot be attained in ΦS i (z). On the other hand, for every φi ΦS i (z), all the components of ci(φi z) are negative, and, thus, the inf is achieved by ηi = 0. This proves the first equality. Then, the second equality directly follows from the generalization of the max-min theorem for unbounded domains (see (Ekeland & Temam, 1999, Proposition 2.3)), which allows us to swap the sup and the inf. Lemma D.1. For any two real-valued functions f(x) and g(x) with g(x) c then min(f(x), g(x)) min(f(x), c). Proof. We can identify three sets I1, I2 and I3 defined as follows: I1 := {x s.t. f(x) c} (14) I2 := {x s.t. g(x) f(x) c} (15) I3 := {x s.t. f(x) g(x) c}. (16) Then for all x I1 we have that min(f(x), c) = c min(f(x), g(x)) = g(x), while for all x I2 we have that min(f(x), c) = f(x) min(f(x), g(x)) = g(x). Finally for all x I3 we have min(f(x), c) = f(x) = min(f(x), g(x)) = f(x). In all three sets we have that min(f(x), c) min(f(x), g(x)). Constrained Phi-Equilibria Lemma D.2. For all ηi Dc it holds that ui(φi z) η i ci(φi z) 1 Proof. Thanks to Assumption 2 we have that for all z (A) we have that there exists φi ΦS i (z) such that ci( φi z) ρ1. Then, for all ηi Dc we have: η i ci( φi z) ρ ηi 1 1. This easily concludes the proof of the statement ui(φi z) η i ci(φi z) ui( φi z) η i ci( φi z) 1, as ui is positive. Lemma D.3. For all ηi D we have inf ηi D sup φi Φi ui(φi z) η i ci(φi z) 1 Proof. Since ui 1 we have that inf ηi D sup φi Φi ui(φi z) η i ci(φi z) 1 sup ηi D inf φi Φi η i ci(φi z). Next we claim that sup ηi D inf φi Φi η i ci(φi z) 0. This follows from the fact that for all negative components of ci(φi z) then the corresponding components of ηi will be 0. This concludes the statement. Lemma 4.2. Let D := η Rm + | ||η||1 1/ρ . Then, for every z A and i N, it holds: max φi ΦS i (z)ui(φi z) = max φi Φi min ηi D ui(φi z) η i ci(φi z) = min ηi D max φi Φi ui(φi z) η i ci(φi z) . Proof. In Lemma 4.1 we already showed that: sup φi ΦS i (z) ui(φi z) = sup φi Φi inf ηi Rm + ui(φi z) η i ci(φi z) = inf ηi Rm + sup φi Φi ui(φi z) η i ci(φi z) . Note that to prove the statement it is enough to prove that: inf ηi Rm + sup φi Φi ui(φi z) η i ci(φi z) = inf ηi D sup φi Φi ui(φi z) η i ci(φi z) and more specifically that: inf ηi Rm + sup φi Φi ui(φi z) η i ci(φi z) inf ηi D sup φi Φi ui(φi z) η i ci(φi z) since the reverse inequality holds trivially. We can show this by the following inequalities: inf ηi Rm + sup φi Φi ui(φi z) η i ci(φi z) inf ηi D sup φi Φi ui(φi z) η i ci(φi z) , inf ηi Dc sup φi Φi ui(φi z) η i ci(φi z) ! Constrained Phi-Equilibria inf ηi D sup φi Φi ui(φi z) η i ci(φi z) , 1 = inf ηi D sup φi Φi ui(φi z) η i ci(φi z) , where the first inequality hold thanks to Lemma D.1 and Lemma D.2, while that last equation follows from Lemma D.3. Lemma 4.4. Given any 0 < δ ϵ and a δ-optimal set D D, the following holds: L D,ϵ LD,0. Proof. By definition we have that: L D,ϵ = ℓ( z ), where z is a solution to the problem z arg max z S ℓ(z) s.t. ϵ + ui( z ) max φi Φi ui(φi z ) η , i ci(φi z ) η i arg inf ηi D sup φi Φi ui(φi z ) η i ci(φi z ) On the other hand, call z the optimal Constrained Phi-equilibrium. This is a solution to the problem: z arg max z S ℓ(z) s.t. ui(z ) max φi Φi ui(φi z ) η , i ci(φi z ) η i arg inf ηi D sup φi Φi ui(φi z ) η i ci(φi z ) which has value LD,0 = ℓ(z ). Moreover, thanks to Lemma 4.2 and since D is δ-optimal we have that: ui(φi z ) η , i ci(φi z ) max φi Φi ui(φi z ) η , i ci(φi z ) + δ which implies that feasible correlated strategies of problem P2 are feasible correlated strategies of problem P1, and thus problem P1 as long as δ ϵ. Thus problem P1 is the problem of maximizing the same objective function over a larger set then problem P2 and thus L D,ϵ LD,0. Lemma 4.5. For any τ > 0, the set Dτ is (τm)-optimal. Proof. By Lemma 4.2, we know that for each player there exists an η i D such that maxφ ΦS i (z) ui(φi z) = ui(φi z) η , i ci(φi z) . By construction of Dϵ there exists a ηi Dϵ such that || ηi η i || ϵ. Thus max φ ΦS i (z)ui(φi z) = max φi Φi ui(φi z) η , i ci(φi z) ui(φi z) η i ci(φi z) + mϵ, where the last inequality comes the fact that: |(η i ηi) ci(φi z)| ci(φi z) 1 η i ηi mϵ as ci [ 1, 1]m. Lemma 4.6. For any τ > 0, the set Dτ is δ-optimal for δ = 2 p 2τ log s/ρ, where s is the number of players actions. Constrained Phi-Equilibria Proof. The proof exploits a probability interpretation of the Lagrange multipliers. Let η be the optimal multipliers, i.e., η argminη D maxφi Φi ui(φi z) η ci(φi z) . Now consider a basis Γ = { 1 ρej}j [m] {0} for D. By Carathoedory s theorem there exists a distribution γ (Γ) such that η = P η Γ γηη. Assume that ϵ and ρ are such that 1/ϵρ is an integer and take 1/ρϵ samples from the distribution γ and call η the resulting empirical mean. First, we argue that η Dϵ. Indeed ηj = kj 1/ρϵ 1 ρ = ϵ kj 1/ρϵ 1 ρϵ = ϵkj where kj N and thus we have that η Dϵ.11 Now we show that with high probability η Dϵ is close (in terms of utilities) to the optimal multiplier η . First observe that: η , i ci(φi z) := X ai Ai,bi Ai φi[b, ai] X a i A i η , ci(ai, a i)z[b, a i] ai Ai,bi Ai a i A i η ci(ai, a i)z[b, a i] ai Ai,bi Ai φi[b, ai] X a i A i η ci(ai, a i)z[b, a i] ai Ai,bi Ai φi[b, ai]δai,b (17c) = η i ci(φi z) + X ai Ai,bi Ai φi[b, ai]δai,b (17d) where the inequality comes from applying the Hoeffeding s inequality to every ai, b Ai: a i A i ( η η ) ci(ai, a i)z[b, a i] where δai,b = 2 2 1/ρϵ log 2 pai,b a i A i z[b, a i] since the range of the each sample is 1 a i A i z[b, a i] . Moreover, for Hoeffeding s inequality, for every ai, b Ai the above inequality holds with probability at least 1 pai,b and thus holds for all the ai, b Ai simultaneously, with probability at least p := P ai,b Ai pai,b. If then we take pai,b := 1 2|Ai|2 for all ai, b Ai, then we have that p = 1/2 > 0 and δ := δai,b = 2 2 1/ρϵ log (|Ai|) a i A i z[b, a i] Now the following holds with probability at lest 1/2: a i A i ( η η ) ci(ai, a i)z[b, a i] a i A i z[b, a i] The proof is concluded by observing plugging this definition of δ = δai,b in Equation (17) yields P ai Ai,bi Ai φi[b, ai]δai,b = δ, and we can conclude that: η , i ci(φi z) η i ci(φi z) + δ. This holds with positive probability, and thus shows the existence of such η Dϵ for which the above inequality holds and thus Dϵ is 2 q ρ log(|Ai|) -optimal. E. Proofs Omitted from Section 5 Proposition 5.1. For instances I := (Γ, Φ) satisfying Assumption 3, the set of constrained ϵ-Phi-equilibria is convex. 11If ϵ if not such that 1/ρϵ N then the one can take 1/ρϵ samples from γ (Γ) and then the statement hold for a slightly smaller ϵ < ϵ defined as ϵ := 1 1/ρϵ 1 ρ. Constrained Phi-Equilibria Proof. Let z and z be Constrained ϵ-Phi-equilibria that is for all i [N]: ϵ + ui(z ) ui(φ i z ) for φ arg max φi ΦS i ui(φi z ). Equivalently it holds for all i [N] that: ϵ + ui(z ) ui(φ i z ) where φ arg max φi ΦS i ui(φi z ). For any z := αz + (1 α)z we have that: ϵ + ui(z) = α (ϵ + ui(z )) + (1 α) (ϵ + ui(z )) αui(φ i z ) + (1 α)ui(φ i z ) max φi ΦS i ui(φi z), where the inequality holds for the linearity of ui, the first inequality as both z and z are Constrained ϵ-Phi-equilibria and the last inequality holds since the max is a convex operator. Theorem 5.1. Restricted to instances I := (Γ, Φ) which satisfy Assumption 3, APXCPE(1, 0) admits a polynomial-time algorithm. Proof. APXCPE(1, 0) amounts to solving the following problem: max z S ℓ(z) s.t. ui(z) max φi ΦS i ui(φi z) i N, which can be written as an LP with (possibly) exponentially-many constraints, by writing a constraint for each vertex of ΦS i . We can find an exact solution to such an LP in polynomial time by means of the ellipsoid algorithm that uses suitable separation oracle. Such an oracle solves the following optimization problem for every i N: φ i arg max φi ΦS i ui(φi z). Then, the oracle returns as a separating hyperplane the incentive constraint corresponding to a φ i (if any) such that ui(z) ui(φ i z). Since all the steps of the separation oracle can be implemented in polynomial time, the ellipsoid algorithm runs in polynomial time, concluding the proof. Theorem 5.2. Given an instance I := (Γ, Φ) satisfying Assumption 3, after T N>0 rounds, Algorithm 1 returns a correlated strategy z T A that is a constrained ϵT -Phi-equilibrium with ϵT = O(1/ T). Moreover, each round of Algorithm 1 runs in polynomial time. Proof. Any regret minimizer Ri for ΦS i guarantees that, for every φi ΦS i : t=1 ui(φi zt) t=1 ui(φi,t zt) ϵi,T T, (19) where ϵi,T = o(T). Since xi,t[a] = P b Ai φi,t[b, a]xi,t[b] for all a Ai, for every t [T] and a = (ai, a i) A: (φi,t zt)[ai, a i] = X b Ai φi,t[b, ai]z[b, a i] b Ai φi,t[b, ai] xi,t[b] x i,t[a i] b Ai φi,t[b, ai]xi,t[b] Constrained Phi-Equilibria = xi,t[ai] x i,t[a i] = zt[ai, a i], Plugging the equation above into Equation (19), we get: t=1 ui(φi zt) t=1 ui(zt) ϵi,T T. Now, since z T := 1 T PT t=1 zt and ui(z) is linear in z, we can conclude that, for every i N and φi ΦS i : ui( z T ) ui(φi z T ) ϵi,T , and, thus, by letting ϵT := maxi N ϵi,T we get that z T satisfies the incentivize constrained for being a constrained ϵT -Phi-equilibrium. We are left to verify that z T S, namely ci( z T ) 0 for all i N. This readily proved as: ci( z T ) = 1 t=1 ci(φi,t zt) t=1 ci(φi,t) where the first equality holds since ci is linear, the second equality holds thanks to zt = φi,t zt, the third one by Assumption 3, while the inequality holds since φi,t ΦS i . This concludes the proof of the first part of the statement. In conclusion, Algorithm 1 runs in polynomial time as finding xi,t[a] = P b Ai φi,t[b, ai]xi,t[b] for all a Ai is equivalent to finding a stationary distribution of a Markov Chain, which can be done in polynomial time. Moreover, we can implement the regret minimizers Ri over the polytopes ΦS i so that their operations run in polynomial time, such as, e.g., online gradient descent; see (Hazan et al., 2016). Theorem 5.4. For instances I := (Γ, ΦCCE) such that ci(a) = ci(a ) for every player i N and action profiles a, a A : ai = a i, Assumption 3 holds. Proof. Since the costs ci(a) of player i N only depends on player i s action ai and not on the actions of other players, it is possible to show that there exists ci : ΦCCE [ 1, 1]m such that the following holds for every z A: ci(φi) := ci(φi z). Indeed, for every φi ΦCCE, by definition of ΦCCE there exists a probability distribution h Ai : φi[b, a] = h[a] for all b, a Ai. Then, for every ai Ai and a i A i, we can write: (φi z)[ai, a i] = X b Ai φi[b, ai]z[b, a i] b Ai h[ai]z[b, a i] b Ai z[b, a i]. Moreover, it holds: ci(φi z)[ai, a i] = X a A ci(a)(φi z)[ai, a i] Constrained Phi-Equilibria a A ci(a)h[ai] X b Ai z[b, a i] ai Ai ci(ai, )h[ai] X b Ai z[b, a i] ai Ai ci(ai, )h[ai], which only depends on φi, as desired. Notice that, in the equations above, for every a Ai we let ci(a, ) be the (unique) value of ci(a) for all a A : ai = a.