# leadership_in_singleton_congestion_games__cf45f8d0.pdf

Leadership in Singleton Congestion Games

Alberto Marchesi1, Stefano Coniglio2, Nicola Gatti1

1 Politecnico di Milano, Piazza Leonardo da Vinci 32, Milano, Italy 2 University of Southampton, University Road SO17 1BJ, Southampton, United Kingdom alberto.marchesi@polimi.it, s.coniglio@soton.ac.uk, nicola.gatti@polimi.it

We study Stackelberg games where the underlying structure is a congestion game. We recall that, while leadership in 2-player games has been widely investigated, only few results are known when the number of players is three or more. The intractability of ﬁnding a Stackelberg equilibrium (SE) in normal-form and polymatrix games is among them. In this paper, we focus on congestion games in which each player can choose a single resource (a.k.a. singleton congestion games) and a player acts as leader. We show that, without further assumptions, ﬁnding an SE when the followers break ties in favor of the leader is not in Poly-APX, unless P = NP. Instead, under the assumption that every player has access to the same resources and that the cost functions are monotonic, we show that an SE can be computed efﬁciently when the followers break ties either in favor or against the leader.

1 Introduction The problem of ﬁnding a Stackelberg equilibrium (SE) when mixed-strategy commitments are allowed is receiving a lot of attention in the artiﬁcial intelligence literature, also thanks to the many successful real-world applications in, e.g., security [Tambe, 2011]. In Stackelberg games, a player acts as leader, committing to a (potentially) mixed strategy, while the other players act as followers [Von Stengel and Zamir, 2010]. Different versions of SEs can be deﬁned based on how the followers break ties: optimistic (OSE) if in favor of the leader, and pessimistic (PSE) if against her. Finding an OSE or a PSE in 2-player normal-form games is easy, as shown by, respectively, Conitzer and Sandholm [2006] and Von Stengel and Zamir [2010]. The same holds for ﬁnding an OSE in n-player games where the followers play simultaneously in a correlated fashion [Conitzer and Korzhyk, 2011]. In more general situations, though, the problem is hard. Indeed, computing an O/PSE for normal-form games with two followers who play simultaneously a Nash equilibrium (NE) is not in Poly-APX unless P = NP [Basilico et al., 2017]. In polymatrix games with the followers restricted to pure strategies, ﬁnding an OSE is not in Poly-APX unless P = NP if the number of followers is not ﬁxed, while

it is easy if their number is a constant [De Nittis et al., 2018]. Finding a PSE is NP-hard in normal-form games even with two followers playing pure strategies [Coniglio et al., 2017]. Finding an OSE is also NP-hard [Conitzer and Sandholm, 2006] with multiple followers playing sequentially. We focus, here, on congestion games (CGs) game models which, in spite of their simplicity, enjoy nice computational properties even with many players with the aim of investigating whether, in these games, a Stackelberg paradigm is computationally tractable. In CGs, given a set of resources, the players actions are subsets of the resources and the costs the players perceive depend (monotonically or not) on the level of resource utilization (congestion). CGs always admit pure-strategy NEs which are achievable by best-response dynamics [Rosenthal, 1973; Monderer and Shapley, 1996]. CGs where each player cannot use more than a single resource are called singleton CGs (SCGs). Computing their NEs is easy [Ackermann et al., 2008]. Furthermore, in SCGs in which all the players have the same action space, ﬁnding a social-cost minimizing NE is also easy [Ieong et al., 2005]. Original contributions. We apply a Stackelberg paradigm to SCGs, assuming the presence of a special player acting as leader. We also allow the leader to perceive costs which are potentially different from the followers . The leader commits to a (potentially) mixed strategy, while all the other players, acting as followers, observe the leader s commitment and then play, simultaneously, reaching an NE.1 In particular, we study the case in which the followers play pure strategies after observing the leader s commitment, which is reasonable as this followers game always admits at least a purestrategy NE reachable by some best-response dynamics. A simple practical scenario is when a player has a higher priority to decide which resource to use before the other players, e.g., when the resources can be used for free, but gaining a higher priority requires a payment. We show that, when no further assumptions are made, computing an OSE is not in Poly-APX unless P = NP, even when the leader has only one available action and her costs are equal to the followers . This shows that the same inapproximability result also holds

1To our knowledge, the only works related to ours are [Roughgarden, 2004; Fotakis, 2010] and their extensions. However, they analyze a different Stackelberg paradigm where the leader is an authority whose objective is to minimize the social cost of the NE reached by the followers.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

for ﬁnding an NE minimizing the cost for a given player in an SCG without leadership. Instead, when every player has access to the same set of resources and the costs are monotonically increasing functions of the congestion, the problem of ﬁnding an OSE or a PSE can be solved in polynomial time. Such result holds even when leader s and followers costs are different. While the derivation is straightforward when the leader s commitment is a pure strategy, the analysis is more involved with mixed-strategy commitments, and the result follows from the fact that mixed-strategy commitments do not allow the leader to incur a cost smaller than the one she gets with pure-strategy ones. Furthermore, we show that the previous result holds only when the resource cost functions are monotonic, as, in the non-monotonic case, the leader s cost with mixed-strategy commitments can be strictly smaller than that obtained with a commitment in pure strategies.

2 Preliminaries

In this work, we analyze CGs in which a leader commits to a strategy beforehand, and, then, the followers simultaneously decide how to play, reaching an NE in the game that results from observing the leader s commitment. Following the notation by Shoham and Leyton-Brown (2008), we formally deﬁne a Stackelberg SCG (SSCG) as a tuple (N, R, A, cℓ, cf) where: N = F {ℓ} is a ﬁnite set of players, with player ℓdenoting the leader and F the set of followers, R is a ﬁnite set of resources, A = {Ap}p N, where Ap R represents the set of player p s actions, and cℓ= {ci,ℓ}i R and cf = {ci,f}i R are, respectively, the leader s and followers cost functions, with ci,ℓ, ci,f : N Q denoting the costs of resource i as a function of its congestion. As usual, we assume ci,ℓ(0) = ci,f(0) = 0 for every i R. In the following, let |N| = n and |R| = r be the number of players and resources, respectively. A strategy σp of player p N is a probability distribution over Ap where σp(ap) denotes the probability that ap Ap is played. Let p be the set of player p s strategies. A strategy σp p is said pure if it prescribes to always play action ap Ap, i.e., σp(ap) = 1; otherwise, σp is called mixed. A collection of players strategies is called strategy proﬁle in general, and action proﬁle if all the strategies are pure. In this work, we use σ = (σℓ, a) to collectively denote a strategy proﬁle in which the leader plays a (potentially) mixed strategy σℓ ℓand the followers play pure strategies which determine an action proﬁle a = (ap)p F p F Ap. In the following, given a followers action proﬁle a = (ap)p F p F Ap, let νa i = |{p F | ap = i}| be the number of followers selecting resource i R in a, i.e., the resource congestion caused by the followers presence only. We deﬁne the followers conﬁguration induced by a as the vector νa Nr whose i-th component is νa i . In addition, for σℓ ℓwe deﬁne cσℓ i,f : N Q, the followers expected cost of resource i R given σℓ, as a function of the number x of followers selecting i, i.e., cσℓ i,f(x) = σℓ(i)ci,f(x + 1) + (1 σℓ(i))ci,f(x). Indeed, given a leader s strategy σℓ, all followers who select resource i R experience a congestion that may (with probability σℓ(i)) or may not (with probability 1 σℓ(i)) be incre-

mented by one, depending on whether the leader would or would not choose resource i. Finally, given σ = (σℓ, a), let cσ ℓ= P i Aℓσℓ(i)ci,ℓ(νa i + 1) be the leader s cost. Notice that, after observing a leader s strategy σℓ, the followers play a new CG where resource costs are speciﬁed by functions cσℓ i,f, for i R. Being a CG, such game always admits an NE in which the players adopt pure strategies [Rosenthal, 1973]. Moreover, we assume that the followers play pure-strategy NEs, which are reached by playing some bestresponse dynamics [Monderer and Shapley, 1996]. Given strategy proﬁle σ = (σℓ, a), a is an NE for σℓif, for every p F and a p Ap, cσℓ ap,f(νa ap) cσℓ a p,f(νa a p + 1), i.e., if no follower has an incentive to unilaterally deviate from ap by selecting another resource a p. For σℓ ℓ, let Eσℓbe the set of NEs in the followers game resulting from σℓ. In the second part of the work, we restrict our attention to a subclass of SSCGs where each player can select every resource, i.e., where Ap = R for all p N. We refer to these games as Simple SSCGs (SSSCGs). Formally, an SSSCGs is a tuple (N, R, cℓ, cf) whose elements are deﬁned as in an SSCG where all the followers are identical as they are allowed to choose the same resources. Thus, only the number of followers selecting each resource is signiﬁcant. As a consequence, a followers action proﬁle a can be equivalently represented with the followers conﬁguration νa induced by it. Thus, when studying SSSCGs, we do not explicitly refer to followers action proﬁles but, rather, use ν Nr with P i R νi = n 1 to denote a followers conﬁguration. Moreover, let us notice that a followers conﬁguration ν is an NE for σℓ ℓif, for every i R : νi > 0 and j R, cσℓ i,f(νi) cσℓ j,f(νj + 1). Observe that, given a leader s strategy, there might be multiple NEs in the followers game and, hence, different deﬁnitions of SE can be considered depending on the equilibriumselection rule adopted by the followers. As customary in the literature, we consider two deﬁnitions: optimistic SE (OSE) and pessimistic SE (PSE). In the ﬁrst one, the followers act in favor of the leader, thus selecting an NE minimizing her cost, while, in the second one, the followers always select an NE which results in the maximum leader s cost. Formally: Deﬁnition 1. A strategy proﬁle σ = (σℓ, a) is an OSE if it solves the following bilevel problem:

min σℓ ℓ min a Eσℓc(σℓ,a) ℓ

As it is clear, an OSE always exists in SSCGs. Deﬁnition 2. A PSE, if it exists, is a strategy proﬁle σ = (σℓ, a) which solves the following bilevel problem:

min σℓ ℓ max a Eσℓc(σℓ,a) ℓ

Let us recall that, in general, the problem in Deﬁnition 2 may not admit a minimum (but only an inﬁmum) and, thus, a PSE may not exist [Von Stengel and Zamir, 2010].

3 SSCG NP-Hardness and Inapproximability Let us start our analysis with a negative result, showing that the problem of computing an OSE in SSCGs is computationally intractable, even if the leader can select a single resource

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

and her costs are monotonic. Our result is based on a reduction from 3SAT, a well-known NP-complete problem [Garey and Johnson, 1979] which reads as follows:

Deﬁnition 3 (3SAT). Given a ﬁnite set C of 3-literal clauses deﬁned over a ﬁnite set V of variables, is there a truth assignment to the variables which satisﬁes all clauses?

Theorem 1. Computing an OSE in SSCGs is NP-hard.

Proof. We provide a reduction from 3SAT showing that the existence of a polynomial-time algorithm for computing an OSE in SSCGs would allow us to solve any 3SAT instance in polynomial time. Speciﬁcally, given a 3SAT instance (C, V ) and a number 0 < ϵ < 4, we build an SSCG Γϵ(C, V ) such that there exists an OSE where the leader s cost is ϵ if and only if (C, V ) is satisﬁable (if not, the leader s cost is 4 in any OSE). In the following, l φ denotes a literal (i.e., a variable or its negation) appearing in φ C, while v(l) denotes the variable corresponding to that literal. Moreover, we let |C| = m and |V | = s be, respectively, the number of clauses and variables. Mapping. Γϵ(C, V ) is deﬁned as follows: N = F {ℓ}, with F = {pφ, pφ,t | φ C} {pv,t, pv, p v | v V } {pφ,v, pφ, v | φ C, v V }; R = {rt} {rφ | φ C} {rv,t, rv, r v | v V } {rφ,v, rφ, v | φ C, v V }; Apφ = {rφ} {rφ,l | l φ}, Apφ,t = {rφ, rt} φ C; Apv = {rv,t, rv}, Ap v = {rv,t, r v}, Apv,t = {rv,t, rt} v V ; Apφ,v = {rv, rφ,v}, Apφ, v = {r v, rφ, v} φ C, v V ; Aℓ= {rt}. Moreover, cost functions are speciﬁed by the following table, where cr v,f = crv,f, crφ, v,f = crφ,v,f, and crt,f = crt,ℓ:

x crφ,f crv,f crv,t,f crφ,v,f crt,f 1 2 7 7 1 ϵ [2, m] 5 7 3 6 4 m + 1 5 0 3 6 4

Figure 1 shows an example of game Γϵ(C, V ). Clearly, given (C, V ), Γϵ(C, V ) can be constructed in polynomial time, as it features n = 2m + 3s + 2ms + 1 players and r = m + 3s + 2ms + 1 resources. Observe that, in Γϵ(C, V ), the leader has a single resource available. Hence, the only leader s commitment is to select resource rt, setting σℓ(rt) = 1. As a result, the leader s cost is ϵ if and only if no follower selects resource rt; otherwise, it is 4. If. Suppose that (C, V ) is satisﬁable, and let τ : V {T, F} be a truth assignment satisfying all clauses in C. Using τ, we recover a followers action proﬁle a = (ap)p F p F Ap such that a Eσℓ, with σ = (σℓ, a) providing the leader with a cost of ϵ. Since ϵ is the minimum cost the leader can achieve and the followers behave optimistically, σ is an OSE. In particular, let apφ,t = rφ, for all φ C, and apv,t = rv,t, for all v V . Moreover, if τ(v) = T, let apv = apφ,v = rv, ap v = rv,t, and apφ, v = rφ, v for all φ C, while, if τ(v) = F, let ap v = apφ, v = r v, apv = rv,t, and apφ,v = rφ,v for all φ C. Notice that, since either

τ(v) = T or τ(v) = F, one between rv and r v is selected by m + 1 followers and the other one by none, respectively. Say, w.l.o.g., νa rv = m + 1 and νa r v = 0, as the other case is analogous. First, no follower pφ,v would deviate from rv to rφ,v, as, otherwise, she would incur a cost of at least 1, rather than 0. The same holds for followers pφ, v, as their cost is at most 6 while, if any of them switched to r v, she would incur a cost of 7. Similarly, since there are exactly two followers selecting rv,t, follower pv would not deviate from rv (as 0 < 3), while p v and pv,t would not switch from rv,t, as they would get 7 and 4, respectively, rather than 3. Furthermore, since τ is a truth assignment satisfying (C, V ), at least one literal l φ evaluates to true under τ for every φ C. Let apφ = rφ,l for every φ C. Since l evaluates to true, it must be apφ,l = rl, thus pφ is the only follower who selects rφ,l. As a result, pφ experiences a cost equal to 1, and she has no incentive to deviate. Finally, pφ,t does not deviate from rφ to rt as 2 < 4. Thus, we can conclude that a is an NE and, since there is no follower using rt, the leader s cost is ϵ. Only if. Suppose there exists an OSE σ = (σℓ, a) in which the leader s cost is ϵ. We show that a = (ap)p F p F Ap can be employed to recover, in polynomial-time, a truth assignment τ that satisﬁes all clauses in C. First, let us note that no follower selects rt in a as, otherwise, the leader s cost would be 4 > ϵ. As a consequence, all followers pφ,t and pv,t must select the other resource they have available, i.e, apφ,t = rφ and apv,t = rv,t. Moreover, there cannot be two followers using resource rφ, for every φ C, as, otherwise, pφ,t would have an incentive to deviate from rφ to rt, as 5 > 4. Thus, apφ = rφ, i.e., there must be a literal l φ such that apφ = rφ,l, for all φ C. In addition, there cannot be two followers selecting rφ,l as, otherwise, pφ would have an incentive to deviate to rφ, as 5 < 6. Thus, it must be the case that apφ,l = rl. This implies that νa rl = m + 1 as, otherwise, the cost of pφ,l would be 7 > 6, and the follower would change resource, paying rφ,l. Furthermore, at least one between pv and p v must select rv,t as, otherwise, player pv,t s cost would be 7 < 4, and she would prefer switching to resource rt. As a result, at least one between rv and r v must be selected by a number of followers strictly smaller than m+1; in that case, no follower pφ,v (or pφ, v) selects that resource as, otherwise, she would incur a cost of 7 and she would have an incentive to deviate. We thus deﬁne a truth assignment τ such that: τ(v) = T if νa rv = m+1, τ(v) = F if νa r v = m+1, and τ(v) is either T or F whenever νa rv = νa r v = 0. Clearly, τ is well-deﬁned. Moreover, we previously showed that, for every φ C, there exists a literal l φ such that apφ,l = rl, which implies that rl = m + 1, and, thus, τ(v(l)) = T if l is positive, while τ(v(l)) = F if it is negative. Therefore, τ satisﬁes all clauses.

The proof of Theorem 1 also shows the following:

Observation 1. In SCGs without leadership, computing an NE minimizing the cost of a given player is NP-hard.

Furthermore, from Theorem 1, it directly follows that the leader s cost in an OSE cannot be efﬁciently approximated to within any approximation factor which depends polynomially on the size of the input:

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

rφ1,x rφ1,y rφ1,z rφ2, x rφ2,y rφ2, z

rx r x ry r y rz r z

rx,t ry,t rz,t rφ2,x rφ1, x

rφ2,z rφ1, z

pφ1,z pφ2, x pφ2,y

pφ1,x pφ1, x

pφ2,z pφ1, z px p x py p y pz p z

px,t py,t pz,t

pφ1,t pφ2,t

Figure 1: Example of Γϵ(C, V ) with V = {x, y, z} and C = {φ1, φ2}, where φ1 = x y z and φ2 = x y z.

Corollary 1. The problem of computing an OSE in SSCGs is not in Poly-APX unless P = NP.

Proof. Given a 3SAT instance (C, V ), let us build an SSCG Γϵ(C, V ) as in the proof of Theorem 1. We have already proven that Γϵ(C, V ) has an OSE in which the leader s cost is ϵ if and only if (C, V ) is satisﬁable and that, otherwise, the leader s cost is 4. Let ϵ = 4 2n+r . Assume that there exists a polynomial-time approximation algorithm A with approximation factor poly(n, r), i.e., a polynomial function of n and r. Assume (C, V ) is satisﬁable. A applied to Γϵ(C, V ) would return a solution with leader s cost at most 4 2n+r poly(n, r). Since, for n and r large enough, 4 2n+r poly(n, r) < 4, A would allow us to decide in polynomial time whether (C, V ) is satisﬁable, a contradiction unless P = NP.

Notice that, since the followers break ties in favour of the leader in the reduction, the result in Theorem 1 does not apply to the problem of ﬁnding a PSE. Our conjecture is that, as it is the case for all the problems with known results on computing SEs, computing a PSE is as hard as ﬁnding an OSE.

4 SSSCGs with Monotonic Costs We focus, in this section, on SSSCGs, showing that, assuming players costs which are monotonic functions of the resource congestion, an O/PSE can be computed efﬁciently. Formally, we call the players cost functions weakly monotonic if, for every resource i R, ci,ℓ(x) ci,ℓ(x + 1) and ci,f(x) ci,f(x + 1) for all x N, and strictly monotonic if all the inequalities are strict. First, notice that, in these games, searching for an O/PSE is not as easy as it might appear, for the following reason: Observation 2. There are SSSCGs with weakly monotonic cost functions where some followers conﬁgurations are NEs only for leader s mixed strategies. Consider a game with three followers, R = {r1, r2, r3}, and followers costs as in Figure 2(a). The followers conﬁguration in which each follower selects a different resource is

not an NE if the leader commits to a pure strategy, while, for instance, it is an NE for σℓ(r1) = σℓ(r3) = 1

2, σℓ(r2) = 0. In the following, we show that, when searching for an OSE, one can restrict the attention without loss of generality to pure strategies of the leader, provided that the players cost functions are weakly monotonic. Intuitively, given an OSE in which the leader plays a mixed strategy, we can easily construct another equilibrium in which, instead, the leader s strategy is pure.

Theorem 2. Every SSSCG with weakly monotonic cost functions admits an OSE σ = (σℓ, ν) in which σℓis pure.

Proof. Given an OSE σ = (σℓ, ν), with σℓmixed, we construct another OSE ˆσ = (ˆσℓ, ˆν) such that ˆσℓis pure. Let S = {i R | σℓ(i) > 0} be the set of resources played by the leader with positive probability in σℓ, and let i arg mini S ci,ℓ(νi + 1). Clearly, cσ ℓ= P i Aℓσℓ(i)ci,ℓ(νi + 1) ci ,ℓ(νi + 1). Moreover, given that ν is an NE for σℓ, the following holds:

cσℓ i,f(νi) cσℓ j,f(νj + 1) i R : νi > 0, j R. (1)

Let us deﬁne ˆσℓsuch that ˆσℓ(i ) = 1. We now show that such ˆσℓis part of an OSE. Notice that cˆσℓ i,f(x) = ci,f(x) x N for every i = i R, while cˆσℓ i ,f(x) = ci ,f(x + 1) x N. Given that the followers behave optimistically, it is sufﬁcient to provide a ˆν Eˆσℓsuch that ˆσ = (ˆσℓ, ˆν) satisﬁes cˆσ ℓ cσ ℓ. Speciﬁcally, we construct a sequence of followers conﬁgurations reaching such ˆν. Given ˆσℓ, let us consider the sequence (ν(0) = ν, ν(1), . . . , ν(T) = ˆν) such that each conﬁguration differs from the previous one in that a single follower has changed resource, strictly decreasing her cost in the followers game resulting from ˆσℓ. Formally, for all 0 t < T, there exists i, j R such that ν(t)i > 0, ν(t+1)i = ν(t)i 1, ν(t + 1)j = ν(t)j + 1, and cˆσℓ i,f(ν(t)i) > cˆσℓ j,f(ν(t + 1)j). Moreover, let us assume that a follower deviates to resource i , i.e., ν(t + 1)i > ν(t)i , only if this is the only way of

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

strictly decreasing some follower s cost. Now, we prove:

ν(t + 1)i ν(t)i 0 t < T. (2)

By contradiction, suppose there exists 0 t < T such that ν(t + 1)i > ν(t)i . Then, in ν(t), there exists a follower who can strictly decrease her cost by choosing i instead of resource j = i R : ν(t)j > 0. Thus,

cσℓ i ,f(νi + 1) ci ,f(ν(t)i + 2) < cj,f(ν(t)j), (3)

where the ﬁrst inequality holds since ν(t)i = νi . Two cases are possible. In the ﬁrst one, ν(t)j νj, implying cj,f(ν(t)j) cj,f(νj) cσℓ j,f(νj), which, together with Equations (1) and (3), leads to a contradiction. In the second case, ν(t)j > νj implies that there exists k = i R such that ν(t)k < νk (and νk > 0), otherwise P i R ν(t)i > n 1. It follows that cj,f(ν(t)j) ck,f(ν(t)k + 1) cσℓ k,f(νk), where the ﬁrst inequality holds since, due to our assumptions on the sequence, it cannot be cj,f(ν(t)j) > ck,f(ν(t)k+1) as ν(t + 1)i > ν(t)i , and the second inequality follows from ν(t)k < νk. Thus, Equations (1) and (3) give a contradiction. As a result, Equation (2) holds, and, thus, ˆνi νi . Given the monotonicity of the costs, ˆσ is an OSE.

Now, we prove that a similar result holds for the pessimistic case. The result is weaker though, as it requires the stronger assumption that the followers cost functions be strictly monotonic.

Theorem 3. Every SSSCG in which leader s and followers cost functions are weakly monotonic and strictly monotonic, respectively, admits a PSE σ = (σℓ, ν) in which σℓis pure.

Proof. Suppose there exists a PSE σ = (σℓ, ν) in which σℓis mixed. We show that there must be another PSE ˆσ = (ˆσℓ, ˆν) such that ˆσℓis pure. Let us deﬁne i R as in the proof of Theorem 2, so that cσ ℓ ci ,ℓ(νi + 1) and Equation (1) holds. Given that the followers behave pessimistically, we need to show that, for every ˆν Eˆσℓ, ˆσ = (ˆσℓ, ˆν) satisﬁes cˆσ ℓ cσ ℓ. By contradiction, assume that cˆσ ℓ> cσ ℓ, which implies ci ,ℓ(ˆνi + 1) > ci ,ℓ(νi + 1). It easily follows from the monotonicity of the costs that ˆνi > νi . Thus, there must be a resource j R such that ˆνj < νj, otherwise P i R ˆνi > n 1. Moreover, let us notice that νj > 0. Thus,

cσℓ i ,f(νi +1) ci ,f(ˆνi +1) cj,f(ˆνj+1) cσℓ j,f(νj), (4)

where the ﬁrst inequality follows from νi < ˆνi , the second one from the fact that ˆν is an NE for ˆσℓ, while the third one from ˆνj < νj. Equation (1) implies cσℓ j,f(νj) cσℓ i ,f(νi +1). If cσℓ j,f(νj) < cσℓ i ,f(νi + 1), then Equation (4) leads to a contradiction. Otherwise, if cσℓ j,f(νj) = cσℓ i ,f(νi + 1), all inequalities in Equation (4) must hold as equalities. However, this would imply cσℓ i ,f(νi + 1) = ci ,f(ˆνi + 1) and cj,f(ˆνj + 1) = cσℓ j,f(νj), a contradiction as σℓis mixed and the followers cost functions are strictly monotonic.

Moreover, let us notice that Theorem 3 fails to hold whenever the followers cost functions are weakly monotonic.

x cr1,f cr2,f cr3,f 1 1 4 0 2 3 7 2 3 5 7 5

x cr1,ℓcr1,f cr2,ℓcr2,f 1 1 1 1 1 2 2 1 2 1

(b) x cr1,ℓcr1,f cr2,ℓcr2,f 1 1 2 1 2 2 2 1 2 1

x cr1,ℓcr1,f cr2,ℓcr2,f 1 2 1 2 1 2 0 2 0 2

Figure 2: Cost functions of some SSSCG examples.

Observation 3. There are SSSCGs with weakly monotonic cost functions where any PSE prescribes the leader to play a mixed strategy.

Consider a game with two followers, R = {r1, r2}, and players costs as in Figure 2(b). Clearly, any followers conﬁguration is an NE, independently of the leader s commitment. Thus, whenever the leader commits to a pure strategy, she incurs a cost of 2, while she can pay only 1 by uniformly randomizing between the two resources. Theorems 2 and 3 provide the fundamental insights which allow us to efﬁciently compute O/PSEs in SSSCGs with monotonic cost functions. Speciﬁcally, we can compute an OSE (resp., PSE) by enumerating the leader s pure strategies and, for each of them, computing the followers NE which results in the leader s cost being minimized (resp., maximized). An O/PSE is then obtained by picking a pure strategy which minimizes then leader s cost. The detailed procedure is described in Algorithm 1, where function O-Pick(S) (resp., P-Pick(S)) returns some resource j S, giving precedence to resources j = i (resp., j = i).

Algorithm 1: Algorithm computing an O/PSE of an SSSCG. input : An SSSCG Γ = (N, R, cℓ, cf) output: σ that is an O/P-LFE of Γ

Function Compute-O/P-LFE(Γ)

σℓ[i] σℓ ℓ: σℓ(i) = 1; ν[i, j] 0 i, j R; while P

j R ν[i, j] < n do S arg minj R cσℓ[i] j,f (ν[i, j] + 1); j O/P-Pick(S); ν[i, j ] ν[i, j ] + 1;

cℓ[i] ci,ℓ(ν[i, i] + 1);

i arg mini R cℓ[i]; return σ = (σℓ[i ], ν[i , ]);

Let us remark that, in Algorithm 1, σℓ[ ], ν[ , ], and cℓ[ ] are the algorithm s variables, and, for every i R, ν[i, j] denotes the number of followers selecting resource j R in the NE that is reached when the leader s strategy is σℓ[i].

Theorem 4. Algorithm 1 is correct and it runs in time O(nr log r).

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

Proof. In order to show that Algorithm 1 is correct, due to Theorems 2 and 3 we only need to prove that, for every i R and after the execution of the while loop, the followers conﬁguration ν is such that νj = ν[i, j] for j R is an NE for σℓ[i] minimizing (or maximizing) the leader s cost. First, let us show that ν is an NE. Suppose, by contradiction, it is not. Then, there exists j R : νj > 0 and k R such that cσℓ[i] j,f (νj) > cσℓ[i] k,f (νk + 1). Let νk be the value of ν[i, k] during the step in which ν[i, j] is set to its ﬁnal value νj. Clearly, cσℓ[i] j,f (νj) > cσℓ[i] k,f (νk +1) cσℓ[i] k,f ( νk +1), and the algorithm would have not incremented ν[i, j] during that step, a contradiction. In the rest of the proof, we focus on the optimistic case, as the pessimistic one can be treated analogously. Suppose, by contradiction, that ν is not an NE minimizing the leader s cost for σℓ[i]. Then, there exists another NE ˆν for σℓ[i] such that ci,ℓ(ˆνi + 1) < ci,ℓ(νi + 1). Given the monotonicity of the costs, ˆνi < νi. Therefore, there must exist j = i R such that ˆνj > νj. Let us consider the step in which ν[i, i] is set to νi, and let νj be the value of ν[i, j] during that step. It must be that cσℓ[i] i,f (νi) < cσℓ[i] j,f ( νj +1) as, otherwise, the algorithm would have incremented ν[i, j] instead of ν[i, i]. But then cσℓ[i] j,f ( νj +1) cσℓ[i] j,f (νj +1) cσℓ[i] j,f (ˆνj),

which implies cσℓ[i] i,f (ˆνi + 1) cσℓ[i] i,f (νi) < cσℓ[i] j,f ( νj + 1)

cσℓ[i] j,f (ˆνj), contradicting the fact that ˆν is an NE for σℓ[i]. Clearly, the while loop is executed exactly r times, and each execution performs n steps. Moreover, using efﬁcient data structures each step takes time O(log r). Thus, the overall running time is O(nr log r)

We conclude the section by showing that, in SSSCGs with monotonic costs and under the additional assumption that leader s and followers costs be equal, all O/PSEs in which the leader plays a pure strategy are NEs in the game where all players play simultaneously (i.e., without leadership).

Theorem 5. Given an SSSCG with monotonic costs and cℓ= cf = {ci}i R, any O/PSE σ = (σℓ, a) with σℓpure is an NE.

Proof. Let σ = (σℓ, ν) be an O/PSE with σℓ(i ) = 1 for some i R. Clearly, given that ν Eσℓ, for every i R : νi > 0 and j R, cσℓ i (νi) cσℓ j (νj + 1). Therefore, there is no follower who has an incentive to change resource, and, thus, it is sufﬁcient to prove that the leader does not deviate from resource i either, unilaterally. If νi > 0, we have ci (νi +1) = cσℓ i (νi ) cσℓ j (νj +1) = cj(νj +1) for every j = i R, and it immediately follows that the leader does not deviate and σ is an NE. The case in which νi = 0 is more involved. By contradiction, suppose that σ is not an NE. As a consequence, the leader must have an incentive to deviate for some j = i R, i.e., ci (νi + 1) = ci (1) > cj(νj + 1). Suppose the leader commits to a strategy ˆσℓsuch that ˆσℓ(j) = 1. We prove that, for every ˆν Eˆσℓ, ˆσ = (ˆσℓ, ˆν) provides the leader with a cost strictly smaller than ci (1). Suppose, instead, cj(ˆνj + 1) ci (1). Three cases are possible. In the ﬁrst case, ˆνj < νj and ci (1) > cj(νj + 1) cj(ˆνj + 1) ci (1). In the second one, ˆνj = νj and cj(ˆνj + 1) ci (1) > cj(νj +1). Finally, in the third case, ˆνj > νj, which

implies that there must be a resource k = i R such that ˆνk < νk, and ci (1) > cj(νj + 1) ck(νk) ck(ˆνk + 1) cj(ˆνj + 1) ci (1). As all cases lead to a contradiction, it must be cj(ˆνj + 1) < ci (1). The proof is complete as, in ˆσ, the leader s cost is cj(ˆνj + 1) < ci (1), contradicting the fact that σ is an O/PSE.

5 SSSCGs with Arbitrary Costs Finally, let us shift our attention to general SSSCGs, i.e., games in which the costs need not be monotonic functions of the resource congestion. Observation 4. Given an SSSCG, an optimal leader s pure strategy to commit to can be computed efﬁciently, both in the optimistic and the pessimistic case. Clearly, when the followers costs are monotonic functions, we can ﬁnd an optimal leader s pure strategy using Algorithm 1. In general, we can apply a procedure similar to that of Algorithm 1, enumerating the leader s pure strategies while computing, for each of them, an NE minimizing/maximizing the leader s cost in the resulting followers game. In order to ﬁnd one such NE, we can adapt an algorithm proposed in [Ieong et al., 2005], which relies on dynamic programming to compute in O(r5n6) an NE minimizing the social-cost in SSSCGs without leadership. It sufﬁces to change the objective function from the social cost to the leader s cost of the resource selected in the current pure strategy. Thus, in general, the overall computation requires O(r6n6). Unfortunately, the assumption that the leader always plays pure strategies is not safe in SSSCGs with arbitrary costs, as Theorems 2 and 3 do not hold if the monotonicity assumption is dropped. Observation 5. There are SSSCGs such that: the followers costs only are non-monotonic, and any O/PSE prescribes the leader to play a mixed strategy; the leader s costs only are non-monotonic, and any O/PSE prescribes the leader to play a mixed strategy. Consider a game with a single follower, R = {r1, r2}, and players costs as in Figure 2(c). Clearly, the follower selects r2 whenever σℓ(r1) 1

2, while, if σℓ(r1) 1

2, she chooses r1, providing the leader with a cost of 2 σℓ(r1) and 1 + σℓ(r1), respectively. Thus, any O/PSE prescribes the leader to play σℓsuch that σℓ(r1) = 1

2. Moreover, when the players costs are as in Figure 2(d), the follower selects r2 if σℓ(r1) 1

2, and r1 if σℓ(r1) 1

2, providing the leader with a cost of 2σℓ(r1) and 2 2σℓ(r1), respectively. As a result, any O/PSE of the game prescribes the leader to play σℓsuch that σℓ(r1) = 1

6 Conclusions and Future Works We analyzed Stackelberg games where the underlying structure is a congestion game, focusing on the case in which the players actions are singletons. We proved that, without further assumptions on the players action spaces and the resource cost functions, it is not possible to approximate in polynomial time the leader s cost in an OSE up to within a polynomial factor in the size of the game, unless P = NP.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)

Differently, when each player can select any resource and the cost functions are monotonic, an O/PSE can be computed efﬁciently, as there is always a leader s optimal pure strategy. In the future, we will study the computational complexity of ﬁnding an O/PSE in SSSCGs with arbitrary costs in order to establish whether the problem can be solved efﬁciently or not, in spite of the fact that the leader s optimal commitment may be a mixed strategy. Moreover, we will extend our results for SSCGs studying the complexity of ﬁnding a PSE, also considering the special case where cost functions are monotonic in the resource congestion and we will analyze other classes of CGs with different combinatorial structures.

References [Ackermann et al., 2008] Heiner Ackermann, Heiko R oglin, and Berthold V ocking. On the impact of combinatorial structure on congestion games. Journal of the ACM (JACM), 55(6):25, 2008. [Basilico et al., 2017] Nicola Basilico, Stefano Coniglio, and Nicola Gatti. Methods for ﬁnding leader-follower equilibria with multiple followers. Co RR, abs/1707.02174, 2017. [Coniglio et al., 2017] Stefano Coniglio, Nicola Gatti, and Alberto Marchesi. Pessimistic leader-follower equilibria with multiple followers. In IJCAI, pages 171 177, 2017. [Conitzer and Korzhyk, 2011] Vincent Conitzer and Dmytro Korzhyk. Commitment to correlated strategies. In AAAI, pages 632 637, 2011. [Conitzer and Sandholm, 2006] Vincent Conitzer and Tuomas Sandholm. Computing the optimal strategy to commit to. In Proceedings of the 7th ACM conference on Electronic commerce, pages 82 90, 2006. [De Nittis et al., 2018] Giuseppe De Nittis, Alberto Marchesi, and Nicola Gatti. Computing the optimal strategy to commit to in polymatrix games. In AAAI, pages 82 90, 2018. [Fotakis, 2010] Dimitris Fotakis. Stackelberg strategies for atomic congestion games. Theory of Computing Systems, 47(1):218 249, 2010. [Garey and Johnson, 1979] M. R Garey and D. S. Johnson. Computers and Intractability: A Guide to the Theory of NP-completeness. WH Freeman and Company, 1979. [Ieong et al., 2005] Samuel Ieong, Robert Mc Grew, Eugene Nudelman, Yoav Shoham, and Qixiang Sun. Fast and compact: A simple class of congestion games. In AAAI, pages 489 494, 2005. [Monderer and Shapley, 1996] Dov Monderer and Lloyd S Shapley. Potential games. Games and economic behavior, 14(1):124 143, 1996. [Rosenthal, 1973] Robert W Rosenthal. A class of games possessing pure-strategy nash equilibria. International Journal of Game Theory, 2(1):65 67, 1973. [Roughgarden, 2004] Tim Roughgarden. Stackelberg scheduling strategies. SIAM Journal on Computing, 33(2):332 350, 2004.

[Shoham and Leyton-Brown, 2008] Yoav Shoham and Kevin Leyton-Brown. Multiagent systems: Algorithmic, game-theoretic, and logical foundations. Cambridge University Press, 2008. [Tambe, 2011] Milind Tambe. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press, 2011. [Von Stengel and Zamir, 2010] Bernhard Von Stengel and Shmuel Zamir. Leadership games with convex strategy sets. Games and Economic Behavior, 69(2):446 457, 2010.

Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence (IJCAI-18)