# coalition_formation_in_multidefender_security_games__eff0a908.pdf

Coalition Formation in Multi-defender Security Games

Dolev Mutzari,1 Jiarui Gan,2 Sarit Kraus1

1Department of Computer Science, Bar-Ilan University 2Max Planck Institute for Software Systems dolevmu@gmail.com, jrgan@mpi-sws.org, sarit@cs.biu.ac.il

We study Stackelberg security game (SSG) with multiple defenders, where heterogeneous defenders need to allocate security resources to protect a set of targets against a strategic attacker. In such games, coordination and cooperation between the defenders can increase their ability to protect their assets, but the heterogeneous preferences of the selfinterested defenders often make such cooperation very difﬁcult. In this paper, we approach the problem from the perspective of cooperative game theory and study coalition formation among the defenders. Our main contribution is a number of algorithmic results for the computation problems that arise in this model. We provide a poly-time algorithm for computing a solution in the core of the game and show that all of the elements in the core are Pareto efﬁcient. We show that the problem of computing the entire core is NP-hard and then delve into a special setting where the size of a coalition is limited up to some threshold. We analyse the parameterized complexity of deciding if a coalition structure is in the core under this special setting, and provide a poly-time algorithm for computing successful deviation strategies for a given coalition.

Introduction The Multi-defender Stackelberg Security Game (SSG) is a model developed for studying real-world scenarios where multiple defenders protect a set of targets against a strategic attacker, who is their common enemy. The game is a generalization of the classic single-defender SSG, which was studied extensively in recent years in the multi-agent community (Paruchuri et al. 2008; Tambe 2011; Nguyen et al. 2013; Sinha et al. 2018; An and Tambe 2017). The augmentation of defenders makes the model fundamentally different from single-defender SSGs; understanding the relation between the defenders becomes the key to analysing the game. Though there is very limited research, some recent work proposed non-cooperative game models for multi-defender SSGs with corresponding equilibrium concepts for independent movement of the defenders (Lou and Vorobeychik 2015; Gan, Elkind, and Wooldridge 2018), which lay foundations for further development of research along this line. In this paper, we approach this model from the perspective of cooperative game theory, and aim at understanding formation of defender coalitions in the game.

Copyright c 2021, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

Indeed, collaboration in the form of coalition formation is a natural way of achieving efﬁcient defense in the real world. In multi-defender SSGs, defenders movements can be synchronized by collaboration, avoiding overspending of resources. Nevertheless, a recent attempt by Gan et al. (2020) to use mechanism design to promote defense collaboration in multi-defender SSGs shows that it s often not possible to achieve voluntary collaboration among all defenders in a multi-defender SSG, no matter what mechanism is used. The result is primarily due to the heterogeneous preferences of the defenders over candidate joint strategies to be carried out, which are hard to tame. Indeed, it would be very demanding to expect defenders with conﬂicting preferences to collaborate, but this does not mean that nothing can be done with those who share similar views or when external enforcement of cooperative behavior (e.g., via contracting) is possible. This motivates us to study multi-defender SSGs using cooperative game theory a powerful tool for analyzing how agents form stable groups to compete based on their collective payoffs. To ﬁt the multi-defender SSG model into the framework of cooperative game theory, we face two major challenges. First, SSG is a game with externalities, meaning that the payoff value of a coalition is dependent on the other coalitions. Second, the game is a non-transferable utility game (NTU), meaning that the defenders payoffs cannot be transferred between each other as monetary payoffs. These two properties introduce great difﬁculties in generalizing the model to allow for coalition formation. To begin with, we cannot deﬁne a utility value for a coalition that depends only on who are in the coalition; instead, we need to deﬁne a utility value for each defender separately, where a coalition structure is speciﬁed in advance. Furthermore, coalition s value depends on the attacker s strategy (which target they attack), who is a strategic player but will not be a participant in the coalition formation. Basically, the attacker will take the best response that maximizes her utility. However, in general there may be multiple targets that are equally optimal for the attacker but each of which results in different utilities for every defender, making it a challenge to deﬁne a defender s utility.

Our Contribution We make the following contributions in this paper. First, we present a multi-defender SSG model that allows coalition formation. The core of this game con-

The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

sists of all coalition structures in which no defender can beneﬁt from deviating from their current coalition in order to join another. Since it s a game with externalities, the strategies of the defenders that aren t deviating must be taken into account when considering successful deviations. We proposed a α-core deﬁnition, where these defenders join together to take revenge on the deviating defenders. We then study the problem of computing a coalition structure in the core. Our main result is that there is always a strategy that results in the grand coalition structure being in the core, and this strategy can be computed in polynomial time. In addition, all elements in the core are Pareto efﬁcient, i.e., a centralized mechanism that can enforce the agents to follow a given strategy that will give at least defender a higher utility while not making the other defenders worse off. On the other hand, the problem of computing the entire core is NP-hard even in the simple setting with only two defenders; we present a hardness result. We also present an efﬁcient algorithm to decide if a coalition structure is stable against a given deviating coalition. The algorithm can be used to decide the stability of a coalition structure against any possible deviating coalitions; we analyze the parameterized complexity of this problem and consider in particular a setting with a small number of defender types.

Related Work Multi-agent SSGs have only recently started to attract attention from the AI community. Lou and Vorobeychik (2015) ﬁrst proposed a non-cooperative game model and analyzed the Nash equilibrium and the price of anarchy of the game. More recently, Gan, Elkind, and Wooldridge (2018) studied a variant of the model that assumes that the defenders have common interests in protecting the targets and relaxed the constraint that each of them only protects a disjoint set of targets; they showed the existence of Nash equilibrium in their model. Based on this model, they further considered coordination between defenders and approached it by designing coordination mechanisms (Gan et al. 2020). Our model is largely based on these three pieces of work, though we focus on cooperative games. There are other papers on the applications of multi-defender SSGs or similar models (Jiang et al. 2013; Laszka, Lou, and Vorobeychik 2016; Castiglioni, Marchesi, and Gatti 2019), which are either conceptually very different from our model or are meant for very speciﬁc scenarios. To the best of our knowledge, our paper is the ﬁrst to study multi-defender SSGs from the perspective of cooperative game theory. As we ve mentioned, our model is an NTU game with externalities. There have been different approaches in the literature for studying coalition formation in such games. Yi (1996) studied how the sign of external effects affects the stability of coalition structures, and how to leverage this information to provide a useful organizing principle when examining coalition structures. Finus and Rundshagen (2003) derived a non-cooperative foundation of core-stability for positive externality NTU-games. Dunne et al. (2010), Chander (2010), and Rahwan et al. (2012) developed solution concepts for coalitional resource games and supplied some techniques for ﬁnding solutions under certain assumptions.

Skibski, Michalak, and Wooldridge (2018) offered a generalization of the stochastic Shapley value for coalitional games with externalities. Our work differs from previous work as our underlying model is an SSG that has a leaderfollower action sequence among the players. The attacker is a special player in the game, who is strategic but will never join any coalition. This introduces many challenges to modeling and analyzing the game as we will show in this paper.

Preliminaries: Multi-defender SSGs In a multi-defender SSG, there are n defenders 1, . . . , n, and a set T of m targets, which the defenders want to protect; there is an attacker who wants to attack the targets. For any integer n > 0, we write [n] = {1, . . . , n} in this paper. Each defender i [n] has ki N security resources that can be allocated to the targets to protect them. Using randomized allocation strategies, the resources of a defender can be distributed continuously, resulting in a vector x = (xt)t T , such that xt is the probability of target t being protected by some (at least one) resource. We call such vectors coverage (vectors); they will be crucial for deﬁning the players utilities, which we will do shortly. With ki resources, each defender i can use a coverage vector x Cki as their allocation strategy, where for any integer k > 0, it is deﬁned that Ck = {x [0, 1]m : P t T xt k}; provably, such vectors can be implemented by a distribution over deterministic allocation strategies, each using at most k resources. For example, a defender with two resources can use a coverage vector (0.6, 1, 0.4) C2 over three targets; this vector can be implemented by allocating the resources to the ﬁrst two targets w.p. 0.6, and to the last two targets w.p. 0.4. Suppose that each defender i uses a coverage vector xi = (xit)t T . When the defenders are uncoordinated, they act independently; hence, each target t T will be protected by some defender with probability ct = 1 Q i [n](1 xit),

i.e., Q i [n](1 xit) is the probability that t is not protected by any defender. We will refer to the vector c = (ct)t T as the overall coverage (vector). As in a standard SSG model, the attacker wants to attack the targets, and they can conduct surveillance on the defenders joint strategy beforehand and best respond to it. Suppose the attacker chooses to attack some target t T. Following the standard SSG model, the attack will be successful if no resource is allocated to protect t, which happens with probability 1 ct (we assume that the defenders resources are homogeneous). In this case, the attacker receives a reward value ra(t) and each defender i receives a penalty value pd i (t). On the other hand, the attack will fail if at least one resource is allocated to protect it, which happens with probability ct, and in which case the attacker receives a penalty pa(t) and each defender i receives a reward rd i (t). Thus, we can write the utility functions of each defender and the attacker as follows: U d i (c, t) = ct rd i (t) + (1 ct) pd i (t) (1) U a(c, t) = (1 ct) ra(t) + ct pa(t) (2)

Note that for every target t, it s assumed that rd i (t) > pd i (t) and ra(t) > pa(t), so all defenders prefer an unsuccessful

attack to a successful one and the attacker prefers the opposite. Thus, as a rational player, after knowing the defenders strategies and hence, the overall coverage c, the attacker will best respond by attacking a target in the set

BR(c) := arg maxt T U a(c, t)

that maximizes their expected utility in this situation. When there are multiple targets in BR(c), we follow the modeling approach by Gan, Elkind, and Wooldridge (2018) and assume that the defenders will adopt the pessimistic assumption about the attacker s tie-breaking behavior. Hence, we deﬁne bri(c) = arg mint BR(c) U d i (c, t);

that is, the target which defender i believes will be selected by the attacker.

Coalition Formation When the defenders are uncoordinated, independent movement inevitably causes inefﬁcient resource use. More specifically, if two defenders both protect a target with some positive probability, that results in a positive probability when both defenders appear on this target, which is not necessary since according to the SSG model, one resource is sufﬁcient for thwarting the attacker s attack. We study how the defenders can form coalitions to improve efﬁciency of resource use and gain advantages in the game. When a subset S [n] of defenders form a coalition, they can allocate their resources jointly and play as a single defender in the game, who has k S = P i S ki resources. For example, if two defenders move independently and each of them has one resource, the set of overall coverage vectors over three targets, which can be generated from their joint strategy, is c R3 : ct = 1 (1 x1t)(1 x2t) t T, x1, x2 C1

It is not hard to verify that this set is strictly contained in C2, which is the set of coverage vectors these two defenders can use if they coordinate their resource allocation, allocating resources jointly as if they are one defender. A coalition is called the grand coalition if it contains all the defenders. Therefore, coalition formation results in a partition of the set of defenders, which we refer to as a coalition set, denoted as P = {P1, . . . , Pl}, such that S l i=1 Pi = [n]. Each coalition Pi then chooses a joint strategy xi Ck Pi . Intuitively, the game now becomes a multi-defender SSG played by the coalitions. Similarly, the strategy proﬁle X = (x1, . . . , xl) of the coalitions now induces the following overall coverage for each target t:

covt(X) := 1 Q i [l] (1 xit) . (3)

Similarly, we write the overall coverage vector as cov(X) = (covt(X))t T . Given a strategy proﬁle X and an attacker response t, the players utilities are deﬁned in the same way as in (1) and (2). Since a strategy proﬁle X deﬁnes a unique coverage, for notational simplicity, we will sometimes write X instead of cov(X) in places where a coverage vector is expected, e.g., we write U d i (X, t) = U d i (cov(X), t) and

BR(X) = BR(cov(X)). We call a coalition set equipped with a strategy proﬁle a coalition structure, and denote it by CS = P, X = {(P1, x1), . . . , (Pl, xl)}. Deﬁnition 0.1 (Coalition structure). A coalition structure CS is a coalition set P = (P1, . . . , Pl) (i.e., a partition of [n]) equipped with a strategy proﬁle X = (x1, . . . , xl), such that xi Ck Pi for each i = 1, . . . , l. We denote CS = P, X = { P1, x1 , . . . , Pl, xl }. Similarly, since CS corresponds to a unique overall coverage vector, we will also view the players utilities and best response as functions of CS. More concisely, we let

U d i (CS) := U d i (CS, bri(CS)) be the utility a coalition structure CS yields for defender i. We remark that our game is an NTU game with externalities, which is different to many other coalitional games where a value function v : 2n R 0 determines the core uniquely under a relatively reasonable set of axioms. We deﬁne the core concept for our game next.

Core of Coalition Structures Given a coalition structure CS, a subset of defenders can choose to deviate from the coalitions in CS and form a new coalition to improve their utilities. We deﬁne a deviation as a subset of defenders D [n] and a joint strategy x Ck D of them, which will be played after the deviation. Deﬁnition 0.2 (Deviation). A deviation D, x from a coalition structure CS is a nonempty subset of defenders D [n] and a joint strategy x Ck D of these defenders, which will be played after the deviation. To further describe the outcome after a deviation and evaluate the beneﬁt of making that deviation, we need to specify how the other defenders will react to a deviation (again, this is because the externalities in our game, so the utilities generated by a deviation do not depend on the deviating coalition alone). We consider the concept of α-core which assumes that the other defenders will take revenge against any deviation to protect the status quo (Moulin and Peleg 1982; Chalkiadakis, Elkind, and Wooldridge 2011). More concretely, we assume that after a deviation we expect to see a deviators coalition D and a revengers coalition R = [n] \ D formed by the other defenders. The revenger coalition aims at making at least one deviator worse off, so as to prevent the deviation from happening. A deviation cannot be successful if the revengers can indeed achieve this. Deﬁnition 0.3 (ϵ-successful deviation). For any ϵ > 0, a deviation D, x from a coalition structure CS is ϵsuccessful if every defender i D gets a utility improvement of at least ϵ no matter what strategy the revengers coalition R = [n] \ D uses, i.e., for all i D and all y Ck R, it holds that

U d i (CS ) U d i (CS) + ϵ, where CS = { D, x , R, y }. In other words, a defender does not cooperate in a deviation unless she gets a strict utility improvement over the previous coalition structure. We also call a 0-successful deviation a successful deviation.

1/(ra(t) pa(t))

attacker utility

Figure 1: The tank-ﬁlling visualization (Gan, Elkind, and Wooldridge 2018).

Deﬁnition 0.4 (ϵ-core). A coalition structure CS is in the ϵ-core if there exists no ϵ-successful deviation from it. In fact, the above deﬁnition still leaves one modeling issue for us to deal with. We assume that the utility of each defender i is determined by bri, but under this assumption, a defender can always reduce the protection to some target to make the attacker strictly prefer attacking that target, thus avoiding the worst target bri to be chosen. Thus, as long as BR is not a singleton, the coalition structure will normally not be stable. To deal with this modeling issue, we view the core as a limit point in a similar vein to the 0+-NSE deﬁned by Gan, Elkind, and Wooldridge (2018) and introduce the following deﬁnition. Deﬁnition 0.5 (0+-core). A coalition structure CS is in the 0+-core if there exists a sequence of coalition structures CSℓ

ℓ=1 and a sequence of real numbers (ϵℓ) ℓ=1, such

that every CSℓis in the ϵℓ-Core, limℓ CSℓ= CS and limℓ ϵℓ= 0.

Non-Emptyness and Efﬁciency of 0+-Core In this section, we will show that the 0+-core is nonempty and it provides efﬁciency guarantees. To present our results, we will use the tank-ﬁlling model introduced by Gan, Elkind, and Wooldridge (2018). We introduce the following notions. Deﬁnition 0.6 (Height of a coverage). The height of a coverage vector c Rm, denoted height(c), is the optimal attacker utility it yields, i.e., height(c) := maxj U a(c, j).

Deﬁnition 0.7 (Level coverage). A coverage vector c is said to be level, if ct = 0 for all t BR(c). A strategy proﬁle X is level if cov(X) is level. The tank-ﬁlling model is illustrated in Figure 1. Brieﬂy speaking, each tank in the ﬁgure corresponds to a target t T, and the amount of water in it represents the coverage of that target. The width of tank t is exactly 1 ra(t) pa(t), so the height of the water surface as indicated on the left axis represents exactly the attacker s utility of attacking target t. The intuition of this model is that in a stable state, the water in the tanks must form a level across all of the tanks and the corresponding coverage vector is level as deﬁned in Deﬁnition 0.7; indeed, if the coverage is not level, some defender would rather shift resources from overly protected targets to the least protected ones (which will be attacked by the attacker) to increase their utility.

For simplicity, in the remainder of this paper, we will focus only on canonical games, where the maximal level coverage exists (see Deﬁnitions 0.8 and 0.9). Intuitively, if a game is not a canonical game, by putting k[n] units of water in the tank-ﬁlling model, the water surface will reach the top of some tank (i.e., ˇu as shown in Figure 1). This introduces additional complexities for proving results to be presented while in practice we expect non-canonical games to be rare as resources are usually insufﬁcient to allow a target in the attacker s best response set to be fully covered. All our results can also be extended to non-canonical games by using an approach similar to the one used by Gan, Elkind, and Wooldridge (2018) to show the existence of equilibrium in the uncoordinated situation.

Deﬁnition 0.8 (Maximal level coverage). A level coverage c is called the maximal level coverage if P t T c = k[n].

Deﬁnition 0.9 (Canonical game). A game is called a canonical game if there exists a maximal level coverage c.

Non-Emptiness of 0+-Core

We ﬁrst present a characterization of coalition structures in the 0+-core in the lemma below. The proof of this lemma is omitted due to space limitations.1

Lemma 0.10. Suppose that a coalition structure CS = P, X results in a coverage vector c. If the game is canonical, then CS is in the 0+-core if and only if it satisﬁes the following two conditions:

i. ct = ct for all t T, where c is the maximal level coverage. ii. There exists t such that ct > 0 and for every deviation D, x , it holds that U d i ({ D, x , R, y }) U d i (c, t ) for some i [n] and y Ck R.

Intuitively, the ﬁrst condition is due to the fact that if the coverage is not level, then some defender(s) could always re-balance the coverage to improve the coverage of targets in the attacker s best response set; this will improve the defender s utility accordingly. In the second condition, the target t corresponds to the limit point of the attacker s best response in the sequence of coalition structure that deﬁnes CS; this condition can help simplify our analysis about the limit point in the deﬁnition of the 0+-core (Deﬁnition 0.5), so that we only need to ﬁnd a target t in order to show that a coalition structure is in the 0+-core. Given Lemma 0.10, our approach to proving the nonemptyness of the 0+-core is to construct a coalition structure which satisﬁes the conditions in this lemma. The way we construct this coalition is inspired by the approach of Gan, Elkind, and Wooldridge (2018) for constructing a Nash-like equilibrium; we present it in Algorithm 1. The difference is that here we let the defenders form a grand coalition, so they can allocate their resources as if they are one defender; this results in the coverage contributed by each defender to be additive, instead of sub-additive as in the situation where they move independently.

1Omitted proofs can be found in the full version of this paper.

Algorithm 1: Construct a coalition structure in the 0+-core.

Input : a canonical game instance; Output: CS = { [n], c }, t , and z1, . . . , zn.

1. Let zi = (0, . . . , 0) Rm for each i [n]. Let cj := P i [n] zij for all j T throughout. Let c be the maximal coverage.

2. For each defender i = 1, . . . , n:

Sort targets in T, so that U d i (c, t1) U d i (c, t2) U d i (c, tm).

For each j = t1, . . . , tm: Increase zij until cj = cj, or P j T zij = ki. If zij > 0, let t = j.

In Algorithm 1, each defender is invited to distribute resources; the goal is for the resulting coverage to reach the maximal level coverage c. The defenders come one by one and distribute their resources according to the order determined by the values U a i (c, t); once the coverage of a target t is improved to ct, they move to the next target and repeat until their resources are used up (i.e., when P j T zij = ki). Hence, zij indicates the amount of resources defender i contributes to target j; these variables and t are set to help us verify the second condition of Lemma 0.10 in the proof of the next theorem, which shows that the 0+-core is nonempty. Theorem 0.11. The 0+-core is non-empty and it always contains a grand coalition structure.

Proof. We show that the grand coalition structure CS generated by Algorithm 1 satisﬁes the two conditions in the statement of Lemma 0.10, so it is in the 0+-core. Note that Algorithm 1 always ends with c = c. Indeed, suppose that this is not the case, according to the way c is updated in the algorithm we always have cj cj for all j, so the only possibility when we have c = c is that cℓ< cℓ for some target ℓ. This implies that P j T cj < P j T cj = k[n]

given that c is the maximal level coverage. Hence, P i [n],j T zij = P j T cj < k[n],

which means that P

j T zij < ki for some defender i. This is a contradiction because we would expect cℓ= cℓwhen we increase ziℓin Step 2. Thus, CS satisﬁes Condition (i) of Lemma 0.10. It remains to show that the second condition is also satisﬁed. We let t be the target in the output of the algorithm. Indeed, we have ct > 0 as it is always points to a target that receives a positive improvement in its coverage as in Step 2. Pick arbitrary D [n] and deviating strategy x k D. We show that the strategy y with

i R zij for all j T

will make the inequality in Condition (ii) hold. Note that according to Step 2, we have P j T zij ki for all j, so it follows that P j T yj = P i R P j T zij k R and indeed y Ck R. Let CS = { D, x , R, y } and c be the coverage resulting from CS , i.e., c j = 1 (1 xt)(1 yt). If c = c, then we have U d i (CS ) = min t BR(c) U d i (c, t) U d i (c, t ),

where we use the fact that t BR(c) given that c = c is a level coverage and ct > 0 (see Deﬁnition 0.7). Hence, Condition (ii) holds. In what follows we assume that c t = ct for some t. We must have c ℓ< cℓfor some ℓ T: otherwise, c t ct for all t while at least one of these inequalities must be strictly satisﬁed, which leads to the following contradiction:

t T (1 (1 xt)(1 yt)) X

t T (xt + yt) k[n].

Pick an arbitrary ℓ BR(c ). It must be that c ℓ < cℓ : otherwise, c ℓ cℓ , so we can establish the following inequalities, which contradicts the fact that ℓ BR(c ): U a(c , ℓ ) U a(c, ℓ ) U a(c, ℓ) < U a(c , ℓ), where the ﬁrst and third transitions follow by monotonicity of the utility function; and the second transition is due to the fact that ℓ BR(c) (as c is level and cℓ> 0). Observe that for all j with P i D zij = 0, we have

c j yj = P i R zij = P i [n] zij = ct.

Thus, we must have P i D ziℓ > 0, which means ziℓ > 0 for some i D. This implies that U d i (c, ℓ ) U d i (c, t ): if the opposite holds, it must be that when ziℓ is set to a positive value at Step 2 of Algorithm 1, we already have ct = ct ; this contradicts the fact that t is the last target that reaches coverage c. Consequently, we have

U d i (CS ) = min t BR(c ) U d i (c , t)

U d i (c , ℓ ) < U d i (c, ℓ ) U d i (c, t ) = U d i (c, t ) (recall that c ℓ < cℓ = cℓ ) so Condition (ii) holds, too.

Nevertheless, according to the following theorem, the problem becomes hard if we exclude the grand coalition from our consideration. In other words, to compute the entire core is computationally hard. Theorem 0.12. It is NP-hard to decide if the 0+-core contains any coalition structure CS = P, X where P does not contain the grand coalition even when n = 2.

Efﬁciency of 0+-Core It turns out that the 0+-core also guarantees some efﬁciency properties given that the coverage is always maximized by Lemma 0.10. The following result says that a coalition structure in the 0+-core is weakly Pareto efﬁcient, meaning that the situation cannot be strictly improved for every defender in any other coalition structure.

Proposition 0.13. For every coalition structure sequence CSℓ

ℓ=1 associated with a CS in the 0+-core, there ex-

ists no coalition structure CS such that for every defender i [n], it holds that U d i (CS ) > limℓ U d i (CSℓ).

Indeed, a stronger notion of the 0+-core, call it strict 0+- core, can be deﬁned by using a weaker notion of ϵ-successful deviation, which only requires some (instead of all) deviator to have an ϵ advantage as long as the other deviators will not be hurt by the deviation (i.e., the others have advantages of at least 0). The strict 0+-core may be empty (an example is available in the appendix). However, when it is not empty, any coalition structure in it is (strongly) Pareto efﬁcient. In other words, given a coalition structure in the strict 0+-core, one cannot hope that, by using a centralized mechanism that enforces the defenders to follow a given strategy, we can improve the utility of some defender without hurting the others. We present this result below.

Proposition 0.14. Every coalition structure in the strict 0+- core is Pareto efﬁcient. For every coalition structure se-

ℓ=1 associated with a CS in the strict 0+-

core, there exists no coalition structure CS and defender i such that U d i (CS ) > limℓ U d i (CSℓ) and for every other defender j [n] \ {i} it holds that U d j (CS ) limℓ U d j (CSℓ).

Stability of a Coalition Structure

We have shown that the grand coalition is always in the 0+- core. In this section, we investigate the problem of deciding the stability of a given coalition structure; namely, whether it is in the 0+-core. Our main result in this section is an efﬁcient algorithm for deciding the stability of a coalition structure against a given deviating coalition D. To present this result, we ﬁrst deﬁne the following useful notions.

Deﬁnition 0.15 (ϵ-safety demand). The ϵ-safety demand (ϵ > 0) of a defender i [n] with respect to a coalition structure CS is a vector sϵ i = (sϵ it)t T [0, 1]m, such that

sϵ it = max 0, U d i (CS) + ϵ pd i (t) rd i (t) pd i (t)

The safety demand of a set D [n] of defenders with respect to CS is a vector sϵ = (sϵ t)t T [0, 1]m such that

sϵ t = max i D sϵ it.

Deﬁnition 0.16 (ϵ-safe deviation strategy and safety value). With respect to CS, an ϵ-safe (ϵ > 0) deviation strategy x for a coalition D is the solution to the following optimization:

min x Ck D max t T :x t<sϵ t U a(x , t), (5)

where s is the safety demand of D and when {t T : x t < sϵ t} = we deﬁne maxt T :x t<sϵ t U a(x , t) = . We call the optimal value of (5) the ϵ-safety value of D.

Algorithm 2: Solve (5).

Input : CS, D; Output: x, v. 1. Compute the ϵ-safety demand s of coalition D with respect to CS according to (4). Sort targets in T, so U a(s, t1) U a(s, t2) U a(s, tm); 2. For each j = 0, 1, . . . , m: Let T + = {tℓ T : ℓ j} and T = T \ T +; Solve the following LP (linear program):

subject to v U a(xt, t) for all t T

0 xt 1 for all t T

sϵ t xt 1 for all t T + P t T xt q

3. Find the LP in Step 2 with the smallest optimal objective value.2 Output the optimal solution x, v.

In other words, Deﬁnition 0.15 ensures that a target t being attacked does not remove the deviating coalition s incentive to deviate as long as the coverage of t is not below the safety demand. As for targets that receive coverage below the safety demand, the deviating coalition can only hope that they are not the best response of the attacker. Hence, the idea behind (5) is to minimize the attacker s utility for attacking these targets. In what follows, for any strategy x, we deﬁne T + x = {t T : xt sϵ t} and T x = {t T : xt < sϵ t}. The following lemmas provide useful characterizations for ϵ-safe deviation strategies and ϵ-successful strategies. Using these lemmas, we prove our main result Theorem 0.19.

Lemma 0.17. Let x be an ϵ-safe deviation strategy of a coalition D [n] and v be the ϵ-safety value. If v > , then it holds that:

i. v = U a(x, t1) U a(x, t2), for any t1 T x , t2 T + x ; ii. xt = sϵ t for all t T + x .

Lemma 0.18. Let x be an ϵ-safe deviation strategy of a coalition D [n] with respect to CS, and v be the ϵsafety value. Let γt = ra(t) v ra(t) pa(t) for each t T (so that U a(γt, t) = v). Then D has an ϵ-successful deviation from CS if and only if one of the following conditions is true:

v = ; γt > 1 for some t T + x ; or

t T + x γt sϵ t 1 sϵ t > k R, where R = [n] \ D is the revergers coalition.

Theorem 0.19. There is a polynomial-time algorithm to decide whether a coalition D [n] has an ϵ-successful deviation strategy from a coalition structure CS.

2We let the optimal objective value be when the program is infeasible, and when it is unbounded.

Proof. Using Lemma 0.18, our approach to deciding the stability of a coalition structure against a coalition D is to compute an ϵ-safe deviation strategy and check whether any of the conditions in Lemma 0.18 holds. Indeed, given x and v, to verify the conditions is trivial, so the key is to solve the optimization problem deﬁned in (5). To this end, we present Algorithm 2 (which is obviously a polynomial-time algorithm) and show that it outputs an optimal solution to (5) correctly. To distinguish, let x be an optimal solution to (5) and v = maxt T :x t<sϵ t U a(x , t) be the corresponding objective value. For the output x and v of Algorithm 2, the constraints of the LP in Step 2 of the algorithm ensures that T x T , so we have

v = max t T U a(x, t) max t T :x t<sϵ t U a(x, t) v . (6)

On the other hand, observe that the LP is equivalent to: minx Ck D maxt T U a(x , t). Hence, as long as in some round it holds that T = T x , we will have

v = min x Ck D max t T U a(x , t)

max t T U a(x , t) = max t T x U a(x , t) = v

and hence, v = v , which will then complete the proof. We show next that indeed T = T x in some round. If T x = or T + x = , then in round j = m or j = 0, we have T = T x . Otherwise, T x = implies that v > , so Lemma 0.17 is applicable, according to which the following inequality holds for any a T x and b T + x :

U a(s, a) < U a(x , a) U a(x, b) = U a(s, b).

Since t1, . . . , tm are ordered according to the value U a(s, t), the above inequality implies that there exists j [m 1] such that T + x = {t1, . . . , tj} and T x = {tj+1, . . . , tm}. In round j, we have exactly T = T x .

We have presented an efﬁcient way to decide the stability of a coalition structure against a given deviating coalition. Generally speaking, if a deviating coalition is not provided, we do not know yet how to deal with this task; we leave this problem open in this paper. Despite this unanswered problem, we will next discuss a special setting that allow us to adapt the above results to decide the stability of a coalition structure against any possible deviating coalition. Trivially, now that we know how to decide if a given coalition has a successful deviating strategy, when there is only a small number of defenders we can enumerate all 2n 1 coalitions to check the stability. Similarly, if the size of coalitions the defenders can form is bounded by a small number, the same approach applies. A slightly more complex realistic setting is one in which the number of defenders is unbounded but the number of possible defender types is small. Such games were studied by Shrot, Aumann, and Kraus (2010). In our model, two defenders have the same type if they have the same payoff parameters, i.e., pd i (t) = pd j(t) and rd i (t) = rd j (t) for every t T. Theorem 0.20 shows

that if the number of types is ﬁxed, we can decide the stability of a coalition structure in polynomial time. The key idea for proving this result is to show that it sufﬁces to enumerate all the possible combinations of defender types.

Theorem 0.20. There is an algorithm for deciding whether a coalition structure is in the ϵ-core or not that runs in time O(2λ poly(m, n)), where λ is the number of defender types.

We provide a model of a coalitional multi-defender security game that enables defenders to form coalitions and distribute their resources jointly. We focus on the solution concept of the core. We prove that the 0+-core is none-empty, give a polynomial-time algorithm for computing a grand coalition structure in the 0+-core, and show that it is NP-hard to compute the entire core even when there is a ﬁxed number of defenders. We also discuss the stability of a coalition structure against a given deviating coalition and the parameterized complexity of validating that a coalition structure is in the core. There are other two possible approaches that could have been considered, namely, coordination mechanism and negotiation. However, in coordination mechanism, we assume the existence of a reliable third party, and that all of the defenders report their properties honestly to that party and that they all accept the resultant decision. None of these assumptions are required in our coalition formation framework. Moreover, even with the strong assumptions that coordination mechanism requires, it cannot yield a better solution for the parties than our stable grand coalition structure. As for negotiation, a challenge to deal with is when a defender opts out of negotiation. One can consider frameworks in which the defenders that reach an agreement are able to punish the defenders that opted out. Conversely, there are games where the opting out defenders can take advantage of the agreed upon strategies of the other defenders. In order for this approach to work, it has to be efﬁcient, to have a utility guarantee and for the complexity of ﬁnding the equilibrium negotiation strategies to be low. The idea of a coalitional Stackelberg game can evolve in many directions. It is interesting to consider settings where the game does not have complete information, that is, the attacker may not be aware of the defenders allocation strategies. Furthermore, the dependencies between targets may be considered. Protecting one target may also protect other targets close to it. Moreover, a more relaxed deﬁnition for the core should be considered, such as γ-core. We assumed that after deviation the rest of the defenders take revenge, however this may, in general, not be a credible threat.

Acknowledgments

This work has been supported in part by the Israel Science Foundation under grant 1958/20 and the EU project TAILOR under Grant 992215. Jiarui Gan was supported by the European Research Council (ERC) under the European Union s Horizon 2020 research and innovation programme (grant agreement No. 945719).

References An, B.; and Tambe, M. 2017. Stackelberg Security Games (SSG) Basics and Application Overview, 485 507. Cambridge University Press.

Castiglioni, M.; Marchesi, A.; and Gatti, N. 2019. Be a Leader or Become a Follower: The Strategy to Commit to with Multiple Leaders. In IJCAI, 123 129.

Chalkiadakis, G.; Elkind, E.; and Wooldridge, M. 2011. Computational aspects of cooperative game theory. Synthesis Lectures on Artiﬁcial Intelligence and Machine Learning 5(6): 1 168.

Chander, P. 2010. Cores of games with positive externalities. CORE DP 4: 2010.

Dunne, P. E.; Kraus, S.; Manisterski, E.; and Wooldridge, M. 2010. Solving coalitional resource games. Artiﬁcial Intelligence 174(1): 20 50.

Finus, M.; and Rundshagen, B. 2003. A Non-Cooperative Foundation of Core-Stability in Positive Externality NTUCoalition Games. FEEM .

Gan, J.; Elkind, E.; Kraus, S.; and Wooldridge, M. 2020. Mechanism Design for Defense Coordination in Security Games. In Proceedings of the 19th International Conference on Autonomous Agents and Multi Agent Systems (AAMAS 20), 402 410.

Gan, J.; Elkind, E.; and Wooldridge, M. 2018. Stackelberg security games with multiple uncoordinated defenders. In Proceedings of the 17th International Conference on Autonomous Agents and Multi Agent Systems, 703 711. International Foundation for Autonomous Agents and Multiagent Systems.

Jiang, A. X.; Procaccia, A. D.; Qian, Y.; Shah, N.; and Tambe, M. 2013. Defender (mis) coordination in security games. In Proceedings of the 23rd International Joint Conference on Artiﬁcial Intelligence (IJCAI 13), 220 226.

Laszka, A.; Lou, J.; and Vorobeychik, Y. 2016. Multi Defender Strategic Filtering Against Spear-Phishing Attacks. In AAAI, 537 543. Citeseer.

Lou, J.; and Vorobeychik, Y. 2015. Equilibrium Analysis of Multi-Defender Security Games. In IJCAI, 596 602.

Moulin, H.; and Peleg, B. 1982. Cores of effectivity functions and implementation theory. Journal of Mathematical Economics 10(1): 115 145.

Nguyen, T. H.; Yang, R.; Azaria, A.; Kraus, S.; and Tambe, M. 2013. Analyzing the Effectiveness of Adversary Modeling in Security Games. In AAAI.

Paruchuri, P.; Pearce, J. P.; Marecki, J.; Tambe, M.; Ordonez, F.; and Kraus, S. 2008. Efﬁcient Algorithms to Solve Bayesian Stackelberg Games for Security Applications. In AAAI, 1559 1562.

Rahwan, T.; Michalak, T.; Wooldridge, M.; and Jennings, N. R. 2012. Anytime coalition structure generation in multiagent systems with positive or negative externalities. Artiﬁcial Intelligence 186: 95 122.

Shrot, T.; Aumann, Y.; and Kraus, S. 2010. On agent types in coalition formation problems. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1-Volume 1, 757 764. Sinha, A.; Fang, F.; An, B.; Kiekintveld, C.; and Tambe, M. 2018. Stackelberg security games: Looking beyond a decade of success. In Proceedings of the Twenty-Seventh International Joint Conference on Artiﬁcial Intelligence. IJCAI. Skibski, O.; Michalak, T. P.; and Wooldridge, M. 2018. The Stochastic Shapley Value for coalitional games with externalities. Games and Economic Behavior 108: 65 80. Tambe, M. 2011. Security and game theory: algorithms, deployed systems, lessons learned. Cambridge University Press. Yi, S.-S. 1996. Stable coalition structures with externalities. Available at SSRN 2071 .