# Security Games on a Plane

Jiarui Gan,¹ Bo An,¹ Yevgeniy Vorobeychik,² Brian Gauch²
¹School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798
²Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37235
¹{jrgan, boan}@ntu.edu.sg, ²{yevgeniy.vorobeychik, brian.gauch}@vanderbilt.edu

Abstract

Most existing models of Stackelberg security games ignore the underlying topology of the space in which targets and defense resources are located. As a result, allocation of resources is restricted to a discrete collection of exogenously defined targets. However, in many practical security settings, defense resources can be located on a continuous plane. Better defense solutions could therefore potentially be achieved by placing resources in a space outside of actual targets (e.g., between targets). To address this limitation, we propose a model called Security Game on a Plane (SGP) in which targets are distributed on a 2-dimensional plane, and security resources, to be allocated on the same plane, protect targets within a certain effective distance. We investigate the algorithmic aspects of SGP. We find that computing a strong Stackelberg equilibrium of an SGP is NP-hard even for zero-sum games, and these are inapproximable in general. On the positive side, we find an exact solution technique for general SGPs based on an existing approach, and develop a PTAS (polynomial-time approximation scheme) for zero-sum SGP to more fundamentally overcome the computational obstacle. Our experiments demonstrate the value of considering SGP and the effectiveness of our algorithms.

1 Introduction

Security games have attracted much research attention and have been widely adopted to assist with real security tasks in recent years (Tambe 2011). Much of the research has focused on Stackelberg games, played between a defender and an attacker, in which the attacker is assumed to know the defender's strategy when choosing his own (e.g., Conitzer and Sandholm 2006; Kiekintveld et al. 2009; Basilico, Gatti, and Amigoni 2009; An et al. 2011; Vorobeychik et al. 2014). The goal, from the defender's perspective, is to find a strategy (a probability distribution over resource allocations) that maximizes her utility.

A limitation of existing models of Stackelberg security games is that they ignore the underlying topology of the space in which targets and defense resources are located. As a result, allocation of resources is restricted to a discrete collection of exogenously defined targets. However, in many practical settings, resources can be located on a plane. These include security tasks in large open areas, such as protection of ships or natural resources over a sea area, or aerial monitoring by airplanes or drones. Better defense solutions could potentially be achieved by placing resources in a space outside of actual targets (e.g., between targets). A natural example is a simple game on a plane with one resource capable of covering a circular area of radius 1 and two identical targets at a distance of 1.5 from each other. To protect both targets simultaneously, we need to move the resource away from the targets to a point between them.
If we only consider the targets as candidate locations for resource allocation, we lose half of the coverage and as much of the utility (this can be even worse: consider five targets distributed uniformly along a ring of radius 1). With this motivating example in mind, we aim to extend existing work with an assumed topology of the game space. Noting that most security scenarios are fundamentally planar, we develop a novel model called Security Game on a Plane (SGP), in which targets are distributed on a 2-dimensional plane, and resources, to be allocated on the same plane, protect targets within a certain effective distance.

We investigate the algorithmic aspects of SGP. We find that computing a Strong Stackelberg Equilibrium (SSE) of an SGP is NP-hard even for zero-sum games. Despite the negative results, we provide an exact solution which discretizes the continuous space into regions of equivalent effects and, based on an existing approach, uses column generation to tackle the scalability issue. To more fundamentally overcome the complexity barrier, we investigate approximation approaches. We develop a polynomial-time approximation scheme (PTAS) for zero-sum SGP, so that a solution within any given factor of optimal can be computed in polynomial time. Notably, unlike approximation schemes used in many existing works, which are designed only for certain sub-procedures, such as slave problems in the column generation framework or defender/attacker oracles in the double oracle framework (e.g., Jain, Conitzer, and Tambe 2013; Gan, An, and Vorobeychik 2015; Wang, Yin, and An 2016), the PTAS we propose approximates the entire problem. For general-sum SGP, we show that it is generally hard to approximate due to the inherent lack of robustness of the SSE solution concept. To work around this issue, we explore several realistic restrictions of the problem and find that, under these restrictions, solutions with guaranteed quality can still be efficiently computed. Finally, through experimental evaluations, we demonstrate the value of considering SGP compared to existing approaches, as well as the effectiveness of the proposed algorithms.

2 Problem Formulation

An SGP is played between a defender and an attacker. The defender places $m$ identical security resources on a 2-dimensional plane to protect a set of targets $[n] = \{1, \dots, n\}$, where target $i \in [n]$ is located at $(u_i, v_i)$. The attacker chooses a target in $[n]$ to attack. A target is said to be protected (or covered) if it is within a certain distance of at least one resource, and unprotected (or uncovered) otherwise. We consider Euclidean distance, so that the protection area of a resource is a disk.¹ If the attacker attacks some target $i \in [n]$ which is protected, the defender receives a reward $R^d_i$ and the attacker receives a penalty $P^a_i$. If instead $i$ is unprotected, the payoffs for the defender and the attacker are $P^d_i$ and $R^a_i$, respectively. We assume $R^\phi_i > P^\phi_i$ for all $i \in [n]$ and $\phi \in \{d, a\}$; namely, protecting a target is strictly preferred by the defender but disliked by the attacker. Without loss of generality, we normalize all payoff parameters to be in $[0, 1]$, such that after normalization $\max_{i,\phi} R^\phi_i = 1$ and $\min_{i,\phi} P^\phi_i = 0$. When $R^d_i + P^a_i = 1$ and $P^d_i + R^a_i = 1$ for all $i \in [n]$, the game is said to be zero-sum. We denote by $s = \langle (x_j, y_j) \rangle_{j=1}^{m}$ an allocation of resources on the plane, with $(x_j, y_j)$ being the coordinate of the $j$th resource.

¹ Our results extend to rectangular protection areas trivially.
Equivalently, we also view $s$ as a set of $m$ points. We denote by $c = \langle c_i \rangle$ a coverage vector describing the protection of each target, with $c_i = 1$ (or $0$) representing that target $i$ is protected (or unprotected); and we let $c(s) = \langle c_i(s) \rangle$ be the function mapping a pure strategy to the coverage vector it yields, such that

$$c_i(s) = \begin{cases} 1, & \text{if } \exists\, j \in [m]:\ (u_i, v_i) \in D(x_j, y_j) \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

where $D(x, y)$ denotes a disk centered at $(x, y)$. Without loss of generality, we focus on disks of diameter 1 throughout the paper. Given the above definitions, the utilities of the defender and the attacker are respectively captured by:

$$U^d(c, i) = c_i \cdot R^d_i + (1 - c_i) \cdot P^d_i, \qquad (2a)$$
$$U^a(c, i) = c_i \cdot P^a_i + (1 - c_i) \cdot R^a_i. \qquad (2b)$$

Following the standard model of Stackelberg security games, the defender plays a mixed strategy $\mathbf{p} = \langle p_s \rangle$ in which each pure strategy, i.e., an allocation $s$, is chosen with probability $p_s$; whereas the attacker plays a pure strategy, as doing so is sufficient for him to achieve optimality under the leader-follower structure of Stackelberg games. We remark that though resources are to be located on a continuous space, resulting in infinitely many pure strategies, we do not need to consider all of them, as those covering the same set of targets are equivalent. The defender's pure strategy space can be defined as $S = \{\zeta(c(s)) \mid s \in \mathbb{R}^{m \times 2}\}$, where $\zeta(\cdot)$ can be any function that gives a pure strategy offering the given coverage. $S$ is finite since $c(s) \in \{0, 1\}^n$. Consequently, there always exists an optimal mixed strategy commitment with a finite support set (i.e., the set of pure strategies with non-zero probability). To define the players' utilities for a mixed strategy $\mathbf{p}$, we first generalize the coverage function as

$$c(\mathbf{p}) = \sum_{s \in S} p_s \cdot c(s). \qquad (3)$$

It follows that Eq. (2), which calculates the players' utilities when coverage is a function of pure strategies $c(s)$, also applies to mixed strategies $c(\mathbf{p})$, since $U^\phi(c(\mathbf{p}), i) = \sum_{s \in S} p_s \cdot U^\phi(c(s), i)$ for $\phi \in \{d, a\}$.

Strong Stackelberg Equilibrium (SSE)

Our goal is to compute the SSE of SGP. In an SSE, the defender chooses the optimal strategy accounting for the attacker's best response to this strategy, under the assumption that the attacker breaks ties in favor of the defender (Von Stengel and Zamir 2004). Formally, $\langle \mathbf{p}^*, i^* \rangle$ forms an SSE iff $c(\mathbf{p}^*) = \arg\max_{c \in \mathcal{C}} U^d(c, f(c))$ and $i^* = f(c(\mathbf{p}^*))$, where $\mathcal{C} = \{c(\mathbf{p}) \mid \mathbf{p} \ge 0,\ \mathbf{1}^T \mathbf{p} = 1\}$ is the set of feasible coverage vectors, and $f(c) = \arg\max_{i \in F(c)} U^d(c, i)$, with $F(c) = \arg\max_{i} U^a(c, i)$, is the attacker's best response to $c$. Throughout this paper, we also refer to the problem of computing the SSE of an SGP simply as SGP. Alternatively, we also view SGP as an optimization problem in which the defender's utility is maximized over the mixed strategy space.

A Linear Program (LP) Formulation

An LP formulation can be used to compute the SSE, which enumerates the attacker's responses $i^* \in [n]$ with $n$ LPs. For each $i^*$, we compute with the following LP the optimal defender strategy under the restriction that the attacker's best response is to attack target $i^*$:

$$\max_{\mathbf{p}} \quad U^d(c(\mathbf{p}), i^*) \qquad (4a)$$
$$\text{s.t.} \quad U^a(c(\mathbf{p}), i^*) \ge U^a(c(\mathbf{p}), i) \quad \forall i \in [n] \qquad (4b)$$
$$\sum_{s \in S} p_s = 1 \qquad (4c)$$
$$p_s \ge 0 \quad \forall s \in S \qquad (4d)$$

where Eq. (4b) guarantees that the attacker is indeed incentivized to attack $i^*$. The LP with the highest optimal value yields an SSE of the game. It remains to specify $S$ to complete the construction of the LP formulation. We present how this can be done as part of a solution algorithm next.
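To make the formulation concrete, the following is a minimal Python sketch (not part of the paper) of Eqs. (1)–(3) and (2a)–(2b): it computes the coverage vector of a pure strategy, mixes coverage over a support set, and evaluates both players' utilities. All function and variable names are illustrative.

```python
import math
from typing import List, Tuple

# A pure strategy s is a list of m resource coordinates; targets are points.
Point = Tuple[float, float]

def coverage(s: List[Point], targets: List[Point], radius: float = 0.5) -> List[int]:
    """Eq. (1): c_i(s) = 1 iff some resource's disk (diameter 1 => radius 0.5) contains target i."""
    return [1 if any(math.dist(t, r) <= radius for r in s) else 0 for t in targets]

def mixed_coverage(strategies: List[List[Point]], probs: List[float],
                   targets: List[Point], radius: float = 0.5) -> List[float]:
    """Eq. (3): c(p) = sum_s p_s * c(s)."""
    c = [0.0] * len(targets)
    for s, p in zip(strategies, probs):
        for i, ci in enumerate(coverage(s, targets, radius)):
            c[i] += p * ci
    return c

def utilities(c_i: float, Rd: float, Pd: float, Ra: float, Pa: float) -> Tuple[float, float]:
    """Eqs. (2a)-(2b): defender and attacker utilities when target i (coverage c_i) is attacked."""
    return c_i * Rd + (1 - c_i) * Pd, c_i * Pa + (1 - c_i) * Ra

# Motivating example from the introduction: two targets 1.5 apart, one resource of radius 1.
targets = [(0.0, 0.0), (1.5, 0.0)]
print(coverage([(0.75, 0.0)], targets, radius=1.0))  # [1, 1]: the midpoint covers both
print(coverage([(0.0, 0.0)], targets, radius=1.0))   # [1, 0]: placing on a target covers only one
```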
3 Computing the SSE

To specify $S$, we draw a circle centered at each target, so that a target gets covered if a resource is placed inside the corresponding circle. The borders of all the circles partition the plane into a collection of disjoint regions, and resources in the same region cover exactly the same set of targets. Therefore, we only need to consider one representative location for each of the regions (e.g., we can take the intersections of the circle borders) and obtain a finite $S$ with $O(2^{|A|})$ pure strategies, where $A$ is the collection of all the regions. However, as $|A| = O(n^2)$ (Gordon and others 1987), the size of $S$ still grows exponentially.

To address the scalability issue, we note that after we introduce the representative locations, the SGP actually degenerates to a Security Game with Protection Externalities (SPE), an existing problem where resources are to be allocated to a given finite set of candidate locations (not necessarily on a plane), such that allocating a resource to each candidate location covers an exogenously defined subset of the targets (Gan, An, and Vorobeychik 2015). The existing approach for SPE then applies to SGP, where column generation is used to decompose the large-scale LP as a way to tackle the scalability issue.
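The following Python sketch illustrates one way to realize this discretization, following the paper's suggestion of taking the circle centers and the border intersection points as representative locations; it deduplicates candidates by the set of targets they cover rather than computing the full arrangement of regions, and all names are illustrative.

```python
import math
from itertools import combinations
from typing import List, Set, Tuple

Point = Tuple[float, float]

def circle_intersections(c1: Point, c2: Point, r: float) -> List[Point]:
    """Intersection points of two equal-radius circles (empty if disjoint or coincident)."""
    d = math.dist(c1, c2)
    if d == 0 or d > 2 * r:
        return []
    h = math.sqrt(max(r * r - (d / 2) ** 2, 0.0))
    mx, my = (c1[0] + c2[0]) / 2, (c1[1] + c2[1]) / 2
    ux, uy = (c2[0] - c1[0]) / d, (c2[1] - c1[1]) / d  # unit vector c1 -> c2
    return [(mx - h * uy, my + h * ux), (mx + h * uy, my - h * ux)]

def candidate_locations(targets: List[Point], r: float = 0.5) -> List[Tuple[Point, Set[int]]]:
    """One representative point per distinct covered set: circle centers plus pairwise
    border intersections, each labelled with the set of targets it covers."""
    pts: List[Point] = list(targets)
    for t1, t2 in combinations(targets, 2):
        pts.extend(circle_intersections(t1, t2, r))
    eps = 1e-9  # tolerate points lying exactly on a circle border
    reps, seen = [], set()
    for p in pts:
        covered = frozenset(i for i, t in enumerate(targets) if math.dist(p, t) <= r + eps)
        if covered and covered not in seen:
            seen.add(covered)
            reps.append((p, set(covered)))
    return reps

# Five targets spaced evenly on a ring of radius 1 (the introduction's example).
ring = [(math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5)) for k in range(5)]
for p, cov in candidate_locations(ring, r=1.0):
    print(p, sorted(cov))
```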
Computational Obstacles

While the above approach provides a feasible way to break down the large-scale formulation, it does not essentially overcome the computational obstacle, as computing an SSE of SPE is NP-hard (Gan, An, and Vorobeychik 2015). The question remains whether SGP can be solved or approximated in polynomial time, perhaps in a way quite different from the solution approach for SPE. We find that computing the SSE of SGP is NP-hard as well, even for zero-sum games (Theorem 1). Worse still, SGP cannot be approximated efficiently within any constant factor unless P = NP (Theorem 2). Given this, we investigate approximation approaches for zero-sum SGP in the next section.

Theorem 1. Computing the SSE of SGP is NP-hard even for zero-sum games.

Proof. We use the known NP-complete disk covering problem (DCov) (Masuyama, Ibaraki, and Hasegawa 1981),² which asks whether $n$ given points on a plane can be covered by $m$ identical disks. We reduce DCov to SGP. For any DCov instance, we construct an SGP with: $n$ targets located at the $n$ given points in the DCov; $m$ resources; and utilities $R^d_i = R^a_i = 1$ and $P^d_i = P^a_i = 0$ for all targets $i \in [n]$. We solve the SGP and check whether the defender receives an expected utility of 1 under the SSE. This answers the DCov instance, since there exists a mixed strategy with utility 1 if and only if there exists at least one pure strategy (an allocation of $m$ disks) covering all targets (points).

Theorem 2. SGP cannot be approximated within any constant factor in polynomial time unless P = NP.

Proof. Suppose an algorithm computes an $\epsilon$-approximate solution to SGP in polynomial time. We show that DCov can be solved with this algorithm, which implies P = NP. For any DCov instance, we construct an SGP with: $m$ resources; $n + 1$ targets, with the first $n$ of them located at the $n$ points given in the DCov, and the remaining one at $(M, M)$, where $M$ is sufficiently large such that no resource can cover $(M, M)$ and any one of the first $n$ targets simultaneously; and player utilities as shown in the table below.

|                | $R^a_i$ | $P^a_i$ | $R^d_i$      | $P^d_i$ |
|----------------|---------|---------|--------------|---------|
| $i \in [n]$    | 1       | 1/2     | $\epsilon/3$ | 0       |
| $i = n + 1$    | 1/2     | 0       | 1            | 1/2     |

In this game, the defender can obtain an expected utility of at least 1/2 iff the attacker attacks target $n + 1$. Furthermore, the attacker will indeed attack target $n + 1$ iff target $n + 1$ is uncovered and the other targets are fully covered (i.e., $c_{n+1} = 0$ and $c_i = 1$ for all $i \in [n]$), which can happen iff there exists a pure strategy covering all targets except $n + 1$. In any other case, the defender receives a utility of at most $\epsilon/3$. Therefore: there exists a pure strategy (an allocation of $m$ resources) covering all $i \in [n]$ $\Leftrightarrow$ the defender receives a utility of at least 1/2 by playing the optimal strategy $\Leftrightarrow$ the $\epsilon$-approximation algorithm offers a solution with defender utility no less than $\epsilon/3$. By checking the solution of the approximation algorithm in polynomial time, we obtain an answer to DCov, an NP-complete problem. The same result holds for SGP by reduction from DCov.

² The problem is originally called the Euclidean m-center problem.

4 Approximating Zero-sum SGP

Having shown a series of negative results about SGP equilibrium computation, we now present a strong positive result: a PTAS for zero-sum SGP, so that for any fixed parameter $\epsilon > 0$ a $(1 - \epsilon)$-approximation can be computed in time polynomial in the input size. We remark that zero-sum security games, though having a restricted payoff structure, still capture a wide range of realistic security scenarios, as the attacker, being adversarial to the defender, usually benefits directly from making the defender worse off. The PTAS is almost the best approximation result we can obtain, as there exists no fully polynomial-time approximation scheme (FPTAS) for zero-sum SGP unless P = NP (Theorem 3). An FPTAS computes a $(1 - \epsilon)$-approximation in time polynomial in both the input size and $1/\epsilon$.

Theorem 3. Zero-sum SGP does not admit any FPTAS unless P = NP.

Proof. This follows from the strong NP-hardness of SGP. As we can see from the proof of Theorem 1, even if all the parameters are bounded by a polynomial in the length of the input, the problem remains NP-hard. Such problems are called strongly NP-hard problems (Gary and Johnson 1979). Any strongly NP-hard optimization problem with a polynomially bounded objective function cannot have an FPTAS unless P = NP (Vazirani 2013).

4.1 A PTAS for Zero-sum SGP

The PTAS is based on a grid-shifting approach, which has been used to derive approximations for many planar problems (e.g., Li et al. 2015). Our contribution is to extend the approach to SGP, particularly to the setting of mixed strategies, and to show that the approximation ratio and the polynomial runtime still hold, which is not trivial based on the existing results.

We define a grid $G_1$ consisting of infinitely many vertical and horizontal lines spaced uniformly at distance $l \in \mathbb{Z}_{>0}$, with two of them intersecting at $(1, 1)$, i.e., $G_1 = \{(x, y) \in \mathbb{R}^2 \mid x = a \cdot l + 1 \text{ or } y = a \cdot l + 1,\ a \in \mathbb{Z}\}$. Shifting $G_1$ along the vector $(1, 1)$ repeatedly, we obtain a series of grids $G_2, \dots, G_l$ (Figure 1). A grid divides the plane into infinitely many disjoint $l \times l$ cells. We assume for simplicity that no target has an integer coordinate (we can always shift the coordinate system to achieve this), so that all targets are in the interior of the cells. For each grid $G_k$, we define a filter $\sigma_k(\cdot)$ which, given a pure strategy $s$, filters out resources in $s$ whose protection areas overlap $G_k$, i.e., $\sigma_k(s) = \{(x, y) \in \mathbb{R}^2 \mid (x, y) \in s \wedge D^\circ(x, y) \cap G_k = \emptyset\}$, where $D^\circ(x, y)$ denotes the interior of the disk $D(x, y)$. Filtering the entire strategy space with $\sigma_k(\cdot)$, we obtain $S_k = \{\sigma_k(s) \mid s \in S\}$, with which we define a modified SGP by restricting the strategy space to $\widetilde{S} = \bigcup_{k \in [l]} S_k$. Let $\mathbf{p}^*$ and $\widetilde{\mathbf{p}}^*$ be the optimal defender strategies in the original and the modified SGPs, respectively.
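A small Python sketch of the grid family $G_1, \dots, G_l$ and the filter $\sigma_k$ follows, assuming disks of diameter 1; `disk_touches_grid` and `sigma` are hypothetical helper names used only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def disk_touches_grid(center: Point, k: int, l: int, radius: float = 0.5) -> bool:
    """True iff the open disk around `center` intersects grid G_k, whose vertical and
    horizontal lines sit at coordinates congruent to k (mod l)."""
    def touches_lines(coord: float) -> bool:
        offset = (coord - k) % l          # distance along one axis to the line family
        return min(offset, l - offset) < radius
    x, y = center
    return touches_lines(x) or touches_lines(y)

def sigma(s: List[Point], k: int, l: int, radius: float = 0.5) -> List[Point]:
    """Filter sigma_k: keep only resources whose protection area stays inside one cell of G_k."""
    return [p for p in s if not disk_touches_grid(p, k, l, radius)]

# Example with l = 4: each resource is filtered out by at most two of the l shifted grids,
# which is the observation behind the (1 - 2/l) approximation bound.
s = [(1.2, 3.0), (2.9, 2.5)]
for k in range(1, 5):
    print(k, sigma(s, k, l=4))
```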
The lemmas below complete the construction of the PTAS by showing: (i) $\widetilde{\mathbf{p}}^*$ offers a $(1 - \frac{2}{l})$-approximation to $\mathbf{p}^*$ (Lemma 5); and (ii) a defender strategy at least as good as $\widetilde{\mathbf{p}}^*$ can be computed in polynomial time (Lemma 6). We conclude with Theorem 7.

Lemma 4. For any defender mixed strategy $\mathbf{p}$, there exists a mixed strategy $\widetilde{\mathbf{p}}$ with support set $\Delta(\widetilde{\mathbf{p}}) \subseteq \widetilde{S}$ such that $(1 - \frac{2}{l}) \cdot c(\mathbf{p}) \le c(\widetilde{\mathbf{p}}) \le c(\mathbf{p})$.

Proof. For any given $\mathbf{p}$, we construct a strategy $\widetilde{\mathbf{p}} = \langle \widetilde{p}_s \rangle$ such that $\widetilde{p}_{\sigma_k(s)} = \frac{1}{l} \cdot p_s$ for all $k \in [l]$ and all $s \in \Delta(\mathbf{p})$. In this way, we have $\Delta(\widetilde{\mathbf{p}}) \subseteq \widetilde{S}$ and

$$c(\widetilde{\mathbf{p}}) = \sum_{s \in \Delta(\mathbf{p})} \sum_{k \in [l]} \frac{p_s}{l} \cdot c(\sigma_k(s)) = \sum_{s \in \Delta(\mathbf{p})} \frac{p_s}{l} \sum_{k \in [l]} c(\sigma_k(s)). \qquad (5)$$

We have $c(\sigma_k(s)) \le c(s)$, as $s$ covers more targets, so that Eq. (5) implies $c(\widetilde{\mathbf{p}}) \le c(\mathbf{p})$. To derive the lower bound, we observe that the interior of the protection area of a resource intersects at most two of $G_1, \dots, G_l$. Thus, for an arbitrary $s$, $\sum_{k \in [l]} \big( c(s) - c(\sigma_k(s)) \big) \le 2 \cdot c(s)$. Again with Eq. (5), we have

$$c(\widetilde{\mathbf{p}}) \ge \sum_{s \in \Delta(\mathbf{p})} \frac{p_s}{l} \cdot (l - 2) \cdot c(s) = \left(1 - \frac{2}{l}\right) c(\mathbf{p}),$$

which completes the proof.

Lemma 5. $U(c(\widetilde{\mathbf{p}}^*)) \ge (1 - \frac{2}{l}) \cdot U(c(\mathbf{p}^*))$ if the SGP is zero-sum, where $U(c) = U^d(c, f(c))$.

Proof. According to Lemma 4, there exists a $\widetilde{\mathbf{p}}$ such that $c(\widetilde{\mathbf{p}}) \ge (1 - \frac{2}{l}) \cdot c(\mathbf{p}^*)$. A feature of zero-sum security games is that adding more coverage to a solution does not make the solution worse for the defender, i.e., $c \ge c'$ implies $U(c) \ge U(c')$ for any coverage vectors $c$ and $c'$. Therefore,

$$U(c(\widetilde{\mathbf{p}}^*)) \ge U(c(\widetilde{\mathbf{p}})) \ge U\big((1 - \tfrac{2}{l}) \cdot c(\mathbf{p}^*)\big),$$

and for every target $i$,

$$\frac{U^d\big((1 - \frac{2}{l}) c(\mathbf{p}^*), i\big)}{U^d\big(c(\mathbf{p}^*), i\big)} = \frac{P^d_i + (R^d_i - P^d_i)(1 - \frac{2}{l}) \, c_i(\mathbf{p}^*)}{P^d_i + (R^d_i - P^d_i) \, c_i(\mathbf{p}^*)} \ge \frac{(R^d_i - P^d_i)(1 - \frac{2}{l}) \, c_i(\mathbf{p}^*)}{(R^d_i - P^d_i) \, c_i(\mathbf{p}^*)} = 1 - \frac{2}{l},$$

so that $U\big((1 - \frac{2}{l}) c(\mathbf{p}^*)\big) \ge (1 - \frac{2}{l}) \cdot U(c(\mathbf{p}^*))$, which completes the proof.

Lemma 6. A defender strategy at least as good as $\widetilde{\mathbf{p}}^*$ can be computed in time polynomial in $n$.

Proof. $\widetilde{\mathbf{p}}^*$ can be computed by Eqs. (4) with $\widetilde{S}$ replacing $S$ as the pure strategy space.³ To deal with the large number of variables, we consider the dual problem, structured as follows:

$$\min_{\mathbf{q}} \quad \sum_{i \in [n]} (R^a_{i^*} - R^a_i) \cdot q_i + q_{n+1} \qquad (6a)$$
$$\text{s.t.} \quad \sum_{i \in [n]} \big( c_{i^*}(s) \cdot Q^a_{i^*} - c_i(s) \cdot Q^a_i \big) \cdot q_i + q_{n+1} \ge c_{i^*}(s) \cdot Q^d_{i^*} \quad \forall s \in \widetilde{S} \qquad (6b)$$
$$q_i \ge 0 \quad \forall i \in [n] \qquad (6c)$$

where $Q^\phi_i = R^\phi_i - P^\phi_i$ for all $i \in [n]$, $\phi \in \{d, a\}$. The formulation has only $n + 1$ variables but as many as $|\widetilde{S}| + n$ constraints. We use the ellipsoid method (Bertsimas and Tsitsiklis 1994) to deal with the constraints, with which the problem reduces to implementing a separation oracle, i.e., a procedure that, for any given $\mathbf{q}$, decides whether all constraints are satisfied and, if not, finds one of the violated constraints. The key is to verify satisfiability of the constraints in Eq. (6b). We observe that for a given $\mathbf{q}$, Eq. (6b) can be rewritten as $\sum_{i \in [n]} w_i \cdot c_i(s) + w_{n+1} \le 0$, where $w_i = q_i Q^a_i$ for $i \ne i^*$, $w_{i^*} = Q^d_{i^*} - \sum_{i \ne i^*} q_i Q^a_{i^*}$, and $w_{n+1} = -q_{n+1}$ are all constants. Since $\widetilde{S} = \bigcup_k S_k$, implementing the separation oracle is equivalent to checking whether $\max_{s \in S_k} w(s) > 0$ for some $k \in [l]$, where $w(s) = \sum_{i \in [n]} w_i \cdot c_i(s) + w_{n+1}$; the maximization $\max_{s \in S_k} w(s)$ can furthermore be interpreted as the Maximum Budgeted Coverage problem (MBC): given $n$ weighted elements with weights $w_1, \dots, w_n$ and a collection $A$ of subsets of the elements, find $m$ subsets such that the sum of weights in the union of the subsets is maximized. Here $A$ corresponds to the collection of regions defined in Section 3.

As the plane is divided into independent cells by the grids, an allocation in one cell does not affect those in the other cells. Given this, we use dynamic programming to tackle the problem. For a grid $G_k$, let $\text{cell}_1, \dots, \text{cell}_{\tilde n}$ ($\tilde n \le n$) be the cells containing at least one target, and let $A(\eta, j)$ be the maximum weight we can cover with $j$ resources in $\text{cell}_1, \dots, \text{cell}_\eta$. Our goal is to compute $A(\tilde n, m)$, which can be done with the following recursion:

$$A(\eta, j) = \begin{cases} \max_{j' = 0}^{\min\{j,\, \bar m\}} \big[ A(\eta - 1,\, j - j') + \text{MBC}(\text{cell}_\eta,\, j') \big], & \text{if } \eta > 1 \\ \text{MBC}(\text{cell}_1,\, j), & \text{if } \eta = 1 \end{cases}$$

where $\text{MBC}(\text{cell}_\eta, j)$ is the maximum weight we can cover with $j$ resources in $\text{cell}_\eta$, and $\bar m$ is a cap on the number of resources needed to achieve optimality in a single cell. Here, if we allow the protection areas of resources to cross the cell border, we have $\bar m = 2l^2$, as an $l \times l$ square can be fully covered with $2l^2$ disks.⁴ We do so as this actually expands the pure strategy space, resulting in a solution at least as good as $\widetilde{\mathbf{p}}^*$. Therefore, given the $O(n^2)$ candidate locations, we can enumerate all $O(n^{4l^2})$ possible allocations to find $\text{MBC}(\text{cell}_\eta, j)$, which is polynomial in $n$. This implies that $A(\tilde n, m)$, and hence the separation oracle, is polynomial-time computable, which concludes the proof.

³ Zero-sum security games admit a more concise single-LP formulation (Xu 2016). This does not change the nature of the proof; to be more general, we prove with the multiple-LP formulation.

⁴ While $w_i \ge 0$ for $i \ne i^*$, $w_{i^*}$ can be negative. In this case, we may not want to cover $i^*$. It is easy to see that, using a constant number of additional disks, we can leave a small hole around $i^*$ when covering the cell, so that $\bar m$ is still bounded by a constant.
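The dynamic program in the proof of Lemma 6 can be sketched as follows (illustrative Python, not the paper's implementation): the per-cell subproblem MBC(cell, j) is solved here by brute force over candidate covered-sets, and the cells are then combined with the recursion for $A(\eta, j)$. The helper names (`mbc_bruteforce`, `separation_value`) and the toy instance are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set

def mbc_bruteforce(cands: List[Set[int]], weights: Dict[int, float], j: int) -> float:
    """MBC(cell, j): max total weight of targets covered by at most j candidate locations,
    found by brute force (the paper enumerates allocations; an MILP is used in practice)."""
    best = 0.0
    for r in range(0, min(j, len(cands)) + 1):
        for combo in combinations(cands, r):
            covered = set().union(*combo) if combo else set()
            best = max(best, sum(weights[i] for i in covered))
    return best

def separation_value(cells: List[List[Set[int]]], weights: Dict[int, float],
                     m: int, m_bar: int) -> float:
    """A(n_tilde, m): best weight achievable with m resources across independent cells,
    via A(eta, j) = max_{j' <= m_bar} A(eta - 1, j - j') + MBC(cell_eta, j')."""
    A = [0.0] * (m + 1)            # A[j] holds A(eta, j) for the cells processed so far
    for cell in cells:
        newA = [0.0] * (m + 1)
        for j in range(m + 1):
            newA[j] = max(
                A[j - jp] + mbc_bruteforce(cell, weights, jp)
                for jp in range(0, min(j, m_bar) + 1)
            )
        A = newA
    return A[m]

# Toy instance: two cells, targets 0..3 with unit weights; one resource per cell suffices.
cells = [
    [{0, 1}, {1}],       # candidate covered-sets inside cell 1
    [{2}, {2, 3}],       # candidate covered-sets inside cell 2
]
weights = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}
print(separation_value(cells, weights, m=2, m_bar=2))  # 4.0
```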
Theorem 7. For any given $l \in \mathbb{Z}_{>0}$, a $(1 - \frac{2}{l})$-approximate solution to zero-sum SGP can be computed in time polynomial in $n$.

Proof. This follows readily from Lemmas 5 and 6.

4.2 Implementation Considerations

The PTAS is highly theoretical in two respects: first, it relies on the ellipsoid method which, while having a theoretical guarantee of polynomial runtime, is notorious for being inefficient in practice; and second, the runtime of the separation oracle has a large exponent on $n$. These make the PTAS inefficient for practical use. For better performance, we propose the following practical remedies.

Column Generation (CG)

We use CG to replace the ellipsoid method. The slave problem of CG is exactly the same as the separation oracle of the ellipsoid method and is polynomial-time solvable. Note that this does not mean that CG has polynomial runtime, as there is no guarantee on the number of iterations CG needs. However, CG does exhibit better practical performance, which is similar to the relationship between the ellipsoid method and the simplex method in solving LPs: the latter does not guarantee polynomial runtime but is generally much more efficient in practice.

Greedy Approximation for the Slave Problem

Since the ultimate goal of the slave problem is to find a column with a positive reduced cost, there is no need to maximize the reduced cost every time. Approximations to the slave problem, which find better (instead of the best) columns, can be used to simplify the computation. Only when the approximation fails to find a column with a positive reduced cost do we call the exact algorithm for the best column.

We use a cell-wise greedy algorithm. The rough idea is to iteratively place one resource at the location where the largest marginal weight is covered, until all resources are used. To make use of the cell structure, a within-cell rank of candidate locations is maintained for each cell by the marginal weights of the locations; and so is a between-cell rank of cells by the marginal weights of the best locations in the cells. In each iteration, a best location is chosen from the best cell in the between-cell rank, and then the between-cell rank and only the within-cell rank of the chosen cell are updated. Such a bilevel implementation is faster than a direct implementation on the entire collection of candidate locations.
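As a rough illustration of the greedy idea (without the bilevel within-cell/between-cell ranking the paper uses for speed), a flat marginal-gain greedy over candidate locations might look as follows; the names and the toy example are hypothetical.

```python
from typing import Dict, List, Set

def greedy_allocation(cands: List[Set[int]], weights: Dict[int, float], m: int) -> List[int]:
    """Greedy heuristic for the slave problem: iteratively place a resource at the candidate
    location with the largest marginal covered weight. (The paper maintains a within-cell and
    a between-cell ranking to speed this up; this flat version shows only the core idea.)"""
    covered: Set[int] = set()
    chosen: List[int] = []
    for _ in range(m):
        best_idx, best_gain = None, 0.0
        for idx, cov in enumerate(cands):
            gain = sum(weights[i] for i in cov - covered)
            if gain > best_gain:
                best_idx, best_gain = idx, gain
        if best_idx is None:   # no positive marginal gain left
            break
        chosen.append(best_idx)
        covered |= cands[best_idx]
    return chosen

# Toy example: three candidate locations, four unit-weight targets, two resources.
cands = [{0, 1}, {1, 2}, {2, 3}]
print(greedy_allocation(cands, {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}, m=2))  # [0, 2]
```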
An MILP for MBC

To compute the exact solution of $\text{MBC}(\text{cell}_\eta, j)$, instead of enumerating all the $O(n^{4l^2})$ possible allocations, a more practical approach is to formulate it as a mixed integer linear program (MILP). Many mature algorithms and solvers for MILP can then be used to solve the problem. The MILP formulation can be found in previous work (Gan, An, and Vorobeychik 2015).

5 Extension to General-sum SGP

We have shown that general-sum SGP is hard even to approximate (Theorem 2). Further analysis implies that the inapproximability is due to the inherent lack of robustness of the SSE solution concept: when the game is not zero-sum, the defender's utility function is non-continuous with respect to the defender's strategy. This means that even if we can obtain a coverage vector arbitrarily close to the optimal one (indeed, we can, as Lemmas 4 and 6 also apply to general-sum SGP), the objective value we get does not necessarily converge to the optimum. For example, given one resource and two targets such that $R^a_1 = 1$, $P^a_1 = \frac{1}{2}$ and $R^a_2 = \frac{1}{2}$, $P^a_2 = 0$, only when the coverage is exactly $c_1 = 1$ and $c_2 = 0$ is the attacker incentivized to attack target 2.

Despite the discouraging result, we point out that such special instances are rarely seen in practice. In most cases, if we apply the PTAS for zero-sum SGP to general-sum SGPs, we still obtain solutions of good quality. We present several theoretical observations to justify an optimistic outlook.

Quasi-zero-sum Games

As we pointed out previously, in real scenarios the attacker normally benefits more from attacking targets with higher values to the defender. That is, the payoff structure, if not completely zero-sum, exhibits strong negative correlation between the players. We say that a game is $(1 - \epsilon)$-quasi-zero-sum if $|R^a_i + P^d_i - 1| \le \epsilon$ and $|R^d_i + P^a_i - 1| \le \epsilon$ for all $i \in [n]$. Theorem 9 shows that the PTAS for zero-sum games yields a solution to quasi-zero-sum games with a bounded absolute error.

Lemma 8. If a security game is $(1 - \epsilon)$-quasi-zero-sum, then for any pair of coverage vectors $c$ and $c'$ such that $c' \ge c - \delta$ (component-wise), we have $U^d(c', f(c')) \ge U^d(c, f(c)) - (2\epsilon + \delta)$.

Proof. For any coverage vector $c$ and $i \in [n]$,

$$U^d(c, i) + U^a(c, i) = c_i (R^d_i + P^a_i) + (1 - c_i)(P^d_i + R^a_i) \ge c_i (1 - \epsilon) + (1 - c_i)(1 - \epsilon) = 1 - \epsilon, \qquad (7)$$

where the inequality follows from the definition of a $(1 - \epsilon)$-quasi-zero-sum game. Similarly,

$$U^d(c, i) + U^a(c, i) \le 1 + \epsilon. \qquad (8)$$

Therefore,

$$U^d(c', f(c')) \ge U^d(c - \delta, f(c')) \ge U^d(c, f(c')) - \delta \quad \text{(by Eq. (2a))}$$
$$\ge 1 - \epsilon - U^a(c, f(c')) - \delta \quad \text{(by Eq. (7))}$$
$$\ge 1 - \epsilon - U^a(c, f(c)) - \delta$$
$$\ge U^d(c, f(c)) - 2\epsilon - \delta. \quad \text{(by Eq. (8))}$$

[Figure 2: Runtime of the exact algorithm and the PTAS (y-axis: time in seconds; x-axis: number of targets); panels for density 0.5 and 2.0, zero-sum and general-sum games; curves labeled 80% and 100% for PTAS and Exact.]

[Figure 3: Defender utilities obtained with the existing SPE model as ratios to those obtained with SGP (x-axis: target density); curves for θ = 0.1, 0.5, 1.0, zero-sum and general-sum games.]
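As a small illustration of the definition above, the following sketch computes the smallest $\epsilon$ for which a given payoff table is $(1 - \epsilon)$-quasi-zero-sum, together with the additive loss bound $2\epsilon + \delta$ from Lemma 8; the payoff values and names are made up for illustration.

```python
from typing import List, Tuple

# Per-target payoffs as (Rd, Pd, Ra, Pa) tuples, all normalized to [0, 1].
Payoffs = List[Tuple[float, float, float, float]]

def quasi_zero_sum_eps(payoffs: Payoffs) -> float:
    """Smallest eps such that the game is (1 - eps)-quasi-zero-sum:
    |Ra_i + Pd_i - 1| <= eps and |Rd_i + Pa_i - 1| <= eps for all i."""
    return max(
        max(abs(Ra + Pd - 1.0), abs(Rd + Pa - 1.0))
        for (Rd, Pd, Ra, Pa) in payoffs
    )

payoffs = [(0.9, 0.1, 0.85, 0.05), (0.7, 0.2, 0.82, 0.33)]
eps = quasi_zero_sum_eps(payoffs)
delta = 0.05  # allowed component-wise coverage shortfall in Lemma 8
print(eps, 2 * eps + delta)  # Lemma 8's additive bound on the defender's utility loss
```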
Theorem 9. Suppose an SGP is $(1 - \epsilon)$-quasi-zero-sum. For any given $l \in \mathbb{Z}_{>0}$, a solution $\widetilde{\mathbf{p}}^*$ can be computed in polynomial time such that $U(c(\mathbf{p}^*)) - U(c(\widetilde{\mathbf{p}}^*)) \le 2\epsilon + \frac{2}{l}$, where $\mathbf{p}^*$ is the optimal solution and $U(c) = U^d(c, f(c))$.

Proof. By Lemmas 4 and 6, we can obtain in polynomial time a solution $\widetilde{\mathbf{p}}^*$ such that $c(\widetilde{\mathbf{p}}^*) \ge (1 - \frac{2}{l}) \cdot c(\mathbf{p}^*) \ge c(\mathbf{p}^*) - \frac{2}{l}$. By Lemma 8, we obtain the desired result with $\delta = \frac{2}{l}$.

Robustness Considerations

Alternatively, we may consider eliminating unstable solutions, on which a slight deviation of the defender may cause the attacker to change his target and result in a large difference in the defender's utility. This is important as, in practice, the defender can rarely implement her strategy precisely as planned; nor can the attacker observe precisely the mixed strategy the defender implements. We add a buffer $\varepsilon$ to Eq. (4b): $U^a(c(\mathbf{p}), i^*) \ge U^a(c(\mathbf{p}), i) + \varepsilon$ for all $i \in [n]$. The solution is then robust to errors causing up to $\varepsilon$ loss in the attacker's utility. We call such a solution an $\varepsilon$-robust solution. Theorem 10 shows that if we limit our solutions to the robust ones, the PTAS can still be used to find approximate solutions with only a slight loss of robustness.

Theorem 10. For an SGP and a given $l \in \mathbb{Z}_{>0}$, if there exists an $\varepsilon$-robust solution with $\varepsilon > \frac{2}{l}$, then an $(\varepsilon - \frac{2}{l})$-robust solution can be computed in polynomial time which offers at least $(1 - \frac{2}{l})$ of the utility of the optimal $\varepsilon$-robust solution.

Proof. Let $\mathbf{p}^*$ denote the optimal $\varepsilon$-robust solution. By Lemma 4, there exists a solution $\widetilde{\mathbf{p}}$ with support set $\Delta(\widetilde{\mathbf{p}}) \subseteq \widetilde{S}$ such that $(1 - \frac{2}{l}) \cdot c(\mathbf{p}^*) \le c(\widetilde{\mathbf{p}}) \le c(\mathbf{p}^*)$. Let $i^*$ be the attacker's best response to $\mathbf{p}^*$, i.e., $i^* = f(c(\mathbf{p}^*))$. For every $i \in [n]$, we have

$$U^a(c(\widetilde{\mathbf{p}}), i^*) \ge U^a(c(\mathbf{p}^*), i^*) \quad \text{(since } c(\widetilde{\mathbf{p}}) \le c(\mathbf{p}^*)\text{)}$$
$$\ge U^a(c(\mathbf{p}^*), i) + \varepsilon \quad \text{(as } \mathbf{p}^* \text{ is } \varepsilon\text{-robust)}$$
$$\ge U^a(c(\widetilde{\mathbf{p}}), i) + \varepsilon - \frac{2}{l}, \quad \text{(since } c(\widetilde{\mathbf{p}}) \ge c(\mathbf{p}^*) - \tfrac{2}{l}\text{)}$$

which indicates that $\widetilde{\mathbf{p}}$ is $(\varepsilon - \frac{2}{l})$-robust and the attacker's best response to $\widetilde{\mathbf{p}}$ is $i^*$. Therefore, combining Lemma 6, we can obtain in polynomial time an $(\varepsilon - \frac{2}{l})$-robust solution which is at least as good as $\widetilde{\mathbf{p}}$ (by solving the LP formulation with a buffer $\varepsilon - \frac{2}{l}$ added to Eq. (4b)). It follows that (similarly to the proof of Lemma 5)

$$U^d(c(\widetilde{\mathbf{p}}), i^*) \ge U^d\big((1 - \tfrac{2}{l}) \cdot c(\mathbf{p}^*), i^*\big) \ge \left(1 - \tfrac{2}{l}\right) \cdot U^d(c(\mathbf{p}^*), i^*).$$

[Figure 4: Approximation ratio of solutions obtained with the PTAS (x-axis: target density); curves for θ = 0.1, 0.5, 1.0, zero-sum and general-sum games.]

In addition to the above theoretical observations, our experimental results in Section 6 again corroborate the effectiveness of the PTAS on general SGP.

6 Experimental Evaluations

We experimentally evaluate the proposed model and algorithms. All results are obtained on a platform with a 3.2 GHz CPU and 16 GB memory. All LPs and MILPs are solved using the existing solver CPLEX (version 12.4).

Performance of Algorithms

We compare the runtime of our exact algorithm and the PTAS on both zero-sum and general-sum games. In the experiments, target coordinates are randomly and uniformly generated in [0, n/ρ] for different target densities ρ. Player payoffs are randomly and uniformly generated in [0, 1]. The comparison is shown in Figure 2, where the results are obtained with l = 10 in the PTAS, which guarantees an approximation ratio of 80%. We can see that while the PTAS runtime is comparable to the exact approach for small instances, it begins to exhibit significant improvements as scale increases, particularly for non-zero-sum games. Moreover, we see that the PTAS is significantly faster in reaching its theoretical approximation ratio of 80% than the exact algorithm. This is useful, as we can maintain an upper bound on the solution (which can be done with an LP relaxation for MBC (Gan, An, and Vorobeychik 2015)) and terminate the algorithm for faster performance when the solution reaches the desired ratio to the upper bound.
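One possible reading of this experimental setup is sketched below in Python: target coordinates drawn uniformly from [0, n/ρ] and payoffs drawn uniformly in [0, 1] with rewards forced above penalties. The paper's exact sampling procedure may differ, and all names are illustrative.

```python
import random

def random_instance(n: int, density: float, seed: int = 0):
    """One way to generate a random SGP instance in the spirit of the described setup:
    coordinates uniform in [0, n/density] (a literal reading of the text) and payoffs
    uniform in [0, 1], with each reward forced above the corresponding penalty."""
    rng = random.Random(seed)
    side = n / density
    targets = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    payoffs = []
    for _ in range(n):
        rd, pd = sorted((rng.random(), rng.random()), reverse=True)  # Rd > Pd
        ra, pa = sorted((rng.random(), rng.random()), reverse=True)  # Ra > Pa
        payoffs.append({"Rd": rd, "Pd": pd, "Ra": ra, "Pa": pa})
    return targets, payoffs

targets, payoffs = random_instance(n=20, density=2.0)
print(targets[0], payoffs[0])
```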
Improvement of Solution Quality with SGP

We evaluate the improvement of solution quality with SGP as compared with the existing SPE model, where resource allocation is restricted to targets. The results, shown in Figure 3, are obtained with instances of 100 targets and 20 resources (the numbers of targets and resources do not have a significant effect on the results). The parameter θ controls the variance of the players' payoffs over different targets: we first generate the penalty $P^\phi_i$ in $[0, \theta]$, and then the reward in $[P^\phi_i + 1 - \theta, 1]$ (when θ = 1, this is equivalent to the payoff generation model described above; when θ = 0, all targets are identical). This is associated with an interesting observation: when targets have similar payoffs, the gap between the solutions of SGP and SPE increases. On the other hand, when the payoff variance is large, SPE exhibits good solution quality and could potentially be used as a heuristic for SGP.

Solution Quality of the Approximation Approach

We evaluate how well the PTAS actually approximates solutions in both zero-sum and non-zero-sum settings. As shown in Figure 4, the PTAS yields nearly optimal solutions in most runs, and does so even for general-sum games.

7 Conclusions

This paper aims at addressing the limitation of existing models of Stackelberg security games that ignore the underlying topology of the space in which targets and defense resources are, or are to be, located. A novel model, SGP, which incorporates a planar topology, is proposed and studied. Hardness results are established showing that computing the SSE of SGP is NP-hard and generally hard to approximate. Despite these results, a PTAS is found and implemented for zero-sum SGPs, which in practice also offers solutions of good quality to general-sum SGPs. Experimental results show the improvement of solution quality with the SGP model, as well as the effectiveness of the proposed algorithms.

Acknowledgement

This research is partially supported by NRF2015NCRNCR003-004, NAP, NSF (CNS-1238959, CNS-1640624, IIS-1526860), ARO (W911NF-16-1-0069), ONR (N00014151-2621), and AFRL (FA8750-14-2-0180).

References

An, B.; Pita, J.; Shieh, E.; Tambe, M.; Kiekintveld, C.; and Marecki, J. 2011. GUARDS and PROTECT: Next generation applications of security games. ACM SIGecom Exchanges 10(1):31–34.

Basilico, N.; Gatti, N.; and Amigoni, F. 2009. Leader-follower strategies for robotic patrolling in environments with arbitrary topologies. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 57–64.

Bertsimas, D., and Tsitsiklis, J. N. 1994. Introduction to Linear Optimization. Athena Scientific.

Conitzer, V., and Sandholm, T. 2006. Computing the optimal strategy to commit to. In Proceedings of the 7th ACM Conference on Electronic Commerce, 82–90.

Gan, J.; An, B.; and Vorobeychik, Y. 2015. Security games with protection externalities. In Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI), 914–920.

Gary, M. R., and Johnson, D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-completeness. WH Freeman and Company, New York.

Gordon, B., et al. 1987. Challenging Mathematical Problems with Elementary Solutions, volume 1. Courier Corporation.

Jain, M.; Conitzer, V.; and Tambe, M. 2013. Security scheduling for real-world networks. In Proceedings of the 12th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 215–222.
Kiekintveld, C.; Jain, M.; Tsai, J.; Pita, J.; Ordóñez, F.; and Tambe, M. 2009. Computing optimal randomized resource allocations for massive security games. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 689–696.

Li, J.; Wang, H.; Zhang, B.; and Zhang, N. 2015. Linear time approximation schemes for geometric maximum coverage. In Computing and Combinatorics. Springer. 559–571.

Masuyama, S.; Ibaraki, T.; and Hasegawa, T. 1981. The computational complexity of the m-center problems on the plane. IEICE Transactions (1976-1990) 64(2):57–64.

Tambe, M. 2011. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press.

Vazirani, V. V. 2013. Approximation Algorithms. Springer Science & Business Media.

Von Stengel, B., and Zamir, S. 2004. Leadership with commitment to mixed strategies. CDAM Research Report LSE-CDAM-2004-01.

Vorobeychik, Y.; An, B.; Tambe, M.; and Singh, S. P. 2014. Computing solutions in infinite-horizon discounted adversarial patrolling games. In ICAPS, 314–322.

Wang, Z.; Yin, Y.; and An, B. 2016. Computing optimal monitoring strategy for detecting terrorist plots. In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI), 637–643.

Xu, H. 2016. The mysteries of security games: Equilibrium computation becomes combinatorial algorithm design. In Proceedings of the 2016 ACM Conference on Economics and Computation, 497–514.