# Blameworthiness in Security Games

The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Pavel Naumov,¹ Jia Tao²
¹Tulane University, ²Lafayette College
pgn2@cornell.edu, taoj@lafayette.edu

## Abstract

Security games are an example of a successful real-world application of game theory. The paper defines blameworthiness of the defender and the attacker in security games using the principle of alternative possibilities and provides a sound and complete logical system for reasoning about blameworthiness in such games. Two of the axioms of this system capture the asymmetry of information in security games.

## Introduction

In this paper we study the properties of blameworthiness in security games (von Stackelberg 1934). Security games are used for canine airport patrol (Pita et al. 2008; Jain et al. 2010), airport passenger screening (Brown et al. 2016), protecting endangered animals and fish stocks (Fang, Stone, and Tambe 2015), U.S. Coast Guard port patrol (Sinha et al. 2018; An, Tambe, and Sinha 2016), and randomized deployment of U.S. air marshals (Sinha et al. 2018).

| Defender \ Attacker | Terminal 1 | Terminal 2 |
|---------------------|-----------:|-----------:|
| Terminal 1          | 20         | 120        |
| Terminal 2          | 200        | 16         |

Figure 1: Expected Human Losses in Security Game G1.

As an example, consider a security game G1 in which a defender is trying to protect two terminals in an airport from an attacker. Due to limited resources, the defender can patrol only one terminal at a given time. If the defender chooses to patrol Terminal 1 and the attacker chooses to attack Terminal 2, then the human losses at Terminal 2 are estimated at 120, see Figure 1. However, if the defender chooses to patrol Terminal 2 while the attacker still chooses to attack Terminal 2, then the expected number of human losses at Terminal 2 is only 16, see Figure 1.

Generally speaking, the goal of the defender is to minimize human losses, while the goal of the attacker is to maximize them. However, the utility functions in security games usually take into account not only the human losses, but also the cost, to the defender and the attacker respectively, of protecting and of attacking the target. Such a cost has to be converted to human lives using some factor, possibly different for the defender and the attacker. In game G1, we assume that the cost of defending Terminal 1 and Terminal 2 is 8 and 4 respectively, while the cost of attacking these terminals is 12 and 8 respectively, see Figure 2. As a result, for example, if the defender chooses to patrol Terminal 1 and the attacker chooses to attack Terminal 2, then the payoff of the defender is −120 − 8 = −128 and the payoff of the attacker is 120 − 8 = 112, see Figure 2.

| Defender \ Attacker | Terminal 1 (cost 12) | Terminal 2 (cost 8) |
|---------------------|---------------------:|--------------------:|
| Terminal 1 (cost 8) | −28, 8               | −128, 112           |
| Terminal 2 (cost 4) | −204, 188            | −20, 8              |

Figure 2: Utility Functions in Security Game G1.

In real-world examples of security games, the defender usually employs mixed strategies. For example, if the defender is using a strategy 75/25, then he will spend 75% of the time in Terminal 1 and 25% of the time in Terminal 2. In practice, each morning the defender might get a randomly generated timetable that specifies at which terminal the defender should be at each time slot during the day (Jain et al. 2010).
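Both conversions, from the losses and costs to the utilities of Figure 2 and from the utilities to expected payoffs under a mixed strategy, are mechanical. The following minimal sketch reproduces them; the code and its names (`losses`, `utility`, and so on) are our illustration and are not part of the paper.

```python
# Sketch: reproducing the payoff tables of game G1 from Figure 1 plus costs.
losses = {("T1", "T1"): 20, ("T1", "T2"): 120,
          ("T2", "T1"): 200, ("T2", "T2"): 16}  # Figure 1: (patrolled, attacked)
defend_cost = {"T1": 8, "T2": 4}
attack_cost = {"T1": 12, "T2": 8}

# Figure 2: the defender pays the losses plus the patrol cost; the attacker
# gains the losses minus the attack cost.
utility = {(d, a): (-losses[(d, a)] - defend_cost[d],
                    losses[(d, a)] - attack_cost[a])
           for d in ("T1", "T2") for a in ("T1", "T2")}
assert utility[("T1", "T2")] == (-128, 112)     # the example above

# Expected payoffs under the three mixed strategies (used in Figure 3 below).
for name, p1 in {"75/25": 0.75, "50/50": 0.50, "25/75": 0.25}.items():
    for a in ("T1", "T2"):
        dfn = p1 * utility[("T1", a)][0] + (1 - p1) * utility[("T2", a)][0]
        atk = p1 * utility[("T1", a)][1] + (1 - p1) * utility[("T2", a)][1]
        print(f"({name}, {a}): defender {dfn:.0f}, attacker {atk:.0f}")
```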
The distinctive feature of security games compared to strategic games is the asymmetry of information between the players: the attacker knows the strategy employed by the defender, but not vice versa. For example, while planning the attack, the attacker might visit the airport multiple times and observe that the defender spends 75% of the time in Terminal 1 and 25% of the time in Terminal 2. Thus, the attacker will know the mixed strategy used by the defender, but she will not know the location of the defender at the moment she plans to arrive at the airport on the day of the attack.

For the sake of simplicity, we assume that in game G1 the defender must choose between only three given mixed strategies: 75/25, 50/50, and 25/75. Then, game G1 can be described as the extensive form game depicted in Figure 3. The payoffs in this figure represent expected values of the utility functions.

| Defender \ Attacker | T1        | T2        |
|---------------------|----------:|----------:|
| 75/25               | −72, 53   | −101, 86  |
| 50/50               | −116, 98  | −74, 60   |
| 25/75               | −160, 143 | −47, 34   |

Figure 3: Security Game G1 in Extensive Form (expected payoffs of the defender and the attacker).

For example, suppose that the defender chooses the mixed strategy 75/25 and the attacker chooses to attack Terminal 1. The pair (75/25, T1) is called an action profile of game G1. Under this action profile, the payoffs of the defender and the attacker are −28 and 8, respectively, with probability 75%, and they are −204 and 188, respectively, with probability 25%, see Figure 2. Thus, the expected payoff (or just payoff) of the defender is 75% · (−28) + 25% · (−204) = −21 − 51 = −72 and that of the attacker is 75% · 8 + 25% · 188 = 6 + 47 = 53.

Suppose that the defender chooses strategy 50/50 and the attacker decides to target Terminal 2. Then, the attacker's payoff is 60, see Figure 3. We write this as

(50/50, T2) ⊩ "The attacker's payoff is 60."

The attacker's mastermind might find this to be the attacker's fault and blame the attacker for the payoff not being at least 98. We capture the attacker's blameworthiness by

(50/50, T2) ⊩ A("The attacker's payoff is less than 98."),

where the blameworthiness modality Aϕ stands for "the attacker is blamable for ϕ". We define blameworthiness using the well-known Frankfurt's principle¹ of alternative possibilities: an agent is blamable for ϕ if ϕ is true and the agent could have prevented ϕ (Frankfurt 1969; Widerker 2017). In our case, the attacker, after learning that the defender's strategy is 50/50, could have targeted Terminal 1, which would increase her payoff to 98, see Figure 3. The principle of alternative possibilities, sometimes referred to as counterfactual possibility (Cushman 2015), is also used to define causality (Lewis 2013; Halpern 2016; Batusov and Soutchanski 2018).

¹This principle has many limitations that (Frankfurt 1969) discusses; for example, when a person is coerced into something.

Next, assume that the defender still chooses the strategy 50/50, but the attacker decided to target Terminal 1. Under this action profile, the payoff of the attacker is 98, see Figure 3. Although this payoff is less than the attacker's payoff of 143 under the action profile (25/75, T1), the attacker cannot be blamed for this:

(50/50, T1) ⊮ A("The attacker's payoff is less than 143."),

because the attacker had no action in game G1 that would guarantee her payoff to be at least 143. At the same time, under the action profile (25/75, T1), the defender is blamable for his payoff being less than −101:

(25/75, T1) ⊩ D("The defender's payoff is less than −101."),

because the defender could have guaranteed his payoff to be at least −101 by choosing mixed strategy 75/25, see Figure 3.
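Because G1 has only six action profiles, these three claims can be checked against Figure 3 by exhaustive search. Below is a small sketch of such a check; it is our illustration, and the helper names `blames_attacker` and `blames_defender` are ours, not the paper's.

```python
# Payoffs from Figure 3: (defender strategy, target) -> (defender, attacker).
payoff = {("75/25", "T1"): (-72, 53),   ("75/25", "T2"): (-101, 86),
          ("50/50", "T1"): (-116, 98),  ("50/50", "T2"): (-74, 60),
          ("25/75", "T1"): (-160, 143), ("25/75", "T2"): (-47, 34)}
STRATEGIES = ["75/25", "50/50", "25/75"]  # defender's actions
TARGETS = ["T1", "T2"]                    # attacker's responses (same for all)

def blames_attacker(d, a, phi):
    """(d, a) satisfies A(phi): phi holds, and some response a' prevents it."""
    return phi(d, a) and any(not phi(d, b) for b in TARGETS)

def blames_defender(d, a, phi):
    """(d, a) satisfies D(phi): phi holds, and some action d' prevents it
    no matter how the attacker responds."""
    return phi(d, a) and any(all(not phi(e, b) for b in TARGETS)
                             for e in STRATEGIES)

assert blames_attacker("50/50", "T2", lambda d, a: payoff[(d, a)][1] < 98)
assert not blames_attacker("50/50", "T1", lambda d, a: payoff[(d, a)][1] < 143)
assert blames_defender("25/75", "T1", lambda d, a: payoff[(d, a)][0] < -101)
```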
Following the principle of alternative possibilities, the blameworthiness modality Dϕ stands for the statement "ϕ is true and the defender had a strategy to prevent it".

In addition to the blameworthiness modalities A and D, in this paper we also consider an auxiliary necessity modality N. Statement Nϕ stands for "ϕ is true under each action profile of the given security game". For example,

(50/50, T1) ⊩ N("The defender's payoff is negative."),

because in game G1 the defender's payoff is always negative. Surprisingly, as we show in Lemma 1, modality D can be expressed through modalities A and N:

Dϕ ≡ ϕ ∧ ¬N(¬ϕ → A¬ϕ).

At the same time, we believe that modality A cannot be expressed through modalities D and N, which reflects the asymmetric nature of security games. In this paper we give a sound and complete axiomatization of the interplay between modalities A and N in security games.

This work is related to our paper on blameworthiness in strategic games (Naumov and Tao 2019b), which proposed a sound and complete axiomatization of the interplay between the necessity modality N and the coalition blameworthiness modality B_C in strategic games. Its definition of blameworthiness is also based on the principle of alternative possibilities. Namely, B_Cϕ stands for the statement "ϕ is true and coalition (a set of agents) C had a strategy to prevent it". Thus, our modalities Aϕ and Dϕ correspond to the modalities B_{attacker}ϕ and B_{defender}ϕ of that paper. In spite of this syntactic similarity between the two works, the resulting axiomatic systems are quite different, which comes from the semantic difference between strategic games and security games: in security games, the attacker knows the defender's strategy, while in a similar strategic game she would not. There are three aspects in which this work differs from (Naumov and Tao 2019b):

1. As stated above, in security games modality D is expressible through modalities A and N, while in strategic games modality B_{defender} is not expressible through modalities B_{attacker} and N.
2. Two of our core axioms for modality A, the Conjunction axiom and the No Blame axiom, capture the asymmetry of information in security games. They are not sound in strategic games. Conversely, the Fairness axiom from (Naumov and Tao 2019b) is not sound in our setting. We further discuss this in the Axioms section.
3. The proof of completeness uses a completely different construction from the one in (Naumov and Tao 2019b). This is discussed in the Completeness section.

## Syntax and Semantics

In this paper we consider a fixed set of propositional variables Prop. The language Φ of our logical system is defined by the grammar:

ϕ := p | ¬ϕ | ϕ → ϕ | Nϕ | Aϕ.

As usual, we assume that connectives ∧, ∨, and ↔ are defined through connectives ¬ and → in the standard way. Next, we formally define security games (or just "games").

**Definition 1.** A game is a tuple (D, {A_d}_{d∈D}, π), where

1. set D is a set of actions of the defender,
2. non-empty set A_d is a set of actions of the attacker in response to the action d ∈ D of the defender,
3. valuation π(p) of a propositional variable p is an arbitrary set of pairs (d, a) such that d ∈ D and a ∈ A_d.

In game G1 from the introduction, the set of actions D of the defender is the three-element set {75/25, 50/50, 25/75}. For each action d ∈ D of the defender in this game, the set of responses A_d is the same two-element set {T1, T2}.
Informally, π(p) describes the set of action profiles (d, a) under which statement p is true. The next definition is the core definition of our paper. Its item 5 defines blameworthiness of the attacker in security games using the principle of alternative possibilities (Frankfurt 1969; Widerker 2017): the attacker is blamable for statement ϕ under action profile (d, a) if ϕ is true under this profile and the attacker had an opportunity to prevent ϕ.

**Definition 2.** For any action d ∈ D of the defender and any response action a ∈ A_d of the attacker in a game (D, {A_d}_{d∈D}, π) and any formula ϕ ∈ Φ, the satisfiability relation (d, a) ⊩ ϕ is defined recursively as follows:

1. (d, a) ⊩ p if (d, a) ∈ π(p), where p ∈ Prop,
2. (d, a) ⊩ ¬ϕ if (d, a) ⊮ ϕ,
3. (d, a) ⊩ ϕ → ψ if (d, a) ⊮ ϕ or (d, a) ⊩ ψ,
4. (d, a) ⊩ Nϕ if (d′, a′) ⊩ ϕ for each action d′ ∈ D of the defender and each response action a′ ∈ A_{d′} of the attacker,
5. (d, a) ⊩ Aϕ if (d, a) ⊩ ϕ and there is a response action a′ ∈ A_d of the attacker such that (d, a′) ⊮ ϕ.

As defined above, language Φ includes the attacker's blameworthiness modality A, but does not include the defender's blameworthiness modality D. If modality D is added to language Φ to form language Φ⁺, then Definition 2 would need to be extended by an additional item:

6. (d, a) ⊩ Dϕ if (d, a) ⊩ ϕ and there is an action d′ ∈ D of the defender such that (d′, a′) ⊮ ϕ for each response action a′ ∈ A_{d′} of the attacker.

As mentioned in the introduction, we do not include modality D in language Φ because it is expressible through modalities A and N. Indeed, the following lemma holds for any formula ϕ ∈ Φ⁺.

**Lemma 1.** (d, a) ⊩ Dϕ iff (d, a) ⊩ ϕ ∧ ¬N(¬ϕ → A¬ϕ).

PROOF. (⇒): Suppose that (d, a) ⊮ ϕ ∧ ¬N(¬ϕ → A¬ϕ). Thus, either (d, a) ⊮ ϕ or (d, a) ⊩ N(¬ϕ → A¬ϕ). In the first case, (d, a) ⊮ Dϕ by item 6 above. Next assume that (d, a) ⊩ N(¬ϕ → A¬ϕ). By item 6, to prove (d, a) ⊮ Dϕ, it suffices to show that for any action d′ ∈ D of the defender there is a response action a′ ∈ A_{d′} of the attacker such that (d′, a′) ⊩ ϕ. Indeed, consider any action d′ ∈ D of the defender. By Definition 1, set A_{d′} is not empty. Let a1 ∈ A_{d′} be an arbitrary response action of the attacker on action d′. Assumption (d, a) ⊩ N(¬ϕ → A¬ϕ), by item 4 of Definition 2, implies (d′, a1) ⊩ ¬ϕ → A¬ϕ. We consider the following two cases separately:

Case I: (d′, a1) ⊩ ϕ. Then, choose the response action a′ to be a1 to have (d′, a′) ⊩ ϕ.

Case II: (d′, a1) ⊮ ϕ. Thus, (d′, a1) ⊩ ¬ϕ by item 2 of Definition 2. Hence, (d′, a1) ⊩ A¬ϕ by item 3 of Definition 2 because (d′, a1) ⊩ ¬ϕ → A¬ϕ. Thus, by item 5 of Definition 2, there is a response action a2 ∈ A_{d′} of the attacker such that (d′, a2) ⊮ ¬ϕ. Hence, (d′, a2) ⊩ ϕ by item 2 of Definition 2. Then, choose the response action a′ to be a2 to have (d′, a′) ⊩ ϕ.

(⇐): Suppose that (d, a) ⊩ ϕ ∧ ¬N(¬ϕ → A¬ϕ). Thus, (d, a) ⊩ ϕ (1) and (d, a) ⊮ N(¬ϕ → A¬ϕ). The latter, by item 4 of Definition 2, implies that there is an action d′ ∈ D of the defender and a response action a′ ∈ A_{d′} of the attacker such that (d′, a′) ⊮ ¬ϕ → A¬ϕ. Thus, (d′, a′) ⊩ ¬ϕ and (d′, a′) ⊮ A¬ϕ by item 3 of Definition 2. Then, (d′, a″) ⊩ ¬ϕ for each response action a″ ∈ A_{d′} of the attacker, by item 5 of Definition 2. Thus, (d′, a″) ⊮ ϕ for each response action a″ ∈ A_{d′} of the attacker, by item 2 of Definition 2. Hence, there exists an action d′ ∈ D of the defender such that (d′, a″) ⊮ ϕ for each response action a″ ∈ A_{d′} of the attacker. Therefore, statement (1) implies (d, a) ⊩ Dϕ by item 6 above.
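Definition 2, together with item 6, translates directly into a recursive model checker over finite games. The following sketch is our illustration (the tuple-based formula encoding is ours, not the paper's); it verifies the equivalence of Lemma 1 on game G1 for the instance where ϕ is a variable p interpreted as "the defender's expected payoff is below −101":

```python
# Formulas are nested tuples: ("p",), ("not", f), ("imp", f, g), ("N", f),
# ("A", f), ("D", f).  This ad hoc encoding is ours, not the paper's.
D_ACTS = ["75/25", "50/50", "25/75"]
A_ACTS = {d: ["T1", "T2"] for d in D_ACTS}          # the sets A_d
PI = {"p": {("50/50", "T1"), ("25/75", "T1")}}      # profiles where p holds:
# exactly where the defender's expected payoff (-116, -160) is below -101.

def sat(d, a, f):                                   # Definition 2, items 1-6
    tag = f[0]
    if tag == "not": return not sat(d, a, f[1])
    if tag == "imp": return (not sat(d, a, f[1])) or sat(d, a, f[2])
    if tag == "N":   return all(sat(e, b, f[1])
                                for e in D_ACTS for b in A_ACTS[e])
    if tag == "A":   return sat(d, a, f[1]) and \
                            any(not sat(d, b, f[1]) for b in A_ACTS[d])
    if tag == "D":   return sat(d, a, f[1]) and \
                            any(all(not sat(e, b, f[1]) for b in A_ACTS[e])
                                for e in D_ACTS)
    return (d, a) in PI[tag]                        # propositional variable

p = ("p",)
lhs = ("D", p)
# p ∧ ¬N(¬p → A¬p), written with ¬ and → only as ¬(p → N(¬p → A¬p)):
rhs = ("not", ("imp", p, ("N", ("imp", ("not", p), ("A", ("not", p))))))
assert all(sat(d, a, lhs) == sat(d, a, rhs)         # Lemma 1, instance ϕ = p
           for d in D_ACTS for a in A_ACTS[d])
```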
## Axioms

In addition to the propositional tautologies in language Φ, our logical system contains the following axioms:

1. Truth: □ϕ → ϕ, where □ ∈ {N, A},
2. Negative Introspection: ¬Nϕ → N¬Nϕ,
3. Distributivity: N(ϕ → ψ) → (Nϕ → Nψ),
4. Unavoidability: Nϕ → ¬Aϕ,
5. Strict Conditional: N(ϕ → ψ) → (Aψ → (ϕ → Aϕ)),
6. Conjunction: A(ϕ ∧ ψ) → (Aϕ ∨ Aψ),
7. No Blame: ¬A(ϕ → Aϕ).

The Truth (for N), the Negative Introspection, and the Distributivity axioms are the well-known S5 properties of the necessity modality N. The Truth axiom for modality A states that the attacker can only be blamed for something true. The Unavoidability axiom states that the attacker cannot be blamed for something that could not be prevented. The Strict Conditional axiom states that if statement ψ is true under each action profile where ϕ is true, the attacker is blamable for ψ, and ϕ is true, then the attacker is also blamable for ϕ. Indeed, because statement ψ is true under each action profile where ϕ is true, any action of the attacker that prevents ψ also prevents ϕ. Hence, if the attacker is blamable for ψ and ϕ is true, then the attacker is also blamable for ϕ.

The Truth, the Unavoidability, and the Strict Conditional axioms hold not only for modality A, but for modality D as well. These axioms are also true in strategic games. The Conjunction and the No Blame axioms are the key axioms of our logical system. They capture the asymmetry of information in security games. Both of these axioms are true for the attacker's blameworthiness modality A; their soundness is proven in (Naumov and Tao 2019a). However, as Lemma 3 and Lemma 4 show, they are not true for the defender's blameworthiness modality D in game G2 depicted in Figure 4. Lemma 2 is an auxiliary statement about game G2 used in the proofs of these two lemmas.

| G2 | a1   | a2 |
|----|------|----|
| d1 | p, q | p  |
| d2 | p    | q  |

Figure 4: Game G2 (each cell lists the propositional variables true under that action profile), where (d1, a1) ⊮ D(p ∧ q) → (Dp ∨ Dq), (d2, a2) ⊩ D(p → Dp), and (d2, a1) ⊮ Ap → N(p → Ap).

**Lemma 2.** (d, a) ⊮ Dp and (d, a) ⊮ Dq for each action d of the defender and each response action a of the attacker in game G2.

PROOF. Note that (d1, a1) ⊩ p and (d2, a1) ⊩ p, see Figure 4. Thus, for each action d′ of the defender there is a response action a′ of the attacker such that (d′, a′) ⊩ p. Hence, (d, a) ⊮ Dp by item 6 after Definition 2. Similarly, (d1, a1) ⊩ q and (d2, a2) ⊩ q imply that (d, a) ⊮ Dq.

**Lemma 3.** (d1, a1) ⊮ D(p ∧ q) → (Dp ∨ Dq).

PROOF. By Lemma 2, it suffices to show that (d1, a1) ⊩ D(p ∧ q). Indeed, observe that (d2, a1) ⊮ p ∧ q and (d2, a2) ⊮ p ∧ q, see Figure 4. Thus, (d2, a) ⊮ p ∧ q for each response action a of the attacker on action d2 of the defender. Also, (d1, a1) ⊩ p ∧ q, see Figure 4. Therefore, (d1, a1) ⊩ D(p ∧ q) by item 6 after Definition 2.

**Lemma 4.** (d2, a2) ⊩ D(p → Dp).

PROOF. (d1, a1) ⊮ Dp and (d1, a2) ⊮ Dp by Lemma 2. Thus, (d1, a1) ⊮ p → Dp and (d1, a2) ⊮ p → Dp by item 3 of Definition 2 and because (d1, a1) ⊩ p and (d1, a2) ⊩ p, see Figure 4. Thus, (d1, a) ⊮ p → Dp for each response action a of the attacker on action d1 of the defender. At the same time, (d2, a2) ⊩ p → Dp by item 3 of Definition 2 because (d2, a2) ⊮ p, see Figure 4. Therefore, (d2, a2) ⊩ D(p → Dp) by item 6 after Definition 2.

Informally, the Conjunction and the No Blame axioms capture properties of the asymmetry of information in security games, and thus they cannot be true in strategic games (Naumov and Tao 2019b), where the information is symmetric. A strategic game in which these axioms fail can be constructed by modifying the security game G2 into a strategic game.

The logical system for blameworthiness in strategic games (Naumov and Tao 2019b) includes the Fairness axiom: B_Cϕ → N(ϕ → B_Cϕ). In the next two lemmas we show that in the case of security games this axiom is not sound for modality A, but is sound for modality D.
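Game G2 is small enough to re-check these counterexamples by brute force. The sketch below is our illustration (the valuation is the one shown in Figure 4); it verifies Lemma 3, Lemma 4, and the failure of the Fairness axiom for A established in Lemma 5 below.

```python
D_ACTS = ["d1", "d2"]
A_ACTS = {"d1": ["a1", "a2"], "d2": ["a1", "a2"]}
PI = {"p": {("d1", "a1"), ("d1", "a2"), ("d2", "a1")},   # valuation of
      "q": {("d1", "a1"), ("d2", "a2")}}                 # Figure 4

def sat(d, a, f):
    tag = f[0]
    if tag == "not": return not sat(d, a, f[1])
    if tag == "imp": return (not sat(d, a, f[1])) or sat(d, a, f[2])
    if tag == "and": return sat(d, a, f[1]) and sat(d, a, f[2])
    if tag == "or":  return sat(d, a, f[1]) or sat(d, a, f[2])
    if tag == "N":   return all(sat(e, b, f[1])
                                for e in D_ACTS for b in A_ACTS[e])
    if tag == "A":   return sat(d, a, f[1]) and \
                            any(not sat(d, b, f[1]) for b in A_ACTS[d])
    if tag == "D":   return sat(d, a, f[1]) and \
                            any(all(not sat(e, b, f[1]) for b in A_ACTS[e])
                                for e in D_ACTS)
    return (d, a) in PI[tag]

p, q = ("p",), ("q",)
# Lemma 3: the Conjunction axiom fails for D at (d1, a1).
assert not sat("d1", "a1", ("imp", ("D", ("and", p, q)),
                                   ("or", ("D", p), ("D", q))))
# Lemma 4: the No Blame axiom fails for D, since D(p -> Dp) holds at (d2, a2).
assert sat("d2", "a2", ("D", ("imp", p, ("D", p))))
# Lemma 5 (below): the Fairness axiom fails for A at (d2, a1).
assert not sat("d2", "a1", ("imp", ("A", p), ("N", ("imp", p, ("A", p)))))
```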
**Lemma 5.** (d2, a1) ⊮ Ap → N(p → Ap) in game G2.

PROOF. Note that (d2, a1) ⊩ p and (d2, a2) ⊮ p, see Figure 4. Thus, (d2, a1) ⊩ Ap by item 5 of Definition 2. Suppose that (d2, a1) ⊩ Ap → N(p → Ap). Hence, (d2, a1) ⊩ N(p → Ap) by item 3 of Definition 2. Thus, (d1, a1) ⊩ p → Ap by item 4 of Definition 2. Note that (d1, a1) ⊩ p, see Figure 4. Hence, (d1, a1) ⊩ Ap by item 3 of Definition 2. Then, by item 5 of Definition 2, there must exist a response action a′ ∈ A_{d1} of the attacker such that (d1, a′) ⊮ p. However, such an action a′ does not exist because (d1, a1) ⊩ p and (d1, a2) ⊩ p, see Figure 4.

**Lemma 6.** (d, a) ⊩ Dϕ → N(ϕ → Dϕ) for any formula ϕ ∈ Φ⁺, any defender's action d ∈ D, and any attacker's response action a ∈ A_d in an arbitrary security game (D, {A_d}_{d∈D}, π).

PROOF. Suppose that (d, a) ⊮ Dϕ → N(ϕ → Dϕ). Thus, (d, a) ⊩ Dϕ and (d, a) ⊮ N(ϕ → Dϕ) by item 3 of Definition 2. By item 6 after Definition 2, statement (d, a) ⊩ Dϕ implies that (d, a) ⊩ ϕ. By item 4 of Definition 2, statement (d, a) ⊮ N(ϕ → Dϕ) implies that there is an action d1 ∈ D of the defender and a response action a1 ∈ A_{d1} of the attacker such that (d1, a1) ⊮ ϕ → Dϕ. Thus, (d1, a1) ⊩ ϕ and (d1, a1) ⊮ Dϕ by item 3 of Definition 2. Hence, by item 6 after Definition 2, for each action d′ ∈ D of the defender there is a response action a′ ∈ A_{d′} of the attacker such that (d′, a′) ⊩ ϕ. Then, (d, a) ⊮ Dϕ by item 6 after Definition 2 because no action of the defender prevents ϕ, which contradicts the statement (d, a) ⊩ Dϕ above.

We write ⊢ ϕ if formula ϕ is provable from the axioms of our system using the Modus Ponens inference rule (from ϕ and ϕ → ψ, infer ψ) and the Necessitation inference rule (from ϕ, infer Nϕ). We write X ⊢ ϕ if formula ϕ is provable from the theorems of our logical system and an additional set of axioms X using only the Modus Ponens inference rule. We conclude this section with an example of a formal proof in our logical system. The lemma below is used later in the proof of the completeness.

**Lemma 7.** If ⊢ ϕ ↔ ψ, then ⊢ Aϕ → Aψ.

PROOF. By the Strict Conditional axiom, ⊢ N(ψ → ϕ) → (Aϕ → (ψ → Aψ)). Assumption ⊢ ϕ ↔ ψ implies ⊢ ψ → ϕ by the laws of propositional reasoning. Thus, ⊢ N(ψ → ϕ) by the Necessitation inference rule. Hence, by the Modus Ponens rule, ⊢ Aϕ → (ψ → Aψ). Thus, by the laws of propositional reasoning,

⊢ (Aϕ → ψ) → (Aϕ → Aψ). (2)

Note that ⊢ Aϕ → ϕ by the Truth axiom. At the same time, ⊢ ϕ → ψ by the assumption of the lemma. Thus, by the laws of propositional reasoning, ⊢ Aϕ → ψ. Therefore, ⊢ Aϕ → Aψ by the Modus Ponens inference rule from statement (2).

The proof of soundness is available in the full version of this paper (Naumov and Tao 2019a).

## Completeness

In this section we prove the completeness of our logical system in three steps. First, we introduce an auxiliary modality R as an abbreviation definable through modality A. Next, we define a canonical security game and prove its basic property. Finally, we state and prove the strong completeness theorem for our logical system.

### Preliminaries

Let Rϕ be an abbreviation for ϕ ∧ ¬Aϕ. Note that Rϕ stands for the statement "ϕ is true, but the attacker cannot be blamed for it". In other words, Rϕ means that the defender's action unavoidably led to ϕ being true. This modality is not present in (Naumov and Tao 2019b). In the context of STIT logic, but not in the context of security games, a similar single-agent modality was studied in (Xu 1998). The same modality for coalitions was investigated in (Broersen, Herzig, and Troquard 2009). Below we prove the key properties of modality R that are used later in the proof of the completeness.

**Lemma 8.** ⊢ Nϕ → Rϕ.

PROOF. By the Unavoidability axiom, ⊢ Nϕ → ¬Aϕ. At the same time, ⊢ Nϕ → ϕ by the Truth axiom. Hence, by propositional reasoning, ⊢ Nϕ → (ϕ ∧ ¬Aϕ). Therefore, ⊢ Nϕ → Rϕ by the definition of modality R.
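Semantically, the abbreviation unfolds as follows: when ϕ is true at (d, a), the conjunct ¬Aϕ says that no response of the attacker falsifies ϕ, so Rϕ holds at (d, a) exactly when ϕ holds under every response to the defender's action d. A quick sanity check of this reading on game G1 (our own sketch; the payoff table restates Figure 3):

```python
# Sketch: unfolding R(phi), defined as "phi and not A(phi)", on game G1.
payoff = {("75/25", "T1"): (-72, 53),   ("75/25", "T2"): (-101, 86),
          ("50/50", "T1"): (-116, 98),  ("50/50", "T2"): (-74, 60),
          ("25/75", "T1"): (-160, 143), ("25/75", "T2"): (-47, 34)}
A = ["T1", "T2"]

def holds_A(d, a, phi):                      # item 5 of Definition 2
    return phi(d, a) and any(not phi(d, b) for b in A)

def holds_R(d, a, phi):                      # R(phi) = phi and not A(phi)
    return phi(d, a) and not holds_A(d, a, phi)

phi = lambda d, a: payoff[(d, a)][0] < -70   # "the defender's payoff < -70"
for d in ("75/25", "50/50", "25/75"):
    for a in A:
        # R(phi) holds iff phi holds under every response to d.
        assert holds_R(d, a, phi) == all(phi(d, b) for b in A)
```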
The next four lemmas show that R is an S5 modality.

**Lemma 9.** The inference rule "from ϕ infer Rϕ" is derivable.

PROOF. Suppose that ⊢ ϕ. Thus, ⊢ Nϕ by the Necessitation inference rule. Therefore, ⊢ Rϕ by Lemma 8 and the Modus Ponens inference rule.

**Lemma 10.** ⊢ Rϕ → ϕ.

PROOF. Note that formula (ϕ ∧ ¬Aϕ) → ϕ is a propositional tautology. Thus, ⊢ Rϕ → ϕ by the definition of the modality R.

**Lemma 11.** ⊢ R(ϕ → ψ) → (Rϕ → Rψ).

PROOF. Note that the following formula is a propositional tautology:

((ϕ → ψ) ∧ ¬A(ϕ → ψ)) → ((ϕ ∧ ¬Aϕ) → (¬A(ϕ → ψ) ∧ ¬Aϕ)).

Thus, it follows from the definition of the modality R that ⊢ R(ϕ → ψ) → (Rϕ → (¬A(ϕ → ψ) ∧ ¬Aϕ)). At the same time, formula (¬A(ϕ → ψ) ∧ ¬Aϕ) → ¬A((ϕ → ψ) ∧ ϕ) is a contrapositive of the Conjunction axiom. Thus, by the laws of propositional reasoning,

⊢ R(ϕ → ψ) → (Rϕ → ¬A((ϕ → ψ) ∧ ϕ)). (3)

Next, note that the following formula is also a propositional tautology: ((ϕ → ψ) ∧ ϕ) → ψ. Hence, by the Necessitation inference rule, ⊢ N(((ϕ → ψ) ∧ ϕ) → ψ). Thus, by the Strict Conditional axiom and the Modus Ponens inference rule, ⊢ Aψ → ((ϕ → ψ) ∧ ϕ → A((ϕ → ψ) ∧ ϕ)). Then, by the laws of propositional reasoning, ⊢ ¬A((ϕ → ψ) ∧ ϕ) → ((ϕ → ψ) ∧ ϕ → ¬Aψ). Hence, by propositional reasoning using statement (3),

⊢ R(ϕ → ψ) → (Rϕ → ((ϕ → ψ) ∧ ϕ → ¬Aψ)). (4)

Note that the following formula is a propositional tautology:

((ϕ → ψ) ∧ ¬A(ϕ → ψ)) → ((ϕ ∧ ¬Aϕ) → ((ϕ → ψ) ∧ ϕ)).

Thus, it follows from the definition of the modality R that

⊢ R(ϕ → ψ) → (Rϕ → ((ϕ → ψ) ∧ ϕ)). (5)

Then, by propositional reasoning using statement (4),

⊢ R(ϕ → ψ) → (Rϕ → ¬Aψ). (6)

Additionally, note that ((ϕ → ψ) ∧ ϕ) → ψ is a propositional tautology. Hence, statement (5) also implies ⊢ R(ϕ → ψ) → (Rϕ → ψ). Thus, by propositional reasoning using statement (6), ⊢ R(ϕ → ψ) → (Rϕ → (ψ ∧ ¬Aψ)). Therefore, ⊢ R(ϕ → ψ) → (Rϕ → Rψ) by the definition of the modality R.

**Lemma 12.** ⊢ ¬Rϕ → R¬Rϕ.

PROOF. Note that ¬(ϕ ∧ ¬Aϕ) ↔ (ϕ → Aϕ) is a propositional tautology. Thus, ⊢ A¬(ϕ ∧ ¬Aϕ) → A(ϕ → Aϕ) by Lemma 7. Hence, ⊢ ¬A(ϕ → Aϕ) → ¬A¬(ϕ ∧ ¬Aϕ) by contraposition. Then, ⊢ ¬A¬(ϕ ∧ ¬Aϕ) by the No Blame axiom and the Modus Ponens inference rule. Thus, by the laws of propositional reasoning,

⊢ ¬(ϕ ∧ ¬Aϕ) → (¬(ϕ ∧ ¬Aϕ) ∧ ¬A¬(ϕ ∧ ¬Aϕ)).

Recall that Rϕ is an abbreviation for ϕ ∧ ¬Aϕ. Then, ⊢ ¬Rϕ → (¬Rϕ ∧ ¬A¬Rϕ). Therefore, ⊢ ¬Rϕ → R¬Rϕ again by the definition of R.

The next two lemmas capture well-known properties of S5 modalities. For proofs, see (Naumov and Tao 2019a).

**Lemma 13.** If ϕ1, . . . , ϕn ⊢ ψ, then □ϕ1, . . . , □ϕn ⊢ □ψ, where □ is either modality N or modality R.

**Lemma 14.** ⊢ □ϕ → □□ϕ, where □ is either modality N or modality R.

### Canonical Security Game

We define the canonical game G(X) = (Ω, {A_δ}_{δ∈Ω}, π) for each maximal consistent set of formulae X.

**Definition 3.** Ω is the set of all maximal consistent sets of formulae ω such that {ϕ ∈ Φ | Nϕ ∈ X} ⊆ ω.

**Definition 4.** ω ∼ ω′ if Rϕ ∈ ω ↔ Rϕ ∈ ω′ for each formula ϕ ∈ Φ.

Note that ∼ is an equivalence relation on set Ω. The set A_δ of possible responses by the attacker to an action δ ∈ Ω of the defender is the (nonempty) equivalence class of element δ with respect to this equivalence relation:

**Definition 5.** A_δ = [δ].

Thus, each defender's action δ ∈ Ω and each attacker's response ω ∈ [δ] is a maximal consistent set of formulae. This is significantly different from (Naumov and Tao 2019b), where actions of all agents are formulae.

**Definition 6.** π(p) = {(δ, ω) ∈ Ω × Ω | ω ∈ A_δ, p ∈ ω}.

This concludes the definition of the canonical game G(X). As usual, at the core of the proof of completeness is a truth lemma (or an induction lemma), which in our case is Lemma 19. The next four lemmas are auxiliary statements used in the induction step of the proof of Lemma 19.
**Lemma 15.** For any action δ ∈ Ω of the defender, any response action ω ∈ [δ] of the attacker, and any formula Aϕ ∈ ω, we have (i) ϕ ∈ ω and (ii) there is a response action ω′ ∈ [δ] such that ϕ ∉ ω′.

PROOF. Assumption Aϕ ∈ ω implies that ω ⊢ ϕ by the Truth axiom and the Modus Ponens inference rule. Thus, ϕ ∈ ω because set ω is maximal. This concludes the proof of the first statement. To prove the second statement, consider the set of formulae

Y = {¬ϕ} ∪ {ψ | Rψ ∈ ω} ∪ {χ | Nχ ∈ ω}. (7)

**Claim 1.** Set Y is consistent.

PROOF OF CLAIM. Suppose the opposite. Thus, there are formulae

Rψ1, . . . , Rψk, Nχ1, . . . , Nχn ∈ ω (8)

such that ψ1, . . . , ψk, χ1, . . . , χn ⊢ ϕ. Hence, by Lemma 13, Rψ1, . . . , Rψk, Rχ1, . . . , Rχn ⊢ Rϕ. Then, by Lemma 8 and the Modus Ponens inference rule, Rψ1, . . . , Rψk, Nχ1, . . . , Nχn ⊢ Rϕ. Thus, ω ⊢ Rϕ by statement (8). Hence, ω ⊢ ϕ ∧ ¬Aϕ by the definition of the modality R. Then, ω ⊢ ¬Aϕ by the laws of propositional reasoning, which contradicts the assumption Aϕ ∈ ω of the lemma because set ω is consistent.

Let set ω′ be any maximal consistent extension of set Y. Then, ¬ϕ ∈ ω′. Thus, ϕ ∉ ω′ because set ω′ is consistent.

**Claim 2.** ω′ ∈ Ω.

PROOF OF CLAIM. Consider any formula Nχ ∈ X. By Definition 3, it suffices to show that χ ∈ ω′. Indeed, assumption Nχ ∈ X implies that X ⊢ NNχ by Lemma 14. Thus, NNχ ∈ X because set X is maximal. Then, Nχ ∈ ω by Definition 3 and the assumption ω ∈ [δ] ⊆ Ω of the lemma. Hence, χ ∈ Y ⊆ ω′ by equation (7) and the choice of set ω′.

**Claim 3.** ω′ ∈ [δ].

PROOF OF CLAIM. Recall that ω ∈ [δ] by the assumption of the lemma. Thus, by Claim 2, it suffices to show that ω ∼ ω′. Hence, by Definition 4, it suffices to prove that Rψ ∈ ω iff Rψ ∈ ω′ for each formula ψ ∈ Φ.

If Rψ ∈ ω, then ω ⊢ RRψ by Lemma 14 and the Modus Ponens inference rule. Hence, RRψ ∈ ω because set ω is maximal. Thus, Rψ ∈ Y ⊆ ω′ by equation (7) and the choice of set ω′.

Suppose that Rψ ∉ ω. Then, ¬Rψ ∈ ω because set ω is maximal. Thus, ω ⊢ R¬Rψ by Lemma 12 and the Modus Ponens inference rule. Hence, R¬Rψ ∈ ω because set ω is maximal. Thus, ¬Rψ ∈ Y ⊆ ω′ by equation (7) and the choice of set ω′. Therefore, Rψ ∉ ω′ because set ω′ is consistent.

This concludes the proof of the lemma.

**Lemma 16.** For any action δ ∈ Ω of the defender, any response action ω ∈ [δ] of the attacker, and any formula ϕ ∈ Φ, if (ϕ ∧ ¬Aϕ) ∈ ω, then ϕ ∈ ω′ for each ω′ ∈ [δ].

PROOF. Assumption (ϕ ∧ ¬Aϕ) ∈ ω implies Rϕ ∈ ω by the definition of the modality R. Note that ω ∼ ω′ because ω, ω′ ∈ [δ]. Thus, Rϕ ∈ ω′ by Definition 4. Hence, ω′ ⊢ ϕ by Lemma 10 and the Modus Ponens inference rule. Therefore, ϕ ∈ ω′ because set ω′ is maximal.

**Lemma 17.** For any actions ω, ω′ ∈ Ω, if Nϕ ∈ ω, then ϕ ∈ ω′.

PROOF. Suppose that ϕ ∉ ω′. Hence, Nϕ ∉ X by Definition 3 and the assumption ω′ ∈ Ω. Thus, ¬Nϕ ∈ X because set X is maximal. Then, X ⊢ N¬Nϕ by the Negative Introspection axiom and the Modus Ponens inference rule. Hence, N¬Nϕ ∈ X again because set X is maximal. Thus, ¬Nϕ ∈ ω by Definition 3 and the assumption ω ∈ Ω. Therefore, Nϕ ∉ ω because set ω is consistent.

**Lemma 18.** For any action ω ∈ Ω and any formula ¬Nϕ ∈ ω, there is an action ω′ ∈ Ω such that ϕ ∉ ω′.

PROOF. Consider the set of formulae

Y = {¬ϕ} ∪ {ψ | Nψ ∈ ω}. (9)

**Claim 4.** Set Y is consistent.

PROOF OF CLAIM. Suppose the opposite. Thus, there are formulae

Nψ1, . . . , Nψn ∈ ω (10)

such that ψ1, . . . , ψn ⊢ ϕ. Hence, Nψ1, . . . , Nψn ⊢ Nϕ by Lemma 13. Thus, ω ⊢ Nϕ by statement (10), which contradicts the assumption ¬Nϕ ∈ ω of the lemma because set ω is consistent.

Let set ω′ be any maximal consistent extension of set Y. Then, ¬ϕ ∈ ω′. Thus, ϕ ∉ ω′ because set ω′ is consistent.

**Claim 5.** ω′ ∈ Ω.

PROOF OF CLAIM. Consider any formula Nψ ∈ X. By Definition 3, it suffices to show that ψ ∈ ω′. Indeed, assumption Nψ ∈ X implies that X ⊢ NNψ by Lemma 14. Thus, NNψ ∈ X because set X is maximal. Then, Nψ ∈ ω by Definition 3 and the assumption ω ∈ Ω of the lemma. Therefore, ψ ∈ Y ⊆ ω′ by equation (9) and the choice of set ω′.

This concludes the proof of the lemma.
**Lemma 19 (truth lemma).** For each formula ϕ ∈ Φ, each action δ ∈ Ω of the defender, and each response action ω ∈ [δ] of the attacker, (δ, ω) ⊩ ϕ iff ϕ ∈ ω.

PROOF. We prove the lemma by structural induction on formula ϕ. The case when formula ϕ is a propositional variable follows from Definition 2 and Definition 6. The cases when formula ϕ is a negation or an implication follow from Definition 2 and the assumption of the maximality and the consistency of set ω in the standard way.

Suppose that formula ϕ has the form Aψ.

(⇒): Assume that Aψ ∉ ω. Hence, ¬Aψ ∈ ω because set ω is maximal. We consider the following two cases separately:

Case I: (ψ → Aψ) ∈ ω. Thus, statement ¬Aψ ∈ ω implies ω ⊢ ¬ψ by the contrapositive form of the Modus Ponens inference rule. Hence, ¬ψ ∈ ω because set ω is maximal, and so ψ ∉ ω because set ω is consistent. Then, (δ, ω) ⊮ ψ by the induction hypothesis. Therefore, (δ, ω) ⊮ Aψ by item 5 of Definition 2.

Case II: (ψ → Aψ) ∉ ω. Hence, (ψ ∧ ¬Aψ) ∈ ω because set ω is maximal and consistent. Thus, ψ ∈ ω′ for each action ω′ ∈ [δ], by Lemma 16. Then, by the induction hypothesis, (δ, ω′) ⊩ ψ for each response action ω′ ∈ [δ] of the attacker on action δ ∈ Ω of the defender. Therefore, (δ, ω) ⊮ Aψ by item 5 of Definition 2.

(⇐): Assume that Aψ ∈ ω. Thus, by Lemma 15, we have (i) ψ ∈ ω and (ii) there is a response action ω′ ∈ [δ] such that ψ ∉ ω′. Hence, by the induction hypothesis, (i) (δ, ω) ⊩ ψ and (ii) there is a response action ω′ ∈ [δ] of the attacker such that (δ, ω′) ⊮ ψ. Therefore, (δ, ω) ⊩ Aψ by item 5 of Definition 2.

Next, assume formula ϕ has the form Nψ.

(⇒): Let Nψ ∉ ω. Thus, ¬Nψ ∈ ω because set ω is maximal. Hence, by Lemma 18, there is an action ω′ ∈ Ω such that ψ ∉ ω′. Note that ω′ ∈ [ω′] because [ω′] is the equivalence class containing ω′. Thus, (ω′, ω′) ⊮ ψ by the induction hypothesis. Therefore, (δ, ω) ⊮ Nψ by item 4 of Definition 2.

(⇐): Suppose that Nψ ∈ ω. Thus, ψ ∈ ω′ for each action ω′ ∈ Ω by Lemma 17. Hence, by the induction hypothesis, (δ′, ω′) ⊩ ψ for each action δ′ ∈ Ω of the defender and each response action ω′ ∈ [δ′] of the attacker. Therefore, (δ, ω) ⊩ Nψ by item 4 of Definition 2.

Recall that the canonical game G(X) is defined for an arbitrary maximal consistent set of formulae X.

**Lemma 20.** X ∈ Ω.

PROOF. Consider any formula Nϕ ∈ X. By Definition 3, it suffices to show that ϕ ∈ X. Indeed, assumption Nϕ ∈ X implies X ⊢ ϕ by the Truth axiom and the Modus Ponens inference rule. Thus, ϕ ∈ X because set X is maximal.

### Strong Completeness Theorem

**Theorem 1.** If X0 ⊬ ϕ, then there is an action d ∈ D of the defender and a response action a ∈ A_d of the attacker in a game (D, {A_d}_{d∈D}, π) such that (d, a) ⊩ χ for each formula χ ∈ X0 and (d, a) ⊮ ϕ.

PROOF. Let the set of formulae X ⊆ Φ be any maximal consistent extension of set X0 ∪ {¬ϕ}. Then, ϕ ∉ X because set X is consistent. Consider the canonical game G(X) = (Ω, {A_δ}_{δ∈Ω}, π). Then, X ∈ Ω by Lemma 20. Also, X ∈ [X] = A_X because set [X] is an equivalence class and because of Definition 5. Therefore, (X, X) ⊩ χ for each formula χ ∈ X0 ⊆ X and (X, X) ⊮ ϕ by Lemma 19.
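Although the soundness proof is deferred to (Naumov and Tao 2019a), soundness can at least be spot-checked mechanically: each axiom schema, instantiated with the variables p and q, can be evaluated at every action profile of randomly generated games. The sketch below is our own illustration, not part of the paper.

```python
import random

def random_game():
    d_acts = ["d1", "d2"]
    a_acts = {d: ["a1", "a2", "a3"][:random.randint(1, 3)] for d in d_acts}
    profiles = [(d, a) for d in d_acts for a in a_acts[d]]
    pi = {v: {pr for pr in profiles if random.random() < 0.5} for v in "pq"}
    return d_acts, a_acts, pi, profiles

def sat(game, d, a, f):
    d_acts, a_acts, pi, _ = game
    tag = f[0]
    if tag == "not": return not sat(game, d, a, f[1])
    if tag == "imp": return (not sat(game, d, a, f[1])) or sat(game, d, a, f[2])
    if tag == "and": return sat(game, d, a, f[1]) and sat(game, d, a, f[2])
    if tag == "or":  return sat(game, d, a, f[1]) or sat(game, d, a, f[2])
    if tag == "N":   return all(sat(game, e, b, f[1])
                                for e in d_acts for b in a_acts[e])
    if tag == "A":   return sat(game, d, a, f[1]) and \
                            any(not sat(game, d, b, f[1]) for b in a_acts[d])
    return (d, a) in pi[tag]

p, q = ("p",), ("q",)
axioms = [
    ("imp", ("N", p), p), ("imp", ("A", p), p),                 # Truth
    ("imp", ("not", ("N", p)), ("N", ("not", ("N", p)))),       # Negative Intr.
    ("imp", ("N", ("imp", p, q)), ("imp", ("N", p), ("N", q))), # Distributivity
    ("imp", ("N", p), ("not", ("A", p))),                       # Unavoidability
    ("imp", ("N", ("imp", p, q)),
            ("imp", ("A", q), ("imp", p, ("A", p)))),           # Strict Cond.
    ("imp", ("A", ("and", p, q)), ("or", ("A", p), ("A", q))),  # Conjunction
    ("not", ("A", ("imp", p, ("A", p)))),                       # No Blame
]
for _ in range(1000):
    g = random_game()
    assert all(sat(g, d, a, ax) for d, a in g[3] for ax in axioms)
```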
## Conclusion

In this paper we gave a sound and complete axiomatic system that describes the properties of blameworthiness in security games. A natural next step is to generalize this work to arbitrary extensive form games. The Conjunction and the No Blame axioms in this paper are specific to security games and are not sound for arbitrary extensive form games. As we have seen in Lemma 3 and Lemma 4, these axioms are already not sound for the player who makes the first move in a security game. Although these axioms are sound for the player making the second move in security games, they are not sound for the second player in an arbitrary extensive form game. Consider, for example, game G3 depicted in Figure 5. In this game, (d1, a2) ⊩ A(p ∧ q) because formula p ∧ q is true under the action profile (d1, a2), but the second player could have prevented it by using action a1 instead of a2. At the same time, (d1, a2) ⊮ Ap ∨ Aq because the second player has neither a strategy that would prevent p nor a strategy that would prevent q. This is a counterexample to the Conjunction axiom.

Figure 5: Game G3, where (d1, a2) ⊮ A(p ∧ q) → (Ap ∨ Aq) and (d1, a1, d1) ⊩ A(p → Ap).

The game G3 also provides a counterexample to the No Blame axiom: (d1, a1, d1) ⊩ A(p → Ap). Indeed, (d1, a1, d1) ⊩ p → Ap because (d1, a1, d1) ⊮ p. At the same time, (d1, a2) ⊮ p → Ap. Thus, the second player could have prevented p → Ap by using a2 instead of a1.

In addition to finding the right set of axioms, proving a completeness theorem would also require recovering the structure of the canonical game tree from a maximal consistent set of formulae. Finding the right set of axioms sound for all extensive form games and proving their completeness remains an open problem.

## References

An, B.; Tambe, M.; and Sinha, A. 2016. Stackelberg security games (SSG): Basics and application overview. In Improving Homeland Security Decisions. Cambridge University Press.

Batusov, V., and Soutchanski, M. 2018. Situation calculus semantics for actual causality. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18).

Broersen, J.; Herzig, A.; and Troquard, N. 2009. What groups do, can do, and know they can do: An analysis in normal modal logics. Journal of Applied Non-Classical Logics 19(3):261-289.

Brown, M.; Sinha, A.; Schlenker, A.; and Tambe, M. 2016. One size does not fit all: A game-theoretic approach for dynamically and effectively screening for threats. In Thirtieth AAAI Conference on Artificial Intelligence.

Cushman, F. 2015. Deconstructing intent to reconstruct morality. Current Opinion in Psychology 6:97-103.

Fang, F.; Stone, P.; and Tambe, M. 2015. When security games go green: Designing defender strategies to prevent poaching and illegal fishing. In Twenty-Fourth International Joint Conference on Artificial Intelligence.

Frankfurt, H. G. 1969. Alternate possibilities and moral responsibility. The Journal of Philosophy 66(23):829-839.

Halpern, J. Y. 2016. Actual Causality. MIT Press.

Jain, M.; Tsai, J.; Pita, J.; Kiekintveld, C.; Rathi, S.; Tambe, M.; and Ordóñez, F. 2010. Software assistants for randomized patrol planning for the LAX airport police and the Federal Air Marshal Service. Interfaces 40(4):267-290.

Lewis, D. 2013. Counterfactuals. John Wiley & Sons.

Naumov, P., and Tao, J. 2019a. Blameworthiness in security games. arXiv:1910.08647.

Naumov, P., and Tao, J. 2019b. Blameworthiness in strategic games. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).

Pita, J.; Jain, M.; Marecki, J.; Ordóñez, F.; Portway, C.; Tambe, M.; Western, C.; Paruchuri, P.; and Kraus, S. 2008. Deployed ARMOR protection: The application of a game-theoretic model for security at the Los Angeles International Airport. In Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems: Industrial Track, 125-132. International Foundation for Autonomous Agents and Multiagent Systems.

Sinha, A.; Fang, F.; An, B.; Kiekintveld, C.; and Tambe, M. 2018. Stackelberg security games: Looking beyond a decade of success. In IJCAI, 5494-5501.
von Stackelberg, H. 1934. Marktform und Gleichgewicht. J. Springer.

Widerker, D. 2017. Moral Responsibility and Alternative Possibilities: Essays on the Importance of Alternative Possibilities. Routledge.

Xu, M. 1998. Axioms for deliberative STIT. Journal of Philosophical Logic 27(5):505-552.