# Deception through Half-Truths

The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Andrew Estornell, Sanmay Das, Yevgeniy Vorobeychik
Computer Science & Engineering, Washington University in St. Louis
{aestornell, sanmay, yvorobeychik}@wustl.edu

## Abstract

Deception is a fundamental issue across a diverse array of settings, from cybersecurity, where decoys (e.g., honeypots) are an important tool, to politics, which can feature politically motivated leaks and fake news about candidates. Typical considerations of deception view it as providing false information. However, just as important but less frequently studied is a more tacit form where information is strategically hidden or leaked. We consider the problem of how much an adversary can affect a principal's decision by half-truths, that is, by masking or hiding bits of information, when the principal is oblivious to the presence of the adversary. The principal's problem can be modeled as one of predicting future states of variables in a dynamic Bayes network, and we show that, while theoretically the principal's decisions can be made arbitrarily bad, the optimal attack is NP-hard to approximate, even under strong assumptions favoring the attacker. However, we also describe an important special case where the dependency of future states on past states is additive, in which we can efficiently compute an approximately optimal attack. Moreover, in networks with a linear transition function we can solve the problem optimally in polynomial time.

## 1 Introduction

For better or for worse, deception is ubiquitous. It can be benign, but just as often deception is used to deliberately mislead. Commonly, the means of deception can be viewed as outright lies or misinformation. This is certainly the case with fake news and false advertising, as well as phishing emails, and it is also the case for honeypots, even though here deception is used to help network security rather than for a nefarious purpose. However, a more subtle means of deception involves strategically hiding information. For example, misleading advertising about a drug may omit important information about its side-effects, and we may effectively protect a system against classes of attacks by strategically deciding what is public about it, such as a Windows computer publicizing a Safari browser, but not the OS, to make it appear it's running Mac OS X.

Theoretical studies of deception typically leverage games of incomplete information, where deception takes the form of signaling misinformation about private state (Carroll and Grosu 2011; Pawlick and Zhu 2015), for example, advertising an incorrect configuration of computing devices (e.g., a Windows machine advertising as Linux) (Schlenker et al. 2018), or warning that there may be inspections when no inspectors are present (Xu et al. 2016). We take a different perspective. Specifically, we start with a decision-maker (the principal) who makes decisions under uncertainty based on limited evidence. To formalize this setting, we consider a two-stage dynamic Bayes network in which the principal observes a partial realization of the first stage, and makes a prediction (i.e., derives a posterior) about the second stage.
We study the extent to which such a decision-maker is susceptible to deception through half-truths, that is, through an adversarial masking of a subset of first-stage variables, with the assumption that the principal is oblivious to the adversarial nature of this masking (for example, the individual is unaware, or fails to take into account, that it is performed adversarially). While it may at first blush be puzzling how a rational Bayesian observer would be oblivious to the presence of an adversary, situations of this kind in fact abound. Consider algorithmic trading as one example. When order book information became available, it gave rise to numerous sophisticated machine learning methods aiming at taking advantage of this additional information (Nevmyvaka, Feng, and Kearns 2006; Nevmyvaka and Kearns 2013). However, many such approaches proved to be vulnerable to order book spoofing attacks (Wang, Wellman, and Vorobeychik 2018). Another example is autonomous driving. Despite a number of illustrations of attacks on state-of-the-art sophisticated AI-based perception algorithms (Boloor et al. 2019; Eykholt et al. 2018; Sharif et al. 2016; Vorobeychik and Kantarcioglu 2018), standard autonomous driving stacks, such as Autoware (Foundation) and Apollo (Baidu), are largely devoid of any techniques for robust perception.

Our first observation is that in our setting half-truths (that is, adversarial masking of observations) can lead to arbitrarily wrong beliefs. This is self-evident with lies, but surprising when we can only mask observations. However, we show that the problem of optimally choosing such a mask is extremely hard: in general, it is inapproximable to any polynomial factor. Next, we study an important restricted family of Bayes networks in which the transition probabilities of nodes depend on the sum of the parents. This is a natural model if we consider, for example, opinion diffusion through social influence. For example, suppose that each variable represents whether an individual likes a particular candidate in an election. The opinions in the second stage would correspond to the impact of social influence, where the parents of a node are their social network neighbors. Our model means that a node's view depends on the number of their neighbors who like the candidate. In this additive model, we show that the problem does not admit a PTAS even when nodes have at most two parents. However, we exhibit two algorithmic approaches for solving this variant: the first an $n$-approximation algorithm, the second a heuristic (which admits no performance guarantees). Our experiments show that the combination of the two yields good performance in practice, even while each is limited by itself. Finally, we show that when the temporal dependency is linear, we can find an optimal mask in polynomial time.

## Related Work

A number of prior efforts study deception, many in the context of cybersecurity. Among the earliest is work by Cohen and Koike (2003), who formalize deception as guiding attackers through (a benign part of) the attack graph. Recent qualitative studies of deception (Almeshekah and Spafford 2016; Stech, Heckman, and Strom 2016) offer additional insights, but do not provide mathematical modeling approaches. A series of mathematical formalizations of deception in cybersecurity have also been proposed (Carroll and Grosu 2011; Greenberg 1982; Ettinger and Jehiel 2010; Pawlick and Zhu 2015; Xu et al. 2016), but these tend to model static scenarios and misinformation, rather than information hiding.
Several other mathematical models address the allocation of honeypots, which is a common means for deceiving cyber attackers (Kiekintveld, Lisy, and Pibil 2015). Recently, deception has also been considered as a security game in which a defender chooses a deceptive presentation of the system configuration to an attacker (Schlenker et al. 2018), but without considering half-truths or structured information representation such as a DBN. Another relevant stream of research is that on information design (e.g., Rayo and Segal 2010). In the commonly studied Bayesian persuasion model (Kamenica and Gentzkow 2011), one considers a signaling game between a sender and a receiver, where the sender has the ability to acquire superior information to the receiver, and the receiver makes a decision that yields (state-dependent) utilities for both. The key question concerns the design of the optimal signal structure. This area has recently received attention from both the algorithmic perspective (how hard is the sender's problem under different assumptions (Dughmi and Xu 2016)) and in various applications, for example pricing (Shen, Tang, and Zeng 2018), auction design (Li and Das 2019), and security games (Rabinovich et al. 2015). Our work is distinct in that it assumes an oblivious principal, but effectively considers signals which have combinatorial structure.

## 2 Preliminaries

Consider a collection of binary variables $X = \{X_1, \ldots, X_n\}$. We define a 2-stage dynamic Bayes network over these, using superscripts to indicate time steps (0 and 1). Specifically, we assume that each $X^0_i$ is unconditionally independent, and for each $X^0_i$, let $P(X^0_i = 1) = p_i$. Moreover, each $X^1_i$ has a set of parent nodes, $Pa(X^1_i) \subseteq X^0$ (we only allow inter-stage dependencies to simplify discussion), and for each $X^1_i$, define $P(X^1_i = 1 \mid Pa(X^1_i))$ as the probabilistic relationship of the associated variable with its parents (the variables it depends on) from stage 0. We denote the realized values of these random variables in lower case: that is, the realization of a random variable $X^t_i$ is $x^t_i$.

We use this structure to define an interaction between an attacker and a myopic observer (whom we also call the principal). In particular, consider an observer who observes a partial realization of stage-0 variables, and aims to predict (in a probabilistic sense) the values of variables in stage 1. This high-level problem is a stylized version of a broad range of decision problems, such as voting behavior. Examples include observing candidate promises, personalities, and past voting record to predict what they would do once elected; observing infection status for a collection of individuals on a social network and aiming to predict who will be infected in the future; and so on. We assume that the observer is myopic in the sense that they use standard Bayesian reasoning about posterior probabilities conditional on their observations of stage-0 realizations. However, we specifically study a situation in which a malicious party adversarially masks a subset of stage-0 realizations (having first observed them). We denote the masked posterior by $P(X^1_i = 1 \mid Pa(X^1_i) \setminus \eta)$, where $\eta$ is a binary vector with $\eta_i = 1$ whenever the realization of $X^0_i$ is not observed (because it is masked). We assume that all the stage-0 realizations that are not masked are observed by the principal.
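To make the principal's inference concrete, here is a small Python sketch (our illustration, not code from the paper; the function name `masked_posterior` and the data layout are assumptions). It computes the masked posterior $P(X^1_i = 1 \mid Pa(X^1_i) \setminus \eta)$ for a single stage-1 node by marginalizing the hidden parents under their priors, exactly as an oblivious Bayesian observer would.

```python
from itertools import product
from typing import Callable, Dict, List, Set

def masked_posterior(
    parents: List[int],                    # indices of Pa(X^1_i) within stage 0
    prior: Dict[int, float],               # p_j = P(X^0_j = 1)
    x0: Dict[int, int],                    # realized stage-0 values
    hidden: Set[int],                      # indices masked by eta
    transition: Callable[[Dict[int, int]], float],  # P(X^1_i = 1 | parent values)
) -> float:
    """Observer's belief P(X^1_i = 1) given only the unmasked parents."""
    hidden_parents = [j for j in parents if j in hidden]
    visible = {j: x0[j] for j in parents if j not in hidden}
    belief = 0.0
    # Marginalize over every assignment of the hidden parents, weighted by the prior.
    for values in product([0, 1], repeat=len(hidden_parents)):
        assignment = dict(zip(hidden_parents, values))
        weight = 1.0
        for j, v in assignment.items():
            weight *= prior[j] if v == 1 else 1.0 - prior[j]
        belief += weight * transition({**visible, **assignment})
    return belief

# Example: X^1_i fires iff at least one of its two parents fired.
prior = {0: 0.1, 1: 0.1}
x0 = {0: 1, 1: 0}
trans = lambda pa: 1.0 if sum(pa.values()) >= 1 else 0.0
print(masked_posterior([0, 1], prior, x0, set(), trans))  # 1.0: parent 0 is seen to be 1
print(masked_posterior([0, 1], prior, x0, {0}, trans))    # 0.1: masking parent 0 drops the belief
```

The enumeration is exponential in the number of hidden parents and is intended only for small illustrative examples.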
Let $X^1$ denote the random vector distributed according to $P(X^1_i = 1 \mid Pa(X^1_i))$ (the full set of its parents from $X^0$), while $X^1_\eta$ is a random vector distributed according to $P(X^1_i = 1 \mid Pa(X^1_i) \setminus \eta)$. More precisely, the sequence of the interaction is as follows:

1. Nature generates a vector $x^0 = \langle x^0_1, \ldots, x^0_n \rangle$ defining the outcomes of $X^0$ according to its prior distribution $p$.
2. The attacker observes $x^0$ and may choose up to $k$ outcomes to hide from the observer. This decision is captured by the mask $\eta$.
3. The observer observes the partially realized state of $X^0$ after applying the mask $\eta$, and makes a prediction about $X^1$ (which we capture by the distribution of $X^1_\eta$).
4. Nature then yields the realization $x^1 = \langle x^1_1, \ldots, x^1_n \rangle$ according to the posterior distribution of $X^1$.

To understand the consequences of adversarial half-truths of this kind, we consider two problems faced by the adversary: targeted and untargeted attacks. Specifically, let the two random vectors, $X^1$ and $X^1_\eta$, also stand for their respective distributions, and let $D(X^1, X^1_\eta)$ be a statistical distance between the two distributions according to some metric. In the untargeted case, the adversary's problem is to maximize the distance between the masked and true posterior distributions over the random vector in stage 1:

$$\max_\eta D(X^1, X^1_\eta) \quad \text{s.t.} \quad \sum_i \eta_i \le k. \qquad (1)$$

In the targeted case, the adversary has some desired distribution, $X^1_\alpha$, and would like to push the observer's perception as close to this distribution as possible. We formalize this as

$$\min_\eta D(X^1_\alpha, X^1_\eta) \quad \text{s.t.} \quad \sum_i \eta_i \le k. \qquad (2)$$

Note that in this notation we are suppressing the dependence on the prior, which is implicitly part of any problem instance faced by the adversary.

## 3 Half-Truth is as Good as a Lie

Our first result demonstrates that, in a fundamental sense, there are cases in our model where partially hiding the true current state can lead to arbitrary distortion of belief by a myopic observer. Recall that the adversary's aim is to maximize the statistical distance $D$ between the true posterior distribution over $X^1$ and the posterior induced by masking a subset of variables in stage 0, $X^1_\eta$. We now show that for most reasonable measures of statistical distance, we can construct cases in which the adversary can make it arbitrarily large (within limits of the measure itself); that is, the adversary can induce essentially arbitrary distortion in belief solely by masking some of the observations.

Definition 1. We say a statistical distance $D$ is positive if for any two random variables $A, B$ we have $D(A, B) \ge 0$. Note that any distance metric, or probabilistic extension of a distance metric, fits the definition of positive symmetric.

Theorem 2. Suppose the attacker's objective is to maximize some positive statistical distance $D$. Let $A$ and $B$ be any vectors of binary random variables; then there exists a sequence of dynamic Bayes networks such that
$$\lim_{n\to\infty} \mathbb{E}_{X^0}\left[\max_\eta D(X^1, X^1_\eta)\right] = \lim_{n\to\infty} \max_{A,B} D(A, B).$$

Proof. Let $A, B$ be the vectors of binary random variables for which $D(A, B)$ attains its maximum value, with respect to $n$. Then $A = \langle A_1, \ldots, A_n\rangle$, $B = \langle B_1, \ldots, B_n\rangle$, and each variable has prior $P(A_i = 1) = a_i$, $P(B_i = 1) = b_i$. Let $X^0 \to X^1$ define a dynamic Bayes network on $n$ variables. For all $1 \le j \le n$, let $Pa(X^1_j) = \{X^0_i : 1 \le i \le n\}$; that is, all nodes in layer 0 are parents of every node in layer 1. Define the probability distributions over $X^0$ and $X^1$ as follows: for each $X^0_i \in X^0$, $P(X^0_i = 1) = \epsilon$; and for each $X^1_i \in X^1$, $P(X^1_i = 1 \mid \sum_j x^0_j \ge 1) = b_i$ and $P(X^1_i = 1 \mid \sum_j x^0_j = 0) = a_i$.
For each $n$ we will consider the value of $D(X^1, X^1_\eta)$ under three types of events that could occur with respect to the possible outcomes $x^0$ of $X^0$, the adversary's budget $k$, and the adversary's choice of which nodes to hide conditional on $x^0$. Each of these settings admits a unique type of optimal play from the adversary. Specifically:

(1) $\sum_j x^0_j = 0$. In this case the adversary will hide $k$ random nodes, since all outcomes are 0.
(2) $\sum_j x^0_j = m \le k$. In this case the adversary will hide only the $m$ nodes whose outcomes are 1.
(3) $\sum_j x^0_j = m > k$. In this case the adversary will hide nothing.

In events of type (1), when there is no mask, $X^1 = A$. When a mask $\eta$ is employed, $X^1_\eta = B$ with probability $1 - (1-\epsilon)^k$, and $X^1_\eta = A$ with probability $(1-\epsilon)^k$. Thus, in this setting,
$$\mathbb{E}\left[D(X^1, X^1_\eta)\right] = \left(1 - (1-\epsilon)^k\right) D(A, B) + (1-\epsilon)^k D(A, A).$$
Events of this type occur with probability $(1-\epsilon)^n$.

In events of type (2), without $\eta$ we have $X^1 = B$. Since $m \le k$, all nodes with outcome 1 are hidden. Thus, in light of $\eta$, we have $X^1_\eta = A$ with probability $(1-\epsilon)^m$ and $X^1_\eta = B$ with probability $1 - (1-\epsilon)^m$. Therefore, the expected value in this setting is
$$\mathbb{E}\left[D(X^1, X^1_\eta)\right] = \left(1 - (1-\epsilon)^m\right) D(B, B) + (1-\epsilon)^m D(B, A).$$
Events of this type occur with probability $\binom{n}{m}\epsilon^m(1-\epsilon)^{n-m}$ for each $m \le k$.

In events of type (3) there are more nodes yielding 1 in layer 0 than the adversary is capable of hiding, so $X^1 = X^1_\eta$ and the distance is 0. Events of this type occur with probability $\binom{n}{m}\epsilon^m(1-\epsilon)^{n-m}$ for each $m > k$.

For notational convenience, and without loss of generality, we reorder the nodes in $X^0$ after the observations are made by the adversary, such that $x^0_j = 1$ for $j \le m$. Suppose $k = n$; a similar analysis holds for any constant fraction of $n$. Since $D$ is positive symmetric, we have
$$\mathbb{E}_{X^0}\left[\max_\eta D(X^1, X^1_\eta)\right] \ge D(A, B)\left(1 - (1-\epsilon)^n\right)(1-\epsilon)^n + D(B, A)\sum_{m=1}^{n}\binom{n}{m}\epsilon^m(1-\epsilon)^{n-m}(1-\epsilon)^m.$$
Using the binomial identities we can reduce the above expression to the form
$$D(A, B)\left(1 - (1-\epsilon)^n\right)(1-\epsilon)^n + D(B, A)(1-\epsilon)^n\left((\epsilon+1)^n - 1\right).$$
Thus, since both terms in the above sum are positive, it remains only to show that, for $\epsilon = \frac{\log(n)}{n}$,
$$(1-\epsilon)^n\left((\epsilon+1)^n - 1\right) \to 1 \quad \text{as } n \to \infty.$$
Using a slight variation of the identity $\lim_{n\to\infty}\left(1 + \frac{a}{n}\right)^{cn} = e^{ac}$, we obtain that this limit does in fact converge to 1, giving the desired result that
$$\lim_{n\to\infty} \mathbb{E}_{X^0}\left[\max_\eta D(X^1, X^1_\eta)\right] = \lim_{n\to\infty} \max_{A,B} D(A, B). \qquad \square$$

## 4 Computational Complexity of Deception by Half-Truth

Let $X^0 \to X^1$ define a dynamic Bayes network over a set of $n$ binary random variables, and let $x^0$ be a binary vector describing the realized outcomes of $X^0$. In the remainder of the paper, we restrict attention to particular distance metrics of the form
$$\text{untargeted: } D(X^1, X^1_\eta) = \mathbb{E}\left[\|X^1 - X^1_\eta\|_p\right], \qquad \text{targeted: } D(X^1_\alpha, X^1_\eta) = \mathbb{E}\left[\|X^1_\alpha - X^1_\eta\|_p\right],$$
where the expectation is with respect to the product distribution of the two random variables and $p \in \mathbb{N} \cup \{\infty\}$. These are natural distances in the context of random variables, and correspond to the Lukaszyk-Karmowski metric (LKM) of statistical distance between the distributions. We call the resulting problems (of computing the optimal mask given a prior and a realization of variables at layer 0) Deception by Bayes Network Masking (DBNM) for the untargeted case, and Targeted Deception by Bayes Network Masking (TDBNM) for the targeted case.
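For $p = 1$ the LKM objective decomposes node by node: under the product distribution, $\mathbb{E}\left[\|X^1 - X^1_\eta\|_1\right] = \sum_i \left[q_i(1 - m_i) + m_i(1 - q_i)\right]$, where $q_i$ and $m_i$ denote the true and masked posteriors of $X^1_i$. The sketch below (ours, reusing the hypothetical `masked_posterior` helper from the earlier sketch) evaluates this objective and finds an optimal mask by exhaustive search; the exponential search is for intuition only, consistent with the hardness results that follow.

```python
from itertools import combinations

def untargeted_value_p1(nodes, prior, x0, hidden):
    """E||X^1 - X^1_eta||_1 under the product distribution (the p = 1 LKM).

    `nodes` maps each stage-1 index i to a pair (parents_i, transition_i).
    """
    value = 0.0
    for parents, trans in nodes.values():
        q = masked_posterior(parents, prior, x0, set(), trans)    # true posterior
        m = masked_posterior(parents, prior, x0, hidden, trans)   # masked posterior
        value += q * (1 - m) + m * (1 - q)                        # P(X^1_i != X^1_eta,i)
    return value

def optimal_mask_bruteforce(nodes, prior, x0, k):
    """Exhaustive search over all masks of size at most k (exponential in n)."""
    best_mask, best_val = set(), 0.0
    for size in range(k + 1):
        for eta in combinations(x0.keys(), size):
            val = untargeted_value_p1(nodes, prior, x0, set(eta))
            if val > best_val:
                best_mask, best_val = set(eta), val
    return best_mask, best_val
```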
We now show that this problem does not even admit a polynomial-factor approximation for any $p$.

Theorem 3. If DBNM has a deterministic, polynomial-time, polynomial-factor approximation, for any value of $p$, then P=NP.

Proof. Suppose that there exists a deterministic, polynomial-factor, polynomial-time approximation algorithm for DBNM. We will show that under this assumption SAT can be solved in polynomial time. Consider an instance of SAT defined by a set of Boolean variables $B$ and a Boolean formula $\Phi$ whose terms are the elements of $B$. The objective is to determine whether there exists an assignment of the variables in $B$ such that $\Phi$ evaluates to 1. An arbitrary instance of SAT can be encoded into DBNM in the following manner. Let $X^0 = B$, $Pa(X^1_1) = X^0$, and define $P(X^1_1 = 1 \mid Pa(X^1_1)) = \Phi$ (that is, $X^1_1 = 1$ if and only if the formula $\Phi$ evaluates to true). For all other $j \ne 1$, $P(X^1_j = 1 \mid Pa(X^1_j)) = 0$. Lastly, set each prior $P(X^0_i = 1) = \frac{1}{2^{2n}}$ and set $x^0 = \langle 1, 1, \ldots, 1\rangle$.

In the case that the assignment $b = 1$ for all $b \in B$ yields $\Phi = 0$, the objective of the attacker is to select a mask $\eta$ that maximizes the value of $P(X^1_1 = 1 \mid x^0 \setminus \eta)$. For a given mask $\eta$, let $y^0_\eta$ be any outcome that agrees with $x^0$ on all variables in $X^0 \setminus \eta$, i.e., $x^0_i = y^0_{\eta,i}$ for all $X^0_i \notin \eta$, and let $a_{y^0_\eta} = \|x^0 - y^0_\eta\|_1$. Then, for any $\eta$ we have
$$P(X^1_{\eta,1} = 1 \mid x^0) = \sum_{y^0_\eta} P(y^0_\eta)\, P(X^1_1 = 1 \mid y^0_\eta) = \sum_{y^0_\eta} P(X^1_1 = 1 \mid y^0_\eta)\left(1 - \tfrac{1}{2^{2n}}\right)^{a_{y^0_\eta}}\left(\tfrac{1}{2^{2n}}\right)^{|\eta| - a_{y^0_\eta}}.$$
A certificate for the SAT instance can be generated by assigning $b_i = 1$ if $X^0_i \notin \eta$ and $b_i = 0$ if $X^0_i \in \eta$. To see that this certificate is valid, consider two cases on $\eta$: first, that $\eta$ corresponds to an assignment of $B$ yielding $\Phi = 0$, and second, that the assignment yields $\Phi = 1$.

In the first case, let $\bar{y}^0_\eta$ be the outcome such that $\bar{y}^0_{\eta,i} = 0$ for all $X^0_i \in \eta$ and $\bar{y}^0_{\eta,j} = 1$ for all $X^0_j \notin \eta$. Then, since $P(X^1_1 = 1 \mid \bar{y}^0_\eta) = 0$, we have
$$\sum_{y^0_\eta} P(X^1_1 = 1 \mid y^0_\eta)\left(1 - \tfrac{1}{2^{2n}}\right)^{a_{y^0_\eta}}\left(\tfrac{1}{2^{2n}}\right)^{|\eta| - a_{y^0_\eta}} = \sum_{y^0_\eta \ne \bar{y}^0_\eta} P(X^1_1 = 1 \mid y^0_\eta)\left(1 - \tfrac{1}{2^{2n}}\right)^{a_{y^0_\eta}}\left(\tfrac{1}{2^{2n}}\right)^{|\eta| - a_{y^0_\eta}}.$$
Note that for each $y^0_\eta \ne \bar{y}^0_\eta$, we have $|\eta| - a_{y^0_\eta} \ge 1$. Thus,
$$\sum_{y^0_\eta \ne \bar{y}^0_\eta} P(X^1_1 = 1 \mid y^0_\eta)\left(1 - \tfrac{1}{2^{2n}}\right)^{a_{y^0_\eta}}\left(\tfrac{1}{2^{2n}}\right)^{|\eta| - a_{y^0_\eta}} \le 2^n \cdot \frac{1}{2^{2n}} = \frac{1}{2^n}.$$
Therefore, if the adversary selects a mask that does not correspond to a satisfying assignment for $\Phi$, its utility is at most $\frac{1}{2^n}$.

The next case to consider is when the adversary selects a mask which induces $\Phi = 1$. In this case, we have
$$\sum_{y^0_\eta \ne \bar{y}^0_\eta} P(X^1_1 = 1 \mid y^0_\eta)\left(1 - \tfrac{1}{2^{2n}}\right)^{a_{y^0_\eta}}\left(\tfrac{1}{2^{2n}}\right)^{|\eta| - a_{y^0_\eta}} + P(X^1_1 = 1 \mid \bar{y}^0_\eta)\left(1 - \tfrac{1}{2^{2n}}\right)^{a_{\bar{y}^0_\eta}} \ge \left(1 - \tfrac{1}{2^{2n}}\right)^{n}.$$
Thus, if $\eta$ induces an assignment of $B$ that yields $\Phi = 1$, the adversary's utility is at least $\left(1 - \frac{1}{2^{2n}}\right)^n$, which converges to 1, from below, faster than any polynomial in $n$.

By these two cases, we know that when $\Phi$ is satisfiable there exists a mask with value at least $\left(1 - \frac{1}{2^{2n}}\right)^n$, and that no mask corresponding to $\Phi = 0$ can have value greater than $\frac{1}{2^n}$. In addition, we know that an optimal mask can achieve a value of no more than 1, since only one node in $X^1$ has outcomes dependent on $X^0$, and any $L_p$ norm applied to a vector with only a single nonzero dimension evaluates to exactly the value of that dimension. Therefore, if a polynomial-factor approximation of the optimal solution were given, one could deduce the satisfiability of $\Phi$ based on the value of the mask $\eta$: if $V(\eta) \le \frac{1}{2^n}$, then $\Phi$ is not satisfiable, and if $V(\eta) \ge \left(1 - \frac{1}{2^{2n}}\right)^n$, then $\Phi$ is satisfiable and $\eta$ gives the satisfying assignment. This covers all cases but the one in which $b_i = 1$ for all $b_i \in B$ yields $\Phi = 1$. In that case, the adversary could return a mask of value arbitrarily close to 0 even though $\Phi$ has a satisfying assignment. This case is easily remedied by checking the assignment $b_i = 1$ for all $b_i \in B$ before running the approximation algorithm. Under this scheme we could use the polynomial approximation algorithm to determine whether a given instance of SAT is satisfiable.
Since SAT is NP-complete, the existence of such an approximation algorithm would imply that P = NP. $\square$

Next, we show that this inapproximability holds even if we consider randomized algorithms.

Theorem 4. If DBNM has a randomized polynomial-factor approximation with constant probability, for any $p$, then RP = NP.

Proof. We use the previous construction from SAT to DBNM. If there existed an algorithm that could produce a polynomial-factor approximation of the constructed instance of DBNM with some constant probability $p \in (0, 1)$, then the same line of reasoning as in the above proof yields a polynomial-time algorithm that can determine whether a true instance of SAT is satisfiable with probability at least $p$. This algorithm could then be run $\frac{1}{p}$ times to obtain a success rate of $1 - (1-p)^{1/p} \ge \frac{1}{2}$. Moreover, the algorithm would never falsely identify a non-satisfiable instance as satisfiable. The existence of such an algorithm would imply that SAT $\in$ RP, and since SAT is NP-complete and RP is closed under L-reductions, this would also imply that RP = NP. $\square$

Finally, we extend the hardness results above to the targeted version of our problem.

Corollary 5. If TDBNM has a deterministic polynomial-time, polynomial-factor approximation, or a randomized polynomial-time, polynomial-factor approximation with constant probability, for any $p$, then P=NP or RP=NP, respectively.

Proof. In both cases we can set $X_\alpha = \langle 1, 0, \ldots, 0\rangle$, and the objective is exactly the same as it was in the untargeted case, with the only difference being that we need not consider the case in which $b_i = 1$ for all $i \le n$ yields $\Phi = 1$, since $\eta = \emptyset$ is then an optimal mask. Once we have this setting for $X_\alpha$, the proof follows identically to the proofs of Theorems 3 and 4. $\square$

## 5 Approximation Algorithm for the Additive Case

The results above show that polynomial-factor approximations of the optimal solution are intractable in the general case, when the adversary must be able to compute the optimal mask for any prior and any realization of the variables in layer 0. Therefore, we now turn our focus to cases where the DBN exhibits special structure in the transition probabilities. We start with DBNs with additive transition structure, which we define next.

Definition 6. We say a transition probability for $X^1_i$ is additive if
$$P(X^1_i = 1 \mid Pa(X^1_i)) = P(X^1_i = 1 \mid Z_i), \quad \text{where } Z_i = \sum_{X^0_j \in Pa(X^1_i)} X^0_j.$$

We term the problem of finding an optimal adversarial mask when all transitions are additive ADBNM, for Additive DBNM, in the untargeted case, and TADBNM refers to the corresponding targeted problem.

### 5.1 Inapproximability in the Additive Case

First, we show that even this case is inapproximable, but now in the sense that no PTAS exists for this problem.

Theorem 7. No PTAS exists for either ADBNM (untargeted) or TADBNM (targeted) when $p = 1$, unless P=NP (even for monotone transition functions, when nodes have at most 2 parents).

Proof. To show that no PTAS exists for either problem, we reduce from Dense $k$-Subgraph (DKSG). An instance of DKSG is defined by a budget $k$ and a graph $G = (V, E)$; the objective is to find a vertex set $S \subseteq V$ such that $|\{(u, v) \in E : u, v \in S\}|$ is maximized while $|S| \le k$. An instance of DKSG can be reduced to an instance of ADBNM as follows. First, let $X^0 = \{X^0_v : v \in V\}$ and let $X^1 = \{X^1_{(u,v)} : (u, v) \in E\}$. For each $X^0_v \in X^0$, let $P(X^0_v = 0) = \epsilon$ for arbitrarily small $\epsilon$; it is easy to check that for $\epsilon = \frac{1}{2^{2n}}$, reasoning similar to our previous hardness result holds. Lastly, set $P(X^1_{(u,v)} = 1 \mid Z_{(u,v)}) = 1$ if $z_{(u,v)} = 2$ and $P(X^1_{(u,v)} = 1 \mid Z_{(u,v)}) = 0$ otherwise. Suppose that $x^0 = \langle 0, 0, \ldots, 0\rangle$.
For TADBNM we need the one extra condition that $X_\alpha = \langle 1, 1, \ldots, 1\rangle$. Now, let $\eta \subseteq X^0$ be any mask. Then, for each pair $X^0_u, X^0_v \in \eta$ with $(u, v) \in E$, we have
$$\mathbb{E}\left[\,|X^1_{(u,v)} - X^1_{\eta,(u,v)}|\;\middle|\;x^0\right] = (1-\epsilon)^2.$$
Therefore, for a given $\eta$, the attacker's total utility is
$$\sum_{X^0_u, X^0_v \in \eta\,:\,(u,v)\in E} (1-\epsilon)^2 = \beta(1-\epsilon)^2,$$
where $\beta$ is the number of such unique pairs contained in $\eta$. Hence, the maximum utility an attacker can obtain is $\beta^*(1-\epsilon)^2$, where $\beta^*$ is the maximum number of distinct pairs $X^0_u, X^0_v$ that can be contained in any $\eta$ of size at most $k$. Since each such pair represents an edge in $E$ and $\eta$ represents a collection of vertices of $V$, the maximum dense $k$-subgraph has size $\beta^*$ and is given by the vertices in $\eta$. That is, if a given mask $\eta$ has utility $\beta(1-\epsilon)^2$, then the vertices in $\eta$ correspond to a subgraph with $\beta$ edges. Similarly, if $S \subseteq V$ describes a subgraph with $\beta$ edges, then by mapping the vertices in $S$ to a mask $\eta$, the attacker can achieve utility $\beta(1-\epsilon)^2$. Since the two objectives agree up to the factor $(1-\epsilon)^2$, which can be made arbitrarily close to 1, if a PTAS were to exist for ADBNM, then the same PTAS would also be a PTAS for DKSG. However, unless P=NP, no such algorithm exists for DKSG. Thus, no PTAS exists for ADBNM unless P=NP. $\square$

Theorem 8. For $p \in \mathbb{N}_{\ge 2} \cup \{\infty\}$, no PTAS exists for either ADBNM (untargeted) or TADBNM (targeted), unless P=NP (even for monotone transition functions, when nodes have at most 2 parents).

Proof. We use the same reduction from DKSG as in the proof of Theorem 7. Under this construction, and for general $p$, the attacker's utility for any $\eta$ is
$$\mathbb{E}\left[\|X^1 - X^1_\eta\|_p\right] = \sum_{i} P\left(\sum_j |X^1_j - X^1_{\eta,j}| = i\right) i^{\frac{1}{p}},$$
with the understanding that $i^{\frac{1}{\infty}} = 1$. Note that this objective function is monotone with respect to the number of unique pairs $X^0_u, X^0_v \in \eta$ that correspond to edges $(u, v) \in E$. Further, since each node in $X^0$ is identical, each such pair contributes the same increase to the objective function. Therefore, the objective function increases with respect to the number of unique pairs corresponding to edges in the original graph, independent of which pair is added. The attacker's objective is thus maximized by finding the largest set of unique pairs $X^0_u, X^0_v$ which correspond to edges in the graph. This is exactly the objective of the original DKSG problem, meaning that a valid solution to one problem is exactly a valid solution to the other, and both ADBNM and TADBNM are NP-hard for $p > 1$. $\square$

### 5.2 Approximation Algorithm

While even the ADBNM special case is inapproximable in this sense, we now present our first positive result, an $n$-approximation (recall that the best known approximation of DKSG is $\Theta(n^{1/4})$, and we showed in the reduction above that our problem is no easier). First, we impose an additional restriction on the problem: we assume that all transition functions have the property that $P(X^1_i = 1 \mid Z_i)$ is monotone with respect to $Z_i$. We propose Algorithm 1 for this problem.

Algorithm 1: Approximation algorithm
```
 1: bestMask := ∅
 2: for each X^1_i ∈ X^1 do
 3:     η := ∅
 4:     S_0 := {X^0_j ∈ Pa(X^1_i) : x^0_j = 0}
 5:     S_1 := {X^0_j ∈ Pa(X^1_i) : x^0_j = 1}
 6:     Select S ∈ {S_1, S_0} based on the target of X^1_i
 7:     while |η| < k and S \ η ≠ ∅ do
 8:         x := argmin_{X^0_j ∈ S \ η} P(X^0_j = x^0_j)
 9:         add x to η
10:     if V(η) > V(bestMask) then
11:         bestMask := η
12: return bestMask
```
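Below is a minimal Python rendering of Algorithm 1 (our own sketch, not the authors' code). Here `value(eta)` stands for the attacker's objective $V(\eta)$, e.g. a closure over the hypothetical `untargeted_value_p1` from the earlier sketch, and `target`, when provided, gives the desired extreme for each stage-1 node in the targeted case; for the untargeted case the sketch simply tries both candidate groups $S_0$ and $S_1$ for each node, which is one way of reading line 6 of the pseudocode.

```python
def approx_algorithm(nodes, prior, x0, k, value, target=None):
    """Algorithm 1 (sketch): build one candidate mask per stage-1 node, keep the best.

    `nodes` maps each stage-1 index i to (parents_i, transition_i);
    `value(eta)` evaluates the attacker's objective V(eta) on a mask.
    """
    best_mask, best_val = set(), value(set())
    for i, (parents, _trans) in nodes.items():
        s0 = [j for j in parents if x0[j] == 0]   # hiding these pushes the belief up
        s1 = [j for j in parents if x0[j] == 1]   # hiding these pushes the belief down
        groups = [s0, s1] if target is None else [s0 if target[i] == 1 else s1]
        for s in groups:
            # Hide first the observations that are least likely under the prior,
            # i.e. argmin P(X^0_j = x^0_j), as in line 8 of the pseudocode.
            s = sorted(s, key=lambda j: prior[j] if x0[j] == 1 else 1 - prior[j])
            eta = set(s[:k])
            val = value(eta)
            if val > best_val:
                best_mask, best_val = eta, val
    return best_mask, best_val
```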
Next, we show that this algorithm yields a provable approximation guarantee.

Proposition 9. For any $p \in \mathbb{N} \cup \{\infty\}$, Algorithm 1 achieves an $n$-approximation on both targeted and untargeted attacks.

Proof. The algorithm generates one mask for each node $X^1_i \in X^1$. The associated mask, $\eta_i$, is meant to push the observer's perception of $P(X^1_i \mid z_i)$ as close to some extreme (0 or 1) as possible. We examine the contribution that the node pushed furthest toward the desired extreme makes to the attacker's total utility. Suppose $P(X^1_i = 1 \mid z^{\eta_i}_i)$ is being pushed to 1; a symmetric argument holds in the case of 0. Let
$$X^1_a = \arg\max_{X^1_i} \max_{\eta_i} P\left(X^1_i = 1 \mid z^{\eta_i}_i\right) \quad\text{and}\quad Q_a = \max_{\eta_a} P\left(X^1_a = 1 \mid z^{\eta_a}_a\right).$$
Next we show that $Q_a$ is at least $\frac{1}{n}$ of the optimal solution, no matter which $L_p$ norm is used. The attacker's utility is given by $\mathbb{E}\left[\|X^1_{\eta_i} - X^1\|_p\right]$, where $|X^1_{\eta_i} - X^1|$ is a binary vector. For finite $p$ we have
$$\|X^1_{\eta_i} - X^1\|_p = \left(\sum_{i=1}^{n} |x^1_{\eta,i} - x^1_i|\right)^{\frac{1}{p}},$$
and in the case $p = \infty$ we have
$$\|X^1_{\eta_i} - X^1\|_\infty = \max_i |x^1_{\eta,i} - x^1_i| \le 1.$$
Under any $p$, the attacker's utility on $\eta_a$ is at least $Q_a \|\langle 1, 0, \ldots, 0\rangle\|_p = Q_a$.

To obtain the bound on the approximation we split into three cases: $p = 1$, $2 \le p < \infty$, and $p = \infty$. In each case, each node has probability at most $Q_a$ of attaining the desired outcome (0 or 1).

In the first case, $p = 1$, the attacker's optimal utility is upper-bounded by
$$\sum_{i=1}^{n} i \binom{n}{i} Q_a^i (1 - Q_a)^{n-i} = n Q_a.$$
Hence the ratio of the solution given by $\eta_a$ to the optimal solution is at least $\frac{Q_a}{nQ_a} = \frac{1}{n}$.

In the second case, $2 \le p < \infty$, the attacker's optimal utility is upper-bounded by
$$\sum_{i=1}^{n} i^{\frac{1}{p}} \binom{n}{i} Q_a^i (1 - Q_a)^{n-i} \le \sum_{i=1}^{n} i \binom{n}{i} Q_a^i (1 - Q_a)^{n-i} = n Q_a,$$
and again the ratio to the optimal solution is at least $\frac{1}{n}$.

Lastly, when $p = \infty$, the attacker's utility is exactly the probability that there exists at least one node with the desired outcome. Since each node has probability at most $Q_a$ of yielding the desired outcome, the attacker's optimal utility is at most $1 - (1 - Q_a)^n$, while the attacker's utility on $\eta_a$ is at least $Q_a$. Thus the ratio to the optimal solution is at least $\frac{Q_a}{1 - (1-Q_a)^n}$. By monotonicity and evaluation of the limit as $Q_a \to 0$, we see that $\frac{1}{n} \le \frac{Q_a}{1 - (1-Q_a)^n}$. Therefore, for any $p \in \mathbb{N} \cup \{\infty\}$ we obtain an approximation ratio of at least $\frac{1}{n}$. $\square$

### 5.3 Heuristic

In addition to our approximation algorithm above, we propose a simple heuristic approach for approximating the optimal mask. The heuristic is a hill-climbing strategy in which, at each iteration, we add to $\eta$ the node that results in the maximum increase of the value of $\eta$; see Algorithm 2. As we demonstrate in the experiments below, the combination of the approximation algorithm and the heuristic performs much better than either in isolation (and, of course, jointly achieves the $n$-approximation above); a sketch of both is given after the worst-case example below. We now show that, by itself, the heuristic can be arbitrarily bad.

Algorithm 2: Heuristic algorithm
```
1: bestMask := ∅
2: η := ∅
3: while |η| < k do
4:     x := node with the largest increase to V(η)
5:     η := η ∪ {x}
6:     if V(η) > V(bestMask) then
7:         bestMask := η
8: return bestMask
```

Fix $n > 3$ such that $2 \mid n$, let $k = \frac{n}{2}$, and let $p_i = 1 - \epsilon$ for a sufficiently small $\epsilon$. Suppose $x^0 = \langle 0, 0, \ldots, 0\rangle$. Let $Pa(X^1_1) = \{X^0_1, X^0_2, \ldots, X^0_{n/2}\}$, and for each $X^1_i$ with $i > 1$, let $Pa(X^1_i) = \{X^0_{n/2+1}, \ldots, X^0_n\}$. Define $P(X^1_1 = 1 \mid z_1) = \epsilon z_1$, and for all $i > 1$, $P(X^1_i = 1 \mid z_i) = 0$ if $z_i < \frac{n}{2}$ and $P(X^1_i = 1 \mid z_i) = 1$ if $z_i = \frac{n}{2}$. Then the optimal mask, in both the hiding and flipping cases, is to hide all nodes $X^0_{n/2+1}, \ldots, X^0_n$, which results in a value of at least $\frac{n}{2}(1-\epsilon)^{n/2}$ in the hiding case, and $\frac{n}{2}$ in the flipping case. However, since the only way to greedily increase the value of $\eta$ is to keep hiding nodes from $\{X^0_1, \ldots, X^0_{n/2}\}$, the mask produced by the heuristic has value $\frac{n}{2}\epsilon(1-\epsilon)^{n/2}$. Thus, we get a ratio of
$$\frac{\frac{n}{2}\epsilon(1-\epsilon)^{n/2}}{\frac{n}{2}(1-\epsilon)^{n/2}} = \epsilon.$$
Note that $\epsilon$ is independent of $n$. Thus, as $\epsilon \to 0$, the value of the heuristic solution also converges to 0 for all $n > 3$.
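The sketch below (ours; it reuses the hypothetical `approx_algorithm` and objective helpers from the earlier sketches) renders Algorithm 2 together with the combination of the two algorithms that the experiments in Section 7 evaluate. As the example above shows, the greedy heuristic alone carries no guarantee, while taking the better of the two masks preserves the $n$-approximation.

```python
def greedy_heuristic(candidates, k, value):
    """Algorithm 2 (sketch): hill-climb on V(eta), adding one node at a time."""
    eta, best_mask, best_val = set(), set(), value(set())
    while len(eta) < k:
        remaining = [j for j in candidates if j not in eta]
        if not remaining:
            break
        # Add the node that yields the largest increase in the objective.
        eta.add(max(remaining, key=lambda j: value(eta | {j})))
        val = value(eta)
        if val > best_val:
            best_mask, best_val = set(eta), val
    return best_mask, best_val

def combined_attack(nodes, prior, x0, k, value):
    """Run both algorithms and keep whichever mask attains the higher utility."""
    m1, v1 = approx_algorithm(nodes, prior, x0, k, value)
    m2, v2 = greedy_heuristic(list(x0), k, value)
    return (m1, v1) if v1 >= v2 else (m2, v2)
```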
Next we define and discuss linear Bayesian networks; on such networks the proposed heuristic is guaranteed to find the optimal solution, although the optimum can also be obtained by a much simpler algorithm, which we discuss as well.

## 6 Polynomial-time Algorithm for Linear Bayesian Networks

Our final contribution is a further restriction on the DBN that yields a polynomial-time algorithm for computing an optimal mask for the adversary. Specifically, we consider networks in which each transition function is of the form
$$P\left(X^1_i = 1 \mid Pa(X^1_i)\right) = \sum_{X^0_j \in Pa(X^1_i)} a_{ij} X^0_j.$$
We call these linear Bayesian networks.

Theorem 10. In linear Bayesian networks, the optimal solution to DBNM and TDBNM can be computed in polynomial time for the $\ell_1$-norm.

Proof. Consider the untargeted case first. Let $x^0$ be the outcome given by nature, and let $y^0$ be any outcome of $X^0$ which agrees with $x^0$ on all elements except those in $\eta$; more specifically, if $X^0_j \notin \eta$ then $x^0_j = y^0_j$, and if $X^0_j \in \eta$ then $y^0_j$ is free to be either 0 or 1. For notational convenience we define the following variables for any mask $\eta$ and any $X^1_i \in X^1$: let $P_{i,r} = Pa(X^1_i) \cap \eta_r$ and $P_i = Pa(X^1_i) \cap \eta$,
$$Q_i = \sum_{X^0_j \in Pa(X^1_i)} a_{ij} x^0_j \quad\text{and}\quad R_i = \sum_{X^0_j \in Pa(X^1_i)\setminus\eta} a_{ij} x^0_j.$$
Then the attacker's utility on $X^1_i$ can be written as
$$\sum_{X^0_j \in P_i} a_{ij} p_j - 2 Q_i\left(R_i + \sum_{X^0_j \in P_i} a_{ij} p_j\right).$$
Consider the change in value of $\eta$ when adding some $X^0_r \in Pa(X^1_i)\setminus\eta$, and denote this new mask by $\eta_r = \eta \cup \{X^0_r\}$. Assume that $x^0_r = 1$; a symmetric argument yields a similar result when $x^0_r = 0$. For notational convenience, let $R'_i = R_i - a_{ir}$. Writing the utility of $\eta_r$ and of $\eta$ in the form above (with $P_{i,r}$ and $R'_i$ in place of $P_i$ and $R_i$) and taking the difference, the terms that do not involve $X^0_r$ cancel, and the difference in value between $\eta_r$ and $\eta$ is $-a_{ir} p_r (1 - 2 Q_i)$. Thus, for any $X^1_i \in X^1$, if we hide $X^0_r$ when $x^0_r = 1$, the change in $X^1_i$'s contribution to the total utility is $-p_r a_{ir} (1 - 2 Q_i)$, and similarly, when $x^0_r = 0$, the change is $p_r a_{ir} (1 - 2 Q_i)$. In both cases, hiding $X^0_r$ changes the attacker's utility by $(-1)^{\beta_r} p_r a_{ir} (1 - 2 Q_i)$, where $\beta_r = x^0_r$.

In the targeted case, the only way in which the analysis changes is in the value of $\beta_r$: since we now have a desired target for each $X^1_i$, if that desired target is 0 then $\beta_r$ is also 0, and similarly, when the target is 1, so is $\beta_r$. Thus, in both the targeted and untargeted cases, the change in utility is independent of the current mask $\eta$, and the total utility is simply the sum of the utilities on each $X^1_i$. Therefore, when hiding any $X^0_r$, the attacker's total utility changes by a value that depends only on $x^0_r$ and not on the current mask $\eta$, and the attacker's utility can be written as
$$\sum_{X^1_i \in X^1}\; \sum_{r \in I(Pa(X^1_i))} y_r (-1)^{x^0_r} p_r a_{ir} (1 - 2 Q_i),$$
where $I(Pa(X^1_i))$ is the index set of the parents of $X^1_i$, $y_r = 1$ if $X^0_r \in \eta$, and $y_r = 0$ if $X^0_r \notin \eta$. Assigning values to each $y_r$ such that $\sum_{r=1}^{n} y_r \le k$ can be done in polynomial time by simply selecting the $y_r$'s with the highest associated coefficients. $\square$
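The proof reduces the linear, $\ell_1$ case to ranking per-node coefficients. The following sketch (ours; the array layout and the choice to skip non-positive gains, which is safe because the budget is only an upper bound, are assumptions) implements that selection for the untargeted case.

```python
def optimal_mask_linear(a, prior, x0, k):
    """Optimal mask for a linear network under the l1 objective (Theorem 10 sketch).

    a[i][r] is the weight a_ir (0 when X^0_r is not a parent of X^1_i),
    prior[r] = p_r, and x0[r] is the realized stage-0 value; indices run 0..n-1.
    """
    n = len(x0)
    # Q_i: the true transition probability of X^1_i under the full observation.
    q = [sum(a[i][j] * x0[j] for j in range(n)) for i in range(len(a))]
    # Marginal gain from hiding X^0_r; independent of the rest of the mask.
    gain = [
        sum((-1) ** x0[r] * prior[r] * a[i][r] * (1 - 2 * q[i]) for i in range(len(a)))
        for r in range(n)
    ]
    # Hide the (at most) k nodes with the largest positive gains.
    ranked = sorted(range(n), key=lambda r: gain[r], reverse=True)
    return {r for r in ranked[:k] if gain[r] > 0}
```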
## 7 Experiments

As discussed in Section 5, our approximation scheme is to compute both the $n$-approximation mask and the heuristic mask, and then take the one yielding the higher utility. Note that this combination clearly yields an $n$-approximation. As we now demonstrate, it is also significantly better than either of the approaches by itself. Figure 1 (left) shows the results on random general and additive networks, and demonstrates that our combined algorithm significantly outperforms the approximation algorithm, largely on the strength of the heuristic, which is highly effective in these settings. Figure 1 (right) studies settings constructed to be adversarial to the heuristic. As we can see, here the combined algorithm performs similarly to the approximation algorithm, while the heuristic in isolation ultimately performs poorly. Thus, the combination of the two is far stronger than each component in isolation.

Figure 1: Comparison between our combined algorithm, the heuristic and approximation algorithms in isolation, and random masking on randomly generated networks (left) and networks generated adversarially (right).

## 8 Conclusion

We introduce a model of deception in which a principal needs to make a decision based on the state of the world, and an adversary can mask information about the state. We study this in a model where the principal is oblivious to the presence of the adversary and reasons about state change using a dynamic Bayes network. Even in a simple two-time-period model, we show the existence of cases where an adversary with the ability to mask information about the state at time 0 can cause the oblivious principal to have an arbitrarily incorrect posterior. However, computing, or even approximating, these masks to within a polynomial factor is NP-hard in the general case. We also consider this problem with special structure on the transition probabilities, showing that when transitions depend only on the sum of parent values, the problem remains inapproximable, although we now exhibit an $n$-approximation. On the other hand, when transitions are linear, we show that the problem can be solved in polynomial time.

## Acknowledgments

This work was partially supported by the National Science Foundation (CAREER award IIS-1905558 and grants IIS-1903207 and IIS-1910392) and the Army Research Office (grant W911NF1910241 and MURI grant W911NF1810208).

## References

Almeshekah, M. H., and Spafford, E. H. 2016. Cyber security deception. In Cyber Deception. Springer. 25-52.

Baidu. Apollo autonomous driving solution. http://apollo.auto/.

Boloor, A.; He, X.; Gill, C.; Vorobeychik, Y.; and Zhang, X. 2019. Simple physical adversarial examples against end-to-end autonomous driving models. In IEEE International Conference on Embedded Software and Systems.

Carroll, T., and Grosu, D. 2011. A game theoretic investigation of deception in network security. Security and Communication Networks 4(10):1162-1172.

Cohen, F., and Koike, D. 2003. Leading attackers through attack graphs with deceptions. Computers and Security 22(5):402-411.

Dughmi, S., and Xu, H. 2016. Algorithmic Bayesian persuasion. In Symposium on Theory of Computing, 412-425.

Ettinger, D., and Jehiel, P. 2010. A theory of deception. American Economic Journal: Microeconomics 2(1):1-20.

Eykholt, K.; Evtimov, I.; Fernandes, E.; Li, B.; Rahmati, A.; Xiao, C.; Prakash, A.; Kohno, T.; and Song, D. 2018. Robust physical-world attacks on deep learning visual classification. In Computer Vision and Pattern Recognition.

Foundation, A. Autoware.ai. https://www.autoware.ai/.

Greenberg, I. 1982. The role of deception in decision theory. Journal of Conflict Resolution 26(1):139-156.

Kamenica, E., and Gentzkow, M. 2011. Bayesian persuasion. The American Economic Review 101(6):2590-2615.

Kiekintveld, C.; Lisy, V.; and Pibil, R. 2015. Game-theoretic foundations for the strategic use of honeypots in network security. In Cyber Warfare. 81-101.

Li, Z., and Das, S. 2019. Revenue enhancement via asymmetric signaling in interdependent-value auctions. In AAAI Conference on Artificial Intelligence, 2093-2100.

Nevmyvaka, Y., and Kearns, M. 2013. Machine learning for market microstructure and high frequency trading. In High Frequency Trading - New Realities for Traders, Markets and Regulators.
Nevmyvaka, Y.; Feng, Y.; and Kearns, M. 2006. Reinforcement learning for optimized trade execution. In International Conference on Machine Learning.

Pawlick, J., and Zhu, Q. 2015. Deception by design: Evidence-based signaling games for network defense. In Workshop on the Economics of Information Security.

Rabinovich, Z.; Jiang, A. X.; Jain, M.; and Xu, H. 2015. Information disclosure as a means to security. In International Conference on Autonomous Agents and Multiagent Systems, 645-653.

Rayo, L., and Segal, I. 2010. Optimal information disclosure. Journal of Political Economy 118(5):949-987.

Schlenker, A.; Thakoor, O.; Xu, H.; Tambe, M.; Vayanos, P.; Fang, F.; Tran-Thanh, L.; and Vorobeychik, Y. 2018. Deceiving cyber adversaries: A game theoretic approach. In International Conference on Autonomous Agents and Multiagent Systems.

Sharif, M.; Bhagavatula, S.; Bauer, L.; and Reiter, M. K. 2016. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition. In ACM SIGSAC Conference on Computer and Communications Security, 1528-1540.

Shen, W.; Tang, P.; and Zeng, Y. 2018. A closed-form characterization of buyer signaling schemes in monopoly pricing. In International Conference on Autonomous Agents and Multiagent Systems.

Stech, F. J.; Heckman, K. E.; and Strom, B. E. 2016. Integrating cyber-D&D into adversary modeling for active cyber defense. In Cyber Deception. Springer. 1-22.

Vorobeychik, Y., and Kantarcioglu, M. 2018. Adversarial Machine Learning. Morgan & Claypool.

Wang, X.; Wellman, M. P.; and Vorobeychik, Y. 2018. A cloaking mechanism to mitigate market manipulation. In International Joint Conference on Artificial Intelligence.

Xu, H.; Freeman, R.; Conitzer, V.; Dughmi, S.; and Tambe, M. 2016. Signaling in Bayesian Stackelberg games. In International Conference on Autonomous Agents and Multiagent Systems.