OPTIMAL STRATEGIES AGAINST GENERATIVE ATTACKS

Roy Mor, Tel Aviv University, Tel Aviv, Israel
Erez Peterfreund, The Hebrew University of Jerusalem, Jerusalem, Israel
Matan Gavish, The Hebrew University of Jerusalem, Jerusalem, Israel
Amir Globerson, Tel Aviv University, Tel Aviv, Israel

ABSTRACT

Generative neural models have improved dramatically recently. With this progress comes the risk that such models will be used to attack systems that rely on sensor data for authentication and anomaly detection. Many such learning systems are installed worldwide, protecting critical infrastructure or private data against malfunction and cyber attacks. We formulate the scenario of such an authentication system facing generative impersonation attacks, characterize it from a theoretical perspective and explore its practical implications. In particular, we ask fundamental theoretical questions in learning, statistics and information theory: How hard is it to detect a fake reality? How much data does the attacker need to collect before it can reliably generate nominally-looking artificial data? Are there optimal strategies for the attacker or the authenticator? We cast the problem as a maximin game, characterize the optimal strategy for both attacker and authenticator in the general case, and provide the optimal strategies in closed form for the case of Gaussian source distributions. Our analysis reveals the structure of the optimal attack and the relative importance of data collection for both authenticator and attacker. Based on these insights we design practical learning approaches and show that they result in models that are more robust to various attacks on real-world data.

1 INTRODUCTION

Generative models have attracted considerable attention since the introduction of Generative Adversarial Networks (Goodfellow et al., 2014a). Empirically, GANs have been shown to generate novel data instances that resemble those in the true distribution of the data. The success of GANs also comes with the risk that generative models will be used for attacking sensor-based security systems. One example is identity authentication systems, where an individual is identified via her images, and GANs might be able to generate such images to gain access (Thies et al., 2016). Another is anomaly detection systems protecting critical infrastructure. As demonstrated by recent cyber-attacks (notably the Stuxnet attack), sensors of these systems can be hijacked, so that GANs can be used to generate normal-looking activity while the actual system is being tampered with. The latter is, in fact, a new form of man-in-the-middle attack. Our goal here is to construct a theoretical framework for studying the security risk arising from generative models and to explore its practical implications. We begin with a simple key insight. If the attacker (i.e., the generative model) has unlimited observations of the source it is trying to imitate, it will be able to fool any authenticator. On the other hand, if the attacker has access to fewer sensor observations than the number of fake observations it needs to generate, it seems intuitively clear that it cannot always succeed (as we indeed prove in Sec. 4). Therefore, the optimal defense and attack strategies depend crucially on the amount of information available to the attacker and authenticator.
Motivated by the above insight, we cast the authentication setting as a two-player maximin game (authenticator vs. attacker) where all observations are finite. Specifically, there are three key observation sets to consider: those available to the attacker, those that the attacker needs to generate, and those available to the authenticator when designing the system. Our goal is to understand how these three information sources determine the optimal strategies for both players. Under the realistic assumption that cyber attackers are sophisticated enough to play optimal or close-to-optimal strategies, a characterization of the maximin authentication strategy can be of significant value. We prove several theoretical results characterizing the optimal strategy for both players. These results highlight the role played by the available observations as well as the functional form of an optimal attacker and authenticator. We refer to the setting above as GAN in The Middle (GIM) due to its similarity to man-in-the-middle attacks. After describing our theoretical results, we show how to learn both authenticator and attacker policies in practice, where both are based on neural architectures. Our GIM method can be applied to multiple practical problems. The first is building an authentication system that is robust to impersonation attacks. The second is building a data-generating mechanism that can generate novel data instances. Finally, we evaluate the method empirically, showing that it outperforms existing methods in terms of resilience to generative attacks, and that it can be used effectively for data augmentation in the few-shot learning setting.

2 PROBLEM STATEMENT

We begin by motivating our problem and formulating it as a two-player zero-sum game. As a simple illustrative example, consider a face authentication security system whose goal is to maximize authentication accuracy. The system is initialized by registering k images of an individual θ (the source), whose identity is to be authenticated. At test time, each entity claiming to be θ is required to present to the system n of its images, and the authentication system decides whether the entity is θ or an impersonator. We let m denote the maximum number of leaked images any attacker obtained. We observe that if an attacker obtained m ≥ n images of θ, it can present n of those images. The observations presented by the attacker are then indistinguishable from ones generated by θ, leading to failure of the authentication system (see Sec. 4.2 below). Hence, the number of images of θ that the attacker obtains and the size of the authentication sample are of key importance. We now turn to formally stating the problem.

Notation. The set of possible observations is denoted by X. Let H denote the known set of possible sources θ, where each source θ ∈ H is defined by a probability density f_θ, and an observation of a source θ ∈ H is an X-valued random variable with density f_θ. We assume that subsequent observations of the source are IID, so that n sequential observations have density f_\theta^{(n)}(x_1, \dots, x_n) := \prod_{i=1}^{n} f_\theta(x_i). We allow θ to be sampled from a known distribution Q on H and denote the corresponding H-valued random variable by Θ.
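To make the notation concrete, the following minimal numpy sketch instantiates the observation model for the Gaussian source family analyzed later in Sec. 4.3. The dimension, prior, and sample sizes here are illustrative choices of ours, not values prescribed by the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 5                        # observation dimension (illustrative)
    Sigma = np.eye(d)            # known covariance, as assumed in Sec. 4.3
    L = np.linalg.cholesky(Sigma)

    def sample_source(prior_std=10.0):
        # Draw a source theta ~ Q; here Q = N(0, prior_std^2 I), as used later in Sec. 6.
        return prior_std * rng.standard_normal(d)

    def sample_observations(theta, num):
        # num IID observations x_i ~ f_theta, i.e. one draw from f_theta^{(num)} = prod_i f_theta(x_i).
        return theta + rng.standard_normal((num, d)) @ L.T

    theta = sample_source()
    A = sample_observations(theta, num=10)  # registration sample (size k)
    Y = sample_observations(theta, num=1)   # leaked sample (size m)
    X = sample_observations(theta, num=5)   # an authentic test sample (size n)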
In what follows we will denote the number of observations leaked to the attacker by m, the number of registration observations available to the authenticator by k, and the number of observations required at authentication by n (these may be generated by either the attacker or the true source).

The Authentication Game. The game begins with a random source Θ being drawn from H according to Q. The authenticator first receives information about the drawn source, and then chooses a decision rule for deciding whether a given test sequence x ∈ X^n is an authentic sequence of observations sampled from f_θ^{(n)} or a fake sequence generated by the attacker. Formally, the authenticator learns about the source by seeing k IID registration observations A = (A_1, \dots, A_k) ∼ f_θ^{(k)}. The set of all possible decision rules is then the set of all functions D : X^k × X^n → {0, 1} (where a decision of 1 corresponds to the true source and 0 to an attacker). After the authenticator fixes its strategy, the attacker can seek the best attack strategy. We assume that the attacker has access to m leaked IID observations Y = (Y_1, \dots, Y_m) ∼ f_θ^{(m)} as information about the source θ. It then generates an attack sequence X ∈ X^n and presents it to the authenticator, which uses its decision rule to decide whether X is an authentic sequence of observations sampled from f_θ^{(n)} or a fake sequence generated by the attacker. Formally, the strategy set of the attacker is the set of all functions G : X^m → Δ(X^n), where Δ(X^n) is the set of probability distributions over X^n, and g_{X|Y} is the associated conditional probability density. We note that the set H, the parametric family f_θ, and the prior probability Q are known to both players. Also, note that the leaked sample Y revealed to the attacker is not available to the authenticator, and the registration sample A is not available to the attacker.

The goal of the authenticator is to maximize its expected accuracy, and the goal of the attacker is to minimize it (or, equivalently, to maximize its own success probability). We define the utility (payoff) of the game as the expected prediction accuracy of the authenticator. To define expected accuracy we consider the case of equal priors for attack and real samples.1 Formally, for a pair of strategies (D, G) and a specific source θ, the expected accuracy of the authenticator is:

V(\theta, D, G) = \frac{1}{2}\, \mathbb{E}_{A \sim f_\theta^{(k)}} \mathbb{E}_{Y \sim f_\theta^{(m)}} \Big[ \mathbb{E}_{X \sim f_\theta^{(n)}}[D(A, X)] + \mathbb{E}_{X \sim G(Y)}[1 - D(A, X)] \Big]    (2.1)

Since this utility depends on G only through the second term, minimizing it is equivalent to G maximizing its success probability. To obtain the overall utility for the authenticator, we take the expected value w.r.t. Θ and define V(D, G) = \mathbb{E}_{\Theta \sim Q} V(\Theta, D, G). Finally, we arrive at the following maximin game:

V_{game} = \max_{D \in \mathcal{D}} \min_{G \in \mathcal{G}} V(D, G)    (2.2)

where \mathcal{D}, \mathcal{G} are the sets of all possible authenticator and attacker strategies, respectively. In Sec. 4 we show that this game has a Nash equilibrium, we characterize the optimal strategies and game value in general, and find them in closed form for the case of multivariate Gaussian sources.

1 We give equal prior probability to attack and real samples for simplicity and clarity, and since this is common in GAN formulations. All results can be trivially generalized to any prior probability for attack.
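The quantities in Eqs. 2.1 and 2.2 are straightforward to estimate by simulation for any fixed pair of strategies. The sketch below continues the illustrative Gaussian setup from the previous snippet and estimates V(D, G) by Monte Carlo; the naive threshold rule and the resampling attack used here are placeholders of our own, not the optimal strategies derived in Sec. 4.

    import numpy as np

    rng = np.random.default_rng(1)
    d, k, m, n = 5, 10, 1, 5

    def sample_source():
        return 10.0 * rng.standard_normal(d)

    def sample_obs(theta, num):
        return theta + rng.standard_normal((num, d))

    def naive_authenticator(A, X, alpha=3.0):
        # D(A, X): accept iff the test-sample mean is close to the registration-sample mean.
        return float(np.sum((X.mean(0) - A.mean(0)) ** 2) < alpha)

    def resample_attacker(Y, num):
        # G(Y): resample the leaked observations with replacement (a crude attack).
        idx = rng.integers(0, len(Y), size=num)
        return Y[idx]

    def estimate_game_value(num_games=10000):
        acc = 0.0
        for _ in range(num_games):
            theta = sample_source()
            A = sample_obs(theta, k)           # registration sample
            Y = sample_obs(theta, m)           # leaked sample
            X_real = sample_obs(theta, n)      # authentic test sample
            X_fake = resample_attacker(Y, n)   # attack sample
            # Eq. 2.1: average of the true-accept and true-reject rates.
            acc += 0.5 * (naive_authenticator(A, X_real)
                          + (1.0 - naive_authenticator(A, X_fake)))
        return acc / num_games

    print(estimate_game_value())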
3 RELATED WORK

Adversarial hypothesis testing (AHT): Hypothesis testing (HT) is a rich field in statistics that studies how one can detect whether a sample was generated by one of two sets of distributions. A variant of HT that is related to our work but distinct from it is AHT (e.g., see Brandão et al., 2014; Barni & Tondi, 2013b;a; 2014; Bao et al., 2011; Zhou et al., 2019; Brückner & Scheffer, 2011; Brückner et al., 2012). These works describe an HT setting where the sample is generated by one of two hypothesis classes, but is then modified by an adversary in some restricted way. For example, in Barni & Tondi (2013b;a; 2014) the adversary can change the sample of one class up to a fixed distance (e.g., Hamming). Given the quality of current generative models and the rapid pace of progress, when considering an impersonation attack, it seems that the only relevant restriction one can assume on an attacker is on the information it has. This is not captured by prior work, since it assumes that the adversary has a restricted strategy set. In contrast, our work considers a novel problem setting where neither player is limited in its strategy set in any way. This leads to a novel analysis that focuses on the dependence on the finite information available to each player (m, n, k).

Adversarial Examples: It has been observed (Goodfellow et al., 2014b) that deep learning models can be very sensitive to small changes in their input. Such misleading inputs are known as adversarial examples, and much recent work has analyzed the phenomenon (Ilyas et al., 2019; Shamir et al., 2019; Zhang et al., 2019), addressed the problem of robustness to these (Moosavi-Dezfooli et al., 2016; Papernot et al., 2017; Yuan et al., 2017), and studied when robustness can be certified (Raghunathan et al., 2018; Wong et al., 2018). The setting of robust classification in the presence of adversarial examples can also be thought of as a specific case of AHT (see above), where a classifier is required to predict the class of an observation that could have been perturbed by a restricted adversary (Wong et al., 2018; 2019) or generated by an adversary limited to generating examples that will be classified correctly by humans (Song et al., 2018). In contrast, in our setting the attacker is not limited in any way, nor does it have another utility in addition to impersonating the source. Furthermore, in adversarial examples there is no notion of limited information for the adversary, whereas our work focuses on the dependence of the game on the information available to the players (sample sizes n, m, k).

GAN: The GAN model is a game between a generator and a discriminator. While our concept of generative attacks is inspired by GANs, it is very different from it: a successful GAN generator is not necessarily a successful attacker in our setting, and vice versa (e.g., given sufficiently expressive generators and discriminators, GANs can memorize the training data, and thus the discriminator will perform at chance level.2 Such a discriminator will not be useful as a defense against generative attacks). Unlike GANs, in our setting sample sizes are of key importance. Thus, our attacker will not memorize the data it sees, as this will be detected when generating n > m examples.

2 If the generator is not expressive enough, the learned GAN may have small support (Arora et al., 2017).

Conditional GANs: In conditional GANs (Mirza & Osindero, 2014) the generator uses side information for generating new samples. The attacker in our approach (analogous to the GAN generator) has input, but this input is not available to the authenticator (analogous to the GAN discriminator). Thus,
the objective of the learning process is fundamentally different.

Few-shot learning and generation: Our work relates to few-shot learning (Snell et al., 2017; Vinyals et al., 2016; Finn et al., 2017; Lake et al., 2011; Koch et al., 2015) and few-shot generative models (Rezende et al., 2016; Zakharov et al., 2019; Lake et al., 2015; Edwards & Storkey, 2017; Hewitt et al., 2018) in the sense that both authenticator and attacker need to learn from a limited set of observations. However, in our setting the authenticator is required to predict whether a sample came from the true source or from an attacker impersonating the source, while taking into consideration the amount of information both players have. These notions are not part of the general few-shot learning setup. Also, in prior work on few-shot generation, the generator is either measured through human evaluation (Lake et al., 2015) or trained to maximize the likelihood of its generated sample (Rezende et al., 2016; Edwards & Storkey, 2017; Hewitt et al., 2018). In contrast, in our setting the attacker's objective is to maximize the probability that its generated sample will be labeled as real by an authenticator. To this end, we show that the attacker must consider the sample sizes m, n, k, which the generative models in prior work do not account for. Furthermore, we show in Sec. F.4, and in Figures 1c and 6, that the maximum likelihood (ML) solution is indeed sub-optimal in our setting.

Image-to-image translation: Several GAN models have been introduced for mapping between two domains (Zhu et al., 2017; Huang et al., 2018; Isola et al., 2017; Wang et al., 2017; Park et al., 2019). This relates to our work since the attacker also needs to learn to map the leaked sample to an attack sample. However, in our setting the mapping is not to a different domain but rather to other images from the same distribution, which results in a different objective.

Data Augmentation: Generative models have also been used for augmenting data in supervised learning, and in particular few-shot learning (Koch et al., 2015; Snell et al., 2017; Vinyals et al., 2016; Lake et al., 2011). One such approach is Data Augmentation GAN (DAGAN) (Antoniou et al., 2018), which takes as input an image and generates a new image. It relates to our framework in the limited case of m = 1, n = 2, k = 1, in the sense that the generator's objective is to map one image to two. However, in DAGAN the only goal of the discriminator is to improve the generator, and the generator is limited to the functional form of adding a new image to the existing one, which is a sub-optimal attack strategy, as can be seen from the Gaussian case of our problem.

4 THEORETICAL RESULTS

In this section, we study the game defined in Eq. 2.2. First, in Sec. 4.1 we show the existence of a Nash equilibrium and characterize the optimal strategies for both players. Specifically, we show that the optimal attacker strategy minimizes a certain divergence between the source's and the attacker's conditional distribution of X given A. Next, Sec. 4.2 shows that when there are at least as many leaked samples m as generated samples n, the authenticator will fail. Finally, in Sec. 4.3 we provide a closed-form solution for both attacker and authenticator for the case of multivariate Gaussian distributions and analyze the effect of the dimension of the observations and the sample sizes m, n, and k. Proofs are provided in the appendix.
4.1 CHARACTERIZING THE OPTIMAL STRATEGIES

We begin by showing that the game defined in Eq. 2.2 admits a Nash equilibrium. Namely, Theorem 4.1 below shows that there exists a pair of strategies (D*, G*) that satisfies:

\max_{D \in \mathcal{D}} \min_{G \in \mathcal{G}} V(D, G) = \min_{G \in \mathcal{G}} \max_{D \in \mathcal{D}} V(D, G) = V(D^*, G^*)    (4.1)

Theorem 4.1. Consider the attacker G* defined by:

g^*_{X|Y} \in \operatorname*{argmin}_{g_{X|Y}} \; \mathbb{E}_{A \sim f_A} \int_{x \in X^n} \big| f_{X|A}(x|A) - g_{X|A}(x|A) \big| \, dx    (4.2)

where f_A(a) = \int_{\theta \in H} Q(\theta) f_\theta^{(k)}(a)\, d\theta is the marginal density of A, Q_{\Theta|A} is the posterior over H given A, and f_{Y|A}(y|a) = \int_{\theta \in H} f_\theta^{(m)}(y)\, Q_{\Theta|A}(\theta|a)\, d\theta. Also, f_{X|A}(x|a) = \int_{\theta \in H} f_\theta^{(n)}(x)\, Q_{\Theta|A}(\theta|a)\, d\theta and g_{X|A}(x|a) = \int_{y \in X^m} g_{X|Y}(x|y)\, f_{Y|A}(y|a)\, dy are the conditional densities of X given A, generated by the source and the attacker respectively. Consider the authenticator defined by D^*(a, x) = I[\, f_{X|A}(x|a) > g^*_{X|A}(x|a)\, ], where I is the indicator function. Then (D*, G*) is a solution of Eq. 2.2 that satisfies Eq. 4.1.

The proof (see Sec. D) follows by first showing that, since D(a, x) ∈ {0, 1}, for any G the optimal authenticator strategy is a MAP test between the two hypotheses (true source or attacker). We then show that given D*, the game objective for G becomes Eq. 4.2. Namely, the optimal attacker minimizes the ℓ1 distance, over the space X^k × X^n, between the true source's conditional distribution of X given A and its own. Therefore, since the proposed G* minimizes Eq. 4.2 by definition, it holds that min_G V(D*, G) = V(D*, G*) = max_D V(D, G*), and it follows that (D*, G*) satisfy Eq. 4.1.

4.2 REPLAY ATTACKS: AUTHENTICATION FAILURE FOR n ≤ m

When n ≤ m, the attacker generates a number of observations that is at most the number of observations it has seen. Intuitively, an optimal attack in this case is to simply replay a subset of size n of the m observations. This is known as a replay attack (Syverson, 1994). This subset constitutes an IID sample of length n of the observed source, and is therefore a legitimate fresh sample. In this case, it seems that the attack cannot be detected by the authenticator. Indeed, it is easy to show using Theorem 4.1 that this attack is optimal, and therefore for n ≤ m we have:

\max_{D \in \mathcal{D}} \min_{G \in \mathcal{G}} V(D, G) = 0.5  (see Sec. E)

4.3 THE GAUSSIAN CASE

We now turn to the case of multivariate Gaussian distributions, where we can find the exact form of the attacker and authenticator, providing insight into the general problem. Specifically, we consider the setting where the source distributions are d-dimensional multivariate Gaussians with an unknown mean and known covariance, and the prior Q over H is the improper uniform prior.3 We assume n > m to keep the problem non-trivial. Let the observations be d-dimensional Gaussian vectors with a known covariance matrix Σ ∈ R^{d×d} and an unknown mean vector θ ∈ R^d. The set of possible sources H becomes R^d, the Gaussian mean vectors. For any sample of n examples z ∈ R^{n×d}, we let z_i denote the i-th example, and \bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i denote the sample mean. Finally, for any v ∈ R^d and B ∈ R^{d×d}, we define \|v\|^2_B := v^T B v. The following theorem gives a closed-form solution for both attacker and authenticator for the game defined in Eq. 2.2.

3 Similar results can be derived for the conjugate prior case, namely, proper Gaussian priors.

Theorem 4.2. Define δ = m/n ≤ 1 and let ρ = m/k. Consider the attacker G* defined by the following generative process: given a leaked sample Y ∈ R^{m×d}, G* generates a sample X ∈ R^{n×d} as follows: it first samples n vectors W_1, \dots, W_n ∼ iid N(0, Σ) and then sets:

X_i = W_i - \bar{W} + \bar{Y}

Define the authenticator D* by:

D^*(a, x) = I\left[ \|\bar{x} - \bar{a}\|^2_{\Sigma^{-1}} < \frac{d\,(1+\rho)(1+\rho/\delta)}{n(1-\delta)} \log\frac{1+\rho}{\rho+\delta} \right]

Then (D*, G*) is a solution of Eq. 2.2 that satisfies Eq. 4.1.
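To make the closed form concrete, here is a small numpy sketch of the two strategies of Theorem 4.2. The identity covariance and the sample sizes are illustrative choices of ours, and the threshold is the one stated above specialized to Σ = I.

    import numpy as np

    rng = np.random.default_rng(2)
    d, k, m, n = 100, 10, 1, 5
    delta, rho = m / n, m / k

    def optimal_attack(Y):
        # G* of Theorem 4.2: sample W_1..W_n ~ N(0, I) and shift so the sample mean equals Y-bar.
        W = rng.standard_normal((n, d))
        return W - W.mean(axis=0) + Y.mean(axis=0)

    # Threshold of D* (with Sigma = I, ||.||^2_{Sigma^-1} is the squared Euclidean norm).
    alpha = (d * (1 + rho) * (1 + rho / delta) / (n * (1 - delta))
             * np.log((1 + rho) / (rho + delta)))

    def optimal_authenticator(A, X):
        # D*(a, x) = 1 iff ||x-bar - a-bar||^2 < alpha.
        return float(np.sum((X.mean(axis=0) - A.mean(axis=0)) ** 2) < alpha)

    # One round of the game against the optimal attack.
    theta = 10.0 * rng.standard_normal(d)
    A = theta + rng.standard_normal((k, d))       # registration sample
    Y = theta + rng.standard_normal((m, d))       # leaked sample
    X_real = theta + rng.standard_normal((n, d))  # authentic test sample
    X_fake = optimal_attack(Y)
    print(optimal_authenticator(A, X_real), optimal_authenticator(A, X_fake))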
The proof (see Sec. F) starts by showing that for every α > 0, given D(a, x) = I[\|\bar{x} - \bar{a}\|^2_{\Sigma^{-1}} < α], the optimal strategy for G is to set \bar{x} = \bar{y} with probability 1 (as G* does). To prove this, we first use the Prékopa-Leindler inequality (Prékopa, 1973) to show that in this case G's maximization objective is log-concave. We then show that any G that satisfies \bar{x} = \bar{y} with probability 1 is a local extremum, and since the objective is log-concave it follows that it is the global maximum. We continue by showing that given G*, the proposed D* is optimal. To do so, we first find the distribution of G*'s attack sample using the Woodbury matrix identity, and then show that D* is indeed the optimal decision rule. Finally, using the max-min inequality, this implies that (D*, G*) satisfy Eq. 4.1.

There are several interesting observations about the above optimal strategies. Perhaps the most intuitive strategy for the attacker would have been to sample n elements from a Gaussian with mean \bar{Y} and the known covariance Σ. In expectation, this sample would have the correct mean. However, this turns out to be sub-optimal, as can be seen in Figures 1c and 6 (we refer to this as an ML attack; see Sec. F.4 in the appendix for the derivation and visualizations). Instead, the optimal attacker begins by drawing an IID sample W from a Gaussian distribution with mean 0, and then forces the sample mean to be exactly \bar{Y} by shifting the sample points by \bar{Y} - \bar{W}. This optimal attacker strategy can be viewed as matching the sufficient statistics of the leaked sample Y in the generated sample X. The optimal authenticator is a MAP test for the optimal attacker, as in Theorem 4.1. As a corollary to Theorem 4.2 we obtain the value of the game (i.e., the accuracy of D*).

Corollary 4.3. Define δ and ρ as in Theorem 4.2. Then the game value for the Gaussian case is:

\frac{1}{2} + \frac{1}{2\,\Gamma(d/2)} \left[ \gamma\!\left( \frac{d}{2},\; \frac{d(1+\rho)}{2(1-\delta)} \log\frac{1+\rho}{\rho+\delta} \right) - \gamma\!\left( \frac{d}{2},\; \frac{d(\rho+\delta)}{2(1-\delta)} \log\frac{1+\rho}{\rho+\delta} \right) \right]

where γ is the lower incomplete Gamma function and Γ is the Gamma function.

The proof (see Sec. F) follows by showing that the test statistic used by D* is Gamma distributed.

Fig. 1 demonstrates several interesting aspects of the above results. First, Fig. 1a shows that the authenticator accuracy improves as n (the size of the test sample) grows. Furthermore, accuracy also improves as the dimension d grows, meaning that for a specified authentication accuracy, the required ratio n/m becomes smaller with the dimension d. This is a very encouraging result since, although this dimensional dependency is proved only for Gaussian sources, it suggests that for real-world high-dimensional sources (e.g., faces, video, voice, etc.) the authenticator can achieve high accuracy even when requiring a small (and practical) authentication sample.

Intuitively, it may seem that authentication is impossible when the authenticator has less data than the attacker (i.e., m > k). Surprisingly, this is not the case. As can be seen in Fig. 1b, even when m > k the authenticator can achieve non-trivial accuracy.4 An intuitive explanation of this phenomenon is that the test statistic used by the authenticator is based on \bar{X} - \bar{A}, which, due to the variance in the attacker's estimate of the source, has higher variance when X is generated by an attacker than when X is generated by the true source. This, in turn, allows the authenticator to discriminate between the hypotheses.

4 See Sec. C in the appendix for additional figures showing that as the dimension grows, the expected accuracy goes to 1.

A closed-form solution for the general case remains an open problem. We believe that solving for Gaussians is an important step forward, since it exposes interesting structural properties of the solution, which we use in practice. Furthermore, if G has an encoder-decoder structure, it is not unreasonable that the source distribution in latent space can be approximately Gaussian (as in a VAE).
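The closed form in Corollary 4.3 is easy to evaluate numerically; the sketch below does so using scipy's regularized lower incomplete Gamma function, and it can be cross-checked against a Monte Carlo simulation of D* versus G*. The parameter values are illustrative.

    import numpy as np
    from scipy.special import gammainc  # regularized lower incomplete gamma: gamma(a, x) / Gamma(a)

    def gaussian_game_value(d, m, n, k):
        # Expected accuracy of D* against G* (Corollary 4.3), valid for n > m.
        delta, rho = m / n, m / k
        log_term = np.log((1 + rho) / (rho + delta))
        arg_real = d * (1 + rho) / (2 * (1 - delta)) * log_term     # authentic test sample
        arg_fake = d * (rho + delta) / (2 * (1 - delta)) * log_term  # attack sample
        return 0.5 + 0.5 * (gammainc(d / 2, arg_real) - gammainc(d / 2, arg_fake))

    # Accuracy grows with the test-sample size n and with the dimension d (cf. Fig. 1a).
    for d in (5, 50, 500):
        print(d, [round(gaussian_game_value(d, m=1, n=n, k=10), 3) for n in (2, 5, 10)])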
5 GAN IN THE MIDDLE NETWORKS

So far we explored the general formalism of authentication games. Here we consider specific architectures for D and G. As in GAN-based models (Mirza & Osindero, 2014; Mescheder et al., 2018; Karras et al., 2018a;b), we use neural nets to model these, while using insights from our theoretical analysis. Below we provide implementation details for the GIM model (see Sec. H and the code for more details). In our analysis, we considered the non-differentiable zero-one loss since it is the real accuracy measure. In practice, we use cross-entropy, as in most GAN approaches.

Authenticator Architecture: The authenticator is implemented as a neural network D(a, x) that maps a source information sample a ∈ X^k and a test sample x ∈ X^n to the probability that the test sample came from the true source. Our framework does not restrict the authenticator to any specific function type, but in practice one must implement it using some model. We recall that our theoretical results do suggest a certain functional form. The Gaussian results in Sec. 4.3 show that the optimal authenticator is a test on the sufficient statistic of the source parameters. Motivated by this result, and in the spirit of Siamese networks (Koch et al., 2015; Chopra et al., 2005), we consider the following form for the authenticator. We define a function T_D that maps a sample to a fixed-size vector, analogous to the sufficient statistic in the theorem. We apply T_D to both a and x. Then, these two outputs are used as input to a comparison function σ which outputs a scalar reflecting their similarity. Thus the authenticator can be expressed as: D(a, x) = σ(T_D(a), T_D(x)).

Attacker Architecture: The attacker is implemented as a stochastic neural network G(y) that maps a leaked sample y ∈ X^m to an attack sample x ∈ X^n. Our theoretical results suggest a certain functional form for this network. The Gaussian analysis in Sec. 4.3 shows that the optimal attacker generates a sample whose sufficient statistic matches that of the leaked sample. Motivated by this result, we consider the following functional form for the attacker. First, it applies a function T_G that maps the leaked sample Y to a fixed-size vector T_G(Y), analogous to the sufficient statistic in the theorem. It then draws n random latent vectors W_1, \dots, W_n and matches their mean to the leaked sufficient statistic to obtain the latent vectors W'_i; namely, it sets W'_i = W_i - \bar{W} + T_G(Y), as done in Theorem 4.2. Finally, it uses a decoder function φ that maps each latent vector W'_i to the domain X. Thus, the attacker can be expressed as: G(Y)_i = φ(W_i - \bar{W} + T_G(Y)) for all i ∈ [n].
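The functional forms above translate directly into a small PyTorch sketch. The encoder, decoder, and comparison networks below are placeholder MLPs over flattened observations, chosen by us for brevity; the published implementation uses different architectures, so treat this only as an illustration of the statistic-matching structure.

    import torch
    import torch.nn as nn

    class Authenticator(nn.Module):
        # D(a, x) = sigma(T_D(a), T_D(x)): embed both samples, then compare.
        def __init__(self, obs_dim, stat_dim=64):
            super().__init__()
            self.T_D = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, stat_dim))
            self.sigma = nn.Sequential(nn.Linear(2 * stat_dim, 64), nn.ReLU(), nn.Linear(64, 1))

        def forward(self, a, x):                  # a: (k, obs_dim), x: (n, obs_dim)
            t_a = self.T_D(a).mean(dim=0)         # pooled statistic of the registration sample
            t_x = self.T_D(x).mean(dim=0)         # pooled statistic of the test sample
            return self.sigma(torch.cat([t_a, t_x])).squeeze()  # logit that x came from the true source

    class Attacker(nn.Module):
        # G(Y)_i = phi(W_i - W_bar + T_G(Y)): match the latent sample mean to the leaked statistic.
        def __init__(self, obs_dim, latent_dim=64):
            super().__init__()
            self.T_G = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
            self.phi = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, obs_dim))
            self.latent_dim = latent_dim

        def forward(self, y, n):                  # y: (m, obs_dim)
            t_y = self.T_G(y).mean(dim=0)         # statistic of the leaked sample
            w = torch.randn(n, self.latent_dim)   # fresh latent noise W_1..W_n
            w_shifted = w - w.mean(dim=0) + t_y   # force the latent sample mean to equal T_G(Y)
            return self.phi(w_shifted)            # decode to an attack sample of size n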
Optimization Details: Each iteration begins when a source θ is chosen randomly from the set of sources in the training dataset (e.g., a person to be authenticated). Samples A, Y, X_θ are drawn from the set of examples available for θ, where X_θ represents a test sample from θ. Then, given the leaked sample Y, the generator G generates a fake sample X_G, passes it to D, and suffers the appropriate loss. Finally, D receives as input the source information sample A, outputs a prediction for each of the test samples X_θ, X_G, and suffers the appropriate loss. Optimization is done via gradient ascent on the authenticator parameters and descent on the attacker parameters, as is typical for GAN problems.
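A minimal training step consistent with this description might look as follows, continuing the illustrative modules sketched above. The data-loading helper sample_source_batch is hypothetical, and the plain binary cross-entropy used here is a simplification of the regularized GAN loss (Mescheder et al., 2018) used in the actual experiments.

    import torch
    import torch.nn.functional as F

    def gim_step(authenticator, attacker, d_opt, g_opt, sample_source_batch, m, n, k):
        # One GIM iteration: ascent on the authenticator loss, descent on the attacker loss.
        A, Y, X_real = sample_source_batch(m=m, n=n, k=k)  # observations of one training source (hypothetical helper)

        # Authenticator update: label the real test sample 1 and the attack sample 0.
        X_fake = attacker(Y, n).detach()
        d_loss = (F.binary_cross_entropy_with_logits(authenticator(A, X_real), torch.ones(()))
                  + F.binary_cross_entropy_with_logits(authenticator(A, X_fake), torch.zeros(())))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

        # Attacker update: try to make the attack sample be labeled as real.
        X_fake = attacker(Y, n)
        g_loss = F.binary_cross_entropy_with_logits(authenticator(A, X_fake), torch.ones(()))
        g_opt.zero_grad(); g_loss.backward(); g_opt.step()
        return d_loss.item(), g_loss.item()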
6 EXPERIMENTS

We next evaluate our method empirically. In all experiments, we use the model described in Sec. 5. We optimize the model with adversarial training, using the loss suggested by Mescheder et al. (2018) and Adam (Kingma & Ba, 2015). Our implementation is available at https://github.com/roymor1/OptimalStrategiesAgainstGenerativeAttacks. Also, see Sec. H for further implementation details.

Gaussian sources: For the case of Gaussian sources, we arrived at a complete characterization of the solution in Sec. 4.3. Thus, we can learn the models using our GIM algorithm and check whether it finds the correct solution. This is important, as the GIM objective is clearly non-convex, and GANs are generally hard to train in practice and lack convergence guarantees (Mescheder et al., 2018). We ran all experiments using a multivariate Gaussian with Σ = I_d, and in each game the source mean was drawn from the prior distribution Q = N(0, 10 I_d). This approximates the improper uniform prior since the prior has much larger variance than the sources. Fig. 1a shows the empirical game value compared with the theoretical one as a function of the test sample size n, for fixed m = 1, k = 10, and different values of d. It can be seen that there is an excellent fit between theory and experiment.

Figure 1: Game value (expected authentication accuracy) for the Gaussian case. (a) A comparison between empirical and theoretical game value for different d values (m = 1, k = 10). Solid lines describe the theoretical game values, whereas the markers describe the empirical accuracy when learning with the GIM model. (b) Theoretical game value as a function of δ, ρ (see Corollary 4.3) for d = 100. (c) Empirical accuracy of an optimal authenticator against two attacks: the theoretically optimal attack G* from Theorem 4.2 and a maximum likelihood (ML) attack (see Sec. F.4) for the Gaussian case. It can be seen that the ML attack is inferior in that it results in better accuracy for the authenticator, as predicted by our theoretical results.

Authentication on Faces and Characters: We next evaluate GIM in an authentication setting on two datasets: the VoxCeleb2 faces dataset (Nagrani et al., 2017; Chung & Zisserman, 2018) and the Omniglot handwritten character dataset (Lake et al., 2015). Additional information about the datasets, splits, and modeling details is provided in Sections G and H. Our goal is to check whether the GIM authenticator is more robust to generative attacks than a state-of-the-art authentication system. To evaluate this, we consider several attackers: 1) A random source attacker (RS): a naive attacker that ignores the leaked sample Y. It simply draws a random source from the dataset and samples n real images of that source. From the authenticator's perspective, this is equivalent to a sample version of the verification task (Koch et al., 2015; Schroff et al., 2015; Deng et al., 2018), in which an agent is presented with a pair of real images and needs to decide whether they are from the same source or not. 2) Replay attacker (Replay): an attacker which, upon seeing a leaked sample Y, draws n random images from the leaked sample (with replacement). 3) A GIM attack, which is the worst-case attacker G learned by our GIM model. For VoxCeleb we compare the GIM authenticator to the ArcFace method (Deng et al., 2018), which is currently state of the art in face verification. As a baseline for Omniglot, we use the Siamese network suggested by Koch et al. (2015), which achieves state of the art in the verification task on Omniglot. Results are shown in Table 1. It can be seen that on average across attacks, GIM outperforms the baselines. The only attack for which GIM is inferior is RS. This is not surprising, as this is the objective that both baselines are trained for.

Table 1: Accuracy of GIM and baselines against attacks. Avg acc denotes the average over all attacks.

Authenticator  Dataset    m  n  k  RS     Replay  GIM    Avg acc
GIM            VoxCeleb2  1  5  5  0.897  0.837   0.822  0.852
ArcFace        VoxCeleb2  1  5  5  0.998  0.598   0.526  0.707
GIM            Omniglot   1  5  5  0.912  0.942   0.868  0.907
Siamese        Omniglot   1  5  5  0.994  0.509   0.785  0.763

Qualitative evaluation of the attacker: In Fig. 2 we provide images generated by the GIM attacker for the test set of both Omniglot and VoxCeleb2. The images demonstrate qualitatively the strategy learned by the attacker. In the VoxCeleb2 dataset, face images are drawn from a video of the person talking. Note that, as in real samples from the data, the attack sample varies in pose and expression and not in the background or clothing.

Figure 2: Images generated by the GIM attacker based on one leaked image. In each row, the leftmost image is the real leaked image, and the rest of the images are an attack sample generated by the GIM attacker. (a) VoxCeleb2 dataset. (b) Omniglot dataset.

Data augmentation: Finally, we use GIM for data augmentation in one-shot classification on Omniglot. This is done by using the GIM attacker to generate more data for a given class. We first train GIM on the training set with parameters m = 1, n = 5, k = 5. Then, during both training and testing of one-shot classification, we use the attacker to augment the single example available for each of the classes by adding to it the n = 5 examples our attacker generated from it. We use Prototypical Networks (Snell et al., 2017) as the baseline model. We find that without our augmentation method, Prototypical Networks achieve 95.9% accuracy on the test split, and with our method they achieve 96.5%, which is similar to the improvement achieved by Antoniou et al. (2018) with Matching Networks (Vinyals et al., 2016) as the few-shot classification algorithm.

7 CONCLUSIONS

We defined the notion of authentication in the face of generative attacks, in which a generative model attempts to produce a fake reality based on observations of reality. These attacks raise numerous interesting theoretical questions and are very important and timely from a practical standpoint. We proposed to study generative attacks as a two-person zero-sum game between attacker and authenticator. In our most general setup, both attacker and authenticator have access to a finite set of observations of the source. We show that this game has a Nash equilibrium, and we characterize the optimal strategies. In the Gaussian version of the game, a closed form of the optimal strategies is available.
A nice outcome of the analysis is that the game value depends on m, n, k only through their ratios δ = m/n (i.e., the expansion ratio between attack and leaked sample sizes) and ρ = m/k (i.e., the information ratio between the number of source observations available to attacker and authenticator). As we show in Fig. 1b, there is a large range of values for which high-accuracy authentication is possible, and as d grows we observe that the high-authentication-accuracy region in the (δ, ρ) plane grows sharply. We introduce the GIM model, which is a practical approach to learning both authenticator and attacker, and whose structure is inspired by our analysis. GIM achieves accuracy that is very close to the theoretical rates in the Gaussian case, and is also more robust to attacks when compared to state-of-the-art authenticators on real data. Many theoretical and practical questions remain, for example finding closed-form optimal strategies for other distributions and going beyond IID generation. The non-IID setting is of particular importance for the problem of fake video (Thies et al., 2016) and audio (Arik et al., 2018) generation, which we intend to study in the future.

ACKNOWLEDGEMENTS

This work has been supported by the Blavatnik Interdisciplinary Research Center (ICRC), the Federmann Research Center (Hebrew University) and Israeli Science Foundation research grants 1523/16 and 1186/18.

REFERENCES

Antreas Antoniou, Amos J. Storkey, and Harrison Edwards. Augmenting image classifiers using data augmentation generative adversarial networks. In Artificial Neural Networks and Machine Learning, pp. 594-603, 2018.

Sercan Arik, Jitong Chen, Kainan Peng, Wei Ping, and Yanqi Zhou. Neural voice cloning with a few samples. In Advances in Neural Information Processing Systems, pp. 10040-10050, 2018.

Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, and Yi Zhang. Generalization and equilibrium in generative adversarial nets (GANs). In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pp. 224-232. JMLR.org, 2017.

Ning Bao, O. Patrick Kreidl, and John Musacchio. A network security classification game. In Game Theory for Networks - 2nd International ICST Conference, GAMENETS 2011, Shanghai, China, April 16-18, 2011, Revised Selected Papers, pp. 265-280, 2011. doi: 10.1007/978-3-642-30373-9_19. URL https://doi.org/10.1007/978-3-642-30373-9_19.

Mauro Barni and Benedetta Tondi. Multiple-observation hypothesis testing under adversarial conditions. In 2013 IEEE International Workshop on Information Forensics and Security, WIFS 2013, Guangzhou, China, November 18-21, 2013, pp. 91-96, 2013a. doi: 10.1109/WIFS.2013.6707800. URL https://doi.org/10.1109/WIFS.2013.6707800.

Mauro Barni and Benedetta Tondi. The source identification game: An information-theoretic perspective. IEEE Trans. Information Forensics and Security, 8(3):450-463, 2013b. doi: 10.1109/TIFS.2012.2237397. URL https://doi.org/10.1109/TIFS.2012.2237397.

Mauro Barni and Benedetta Tondi. Binary hypothesis testing game with training data. IEEE Trans. Information Theory, 60(8):4848-4866, 2014. doi: 10.1109/TIT.2014.2325571. URL https://doi.org/10.1109/TIT.2014.2325571.

Fernando G. S. L. Brandão, Aram Wettroth Harrow, James R. Lee, and Yuval Peres. Adversarial hypothesis testing and a quantum Stein's lemma for restricted measurements. In Innovations in Theoretical Computer Science, ITCS '14, Princeton, NJ, USA, January 12-14, 2014, pp. 183-194, 2014. doi: 10.1145/2554797.2554816. URL https://doi.org/10.1145/2554797.2554816.
Published as a conference paper at ICLR 2020 Michael Br uckner and Tobias Scheffer. Stackelberg games for adversarial prediction problems. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, August 21-24, 2011, pp. 547 555, 2011. doi: 10.1145/ 2020408.2020495. URL https://doi.org/10.1145/2020408.2020495. Michael Br uckner, Christian Kanzow, and Tobias Scheffer. Static prediction games for adversarial learning problems. J. Mach. Learn. Res., 13:2617 2654, 2012. URL http://dl.acm.org/ citation.cfm?id=2503326. Sumit Chopra, Raia Hadsell, and Yann Le Cun. Learning a similarity metric discriminatively, with application to face verification. In IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 539 546, 2005. Arsha Nagrani Chung, Joon Son and Andrew Zisserman. Voxceleb2: Deep speaker recognition. In Interspeech, 2018. Jiankang Deng, Jia Guo, and Stefanos Zafeiriou. Arcface: Additive angular margin loss for deep face recognition. ar Xiv preprint ar Xiv:1801.07698, 2018. Harrison Edwards and Amos Storkey. Towards a neural statistician. In International Conference on Learning Representations, 2017. Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. Co RR, abs/1703.03400, 2017. URL http://arxiv.org/abs/1703. 03400. Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in neural information processing systems, pp. 2672 2680, 2014a. Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2014b. Luke B Hewitt, Maxwell I Nye, Andreea Gane, Tommi Jaakkola, and Joshua B Tenenbaum. The variational homoencoder: Learning to learn high capacity generative models from few examples. In Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, pp. 988 997, 2018. Xun Huang and Serge J. Belongie. Arbitrary style transfer in real-time with adaptive instance normalization. In IEEE International Conference on Computer Vision, pp. 1510 1519, 2017. Xun Huang, Ming-Yu Liu, Serge Belongie, and Jan Kautz. Multimodal unsupervised image-toimage translation. In Proceedings of the European Conference on Computer Vision, pp. 172 189, 2018. Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Logan Engstrom, Brandon Tran, and Aleksander Madry. Adversarial examples are not bugs, they are features. Ar Xiv, abs/1905.02175, 2019. Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. In The IEEE Conference on Computer Vision and Pattern Recognition, pp. 5967 5976, 2017. Justin Johnson, Alexandre Alahi, and Li Fei-Fei. Perceptual losses for real-time style transfer and super-resolution. In European Conference on Computer Vision, 2016. Tero Karras, Timo Aila, Samuli Laine, and Jaakko Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In International Conference on Learning Representations, 2018a. Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. ar Xiv preprint ar Xiv:1812.04948, 2018b. Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. In International Conference on Learning Representations, 2015. 
Published as a conference paper at ICLR 2020 Gregory Koch, Richard Zemel, and Ruslan Salakhutdinov. Siamese neural networks for one-shot image recognition. In ICML deep learning workshop, volume 2, 2015. Brenden M. Lake, Ruslan Salakhutdinov, Jason Gross, and Joshua B. Tenenbaum. One shot learning of simple visual concepts. In Proceedings of the 33th Annual Meeting of the Cognitive Science Society, 2011. Brenden M Lake, Ruslan Salakhutdinov, and Joshua B Tenenbaum. Human-level concept learning through probabilistic program induction. Science, 350(6266):1332 1338, 2015. Lars Mescheder, Andreas Geiger, and Sebastian Nowozin. Which training methods for GANs do actually converge? In Proceedings of the 35th International Conference on Machine Learning, pp. 3478 3487, 2018. Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. ar Xiv preprint ar Xiv:1411.1784, 2014. Seyed-Mohsen Moosavi-Dezfooli, Alhussein Fawzi, and Pascal Frossard. Deepfool: a simple and accurate method to fool deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 2574 2582, 2016. A. Nagrani, J. S. Chung, and A. Zisserman. Voxceleb: a large-scale speaker identification dataset. In Interspeech, 2017. Nicolas Papernot, Patrick Mc Daniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia conference on computer and communications security, pp. 506 519, 2017. Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. Semantic image synthesis with spatially-adaptive normalization. Co RR, abs/1903.07291, 2019. URL http://arxiv.org/ abs/1903.07291. Andr as Pr ekopa. On logarithmic concave measures and functions. Acta Scientiarium Mathematicarum, 34:335 343, 1973. Aditi Raghunathan, Jacob Steinhardt, and Percy Liang. Certified defenses against adversarial examples. In International Conference on Learning Representations, 2018. Danilo Jimenez Rezende, Shakir Mohamed, Ivo Danihelka, Karol Gregor, and Daan Wierstra. Oneshot generalization in deep generative models. In Proceedings of the 33nd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, pp. 1521 1529, 2016. URL http://proceedings.mlr.press/v48/rezende16.html. Florian Schroff, Dmitry Kalenichenko, and James Philbin. Facenet: A unified embedding for face recognition and clustering. In The IEEE Conference on Computer Vision and Pattern Recognition, pp. 815 823, 2015. Adi Shamir, Itay Safran, Eyal Ronen, and Orr Dunkelman. A simple explanation for the existence of adversarial examples with small hamming distance. Co RR, abs/1901.10861, 2019. URL http: //arxiv.org/abs/1901.10861. Jake Snell, Kevin Swersky, and Richard S. Zemel. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems, pp. 4080 4090, 2017. Yang Song, Rui Shu, Nate Kushman, and Stefano Ermon. Constructing unrestricted adversarial examples with generative models. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (eds.), Advances in Neural Information Processing Systems 31, pp. 8312 8323. Curran Associates, Inc., 2018. URL http://papers.nips.cc/paper/ 8052-constructing-unrestricted-adversarial-examples-with-generative-models. pdf. Paul Syverson. A taxonomy of replay attacks [cryptographic protocols]. pp. 187 191, 07 1994. ISBN 0-8186-6230-1. doi: 10.1109/CSFW.1994.315935. 
Published as a conference paper at ICLR 2020 Justus Thies, Michael Zollhofer, Marc Stamminger, Christian Theobalt, and Matthias Nießner. Face2face: Real-time face capture and reenactment of rgb videos. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2387 2395, 2016. Oriol Vinyals, Charles Blundell, Tim Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, pp. 3630 3638, 2016. Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. High-resolution image synthesis and semantic manipulation with conditional gans. Co RR, abs/1711.11585, 2017. URL http://arxiv.org/abs/1711.11585. Eric Wong, Frank Schmidt, Jan Hendrik Metzen, and J. Zico Kolter. Scaling provable adversarial defenses. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett (eds.), Advances in Neural Information Processing Systems, pp. 8400 8409. Curran Associates, Inc., 2018. Eric Wong, Frank R. Schmidt, and J. Zico Kolter. Wasserstein adversarial examples via projected sinkhorn iterations. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pp. 6808 6817, 2019. URL http: //proceedings.mlr.press/v97/wong19a.html. Xiaoyong Yuan, Pan He, Qile Zhu, Rajendra Rana Bhat, and Xiaolin Li. Adversarial examples: Attacks and defenses for deep learning. Co RR, abs/1712.07107, 2017. URL http://arxiv. org/abs/1712.07107. Egor Zakharov, Aliaksandra Shysheya, Egor Burkov, and Victor S. Lempitsky. Few-shot adversarial learning of realistic neural talking head models. Co RR, abs/1905.08233, 2019. URL http: //arxiv.org/abs/1905.08233. Hongyang Zhang, Yaodong Yu, Jiantao Jiao, Eric P. Xing, Laurent El Ghaoui, and Michael I. Jordan. Theoretically principled trade-off between robustness and accuracy. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9-15 June 2019, Long Beach, California, USA, pp. 7472 7482, 2019. URL http://proceedings.mlr.press/v97/ zhang19p.html. Yan Zhou, Murat Kantarcioglu, and Bowei Xi. A survey of game theoretic approach for adversarial machine learning. Wiley Interdiscip. Rev. Data Min. Knowl. Discov., 9(3), 2019. doi: 10.1002/ widm.1259. URL https://doi.org/10.1002/widm.1259. Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE international conference on computer vision, 2017. Published as a conference paper at ICLR 2020 A INTRODUCTION TO THE APPENDIX Here we provide additional information and proofs for the main paper. In Sec. B we provide a visualization of the game setting. In Sec. C we plot the game value in the Gaussian case for different parameter values. In Sec. D we provide a proof for Theorem 4.1, in Sec. E we formally state the theorem discussed in Sec. 4.2 and prove it, and in Sec. F we prove Theorem 4.2 and also derive the game value for the sub-optimal ML attacker . Finally, in Sections G, H we provide additional details about the experiments and the implementation of GIM. B PROBLEM SETUP VISUALIZATION Our problem setup is illustrated in Fig. 3 for a face authentication scenario. A source θ (in this case an individual) generates IID observations (images). k images are used by the authenticator to study the source which it aims to authenticate. 
m images are leaked and obtained by an attacker who wishes to impersonate the source and pass the authentication. At test time the authenticator is presented with n images that were generated either by the true source θ, or by the attacker, and decides whether the entity that generated the images was the source θ or an attacker. Figure 3: An illustration of the game described in the main text. An authenticator receives a sample of n images and needs to decide whether these were generated by a known source, or by an adversary that had access to leaked images. In order to decide, the authenticator is supplied with a sample of k images from the real source. C GAME VALUE VISUALIZATIONS In corollary 4.3 we provide the game value (the expected authenticator accuracy) for the case of multivariate Gaussian sources. In this section, we present additional visualizations of the game value as a function of the different parameters of the game. Fig. 4 visualizes the game value as a function of δ and ρ, for different values of d (the observation dimension). δ m n is the expansion ratio between the source information available to the attacker through the leaked sample and the size of the attacker s attack sample. ρ m k is the information ratio between the number of source observations available to attacker and authenticator. One can clearly see that as d, the observation dimension, grows large, so does the accuracy of the authenticator. Even for δ values higher than 0.5, and ρ values which intuitively would give the attacker an advantage (e.g., m Published as a conference paper at ICLR 2020 Fig. 5 visualizes the game value as a function of δ and ϵ, for different values of d (the observation dimension). Where ϵ n k is the information expansion of the attacker with respect to the authenticator source information. Again, one can clearly see that as d, the observation dimension, grows large, so does the accuracy of the authenticator. Figure 4: Game value as a function of δ, ρ for different dimensions d. Figure 5: Game value as a function of δ, ϵ for different dimensions d. D ADDITIONAL THEOREMS AND PROOFS FOR SEC. 4.1 We begin with some additional notation. Let θPH Qpθqf pkq θ paqdθ Published as a conference paper at ICLR 2020 denote the marginal density of A. Let QΘ|A denote the posterior probability over H given A. That is: QΘ|Apθ|aq Qpθqf pkq θ paq ş νPH Qpνqf pkq ν paqdν Qpθqf pkq θ paq f Apaq Also, let f Y |A, f X|A, g X|A, denote the conditional densities defined by: f Y |Apy|aq ż θPH f pmq θ pyq QΘ|Apθ|aqdθ f X|Apx|aq ż θPH f pnq θ pxq QΘ|Apθ|aqdθ g X|Apx|aq ż y PX m g X|Y px|yqf Y |Apy|aqdy Lemma D.1. Let G be an attacker defined by the conditional probability distribution g X|Y . Then @a P X k, x P X n a best response strategy for D is: Dpa, xq Irf X|Apx|aq ą g X|Apx|aqs (D.1) Proof. Given an attacker strategy g X|Y , the objective for D is given by: argmax D EΘ QV pΘ, D, Gq argmax D EΘ QEA f pkq Θ EX f pnq Θ r Dp A, Xqs EY f pmq Θ EX g X|Y p |Y qr1 Dp A, Xqs ı argmax D EA f A EX f X|Ap |Aqr Dp A, Xqs EY f Y |Ap |Aq EX g X|Y p |Y qr1 Dp A, Xqs ı a PX k f Apaq ż f X|Apx|aq Dpa, xq g X|Apx|aqr1 Dpa, xqs dxda Note that D can be optimized independently for each pair a, x P X k ˆX n. Hence, @a, x P X k ˆX n the objective is: argmax Dpa,xq Pt0,1u f X|Apx|aq Dpa, xq g X|Apx|aqr1 Dpa, xqs ( And thus, the optimal decision rule for D is: Dpa, xq I f X|Apx|aq ą g X|Apx|aq As required. Lemma D.2. 
@a, x P X k ˆ X n let the strategy for D be defined by: Dpa, xq I f X|Apx|aq ą g X|Apx|aq Then G is a best response strategy for G, iff it minimizes the ℓ1 distance between the distributions f X|A, g X|A over the space X k ˆ X n. Namely: g X|Y P argmin g X|Y EA f A ˇˇf X|Apx|Aq g X|Apx|Aq ˇˇ dx (D.2) Published as a conference paper at ICLR 2020 Proof. Let D be defined as in Eq. D.1, the objective for G is: 1 2EΘ QV pΘ, D, Gq argmin g X|Y EΘ QEA f pkq Θ EX f pnq Θ r Dp A, Xqs EY f pmq Θ EX g X|Y p |Y qr1 Dp A, Xqs ı argmin g X|Y EA f A EX f X|Ap |Aqr Dp A, Xqs EX g X|Ap |Aqr1 Dp A, Xqs ı argmin g X|Y a PX k daf Apaq ż x PX n dx f X|Apx|aq Dpa, xq g X|Apx|aqr1 Dpa, xqs argmin g X|Y a PX k daf Apaq ż x PX n dx max f X|Apx|aq, g X|Apx|aq ( (D.3) argmin g X|Y a PX k daf Apaq ż x PX n dx f X|Apx|aq g X|Apx|aq ˇˇf X|Apx|aq g X|Apx|aq ˇˇ argmin g X|Y 1 1 a PX k daf Apaq ż x PX n dx ˇˇf X|Apx|aq g X|Apx|aq ˇˇ argmin g X|Y a PX k daf Apaq ż x PX n dx ˇˇf X|Apx|aq g X|Apx|aq ˇˇ As required. Where in Eq. D.3 we used the definition of D. Theorem 4.1. Consider the attacker defined by: g X|Y P argmin g X|Y EA f A ˇˇf X|Apx|Aq g X|Apx|Aq ˇˇ dx (D.4) and let G be the corresponding map from X m to the set of probability distributions over X n. Consider the authenticator defined by: D pa, xq I f X|Apx|aq ą g X|Apx|aq ı (D.5) where I is the indicator function. Then p D , G q is a solution of Eq. 2.2 that satisfies Eq. 4.1. Proof. From Lemmas D.1, D.2 we have that max D V p D, G q V p D , G q min G V p D , Gq, from which it follows that Eq. 4.1 is satisfied and thus p D , G q is a solution of Eq. 2.2. E THEOREM AND PROOF FOR SEC. 4.2 Theorem E.1. For all n ď m it holds that: max DPD min GPG V p D, Gq 0.5 Proof. Consider the attacker Greplay defined by the following generative process: Given a leaked sample Y P Rmˆd, Greplay generates a sample X P Rnˆd such that Xi Yi @i P rns. (this is possible since we assumed n ď m). Namely, we have: g X|Y px|yq i 1 δpxi yiq Published as a conference paper at ICLR 2020 Where δ is the Dirac delta. Thus @a, x P X k ˆ X n: g X|Apx|aq ż y PX m dyg X|Y px|yqf Y |Apy|aq y PX m dyg X|Y px|yq ż θPH dθf pmq θ pyq QΘ|Apθ|aq i 1 δpxi yiqf pmq θ pyq QΘ|Apθ|aq θPH dθQΘ|Apθ|aq ż i 1 δpxi yiqf pnq θ pyq ż y1PX m n dy1f pm nq θ py1q θPH dθQΘ|Apθ|aq ż i 1 δpxi yiqf pnq θ pyq θPH dθQΘ|Apθ|aqf pnq θ pxq Define: D0pa, xq 0 @a, x P X k ˆ X n Then according to Theorem 4.1 p D0, Greplayq is a solution of Eq. 2.2 that satisfies Eq. 4.1, and therefore: max DPD min GPG V p D, Gq V p D0, Greplayq 2EΘ QEA f pkq θ EY f pmq θ EX f pnq θ r Dp A, Xqs EX Gp Y qr1 Dp A, Xqs ı 2EΘ QEA f pkq θ EY f pmq θ EX f pnq θ r0s EX Gp Y qr1s ı As required. F ADDITIONAL THEOREMS AND PROOFS FOR SEC. 4.3 F.1 NOTATION AND DEFINITIONS In this section, we consider the case where the sources are d-dimensional Gaussian vectors with a known covariance matrix Σ CCT P Rdˆd and unknown mean vector θ P Rd. That is, the set of possible sources is H Rd and given θ P H the associated probability density over the domain X Rd is fθpxq 1 ? p2πqd| detpΣq| exp p 1 2px θq T Σ 1px θqq A sample of n examples x P X n is considered a matrix x P Rnˆd, where the first index represents the sample and the second represents the observation space, Rd. I.e., xij P R is the j th element of the i th example in the sample x P Rnˆd. We continue with a few more notations that simplify the proofs. Given a matrix x P Rnˆd, we let xc x T 1 , . . . , x T n T P Rnd be the concatenation vector representing x. Given a vector θ P Rd, we let θc,n θT , . . . 
, θT T P Rnd be the concatenation of n copies of θ. Given a matrix x P Rnˆd, we let x 1 n řn i 1 xi P Rd denote its mean along the sample dimension. For any matrix B P Rdˆd, we denote: diagp B, kq B 0 ... 0 B Published as a conference paper at ICLR 2020 B B ... ... ... B B Finally, we define strategies for both attacker and authenticator, which we prove in what follows to be the optimal strategies for the game. Definition F.1. Let G denote an attacker defined by the following generative process: Given a leaked sample Y P Rmˆd, G generates an attack sample X P Rnˆd as follows. It first samples n vectors W1, . . . , Wn iid Np0, Σq and then sets: Xi Wi W Y (F.1) Also, let g X|Y denote its associated conditional probability. Definition F.2. For any α P R , let Dα denote an authenticator defined as: Dαpa, xq I } x a}2 Σ 1 ă α ı Where I is the indicator function F.2 TECHNICAL LEMMAS Lemma F.3. Let X1, . . . , Xk iid Npµ, Σq Where µ P Rd, Σ P Rdˆd. Then j 1 Xj Npµ, 1 Proof. We begin by observing that Xc N pµc,k, diagpΣ, kqq. Let B 1 k r Id Ids P Rdˆkd and observe that X BXc. Therefore, since this is an affine transformation of a Gaussian vector we have: X Np Bµc,k, BdiagpΣ, kq BT q Npµ, 1 As required. Lemma F.4. Let X P Rd be a Gaussian vector s.t X Npµ, Σq, and let Xc,n P Rnd be the concatenation of n copies of X. Then: Xc,n Npµc,n, reppΣ, nqq and observe that Xc,n BX. Therefore, since this is an affine transformation of a Gaussian vector we have: Xc,n Np Bµ, BΣBT q Npµc,n, reppΣ, nqq As required. Lemma F.5. Let θ P Rd, Σ P Rdˆd represent the mean and covariance of a Gaussian distribution. Let X P Rnˆd be a random sample generated by the attacker defined in Def. F.1. Then: Xc Npθc,n, diagpΣ, nq repppn m mn qΣ, nqq Npθc,n, Ψq Published as a conference paper at ICLR 2020 Proof. Observe that Wc N p0, diagpΣ, nqq. Using Lemma F.3 we get Y Npθ, 1 mΣq and observe that Wc,n repp 1 n Id, nq Wc. Using Lemma F.4 we get Yc,n Npθc,n, 1 mreppΣ, nqq. We define the following block matrices nrepp Id, nq, Ind and observe that: Z Np 0nd θc,n , diagpΣ, nq 0 0 1 mreppΣ, nq Note that Xc Wc Wc,n Yc,n BZ and therefore we get: Xc Np B 0nd θc,n , B diagpΣ, nq 0 0 1 mreppΣ, nq N ˆ θc,n, diag pΣ, nq rep ˆ pn m As required. Lemma F.6. Let Σ CCT P Rdˆd represent the covariance of a Gaussian distribution, and consider the following covariance matrix: Ψ diagpΣ, nq repppn m Then its inverse is: Ψ 1 diagpΣ 1, nq n m n2 reppΣ 1, nq and the determinant is: Proof. We begin by defining the following block matrices: ffiflP Rndˆd (F.2) ffiflP Rndˆd (F.3) Published as a conference paper at ICLR 2020 and note that reppp n m mn qΣ, nq UV T . Then: Ψ 1 pdiagpΣ, nq repppn m mn qΣ, nqq 1 pdiagpΣ, nq UV T q 1 piq diagpΣ, nq 1 diagpΣ, nq 1Up Id V T diagpΣ, nq 1Uq 1V T diagpΣ, nq 1 piiq diagpΣ 1, nq diagpΣ 1, nq Up Id V T diagpΣ 1, nq Uq 1V T diagpΣ 1, nq diagpΣ 1, nq nm r Id Ids diagpΣ 1, nq diagpΣ 1, nq n m diagpΣ 1, nq n m n2 reppΣ 1, nq As required. Where in piq we used the Woodbury matrix identity, and in piiq we used the inverse of a diagonal block matrix. Next, we turn to find the determinant of Ψ: detpΨq detpdiagpΣ, nq repppn m detpdiagpΣ, nq UV T q piiiq detpdiagpΣ, nqqdetp Id V T diagpΣ, nq 1Uq pivq detpΣqndetp Id V T diagpΣ 1, nq Uq detpΣqndetp Id n m nm r Id Ids diagpΣ 1, nq detpΣqndetp Id n m Where in piiiq we used the matrix determinant lemma, and in pivq we used the determinant of a diagonal block matrix. Lemma F.7. 
Let hpx, µq expt 1 2σ2 px µq T Σ 1px µqu I x T Σ 1x ă α @px, µq P Rd ˆ Rd and define the function: x PRd dx hpx, µq Then ψpµq is log-concave over the space Rd. Proof. We begin by noting that the function px µq T Σ 1px µq is convex w.r.t both x and µ, hence its negative is concave, and thus, by definition the function: 2σ2 px µq T Σ 1px µqu (F.4) Published as a conference paper at ICLR 2020 is log-concave w.r.t x, µ. We now show that hpx, µq is log-concave w.r.t both x and µ. First, w.r.t µ. Let β P r0, 1s, µ1, µ2 P Rd, and observe that: hpx, βµ1 p1 βqµ2q 2σ2 px βµ1 p1 βqµ2q T Σ 1px βµ1 p1 βqµ2qu Irx T Σ 1x ă αs piq ě expt β 2σ2 px µ1q T Σ 1px µ1qu expt p1 βq 2σ2 px µ2q T Σ 1px µ2qu Irx T Σ 1x ă αs piiq expt β 2σ2 px µ1q T Σ 1px µ1qu expt p1 βq 2σ2 px µ2q T Σ 1px µ2qu p Irx T Σ 1x ă αsqβp Irx T Σ 1x ă αsqp1 βq hpx, µ1qβhpx, µ2qp1 βq Therefore hpx, µq is log-concave w.r.t µ. Where in piq we used the log-concavity of the function in Eq. F.4, and in piiq we used the fact that Irx T Σ 1x ă αs P t0, 1u. Now, w.r.t x. Let β P r0, 1s, x1, x2 P Rd. Observe that for any convex function q we have: I rqpx1q ă αsβ I rqpx2q ă αsp1 βq piq I rqpx1q ă αs I rqpx2q ă αs I rqpx1q ă α qpx2q ă αs piiq ď I rβqpx1q p1 βqqpx2q ă αs piiiq ď I rqpβx1 p1 βqx2q ă αs Therefore Irx T Σ 1x ă αs is log-concave. Where in piq we used the fact that Irqpxq ă αs P t0, 1u, in piiq we used the fact that qpx1q ă α qpx2q ă α ñ βqpx1q p1 βqqpx2q ă α, and in piiiq we used the convexity of q. Hence, observing hpx, µq we have: hpβx1 p1 βqx2, µq 2σ2 pβx1 p1 βqx2 µq T Σ 1pβx1 p1 βqx2 µqu I pβx1 p1 βqx2q T Σ 1pβx1 p1 βqx2q ă α pivq ě expt β 2σ2 px1 µq T Σ 1px1 µqu expt p1 βq 2σ2 px2 µq T Σ 1px2 µqu I pβx1 p1 βqx2q T Σ 1pβx1 p1 βqx2q ă α pvq ě expt β 2σ2 px µ1q T Σ 1px µ1qu expt p1 βq 2σ2 px µ2q T Σ 1px µ2qu p I x T 1 Σ 1x1 ă α qβp I x T 2 Σ 1x2 ă α qp1 βq hpx1, µqβhpx2, µqp1 βq Therefore hpx, µq is log-concave w.r.t x. Where in pivq we used the log-concavity of the function in Eq. F.4, and in pvq we used the log-concavity of Irx T Σ 1x ă αs. Finally, by using Prekopa-Leindler inequality (Pr ekopa, 1973) we have that ψpµq is log-concave, as required. F.3 PROOF OF THEOREM 4.2 Lemma F.8. Consider the attacker G , defined in Def. F.1. The best response strategy for the authenticator against this attacker is: D pa, xq I } x a}2 Σ 1 ă α ı α dpm kqpn kq k2pn mq log npm kq Published as a conference paper at ICLR 2020 Proof. The best response authenticator satisfies: D P argmax DPD V p D, Gq 1 2EΘ QEA f pkq Θ EY f pmq Θ r EX f pnq Θ r Dp A, Xqs EX g X|Y p |Y qr1 Dp A, Xqss argmax DPD EΘ QEA f pkq Θ r EX f pnq Θ r Dp A, Xqs EX g X|Θp |Θqr1 Dp A, Xqss F.5 argmax DPD EΘ QEA NpΘc,k,diagpΣ,kqq EX NpΘc,n,diagpΣ,nqqr Dp A, Xqs EX Npθc,n,Ψqr1 Dp A, Xqs F.6 argmax DPD EΘ Q a PRkd da ż x PRnd dx expt 1 2pa Θc,kq T diagpΣ, kq 1pa Θc,kqur 2px Θc,nq T diagpΣ, nq 1px Θc,nqu Dpa, xq d 2px Θc,nq T Ψ 1px Θc,nqur1 Dpa, xqss D can be chosen independently for each pair pa, xq P X kˆX n. 
Therefore, for any pa, xq P X kˆX n the decision rule for Dpa, xq 1 is: c θPRd dθQpθq expt 1 2rpx θc,nq T diagpΣ, nq 1px θc,nq pa θc,kq T diagpΣ, kq 1pa θc,kqsu ą ż θPRd dθQpθq expt 1 2rpx θc,nq T Ψ 1px θc,nq pa θc,kq T diagpΣ, kq 1pa θc,kqsu Observing the LHS integral and using the improper uniform prior assumption we have: ż θPRd dθQpθq expt 1 2rpx θc,nq T diagpΣ 1, nqpx θc,nq pa θc,kq T diagpΣ 1, kqpa θc,kqsu θPRd dθ expt 1 i 1 x T i Σ 1xi j 1 a T j Σ 1aj 2θT Σ 1pn x k aq pn kqθT Σ 1θsu i 1 x T i Σ 1xi j 1 a T j Σ 1aj pn kqpn x k a n k q T Σ 1pn x k a θPRd dθ expt n k 2 rpθ n x k a n k q T Σ 1pθ n x k a i 1 x T i Σ 1xi j 1 a T j Σ 1aj pn kqpn x k a n k q T Σ 1pn x k a n k qd detpΣq Observing the RHS and using the improper uniform prior assumption we have: ż θPRd dθQpθq expt 1 2rpx θc,nq T Ψ 1px θc,nq pa θc,kq T diagpΣ 1, kqpa θc,kqsu θPRd dθ expt 1 2rpx θc,nq T Ψ 1px θc,nq pa θc,kq T diagpΣ 1, kqpa θc,kqsu θPRd dθ expt 1 2rpx θc,nq T pdiagpΣ 1, nq n m n2 reppΣ 1, nqqpx θc,nq pa θc,kq T diagpΣ 1, kqpa θc,kqsu Published as a conference paper at ICLR 2020 θPRd dθ expt 1 i 1 pxi θq T Σ 1pxi θq j 1 paj θq T Σ 1paj θq j 1 pxi θq T Σ 1pxj θqsu θPRd dθ expt 1 i 1 pxi θq T Σ 1pxi θq j 1 paj θq T Σ 1paj θq pn mqp x θq T Σ 1p x θqsu θPRd dθ expt 1 i 1 x T i Σ 1xi j 1 a T j Σ 1aj pn mq x T Σ 1 x pm kqθT Σ 1θ 2θT Σ 1pm x k aqsu i 1 x T i Σ 1xi j 1 a T j Σ 1aj pn mq x T Σ 1 x pm kqpm x k a m k q T Σ 1pm x k a θPRd dθ expt m k 2 rpθ m x k a m k q T Σ 1pθ m x k a p 2π m k qd detpΣq expt 1 i 1 x T i Σ 1xi j 1 a T j Σ 1aj pn mq x T Σ 1 x pm kqpm x k a m k q T Σ 1pm x k a Therefore the decision rule is c 2rpn kqpn x k a n k q T Σ 1pn x k a 2rpn mq x T Σ 1 x pm kqpm x k a m k q T Σ 1pm x k a p 2π m k qd mpn kqqd expt1 2rpn kqpn x k a n k q T Σ 1pn x k a 2rpn mq x T Σ 1 x pm kqpm x k a m k q T Σ 1pm x k a ôd log npm kq mpn kq ą pn mq x T Σ 1 x pm kqpm x k a m k q T Σ 1pm x k a pn kqpn x k a n k q T Σ 1pn x k a ôdpm kqpn kq log npm kq k2pn mq x T Σ 1 x 2k2pn mq x T Σ 1 a k2pn mq a T Σ 1 a ôdpm kqpn kq k2pn mq log npm kq mpn kq ą p x aq T Σ 1p x aq As required. Lemma F.9. Consider the authenticator Dα, as defined in Def. F.2. Then any attacker G, represented by a conditional probability g X|Y , that satisfies the condition x y for any leaked sample y P Rmˆd and attacker generated sample x P t Rnˆd : g X|Y px|yq ą 0u, satisfies: G P argmin G1PG V p Dα, G1q @α P R Published as a conference paper at ICLR 2020 Proof. The best response attacker satisfies: g1 X|Y P argmin g X|Y 1 2EΘ QEA f pkq Θ EY f pmq Θ r EX f pnq Θ r Dαp A, Xqs EX g X|Y p |Y qr1 Dαp A, Xqss argmin g X|Y EΘ QEA f pkq Θ EY f pmq Θ EX g X|Y p |Y qr1 Dαp A, Xqs argmax g X|Y EΘ QEA f pkq Θ EY f pmq Θ EX g X|Y p |Y qr Dαp A, Xqs argmax g X|Y EΘ QEAiid NpΘ,Σq EY iid NpΘ,Σq EX g X|Y p |Y q I X A 2 Σ 1 ă α ıı Lemma F.3 argmax g X|Y EΘ QE A NpΘ, 1 k Σq EY iid NpΘ,Σq EX g X|Y p |Y q I X A 2 Σ 1 ă α ıı argmax g X|Y y PRmˆd dy ż x PRnˆd dxg X|Y px|yq ż a PRd d a I } x a}2 Σ 1 ă α ı ż θPRd dθQpθq 2p a θq T Σ 1p a θqu expt 1 j 1 pyj θq T Σ 1pyj θqu Note that g X|Y px|yq can be chosen independently for each y P Rmˆd. Thus, we can optimize it independently for each y P Rmˆd and we have: g1 X|Y p |yq P argmax g X|Y p |yq x PRnˆd dxg X|Y px|yq ż a PRd d a I } x a}2 Σ 1 ă α ı ż θPRd dθQpθq 2p a θq T Σ 1p a θqu expt 1 j 1 pyj θq T Σ 1pyj θqu Note that for any PDF f over Rnˆd and a function ϕ : Rnˆd Ñ R , it holds that ş x PRnˆd dxfpxqϕpxq ď supxϕpxq. Therefore, there exists a deterministic distribution g1 X|Y px|yq δpx x1q that achieves the maximum. 
Thus, it s sufficient to find a vector x G that achieves the maximum: x G P argmax x x1PRnˆd dx1δpx1 xq ż a PRd d a I x1 a 2 Σ 1 ă α ı ż θPRd dθQpθq 2p a θq T Σ 1p a θqu expt 1 j 1 pyj θq T Σ 1pyj θqu a PRd d a I } x a}2 Σ 1 ă α ı ż θPRd dθQpθq 2p a θq T Σ 1p a θqu expt 1 j 1 pyj θq T Σ 1pyj θqu p q argmax x a PRd d a I } x a}2 Σ 1 ă α ı ż 2rkp a θq T Σ 1p a θq j 1 pyj θq T Σ 1pyj θqsu a PRd d a expt 1 2rk a T Σ 1 asu I } x a}2 Σ 1 ă α ı ż 2rpm kqθT Σ 1θ 2θT Σ 1pm y k aqsu a PRd d a expt 1 2rk a T Σ 1 a 1 m k pm y k aq T Σ 1pm y k aqsu I } x a}2 Σ 1 ă α ı ż θPRd dθ expt pm kq 2 pθ m y k a m k q T Σ 1pθ m y k a Published as a conference paper at ICLR 2020 a PRd d a expt 1 2rk a T Σ 1 a 1 m k pm y k aq T Σ 1pm y k aqsu I } x a}2 Σ 1 ă α ı a PRd d a expt 1 m k a T Σ 1 a 2mk m k y T Σ 1 a mk m k y T Σ 1 ysu I } x a}2 Σ 1 ă α ı a PRd d a expt mk 2pm kqrp a yq T Σ 1p a yqsu I } x a}2 Σ 1 ă α ı Where in p q we used the fact that Qpθq is the improper uniform prior. Note that the expression depends only on the mean x. Therefore, it s sufficient to find a mean vector x that maximizes the expression. We substitute the integration variable to ϕ a x and obtain: x G P argmax x tϕPRd:ϕT Σ 1ϕăαu dϕ expt mk 2pm kqrpϕ x yq T Σ 1pϕ x yqsu argmax x ψp y xq argmax x ψpµq Where ψ is defined as in Lemma F.7 (with σ m k mk ), from which it follows that ψpµq is logconcave, and therefore has at most one local extremum which can only be a maximum. Therefore, it is sufficient to show that µ 0 (i.e., x y) is a local extremum by equating the gradient at the point to 0. B Bµψpµq B tϕPRd:ϕT Σ 1ϕăαu dϕ expt mk 2pm kqrpϕ µq T Σ 1pϕ µqsu tϕPRd:ϕT Σ 1ϕăαu dϕ expt mk 2pm kqrpϕ µq T Σ 1pϕ µqsu B Bµpϕ µq T Σ 1pϕ µq tϕPRd:ϕT Σ 1ϕăαu dϕ expt mk 2pm kqrpϕ µq T Σ 1pϕ µqsuΣ 1pµ ϕq Therefore: B Bµψpµq|µ 0 mk pm kq tϕPRd:ϕT Σ 1ϕăαu dϕ expt mk 2pm kqrϕT Σ 1ϕsuΣ 1ϕ Note that since the domain of integration is symmetric about the origin with respect to negation and the integrand is odd with respect to the integration variable, the integral is equal to zero. I.e., B Bµψpµq|µ 0 0. Therefore, x y (µ 0) achieves the global maximum, and any attacker that satisfies the condition: x y for any leaked sample y P Rmˆd and attacker generated sample x P t Rnˆd : g X|Y px|yq ą 0u satisfies: G P argmin G1PG V p Dα, G1q @α P R As required. Corollary F.10. Consider an authenticator Dα, as defined in Def. F.2. Then the attacker G , defined in Def. F.1, is a best response. i.e.: G P argmin G1PG V p Dα, G1q @α P R Proof. Directly from Lemma F.9 Theorem F.11. The game value is: max D min G V p D, Gq min G max D V p D, Gq V p D , G q 1 2 1 2Γp d 2kpn mq log npm kq mpn kqq γpd 2kpn mq log npm kq Where γ is the lower incomplete gamma function. Published as a conference paper at ICLR 2020 Proof. 
From the max-min inequality we have: max D min G V p D, Gq ď min G max D V p D, Gq On the other hand, using Lemma F.8 and Corollary F.10 we have: max D min G V p D, Gq ě min G V p D , Gq F.10 V p D , G q F.8 max D V p D, G q ě min G max D V p D, Gq Therefore: max D min G V p D, Gq min G max D V p D, Gq V p D , G q The game value is given by: V p D , G q EΘ QV pΘ, D , G q 2EΘ QEA f pkq Θ EY f pmq Θ EX f pnq Θ r D p A, Xqs EX g X|Y p |Y qr1 D p A, Xqs ı 2EΘ QEA f pkq Θ EY f pmq Θ EX f pnq Θ Ir X A 2 Σ 1 ă α s ı EX g X|Y p |Y q 1 Ir X A 2 Σ 1 ă α s ıı 2EΘ QEA f pkq Θ EX f pnq Θ Ir x A 2 Σ 1 ă α s ı 1 2EΘ QEA f pkq Θ EX g X|Θp |Θq Ir x A 2 Σ 1 ă α s ı Observing the first term we have: 1 2EΘ QEA f pkq Θ EX f pnq Θ Ir X A 2 Σ 1 ă α s ı 2EΘ QEA f pkq Θ EX f pnq Θ Irp X Aq T Σ 1p X Aq ă α s 2EΘ QEA f pkq Θ EX f pnq Θ Irp X Aq T p CCT q 1p X Aq ă α s 2EΘ QEA f pkq Θ EX f pnq Θ Irp X Aq T C T C 1p X Aq ă α s 2EΘ QEA f pkq Θ EX f pnq Θ Irp C 1p X Aqq T p C 1p X Aqq ă α s 2EΘ QEA f pkq Θ EX f pnq Θ Ir ZT Z ă α s p q Observe that Z C 1p X Aq C 1rp X Θq p A Θqs C 1r Id, Ids X Θ A Θ r C 1, C 1s X Θ A Θ Note that: X Θ A Θ nΣ 0dˆd 0dˆd 1 kΣ Z Np0d, r C 1, C 1s 1 nΣ 0dˆd 0dˆd 1 kΣ k q C 1ΣC T q nk C 1CCT C T q Published as a conference paper at ICLR 2020 We denote Z b nk n k Z Np0d, Idq, and thus Z1, . . . , Zd are independent standard normal random variables and ZT Z χ2pdq. Therefore, ZT Z n k nk ZT Z Γpk d nk q and we have: 2EΘ QEA f pkq Θ EX f pnq Θ Ir ZT Z ă α s 2EΘ QEZT Z Γpk d nk q Ir ZT Z ă α s 2EΘ Q 1 Γp d Where in piq We used the CDF of the Gamma distribution in which γ is the lower incomplete gamma function. Similarly, observing the second term we have: 1 2EΘ QEA f pkq Θ EX g X|Θp |Θq Ir X A 2 Σ 1 ă α s ı 2EΘ QEA f pkq Θ EX g X|Θp |Θq Ir V T V ă α s V C 1p X Aq C 1rp X Θq p A Θqs C 1r Id, Ids X Θ A Θ r C 1, C 1s X Θ A Θ Using the definition of G (Definition F.1) and Lemma F.3 we have: X Θ A Θ mΣ 0dˆd 0dˆd 1 kΣ V Np0d, r C 1, C 1s 1 mΣ 0dˆd 0dˆd 1 kΣ q Np0d, m k And similarly to the first term, we get: V T V Γpk d Therefore, the game value is given by: V p D , G q 1 2pn kqq γpd 2kpn mq log npm kq mpn kqq γpd 2kpn mq log npm kq As required. Finally, we prove Theorem 4.2 and Corollary 4.3. Published as a conference paper at ICLR 2020 Theorem 4.2. Define δ m{n ď 1 and let ρ m{k. Consider the attacker G defined by the following generative process: Given a leaked sample Y P Rmˆd, G generates a sample X P Rnˆd as follows: it first samples n vectors W1, . . . , Wn iid Np0, Σq and then sets: Xi Wi W Y . Define the authenticator D by: } x a}2 Σ 1 ă d p1 ρq 1 ρδ 1 np1 δq log ˆρ 1 Then p D , G q is a solution of Eq. 2.2 that satisfies Eq. 4.1. Proof. Directly from Lemma F.8, Corollary F.10, and Theorem F.11 by assigning δ m Corollary 4.3. Define δ and ρ as in Theorem 4.2. Then the game value for the Gaussian case is: 1 2 1 2Γp d 2p1 δq log 1 ρ 2p1 δq log 1 ρ Where γ is the lower incomplete Gamma function, and Γ is the Gamma function. Proof. Directly from Theorem F.11 by assigning δ m F.4 GAME VALUE FOR A MAXIMUM LIKELIHOOD ATTACKER In this section, we consider the most intuitive attacker strategy, which one could naively see as optimal. However, we show that this intuitive optimal attacker is sub-optimal as can be seen in Fig. 1c in the main paper. We consider an attacker that draws the attack sample from a Gaussian distribution with the maximum likelihood estimate of the mean and the known covariance. We denote this attacker by the name ML attacker. 
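Before deriving the best response to the ML attacker just introduced (formalized in Definition F.12 below), the following minimal NumPy sketch contrasts it with the optimal attacker of Definition F.1. This is an illustration, not the paper's code: it assumes Σ = I_d, the sample sizes (d, k, m, n) are arbitrary choices of ours, and the threshold α* and the closed-form accuracy check are as reconstructed here from Lemma F.8 and Theorem F.11.

```python
import numpy as np
from scipy.special import gammainc  # regularized lower incomplete gamma

rng = np.random.default_rng(0)
d, k, m, n = 10, 5, 2, 4        # illustrative sizes (requires m < n); Sigma = I_d for simplicity
trials = 20000

# Threshold of D*, as reconstructed from Lemma F.8 / Theorem 4.2 (with Sigma = I_d)
alpha = d * (m + k) * (n + k) / (k**2 * (n - m)) * np.log(n * (m + k) / (m * (n + k)))

def accept(a, x):
    """D*(a, x): accept iff ||x_bar - a_bar||^2 < alpha."""
    diff = x.mean(0) - a.mean(0)
    return diff @ diff < alpha

def attack_opt(y):
    """Optimal attacker G* (Def. F.1): X_i = W_i - W_bar + Y_bar, with W_i ~ N(0, I_d)."""
    w = rng.standard_normal((n, d))
    return w - w.mean(0) + y.mean(0)

def attack_ml(y):
    """ML attacker G_ML (Def. F.12): X_i ~ N(Y_bar, I_d), drawn i.i.d."""
    return y.mean(0) + rng.standard_normal((n, d))

acc = {"G*": 0.0, "G_ML": 0.0}
for _ in range(trials):
    theta = 3.0 * rng.standard_normal(d)          # the prior over theta is immaterial: D* only uses x_bar - a_bar
    a = theta + rng.standard_normal((k, d))       # registration sample A
    y = theta + rng.standard_normal((m, d))       # leaked sample Y
    x_real = theta + rng.standard_normal((n, d))  # genuine test sample
    for name, attack in (("G*", attack_opt), ("G_ML", attack_ml)):
        acc[name] += 0.5 * (accept(a, x_real) + (1.0 - accept(a, attack(y)))) / trials

# The accuracy of D* against G* should match the closed form implied by the threshold test
# (Theorem F.11 / Corollary 4.3); against G_ML the authenticator should do at least as well.
closed_form = 0.5 + 0.5 * (gammainc(d / 2, n * k * alpha / (2 * (n + k)))
                           - gammainc(d / 2, m * k * alpha / (2 * (m + k))))
print(acc, "closed form for G*:", closed_form)
```

Running this sketch shows the gap that Fig. 6 quantifies: the mean-matched attacker G* keeps the authenticator's accuracy at the game value, while the ML attacker, whose fake sample mean has the extra Σ/n spread, is rejected more often.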
We find the best response authenticator to this attacker and the associated game value. Fig. 6 visualizes the difference in theoretical game value between the ML attacker (see Definition F.12) and the optimal attacker (see Definition F.1) for different values of d (the dimension of observations), and demonstrates that the ML attacker is indeed sub-optimal. Figure 6: The difference in game value (expected authentication accuracy of the optimal authenticator) between the ML attacker and the optimal attacker for different values of the observations dimension d, as a function of the parameters ρ m n . Namely: max Dt V p D, GMLqu max Dt V p D, G qu. (a) Difference in game value for d 10. (b) Difference in game value for d 100. (c) Difference in game value for d 1000. Definition F.12. Let GML denote an attacker defined by the following generative process: Given a leaked sample Y P Rmˆd, GML generates an attack sample X iid Np Y , Σq Lemma F.13. Let θ P Rd, Σ P Rdˆd represent the mean and covariance of a Gaussian distribution. Let X P Rnˆd be a random sample generated by the attacker defined in Def. F.12. Then: Xc Npθc,n, diagpΣ, nq reppp 1 mqΣ, nqq Npθc,n, ΨMLq Published as a conference paper at ICLR 2020 Proof. Let W1, . . . , Wn iid Np0, Σq, observe that Xi Y Wi @i P rns, and thus: Where Wc Np0 1dn, diagpΣ, nq. Using Lemma F.3 we have Y Npθ, 1 mΣq and using Lemma F.4 we have Yc,n Npθc,n, repp 1 Let Z Wc Yc,n and B r Indˆnd , Indˆnds, then Xc Wc Yc,n BZ. Note that: Z Np 0nd θc,n , diagpΣ, nq 0 0 repp 1 and therefore we have: Xc Np B 0nd θc,n , B diagpΣ, nq 0 0 repp 1 Npθc,n, diagpΣ, nq repp 1 As required. Lemma F.14. Let Σ CCT P Rdˆd represent the covariance of a Gaussian distribution, and consider the following covariance matrix: ΨML diagpΣ, nq repp 1 mΣ, nq. Then: Ψ 1 ML diagpΣ 1, nq 1 n mreppΣ 1, nq and the determinant is: detpΨq detpΣqnpn m Proof. To find the inverse of ΨML we first define: ffiflP Rndˆd, V 1 ffiflP Rndˆd (F.7) Therefore we have: Ψ 1 ML pdiagpΣ, nq repp 1 pdiagpΣ, nq UV T q 1 piq diagpΣ, nq 1 diagpΣ, nq 1Up Id V T diagpΣ, nq 1Uq 1V T diagpΣ, nq 1 piiq diagpΣ 1, nq diagpΣ 1, nq Up Id V T diagpΣ 1, nq Uq 1V T diagpΣ 1, nq diagpΣ 1, nq diagpΣ 1, nq diagpΣ 1, nq m q Idq 1 1 diagpΣ 1, nq 1 n m ffiflId Σ 1 Σ 1 Published as a conference paper at ICLR 2020 diagpΣ 1, nq 1 n mreppΣ 1, nq As required. Where in piq we used the Woodbury matrix identity, and in piiq we used the inverse of a diagonal block matrix. Next, we turn to find the determinant of ΨML: detpΨq detpdiagpΣ, nq reppp 1 detpdiagpΣ, nq UV T q piiiq detpdiagpΣ, nqq detp Id V T diagpΣ, nq 1Uq pivq detpΣqn detp Id V T diagpΣ 1, nq Uq detpΣqn detp Id 1 m r Id Ids diagpΣ 1, nq detpΣqn detppn m detpΣqnpn m Where in piiiq we used the matrix determinant lemma, and in pivq we used the determinant of a diagonal block matrix. Lemma F.15. Consider the attacker GML, defined in F.12. The best response strategy for the authenticator against this attacker is: DMLpa, xq I } x a}2 Σ 1 ă αML ı αML dpn kqpnm nk mkq k2n2 logpnm nk mk Proof. 
The best response authenticator satisfies: D P argmax DPD V p D, GMLq 1 2EΘ QEA f pkq Θ EY f pmq Θ EX f pnq Θ r Dp A, Xqs EX g ML X|Y p |Y qr1 Dp A, Xqs ı argmax DPD EΘ QEA f pkq Θ EX f pnq Θ r Dp A, Xqs EX g ML X|Θp |Θqr1 Dp A, Xqs ı Lemma F.13 argmax DPD EΘ QEA NpΘc,k,diagpΣ,kqq EX NpΘc,n,diagpΣ,nqqr Dp A, Xqs EX Npθc,n,ΨMLqr1 Dp A, Xqs argmax DPD EΘ Q a PRkd da ż x PRnd dx expt 1 2pa Θc,kq T diagpΣ, kq 1pa Θc,kqu 1 |detpdiagpΣ, nqq| expt 1 2px Θc,nq T diagpΣ, nq 1px Θc,nqu Dpa, xq 1 |detpΨMLq| expt 1 2px Θc,nq T Ψ 1 MLpx Θc,nqur1 Dpa, xqss Lemma F.14 argmax DPD EΘ Q a PRkd da ż x PRnd dx expt 1 2pa Θc,kq T diagpΣ, kq 1pa Θc,kqur c m qd expt 1 2px Θc,nq T diagpΣ, nq 1px Θc,nqu Dpa, xq Published as a conference paper at ICLR 2020 2px Θc,nq T Ψ 1 MLpx Θc,nqur1 Dpa, xqss D can be chosen independently for each pair pa, xq P X kˆX n. Therefore, for any pa, xq P X kˆX n the decision rule for Dpa, xq 1 is: c θPRd dθQpθq expt 1 2px θc,nq T diagpΣ, nq 1px θc,nqu 2pa θc,kq T diagpΣ, kq 1pa θc,kqu ą ż θPRd dθQpθq expt 1 2px θc,nq T Ψ 1 MLpx θc,nqu expt 1 2pa θc,kq T diagpΣ, kq 1pa θc,kqu Observing the LHS integral and using the improper uniform prior assumption, we obtain: ż θPRd dθQpθq expt 1 2rpx θc,nq T diagpΣ, nq 1px θc,nq pa θc,kq T diagpΣ, kq 1pa θc,kqsu θPRd dθ expt 1 i 1 x T i Σ 1xi j 1 a T j Σ 1aj 2θT Σ 1pn x k aq pn kqθT Σ 1θsu i 1 x T i Σ 1xi j 1 a T j Σ 1aj pn kqpn x k a n k q T Σ 1pn x k a θPRd dθ expt n k 2 rpθ n x k a n k q T Σ 1pθ n x k a i 1 x T i Σ 1xi j 1 a T j Σ 1aj pn kqpn x k a n k q T Σ 1pn x k a n k qd detpΣq Observing the RHS and using the improper uniform prior assumption, we obtain: ż θPRd dθQpθq expt 1 2rpx θc,nq T Ψ 1 MLpx θc,nq pa θc,kq T diagpΣ 1, kqpa θc,kqsu θPRd dθ expt 1 2rpx θc,nq T Ψ 1 MLpx θc,nq pa θc,kq T diagpΣ 1, kqpa θc,kqsu θPRd dθ expt 1 2rpx θc,nq T pdiagpΣ 1, nq 1 n mreppΣ 1, nqqpx θc,nq pa θc,kq T diagpΣ 1, kqpa θc,kqsu θPRd dθ expt 1 i 1 pxi θq T Σ 1pxi θq j 1 paj θq T Σ 1paj θq n mp x θq T Σ 1p x θqsu i 1 x T i Σ 1xi j 1 a T j Σ 1aj n2 n m x T Σ 1 x nm nk mk n m v T Σ 1vsu θPRd dθ expt 1 n m rv T Σ 1vsu i 1 x T i Σ 1xi j 1 a T j Σ 1aj n2 n m x T Σ 1 x nm nk mk n m v T Σ 1vsu p 2πpn mq nm nk mk qd detpΣq Published as a conference paper at ICLR 2020 Where in piq we denoted v nm x kpn mq a nm nk mk . Therefore, the decision rule is 2rpn kqpn x k a n k q T Σ 1pn x k a p 1 mpn kqqd ą n m x T Σ 1 x nm nk mk n m v T Σ 1vsu p 1 nm nk mk qd mpn kq qd ą n m x T Σ 1 x nm nk mk n m v T Σ 1v pn kqpn x k a n k q T Σ 1pn x k a ôd logpnm nk mk n m x T Σ 1 x nm nk mk n m v T Σ 1v pn kqpn x k a n k q T Σ 1pn x k a ôd logpnm nk mk mpn kq q ą k2n2 pn kqpnm nk mkqp x aq T Σ 1p x aq ôp x aq T Σ 1p x aq ă dpn kqpnm nk mkq k2n2 logpnm nk mk As required. Theorem F.16. Fix the attacker to be GML as defined in F.12, then the game value is: max D V p D, GMLq Lemma F.15 V p DML, GMLq 2, dpnm nk mkq 2nk log nm nk mk 2nk log nm nk mk 2p1 ρ δq log 1 ρ δ 2pρ δq log 1 ρ δ n , and γ is the lower incomplete gamma function. Proof. 
The game value is given by:

$$V(D_{ML}, G_{ML}) = \mathbb{E}_{\Theta \sim Q}\, V(\Theta, D_{ML}, G_{ML}) = \frac{1}{2}\mathbb{E}_{\Theta \sim Q}\mathbb{E}_{A \sim f^{(k)}_\Theta}\mathbb{E}_{Y \sim f^{(m)}_\Theta}\Big[\mathbb{E}_{X \sim f^{(n)}_\Theta}[D_{ML}(A, X)] + \mathbb{E}_{X \sim g^{ML}_{X|Y}(\cdot|Y)}[1 - D_{ML}(A, X)]\Big]$$

$$= \frac{1}{2}\mathbb{E}_{\Theta \sim Q}\mathbb{E}_{A \sim f^{(k)}_\Theta}\mathbb{E}_{X \sim f^{(n)}_\Theta}\big[\mathbb{I}[\|\bar{X} - \bar{A}\|^2_{\Sigma^{-1}} < \alpha_{ML}]\big] + \frac{1}{2} - \frac{1}{2}\mathbb{E}_{\Theta \sim Q}\mathbb{E}_{A \sim f^{(k)}_\Theta}\mathbb{E}_{X \sim g^{ML}_{X|\Theta}(\cdot|\Theta)}\big[\mathbb{I}[\|\bar{X} - \bar{A}\|^2_{\Sigma^{-1}} < \alpha_{ML}]\big]$$

Observing the first term, we can see that by replacing $\alpha^*$ with $\alpha_{ML}$ in the analogous part of the proof of Theorem F.11 we get:

$$\frac{1}{2}\mathbb{E}_{\Theta \sim Q}\mathbb{E}_{A \sim f^{(k)}_\Theta}\mathbb{E}_{X \sim f^{(n)}_\Theta}\big[\mathbb{I}[\|\bar{X} - \bar{A}\|^2_{\Sigma^{-1}} < \alpha_{ML}]\big] = \frac{1}{2\Gamma(d/2)}\,\gamma\Big(\frac{d}{2}, \frac{nk\,\alpha_{ML}}{2(n+k)}\Big)$$

Again, similarly to the analogous part of the proof of Theorem F.11, for the second term note that $\|\bar{X} - \bar{A}\|^2_{\Sigma^{-1}} = V^T V$ with

$$V = C^{-1}(\bar{X} - \bar{A}) = C^{-1}\big[(\bar{X} - \Theta) - (\bar{A} - \Theta)\big] = [C^{-1}, -C^{-1}]\begin{pmatrix}\bar{X} - \Theta\\ \bar{A} - \Theta\end{pmatrix}$$

Using the definition of $G_{ML}$ (Definition F.12) we have $\bar{X} \sim N\big(\theta, \frac{n+m}{nm}\Sigma\big)$, and using Lemma F.3 we have $\bar{A} \sim N\big(\theta, \frac{1}{k}\Sigma\big)$, and thus:

$$V \sim N\Big(0_d,\; \frac{nm + nk + mk}{nmk}\, I_d\Big)$$

We denote $\tilde{V} = \sqrt{\frac{nmk}{nm + nk + mk}}\, V$, so that $\tilde{V}_1, \ldots, \tilde{V}_d$ are independent standard normal random variables and $\tilde{V}^T \tilde{V} \sim \chi^2(d)$. Therefore $V^T V \sim \Gamma\big(\frac{d}{2}, \frac{2(nm + nk + mk)}{nmk}\big)$ and we have:

$$\frac{1}{2}\mathbb{E}_{\Theta \sim Q}\mathbb{E}_{A \sim f^{(k)}_\Theta}\mathbb{E}_{X \sim g^{ML}_{X|\Theta}(\cdot|\Theta)}\big[\mathbb{I}[V^T V < \alpha_{ML}]\big] = \frac{1}{2\Gamma(d/2)}\,\gamma\Big(\frac{d}{2}, \frac{nmk\,\alpha_{ML}}{2(nm + nk + mk)}\Big)$$

Hence, the game value is given by:

$$V(D_{ML}, G_{ML}) = \frac{1}{2} + \frac{1}{2\Gamma(d/2)}\Big[\gamma\Big(\frac{d}{2}, \frac{nk\,\alpha_{ML}}{2(n+k)}\Big) - \gamma\Big(\frac{d}{2}, \frac{nmk\,\alpha_{ML}}{2(nm + nk + mk)}\Big)\Big]$$

$$= \frac{1}{2} + \frac{1}{2\Gamma(d/2)}\Big[\gamma\Big(\frac{d}{2}, \frac{d(nm + nk + mk)}{2nk}\log\frac{nm + nk + mk}{m(n+k)}\Big) - \gamma\Big(\frac{d}{2}, \frac{d\,m(n+k)}{2nk}\log\frac{nm + nk + mk}{m(n+k)}\Big)\Big]$$

As required.

G EXPERIMENTS - DATASETS

Below we provide details on the datasets used for the authentication experiments on faces and characters. The VoxCeleb2 dataset (Nagrani et al., 2017; Chung & Zisserman, 2018) contains cropped face videos of 6112 identities. We used the original split of 5994 identities for training and 118 for test. For each identity, we saved every fifth frame, resized each frame to 64 × 64, and augmented it using horizontal flips. The Omniglot dataset (Lake et al., 2015) contains handwritten character images from 50 alphabets. There are 1623 different characters, and 20 examples for each character. We use the splits and augmentations suggested by Vinyals et al. (2016) and used by Snell et al. (2017).

H EXPERIMENTS - IMPLEMENTATION DETAILS

In this section, we describe our implementation of the GIM model for the different experiments in detail. Recall from Sec. 5 that, in general, the authenticator is a neural network $D(a, x)$ that can be expressed as $D(a, x) = \sigma(T_D(a), T_D(x))$, and the generator is a neural network $G(y)$ that can be expressed as $G(y)_i = \phi(W_i - \bar{W} + T_G(y))$ for all $i \in [n]$. In what follows we describe our implementation of these models for each of the experiments.

H.1 GAUSSIAN SOURCES

Authenticator Architecture: For the statistic function $T_D$, we use a concatenation of the mean and standard deviation of the sample. For the comparison function $\sigma$, we use the element-wise absolute difference between the statistics $T_D(a)$, $T_D(x)$, followed by a linear layer.

Attacker Architecture: For the statistic function $T_G$, we use the sample mean, i.e., $T_G(y) = \bar{y}$. The noise vectors $W_i$ are generated as follows: first, $n$ Gaussian noise vectors $Z_1, \ldots, Z_n \overset{iid}{\sim} N(0, I_d)$ are drawn, then each vector $Z_i$ is passed through a linear layer to obtain $W_i$. Finally, the decoder $\phi$ is the identity function.
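The following PyTorch-style sketch shows one way to realize the Gaussian-sources architecture just described. It is a schematic, not the paper's released code: class names are ours, and the noise map is a single linear layer as stated above.

```python
import torch
import torch.nn as nn

class GaussianAuthenticator(nn.Module):
    """D(a, x) = linear(|T_D(a) - T_D(x)|), with T_D(sample) = concat(mean, std)."""
    def __init__(self, d):
        super().__init__()
        self.head = nn.Linear(2 * d, 1)   # the statistic is 2d-dimensional (mean ++ std)

    @staticmethod
    def statistic(sample):                # sample: (batch, sample_size, d)
        return torch.cat([sample.mean(dim=1), sample.std(dim=1)], dim=-1)

    def forward(self, a, x):
        t_a, t_x = self.statistic(a), self.statistic(x)
        return self.head(torch.abs(t_a - t_x)).squeeze(-1)   # logit: "x was produced by the source"

class GaussianAttacker(nn.Module):
    """G(y)_i = W_i - W_bar + T_G(y), with T_G(y) = y_bar and W_i a linear map of Gaussian noise."""
    def __init__(self, d, n):
        super().__init__()
        self.n = n
        self.noise_map = nn.Linear(d, d)  # the decoder phi is the identity in this experiment

    def forward(self, y):                 # y: (batch, m, d) leaked sample
        b, _, d = y.shape
        w = self.noise_map(torch.randn(b, self.n, d, device=y.device))
        return w - w.mean(dim=1, keepdim=True) + y.mean(dim=1, keepdim=True)
```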
Optimization details: The model is trained in an authentication setup as in our theoretical setup, using alternating gradient descent as is common in GAN optimization (Mescheder et al., 2018). Each iteration begins when a source $\theta \in \mathbb{R}^d$ is drawn from the prior distribution $Q = N(0, 10 I_d)$. Samples $A \in \mathbb{R}^{k \times d}$, $Y \in \mathbb{R}^{m \times d}$, $X_\theta \in \mathbb{R}^{n \times d}$ are drawn IID from $f_\theta = N(\theta, I_d)$, where $X_\theta$ represents a real sample from $\theta$. The attacker, given the leaked sample $Y$, generates a fake sample $X_G = G(Y) \in \mathbb{R}^{n \times d}$, passes it to $D$, and suffers the loss $-\log(\mathrm{sigmoid}(D(A, X_G)))$. The authenticator $D$ receives as input the source information sample $A$, outputs a prediction for each of the test samples $X_\theta$, $X_G$, and suffers the binary cross-entropy loss $-0.5\,\big(\log(\mathrm{sigmoid}(D(A, X_\theta))) + \log(\mathrm{sigmoid}(1 - D(A, X_G)))\big)$. Each experiment is trained for 200K iterations with a batch size of 4000 using the Adam optimizer (Kingma & Ba, 2015) with learning rate $10^{-4}$.

H.2 EXPERIMENTS ON VOXCELEB2 AND OMNIGLOT

To describe the models we begin with some notation. We let $c$ denote the number of image channels, $h$ denote the image size (we only consider square images of size $c \times h \times h$), and $l$ denote the latent dimension of the model.

Authenticator Architecture: As mentioned above, the authenticator is a neural network model that can be expressed as $D(a, x) = \sigma(T_D(a), T_D(x))$. The statistic function $T_D$ maps a sample of images to a statistic vector $s \in \mathbb{R}^{6l}$ in the following way. Each image in the sample is mapped using encoders $E_D^{src}, E_D^{env} : [-1, 1]^{c \times h \times h} \to \mathbb{R}^l$ to two latent vectors $v^{src}, v^{env} \in \mathbb{R}^l$, respectively. $v^{src}$ is designed to represent the source $\theta$, and $v^{env}$ is designed to represent the environment (e.g., pose, lighting, expression). To represent the source of the sample, the sample mean of $v^{src}$ is taken. To represent the sample distribution, $v^{env}$ is passed through a non-linear statistic module $\zeta$ which is meant to capture more complex statistical functions of the sample ($\zeta$ is implemented as the concatenation of the standard deviation and mean of the sample after passing each example through a Multi-Layered Perceptron). Finally, $T_D(x)$ is obtained by concatenating the mean source vector and $\zeta(v^{env})$; e.g., for $x$ we have:

$$T_D(x) = \mathrm{concat}\Big(\frac{1}{n}\sum_{i=1}^{n} E_D^{src}(x_i),\; \zeta\big(E_D^{env}(x_1), \ldots, E_D^{env}(x_n)\big)\Big)$$

The comparison function $\sigma : \mathbb{R}^{6l} \times \mathbb{R}^{6l} \to \mathbb{R}$ receives two latent vectors $s_a = T_D(a)$, $s_x = T_D(x)$ representing the statistics of the samples $a$ and $x$, respectively. The vectors are concatenated and then passed through a Multi-Layered Perceptron which outputs a scalar reflecting their similarity. Namely:

$$\sigma(s_a, s_x) = \mathrm{MLP}(\mathrm{concat}(s_a, s_x))$$

The full architecture of the authenticator is depicted in Fig. 7.

Figure 7: An overview of the implementation of the GIM authenticator architecture for the experiments on the VoxCeleb2 and Omniglot datasets.
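As a schematic illustration of this statistic-and-compare design, the sketch below assembles $T_D$ and $\sigma$ as described. It is not the paper's code: the convolutional encoders are placeholders, and the width of the $\zeta$-MLP is an assumption of ours chosen so that the statistic lands in $\mathbb{R}^{6l}$ (it assumes $l$ is even).

```python
import torch
import torch.nn as nn

class ImageAuthenticator(nn.Module):
    """D(a, x) = MLP(concat(T_D(a), T_D(x))); encoder layers below are placeholders."""
    def __init__(self, c, h, l):
        super().__init__()
        def encoder():  # maps a [-1,1]^{c x h x h} image to R^l (placeholder architecture)
            return nn.Sequential(nn.Conv2d(c, 32, 4, 2, 1), nn.ReLU(),
                                 nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
                                 nn.Flatten(), nn.LazyLinear(l))
        self.enc_src = encoder()
        self.enc_env = encoder()
        # width chosen so that mean ++ std of the MLP output gives 5l dims, i.e. T_D(x) in R^{6l}
        self.zeta_mlp = nn.Sequential(nn.Linear(l, 5 * l // 2), nn.ReLU())
        self.sigma = nn.Sequential(nn.Linear(12 * l, l), nn.ReLU(), nn.Linear(l, 1))

    def statistic(self, sample):              # sample: (batch, sample_size, c, h, h)
        b, s = sample.shape[:2]
        flat = sample.flatten(0, 1)
        v_src = self.enc_src(flat).view(b, s, -1).mean(dim=1)        # source representation (R^l)
        e = self.zeta_mlp(self.enc_env(flat).view(b, s, -1))
        zeta = torch.cat([e.mean(dim=1), e.std(dim=1)], dim=-1)      # non-linear sample statistic (R^{5l})
        return torch.cat([v_src, zeta], dim=-1)                      # T_D(.) in R^{6l}

    def forward(self, a, x):
        s_a, s_x = self.statistic(a), self.statistic(x)
        return self.sigma(torch.cat([s_a, s_x], dim=-1)).squeeze(-1)
```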
Attacker Architecture: Our implementation of the attacker is inspired by the architecture suggested by Zakharov et al. (2019), which relies on the implicit assumption that an image can be modeled as a mapping of two latent vectors to image space: the first vector represents the source $\theta$ and is the same for any image of $\theta$, while the second vector represents the environment (e.g., pose, lighting, expression) and is different for each image of the source.

The attacker model consists of the following components: an image encoder $E_G^{src} : [-1, 1]^{c \times h \times h} \to \mathbb{R}^l$ that maps an image to a latent vector representing the source $\theta$; an image encoder $E_G^{env} : [-1, 1]^{c \times h \times h} \to \mathbb{R}^l$ that maps an image to a latent vector representing the environment; a Multi-Layered Perceptron $\mathrm{MLP}_G : \mathbb{R}^l \to \mathbb{R}^l$ that maps Gaussian noise to the environment latent space; an environment decoder $\phi_{env} : \mathbb{R}^l \to \mathbb{R}^{c \times h \times h}$ that maps a latent vector to an environment image, which could represent aspects of the environment such as facial landmarks (in Zakharov et al. (2019), the analogue of our environment image is indeed a facial-landmarks image, given as input to the model; in our work, we allow the model to learn which environment image is useful); and finally, a generator $\varphi : \mathbb{R}^{2c \times h \times h} \times \mathbb{R}^l \to [-1, 1]^{c \times h \times h}$ that maps an environment image, concatenated to a real image, to a new image. The generator is based on the image-to-image model used by Zakharov et al. (2019) and Johnson et al. (2016), and uses the source latent vector as input for Adaptive Instance Normalization (Huang & Belongie, 2017).

The attacker generates a fake sample $X \in [-1, 1]^{n \times c \times h \times h}$ based on a leaked sample $Y \in [-1, 1]^{m \times c \times h \times h}$ in the following way. Each image $Y_j$ in the leaked sample is mapped using $E_G^{src}$ and $E_G^{env}$ to latent vectors $u^{src}_j, u^{env}_j \in \mathbb{R}^l$. A latent environment vector $v^{env}_i$ is constructed for each fake image $X_i$ as follows: first, $n$ Gaussian noise vectors $Z_1, \ldots, Z_n \overset{iid}{\sim} N(0, I_l)$ are drawn, then each vector $Z_i$ is passed through $\mathrm{MLP}_G$ to obtain $W_i$, and finally, $v^{env}_i$ is obtained by matching the mean of the new latent environment vectors to the sample mean $\bar{u}^{env}$. Namely:

$$v^{env}_i = W_i - \bar{W} + \bar{u}^{env} \quad \forall i \in [n]$$

Each fake image $X_i$ is then generated deterministically as follows: $v^{env}_i$ is used as input to the decoder $\phi_{env}$, which outputs an environment image. This image is concatenated along the channel dimension to a random image from the leaked sample $Y$, and then passed as input to the generator $\varphi$, which also receives $\bar{u}^{src}$ as input to its Adaptive Instance Normalization layers. The output of the generator is the fake image $X_i$, for all $i \in [n]$. The full architecture of the attacker is depicted in Fig. 8.

Figure 8: An overview of the implementation of the GIM attacker architecture for the experiments on the VoxCeleb2 and Omniglot datasets.
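The generation procedure above can be summarized by the following schematic sketch. It is not the paper's code: the encoder, decoder, and generator modules are passed in as placeholders, and only the mean-matching of the environment latents and the channel-wise concatenation are spelled out.

```python
import torch

def generate_fake_sample(y, enc_src, enc_env, mlp_g, dec_env, generator, n):
    """Schematic GIM attack-sample generation (module internals are placeholders).
    y: (m, c, h, h) leaked images in [-1, 1]."""
    u_src = enc_src(y).mean(dim=0)                      # mean source code of the leaked sample
    u_env = enc_env(y)                                  # (m, l) environment codes
    w = mlp_g(torch.randn(n, u_env.shape[-1]))          # W_i = MLP_G(Z_i), Z_i ~ N(0, I_l)
    v_env = w - w.mean(0, keepdim=True) + u_env.mean(0, keepdim=True)   # mean matching
    fakes = []
    for i in range(n):
        env_img = dec_env(v_env[i])                     # environment image (e.g. learned "landmarks")
        ref = y[torch.randint(len(y), (1,)).item()]     # random leaked image, concatenated channel-wise
        fakes.append(generator(torch.cat([env_img, ref], dim=0), u_src))  # AdaIN conditioned on u_src
    return torch.stack(fakes)
```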
Optimization details: The model is trained in an authentication setup as in our theoretical setup, using alternating gradient descent with the regularization parameter as suggested by Mescheder et al. (2018). Each iteration begins when a source $\theta$ is drawn uniformly from the dataset. Samples $A \in [-1, 1]^{k \times c \times h \times h}$, $Y \in [-1, 1]^{m \times c \times h \times h}$, $X_\theta \in [-1, 1]^{n \times c \times h \times h}$ are sampled uniformly from the images available for the source $\theta$. The attacker, given the leaked sample $Y$, generates a fake sample $X_G = G(Y)$, passes it to $D$, and suffers the loss $-\log(\mathrm{sigmoid}(D(A, X_G)))$. The authenticator $D$ receives as input the source information sample $A$, outputs a prediction for each of the test samples $X_\theta$, $X_G$, and suffers the binary cross-entropy loss $-0.5\,\big(\log(\mathrm{sigmoid}(D(A, X_\theta))) + \log(\mathrm{sigmoid}(1 - D(A, X_G)))\big)$. The experiments on Omniglot were trained for 520k iterations with batch size 128 using the Adam optimizer (Kingma & Ba, 2015) with learning rates $10^{-6}$ for $D$, $10^{-5}$ for $G$, and $10^{-7}$ for $\mathrm{MLP}_G$ (as done by Karras et al. (2018b)). The regularization parameter was set to 0. The experiments on VoxCeleb2 were trained for 250k iterations with batch size 64 using the Adam optimizer (Kingma & Ba, 2015) with learning rate $10^{-4}$ for both $D$ and $G$ and $10^{-6}$ for $\mathrm{MLP}_G$. The regularization parameter was set to 10 (as done by Karras et al. (2018b)), since we noticed that it stabilized and sped up training and, in contrast to Omniglot and the Gaussian experiments, did not seem to hurt the results.
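For concreteness, the sketch below shows one alternating-gradient iteration with the losses described above. It is a minimal sketch, not the training code of the paper: the sign conventions are as reconstructed in the text, and the regularizer of Mescheder et al. (2018) is omitted.

```python
import torch

def train_step(D, G, opt_D, opt_G, A, X_real, Y):
    """One GIM training iteration: authenticator update, then attacker update."""
    # Authenticator update: push D(A, X_real) up and D(A, X_fake) down (binary cross-entropy form).
    X_fake = G(Y).detach()
    d_loss = -0.5 * (torch.log(torch.sigmoid(D(A, X_real))) +
                     torch.log(torch.sigmoid(1.0 - D(A, X_fake)))).mean()
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Attacker update: make the generated sample be accepted by D.
    g_loss = -torch.log(torch.sigmoid(D(A, G(Y)))).mean()
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()
```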