# The Many Faces of Adversarial Risk

Muni Sreenivas Pydi (University of Wisconsin–Madison, Madison, Wisconsin, USA; pydi@wisc.edu) and Varun Jog (University of Cambridge, Cambridge, UK; vj270@cam.ac.uk)

**Abstract.** Adversarial risk quantifies the performance of classifiers on adversarially perturbed data. Numerous definitions of adversarial risk, not all mathematically rigorous and differing subtly in the details, have appeared in the literature. In this paper, we revisit these definitions, make them rigorous, and critically examine their similarities and differences. Our technical tools derive from optimal transport, robust statistics, functional analysis, and game theory. Our contributions include the following: generalizing Strassen's theorem to the unbalanced optimal transport setting, with applications to adversarial classification with unequal priors; showing an equivalence between adversarial robustness and robust hypothesis testing with $\infty$-Wasserstein uncertainty sets; proving the existence of a pure Nash equilibrium in the two-player game between the adversary and the algorithm; and characterizing adversarial risk by the minimum Bayes error between a pair of distributions belonging to the $\infty$-Wasserstein uncertainty sets. Our results generalize and deepen recently discovered connections between optimal transport and adversarial robustness, and reveal new connections to Choquet capacities and game theory.

## 1 Introduction

Neural networks are known to be vulnerable to adversarial attacks: imperceptible perturbations to input data that maximize loss [38, 15, 5]. Developing algorithms resistant to such attacks has received considerable attention in recent years [8, 28, 24, 20], motivated by safety-critical applications such as autonomous driving [18, 27], medical imaging [17, 23, 22], and law [21, 6]. A classification algorithm with high accuracy (low risk) in the absence of an adversary may have poor accuracy (high risk) when an adversary is present. Thus, a modified notion known as adversarial risk is used to quantify the adversarial robustness of algorithms. Algorithms that minimize adversarial risk are deemed robust, and procedures for finding them have been effective in practice [24, 41, 28], spurring numerous theoretical investigations into adversarial risk and its minimizers.

There is no universally agreed upon definition of adversarial risk. Even the simplest setting of binary classification in $\mathbb{R}^d$ with an $\ell_2$ adversary admits various definitions involving set expansions [9, 16], transport maps [29], Markov kernels [31], and couplings [26]. These works broadly interpret adversarial risk as a measure of robustness to small perturbations, but their definitions differ in subtle details such as the class of adversaries and algorithms considered, the budget constraints placed on the adversary, the assumptions on the loss function, and the geometry of decision boundaries. Optimal adversarial risk is most commonly defined as the minimax risk under adversarial contamination [24, 33]. Other notable characterizations include an optimal transport cost between the data-generating distributions [30, 2, 10, 11], the optimal value of a distributionally robust optimization problem [36, 35, 40], and the value of a two-player zero-sum game [26, 29, 3, 4]. The diversity of definitions for adversarial risk makes it challenging to compare approaches. Moreover, not all approaches are rigorous.
For instance, the classes of adversarial strategies and classifier algorithms are often unclear, and issues of measurability are ignored. Although this may be harmless for applied research, it has led to incorrect proofs and insufficient assumptions in some theoretical works; a mathematically rigorous foundation for adversarial risk is essential for future research.

In this paper, we examine various notions of adversarial risk for binary classification in a non-parametric setting, where the decision boundary (or decision region) of a classifier is an arbitrary subset of the input space. We present rigorous definitions of adversarial risk and identify conditions under which these definitions are equivalent. We consider the general setting of Polish spaces (complete, separable metric spaces), and present stronger results for the Euclidean space $\mathbb{R}^d$. Our contributions are as follows:

- We examine the definition of adversarial risk based on set expansions. For Polish spaces, we observe that adversarial risk can fail to be Borel measurable, and hence is not well-defined, when the decision region is an arbitrary Borel set. We show that the problem can be resolved by considering a Polish space equipped with the universal completion of the Borel σ-algebra and restricting the decision regions to Borel sets. For the Euclidean space with the Lebesgue σ-algebra, we show that adversarial risk is well-defined for any Lebesgue measurable decision region. Our key lemma (Lemma 4.3) shows why the Lebesgue σ-algebra is preferred over the Borel σ-algebra: set expansions are Lebesgue measurable but not necessarily Borel measurable. These results are contained in Section 4.
- We show that the definition of adversarial risk using set expansions is identical to a notion of risk that appears in robust hypothesis testing with $\infty$-Wasserstein uncertainty sets. We prove this result in Polish spaces using the theory of measurable selections [1, 43]. In $\mathbb{R}^d$, we use the powerful theory of Choquet capacities [7] (in particular, Huber and Strassen's 2-alternating capacities [19]) to establish results of a similar nature. These results are contained in Section 5.
- We consider the binary classification setup with unequal priors and show (under suitable assumptions) that the optimal adversarial risk from the above definitions is characterized by an unbalanced optimal transport cost between the data-generating distributions. For both Polish spaces and $\mathbb{R}^d$, the main tool is Theorem 6.1, in which we generalize a classical result of Strassen on excess-cost optimal transport [37, 42] from probability measures to finite measures with possibly unequal mass. This generalizes the results of [31, 2] on binary classification, which were only for equal priors. These results are contained in Section 6.
- We consider the setup of a zero-sum game between the adversary and the algorithm. We show that the value of this game (the adversarial risk) equals the minimum Bayes error between a pair of distributions belonging to the $\infty$-Wasserstein uncertainty sets centered at the true data-generating distributions. We prove the existence of a pure Nash equilibrium in this game for $\mathbb{R}^d$ and for Polish spaces with a midpoint property. This extends and strengthens the results of [26, 29, 3] to non-parametric classifiers. These results are contained in Section 7.

The paper is organized as follows. In Section 2, we present preliminary definitions from optimal transport and metric space topology.
In Section 3, we discuss various definitions of adversarial risk and present further related work. Sections 4, 5, 6 and 7 contain our main contributions, summarized above. We conclude in Section 8 and discuss future research directions. We emphasize that rectifying measure-theoretic issues with existing formulations of adversarial risk is one of our contributions, but not the main focus of our paper. We start by fixing measurability and well-definedness (in Section 4) because otherwise we would not be able to rigorously present our main results in the subsequent sections, namely: the relation to robust hypothesis testing and Choquet capacities in Section 5, the generalization of the results of [2, 30] in Section 6, and the minimax theorems, existence of Nash equilibria, and extensions of the results of [26, 3, 29] in Section 7.

**Notation.** Throughout the paper, $X$ denotes a Polish space (a complete, separable metric space) with metric $d$ and Borel σ-algebra $\mathcal{B}(X)$. For $x \in X$ and $r \ge 0$, let $B_r(x)$ denote the closed ball of radius $r$ centered at $x$. We use $P(X)$ and $M(X)$ to denote the spaces of probability measures and finite measures on the measure space $(X, \mathcal{B}(X))$, respectively. Let $\overline{\mathcal{B}}(X)$ denote the universal completion of $\mathcal{B}(X)$, and let $\overline{P}(X)$ and $\overline{M}(X)$ denote the spaces of probability measures and finite measures on the complete measure space $(X, \overline{\mathcal{B}}(X))$. For $\mu, \nu \in M(X)$, we say $\nu$ dominates $\mu$ if $\mu(A) \le \nu(A)$ for all $A \in \mathcal{B}(X)$, and we write $\mu \preceq \nu$. When $X$ is $\mathbb{R}^d$, we use $\mathcal{L}(X)$ to denote the Lebesgue σ-algebra and $\lambda$ the $d$-dimensional Lebesgue measure; note that $\overline{\mathcal{B}}(X) \subseteq \mathcal{L}(X)$ for $X = \mathbb{R}^d$. For a positive integer $n$, we use $[n]$ to denote the finite set $\{1, \dots, n\}$.

## 2 Preliminaries

### 2.1 Metric Space Topology

We introduce three different notions of set expansion. For $\epsilon \ge 0$ and $A \in \mathcal{B}(X)$, the $\epsilon$-Minkowski expansion of $A$ is given by $A^{\oplus\epsilon} := \bigcup_{a \in A} B_\epsilon(a)$. The $\epsilon$-closed expansion of $A$ is defined as $A^{\epsilon]} := \{x \in X : d(x, A) \le \epsilon\}$, where $d(x, A) = \inf_{a \in A} d(x, a)$. The $\epsilon$-open expansion of $A$ is defined as $A^{\epsilon)} := \{x \in X : d(x, A) < \epsilon\}$. We use the notation $A^{\epsilon[}$ to denote $((A^c)^{\epsilon]})^c$; similarly, $A^{\ominus\epsilon} := ((A^c)^{\oplus\epsilon})^c$. For example, consider the set $A = (0, 1]$ in the space $(X, d) = (\mathbb{R}, |\cdot|)$ and $\epsilon > 0$. Then $A^{\oplus\epsilon} = (-\epsilon, 1 + \epsilon]$, $A^{\epsilon]} = [-\epsilon, 1 + \epsilon]$ and $A^{\epsilon)} = (-\epsilon, 1 + \epsilon)$. For any $A \in \mathcal{B}(X)$, $A^{\epsilon]}$ is closed and $A^{\epsilon)}$ is open; hence $A^{\epsilon]}, A^{\epsilon)} \in \mathcal{B}(X)$. Moreover, $A^{\epsilon)} \subseteq A^{\oplus\epsilon} \subseteq A^{\epsilon]}$. However, $A^{\oplus\epsilon}$ may not be in $\mathcal{B}(X)$ (see the Appendix for an example). In general, the Minkowski sum of two Borel sets need not be Borel [13], and that of two Lebesgue measurable sets need not be Lebesgue measurable [34].

### 2.2 Optimal Transport

Let $\mu, \nu \in P(X)$. A coupling between $\mu$ and $\nu$ is a joint probability measure $\pi \in P(X^2)$ with marginals $\mu$ and $\nu$. The set $\Pi(\mu, \nu) \subseteq P(X^2)$ denotes the set of all couplings between $\mu$ and $\nu$. The optimal transport cost between $\mu$ and $\nu$ under a cost function $c : X \times X \to [0, \infty)$ is defined as $T_c(\mu, \nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{X^2} c(x, x')\, d\pi(x, x')$. For a positive integer $p$, the $p$-Wasserstein distance between $\mu$ and $\nu$ is defined as $W_p(\mu, \nu) = (T_{d^p}(\mu, \nu))^{1/p}$. The $\infty$-Wasserstein metric is defined as $W_\infty(\mu, \nu) = \lim_{p \to \infty} W_p(\mu, \nu)$. It can also be expressed in the following ways:

$$W_\infty(\mu, \nu) = \inf_{\pi \in \Pi(\mu,\nu)} \operatorname*{ess\,sup}_{(x, x') \sim \pi} d(x, x') = \inf\{\delta > 0 : \mu(A) \le \nu(A^{\delta]}) \text{ for all } A \in \mathcal{B}(X)\}. \tag{1}$$

Given $\mu \in P(X)$ and a measurable function $f : X \to X$, the push-forward of $\mu$ by $f$ is the probability measure $f_\sharp\mu \in P(X)$ given by $f_\sharp\mu(A) = \mu(f^{-1}(A))$ for all $A \in \mathcal{B}(X)$.
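To make the three expansions concrete, the following sketch (our illustration, not code from the paper) checks membership of a few boundary points in $A^{\oplus\epsilon}$, $A^{\epsilon]}$ and $A^{\epsilon)}$ for the interval $A = (0, 1]$ from the example above, using the closed form of $d(x, A)$.

```python
# A sketch (ours, not code from the paper): the three eps-expansions of
# A = (0, 1] in (R, |.|), via the distance d(x, A) = inf_{a in A} |x - a|.
# The Minkowski expansion is (-eps, 1 + eps]: its left endpoint is excluded
# because d(-eps, A) = eps is an infimum that no a in A attains, while its
# right endpoint is included because a = 1 attains it.

def dist_to_A(x: float, lo: float = 0.0, hi: float = 1.0) -> float:
    """d(x, A) for the half-open interval A = (lo, hi]."""
    if x <= lo:
        return lo - x
    if x >= hi:
        return x - hi
    return 0.0

eps = 0.5
for x in [-0.5001, -0.5, -0.4999, 0.0, 1.0, 1.5, 1.5001]:
    d = dist_to_A(x)
    print(f"x = {x:+.4f}  d(x, A) = {d:.4f}  "
          f"Minkowski: {-eps < x <= 1 + eps}  "
          f"closed: {d <= eps}  open: {d < eps}")
# The output shows A^{eps)} (open) strictly inside A^{+eps} (Minkowski),
# itself strictly inside A^{eps]} (closed): the three differ only at the
# two endpoints -eps and 1 + eps.
```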
## 3 Adversarial Risk: Definitions and Related Work

We consider a binary classification setting on a feature space $X$. Let $p_0, p_1 \in P(X)$ be the data-generating distributions for labels 0 and 1, respectively, and let the prior probabilities for labels 0 and 1 be in the ratio $T : 1$, where we assume $T \ge 1$ without loss of generality. For a space of classifiers parametrized by $w \in \mathcal{W}$ and a loss function $\ell : (X \times Y) \times \mathcal{W} \to [0, \infty)$, the adversarial risk of a classifier $w \in \mathcal{W}$ under an adversarial budget of $\epsilon \ge 0$ is defined as [24, 33]

$$R^{\oplus}_\epsilon(\ell, w) = \mathbb{E}_{(x,y)}\left[\sup_{d(x, x') \le \epsilon} \ell((x', y), w)\right]. \tag{2}$$

If the loss function $\ell(\cdot, w)$ is measurable, upper semi-continuous, and bounded above for all $w \in \mathcal{W}$, [26] show that $R^{\oplus}_\epsilon(\ell, w)$ is well-defined; in general, it may not be. A case of special interest is the 0-1 loss function with non-parametric classifiers of the form $f_A(x) := \mathbb{1}\{x \in A\}$, where $A \in \mathcal{B}(X)$. In this case, $\ell_{0/1}((x, y), A) = \mathbb{1}\{x \in A, y = 0\} + \mathbb{1}\{x \in A^c, y = 1\}$. Hence,

$$R^{\oplus}_\epsilon(\ell_{0/1}, A) = \frac{T}{T+1}\mathbb{E}_{p_0}\left[\sup_{d(x,x') \le \epsilon} \mathbb{1}\{x' \in A\}\right] + \frac{1}{T+1}\mathbb{E}_{p_1}\left[\sup_{d(x,x') \le \epsilon} \mathbb{1}\{x' \in A^c\}\right] = \frac{T}{T+1}p_0(A^{\oplus\epsilon}) + \frac{1}{T+1}p_1((A^c)^{\oplus\epsilon}). \tag{3}$$

A problem with the formulation in (3) is the ambiguity over the measurability of the sets $A^{\oplus\epsilon}$ and $(A^c)^{\oplus\epsilon}$. Even when $A \in \mathcal{B}(X)$, it is not guaranteed that $A^{\oplus\epsilon}, (A^c)^{\oplus\epsilon} \in \mathcal{B}(X)$ (see the Appendix for an example). Hence, $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ is not well-defined for all $A \in \mathcal{B}(X)$. It is shown in [31] that $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ is well-defined when $A$ is either closed or open, but its validity beyond that was unknown. A simple fix to this measurability problem is to use the closed set expansion instead of the Minkowski set expansion, as done in [25]. This leads to the following formulation:

$$R^{]}_\epsilon(\ell_{0/1}, A) = \frac{T}{T+1}p_0(A^{\epsilon]}) + \frac{1}{T+1}p_1((A^c)^{\epsilon]}). \tag{4}$$

The above definition is well-defined for any $A \in \mathcal{B}(X)$ because $A^{\epsilon]}$ and $(A^c)^{\epsilon]}$ are both closed and hence measurable. However, under this definition, a point $x \in A$ may be perturbed to $x' \in A^{\epsilon]}$ such that $d(x, x') > \epsilon$. For example, when $A = (0, 1)$, we have $A^{\epsilon]} = [-\epsilon, 1 + \epsilon]$, and an adversary may transport $x = \delta > 0$ to $x' = -\epsilon$, violating the budget constraint at $x$.

**Remark 1.** The formulations in equations (2), (3) and (4) can give a strictly positive adversarial risk even for a perfect (i.e., Bayes optimal) classifier. This is consistent with the literature on adversarial examples, where even a perfect classifier is forced to make errors in the presence of evasion attacks. These formulations of adversarial risk correspond to the constant-in-the-ball risk of [16] and the corrupted-instance risk of [9, 25]. Here, an adversarial risk of zero is possible only if the supports of $p_0$ and $p_1$ are non-overlapping and separated by at least $2\epsilon$. This is not the case with other formulations of adversarial risk such as the exact-in-the-ball risk [16] and the prediction-change and error-region risks of [9, 25]. We focus on the corrupted-instance family of risks in this work.

Another approach to defining adversarial risk is to explicitly define the class of adversaries of budget $\epsilon$ as measurable transport maps $f : X \to X$ that push forward the true data distribution such that no point is transported by more than a distance $\epsilon$; i.e., $d(x, f(x)) \le \epsilon$. The transport map-based adversarial risk [29] is formally defined as follows:

$$R^{\mathcal{F}}_\epsilon(\ell_{0/1}, A) = \sup_{\substack{f_0, f_1 : X \to X \\ \forall x \in X,\ d(x, f_i(x)) \le \epsilon}} \frac{T}{T+1}(f_0)_\sharp p_0(A) + \frac{1}{T+1}(f_1)_\sharp p_1(A^c). \tag{5}$$
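For intuition about (3), consider a one-dimensional example. If $A = [t, \infty)$, then $A^{\oplus\epsilon} = [t - \epsilon, \infty)$ and $(A^c)^{\oplus\epsilon} = (-\infty, t + \epsilon)$, so the Minkowski-expansion risk has a closed form in terms of Gaussian tail probabilities. The sketch below is our illustration; the distributions $N(-1, 1)$ and $N(+1, 1)$ and the threshold classifier are assumptions, not taken from the paper.

```python
# A sketch (ours, not code from the paper): the risk in (3) for equal
# priors (T = 1), with assumed distributions p0 = N(-1, 1), p1 = N(+1, 1)
# and the threshold decision region A = [t, +inf). Here
#   A^{+eps} = [t - eps, +inf)   and   (A^c)^{+eps} = (-inf, t + eps),
# so (3) reduces to two Gaussian tail probabilities.
from scipy.stats import norm

def minkowski_risk(t: float, eps: float) -> float:
    p0_err = norm.sf(t - eps, loc=-1.0)   # p0(A^{+eps}): label-0 mass an
                                          # adversary can push into A
    p1_err = norm.cdf(t + eps, loc=1.0)   # p1((A^c)^{+eps}): label-1 mass
                                          # an adversary can push out of A
    return 0.5 * (p0_err + p1_err)

for eps in [0.0, 0.1, 0.5]:
    print(f"eps = {eps:.1f}: risk at t = 0 is {minkowski_risk(0.0, eps):.4f}")
# eps = 0 recovers the standard risk of the Bayes-optimal threshold t = 0
# (about 0.1587); the risk is strictly positive for every eps, as noted in
# Remark 1, and grows with the adversary's budget.
```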
Yet another definition uses the robust hypothesis testing framework with $W_\infty$ uncertainty sets. In this approach, an adversary perturbs the true distribution $p_i$ to a corrupted distribution $p_i'$ such that $W_\infty(p_i, p_i') \le \epsilon$. From (1), this is equivalent to the existence of a coupling $\pi \in \Pi(p_i, p_i')$ such that $\operatorname{ess\,sup}_{(x,x') \sim \pi} d(x, x') \le \epsilon$. The adversarial risk against such an adversary is given by

$$R^{\Gamma}_\epsilon(\ell_{0/1}, A) = \sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon} \frac{T}{T+1}p_0'(A) + \frac{1}{T+1}p_1'(A^c). \tag{6}$$

Clearly, $R^{\mathcal{F}}_\epsilon(\ell_{0/1}, A) \le R^{\Gamma}_\epsilon(\ell_{0/1}, A)$, but conditions for equality were not studied in prior work. Moreover, the relation of these definitions to the set expansion-based definitions in (3) and (4) was also unknown.

Next, we discuss some characterizations of the optimal adversarial risk, defined as $R^{\oplus}_\epsilon := \inf_{A \in \mathcal{B}(X)} R^{\oplus}_\epsilon(\ell_{0/1}, A)$. In [30, 2], it is shown that $R^{\oplus}_\epsilon = \frac{1}{2}[1 - D_\epsilon(p_0, p_1)]$ for equal priors ($T = 1$), where $D_\epsilon$ is an optimal transport cost defined as follows.

**Definition 1** ($D_\epsilon$ cost). Let $\epsilon \ge 0$ and let $c_\epsilon : X^2 \to \{0, 1\}$ be such that $c_\epsilon(x, x') = \mathbb{1}\{(x, x') \in X \times X : d(x, x') > 2\epsilon\}$. Then for $\mu, \nu \in P(X)$, $D_\epsilon(\mu, \nu) = T_{c_\epsilon}(\mu, \nu)$.

For $\epsilon = 0$, $D_\epsilon$ reduces to the total variation distance. While $D_0$ is a metric on $P(X)$, $D_\epsilon$ (for $\epsilon > 0$) is neither a metric nor a pseudometric [31].

Other formulations of optimal adversarial risk are inspired by game theory [29, 26, 3]. Consider a game between two players: (1) the adversary, whose action space is pairs of distributions $p_0', p_1' \in P(X)$, and (2) the algorithm, whose action space is the space of decision regions $A \in \mathcal{B}(X)$. For $T > 0$, define $r : \mathcal{B}(X) \times P(X) \times P(X) \to [0, 1]$ as $r(A, \mu, \nu) = \frac{T}{T+1}\mu(A) + \frac{1}{T+1}\nu(A^c)$. The payoff when the algorithm plays first is given by

$$\inf_{A \in \mathcal{B}(X)}\ \sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon} r(A, p_0', p_1'),$$

and this quantity is interpreted as the optimal adversarial risk in this setup. Table 1 summarizes the adversarial risk definitions introduced in this section.

Table 1: Summary of the adversarial risk definitions presented in Section 3. $R^{\oplus}_\epsilon$, $R^{]}_\epsilon$, $R^{\mathcal{F}}_\epsilon$ and $R^{\Gamma}_\epsilon$ denote adversarial risk defined using Minkowski set expansions, closed set expansions, transport maps and the $\infty$-Wasserstein metric, respectively.

| Risk | Adversary's action | Perturbation | $d(x, x') \le \epsilon$? | Well-defined? |
|------|--------------------|--------------|--------------------------|---------------|
| $R^{\oplus}_\epsilon$ | $x \in A \mapsto x' \in A^{\oplus\epsilon}$ | Random | Yes, $\forall x$ | $\mathbb{R}^d$: yes, for Lebesgue $A$; Polish $X$: yes, for Borel $A$ |
| $R^{]}_\epsilon$ | $x \in A \mapsto x' \in A^{\epsilon]}$ | Random | No | Yes, for measurable $A$ |
| $R^{\mathcal{F}}_\epsilon$ | $x \mapsto x' = f_i(x)$ | Deterministic | Yes, $\forall x$ | Yes, for measurable $f_i$ |
| $R^{\Gamma}_\epsilon$ | $p_i \mapsto p_i'$ with $W_\infty(p_i, p_i') \le \epsilon$ | Random | Almost surely yes | Yes |
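When $\mu$ and $\nu$ are finitely supported, Definition 1 is a finite linear program. The following sketch (ours, not code from the paper; the measures are assumptions) computes $D_\epsilon$ with scipy, and $\epsilon = 0$ recovers the total variation distance, as noted above.

```python
# A sketch (ours, not code from the paper): D_eps of Definition 1 for
# finitely supported mu and nu, as a transport LP with the 0-1 cost
# c_eps(x, x') = 1{|x - x'| > 2*eps}. For eps = 0 this is the total
# variation distance between mu and nu.
import numpy as np
from scipy.optimize import linprog

def D_eps(xs, mu, ys, nu, eps):
    n, m = len(xs), len(ys)
    cost = (np.abs(np.subtract.outer(xs, ys)) > 2 * eps).astype(float)
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0   # sum_j pi[i, j] = mu[i]
    for j in range(m):
        A_eq[n + j, j::m] = 1.0            # sum_i pi[i, j] = nu[j]
    b_eq = np.concatenate([mu, nu])
    return linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None)).fun

xs, mu = np.array([0.0, 1.0]), np.array([0.5, 0.5])
ys, nu = np.array([0.9, 3.0]), np.array([0.5, 0.5])
for eps in [0.0, 0.5, 1.5]:
    print(f"eps = {eps}: D_eps = {D_eps(xs, mu, ys, nu, eps):.3f}")
# eps = 0.0: disjoint supports, so D_0 = TV = 1.0.
# eps = 0.5: only y = 0.9 is within 2*eps of the mu-atoms, and it can
#            absorb mass 0.5, so D_eps = 0.5.
# eps = 1.5: every pair is within 2*eps = 3, so D_eps = 0.
```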
## 4 Well-Definedness of Adversarial Risk

As stated in Section 3, $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ may not be well-defined for some decision regions $A \in \mathcal{B}(X)$ because of the non-measurability of the sets $A^{\oplus\epsilon}$ and $(A^c)^{\oplus\epsilon}$. Specifically, we have the following lemma.

**Lemma 4.1.** For any $\epsilon > 0$, there exist a Polish space $X$ and a set $A \in \mathcal{B}(X)$ such that $A^{\oplus\epsilon} \notin \mathcal{B}(X)$.

In this section, we lay down the conditions under which the ambiguity can be resolved. We begin by presenting a lemma showing that $A^{\oplus\epsilon}$ is an analytic set (i.e., a continuous image of a Borel set) whenever $A$ is Borel. It is known that analytic sets are universally measurable: they belong to $\overline{\mathcal{B}}(X)$, the universal completion of the Borel σ-algebra $\mathcal{B}(X)$, and are measurable with respect to any finite measure defined on the complete measure space $(X, \overline{\mathcal{B}}(X))$.

**Lemma 4.2.** Let $A \in \mathcal{B}(X)$. Then $A^{\oplus\epsilon}$ is an analytic set. Consequently, $A^{\oplus\epsilon} \in \overline{\mathcal{B}}(X)$.

By virtue of the previous lemma, we have the following.

**Theorem 4.1.** Let $p_0, p_1 \in \overline{P}(X)$. Then for any $A \in \mathcal{B}(X)$, $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ is well-defined.

For the special case of $X = \mathbb{R}^d$, we can further strengthen Theorem 4.1 to include all Lebesgue measurable sets $\mathcal{L}(X)$ instead of just Borel sets $\mathcal{B}(X)$. For this, we use the concept of porous sets.

**Definition 2** (Porous set). A set $E \subseteq X$ is said to be porous if there exist $\alpha \in (0, 1)$ and $r_0 > 0$ such that for every $r \in (0, r_0]$ and every $x \in X$, there is an $x' \in X$ such that $B_{\alpha r}(x') \subseteq B_r(x) \setminus E$.

Porous sets are a subclass of nowhere dense sets. Importantly, $\lambda(E) = 0$ for any porous set $E \subseteq \mathbb{R}^d$ [47]. By the following lemma, the set difference between the closed and open set expansions is porous.

**Lemma 4.3.** Let $(X, d) = (\mathbb{R}^d, \|\cdot\|)$ and $A \in \mathcal{L}(X)$. Then $E = A^{\epsilon]} \setminus A^{\epsilon)}$ is porous.

Lemma 4.3 plays a crucial role in proving that $A^{\oplus\epsilon} \in \mathcal{L}(X)$ whenever $A \in \mathcal{L}(X)$. Recall that $A^{\oplus\epsilon}$ is the Minkowski sum of $A$ with the closed $\epsilon$-ball. In general, the Minkowski sum of two Lebesgue measurable sets is not always Lebesgue measurable [34, 14], so the fact that one of the summands in $A^{\oplus\epsilon}$ is a closed ball is important. In the following theorem, we use Lemma 4.3 to prove the measurability of $A^{\oplus\epsilon}$, and in turn that $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ is well-defined for any $A \in \mathcal{L}(X)$.

**Theorem 4.2.** Let $(X, d) = (\mathbb{R}^d, \|\cdot\|)$. Let $p_0, p_1 \in P(X)$ and let $\epsilon \ge 0$. Then for any $A \in \mathcal{L}(X)$, $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ is well-defined. If, in addition, $p_0$ and $p_1$ are absolutely continuous with respect to the Lebesgue measure, then $R^{\oplus}_\epsilon(\ell_{0/1}, A) = R^{]}_\epsilon(\ell_{0/1}, A)$.

## 5 Equivalence with $\infty$-Wasserstein Robustness

In this section, we give the conditions under which $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ is equivalent to the other notions of adversarial risk based on transport maps and $W_\infty$ robustness.

### 5.1 $W_\infty$ Robustness in Polish Spaces via Measurable Selections

We begin by presenting a lemma that links the measure of the $\epsilon$-Minkowski set expansion to the worst-case measure over a $W_\infty$ probability ball of radius $\epsilon$.

**Lemma 5.1.** Let $\mu \in \overline{P}(X)$ and $A \in \mathcal{B}(X)$. Then $\sup_{W_\infty(\mu, \mu') \le \epsilon} \mu'(A) = \mu(A^{\oplus\epsilon})$. Moreover, the supremum is achieved by a $\mu' \in \overline{P}(X)$ that is induced from $\mu$ via a measurable transport map $\varphi : X \to X$ (i.e., $\mu' = \varphi_\sharp\mu$) satisfying $d(x, \varphi(x)) \le \epsilon$ for all $x \in X$.

A crucial step in the proof of Lemma 5.1 is finding a measurable transport map $\varphi$ such that $\varphi^{-1}(A) = A^{\oplus\epsilon}$ and $d(x, \varphi(x)) \le \epsilon$ for all $x \in X$. In the following theorem, we use Lemma 5.1 to establish the equivalence between three different notions of adversarial risk introduced in Section 3.

**Theorem 5.1.** Let $p_0, p_1 \in \overline{P}(X)$ and $A \in \mathcal{B}(X)$. Then $R^{\oplus}_\epsilon(\ell_{0/1}, A) = R^{\mathcal{F}}_\epsilon(\ell_{0/1}, A) = R^{\Gamma}_\epsilon(\ell_{0/1}, A)$. In addition, the supremum over $f_0$ and $f_1$ in $R^{\mathcal{F}}_\epsilon(\ell_{0/1}, A)$ is attained; similarly, the supremum over $p_0'$ and $p_1'$ in $R^{\Gamma}_\epsilon(\ell_{0/1}, A)$ is attained.

Table 2: Equivalences among adversarial risk formulations. $R^{\oplus}_\epsilon(A)$, $R^{]}_\epsilon(A)$, $R^{\mathcal{F}}_\epsilon(A)$ and $R^{\Gamma}_\epsilon(A)$ denote adversarial risk for the 0-1 loss function $\ell_{0/1}$ for a binary classifier with decision region $A$ (i.e., $f_A(x) = \mathbb{1}\{x \in A\}$), defined using Minkowski set expansions, closed set expansions, transport maps and the $\infty$-Wasserstein metric, respectively. $\mathcal{B}(X)$ and $\mathcal{L}(X)$ denote the Borel and Lebesgue σ-algebras, and $(X, \overline{\mathcal{B}}(X))$ denotes the universal completion of the Borel measure space $(X, \mathcal{B}(X))$.

| Equivalences in adversarial risk | Conditions |
|---|---|
| $R^{\oplus}_\epsilon(A) = R^{\Gamma}_\epsilon(A)$ | $\mathbb{R}^d$: $A \in \mathcal{L}(X)$, or $(X, \overline{\mathcal{B}}(X))$: $A \in \mathcal{B}(X)$ |
| $R^{\oplus}_\epsilon(A) = R^{\Gamma}_\epsilon(A) = R^{\mathcal{F}}_\epsilon(A)$ | $(X, \overline{\mathcal{B}}(X))$: $A \in \mathcal{B}(X)$ |
| $R^{\oplus}_\epsilon(A) = R^{\Gamma}_\epsilon(A) = R^{\mathcal{F}}_\epsilon(A) = R^{]}_\epsilon(A)$ | $\mathbb{R}^d$: $A \in \mathcal{L}(X)$ and $p_0, p_1$ have densities |
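The map $\varphi$ in Lemma 5.1 admits a simple discrete illustration: move every atom of $\mu$ that lies within distance $\epsilon$ of $A$ to its nearest point of $A$, and leave the rest in place. The sketch below is our construction under assumed data (a finite $A \subseteq \mathbb{R}$, for which $A^{\oplus\epsilon} = A^{\epsilon]}$); it verifies that the resulting push-forward respects the budget and attains $\mu(A^{\oplus\epsilon})$.

```python
# A sketch (ours) of the transport-map adversary behind Lemma 5.1, for an
# empirical measure mu on R and an assumed finite decision region A (for
# finite A, the Minkowski expansion A^{+eps} equals the closed expansion
# A^{eps]}). Atoms within distance eps of A move to their nearest point
# of A; all other atoms stay put.
import numpy as np

rng = np.random.default_rng(0)
atoms = rng.normal(size=200)       # support of mu, uniform weights
A = np.array([0.5, 0.7, 2.0])      # an assumed finite decision region
eps = 0.3

def phi(x: float) -> float:
    a = A[np.argmin(np.abs(A - x))]          # nearest point of A
    return a if abs(a - x) <= eps else x     # move only within budget

moved = np.array([phi(x) for x in atoms])    # mu' = phi_sharp(mu)
assert np.all(np.abs(moved - atoms) <= eps)  # W_inf(mu, mu') <= eps

mu_prime_A = np.isin(moved, A).mean()        # mu'(A)
mu_expansion = (np.min(np.abs(atoms[:, None] - A[None, :]), axis=1)
                <= eps).mean()               # mu(A^{+eps})
print(mu_prime_A, mu_expansion)              # identical, as in the lemma
```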
### 5.2 $W_\infty$ Robustness in $\mathbb{R}^d$ via 2-Alternating Capacities

In this subsection, we establish a connection between adversarial risk and Choquet capacities [7] in $\mathbb{R}^d$. This connection allows us to extend Theorem 5.1 from Borel sets to the broader class of Lebesgue measurable sets. We will use this connection again for proving minimax theorems and the existence of Nash equilibria in Section 7.1. We begin with the following definitions.

**Definition 3** (Capacity). A set function $v : \mathcal{B}(X) \to [0, 1]$ is a capacity if it satisfies the following conditions: (1) $v(\emptyset) = 0$ and $v(X) = 1$; (2) for $A, B \in \mathcal{B}(X)$, $A \subseteq B \implies v(A) \le v(B)$; (3) $A_n \uparrow A \implies v(A_n) \to v(A)$; and (4) $F_n \downarrow F$ with $F_n$ closed $\implies v(F_n) \to v(F)$.

**Definition 4** (2-Alternating capacity). A capacity $v$ defined on the measure space $(X, \mathcal{B}(X))$ is called 2-alternating if $v(A \cup B) + v(A \cap B) \le v(A) + v(B)$ for all $A, B \in \mathcal{B}(X)$.

For any compact set of probability measures $\Xi \subseteq P(X)$, the upper probability $v(A) = \sup_{\mu \in \Xi} \mu(A)$ is a capacity [19]. The upper probability of an $\epsilon$-neighborhood of $\mu \in P(X)$ defined using either the total variation metric or the Lévy-Prokhorov metric can be shown to be a 2-alternating capacity [19]. The following lemma shows that $A \mapsto \mu(A^{\oplus\epsilon})$ is a 2-alternating capacity under some conditions.

**Lemma 5.2.** Let $(X, d) = (\mathbb{R}^d, \|\cdot\|)$, let $\mu \in P(X)$, and let $\epsilon \ge 0$. Define the set function $v$ by $v(A) := \mu(A^{\oplus\epsilon})$ for $A \in \mathcal{L}(X)$. Then $v$ is a 2-alternating capacity.

Now we relate the capacity defined in Lemma 5.2 to the $W_\infty$ metric. Since the $\epsilon$-neighborhood of $\mu \in P(X)$ in the $W_\infty$ metric is a compact set of probability measures [46], the upper probability over this $W_\infty$ $\epsilon$-ball is a capacity. The following lemma shows that it is a 2-alternating capacity and identifies it with the capacity defined in Lemma 5.2.

**Lemma 5.3.** Let $(X, d) = (\mathbb{R}^d, \|\cdot\|)$ and $\mu \in P(X)$. Then for any $A \in \mathcal{L}(X)$, $\sup_{W_\infty(\mu, \mu') \le \epsilon} \mu'(A) = \mu(A^{\oplus\epsilon})$. Moreover, the supremum is attained.

Lemma 5.3 plays a role similar to Lemma 5.1 in proving the following equivalence between adversarial robustness and $W_\infty$ robustness.

**Theorem 5.2.** Let $(X, d) = (\mathbb{R}^d, \|\cdot\|)$. Let $p_0, p_1 \in P(X)$ and let $\epsilon \ge 0$. Then for any $A \in \mathcal{L}(X)$, $R^{\oplus}_\epsilon(\ell_{0/1}, A) = R^{\Gamma}_\epsilon(\ell_{0/1}, A)$, and the supremum over $p_0'$ and $p_1'$ in $R^{\Gamma}_\epsilon(\ell_{0/1}, A)$ is attained.

The proof follows by converting the expression for $R^{\Gamma}_\epsilon$ into one for $R^{\oplus}_\epsilon$ using Lemma 5.3. Unlike Theorem 5.1, Theorem 5.2 does not show the equivalence of $R^{\mathcal{F}}_\epsilon(\ell_{0/1}, A)$ with the other definitions under the relaxed assumption $A \in \mathcal{L}(X)$. This is because Lemma 5.3 does not provide a push-forward map $\varphi$ such that $\mu' = \varphi_\sharp\mu$ with $\mu'$ attaining the supremum over the $W_\infty$ ball.

## 6 Optimal Adversarial Risk via Generalized Strassen's Theorem

In Section 5, we analyzed adversarial risk for a specific decision region $A \in \mathcal{B}(X)$. In this section, we analyze the infimum of adversarial risk over all possible decision regions, i.e., the optimal adversarial risk. We show that the optimal adversarial risk in binary classification with unequal priors is characterized by an unbalanced optimal transport cost between the data-generating distributions. Our main technical lemma generalizes Strassen's theorem to unbalanced optimal transport. We present this result in Subsection 6.1 and our characterization of optimal adversarial risk in Subsection 6.2.

### 6.1 Unbalanced Optimal Transport and Generalized Strassen's Theorem

Recall from Section 3 that the optimal transport cost $D_\epsilon$ characterizes the optimal adversarial risk in binary classification for equal priors. The following result gives an alternative characterization of $D_\epsilon$.

**Proposition 6.1** (Strassen's theorem; Corollary 1.28 in [42]). Let $\mu, \nu \in P(X)$ and let $\epsilon \ge 0$. Then

$$\sup_{A \in \mathcal{B}(X)} \left[\mu(A) - \nu(A^{2\epsilon]})\right] = D_\epsilon(\mu, \nu). \tag{7}$$

Proposition 6.1 is a special case of Kantorovich-Rubinstein duality [42] applied to $\{0, 1\}$-valued cost functions. We now generalize this result to measures with unequal masses. We begin with some definitions that generalize the concepts introduced in Subsection 2.2. Let $\mu, \nu \in M(X)$ be such that $\mu(X) \le \nu(X)$. A coupling between $\mu$ and $\nu$ is a measure $\pi \in M(X^2)$ such that for any $A \in \mathcal{B}(X)$, $\pi(A \times X) = \mu(A)$ and $\pi(X \times A) \le \nu(A)$. The set $\Pi(\mu, \nu)$ is defined to be the set of all such couplings. For a cost function $c : X^2 \to [0, \infty)$, the optimal transport cost between $\mu$ and $\nu$ under $c$ is defined as $T_c(\mu, \nu) = \inf_{\pi \in \Pi(\mu,\nu)} \int_{X^2} c(x, x')\, d\pi(x, x')$.

**Theorem 6.1** (Generalized Strassen's theorem). Let $\mu, \nu \in M(X)$ be such that $0 < M = \mu(X) \le \nu(X)$, and let $\epsilon > 0$. Define $c_\epsilon : X^2 \to \{0, 1\}$ as $c_\epsilon(x, x') = \mathbb{1}\{(x, x') \in X^2 : d(x, x') > 2\epsilon\}$. Then

$$\sup_{A \in \mathcal{B}(X)} \left[\mu(A) - \nu(A^{2\epsilon]})\right] = T_{c_\epsilon}(\mu, \nu) = M \inf_{\nu' \in P(X) :\ \nu' \preceq \nu/M} D_\epsilon\left(\mu/M, \nu'\right). \tag{8}$$

Moreover, the infimum on the right-hand side is attained. (Equivalently, there is a coupling $\pi \in \Pi(\mu, \nu)$ that attains the unbalanced optimal transport cost $T_{c_\epsilon}(\mu, \nu)$.)

Our proof of Theorem 6.1 leverages strong duality in linear programming. We first establish (8) for discrete measures with finite support. We then apply the discrete result to a sequence of measures supported on a countable dense subset of the Polish space $X$. Using the tightness of finite measures on $X$, we construct an optimal coupling that achieves the cost $T_{c_\epsilon}(\mu, \nu)$ in (8), and then show that the constructed coupling satisfies (8). This proof strategy is adapted from the works of [12] and [32].
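On finite supports, both sides of the first equality in (8) can be computed directly: the unbalanced cost $T_{c_\epsilon}$ is a linear program with equality row-marginals and dominated column-marginals, and the supremum over $A$ can be restricted to subsets of the support of $\mu$, since enlarging $A$ outside $\mathrm{supp}(\mu)$ cannot increase $\mu(A)$. The following sketch (ours; the specific measures are assumptions) checks the duality numerically.

```python
# A sketch (ours; the measures are assumptions) checking the first equality
# of (8) on finite supports. Primal: T_{c_eps}(mu, nu) as an LP whose
# coupling has row marginals equal to mu and column marginals dominated by
# nu. Dual: enumerate subsets A of supp(mu).
import itertools
import numpy as np
from scipy.optimize import linprog

xs, mu = np.array([0.0, 1.0, 2.5]), np.array([0.4, 0.3, 0.3])  # mass 1.0
ys, nu = np.array([0.2, 2.4, 4.0]), np.array([0.6, 0.4, 0.4])  # mass 1.4
eps = 0.4
n, m = len(xs), len(ys)

cost = (np.abs(np.subtract.outer(xs, ys)) > 2 * eps).astype(float).ravel()
A_eq, A_ub = np.zeros((n, n * m)), np.zeros((m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0   # sum_j pi[i, j]  = mu[i]
for j in range(m):
    A_ub[j, j::m] = 1.0                # sum_i pi[i, j] <= nu[j]
primal = linprog(cost, A_ub=A_ub, b_ub=nu, A_eq=A_eq, b_eq=mu,
                 bounds=(0, None)).fun

best = 0.0                             # sup over A, starting from A = {}
for k in range(1, n + 1):
    for S in itertools.combinations(range(n), k):
        S = list(S)
        in_exp = np.array([np.min(np.abs(xs[S] - y)) <= 2 * eps for y in ys])
        best = max(best, mu[S].sum() - nu[in_exp].sum())

print(f"T_c_eps = {primal:.4f},  sup_A [mu(A) - nu(A^{{2eps]}})] = {best:.4f}")
# Both sides equal 0.1 here: the nu-atom at 0.2 (mass 0.6) is the only one
# within 2*eps = 0.8 of the mu-atoms at 0.0 and 1.0 (total mass 0.7).
```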
### 6.2 Optimal Adversarial Risk for Unequal Priors

Generalized Strassen's theorem involves closed set expansions. The following lemma allows us to switch to Minkowski set expansions.

**Lemma 6.1.** Let $\mu, \nu \in M(X)$ and let $\epsilon \ge 0$. Then

$$\sup_{A \in \mathcal{B}(X)} \left[\mu(A) - \nu(A^{2\epsilon]})\right] = \sup_{A \in \mathcal{B}(X)} \left[\mu(A^{\ominus\epsilon}) - \nu(A^{\oplus\epsilon})\right].$$

Moreover, the supremum on the right-hand side of the above equality can be replaced by a supremum over closed sets.

Using the above lemma and the generalized Strassen's theorem, we show the following result on optimal adversarial risk for unequal priors, generalizing the result of [30, 2].

**Theorem 6.2.** Let $p_0, p_1 \in P(X)$ and let $\epsilon \ge 0$. Then

$$\inf_{A \in \mathcal{B}(X)} R^{\oplus}_\epsilon(\ell_{0/1}, A) = \frac{1}{T+1}\left[1 - \inf_{q \in P(X) :\ q \preceq T p_0} D_\epsilon(q, p_1)\right]. \tag{9}$$

Moreover, the infimum on the left-hand side can be replaced by an infimum over closed sets.

The proof follows by using Lemma 6.1 to convert the expression with Minkowski expansions into one with closed expansions, followed by an application of Theorem 6.1 to arrive at the final optimal transport-based expression. Theorem 6.2 extends the result of [31] in two ways: (1) the infimum is taken over all sets for which $R^{\oplus}_\epsilon(\ell_{0/1}, A)$ is well-defined, instead of being restricted to closed sets, and (2) the priors on the two labels can be unequal. We also note that for $(X, d) = (\mathbb{R}^d, \|\cdot\|)$, (9) holds with the infimum on the left-hand side taken over all $A \in \mathcal{L}(X)$.
## 7 Minimax Theorems and Nash Equilibria

In this section, we revisit the zero-sum game between the adversary and the algorithm introduced in Section 3. Recall that for $A \in \mathcal{B}(X)$ and $p_0', p_1' \in P(X)$, the payoff function is given by

$$r(A, p_0', p_1') = \frac{T}{T+1}p_0'(A) + \frac{1}{T+1}p_1'(A^c). \tag{10}$$

The max-min inequality gives us

$$\sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon}\ \inf_{A \in \mathcal{B}(X)} r(A, p_0', p_1') \ \le\ \inf_{A \in \mathcal{B}(X)}\ \sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon} r(A, p_0', p_1'). \tag{11}$$

If the inequality in (11) is an equality, we say that the game has zero duality gap and admits a value equal to either expression in (11); there is then no advantage to a player making the first move. Our minimax theorems establish such an equality. If, in addition to equality in (11), there exist $p_0^*, p_1^* \in P(X)$ that achieve the supremum on the left-hand side and $A^* \in \mathcal{B}(X)$ that achieves the infimum on the right-hand side, we say that $((p_0^*, p_1^*), A^*)$ is a pure Nash equilibrium of the game. In Section 7.1, we prove the minimax theorem and the existence of a pure Nash equilibrium in $\mathbb{R}^d$ using the theory of 2-alternating capacities [19] and the relation to adversarial risk from Section 5.2. Section 7.2 extends these results to more general Polish spaces with a midpoint property.

### 7.1 Minimax Theorem in $\mathbb{R}^d$ via 2-Alternating Capacities

The following theorem proves the minimax equality and the existence of a Nash equilibrium for the adversarial robustness game in $\mathbb{R}^d$.

**Theorem 7.1** (Minimax theorem in $\mathbb{R}^d$). Let $(X, d) = (\mathbb{R}^d, \|\cdot\|)$. Let $p_0, p_1 \in P(X)$ and let $\epsilon \ge 0$. Define $r$ as in (10). Then

$$\sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon}\ \inf_{A \in \mathcal{L}(X)} r(A, p_0', p_1') = \inf_{A \in \mathcal{L}(X)}\ \sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon} r(A, p_0', p_1'). \tag{12}$$

Moreover, there exist $p_0^*, p_1^* \in P(X)$ and $A^* \in \mathcal{L}(X)$ that achieve the supremum and the infimum on the left- and right-hand sides of the above equation.

Crucial to the proof of Theorem 7.1 is Lemma 5.2, which shows that the set functions $A \mapsto p_0(A^{\oplus\epsilon})$ and $A^c \mapsto p_1((A^c)^{\oplus\epsilon})$ are 2-alternating capacities. The same proof technique is not applicable in general Polish spaces because the map $A \mapsto \mu(A^{\oplus\epsilon})$ is not a capacity for a general $\mu \in P(X)$: $A^{\oplus\epsilon}$ is not measurable for all $A \in \mathcal{B}(X)$.

### 7.2 Minimax Theorem in Polish Spaces via Optimal Transport

We now extend the minimax theorem from $\mathbb{R}^d$ to general Polish spaces with the following property.

**Definition 5** (Midpoint property). A metric space $(X, d)$ is said to have the midpoint property if for every $x_1, x_2 \in X$, there exists $x \in X$ such that $d(x_1, x) = d(x, x_2) = d(x_1, x_2)/2$.

Any normed vector space with distance defined as $d(x, x') = \|x - x'\|$ satisfies the midpoint property. An example of a metric space without this property is the discrete metric space, where $d(x, x') = \mathbb{1}\{x \neq x'\}$. The midpoint property plays a crucial role in proving the following theorem, which shows that the $D_\epsilon$ transport cost between two distributions is the shortest total variation distance between their $\epsilon$-neighborhoods in the $W_\infty$ metric. A similar result was also presented in [11].

**Theorem 7.2** ($D_\epsilon$ as shortest $D_{TV}$ between $W_\infty$ balls). Let $(X, d)$ have the midpoint property. Let $\mu, \nu \in P(X)$ and let $\epsilon \ge 0$. Then

$$D_\epsilon(\mu, \nu) = \inf_{W_\infty(\mu, \mu') \le \epsilon,\ W_\infty(\nu, \nu') \le \epsilon} D_{TV}(\mu', \nu').$$

Moreover, the infimum over $D_{TV}$ in the above equation is attained.
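Theorem 7.2 can also be checked numerically on finite supports: discretize $X$ by a grid fine enough that approximate midpoints exist, parameterize $\mu'$ and $\nu'$ by couplings that move mass by at most $\epsilon$, and minimize a linearized total variation distance. The sketch below is ours; the measures, grid and $\epsilon$ are assumptions, and the example is chosen so that $D_\epsilon(\mu, \nu) = 0.5$ can be verified by hand from Definition 1.

```python
# A sketch (ours; the measures, grid and eps are assumptions) checking
# Theorem 7.2 numerically. mu' and nu' are parameterized by couplings pi0,
# pi1 on supp(mu) x grid and supp(nu) x grid that move mass by at most eps
# (forbidden cells are pinned to zero), and D_TV(mu', nu') =
# (1/2) * sum_k |mu'_k - nu'_k| is linearized via t_k >= +/-(mu'_k - nu'_k).
import numpy as np
from scipy.optimize import linprog

xs, mu = np.array([0.0, 1.0]), np.array([0.5, 0.5])
ys, nu = np.array([0.5, 3.0]), np.array([0.5, 0.5])
eps = 0.3
grid = np.round(np.arange(-1.0, 4.01, 0.05), 2)  # fine enough for midpoints
n, m, g = len(xs), len(ys), len(grid)
nv = n * g + m * g + g                 # variables: [pi0, pi1, t]

c = np.zeros(nv)
c[n * g + m * g:] = 0.5                # objective: (1/2) * sum_k t_k

A_eq = np.zeros((n + m, nv))           # each source atom fully transported
b_eq = np.concatenate([mu, nu])
for i in range(n):
    A_eq[i, i * g:(i + 1) * g] = 1.0
for j in range(m):
    A_eq[n + j, n * g + j * g:n * g + (j + 1) * g] = 1.0

A_ub = np.zeros((2 * g, nv))           # encodes t_k >= |mu'_k - nu'_k|
for k in range(g):
    for i in range(n):
        A_ub[k, i * g + k], A_ub[g + k, i * g + k] = 1.0, -1.0
    for j in range(m):
        col = n * g + j * g + k
        A_ub[k, col], A_ub[g + k, col] = -1.0, 1.0
    A_ub[k, n * g + m * g + k] = A_ub[g + k, n * g + m * g + k] = -1.0

bounds = ([(0, 0) if abs(xs[i] - grid[k]) > eps else (0, None)
           for i in range(n) for k in range(g)]
          + [(0, 0) if abs(ys[j] - grid[k]) > eps else (0, None)
             for j in range(m) for k in range(g)]
          + [(0, None)] * g)

res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * g), A_eq=A_eq, b_eq=b_eq,
              bounds=bounds)
print(f"min D_TV over W_inf balls = {res.fun:.3f}")   # 0.500
# By hand, D_eps(mu, nu) = 0.5: with 2*eps = 0.6, only the nu-atom at 0.5
# (mass 0.5) is within reach of the mu-atoms, so mass 0.5 must pay cost 1.
# The LP meets this by merging mass 0.5 on a near-midpoint such as 0.25.
```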
The following theorem uses Theorem 7.2 to prove the minimax equality and the existence of a Nash equilibrium for any Polish space with the midpoint property, in the case of equal priors.

**Theorem 7.3** (Minimax theorem for equal priors). Let $(X, d)$ have the midpoint property. Let $p_0, p_1 \in P(X)$ and let $\epsilon \ge 0$. Define $r$ as in (10) with $T = 1$. Then

$$\sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon}\ \inf_{A \in \mathcal{B}(X)} r(A, p_0', p_1') = \inf_{A \in \mathcal{B}(X)}\ \sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon} r(A, p_0', p_1'). \tag{13}$$

Moreover, there exist $p_0^*, p_1^* \in P(X)$ that achieve the supremum on the left-hand side.

*Proof.* For $\mu \in P(X)$, let $WB(\mu)$ denote the set of all $\mu' \in P(X)$ such that $W_\infty(\mu, \mu') \le \epsilon$. On one side,

$$\inf_{A \in \mathcal{B}(X)}\ \sup_{p_0' \in WB(p_0),\ p_1' \in WB(p_1)} r(A, p_0', p_1') = \inf_{A \in \mathcal{B}(X)} R^{\Gamma}_\epsilon(\ell_{0/1}, A) \overset{(i)}{=} \inf_{A \in \mathcal{B}(X)} R^{\oplus}_\epsilon(\ell_{0/1}, A) \overset{(ii)}{=} \frac{1}{2}\left[1 - D_\epsilon(p_0, p_1)\right].$$

On the other side,

$$\sup_{p_0' \in WB(p_0),\ p_1' \in WB(p_1)}\ \inf_{A \in \mathcal{B}(X)} r(A, p_0', p_1') \overset{(iii)}{=} \sup_{p_0' \in WB(p_0),\ p_1' \in WB(p_1)} \frac{1}{2}\left[1 - D_{TV}(p_0', p_1')\right] = \frac{1}{2}\left[1 - \inf_{p_0' \in WB(p_0),\ p_1' \in WB(p_1)} D_{TV}(p_0', p_1')\right],$$

where (i) follows from Theorem 5.1, (ii) from Theorem 6.2, and (iii) again from Theorem 6.2 with $\epsilon = 0$. The expressions on the right extremes of the two displays are equal by Theorem 7.2. The existence of $p_0^*, p_1^* \in P(X)$ also follows from Theorem 7.2. ∎

To prove the minimax theorem for unequal priors, we need the following generalization of Theorem 7.2 to finite measures of unequal mass.

**Lemma 7.1.** Let $p_0, p_1 \in P(X)$ and let $\epsilon \ge 0$. Then for $T \ge 1$,

$$\inf_{q \in P(X) :\ q \preceq T p_0} D_\epsilon(q, p_1) = \inf_{q \in P(X) :\ q \preceq T p_0}\ \inf_{W_\infty(q, q') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon} D_{TV}(q', p_1') = \inf_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon}\ \inf_{q' \in P(X) :\ q' \preceq T p_0'} D_{TV}(q', p_1'). \tag{14}$$

Now, we prove the minimax equality for unequal priors.

**Theorem 7.4** (Minimax theorem for unequal priors). Let $(X, d)$ have the midpoint property. Let $p_0, p_1 \in P(X)$ and let $\epsilon \ge 0$. For $T > 0$, define $r$ as in (10). Then

$$\sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon}\ \inf_{A \in \mathcal{B}(X)} r(A, p_0', p_1') = \inf_{A \in \mathcal{B}(X)}\ \sup_{W_\infty(p_0, p_0') \le \epsilon,\ W_\infty(p_1, p_1') \le \epsilon} r(A, p_0', p_1'). \tag{15}$$

The proof uses: (1) the characterization of the inf-sup payoff in terms of unbalanced optimal transport via Theorem 5.1; (2) Lemma 7.1; and (3) the minimax equality of Theorem 7.3 for equal priors.

**Remark 2.** Unlike Theorem 7.1, Theorems 7.3 and 7.4 do not guarantee the existence of an optimal decision region $A^*$. While Theorem 7.3 guarantees the existence of a worst-case pair of perturbed distributions $p_0^*, p_1^*$, Theorem 7.4 does not. Nevertheless, an approximate pure Nash equilibrium exists in all the cases. This is in sharp contrast with the non-existence of a Nash equilibrium proven in [29] (which considers a different notion of adversarial perturbations).

**Remark 3.** A recent work [26] shows the existence of a mixed Nash equilibrium for randomized classifiers parametrized by points in a Polish space (see also [29, 3]). Fan's minimax theorem, used in that result, is inapplicable in our setting of non-parametric, decision region-based classifiers. Instead, we applied the theory of Choquet capacities (in $\mathbb{R}^d$) and a generalized Strassen's duality theorem (in Polish spaces), which is novel to the best of our knowledge.

## 8 Discussion

We examined different notions of adversarial risk in a binary classification setting with the 0-1 loss function and laid down the conditions under which these definitions are equivalent. By verifying the conditions in Sections 4 and 5, researchers may use the different definitions interchangeably. Several definitions have also been proposed for adversarial risk under general loss functions [31, 26] using analogous constructions such as transport maps, couplings and suprema over $\epsilon$-neighborhoods. Extending our equivalence results to more general loss functions is left for future work.

Figure 1: Illustration of various equivalent formulations of the optimal adversarial risk. The equalities summarize the results of Sections 6 and 7. For equal priors ($T = 1$), (A) and (B) denote two ways of obtaining the optimal adversarial risk $R^{\oplus}_\epsilon$: (A) the $D_\epsilon$ cost between the true label distributions $p_0$ and $p_1$, and (B) the shortest total variation distance between $\infty$-Wasserstein balls of radius $\epsilon$ around $p_0$ and $p_1$. For unequal priors ($T > 1$), (C), (D) and (E) denote three equivalent ways of obtaining $R^{\oplus}_\epsilon$. The black dotted balls denote $\infty$-Wasserstein balls and the blue dashed balls denote sets defined using stochastic domination. The order in which the two types of balls appear around $p_0$ is reversed between (D) and (E).

We analyzed optimal adversarial risk for (non-parametric) decision region-based classifiers. Using a formulation of optimal transport between finite measures of unequal mass, we extended the optimal transport-based characterization of adversarial risk of [30, 2] to unequal priors by generalizing Strassen's theorem. This may find applications in the study of excess-cost optimal transport [45, 44]. A recent work [39] obtains a different characterization of optimal adversarial risk using optimal transport on the product space $X \times Y$, where $Y$ is the label space. Further, they show the evolution of the optimal classifier $A^*$ as $\epsilon$ grows in terms of a mean curvature flow. This raises an interesting question about the evolution of the optimal adversarial distributions $p_0^*, p_1^* \in P(X)$ with $\epsilon$. We proved a minimax theorem for the adversarial robustness game and the existence of a Nash equilibrium.
We constructed the worst-case pair of distributions $p_0^*, p_1^* \in P(X)$ in terms of the true data distributions and showed that their total variation distance gives the optimal adversarial risk. Identifying worst-case distributions could lead to a new approach to developing robust algorithms. We used Choquet capacities for our results in $\mathbb{R}^d$ and measurable selections in Polish spaces. Specifically, we showed that the measure of the $\epsilon$-Minkowski expansion is a 2-alternating capacity. This connection could help generalize our results to total variation and Prokhorov distance based contaminations.

**Limitations.** We largely focused on the binary classification setup with the 0-1 loss function. While it may be possible to extend our results on measurability and the relation to $\infty$-Wasserstein distributional robustness to more general loss functions and a multi-class setup, it is unclear how our results on the generalized Strassen's theorem and Nash equilibria can be extended further. Our results on the various equivalent formulations of optimal adversarial risk are specific to adversarial perturbations (or equivalently, $\infty$-Wasserstein distributional perturbations), and we did not investigate more general perturbation models.

## References

[1] D. P. Bertsekas and S. E. Shreve. Stochastic optimal control: the discrete-time case, volume 5. Athena Scientific, 1996.

[2] A. N. Bhagoji, D. Cullina, and P. Mittal. Lower bounds on adversarial robustness from optimal transport. In Advances in Neural Information Processing Systems, pages 7496–7508, 2019.

[3] A. J. Bose, G. Gidel, H. Berard, A. Cianflone, P. Vincent, S. Lacoste-Julien, and W. L. Hamilton. Adversarial example games. Advances in Neural Information Processing Systems, 2020.

[4] S. R. Bulò, B. Biggio, I. Pillai, M. Pelillo, and F. Roli. Randomized prediction games for adversarial machine learning. IEEE Transactions on Neural Networks and Learning Systems, 28(11):2466–2478, 2016.

[5] N. Carlini and D. Wagner. Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy, pages 39–57. IEEE, 2017.

[6] I. Chalkidis and D. Kampas. Deep learning in law: Early adaptation and legal word embeddings trained on large corpora. Artificial Intelligence and Law, 27(2):171–198, 2019.

[7] G. Choquet. Theory of capacities. In Annales de l'institut Fourier, volume 5, pages 131–295, 1954.

[8] J. M. Cohen, E. Rosenfeld, and J. Z. Kolter. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, 2019.

[9] D. I. Diochnos, S. Mahloujifar, and M. Mahmoody. Adversarial risk and robustness: General definitions and implications for the uniform distribution. Advances in Neural Information Processing Systems, 31:10359–10368, 2018.

[10] E. Dohmatob. Generalized no free lunch theorem for adversarial robustness. In International Conference on Machine Learning, pages 1646–1654. PMLR, 2019.

[11] E. Dohmatob. Universal lower-bounds on classification error under adversarial attacks and random corruption. arXiv preprint arXiv:2006.09989, 2020.

[12] R. M. Dudley. Distances of probability measures and random variables. In Selected Works of R. M. Dudley, pages 28–37. Springer, 2010.

[13] P. Erdős and A. Stone. On the sum of two Borel sets. Proceedings of the American Mathematical Society, 25(2):304–306, 1970.

[14] R. Gardner. The Brunn-Minkowski inequality. Bulletin of the American Mathematical Society, 39(3):355–405, 2002.

[15] I. J. Goodfellow, J. Shlens, and C. Szegedy. Explaining and harnessing adversarial examples. International Conference on Learning Representations, 2015.
[16] P. Gourdeau, V. Kanade, M. Kwiatkowska, and J. Worrell. On the hardness of robust classification. In Advances in Neural Information Processing Systems, volume 32, 2019.

[17] H. Greenspan, B. van Ginneken, and R. M. Summers. Guest editorial: Deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Transactions on Medical Imaging, 35(5):1153–1159, 2016.

[18] S. Grigorescu, B. Trasnea, T. Cocias, and G. Macesanu. A survey of deep learning techniques for autonomous driving. Journal of Field Robotics, 37(3):362–386, 2020.

[19] P. J. Huber and V. Strassen. Minimax tests and the Neyman-Pearson lemma for capacities. The Annals of Statistics, pages 251–263, 1973.

[20] A. Jalal, A. Ilyas, C. Daskalakis, and A. G. Dimakis. The robust manifold defense: Adversarial training using generative models. arXiv preprint arXiv:1712.09196, 2017.

[21] R. S. S. Kumar, D. R. O'Brien, K. Albert, and S. Viljoen. Law and adversarial machine learning. NeurIPS Workshop on Security in Machine Learning, 2018.

[22] Z. Liu, J. Zhang, V. Jog, P. Loh, and A. B. McMillan. Robustifying deep networks for image segmentation. arXiv preprint arXiv:1908.00656, 2019.

[23] X. Ma, Y. Niu, L. Gu, Y. Wang, Y. Zhao, J. Bailey, and F. Lu. Understanding adversarial attacks on deep learning based medical image analysis systems. Pattern Recognition, 110:107332, 2021.

[24] A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu. Towards deep learning models resistant to adversarial attacks. International Conference on Learning Representations, 2018.

[25] S. Mahloujifar, D. I. Diochnos, and M. Mahmoody. The curse of concentration in robust learning: Evasion and poisoning attacks from concentration of measure. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4536–4543, 2019.

[26] L. Meunier, M. Scetbon, R. Pinot, J. Atif, and Y. Chevaleyre. Mixed Nash equilibria in the adversarial examples game. International Conference on Machine Learning, 2021.

[27] K. Muhammad, A. Ullah, J. Lloret, J. Del Ser, and V. H. C. de Albuquerque. Deep learning for safe autonomous driving: Current challenges and future directions. IEEE Transactions on Intelligent Transportation Systems, 2020.

[28] N. Papernot, P. McDaniel, X. Wu, S. Jha, and A. Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy, pages 582–597. IEEE, 2016.

[29] R. Pinot, R. Ettedgui, G. Rizk, Y. Chevaleyre, and J. Atif. Randomization matters: How to defend against strong adversarial attacks. In International Conference on Machine Learning, pages 7717–7727. PMLR, 2020.

[30] M. S. Pydi and V. Jog. Adversarial risk via optimal transport and optimal couplings. In International Conference on Machine Learning, pages 7814–7823. PMLR, 2020.

[31] M. S. Pydi and V. Jog. Adversarial risk via optimal transport and optimal couplings. IEEE Transactions on Information Theory, 67(9):6031–6052, 2021.

[32] G. Schay. Nearest random variables with given distributions. The Annals of Probability, pages 163–166, 1974.

[33] U. Shaham, Y. Yamada, and S. Negahban. Understanding adversarial training: Increasing local stability of supervised models through robust optimization. Neurocomputing, 307:195–204, 2018.

[34] W. Sierpiński. Sur la question de la mesurabilité de la base de M. Hamel. Fundamenta Mathematicae, 1(1):105–111, 1920.

[35] A. Sinha, H. Namkoong, and J. C. Duchi. Certifying some distributional robustness with principled adversarial training. In International Conference on Learning Representations, 2017.
[36] M. Staib and S. Jegelka. Distributionally robust deep learning as a generalization of adversarial training. In NeurIPS Workshop on Machine Learning and Computer Security, 2017.

[37] V. Strassen. The existence of probability measures with given marginals. The Annals of Mathematical Statistics, 36(2):423–439, 1965.

[38] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus. Intriguing properties of neural networks. International Conference on Learning Representations, 2014.

[39] N. G. Trillos and R. Murray. Adversarial classification: Necessary conditions and geometric flows. arXiv preprint arXiv:2011.10797, 2020.

[40] Z. Tu, J. Zhang, and D. Tao. Theoretical analysis of adversarial learning: A minimax approach. In Advances in Neural Information Processing Systems, volume 32, 2019.

[41] J. Uesato, B. O'Donoghue, P. Kohli, and A. van den Oord. Adversarial risk and the dangers of evaluating against weak attacks. In International Conference on Machine Learning, pages 5025–5034. PMLR, 2018.

[42] C. Villani. Topics in Optimal Transportation. American Mathematical Society, 2003.

[43] D. H. Wagner. Survey of measurable selection theorems. SIAM Journal on Control and Optimization, 15(5):859–903, 1977.

[44] L. Yu. Asymptotics for Strassen's optimal transport problem. arXiv preprint arXiv:1912.02051, 2019.

[45] L. Yu and V. Tan. Asymptotic coupling and its applications in information theory. IEEE Transactions on Information Theory, 65(3):1321–1344, 2018.

[46] M. C. Yue, D. Kuhn, and W. Wiesemann. On linear optimization over Wasserstein balls. Mathematical Programming, pages 1–16, 2021.

[47] L. Zajíček. On σ-porous sets in abstract spaces. In Abstract and Applied Analysis, volume 2005, pages 509–534. Hindawi, 2005.