# Calibration and Consistency of Adversarial Surrogate Losses

Pranjal Awasthi, Google Research, New York, NY 10011. pranjalawasthi@google.com
Natalie S. Frank, Courant Institute, New York, NY 10012. nf1066@nyu.edu
Anqi Mao, Courant Institute, New York, NY 10012. aqmao@cims.nyu.edu
Mehryar Mohri, Google Research & Courant Institute, New York, NY 10011. mohri@google.com
Yutao Zhong, Courant Institute, New York, NY 10012. yutao@cims.nyu.edu

Abstract

Adversarial robustness is an increasingly critical property of classifiers in applications. The design of robust algorithms relies on surrogate losses since the optimization of the adversarial loss with most hypothesis sets is NP-hard. But, which surrogate losses should be used and when do they benefit from theoretical guarantees? We present an extensive study of this question, including a detailed analysis of the H-calibration and H-consistency of adversarial surrogate losses. We show that convex loss functions, or the supremum-based convex losses often used in applications, are not H-calibrated for common hypothesis sets used in machine learning. We then give a characterization of H-calibration and prove that some surrogate losses are indeed H-calibrated for the adversarial zero-one loss, with common hypothesis sets. In particular, we fix some calibration results presented in prior work for a family of linear models and significantly generalize the results to nonlinear hypothesis sets. Next, we show that H-calibration is not sufficient to guarantee consistency and prove that, in the absence of any distributional assumption, no continuous surrogate loss is consistent in the adversarial setting. This, in particular, proves that a claim made in prior work is inaccurate. Next, we identify natural conditions under which some surrogate losses that we describe in detail are H-consistent. We also report a series of empirical results which show that many H-calibrated surrogate losses are indeed not H-consistent, and validate our theoretical assumptions. Our adversarial H-consistency results are novel, even for the case where H is the family of all measurable functions.

1 Introduction

Complex multi-layer neural networks trained on large datasets have achieved remarkable performance in several applications in recent years, in particular in speech and visual recognition tasks (Sutskever et al., 2014; Krizhevsky et al., 2012). However, these rich models are susceptible to imperceptible perturbations (Szegedy et al., 2013). A complex neural network may, for example, misclassify a traffic sign as a result of a minor variation, which may be the presence of a small advertisement sticker on the sign. Such misclassifications can have dramatic consequences in practice, for example with self-driving cars. These concerns have motivated the study of adversarial robustness, that is, the design of classifiers that are robust to small $\ell_p$ norm input perturbations (Goodfellow et al., 2014; Madry et al., 2017; Tsipras et al., 2018; Carlini and Wagner, 2017). The standard 0/1 loss is then replaced with a more stringent adversarial loss, which requires a predictor to correctly classify an input point x and also to maintain the same classification for all points at a small $\ell_p$ distance of x. But, can we devise efficient learning algorithms with theoretical guarantees for the adversarial loss?
Designing such robust algorithms requires resorting to appropriate surrogate losses, since optimizing the adversarial loss is NP-hard for most hypothesis sets. A key property for surrogate adversarial losses is their consistency, that is, that exact or near-optimal minimizers of the surrogate loss also be exact or near-optimal minimizers of the original adversarial loss. The notion of consistency has been extensively studied in the case of the standard 0/1 loss or the multi-class setting (Zhang, 2004; Bartlett et al., 2006; Tewari and Bartlett, 2007; Steinwart, 2007). However, those results or proof techniques cannot be used to establish or characterize consistency in adversarial settings. This is because the adversarial loss of a predictor f at point x is inherently not just a function of f(x) but also of its values in a neighborhood of x. As we shall see, the study of consistency is significantly more complex in the adversarial setting, with subtleties that have in fact led to some inaccurate claims made in prior work that we discuss later.

Consistency requires a property of the surrogate and the original losses to hold true for the family of all measurable functions. As argued by Long and Servedio (2013), the notion of H-consistency, which requires a similar property for the surrogate and original losses, but with the near-optimal or optimal minimizers considered on the restricted hypothesis set H, is a more relevant and desirable property for learning. Long and Servedio (2013) gave examples of surrogate losses that are not H-consistent when H is the class of all measurable functions but that satisfy a realizable H-consistency condition when H is the class of linear functions. More recently, Zhang and Agarwal (2020) studied the notion of improper realizable H-consistency of linear classes, where the surrogate φ can be optimized over a larger class, such as that of piecewise linear functions. Note that these studies only deal with the standard 0/1 classification loss. This motivates our main objective: an extensive study of the H-consistency of adversarial surrogate losses, which is critical to the design of robust algorithms with guarantees in this setting.

A more convenient notion in the study of H-consistency is that of H-calibration, a related notion that involves conditioning on the input point. H-calibration is often a sufficient condition for H-consistency in standard classification settings (Steinwart, 2007). However, the adversarial loss presents new challenges and requires carefully distinguishing among these notions to avoid drawing false conclusions. As an example, the recent COLT 2020 paper of Bao et al. (2020) presents a study of H-calibration for the adversarial loss in the special case where H is the class of linear functions. However, several comments are due regarding that work. See a detailed discussion in Appendix B.

Our Contributions. We present a more systematic study of the H-calibration and H-consistency of adversarial surrogate losses, including for the case where H = Hall. In Section 4, we give a detailed analysis of the H-calibration properties of several natural surrogate losses. We present a series of new negative results showing that, under some general assumptions, convex loss functions and supremum-based convex losses, that is, loss functions defined as the supremum of a convex function over a ball, which are the ones commonly used in applications, are not H-calibrated for common hypothesis sets used in machine learning.
Next, we give a characterization of calibration and prove that a family of proposed surrogates are H-calibrated, with common hypothesis sets. These fix previous calibration results presented for the family of linear models in (Bao et al., 2020) and significantly generalize the results to nonlinear hypothesis sets. In Section 5, we study the H-consistency of surrogate loss functions. We prove that, in the absence of distributional assumptions, many surrogate losses shown to be H-calibrated in Section 4 are in fact not H-consistent. This, in particular, proves that a claim presented in a COLT 2020 publication is inaccurate. Next, in contrast, we show that when the minimum of the surrogate loss is achieved within H, under some general conditions, the ρ-margin ramp loss (see, for example, (Mohri et al., 2018)) is H-consistent for H being the linear hypothesis set, or any non-decreasing and continuous g-based hypothesis set, including the ReLU-based hypothesis set. We then give similar H-consistency guarantees for supremum-based surrogate losses based on a non-increasing auxiliary function, including the calibrated supremum-based ρ-margin ramp loss when H is any symmetric hypothesis set, e.g., multi-layer neural networks. In Section 6, we further report a series of empirical results on simulated data, which show that many H-calibrated surrogate losses are indeed not H-consistent, and justify our conditions for consistency.

Overall, our results imply that the loss functions commonly used in practice for optimizing the adversarial loss are not H-consistent and that minimizing such losses may not lead to a more favorable adversarial loss. This could in fact be the reason why the empirical results reported in the literature have not been favorable. Instead, we suggest alternative surrogate losses that we prove are H-consistent and that can be useful to the design of effective algorithms. We give a detailed discussion of related work in Appendix A. We start with basic concepts of calibration and consistency (Section 2) and an introduction to robust classification (Section 3).

2 Preliminaries

We will denote vectors as lowercase bold letters (e.g., x). The d-dimensional $\ell_2$-ball with radius r is denoted by $B_2^d(r) = \{z \in \mathbb{R}^d : \|z\|_2 \le r\}$. We denote by X the set of all possible examples. X is also sometimes referred to as the input space. The set of all possible labels is denoted by Y. We will limit ourselves to the case of binary classification where Y = {−1, +1}. Let H be a family of functions from $\mathbb{R}^d$ to $\mathbb{R}$. Given a fixed but unknown distribution P over X × Y, the binary classification learning problem is then formulated as follows. The learner seeks to select a predictor f ∈ H with small generalization error with respect to the distribution P. The generalization error of a classifier f ∈ H is defined by $R_{\ell_0}(f) = \mathbb{E}_{(x,y)\sim P}[\ell_0(f,x,y)]$, where $\ell_0(f,x,y) = \mathbb{1}_{yf(x)\le 0}$ is the standard 0/1 loss. More generally, the ℓ-risk of a classifier f for a surrogate loss ℓ(f,x,y) is defined by

$$R_\ell(f) = \mathbb{E}_{(x,y)\sim P}[\ell(f,x,y)]. \quad (1)$$

Moreover, the minimal (ℓ, H)-risk, which is also called the Bayes (ℓ, H)-risk, is defined by $R^*_{\ell,\mathcal{H}} = \inf_{f\in\mathcal{H}} R_\ell(f)$. In the standard classification setting, the goal of a consistency analysis is to determine whether the minimization of a surrogate loss ℓ can lead to that of the binary loss generalization error.
Similarly, in adversarially robust classification, the goal of a consistency analysis is to determine if the minimization of a surrogate loss ℓ yields that of the adversarial generalization error defined by $R_{\ell_\gamma}(f) = \mathbb{E}_{(x,y)\sim P}[\ell_\gamma(f,x,y)]$, where

$$\ell_\gamma(f,x,y) = \sup_{x' : \|x - x'\| \le \gamma} \mathbb{1}_{yf(x')\le 0} \quad (2)$$

is the adversarial 0/1 loss. This motivates the definition of H-consistency.

Definition 1 (H-Consistency). Given a hypothesis set H, we say that a loss function ℓ1 is H-consistent with respect to a loss function ℓ2 if the following holds:

$$R_{\ell_1}(f_n) - R^*_{\ell_1,\mathcal{H}} \xrightarrow[n\to+\infty]{} 0 \implies R_{\ell_2}(f_n) - R^*_{\ell_2,\mathcal{H}} \xrightarrow[n\to+\infty]{} 0, \quad (3)$$

for all probability distributions and all sequences $\{f_n\}_{n\in\mathbb{N}} \subseteq \mathcal{H}$.

For a distribution P over X × Y with random variables X and Y, let $\eta_P\colon \mathcal{X} \to [0,1]$ be a measurable function such that, for any x ∈ X, $\eta_P(x) = \mathbb{P}(Y = +1 \mid X = x)$. By the property of conditional expectation, we can rewrite (1) as $R_\ell(f) = \mathbb{E}_X[\mathcal{C}_\ell(f,x,\eta_P(x))]$, where $\mathcal{C}_\ell(f,x,\eta)$ is the inner ℓ-risk defined as follows:

$$\forall x \in \mathcal{X},\ \forall \eta \in [0,1],\quad \mathcal{C}_\ell(f,x,\eta) = \eta\,\ell(f,x,+1) + (1-\eta)\,\ell(f,x,-1). \quad (4)$$

Moreover, the minimal inner ℓ-risk on H is denoted by $\mathcal{C}^*_{\ell,\mathcal{H}}(x,\eta) = \inf_{f\in\mathcal{H}} \mathcal{C}_\ell(f,x,\eta)$. For a margin-based loss φ, the generic conditional φ-risk is $\mathcal{C}_\varphi(t,\eta) = \eta\,\varphi(t) + (1-\eta)\,\varphi(-t)$ for any η ∈ [0,1] and t ∈ ℝ (Bartlett et al., 2006). The notion of calibration for the inner risk is often a powerful tool for the analysis of H-consistency (Steinwart, 2007).

Definition 2 (H-Calibration). [Definition 2.7 in (Steinwart, 2007)] Given a hypothesis set H, we say that a loss function ℓ1 is H-calibrated with respect to a loss function ℓ2 if, for any ε > 0, η ∈ [0,1], and x ∈ X, there exists δ > 0 such that for all f ∈ H we have

$$\mathcal{C}_{\ell_1}(f,x,\eta) < \mathcal{C}^*_{\ell_1,\mathcal{H}}(x,\eta) + \delta \implies \mathcal{C}_{\ell_2}(f,x,\eta) < \mathcal{C}^*_{\ell_2,\mathcal{H}}(x,\eta) + \epsilon. \quad (5)$$

Steinwart (2007) points out that if ℓ1 is H-calibrated with respect to ℓ2, then H-consistency, that is condition (3), holds for any probability distribution verifying the additional condition of minimizability (Steinwart, 2007, Definition 2.4). Next, we introduce the notion of calibration function from (Steinwart, 2007).

Definition 3 (Calibration function). Given a hypothesis set H, we define the calibration function δmax for a pair of losses (ℓ1, ℓ2) as follows: for all x ∈ X, η ∈ [0,1] and ε > 0,

$$\delta_{\max}(\epsilon,x,\eta) = \inf_{f\in\mathcal{H}} \big\{\mathcal{C}_{\ell_1}(f,x,\eta) - \mathcal{C}^*_{\ell_1,\mathcal{H}}(x,\eta) : \mathcal{C}_{\ell_2}(f,x,\eta) - \mathcal{C}^*_{\ell_2,\mathcal{H}}(x,\eta) \ge \epsilon\big\}. \quad (6)$$

Figure 1: Illustration of H-calibration and H-consistency. Left: H-calibration, for any x ∈ X, minimization of Cℓ1(f,x,η) can lead to that of Cℓ2(f,x,η). Right: H-consistency, minimization of Rℓ1(f) can lead to that of Rℓ2(f). H-consistency reduces to H-calibration when the support of the underlying distribution P is the single-point set {x} ⊆ X; under the minimizability condition, H-calibration implies H-consistency.

For any x ∈ X, η ∈ [0,1] and ε > 0, the calibration function gives the maximal δ satisfying the calibration condition (5). The following proposition is an important result from (Steinwart, 2007).

Proposition 4 (Lemma 2.9 in (Steinwart, 2007)). Given a hypothesis set H, a loss ℓ1 is H-calibrated with respect to ℓ2 if and only if its calibration function δmax satisfies δmax(ε, x, η) > 0 for all x ∈ X, η ∈ [0,1] and ε > 0.
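To make Definition 3 and Proposition 4 more concrete, here is a small numerical sketch, not taken from the paper, that estimates the calibration function δmax(ε, x, η) by brute force for the pair (hinge loss, 0/1 loss) over a finite grid standing in for Hlin; the losses, the grid resolution, and the test point are all our own illustrative choices.

```python
import numpy as np

# Illustrative numerical estimate of the calibration function delta_max (Definition 3)
# for the pair (hinge loss, 0/1 loss) over a finite grid standing in for H_lin.
# The losses, the grid, and the test point below are our own illustrative choices.

def hinge(f, x, y):
    return max(0.0, 1.0 - y * f(x))

def zero_one(f, x, y):
    return float(y * f(x) <= 0.0)

def inner_risk(loss, f, x, eta):
    # Inner risk (4): eta * loss at y=+1 plus (1 - eta) * loss at y=-1.
    return eta * loss(f, x, +1) + (1.0 - eta) * loss(f, x, -1)

def delta_max(loss1, loss2, x, eta, eps, hypotheses):
    # Estimate of (6): inf over f of the loss1-gap, subject to a loss2-gap >= eps.
    c1 = np.array([inner_risk(loss1, f, x, eta) for f in hypotheses])
    c2 = np.array([inner_risk(loss2, f, x, eta) for f in hypotheses])
    gaps1, gaps2 = c1 - c1.min(), c2 - c2.min()
    feasible = gaps1[gaps2 >= eps]
    return feasible.min() if feasible.size else np.inf  # inf: constraint infeasible

# Finite stand-in for H_lin = {x -> w.x : ||w||_2 = 1} in R^2.
angles = np.linspace(0.0, 2 * np.pi, 720, endpoint=False)
H = [(lambda x, w=np.array([np.cos(t), np.sin(t)]): float(w @ x)) for t in angles]

x0 = np.array([0.6, 0.4])
print(delta_max(hinge, zero_one, x0, eta=0.7, eps=0.2, hypotheses=H))
```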
Since the concepts of calibration and consistency may not be familiar to readers without an extensive background in this area, we further comment on these notions before presenting our main results. Informally, a loss function is H-consistent if minimizing it results in a classifier whose generalization error is close to the minimal generalization error within H. Similarly, a loss function is H-calibrated if minimizing it results in a classifier whose inner ℓ2-risk is close to the minimal inner ℓ2-risk within H, for each x ∈ X. H-calibration (5) is a necessary condition for H-consistency (3), but it is not always sufficient. As an example, we show in Section 5.1 that the H-calibrated surrogate losses proposed in (Bao et al., 2020) are not H-consistent. For this reason, H-consistency, that is consistency for a particular hypothesis set H, is a difficult problem even in standard non-adversarial scenarios. When H is the family of all measurable functions, the notions of calibration and consistency with respect to the 0/1 loss have been widely studied in the literature to analyze the properties of margin-based losses (Zhang, 2004; Bartlett et al., 2006). In this special case, calibration implies consistency. Steinwart (2007) further establishes a sufficient condition called minimizability under which H-calibration (5) implies H-consistency (3). Note that the minimizability condition holds in (Zhang, 2004; Bartlett et al., 2006). However, it does not hold in general in the adversarial scenario and thus analyzing H-consistency becomes much harder. To the best of our knowledge, our work is the first to prove H-consistency results for general hypothesis sets H, including for the case where H = Hall, in the context of adversarial classification. We conclude this section with an illustration of the connection between the notions of calibration and consistency in Figure 1.

3 Adversarially Robust Classification

In adversarially robust classification, the loss at (x,y) is measured in terms of the worst loss incurred over an adversarial perturbation of x within a ball of a certain radius in a norm. For simplicity, we will consider perturbations in the $\ell_2$ norm $\|\cdot\|_2$.¹ We will denote by γ the maximum magnitude of the allowed perturbations. Given γ > 0, a data point (x,y), a function f ∈ H, and a margin-based loss φ: ℝ → ℝ₊, we define the adversarial loss of f at (x,y) as

$$\widetilde{\varphi}(f,x,y) = \sup_{x' : \|x - x'\| \le \gamma} \varphi\big(yf(x')\big). \quad (7)$$

The above naturally motivates supremum-based surrogate losses that are commonly used to optimize the adversarial 0/1 loss (Goodfellow et al., 2014; Madry et al., 2017; Shafahi et al., 2019; Wong et al., 2020). We say that a surrogate loss φ̃(f,x,y) is supremum-based if it is of the form defined in (7). We say that the supremum-based surrogate is convex if the function φ in (7) is convex. When φ is non-increasing, the following equality holds (Yin et al., 2019):

$$\sup_{x' : \|x - x'\| \le \gamma} \varphi\big(yf(x')\big) = \varphi\Big(\inf_{x' : \|x - x'\| \le \gamma} yf(x')\Big). \quad (8)$$

The adversarial 0/1 loss defined in (2) is a special case of (7), where φ is the 0/1 loss, that is, $\varphi(yf(x)) = \ell_0(f,x,y) = \mathbb{1}_{yf(x)\le 0}$. Therefore, the adversarial 0/1 loss has the equivalent form

$$\ell_\gamma(f,x,y) = \sup_{x' : \|x - x'\| \le \gamma} \mathbb{1}_{yf(x')\le 0} = \mathbb{1}_{\inf_{x':\|x - x'\|\le\gamma} yf(x') \le 0}. \quad (9)$$

This alternative equivalent form of the adversarial 0/1 loss is more advantageous to analyze than (2) and will be adopted in our proofs. Without loss of generality, let $\mathcal{X} = B_2^d(1)$ and γ ∈ (0,1).

¹Our analysis in the paper can be extended directly to other perturbations such as the $\ell_1$ ball or $\ell_\infty$ ball, and in fact to any $\ell_p$ norm for p ∈ [1, ∞]. In particular, the proofs of our calibration and consistency results for general hypothesis sets (e.g., Theorem 6, Theorem 7, Theorem 10, Theorem 16, Theorem 20, Theorem 23, Theorem 24) do not require the norm to be $\ell_2$ and work for other norms too.
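For a linear model f(x) = w·x, the inner infimum in (8) and (9) admits the closed form inf over the ball of y w·x′ = y w·x − γ‖w‖₂, which makes the adversarial 0/1 loss and supremum-based surrogates easy to evaluate. The sketch below is our own illustration of this reduction, not code from the paper; it sanity-checks (8) for the hinge loss against a Monte Carlo search over the perturbation sphere, and the vectors and sample sizes are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.25

def adv_margin_linear(w, x, y, gamma=gamma):
    # inf_{||x'-x||_2 <= gamma} y * w.x' = y * w.x - gamma * ||w||_2
    # (the worst-case perturbation moves x by gamma along -y * w / ||w||_2).
    return y * float(w @ x) - gamma * np.linalg.norm(w)

def adv_zero_one(w, x, y):
    # Adversarial 0/1 loss, equation (9).
    return float(adv_margin_linear(w, x, y) <= 0.0)

def sup_phi(phi, w, x, y):
    # Supremum-based surrogate (7)-(8) for a non-increasing margin-based loss phi.
    return phi(adv_margin_linear(w, x, y))

hinge = lambda t: max(0.0, 1.0 - t)

# Monte Carlo sanity check of (8) on one example: sample perturbations on the
# gamma-sphere and take the worst hinge value; it approximates the closed form from below.
w = np.array([0.6, 0.8]); x = np.array([0.3, -0.2]); y = +1
deltas = rng.normal(size=(20000, 2))
deltas = gamma * deltas / np.linalg.norm(deltas, axis=1, keepdims=True)
mc_sup = max(hinge(y * float(w @ (x + d))) for d in deltas)
print(sup_phi(hinge, w, x, y), mc_sup)
```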
In this paper, we aim to characterize surrogate losses ℓ1 satisfying H-consistency (3) and H-calibration (5) with ℓ2 = ℓγ, for hypothesis sets H that are regular for adversarial calibration.

Definition 5 (Regularity for Adversarial Calibration). We say that a hypothesis set H is regular for adversarial calibration if there exists a distinguishing x in X, that is, if there exist f, g ∈ H such that $\inf_{\|x'-x\|\le\gamma} f(x') > 0$ and $\sup_{\|x'-x\|\le\gamma} g(x') < 0$.

When studying H-calibration of surrogate losses, it suffices to study sets H that are regular for adversarial calibration, not only because all common hypothesis sets admit the property, but also because of the following result. (See Appendix E.1 for the proof.) We say that a hypothesis set H is symmetric if, for any f ∈ H, −f is also in H.

Theorem 6. Let H be a symmetric hypothesis set. If H is not regular for adversarial calibration, then any surrogate loss ℓ is H-calibrated with respect to ℓγ.

Moreover, we specifically study the following hypothesis sets that are regular for adversarial calibration:

- linear models: $\mathcal{H}_{\mathrm{lin}} = \{x \mapsto w\cdot x : \|w\| = 1\}$, as in (Bao et al., 2020);
- generalized linear models: $\mathcal{H}_g = \{x \mapsto g(w\cdot x) + b : \|w\| = 1,\ |b| \le G\}$, where g is a non-decreasing function;
- the family of all measurable functions: Hall;
- multi-layer neural networks: $\mathcal{H}_{\mathrm{NN}} = \{x \mapsto u\cdot \rho_n(W_n(\cdots \rho_2(W_2\,\rho_1(W_1 x + b_1) + b_2)\cdots) + b_n) : \|u\|_1 \le \Lambda,\ \|W_j\| \le W,\ \|b_j\|_1 \le B\}$, where $\rho_j$ is an activation function.

In the special case of $g = (\cdot)_+ = \max(\cdot, 0)$, we denote the corresponding ReLU-based hypothesis set by $\mathcal{H}_{\mathrm{relu}} = \{x \mapsto (w\cdot x)_+ + b : \|w\| = 1,\ |b| \le G\}$.
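As a quick illustration of Definition 5 (our own check, not part of the paper), note that for Hlin any x with ‖x‖₂ > γ is a distinguishing point: taking w = x/‖x‖₂ gives an infimum of w·x′ over the γ-ball equal to ‖x‖₂ − γ > 0, and since Hlin is symmetric, −w witnesses the second condition.

```python
import numpy as np

# Quick check (illustrative) that H_lin is regular for adversarial calibration
# (Definition 5): any x with ||x||_2 > gamma is a distinguishing point.
gamma = 0.3
x = np.array([0.5, 0.5])                 # ||x||_2 ~ 0.707 > gamma
w = x / np.linalg.norm(x)                # unit vector, so f(x') = w.x' is in H_lin

inf_f = float(w @ x) - gamma             # inf of f over the gamma-ball around x
sup_neg_f = -float(w @ x) + gamma        # sup of -f over the same ball
print(inf_f > 0, sup_neg_f < 0)          # both True: x is distinguishing
```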
4 H-Calibration

Calibration is a condition that often guarantees consistency and is a first step in analyzing surrogate losses. Thus, in this section, we first present a detailed study of the calibration properties of several loss functions. We first give a series of negative results showing that, under general assumptions, convex losses and supremum-based convex losses, which are typically used in practice for adversarial robustness, are not calibrated. We then complement these results with positive ones by identifying a family of losses that are indeed calibrated under certain general conditions.

4.1 Negative results: convex losses

We first study convex losses, which are often used for standard binary classification problems.

Theorem 7. Assume that H is such that there exists a distinguishing x0 ∈ X and f0 ∈ H such that f0(x0) = 0. If a margin-based loss φ: ℝ → ℝ₊ is convex, then it is not H-calibrated with respect to ℓγ. In particular, the assumption holds when H is regular for adversarial calibration and contains 0.

By Theorem 7, we obtain the following corollary, which fixes the main negative result of Bao et al. (2020) and generalizes the result to nonlinear hypothesis sets. Note that Hlin, HNN and Hall all satisfy the assumption that there exists a distinguishing x0 ∈ X and f0 ∈ H such that f0(x0) = 0. When g(−γ) + G > 0 and g(γ) − G < 0, Hg also satisfies this assumption. Verifying this condition on Hg is straightforward for G sufficiently large.

Corollary 8. If a margin-based loss φ: ℝ → ℝ₊ is convex, then φ is not H-calibrated with respect to ℓγ, for H = Hlin, Hg with a non-decreasing and continuous function g such that g(−γ) + G > 0 and g(γ) − G < 0, Hrelu with G > γ, HNN, and Hall.

While convex surrogates are natural for the 0/1 loss, the current practice in designing practical algorithms for the adversarial loss involves using convex supremum-based surrogates (Madry et al., 2017; Wong et al., 2020; Shafahi et al., 2019). We next investigate such losses.

4.2 Negative results: supremum-based convex losses

We study losses of the type $\widetilde{\varphi}(f,x,y) = \sup_{x':\|x-x'\|\le\gamma}\varphi(yf(x'))$, with φ convex, which are often used in practice as surrogates for the adversarial 0/1 loss. The following theorems present negative results for supremum-based convex surrogate losses for the common hypothesis sets H.

Theorem 9. Let φ be a convex and non-increasing margin-based loss. Consider the surrogate loss defined by $\widetilde{\varphi}(f,x,y) = \sup_{x':\|x-x'\|\le\gamma}\varphi(yf(x'))$. Then φ̃ is not H-calibrated with respect to ℓγ, for H = Hlin, Hg with a non-decreasing and continuous function g such that g(−γ) + G > 0 and g(γ) − G < 0, and Hrelu with G > γ.

Theorem 10. Let H be a hypothesis set containing 0 that is regular for adversarial calibration. If a margin-based loss φ is convex and non-increasing, then the surrogate loss defined by $\widetilde{\varphi}(f,x,y) = \sup_{x':\|x-x'\|\le\gamma}\varphi(yf(x'))$ is not H-calibrated with respect to ℓγ.

The theorems above provide evidence that the current practice of making networks adversarially robust via minimizing convex supremum-based surrogates may have serious deficiencies. This may also explain why, in practice, the adversarial accuracies that are achievable are much lower than the corresponding natural accuracies of the model (Madry et al., 2017). In general, optimizing non-calibrated or non-consistent surrogates could lead to undesirable solutions even under strong assumptions (such as the Bayes risk being zero). See Section 6, where we empirically demonstrate this in a variety of settings. By Theorem 10 and the fact that HNN and Hall both contain 0 and are regular for adversarial calibration, we can derive the following corollary.

Corollary 11. Let φ be a convex and non-increasing margin-based loss. Consider the surrogate loss defined by $\widetilde{\varphi}(f,x,y) = \sup_{x':\|x-x'\|\le\gamma}\varphi(yf(x'))$. Then φ̃ is not H-calibrated with respect to ℓγ, for H = HNN and H = Hall.

The proofs of Theorem 7, Theorem 9 and Theorem 10 are included in Appendix E.2. The key in proving the above theorems is to analyze the calibration function δmax(ε, x, η) as defined in (6) of the losses (ℓ, ℓγ) at η = 1/2 and a distinguishing x0 ∈ X. Naturally, this requires us to understand the inner risk Cℓ(f, x, η), which in turn depends on the worst-case perturbation of a given data point according to ℓ. Our key insight (Lemma 25) is that δmax(ε, x0, η) can be characterized by the two quantities $\underline{M}(f,x_0,\gamma) = \inf_{\|x'-x_0\|\le\gamma} f(x')$ and $\overline{M}(f,x_0,\gamma) = \sup_{\|x'-x_0\|\le\gamma} f(x')$. Requiring $\delta_{\max}(\epsilon, x_0, \tfrac{1}{2}) > 0$ then corresponds to an appropriate convex function not achieving a minimum on a set that contains a global optimum, thereby reaching a contradiction.

4.3 Positive results

In this section, we aim to provide alternative losses which could be calibrated with respect to ℓγ.

4.3.1 Characterization

In light of the above negative results, we need to consider non-convex surrogates. One possible candidate is the family of losses introduced by Bao et al. (2020) that satisfy the property that the generic conditional φ-risk Cφ(t, η) is quasi-concave in t ∈ ℝ for all η ∈ [0,1]. Theorem 12 below is a correction to the main positive result, Theorem 11 in (Bao et al., 2020), where we prove the theorem under the correct calibration definition.

Theorem 12. Let a margin-based loss φ be bounded, continuous, non-increasing, and satisfy the property that Cφ(t, η) is quasi-concave in t ∈ ℝ for all η ∈ [0,1]. Assume that φ(−t) > φ(t) for any γ < t ≤ 1. Then φ is Hlin-calibrated with respect to ℓγ if and only if for any γ < t ≤ 1,

$$\varphi(\gamma) + \varphi(-\gamma) > \varphi(t) + \varphi(-t). \quad (10)$$
The proof of Theorem 12 is included in Appendix E.4, where we make use of Lemma 27 and Lemma 28, which are powerful since they apply to any symmetric hypothesis set. These lemmas will also be used to prove more general positive results, as we show later. Note that Theorem 11 in (Bao et al., 2020) no longer holds under the correct calibration Definition 2, since their condition φ(γ) + φ(−γ) > φ(1) + φ(−1) is much weaker than (10).

The following theorem extends the above to show that, under certain conditions, such surrogate losses are H-calibrated for the class of generalized linear models with respect to the adversarial 0/1 loss.

Theorem 13. Let g be a non-decreasing and continuous function such that g(1 + γ) < G and g(−1 − γ) > −G for some G ≥ 0. Let a margin-based loss φ be bounded, continuous, non-increasing, and satisfy the property that Cφ(t, η) is quasi-concave in t ∈ ℝ for all η ∈ [0,1]. Assume that φ(g(−t) − G) > φ(G − g(−t)) and g(−t) + g(t) ≥ 0 for any 0 ≤ t ≤ 1. Then φ is Hg-calibrated with respect to ℓγ if and only if for any 0 ≤ t ≤ 1, φ(G − g(−t)) + φ(g(−t) − G) = φ(g(t) + G) + φ(−g(t) − G) and $\min\{\varphi(\overline{A}(t)) + \varphi(-\overline{A}(t)),\ \varphi(\underline{A}(t)) + \varphi(-\underline{A}(t))\} > \varphi(G - g(-t)) + \varphi(g(-t) - G)$, where $\overline{A}(t) = \max_{s\in[-t,t]} g(s) - g(s-\gamma)$ and $\underline{A}(t) = \min_{s\in[-t,t]} g(s) - g(s+\gamma)$.

See Appendix E.5 for the proof. The conditions in the theorem above are necessary and sufficient and thus characterize calibration for such surrogate losses. To interpret the conditions better, consider ReLU functions. In that case, the assumptions can be further simplified to obtain the following corollary.

Corollary 14. Assume that G > 1 + γ. Let a margin-based loss φ be bounded, continuous, non-increasing, and satisfy the property that Cφ(t, η) is quasi-concave in t ∈ ℝ for all η ∈ [0,1]. Assume that φ(−G) > φ(G). Then φ is Hrelu-calibrated with respect to ℓγ if and only if for any 0 ≤ t ≤ 1, φ(G) + φ(−G) = φ(t + G) + φ(−t − G) and φ(γ) + φ(−γ) > φ(G) + φ(−G).

4.3.2 Calibration

To demonstrate the applicability of Theorem 13, we consider a specific surrogate loss, namely the ρ-margin loss $\varphi_\rho(t) = \min\{1, \max\{0, 1 - \frac{t}{\rho}\}\}$, ρ > 0, which is a generalization of the ramp loss (see (Mohri et al., 2018)). We also define its supremum-based counterpart as $\widetilde{\varphi}_\rho(f,x,y) = \sup_{x':\|x-x'\|\le\gamma}\varphi_\rho(yf(x'))$. Using Theorem 12, Theorem 13 and Corollary 14 in Section 4.3.1, we can conclude that the ρ-margin loss is calibrated under reasonable conditions for linear hypothesis sets and non-decreasing g-based hypothesis sets, since φρ(t) is bounded, continuous, non-increasing, and Cφρ(t, η) is quasi-concave in t ∈ ℝ for all η ∈ [0,1]. This is stated formally below.

Theorem 15. The surrogate φρ is Hlin-calibrated with respect to ℓγ if and only if ρ > γ. Given a non-decreasing and continuous function g such that g(1 + γ) < G and g(−1 − γ) > −G for some G ≥ 0, and assuming that g(−t) + g(t) ≥ 0 for any 0 ≤ t ≤ 1, φρ is Hg-calibrated with respect to ℓγ if and only if for any 0 ≤ t ≤ 1, φρ(G − g(−t)) = φρ(g(t) + G) and $\min\{\varphi_\rho(\overline{A}(t)),\ \varphi_\rho(-\underline{A}(t))\} > \varphi_\rho(G - g(-t))$, where $\overline{A}(t) = \max_{s\in[-t,t]} g(s) - g(s-\gamma)$ and $\underline{A}(t) = \min_{s\in[-t,t]} g(s) - g(s+\gamma)$. Assuming that G > 1 + γ, φρ is Hrelu-calibrated with respect to ℓγ if and only if G ≥ ρ > γ.
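The following minimal sketch, ours rather than the authors' code, implements the ρ-margin loss φρ and, for linear models, its supremum-based counterpart φ̃ρ via the closed-form infimum from (8); the values of ρ, γ, and the example point are placeholders (Theorem 15 suggests choosing ρ > γ for Hlin).

```python
import numpy as np

rho, gamma = 0.6, 0.25   # illustrative values; rho > gamma as suggested by Theorem 15

def phi_rho(t, rho=rho):
    # rho-margin (ramp) loss: 1 for t <= 0, linear on [0, rho], 0 for t >= rho.
    return float(np.clip(1.0 - t / rho, 0.0, 1.0))

def sup_phi_rho_linear(w, x, y, rho=rho, gamma=gamma):
    # Supremum-based counterpart for f(x) = w.x: since phi_rho is non-increasing,
    # (8) gives sup over the gamma-ball = phi_rho(y * w.x - gamma * ||w||_2).
    return phi_rho(y * float(w @ x) - gamma * np.linalg.norm(w), rho)

w = np.array([1.0, 0.0])
x, y = np.array([0.4, 0.1]), +1
print(phi_rho(y * float(w @ x)), sup_phi_rho_linear(w, x, y))
```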
Recall that in Theorem 10 we ruled out the possibility of finding H-calibrated supremum-based convex surrogate losses with respect to the adversarial 0/1 loss. However, we show that the supremum-based ρ-margin loss is indeed H-calibrated, where H is any symmetric hypothesis set.

Theorem 16. Let H be a symmetric hypothesis set; then φ̃ρ is H-calibrated with respect to ℓγ.

The proof of Theorem 16 is included in Appendix E.4. By Theorem 16 and the fact that Hlin, HNN and Hall are all symmetric, we derive the following.

Corollary 17. φ̃ρ is H-calibrated with respect to ℓγ, for H = Hlin, HNN, and Hall.

The results of this section suggest that the ρ-margin loss and the supremum-based ρ-margin loss may be good surrogates for the adversarial 0/1 loss. However, calibration, in general, is not equivalent to consistency, our eventual goal. In the next section, we study conditions under which we can expect these surrogate losses to be H-consistent as well.

5 H-Consistency

In this section, we study the H-consistency of surrogate loss functions. The results of the previous section suggest that convex losses or supremum-based convex losses would not be H-consistent. However, H-calibrated losses, such as the ρ-margin loss and the supremum-based ρ-margin loss, present an intriguing possibility. Bao et al. (2020) made a claim that since the losses they proposed are Hlin-calibrated, they are also Hlin-consistent. We first present a result that implies this claim is incorrect. In fact, our result stated below shows that, without assumptions on the data distribution, no continuous margin-based loss or supremum-based continuous surrogate can be Hlin-consistent.

5.1 Negative results

Theorem 18. No continuous margin-based loss function φ is Hlin-consistent with respect to ℓγ. Furthermore, for any continuous and non-increasing margin-based loss φ, surrogates of the form $\widetilde{\varphi}(f,x,y) = \sup_{x':\|x-x'\|\le\gamma}\varphi(yf(x'))$ are not Hlin-consistent with respect to ℓγ.

The above theorem is proven in Appendix E.6. In particular, Theorem 18 contradicts the H-consistency claim of Bao et al. (2020) for their proposed losses when H is the family of linear functions. Furthermore, the theorem rules out H-consistency of supremum-based surrogates.

5.2 Positive results

In this section, we investigate the nature of the assumptions on the data distributions that may lead to H-consistency of surrogate losses. We take inspiration from the work of Long and Servedio (2013) and Zhang and Agarwal (2020), who study H-consistency for the standard 0/1 loss. These studies establish consistency under a realizability assumption on the data distribution, stated below, that requires the Bayes (ℓ0, H)-risk to be zero.

Definition 19 (H-realizability). A distribution P over X × Y is H-realizable if it labels points according to a deterministic model in H, i.e., if there exists f ∈ H such that $\mathbb{P}_{(x,y)\sim P}(\operatorname{sgn}(f(x)) = y) = 1$.

As with H-realizability, we will assume that, under the data distribution, the Bayes (ℓγ, H)-risk is zero. We show that the H-calibrated losses studied in previous sections are H-consistent under natural conditions along with the realizability assumption.

5.2.1 Non-supremum-based surrogates

Theorem 20. Let P be a distribution over X × Y and H a hypothesis set for which $R^*_{\ell_\gamma,\mathcal{H}} = 0$. Let φ be a margin-based loss. If, for some η ≥ 0, there exists f* ∈ H ⊆ Hall such that $R_\varphi(f^*) \le R^*_{\varphi,\mathcal{H}_{\mathrm{all}}} + \eta < +\infty$ and φ is H-calibrated with respect to ℓγ, then for all ε > 0 there exists δ > 0 such that for all f ∈ H,

$$R_\varphi(f) + \eta < R^*_{\varphi,\mathcal{H}} + \delta \implies R_{\ell_\gamma}(f) < R^*_{\ell_\gamma,\mathcal{H}} + \epsilon.$$

For the family of linear models, some convex losses may also be Hlin-consistent, verifying the conditions (η = 0) in Theorem 20. However, Hlin-calibrated losses can be Hlin-consistent under more benign assumptions, where the realizability condition $R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{lin}}} = 0$ can be further relaxed.

Theorem 21. Let P be a distribution over X × Y. Assume that there exists g* ∈ Hlin such that $R_{\ell_\gamma}(g^*) = R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{all}}}$. Let φ be a margin-based loss.
If, for some η ≥ 0, there exists f* ∈ Hlin ⊆ Hall such that $R_\varphi(f^*) \le R^*_{\varphi,\mathcal{H}_{\mathrm{all}}} + \eta < +\infty$ and φ is Hlin-calibrated with respect to ℓγ, then for all ε > 0 there exists δ > 0 such that for all f ∈ Hlin we have

$$R_\varphi(f) + \eta < R^*_{\varphi,\mathcal{H}_{\mathrm{lin}}} + \delta \implies R_{\ell_\gamma}(f) < R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{lin}}} + \epsilon.$$

The proofs of Theorem 20 and Theorem 21 are presented in Appendix E.7. Using Theorem 15 in Section 4.3.2 and the theorems above, we immediately conclude that the calibrated ρ-margin loss is consistent with respect to ℓγ for all distributions that satisfy our realizability assumptions.

Figure 2: Left: Unit Circle with 1,000 and 2,000 samples. Right: Segments with 500 samples.

Theorem 22. If ρ > γ, then φρ is Hlin-consistent with respect to ℓγ for all distributions such that there exists g* ∈ Hlin with $R_{\ell_\gamma}(g^*) = R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{all}}}$ and there exists f* ∈ Hlin such that $R_{\varphi_\rho}(f^*) = R^*_{\varphi_\rho,\mathcal{H}_{\mathrm{all}}}$. If g verifies the calibration condition in Theorem 15, then φρ is Hg-consistent with respect to ℓγ for all distributions P over X × Y that satisfy $R^*_{\ell_\gamma,\mathcal{H}_g} = 0$ and for which there exists f* ∈ Hg such that $R_{\varphi_\rho}(f^*) = R^*_{\varphi_\rho,\mathcal{H}_{\mathrm{all}}}$. If G > 1 + γ and G ≥ ρ > γ, then φρ is Hrelu-consistent with respect to ℓγ for all distributions P over X × Y that satisfy $R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{relu}}} = 0$ and for which there exists f* ∈ Hrelu such that $R_{\varphi_\rho}(f^*) = R^*_{\varphi_\rho,\mathcal{H}_{\mathrm{all}}}$.

5.2.2 Supremum-based surrogates

We can also extend the above to obtain H-consistency of supremum-based surrogates. However, we need the stronger condition that $R_{\widetilde{\varphi}}$ is minimized exactly inside H.

Theorem 23. Let P be a distribution over X × Y and H a hypothesis set such that $R^*_{\ell_\gamma,\mathcal{H}} = 0$. Let φ be a non-increasing margin-based loss. If there exists f* ∈ H ⊆ Hall such that $R_{\widetilde{\varphi}}(f^*) = R^*_{\widetilde{\varphi},\mathcal{H}_{\mathrm{all}}} < +\infty$ and $\widetilde{\varphi}(f,x,y) = \sup_{x':\|x-x'\|\le\gamma}\varphi(yf(x'))$ is H-calibrated with respect to ℓγ, then for all ε > 0 there exists δ > 0 such that for all f ∈ H we have

$$R_{\widetilde{\varphi}}(f) < R^*_{\widetilde{\varphi},\mathcal{H}} + \delta \implies R_{\ell_\gamma}(f) < R^*_{\ell_\gamma,\mathcal{H}} + \epsilon.$$

The proof of Theorem 23 is presented in Appendix E.7. Again, when combined with Theorem 16 in Section 4.3.2, we conclude that the H-calibrated supremum-based ρ-margin loss is also H-consistent with respect to ℓγ for all distributions that satisfy our realizability assumptions.

Theorem 24. Let H be a symmetric hypothesis set; then φ̃ρ is H-consistent with respect to ℓγ for all distributions P over X × Y that satisfy: $R^*_{\ell_\gamma,\mathcal{H}} = 0$ and there exists f* ∈ H such that $R_{\widetilde{\varphi}_\rho}(f^*) = R^*_{\widetilde{\varphi}_\rho,\mathcal{H}_{\mathrm{all}}} < +\infty$.

6 Experiments

Here, we present experiments on simulated data to support our theoretical findings. The goal is two-fold. First, we empirically demonstrate that H-calibrated surrogates in (Bao et al., 2020) may indeed not be H-consistent unless assumptions on the data distribution are made, even when H is the class of linear functions. This is consistent with our negative result in Theorem 18 and provides an empirical counterexample to the claim made in (Bao et al., 2020). Second, we study the necessity of the realizability assumptions we adopted in Section 5.2 to establish H-consistency of surrogates satisfying the conditions in Theorem 12.

We generate data points x ∈ ℝ² on the unit circle and consider H to be the family of linear models Hlin. We denote f(x) = w·x, with w = (cos(t), sin(t))ᵀ, t ∈ [0, 2π), f ∈ Hlin. All risks are approximated by their empirical counterparts computed over 10⁷ i.i.d. samples.
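As a concrete sketch of this experimental protocol (our own illustration, not the authors' code), the risks over Hlin can be approximated empirically by parameterizing w with an angle t and grid-searching over t; the surrogate, grid resolution, sample size, and placeholder labels below are ours.

```python
import numpy as np

# Sketch of the protocol: parameterize H_lin by an angle t, approximate risks
# empirically, and grid-search for the minimizer. All choices are illustrative.
gamma = np.sqrt(2) / 2

def adv_zero_one_risk(theta_w, X, y, gamma=gamma):
    # Empirical adversarial 0/1 risk (9) for f(x) = w.x with ||w||_2 = 1.
    w = np.array([np.cos(theta_w), np.sin(theta_w)])
    return np.mean(y * (X @ w) - gamma <= 0.0)

def surrogate_risk(theta_w, X, y, phi, gamma=0.0):
    # Empirical phi-risk; with gamma > 0 this is the supremum-based version (8).
    w = np.array([np.cos(theta_w), np.sin(theta_w)])
    return np.mean(phi(y * (X @ w) - gamma))

def minimize_over_Hlin(risk_fn, X, y, n_grid=3600, **kw):
    ts = np.linspace(0.0, 2 * np.pi, n_grid, endpoint=False)
    vals = np.array([risk_fn(t, X, y, **kw) for t in ts])
    return ts[vals.argmin()], vals.min()

# Tiny usage example with placeholder data and labels (the paper uses 10^7 samples
# and the surrogates of Appendix C.1; a plain hinge is used here as a stand-in).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2 * np.pi, size=10000)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)
y = np.where(np.sin(theta) >= 0.0, 1, -1)
hinge = lambda m: np.maximum(0.0, 1.0 - m)
t_star, _ = minimize_over_Hlin(surrogate_risk, X, y, phi=hinge, gamma=gamma)
print(t_star, adv_zero_one_risk(t_star, X, y))
```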
To demonstrate the need for some assumptions for H-consistency, we construct a scenario we call the Unit Circle case. We consider four surrogates: φhinge, φramp, φsig and φlog, defined in Appendix C.1. In general, we refer to all of these surrogates as φsur. We generate data points x from the uniform distribution on the unit circle. Define x as x = (cos(θ), sin(θ))ᵀ, θ ∈ [0, 2π). Set the label of a point x as follows: if θ ∈ (π/2, π), then y = −1 with probability 3/4 and y = +1 with probability 1/4; if θ ∈ (0, π/2) or θ ∈ (3π/2, 2π), then y = +1; if θ ∈ (π, 3π/2), then y = −1. Set γ = √2/2. In this case, the Bayes (ℓγ, Hlin)-risk is $R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{lin}}} \approx 0.5000 > 0$ and is achieved by $w_{\ell_\gamma} = (\cos(\theta), \sin(\theta))^\top$ with θ ≈ 0.7855. The results obtained by optimizing the different surrogate losses are reported in Table 1(a) and the plots for 1,000 samples and 2,000 samples are shown in Figure 2.

Table 1: (a) Unit Circle; (b) Segments. (✓/✗ indicate Hlin-calibration and, for the given distribution, Hlin-consistency with respect to ℓγ, as discussed in the text.)

(a) Unit Circle

| φsur | Rℓγ(f*) | θφsur | Hlin-cal. | Hlin-cons. |
|---|---|---|---|---|
| φhinge | 0.5257 | 0.1420 | ✗ | ✗ |
| φramp | 0.5263 | 0.1288 | ✓ | ✗ |
| φsig | 0.5261 | 0.1320 | ✓ | ✗ |
| φlog | 0.5258 | 0.1414 | ✗ | ✗ |

(b) Segments

| φsur | Rℓγ(f*) | Rφsur(f*) | θφsur | Hlin-cal. | Hlin-cons. |
|---|---|---|---|---|---|
| φhinge | 0.0781 | 0.6907 | 1.3548 | ✗ | ✗ |
| φramp | 0.0781 | 0.3454 | 1.3548 | ✓ | ✗ |
| φsig | 0.0777 | 0.4247 | 1.3498 | ✓ | ✗ |
| φlog | 0.0763 | 0.8078 | 1.3341 | ✗ | ✗ |
| φ1 | 0.0111 | 0 | π/6 | ✗ | ✗ |
| φ2 | 0 | 0 | 0 | ✓ | ✓ |

Table 1(a) shows that neither calibrated nor non-calibrated (convex) surrogates are Hlin-consistent with respect to ℓγ for this distribution. Figure 2 shows that the classifiers obtained by optimizing the four surrogates are almost the same but deviate a lot from the optimal Bayes classifier for ℓγ. This shows that calibrated surrogates may indeed not be consistent and contradicts Figure 12 of (Bao et al., 2020). The discrepancy results from an incorrect calculation of the adversarial Bayes risk in (Bao et al., 2020).

Next, we justify the realizability assumptions made in Section 5.2 for obtaining H-consistency of surrogate losses. To do so, we design a scenario that we call the Segments case. Here, we consider six surrogates: the four studied above and two more surrogates, φ1 and φ2, defined in Appendix C.1. The loss φ1 is a convex loss and φ2 is the ρ-margin ramp loss for some ρ > γ. In general, we refer to all of these surrogates as φsur. We show in Appendix C.2 that φhinge, φlog and φ1 are not Hlin-calibrated, while φramp, φsig and φ2 are Hlin-calibrated with respect to ℓγ. We define $I_{\hat\gamma} = \sqrt{1 - \hat\gamma^2}$ and consider: P(Y = +1) = P(Y = −1) = 1/2, X | Y = +1 is the uniform distribution on the line segment $\{(\hat\gamma, z) : z \in [0, I_{\hat\gamma}]\}$, and X | Y = −1 is the uniform distribution on the line segment $\{(-\hat\gamma, z) : z \in [-I_{\hat\gamma}, 0]\}$, where $\hat\gamma = \gamma + \frac{1-\gamma}{100} = \frac{1+99\gamma}{100}$, γ ∈ (0,1). We choose γ = 0.1 and set w* = (1, 0)ᵀ. It is easy to check that w* achieves the Bayes (ℓγ, Hlin)-risk $R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{lin}}} = 0$. The results for the six different surrogate losses are indicated in Table 1(b) and the plot for 500 samples is shown in Figure 2.

For φhinge, φramp, φsig and φlog, the Bayes (φsur, Hlin)-risk $R^*_{\varphi_{\mathrm{sur}},\mathcal{H}_{\mathrm{lin}}} > 0$. Table 1(b) shows that they are not Hlin-consistent with respect to ℓγ.

Figure 3: Adversarial true risk of consistent and calibrated inconsistent losses vs. sample size.

For φ1 and φ2, the Bayes (φsur, Hlin)-risk $R^*_{\varphi_{\mathrm{sur}},\mathcal{H}_{\mathrm{lin}}} = 0$. Table 1(b) shows that φ1 is not Hlin-consistent (recall that φ1 is not calibrated) but φ2 is Hlin-consistent for this distribution. Hence, even when $R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{lin}}} = 0$, unless a condition is also imposed on $R^*_{\varphi_{\mathrm{sur}},\mathcal{H}_{\mathrm{lin}}}$, one cannot expect consistency, thereby justifying our realizability assumption. Note that $R^*_{\varphi_{\mathrm{sur}},\mathcal{H}_{\mathrm{lin}}} = R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{lin}}} = 0$ is a special case verifying the conditions of Theorem 20 for η = 0. For this distribution, φramp is not Hlin-consistent while φ2 is Hlin-consistent, although both are Hlin-calibrated. We compare them in Figure 3, showing that minimizing the Hlin-consistent surrogate φ2 minimizes the adversarial generalization error for large sample sizes, but the same does not hold for the non-Hlin-consistent surrogate φramp.
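To see why the Segments distribution defined above satisfies the realizability condition $R^*_{\ell_\gamma,\mathcal{H}_{\mathrm{lin}}} = 0$, note that every point has first coordinate ±γ̂ with a matching label, so w* = (1, 0)ᵀ attains adversarial margin γ̂ − γ = (1 − γ)/100 > 0 on every example. The short sketch below, our own illustration with a placeholder sample size, samples from this distribution and checks this empirically.

```python
import numpy as np

# Sampler for the Segments distribution and a check that w* = (1, 0) attains
# zero empirical adversarial 0/1 risk (sample size is a placeholder).
rng = np.random.default_rng(0)
gamma = 0.1
gamma_hat = (1.0 + 99.0 * gamma) / 100.0     # gamma_hat = gamma + (1 - gamma) / 100
I_hat = np.sqrt(1.0 - gamma_hat ** 2)

def sample_segments(n):
    y = rng.choice([-1, 1], size=n)
    z = rng.uniform(0.0, I_hat, size=n)
    # X | Y=+1 uniform on {(gamma_hat, z) : z in [0, I_hat]},
    # X | Y=-1 uniform on {(-gamma_hat, z) : z in [-I_hat, 0]}.
    X = np.stack([y * gamma_hat, y * z], axis=1)
    return X, y

X, y = sample_segments(500)
w_star = np.array([1.0, 0.0])
adv_margins = y * (X @ w_star) - gamma       # equals gamma_hat - gamma > 0 for all points
print(np.mean(adv_margins <= 0.0))           # empirical adversarial risk of w*: 0.0
```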
7 Conclusion

We presented a detailed study of calibration and consistency for adversarial robustness. These results can help guide the design of algorithms for learning robust predictors, an increasingly important problem in applications. Our theoretical results show in particular that many of the surrogate losses typically used in practice do not benefit from any guarantee. Our empirical results further illustrate this in the context of a general example. Our results also show that some of the calibration results presented in previous work do not bear any significance, since we prove that in fact they do not guarantee consistency. Instead, we give a series of positive calibration and consistency results for several families of surrogate functions, under some realizability assumptions.

Acknowledgements

This work was partly funded by NSF CCF-1535987 and NSF IIS-1618662.

References

Attias, I., Kontorovich, A., and Mansour, Y. (2018). Improved generalization bounds for robust learning. arXiv preprint arXiv:1810.02180.

Awasthi, P., Dutta, A., and Vijayaraghavan, A. (2019). On robustness to adversarial examples and polynomial optimization. In Advances in Neural Information Processing Systems, pages 13737-13747.

Awasthi, P., Frank, N., and Mohri, M. (2020). Adversarial learning guarantees for linear hypotheses and neural networks. In International Conference on Machine Learning, pages 431-441.

Bao, H., Scott, C., and Sugiyama, M. (2020). Calibrated surrogate losses for adversarially robust classification. In Conference on Learning Theory, pages 408-451.

Bao, H., Scott, C., and Sugiyama, M. (2021). Corrigendum to: Calibrated surrogate losses for adversarially robust classification. arXiv preprint arXiv:2005.13748.

Bartlett, P. L., Bubeck, S., and Cherapanamjeri, Y. (2021). Adversarial examples in multi-layer random ReLU networks. arXiv preprint arXiv:2106.12611.

Bartlett, P. L., Jordan, M. I., and McAuliffe, J. D. (2006). Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473):138-156.

Boyd, S. P. and Vandenberghe, L. (2014). Convex Optimization. Cambridge University Press.

Bubeck, S., Lee, Y. T., Price, E., and Razenshteyn, I. (2018a). Adversarial examples from cryptographic pseudo-random generators. arXiv preprint arXiv:1811.06418.

Bubeck, S., Price, E., and Razenshteyn, I. (2018b). Adversarial examples from computational constraints. arXiv preprint arXiv:1805.10204.

Carlini, N. and Wagner, D. (2017). Towards evaluating the robustness of neural networks. In IEEE Symposium on Security and Privacy (SP), pages 39-57.

Carmon, Y., Raghunathan, A., Schmidt, L., Liang, P., and Duchi, J. C. (2019). Unlabeled data improves adversarial robustness. arXiv preprint arXiv:1905.13736.

Cullina, D., Bhagoji, A. N., and Mittal, P. (2018). PAC-learning in the presence of evasion adversaries. arXiv preprint arXiv:1806.01471.

Diakonikolas, I., Kane, D. M., and Manurangsi, P. (2020). The complexity of adversarially robust proper learning of halfspaces with agnostic noise. arXiv preprint arXiv:2007.15220.

Feige, U., Mansour, Y., and Schapire, R. (2015). Learning and inference in the presence of corrupted inputs. In Conference on Learning Theory, pages 637-657.

Feige, U., Mansour, Y., and Schapire, R. E. (2018). Robust inference for multiclass classification. In Algorithmic Learning Theory, pages 368-386.

Gao, W. and Zhou, Z.-H. (2015). On the consistency of AUC pairwise optimization. In Twenty-Fourth International Joint Conference on Artificial Intelligence.
Goodfellow, I. J., Shlens, J., and Szegedy, C. (2014). Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572.

Khim, J. and Loh, P.-L. (2018). Adversarial risk bounds for binary classification via function transformation. arXiv preprint arXiv:1810.09519.

Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097-1105.

Long, P. and Servedio, R. (2013). Consistency versus realizable H-consistency for multiclass classification. In International Conference on Machine Learning, pages 801-809.

Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (2017). Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083.

Mohri, M., Rostamizadeh, A., and Talwalkar, A. (2018). Foundations of Machine Learning. MIT Press, second edition.

Montasser, O., Hanneke, S., and Srebro, N. (2019). VC classes are adversarially robustly learnable, but only improperly. arXiv preprint arXiv:1902.04217.

Montasser, O., Hanneke, S., and Srebro, N. (2020). Reducing adversarially robust learning to non-robust PAC learning. arXiv preprint arXiv:2010.12039.

Shafahi, A., Najibi, M., Ghiasi, M. A., Xu, Z., Dickerson, J., Studer, C., Davis, L. S., Taylor, G., and Goldstein, T. (2019). Adversarial training for free! In Advances in Neural Information Processing Systems, pages 3353-3364.

Steinwart, I. (2007). How to compare different loss functions and their risks. Constructive Approximation, 26(2):225-287.

Sutskever, I., Vinyals, O., and Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pages 3104-3112.

Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., and Fergus, R. (2013). Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199.

Tewari, A. and Bartlett, P. L. (2007). On the consistency of multiclass classification methods. Journal of Machine Learning Research, 8(36):1007-1025.

Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., and Madry, A. (2018). Robustness may be at odds with accuracy. arXiv preprint arXiv:1805.12152.

Uematsu, K. and Lee, Y. (2011). On theoretically optimal ranking functions in bipartite ranking. Department of Statistics, The Ohio State University, Tech. Rep. 863.

Wong, E., Rice, L., and Kolter, J. Z. (2020). Fast is better than free: Revisiting adversarial training. arXiv preprint arXiv:2001.03994.

Yin, D., Ramchandran, K., and Bartlett, P. L. (2019). Rademacher complexity for adversarially robust generalization. In International Conference on Machine Learning, pages 7085-7094.

Zhang, H., Yu, Y., Jiao, J., Xing, E. P., Ghaoui, L. E., and Jordan, M. I. (2019). Theoretically principled trade-off between robustness and accuracy. arXiv preprint arXiv:1901.08573.

Zhang, M. and Agarwal, S. (2020). Bayes consistency vs. H-consistency: The interplay between surrogate loss functions and the scoring function class. In Advances in Neural Information Processing Systems, pages 16927-16936.

Zhang, T. (2004). Statistical behavior and consistency of classification methods based on convex risk minimization. The Annals of Statistics, 32(1):56-85.