# Metric-Fair Classifier Derandomization

Jimmy Wu, Yatong Chen, Yang Liu

Department of Computer Science and Engineering, University of California, Santa Cruz, USA. Correspondence to: Jimmy Wu, Yatong Chen, Yang Liu (corresponding author).

Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022. Copyright 2022 by the author(s).

## Abstract

We study the problem of classifier derandomization in machine learning: given a stochastic binary classifier $f : X \to [0, 1]$, sample a deterministic classifier $\hat f : X \to \{0, 1\}$ that approximates the output of $f$ in aggregate over any data distribution. Recent work revealed how to efficiently derandomize a stochastic classifier with strong output approximation guarantees, but at the cost of individual fairness: that is, if $f$ treated similar inputs similarly, $\hat f$ did not. In this paper, we initiate a systematic study of classifier derandomization with metric fairness guarantees. We show that the prior derandomization approach is almost maximally metric-unfair, and that a simple random threshold derandomization achieves optimal fairness preservation but with weaker output approximation. We then devise a derandomization procedure that provides an appealing tradeoff between these two: if $f$ is $\alpha$-metric fair according to a metric $d$ with a locality-sensitive hash (LSH) family, then our derandomized $\hat f$ is, with high probability, $O(\alpha)$-metric fair and a close approximation of $f$. We also prove generic results applicable to all (fair and unfair) classifier derandomization procedures, including a bias-variance decomposition and reductions between various notions of metric fairness.

## 1. Introduction

We study the general problem of derandomizing stochastic classification models. Consider a typical binary classification setting defined by a feature space $X \subseteq \mathbb{R}^n$ and labels $\{0, 1\}$; we wish to devise a procedure that, given a stochastic or randomized classifier $f : X \to [0, 1]$, efficiently samples a deterministic classifier $\hat f : X \to \{0, 1\}$ from some family of functions $\mathcal F$, such that $\hat f$ preserves various qualities of $f$.

Stochastic classifiers arise naturally in both theory and practice. For example, they are frequently the solutions to constrained optimization problems encoding complex evaluation metrics (Narasimhan, 2018), group fairness (Grgić-Hlača et al., 2017; Agarwal et al., 2018), individual fairness (Dwork et al., 2012; Rothblum & Yona, 2018; Kim et al., 2018; Sharifi-Malvajerdi et al., 2019), and robustness to adversarial attacks (Pinot et al., 2019; Cohen et al., 2019; Pinot et al., 2020; Braverman & Garg, 2020). Stochastic classifiers are also the natural result of taking an ensemble of individual classifiers (Dietterich, 2000; Grgić-Hlača et al., 2017). However, they may be undesirable for numerous reasons: a stochastic classifier is not robust to repeated attacks, since even one that is instance-wise 99% accurate will likely err after a few hundred attempts; by the same token, stochastic classifiers violate intuitive notions of fairness, since even the same individual may be treated differently over multiple classifications. For these reasons, Cotter, Gupta, and Narasimhan (Cotter et al., 2019) recently presented a procedure for derandomizing a stochastic classifier while approximately preserving the outputs of $f$ with high probability.
However, the authors observe that their construction results in similar individuals typically being given very different predictions (in other words, it does not satisfy individual fairness), and they ask whether it is possible to obtain a family of deterministic classifiers that preserves both aggregate outputs and individual fairness.

Another motivation for studying individually fair decision making comes from the game-theoretic setting of strategic classification, wherein decision subjects may modify their features to obtain a desired outcome from the classifier (Hardt et al., 2016; Cai et al., 2015; Chen et al., 2018; Dong et al., 2018; Chen et al., 2020). A metric-fair stochastic classifier, and by extension a metric-fair derandomization procedure, offers significant protection against such manipulations. See Appendix B for more on this topic.

### 1.1. Our Contributions

In this paper, we initiate a systematic study of classifier derandomization with individual fairness preservation. In line with many recent works, we formalize individual fairness as metric fairness, which requires the classifier to output similar predictions on close point pairs in some metric space $(X, d)$ (Dwork et al., 2012; Kim et al., 2018; Friedler et al., 2016). Roughly, $f$ is metric-fair if there are constants $\alpha, \beta > 0$ such that for all $x, x' \in X$,

$$|f(x) - f(x')| \le \alpha \, d(x, x') + \beta.$$

A sampled deterministic classifier $\hat f \in \mathcal F$ is metric-fair when this inequality holds in expectation. Under this formalism, we obtain the following results:

1. We make precise the observation of (Cotter et al., 2019) that their derandomization procedure, based on pairwise-independent hash functions, does not preserve individual fairness. In fact, we prove that it is almost maximally metric-unfair regardless of how fair the original stochastic classifier was (Section 2.1).

2. We demonstrate that a very simple derandomization procedure, based on setting a single random threshold $r \in [0, 1]$, attains near-perfect expected fairness preservation, and prove that no better fairness preservation is possible (Section 2.2). However, this procedure's output approximation has higher variance than the pairwise-independent hashing approach in general.

3. We devise a derandomization procedure that achieves nearly the best of both worlds, preserving aggregate outputs with high probability, with only modest loss of metric fairness (Section 3). In particular, when $f$ has fairness parameters $(\alpha, \beta)$, sampling $\hat f$ from our family $\mathcal F_{LS}$ yields expected fairness parameters at most $(\alpha + \frac{1}{2}, \beta + \epsilon)$. We also show a high-probability aggregate fairness guarantee: most deterministic classifiers in $\mathcal F$ assign most close pairs the same prediction. These guarantees hold for the class of metrics $d$ that possess locality-sensitive hashing (LSH) schemes, which includes a wide variety of generic and data-dependent metrics.

4. We prove structural lemmas applicable to all classifier derandomization procedures: first, a bias-variance decomposition for the error of a derandomization $\hat f$ of $f$; second, a set of reductions showing that metric fairness-preserving derandomizations also preserve notions of aggregate and threshold fairness.

A practically appealing aspect of our LSH-based derandomization method is that it is completely oblivious to the original stochastic classifier, in that it requires no knowledge of how $f$ was trained, and its fairness guarantee holds for whatever fairness parameters $f$ happens to satisfy on each pair $(x, x') \in X^2$.
The technique can therefore be applied as an independent post-processing step, for example on the many fair stochastic classifiers detailed in recent works (Rothblum & Yona, 2018; Kim et al., 2018). The burden on the model designer is thus reduced to selecting an LSHable metric feature space $(X, d)$ that is appropriate for the classification task.

### 1.2. Preliminaries

Given a stochastic classifier $f : X \to [0, 1]$ and distance function $d : X \times X \to [0, 1]$, we wish to design an efficiently sampleable set $\mathcal F$ of deterministic binary classifiers $\hat f : X \to \{0, 1\}$; we call $\mathcal F$ a family of deterministic classifiers, or a derandomization of $f$. Moreover, we would like $\mathcal F$ to have the following properties:

**Output approximation:** $\hat f$ sampled uniformly from $\mathcal F$ simulates or approximates $f$ in aggregate over any distribution. (In this paper, we will always sample uniformly from families of classifiers and hash functions; thus $\hat f \sim \mathcal F$ means $\hat f \sim \mathrm{Unif}(\mathcal F)$, and $h \sim H$ means $h \sim \mathrm{Unif}(H)$.) More precisely, define the pointwise bias and variance of $\hat f$ with respect to $f$ on a sample $x \in X$ as

$$\mathrm{bias}(\hat f, f, x) := \left| \mathbb{E}_{\hat f \sim \mathcal F}[\hat f(x)] - f(x) \right|, \qquad \mathrm{variance}(\hat f, x) := \mathrm{Var}_{\hat f \sim \mathcal F}\left( \hat f(x) \right).$$

Now let $\mathcal D$ be a distribution over $X$. The aggregate bias and variance of $\hat f$ with respect to $f$ on $\mathcal D$ are

$$\mathrm{bias}(\hat f, f, \mathcal D) := \mathbb{E}_{x \sim \mathcal D}\left[ \mathrm{bias}(\hat f, f, x) \right], \qquad \mathrm{variance}(\hat f, \mathcal D) := \mathrm{Var}_{\hat f \sim \mathcal F}\left( \mathbb{E}_{x \sim \mathcal D}[\hat f(x)] \right).$$

We seek a family $\mathcal F$ for which both of these quantities are small. This is a useful notion of a good approximation of $f$ since in practice, classifiers are typically applied in aggregate on some dataset or in deployment. In Section 4.4 we also point out that low bias and variance in the above sense imply that $\hat f$ and $f$ are nearly indistinguishable when compared according to any binary loss functions, such as accuracy, false positive rate, etc.
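To make the preceding definitions concrete, the following is a minimal Monte Carlo sketch (ours, not from the paper) for estimating these four quantities given black-box access to $f$ and to a sampler for $\mathcal F$. The interface `sample_fhat`, which returns a freshly drawn deterministic classifier, and all other names are illustrative assumptions.

```python
import numpy as np

def pointwise_bias_variance(sample_fhat, f, x, trials=10_000, seed=0):
    """Monte Carlo estimates of bias(fhat, f, x) and variance(fhat, x).
    sample_fhat(rng) must return a fresh deterministic classifier
    fhat: X -> {0, 1} drawn uniformly from the family F."""
    rng = np.random.default_rng(seed)
    preds = np.array([sample_fhat(rng)(x) for _ in range(trials)])
    return abs(preds.mean() - f(x)), preds.var()

def aggregate_bias_variance(sample_fhat, f, xs, trials=10_000, seed=0):
    """Aggregate bias E_{x~D}[bias(fhat, f, x)] and aggregate variance
    Var_{fhat}(E_{x~D}[fhat(x)]) on an empirical sample xs from D."""
    rng = np.random.default_rng(seed)
    preds = np.empty((trials, len(xs)))      # preds[t, i] = fhat_t(xs[i])
    for t in range(trials):
        fhat = sample_fhat(rng)              # one classifier per trial
        preds[t] = [fhat(x) for x in xs]
    fx = np.array([f(x) for x in xs])
    agg_bias = np.abs(preds.mean(axis=0) - fx).mean()
    agg_var = preds.mean(axis=1).var()       # Var over fhat of E_x[fhat(x)]
    return agg_bias, agg_var
```

Each $\hat f$ is sampled once and then evaluated on every point, matching the definitions above: the bias averages over $x$, while the aggregate variance is taken over the draw of $\hat f$.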
**Individual fairness:** Similar individuals are likely to be treated similarly. We formally define this notion as metric fairness, which says that the classifier should be an approximately Lipschitz-continuous function relative to a given distance metric:

**Definition 1.1** ($(\alpha, \beta, d)$-metric fairness). Let $\alpha \ge 1$ and $\beta \ge 0$, let $d : X^2 \to [0, 1]$ be a metric, and let $x, x' \in X$. We say a stochastic classifier $f : X \to [0, 1]$ satisfies $(\alpha, \beta, d)$-metric fairness on $(x, x')$, or is $(\alpha, \beta, d)$-fair on $(x, x')$, if

$$|f(x) - f(x')| \le \alpha \, d(x, x') + \beta. \tag{1}$$

Similarly, a deterministic classifier family $\mathcal F$ is $(\alpha, \beta, d)$-fair on $(x, x')$ if

$$\mathbb{E}_{\hat f \sim \mathcal F}\left[ |\hat f(x) - \hat f(x')| \right] \le \alpha \, d(x, x') + \beta. \tag{2}$$

When this condition is satisfied for all $(x, x') \in X^2$, we simply say the classifier (or family) is $(\alpha, \beta, d)$-fair.

(We enforce $\alpha \ge 1$, and not merely $\alpha \ge 0$, so that the codomain of $f$ is $[0, 1]$ rather than potentially $[0, \alpha]$ or some other interval of length $\alpha < 1$. Requiring $\alpha \ge 1$ thus makes $f$ a proper stochastic classifier and enables direct comparisons between different fairness parameters. This is no loss of generality, since $(\alpha, \beta, d)$-fairness for $\alpha < 1$ can also be expressed as $(1, \beta, \alpha d)$-fairness by rescaling the metric, or, with some loss of generality, as $(1, \beta + \alpha, d)$-fairness.)

To intuit this definition, notice that when a classifier satisfies metric fairness with $\beta = 0$, the difference between its predictions on some pair of points $x$ and $x'$ scales in proportion to their distance. To conform to this idea of fairness, it is important that the derandomization procedures we design do not substantially increase these fairness parameters, but especially $\beta$.

The above definition of metric fairness is most closely related to those of Rothblum and Yona (Rothblum & Yona, 2018), whose focus is learning a probably approximately metric-fair model that generalizes to unseen data; and Kim, Reingold, and Rothblum (Kim et al., 2018), whose focus is in-sample learning when the metric $d$ is not fully specified. Both works take inspiration from the metric-based notion of individual fairness introduced in (Dwork et al., 2012). Crucially, however, the aforementioned works provide guarantees exclusively for stochastic classifiers, and to our knowledge, this is the case for all papers to date whose focus is learning metric-fair classifiers.

In addition to this pairwise notion of metric fairness, we will also develop aggregate fairness guarantees for various derandomization procedures. To that end, let

$$X^2_{\le \tau} := \left\{ (x, x') \in X^2 \mid d(x, x') \le \tau \right\}$$

denote the set of point pairs within some distance $\tau \in [0, 1]$. Our aggregate fairness bounds will state that, with high probability over the sampling of $\hat f \sim \mathcal F$, most pairs $(x, x') \in X^2_{\le \tau}$ receive the same prediction from $\hat f$.

## 2. Output Approximation Versus Fairness

We begin our study of metric-fair classifier derandomization by contrasting two approaches: first, the pairwise-independent derandomization of (Cotter et al., 2019), which achieves a low-variance approximation of the original stochastic classifier but does not preserve metric fairness; and second, a simple random threshold derandomization that perfectly preserves metric fairness, at the cost of higher output variance.

### 2.1. Pairwise-Independent Derandomization

The construction of Cotter, Gupta, and Narasimhan (Cotter et al., 2019) makes use of a pairwise-independent hash function family $H_{PI}$, i.e. a set of functions $h_{PI} : B \to [k]$ such that

$$\Pr_{h \sim H_{PI}}\left[ h(b) = i, \; h(b') = j \right] = \frac{1}{k^2} \qquad \forall \, b \ne b' \in B, \; i, j \in [k].$$

Observe that a family that satisfies this property is also uniform, i.e. $\Pr_{h \sim H_{PI}}[h(b) = i] = 1/k$ for all $b, i$. The classifier family they propose is then

$$\mathcal F_{PI} := \left\{ \hat f_{h_{PI}} \mid h_{PI} \in H_{PI} \right\}, \quad \text{where} \quad \hat f_{h_{PI}}(x) := \mathbb{1}\left\{ f(x) \ge \frac{h_{PI}(\pi(x))}{k} \right\}, \tag{3}$$

and $\pi : X \to B$ is some fixed bucketing function that discretizes the input (since the pairwise-independent hash family has finite domain). (For the sake of clearer exposition, we simplify the deterministic classifier used in (Cotter et al., 2019), which is actually $\hat f_{h_{PI}}(x) := \mathbb{1}\{ f(x) \ge \frac{2 h_{PI}(\pi(x)) - 1}{2k} \}$; this does not change Theorem 2.1 or Proposition 2.2 beyond a $1/2k$ additive difference in the bias, variance, and $\beta$.)

Let us develop some intuition for this construction. First, thinking of $k$ as large, each $\hat f_{h_{PI}} \in \mathcal F_{PI}$ essentially assigns a pseudo-random threshold $\frac{h_{PI}(\pi(x))}{k} \in [0, 1]$ to each input $x$, so that $\hat f(x) = 1$ if and only if $f(x)$ exceeds the threshold. Since $H_{PI}$ is a uniform hash function family, $h_{PI}(\pi(x))$ is uniform over $[k]$; this endows $\mathcal F_{PI}$ with low bias with respect to $f$. Using this idea and the pairwise-independence of $H_{PI}$, the authors show that this classifier family exhibits low bias and variance of approximation:

**Theorem 2.1** (Bias and variance of pairwise-independent derandomization (Cotter et al., 2019), simplified). Let $f$ be a stochastic classifier, $\mathcal D$ a distribution over $X$, and $\pi : X \to B$ a bucketing function. Then $\hat f \sim \mathcal F_{PI}$ satisfies

$$\mathrm{bias}(\hat f_{PI}, f, \mathcal D) \le \frac{1}{k}, \qquad \mathrm{variance}(\hat f_{PI}, f, \mathcal D) \le \max_{b \in B} \Pr_{x \sim \mathcal D}[\pi(x) = b] \cdot \mathbb{E}_{x \sim \mathcal D}\left[ f(x)(1 - f(x)) \right] + \frac{1}{k}.$$

Moreover, $\hat f_{PI}$ can be sampled using $O(\log |B| + \log k)$ uniform random bits.
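As an illustration, here is a minimal sketch of the construction in Equation (3). It uses the standard Carter-Wegman hash $h(b) = ((ab + c) \bmod p) \bmod k$ as an (approximately) pairwise-independent family and a coarse rounding-based bucketing for $\pi$; these concrete choices are our own assumptions for illustration, not the implementation of (Cotter et al., 2019).

```python
import numpy as np

P = 2_147_483_647  # a Mersenne prime (2^31 - 1), larger than |B|

def sample_pairwise_independent_hash(k, rng):
    """h(b) = ((a*b + c) mod P) mod k: the Carter-Wegman construction,
    pairwise independent up to the small bias of the final mod-k step."""
    a = rng.integers(1, P)
    c = rng.integers(0, P)
    return lambda b: ((a * b + c) % P) % k + 1   # values in [k] = {1..k}

def make_fpi(f, bucketize, k, rng):
    """One deterministic classifier from F_PI (Equation (3)):
    fhat(x) = 1{ f(x) >= h(pi(x)) / k }."""
    h = sample_pairwise_independent_hash(k, rng)
    return lambda x: int(f(x) >= h(bucketize(x)) / k)

# Toy usage: f is any map X -> [0, 1]; pi discretizes features to ints.
rng = np.random.default_rng(0)
f = lambda x: float(np.clip(x.sum(), 0.0, 1.0))
pi = lambda x: hash(tuple(np.round(x, 1))) % 10_000  # coarse bucketing
fhat = make_fpi(f, pi, k=500, rng=rng)
print(fhat(np.array([0.3, 0.25])))
```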
To understand this variance bound, observe that for a given data distribution $\mathcal D$, the bound is stronger or weaker depending on how well $\pi$ disperses samples into different buckets in $B$. When there exists some $b \in B$ such that $\Pr_{x \sim \mathcal D}[\pi(x) = b] \approx 1$, we get $\mathrm{variance}(\hat f_{PI}, f, \mathcal D) \lesssim \mathbb{E}_{x \sim \mathcal D}[f(x)(1 - f(x))]$, which essentially tracks the stochasticity of $f$. At the other extreme, when $\Pr_{x \sim \mathcal D}[\pi(x) = b] = 1/|B|$ for all $b \in B$, we get $\mathrm{variance}(\hat f_{PI}, f, \mathcal D) \lesssim 1/|B|$.

As the authors pointed out (but did not formalize), $\hat f_{PI}$ does not preserve pairwise fairness in general. We make this observation precise by showing that it is always possible to design a dataset, of any desired size, such that the pairwise-independent derandomization treats every pair of points unfairly for nearly any $\beta < 1/2$.

**Proposition 2.2** (Unfairness of pairwise-independent derandomization). For every $N \ge 2$, $\alpha \ge 1$, $\beta < \frac{1}{2} - \frac{1}{2k}$, and metric $d : \mathbb{R}^n \times \mathbb{R}^n \to [0, 1]$, there exist a set $X \subseteq \mathbb{R}^n$ of size $N$ and a stochastic classifier $f : X \to [0, 1]$ such that the following hold:

1. $f$ is nontrivial and $(1, 0, d)$-fair.
2. $\mathcal F_{PI}$ violates $(\alpha, \beta, d)$-metric fairness for every pair $(x, x') \in X^2$, $x \ne x'$.

If $k$ is not too small, this says that derandomizing using pairwise-independent hashing is almost maximally unfair, as a uniform random binary function $\hat g : X \to \{0, 1\}$ satisfies $\mathbb{E}[|\hat g(x) - \hat g(x')|] = 1/2$, and therefore achieves $\beta = 1/2$.

*Proof sketch of Proposition 2.2.* Consider any $\alpha \ge 1$, $\beta \in \left[ 0, \frac{1}{2} - \frac{1}{2k} \right)$, and $N \ge 2$. We choose $X$ to be some set of $N$ points on a sufficiently small sphere about the origin, and let $f$ be a classifier that maps half of the points in $X$ to $\frac{1 + \epsilon}{2}$ and the other half to $\frac{1 - \epsilon}{2}$. When $\epsilon > 0$ is sufficiently small, it can be shown that $f$ is $(1, 0, d)$-fair over $X$. However, $\mathcal F_{PI}$ is not $(\alpha, \beta, d)$-fair on any point pair $(x, x') \in X^2$. The reason is that since $f$ is almost maximally stochastic (i.e. $f(x) \approx 1/2$ for all $x$), and $H_{PI}$ is pairwise-independent, the binary outputs $\hat f(x)$ and $\hat f(x')$ are about as likely to be the same as they are to be different. Hence $\mathbb{E}_{\hat f \sim \mathcal F_{PI}}[|\hat f(x) - \hat f(x')|] \approx 1/2$, violating $(\alpha, \beta, d)$-metric fairness. See Appendix A.1 for the full proof.

### 2.2. Random Threshold Classifier

It turns out that there is a near-trivial derandomization that achieves optimal preservation of metric fairness, namely the following random threshold classifier family:

$$\mathcal F_{RT} := \left\{ \hat f_r \mid r \in [0, 1] \right\}, \quad \text{where} \quad \hat f_r(x) := \mathbb{1}\{ f(x) \ge r \}. \tag{5}$$

Formally we make the following observation, whose proof is in Appendix A.2.

**Proposition 2.3** (Random threshold derandomization guarantees). Let $f$ be an $(\alpha, \beta, d)$-fair stochastic classifier and $\mathcal D$ a distribution over $X$. Then the deterministic classifier family $\mathcal F_{RT}$ is also $(\alpha, \beta, d)$-fair. Moreover,

$$\mathrm{bias}(\hat f_{RT}, f, \mathcal D) = 0, \qquad \mathrm{variance}(\hat f_{RT}, f, \mathcal D) \le \mathbb{E}_{x \sim \mathcal D}\left[ f(x)(1 - f(x)) \right].$$

Note that while this derandomization preserves the original fairness parameters perfectly, its variance can be substantially higher than that of $\mathcal F_{PI}$, depending on the choice of bucketing function $\pi$ in Equation (3).
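By contrast with the pairwise-independent sketch above, a sketch of the random threshold family of Equation (5) is only a few lines. Here the threshold is drawn at a finite precision `eps`, anticipating the sampleability subtlety discussed next; all names are illustrative.

```python
import numpy as np

def make_frt(f, eps=1e-3, rng=None):
    """One deterministic classifier from F_RT (Equation (5)):
    a single threshold r, shared by every input x, drawn uniformly
    at precision eps (so only O(log(1/eps)) random bits are needed)."""
    rng = rng or np.random.default_rng()
    r = rng.integers(1, int(1 / eps) + 1) * eps  # r in {eps, 2*eps, ..., 1}
    return lambda x: int(f(x) >= r)

# The same r is reused across all inputs, which is exactly what makes
# F_RT fairness-preserving: if f(x) and f(x') are close, the events
# {f(x) >= r} and {f(x') >= r} differ only when r lands between them.
```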
One subtlety here is that $\mathcal F_{RT}$ is an infinite set, and is therefore not sampleable in practice. For the more realistic scenario in which the threshold $r$ is a number of some fixed precision $\epsilon > 0$, the statements in Proposition 2.3 hold up to additive error $\epsilon$, and $\hat f_{RT}$ can be sampled using $O(\log(1/\epsilon))$ uniform random bits. In this case $\mathcal F_{RT}$ is $(\alpha, \beta + \epsilon, d)$-fair, and as we can show, this additive loss is in fact necessary:

**Proposition 2.4** ($(\alpha, 0, d)$-metric fairness is impossible for finite deterministic families). Let $d : X \times X \to [0, 1]$ be a metric over a convex set $X \subseteq \mathbb{R}^n$, and let $\mathcal F$ be a finite family of deterministic classifiers, at least one of which is nontrivial. Then for every $\alpha \ge 1$ and $\beta < 1/|\mathcal F|$, $\mathcal F$ is not $(\alpha, \beta, d)$-fair.

*Proof sketch.* Since $\mathcal F$ contains a nontrivial classifier $\hat f$, we can pick sufficiently close points around a discontinuity of $\hat f$ and show that in expectation, $\mathcal F$ fails to achieve roughly $(\alpha, 1/|\mathcal F|, d)$-fairness on this point pair. See Appendix A.3 for details.

The main consequence is that there is an irreducible amount of additive unfairness $\beta > 0$ that cannot be avoided when constructing a fair deterministic classifier family. Indeed, the derandomization $\mathcal F$ we present in Section 3 has $|\mathcal F| \ge 1/\beta$, thus avoiding the impossible regime indicated by Proposition 2.4.

## 3. Fair Derandomization via Locality-Sensitive Hashing

In this section, we construct a deterministic classifier family that combines much of the appeal of both the pairwise-independent derandomization (low output variance) and the random threshold derandomization (strong fairness preservation). This new approach utilizes two types of hashing: first, a pairwise-independent hash family $H_{PI}$ as before; and second, a locality-sensitive hash family. (We use the definition of LSH as coined by Charikar (Charikar, 2002). See (Indyk & Motwani, 1998) for an alternative gap-based definition in the same spirit.)

**Definition 3.1** (Locality-sensitive hash (LSH) family). Let $X$ be a set of hashable items, $B$ a set of buckets, and $d : X^2 \to [0, 1]$ a metric distance function. We say a set $H_{LS}$ of functions $h : X \to B$ is a locality-sensitive family of hash functions for $d$ if for all $x, x' \in X$,

$$\Pr_{h \sim H_{LS}}\left[ h(x) = h(x') \right] = 1 - d(x, x').$$

Locality-sensitive hashing is a well-studied technique, and LSH families have been constructed for many standard distances and similarities, such as $L_1$ (Indyk & Motwani, 1998), $L_2$ (Andoni & Indyk, 2006), cosine (Charikar, 2002), Jaccard (Broder, 1997), various data-dependent metrics (Jain et al., 2008; Andoni et al., 2014; Andoni & Razenshteyn, 2015), and more.

Our derandomization works as follows: suppose $f : X \to [0, 1]$ is a stochastic classifier, $H_{LS}$ is a family of locality-sensitive hash functions $h_{LS} : X \to B$, and $H_{PI}$ is a family of pairwise-independent hash functions $h_{PI} : B \to [k]$ for some positive integer $k$. Our family of deterministic classifiers is then

$$\mathcal F_{LS} := \left\{ \hat f_{h_{LS}, h_{PI}} \mid h_{LS} \in H_{LS}, \; h_{PI} \in H_{PI} \right\}, \quad \text{where} \quad \hat f_{h_{LS}, h_{PI}}(x) := \mathbb{1}\left\{ f(x) \ge \frac{h_{PI}(h_{LS}(x))}{k} \right\}. \tag{6}$$

Let us develop some intuition for this construction. First, thinking of $k$ as large, each $\hat f \in \mathcal F_{LS}$ essentially assigns a pseudo-random threshold $\frac{h_{PI}(h_{LS}(x))}{k} \in [0, 1]$ to each input $x$, so that $\hat f(x) = 1$ if and only if $f(x)$ exceeds the threshold. Since the outer hash function $h_{PI}$ is pairwise-independent, and therefore also uniform, $h_{PI}(h_{LS}(\cdot))$ is uniform over $[k]$. This endows $\mathcal F_{LS}$ with low bias and variance with respect to $f$, as we explain in Section 3.1.

Second, the composition of two different hash functions gives us our fairness guarantee: $h_{LS}$ maps close point pairs $x, x'$ to the same bucket, then $h_{PI}$ disperses pairs that were not hashed together, most of which are distant. This separation of point pairs by distance is precisely what enables good preservation of metric fairness, as we prove in Section 3.2.
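The following sketch instantiates Equation (6) with random-hyperplane (SimHash) hashing for the angular metric $d(x, x') = \mathrm{angle}(x, x') / \pi$ (Charikar, 2002). One caveat: concatenating $m$ hyperplane bits to enlarge the bucket set, as done here for illustration, yields an LSH family for the amplified metric $1 - (1 - d)^m$ rather than for $d$ itself. All parameter choices and names are our own assumptions, not the authors' implementation.

```python
import numpy as np

P = 2_147_483_647  # Mersenne prime for the Carter-Wegman outer hash

def make_fls(f, dim, k=500, m=8, rng=None):
    """One deterministic classifier from F_LS (Equation (6))."""
    rng = rng or np.random.default_rng()
    W = rng.standard_normal((m, dim))      # inner hash h_LS: m hyperplanes
    a = rng.integers(1, P)                 # outer hash h_PI (approximately
    c = rng.integers(0, P)                 # pairwise independent)

    def fhat(x):
        bits = (W @ x >= 0).astype(int)    # h_LS(x): bucket id in B = [2^m]
        b = int(bits @ (1 << np.arange(m)))
        h = ((a * b + c) % P) % k + 1      # h_PI(h_LS(x)), uniform over [k]
        return int(f(x) >= h / k)          # Equation (6)

    return fhat

# Close pairs (small angle) usually share a bucket, hence share the same
# pseudo-random threshold h/k and the same prediction; distant pairs are
# dispersed by the outer hash, mimicking the pairwise-independent scheme.
rng = np.random.default_rng(1)
f = lambda x: 1 / (1 + np.exp(-x.sum()))   # any stochastic classifier
fhat = make_fls(f, dim=2, rng=rng)
print(fhat(np.array([1.0, 0.2])), fhat(np.array([1.0, 0.21])))
```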
### 3.1. Approximation of Outputs

We show the following bounds on the bias and variance of our derandomization. The proof is deferred to Appendix A.4.

**Theorem 3.2** (Bias and variance of derandomized classifier). Let $f$ be a stochastic classifier, $\hat f \sim \mathcal F_{LS}$, and $\mathcal D$ a distribution over $X$. Then

$$\mathrm{bias}(\hat f, f, \mathcal D) \le \frac{1}{k}, \qquad \mathrm{variance}(\hat f, f, \mathcal D) \le \mathbb{E}_{h_{LS} \sim H_{LS}}\left[ \max_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b] \right] \cdot \mathbb{E}_{x \sim \mathcal D}\left[ f(x)(1 - f(x)) \right] + \frac{1}{k}.$$

The above variance bound is similar in form to that of the pairwise-independent derandomization (Theorem 2.1), but with added randomization over the sampling of the locality-sensitive hash function: when most choices of $h_{LS}$ distribute points $x \sim \mathcal D$ into buckets relatively evenly, the bound is as small as $O(1/|B|)$; when most hashes are collisions, the bound may be as large as $\mathbb{E}_{x \sim \mathcal D}[f(x)(1 - f(x))]$, essentially tracking the stochasticity of $f$.

### 3.2. Preservation of Metric Fairness

We can now show that our derandomization procedure approximately preserves metric fairness, both in the sense of expected fairness for any pair of points (the usual convention in the metric fairness literature), as well as in aggregate over all point pairs.

**Theorem 3.3** (Locality-sensitive derandomization preserves metric fairness). Let $f$ be an $(\alpha, \beta, d)$-fair stochastic classifier, where $d$ is a metric with an LSH family $H_{LS}$, and suppose $k \ge 2/\epsilon$. Then $\mathcal F_{LS}$ is a deterministic classifier family satisfying the following:

(Pairwise fairness) Consider any $x, x' \in X$, and assume without loss of generality that $f(x) \le f(x')$. Then

$$\mathbb{E}_{\hat f \sim \mathcal F_{LS}}\left[ |\hat f(x) - \hat f(x')| \right] \le \left[ \alpha + 2 f(x)(1 - f(x')) \right] d(x, x') + \beta + \epsilon.$$

(Aggregate fairness) For any distance threshold $\tau \in [0, 1]$ and any $\delta \in (0, 1]$, with probability at least $1 - \delta$ over the sampling of $\hat f$,

$$\Pr_{(x, x') \sim X^2_{\le \tau}}\left[ \hat f(x) \ne \hat f(x') \right] \le \left( 1 + \frac{1}{\sqrt{\delta}} \right) \left( \left[ \alpha + 2 f(x)(1 - f(x')) \right] \tau + \beta + \epsilon \right).$$

The above fairness guarantees can be simplified by noticing that since $f(x) \le f(x')$ w.l.o.g., we have $f(x)(1 - f(x')) \le f(x')(1 - f(x')) \le 1/4$; this yields the following worst-case bounds over $f$ and $(x, x')$:

**Corollary 3.4** (Worst-case fairness). When $f$ is $(\alpha, \beta, d)$-fair, $\mathcal F_{LS}$ satisfies the following:

(Pairwise fairness) $\left( \alpha + \frac{1}{2}, \beta + \epsilon, d \right)$-metric fairness on any $(x, x') \in X^2$, i.e.

$$\mathbb{E}_{\hat f \sim \mathcal F_{LS}}\left[ |\hat f(x) - \hat f(x')| \right] \le \left( \alpha + \frac{1}{2} \right) d(x, x') + \beta + \epsilon.$$

(Aggregate fairness) For any distance threshold $\tau \in [0, 1]$ and any $\delta \in (0, 1]$, with probability at least $1 - \delta$ over the sampling of $\hat f$,

$$\Pr_{(x, x') \sim X^2_{\le \tau}}\left[ \hat f(x) \ne \hat f(x') \right] \le \left( 1 + \frac{1}{\sqrt{\delta}} \right) \left( \left( \alpha + \frac{1}{2} \right) \tau + \beta + \epsilon \right).$$

In expectation and with high probability, therefore, the generated deterministic classifier approximates the fairness guarantee of the original classifier to within a small constant factor whenever there exists an LSH family $H$ for $d$. To get a better sense of what kind of guarantees this gives us, consider the following example:

**Example 3.5.** Let $f$ be a $(1, 0, d)$-fair stochastic classifier, and suppose we derandomize it to some $\hat f \in \mathcal F_{LS}$, choosing $k = 500$. Then by Corollary 3.4,

(Pairwise fairness) $\hat f$ is $(3/2, \epsilon, d)$-metric fair, with $\epsilon = 2/k = 1/250$.

(Aggregate fairness) With probability at least $1 - \delta = 3/4$ (over the sampling of $\hat f$), at least 76% of point pairs within distance $\tau = 1/20$ receive identical predictions.
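For concreteness, the aggregate figure in Example 3.5 can be verified by direct substitution into the aggregate bound of Corollary 3.4 as stated above:

```latex
% Worked check of Example 3.5 via Corollary 3.4:
% alpha = 1, beta = 0, k = 500 => epsilon = 2/k = 1/250,
% delta = 1/4, tau = 1/20.
\[
\left(1 + \tfrac{1}{\sqrt{\delta}}\right)
\left(\left(\alpha + \tfrac{1}{2}\right)\tau + \beta + \epsilon\right)
= (1 + 2)\left(\tfrac{3}{2}\cdot\tfrac{1}{20} + 0 + \tfrac{1}{250}\right)
= 3 \times 0.079 = 0.237 \le 24\%,
\]
% so with probability at least 3/4, at most 24% of tau-close pairs are
% split, i.e. at least 76% receive identical predictions.
```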
We present a sketch of the proof of Theorem 3.3; see Appendix A.5 for the complete proof.

*Proof sketch of Theorem 3.3.* Consider any $x, x' \in X$. Since $\hat f$ is binary and $H_{LS}$ is locality-sensitive,

$$\begin{aligned} \mathbb{E}_{\hat f}\left[ |\hat f(x) - \hat f(x')| \right] &= \Pr_{h_{LS} \sim H_{LS}, \, h_{PI} \sim H_{PI}}\left[ \hat f(x) \ne \hat f(x') \right] \\ &= \Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) \ne \hat f(x') \mid h_{LS}(x) = h_{LS}(x') \right] (1 - d(x, x')) \\ &\quad + \Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) \ne \hat f(x') \mid h_{LS}(x) \ne h_{LS}(x') \right] d(x, x'). \end{aligned}$$

From here, the proof is a systematic analysis of conditional probabilities. To give some intuition, notice that the event $[\hat f(x) \ne \hat f(x') \mid h_{LS}(x) = h_{LS}(x')]$ occurs precisely when $\frac{h_{PI}(h_{LS}(x))}{k}$ falls between $f(x)$ and $f(x')$; by the uniformity of $H_{PI}$, the probability of this is roughly $|f(x) - f(x')| \le \alpha \, d(x, x') + \beta$. This is one of several cases that use the uniformity and symmetry properties of the composed hash function $h_{PI}(h_{LS}(\cdot))$ to express $|\hat f(x) - \hat f(x')|$ in terms of $|f(x) - f(x')|$. In some cases this is not possible, resulting in an additive $2 f(x)(1 - f(x'))$ loss in $\alpha$.

### 3.3. Sample Complexity

Since the LSH-based derandomization procedure involves sampling two hash functions from $H_{PI}$ and $H_{LS}$, it samples $\hat f$ using $O(\log |B| + \log k + S_d(X, B))$ random bits, where $O(\log |B| + \log k)$ is the number of bits used to sample a pairwise-independent hash function (Rubinfeld, 2012), and $S_d(X, B)$ is the number of random bits required to sample a locality-sensitive hash function for metric $d$ with domain $X$ and range $B$. When the metric is the Euclidean distance, for example, $O(\dim X)$ random bits suffice (Rashtchian, 2019).

## 4. Structural Lemmas for Fair Classifier Derandomization

In this section, we present generic results applicable to all classifier derandomization procedures, as well as unify different definitions of fairness used in this paper and others.

### 4.1. Bias-Variance Decomposition

Up to this point, a stochastic classifier has signified any function $f$ from $X$ to $[0, 1]$; in this sense, it does not necessarily contain any randomness of its own. However, when it comes time to perform a binary decision on some input $x$, $f(x)$ is typically interpreted as the probability of outputting 1, i.e. we use the (truly random) binary function $\mathbb{1}_f(x) \sim \mathrm{Bern}(f(x))$. By how much does this prediction typically differ from that of some pre-sampled deterministic classifier $\hat f$? We show that this error can be decomposed into the bias of $\hat f$ and the variance of both $\hat f$ and $f$:

**Lemma 4.1** (Bias-variance decomposition). Let $f : X \to [0, 1]$ be a stochastic classifier and $\mathcal F$ a deterministic classifier family. Then for any $x \in X$,

$$\mathbb{E}_{f, \hat f}\left[ |\hat f(x) - \mathbb{1}_f(x)| \right] \le \mathrm{bias}(\hat f, \mathbb{1}_f, x) + 2 \left( \mathrm{Var}_f(\mathbb{1}_f(x)) + \mathrm{Var}_{\hat f \sim \mathcal F}(\hat f(x)) \right)^{1/3},$$

where $\mathrm{bias}(\hat f, \mathbb{1}_f, x) := \left| \mathbb{E}_{\hat f}[\hat f(x)] - \mathbb{E}_f[\mathbb{1}_f(x)] \right|$.

We defer the proof to Appendix A.6. For now, let us interpret this decomposition and see how it applies to the derandomization approaches laid out in previous sections. Recall that for all three derandomizations $\mathcal F_{PI}$, $\mathcal F_{RT}$, and $\mathcal F_{LS}$, the bias was either zero or could be made arbitrarily small. As for the variance, we see two types: the first, $\mathrm{Var}_f(\mathbb{1}_f(x))$, is equal to $f(x)(1 - f(x))$, i.e. the variance of a Bernoulli with parameter $f(x)$; it therefore quantifies the inherent stochasticity of the given classifier $f$, over which we have no control. In contrast, the second variance arises from sampling the deterministic classifier $\hat f$, which depends greatly on the procedure being used. Thus a comparison of the expected error of these approaches boils down to this latter variance, for which the pairwise-independent and locality-sensitive hashing approaches compare favorably against the simple random threshold.
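As a quick numerical sanity check of Lemma 4.1 (under the bound as stated above), the following toy computation compares the left- and right-hand sides for the random threshold family on a single point; the setup is our own illustration, and the bound is quite loose here.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x          # toy stochastic classifier on X = [0, 1]
x = 0.3                  # a fixed query point
T = 200_000

# Sample fhat ~ F_RT (fresh threshold r) and the Bernoulli 1_f(x) jointly.
r = rng.random(T)
fhat_x = (f(x) >= r).astype(float)             # fhat(x) per sampled fhat
bern_x = (rng.random(T) < f(x)).astype(float)  # 1_f(x) ~ Bern(f(x))

lhs = np.abs(fhat_x - bern_x).mean()           # E|fhat(x) - 1_f(x)|
bias = abs(fhat_x.mean() - bern_x.mean())      # ~0 for F_RT (Prop. 2.3)
rhs = bias + 2 * (bern_x.var() + fhat_x.var()) ** (1 / 3)
print(f"E|fhat - 1_f| = {lhs:.3f} <= bound = {rhs:.3f}")
```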
### 4.2. Metric Fairness and Threshold Fairness

Friedler, Scheidegger, and Venkatasubramanian (Friedler et al., 2016) propose an alternative threshold-based notion of individual fairness that implements the mantra that similar individuals should receive similar treatment, but only extends this constraint to pairs of inputs within a certain distance of interest:

**Definition 4.2** ($(\sigma, \tau, d)$-threshold fairness). Fix some constants $\sigma, \tau \in (0, 1)$. We say a stochastic classifier $f$ is $(\sigma, \tau, d)$-threshold fair if for all $x, x' \in X$ such that $d(x, x') \le \sigma$, we have $|f(x) - f(x')| \le \tau$. We say a deterministic classifier family $\mathcal F$ is $(\sigma, \tau, d)$-threshold fair if for all $x, x' \in X$ such that $d(x, x') \le \sigma$, we have $\mathbb{E}_{\hat f \sim \mathcal F}[|\hat f(x) - \hat f(x')|] \le \tau$.

Neither metric fairness nor threshold fairness fully subsumes the other. However, we can still show the following algorithmic reduction: if we wish to derandomize a stochastic classifier while preserving threshold fairness, then it suffices to use any procedure that preserves metric fairness. For example, suppose we have a derandomization procedure that worsens the input classifier's fairness parameters $\alpha$ and $\beta$ to at most $a \cdot \alpha$ and $b \cdot \beta$, respectively, for some small constants $a, b \ge 1$. We should also expect this procedure to preserve threshold fairness, within certain parameters related to $a, b$. This is what we prove in the following lemma, but for more general fairness preservation functions:

**Lemma 4.3** (Metric-fair derandomization preserves threshold fairness). Suppose we have a procedure that, given an $(\alpha, \beta, d)$-metric fair stochastic classifier $f$, samples a deterministic classifier $\hat f$ from an $(A(\alpha), B(\beta), d)$-metric fair family $\mathcal F$, for some functions $A, B : \mathbb{R} \to \mathbb{R}$. Then this same procedure also derandomizes any $(\sigma, \tau, d)$-threshold fair stochastic classifier to a deterministic classifier from a $(\sigma, A(0) \cdot \sigma + B(\tau), d)$-threshold fair family.

Applying this to the random threshold and locality-sensitive derandomization procedures yields the following:

**Corollary 4.4** (Threshold fairness-preserving derandomizations). Let $f$ be a $(\sigma, \tau, d)$-threshold fair stochastic classifier. Then:

- The family $\mathcal F_{RT}$ is $(\sigma, \tau, d)$-threshold fair.
- If $d$ is LSHable, the family $\mathcal F_{LS}$, for a choice of $k \ge 4/\sigma$, is $(\sigma, \sigma + \tau, d)$-threshold fair.

The proofs are deferred to Appendix A.7.

### 4.3. Pairwise Fairness and Aggregate Fairness

Throughout most of this paper (and in most of the individual fairness literature), we have been focused on pairwise notions of fairness, such as metric fairness (Definition 1.1) and threshold fairness (Definition 4.2). One shortcoming of these definitions is that even if a classifier satisfies them for any particular pair of points $(x, x')$, they do not hold simultaneously for all input pairs; thus once we sample a specific deterministic classifier $\hat f$, it may be unfair for many pairs. Fortunately, as we now show, these pairwise statements imply high-probability aggregate fairness guarantees: if $\mathcal F$ is a metric-fair family, then most deterministic classifiers in $\mathcal F$ assign most close pairs the same prediction. To that end, for all distances $\tau \in [0, 1]$, let

$$X^2_{\le \tau} := \left\{ (x, x') \in X^2 \mid d(x, x') \le \tau \right\}$$

denote the set of point pairs within distance $\tau$. Then we can bound the fraction of $\tau$-close pairs that receive different predictions:

**Lemma 4.5** (Pairwise fairness implies aggregate fairness). Let $\mathcal F$ be an $(\alpha, \beta, d)$-fair deterministic classifier family. Then for any distance threshold $\tau \in [0, 1]$ and any $\delta \in (0, 1]$, with probability at least $1 - \delta$ over the sampling of $\hat f \sim \mathcal F$,

$$\Pr_{(x, x') \sim X^2_{\le \tau}}\left[ \hat f(x) \ne \hat f(x') \right] \le \left( 1 + \frac{1}{\sqrt{\delta}} \right) (\alpha \tau + \beta).$$

The proof is deferred to Appendix A.8.

### 4.4. Output Approximation and Loss Approximation

In this paper, we have analyzed the output approximation qualities of various derandomization techniques using the definitions of bias and variance in Section 1.2, which say that the output of $\hat f$ should resemble that of $f$, either on a single point $x$ or in aggregate over some distribution $\mathcal D$. An alternative set of definitions of bias and variance, put forth in (Cotter et al., 2019), instead measures how well $\hat f$ preserves the loss of $f$ according to one or more binary loss functions $\ell$.
This property, which we might call loss approximation, is useful since in practice, classifiers are typically compared based on criteria such as accuracy, false positive rate, etc., evaluated on a dataset; and these are essentially binary loss functions averaged over a data distribution.

Concretely, let $\ell : \{0, 1\} \times \{0, 1\} \to \{0, 1\}$ be a loss function and let $(x, y) \in X \times \{0, 1\}$ be an instance with its corresponding label. The loss on this instance incurred by a (stochastic or deterministic) classifier $f$ is defined as

$$L(f, x, y) := f(x) \, \ell(1, y) + (1 - f(x)) \, \ell(0, y).$$

The (pointwise) bias and variance of $\hat f$ under this loss are

$$\mathrm{bias}(\hat f, f, x, y, \ell) := \left| \mathbb{E}_{\hat f \sim \mathcal F}\left[ L(\hat f, x, y) \right] - L(f, x, y) \right|, \qquad \mathrm{variance}(\hat f, x, y, \ell) := \mathrm{Var}_{\hat f \sim \mathcal F}\left( L(\hat f, x, y) \right).$$

We observe that these are closely related to the simpler definitions given in Section 1.2:

**Lemma 4.6.** For any $\ell : \{0, 1\} \times \{0, 1\} \to \{0, 1\}$, $x \in X$, and $y \in \{0, 1\}$,

$$\mathrm{bias}(\hat f, f, x, y, \ell) \le \mathrm{bias}(\hat f, f, x), \qquad \mathrm{variance}(\hat f, x, y, \ell) \le \mathrm{variance}(\hat f, x).$$

Thus even when the goal is to compute a derandomization that simulates the performance of $f$ on one or more binary loss functions, it essentially suffices to use a derandomization that merely simulates the raw output of $f$ itself. See Appendix A.9 for the proof of this lemma.
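A small numerical illustration of Lemma 4.6, using the 0-1 loss $\ell(\hat y, y) = \mathbb{1}\{\hat y \ne y\}$ and the random threshold family on a single labeled point (for this particular $\ell$ the two inequalities hold with equality); the setup is our own toy example.

```python
import numpy as np

def L(p, y, loss):
    """L(f, x, y) with p = f(x): expected loss of outputting 1 w.p. p."""
    return p * loss(1, y) + (1 - p) * loss(0, y)

zero_one = lambda yhat, y: int(yhat != y)      # a binary loss function

rng = np.random.default_rng(0)
fx, y, T = 0.7, 1, 100_000
fhat_x = (fx >= rng.random(T)).astype(float)   # fhat(x) over fhat ~ F_RT

loss_vals = L(fhat_x, y, zero_one)             # L(fhat, x, y), vectorized
# Loss bias (first value) is at most output bias (second value):
print(abs(loss_vals.mean() - L(fx, y, zero_one)), abs(fhat_x.mean() - fx))
# Loss variance is at most output variance:
print(loss_vals.var(), fhat_x.var())
```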
## 5. Discussion

We offer some brief notes regarding practical considerations for our derandomization framework.

**A framework for derandomization.** Our results give machine learning practitioners a time- and space-efficient way to remove randomness (with the inherent brittleness, security vulnerabilities, and other issues that stochasticity entails) from their deployed models, while approximately preserving fairness constraints enforced during training. Notably, our derandomization procedure has the useful quality of being oblivious to $f$, its training process, and even its actual fairness parameters $\alpha$ and $\beta$. It can therefore be applied as an independent post-processing step, for example on the stochastic classifiers generated by the algorithms of (Rothblum & Yona, 2018), (Kim et al., 2018), and others. The burden on the model designer is thus reduced to selecting a metric feature space $(X, d)$ that is both appropriate for the classification task and for which an LSH family exists. This simplification comes with inherent constraints: it was shown in (Charikar, 2002) that only metrics (or similarities $\phi$ whose complement $d = 1 - \phi$ is a metric) can have LSH schemes, though not all of them do. On the positive side, recent work has shown that various non-LSHable similarities can be approximated by LSHable similarities with some provable distortion bound (Chierichetti et al., 2019).

**Separation of feature sets.** Throughout this paper, we have assumed that the inner hash function $h_{LS}$ and classifiers $f$ and $\hat f$ all share the same domain $X$; however, this is in no way necessary. In fact, from a fairness perspective, it is often prudent to distinguish between the features used for ensuring fairness and those used purely for inference, i.e. we may have $f : X \to [0, 1]$, $\hat f : X \to \{0, 1\}$, and $h_{LS} : Z \to B$. The feature set $Z$ should be chosen, in tandem with an appropriate LSHable metric $d : Z^2 \to [0, 1]$, so as to measure similarity or difference between inputs on the basis of attributes that should be treated equitably; on the other hand, the feature set $X$ can be designed primarily to maximize predictive accuracy, and need not have any overlap with $Z$. The fairness guarantees of Theorem 3.3 and Corollary 3.4 then hold with respect to the metric space $(Z, d)$ rather than $(X, d)$.

**Future work: guarantees for protected attributes.** This paper has focused on classifier derandomization with individual fairness guarantees, but it is also worthwhile to investigate the effect of derandomization from a group fairness perspective: for example, whether it is possible to design an LSHable metric such that the derandomization preserves notions of fairness with respect to a protected attribute.

## Acknowledgements

This work is partially supported by the National Science Foundation (NSF) under grants IIS-2007951, IIS-2143895, IIS-2040800 (FAI program in collaboration with Amazon), and CCF-2023495.

## References

Agarwal, A., Beygelzimer, A., Dudík, M., Langford, J., and Wallach, H. A reductions approach to fair classification. In International Conference on Machine Learning, pp. 60-69. PMLR, 2018.

Andoni, A. and Indyk, P. Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS '06), pp. 459-468. IEEE, 2006.

Andoni, A. and Razenshteyn, I. Optimal data-dependent hashing for approximate near neighbors. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pp. 793-801, 2015.

Andoni, A., Indyk, P., Nguyễn, H. L., and Razenshteyn, I. Beyond locality-sensitive hashing. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1018-1028. SIAM, 2014.

Braverman, M. and Garg, S. The role of randomness and noise in strategic classification. In 1st Symposium on Foundations of Responsible Computing (FORC 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.

Broder, A. Z. On the resemblance and containment of documents. In Proceedings. Compression and Complexity of SEQUENCES 1997 (Cat. No. 97TB100171), pp. 21-29. IEEE, 1997.

Brückner, M. and Scheffer, T. Stackelberg games for adversarial prediction problems. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 547-555, 2011.

Cai, Y., Daskalakis, C., and Papadimitriou, C. Optimum statistical estimation with strategic data sources. In Conference on Learning Theory, pp. 280-296. PMLR, 2015.

Charikar, M. S. Similarity estimation techniques from rounding algorithms. In Proceedings of the Thirty-Fourth Annual ACM Symposium on Theory of Computing, pp. 380-388, 2002.

Chen, Y., Podimata, C., Procaccia, A. D., and Shah, N. Strategyproof linear regression in high dimensions. In Proceedings of the 2018 ACM Conference on Economics and Computation, pp. 9-26, 2018.

Chen, Y., Liu, Y., and Podimata, C. Learning strategy-aware linear classifiers. Advances in Neural Information Processing Systems, 33:15265-15276, 2020.

Chen, Y., Wang, J., and Liu, Y. Linear classifiers that encourage constructive adaptation. arXiv preprint arXiv:2011.00355, 2021.

Chierichetti, F., Kumar, R., Panconesi, A., and Terolli, E. On the distortion of locality sensitive hashing. SIAM Journal on Computing, 48(2):350-372, 2019.

Cohen, J., Rosenfeld, E., and Kolter, Z. Certified adversarial robustness via randomized smoothing. In International Conference on Machine Learning, pp. 1310-1320. PMLR, 2019.

Cotter, A., Gupta, M., and Narasimhan, H. On making stochastic classifiers deterministic. Advances in Neural Information Processing Systems (NeurIPS), 2019.

Dietterich, T. G. Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems, pp. 1-15. Springer, 2000.
Dong, J., Roth, A., Schutzman, Z., Waggoner, B., and Wu, Z. S. Strategic classification from revealed preferences. In Proceedings of the 2018 ACM Conference on Economics and Computation, pp. 55-70, 2018.

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214-226, 2012.

Friedler, S. A., Scheidegger, C., and Venkatasubramanian, S. On the (im)possibility of fairness. arXiv preprint arXiv:1609.07236, 2016.

Grgić-Hlača, N., Zafar, M. B., Gummadi, K., and Weller, A. On fairness, diversity, and randomness in algorithmic decision making. In 4th Workshop on Fairness, Accountability, and Transparency in Machine Learning, 2017.

Hardt, M., Megiddo, N., Papadimitriou, C., and Wootters, M. Strategic classification. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science, pp. 111-122, 2016.

Indyk, P. and Motwani, R. Approximate nearest neighbors: towards removing the curse of dimensionality. In Proceedings of the Thirtieth Annual ACM Symposium on Theory of Computing, pp. 604-613, 1998.

Jain, P., Kulis, B., and Grauman, K. Fast image search for learned metrics. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-8. IEEE, 2008.

Kim, M. P., Reingold, O., and Rothblum, G. N. Fairness through computationally-bounded awareness. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 4847-4857, 2018.

Narasimhan, H. Learning with complex loss functions and constraints. In International Conference on Artificial Intelligence and Statistics, pp. 1646-1654. PMLR, 2018.

Pinot, R., Meunier, L., Araujo, A., Kashima, H., Yger, F., Gouy-Pailler, C., and Atif, J. Theoretical evidence for adversarial robustness through randomization. Advances in Neural Information Processing Systems, 32:11838-11848, 2019.

Pinot, R., Ettedgui, R., Rizk, G., Chevaleyre, Y., and Atif, J. Randomization matters: how to defend against strong adversarial attacks. In International Conference on Machine Learning, pp. 7717-7727. PMLR, 2020.

Rashtchian, C. Lecture 09: LSH, 2019. URL: http://madscience.ucsd.edu/notes/lec9.pdf.

Rothblum, G. and Yona, G. Probably approximately metric-fair learning. In International Conference on Machine Learning, pp. 5680-5688. PMLR, 2018.

Rubinfeld, R. MIT 6.842, 2012. URL: https://people.csail.mit.edu/ronitt/COURSE/S12/handouts/lec5.pdf.

Sharifi-Malvajerdi, S., Kearns, M., and Roth, A. Average individual fairness: Algorithms, generalization and experiments. Advances in Neural Information Processing Systems, 32:8242-8251, 2019.

## A. Omitted Proofs

### A.1. Unfairness of Pairwise-Independent Derandomization

*Proof of Proposition 2.2.* For any $\delta > 0$, let $S_\delta := \{ x \in \mathbb{R}^n \mid d(x, 0) = \delta \}$ be the sphere of radius $\delta$ around the origin. Consider any $\alpha \ge 1$ and $\beta \in \left[ 0, \frac{1}{2} - \frac{1}{2k} \right)$, and choose $X$ to be some subset of $S_\delta$ of size $|X| = N$ in which the closest two points are positioned at distance $\epsilon$ from one another, where

$$0 < \epsilon := \min_{x \ne x' \in X} d(x, x') < \frac{1}{2} - \frac{1}{2k} - \beta.$$

Now let $f$ be a classifier that maps half of the points in $X$ to $\frac{1 + \epsilon}{2}$, and the other half to $\frac{1 - \epsilon}{2}$. $f$ is $(1, 0, d)$-fair over $X$, since for any $x, x' \in X$,

$$|f(x) - f(x')| \le \frac{1 + \epsilon}{2} - \frac{1 - \epsilon}{2} = \epsilon \le d(x, x').$$

However, $\mathcal F_{PI}$ is not $(\alpha, \beta, d)$-fair on any point pair.
To see this, consider any $x \ne x' \in X$; we show that for $\hat f \sim \mathcal F_{PI}$, $|\hat f(x) - \hat f(x')|$ is typically large relative to $d(x, x')$. Writing $h(x) := h_{PI}(\pi(x))$,

$$\begin{aligned} \mathbb{E}_{\hat f \sim \mathcal F_{PI}}\left[ |\hat f(x) - \hat f(x')| \right] &= \Pr_{\hat f \sim \mathcal F_{PI}}\left[ \hat f(x) \ne \hat f(x') \right] && (\hat f \in \{0, 1\}) \\ &= \Pr_{\hat f}\left[ \hat f(x) = 1, \hat f(x') = 0 \right] + \Pr_{\hat f}\left[ \hat f(x) = 0, \hat f(x') = 1 \right] \\ &= \Pr\left[ f(x) \ge \tfrac{h(x)}{k} \right] \Pr\left[ f(x') < \tfrac{h(x')}{k} \right] + \Pr\left[ f(x) < \tfrac{h(x)}{k} \right] \Pr\left[ f(x') \ge \tfrac{h(x')}{k} \right] && \text{(by pairwise independence)} \\ &\ge \frac{1 - 2\epsilon + \epsilon^2}{2} - \frac{1}{2k} \ge \frac{1}{2} - \epsilon - \frac{1}{2k}. \end{aligned}$$

The distance between any two points in $S_\delta$, and therefore in $X$, is at most $2\delta$; hence for a choice of

$$\delta \in \left( 0, \; \frac{1/2 - \beta - \epsilon - 1/2k}{2\alpha} \right)$$

(which is possible since $\beta < \frac{1}{2} - \frac{1}{2k}$ and $\epsilon < \frac{1}{2} - \frac{1}{2k} - \beta$), we have

$$\mathbb{E}_{\hat f}\left[ |\hat f(x) - \hat f(x')| \right] \ge \frac{1}{2} - \epsilon - \frac{1}{2k} = 2\alpha \cdot \frac{1/2 - \beta - \epsilon - 1/2k}{2\alpha} + \beta > \alpha \cdot 2\delta + \beta \ge \alpha \, d(x, x') + \beta,$$

which is a violation of $(\alpha, \beta, d)$-metric fairness (Equation (2)) and applies to all pairs $x, x' \in X$. $\square$

### A.2. Random Threshold Derandomization Guarantees

*Proof of Proposition 2.3.* Let $f$ be an $(\alpha, \beta, d)$-fair classifier, and consider any $x, x' \in X$. We have

$$\begin{aligned} \mathbb{E}_{\hat f_r \sim \mathcal F_{RT}}\left[ |\hat f_r(x) - \hat f_r(x')| \right] &= \Pr_{\hat f_r \sim \mathcal F_{RT}}\left[ \hat f_r(x) \ne \hat f_r(x') \right] && (\hat f \in \{0, 1\}) \\ &= \Pr_{\hat f_r}\left[ \hat f_r(x) = 0, \hat f_r(x') = 1 \right] + \Pr_{\hat f_r}\left[ \hat f_r(x) = 1, \hat f_r(x') = 0 \right] \\ &= \Pr_{r \sim [0,1]}\left[ f(x) < r \le f(x') \right] + \Pr_{r \sim [0,1]}\left[ f(x') < r \le f(x) \right] \\ &= |f(x) - f(x')| \le \alpha \, d(x, x') + \beta, && \text{($f$ is $(\alpha, \beta, d)$-fair)} \end{aligned}$$

which shows that $\mathcal F_{RT}$ is also $(\alpha, \beta, d)$-fair. To compute the bias, note that for any $x \in X$,

$$\mathbb{E}_{\hat f_r}\left[ \hat f_r(x) \right] = \Pr_{r \sim [0,1]}\left[ f(x) \ge r \right] = f(x), \tag{8}$$

which implies $\mathrm{bias}(\hat f_r, f, x) = 0$ for all $x$, and hence $\mathrm{bias}(\hat f_r, f, \mathcal D) = 0$ for all $\mathcal D$. Finally, for the variance, we have

$$\begin{aligned} \mathrm{variance}(\hat f_r, \mathcal D) &:= \mathrm{Var}_{\hat f_r \sim \mathcal F_{RT}}\left( \mathbb{E}_{x \sim \mathcal D}[\hat f_r(x)] \right) \\ &= \mathbb{E}_{r \sim [0,1]}\left[ \left( \mathbb{E}_{x \sim \mathcal D}[\hat f_r(x)] \right)^2 \right] - \left( \mathbb{E}_{r \sim [0,1]} \mathbb{E}_{x \sim \mathcal D}\left[ \hat f_r(x) \right] \right)^2 \\ &= \mathbb{E}_{x, x' \sim \mathcal D}\left[ \mathbb{E}_{r \sim [0,1]}\left[ \hat f_r(x) \hat f_r(x') \right] - \mathbb{E}_{r \sim [0,1]}\left[ \hat f_r(x) \right] \mathbb{E}_{r \sim [0,1]}\left[ \hat f_r(x') \right] \right] \\ &= \mathbb{E}_{x, x' \sim \mathcal D}\left[ \mathrm{Cov}_{r \sim [0,1]}\left( \hat f_r(x), \hat f_r(x') \right) \right] \\ &\le \mathbb{E}_{x, x' \sim \mathcal D}\left[ \sqrt{ \mathrm{Var}_{r \sim [0,1]}\left( \hat f_r(x) \right) \mathrm{Var}_{r \sim [0,1]}\left( \hat f_r(x') \right) } \right] && \text{(Cauchy-Schwarz inequality)} \\ &\le \mathbb{E}_{x \sim \mathcal D}\left[ \mathrm{Var}_{r \sim [0,1]}\left( \hat f_r(x) \right) \right] && \text{(Jensen's inequality)} \\ &= \mathbb{E}_{x \sim \mathcal D}\left[ \mathbb{E}_r[\hat f_r(x)] \left( 1 - \mathbb{E}_r[\hat f_r(x)] \right) \right] && (\hat f_r(x) \in \{0, 1\}) \\ &= \mathbb{E}_{x \sim \mathcal D}\left[ f(x)(1 - f(x)) \right], && \text{(Equation (8))} \end{aligned}$$

as required. $\square$

### A.3. Perfect Deterministic Fairness is Impossible for Finite Families

*Proof of Proposition 2.4.* Consider any $\alpha \ge 1$ and $\beta \in (0, 1/|\mathcal F|)$; it suffices to exhibit a pair of points $x, x' \in X$ such that

$$\mathbb{E}_{\hat f \sim \mathcal F}\left[ |\hat f(x) - \hat f(x')| \right] > \alpha \, d(x, x') + \beta.$$

For any $\delta > 0$, define the ball of radius $\delta$ around $x$ to be $B_\delta(x) := \{ x' \in X \mid d(x, x') \le \delta \}$. By assumption, $\mathcal F$ contains at least one nontrivial classifier (i.e. one function that is not identically 1 or 0); let $\hat f$ be one such classifier. Since $X \subseteq \mathbb{R}^n$ is convex and $d$ is a metric, $\hat f$ must be discontinuous at some point $x \in X$, meaning that for all $\delta > 0$, there exists $x' \in B_\delta(x)$ such that $\hat f(x') = 1 - \hat f(x)$. Choose any $\delta' \in \left( 0, \frac{1/|\mathcal F| - \beta}{\alpha} \right)$, and consider some such $x' \in B_{\delta'}(x)$. We have

$$\begin{aligned} \mathbb{E}_{\hat f \sim \mathcal F}\left[ |\hat f(x) - \hat f(x')| \right] &\ge \frac{1}{|\mathcal F|} && \text{(at least one function in $\mathcal F$ is discontinuous at $x$)} \\ &= \alpha \cdot \frac{1/|\mathcal F| - \beta}{\alpha} + \beta \\ &> \alpha \, \delta' + \beta && \left( \delta' < \tfrac{1/|\mathcal F| - \beta}{\alpha} \right) \\ &\ge \alpha \, d(x, x') + \beta, && (x' \in B_{\delta'}(x)) \end{aligned}$$

which shows that $\mathcal F$ is not $(\alpha, \beta, d)$-fair. $\square$

### A.4. Output Approximation of Locality-Sensitive Derandomization

*Proof of Theorem 3.2.* We will repeatedly use the following fact: by the uniformity of $H_{PI}$, for all $0 \le a < b \le 1$ and $x \in X$ we have

$$\Pr_{h_{LS} \sim H_{LS}, \, h_{PI} \sim H_{PI}}\left[ a < \frac{h_{PI}(h_{LS}(x))}{k} \le b \right] \in \left[ b - a - \frac{1}{k}, \; b - a + \frac{1}{k} \right]. \tag{9}$$

Thus for all $x \in X$,

$$\mathbb{E}_{\hat f}\left[ \hat f(x) \right] = \Pr_{\hat f \sim \mathcal F_{LS}}\left[ \hat f(x) = 1 \right] = \Pr_{h_{LS}, h_{PI}}\left[ \frac{h_{PI}(h_{LS}(x))}{k} \le f(x) \right] \in \left[ f(x) - \frac{1}{k}, \; f(x) + \frac{1}{k} \right],$$

which implies $\mathrm{bias}(\hat f, f, x) \le \frac{1}{k}$ for all $x \in X$, and hence $\mathrm{bias}(\hat f, f, \mathcal D) \le \frac{1}{k}$ for all $\mathcal D$.

Now we bound the variance. Define the bucketed stochastic classifier

$$g(x) := \frac{1}{k} \sum_{i=1}^{k} \mathbb{1}\left\{ f(x) \ge \frac{i - 1}{k} \right\}.$$

In other words, $g(x)$ is the smallest multiple of $1/k$ greater than $f(x)$. Note that $|g(x) - f(x)| \le \frac{1}{k}$ for all $x$. Additionally, define the deterministic classifier family $\mathcal G_{LS}$ from $g$ just as $\mathcal F_{LS}$ was defined from $f$ in Equation (6), i.e.
$$\mathcal G_{LS} := \left\{ \hat g_{h_{LS}, h_{PI}} \mid h_{LS} \in H_{LS}, \; h_{PI} \in H_{PI} \right\}, \quad \text{where} \quad \hat g_{h_{LS}, h_{PI}}(x) := \mathbb{1}\left\{ g(x) \ge \frac{h_{PI}(h_{LS}(x))}{k} \right\}.$$

It essentially suffices to analyze $\hat g$ instead of $\hat f$, since in the end, we simply incur an additional bias or variance of $\frac{1}{k}$. To begin, observe that for any distribution $\mathcal D$ over $X$,

$$\mathrm{variance}(\hat f, f, \mathcal D) = \mathrm{variance}(\hat g, g, \mathcal D) := \mathrm{Var}_{\hat g \sim \mathcal G_{LS}}\left( \mathbb{E}_{x \sim \mathcal D}[\hat g(x)] \right) = \mathbb{E}_{h_{LS}, h_{PI}}\left[ \left( \mathbb{E}_{x \sim \mathcal D}[\hat g(x)] \right)^2 \right] - \left( \mathbb{E}_{x \sim \mathcal D}[g(x)] \right)^2,$$

where we use the fact that $\mathbb{E}_{h_{LS}, h_{PI}}[\hat g(x)] = g(x)$ exactly, since $g(x)$ is a multiple of $1/k$. To evaluate the first term, note that for any $x, x' \in X$,

$$\begin{aligned} \mathbb{E}_{h_{LS}, h_{PI}}\left[ \hat g(x) \hat g(x') \right] &= \mathbb{E}_{h_{LS}}\Big[ \mathbb{E}_{h_{PI}}\left[ \mathbb{1}\{ h_{LS}(x) = h_{LS}(x') \} \, \hat g(x) \hat g(x') \right] + \mathbb{E}_{h_{PI}}\left[ \mathbb{1}\{ h_{LS}(x) \ne h_{LS}(x') \} \, \hat g(x) \hat g(x') \right] \Big] \\ &= \mathbb{E}_{h_{LS}}\Big[ \mathbb{E}_{h_{PI}}\left[ \mathbb{1}\{ h_{LS}(x) = h_{LS}(x') \} \, \hat g(x) \hat g(x') \right] + \mathbb{1}\{ h_{LS}(x) \ne h_{LS}(x') \} \, g(x) g(x') \Big]. && \text{(pairwise independence)} \end{aligned}$$

Thus, combining the two terms of the variance,

$$\begin{aligned} \mathrm{variance}(\hat f, f, \mathcal D) &= \mathbb{E}_{h_{LS}} \mathbb{E}_{x, x' \sim \mathcal D}\left[ \mathbb{1}\{ h_{LS}(x) = h_{LS}(x') \} \left( \mathbb{E}_{h_{PI}}[\hat g(x) \hat g(x')] - g(x) g(x') \right) \right] \\ &= \mathbb{E}_{h_{LS}} \mathbb{E}_{x, x'}\left[ \mathbb{1}\{ h_{LS}(x) = h_{LS}(x') \} \, \mathrm{Cov}_{h_{PI}}\left( \hat g(x), \hat g(x') \right) \right] \\ &\le \mathbb{E}_{h_{LS}} \mathbb{E}_{x, x'}\left[ \mathbb{1}\{ h_{LS}(x) = h_{LS}(x') \} \sqrt{ \mathrm{Var}_{h_{PI}}(\hat g(x)) \, \mathrm{Var}_{h_{PI}}(\hat g(x')) } \right] && \text{(Cauchy-Schwarz inequality)} \\ &= \mathbb{E}_{h_{LS}}\left[ \sum_{b \in B} \left( \mathbb{E}_{x \sim \mathcal D}\left[ \mathbb{1}\{ h_{LS}(x) = b \} \sqrt{ \mathrm{Var}_{h_{PI}}(\hat g(x)) } \right] \right)^2 \right] \\ &= \mathbb{E}_{h_{LS}}\left[ \sum_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b]^2 \left( \mathbb{E}_{x \sim \mathcal D}\left[ \sqrt{ \mathrm{Var}_{h_{PI}}(\hat g(x)) } \;\middle|\; h_{LS}(x) = b \right] \right)^2 \right] \\ &\le \mathbb{E}_{h_{LS}}\left[ \sum_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b]^2 \, \mathbb{E}_{x \sim \mathcal D}\left[ \mathrm{Var}_{h_{PI}}(\hat g(x)) \;\middle|\; h_{LS}(x) = b \right] \right] && \text{(Jensen's inequality)} \\ &= \mathbb{E}_{h_{LS}}\left[ \sum_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b]^2 \, \mathbb{E}_{x \sim \mathcal D}\left[ g(x)(1 - g(x)) \mid h_{LS}(x) = b \right] \right] \\ &\le \mathbb{E}_{h_{LS}}\left[ \max_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b] \sum_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b] \, \mathbb{E}_{x \sim \mathcal D}\left[ g(x)(1 - g(x)) \mid h_{LS}(x) = b \right] \right] \\ &= \mathbb{E}_{h_{LS}}\left[ \max_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b] \right] \cdot \mathbb{E}_{x \sim \mathcal D}\left[ g(x)(1 - g(x)) \right] \\ &\le \mathbb{E}_{h_{LS}}\left[ \max_{b \in B} \Pr_{x \sim \mathcal D}[h_{LS}(x) = b] \right] \cdot \mathbb{E}_{x \sim \mathcal D}\left[ f(x)(1 - f(x)) \right] + \frac{1}{k}. && (|g(x) - f(x)| \le \tfrac{1}{k} \text{ for all } x) \quad \square \end{aligned}$$

### A.5. Fairness of LSH-Based Derandomization

*Proof of Theorem 3.3.* We first prove pairwise metric fairness. Consider any $x, x' \in X$, and assume without loss of generality that $f(x) \le f(x')$. We have

$$\begin{aligned} \mathbb{E}_{\hat f}\left[ |\hat f(x) - \hat f(x')| \right] &= \Pr_{h_{LS} \sim H_{LS}, \, h_{PI} \sim H_{PI}}\left[ \hat f(x) \ne \hat f(x') \right] && (\hat f \in \{0, 1\}) \\ &= \underbrace{\Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) \ne \hat f(x') \mid h_{LS}(x) = h_{LS}(x') \right]}_{p_1} \Pr_{h_{LS}}\left[ h_{LS}(x) = h_{LS}(x') \right] \\ &\quad + \underbrace{\Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) \ne \hat f(x') \mid h_{LS}(x) \ne h_{LS}(x') \right]}_{p_2} \Pr_{h_{LS}}\left[ h_{LS}(x) \ne h_{LS}(x') \right]. \end{aligned} \tag{11}$$

We evaluate $p_1$ and $p_2$ separately. First, noting that a pairwise-independent hash family is also uniform, we have

$$\begin{aligned} \Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) = 0, \hat f(x') = 1 \mid h_{LS}(x) = h_{LS}(x') \right] &= \Pr_{h_{LS}, h_{PI}}\left[ f(x) < \frac{h_{PI}(h_{LS}(x))}{k}, \; f(x') \ge \frac{h_{PI}(h_{LS}(x'))}{k} \;\middle|\; h_{LS}(x) = h_{LS}(x') \right] \\ &= \Pr_{h_{LS}, h_{PI}}\left[ f(x) < \frac{h_{PI}(h_{LS}(x))}{k} \le f(x') \;\middle|\; h_{LS}(x) = h_{LS}(x') \right] \\ &= \Pr_{h_{LS}, h_{PI}}\left[ f(x) < \frac{h_{PI}(h_{LS}(x))}{k} \le f(x') \right]. && \text{($h_{PI}$ is uniform)} \end{aligned}$$

By symmetry, $\Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) = 1, \hat f(x') = 0 \mid h_{LS}(x) = h_{LS}(x') \right] = \Pr_{h_{LS}, h_{PI}}\left[ f(x') < \frac{h_{PI}(h_{LS}(x))}{k} \le f(x) \right]$; but this equals zero, since $f(x) \le f(x')$.
Thus

$$p_1 = \Pr_{h_{LS}, h_{PI}}\left[ f(x) < \frac{h_{PI}(h_{LS}(x))}{k} \le f(x') \right] \le |f(x) - f(x')| + \frac{2}{k}. \qquad \text{(by Equation (9))}$$

Next, to compute $p_2$, we have

$$\begin{aligned} \Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) = 1, \hat f(x') = 0 \mid h_{LS}(x) \ne h_{LS}(x') \right] &= \Pr_{h_{LS}, h_{PI}}\left[ f(x) \ge \frac{h_{PI}(h_{LS}(x))}{k}, \; f(x') < \frac{h_{PI}(h_{LS}(x'))}{k} \;\middle|\; h_{LS}(x) \ne h_{LS}(x') \right] \\ &= \Pr\left[ f(x) \ge \frac{h_{PI}(h_{LS}(x))}{k} \;\middle|\; h_{LS}(x) \ne h_{LS}(x') \right] \Pr\left[ f(x') < \frac{h_{PI}(h_{LS}(x'))}{k} \;\middle|\; h_{LS}(x) \ne h_{LS}(x') \right] && \text{($h_{PI}$ is pairwise independent)} \\ &\le f(x)(1 - f(x')) + \frac{1}{k}, && \text{($h_{PI}$ is uniform)} \end{aligned}$$

and by symmetry, $\Pr_{h_{LS}, h_{PI}}\left[ \hat f(x) = 0, \hat f(x') = 1 \mid h_{LS}(x) \ne h_{LS}(x') \right] \le (1 - f(x)) f(x') + \frac{1}{k}$. Hence

$$p_2 \le f(x) - 2 f(x) f(x') + f(x') + \frac{2}{k} = |f(x) - f(x')| + 2 f(x)(1 - f(x')) + \frac{2}{k},$$

using $f(x) \le f(x')$ in the last step. Substituting $p_1$ and $p_2$ back into Equation (11) yields

$$\begin{aligned} \mathbb{E}_{h_{LS}, h_{PI}}\left[ |\hat f(x) - \hat f(x')| \right] &= p_1 \Pr_{h_{LS}}\left[ h_{LS}(x) = h_{LS}(x') \right] + p_2 \Pr_{h_{LS}}\left[ h_{LS}(x) \ne h_{LS}(x') \right] \\ &\le |f(x) - f(x')| \, (1 - d(x, x')) + \left( |f(x) - f(x')| + 2 f(x)(1 - f(x')) \right) d(x, x') + \frac{2}{k} && \text{($h_{LS}$ is LSH)} \\ &= |f(x) - f(x')| + 2 f(x)(1 - f(x')) \, d(x, x') + \frac{2}{k} \\ &\le \alpha \, d(x, x') + \beta + 2 f(x)(1 - f(x')) \, d(x, x') + \frac{2}{k} && \text{($f$ is $(\alpha, \beta, d)$-fair)} \\ &\le \left[ \alpha + 2 f(x)(1 - f(x')) \right] d(x, x') + \beta + \epsilon, && (k \ge 2/\epsilon) \end{aligned}$$

which proves the pairwise fairness bound. The aggregate fairness bound then follows from Lemma 4.5. $\square$

### A.6. Bias-Variance Decomposition

*Proof of Lemma 4.1.* Write $\Delta := \hat f(x) - \mathbb{1}_f(x)$, where the randomness is over both the sampling of $\hat f \sim \mathcal F$ and the Bernoulli draw $\mathbb{1}_f(x)$, which are independent. For any $c > 0$, we have, with probability at least $1 - 1/c^2$ by Chebyshev's inequality,

$$\begin{aligned} |\Delta| &\le \left| \mathbb{E}_{f, \hat f}[\Delta] \right| + \left| \Delta - \mathbb{E}_{f, \hat f}[\Delta] \right| \\ &\le \mathrm{bias}(\hat f, \mathbb{1}_f, x) + c \sqrt{ \mathrm{Var}_{f, \hat f}(\Delta) } \\ &= \mathrm{bias}(\hat f, \mathbb{1}_f, x) + c \sqrt{ \mathrm{Var}_{\hat f}(\hat f(x)) + \mathrm{Var}_f(\mathbb{1}_f(x)) }, \end{aligned}$$

where the last step uses the independence of the two sources of randomness ($\hat f(x) - \mathbb{E}_{\hat f}[\hat f(x)]$ and $\mathbb{1}_f(x) - \mathbb{E}_f[\mathbb{1}_f(x)]$ are uncorrelated with mean zero). The above calculation fails with probability at most $1/c^2$, in which case the left-hand side still obeys the simple bound $|\Delta| \le 1$. Thus, taking expectations of both sides, we have

$$\mathbb{E}_{f, \hat f}\left[ |\Delta| \right] \le \mathrm{bias}(\hat f, \mathbb{1}_f, x) + c \sqrt{ \mathrm{Var}_{\hat f}(\hat f(x)) + \mathrm{Var}_f(\mathbb{1}_f(x)) } + \frac{1}{c^2}$$

for any $c > 0$. A choice of $c = \left( \mathrm{Var}_{\hat f \sim \mathcal F}(\hat f(x)) + \mathrm{Var}_f(\mathbb{1}_f(x)) \right)^{-1/6}$, which balances the last two terms, yields the result. $\square$

### A.7. Metric-Fair Derandomization Preserves Threshold Fairness

*Proof of Lemma 4.3.* First, fix some $\sigma \in (0, 1)$ and let $X^2_{\le \sigma} := \{ (x, x') \in X^2 \mid d(x, x') \le \sigma \}$. Observe the following translations between metric and threshold fairness on this set:
Then, for any given ˆf F, let ρξ( ˆf) := Pr (x,x ) X2 ξ h ˆf(x) = ˆf(x ) i and ρ τ( ˆf) := Pr (x,x ) X2 τ h ˆf(x) = ˆf(x ) i denote the fraction of pairs at distance ξ and within τ, respectively, to which ˆf assigns different outputs. Treating ρξ( ˆf) as a random variable of ˆf, we have h ρξ( ˆf) i = E ˆ f F Pr (x,x ) X2 ξ h ˆf(x) = ˆf(x ) i E (x,x ) X2 ξ h ˆf(x) ˆf(x ) i = E (x,x ) X2 ξ h ˆf(x) ˆf(x ) i (13) Thus the fraction of separated pairs within distance τ is h ρ τ( ˆf) i := E ˆ f F Pr (x,x ) X2 τ h ˆf(x) = ˆf(x ) i# Pr (x,x ) X2 τ h ˆf(x) = ˆf(x ) d(x, x ) = ξ i Pr (x,x ) X2 τ [d(x, x ) = ξ] dξ Pr (x,x ) X2 ξ h ˆf(x) = ˆf(x ) i# Pr (x,x ) X2 τ [d(x, x ) = ξ] dξ 0 E (x,x ) X2 ξ h ˆf(x) ˆf(x ) i Pr (x,x ) X2 τ [d(x, x ) = ξ] dξ (by Equation (13)) Metric-Fair Classifier Derandomization 0 (αξ + β) Pr (x,x ) X2 τ [d(x, x ) = ξ] dξ (by (α, β, d)-fairness) (15) (ατ + β) Z τ 0 Pr (x,x ) X2 τ [d(x, x ) = ξ] dξ = ατ + β (16) Since ρ τ [0, 1], Var(ρ τ) = E[ρ2 τ] E[ρ τ]2 E[ρ τ]. Thus applying Chebyshev s inequality to Equation (16) yields (ατ + β) Pr ˆ f F E ˆ f F [ρ] δ which proves the claim. A.9. Output Approximation and Loss Approximation Proof of Lemma 4.6. For any x X and y {0, 1}, h L( ˆf, x, y) i = E ˆ f F h ℓ( ˆf(x), y) i ( ˆf(x) {0, 1}) h ℓ( ˆf(x), y) ˆf(x) = 1 i Pr ˆ f h ˆf(x) = 1 i + E ˆ f h ℓ( ˆf(x), y) ˆf(x) = 0 i Pr ˆ f h ˆf(x) = 0 i = ℓ(1, y) E ˆ f h ˆf(x) i + ℓ(0, y) 1 E ˆ f = ℓ(1, y)f(x) + ℓ(0, y) (1 f(x)) bias( ˆf, f, x) = f(x)ℓ(1, y) + (1 f(x))ℓ(0, y) bias( ˆf, f, x) which proves the first inequality concerning the bias. For the variance, notice that since ℓis binary, either Var ˆ f ℓ( ˆf(x), y) = Var ˆ f ˆf(x) or Var ˆ f ℓ( ˆf(x), y) = 0. Metric-Fair Classifier Derandomization B. Manipulation Deterrence in Strategic Classification Fair derandomization procedures carry implications for the strategic classification problem, a popular framework for modeling the behavior of self-interested agents subject to classification decisions (Hardt et al., 2016; Cai et al., 2015; Chen et al., 2018; Dong et al., 2018; Chen et al., 2020). Formally, strategic classification is a Stackelberg game, or a sequential game between two players: 1. First, a decision maker or model designer publishes a classifier. Traditionally, this means a stochastic classifier f : X [0, 1], but in our setting, the model designer may publish a family of deterministic classifiers F, and promises to select a single classifier from F uniformly at random. 2. Next, a strategic agent or decision subject, who is associated with some feature vector x X, decides either to present their true features x, or to change or manipulate their features to some x X to obtain the favorable outcome ˆf(x ) = 1 with higher probability. However, the agent incurs a cost c(x, x ) 0 for altering their features. Given a (stochastic or deterministic) classifier f : X [0, 1] and cost function c : X2 [0, 1], the utility of an agent with original features x who changes to x is defined as Uf(x, x ) := f(x ) c(x, x ) and the utility-maximizing move f(x) := arg maxx X Uf(x, x ) is called the best response of x under f and c. In the following proposition, we observe a general connection between metric fairness and strategic manipulation; namely that the more fair a classifier is with respect to a metric cost function, the less incentive agents have to manipulate their features. The reason is intuitive: if a classifier is a smooth function, then an agent x cannot expect their outcome to change much by moving to some nearby point x . 
**Proposition B.1** (Metric fairness implies reduced manipulation incentive). Let $c$ be a metric cost function and let $f$ be an $(\alpha, \beta, c)$-metric fair classifier. Then the maximum utility gained by manipulating $x$ to $x'$ is

$$U_f(x, x') - U_f(x, x) \le (\alpha - 1) \, c(x, x') + \beta.$$

If $\hat f$ is a deterministic classifier drawn from a family $\mathcal F$, then this holds in expectation over the sampling of $\hat f$.

*Proof of Proposition B.1.* Under a classifier $f$, an individual with original features $x \in X$ who changes to $x' \in X$ derives utility

$$\begin{aligned} U_f(x, x') &= f(x') - c(x, x') \\ &\le f(x) + |f(x') - f(x)| - c(x, x') \\ &\le f(x) + \alpha \, c(x, x') + \beta - c(x, x') && \text{($f$ is $(\alpha, \beta, c)$-fair)} \\ &= f(x) + (\alpha - 1) \, c(x, x') + \beta \\ &= U_f(x, x) + (\alpha - 1) \, c(x, x') + \beta, \end{aligned}$$

which proves the claim for stochastic classifiers (note that $U_f(x, x) = f(x)$, since $c(x, x) = 0$ for a metric). The proof for a deterministic family $\mathcal F$ results from taking an expectation $\mathbb{E}_{\hat f \sim \mathcal F}[\cdot]$ on both sides. $\square$

Braverman and Garg (Braverman & Garg, 2020) already observed this fact for a stochastic classifier with $\alpha = 1$ and $\beta = 0$, in which case there is no incentive to manipulate. Note that by Proposition 2.4, deterministic families cannot achieve such small fairness parameters; hence the upper bound of Proposition B.1 cannot rule out some incentive to manipulate. Nevertheless, it presents a nontrivial worst-case guarantee since, for a classifier without any fairness constraints, there may be individuals near the decision boundary who can flip their decision from, for example, $f(x) = 0$ to $f(x') = 1$ at near-zero cost, thereby gaining utility $U_f(x, x') - U_f(x, x) \approx 1$ through manipulation.

Cost functions studied in the strategic classification literature include the $L_2$ (Hardt et al., 2016; Brückner & Scheffer, 2011) and Mahalanobis (Chen et al., 2021) distances, both of which are metrics with known LSH families (Andoni & Indyk, 2006; Jain et al., 2008). Therefore, stochastic classifiers trained to be fair with respect to these costs automatically reduce incentives to manipulate features, and if such classifiers are derandomized using fairness-preserving methods, this quality is probably approximately preserved.
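To connect this back to Section 3, here is a worked instance (our own, under the stated assumptions that $f$ is $(1, 0, c)$-fair and $c$ is LSHable): plugging the worst-case pairwise parameters of Corollary 3.4 into Proposition B.1 gives a concrete deterrence guarantee for the derandomized classifier.

```latex
% Corollary 3.4 gives expected fairness parameters (alpha, beta) = (3/2, epsilon)
% for fhat ~ F_LS; Proposition B.1 then bounds the expected manipulation gain:
\[
\mathbb{E}_{\hat f}\left[U_{\hat f}(x, x') - U_{\hat f}(x, x)\right]
\;\le\; (\alpha - 1)\, c(x, x') + \beta
\;=\; \tfrac{1}{2}\, c(x, x') + \epsilon,
\]
% so an agent can profit in expectation by at most half the manipulation
% cost, plus the small additive term epsilon = 2/k.
```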