# Differentially Private Fair Learning

Matthew Jagielski 1, Michael Kearns 2, Jieming Mao 2, Alina Oprea 1, Aaron Roth 2, Saeed Sharifi-Malvajerdi 2, Jonathan Ullman 1

1 Northeastern University, Boston, MA, USA. 2 University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: Saeed Sharifi-Malvajerdi.

Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. Copyright 2019 by the author(s).

Abstract

Motivated by settings in which predictive models may be required to be non-discriminatory with respect to certain attributes (such as race), but even collecting the sensitive attribute may be forbidden or restricted, we initiate the study of fair learning under the constraint of differential privacy. Our first algorithm is a private implementation of the equalized odds post-processing approach of (Hardt et al., 2016). This algorithm is appealingly simple, but must be able to use protected group membership explicitly at test time, which can be viewed as a form of disparate treatment. Our second algorithm is a differentially private version of the oracle-efficient in-processing approach of (Agarwal et al., 2018), which is more complex but need not have access to protected group membership at test time. We identify new tradeoffs between fairness, accuracy, and privacy that emerge only when requiring all three properties, and show that these tradeoffs can be milder if group membership may be used at test time. We conclude with a brief experimental evaluation.

1. Introduction

Large-scale algorithmic decision making, often driven by machine learning on consumer data, has increasingly run afoul of various social norms, laws and regulations. A prominent concern is when a learned model exhibits discrimination against some demographic group, perhaps based on race or gender. Concerns over such algorithmic discrimination have led to a recent flurry of research on fairness in machine learning, which includes both new tools and methods for designing fair models, and studies of the tradeoffs between predictive accuracy and fairness (ACM, 2019).

At the same time, both recent and longstanding laws and regulations often restrict the use of sensitive or protected attributes in algorithmic decision-making. U.S. law prevents the use of race in the development or deployment of consumer lending or credit scoring models, and recent provisions in the E.U. General Data Protection Regulation (GDPR) restrict or prevent even the collection of racial data for consumers. These two developments, the demand for non-discriminatory algorithms and models on the one hand and the restriction on the collection or use of protected attributes on the other, present technical conundrums, since the most straightforward methods for ensuring fairness generally require knowing or using the attribute being protected. It seems difficult to guarantee that a trained model is not discriminating against (say) a racial group if we cannot even identify members of that group in the data.

A recent line of work (Veale & Binns, 2017; Kilbertus et al., 2018) made these cogent observations, and proposed an interesting solution employing the cryptographic tool of secure multiparty computation (commonly abbreviated MPC).
In this model, we imagine a commercial entity with access to consumer data that excludes race, but this entity would like to build a predictive model for, say, commercial lending, under the constraint that the model be non-discriminatory by race with respect to some standard fairness notion (e.g. equality of false rejection rates). In order to do so, the company engages in MPC with a set of regulatory agencies, which are either trusted parties holding consumers' race data (Veale & Binns, 2017), or hold among them a secret sharing of race data, provided by the consumers themselves (Kilbertus et al., 2018). Together the company and the regulators apply standard fair machine learning techniques in a distributed fashion. In this way the company never directly accesses the race data, but still manages to produce a fair model, which is the output of the MPC. The guarantee provided by this solution is the standard one of MPC: namely, the company learns nothing more than whatever is implied by its own consumer data and the fair model returned by the protocol.

Our point of departure stems from our assertion that MPC is the wrong guarantee to give if our motivation is ensuring that data about an individual's race does not leak to the company via the model. In particular, MPC implies nothing about what individual information can already be inferred from the learned model itself. The guarantee we would prefer is that the company's data and the fair model do not leak anything about an individual's race beyond what can be inferred from population-level correlations. That is, the fair model should not leak anything beyond inferences that could be carried out even if the individual in question had declined to provide her racial identity. This is exactly the type of promise made by differential privacy (Dwork et al., 2006b), but not by MPC.

The insufficiency of MPC. To emphasize the fact that concerns over leakage of protected attributes under the guarantee of MPC are more than hypothetical, we describe a natural example where this leakage would actually occur.

Example. An SVM model, trained in the standard way, is represented by the underlying support vectors, which are just data points from the training data. Thus, if race is a feature represented in the training data, an SVM model computed under MPC reveals the race of the individuals represented in the support vectors. This is the case even if race is uncorrelated with all other features and labels, in which case differential privacy would prevent such inferences. We note that there are differentially private implementations of SVMs.

The reader might object that, in this example, the algorithm is trained to use racial data at test time, and so the output of the algorithm is directly affected by race. But there are also examples in which the same problems with MPC arise even when race is not an input to the learned model, and race is again uncorrelated with the company's data. We also note that SVMs are just an extreme case of a learned model fitting, and thus potentially revealing, its training data. For example, points from the training set can also be recovered from trained neural networks (Song et al., 2017).

Our approach: differential privacy.
These examples show that cryptographic approaches to locking up sensitive information during a training process are insufficient as a privacy mechanism: we need to explicitly reason about what can be inferred from the output of a learning algorithm, not simply say that we cannot learn more than such inferences. In this paper we therefore instead consider the problem of designing fair learning algorithms that also promise differential privacy with respect to consumer race, and thus give strong guarantees about what can be inferred from the learned model. We note that the guarantee of differential privacy is somewhat subtle, and does not promise that the company will be unable to infer race. For example, it might be that a feature that the company already has, such as zip codes, is perfectly correlated with race, and a computation that is differentially private might reveal this correlation. In this case, the company will be able to infer racial information about its customers. However, differential privacy prevents leakage of individual racial data beyond what can be inferred from population-level correlations.

Like (Veale & Binns, 2017), our approach can be viewed as a collaboration between a company holding non-sensitive consumer data and a regulator holding sensitive data. Our algorithms allow the regulator to build fair models from the combined data set (potentially also under MPC) in a way that ensures the company, or any other party with access to the model or its decisions, cannot infer the race of any consumer in the data much more accurately than they could from population-level statistics alone. Thus, we comply with the spirit of laws and regulations asking that sensitive attributes not be leaked, while still allowing them to be used to enforce fairness.

1.1. Our Results

We study the problem of learning classifiers from data with protected attributes. More specifically, we are given a class of classifiers $\mathcal{H}$ and we output a randomized classifier in $\Delta(\mathcal{H})$ (i.e. a distribution over $\mathcal{H}$). The training data consists of m individual data points of the form $(X, A, Y)$. Here $X \in \mathcal{X}$ is the vector of unprotected attributes, $A \in \mathcal{A}$ is the protected attribute, and Y is the binary label. As discussed above, our algorithms achieve three goals simultaneously:

Differential privacy: Our learning algorithms satisfy differential privacy (Dwork et al., 2006b) with respect to the protected attributes. (They need not be differentially private with respect to the unprotected attributes X, although sometimes they are.)

Fairness: Our learning algorithms guarantee approximate notions of statistical fairness across the groups specified by the protected attribute. The particular statistical fairness notion we focus on is Equalized Odds (Hardt et al., 2016), which in the binary classification case reduces to asking that false positive rates and false negative rates be approximately equal, conditional on all values of the protected attribute (but our techniques apply to other notions of statistical fairness as well, including statistical parity).

Accuracy: Our output classifier has error rate comparable to non-private benchmarks in $\Delta(\mathcal{H})$ consistent with the fairness constraints.

We evaluate fairness and error as in-sample quantities. Out-of-sample generalization for both error and fairness follows from standard sample-complexity bounds in learning theory, and so we elide this complication for clarity (but see e.g. the treatment in (Kearns et al., 2018b) for formal generalization bounds).
Table 1: Summary of results for our differentially private fair learning algorithms.

| Algorithm | Assumptions on $\mathcal{H}$ | Fairness guarantee | Needs access to A at test time? | Guarantees privacy of X as well? | Error | Fairness violation |
|---|---|---|---|---|---|---|
| DP-postprocessing | None | Equalized Odds | Yes | No | $\tilde O\!\left(\frac{|\mathcal{A}|}{m\epsilon}\right)$ (*) | $\tilde O\!\left(\frac{1}{\min_{a,y}\hat q_{ay}\, m\epsilon}\right)$ |
| DP-oracle-learner | $d_{\mathcal{H}} < \infty$, where $d_{\mathcal{H}} := VC(\mathcal{H})$ | Equalized Odds | No | No | $\tilde O(\cdot)$; see Thm. 4.4 | $\tilde O(\cdot)$; see Thm. 4.4 |
| DP-oracle-learner | $|\mathcal{H}| < \infty$ | Equalized Odds | No | Yes | $\tilde O(\cdot)$; see Thm. 4.4 and Remark 4.2 | $\tilde O(\cdot)$; see Thm. 4.4 and Remark 4.2 |
| DP-oracle-learner | $|\mathcal{H}| < \infty$, $\mathcal{H}$ contains the maximally discriminatory classifiers | Equalized False Positive Rates | Yes | Yes | $\tilde O(\cdot)$; see Thm. 4.5 | $\tilde O(\cdot)$; see Thm. 4.5 |

In this table, m is the training sample size, $\hat q_{ay}$ is the fraction of the data with A = a and Y = y, $|\mathcal{A}|$ is the number of protected groups, and ε is the privacy parameter. B is explained in the text. For all but the error bound marked (*), the comparison benchmark is the optimal fair classifier; the marked bound is compared to a weaker benchmark, namely the outcome of the non-private post-processing procedure.

We start with a simple extension of the post-processing approach of (Hardt et al., 2016). Their algorithm starts with a possibly unfair classifier $\hat Y$ and derives a fair classifier by mixing $\hat Y$ with classifiers which are based on the protected attributes. This involves solving a linear program which takes quantities $\hat q_{\hat y a y}$ as input. Here $\hat q_{\hat y a y}$ is the fraction of data points with $\hat Y = \hat y$, $A = a$, $Y = y$. To make this approach differentially private with respect to the protected attributes, we start with a $\hat Y$ that is learned without using protected attributes, and we use standard techniques to perturb the $\hat q_{\hat y a y}$'s before feeding them into the linear program, in a way that guarantees differential privacy. We analyze the additional error and fairness violation that results from the perturbation. Detailed results can be found in Section 3.

Although it has the virtue of being exceedingly simple, this first approach has two significant drawbacks. First, even without privacy, the post-processing approach does not in general produce classifiers with error comparable to that of the best fair classifiers, and our privacy-preserving modification inherits this limitation. Second, and often more importantly, the post-processing approach crucially requires that protected attributes can be used at test time, and this isn't feasible (or legal) in certain applications. Even when it is, if racial information is held only by a regulator, then although it may be feasible to train a model once using MPC, it probably is not feasible to make test-time decisions repeatedly using MPC.

We then consider the approach of (Agarwal et al., 2018), which we refer to as in-processing (to distinguish it from post-processing). They give an oracle-efficient algorithm, which assumes access to a subroutine that can optimally solve classification problems absent a fairness constraint (in practice, and in our experiments, these oracles are implemented using simple learning heuristics). Their approach does not have either of the above drawbacks: it does not require that protected features be available at test time, and it is guaranteed to produce the approximately optimal fair classifier. The algorithm is correspondingly more complicated.
The main idea of their approach (following the presentation of (Kearns et al., 2018b)) is to show that the optimal fair classifier can be found as the equilibrium of a zero-sum game between a Learner, who selects classifiers in $\mathcal{H}$, and an Auditor, who finds fairness violations. This equilibrium can be approximated by iterative play of the game, in which the Auditor plays exponentiated gradient descent and the Learner plays best responses (computed via an efficient cost-sensitive classification oracle). To make this approach private, we add Laplace noise to the gradients used by the Auditor, and we let the Learner run the exponential mechanism (or some other private learning oracle) to compute approximate best responses. Our technical contribution is to show that the Learner and the Auditor still converge to an approximate equilibrium despite the noise introduced for privacy. Detailed results can be found in Section 4.

One of the most interesting aspects of our results is an inherent tradeoff between privacy, accuracy, and fairness that does not arise when any two of these desiderata are considered alone. This manifests itself as the parameter B in our in-processing result (see Table 1), which mediates the tradeoff between error, fairness and privacy. This parameter also appears in the (non-private) algorithm of (Agarwal et al., 2018), but there it serves only to mediate a tradeoff between fairness and running time. At a high level, the reason for this difference is that without the need for privacy, we can increase the number of iterations of the algorithm to decrease the error to any desired level. However, when we also need to protect privacy, there is an additional tradeoff: increasing the number of iterations also requires increasing the scale of the gradient perturbations, which may not always decrease error.

This tradeoff exhibits an additional interesting feature. Recall that, as discussed above, the in-processing approach works even if we cannot use protected attributes at test time. But if we are allowed to use protected attributes at test time, we are able to obtain a better tradeoff between these quantities, essentially eliminating the role of the variable B that would otherwise mediate this tradeoff. We give the details of this improvement in Section 4.1 (for this result, we also need to relax the fairness requirement from Equalized Odds to Equalized False Positive Rates). The main step in the proof is to show that, for small constant B and $\mathcal{H}$ containing certain maximally discriminatory classifiers which make decisions solely on the basis of group membership, we can give a better characterization of the Learner's strategy at the approximate equilibrium of the zero-sum game.

Finally, we provide evidence that using protected attributes at test time is necessary for obtaining this better tradeoff. In Section 4.2, we consider the sensitivity of the error of the optimal classifier subject to fairness constraints. We show that this sensitivity can be substantially higher when the classifier cannot use protected attributes at test time, which means that more noise, and hence higher error, must be introduced to estimate this error privately.

1.2. Related Work

The literature on algorithmic fairness is growing rapidly, and is by now far too extensive to cover exhaustively here. See (Chouldechova & Roth, 2018) for a recent survey. Our work builds directly on that of (Hardt et al., 2016), (Agarwal et al., 2018), and (Kearns et al., 2018b).
In particular, (Hardt et al., 2016) introduce the equalized odds definition that we take as our primary fairness goal, and give a simple post-processing algorithm that we modify to make differentially private. (Agarwal et al., 2018) derive an oracle-efficient algorithm which can optimally solve the fair empirical risk minimization problem (for a variety of statistical fairness constraints, including equalized odds) given oracles (implemented with heuristics) for the unconstrained learning problem. (Kearns et al., 2018b) generalize this algorithm to handle infinitely many protected groups. We give a differentially private version of (Agarwal et al., 2018) as well.

Our paper is directly inspired by (Kilbertus et al., 2018), who study how to train fair machine learning models by encrypting sensitive attributes and applying secure multiparty computation (MPC). We share the goal of (Kilbertus et al., 2018): we want to train fair classifiers without leaking information about an individual's race through their participation in the training. Our starting point is the observation that differential privacy, rather than secure multiparty computation, is the right tool for this.

We use differential privacy (Dwork et al., 2006b) as our notion of individual privacy, which has become an influential solution concept for data privacy in the last decade. See (Dwork & Roth, 2014) for a survey. We make use of standard tools from this literature, including the Laplace mechanism (Dwork et al., 2006b), the exponential mechanism (McSherry & Talwar, 2007), and composition theorems (Dwork et al., 2006a; 2010).

2. Model and Preliminaries

Suppose we are given a data set of m individuals drawn i.i.d. from an unknown distribution P, where each individual is described by a tuple $(X, A, Y)$. Here $X \in \mathcal{X}$ is a vector of unprotected attributes, $A \in \mathcal{A}$ is the protected attribute, where $|\mathcal{A}| < \infty$, and $Y \in \mathcal{Y}$ is a binary label. Without loss of generality, we write $\mathcal{A} = \{0, 1, \ldots, |\mathcal{A}| - 1\}$ and $\mathcal{Y} = \{0, 1\}$. Let $\hat P$ denote the empirical distribution of the observed data. Our primary goal is to develop an algorithm to learn a (possibly randomized) fair classifier $\hat Y$, with an algorithm that guarantees the privacy of the sensitive attribute A. By privacy, we mean differential privacy, and by fairness, we mean (approximate versions of) the Equalized Odds condition of (Hardt et al., 2016). Both of these notions are parameterized: differential privacy has a parameter ε, and the approximate fairness constraint is parameterized by γ. Our main interest is in characterizing the tradeoff between ε, γ, and classification error. Here we provide basic definitions of fairness and differential privacy; see the supplementary file for a detailed discussion of them.

Notation: $\hat P$ throughout refers to probability taken with respect to the empirical distribution. We write $\hat q_{\hat y a y} := \hat P[\hat Y = \hat y, A = a, Y = y]$ and $\hat q_{ay} := \hat P[A = a, Y = y]$. The quantities $\widehat{FP}_a(\hat Y) := \hat P[\hat Y = 1 \mid A = a, Y = 0]$ and $\widehat{TP}_a(\hat Y) := \hat P[\hat Y = 1 \mid A = a, Y = 1]$ are the false and true positive rates of $\hat Y$ in the subpopulation $\{A = a\}$. The discrepancies $\Delta\widehat{FP}_a(\hat Y) := |\widehat{FP}_a(\hat Y) - \widehat{FP}_0(\hat Y)|$ and $\Delta\widehat{TP}_a(\hat Y) := |\widehat{TP}_a(\hat Y) - \widehat{TP}_0(\hat Y)|$ are used to measure $\hat Y$'s false/true positive rate disparities across groups. Finally, $\widehat{err}(\hat Y) := \hat P[\hat Y \neq Y]$ is the error of $\hat Y$.

2.1. Fairness

Definition 2.1 (γ-Equalized Odds Fairness). A classifier $\hat Y$ satisfies the γ-Equalized Odds condition with respect to the attribute A if, for all $a \in \mathcal{A}$, the false and true positive rates of $\hat Y$ in the subpopulations $\{A = a\}$ and $\{A = 0\}$ are within γ of one another. In other words, for all $a \neq 0$, $\Delta\widehat{FP}_a(\hat Y) \le \gamma$ and $\Delta\widehat{TP}_a(\hat Y) \le \gamma$.
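To make these empirical quantities concrete, the following is a minimal sketch (ours, not part of the paper's implementation) of how the rate disparities above can be computed from a sample; it assumes binary NumPy arrays of predictions, group memberships, and labels, and that every (group, label) cell is nonempty.

```python
import numpy as np

def group_rates(y_hat, a, y, group):
    """Empirical false/true positive rates of y_hat on the subpopulation {A = group}."""
    fp = y_hat[(a == group) & (y == 0)].mean()  # P_hat[Y_hat = 1 | A = group, Y = 0]
    tp = y_hat[(a == group) & (y == 1)].mean()  # P_hat[Y_hat = 1 | A = group, Y = 1]
    return fp, tp

def equalized_odds_violation(y_hat, a, y):
    """Largest FP/TP rate gap between any group a != 0 and the baseline group 0."""
    fp0, tp0 = group_rates(y_hat, a, y, 0)
    gaps = []
    for g in np.unique(a):
        if g == 0:
            continue
        fpg, tpg = group_rates(y_hat, a, y, g)
        gaps.append(max(abs(fpg - fp0), abs(tpg - tp0)))
    return max(gaps)

# y_hat (empirically) satisfies gamma-equalized odds exactly when
# equalized_odds_violation(y_hat, a, y) <= gamma.
```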
2.2. Differential Privacy

Let $\mathcal{D}$ be a data universe from which a database D of size m is drawn, and let $M : \mathcal{D}^m \to \mathcal{O}$ be an algorithm that takes a database D as input. Differential privacy requires that the addition or removal of a single data entry should have little (distributional) effect on the output of the mechanism. In other words, for every pair of neighboring databases $D, D' \in \mathcal{D}^m$ that differ in at most one entry, differential privacy requires that the distributions of $M(D)$ and $M(D')$ be close to each other, as measured by parameters ε and δ.

Definition 2.2 ((ε, δ)-Differential Privacy (DP) (Dwork et al., 2006b)). A randomized algorithm $M : \mathcal{D}^m \to \mathcal{O}$ is said to be (ε, δ)-differentially private if for all pairs of neighboring databases $D, D' \in \mathcal{D}^m$ and all $O \subseteq \mathcal{O}$,
$\Pr[M(D) \in O] \le e^{\epsilon} \Pr[M(D') \in O] + \delta.$
If δ = 0, M is said to be ε-differentially private.

Recall that our data universe is $\mathcal{D} = (\mathcal{X}, \mathcal{A}, \mathcal{Y})$, which it will be convenient to partition as $(\mathcal{X}, \mathcal{Y}) \times \mathcal{A}$. Given a dataset D of size m, we will write it as a pair $D = (D_I, D_S)$, where $D_I \in (\mathcal{X}, \mathcal{Y})^m$ represents the insensitive attributes and $D_S \in \mathcal{A}^m$ represents the sensitive attributes. We will sometimes incidentally guarantee differential privacy over the entire data universe $\mathcal{D}$ (see Table 1), but our main goal will be to promise differential privacy only with respect to the sensitive attributes. Write $D_S \sim D'_S$ to denote that $D_S$ and $D'_S$ differ in exactly one coordinate (i.e. in one person's group membership). An algorithm is (ε, δ)-differentially private in the sensitive attributes if for all $D_I \in (\mathcal{X}, \mathcal{Y})^m$, all $D_S \sim D'_S \in \mathcal{A}^m$, and all $O \subseteq \mathcal{O}$, we have
$\Pr[M(D_I, D_S) \in O] \le e^{\epsilon} \Pr[M(D_I, D'_S) \in O] + \delta.$

3. DP Fair Learning: Post-Processing

In this section we present our first differentially private fair learning algorithm, which we call DP-postprocessing. The DP-postprocessing algorithm (Algorithm 1) is a private variant of the algorithm introduced in (Hardt et al., 2016). Their approach starts with a base classifier $\hat Y$ which is learned without using protected attributes, and derives a fair classifier $\hat Y_p$ by mixing $\hat Y$ with classifiers depending on the protected attributes. $\hat Y_p$ is specified by a parameter $p = (p_{\hat y a})_{\hat y, a}$, a vector of probabilities such that $p_{\hat y a} := \Pr[\hat Y_p = 1 \mid \hat Y = \hat y, A = a]$. Among all fair $\hat Y_p$'s, the one with minimum error can be found by solving a linear program which takes as input the aggregate statistics $\hat q_{\hat y a y}$ for all $\hat y, a, y$.

In Algorithm 1, we make the above approach differentially private. Notice that this method depends on the protected attributes only through the quantities $\hat q_{\hat y a y}$. To guarantee differential privacy, Algorithm 1 computes $\tilde q_{\hat y a y}$ (a noisy version of $\hat q_{\hat y a y}$) and then feeds $\tilde q_{\hat y a y}$ into the linear program $\widetilde{LP}$ (1). In this linear program, terms with tildes (e.g. $\tilde q_{ay}$, $\widetilde{err}$, $\widetilde{FP}$, $\widetilde{TP}$) are defined with respect to $\tilde q_{\hat y a y}$ instead of $\hat q_{\hat y a y}$. We analyze the performance of Algorithm 1 in Theorem 3.1; its proof is deferred to the supplementary file.

Algorithm 1: ε-DP fair classification (DP-postprocessing)
Input: privacy parameter ε, confidence parameter β, fairness violation γ, training examples $\{(X_i, A_i, Y_i)\}_{i=1}^m$
  Train the base classifier $\hat Y$ on $\{(X_i, Y_i)\}_{i=1}^m$.
  Compute $\hat q_{\hat y a y} = \hat P[\hat Y = \hat y, A = a, Y = y]$ for all $\hat y, a, y$.
  Sample $W_{\hat y a y} \sim \mathrm{Lap}(2/(m\epsilon))$ i.i.d. for all $\hat y, a, y$.
  Perturb each $\hat q_{\hat y a y}$ with noise: $\tilde q_{\hat y a y} = \hat q_{\hat y a y} + W_{\hat y a y}$.
  Solve $\widetilde{LP}$ (1) to get the minimizer $\tilde p$.
Output: $\tilde p$, the trained classifier $\hat Y_{\tilde p}$.
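As an illustration of the privacy step of Algorithm 1, here is a minimal sketch (ours, not the authors' code) that computes the statistics $\hat q_{\hat y a y}$ and perturbs them with Laplace noise of scale 2/(mε); solving $\widetilde{LP}$ (1) on the resulting $\tilde q$ with an off-the-shelf LP solver is omitted, and the function name and array layout are illustrative.

```python
import numpy as np

def noisy_joint_stats(y_hat, a, y, epsilon, rng=None):
    """Laplace-perturbed statistics q_tilde[yh, g, yy] ~ P_hat[Y_hat=yh, A=g, Y=yy].

    Changing one individual's protected attribute moves 1/m of empirical mass
    from one cell to another, so the L1 sensitivity of the vector of statistics
    is 2/m; adding Laplace(2/(m*epsilon)) noise to every cell therefore gives
    epsilon-DP with respect to the sensitive attribute A.
    """
    rng = rng or np.random.default_rng()
    m = len(y)
    groups = list(np.unique(a))
    q_hat = np.zeros((2, len(groups), 2))
    for gi, g in enumerate(groups):
        for yh in (0, 1):
            for yy in (0, 1):
                q_hat[yh, gi, yy] = np.mean((y_hat == yh) & (a == g) & (y == yy))
    return q_hat + rng.laplace(scale=2.0 / (m * epsilon), size=q_hat.shape)
```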
The main step of the proof is to understand how the introduced noise propagates to the solution of the linear program. We also briefly review the approach of (Hardt et al., 2016) in the supplementary file.

$\widetilde{LP}$: ε-Differentially Private Linear Program

$\arg\min_{p} \; \widetilde{err}(\hat Y_p)$
subject to, for all $a \in \mathcal{A}$, $a \neq 0$:
$\Delta\widetilde{FP}_a(\hat Y_p) \le \gamma + \frac{4 \ln(4|\mathcal{A}|/\beta)}{\min\{\tilde q_{a0}, \tilde q_{00}\}\, m\epsilon}$
$\Delta\widetilde{TP}_a(\hat Y_p) \le \gamma + \frac{4 \ln(4|\mathcal{A}|/\beta)}{\min\{\tilde q_{a1}, \tilde q_{01}\}\, m\epsilon}$
$0 \le p_{\hat y a} \le 1$ for all $\hat y, a$  (1)

where
$\widetilde{err}(\hat Y_p) := \sum_{\hat y, a} (\tilde q_{\hat y a 0} - \tilde q_{\hat y a 1})\, p_{\hat y a} + \sum_{\hat y, a} \tilde q_{\hat y a 1}$
$\Delta\widetilde{FP}_a(\hat Y_p) := \big| \widetilde{FP}_a(\hat Y)\, p_{1a} + (1 - \widetilde{FP}_a(\hat Y))\, p_{0a} - \widetilde{FP}_0(\hat Y)\, p_{10} - (1 - \widetilde{FP}_0(\hat Y))\, p_{00} \big|$
$\Delta\widetilde{TP}_a(\hat Y_p) := \big| \widetilde{TP}_a(\hat Y)\, p_{1a} + (1 - \widetilde{TP}_a(\hat Y))\, p_{0a} - \widetilde{TP}_0(\hat Y)\, p_{10} - (1 - \widetilde{TP}_0(\hat Y))\, p_{00} \big|$

Theorem 3.1 (Error-Privacy, Fairness-Privacy Tradeoffs). Suppose $\min_{a,y} \hat q_{ay} > 4 \ln(4|\mathcal{A}|/\beta)/(m\epsilon)$. Let $p^\star$ be the optimal γ-fair solution of the non-private post-processing algorithm of (Hardt et al., 2016), and let $\tilde p$ be the output of Algorithm 1, i.e. the optimal solution of $\widetilde{LP}$ (1). With probability at least $1 - \beta$,
$\widehat{err}(\hat Y_{\tilde p}) \le \widehat{err}(\hat Y_{p^\star}) + \frac{24|\mathcal{A}| \ln(4|\mathcal{A}|/\beta)}{m\epsilon}$
and for all $a \neq 0$,
$\Delta\widehat{FP}_a(\hat Y_{\tilde p}) \le \gamma + \frac{8 \ln(4|\mathcal{A}|/\beta)}{\min\{\hat q_{a0}, \hat q_{00}\}\, m\epsilon - 4 \ln(4|\mathcal{A}|/\beta)}$
$\Delta\widehat{TP}_a(\hat Y_{\tilde p}) \le \gamma + \frac{8 \ln(4|\mathcal{A}|/\beta)}{\min\{\hat q_{a1}, \hat q_{01}\}\, m\epsilon - 4 \ln(4|\mathcal{A}|/\beta)}$

We emphasize that the accuracy guarantee stated in Theorem 3.1 is relative to the non-private post-processing algorithm, not relative to the optimal fair classifier. This is because the non-private post-processing algorithm itself has no such optimality guarantee.

4. DP Fair Learning: In-Processing

In this section we introduce our second differentially private fair learning algorithm, DP-oracle-learner (Algorithm 3). It is based on the fair learning algorithm presented in (Agarwal et al., 2018). Essentially, (Agarwal et al., 2018) reduce the γ-fair learning problem to the following Lagrangian min-max problem:
$\min_{Q \in \Delta(\mathcal{H})} \max_{\lambda \in \Lambda} \; L(Q, \lambda) := \widehat{err}(Q) + \lambda^\top \hat r(Q) \qquad (2)$
Here $\mathcal{H}$ is a given class of binary classifiers with $d_{\mathcal{H}} = VC(\mathcal{H}) < \infty$, $\hat r(Q)$ is a vector of fairness violations of the classifier Q across groups, and $\lambda \in \Lambda = \{\lambda \ge 0 : \|\lambda\|_1 \le B\}$ is the dual variable. In this work,
$\hat r(Q) = \big( \widehat{FP}_a(Q) - \widehat{FP}_0(Q) - \gamma,\; \widehat{FP}_0(Q) - \widehat{FP}_a(Q) - \gamma,\; \widehat{TP}_a(Q) - \widehat{TP}_0(Q) - \gamma,\; \widehat{TP}_0(Q) - \widehat{TP}_a(Q) - \gamma \big)_{a \in \mathcal{A},\, a \neq 0} \in \mathbb{R}^{4(|\mathcal{A}|-1)}$
$\lambda = \big( \lambda_{(a,0,+)},\; \lambda_{(a,0,-)},\; \lambda_{(a,1,+)},\; \lambda_{(a,1,-)} \big)^\top_{a \in \mathcal{A},\, a \neq 0} \in \mathbb{R}^{4(|\mathcal{A}|-1)}$

The method developed by (Agarwal et al., 2018), in the language of (Kearns et al., 2018b), gives a reduction from finding an optimal fair classifier to finding the equilibrium of a two-player zero-sum game played between a Learner (the Q-player), who needs to solve an unconstrained learning problem (given access to an efficient cost-sensitive classification oracle), and an Auditor (the λ-player), who weights the fairness violations via λ. Having the Learner play its best response and the Auditor play a no-regret learning algorithm guarantees convergence of the average plays to the equilibrium. In Algorithm 3, to make the above approach private, Laplace noise is added to the gradients used by the Auditor (who plays exponentiated gradient descent with learning rate η), and we let the Learner run the exponential mechanism (or some other private learning oracle) to compute approximate best responses. Subroutine 2 reduces the Learner's best response problem to a cost-sensitive classification problem, solved privately with an oracle $CSC_{\epsilon_1}(\mathcal{H})$.

Subroutine 2: $BEST^{\epsilon_1}_h$
Input: λ, training examples $\{(X_i, A_i, Y_i)\}_{i=1}^m$, privacy guarantee $\epsilon_1$
for i = 1, ..., m do
  $C^0_i \leftarrow 1\{Y_i \neq 0\}$ (cost of labeling i with 0)
  $C^1_i \leftarrow 1\{Y_i \neq 1\} + \frac{\lambda_{(A_i, Y_i, +)} - \lambda_{(A_i, Y_i, -)}}{\hat q_{A_i Y_i}}\, 1\{A_i \neq 0\} - \sum_{a \neq 0} \frac{\lambda_{(a, Y_i, +)} - \lambda_{(a, Y_i, -)}}{\hat q_{A_i Y_i}}\, 1\{A_i = 0\}$ (cost of labeling i with 1)
end for
Call $CSC_{\epsilon_1}(\mathcal{H})$ with $\{X_i, C^0_i, C^1_i\}_{i=1}^m$ to get $h^\star$.
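The following sketch spells out, in our own notation, the per-example costs constructed in Subroutine 2 as reconstructed above; `lam_plus`/`lam_minus` are dictionaries mapping a (group, label) pair to the corresponding dual weights, `q_hat_ay` maps (group, label) to the empirical probability $\hat q_{ay}$, and the private cost-sensitive oracle $CSC_{\epsilon_1}(\mathcal{H})$ that consumes these costs is abstracted away.

```python
import numpy as np

def cost_vectors(a, y, q_hat_ay, lam_plus, lam_minus):
    """Costs fed to the cost-sensitive classification oracle.

    c0[i] / c1[i]: cost of labeling example i with 0 / 1.
    lam_plus[(g, yy)], lam_minus[(g, yy)]: dual weights on the two one-sided
    constraints comparing group g's rate to group 0's rate for label yy.
    """
    a, y = np.asarray(a), np.asarray(y)
    groups = sorted(set(a.tolist()) - {0})
    c0 = (y != 0).astype(float)   # plain misclassification cost of predicting 0
    c1 = (y != 1).astype(float)   # misclassification cost of predicting 1 ...
    for i in range(len(y)):
        g, yy = a[i], y[i]
        if g != 0:
            # ... plus this example's contribution to its own group's constraints
            c1[i] += (lam_plus[(g, yy)] - lam_minus[(g, yy)]) / q_hat_ay[(g, yy)]
        else:
            # ... minus its contribution to the baseline rate, which appears in
            # every other group's constraints for the same label yy
            c1[i] -= sum((lam_plus[(g2, yy)] - lam_minus[(g2, yy)]) / q_hat_ay[(0, yy)]
                         for g2 in groups)
    return c0, c1
```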
Algorithm 3: (ε, δ)-differentially private fair classification (DP-oracle-learner)
Input: privacy parameters ε, δ, bound B, VC dimension $d_{\mathcal{H}}$, confidence parameter β, fairness violation γ, training examples $\{(X_i, A_i, Y_i)\}_{i=1}^m$
  Set the learning rate η and the number of rounds T as in the analysis (explicit functions of $\ln(4|\mathcal{A}|+3)$, m, ε, $(2|\mathcal{A}|B+1)$, $\ln(1/\delta)$, and $d_{\mathcal{H}}\ln(m) + \ln(2/\beta)$).
  $\tilde\theta_1 \leftarrow 0 \in \mathbb{R}^{4(|\mathcal{A}|-1)}$
  for t = 1, ..., T do
    $\tilde\lambda_{t,k} \leftarrow B\, \frac{\exp(\tilde\theta_{t,k})}{1 + \sum_{k'} \exp(\tilde\theta_{t,k'})}$ for all k
    $\tilde h_t \leftarrow BEST^{\epsilon_1}_h(\tilde\lambda_t)$ with $\epsilon_1 = \epsilon/(4\sqrt{T\ln(1/\delta)})$
    Sample $W_t$: $W_{t,k} \sim \mathrm{Lap}\!\left( \frac{8|\mathcal{A}|\sqrt{T\ln(1/\delta)}}{(\min_{a,y}\hat q_{ay}\, m - 1)\,\epsilon} \right)$ i.i.d.
    $\tilde r_t \leftarrow \hat r(\tilde h_t) + W_t$
    $\tilde\theta_{t+1} \leftarrow \tilde\theta_t + \eta\, \tilde r_t$
  end for
  $\tilde Q \leftarrow \frac{1}{T}\sum_{t=1}^T \tilde h_t$, $\tilde\lambda \leftarrow \frac{1}{T}\sum_{t=1}^T \tilde\lambda_t$
Output: $(\tilde Q, \tilde\lambda)$

Here we sketch the main steps of the analysis of Algorithm 3. All proofs, and a review of the approach of (Agarwal et al., 2018), are deferred to the supplementary file. We first bound the regret of the Learner and of the Auditor, in Lemmas 4.1 and 4.2, by understanding how the introduced perturbations affect these regret terms.

Lemma 4.1 (Regret of the Private Learner). Suppose $\{\tilde h_t\}_{t=1}^T$ is the sequence of best responses to $\{\tilde\lambda_t\}_{t=1}^T$ by the private Learner over T rounds. With probability at least $1 - \beta/2$,
$\frac{1}{T}\sum_{t=1}^T L(\tilde h_t, \tilde\lambda_t) - \frac{1}{T}\min_{Q \in \Delta(\mathcal{H})} \sum_{t=1}^T L(Q, \tilde\lambda_t) \le \frac{8\,(2|\mathcal{A}|B + 1)\sqrt{T\ln(1/\delta)}\,\big(d_{\mathcal{H}}\ln(m) + \ln(2T/\beta)\big)}{(\min_{a,y}\hat q_{ay}\, m - 1)\,\epsilon}$

Lemma 4.2 (Regret of the Private Auditor). Let $\{\tilde\lambda_t\}_{t=1}^T$ be the sequence of exponentiated gradient descent plays (with learning rate η) by the private Auditor against the plays $\{\tilde h_t\}_{t=1}^T$ of the private Learner over T rounds. With probability at least $1 - \beta/2$, the Auditor's average regret,
$\frac{1}{T}\max_{\lambda \in \Lambda} \sum_{t=1}^T L(\tilde h_t, \lambda) - \frac{1}{T}\sum_{t=1}^T L(\tilde h_t, \tilde\lambda_t),$
is bounded by the standard exponentiated gradient term $\frac{B\ln(4|\mathcal{A}|+3)}{\eta T}$ (plus the usual term scaling with η), together with an additional term, due to the Laplace perturbations, of order $\frac{8 B |\mathcal{A}| \sqrt{T\ln(1/\delta)}\,\ln(8T|\mathcal{A}|/\beta)}{(\min_{a,y}\hat q_{ay}\, m - 1)\,\epsilon}$.

Given Lemmas 4.1 and 4.2, Theorem 4.3 characterizes the average plays of both players.

Theorem 4.3. Let $(\tilde Q, \tilde\lambda)$ be the output of Algorithm 3. With probability at least $1 - \beta$, $(\tilde Q, \tilde\lambda)$ is a ν-approximate solution of the game, i.e.,
$L(\tilde Q, \tilde\lambda) \le L(Q, \tilde\lambda) + \nu$ for all $Q \in \Delta(\mathcal{H})$
$L(\tilde Q, \tilde\lambda) \ge L(\tilde Q, \lambda) - \nu$ for all $\lambda \in \Lambda$
where $\nu = \tilde O(\cdot)$ is an explicit bound that is increasing in B, $\ln(1/\delta)$ and $\ln(m^{d_{\mathcal{H}}}/\beta)$ and decreasing in $\min_{a,y}\hat q_{ay}$, m and ε, and where we hide further logarithmic dependence on m, ε, and $|\mathcal{A}|$ under the $\tilde O$ notation.

We are now ready to conclude the analysis of the DP-oracle-learner algorithm with our main theorem.

Theorem 4.4 (Error-Privacy, Fairness-Privacy Tradeoffs). Let ν be as in Theorem 4.3, let $(\tilde Q, \tilde\lambda)$ be the output of Algorithm 3, and let $Q^\star$ be the solution to the non-private γ-fair ERM problem. With probability at least $1 - \beta$,
$\widehat{err}(\tilde Q) \le \widehat{err}(Q^\star) + 2\nu$
and for all $a \neq 0$,
$\Delta\widehat{FP}_a(\tilde Q) \le \gamma + \frac{1 + 2\nu}{B}, \qquad \Delta\widehat{TP}_a(\tilde Q) \le \gamma + \frac{1 + 2\nu}{B}.$

Remark 4.1. The bounds above reveal a tradeoff, controlled through the parameter B, between accuracy and fairness violation. As B increases, the upper bound on error gets looser while the one on fairness violation gets tighter.

Remark 4.2. We assumed in this section that the protected attribute A is not available to the classifiers in $\mathcal{H}$ (A-blind classification) and stated all our bounds in terms of $d_{\mathcal{H}}$. In the more general setting where classifiers in $\mathcal{H}$ may depend on A (A-aware classification), similar results hold; the only change needed is to replace $\ln(m^{d_{\mathcal{H}}})$ with $\ln(|\mathcal{H}|)$ in the bounds. See the supplementary file for more details.
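To make the structure of Algorithm 3 concrete, here is a minimal sketch (ours, not the authors' implementation) of its noisy exponentiated-gradient loop; `best_response` stands in for the private Learner oracle $BEST^{\epsilon_1}_h$ (e.g. Subroutine 2 backed by a private cost-sensitive oracle), `fairness_violations` computes the vector $\hat r(h)$ of Section 4, and `b` is the Laplace scale used for the Auditor's gradients.

```python
import numpy as np

def dp_oracle_learner(best_response, fairness_violations, dim, B, T, eta, b, rng=None):
    """Noisy exponentiated-gradient play between the private Auditor and Learner.

    best_response(lam) -> a classifier h approximately minimizing the Lagrangian
        cost for the current dual weights lam (privately, e.g. via the
        exponential mechanism).
    fairness_violations(h) -> r_hat(h), a length-`dim` vector of constraint
        violations (dim = 4(|A| - 1) for equalized odds).
    b: Laplace scale for the gradient noise, as prescribed in Algorithm 3.
    """
    rng = rng or np.random.default_rng()
    theta = np.zeros(dim)
    hs, lams = [], []
    for _ in range(T):
        # Auditor: exponentiated-gradient play over {lam >= 0 : ||lam||_1 <= B}
        w = np.exp(theta)
        lam = B * w / (1.0 + w.sum())
        # Learner: private approximate best response to the current lam
        h = best_response(lam)
        # Auditor's gradient, perturbed with Laplace noise for privacy
        r_tilde = fairness_violations(h) + rng.laplace(scale=b, size=dim)
        theta = theta + eta * r_tilde
        hs.append(h)
        lams.append(lam)
    # Average plays: a uniform mixture over hs and the mean dual vector
    return hs, np.mean(lams, axis=0)
```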
4.1. Better Tradeoffs for A-aware Classification

In this subsection we show that if we only ask for equalized false positive rates (instead of equalized odds, which also requires equalized true positive rates), and if, moreover, we assume $\mathcal{H}$ includes all maximally discriminatory classifiers (see Assumption 4.1), then the fairness violation guarantees given in Theorem 4.4 can be improved.

Assumption 4.1. $\mathcal{H}$ includes all group indicator functions: $\{ h_a(X, A) = 1\{A = a\},\; \bar h_a(X, A) = 1\{A \neq a\} : a \in \mathcal{A} \} \subseteq \mathcal{H}$.

Theorem 4.5 (Error-Privacy, Fairness-Privacy Tradeoffs). Let $(\tilde Q, \tilde\lambda)$ be the output of Algorithm 3, and let $Q^\star$ be the solution to the non-private γ-fair ERM problem. Under Assumption 4.1, and for $B > |\mathcal{A}| - 1$, with probability at least $1 - \beta$,
$\widehat{err}(\tilde Q) \le \widehat{err}(Q^\star) + 2\nu$
and for all $a \neq 0$,
$\Delta\widehat{FP}_a(\tilde Q) \le \gamma + \frac{2\nu}{B - (|\mathcal{A}| - 1)}$
where $\nu = \tilde O(\cdot)$ is an explicit bound depending on B, $\min_{a,y}\hat q_{ay}$, $\ln(1/\delta)$, and $\ln(|\mathcal{H}|/\beta)$ (as well as on m and ε).

4.2. A Separation: A-blind vs. A-aware Classification

In this subsection we show that the sensitivity of the error of the optimal classifier subject to fairness constraints can be substantially higher if the classifier is prohibited from using sensitive attributes at test time, and thus more noise is needed to estimate this error subject to differential privacy. This reveals a fundamental tension between the goal of trading off privacy and approximate equalized odds, and the goal of preventing disparate treatment (using protected attributes explicitly in classification).

Given a data set D of m individuals, define $f(D)$ to be the optimal error rate of any classifier constrained to have false positive rate disparity at most γ. Now consider the following problem instance. Let X be the unprotected attribute taking values in $\mathcal{X} = \{U, V\}$, and let A be the protected attribute taking values in $\mathcal{A} = \{R, B\}$. Suppose $\mathcal{H}$ consists of two classifiers $h_0$ and $h_U$, where $h_0(X, A) = 0$ and $h_U(X, A) = 1\{X = U\}$. Notice that both $h_0$ and $h_U$ depend only on the unprotected attribute. Let $h_R$ and $h_B$ be two other classifiers that depend on the protected attribute: $h_R(X, A) = 1\{A = R\}$ and $h_B(X, A) = 1\{A = B\}$.

Theorem 4.6. Consider $\gamma > 1/m$ and datasets with $\min_a \hat q_{a0} \ge C$ for some constant $C > 0$. If $\mathcal{H} = \{h_0, h_U\}$, the sensitivity of f is $\Omega(1/(\gamma m))$. If the maximally discriminatory classifiers $h_R$ and $h_B$ are also included in $\mathcal{H}$, i.e. $\mathcal{H} = \{h_0, h_U, h_R, h_B\}$, the sensitivity of f is $O(1/m)$.

[Figure 1: The left panel shows the Pareto frontier of error and (equalized odds) fairness violation for the DP-oracle-learner algorithm on the Communities and Crime dataset across different privacy parameters ε; the right panel shows the corresponding Pareto curves for the DP-postprocessing algorithm. Each point on the private curves is averaged over many rounds to reduce the effect of noise variance. See text for details.]

5. Experimental Evaluation

As a proof of concept, we empirically evaluate our two algorithms on a common fairness benchmark dataset: the Communities and Crime dataset from the UC Irvine Machine Learning Repository. Briefly, each record in this dataset summarizes aggregate socioeconomic information about both the citizens and police force in a particular U.S. community, and the problem is to predict whether the community has a high rate of violent crime. We refer the reader to (Kearns et al., 2018a) for an outline of potential fairness concerns present in the dataset. We clean and preprocess the data identically to (Kearns et al., 2018a). Our main experimental goal is to obtain, for both algorithms, the Pareto frontier of error and fairness violation tradeoffs at different levels of differential privacy.
To elaborate: for a given setting of the input parameters, we start with the target fairness violation bound γ = 0 and then increase it over a rich pre-specified subset of [0, 1], recording for each γ the error and the (realized) fairness violation of the classifier output by the algorithm. We take $\mathcal{H}$ to be the class of linear threshold functions, β = 0.05, and δ = 10^{-7}. Logistic regression is used as the base classifier of the DP-postprocessing algorithm in our experiments. To implement the Learner's cost-sensitive classification oracle used in the DP-oracle-learner algorithm, following (Kearns et al., 2018a), we build a regression-based linear predictor for each vector of costs ($C^0$ and $C^1$), and classify a point according to the lowest predicted cost. We made this private following the method of (Smith et al., 2017): computing each regression as $(X^\top X)^{-1} X^\top C_b$, and adding appropriately scaled Laplace noise to both $X^\top X$ and $X^\top C_b$. Note that when the sensitive attribute A is not included in X (the A-blind case, as in our experiments), noise need not be added to $X^\top X$, as we only need to guarantee the privacy of A.

The theory is ambiguous in its predictions about which algorithm should perform better: the privacy cost is higher for the in-processing algorithm, but the benchmark that the post-processing algorithm competes with is weaker. We would therefore generally expect that on sufficiently large datasets the in-processing algorithm obtains better tradeoffs, but on small datasets the post-processing algorithm does. Our experimental results appear in Fig. 1. Indeed, on our relatively small dataset (m ≈ 2K), the post-processing algorithm can obtain good tradeoffs between accuracy and fairness at meaningful levels of ε, whereas the in-processing algorithm cannot. Nevertheless, we can empirically obtain the shape of the Pareto curve trading off accuracy and fairness at unreasonably large levels of ε using our algorithm. This is still valuable, because the value of ε obtained by our algorithm predictably decreases as the dataset size m increases, without otherwise changing the dynamics of the algorithm. For example, if we upsampled our dataset by a factor of 10 (i.e. took 10 copies of the dataset), the result would be a reasonably sized dataset of m ≈ 20K. Our algorithm run on this upsampled dataset would obtain the same tradeoff curve, but now with meaningful values of ε. In the left panel of Fig. 1, ε is the actual privacy parameter used in the experiments, while ε' is the value that the privacy parameter would take on the upsampled dataset.

Recall that the post-processing approach requires the use of the protected attribute at test time, but the in-processing approach does not. Our results therefore suggest that the requirement that we not use the protected attribute at test time (i.e. that we avoid disparate treatment) might be extremely burdensome if we also want the protections of differential privacy and have only small dataset sizes. In contrast, this burden can be overcome with the in-processing algorithm at larger dataset sizes.
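As an illustration of the private regression step described above, here is a sketch in the spirit of the sufficient-statistics perturbation of (Smith et al., 2017) for the A-blind case; the function name and the generic sensitivity argument `sens` are ours, not the exact calibration used in the experiments.

```python
import numpy as np

def private_cost_regression(X, c, eps, sens, rng=None):
    """Linear cost predictor (X^T X)^{-1} X^T c with a perturbed X^T c.

    In the A-blind case the features X do not include the protected attribute,
    so X^T X does not depend on A and needs no noise; only X^T c does, because
    the costs c depend on the protected attribute through the dual weights.
    `sens` should upper-bound the L1 sensitivity of X^T c with respect to one
    individual's protected attribute.
    """
    rng = rng or np.random.default_rng()
    d = X.shape[1]
    xtx = X.T @ X
    xtc = X.T @ c + rng.laplace(scale=sens / eps, size=d)
    # A small ridge term keeps the solve well conditioned after perturbation.
    return np.linalg.solve(xtx + 1e-6 * np.eye(d), xtc)

# In the experiments' oracle, one such regression is fit for each cost vector
# (C^0 and C^1), and a point X_i is then labeled with whichever of the two
# predicted costs is lower.
```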
Acknowledgements

AR is supported in part by NSF grants AF-1763307 and CNS-1253345. JU is supported by NSF grants CCF-1718088, CCF-1750640, and CNS-1816028, and a Google Faculty Research Award. Alina Oprea was partially supported by the Combat Capabilities Development Command Army Research Laboratory under Cooperative Agreement Number W911NF-13-2-0045 (ARL Cyber Security CRA). The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Combat Capabilities Development Command Army Research Laboratory or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

References

ACM. ACM Conference on Fairness, Accountability and Transparency. 2019. URL https://fatconference.org/index.html.

Agarwal, A., Beygelzimer, A., Dudík, M., Langford, J., and Wallach, H. A reductions approach to fair classification. In Dy, J. and Krause, A. (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 60-69, Stockholmsmässan, Stockholm, Sweden, 10-15 Jul 2018. PMLR. URL http://proceedings.mlr.press/v80/agarwal18a.html.

Chouldechova, A. and Roth, A. The frontiers of fairness in machine learning. 2018. URL arXiv:1810.08810.

Dwork, C. and Roth, A. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211-407, 2014. ISSN 1551-305X. doi: 10.1561/0400000042. URL http://dx.doi.org/10.1561/0400000042.

Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., and Naor, M. Our data, ourselves: Privacy via distributed noise generation. In Vaudenay, S. (ed.), Advances in Cryptology - EUROCRYPT 2006, pp. 486-503, Berlin, Heidelberg, 2006a. Springer Berlin Heidelberg. ISBN 978-3-540-34547-3.

Dwork, C., McSherry, F., Nissim, K., and Smith, A. Calibrating noise to sensitivity in private data analysis. In Halevi, S. and Rabin, T. (eds.), Theory of Cryptography, pp. 265-284, Berlin, Heidelberg, 2006b. Springer Berlin Heidelberg. ISBN 978-3-540-32732-5.

Dwork, C., Rothblum, G. N., and Vadhan, S. Boosting and differential privacy. In Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, FOCS '10, pp. 51-60, Washington, DC, USA, 2010. IEEE Computer Society. ISBN 978-0-7695-4244-7. doi: 10.1109/FOCS.2010.12. URL http://dx.doi.org/10.1109/FOCS.2010.12.

Hardt, M., Price, E., and Srebro, N. Equality of opportunity in supervised learning. In Lee, D. D., Sugiyama, M., Luxburg, U. V., Guyon, I., and Garnett, R. (eds.), Advances in Neural Information Processing Systems 29, pp. 3315-3323. Curran Associates, Inc., 2016. URL http://papers.nips.cc/paper/6374-equality-of-opportunity-in-supervised-learning.pdf.

Kearns, M., Neel, S., Roth, A., and Wu, Z. S. An empirical study of rich subgroup fairness for machine learning. 2018a. URL arXiv:1808.08166v1.

Kearns, M., Neel, S., Roth, A., and Wu, Z. S. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In Dy, J. and Krause, A. (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 2564-2572, Stockholmsmässan, Stockholm, Sweden, 10-15 Jul 2018b. PMLR. URL http://proceedings.mlr.press/v80/kearns18a.html.

Kilbertus, N., Gascón, A., Kusner, M. J., Veale, M., Gummadi, K. P., and Weller, A. Blind justice: Fairness with encrypted sensitive attributes. 2018. URL arXiv:1806.03281v1.

McSherry, F. and Talwar, K. Mechanism design via differential privacy.
In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science, FOCS '07, pp. 94-103, Washington, DC, USA, 2007. IEEE Computer Society. ISBN 0-7695-3010-9. doi: 10.1109/FOCS.2007.41. URL http://dx.doi.org/10.1109/FOCS.2007.41.

Smith, A., Thakurta, A., and Upadhyay, J. Is interaction necessary for distributed private learning? In Security and Privacy (SP), 2017 IEEE Symposium on, pp. 58-77. IEEE, 2017.

Song, C., Ristenpart, T., and Shmatikov, V. Machine learning models that remember too much. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pp. 587-601. ACM, 2017.

Veale, M. and Binns, R. Fairer machine learning in the real world: Mitigating discrimination without collecting sensitive data. Big Data & Society, 4(2):2053951717743530, 2017.