# Fair Learning with Private Demographic Data

Hussein Mozannar (IDSS, Massachusetts Institute of Technology, MA, USA), Mesrob I. Ohannessian (Department of Electrical and Computer Engineering, University of Illinois at Chicago, IL, USA), Nathan Srebro (Toyota Technological Institute, IL, USA). Correspondence to: Hussein Mozannar. Proceedings of the 37th International Conference on Machine Learning, Online, PMLR 119, 2020. Copyright 2020 by the author(s).

Abstract

Sensitive attributes such as race are rarely available to learners in real-world settings, as their collection is often restricted by laws and regulations. We give a scheme that allows individuals to release their sensitive information privately while still allowing any downstream entity to learn non-discriminatory predictors. We show how to adapt non-discriminatory learners to work with privatized protected attributes, giving theoretical guarantees on performance. Finally, we highlight how the methodology could apply to learning fair predictors in settings where protected attributes are only available for a subset of the data.

1. Introduction

As algorithmic systems driven by machine learning play an increasingly important role in society, concerns arise over their compliance with laws, regulations, and societal norms. In particular, machine learning systems have been found to discriminate against certain demographic groups in applications such as criminal assessment, lending, and facial recognition (Barocas et al., 2019). To ensure non-discrimination in learning tasks, knowledge of the sensitive attributes is essential; however, laws and regulations often prohibit access to and use of this sensitive data. As an example, credit card companies do not have the right to ask about an individual's race when they apply for credit, while at the same time they have to prove that their decisions are non-discriminatory (Commission, 2013; Chen et al., 2019). Apple Card, a credit card offered by Apple and Goldman Sachs, was recently accused of being discriminatory (Vigdor, 2019). Married couples rushed to Twitter to report significant differences in the credit limits given individually to each of them, even though they had shared finances and similar income levels. Supposing Apple was trying to make sure its learned model was non-discriminatory, it would have been forced to use proxies for gender, and recent work has shown that proxies can be problematic, potentially underestimating discrimination (Kallus et al., 2019).

We are then faced with what seem to be two opposing societal notions to satisfy: we want our system to be non-discriminatory while maintaining the privacy of the sensitive attributes. Note that even if the features our model uses are independent of the sensitive attributes, this is not enough to guarantee notions of non-discrimination that further condition on the truth, e.g., equalized odds. One potential workaround, ignoring legal feasibility, is to allow individuals to release their data in a locally differentially private manner (Dwork et al., 2006) and then to learn a non-discriminatory predictor from this privatized data. This allows us to guarantee that our decisions are fair while maintaining a degree of individual privacy for each user. In this work, we consider a binary classification framework where we have access to non-sensitive features X and locally private versions of the sensitive attributes A, denoted by Z.
The details of the problem formulation are given in Section 3. Our contributions are as follows:

- We give sufficient conditions on our predictor for non-discrimination to be equivalent under A and Z, and derive estimators to measure discrimination using the private attributes Z (Section 4).
- We give a learning algorithm based on the two-step procedure of Woodworth et al. (2017) and provide statistical guarantees for both the error and the discrimination of the resulting predictor. The main innovation, in terms of both the algorithm and its analysis, is in accessing properties of the sensitive attribute A by carefully inverting the sample statistics of the private attributes Z (Section 5).
- We highlight how some of the same approach can handle other forms of deficiency in demographic information, by giving an auditing algorithm with guarantees when protected attributes are available only for a subset of the data (Section 6).

Beyond the original motivation, this work conveys additional insight on the subtle trade-offs between error and discrimination. In this perspective, privacy is not in itself a requirement, but rather an analytic tool. We give some experimental illustrations of these trade-offs.

2. Related Work

Enforcing non-discrimination constraints in supervised learning has been extensively explored, with many algorithms proposed to learn fair predictors using methods that generally fall into one of three categories: pre-processing (Zemel et al., 2013), in-processing (Cotter et al., 2018; Agarwal et al., 2018), or post-processing (Hardt et al., 2016). In this work we focus on group-wise statistical notions of discrimination, setting aside critical concerns of individual fairness (Dwork et al., 2012).

Kilbertus et al. (2018) were the first to propose learning a fair predictor without disclosing information about protected attributes, using secure multi-party computation (MPC). However, as Jagielski et al. (2018) noted, MPC does not guarantee that the predictor cannot leak individual information. In response, Jagielski et al. (2018) proposed differentially private (DP) (Dwork et al., 2006) variants of fair learning algorithms. More recent work has similarly explored learning fair and DP predictors (Cummings et al., 2019; Xu et al., 2019; Alabi, 2019; Bagdasaryan & Shmatikov, 2019). In our setting, local privacy maintains all the guarantees of DP while additionally preventing the learner from knowing for certain any sensitive information about a particular data point.

Related work has also considered fair learning when the protected attribute is missing or noisy (Hashimoto et al., 2018; Gupta et al., 2018; Lamy et al., 2019; Awasthi et al., 2019; Kallus et al., 2019; Wang et al., 2020). Among these, the most related setting is that of Lamy et al. (2019), but it contrasts with the present work on several critical points. The simplest difference is the generalization here to non-binary groups, and the corresponding precise characterization of the equivalence between exact non-discrimination with respect to the original and private attributes. More importantly, their approach corresponds only to the first step of our algorithm. As we show in Lemma 2, the first step makes the non-discrimination guarantee depend on both the privacy level and the complexity of the hypothesis class, which could be very costly. We remedy this using the second step of our algorithm.
Awasthi et al. (2019) consider a more general noise model for the protected attributes in the training data, but assume access to the actual protected attributes at test time. The fact that A is provided at test time guarantees that the predictor is not a function of Z, and hence, for the LDP noise mechanism, Proposition 1 tells us that it is enough to guarantee non-discrimination with respect to Z in order to be non-discriminatory with respect to A, which considerably simplifies the problem.

3. Problem Formulation

A predictor $\hat{Y}$ of a binary target $Y \in \{0,1\}$ is a function of non-sensitive attributes $X \in \mathcal{X}$ and possibly also of a sensitive (or protected) attribute $A \in \mathcal{A}$, denoted $\hat{Y} := h(X)$ or $\hat{Y} := h(X, A)$. We consider a binary classification task where the goal is to learn such a predictor while ensuring a specified notion of non-discrimination with respect to A. As an example, when deciding whether to extend credit to a given individual, the protected attribute could denote the person's race and sex, and the features X could contain the person's financial history, level of education, and housing information. Note that X could very well include proxies for A, such as zip code, from which race can be reliably inferred (Bureau, 2014).

Our focus here is on statistical notions of group-wise non-discrimination, among which are the following:

Definition 1 (Fairness Definitions). A classifier $\hat{Y}$ satisfies:
- Equalized odds (EO) if $\forall a \in \mathcal{A}$: $P(\hat{Y}=1 \mid A=a, Y=y) = P(\hat{Y}=1 \mid Y=y)$ for $y \in \{0,1\}$,
- Demographic parity (DP) if $\forall a \in \mathcal{A}$: $P(\hat{Y}=1 \mid A=a) = P(\hat{Y}=1)$,
- Accuracy parity (AP) if $\forall a \in \mathcal{A}$: $P(\hat{Y}=Y \mid A=a) = P(\hat{Y}=Y)$,
- False discovery ($\hat{y}=1$) / omission ($\hat{y}=0$) rate parity if $\forall a \in \mathcal{A}$: $P(\hat{Y}=Y \mid \hat{Y}=\hat{y}, A=a) = P(\hat{Y}=Y \mid \hat{Y}=\hat{y})$.

Our treatment extends to a very broad family of demographic fairness constraints: let $E_1, E_2$ be two probability events defined with respect to $(X, Y, \hat{Y})$, and define $(E_1, E_2)$-non-discrimination with respect to A as having:

$$P(E_1 \mid E_2, A=a) = P(E_1 \mid E_2, A=a') \quad \forall a, a' \in \mathcal{A}. \tag{1}$$

All the notions considered in Definition 1 can be cast into the above formulation for one or more sets of events $(E_1, E_2)$. Additionally, one can naturally define approximate versions of the above fairness constraints. As an example, for the notion of equalized odds, let $\mathcal{A} = \{0, 1, \dots, |\mathcal{A}|-1\}$ and define $\gamma_{y,a}(\hat{Y}) = P(\hat{Y}=1 \mid Y=y, A=a)$; then $\hat{Y}$ satisfies $\alpha$-EO if:

$$\max_{y \in \{0,1\},\, a \in \mathcal{A}} \Gamma_{ya} := \big|\gamma_{y,a}(\hat{Y}) - \gamma_{y,0}(\hat{Y})\big| \le \alpha.$$

While it is clear that learning or auditing fair predictors requires knowledge of the protected attributes, laws and regulations often restrict the use and the collection of this data (Jagielski et al., 2018). Moreover, even if there are no restrictions on the usage of the protected attribute, it is desirable that this information is not leaked by (1) the algorithm's output and (2) the data collected. Local differential privacy (LDP) guarantees that the entity holding the data does not know for certain the protected attribute of any data point, which in turn ensures that any algorithm built on this data is differentially private. Formally, a locally $\epsilon$-differentially private mechanism Q is defined as follows:
Definition 2 (Duchi et al., 2013). Q is $\epsilon$-differentially private if:

$$\max_{z,\, a,\, a'} \frac{Q(Z=z \mid a)}{Q(Z=z \mid a')} \le e^{\epsilon}.$$

The mechanism we employ is the randomized response mechanism (Warner, 1965; Kairouz et al., 2014):

$$Q(Z=z \mid A=a) = \begin{cases} \dfrac{e^{\epsilon}}{|\mathcal{A}|-1+e^{\epsilon}} =: \pi & \text{if } z = a, \\[4pt] \dfrac{1}{|\mathcal{A}|-1+e^{\epsilon}} =: \bar{\pi} & \text{if } z \ne a. \end{cases}$$

The choice of the randomized response mechanism is motivated by its optimality for distribution estimation under LDP constraints (Kairouz et al., 2014; 2016). The hope is that LDP samples of A are sufficient to ensure non-discrimination, allowing us to refrain from the problematic use of proxies for A.

For the remainder of this paper, we assume that we have access to n samples $S = \{(x_i, y_i, z_i)\}_{i=1}^{n}$ resulting from an i.i.d. draw from an unknown distribution P over $\mathcal{X} \times \mathcal{Y} \times \mathcal{A}$, where $\mathcal{A} = \{0, 1, \dots, |\mathcal{A}|-1\}$ and $\mathcal{Y} = \{0,1\}$, but where A is not observed and instead Z is sampled from $Q(\cdot \mid A)$ independently of X and Y. We call Z the privatized protected attribute. To emphasize the difference between A and Z with respect to fairness, let $q_{y,a}(\hat{Y}) = P(\hat{Y}=1 \mid Y=y, Z=a)$, and note that $\hat{Y}$ satisfies $\alpha$-EO with respect to Z if:

$$\max_{y \in \{0,1\},\, a \in \mathcal{A}} \big|q_{y,a}(\hat{Y}) - q_{y,0}(\hat{Y})\big| \le \alpha.$$

4. Auditing for Discrimination

The two main questions we answer in this section are whether non-discrimination with respect to A and with respect to Z are equivalent, and how to estimate the non-discrimination of a given predictor. First, note that if a certain predictor $\hat{Y} = h(X, Z)$ uses Z for predictions and is non-discriminatory with respect to Z, it is possible for it to in fact be discriminatory with respect to A. In Appendix A, we give an explicit example of such a predictor that violates the equivalence for EO. This illustrates that naïve implementations of fair learning methods can be more discriminatory than perceived. Any method that naïvely uses the attribute Z for its final predictions cannot immediately guarantee any level of non-discrimination with respect to A, especially post-processing methods. This is not the case, however, when predictors do not avail themselves of the privatized protected attribute Z. Namely, consider $\hat{Y}$ that is only a function of X. Since the randomness in the privatization mechanism is independent of X, this implies in particular that $\hat{Y}$ is independent of Z given A. Our first result is that exact non-discrimination is invariant under local privacy:

Proposition 1. Consider any exact non-discrimination notion among equalized odds, demographic parity, accuracy parity, or equality of false discovery/omission rates. Let $\hat{Y} := h(X)$ be a binary predictor; then $\hat{Y}$ is non-discriminatory with respect to A if and only if it is non-discriminatory with respect to Z.

Proof Sketch. We consider a general formulation of the constraints previously mentioned: let $E_1, E_2$ be two probability events defined with respect to $(X, Y, \hat{Y})$, and define non-discrimination with respect to A as having $P(E_1 \mid E_2, A=a) = P(E_1 \mid E_2, A=a')$ for all $a, a' \in \mathcal{A}$. Define this notion similarly with respect to Z. We can obtain the following relation for the conditional probabilities:

$$P(E_1 \mid E_2, Z=a) = P(E_1 \mid E_2, A=a)\,\frac{\pi\, P(A=a, E_2)}{P(Z=a, E_2)} + \sum_{a' \in \mathcal{A}\setminus\{a\}} P(E_1 \mid E_2, A=a')\,\frac{\bar{\pi}\, P(A=a', E_2)}{P(Z=a, E_2)}.$$

Let $\mathbf{P}$ be the following $|\mathcal{A}| \times |\mathcal{A}|$ matrix:

$$\mathbf{P}_{i,i} = \frac{\pi\, P(A=i, E_2)}{P(Z=i, E_2)} \ \text{ for } i \in \mathcal{A}, \qquad \mathbf{P}_{i,j} = \frac{\bar{\pi}\, P(A=j, E_2)}{P(Z=i, E_2)} \ \text{ for } i, j \in \mathcal{A} \text{ s.t. } i \ne j. \tag{2}$$

Then we have the following linear system of equations:

$$\begin{pmatrix} P(E_1 \mid E_2, Z=0) \\ \vdots \\ P(E_1 \mid E_2, Z=|\mathcal{A}|-1) \end{pmatrix} = \mathbf{P} \begin{pmatrix} P(E_1 \mid E_2, A=0) \\ \vdots \\ P(E_1 \mid E_2, A=|\mathcal{A}|-1) \end{pmatrix}. \tag{3}$$

The matrix $\mathbf{P}$ is row-stochastic and invertible; from this linear system we can deduce that non-discrimination with respect to Z and with respect to A are equivalent. Details are left to Appendix A.
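To make the privatization step concrete, the following is a minimal sketch of the randomized response release described above. It is an illustration rather than the authors' code; the function name and the use of numpy are our own choices.

```python
import numpy as np

def randomized_response(a, k, eps, rng=None):
    """Release a privatized copy Z of the protected attribute A in {0, ..., k-1}.

    With probability pi = e^eps / (k - 1 + e^eps) the true value is reported;
    otherwise one of the other k - 1 values is reported uniformly at random,
    each with probability pi_bar = 1 / (k - 1 + e^eps).  This satisfies
    eps-local differential privacy.
    """
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    pi = np.exp(eps) / (k - 1 + np.exp(eps))
    keep = rng.random(a.shape) < pi
    # For flipped entries, draw a uniformly random *different* value.
    offset = rng.integers(1, k, size=a.shape)
    z = np.where(keep, a, (a + offset) % k)
    return z
```

For example, `randomized_response(A, k=2, eps=1.0)` releases a binary attribute with the flip probability $\bar{\pi} = 1/(1+e)$ used in our experiments.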
Note that while $\hat{Y}$ not being a function of Z is a sufficient condition for the conclusion of Proposition 1 to hold, the more general condition for EO is that $\hat{Y}$ be independent of Z given A and Y; however, actualizing this condition beyond simply ignoring Z at test time is unclear.

We next study how to measure non-discrimination from samples. Unfortunately, Proposition 1 applies only in the population limit. For example, for the EO notion, despite what the proposition seems to suggest, naïve sample $\alpha$-discrimination relative to Z underestimates discrimination relative to A. Interestingly, however, for any of the considered fairness notions, we can recover the population statistics with respect to A via a linear system of equations relating them to those of Z, as in (3). This is done by inverting the matrix $\mathbf{P}$ defined in (2); however, more care is needed: to compute the matrix $\mathbf{P}$ one needs quantities involving the attribute A, which then all have to be related back to Z. Using this relation, we derive an estimator for the discrimination of a predictor that does not suffer from the bias of the naïve approach.

First we set key notation for the rest of the paper: $P_{ya} := P(Y=y, A=a)$, $Q_{ya} := P(Y=y, Z=a)$, and $C = \frac{|\mathcal{A}|-2+e^{\epsilon}}{e^{\epsilon}-1}$. The latter captures the scale of privatization: $C \in O(\epsilon^{-1})$ if $\epsilon \le 1$. Let $\mathbf{P}$ be the $|\mathcal{A}| \times |\mathcal{A}|$ matrix with entries:

$$\mathbf{P}_{i,i} = \pi \frac{P_{yi}}{Q_{yi}} \ \text{ for } i \in \mathcal{A}, \qquad \mathbf{P}_{i,j} = \bar{\pi} \frac{P_{yj}}{Q_{yi}} \ \text{ for } i, j \in \mathcal{A} \text{ s.t. } i \ne j.$$

Then one can relate $q_{y,\cdot}$ and $\gamma_{y,\cdot}$ via:

$$\begin{pmatrix} q_{y,0} \\ \vdots \\ q_{y,|\mathcal{A}|-1} \end{pmatrix} = \mathbf{P} \begin{pmatrix} \gamma_{y,0} \\ \vdots \\ \gamma_{y,|\mathcal{A}|-1} \end{pmatrix},$$

and thus by inverting $\mathbf{P}$ we can recover $\gamma_{y,a}$. However, the matrix $\mathbf{P}$ involves the probabilities $P(Y=y, A=a)$, which we do not have access to but can similarly recover by noting that:

$$Q_{yz} = \pi P_{yz} + \bar{\pi} \sum_{a \in \mathcal{A}\setminus\{z\}} P_{ya}.$$

Let the matrix $\Pi \in \mathbb{R}^{|\mathcal{A}| \times |\mathcal{A}|}$ be defined by $\Pi_{i,j} = \pi$ if $i = j$ and $\Pi_{i,j} = \bar{\pi}$ if $i \ne j$. Therefore $\Pi^{-1}_{k} Q_{y,\cdot} = P(Y=y, A=k)$, where $\Pi^{-1}_{k}$ is the k-th row of $\Pi^{-1}$. Hence we can plug this estimate in to compute $\mathbf{P}$ and invert the linear system to measure discrimination. In Lemma 1, we characterize the sample complexity needed by our estimator to bound the violation in discrimination, specifically for the EO constraint. The privacy penalty C arises from $\|\mathbf{P}\|_{\infty}$.

Lemma 1. For any $\delta \in (0, 1/2)$ and any binary predictor $\hat{Y} := h(X)$, denote by $\tilde{\Gamma}^{S}_{ya}$ our proposed estimator for $\Gamma_{ya}$ based on S. If $n \ge \frac{8 \log(8|\mathcal{A}|/\delta)}{\min_{ya} P_{ya}}$, then with probability at least $1-\delta$:

$$\max_{y,a}\, \big|\tilde{\Gamma}^{S}_{ya} - \Gamma_{ya}\big| \le \cdots$$
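Concretely, the plug-in estimator described above can be sketched as follows. This is our own illustration of the construction (variable names are ours), and it assumes every empirical $(y, z)$ cell is non-empty so that the sample conditional probabilities are well defined.

```python
import numpy as np

def estimate_eo_gamma(y_true, y_pred, z, k, eps):
    """Estimate gamma_{y,a} = P(Yhat=1 | Y=y, A=a) from privatized groups Z.

    First recovers P(Y=y, A=.) from the empirical P(Y=y, Z=.) by inverting the
    randomized-response channel Pi, then recovers gamma_{y,.} from the empirical
    q_{y,.} = P(Yhat=1 | Y=y, Z=.) by inverting the matrix P of Section 4.
    """
    y_true, y_pred, z = map(np.asarray, (y_true, y_pred, z))
    pi = np.exp(eps) / (k - 1 + np.exp(eps))
    pi_bar = 1.0 / (k - 1 + np.exp(eps))
    Pi = np.full((k, k), pi_bar) + (pi - pi_bar) * np.eye(k)   # channel matrix

    gamma = np.zeros((2, k))
    for y in (0, 1):
        mask_y = y_true == y
        # empirical Q_{y,a} = P(Y=y, Z=a) and q_{y,a} = P(Yhat=1 | Y=y, Z=a)
        Q = np.array([np.mean(mask_y & (z == a)) for a in range(k)])
        q = np.array([y_pred[mask_y & (z == a)].mean() for a in range(k)])
        P_ya = np.linalg.solve(Pi, Q)                          # recover P(Y=y, A=.)
        # P_mat[i, j] = (pi if i == j else pi_bar) * P_{y,j} / Q_{y,i}
        P_mat = (np.full((k, k), pi_bar) + (pi - pi_bar) * np.eye(k))
        P_mat = P_mat * P_ya[None, :] / Q[:, None]
        gamma[y] = np.linalg.solve(P_mat, q)                   # recover gamma_{y,.}
    return gamma   # Gamma_{ya} is then |gamma[y, a] - gamma[y, 0]|
```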
5. Learning Fair Predictors

In this section, we give a strategy to learn a non-discriminatory predictor with respect to A from the data S, which only contains the privatized attribute Z. As in Lemma 1, for concreteness and clarity we restrict the analysis to the notion of equalized odds (EO); most of the analysis extends directly to other constraints. In light of the limitation identified by Proposition 1, let $\mathcal{H}$ be a hypothesis class of functions that depend only on X. Instead of a single predictor in the class, we exhibit a distribution over hypotheses, which we interpret as a randomized predictor. Let $\Delta(\mathcal{H})$ be the set of all distributions over $\mathcal{H}$, and denote such a randomized predictor by $Q \in \Delta(\mathcal{H})$. The goal is to learn a predictor that approximates the performance of the optimal non-discriminatory distribution:

$$Y^{*} = \arg\min_{Q \in \Delta(\mathcal{H})} P(Q(X) \ne Y) \tag{4}$$
$$\text{s.t. } \gamma_{y,a}(Q) = \gamma_{y,0}(Q) \quad \forall y \in \{0,1\},\ a \in \mathcal{A}. \tag{5}$$

A first natural approach would be to treat the private attribute Z as if it were A and ensure on S that the learned predictor is non-discriminatory. Since the hypothesis class $\mathcal{H}$ consists of functions that depend only on X, Proposition 1 applies and offers hope that, if we were able to achieve exact non-discrimination with respect to Z, we would in fact be non-discriminatory with respect to A. There are two problems with this approach. First, exact non-discrimination is computationally hard to achieve, and approximate non-discrimination underestimates the discrimination by the privacy penalty C. Second, using an in-processing learning method, such as the reductions approach of Agarwal et al. (2018), results in a discrimination guarantee that scales with the complexity of $\mathcal{H}$.

Our approach is to adapt the two-step procedure of Woodworth et al. (2017) to our setting. We start by dividing our data set S into two equal parts $S_1$ and $S_2$. The first step is to learn an approximately non-discriminatory predictor $\hat{Y} = Q(X)$ with respect to Z on $S_1$ via the reductions approach of Agarwal et al. (2018), which we detail in the next subsection. This predictor has low error, but may be highly discriminatory, because the complexity of the class affects the generalization of the non-discrimination of $\hat{Y}$. The aim of the second step is to produce a final predictor $\tilde{Y}$ that corrects for this discrimination without increasing the error by much. We modify the post-processing procedure of Hardt et al. (2016) to give us non-discrimination with respect to A directly for the derived predictor $\tilde{Y} = f(\hat{Y}, Z)$. The predictor in the second step does use Z; however, with a careful analysis we are able to show that it indeed guarantees non-discrimination with respect to A. Note that naïvely using the post-processing procedure of Hardt et al. (2016) fails. Two relationships link the first step to the second: how discrimination with respect to Z and with respect to A relate, and how the discrimination from the first step affects the error of the derived predictor. In the following subsections we describe each of the steps, along with the statistical guarantees on their performance.

5.1. Step 1: Approximate Non-Discrimination with respect to Z

The first step aims to learn a predictor $\hat{Y}$ that is approximately $\alpha_n$-discriminatory with respect to Z, defined as:

$$\hat{Y} = \arg\min_{Q \in \Delta(\mathcal{H})} \text{err}^{S_1}(Q(X)) \tag{6}$$
$$\text{s.t. } \max_{y \in \{0,1\},\ a \in \mathcal{A}} \big|q^{S_1}_{y,a}(Q) - q^{S_1}_{y,0}(Q)\big| \le \alpha_n, \tag{7}$$

where for $Q \in \Delta(\mathcal{H})$ we use the shorthand $\text{err}(Q) = P(Q(X) \ne Y)$, and quantities with a superscript S denote their empirical counterparts. To solve the optimization problem defined in (6), we reduce the constrained problem to a weighted unconstrained problem following the approach of Agarwal et al. (2018). As is typical with the family of fairness criteria considered, the constraint in (7) can be rewritten as an explicit linear constraint on $\hat{Y}$. Let $\mathcal{J} = \mathcal{Y} \times \mathcal{A}$ and $\mathcal{K} = \mathcal{Y} \times (\mathcal{A}\setminus\{0\}) \times \{+, -\}$, define $\gamma(Q) \in \mathbb{R}^{|\mathcal{J}|}$ with $\gamma(Q)_{(y,a)} = \gamma_{y,a}(Q)$, and let the matrix $M \in \mathbb{R}^{|\mathcal{K}| \times |\mathcal{J}|}$ have entries:

$$M_{(y,a,+),(a',y')} = \mathbb{I}(a = a',\, y = y'), \qquad M_{(y,a,-),(a',y')} = -\mathbb{I}(a = a',\, y = y') \quad \text{for } a' \ne 0,$$
$$M_{(y,a,+),(0,y')} = -\mathbb{I}(y = y'), \qquad M_{(y,a,-),(0,y')} = \mathbb{I}(y = y').$$

With this reparametrization, we can write $\alpha_n$-EO as:

$$M\gamma(Q) \le \alpha_n \mathbf{1}. \tag{8}$$
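The sketch below instantiates such a constraint matrix for equalized odds; the function name and the index ordering are our own conventions, chosen only for illustration.

```python
import numpy as np

def eo_constraint_matrix(n_groups):
    """Build a matrix M encoding alpha-EO as M gamma(Q) <= alpha * 1.

    Columns are indexed by (y, a); rows by (y, a, s) for a != 0 and sign s.
    Row (y, a, +1) encodes gamma_{y,a} - gamma_{y,0} <= alpha and
    row (y, a, -1) encodes gamma_{y,0} - gamma_{y,a} <= alpha.
    """
    cols = [(y, a) for y in (0, 1) for a in range(n_groups)]
    rows = [(y, a, s) for y in (0, 1) for a in range(1, n_groups) for s in (+1, -1)]
    M = np.zeros((len(rows), len(cols)))
    for i, (y, a, s) in enumerate(rows):
        M[i, cols.index((y, a))] = s
        M[i, cols.index((y, 0))] = -s
    return M, rows, cols
```

Given the vector of empirical rates `gamma` ordered as `cols`, the $\alpha_n$-EO check is then simply `np.all(M @ gamma <= alpha_n)`.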
Let us introduce the Lagrange multiplier $\lambda \in \mathbb{R}^{|\mathcal{K}|}_{+}$ and define the Lagrangian:

$$L(Q, \lambda) = \text{err}(Q) + \lambda^{\top}\big(M\gamma(Q) - \alpha\mathbf{1}\big). \tag{9}$$

We constrain the norm of $\lambda$ by $B \in \mathbb{R}_{+}$ and consider the following two dual problems:

$$\min_{Q \in \Delta(\mathcal{H})}\ \max_{\lambda \in \mathbb{R}^{|\mathcal{K}|}_{+},\, \|\lambda\|_1 \le B} L(Q, \lambda) \tag{10}$$
$$\max_{\lambda \in \mathbb{R}^{|\mathcal{K}|}_{+},\, \|\lambda\|_1 \le B}\ \min_{Q \in \Delta(\mathcal{H})} L(Q, \lambda) \tag{11}$$

Note that L is linear in both Q and $\lambda$ and their domains are convex and compact; hence the respective solutions of the two problems form a saddle point of L (Agarwal et al., 2018). To find the saddle point, we treat our problem as a zero-sum game between two players, the Q-player (learner) and the $\lambda$-player (auditor), and use the strategy of Freund & Schapire (1996). The auditor follows the exponentiated gradient algorithm and the learner picks its best response to the auditor. The approach is fully described in Algorithm 1.

Algorithm 1: Exponentiated-gradient reduction for fair classification (Agarwal et al., 2018)
  Input: training data $(X_i, Y_i, Z_i)_{i=1}^{n/2}$, bound B, learning rate $\eta$, number of rounds T
  $\theta_1 \leftarrow \mathbf{0} \in \mathbb{R}^{|\mathcal{K}|}$
  for $t = 1, 2, \dots, T$:
    $\lambda_{t,k} \leftarrow B\,\dfrac{\exp(\theta_{t,k})}{1 + \sum_{k' \in \mathcal{K}} \exp(\theta_{t,k'})}$ for all $k \in \mathcal{K}$
    $h_t \leftarrow \text{BEST}_h(\lambda_t)$
    $\theta_{t+1} \leftarrow \theta_t + \eta\,\big(M\gamma^{S}(h_t) - \alpha_n\mathbf{1}\big)$
  $\hat{Y} \leftarrow \frac{1}{T}\sum_{t=1}^{T} h_t$, $\quad \hat{\lambda} \leftarrow \frac{1}{T}\sum_{t=1}^{T} \lambda_t$
  Return $(\hat{Y}, \hat{\lambda})$

Faced with a given vector $\lambda$, the learner's best response $\text{BEST}_h(\lambda)$ puts all the mass on a single predictor $h \in \mathcal{H}$, since the Lagrangian L is linear in Q. Agarwal et al. (2018) show that finding the learner's best response amounts to solving a cost-sensitive classification problem. We re-establish the reduction in detail in Appendix A, as there are slight differences with our setup.
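For illustration, here is a minimal Python skeleton of Algorithm 1. The cost-sensitive best response and the empirical violation oracle are left as caller-supplied callables (their construction follows the reduction of Agarwal et al. (2018) and is not shown); all names are ours and this is a sketch, not the authors' implementation.

```python
import numpy as np

def exponentiated_gradient(best_response, violation, n_constraints,
                           B=100.0, eta=2.0, T=50):
    """Skeleton of Algorithm 1 (exponentiated-gradient reduction).

    best_response(lmbda): classifier h minimizing the Lagrangian for the given
        multipliers (a cost-sensitive classification problem on S1).
    violation(h): vector M gamma^{S1}(h) - alpha_n * 1 of empirical EO-constraint
        violations, measured with the private attribute Z.
    Returns the list of iterates; the learned randomized predictor samples one
    of them uniformly at random at prediction time.
    """
    theta = np.zeros(n_constraints)
    classifiers = []
    for _ in range(T):
        # lambda-player: exponentiated-gradient update with ||lambda||_1 <= B
        w = np.exp(theta)
        lmbda = B * w / (1.0 + w.sum())
        # Q-player: best response to the current multipliers
        h = best_response(lmbda)
        classifiers.append(h)
        theta = theta + eta * violation(h)
    return classifiers
```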
In particular, in Lemma 2, we establish a generalization bound on the error of the first-step predictor $\hat{Y}$ and on its discrimination, defined as the maximum violation of the EO constraint. To denote the latter similarly to the error, we use the shorthand $\text{disc}(\hat{Y}) = \max_{y \in \{0,1\},\, a \in \mathcal{A}} \Gamma_{ya}$.

Lemma 2. Given a hypothesis class $\mathcal{H}$, a distribution over $(X, A, Y)$, $B \in \mathbb{R}_{+}$, and any $\delta \in (0, 1/2)$, with probability greater than $1-\delta$, if $n \ge \frac{16 \log(8|\mathcal{A}|/\delta)}{\min_{ya} P_{ya}}$, $\alpha_n = \sqrt{\frac{\log(64|\mathcal{A}|/\delta)}{n \min_{ya} P_{ya}}}$, and we let $\vartheta = \mathfrak{R}_{n/2}(\mathcal{H}) + \sqrt{\cdots}$, then running Algorithm 1 on the data set $S_1$ with $T \ge \frac{16 \log(4|\mathcal{A}|+1)}{\vartheta^{2}}$ and learning rate $\eta = \frac{\vartheta}{8B}$ returns a predictor $\hat{Y}$ satisfying the following (where $\le_{\delta}$ denotes an inequality holding with probability at least $1-\delta$):

$$\text{err}(\hat{Y}) \;\le_{\delta/2}\; \text{err}(Y^{*}) + 4\,\mathfrak{R}_{n/2}(\mathcal{H}) + 4\sqrt{\cdots} \quad\text{(error guarantee)}$$

$$\text{disc}(\hat{Y}) \;\le_{\delta/2}\; \frac{5C}{\min_{ya} P_{ya}^{2}\,B} + 6\,\mathfrak{R}_{\min_{ya} n P_{ya}}(\mathcal{H}) + \sqrt{\frac{2\log(64|\mathcal{A}|/\delta)}{n \min_{ya} P_{ya}}} \quad\text{(discrimination guarantee)}$$

The proof of Lemma 2 can be found in Appendix A. Note that the error bound in Lemma 2 does not scale with the privacy level; however, the discrimination bound is not only hit by the privacy, through C, but also involves the Rademacher complexity of $\mathcal{H}$. Our goal in the next step is to reduce the sample complexity required to achieve low discrimination by removing the dependence on the complexity of the model class in the discrimination bound.

Comparison with a differentially private predictor. Jagielski et al. (2018) modify Algorithm 1 to ensure that the model is differentially private with respect to A, assuming access to data with the non-private attribute A. The error and discrimination generalization bounds they obtain (Theorem 4.4 of Jagielski et al. (2018)) both scale with the privacy level $\epsilon$ and the complexity of $\mathcal{H}$, meaning the excess terms in the bounds of Lemma 2 are both of order $O(\mathfrak{R}_{n}(\mathcal{H})/\epsilon)$ in their work. Contrast this with our error bound, which is independent of $\epsilon$; the catch is that the discrimination obtained with LDP is significantly more impacted by the privacy level $\epsilon$. Thus, central differential privacy and local differential privacy give rise to a very different set of trade-offs in this context.

5.2. Step 2: Post-hoc Correction to Achieve Non-Discrimination with respect to A

We correct the predictor learned in step 1 using a modified version of the post-processing procedure of Hardt et al. (2016) on the data set $S_2$. The derived second-step predictor $\tilde{Y}$ is fully characterized by the $2|\mathcal{A}|$ probabilities $P(\tilde{Y}=1 \mid \hat{Y}=\hat{y}, Z=a) := p_{\hat{y},a}$. If we naïvely derive the predictor by applying the post-processing procedure of Hardt et al. (2016) on $S_2$, this does not imply that the predictor satisfies EO, because the derived predictor is an explicit function of Z; cf. the discussion in Section 4. Our approach is to directly ensure non-discrimination with respect to A. Two facts make this possible. First, the base predictor of step 1 is not a function of Z, and hence we can measure its false negative and false positive rates using the estimator from Lemma 1. Second, to compute these rates for $\tilde{Y}$, we can exploit its special structure. In particular, note the following decomposition:

$$P(\tilde{Y}=1 \mid Y=y, A=a) = P(\tilde{Y}=1 \mid \hat{Y}=0, A=a)\,P(\hat{Y}=0 \mid Y=y, A=a) + P(\tilde{Y}=1 \mid \hat{Y}=1, A=a)\,P(\hat{Y}=1 \mid Y=y, A=a), \tag{12}$$

and we have that:

$$P(\tilde{Y}=1 \mid \hat{Y}=\hat{y}, A=a) = \pi\, p_{\hat{y},a} + \bar{\pi} \sum_{a' \in \mathcal{A}\setminus\{a\}} p_{\hat{y},a'} =: \tilde{p}_{\hat{y},a},$$

while $P(\hat{Y} \mid Y=y, A=a)$ can be recovered as in Lemma 1; denote by $\tilde{P}^{S_2}(\hat{Y}=\hat{y} \mid Y=y, A=a)$ our estimator based on the empirical $P^{S_2}(\hat{Y} \mid Y, Z)$. Therefore we can compute sample versions of the conditional probabilities in (12). Our modified post-hoc correction reduces to solving the following constrained linear program:

$$\tilde{Y} = \arg\min_{p_{\cdot,\cdot}} \sum_{\hat{y},a} \Big( \tilde{P}^{S_2}(\hat{Y}=\hat{y}, Z=a, Y=0) - \tilde{P}^{S_2}(\hat{Y}=\hat{y}, Z=a, Y=1) \Big)\,\tilde{p}_{\hat{y},a}$$
$$\text{s.t. } \Big| \tilde{p}_{0,a}\,\tilde{P}^{S_2}(\hat{Y}=0 \mid Y=y, A=a) + \tilde{p}_{1,a}\,\tilde{P}^{S_2}(\hat{Y}=1 \mid Y=y, A=a) - \tilde{p}_{0,0}\,\tilde{P}^{S_2}(\hat{Y}=0 \mid Y=y, A=0) - \tilde{p}_{1,0}\,\tilde{P}^{S_2}(\hat{Y}=1 \mid Y=y, A=0) \Big| \le \tilde{\alpha}_n \quad \forall y, a,$$
$$0 \le p_{\hat{y},a} \le 1 \quad \forall \hat{y} \in \{0,1\},\ a \in \mathcal{A}. \tag{13}$$

The following theorem characterizes the performance of the resulting predictor $\tilde{Y}$.

Theorem 1. For any hypothesis class $\mathcal{H}$, any distribution over $(X, A, Y)$, and any $\delta \in (0, 1/2)$, with probability $1-\delta$, if $n \ge \frac{16 \log(8|\mathcal{A}|/\delta)}{\min_{ya} P_{ya}}$, $\alpha_n = \sqrt{\frac{8 \log(64/\delta)}{n \min_{yz} Q_{yz}}}$, and $\tilde{\alpha}_n = \sqrt{\frac{\cdots}{\min_{ya} P_{ya}^{2}}}$, the predictor resulting from the two-step procedure satisfies:

$$\text{err}(\tilde{Y}) \;\le_{\delta}\; \text{err}(Y^{*}) + \frac{5C}{\min_{ya} P_{ya}^{2}\,B} + 10\,\mathfrak{R}_{\min_{ya} n P_{ya}}(\mathcal{H}) + \sqrt{\frac{2\log(64|\mathcal{A}|/\delta)}{n \min_{ya} P_{ya}}}$$

$$\text{disc}(\tilde{Y}) \;\le_{\delta}\; 8|\mathcal{A}|C^{2}\sqrt{\frac{\log(\cdots/\delta)}{2n}}$$

Proof Sketch. Since the predictor obtained in step 1 is only a function of X, we can prove the following guarantees on its performance, with $\tilde{Y}^{*}$ denoting an optimal non-discriminatory derived predictor from $\hat{Y}$:

$$\text{err}(\tilde{Y}) \;\le_{\delta/2}\; \text{err}(\tilde{Y}^{*}) + 4|\mathcal{A}|C\sqrt{\frac{\log(32|\mathcal{A}|/\delta)}{2n}}, \qquad \text{disc}(\tilde{Y}) \;\le_{\delta/2}\; 8|\mathcal{A}|C^{2}\sqrt{\frac{\log(\cdots/\delta)}{2n}}.$$

Next, we have to relate the loss of the optimal derived predictor from $\hat{Y}$, denoted by $\tilde{Y}^{*}$, to the loss of the optimal non-discriminatory predictor in $\mathcal{H}$. We can apply Lemma 4 of Woodworth et al. (2017), since the solution of our derived LP is in expectation equal to its counterpart in terms of A. Lemma 4 of Woodworth et al. (2017) tells us that the optimal derived predictor has a loss no greater than the sum of the loss of the base predictor and its discrimination: $\text{err}(\tilde{Y}^{*}) \le \text{err}(\hat{Y}) + \text{disc}(\hat{Y})$. Plugging in the error and discrimination bounds proved in Lemma 2, we obtain the theorem statement. A detailed proof is given in Appendix A.2.4.
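As a concrete illustration, the LP (13) could be set up with scipy as sketched below. This is our own sketch: it assumes the objective weights `d` (the estimated $\tilde{P}^{S_2}(\hat{Y}, Z, Y=0) - \tilde{P}^{S_2}(\hat{Y}, Z, Y=1)$ terms) and the Lemma-1-style estimates `R` of $P(\hat{Y} \mid Y, A)$ have already been computed, and all array names are ours.

```python
import numpy as np
from scipy.optimize import linprog

def postprocess_lp(d, R, eps, alpha):
    """Sketch of the second-step LP (13).

    d[yh, a]    ~ estimate of P(Yhat=yh, Z=a, Y=0) - P(Yhat=yh, Z=a, Y=1)
    R[yh, y, a] ~ estimate of P(Yhat=yh | Y=y, A=a) recovered as in Lemma 1
    Returns p[yh, a] = P(Ytilde=1 | Yhat=yh, Z=a).
    """
    k = d.shape[1]
    pi = np.exp(eps) / (k - 1 + np.exp(eps))
    pib = 1.0 / (k - 1 + np.exp(eps))
    Pi = np.full((k, k), pib) + (pi - pib) * np.eye(k)   # p_tilde_{yh,.} = Pi @ p_{yh,.}

    # Decision variables ordered as [p_{0,0..k-1}, p_{1,0..k-1}].
    c = np.concatenate([Pi.T @ d[0], Pi.T @ d[1]])

    # |sum_yh ptilde_{yh,a} R[yh,y,a] - sum_yh ptilde_{yh,0} R[yh,y,0]| <= alpha
    A_rows, b = [], []
    for y in (0, 1):
        for a in range(1, k):
            row = np.zeros(2 * k)
            for yh in (0, 1):
                row[yh * k:(yh + 1) * k] += R[yh, y, a] * Pi[a] - R[yh, y, 0] * Pi[0]
            A_rows += [row, -row]
            b += [alpha, alpha]

    res = linprog(c, A_ub=np.array(A_rows), b_ub=np.array(b),
                  bounds=[(0.0, 1.0)] * (2 * k), method="highs")
    return res.x.reshape(2, k)
```

After solving, the derived predictor outputs 1 with probability `p[yh, z]` given the step-1 prediction `yh` and the reported private attribute `z`.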
Our final predictor $\tilde{Y}$ has a discrimination guarantee that is independent of the model complexity; however, this comes at the cost of a privacy penalty entering the error bound. This creates a new set of trade-offs that do not appear in the absence of the privacy constraint: fairness and error start to trade off more severely with increasing levels of privacy.

5.3. Experimental Illustration

Data. We use the Adult income data set (Kohavi, 1996), containing 48,842 examples. The task is to predict whether a person's income is higher than $50k. Each data point has 14 features, including education and occupation; the protected attribute A we use is gender: male or female.

Approach. We use a logistic regression model for classification. For the reductions approach, we use the implementation in the fairlearn package (https://github.com/fairlearn/fairlearn). We set T = 50, $\eta$ = 2.0, and B = 100 for all experiments. We split the data into 75% for training and 25% for testing, and repeat the splitting over 10 trials.

Effect of privacy. We plot in Figure 1 the resulting discrimination violation and model accuracy against increasing privacy levels $\epsilon$ for the predictor $\hat{Y}$ resulting from step 1, trained on all the training data, and for the two-step predictor $\tilde{Y}$ trained on $S_1$ and $S_2$. We observe that $\tilde{Y}$ achieves lower discrimination than $\hat{Y}$ across the different privacy levels. This comes at a cost of lower accuracy, which improves in lower privacy regimes (large $\epsilon$). The step-1 predictor only begins to suffer in accuracy when the privacy level is low enough, since the fairness constraint is effectively void at high levels of privacy (small $\epsilon$).

Figure 1. Discrimination violation and accuracy of the step-1 predictor $\hat{Y}$ and the two-step predictor $\tilde{Y}$ versus the privacy level $\epsilon$ on the Adult Income dataset (Kohavi, 1996). Error bars show 95% confidence intervals for the average.

Code to reproduce Figure 1 is publicly available at https://github.com/husseinmozannar/fairlearn_private_data.
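A minimal sketch of the step-1 training in this experiment, using fairlearn's reductions implementation, is shown below. It assumes the Adult features `X`, labels `y`, and binary gender attribute `A` are already loaded as numeric arrays; exact constructor parameters (e.g., for the number of rounds T or the bound B) vary across fairlearn versions and are omitted. The second-step correction of Section 5.2 is not shown.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from fairlearn.reductions import ExponentiatedGradient, EqualizedOdds

eps = 1.0                                    # privacy level epsilon
pi = np.exp(eps) / (1 + np.exp(eps))         # randomized response with |A| = 2
rng = np.random.default_rng(0)
Z = np.where(rng.random(len(A)) < pi, A, 1 - A)   # locally private release of A

X_tr, X_te, y_tr, y_te, Z_tr, Z_te = train_test_split(
    X, y, Z, test_size=0.25, random_state=0)

# Step 1: reductions approach of Agarwal et al. (2018), run with Z in place of A.
step1 = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000),
    constraints=EqualizedOdds(),
)
step1.fit(X_tr, y_tr, sensitive_features=Z_tr)
y_hat = step1.predict(X_te)

# Discrimination should then be audited against the true attribute A, or via the
# Lemma 1 estimator when only Z is available, rather than naively against Z.
```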
6. Discussion and Extensions

Could this approach for private demographic data be used to learn non-discriminatory predictors under other forms of deficiency in demographic information? In this section, we consider another case of interest: when individuals retain the choice of whether to release their sensitive information, as in the example of credit card companies. Practically, this means that the learner's data contains one part that has protected attribute labels and another that does not.

Privacy as effective number of labeled samples. As a first step towards understanding this setting, suppose we are given $n_\ell$ fully labeled samples $S_\ell = \{(x_1, a_1, y_1), \dots, (x_{n_\ell}, a_{n_\ell}, y_{n_\ell})\}$, drawn i.i.d. from an unknown distribution P over $\mathcal{X} \times \mathcal{A} \times \mathcal{Y}$ where $\mathcal{A} = \{0, 1, \dots, |\mathcal{A}|-1\}$ and $\mathcal{Y} = \{0, 1, \dots, |\mathcal{Y}|-1\}$, and $n_u$ samples that are missing the protected attribute, $S_u = \{(x_1, y_1), \dots, (x_{n_u}, y_{n_u})\}$, drawn i.i.d. from the marginal of P over $\mathcal{X} \times \mathcal{Y}$. Define $n := n_\ell + n_u$ and $S = S_\ell \cup S_u$, and let $\beta > 0$ be such that $n_\ell = \beta n$ and $n_u = (1-\beta) n$. This data assumption is equivalent to having each individual not report their attribute, uniformly at random, with probability $1-\beta$. The objective is to learn a non-discriminatory predictor $\hat{Y}$ from the data S. To mimic step 1 of our methodology, we propose to modify the reductions approach so as to allow the learner (Q-player) to learn on the entirety of S while the auditor ($\lambda$-player) uses only $S_\ell$. We do this by first defining a two-data-set version of the Lagrangian:

$$L^{S, S_\ell}(Q, \lambda) = \text{err}^{S}(Q) + \lambda^{\top}\big(M\gamma^{S_\ell}(Q) - \alpha\mathbf{1}\big). \tag{14}$$

This changes Algorithm 1 in two key ways: first, the update of $\theta$ now relies only on $S_\ell$, and second, the best response of the learner is still a cost-sensitive learning problem, but the cost now depends on whether sample i is in $S_\ell$ or $S_u$. If it is in $S_u$, i.e., it does not have a group label, then its instance loss is the misclassification loss, while if it is in $S_\ell$ its loss is defined as before. Lemma 3 characterizes the performance of the predictor $\hat{Y}$ learned with this approach.

Lemma 3. Given a hypothesis class $\mathcal{H}$, a distribution over $(X, A, Y)$, $B \in \mathbb{R}_{+}$, and any $\delta \in (0, 1/2)$, with probability greater than $1-\delta$, if $n_\ell \ge \frac{8 \log(4|\mathcal{A}|/\delta)}{\min_{ya} P_{ya}}$, $\alpha_n = \sqrt{\frac{\log(32|\mathcal{A}|/\delta)}{n_\ell \min_{ya} P_{ya}}}$, and we let $\vartheta = \mathfrak{R}_{n}(\mathcal{H}) + \sqrt{\frac{\cdots}{n}}$, then running the modified Algorithm 1 on data sets S and $S_\ell$ with $T \ge \frac{16 \log(4|\mathcal{A}|+1)}{\vartheta^{2}}$ and learning rate $\eta = \frac{\vartheta}{8B}$ returns a predictor $\hat{Y}$ satisfying:

$$\text{err}(\hat{Y}) \;\le_{\delta}\; \text{err}(Y^{*}) + 4\,\mathfrak{R}_{n}(\mathcal{H}) + 4\sqrt{\cdots}$$

$$\text{disc}(\hat{Y}) \;\le_{\delta}\; \frac{2}{B} + 6\,\mathfrak{R}_{\min_{ya} n_\ell P_{ya}}(\mathcal{H}) + \sqrt{\frac{2\log(32|\mathcal{A}|/\delta)}{n_\ell \min_{ya} P_{ya}}}$$

A short proof of Lemma 3 can be found in Appendix A. Notice the similarities between Lemmas 2 and 3. The error bound depends on the entire number of samples n, as in the privacy case, while the discrimination guarantee is forcibly controlled by the number of group-labeled samples $n_\ell$. We can thus interpret the discrimination bound in Lemma 2 as having an effective number of samples controlled by the privacy level $\epsilon$.

Individual choice of reporting. It may be more realistic to assume that the choice of individuals to report their protected attributes depends on their own characteristics. Let $t(x, y, a) \in (0, 1]$ (the reporting probability function) be the probability that an individual $(x, y, a)$ chooses to report their protected attribute. This can be codified using a binary reporting random variable T with $P(T=1 \mid X=x, Y=y, A=a) = t(x, y, a)$. Note that in the setting of Lemma 3, $t(x, y, a) = \beta$, a constant. Starting from a data set S of n individuals sampled i.i.d. from P, each individual i flips a coin with bias $t(x_i, y_i, a_i)$ and accordingly chooses whether to include their attribute $a_i$ in S. The result of this process is a splitting of S into $S_\ell = \{(x_1, a_1, y_1), \dots, (x_{n_\ell}, a_{n_\ell}, y_{n_\ell})\}$ (individuals who report their attributes) and $S_u = \{(x_1, y_1), \dots, (x_{n_u}, y_{n_u})\}$ (individuals who do not). The goal again is to learn a non-discriminatory predictor $\hat{Y}$. The immediate question is whether we can use our modified algorithm with the two-data-set Lagrangian (14) and obtain guarantees similar to those of Lemma 3 in this more general setting. This question boils down to asking whether the naïve empirical estimate of discrimination is consistent, and the answer depends both on the reporting probability function t and on the notion of discrimination considered, as illustrated in the following proposition.

Proposition 2. Consider $(E_1, E_2)$-non-discrimination with respect to A. Fix a reporting probability function $t: \mathcal{X} \times \mathcal{Y} \times \mathcal{A} \to (0, 1]$. If the resulting T and $E_1$ are conditionally independent given $\{A, E_2\}$, then for all $a \in \mathcal{A}$ we have

$$P^{S_\ell}(E_1 \mid E_2, A=a) \;\to_{p}\; P(E_1 \mid E_2, A=a) \quad \text{as } n \to \infty,$$

where, for each n, $S_\ell$ is generated via individual random reporting through t.

Proof. Given $a \in \mathcal{A}$, our estimate $P^{S_\ell}(E_1 \mid E_2, A=a)$ is nothing but the empirical estimator of $P(E_1 \mid E_2, A=a, T=1)$, where $\{T=1\}$ denotes the event that an individual does report their attributes and is thus included in $S_\ell$.
As an immediate consequence we have:

$$P^{S_\ell}(E_1 \mid E_2, A=a) \;\to_{p}\; P(E_1 \mid E_2, A=a, T=1).$$

Now, by assumption, T and $E_1$ are conditionally independent given $\{A, E_2\}$, and thus:

$$P(E_1 \mid E_2, A=a, T=1) = P(E_1 \mid E_2, A=a),$$

which completes the proof. Note that the conditioning event $\{E_2, A=a, T=1\}$ has strictly positive probability, as the reporting probability function is strictly positive.

If the independence condition in Proposition 2 is satisfied, then this immediately suggests that we can run Algorithm 1 and obtain learning guarantees. To illustrate this concretely, suppose the notion of non-discrimination is EO. Consider any reporting probability function of the form $t_1: \mathcal{Y} \times \mathcal{A} \to (0, 1]$ (one that does not depend on the non-sensitive attributes), and suppose our hypothesis class consists of functions that depend only on X. The conditional independence condition of Proposition 2 then holds, and we can estimate the discrimination of any predictor in our class. The only change to Lemma 3 in this setup is that the effective number of samples in the discrimination bound is now $n \min_{ya} P_{ya} T_{ya}$, where $T_{ya} = P(T=1 \mid Y=y, A=a)$ and T is the random variable that denotes reporting; the proof of this observation is immediate.

Trade-offs and proxies. To complete the parallel with the proposed methodology, what remains is to mimic step 2, i.e., to devise ways to achieve non-discrimination with lower sample complexity. Clearly the dependence on $n_\ell$ in Lemma 3 is statistically necessary without any assumptions on the data-generating process, and the only area of improvement is to remove the dependence on the complexity of the model class. If the sensitive attribute is never available at test time, we cannot apply the post-processing procedure of Hardt et al. (2016) in a two-stage fashion (Woodworth et al., 2017). In practice, to compensate for the missing direct information, and if legally permitted, the learner may leverage multiple sources of data and combine them to obtain indirect access to the sensitive information of individuals (Kallus et al., 2019). Mathematically, this is modeled by having recourse to proxies. One of the most widely used proxies is the Bayesian Improved Surname Geocoding (BISG) method, which estimates race membership given the last name and geolocation of an individual (Adjaye-Gbewonyo et al., 2014; Fiscella & Fremont, 2006). Using this proxy, one can impute the missing membership labels and then proceed to audit or learn a predictor. But a big issue with proxies is that they may lead to biased estimators of discrimination (Kallus et al., 2019). To avoid these pitfalls, one promising line of investigation is to learn the proxy simultaneously with the predictor.

What form of proxies can help us measure the discrimination of a certain predictor $\hat{Y}: \mathcal{X} \to \mathcal{Y}$? Some of the aforementioned issues are due to the fact that the features X are in general insufficient to estimate group membership, even through the complete probabilistic proxy $P(A \mid X)$. In particular for EO, if A is not completely identifiable from X, then using this proxy leads to inconsistent estimates. In contrast, if we have access to the probabilistic proxy $P(A \mid X, Y)$, we propose the following estimator (see also Chen et al. (2019)):

$$\tilde{\gamma}^{S}_{ya}(\hat{Y}) = \frac{\sum_{i=1}^{n} \hat{Y}(x_i)\,\mathbb{1}(y_i = y)\,P(A=a \mid x_i, y_i)}{\sum_{i=1}^{n} \mathbb{1}(y_i = y)\,P(A=a \mid x_i, y_i)}, \tag{15}$$

which enjoys consistency, via a relatively straightforward proof found in Appendix A.

Lemma 4. Let $S = \{(x_i, a_i, y_i)\}_{i=1}^{n}$ be drawn i.i.d. from P over $(A, X, Y)$. Then the estimator $\tilde{\gamma}^{S}_{ya}$ is consistent: $\tilde{\gamma}^{S}_{ya} \to_{p} \gamma_{ya}$ as $n \to \infty$.
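A minimal sketch of the proxy-based estimator (15) is given below; the function name is ours, and the proxy probabilities $P(A=a \mid x_i, y_i)$ for the group of interest are assumed to be supplied by the caller.

```python
import numpy as np

def proxy_gamma(y_true, y_pred, proxy_prob_a, y):
    """Estimate gamma_{y,a} = P(Yhat=1 | Y=y, A=a) via the proxy P(A=a | X, Y).

    proxy_prob_a[i] holds P(A = a | x_i, y_i) for the fixed group a of interest,
    as in Eq. (15).
    """
    y_true, y_pred, proxy_prob_a = map(np.asarray, (y_true, y_pred, proxy_prob_a))
    mask = (y_true == y).astype(float)
    num = np.sum(y_pred * mask * proxy_prob_a)
    den = np.sum(mask * proxy_prob_a)
    return num / den
```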
We end our discussion by pointing out that if such a proxy can be efficiently learned from samples, then it effectively reduces a missing-attribute problem to a private-attribute problem, allowing us to use much of the same machinery presented in this paper.

7. Conclusion

We studied learning non-discriminatory predictors when the protected attributes are privatized or noisy. We observed that, in the population limit, non-discrimination against noisy attributes is equivalent to non-discrimination against the original attributes, and we showed this to hold for various fairness criteria. We then characterized the amount of difficulty, in terms of sample complexity, that privacy adds to testing non-discrimination. Using this relationship, we proposed how to carefully adapt existing non-discriminatory learners to work with privatized protected attributes. Care is crucial, as naïvely using these learners may create the illusion of non-discrimination while continuing to be highly discriminatory. With the same approach, and without recourse to proxy information, we ended by highlighting when and how we can learn predictors when individuals can choose to withhold their protected attributes.

Acknowledgements

This paper is based upon work supported in part by the NSF Program on Fairness in AI in Collaboration with Amazon under Award No. IIS-1939743, titled FAI: Addressing the 3D Challenges for Data-Driven Fairness: Deficiency, Dynamics, and Disagreement. Any opinions, findings, and conclusions or recommendations expressed in this paper are those of the authors and do not necessarily reflect the views of the National Science Foundation or Amazon.

References

Adjaye-Gbewonyo, D., Bednarczyk, R. A., Davis, R. L., and Omer, S. B. Using the Bayesian improved surname geocoding method (BISG) to create a working classification of race and ethnicity in a diverse managed care population: a validation study. Health Services Research, 49(1):268-283, 2014.

Agarwal, A., Beygelzimer, A., Dudík, M., Langford, J., and Wallach, H. A reductions approach to fair classification. arXiv preprint arXiv:1803.02453, 2018.

Alabi, D. The cost of a reductions approach to private fair optimization. arXiv preprint arXiv:1906.09613, 2019.

Awasthi, P., Kleindessner, M., and Morgenstern, J. Effectiveness of equalized odds for fair classification under imperfect group information. arXiv preprint arXiv:1906.03284, 2019.

Bagdasaryan, E. and Shmatikov, V. Differential privacy has disparate impact on model accuracy. arXiv preprint arXiv:1905.12101, 2019.

Barocas, S., Hardt, M., and Narayanan, A. Fairness and Machine Learning. fairmlbook.org, 2019. http://www.fairmlbook.org.

Bureau, C. F. P. Using publicly available information to proxy for unidentified race and ethnicity, June 2014. URL files.consumerfinance.gov/f/201409_cfpb_report_proxy-methodology.pdf.

Chen, J., Kallus, N., Mao, X., Svacha, G., and Udell, M. Fairness under unawareness: Assessing disparity when protected class is unobserved. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pp. 339-348. ACM, 2019.

Commission, F. T. Your equal credit opportunity rights, January 2013. URL www.consumer.ftc.gov/articles/0347-your-equal-credit-opportunity-rights.

Cotter, A., Gupta, M., Jiang, H., Srebro, N., Sridharan, K., Wang, S., Woodworth, B., and You, S. Training well-generalizing classifiers for fairness metrics and other data-dependent constraints. arXiv preprint arXiv:1807.00028, 2018.

Cummings, R., Gupta, V., Kimpara, D., and Morgenstern, J. On the compatibility of privacy and fairness, 2019.

Duchi, J., Shalev-Shwartz, S., Singer, Y., and Chandra, T. Efficient projections onto the l1-ball for learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, pp. 272-279. ACM, 2008.

Duchi, J. C., Jordan, M. I., and Wainwright, M. J. Local privacy and statistical minimax rates. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pp. 429-438. IEEE, 2013.

Dwork, C., McSherry, F., Nissim, K., and Smith, A. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography Conference, pp. 265-284. Springer, 2006.

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 214-226. ACM, 2012.

Fiscella, K. and Fremont, A. M. Use of geocoding and surname analysis to estimate race and ethnicity. Health Services Research, 41(4p1):1482-1500, 2006.

Freund, Y. and Schapire, R. E. Game theory, on-line prediction and boosting. In COLT, volume 96, pp. 325-332. Citeseer, 1996.

Gupta, M., Cotter, A., Fard, M. M., and Wang, S. Proxy fairness. arXiv preprint arXiv:1806.11212, 2018.

Hardt, M., Price, E., Srebro, N., et al. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, pp. 3315-3323, 2016.

Hashimoto, T., Srivastava, M., Namkoong, H., and Liang, P. Fairness without demographics in repeated loss minimization. In International Conference on Machine Learning, pp. 1934-1943, 2018.

Jagielski, M., Kearns, M., Mao, J., Oprea, A., Roth, A., Sharifi-Malvajerdi, S., and Ullman, J. Differentially private fair learning. arXiv preprint arXiv:1812.02696, 2018.

Kairouz, P., Oh, S., and Viswanath, P. Extremal mechanisms for local differential privacy. In Advances in Neural Information Processing Systems, pp. 2879-2887, 2014.

Kairouz, P., Bonawitz, K., and Ramage, D. Discrete distribution estimation under local privacy. arXiv preprint arXiv:1602.07387, 2016.

Kallus, N., Mao, X., and Zhou, A. Assessing algorithmic fairness with unobserved protected class using data combination. arXiv preprint arXiv:1906.00285, 2019.

Kilbertus, N., Gascón, A., Kusner, M. J., Veale, M., Gummadi, K. P., and Weller, A. Blind justice: Fairness with encrypted sensitive attributes. arXiv preprint arXiv:1806.03281, 2018.

Kohavi, R. Scaling up the accuracy of naive-Bayes classifiers: a decision-tree hybrid. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, pp. 202-207, 1996.

Lamy, A. L., Zhong, Z., Menon, A. K., and Verma, N. Noise-tolerant fair classification. arXiv preprint arXiv:1901.10837, 2019.

McDiarmid, C. On the method of bounded differences. Surveys in Combinatorics, 141(1):148-188, 1989.

Mohri, M., Rostamizadeh, A., and Talwalkar, A. Foundations of Machine Learning. MIT Press, 2018.

Vigdor, N. Apple Card investigated after gender discrimination complaints, November 2019. URL https://www.nytimes.com/2019/11/10/business/Apple-credit-card-investigation.html.

Wang, S., Guo, W., Narasimhan, H., Cotter, A., Gupta, M., and Jordan, M. I. Robust optimization for fairness with noisy protected groups. arXiv preprint arXiv:2002.09343, 2020.

Warner, S. L. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63-69, 1965.

Woodworth, B., Gunasekar, S., Ohannessian, M. I., and Srebro, N. Learning non-discriminatory predictors. In Conference on Learning Theory, pp. 1920-1953, 2017.

Xu, D., Yuan, S., and Wu, X. Achieving differential privacy and fairness in logistic regression. In Companion Proceedings of The 2019 World Wide Web Conference, pp. 594-599. ACM, 2019.

Zemel, R., Wu, Y., Swersky, K., Pitassi, T., and Dwork, C. Learning fair representations. In International Conference on Machine Learning, pp. 325-333, 2013.