Noise-tolerant fair classification

Alexandre Lamy (Columbia University, a.lamy@columbia.edu), Ziyuan Zhong (Columbia University, ziyuan.zhong@columbia.edu), Aditya Krishna Menon (Google, adityakmenon@google.com), Nakul Verma (Columbia University, verma@cs.columbia.edu)

Equal contribution. 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

Fairness-aware learning involves designing algorithms that do not discriminate with respect to some sensitive feature (e.g., race or gender). Existing work on the problem operates under the assumption that the sensitive feature available in one's training sample is perfectly reliable. This assumption may be violated in many real-world cases: for example, respondents to a survey may choose to conceal or obfuscate their group identity out of fear of potential discrimination. This poses the question of whether one can still learn fair classifiers given noisy sensitive features. In this paper, we answer the question in the affirmative: we show that if one measures fairness using the mean-difference score, and sensitive features are subject to noise from the mutually contaminated learning model, then owing to a simple identity we only need to change the desired fairness tolerance. The requisite tolerance can be estimated by leveraging existing noise-rate estimators from the label noise literature. We finally show that our procedure is empirically effective on two case studies involving sensitive feature censoring.

1 Introduction

Classification is concerned with maximally discriminating between a number of pre-defined groups. Fairness-aware classification concerns the analysis and design of classifiers that do not discriminate with respect to some sensitive feature (e.g., race, gender, age, income). Recently, much progress has been made on devising appropriate measures of fairness (Calders et al., 2009; Dwork et al., 2011; Feldman, 2015; Hardt et al., 2016; Zafar et al., 2017b,a; Kusner et al., 2017; Kim et al., 2018; Speicher et al., 2018; Heidari et al., 2019), and means of achieving them (Zemel et al., 2013; Zafar et al., 2017b; Calmon et al., 2017; Dwork et al., 2018; Agarwal et al., 2018; Donini et al., 2018; Cotter et al., 2018; Williamson & Menon, 2019; Mohri et al., 2019).

Typically, fairness is achieved by adding constraints which depend on the sensitive feature, and then correcting one's learning procedure to satisfy these fairness constraints. For example, suppose the data comprises pairs of individuals and their loan repayment status, and the sensitive feature is gender. Then, we may add a constraint that we should predict equal loan repayment for both men and women (see §3.2 for a more precise statement). However, this and similar approaches assume that we are able to correctly measure or obtain the sensitive feature.

In many real-world cases, one may only observe noisy versions of the sensitive feature. For example, survey respondents may choose to conceal or obfuscate their group identity out of concerns of potential mistreatment or outright discrimination. One is then brought to ask whether fair classification in the presence of such noisy sensitive features is still possible. Indeed, if the noise is high enough and all original information about the sensitive features is lost, then it is as if the sensitive feature was not provided. Standard learners can then be unfair on such data (Dwork et al., 2011; Pedreshi et al., 2008).
Recently, Hashimoto et al. (2018) showed that progress is possible, albeit for specific fairness measures. The question of what can be done under a smaller amount of noise is thus both interesting and non-trivial.

In this paper, we consider two practical scenarios where we may only observe noisy sensitive features:

(1) Suppose we are releasing data involving human participants. Even if noise-free sensitive features are available, we may wish to add noise so as to obfuscate sensitive attributes, and thus protect participant data from potential misuse. Being able to learn fair classifiers under sensitive feature noise is thus a way to achieve both privacy and fairness.

(2) Suppose we wish to analyse data where the presence of the sensitive feature is only known for a subset of individuals, while for others the feature value is unknown. For example, patients filling out a form may feel comfortable disclosing that they do not have a pre-existing medical condition; however, some who do have this condition may wish to refrain from responding. This can be seen as a variant of the positive and unlabelled (PU) setting (Denis, 1998), where the sensitive feature is present (positive) for some individuals, but absent (unlabelled) for others.

By considering popular measures of fairness and a general model of noise, we show that fair classification is possible under many settings, including the above. Our precise contributions are:

(C1) We show that if the sensitive features are subject to noise as per the mutually contaminated learning model (Scott et al., 2013a), and one measures fairness using the mean-difference score (Calders & Verwer, 2010), then a simple identity (Theorem 2) yields that we only need to change the desired fairness tolerance. The requisite tolerance can be estimated by leveraging existing noise-rate estimators from the label noise literature, yielding a reduction (Algorithm 1) to regular noiseless fair classification.

(C2) We show that our procedure is empirically effective on both case studies mentioned above.

In what follows, we review the existing literature on learning fair and noise-tolerant classifiers in §2, and introduce the novel problem formulation of noise-tolerant fair learning in §3. We then detail how to address this problem in §4, and empirically confirm the efficacy of our approach in §5.

2 Background and related work

We review relevant literature on fair and noise-tolerant machine learning.

2.1 Fair machine learning

Algorithmic fairness has gained significant attention recently because of the undesirable social impact caused by bias in machine learning algorithms (Angwin et al., 2016; Buolamwini & Gebru, 2018; Lahoti et al., 2018). There are two central objectives: designing appropriate application-specific fairness criteria, and developing predictors that respect the chosen fairness conditions.

Broadly, fairness objectives can be categorised into individual- and group-level fairness. Individual-level fairness (Dwork et al., 2011; Kusner et al., 2017; Kim et al., 2018) requires the treatment of similar individuals to be similar. Group-level fairness asks the treatment of the groups divided based on some sensitive attributes (e.g., gender, race) to be similar. Popular notions of group-level fairness include demographic parity (Calders et al., 2009) and equality of opportunity (Hardt et al., 2016); see §3.2 for formal definitions.
Group-level fairness criteria have been the subject of significant algorithmic design and analysis, and are achieved in three possible ways:

- pre-processing methods (Zemel et al., 2013; Louizos et al., 2015; Lum & Johndrow, 2016; Johndrow & Lum, 2017; Calmon et al., 2017; del Barrio et al., 2018; Adler et al., 2018), which usually find a new representation of the data where the bias with respect to the sensitive feature is explicitly removed;
- methods enforcing fairness during training (Calders et al., 2009; Woodworth et al., 2017; Zafar et al., 2017b; Agarwal et al., 2018), which usually add a constraint that is a proxy of the fairness criterion, or add a regularisation term to penalise fairness violation;
- post-processing methods (Feldman, 2015; Hardt et al., 2016), which usually apply a thresholding function to make the prediction satisfy the chosen fairness notion across groups.

2.2 Noise-tolerant classification

Designing noise-tolerant classifiers is a classic topic of study, concerned with the setting where one's training labels are corrupted in some manner. Typically, works in this area postulate a particular model of label noise, and study the viability of learning under this model. Class-conditional noise (CCN) (Angluin & Laird, 1988) is one such effective noise model. Here, samples from each class have their labels flipped with some constant (but class-specific) probability. Algorithms that deal with CCN corruption have been well studied (Natarajan et al., 2013; Liu & Tao, 2016; Northcutt et al., 2017). These methods typically first estimate the noise rates, which are then used for prediction. A special case of CCN learning is learning from positive and unlabelled data (PU learning) (Elkan & Noto, 2008), where in lieu of explicit negative samples, one has a pool of unlabelled data.

Our interest in this paper will be the mutually contaminated (MC) learning noise model (Scott et al., 2013a). This model (described in detail in §3.3) captures both CCN and PU learning as special cases (Scott et al., 2013b; Menon et al., 2015), as well as other interesting noise models.

3 Background and notation

We recall the settings of standard and fairness-aware binary classification, and establish notation. (For simplicity, we consider the setting of binary target and sensitive features; our derivation and method can be easily extended to the multi-class setting.) Our notation is summarised in Table 1.

Table 1: Glossary of commonly used symbols

  Symbol                Meaning
  $X$                   instance
  $A$                   sensitive feature
  $Y$                   target feature
  $D$                   distribution $P(X, A, Y)$
  $D_{a,\cdot}$         distribution $P(X, A, Y \mid A = a)$
  $D_{\cdot,y}$         distribution $P(X, A, Y \mid Y = y)$
  $D_{a,y}$             distribution $P(X, A, Y \mid A = a, Y = y)$
  $D_{\mathrm{corr}}$   corrupted version of the distribution $D$
  $f$                   score function $f : \mathcal{X} \to \mathbb{R}$
  $\ell$                accuracy loss $\ell : \mathbb{R} \times \{0,1\} \to \mathbb{R}_+$
  $L_D$                 expected accuracy loss on $D$
  $\bar{\ell}$          fairness loss $\bar{\ell} : \mathbb{R} \times \{0,1\} \to \mathbb{R}_+$
  $\bar{L}_D$           expected fairness loss on $D$
  $\Omega_D$            fairness constraint

3.1 Standard binary classification

Binary classification concerns predicting the label or target feature $Y \in \{0, 1\}$ that best corresponds to a given instance $X \in \mathcal{X}$. Formally, suppose $D$ is a distribution over (instance, target feature) pairs from $\mathcal{X} \times \{0, 1\}$. Let $f : \mathcal{X} \to \mathbb{R}$ be a score function, and $\mathcal{F} \subseteq \mathbb{R}^{\mathcal{X}}$ be a user-defined class of such score functions. Finally, let $\ell : \mathbb{R} \times \{0, 1\} \to \mathbb{R}_+$ be a loss function measuring the disagreement between a given score and binary label. The goal of binary classification is to minimise

$$L_D(f) := \mathbb{E}_{(X,Y) \sim D}[\ell(f(X), Y)]. \tag{1}$$

3.2 Fairness-aware classification

In fairness-aware classification, the goal of accurately predicting the target feature $Y$ remains. However, there is an additional sensitive feature $A \in \{0, 1\}$ upon which we do not wish to discriminate. Intuitively, some user-defined fairness loss should be roughly the same regardless of $A$.

Formally, suppose $D$ is a distribution over (instance, sensitive feature, target feature) triplets from $\mathcal{X} \times \{0, 1\} \times \{0, 1\}$. The goal of fairness-aware binary classification is to find

$$f^* := \operatorname*{arg\,min}_{f \in \mathcal{F}} L_D(f) \ \text{ such that } \ \Omega_D(f) \le \tau, \qquad L_D(f) := \mathbb{E}_{(X,A,Y) \sim D}[\ell(f(X), Y)], \tag{2}$$

for user-specified fairness tolerance $\tau \ge 0$, and fairness constraint $\Omega_D : \mathcal{F} \to \mathbb{R}_+$. (Here, $f$ is assumed to not be allowed to use $A$ at test time, which is a common legal restriction (Lipton et al., 2018). Of course, $A$ can be used at training time to find an $f$ which satisfies the constraint.)

Such constrained optimisation problems can be solved in various ways, e.g., convex relaxations (Donini et al., 2018), alternating minimisation (Zafar et al., 2017b; Cotter et al., 2018), or linearisation (Hardt et al., 2016). A number of fairness constraints $\Omega_D(\cdot)$ have been proposed in the literature. We focus on two important and specific choices in this paper, inspired by Donini et al. (2018):

$$\Omega^{\mathrm{DP}}_D(f) := \big| \bar{L}_{D_{0,\cdot}}(f) - \bar{L}_{D_{1,\cdot}}(f) \big| \tag{3}$$

$$\Omega^{\mathrm{EO}}_D(f) := \big| \bar{L}_{D_{0,1}}(f) - \bar{L}_{D_{1,1}}(f) \big| \tag{4}$$

where we denote by $D_{a,\cdot}$, $D_{\cdot,y}$, and $D_{a,y}$ the distributions over $\mathcal{X} \times \{0,1\} \times \{0,1\}$ given by $D|_{A=a}$, $D|_{Y=y}$, and $D|_{A=a, Y=y}$, and $\bar{\ell} : \mathbb{R} \times \{0,1\} \to \mathbb{R}_+$ is the user-defined fairness loss with corresponding $\bar{L}_D(f) := \mathbb{E}_{(X,A,Y) \sim D}[\bar{\ell}(f(X), Y)]$.

Intuitively, (3) and (4) measure the difference in the average fairness loss incurred among the instances with and without the sensitive feature. Concretely, if $\bar{\ell}$ is taken to be $\bar{\ell}(s, y) = 1[\mathrm{sign}(s) \ne 1]$ and the 0-1 loss $\bar{\ell}(s, y) = 1[\mathrm{sign}(s) \ne y]$ respectively, then for $\tau = 0$, (3) and (4) correspond to the demographic parity (Dwork et al., 2011) and equality of opportunity (Hardt et al., 2016) constraints. Thus, we denote these two relaxed fairness measures disparity of demographic parity (DDP) and disparity of equality of opportunity (DEO). These quantities are also known as the mean-difference score in Calders & Verwer (2010). Keeping $\bar{\ell}$ generic allows us to capture a range of group fairness definitions, not just demographic parity and equality of opportunity; e.g., disparate mistreatment (Zafar et al., 2017b) corresponds to using the 0-1 loss with $\Omega^{\mathrm{DP}}_D$, and equalized odds can be captured by simply adding another constraint for $Y = 0$ along with $\Omega^{\mathrm{EO}}_D$.
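To make these definitions concrete, the following is a minimal sketch (ours, not taken from the authors' code) of how the empirical DDP and DEO scores of (3) and (4) could be computed from a finite sample. The function and variable names are illustrative, and scores are binarised by thresholding at zero.

```python
import numpy as np

def mean_difference(fair_loss, scores, y, a, condition=None):
    # Empirical mean-difference score: |avg loss in group A=0 - avg loss in group A=1|,
    # optionally restricted to a subpopulation (e.g. y == 1 for the EO variant).
    mask = np.ones(len(y), dtype=bool) if condition is None else condition
    losses = fair_loss(scores, y)
    return abs(losses[mask & (a == 0)].mean() - losses[mask & (a == 1)].mean())

# Binarise real-valued scores by thresholding at zero.
predict = lambda s: (s > 0).astype(int)
# DDP-style fairness loss: penalise predicting the positive class.
ddp_loss = lambda s, y: (predict(s) != 1).astype(float)
# DEO-style fairness loss: the 0-1 loss against the target feature.
deo_loss = lambda s, y: (predict(s) != y).astype(float)

# Toy usage on random data.
rng = np.random.default_rng(0)
scores, y, a = rng.normal(size=1000), rng.integers(0, 2, 1000), rng.integers(0, 2, 1000)
ddp = mean_difference(ddp_loss, scores, y, a)                      # disparity of demographic parity
deo = mean_difference(deo_loss, scores, y, a, condition=(y == 1))  # disparity of equality of opportunity
print(ddp, deo)
```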
3.3 Mutually contaminated learning

In the framework of learning from mutually contaminated distributions (MC learning) (Scott et al., 2013b), instead of observing samples from the true (or "clean") joint distribution $D$, one observes samples from a corrupted distribution $D_{\mathrm{corr}}$. The corruption is such that the observed class-conditional distributions are mixtures of their true counterparts. More precisely, let $D_y$ denote the conditional distribution for label $y$. Then, one assumes that

$$D_{1,\mathrm{corr}} = (1 - \alpha) \cdot D_1 + \alpha \cdot D_0, \qquad D_{0,\mathrm{corr}} = \beta \cdot D_1 + (1 - \beta) \cdot D_0, \tag{5}$$

where $\alpha, \beta \in (0, 1)$ are (typically unknown) noise parameters with $\alpha + \beta < 1$. (The constraint $\alpha + \beta < 1$ imposes no loss of generality: when $\alpha + \beta > 1$, we can simply flip the two labels and apply our theorem. When $\alpha + \beta = 1$, all information about the sensitive attribute is lost; this pathological case is equivalent to not measuring the sensitive attribute at all.) Further, the corrupted base rate $\pi_{\mathrm{corr}} := \mathbb{P}[Y_{\mathrm{corr}} = 1]$ may be arbitrary.

The MC learning framework subsumes CCN and PU learning (Scott et al., 2013b; Menon et al., 2015), which are prominent noise models that have seen sustained study in recent years (Jain et al., 2017; Kiryo et al., 2017; van Rooyen & Williamson, 2018; Katz-Samuels et al., 2019; Charoenphakdee et al., 2019).
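To make the noise model concrete, the sketch below (our own illustration, with hypothetical function and parameter names) simulates the CCN special case by flipping a binary attribute class-conditionally; the induced attribute-conditional distributions are then mixtures of the clean ones, in the spirit of (5).

```python
import numpy as np

def ccn_corrupt(a, rho_plus, rho_minus, rng=None):
    # Flip a binary attribute class-conditionally:
    # entries with a == 1 are flipped with probability rho_plus,
    # entries with a == 0 are flipped with probability rho_minus.
    rng = np.random.default_rng() if rng is None else rng
    u = rng.random(a.shape)
    flip = np.where(a == 1, u < rho_plus, u < rho_minus)
    return np.where(flip, 1 - a, a)

# Example usage; the PU / censoring setting corresponds to one of the two
# flip probabilities being zero.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=10_000)
a_ccn = ccn_corrupt(a, rho_plus=0.15, rho_minus=0.15, rng=rng)
a_pu = ccn_corrupt(a, rho_plus=0.0, rho_minus=0.2, rng=rng)
```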
4 Fairness under sensitive attribute noise

The standard fairness-aware learning problem assumes we have access to the true sensitive attribute, so that we can both measure and control our classifier's unfairness as measured by, e.g., Equation (3). Now suppose that rather than being given the sensitive attribute, we get a noisy version of it. We will show that the fairness constraint on the clean distribution is equivalent to a scaled constraint on the noisy distribution. This gives a simple reduction from fair machine learning in the presence of noise to regular fair machine learning, which can be done in a variety of ways as discussed in §2.1.

4.1 Sensitive attribute noise model

As previously discussed, we use MC learning as our noise model, as this captures both CCN and PU learning as special cases; hence, we automatically obtain results for both these interesting settings.

Our specific formulation of MC learning noise on the sensitive feature is as follows. Recall from §3.2 that $D$ is a distribution over $\mathcal{X} \times \{0,1\} \times \{0,1\}$. Following (5), for unknown noise parameters $\alpha, \beta \in (0, 1)$ with $\alpha + \beta < 1$, we assume that the corrupted class-conditional distributions are

$$D_{1,\cdot,\mathrm{corr}} = (1 - \alpha) \cdot D_{1,\cdot} + \alpha \cdot D_{0,\cdot}, \qquad D_{0,\cdot,\mathrm{corr}} = \beta \cdot D_{1,\cdot} + (1 - \beta) \cdot D_{0,\cdot}, \tag{6}$$

and that the corrupted base rate is $\pi_{a,\mathrm{corr}}$ (we write the original base rate $\mathbb{P}_{(X,A,Y) \sim D}[A = 1]$ as $\pi_a$). That is, the distribution over (instance, label) pairs for the group with $A = 1$, i.e. $P(X, Y \mid A = 1)$, is assumed to be mixed with the distribution for the group with $A = 0$, and vice-versa.

Now, when interested in the EO constraint, it can be simpler to assume that the noise instead satisfies

$$D_{1,1,\mathrm{corr}} = (1 - \alpha') \cdot D_{1,1} + \alpha' \cdot D_{0,1}, \qquad D_{0,1,\mathrm{corr}} = \beta' \cdot D_{1,1} + (1 - \beta') \cdot D_{0,1}, \tag{7}$$

for noise parameters $\alpha', \beta' \in (0, 1)$. As shown by the following, this is not a different assumption.

Lemma 1. Suppose there is noise in the sensitive attribute only, as given in Equation (6). Then, there exist constants $\alpha', \beta'$ such that Equation (7) holds.

Although the lemma gives a way to calculate $\alpha', \beta'$ from $\alpha, \beta$, in practice it may be useful to consider (7) independently. Indeed, when one is interested in the EO constraint, we will show below that only knowledge of $\alpha', \beta'$ is required. It is often much easier to estimate $\alpha', \beta'$ directly (which can be done in the same way as estimating $\alpha, \beta$, simply by considering $D_{\cdot,1,\mathrm{corr}}$ rather than $D_{\mathrm{corr}}$).

4.2 Fairness constraints under MC learning

We now show that the previously introduced fairness constraints for demographic parity and equality of opportunity are automatically robust to MC learning noise in $A$.

Theorem 2. Assume that we have noise as per Equation (6). Then, for any $\tau > 0$ and $f : \mathcal{X} \to \mathbb{R}$,

$$\Omega^{\mathrm{DP}}_D(f) \le \tau \iff \Omega^{\mathrm{DP}}_{D_{\mathrm{corr}}}(f) \le (1 - \alpha - \beta) \cdot \tau$$
$$\Omega^{\mathrm{EO}}_D(f) \le \tau \iff \Omega^{\mathrm{EO}}_{D_{\mathrm{corr}}}(f) \le (1 - \alpha' - \beta') \cdot \tau,$$

where $\alpha'$ and $\beta'$ are as per Equation (7) and Lemma 1.
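Before turning to the proof idea, here is a quick numerical sanity check of the scaling in Theorem 2 on a toy discrete distribution of our own construction (not from the paper): for a fixed classifier, the DP mean-difference computed under the corrupted conditionals equals $(1 - \alpha - \beta)$ times its clean value.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                       # tiny discrete instance space
p1 = rng.dirichlet(np.ones(n))              # P(X | A = 1) under the clean D
p0 = rng.dirichlet(np.ones(n))              # P(X | A = 0) under the clean D
lbar = rng.random(n)                        # fairness loss of a fixed f at each instance

alpha, beta = 0.2, 0.1                      # MC noise parameters (alpha + beta < 1)
p1_corr = (1 - alpha) * p1 + alpha * p0     # corrupted conditionals, as in Equation (6)
p0_corr = beta * p1 + (1 - beta) * p0

omega_clean = abs(p0 @ lbar - p1 @ lbar)            # DP mean-difference on D
omega_corr = abs(p0_corr @ lbar - p1_corr @ lbar)   # DP mean-difference on D_corr

print(omega_corr, (1 - alpha - beta) * omega_clean)  # equal up to floating point error
```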
Theorem 2 can be seen as a consequence of the immunity of the balanced error (Chan & Stolfo, 1998; Brodersen et al., 2010; Menon et al., 2013) to corruption under the MC model. Specifically, consider a distribution $D$ over an input space $\mathcal{Z}$ and label space $\mathcal{W} = \{0, 1\}$, and define

$$B_D := \mathbb{E}_{Z \mid W = 0}[h_0(Z)] + \mathbb{E}_{Z \mid W = 1}[h_1(Z)]$$

for functions $h_0, h_1 : \mathcal{Z} \to \mathbb{R}$. Then, if $h_0(z) + h_1(z) = 0$ for every $z \in \mathcal{Z}$, we have (van Rooyen, 2015, Theorem 4.16; see also Blum & Mitchell, 1998; Zhang & Lee, 2008; Menon et al., 2015)

$$B_{D_{\mathrm{corr}}} = (1 - \alpha - \beta) \cdot B_D, \tag{8}$$

where $D_{\mathrm{corr}}$ refers to a corrupted version of $D$ under MC learning with noise parameters $\alpha, \beta$. That is, the effect of MC noise on $B_D$ is simply to perform a scaling. Observe that $B_D = \bar{L}_{D_{0,\cdot}}(f) - \bar{L}_{D_{1,\cdot}}(f)$ if we set $\mathcal{Z}$ to $\mathcal{X} \times \mathcal{Y}$, $W$ to the sensitive feature $A$, and $h_0((x, y)) = +\bar{\ell}(f(x), y)$, $h_1((x, y)) = -\bar{\ell}(f(x), y)$. Thus, (8) implies $\bar{L}_{D_{0,\cdot,\mathrm{corr}}}(f) - \bar{L}_{D_{1,\cdot,\mathrm{corr}}}(f) = (1 - \alpha - \beta) \cdot \big( \bar{L}_{D_{0,\cdot}}(f) - \bar{L}_{D_{1,\cdot}}(f) \big)$, and thus Theorem 2.

4.3 Algorithmic implications

Theorem 2 has an important algorithmic implication. Suppose we pick a fairness constraint $\Omega_D$ and seek to solve Equation (2) for a given tolerance $\tau \ge 0$. Then, given samples from $D_{\mathrm{corr}}$, it suffices to simply change the tolerance to $\tau' = (1 - \alpha - \beta) \cdot \tau$.

Unsurprisingly, $\tau'$ depends on the noise parameters $\alpha, \beta$. In practice, these will be unknown; however, several algorithms have been proposed to estimate them from noisy data alone (Scott et al., 2013b; Menon et al., 2015; Liu & Tao, 2016; Ramaswamy et al., 2016; Northcutt et al., 2017). Thus, we may use these to construct estimates of $\alpha, \beta$, and plug these in to construct an estimate of $\tau'$.

In sum, we may tackle fair classification in the presence of noisy $A$ by suitably combining any existing fair classification method (that takes in a parameter proportional to the mean-difference score of some fairness measure) and any existing noise estimation procedure. This is summarised in Algorithm 1. Here, FairAlg is any existing fairness-aware classification method that solves Equation (2), and NoiseEst is any noise estimation method that estimates $\alpha, \beta$.

Algorithm 1: Reduction-based algorithm for fair classification given noisy $A$.
Input: Training set $S = \{(x_i, y_i, a_i)\}_{i=1}^n$, scorer class $\mathcal{F}$, fairness tolerance $\tau$, fairness constraint $\Omega(\cdot)$, fair classification algorithm FairAlg, noise estimation algorithm NoiseEst
Output: Fair classifier $f^* \in \mathcal{F}$
1: $\hat{\alpha}, \hat{\beta} \leftarrow \mathrm{NoiseEst}(S)$
2: $\tau' \leftarrow (1 - \hat{\alpha} - \hat{\beta}) \cdot \tau$
3: return $\mathrm{FairAlg}(S, \mathcal{F}, \Omega, \tau')$
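In code, the reduction of Algorithm 1 amounts to a thin wrapper around its two subroutines. The sketch below is illustrative: fair_alg and noise_est stand in for any fair learner solving Equation (2) and any MC noise-rate estimator with the stated interfaces, and both names are ours rather than the paper's.

```python
def fair_classifier_with_noisy_A(S, F, tau, omega, fair_alg, noise_est):
    """Sketch of Algorithm 1 under assumed interfaces.

    S         : training sample with corrupted sensitive attributes
    F         : scorer class searched by the fair learner
    tau       : desired fairness tolerance w.r.t. the clean distribution
    omega     : fairness constraint (e.g. the DDP or DEO mean-difference score)
    fair_alg  : solves Equation (2) on its input sample for a given tolerance
    noise_est : returns estimates (alpha_hat, beta_hat) of the MC noise rates
    """
    alpha_hat, beta_hat = noise_est(S)               # step 1: estimate noise rates
    tau_prime = (1.0 - alpha_hat - beta_hat) * tau   # step 2: rescale the tolerance
    return fair_alg(S, F, omega, tau_prime)          # step 3: run the noiseless method
```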
4.4 Noise rate and sample complexity

So far, we have shown that at a distribution level, fairness with respect to the noisy sensitive attribute is equivalent to fairness with respect to the real sensitive attribute. However, from a sample complexity perspective, a higher noise rate will require a larger number of samples for the empirical fairness to generalise well, i.e., to guarantee fairness at a distribution level. A concurrent work by Mozannar et al. (2019) derives precise sample complexity bounds and makes this relationship explicit.

4.5 Connection to privacy and fairness

While Algorithm 1 gives a way of achieving fair classification on an already noisy dataset, such as the use case described in example (2) of §1, it can also be used to simultaneously achieve fairness and privacy. As described in example (1) of §1, the very nature of the sensitive attribute makes it likely that even if noiseless sensitive attributes are available, one might want to add noise to guarantee some form of privacy. Note that simply removing the feature does not suffice, because it would make difficult the task of developing fairness-aware classifiers for the dataset (Gupta et al., 2018). Formally, we can give the following privacy guarantee by adding CCN noise to the sensitive attribute.

Lemma 3. To achieve $(\epsilon, \delta = 0)$ differential privacy on the sensitive attribute, we can add CCN noise with $\rho_+ = \rho_- = \frac{1}{\exp(\epsilon) + 1}$ to the sensitive attribute.
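For intuition, the relation in Lemma 3 can be inverted to read off the privacy level implied by a given flip probability. The short computation below is our own illustration; it recovers the value quoted in the experiments of §5.3.

```python
import math

# Lemma 3: rho = 1 / (exp(epsilon) + 1). Inverting gives epsilon = log(1/rho - 1).
def flip_rate_for_epsilon(epsilon):
    return 1.0 / (math.exp(epsilon) + 1.0)

def epsilon_for_flip_rate(rho):
    return math.log(1.0 / rho - 1.0)

print(epsilon_for_flip_rate(0.15))   # ~1.73, the privacy level in Section 5.3
print(flip_rate_for_epsilon(1.0))    # ~0.27: stronger privacy requires more noise
```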
Thus, if a desired level of differential privacy is required before releasing a dataset, one could simply add the required amount of CCN noise to the sensitive attributes, publish this modified dataset as well as the noise level, and researchers could use Algorithm 1 (without even needing to estimate the noise rate) to do fair classification as usual.

There is previous work that tries to preserve the privacy of individuals' sensitive attributes while learning a fair classifier. Kilbertus et al. (2018) employ the cryptographic tool of secure multiparty computation (MPC) to try to achieve this goal. However, as noted by Jagielski et al. (2018), the individual information that MPC tries to protect can still be inferred from the learned model. Further, the method of Kilbertus et al. (2018) is limited to using demographic parity as the fairness criterion. A more recent work of Jagielski et al. (2018) explored preserving differential privacy (Dwork, 2006) while maintaining fairness constraints. The authors proposed two methods: one adds Laplace noise to the training data and applies the post-processing method of Hardt et al. (2016), while the other modifies the method of Agarwal et al. (2018) using the exponential mechanism as well as Laplace noise.

Our work differs from theirs in three major ways: (1) our work allows fair classification to be done using any in-processing fairness-aware classifier that allows the user to specify the desired fairness level; by contrast, the first method of Jagielski et al. (2018) requires the use of a post-processing algorithm (which generally has worse trade-offs than in-processing algorithms (Agarwal et al., 2018)), while the second method requires the use of a single particular classifier. (2) Our focus is on designing fair classifiers with noise-corrupted sensitive attributes; by contrast, the main concern in Jagielski et al. (2018) is achieving differential privacy, and thus they do not discuss how to deal with noise that is already present in the dataset. (3) Our method is shown to work for a large class of different fairness definitions.

Finally, a concurrent work of Mozannar et al. (2019) builds upon our method for the problem of preserving privacy of the sensitive attribute. The authors use a randomised response procedure on the sensitive attribute values, followed by a two-step procedure to train a fair classifier using the processed data. Theoretically, their method improves upon the sample complexity of our method and extends our privacy result to the case of non-binary groups. However, their method solely focuses on preserving privacy rather than the general problem of sensitive attribute noise.

5 Experiments

We demonstrate that it is viable to learn fair classifiers given noisy sensitive features. (Source code is available at https://github.com/AIasd/noise_fairlearn.) As our underlying fairness-aware classifier, we use a modified version of the classifier implemented in Agarwal et al. (2018) with the DDP and DEO constraints which, as discussed in §3.2, are special cases of our more general constraints (3) and (4). The classifier's original constraints can also be shown to be noise-invariant, but in a slightly different way (see Appendix C for a discussion). An advantage of this classifier is that it is shown to reach levels of fairness violation that are very close to the desired level $\tau$, i.e., for small enough values of $\tau$ it will reach the constraint boundary.
While we had to choose a particular classifier, our method can be used in front of any downstream fair classifier, as long as it takes in a parameter that controls the strictness of the fairness constraint and its constraints are special cases of our very general constraints (3) and (4).

5.1 Noise setting

Our case studies focus on two common special cases of MC learning: CCN and PU learning. Under CCN noise, the sensitive feature's value is randomly flipped with probability $\rho_+$ if its value was 1, or with probability $\rho_-$ if its value was 0. As shown in Menon et al. (2015, Appendix C), CCN noise is a special case of MC learning. For PU learning we consider the censoring setting (Elkan & Noto, 2008), which is the special case of CCN learning where one of $\rho_+$ and $\rho_-$ is 0. While our results also apply to the case-controlled setting of PU learning (Ward et al., 2009), the former setting is more natural in our context. Note that from $\rho_+$ and $\rho_-$ one can obtain $\alpha$ and $\beta$ as described in Menon et al. (2015).

5.2 Benchmarks

For each case study, we evaluate our method (termed cor scale); recall that this scales the input parameter $\tau$ using Theorem 2 and the values of $\rho_+$ and $\rho_-$, and then uses the fair classifier to perform classification. We compare our method with three different baselines. The first two trivial baselines apply the fair classifier directly on the non-corrupted data (termed nocor) and on the corrupted data (termed cor). While the first baseline is clearly the ideal, it is not possible when only the corrupted data is available. The second baseline should show that there is indeed an empirical need to deal with the noise in some way, and that it cannot simply be ignored. The third, non-trivial, baseline (termed denoise) first denoises $A$ and then applies the fair classifier on the denoised distribution. This denoising is done by applying the Rank Pruning method of Northcutt et al. (2017). Note that we provide Rank Pruning with the same known values of $\rho_+$ and $\rho_-$ that we use to apply our scaling, so this is a fair comparison to our method. Compared to denoise, we do not explicitly infer individual sensitive feature values; thus, our method does not compromise privacy.

For both case studies, we study the relationship between the input parameter $\tau$ and the testing error and fairness violation. For simplicity, we only consider the DP constraint.

5.3 Case study: privacy preservation

In this case study, we look at COMPAS, a dataset from ProPublica (Angwin et al., 2016) that is widely used in the study of fair algorithms. Given various features about convicted individuals, the task is to predict recidivism, and the sensitive attribute is race. The data comprises 7918 examples and 10 features. In our experiment, we assume that to preserve differential privacy, CCN noise with $\rho_+ = \rho_- = 0.15$ is added to the sensitive attribute. As per Lemma 3, this guarantees $(\epsilon, \delta = 0)$ differential privacy with $\epsilon \approx 1.73$.

Figure 1: Relationship between input fairness tolerance $\tau$ versus DP fairness violation (left panels), and versus error (right panels), for (a) the COMPAS dataset (privacy case study) and (b) the law school dataset (PU learning case study). Our method (cor scale) achieves approximately the ideal fairness violation (indicated by the gray dashed line in the left panels), with only a mild degradation in accuracy compared to training on the uncorrupted data (indicated by the nocor method). Baselines that perform no noise-correction (cor) and explicitly denoise the data (denoise) offer suboptimal tradeoffs by comparison; for example, the former achieves slightly lower error rates, but does so at the expense of greater fairness violation.
We assume that the noise level is released with the dataset (and is thus known). We performed fair classification on this noisy data using our method and compared the results to the three benchmarks described above. Figure 1a shows the average result over three runs, each with a random 80-20 training-testing split. (Note that fairness violations and errors are calculated with respect to the true uncorrupted features.) We draw two key insights from this graph:

(i) In terms of fairness violation, our method (cor scale) approximately achieves the desired fairness tolerance (shown by the gray dashed line). This is both expected and ideal, and it matches what happens when there is no noise (nocor). By contrast, the naïve method cor strongly violates the fairness constraint.

(ii) In terms of accuracy, our method only suffers mildly compared with the ideal noiseless method (nocor); some degradation is expected, as noise will lead to some loss of information. By contrast, denoise sacrifices much more predictive accuracy than our method.

In light of both the above, our method is seen to achieve the best overall tradeoff between fairness and accuracy. Experimental results with EO constraints, and on other commonly studied datasets in the fairness literature (adult, german), show similar trends as in Figure 1a, and are included in Appendix D for completeness.

5.4 Case study: PU learning

In this case study, we consider the dataset law school, which is a subset of the original dataset from LSAC (Wightman, 1998). In this dataset, one is provided with information about various individuals (grades, part-time/full-time status, age, etc.) and must determine whether or not the individual passed the bar exam. The sensitive feature is race; we only consider black and white. After preprocessing the data by removing instances that had missing values and those belonging to other ethnicity groups (neither black nor white), we were left with 3738 examples, each with 11 features.

While the data ostensibly provides the true values of the sensitive attribute, one may imagine having access to only PU information. Indeed, when the data is collected, one could imagine that individuals from the minority group would have a much greater incentive to conceal their group membership due to fear of discrimination. Thus, any individual identified as belonging to the majority group could be assumed to have been correctly identified (and would be part of the positive instances). On the other hand, no definitive conclusions could be drawn about individuals identified as belonging to the minority group (these would therefore be part of the unlabelled instances).

To model a PU learning scenario, we added CCN noise to the dataset with $\rho_+ = 0$ and $\rho_- = 0.2$. We initially assume that the noise rate is known. Figure 1b shows the average result over three runs under this setting, each with a random 80-20 training-testing split. We draw the same conclusion as before: our method achieves the highest accuracy while respecting the specified fairness constraint.

Unlike in the privacy case, the noise rate in the PU learning scenario is usually unknown in practice, and must be estimated. Such estimates will inevitably be approximate. We thus evaluate the impact of the error of the noise rate estimate on all methods. In Figure 2, we consider a PU scenario where we only have access to an estimate $\hat{\rho}_-$ of the negative noise rate, whose true value is $\rho_- = 0.2$. Figure 2 shows the impact of different values of $\hat{\rho}_-$ on the fairness violation and error. We see that as long as this estimate is reasonably accurate, our method performs the best in terms of being closest to the case of running the fair algorithm on uncorrupted data.

Figure 2: Relationship between the estimated noise level $\hat{\rho}_-$ and fairness violation/error on the law school dataset using the DP constraint (testing curves), with $\rho_+ = 0$ and true $\rho_- = 0.2$. Our method (cor scale) is not overly sensitive to imperfect estimates of the noise rate, evidenced by its fairness violation and accuracy closely tracking that of training on the uncorrupted data (nocor) as $\hat{\rho}_-$ is varied; that is, the red curve in the left plot closely tracks the yellow reference curve. By contrast, the baseline that explicitly denoises the data (denoise) deviates strongly from nocor, and is sensitive to small changes in $\hat{\rho}_-$. This illustrates that our method performs well even when noise rates must be estimated.

In sum, these results are consistent with our derivation and show that our method cor scale can achieve the desired degree of fairness while minimising loss of accuracy. Appendix E includes results for different settings of $\tau$ and noise level, and on other datasets, showing similar trends.

6 Conclusion and future work

In this paper, we showed both theoretically and empirically that even under the very general MC learning noise model (Scott et al., 2013a) on the sensitive feature, fairness can still be preserved by scaling the input unfairness tolerance parameter $\tau$. In future work, it would be interesting to consider the case of categorical sensitive attributes (as applicable, e.g., to race), and the more challenging case of instance-dependent noise (Awasthi et al., 2015).

We remark also that in independent work, Awasthi et al. (2019) studied the effect of sensitive attribute noise on the post-processing method of Hardt et al. (2016). In particular, they identified conditions under which such post-processing can still yield an approximately fair classifier. Our approach has the advantage of being applicable to a generic in-processing fair classifier; however, their approach also handles the case where the sensitive feature is used as an input to the classifier. Exploration of the synthesis of the two approaches is another promising direction for future work.

References

Bank marketing dataset (a sample taken from UCI). https://www.kaggle.com/rouseguy/bankbalanced/kernels.

Adler, P., Falk, C., Friedler, S. A., Nix, T., Rybeck, G., Scheidegger, C., Smith, B., and Venkatasubramanian, S. Auditing black-box models for indirect influence. Knowl. Inf. Syst., 54(1):95–122, January 2018. ISSN 0219-1377.

Agarwal, A., Beygelzimer, A., Dudík, M., Langford, J., and Wallach, H. A reductions approach to fair classification. In Dy, J. G. and Krause, A. (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80, pp. 60–69, Stanford, CA, 2018. JMLR.

Angluin, D. and Laird, P. Learning from noisy examples. Machine Learning, 2(4):343–370, Apr 1988. ISSN 1573-0565.

Angwin, J., Larson, J., Mattu, S., and Kirchner, L. Machine bias. 2016.
Awasthi, P., Balcan, M.-F., Haghtalab, N., and Urner, R. Efficient learning of linear separators under bounded noise. In Conference on Learning Theory (COLT), volume 40, pp. 167–190, 2015.

Awasthi, P., Kleindessner, M., and Morgenstern, J. Equalized odds postprocessing under imperfect group information, 2019.

Blum, A. and Mitchell, T. Combining labeled and unlabeled data with co-training. In Conference on Learning Theory (COLT), pp. 92–100, 1998.

Brodersen, K. H., Ong, C. S., Stephan, K. E., and Buhmann, J. M. The balanced accuracy and its posterior distribution. In Proceedings of the 2010 20th International Conference on Pattern Recognition, ICPR '10, pp. 3121–3124, Washington, DC, USA, 2010. IEEE Computer Society. ISBN 978-0-7695-4109-9.

Buolamwini, J. and Gebru, T. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, pp. 77–91, 2018.

Calders, T. and Verwer, S. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery, 21(2):277–292, Sep 2010. ISSN 1573-756X.

Calders, T., Kamiran, F., and Pechenizkiy, M. Building classifiers with independency constraints. In 2009 IEEE International Conference on Data Mining Workshops, pp. 13–18, Dec 2009.

Calmon, F., Wei, D., Vinzamuri, B., Natesan Ramamurthy, K., and Varshney, K. R. Optimized pre-processing for discrimination prevention. In Advances in Neural Information Processing Systems 30, pp. 3992–4001, 2017.

Chan, P. K. and Stolfo, S. J. Toward scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, KDD '98, pp. 164–168, 1998.

Charoenphakdee, N., Lee, J., and Sugiyama, M. On symmetric losses for learning from corrupted labels. In ICML, 2019.

Cotter, A., Gupta, M. R., Jiang, H., Srebro, N., Sridharan, K., Wang, S., Woodworth, B. E., and You, S. Training well-generalizing classifiers for fairness metrics and other data-dependent constraints. CoRR, abs/1807.00028, 2018. URL http://arxiv.org/abs/1807.00028.

del Barrio, E., Gamboa, F., Gordaliza, P., and Loubes, J.-M. Obtaining fairness using optimal transport theory. arXiv e-prints, art. arXiv:1806.03195, June 2018.

Denis, F. PAC learning from positive statistical queries, 1998.

Dheeru, D. and Karra Taniskidou, E. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml.

Donini, M., Oneto, L., Ben-David, S., Shawe-Taylor, J. S., and Pontil, M. Empirical risk minimization under fairness constraints. In Advances in Neural Information Processing Systems 31, pp. 2796–2806, 2018.

Dwork, C. Differential privacy. In Bugliesi, M., Preneel, B., Sassone, V., and Wegener, I. (eds.), Automata, Languages and Programming, pp. 1–12, Berlin, Heidelberg, 2006. Springer Berlin Heidelberg. ISBN 978-3-540-35908-1.

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. S. Fairness through awareness. CoRR, abs/1104.3913, 2011.

Dwork, C., Immorlica, N., Kalai, A. T., and Leiserson, M. Decoupled classifiers for group-fair and efficient machine learning. In Proceedings of the 1st Conference on Fairness, Accountability and Transparency, pp. 119–133, 2018.

Elkan, C. and Noto, K. Learning classifiers from only positive and unlabeled data. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 213–220, 2008.

Feldman, M. Computational fairness: Preventing machine-learned discrimination. 2015.
Gupta, M. R., Cotter, A., Fard, M. M., and Wang, S. Proxy fairness. CoRR, abs/1806.11212, 2018. URL http://arxiv.org/abs/1806.11212.

Hardt, M., Price, E., and Srebro, N. Equality of opportunity in supervised learning. In Proceedings of the 30th International Conference on Neural Information Processing Systems, NIPS '16, pp. 3323–3331, USA, 2016. ISBN 978-1-5108-3881-9.

Hashimoto, T. B., Srivastava, M., Namkoong, H., and Liang, P. Fairness without demographics in repeated loss minimization. arXiv preprint arXiv:1806.08010, 2018.

Heidari, H., Loi, M., Gummadi, K. P., and Krause, A. A moral framework for understanding of fair ML through economic models of equality of opportunity. ACM Conference on Fairness, Accountability, and Transparency (ACM FAT*), January 2019.

Jagielski, M., Kearns, M., Mao, J., Oprea, A., Roth, A., Sharifi-Malvajerdi, S., and Ullman, J. Differentially private fair learning. 2018. URL https://arxiv.org/pdf/1812.02696.pdf.

Jain, S., White, M., and Radivojac, P. Recovering true classifier performance in positive-unlabeled learning. In AAAI, 2017.

Johndrow, J. E. and Lum, K. An algorithm for removing sensitive information: application to race-independent recidivism prediction. arXiv e-prints, art. arXiv:1703.04957, March 2017.

Katz-Samuels, J., Blanchard, G., and Scott, C. Decontamination of mutual contamination models. JMLR, 20(41):1–57, 2019.

Kilbertus, N., Gascon, A., Kusner, M., Veale, M., Gummadi, K., and Weller, A. Blind justice: Fairness with encrypted sensitive attributes. In Dy, J. and Krause, A. (eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 2630–2639, Stockholmsmässan, Stockholm, Sweden, 10–15 Jul 2018. PMLR. URL http://proceedings.mlr.press/v80/kilbertus18a.html.

Kim, M. P., Reingold, O., and Rothblum, G. N. Fairness through computationally-bounded awareness. CoRR, abs/1803.03239, 2018. URL http://arxiv.org/abs/1803.03239.

Kiryo, R., Niu, G., du Plessis, M., and Sugiyama, M. Positive-unlabeled learning with non-negative risk estimator. In NIPS, 2017.

Kusner, M. J., Loftus, J., Russell, C., and Silva, R. Counterfactual fairness. In Advances in Neural Information Processing Systems 30, pp. 4066–4076, 2017.

Lahoti, P., Weikum, G., and Gummadi, K. P. iFair: Learning individually fair data representations for algorithmic decision making. 06 2018.

Lipton, Z., McAuley, J., and Chouldechova, A. Does mitigating ML's impact disparity require treatment disparity? In Advances in Neural Information Processing Systems, pp. 8136–8146, 2018.

Liu, T. and Tao, D. Classification with noisy labels by importance reweighting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38:447–461, 2016.

Louizos, C., Swersky, K., Li, Y., Welling, M., and Zemel, R. The variational fair autoencoder. 11 2015.

Lum, K. and Johndrow, J. E. A statistical framework for fair predictive algorithms. CoRR, abs/1610.08077, 2016.

Menon, A. K., Narasimhan, H., Agarwal, S., and Chawla, S. On the statistical consistency of algorithms for binary classification under class imbalance. In Proceedings of the 30th International Conference on Machine Learning, 2013.

Menon, A. K., van Rooyen, B., Ong, C. S., and Williamson, R. C. Learning from corrupted binary labels via class-probability estimation. In Proceedings of the 32nd International Conference on Machine Learning, 2015.

Mohri, M., Sivek, G., and Suresh, A. T. Agnostic federated learning. In International Conference on Machine Learning, 2019.
Mozannar, H., Ohannessian, M. I., and Srebro, N. Fair learning with private data. In NeurIPS 2019 Workshop on Machine Learning with Guarantees, 2019. URL https://drive.google.com/file/d/1dqEexwzlSLK5XPckM00Q7Lggfh5o6saB/view.

Natarajan, N., Tewari, A., Dhillon, I. S., and Ravikumar, P. Learning with noisy labels. In Neural Information Processing Systems (NIPS), Dec 2013.

Northcutt, C. G., Wu, T., and Chuang, I. L. Learning with confident examples: Rank pruning for robust classification with noisy labels. In Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, UAI '17. AUAI Press, 2017.

Pedreshi, D., Ruggieri, S., and Turini, F. Discrimination-aware data mining. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 560–568. ACM, 2008.

Ramaswamy, H. G., Scott, C., and Tewari, A. Mixture proportion estimation via kernel embedding of distributions. In Proceedings of the 33rd International Conference on Machine Learning, ICML '16, pp. 2052–2060. JMLR.org, 2016.

Scott, C., Blanchard, G., and Handy, G. Classification with asymmetric label noise: Consistency and maximal denoising. In Conference on Learning Theory (COLT), volume 30, pp. 489–511, 2013a.

Scott, C., Blanchard, G., and Handy, G. Classification with asymmetric label noise: Consistency and maximal denoising. In Shalev-Shwartz, S. and Steinwart, I. (eds.), Proceedings of the 26th Annual Conference on Learning Theory, volume 30 of Proceedings of Machine Learning Research, pp. 489–511, Princeton, NJ, USA, 12–14 Jun 2013b. PMLR.

Speicher, T., Heidari, H., Grgic-Hlaca, N., Gummadi, K. P., Singla, A., Weller, A., and Zafar, M. B. A unified approach to quantifying algorithmic unfairness: Measuring individual & group unfairness via inequality indices. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '18, pp. 2239–2248, 2018. ISBN 978-1-4503-5552-0.

van Rooyen, B. Machine Learning via Transitions. PhD thesis, The Australian National University, 2015.

van Rooyen, B. and Williamson, R. C. A theory of learning with corrupted labels. JMLR, 18(228):1–50, 2018.

Ward, G., Hastie, T., Barry, S., Elith, J., and Leathwick, J. R. Presence-only data and the EM algorithm. Biometrics, 65(2):554–563, 2009.

Wightman, L. LSAC National Longitudinal Bar Passage Study, 1998.

Williamson, R. C. and Menon, A. K. Fairness risk measures. In Proceedings of the 36th International Conference on Machine Learning, ICML 2019, 9–15 June 2019, Long Beach, California, USA, pp. 6786–6797, 2019.

Woodworth, B., Gunasekar, S., Ohannessian, M. I., and Srebro, N. Learning non-discriminatory predictors. In Proceedings of the 2017 Conference on Learning Theory, volume 65, pp. 1920–1953, Amsterdam, Netherlands, 07–10 Jul 2017. PMLR.

Zafar, M. B., Valera, I., Gomez Rodriguez, M., Gummadi, K., and Weller, A. From parity to preference-based notions of fairness in classification. In Advances in Neural Information Processing Systems 30, pp. 229–239, 2017a.

Zafar, M. B., Valera, I., Gomez Rodriguez, M., and Gummadi, K. P. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web, pp. 1171–1180, 2017b.

Zemel, R., Wu, Y., Swersky, K., Pitassi, T., and Dwork, C. Learning fair representations. In Proceedings of the 30th International Conference on Machine Learning, pp. 325–333, 2013.
Zhang, D. and Lee, W. S. Learning classifiers without negative examples: A reduction approach. In 2008 Third International Conference on Digital Information Management, pp. 638–643, Nov 2008.