# Faking Fairness via Stealthily Biased Sampling

The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20)

Kazuto Fukuchi,1,3 Satoshi Hara,2 Takanori Maehara3
1University of Tsukuba, 2Osaka University, 3RIKEN Center for Advanced Intelligence Project
3{kazuto.fukuchi, takanori.maehara}@riken.jp, 2satohara@ar.sanken.osaka-u.ac.jp
(The authors are listed in alphabetical order.)

## Abstract

Auditing the fairness of decision-makers is now in high demand. To respond to this social demand, several fairness auditing tools have been developed. The focus of this study is to raise awareness of the risk of malicious decision-makers who fake fairness by abusing the auditing tools and thereby deceiving social communities. The question is whether such a fraud by the decision-maker is detectable, so that society can avoid the risk of fake fairness. In this study, we answer this question negatively. We specifically focus on a situation where the decision-maker publishes a benchmark dataset as evidence of his/her fairness and attempts to deceive a person who uses an auditing tool that computes a fairness metric. To assess the (un)detectability of the fraud, we explicitly construct an algorithm, stealthily biased sampling, that can deliberately construct an evil benchmark dataset via subsampling. We show that the fraud made by stealthily biased sampling is indeed difficult to detect, both theoretically and empirically.

## 1 Introduction

### Background

Machine learning models are increasingly used in individuals' consequential decisions such as loans, insurance, and employment. In such applications, the models are required to be fair in the sense that their outputs should be irrelevant to the individuals' sensitive features such as gender, race, and religion (Pedreshi, Ruggieri, and Turini 2008). Several efforts have been devoted to establishing mathematical formulations of fairness (Dwork et al. 2012; Hardt et al. 2016; Dwork and Ilvento 2018) and to proposing algorithms that meet the fairness criteria (Bolukbasi et al. 2016; Feldman et al. 2015; Joseph et al. 2016).

With increasing attention to fairness, social communities now demand audits of systems that incorporate machine learning algorithms to prevent unfair decisions. For example, a 2014 White House report (Podesta et al. 2014) mentioned that "[t]he increasing use of algorithms to make eligibility decisions must be carefully monitored for potential discriminatory outcomes for disadvantaged groups, even absent discriminatory intent." A similar statement also appeared in a 2016 White House report (Munoz, Smith, and Patil 2016).

To respond to this social request, several fairness auditing tools have been developed (Adebayo 2016; Bellamy et al. 2018; Saleiro et al. 2018). These tools help the decision-maker investigate the fairness of their system by, e.g., computing several fairness metrics (Saleiro et al. 2018), measuring the significance of the system inputs (Adebayo 2016), and identifying minority groups with unfair treatments (Bellamy et al. 2018). If the decision-maker finds unfairness in their system, she/he can then fix it by inspecting the causes of the unfairness. These auditing tools are also useful for promoting the fairness of the decision-maker's system to the social communities.
For promoting the fairness of the system, the decision-maker publishes the outputs of the auditing tools. If the outputs suggest no unfairness in the system, this fact can be seen as evidence of the system's fairness. The decision-maker thus can appeal to the fairness of their system by publishing this fact to earn the trust of the social communities.

### Risk of Fake Fairness

The focus of this study is to raise awareness of the potential risk of malicious decision-makers who fake fairness. If the decision-maker is malicious, he may manipulate the auditing tools' results so that his system looks fair to the social communities even if the system is in fact unfair. Such a risk is avoidable if the social communities can detect the decision-maker's fraud. Therefore, the question is whether such a fraud is detectable. In this study, we answer this question negatively. That is, the fraud is very difficult to detect in practice, which indicates that society is now facing a potential risk of fake fairness.

In what follows, we refer to a person who attempts to detect the decision-maker's fraud as a detector. If the decision-maker only publishes the auditing tools' outputs, the detectability of the decision-maker's fraud is considerably low: the malicious decision-maker may modify the auditing tools' outputs arbitrarily, whereas the detector has no way to certify whether or not the outputs were modified. It follows that a decision-maker who only publishes the auditing tools' outputs might be untrustworthy. The decision-maker should publish more information about their system in addition to the auditing tools' outputs to acquire the social communities' trust. However, because the system's information usually involves confidential information, the decision-maker wants to prove fairness by publishing minimal information about their system.

In this study, we investigate a decision-maker who attempts to prove the fairness of their system by constructing a benchmark dataset. That is, the decision-maker publishes a subset of his dataset together with his decisions as minimal information for proving the fairness of the system. Given the benchmark dataset with the decisions, the detector can confirm the fairness of the system by using the auditing tools. In particular, we focus our attention on an auditing tool that computes a fairness metric. With this setup, we assess the detectability of the decision-maker's fraud.

### Biased Sampling Attack

With the setup above, we consider a type of attacking algorithm for the decision-maker, the biased sampling attack. In the biased sampling attack, an attacker has a dataset D obtained from an underlying distribution P. Here, the dataset D involves the decisions made by the decision-maker's system, which are possibly unfair. The attacker deliberately selects a subset Z ⊆ D as the benchmark dataset so that the value of the fairness metric for Z is within a fair level. Then, the detector who employs an auditing tool that computes the fairness metric cannot detect the unfairness of the decision-maker's system.

The simplest method for the biased sampling attack might be case-control sampling (Mantel and Haenszel 1959). If the sensitive information is gender (man or woman) and the decision is binary (positive or negative), this method classifies the dataset into four classes: (man, positive), (woman, positive), (man, negative), and (woman, negative). Then, it samples the desired numbers of points from each class. By controlling the number of points in each class appropriately, it produces a fair subset Z.
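As a concrete illustration, the following is a minimal sketch of case-control sampling under the binary sensitive-feature and binary-decision setting above. It is not code from the paper; the function name and the column names `sensitive` and `decision` are hypothetical.

```python
import pandas as pd

def case_control_sample(df, counts, rng_seed=0):
    """Case-control sampling: draw a prescribed number of points from each
    (sensitive, decision) class so that the disclosed subset looks fair.

    df     : DataFrame with hypothetical columns 'sensitive' and 'decision'
    counts : dict mapping (sensitive, decision) -> number of points to draw
    """
    parts = []
    for (s, y), n in counts.items():
        cls = df[(df["sensitive"] == s) & (df["decision"] == y)]
        parts.append(cls.sample(n=n, random_state=rng_seed))
    return pd.concat(parts)

# Example: disclose 200 points with a 60% positive rate in both groups,
# which drives the demographic parity of the disclosed subset to zero.
# counts = {(0, 1): 60, (0, 0): 40, (1, 1): 60, (1, 0): 40}
# Z = case_control_sample(D, counts)
```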
Fortunately, the fraud of case-control sampling could be detected as follows. The detector compares the distribution of the benchmark dataset Z with her prior knowledge (e.g., distributions of ages or zip codes). Because the case-control samples involve a bias relative to the original distribution, the detector may discover something unnatural, which indicates the decision-maker's fraud in the data-revealing process.

To hide the fraud, the malicious decision-maker will select a fair subset Z whose distribution looks similar to that of D. We refer to such a subset as a stealthily biased subset and to the problem of sampling such a subset as stealthily biased sampling. Intuitively, the problem is formulated as follows; the mathematical formulation is given in Section 3.

Problem 1 (Stealthily biased sampling problem (informal)). Given a possibly unfair dataset D obtained from an underlying distribution P, sample a subset Z ⊆ D such that (i) Z is fair in terms of some fairness criteria, and (ii) distinguishing the distribution of Z from P is difficult.

### Our Contributions

In this study, we develop an algorithm for the stealthily biased sampling problem and demonstrate the difficulty of its detection.

First, we formulate the stealthily biased sampling problem as a Wasserstein distance minimization problem. We show that this problem reduces to the minimum-cost flow problem and can be solved in polynomial time. (Section 3)

Second, we show the difficulty of detecting the proposed algorithm. We introduce an ideal detector who can access the underlying distribution P and compares the distribution of Z with P by a statistical test. The ideal detector has full information to perform the previously mentioned fraud-detection procedure, and no realistic detector can have such access. Therefore, if the ideal detector cannot detect the fraud, we can conclude that no realistic detector can detect the fraud either. We prove that the Wasserstein distance is an upper bound on the advantage, a distinguishability measure used in cryptographic theory (Goldreich 2009), with respect to the Kolmogorov-Smirnov statistical test (KS test) (Massey Jr. 1951) (Theorem 4). This means that the proposed algorithm is hard to detect even if the ideal detector uses the KS test. (Section 4)

Finally, through synthetic and real-world data experiments, we show that the decision-maker can indeed pretend to be fair by using stealthily biased sampling. Specifically, we demonstrate that the detector cannot detect the fraud of the decision-maker. In the experiments, we investigate detectability against a detector who can access an independent observation from P but cannot access P itself. This detector is also ideal but more practical than the detector introduced in Section 4. The experimental results thus show detectability in a more practical setting than the theoretically analyzed one. (Section 5)

## 2 Preliminaries

### Wasserstein Distance

Let V be a finite set, and let $\mu, \nu : V \to \mathbb{R}_{\ge 0}$ be measures on V. A measure π on V × V is a coupling measure of μ and ν if $\mu_i = \sum_{j \in V} \pi_{ij}$ and $\nu_j = \sum_{i \in V} \pi_{ij}$, which is denoted by π ∈ Δ(μ, ν). Let (X, d) be a metric space, i.e., $d : X \times X \to \mathbb{R}$ is positive definite, symmetric, and satisfies the triangle inequality. Suppose that each i ∈ V has a feature $x_i \in X$ on the metric space. Then, the Wasserstein distance between μ and ν, denoted by W(μ, ν), is defined as the optimal value of the following optimization problem (Vaserstein 1969):

$$ W(\mu, \nu) \;=\; \min_{\pi \in \Delta(\mu, \nu)} \sum_{i,j \in V} d(x_i, x_j)\, \pi_{ij}. \tag{2.1} $$
The Wasserstein distance can be computed in polynomial time by reducing it to the minimum-cost flow problem or by using the Sinkhorn iteration (Peyré and Cuturi 2019).

### Minimum-Cost Flow

Let G = (V, E) be a directed graph, where V is the set of vertices, E is the set of edges, $c : E \to \mathbb{R}$ is the capacity, and $a : E \to \mathbb{R}$ is the cost. The minimum-cost flow problem is given by

$$ \min_{f} \sum_{e \in E} a(e) f(e) \quad \text{s.t.} \quad 0 \le f(e) \le c(e) \;\; (e \in E), \qquad \sum_{e \in \delta^+(u)} f(e) - \sum_{e \in \delta^-(u)} f(e) = \begin{cases} 0, & u \in V \setminus \{s, t\}, \\ d, & u = s, \\ -d, & u = t, \end{cases} \tag{2.2} $$

where $\delta^+(u) = \{(u, v) \in E\}$ and $\delta^-(v) = \{(u, v) \in E\}$ are the outgoing edges from u and the incoming edges to v, respectively, and d ≥ 0 is the required amount of flow. This problem is solvable in $\tilde{O}(|E| \sqrt{|V|})$ time in theory (Lee and Sidford 2013), where $\tilde{O}$ suppresses log factors. A practical evaluation of minimum-cost flow algorithms is given in the study by Kovács (2015).

## 3 Algorithm for Stealthily Biased Sampling

We formulate the stealthily biased sampling problem as a Wasserstein distance minimization problem. The difficulty of detecting the stealthily biased sampling is studied in Section 4. Here, we present a formulation for categorical biasing, which controls the number of points in each category.

### Problem Formulation

Let X be a metric space for the feature space and Y be a finite set representing the outcomes of the decisions. An entry of x ∈ X corresponds to the sensitive information; let S be a finite set representing the classes of sensitive information, and let $s : X \to S$ be the mapping that extracts the sensitive information from the feature. The dataset is given by $D = \{(x_1, y_1), \dots, (x_N, y_N)\}$, where $x_i \in X$ is the feature of the i-th point and $y_i \in Y$ is the decision for the i-th point. For simplicity, we write i ∈ D for $(x_i, y_i) \in D$.

Let ν be the uniform measure on D with total mass K, i.e., $\nu_i = K/N$ for i ∈ D. This is our reference distribution: if the decision-maker is not cheating, he will disclose a subset Z ⊆ D following this distribution, i.e., P(i ∈ Z) = ν_i, where P denotes the probability. However, as the decision-maker wants to show that the output is fair, he constructs another distribution μ. Similar to the case-control sampling discussed in Section 1, we classify the dataset into bins S × Y and control the expected number of points sampled from each bin. Let $k : S \times Y \to \mathbb{Z}_{\ge 0}$ be the prescribed number of points for each bin, where $K = \sum_{s \in S, y \in Y} k(s, y) \le |D|$. Then, μ satisfies the requirement if

$$ \sum_{(x_i, y_i) \in D:\; s(x_i) = s,\; y_i = y} \mu_i = k(s, y) \qquad (s \in S,\; y \in Y). $$

We write μ ∈ P(k) if μ satisfies the above constraint. Note that by choosing k appropriately, we can ensure that Z is fair, thus meeting the first requirement in Problem 1.

Figure 3.1: Flow network for biased sampling. An edge is labeled (c, a) if it has capacity c and cost a.

To meet the second requirement in Problem 1, the decision-maker must determine the distribution μ such that μ is indistinguishable from the reference distribution ν. Here, we propose to measure the indistinguishability by the Wasserstein distance. Then, the stealthily biased sampling problem is mathematically formulated as follows.

Problem 2 (Stealthily biased sampling problem (formal)).
$$ \min_{\mu} \; W(\mu, \nu) \quad \text{s.t.} \quad \mu \in P(k). $$

By substituting the definition of the Wasserstein distance into Problem 2, we obtain

$$ \min_{\pi, \mu} \; \sum_{i,j \in D} d(x_i, x_j)\, \pi_{ij} \quad \text{s.t.} \quad \pi \in \Delta(\mu, \nu),\; \mu \in P(k). \tag{3.2} $$

As the objective function is linear in π and both Δ(μ, ν) and P(k) are polytopes, Problem (3.2) is a linear program and hence can be solved in polynomial time (Grötschel, Lovász, and Schrijver 1981).
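Because (3.2) is a plain linear program, small instances can be solved directly with an off-the-shelf LP solver before resorting to the flow reduction in the next subsection. The sketch below is not the authors' implementation (Section 5 uses a minimum-cost flow solver); it assumes the squared Euclidean cost used in the experiments, encodes the bin constraints of P(k), and additionally imposes μ_i ≤ 1 (each point disclosed at most once in expectation), mirroring the unit capacities of the flow network described next. The function name and argument layout are our own.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

def stealthily_biased_weights(X, bins, k, K):
    """Solve the linear program (3.2) for small N: minimize sum_ij d(x_i, x_j) pi_ij
    over couplings pi >= 0 whose column sums equal the reference mass K/N and whose
    row sums mu_i (the expected disclosure weights) satisfy the bin totals
    sum_{i in bin b} mu_i = k[b] and mu_i <= 1.

    X    : (N, d) feature matrix
    bins : length-N integer array; bins[i] encodes the (s, y) pair of point i
    k    : dict mapping bin id -> prescribed expected count (sums to K)
    K    : expected size of the disclosed benchmark dataset
    Returns mu, the expected number of times each point is disclosed.
    """
    N = len(X)
    cost = cdist(X, X, metric="sqeuclidean").ravel()    # d(x_i, x_j), squared Euclidean as in Sec. 5

    A_eq, b_eq = [], []
    for j in range(N):                                   # column sums: every reference point j receives K/N
        a = np.zeros((N, N)); a[:, j] = 1.0
        A_eq.append(a.ravel()); b_eq.append(K / N)
    for b in np.unique(bins):                            # bin totals: mass emitted by bin b equals k[b]
        a = np.zeros((N, N)); a[bins == b, :] = 1.0
        A_eq.append(a.ravel()); b_eq.append(k[b])

    A_ub, b_ub = [], []
    for i in range(N):                                   # mu_i <= 1: disclose each point at most once in expectation
        a = np.zeros((N, N)); a[i, :] = 1.0
        A_ub.append(a.ravel()); b_ub.append(1.0)

    res = linprog(cost, A_ub=np.vstack(A_ub), b_ub=b_ub,
                  A_eq=np.vstack(A_eq), b_eq=b_eq,
                  bounds=(0, None), method="highs")
    pi = res.x.reshape(N, N)
    return pi.sum(axis=1)                                # mu_i = sum_j pi_ij
```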
### Efficient Algorithm

To establish an efficient algorithm for the stealthily biased sampling problem, we reduce the problem to a minimum-cost flow problem. We construct the network G = (V, E) in Figure 3.1 with capacity c and cost a. The vertices V consist of the following five classes: (i) a supersource s; (ii) case-controlling vertices $u_{sy}$ for all s ∈ S and y ∈ Y; (iii) left vertices $l_i$ for all i ∈ D; (iv) right vertices $r_j$ for all j ∈ D; and (v) a supersink t. The edges E consist of the following four classes: (i') $(s, u_{sy})$ for all s ∈ S and y ∈ Y, with cost 0 and capacity k(s, y); (ii') $(u_{sy}, l_i)$ for all i ∈ D with $s(x_i) = s$ and $y_i = y$, with cost 0 and capacity 1; (iii') $(l_i, r_j)$ for all i ∈ D and j ∈ D, with cost $d(x_i, x_j)$ and capacity ∞; and (iv') $(r_j, t)$ for all j ∈ D, with cost 0 and capacity K/N. By setting the flow amount to K, the solution to this instance gives the solution to the stealthily biased sampling problem, where $\pi_{ij}$ is the flow across edge $(l_i, r_j)$ and $\mu_i$ is the flow across edge $(u_{s(x_i) y_i}, l_i)$. As |V| = O(|D|) and |E| = O(|D|²), the problem is solvable in $\tilde{O}(|D|^{2.5})$ time (Lee and Sidford 2013).
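The construction above can be reproduced with an off-the-shelf minimum-cost flow solver. The following is a rough sketch using networkx's network simplex rather than LEMON (which the paper uses in Section 5). Because `network_simplex` prefers integral data, capacities are scaled by N and costs are rounded after scaling; this scaling is our own workaround, and the sketch is only practical for small N since the graph has |D|² middle edges.

```python
import networkx as nx
import numpy as np
from scipy.spatial.distance import cdist

def solve_sampling_network(X, bins, k, K, cost_scale=10**6):
    """Build the Figure 3.1 flow network and solve it with network simplex.

    Capacities are multiplied by N (so the required flow becomes K*N) and
    costs are rounded after multiplying by cost_scale, keeping all data integral.
    Returns mu, the expected disclosure weight of each point.
    """
    N = len(X)
    D = np.rint(cdist(X, X, metric="sqeuclidean") * cost_scale).astype(int)

    G = nx.DiGraph()
    G.add_node("s", demand=-K * N)            # supersource pushes K*N units
    G.add_node("t", demand=K * N)             # supersink absorbs them

    for b in np.unique(bins):
        # (i') supersource -> case-controlling vertex, capacity k(s, y) (scaled by N)
        G.add_edge("s", ("u", b), capacity=int(k[b]) * N, weight=0)

    for i in range(N):
        # (ii') case-controlling vertex -> left vertex, capacity 1 (scaled: N)
        G.add_edge(("u", bins[i]), ("l", i), capacity=N, weight=0)
        # (iv') right vertex -> supersink, capacity K/N (scaled: K)
        G.add_edge(("r", i), "t", capacity=K, weight=0)

    for i in range(N):
        for j in range(N):
            # (iii') left -> right, infinite capacity, cost d(x_i, x_j)
            G.add_edge(("l", i), ("r", j), weight=int(D[i, j]))

    _, flow = nx.network_simplex(G)
    # mu_i is the (rescaled) flow on edge (u_{s(x_i) y_i}, l_i)
    mu = np.array([flow[("u", bins[i])][("l", i)] / N for i in range(N)])
    return mu
```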
## 4 Stealthiness of Sampling

We theoretically confirm that stealthily biased sampling is difficult to detect. Recall that the decision-maker's purpose is to make the distribution μ indistinguishable from the uniform distribution ν. To measure the indistinguishability, we introduce the advantage, a notion used in cryptographic theory (Goldreich 2009). Let $\nu^K$ be the K-fold product distribution of samples drawn from the uniform distribution ν, and let $\mu^K$ be the K-fold product distribution of samples generated by our stealthily biased sampling algorithm. To define the advantage, consider the following game in which a detector attempts to distinguish $\mu^K$ from $\nu^K$:

1. Flip an unbiased coin.
2. If the outcome is heads, the decision-maker reveals $D \sim \mu^K$ to the detector; otherwise, the decision-maker reveals $D \sim \nu^K$ to the detector.
3. The detector estimates the side of the flipped coin.

If the probability that the detector estimates the outcome of the unbiased coin correctly is near 1/2, the detector cannot distinguish whether the obtained samples are biased. Let H be a random variable with P(H = 1) = P(H = 0) = 1/2, representing the flipped unbiased coin. The detector's estimation algorithm is a mapping Φ from D to {0, 1}, whose output is 1 if the detector believes that the samples are drawn from $\nu^K$, and 0 otherwise. The probability that the detector detects the bias correctly is P(Φ(D) = H), where the randomness comes from the flipped coin H and the dataset D. The advantage is then defined as

$$ \mathrm{Adv}(\Phi; \mu^K, \nu^K) = P(\Phi(D) = H) - \tfrac{1}{2}. $$

A smaller Adv value implies that the biased distribution $\mu^K$ is more difficult to distinguish from $\nu^K$ for a detector with test algorithm Φ.

### Stealthiness against the Kolmogorov-Smirnov Test

To assess the difficulty of detecting stealthily biased sampling, we consider an ideal detector who can access the underlying distribution ν. Here, we analyze the advantage when this ideal detector uses the KS test. The KS test is a goodness-of-fit test for real-valued samples. Let $F_\nu$ be the cumulative distribution function of ν, and let $F_K$ be the cumulative distribution function of the empirical measure of the obtained samples. The KS statistic is then defined as $\mathrm{KS}(D; \nu) = \sup_x |F_K(x) - F_\nu(x)|$, and the KS test rejects if KS(D; ν) exceeds an appropriate threshold.

Let us consider detector algorithms based on the KS statistic. We formally define the detector algorithm that returns 1 if the KS statistic is larger than a threshold τ as $\Phi_{\mathrm{KS},\tau}(D) = \mathbb{I}(\mathrm{KS}(D; \nu) > \tau)$, where $\mathbb{I}$ is the indicator function. We analyze the advantage against $\Phi_{\mathrm{KS},\tau}$ under a flatness assumption on the sample distribution ν. For x ∈ X, let $B_\epsilon(x)$ be the ε-ball centered at x. The flatness assumption is defined as follows.

Assumption 3. There exist constants s, C > 0 such that for any ε > 0, $\sup_{x \in X} \nu(B_\epsilon(x)) \le (\epsilon/C)^s$.

Many natural distributions on the real line satisfy Assumption 3. For example, the one-dimensional normal distribution satisfies Assumption 3 with s = 1 and a suitable constant C.

Under the flatness assumption on ν, we establish an upper bound on the advantage against the KS test in the categorical biasing setting. Let M be the number of (sensitive attribute, decision) pair types, and let κ and κ′ be the distributions over these pairs under the sample distribution and the biased distribution, respectively. We then obtain the following theorem.

Theorem 4. Let $W(\mu^K, \nu^K)$ be the Wasserstein distance equipped with the distance $d(D, D') = \min_{i=1,\dots,K} d(x_i, x'_i)$ for $D = \{x_1, \dots, x_K\}$ and $D' = \{x'_1, \dots, x'_K\}$. Under Assumption 3, for a threshold $\tau \ge (C/K)^{1/s}/2$, we have

$$ \mathrm{Adv}(\Phi_{\mathrm{KS},\tau}; \mu^K, \nu^K) \;\le\; \frac{K^{1/s}\, W(\mu^K, \nu^K)}{C^{1/s}} \;+\; 4\, K! \left( \frac{1 + \mathrm{TV}(\kappa, \kappa')}{K} \right)^{K}, \tag{4.2} $$

where s and C are the constants from Assumption 3 and $\mathrm{TV}(\kappa, \kappa') = \frac{1}{2}\sum_{i=1}^{M} |\kappa_i - \kappa'_i|$ is the total variation distance. The proof of this theorem can be found in the supplementary material.

Since $1 + \mathrm{TV}(\kappa, \kappa') < e$ and $K! = O(\sqrt{K}\,(K/e)^K)$ by Stirling's formula, the second term in (4.2) is o(1) and is dominated by the first term. Because stealthily biased sampling minimizes the Wasserstein distance (i.e., the first term of (4.2)), it also minimizes this upper bound on the advantage. This implies that stealthily biased sampling is difficult to detect even for the ideal detector. Consequently, for any realistic detector who has less information than the ideal one, it is even more difficult to detect stealthily biased sampling.

## 5 Experiments

In this section, we show through experiments on synthetic data and two real-world datasets (COMPAS and Adult) that stealthily biased sampling is indeed difficult to detect (the code is available at https://github.com/sato9hara/stealthily-biased-sampling). In the experiments, we adopted demographic parity (DP) (Calders, Kamiran, and Pechenizkiy 2009) as the fairness metric for auditing. Let s ∈ {0, 1} be a sensitive feature and y ∈ {0, 1} be a decision. The DP is then defined as DP = |P(y = 1 | s = 1) − P(y = 1 | s = 0)|. A large DP indicates that the decision is unfair because the decision-maker favors providing positive decisions to one group over the other.

### Summary of the Results

Before moving to each experiment, we summarize the main results here. In the experiments, we investigated the detectability of the decision-maker's fraud against an ideal detector who can access an independent observation D′ from the underlying distribution P. In all the experiments, we verified the following three points.

R1. Both the stealthily biased and case-control sampling could reduce the DP of the sampled set Z.

R2. The stealthily biased sampling was more resistant to the detector's fraud detection than the case-control sampling. Specifically, the stealthily biased sampling marked low scores on the fraud-detection criteria over a wide range of experimental settings.

R3. In all the experiments, the decision-makers successfully pretended to be fair.
They could select a subset Z with small DPs and small fraud-detection criteria.

**Implementation** We used Python 3 for data processing. In all the experiments, we used the squared Euclidean distance $d(x_i, x_j) = \|x_i - x_j\|^2$ as the metric in the Wasserstein distance. To solve the minimum-cost flow problem derived from (3.2), we used the network simplex method implemented in the LEMON Graph Library (https://lemon.cs.elte.hu/trac/lemon). With LEMON, the problem could be solved in a few seconds for datasets with size N up to a few thousand. For the Adult dataset, we used a bootstrap-type estimator to improve the computational scalability (see the full version (Fukuchi, Hara, and Maehara 2019) for details).

### 5.1 Synthetic Example

Example 5 (Loan check). Consider a decision-maker who decides to lend money (y = 1) or not (y = 0) based on the applicant's sensitive feature s ∈ {0, 1} (e.g., gender) and a d-dimensional feature vector x ∈ [0, 1]^d, where the first feature $x_1$ is income. Here, we model the criteria of the decision-maker as

$$ y = \mathbb{I}(x_1 + b s > 0.5), \tag{5.1} $$

where b ≥ 0 is a constant. Note that this decision-maker is unfair if b ≠ 0. To pretend to be fair, for a set of individuals' features, sensitive features, and decisions $D = \{(x_i, s_i, y_i)\}_{i=1}^{N}$, the decision-maker selects a subset Z ⊆ D as evidence that the decisions are fair. We solve this problem using both the stealthily biased and case-control sampling.

**Data** We set the underlying data distribution P as follows. We sampled the sensitive feature s with P(s = 1) = 0.5 and sampled the feature vector x uniformly at random over [0, 1]^d with d = 1 (results for higher-dimensional settings were almost the same as for d = 1; see the full version (Fukuchi, Hara, and Maehara 2019)). The decision y is made by following the criteria (5.1). We sampled a dataset D with N = 1,000 observations from the underlying distribution P. We set the parameter in the criteria (5.1) to b = 0.2. Thus, the DP of the decision-maker is 0.2.

**Attacker** To reduce the DP through sampling, the sampled set needs to satisfy P(y = 1 | s = 1) = P(y = 1 | s = 0) = α for a predetermined ratio of positive decisions α ∈ [0, 1]. The expected number of samples in each bin (s, y) ∈ {0, 1} × {0, 1} is then determined by $k(s, y) = 0.5\,K\,\alpha^{y}(1-\alpha)^{1-y}$ (recall that P(s = 1) = 0.5).

**Detector** As a detector, we adopted the Kolmogorov-Smirnov two-sample test. The detector has an independent observation $D' = \{(x'_j, s'_j)\}_{j=1}^{200}$ as a referential dataset sampled from the underlying distribution P. Here, we note that the detector has no access to the decision y for D′ because the decision criteria (5.1) are not disclosed. Given Z, the detector applies the Kolmogorov-Smirnov two-sample test to detect whether the distribution of Z differs from that of the referential set D′. Here, we consider the strongest detector: we assume that she knows that only the income $x_1$ is used in x for the decision. (In practice, the detector does not know that $x_1$ is the key feature and would need a two-sample test for multi-dimensional data; in our preliminary experiments, multi-dimensional tests had very low detection power, so we adopted this setting, which is advantageous for the detector.) We denote the distributions of income $x_1$ in Z and D′ by $P_Z(x_1)$ and $P_{D'}(x_1)$, respectively. The detector can then use the Kolmogorov-Smirnov two-sample test in three ways: (i) test $P_Z(x_1) = P_{D'}(x_1)$, (ii) test $P_Z(x_1 \mid s = 1) = P_{D'}(x_1 \mid s = 1)$, and (iii) test $P_Z(x_1 \mid s = 0) = P_{D'}(x_1 \mid s = 0)$. In the experiment, we set the significance level of the tests to 0.05.
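A minimal sketch of this detector, together with the DP computation used throughout Section 5, is given below. It assumes scipy and pandas; the DataFrame column names `x1` and `s` and the function names are hypothetical, not taken from the paper's code.

```python
import numpy as np
from scipy.stats import ks_2samp

def demographic_parity(y, s):
    """DP = |P(y=1 | s=1) - P(y=1 | s=0)| of a disclosed subset."""
    y, s = np.asarray(y), np.asarray(s)
    return abs(y[s == 1].mean() - y[s == 0].mean())

def detector_rejects(Z, D_ref, alpha=0.05):
    """Run the three two-sample KS tests of Section 5.1.

    Z, D_ref : pandas DataFrames with hypothetical columns 'x1' (income)
               and 's' (sensitive feature); D_ref carries no decisions.
    Returns a dict: test name -> True if the test rejects at level alpha.
    """
    tests = {
        "P(x1)":     (Z["x1"],               D_ref["x1"]),
        "P(x1|s=1)": (Z.loc[Z.s == 1, "x1"], D_ref.loc[D_ref.s == 1, "x1"]),
        "P(x1|s=0)": (Z.loc[Z.s == 0, "x1"], D_ref.loc[D_ref.s == 0, "x1"]),
    }
    return {name: ks_2samp(a, b).pvalue < alpha for name, (a, b) in tests.items()}
```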
**Result** We selected a subset Z ⊆ D with |Z| = 200 using both the stealthily biased and case-control sampling. We repeated the experiment 100 times and summarized the results in Figure 5.1 for several different ratios of positive decisions α.

Figure 5.1: Results for the decision-maker with b = 0.2. Panels: (a) demographic parity; (b) rejection rate for the test of P(x); (c) rejection rate for the test of P(x | s = 1); (d) rejection rate for the test of P(x | s = 0); each is plotted against the ratio of positive decisions α in sampling for Case-control and Stealth. The shaded region in (a) denotes the average DP ± std. The shaded regions in (b)-(d) denote 95% confidence intervals. The dotted line in (b)-(d) denotes the significance level 0.05.

As summarized earlier, the three key observations R1, R2, and R3 can be found in the figures.

R1. Figure 5.1a shows that both the stealthily biased and case-control sampling successfully reduced the DP to less than 0.1 by sampling the subset Z. We note that no significant differences in DP were observed between the two sampling methods.

R2. Figures 5.1b, 5.1c, and 5.1d show that the stealthily biased sampling was more resistant to the Kolmogorov-Smirnov test than the case-control sampling. Specifically, the stealthily biased sampling attained a small rejection rate over a wide range of α in the sampling process.

R3. By using the stealthily biased sampling, the decision-maker successfully pretended to be fair. With α set to 0.6 in the sampling, none of the tests could confidently reject that the disclosed dataset Z differs from the referential dataset D′. For α = 0.6, the rejection rates of all three tests stayed around 0.05, which is exactly the significance level.

These results indicate that the detector cannot detect the fraud made by the stealthily biased sampling: the DP of Z is small, and its distribution is sufficiently natural that the statistical tests cannot reject it. The case-control sampling showed higher rejection rates in the tests of P(x | s = 1) and P(x | s = 0) and was thus outperformed by the stealthily biased sampling.

Lastly, we note that a stochastic decision-maker can be far more evil than the deterministic decision-maker considered in this section. See the full version (Fukuchi, Hara, and Maehara 2019) for details.

### 5.2 Real-World Data: COMPAS

For the first real-world data experiment, we focus on the COMPAS dataset (Angwin et al. 2016) (https://github.com/propublica/compas-analysis). The COMPAS dataset contains records of defendants obtained from the Broward County Sheriff's Office in Florida. Each defendant is scored for his or her risk of recidivism using software called COMPAS. ProPublica (Angwin et al. 2016) revealed that the COMPAS risk score is discriminative: it tends to give white defendants low scores while giving black defendants high scores. Because Florida had strong open-records laws, the entire COMPAS dataset was made public, and the bias in the COMPAS risk score was revealed. Here, we consider a virtual scenario in which the decision-maker was aware of the bias in the risk score and wants to pretend to be fair by hiding the bias.
To attain this goal, the decision-maker discloses a subset of the COMPAS dataset as evidence that the COMPAS risk score is fair.

**Data** We used the same data preprocessing as the analysis of ProPublica (Angwin et al. 2016), which results in eight features x ∈ ℝ⁸ for each defendant, with race as the sensitive attribute s ∈ {0 (black), 1 (white)} and the decision y ∈ {0 (low-risk), 1 (middle/high-risk)}. The preprocessed data includes 5,278 records, from which we randomly held out 1,278 records as the referential dataset D′ for the detector. From the remaining 4,000 records D, we sampled 2,000 records as Z using both the stealthily biased and case-control sampling. To reduce the DP in the sampling, we required the sampled set to satisfy P(y = 1 | s = 1) = P(y = 1 | s = 0) = α for some α ∈ [0, 1].

**Detector** The detector tries to detect the bias in the disclosed dataset Z by comparing its distribution with the referential dataset D′. In the experiment, we adopted the Wasserstein distance (WD) as the detector's detection criterion. (In the COMPAS and Adult experiments, we did not adopt multi-dimensional two-sample tests because they were too weak.) If the WD between Z and D′ is sufficiently large, the detector can detect the bias in Z, and thus the fraud of the decision-maker is revealed.

**Result** We repeated the experiment 100 times by randomly changing the data splitting and summarized the results in Figure 5.2. (Here, we measured the WD on P(x); the WD on P(x | s = 1) and P(x | s = 0) can be found in the full version.) As the baseline without any biased sampling, we computed the DP and the WD for randomly sampled records from D, denoted as Baseline in the figures. The figures show the clear success of the stealthily biased sampling, as summarized in R1, R2, and R3. In Figure 5.2(a), with the stealthily biased sampling, the DPs of Z are reduced significantly (R1). In Figure 5.2(b), the WDs between Z and D′ were sufficiently small for α = 0.6 that they are completely indistinguishable from the baselines (R3). The case-control sampling had higher WDs and was thus easier for the detector to detect (R2).

Figure 5.2: Results for the COMPAS dataset: (a) demographic parity and (b) Wasserstein distance in P(x), plotted against the ratio of positive decisions α in sampling for Baseline, Case-control, and Stealth. The shaded regions in (a) denote the average DP ± std; the shaded regions in (b) denote the average WD ± std.

### 5.3 Real-World Data: Adult

As the second real-world data experiment, we used the Adult dataset (Dheeru and Karra Taniskidou 2017). The Adult dataset contains 48,842 records with several individual features and their labels (high-income or low-income). The dataset is known to include gender bias: while 30% of the males have high income, only 10% of the females do. The DP of the dataset is therefore 0.2. If we naively train a classifier on the dataset, the resulting classifier inherits the bias and becomes discriminative, i.e., the classifier favors classifying males as high-income. The goal of this experiment is to show that the decision-maker can make the biased classifier appear fair by disclosing a part of the dataset with the classifier's decisions.

**Data & Classifier** In the data preprocessing, we converted categorical features to numerical features (we used the preprocessing from https://www.kaggle.com/kost13/us-income-logistic-regression/notebook). We randomly split off 10,000 records for the training set, 20,000 records for the test set, and the remaining 18,842 records for the referential set D′ for the detector. In the experiment, we first trained a classifier on the training set. As classifiers, we used logistic regression and a random forest with 100 trees. We labeled all the records in the test set using the trained classifier and obtained the dataset D with the classifier's decisions. We then sampled the subset Z ⊆ D with |Z| = 2,000 using both the stealthily biased and case-control sampling. To reduce the DP in the sampling, we required the sampled set to satisfy P(y = 1 | s = 1) = P(y = 1 | s = 0) = α for a predetermined ratio of positive decisions α ∈ [0, 1].

**Detector** We adopted the same detector as in the COMPAS experiment, which uses the WD as the bias-detection metric.
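A minimal sketch of this WD criterion is given below. The paper does not say how the detector's WD is computed, so the use of the POT library (`ot.dist`, `ot.emd2`) and uniform sample weights are our assumptions; the squared Euclidean ground cost follows the implementation note in Section 5.

```python
import numpy as np
import ot  # POT: Python Optimal Transport (an assumption; the paper does not name a library)

def wd_criterion(Z_features, ref_features):
    """Detector's criterion for COMPAS/Adult: the Wasserstein distance between
    the disclosed subset Z and the referential dataset D', with squared
    Euclidean ground cost.
    """
    n, m = len(Z_features), len(ref_features)
    M = ot.dist(Z_features, ref_features, metric="sqeuclidean")  # pairwise costs
    a = np.full(n, 1.0 / n)   # uniform weights on Z
    b = np.full(m, 1.0 / m)   # uniform weights on D'
    return ot.emd2(a, b, M)   # exact optimal-transport cost used as the score
```

The detector compares this score against the Baseline values obtained from randomly sampled subsets; a score well above the baseline flags Z as biased.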
**Result** We repeated the experiment 100 times by randomly changing the data splitting and summarized the results for logistic regression in Figure 5.3. (Here, we measured the WD on P(x); the WD on P(x | s = 1) and P(x | s = 0), as well as the results for random forest, can be found in the full version (Fukuchi, Hara, and Maehara 2019).) As the baseline, we computed the DP and the WD for 2,000 randomly sampled records from D, denoted as Baseline in the figure. Similar to the COMPAS results, the figures again show the clear success of the stealthily biased sampling (R1, R2, and R3).

Figure 5.3: Results for the Adult dataset: (a) demographic parity and (b) Wasserstein distance in P(x), plotted against the ratio of positive decisions α in sampling for Baseline, Case-control, and Stealth. The shaded regions in (a) denote the average DP ± std; the shaded regions in (b) denote the average WD ± std.

## 6 Conclusion

We assessed the risk of malicious decision-makers who try to deceive auditing tools by investigating the detectability of the decision-maker's fraud. We specifically put our focus on an auditing tool that computes a fairness metric. To assess the (un)detectability of the fraud, we considered the biased sampling attack, where the decision-maker publishes a benchmark dataset as evidence of his or her fairness. In this study, we demonstrated the undetectability by explicitly constructing an algorithm, stealthily biased sampling, that can deliberately construct a fair benchmark dataset. To derive the algorithm, we formulated the sampling problem as a Wasserstein distance minimization, which we reduced to a minimum-cost flow problem for efficient computation. We then showed that the fraud made by stealthily biased sampling is indeed difficult to detect, both theoretically and empirically.

A recent study (Aïvodji et al. 2019) has shown that malicious decision-makers can rationalize their unfair decisions by generating seemingly fair explanations, which indicates that an explanation will not be effective for certifying fairness. Our results indicate that passing the auditing tools will likewise not be sufficient as evidence of fairness. Assessing the validity of other auditing tools and mechanisms against malicious decision-makers would be essential. Lastly, in this study, we revealed the difficulty of detecting the decision-maker's fraud.
While auditing tools are becoming popular, we will need additional social mechanisms that certify the reported results of these tools. We hope that our study opens up new research directions for practical social mechanisms that can certify fairness.

Acknowledgments. We would like to thank Sébastien Gambs and Ulrich Aïvodji for their helpful comments. Kazuto Fukuchi is supported by JSPS KAKENHI Grant Number JP19H04164. Satoshi Hara is supported by JSPS KAKENHI Grant Number JP18K18106.

## References

Adebayo, J. A. 2016. FairML: Toolbox for diagnosing bias in predictive modeling. Master's thesis, Massachusetts Institute of Technology.

Aïvodji, U.; Arai, H.; Fortineau, O.; Gambs, S.; Hara, S.; and Tapp, A. 2019. Fairwashing: the risk of rationalization. In Proceedings of the 36th International Conference on Machine Learning, 161-170.

Angwin, J.; Larson, J.; Mattu, S.; and Kirchner, L. 2016. Machine bias. ProPublica, May 23.

Bellamy, R. K.; Dey, K.; Hind, M.; Hoffman, S. C.; Houde, S.; Kannan, K.; Lohia, P.; Martino, J.; Mehta, S.; Mojsilovic, A.; et al. 2018. AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. arXiv preprint arXiv:1810.01943.

Bolukbasi, T.; Chang, K.-W.; Zou, J. Y.; Saligrama, V.; and Kalai, A. T. 2016. Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems, 4349-4357.

Calders, T.; Kamiran, F.; and Pechenizkiy, M. 2009. Building classifiers with independency constraints. In 2009 IEEE International Conference on Data Mining Workshops, 13-18.

Dheeru, D., and Karra Taniskidou, E. 2017. UCI machine learning repository.

Dwork, C., and Ilvento, C. 2018. Individual fairness under composition. In Proceedings of Fairness, Accountability, and Transparency in Machine Learning.

Dwork, C.; Hardt, M.; Pitassi, T.; Reingold, O.; and Zemel, R. 2012. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, 214-226. ACM.

Feldman, M.; Friedler, S. A.; Moeller, J.; Scheidegger, C.; and Venkatasubramanian, S. 2015. Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 259-268. ACM.

Fukuchi, K.; Hara, S.; and Maehara, T. 2019. Faking fairness via stealthily biased sampling. arXiv preprint arXiv:1901.08291.

Goldreich, O. 2009. Foundations of Cryptography: Volume 2, Basic Applications. Cambridge University Press.

Grötschel, M.; Lovász, L.; and Schrijver, A. 1981. The ellipsoid method and its consequences in combinatorial optimization. Combinatorica 1(2):169-197.

Hardt, M.; Price, E.; Srebro, N.; et al. 2016. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, 3315-3323.

Joseph, M.; Kearns, M.; Morgenstern, J. H.; and Roth, A. 2016. Fairness in learning: Classic and contextual bandits. In Advances in Neural Information Processing Systems, 325-333.

Kovács, P. 2015. Minimum-cost flow algorithms: an experimental evaluation. Optimization Methods and Software 30(1):94-127.

Lee, Y. T., and Sidford, A. 2013. Path Finding II: An Õ(m√n) algorithm for the minimum cost flow problem. arXiv preprint arXiv:1312.6713.

Mantel, N., and Haenszel, W. 1959. Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute 22(4):719-748.

Massey Jr., F. J. 1951. The Kolmogorov-Smirnov test for goodness of fit. Journal of the American Statistical Association 46(253):68-78.
Munoz, C.; Smith, M.; and Patil, D. J. 2016. Big data: A report on algorithmic systems, opportunity, and civil rights. Technical report, Executive Office of the President, The White House.

Pedreshi, D.; Ruggieri, S.; and Turini, F. 2008. Discrimination-aware data mining. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 560-568. ACM.

Peyré, G., and Cuturi, M. 2019. Computational optimal transport. Foundations and Trends in Machine Learning 11(5-6):355-607.

Podesta, J.; Pritzker, P.; Moniz, E. J.; Holdren, J.; and Zients, J. 2014. Big data: Seizing opportunities, preserving values. Technical report, Executive Office of the President, The White House.

Saleiro, P.; Kuester, B.; Stevens, A.; Anisfeld, A.; Hinkson, L.; London, J.; and Ghani, R. 2018. Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:1811.05577.

Vaserstein, L. N. 1969. Markov processes over denumerable products of spaces, describing large systems of automata. Problemy Peredachi Informatsii 5(3):64-72.