# Metric-Free Individual Fairness in Online Learning

Yahav Bechavod (Hebrew University, yahav.bechavod@cs.huji.ac.il), Christopher Jung (University of Pennsylvania, chrjung@seas.upenn.edu), Zhiwei Steven Wu (Carnegie Mellon University, zstevenwu@cmu.edu)

34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada.

Abstract

We study an online learning problem subject to the constraint of individual fairness, which requires that similar individuals are treated similarly. Unlike prior work on individual fairness, we do not assume that the similarity measure among individuals is known, nor do we assume that such a measure takes a certain parametric form. Instead, we leverage the existence of an auditor who detects fairness violations without enunciating the quantitative measure. In each round, the auditor examines the learner's decisions and attempts to identify a pair of individuals that are treated unfairly by the learner. We provide a general reduction framework that reduces online classification in our model to standard online classification, which allows us to leverage existing online learning algorithms to achieve sub-linear regret and a sub-linear number of fairness violations. Surprisingly, in the stochastic setting where the data are drawn independently from a distribution, we are also able to establish PAC-style fairness and accuracy generalization guarantees (Rothblum and Yona (2018)), despite only having access to a very restricted form of fairness feedback. Our fairness generalization bound qualitatively matches the uniform convergence bound of Rothblum and Yona (2018), while also providing a meaningful accuracy generalization guarantee. Our results resolve an open question by Gillen et al. (2018) by showing that online learning under an unknown individual fairness constraint is possible even without assuming a strong parametric form of the underlying similarity measure.

1 Introduction

As machine learning increasingly permeates many critical aspects of society, including education, healthcare, criminal justice, and lending, there is by now a vast literature that studies how to make machine learning algorithms fair (see, e.g., Chouldechova and Roth (2018); Podesta et al. (2014); Corbett-Davies and Goel (2018)). Most of the work in this literature tackles the problem by taking the statistical group fairness approach that first fixes a small collection of high-level groups defined by protected attributes (e.g., race or gender) and then asks for approximate parity of some statistic of the predictor, such as positive classification rate or false positive rate, across these groups (see, e.g., Hardt et al. (2016); Chouldechova (2017); Kleinberg et al. (2017); Agarwal et al. (2018)). While notions of group fairness are easy to operationalize, they are aggregate in nature and come without fairness guarantees for finer subgroups or individuals (Dwork et al., 2012; Hébert-Johnson et al., 2018; Kearns et al., 2018). In contrast, the individual fairness approach aims to address this limitation by asking for explicit fairness criteria at an individual level. In particular, the compelling notion of individual fairness proposed in the seminal work of Dwork et al. (2012) requires that similar people are treated similarly. The original formulation of individual fairness assumes that the algorithm designer has access to
a task-specific fairness metric that captures how similar two individuals are in the context of the specific classification task at hand. In practice, however, such a fairness metric is rarely specified, and the lack of metrics has been a major obstacle for the wide adoption of individual fairness. There has been recent work on learning the fairness metric based on different forms of human feedback. For example, Ilvento (2019) provides an algorithm for learning the metric by presenting human arbiters with queries concerning the distance between individuals, and Gillen et al. (2018) provide an online learning algorithm that can eventually learn a Mahalanobis metric based on identified fairness violations. While these results are encouraging, they are still bound by several limitations. In particular, it might be difficult for humans to enunciate a precise quantitative similarity measure between individuals. Moreover, their similarity measure across individuals may not be consistent with any metric (e.g., it may not satisfy the triangle inequality) and is unlikely to be given by a simple parametric function (e.g., the Mahalanobis metric function). To tackle these issues, this paper studies metric-free online learning algorithms for individual fairness that rely on a weaker form of interactive human feedback and minimal assumptions on the similarity measure across individuals. Similar to the prior work of Gillen et al. (2018), we do not assume a prespecified metric, but instead assume access to an auditor, who observes the learner's decisions over the group of individuals that show up in each round and attempts to identify a fairness violation: a pair of individuals in the group that should have been treated more similarly by the learner. Since the auditor only needs to identify such unfairly treated pairs, there is no need for them to enunciate a quantitative measure to specify the distance between the identified pairs. Moreover, we do not impose any parametric assumption on the underlying similarity measure, nor do we assume that it is actually a metric, since we do not require the similarity measure to satisfy the triangle inequality. Under this model, we provide a general reduction framework that can take any online classification algorithm (without fairness constraint) as a black box and obtain a learning algorithm that can simultaneously minimize cumulative classification loss and the number of fairness violations. Our results in particular remove many strong assumptions in Gillen et al. (2018), including their parametric assumptions on linear rewards and Mahalanobis distances, and thus answer several questions left open in their work.

1.1 Overview of Model and Results

We study an online classification problem: over rounds $t = 1, \ldots, T$, a learner observes a small set of $k$ individuals with their feature vectors $(x^t_\tau)_{\tau=1}^k$ in space $\mathcal{X}$. The learner tries to predict the label $y^t_\tau \in \{0, 1\}$ of each individual with a soft predictor $\pi^t$ that predicts $\pi^t(x^t_\tau) \in [0, 1]$ on each $x^t_\tau$ and incurs classification loss $|\pi^t(x^t_\tau) - y^t_\tau|$. Then an auditor will investigate whether the learner has violated the individual fairness constraint on any pair of individuals within this round, that is, whether there exists $(\tau_1, \tau_2) \in [k]^2$ such that $|\pi^t(x^t_{\tau_1}) - \pi^t(x^t_{\tau_2})| > d(x^t_{\tau_1}, x^t_{\tau_2}) + \alpha$, where $d\colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}_+$ is an unknown distance function and $\alpha$ denotes the auditor's tolerance.
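As a concrete illustration of this per-round check, here is a minimal Python sketch of the violation condition for a single pair (the names and signature are ours, not the paper's; `d` stands in for the unknown distance function):

```python
def violates(pi, x1, x2, d, alpha):
    """True iff the predictor treats x1 and x2 more differently than d(x1, x2) + alpha allows."""
    return abs(pi(x1) - pi(x2)) > d(x1, x2) + alpha
```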
If this violation has occurred on any number of pairs, the auditor will identify one such pair, and the learner incurs a fairness loss of 1; otherwise, the fairness loss is 0. Then the learner will update the predictive policy based on the observed labels and the received fairness feedback. Under this model, our results include:

A Reduction from Fair Online Classification to Standard Online Classification. Our reduction-based algorithm can take any no-regret online (batch) classification learner as a black box and achieve sub-linear cumulative fairness loss and sub-linear regret on misclassification loss compared to the most accurate policy that is fair on every round. In particular, our framework can leverage the generic exponential weights method (Freund and Schapire, 1997; Cesa-Bianchi et al., 1997; Arora et al., 2012) and also oracle-efficient methods, including variants of Follow-the-Perturbed-Leader (FTPL) (e.g., Syrgkanis et al. (2016); Suggala and Netrapalli (2019)), that further reduce online learning to standard supervised learning or optimization problems. We instantiate our framework using two online learning algorithms (exponential weights and CONTEXT-FTPL), both of which obtain $O(\sqrt{T})$ bounds on misclassification regret and cumulative fairness loss.

Fairness and Accuracy Generalization Guarantees. While our algorithmic results hold under adversarial arrivals of the individuals, in the stochastic arrivals setting we show that the uniform average policy over time is probably approximately correct and fair (PACF) (Rothblum and Yona, 2018), that is, the policy is approximately fair on almost all random pairs drawn from the distribution and nearly matches the accuracy guarantee of the best fair policy. In particular, we show that the average policy $\pi_{avg}$ with high probability satisfies $\Pr_{x,x'}[|\pi_{avg}(x) - \pi_{avg}(x')| > \alpha + 1/T^{1/4}] \le O(1/T^{1/4})$, which qualitatively achieves a similar PACF uniform convergence sample complexity as Rothblum and Yona (2018).¹ However, we establish our generalization guarantee through fundamentally different techniques. While their work assumes a fully specified metric and i.i.d. data, the learner in our setting can only access the similarity measure through an auditor's limited fairness violation feedback. The main challenge we need to overcome is that the fairness feedback is inherently adaptive, that is, the auditor only provides feedback for the sequence of deployed policies, which are updated adaptively over rounds. In comparison, a fully known metric allows the learner to evaluate the fairness guarantee of all policies simultaneously. As a result, we cannot rely on their uniform convergence result to bound the fairness generalization error; instead we leverage a probabilistic argument that relates the learner's regret to the distributional fairness guarantee.

2 Related Work

Solving open problems in Gillen et al. (2018). The most related work to ours is Gillen et al. (2018), which studies the linear contextual bandit problem subject to individual fairness with an unknown Mahalanobis metric. Similar to our work, they also assume an auditor who can identify fairness violations in each round, and they provide an online learning algorithm with sublinear regret and a bounded number of fairness violations. Our results resolve two main questions left open in their work. First, we assume a weaker auditor who only identifies a single fairness violation (as opposed to all of the fairness violations in their setting).
Second, we remove the strong parametric assumption on the Mahalanobis metric and work with a broad class of similarity functions that need not be metrics.

Starting with Joseph et al. (2016), there is a different line of work that studies online learning for individual fairness, but subject to a different notion called meritocratic fairness (Jabbari et al., 2017; Joseph et al., 2018; Kannan et al., 2017). These results present algorithms that are fair within each round but again rely on strong realizability assumptions: their fairness guarantee depends on the assumption that the outcome variable of each individual is given by a linear function. Gupta and Kamble (2019) also study online learning subject to individual fairness but with a known metric. They formulate a one-sided fairness constraint across time, called fairness in hindsight, and provide an algorithm with regret $O(T^{M/(M+1)})$ for some distribution-dependent constant $M$.

Our work is related to several others that aim to enforce individual fairness without a known metric. Ilvento (2019) studies the problem of metric learning by asking human arbiters distance queries. Unlike Ilvento (2019), our algorithm does not explicitly learn the underlying similarity measure and does not require asking auditors numeric queries. The PAC-style fairness generalization bound in our work falls under the framework of probably approximately metric-fairness due to Rothblum and Yona (2018). However, their work assumes a pre-specified fairness metric and i.i.d. data from the distribution, while we establish our generalization through a sequence of adaptive fairness violation feedback over time. Kim et al. (2018) study a group-fairness relaxation of individual fairness, which requires that similar subpopulations are treated similarly. They do not assume a pre-specified metric for their offline learning problem, but they do assume a metric oracle that returns numeric distance values on random pairs of individuals. Jung et al. (2019) study an offline learning problem with subjective individual fairness, in which the algorithm tries to elicit subjective fairness feedback from human judges by asking them questions of the form "should this pair of individuals be treated similarly or not?" Their fairness generalization takes a different form, which involves taking averages over both the individuals and the human judges. We aim to provide a fairness generalization guarantee that holds for almost all individuals from the population.

3 Model and Preliminaries

We define the instance space to be $\mathcal{X}$ and its label space to be $\mathcal{Y}$. Throughout this paper, we will restrict our attention to binary labels, that is, $\mathcal{Y} = \{0, 1\}$. We write $\mathcal{H}$ to denote the hypothesis class of functions from $\mathcal{X}$ to $\mathcal{Y}$ and assume that $\mathcal{H}$ contains a constant hypothesis, i.e., there exists $h \in \mathcal{H}$ such that $h(x) = 0$ for all $x \in \mathcal{X}$. Also, we allow for convex combinations of hypotheses for the purpose of randomizing the prediction and denote the simplex of hypotheses by $\Delta\mathcal{H}$; we call a randomized hypothesis a policy. Sometimes, we assume the existence of an underlying (but unknown) distribution $\mathcal{D}$ over $(\mathcal{X}, \mathcal{Y})$. For each prediction $\hat{y} \in \mathcal{Y}$ and its true label $y \in \mathcal{Y}$, there is an associated misclassification loss $\ell(\hat{y}, y) = \mathbb{1}(\hat{y} \neq y)$. For simplicity, we overload the notation and write
$$\ell(\pi(x), y) = (1 - \pi(x)) \cdot y + \pi(x) \cdot (1 - y) = \mathbb{E}_{h \sim \pi}[\ell(h(x), y)].$$

¹Rothblum and Yona (2018) show (Theorem 1.4 in their work) that if a policy $\pi$ is $\alpha$-fair on all pairs in an i.i.d. dataset of size $m$, then $\pi$ satisfies $\Pr_{x,x'}[|\pi(x) - \pi(x')| > \alpha + \epsilon] \le \epsilon$, as long as $m \ge \Omega(1/\epsilon^4)$.
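To make the overloaded notation concrete, the following small Python sketch (function names are our own) checks that the policy-level loss $(1 - \pi(x))\cdot y + \pi(x)\cdot(1 - y)$ is exactly the expected 0/1 loss of a hypothesis drawn from the mixture $\pi$:

```python
def zero_one_loss(y_hat: int, y: int) -> int:
    """Misclassification loss of a single deterministic prediction."""
    return int(y_hat != y)

def policy_loss(pi_x: float, y: int) -> float:
    """Overloaded loss of a policy, where pi_x is the probability the policy predicts 1 on x."""
    return (1 - pi_x) * y + pi_x * (1 - y)

# Sanity check: a policy predicting 1 with probability 0.3, against the true label y = 1.
pi_x, y = 0.3, 1
expected = pi_x * zero_one_loss(1, y) + (1 - pi_x) * zero_one_loss(0, y)  # E_{h ~ pi}[loss]
assert abs(policy_loss(pi_x, y) - expected) < 1e-12
```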
3.1 Individual Fairness and Auditor

We want our deployed policy $\pi$ to behave fairly in some manner, and we use the individual fairness definition from Dwork et al. (2012), which asserts that similar individuals should be treated similarly. We assume that there is some distance function $d\colon \mathcal{X} \times \mathcal{X} \to \mathbb{R}_+$ over the instance space $\mathcal{X}$ which captures the distance between individuals in $\mathcal{X}$, although $d$ does not have to satisfy the triangle inequality. The only requirement on $d$ is that it is always non-negative and symmetric: $d(x, x') = d(x', x)$.

Definition 3.1 ((α, β)-fairness). Assume $\alpha, \beta > 0$. A policy $\pi \in \Delta\mathcal{H}$ is said to be $\alpha$-fair on pair $(x, x')$ if $|\pi(x) - \pi(x')| \le d(x, x') + \alpha$. We say policy $\pi$'s $\alpha$-fairness violation on pair $(x, x')$ is
$$v_\alpha(\pi, (x, x')) = \max\big(0,\ |\pi(x) - \pi(x')| - d(x, x') - \alpha\big).$$
A policy $\pi$ is said to be $(\alpha, \beta)$-fair on distribution $\mathcal{D}$ if
$$\Pr_{(x, x') \sim \mathcal{D}|_{\mathcal{X}} \times \mathcal{D}|_{\mathcal{X}}}\big[\,|\pi(x) - \pi(x')| > d(x, x') + \alpha\,\big] \le \beta.$$
A policy $\pi$ is said to be $\alpha$-fair on a set $S \subseteq \mathcal{X}$ if it is $\alpha$-fair on every pair $(x, x') \in S^2$.

Although individual fairness is intuitively sound, the notion requires knowledge of the distance function $d$, which is often hard to specify. Therefore, we rely on an auditor $J$ that can detect instances of $\alpha$-unfairness.

Definition 3.2 (Auditor $J_\alpha$). An auditor $J_\alpha$, which can have its own internal state, takes in a reference set $S \subseteq \mathcal{X}$ and a policy $\pi$. It outputs $\rho$, which is either null or a pair of indices from the provided reference set, to denote that there is some positive $\alpha$-fairness violation on that pair. For some $S = (x_1, \ldots, x_n)$,
$$J_\alpha(S, \pi) = \begin{cases} \rho = (\rho_1, \rho_2) & \text{if } \rho_1, \rho_2 \in [n] \text{ and } \pi(x_{\rho_1}) - \pi(x_{\rho_2}) - d(x_{\rho_1}, x_{\rho_2}) - \alpha > 0 \\ \text{null} & \text{otherwise.} \end{cases}$$
If there exist multiple pairs with some $\alpha$-violation, the auditor can choose one arbitrarily.

Remark 3.3. Our assumptions on the auditor are much more relaxed than those of Gillen et al. (2018), which require that the auditor output whether the policy is 0-fair (i.e., with no slack) on all pairs in $S^2$ exactly. Furthermore, the auditor in Gillen et al. (2018) can only handle Mahalanobis distances. In our setting, because of the internal state of the auditor, the auditor does not have to be a fixed function but rather can change adaptively in each round. Finally, we never rely on the fact that the distance function $d$ stays the same throughout the rounds, meaning all our results extend to the case where the distance function governing the fairness constraints changes every round.

3.2 Online Batch Classification

We now describe our online batch classification setting. In each round $t = 1, \ldots, T$, the learner deploys some model $\pi^t \in \Delta\mathcal{H}$. Upon seeing the deployed policy $\pi^t$, the environment chooses a batch of $k$ individuals $(x^t_\tau, y^t_\tau)_{\tau=1}^k$ and, possibly, a pair of individuals from that round on which $\pi^t$ will be held responsible for any $\alpha$-fairness violation. For simplicity, we write $\bar{x}^t = (x^t_\tau)_{\tau=1}^k$ and $\bar{y}^t = (y^t_\tau)_{\tau=1}^k$. The strategy $z^t_{\text{FAIR-BATCH}} \in \mathcal{Z}_{\text{FAIR-BATCH}}$ that the environment chooses can be described by $z^t_{\text{FAIR-BATCH}} = \big((\bar{x}^t, \bar{y}^t), \rho^t\big)$, where $\rho^t \in [k]^2 \cup \{\text{null}\}$. Often, we will omit the subscript and simply write $z^t$. If $\rho^t = (\rho^t_1, \rho^t_2)$, then $\pi^t$ will be responsible for the $\alpha$-fairness violation on the pair $(x^t_{\rho^t_1}, x^t_{\rho^t_2})$. There are two types of losses that we are interested in: the misclassification loss and the fairness loss.

Definition 3.4 (Misclassification Loss). The (batch) misclassification loss Err² is $\text{Err}(\pi, z^t) = \sum_{\tau=1}^{k} \ell\big(\pi(x^t_\tau), y^t_\tau\big)$.

²We will overload the notation for this loss; regardless of what $\mathcal{Z}$ is, we will assume $\text{Err}(\pi, z^t)$ is well-defined as long as $z^t$ includes $(\bar{x}^t, \bar{y}^t)$.
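Concretely, the batch misclassification loss of Definition 3.4 and the auditor of Definition 3.2 can be sketched as follows. This is a toy illustration, not the paper's implementation: the auditor is constructed with an explicit distance function d, which stays hidden from the learner, and all names are ours.

```python
import itertools
from typing import Callable, Optional, Sequence, Tuple

def err(pi: Callable, xs: Sequence, ys: Sequence) -> float:
    """Batch misclassification loss Err(pi, z^t) = sum_tau l(pi(x_tau), y_tau)."""
    return sum((1 - pi(x)) * y + pi(x) * (1 - y) for x, y in zip(xs, ys))

class Auditor:
    """J_alpha: reports one pair of indices with a positive alpha-fairness violation, or None."""

    def __init__(self, d: Callable, alpha: float):
        self.d = d          # similarity measure, known only to the auditor
        self.alpha = alpha  # tolerance

    def audit(self, S: Sequence, pi: Callable) -> Optional[Tuple[int, int]]:
        # Scan pairs of the reference set; Definition 3.2 lets the auditor pick any violating pair.
        for i, j in itertools.combinations(range(len(S)), 2):
            if abs(pi(S[i]) - pi(S[j])) - self.d(S[i], S[j]) - self.alpha > 0:
                return (i, j)
        return None
```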
Algorithm 1: Online Fair Batch Classification (FAIR-BATCH)
for $t = 1, \ldots, T$ do
    Learner deploys $\pi^t$
    Environment chooses $(\bar{x}^t, \bar{y}^t)$
    Environment chooses the pair $\rho^t$
    $z^t = \big((\bar{x}^t, \bar{y}^t), \rho^t\big)$
    Learner incurs misclassification loss $\text{Err}(\pi^t, z^t)$
    Learner incurs fairness loss $\text{Unfair}_\alpha(\pi^t, z^t)$
end

Algorithm 2: Online Batch Classification (BATCH)
for $t = 1, \ldots, T$ do
    Learner deploys $\pi^t$
    Environment chooses $z^t = (\bar{x}^t, \bar{y}^t)$
    Learner incurs misclassification loss $\text{Err}(\pi^t, z^t)$
end

Figure 1: Comparison between online fair batch classification and online batch classification. Each is summarized by the interaction between the learner and the environment, $(\Delta\mathcal{H} \times \mathcal{Z}_{\text{FAIR-BATCH}})^T$ and $(\Delta\mathcal{H} \times \mathcal{Z}_{\text{BATCH}})^T$, where $\mathcal{Z}_{\text{FAIR-BATCH}} = \mathcal{X}^k \times \mathcal{Y}^k \times ([k]^2 \cup \{\text{null}\})$ and $\mathcal{Z}_{\text{BATCH}} = \mathcal{X}^k \times \mathcal{Y}^k$.

Definition 3.5 (Fairness Loss). The $\alpha$-fairness loss $\text{Unfair}_\alpha$ is
$$\text{Unfair}_\alpha(\pi, z^t) = \begin{cases} \mathbb{1}\big[\pi(x^t_{\rho^t_1}) - \pi(x^t_{\rho^t_2}) - d(x^t_{\rho^t_1}, x^t_{\rho^t_2}) - \alpha > 0\big] & \text{if } \rho^t = (\rho^t_1, \rho^t_2) \\ 0 & \text{otherwise.} \end{cases}$$

We want the total misclassification and fairness loss over $T$ rounds to be as small as that of any $\pi \in Q$ for some competitor set $Q$, which we describe now. As said above, each round's reference set, the set of pairs for which the deployed policy may be held responsible in terms of $\alpha$-fairness, will be defined in terms of the instances that arrive within that round, $\bar{x}^t$. The baseline $Q_\alpha$ that we compete against will be all policies that are $\alpha$-fair on $\bar{x}^t$ for all $t \in [T]$:
$$Q_\alpha = \{\pi \in \Delta\mathcal{H} : \pi \text{ is } \alpha\text{-fair on } \bar{x}^t \text{ for all } t \in [T]\}.$$
Note that because $\mathcal{H}$ contains a constant hypothesis, which must be 0-fair on all instances, $Q_\alpha$ cannot be empty. The difference in total loss between our algorithm and a fixed $\pi^*$ is called regret, which we formally define below.

Definition 3.6 (Algorithm $A$). An algorithm $A\colon (\Delta\mathcal{H} \times \mathcal{Z})^* \to \Delta\mathcal{H}$ takes in its past history $(\pi^\tau, z^\tau)_{\tau=1}^{t-1}$ and deploys a policy $\pi^t \in \Delta\mathcal{H}$ at every round $t \in [T]$.

Definition 3.7 (Regret). For some $Q \subseteq \Delta\mathcal{H}$, the regret of algorithm $A$ with respect to some loss $L\colon \Delta\mathcal{H} \times \mathcal{Z} \to \mathbb{R}$ is denoted by $\text{Regret}^L(A, Q, T)$ if, for any $(z^t)_{t=1}^T$,
$$\sum_{t=1}^T L(\pi^t, z^t) - \inf_{\pi^* \in Q} \sum_{t=1}^T L(\pi^*, z^t) \le \text{Regret}^L(A, Q, T),$$
where $\pi^t = A\big((\pi^j, z^j)_{j=1}^{t-1}\big)$. When it is not clear from the context, we will use a subscript to denote the setting, e.g., $\text{Regret}^L_{\text{FAIR-BATCH}}$.

We wish to develop an algorithm such that both the misclassification and the fairness loss regrets are sublinear, a property often called no-regret. Note that because $\pi^* \in Q_\alpha$ is $\alpha$-fair on $\bar{x}^t$ for all $t \in [T]$, we have $\text{Unfair}_\alpha(\pi^*, z^t) = 0$ for all $t \in [T]$. Hence, achieving $\text{Regret}^{\text{Unfair}_\alpha}_{\text{FAIR-BATCH}}(A, Q_\alpha, T) = o(T)$ is equivalent to ensuring that the total number of rounds with any $\alpha$-fairness violation is sublinear. Therefore, our goal is equivalent to developing an algorithm $A$ so that for any $(z^t)_{t=1}^T$,
$$\text{Regret}^{\text{Err}}_{\text{FAIR-BATCH}}(A, Q_\alpha, T) = o(T) \quad \text{and} \quad \sum_{t=1}^T \text{Unfair}_\alpha(\pi^t, z^t) = o(T).$$
To achieve the result above, we will reduce our setting to a setting with no fairness constraint, which we call the online batch classification problem. Similar to the online fair batch classification setting, in each round $t$ the learner deploys a policy $\pi^t$, but the environment chooses only a batch of instances $(x^t_\tau, y^t_\tau)_{\tau=1}^k$. In online batch classification, we denote the strategy space of the environment by $\mathcal{Z}_{\text{BATCH}} = \mathcal{X}^k \times \mathcal{Y}^k$. We compare the two settings in Figure 1.

4 Achieving No Regret Simultaneously

Here, we define a round-based Lagrangian loss and show that the regret with respect to our Lagrangian loss also serves as the misclassification and the fairness complaint regret. Then, we show that using an auditor that can detect any fairness violation beyond a certain threshold, we can still hope to achieve no regret against an adaptive adversary.
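Using the err and Auditor sketches above, one round of the FAIR-BATCH interaction (Algorithm 1) and the fairness loss of Definition 3.5 look as follows; the learner and environment objects are hypothetical placeholders with the interfaces suggested by Figure 1:

```python
def unfair(pi, xs, rho, d, alpha):
    """alpha-fairness loss Unfair_alpha(pi, z^t): 1 if the designated pair has a violation, else 0."""
    if rho is None:
        return 0
    i, j = rho
    return int(pi(xs[i]) - pi(xs[j]) - d(xs[i], xs[j]) - alpha > 0)

def run_fair_batch(learner, environment, auditor, T):
    """Sketch of the FAIR-BATCH protocol, with the auditor supplying the pair rho^t."""
    history = []
    for _ in range(T):
        pi = learner.deploy(history)           # learner deploys pi^t
        xs, ys = environment.draw_batch(pi)    # environment chooses the batch (x^t, y^t)
        rho = auditor.audit(xs, pi)            # flagged pair of indices, or None
        history.append((pi, (xs, ys, rho)))    # the learner incurs err(...) and unfair(...) here
    return history
```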
Finally, we show how to achieve no regret with respect to the Lagrangian loss by reducing the problem to online batch classification, where there is no fairness constraint. We show that a Follow-the-Perturbed-Leader style approach (CONTEXT-FTPL from Syrgkanis et al. (2016)) can achieve sublinear regret in the online batch classification setting, which allows us to achieve sublinear regret with respect to both the misclassification and the fairness loss in the online fair batch classification setting.

4.1 Lagrangian Formulation

Here we present a hybrid loss, which we call the Lagrangian loss, that combines the misclassification loss and the magnitude of the fairness violation of round $t$.

Definition 4.1 (Lagrangian Loss). The $(C, \alpha)$-Lagrangian loss of $\pi$ is
$$L_{C,\alpha}\big(\pi, (\bar{x}^t, \bar{y}^t), \rho^t\big) = \sum_{\tau=1}^{k} \ell\big(\pi(x^t_\tau), y^t_\tau\big) + \begin{cases} C \cdot \big(\pi(x^t_{\rho_1}) - \pi(x^t_{\rho_2}) - \alpha\big) & \rho^t = (\rho_1, \rho_2) \\ 0 & \rho^t = \text{null.} \end{cases}$$

Given an auditor $J_{\alpha'}$ that can detect any $\alpha'$-fairness violation, we can simulate the online fair batch classification setting by setting the pair $\rho^t_J = J_{\alpha'}(\bar{x}^t, \pi^t)$; the subscript $J$ is placed on this pair to distinguish it from the pair chosen by the environment.³

Definition 4.2 (Lagrangian Regret). Algorithm $A$'s $(C, \alpha, J_{\alpha'})$-Lagrangian regret against $Q$ is $\text{Regret}^{C,\alpha,J_{\alpha'}}(A, Q, T)$ if, for any $(\bar{x}^t, \bar{y}^t)_{t=1}^T$, we have
$$\sum_{t=1}^T L_{C,\alpha}\big(\pi^t, (\bar{x}^t, \bar{y}^t), \rho^t_J\big) - \min_{\pi^* \in Q} \sum_{t=1}^T L_{C,\alpha}\big(\pi^*, (\bar{x}^t, \bar{y}^t), \rho^t_J\big) \le \text{Regret}^{C,\alpha,J_{\alpha'}}(A, Q, T),$$
where $\rho^t_J = J_{\alpha'}(\bar{x}^t, \pi^t)$.

Remark 4.3. From here on, we assume the auditor has a given sensitivity denoted by $\alpha' = \alpha + \epsilon$, where $\epsilon$ is a parameter we will fix in order to define our desired benchmark $Q_\alpha$.

Now, we show that the Lagrangian regret upper bounds the $\alpha$-fairness loss regret with some slack, by setting $C$ to be appropriately large. Also, we show that the $(C, \alpha, J_{\alpha+\epsilon})$-Lagrangian regret serves as the misclassification loss regret, too. The proofs are given in Appendix A.1.

Theorem 4.4. Fix some small constant $\epsilon > 0$ and $C \ge \frac{k+1}{\epsilon}$. For any sequence of the environment's strategies $(z^t)_{t=1}^T \in \mathcal{Z}_{\text{FAIR-BATCH}}^T$,
$$\sum_{t=1}^T \text{Unfair}_{\alpha+\epsilon}(\pi^t, z^t) \le \text{Regret}^{C,\alpha,J_{\alpha+\epsilon}}(A, Q_\alpha, T).$$

Theorem 4.5. Fix some small constant $\epsilon > 0$. For any sequence $(z^t)_{t=1}^T \in \mathcal{Z}_{\text{FAIR-BATCH}}^T$ and $\pi^* \in Q_\alpha$,
$$\sum_{t=1}^T \sum_{\tau=1}^{k} \ell\big(\pi^t(x^t_\tau), y^t_\tau\big) - \sum_{t=1}^T \sum_{\tau=1}^{k} \ell\big(\pi^*(x^t_\tau), y^t_\tau\big) \le \text{Regret}^{C,\alpha,J_{\alpha+\epsilon}}(A, Q_\alpha, T),$$
where $C \ge \frac{k+1}{\epsilon}$. In other words, $\text{Regret}^{\text{Err}}_{\text{FAIR-BATCH}}(A, Q_\alpha, T) \le \text{Regret}^{C,\alpha,J_{\alpha+\epsilon}}(A, Q_\alpha, T)$.

³Although we are simulating the adaptive environment's strategy $\rho^t$ with $\rho^t_J$, note that the fairness loss with $\rho^t_J$ will always be at least the fairness loss with $\rho^t$, because the auditor will always indicate if there is a fairness violation. This distinction between the pair chosen by the environment and the pair chosen by the auditor is necessary just for technical reasons, as we need to ensure that the pair used to charge the Lagrangian loss incurs constant instantaneous regret in the rounds where there is actually some fairness violation; the pair chosen by the environment can possibly have no fairness violation and hence negative instantaneous regret. This will be made clearer in the proof of Theorem 4.4.
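A direct transcription of Definition 4.1 as a sketch (argument names are ours); the flagged pair is assumed to be ordered as returned by the auditor:

```python
def lagrangian_loss(pi, xs, ys, rho, C, alpha):
    """(C, alpha)-Lagrangian loss: batch misclassification loss plus a scaled penalty on the flagged pair."""
    loss = sum((1 - pi(x)) * y + pi(x) * (1 - y) for x, y in zip(xs, ys))
    if rho is not None:
        i, j = rho
        loss += C * (pi(xs[i]) - pi(xs[j]) - alpha)
    return loss
```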
Note that our Lagrangian loss
$$L^t_{C,\alpha}(\pi) = L_{C,\alpha}(\pi, z^t) = \sum_{\tau=1}^{k} \Big[(1 - \pi(x^t_\tau)) \cdot y^t_\tau + \pi(x^t_\tau) \cdot (1 - y^t_\tau)\Big] + \begin{cases} C \cdot \big(\pi(x^t_{\rho_1}) - \pi(x^t_{\rho_2}) - \alpha\big) & \rho^t = (\rho_1, \rho_2) \\ 0 & \rho^t = \text{null} \end{cases}$$
is linear in $\pi$ for any $z^t$, and its range is $[0, C + k]$. Therefore, running exponential weights with learning rate $\gamma = \sqrt{\ln(|\mathcal{H}|)/T}$, we achieve the following regret with respect to the Lagrangian loss:

Corollary 4.6. Running exponential weights with $\gamma = \sqrt{\ln(|\mathcal{H}|)/T}$ and $C \ge \frac{k+1}{\epsilon}$, we achieve
$$\text{Regret}^{\text{Err}}_{\text{FAIR-BATCH}}(A, Q_\alpha, T) \le O\big((C + k)\sqrt{T \ln(|\mathcal{H}|)}\big) \quad \text{and} \quad \sum_{t=1}^T \text{Unfair}_{\alpha+\epsilon}(\pi^t, z^t) \le O\big((C + k)\sqrt{T \ln(|\mathcal{H}|)}\big).$$

Nevertheless, running exponential weights is not efficient, as it needs to calculate the loss of every $h \in \mathcal{H}$ in each round $t$. To design an oracle-efficient algorithm, we reduce the online fair batch classification problem to the online batch classification problem in an efficient manner and use any online batch algorithm $A_{\text{BATCH}}\big((\pi^j, (\bar{x}'^j, \bar{y}'^j))_{j=1}^{t}\big)$ as a black box. At a high level, our reduction involves carefully transforming our online fair batch classification history up to round $t$, $(\pi^j, (\bar{x}^j, \bar{y}^j, \rho^j))_{j=1}^{t} \in (\Delta\mathcal{H} \times \mathcal{Z}_{\text{FAIR-BATCH}})^t$, into a fake online batch classification history $(\pi^j, (\bar{x}'^j, \bar{y}'^j))_{j=1}^{t} \in (\Delta\mathcal{H} \times \mathcal{Z}_{\text{BATCH}})^t$ and then feeding the artificially created history to $A_{\text{BATCH}}$. Without loss of generality, we assume that $C \ge \frac{k+1}{\epsilon}$ is an integer; if it is not, take the ceiling.

Now, we describe how the transformation of the history works. For each round $t$, whenever $\rho^t = (\rho^t_1, \rho^t_2)$, we add $C$ copies of each of $(x^t_{\rho^t_1}, 0)$ and $(x^t_{\rho^t_2}, 1)$ to the original batch to form $\bar{x}'^t$ and $\bar{y}'^t$. Just to keep the batch size the same across rounds, even if $\rho^t = \text{null}$, we add $C$ copies of each of $(v, 0)$ and $(v, 1)$, where $v$ is some arbitrary instance in $\mathcal{X}$. We describe this process in more detail in Algorithm 3. This reduction essentially preserves the regret.

Theorem 4.7. For any sequence $(z^t)_{t=1}^T \in \mathcal{Z}_{\text{FAIR-BATCH}}^T$, $Q \subseteq \Delta\mathcal{H}$, and $\pi^* \in Q$,
$$\sum_{t=1}^T L_{C,\alpha}(\pi^t, z^t) - \sum_{t=1}^T L_{C,\alpha}(\pi^*, z^t) \le \text{Regret}^{\text{Err}}_{\text{BATCH}}(A, Q, T),$$
where $\pi^t = A_{\text{BATCH}}\big((\pi^j, \bar{x}'^j, \bar{y}'^j)_{j=1}^{t-1}\big)$. Therefore, $\text{Regret}^{C,\alpha,J_{\alpha+\epsilon}}(A, Q_\alpha, T) \le \text{Regret}^{\text{Err}}_{\text{BATCH}}(A, Q, T)$.

One example of $A_{\text{BATCH}}$ that achieves sublinear regret in online batch classification is CONTEXT-FTPL from Syrgkanis et al. (2016). We defer the details to Appendix A.3 and present the regret guarantee here. We only focus on their small separator set setting (i.e., there exists a small set of points which serves as a witness to distinguish any two different hypotheses), although their transductive setting (i.e., the contexts $\{\bar{x}^t\}_{t=1}^T$ are known in advance) naturally follows as well.

Theorem 4.8. If the separator set $S$ for $\mathcal{H}$ is of size $s$, then in the online fair batch classification setting CONTEXT-FTPL achieves $\text{Regret}^{\text{Err}}_{\text{FAIR-BATCH}}(A, Q_\alpha, T) \le O(\sqrt{T})$ and $\sum_{t=1}^T \text{Unfair}_{\alpha+\epsilon}(\pi^t, z^t) \le O(\sqrt{T})$, where the $O(\cdot)$ hides polynomial dependence on $k$, $s$, and $\ln(|\mathcal{H}|)$.

Algorithm 3: Reduction from Online Fair Batch Classification to Online Batch Classification
Parameters: inflation constant $C$, original round size $k$
Initialize: $k' = k + 2C$
for $t = 1, \ldots, T$ do
    Learner deploys $\pi^t$
    Environment chooses $(\bar{x}^t, \bar{y}^t)$ and the pair $\rho^t$
    if $\rho^t = (\rho^t_1, \rho^t_2)$ then
        for $i = 1, \ldots, C$ do
            $x^t_{k+i} = x^t_{\rho^t_1}$ and $y^t_{k+i} = 0$
            $x^t_{k+C+i} = x^t_{\rho^t_2}$ and $y^t_{k+C+i} = 1$
        end
    else
        for $i = 1, \ldots, C$ do
            $x^t_{k+i} = v$ and $y^t_{k+i} = 0$
            $x^t_{k+C+i} = v$ and $y^t_{k+C+i} = 1$
        end
    end
    $\bar{x}'^t = (x^t_\tau)_{\tau=1}^{k'}$ and $\bar{y}'^t = (y^t_\tau)_{\tau=1}^{k'}$
    $\pi^{t+1} = A_{\text{BATCH}}\big((\pi^j, \bar{x}'^j, \bar{y}'^j)_{j=1}^{t}\big)$
end

5 Generalization

We observe that until this point, all of our results apply to the more general setting where individuals arrive in an adversarial fashion.
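The per-round transformation in Algorithm 3 can be sketched as follows (a hedched illustration; the padding instance v and the function name are ours). Adding C copies of (x_{ρ1}, 0) and (x_{ρ2}, 1) makes the black-box learner pay C·(1 + π(x_{ρ1}) − π(x_{ρ2})) on the padded examples, which matches the Lagrangian penalty up to an additive constant that does not depend on π; the dummy padding contributes a constant C regardless of π, so the regret is preserved.

```python
def transform_round(xs, ys, rho, C, v):
    """Turn one FAIR-BATCH round into a BATCH round of size k + 2C (sketch of Algorithm 3)."""
    xs_new, ys_new = list(xs), list(ys)
    if rho is not None:
        i, j = rho
        xs_new += [xs[i]] * C + [xs[j]] * C   # C copies each of the flagged pair
        ys_new += [0] * C + [1] * C           # labels chosen to penalize the gap pi(x_i) - pi(x_j)
    else:
        xs_new += [v] * (2 * C)               # dummy instance keeps the batch size fixed
        ys_new += [0] * C + [1] * C           # contributes a constant loss of C, independent of pi
    return xs_new, ys_new
```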
In order to argue about generalization, in this section we will assume the existence of an (unknown) data distribution from which the individual arrivals are drawn: $\{\{(x^t_\tau, y^t_\tau)\}_{\tau=1}^k\}_{t=1}^T \overset{\text{i.i.d.}}{\sim} \mathcal{D}^{Tk}$. Although the data are drawn i.i.d., there are two technical challenges in establishing a generalization guarantee: (1) the auditor's fairness feedback at each round is limited to a single fairness violation with regard to the policy deployed in that round, and (2) both the deployed policies and the auditor are adaptive over rounds. To overcome these challenges, we will draw a connection between the established regret guarantees in Section 4 and the learner's distributional accuracy and fairness guarantees. In particular, we will analyze the generalization bounds for the average policy over rounds.

Definition 5.1 (Average Policy). Let $\pi^t$ be the policy deployed by the algorithm at round $t$. The average policy $\pi_{avg}$ is defined by $\forall x\colon \pi_{avg}(x) = \frac{1}{T}\sum_{t=1}^T \pi^t(x)$.

In order to be consistent with Section 4, we denote $\alpha' = \alpha + \epsilon$ in this section. Here, we state the main results of this section:

Theorem 5.2 (Accuracy Generalization). With probability $1 - \delta$, the misclassification loss of $\pi_{avg}$ is upper bounded by
$$\mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(\pi_{avg}(x), y)\big] \le \inf_{\pi \in Q_\alpha} \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(\pi(x), y)\big] + \frac{1}{kT}\,\text{Regret}^{C,\alpha,J_{\alpha+\epsilon}}(A, Q_\alpha, T) + O\left(\sqrt{\frac{\ln(1/\delta)}{T}}\right).$$

Theorem 5.3 (Fairness Generalization). Assume that for all $t$, $\pi^t$ is $(\alpha', \beta_t)$-fair ($0 \le \beta_t \le 1$). With probability $1 - \delta$, for any integer $q \le T$, $\pi_{avg}$ is $\big(\alpha' + \frac{q}{T}, \beta'\big)$-fair, where
$$\beta' \le \frac{1}{q}\left(\text{Regret}^{C,\alpha,J_{\alpha+\epsilon}}(A, Q_\alpha, T) + O\Big(\sqrt{T \ln(1/\delta)}\Big)\right).$$

Corollary 5.4. Using CONTEXT-FTPL from Syrgkanis et al. (2016) with a separator set of size $s$, with probability $1 - \delta$, the average policy $\pi_{avg}$ has the following guarantees:
1. Accuracy: $\mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(\pi_{avg}(x), y)\big] \le \inf_{\pi \in Q_\alpha} \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\ell(\pi(x), y)\big] + O\left(\sqrt{\frac{\ln(|\mathcal{H}|) + \ln(1/\delta)}{T}}\right)$
2. Fairness: $\pi_{avg}$ is $(\alpha' + \lambda, \lambda)$-fair, where $\lambda = O\left(\left(\frac{\ln(|\mathcal{H}|) + \ln(1/\delta)}{T}\right)^{1/4}\right)$

Remark 5.5. Recall that the sensitivity of the auditor $\alpha'$ is fixed, and the learner chooses the parameter $\epsilon \in (0, \alpha')$, which in turn determines $\alpha = \alpha' - \epsilon$ and the set of policies $Q_\alpha$ the learner is competing against. In the case where $\alpha' = \Omega(1)$, the learner can choose $\epsilon$ on the order of $\Omega(1)$ and guarantee that $\pi_{avg}$ is $(\alpha' + \lambda, \lambda)$-fair with $\lambda = O(T^{-1/4})$. In this regime, Corollary 5.4 implies that the policy $\pi_{avg}$ has a non-trivial accuracy guarantee and a fairness generalization bound that qualitatively matches the uniform convergence bound in Theorem 1.4 of Rothblum and Yona (2018).

The accuracy generalization bound of Theorem 5.2 is attained by applying Azuma's inequality to the left-hand side of the inequality in Theorem 4.5 and then leveraging the fact that our classification loss function is linear with respect to the base classifiers over which it is defined. The full proof is given in Appendix B. As for the more challenging task of providing a fairness generalization guarantee (Theorem 5.3), we show how a careful interpolation between $\alpha$ and $\beta$ may be used to provide a meaningful bound. Here, we state the key lemma required for Theorem 5.3 and give a brief description of the proof technique.

Lemma 5.6. Assume that for all $t$, $\pi^t$ is $(\alpha', \beta_t)$-fair ($0 \le \beta_t \le 1$). For any integer $q \le T$, $\pi_{avg}$ is $\left(\alpha' + \frac{q}{T},\ \frac{1}{q}\sum_{t=1}^T \beta_t\right)$-fair.

High-Level Proof Idea for Lemma 5.6. Setting $\alpha'' = \alpha' + \frac{q}{T}$ has the following implication: for any pair of individuals $(x, x')$, in order for $\pi_{avg}$ to have an $\alpha''$-fairness violation on $(x, x')$, at least $q$ of the policies in $\{\pi^1, \ldots, \pi^T\}$ must have an $\alpha'$-fairness violation on $(x, x')$. We will then say a subset $A \subseteq \mathcal{X} \times \mathcal{X}$ is $\alpha'$-covered by a policy $\pi$ if $\pi$ has an $\alpha'$-violation on every element of $A$.
We will denote by $A^{\alpha'}_q \subseteq \mathcal{X} \times \mathcal{X}$ the subset of pairs of elements from $\mathcal{X}$ that are $\alpha'$-covered by at least $q$ policies in $\{\pi^1, \ldots, \pi^T\}$. Next, consider the probability space $\mathcal{D}|_{\mathcal{X}} \times \mathcal{D}|_{\mathcal{X}}$ over pairs of individuals. The lemma then follows from observing that for any $q \le T$, $\Pr(A^{\alpha'}_q) \le \frac{1}{q}\sum_{t=1}^T \beta_t$, as this allows us to upper bound the probability of an $\alpha''$-fairness violation of $\pi_{avg}$ by the stated bound.

In Appendix B, we provide the full proof of Theorem 5.3, which features the covering argument presented in Lemma 5.6, in addition to a concentration argument linking the probability of the algorithm deploying unfair policies throughout its run to the regret guarantees proven in Section 4. We also illustrate why an $(\alpha, \beta)$ interpolation is required in order to achieve a non-vacuous guarantee.

6 Conclusion

In this paper, we answered an open question by Gillen et al. (2018), proving that online learning under an unknown individual fairness constraint is possible even without assuming a strong parametric form of the underlying similarity measure. We were further able to prove what we consider a very surprising generalization result, matching the state-of-the-art bounds for individual fairness given by Rothblum and Yona (2018), while eliminating or significantly relaxing all of their rather stringent assumptions. In contrast to previous work, which provided individual fairness generalization bounds using standard uniform convergence arguments (Agarwal et al. (2018); Rothblum and Yona (2018)), we have presented a novel proof technique based on a composition covering argument (Lemma 5.6), which we believe is of independent interest.

Broader Impact

As the authors of this work believe that bridging the gap between theoretical research in algorithmic fairness and practical use is of the essence, one of the main focuses of this work has been removing the rather stringent assumptions made in previous research on individual fairness and replacing them with more realistic ones (if any). As such, the contributions offered in this paper allow taking a step closer to incorporating the long sought-after notion of individual fairness into real-life systems. The introduction of a fairness auditor gives a simple, elegant solution to the hurdle posed by the classic similarity metric assumption. The notion of individual fairness pursued in this work offers a strong guarantee at the individual's level (which is not given, for example, by the various more popular yet weaker notions of group fairness). We believe this combination of practicality of use and a strong fairness guarantee has the power to significantly impact our ability to ensure fairness and non-discrimination in machine learning based algorithms.

Acknowledgments and Disclosure of Funding

We thank Sampath Kannan, Akshay Krishnamurthy, Katrina Ligett, and Aaron Roth for helpful conversations at an early stage of this work. Part of this work was done while YB, CJ, and ZSW were visiting the Simons Institute for the Theory of Computing. YB is supported in part by Israel Science Foundation (ISF) grant #1044/16, the United States Air Force and DARPA under contracts FA8750-16-C-0022 and FA8750-19-2-0222, and the Federmann Cyber Security Center in conjunction with the Israel national cyber directorate. CJ is supported in part by NSF grant AF-1763307. ZSW is supported in part by the NSF FAI Award #1939606, an Amazon Research Award, a Google Faculty Research Award, a J.P. Morgan Faculty Award, a Facebook Research Award, and a Mozilla Research Grant.
Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the United States Air Force and DARPA.

References

Alekh Agarwal, Alina Beygelzimer, Miroslav Dudík, John Langford, and Hanna M. Wallach. A reductions approach to fair classification. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, pages 60-69, 2018. URL http://proceedings.mlr.press/v80/agarwal18a.html.

Sanjeev Arora, Elad Hazan, and Satyen Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8(1):121-164, 2012.

Nicolò Cesa-Bianchi, Yoav Freund, David Haussler, David P. Helmbold, Robert E. Schapire, and Manfred K. Warmuth. How to use expert advice. J. ACM, 44(3):427-485, May 1997. ISSN 0004-5411. doi: 10.1145/258128.258179. URL https://doi.org/10.1145/258128.258179.

Alexandra Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, Special Issue on Social and Technical Trade-Offs, 2017.

Alexandra Chouldechova and Aaron Roth. The frontiers of fairness in machine learning. CoRR, abs/1810.08810, 2018. URL http://arxiv.org/abs/1810.08810.

Sam Corbett-Davies and Sharad Goel. The measure and mismeasure of fairness: A critical review of fair machine learning. arXiv, 2018.

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard S. Zemel. Fairness through awareness. In Innovations in Theoretical Computer Science 2012, Cambridge, MA, USA, January 8-10, 2012, pages 214-226, 2012. doi: 10.1145/2090236.2090255. URL https://doi.org/10.1145/2090236.2090255.

Yoav Freund and Robert E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci., 55(1):119-139, 1997. doi: 10.1006/jcss.1997.1504. URL https://doi.org/10.1006/jcss.1997.1504.

Stephen Gillen, Christopher Jung, Michael J. Kearns, and Aaron Roth. Online learning with an unknown fairness metric. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada, pages 2605-2614, 2018. URL http://papers.nips.cc/paper/7526-online-learning-with-an-unknown-fairness-metric.

Swati Gupta and Vijay Kamble. Individual fairness in hindsight. In Proceedings of the 2019 ACM Conference on Economics and Computation, EC 2019, Phoenix, AZ, USA, June 24-28, 2019, pages 805-806, 2019. doi: 10.1145/3328526.3329605. URL https://doi.org/10.1145/3328526.3329605.

Moritz Hardt, Eric Price, and Nathan Srebro. Equality of opportunity in supervised learning. In Neural Information Processing Systems (NIPS), 2016.

Úrsula Hébert-Johnson, Michael P. Kim, Omer Reingold, and Guy N. Rothblum. Multicalibration: Calibration for the (computationally-identifiable) masses. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, pages 1944-1953, 2018. URL http://proceedings.mlr.press/v80/hebert-johnson18a.html.

Christina Ilvento. Metric learning for individual fairness. CoRR, abs/1906.00250, 2019. URL http://arxiv.org/abs/1906.00250.

Shahin Jabbari, Matthew Joseph, Michael J. Kearns, Jamie Morgenstern, and Aaron Roth. Fairness in reinforcement learning. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, pages 1617-1626, 2017. URL http://proceedings.mlr.press/v70/jabbari17a.html.

Matthew Joseph, Michael J. Kearns, Jamie H. Morgenstern, and Aaron Roth. Fairness in learning: Classic and contextual bandits. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, December 5-10, 2016, Barcelona, Spain, pages 325-333, 2016. URL http://papers.nips.cc/paper/6355-fairness-in-learning-classic-and-contextual-bandits.

Matthew Joseph, Michael J. Kearns, Jamie Morgenstern, Seth Neel, and Aaron Roth. Meritocratic fairness for infinite and contextual bandits. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, AIES 2018, New Orleans, LA, USA, February 02-03, 2018, pages 158-163, 2018. doi: 10.1145/3278721.3278764. URL https://doi.org/10.1145/3278721.3278764.

Christopher Jung, Michael J. Kearns, Seth Neel, Aaron Roth, Logan Stapleton, and Zhiwei Steven Wu. Eliciting and enforcing subjective individual fairness. CoRR, abs/1905.10660, 2019. URL http://arxiv.org/abs/1905.10660.

Sampath Kannan, Michael J. Kearns, Jamie Morgenstern, Mallesh M. Pai, Aaron Roth, Rakesh V. Vohra, and Zhiwei Steven Wu. Fairness incentives for myopic agents. In Proceedings of the 2017 ACM Conference on Economics and Computation, EC '17, Cambridge, MA, USA, June 26-30, 2017, pages 369-386, 2017. doi: 10.1145/3033274.3085154. URL https://doi.org/10.1145/3033274.3085154.

Michael J. Kearns, Seth Neel, Aaron Roth, and Zhiwei Steven Wu. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, pages 2569-2577, 2018. URL http://proceedings.mlr.press/v80/kearns18a.html.

Michael P. Kim, Omer Reingold, and Guy N. Rothblum. Fairness through computationally-bounded awareness. In Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada, pages 4847-4857, 2018. URL http://papers.nips.cc/paper/7733-fairness-through-computationally-bounded-awareness.

Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. Inherent trade-offs in the fair determination of risk scores. In Proceedings of the 8th Innovations in Theoretical Computer Science Conference, 2017.

John Podesta, Penny Pritzker, Ernest J. Moniz, John Holdren, and Jeffrey Zients. Big data: Seizing opportunities and preserving values. 2014.

Guy N. Rothblum and Gal Yona. Probably approximately metric-fair learning. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, pages 5666-5674, 2018. URL http://proceedings.mlr.press/v80/yona18a.html.

Arun Sai Suggala and Praneeth Netrapalli. Online non-convex learning: Following the perturbed leader is optimal. CoRR, abs/1903.08110, 2019. URL http://arxiv.org/abs/1903.08110.

Vasilis Syrgkanis, Akshay Krishnamurthy, and Robert E. Schapire. Efficient algorithms for adversarial contextual learning. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, pages 2159-2168, 2016. URL http://proceedings.mlr.press/v48/syrgkanis16.html.