Fairness Transferability Subject to Bounded Distribution Shift

Yatong Chen, Reilly Raab, Jialu Wang, Yang Liu
University of California, Santa Cruz
{ychen592, reilly, faldict, yangliu}@ucsc.edu
(These authors contributed equally to this work. Corresponding author: yangliu@ucsc.edu.)
36th Conference on Neural Information Processing Systems (NeurIPS 2022).

Given an algorithmic predictor that is fair on some source distribution, will it still be fair on an unknown target distribution that differs from the source within some bound? In this paper, we study the transferability of statistical group fairness for machine learning predictors (i.e., classifiers or regressors) subject to bounded distribution shift. Such shifts may be introduced by initial training data uncertainties, user adaptation to a deployed predictor, dynamic environments, or the use of pre-trained models in new settings. Herein, we develop a bound that characterizes such transferability, flagging potentially inappropriate deployments of machine learning for socially consequential tasks. We first develop a framework for bounding violations of statistical fairness subject to distribution shift, formulating a generic upper bound for transferred fairness violations as our primary result. We then develop bounds for specific worked examples, focusing on two commonly used fairness definitions (i.e., demographic parity and equalized odds) and two classes of distribution shift (i.e., covariate shift and label shift). Finally, we compare our theoretical bounds to deterministic models of distribution shift and against real-world data, finding that we are able to estimate fairness violation bounds in practice, even when simplifying assumptions are only approximately satisfied.

1 Introduction

Distribution shift is a common, real-world phenomenon that affects machine learning deployments when the target distribution of examples (features and labels) ultimately encountered by a data-driven policy diverges from the source distribution it was trained for. For socially consequential decisions guided by machine learning, such shifts in the underlying distribution can invalidate fairness guarantees and cause harm by exacerbating social disparities. Unfortunately, distribution shift can be technically difficult or impossible to model at training time (e.g., when depending on complex social dynamics or unrealized world events). Nonetheless, we still wish to certify the robustness of fairness metrics for a policy on possible target distributions.

In this paper, we provide a general framework for quantifying the robustness of statistical group fairness guarantees. We assume that the target distribution is adversarially drawn from a bounded domain, thus reducing the hard problem of modelling distribution shift dynamics to a more tractable, static problem. With this framework, we can detect potentially inappropriate policy applications, prior to deployment, when fairness violation bounds are not sufficiently small.

This work bridges a gap between recent literature on domain adaptation, which has largely focused on the transferability of prediction accuracy (rather than fairness), and algorithmic fairness, which has typically considered static distributions or prescribed models of distribution shift.
Our work is the first to systematically bound quantifiable violations of statistical group fairness while remaining agnostic to (1) the mechanisms responsible for distribution shift, (2) how group-specific distribution shifts are quantified, and (3) the specific statistical definition of group fairness applied.

[Figure 1: scatter plot of prediction accuracy (%) versus violation of demographic parity for the states CA, MI, PA, TX, and WA.]
Figure 1: In Section 7, we evaluate our bounds against historical, temporal distribution shifts in demographics and income recorded by the US Census Bureau [13]. The above figure depicts changes to income-prediction accuracy and demographic parity violation when a classifier initially trained on US state-specific demographic data for 2014 is reused on 2018 data, thus exemplifying the negative potential effects of distribution shift.

Our primary result is a bound on a policy's potential violation of statistical group fairness, defined in terms of the differences in policy outcomes between groups, when applied to a target distribution shifted relative to the source distribution within known constraints. Such settings naturally arise whenever training data represents a random sample of a target population with different statistics or a sample from dynamic environments, when a policy is reused on a new distribution without retraining, or whenever policy deployment itself induces a distribution shift. As an example of this last case, strategic individuals seeking loans might change their features or abstain from future application (thus shifting the distribution of examples) in response to policies trained on historical data [18, 38, 43]. Beyond policy selection, exogenous pressures such as economic trends and noise may also drive distribution shift in this example. In Figure 1, we show how a real-world distribution shift in demographic and income data for US states between 2014 and 2018 may increase fairness violations while decreasing accuracy for a hypothetical classifier trained on the 2014 distribution. In such settings, it is useful to quantify how fairness guarantees transfer across distributions shifted within some bound, thus allowing the deployment of unfair machine learning policies to be avoided.

1.1 Related Work

Our work considers a setting similar to recent studies of domain adaptation, which have largely focused on characterizing the effects of distribution shift on prediction performance rather than fairness. Our work also builds on efforts in algorithmic fairness, especially dynamical treatments of distribution shift in response to deployed machine learning policies [23, 11, 30]. We reference specific prior work in these domains in Appendix B, and here discuss existing work that focuses on how certain measures of fairness are affected when policies are subject to specific distribution shifts.

Fairness subject to Distribution Shift: A number of recent studies have considered specific examples of fairness transferability subject to distribution shift [34, 9, 36, 31, 21]. In particular, Schumann et al. [34] examine equality of opportunity and equalized odds as definitions of group fairness subject to distribution shifts quantified by an H-divergence function; Coston et al. [9] consider demographic parity subject to a covariate shift assumption while group identification remains unavailable to the classifier; Singh et al.
[36] focus on common group fairness definitions for binary classifiers subject to a class of distribution shift that generalizes covariate shift and label shift by preserving some conditional probability between variables; and Rezaei et al. [31] similarly consider common binary classification fairness definitions, such as equalized odds, subject to covariate shift. While we address similar settings to these works as special cases of our bound, we propose a unifying formulation for a broader class of statistical group fairness definitions and distribution shifts. In doing so, we recognize that particular settings recommend themselves to more natural measures of distribution shift, providing examples in Section 4.1, Section 4.2, and Section 5.

Another thread in the existing literature is the development of robust models with the goal of guaranteeing fairness on a modelled target distribution (e.g., [1, 32, 26, 3, 21]), for example by assuming covariate shift and the availability of some unlabelled target data [9, 36, 31]. In particular, Singh et al. [36] focus on learning stable models that preserve prediction accuracy and fairness, utilizing a causal graph to describe anticipated distribution shifts. Rezaei et al. [31] take a robust optimization approach, and Coston et al. [9] develop prevalence-constrained and target-fair covariate shift methods to obtain robust models. In contrast, our goal is to quantify fairness violations after an adversarial distribution shift for any given policy, including those not trained with robustness in mind.

1.2 Our Contributions

Our primary contribution is formulating a general, worst-case upper bound for a given policy's violation of statistical group fairness subject to group-dependent distribution shifts within presupposed bounds (i.e., Equation (9)). Bounding violations of fairness subject to distribution shift allows us to recognize and avoid potentially inappropriate deployments of machine learning when the potential disparities of a prospective policy eclipse a given threshold within bounded distribution shifts of the training distribution. We first characterize the space of statistical group fairness definitions and possible distribution shifts by appeal to premetric functions (Definition 2.1). After formulating the worst-case upper bound, we explore common sets of simplifying assumptions for this bound as special cases, yielding tractable calculations for several familiar combinations of fairness definitions and subcases of distribution shift (Theorem 4.1, Theorem 5.2) with readily interpretable results. Finally, we compare our theoretical bounds to prescribed models of distribution shift in Section 6 and to real-world data in Section 7. The details for reproducing our experimental results can be found at https://github.com/UCSC-REAL/Fairness_Transferability.

2 Formulation

The appendices include a table of notation (Appendix A) and all proofs (Appendix F).

2.1 Algorithmic Prediction

We consider two distributions, $\mathcal{S}$ (source) and $\mathcal{T}$ (target), each defined as a probability distribution for examples, where each example defines values for three random variables: $X$, a feature (e.g., $x$) with arbitrary domain $\mathcal{X}$; $Y$, a label (e.g., $y$) with arbitrary domain $\mathcal{Y}$; and $G$, a group (e.g., $g$ or $h$) with finite, countable domain $\mathcal{G}$. The predictor's policy $\pi$, intended for $\mathcal{S}$ but used on $\mathcal{T}$, defines a fourth variable for each example: viz., $\hat{Y}$, a predicted label (e.g., $\hat{y}$) with domain $\hat{\mathcal{Y}} = \mathcal{Y}$.
Using $\mathcal{P}(\cdot)$ to denote the space of probability distributions over some domain, we denote the space of distributions over examples as $\mathbb{D} := \mathcal{P}(\mathcal{X} \times \mathcal{Y} \times \mathcal{G})$, such that $\mathcal{S}, \mathcal{T} \in \mathbb{D}$. It will also be useful for us to notate the space of distributions over example outcomes associated with a given policy as $\mathbb{O} := \mathcal{P}(\mathcal{X} \times \mathcal{Y} \times \hat{\mathcal{Y}})$ and the space of distributions over group-specific examples as $\mathbb{G} := \mathcal{P}(\mathcal{X} \times \mathcal{Y})$.

Without loss of generality, we allow the prediction policy $\pi$ to be stochastic, such that, for any combination $(x, g)$, the predictor effectively samples $\hat{Y}$ from a corresponding probability distribution $\pi(x, g)$. Stochastic classifiers arise in various constrained optimization problems and have proven useful for making problems with custom losses or fairness constraints tractable [10, 17, 29, 40]. We denote the space of nondeterministic policies as $\Pi := (\mathcal{X} \times \mathcal{G} \to \mathcal{P}(\hat{\mathcal{Y}}))$ (e.g., $\pi \in \Pi$) and utilize the natural transformations that relate the spaces of distributions $\mathbb{D}$, policies $\Pi$, and outcomes $\mathbb{O}$:

$$\Pr_{\pi,\mathcal{T}}(\hat{Y}=\hat{y}, X=x, G=g) = \Pr_{\hat{Y}\sim\pi(x,g)}(\hat{Y}=\hat{y}) \, \Pr_{X,G\sim\mathcal{T}}(X=x, G=g) \tag{1}$$

We abuse the $\Pr$ notation for both probability density and probability mass functions as appropriate.

2.2 Statistical Group-Fairness

We next define a broad class of disparity functions $\Delta: \Pi \times \mathbb{D} \to \mathbb{R}$ representing how unfair a given policy is for a given distribution (e.g., writing $\Delta(\pi, \mathcal{T})$), noting that this notion of fairness is limited to capturing statistical discrepancies of outcomes between groups.

Definition 2.1. We define a premetric³ $\Psi$ on the space of distributions $p$ with respect to $q$ by the properties $\Psi(p\|q) \ge 0$ and $\Psi(p\|p) = 0$ for all $p, q$, and refer to the value of $\Psi$ as a "shift".

Definition 2.2. We define a statistical group disparity for policy $\pi$ and distribution $\mathcal{T}$ in terms of the symmetrized shifts between group-specific outcome distributions. We measure shifts between outcome distributions with a given premetric $\Psi: \mathbb{O}^2 \to \mathbb{R}$.

$$\Delta(\pi, \mathcal{T}) := \sum_{g,h \in \mathcal{G}} \Psi\!\left(\Pr_{\pi,\mathcal{T}}(X, Y, \hat{Y} \mid G=g) \,\Big\|\, \Pr_{\pi,\mathcal{T}}(X, Y, \hat{Y} \mid G=h)\right) \tag{2}$$

³ Despite use on Wikipedia, this is not a standard term in the literature. In general, the axioms of a premetric as defined in Definition 2.1 are a subset (thus "pre") of those that define a metric.

In Definition 2.2, $\Psi$ quantifies the specific statistical differences in outcomes between groups that are "unfair", where a value of 0 implies perfect fairness. In this work, we assume that $\Psi$ is the same for all $g, h$ and that $\Delta$ is insensitive to relative group size $\Pr(G)$.

Examples. Familiar applications of Definition 2.2 include demographic parity (DP) and equalized odds (EO). A policy satisfying DP, in expectation, assigns a given binary classification $y \in \{0, 1\}$ to the same fraction of examples in each group. We may measure the violation of DP as

$$\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) := \sum_{g,h \in \mathcal{G}} \Big| \Pr_{\pi,\mathcal{T}}(\hat{Y}=1 \mid G=g) - \Pr_{\pi,\mathcal{T}}(\hat{Y}=1 \mid G=h) \Big| \tag{3}$$

The associated premetric $\Psi_{\mathrm{DP}}$ for $p, q \in \mathbb{O}$ is $\Psi_{\mathrm{DP}}(p\|q) = |\Pr_p(\hat{Y}=1) - \Pr_q(\hat{Y}=1)|$. To satisfy EO, for binary $\mathcal{Y} = \{0, 1\}$, $\pi$ must maintain group-invariant true positive and false positive classification rates. We may measure the violation of EO as

$$\Delta_{\mathrm{EO}}(\pi, \mathcal{T}) := \sum_{g,h \in \mathcal{G}} \sum_{y \in \mathcal{Y}} \Big| \Pr_{\pi,\mathcal{T}}(\hat{Y}=1 \mid G=g, Y=y) - \Pr_{\pi,\mathcal{T}}(\hat{Y}=1 \mid G=h, Y=y) \Big| \tag{4}$$

The associated premetric is $\Psi_{\mathrm{EO}}(p\|q) = \sum_{y \in \mathcal{Y}} |\Pr_p(\hat{Y}=1 \mid Y=y) - \Pr_q(\hat{Y}=1 \mid Y=y)|$. Note that the restriction of EO to the $(Y=1)$ case is known as Equal Opportunity (EOp).

We remark that Definition 2.2 provides a unifying representation for a wide array of statistical group unfairness definitions and may be used with inequality constraints.
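To make Definition 2.2 and the two worked premetrics concrete, here is a minimal sketch (ours, not the paper's code) that estimates $\Delta_{\mathrm{DP}}$ (Equation (3)) and $\Delta_{\mathrm{EO}}$ (Equation (4)) from finite samples of $(\hat{y}, y, g)$; the synthetic data and function names are illustrative assumptions.

```python
import numpy as np

def dp_violation(y_hat, group):
    """Empirical Delta_DP: sum over ordered group pairs of |Pr(Y_hat=1|G=g) - Pr(Y_hat=1|G=h)|."""
    groups = np.unique(group)
    rates = {g: y_hat[group == g].mean() for g in groups}
    return sum(abs(rates[g] - rates[h]) for g in groups for h in groups)

def eo_violation(y_hat, y, group):
    """Empirical Delta_EO: adds |Pr(Y_hat=1|G=g,Y=y) - Pr(Y_hat=1|G=h,Y=y)| over y in {0, 1}."""
    groups = np.unique(group)
    total = 0.0
    for label in (0, 1):
        rates = {g: y_hat[(group == g) & (y == label)].mean() for g in groups}
        total += sum(abs(rates[g] - rates[h]) for g in groups for h in groups)
    return total

# Toy example: two groups with slightly different qualification rates and a group-blind predictor.
rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=10_000)      # G in {0, 1}
y = rng.binomial(1, 0.4 + 0.1 * group)       # qualification rates 0.4 vs. 0.5
y_hat = rng.binomial(1, 0.3 + 0.2 * y)       # a noisy predictor that ignores G

print("DP violation:", dp_violation(y_hat, group))
print("EO violation:", eo_violation(y_hat, y, group))
```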
In particular, we may recover many working definitions of fairness that effectively specify a maximum value of disparity:

Definition 2.3. A policy $\pi$ is $\epsilon$-fair with respect to $\Delta$ on distribution $\mathcal{T}$ iff $\Delta(\pi, \mathcal{T}) \le \epsilon$.

2.3 Vector-Bounded Distribution Shift

Suppose, after developing policy $\pi$ for distribution $\mathcal{S}$, we realize some new distribution $\mathcal{T}$ on which the policy is actually operating. This realization may be the consequence of sampling errors during the learning process, strategic feedback to our policy, random processes, or the reuse of our policy on a new distribution for which retraining is impractical. Our goal is to bound $\Delta(\pi, \mathcal{T})$ given knowledge of $\Delta(\pi, \mathcal{S})$ and some notion of how much $\mathcal{T}$ possibly differs from $\mathcal{S}$.

Definition 2.4. $K(p\|q)$ is a divergence if and only if, for all $p$ and $q$, $K(p\|q) \ge 0$ and $K(p\|q) = 0 \iff q = p$.

Definition 2.5. Define the group-vectorized shift $D$, as $\mathcal{S}$ mutates into $\mathcal{T}$, as

$$D(\mathcal{T}\|\mathcal{S}) := \sum_{g} \mathbf{e}_g\, D_g\!\left(\Pr_{\mathcal{T}}(X, Y \mid G=g) \,\Big\|\, \Pr_{\mathcal{S}}(X, Y \mid G=g)\right) \tag{5}$$

where $\mathbf{e}_g$ represents a unit vector indexed by $g$, and each $D_g: \mathbb{G}^2 \to \mathbb{R}$ is a divergence (Definition 2.4). Note that each $D_g$ also defines a premetric (but not necessarily a divergence) on $\mathbb{D}$.

Assumption 2.6. Let there exist some vector $B \succeq 0$ bounding $D(\mathcal{T}\|\mathcal{S}) \preceq B$, where $\preceq$ and $\succeq$ denote element-wise inequalities.

In Assumption 2.6, $B$ limits the possible distribution shift as $\mathcal{S}$ mutates into $\mathcal{T}$, without requiring us to specify a model for how distributions evolve. When modelling distribution shift requires complex dynamics (e.g., when agents learn and respond to classifier policy), we reduce a potentially difficult dynamical problem to a more tractable, adversarial problem to achieve a bound.

Lemma 2.7. For all $\pi$, $\Delta$, and $D$, when $B = 0$, $\Delta(\pi, \mathcal{S}) = \Delta(\pi, \mathcal{T})$.

Lemma 2.7 indicates that, for a fixed policy $\pi$, a change in disparity requires a measurable shift in distributions from $\mathcal{S}$ to $\mathcal{T}$, confirming intuition.

Restricted Distribution Shift. Common assumptions that restrict the set of distribution shifts include covariate shift and label shift. For covariate shift, the distribution of labels conditioned on features is preserved across distributions for all groups, while for label shift, the distribution of features conditioned on labels is preserved across distributions for all groups.

$$\text{Covariate shift implies } \Pr_{\mathcal{T}}(Y \mid X, G) = \Pr_{\mathcal{S}}(Y \mid X, G) \tag{6}$$

$$\text{Label shift implies } \Pr_{\mathcal{T}}(X \mid Y, G) = \Pr_{\mathcal{S}}(X \mid Y, G) \tag{7}$$

In Section 4, we explore a deterministic model of a population's response to classification as an example of covariate shift. We do the same in Section 5 for label shift.

3 General Bounds

We first define a primary bound in Definition 3.1 before considering simplifying special cases. Given an element-wise bound $B$ on the vector-valued shift $D(\mathcal{T}\|\mathcal{S})$ (Assumption 2.6), we may bound the disparity of policy $\pi$ on any realizable target distribution $\mathcal{T}$ by its supremum value.

Definition 3.1. Define the supremum value $v$ for $\Delta$ subject to $D(\mathcal{T}\|\mathcal{S}) \preceq B$ as

$$v(\Delta, D, \pi, \mathcal{S}, B) := \sup_{D(\mathcal{T}\|\mathcal{S}) \preceq B} \Delta(\pi, \mathcal{T}) \tag{8}$$

$$D(\mathcal{T}\|\mathcal{S}) \preceq B \;\Longrightarrow\; \Delta(\pi, \mathcal{T}) \le v(\Delta, D, \pi, \mathcal{S}, B) \tag{9}$$

In general, our strategy is to exploit the mathematical structure of the setting encoded by $\Delta$ (i.e., $\Psi$) and $D$ to obtain an upper bound for $v$ defined in Equation (8). We first explore general cases of simplifying assumptions before presenting worked special examples for frequently encountered settings. Finally, we compare the resulting theoretical bounds to numerical results and simulations.
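As a sanity check on Definition 3.1 and Equation (8), the sketch below (ours, not the paper's) approximates the supremum $v$ by brute-force search over candidate target distributions for a toy setting: a binary feature, two groups, demographic parity as $\Delta$, an accept-iff-$x{=}1$ policy, and total variation distance as each $D_g$. All of these modelling choices are illustrative assumptions.

```python
import itertools
import numpy as np

# Source: per-group probability that the binary feature equals 1.
p_source = {"g": 0.45, "h": 0.60}

def dp_disparity(p):
    """Delta_DP for the accept-iff-x==1 policy: the acceptance rate per group is Pr(X=1|G)."""
    return sum(abs(p[g] - p[h]) for g in p for h in p)

def tv_binary(p, q):
    """Total variation distance between Bernoulli(p) and Bernoulli(q) feature marginals."""
    return abs(p - q)

def sup_disparity(p_source, B, grid=201, tol=1e-9):
    """Brute-force approximation of v in Eq. (8): search targets with D_g <= B_g per group.
    The small tolerance guards against floating-point edge effects on the grid."""
    candidates = np.linspace(0.0, 1.0, grid)
    best = 0.0
    for pg, ph in itertools.product(candidates, candidates):
        if (tv_binary(pg, p_source["g"]) <= B["g"] + tol
                and tv_binary(ph, p_source["h"]) <= B["h"] + tol):
            best = max(best, dp_disparity({"g": pg, "h": ph}))
    return best

B = {"g": 0.05, "h": 0.05}
print("source disparity:", dp_disparity(p_source))            # 2 * |0.45 - 0.60| = 0.30
print("worst-case disparity v:", sup_disparity(p_source, B))  # approx 2 * (0.15 + 0.10) = 0.50
```

In realistic settings this exhaustive search is intractable, which is what motivates the structural upper bounds on $v$ developed in the rest of this section and in Sections 4 and 5.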
3.1 Lipschitz Conditions

The value of $v$ defines a scalar field in $B$ and therefore a conservative vector field $F = \nabla_B v$.

[Figure 2: surface plot of $v$ over the $(0, B)$ $D$-hyperrectangle, with per-group axes $b_g$ and $b_h$.]
Figure 2: A Lipschitz bound for all curves parameterized by distribution shift bound $b$ in the $(0, B)$ $D$-hyperrectangle on the surface $v$. In the figure, for groups $i \in \{g, h\}$, $\max \partial_{b_i} v = L_i$, and the colored dotted lines correspond to $L_i B_i$, which, when summed, equal $L \cdot B$.

For any curve in $\mathbb{D}$ from $\mathcal{S}$ to $\mathcal{T}$, bounds of the form $F \preceq L$ for some constant $L$ along the curve imply a Lipschitz bound on $\Delta$. We visualize a bound in Figure 2 for all possible curves in the region $D(\mathcal{T}\|\mathcal{S}) \preceq B$.

Theorem 3.2 (Lipschitz Upper Bound). If there exists an $L$ such that $\nabla_b v(\Delta, D, \pi, \mathcal{S}, b) \preceq L$ everywhere along some curve as $b$ varies from $0$ to $B$, then

$$\Delta(\pi, \mathcal{T}) \le \Delta(\pi, \mathcal{S}) + L \cdot B \tag{10}$$

Succinctly, if we are guaranteed that disparity can never increase faster than a certain rate in some measure of distribution shift, then, given a maximum distribution shift, this rate bounds the maximum possible disparity. The utility of Theorem 3.2 arises when a Lipschitz condition $L$ is known, but direct computation of $v$ is difficult. We provide an example of a Lipschitz bound in Section 5.

3.2 Subadditivity Conditions

Definition 3.3. Define $w$ as the maximum increase in disparity subject to $D(\mathcal{T}\|\mathcal{S}) \preceq B$, i.e., $w(\Delta, D, \pi, \mathcal{S}, B) := v(\Delta, D, \pi, \mathcal{S}, B) - \Delta(\pi, \mathcal{S})$.

Theorem 3.4. Suppose, in the region $D(\mathcal{T}\|\mathcal{S}) \preceq B$, that $w$ is subadditive in its last argument. That is, $w(\ldots, a) + w(\ldots, c) \ge w(\ldots, a + c)$ for $a, c \succeq 0$ and $a + c \preceq B$. If $w$ is also locally differentiable, then a first-order approximation of $w(\ldots, b)$ evaluated at $0$, i.e.,

$$L = \nabla_b w(\ldots, b)\big|_{b=0} = \nabla_b v(\ldots, b)\big|_{b=0} \tag{11}$$

provides an upper bound for $v(\ldots, B)$, i.e.,

$$v(\Delta, D, \pi, \mathcal{S}, B) \le \Delta(\pi, \mathcal{S}) + L \cdot B \tag{12}$$

Theorem 3.4 notes that diminishing returns in the change of $\Delta$ as the difference of $\mathcal{T}$ with respect to $\mathcal{S}$ is increased imply a bound on $\Delta$ in terms of its local sensitivity to $D$ at $\mathcal{S}$ (i.e., using a first-order Taylor approximation). Note that, if $w$ is concave in the bounded region, it is also subadditive in the bounded region, but the converse is not true, nor does the converse imply Lipschitz continuity.

3.3 Geometric Structure

It may happen that $\Psi: \mathbb{O}^2 \to \mathbb{R}$ and each $D_g: \mathbb{G}^2 \to \mathbb{R}$ share structure that permits a geometric interpretation of distribution shift. While the utility of this observation depends on the specific properties of $\Psi$ and $D$, we demonstrate a worked example building on Section 4.2 in Appendix D, in which we allow ourselves to select a suitable $D$ for ease of interpretation.

We proceed to consider worked examples that adopt common assumptions limiting the form of distribution shift and apply common definitions of statistical group fairness.

4 Covariate Shift

We now present our fairness transferability results subject to covariate shift for both demographic parity (Section 4.1) and equal opportunity (Section 4.2) as fairness criteria.

4.1 Demographic Parity

The simplest way to work with Equation (8) is to bound the supremum $v$. We first consider demographic parity (Equation (3)) for $\mathcal{Y} = \{0, 1\}$ and $\mathcal{G} = \{g, h\}$, subject to covariate shift (Equation (6)). We find that the form of $\Delta_{\mathrm{DP}}$ subject to covariate shift recommends itself to a natural choice of vector divergence, $D$. First, define a re-weighting coefficient $\omega_g(\mathcal{T}, \mathcal{S}, x) := \frac{\Pr_{\mathcal{T}}(X=x \mid G=g)}{\Pr_{\mathcal{S}}(X=x \mid G=g)}$.

Theorem 4.1. For demographic parity between two groups under covariate shift (denoting, for each $g$, $\beta_g := \Pr_{\pi,\mathcal{S}}(\hat{Y}=1 \mid G=g)$),

$$\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) \le \Delta_{\mathrm{DP}}(\pi, \mathcal{S}) + \sum_{g} \Big( \beta_g (1 - \beta_g)\, \mathrm{Var}_{\mathcal{S}}[\omega_g(\mathcal{T}, \mathcal{S}, x)] \Big)^{1/2} \tag{13}$$

We notice that $\mathrm{Var}_{\mathcal{S}}[\omega_g(\mathcal{T}, \mathcal{S}, x)]$ recommends itself as a suitable divergence $D_g$ from $\mathcal{S}$ to $\mathcal{T}$. Using basis vectors $\mathbf{e}_g$, for this example, we could define $D(\mathcal{T}\|\mathcal{S}) = \sum_g \mathbf{e}_g\, \mathrm{Var}_{\mathcal{S}}[\omega_g(\mathcal{T}, \mathcal{S}, x)]$. When $\mathrm{Var}_{\mathcal{S}}[\omega_g(\mathcal{T}, \mathcal{S}, x)] \le B_g$, it follows that $\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) \le \Delta_{\mathrm{DP}}(\pi, \mathcal{S}) + \sum_g \big(\beta_g(1-\beta_g) B_g\big)^{1/2}$. Comparing the inequality in Theorem 4.1 and the consequent of Equation (10), we can interpret the coefficients involving $\Pr_{\pi,\mathcal{S}}(\hat{Y}=1 \mid G=g)$ in Theorem 4.1 as an upper bound for the average value of $\nabla_b v(\Delta_{\mathrm{DP}}, D, \pi, \mathcal{T}, b)$ along any curve from $\mathcal{S}$ to $\mathcal{T}$. Interpreting this result, the closer $\Pr(\hat{Y}=1)$ is to 0.5 for any group, the more potentially sensitive the fairness of the policy is to distribution shifts for that group.
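The sketch below (our illustration, not the paper's code) evaluates the right-hand side of Theorem 4.1 for a small synthetic covariate shift by estimating the re-weighting coefficient $\omega_g$ on discretized features; the thresholds, the shift, and the histogram estimator are illustrative assumptions.

```python
import numpy as np

def theorem_4_1_bound(dp_source, beta, var_omega):
    """RHS of Eq. (13): Delta_DP(S) + sum_g sqrt(beta_g (1 - beta_g) Var_S[omega_g])."""
    return dp_source + sum(np.sqrt(beta[g] * (1 - beta[g]) * var_omega[g]) for g in beta)

def estimate_var_omega(x_source, x_target, bins=20):
    """Histogram estimate of Var_S[omega_g], omega_g(x) = Pr_T(x) / Pr_S(x), on X = [0, 1]."""
    edges = np.linspace(0, 1, bins + 1)
    p_s, _ = np.histogram(x_source, bins=edges)
    p_t, _ = np.histogram(x_target, bins=edges)
    p_s = p_s / p_s.sum()
    p_t = p_t / p_t.sum()
    omega = np.where(p_s > 0, p_t / np.maximum(p_s, 1e-12), 0.0)
    # Var_S[omega] = E_S[omega^2] - E_S[omega]^2, expectations taken under the source bins.
    return float(np.sum(p_s * omega**2) - np.sum(p_s * omega) ** 2)

rng = np.random.default_rng(1)
tau = {"g": 0.55, "h": 0.45}                            # group-specific thresholds
x_s = {g: rng.uniform(0, 1, 50_000) for g in tau}       # source features
x_t = {g: np.clip(x_s[g] + 0.03, 0, 1) for g in tau}    # a small, synthetic covariate shift

beta = {g: float((x_s[g] >= tau[g]).mean()) for g in tau}            # acceptance rates on S
dp_source = sum(abs(beta[g] - beta[h]) for g in tau for h in tau)    # Delta_DP(pi, S)
var_omega = {g: estimate_var_omega(x_s[g], x_t[g]) for g in tau}

print("Theorem 4.1 bound on Delta_DP(pi, T):",
      theorem_4_1_bound(dp_source, beta, var_omega))
```

With real data, the histogram ratio would be replaced by a proper density-ratio estimator, and $B_g$ would be a presupposed bound on $\mathrm{Var}_{\mathcal{S}}[\omega_g]$ rather than an estimate from observed target data.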
We can further generalize the results to a multi-class, multi-group setting:

Corollary 4.2. Theorem 4.1 may be generalized to multiple classes $\mathcal{Y} = \{1, 2, \ldots, m\}$ and multiple groups $\mathcal{G} = \{1, 2, \ldots, n\}$, where $\beta_{g,y} = \Pr_{\pi,\mathcal{S}}(\hat{Y}=y \mid G=g)$ and assuming $\mathrm{Var}_{\mathcal{S}}[\omega_g(\mathcal{T}, \mathcal{S}, x)] \le B_g$:

$$\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) := \sum_{g,h \in \mathcal{G}} \sum_{y \in \mathcal{Y}} \Big| \Pr_{\pi,\mathcal{T}}(\hat{Y}=y \mid G=g) - \Pr_{\pi,\mathcal{T}}(\hat{Y}=y \mid G=h) \Big| \tag{14}$$

$$\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) \le \Delta_{\mathrm{DP}}(\pi, \mathcal{S}) + \sum_{g,y} \Big( \beta_{g,y}(1 - \beta_{g,y})\, B_g \Big)^{1/2} \tag{15}$$

We remark that, in general, binary classification bounds may frequently be generalized to multi-class bounds by redefining fairness violations as a sum of binary-class fairness violations (i.e., same-class vs. different-class labels) and summing the bounds on each.

4.2 Equal Opportunity

Consider an example using the $(Y=1)$-conditioned case of Equalized Odds, termed Equal Opportunity (EOp). Denoting, for each group $g$, the true positive rate $\beta^+_g := \Pr_{\pi,\mathcal{T}}(\hat{Y}=1 \mid Y=1, G=g)$ as an implicit function of $\pi$ and $\mathcal{T}$, we define disparity for EOp as $\Delta_{\mathrm{EOp}}(\pi, \mathcal{T}) := \sum_{g,h \in \mathcal{G}} |\beta^+_g - \beta^+_h|$. We may bound the realized value of $\Delta_{\mathrm{EOp}}(\pi, \mathcal{T})$ by bounding $\beta^+_g$ for each group:

Theorem 4.3. Subject to covariate shift and any given $D$, $B$, assume extremal values for $\beta^+_g$, i.e.,

$$\forall g,\quad D_g(\mathcal{T}\|\mathcal{S}) < B_g \;\Longrightarrow\; l_g \le \beta^+_g(\pi, \mathcal{T}) \le u_g \tag{16}$$

it follows that

$$v(\Delta_{\mathrm{EOp}}, D, \pi, \mathcal{S}, B) \le \max_{\substack{x_g \in \{l_g, u_g\} \\ x_h \in \{l_h, u_h\}}} \sum_{g,h \in \mathcal{G}} |x_g - x_h| \tag{17}$$

Corollary 4.4. The disparity measurement $\Delta_{\mathrm{EOp}}$ cannot exceed $|\mathcal{G}|^2$.

In Appendix D, we bound the extremal values of $\beta^+_g$ by geometrically interpreting this quantity as an inner product on an appropriate vector space, utilizing the freedom to select an appropriate $D$.

5 Label Shift

Under label shift ($\Pr_{\mathcal{S}}(X \mid Y) = \Pr_{\mathcal{T}}(X \mid Y)$), violations of EO and EOp are invariant, because the independence of $\hat{Y}$ and $Y$ given $X$ implies $\Pr_{\pi,\mathcal{T}}(\hat{Y} \mid Y) = \Pr_{\pi,\mathcal{S}}(\hat{Y} \mid Y)$. We therefore focus on the violation of demographic parity (DP) (Equation (3)) subject to the label shift condition, treating a binary classification task over two groups for simplicity. In this setting, we choose to measure group-specific distribution shifts from $\mathcal{S}$ to $\mathcal{T}$ by the change in the proportion of ground-truth positive labels, which we refer to as the group qualification rate $Q_g(\mathcal{T}) := \Pr_{\mathcal{T}}(Y=1 \mid G=g)$:

$$D_g(\mathcal{T}\|\mathcal{S}) := |Q_g(\mathcal{S}) - Q_g(\mathcal{T})| \le B_g \tag{18}$$

Theorem 5.1. A Lipschitz condition bounds $\nabla_b v(\Delta_{\mathrm{DP}}, D, \pi, \mathcal{S}, b)$ when

$$D_g(\mathcal{T}\|\mathcal{S}) := |Q_g(\mathcal{S}) - Q_g(\mathcal{T})| \le B_g \tag{19}$$

Specifically,

$$\frac{\partial}{\partial b_g}\, v(\Delta_{\mathrm{DP}}, D, \pi, \mathcal{S}, b) \le (|\mathcal{G}| - 1)\, \big|\beta^+_g - \beta^-_g\big| \tag{20}$$

for true positive rates $\beta^+_g$ and false positive rates $\beta^-_g$:

$$\beta^+_g := \Pr_{\pi}(\hat{Y}=1 \mid Y=1, G=g); \qquad \beta^-_g := \Pr_{\pi}(\hat{Y}=1 \mid Y=0, G=g) \tag{21}$$

Because $\beta^+_g$ and $\beta^-_g$ are invariant under label shift given a constant policy $\pi$, we elide their explicit dependence on the underlying distribution.

Theorem 5.2. For DP under the bounded label-shift assumption $\forall g, |Q_g(\mathcal{S}) - Q_g(\mathcal{T})| \le B_g$,

$$\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) \le \Delta_{\mathrm{DP}}(\pi, \mathcal{S}) + (|\mathcal{G}| - 1) \sum_g B_g\, \big|\beta^+_g - \beta^-_g\big| \tag{22}$$

Intuitively, the change in $\Delta_{\mathrm{DP}}$ subject to label shift depends on $|\beta^+_g - \beta^-_g|$, the marginal change in acceptance rates as agents change their qualifications $Y$. We measure the distribution shift as agents change their qualifications by $|Q_g(\mathcal{S}) - Q_g(\mathcal{T})|$. When $\beta^+_g$ is close to $\beta^-_g$, the policy looks like a random classifier, and a label shift has limited effect on statistical group disparity. When $|\beta^+_g - \beta^-_g|$ is large, indicating high classifier accuracy, the effect on the supremal disparity is larger. Our bound thus exposes a direct trade-off between accuracy and fairness transferability guarantees.
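A minimal numeric sketch of Theorem 5.2 follows (our illustration; the per-group classifier rates, the qualification-rate bounds, and the source disparity below are made-up numbers): given the fixed policy's true and false positive rates and a bound $B_g$ on the change in qualification rates, the bound is immediate to evaluate.

```python
# Hypothetical per-group true positive and false positive rates of a fixed policy pi.
beta_plus  = {"g": 0.80, "h": 0.75}   # Pr(Y_hat=1 | Y=1, G)
beta_minus = {"g": 0.20, "h": 0.30}   # Pr(Y_hat=1 | Y=0, G)

# Presupposed bound on the per-group label shift: B_g >= |Q_g(S) - Q_g(T)|.
B = {"g": 0.04, "h": 0.02}

dp_source = 0.05                      # Delta_DP(pi, S), measured on the source distribution
n_groups = len(B)

# Theorem 5.2: Delta_DP(pi, T) <= Delta_DP(pi, S) + (|G| - 1) * sum_g B_g |beta+_g - beta-_g|.
bound = dp_source + (n_groups - 1) * sum(
    B[g] * abs(beta_plus[g] - beta_minus[g]) for g in B
)
print(f"Label-shift bound on Delta_DP(pi, T): {bound:.4f}")
```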
6 Comparisons to Synthetic Distribution Shifts (Demographic Parity)

To further interpret our results, in this section we consider specific and popular agent models to characterize distribution shift and instantiate our bounds for particular forms of $D$, $B$, and $\Delta$.

6.1 Covariate Shift via Strategic Response

Let us consider a specific example of covariate shift (Equation (6)) caused by a deterministic, group-independent model of strategic response in which agents react to a binary classification policy $\pi$ characterized by group-specific feature thresholds:

$$\hat{Y} \sim \pi(x, g) = \begin{cases} 1 & \text{with probability 1 if } x \ge \tau_g \\ 0 & \text{with probability 1 otherwise} \end{cases} \tag{23}$$

For simplicity, we assume the feature domain $\mathcal{X} = [0, 1]$. In response to threshold $\tau_g$, agents in each group $g$ may modify their feature $x$ to $x'$ by incurring a cost $c_g(x, x') \ge 0$. Similar to [18], we define the utility $u_g$ for agents in group $g$ to be

$$u_g(x, x') := \beta_g(x') - \beta_g(x) - c_g(x, x'); \qquad \beta_g(x) := \Pr\big(\hat{Y}=1 \mid X=x, G=g\big), \;\forall g. \tag{24}$$

Contrary to the standard strategic classification setting, we do not assume that feature updates represent false reports, but that such updates may correspond to actual changes underlying the true qualification $Y$ of each agent. This assumption has been made in a recent line of research on incentivizing improvement from human agents subject to such classification [5]. Next, we assume all agents are rational utility maximizers (Equation (24)). For a given threshold $\tau_g$ and manipulation budget $m_g$, the best response of an agent with original feature $x$ is

$$x' = \operatorname*{argmax}_z\; u_g(x, z), \;\text{such that}\; c_g(x, z) \le m_g \tag{25}$$

To make the problem tractable, we make additional assumptions about the agents' best responses.

Assumption 6.1. An agent's original feature $x$ is sampled as $X \sim U[0, 1]$, where $U$ represents the uniform distribution.

Assumption 6.2. The cost function $c_g(x, x')$ is monotone in $|x' - x|$, as $c_g(x, x') = |x' - x|$.

Under Assumption 6.2, only those agents with features $x \in [\tau_g - m_g, \tau_g)$ will attempt to change their feature. We also assume that feature updates are non-deterministic, such that agents with features closer to the decision boundary $\tau_g$ have a greater chance of updating their feature, and each updated feature $x'$ is sampled from a uniform distribution depending on $\tau_g$, $m_g$, and $x$:

Assumption 6.3. For agents who attempt to update their features, the probability of a successful feature update is $\Pr(X' \ne X) = 1 - \frac{|x - \tau_g|}{m_g}$.

Assumption 6.4. An agent's updated feature $x'$, given original feature $x$, manipulation budget $m_g$, and classification boundary $\tau_g$, is sampled as $X' \sim U[\tau_g,\, x + m_g]$.

With the above setting, we can specify the reweighting coefficient $\omega_g(x)$ for our setting (Equation (102) in Appendix F.1) and obtain the following bound for the strategic response setting (see Figure 5 in Appendix C for a demonstration of Theorem 4.1):

Proposition 6.5. For our assumed setting of strategic response involving DP for two groups $\{g, h\}$, Theorem 4.1 implies

$$\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) \le \Delta_{\mathrm{DP}}(\pi, \mathcal{S}) + \sqrt{\tau_g(1-\tau_g)\, B_g} + \sqrt{\tau_h(1-\tau_h)\, B_h} \tag{26}$$

The above result shows that two factors lead to a smaller difference between the source and target fairness violations: a less stochastic classifier (when the threshold $\tau_g$ is far away from 0.5) and a smaller manipulation budget $m_g$ (diminishing agents' ability to adapt their feature). In this case, $B_g = \tfrac{2}{3}\tau_g(1-\tau_g)\, m_g$. These factors lead to less potential manipulation and result in a tighter upper bound for the fairness violation on $\mathcal{T}$.
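The simulation sketch below is ours, under our reading of the (partially garbled) Assumptions 6.1–6.4, so treat the response model details (the success probability and landing interval in particular) as assumptions; it measures the realized change in $\Delta_{\mathrm{DP}}$ for this strategic-response setting rather than re-deriving Proposition 6.5.

```python
import numpy as np

def strategic_response(x, tau, m, rng):
    """One round of best responses under our reading of Assumptions 6.2-6.4:
    agents with x in [tau - m, tau) attempt to move; the success probability falls
    with distance from the boundary; successful movers land uniformly in [tau, x + m]."""
    x = x.copy()
    attempt = (x >= tau - m) & (x < tau)
    success = attempt & (rng.uniform(size=x.shape) < 1 - (tau - x) / m)
    x[success] = rng.uniform(tau, x[success] + m)
    return x

def dp(x, tau):
    """Delta_DP of the threshold policy in Eq. (23) on feature samples x per group."""
    accept = {g: float((x[g] >= tau[g]).mean()) for g in tau}
    return sum(abs(accept[g] - accept[h]) for g in tau for h in tau)

rng = np.random.default_rng(2)
tau = {"g": 0.55, "h": 0.45}     # group-specific acceptance thresholds
m   = {"g": 0.10, "h": 0.02}     # unequal manipulation budgets, so the groups shift differently
x_s = {g: rng.uniform(0, 1, 200_000) for g in tau}                     # source features (Assumption 6.1)
x_t = {g: strategic_response(x_s[g], tau[g], m[g], rng) for g in tau}  # target features

print("Delta_DP on S:", round(dp(x_s, tau), 4))
print("Delta_DP on T:", round(dp(x_t, tau), 4))
```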
6.2 Label Shift via Replicator Dynamics

We now evaluate our theoretical bound for demographic parity subject to label shift (Theorem 5.2) on the replicator dynamics model of Raab and Liu [30]. Briefly, replicator dynamics assumes that the proportion of agents in a population choosing one strategy over another grows in proportion to the ratio of average utilities realized by the two strategies. The cited model additionally assumes $\mathcal{X} = \mathbb{R}$, $\mathcal{Y} = \{0, 1\}$, and a monotonicity condition for $\mathcal{S}$ given by

$$\frac{d}{dx}\, \frac{\Pr_{\mathcal{S}}(X=x \mid Y=1)}{\Pr_{\mathcal{S}}(X=x \mid Y=0)} > 0.$$

Label shift under the discrete-time ($t$) replicator dynamics may be expressed in terms of group qualification rates $Q_g := \Pr_t(Y=1 \mid G=g)$ and agent utilities (i.e., group- and feature-independent values $U_{y,\hat{y}}$) such that, in each group, the popularity and average utility associated with a label determines its frequency at the next time step $t+1$.

[Figure 3: Comparisons to synthetic distribution shifts; larger versions are provided in Appendix E. (a) A stereoscopic (cross-eye view) comparison between the bound of Section 4.1 (gradated) and simulated results for the model of Section 6.1 (blue) in response to a DP-fair classifier with different initial group-independent acceptance rates; axes: max observed group shift, source acceptance rate, violation of demographic parity. (b) A policy satisfying DP is subject to distribution shift prescribed by replicator dynamics (Section 6.2). Realized disparity increases (blue) are compared to the theoretical bound (Theorem 5.2, gradated), which is tight when groups have dissimilar qualification rates; axes: group 1 and group 2 qualification rates, with panels for the modelled shift, the theoretical bound, and their difference.]

Denote the fractions of group-conditioned, feature-independent outcomes with the expression $\rho^{y,\hat{y}}_g := \Pr_t(\hat{Y}=\hat{y}, Y=y \mid G=g)$ and abbreviate the fraction-weighted utility as $u^{y,\hat{y}}_g(t) := U_{y,\hat{y}}\, \rho^{y,\hat{y}}_g$. We may then represent the replicator dynamics as

$$Q_g[t+1] = \frac{u^{1,1}_g(t) + u^{1,0}_g(t)}{u^{1,1}_g(t) + u^{1,0}_g(t) + u^{0,0}_g(t) + u^{0,1}_g(t)} \tag{27}$$

To apply Theorem 5.2, we also observe that $|\beta^+_g - \beta^-_g| = \frac{|\rho^{1,1}_g - \rho^{0,1}_g|}{\rho^{1,1}_g + \rho^{0,1}_g}$, where $\beta^+_g$ and $\beta^-_g$ represent the true positive rate and false positive rate for group $g$, respectively, and we use the change in qualification rate as our measurement of label shift, i.e., $B_g = |Q_g[t+1] - Q_g[t]|$. When demographic parity is perfectly satisfied, we note that the acceptance rate $(\rho^{1,1}_g + \rho^{0,1}_g)$ is group-independent.

Theorem 6.6. For DP subject to label shift via replicator dynamics,

$$\Delta_{\mathrm{DP}}(\pi, \mathcal{T}) \le \Delta_{\mathrm{DP}}(\pi, \mathcal{S}) + \sum_g \big| Q_g[t+1] - Q_g[t] \big|\, \frac{\big|\rho^{1,1}_g - \rho^{0,1}_g\big|}{\rho^{1,1}_g + \rho^{0,1}_g} \tag{28}$$

In Figure 3(b), we graphically represent all possible states of an initially fair system (thus determining $\beta$ and $\rho$ as a result of the monotonicity condition) by the tuple of qualification rates for each group. With the dynamics prescribed by Equation (27), we depict the rate of change of disparity given a fixed, locally DP-fair policy, and compare this to the theoretical bound when $B_g = |Q_g[t+1] - Q_g[t]|$. Interpreting our results, we note that the bound lacks information about the relative directions of the change in acceptance rates for each group, and thus over-approximates possible fairness violations when group acceptance rates shift in the same direction. When group acceptance rates move in opposing directions, however, the bound gives excellent agreement with the modelled replicator dynamics.
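The following sketch is our own illustration (the utilities $U_{y,\hat{y}}$, the initial qualification rates, and the fixed policy's rates are made-up numbers): it runs one step of the replicator update in Equation (27) and compares the realized change in $\Delta_{\mathrm{DP}}$ against Theorem 5.2 instantiated with the replicator-induced shift $B_g = |Q_g[t+1] - Q_g[t]|$, which is what Theorem 6.6 does.

```python
import numpy as np

# Made-up, group- and feature-independent utilities U[y, y_hat] for the replicator update.
U = np.array([[1.0, 0.2],    # y = 0: utilities for y_hat = 0, 1
              [0.3, 1.5]])   # y = 1: utilities for y_hat = 0, 1

beta_plus  = {"g": 0.85, "h": 0.85}   # true positive rates of a fixed policy
beta_minus = {"g": 0.15, "h": 0.15}   # false positive rates
Q = {"g": 0.30, "h": 0.70}            # current qualification rates Q_g[t]

def outcome_fractions(q, bp, bm):
    """rho[y][y_hat] = Pr(Y_hat=y_hat, Y=y | G=g) for a group with qualification rate q."""
    return np.array([[(1 - q) * (1 - bm), (1 - q) * bm],
                     [q * (1 - bp),       q * bp]])

def replicator_step(q, bp, bm):
    """Equation (27): next qualification rate from utility-weighted outcome fractions."""
    u = U * outcome_fractions(q, bp, bm)
    return u[1].sum() / u.sum()

def acceptance(q, bp, bm):
    return q * bp + (1 - q) * bm        # Pr(Y_hat=1 | G); beta+, beta- invariant under label shift

Q_next = {g: replicator_step(Q[g], beta_plus[g], beta_minus[g]) for g in Q}

dp_before = sum(abs(acceptance(Q[g], beta_plus[g], beta_minus[g])
                    - acceptance(Q[h], beta_plus[h], beta_minus[h])) for g in Q for h in Q)
dp_after = sum(abs(acceptance(Q_next[g], beta_plus[g], beta_minus[g])
                   - acceptance(Q_next[h], beta_plus[h], beta_minus[h])) for g in Q for h in Q)

# Theorem 5.2 with B_g = |Q_g[t+1] - Q_g[t]| (two groups, so the (|G| - 1) factor is 1).
bound = dp_before + sum(abs(Q_next[g] - Q[g]) * abs(beta_plus[g] - beta_minus[g]) for g in Q)

print(f"Delta_DP[t]   = {dp_before:.4f}")
print(f"Delta_DP[t+1] = {dp_after:.4f}  (bound: {bound:.4f})")
```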
7 Comparisons to Real-World Distribution Shifts

We now compare our special-case theoretical bounds (i.e., label/covariate shift) to real-world distribution shifts and hypothetical classifiers. We use American Community Survey (ACS) data provided by the US Census Bureau [16]. We adopt the sampling and pre-processing approaches of the Folktables package provided by Ding et al. [13] (available at https://github.com/zykls/folktables) to obtain 1,599,229 data points. The data is partitioned by (1) all fifty US states and (2) years from 2014 to 2018. We use 10 features covering the demographic information used in the UCI Adult dataset [4], including age, occupation, education, etc., as $X$ for our model, and select sex as the binary protected group, i.e., $\mathcal{G} = \{g = \text{female}, h = \text{male}\}$. We set the label $Y$ to whether an individual's annual income is greater than $50K.

To apply our label-shift or covariate-shift bounds, we first need to verify whether the two datasets satisfy either of these assumptions. We adopted a conditional independence test [22], which takes data from the source and target domains as input and returns a divergence score for each covariate and label variable, reflecting to what extent the variable is shifted between distributions. We find that the likelihood that the covariates shift across US states is approximately two orders of magnitude higher than for labels. More specifically, there are 4 covariates, including class of worker (probabilistic divergence score of 2.67e-2), hours worked per week (3.56e-2), sex (3.56e-2), and race (2.55e-1), that are more likely to be shifted than the label variable (1.29e-4). For temporal shifts within states, we find that the label variable is more likely to be shifted (0.1) than all the other covariates (which are below 0.01), approximately two orders of magnitude in favor of label shift over covariate shift. We therefore compare the disparities of hypothetical policies on these distributions to bounds generated from the corresponding, approximately satisfied assumptions.

On this data, we train a set of group-dependent, linear threshold classifiers $\Pr_{\pi(x,g)}(\hat{Y}=1) = \mathbb{1}[\sigma(w \cdot x) > \tau_g]$, for a range of thresholds $\tau_g$ and $\tau_h$ for each source distribution. Here, $\sigma(\cdot)$ is the logistic function and $w$ denotes a weight vector. We then consider two types of real-world distribution shift: (1) geographic, in which a model trained for one state is evaluated on another US state in the same year, and (2) temporal, in which a model trained for 2014 is evaluated on the same state in 2018.

[Figure 4: Simulated change in DP violation (blue mesh) subject to geographic and temporal distribution shifts vs. direct application of bounds for approximately satisfied assumptions (respectively, Theorem 4.1 and Theorem 5.2) (gradated mesh); panels include (c) CA: 2014 to 2018 and (d) TX: 2014 to 2018. The x-axis and y-axis of each panel represent the policy thresholds $\tau_g$ and $\tau_h$; the z-axis is the change in violation of demographic parity.]
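To make the experimental recipe concrete, here is a runnable stand-in (our sketch, with synthetic arrays in place of the ACS tables; in the real pipeline the features, labels, and sex attribute for a given state and year would be loaded via the Folktables package linked above) showing how the group-dependent threshold classifiers and the change in DP violation are computed. The weight vector, thresholds, and data generator below are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dp_violation(accept, group):
    """Sum over ordered group pairs of |Pr(accept | G=g) - Pr(accept | G=h)| (Eq. (3))."""
    groups = np.unique(group)
    rates = {g: accept[group == g].mean() for g in groups}
    return sum(abs(rates[g] - rates[h]) for g in groups for h in groups)

def threshold_policy(X, group, w, tau):
    """Group-dependent linear threshold classifier: accept iff sigmoid(w . x) > tau_{group}."""
    scores = sigmoid(X @ w)
    taus = np.array([tau[g] for g in group])
    return scores > taus

# Stand-in data; in the paper's experiment X and group come from ACS data for the
# source year (2014) and target year (2018) of a given state.
rng = np.random.default_rng(3)
d = 10
w = rng.normal(size=d)

def fake_year(shift):
    X = rng.normal(loc=shift, size=(20_000, d))
    group = rng.integers(0, 2, size=20_000)
    return X, group

X_2014, g_2014 = fake_year(shift=0.0)
X_2018, g_2018 = fake_year(shift=0.1)   # crude stand-in for a temporal distribution shift

tau = {0: 0.55, 1: 0.45}
dp_src = dp_violation(threshold_policy(X_2014, g_2014, w, tau), g_2014)
dp_tgt = dp_violation(threshold_policy(X_2018, g_2018, w, tau), g_2018)
print(f"DP violation 2014: {dp_src:.4f}   2018: {dp_tgt:.4f}   change: {dp_tgt - dp_src:.4f}")
```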
We graphically compare the theoretical bounds of Theorem 4.1 and Theorem 5.2 for the increased violation of DP subject to covariate and label shift, respectively, to the simulated violations for our model and data in Figure 4. We provide additional examples and an evaluation of bounds for EO subject to covariate shift (noting that label shift preserves EO in theory) in Appendix E.2. Despite the fact that geographic or temporal distribution shifts only approximately satisfy the assumptions of covariate or label shift, these comparisons demonstrate that our theoretical bounds are not vacuous, approximately bounding the change of fairness violation across real-world domain shifts. For geographic shifts, the covariate shift EO bounds (Appendix E.2) correctly overestimate disparity and tighten near accurate policies, while our DP bounds are useful only for a subset of policy thresholds (Figure 4(b)). For temporal shifts, the label shift bound for DP correctly overestimates the real change of DP violations while remaining at the same order of magnitude (Figure 4(c) and 4(d)).

8 Conclusion and Discussion

In this paper, we have developed a unifying framework for bounding the violation of statistical group fairness guarantees when the underlying distribution shifts within presupposed bounds. We hope that this work can generate meaningful discussion regarding the viability of fairness guarantees subject to distribution shift, the bounds of adversarial attacks against algorithmic fairness, and evaluations of robustness with respect to algorithmic fairness. We believe that, just as published empirical measurements are of limited use without reported uncertainties, fairness guarantees must be accompanied by bounds on their robustness to distribution shift.

Future work remains to apply our framework to the problem of fairness transferability in settings with more complicated distribution shift dynamics. For example, compound distribution shifts [33], which compose covariate shifts and label shifts, cannot be treated by composing the theoretical bounds developed herein without additional information regarding intermediate distributions. Another potential future direction is to develop reasonable bounds on anticipated distribution shift from models of human behavior and exogenous pressures.

Acknowledgement

This work is supported by the National Science Foundation (NSF) under grants IIS-2143895, IIS-2040800 (FAI program in collaboration with Amazon), and CCF-2023495.

References

[1] Bang An, Zora Che, Mucong Ding, and Furong Huang. Transferring fairness under distribution shifts via fair consistency regularization. arXiv preprint arXiv:2206.12796, 2022.
[2] Shai Ben-David, John Blitzer, Koby Crammer, Alex Kulesza, Fernando Pereira, and Jennifer Vaughan. A theory of learning from different domains. Machine Learning, 79:151–175, 2010.
[3] Arpita Biswas and Suvam Mukherjee. Ensuring fairness under prior probability shifts. In Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society, pages 414–424, 2021.
[4] Catherine Blake. UCI repository of machine learning databases. 1998.
[5] Yatong Chen, Jialu Wang, and Yang Liu. Linear classifiers that encourage constructive adaptation. In Algorithmic Recourse Workshop at ICML '21, 2021.
[6] Alexandra Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2):153–163, 2017.
[7] Stephen Coate and Glenn C. Loury.
Will affirmative-action policies eliminate negative stereotypes? The American Economic Review, pages 1220–1240, 1993.
[8] Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. Algorithmic decision making and the cost of fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 797–806, 2017.
[9] Amanda Coston, Karthikeyan Natesan Ramamurthy, Dennis Wei, Kush R. Varshney, Skyler Speakman, Zairah Mustahsan, and Supriyo Chakraborty. Fair transfer learning with missing protected attributes. In Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society, AIES '19, pages 91–98, New York, NY, USA, 2019. Association for Computing Machinery. ISBN 9781450363242. doi: 10.1145/3306618.3314236. URL https://doi.org/10.1145/3306618.3314236.
[10] Andrew Cotter, Maya Gupta, and Harikrishna Narasimhan. On making stochastic classifiers deterministic. In Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019.
[11] Elliot Creager, David Madras, Toniann Pitassi, and Richard Zemel. Causal modeling for fairness in dynamical systems. In Proceedings of the 37th International Conference on Machine Learning, ICML '20. JMLR.org, 2020.
[12] Alexander D'Amour, Hansa Srinivasan, James Atwood, Pallavi Baljekar, D. Sculley, and Yoni Halpern. Fairness is not static: deeper understanding of long term fairness via simulation studies. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 525–534, 2020.
[13] Frances Ding, Moritz Hardt, John Miller, and Ludwig Schmidt. Retiring adult: New datasets for fair machine learning. In Thirty-Fifth Conference on Neural Information Processing Systems, 2021. URL https://openreview.net/forum?id=bYi_2708mKK.
[14] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 214–226, 2012.
[15] Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268, 2015.
[16] Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles, and J. Robert Warren. Integrated public use microdata series, current population survey: Version 8.0 [dataset], 2020. URL https://www.ipums.org/projects/ipums-cps/d030.v8.0.
[17] Nina Grgić-Hlača, Muhammad Bilal Zafar, Krishna P. Gummadi, and Adrian Weller. On fairness, diversity and randomness in algorithmic decision making, 2017.
[18] Moritz Hardt, Nimrod Megiddo, Christos Papadimitriou, and Mary Wootters. Strategic classification. In Proceedings of the 2016 ACM Conference on Innovations in Theoretical Computer Science, pages 111–122, New York, NY, USA, 2016. Association for Computing Machinery.
[19] Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, pages 3315–3323, 2016.
[20] Lily Hu and Yiling Chen. A short-term intervention for long-term fairness in the labor market. In Proceedings of the 2018 World Wide Web Conference on World Wide Web, pages 1389–1398. International World Wide Web Conferences Steering Committee, 2018.
[21] Mintong Kang, Linyi Li, Maurice Weber, Yang Liu, Ce Zhang, and Bo Li. Certifying some distributional fairness with subpopulation decomposition.
arXiv preprint arXiv:2205.15494, 2022.
[22] Sean Kulinski, Saurabh Bagchi, and David I. Inouye. Feature shift detection: Localizing which features have shifted via conditional distribution tests. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 19523–19533. Curran Associates, Inc., 2020. URL https://proceedings.neurips.cc/paper/2020/file/e2d52448d36918c575fa79d88647ba66-Paper.pdf.
[23] Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, and Moritz Hardt. Delayed impact of fair machine learning. In International Conference on Machine Learning, pages 3150–3158. PMLR, 2018.
[24] Lydia T. Liu, Ashia Wilson, Nika Haghtalab, Adam Tauman Kalai, Christian Borgs, and Jennifer Chayes. The disparate equilibria of algorithmic decision making when individuals invest rationally. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 381–391, 2020.
[25] Yang Liu, Yatong Chen, Zeyu Tang, and Kun Zhang. Model transferability with responsive decision subjects, 2021.
[26] Debmalya Mandal, Samuel Deng, Suman Jana, Jeannette Wing, and Daniel J. Hsu. Ensuring fairness beyond the training data. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 18445–18456. Curran Associates, Inc., 2020.
[27] Yishay Mansour, Mehryar Mohri, and Afshin Rostamizadeh. Domain adaptation: Learning bounds and algorithms, 2009.
[28] Hussein Mouzannar, Mesrob I. Ohannessian, and Nathan Srebro. From fair decision making to social equality. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 359–368. ACM, 2019.
[29] Harikrishna Narasimhan. Learning with complex loss functions and constraints. In Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, volume 84 of Proceedings of Machine Learning Research, pages 1646–1654. PMLR, 09–11 Apr 2018.
[30] Reilly Raab and Yang Liu. Unintended selection: Persistent qualification rate disparities and interventions. Advances in Neural Information Processing Systems, 34, 2021.
[31] Ashkan Rezaei, Anqi Liu, Omid Memarrast, and Brian D. Ziebart. Robust fairness under covariate shift. In AAAI, 2021.
[32] Yuji Roh, Kangwook Lee, Steven Whang, and Changho Suh. Sample selection for fair and robust training. In M. Ranzato, A. Beygelzimer, Y. Dauphin, P. S. Liang, and J. Wortman Vaughan, editors, Advances in Neural Information Processing Systems, volume 34, pages 815–827. Curran Associates, Inc., 2021.
[33] Jessica Schrouff, Natalie Harris, Oluwasanmi Koyejo, Ibrahim Alabdulmohsin, Eva Schnider, Krista Opsahl-Ong, Alex Brown, Subhrajit Roy, Diana Mincu, Christina Chen, et al. Maintaining fairness across distribution shift: do we have viable solutions for real-world applications? arXiv preprint arXiv:2202.01034, 2022.
[34] Candice Schumann, Xuezhi Wang, Alex Beutel, Jilin Chen, Hai Qian, and Ed H. Chi. Transfer of machine learning fairness across domains, 2019.
[35] Hidetoshi Shimodaira. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 90(2):227–244, 2000.
[36] Harvineet Singh, Rina Singh, Vishwali Mhasawade, and Rumi Chunara. Fairness violations and mitigation under covariate shift. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT '21, New York, NY, USA, 2021.
Association for Computing Machinery.
[37] Masashi Sugiyama, Taiji Suzuki, Shinichi Nakajima, Hisashi Kashima, Paul von Bünau, and Motoaki Kawanabe. Direct importance estimation for covariate shift adaptation. Annals of the Institute of Statistical Mathematics, 60(4):699–746, 2008.
[38] Berk Ustun, Alexander Spangher, and Yang Liu. Actionable recourse in linear classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency, pages 10–19, 2019.
[39] Min Wen, Osbert Bastani, and Ufuk Topcu. Fairness with dynamics. arXiv preprint arXiv:1901.08568, 2019.
[40] Jimmy Wu, Yatong Chen, and Yang Liu. Metric-fair classifier derandomization. In Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research. PMLR, 17–23 Jul 2022.
[41] Rich Zemel, Yu Wu, Kevin Swersky, Toni Pitassi, and Cynthia Dwork. Learning fair representations. In International Conference on Machine Learning, pages 325–333. PMLR, 2013.
[42] Kun Zhang, Bernhard Schölkopf, Krikamol Muandet, and Zhikun Wang. Domain adaptation under target and conditional shift. In International Conference on Machine Learning, pages 819–827. PMLR, 2013.
[43] Xueru Zhang, Ruibo Tu, Yang Liu, Mingyan Liu, Hedvig Kjellström, Kun Zhang, and Cheng Zhang. How do fair decisions fare in long-term qualification? In NeurIPS, 2020.