# Algorithmic Fairness Verification with Graphical Models*

Bishwamittra Ghosh¹, Debabrota Basu², Kuldeep S. Meel¹

¹ School of Computing, National University of Singapore, Singapore
² Équipe Scool, Univ. Lille, Inria, UMR 9189 - CRIStAL, CNRS, Centrale Lille, France

*Source code: https://github.com/meelgroup/justicia
Copyright 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

In recent years, machine learning (ML) algorithms have been deployed in safety-critical and high-stakes decision-making, where the fairness of algorithms is of paramount importance. Fairness in ML centers on detecting bias towards certain demographic populations induced by an ML classifier and proposes algorithmic solutions to mitigate the bias with respect to different fairness definitions. To this end, several fairness verifiers have been proposed that compute the bias in the prediction of an ML classifier, essentially beyond a finite dataset, given the probability distribution of input features. In the context of verifying linear classifiers, existing fairness verifiers are limited in accuracy due to imprecise modeling of correlations among features, and in scalability due to restrictive formulations of the classifiers as SSAT/SMT formulas or due to sampling. In this paper, we propose an efficient fairness verifier, called FVGM, that encodes the correlations among features as a Bayesian network. In contrast to existing verifiers, FVGM proposes a stochastic subset-sum based approach for verifying linear classifiers. Experimentally, we show that FVGM leads to an accurate and scalable assessment for more diverse families of fairness-enhancing algorithms, fairness attacks, and group/causal fairness metrics than the state-of-the-art fairness verifiers. We also demonstrate that FVGM facilitates the computation of fairness influence functions as a stepping stone to detecting the source of bias induced by subsets of features.

## 1 Introduction

The significant improvement of machine learning (ML) over the decades has led to a host of applications of ML in high-stakes decision-making such as college admission (Martinez Neda, Zeng, and Gago-Masague 2021), hiring of employees (Ajunwa et al. 2016), and recidivism prediction (Tollenaar and Van der Heijden 2013; Dressel and Farid 2018). ML algorithms often have an accuracy-centric learning objective, which may cause them to be biased towards parts of the dataset belonging to certain economically or socially sensitive groups (Landy, Barnes, and Murphy 1978; Zliobaite 2015; Berk 2019). The following example illustrates a plausible case of unfairness induced by ML algorithms.

**Example 1.1.** Following (Ghosh, Basu, and Meel 2021, Example 1), let us consider an ML problem where the classifier decides the eligibility of an individual for health insurance given their income and fitness (Figure 1). Here, the sensitive feature age ($A$) follows a Bernoulli distribution, and income ($I$) and fitness ($F$) follow Gaussian distributions. We generate 1000 samples from these distributions and use them to train a Support Vector Machine (SVM) classifier. The decision boundary of this classifier is $9.37I + 9.75F - 0.34A \ge 9.4$, where $A = 1$ denotes the sensitive group "age $\ge$ 40". This classifier selects an individual above and below 40 years of age with probabilities 0.24 and 0.86, respectively. This illustrates a disparate treatment of individuals of the two age groups by the SVM classifier.
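To make Example 1.1 concrete, the following minimal sketch generates a comparable synthetic dataset and measures the group-wise selection rates of a linear SVM. The Gaussian parameters, the random seed, and the scikit-learn calls are our own assumptions (the paper only states the families of distributions), so the learned boundary and the quoted 0.24/0.86 rates will not be reproduced exactly.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical group-dependent distributions in the spirit of Example 1.1.
rng = np.random.default_rng(0)
n = 1000
A = rng.binomial(1, 0.5, n)                      # A = 1 means "age >= 40"
I = rng.normal(np.where(A == 1, 0.4, 0.6), 0.1)  # income, group-dependent mean
F = rng.normal(np.where(A == 1, 0.4, 0.6), 0.1)  # fitness, group-dependent mean
Y = (I + F >= 1.0).astype(int)                   # eligibility label

X = np.column_stack([I, F, A])
clf = SVC(kernel="linear").fit(X, Y)             # linear SVM as in the example

pred = clf.predict(X)
for a in (1, 0):
    print(f"Pr[Y_hat = 1 | A = {a}] = {pred[A == a].mean():.2f}")
```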
Figure 1: Age-dependent distributions of income and fitness in Example 1.1 (axes: income and fitness in [0, 1]; groups: age < 40 and age ≥ 40).

In order to identify and mitigate the bias of ML classifiers, different fairness definitions and fairness algorithms have been proposed (Hardt, Price, and Srebro 2016; Kusner et al. 2017; Mehrabi et al. 2019). In this paper, we focus on two families of fairness definitions: group and causal fairness. Group fairness metrics, such as disparate impact and equalized odds, constrain the probability of the positive prediction of the classifier to be (almost) equal among different sensitive groups (Dwork et al. 2012; Feldman et al. 2015). On the other hand, causal fairness metrics assess the difference in positive predictions if every feature in the causal relation remains identical except the sensitive feature (Nabi and Shpitser 2018; Zhang and Bareinboim 2018).

The early works on fairness verification focused on measuring fairness metrics of a classifier for a given dataset (Bellamy et al. 2018). Naturally, such techniques were limited in enhancing the confidence of users for wide deployment. Consequently, recent verifiers seek to achieve verification beyond a finite dataset and in turn focus on the probability distribution of features (Albarghouthi et al. 2017; Bastani, Zhang, and Solar-Lezama 2019; Ghosh, Basu, and Meel 2021). More specifically, the input to the verifier is a classifier and the probability distribution of features, and the output is an estimate of the fairness metrics that the classifier obtains given the distribution. For Example 1.1, a fairness verifier takes the SVM classifier and the distribution of the features I, F, A as input and outputs the probability of positive prediction of the classifier for different sensitive groups.

In order to solve the fairness-verification problem, existing works have proposed two principled approaches. Firstly, Ghosh, Basu, and Meel (2021) and Albarghouthi et al. (2017) propose formal methods that reduce the problem to the solution of an SSAT or an SMT formula, respectively. Secondly, Bastani, Zhang, and Solar-Lezama (2019) propose a sampling approach that relies on extensively enumerating the conditional probabilities of prediction given different sensitive features and thus incurs high computational cost. Additionally, existing works assume independence of non-sensitive features and consider correlated features within a limited scope, such as conditional probabilities of non-sensitive features w.r.t. sensitive features, while ignoring correlations among non-sensitive features. As a result, the scalability and accuracy of existing verifiers remain a major challenge.

In this work, we seek to remedy the aforementioned situation. As a first step, we focus on linear classifiers, which have attracted significant attention from researchers in the context of fair algorithms (Pleiss et al. 2017; Zafar et al. 2017; Dressel and Farid 2018; John, Vijaykeerthy, and Saha 2020). At this point, it is worth highlighting that our empirical evaluation demonstrates that the existing techniques either fail to scale beyond small examples or provide highly inaccurate estimates for comparatively small linear classifiers.

**Our Contributions.** In this paper, we propose a fairness verification framework, namely FVGM (Fairness Verification with Graphical Models), for accurately and efficiently verifying linear classifiers.
FVGM proposes a novel stochastic subset-sum encoding for linear classifiers with an efficient pseudo-polynomial solution using dynamic programming. To address feature correlations, FVGM considers a graphical model, particularly a Bayesian network, that represents conditional dependence (and independence) among features in the form of a Directed Acyclic Graph (DAG). Experimentally, FVGM is more accurate and scalable than existing fairness verifiers; FVGM can verify group and causal fairness metrics for multiple fairness algorithms. We also demonstrate two novel applications of FVGM as a fairness verifier: (a) detecting fairness attacks, and (b) computing Fairness Influence Functions (FIF) of features as a means of identifying the (un)fairness contribution of a subset of features.

## 2 Background

In this section, we define different fairness metrics proposed for classification. Following that, we state the basics of stochastic subset sum and Bayesian networks, which are the main components of our methodology.

**Fairness in ML.** We consider¹ a dataset $\mathbf{D}$ as a collection of triples $(\mathbf{X}, \mathbf{A}, Y)$ generated from an underlying distribution $\mathcal{D}$. $\mathbf{X} \triangleq \{X_1, \dots, X_{m_1}\}$ are non-sensitive features, whereas $\mathbf{A} \triangleq \{A_1, \dots, A_{m_2}\}$ are categorical sensitive features. $Y \in \{0, 1\}$ is the binary label (or class) of $(\mathbf{X}, \mathbf{A})$. Each non-sensitive feature $X_i$ is sampled from a continuous probability distribution $\mathcal{X}_i$, and each sensitive feature $A_j \in \{0, \dots, N_j\}$ is sampled from a discrete probability distribution $\mathcal{A}_j$. We use $(\mathbf{x}, \mathbf{a})$ to denote the feature-values of $(\mathbf{X}, \mathbf{A})$. For sensitive features, a valuation vector $\mathbf{a} = [a_1, \dots, a_{m_2}]$ is called a compound sensitive group. For example, consider $\mathbf{A} = \{\text{race}, \text{sex}\}$ where race $\in$ {Asian, Color, White} and sex $\in$ {female, male}. Thus $\mathbf{a} = [\text{Asian}, \text{female}]$ is a compound sensitive group. We represent a binary classifier trained on the dataset $\mathbf{D}$ as $\mathcal{M} : (\mathbf{X}, \mathbf{A}) \to \hat{Y}$. Here, $\hat{Y} \in \{0, 1\}$ is the predicted class of $(\mathbf{X}, \mathbf{A})$.

¹We represent sets/vectors by bold letters, and the corresponding distributions by calligraphic letters. We express random variables in uppercase, and an assignment of a random variable in lowercase.

We now discuss different fairness metrics in the literature based on the prediction of a classifier (Feldman et al. 2015; Hardt, Price, and Srebro 2016; Nabi and Shpitser 2018). In this paper, FVGM verifies two families of fairness metrics: group fairness (the first three in the following) and path-specific causal fairness.

1. **Disparate Impact (DI):** A classifier satisfies $(1-\epsilon)$-disparate impact if, for $\epsilon \in [0, 1]$, $\min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}] \ge (1 - \epsilon) \max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$.
2. **Statistical Parity (SP):** A classifier satisfies $\epsilon$-statistical parity if, for $\epsilon \in [0, 1]$, $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}] - \min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}] \le \epsilon$.
3. **Equalized Odds (EO):** A classifier satisfies $\epsilon$-equalized odds if, for $\epsilon \in [0, 1]$, $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, Y = 0] - \min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, Y = 0] \le \epsilon$, and $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, Y = 1] - \min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, Y = 1] \le \epsilon$.
4. **Path-specific Causal Fairness (PCF):** Let $\mathbf{a}_{\max} \triangleq \arg\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$. We consider mediator features $\mathbf{Z} \subseteq \mathbf{X}$ sampled from the conditional distribution $\mathcal{Z}|_{\mathbf{A} = \mathbf{a}_{\max}}$. This emulates the fact that mediator variables have the same sensitive features $\mathbf{a}_{\max}$. For $\epsilon \in [0, 1]$, path-specific causal fairness is defined as $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, \mathbf{Z}] - \min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, \mathbf{Z}] \le \epsilon$.

For all of the above metrics, a lower value of $\epsilon$ indicates higher fairness demonstrated by the classifier $\mathcal{M}$.
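As a quick illustration of how these definitions reduce to group-conditional probabilities of positive prediction, the following sketch evaluates DI, SP, and EO on hypothetical probabilities; all numbers are made up for illustration only.

```python
# Hypothetical group-conditional probabilities of positive prediction.
ppv   = {("Asian", "female"): 0.42, ("White", "male"): 0.61, ("Color", "female"): 0.38}
ppv_y = {0: {g: p - 0.05 for g, p in ppv.items()},   # Pr[Y_hat=1 | A=a, Y=0] (assumed)
         1: {g: p + 0.05 for g, p in ppv.items()}}   # Pr[Y_hat=1 | A=a, Y=1] (assumed)

di = min(ppv.values()) / max(ppv.values())           # (1-eps)-DI holds iff di >= 1 - eps
sp = max(ppv.values()) - min(ppv.values())           # eps-SP holds iff sp <= eps
eo = max(max(d.values()) - min(d.values()) for d in ppv_y.values())
print(f"DI = {di:.2f}, SP = {sp:.2f}, EO = {eo:.2f}")
```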
Following the observation of Ghosh, Basu, and Meel (2021), computing all of the aforementioned fairness metrics is equivalent to computing the maximum and minimum of the probability of positive prediction of the classifier, denoted as $\Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$, over all compound sensitive groups $\mathbf{a}$ from $\mathbf{A}$. Thus, in Section 3, we focus on computing the maximum and minimum probability of positive prediction of the classifier and then extend it to assess the corresponding fairness metrics. We call the group for which the probability of positive prediction is maximum (minimum) the most (least) favored group of the classifier.

**Stochastic Subset Sum Problem (S3P).** Let $\mathbf{B} \triangleq \{B_i\}_{i=1}^{|\mathbf{B}|}$ be a set of Boolean variables and $w_i \in \mathbb{Z}$ be the weight of $B_i$. Given a constraint of the form $\sum_{i=1}^{|\mathbf{B}|} w_i B_i = \tau$, for a constant threshold $\tau \in \mathbb{Z}$, the subset-sum problem seeks to compute an assignment $\mathbf{b} \in \{0, 1\}^{|\mathbf{B}|}$ such that the constraint evaluates to true when $\mathbf{B}$ is substituted with $\mathbf{b}$. The subset-sum problem is an NP-complete problem and is well studied in theoretical computer science (Kleinberg and Tardos 2006). The counting version of the subset-sum problem counts all $\mathbf{b}$'s for which the above constraint holds; this problem belongs to the complexity class #P. In this paper, we consider the counting problem for the constraint $\sum_{i=1}^{|\mathbf{B}|} w_i B_i \ge \tau$ where the variables $B_i$ are stochastic. More precisely, we define a stochastic subset-sum problem, namely S3P, that computes $\Pr[\sum_{i=1}^{|\mathbf{B}|} w_i B_i \ge \tau]$. Details of S3P are in Section 3.1.

**Bayesian Network.** In general, a Probabilistic Graphical Model (Koller and Friedman 2009), and specifically a Bayesian network (Pearl 1985; Chavira and Darwiche 2008), encodes the dependencies and conditional independence between a set of random variables. In this paper, we leverage access to a Bayesian network on $\mathbf{X} \cup \mathbf{A}$ that represents the joint distribution over them. A Bayesian network is denoted by a pair $(G, \theta)$, where $G \triangleq (\mathbf{V}, \mathbf{E})$ is a DAG (Directed Acyclic Graph), and $\theta$ is a set of parameters encoding the conditional probabilities induced by the joint distribution under investigation. Each vertex $V_i \in \mathbf{V}$ corresponds to a random variable. Edges $\mathbf{E} \subseteq \mathbf{V} \times \mathbf{V}$ imply conditional dependencies among variables. For each variable $V_i \in \mathbf{V}$, let $\mathrm{Pa}(V_i) \subseteq \mathbf{V} \setminus \{V_i\}$ denote the set of parents of $V_i$. Given $\mathrm{Pa}(V_i)$ and parameters $\theta$, $V_i$ is independent of its other non-descendant variables in $G$. Thus, for the assignment $v_i$ of $V_i$ and $\mathbf{u}$ of $\mathrm{Pa}(V_i)$, the aforementioned semantics of a Bayesian network encodes the joint distribution of $\mathbf{V}$ as:

$$\Pr[V_1 = v_1, \dots, V_{|\mathbf{V}|} = v_{|\mathbf{V}|}] = \prod_{i=1}^{|\mathbf{V}|} \Pr[V_i = v_i \mid \mathrm{Pa}(V_i) = \mathbf{u}; \theta]. \tag{1}$$
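For concreteness, a minimal sketch of the factorization in Eq. (1) on a toy two-node network follows; the structure and CPT values match the toy network used later in Example 3.2 but are otherwise illustrative, and the marginal of P is assumed.

```python
# Toy Bayesian network P -> Q with hypothetical parameters.
cpt = {
    "P": {(): 0.5},                    # Pr[P = 1] (assumed)
    "Q": {(1,): 0.6, (0,): 0.3},       # Pr[Q = 1 | P]
}
parents = {"P": (), "Q": ("P",)}

def joint(assignment):
    """Pr[V1 = v1, ..., Vn = vn] as the product of conditional probabilities, Eq. (1)."""
    prob = 1.0
    for var, value in assignment.items():
        u = tuple(assignment[p] for p in parents[var])
        p1 = cpt[var][u]
        prob *= p1 if value == 1 else 1 - p1
    return prob

print(joint({"P": 1, "Q": 1}))   # 0.5 * 0.6 = 0.30
print(joint({"P": 0, "Q": 1}))   # 0.5 * 0.3 = 0.15
```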
## 3 FVGM: Fairness Verification with Graphical Models

In this section, we present FVGM, a fairness verification framework for linear classifiers that accounts for correlated features represented as a graphical model. The core idea of verifying fairness of a classifier is to compute the probability of positive prediction of the classifier with respect to all compound sensitive groups. To this end, FVGM solves a stochastic subset sum problem, S3P, that is equivalent to computing the probability of positive prediction of the classifier for the most (and the least) favored sensitive group. In this section, we first define S3P and present an efficient dynamic programming solution for S3P. We then extend S3P to consider correlated features as input. Finally, we conclude by discussing fairness verification based on the solution of S3P.

**Problem Formulation.** Given a linear classifier $\mathcal{M} : (\mathbf{X}, \mathbf{A}) \to \hat{Y}$ and a probability distribution $\mathcal{D}$ of $\mathbf{X} \cup \mathbf{A}$, our objective is to compute $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$ and $\min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$ with respect to $\mathcal{D}$. In this study, we express a linear classifier $\mathcal{M}$ as

$$\hat{Y} = \mathbb{1}\Big[\sum_i w_{X_i} X_i + \sum_j w_{A_j} A_j \ge \tau\Big].$$

Here, $w$ denotes the weight (or coefficient) of a feature, $\tau$ denotes the bias or offset parameter of the classifier, and $\mathbb{1}$ is an indicator function. Hence, the prediction $\hat{Y} = 1$ if and only if the inner inequality holds. Thus, computing the maximum (resp. minimum) probability of positive prediction is equivalent to finding the assignment of the $A_j$'s for which the probability of satisfying the inner inequality is highest (resp. lowest). We reduce this computation to an instance of S3P. To perform this reduction, we assume the weights $w$ and the bias $\tau$ to be integers, and the features $\mathbf{X} \cup \mathbf{A}$ to be Boolean. In Sec. 3.5, we relax these assumptions and extend to practical settings.

### 3.1 S3P: Stochastic Subset Sum Problem

Now we formally describe the specification and semantics of S3P. S3P operates on a set of Boolean variables $\mathbf{B} = \{B_i\}_{i=1}^{n} \in \{0, 1\}^n$, where $w_i \in \mathbb{Z}$ is the weight of $B_i$ and $n \triangleq |\mathbf{B}|$. Given a constant threshold $\tau \in \mathbb{Z}$, S3P computes the probability that a subset of $\mathbf{B}$ has sum of weights of non-zero variables at least $\tau$. Formally,

$$S(\mathbf{B}, \tau) \triangleq \Pr\Big[\sum_i w_i B_i \ge \tau\Big] \in [0, 1].$$

Aligning with the terminology of stochastic satisfiability (SSAT) (Littman, Majercik, and Pitassi 2001), we categorize the variables $\mathbf{B}$ into two types: (i) chance variables that are stochastic and have an associated probability of being assigned to 1, and (ii) choice variables whose assignment we optimize while computing $S(\mathbf{B}, \tau)$. To specify the category of variables, we consider a quantifier $q_i \in \{\mathsf{R}^{p_i}, \exists, \forall\}$ for each $B_i$. Elaborately, $\mathsf{R}^{p}$ is a randomized quantifier corresponding to a chance variable $B \in \mathbf{B}$, where $p \triangleq \Pr[B = 1]$. In contrast, $\exists$ is an existential quantifier corresponding to a choice variable $B$ such that the Boolean assignment of $B$ maximizes $S(\mathbf{B}, \tau)$. Finally, $\forall$ is a universal quantifier for a choice variable $B$ that fixes an assignment of $B$ that minimizes $S(\mathbf{B}, \tau)$.

Now, we formally present the semantics of $S(\mathbf{B}, \tau)$, provided that each variable $B_i$ has weight $w_i$ and quantifier $q_i$. Let $\mathbf{B}[2:n] \triangleq \{B_j\}_{j=2}^{n}$ be the subset of $\mathbf{B}$ without the first variable $B_1$. Then $S(\mathbf{B}, \tau)$ is recursively defined as:

$$S(\mathbf{B}, \tau) = \begin{cases} \mathbb{1}[\tau \le 0] & \text{if } \mathbf{B} = \emptyset \\ S(\mathbf{B}[2:n], \tau - \max\{w_1, 0\}) & \text{if } q_1 = \exists \\ S(\mathbf{B}[2:n], \tau - \min\{w_1, 0\}) & \text{if } q_1 = \forall \\ p_1 \, S(\mathbf{B}[2:n], \tau - w_1) + (1 - p_1) \, S(\mathbf{B}[2:n], \tau) & \text{if } q_1 = \mathsf{R}^{p_1} \end{cases} \tag{2}$$

Observe that when $\mathbf{B}$ is empty, $S$ is computed as 1 if $\tau \le 0$, and $S = 0$ otherwise. For existential and universal quantifiers, we compute $S$ based on the weight. Specifically, if $q_1 = \exists$, we decrement the threshold $\tau$ by the maximum of $w_1$ and 0: if $w_1 > 0$, $B_1$ is assigned 1, and it is assigned 0 otherwise. Therefore, by solving for an existential variable, we maximize $S$. In contrast, when $q_1 = \forall$, we fix an assignment of $B_1$ that minimizes $S$ by choosing the minimum of $w_1$ and 0. Finally, for randomized quantifiers, we decompose the computation of $S$ into two sub-problems: one sub-problem where $B_1 = 1$ and the updated threshold becomes $\tau - w_1$, and another sub-problem where $B_1 = 0$ and the updated threshold remains the same. Herein, we compute $S$ as the expected output of both sub-problems.

**Remark.** $S(\mathbf{B}, \tau)$ does not depend on the order of $\mathbf{B}$.
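The recursive semantics of Eq. (2) translates directly into code. The following sketch is a naive (exponential-time) evaluation of $S(\mathbf{B}, \tau)$; the variable encoding as (weight, quantifier, probability) triples is our own choice, and the usage line anticipates the instance $P + Q + R - S \ge 2$ of Example 3.1 below. The memoized dynamic program of Section 3.2 replaces this naive evaluation.

```python
# Quantifiers: "E" (existential), "A" (universal), "R" (randomized with probability p).
def s3p(variables, tau):
    """Recursively evaluate S(B, tau) following Eq. (2)."""
    if not variables:                 # base case: B is empty
        return 1.0 if tau <= 0 else 0.0
    (w, q, p), rest = variables[0], variables[1:]
    if q == "E":                      # existential: pick the assignment maximizing S
        return s3p(rest, tau - max(w, 0))
    if q == "A":                      # universal: pick the assignment minimizing S
        return s3p(rest, tau - min(w, 0))
    # randomized: expectation over the two assignments of the chance variable
    return p * s3p(rest, tau - w) + (1 - p) * s3p(rest, tau)

# P + Q + R - S >= 2 with existential P and Pr[Q]=0.4, Pr[R]=0.5, Pr[S]=0.3.
B = [(1, "E", None), (1, "R", 0.4), (1, "R", 0.5), (-1, "R", 0.3)]
print(s3p(B, 2))   # 0.55, the maximum probability of positive prediction
```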
**Computing the Maximum and Minimum Probability of Positive Prediction of Linear Classifiers Using S3P.** For computing $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$ of a linear classifier, we set existential quantifiers on the sensitive features $A_j$ and randomized quantifiers on the non-sensitive features $X_i$, and construct the set $\mathbf{B} = \mathbf{A} \cup \mathbf{X}$. The coefficients $w_{A_j}$ and $w_{X_i}$ of the classifier become the weights of $\mathbf{B}$. Also, we get $n = m_1 + m_2$. For the non-sensitive variables $X_i$, which are chance variables, we derive their marginal probability $p_i = \Pr[X_i = 1]$ from the distribution $\mathcal{D}$. According to the semantics of S3P, setting existential quantifiers on $\mathbf{A}$ computes the maximum value of $S(\mathbf{B}, \tau)$, which equals the maximum probability of positive prediction of the classifier. In this case, the inferred assignment of $\mathbf{A}$ implies the most favored group $\mathbf{a}_{\max} = \arg\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$. In contrast, to compute the minimum probability of positive prediction, we instead assign each variable $A_j$ a universal quantifier while keeping randomized quantifiers over $X_i$, and infer the least favored group $\mathbf{a}_{\min} = \arg\min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$.

### 3.2 A Dynamic Programming Solution for S3P

We propose a dynamic programming approach (Pisinger 1999; Woeginger and Yu 1992) to solve S3P, as the problem has overlapping sub-problems. For example, $S(\mathbf{B}, \tau)$ can be solved by solving $S(\mathbf{B}[2:n], \tau')$, where the updated threshold $\tau'$, called the residual threshold, depends on the original threshold $\tau$ and the assignment of $B_1$, as shown in Eq. (2). Building on this observation, we propose the recursion and terminating conditions leading to our dynamic programming algorithm.

**Recursion.** We consider a function $dp(i, \tau)$ that solves the sub-problem $S(\mathbf{B}[i:n], \tau)$, for $i \in \{1, \dots, n\}$. The semantics of $S(\mathbf{B}, \tau)$ in Eq. (2) induces the recursive definition of $dp(i, \tau)$ as:

$$dp(i, \tau) = \begin{cases} dp(i+1, \tau - \max\{w_i, 0\}) & \text{if } q_i = \exists \\ dp(i+1, \tau - \min\{w_i, 0\}) & \text{if } q_i = \forall \\ p_i \, dp(i+1, \tau - w_i) + (1 - p_i) \, dp(i+1, \tau) & \text{if } q_i = \mathsf{R}^{p_i} \end{cases} \tag{3}$$

Eq. (3) shows that $S(\mathbf{B}, \tau)$ can be solved by instantiating $dp(1, \tau)$, which includes all the variables in $\mathbf{B}$.

**Terminating Conditions.** Let $w_{\text{neg}}$, $w_{\text{pos}}$, and $w_{\text{all}}$ be the sum of negative, positive, and all weights of $\mathbf{B}$, respectively. We observe that $w_{\text{neg}} \le w_{\text{all}} \le w_{\text{pos}}$. Thus, for any $i$, if the residual threshold $\tau \le w_{\text{neg}}$, there is always a subset of $\mathbf{B}[i:n]$ with sum of weights at least $\tau$. Conversely, when $\tau > w_{\text{pos}}$, there is no subset of $\mathbf{B}[i:n]$ with sum of weights at least $\tau$. We leverage these bounds and tighten the terminating conditions of $dp(i, \tau)$ in Eq. (4):

$$dp(i, \tau) = \begin{cases} 1 & \text{if } \tau \le w_{\text{neg}} \\ 0 & \text{if } \tau > w_{\text{pos}} \\ \mathbb{1}[\tau \le 0] & \text{if } i = n + 1 \end{cases} \tag{4}$$

Eq. (3) and (4) together define our dynamic programming algorithm. While deploying the algorithm, we store $dp(i, \tau)$ in memory to avoid repetitive computations. This allows us to achieve a pseudo-polynomial algorithm (Lemma 1) instead of a naive exponential algorithm enumerating all possible assignments. In particular, the time complexity is pseudo-polynomial for chance (random) variables and linear for choice (existential and universal) variables.

**Lemma 1.** Let $n'$ be the number of existential and universal variables in $\mathbf{B}$. Let $w_\exists = \sum_{B_i \in \mathbf{B} \mid q_i = \exists} \max\{w_i, 0\}$ and $w_\forall = \sum_{B_i \in \mathbf{B} \mid q_i = \forall} \min\{w_i, 0\}$ be the considered sums of weights of existential and universal variables, respectively. We can exactly solve S3P using dynamic programming with time complexity $O((n - n')(\tau + |w_{\text{neg}}| - w_\exists - w_\forall) + n')$. The space complexity is $O((n - n')(\tau + |w_{\text{neg}}| - w_\exists - w_\forall))$.
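A memoized sketch of $dp(i, \tau)$ with the terminating conditions of Eq. (4) is shown below; the quantifier encoding and all names are ours, not the FVGM implementation, and the usage lines reproduce the values of Example 3.1.

```python
from functools import lru_cache

def solve_s3p(weights, quant, prob, tau):
    """Memoized dp(i, tau) for S3P (Eq. (3)) with the bounds of Eq. (4)."""
    n = len(weights)
    w_neg = sum(w for w in weights if w < 0)   # smallest achievable sum of weights
    w_pos = sum(w for w in weights if w > 0)   # largest achievable sum of weights

    @lru_cache(maxsize=None)
    def dp(i, t):
        if t <= w_neg:            # Eq. (4): every remaining subset reaches t
            return 1.0
        if t > w_pos:             # Eq. (4): no remaining subset can reach t
            return 0.0
        if i == n:                # Eq. (4): empty suffix
            return 1.0 if t <= 0 else 0.0
        w = weights[i]
        if quant[i] == "E":       # existential: maximize
            return dp(i + 1, t - max(w, 0))
        if quant[i] == "A":       # universal: minimize
            return dp(i + 1, t - min(w, 0))
        # randomized quantifier: expectation over both assignments
        return prob[i] * dp(i + 1, t - w) + (1 - prob[i]) * dp(i + 1, t)

    return dp(0, tau)

# Example 3.1 (existential P, then Q, R, S): maximum probability of positive prediction.
print(solve_s3p((1, 1, 1, -1), ("E", "R", "R", "R"), (None, 0.4, 0.5, 0.3), 2))  # 0.55
# Minimum probability: make P universal instead.
print(solve_s3p((1, 1, 1, -1), ("A", "R", "R", "R"), (None, 0.4, 0.5, 0.3), 2))  # 0.14
```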
**A Heuristic for Faster Computation.** We propose two improvements for a faster computation of the dynamic programming solution. Firstly, we observe that in Eq. (3), existential/universal variables are deterministically assigned based on their weights. Hence, we reorder $\mathbf{B}$ such that existential/universal variables appear earlier in $\mathbf{B}$ than random variables. This allows us to avoid unnecessary repeated exploration of existential/universal variables in $dp$. Moreover, according to the remark in Section 3.1, reordering $\mathbf{B}$ still produces the same exact solution of S3P. Secondly, to reach the terminating conditions of $dp(i, \tau)$ more frequently, we sort $\mathbf{B}$ based on the weights, more specifically within each cluster of random, existential, and universal variables. In particular, if $\tau \ge 0.5(w_{\text{pos}} + w_{\text{neg}})$, $\tau$ is closer to $w_{\text{pos}}$ than to $w_{\text{neg}}$. Hence, we sort each cluster in descending order of weights. Otherwise, we sort in ascending order. We illustrate our dynamic programming approach in Example 3.1.

**Example 3.1.** We consider a linear classifier $P + Q + R - S \ge 2$. Herein, $P$ is a Boolean sensitive feature, and $Q$, $R$, $S$ are Boolean non-sensitive features with $\Pr[Q] = 0.4$, $\Pr[R] = 0.5$, and $\Pr[S] = 0.3$. To compute the maximum probability of positive prediction of the classifier, we impose an existential quantifier on $P$ and randomized quantifiers on the others. This leads us to the observation that $P = 1$ is the optimal assignment as $w_P = 1 > 0$. We then require to compute $\Pr[Q + R - S \ge 1]$, which by dynamic programming is computed as 0.55. The solution is visualized as a search tree in Figure 2a, where we observe that storing the solution of sub-problems in memory avoids repetitive computation, such as exploring the node $(4, 0)$. Similarly, the minimum probability of positive prediction of the classifier is 0.14 (not shown in Figure 2a), where we impose a universal quantifier on $P$ to obtain $P = 0$ as the optimal assignment.

Figure 2: Search tree representation of S3P for computing the maximum probability of positive prediction of the classifier on variables $\mathbf{B} = \{P, Q, R, S\}$ with weights $\{1, 1, 1, -1\}$ and threshold $\tau = 2$. (a) Known marginal probabilities: $0.85\Pr[Q] + 0.35\Pr[\neg Q] = 0.55$. (b) Probabilities computed with a Bayesian network: $0.85\Pr[Q \mid P] + 0.35\Pr[\neg Q \mid P] = 0.65$. Each node is labeled by $(i, \tau')$, where $i$ is the index in $\mathbf{B}$ and $\tau'$ is the residual threshold. The tree is explored using Depth-First Search (DFS) starting with the left child. Within a node, the value at the bottom denotes $dp(i, \tau')$, which is solved recursively based on the sub-problems $dp(i+1, \cdot)$ in the child nodes. Yellow nodes denote existential variables and all other nodes are random variables. Additionally, a green node denotes a collision, in which case a previously computed $dp$ solution is returned. Leaf nodes (gray) are computed based on the terminating conditions in Eq. (4). In Figure 2b, nodes with double circles, such as $\{(1, 2), (2, 1), (2, 2)\}$, are enumerated exponentially to compute conditional probabilities from the Bayesian network.

### 3.3 S3P with Correlated Variables

In S3P as presented in Section 3.1, we consider all Boolean variables to be probabilistically independent. This independence assumption often leads to an inaccurate estimate of the probability of positive prediction of the classifier because both sensitive and non-sensitive features can be correlated in practical fairness problems. Therefore, we extend S3P to include correlations among variables. We consider a Bayesian network $\mathrm{BN} = (G, \theta)$ to represent correlated variables, where $G \triangleq (\mathbf{V}, \mathbf{E})$, $\mathbf{V} \subseteq \mathbf{B}$, $\mathbf{E} \subseteq \mathbf{V} \times \mathbf{V}$, and $\theta$ is the parameter set of the network. In BN, we constrain that there is no conditional probability of choice (i.e., existential and universal) variables, as we optimize their assignment in S3P.
Choice variables, however, can impose conditions on chance (i.e., random) variables. In practice, we achieve this by allowing no incoming edge on choice variables while learning BN (ref. Section 4). For a chance variable $B_i \in \mathbf{V}$, let $\mathrm{Pa}(B_i)$ denote its parents. According to Eq. (1), for an assignment $\mathbf{u}$ of $\mathrm{Pa}(B_i)$, BN ensures that $B_i$ is independent of its other non-descendant variables in $\mathbf{V}$. Hence, in the recursion of Eq. (3), we substitute $p_i$ with $\Pr[B_i = 1 \mid \mathrm{Pa}(B_i) = \mathbf{u}]$. In order to explicate the dependence on $\mathbf{u}$, we denote the expected solution of $S(\mathbf{B}[i:n], \tau)$ as $dp(i, \tau, \mathbf{u})$, which for $B_i \in \mathbf{V}$ is modified as follows:

$$dp(i, \tau, \mathbf{u}) = \Pr[B_i = 0 \mid \mathrm{Pa}(B_i) = \mathbf{u}] \, dp(i+1, \tau, \mathbf{u} \cup \{0\}) + \Pr[B_i = 1 \mid \mathrm{Pa}(B_i) = \mathbf{u}] \, dp(i+1, \tau - w_i, \mathbf{u} \cup \{1\}).$$

Since $dp(i, \tau, \mathbf{u})$ involves $\mathbf{u}$, we initially perform a topological sort of $\mathbf{V}$ so that the assignment of parents is known before computing $dp$ on the child. Moreover, there are $2^{|\mathrm{Pa}(B_i)|}$ assignments of $\mathrm{Pa}(B_i)$, and we compute $dp(i, \tau, \mathbf{u})$ for $\mathbf{u} \in \{0, 1\}^{|\mathrm{Pa}(B_i)|}$ to incorporate all conditional probabilities into S3P. For this enumeration, we do not store $dp(i, \tau, \mathbf{u})$ in memory. However, for $B_i \in \mathbf{B} \setminus \mathbf{V}$, i.e., a variable that does not appear in the network, we instead compute $dp(i, \tau)$ and store it in memory as in Section 3.2, because $B_i$ is not correlated with other variables. Lemma 2 presents the complexity of solving S3P with correlated variables, wherein, unlike Lemma 1, the complexity differentiates between variables in $\mathbf{V}$ (exponential) and in $\mathbf{B} \setminus \mathbf{V}$ (pseudo-polynomial).

**Lemma 2.** Let $\mathbf{V} \subseteq \mathbf{B}$ be the set of vertices in the Bayesian network and $n'$ be the number of existential and universal variables in $\mathbf{B} \setminus \mathbf{V}$. Let $w_\exists = \sum_{B_i \in \mathbf{B} \setminus \mathbf{V} \mid q_i = \exists} \max\{w_i, 0\}$ and $w_\forall = \sum_{B_i \in \mathbf{B} \setminus \mathbf{V} \mid q_i = \forall} \min\{w_i, 0\}$ be the sums of considered weights of existential and universal variables, respectively, that appear only in $\mathbf{B} \setminus \mathbf{V}$. To exactly compute S3P with correlated variables in the dynamic programming approach, the time complexity is $O(2^{|\mathbf{V}|} + (n - n' - |\mathbf{V}|)(\tau + |w_{\text{neg}}| - w_\exists - w_\forall) + n')$ and the space complexity is $O((n - n' - |\mathbf{V}|)(\tau + |w_{\text{neg}}| - w_\exists - w_\forall))$.

**A Heuristic for Faster Computation.** We observe that to encode conditional probabilities, we enumerate all assignments of variables in $\mathbf{V}$. For computing the probability of positive prediction of a linear classifier with correlated features, we consider a heuristic to sort the variables in $\mathbf{B} = \mathbf{A} \cup \mathbf{X}$. Let $\mathbf{V} \subseteq \mathbf{B}$ be the set of vertices in the network and $\mathbf{V}_c = \mathbf{B} \setminus \mathbf{V}$. In this heuristic, we sort the sensitive variables $\mathbf{A}$ by positioning $\mathbf{A} \cap \mathbf{V}$ in the beginning, followed by $\mathbf{A} \cap \mathbf{V}_c$. Then we order the variables in $\mathbf{B}$ such that the variables in $\mathbf{A}$ precede those in $\mathbf{X} \cap \mathbf{V}$, and the variables in $\mathbf{X} \cap \mathbf{V}_c$ follow those in $\mathbf{X} \cap \mathbf{V}$. This ordering allows us to avoid repetitive enumeration of the variables in $\mathbf{V}$, as they are placed earlier in $\mathbf{B}$.

**Example 3.2.** We extend Example 3.1 with a Bayesian network $(G, \theta)$ with $\mathbf{V} = \{P, Q\}$ and $\mathbf{E} = \{(P, Q)\}$. The parameters $\theta$ imply the conditional probabilities $\Pr[Q \mid P] = 0.6$ and $\Pr[Q \mid \neg P] = 0.3$. In Figure 2b, we enumerate all assignments of $P$ and $Q$ to incorporate all conditional probabilities of $Q$ given $P$. We observe, however, that the dynamic programming solution of Section 3.2 still prunes the search space for variables that do not appear in $\mathbf{V}$, such as $\{R, S\}$. Hence, following the calculation in Figure 2b, we obtain the maximum probability of positive prediction of the classifier as 0.65 for $P = 1$. The minimum probability of positive prediction (not shown) is similarly calculated as 0.11 for $P = 0$.
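A simplified sketch of the correlated-variable recursion $dp(i, \tau, \mathbf{u})$ follows. It assumes that variables are already topologically ordered with parents before children, records the assignment of choice variables so that children can condition on them, and, as in the paper, does not memoize network variables. The data-structure choices (parent index tuples, CPT dictionaries) are our own; the usage lines reproduce Example 3.2.

```python
def solve_s3p_bn(weights, quant, parents, cpt, tau):
    """dp(i, tau, u) with Bayesian-network conditionals; no memoization, for brevity."""
    n = len(weights)

    def dp(i, t, assign):
        if i == n:
            return 1.0 if t <= 0 else 0.0
        w = weights[i]
        if quant[i] == "E":                       # existential choice variable
            b = 1 if w > 0 else 0                 # assignment that maximizes S
            return dp(i + 1, t - max(w, 0), {**assign, i: b})
        if quant[i] == "A":                       # universal choice variable
            b = 1 if w < 0 else 0                 # assignment that minimizes S
            return dp(i + 1, t - min(w, 0), {**assign, i: b})
        u = tuple(assign[j] for j in parents[i])  # parents already assigned (topological order)
        p1 = cpt[i][u]                            # Pr[B_i = 1 | Pa(B_i) = u]
        return (p1 * dp(i + 1, t - w, {**assign, i: 1})
                + (1 - p1) * dp(i + 1, t, {**assign, i: 0}))

    return dp(0, tau, {})

# Example 3.2: edge P -> Q with Pr[Q|P] = 0.6, Pr[Q|not P] = 0.3; R, S independent.
weights = [1, 1, 1, -1]                   # P, Q, R, S
quant = ["E", "R", "R", "R"]
parents = [(), (0,), (), ()]
cpt = {1: {(1,): 0.6, (0,): 0.3}, 2: {(): 0.5}, 3: {(): 0.3}}
print(solve_s3p_bn(weights, quant, parents, cpt, 2))   # 0.65 for P = 1
```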
### 3.4 Fairness Verification with Computed Probability of Positive Prediction

Given a classifier $\mathcal{M}$, a distribution $\mathcal{D}$, and a fairness metric $f$, verifying whether the classifier is $\epsilon$-fair ($\epsilon \in [0, 1]$) is equivalent to computing $\mathbb{1}[f(\mathcal{M}|\mathcal{D}) \le \epsilon]$. We now compute $f(\mathcal{M}|\mathcal{D})$ based on the maximum probability of positive prediction $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$ and the minimum probability of positive prediction $\min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$ of the classifier. For measuring the fairness metric SP, we compute the difference $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}] - \min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$. We deploy FVGM twice while measuring EO: once for the distribution $\mathcal{D}$ conditioned on $Y = 1$ and once for $Y = 0$. In each case, we compute $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, Y = y] - \min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, Y = y]$ for $y \in \{0, 1\}$ and take the maximum difference as the value of EO. For measuring the causal metric PCF, we compute $\max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, \mathbf{Z}]$ and $\min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, \mathbf{Z}]$ conditioned on the mediator features $\mathbf{Z}$ and take their difference. To measure DI, we compute the ratio $\min_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}] / \max_{\mathbf{a}} \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}]$. In contrast to the other fairness metrics, DI closer to 1 indicates a higher fairness level. Thus, we verify whether a classifier achieves $(1-\epsilon)$-DI by checking $\mathbb{1}[\mathrm{DI}(\mathcal{M}|\mathcal{D}) \ge 1 - \epsilon]$.

### 3.5 Extension to Practical Settings

For verifying linear classifiers with real-valued features and coefficients, we preprocess them so that FVGM can be invoked. Let $X_c \in \mathbb{R}$ be a continuous real-valued feature with coefficient $w_c \in \mathbb{R}$ in the classifier. We discretize $X_c$ into a set $\mathbf{B}_c$ of $k$ Boolean variables using binning-based discretization and assign a Boolean variable to each bin. Hence, $B_i \in \mathbf{B}_c$ becomes 1 when $X_c$ belongs to the $i$-th bin. Let $\mu_i$ denote the mean of the feature-values within the $i$-th bin. We then set the coefficient of $B_i$ to $w_c \mu_i$. By the law of large numbers, $X_c \approx \sum_i \mu_i B_i$ for infinitely many bins (Grimmett and Stirzaker 2020). Finally, we multiply the coefficients of the discretized variables by $l \in \mathbb{N} \setminus \{0\}$ and round to an integer. The accuracy of the preprocessing step relies on the number of bins $k$ and the multiplier $l$. Therefore, we empirically fine-tune both $k$ and $l$ by comparing the processed classifier with the initial classifier on a validation dataset.
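A possible realization of this binning-based preprocessing is sketched below, assuming equal-width bins; the helper name and the defaults for $k$ and $l$ are hypothetical, and FVGM tunes both by grid search (Section 4).

```python
import numpy as np

def discretize_feature(x, w_c, k=10, l=100):
    """Turn a continuous feature (coefficient w_c) into k Boolean bin indicators
    with integer coefficients round(l * w_c * mu_i), as in Sec. 3.5."""
    edges = np.linspace(x.min(), x.max(), k + 1)
    bin_idx = np.clip(np.digitize(x, edges[1:-1]), 0, k - 1)   # bin index per sample
    booleans = np.eye(k, dtype=int)[bin_idx]                   # one-hot bin membership
    means = np.array([x[bin_idx == i].mean() if np.any(bin_idx == i) else 0.0
                      for i in range(k)])                      # mu_i: mean value in bin i
    int_weights = np.rint(l * w_c * means).astype(int)
    return booleans, int_weights

# Usage on a hypothetical feature with coefficient 2.5:
x = np.random.default_rng(0).normal(0.5, 0.1, 1000)
bools, ws = discretize_feature(x, w_c=2.5, k=5)
print(bools.shape, ws)   # (1000, 5) and one integer weight per bin
```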
## 4 Empirical Performance Analysis

In this section, we empirically evaluate the performance of FVGM. We first present the experimental setup and the objectives of our experiments, followed by the experimental results.

**Experimental Setup.** We implement a prototype of FVGM in Python (version 3.8). We deploy the Scikit-learn library for learning linear classifiers such as Logistic Regression (LR) and Support Vector Machine (SVM) with linear kernels. We perform five-fold cross-validation on each dataset. While the classifier is trained on continuous features, we discretize them to Boolean features to be invoked by FVGM. During discretization, we apply a grid-search to estimate the best bin-size within a maximum of 10 bins. To convert the coefficients of the features into integers, we employ another grid-search to choose the best multiplier within $\{1, 2, \dots, 100\}$. For learning a Bayesian network on the converted Boolean data, we deploy the pgmpy library (Ankan and Panda 2015). For network learning, we apply a hill-climbing search algorithm that learns a DAG structure by optimizing the K2 score (Koller and Friedman 2009). For estimating the parameters of the network, we use the Maximum Likelihood Estimation (MLE) algorithm. We compare FVGM with three existing fairness verifiers: Justicia (Ghosh, Basu, and Meel 2021), FairSquare (Albarghouthi et al. 2017), and VeriFair (Bastani, Zhang, and Solar-Lezama 2019).

### 4.1 Scalability Analysis

**Benchmarks.** We perform the scalability analysis on five real-world datasets studied in the fair ML literature: UCI Adult, German-credit (Doshi-Velez and Kim 2017), COMPAS (Angwin et al. 2016), Ricci (McGinley 2010), and Titanic (https://www.kaggle.com/c/titanic). We consider 100 benchmarks generated from the 5 real-world datasets and report the computation times (for DI and SP) of the different verifiers.

**Results.** In Figure 3, we present the scalability results of the different verifiers. First, we observe that FairSquare often times out (900 seconds) and can solve only 5 benchmarks. This indicates that the SMT-based reduction for linear classifiers cannot scale. Similarly, the SSAT-based verifier Justicia, which performs a pseudo-Boolean to CNF translation for linear classifiers, times out for around 20 out of 100 benchmarks. The sampling-based framework VeriFair has comparatively better scalability than the SMT/SSAT-based frameworks and can solve more than 90 benchmarks. Finally, FVGM achieves impressive scalability by solving all 100 benchmarks with 1 to 2 orders of magnitude runtime improvement over existing verifiers. Therefore, the S3P-based framework FVGM proves to be highly scalable in verifying fairness properties of linear classifiers compared to the state-of-the-art.

Figure 3: A cactus plot presenting the scalability of different fairness verifiers (one panel each for Logistic Regression and Support Vector Machine; verifiers: FVGM, FairSquare, Justicia, VeriFair). The number of solved benchmarks is on the X-axis and the required time is on the Y-axis; a point (x, y) implies that a verifier takes less than or equal to y seconds to compute fairness metrics of x many benchmarks. We consider 100 benchmarks generated from 5 real-world datasets using 5-fold cross-validation. In each fold, we consider {25, 50, 75, 100} percent of the non-sensitive features.

### 4.2 Accuracy Analysis

**Benchmark Generation.** To perform the accuracy analysis, we require the ground truth, which is not available for real-world instances. Therefore, we focus on generating synthetic benchmarks for analytically computing the ground truth of different fairness metrics, such as DI, from the known distribution of features. In each benchmark, we consider $n \in \{2, 3, 4, 5\}$ features including one Boolean sensitive feature, say $A$, generated from a Bernoulli distribution with mean 0.5. We generate non-sensitive features $X_i$ from Gaussian distributions such that $\Pr[X_i \mid A = 1] \sim \mathcal{N}(\mu_i, \sigma^2)$ and $\Pr[X_i \mid A = 0] \sim \mathcal{N}(\mu'_i, \sigma^2)$, where $\mu_i, \mu'_i \in [0, 1]$, $\sigma = 0.1$, and $\mu_i, \mu'_i$ are chosen from a uniform distribution in $[0, 1]$. Finally, we create the label $Y = \mathbb{1}[\sum_{i=1}^{n-1} X_i \ge 0.5 \sum_{i=1}^{n-1} (\mu_i + \mu'_i)]$ such that $Y$ does not directly depend on the sensitive feature. For each $n$, we generate 100 random benchmarks, learn LR and SVM classifiers on them, and compute DI² using the different verifiers.

²To analytically compute DI, let the coefficients of the classifier be $w_i$ for $X_i$ and $w_A$ for $A$, and the bias be $\tau$. Since all non-sensitive features are from Gaussian distributions, the classifier score follows $\Pr[\hat{Y} \mid A = 1] \sim \mathcal{N}(\sum_{i=1}^{n-1} w_i \mu_i, \sigma^2_{\hat{Y}})$ and $\Pr[\hat{Y} \mid A = 0] \sim \mathcal{N}(\sum_{i=1}^{n-1} w_i \mu'_i, \sigma^2_{\hat{Y}})$ with $\sigma^2_{\hat{Y}} = (\sum_{i=1}^{n-1} w_i^2)\sigma^2$. Hence, the probability of positive prediction of the classifier is $1 - \mathrm{CDF}_{\hat{Y}|A=1}(\tau - w_A)$ for $A = 1$ and $1 - \mathrm{CDF}_{\hat{Y}|A=0}(\tau)$ for $A = 0$, where CDF is the cumulative distribution function. Finally, we compute DI by taking the ratio of the minimum and the maximum of the probability of positive prediction of the classifier.
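The synthetic benchmark and the analytic (Exact) DI of footnote 2 can be sketched as follows; the seed, sample size, and the use of scikit-learn's LogisticRegression are our own choices, made only to illustrate the construction.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, sigma, samples = 4, 0.1, 10_000
mu1, mu0 = rng.uniform(0, 1, n - 1), rng.uniform(0, 1, n - 1)    # mu_i and mu'_i

A = rng.binomial(1, 0.5, samples)
X = np.where(A[:, None] == 1, mu1, mu0) + sigma * rng.standard_normal((samples, n - 1))
Y = (X.sum(axis=1) >= 0.5 * (mu1 + mu0).sum()).astype(int)        # label, independent of A

clf = LogisticRegression().fit(np.column_stack([X, A]), Y)
w, w_A, tau = clf.coef_[0, :-1], clf.coef_[0, -1], -clf.intercept_[0]

# Analytic probability of positive prediction per sensitive group (footnote 2).
sigma_score = np.sqrt((w ** 2).sum()) * sigma
p1 = 1 - norm.cdf(tau - w_A, loc=w @ mu1, scale=sigma_score)      # A = 1
p0 = 1 - norm.cdf(tau, loc=w @ mu0, scale=sigma_score)            # A = 0
print("Exact DI:", min(p1, p0) / max(p1, p0))
```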
**Results.** We assess the accuracy of the competing verifiers in estimating fairness metrics, specifically DI with LR and SVM classifiers. In Figure 4, FVGM computes DI closest to the Exact value for different numbers of features and both types of classifiers. In contrast, Justicia, FairSquare, and VeriFair measure DI far from the Exact value because they ignore correlations among the features. For example, for the SVM classifier with $n = 5$ (right plot in Figure 4), the Exact DI is 0.089 (averaged over 100 random benchmarks). Here, FVGM computes DI as 0.094, while all other verifiers compute DI as at least 0.233. Therefore, FVGM is more accurate than existing verifiers as it explicitly considers correlations among features.

Figure 4: Comparing the average accuracy of different verifiers over 100 synthetic benchmarks while varying the number of features (2 to 5), for Logistic Regression and Support Vector Machine. FVGM yields the closest estimation of the analytically calculated Exact values of DI for LR and SVM classifiers.

## 5 Applications of FVGM

In this section, we apply FVGM to verify fairness-enhancing algorithms and fairness-depreciating attacks. We also demonstrate that FVGM facilitates the computation of fairness influence functions by enabling the detection of bias due to individual features.

| Dataset | A | Algo. | DI | PCF | SP | EO |
|---|---|---|---|---|---|---|
| Adult | race | RW | 0.53 | -0.06 | -0.06 | -0.02 |
| Adult | race | OP | 0.57 | -0.07 | -0.07 | -0.02 |
| Adult | sex | RW | 0.96 | -0.16 | -0.15 | -0.08 |
| Adult | sex | OP | 0.43 | -0.08 | -0.08 | 0.03 |
| COMPAS | race | RW | 0.13 | -0.07 | -0.07 | -0.06 |
| COMPAS | race | OP | 0.15 | -0.08 | -0.08 | -0.05 |
| COMPAS | sex | RW | 0.1 | -0.04 | -0.04 | 0.04 |
| COMPAS | sex | OP | 0.09 | -0.04 | -0.04 | -0.03 |
| German | age | RW | 0.52 | -0.53 | -0.52 | -0.47 |
| German | age | OP | 0.53 | -0.53 | -0.53 | -0.51 |
| German | sex | RW | 0.06 | 0.06 | 0.06 | 0.02 |
| German | sex | OP | 0.12 | 0.12 | 0.12 | 0.07 |

Table 1: Verification of fairness algorithms using FVGM. A denotes the sensitive feature. RW and OP refer to the reweighing and optimized-preprocessing algorithms. Numbers in bold refer to fairness improvement.

**Verifying Fairness-enhancing Algorithms.** We deploy FVGM in verifying the effectiveness of fairness-enhancing algorithms designed to ameliorate bias. For example, fairness pre-processing algorithms can be validated by applying FVGM on the unprocessed and the processed data separately and comparing different fairness metrics. In Table 1, we report the effect of fairness algorithms w.r.t. four fairness metrics: disparate impact (DI), path-specific causal fairness (PCF), statistical parity (SP), and equalized odds (EO). Note that fairness is improved if DI increases and the rest of the metrics decrease. For instance, in most instances for the Adult dataset, the reweighing (RW) (Kamiran and Calders 2012) and optimized pre-processing (OP) (Calmon et al. 2017) algorithms are successful in improving fairness.
The exceptional case is the unfairness regarding the sensitive feature "sex", where the OP algorithm fails to improve the fairness metric EO. Thus, FVGM verifies the enhancement and decrement of fairness by fairness-enhancing algorithms.

**Verifying Fairness Attacks.** We apply FVGM in verifying a fairness poisoning-attack algorithm. This algorithm injects a small fraction of poisoned samples into the training data such that the classifier becomes relatively unfair (Solans, Biggio, and Castillo 2020). We apply this attack to add $\{1, 5, \dots, 160\}$ poisoned samples and measure the corresponding disparate impact and statistical parity. In Figure 5, FVGM verifies that the disparate impact of the classifier decreases and the statistical parity increases, i.e., the classifier becomes more unfair, as the number of poisoned samples increases. Therefore, FVGM shows the potential of being deployed in safety-critical applications to detect fairness attacks. For example, if we set 0.9 as an acceptable threshold of disparate impact, FVGM can raise an alarm once 160 poisoned samples are added.

Figure 5: Verifying a poisoning attack against fairness using FVGM: disparate impact and statistical parity as the number of poisoned samples grows from 1 to 160. The red line denotes the safety margin of the ML model against the attack.

**Fairness Influence Function (FIF): Tracing Sources of Unfairness.** Another application of FVGM as a fairness verifier is to quantify the effect of a subset of features on fairness. Thus, we define the fairness influence function (FIF), which computes the effect of a subset of non-sensitive features $\mathbf{S} \subseteq \mathbf{X}$ on the probability of positive prediction of a classifier given a specific sensitive group $\mathbf{A} = \mathbf{a}$:

$$\mathrm{FIF}(\mathbf{S}) \triangleq \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, \mathcal{D}] - \Pr[\hat{Y} = 1 \mid \mathbf{A} = \mathbf{a}, \mathcal{D}_{-\mathbf{S}}].$$

FIF allows us to explain the sources of unfairness in the classifier. In practice, we compute the FIF of $\mathbf{S}$ by replacing the probability distribution of $\mathbf{S}$ with a uniformly random distribution, referred to as $\mathcal{D}_{-\mathbf{S}}$, and reporting the difference in the conditional probability of positive prediction of the classifier.

Figure 6: FIF computation for the COMPAS dataset, one panel per sensitive group (features shown include age, priors_count, juv_misd_count, juv_fel_count, juv_other_count, and c_charge_degree). For Female (left plot) and Male, the influence of age decreases and the probability of positive prediction of the classifier increases by different magnitudes.

In Figure 6, we compute the FIF for all features in the COMPAS dataset, separately for two sensitive groups: Female ("sex" = 1) and Male. This dataset concerns the likelihood of a person re-offending within the next two years. We first observe that the base values of the probability of positive prediction are different for the two groups (0.46 vs. 0.61 for Female and Male), thereby showing Male as more probable to re-offend than Female. Moreover, the FIF of the feature "age" is comparatively higher in magnitude for Male than for Female. This implies that, while deciding recidivism, the algorithm assumes that Female individuals across different ages re-offend with almost the same probability, whereas the probability of re-offending for Male individuals highly depends on age. Thus, applying FVGM to compute FIF provides insights about the sources of bias and, thus, indicators to improve fairness.
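For intuition, a minimal FIF-style computation on the toy classifier of Example 3.1 is sketched below. It assumes independent features for brevity (FVGM itself uses the Bayesian network and the S3P solver), and all weights, probabilities, and the threshold are illustrative.

```python
from itertools import product

weights = [1, 1, 1, -1]                  # sensitive feature A, then X1, X2, X3
prob    = [None, 0.4, 0.5, 0.3]          # None: A is fixed to its favored value below
tau     = 2

def ppv(p, a=1):
    """Pr[sum(w * b) >= tau | A = a] by brute force over the non-sensitive bits."""
    total = 0.0
    for bits in product([0, 1], repeat=len(weights) - 1):
        weight_sum = weights[0] * a + sum(w * b for w, b in zip(weights[1:], bits))
        pr = 1.0
        for b, pi in zip(bits, p[1:]):
            pr *= pi if b else 1 - pi
        total += pr * (weight_sum >= tau)
    return total

def fif(subset):
    """Influence of the features in `subset`: replace their distribution by uniform."""
    uniform = [0.5 if i in subset else pi for i, pi in enumerate(prob)]
    return ppv(prob) - ppv(uniform)

print(ppv(prob))     # 0.55, the PPV of the most favored group (A = 1)
print(fif({1}))      # influence of X1: 0.55 - 0.60 = -0.05
```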
## 6 Conclusion

In this paper, we propose FVGM, an efficient fairness verification framework for linear classifiers based on a novel stochastic subset-sum problem. FVGM encodes a graphical model of feature correlations, represented as a Bayesian network, and computes multiple group and causal fairness metrics accurately. We experimentally demonstrate that FVGM is not only more accurate and scalable than the existing verifiers but also applicable to practical fairness tasks, such as verifying fairness attacks and fairness-enhancing algorithms, and computing fairness influence functions. As future work, we aim to design fairness-enhancing algorithms certified by fairness verifiers such as FVGM. Since FVGM serves as an accurate and scalable fairness verifier for linear classifiers, it will be interesting to design such verifiers for other ML models.

## Acknowledgments

This work was supported in part by the National Research Foundation, Singapore under its NRF Fellowship Programme [NRF-NRFFAI1-2019-0004] and AI Singapore Programme [AISG-RP-2018-005], and NUS ODPRT Grant [R-252-000685-13]. The computational work for this paper was performed on resources of the Max Planck Institute for Software Systems, Germany and the National Supercomputing Centre, Singapore (https://www.nscc.sg).

## References

Ajunwa, I.; Friedler, S.; Scheidegger, C. E.; and Venkatasubramanian, S. 2016. Hiring by algorithm: Predicting and preventing disparate impact. Available at SSRN. URL: http://sorelle.friedler.net/papers/SSRN-id2746078.pdf.

Albarghouthi, A.; D'Antoni, L.; Drews, S.; and Nori, A. V. 2017. FairSquare: Probabilistic verification of program fairness. Proceedings of the ACM on Programming Languages, 1(OOPSLA): 1–30.

Angwin, J.; Larson, J.; Mattu, S.; and Kirchner, L. 2016. Machine bias: Risk assessments in criminal sentencing. ProPublica, May, 23.

Ankan, A.; and Panda, A. 2015. pgmpy: Probabilistic graphical models using Python. In Proceedings of the 14th Python in Science Conference (SciPy 2015). Citeseer.

Bastani, O.; Zhang, X.; and Solar-Lezama, A. 2019. Probabilistic verification of fairness properties via concentration. Proceedings of the ACM on Programming Languages, 3(OOPSLA): 1–27.

Bellamy, R. K. E.; Dey, K.; Hind, M.; Hoffman, S. C.; Houde, S.; Kannan, K.; Lohia, P.; Martino, J.; Mehta, S.; Mojsilovic, A.; Nagar, S.; Ramamurthy, K. N.; Richards, J.; Saha, D.; Sattigeri, P.; Singh, M.; Varshney, K. R.; and Zhang, Y. 2018. AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias.

Berk, R. 2019. Accuracy and fairness for juvenile justice risk assessments. Journal of Empirical Legal Studies, 16(1): 175–194.

Calmon, F.; Wei, D.; Vinzamuri, B.; Ramamurthy, K. N.; and Varshney, K. R. 2017. Optimized pre-processing for discrimination prevention. In Advances in Neural Information Processing Systems, 3992–4001.

Chavira, M.; and Darwiche, A. 2008. On probabilistic inference by weighted model counting. Artificial Intelligence, 172(6-7): 772–799.

Doshi-Velez, F.; and Kim, B. 2017. Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608.

Dressel, J.; and Farid, H. 2018. The accuracy, fairness, and limits of predicting recidivism. Science Advances, 4: eaao5580.

Dwork, C.; Hardt, M.; Pitassi, T.; Reingold, O.; and Zemel, R. 2012. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, 214–226.

Feldman, M.; Friedler, S. A.; Moeller, J.; Scheidegger, C.; and Venkatasubramanian, S. 2015. Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 259–268.

Ghosh, B.; Basu, D.; and Meel, K. S. 2021. Justicia: A stochastic SAT approach to formally verify fairness. In Proceedings of AAAI.

Grimmett, G.; and Stirzaker, D. 2020.
Probability and Random Processes. Oxford University Press.

Hardt, M.; Price, E.; and Srebro, N. 2016. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems, 3315–3323.

John, P. G.; Vijaykeerthy, D.; and Saha, D. 2020. Verifying individual fairness in machine learning models. arXiv preprint arXiv:2006.11737.

Kamiran, F.; and Calders, T. 2012. Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1): 1–33.

Kleinberg, J.; and Tardos, E. 2006. Algorithm Design. Pearson Education India.

Koller, D.; and Friedman, N. 2009. Probabilistic Graphical Models: Principles and Techniques. MIT Press.

Kusner, M. J.; Loftus, J.; Russell, C.; and Silva, R. 2017. Counterfactual fairness. In Advances in Neural Information Processing Systems, 4066–4076.

Landy, F. J.; Barnes, J. L.; and Murphy, K. R. 1978. Correlates of perceived fairness and accuracy of performance evaluation. Journal of Applied Psychology, 63(6): 751.

Littman, M. L.; Majercik, S. M.; and Pitassi, T. 2001. Stochastic Boolean satisfiability. Journal of Automated Reasoning, 27(3): 251–296.

Martinez Neda, B.; Zeng, Y.; and Gago-Masague, S. 2021. Using machine learning in admissions: Reducing human and algorithmic bias in the selection process. In Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, 1323.

McGinley, A. C. 2010. Ricci v. DeStefano: A masculinities theory analysis. Harv. JL & Gender, 33: 581.

Mehrabi, N.; Morstatter, F.; Saxena, N.; Lerman, K.; and Galstyan, A. 2019. A survey on bias and fairness in machine learning. arXiv preprint arXiv:1908.09635.

Nabi, R.; and Shpitser, I. 2018. Fair inference on outcomes. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.

Pearl, J. 1985. Bayesian networks: A model of self-activated memory for evidential reasoning. In Proceedings of the 7th Conference of the Cognitive Science Society, University of California, Irvine, CA, USA, 15–17.

Pisinger, D. 1999. Linear time algorithms for knapsack problems with bounded weights. Journal of Algorithms, 33(1): 1–14.

Pleiss, G.; Raghavan, M.; Wu, F.; Kleinberg, J.; and Weinberger, K. Q. 2017. On fairness and calibration. arXiv preprint arXiv:1709.02012.

Solans, D.; Biggio, B.; and Castillo, C. 2020. Poisoning attacks on algorithmic fairness. arXiv preprint arXiv:2004.07401.

Tollenaar, N.; and Van der Heijden, P. 2013. Which method predicts recidivism best? A comparison of statistical, machine learning and data mining predictive models. Journal of the Royal Statistical Society: Series A (Statistics in Society), 176(2): 565–584.

Woeginger, G. J.; and Yu, Z. 1992. On the equal-subset-sum problem. Information Processing Letters, 42(6): 299–302.

Zafar, M. B.; Valera, I.; Rodriguez, M. G.; and Gummadi, K. P. 2017. Fairness constraints: Mechanisms for fair classification. In Artificial Intelligence and Statistics, 962–970.

Zhang, J.; and Bareinboim, E. 2018. Fairness in decision-making: The causal explanation formula. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32.

Zliobaite, I. 2015. On the relation between accuracy and fairness in binary classification. arXiv preprint arXiv:1505.05723.