# Fairness and Bias in Online Selection

José Correa¹, Andrés Cristi¹, Paul Dütting², Ashkan Norouzi-Fard²

¹Department of Industrial Engineering, Universidad de Chile, Santiago, Chile. ²Google Research, Zürich, Switzerland. Correspondence to: Andrés Cristi. *Equal contribution. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. Copyright 2021 by the author(s).

There is growing awareness and concern about fairness in machine learning and algorithm design. This is particularly true in online selection problems, where decisions are often biased, for example, when assessing credit risks or hiring staff. We address the issues of fairness and bias in online selection by introducing multi-color versions of the classic secretary and prophet problems. Interestingly, existing algorithms for these problems are either very unfair or very inefficient, so we develop optimal fair algorithms for these new problems and provide tight bounds on their competitiveness. We validate our theoretical findings on real-world data.

1. Introduction

The sharp growth in data availability that characterizes modern society challenges our processing capabilities, not only because of its massiveness, but also because of the increasingly strict social norms that society expects the algorithms processing it to satisfy. For instance, machine learning algorithms are now used to make credit and lending decisions, to estimate the success of a kidney transplant, to inform hiring decisions, and to recommend schools to pupils, among other tasks. There is therefore a well-founded concern over the use of algorithms that may violate social norms. Two such basic norms that are receiving significant attention are fairness and privacy. While a formalization of the latter is relatively well established through the notion of differential privacy (Dwork et al., 2006), the former is much less explored from an algorithmic perspective (Kearns & Roth, 2019).

In this paper we are particularly interested in the study of fairness in machine learning algorithms in the context of sequential decision making under stochastic input. Not only has the area seen many recent theoretical developments, but it also naturally encompasses many real-world decision-making processes where biased evaluations should be avoided, such as those mentioned earlier. Specifically, we consider the basic single-item selection models given by the secretary and prophet problems, in which items are classified into different groups. In our secretary model, two candidates from different groups are incomparable, while in our prophet problem the decision maker is constrained to pick from each group with a prespecified probability.

A precursor of the study of fairness in the secretary problem is the work of Buchbinder et al. (2014), who studied, among other things, an incentive compatible version of the secretary problem in which the selection probability does not depend on the arrival position of a candidate. More recently, Gupta & Salem (2020) have studied machine learning algorithms for biased versions of the secretary problem, whereas Cayci et al. (2020) study similar issues from the perspective of online learning. The term fairness has been used for various concepts in the machine learning community. We adopt here the common notion used in various previous works, e.g., Halabi et al. (2020); Celis et al. (2018a;b;c); Chierichetti et al.
(2019; 2017), where we ask that the solution obtained is balanced with respect to some sensitive attribute (e.g., race, gender).

1.1. Our Results

We consider two fundamental problems in fair online selection, both concerned with selecting a single candidate. Candidates are partitioned into different groups or colors. The candidates arrive sequentially, and upon arrival of a candidate we have to irrevocably decide whether we want to select the candidate or not.

In the first problem we consider, which we call the multi-color secretary problem, candidates arrive in uniform random order and we can rank candidates within a group, but we cannot compare candidates across groups. There is also a prior probability that the best candidate from a group is the best candidate overall. The problem models situations in which different qualities of the candidates make them largely incomparable (this could arise in some form due to gender, race, social origin, type of education, etc.). The goal is to maximize the probability with which we stop at the best overall candidate, and we compare it with that of the offline optimum. Note that here the offline optimum simply picks the best candidate from the group of largest prior probability. Thus, it is extremely unfair. One may think that the best possible online algorithm is to mimic the offline optimum, namely to select the group of largest prior probability and then run the classic secretary algorithm on that group. We prove that this is not the case, and indeed our main result is to obtain the best possible online algorithm for the problem and to establish that it satisfies very desirable fairness properties. Hence, for this variant of online selection, fairness follows as a consequence of being online optimal.

In our second problem, which we call the multi-color prophet problem, candidates have values drawn independently from given distributions and arrive in an arbitrary order. The goal is to maximize the expectation of the value of the selected candidate, while selecting from each color with probability proportional to a prescribed vector. We compete with fair opt, the optimal offline algorithm satisfying the same fairness constraint. So the underlying paradigm here is that although we can compare candidates, we understand that the scores may be biased and want to correct the selection process using the prior probabilities. Our main results are the design of fair competitive algorithms. In the most general version we prove that an approximation factor of 1/2 is best possible, while improved factors can be obtained by making natural assumptions on the prior probabilities and the arrival order.

2. The Multi-Color Secretary Problem

In the multi-color secretary problem $n$ candidates arrive in uniform random order. Candidates are partitioned into $k$ groups $C = \{C_1, \ldots, C_k\}$. We write $n = (n_1, \ldots, n_k)$ for the vector of group sizes, i.e., $|C_j| = n_j$ for all $1 \le j \le k$. We identify each of the groups with a distinct color and denote by $c(i)$ the color of candidate $i$. We can compare candidates of the same color, but we cannot compare candidates across groups. We assume comparisons are strict, and use $i \succ i'$ to denote that candidate $i$ is better than candidate $i'$. We write $\max C_j$ for the best candidate of color $j$, and $\max C$ for the best candidate overall. A natural assumption is that the best candidate from a group is the best candidate overall with equal probability $1/k$, but we can also consider the case where these probabilities are different.
We denote the probability with which the best candidate of group $j$ is the best candidate overall by $p_j$, and write $p = (p_1, \ldots, p_k)$ for the vector of these probabilities. Since candidates are incomparable across groups, this can be modeled by tossing a coin after the fact to decide whether the best candidate of group $j$ is the best candidate overall. The goal is to design an online algorithm that maximizes the probability of selecting the best candidate overall.

Even though they do not take a fairness perspective, our model is related to the ones considered by Kumar et al. (2011) and Feldman & Tennenholtz (2012). Kumar et al. (2011) study the problem of selecting a maximal secretary from a partially ordered set of candidates. Feldman & Tennenholtz (2012) study the problem of selecting candidates in parallel: each candidate is randomly assigned to a queue and candidates can only be compared with other candidates in the same queue. The objective is to select a maximal candidate. The two models treat all maximal candidates equally, so their results apply only to the case where all $p_j$'s are equal.

2.1. Key Definitions

Competitive ratio. We evaluate online algorithms by means of their competitive ratio. Consider some online algorithm ALG. The algorithm selects the best candidate overall if it selects the best candidate of a given color and this color has the best candidate overall. For an instance of the multi-color secretary problem with group sizes $n$ and probabilities $p$, we denote by $\mathrm{ALG}(n, p) \in \{1, \ldots, n\} \cup \{\phi\}$ the random index at which the algorithm stops, where $\phi$ denotes the case in which the algorithm does not stop. The success probability of ALG is
$$\mathbb{E}\big[\mathbb{1}_{\mathrm{ALG}(n,p) = \max C}\big] = \mathbb{E}\big[p_{c(\mathrm{ALG}(n,p))} \cdot \mathbb{1}_{\mathrm{ALG}(n,p) = \max C_{c(\mathrm{ALG}(n,p))}}\big].$$
We compare this to the optimal offline algorithm OPT, i.e., the best algorithm that can select a candidate after all candidates have arrived. An optimal strategy is to choose a color $j$ with maximum $p_j$, and then choose the best candidate of that color. We denote by $\mathrm{OPT}(n, p) \in \{1, \ldots, n\}$ the random index that OPT selects. The success probability of OPT is
$$\mathbb{E}\big[\mathbb{1}_{\mathrm{OPT}(n,p) = \max C}\big] = \max\{p_1, \ldots, p_k\}.$$

Definition 1 (competitive ratio). Fix $k$ and $p = (p_1, \ldots, p_k)$. An online algorithm ALG is $\beta(k, p)$-competitive if for all input lengths $n$ and partition sizes $n = (n_1, \ldots, n_k)$,
$$\frac{\mathbb{E}\big[\mathbb{1}_{\mathrm{OPT}(n,p) = \max C}\big]}{\mathbb{E}\big[\mathbb{1}_{\mathrm{ALG}(n,p) = \max C}\big]} \le \beta(k, p).$$
Note that $\beta(k, p) \ge 1$, and the smaller $\beta(k, p)$ the better the approximation guarantee.

Unbiased selection. We also examine the extent to which online or offline algorithms are biased, where ideally selection should be unbiased. One way to measure this is by quantifying how much the probability of selecting from any given color class $j$ can differ from the corresponding probability $p_j$.

Definition 2 (fairness). Fix $k$ and $p = (p_1, \ldots, p_k)$. An offline or online algorithm ALG is $\alpha(n, p)$-fair, where $\alpha(n, p) \ge 1$, if for all colors $j \in [k]$,
$$\frac{p_j}{\alpha(n, p)} \le \mathbb{P}\big(c(\mathrm{ALG}(n, p)) = j \mid \mathrm{ALG}(n, p) \ne \phi\big) \le \alpha(n, p) \cdot p_j.$$

Uniform arrival times. We model uniform random arrival order through uniform random arrival times. For this, we sample $n$ independent realizations of the Uniform[0, 1] distribution, and denote them by $\tau_1 < \tau_2 < \ldots < \tau_n$, indexed in increasing order.

2.2. Optimal Online Algorithm

We derive the optimal online algorithm (without fairness considerations), and observe that in sharp contrast to the optimal offline algorithm it is robustly fair and provides an equal treatment of equals guarantee.
2.2.1. THE ALGORITHM

We show that the optimal online algorithm is from the class of algorithms given by Algorithm 1. Algorithms from this class receive as input a vector of thresholds $t = (t_1, \ldots, t_k)$, one for each color $j \in [k]$. When a candidate $i$ arrives, the algorithm first checks whether the candidate arrived after the time threshold $t_{c(i)}$ for its color, and if it did, then it accepts the candidate if it is the best candidate of that color so far.

Algorithm 1 GROUPTHRESHOLDS(t)
Input: $t \in [0, 1]^k$, a threshold in time for each group
Output: $i \in [n]$, index of chosen candidate
/* assuming arrival times $\tau_1 < \ldots < \tau_n$ */
for $i \leftarrow 1$ to $n$ do
  if $\tau_i > t_{c(i)}$ then
    if $i \succ \max\{i' \mid \tau_{i'} < \tau_i,\ c(i') = c(i)\}$ then
      return $i$

Notice that the time-based arrival model considered in this section is equivalent to the random order arrival model and is used for the sake of simplicity of presentation and proofs. If we are given an algorithm in the time-based model (such as Algorithm 1), then we can translate it into the random arrival model by having the algorithm draw $n$ arrival times from Uniform[0, 1] and assign the $i$-th smallest arrival time to the $i$-th candidate in the input stream. If, on the other hand, we are given an algorithm in the random arrival model, then we can translate it into the time-based model by ignoring the time component and just using that the candidate that arrived at $\tau_i$ was the $i$-th candidate to arrive. Therefore any algorithm in one model can easily be used in the other model with identical properties.

2.2.2. COMPETITIVE RATIO

Surprisingly, we can show that for any probabilities $p = (p_1, \ldots, p_k)$ there exist optimal thresholds $t^* = (t^*_1, \ldots, t^*_k)$ that achieve the best competitive ratio. Later on, we show how these thresholds can be computed explicitly (see Lemma 4). Using these thresholds in Algorithm 1 results in the promised optimal online algorithm. Let us start by presenting the success probability of our algorithm for general probabilities, and then for the special case where $p = (1/k, \ldots, 1/k)$. Afterwards we provide an overview of the proof of these results.

Theorem 1 (competitive ratio, general probabilities). Fix $k$ and $p = (p_1, \ldots, p_k)$. Assume w.l.o.g. that $p_j \ge p_{j+1}$ for all $j < k$. Then there exist thresholds $t^* = (t^*_1, \ldots, t^*_k)$ with $t^*_j \le t^*_{j+1}$ for all $j < k$, which depend only on the number of colors $k$ and the probabilities $p$ but not on the number of candidates $n$ or the partition sizes $n = (n_1, \ldots, n_k)$, such that Algorithm 1 with thresholds $t^*$ succeeds with probability at least
$$\sum_{j=1}^{k} \Big(\sum_{r=1}^{j} p_r\Big) \int_{t^*_j}^{t^*_{j+1}} \frac{T^*_j}{\tau^{j}}\, d\tau, \qquad \text{where } T^*_j = \prod_{j'=1}^{j} t^*_{j'} \text{ and } t^*_{k+1} = 1.$$
For all $k$ and $p = (p_1, \ldots, p_k)$, no online algorithm can achieve a better competitive ratio in the worst case over all numbers of candidates $n$ and partition sizes $n = (n_1, \ldots, n_k)$.

For the special case where $p = (1/k, \ldots, 1/k)$ we obtain the following corollary. It shows that in this case we can set a single threshold, and it also provides a simpler-to-parse formula for the competitive ratio.

Corollary 1 (competitive ratio, equal probabilities). Fix $k$ and $p = (1/k, \ldots, 1/k)$. Then there exists a single threshold $t^*$ such that Algorithm 1 with thresholds $t = (t^*, \ldots, t^*)$ achieves a competitive ratio of $k^{\frac{1}{k-1}}$. This is $2$ for $k = 2$, $\sqrt{3}$ for $k = 3$, and $1 + O(\frac{\log k}{k})$ as $k \to \infty$. For all $k$ and $p = (1/k, \ldots, 1/k)$, no online algorithm can achieve a better competitive ratio in the worst case over all numbers of candidates $n$ and partition sizes $n = (n_1, \ldots, n_k)$.
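To make the selection rule concrete, here is a minimal Python sketch of Algorithm 1 in the time-based arrival model. It is an illustrative simulation rather than the released implementation; the instance encoding (per-candidate colors and numeric ranks standing in for the within-color order) and the helper name are our own choices.

```python
import random

def group_thresholds(arrivals, t):
    """Algorithm 1 (GROUPTHRESHOLDS): scan candidates in arrival-time order and
    return the position of the first candidate that (i) arrives after its color's
    time threshold and (ii) is the best of its color seen so far.

    arrivals: list of (tau_i, color_i, rank_i), where a larger rank means a
              better candidate and ranks are distinct within each color.
    t:        list of thresholds in [0, 1], one per color.
    Returns the chosen candidate's position in the sorted stream, or None.
    """
    best_so_far = {}  # color -> best rank observed strictly before the current candidate
    for i, (tau, color, rank) in enumerate(sorted(arrivals)):
        is_best_of_color = rank > best_so_far.get(color, float("-inf"))
        if tau > t[color] and is_best_of_color:
            return i
        best_so_far[color] = max(best_so_far.get(color, float("-inf")), rank)
    return None  # the algorithm may not stop (the case phi in the text)

# Example: two colors with sizes (3, 5) and a common threshold of 0.5 per color.
random.seed(0)
colors = [0] * 3 + [1] * 5
arrivals = [(random.random(), c, random.random()) for c in colors]
print(group_thresholds(arrivals, t=[0.5, 0.5]))
```

Estimating the success probability of Definition 1 then amounts to repeating this simulation and checking whether the returned candidate is the best of the color that happens to contain the overall best.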
We note that a bound of $k^{\frac{1}{k-1}} + O(k/n)$ for the special case where $p_j = 1/k$ for all $j$ also follows from (Kumar et al., 2011).

The main difficulty in proving Theorem 1 and Corollary 1 is that in the point-wise optimal online algorithm, which can be obtained by backward induction, thresholds depend on the number of candidates of each color that have already arrived. This dependency leads to a blow-up in algorithm complexity, and complicates the analysis of the success probability. Our high-level approach is to argue that in the worst case all $n_j$'s are large, and that in this case the point-wise optimal online algorithm is well approximated by the optimal algorithm from the class of algorithms described in Algorithm 1, which simply sets time-dependent thresholds. So we can optimize over these.

A first ingredient in our proof is Lemma 1, which shows that for the class of algorithms in Algorithm 1, for any vector of thresholds $t$ the worst case arises when all $n_j$'s are large.

Lemma 1. Fix the probabilities $p = (p_1, \ldots, p_k)$ and a vector of thresholds $t \in [0, 1]^k$. For all $j = 1, \ldots, k$, the success probability of GROUPTHRESHOLDS(t) is decreasing in $n_j$.

Our next pair of lemmas, Lemma 2 and Lemma 3, allow us to bound the success probability of the point-wise optimal online algorithm by the limit success probability of the best algorithm which sets time-dependent thresholds (Algorithm 1).

Lemma 2. Denote by $GT(p, t)$ the limit of the success probability of GROUPTHRESHOLDS(t) for a given vector of probabilities $p$. If $\min_j t_j \ge c > 0$, then the success probability of GROUPTHRESHOLDS(t) is at most $GT(p, t) + k (1 - c)^z$, where $z = \min_j n_j$.

Lemma 3. For any vector of probabilities $p$ and sizes $n$, denote by $ON(n, p)$ the optimal success guarantee of an online algorithm. Then for every $p, n$ there exists a vector $t$ such that $ON(n, p) \le GT(p, t) + o(1)$, where $\min_j t_j \ge 1/(2e)$.

The final ingredient is the following pair of lemmas, Lemma 4 and Lemma 5, which solve for the optimal time-dependent thresholds and give a formula for evaluating the limit success probability in terms of these thresholds.

Lemma 4. Consider a vector $p$ such that $p_j \ge p_{j+1}$ for all $j < k$. The optimal thresholds $t^*$ are given by
$$t^*_k = \big(1 - (k-1)\,p_k\big)^{\frac{1}{k-1}},$$
$$t^*_j = t^*_{j+1} \left( \frac{\sum_{r=1}^{j} p_r - (j-1)\,p_j}{\sum_{r=1}^{j} p_r - (j-1)\,p_{j+1}} \right)^{\frac{1}{j-1}}, \quad \text{for } 2 \le j \le k-1,$$
$$t^*_1 = t^*_2 \, e^{\frac{p_2}{p_1} - 1}.$$

Lemma 5. Consider vectors of probabilities $p$ and thresholds $t$, and assume $t_i \le t_{i+1}$ for all $i < k$. The limit success probability of GROUPTHRESHOLDS(t) is given by
$$GT(p, t) = \sum_{j=1}^{k} \Big(\sum_{r=1}^{j} p_r\Big) \int_{t_j}^{t_{j+1}} \frac{T_j}{\tau^{j}}\, d\tau, \qquad \text{where } T_j = \prod_{j'=1}^{j} t_{j'} \text{ and } t_{k+1} = 1.$$

Putting together these lemmas yields Theorem 1. Their proofs are deferred to the full version of the paper.

2.2.3. FAIRNESS

The optimal offline algorithm is 1-fair for $p = (1/k, \ldots, 1/k)$, but as soon as probabilities are unbalanced it will choose only from the colors which have maximum $p_j$. In the worst case, $|p_j - p_{j'}| < \epsilon$ for all $j, j'$, but the optimal offline algorithm is forced to choose from the unique color $j$ which has maximum $p_j$. We show that in the case where $p_j = 1/k$ for all $j$, the optimal online algorithm is not exactly 1-fair, but approaches 1-fairness exponentially fast in the minimum group size $\min_j n_j$.

Theorem 2 (fairness result, equal probabilities). For any $k$ and $p = (1/k, \ldots, 1/k)$, Algorithm 1 with the optimal single threshold $t^*$ is $1 + O\big(k^2 (1 - \frac{1}{e})^{\min_j n_j}\big)$-fair.

Moreover, we show that the optimal online algorithm is robust and degrades gracefully as we move away from perfectly balanced probabilities; Theorem 3 below makes this precise.
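As an illustration, the optimal thresholds of Lemma 4 can be computed directly from the prior vector p. The following is a minimal sketch based on the formulas above; the function name, the 0-based indexing, and the k = 1 fallback to the classic 1/e secretary threshold are our own choices, not part of the released code.

```python
import math

def optimal_thresholds(p):
    """Compute the optimal thresholds t* of Lemma 4 for a prior vector p with
    p[0] >= p[1] >= ... >= p[k-1]. Indices are 0-based, so t[j] corresponds to
    t*_{j+1} in the text."""
    k = len(p)
    if k == 1:
        return [1 / math.e]  # classic secretary threshold (Lemma 4 assumes k >= 2)
    t = [0.0] * k
    t[k - 1] = (1 - (k - 1) * p[k - 1]) ** (1 / (k - 1))
    for j in range(k - 2, 0, -1):  # corresponds to t*_{k-1}, ..., t*_2 in the text
        prefix = sum(p[: j + 1])   # p_1 + ... + p_{j+1} in the text's indexing
        ratio = (prefix - j * p[j]) / (prefix - j * p[j + 1])
        t[j] = t[j + 1] * ratio ** (1 / j)
    t[0] = t[1] * math.exp(p[1] / p[0] - 1)
    return t

# For equal priors the thresholds coincide, e.g. p = (1/3, 1/3, 1/3) gives
# t* = (1/3)^(1/2) ~= 0.577 for every color, matching Corollary 1.
print(optimal_thresholds([1 / 3, 1 / 3, 1 / 3]))
```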
Theorem 3 (fairness result, general probabilities). Fix $k$ and $p = (p_1, \ldots, p_k)$. Algorithm 1 with the optimal choice of thresholds $t^* = (t^*_1, \ldots, t^*_k)$ ensures that if $p_j = p_{j'}$ then $t^*_j = t^*_{j'}$. Moreover, $t^*$ is a continuous function of $p$. So if $p_j$ and $p_{j'}$ are close, so are $t^*_j$ and $t^*_{j'}$, and so is the probability of selection. More precisely, if $p_j > p_{j'} > (1 - \epsilon)\,p_j$, then $t^*_{j'} > t^*_j > (1 - \epsilon)\,t^*_{j'}$, and furthermore,
$$0 < \mathbb{P}\big(\text{GROUPTHRESHOLDS}(t^*) \text{ selects color } j\big) - \mathbb{P}\big(\text{GROUPTHRESHOLDS}(t^*) \text{ selects color } j'\big) < \epsilon.$$

To exemplify the conclusion of the last theorem, consider that we have two colors, say men and women, and that the prior is such that the top candidate is a woman with probability 60% and a man with probability 40%. This translates into having $\epsilon = 1/3$ in the statement of the theorem, which implies that the algorithm will pick a woman at most 33% more often than a man. See Section 4.1 for more examples and empirical validations of these results.

Both Theorem 2 and Theorem 3 follow from analyzing the optimal thresholds in the respective cases, as established in Lemma 4. The complete proofs are deferred to the full version of the paper.

3. The Multi-Color Prophet Problem

We next consider the following multi-color prophet problem. In this model $n$ candidates arrive in uniform random order. Candidates are partitioned into $k$ groups $C = \{C_1, \ldots, C_k\}$. We write $n = (n_1, \ldots, n_k)$ for the vector of group sizes, i.e., $|C_j| = n_j$ for all $1 \le j \le k$. We identify each of the groups with a distinct color and let $c(i)$ and $v_i$ denote the color and value of candidate $i$, respectively. The value $v_i$ is revealed upon arrival of $i$, and is drawn independently from a given distribution $F_i$. We use $F = (F_1, \ldots, F_n)$ to refer to the vector of distributions. We are also given a probability vector $p = (p_1, \ldots, p_k)$. The goal is to select a candidate in an online manner in order to maximize the expectation of the value of the selected candidate, while selecting from each color with probability proportional to $p$. We distinguish between the basic setting, in which $p_j$ is the proportion of candidates that belong to group $j$, i.e., $p_j = n_j/n$, and the general setting, in which $p$ is arbitrary. We compare ourselves with the fair optimum, the optimal offline algorithm that respects the $p_j$'s.

3.1. Key Definitions

Fair optimum. We define FAIROPT(n, C, F, p) as the optimal offline algorithm that selects a candidate of group $j$ with probability $p_j$ for all $j$, and we write $\mathbb{E}[\text{FAIROPT}(n, C, F, p)]$ for the expected value it achieves. More precisely, among the class of randomized rules that can observe the realizations of all values and that choose a candidate from each color $j$ with probability $p_j$, FAIROPT(n, C, F, p) is the one that maximizes the expectation of the value of the selected candidate. Intuitively, one can think of FAIROPT as the limit of the following experiment. We draw $m$ times, with $m \gg 1$, an independent sample of the vector $(v_1, \ldots, v_n)$, so we obtain $\{(v_{i,s})_{i=1}^{n}\}_{s=1}^{m}$. In each of the vectors we select a candidate $i^*(s)$ so that $\frac{1}{m} \sum_{s=1}^{m} v_{i^*(s),s}$ is maximized and $i^*(s)$ belongs to color $j$ in $m \cdot p_j$ of the vectors.

Ex-ante relaxation. We denote by $q_i$ the probability with which FAIROPT(n, C, F, p) selects candidate $i$. Using these probabilities we can obtain the following upper bound on the performance of FAIROPT, which is known in the prophets literature as the ex-ante relaxation (e.g. Feldman et al., 2016),
$$\text{EXANTE}(n, C, F, p) = \sum_{i=1}^{n} q_i \, \mathbb{E}\big[v_i \mid v_i \ge F_i^{-1}(1 - q_i)\big].$$
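To make the ex-ante bound concrete, the following is a minimal sketch that evaluates EXANTE by Monte Carlo, assuming we are handed the marginals q_i together with sampling and quantile access to each F_i. The function name, the interface, and the sampling-based estimate are our own illustration, not the paper's implementation.

```python
import random

def ex_ante_upper_bound(samplers, quantiles, q, num_samples=100_000):
    """Estimate EXANTE = sum_i q_i * E[v_i | v_i >= F_i^{-1}(1 - q_i)].

    samplers:  list of callables, samplers[i]() draws a sample from F_i.
    quantiles: list of callables, quantiles[i](u) returns F_i^{-1}(u).
    q:         marginal selection probabilities of FAIROPT (summing to at most 1).
    """
    total = 0.0
    for sampler, quantile, qi in zip(samplers, quantiles, q):
        if qi == 0:
            continue
        threshold = quantile(1 - qi)  # F_i^{-1}(1 - q_i)
        tail = [v for v in (sampler() for _ in range(num_samples)) if v >= threshold]
        if tail:  # average of the top q_i tail of F_i approximates the conditional expectation
            total += qi * sum(tail) / len(tail)
    return total

# Example: n i.i.d. Uniform[0, 1] values with q_i = 1/n. For this instance the
# bound evaluates to roughly 1 - 1/(2n), i.e. about 0.9 for n = 5.
n = 5
print(ex_ante_upper_bound(
    samplers=[random.random] * n,
    quantiles=[lambda u: u] * n,
    q=[1 / n] * n,
))
```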
Fair selection. We say that an online algorithm ALG is fair if it selects a candidate of each color $j$ with probability proportional to $p_j$.

Definition 3 (fair online algorithm). We say that an online algorithm ALG is fair if
$$\mathbb{P}\big(c(\text{ALG}) = j \mid \text{ALG stops}\big) = p_j \quad \text{for all } 1 \le j \le k.$$
Note that this is analogous to being 1-fair in Definition 2.

Approximation ratio. Our goal is to find the fair online algorithm FAIRALG with the best possible approximation ratio with respect to FAIROPT. To formally define this, let $\mathbb{E}[\text{FAIRALG}(n, C, F, p)]$ denote the expected value achieved by FAIRALG.

Definition 4 (approximation ratio). We say that online algorithm FAIRALG provides an $\alpha$-approximation if
$$\sup_{n, C, F, p} \frac{\mathbb{E}[\text{FAIROPT}(n, C, F, p)]}{\mathbb{E}[\text{FAIRALG}(n, C, F, p)]} \le \alpha.$$
Note that the smaller $\alpha \ge 1$, the better. Specifically, if $\alpha = 1$, then the expected value achieved by the fair online algorithm matches that of the fair offline algorithm.

3.2. Optimal Online Algorithms

We develop optimal fair online algorithms with surprisingly small approximation ratios under different assumptions on the setting. In the first setting (Section 3.2.1) we consider an arbitrary fixed order of the candidates, non-identical distributions, and general probabilities $p$. In the second, we assume that values are i.i.d., and make the natural additional assumption that the $p_j$'s are proportional to the group sizes (Section 3.2.2). In the third setting we relax the i.i.d. assumption to hold only within groups, and assume that candidates arrive in uniform random order. The analysis of this setting is deferred to the full version of the paper.

Our high-level approach is the following: We design online algorithms that accept each candidate $i$ with probability $\alpha \cdot q_i$, where $q = (q_1, \ldots, q_n)$ are the marginal probabilities with which the optimal fair offline algorithm FAIROPT accepts candidates $i = 1, \ldots, n$. Note that for a fixed choice of $\alpha$ this uniquely determines thresholds $t = (t_1, \ldots, t_n)$ that we have to set for candidates $i = 1, \ldots, n$. We are still free to choose the parameter $\alpha$, and we choose it to optimize the worst-case approximation ratio. Intuitively, choosing a smaller $\alpha$ makes us accept less frequently, but conditional on stopping we choose higher values. The right trade-off between these two forces, and hence the right choice of $\alpha$, turns out to be different in each of the three settings. We find that in each of the three settings the optimal approximation ratio is equal to $1/\alpha^*$, where $\alpha^*$ is the optimal choice of $\alpha$.

3.2.1. GENERAL DISTRIBUTIONS

We start by considering the setting in which candidates arrive in any fixed order, candidate values are drawn from not-necessarily identical distributions, and the probabilities $p$ can be arbitrary. Our algorithm for this case (Algorithm 2) receives as input the probabilities $q_1, \ldots, q_n$ with which FAIROPT accepts candidates $1, \ldots, n$. It then sets thresholds so that it accepts each of the candidates with probability $q_i/2$. Note that for $i = 1$ we can achieve this by setting the threshold to $t_1 = F_1^{-1}(1 - q_1/2)$. For $i = 2$ we have to set a slightly lower threshold than $F_2^{-1}(1 - q_2/2)$, because with some probability, namely $q_1/2$, we stop at $i = 1$. Indeed, if we set the threshold to $t_2 = F_2^{-1}\big(1 - \frac{q_2/2}{1 - q_1/2}\big)$, we reach candidate $i = 2$ with probability $1 - q_1/2$, and conditional on reaching it we accept it with probability $\frac{q_2/2}{1 - q_1/2}$, so we accept it with probability exactly $q_2/2$, as desired. Continuing like this yields the thresholds used in the algorithm; a short sketch of this recursion follows, and the formal description is given as Algorithm 2.
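Concretely, the per-candidate thresholds can be computed and applied in a single online pass. The sketch below is our own illustration of this recursion, assuming quantile access to each F_i; the function name and interface are ours, not the released implementation.

```python
def fair_general_prophet(values, quantiles, q):
    """One pass of the Fair General Prophet rule (Algorithm 2).

    values:    v_1, ..., v_n revealed online, in arrival order.
    quantiles: quantiles[i](u) returns F_i^{-1}(u) for candidate i.
    q:         marginal acceptance probabilities of FAIROPT.
    Accepts candidate i with probability q_i / 2; returns its index or None.
    """
    s = 0.0  # running sum q_1 + ... + q_{i-1}
    for i, (v, quantile, qi) in enumerate(zip(values, quantiles, q)):
        # Candidate i is reached with probability 1 - s/2, so accept with
        # conditional probability (q_i/2) / (1 - s/2) to hit q_i/2 overall.
        threshold = quantile(1 - (qi / 2) / (1 - s / 2))
        if v >= threshold:
            return i
        s += qi
    return None
```

Algorithm 3 in the next subsection follows the same pattern, with $q_i = 1/n$ and target acceptance probability $2/(3n)$ in place of $q_i/2$.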
Algorithm 2 FAIR GENERAL PROPHET
Input: distributions $F_1, \ldots, F_n$ and probabilities $q_1, \ldots, q_n$
Output: $i \in [n]$, index of chosen candidate
$s \leftarrow 0$
for $i \leftarrow 1$ to $n$ do
  if $v_i \ge F_i^{-1}\big(1 - \frac{q_i/2}{1 - s/2}\big)$ then
    return $i$
  $s \leftarrow s + q_i$

We show that this algorithm is fair, and that it achieves an optimal approximation guarantee. Fairness follows quite directly from the fairness of FAIROPT, because our algorithm accepts with the same marginal probabilities, just scaled down by 1/2. For the approximation guarantee we compare the expected value collected by the online algorithm to the expected value achieved by the ex-ante relaxation, which is constrained to use the marginal probabilities $q$ of FAIROPT. Since the latter is only higher than the expected value achieved by FAIROPT, it also implies an approximation guarantee with respect to FAIROPT.

Theorem 4. For general settings and general distributions, Algorithm 2 is fair and achieves a 2-approximation to FAIROPT. No fair online algorithm can achieve a better approximation ratio.

We remark that the bound of 2 in this theorem is incomparable to the well-known factor 2 in the regular prophet inequality (Samuel-Cahn, 1984), and indeed we prove it via a substantially different technique.

3.2.2. I.I.D. DISTRIBUTIONS

We next consider the i.i.d. setting, where all values $v_i$ are independent samples from a common distribution $F$. In this case $p_j = n_j/n$ for all groups $j$ is a natural assumption, because this is the probability with which the maximum overall is from group $j$. Also note that in this case the optimal offline algorithm is fair, and chooses each element with probability $1/n$. That is, $q_i = 1/n$ for all $i$. Our Algorithm 3 tries to mimic the optimal fair offline algorithm, but aims at slightly lower marginal acceptance probabilities of $2/(3n)$. The derivation of the thresholds $t$ that achieve this follows the same logic as in our algorithm for general distributions.

Algorithm 3 FAIR IID PROPHET
Input: distribution $F$
Output: $i \in [n]$, index of chosen candidate
for $i \leftarrow 1$ to $n$ do
  if $v_i \ge F^{-1}\big(1 - \frac{2/(3n)}{1 - 2(i-1)/(3n)}\big)$ then
    return $i$

We prove that this algorithm is fair, and achieves an optimal approximation ratio of 3/2. To show fairness we again exploit that the algorithm accepts each candidate $i$ with a scaled-down version of the marginal probability $q_i = 1/n$ with which FAIROPT accepts a candidate. We establish the approximation factor via a stochastic dominance argument.

Theorem 5. For basic settings and i.i.d. distributions, Algorithm 3 is fair and achieves a 3/2-approximation to FAIROPT. No fair online algorithm can achieve a better approximation ratio.

The proofs of Theorem 4 and Theorem 5, and the discussion of the third setting, for which we prove a tight approximation ratio of $1/(2 - \sqrt{2}) \approx 1.707$, are deferred to the full version of the paper. In Section 4.2 we experimentally validate these results and compare them with other algorithms in the literature.
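As a quick sanity check of the i.i.d. rule, the following sketch runs Algorithm 3 on uniform samples and tallies how often each position of the stream is chosen; by the fairness guarantee the counts should be roughly flat across positions. The simulation setup (50 uniform values, 50,000 runs) is chosen by us to mirror the uniform-distribution experiment in Section 4.2; the function name is our own.

```python
import random
from collections import Counter

def fair_iid_prophet(values):
    """Algorithm 3 (FAIR IID PROPHET) for Uniform[0, 1] values, for which
    F^{-1}(u) = u. Accepts each candidate with probability 2/(3n)."""
    n = len(values)
    for i, v in enumerate(values):  # 0-based i corresponds to i - 1 in the text
        if v >= 1 - (2 / (3 * n)) / (1 - 2 * i / (3 * n)):
            return i
    return None  # with probability about 1/3 no candidate is accepted

counts = Counter()
random.seed(1)
for _ in range(50_000):
    pick = fair_iid_prophet([random.random() for _ in range(50)])
    if pick is not None:
        counts[pick] += 1
print([counts[i] for i in range(50)])  # roughly uniform across positions
```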
4. Empirical Evaluation

In this section we empirically validate our results in synthetic and real-world experiments.¹ We present experiments for the multi-color secretary problem in Section 4.1 and the multi-color prophet problem in Section 4.2.

¹An implementation of these experiments is available at https://github.com/google-research/google-research/tree/master/fairness_and_bias_in_online_selection.

4.1. Secretary Experiments

We compare our algorithm (Algorithm 1) with the following two baselines, which are based on the optimal solution to the classic secretary problem (e.g. Lindley, 1961; Dynkin, 1963; Ferguson, 1989):

1. Secretary algorithm (SA): This algorithm first computes the maximum value in the first 1/e-fraction of the stream, and then picks any element with higher value afterwards. This algorithm does not consider the colors of elements.

2. Single-color secretary algorithm (SCSA): This algorithm first picks a color proportional to the p values, and then runs the secretary algorithm on the elements of that color. This algorithm does not consider the elements whose color is different from the chosen one.

Figure 1. In this plot, we compare our fair secretary algorithm with the secretary algorithm (SA) and the single-color secretary algorithm (SCSA) on (a) the synthetic dataset with equal p values, (b) the synthetic dataset with general p values, (c) the feedback maximization dataset, and (d) the influence maximization dataset. Here Input is the number of elements from each color in the input, and F-Pick and F-Max are the number of elements picked by our fair secretary algorithm and the number of them that are the maximum among the elements of that color. Similarly, U-Pick (S-Pick) and U-Max (S-Max) are the number of elements picked by SA (SCSA) and the number of them that are the maximum among the elements of that color.

For all the experiments in this section, we run all the algorithms 20,000 times. We report i) the number of times that the algorithm selects an element from each of the colors, and ii) the number of times the selected element has the highest value in its color.

Synthetic dataset, equal p values. In this experiment, we create a synthetic dataset as follows. There are four colors with 10, 100, 1000, and 10000 occurrences. The value of each element is chosen independently and uniformly at random from [0, 1], so the p values are the same for all the colors, i.e., p = (1/4, 1/4, 1/4, 1/4). In Figure 1 (a), we present the results for this dataset. We observe that our algorithm and SCSA pick an almost equal number of times from each color,² while SA picks almost only from the fourth color. Therefore both our algorithm and SCSA are fair, while SA does not satisfy the fairness expectations. We also observe that the number of elements picked by our algorithm is 1.305 times higher than for SCSA (+30.5%), and it picks the maximum element of the color, and hence the best element overall, 1.721 times more often than SCSA (+73.1%). Therefore the quality of the solution of our algorithm is significantly higher than that of SCSA.

²The slight difference is due to the random nature of the algorithm.

Synthetic dataset, general p values. In this experiment, we create a synthetic dataset with four colors of sizes 10, 100, 1000, 10000 and p = (0.3, 0.25, 0.25, 0.2). The results are presented in Figure 1 (b). We observe that the distributions of the picked elements for both our algorithm and SCSA are close to the p distribution, while for SA it is clearly different.
Moreover, our algorithm performs significantly better than SCSA, since it picks 1.309 times more elements (+30.9%) and picks the maximum element of the chosen color 1.630 times more often (+63.0%).

Feedback maximization. We consider a dataset containing one record for each phone call by a Portuguese banking institution (Moro et al., 2014). The goal of this experiment is to select a client, contact them, and ask for their feedback. In order to achieve high-quality feedback, we want to maximize the phone call duration while being fair with respect to the age of the interviewee. We divide the clients into 5 colors: under 30, 31-40, 41-50, 51-60, and over 60 years old. For the sake of being fair, we let p = (1/5, 1/5, 1/5, 1/5, 1/5). In Figure 1 (c), we present the obtained results (along with the number of records in the input for each color). Similar to the previous experiments, we observe that our algorithm and SCSA pick an almost equal number of times from each color, while SA picks mostly (in 80% of the runs) from the fourth color. Moreover, we observe that our algorithm picks 1.347 times more elements than SCSA (+34.7%), and that it picks the maximum element of the color 1.760 times more often (+76.0%).

Influence maximization. We consider a dataset containing the influence of the users of the Pokec social network (Takac & Zabovsky, 2012). The influence is computed as the number of followers of each user. Selecting influencers has numerous applications, e.g., in advertising. In this experiment we want to be fair with respect to the body mass index (BMI) of the selected influencers. Therefore we divide the users into 5 colors according to their BMI: underweight, normal, overweight, obese type 1, and obese type 2. We let p = (1/5, 1/5, 1/5, 1/5, 1/5). The results are presented in Figure 1 (d).³ Similar to the previous experiments, our algorithm and SCSA pick an almost equal number of times from each color, while the secretary algorithm picks only from two colors. Moreover, we observe that our algorithm picks 1.373 times more elements than SCSA (+37.3%), and picks the maximum element of the color 1.756 times more often than SCSA (+75.6%).

³For ease of presentation, this experiment is run 10⁶ times.

Figure 2. In this plot we present the number of times that our algorithms (Fair PA, Fair IID) and the baselines (SC, EHKS, DP) pick from each position of the input prophet problem stream. In (a) the stream consists of 50 samples from the uniform distribution, and in (b) the stream consists of 1000 samples from the binomial distribution.

4.2. Prophet Experiments

In this section we evaluate our multi-color prophet algorithms, Algorithm 2 (Fair PA) and Algorithm 3 (Fair IID). We focus on the case where values are distributed i.i.d. and each candidate is a group on its own. We compare with the following baselines:

SC algorithm (Samuel-Cahn, 1984): This algorithm sets a single threshold so that the maximum is above this threshold with probability exactly 1/2. It achieves an optimal 2-approximation for possibly non-identical independent distributions and arbitrary arrival order.

EHKS algorithm (Ehsani et al., 2018): This algorithm sets a single threshold so that an individual candidate is accepted with probability 1/n. It achieves an $e/(e-1) \approx 1.58$ approximation for possibly non-identical independent distributions and random arrival order.
CFHOV algorithm (Correa et al., to appear): This algorithm sets a sequence of thresholds based on acceptance probabilities that result from solving a differential equation. It achieves an optimal 1.342-approximation for i.i.d. distributions.

DP algorithm (e.g. Chow et al., 1971): This algorithm is the optimal threshold algorithm for the prophet problem, where thresholds are obtained by backward induction. This algorithm is optimal, even when distributions are different and candidates arrive in arbitrary order.

We consider two settings. In the first one the input stream consists of 50 samples from the uniform distribution on the range [0, 1], and in the second one the input consists of 1000 samples from the binomial distribution with 1000 trials and probability 1/2 of success in a single trial. For better comparability with existing algorithms, in both cases we assume each candidate is a group on its own. We run each algorithm 50,000 times.

In Figure 2 we compare the number of times that our algorithms, SC, EHKS, and CFHOV pick from each position of the stream. The DP algorithm picks very unfairly and almost exclusively from the very end of the stream. As this would distort the readability of the figures, we excluded this curve from the plots. We observe that both the SC and EHKS baselines pick candidates more from the first half of the stream than from the second half (by more than a factor of 2), while DP picks mostly from the second half of the stream (by more than a factor of 4). So all these algorithms are unfair. In contrast, our algorithms select the same number of candidates throughout the stream.

The average value of the chosen candidate for our Algorithm 2 (Fair PA), our Algorithm 3 (Fair IID), SC, EHKS, CFHOV, and DP is 0.501, 0.661, 0.499, 0.631, 0.752, and 0.751, respectively, for the uniform distribution, and 298.34, 389.24, 277.63, 363.97, 430.08, and 513.34, respectively, for the binomial distribution. In conclusion, for both settings, our algorithms Algorithm 2 and Algorithm 3 provide perfect fairness, while giving 66.71% and 88.01% (in the uniform case) and 58.12% and 75.82% (in the binomial case) of the value of the optimal, but unfair, online algorithm.

5. Conclusion and Open Problems

In this work, we explored questions of fairness and bias in natural multi-color variants of the two canonical problems of online selection, the secretary problem and the prophet problem. We designed optimal fair online algorithms for these problems, and validated the efficacy and fairness of these new algorithms on synthetic and real-world data. As in many real-world settings the online decisions go beyond the single selection model studied here, there is ample opportunity for extending this line of work to combinatorial settings. We expect that building on the respective lines of work in the secretary, prophet, and optimal stopping literature in general could prove very fruitful. Particularly exciting directions include extensions to matching problems (Kesselheim et al., 2013; Ezra et al., 2020; Gravin & Wang, 2019), allocation problems with matroid structure (Babaioff et al., 2018; Feldman et al., 2015b; Kleinberg & Weinberg, 2012; Dütting et al., 2020a), or even general combinatorial allocation problems (Feldman et al., 2015a; Dütting et al., 2020b).

Acknowledgments

José Correa was partially funded by ANID through grant FONDECYT 1190043. Andrés Cristi is supported by ANID under grant PFCHA/Doctorado Nacional/2018-21180347.
References

Babaioff, M., Immorlica, N., Kempe, D., and Kleinberg, R. Matroid secretary problems. Journal of the ACM, 65(6):35:1-35:26, 2018.

Buchbinder, N., Jain, K., and Singh, M. Secretary problems via linear programming. Mathematics of Operations Research, 39(1):190-206, 2014.

Cayci, S., Gupta, S., and Eryilmaz, A. Group-fair online allocation in continuous time. In Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS), 2020.

Celis, E., Keswani, V., Straszak, D., Deshpande, A., Kathuria, T., and Vishnoi, N. Fair and diverse DPP-based data summarization. In Proceedings of the 35th International Conference on Machine Learning (ICML), pp. 716-725, 2018a.

Celis, L. E., Huang, L., and Vishnoi, N. K. Multiwinner voting with fairness constraints. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI), pp. 144-151, 2018b.

Celis, L. E., Straszak, D., and Vishnoi, N. K. Ranking with fairness constraints. In Proceedings of the 45th International Colloquium on Automata, Languages, and Programming (ICALP), pp. 28:1-28:15, 2018c.

Chierichetti, F., Kumar, R., Lattanzi, S., and Vassilvitskii, S. Fair clustering through fairlets. In Advances in Neural Information Processing Systems, pp. 5029-5037, 2017.

Chierichetti, F., Kumar, R., Lattanzi, S., and Vassilvitskii, S. Matroids, matchings, and fairness. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 2212-2220, 2019.

Chow, Y. S., Robbins, H. E., and Siegmund, D. Great Expectations: The Theory of Optimal Stopping. Houghton Mifflin, Boston, MA, USA, 1971.

Correa, J., Foncea, P., Hoeksma, R., Oosterwijk, T., and Vredeveld, T. Posted price mechanisms and optimal threshold strategies for random arrivals. Mathematics of Operations Research, to appear.

Dütting, P., Feldman, M., Kesselheim, T., and Lucier, B. Prophet inequalities made easy: Stochastic optimization by pricing nonstochastic inputs. SIAM Journal on Computing, 49(3):540-582, 2020a.

Dütting, P., Kesselheim, T., and Lucier, B. An O(log log m) prophet inequality for subadditive combinatorial auctions. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS), pp. 306-317, 2020b.

Dwork, C., McSherry, F., Nissim, K., and Smith, A. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Theory of Cryptography Conference (TCC), pp. 265-284, 2006.

Dynkin, E. B. The optimum choice of the instant for stopping a Markov process. Soviet Mathematics Doklady, 4:627-629, 1963.

Ehsani, S., Hajiaghayi, M., Kesselheim, T., and Singla, S. Prophet secretary for combinatorial auctions and matroids. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 700-714, 2018.

Ezra, T., Feldman, M., Gravin, N., and Tang, Z. G. Online stochastic max-weight matching: Prophet inequality for vertex and edge arrival models. In Proceedings of the ACM Conference on Economics and Computation (EC), pp. 769-787, 2020.

Feldman, M. and Tennenholtz, M. Interviewing secretaries in parallel. In Proceedings of the ACM Conference on Electronic Commerce (EC), pp. 550-567, 2012.

Feldman, M., Gravin, N., and Lucier, B. Combinatorial auctions via posted prices. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 123-135, 2015a.

Feldman, M., Svensson, O., and Zenklusen, R. A simple O(log log(rank))-competitive algorithm for the matroid secretary problem.
In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1189-1201, 2015b.

Feldman, M., Svensson, O., and Zenklusen, R. Online contention resolution schemes. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 1014-1033, 2016.

Ferguson, T. S. Who solved the secretary problem? Statistical Science, 4:282-289, 1989.

Gravin, N. and Wang, H. Prophet inequality for bipartite matching: Merits of being simple and non adaptive. In Proceedings of the ACM Conference on Economics and Computation (EC), pp. 93-109, 2019.

Gupta, S. and Salem, J. Closing the gap: Mitigating bias in online resume-filtering. In Proceedings of the Conference on Web and Internet Economics (WINE), 2020.

Halabi, M. E., Mitrovic, S., Norouzi-Fard, A., Tardos, J., and Tarnawski, J. Fairness in streaming submodular maximization: Algorithms and hardness. In Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS), 2020.

Kearns, M. and Roth, A. The Ethical Algorithm: The Science of Socially Aware Algorithm Design. Oxford University Press, 2019.

Kesselheim, T., Radke, K., Tönnis, A., and Vöcking, B. An optimal online algorithm for weighted bipartite matching and extensions to combinatorial auctions. In Proceedings of the European Symposium on Algorithms (ESA), pp. 589-600, 2013.

Kleinberg, R. and Weinberg, S. M. Matroid prophet inequalities. In Proceedings of the ACM Symposium on Theory of Computing (STOC), pp. 123-136, 2012.

Kumar, R., Lattanzi, S., Vassilvitskii, S., and Vattani, A. Hiring a secretary from a poset. In Proceedings of the ACM Conference on Electronic Commerce (EC), pp. 39-48, 2011.

Lindley, D. V. Dynamic programming and decision theory. Applied Statistics, 10:39-51, 1961.

Moro, S., Cortez, P., and Rita, P. A data-driven approach to predict the success of bank telemarketing. Decision Support Systems, 62:22-31, 2014.

Samuel-Cahn, E. Comparison of threshold stop rules and maximum for independent nonnegative random variables. Annals of Probability, 12:1213-1216, 1984.

Takac, L. and Zabovsky, M. Data analysis in public social networks. In Proceedings of the International Scientific Conference and International Workshop Present Day Trends of Innovations, pp. 1-6, 2012.