# designing_committees_for_mitigating_biases__3e716e8d.pdf

The Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI-20)

Designing Committees for Mitigating Biases

Michal Feldman,1,2 Yishay Mansour,1,3 Noam Nisan,4 Sigal Oren,5 Moshe Tennenholtz6

1Tel-Aviv University, Israel, 2Microsoft Research, Israel, 3Google Research, Israel, 4Hebrew University, Israel 5Ben-Gurion University of the Negev, Israel, 6Technion Israel Institute of Technology, Israel michal.feldman@cs.tau.ac.il, mansour.yishay@gmail.com, noam@cs.huji.ac.il, orensi@cs.bgu.ac.il, moshet@ie.technion.ac.il

It is widely observed that individuals prefer to interact with others who are more similar to them (this phenomenon is termed homophily). This similarity manifests itself in various ways such as beliefs, values and education. Thus, it should not come as a surprise that when people make hiring choices, for example, their similarity to the candidate plays a role in their choice. In this paper, we suggest that putting the decision in the hands of a committee instead of a single person can reduce this bias. We study a novel model of voting in which a committee of experts is constructed to reduce the biases of its members. We ﬁrst present voting rules that optimally reduce the biases of a given committee. Our main results include the design of committees, for several settings, that are able to reach a nearly optimal (unbiased) choice. We also provide a thorough analysis of the trade-offs between the committee size and the obtained error. Our model is inherently different from the well-studied models of voting that focus on aggregation of preferences or on aggregation of information due to the introduction of similarity biases.

1 Introduction Homophily the tendency of people to prefer others who are more similar to them was long studied in several disciplines including sociology and economics (see (Mc Pherson, Smith-Lovin, and Cook 2001) and references within as well as (Currarini, Jackson, and Pin 2009)). The similarity can be over different dimensions varying from values and beliefs to social status and gender. Usually homophily is discussed in the context of structure of social networks individuals choose friends or spouses based on similarity. However, it was observed that homophily also plays a role in other settings, such as hiring (Rivera 2012). When the decision is of a personal nature (e.g., choosing a spouse) this bias towards similar individuals is natural and expected. However, when

This project has received funding from the European Research Council (ERC) under the European Union s Horizon 2020 research and innovation programme (grant agreement No 740282 and 740435), the European Research Council under the European Union s Seventh Framework Programme (FP7/2007-2013) (grant agreement No 337122), and by the Israel Science Foundation (grant numbers 317/17, 993/17 and 2167/19). Copyright c 2020, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

the decision has a more public nature (e.g., hiring for a ﬁrm or for an academic department), mitigating this bias can be valuable. As we will see in this paper, entrusting decisions such as who to hire in the hands of a committee may help in reducing this bias. Committees are chosen to make decisions for various reasons. Sometimes individuals have different preferences and the committee is formed to make a decision that will maximize the individuals happiness ( aggregation of preferences ). Other times there is a true state of the world which individuals have noisy signals about and a planner wishes to discover what this true state is ( aggregation of information ). Here we suggest that there are some settings in which there is an additional beneﬁt for using a committee instead of a single decision maker that is to reduce individual biases. To further study the advantages of a committee when the individuals are biased, we devise a simple model in which a planner wishes to learn the true state of the world. To this end he organizes a committee that consists of experts that have biased views of the true state of the world. By utilizing some information on the experts biases the planner is able to extract a less biased estimate of the true state. A concrete example of such a setting, from which we will also borrow our terminology, is academic hiring. Consider a dean that needs to form a hiring committee to choose among several different candidates, each working in a different subarea of some academic ﬁeld. In our setting, the dean, who is not from the academic ﬁeld in question, has no information about the quality of the candidates. All faculty members in the academic area the experts have estimates of the candidates qualities that are shaped by the true qualities of the candidates and their own cognitive biases. Speciﬁcally, faculty members prefer candidates that are more similar to them and as a result underestimate the qualities of candidates in sub-ﬁelds that are far from their own sub-ﬁeld. The dean, who is aware of these biases, will thus naturally attempt to construct a committee that is diverse in terms of its biases as to enable him to negate some of these biases and determine who the highest quality candidate is. To model the similarities between the experts and candidates we assume that the candidates and experts reside in known locations in some metric space1. The similar-

1This is in line with to the way similarity between voters and

ity between a candidate and an expert is determined by the distance between them. The two main metric spaces we consider are the unit interval and the more general ddimensional unit hyper-cube. The former can model, for example, how theoretical their research is, while the latter can capture several aspects of their research simultaneously. We assume that each candidate j has an absolute quality qj R. To model the cognitive process by which experts decide which candidate to vote for we assume that an expert i that resides at location ei in the metric space assigns a grade gi(j) = qj d(ei, cj) to candidate j that resides in location cj (where d(ei, cj) is the distance between the two locations). Next, each expert votes for the candidate with the maximal grade: votei = arg maxj C gi(j), where C is the set of candidates. We stress that the grade an expert gives a candidate is not necessarily a part of some conscious computation but rather a result of biases that the expert might be unaware of. Given a committee E of experts, the dean that is aware of the locations of the experts and their voting process will aggregate the votes votei of all committee members, and will need to pick some candidate j . Due to the experts biases the dean will not always be able to ﬁnd the candidate that truly maximizes qj among all qj, so we will aim to minimize the dean s regret 2: maxj Cqj qj . To familiarize the reader with our model, let us inspect a few simple examples. In all three examples depicted in Figure 1 the candidates are marked by circles, the experts by squares and there is an arrow from each expert to the candidate he votes for. First, in Figure 1(a) we observe that if we only have two candidates, an expert located at equal distances from both will always vote for the candidate of higher quality. In Figure 1(b) we see that for three candidates, having an expert at equal distances from each pair of candidates does not necessarily sufﬁce to ﬁnd the best candidate. This is due to the fact that the expert located at 0 might prefer the candidate at 0 instead of voting for the better of the candidates located at 1 and +1. Lastly, in Figure 1(c), we see that combining information from several experts may sufﬁce for determining the best candidate. In the example the expert at 0.4 vote implies that the quality of the candidate at 0 is at most 0.2 less than the quality of the candidate at 1. This is because the distance between the expert at 0.4 and the candidate at 0 is smaller by 0.2 than the distance between him and the candidate at 1. The expert at +0.4 vote implies that the candidate at 1 is better by at least 0.2 from the candidate at 0. Hence, the best candidate is the one at 1.

Our Results

We start our investigation by considering given sets of candidates and experts and ask which voting method should we use to determine who the best candidate is. In the example in Figure 1(c) we have already seen that this can require combining the experts votes in a non-trivial way. We show that

political parties is modeled in political science (Enelow and Hinich 1989; Roemer 2001). 2Our deﬁnition of regret differs from the online learning term regret, although conceptually they both address similar suboptimality issues.

q=0.2 q=0.8

q=0.2 q=0.8 q=0.1

q=0.2 q=0.8 q=0.1

Figure 1: Examples of simple committees.

majority voting, or more generally, plurality voting, can be extremely bad for some sets of candidates. We then identify the regret-minimizing voting method, and show how it can be implemented efﬁciently using a set of shortest-path computations. Next, we study the question of committee size: how large must a committee be in order to achieve a given level of regret? There are two variants of this problem: the ﬁrst is the universal committee in which our committee must be chosen before the locations of the set of candidates are known, and the second is the ad-hoc committee that is chosen for a set of candidates with known locations. In our academic hiring setting, the ﬁrst universal committee describes the scenario in which the dean chooses a hiring committee before the applications arrive, while the second ad-hoc committee describes a scenario in which a hiring committee is chosen after the subareas of the candidates are known. When both the candidates and the experts are located on the unit interval, we are able to give an essentially complete picture:

Theorem 1.1 In the unit interval, for every ε > 0 and candidate set of size m, there exists a universal committee of size O(m/ε) that achieves regret of at most ε. The bound is asymptotically tight even for ad-hoc committees.

The universal committee is constructed by equally spacing the experts across the unit-interval. It might come as a surprise that spacing of O(ε) does not sufﬁce. In particular, the committee size must increase with the size of the candidate set. Thus, the only advantage in choosing the committee in an ad-hoc manner is when the size of the candidate set is unknown in advance. This is not the case for higher dimensional metric spaces:

Theorem 1.2 In the d-dimensional unit hyper-cube, for every ε > 0 and candidate set of size m the following committees achieve regret of at most ε:

A universal committee of size O(md/εd). An ad-hoc committee of size O(m2/ε).

Moreover, every universal committee for m 2d + 1 candidates requires Ω(1/εd) size.

While the universal committee, again, is obtained by spreading the experts uniformly over the unit hypercube (in an (ε/m)-net), ad-hoc committees can be much smaller. Roughly speaking, this is done by constructing a maximal spanning tree of the candidates and constructing an ε-cover

of the tree. For d = 2 we can further improve this bound and construct ad-hoc committees of size O(m1.5/ε). For ad-hoc committees, we also consider the computational question of efﬁciently constructing a committee that chooses a candidate with minimal regret. For the unit interval we show:

Theorem 1.3 There exists a polynomial time algorithm that, given a set of candidates in the unit interval and a target committee size k, outputs a committee of size k whose regret is within a constant factor of the regret of the best committee of size k.

Unfortunately, this is not the case for general discrete metric spaces for which we show that constant approximation is hard, even in a weak bi-criteria sense: It is NP-hard to approximate to with a factor better than Ω(log m) the committee size needed to achieve error ε = 0 ( a perfect committee ).

Related Work Collective decision making has been studied by various authors (e.g., (Young 1988; Austen-Smith and Banks 1996; Feddersen and Pesendorfer 1998)). In this setting committee members need to jointly decide which state of the world is correct. A leading example is that of a jury that is appointed to decide whether the defendant is guilty or innocent. Each committee member receives some noisy signal about the state and submits a vote accordingly. The goal of this literature is to ﬁnd voting rules that minimize the probability of a mistake. Often, the search domain is the set of majority mechanisms. Of particular interest is the comparison between simple majority and unanimity. In our setting voting is for candidates; a main contribution of our paper is the consideration of a particular type of noise, namely the distance between the voter and the different candidates. A novel aspect of our work is that it provides a ﬁrst formal model to deal with the optimal selection of jury while modeling the jury members biases. While this topic is of course of major importance (see, e.g., (Neilsona and Winterb 2000)) we are not aware of formal models dealing with this fundamental topic in economic and social systems. Our paper is situated in an active line of work in AI that takes a utilitarian view of voting (Procaccia and Rosenschein 2006; Boutilier et al. 2015; Anshelevich et al. 2018; Caragiannis et al. 2017). In this line of work each voter has an underlined utility function. The voter then reports his preferences based on this utility function. For example, he may only report his top preference (as in our case) or give a complete ordinal ranking. Given this partial information the objective is to choose a candidate that approximately maximizes the social welfare. Note that this is different from our work as we seek to choose a candidate that is objectively the best and not to maximize the social welfare. Treating both voters and candidates as points in space, where distances between them is the basis for voting is quite standard (see, e.g., (Enelow and Hinich 1989)). An exciting recent line of work initiated by Anshelevich et al. (Anshelevich et al. 2018; Anshelevich and Postl 2017) is particularly relevant: voters and candidates are points in a metric space

and each voter prefers candidates that are closer to him. This line of work takes a utilitarian point of view and explores the distortion of known voting rules and develops new simple rules. Moreover, some models where such type of representation of candidates and voters as points in a metric space is augmented with some objective quality for the candidates have been considered (see, e.g., (Maravall-Rodriguez 2005)). This is however studied in the context of shaping policies by candidates rather than in the context of removing experts biases. A similar setting was also consider in (Krishna and Morgan 2011) in which a large population should vote for one of two candidates, each has two properties: ideology and competence. Each voter might prefer a candidate of different ideology but they all agree on the competence of the candidates. The paper is interested in a different set of questions than the ones we consider, in particular it considers mechanisms for voting for the whole population (and not just a chosen committee) resulting in the more competent candidate being elected. A different body of literature addresses the problem of strategic experts which would like to inﬂuence the outcome. The setting for two experts was studied in (Krishna and Morgan 2001) as an extension of Crawford and Sobel celebrated paper (Crawford and Sobel 1982). In general, assuming that the agents are rational, we can use mechanism design to address this concern. A major tool in this context is virtual implementation (Hitoshi 1993): ask the experts for their reports and then bias the decision (e.g., based on reported preference (Gerardi, Mc Lean, and Postlewaite 2009)) in a way that favors experts which were more consistent with the rest. The approach we take here has some of that ﬂavor, as we bias the decision based on information we have about agents biases. Another related work is the one by (Alpern and Gal 2009), which conducts an equilibrium analysis of selection committees under sequential voting with veto power.

2 Model and Preliminaries In our setting, there is a set C = {1, . . . , m} of m candidates, located along a metric space Ω. The location of candidate j C is cj Ω and is known to all and its quality is qj R. The quality vector of the candidates is q = (q1, . . . , qm) Rm and the location vector is c = (c1, . . . , cm) Ωm. Along the same metric space Ω, there is also a set E of experts, whose locations are known to all. A planner that does not know the qualities of the candidates needs to choose the best one. To make this decision the planner chooses a committee E E of size k among the set of experts; we identify the members of the committee with {1, . . . , k}. The experts location vector is given by e = (e1, . . . , ek) Ωk. Each expert i E gives candidate j a grade gi(j) = qj d(ei, cj). The expert then votes for the candidate with the maximal grade: votei(q) = arg maxj gi(j). When q is clear from the context we omit it and simply write votei. A voting rule is a function V : Ck C, which receives the votes vote(q) = (vote1(q), . . . , votek(q)) of the committee members, and returns a winning candidate.

Our goal is to choose the candidate with the highest quality. However, as discussed in the introduction, we are often unable to choose the candidate with the highest quality. Instead, we would like to minimize the regret. The regret of selecting candidate j with respect to a quality vector q is R(j, q) = maxj {qj qj}. The regret of a voting rule V is R(V ) = maxq R(V (vote(q)), q). We note that for simplicity of presentation we do not deal with tie breaking explicitly throughout the paper, rather we assume that ties are broken in favor of our needs . It is easy to verify that this can be formally addressed by adding an arbitrarily small ε to the quality of the chosen candidate.

3 Minimal Regret Voting Rule The problem of choosing a committee to minimize the regret is composed of two problems. First, a committee should be chosen. Second, a voting rule should be chosen; that is: an aggregation rule that, given the committee members votes, chooses a candidate that minimizes the regret. In this section, we deal with the second problem of choosing a voting rule: given the locations of the candidates c = (c1, . . . , cm), the locations of the experts e = (e1, . . . , ek), and the votes of the experts vote = (vote1, . . . , votek). Which candidate should be chosen in order to minimize the regret over all quality vectors? We ﬁrst observe that natural voting rules, such as majority (or plurality), may perform very poorly:

Observation 3.1 For every committee, there exist instances for which applying the majority voting rule will result in regret of Ω(1).

Proof: Consider the case of 3 candidates on the unit interval located at 0, 1/2, 1. We will show that for any possible locations of a set of experts (of any size), there is a vector of qualities where the majority rule will result in regret of Ω(1). Assume without loss of generality that the majority of the experts lie to left or exactly at 1/2. Now consider the quality vector q1 = 0, q2 = 1, q3 = 1.4. Notice that all the experts left of 1/2 vote for candidate 2, and so the majority vote has regret 0.4.

We next give a polynomial-time voting rule that minimizes the worst-case regret over any quality vector q. Moreover, we will see that by using the optimal voting rule we can make the regret as small as desired using sufﬁciently large committees. Let vi = votei(q) and v = (v1, . . . , vk). The key for deﬁning the minimal regret rule is observing that different qualities vector q may result in the same vector of votes v. We refer to the set of such quality vectors as indistinguishable qualities and denote it by Q(v) = {q : v = vote(q )}. To minimize the regret we should choose the candidate with minimal regret over the set of the indistinguishable qualities. This is a candidate j C such that j = arg minj maxq Q(v) R(j, q ) and we refer to it as the minimal regret candidate. We show a set of linear programs that are used to compute the minimal regret candidate. The ﬁrst step is to observe that the set of indistinguishable qualities Q(v) can be written using linear constraints. Speciﬁcally, for each expert i E and each candidate j C we have the constraint:

qvi d(ei, cvi) qj d(ei, cj), where the variables are the unknown qualities qj. For each candidate h such that there exists at least a single expert that voted for him and every j C, we can compute the regret of selecting h when j is a better candidate as max q Rh,j = qj qh

i E, qvi d(ei, cvi) qj d(ei, cj). We can compute the minimal regret candidate by solving at most m2 linear programs, one per pair of candidates. For each candidate h we compute Rh = maxj Rh,j and select the candidate h with the minimal Rh. In the next theorem we show how to compute the minimal regret candidate more efﬁciently. This is done by reducing the problem to computing shortest paths on a directed graph. Theorem 3.2 Given candidate locations c, experts locations e and expert votes v, the minimal regret candidate can be computed in time O(mk + m3). Proof: We construct a directed graph where every node is associated with a candidate in C. For each expert i E we add a directed edge from the candidate that i voted for (vi) to any other candidate h with weight wvi,h = d(ei, ch) d(ei, cvi). Recall that we have qvi d(ei, cvi) qh d(ei, ch) or equivalently qvi qh wvi,h (1) We claim that the resulting graph does not have cycles with negative weights. To show this consider a cycle vi1 vi2 vin vi1 Summing the corresponding Equations (1) we get n

r=1 wvir ,vir+1 ,

which implies that n r=1 wvir ,vir+1 0, hence the sum of the weights along any cycle is non-negative. Therefore, we can ﬁnd the minimal regret candidate as follows: (1) Build the directed weighted graph (in time O(mk)), (2) Solve all-pair shortest paths (either naively in time O(m3) or use one of the more sophisticated matrix multiplication algorithms), and compute a matrix A where Aj1,j2 is the regret for choosing candidate j1 compared to candidate j2. Then, (3) for each candidate h compute Rh (in time O(m) by computing maximum in row h of A, for a total time of O(m2)) and ﬁnally, (4) compute the minimal regret candidate h (in time O(m) by computing min Rh).

4 Universal Committees We are now ready to discuss our main question, how should we form a committee that will be able to choose the best candidate even when the locations of the candidates are unknown in advance (i.e., a universal committee). More formally, our goal is to form a committee that would be able to guarantee a regret of at most ε for any set of m candidates that may appear. We do assume that the number of candidates m is known in advance. As we shall see, this is unavoidable since the committee size must grow with the number of candidates.

4.1 General Upper bound We begin by constructing a universal committee that achieves a regret of at most ε. We assume here that we can choose an expert at any point in the metric space (i.e., E = Ω). We will need the following deﬁnition: Deﬁnition 4.1 (δ-cover) A set E Ω is an δ-cover of a metric space Ω if for each point x Ω there exists y E such that d(x, y) δ. We construct the universal committee by constructing an (ε/4m)-cover of Ω:

Lemma 4.2 Let Ω be a connected set. Any (ε/4m)-cover of Ω is a universal committee for m candidates that guarantees a regret of at most ε.

Proof: We will show that given the votes of the experts on any set locations c = (c1, . . . , cm) of the m candidates, we can determine the difference in quality between any two candidates that were voted by at least one expert up to an additive error of ε. We assume that every candidate received at least a single vote by one of the committee members. 3 Consider the following directed graph. Every candidate j C is associated with a vertex vj. There exists an edge between candidates j and j if there are experts i and i such that votei = j, votei = j , and d(i, i ) 2ε/(4m). Such an edge is assigned weight w(j, j ) = d(i, j) d(i, j ). The votes of the experts imply that qj d(i, j) qj d(i, j ) and qj d(i , j) qj d(i , j ). Combining the last inequalities with the bounds on distances and the triangular inequality implies that

d(i, j) d(i, j ) qj qj d(i, j) d(i, j ) + ε/m.

Therefore, the difference in quality is w(j, j ) up to an additive error of ε/m. Consider any two nodes in the graph (which correspond to candidates in C) which are connected by a path vj1, . . . , vjr. Since the difference in qualities between any two consecutive candidates j, j is w(j, j ) up to an error of ε/m, the difference between qjr and qj1 is r 1 i=1 w(ji, ji+1) up to an additive error of ε. It remains to show that the resulting graph is strongly connected. Consider two nodes j and j which are not connected. Consider a shortest path p in the metric space connecting j and j . For any point on the path, there is an expert of distance at most ε/(4m). Consider the set of experts Ep E which are at distance at most ε/(4m) from the path p. We can create a path from the experts in Ep such that the distance between any two is at most 2ε/(4m), where the ﬁrst expert votes for j and the last votes for j . As we march along the path, if an expert votes differently than its predecessor, we add an edge to the graph, and thus we have a path in the graph connecting j and j .

The following theorem is a straightforward corollary of the last lemma, using standard bounds on covers.

3The proof can be easily extended to the case that some of the candidates were not voted by any expert. This is done by noting that the quality of a candidate j that did not receive any votes is at most qj qj + ε/4m where j is another candidate that was voted for.

Theorem 4.3 For any number of candidates m, and any d 1, the metric space Ω = [0, 1]d admits a universal committee of size (4m/ε)d that guarantees a regret of ε.

We establish a better bounds for the special case of ddimensional unit hyper-cubes with two candidates:

Theorem 4.4 For the case of m = 2 candidates and any d 1, the metric space Ω = [0, 1]d admits a universal committee of size O(2d/ε) that guarantees a regret of ε.

Proof: We show how to choose a set of experts E with the property that for any two candidates there exists a single expert in E that can distinguish between them with accuracy ε. For x {0, 1}d deﬁne x = 1 x, component-wise, i.e., xi = 1 xi. We call the line connecting x to x the diagonal of (x, x). Along each diagonal of (x, x) we place experts spaced ε/2 away from each other. This gives a total of O(2d/ε) experts. Given two candidates in locations c1 and c2, consider the equi-distance hyperplane from c1 and c2. This hyperplane must intersect at least one of the diagonals (x, x). By the construction, there exists an expert at distance at most ε/2 from the intersection point, which implies that the difference between its distance to c1 and c2 is at most ε.

4.2 Lower Bounds We begin by considering d-dimensional unit hyper-cubes and present two lower bounds on the required size of the committee: one for achieving a regret smaller than 1/2 and the other for achieving a regret smaller than ε 1/

Theorem 4.5 For any d 1, a universal committee that guarantees a regret of at most 1/2 in Ω = [0, 1]d must be of size at least m 1.

Proof: For contradiction, assume there is a universal committee E of size m 2. For every expert in E, locate a candidate in the exact same location as the expert. Then, locate the additional two candidates in (0, . . . , 0) and (1, . . . , 1), respectively. The quality of a candidate is determined as follows. We have two states of the world. In state plus the quality of each candidate is its ﬁrst coordinate, i.e., x1, and in state minus it is the complement of its ﬁrst coordinate, i.e., 1 x1. Note that each expert selects the candidate which is at the same location as him, since for any other candidate the difference in quality is at most the distance between them. Therefore, the two states of the world are indistinguishable given the experts votes, and the same candidate will be selected. The chosen candidate has quality either q or 1 q, while the best candidate has quality 1. Consequently, the regret is max{1 q, 1 (1 q)} 1/2.

Theorem 4.6 For any d 1, a universal committee that guarantees a regret of at most ε 1/

d for any set of m 2d + 1 candidates in [0, 1]d must be of size at least (2ε

Proof: Consider partitioning the space into boxes of size (2δ)d each; there are (2δ) d such boxes. We will show that if there is a box that does not include any expert, then we

can select 2d candidates at the exterior of the box and one candidate in the interior of the box, such that for any expert outside the box, there exists at least one exterior candidate whose distance from the expert is smaller by at least ε than the distance between the expert and the internal candidate. This implies that if the quality of the internal candidate is higher than the other candidates by ε, no expert will vote for the internal candidate. Therefore, we can force a regret of ε in this case. Thus, any box should have an expert in its interior, and hence the number of experts is at least (2δ) d. Let δ = ε

d < 1, then ε = δ/

d. Consider the hyperbox, [ δ, +δ]d and the 2d unit vectors δei and their negation δei. Consider any point x [ δ, +δ]d, and assume that |x1| = maxi |xi|. We compare its distance to the origin versus its distance to δe1. The distance of x to the

origin is d j=1 x2 j = R. The distance of x from δe1 is

(|x1| δ)2 + d j=2 x2 j R 2δx1

2R . Since |x1| R/

we have that the difference in the distance is at least δ/

d. This implies that any point outside the box has a distance smaller by δ

d to at least one of the δei compared to the origin. To conclude, if there is a box which does not have any expert, we build the bad example as follows. We locate 2d+1 candidates, one in each δei with quality 1 and one in the origin with quality 1+ δ

d. No expert outside the box would prefer the candidate at the origin which implies the lower bound.

Next, we reconcile the last two lower bounds for the case of the unit interval. Interestingly, this lower bound holds even for the case of an ad-hoc committee (the committee is chosen only after the candidates locations are known).

Theorem 4.7 For the unit interval, every m 3 and ε > 0 there exists a set of m candidates such that every (even adhoc) committee that achieves regret at most ε requires size k = Ω(m/ε).

Proof: We sketch the proof of the dual argument that for every m 3 and k, if the set of candidates are located along the unit interval in equal distances: ci = i 1 m 1 for i = 1, . . . , m, then every committee of size k achieves regret at least Ω(m/k). The complete proof can be found in the full version. Before presenting the exact construction, we give a highlevel description of the intuition behind it. We construct two different quality vectors q1 and q2 that, on the one hand, admit identical voting of all k experts (and are therefore indistinguishable from the planner s point of view), and on the other hand, admit an Ω(m/k) gap between the highest and the lowest qualities. Both quality vectors consist of a decreasing sequence of qualities followed by an increasing sequence. The difference is that in q1, qualities ﬁrst decrease sharply and then increase mildly, while in q2 they ﬁrst decrease mildly and then increase sharply. This way, the best candidate in q1 is the left-most candidate, while the best candidate in q2 is the right-most one. The actual construction is subtle: the challenge is to delicately ﬁnd the right sequence of qualities that

ensure identical voting on the one hand, yet a sufﬁciently large gap in qualities on the other hand. We now proceed with the exact construction. Divide the line into 10k equal intervals of size δ = 1/(10k) each. This way, there are 10k m 1 intervals between any two candidates. Let e = (e1, . . . , ek) be the location of the experts such that ei ei+1 for every expert i. Given an index ℓ 1, . . . , 10k 2(m 1) (i.e., the index of an interval that resides to the left of the center of two consecutive candidates), two consecutive candidates are said to be ℓ-empty if there are no experts in the ℓth interval between them, nor in its mirror interval ˆℓ:= 10k m 1 ℓ+ 1. Pick ℓ 10k 2(m 1) for which the number of ℓ-empty pairs is maximal and let m denote this number. Observe that m > 4m/5. 4 Finally, further divide the set of ℓ-empty pairs into two equal-size sets the left set and the right set. 5 Let the pair (j, j + 1) be an ℓ-empty pair of candidates from the left set. We wish to build the qualities such that every expert in [cj, cj +(ˆℓ 1)δ] votes for candidate j, while every expert in [cj + ˆℓδ, cj+1] votes for candidate j + 1. (recall, there are no experts inside the interval [cj + (ˆℓ 1)δ, cj + ˆℓδ]). All experts in [cj, cj + (ˆℓ 1)δ] will prefer candidate j over candidate j + 1 if and only if

qj+1 qj + 1 m 1 2(ˆℓ 1)δ. (2)

Similarly, all experts in [cj + ˆℓδ, cj+1] will prefer candidate j + 1 over candidate j if and only if

qj+1 qj + 1 m 1 2ˆℓδ. (3)

The combination of the two inequalities leave us a leeway of 2δ. In particular, we can decrease the quality of qj+1 by value 2(ˆℓ 1)δ 1 m 1 + Δ for any Δ [0, 2δ]. (This is

indeed a decrease since ˆ(ℓ 1)δ 1 2(m 1)). Consider next an ℓ-empty pair (j, j+1) from the right set. We build the qualities such that every expert in the interval [cj, cj + (ℓ 1)δ] votes for candidate j, while every expert in the interval [cj + ℓδ, cj+1] votes for candidate j + 1. By replacing ˆℓby ℓin Equations (2) and (3) we can increase the quality of qj+1 by 1 m 1 2(ℓ 1)δ + Δ for any Δ [0, δ]. We now present the explicit quality vectors q1 and q2. In both q1 and q2, for every pair (j, j + 1) that is not ℓ-empty, we set qj = qj+1. In q1, for a pair of candidates j, j + 1 in the left set, we set q1 j+1 = q1 j + 1 m 1 2ˆℓδ, while for a pair in the right set, we set q1 j+1 = q1 j + 1 m 1 2ℓδ. In q2, for every ℓ-empty pair in the left set, we set q2 j+1 = q2 j + 1 m 1 2ˆℓδ+2δ , and for every ℓ-empty pair in the right set, we set q2 j+1 = q2 j + 1 m 1 2ℓδ+2δ. For these qualities it

4Otherwise, for any ℓ 10k 2(m 1) there exists at least m/5 pairs that are not ℓ-empty. The total number of experts required for this is 10k 2(m 1) m

5 = m k m 1 > k. 5We brieﬂy discuss the case that m is odd later.

holds that q1 m = q1 1 m δ while q2 m = q2 1+m δ. This implies that q1 and q2 are indistinguishable and since m 4m/5, the regret is at least mδ = Ω(m/k), as required.

5 Ad-hoc Committees We return to discussing the case of forming a committee after observing the locations of the candidates. In the full version, we provide a polynomial time approximation for the unit interval. Consider the candidates location vector c = (c1, . . . , cm). Roughly speaking, the heart of the algorithm is identifying the interval which is at the center between the leftmost and rightmost candidates and locating about half of the experts in this interval. The algorithm then, cleverly decides how many experts are needed in each of the rest of the intervals [cj, cj+1] and essentially equally spaces them in each interval. Our techniques also imply the dual approximation: given a target regret ε we can ﬁnd a committee whose size is within a constant factor of the size of the smallest committee with ε regret for that set of candidates. For general discrete metric spaces we show that constant approximation is hard, even in a weak bi-criteria sense. To this end, we consider the problem of computing a minimal size committee that always chooses the optimal candidate (i.e., perfect committee) and show: 6

Theorem 5.1 Approximating the perfect committee problem to with an o(log m)-factor is NP-hard.

To prove the theorem we deﬁne the pair cover problem in which we are given a bipartite graph and need to compute a minimal subset of left vertices such that each pair of right vertices has a left vertex that is connected to both. We show that the latter problem is NP-hard to approximate with in o(log m)-factor by reducing from set cover. Next, we contrast ad-hoc committees and universal committees with respect to the required size of a committee guaranteeing at most ε regret. As discussed in Section 4, for the unit interval there is no difference between the performance of universal and ad-hoc committees. Interestingly, for higher dimensions, observing the locations of the candidates can help reduce the committee size needed to guarantee a certain regret. This is done by ﬁrst building a minimum spanning tree (MST) over the candidates, then densely populating the edges of the MST with experts. The length (or weight) of an MST T is denoted by w(T).

Theorem 5.2 Let Ω be a connected metric space and T a minimum spanning tree of the locations of the candidates c = (c1, . . . , cm) over Ω with a total weight of w(T). There exists a committee of size k = O(w(T)m/ε) which guarantees a regret of at most ε.

Proof: Deﬁne a new metric space ΩT that includes only points on the edges of T. We now create an (ε/4m)-cover for ΩT , which has size at most k = O(w(T)m/ε) and locate the k experts in those locations. By Lemma 4.2 this is a universal committee for ΩT which guarantees a regret of at most ε. Since our set of candidates, by construction, are

6Note that a perfect committee does not always exist.

in ΩT , it implies that in particular the committee will have regret at most ε.

Let D denote the diameter of Ω. Clearly, the weight of the MST is bounded by Dm, which implies the following corollary.

Corollary 5.3 For any connected metric space Ω of diameter D, there exists an ad-hoc committee of size k = O(Dm2/ε) which guarantees a regret of at most ε.

The last corollary implies a big separation between ad-hoc and universal committees. While the required size of a universal committee is exponential in the dimension of the metric space, this exponential growth can be avoided in the case of ad-hoc committees. For the special case of Ω = [0, 1]2 we provide a better bound by establishing a better bound on the weight of the MST. We show:

Theorem 5.4 For Ω = [0, 1]2, there exists a committee of size k = O(m1.5/ε) which guarantees a regret of at most ε.

6 Discussion

In this paper we initiate the study of forming committees to reduce biases in voting. We show that eliminating biases is a difﬁcult task and often requires very large and diverse committees. Furthermore, the required size of the committee increases as the dimension of the metric space increases. To remedy this, one should prefer forming ad-hoc committees whenever possible, as we show that for d-dimensional hyper-cubes ad-hoc committees can be substantially smaller. Alternatively, to simplify the metric space, one should seek to minimize the exposure of the experts to irrelevant information that may bias them. For hiring decisions, one can, for example, refrain from disclosing the family status of the candidates. In the full version of the paper we explore additional aspects of committee formation. We consider randomized voting rules and show that for the unit interval they can perform better than deterministic voting rules by a constant gap. We also revisit the assumption that experts vote for the candidate with the maximal grade. Instead we allow the experts to vote strategically in order to get a candidate they prefer to be chosen. We show that if it is possible to locate many experts at the same location then we can transform any nonstrategic voting rule to a strategyproof voting rule that obtains the same regret by tripling the size of the committee and locating three experts in each original location. If we are not allowed to locate multiple experts at the same location then even for 3 candidates on the unit interval there is no strategyproof voting rule that is guaranteed to do well. Our work raises some new and exciting questions. On a more concrete level, we leave open the question of closing the gap between the lower bound and upper bound on the size of ad-hoc committees for d-dimensional unit hyper-cubes. More generally, in line with (Anshelevich et al. 2018), it is interesting to consider the tradeoff between the quality of the chosen candidate and the amount of information each expert should report. In our work each expert only reports his top candidate. By how much can we improve the quality of the chosen candidate (or reduce the size

of the committee) if we allow each expert to report his two (or more) favorite candidates?

References Alpern, S., and Gal, S. 2009. Analysis and design of selection committees: a game theoretic secretary problem. Int. J. Game Theory 38(3):377 394. Anshelevich, E., and Postl, J. 2017. Randomized social choice functions under metric preferences. Journal of Artiﬁcial Intelligence Research 58:797 827. Anshelevich, E.; Bhardwaj, O.; Elkind, E.; Postl, J.; and Skowron, P. 2018. Approximating optimal social choice under metric preferences. Artiﬁcial Intelligence 264:27 51. Austen-Smith, D., and Banks, J. S. 1996. Information aggregation, rationality, and the condorcet jury theorem. American Political Science Review 90:34 45. Boutilier, C.; Caragiannis, I.; Haber, S.; Lu, T.; Procaccia, A. D.; and Sheffet, O. 2015. Optimal social choice functions: A utilitarian view. Artiﬁcial Intelligence 227:190 213. Caragiannis, I.; Nath, S.; Procaccia, A. D.; and Shah, N. 2017. Subset selection via implicit utilitarian voting. Journal of Artiﬁcial Intelligence Research 58:123 152. Crawford, V. P., and Sobel, J. 1982. Strategic information transmission. Econometrica: Journal of the Econometric Society 1431 1451. Currarini, S.; Jackson, M. O.; and Pin, P. 2009. An economic model of friendship: Homophily, minorities, and segregation. Econometrica 77(4):1003 1045. Enelow, J. M., and Hinich, M. J. 1989. A general probabilistic spatial theory of elections. Public Choice 61:101 113. Feddersen, T., and Pesendorfer, W. 1998. Convicting the innocent: The inferiority of unanimous jury verdicts under strategic voting. American Political Science Review 92:23 35. Gerardi, D.; Mc Lean, R. P.; and Postlewaite, A. 2009. Aggregation of expert opinions. Games and Economic Behavior 65:339 371. Hitoshi, M. 1993. Bayesian monotonicity with side payments. Journal of Economic Theory 59:107 121. Krishna, V., and Morgan, J. 2001. A model of expertise. Quarterly Journal of Economics 116(2). Krishna, V., and Morgan, J. 2011. Overcoming ideological bias in elections. Journal of Political Economy 119(2):183 211. Maravall-Rodriguez, C. 2005. A spatial election with common values. Working Paper. Mc Pherson, M.; Smith-Lovin, L.; and Cook, J. M. 2001. Birds of a feather: Homophily in social networks. Annual review of sociology 415 444. Neilsona, W. S., and Winterb, H. 2000. Bias and the economics of jury selection. International Review of Law and Economics 20:223 250. Procaccia, A. D., and Rosenschein, J. S. 2006. The distortion of cardinal preferences in voting. In International

Workshop on Cooperative Information Agents, 317 331. Springer. Rivera, L. A. 2012. Hiring as cultural matching the case of elite professional service ﬁrms. American Sociological Review 77(6):999 1022. Roemer, J. E. 2001. Political Competition: Theory and Applications. Young, P. 1988. Condorcet s theory of voting. American Political Science Review 82:1231 1244.