# representative_committees_of_peers__128d40e8.pdf

Journal of Artiﬁcial Intelligence Research 71 (2021) 401-429 Submitted 11/2020; published 07/2021

Representative Committees of Peers

Reshef Meir reshefm@ie.technion.ac.il The Faculty of Industrial Engineering and Management, Technion Israel Institute of Technology, Technion City, Haifa 3200003, Israel

Fedor Sandomirskiy fedor.sandomirskiy@gmail.com The Faculty of Industrial Engineering and Management, Technion Israel Institute of Technology, Technion City, Haifa 3200003, Israel International Laboratory of Game Theory and Decision Making, HSE University, Kantemirovskaya 3, St. Petersburg 194100, Russia

Moshe Tennenholtz moshet@ie.technion.ac.il The Faculty of Industrial Engineering and Management, Technion Israel Institute of Technology, Technion City, Haifa 3200003, Israel

A population of voters elects representatives among themselves to decide on a sequence of possibly unforeseen binary issues. Voters care only about the ﬁnal decision, not the elected representatives. The disutility of a voter is proportional to the fraction of issues, where her preferences disagree with the decision. While an issue-by-issue majority vote by all voters would maximize the social welfare, we are interested in how well the preferences of the population can be approximated by a small committee.

We show that a k-sortition (a random committee of k voters with the majority vote

within the committee) leads to an outcome within the factor 1 + O 1

of the optimal social cost for any number of voters n, any number of issues m, and any preference proﬁle. For a small number of issues m, the social cost can be made even closer to optimal by delegation procedures that weigh committee members according to their number of followers. However, for large m, we demonstrate that the k-sortition is the worst-case optimal rule within a broad family of committee-based rules that take into account metric information about the preference proﬁle of the whole population.

1. Introduction

How well do committees represent preferences of the underlying population? How to select committees optimally? This paper aims to answer both questions taking the design perspective on the committee-selection procedure and the choice of the committee-internal voting rule. The mainstream social choice literature considers preferences on the set of candidates to be the primitive of the model. In contrast, we assume that voters have preferences on decisions of the committee. This richer structure provides a natural way to assess how well the committee represents electorate s preferences.

c 2021 AI Access Foundation. All rights reserved.

Meir, Sandomirskiy & Tennenholtz

We consider a population that is going to face a sequence of binary issues to decide on (e.g., academics select a committee to decide whether to run a new journal, where to organize the next workshop, whether to conduct it online because of the pandemic of COVID-19). Once a new issue emerges, each voter can possibly form her preference about the best alternative. This allows us to consider a socially optimal alternative that would be selected at the whole-population referendum, had such a referendum been conducted. However, engaging in a frequent referenda may be a heavy burden for the population. This is part of the reason why historically, most of the world has moved from direct democracy to some form of representative democracy (by committees or parliaments), whereas more recently there is much interest in increased public participation via technological means (Dalton, Burklin, & Drummond, 2001; Brill, 2019; Allen, Berg, & Lane, 2019).

We follow the representative democracy approach in this paper, assuming there is a strict limit on the number of representatives that can actively vote on all issues. We then ask if and how a representative committee can be selected from the general population. We assume that individual s preferences over issues are separable and evaluate how well a committee aggregates preferences of a given population by its social cost, where the cost of a voter is equal to the number of issues on which she disagrees with the ﬁnal decision. Following the standard worst-case approach in approximate mechanism design, the approximation ratio is deﬁned as the worst-case ratio between the (expected) social cost of the elected committee and the optimal social cost. The latter is obtained by the issue-by-issue majority vote of the entire population. We assume that the designer is free to choose both the committee-selection procedure and the way preferences of the elected committee members are aggregated. Our goal is to achieve the approximation ratio close to one while keeping the operational cost low.

Our paper demonstrates the exceptional properties of the following simple procedure: a committee of size k is selected uniformly at random from the population and the decision of the committee is determined by the simple majority on every issue. This procedure, known as k-sortition, has been discussed although not frequently used since ancient times. In the last two decades, this old rule has experienced a surge of theoretical interest accompanied by increasing practical success. For example, sortition committees were pivotal in the legalization of gay marriage and abortion in Ireland,1 have increasingly been organized directly by nation states and provinces,2 and used for Internet governance3; the ﬁrst permanent sortition assembly has been established in the German-speaking community of Belgium,4

and several NGOs5 fruitfully promote sortition committees as an eﬀective instrument for direct democracy. Many other examples can be found on https://participedia.net.

1. https://www.citizensassembly.ie/ 2. For example, climate assemblies in France https://www.conventioncitoyennepourleclimat.fr/en/ and in the UK https://www.climateassembly.uk/ or British Columbia assembly on electoral reform https://citizensassembly.arts.ubc.ca/. 3. See papers by Galvin (2004) and by Fishkin, Senges, Donahoe, Diamond, and Siu (2018). 4. https://www.ips-journal.eu/interviews/deliberative-democracy-makes-citizens-happy-3470/. 5. For example, https://www.sortitionfoundation.org/, http://www.g1000.org/, and a startup (Chaum, 2016, https://rsvoting.org/).

Representative Committees of Peers

1.1 Our Results

Our main result is a characterization of the approximation ratio of k-sortition. For a committee of size k = 3 the ratio is equal to 1.316 (Example 4) and behaves as 1+Θ 1

, when k increases (Theorems 2 and 3). As a corollary of this result, we infer that the optimal committee size k for a population of size n is of the order of n 2 3 given ﬁxed preferenceelicitation costs (Example 5). Can we do better than k-sortition? It turns out that for a small number of issues m, the approximation ratio can be made exponentially close to 1 in k by weighing each committee member according to the fraction of the electorate it best represents. We obtain this result under some restrictions on the preference proﬁle which roughly mean that, for a randomly sampled voter, her preferences over issues are i.i.d. (Proposition 4). Unfortunately, this compelling exponential improvement disappears in a realistic scenario of large number of issues m, and, moreover, the suggested average-proximity weighing becomes harmful (Proposition 5). It is not a coincidence that k-sortition has a better approximation ratio than its weighted version when the number of issues is large. Indeed, as our main result (Theorem 6) shows, k-sortition is the worst-case optimal rule among a broad family of committee rules. To summarize, in unstructured environments, k-sortition is a compelling choice for representing preferences of the electorate.

Structure of the paper The model is introduced in Section 2. We analyze the benchmark rule, k-sortition in Section 3. Section 4 is devoted to improving the outcome of k-sortition for small number of issues by using additional information about the proximity of committee members and voters. In Section 5, we show that k-sortition is worst-case optimal within a broad family of rules. Limitations of the model and remaining questions are discussed in Section 6. Most of the proofs are only sketched in the main body of the paper; all the details can be found in Appendices A, B, and C.

1.2 Related Literature

Finding good rules for committee selection and even formalizing what is good is still a big challenge for the social choice literature; see a survey by Faliszewski, Skowron, Slinko, and Talmon (2017) discussing the general problem and by Lackner and Skowron (2020) describing recent progress in the case of dichotomous preferences. This literature takes a normative approach and aims to characterize voting rules by a combination of desired properties known as axioms. Preferences over candidates are considered to be the primitive of the model despite the fact that, as argued by political scientists, preferences on candidates originate from the electorate s preferences on future decisions that a candidate would take if elected (Austen-Smith & Banks, 1988). This limitation does not allow to capture the objective of selecting committees that optimally represent electorate s preferences on future policies. The idea of sortition selecting representatives by lot from the electorate has its roots in Athenian democracy (see, e.g., Hansen, 1999) and has been rediscovered many times throughout history, e.g., it was used to select the Doge of Venice (Mowbray & Gollmann, 2007). Recently this idea gained popularity (Dowlen, 2017; Cheng, Dughmi, & Kempe,

Meir, Sandomirskiy & Tennenholtz

2017, 2018; Guerrero, 2014) for its fairness, representativness, high barriers to various manipulations6 including vote-buying (Walsh & Xia, 2012; Parkes, Tylkin, & Xia, 2017; Jamroga, Roenne, Ryan, & Stark, 2019; Gersbach, Mamageishvili, & Tejada, 2017), and stronger incentives to learn which alternative is the best compared to all-population referendum (Gersbach, Mamageishvili, Tejada, et al., 2020). As shown by Gersbach, Mamageishvili, and Tejada (2021), hybrid procedures including sortition as one of the steps can increase the incentives for the population to participate in the election. Trustworthy and secure implementation of sortition in practice leads to cryptographic challenges (Eastlake, 2004; Lenstra & Wesolowski, 2015; Basin, Radomirovic, & Schmid, 2018). The chance that a random committee fairly represents minorities can be increased for typical instances by a more advanced sampling technique (Benad e, G olz, & Procaccia, 2019); non-uniform sampling can also help to restore fair representation if members of the elected committee are free to opt out and do it with probability depending on demographic characteristics (Flanigan, G olz, Gupta, & Procaccia, 2020). We emphasize that the goal of our work is to pick a committee whose decision matches the preference of the majority of the population. In particular, fair representation of minorities is not one of our objectives, as the selected committee is only an intermediate step for a ﬁnal decision that applies to the entire population. However, our work shares some similarities with the paper by Benad e et al. (2019), which aims to minimize the variance of the fraction of representatives of a certain demographic minority within the committee. If the designer has access to demographic characteristics of the population and these characteristics are correlated with preferences, a variant of the non-uniform sampling procedure of Benad e et al. (2019) may outperform the uniform sampling in our model for some instances. Note that even in the setting of Benad e et al. (2019), the uniform sampling remains worst-case optimal, which suggests that our result on worst-case optimality of sortition with uniform sampling persists even if demographic data is available. One stream of the literature on multiple referenda deals with the dependency among issues in voters preferences (see, e.g., Lang & Xia, 2016); or with restrictions on the valid outcomes, as in judgement aggregation (Pauly & Van Hees, 2006). We follow a diﬀerent stream of the literature, making the simplifying assumption that preferences of each voter are fully captured by their position in some metric space (in our case, the binary cube with the Hamming distance) (Border & Jordan, 1983; Procaccia & Tennenholtz, 2009; Meir, Procaccia, & Rosenschein, 2012; Goel, Krishnaswamy, & Munagala, 2017; Anshelevich, Bhardwaj, Elkind, Postl, & Skowron, 2018; Garg, Kamble, Goel, Marn, & Munagala, 2019). Measuring the performance of a voting rule by its approximation ratio (or distortion, which is an analogue of the approximation ratio for a given set of candidates) was suggested by Procaccia and Rosenschein (2006). It was later applied by Procaccia and Tennenholtz (2009) for facility-location in metric spaces and became a gold standard in economic-design literature. In this line of papers, the optimal outcome is compared to the best possible outcome subject to some constraint on the voting rule (e.g., that it is strategy-proof, or uses only ordinal information). Indeed, in the special case of a singleton (k = 1), k-sortition boils down to the familiar random dictator voting rule, whose approximation ratio for large populations is 2.

6. Resistance to manipulations made it possible to use sortition as a building block in the design of nonmanipulable implementations of various social choice functions (Saran & Tumennasan, 2013).

Representative Committees of Peers

Cheng et al. (2018) also consider voting on a metric space, where candidates are uniformly sampled from the set of voters, and characterize the class of scoring rules having bounded distortion. Our paper is diﬀerent in many aspects: it analyzes the decisions of the elected committee, allows for fairly general committee-selection procedure, e.g., the sampling procedure may depend on preferences of the electorate, and achieves the approximation-ratio close to 1.

The fact that the approximation ratio of k-sortition goes to 1 for large committee size, can be regarded as an extension of the famous Condorcet Jury theorem, which claims that if each voter receives a noisy signal about the right alternative, the majority vote of a large population reveals the ground truth. In this interpretation, preferences of a random committee member are seen as a noisy estimate of preference of the majority. An important diﬀerence is that the jury theorem requires the population to be far enough from a tie (Paroush, 1998), while our bounds are applicable to all preference proﬁles.

Delegation A compromise between the direct and representative democracies is provided by proxy voting (Alger, 2006; Green-Armytage, 2015; Cohensius, Mannor, Meir, Meirom, & Orda, 2017) and liquid democracy (Kahng, Mackenzie, & Procaccia, 2018; Goelz, Kahng, Mackenzie, & Procaccia, 2018): each voter has an opportunity to engage in the decisionmaking directly but, since she may be non-motivated enough or unavailable, there is an option to specify a representative thus delegating the vote to the proxy. The idea of weighing the committee members, used in Section 4, is inspired by this line of research. Below we provide a detailed overview of several papers on proxy voting with multi-issue setting sharing some similarity with ours. Pivato and Soh (2020) consider a ﬁxed committee with delegation to the closest representative and evaluate its performance on a sequence of i.i.d. issues. They demonstrate that in the limit of large electorate, the decision of the committee always matches the alternative preferred by the majority. This conclusion holds even for committees of size 1 because authors impose additional strong assumptions on the alignment of electorate s and committee members preferences. Our approach diﬀers signiﬁcantly: we are interested in the worst-case guarantees, work with a ﬁxed population of voters, allow them to have arbitrary preferences, and the only randomness in our model comes from the randomization performed by voting rules. The setting by Skowron (2015) is similar to that of Pivato and Soh (2020): both papers implicitly assume i.i.d. issues and strong restrictions on the alignment of preferences of the committee and the electorate. Skowron (2015) discusses how the committee-internal voting rule aﬀects the optimal selection of the committee. We are interested in the optimal selection of the pair (a committee and a committee-internal voting rule), make no assumptions on the preference proﬁle, and rely on the worst-case analysis giving the robust guarantees. In contrast to the two aforementioned papers, Abramowitz and Mattei (2018) impose no restrictions on preferences and propose an interesting continuous way to mix direct and proxy voting by allowing voters to readjust the weights of committee members for each issue. In Section 5, we consider a family of rules containing the proposal by Abramowitz and Mattei (2018) and demonstrate the worst-case optimality of k-sortition within this family.

Meir, Sandomirskiy & Tennenholtz

Magdon-Ismail and Xia (2018) discuss committee selection in a setting with multiple issues and heterogeneous competence of voters captured by the probability that voter s opinion on a given issue matches the ground truth . In our model, the outcome of the issue-wise referenda can be seen as an analog of the ground truth. The conceptual distinction between the two settings is that Magdon-Ismail and Xia (2018) assume the competence of each voter to be known, while we do not have access to issue-wise alignment of each voter s preferences with the majority.

2. The Model

There is a population N of n voters who are going to face a sequence M of m binary issues to decide on. Preferences of each voter i N are represented by a vector xi {0, 1}M, where xi,j = 1 if i prefers the alternative 1 for issue j M and xi,j = 0 if the alternative 0 is preferred. The preference proﬁle of the population is thus given by an n m matrix X = (xi,j)i N,j M of zeros and ones.

Deﬁnition 1 (voting rules). A voting rule f for a population N of voters and the set M of issues speciﬁes the outcome vector z {0, 1}M for each preference proﬁle X {0, 1}N M. We allow for randomization and so a voting rule maps X to a probability distribution f(X) on {0, 1}M according to which the outcome z is then chosen.

An important class of rules is formed by those treating all issues separately in the following sense.

Deﬁnition 2 (issue-wise voting rule7). A voting rule f is issue-wise if for any pair of preference proﬁles X, X that coincide on issue j, we have P(zj = 1) = P(z j = 1), where vectors z f(X) and z f(X ) are the corresponding outcomes.

Committee voting rules. We are interested in voting rules that can be represented as a two-stage procedure: ﬁrst voters select a certain committee of peers C N and then the committee members vote to determine the outcome for each of the issues. Before giving a formal deﬁnition, we explain the intuition behind it. To avoid elicitation of detailed preferences of a large population, we assume that the committee selection depends on the overall alignment of voters tastes only. Formally, for a pair of vectors z, z {0, 1}M, deﬁne the distance d(z, z ) between them to be the number of issues where they disagree, i.e., the Hamming distance: d(z, z ) = P j M |zj z j|. For a preference proﬁle X, the matrix of pairwise distances is denoted by D(X) = d(xi, xi )

i,i N. We assume that the committee is selected using the information contained in D(X) solely.8

7. This property is also known as independence of irrelevant alternatives (IIA) (see, e.g., Pauly & Van Hees, 2006). 8. In practice, estimating D(X) does not require eliciting preferences on all issues M, some of which may even be unforeseen. One approach is to use similarity of socio-demographic characteristics as a proxy for similarity of tastes. Alternative, more reliable approach is to sample k m issues at random and elicit voters preferences on these k issues. By the standard arguments, the distance computed on such a sample gives a 1 + O 1

-approximation of the true distance with high probability for any pair of voters.

Representative Committees of Peers

The decision making process within the committee should be simple and transparent to maintain a clear connection between representatives and the voters they represent, and to enable voting on arbitrary issues as they come along. This motivates the assumption that the committee-internal voting rule of the committee has to be issue-wise. The restriction of a preference proﬁle X to committee members C N is denoted by X|C. Now we are ready to give a formal deﬁnition of a committee voting rule.

Deﬁnition 3 (k-committee voting rules). A k-committee voting rule is given by a function g that maps the matrix of pairwise distances D(X) to a probability distribution over pairs (C, h), where C N is a committee of size k and h is an issue-wise voting rule for the committee C ( a committee-internal voting rule ). The outcome of g is computed as follows: ﬁrst the pair (C, h) is sampled from the distribution g D(X) and then the outcome-vector z is obtained by applying h(X|C).

Any committee rule is a voting rule, and any voting rule can be seen as a k-committee rule with k = n, i.e., the whole population plays a role of the committee. We are interested in the opposite scenario, when the committee size k is small compared to the total number of voters n.

Example 1 (a k-committee rule). Consider the following k-committee rule. It picks a committee C that minimizes the total distance to all the voters P i N mini C d(xi, xi ) (if there are several such committees, it randomizes over them uniformly) and the committeeinternal rule h is the weighted majority vote on every issue, where the weight of a member c is given by wc = 1 + P i N d(xi, xc) 1. So for issue j, the outcome zj = 1 whenever P c C: xc,j=1 wc > 1

2 P c C wc; in case of the opposite strict inequality, zj = 0, and zj {0, 1} is picked at random in case of a tie.

Note that the fact that the committee-internal voting rule is issue-wise does not imply that the committee rule itself is issue-wise since the distribution of the pair (C, h) may depend on preferences of the whole population over all issues through D(X). E.g., the rule from Example 1 is not issue-wise. The selection of C can be thought of as a multiwinner rule that may or may not try to achieve proportional representation. E.g., the selection of C in Example 1 is using a version of Chamberlin-Courant rule (Chamberlin & Courant, 1983).

Ineﬃciency of voting rules. Since we restrict ourselves to k-committee rules, we can, in general, not achieve optimal eﬃciency. To quantify this ineﬃciency, we employ the utilitarian approach. We deﬁne the disutility of a voter i for an outcome-vector z {0, 1}M

as the total number of issues, where i s preferences and z disagree, i.e., this disutility equals to the distance d(xi, z). The utilitarian social cost of an outcome vector z is given by the sum of disutilities, and the social cost of a voting rule f on a preference proﬁle X is deﬁned as the expected social cost of its outcome:

i N d(xi, z), SC f(X) = Ez f(X) [SC(z)] .

Denote by zopt = zopt(X) the socially-optimal outcome, i.e., the one with the minimal cost

zopt = argminz {0,1}M SC(z).

Meir, Sandomirskiy & Tennenholtz

Deﬁnition 4. The approximation ratio of a voting rule f is the worst-case ratio of the expected social cost of its outcome z and the optimal social cost

ARn,m(f) = max X {0,1}N M

[1, + ] (1)

The maximum is over all preference proﬁles X with ﬁxed numbers |N| = n of voters and |M| = m of issues. If the denominator is zero (this happens for unanimous preference proﬁles), the following agreement is used: 0

0 = 1 and C

0 = + for C > 0. When we drop some of the subscripts n and/or m, this means considering their worstcase values, i.e., ARn = supm N ARn,m, ARm = supn N ARn,m, and AR = supn,m N ARn,m,

Example 2 (The simple majority rule). The outcome zj for each issue j is determined on the whole-population referendum: zj = 0 if P i N xi,j < n 2 ; for the opposite strict inequality, zj equals 1; and zj is 0 or 1 equally likely in case of a tie. The resulting rule, referred to as the simple majority rule, always select the socially-optimal outcome, i.e., ARn,m(MAJORITY) = 1. Indeed, minimization of the social cost SC(z) splits into a family of issue-wise minimization problems: minimize P i N |xi,j zj| over zj for each j M; the minimum is achieved if zj matches the majority of (xi,j)i N. Optimality of the simple majority rule comes at the cost of huge burden imposed on the population: to compute the outcome we may need the preferences of the whole population on all the issues.

Example 3 (The random dictatorship). The random dictator rule is a benchmark example in the voting theory. In our setting, it works as follows: a voter i N is selected uniformly at random and i s preference dictates the outcome of each referendum, i.e., z = xi. Note that the random dictatorship is an example of a 1-committee rule. It is known that for the random dictatorship ARn(DICTATOR) 2 2

n for voting in general metric spaces (Meir et al., 2012; Anshelevich & Postl, 2017). Our setting can be regarded as a particular metric space {0, 1}M with the Hamming distance d. However this does not improve the approximation ratio. Let us compute the approximation ratio for the one-issue case. A simple argument (Lemma 1 in Section 3) implies that ARn,m(DICTATOR) = ARn,1(DICTATOR), i.e., the approximation ratio does not depend on the number of issues. Without loss of generality, we can restrict our attention to proﬁles X such that the majority prefers the alternative 0 but there are some supporters of the alternative 1 (the case where 0 is supported unanimously is not the worst case since SC(zopt) = SC(DICTATOR(X)) = 0, and so, by our convention, the ratio of the social costs is equal to one). Denote the number of supporters of the alternative 1 by n1 = P i N xi,1. By the assumption, 1 n1 n

2 . The social costs are SC(0) = n1 and SC(1) = n n1. With probability n1

n the random dictator supports 1 and with the complement probability she supports 0. Therefore,

SC(DICTATOR(X))

n (n n1) + 1 n1

n n1 n1 = 2 2n1

The maximal value ARn,1(DICTATOR) = 2 2

n is achieved at n1 = 1. By Lemma 1 stated below, the answer remains the same for any number of issues.

Representative Committees of Peers

3. Random Committees with Simple Majority Rule Do Good Job

At the end of the last section, we considered two examples. At one extreme, we have the simple majority rule, which achieves the socially-optimal outcome but may require the information on preferences of the whole population of voters. The random dictatorship is at the other extreme: it is enough to know preferences of one voter only, but the social cost in the worst case is almost doubled. Our goal in this section is to ﬁnd the middle ground between these two extremes. We construct simple rules that have the approximation ratio close to 1 and require to learn preferences of a negligible fraction of the population. The following family of k-committee rules is a natural extension of both the simple majority rule and the random dictatorship.

Deﬁnition 5 (k-SORT: k-sortition). A committee C N of size |C| = k is selected uniformly at random. The preferences of the committee members are aggregated using the simple majority rule (see Example 2).

For a population of n voters, n-SORT coincides with the simple majority rule applied to the whole population, while 1-SORT coincides with the random dictatorship.9

Example 4 (The approximation ratio for 3-SORT). Let us compute the approximation ratio for a random committee of size 3 and one issue; by Lemma 1 below, the approximation ratio remains the same for any number of issues m. Similarly to Example 3, we can assume that the proﬁle X has the following structure: out of n 3 voters, a majority supports the alternative 0, and there are 1 n1 n

2 supporters of the alternative 1. We denote by p = n1

2 the probability that a random voter supports 1. Let A be the event that the committee with 3 members selects the suboptimal alternative 1; A occurs if and only if two or three members support 1. The probability of this event equals to

The probability P(A) tends to p2(3 2p) for ﬁxed p and large n. For p 1

2 and any n, this limiting value provides an upper bound: P(A) p2(3 2p). Taking into account that SC(0) = n1 and SC(1) = n n1, we obtain

SC 3-SORT (X) SC(zopt) = (1 P(A))n1 + P(A) (n n1)

n1 = 1 + P(A) (1 2p)

Taking supremum over p 0, 1

2 results in an upper bound on the approximation ratio

ARn,1 3-SORT 1 + sup p (0, 1

2] [p(3 2p)(1 2p)] = 1 + 7

9. Note that none of the rules from the family k-SORT require the designer to elicit the matrix D(X) of pairwise distances. This additional information that the deﬁnition of a committee voting rule allows for will only be used in Sections 4 and 5, which compare the k-sortition with more advanced rules that may depend on D(X).

Meir, Sandomirskiy & Tennenholtz

The maximum is attained at p = 4

7 6 0.226. It is easy to see that if we deﬁne n1 = p n and let n go to inﬁnity, the ratio of the social costs converges to the upper bound in (2). Thus

ARm=1 3-SORT = 1 + 7

The next simple lemma implies that the approximation ratios computed for the random dictatorship and 3-SORT in the one-issue case remain the same for any number of issues.

Lemma 1. For any number of voters n and any committee size k n, the approximation ratio of k-SORT does not depend on the number of issues m:

ARn,m k-SORT = ARn,1 k-SORT (3)

The lemma is proved in Appendix A; here we present a sketch.

Proof sketch of Lemma 1. The left-hand side of (3) is bounded by the right-hand side since any worst-case proﬁle with one issue can be cloned m times. To prove the opposite inequality, we observe that the worst-case proﬁle with m issues cannot be worse than its restriction to the worst issue.

Corollary 1. The approximation ratio for a random dictator AR DICTATOR) = 2 and for a random committee of size k = 3, the ratio AR 3-SORT 1.316.

We see that passing from a random dictator to a random 3-committee signiﬁcantly improves the approximation ratio. The following theorem covers the case of committees of arbitrary size k and demonstrates that the approximation ratio converges to 1 fast with the growth of k.

Theorem 2. For any number of voters n, any number of issues m, and committee size k n, the approximation ratio of k-SORT enjoys the following upper bound

ARn,m k-SORT 1 + 6 exp 1

k 1 + 3.639

The proof is rather long because of technicalities and is postponed to Appendix A; here we sketch the main ideas.

Proof sketch of Theorem 2. Lemma 1 allows to focus on the one-issue case. Similarly to Example 4, we aim to ﬁnd the worst-case proﬁle by maximizing over the fraction p 0, 1

of voters supporting the alternative 1. Given n, p, and k, the number of supporters of the alternative 1 in a random committee has the hypergeometric distribution with parameters (n, p n, k). Despite hyper-geometric random variable is not a sum of i.i.d. contributions, it satisﬁes the Hoeﬀding tail-bound, familiar by the i.i.d. case. This bound and optimization over p leads to upper bound (4). Note that the worst-case p = 1

; however, it takes additional work to exclude small p because the denominator in the approximation ratio (1) vanishes as p 0.

Representative Committees of Peers

Corollary 2 (Deterministic committees). By the probabilistic-method argument, Theorem 2 implies the existence of a deterministic committee of size k such that the majority vote of

its members has the approximation ratio at most 1 + 6 exp( 1

The next result shows that the upper bound from Theorem 2 has the right order of magnitude as a function of k.

Theorem 3. For any n, m, and k, the following lower bound holds

ARn,m k-SORT 1 + 2 Φ ( 1) 1

k 1 + 0.317

provided that 2(k + 1)2 n. Here Φ(t) = 1

2π R t exp y2

2 dy is the standard Gaussian cumulative distribution function.

The proof is sketched here and the details can be found in Appendix A.

Proof sketch of Theorem 3. Similarly to the proof of Theorem 2, we can assume that there is only one issue and obtain that the number of supporters of the alternative 1 among the committee members has the hypergeometric distribution with parameters (n, p n, k), where p is the fraction of supporters among the whole population. Assuming that p is close to 1

2, we approximate the hypergeometric distribution by the sum of i.i.d. Bernoulli random variables and apply the Central Limit Theorem to this sum (we use the so-called Berry Esseen version of the CLT, which provides an upper bound on the error term). As a result, we get the bound (5).

Corollary 3 (Asymptotic behavior). Combining upper and lower bound we see that

ARn,m k-SORT = 1 + Θ 1

Proofs of Theorem 2 and 3 demonstrate that worst-case preference proﬁles for large k are those, where both alternatives have approximately equal support with the imbalance created by Θ 1

-fraction of voters.

The following example illustrates how the obtained bounds on the approximation ratio can be used to ﬁnd the optimal committee size if the preference-elicitation costs are known.

Example 5 (Optimal size of the committee). Assume that eliciting preference of one voter on one issue has some ﬁxed cost c measured in the same units as the social cost of the voting outcome. Let us determine the optimal committee size that minimizes the worst-case regret for large number of voters n. Regret is deﬁned as the diﬀerence between the total cost of k-SORT (including the elicitation cost) and the social cost of the optimal outcome

Regret(X) = SC k-SORT(X) + c m k SC(zopt(X)).

To estimate the order of magnitude of the maximal regret over X, we note that

Regret(X) =

" SC k-SORT(X)

SC(zopt(X)) 1

SC(zopt(X)) + c k m.

Meir, Sandomirskiy & Tennenholtz

The maximal value of the expression in the parenthesis is ARn,m k-SORT 1, which

by Corollary 3. By the same corollary, the proﬁle X maximizing the ratio of

social costs has almost equal support for each of the alternatives and, hence, SC(zopt(X)) = Θ(m n). We see that both factors (the expression in the parenthesis and SC(zopt(X))) achieve their maximal orders of magnitude at the same time. Thus,

max X Regret(X) = Θ m n

Minimizing the worst-case regret over k is, therefore, equivalent to minimizing Θ 1

+ c n k. We conclude that the optimal committee size is

4. Delegation Is Good with Few Issues but Harmful with Many

In the previous section, we saw that the k-sortition achieves an approximation ratio close to 1. A natural question to ask is whether we can push the approximation ratio even closer to 1 by using slightly more information about preferences of the population. We ﬁrst show that the approximation ratios from the previous section can be significantly improved if the number of issues m is small. For this purpose, we use weighted majority voting to aggregate preferences of the committee members, where the weight of a member is determined by the fraction of voters to which this member is the closest representative (Alger, 2006; Cohensius et al., 2017). Similar approach to weighing representatives is studied in the ﬁeld of proxy voting (see Subsection 1.2). The common wisdom suggested by that strain of literature is that usually the quality of opinion-aggregation can be signiﬁcantly improved by ﬁnding a clever way of transforming proximity to weights. It turns out, that in case of large number of issues, this is not the case. While for small number of issues, weighing makes the approximation ratio exponentially close to 1 in the committee size k (compare with 1 + O(1/

k) from Theorems 2 and 3), we show that for large number m of issues, delegation is harmful and the approximation ratio is bounded from below by 9

8 for any size of the committee k. For a committee C N and a voter i N, denote by Ci the subset of committee members closest to i, her best representatives. We assume that each voter distributes one unit of weight over her closest representatives. So the weight wc of a committee member c C is given by wc = P i N: c Ci 1 |Ci|.

Deﬁnition 6 (k-w SORT: weighted k-sortition). A committee C N of size |C| = k is selected uniformly at random. The committee decides on each issue j M using weighted majority rule with weights wc: if P c C xc,j wc > 1

2 P c C wc, then the outcome is zj = 1; in case of the the opposite strict inequality, zj = 0; in case of a tie, zj {0, 1} is taken equally likely.

The following simple example shows that the approximation ratio is exponentially close to 1 in the one-issue case.

Representative Committees of Peers

Example 6 (k-w SORT for 1 issue). Assume that n1 among n voters support the alternative 0 and denote n1

n by p. Without loss of generality, 0 < p < 1

2, i.e., the majority of voters prefers the alternative 0 that has the social cost of p n. If the preferences of all the committee members coincide with those of majority (xc,1 = 0, c C), then, the committee selects the optimal alternative 0. Similarly, if xc,1 = 1 for all c C, the sub-optimal alternative 1 with the social cost (1 p) n is selected. Consider the remaining case: both views 0 and 1 are presented in C. We check that, in this case, the decision is socially optimal for any proportion of supporters of each of the alternatives (compare with at least k+1

2 members needed without weighing!). Indeed, the total weight received by the supporters of zero alternative is (1 p) n and is bigger than the weight of the supporters of one, p n. Thus the committee makes the sub-optimal decision if and only if it contains no sup-

porters of zero alternative, which happens with probability n1 k

pk. Therefore,

the approximation ratio admits the following upper bound

ARn,1 k-w SORT max p (0, 1

(1 pk) p n + pk (1 p) n

p n = 1 + max p (0, 1

2) pk 1(1 2p).

To ﬁnd the worst-case, we take the logarithmic derivative and get k 1

p = 1 1 2 p; hence,

ARn,1 k-w SORT 1 + 1 k 2k 1

By taking a sequence of preference proﬁles with growing number n of voters and p n supporters of the alternative 1, we see that the upper bound is achieved in the limit. Thus

ARm=1 k-w SORT = 1 + 1 k 2k 1

k 1 = 1 + 2

e 1 k 2k (1 + o(1)), k .

We see that weighing drastically improves the approximation ratio: from 1 + O 1

Theorems 2 and 3 to 1 + o 2 k .

If there are several issues, weights of the committee members depend on all of them. Therefore we don t have an analog of Lemma 1, which allowed us to analyze the rule k-SORT in the one-issue case and then extend the results in a straightforward way. In particular, for k-w SORT, the approximation ratio depends on the number of issues. Now we demonstrate that for small number of issues, delegation leads to exponential improvement as in Example 6. To simplify the analysis we focus our attention on a particular class of preference proﬁles. We say that in a proﬁle X = (xi,j)i N,j M a random voter has i.i.d. preferences over issues if, when we sample a voter i N uniformly at random, the random variables (xi,j)j M are independent and identically distributed. We stress that this is a property of one preference proﬁle X and, hence, proﬁles where a random voter has i.i.d. preferences form a deterministic domain restriction. Informally, this property means that issues are similar

Meir, Sandomirskiy & Tennenholtz

but unrelated: there are no logical dependencies between them or other correlations in preferences. We deﬁne the approximation ratio over proﬁles where a random voter has i.i.d. preferences by restricting the maximization in (1) to such proﬁles X

ARi.i.d. n,m (f) = max i.i.d. X {0,1}N M

Note that ARi.i.d. n,m is always below ARn,m. The condition that a random voter has i.i.d. preferences is very restrictive, and it is not too surprising that delegation improves the approximation ratio in this class, similarly to Example 6. The next proposition provides an explicit bound, which additionally highlights an interesting phenomenon: the improvement, that is signiﬁcant for small number of issues m, disappears as m grows.

Proposition 4. The approximation ratio of k-w SORT satisﬁes the following upper bound for proﬁles where a random voter has i.i.d. preferences over m issues

ARi.i.d. n,m k-w SORT 1 + mm+1 exp k (2m)m

We see that for a small number m of issues the approximation ratio converges to 1 exponentially fast in the size of the committee. Thus, as in Example 6, the delegation improves the asymptotic behavior compared to k-SORT (note that ARi.i.d. n,m and ARn,m coincide for k-SORT; the argument is similar to Lemma 1 and is omitted). However, the rate of exponential convergence drops drastically with the growth of the number of issues m. Below we will see that this is not a coincidence. Proposition 4 is proved in Appendix B and the proof is sketched here.

Proof sketch of Proposition 4. Let X be a proﬁle, where a random voter has i.i.d. preferences. W.l.o.g., the majority prefers 0 for each issue; we denote by p 1

2 the fraction of supporters of the alternative 1. If p is very small (p < 1 2m), then by the union bound, more than half of the population prefers 0 for all issues. Therefore, if at least one all 0 voter enters the committee, they get more than half of the total weight and the committee selects the socially optimal outcome. Since such voters are likely to be among the committee members (the probability is at least 1 (pm)k), the social cost of k-w SORT at X is exponentially close to optimal. For p 1 2m, the argument for the low social cost is diﬀerent. Using the union bound we show that, with high probability, for each sequence z {0, 1}M there is a committee member with such preferences. Note that given this event, each voter delegates her voting right to a member with exactly the same preferences as her own, and hence, the outcome of the committee rule coincides with that of the simple majority rule applied to the whole population. Details of the computation can be found in Appendix B.

The next result complements Proposition 4 and shows that for unbounded number of issues, delegation is harmful and the approximation ratio is separated from 1 even for big committees.

Representative Committees of Peers

Proposition 5. For unbounded number of issues, the approximation ratio of k-w SORT admits the following lower bound

AR k-w SORT 9

for any committee size k.

The fact that delegation in multi-issue setting may lead to inferior outcomes is mentioned in (Cohensius et al., 2017, Section 5) for a similar model but without the worst-case analysis. Their insight is that for some preference proﬁles delegation leads to the most extreme voters attracting almost all weight, thus wasting the information about others preferences. We build upon this insight.

Proof sketch (for details, see Appendix B). The idea is to construct a preference proﬁle X with two groups of voters P and Q. All voters from Q have identical preferences and all voters from P are equidistant and the distance between any two distinct voters from P is bigger than the distance between a voter from P and a voter from Q. Given such a metric structure, if any voters from Q enter the committee, they receive the weight both from Q and P and thus the committee decision becomes dictatorial and coincides with preferences of a voter from Q. In particular, the social cost of the outcome coincides with the social cost of the unanimous preferences of Q. In the appendix, we show the existence of such a preference proﬁle X, where the social cost of Q s preferences is 9

8 of the optimum. The proﬁle X is constructed via the probabilistic argument: (xi,j)i P,j M are i.i.d. Bernoulli random variables with success probability p < 1

2; and for i Q and j M we put xi,j = ξj, where (ξj)j M are i.i.d. Bernoulli random variables with success probability q < p. Therefore, for large number of issues m, the distance between any two voters from P is approximately 2p(1 p)m (1 + o(1)), while the distance between a voter from P and a voter from Q is (p(1 q) + q(1 p))m (1 + o(1)). Optimization over p, q, and sizes of P and Q leads to the desired lower bound.

Corollary 4. Comparing Proposition 5 to Theorem 2, we see that 1000-SORT outperforms k-w SORT for any committee size k.

Note that ancient Athenian democracy had 1100 magistrates, 1000 of whom were selected by lot (Hansen, 1999). Committees of size 1000 are also recommended by Mueller, Tollison, and Willett (1972).

5. Optimality of Sortition

In Section 4, we considered weighing of the committee members, where a voter delegates one unit of weight to the closest representative. This method proved to be much better than the simple majority rule for a few i.i.d. issues, however it gives no advantage and may even harm if the number of issues is big. One may think that poor guarantees for large number of issues is a feature of this particular delegation method. Indeed, it is easy to come up with alternative proposals that seem to be better: a voter may smoothly distribute the weight among the committee members in a way that members that are closer to him get more weight. Also, instead of

Meir, Sandomirskiy & Tennenholtz

selecting the committee uniformly, a higher chance can be given to voters with smallest average distance to the rest of the population. In this section we show that none of such natural modiﬁcations can improve the approximation ratio with many issues: k-SORT, a uniformly random committee with the simple majority rule, is worst-case optimal among the family of all committee rules, which can possibly use the whole metric information about the preference proﬁle.

Theorem 6. For any k-committee rule g and any number of voters n, we have

sup m N ARn,m g ARn,1 k-SORT .

In other words, a uniformly-random committee with the simple majority rule is worst-case optimal within the family of all committee rules.10

The theorem is proved at the end of the section and the next two subsections contain necessary preparatory work. The proof is based on two lemmas. The ﬁrst one (Lemma 7) shows that it is enough to prove the theorem for anonymous committee rules, i.e., those rules that are invariant with respect to renaming voters of the underlying population (permuting rows of the preference proﬁle X). The second lemma (Lemma 8) allows us to construct a rich family of proﬁles X such that all the voters are equidistant (for this we need m = n! issues). On such a proﬁle, the metric information encoded in the distance matrix D(X) becomes useless and the only anonymous way to select a committee is uniformly at random. Next we show that among the family of all committee rules with uniformly random committee, no rule can outperform k-SORT both on the proﬁle X and on the complement proﬁle X , where preference of all the voters are ﬂipped. Lemma 8 provides enough ﬂexibility to make X (and, hence, X ), the worst-case proﬁle for k-SORT. Since our committee rule is worse oﬀon one of these proﬁles, we conclude that it has a higher approximation ratio.

Anonymous rules and anonymization. For a preference proﬁle X with n voters and a permutation π of N, we denote by π(X) the preference proﬁle with permuted voters: π(X)i,j = Xπ(i),j.

Deﬁnition 7 (anonymous voting rules). A voting rule f is anonymous if f(X) = f π(X)

for any preference proﬁle X and permutation π.11

A committee rule g satisﬁes this deﬁnition if the distribution of the outcome z seen as a function f(X) of the preferences X of the population is invariant with respect to permutations. In particular, anonymity of the committee-internal rule h is neither necessary nor suﬃcient for the anonymity of the committee rule itself. Even if committee members receive diﬀerent weights as in Section 4 (and thus committee-internal rule is not anonymous), the committee rule may still be anonymous if the selection of the committee and the committeeinternal rule depend on preferences of the population in an anonymous way. For example, k-w SORT rule is anonymous. Similarly, even if a committee-internal rule is anonymous but the choice of the committee is determined by a designated voter (a dictator), the committee rule fails anonymity.

10. We prove a stronger result: instead of supremum over m on the left-hand side, one can take m = n! issues. 11. The equality is understood as the equality of distributions.

Representative Committees of Peers

For a rule h, denote by fanym its anonymization: for any preference proﬁle X

fanym(X) = 1

π Sn f π(X) ,

where n is the number of voters, Sn is the set of all permutations of N, and the sum is understood as the convex combination of distributions. We note that the anonymization of a k-committee rule is another k-committee rule.

Lemma 7. For any numbers of voters and issues and any voting rule h

ARn,m fanym ARn,m f .

The proof is elementary and can be found in Appendix C.

Preference proﬁles with equidistant voters. By the deﬁnition of a committee rule, the choice of the committee can only depend on the matrix D(X) of pairwise distances. The combination of anonymity and the distant-based property becomes very restrictive if distances between pairs of voters are all the same. The following lemma demonstrates that, for any given fraction of supporters of alternative 1, we can ﬁnd a proﬁle with equidistant voters.

Lemma 8. For any number of voters n and n1 n, there exists a preference proﬁle with n voters and m = n! issues such that for each issue j M exactly n1 voters support the alternative 1 and for each pair of distinct voters i, i N the distance d(xi, xi ) = 2p(1 p)m

where p = n1

Sketch of the proof (see Appendix C for the formal argument): The idea is similar to the one used in Proposition 5: if preferences of each voter are given by a vector of i.i.d. Bernoulli random variables with success probability p = n1

n , then, in expectation, n1 voters prefer alternative 1 for each j M, and the distance between any pair of voters concentrates around 2p(1 p)m for large m by the law of large numbers. To prove Lemma 8 we use a derandomization of this construction. This allows us to get exactly n1 supporters of 1 for each issue and exact equality of distances. Details can be found in Appendix C.

Proof of the theorem. Armed with Lemmas 7 and 8, we are ready to prove the theorem.

Proof of Theorem 6. Consider the anonymized rule ganym and check that for m = n! we have ARn,m ganym ARn,m k-SORT . (6)

By Lemma 7, this would imply the same lower bound for g. Pick n1 n

2 such that the preference proﬁle with one issue and n1 supporters of the alternative 1 is the worst proﬁle for k-SORT. With this n1, construct the proﬁle X from Lemma 8 and denote by X the proﬁle, where all the voters ﬂipped their preferences on all the issues, i.e., x i,j = 1 xi,j. Both X and X

Meir, Sandomirskiy & Tennenholtz

are the worst proﬁles for k-SORT (see Lemma 1) and have the same socially-optimal cost SC(zopt(X)) = SC(zopt(X )). Therefore,

ARn,m k-SORT = 1

2 SC k-SORT(X) + SC k-SORT(X )

SC(zopt(X)) . (7)

Our goal is to show that

SC k-SORT(X) + SC k-SORT(X ) SC ganym(X) + SC ganym(X ) . (8)

Together with (7) this inequality would imply (6) since ARn,m ganym SC ganym(Y )

SC zopt(Y ) for

any proﬁle Y . Consider the distributions ganym(X) and ganym(X ) on pairs (C, h) of a committee with k members and a committee-internal rule. The two distributions coincide because ganym

is distance-based and matrices of distances for X and X are the same; without loss of generality, let us focus on X. Since all pairs of voters in X are equidistant, the only anonymous way to select a committee C of size k is to take it uniformly at random among subsets C N with |C| = k. Indeed, if two such subsets C and C diﬀer in probability, this contradicts anonymity because C = π(C ) for some permutation π of N and no permutation changes the matrix of distances. Consider the probability P(zj = 1 | C) that the outcome of ganym(X) for issue j is 1 conditional on the selected committee C. This probability can only depend on (xi,j)i C because h is issue-wise by the deﬁnition of a committee rule. Since ganym is distance-based and anonymous, and any permutation of voters (in particular, those in C) leaves the matrix of distances unchanged, we see that the probability can only depend on q = P i C xi,j, i.e., the total number of supporters of the alternative 1 for issue j. Hence, P(zj = 1 | C) = hq,j = hq,j(C). For any alternative j, the chance that the uniformly random committee C contains

exactly q supporters of the alternative 1 is equal to

n1 q n n1 k q

n k for the proﬁle X and

n n1 q n k q

n k for X . Selection of the alternative supported by majority contributes n1 to

the social cost for both proﬁles, while the sub-optimal costs n n1. This allows to rewrite the right-hand side of (8):

SC ganym(X) + SC ganym(X ) =

n1 q n n1 k q

n k (n n1)hq,j + n1(1 hq,j) +

n n1 q n1 k q

n k n1hq,j + (n n1)(1 hq,j) !

To minimize this expression over hq,j [0, 1], one should minimize the contribution of the

ﬁrst summand whenever

n1 q n n1 k q

n n1 q n k q

n k which leads to hq,j = 0; similarly, for

Representative Committees of Peers

the opposite strict inequality, it is optimal to minimize the second summand, which gives hq,j = 1. Therefore, the optimal hq,j equals 0 if q < k

2 and 1 if q > k

2. Such h corresponds to the simple majority rule with arbitrary tie-breaking in case of a tie. Therefore,

SC ganym(X) + SC ganym(X ) SC k-SORT(X) + SC k-SORT(X ) ,

which completes the proof.

6. Discussion

The main takeaway message from our work is that at least when the preferences of the individuals in the society are separable, the simple and well known k-sortition rule is a reasonable choice that guarantees low approximation ratio. While for a low number of issues we saw that there are more representative rules than the k-sortition rule, the latter also has other beneﬁts. First, the selection of the committee C (and its internal voting rule h) is completely oblivious to voters positions. As such, it can be used to select a committee even before we know some or all of the issues on the agenda. Second, while we did not focus on incentive analysis in this work, it is easy to see that k-sortition is strategy-proof for the the entire population: no voter can aﬀect the selection of the committee, and representatives are always weakly better of by voting their true position on every issue. In contrast, under the proxy-weighted variant, voters may have an incentive to lie in order to aﬀect the weights of committee members.

Our paper uncovered new attractive properties of k-sortition. An interesting direction for future work is to characterize k-sortition as the unique rule possessing a combination of such properties. For example, Theorem 6 claims worst-case optimality of k-sortition but does not exclude the existence of other worst-case optimal k-committee rules. What assumptions should one add to the theorem (if any) to get a characterization? Example 5 describes how to pick the committee size k to minimize regret within the family of k-sortitions. Since sortition does not incur any preference elicitation at the committee-selection stage, a plausible conjecture would be that k-sortition with the optimal k minimizes the worst-case regret within a broader family of rules. It may well be the unique minimizer within the family of all voting rules. The main general direction we are currently investigating is a better understanding of the eﬀects of delegation, in particular when there are few issues and/or restrictions on the structure of preferences. In the long run, we are interested in how sortition and other delegation-based voting rules can guarantee low distortion and good approximation in other domains, including ranked preferences and interdependent issues.

Acknowledgments

The authors are grateful to Akaki Mamageishvili, Nisarg Shah, and two anonymous reviewers for valuable suggestions. Moshe Tennenholtz is supported by the European Research Council (ERC) under the European Union s Horizon 2020 research and innovation program (#740435). Reshef Meir is supported by the Israel Science Foundation grant #2539/20.

Meir, Sandomirskiy & Tennenholtz

Fedor Sandomirskiy is supported by the European Research Council (ERC) under the European Union s Horizon 2020 research and innovation program (#740435), Ministry of Science and Technology grants #19400214 and #2028255, Grant 19-01-00762 of the Russian Foundation for Basic Research, and the Basic Research Program of the National Research University Higher School of Economics.

Appendix A. Proofs for Section 3

Proof of Lemma 1. In order to show that ARn,m k-SORT ARn,1 k-SORT , consider a worst-case proﬁle X {0, 1}N {1} for one issue. Deﬁne a new proﬁle X with m issues by cloning X : xi,j = x i . Since diﬀerent issues contribute to the Hamming distance in an additive way and k-SORT operates on each issue separately,

ARn,m k-SORT SC k-SORT(X)

SC zopt(X) = m SC k-SORT(X )

m SC zopt(X ) = ARn,1 k-SORT . (9)

To prove the opposite inequality, pick a worst-case proﬁle Y {0, 1}N M with m issues and denote by Y j a one-issue proﬁle obtained by the restriction of Y to issue j, i.e., yj i = y i,j. The following computation shows that the worst-case proﬁle with m issues is not worse than its restriction to the worst issue

ARn,m k-SORT = SC k-SORT(Y )

SC zopt(Y ) =

P j M SC k-SORT(Y j)

P j M SC zopt(Y j)

max j M SC k-SORT(Y j)

SC zopt(Y j) ARn,1 k-SORT . (10)

Combining inequalities (9) and (10), we obtain the desired equality of approximation ratios.

Proof of Theorem 2. By Lemma 1, it is enough to consider a one-issue proﬁle X. Without loss of generality, we assume that the majority supports the alternative 0: 1 n1 n

2 voters prefer the alternative 1, and the rest prefer 0. Denote p = n1

2 . Let ξ be the number of committee members supporting the alternative 1. The random variable ξ has the hypergeometric distribution with parameters (n, n1 = pn, k). Recall that the hypergeometric distribution describes the number of red balls among k draws without replacement from an urn with n1 red and n n1 white balls. If we represent supporters of 0 as white balls and supporters of 1 as red and take into account that sampling without replacement is equivalent to picking a uniformly random subset, it becomes obvious that ξ is hypergeometric. If ξ < k

2, then the committee selects the socially-optimal alternative 0 with the social cost p n; for ξ = k

2 (can happen only if k is even), there is a tie and the committee picks any of the two alternatives equally likely; for ξ > k

2, the committee selects the sub-optimal alternative 1 with the social cost (1 p) n. Therefore, the ratio of social costs in (1) can be represented as

SC k-SORT(X)

SC zopt = P ξ < k

2 p n + P ξ = k

2 + P ξ > k

2 (1 p) n p n (11)

Representative Committees of Peers

We will consider the two cases of p close to zero and p close to 1

2 separately. As we will see, the worst-case instances correspond to the latter scenario, however, p in the denominator of (11) does not allow to cover both cases at once.

The case p 1

6. For small p, we use a rough bound on P(ξ k

2) based on the Chebyshev inequality. The expectation and the variance of a hypergeometric random variable are given by

Eξ = p k and Vξ = p(1 p)n k

n 1 k p(1 p) k;

(see, e.g., Feller, 2008, 2.6). By the Chebyshev inequality,

2 p k Eξ 2 4p(1 p)

(1 2p)2 1 k.

Thus by (11), SC k-SORT(X)

SC zopt 1 + 4(1 p)

1 2p 1 k 1 + 5

2. To estimate the probability of ξ k

2, we use the tail bounds for hypergeometric random variables. They enjoy the usual Hoeﬀding inequality

P (ξ Eξ + t) exp 2 t2

for t 0 (13)

as if ξ was the sum of k Bernoulli random variables with success probability p. The inequality for hypergeometric distribution is proved in the same paper of Hoeﬀding (1994); see Section 5 there. From (13), we get P ξ k

2(1 2p)2 k . Substituting this bound in (11), replacing p in denominator by its lower-bound 1

6, and denoting 1 2p by t, we obtain

SC k-SORT(X)

SC zopt 1 + exp 1

2(1 2p)2 k 1 2p

p 1 + 6 max t [0, 1

2] exp t2 k t.

Logarithmic derivative t k + 1

t is zero at t = 1

k. Hence, SC k-SORT(X)

SC zopt 1 + 6 exp 1

Gluing the two cases. For k 2, the upper bound (14) proved for 1

2 exceeds (12) derived for p 1

5 k 6 exp 1

Meir, Sandomirskiy & Tennenholtz

Therefore, (14) bounds the ratio of the social costs for any p and k 2. For k = 1, the right-hand side of (14) is equal to 1+6 exp 1

2 4.6, while the worst-case approximation ratio for 1-SORT (a random dictator) is below 2 by Example 3. Thus the approximation ratio satisﬁes

ARn,1 k-SORT = max X SC k-SORT(X)

SC zopt 1 + 6 exp 1

for all n and k.

Proof of Theorem 3. By Lemma 1, it is enough to consider the case of one issue. Similarly to the proof of Theorem 2, we assume that 1 n1 n

2 voters prefer the alternative 1 and denote p = n1

n . The number of supporters of the alternative 1 among the committee members is denoted by ξ and has the hypergeometric distribution with parameters (n, n1 = pn, k). We will consider the scenario where 2(k + 1)2 n, i.e., the number of voters n is large compared to the committee size k. From (11), we deduce the lower bound:

SC k-SORT(X)

SC zopt 1 + P ξ > k

Let us estimate P ξ > k

2 from below for the hypergeometric distribution with parameters (n, n1 = pn, k). If k k balls are already taken and n 1 k of them are red, the chance to pick the next red ball is n1 n 1 n k n1 k

n. Therefore, P ξ > k

2 is bounded from

below by the probability P Pk i=1 ηi > k

2 , where ηi are i.i.d. Bernoulli random variables

with success probability p k

For i.i.d. random variables, the probability P Pk i=1 ηi t can be approximated by the normal law using the Berry Esseen theorem (a version of the central limit theorem with a bound on approximation error; see Shevtsova, 2011)

t k Eη1 Vη1

2 E η1 Eη1 3

(Vη1) 3 2 1 2

where Φ(z) = 1

2π R z exp y2

k 1 2p + 2k

where we wrote 1

4 instead of Vη1 1

4. Let the number of supporters for the suboptimal alternative be

Representative Committees of Peers

and hence 1

n. Note that p 0, 1

2 by the assumption on n and k. We get

k = Φ( 1) 1 2

Substituting this into (15), we obtain

SC k-SORT(X)

SC zopt 1 + 2 Φ( 1) 1 2

k 2Φ( 1) 2(k+1)

Inequalities Φ( 1) 1

2 and 2(k+1)

k lead to the lower bound on the approximation ratio

ARn,1 k-SORT = max X SC k-SORT(X)

SC zopt 1 + 2Φ( 1)

valid for 2(k + 1)2 n.

Appendix B. Proofs for Section 4

Proof of Proposition 4. Consider a preference proﬁle X where a random voter has i.i.d. preferences over issues. Without loss of generality, for every issue j M, the majority of voters prefers the alternative 0 but not unanimously. We denote by p the fraction of supporters of the alternative 1, so 0 < p 1

2. Consider ﬁrst the case of small p < 1 2m, which will not require the independence assumption. By the union bound, at least 1 pm > 1

2 fraction of all voters prefer the alternative 0 on every issue. Therefore, if the committee C contains a zero member that prefers 0 for every issue, she receives more than 1/2 of the total weight and the alternative 0 wins for all issues thus providing the socially optimal outcome. The probability that a random committee contains a zero member is at least 1 (pm)k, therefore, the ratio of social costs satisﬁes

SC k-w SORT (X) SC(zopt)

1 (pm)k p + (pm)k (1 p)

= 1 + m (pm)k 1(1 2p) 1 + m 2k 1 .

Now consider the case of 1 2m p 1

2. Let us estimate that probability of the event A that for every sequence z {0, 1}M there is a committee member c C with xc = z. Given A, every voter delegates her vote to a committee member with exactly the same preferences as her own and thus the outcome of the committee vote coincides with the outcome of the majority vote of the whole population. Thus, if A occurs, the committee selects the socially-optimal outcome. We estimate the probability of A using the union bound. For any z {0, 1}M with q ones, the probability that there is no committee member with such preferences is is at most

Meir, Sandomirskiy & Tennenholtz

(1 pq(1 p)m q)k. Since there are m q

such vectors z, we get

1 pq(1 p)m q k .

Taking into account that (1 pq(1 p)m q)k (1 pm)k and Pm q=0

= 2m, we obtain

P(A) 1 2m (1 pm)k .

SC k-w SORT (X) SC(zopt) P(A) p + (1 P(A)) (1 p)

p (1 pm)k .

Since 1 2m p, the right-hand side does not exceed

1 + m 2m+1 1 1 (2m)m

k 1 + m 2m+1 exp k (2m)m

where in the last inequality we used that (1 t) 1 t 1

e for any t (0, 1). Note that for any m, k 1, this upper bound exceeds the one obtained for p < 1 2m and thus

ARn,m k-w SORT 1 + m 2m+1 exp k (2m)m

Proof of Proposition 5. Consider a random preference proﬁle X with n voters and m issues. There are two groups of voters P and Q with sizes α n and (1 α) n, respectively. All the voters i Q are unanimous: their preferences xi coincide with a vector ξ = (ξj)j M with components given by i.i.d. Bernoulli random variables with success probability q < 1

2. In contrast, preferences of voters in P are not aligned: (xi,j)i P,j M is a matrix with i.i.d. Bernoulli entries with success probability p q, 1

2 . The expected distance between two distinct voters i, i P is equal to E[d(xi, xi )] = 2p(1 p)m. Similarly, for a voter i P and i Q we have E[d(xi, xi )] = (p(1 q) + q(1 p))m. Expected distance to the zero vector E[d(xi, 0)] is equal to pm for i P and qm for i Q. By the law of large numbers, for any ε > 0 we can ﬁnd m = m(ε, n, p, q) such that all the distances are within a factor 1 ε from their expected values with positive probability:

(1 ε)E[d(xi, z)] d(xi, z) (1 + ε)E[d(xi, z)], i N, z {xi : i N} {0}.

Abusing the notation, we denote by X the realization of the random proﬁle satisfying these inequalities. Assuming that ε is small, we see that the distance between any two voters from P exceeds the distance between voters from P and voters from Q (it is enough to assume that (1 ε)2p(1 p)m > (1 + ε)(p(1 q) + q(1 p))m).

Representative Committees of Peers

Consider a random committee C of size k. If C contains some voters from Q, all voters from N \ C delegate their weight to them. Therefore, if n k > k

2 (i.e., n is large enough compared to k), voters from C Q get more than half of the total weight and, therefore, their unanimous preferences ξ coincide with the outcome of the committee vote.

The chance that C Q is nonempty equals 1

n k 1 αk. Putting all the pieces

together, we obtain the following lower bound on the approximation ratio

AR k-w SORT (1 αk) mini P,i Q d(xi, xi ) αn + αk SC(zopt)

Taking into account that SC(zopt) SC(0) and using the lower bound on the minimal distance, we get

AR k-w SORT αk + (1 αk) (1 ε) p(1 q) + q(1 p) α (1 + ε) αp + (1 α)q) .

Letting ε +0 and q p 0, we get rid of ε and simplify the expression

AR k-w SORT αk + (1 αk) 2(1 p)α.

Now, letting p go to 0, we see that

AR k-w SORT 2α + αk 2αk+1.

For k = 1, the maximum of the right-hand side is achieved at α = 3

4. Substituting this value for any k, we obtain

AR k-w SORT 3

Appendix C. Proofs for Section 5

Proof for Lemma 7. Pick a proﬁle X with n voters and m issues and denote by X the proﬁle of the form π(X) maximizing SC h(π(X)) over all permutations π Sn. Note that the socially-optimal cost is the same for X and X . Therefore,

SC hanym(X)

SC zopt(X) = 1

SC zopt(X) SC h(X )

SC zopt(X ) ARn,m h .

By maximizing the left-hand side over X, we obtain the desired inequality.

Proof of Lemma 8. Identify the set of issues with the set Sn of all permutations π of N. So there are m = n! issues. Fix a set A N of cardinality n1. The set of voters preferring alternative 1 for issue π is deﬁned to be the preimage π 1(A) of the set A under permutation π. In other words, preference of a voter i on an issue π is given by

xi,π = 1, π(i) A 0, π(i) / A.

Meir, Sandomirskiy & Tennenholtz

By the construction, exactly n1 voters prefer 1 for each issue. Consider the distance d(xi, xi ) = P π Sn |xi,π xi ,π| between two distinct voters i, i N. Let τ be the permutation such that τ(i) = 1 and τ(i ) = 2. Taking into account that π π τ deﬁnes an automorphism of Sn and that xi,π τ = xτ(i),π, we get

d(xi, xi ) = X

xi,π τ xi ,π τ = X

x1,π x2,π = d(x1, x2).

Therefore, all the distances between distinct voters are equal to common constant d = d(x1, x2). To ﬁnd d, note that if we take a pair of voters uniformly at random (not necessary distinct), the chance they have diﬀerent opinion on a certain issue π is n1(n n1)+(n n1)n1

n2 , while the chance that they are distinct is n2 n

n2 d = n1(n n1) + (n n1)n1

n2 = d = 2p (1 p)

Abramowitz, B., & Mattei, N. (2018). Flexible representative democracy: an introduction with binary issues. ar Xiv preprint ar Xiv:1811.02921.

Alger, D. (2006). Voting by proxy. Public Choice, 126(1-2), 1 26.

Allen, D. W., Berg, C., & Lane, A. M. (2019). Cryptodemocracy: How Blockchain Can Radically Expand Democratic Choice. Rowman & Littleﬁeld.

Anshelevich, E., Bhardwaj, O., Elkind, E., Postl, J., & Skowron, P. (2018). Approximating optimal social choice under metric preferences. Artiﬁcial Intelligence, 264, 27 51.

Anshelevich, E., & Postl, J. (2017). Randomized social choice functions under metric preferences. Journal of Artiﬁcial Intelligence Research, 58, 797 827.

Austen-Smith, D., & Banks, J. (1988). Elections, coalitions, and legislative outcomes. American Political Science Review, 82(2), 405 422.

Basin, D., Radomirovic, S., & Schmid, L. (2018). Alethea: A provably secure random sample voting protocol. In 2018 IEEE 31st Computer Security Foundations Symposium (CSF), pp. 283 297. IEEE.

Benad e, G., G olz, P., & Procaccia, A. D. (2019). No stratiﬁcation without representation. In Proceedings of the 2019 ACM Conference on Economics and Computation, pp. 281 314.

Border, K. C., & Jordan, J. S. (1983). Straightforward elections, unanimity and phantom voters. The Review of Economic Studies, 50(1), 153 170.

Brill, M. (2019). Interactive democracy: New challenges for social choice theory. In The Future of Economic Design, pp. 59 66. Springer.

Representative Committees of Peers

Chamberlin, J. R., & Courant, P. N. (1983). Representative deliberations and representative decisions: Proportional representation and the Borda rule. The American Political Science Review, 718 733.

Chaum, D. (2016). Random-sample voting. White Paper.

Cheng, Y., Dughmi, S., & Kempe, D. (2017). Of the people: voting is more eﬀective with representative candidates. In Proceedings of the 2017 ACM Conference on Economics and Computation, pp. 305 322.

Cheng, Y., Dughmi, S., & Kempe, D. (2018). On the distortion of voting with multiple representative candidates. In Thirty-Second AAAI Conference on Artiﬁcial Intelligence.

Cohensius, G., Mannor, S., Meir, R., Meirom, E., & Orda, A. (2017). Proxy voting for better outcomes. In Proceedings of the 16th Conference on Autonomous Agents and Multi Agent Systems, pp. 858 866. International Foundation for Autonomous Agents and Multiagent Systems.

Dalton, R. J., Burklin, W. P., & Drummond, A. (2001). Public opinion and direct democracy. Journal of Democracy, 12(4), 141 153.

Dowlen, O. (2017). The political potential of sortition: A study of the random selection of citizens for public oﬃce, Vol. 4. Andrews UK Limited.

Eastlake, D. (2004). Publicly veriﬁable nomination committee (Nom Com) random selection. Network Working Group of The Internet Society, https://tools.ietf.org/pdf/ rfc3797.pdf (accessed November 8, 2020).

Faliszewski, P., Skowron, P., Slinko, A., & Talmon, N. (2017). Multiwinner voting: A new challenge for social choice theory. Trends in computational social choice, 74, 27 47.

Feller, W. (2008). An introduction to probability theory and its applications, Vol. 2. John Wiley & Sons.

Fishkin, J. S., Senges, M., Donahoe, E., Diamond, L., & Siu, A. (2018). Deliberative polling for multistakeholder internet governance: considered judgments on access for the next billion. Information, Communication & Society, 21(11), 1541 1554.

Flanigan, B., G olz, P., Gupta, A., & Procaccia, A. (2020). Neutralizing self-selection bias in sampling for sortition. ar Xiv preprint ar Xiv:2006.10498.

Galvin, J. (2004). IAB and IESG selection, conﬁrmation, and recall process: Operation of the nominating and recall committees. Network Working Group of The Internet Society, https://tools.ietf.org/pdf/rfc3777.pdf (accessed November 8, 2020).

Garg, N., Kamble, V., Goel, A., Marn, D., & Munagala, K. (2019). Iterative local voting for collective decision-making in continuous spaces. Journal of Artiﬁcial Intelligence Research, 64, 315 355.

Gersbach, H., Mamageishvili, A., & Tejada, O. (2017). Sophisticated attacks on decoy ballots: The devil s menu and the market for lemons. ar Xiv preprint ar Xiv:1712.05477.

Gersbach, H., Mamageishvili, A., & Tejada, O. (2021). The eﬀect of handicaps on turnout for large electorates with an application to assessment voting. Journal of Economic Theory, 105 228.

Meir, Sandomirskiy & Tennenholtz

Gersbach, H., Mamageishvili, A., Tejada, O., et al. (2020). Appointed learning for the common good: Optimal committee size and eﬃcient rewards. Tech. rep., CEPR Discussion Papers.

Goel, A., Krishnaswamy, A. K., & Munagala, K. (2017). Metric distortion of social choice rules: Lower bounds and fairness properties. In Proceedings of the 2017 ACM Conference on Economics and Computation, pp. 287 304.

Goelz, P., Kahng, A., Mackenzie, S., & Procaccia, A. D. (2018). The ﬂuid dynamic of liquid democracy. In International Conference on Web and Internet Economics.

Green-Armytage, J. (2015). Direct voting and proxy voting. Constitutional Political Economy, 26(2), 190 220.

Guerrero, A. A. (2014). Against elections: The lottocratic alternative. Philosophy & Public Aﬀairs, 42(2), 135 178.

Hansen, M. H. (1999). The Athenian democracy in the age of Demosthenes: structure, principles, and ideology. University of Oklahoma Press.

Hoeﬀding, W. (1994). Probability inequalities for sums of bounded random variables. In The Collected Works of Wassily Hoeﬀding, pp. 409 426. Springer.

Jamroga, W., Roenne, P. B., Ryan, P. Y., & Stark, P. B. (2019). Risk-limiting tallies. In International Joint Conference on Electronic Voting, pp. 183 199. Springer.

Kahng, A., Mackenzie, S., & Procaccia, A. D. (2018). Liquid democracy: An algorithmic perspective. In Thirty-Second AAAI Conference on Artiﬁcial Intelligence.

Lackner, M., & Skowron, P. (2020). Approval-based committee voting: Axioms, algorithms, and applications. ar Xiv preprint ar Xiv:2007.01795.

Lang, J., & Xia, L. (2016). Voting in combinatorial domains. In Handbook of Computational Social Choice, pp. 197 222. Cambridge University Press.

Lenstra, A. K., & Wesolowski, B. (2015). A random zoo: sloth, unicorn, and trx. IACR Cryptology e Print Archive, 2015, 366.

Magdon-Ismail, M., & Xia, L. (2018). A mathematical model for optimal decisions in a representative democracy. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 4707 4716.

Meir, R., Procaccia, A. D., & Rosenschein, J. S. (2012). Algorithms for strategyproof classiﬁcation. Artiﬁcial Intelligence, 186, 123 156.

Mowbray, M., & Gollmann, D. (2007). Electing the doge of venice: Analysis of a 13th century protocol. In 20th IEEE Computer Security Foundations Symposium (CSF 07), pp. 295 310. IEEE.

Mueller, D. C., Tollison, R. D., & Willett, T. D. (1972). Representative democracy via random selection. Public Choice, pp. 57 68.

Parkes, D. C., Tylkin, P., & Xia, L. (2017). Thwarting vote buying through decoy ballots. In International Conference on Autonomous Agents and Multiagent Systems, pp. 45 66. Springer.

Representative Committees of Peers

Paroush, J. (1998). Stay away from fair coins: A condorcet jury theorem. Social Choice and Welfare, 15 20.

Pauly, M., & Van Hees, M. (2006). Logical constraints on judgement aggregation. Journal of Philosophical logic, 35(6), 569 585.

Pivato, M., & Soh, A. (2020). Weighted representative democracy. Journal of Mathematical Economics.

Procaccia, A. D., & Rosenschein, J. S. (2006). The distortion of cardinal preferences in voting. In International Workshop on Cooperative Information Agents, pp. 317 331. Springer.

Procaccia, A. D., & Tennenholtz, M. (2009). Approximate mechanism design without money. In Proceedings of the 10th ACM conference on Electronic commerce, pp. 177 186.

Saran, R., & Tumennasan, N. (2013). Whose opinion counts? Implementation by sortition. Games and Economic Behavior, 78, 72 84.

Shevtsova, I. (2011). On the absolute constants in the Berry-Esseen type inequalities for identically distributed summands. ar Xiv preprint ar Xiv:1111.6554.

Skowron, P. K. (2015). What do we elect committees for? A voting committee model for multi-winner rules. In Twenty-Fourth International Joint Conference on Artiﬁcial Intelligence.

Walsh, T., & Xia, L. (2012). Lot-based voting rules.. In AAMAS, Vol. 12, pp. 603 610. Citeseer.