# misrepresentation_in_district_voting__0c1fde23.pdf Misrepresentation in District Voting Yoram Bachrach Microsoft Research UK yobach@microsoft.com Omer Lev Univ. of Toronto Canada omerl@cs.toronto.edu Yoad Lewenberg Hebrew Univ. of Jerusalem Israel yoadlew@cs.huji.ac.il Yair Zick Carnegie Mellon Univ. USA yairzick@cs.cmu.edu Voting systems in which voters are partitioned to districts encourage accountability by providing voters an easily identifiable district representative, but can result in a selection of representatives not representative of the electorate s preferences. In some cases, a party may have a majority of the popular vote, but lose the elections due to districting effects. We define the Misrepresentation Ratio which quantifies the deviation from proportional representation in a district-based election, and provide bounds for this ratio under various voting rules. We also examine probabilistic models for election outcomes, and provide an algorithm for approximating the expected Misrepresentation Ratio under a given probabilistic election model. Finally, we provide simulation results for several such probabilistic election models, showing the effects of the number of voters and candidates on the misrepresentation ratio. 1 Introduction A voting system is a method by which voters choose between several alternatives and their opinions are aggregated, ultimately choosing a winner (or winners). Democratic countries, in principle, aim to have a representative outcome, by having a legislature roughly representative of the public s beliefs, and in some countries, by having the chief executive elected directly by the public. However, many democracies use a district based system for the selection of their legislature (most prominently, the Westminster system and the US system). In district based schemes, voters are divided into geographically based districts,1 and each one selects a representative to the legislature. The selection mechanisms differ: Westminster and US systems use plurality, France uses plurality with runoff, while Australia uses STV. Typically, candidates in each constituency are members of political parties, and some systems have the majority party form the executive (others, such as the US, have a similar process for selecting the chief executive). 1While we use the term district , other terms include electoral district (US), riding (Canada), and constituency (UK). Single-member district elections provide voters a single and easily identifiable district representative, encouraging service and accountability. However, the proportion of seats in the legislature belonging to a party may be very different from the proportion of voters supporting that party in the overall population; this is known as the referendum paradox [Nurmi, 1999]. The disparity between the popular vote and the district vote has been a source of contention in US elections; by redistricting constituencies ( gerrymandering ), political parties have manipulated the elections [Issacharoff, 2002; Erikson, 1972]; indeed, the US Voter Rights Act of 1965 includes several provisions that require change in congressional districts in several states to be approved by federal authorities [Schuck, 1987]. Moreover, such a discrepancy is caused not only by gerrymandering, but is built into district-based mechanisms. In the US presidential election of 1876, the losing candidate, Samuel Tilden, got 6% more votes than the winner, Rutherford Hayes, an occurrence that happened twice since. The electoral college, through which the president is elected, displays the problem even more acutely; in the US 1992 presidential election a candidate that garnered 18.9% of the vote, Ross Perot, was not represented at all. In the UK system, in 1951 the Conservative party lost the popular vote to the Labour party while still winning a strict parliamentary majority; in general, while no party has received a majority of the popular vote in a British election since 1931, all but two elections resulted in a parliamentary majority for one of the parties. Similar scenarios have unfolded in Canada, Australia and elsewhere. Such problems may also arise in any multi-level decision making process. So, if an organization or a sensor analysis system (e.g., an automated car), attempts to decide on its next step based on inputs from sub-systems employing their own decision making processes (e.g., each sensor family is a district, and sensors are voting between themselves), they may also encounter such a problem, as a small amount of signals may cause the system to reach a wrong outcome. Consider the following example. Two political parties, A and B, compete in seven districts of equal size, D1, D2, . . . , D7. Both parties run a candidate in every district, and the plurality voting rule is used to determine the winner. Now, suppose that in districts D1, . . . , D4, 60% of the vote goes to the representative of party A; on the other Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-16) hand, in districts D5, D6 and D7 the representatives of party B take 100% of the vote. The final vote tally shows party A wins the election, even though party B has nearly twice as many voters! As the example above shows, a minority (those in favor of party A) may rule over the majority (those in favor of party B). We refer to the outcome in this case as a misrepresentation of the voters preferences. District based electoral systems tend to be more stable : they tend to result in a smaller number of candidates, and thus less fragmentation of the parliamentary body. However, misrepresentation is an inherent byproduct of electoral stability; indeed, stability comes at the heavy price of potentially overriding the preferences of most voters. One way to quantify the degree to which a system skews the true desires of the voters as captured by the total support for each party is to examine the ratio between the number of those who voted for the most popular party in general, and the number of voters who voted for the winner. When more people voted for the losing party than for the winning party this ratio is larger than one. The higher this ratio is, the more pronounced the misrepresentation effect. Our Contribution We examine the issue of misrepresentation due to district-based systems under several prominent voting rules. We first provide a metric for quantifying the misrepresentation effect, both for elections with two parties and for elections with more than two parties, which we call the Misrepresentation Ratio. We provide bounds on the Misrepresentation Ratio for several voting rules, depending on the numbers of voters, candidates and districts. Finally, we provide some simulation results regarding the misrepresentation ratio in certain scenarios, to examine its values under various settings. Related Work Voting district representability has been the topic of much public debate and research for two centuries, since the advent of redistricting in the US ( gerrymandering ), and in the UK, redistricting due to an attempt to follow population changes in successive reform bills. Most of the research on these issues focused on historical [Engstrom, 2006; Butler, 1992], sociological [Lublin, 1997] and political science [Erikson, 1972] issues. Particularly since the Voting Rights Act of 1965, much legal research has also dealt with district based system, though usually focusing on particular countries (mainly though not solely the US) [Schuck, 1987; Issacharoff, 2002]. Political science research on misrepresentation mostly focused on its possible occurrence, history or statistical analysis [Feix et al., 2004; Miller, 2011; 2014; Lahrach and Merlin, 2010; Grofman et al., ; Tangian, 2010; Felsenthal and Miller, 2015]. Some research defines voter misrepresentation as distance of parliamentary seat allocation compared to pure representational settings [Gallagher, 1991; Felsenthal and Machover, 1999; Gelman et al., 2002; Feix et al., 2008], using concepts such as the Banzhaf index. Manipulation by voters is well studied [Xia et al., 2009; Zuckerman et al., 2008; Procaccia et al., 2007], as opposed to institutional manipulation. Our work is similar to voting control problems (see [Hemaspaandra et al., 2007] and the survey by [Faliszewski et al., 2010]), including preliminary work on dividing voters into groups [Erd elyi et al., 2015]. Optimal gerrymandering strategies have been studied in [Owen and Grofman, 1988; Friedman and Holden, 2005]; however, these have mostly focused on 2 party scenarios, as in the US. In establishing our bounds, we encounter the problem of finding the minimal score a candidate can win with, given a scoring rule; Xia [2012] studies a similar problem, but his results do not directly apply to our setting. Our problem is related to the bin packing problem with a constant number of bins. This problem was believed to be NP-hard, but recent advances [Stille, 2008; Jansen et al., 2013] established that it is in P [Goemans and Rothvoß, 2014]. 2 Preliminaries We have a set of voters, V , and each voter i 2 V has a preference order (without ties) over the candidate set C, denoted i. For every c, c0 2 C, we say that i prefers c to c0 if c0 i c. We denote the set of linear preferences over C as L(C). A voting rule is a function f : L(C)n ! C, whose input is a finite list of linear preferences over C (a profile) of size n (most voting rules are well defined for any n > 0) and whose output is a candidate c 2 C. Voting rules are often assumed to output a set of candidates, but since in our setting we are only interested in the winners of the election, the output of f is a single candidate (f may incorporate some fixed tie-breaking rule). Since the set of voters and their preferences is constant throughout the paper, we overload notation and define f over subsets of V : given a set Q V , we let f(Q) be the output of f over the preferences of the voters in Q. A voting rule is neutral if the outcome is invariant under candidate name changes. More formally, a voting rule f is neutral if for any two candidates c and c0, if every voter i 2 V now ranks c in the position of c0 and vice versa, the outcome of f is unchanged if the winner was neither c or c0, is c0 if the winner was c, and is c if the winner was c0. We say a voting rule f is score-monotone if it induces a score for every candidate c 2 C (i.e., each candidate ends up with some real number, and the one with the maximal one is chosen), and the following holds: given two preference profiles R1 = ( i)i2V and R2 = ( 0 i)i2V , if for all a, b 2 C \ {c} we have a i b () a 0 i b, and under R1, c is in a position that is no lower than its position in R2, then c s score under R1 is at least as high as its score in R2. These properties hold for the common voting rules which will be used in this paper. Scoring Rules: A scoring rule for m candidates is defined by a vector of scores, ( 1, . . . , m), where 1 m = 0, 1 > 0.2 Given a voter i 2 V , let rank i(c) be the rank of candidate c in the preference order of i. Given a set of voters Q V , let score i2Q rank i(c). Given a set of voters Q V , the output of Scoring (Q) is a member of arg maxc2C score c (Q) (if there are ties, we break them according to some tie-breaking rule). Many scoring rules are widely used. For example, the plurality rule, where 1 = 1 and j = 0 for all j 2, the veto rule, where j = 1 for all j < m, and m = 0, both of which are specific 2Assuming m = 0 is no loss of generality: if m > 0 score vector can be normalized by setting j = j m. examples of the family of k-approval scoring rules, in which 1 = . . . = k = 1 and k+1 = . . . = m = 0. Another common scoring rule is the Borda rule, where j = m j for all j 2 {1, . . . , m}. Copeland Rule: Given a set of voters Q V , we say that a candidate c beats c0 in a pairwise election under Q if a majority of the members of Q prefer c to c0. For each candidate c 2 C, we let score Cp c (Q) to be the number of candidates that c beats in a pairwise election, minus the number of candidates that beat c. The Copeland voting rule outputs Copeland(Q) = arg maxc2C score Cp c (Q). We note that both scoring rules and Copeland have a natural notion of a candidate score, and were conceptualized as such in [Bartholdi et al., 1989]. This will be instrumental in defining voting misrepresentation, as we do in the following section. Definition 2.1. Given a voting rule f which induces a candidate score, we let scoref(c, Q) to be the score of c 2 C under f, when the voter set is Q V . 3 The Misrepresentation Ratio We are interested in settings where the set of voters is partitioned into districts: these are z disjoint sets D1, . . . , Dz whose union is V . The election winner under this model is determined by applying the voting rule f to D1, . . . , Dz; the candidate who wins the greatest number of districts is the winner (ties are broken using some tie-breaking rule). Definition 3.1. Given a partition of voters into z districts D = {D1, . . . , Dz}, let w be the winner of the election when voters are partitioned into districts as per D. The misrepresentation ratio is the ratio between the maximum score of any candidate under f, and the score of w. Formally: MR(V, D, f) = maxc2C scoref(c, V ) scoref(w, V ) . Note that MR(V, D, f) 1; if MR(V, D, f) = 1 then the winner of the district elections completely captures the popular vote (as measured by f). The higher MR(V, D, f), the less popular the winning candidate is in the eyes of the people; thus, district elections with a high misrepresentation ratio are ones where voters preferences are not appropriately aggregated, due to the effects of district elections. Remark 3.2. In this work we assume all districts have the same number of voters. Without any assumptions on the number of voters in each district, the worst case misrepresentation ratio (Definition 3.1) can be arbitrarily high: consider 2 + 1 districts. Suppose + 1 of the districts have only one voter, and of them have M voters, where M is a very large number. There are two candidates, A and B; candidate A wins all votes in the districts where there are M voters, and candidate B wins all votes in the + 1 districts holding one voter. Thus, the total number of votes for B is O( ), and the total number of votes for A is O(M ), resulting in an arbitrarily high misrepresentation. Our results can be extended by incorporating an additional parameter: max D,D02D |D| |D0|; however, in the interest of space and clarity, we assume that districts are of equal size. 4 Bounds on the Misrepresentation Ratio In what follows, we establish upper bounds on the worst-case misrepresentation ratio; we show that our bounds are tight, i.e. that there exist district elections where the bound holds. Furthermore, we always refer to the district voting instance h V, C, ( i)i2V , D, fi as one that maximizes MR, where |C| = m, |D| = z, for all D 2 D: |D| = n , and f is some neutral score-monotone rule. We assume that w 2 C is the winner of the district elections; that is, w won a plurality of the districts. We mark d(c) for c 2 C as the number of districts won by candidate c. We assume that in the case of a tie for first place in a district, ties are broken against the district election winner; moreover, the district election winner must win a strict plurality of the districts. Intuitively, in order to establish our bound, we want to create the worst possible election. Such an election would have the candidate w win by as small a margin as possible, with some other candidate p 6= w being as popular as possible while losing a plurality of the districts. Given a score inducing voting rule f, we let Ln(f) be the maximal score that a candidate can get while still losing an n voter election, and Mn(f) be the maximal score that a candidate can get and win an n voter election. For example, Ln(Plurality) = 1, and Mn(Plurality) = n. The following lemma (whose proof is in the appendix) offers some insights regarding the number of districts won by each candidate in an instance maximizing MR. Lemma 4.1. If f is a score-monotone, neutral rule then there is a district voting instance maximizing MR such that: 1. if w does not win the election in a district D, then w is ranked last by all voters in D. 2. If the candidate p wins in D, then it is ranked first by all voters in D; if not and the winner is not w, then the score of p from D is Ln(f). 3. d(p) = d(w) 1 if m > 2 or z is odd, otherwise d(p) = 4.1 Scoring Rules We begin our investigation by bounding MR when f is a scoring rule. We begin with a simple technical lemma. (proof in the appendix). Lemma 4.2. For every scoring rule f = Scoring , where = ( 1, . . . , m), if 1 > 2, Ln(f) = 1( + 1). If 1 = 2, denote 0 as the maximal i such that i < 1, then Ln(f) = (n 1) 1 + 0. Corollary 4.3. If f = Scoring then (z+1)Ln(f) Mn(f) for any z > 1. Proof. Given a vector = ( 1, . . . , m) we have that Mn(f) = n 1. Now, by Lemma 4.2, we know that Ln(f) & n 1, from which the claim trivially follows. The following lemma (full proof in the appendix) tells us the number of districts that must be won by the winning candidate in a district election that maximizes MR. Lemma 4.4. Let h V, C, ( i)i2V , D, fi be a district voting instance which maximizes MR, where f is a scoring rule. z (number of districts) can be written as m+r ( , r 2 N[{0} and r < m). There is an MR maximizing instance where the number of districts that w wins is at most + 2, and every other candidate wins at least 1 districts. Proof Sketch. Marking as xw the score w receives in districts won by w, and xp the score p receives in those same districts, we observe that MR can be written as xp + Mn(f) 2Ln(f) xw + (z + 1)Ln(f) Mn(f) xwd(w) ; (1) Taking a derivative of (1) on d(w) shows that (1) is maximized when d(w) is minimized, from which the lemma is derived. The following lemmata (proofs in the appendix) discuss the score that w receives in districts that it wins. Lemma 4.5. For every scoring rule f, if n z, z > 3 (i.e., number of voters in each district is far larger than the number of districts), the score given to w in districts in which it wins must be the minimal possible score needed to win a district in an instance maximizing MR.3 Lemma 4.6. For k-approval voting rules, the minimal score a winner can get in an election is + 1 if nk 1 mod m. Otherwise, it is + 1. Armed with Lemmata 4.1 4.6, we now proceed to analyze specific voting rules. Theorem 4.7. Suppose that n = qm + s and z = m + r, where q, s, , r 2 N [ {0} and s, r < m; then MR(V, D, Plurality) is at least + 1 q + 2 + (z + 1)( 1) n ( + 2)(q + 2) and at most + 1 q + 1 + (z + 1)( 1) n ( + 1)(q + 1) . In particular, MR(V, D, Plurality) is in (m2). Proof. By Lemma 4.4 and Corollary 4.3, we know that w wins either + 1 or + 2 districts. Plugging in the values Ln(Plurality) = 1 and Mn(Plurality) = n into (1), we have that MR(V, D, Plurality) equals 1) n xwd(w) (xw and xp denote the score of w and p, respectively, when w wins a district). We are left just with determining the value of xp and xw. For reasons similar to the ones detailed in Lemma 4.1, it holds that xp = xw 1. Due to Lemma 4.5, MR(V, D, Plurality) is maximized when xw is minimal; We have that w receives q + 2 votes if s 2, and q + 1 otherwise. Plugging this into (1) we obtain the desired result. 3This is not trivial, as possible to increase w s score to allow p to receive a higher score as a 2nd place candidate. The second expression in the upper bound of Theorem 4.7, n 2d n 2 e+1 q+1 can be upper bounded by 1 q+1, which is at most m n . Thus, if the number of voters dominates the number of candidates, this expression has little effect on MR. The second expression can be upper bounded as follows 1) n ( + 1)(q + 1) zn ( + 1)(q + 1) m2. A similar lower bound of can be shown as well, which concludes the proof. Note that tightness is achieved as our constructed expressions were dependent on particular voting profiles (as described in Lemma 4.1), and hence carry on to these expressions. As some parliamentary systems require not a plurality of districts to become a winner, but a majority, we also note the MR in these cases (the proof can be found in the appendix). Corollary 4.8. If the number of districts needed for a victory is above 50%, MR for plurality is (m). The following theorem (proof in the appendix) discusses the MR for k-approval scoring rule. Theorem 4.9. Suppose that the number of districts is expressed as z = m + r, where , r 2 N [ {0} and r < m; then MR(V, D, k-approval) for k > 1 is at least 1 + n(z + + 2) (z + 1) and at most 1 + n(z + + 1) (z ) In particular, MR(V, D, k-approval) is in Corollary 4.10. MR(V, D, Veto) is in (m). One of the main challenges in computing a closed form formula for MR for general scoring rules is that one must first decide what is the minimal score that w can obtain while winning a district for a given score vector . This problem can be thought of as a bin packing problem: candidates can be thought of as bins, and the scores must be packed into them. It is only recently that a polynomial time algorithm has been proposed for bin packing problems with a constant number of types (also commonly referred to as the one-dimensional cutting stock problem) [Goemans and Rothvoß, 2014]. Thus, for general scoring rule we offer looser bounds on the number of votes needed to win: j=1 j; then the minimal number of votes needed to win a district is at most + 1, and at least n S : we allocate the scores as evenly as possible among the candidates, and break the tie in favor of the winner using at most 1 points. Of course, in some cases this can be improved, but it depends on , and on the divisibility of n S and m. The following theorem (proof in the appendix) uses these loose bounds to bound MR for the Borda scoring rule. Theorem 4.11. MR(V, D, Borda) is in (m2). Algorithm 1 Monte-Carlo MR Approximation 1: procedure EXPECTED-MR ( M , B, ", δ) 3: T = 0 4: for i = 1 to s do 5: Sample an election outcome E from M 6: w = arg maxc2C score(c, V ) // The winner 7: sm = maxc2C score(c, V ) // Maximal score 8: Rsamp sw // sampled MR 9: T = T + Rsamp i 10: return ˆr = T s // average of sampled MRs 4.2 Copeland When using the Copeland voting rule, one can get an undefined value for MR, as a score of 0 is possible for the winner. Example 4.12. Let us have two identical districts, each containing 21 voters with the preference w p a and 20 voters who have p w a. A third district contains 41 voters with the preference p w a. w wins the 2 first districts, becoming the ultimate winner. But, looking globally, p s Copeland score is 2, while w s Copeland score is 0, making MR(V, D, Copeland) undefined. The Copeland score can be additively adjusted by adding to each candidate s score a fixed amount that is larger than m. However, Copeland s performance remains bad, as is captured by Theorem 4.13 (see appendix for proof). Theorem 4.13. Under Copeland, the winner w may have the worst possible Copeland score, while another candidate has the best possible Copeland score. 5 The Misrepresentation Ratio Under Uncertain Votes In Section 4 we established bounds on the misrepresentation ratio by constructing pathological examples: settings where districting effects were so pronounced as to cause an extremely unpopular candidate to win the elections, despite the existence of a clearly better alternative. We now take the average-case, rather than the worst-case, approach, and ask how common are instances where misrepresentation is high. We do this in the form of a probabilistic generative model, utilizing partial information to inform our assumptions on the general population. Any instantiation of the model is a voting domain for which we can compute the misrepresentation ratio. Thus, MR is a random variable and we evaluate the expected MR. A naive solution is to exhaustively search over the space of possible election outcomes; for each such outcome we can compute its probability of occurring under the generative model, and the MR value for that outcome; we can then sum the product of the two across all outcomes to get the expected MR. However, such an exhaustive search is intractable, as the space of outcomes can be prohibitively large, especially when there are many candidates, voters and districts. We propose an alternative approximate solution, based on a Monte-Carlo algorithm. Our algorithm requires a bound Chart Title 0.01 0.015 0.02 0.025 0.03 0.035 0.04 0.045 0.05 Ɛ Figure 1: The effects of the noise on the MR on the MR for the examined domain, and such can be found for many cases (see previous section). We assume the generative model is given in the form of a black-box, which outputs a sampled election outcome, consisting of the votes of every voter in every district. We further assume that the winner of the election can be computed in polynomial time.4 Denote the generative model as M, and by r = EM(MR) the expectation of MR under the model M. Our proposed algorithm is probably approximately correct : given two parameters, " and δ, the algorithm returns an approximation ˆr to r, such that with high probability 1 δ the returned value ˆr is very close to r, so that |ˆr r| ". The running time of the algorithm depends on " and δ; it is quadratic in 1 " and logarithmic in 1 δ . Our proposed algorithm is a Monte-Carlo algorithm, but it is only appropriate to voting rules where there is a known bound on the possible MR values.5 The minimal MR value is 1 (as this is the ratio between the maximal score of any candidate and the score of a specific candidate, namely the winner). Given a bound H on the maximal MR in a domain, we refer to the MR value range as B = H 1. The runtime of our algorithm is quadratic in B. The method is given in Algorithm 1, and we provide a proof for its correctness. Theorem 5.1. The value returned by Expected-MR is an ", δ approximation for the expected MR under M: with probability at least 1 δ the returned value ˆr is within a distance " of r = EM(MR), i.e: |ˆr r| ". Proof. We note that the Rsamp i computed inside the loop is the MR in a specific instantiation of an election outcome E sampled from the generative model M (see Definition 3.1), so each Rsamp i is a random variable, whose expectation is r = EM(MR) (i.e. E[Rsamp i ] = r). Our algorithm computes T = Ps i , the sum of s i.i.d draws, each of which has a value of r in expectation, so E[ˆr] = E[ T s ] = r. We use Hoeffding s inequality [Hoeffding, 1963] to show that the number of samples s that we use achieves the desired accuracy " and confidence δ. Hoeffding s inequality states that if R1, . . . , Rn are independent random variables, where each 4Not all voting rules admit a polynomial winner determination algorithm. As our algorithm samples election outcomes, its runtime in this case would not be polynomial. 5Our method is akin to Monte-Carlo methods used for analyzing voting under various forms of uncertainty [Fatima et al., 2008; Bachrach et al., 2010; 2011; Bachrach and Shah, 2013]. 0 1000 2000 3000 4000 5000 Voters in every district (a) MR where the preferences are uniformly drawn, under Borda scoring rule. 0 1000 2000 3000 4000 5000 Voters in every district (b) MR where the preferences are drawn by a Mallows model, under plurality. 1000 2000 3000 4000 5000 6000 Chart Title Series1 Series2 Series3 Series4 Series5 1 0 1000 2000 3000 4000 5000 Voters in every district (c) MR where the preferences are drawn by a Mallows model, under Copeland voting rule. Figure 2: Simulations results Ri is bounded so that Ri 2 [ai, bi], and if T = Ps then Pr(|T E[T]| s") 2 exp i=1(bi ai)2 i is the MR in an election outcome obtained under the generative model M, so Rsamp i is bounded in the range [1, H] (i.e., by our assumption, the MR value range is B = H 1). Applying Hoeffding s bound and substituting B2 ln 2 δ 2 "2 for s, we get Pr δ as required. 6 Simulations We now use our Algorithm Expected-MR to analyze the MR in several voting domains. We begin with a noisy version of the example domain described in the introduction. We fix C = {w, p}, and the number of districts to 11: 6 districts of type A and 5 of type B , modeling heterogeneous and homogenous districts respectively. In type A districts, every voter v votes randomly with Pr[v votes for w] = 1 2 + ", and Pr[v votes for p] = 1 2 "; in type B districts, Pr[v votes for w] = ", Pr[v votes for p] = 1 ", for " 2 & . Figure 1 shows the averaged MR as a function of the noise " (x-axis). Each point is average MR obtained for many elections sampled using this probabilistic model; MR first increases as the noise " grows, until it reaches a sweetspot from which it drops. This indicates that for some models the model noise may not have a monotone effect on the MR. In our second experiment, we fix the number of districts to 15, and range the number of voters in every district from 100 to 5000. We examined MR of elections with m 2 {3, 4, . . . , 7} candidates. Figure 2a shows the averaged MR where the preference of every voter in every district is uniformly drawn from the set of all m! possible orders of candidates, under Borda scoring. The figure shows that increasing the number of voters tends to lower MR. This is not surprising, as all candidates will likely have nearly the same score. Next, we consider the Mallows model [Mallows, 1957] for generating voter preferences, where we assume that there is a ground truth for every district, σ (representing the common ranking of candidates in that district) and dispersion parameter φ 2 . Under the Mallows model every voter compares every pair of candidates independently and ranks them correctly (according to σ ) with probability φ. For every district, σ was drawn uniformly at random and 2, 1). We used the plurality scoring rule (Figure 2b) and Copeland voting rule (Figure 2c). As predicted by our theoretical results, MR grows when there are more candidates. Under Copeland, a fixed amount of m was added each candidate s score so that MR would be positive. Our simulation results indicate that voting misrepresentation may occur in several natural domains. Our second experiment is fair in the sense that there is no preferred candidate, and yet the MR values are quite high. Also, our theoretical results agree with experiments in some natural domains. 7 Conclusions This work analyzes district-based elections. We demonstrate the representability issues that arise in such elections, and show tight bounds on misrepresentation. We further show that misrepresentation is a common occurrence under various natural voter distributions, and that its effect may not diminish even when the number of voters is large. District based elections tend to under-represent smaller parties; this is a long observed phenomenon (and, in some countries, a welcome stabilizing feature). However, we do not focus on smaller parties, but rather show that the preferences of large majorities may be completely unrepresented (the UK elections of 1951, where the Labour party with 48.8% support lost the elections, and the Conservative party with 44.3% not only beat it, but had a strong parliamentary majority, is just one example of these occurrences). Research into the institutional bias in voting procedures, beyond control issues, is one which we think deserves more attention by the computational social choice community. Very few election systems in the world are proportional, and the effect this has on the expression of voters views has mostly focused (in political science research) on how small minorities are hurt. As our analysis shows, large majorities may also be affected. Further research is needed with regards to other voting methods. Moreover, further and complementary concepts may be developed, indicating unfairness, lack of representation and other problems with various voting procedures (parliamentary entrance bounds, common in some countries, are an obvious candidate for such directions). In addition, while we have focused on an outcome of a single winner, a coalitional analysis of district settings may also be of interest. Acknowledgements This work was supported in part by NSERC grant 482671. References [Bachrach and Shah, 2013] Yoram Bachrach and Nisarg Shah. Re- liability weighted voting games. In Algorithmic Game Theory, pages 38 49. Springer, 2013. [Bachrach et al., 2010] Yoram Bachrach, Nadja Betzler, and Piotr Faliszewski. Probabilistic possible winner determination. In AAAI, volume 10, pages 697 702, 2010. [Bachrach et al., 2011] Yoram Bachrach, Reshef Meir, Michal Feldman, and Moshe Tennenholtz. Solving cooperative reliability games. UAI, 2011. [Bartholdi et al., 1989] J. J. Bartholdi, C. A. Tovey, and M. A. Trick. The computational difficulty of manipulating an election. SCW, 6(3):227 241, 1989. [Butler, 1992] D. Butler. The electoral process the redrawing of parliamentary boundaries in britain. British Elections and Parties Yearbook, 2(1), 1992. [Engstrom, 2006] E. J. Engstrom. Stacking the states, stacking the house: The partisan consequences of congressional redistricting in 19th century. APSR, 100:419 427, 2006. [Erd elyi et al., 2015] G. Erd elyi, E. Hemaspaandra, and L. A. Hemaspaandra. More natural models of electoral control by partition. In ADT, pages 396 413, 2015. [Erikson, 1972] R. S. Erikson. Malapportionment, gerrymandering, & party fortunes in congressional elections. APSR, 66(4):1234 1245, 1972. [Faliszewski et al., 2010] P. Faliszewski, E. Hemaspaandra, and L. A. Hemaspaandra. Using complexity to protect elections. CACM, 53(11):74 82, 2010. [Fatima et al., 2008] Shaheen S Fatima, Michael Wooldridge, and Nicholas R Jennings. A linear approximation method for the shapley value. Artificial Intelligence, 172(14):1673 1699, 2008. [Feix et al., 2004] M. R. Feix, D. Lepelley, V. R. Merlin, and J. Rouet. The probability of conflicts in a U.S. presidential type election. Economic Theory, 23(2):227 257, 2004. [Feix et al., 2008] M.R. Feix, D. Lepelley, V.R. Merlin, J. Rouet, and L. Vidu. Majority efficient representation of the citizens in a federal union. Technical report, Universit e de la R eunion, de Caen, d Orleans, 2008. [Felsenthal and Machover, 1999] D. S. Felsenthal and M. Ma- chover. Minimizing the mean majority deficit: The 2nd squareroot rule. Math. Soc. Sci., 37(1):25 37, 1999. [Felsenthal and Miller, 2015] Dan S Felsenthal and Nicholas R Miller. What to do about election inversions under proportional representation? Representation, 51(2):173 186, 2015. [Friedman and Holden, 2005] John N Friedman and Richard Holden. Towards a theory of optimal partisan gerrymandering. Mimeograph, Harvard University, 2005. [Gallagher, 1991] M. Gallagher. Proportionality, disproportionality and electoral systems. Electoral Studies, 10(1):33 51, 1991. [Gelman et al., 2002] Andrew Gelman, Jonathan N. Katz, and Francis Tuerlinckx. The mathematics and statistics of voting power. Statistical Science, 17(4):420 435, 2002. [Goemans and Rothvoß, 2014] M.X. Goemans and T. Rothvoß. Polynomiality for bin packing with a constant number of item types. In SODA, pages 830 839, 2014. [Grofman et al., ] Bernard Grofman, William Koetzle, and Thomas Brunell. An integrated perspective on the three potential sources of partisan bias: Malapportionment, turnout differences, and the geographic distribution of party vote shares. Electoral studies, 16:457 470. [Hemaspaandra et al., 2007] E. Hemaspaandra, L. A. Hemaspaan- dra, and J. Rothe. Anyone but him: The complexity of precluding an alternative. AIJ, 171(5 6):255 285, 2007. [Hoeffding, 1963] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association, 58(301):13 30, 1963. [Issacharoff, 2002] S. Issacharoff. Gerrymandering & political car- tels. Harvard Law Review, 116(2):593 648, 2002. [Jansen et al., 2013] K. Jansen, S. Kratsch, D. Marx, and I. Schlot- ter. Bin packing with fixed number of bins revisited. JCSC, 79(1):39 49, 2013. [Lahrach and Merlin, 2010] R. Lahrach and V. R. Merlin. Assess- ing the probability of the referendum paradox: The french local election case. Voting Power in Practice workshop, 2010. [Lublin, 1997] D. Lublin. The Paradox of Representation: Racial Gerrymandering and Minority Interests in Congress. Princeton University Press, 1997. [Mallows, 1957] C.L. Mallows. Non-null ranking models. I. Biometrika, pages 114 130, 1957. [Miller, 2011] N.R. Miller. Electoral Systems: Paradoxes, Assump- tions & Procedures, chapter Election Inversions by the U.S. Electoral College, pages 93 127. Springer, 2011. [Miller, 2014] N.R. Miller. The house size effect and the referendum paradox in u.s. presidential elections. Electoral Studies, 35:265 271, September 2014. [Nurmi, 1999] H. Nurmi. Voting Paradoxes and How to Deal with Them. Springer-Verlag, 1999. [Owen and Grofman, 1988] G. Owen and B. Grofman. Optimal partisan gerrymandering. Political Geography Quarterly, 7(1):5 22, January 1988. [Procaccia et al., 2007] A. D. Procaccia, J. S. Rosenschein, and A. Zohar. Multi-winner elections: Complexity of manipulation, control & winner-determination. In IJCAI, pages 1476 1481, 2007. [Schuck, 1987] P.H. Schuck. The thickest thicket: Partisan gerry- mandering and judicial regulation of politics. Columbia Law Review, 87(7):1325 1384, 1987. [Stille, 2008] W. Stille. Solution Techniques for specific Bin Pack- ing Problems with Applications to Assembly Line Optimization. Ph D thesis, TU Darmstadt, June 2008. [Tangian, 2010] A. Tangian. Computational application of the mathematical theory of democracy to Arrow s impossibility theorem (how dictatorial are Arrow s dictators?). SCW, 35(1):129 161, 2010. [Xia et al., 2009] L. Xia, M. Zuckerman, A. D. Procaccia, V. Conitzer, and J. S. Rosenschein. Complexity of unweighted coalitional manipulation under some common voting rules. In IJCAI, pages 348 353, 2009. [Xia, 2012] L. Xia. Computing the margin of victory for various voting rules. In EC, pages 982 999, 2012. [Zuckerman et al., 2008] M. Zuckerman, A. D. Procaccia, and J. S. Rosenschein. Algorithms for the coalitional manipulation problem. In SODA, pages 277 286, 2008.