# preference_elicitation_for_participatory_budgeting__1eb04e6b.pdf

Preference Elicitation for Participatory Budgeting

Gerdus Benade Carnegie Mellon University jbenade@andrew.cmu.edu

Swaprava Nath Carnegie Mellon University swapravn@cs.cmu.edu

Ariel D. Procaccia Carnegie Mellon University arielpro@cs.cmu.edu

Nisarg Shah Harvard University nisarg@g.harvard.edu

Participatory budgeting enables the allocation of public funds by collecting and aggregating individual preferences; it has already had a sizable real-world impact. But making the most of this new paradigm requires a rethinking of some of the basics of computational social choice, including the very way in which individuals express their preferences. We analytically compare four preference elicitation methods knapsack votes, rankings by value or value for money, and threshold approval votes through the lens of implicit utilitarian voting, and ﬁnd that threshold approval votes are qualitatively superior. This conclusion is supported by experiments using data from real participatory budgeting elections.

1 Introduction

One of the most well-studied problems in computational social choice (Brandt et al. 2016) deals with aggregating individual preferences over alternatives expressed as rankings into a collective choice of a subset of alternatives (Procaccia, Reddi, and Shah 2012; Skowron, Faliszewski, and Lang 2015; Caragiannis et al. 2016). Nascent social choice applications, though, have given rise to the harder, richer problem of budgeted social choice (Lu and Boutilier 2011), where alternatives have associated costs, and the selected subset is subject to a budget constraint. Our interest in budgeted social choice stems from the striking real-world impact of the participatory budgeting paradigm (Cabannes 2004), which allows local governments to allocate public funds by eliciting and aggregating the preferences of residents over potential projects. Indeed, in just a few years, the Participatory Budgeting Project1 has helped allocate more than $170 million dollars of public money for more than 500 local projects, primarily in the US and Canada (including New York City, Chicago, Boston, and San Francisco). In pioneering work, Goel et al. (2016) who have facilitated a number of participatory budgeting elections as part of the Stanford Crowdsourced Democracy Team2 propose and evaluate two participatory budgeting approaches.

Copyright c 2017, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved. 1http://www.participatorybudgeting.org 2http://voxpopuli.stanford.edu

In the ﬁrst approach, the input format the way in which each voter s preferences are elicited is knapsack votes: Each voter reports his individual solution to the knapsack problem, that is, the set of projects that maximizes his overall value (assuming an additive valuation function), subject to the budget constraint. The second component of the approach is the aggregation rule; in this case, each voter is seen as approving all the projects in his knapsack, and then projects are ordered by the number of approval votes and greedily selected for execution, until the budget runs out. The second approach uses value-for-money comparisons as the input format it asks voters to compare pairs of projects by the ratio between value and cost. These comparisons are aggregated using variants of classic voting rules, including the Borda count rule and the Kemeny rule. In a sense, Goel et al. (2016) take a bottom-up approach: They deﬁne novel, intuitive input formats that encourage voters to take cost not just value into account, and justify them after the fact. By contrast, we wish to take a topdown approach, by specifying an overarching optimization goal, and using it to compare different methods for participatory budgeting.

1.1 Our Approach and Results

Let us deﬁne the participatory budgeting problem a bit more formally, following Goel et al. (2016). A set N of n voters are voting over a set A of m alternatives (projects), where each alternative a has cost ca. The utility voter i has for alternative a is denoted vi(a). Moreover, utility functions are additive, that is, the utility of a voter for a set of alternatives A A is

a A vi(a). Our goal is to choose a set W A of winning alternatives that maximizes the (utilitarian) social welfare, subject to the total cost not exceeding the budget B:

arg max W A :

a W vi(a). (1)

We make essentially3 no assumptions about the utility functions. Nevertheless, solving (1) would be easy if we

3Other than a standard normalization assumption that we discuss later.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)

had access to the utility functions; the problem is challenging precisely because we do not. Rather, we have access to votes, in a certain input format, that are consistent with the utility functions. This goal maximizing social welfare based on votes that serve as proxies for latent utility functions has been studied for more than a decade (Procaccia and Rosenschein 2006; Caragiannis and Procaccia 2011; Boutilier et al. 2015; Anshelevich, Bhardwaj, and Postl 2015; Anshelevich and Sekar 2016; Anshelevich and Postl 2016); it has recently been termed implicit utilitarian voting (Caragiannis et al. 2016). Absent complete information about the utility functions, clearly social welfare cannot be perfectly maximized. Procaccia and Rosenschein (2006) introduced the notion of distortion to quantify how far a given aggregation rule is from achieving this goal. Roughly speaking, given a vote proﬁle (a set of n votes) and an outcome, the distortion is the worstcase ratio between the social welfare of the optimal outcome, and the social welfare of the given outcome, where the worst case is taken with respect to all utility proﬁles that are consistent with the given votes. Previous work on implicit utilitarian voting assumes that each voter expresses his preferences by ranking the alternatives in order of decreasing utility. By contrast, the main insight underlying our work is that

... the implicit utilitarian voting framework allows us to decouple the input format and aggregation rule, thereby enabling an analytical comparison of different input formats in terms of their potential for providing good solutions to the participatory budgeting problem.

This decoupling is achieved by associating each input format with the distortion of the optimal (randomized) aggregation rule, that is, the rule that minimizes distortion on every vote proﬁle. Intuitively, the distortion thus associated with an input format measures how useful the information contained in the votes is for achieving social welfare maximization. In 3, we apply this approach to compare four input formats. The ﬁrst is knapsack votes, which (disappointingly) is associated with trivial distortion of Θ(m).4 Next, we analyze two closely related input formats: rankings by value, and rankings by value for money, which ask voters to rank the alternatives by their value and by the ratio of their value and cost, respectively. We ﬁnd that both admit an upper bound of O( m log m) on distortion, which almost matches a lower bound of Ω( m). Finally, we examine a novel input format, which we call threshold approval votes: each voter is asked to approve each alternative whose value for him is above a threshold that we choose. We ﬁnd that its associated distortion is O(log2 m), and establish a lower bound of Ω (log m/ log log m). To summarize, our theoretical results show striking separations between different input formats, with threshold approval votes coming out well on top. While our theoretical results in 3 bound the distortion, i.e., the worst-case ratio of the optimal social welfare to

4As we later show, an upper bound of O(m) can be achieved trivially irrespective of the input format, by selecting a single alternative uniformly at random. Knapsack votes, unfortunately, do not help improve it.

the social welfare achieved, in 4 we compare different approaches to participatory budgeting using the average-case ratio of the two. Speciﬁcally, we experimentally evaluate approaches that use the input formats we study in conjunction with their respective optimal aggregation rules, which minimize the distortion on each proﬁle,5 and compare them to two approaches currently employed in practice. We use data from two real-world participatory budgeting elections held in Boston in 2015 and 2016. The experiments indicate that the use of aggregation rules that minimize distortion on every input proﬁle signiﬁcantly outperforms the currently deployed approaches, and among the input formats we study, threshold approval votes remain superior, even in practice. We also observe that the running times of the distortion minimizing rules scale gracefully for most practical scenarios.

1.2 Related Work

Due to space constraints, we only discuss a few closely related papers. Let us ﬁrst describe the theoretical results of Goel et al. (2016) in slightly greater detail. Most relevant to our work is a theorem that asserts that knapsack voting (i.e., knapsack votes as the input format, coupled with greedy approval-based aggregation) actually maximizes social welfare. However, the result strongly relies on their overlap utility model, where the utility of a voter for a subset of alternatives is (roughly speaking) the size of the intersection between this subset and his own knapsack vote. In a sense, the viewpoint underlying this model is the opposite of ours, as a voter s utility is derived from his vote, instead of the other way around. One criticism of this model is that even if certain alternatives do not ﬁt into a voter s individual knapsack solution due to the budget constraint, the voter could (and usually will) have some utility for them. Goel et al. (2016) also provide strategyproofness results for knapsack voting, which similarly rely on the overlap utility model. Finally, they interpret their methods as maximum likelihood estimators (Young 1988; Conitzer and Sandholm 2005) under certain noise models. Naturally, our work is also closely related to previous work on implicit utilitarian voting. Crucially, as noted above, this line of work focuses exclusively on the rankings-byvalue input format. Boutilier et al. (2015) study the problem of selecting a single winning alternative, and provide an upper bound of O( m log m) and a lower bound of Ω( m) on the distortion achieved by the optimal aggregation rule. Their setting is a special case of the participatory budgeting problem where the cost of each alternative equals the entire budget. Consequently, their lower bound applies to our more general setting, and our upper bound for the rankingsby-value input format generalizes theirs (up to a logarithmic factor). Caragiannis et al. (2016) extend the results of Boutilier et al. (2015) to the case where a subset of alterna-

5Note that such rules are not guaranteed to achieve the optimal performance in our experiments as we measure performance using the average-case ratio of the optimal to the achieved social welfare rather than the (worst-case) distortion. Nonetheless, such rules perform extremely well.

tives of a given size k is to be selected (only for the rankingsby-value input format); this is again a special case of the participatory budgeting problem where the cost of each alternative is B/k. However, our results are incomparable to theirs because we assume additive utility functions following previous work on participatory budgeting (Goel et al. 2016) whereas Caragiannis et al. assume that a voter s utility for a subset of alternatives is his maximum utility for any alternative in the subset.

2 The Model

Let [k] {1, . . . , k}. Let N = [n] be the set of voters, and A be the set of m alternatives. The cost of alternative a is denoted ca, and the budget B is normalized to 1. For S A, let c(S) = a S ca. Deﬁne Fc = {S A : c(S) 1 c(T) > 1, S T A} as the inclusion-maximal budget-feasible subsets of A. We assume that each voter has a utility function vi : A R+ {0}, where vi(a) is the utility that voter i has for alternative a, and that these utilities are additive, i.e., the utility of voter i for a set S A is deﬁned as vi(S) =

a S vi(a). Finally, to ensure fairness among voters, we make the standard assumption (Caragiannis and Procaccia 2011; Boutilier et al. 2015) that vi(A) = 1 for all voters i N. We call the vector v = {v1, . . . , vn} of voter utility functions the utility proﬁle. Given the utility proﬁle, the (utilitarian) social welfare of an alternative a A is deﬁned as sw(a, v) = i N vi(a); for a set S A, let sw(S, v) =

a S sw(a, v). The utility function of a voter i is only accessible through his vote ρi, which is induced by vi. The vector ρ = {ρ1, . . . , ρn} is called the input proﬁle. Let v ρ denote that utility proﬁle v is consistent with input proﬁle ρ. We study four speciﬁc formats for input votes: The knapsack vote κi A of voter i N represents a feasible subset of alternatives with the highest value for the voter. We have vi κi if and only if c(κi) 1 and vi(κi) vi(S) for all S Fc. The rankings-by-value and the rankings-by-value-formoney input formats ask voter i N to rank the alternatives by decreasing value for him, and by decreasing ratio of value for him to cost, respectively. Formally, let L = L(A) denote the set of rankings over the alternatives. For a ranking σ L, let σ(a) denote the position of alternative a in σ, and a σ b denote σ(a) < σ(b), i.e., that a is preferred to b under σ. Then, we say that utility function vi is consistent with the ranking by value (resp. value for money) of voter i N, denoted σi, if and only if vi(a) vi(b) (resp. vi(a)/ca vi(b)/cb) for all a σi b. For a threshold t, the threshold approval vote τi of voter i N consists of the set of alternatives whose value for him is at least t, i.e., vi τi if and only if τi = {a A : vi(a) t}. In our setting, a (randomized) aggregation rule f for an input format maps each input proﬁle ρ in that format to a

distribution over Fc. The rule is deterministic if it returns a particular set in Fc with probability 1. In the implicit utilitarianism framework, the ultimate goal is to maximize the (utilitarian) social welfare. Procaccia and Rosenschein (2006) use the notion of distortion to quantify how far an aggregation rule f is from achieving this goal. The distortion of f on a vote proﬁle ρ is given by

dist(f, ρ) = sup v: v ρ

max T Fc sw(T, v)

ES f( ρ)[sw(S, v)] .

The (overall) distortion of a rule f is given by dist(f) = max ρ dist(f, ρ). The optimal aggregation rule f , which we term the distortion-minimizing aggregation rule, selects the distribution minimizing distortion on each input proﬁle individually, that is,

f ( ρ) = arg min μ Δ(Fc) sup v: v ρ

max T Fc sw(T, v)

ES μ[sw(S, v)] ,

where Δ(Fc) is the set of distributions over Fc. Needless to say, f achieves the best possible overall distortion. Finally, we say that the distortion associated with an input format (i.e., elicitation method) is the overall distortion of the (randomized) distortion-minimizing aggregation rule for that format; this, in a sense, quantiﬁes the effectiveness of the input format in achieving social welfare maximization.6

3 Theoretical Results

Before we present our analysis of the different input formats from the perspective of implicit utilitarianism, let us make a simple observation that holds across all input formats. Observation 3.1. The distortion associated with any input format is at most m.

Proof. Consider the rule that selects a single alternative uniformly at random; this is clearly budget-feasible. Due to the normalization of utility functions, the expected welfare achieved by this rule is (1/m)

a A vi(a) = n/m. On the other hand, the maximum welfare that any subset of alternatives can achieve is at most n. Hence, the distortion of this rule, which does not require any input, is at most m.

3.1 Knapsack Votes

We now present our analysis for knapsack votes an input format advocated by Goel et al. (2016). Theorem 3.2. The distortion associated with knapsack votes is Ω(m).

Proof. Consider the case where every alternative has cost 1 (i.e., equal to the budget). For ease of exposition, assume that m divides n. Consider the input proﬁle κ, in which voters

6In a setting where deterministic rules must be used, one could similarly associate each input format with its best deterministic rule. We study this less motivated and technically less interesting setting in the full version of the paper available at http://procaccia.info/papers.

are partitioned into m subsets {Na}a A of equal size, and for every a A and i Na, we have κi = {a}. Consider a randomized aggregation rule f. There must exist an alternative a A such that Pr[f( κ) = {a }] 1/m. Now, construct a utility proﬁle v such that i) for all i Na , we have vi(a ) = 1, and vi(a) = 0 for a A \ {a }; and ii) for all a A \ {a } and i Na, we have vi(a) = vi(a ) = 1/2, and vi(b) = 0 for b A \ {a, a }. Note that v is consistent with the input proﬁle κ, i.e., v κ. Moreover, it holds that sw(a , v) n/2, whereas sw(a, v) n/m for a A \ {a }. It follows that

dist(f) dist(f, κ) n/2 1 m n + m 1

as desired.

In light of Observation 3.1, this result indicates that the distortion associated with knapsack votes is asymptotically indistinguishable from the distortion one can achieve with absolutely no information about voter preferences, suggesting that knapsack votes may not be an appropriate input format if the goal is to maximize social welfare. Our aim now is to ﬁnd input formats that achieve better results when viewed through the implicit utilitarianism lens.

3.2 Rankings by Value and by Value for Money

Goel et al. (2016) also advocate the use of comparisons between alternatives based on value for money, which, like knapsack votes, encourage voters to consider the trade-off between value and cost. We study rankings by value for money as an input format; observe that such rankings convey more information than speciﬁc pairwise comparisons. In addition, we also study rankings by value, which are prevalent in the existing literature on implicit utilitarian voting (Procaccia and Rosenschein 2006; Caragiannis and Procaccia 2011; Boutilier et al. 2015; Anshelevich, Bhardwaj, and Postl 2015; Anshelevich and Sekar 2016; Anshelevich and Postl 2016). Rankings by value convey more information than k-approval votes, in which each voter submits the set of top k alternatives by their value this is the input format of choice for most real-world participatory budgeting elections (Goel et al. 2016). As noted in 1.2, Boutilier et al. (2015) prove a lower bound of Ω( m) on distortion in the special case of our setting where all alternatives have cost 1, and the input format is rankings by value. This result carries over to our more general setting, not only with rankings by value, but also with rankings by value for money, as both input formats coincide in case of equal costs. Our goal is to establish an almost matching upper bound. We start from a mechanism of Boutilier et al. (2015) that has distortion O( m log m) in their setting. It carefully balances between high-value and low-value alternatives (where value is approximately inferred from the positions of the alternatives in the input rankings). In our more general participatory budgeting problem, it is crucial to also take into account the costs, and ﬁnd the perfect balance between selecting many low-cost alternatives and fewer high-cost ones.

We modify the mechanism of Boutilier et al. precisely to achieve this goal. Speciﬁcally, we partition the alternatives into O(log m) buckets based on their costs, and differentiate between alternatives within a bucket based on their (inferred) value. Our mechanism for rankings by value for money requires more careful treatment as (ironically) values are obfuscated in value-for-money comparisons. At ﬁrst glance our setting seems much more difﬁcult, distortion-wise, than the simple setting of Boutilier et al. (2015). But ultimately we obtain only a slightly weaker upper bound on the distortion associated with both rankings by value and by value for money. In other words, to our surprise, incorporating costs and a budget constraint comes at almost no cost (no pun intended) to social welfare maximization. The proof of this result appears in the full version.

Theorem 3.3. The distortion associated with rankings by value and rankings by value for money is O( m log m).

3.3 Threshold Approval Votes

Approval voting where voters can choose to approve any subset of alternatives, and the most widely approved alternative wins is well studied in social choice theory (Brams and Fishburn 2007). In our utilitarian setting we reinterpret this input format as threshold approval votes, where the principal sets a threshold t, and each voter i N approves every alternative a for which vi(a) t. We ﬁrst investigate deterministic threshold approval votes, in which the threshold selected deterministically, but ﬁnd that it does not help us (signiﬁcantly) improve over the distortion we can already obtain using rankings by value or by value for money. Speciﬁcally, for a ﬁxed threshold, we are always able to construct cases in which alternatives have signiﬁcantly different welfares, but either no alternative is approved or an extremely large set of alternatives are approved, providing the rule little information to distinguish between the alternatives, and yielding high distortion. The proof of this result appears in the full version of the paper.

Theorem 3.4. The distortion associated with deterministic threshold approval votes is Ω( m).

We thus turn our attention to randomized threshold approval votes, in which the threshold is selected in a randomized fashion. Technically, this is a distribution over input formats, one for each value of the threshold.7 The (overall) distortion of a rule f, which now determines both the distribution over thresholds D, and the aggregation of input proﬁle ρ( v, t) induced by utility proﬁle v and threshold t, is

dist(f) = sup v Et D max T Fc sw(T, v) ES f( ρ( v,t))[sw(S, v)].

Interested readers can refer to the full version of the paper for a discussion on how this deﬁnition relates to the deﬁnition of distortion of a rule for a ﬁxed input format. We ﬁnd

7While we study deterministic and randomized threshold selection, we always allow randomized aggregation rules. The full version of the paper presents an analysis of the case where the aggregation rule has to be deterministic.

that the ﬂexibility of randomizing the threshold value allows us to dramatically reduce the distortion. Theorem 3.5. The distortion associated with randomized threshold approval votes is O(log2 m).

Proof. For ease of exposition, assume m is a power of 2. Let I0 = [0, 1/m2], and Ij = (2j 1/m2, 2j/m2], ℓj = 2j 1/m2, and uj = 2j/m2 for j = 1, . . . , 2 log m. Let v denote a utility proﬁle that is consistent with the input proﬁle. For a A and j {0, . . . , 2 log m}, deﬁne na j = |{i N : vi(a) Ij}| to be the number of voters whose utility for a falls in the interval Ij. We now bound the social welfare of a in terms of the numbers na j . Speciﬁcally,

i N I{vi(a) Ij} uj

j=0 na j uj,

A similar argument also yields a lower bound, and after substituting ℓ0 = 0, u0 = 1/m2, and na 0 n, we get

j=1 na j ℓj sw(a, v) n

j=1 na j uj. (2)

Next, divide the alternatives into 1+2 log m buckets based on their costs, with bucket Sj = {a A : ca Ij}. Note that selecting at most 1/uj alternatives from Sj is guaranteed to satisfy the budget constraint. Let S = arg max S Fc sw(S, v) be the feasible set of alternatives maximizing the social welfare. For j, k {0, . . . , 2 log m}, let n j,k =

a S Sk na j . Using Equation (2), we have

j=1 n j,k ℓj sw(S Sk, v)

j=1 n j,k uj. (3)

We now construct three different mechanisms; our ﬁnal mechanism will randomize between them. Mechanism A: Pick a pair (j, k) uniformly at random from the set T = {(j, k) : j, k [2 log m]}. Then, set the threshold to ℓj, and using the resulting input proﬁle, greedily select the 1/uk alternatives from Sk with the largest number of approval votes (or select Sk if |Sk| 1/uk). Let Bj,k denote the set of selected alternatives for the pair (j, k). Because we have j > 0 and k > 0,

sw(Bj,k, v)

4 n j,k uj, (4)

where, in the ﬁrst transition, we bound the welfare from below by only considering utilities that are at least ℓj, and the second transition holds because uj = 2ℓj, |S Sk| 2|Bj,k|, and Bj,k consists of greedily-selected alternatives with the highest number of approval votes. Thus, the expected social welfare achieved by mechanism A is

1 (2 log m)2

k=1 sw(Bj,k, v)

1 4 (2 log m)2

k=1 n j,k uj

1 16 log2 m

sw(S \ S0, v) |S \ S0| n

1 16 log2 m

sw(S \ S0, v) n

where the ﬁrst transition follows from Equation (4), and the second transition follows from Equation (3). Mechanism B: Select all the alternatives in S0. Because each alternative in S0 has cost at most 1/m2, this is clearly budget-feasible. Further, the social welfare achieved by this mechanism is sw(S0, v) sw(S S0, v). Mechanism C: Select a single alternative uniformly at random from A. This is also budget-feasible, and due to normalization of values, its expected social welfare is n/m. Our ﬁnal mechanism executes mechanism A with probability 16 log2 m/(2 + 16 log2 m), and mechanisms B and C with probability 1/(2 + 16 log2 m) each. It is easy to see that its expected social welfare is at least sw(S , v)/(2 + 16 log2 m). Hence, its distortion is O(log2 m).

We also show that at least logarithmic distortion is inevitable even when using randomized threshold approval votes; the proof appears in the full version of the paper.

Theorem 3.6. The distortion associated with randomized threshold approval votes is Ω(log m/ log log m).

4 Empirical Results

Our theoretical results in 3 characterize how well we can optimize distortion on an observed input proﬁle. Recall that distortion is the worst-case ratio of the optimal social welfare to the social welfare achieved, where the worst case is taken over all utility proﬁles consistent with the observed input proﬁle. In practice we care about this ratio according to the actual underlying utility proﬁle. Thus, a distortionminimizing aggregation rule is not guaranteed to be optimal in practice. This is why an empirical study is called for. In this section, we compare the performance of different approaches to participatory budgeting, where the performance is measured by the average-case ratio of the optimal and achieved social welfare, and the average is taken over utility proﬁles drawn to be consistent with input proﬁles from two real-world participatory budgeting elections.

Datasets: We use data from participatory budgeting elections held in 2015 and 2016 in Boston, Massachusetts. Both elections offered voters 10 alternatives. The 2015 dataset contains 2600 4-approval votes (voters were asked to approve their 4 most preferred alternatives) and the 2016 dataset contains 4430 knapsack votes. For each dataset, we conduct 3 independent trials. In each trial, we create r sub-proﬁles, each consisting of n voters drawn at random from the population. For each sub-proﬁle, we draw k random utility proﬁles v consistent with the subproﬁle, and use these to analyze the performance of different approaches. We use the real costs of the projects throughout. The choices of parameters (r, n, k) for the three trials are (5, 10, 10), (8, 7, 10), and (10, 5, 10). We choose this experimental design to yield sufﬁciently many samples to verify statistical signiﬁcance of the results while completing in a reasonable amount of time.

Approaches: We use the utility proﬁle v drawn to create an input proﬁle in four input formats we study. For each format, we use the deterministic as well as randomized distortionminimizing aggregation rule. The non-trivial algorithms we devise for these rules are presented in the full version of the paper. These eight approaches are referred to using the type of aggregation rule used ( Det or Ran ), and the type of input format ( Knap , Val , VFM , or Th Ap ). It is important to note that, unlike the other input formats, threshold approval votes are technically a family of input formats, one for each value of the threshold. While randomizing over the threshold is required to minimize the distortion (the worst-case ratio of the optimal and achieved social welfare), as is our goal in the theoretical results of 3, minimizing the expected ratio of the two can be achieved by a deterministic threshold. Thus, in our experiments, we learn the optimal threshold value based on a holdout set that is not subsequently used. This learning approach is practical as it only uses observed input votes rather than underlying actual utilities. In other words, we acknowledge that this choice gives threshold approval votes an edge but arguably it is an advantage this input format would also enjoy in practice. In addition to our eight approaches, we also test two approaches used in real-world elections (Goel et al. 2016): greedy 4-approval ( Gr 4-Ap ), and greedy knapsack ( Gr Knap ). The former elicits 4-approval votes, and greedily selects the most widely-approved alternatives until the budget is depleted. The latter is almost identical, except for interpreting a knapsack vote as an approval for each alternative in the knapsack. As the performance measure for the ten approaches, we use the average ratio of the optimal and the achieved social welfare according to the actual utility proﬁle used to induce the input proﬁles termed average welfare ratio where the average is taken across the entire experiment.

Results: Figure 1 shows the average welfare ratio of the different approaches with 95% conﬁdence intervals, sorted from best to worst. The differences in performance between all pairs of rules except between Det Knap and Ran Val, and between Ran VFM and Gr Knap are statistically signiﬁcant (Johnson 2013) at a 95% conﬁdence level.

1.0 1.1 1.2 1.3 1.4 1.5 Average Welfare Ratio

Figure 1: Average welfare ratio (lower is better) of different approaches to participatory budgeting based on data from Boston 2015 and 2016 elections.

A few comments are in order. First, deterministic distortion-minimizing aggregation rules generally outperform their randomized counterparts. This is not entirely unexpected. While randomized rules do achieve better distortion, there always exists a deterministic rule minimizing the average welfare ratio objective; although, it is not necessarily the deterministic distortion-minimizing aggregation rule. Second, approaches based on deterministic rules are able to limit the loss in social welfare due to incomplete information about voters utility functions to only 2% 3%. Among these approaches, the one using threshold approval votes incurs the minimum loss. Third, knapsack votes consistently lead to higher distortion than alternative input formats. This, together with the poor theoretical guarantees for knapsack votes, suggests that it may not be worthwhile to ask voters to solve their personal NP-hard knapsack problems in order to cast a vote. Figure 2 shows the running times of our deterministic voting rules, averaged over 10 trials, on a log-log scale. We consider only the deterministic voting rules as they outperform their randomized counterparts in terms of the average welfare ratio. We observe that the running time scales gracefully with the number of agents. The experiments used the Boston 2016 dataset with 10 alternatives, and were run on an 8-core Intel(R) Xeon(R) CPU with 2.27GHz processor speed and 50GB main memory. Even with 500 voters, rules such as Det Th Ap and Det Val take less than 5 minutes, indicating the practicability of these methods even for the largest realworld participatory budgeting elections that we are aware of, which have no more than 5,000 voters.

5 Discussion

Our results indicate that threshold approval votes should receive serious consideration as the input format of choice for participatory budgeting. But there is one important is-

5 10 20 50 100 200 500 Number of Agents

Runtime (sec)

Det Val Det VFM Det Knap Det Th Ap

Figure 2: Average running time of the deterministic voting rules on the Boston 2016 dataset.

sue we have not studied: the cognitive load imposed on voters by different input formats. (If it were not for this issue, we would just elicit the full utility functions the whole point is to reduce cognitive load.) A participatory budgeting system based on threshold approval votes might ask voters to mark each project on which you would be happy to see the city spend $10,000 . While this seems reasonable enough (and probably easier than casting knapsack votes), human subject experiments are needed to rigorously determine whether threshold approval votes, and other input formats, require an acceptable cognitive effort. Whatever the best approach to participatory budgeting is, now is the time to identify it, before various heuristics become hopelessly ingrained. We believe that this is a grand challenge for computational social choice, especially at a point in the ﬁeld s evolution where it is gaining real-world relevance by helping people make decisions in practice.

Acknowledgements

This work was partially supported by NSF grants CCF1215883, CCF-1525932, and IIS-1350598, ONR grant N00014-16-1-3075, a Sloan Research Fellowship, and a Fulbright-Nehru Post-doctoral Fellowship. We thank Ashish Goel and Anilesh Krishnaswamy for generously sharing their data with us, and Tuomas Sandholm for providing the hardware to run the experiments.

Anshelevich, E., and Postl, J. 2016. Randomized social choice functions under metric preferences. In Proceedings of the 25th International Joint Conference on Artiﬁcial Intelligence (IJCAI), 46 52. Anshelevich, E., and Sekar, S. 2016. Blind, greedy, and random: Algorithms for matching and clustering using only

ordinal information. In Proceedings of the 30th AAAI Conference on Artiﬁcial Intelligence (AAAI), 390 396. Anshelevich, E.; Bhardwaj, O.; and Postl, J. 2015. Approximating optimal social choice under metric preferences. In Proceedings of the 29th AAAI Conference on Artiﬁcial Intelligence (AAAI), 777 783. Boutilier, C.; Caragiannis, I.; Haber, S.; Lu, T.; Procaccia, A. D.; and Sheffet, O. 2015. Optimal social choice functions: A utilitarian view. Artiﬁcial Intelligence 227:190 213. Brams, S. J., and Fishburn, P. C. 2007. Approval Voting. Springer, 2nd edition. Brandt, F.; Conitzer, V.; Endress, U.; Lang, J.; and Procaccia, A. D., eds. 2016. Handbook of Computational Social Choice. Cambridge University Press. Cabannes, Y. 2004. Participatory budgeting: A signiﬁcant contribution to participatory democracy. Environment and Urbanization 16(1):27 46. Caragiannis, I., and Procaccia, A. D. 2011. Voting almost maximizes social welfare despite limited communication. Artiﬁcial Intelligence 175(9 10):1655 1671. Caragiannis, I.; Nath, S.; Procaccia, A. D.; and Shah, N. 2016. Subset selection via implicit utilitarian voting. In Proceedings of the 25th International Joint Conference on Artiﬁcial Intelligence (IJCAI), 151 157. Conitzer, V., and Sandholm, T. 2005. Common voting rules as maximum likelihood estimators. In Proceedings of the 21st Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI), 145 152. Goel, A.; Krishnaswamy, A. K.; Sakshuwong, S.; and Aitamurto, T. 2016. Knapsack voting for participatory budgeting. Manuscript. Johnson, V. E. 2013. Revised standards for statistical evidence. Proceedings of the National Academy of Sciences 110(48):19313 19317. Lu, T., and Boutilier, C. 2011. Budgeted social choice: From consensus to personalized decision making. In Proceedings of the 22nd International Joint Conference on Artiﬁcial Intelligence (IJCAI), 280 286. Procaccia, A. D., and Rosenschein, J. S. 2006. The distortion of cardinal preferences in voting. In Proceedings of the 10th International Workshop on Cooperative Information Agents (CIA), 317 331. Procaccia, A. D.; Reddi, S. J.; and Shah, N. 2012. A maximum likelihood approach for selecting sets of alternatives. In Proceedings of the 28th Annual Conference on Uncertainty in Artiﬁcial Intelligence (UAI), 695 704. Skowron, P.; Faliszewski, P.; and Lang, J. 2015. Finding a collective set of items: From proportional multirepresentation to group recommendation. In Proceedings of the 29th AAAI Conference on Artiﬁcial Intelligence (AAAI), 2131 2137. Young, H. P. 1988. Condorcet s theory of voting. The American Political Science Review 82(4):1231 1244.