# Robust Rent Division

Dominik Peters (CNRS, Université Paris Dauphine-PSL, dominik@lamsade.fr) · Ariel D. Procaccia (Harvard University, arielpro@seas.harvard.edu) · David Zhu (Harvard University, david.zhu@gmail.com)

**Abstract.** In fair rent division, the problem is to assign rooms to roommates and fairly split the rent based on the roommates' reported valuations for the rooms. Envy-free rent division is the most popular application on the fair division website Spliddit. The standard model assumes that agents can correctly report their valuations for each room. In practice, agents may be unsure about their valuations, for example because they have had only limited time to inspect the rooms. Our goal is to find a robust rent division that remains fair even if agent valuations are slightly different from the reported ones. We introduce the lexislack solution, which selects a rent division that remains envy-free for valuations within as large a radius as possible of the reported valuations. We also consider robustness notions for valuations that come from a probability distribution, and use results from learning theory to show how we can find rent divisions that (almost) maximize the probability of being envy-free, or that minimize the expected envy. We show that an almost optimal allocation can be identified based on polynomially many samples from the valuation distribution. Finding the best allocation given these samples is NP-hard, but in practice such an allocation can be found using integer linear programming.

## 1 Introduction

The literature on fair division of resources has produced allocation mechanisms for many domains, such as course allocation, indivisible goods, chores, house assignment, and the selection of citizens' assemblies [Budish, 2011, Caragiannis et al., 2019, Moulin, 2019, Flanigan et al., 2021].
But arguably the most widely used example is rent division: this is the most popular application on the fair division website spliddit.org [Goldman and Procaccia, 2014], where it has been used more than 30,000 times since its launch in 2014. Rent division deals with the common situation where a group of n future roommates are planning to move into a house or apartment which has n rooms, one for each roommate. They will split the rent payments among themselves. The roommates may differ in how much they are willing to pay for different rooms. Given the room valuations of each roommate, our task is to assign the rooms, and to decide how to split the rent. We wish to do this fairly, and so we will choose an allocation that is envy-free: no roommate would strictly prefer to get another room, given the prices we have assigned to those rooms. Such an allocation is guaranteed to exist [Svensson, 1983].

Let us consider an example with n = 3 roommates, and let the total rent be $1000. Table 1 shows the valuation that each agent assigns to each room. Given this information, the algorithm in use on Spliddit will assign Room 1 to Alice, Room 2 to Bob, and Room 3 to Charlie, charging them $100, $500, and $400 respectively.

36th Conference on Neural Information Processing Systems (NeurIPS 2022).

|           | Room 1 | Room 2 | Room 3 |
|-----------|--------|--------|--------|
| Alice     | 300    | 400    | 300    |
| Bob       | 300    | 700    | 0      |
| Charlie   | 300    | 100    | 600    |
| Spliddit  | 100    | 500    | 400    |
| Lexislack | 200    | 450    | 350    |

Table 1: Example of valuations. In any envy-free allocation, Alice gets Room 1, Bob gets Room 2, and Charlie gets Room 3. The lower rows display the price vectors selected by Spliddit's rule (maximin) and by our lexislack rule.

This allocation is envy-free under the assumption (which we will make
throughout) that agents have quasilinear utilities: their utility under an allocation is the value of their room minus its price. For example, Alice has utility 300 − 100 = 200. She does not envy the others: Bob's room would give her only utility 400 − 500 = −100 < 200, and Charlie's room 300 − 400 = −100 < 200. On a typical instance, there are infinitely many allocations that are envy-free. Spliddit's algorithm chooses the one that maximizes the utility of the worst-off agent, subject to envy-freeness [Gal et al., 2017]. This is known as the maximin rule. In optimizing this objective, Spliddit might choose an outcome that is only barely envy-free. In the example, Bob has utility 700 − 500 = 200, but he would gain the same utility from having Alice's room: 300 − 100 = 200. If, upon moving in, Bob discovers a defect in his room and now only values it at 600, say, then he would envy Alice. Thus, the envy-freeness of Spliddit's allocation is not robust. We study the rent division problem with the goal of finding allocations that are robustly envy-free, in the sense that they remain envy-free even if valuations change slightly.

Figure 1: Illustration of the example in Table 1. For Spliddit's rule and for our lexislack rule, we show the quasilinear utility (value minus price) that each agent gets from each room. The bar corresponding to the room assigned to the agent is shaded in blue. Both rules are envy-free, and thus the blue bars are at least as high as the gray bars. Subject to envy-freeness, Spliddit's rule maximizes the height of the shortest blue bar. Lexislack maximizes the differences in height between the blue and the gray bars.
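These utility computations can be checked mechanically. A minimal sketch (values copied from Table 1; the helper `min_slack` and the dictionaries are ours, introduced for illustration):

```python
# Values from Table 1 (rooms indexed 0-2); total rent $1000.
values = {
    "Alice":   [300, 400, 300],
    "Bob":     [300, 700, 0],
    "Charlie": [300, 100, 600],
}
assignment = {"Alice": 0, "Bob": 1, "Charlie": 2}   # the unique EF assignment

def min_slack(prices):
    """Smallest margin by which an agent prefers her room to another room."""
    slacks = []
    for agent, room in assignment.items():
        own = values[agent][room] - prices[room]
        slacks += [own - (values[agent][r] - prices[r])
                   for r in range(3) if r != room]
    return min(slacks)

print(min_slack([100, 500, 400]))   # 0: Spliddit's prices are only barely envy-free
print(min_slack([200, 450, 350]))   # 150: lexislack leaves a margin of $150
```

A non-negative minimum means no agent envies another, matching the figure: under Spliddit's prices the smallest margin is exactly zero, while lexislack's prices leave every agent a margin of 150.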
For this, we introduce the lexislack rule, which selects an envy-free allocation where the minimum slack (the amount by which agent i prefers her allocation to agent j's) is maximized lexicographically. This produces an allocation that remains envy-free for all valuation profiles that are within a maximally large ℓ₁-radius of the reported profile. In the example of Table 1, the lexislack rule assigns the rooms in the same way as does Spliddit, but charges the roommates $200, $450, and $350. With these prices, each agent prefers her allocation to any other agent's by at least $150 (see Figure 1). This means that even after Bob's adjustment to 600, he does not envy Alice. We show that the lexislack rule always selects an essentially unique outcome, which can be found in polynomial time by linear programming.

This notion of robustness may not always be appropriate. Consider two perturbations with equal ℓ₁-distance to the reported valuations: one changes agent i's valuations for all rooms by a small amount, the other changes i's valuation for one room by a large amount. Lexislack places equal importance on them. But the former perturbation seems more likely: even if a player is uncertain about the value of a room, that value is more likely to be close to their best estimate than further away. Thus, arguably, different valuation profiles should be weighted differently: we do not want to sacrifice envy-freeness for a likely perturbation in order to obtain it for an unlikely perturbation. To capture this idea, we propose to add noise (such as Gaussian noise) around the reported valuations. This way, we impute a probability distribution D over valuations. In this setting, our interpretation of robustness is to look for allocations that are envy-free with maximum probability. However, it is not clear how one could efficiently find the most robust allocation given the noisy valuations.
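Returning to the Table 1 example, the claim about Bob's adjustment can be checked directly. A small sketch (the helper `worst_envy` is ours; it anticipates the clamped envy measure used later in the paper):

```python
import numpy as np

# Table 1 after Bob revalues Room 2 at 600 (down from 700).
V = np.array([[300, 400, 300],   # Alice
              [300, 600, 0],     # Bob
              [300, 100, 600]])  # Charlie

def worst_envy(prices):
    """Largest amount any agent prefers another room, clamped at zero."""
    U = V - np.asarray(prices, dtype=float)   # quasilinear utilities
    own = np.diag(U)                          # sigma = identity assignment
    return max(0.0, float(np.max(U - own[:, None])))

print(worst_envy([100, 500, 400]))   # 100.0: Bob now envies Alice under Spliddit
print(worst_envy([200, 450, 350]))   # 0.0: lexislack's prices remain envy-free
```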
As part of our methodological contribution, we propose an approach based on synthetic sampling. Specifically, we sample a number of valuation profiles from D, and then find an allocation that is optimal on this sample using integer linear programming (ILP). By calculating the VC dimension of the space of rent divisions, we give polynomial sample complexity bounds that show how many samples are sufficient so that this approach identifies an almost optimal allocation with high probability. Note that the samples are synthetic, but low sample complexity is crucial nevertheless: a small number of samples leads to a sufficiently small ILP that, as we show, can be optimally solved in practice (even though we prove that the problem is NP-complete).

We also show that one can use the sampling approach to find an allocation that minimizes the expected amount by which one agent envies another. In contrast to maximizing the probability of envy-freeness, the minimum envy objective places more emphasis on avoiding bad violations of envy-freeness. An advantage of our sampling-based approach is that it is very general and does not place any restrictions on the distribution D. Our algorithms could also be used for rent division problems with uncertainty, where agents might explicitly report distributions over their valuations. For example, a Spliddit-like user interface could let agents report their valuations as a range rather than a number.

We end with some experiments on data taken from Spliddit. They suggest that our three new rules significantly outperform the Spliddit maximin rule on robustness metrics. Interestingly, the lexislack solution does comparably well to the rules based on sampling.
Given its conceptual simplicity and easy computation, this suggests lexislack as a good rule when robustness is desired.¹

**Related Work.** The rent division model is well-studied in the economics literature [Svensson, 1983, Alkan et al., 1991, Aragones, 1995, Su, 1999, Velez, 2018], often without assuming quasilinear utilities. That literature includes results on the structure of the envy-free set and on strategic aspects. Computer scientists have studied the computation of allocation rules [Gal et al., 2017, Procaccia et al., 2018]. Bei et al. [2021] study a generalization of the rent division problem. Robustness has been studied in several areas of computational social choice, such as voting [Shiryaev et al., 2013], committee elections [Bredereck et al., 2021, Gawron and Faliszewski, 2019, Misra and Sonar, 2019], and stable matching [Chen et al., 2019, Mai and Vazirani, 2018]. We are not aware of such work for fair division, though Menon and Larson [2020] study a related problem of stability, which requires that the allocation should not change much if valuations change slightly. For rent division, a blog post by Critch [2015] argues in favor of aiming for robustness. Critch [2015] also implemented an algorithm for robust rent division that appears in experiments to maximize the slack, but it differs from the lexislack rule, and no theoretical analysis of this algorithm is available. Our sampling-based approach is conceptually related to work on data-driven algorithm design [Balcan, 2020], which typically seeks to optimize the hyperparameters of an algorithm with respect to an underlying distribution over instances, based on samples. One thing that distinguishes our distributional setting is that we are using the samples to optimize a single solution to our problem.
Computational hardness results for problems similar to our sample-based optimization problems have been obtained for stable matching and for Pareto-optimal assignment [Aziz et al., 2019, 2020].

## 2 Preliminaries: Rent Division

Let n ∈ ℕ and write [n] = {1, …, n}. Let N = [n] be a set of n agents, and let R = [n] be a set of n rooms. Without loss of generality, we let the total rent be 1. A (valuation) profile v = (v_ir)_{i∈N, r∈R} is a collection of values v_ir ∈ ℚ₊, one for each agent i ∈ N and each room r ∈ R.

¹A simple online tool to compare the lexislack and maximin rules is available at https://pref.tools/rent.

A room assignment is a bijection σ : N → R, so that agent i is assigned room σ(i). Given a valuation profile v, we say that σ is optimal if it maximizes utilitarian social welfare ∑_{i∈N} v_iσ(i). An allocation (σ, p) is a room assignment σ together with a payment vector p = (p₁, …, p_n) ∈ ℝⁿ with ∑_{r∈R} p_r = 1, where p_r is the rent of room r. (The value p_r is usually non-negative.) We assume that agents have quasilinear utilities. This means that if v is a valuation profile and (σ, p) is an allocation, then agent i's utility in this allocation is v_iσ(i) − p_σ(i), i.e., the valuation of i for her room σ(i) minus the room's rent. An allocation (σ, p) is envy-free if v_iσ(i) − p_σ(i) ≥ v_ir − p_r for all i ∈ N and r ∈ R, so that each agent i weakly prefers her allocation to receiving any other room. A solution is a function that, given a valuation profile, selects a set of allocations (usually a singleton, but ties may occur). A solution is essentially single-valued if, when it selects more than one allocation, all agents are indifferent between them: every agent gets the same utility from all tied allocations. The following facts are well known [see, e.g., Velez, 2018]. We include proofs in Appendix D.1.

Theorem 2.1. (a) For every optimal room assignment σ, there are prices p so that (σ, p) is envy-free. (b) If (σ, p) is envy-free, then σ is optimal.
(c) Let σ₁, σ₂ be optimal room assignments, and let (σ₁, p) be an envy-free allocation. Then (σ₂, p) is also an envy-free allocation, with all agents indifferent between the two: v_iσ₁(i) − p_σ₁(i) = v_iσ₂(i) − p_σ₂(i) for all i ∈ N.

Theorem 2.1(a) implies that an envy-free allocation exists for all valuation profiles. We can compute one in polynomial time: find an optimal room assignment σ using bipartite matching, then use linear programming to find prices p that make the allocation envy-free [Gal et al., 2017]. Theorem 2.1(c) implies that when selecting among envy-free allocations, we can restrict attention to any fixed σ and only vary the price vector p: all utility vectors achievable in an envy-free allocation are achieved by allocations of this form.

## 3 The Lexislack Solution

We start by considering a common form of robustness: we look for allocations that remain fair for all valuations that are within some radius of the input valuations, for as large a radius as possible. Thus, unlike in later sections, we do not assume that valuations come from a probability distribution. Let v be a valuation profile, fixed throughout. Let (σ, p) be an allocation. For i ∈ N and r ∈ R, let Δ_ir(σ, p) = (v_iσ(i) − p_σ(i)) − (v_ir − p_r). Then define the slack of this allocation as slack(σ, p) = min_{i∈N} min_{r≠σ(i)} Δ_ir(σ, p). Thus, an allocation has positive slack if every agent strictly prefers their allocation to all other agents' allocations. An allocation (σ, p) is envy-free if and only if slack(σ, p) ≥ 0. Slack is a measure of how robustly fair an allocation is, which we formalize in the following result.

Proposition 3.1. Let (σ, p) be an envy-free allocation with slack(σ, p) = s ≥ 0. If v′ is a valuation profile that is s-close to v, in the sense that ∑_{r∈R} |v_ir − v′_ir| ≤ s for all i ∈ N, then (σ, p) is also envy-free under v′.

Proof. Let i, j ∈ N.
Then ∑_{r∈R} |v_ir − v′_ir| ≤ s implies

(v_iσ(i) − v′_iσ(i)) + (v′_iσ(j) − v_iσ(j)) ≤ s.  (1)

Adding p_σ(j) − p_σ(i) to both sides and rearranging, we get

(v′_iσ(i) − p_σ(i)) − (v′_iσ(j) − p_σ(j)) ≥ (v_iσ(i) − p_σ(i)) − (v_iσ(j) − p_σ(j)) − s ≥ 0,

where the last inequality is by the definition of slack. Thus, i does not envy j under v′. Since i and j were arbitrary, it follows that (σ, p) is envy-free under v′.

One can also prove variants of Proposition 3.1. For example, ‖v_i − v′_i‖_∞ ≤ s/2 for all i ∈ N also implies (1).² If we wish to ensure robustness in a sense like in Proposition 3.1, this suggests the following rule: maxislack(v) = argmax_{(σ,p)} slack(σ, p). This rule always selects an envy-free allocation: since envy-free allocations exist for every v, there exists an allocation with non-negative slack, and hence the maxislack solution also has non-negative slack. A maxislack solution can be found in polynomial time by computing an optimal assignment σ and then solving the following LP:

max L subject to (v_iσ(i) − p_σ(i)) − (v_iσ(j) − p_σ(j)) ≥ L for all i ≠ j, and ∑_{r∈R} p_r = 1.

However, there are a few drawbacks to the maxislack rule. First, the rule is not essentially single-valued: there may be several maxislack allocations which induce different utilities. This is unlike Spliddit's maximin rule, which is essentially single-valued [Alkan et al., 1991]. Second, there may be maxislack allocations that do not maximize robustness for all agents. To see this, suppose that two agents i₁ and i₂ agree on the valuation of every room. Then in any envy-free allocation, the utility they assign to the two bundles allocated to them is equal. Hence the maximum slack attainable is 0, and so every envy-free allocation is maxislack. However, there may be allocations for which the slack between other pairs of agents is larger than 0, and such allocations are more robustly fair. In this spirit, to obtain robustness for a larger collection of agents (or of agent pairs), we can refine the maxislack solution using a leximin strategy.
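Both steps of the maxislack computation can be sketched with scipy: a maximum-weight matching for the optimal assignment σ, followed by the LP described above. This is an illustration on the Table 1 instance (rent 1000), not the paper's implementation:

```python
import numpy as np
from scipy.optimize import linprog, linear_sum_assignment

V = np.array([[300, 400, 300], [300, 700, 0], [300, 100, 600]], dtype=float)
n, rent = 3, 1000.0

# Step 1: welfare-optimal room assignment via maximum-weight bipartite matching.
rows, cols = linear_sum_assignment(V, maximize=True)
sigma = dict(zip(rows, cols))                # here: the identity assignment

# Step 2: maxislack LP over variables (p_1, ..., p_n, L): maximize L subject to
#   (v_{i,sigma(i)} - p_{sigma(i)}) - (v_{i,sigma(j)} - p_{sigma(j)}) >= L  for i != j
#   sum_r p_r = rent.
A_ub, b_ub = [], []
for i in range(n):
    for j in range(n):
        if i != j:
            row = np.zeros(n + 1)            # p_{sigma(i)} - p_{sigma(j)} + L <= ...
            row[sigma[i]], row[sigma[j]], row[n] = 1.0, -1.0, 1.0
            A_ub.append(row)
            b_ub.append(V[i, sigma[i]] - V[i, sigma[j]])
res = linprog(np.append(np.zeros(n), -1.0),  # minimize -L, i.e. maximize L
              A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=[np.append(np.ones(n), 0.0)], b_eq=[rent],
              bounds=[(None, None)] * (n + 1))
prices, L = res.x[:n], res.x[n]
print(L)         # 150.0 for this instance
print(prices)    # [200. 450. 350.]
```

On this instance the maximum slack is 150 and the optimal prices happen to be unique, coinciding with the lexislack prices of Table 1; in general the maxislack LP can have multiple optima, which is exactly the drawback discussed above.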
We call the resulting solution the lexislack rule. The lexislack rule selects an allocation (σ, p) that maximizes the smallest of the n² values (Δ_ir(σ, p))_{i∈N, r∈R}, and subject to that maximizes the second-smallest of these values, and so on. In contrast to maxislack, the lexislack rule is essentially single-valued. The proof is in Appendix D.2.

Theorem 3.2. The lexislack rule is essentially single-valued.

In addition, this rule remains efficiently computable.

Theorem 3.3. A lexislack allocation can be found in polynomial time by solving O(n⁴) linear programs.

Proof sketch. This can be done using standard techniques [see Kurokawa et al., 2018, Section 5]. We give an overview of the algorithm. Start by computing an optimal σ. We will decide on the best value of each Δ_ir one by one. Let F, initially empty, be the set of (i, r) pairs whose value we have fixed. Use linear programming to find a price vector p such that (σ, p) maximizes the smallest of the non-fixed values Δ_ir, subject to keeping the fixed Δ_ir at their fixed values. Say the optimum is L. Now we need to find a pair (i, r) ∉ F such that necessarily Δ_ir = L in any lexislack allocation. This can again be done by linear programs that check whether it is possible that Δ_ir > L. One can show that at least one such pair (i, r) ∉ F must exist; we then add it to F, fix its value to L, and repeat.

## 4 Maximizing Probability of Envy-Freeness

In the previous section, we defined robustness using a measure of closeness based on the ℓ₁-distance. We now look at a more flexible model where true valuations are assumed to be noisy perturbations of the reported ones. A distribution D over valuations v, therefore, is obtained by asking agents for valuations, and then adding noise (e.g., Gaussian or uniform) around those valuations. Our goal will be to find an allocation (σ, p) that maximizes the probability of being envy-free with respect to D, i.e., one that maximizes EFrate_D(σ, p) = Pr_{v∼D}[(σ, p) is envy-free under v].
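For a fixed distribution D, EFrate_D can be estimated by simple Monte Carlo sampling. A sketch for the two Table 1 price vectors, under an assumed noise model (independent Gaussian noise with standard deviation 50 on each value; this model and the function name are ours, chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.array([[300, 400, 300], [300, 700, 0], [300, 100, 600]], dtype=float)

def ef_rate(prices, m=20_000):
    """Fraction of m sampled profiles under which (identity, prices) is envy-free."""
    W = V + rng.normal(0.0, 50.0, size=(m, 3, 3))    # m profiles v ~ D
    U = W - np.asarray(prices, dtype=float)          # quasilinear utilities
    own = U[:, np.arange(3), np.arange(3)]           # each agent's own-room utility
    return float(np.mean(np.all(own[:, :, None] >= U - 1e-9, axis=(1, 2))))

print(ef_rate([100, 500, 400]))   # Spliddit prices (slack 0): roughly 0.25
print(ef_rate([200, 450, 350]))   # lexislack prices (slack 150): much higher
```

The zero-slack Spliddit prices have two envy constraints that are exactly tight, so each flips with probability about one half under symmetric noise, while the lexislack prices need a perturbation of more than 150 before any constraint breaks.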
Our algorithmic approach for finding an allocation with a high probability of envy-freeness is to obtain a sample S of m valuation profiles sampled from D, and to compute an allocation that is envy-free on as many profiles in S as possible, i.e., one that maximizes EFrate_S(σ, p) = (1/m) · |{v ∈ S : (σ, p) is envy-free under v}|. If the number m of samples is sufficiently high, we may hope that the best allocation on the sample S is also approximately the best on the distribution D. In this section, we will give a bound on the sample size m that suffices to ensure this property, and then we will discuss the computational problem of finding the best allocation for a given sample.

²In future work, it may be interesting to study rules that explicitly maximize robustness defined with respect to the ℓ∞-distance rather than ℓ₁.

### 4.1 Sample Complexity

In this section, we will give an upper bound on the number of samples required to guarantee that the allocation that maximizes EFrate_S also (almost) maximizes EFrate_D, with high probability.

Theorem 4.1. Let ε, δ > 0. There is a value m ∈ ℕ with m = O((n² log n + log(1/δ))/ε²) such that for every probability distribution D over valuation profiles, if S is a collection of at least m samples drawn i.i.d. from D, and (σ*, p*) is the allocation that maximizes EFrate_S, then with probability at least 1 − δ, EFrate_D(σ*, p*) ≥ max_{(σ,p)} EFrate_D(σ, p) − ε.

We prove this theorem by adapting standard tools from learning theory. Let X be any set, with an unknown ground-truth labeling f : X → {0, 1}. A hypothesis is a function h : X → {0, 1}. Given a sample S = (x₁, …, x_m) of m elements of X (not necessarily distinct), write err_S(h) = (1/m) · |{x_i : h(x_i) ≠ f(x_i)}| for the fraction of samples that h labels incorrectly. For a probability distribution D over X, write err_D(h) = Pr_{x∼D}[h(x) ≠ f(x)] for the probability that h incorrectly labels a point sampled from D. A hypothesis class H is a set of hypotheses. Given a random sample S drawn i.i.d.
from D, and knowledge of the true labeling of those samples, our goal is to find a hypothesis h ∈ H that approximately minimizes err_D(h), with high probability. Note that the ground truth f, interpreted as a hypothesis, need not be a member of H. In learning theory, this setup corresponds to "agnostic PAC learning", where the realizability assumption is not required to hold [Shalev-Shwartz and Ben-David, 2014, Section 3.2]. We say that a set C ⊆ X is shattered by H if for all S ⊆ C, there exists h ∈ H with h(x) = 1 if x ∈ S and h(x) = 0 if x ∈ C \ S. In other words, if we restrict the hypotheses in H to the set C, then all possible labelings of C can be realized by hypotheses in H. The VC dimension VCdim(H) of H is the cardinality of the largest subset of X that is shattered by H. We are interested in the VC dimension due to the following standard result, adapted from Shalev-Shwartz and Ben-David [2014, Theorem 6.8], which says that PAC learning is possible on hypothesis classes of finite VC dimension.

Theorem 4.2. Let ε, δ > 0. Let H be a hypothesis class with VCdim(H) = d. Then there exists a value m ∈ ℕ with m = O((d + log(1/δ))/ε²) such that for every probability distribution D over X, if S is a collection of at least m samples drawn i.i.d. from D, and h* ∈ H is the hypothesis that minimizes err_S, then with probability at least 1 − δ, err_D(h*) ≤ min_{h∈H} err_D(h) + ε.

For our application, we let X be the set of all valuation profiles v. The correct labeling is f(v) = 1 for all v. We identify allocations with hypotheses: for an allocation (σ, p), we define the hypothesis h_(σ,p) so that for each v, h_(σ,p)(v) = 1 if (σ, p) is envy-free under v, and h_(σ,p)(v) = 0 otherwise. By these definitions, we have that for all S and D, EFrate_S(σ, p) = 1 − err_S(h_(σ,p)) and EFrate_D(σ, p) = 1 − err_D(h_(σ,p)). We study the hypothesis class H of all such hypotheses: H = {h_(σ,p) : (σ, p) an allocation}. To bound its VC dimension, the following result is useful:

Lemma 4.3 (Shalev-Shwartz and Ben-David, 2014, Exercise 6.11). Let H₁, …
, H_t be hypothesis classes over X, with VCdim(H_i) ≤ d for each i = 1, …, t. Then VCdim(H₁ ∪ ⋯ ∪ H_t) ≤ 4d log(2d) + 2 log(t).

We can now bound the VC dimension of H.

Lemma 4.4. VCdim(H) = O(n² log n).

Proof. For each room assignment σ, define the hypothesis class H_σ = {h_(σ,p) : p ∈ ℝⁿ} corresponding to allocations whose room assignment is σ. Then H = ⋃_σ H_σ, where the union ranges over all room assignments. We will show that VCdim(H_σ) ≤ n² for each σ. Since there are n! different room assignments and log n! = O(n log n), it follows from Lemma 4.3 that VCdim(H) = O(n² log n), as required. Let σ be a room assignment. Without loss of generality, assume that σ(i) = i. Let d ≥ n² + 1. Consider a collection of d distinct valuation profiles v^(1), …, v^(d). We show that this collection cannot be shattered by H_σ. For i, j ∈ N, say v^(k) is uniquely restricting for (i, j) if v^(k)_ij − v^(k)_ii > v^(ℓ)_ij − v^(ℓ)_ii for all ℓ ≠ k. Thus, such a profile uniquely maximizes the amount by which agent i prefers j's room to her own room, ignoring prices. Clearly, for any pair i, j ∈ N, at most one profile can be uniquely restricting for it. Since there are n² pairs (i, j) and d > n², there is at least one profile which is not uniquely restricting for any pair, say v^(1). We now ask if there is an allocation (σ, p) that is envy-free under v^(2), …, v^(d) but not envy-free under v^(1). We show that the answer is no, so H_σ fails to shatter this collection. Assume for a contradiction that (σ, p) is such an allocation. Since it is not envy-free under v^(1), there is a pair i, j ∈ N with

v^(1)_ij − p_j > v^(1)_ii − p_i, or equivalently v^(1)_ij − v^(1)_ii > p_j − p_i.  (2)

As v^(1) is not uniquely restricting for (i, j), there is some ℓ ≠ 1 with

v^(ℓ)_ij − v^(ℓ)_ii ≥ v^(1)_ij − v^(1)_ii.  (3)

Combining (2) and (3), it follows that v^(ℓ)_ij − v^(ℓ)_ii > p_j − p_i. Thus, (σ, p) is not envy-free under v^(ℓ), a contradiction.

Our main result in this section, Theorem 4.1, now follows immediately from Theorem 4.2.

### 4.2 Computational Complexity

To make use of Theorem 4.1, we need an algorithm that, given a collection S = (v^(1), …
, v^(m)) of valuation profiles sampled from D, finds an allocation that maximizes EFrate_S(σ, p). This problem can be encoded as an integer linear program via standard encoding techniques, using binary variables x_ir encoding that agent i receives room r, continuous variables p_r encoding the prices, and a binary variable y_k for each sample k ∈ [m], indicating whether the produced allocation is envy-free under v^(k). The full encoding appears in Appendix B. Instead of an ILP approach, can we hope for a polynomial-time algorithm finding the best allocation? Let us formulate our optimization problem as a decision problem as follows.

EF-RATE MAXIMIZATION
Input: Set N of agents, set R of rooms, a list of m valuation profiles v^(1), …, v^(m), a number B.
Question: Does there exist an allocation that is envy-free for at least B of the m valuation profiles?

Unfortunately, this problem is computationally hard. We prove this by a reduction from CLIQUE, given in Appendix D.3.

Theorem 4.5. EF-RATE MAXIMIZATION is NP-complete.

There are two sources of computational difficulty for solving EF-RATE MAXIMIZATION: we have to decide on one of the n! possible room assignments, and we have to decide which subset of valuation profiles we are aiming to be envy-free on. But in practice, there is a way to avoid the first source of hardness. Suppose the m valuation profiles are sampled from a continuous distribution D. Then with probability 1, for each sampled profile v^(k) there is a unique optimal room assignment σ^(k). Any solution to the EF-rate maximization problem must use a room assignment that is optimal for at least one of the given valuation profiles. Thus, at most m different room assignments are candidates, and we can find an optimal solution using m calls to the following problem (one call for each candidate assignment σ^(k)):

EF-RATE MAXIMIZATION WITH FIXED ASSIGNMENT
Input: A list of m valuation profiles v^(1), …, v^(m), a number B, a room assignment σ.
Question: Is there a price vector p such that (σ, p) is envy-free for at least B of the m valuation profiles?

Unfortunately, this version of the problem is also hard, and so this trick for continuous distributions does not help. We prove this by a reduction from the feedback arc set problem in Appendix D.4.

Theorem 4.6. EF-RATE MAXIMIZATION WITH FIXED ASSIGNMENT is NP-complete.

Nevertheless, as we show in Section 6, we can solve this problem in practice using integer linear programming (ILP). The reason this is possible is that the sample complexity is relatively low, leading to an ILP of practical size.

## 5 Minimizing Expected Envy

In Section 4, we defined robust envy-freeness via allocations that have a high probability of being envy-free when valuations come from a given distribution D. In this section, we consider a different objective function that is more fine-grained. In measuring the probability of envy-freeness, we implicitly treat all failures of envy-freeness equally. We will now minimize expected envy, which treats cases where one agent envies another by a lot as more severe. Given a valuation profile v and an allocation (σ, p), we define the allocation's (maximum) envy as

envy_v(σ, p) = max{0, max_{i,j∈N} (v_iσ(j) − p_σ(j)) − (v_iσ(i) − p_σ(i))}.

This quantity, which is related to slack as considered in Section 3, measures the biggest amount by which one agent prefers another's bundle. In principle one could allow negative values of envy_v(σ, p) for allocations that have positive slack, but we chose to force these values to be non-negative, since our focus is on avoiding envy. Note that an allocation is envy-free if and only if envy_v(σ, p) = 0. Our goal in this section is to find an allocation minimizing the expected envy with respect to D, defined as envy_D(σ, p) = E_{v∼D}[envy_v(σ, p)]. Our approach will be similar to before: we obtain a sufficiently large sample S of m profiles from D and select the allocation that does best on the sample, i.e.,
it minimizes envy_S(σ, p) = (1/m) ∑_{v∈S} envy_v(σ, p).

### 5.1 Sample Complexity

For stating our sample complexity bound, we assume that valuations v are normalized: v_ir ≥ 0 for all i ∈ N and r ∈ R, and ∑_{r∈R} v_ir = 1 for all i ∈ N. We are going to prove the following result:

Theorem 5.1. Let ε, δ > 0, and let D be a distribution. If we draw m = O(n/(ε²δ)) samples i.i.d. from D and (σ*, p*) minimizes envy_S, then with probability at least 1 − δ, we have envy_D(σ*, p*) < min_{(σ,p)} envy_D(σ, p) + ε.

Thus, if we draw sufficiently many samples, then with high probability the allocation minimizing expected envy on the sample will, up to ε, be minimizing with respect to D. We prove this result by discretizing the space of allocations. We then use a concentration inequality to show that, with high probability, the expected envy with respect to D is close to the expected envy with respect to the sample S. The proof of Theorem 5.1 appears in Appendix D.5. Note that for this result we employed a direct approach. This technique and its discretization step would not have worked for the envy-free rate, because two very close rent divisions can in principle have very different EF rates. On the other hand, we expect that similar bounds for the minimum envy objective could be obtained by using extensions of VC dimension to real-valued functions (e.g., pseudo-dimension).

### 5.2 Computational Complexity

Again, our sample complexity result needs an algorithm that finds the best allocation for a given sample S. As for EFrate, we can solve this problem using integer linear programming (see Appendix B). For the formal complexity analysis, consider the following decision problem:

EXPECTED ENVY MINIMIZATION
Input: A list S = (v^(1), …, v^(m)), a number B.
Question: Is there an allocation (σ, p) with envy_S(σ, p) ≤ B?

This problem is again NP-complete. The proof is in Appendix D.6, by reduction from CLIQUE.

Theorem 5.2. EXPECTED ENVY MINIMIZATION is NP-complete, even for binary valuation profiles.
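For intuition, the subproblem in which the room assignment is held fixed can be written as a pure linear program: one epigraph variable per sample upper-bounds that sample's maximum envy, and the objective averages them. A sketch (the two 2-agent sample profiles are made up for illustration; σ is the identity, rent 1):

```python
import numpy as np
from scipy.optimize import linprog

# Variables: p_0, p_1, e_1, e_2, where e_k upper-bounds the envy on sample k;
# objective: (1/m) * sum_k e_k, with e_k >= 0 enforcing the clamp at zero.
samples = [np.array([[0.8, 0.2], [0.4, 0.6]]),
           np.array([[0.3, 0.7], [0.2, 0.8]])]
n, m = 2, len(samples)

A_ub, b_ub = [], []
for k, V in enumerate(samples):
    for i in range(n):
        for j in range(n):
            if i != j:
                # (V[i,j] - p_j) - (V[i,i] - p_i) <= e_k
                row = np.zeros(n + m)
                row[i], row[j], row[n + k] = 1.0, -1.0, -1.0
                A_ub.append(row)
                b_ub.append(V[i, i] - V[i, j])
res = linprog(np.concatenate([np.zeros(n), np.ones(m) / m]),
              A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=[np.append(np.ones(n), np.zeros(m))], b_eq=[1.0],
              bounds=[(None, None)] * n + [(0.0, None)] * m)
print(round(res.fun, 6))   # 0.1: the two samples' EF constraints conflict
```

Here no price vector is envy-free on both samples, and the LP balances the two violations, yielding an average envy of 0.1.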
Interestingly, this problem becomes easy once we fix a room assignment σ, because the best price vector can then be computed by linear programming (the values of the integer variables in the ILP shown in Appendix B are determined by the fixed room assignment σ). Thus, the problem can be solved in time n! · poly(n, m), and hence is fixed-parameter tractable with respect to the number of agents n. This is good news: instances will often have a small number of agents, but we will want to consider as large a sample as feasible to ensure low maximum envy. In fact, as we will see momentarily, the NP-completeness of the problem is not an obstacle in real-world instances.

## 6 Experiments

We evaluated our rules on user data taken from Spliddit.³ We studied distributions obtained by adding noise to valuations. We started by selecting 1,000 instances v at random, to speed up computations. The same selection is used for each experiment. For each instance, we normalize the rent to 1, and normalize valuations to sum to 1. We considered three noise models, each parameterized by a choice of noise level ε ∈ {0, 0.01, …, 0.09}:

- v′_ir ← v_ir · (1 + Uniform[−ε, +ε]) (Uniform)
- v′_ir ← v_ir · (1 + N[0, ε]) (Normal)
- v′_ir ← v_ir · (1 + r · N[0, ε]) (Biased Normal)

In each of these noise models, valuations are increased or decreased by a random fraction. Here, N[µ, σ] is a normal distribution with mean µ and standard deviation σ. For the biased normal noise model, we put the rooms in an arbitrary fixed order and label them with integers r = 0, 1, …, n − 1; rooms with a higher index have more noise. For each noise model and choice of ε, we produced a sample S of size m = 100. We then computed allocations maximizing EFrate_S and minimizing envy_S. We also computed the maximin and lexislack rules based on the input profile v. For each of the four allocations, we calculated their values of EFrate_S and envy_S. We then averaged over all 1,000 instances. The results are shown in Figure 2 for the Uniform noise model.
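The three noise models translate directly into samplers. A sketch (function names are ours; `rng` is a numpy random Generator):

```python
import numpy as np

def uniform_noise(V, eps, rng):
    # v'_ir = v_ir * (1 + Uniform[-eps, +eps])
    return V * (1 + rng.uniform(-eps, eps, size=V.shape))

def normal_noise(V, eps, rng):
    # v'_ir = v_ir * (1 + N[0, eps])
    return V * (1 + rng.normal(0.0, eps, size=V.shape))

def biased_normal_noise(V, eps, rng):
    # v'_ir = v_ir * (1 + r * N[0, eps]), rooms labeled r = 0, ..., n-1,
    # so rooms with a higher index receive more noise.
    r = np.arange(V.shape[1])
    return V * (1 + r * rng.normal(0.0, eps, size=V.shape))

rng = np.random.default_rng(42)
V = np.array([[0.3, 0.4, 0.3], [0.3, 0.7, 0.0], [0.3, 0.1, 0.6]])
W = uniform_noise(V, 0.05, rng)
print(bool(np.all(np.abs(W - V) <= 0.05 * V + 1e-12)))   # True: bounded by eps * v
```

Note that under the biased normal model the room with index 0 is left unperturbed, since its noise factor is multiplied by r = 0.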
Results for the other noise models and more details are given in Appendix A. As expected, on each of the two metrics, the rule optimizing it does best, but all three rules aiming for robustness do similarly well. Spliddit's maximin rule does significantly worse on our metrics. Before the experiments, we expected that lexislack would do worse for the biased noise model, but this does not appear to be the case.

³This dataset was kindly provided to us in anonymized form by the maintainer of Spliddit, Nisarg Shah.

[Figure 2: Results of experiments for the Uniform noise model. Two panels plot, as a function of the noise parameter ε, the probability of envy-freeness (EFrate_D) and the expected envy (envy_D) for the maximin, lexislack, maxprob, and minenvy rules.]

In the appendix, we also evaluate the sampling-based rules on a freshly drawn sample different from the sample used to optimize the rules (Appendix A.2), as well as on a sample drawn from a different probability distribution (Appendix A.3), to evaluate how sensitive these methods are to being optimized on a small sample and to knowing the right noise distribution. In both cases, we find that the performance of the sampling-based methods worsens, while lexislack remains robust.

[Figure 3: Computation time depending on sample size, for maximizing the probability of envy-freeness and for minimizing expected envy.]

Figure 3 shows the average computation time needed to find allocations optimizing EFrate_S and envy_S, using Gurobi 9.1.2 on four threads of an AMD Ryzen 2990WX (128 GB RAM) with the ILP formulations from Appendix B. The results were obtained for a random selection of 300 Spliddit instances with n = 4, using the Uniform noise model with ε = 0.05 and sample sizes m varying from 1 to 125. Minimizing envy is much faster because the formulation has fewer integer variables.
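While optimizing over allocations requires an ILP, evaluating the two empirical objectives for a fixed allocation is straightforward. The following is a minimal sketch; the helper names are our own, σ maps agents to rooms, p is the price vector, and S is a list of sampled valuation profiles.

```python
def envy(v, sigma, p):
    """Maximum envy of allocation (sigma, p) at valuation profile v.

    sigma[i] is agent i's room, p[r] is the price of room r.
    The allocation is envy-free at v iff the returned value is 0.
    """
    worst = 0.0
    for i, row in enumerate(v):
        own_utility = row[sigma[i]] - p[sigma[i]]
        for r, val in enumerate(row):
            # How much agent i prefers room r at its price over their own.
            worst = max(worst, (val - p[r]) - own_utility)
    return worst

def ef_rate(sample, sigma, p, tol=1e-9):
    """Empirical EFrate_S: fraction of sampled profiles where (sigma, p) is envy-free."""
    return sum(envy(v, sigma, p) <= tol for v in sample) / len(sample)

def mean_envy(sample, sigma, p):
    """Empirical envy_S: average maximum envy over the sample."""
    return sum(envy(v, sigma, p) for v in sample) / len(sample)
```

On the example from Table 1, both the Spliddit prices (100, 500, 400) and the lexislack prices (200, 450, 350) give zero envy at the reported valuations, as expected.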
In Appendix A.4, we report some additional computation times as a function of the noise parameter ε.

7 Future Directions

Our approach should be applicable to many settings beyond rent division, such as homogeneous divisible goods, cake cutting, or even indivisible goods. For example, the lexislack rule can be adapted to these settings, and results similar to those of our distribution-based approach might be achievable. We have shown that the lexislack rule shares some key properties with the maximin rule, such as essential single-valuedness and polynomial-time computability. It would be interesting to contrast the two solutions axiomatically, for example with respect to strategic properties like manipulability. For our distribution-based approach, we assumed that we have access to D only via sampling. Often we may know D more explicitly, for example if we are just adding noise to reported valuations. For such well-behaved D, can we design direct algorithms for finding optimal allocations with respect to our two objectives, without needing samples?

Acknowledgments and Disclosure of Funding

This work was partially supported by the National Science Foundation under grants IIS-2147187, CCF-2007080, IIS-2024287, and CCF-1733556; and by the Office of Naval Research under grant N00014-20-1-2488.

References

A. Alkan, G. Demange, and D. Gale. Fair allocation of indivisible goods and criteria of justice. Econometrica, 59(4):1023–1039, 1991.

E. Aragones. A derivation of the money Rawlsian solution. Social Choice and Welfare, 12:267–276, 1995.

H. Aziz, P. Biró, R. de Haan, and B. Rastegari. Pareto optimal allocation under uncertain preferences: uncertainty models, algorithms, and complexity. Artificial Intelligence, 276:57–78, 2019.

H. Aziz, P. Biró, S. Gaspers, R. de Haan, N. Mattei, and B. Rastegari. Stable matching with uncertain linear preferences. Algorithmica, 82(5):1410–1433, 2020.

M.-F. Balcan. Data-driven algorithm design. In T.
Roughgarden, editor, Beyond the Worst-Case Analysis of Algorithms, chapter 29. Cambridge University Press, 2020.

X. Bei, Z. Li, J. Liu, S. Liu, and X. Lu. Fair division of mixed divisible and indivisible goods. Artificial Intelligence, 293:103436, 2021.

R. Bredereck, P. Faliszewski, A. Kaczmarczyk, R. Niedermeier, P. Skowron, and N. Talmon. Robustness among multiwinner voting rules. Artificial Intelligence, 290:Article 103403, 2021.

E. Budish. The combinatorial assignment problem: Approximate competitive equilibrium from equal incomes. Journal of Political Economy, 119(6):1061–1103, 2011.

I. Caragiannis, D. Kurokawa, H. Moulin, A. D. Procaccia, N. Shah, and J. Wang. The unreasonable fairness of maximum Nash welfare. ACM Transactions on Economics and Computation (TEAC), 7(3):1–32, 2019.

J. Chen, P. Skowron, and M. Sorge. Matchings under preferences: Strength of stability and trade-offs. In Proceedings of the 2019 ACM Conference on Economics and Computation (EC), pages 41–59, 2019.

A. Critch. Robust rental harmony. https://acritch.com/rent/, 2015. Archived at https://perma.cc/Q9MD-HMAY.

B. Flanigan, P. Gölz, A. Gupta, B. Hennig, and A. D. Procaccia. Fair algorithms for selecting citizens' assemblies. Nature, 596:548–552, 2021.

Y. Gal, M. Mash, A. D. Procaccia, and Y. Zick. Which is the fairest (rent division) of them all? Journal of the ACM, 64(6):Article 39, 2017.

G. Gawron and P. Faliszewski. Robustness of approval-based multiwinner voting rules. In Proceedings of the 6th International Conference on Algorithmic Decision Theory (ADT), pages 17–31, 2019.

J. Goldman and A. D. Procaccia. Spliddit: Unleashing fair division algorithms. SIGecom Exchanges, 13(2):41–46, 2014.

D. Kurokawa, A. D. Procaccia, and N. Shah. Leximin allocations in the real world. ACM Transactions on Economics and Computation, 6(3–4):Article 11, 2018.

T. Mai and V. V. Vazirani. Finding stable matchings that are robust to errors in the input.
In Proceedings of the 26th Annual European Symposium on Algorithms (ESA), Article No. 60, 2018.

V. Menon and K. Larson. Algorithmic stability in fair allocation of indivisible goods among two agents. arXiv:2007.15203, 2020.

N. Misra and C. Sonar. Robustness radius for Chamberlin–Courant on restricted domains. In Proceedings of the 45th International Conference on Current Trends in Theory and Practice of Informatics (SOFSEM), pages 341–353, 2019.

H. Moulin. Fair division in the internet age. Annual Review of Economics, 11:407–441, 2019.

A. D. Procaccia, R. A. Velez, and D. Yu. Fair rent division on a budget. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), pages 1177–1184, 2018.

S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014.

D. Shiryaev, L. Yu, and E. Elkind. On elections with robust winners. In Proceedings of the 12th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), pages 415–422, 2013.

F. E. Su. Rental harmony: Sperner's lemma in fair division. American Mathematical Monthly, 106(10):930–942, 1999.

L.-G. Svensson. Large indivisibles: An analysis with respect to price equilibrium and fairness. Econometrica, 51(4):939–954, 1983.

R. A. Velez. Equitable rent division. ACM Transactions on Economics and Computation (TEAC), 6(2):Article 9, 2018.

L. A. Wolsey. Integer Programming. John Wiley & Sons, 1998.

Checklist

1. For all authors...
   (a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope? [Yes]
   (b) Did you describe the limitations of your work? [Yes] We mention open problems in the section on future directions, as well as throughout the paper.
   (c) Did you discuss any potential negative societal impacts of your work? [N/A]
   (d) Have you read the ethics review guidelines and ensured that your paper conforms to them? [Yes]
2. If you are including theoretical results...
   (a) Did you state the full set of assumptions of all theoretical results? [Yes]
   (b) Did you include complete proofs of all theoretical results? [Yes] Due to space constraints, many proofs appear in the appendix.
3. If you ran experiments...
   (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] In the supplemental material, but not including the data.
   (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]
   (c) Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? [Yes] We report standard errors, with respect to the 1,000 instances.
   (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See Figure 3 and the hardware details at the end of Section 6.
4. If you are using existing assets (e.g., code, data, models) or curating/releasing new assets...
   (a) If your work uses existing assets, did you cite the creators? [Yes] We use a proprietary dataset from Spliddit.org, whose creators we cite [Goldman and Procaccia, 2014].
   (b) Did you mention the license of the assets? [N/A]
   (c) Did you include any new assets either in the supplemental material or as a URL? [No]
   (d) Did you discuss whether and how consent was obtained from people whose data you're using/curating? [No]
   (e) Did you discuss whether the data you are using/curating contains personally identifiable information or offensive content? [No] The data we received was completely anonymous.
5. If you used crowdsourcing or conducted research with human subjects...
   (a) Did you include the full text of instructions given to participants and screenshots, if applicable? [N/A]
   (b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable?
[N/A]
   (c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation? [N/A]