Complexity of Manipulating Sequential Allocation

Haris Aziz (Data61, CSIRO and UNSW, Sydney, Australia, haris.aziz@data61.csiro.au), Sylvain Bouveret (LIG - Grenoble INP, France, sylvain.bouveret@imag.fr), Jérôme Lang (Université Paris-Dauphine, PSL Research University, CNRS, LAMSADE, Paris, France, lang@lamsade.dauphine.fr), Simon Mackenzie (Carnegie Mellon University, Pittsburgh, USA, simonm@andrew.cmu.edu)

Abstract

Sequential allocation is a simple allocation mechanism in which agents are given pre-specified turns in which they take one item among those that are still available. It has long been known that sequential allocation is not strategyproof. This raises the question of the complexity of computing a preference report that yields a higher utility than the truthful preference. We show that the problem is NP-complete for one manipulating agent with additive utilities and several non-manipulating agents. In doing so, we correct a wrong claim made in a previous paper. We then give two additional results. First, we present a polynomial-time algorithm for optimal manipulation when the manipulator has additive binary utilities. Second, we consider a stronger notion of manipulation whereby the untruthful outcome yields more utility than the truthful outcome for all utilities consistent with the ordinal preferences; for this notion, we show that a manipulation, if any, can be computed in polynomial time.

Introduction

A simple but popular mechanism to allocate indivisible items is sequential allocation (Aziz, Walsh, and Xia 2015; Bouveret and Lang 2011; Brams and Straffin 1979; Brams and Taylor 1996; Kalinowski, Narodytska, and Walsh 2013; Kalinowski et al. 2013; Kohler and Chandrasekaran 1971; Levine and Stange 2012; Tominaga, Todo, and Yokoo 2016). In sequential allocation, a sequence specifies the turns of the agents. For example, for sequence 1212, agents 1 and 2 alternate, with agent 1 taking the first turn. Agents report their preferences by expressing a linear order over the items, and are allocated items based on their reported preferences as follows: they get turns according to the sequence, and when their turn comes, they are given the most preferred item (according to their reported order) that has not yet been allocated.

Sequential allocation is an ordinal mechanism, since the outcome only depends on the ordinal preferences of agents over items. Nevertheless, it is a standard assumption in the literature that agents have underlying additive utilities for the items.¹

¹We view here sequential allocation as a centralized mechanism; it can also be seen as a decentralized mechanism, where instead of reporting their preferences, agents pick a remaining item each time their turn comes. Our results hold for both views but are easier to state for the centralized view.

It has long been known that sequential allocation is not strategyproof, in particular when agents do not have consecutive turns. This motivates the natural problem of computing best responses (also referred to as manipulations). Kohler and Chandrasekaran (1971) presented a polynomial-time algorithm to compute the optimal manipulation of an agent when there are two agents and the sequence is alternating (121212...). Bouveret and Lang (2011) initiated further work on manipulation of sequential allocation and showed that (1) it can be checked in polynomial time whether an agent can be allocated a given subset of items; they then showed in later work (Bouveret and Lang 2014) that (2) an optimal manipulation for two agents (one manipulator and one non-manipulator) can be found in polynomial time for any sequence.
Now, they also claim (page 142, left column, lines 5-7) that by putting together (1) and (2), a best response can be computed in polynomial time for one manipulator against several non-manipulators, and that each best response results in the same allocation for the manipulator. As we will show, this conclusion is wrong.

Results

We focus on computing best responses (or manipulations) under sequential allocation. We first show that the algorithm given by Bouveret and Lang (2014) for computing a best response against one non-manipulator does not extend to several non-manipulators. We then show that the problem of computing a best response is in fact NP-complete. The result has some interesting consequences, since many allocation rules are based on sequential allocation. We also give two additional results. We first present a polynomial-time algorithm for optimal manipulation when the manipulator has binary utilities. We then consider a stronger notion of manipulation whereby the untruthful outcome yields more utility than the truthful outcome for all utilities consistent with the ordinal preferences. For this notion, we give a polynomial-time algorithm for computing a manipulation.

Preliminaries

We consider a set of agents $N = \{1, \ldots, n\}$ and a set of items $O = \{o_1, \ldots, o_m\}$. Agent 1 is the manipulator and agents 2 to $n$ are the non-manipulators. The preference profile $\succ = (\succ_1, \ldots, \succ_n)$ specifies for each agent $i$ his complete, strict, and transitive preference $\succ_i$ over $O$. A preference relation $\succ_i$ is usually denoted by a list, such as $\succ_i: a, b, c, d$ for $a \succ_i b \succ_i c \succ_i d$. Moreover, for agent 1 we are also given a set of positive values $\{u_1(o) \mid o \in O\}$, where for each $o$, $u_1(o) > 0$ is agent 1's cardinal evaluation of the value of item $o$. These values are consistent with $\succ_1$, that is, $u_1(o) > u_1(o')$ iff $o \succ_1 o'$. We assume that agent 1's preferences over sets of items are represented by the additive utility function $u_1(O') = \sum_{o \in O'} u_1(o)$ for all $O' \subseteq O$. We say that $u_1$ is lexicographic if for all $o \in O$, $u_1(o) > \sum_{o': o \succ_1 o'} u_1(o')$. We abbreviate $u_1(S) > u_1(T)$ into $S \succ_1 T$. Finally, an assignment $p$ is a function from $O$ to $N$.

Complexity of Manipulation

We first show that the best response algorithm of Bouveret and Lang (2014) does not work for $n \geq 3$, and that several optimal manipulations may result in different allocations.

Example 1. Let the sequence be 1231, and consider the following profile $\succ$.

$\succ_1$: a, b, c, d
$\succ_2$: c, d, a, b
$\succ_3$: a, b, c, d

According to (Bouveret and Lang 2014), the best response is one in which agent 1 gets $\{a, d\}$, which can even be achieved by the truthful report. Now, if agent 1 misreports $\succ'_1$: c, b, a, d, then he gets $\{b, c\}$. Which one of $\{a, d\}$ and $\{b, c\}$ is better depends on $u_1$ and not just on $\succ_1$. For example, if $u_1(a) = 5$, $u_1(b) = 4$, $u_1(c) = 3$ and $u_1(d) = 1$, then $\{b, c\} \succ_1 \{a, d\}$ and the best achievable allocation for agent 1 is $\{b, c\}$. On the other hand, if $u_1(a) = 4$, $u_1(b) = 3$, $u_1(c) = 2$ and $u_1(d) = 1$, then agent 1 is indifferent between $\{a, d\}$ and $\{b, c\}$. Thus, for more than one non-manipulator the algorithm of Bouveret and Lang (2014) does not necessarily compute a best response,² and the complexity of computing a best response is an open problem.

²The reason is that when they translate the $n$-agent problem $P$ of deciding whether agent 1 can obtain a set of objects $S$ into a 2-agent problem $P'$ (Bouveret and Lang 2011), the preference relation of agent 2 in $P'$ depends on $S$, whereas in the best response algorithm for a 2-agent problem of Bouveret and Lang (2014), $S$ is not fixed.
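To make the mechanism and Example 1 concrete, the following is a minimal Python sketch of sequential allocation, together with a check of the two reports above. The function and variable names are our own and not part of the paper; the simulation simply follows the description of the mechanism given in the Introduction.

def sequential_allocation(sequence, preferences):
    """Simulate sequential allocation.
    sequence: list of agent identifiers, e.g. [1, 2, 3, 1]
    preferences: dict mapping each agent to its reported linear order
                 (a list of items, most preferred first)
    Returns a dict mapping each agent to the set of items it receives."""
    taken = set()
    allocation = {agent: set() for agent in preferences}
    for agent in sequence:
        # the agent picks its most preferred item that is still available
        for item in preferences[agent]:
            if item not in taken:
                taken.add(item)
                allocation[agent].add(item)
                break
    return allocation

if __name__ == "__main__":
    # Example 1 of the paper: sequence 1231
    seq = [1, 2, 3, 1]
    truthful = {1: list("abcd"), 2: list("cdab"), 3: list("abcd")}
    print(sequential_allocation(seq, truthful)[1])   # agent 1 obtains a and d

    # agent 1 misreports c, b, a, d and obtains b and c instead
    misreport = {**truthful, 1: list("cbad")}
    print(sequential_allocation(seq, misreport)[1])  # agent 1 obtains b and c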
We are now going to show that the problem is NP-hard. The reduction involves a similar high-level idea as that of the result of Aziz et al. (2015a) that manipulating the probabilistic serial (PS) mechanism is NP-hard. However, the reduction requires new gadgets. Also note that the NP-hardness result for the PS mechanism does not directly imply a similar result for sequential allocation. Similarly, the NP-hardness of the manipulation of sequential allocation does not imply the NP-hardness of the manipulation of the PS mechanism.

Theorem 2. Computing a best response for the sequential allocation mechanism is NP-complete.

Proof. We prove that the following problem (BEST RESPONSE) is NP-complete: given an assignment setting, a utility function $u_1 : O \rightarrow \mathbb{N}$ for the manipulator, and a target utility $T$, can the manipulator specify preferences such that the utility for his allocation under the sequential allocation rule is at least $T$?

The problem BEST RESPONSE is clearly in NP, since the outcome with respect to the reported preference can be computed by simulating sequential allocation. We show hardness by reduction from a restricted NP-complete version of 3SAT where each literal appears exactly twice in the set of clauses. The problem remains NP-complete (Berman, Karpinski, and Scott 2003). Given such a 3SAT instance $F = (X, C)$, where $X = \{x_1, \ldots, x_{|X|}\}$ is the set of variables and $C$ the set of clauses, we build an instance of BEST RESPONSE where the manipulator can obtain utility $T$ if and only if $C$ is satisfiable. We denote by $L = \{x_1, \neg x_1, \ldots, x_{|X|}, \neg x_{|X|}\}$ the set of literals. We define the set of agents as $N = \{1\} \cup \{a^1_{l_i}, a^2_{l_i} : l_i \in L\}$, with agent 1 as the manipulator, and the set of items as $O = \{o^1_c, o^2_c, o^3_c : c \in C\} \cup \{o^1_{l_i}, o^2_{l_i}, h^1_{l_i}, h^2_{l_i}, h^3_{l_i}, d^{11}_{l_i}, d^{12}_{l_i}, d^{21}_{l_i}, d^{22}_{l_i} : l_i \in L\}$, where:

$O(c) = \{o^1_c, o^2_c, o^3_c\}$: clause items associated with clause $c$;
$O(l_i) = \{o^1_{l_i}, o^2_{l_i}\}$: choice items associated with literal $l_i$;
$H(l_i) = \{h^1_{l_i}, h^2_{l_i}, h^3_{l_i}\}$: consistency items associated with $l_i$;
$D(l_i) = \{d^{11}_{l_i}, d^{12}_{l_i}, d^{21}_{l_i}, d^{22}_{l_i}\}$: dummy items associated with $l_i$.

We view the sequential allocation process as follows. The agents' preferences are built in such a way that the agents will first go through $|X|$ choice rounds corresponding to variables $x_1, \ldots, x_{|X|}$, then $|C|$ clause rounds corresponding to $c_1, \ldots, c_{|C|}$, and then one final collection round.

High-level Idea

The picking sequence is composed of successive rounds. Each round is associated with a subset of items that are valued by agent 1 much higher than the items associated with later rounds. If he wants to reach the target utility $T$, then in each round agent 1 must focus only on the items relevant to that round; those he does not get during the corresponding round will be taken by other agents, and it will not be possible for him to take them in later rounds. In each round, agent 1 makes a choice between the items corresponding to a variable and to its negation. There is a negligible difference between the utility of the items corresponding to the variable and those corresponding to its negation (for example $o^1_{x}$ and $o^1_{\neg x}$).
If he chooses the items in a correct way, then he will get a most preferred item corresponding to each of the clauses. More precisely, agent 1, in any case, will get one item $h^j_{x_i}$ and one item $h^k_{\neg x_i}$ for each variable $x_i$; if he chooses in a correct way, then this pair of items $\{h^j_{x_i}, h^k_{\neg x_i}\}$ will give him a sufficiently high utility, otherwise there will be a utility loss which he will never be able to compensate. We show below that agent 1 reaches utility $T$ if and only if these two conditions are satisfied:

1. he chooses the choice items in a consistent way, that is, at each choice round $i$ he chooses either $o^1_{\neg x_i}$ and $o^2_{\neg x_i}$ (which we view as assigning $x_i$ to false), or $o^1_{x_i}$ and $o^2_{x_i}$ (which we view as assigning $x_i$ to true);

2. he manages to get his most preferred clause item for each clause $c$; this is possible only if none of the agents corresponding to the negation of the literals in $c$ gets this clause item before him.

Utility function of agent 1

Agent 1 highly prefers choice items relevant to variable $x_i$ to items relevant to variable $x_j$, $j > i$; for a given variable he highly prefers its relevant choice items to its relevant consistency items. These items are highly preferred to clause items, which in turn are highly preferred to dummy items:

$O(x_1), O(\neg x_1) \succ_1 H(x_1), H(\neg x_1) \succ_1 \ldots \succ_1 O(x_{|X|}), O(\neg x_{|X|}) \succ_1 H(x_{|X|}), H(\neg x_{|X|}) \succ_1 o^1_{c_1} \succ_1 o^1_{c_2} \succ_1 \ldots \succ_1 o^1_{c_{|C|}} \succ_1$ other items.

His utilities for the choice and consistency items associated with variable $x_i$ are as follows:

$u_1(o^1_{x_i}) = u_1(o^1_{\neg x_i}) + \epsilon$ and $u_1(o^2_{x_i}) = u_1(o^2_{\neg x_i}) + \epsilon$, where $\epsilon$ is a negligible quantity;

$h^1_{\neg x_i} \succ_1 h^2_{\neg x_i} \succ_1 h^3_{\neg x_i} \succ_1 h^1_{x_i} \succ_1 h^2_{x_i} \succ_1 h^3_{x_i}$;

$u_1(h^2_{x_i}) + u_1(h^2_{\neg x_i}) < u_1(h^1_{x_i}) + u_1(h^3_{\neg x_i}) = u_1(h^1_{\neg x_i}) + u_1(h^3_{x_i})$.

Agent 1 reaches utility $T$ if and only if he gets the two choice items corresponding to the same literal and at least one top consistency item ($h^1_{x_i}$ or $h^1_{\neg x_i}$) in each choice round, and his target clause item in each clause round.

Choice Round

The choice round is composed of a series of rounds, one for each variable $x_i$; each of them is itself composed of two subrounds, and in each of them agent 1 has to choose a choice item. The sub-sequence in the choice round corresponding to variable $x_i$ is as follows:

$1, a^1_{x_i}, a^2_{x_i}, a^1_{\neg x_i}, a^2_{\neg x_i}, 1, a^1_{x_i}, a^2_{x_i}, a^1_{\neg x_i}, a^2_{\neg x_i}, a^1_{x_i}, a^2_{x_i}, 1, a^1_{\neg x_i}, a^2_{\neg x_i}, 1$

The ordinal preferences relevant for the choice round corresponding to variable $x_i$ are as follows. Recall that agent 1's relevant preferences are

$\succ_1: o^1_{x_i}, o^1_{\neg x_i}, o^2_{x_i}, o^2_{\neg x_i}, h^1_{\neg x_i}, h^2_{\neg x_i}, h^3_{\neg x_i}, h^1_{x_i}, h^2_{x_i}, h^3_{x_i}$

Then, each literal agent prefers the choice items corresponding to the negation of its literal:

$a^1_{x_i}: o^1_{\neg x_i}, d^{11}_{x_i}, d^{12}_{x_i}, o^2_{\neg x_i}, h^1_{x_i}, h^2_{x_i}, h^3_{x_i}$
$a^2_{x_i}: d^{21}_{x_i}, o^1_{\neg x_i}, o^2_{\neg x_i}, d^{22}_{x_i}, h^1_{x_i}, h^2_{x_i}, h^3_{x_i}$
$a^1_{\neg x_i}: o^1_{x_i}, d^{11}_{\neg x_i}, h^1_{\neg x_i}, o^2_{x_i}, h^2_{\neg x_i}, h^3_{\neg x_i}, d^{12}_{\neg x_i}$
$a^2_{\neg x_i}: d^{21}_{\neg x_i}, o^1_{x_i}, o^2_{x_i}, h^1_{\neg x_i}, h^2_{\neg x_i}, h^3_{\neg x_i}, d^{22}_{\neg x_i}$

Items that do not appear in these lists are lower in the preference list of the corresponding agent. Note the asymmetry between the agents corresponding to positive and negative literals: the least preferred item of a positive-literal agent relevant to the picking in the choice round is one of its consistency items, whereas for a negative-literal agent it is one of its dummy items.

We say that agent 1 makes a consistent choice in the choice round corresponding to $x_i$ if he picks both $o^1_{x_i}$ and $o^2_{x_i}$ (in which case we say he assigns $x_i$ to true) or both $o^1_{\neg x_i}$ and $o^2_{\neg x_i}$ (in which case we say he assigns $x_i$ to false). Given the preferences of the other agents, the pairs of consistency items that agent 1 can possibly get are only these three: $\{h^1_{x_i}, h^3_{\neg x_i}\}$, $\{h^1_{\neg x_i}, h^3_{x_i}\}$ and $\{h^2_{x_i}, h^2_{\neg x_i}\}$.
If he makes a consistent choice, then he will get one of the first two pairs, for which he has the same utility (see Tables 1 and 2), and if he makes an inconsistent choice, then he will get the pair $\{h^2_{x_i}, h^2_{\neg x_i}\}$, which gives him a smaller utility than the other two pairs (see Tables 3 and 4).

Clause round

The sequence in the clause round corresponding to clause $c = (l_i \vee l_j \vee l_k)$ is as follows:

$a_{l_i}, a_{l_j}, a_{l_k}$

For each literal $l$ in clause $c$, there is an agent $a^1_l$ or $a^2_l$ that features in the round. Recall that each literal occurs in exactly two clauses in the set of clauses. Agent $a^1_l$ (respectively, $a^2_l$) features if $c$ is the first (respectively, second) clause in which $l$ is present. After all the clause rounds are finished, agent 1 will get $|C|$ turns in which he will get the clause items that are still available. For agent 1, according to $u_1$, his relevant preference in the clause round corresponding to $c$ is to go for $o^1_c$. For a literal $l$, let $c$ and $c'$ be the two clauses containing $l$. If $l = x$ ($x$ being a propositional variable), the preferences of agents $a^1_x$ and $a^2_x$ in the rounds corresponding to $l$ are

$a^1_x: h^2_x, h^3_x, o^3_c, o^2_c, o^1_c$
$a^2_x: h^2_x, h^3_x, o^3_{c'}, o^2_{c'}, o^1_{c'}$

and if $l = \neg x$, then the preferences of agents $a^1_{\neg x}$ and $a^2_{\neg x}$ in the rounds corresponding to $l$ are

$a^1_{\neg x}: d^{12}_{\neg x}, o^3_c, o^2_c, o^1_c$
$a^2_{\neg x}: d^{22}_{\neg x}, o^3_{c'}, o^2_{c'}, o^1_{c'}$

Note that the consistency items $h^2_x$ and $h^3_x$, as well as the dummy items $d^{12}_{\neg x}$ and $d^{22}_{\neg x}$, were possibly picked in the choice round corresponding to variable $x$.

Stage: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
Agent: 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, $a^1_{x_i}$, $a^2_{x_i}$, 1, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1
Item picked: $o^1_{x_i}$, $o^1_{\neg x_i}$, $d^{21}_{x_i}$, $d^{11}_{\neg x_i}$, $d^{21}_{\neg x_i}$, $o^2_{x_i}$, $d^{11}_{x_i}$, $o^2_{\neg x_i}$, $h^1_{\neg x_i}$, $h^2_{\neg x_i}$, $d^{12}_{x_i}$, $d^{22}_{x_i}$, $h^3_{\neg x_i}$, $d^{12}_{\neg x_i}$, $d^{22}_{\neg x_i}$, $h^1_{x_i}$

Table 1: Choice round for variable $x_i$ in which agent 1 makes the consistent choice $\{o^1_{x_i}, o^2_{x_i}\}$ ("assign $x_i$ to true"). Agents $a^1_{x_i}$ and $a^2_{x_i}$ next focus on items $h^2_{x_i}$ and $h^3_{x_i}$ before turning their attention to the clause items. On the other hand, $a^1_{\neg x_i}$ and $a^2_{\neg x_i}$ are now ready to get clause items. In this way, the agents corresponding to literals set to false are quicker to get their clause items.

Stage: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
Agent: 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, $a^1_{x_i}$, $a^2_{x_i}$, 1, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1
Item picked: $o^1_{\neg x_i}$, $d^{11}_{x_i}$, $d^{21}_{x_i}$, $o^1_{x_i}$, $d^{21}_{\neg x_i}$, $o^2_{\neg x_i}$, $d^{12}_{x_i}$, $d^{22}_{x_i}$, $d^{11}_{\neg x_i}$, $o^2_{x_i}$, $h^1_{x_i}$, $h^2_{x_i}$, $h^1_{\neg x_i}$, $h^2_{\neg x_i}$, $h^3_{\neg x_i}$, $h^3_{x_i}$

Table 2: Choice round for variable $x_i$ in which agent 1 makes the consistent choice $\{o^1_{\neg x_i}, o^2_{\neg x_i}\}$ ("assign $x_i$ to false"). Agents $a^1_{\neg x_i}$ and $a^2_{\neg x_i}$ next focus on items $d^{12}_{\neg x_i}$ and $d^{22}_{\neg x_i}$ before turning their attention to the clause items. On the other hand, $a^1_{x_i}$ and $a^2_{x_i}$ are now ready to get clause items.

Stage: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
Agent: 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, $a^1_{x_i}$, $a^2_{x_i}$, 1, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1
Item picked: $o^1_{\neg x_i}$, $d^{11}_{x_i}$, $d^{21}_{x_i}$, $o^1_{x_i}$, $d^{21}_{\neg x_i}$, $o^2_{x_i}$, $d^{12}_{x_i}$, $o^2_{\neg x_i}$, $d^{11}_{\neg x_i}$, $h^1_{\neg x_i}$, $h^1_{x_i}$, $d^{22}_{x_i}$, $h^2_{\neg x_i}$, $h^3_{\neg x_i}$, $d^{22}_{\neg x_i}$, $h^2_{x_i}$

Table 3: Choice round for variable $x_i$ in which agent 1 makes the inconsistent choice $\{o^1_{\neg x_i}, o^2_{x_i}\}$. As a result of making an inconsistent choice, agent 1 gets neither $h^1_{x_i}$ nor $h^1_{\neg x_i}$.

Stage: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16
Agent: 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1, $a^1_{x_i}$, $a^2_{x_i}$, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, $a^1_{x_i}$, $a^2_{x_i}$, 1, $a^1_{\neg x_i}$, $a^2_{\neg x_i}$, 1
Item picked: $o^1_{x_i}$, $o^1_{\neg x_i}$, $d^{21}_{x_i}$, $d^{11}_{\neg x_i}$, $d^{21}_{\neg x_i}$, $o^2_{\neg x_i}$, $d^{11}_{x_i}$, $d^{22}_{x_i}$, $h^1_{\neg x_i}$, $o^2_{x_i}$, $d^{12}_{x_i}$, $h^1_{x_i}$, $h^2_{\neg x_i}$, $h^3_{\neg x_i}$, $d^{22}_{\neg x_i}$, $h^2_{x_i}$

Table 4: Choice round for variable $x_i$ in which agent 1 makes the inconsistent choice $\{o^1_{x_i}, o^2_{\neg x_i}\}$.
If agent 1 has made a consistent choice in the choice round corresponding to variable $x$ and has set $x$ to "true", then $a^1_{\neg x}$ and $a^2_{\neg x}$ are now ready to get clause items in their respective clause rounds. If agent 1 has made a consistent choice in the choice round corresponding to variable $x$ and has set $x$ to "false", then $a^1_{x}$ and $a^2_{x}$ already want to get clause items in their respective clause rounds. In the clause rounds, for any literal $l$ that is not satisfied by the truth assignment resulting from the choice rounds, the agent corresponding to $l$ gets a clause item $o^i_c$, where $c$ is a clause containing $l$. For example, if $x$ has been assigned to false, then $a^1_x$ gets a clause item for the first clause in which $x$ is present and $a^2_x$ gets a clause item for the second clause in which $x$ is present. Therefore, if all the literals of a clause $c$ are false, then agent 1 does not get $o^1_c$. If literal $l$ is satisfied, the agent corresponding to it gets a dummy or consistency item in that round instead of a clause item. This means that if none of the literals of a clause is satisfied, then all three clause items of that clause are gone, and agent 1 does not get a clause item for it. He is much more interested in the clause item $o^1_c$ than in the clause items $o^2_c$ and $o^3_c$: the other clause items are far down in his preference list, so he would rather get the top clause item $o^1_c$ of each clause $c$ than any of the $o^2_c$ and $o^3_c$.

For example, let us consider clause $c = (x_i \vee \neg x_j \vee \neg x_k)$ and assume variables $x_i$, $x_j$ and $x_k$ are set to true, i.e., agent 1 has picked the choice items corresponding to $x_i$, $x_j$ and $x_k$. This means that in the clause round, $a^1_{\neg x_j}$ and $a^1_{\neg x_k}$ are ready to take the clause items $o^3_c$ and $o^2_c$, but $a^1_{x_i}$ wants to get one of the unallocated consistency items before he is interested in the clause items $o^3_c, o^2_c, o^1_c$. This is helpful for agent 1 because he can get $o^1_c$. Since each literal occurs exactly twice in the set of clauses, note that as long as agent 1 makes a consistent choice, there will be another clause $c'$ in which literal $x_i$ is present, and if $x_i$ is set to true, then $a^2_{x_i}$ will get $h^3_{x_i}$ and hence agent 1 will be able to get $o^1_{c'}$. Note that after the clause rounds all the consistency items are already allocated, so agent 1 can hope to get all the top clause items if they were not already taken in the clause rounds.

Stage: 1, 2, 3, after clause rounds
Agent: $a^1_{x_i}$, $a^1_{\neg x_j}$, $a^1_{\neg x_k}$, 1
Item picked: $h^2_{x_i}$, $o^3_c$, $o^2_c$, $o^1_c$

Table 5: Clause round corresponding to $c = (x_i \vee \neg x_j \vee \neg x_k)$.

Collection round

The sequence in this round is $1, \ldots, 1$ ($|C|$ times). At this stage, agent 1 prefers $o^1_{c_1}, \ldots, o^1_{c_{|C|}}$ to all other remaining items. The idea is that if agent 1 makes choices that set all the clauses to true, then he gets all the top clause items. Note that if agent 1 makes consistent choices for the variables but the resulting assignment leaves some clause unsatisfied (because the set of clauses is unsatisfiable), then agent 1 does not get all the top clause items: since the remaining items, such as the $o^2_c$ and $o^3_c$, are much less preferred by agent 1 than the $o^1_c$, he is then forced to pick a much less preferred item in the collection round.

Based on the construction of the choice and clause rounds, we are in a position to prove a series of claims.

Claim 1. If agent 1 does not make a consistent choice of the variable items, then he does not get utility $T$.

Proof.
If agent 1 does not make a choice at all in a choice round (that is, if he does not pick one of $o^1_{x}$ or $o^1_{\neg x}$ and one of $o^2_{x}$ or $o^2_{\neg x}$), then his most preferred items corresponding to the literals are taken by the literal agents. If agent 1 makes a choice in each choice round but not a consistent one, then he gets $\{h^2_{x}, h^2_{\neg x}\}$, which has sufficiently less utility than $\{h^1_{x}, h^3_{\neg x}\}$ or $\{h^1_{\neg x}, h^3_{x}\}$, which implies that he cannot reach total utility $T$.

Claim 2. If agent 1 makes consistent choices but the resulting assignment is not satisfying, then agent 1 does not get utility $T$.

Proof. If some clause $c$ is set to false, then agent 1 is not able to get $o^1_c$, because the literal agents in the clause round corresponding to $c$ take all the items $o^3_c, o^2_c, o^1_c$. This implies that agent 1 does not get utility $T$.

Claim 3. If there exists a satisfying assignment for $C$, then agent 1 can get utility $T$.

Proof. If there exists a satisfying assignment, then consider the preference report of agent 1 in which, in each choice round, he picks $o^1_{\neg x_i}$ and $o^2_{\neg x_i}$ if $x_i$ is set to false (and $o^1_{x_i}$ and $o^2_{x_i}$ otherwise). By doing this he gets to pick a top consistency item in that round as well. Since all the clauses are satisfied, agent 1 is able to get his clause item $o^1_c$ for each clause $c$. The utilities are set in such a way that, as long as agent 1 gets the two choice items corresponding to the same literal (and hence at least one top consistency item) in each choice round and his target clause item for each clause, he gets utility at least $T$.

The claims show that agent 1 gets utility at least $T$ if and only if there is a satisfying truth assignment.

The result above also gives us the following statement: computing a best response is NP-hard. This raises the following question: what is the complexity of testing whether there exists a report that gives more utility than the truthful report?

Theorem 3. Checking whether there exists a report that yields more utility than the truthful report is NP-complete.

Proof. We first prove that the restriction of SAT to sets of clauses such that (a) exactly one clause has only negative literals, and (b) each literal appears exactly twice in the set of clauses, is NP-complete. We show this by reduction from SAT with restriction (b), which we know to be NP-complete. Let $F = \{c_1, \ldots, c_n\}$ be the initial set of clauses, with (b) holding. Let $G$ be the following set of clauses, where $d_1, \ldots, d_n$ are new variables:

$G = \{c_i \vee d_i : 1 \leq i \leq n\} \cup \{\neg d_i \vee d_{i+1} : 1 \leq i < n\} \cup \{\neg d_n\} \cup \{d_1 \vee \neg d_1 \vee \neg d_2 \vee \cdots \vee \neg d_n\}$

Each literal (including $d_i$ and $\neg d_i$ for all $i$) appears exactly twice in $G$, and all clauses except one ($\neg d_n$) contain at least one positive literal. Now, $G$ is equivalent to $\{\neg d_1, \ldots, \neg d_n\} \cup F$, therefore $G$ is satisfiable if and only if $F$ is satisfiable.

We now present a reduction from SAT with restrictions (a) and (b) to the existence of a better response than the truthful report. The reduction is essentially the same as that in the proof of Theorem 2, except that the clauses may not be of size 3. For clauses of different sizes, we simply create as many clause items as the number of literals in the clause, and agent 1 still needs to get the top clause item of each clause to achieve his target utility. The reduction again involves choice rounds for all the variables, and the goal is for agent 1 to maximize utility by making consistent choices and then getting as many clause items as possible.
In the same way as in the proof of Theorem 2, we can show that if the set of clauses is not satisfiable, agent 1's optimal strategy is to tell the truth (i.e., to set all variables to true), because this leads him to get consistent variable items and all top clause items except the one of the unique all-negative clause, and that if the set of clauses is satisfiable, then the satisfying truth assignment also gives the corresponding optimal preference report for agent 1. Therefore, checking whether there exists a better report for agent 1 than the truthful report is NP-complete.

Manipulation under Binary Utilities

We now consider the restriction that the manipulator is asked to report a linear order but has binary preferences over items: the set of items is partitioned into two subsets $O^+$ and $O^-$, such that for each $o \in O^+$, $u_1(o) = \alpha$ and for each $o \in O^-$, $u_1(o) = \beta < \alpha$; that is, the items in $O^+$ are the most preferred items of agent 1. Binary utilities are important, e.g., when each agent is only interested in a subset of acceptable items (Bogomolnaia and Moulin 2004). Since the number of items he gets with a fixed sequence is fixed, the utility obtained with a given report is maximal if and only if the number of objects in $O^+$ he gets is maximal. Therefore the manipulation problem under this restriction consists in finding a report leading agent 1 to get a maximum number of "top" items.

Let us assume that agent 1 (the manipulator) has $n_1$ turns. Let $\pi_{-1}$ be the policy obtained by removing all occurrences of 1 in $\pi$, and let $First(\pi_{-1}, \succ_2, \ldots, \succ_n, O^+)$ be the first item in $O^+$ picked by some agent when simulating $\pi_{-1}$ on $(\succ_2, \ldots, \succ_n)$. For example, take $\pi = 231312$, $\succ_2 = (o_1, o_2, o_3, o_4, o_5, o_6)$, $\succ_3 = (o_1, o_3, o_5, o_2, o_4, o_6)$ and $O^+ = \{o_2, o_4\}$. Then we have $\pi_{-1} = 2332$ and $First(\pi_{-1}, \succ_2, \succ_3, O^+) = o_2$, since simulating $\pi_{-1}$ leads to agent 2 taking $o_1$, agent 3 taking $o_3$ and $o_5$, and then agent 2 taking $o_2$. Moreover, let us write $\pi = (head(\pi), 1, tail(\pi))$, where $head(\pi)$ is the longest starting subsequence of $\pi$ in which 1 does not occur, and $tail(\pi)$ is the subsequence of $\pi$ starting right after the first occurrence of 1. Let $Allocate(head(\pi), \succ_2, \ldots, \succ_n)$ be the set of items allocated to agents $2, \ldots, n$ when simulating $head(\pi)$ with $(\succ_2, \ldots, \succ_n)$. For instance, with $\pi$, $\succ_2$ and $\succ_3$ as above, we have $head(\pi) = 23$, $tail(\pi) = 312$, and $Allocate(head(\pi), \succ_2, \succ_3) = \{o_1, o_3\}$.

Consider the following algorithm BR:

Input: $O^+$, $\pi$, $(\succ_2, \ldots, \succ_n)$
Output: $\succ_1$
  $k \leftarrow 1$; $n_1 \leftarrow$ number of occurrences of 1 in $\pi$;
  While $k \leq n_1$ and $O^+ \neq \emptyset$ and $\pi \neq 11\ldots1$ do
    $O' \leftarrow Allocate(head(\pi), \succ_2, \ldots, \succ_n)$;
    remove $O'$ from $O$, $O^+$, and $(\succ_2, \ldots, \succ_n)$;
    $\pi \leftarrow tail(\pi)$;
    $a_k \leftarrow First(\pi_{-1}, \succ_2, \ldots, \succ_n, O^+)$;
    remove $a_k$ from $O$, $O^+$, and $(\succ_2, \ldots, \succ_n)$;
    $k \leftarrow k + 1$;
  End While
  $\succ_1 \leftarrow (a_1, \ldots, a_{k-1})$;
  complete $\succ_1$ with the remaining items of $O^+$ (if any) in an arbitrary way, and then with the other items in an arbitrary way;
  Return $\succ_1$
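The following is a small, self-contained Python sketch of Algorithm BR. The helper names, the data layout, and the handling of the corner case where no remaining item of $O^+$ is in danger of being picked by another agent are our own choices and are not prescribed by the paper; otherwise the sketch follows the pseudocode above.

def pick_top(order, available):
    # most preferred item in `order` that is still available
    for item in order:
        if item in available:
            return item
    return None

def simulate_others(turns, prefs, available):
    # simulate the picks of the non-manipulators for the given turns;
    # return the list of items they pick, in order
    avail = set(available)
    picked = []
    for agent in turns:
        item = pick_top(prefs[agent], avail)
        if item is not None:
            avail.discard(item)
            picked.append(item)
    return picked

def best_response_binary(top_items, sequence, prefs):
    # sketch of Algorithm BR: `top_items` is O+, `sequence` the picking
    # sequence (1 denotes the manipulator), `prefs` the reported linear
    # orders of agents 2..n (each a list over all items)
    all_items = list(next(iter(prefs.values())))
    available = set(all_items)
    top = set(top_items)
    pi = list(sequence)
    n1 = sequence.count(1)
    report = []
    while len(report) < n1 and top and any(a != 1 for a in pi):
        cut = pi.index(1) if 1 in pi else len(pi)
        for item in simulate_others(pi[:cut], prefs, available):   # head(pi)
            available.discard(item)
            top.discard(item)
        pi = pi[cut + 1:]                                           # tail(pi)
        a_k = None
        for item in simulate_others([a for a in pi if a != 1], prefs, available):
            if item in top:          # First(pi_{-1}, ..., O+)
                a_k = item
                break
        if a_k is None:
            break   # no remaining O+ item is in danger; handling choice of this sketch
        report.append(a_k)
        available.discard(a_k)
        top.discard(a_k)
    report += [o for o in all_items if o in top]           # remaining O+ items
    report += [o for o in all_items if o not in report]    # then the other items
    return report

# the running example of the paper: pi = 231312, O+ = {o2, o4}
prefs = {2: ["o1", "o2", "o3", "o4", "o5", "o6"],
         3: ["o1", "o3", "o5", "o2", "o4", "o6"]}
print(best_response_binary({"o2", "o4"}, [2, 3, 1, 3, 1, 2], prefs))
# -> ['o2', 'o4', 'o1', 'o3', 'o5', 'o6']; reporting this order gives agent 1 both items of O+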
Theorem 4. A best response can be computed in polynomial time when the manipulator has binary utilities: Algorithm BR outputs a best response.

Proof. Assume that agent 1 has $n_1$ turns. Agent 1 is interested in getting a maximum number of top items from $O^+$. Let $\succ_1$ be the sequence of objects obtained using Algorithm BR. The main idea of the algorithm is that $\succ_1$ is gradually built in such a way that agent 1 can get all the top items in the list. A new top item $a_k$ is only appended to the list if agent 1 can additionally get it along with the previous items in the list. Just before adding $a_k$, when we simulate sequential allocation, agent 1 does not get $a_k$, because someone else gets it right after the turn of agent 1 in which he abstained from picking an item (his list only had $k-1$ items). Hence, if agent 1 does not abstain and in fact makes use of his $k$-th turn, then he can get $a_k$.

To prove the optimality of the algorithm, we reason iteratively on the number of picking turns of agent 1. Namely, we prove the following statement: ($H_{n_1}$) Suppose that the number of remaining picking turns of agent 1 is $n_1$. Suppose that there is a picking strategy for agent 1 to get $c$ objects among those from $O^+$, and let $\succ_1 = b_1, \ldots, b_{n_1}$ be the sequence of objects he gets with this strategy. Then the strategy $\succ'_1 = a_1, b_2, \ldots, b_{n_1}$, where we have simply replaced $b_1$ by $a_1$ in $\succ_1$, gives at least $c$ objects from $O^+$ to agent 1.

We omit the argument for ($H_{n_1}$) due to lack of space. By successively applying this hypothesis, we prove that for each $k$, the strategy $a_k, \ldots, a_{n_1}$ gives to agent 1 as many objects from $O^+$ as the strategy $b_k, \ldots, b_{n_1}$. This proves that, for any picking strategy, the strategy given by the algorithm gives agent 1 at least as many objects from $O^+$. Hence, the algorithm returns an optimal strategy.

Manipulation under Responsive Preferences

An allocation $S$ is more preferred with respect to responsive (RS) preferences than allocation $T$ if $S$ is the result of replacing an item in $T$ with a strictly more preferred item (Aziz et al. 2015b; Brams and King 2005, for instance). Note that the responsive relation is transitive but not complete. We examine the problem of determining whether there exists an untruthful report which yields an allocation that is strictly more preferred with respect to responsive preferences. The problem is equivalent to checking whether there exists a report that yields more utility with respect to all additive utilities consistent with the ordinal preferences. In contrast to the problem for specific utilities, this particular problem of checking whether there exists such a clear manipulation can be solved in polynomial time. Our algorithm is based on an intimate connection that we identify between manipulation under responsive preferences and under lexicographic preferences.

Theorem 5. It can be checked in polynomial time whether there exists a manipulation that gives an allocation that is strictly more preferred with respect to the responsive set extension than the truthful outcome.

Proof. We first present the algorithm.

Input: picking sequence $\pi$, profile $(\succ_1, \succ_2, \ldots, \succ_n)$
Output: $\succ'_1$
  Rename items such that $\succ_1 = o_1, \ldots, o_m$;
  $O_1 \leftarrow$ set of items obtained by agent 1 with his sincere strategy;
  $k \leftarrow 1$; $O' \leftarrow \emptyset$;
  Repeat
    If agent 1 can obtain $O' \cup \{o_k\}$ then $O' \leftarrow O' \cup \{o_k\}$ endif;
    $k \leftarrow k + 1$
  Until $O'$ lexicographically dominates $O_1$ or $k > m$
  If $O'$ lexicographically dominates $O_1$ then
    $\succ'_1 \leftarrow$ ranking of $O'$ allowing agent 1 to obtain $O'$;
    complete $\succ'_1$ with the items in $O \setminus O'$, ranked as in $\succ_1$;
    Return $\succ'_1$
  else Return failure endif
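Both the loop condition of this algorithm and the claim below compare two sets of items under agent 1's ordinal preference. The following Python sketch spells out the two comparisons; the helper names are our own, and the pairwise comparison of sorted sets is one standard way of testing strict responsive dominance for equal-size sets. The data anticipates Example 6 below.

def lex_dominates(s, t, ranking):
    # True if set s lexicographically dominates set t w.r.t. `ranking`
    # (a list of items, most preferred first): scanning items from best to
    # worst, the first item contained in exactly one of the sets is in s.
    for item in ranking:
        in_s, in_t = item in s, item in t
        if in_s != in_t:
            return in_s
    return False                      # identical sets: no strict dominance

def responsive_dominates(s, t, ranking):
    # True if s strictly dominates t in the responsive set extension,
    # for two sets of the same size: compare the i-th best items pairwise.
    pos = {item: i for i, item in enumerate(ranking)}
    s_sorted = sorted(s, key=pos.get)
    t_sorted = sorted(t, key=pos.get)
    at_least = all(pos[a] <= pos[b] for a, b in zip(s_sorted, t_sorted))
    strictly = any(pos[a] < pos[b] for a, b in zip(s_sorted, t_sorted))
    return at_least and strictly

# anticipating Example 6 below: the truthful outcome is {o1, o4, o6}
ranking = ["o1", "o2", "o3", "o4", "o5", "o6", "o7"]
print(lex_dominates({"o1", "o3"}, {"o1", "o4", "o6"}, ranking))               # True
print(responsive_dominates({"o1", "o3", "o6"}, {"o1", "o4", "o6"}, ranking))  # True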
It can be determined in polynomial time whether agent 1 can obtain $O' \cup \{o_k\}$ (Bouveret and Lang 2011, Proposition 7); therefore the algorithm runs in polynomial time. Now we claim that when the algorithm returns $\succ'_1$, then $\succ'_1$ yields a strictly better allocation than the truthful outcome. If the truthful report yields the lexicographically optimal outcome, then clearly there can be no responsively better outcome, because a responsively better outcome is also a lexicographically better outcome. Now assume that the truthful outcome is not lexicographically optimal. Then there exists another achievable allocation that is lexicographically better. Consider the first (most preferred) item that is in the lexicographically more preferred outcome but not in the truthful outcome. We compute an allocation and a partial preference of the manipulator that does not include any less preferred items. For the remaining items, we simply append them to the manipulator's preference list in decreasing order of preference. We claim that the resulting outcome responsively dominates the truthful outcome with respect to the manipulating agent's truthful preferences. The argument is technical and long and is omitted due to lack of space.

Example 6. Let $n = 3$, $m = 7$, $\pi = 1231231$, and

$\succ_1$: $o_1, o_2, o_3, o_4, o_5, o_6, o_7$
$\succ_2$: $o_2, o_1, o_5, o_3, o_4, o_6, o_7$
$\succ_3$: $o_2, o_3, o_4, o_1, o_5, o_7, o_6$

We have $O_1 = \{o_1, o_4, o_6\}$. At the first step, $o_1$ is added to $O'$; at the next step, $o_2$ is not added since $\{o_1, o_2\}$ is not feasible; at the next step, $o_3$ is added; as $O' = \{o_1, o_3\}$ lexicographically dominates $O_1$, we exit the loop, initialize $\succ'_1$ to $(o_3, o_1)$, and complete it into $\succ'_1$: $o_3, o_1, o_2, o_4, o_5, o_6, o_7$, which leads agent 1 to obtain $\{o_1, o_3, o_6\}$.

Conclusion

In this paper, we showed that computing a best response under sequential allocation so as to maximize additive utility is NP-hard. We also gave contrasting results showing that manipulating sequential allocation under binary utilities, and also with respect to responsive preferences, is easy. Our NP-hardness result does not involve a constant number of agents. It remains an interesting open problem whether manipulating sequential allocation with respect to cardinal utilities is NP-hard when the number of agents is three or some other constant.

Acknowledgement

This work is partly supported by the project ANR-14-CE24-0007-01 CoCoRICo-CoDec.

References

Aziz, H.; Gaspers, S.; Mackenzie, S.; Mattei, N.; Narodytska, N.; and Walsh, T. 2015a. Manipulating the probabilistic serial rule. In Proc. of 14th AAMAS Conference, 1451–1459.
Aziz, H.; Gaspers, S.; Mackenzie, S.; and Walsh, T. 2015b. Fair assignment of indivisible objects under ordinal preferences. Artificial Intelligence 227:71–92.
Aziz, H.; Walsh, T.; and Xia, L. 2015. Possible and necessary allocations via sequential mechanisms. In Proc. of 24th IJCAI, 468–474.
Berman, P.; Karpinski, M.; and Scott, A. D. 2003. Approximation hardness of short symmetric instances of MAX3SAT. Electronic Colloquium on Computational Complexity (ECCC) (049).
Bogomolnaia, A., and Moulin, H. 2004. Random matching under dichotomous preferences. Econometrica 72(1):257–279.
Bouveret, S., and Lang, J. 2011. A general elicitation-free protocol for allocating indivisible goods. In Proc. of 22nd IJCAI, 73–78. AAAI Press.
Bouveret, S., and Lang, J. 2014. Manipulating picking sequences. In Proc. of 21st ECAI, 141–146.
Brams, S. J., and King, D. L. 2005. Efficient fair division: Help the worst off or avoid envy? Rationality and Society 17(4):387–421.
Brams, S. J., and Straffin, P. D. 1979. Prisoners' dilemma and professional sports drafts. The American Mathematical Monthly 86(2):80–88.
Brams, S. J., and Taylor, A. D. 1996. Fair Division: From Cake-Cutting to Dispute Resolution. Cambridge University Press.
Kalinowski, T.; Narodytska, N.; Walsh, T.; and Xia, L. 2013. Strategic behavior when allocating indivisible goods sequentially. In Proc. of 27th AAAI Conference, 452–458. AAAI Press.
Kalinowski, T.; Narodytska, N.; and Walsh, T. 2013. A social welfare optimal sequential allocation procedure. In Proc. of 22nd IJCAI, 227–233. AAAI Press.
Kohler, D. A., and Chandrasekaran, R. 1971. A class of sequential games. Operations Research 19(2):270–277.
Levine, L., and Stange, K. E. 2012. How to make the most of a shared meal: Plan the last bite first. The American Mathematical Monthly 119(7):550–565.
Tominaga, Y.; Todo, T.; and Yokoo, M. 2016. Manipulations in two-agent sequential allocation with random sequences. In Proc. of 15th AAMAS Conference, 141–149. IFAAMAS.