# optimal_common_contract_with_heterogeneous_agents__76247929.pdf

The Thirty-Fourth AAAI Conference on Artiﬁcial Intelligence (AAAI-20)

Optimal Common Contract with Heterogeneous Agents

Shenke Xiao,1 Zihe Wang,2 Mengjing Chen,1 Pingzhong Tang,1 Xiwang Yang3

1Tsinghua University, 2Shanghai University of Finance and Economics, 3Byte Dance xsk15@mails.tsinghua.edu.cn, wang.zihe@mail.shufe.edu.cn, ccchmj@qq.com, kenshinping@gmail.com, yangxiwang@bytedance.com

We consider the principal-agent problem with heterogeneous agents. Previous works assume that the principal signs independent incentive contracts with every agent to make them invest more efforts on the tasks. However, in many circumstances, these contracts need to be identical for the sake of fairness. We investigate the optimal common contract problem. To our knowledge, this is the ﬁrst attempt to consider this natural and important generalization. We ﬁrst show this problem is NP-complete. Then we provide a dynamic programming algorithm to compute the optimal contract in O(n2m) time, where n, m are the number of agents and actions, under the assumption that the agents cost functions obey increasing difference property. At last, we generalize the setting such that each agent can choose to directly produce a reward in [0, 1]. We provide an O(log n)-approximate algorithm for this generalization.

Introduction Principal-agent theory is a subﬁeld of mechanism design theory. The principal hires an agent to accomplish a task. The agent is able to take actions on behalf of the principal. Agent s different actions lead to different rewards the principal receives. Moral hazard occurs when the agent acts in his own interest which may be in conﬂict with the principal s interest. Therefore the principal designs an incentive contract with the agent to maximize the principal s utility subject to the agent s utility being maximized. The contract is a transfer function from the principal to the agent which could depend on the outcome which is affected by the agent s action. Many economic interactions ﬁt in the principal-agent model. For example, a ﬁrm (principal) hires a salesman (agent) to sell products. The salesman invests effort on selling products. More efforts he invests, more products will be sold. To incentivize salesman invest more efforts, the ﬁrm can set a bonus depending on the amount of products a salesman has sold. A salesman wants to maximize his utility which is

This work is supported by Science and Technology Innovation 2030 New Generation Artiﬁcial Intelligence Major Project No. (2018AAA0100904), Beijing Academy of Artiﬁcial Intelligence (BAAI), the Shanghai Sailing Program (Grant No. 18YF1407900), and the National NSFC (Grant No. 61806121). Copyright c 2020, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

deﬁned to be his bonus minus his efforts. The ﬁrm s utility is the revenue generated from selling products minus the bonus paid to salesmen. The central question in this research ﬁeld asks: What is the principal s optimal contract? Due to the wide application, principal-agent model has been extensively studied (Holmstrom and Milgrom 1991; 1987; Chen et al. 2019; Grossman and Hart 1992; Bolton, Dewatripont, and others 2005). The agent takes a hidden action like effort which cannot be observed by the principal directly. The principal can only observe the outcome of this action and the contract is designed to depend on the outcome only. Most works focus on the problem with one principal and one agent (Armstrong and Vickers 2010; Kleinberg and Kleinberg 2018). When the agent takes different actions, there is a different distribution over principal s reward. Given the distribution information, the optimal contract can be computed efﬁciently through linear programs. In this paper, we consider the problem when there is one principal and multiple heterogeneous agents. These agents could be good at different tasks and we do not assume any relationship between the cost for different tasks among different agents. For sake of the fairness, we do not allow the principal design personal contracts for different agents. Instead, the principal has to design a common contract that applies to every agents. We assume the mapping from the action played to the outcome is deterministic. So the principal knows every agent s action by observing her outcome. The difﬁculty in our model stems from the multiple agents. Since the principal can only use a common contract, he needs to balance the incentivization for every agent.

Our Contribution Our contribution can be summarized as follows.

1. We ﬁrst show that the optimal contract problem with heterogeneous agents is strongly NP-complete.

2. We then propose an O(n2m) dynamic programming algorithm, where n is the number of agents and m is the number of actions, to compute an optimal contract under the assumption that the agents costs obey increasing differences.

3. Next, we generalize the discrete-action setting such that each agent can choose to directly produce a reward in [0, 1]. We show that this generalization is harder than the

original discrete-action version, and provide an O(log n)- approximate algorithm for this generalization.

Other Related Works

Other works also consider multiple agents but in different angles (Babaioff, Feldman, and Nisan 2006a; 2006b; Babaioff et al. 2012; Bernstein and Winter 2012; Babaioff and Winter 2014; Winter 2004). They assume the union of agents actions together determines the outcome. The contract is personalized and speciﬁes the payment in every possible outcome. Therefore the payment to an agent depends on both his action and other agents actions. In contrast, in our paper, the payment to an agent only depends on his own action. In their setting, some works consider that each agent only has a binary action space (Babaioff, Feldman, and Nisan 2006a; Babaioff et al. 2012; Winter 2004), and some works consider the tradeoff between simplicity of the contract and the performance of it (Babaioff and Winter 2014). Bernstein and Winter consider a participation game where each agent has two options representing participating or not (Bernstein and Winter 2012). Each agent s utility from participating is the payment given by the principal plus a value depending on other agents participation. The principal aims to incentivize all agents to participate in the game with minimal payments. Alon et al. study how to motivate multiple agents to take desirable actions using a common evaluation mechanism (Alon et al. 2020). They solve multiple problems in different settings. The main difference between their model and ours is the designer s payoff. The mechanism designer in their model cares about the number of agents who have been motivated in admissible ways while the principal in our model is interested in maximizing the reward generated by agents minus the payment paid to agents. Lavi and Shamash study the model with multiple principals and multiple agents (Lavi and Shamash 2019). Agents do not have cost on actions. This model focuses on the competition between principals. Mc Afee and Mc Millan study another totally different problem where multiple agents compete for a principal s contract (Mc Afee and Mc Millan 1986). A recent work of Azizan et al. studies a model that is almost the same as ours where each agent can choose to directly produce a real number reward (Azizan et al. 2019). However, they assume the designed payment function can be parameterized by a vector in a given set A Rd, and their algorithm explores the whole set A, which is not that efﬁcient.

Problem Description

In this paper, we study the Multiple Agents Contract Problem. There is a principal, n agents and m actions. Each agent can take an action j [m] and produces a reward ρj 0 for the principal. The reward only depends on the action, not on the agent. Each agent i also has a cost ci,j 0 to take an action j. This cost depends on both the agent and the action. Besides the m actions, there is always a zero action with reward 0 such that the cost for each agent to take this action is 0. This action means it is free for each agent to choose to produce nothing. The principal speciﬁes a payment proﬁle (t1, t2, . . . , tm): each agent taking action j will earn a

payment tj. The utility for agent i to take action j is tj ci,j. The agents are self-interested meaning each agent will take an action that maximizes its utility. W.l.o.g., we assume the agents tie-break in favor of the principal. The payoff of the principal is the sum of rewards produced by the agents minus the payments given to the agents, i.e., if agent i takes action i , the payoff of the principal is n i=1(ρi ti ). Our goal is to design the payment proﬁle (t1, t2, . . . , tm) to maximize the payoff of the principal.

Example 1. Suppose there are two agents and two actions. The rewards for the two actions are 8 and 10 respectively. For action 1, agent 1 has a cost 5 and agent 2 has a cost 4. For action 2, agent 1 has a cost 9 and agent 2 has a cost 2. Without agent 2, we can set the payments for the two actions to 5 and 0 respectively, which brings a payoff of 3 to the principal. Without agent 1, we can set the payments for the two actions to 0 and 2 respectively, which brings a payoff of 8 to the principal. However, when the two agents both exist, no matter how we set the payments, the payoff of the principal cannot achieve 3 + 8 = 11. It is optimal to the payments for the two actions to 5 and 3 respectively, which brings a payoff of 10 to the principal.

Hardness The problem deﬁned in the previous section is very hard. To see its hardness, let us consider its decision version, i.e., the problem of determining whether there is a payment proﬁle (t1, t2, . . . , tm) such that the payoff of the principal is no less than a given number r. For convenience, we call this decision problem MAC. We will show in the following theorem that MAC is strongly NP-complete.

Theorem 1. MAC is strongly NP-complete.

Proof. MAC obviously belongs to NP. In the following proof, we reduce the well-known NP-complete problem Not-All Equal 3-Satisﬁability (NAE3SAT) to MAC to show that MAC is strongly NP-complete. Given an instance of NAE3SAT with n variables and m clauses (we assume the variables in one clause are different without loss of generality), we build an instance of MAC as follows. For any variable x in an instance of NAE3SAT, we deﬁne x0 as its negation and deﬁne x1 = x.

Agents For each variable xi, we have an agent Ai. For each literal xb i and each clause cj, we have an agent T b i,j. For each clause cj, we have 6 agents Vj,1, Vj,2, . . . , Vj,6. Actions We have a zero action zero with reward 0. For each literal xb i, we have an action variableb i with reward ρ1. For each clause cj, we have 6 actions clausej,1, . . . , clausej,6 with reward ρ2. Costs For the zero action zero, each agent has a cost 0.

For action variableb i, agent Ai has a cost δ, and T b i,j has a cost 0 for each j. For action clausej,k, agent Vj,k has a cost 0.

For action clausej,k where cj = xb1 i1 xb2 i2 xb3 i3 , the costs vary for different k s and are summarized in Table 1. Note there are exactly 3 agents with cost 1 to take this action. We call the three agents the associated agents of this action. For each action and agent, if we do not mention the cost above, the cost is greater than the reward of the action.

The parameters ρ1, ρ2, δ satisfy the following constraints.1

ρ1 δ > m((2n 3)(ρ2 ρ1) + nδ + 4), (1) δ > 3(ρ2 ρ1 1), (2) ρ2 ρ1 > 2. (3)

Figure 1 shows an example instance of MAC corresponding to an instance of NAE3SAT with 4 variables and 2 clauses x1 x2 x3 and x1 x2 x4. Then we ask whether we can set the payments to the agents so that the optimal payoff of the principal is no less than

n(ρ1 δ)+m(6ρ2 1)+m(n(ρ1 δ)+(n 3)ρ1+3(ρ2 1)). (4) In an optimal solution, the payment for an action will not exceed the reward, so an agent will never be incentivized to take an action whose cost is greater than the reward. The actions that an agent will be potentially incentivized to take in an optimal solution is summarized as follows (since all agents can take the zero action, we omit it in the following list).

Agent Ai will potentially take action variable0 i or variable1 i . Agent T b i,j will potentially take action variableb i or, if variable xi appears in clause cj, clausej,k for some k. Agent Vj,k will potentially take action clausej,k.

Suppose the instance of NAE3SAT has a valid solution, then we set the payments in the instance of MAC as follows.

For each action variableb i, if the value of xb i is True, we set the payment to 0; otherwise we set the payment to δ. For each action clausej,k with associated agents T h1 i1,j, T h2 i2,j, T h3 i3,j, we set the payment to 1 if the values of xh1 i1 , xh2 i2 , xh3 i3 are all True; otherwise we set the payment to 0.

Under these payments, agent Vj,k will always take action clausej,k. If the value of xi is True, agent Ai will take action variable1 i ; otherwise she will take action variable0 i . For agent T b i,j, if the value of xb i is True and variable xi appears in clause cj, she will take action clausej,k for some k; otherwise she will take action

1For example, we can set δ = 7, ρ1 = 13mn + 8 and ρ2 = 13mn + 11.

variableb i. Hence the total payoff of the principal is exactly (4). Now suppose there exists a payment setting such that the optimal payoff of the principal is no less than (4). We ﬁrst show that agent Ai will take one of the actions variable0 i and variable1 i . Otherwise, the payoff of the principal cannot exceed (n 1)(ρ1 δ) + 6mρ2 + 2mnρ2, which is less than (4) by (1). We deﬁne bi such that Ai takes action variablebi i , then the payment for action variablebi i must be no less than δ the cost for agent Ai to take this action. If the payment is greater than δ, we can adjust it to δ. After this adjustment, some agent T bi i,j that takes action variablebi i before the adjustment may turn out to take action clausej,k for some k. This is the only possible cause of payoff loss of the principal. Suppose the payments for variablebi i (before the adjustment) and clausej,k are t1, t2 respectively, since agent T bi i,j chooses to take action variablebi i before the adjustment, we have t1 t2 1, so ρ2 t2 ρ2 t1 1 > ρ1 t1 by (3). This means the adjustment does not reduce the payoff of the principal, hence we can assume the payment for action variablebi i is exactly δ. By an analogous argument, we can also assume the payment for action variable1 bi i is exactly 0. If there exist some i, j such that agent T bi i,j does not take action variablebi i , it must take action clausej,k for some k, and the payment t for action clausej,k must incentivize agent T bi i,j to take action clausej,k, i.e., it must satisfy

Then we adjust the payment for action clausej,k to 1 so that agent T bi i,j is incentivized to take action variablebi i . After this adjustment, at most three agents that take action clausej,k before the adjustment deviate to take actions of the form variablek . Each of these agent brings a payoff of ρ2 t to the principal before the adjustment, and brings a payoff of at least ρ1 δ after the adjustment, so the adjustment reduces the payoff of the principal by at most 3(ρ2 t ρ1 + δ). On the other hand, the payoff of the principal increases by t 1 due to the contribution of agent Vj,k. As a result, since t 1 3(ρ2 t ρ1 +δ) due to (2) and (5), this adjustment does not reduce the payoff of the principal. Hence, we can assume for any i, j, agent T bi i,j takes action variablebi i . Suppose there exists some j such that the payment for action clausej,k is less than 1 for each k. Suppose clause cj contains three variables xi1, xi2, xi3, and action clausej,k0 is an action that the cost for agent T 1 bi1 i1,j to take is 1 (there may exist multiple such k0 s, and we arbitrarily choose one). We then adjust the payment for clausej,k0 to 1. This adjustment attracts T 1 bi1 i1,j to take action clausej,k0, which increases the payoff of the principal by ρ2 1 ρ1. On the other hand, the payoff of the principal contributed by Vj,k0 is decreased by at most 1, which is the only cause that reduces the payoff of the principal. As a result, since ρ2 1 ρ1 > 1, the payoff of the principal increases. Hence, we can assume

Table 1: Cost Table T b1 i1 T b2 i2 T b3 i3 T 1 b1 i1 T 1 b2 i2 T 1 b3 i3 clausej,1 1 1 - - - 1 clausej,2 1 - 1 - 1 - clausej,3 - 1 1 1 - - clausej,4 - - 1 1 1 - clausej,5 - 1 - 1 - 1 clausej,6 1 - - - 1 1

0 variable0 1

A1 T 0 1,1, T 0 1,2

variable1 1

A1 T 1 1,1, T 1 1,2

0 variable0 2

A2 T 0 2,1, T 0 2,2

variable1 2

A2 T 1 2,1, T 1 2,2

variable0 3

A3 T 0 3,1, T 0 3,2

0 variable1 3

A3 T 1 3,1, T 1 3,2

variable0 4

A4 T 0 4,1, T 0 4,2

0 variable1 4

A4 T 1 4,1, T 1 4,2

1 0clause1,1

1 0clause1,2

1 0clause1,3

1 0clause1,4

1 0clause1,5

1 0clause1,6

1 0clause2,1

1 0clause2,2

1 0clause2,3

1 0clause2,4

1 0clause2,5

1 0clause2,6

T 1 1,1, T 1 2,1, T 1 3,1 V1,1 T 1 1,1, T 0 2,1, T 0 3,1 V1,2 T 0 1,1, T 1 2,1, T 0 3,1 V1,3

T 0 1,1, T 0 2,1, T 0 3,1 V1,4 T 0 1,1, T 1 2,1, T 1 3,1 V1,5 T 1 1,1, T 0 2,1, T 1 3,1 V1,6

T 1 1,2, T 0 2,2, T 0 4,2 V2,1 T 1 1,2, T 1 2,2, T 1 4,2 V2,2 T 0 1,2, T 0 2,2, T 1 4,2 V2,3

T 0 1,2, T 1 2,2, T 1 4,2 V2,4 T 0 1,2, T 0 2,2, T 0 4,2 V2,5 T 1 1,2, T 1 2,2, T 0 4,2 V2,6

Figure 1: An example instance of MAC corresponding to an instance of NAE3SAT with 4 variables and 2 clauses x1 x2 x3 and x1 x2 x4: each rectangular represents an action; each line (including the bottom line) in a rectangular represents one or multiple agents, whose names are recorded to the right of the line; the number to the left of a line represents the cost for the agents to take this action; in particular, the number to the left of the top line of a rectangular represents the reward for the action.

for any j, there exists at least one k such that the payment for action clausej,k is no less than 1. Under the assumptions above, the maximum payoff of the principal is exactly (4). To achieve this optimal payoff, for any j, say cj = xh1 i1 xh2 i2 xh3 i3 , there exists exactly one k such that the payment for clausej,k is 1, and its three associated agents T 1 bi1 i1,j , T 1 bi2 i2,j , T 1 bi3 i3,j take this action. According to Table 1, (1 bi1) h1, (1 bi2) h2, (1 bi3) h3 do not have the same value. So we can set variable xi to the value 1 bi (0 represents False and 1 represents True), then all clauses are satisﬁed.

Increasing Differences In this section, we consider the case where agents have different abilities. Roughly speaking, the agents can be ordered from weak to strong, i1, i2, . . . , in, in the sense that it takes less cost for a stronger agent to produce a certain amount reward. We have for each j [m],

ci1,j > ci2,j > > cin,j, (6)

Additionally, we assume the costs obey increasing differences. Deﬁnition 1. Given an instance of the Multiple Agents Contract Problem, we call the costs obey increasing differences if there exists a permutation j1, j2, . . . , jm of 1, 2, . . . , m and a permutation i1, i2, . . . , in of 1, 2, . . . , n such that for any pair of (k, k ) such that k < k , 0 < cik,j1 cik ,j1 < cik,j2 cik ,j2 < < cik,jm cik ,jm. Though MAC is proved to be hard, we give a dynamic programming algorithm to solve the Multiple Agents Contract Problem under the assumption that the costs obey increasing differences. The permutation i1, i2, . . . , in can be found in O(n log n) time by sorting c1,j, c2,j, . . . , cn,j for an arbitrary j, then the permutation j1, j2, . . . , jm can be found in O(m log m) time by sorting ci2,1 ci1,1, ci2,2 ci1,2, . . . , ci2,m ci1,m. For convenience, we assume the actions and agents are already ordered without loss of generality, i.e., ik = jk = k for each k. The zero action is also considered action 0. Before describing the algorithm, we ﬁrst show the following lemmas. Lemma 1. If the costs obey increasing differences, then for any payment proﬁle (t1, t2, . . . , tm), if agent i and i take actions j and j respectively, then i < i j j .

Proof. Suppose i < i but j > j . Since agent i prefers action j to j , we have

tj ci,j tj ci,j . (7)

Similarly, since agent i prefers action j to j, we have

tj ci ,j tj ci ,j. (8)

By combining (7) and (8), we have

ci,j ci ,j ci,j ci ,j . (9)

However, since i < i and j > j , by increasing differences we have ci,j ci ,j > ci,j ci ,j , which contradicts to (9). Therefore, we must have i < i j j .

Lemma 2. Given any 0 j1 jm m, we have

1. Under the constraint that agent i is incentivized to take action ji, the optimal payoff of the principal cannot exceed

i=1 (ρji ci,ji (n i) (ci,ji ci+1,ji))+ρjn cn,jn.

2. If the costs obey increasing differences, and we set the payment proﬁle (t1, t2, . . . , tm) such that

i 1 i =1 ci ,ji ci +1,ji + ci,ji, if there exists i such that j = ji2, 0, otherwise, (11) then the payoff of the principal is no less than (10).

Proof. Since agent i prefers action ji to ji 1, we have

tji ci ,ji tji 1 ci ,ji 1, (12)

and for i = 1 we have tj1 ci,j1 0 since agent 1 prefers action j1 to the zero action. By summing up (12) for i = 1, 2, . . . , i, we have

ci ,ji ci +1,ji + ci,ji.

ci ,ji ci +1,ji + ci,ji

i=1 (ci,ji + (n i) (ci,ji ci+1,ji)) + cn,jn,

so the payoff of the principal cannot exceed (10). On the other hand, suppose the costs obey increasing differences and we set tj according to (11). For any agent i and any action j, there are three cases.

1. If there does not exist some k such that j = jk, then

ci ,ji ci +1,ji (13)

= tji ci,ji,

where the inequality (13) holds due to (6).

2If there exist multiple such i s, we arbitrarily choose one, because if, for example, j = jk = jk +1 = = jk, then the value of i 1 i =1 ci ,ji ci +1,ji + ci,ji is the same for i = k , k + 1, . . . , k.

2. If there exists some k i such that j = jk, we have

ci ,ji ci +1,ji + ck,jk ci,jk

ci ,ji ci +1,ji +

i =k (ci ,jk ci +1,jk)

ci ,ji ci +1,ji +

ci ,ji ci +1,ji

ci ,ji ci +1,ji

= tji ci,ji,

where the inequality (14) holds due to increasing differences: for any i k, ci ,jk ci +1,jk ci ,ji ci +1,ji .

3. If there exists some k > i such that j = jk, we have

ci ,ji ci +1,ji (ci,jk ck,jk)

ci ,ji ci +1,ji

ci ,ji ci +1,ji

i =i (ci ,jk ci +1,jk)

ci ,ji ci +1,ji (15)

= tji ci,ji,

where the inequality (15) holds due to increasing differences: for any i < k, ci ,ji ci +1,ji ci ,jk ci +1,jk.

Anyway, we have tj ci,j tji ci,ji, which means taking action ji maximizes agent i s utility. Note if agent i takes action ji for each i, the payoff of the principal is exactly (10). Recall that the agents tie-break in favor of the principal, so the payoff of the principal is no less than (10).

Lemma 1 and 2 show that we can ﬁnd 0 j1 j2 jm m that maximizes (10), then an optimal payment proﬁle is given by (11). To ﬁnd the optimal j1, j2, . . . , jm, we use a dynamic programming algorithm. For convenience, we deﬁne φ(i, j) = ρj ci,j (n i) (ci,j ci+1,j) for i = 1, 2, . . . , n 1, and deﬁne φ(n, j) = ρj cn,j. We deﬁne the subproblem OPT(i, j) = max0 j1 j2 ji j i i =1 φ(i , ji ). We can see the optimal value of (10) is OPT(n, m), and we have the recursion

OPT(i, j + 1) =

OPT(k, j) +

i =k+1 φ(i , j + 1)

with OPT(i, 0) = 0 for each i. Hence, the optimal value of (10), as well as the optimal j1, j2, . . . , jm, can be computed in O(n2m) time. We conclude the result above as the following theorem.

Theorem 2. If the costs obey increasing differences, there is an O(n2m) algorithm solving the Multiple Agents Contract Problem.

Real Number Actions

In previous sections, we considered the Multiple Agents Contract Problem with discrete actions (DA). A natural generalization is to consider the problem where each agent can choose to produce an arbitrary reward in [0, 1]. We call this generalization Multiple Agents Contract Problem with Real Number Actions (RNA), and formalize it as follows.

There is a principal and n agents. Each agent chooses to produce a reward x [0, 1] for the principal. To take such an action, each agent has a cost which may differ from each other. We deﬁne ci(x) 0 as the cost for agent i to produce a reward x. We assume without loss of generality that ci(0) = 0 for all i, which means it is free for each agent to choose to produce nothing. To incentivize these agents to produce rewards, the principal speciﬁes a payment function t(x): each agent taking this action will earn a payment t(x). The utility for agent i to produce x is t(x) ci(x). Agents are self-interested, meaning each agent will produce a reward that maximizes her utility. We assume agents tie-break in favor of the principal. The payoff of the principal is the sum of the rewards produced by these agents minus the payments given to the agents, i.e., if agent i produces a reward xi, the payoff of the principal is n i=1(xi t(xi)). Our goal is to design the payment function to maximize the payoff of the principal.

Note in this paper, the functions t and ci s are not necessarily continuous. To guarantee every agent has an optimal action we only concern the payment function t where for all i, t(x) ci(x) and x t(x) (in case of tie-breaking) are able to attain their maximums on [0, 1].

We ﬁrst show that this generalization is harder than our original problem by a reduction from DA to RNA. Given an instance of DA, we can construct an instance of RNA by

0, if x = 0, ci,1+M ρm+m M , if 0 < x ρ1+M ρm+m M , ... ci,j+j M ρm+m M , if ρj 1+(j 1)M

ρm+m M < x ρj+j M ρm+m M , ... ci,m+m M

ρm+m M , if ρm 1+(m 1)M

ρm+m M < x 1, (16) for each i, where M is a large enough number3. We will show how to construct an optimal payment proﬁle of the DA instance from an optimal payment function of the RNA instance. For convenience, we deﬁne zj = (ρj +j M)/(ρm + m M) and z0 = 0. Given an optimal payment function t(x) of the RNA instance, suppose agent i chooses to produce xi and deﬁne ji such that zji 1 < xi zji (if xi = 0, then ji = 0). Now consider a ﬁxed i. If xi < zji, we adjust the value of t(x) at x = zji to t(xi). Before this adjustment, agent i produces xi, and after this adjustment, agent i has the same utility to produce zji as to produce xi, so agent i will produce zji after the adjustment (recall the agent tie-breaks in favor of the principal), which increases the payoff of the principal. On the other hand, for any other agent i , t(xi) ci (zji) t(xi) ci (xi) (since the cost function is weakly increasing), which means the utility of producing zji after the adjustment does not exceed that of producing xi. Hence, for any agent except i, changing her produced value to zji due to the adjustment does not decrease the payoff of the principal (recall again that the agents tie-break in favor of the principal). As a result, the payoff of the principal is increased by this adjustment, which contradicts to the fact that t(x) is optimal. Therefore, we can assume xi = zji. The payoff of the principal under the payment function t(x) in the RNA instance is p RNA = n i=1(zji t(zji)). Now we construct a payment proﬁle (t1, t2, . . . , tm) of the DA instance where

tj = t(zj)(ρm + m M) j M. (17)

Under this payment proﬁle, for each agent i and each j, the utility of agent i to take action j is tj ci,j, which is exactly (ρm + m M) times the utility of agent i to produce zj under the payment function t(x) in the RNA instance. Also, agent i brings a payoff of ρj tj to the principal by taking action j, which is exactly (ρm + m M) times the payoff of the principal brought by agent i by producing zj under the payment function t(x) in the RNA instance. Since agent i produces zji under payment function t(x) in the RNA instance, she will take action ji under payment proﬁle (t1, t2, . . . , tm) in the DA instance. The payoff of the principal under the payment proﬁle (t1, t2, . . . , tm) in the DA instance is p DA = n i=1(ρji tji) = (ρm + m M)p RNA. To show the payment proﬁle (t1, t2, . . . , tm) is optimal, we compare it to another arbitrary payment proﬁle (t 1, t 2, . . . , t m). Suppose agent i takes action j i under the

3It is sufﬁcient to choose M = maxi,j{ci,j, ρj} + 1.

payment proﬁle (t 1, t 2, . . . , t m), then the payoff of the principal under the payment proﬁle (t 1, t 2, . . . , t m) in the DA instance is p DA = n i=1(ρji t ji). Let

0, if x = 0, t 1+M ρm+m M , if 0 < x z1, ... t j+j M ρm+m M , if zj 1 < x zj, ... t m+m M ρm+m M , if zm 1 < x zm = 1.

We can see for all i, x t (x) and t (x) ci(x) are able to attain their maximum on [0, 1], so t (x) is a valid payment function. Under this payment function, agent i has the same utility for producing a reward on (zj, zj+1], thus she will produce zj for some j in favor of the principal. Observe, again, that under the payment proﬁle (t 1, t 2, . . . , t m), for each agent i and each j, the utility of agent i to take action j is t j ci,j, which is exactly (ρm + m M) times the utility of agent i to produce zj under the payment function t (x) in the RNA instance. Also, agent i brings a payoff of ρj t j to the principal by taking action j, which is exactly (ρm + m M) times the payoff of the principal brought by agent i by producing zj under the payment function t (x) in the RNA instance. Hence, agent i will produce zj i under the payment function t (x) in the RNA instance. The payoff of the principal under the payment function t (x) in the RNA instance is p RNA = n i=1(zji t (zji)) = p DA/(ρm + m M). Hence, p DA = (ρm+m M)p RNA (ρm+m M)p RNA = p DA, which means (t1, t2, . . . , tm) is indeed an optimal payoff proﬁle of the DA instance.

An Approximate Contract Knowing the RNA problem is hard, we are going to design an approximate contract. We assume for all i, x ci(x) is able to attain its maximum on [0, 1]. Let xi arg maxx [0,1](x ci(x)) (if there are multiple x is achieving the maximum value, we arbitrarily choose one), yi = maxx [0,1](x ci(x)), we have immediately

yi = xi ci(xi) xi. (18)

ti(x) = 0, if 0 x yi, x yi, if yi < x 1.

We assume without loss of generality that y1 y2 yn. We ﬁrst show that ti(x) is a valid payment function, i.e. for all i , x ti(x) and ti(x) ci (x) are able to attain their maximum on [0, 1]. The former is trivial. For ti(x) ci (x), if 0 x yi, then ti(x) ci (x) = ci (x) 0; if yi < x 1, then ti(x) ci (x) = x yi ci (x) yi yi, so ti(x) ci (x) max{0, yi yi}. In addition, ti(0) ci (0) = 0 and ti(xi ) ci (xi ) xi yi ci (xi ) = yi yi. This means the maximum value of ti(x) ci (x)

is max{0, yi yi}, and is achievable at x = 0 or x = xi . Hence, ti(x) is indeed a valid payment function. Note the argument above also shows that for any i i, ti(x) ci (x) attains its maximum at x = xi . By (18), we have xi yi yi, so if agent i chooses to produce xi , she brings a payoff of xi ti(xi ) = xi (xi yi) = yi to the principal. Recall that the agents tie-break in favor of the principal, agent i brings a payoff of at least yi to the principal. Hence, under the payment function ti(x), the payoff of the principal is at least (n i + 1)yi. Let i arg maxi(n i + 1)yi, then we have for all i,

yi (n i + 1)yi

On the other hand, let OPT denote the optimal payoff of the principal. Since agent i brings a payoff of at most maxx [0,1](x ci(x)) = yi to the principal, we have OPT n i=1 yi. Hence,

i=1 yi (n i + 1)yi

This means the payment function ti (x) is an n i=1(1/(n i + 1))-approximate solution, i.e. an O(log n)-approximate solution. In conclusion, we have the following algorithm.

1. For any i, ﬁnd yi = maxx [0,1](x ci(x)) and sort them such that y1 y2 yn.

2. Let i arg max1 i n(n i + 1)yi .

3. Output the payment function

t(x) = 0, if 0 x yi , x yi , if yi < x 1.

Note this algorithm can also be applied to our Multiple Agents Contract Problem with discrete actions. We ﬁrst construct an instance of the problem with real number actions using (16), obtain a payment function using the algorithm above, then we can get a payment proﬁle of the problem with discrete actions using (17). This payment proﬁle is still an O(log n)-approximate solution.

References Alon, T.; Dobson, M. R. C.; Procaccia, A.; Talgam-Cohen, I.; and Tucker-Foltz, J. 2020. Multiagent evaluation mechanisms. In Proceedings of the AAAI Conference on Artiﬁcial Intelligence, forthcoming. Armstrong, M., and Vickers, J. 2010. A model of delegated project choice. Econometrica 78(1):213 244. Azizan, N.; Su, Y.; Dvijotham, K.; and Wierman, A. 2019. Optimal pricing in markets with non-convex costs. In Proceedings of the 2019 ACM Conference on Economics and Computation, 595 595. ACM. Babaioff, M., and Winter, E. 2014. Contract complexity. In EC, 911.

Babaioff, M.; Feldman, M.; Nisan, N.; and Winter, E. 2012. Combinatorial agency. Journal of Economic Theory 147(3):999 1034. Babaioff, M.; Feldman, M.; and Nisan, N. 2006a. Combinatorial agency. In Proceedings of the 7th ACM conference on Electronic commerce, 18 28. ACM. Babaioff, M.; Feldman, M.; and Nisan, N. 2006b. Mixed strategies in combinatorial agency. In International Workshop on Internet and Network Economics, 353 364. Springer. Bernstein, S., and Winter, E. 2012. Contracting with heterogeneous externalities. American Economic Journal: Microeconomics 4(2):50 76. Bolton, P.; Dewatripont, M.; et al. 2005. Contract theory. MIT press. Chen, M.; Tang, P.; Wang, Z.; Xiao, S.; and Yang, X. 2019. Optimal mechanisms with budget for user generated contents. ar Xiv preprint ar Xiv:1907.04740. Grossman, S. J., and Hart, O. D. 1992. An analysis of the principal-agent problem. In Foundations of Insurance Economics. Springer. 302 340. Holmstrom, B., and Milgrom, P. 1987. Aggregation and linearity in the provision of intertemporal incentives. Econometrica: Journal of the Econometric Society 303 328. Holmstrom, B., and Milgrom, P. 1991. Multitask principalagent analyses: Incentive contracts, asset ownership, and job design. JL Econ. & Org. 7:24. Kleinberg, J., and Kleinberg, R. 2018. Delegated search approximates efﬁcient search. In Proceedings of the 2018 ACM Conference on Economics and Computation, 287 302. ACM. Lavi, R., and Shamash, E. 2019. Principal-agent vcg contracts. In Proceedings of the 2019 ACM Conference on Economics and Computation, 783 783. ACM. Mc Afee, R. P., and Mc Millan, J. 1986. Bidding for contracts: a principal-agent analysis. The RAND Journal of Economics 326 338. Winter, E. 2004. Incentives and discrimination. American Economic Review 94(3):764 773.