# game_implementation_what_are_the_obstructions__441c869d.pdf

Game Implementation: What Are the Obstructions?

Jiehua Chen, Seyedeh Negar Layegh Khavidaki, Sebastian Vincent Haydn, Sofia Simola, Manuel Sorge

TU Wien, Austria {jiehua.chen, seyedehngar.khavidaki, sofia.simola, manuel.sorge}@tuwien.ac.at, e51832021@student.tuwien.ac.at

In many applications, we want to influence the decisions of independent agents by designing incentives for their actions. We revisit a fundamental problem in this area, called GAME IMPLEMENTATION: Given a game in standard form and a set of desired strategies, can we design a set of payment promises such that if the players take the payment promises into account, then all undominated strategies are desired? Furthermore, we aim to minimize the cost, that is, the worst-case amount of payments. We study the tractability of computing such payment promises and determine more closely what obstructions we may have to overcome in doing so. We show that GAME IMPLEMENTATION is NP-hard even for two players, solving in particular a long-standing open question and suggesting more restrictions are necessary to obtain tractability results. We thus study the regime in which players have only a small constant number of strategies and obtain the following. First, this case remains NP-hard even if each player s utility depends only on three others. Second, we repair a flawed efficient algorithm for the case of both small number of strategies and small number of players. Among further results, we characterize sets of desired strategies that can be implemented at zero cost as a generalization of Nash equilibria.

1 Introduction Nudge theory (Thaler and Sunstein 2008), gamification (Hamari 2019), and the design of blockchain systems (Buterin et al. 2020) are just a few areas in which we apply incentives in order to coax agents towards behaving in a desirable way. In these general settings, agents select strategies on their own volition, but we may add incentives (or incur penalties) that increase (resp. decrease) the salience or utility of particular strategies in situations of our choice. The goal is to implement a desired set of strategies or strategy profiles, that is, to ensure that undesired strategies entail smaller utility than desired ones. With the advent of blockchain systems, we feel that this topic has gained renewed relevance. First, the design of a blockchain system itself, such as Bitcoin or Ethereum, involves the design of a protocol that rewards intended behavior (e.g., validating transactions by mining blocks for

Copyright 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

block rewards in Bitcoin) or penalizes unintended behavior (e.g., by slashing the stake of validators that deviate from a consensus in the recent upgrade of Ethereum). The latter is a form of enforcing the existence of a Schelling point via incentives. Second, there are now base-layer systems like Ethereum in place that allow world-wide consistent general-purpose computations and thus the straightforward creation of new moneys (called tokens) that can be made to behave in new ways: generated, burned, exchanged, locked, etc. Thus an immense design space for incentivebased protocols was opened up and we witness its continued exploration. There are for example stablecoins, which are tokens that use incentive-based mechanisms to try and reflect the value of some underlying security (Maker DAI, Terra USD, FRAX Shares, and many more)1. Another direction are incentive-based consensus mechanisms for adjudication, moderation, and transferring real-world information onto blockchains (such as Kleros, UMA, Chainlink oracles, and again many more)2. In all of the above design problems, there are independent actors that we want to incentivize to behave in a certain desired way. A fundamental underlying problem herein is GAME IMPLEMENTATION (Monderer and Tennenholtz 2004), stated as follows: We are given a game in standard form (a set of players, strategies for each player, and their utility) and for each player a set of desired strategies. We want to specify a set of payment promises that define additional utilities that we give to players for playing certain strategies. These payment promises shall implement our desired sets of strategies, that is, when taking the payments into account, no player wants to play an undesired strategy. In technical terms, each strategy that is not dominated by any other strategy is desired (see Section 2 for the formal definitions).3 Furthermore, we want to minimize the cost

1See https://makerdao.com/en/whitepaper/, https: //terra.money/Terra White paper.pdf, and https://docs.frax. finance/overview. 2See https://kleros.gitbook.io/docs/, https://docs.umaproject. org/, and https://chain.link/whitepaper. 3We focus here only on pure strategies. Furthermore, GAME IMPLEMENTATION implements a set of strategy profiles rather than a set of strategies for each player. Implementing sets of strategies corresponds to implementing so-called rectangular strategy profiles, see the formal definitions in Section 2.

The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23)

of the implementation, that is, the amount paid in the worst case. More precisely, we want to minimize, over all strategy profiles that consist of undominated strategies, the sum of payment promises to all players. In this work, we explore the question How difficult is it to implement a desired set of strategies?

Contribution. We obtain the following results. We first show that GAME IMPLEMENTATION is NP-hard, even if there are only two players and even if our budget for the cost is 0 (Theorem 3.1). This strengthens two results by Deng, Tang, and Zheng (2016) who showed that GAME IMPLEMENTATION is NP-hard for six players, and that GAME IMPLEMENTATION is NP-hard for two players with mixed strategies, both with positive budgets.4 We note that hardness for mixed-strategies or positive budgets is less surprising because there are a priori more possibilities for encoding combinatorial structure into the solutions. Instead, our reduction shows that the difficulty lies already and mainly in selecting, for each undesired strategy x, a desired strategy that dominates x. We then study a variant of GAME IMPLEMENTATION that was supposedly more tractable (Monderer and Tennenholtz 2004), called EXACT GAME IMPLEMENTATION: In addition to requiring undominated strategies to be desired, no desired strategy may be dominated by another strategy. We show that also this a priori simpler-looking problem is NPhard even for two players (Theorem 5.1); this answers an open question by Eidenbenz et al. (2011). Indeed Monderer and Tennenholtz (2004) gave a polynomial-time algorithm for EXACT GAME IMPLEMENTATION which was shown to produce suboptimal results by Eidenbenz et al. (2011). The above hardness results do not apply in scenarios in which players have only a small constant number of strategies to choose from. We hence consider this regime next. If both the number of players and the number of strategies are small constants, then the only part of the input that may be of unbounded size are the quantities specified in the utility functions. Eidenbenz et al. (2011) showed that in this case EXACT GAME IMPLEMENTATION can be solved efficiently; however, as we observe here there is a flaw in the algorithm. We simplify the algorithm and repair the flaw for a large though not universal class of problem instances, providing the first nontrivial algorithm for implementing strategies that is formally proven to be correct (Theorem 6.2). As we increase the number of players, the size of the input (the number of utility values we have to specify) scales exponentially in the number of players (and strategies). A common way to deal with this explosion is to instead consider the relevant special case of graphical games (Kearns, Littman, and Singh 2001), where the players are situated in a graph and the utility of a player depends only on its neighbors. We hence study this case next, that is, GAME IMPLEMENTATION on graphical games with small constant number of strategies per player. We show that even the case where each player s utility depends only on

4Monderer and Tennenholtz (2004) claimed NP-hardness of GAME IMPLEMENTATION, but the proof was erroneous (Eidenbenz et al. 2011).

three others and each player has only two strategies remains NP-hard (Theorem 4.1). As the case where each player has only one strategy is trivial, a promising future direction is to consider the case where each player depends only on two others or tree-structured games. Finally, before discovering our NP-hardness reduction for GAME IMPLEMENTATION we believed that zero-cost implementation could be solved efficiently. As a tool towards this we characterized strategy sets that can be implemented at cost 0 as a form of generalized Nash equilibria. We believe that this characterization is of independent interest, in particular because it generalizes the result of Monderer and Tennenholtz (2004) that states that Nash equilibria can be implemented at cost 0. Moreover, it captures a fundamental property of self-enforcing sets of strategies, such as morality, which we are not aware of having been formally defined before.

Further related work. Implementation theory (Maskin 1999; Maskin and Sj ostr om 2002) generally studies the implementation of social-choice rules with incentives and it is impossible to give an overview over the large body of work here. Conitzer and Sandholm (2014) studied the complexity of implementing social-choice rules. The main difference to GAME IMPLEMENTATION is that the payment promises to the players that we may choose from are restricted and given in the input. Such restrictions give significantly more leeway for designing hardness reductions. Brill, Freeman, and Conitzer (2015) considered a problem related to GAME IMPLEMENTATION in which some of the utility values are missing and we are to complete the missing values, possibly with negative ones. The goal is to ensure that strategies participating in some Nash equilibrium are desired. In GAME IMPLEMENTATION we have much more freedom in designing our solutions. Wooldridge et al. (2013) studied implementation questions for Boolean games, that is, where the strategies of the players correspond to a selection of truth values of some variables intrinsic to the game. They aimed at implementing Boolean formulas on the variables in some or all Nash equilibria. Because of the relation to Boolean satisfiability, implementation for Boolean games is situated higher in the polynomial hierarchy. Zero-cost implementation has been studied for routing games by Moscibroda and Schmid (2009). They gave bounds on the difference between anarchistic equilibria and those achievable by zero-cost implementation. Finally, Letchford and Conitzer (2010); Deng and Conitzer (2017, 2018) considered the complexity of committing to certain behaviors as a way for one player to change the outcome of a game to his favor.

2 Preliminaries (Full) proofs for results marked by are deferred to the full version (Chen et al. 2022). Throughout, for t N we use [t] to denote the set {1, 2, . . . , t}. A game G is a tuple (N, X, U) where N is the set of players; usually N = [n]. We specify for each player i N a set Xi of strategies available to i. Then the set X equals X1 X2 . . . Xn. We call elements of X strategy profiles. Finally, U = {U1, U2, . . . , Un}, where each Ui is a

function X R, called utility function for player i. As a notational shorthand, for any i N we use X i to denote X1 X2 . . . Xi 1 Xi+1 . . . Xn. Let x = {x1, . . . , xn} be a strategy profile. Then we use x i to refer to x without its ith element, i.e., x i = (x1, . . . , xi 1, xi+1, . . . , xn). Similarly, we use xi to refer to its ith element, i.e., xi = xi. We sometimes write the value of a utility function Ui such that the strategy played by player i comes first in the argument of Ui and the remaining strategies second. For x above and player i N, we could write U(xi, x i). However, if there are only two players, then the first argument x of Ui(x, y) always refers to the strategy of player 1 and the second argument y always refers to the strategy of player 2. Let x, y Xi be two strategies of player i. We say that x dominates y if for each x i X i we have Ui(x, x i) Ui(y, x i) and there exists x i X i such that Ui(x, x i) > Ui(y, x i).5 We say that x Xi is undominated if no other strategy of player i dominates x. For each player i N we denote by X i the set of undominated strategies in Xi. For a game G, by X G we denote the set of strategy profiles that consist entirely of undominated strategies, that is, X G = X 1 X 2 . . . X n. We omit the index G if it is clear from the context. Let G = (N, X, U) be a game. We now define the modified game obtained from G after payments are promised to the players. A payment promise to player i in G is a function X R, usually denoted by Vi. A payment promise for game G is the set of payment promises of the players: V := {V1, V2, . . . , Vn}. The modified game G[V] obtained from G with a payment promise V is the game (N, X, [U + V]), wherein [U + V] := {[Ui + Vi] | i N} and for each i N the function [Ui + Vi] is defined as [Ui + Vi](x) := Ui(x) + Vi(x) for all x X. The cost, denoted as cost(V), of a payment promise V is maxx X G[V] P i N Vi(x). We consider the following decision problem.

GAME IMPLEMENTATION Instance: A game G = (N, X, U), a set of strategy profiles O X, and a real budget δ R 0. Question: Is there a payment promise V such that cost(V) δ and X G[V] O?

A payment promise V as above implements O. The following is a variant of GAME IMPLEMENTATION where we want a given set of strategy profiles to be undominated and say that a payment promise V as below implements O exactly.

EXACT GAME IMPLEMENTATION Instance: A game G = (N, X, U), a set of strategy profiles O X, and a real budget δ R 0. Question: Is there a payment promise V such that cost(V) δ and X G[V] = O?

5This notion of domination is commonly referred to as weak domination. Alternative notions of domination are also studied, such as strict domination in which we require Ui(x, x i) > Ui(y, x i) for all x i X i. In keeping with the literature on implementation (Eidenbenz et al. 2011; Deng, Tang, and Zheng 2016) we focus on weak domination; the results are usually transferable.

We give an example in the full version. We mainly focus on the special case where the strategy profile sets O are rectangular. A strategy profile set Y for a game with player set N is rectangular if for each i N there is Yi Xi such that Y = Y1 Y2 . . . Yn. For a given player i N, we use Y i to denote Y1 Yi 1 Yi+1 . . . Yn. All our hardness results indeed hold even for rectangular strategy profile sets O.

Graphical games. For a more succinct representation we also use the concept of a graphical game. This is a tuple (G, H), where G = (N, X, U) is a game and H an undirected graph with vertex set N and edge set E. Let us use NH(i) to denote the neighborhood of a vertex i N, i.e., the set of all the vertices that are adjacent to i. For every i N, the utility function Ui and the potential payment promise Vi map from Xj1 Xjk to R, where j1, . . . , jk is the canonical ordering of the vertices restricted to NH(i) {i}. In other words, the utility of player i only depends on its own actions and the actions of its neighbors in H. The degree of (G, H) is the maximum vertex degree.

Properties of the domination relation. We now describe a few simple properties of the dominance relation, which are useful in our proofs. Observation 2.1 ( ). Domination is transitive. Observation 2.2 ( ). Domination is asymmetric. In other words, if x dominates y, then y does not dominate x. Observation 2.3 ( ). Every dominated strategy is dominated by some undominated strategy. This immediately implies the following observation: Observation 2.4. The set of undominated strategies is nonempty.

3 GAME IMPLEMENTATION is NP-hard for Two Players and Zero Budget In this section we prove that GAME IMPLEMENTATION is NP-hard even in the very restricted case where we have two players and the budget is zero. Theorem 3.1. GAME IMPLEMENTATION is NP-hard, even for two players and zero budget. We reduce from the following NP-hard problem (Schaefer 1978): X3C Instance: A set of 3ˆn elements A and a collection of 3ˆn sets C} such that Cj A and |Cj| = 3 for every j {0, . . . , 3ˆn 1} and every element ai A appears in exactly three sets, i.e., |{C C | ai C}| = 3. Question: Is there an exact cover of A, i.e., a subcollection S C s.t. |S| = ˆn and A = S C S C?

Let I = (A = {a0, . . . , a3ˆn 1}, C = {C0, . . . , C3ˆn 1}) be an instance of X3C, where |A| = 3ˆn. We create an instance I of GAME IMPLEMENTATION with two players p1 and p2. Let X1 = X2 = A {cai j | Cj C, ai Cj} and O1 = O2 = {cai j | Cj C, ai Cj}. Throughout, we take

i + 1 and i 1 modulo 3ˆn. For each i [3ˆn] and each set Cj, Cp C with ai Cj, ai 1 Cp, we define the utilities as

U1(ai, cai j ) = 2, U1(ai, caz j ) = 1,

U1(cai j , cai j ) = 2, U1(cai j , caz j ) = 1 az Cj \ {ai},

U2(cai 1 p , ai) = 1, U2(cai 1 p , cai j ) = 1.

The undefined utilities are 0 and the budget δ is also 0. The utilities of p1 and p2 are shown in Section 3. Before continuing with the proof, let us explain some intuition. For each ai A we denote the sets containing ai as Ci,1, Ci,2, Ci,3. The utilities of value 2 for p1 and the zero budget enforce that for every ai A exactly one of the strategies cai i,1, cai i,2, cai i,3 can be undominated for p2: To dominate ai for p1 with, say, cai i,2, we must promise a positive amount for playing cai i,2 whenever p2 plays cai i,3 or cai i,1. Thus we must have that neither of those latter strategies is undominated for p2 in order to stay within the budget. The utilities of value 1 for p1 enforce consistency, i.e., if as A is covered by Cr, then every ai Cr must also be covered by Cr. If cas r is undominated for p2, then if we were to dominate ai Cr with cai i,2 where Ci,2 = Cr, we would have to pay p1 least 1 for the strategy cai i,2 when p2 plays cas r . But since these are both undominated strategies, we would exceed the budget. The utilities for p2 enforce that we cover every element ai A. Player p1 always tries to match the element p2 is playing, whereas p2 tries to be one ahead. This prevents the two players from picking some element az A and only playing the strategies related to that. Formally, we claim that I is a positive instance of X3C if and only if I is a positive instance of GAME IMPLEMENTATION. For the forward direction, assume that (A, C) admits an exact cover S. For each element ai A and two distinct sets Cj, Cp C with ai Cj Cp, where Cj S and for each element az Cj, we define V1(cai j , caz p ) = . For each element ai A and two distinct sets Cj, Cp C with ai Cj S, while ai 1 Cp but Cp / S , we define V2(cai 1 p , cai j ) = . To show that this is a valid implementation, we will show that X O and cost(V) = 0.

Claim 3.2 ( ). For each Cj S, each ai Cj and each Cℓ C \ {Cj} with ai Cℓ, we have that cai j dominates both ai and every cai l for both p1 and p2.

Proof of Claim 3.2. Observe that for p1, strategy ai dominates cai j for every Cj C where ai Cj in G. Since S is an exact cover, no Cl C \ {Cj} such that ai Cl is in S. Thus payment promise V1 is 0 when p1 plays cai l . Thus ai also dominates cai l in G[V]. Because dominance is transitive (Observation 2.1), it is enough to show that cai j dominates ai. To see that cai j dominates ai for p1, we show that the utility for playing cai j is always at least that of playing ai.

Case 1: p2 plays cai j . Then, [Ui + Vi](cai j , cai j ) = 2 = [Ui + Vi](ai, cai j ).

Case 2: p2 plays cai l for some Cl C \ {Cj} s.t. ai Cl. Then, since ai is covered by Cj, we know that Cl / S. Therefore, [U1 + V1](cai j , cai l ) = > 2 = [U1 + V1](ai, cai l ). Case 3: p2 plays caz j for some az Cj \ {ai}. Then, [U1 + V1](cai j , caz j ) = 1 = [U1 + V1](ai, caz j ). Case 4: p2 plays caz l for some az A\{ai}, Cl C\{Cj} where ai, az Cl. Then, since element ai is covered by Cj we know that Cl / S. Therefore [U1 + V1](cai j , caz l ) = > 1 = [U1 + V1](ai, caz l ). Case 5: p2 plays caz l for some az A \ {ai}, Cl C \ {Cj}, where ai / Cl, az Cl. Then, [U1 + V1](cai j , caz l ) = 0 = [U1 + V1](ai, caz l ). Case 6: p2 plays az A (ai, az not necessarily distinct). Then, [U1 + V1](cai j , az) = 0 = [U1 + V1](ai, az).

In all cases, the new utility of cai j is higher than or equal to the utility of ai, and in the Cases 2 and 4 the utility is strictly higher. Thus cai j dominates ai for p1. The analogous observation for p2 is proved in the full version.

Since the set of undominated strategies is non-empty (Observation 2.4), Claim 3.2 implies that for every i [2], X i {cai j | Cj S, ai Cj} as all other strategies are dominated. Therefore X {cai j | Cj S, ai Cj} {cai j | Cj S, ai Cj} O, as required. For every j [2], we have that Vj(s1, s2) > 0 only when s1 or s2 is in {cai j | Cj / S, ai Cj}. Since none of these strategies is in X j , we have that cost(V) = maxx X G[V] P i [2] Vi(x) = 0, as required. This concludes the forwards direction. For the backward direction, assume that we have a payment promise V such that cost(V) = 0 and X O in the modified game G[V].

Claim 3.3. Let Cj C, ai A. If cai j X 2, then in G[V] (i) cai j X 1, (ii) cai j dominates ai for p1, (iii) cai l / X 2 where Cl C \ {Cj} and ai Cl.

Proof of Claim 3.3. We first show (i). Since V implements O, by Observation 2.3 there is an undominated strategy that dominates ai for p1. For a strategy s X 1 to dominate ai for p1 we need that V1(s, cai j ) U1(ai, cai j ) U1(s, cai j ) = 2 U1(s, cai j ). Because s is undominated, (s, cai j ) X G[V]. By the definition of cost, we have that 0 cost(V) = maxx X G[V] P

i [2] Vi(x) V1(s, cai j ). By combining these, we obtain that U1(s, cai j ) 2. The only strategy for p1 that satisfies this condition is cai j . Since cai j dominates ai, (ii) follows directly. To prove (iii), assume that both cai j and cai l are undominated for p2. By (i) and (ii), strategy cai j dominates ai for p1 and cai j is undominated. Thus [U1 + V1](cai j , cai l ) U1(ai, cai l ) = 2. From U1(cai j , cai l ) = 0 it follows that V1(cai j , cai l ) 2. But since cai j is undominated for p1 and cai l for p2, (cai j , cai l ) X and thus cost(V) =

Player p2 ca0 0,1 ca0 0,2 ca0 0,3 . . . cai i,1 cai i,2 cai i,3 ... c ai i ,2 . . . c a3ˆn 1 3ˆn 1,3 ca0 0,1 2 ... ... .. . ca0 0,2 2 ... ... .. . ca0 0,3 2 ... ... .. .

... ... ... cai i,1 . .. 2 . .. 1 . . .

cai i,2 . .. 2 . .. . ..

cai i,3 . .. 2 . .. . ..

... ... ... c ai i ,2 . .. 1 . .. 2 . . .

... ... ... c a3ˆn 1 3ˆn 1,3 . .. . .. . .. 2

a0 2 2 2 . .. . .. . ..

... ... ... ai . .. 2 2 2 . .. 1 . . .

... ... ... a3ˆn 1 . .. . .. . .. 2

c ai 1 i 1,1 c ai 1 i 1,2 c ai 1 i 1,3 cai i,1 cai i,2 cai i,3

cai i,1 1 1 1

cai i,2 1 1 1

cai i,3 1 1 1

c ai+1 i+1,1 1 1 1

c ai+1 i+1,2 1 1 1

c ai+1 i+1,3 1 1 1

... ... ... ... ... ...

Table 1: Left: The partial utility matrix of p1 from the proof of Theorem 3.1. For each element aℓ A, let Cℓ,1, Cℓ,2, Cℓ,3 denote the sets containing it. For the sake of example, we assume that ai Ct such that Ci ,2 = Ci,1 = Ct. Columns corresponding to strategies ai, [3ˆn] for p2 are omitted: their values are 0. Right: The utility matrix of p2 from the proof of Theorem 3.1. Columns corresponding to strategies ai, [3ˆn] for p1 are omitted: their values are 0.

maxx X P i [2] Vi(x) V1(cai j , cai l ) = 2 which a contradiction to cost(V ) = 0.

Claim 3.4. Let Cj C, ai A. If cai j X 1, then cai+1 l X 2 for some Cl C such that ai+1 Cl.

Proof of Claim 3.4. Since V implements O, by Observation 2.3, there is an undominated strategy s X 2 that dominates ai+1 for p2. For s to dominate ai+1 we need that [U2 + V2](cai j , s) U2(cai j , ai+1) = 1. Since s is undominated, (cai j , s) X G[V]. Therefore 0 cost(V) = maxx X G[V] P

i [2] Vi(x) V2(cai j , s). Since we need [U2 + V2](cai j , s) 1, it follows that U2(cai j , s) 1. The only strategies for p2 that satisfy this condition are cai+1 l where Cl C such that ai+1 Cl.

We know from the definition of implementations that X 2 = . Therefore there are some Cj C, ai Cj such that cai j X 2. By Claim 3.3(i) we have that cai j X 1. By Claim 3.4 we have that cai+1 p X 2 for some Cp C such that ai+1 Cp. By repeating this argumentation 3ˆn times, we obtain that for every ai A, there is some Cj C such that ai Cj and cai j X 2. This shows that S := {Cj | Cj C, ai A, s.t. cai j X 2} is a cover of A, i.e., S C S C = A. To show that S is an exact cover, we must show that if Cj S and ai Cj then for every Cl C \ {Cj} such that ai Cl, we have that Cl / S.

Assume, towards a contradiction, that some ai A is covered twice, i.e., there are Cj, Cl S where ai Cj Cl. By Claim 3.3(iii) we cannot have both cai j X 2 and cai l X 2. Therefore, without loss of generality, assume that cai j X 2 and caz l X 2 for some az Cl \ {ai}. Then by Claim 3.3(ii) cai j dominates ai for p1. Moreover, [U1 + V1](cai j , caz l ) U1(ai, caz l ) = 1 because ai Cl. Because U1(cai j , caz l ) = 0, we have V1(cai j , caz l ) 1. By Claim 3.3(i) we have cai j X 1 and we assumed that caz l X 2. Thus cost(V) V1(cai j , caz l ) 1 > 0, a contradiction. Therefore each element ai A is covered exactly once, and S is an exact cover. This finishes the proof of Theorem 3.1.

4 GAME IMPLEMENTATION is NP-hard for Max. Degree Three and Two Strategies

In all earlier reductions, the number of strategies per player has been unbounded. In this section we show that in graphical games even bounding the number of strategies and the degree of players together does not help to lower the complexity.

Theorem 4.1 ( ). GAME IMPLEMENTATION on graphical games is NP-hard even for degree three and if each player has at most two strategies.

Proof. To show NP-hardness we reduce from X3C. Let (A = {a0, . . . , a3ˆn 1}, C = {C0, . . . , C3ˆn 1}) be an instance of X3C. We construct an instance (N, X, U, O, δ) of graphical GAME IMPLEMENTATION in the following way:

Let N := C A be the set of players, let Xp = {Tp, Fp} be the set of strategies for player p N. Construct the underlying graph H := (N, E), where E := {{ai, Cj} | Cj C, ai Cj}. It is easy to see that H has degree 3. Throughout this proof, for an element ai A, let us denote the sets that include it as C1 i , C2 i , C3 i in the order of increasing indices in C. When defining utility functions, if the utility of the player does not depend on the strategy played by some other player, the strategy of this player is omitted from the function arguments. For each element ai A, we define

Uai(TC1 i , TC2 i , TC3 i , Fai) = Uai(FC1 i , FC2 i , FC3 i , Fai)

= Uai(TC1 i , TC2 i , FC3 i , Fai) = Uai(TC1 i , FC2 i , TC3 i , Fai)

= Uai(FC1 i , TC2 i , TC3 i , Fai) = 1

The undefined combinations pay out 0. For every Cj C, utility function UCj is 0 for every strategy profile. For every ai A we put Oai = {Tai}. For every Cj C, we put OCj = XCj, concluding the construction. The correctness proof is in the full version.

5 EXACT GAME IMPLEMENTATION is NP-hard The complexity of EXACT GAME IMPLEMENTATION has so far been open. In this section we show that it is NP-hard even when we have only two identical players.

Theorem 5.1 ( ). EXACT GAME IMPLEMENTATION is NPhard, even for two players and rectangular desired strategyprofile sets.

Proof. We give a reduction from the NP-hard 3-COLORING problemin which are a graph H and need to decide whether H can be properly colored with three colors. That is, whether we can assign each vertex exactly one color such that no two adjacent vertices receive the same color. Given an instance H of 3-COLORING we proceed as follows to construct an instance of EXACT GAME IMPLEMENTATION that consists of a game G = (N, X, U), a rectangular strategy profile set O, and the real budget δ = 1. There are two players, that is, N = {1, 2}. The sets of strategies of the two players are identical, that is, X = X1 X2 and X1 = X2. Thus, we only describe X1. For each vertex in V (H) there is a corresponding strategy in X1, that is, V (H) X1. We call these vertex strategies. Furthermore, for each combination of a color c [3] and a vertex v V (H) there is a strategy (v, c) X1. We call these color-choice strategies, and we use C to denote the set of color-choice strategies, that is, C = {(v, c) | v V (H) c [3]}. Finally, we have a set D of 3n dummy strategies. Overall, X1 = X2 = V (H) C D. The strategies to implement are the color-choice strategies, that is, O1 = O2 = C and O = O1 O2. Intuitively, the strategy (v, c) in O1 that dominates strategy v after promises shall correspond to choosing color c for vertex v. The utility functions are symmetric, that is, for each x X1 = X2 and y X2 = X1 we have U1(x, y) = U2(y, x). Thus, we only describe U1 explicitly.

Moreover, we only give the non-zero values of U1, all values not explicitly mentioned are 0. First, for each pair (v, c1), (v, c2) of color-choice strategies that correspond to the same vertex v V (H) we put

U1((v, c1), (v, c2)) = 3, c1 = c2 2, c1 = c2.

Second, for each pair (u, c1), (v, c2) of color-choice strategies corresponding to adjacent vertices u, v V (H) we put

U1((u, c1), (v, c2)) = U1((v, c1), (u, c2)) = 1, c1 = c2 2, c1 = c2.

Third, for each vertex strategy v and each color-choice strategy (v, c) corresponding to v we put U1(v, (v, c)) = 3. Fourth, for each vertex strategy u and each color-choice strategy (u, c) corresponding to a neighbor u NH(v) of v we put U1(v, (u, c)) = 2. Finally, for each color-choice strategy x X1 we pick a distinct dummy strategy y X2 and put U1(x, y) = 1. This concludes the description of the EXACT GAME IMPLEMENTATION instance I = (G, O, δ) where G = (N, X, U) and δ = 1. Intuitively, the players are highly incentivized to play vertex strategies because of the values U1(v, (v, c)) = 3. In order to dominate a vertex strategy v, we need to pick a color-choice strategy corresponding to that vertex v because in order to make a different strategy dominate v we would exceed the budget of 1. The values U1((v, c1), (v, c2)) will enforce that both players select the same color for each vertex. Afterwards, if two adjacent vertices u, v would receive the same color c then the values U1((u, c), (v, c)) = 1 = U2((u, c), (v, c)) enforce that we would have to pay both players: These two color-choice strategies would have to dominate u (for player 1) and v (for player 2), respectively, and we have U1(u, (v, c)) = 2 = U2((u, c), v). The proof of the correctness is in the full version.

6 Correction to Algorithm for EXACT GAME IMPLEMENTATION Eidenbenz et al. (2011) gave an algorithm which on input of a game G and a desired strategy-profile region O, finds the minimum δ such that (G, O, δ) is a positive instance of EXACT GAME IMPLEMENTATION. This is Algorithm 1 in (Eidenbenz et al. 2011), which for completeness is contained in the full version. The algorithm fails to give an exact implementation when for some player i N, a strategy in Oi dominates some other strategy in Oi. We show an example where it fails and provide a fix for a class of games which we refer to as equitable games. To see that the algorithm does not always construct a correct payment promise V, consider a 2-player instance where player 1 and player 2 both have two strategies {s1, s2}. Let us define the utility functions for both players i [2] as

Ui(s1, s1) = 2 Ui(s2, s1) = 1 Ui(s1, s2) = 1 Ui(s2, s2) = 0.

Let O1 = {s1, s2} and O2 = {s1}.

Algorithm 1: Minimum cost exact implementation

input : A game G = (N, X, U) and a rectangular strategy profile region O = O1 On. output : A payment promise V and δ 0 such that V implements O and maxo O P

i N Vi(o) = δ is smallest possible.

1 foreach i N do

2 foreach mapping Fi : Xi \ Oi Oi do

3 V Fi Compute V(Fi, Xi, Oi)

4 δ ; V (0, . . . , 0);

5 foreach F = (F1, . . . , Fn) F do

6 δF maxo O P

i N V Fi(o);

7 if δF < δ then δ δF ; V F;

8 foreach i N do

9 foreach oi Oi do

10 foreach x i X i \ O i do Vi(oi, x i) ;

11 return δ, V

12 def Compute V(Fi, Xi, Oi ):

13 foreach oi Oi do

14 foreach o i O i do

15 if F 1 i (oi) = then

16 Vi(oi, o i) max{0,

17 maxxi F 1 i (oi) Ui(xi, o i) Ui(oi, o i)};

18 else Vi(oi, o i) 0;

19 return Vi

We can see that for both players s1 dominates s2, so X i = {s1} for all i [2]. Because |X i \ Oi| = 0 for all i [2], the check on line (1) of Algorithm 1 from (Eidenbenz et al. 2011) is always false and the algorithm returns that O can be implemented exactly with cost 0. However, V1 constructed by the Algorithm is 0 everywhere, and thus O1 = X 1, meaning this is not an exact implementation of O. Towards a correction, if we can for every player i N, pick for each of its desired strategies oi Oi an undesired strategy profile xoi X \ O, such that xoi i = oi, and for no

desired strategy o i Oi \ {oi} do we have that xoi i = xo i i. In other words, we require

|Oi| |X i \ O i|. (1)

Then we can ensure at no extra cost that no strategy in Oi dominates another strategy in Oi. Let us call a game for which, for every i N, Equation (1) holds equitable. We start by showing that, if a game is equitable then we can translate every non-exact implementation to an exact implementation, where the cost is bounded by the worst-case payment over O in the initial implementation. Throughout this section, let F denote the Cartesian product of the possible functions from Xi \ Oi to Oi for every player, i.e., F = (X1 \O1 O1) (Xn \On On). Theorem 6.1 ( ). Let G = (N, X, U) be an equitable game and O X a rectangular strategy profile region. Let V implement O (not necessarily exactly) and let δ = maxo O P i N Vi(o). Then the payment promise V below implements O exactly with cost(V ) = δ. Let M = Umax + δ + 1 with Umax = maxi N,x X Ui(x), (i.e., Umax is the maximum amount any player may receive in utility). For every player i N, for each of its desired strategies oi Oi,

we choose an undesired strategy profile xoi X \ O, such that xoi i = oi, and for no desired strategy o i Oi \ {o i}

do we have that xoi i = xo i i. Since G is equitable, we can choose such strategy profiles for every i N, oi Oi. To define V , for every i N, x X, set

0, if xi Xi \ Oi Vi(x), if xi Oi, x i O i M + 1 Ui(x), if xi Oi, x i = xoi i M Ui(x), otherwise.

The idea behind the payments is to enforce that every desired strategy has one undesired strategy profile where it is the best possible option. This prevents any other strategy from dominating it. Next we show that Algorithm 1 by Eidenbenz et al. (2011) identifies a payment promise V minimizing maxo O P i N Vi(o). To make the analysis of the algorithm easier, we give a simplified version of their algorithm, which is shown in Algorithm 1. We obtain the following: Theorem 6.2 ( ). For a given equitable game G = (N, X, U) and a set of desired strategy profiles O X, the smallest δ 0 such that (G, O, δ) is a positive instance of EXACT GAME IMPLEMENTATION can be identified in time O(|O| maxi N(n|Oi||Xi\Oi||O||Xi \ Oi| + |Oi|n|Xi\Oi|)).

7 Characterization of Zero-Budget Implementation In this section we characterize rectangular strategy profiles P = P1 . . . Pn that can be implemented at zero cost. We call such profiles promise-Nash equilibrium (PNE). The naming comes from two considerations. First, if each player i has only one strategy in Pi, then a PNE is equivalent to a Nash equilibrium. Second, a PNE encapsulates the notion that no player i has an incentive to switch towards a strategy outside of Pi provided that each other player j plays only strategies in Pj. We believe this notion to be of independent interest because it models situations in which certain types of strategies may be off limits for, e.g., moral reasons. Using PNE may thus enable studying the price (or value) of morality and similar ideas. Formally: Definition 7.1. Let G = (N, X, U) be a game. A rectangular strategy profile region P1 Pn X1 Xn is a promise-Nash equilibrium (PNE) if

i N, xi Xi \ Pi pi Pi : p i P i : Ui(pi, p i) Ui(xi, p i).

We observe the following. Theorem 7.2 ( ). Let G = (N, X, U) be a game. A rectangular strategy profile region P = P1 Pn X1 Xn can be implemented with cost 0 if and only if P is a promise-Nash equilibrium. In fact, this theorem has an interpretation in the morality setting above: Let a profile P consist only of moral strategies and no amoral ones. Then P is a PNE if and only if morality is incentivized without any incentives having to actually be realized. In other words, morality is self-enforcing if and only if it constitutes a PNE.

Acknowledgments

Jiehua Chen and Sofia Simola are supported by the Vienna Science and Technology Fund (WWTF), grant number VRG18-012. Manuel Sorge acknowledges funding by the Alexander von Humboldt Foundation.

Brill, M.; Freeman, R.; and Conitzer, V. 2015. Computing the Optimal Game. In Proceedings of the 2nd Workshop on Exploring Beyond the Worst Case in Computational Social Choice, 1 8. Buterin, V.; Reijsbergen, D.; Leonardos, S.; and Piliouras, G. 2020. Incentives in Ethereum s hybrid Casper protocol. International Journal of Network Management, 30(5). Chen, J.; Khavidaki, N. L.; Haydn, S. V.; Simola, S.; and Sorge, M. 2022. Game Implementation: What Are the Obstructions? Technical report, ar Xiv:2212.00699 [cs.GT]. Conitzer, V.; and Sandholm, T. W. 2014. Complexity of Mechanism Design. Technical report, ar Xiv:cs/0205075 [cs.GT]. Deng, Y.; and Conitzer, V. 2017. Disarmament Games. In Singh, S.; and Markovitch, S., eds., Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA, 473 479. Deng, Y.; and Conitzer, V. 2018. Disarmament Games With Resource. In Mc Ilraith, S. A.; and Weinberger, K. Q., eds., Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018), 981 988. AAAI Press. Deng, Y.; Tang, P.; and Zheng, S. 2016. Complexity and Algorithms of K-implementation. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 16), 9. Eidenbenz, R.; Pignolet, Y. A.; Schmid, S.; and Wattenhofer, R. 2011. Cost and complexity of harnessing games with payments. International Game Theory Review, 13(01): 13 44. Hamari, J. 2019. Gamification, 1 3. John Wiley & Sons, Ltd. Kearns, M. J.; Littman, M. L.; and Singh, S. 2001. Graphical Models for Game Theory. In Breese, J. S.; and Koller, D., eds., Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence (UAI 01), 253 260. Morgan Kaufmann. Letchford, J.; and Conitzer, V. 2010. Computing optimal strategies to commit to in extensive-form games. In Proceedings of the 11th ACM conference on Electronic commerce (EC 10), 83 92. Association for Computing Machinery. Maskin, E. 1999. Nash Equilibrium and Welfare Optimality. Review of Economic Studies, 66(1): 23 38. Maskin, E.; and Sj ostr om, T. 2002. Implementation Theory, volume 1 of Handbook of Social Choice and Welfare, 237 288. Elsevier.

Monderer, D.; and Tennenholtz, M. 2004. KImplementation. Journal of Artificial Intelligence Research, 21: 37 62. Moscibroda, T.; and Schmid, S. 2009. On Mechanism Design without Payments for Throughput Maximization. In Proceedings of the 28th IEEE International Conference on Computer Communications (INFOCOM 2009), 972 980. Schaefer, T. J. 1978. The Complexity of Satisfiability Problems. In Proceedings of the Tenth Annual ACM Symposium on Theory of Computing (STOC 78), 216 226. Thaler, R. H.; and Sunstein, C. R. 2008. Nudge: Improving Decisions about Health, Wealth, and Happiness. Yale University Press. Wooldridge, M.; Endriss, U.; Kraus, S.; and Lang, J. 2013. Incentive engineering for Boolean games. Artificial Intelligence, 195: 418 439.