# bidding_graph_games_with_partiallyobservable_budgets__a156794d.pdf

Bidding Graph Games with Partially-Observable Budgets

Guy Avni1, Ismäel Jecker2, Ðor de Žikeli c3

1University of Haifa 2University of Warsaw 3Institute of Science and Technology Austria (ISTA)

Two-player zero-sum graph games are a central model, which proceeds as follows. A token is placed on a vertex of a graph, and the two players move it to produce an inﬁnite play, which determines the winner or payoff of the game. Traditionally, the players alternate turns in moving the token. In bidding games, however, the players have budgets and in each turn, an auction (bidding) determines which player moves the token. So far, bidding games have only been studied as fullinformation games. In this work we initiate the study of partial-information bidding games: we study bidding games in which a player s initial budget is drawn from a known probability distribution. We show that while for some bidding mechanisms and objectives, it is straightforward to adapt the results from the full-information setting to the partialinformation setting, for others, the analysis is signiﬁcantly more challenging, requires new techniques, and gives rise to interesting results. Speciﬁcally, we study games with meanpayoff objectives in combination with poorman bidding. We construct optimal strategies for a partially-informed player who plays against a fully-informed adversary. We show that, somewhat surprisingly, the value under pure strategies does not necessarily exist in such games.

Introduction

We consider two-player zero-sum graph games; a fundamental model with applications, e.g., in multi-agent systems (Alur, Henzinger, and Kupferman 2002). A graph game is played on a ﬁnite directed graph as follows. A token is placed on a vertex and the players move it throughout the graph to produce an inﬁnite path, which determines the payoff of the game. Traditional graph games are turn-based: the players alternate turns in moving the token. Bidding games (Lazarus et al. 1996, 1999) are graph games in which an auction (bidding) determines which player moves the token in each turn. The concrete bidding mechanisms that we consider proceed as follows. In each turn, both players simultaneously submit bids, where a bid is legal if it does not exceed the available budget. The higher bidder wins the bidding and moves the token. The mechanisms differ in their payment schemes, which are classiﬁed according to two orthogonal properties. Who pays: in

Copyright 2023, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

ﬁrst-price bidding only the higher bidder pays the bid and in all-pay bidding both players pay their bids. Who is the recipient: in Richman bidding (named after David Richman) payments are made to the other player and in poorman bidding payments are made to the bank , i.e., the bid is lost. As a rule of thumb, bidding games under all-pay and poorman bidding are respectively technically more challenging than ﬁrst-price and Richman bidding. More on this later. In terms of applications, however, we argue below that poorman bidding is often the more appropriate bidding mechanism.

Applications. A central application of graph games is reactive synthesis (Pnueli and Rosner 1989): given a speciﬁcation, the goal is to construct a controller that ensures correct behavior in an adversarial environment. Synthesis is solved by constructing a turn-based parity game in which Player 1 is associated with the controller and Player 2 with the environment, and searching for a winning Player 1 strategy. Bidding games extend the modeling capabilities of graph games. For example, they model ongoing and stateful auctions in which budgets do not contribute to the players utilities. Advertising campaigns are one such setting: the goal is to maximize visibility using a pre-allocated advertising budget. By modeling this setting as a bidding game and solving for Player 1, we obtain a bidding strategy with guarantees against any opponent1. Maximizing visibility can be expressed as a mean-payoff objective (deﬁned below). All-pay poorman bidding is particularly appealing since it constitutes a dynamic version of the well-known Colonel Blotto games (Borel 1921). Rather than thinking of the budgets as money, we think of them as resources at the disposal of the players, like time or energy. Then, deciding how much to bid represents the effort that a player invests in a competition, e.g., investing time to prepare for a job interview, where the player that invests more wins the competition.

Prior work full-information bidding games. The central quantity in bidding games is the initial ratio between the players budgets. Formally, for i {1, 2}, let Bi be Player i s initial budget. Then, Player 1 s initial ratio is B1/(B1+B2). A random-turn game (Peres et al. 2009) with parameter p [0, 1] is similar to a bidding game only that in-

1A worst-case modelling assumes that the other bidders cooperate against Player 1.

The Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23)

stead of bidding, in each turn, we toss a coin with probability p that determines which player moves the token. Formally, a random-turn game is a special case of a stochastic game (Condon 1992). Qualitative objectives. In reachability games, each player is associated with a target vertex, the game ends once a target is reached, and the winner is the player whose target is reached. Reachability bidding games were studied in (Lazarus et al. 1996, 1999). It was shown that, for ﬁrst-price reachability games, a threshold ratio exists, which, informally, is a necessary and sufﬁcient initial ratio for winning the game. Moreover, it was shown that ﬁrst-price Richmanbidding games are equivalent to uniform random-turn games (and only Richman bidding); namely, the threshold ratio in a bidding game corresponds to the value of a uniform randomturn game. All-pay reachability games are technically more challenging. Optimal strategies might be mixed and may require sampling from inﬁnite-support probability distributions even in extremely simple games (Avni, Ibsen-Jensen, and Tkadlec 2020). Mean-payoff games. Mean-payoff games are inﬁniteduration quantitative games. Technically, each vertex of the graph is assigned a weight, and the payoff of an inﬁnite play is the long-run average sum of weights along the path. The payoff is Player 1 s reward and Player 2 s cost, thus we refer to them respectively as Max and Min. For example, consider the bowtie game G , depicted in Fig. 1. The payoff in G corresponds to the ratio of bidding that Max wins. Informally G models the setting in which in each day a publisher sells an ad slot, and Max s objective is to maximize visibility: the number of days that his ad is displayed throughout the year. Unlike reachability games, intricate equivalences between mean-payoff bidding games and random-turn games are known for all the mechanisms described above (Avni, Henzinger, and Chonev 2019; Avni, Henzinger, and Ibsen-Jensen 2018; Avni, Henzinger, and Žikeli c 2021; Avni, Jecker, and Žikeli c 2021). Example 1. We illustrate the equivalences between fullinformation bidding games and random-turn games. Consider the bowtie game G (see Fig. 1). For p [0, 1], the random-turn game RT(G , p) that uses a coin with bias p is depicted in Fig. 2. Its expected payoff is p. Suppose that the initial ratio is r (0, 1). Under ﬁrstprice Richman-bidding, the optimal payoff in G does not depend on the initial ratio: no matter what r is, the optimal payoff that Max can guarantee is arbitrarily close to 0.5, hence the equivalance with RT(G , 0.5). Under ﬁrstprice poorman bidding, the optimal payoff does depend on the initial ratio: roughly, the optimal payoff that Max can guarantee is r, hence the equivalence with RT(G , r). For all-pay bidding, pure strategies are only useful in all-pay poorman bidding and only when r > 0.5, where Max can guarantee an optimal payoff of 2r 1

r . The results extend to general strongly-connected games (see Thm. 3).

Our contributions partial-information bidding games. In most auction domains, bidders are not precisely informed of their opponent s budget. Bidding games, however, have only been studied as full-information games. We initiate the

v Max v Min

Figure 1: The mean-payoff game G with the weights in the vertices.

v Max v Min

Figure 2: The simpliﬁed random-turn game RT(G , p), for p [0, 1].

study of bidding games in which the players are partially informed of the opponent s budget. Speciﬁcally, we study bidding games in which the two players budgets are drawn from a known probability distribution, and the players goal is to maximize their expected utility. We ﬁrst show that the results on qualitative objectives as well as ﬁrst-price Richman bidding transfer to the partial-information setting. We turn to study mean-payoff poorman-bidding games, which are signiﬁcantly more challenging. We focus on onesided partial-information games in which only Player 2 s budget is drawn from a probability distribution. Thus, Player 1 is partially informed and Player 2 is fully informed of the opponent s budget. We argue that one-sided partialinformation games are practically well-motivated. Indeed, one-sided partial information is a worst-case modelling: the utility that an optimal strategy for Player 1 guarantees in the game, is a lower bound on the utility that it will guarantee when deployed against the concrete environment. We illustrate our results in the following example.

Example 2. Consider the bowtie game G (Fig. 1), where Max (the partially-informed player) starts with a budget of B and Min (the fully-informed player) starts with a budget that is drawn uniformly at random from supp(γ) = {C1, C2}. We describe an optimal strategy for Max under ﬁrst-price poorman bidding. Max carefully chooses an x [B C1

C2 , B] and divides his budget into two wallets ; the ﬁrst with budget x and the second with budget B x. He initially uses his ﬁrst wallet to play an optimal full-information strategy assuming the initial budgets are x and C1, which guarantees a payoff of at least p1 = x C1+x. If Player 2 spends more than C1, i.e., her initial budget was in fact C2, then Player 1 proceeds to use his second wallet against Player 2 s remaining budget, which guarantees a payoff of at least p2 = B x B x+C2 C1 . Thus, the expected payoff is at least 0.5 (p1 +p2), and Max simply chooses an x that maximizes this expression. Note that the constraint that x B C1

C2 implies that p1 p2, thus Min has an incentive to play so that Max proceeds to use his second wallet. We show that this strategy is optimal, and extend the technique to obtain optimal strategies in general strongly-connected games for ﬁrst-price and all-pay poorman bidding. Finally, we show that the optimal payoff that Min can guarantee in G , is obtained by a surprisingly simple strategy. We show that the following Min strategy is optimal: when her initial budget is Ci, for i {1, 2}, Min follows an optimal full-information strategy for ratio B/(B + Ci). That is, she reveals her true budget in the ﬁrst round and cannot gain utility by hiding this information. The technical challenge is to show that this strategy is optimal.

Our results show that contrary to turn-based, stochastic games, and full-information bidding games, there is a gap between the optimal payoffs that the players can guarantee with pure strategies. Thus, the value does not necessary exist in partial-information mean-payoff bidding games under pure strategies.

Related work. The seminar book (Aumann, Maschler, and Stearns 1995) studies the mean-payoff game G under one-sided partial-information with a different semantic to the one we study. Let L or R denote the two vertices of G . Min has partial information of the weights of L and R, which, before the game begins, are drawn from a known probability distribution. Max, the fully-informed player, knows the weights. In each turn, Max chooses L or R, followed by Min who either accepts or rejects Max s choice, thus both players can affect the movement of the token. The value in the game is shown to exist. Interestingly and similar in spirit to our results, there are cases in which Max cannot use his knowledge advantage and his optimal strategy reveals which of the two vertices he prefers. One-sided partial information have also been considered in turn-based graph games, e.g., (Reif 1984; Raskin et al. 2007; Wulf, Doyen, and Raskin 2006). Discrete bidding games were studied in (Develin and Payne 2010); namely, budgets are given in coins, and the minimal positive bid a player can make is a single coin. Tiebreaking is a signiﬁcant factor in such games (Aghajohari, Avni, and Henzinger 2021). Non-zero-sum bidding games were studied in (Meir, Kalai, and Tennenholtz 2018). See also the survey (Avni and Henzinger 2020).

Preliminaries Strategies in bidding games. A bidding game is played on a directed graph V, E . A strategy in any graph game is a function from histories to actions. In bidding games, a history consists of the sequence of vertices that were visited and bids made by the two players. We stress that the history does not contain the current state of the budgets. Rather, a player can compute his opponent s current budget based on the history of bids, if he knows her initial budget. We formalize the available budget following a history. For i {1, 2}, suppose the initial budget of Player i is Bi. For a history h, we deﬁne the investments of Player i throughout h, denoted Invi(h). In all-pay bidding, Invi(h) is the sum bids made by Player i throughout h, and in ﬁrstprice bidding, it is the sum only over the winning bids. We denote by Bi(h) Player i s available budget following h. Under Richman bidding, winning bids are paid to the opponent, thus Bi(h) = Bi Invi(h) + Inv3 i(h). Under poorman bidding, winning bids are paid to the bank, thus Bi(h) = Bi Invi(h). Given a history, a strategy prescribes an action, which in a bidding game, is a pair b, u R V , where b is a bid and u is the vertex to move to upon winning. We restrict the actions of the players following a history h so that (1) the bid does not exceed the available budget, thus following a history h, a legal bid for Player i is a bid in [0, Bi(h)], and (2) a player must choose a neighbor of the vertex that

the token is placed on. We restrict attention to strategies that choose legal actions for all histories. Note that we consider only pure strategies and disallow mixed strategies (strategies that allow a random choice of action).

Deﬁnition 1. For i {1, 2}, we denote by Si(Bi) the set of legal strategies for Player i with an initial budget of Bi. Note that with a higher initial budget, there are more strategies to choose from, i.e., for B i > Bi, we have Si(Bi) Si(B i).

The central quantity in bidding games is the initial ratio, deﬁned as follows.

Deﬁnition 2. Budget ratio. When Player i s budget is Bi, for i {1, 2}, we say that Player i s ratio is Bi B1+B2 .

Plays. Consider initial budgets B1 and B2 for the two players, two strategies f S1(B1) and g S2(B2), and an initial vertex v. The triple f, g, and v gives rise to a unique play, denoted play(v, f, g). The construction of play(v, f, g) is inductive and is intuitively obtained by allowing the players to play according to f and g. Initially, we place the token on v, thus the ﬁrst history of the game is h = v. Suppose a history h has been played. Then, the next action that the players choose is respectively u1, b1 = f(h) and u2, b2 = g(h). If b1 > b2, then Player 1 wins the bidding and the token moves to u1, and otherwise Player 2 wins the bidding and the token moves to u2. Note that we resolve ties arbitrarily in favor of Player 2. The play continues indeﬁnitely. Since the players always choose neighboring vertices, each play corresponds to an inﬁnite path in V, E . For n N, we use playn(v, f, g) to denote its ﬁnite preﬁx of length n. We sometimes omit the initial vertex from the play when it is clear from the context.

Objectives. We consider zero-sum games. An objective assigns a payoff to a play, which can be thought of as Player 1 s reward and Player 2 s penalty. We thus sometimes refer to Player 1 as Max and Player 2 as Min. We denote by payoff(f, g, v) the payoff of the play play(f, g, v).

Qualitative objectives. The payoff in games with qualitative objectives is in { 1, 1}. We say that Player 1 wins the play when the payoff is 1. We consider two qualitative objectives. (1) Reachability. There is a distinguished target vertex t and a play is winning for Player 1 iff it visits t. (2) Parity. Each vertex is labeled by an index in {1, . . . , d} and a play is winning for Player 1 iff the highest index that is Parity objectives are important in practice, e.g., reactive synthesis (Pnueli and Rosner 1989) is reducted to the problem of solving a (turn-based) parity games.

Mean-payoff games. The quantitative objective that we consider is mean-payoff. Every vertex v in a meanpayoff game has a weight w(v) and the payoff of an inﬁnite play is the long-run average weight that it traverses. Formally, the payoff of an inﬁnite path v1, v2, . . . is lim infn 1

n P 1 i<n w(vi). Note that the deﬁnition favors Min since it uses lim inf.

Values in full-information bidding games. We are interested in ﬁnding the optimal payoff that a player can guarantee with respect to an initial budget ratio. Let c R and

initial budgets B1 and B2. We say that Player 1 can guarantee a payoff of c, if he can reveal that he will be playing according to a strategy f S1(B1), and no matter which strategy g S2(B2) Player 2 responds with, we have payoff(f, g) c. Player 1 s value is the maximal c that he can guarantee, and Player 2 s value is deﬁned dually. Note that there might be a gap between the two players values. When Player 1 s value coincides with Player 2 s value, we say that the value exists in the game.

Partial Information Bidding Games A partial-information bidding game is G = V, E, α, γ1, γ2 , where V, E is a directed graph, α is an objective as we elaborate later, and the budget distribution γi is a probability distribution from which Player i s initial budget is drawn, for i {1, 2}. The support of a probability distribution γ : Q [0, 1] is supp(γ) = {x Q : γ(x) > 0}. We restrict attention to ﬁnite-support probability distributions. For i {1, 2}, the probability that Player i s initial budget is Bi supp(γi) is γi(Bi). Deﬁnition 3. One-sided partial information. We say that a game has one-sided partial information when |supp(γ1)| = 1 and |supp(γ2)| > 1. We then call Player 1 the partiallyinformed player and Player 2 the fully-informed player. We turn to deﬁne values in partial-information games. The intuition is similar to the full-information case only that each player selects a collection of strategies, one for each possible initial budget, and we take the expectation over the payoffs that each pair of strategies achieves. The δ in the following deﬁnition allows us to avoid corner cases due to ties in biddings and the ϵ is crucial to obtain the results on fullinformation mean-payoff bidding games. Deﬁnition 4. (Values in partial-information bidding games). Consider a partial-information bidding game G = V, E, α, β, γ . Suppose supp(β) = {B1, . . . , Bn} and supp(γ) = {C1, . . . , Cn}. We deﬁne Player 1 s value, denoted val (G, β, γ), and Player 2 s value, denoted val (G, β, γ), is deﬁned symmetrically. We deﬁne that val (G, β, γ) = c R if for every δ, ϵ > 0,

There is a collection f B S1(B + δ)

B supp(β) of Player 1 strategies, such that for every collection g C S2(C)

C supp(γ) of Player 2 strategies, we have P B,C β(B) γ(C) payoff(f B, g C) c ϵ. For every collection f B S1(B)

B supp(β) of

Player 1 strategies, there is a collection g C S2(C + δ)

C supp(γ) of Player 2 strategies such that P B,C β(B) γ(C) payoff(f B, g C) c + ϵ.

Note that val (G, β, γ) val (G, β, γ) and when there is equality, we say that the value exists, and denote it by val(G, β, γ). The value in mean-payoff games is often called the meanpayoff value. In mean-payoff games we use MP , MP , and MP instead of val , val , and val, respectively. When G is

full-information and the budget ratio is r, we use MP(G, r) instead of writing the two budgets.

Partial-Information Qualitative First-Price Bidding Games In this section, we focus on ﬁrst-price bidding and show that the value exists in partial-information bidding games with qualitative objectives. The proof adapts results from the fullinformation setting, which we survey ﬁrst. Deﬁnition 5. (Threshold ratios in full-information games). Consider a full-information ﬁrst-price bidding game with a qualitative objective. Suppose that the sum of initial budgets is 1 and that the game starts at v. The threshold ratio in v, denoted Th(v), is a value t such that for every ϵ > 0: Player 1 wins when his ratio is greater than Th(v); namely, when the initial budgets are t + ϵ and 1 t ϵ. Player 2 wins when Player 1 s ratio is less than Th(v); namely, when the initial budgets are t ϵ and 1 t + ϵ. Existence of threshold ratios for full-information reachability games was shown in (Lazarus et al. 1996, 1999) and later extended to full-information parity games in (Avni, Henzinger, and Chonev 2019; Avni, Henzinger, and Ibsen Jensen 2018). Theorem 1. (Lazarus et al. 1996, 1999; Avni, Henzinger, and Chonev 2019; Avni, Henzinger, and Ibsen-Jensen 2018) Threshold ratios exist in every vertex of a parity game. The following theorem, whose proof can be found in the full version, extends these results to the partial-information setting. Theorem 2. Consider a partial-information parity ﬁrstprice bidding game G = V, E, α, β, γ and a vertex v V . Let W = { B, C : B supp(β), C supp(γ), and Th(v) < B B+C }. Then, the value of G in v is P B,C W β(B) γ(C).

Partial-Information Mean-Payoff Bidding Games In this section we study mean-payoff bidding games. Throughout this section we focus on games played on strongly-connected graphs. We start by surveying results on full-information games. The most technically-challenging results concern one-sided partial-information poormanbidding games. We ﬁrst develop optimal strategies for the partially-informed player, and then show that the value does not necessary exist under pure strategies.

Full-Information Mean-Payoff Bidding Games We show equivalences between bidding games and a class of stochastic games (Condon 1992) called random-turn games, which are deﬁne formally as follows. Deﬁnition 6. (Random-turn games). Consider a stronglyconnected mean-payoff bidding game G. For p [0, 1], the random-turn game that corresponds to G w.r.t. p, denoted RT(G, p), is a game in which instead of bidding, in

each turn, we toss a (biased) coin to determine which player moves the token: Player 1 and Player 2 are respectively chosen with probability p and 1 p. Formally, RT(G, p) is constructed as follows. Every vertex v in G, is replaced by three vertices v N, v1, and v2. The vertex v N simulates the coin toss: it has an outgoing edge with probability p to v1 and an edge with probability 1 p to v2. For i {1, 2}, vertex vi simulates Player i winning the coin toss: it is controlled by Player i and has an outgoing edge to u N, for every neighbor u of v. The weights of v N, v1, and v2 coincide with the weight of v. The mean-payoff value of RT(G, p), denoted MP RT(G, p) , is the optimal expected payoff that the two players can guarantee, and it is known to exist (Puterman 2005). Since G is strongly-connected, MP RT(G, p) does not depend on the initial vertex.

For a full-information game G and a ratio r (0, 1), recall that MP(G, r) denotes the optimal payoff that Max can guarantee with initial ratio r. We state the equivalences between the two models.

Theorem 3. Let G be a strongly-connected full-information mean-payoff bidding game.

First-price Richman bidding (Avni, Henzinger, and Chonev 2019). The optimal payoff that Max can guarantee with a pure strategy does not depend on the initial ratio: for every initial ratio r, we have MP(G, r) = MP RT(G, 0.5) . First-price poorman bidding (Avni, Henzinger, and Ibsen-Jensen 2018). The optimal payoff that Max can guarantee with pure strategy and ratio r coincides with the value of a random-turn game with bias r: for every initial ratio r, we have MP(G, r) = MP RT(G, r) . All-pay poorman bidding (Avni, Jecker, and Žikeli c 2021). The optimal payoff that Max can guarantee with a pure strategy and ratio r > 0.5 coincides with the value of a random-turn game with bias (2r 1)/r: for every initial ratio r > 0.5, we have MP(G, r) = MP RT(G, (2r 1)/r) .

Since the optimal payoff under ﬁrst-price Richman bidding depends only on the structure of the game and not on the initial ratios, the result easily generalizes to partialinformation games. Consider two budget distributions β and γ for Min and Max, respectively. Indeed, when Min s initial budget is B supp(β), playing optimally against any C supp(γ) results in the same payoff, and similarly for Max. We thus conclude the following.

Theorem 4. Consider a strongly-connected ﬁrst-price Richman mean-payoff bidding game G. For any two budget distributions β and γ for the two players, we have MP (G, β, γ) = MP (G, β, γ) = MP RT(G, 0.5) .

Remark 1. (All-pay Richman bidding). It was shown in (Avni, Jecker, and Žikeli c 2021) that in all-pay Richman bidding games, pure strategies are useless : no matter what the initial ratio is, Max cannot guarantee a positive payoff with a pure strategy. The study of mean-payoff all-pay Richmanbidding games is thus trivial in the partial-information setting as well.

The Value of the Partially-Informed Player We turn to study partial-information mean-payoff bidding games under poorman bidding, where we focus on one-sided partial information. We arbitrarily set Max to be partiallyinformed and Min to be fully-informed.

First-price bidding. Fix a strongly-connected meanpayoff game G. Suppose that Max s budget is B and Min s budget is chosen from a ﬁnite probability distribution γ with supp(γ) = {C1, . . . , Cn} and Ci < Ci+1, for 1 i < n. We generalize the technique that is illustrated in Example 2. Max carefully chooses increasing x1, . . . , xn, where xn = B. He maintains two accounts : a spending account from which he bids and a savings account. Initially, the spending account has a budget of x1 and the savings account, a budget of B x1. Max plays optimistically . He ﬁrst plays in hope that Min s budget is C1 with a budget of x1. If Min does not spend C1, the payoff is as in full-information games, namely at least p1 = MP RT(G, x1 x1+C1 ) . Otherwise, Min spends at least C1 and Max transfers budget from his savings account to his spending account so that the saving account has B x2 and the spending account has at least x2 x1. Note that if Min s initial budget was indeed C2, at this point she is left with a budget of at most C2 C1. If Min does not spend C2 C1, by following a full-information optimal strategy, Max can guarantee a payoff of at least p2 = MP RT(G, x2 x1 x2 x1+C2 C1 ) . The deﬁnition of p3, . . . , pn is similar. Max chooses x1, . . . , xn so that p1 . . . pn. Thus, when Min s initial budget is Ci, she has an incentive to play so that Max s spending account will reach xi and the payoff will be at least pi. We call such a choice of x1, . . . , xn admissible and formally deﬁne it as follows. Deﬁnition 7. Admissible sequences. Let G be a poorman mean-payoff bidding game. Let B be a budget of Max and γ be a ﬁnite budget distribution of Min with supp(γ) = {C1, . . . , Cn}. A sequence (xi)1 i n of budgets is called admissible with respect to B and γ if 0 x1 x2 xn = B and p1 p2 . . . pn, where

pi = MP RT G, xi xi 1 xi xi 1 + Ci Ci 1

for each 1 i n, with x0 = 0 and C0 = 0. We denote by ADM(B, γ) the set of all admissible sequences with respect to B and γ. We state our main result and the proof can be found in the full version. Theorem 5 (Mean-payoff value of the partially-informed player). Consider a strongly-connected ﬁrst-price poorman mean-payoff bidding game G. Let B be the initial budget of Max and γ be a ﬁnite budget distribution of Min with supp(γ) = {C1, . . . , Cn}. Then

MP (G, β, γ) = max (xi)1 i n ADM(B,γ) Val(x1, . . . , xn), (2)

where Val(x1, . . . , xn) = Pn i=1 γ(Ci)

MP RT G, xi xi 1 xi xi 1+Ci Ci 1

with x0 = 0 and C0 = 0.

We point to some interesting properties of Max s value:

Remark 2. Consider the bowtie game (Fig. 1) and assume Max s budget is ﬁxed to B = 1 and Min s budget is drawn uniformly at random from {C1, C2}.

When C1 = 1 and C2 = 2, the maximum is obtained at x = 0.5, thus Max s optimal expected payoff is 1 3 = B B+C2 . We note that Max has a very simple optimal strategy in this case: assume the worst on Min s initial budget. That is, play according to an optimal strategy for initial budgets B and C2. When C1 = 1, and C2 = 5, the maximum is obtained at x = 0. This is the dual of the case above. Max can assume the best on Min s initial budget and play according to an optimal strategy for budgets B and C1. When Min s budget is C1, this strategy guarantees a payoff of B B+C1 . But when Min s budget is C2, the strategy cannot guarantee a payoff above 0. Thus, the strategy guarantees an expected payoff of 1

2 B B+C1 . There are cases in which Max s optimal strategy is not one of the trivial cases above. When C1 = 1 and C2 = 3, Max s optimal payoff is 1

2) 0.271, which is strictly larger than both 1

2 B B+C1 and 1

4 = B B+C2 .

All-pay poorman bidding We extend the technique in the previous section to all-pay poorman bidding. In order to state our results formally, we need to redeﬁne the notion of admissible sequences since the optimal payoff that Max can guarantee under all-pay bidding differs from the payoff that he can guarantee under ﬁrst-price bidding. Analogously to Def. 7 but now under all-pay bidding, we say that a sequence (xi)1 i n of budgets is called admissible with respect to a budget B of Max and a budget distribution γ of Min if 0 x1 x2 xn = B and p1 p2 . . . pn, where now

pi = MP RT G, 1 Ci Ci 1

I xi xi 1 > Ci Ci 1

for each 1 i n, with x0 = 0 and C0 = 0. Here, I is an indicator function that evaluates to 1 if the input logical formula is true, and to 0 if it is false. We are now ready to state our result on all-pay poorman mean-payoff bidding games.

Theorem 6 (Mean-payoff value of the partially-informed player). Consider a strongly-connected all-pay poorman mean-payoff bidding game G. Let B be the initial budget of Max and γ be a ﬁnite budget distribution of Min with supp(γ) = {C1, . . . , Cn}. Then

MP (G, β, γ) = max (xi)1 i n ADM(B,γ) Val(x1, . . . , xn), (3)

where Val(x1, . . . , xn) = Pn i=1 γ(Ci) MP RT G, 1

I xi xi 1 > Ci Ci 1 with x0 = 0 and C0 = 0 and I an indicator function.

The Mean-Payoff Value of the Fully-Informed Player Under First-Price Poorman Bidding

In this section we identify the optimal expected payoff that the fully-informed player can guarantee in the bowtie game (Fig. 1) under ﬁrst-price bidding. Suppose that Max s initial budget is B and Min s initial budget is drawn from a distribution γ. Consider the following collection of naive strategies for Min: when her initial budget is C supp(γ), Min plays according to an optimal full-information strategy for the ratio B B+C . We ﬁnd it surprising that this collection of strategies is optimal for Min in the bowtie game. The technical challenge in this section is the lower bound. This result complements Thm. 5: we characterize both Min and Max s values in the bowtie game when the players are restricted to use pure strategies. We show, somewhat unexpectedly, that the two values do not necessarily coincide. In order to state the result formally, we need the following deﬁnition. Intuitively, the potential of B, γ is the optimal expected payoff when Min plays according to the collection of naive strategies described above.

Deﬁnition 8. (Potential). Given a budget B R of Max and a budget distribution γ with support supp(γ) = {C1, C2, . . . , Ck} of Min, we deﬁne Pot(B, γ) = Pk j=1 γ(Cj) B B+Cj .

The main result in this section is given in the following theorem, whose proof follows from Lemmas 8 and 9.

Theorem 7 (Mean-payoff value of the fully-informed player). Consider the bowtie game G . Let B be the initial budget of Max and γ be a ﬁnite budget distribution of Min with supp(γ) = {C1, C2, . . . , Ck}. Then,

MP (G, B, γ) = Pot(B, γ) =

j=1 γ(Cj) B B + Cj . (4)

Before proving the theorem, we note the following.

Remark 3. (Inexistence of a value). Our result implies that the value in partial-information mean-payoff ﬁrst-price poorman bidding games under pure strategies is not guaranteed to exist. Indeed, consider G with B = 1 and γ that draws Min s budget uniformly at random from {1, 2}. By Thm. 5, one can verify that the optimal choice of x is 1, thus MP (G , B, γ) = 1

3. On the other hand, by Thm. 7, we have MP (G , B, γ) = 5 12.

The upper bound is obtained when Min reveals her true budget immediately and plays according to the strategies described above. The following lemma follows from results on full-information games (Thm. 3).

Lemma 8 (Upper bound). For every ϵ > 0, Min has a collection of strategies ensuring an expected payoff smaller than Pot(B, γ) + ϵ.

We proceed to the more challenging lower bound and show that there are no Min strategies that perform better than the naive strategy above.

Lemma 9 (Lower bound). For every ϵ > 0 and for every collection (gj SMin(Cj))1 j k of Min strategies, Max has a strategy ensuring an expected payoff greater than Pot(B, γ) ϵ.

Proof. Let ϵ > 0, and let (gj SMin(Cj))1 j k be a collection of Min strategies. We construct a counter strategy f of Max ensuring an expected payoff greater than Pot(B, γ) ϵ. The proof is by induction over the size k of the support of γ. Obviously, if k = 1, Max has perfect information and can follow a full-information optimal strategy to guarantee a payoff of Pot(B, γ) = B B+C1 (Thm. 3). So suppose that k > 1, and that the statement holds for every budget distribution of Min with a support strictly smaller than k. Max carefully chooses a small part x B of his budget and a part y C1 of Min s budget. He plays according to a full-information strategy f for initial budgets x and y. This can result in three possible outcomes: (O1) Min never uses more than y: the payoff is x x+y as in full-information games; (O2) Min reveals her true initial budget, thus Max can distinguish between the case that Min s budget is Ci and Cj, and by the induction hypothesis he can ensure an expected payoff of Pot(B x, γ) using his remaining budget; (O3) Min does not reveal her true initial budget and spends more than y: Max s leftover budget is greater than B x and, for 1 j k, when Min s budget is Cj, she has Cj y, and Max re-starts the loop by selecting a new x. We show that Max can choose x and y in a way that guarantees that the payoffs obtained in the ﬁrst two outcomes are greater than the desired payoff Pot(B, γ) ϵ. Also, when outcome O3 occurs, the potential does not decrease, and eventually either O1 or O2 occur. Formally, we describe a sequence (πi, Bi, γi)0 i m of conﬁgurations comprising of a history πi consistent with every strategy (gj)1 j k, the budget Bi of Max after πi, and the budget distribution γi of Min with supp(γi) = {Ci 1, Ci 2, . . . , Ci k} following πi. Tuple i represents the budget and budget distribution of the players following i 1 choices of outcome O3. Let λ = 1 ϵ

2 and ρ = 1 Pot(B,γ) 1. We start with (π0, B0, γ0) = (v, B, γ) with v an initial vertex, and we show recursively how Max can update this tuple while ensuring that the following four properties are satisﬁed: (P1) The history πi is consistent with every (gj)1 j k; (P2) Max spends his budget sufﬁciently slowly: Bi λi B; (P3) Min spends her budget sufﬁciently fast: Ci j Cj ρ (1 λi)B for every 1 j k; (P4) The potential never decreases: Pot(Bi, γi) Pot(B, γ). Note that for the initial tuple (π0, B0, γ0) = (v, B, γ), these are trivially satisﬁed. Moreover, Property P3 implies an upper bound on i, that is, outcome O3 can happen only ﬁnitely many times: limi Ci 1 limi C1 ρ (1 λi)B = C1 ρ B = B + C1 B Pot(B,γ) = 1 1 B+C1

1 Pk j=1 γ(Cj) 1 B+Cj which is negative since C1 < C2 < . . . <

Ck, yet a negative Ci 1 means that Min illegally bids higher than her available budget. We now deﬁne the choices xi and yi for each i N, and show that they satisfy the properties described above. Let

xi = ϵ 2 λi B and yi = ρ xi. For initial budgets xi and yi, let fi be a Max strategy whose payoff is greater than xi xi+yi ϵ. Max follows fi as long as Min spends at most yi. Let (ψj)1 j k be plays such that for each 1 j k (1) the play πiψj is consistent with the strategy gj; (2) Max plays according to fi along ψj; (3) ψj stops when Min spends more than yi, and is inﬁnite if she never does. In the full version, we consider three possible cases, depending on whether the paths ψj are ﬁnite or inﬁnite, and whether they are distinct or identical. If they are all inﬁnite, or there are at least two distinct ones, we show that Max immediately has a way to obtain the desired payoff. If they are all identical and ﬁnite, we show that, while Max cannot immediately get the desired payoff, he can go to the next step by setting φi+1 = φiψ1, and restarting.

Discussion and Future Work We initiate the study of partial-information bidding games, by studying games with partially-observed budgets. For mean-payoff games, we show a complete picture in stronglyconnected games for the partially-informed player, which is the more important case in practice. By identifying the value for the fully-informed player in the bowtie game, we show that the value in mean-payoff bidding games does not necessarily exist when restricting to pure strategies. We discuss open problems in this model. First, we focus on games played on strongly-connected graphs. Reasoning about such games is the crux of the solution to general full-information bidding games. We thus expect that our results will be key in the solution of partial-information bidding games on general graphs. This extension, however, is not straightforward as in the full-information setting, and we leave it as an open question. Second, we identify the value of the fully-informed player in the bowtie game G . Reasoning about G was the crux of the solution to general stronglyconnected full-information bidding games. In fact, the same technique was used to lift a solution for G to general strongly-connected games under all the previously-studied bidding mechanisms. In partial-information games, however, this technique breaks the intricate analysis in the proof of Thm. 7. Again, we expect a solution to the bowtie game to be a key ingredient in the solution to general stronglyconnected games, and we leave the problem open. Finally, we showed that the value does not necessarily exist under pure strategies. We leave open the problem of developing optimal mixed strategies for the players. This work is part of a research that combines formal methods and AI including multi-agent graph games (Alur, Henzinger, and Kupferman 2002), logics to reason about strategies (Chatterjee, Henzinger, and Piterman 2010; Mogavero et al. 2014) and in particular, their application in auctions (Mittelmann et al. 2022), enhancing networkformation games with concepts from formal methods (e.g., (Avni, Kupferman, and Tamir 2016)), and many more.

Acknowledgments This research was supported in part by ISF grant no. 1679/21, by the ERC Co G 863818 (For M-SMArt), and

the European Union s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 665385.

References Aghajohari, M.; Avni, G.; and Henzinger, T. A. 2021. Determinacy in Discrete-Bidding Inﬁnite-Duration Games. Log. Methods Comput. Sci., 17(1). Alur, R.; Henzinger, T. A.; and Kupferman, O. 2002. Alternating-time temporal logic. J. ACM, 49(5): 672 713. Aumann, R. J.; Maschler, M.; and Stearns, R. E. 1995. Repeated games with incomplete information. MIT press. Avni, G.; and Henzinger, T. A. 2020. A Survey of Bidding Games on Graphs. In Proc. 31st CONCUR, volume 171 of LIPIcs, 2:1 2:21. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. Avni, G.; Henzinger, T. A.; and Chonev, V. 2019. Inﬁniteduration Bidding Games. J. ACM, 66(4): 31:1 31:29. Avni, G.; Henzinger, T. A.; and Ibsen-Jensen, R. 2018. Inﬁnite-Duration Poorman-Bidding Games. In Proc. 14th WINE, volume 11316 of LNCS, 21 36. Springer. Avni, G.; Henzinger, T. A.; and Žikeli c, Ð. 2021. Bidding mechanisms in graph games. J. Comput. Syst. Sci., 119: 133 144. Avni, G.; Ibsen-Jensen, R.; and Tkadlec, J. 2020. All-Pay Bidding Games on Graphs. In Proc. 34th AAAI, 1798 1805. AAAI Press. Avni, G.; Jecker, I.; and Žikeli c, Ð. 2021. Inﬁnite-Duration All-Pay Bidding Games. In Proc. 32nd SODA, 617 636. Avni, G.; Kupferman, O.; and Tamir, T. 2016. Networkformation games with regular objectives. Inf. Comput., 251: 165 178. Borel, E. 1921. La théorie du jeu les équations intégrales á noyau symétrique. Comptes Rendus de l Académie, 173(1304 1308): 58. Chatterjee, K.; Henzinger, T. A.; and Piterman, N. 2010. Strategy logic. Inf. Comput., 208(6): 677 693. Condon, A. 1992. The Complexity of Stochastic Games. Inf. Comput., 96(2): 203 224. Develin, M.; and Payne, S. 2010. Discrete bidding games. Electron. J. Comb., 17(1). Lazarus, A. J.; Loeb, D. E.; Propp, J. G.; Stromquist, W. R.; and Ullman, D. H. 1999. Combinatorial Games under Auction Play. Games and Economic Behavior, 27: 229 264. Lazarus, A. J.; Loeb, D. E.; Propp, J. G.; and Ullman, D. 1996. Richman Games. Games of No Chance, 29: 439 449. Meir, R.; Kalai, G.; and Tennenholtz, M. 2018. Bidding games and efﬁcient allocations. Games Econ. Behav., 112: 166 193. Mittelmann, M.; Maubert, B.; Murano, A.; and Perrussel, L. 2022. Automated Synthesis of Mechanisms. In Proc. 31st IJCAI, 426 432. ijcai.org. Mogavero, F.; Murano, A.; Perelli, G.; and Vardi, M. Y. 2014. Reasoning About Strategies: On the Model-Checking Problem. ACM Trans. Comput. Log., 15(4): 34:1 34:47.

Peres, Y.; Schramm, O.; Shefﬁeld, S.; and Wilson, D. B. 2009. Tug-of-war and the inﬁnity Laplacian. J. Amer. Math. Soc., 22: 167 210. Pnueli, A.; and Rosner, R. 1989. On the Synthesis of a Reactive Module. In Proc. 16th POPL, 179 190. ACM Press. Puterman, M. L. 2005. Markov Decision Processes: Discrete Stochastic Dynamic Programming. New York, NY, USA: John Wiley & Sons, Inc. ISBN 978-0-471-72782-8. Raskin, J.; Chatterjee, K.; Doyen, L.; and Henzinger, T. A. 2007. Algorithms for Omega-Regular Games with Imperfect Information. Log. Methods Comput. Sci., 3(3). Reif, J. H. 1984. The Complexity of Two-Player Games of Incomplete Information. J. Comput. Syst. Sci., 29(2): 274 301. Wulf, M. D.; Doyen, L.; and Raskin, J. 2006. A Lattice Theory for Solving Games of Imperfect Information. In Proc. 9th HSCC, volume 3927 of LNCS, 153 168. Springer.