# optimal_dynamic_auctions_are_virtual_welfare_maximizers__42f5ddbb.pdf

The Thirty-Third AAAI Conference on Artiﬁcial Intelligence (AAAI-19)

Optimal Dynamic Auctions Are Virtual Welfare Maximizers

Vahab Mirrokni Google Research mirrokni@google.com

Renato Paes Leme Google Research renatoppl@google.com

Pingzhong Tang Tsinghua University kenshinping@gmail.com

Song Zuo Google Research szuo@google.com

We are interested in the setting where a seller sells sequentially arriving items, one per period, via a dynamic auction. At the beginning of each period, each buyer draws a private valuation for the item to be sold in that period and this valuation is independent across buyers and periods. The auction can be dynamic in the sense that the auction at period t can be conditional on the bids in that period and all previous periods, subject to certain appropriately deﬁned incentive compatible and individually rational conditions. Perhaps not surprisingly, the revenue optimal dynamic auctions are computationally hard to ﬁnd and existing literatures that aim to approximate the optimal auctions are all based on solving complex dynamic programs. It remains largely open on the structural interpretability of the optimal dynamic auctions. In this paper, we show that any optimal dynamic auction is a virtual welfare maximizer subject to some monotone allocation constraints. In particular, the explicit deﬁnition of the virtual value function above arises naturally from the primal-dual analysis by relaxing the monotone constraints. We further develop an ironing technique that gets rid of the monotone allocation constraints. Quite different from Myerson s ironing approach, our technique is more technically involved due to the interdependence of the virtual value functions across buyers. We nevertheless show that ironing can be done approximately and efﬁciently, which in turn leads to a Fully Polynomial Time Approximation Scheme of the optimal dynamic auction.

1 Introduction Recently, the problem of designing multi-period dynamic mechanisms has been shown to be both theoretically challenging and practically important. In particular, when running a sequence of repeated auctions on online advertising platforms, using dynamic auctions optimized across different time periods could potentially bring signiﬁcant gains both in terms of revenue and social welfare. The power of

The authors thank the anonymous reviewers for their helpful comments. This paper is supported in part by the National Natural Science Foundation of China Grant 61561146398, a China Youth 1000-talent Program, and an Alibaba Innovative Research Program. A full version of this paper is available at https://ssrn.com/abstract=3282600. Copyright c 2019, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

dynamic mechanisms has been investigated by a number of recent papers (Bergemann and V alim aki 2002; Parkes and Singh 2004; Cavallo 2008; Athey and Segal 2013; Kakade, Lobel, and Nazerzadeh 2013; Pai and Vohra 2013; Pavan, Segal, and Toikka 2014; Devanur, Peres, and Sivan 2015; Balseiro, Mirrokni, and Paes Leme 2016; Chawla et al. 2016; Bergemann, Castro, and Weintraub 2017; Balseiro et al. 2017; Balseiro, Mirrokni, and Leme 2017; Lobel and Paes Leme 2017; Shen, Wang, and Zuo 2018; Balseiro et al. 2019). We refer to Bergemann and Said (2011) and Bergemann and V alim aki (2017) for comprehensive surveys on the subject. In particular, we consider a setting where a seller repeatedly interacts with a set of buyers and sells one item per period. The value of each buyer for the item at each period are independently drawn from commonly known prior distributions (no need to be identical) at the beginning of that period. The seller is allowed to sell the item of each period not only depending on the bids submitted in the current period, but also the histories, i.e., all the bids submitted in past periods. In the meanwhile, the seller must guarantee the dynamic auctions to be ex-post individual rational the cumulative utility for each buyer is always positive, and dynamic incentive compatible bidding truthfully (i.e., submitting private values as bids) to the auction is optimal for each buyer by taking into consideration the effect of current bids on future outcomes. Even though dynamic mechanisms can be much more effective in maximizing revenue and social welfare (Jackson and Sonnenschein 2007; Papadimitriou et al. 2016), they have not been widely adopted in practice. The main issue of implementing dynamic mechanisms in practice is their high complexity. The complexity induced by the exponentially growing design space makes it difﬁcult to solve or even to describe such mechanisms. A series of recent work has made progress to resolve the complexity issue described above. For example, Ashlagi, Daskalakis, and Haghpanah (2016) and Mirrokni et al. (2018b) show that it is enough for the optimal dynamic auction to depend on scalar summaries of histories instead of the full history. However, the approximately optimal mechanisms described in these papers are solutions of complex dynamic programs that are written into a large table. It is therefore very difﬁcult to understand the structure of these

mechanisms. It remains open whether there is an intuitive structural characterization of the optimal dynamic auction. In this paper, we show that the exact optimal dynamic auction has a very simple structural interpretation: the optimal dynamic auction in each period is a second price auction on a certain appropriately deﬁned virtual value space.1 More speciﬁcally, such virtual values (before ironing23) are quite similar to Myerson s virtual value (Myerson 1981), i.e., they have the form of linear combinations of private values and Myerson s virtual values (before ironing). However, just like Myerson s auction, to make the virtual welfare4 maximizing allocation rule monotone, one need to ﬁrst iron the virtual values. Unlike the ironing in Myerson s auction, the ironing step in our case is interdependent across the values of different buyers. In other words, one s virtual value after ironing not only depends on his/her own value, but also on other buyers values. Although the ironing step here is not as simple as the ironing in Myerson s auction, its computation is still efﬁcient for constant many buyer cases. Moreover, we provide a Fully Polynomial Time Approximation Scheme to compute the virtual values for any period of the dynamic auction given the histories so far, and such virtual values induce a dynamic auction with revenue arbitrarily close to the optimal.

Techniques There are two main techniques used in our analysis for optimal dynamic auctions. The ﬁrst is the socalled bank account mechanisms which are a subset of dynamic auctions with simple structures and can achieve the optimal revenue of all the dynamic auctions (Mirrokni et al. 2016a; 2016b; 2018a; 2018b). Brieﬂy speaking, a bank account mechanism keeps a state for each buyer as the summary of the history of each buyer, which is a scalar called balance. Each period depends on the previous periods through the vector of buyer balances. With the bank account framework, the designer only needs to specify single-period auctions that are single-period incentive compatible together with a valid balance update policy. In other words, the design of an entire dynamic auction breaks into the design of a series of single-period auctions and a balance update policy. The decomposition greatly simpliﬁes the problem and enables clean mathematical programs for each period. The second is a primal-dual analysis and a sensitivity analysis of the parametric mathematical programs. In fact, primal-dual analysis is commonly used in economic studies (Nisan et al. 2007, Chapter 5; Daskalakis, Deckelbaum, and Tzamos; Cai, Devanur, and Weinberg; Cai and

1A virtual value function is a map from buyer value space to real numbers. A virtual value is the corresponding real number of some private value of the buyer. 2Informally, the ironing of a virtual value function is an operation that maps the virtual value function to another virtual value function (we call an ironed virtual value function) while preserving the expectation. For the formal deﬁnition, see Deﬁnition 3.1. 3Some recent works on virtual value and ironing (Elkind 2007; Roughgarden and Schrijvers 2016). 4Virtual welfare is the sum of virtual values of the buyers who get the item. For expected virtual welfare, the expectation is taken over the randomness from the allocation rules.

Zhao; Daskalakis, Deckelbaum, and Tzamos 2013; 2016; 2017; 2017). In particular, for our problem, we can prove that these programs are convex and satisfy the Slater s conditions. Hence the solution is optimal if and only if the KKT conditions are satisﬁed. From these conditions, we show that the auction in each period maximizes some virtual welfare and we also discover the exact form of the virtual values with partial relaxation on the monotonicity constraints of allocation. Furthermore, we show that adding these relaxed constraints back to the program indeed corresponds to ironing the virtual values. As we mentioned previously, the ironing step here is interdependent across the values of the buyers and hence different from the ironing step in Myerson s auction. To resolve this difﬁculty, we show a method to algorithmically accomplish the ironing step.

2 Preliminaries We study a setting where a seller repeatedly interacts with k buyers selling one item per period over T periods. The value of each buyer i [k] for the item in period t [T] is vt i V. If xt i [0, 1] represents the probability that buyer i is allocated the item in period t, his utility is vt i xt i. Throughout this paper, (i) we use subscripts as the indices of buyers, bold fonts for vectors of all the buyers (i.e., a = (a1, . . . , ak)), and subscript i for the vector except the i-th element (i.e., a i = (a1, . . . , ai 1, ai+1, . . . , ak)); (ii) we use superscripts as the indices of periods and a1..t to denote the sequence a1, . . . , at. The values vt i are assumed to be drawn from independent distributions F t i .56 The distributions F t i are assumed to be common knowledge but the realizations of the random variables are initially unknown for both the buyers and the designer. At each period, the following events happen:

1. each buyer i learns his value vt i F t i ;

2. each buyer i reports value ˆvt i to the designer;

3. the designer implements an outcome xt [0, 1]k and charges the buyers pt;

4. each buyer accrues utility ut i = vt i xt i pt i.

A dynamic mechanism is then described in terms of a pair of maps for each period, which associate the history of reports ˆv1..t = (ˆv1, ˆv2, . . . , ˆvt) to an outcome xt and payment pt:

Outcome: xt : Vkt [0, 1]k, Payment: pt : Vkt Rk.

Finally we deﬁne: ut i(vt i; ˆv1..t) = vt i xt i(ˆv1..t) pt i(ˆv1..t).

2.1 Dynamic Incentive Compatibility A mechanism is incentive compatible if it provides incentives for buyers to reveal their true types in each iteration. Such conditions for dynamic mechanisms can be easily deﬁned by backward induction: in the last period, regardless

5In practice, it is still fair to assume that the values of the buyers (advertisers) are independent conditional on each particular inventory and cookie. 6If the distributions are not independent, then the weak version of truthfulness still holds: truthful if all other buyers bid truthfully.

of the history so far and other buyers reports, it should be incentive compatible for each buyer to report his true value:

v T i = argmaxˆv T i u T i (v T i ; ˆv1..T ), i, ˆv1..T 1, ˆv T i, v T i .

To simplify notations, from now on we will omit the forall quantiﬁcation and assume all expressions are quantiﬁed as for-all in its free variables. For the next-to-last-period, it should be incentive compatible for the buyer to report his true value given that he will report his true value in the following period:

v T 1 i = argmaxˆv T 1 i u T 1 i (v T 1 i ; ˆv1..T 1)

+ Ev T i [u T i (v T i ; ˆv1..T 1, v T i , ˆv T i)].

Proceeding by backward induction, we require that:

vt i = argmaxˆvt i ut i(vt i; ˆv1..t) + U t i (ˆv1..t i |ˆv1..T i ) (DIC)

where the second term is the continuation utility, i.e., the expected utility obtained from the subsequent periods of the mechanism assuming the buyer reports truthfully:

U t i (ˆv1..t i |ˆv1..T i ) := E vt+1..T i

τ=t+1 uτ i (vτ i ; ˆv1..t, vt+1..τ i , ˆvt+1..τ i )

A well-known fact in dynamic mechanism design is that DIC implies that buyer i s expected overall utility U 0 i (ˆv1..T i ) is maximized by reporting truthfully in each period.

2.2 Ex-Post Individual Rationality Another desirable constraint is ex-post individual rationality which says that a buyer should derive non-negative utility from the mechanism for every realization of the values: PT t=1 ut i(vt i; v1..t) 0 (e P-IR)

We focus on the problem of maximizing revenue subject to DIC, e P-IR, and feasibility constraints:

max REV = E[PT t=1 Pk i=1 pt i(v1..t)]

s.t. (DIC), (e P-IR), and feasibility: Pk i=1 xt i(v1..t) 1

2.3 Bank Account Mechanisms (BAM) The space of mechanisms satisfying DIC and e P-IR is very broad and unstructured. We restrict our attention to a subclass of dynamic mechanisms introduced by Mirrokni et al. (2016a) called bank account mechanisms. The mechanisms are simple, dynamic incentive compatible by design and have the following notable features: Lemma 2.1 ((Mirrokni et al. 2018b)). Given any dynamic mechanism satisfying DIC and e P-IR, there exists a bank account mechanism with at least the same revenue and welfare. In particular, for any given setting, there is an optimal mechanism in the form of a bank account mechanism. A bank account mechanism keeps a state for each buyer, which is a scalar called balance. Each period depends on previous ones through the vector of balances. Another main feature is that in this framework, the designer needs to specify single-period auctions that are single-period incentive

compatible together with a valid balance update policy. Once a valid balance update policy is in place, all the designer needs to worry about are single-period IC constraints. A bank account mechanism B is deﬁned in terms of the following functions for each period: A static single-period auction x B,t(vt, b), p B,t(vt, b) parameterized by a balance vector b Rk + that is (singleperiod) incentive-compatible for each b, i.e.:

vt i x B,t i (vt, b) p B,t i (vt, b)

vt i x B,t i (ˆvt i, vt i, b) p B,t i (ˆvt i, vt i, b) (IC)

Note that we do not require the mechanism to be (singleperiod) individually rational, while we require the buyer utility to be balance independent in expectation, i.e.:

Evt i F t i

h vt i x B,t i (vt, b) p B,t i (vt, b) i

is a non-negative constant not depending on b (BI)

A balance update policy b B,t(vt, b) which maps the previous balances and the reports to the current balances, satisfying the following balance update conditions:

0 b B,t i (vt, b) bi + vt i x B,t i (vt, b) p B,t i (vt, b) (BU)

Given the balance update functions, we can deﬁne bt : Vt Rk + recursively as:

b0 = 0 and b1(v1) = b B,1(v1, 0)

and bt(v1..t) = b B,t(vt, b B,t 1(v1..t 1))

which allows us to deﬁne a dynamic mechanism in the standard sense as:

xt(v1..t) = x B,t(vt, bt 1), pt(v1..t) = p B,t(vt, bt 1).

In what follows we will abuse notations by dropping the superscript B and refer to xt(v1..t) and xt(vt, bt 1) interchangeably. One important theorem from previous studies is that any bank account mechanism satisﬁes DIC and e P-IR. Lemma 2.2 ((Mirrokni et al. 2018b)). Any bank account mechanism satisfying IC, BI, and BU is DIC and e P-IR.

3 The Structure of Optimal BAMs In this section, we show our ﬁrst main result that identiﬁes the underlying structure of the optimal bank account mechanism (hence also the optimal dynamic auctions). Interestingly, the dynamic auctions with the structure can be interpreted as an ironed virtual welfare maximizing auction, where the ironed virtual value is deﬁned as follows. Deﬁnition 3.1 (Ironing). The ironing operation on the virtual value functions of all buyers transfers these virtual value functions into some other virtual value functions (called ironed virtual value functions) while preserving the following properties: Monotonicity: The allocation rule maximizes the ironed virtual welfare is monotone: for each buyer i, the allocation probability is weakly increasing in vi for ﬁxed v i.

Limited transfer: The expectation of virtual value conditional on each allocation equivalence class is unchanged. An allocation equivalence class of buyer i with any given v i is a maximal subset of his/her private values where the allocation probability to him/her is constant. The allocation rule here must maximize the ironed virtual welfare. First order dominance: The ironed virtual value is ﬁrst order dominated by the virtual value before ironing. Informally, we have the following theorem: Theorem 3.2 (Informal). Any revenue optimal bank account mechanism is maximizing some virtual value (after ironing) for each period. We will restate the formal version after introducing all necessary notations (Theorem 3.8) and the speciﬁc form of the virtual value (before ironing) will be provided.

3.1 Formalizing the subprogram for each period We start with the subproblem of optimizing period t while all other periods are ﬁxed. In particular, the problem can be formalized as a convex program. Lemma 3.3 (Convex program). For each period t, any valid balance bt, and ﬁxed expected utility for period t, if mechanism is ﬁxed for the remaining periods (t + 1 to T), then the optimal auction for period t can be solved via a convex program. To proof this statement smoothly, we show how to formalize the program step by step in the rest of this subsection rather than putting everything into a single proof environment. For the ease of presentation, we will hide the superscript t while focusing on a single period. Let b be the bank account balance given at the beginning of this period and

ξi(v i) := Evi[vi x(v, b) p(v, b)] (1)

be the expected utility of buyer i in this period. Note that by (BI), it is independent of the bank account balance b, while it could be different for different bids from other buyers, i.e., v i. In fact, the ξ are some parameters we will need to determine later based on the distribution of b induced by the auctions in each period.

Objective The expected revenue acquired within this period can be computed by the expected social welfare minus the expected utility of all the buyers,

PERIODREV(b, ξ) = Ev [P

i(vi xi(v, b) ξi(v i))] .

Besides the expected revenue from the current period, we also need to include the expected revenue for future periods in the objective. We then use g(b |M) to denote the expected total revenue the seller can collect from all upcoming periods by following a given bank account mechanism M in these periods, where b is the vector of the bank account balance at the end of the current period. In fact, b = b+ b(v) is a function of buyers bids v in this period, where b(v) are the changes in the bank accounts. Then the expected revenue since current period is,

REV(b, ξ) = Ev P

i(vixi(v, b) ξi(v i)) + g(b |M) .

Constraints Now we proceed to the constraints of the optimization. According to Lemma 2.2, the mechanism must satisfy constraint (IC), (BI), and (BU). Note that (BI) is guaranteed by the deﬁnition of ξi(v i), we then formalize constraint (IC) and (BU). For (IC), as we won t introduce the payment variables, it is enough to require monotone allocation rules only, i.e., (IC) xi(v, b) is monotone in vi for ﬁxed v i. (2) Since by the Envelope theorem (Rochet 1985), (vi xi(v, b) pi(v, b))/ vi = xi(v, b)

= ui(v, b) = ui(0, v i, b) + R vi 0 xi(s, v i, b)ds. In other words, with p i(v, b) := pi(v, b) pi(0, v i, b) and u i(v, b) := vi xi(v, b) p i(v, b), the utility of buyer i in this period is fully determined by the allocation function xi( , b) and the minimum payment pi(0, v i, b): u i(v, b) = ui(v, b) + pi(0, v i, b) = ui(v, b) ui(0, v i, b) + 0 xi(0, v i, b)

= R vi 0 xi(s, v i, b)ds. Recall that for (BU), we require (i) the increase of balance cannot be more than the utility obtained in the current period and (ii) the updated balance must be nonnegative. Note that by the deﬁnition of ξi(v i) (1): Evi[u i(v, b) pi(0, v i, b)] = Evi[ui(v, b)] = ξi(v i) pi(0, v i, b) = Evi[u i(v, b)] ξi(v i). Hence on one hand, for (i) and (ii), we can rewrite it as: 0 bi + bi(v) bi + ui(v, b)

bi bi(v) ui(v, b) = u i(v, b) pi(0, v i, b) = u i(v, b) Evi[u i(v, b)] + ξi(v i). On the other hand, to ensure the set of the possible balance increment bi(v) satisfying (i) and (ii) is not empty, we need bi + ui(v, b) 0. Note that bi is given and ui(v, b) reaches the minimum when vi = 0, it is equivalent to: bi + ui(0, v i, b) = bi pi(0, v i, b) 0 Evi[u i(v, b)] bi + ξi(v i). In summary, (BU) is equivalent to the follows: Evi[u i(v, b)] bi + ξi(v i) bi bi(v) u i(v, b) Evi[u i(v, b)] + ξi(v i) (3)

Finally, we need the feasibility constraint of the allocation rule that the total allocation of each item is no more than 1: P

i xi(v, b) 1. (4) Therefore, combining the objective REV(b, ξ) and the constraints (2), (3), and (4), for given b and g, the optimization program for current period can be written as: max REV(b, ξ) (5)

subj.t. xi(v) is monotone in vi for any ﬁxed v i Evi[u i(v)] bi + ξi(v i), i, v i bi bi(v) u i(v) Evi[u i(v)] + ξi(v i), i, v P

i [n] xi(v) 1, v

xi(v) 0, bi(v) 0, i, v where we hide the b in the input to xi(v, b) and u i(v, b) to emphasize that b is extraneous.

Simpliﬁcation We then simplify the optimization program for optimal bank account mechanisms. One key observation is that with ﬁxed ξ (for all periods), the optimization with any ﬁnite horizon can be solved via backward induction. In the program for the last period T, the expected revenue for the future, g T , is always zero. For any parameter b of the last period, the optimal value of the program deﬁnes a function with respect to the balance vector b, which, in fact, is the expected revenue for the last period where the mechanism in the last period M T is optimal (for given b and ξ). In other words, the optimal solution of the program for period T deﬁnes the function g T 1(b|M T ), where the mechanism for period T is ﬁxed to be optimal. Similarly, the optimal mechanism for each period M t and the corresponding gt(b|M t+1..T ) functions in the program for each period can be determined by backward induction. From now on, by omitting the mechanism being conditional on in gt, we mean the gt function is conditional on the optimal M t+1..T , i.e.,

gt(b) := gt(b|M t+1..T ) = max M gt(b|M). Note that the maximum always exists because the domain of the mechanisms is compact and the revenue is a continuous function of the mechanism. Lemma 3.4. gt(b) is weakly increasing. Due to limited space, we omit the proof of Lemma 3.4. Hence we can eliminate variable b(v) from the original program (5): Since gt(b) is weakly increasing, it is without loss of generality to enforce bi(v) = u i(v) Evi[u i(v)] + ξi(v i), i, v. Meanwhile, by the following Myerson s lemma, we can further rewrite Evi[u i(v)]. Lemma 3.5 ((Myerson 1981)). For any incentive compatible single item auction, the expected payment of buyer i equals to the expected (Myerson s) virtual welfare contribution of buyer i: i, v i, Evi[pi(v)] = Evi[(vi ϑi(v)) xi(v)],

where ϑi(v) = vi (1 Fi(vi))/fi(vi).7

Therefore,8

Evi[u i(v)] = Evi[vi xi(v) p i(v)] (6) = Evi[vi xi(v) (vi ϑi(v)) xi(v)] = Evi[ϑi(v) xi(v)]. Thus, program (5) can be simpliﬁed as follows, where bi(v), u i(v), and u i(v i) are notations, not variables. max REV(b, ξ) (7) = Ev [P

i(vi xi(v) ξi(v i)) + g( b(v) + b)] subj.t. xi(v) is monotone in vi for ﬁxed v i u i(v i) := Evi[u i(v)] = Evi[ϑi(v)xi(v)] bi + ξi(v i), i, v i bi(v) = u i(v) u i(v i) + ξi(v i), i, v P

i [n] xi(v) 1, v

xi(v) 0, i, v

7Fi must be absolutely integrable. 8ϑi(v) is called virtual value for utility and equation (6) is from (Hartline and Roughgarden 2008, Lemma 2.6).

Convexity We remains to show that the program is convex. In fact, we have the following lemma.

Lemma 3.6. gt(b) is and concave.

We omit the proof of Lemma 3.6 here. Since gt(b) is concave, the objective function (7) is also concave. Meanwhile, all the constraints are linear, so the program is a convex program (note that a standard convex program is minimizing a convex objective function or maximizing a concave objective function).

3.2 Duality, virtual values, and ironing

We ﬁrst consider the optimal solution to the program with the monotonicity constraint (2) on x(v) relaxed. In fact, in the full version of this paper, we generalize the analysis to the original program. Although the optimal solution to the relaxed program is not a feasible solution, the analysis does capture the most interesting insight to this problem and provides the explicit form of the virtual values (before ironing). In the full version, we show that adding the monotonicity constraint (2) back to the program corresponds to applying the ironing operation on the virtual values. For the following relaxed program, let λi(v i) be the Lagrange multiplier of the constraint Evi[ϑi(v)xi(v)] bi + ξi(v i) and µ(v) be the Lagrange multiplier of the feasibility constraint (4).

max REV(b, ξ) (8) subj.t. Evi[ϑi(v)xi(v)] bi + ξi(v i), v i P

i [n] xi(v) 1, xi(v) 0, i, v

Then the Lagrange L := L(x(v), λi(v i), µ(v)) is

i(vi xi(v) ξi(v i)) + g( b(v) + b)] P

i,v i λi(v i) (Evi[ϑi(v)xi(v)] bi ξi(v i))

i xi(v) 1) .

Note that all constraints in program (8) are linear, therefore the Slater s condition is satisﬁed and the KKT conditions (Boyd and Vandenberghe 2004, Chapter 5) are necessary and sufﬁcient for any optimal solution to (8). In particular, the explicit form of the virtual values can be derived from the KKT conditions. Let gi denote the partial derivative of g with respect to the i-th dimension, i.e., gi(b) = g(b)/ bi. Let α and β be deﬁned as follows,

αi(v) := 1 + gi( b(v) + b)

βi(v i) := λi(v i)

f(v i) + Evi[gi( b(v) + b)].

Lemma 3.7 (Virtual welfare maximizer). Any optimal solution to (8) must maximize the expected virtual welfare, where the virtual value function φi(v) is given as follows:

φi(v) = αi(v) vi βi(v i) ϑi(v).

We refer readers to the full version for the detailed proof.

Ironing As we mentioned at the beginning of this subsection, the optimal solution to the original program (7) must maximize the expected ironed virtual welfare deﬁned according to the ironed virtual values φi(v). Moreover, the ironed virtual value φi(v) is the transformation of φi(v) according to the ironing rules (Deﬁnition 3.1). We omit the details of the analysis here and incorporate the conclusion in the formal version of our ﬁrst main theorem (Theorem 3.8). We refer readers to the full version for the complete proof.

3.3 Sensitivity analysis and the optimality across the periods So far we have determined the optimal auction for each period when the expected buyer utilities of each period ξ are ﬁxed. Now we show how to optimize the remaining parameters ξ through sensitivity analysis. Let R(ξ) denote the optimal revenue when the auctions in each period are optimal for the given ξ. Then we show how to determine the partial derivatives of R(ξ) via sensitivity analysis and hence enable the gradient descent algorithm to ﬁnd the optimal ξ . In particular, R(ξ) is a concave function.9 The standard sensitivity analysis (Boyd and Vandenberghe 2004, Chapter 5.6) indicates how much the optimal objective value of a program will change if some constraints become slightly looser or tighter. Such quantities usually have important physical meanings in economic setups. For example, consider gt(b) = max REVt(b, ξ) (we brought back the superscript t to distinguish the variables for different periods). By sensitivity analysis,10 we have:

gt i(b) = Evt[gt+1 i ( b(v) + b)] + P

vt i λt i(vt i).

Then gt i(b), by deﬁnition, is the marginal contribution of the balance of buyer i in period t to the expected revenue since the t-th period. Similarly, Evt[gt+1 i ( b(v)+b)] is the marginal contribution of bi to the expected revenue since the (t + 1)-th period. Hence their difference, P

vt i λt i(vt i), is the marginal contribution of bi to the expected revenue of the t-th period. In particular, λt i(vt i) is the marginal contribution if the values of other buyers are vt i. Moreover, we can conclude the concrete form of the partial derivatives of R(ξ) by sensitivity analysis:

gt(b) ξt i(vt i) = f t(vt i) 1 + Evt i [gt+1 i ( b(v) + b)] + λt i(vt i)

= (βt i(vt i) 1)f t(vt i)

= R(ξ) ξt i(vt i) = Eb[gt(b)]

ξt i(vt i) = (Eb βt i(vt i) 1)f t(vt i),

where the expectation taken over b is computed by simulating the auctions in previous periods. In particular, ξ is selected optimally, if and only if:

Eb βt i(vt i) = 1 or ξt(vt i) = 0, Eb βt i(vt i) 1.

9Since by simply taking any convex combination of different ξ and ξ , the revenue obtained is also the convex combination of the revenue resulted by ξ and ξ . 10If gt is not differentiable at b, the right-hand-side is a subgradient of gt.

3.4 Summary As a summary, we formally restate Theorem 3.2. Theorem 3.8 (Formal). A bank account mechanism is optimal if and only if all the following conditions are satisﬁed: it satisﬁes all the basic constraints, (IC), (BI), (BU) and the feasibility constraint (4); its allocation rule maximizes the ironed virtual welfare, where the virtual value (before ironing) φ(v) has the following form:

φ(v) = αi(v)vi βi(v i)ϑi(v),

which can be seen as the combination of the Myerson s virtual value and the private value; ﬁnally, the expected utility of each period ξ is selected optimally, i.e.,

Eb βt i(vt i) = 1 or ξt(vt i) = 0, Eb βt i(vt i) 1.

4 An Algorithmic Approach to the Structure So far we showed that the optimal bank account mechanism maximizes the ironed virtual welfare in each period. Although the explicit form of the virtual values (before ironing) is given in Lemma 3.7 and the ironing rule is given in Deﬁnition 3.1, how to accomplish the ironing operation is still unknown. One of the major difﬁculty of the ironing comes from the fact that one s virtual value not only depends on his/her own private value, but also depends on the private values of other buyers through the term αi(v). In the presence of such interdependence across different buyers on virtual values, monotone virtual value function does not imply monotone allocation rules. In this section, we algorithmically resolve the difﬁculty of ironing. In particular, we show a Fully Polynomial Time Approximation Scheme (FPTAS) that can accomplish the ironing step for any constant many buyer cases and hence compute the ironed virtual values. Moreover, the bank account mechanism induced by the ironed virtual values computed is (multiplicatively) (1 ϵ)-approximately optimal in terms of revenue. We also emphasize that the hard core of computing the exact optimal solution is not directly from the ironing step but the step of approximating the concave functions gt(b). Note that gt(b) is a continuous function without closed forms and the main effort of this section is to show that we can arbitrarily approximate gt(b) with piece-wise linear functions and guarantee that (i) the number of pieces is polynomial in the input size and (ii) the ﬁnal result is approximately optimal. Recall the program (7), in particular, we bring back the superscripts t to emphasize the periods:

max gt 1(bt) = REV(bt, ξt)

i(vt ixt i(vt) ξt i(vt i)) + gt( bt(vt) + bt)

subj.t. xt i(vt i, vt i) xt i(vt i , vt i) 0, i, vt i, vt i > vt i Evt i [ut i ] = Evt i [ϑt i(vt)xt i(vt)] bt i + ξt i(vt i), vt i

bt i(vt) = ut i (vt) ut i(vt i) + ξt i(vt i), i, vt P

i [n] xt i(vt) 1, xt i(vt) 0, i, vt

In what follows, we formalize the FPTAS via dynamic programming to compute the optimal ironed virtual values for the discrete type case.

Theorem 4.1. The ironed virtual values of the optimal bank account mechanism can be computed through a dynamic programming based algorithm. Moreover, for any ϵ > 0, there is an FPTAS to achieve an ϵ-approximation (multiplicative) of the optimal revenue.

We ﬁrst outline the main idea of the algorithm: (i) for any ﬁxed ξ, compute the ironed virtual values of the approximately optimal bank account mechanism; (ii) compute the optimal ξ using gradient descent. Since we have shown that the revenue of the bank account mechanism is a concave function with respect to ξ, the second step is standard and can be done with polynomially many queries to the (approximately) optimal revenue as a function of ξ. In what follows, we will focus on the ﬁrst step. To compute the ironed virtual value, we need to solve the dual program of (7), which, of course, is equivalent to solve the primal because the strong duality holds. Note that when ξ and bt are ﬁxed, (7) is a standard convex program with polynomially many linear constraints. Hence the optimal solution to its dual could be computed efﬁciently given oracle accesses to the concave function gt. Then by the deﬁnition of gt 1, the value of the optimal solution is gt 1(bt). Therefore, the concave function gt can be evaluated recursively for each t and hence the ironed virtual values of the optimal bank account mechanism can be computed via standard dynamic programming as well. However, the computation of the entire dynamic programming is not directly efﬁcient: (i) if each gt is computed recursively upon every query, the depth of the recursion could be up to T and hence the total number of queries required would be exponential in T; (ii) if each gt is computed once for all possible b so that any further queries of gt can be answered from precalculated values, then the number of input points where gt need to be computed would be unbounded. The key step to resolve these issues is to approximate the concave functions gt(b) by piece-wise linear functions each with at most polynomially many pieces. In addition, each of the piece-wise linear functions can be further expressed as the minimum of a set of afﬁne functions. Hence both the original convex program (7) at each period t and its dual can be approximated by a poly-size linear program. The following lemma ensures that each gt can be well approximated by a piece-wise linear function.

Lemma 4.2. For all t [T], gt can be κ-approximated by two concave piece-wise linear functions, gt and gt:

b, gt(b) gt(b) gt(b) gt(b) + κ maxb gt(b ).

Moreover, each of gt and gt can be written as the minimum of at most polynomially many pieces.

With such approximations to the concave function gt, we can approximately solve the convex program (7) by solving the following linear program, where each occurrence of gt

is replaced by gt: max ht 1(bt) := Evt P

i(vt i xt i(vt) ξt i(vt i))

+ gt( bt(vt) + bt) (9)

subj.t. xt i(vt i, vt i) xt i(vt i , vt i) 0, i, vt i, vt i > vt i Evt i [ut i ] = Evt i [ϑt i(vt)xt i(vt)] bt i + ξt i(vt i), vt i

bt i(vt) = ut i (vt) ut i(vt i) + ξt i(vt i), i, vt P

i [n] xt i(vt) 1, xt i(vt) 0, i, vt

gt( bt(vt) + bt)

αl ( bt(vt) + bt) + βl, l, vt (10) As we mentioned previously, in the last constraint (10), we assume that the function gt can be expressed by the minimum of a set of afﬁne functions, i.e., gt(b) = minl L αl b + βl.

Here we slightly abuse the notation of gt( bt(vt) + bt) as variables in the linear program. Note that having the constraints (10) in the linear program (9) sufﬁces to ensure that the variable values agree with the corresponding function values. Because on one hand, by the constraints, each variable is no more than the corresponding afﬁne functions; on the other hand, since the coefﬁcients of these variables are always positive in the objective, for each of the variables, at least one of the inequalities in (10) must be binding. Let ht 1(b) denote the optimal value of the linear program (9) when bt = b. Similarly, we can deﬁne ht 1 by replacing all gt with gt in (9). The following lemma shows that ht 1 and ht 1 are lower and upper bounds of gt 1. Lemma 4.3. ht 1 and ht 1 are concave and for any b,

ht 1(b) gt 1(b) ht 1(b)

and ht 1(b) ht 1(b) + maxb ( gt(b ) gt(b )). Proof of Lemma 4.3. Let x , x , and x be the optimal solutions for ht 1(b), gt 1(b), and ht 1(b), respectively. Denote these three programs as P, P, and P, respectively. Note that x is feasible in P, hence by the optimality of x , P(x ) P(x ), where P(x) is the objective value of program P on x. On the other hand, for any x that is feasible to both P and P, we have, P(x) P(x).

Because in the objectives, gt 1 gt 1. Hence we conclude:

ht 1(b) = P(x ) P(x ) P(x ) = gt 1(b).

Similarly, we can prove that gt 1(b) ht 1(b). For the last inequality, denote δ = maxb ( gt(b ) gt(b )). Note that for the g variables in x , if we reduce all of them by δ to get x , x must be feasible to P and the objective value is reduced by at most δ, hence P(x ) P( x ) P( x ) δ. In other words, ht 1(b) ht 1(b) δ.

Even through both ht 1 and ht 1 are in fact piece-wise linear functions, but they cannot be directly used for the computation of period t 1, because they may have exponentially many pieces. However, since they are concave, we can apply Lemma 4.2 to get the lower bound of ht 1 and the upper bound of ht 1:

gt 1(b) ht 1(b) gt 1(b) ht 1(b) gt 1(b).

Therefore, we can compute g1, . . . , g T and g1, . . . , g T recursively, and then compute the ironed virtual values of the approximately optimal bank account mechanism for any ﬁxed ξ. Because ξ can then be optimized by using gradient descent, we are done with our algorithm.

References Ashlagi, I.; Daskalakis, C.; and Haghpanah, N. 2016. Sequential mechanisms with ex-post participation guarantees. In EC 16, 213 214. Athey, S., and Segal, I. 2013. An efﬁcient dynamic mechanism. Econometrica 81(6):2463 2485. Balseiro, S.; Lin, M.; Mirrokni, V.; Paes Leme, R.; and Zuo, S. 2017. Dynamic revenue sharing. In NIPS 17, 2678 2686. Balseiro, S.; Mirrokni, V.; Paes Leme, R.; and Zuo, S. 2019. Dynamic double auctions: Towards ﬁrst best. In SODA 19. Balseiro, S. R.; Mirrokni, V. S.; and Leme, R. P. 2017. Dynamic mechanisms with martingale utilities. forthcoming, Management Science. Balseiro, S.; Mirrokni, V.; and Paes Leme, R. 2016. Dynamic mechanisms with martingale utilities. Bergemann, D., and Said, M. 2011. Dynamic auctions. Wiley Encyclopedia of Operations Research and Management Science. Bergemann, D., and V alim aki, J. 2002. Information acquisition and efﬁcient mechanism design. Econometrica 70(3):1007 1033. Bergemann, D., and V alim aki, J. 2017. Dynamic mechanism design: An introduction. Technical report, Cowles Foundation for Research in Economics, Yale University. Bergemann, D.; Castro, F.; and Weintraub, G. 2017. The scope of sequential screening with ex post participation constraints. In EC 17, 163 164. Boyd, S., and Vandenberghe, L. 2004. Convex optimization. Cambridge university press. Cai, Y., and Zhao, M. 2017. Simple mechanisms for subadditive buyers via duality. In STOC 17, 170 183. Cai, Y.; Devanur, N. R.; and Weinberg, S. M. 2016. A duality-based uniﬁed approach to bayesian mechanism design. ACM SIGecom Exchanges 15(1):71 77. Cavallo, R. 2008. Efﬁciency and redistribution in dynamic mechanism design. In EC 08, 220 229. Chawla, S.; Devanur, N. R.; Karlin, A. R.; and Sivan, B. 2016. Simple pricing schemes for consumers with evolving values. In SODA 16, 1476 1490. Daskalakis, C.; Deckelbaum, A.; and Tzamos, C. 2013. Mechanism design via optimal transport. In EC 13, 269 286.

Daskalakis, C.; Deckelbaum, A.; and Tzamos, C. 2017. Strong duality for a multiple-good monopolist. Econometrica 85(3):735 767. Devanur, N. R.; Peres, Y.; and Sivan, B. 2015. Perfect bayesian equilibria in repeated sales. In SODA 15, 983 1002. Elkind, E. 2007. Designing and learning optimal ﬁnite support auctions. In SODA 07, 736 745. SIAM. Hartline, J. D., and Roughgarden, T. 2008. Optimal mechanism design and money burning. In STOC 08, 75 84. Jackson, M. O., and Sonnenschein, H. F. 2007. Overcoming incentive constraints by linking decisions. Econometrica 75(1):241 257. Kakade, S. M.; Lobel, I.; and Nazerzadeh, H. 2013. Optimal dynamic mechanism design and the virtual-pivot mechanism. Operations Research 61(4):837 854. Lobel, I., and Paes Leme, R. 2017. Dynamic mechanism design under positive commitment. Mirrokni, V.; Paes Leme, R.; Tang, P.; and Zuo, S. 2016a. Dynamic auctions with bank accounts. In IJCAI 16, 387 393. Mirrokni, V.; Paes Leme, R.; Tang, P.; and Zuo, S. 2016b. Optimal dynamic mechanisms with ex-post ir via bank accounts. ar Xiv preprint ar Xiv:1605.08840. Mirrokni, V.; Paes Leme, R.; Ren, R.; and Zuo, S. 2018a. Dynamic mechanism design in the ﬁeld. In WWW 18, 1359 1368. Mirrokni, V.; Paes Leme, R.; Tang, P.; and Zuo, S. 2018b. Non-clairvoyant dynamic mechanism design. In EC 18, 169 169. Myerson, R. B. 1981. Optimal auction design. Mathematics of operations research 6(1):58 73. Nisan, N.; Roughgarden, T.; Tardos, E.; and Vazirani, V. V. 2007. Algorithmic game theory, volume 1. Cambridge University Press Cambridge. Pai, M. M., and Vohra, R. 2013. Optimal dynamic auctions and simple index rules. Mathematics of Operations Research 38(4):682 697. Papadimitriou, C.; Pierrakos, G.; Psomas, C.-A.; and Rubinstein, A. 2016. On the complexity of dynamic mechanism design. In SODA 16, 1458 1475. Parkes, D. C., and Singh, S. 2004. An mdp-based approach to online mechanism design. Pavan, A.; Segal, I.; and Toikka, J. 2014. Dynamic mechanism design: A myersonian approach. Econometrica 82(2):601 653. Rochet, J.-C. 1985. The taxation principle and multi-time hamilton-jacobi equations. Journal of Mathematical Economics 14(2):113 128. Roughgarden, T., and Schrijvers, O. 2016. Ironing in the dark. In EC 16, 1 18. Shen, W.; Wang, Z.; and Zuo, S. 2018. Ex-post IR dynamic auctions with cost-per-action payments. In IJCAI 18, 505 511.