# Online Non-Monotone DR-Submodular Maximization

Nguyễn Kim Thắng, Abhinav Srivastav
IBISC, Univ. Evry, University Paris-Saclay, France
kimthang.nguyen@univ-evry.fr, abhinavsriva@gmail.com

Abstract. In this paper, we study fundamental problems of maximizing DR-submodular continuous functions, which have real-world applications in machine learning, economics, operations research and communication systems. This setting captures a subclass of non-convex optimization that admits both theoretical and practical guarantees. Here, we focus on minimizing regret for online arriving non-monotone DR-submodular functions over down-closed and general convex sets. First, we present an online algorithm that achieves a 1/e-approximation ratio with regret $O(T^{3/4})$ for maximizing DR-submodular functions over any down-closed convex set. Note that the approximation ratio of 1/e matches the best-known guarantee for the offline version of the problem. Next, we give an online algorithm that achieves an approximation guarantee (depending on the search space) for the problem of maximizing non-monotone continuous DR-submodular functions over a general convex set (not necessarily down-closed). To the best of our knowledge, no prior algorithm with an approximation guarantee was known for non-monotone DR-submodular maximization in the online setting. Finally, we run experiments to verify the performance of our algorithms on problems arising in the machine learning domain with real-world datasets.

Introduction

Continuous DR-submodular optimization is a subclass of non-convex optimization that is an upcoming frontier in machine learning. Roughly speaking, a differentiable non-negative bounded function $F : [0,1]^n \to [0,1]$ is DR-submodular if $\nabla F(x) \geq \nabla F(y)$ for all $x, y \in [0,1]^n$ with $x_i \leq y_i$ for every $1 \leq i \leq n$. (Note that, w.l.o.g. after normalization, we assume that $F$ takes values in $[0,1]$.) Intuitively, continuous DR-submodular functions capture the diminishing-returns property, or the economy of scale, in continuous domains. DR-submodularity has been of great interest (Bian, Buhmann, and Krause 2019; Bian et al. 2017a; Chen, Hassani, and Karbasi 2018; Hassani, Soltanolkotabi, and Karbasi 2017; Niazadeh, Roughgarden, and Wang 2018; Staib and Jegelka 2017). Many problems arising in machine learning and statistics, such as Non-definite Quadratic Programming (Ito and Fujimaki 2016), Determinantal Point Processes (Kulesza, Taskar et al. 2012) and log-submodular models (Djolonga and Krause 2014), to name a few, have been modelled using the notion of continuous DR-submodular functions.

In the past decade, the online computational framework has been quite successful in tackling a wide variety of challenging problems and capturing many real-world problems with uncertainty. Within this framework, we focus on the model of online learning. In online learning, at any time step, given the history of actions and the associated reward functions, the online algorithm first chooses an action from a set of feasible actions; an adversary subsequently selects a reward function. The objective is to perform as well as the best fixed action in hindsight. This setting has been extensively explored in the literature, especially in the context of convex functions (Hazan 2016).
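To make the protocol concrete, the following is a minimal sketch (with hypothetical function and oracle names, not the authors' algorithm) of the online-learning loop and of the regret bookkeeping used throughout the paper:

```python
import numpy as np

def online_learning_loop(K_project, reward_fns, pick_action):
    """Generic online-learning protocol: at each step t the learner commits to an
    action x_t, and only then is the reward function F_t revealed by the adversary.

    K_project   -- projects a candidate point onto the feasible set K (assumed given)
    reward_fns  -- the sequence F_1, ..., F_T, revealed one per step
    pick_action -- the learner's rule, which sees only the history so far (hypothetical)
    """
    history, total_reward = [], 0.0
    for F_t in reward_fns:
        x_t = K_project(pick_action(history))   # learner moves first
        total_reward += F_t(x_t)                # adversary's function is then revealed
        history.append((x_t, F_t))
    return total_reward

def r_regret(total_reward, best_fixed_value, r):
    """(r, R(T))-regret bookkeeping: R(T) = r * max_x sum_t F_t(x) - sum_t F_t(x_t)."""
    return r * best_fixed_value - total_reward
```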
In fact, several algorithms with theoretical approximation guarantees are known for maximizing (offline and online) DR-submodular functions. However, these guarantees hold under assumptions based on the monotonicity of the functions, coupled with structural restrictions on the convex sets, such as being the unconstrained hypercube or down-closed. Yet a majority of real-world problems can be formulated as non-monotone DR-submodular functions over convex sets that are not necessarily down-closed. For example, the problems of Determinantal Point Processes, log-submodular models, etc., can be viewed as non-monotone DR-submodular maximization problems. Although several algorithms for non-monotone (set) submodular maximization are known and they are built on maximizing the corresponding multilinear extensions (a particular class of DR-submodular functions), they do not extend to general DR-submodular functions, since the optimal solutions of the former are integer points while those of the latter are not. Besides, general convex sets include conic convex sets, up-closed convex sets, mixtures of covering and packing linear constraints, etc., which appear in many applications. Among others, conic convex sets play a crucial role in convex optimization. Conic programming (Boyd and Vandenberghe 2004), an important subfield of convex optimization, consists of optimizing an objective function over a conic convex set. Conic programming reduces to linear programming and semi-definite programming when the objective function is linear and the convex cones are the positive orthant $\mathbb{R}^n_+$ and the positive semidefinite matrices $\mathbb{S}^n_+$, respectively. Optimizing non-monotone DR-submodular functions over a (bounded) conic convex set (not necessarily down-closed) is an interesting and important problem both in theory and in practice. To the best of our knowledge, no prior work has been done on maximizing DR-submodular functions over conic sets in the online setting.

The limits of current theory (Bian, Buhmann, and Krause 2019; Bian et al. 2017a; Chen, Hassani, and Karbasi 2018; Hassani, Soltanolkotabi, and Karbasi 2017; Niazadeh, Roughgarden, and Wang 2018; Zhang et al. 2019) motivate us to develop online algorithms for non-monotone functions. In this work, we explore the online problem of maximizing non-monotone DR-submodular functions over the hypercube, over down-closed convex sets (a set $\mathcal{K}$ is down-closed if for every $z \in \mathcal{K}$ and every $y \leq z$, we have $y \in \mathcal{K}$), and over general convex sets. Formally, we consider the following setting for DR-submodular maximization: given a convex domain $\mathcal{K} \subseteq [0,1]^n$ in advance, at each time step $t = 1, 2, \ldots, T$, the online algorithm first selects a vector $x^t \in \mathcal{K}$. Subsequently, the adversary reveals a non-monotone continuous DR-submodular function $F^t$ and the algorithm receives a reward of $F^t(x^t)$. The objective is to maximize the total reward. We say that an algorithm achieves an $(r, R(T))$-regret if
$$\sum_{t=1}^{T} F^t(x^t) \;\geq\; r \cdot \max_{x \in \mathcal{K}} \sum_{t=1}^{T} F^t(x) \;-\; R(T).$$
In other words, $r$ is the approximation ratio that measures the quality of the online algorithm compared to the best fixed solution in hindsight, and $R(T)$ represents the regret in the classic sense. Equivalently, we say that the algorithm has $r$-regret at most $R(T)$. Our goal is to design online algorithms with $(r, R(T))$-regret where $0 < r \leq 1$ is as large as possible and $R(T)$ is sub-linear in $T$, i.e., $R(T) = o(T)$.

Our Contributions and Techniques

In this paper, we provide algorithms with performance guarantees for the problem over down-closed and general convex sets.
Our contributions and techniques are summarized as follows (see also Table 1; the entries marked [this paper] correspond to our contributions).

Maximizing Online Non-monotone DR-submodular Functions over Down-closed Convex Sets

The natures of online DR-submodular maximization for monotone and non-monotone functions are quite different. In the monotone setting, the best approximation guarantee is $(1 - 1/e)$ (Chen, Hassani, and Karbasi 2018), but one can prove that the natural gradient-descent algorithm already attains a 1/2-approximation (Chen, Hassani, and Karbasi 2018). However, for online non-monotone DR-submodular maximization, no constant guarantee is known, and it is likely that natural algorithms such as gradient descent, mirror descent, etc., do not achieve any constant approximation, even a small one. An illustrative example of poor local optima (for the coordinate-ascent algorithm) can be found in (Bian, Buhmann, and Krause 2019, Appendix A).

The main result of our paper is the first online algorithm that achieves a constant approximation ratio, with regret $O(T^{3/4})$, for maximizing non-monotone DR-submodular functions over any down-closed convex set, where $T$ is the number of time steps. Moreover, the approximation ratio is 1/e, which matches the best guarantee in the offline setting. Our algorithm is built on the Meta-Frank-Wolfe algorithm introduced by (Chen, Hassani, and Karbasi 2018) for monotone DR-submodular maximization. Their Meta-Frank-Wolfe algorithm combines the framework of meta-actions proposed in (Streeter and Golovin 2009) with a variant of the Frank-Wolfe algorithm proposed in (Bian et al. 2017b) for maximizing monotone DR-submodular functions. Informally, at every time $t$, our algorithm starts from the origin $\mathbf{0}$ ($\mathbf{0} \in \mathcal{K}$ since $\mathcal{K}$ is down-closed) and executes $L$ steps of the Frank-Wolfe algorithm, where the update vector at every iteration $1 \leq \ell \leq L$ is constructed by combining the output of an optimization oracle $\mathcal{E}_\ell$ with the current vector, so that we can exploit the concavity of DR-submodular functions along positive directions. The solution $x^t$ is produced at the end of the $L$-th step. Subsequently, after observing the function $F^t$, the algorithm subtly defines a vector $d^t_\ell$ and feeds back $\langle \cdot, d^t_\ell \rangle$ as the reward function at time $t$ to the oracle $\mathcal{E}_\ell$, for $1 \leq \ell \leq L$.

The most important point distinguishing our algorithm from that of (Chen, Hassani, and Karbasi 2018) lies in the oracles. In their algorithm, they used linear oracles, which are appropriate for maximizing monotone DR-submodular functions. However, it is not clear whether linear or even convex oracles are strong enough to achieve a constant approximation for non-monotone functions. Aiming for a 1/e-approximation, the best known approximation in the offline setting, we consider the following online non-convex problem, referred to throughout this paper as the online vee learning problem. At each time $t$, the online algorithm knows $c^t \in \mathcal{K}$ in advance and selects a vector $x^t \in \mathcal{K}$. Subsequently, the adversary reveals a vector $a^t \in \mathbb{R}^n$ and the algorithm receives a reward $\langle a^t, c^t \vee x^t \rangle$. Given two vectors $x$ and $y$, the vector $x \vee y$ has $i$-th coordinate $(x \vee y)_i = \max\{x_i, y_i\}$. The goal is to maximize the total reward over a time horizon $T$. In order to bypass the non-convexity of the online vee learning problem, we propose the following methodology.
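Before describing it, a minimal numerical sketch (a hypothetical one-dimensional instance, not taken from the paper) shows why standard online concave maximization tools do not apply directly to the vee reward: as a function of $x$, $\langle a, c \vee x \rangle$ is piecewise linear and convex when $a \geq 0$, so concavity fails.

```python
import numpy as np

# Hypothetical 1-d instance: c = 0.5, a = 1, so reward(x) = max(0.5, x).
a, c = np.array([1.0]), np.array([0.5])

def vee_reward(x):
    """Reward of the online vee learning problem: <a, c v x>, coordinate-wise max."""
    return float(a @ np.maximum(c, x))

x0, x1 = np.array([0.0]), np.array([1.0])
mid = 0.5 * (x0 + x1)
# Concavity would require f(mid) >= (f(x0) + f(x1)) / 2; here 0.5 < 0.75.
print(vee_reward(mid), 0.5 * (vee_reward(x0) + vee_reward(x1)))
```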
Unlike dimension-reduction techniques that aim at reducing the dimension of the search space, we lift the search space and the functions to a higher dimension so as to benefit from their properties in this new space. Concretely, at a high level, given a convex set $\mathcal{K}$, we define a sufficiently dense lattice $\mathcal{L}$ in $[0,1]^n$ such that every point in $\mathcal{K}$ can be approximated by a point in the lattice. The term "sufficiently dense" corresponds to the fact that the values of any Lipschitz function can be approximated by the function values at lattice points. As the next step, we lift all lattice points in $\mathcal{K}$ to a higher-dimensional space so that they form a subset $\widetilde{\mathcal{K}}$ of the vertices (corner points) of the unit hypercube in the newly defined space. Interestingly, the reward function $\langle a^t, c^t \vee x^t \rangle$ can be transformed into a linear function in the new space. Hence, our algorithm for the online vee problem consists of updating, at every time step, a solution in the high-dimensional space using a linear oracle and projecting the solution back to the original space. Once the solution is projected back to the original space, we construct appropriate update vectors for the original DR-submodular maximization and feed back rewards to the online vee oracle. Exploiting the underlying properties of DR-submodularity, we show that our algorithm achieves a regret bound of $(1/e, O(T^{3/4}))$. A useful feature of our algorithm is that it is projection-free if appropriate projection-free oracles are used.

| Setting | Monotone | Non-monotone |
|---|---|---|
| Offline, hypercube | | 1/2-approx. (Bian, Buhmann, and Krause 2019; Niazadeh, Roughgarden, and Wang 2018) |
| Offline, down-closed | $(1 - 1/e)$-approx. (Bian et al. 2017b) | 1/e-approx. (Bian et al. 2017a); 1/e-approx. (Mokhtari, Hassani, and Karbasi 2018) |
| Offline, general | | $\frac{1 - \min_{x \in \mathcal{K}} \|x\|_\infty}{3\sqrt{3}}$-approx. (Dürr et al. 2020) |
| Online, hypercube | | $(1/2, O(\sqrt{T \log T}))$ [deduced] |
| Online, down-closed | $(1 - 1/e, O(\sqrt{T}))$ (Chen, Hassani, and Karbasi 2018) | $(1/e, O(T^{3/4}))$ [this paper] |
| Online, general | | $\bigl(\frac{1 - \min_{x \in \mathcal{K}} \|x\|_\infty}{3\sqrt{3}}, O(\sqrt{T \ln T})\bigr)$ [this paper] |

Table 1: Summary of results on DR-submodular maximization. The entries represent the best-known guarantees ($(r, R(T))$-regret in the online settings); our results are marked [this paper]. The entry marked [deduced] is not explicitly stated in the literature but can be deduced from (Ene and Nguyen 2016) or from our technique. The guarantees in blank cells can be deduced from a more general one.

Over The Hypercube. Restricting to the DR-submodular maximization problem over the hypercube $[0,1]^n$, a particular down-closed convex set that has been widely studied, our approach leads to a 1/2-approximation algorithm with regret bound $O(\sqrt{T \log T})$. Note that this result can be deduced from (Ene and Nguyen 2016), but as it is not explicitly stated in the literature, for completeness we present the proof in the appendix of the full version.

Maximizing Online Non-monotone DR-submodular Functions over General Convex Sets

We go beyond the down-closed structure by considering general convex sets. Building upon the salient ideas of the previous algorithm, we prove that for general convex sets, the Meta-Frank-Wolfe algorithm with adapted step sizes guarantees in expectation a $\bigl(\frac{1 - \min_{x \in \mathcal{K}} \|x\|_\infty}{3\sqrt{3}}, O(\sqrt{T \ln T})\bigr)$-regret. Notably, if the set $\mathcal{K}$ contains $\mathbf{0}$ (for example, if $\mathcal{K}$ is the intersection of the positive semi-definite matrix cone and the hypercube $[0,1]^n$), then the algorithm guarantees in expectation a $\bigl(\frac{1}{3\sqrt{3}}, O(\sqrt{T \log T})\bigr)$-regret. To the best of our knowledge, this is the first constant-approximation algorithm for the problem of online non-monotone DR-submodular maximization over a non-trivial convex set that goes beyond down-closed sets.
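For a quick numerical reading of this guarantee (a direct evaluation of the ratio stated above, written under the assumption that the norm in question is the $\ell_\infty$-norm):
$$r(\mathcal{K}) \;=\; \frac{1 - \min_{x \in \mathcal{K}} \|x\|_\infty}{3\sqrt{3}}, \qquad r(\mathcal{K})\Big|_{\mathbf{0} \in \mathcal{K}} \;=\; \frac{1}{3\sqrt{3}} \;\approx\; 0.192,$$
and the ratio degrades linearly as the feasible set is pushed away from the origin; for instance, $\min_{x \in \mathcal{K}} \|x\|_\infty = 1/2$ gives $r(\mathcal{K}) = \frac{1}{6\sqrt{3}} \approx 0.096$.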
Note that any algorithm for the problem over a non-down-closed convex set (in particular, an arbitrary convex set containing the origin) that guarantees a constant approximation must in general require exponentially many value queries to the function (Vondrák 2013). Remark that the quality of the solution, specifically the approximation ratio, depends on the initial solution $x_0$. This confirms an observation made in various contexts that initialization plays an important role in non-convex optimization.

Related Work

In this section, we give a summary of the best-known results on DR-submodular maximization. The domain has been investigated extensively in recent years due to its numerous applications in statistics and machine learning, for example active learning (Golovin and Krause 2011), viral marketing (Kempe, Kleinberg, and Tardos 2003), network monitoring (Gomez Rodriguez, Leskovec, and Krause 2010), document summarization (Lin and Bilmes 2011), crowd teaching (Singla et al. 2014), feature selection (Elenberg et al. 2018), deep neural networks (Elenberg et al. 2017), diversity models (Djolonga, Tschiatschek, and Krause 2016) and recommender systems (Guillory and Bilmes 2011).

Offline Setting. (Bian et al. 2017b) considered the problem of maximizing monotone DR-submodular functions subject to a down-closed convex set and showed that the greedy method proposed by (Calinescu et al. 2011), a variant of the well-known Frank-Wolfe algorithm in convex optimization, guarantees a $(1-1/e)$-approximation. However, it has been observed by (Hassani, Soltanolkotabi, and Karbasi 2017) that the greedy method is not robust in stochastic settings (where only unbiased estimates of gradients are available). Subsequently, (Mokhtari, Hassani, and Karbasi 2018) proposed a $(1-1/e)$-approximation algorithm for maximizing monotone DR-submodular functions over general convex sets in stochastic settings, using a new variance-reduction technique.

The problem of maximizing non-monotone DR-submodular functions is much harder. (Bian, Buhmann, and Krause 2019) and (Niazadeh, Roughgarden, and Wang 2018) independently presented algorithms with the same approximation guarantee of 1/2 for the problem of non-monotone DR-submodular maximization over the hypercube ($\mathcal{K} = [0,1]^n$). These algorithms are inspired by the bi-greedy algorithms in (Buchbinder et al. 2015; Buchbinder and Feldman 2018). (Bian et al. 2017a) made a further step by providing a 1/e-approximation algorithm when the convex sets are down-closed. (Mokhtari, Hassani, and Karbasi 2018) also presented an algorithm that achieves a 1/e-approximation for non-monotone continuous DR-submodular functions over a down-closed convex domain using only stochastic gradient estimates. Remark that when aiming for approximation algorithms (in polynomial time), the restriction to down-closed polytopes is unavoidable. Specifically, (Vondrák 2013) proved that any algorithm for the problem over a non-down-closed set that guarantees a constant approximation must in general require exponentially many value queries to the function. Recently, (Dürr et al. 2020) presented an algorithm that achieves a $\frac{1 - \min_{x \in \mathcal{K}} \|x\|_\infty}{3\sqrt{3}}$-approximation for non-monotone DR-submodular functions over general convex sets that are not necessarily down-closed.

Online Setting. The results for online DR-submodular maximization are known only for monotone functions.
(Chen, Hassani, and Karbasi 2018) considered the online problem of maximizing monotone DR-submodular functions over general convex sets and provided an algorithm that achieves a $(1-1/e, O(\sqrt{T}))$-regret. Subsequently, (Zhang et al. 2019) presented an algorithm that reduces the number of per-function gradient evaluations from $T^{3/2}$ in (Chen, Hassani, and Karbasi 2018) to one and achieves the same approximation ratio of $(1-1/e)$. Leveraging the idea of one gradient per iteration, they also presented a bandit algorithm for maximizing monotone DR-submodular functions over a general convex set that achieves in expectation a $(1-1/e)$ approximation ratio with regret $O(T^{8/9})$. Note that in the discrete setting, (Roughgarden and Wang 2018) studied non-monotone (discrete) submodular maximization over the hypercube and gave an algorithm which guarantees the tight 1/2-approximation and $O(\sqrt{T})$ regret. Recently and independently, (Niazadeh et al. 2020) provided a $(1/2, O(\sqrt{T \log T}))$-regret algorithm for DR-submodular maximization over the hypercube. Their algorithm is built on an interesting framework that transforms offline greedy algorithms into online ones using Blackwell approachability. Our approach, based on the lifting procedure, is completely different from theirs.

Preliminaries and Notations

Throughout the paper, we use boldface letters, e.g., $x, y$, to represent vectors. For every $S \subseteq \{1, 2, \ldots, n\}$, the vector $\mathbf{1}_S$ is the $n$-dimensional vector whose $i$-th coordinate equals 1 if $i \in S$ and 0 otherwise. Given two $n$-dimensional vectors $x, y$, we say that $x \leq y$ iff $x_i \leq y_i$ for all $1 \leq i \leq n$. Additionally, we denote by $x \vee y$ their coordinate-wise maximum, i.e., $(x \vee y)_i = \max\{x_i, y_i\}$. Moreover, the symbol $\circ$ represents element-wise multiplication; that is, given two vectors $x, y$, the vector $x \circ y$ has $i$-th coordinate $(x \circ y)_i = x_i y_i$. The scalar product is $\langle x, y \rangle = \sum_i x_i y_i$ and the norm is $\|x\| = \langle x, x \rangle^{1/2}$. In the paper, we assume that $\mathcal{K} \subseteq [0,1]^n$. We say that $\mathcal{K}$ is the hypercube if $\mathcal{K} = [0,1]^n$; $\mathcal{K}$ is down-closed if for every $z \in \mathcal{K}$ and $y \leq z$ we have $y \in \mathcal{K}$; and $\mathcal{K}$ is general if $\mathcal{K}$ is a convex subset of $[0,1]^n$ without any special property.

Definition 1. A function $F : [0,1]^n \to \mathbb{R}_+ \cup \{0\}$ is diminishing-return (DR) submodular if for all vectors $x \leq y \in [0,1]^n$, any basis vector $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)$ and any constant $\alpha > 0$ such that $x + \alpha e_i \in [0,1]^n$ and $y + \alpha e_i \in [0,1]^n$, it holds that $F(x + \alpha e_i) - F(x) \geq F(y + \alpha e_i) - F(y)$.

Note that if the function $F$ is differentiable, then the diminishing-return (DR) inequality is equivalent to $\nabla F(x) \geq \nabla F(y)$ for all $x \leq y \in [0,1]^n$. Moreover, if $F$ is twice-differentiable, then the DR property is equivalent to all entries of its Hessian being non-positive, i.e., $\frac{\partial^2 F}{\partial x_i \partial x_j} \leq 0$ for all $1 \leq i, j \leq n$. Besides, a differentiable function $F : \mathcal{K} \subseteq [0,1]^n \to \mathbb{R}_+ \cup \{0\}$ is said to be $\beta$-smooth if for any $x, y \in \mathcal{K}$ it holds that $F(y) \leq F(x) + \langle \nabla F(x), y - x \rangle + \frac{\beta}{2}\|y - x\|^2$; or, equivalently, the gradient is $\beta$-Lipschitz, i.e., $\|\nabla F(x) - \nabla F(y)\| \leq \beta \|x - y\|$.
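As an illustration of Definition 1 and of the Hessian characterization above, the following sketch (a hypothetical quadratic instance, not from the paper) numerically checks the diminishing-returns inequality and the sign condition on the Hessian:

```python
import numpy as np

# Hypothetical DR-submodular instance on [0,1]^2: F(x) = <b, x> + 0.5 * x^T H x,
# with all entries of H non-positive (the Hessian characterization above).
H = np.array([[-1.0, -0.5], [-0.5, -0.8]])
b = np.array([1.0, 1.2])

def F(x):
    return float(b @ x + 0.5 * x @ H @ x)

def dr_inequality_holds(x, y, i, alpha):
    """Check F(x + a*e_i) - F(x) >= F(y + a*e_i) - F(y) for x <= y (Definition 1)."""
    e = np.zeros_like(x); e[i] = alpha
    return F(x + e) - F(x) >= F(y + e) - F(y) - 1e-12

x = np.array([0.1, 0.2])   # x <= y coordinate-wise
y = np.array([0.4, 0.6])
print(all(dr_inequality_holds(x, y, i, 0.3) for i in range(2)))  # expected: True
print((H <= 0).all())  # all Hessian entries are non-positive, as required
```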
Online DR-Submodular Maximization over Down-Closed Convex Sets

Online Vee Learning

In this section, we give an algorithm for an online problem that will be the main building block in the design of our algorithms for online DR-submodular maximization over down-closed convex sets. In the online vee learning problem, given a down-closed convex set $\mathcal{K}$, at every time $t$ the online algorithm receives $c^t \in \mathcal{K}$ at the beginning of the step and needs to choose a vector $x^t \in \mathcal{K}$. Subsequently, the adversary reveals a vector $a^t \in \mathbb{R}^n$ and the algorithm receives a reward $\langle a^t, c^t \vee x^t \rangle$. The goal is to maximize the total reward $\sum_{t=1}^{T} \langle a^t, c^t \vee x^t \rangle$ over a time horizon $T$.

The main issue in the online vee problem is that the reward functions are non-concave. In order to overcome this obstacle, we consider a novel approach that consists of discretizing and lifting the corresponding functions to a higher-dimensional space.

Discretization and Lifting. Let $\mathcal{L}$ be a lattice such that $\mathcal{L} = \{0, \frac{1}{M}, \ldots, \frac{\ell}{M}, \ldots, 1\}^n$ where $0 \leq \ell \leq M$ for some parameter $M$ (to be defined later). Note that each $x_i = \frac{\ell}{M}$ for $0 \leq \ell \leq M$ can be uniquely represented by a vector $y_i \in \{0,1\}^{M+1}$ such that $y_{i,0} = \ldots = y_{i,\ell} = 1$ and $y_{i,\ell+1} = \ldots = y_{i,M} = 0$. Based on this observation, we lift the discrete set $\mathcal{K} \cap \mathcal{L}$ to the $n(M+1)$-dimensional space. Specifically, define a lifting map $m : \mathcal{K} \cap \mathcal{L} \to \{0,1\}^{n(M+1)}$ such that each point $(x_1, \ldots, x_n) \in \mathcal{K} \cap \mathcal{L}$ is mapped to the unique point $(y_{1,0}, \ldots, y_{1,M}, \ldots, y_{n,0}, \ldots, y_{n,M}) \in \{0,1\}^{n(M+1)}$ where $y_{i,0} = \ldots = y_{i,\ell} = 1$ and $y_{i,\ell+1} = \ldots = y_{i,M} = 0$ iff $x_i = \frac{\ell}{M}$ for $0 \leq \ell \leq M$. Define $\widetilde{\mathcal{K}}$ to be the set $\{\mathbf{1}_X \in \{0,1\}^{n(M+1)} : \mathbf{1}_X = m(x) \text{ for some } x \in \mathcal{K} \cap \mathcal{L}\}$. Observe that $\widetilde{\mathcal{K}}$ is a subset of the discrete points in $\{0,1\}^{n(M+1)}$. Let $\mathcal{C} := \mathrm{conv}(\widetilde{\mathcal{K}})$ be the convex hull of $\widetilde{\mathcal{K}}$.

Algorithm 1: Online algorithm for vee learning
Input: A convex set $\mathcal{K}$, a time horizon $T$.
1: Let $\mathcal{L}$ be a lattice in $[0,1]^n$ and denote the polytope $\mathcal{C} := \mathrm{conv}(\widetilde{\mathcal{K}})$.
2: Initialize arbitrarily $x^1 \in \mathcal{K} \cap \mathcal{L}$ and $y^1 = \mathbf{1}_{X^1} = m(x^1) \in \widetilde{\mathcal{K}}$.
3: for $t = 1$ to $T$ do
4:   Observe $c^t$, compute $\mathbf{1}_{C^t} = m(\bar{c}^t)$.
5:   Play $x^t$.
6:   Observe $a^t$ (so compute $\tilde{a}^t$) and receive the reward $\langle a^t, c^t \vee x^t \rangle$.
7:   Set $y^{t+1} \leftarrow \mathrm{update}_{\mathcal{C}}\bigl(y^t;\ \tilde{a}^{t'} \circ (\mathbf{1} - \mathbf{1}_{C^{t'}}) : t' \leq t\bigr)$, where $\circ$ denotes element-wise multiplication.
8:   Round: $\mathbf{1}_{X^{t+1}} \leftarrow \mathrm{round}(y^{t+1})$.
9:   Compute $x^{t+1} = m^{-1}(\mathbf{1}_{X^{t+1}})$.
10: end for

Algorithm Description. In our algorithm, at every time $t$, we output $x^t \in \mathcal{L} \cap \mathcal{K}$. In fact, we reduce the online vee problem to an online linear optimization problem in the $n(M+1)$-dimensional space. Given $c^t \in \mathcal{K}$, we round every coordinate $c^t_i$, for $1 \leq i \leq n$, down to the largest multiple of $\frac{1}{M}$ that does not exceed $c^t_i$. In other words, the rounded vector $\bar{c}^t$ has $i$-th coordinate $\bar{c}^t_i = \frac{\ell}{M}$ where $\frac{\ell}{M} \leq c^t_i < \frac{\ell+1}{M}$. The vector $\bar{c}^t \in \mathcal{L}$, and also $\bar{c}^t \in \mathcal{K}$ (since $\bar{c}^t \leq c^t$ and $\mathcal{K}$ is down-closed). Denote $\mathbf{1}_{C^t} = m(\bar{c}^t)$. Besides, for each vector $a^t \in \mathbb{R}^n$, define its correspondent $\tilde{a}^t \in \mathbb{R}^{n(M+1)}$ such that $\tilde{a}^t_{i,j} = \frac{1}{M} a^t_i$ for all $1 \leq i \leq n$ and $0 \leq j \leq M$. Observe that $\langle a^t, \bar{c}^t \rangle = \langle \tilde{a}^t, \mathbf{1}_{C^t} \rangle$ (where the second scalar product is taken in the space of dimension $n(M+1)$). The formal description is given in Algorithm 1.

In the algorithm, we use a procedure update. This procedure takes as arguments the polytope $\mathcal{C}$, the current vector $y^t$ and the gradients $\tilde{a}^{t'} \circ (\mathbf{1} - \mathbf{1}_{C^{t'}})$ of the previous time steps, and outputs the next vector $y^{t+1}$. One can use different update strategies, in particular gradient ascent or the follow-the-perturbed-leader algorithm (if aiming for a projection-free algorithm):
$$y^{t+1} \leftarrow \mathrm{Proj}_{\mathcal{C}}\Bigl(y^t + \eta\, \tilde{a}^t \circ (\mathbf{1} - \mathbf{1}_{C^t})\Bigr) \qquad \text{(gradient ascent)}$$
$$y^{t+1} \leftarrow \arg\max_{y \in \mathcal{C}} \Bigl\langle \sum_{t'=1}^{t} \tilde{a}^{t'} \circ (\mathbf{1} - \mathbf{1}_{C^{t'}}) + \nu,\; y \Bigr\rangle \qquad \text{(follow-the-perturbed-leader)}$$
where $\nu$ denotes a suitably scaled random perturbation vector.

Besides, in the algorithm we additionally use a procedure round in order to transform a solution $y^t$ in the polytope $\mathcal{C}$ of dimension $n(M+1)$ into an integer point in $\widetilde{\mathcal{K}}$. Specifically, given $y^t \in \mathrm{conv}(\widetilde{\mathcal{K}})$, we round $y^t$ to $\mathbf{1}_{X^t} \in \widetilde{\mathcal{K}}$ using an efficient polynomial-time algorithm given in (Mirrokni et al. 2017) for the approximate Carathéodory theorem w.r.t. the $\ell_2$-norm. Specifically, given $y^t \in \mathcal{C}$ and an arbitrary $\epsilon > 0$, the algorithm in (Mirrokni et al. 2017) returns a set of $k = O(1/\epsilon^2)$ integer points $\mathbf{1}_{X^t_1}, \ldots, \mathbf{1}_{X^t_k} \in \widetilde{\mathcal{K}}$ in $O(1/\epsilon^2)$ time such that $\bigl\| y^t - \frac{1}{k} \sum_{j=1}^{k} \mathbf{1}_{X^t_j} \bigr\| \leq \epsilon$. Hence, given $\mathbf{1}_{X^t_1}, \ldots, \mathbf{1}_{X^t_k}$, which have been computed, the procedure round simply consists of rounding $y^t$ to $\mathbf{1}_{X^t_j}$ with probability $1/k$. Finally, the solution in the original space is computed as $x^{t+1} = m^{-1}(\mathbf{1}_{X^{t+1}})$.
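To make the discretization and lifting concrete, here is a minimal sketch (hypothetical helper names and a small $M$) of the map $m$, its inverse, and the lifted reward vector $\tilde{a}$; it only illustrates the encoding described above, not the full Algorithm 1.

```python
import numpy as np

M = 4  # lattice resolution: coordinates live in {0, 1/M, ..., 1}

def lift(x):
    """Lifting map m: a lattice point x in {0,1/M,...,1}^n becomes a 0/1 vector of
    length n*(M+1); coordinate x_i = l/M is encoded as l+1 leading ones."""
    y = np.zeros((len(x), M + 1))
    for i, xi in enumerate(x):
        y[i, : int(round(xi * M)) + 1] = 1.0
    return y.ravel()

def unlift(y):
    """Inverse map m^{-1}: count the ones in each block of size M+1 and divide by M."""
    blocks = y.reshape(-1, M + 1)
    return (blocks.sum(axis=1) - 1) / M

def lift_reward_vector(a):
    """The lifted vector a~ with a~_{i,j} = a_i / M, as in the reduction above."""
    return np.repeat(a / M, M + 1)

x = np.array([0.25, 1.0, 0.0])
assert np.allclose(unlift(lift(x)), x)   # m^{-1}(m(x)) = x on the lattice
a = np.array([2.0, -1.0, 0.5])
print(lift_reward_vector(a).shape)       # (n*(M+1),) = (15,)
```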
Lemma 1. Assume that $\|a^t\| \leq G$ for all $1 \leq t \leq T$ and that the update scheme is chosen either as the gradient-ascent update or as another procedure with regret $O(\sqrt{T})$. Then, using the lattice $\mathcal{L}$ with parameter $M = (T/n)^{1/4}$ and $\epsilon = 1/\sqrt{T}$ (in the round procedure), Algorithm 1 achieves a regret bound of $O(G (nT)^{3/4})$. The total running time of the algorithm is $O(T^2)$.

Algorithm for Online Non-monotone DR-submodular Maximization

In this section, we provide an algorithm for the problem of online non-monotone DR-submodular maximization. The algorithm maintains $L$ online vee learning oracles $\mathcal{E}_1, \ldots, \mathcal{E}_L$. At each time step $t$, the algorithm starts from $\mathbf{0}$ (the origin) and executes $L$ steps of the Frank-Wolfe algorithm, where the update vector $v^t_\ell$ is constructed by combining the output $u^t_\ell$ of an online vee optimization oracle $\mathcal{E}_\ell$ with the current vector $x^t_\ell$. In particular, $v^t_\ell$ is set to $x^t_\ell \vee u^t_\ell - x^t_\ell$. Then, the solution $x^t$ is produced at the end of the $L$-th step. Subsequently, after observing the DR-submodular function $F^t$, the algorithm defines a vector $d^t_\ell$ and feeds back $\langle d^t_\ell, x^t_\ell \vee u^t_\ell \rangle$ as the reward at time $t$ to the oracle $\mathcal{E}_\ell$, for $1 \leq \ell \leq L$. The pseudocode is presented in Algorithm 2.

Algorithm 2: Online algorithm for down-closed convex sets
Input: A convex set $\mathcal{K}$, a time horizon $T$, online vee optimization oracles $\mathcal{E}_1, \ldots, \mathcal{E}_L$, step sizes $\rho_\ell \in (0,1)$, and $\eta_\ell = 1/L$ for all $\ell \in [L]$.
1: Initialize the vee optimization oracle $\mathcal{E}_\ell$ for all $\ell \in \{1, \ldots, L\}$.
2: Initialize $x^t_1 \leftarrow \mathbf{0}$ and $d^t_0 \leftarrow \mathbf{0}$ for every $1 \leq t \leq T$.
3: for $t = 1$ to $T$ do
4:   Compute $u^t_\ell \leftarrow$ output of oracle $\mathcal{E}_\ell$ in round $t$, for all $\ell \in \{1, \ldots, L\}$.
5:   for $1 \leq \ell \leq L$ do
6:     Set $v^t_\ell \leftarrow x^t_\ell \vee u^t_\ell - x^t_\ell$.
7:     Set $x^t_{\ell+1} \leftarrow x^t_\ell + \eta_\ell v^t_\ell$.
8:   end for
9:   Set $x^t \leftarrow x^t_{L+1}$.
10:  Play $x^t$, observe the function $F^t$ and get the reward $F^t(x^t)$.
11:  Compute a gradient estimate $g^t_\ell$ such that $\mathbb{E}[g^t_\ell \mid x^t_\ell] = \nabla F^t(x^t_\ell)$ for all $\ell \in \{1, 2, \ldots, L\}$.
12:  Compute $d^t_\ell = (1 - \rho_\ell) d^t_{\ell-1} + \rho_\ell g^t_\ell$ for every $\ell \in \{1, \ldots, L\}$.
13:  Feed back $\langle d^t_\ell, x^t_\ell \vee u^t_\ell \rangle$ to oracle $\mathcal{E}_\ell$, for all $\ell \in \{1, 2, \ldots, L\}$.
14: end for

Theorem 1. Let $\mathcal{K} \subseteq [0,1]^n$ be a down-closed convex set with diameter $D$. Assume that the functions $F^t$ are $G$-Lipschitz and $\beta$-smooth, and that $\mathbb{E}\bigl[\|g^t_\ell - \nabla F^t(x^t_\ell)\|^2\bigr] \leq \sigma^2$ for every $t$ and $\ell$. Then, choosing $L = T^{3/4}$ and the step sizes $\eta_\ell = 1/L$ and $\rho_\ell = 2/(\ell+3)^{2/3}$ for all $1 \leq \ell \leq L$, Algorithm 2 achieves the following guarantee:
$$\sum_{t=1}^{T} \mathbb{E}[F^t(x^t)] \;\geq\; \frac{1}{e} \max_{x \in \mathcal{K}} \sum_{t=1}^{T} F^t(x) \;-\; O\bigl((\beta D + n^{3/4} G + \sigma) D\, T^{3/4}\bigr).$$

Maximizing Non-Monotone DR-Submodular Functions over a General Convex Domain

In this section, we consider the problem of maximizing non-monotone continuous DR-submodular functions over a general convex domain. We show that, beyond down-closed convex domains, the Meta-Frank-Wolfe algorithm, introduced by (Chen, Hassani, and Karbasi 2018) for monotone DR-submodular functions, indeed provides meaningful guarantees for the problem of online non-monotone DR-submodular maximization over a general convex set. Note that no algorithm with a performance guarantee was previously known in the online setting.

Online Linear Optimization Oracles. In our algorithms, we use multiple online linear optimization oracles to estimate the gradients of the online arriving functions. This idea was originally developed for maximizing monotone DR-submodular functions (Chen et al. 2018). Before presenting the algorithms, we recall the online linear optimization problem and the corresponding oracles. In the online linear optimization problem, at every time $1 \leq t \leq T$, the oracle selects $u^t \in \mathcal{K}$. Subsequently, the adversary reveals a vector $d^t$ and feeds back the function $\langle \cdot, d^t \rangle$ (and a reward $\langle u^t, d^t \rangle$) to the oracle. The objective is to minimize the regret. There exist several oracles that guarantee sublinear regret; for example, the gradient-descent algorithm has regret $O(n\sqrt{T})$ (see for example (Hazan 2016)).
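As an illustration of such an oracle (a generic sketch with hypothetical names, not the paper's implementation), online projected gradient ascent on the linear rewards can serve as each $\mathcal{E}_\ell$:

```python
import numpy as np

class LinearOracle:
    """Online linear optimization oracle: plays u_t, then receives the reward
    vector d_t and takes a projected gradient-ascent step on <u, d_t>."""

    def __init__(self, project, u0, eta):
        self.project = project      # projection onto the convex set K (assumed given)
        self.u = np.asarray(u0, dtype=float)
        self.eta = eta              # step size, e.g. proportional to 1/sqrt(T)

    def predict(self):
        return self.u.copy()

    def feedback(self, d):
        # gradient of the linear reward <u, d> with respect to u is d itself
        self.u = self.project(self.u + self.eta * np.asarray(d, dtype=float))

# Example: K = [0,1]^n, so the projection is coordinate-wise clipping.
oracle = LinearOracle(project=lambda v: np.clip(v, 0.0, 1.0),
                      u0=np.zeros(3), eta=0.1)
u = oracle.predict()
oracle.feedback(np.array([0.5, -0.2, 1.0]))   # reward vector revealed by the adversary
```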
Algorithm Description. At a high level, at every time $t$, our algorithm produces a solution $x^t$ by running $L$ steps of the Frank-Wolfe algorithm that use the outputs of $L$ linear optimization oracles as update vectors. After the algorithm plays $x^t$, it observes stochastic gradient estimates at $L$ points in the convex domain. Subsequently, these estimates are averaged with the estimates from the previous round and are fed to the reward functions of the $L$ online linear oracles. The pseudocode is presented in Algorithm 3.

Algorithm 3: Meta-Frank-Wolfe for general convex domains
Input: A convex set $\mathcal{K}$, a time horizon $T$, online linear optimization oracles $\mathcal{E}_1, \ldots, \mathcal{E}_L$, step sizes $\rho_\ell \in (0,1)$ and $\eta_\ell \in (0,1)$.
1: Initialize the online linear optimization oracle $\mathcal{E}_\ell$ for all $\ell \in \{1, \ldots, L\}$.
2: Initialize $x^t_1 \leftarrow x_0$ for some $x_0 \in \mathcal{K}$ and $d^t_0 \leftarrow \mathbf{0}$ for every $1 \leq t \leq T$. Let $x_0 \leftarrow \arg\min_{x \in \mathcal{K}} \|x\|_\infty$.
3: for $t = 1$ to $T$ do
4:   Set $v^t_\ell \leftarrow$ output of oracle $\mathcal{E}_\ell$ in round $t-1$, for all $\ell \in \{1, \ldots, L\}$.
5:   Set $x^t_{\ell+1} \leftarrow (1 - \eta_\ell) x^t_\ell + \eta_\ell v^t_\ell$ (for $\ell = 1, \ldots, L$).
6:   Play $x^t = x^t_{L+1}$.
7:   Observe $g^t_\ell$ such that $\mathbb{E}[g^t_\ell \mid x^t_\ell] = \nabla F^t(x^t_\ell)$.
8:   Set $d^t_\ell \leftarrow (1 - \rho_\ell) d^t_{\ell-1} + \rho_\ell g^t_\ell$ for $\ell \in \{1, \ldots, L\}$.
9:   Feed back the reward $\langle v^t_\ell, d^t_\ell \rangle$ to $\mathcal{E}_\ell$ for $\ell \in \{1, \ldots, L\}$.
10: end for

Theorem 2. Let $\mathcal{K} \subseteq [0,1]^n$ be a general convex domain with diameter $D$. Assume that, for every $1 \leq t \leq T$: (1) $F^t$ is a $\beta$-smooth DR-submodular function and the norm of the gradient $\nabla F^t$ is bounded by $G$, i.e., $\|\nabla F^t(x)\| \leq G$ for all $x \in \mathcal{K}$; and (2) the variance of the unbiased stochastic gradients is bounded by $\sigma^2$, i.e., $\mathbb{E}\bigl[\|g^t_\ell - \nabla F^t(x^t_\ell)\|^2\bigr] \leq \sigma^2$ for every $1 \leq \ell \leq L$. Then, setting $L = O(T)$, $\kappa = \ln\frac{3}{2}$, $\rho_\ell = \frac{2}{(\ell+3)^{2/3}}$ and $\eta_\ell = \frac{\kappa}{\ell H_L}$ for $1 \leq \ell \leq L$, where $H_L$ is the $L$-th harmonic number, the following inequality holds for Algorithm 3:
$$\sum_{t=1}^{T} \mathbb{E}[F^t(x^t)] \;\geq\; \frac{1 - \|x_0\|_\infty}{3\sqrt{3}} \max_{x^\star \in \mathcal{K}} \sum_{t=1}^{T} F^t(x^\star) \;-\; O\bigl((\beta D + G + \sigma) D \sqrt{T}\bigr).$$

Remarks. The regret guarantee, in particular the approximation ratio, depends on the initial solution $x_0$. This confirms the observation that initialization plays an important role in non-convex optimization. For the particular case where $\mathcal{K}$ is the intersection of a cone and the hypercube $[0,1]^n$ (so $\mathbf{0} \in \mathcal{K}$), Algorithm 3 provides vanishing $1/(3\sqrt{3})$-regret. Note that this is the first constant approximation for non-monotone DR-submodular maximization over a non-trivial convex domain beyond the class of down-closed convex domains. Assuming that $\mathbf{0} \in \mathcal{K}$ and that the $F^t$'s are identical, i.e., $F^t = F$ for all $t$, the algorithm guarantees a convergence rate of $O(1/\log T)$. This means that to be $\epsilon$-close to a solution that is a $1/(3\sqrt{3})$-approximation to the optimal solution of $F$, the algorithm requires $T = O(2^{1/\epsilon})$ iterations. Note that the exponential complexity is unavoidable, as any constant-approximation algorithm for the multilinear extension of a submodular function (hence a DR-submodular function) over a non-down-closed convex set necessarily requires an exponential number of value queries (Vondrák 2013).
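To tie the pieces together, the following schematic sketch (hypothetical helper names; the gradient estimator and the oracles, e.g. instances like the LinearOracle sketched earlier, are assumed to be provided) shows one round of the Meta-Frank-Wolfe scheme of Algorithm 3 with the step sizes from Theorem 2:

```python
import numpy as np

def meta_frank_wolfe_round(x0, oracles, grad_estimate, L):
    """One round t of Algorithm 3 (schematic): build x_t from the L oracle outputs,
    then feed averaged gradient estimates back to the oracles as linear rewards."""
    kappa = np.log(3.0 / 2.0)
    H_L = np.sum(1.0 / np.arange(1, L + 1))      # L-th harmonic number
    x, d = x0.copy(), np.zeros_like(x0)
    xs = []
    for ell in range(1, L + 1):
        eta = kappa / (ell * H_L)                # eta_l = kappa / (l * H_L)
        v = oracles[ell - 1].predict()           # output of oracle E_l
        xs.append(x.copy())
        x = (1.0 - eta) * x + eta * v            # convex-combination Frank-Wolfe step
    # After playing x and observing F_t: average noisy gradients and reward the oracles.
    for ell in range(1, L + 1):
        rho = 2.0 / (ell + 3.0) ** (2.0 / 3.0)   # rho_l = 2 / (l+3)^{2/3}
        d = (1.0 - rho) * d + rho * grad_estimate(xs[ell - 1])
        oracles[ell - 1].feedback(d)             # linear reward <v, d> for oracle E_l
    return x
```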
Experiments

In this section, we validate our online algorithms on real-world datasets. We show the performance of our algorithms over a down-closed polytope and over a general convex polytope. All experiments are performed in MATLAB using the CPLEX optimization tool on macOS version 10.15.

[Figure 1 (two panels titled "Facebook-Revenue-Max Online/Offline"; x-axis: Iterations, 0 to 1000): Online revenue maximization over (a) a down-closed polytope; and (b) a general polytope. Iterations correspond to time steps $t \in [T]$.]

In the revenue maximization problem, the goal of a company is to offer for free or advertise a product to users so that revenue increases through their "word-of-mouth" effect on others. Here, we are given an undirected social network graph $G = (V, W)$, where $w_{ij} \in W$ represents the weight of the edge between vertex $i$ and vertex $j$. If the company invests $x_i$ units of cost on a user $i \in V$, then user $i$ becomes an advocate of the product with probability $1 - (1-p)^{x_i}$, where $p \in (0,1)$ is a parameter. Intuitively, this signifies that for investing a unit cost in $i$, we have an extra chance that user $i$ becomes an advocate, with probability $p$ (Soma and Yoshida 2017). Let $S \subseteq V$ be the random set of users who advocate for the product. Then the revenue with respect to $S$ is defined as $\sum_{i \in S} \sum_{j \in V \setminus S} w_{ij}$. Let $F : [0,1]^{|V|} \to \mathbb{R}$ be the expected revenue obtained in this model, that is,
$$F(x) = \mathbb{E}_S\Bigl[\sum_{i \in S} \sum_{j \in V \setminus S} w_{ij}\Bigr] = \sum_{i} \sum_{j : j \neq i} w_{ij} \bigl(1 - (1-p)^{x_i}\bigr) (1-p)^{x_j}.$$
It has been shown that $F$ is a non-monotone DR-submodular function (Soma and Yoshida 2017). In our setting, we consider the online variant of revenue maximization on a social network where, at time $t$, the weight of an edge is given as $w^t_{ij} \in \{0,1\}$.

The experiments are performed on the Facebook dataset, which contains 20K users (vertices) and 1M relationships (edges). We choose the number of time steps to be $T = 1000$. At each time $t \in \{1, \ldots, T\}$, we select 2000 vertices $V^t \subseteq V$ uniformly at random, independently of $V^1, \ldots, V^{t-1}$, and construct a batch $B^t$ with edge-weights $w^t_{ij} = 1$ if and only if $i, j \in V^t$ and the edge $(i,j)$ exists in the Facebook dataset. If $i$ or $j$ does not belong to $V^t$, then $w^t_{ij} = 0$. We set $p = 0.0001$.

In the first experiment, we impose a maximum-investment constraint $\sum_{i \in V} x_i \leq 1$. This, in addition to $x_i \geq 0$ for all $i \in V$, constitutes a down-closed feasible convex set. For the general convex set, we impose an additional minimum-investment constraint $\sum_{i \in V} x_i \geq 0.1$.

For comparison purposes, we consider (offline) Frank-Wolfe variants as the benchmark. Variants of the Frank-Wolfe algorithm have been shown to be competitive for maximizing (offline) non-monotone DR-submodular functions over down-closed convex sets (Bian et al. 2017a) and over general convex sets (Dürr et al. 2020). Moreover, as observed in (Bian et al. 2017a; Dürr et al. 2020), variants of the Frank-Wolfe algorithm behave well in practice, with approximation ratios better than 0.9 in several experimental settings. For these reasons, we adopt Frank-Wolfe variants as our benchmark.

In Figure 1 we show the ratio between the objective value achieved by the online algorithm and that of the benchmark, over the down-closed convex set (Figure 1(a)) and the general convex set (Figure 1(b)). More specifically, the offline Frank-Wolfe variant performs almost as well as the optimal solution (approximation ratio larger than 0.92), and our online algorithms attain a ratio larger than 0.44 compared with the offline benchmark. Hence, the approximation ratios of the online algorithms are larger than $0.43 \times 0.92 > 0.39 > 1/e$.
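For reference, here is a minimal sketch (assuming a dense weight matrix W and the parameter p above; not the authors' experimental code) of the expected-revenue objective and its gradient used in this section:

```python
import numpy as np

def revenue(x, W, p):
    """Expected revenue F(x) = sum_{i != j} W[i,j] * (1-(1-p)**x_i) * (1-p)**x_j."""
    q = (1.0 - p) ** x                    # q_i = (1-p)^{x_i}
    active, passive = 1.0 - q, q
    total = active @ W @ passive          # sum over all (i, j), including i == j
    return total - np.sum(np.diag(W) * active * passive)   # drop the i == j terms

def revenue_grad(x, W, p):
    """Gradient of F; uses d/dx_i (1-p)^{x_i} = ln(1-p) * (1-p)^{x_i}."""
    q = (1.0 - p) ** x
    c = np.log(1.0 - p)
    W0 = W - np.diag(np.diag(W))          # zero the diagonal (i != j)
    return -c * q * (W0 @ q) + c * q * (W0.T @ (1.0 - q))

# Tiny synthetic example (hypothetical weights), p as in the experiments above.
W = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
x = np.array([0.2, 0.5, 0.1])
print(revenue(x, W, p=0.0001))
```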
Conclusion

In this paper, we study regret minimization for maximizing online non-monotone DR-submodular functions. We presented the first 1/e-approximation online algorithm, with regret $O(T^{3/4})$, over down-closed convex sets. Moreover, we presented an online algorithm that achieves an approximation guarantee (depending on the search space) for the problem of maximizing non-monotone DR-submodular functions over a general convex set. Finally, we ran experiments to verify the performance of our algorithms on a social revenue maximization problem on a Facebook user-relationship dataset. The design of a $(1/e, O(\sqrt{T}))$-regret algorithm would be challenging, as at each time step one has to "guess" the right update vector to obtain the approximation ratio of 1/e while maintaining low regret and without being trapped in a poor local optimum.

Acknowledgments

Research supported by the ANR project OATA no. ANR-15-CE40-0015-01.

References

Bian, A. A.; Levy, K.; Krause, A.; and Buhmann, J. M. 2017a. Non-monotone Continuous DR-submodular Maximization: Structure and Algorithms. In Neural Information Processing Systems (NIPS).
Bian, A. A.; Mirzasoleiman, B.; Buhmann, J.; and Krause, A. 2017b. Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains. In Artificial Intelligence and Statistics, 111–120.
Bian, Y.; Buhmann, J.; and Krause, A. 2019. Optimal Continuous DR-Submodular Maximization and Applications to Provable Mean Field Inference. In Proceedings of the 36th International Conference on Machine Learning, 644–653.
Boyd, S.; and Vandenberghe, L. 2004. Convex Optimization. Cambridge University Press.
Buchbinder, N.; and Feldman, M. 2018. Deterministic algorithms for submodular maximization problems. ACM Transactions on Algorithms (TALG) 14(3): 32.
Buchbinder, N.; Feldman, M.; Seffi, J.; and Schwartz, R. 2015. A tight linear time (1/2)-approximation for unconstrained submodular maximization. SIAM Journal on Computing 44(5): 1384–1402.
Calinescu, G.; Chekuri, C.; Pál, M.; and Vondrák, J. 2011. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing 40(6): 1740–1766.
Chen, L.; Harshaw, C.; Hassani, H.; and Karbasi, A. 2018. Projection-Free Online Optimization with Stochastic Gradient: From Convexity to Submodularity. In Proceedings of the 35th International Conference on Machine Learning, 814–823.
Chen, L.; Hassani, H.; and Karbasi, A. 2018. Online continuous submodular maximization. In Proc. 21st International Conference on Artificial Intelligence and Statistics (AISTAT).
Djolonga, J.; and Krause, A. 2014. From MAP to marginals: Variational inference in Bayesian submodular models. In Advances in Neural Information Processing Systems, 244–252.
Djolonga, J.; Tschiatschek, S.; and Krause, A. 2016. Variational inference in mixed probabilistic submodular models. In Advances in Neural Information Processing Systems, 1759–1767.
Dürr, C.; Thang, N. K.; Srivastav, A.; and Tible, L. 2020. Non-monotone DR-submodular Maximization: Approximation and Regret Guarantees. In Proc. 29th International Joint Conference on Artificial Intelligence (IJCAI).
Elenberg, E.; Dimakis, A. G.; Feldman, M.; and Karbasi, A. 2017. Streaming weak submodularity: Interpreting neural networks on the fly. In Advances in Neural Information Processing Systems, 4044–4054.
Elenberg, E. R.; Khanna, R.; Dimakis, A. G.; Negahban, S.; et al. 2018.
Restricted strong convexity implies weak submodularity. The Annals of Statistics 46(6B): 3539–3568.
Ene, A.; and Nguyen, H. L. 2016. A reduction for optimizing lattice submodular functions with diminishing returns. arXiv:1606.08362.
Golovin, D.; and Krause, A. 2011. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research 42: 427–486.
Gomez Rodriguez, M.; Leskovec, J.; and Krause, A. 2010. Inferring networks of diffusion and influence. In Proc. 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1019–1028.
Guillory, A.; and Bilmes, J. A. 2011. Simultaneous Learning and Covering with Adversarial Noise. In ICML, volume 11, 369–376.
Hassani, H.; Soltanolkotabi, M.; and Karbasi, A. 2017. Gradient methods for submodular maximization. In Advances in Neural Information Processing Systems, 5841–5851.
Hazan, E. 2016. Introduction to online convex optimization. Foundations and Trends in Optimization 2(3-4): 157–325.
Ito, S.; and Fujimaki, R. 2016. Large-scale price optimization via network flow. In Advances in Neural Information Processing Systems, 3855–3863.
Kempe, D.; Kleinberg, J.; and Tardos, É. 2003. Maximizing the spread of influence through a social network. In Proc. 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 137–146.
Kulesza, A.; Taskar, B.; et al. 2012. Determinantal point processes for machine learning. Foundations and Trends in Machine Learning 5(2-3): 123–286.
Lin, H.; and Bilmes, J. 2011. A class of submodular functions for document summarization. In Proc. 49th Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, 510–520.
Mirrokni, V.; Leme, R. P.; Vladu, A.; and Wong, S. C.-w. 2017. Tight bounds for approximate Carathéodory and beyond. In Proc. 34th International Conference on Machine Learning, 2440–2448.
Mokhtari, A.; Hassani, H.; and Karbasi, A. 2018. Conditional Gradient Method for Stochastic Submodular Maximization: Closing the Gap. In Proceedings of the International Conference on Artificial Intelligence and Statistics, volume 84, 1886–1895.
Niazadeh, R.; Golrezaei, N.; Wang, J.; Susan, F.; and Badanidiyuru, A. 2020. Online learning via offline greedy: Applications in market design and optimization. Available at SSRN 3613756.
Niazadeh, R.; Roughgarden, T.; and Wang, J. R. 2018. Optimal Algorithms for Continuous Non-monotone Submodular and DR-Submodular Maximization. In Neural Information Processing Systems (NIPS).
Roughgarden, T.; and Wang, J. R. 2018. An Optimal Learning Algorithm for Online Unconstrained Submodular Maximization. In Conference On Learning Theory, 1307–1325.
Singla, A.; Bogunovic, I.; Bartók, G.; Karbasi, A.; and Krause, A. 2014. Near-Optimally Teaching the Crowd to Classify. In ICML, 154–162.
Soma, T.; and Yoshida, Y. 2017. Non-Monotone DR-Submodular Function Maximization. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence.
Staib, M.; and Jegelka, S. 2017. Robust Budget Allocation via Continuous Submodular Functions. In International Conference on Machine Learning, 3230–3240.
Streeter, M.; and Golovin, D. 2009. An online algorithm for maximizing submodular functions. In Advances in Neural Information Processing Systems, 1577–1584.
Vondrák, J. 2013. Symmetry and approximability of submodular maximization problems. SIAM Journal on Computing 42(1): 265–304.
Zhang, M.; Chen, L.; Hassani, H.; and Karbasi, A. 2019.
Online Continuous Submodular Maximization: From Full Information to Bandit Feedback. In Advances in Neural Information Processing Systems, 9206–9217.