# Multivariate Submodular Optimization

Richard Santiago*¹, F. Bruce Shepherd*²

*Equal contribution. ¹School of Computer Science, McGill University, Montreal, Canada. ²Department of Computer Science, University of British Columbia, Vancouver, Canada. Correspondence to: Richard Santiago. Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019.

Submodular functions have found a wealth of new applications in data science and machine learning models in recent years. This has been coupled with many algorithmic advances in the area of submodular optimization (SO): $\min / \max\ f(S) : S \in \mathcal{F}$, where $\mathcal{F}$ is a given family of feasible sets over a ground set $V$ and $f : 2^V \to \mathbb{R}$ is submodular. Our focus is on a more general class of multivariate submodular optimization (MVSO) problems: $\min / \max\ f(S_1, S_2, \ldots, S_k) : S_1 \uplus S_2 \uplus \cdots \uplus S_k \in \mathcal{F}$. Here we use $\uplus$ to denote union of disjoint sets, and hence this model is attractive where resources are being allocated across $k$ agents who share a joint multivariate nonnegative objective $f(S_1, S_2, \ldots, S_k)$ that captures some type of submodularity (i.e., diminishing returns) property. We provide explicit examples and potential applications for this new framework. For maximization, we show that practical algorithms such as accelerated greedy variants and distributed algorithms achieve good approximation guarantees for very general families (such as matroids and $p$-systems). For arbitrary families, we show that monotone (resp. nonmonotone) MVSO admits an $\alpha(1 - 1/e)$ (resp. $0.385\,\alpha$) approximation whenever monotone (resp. nonmonotone) SO admits an $\alpha$-approximation over the multilinear formulation. This substantially expands the family of tractable models. On the minimization side we give essentially optimal approximations in terms of the curvature of $f$.

## 1. Introduction

Submodularity is a property of set functions with deep theoretical consequences and a wide range of applications. Optimizing submodular functions is a central subject in operations research and combinatorial optimization (Lovász, 1983). It appears in many important optimization frameworks, including cuts in graphs, set covering problems, plant location problems, certain satisfiability problems, combinatorial auctions, and maximum entropy sampling. In machine learning it has recently been identified and utilized in domains such as viral marketing (Kempe et al., 2003), information gathering (Krause & Guestrin, 2007), image segmentation (Boykov & Jolly, 2001; Kohli et al., 2009; Jegelka & Bilmes, 2011), document summarization (Lin & Bilmes, 2011), and speeding up satisfiability solvers (Streeter & Golovin, 2009).

A set function $f : 2^V \to \mathbb{R}$ is submodular if $f(S) + f(T) \ge f(S \cup T) + f(S \cap T)$ for all $S, T \subseteq V$. We call $f$ monotone if $f(S) \le f(T)$ whenever $S \subseteq T$. Throughout, all functions are nonnegative, and we usually assume $f(\emptyset) = 0$. Our functions are given by a value oracle: for a given set $S$, an algorithm can query the oracle to obtain the value $f(S)$. We consider the following broad class of submodular optimization (SO) problems:

$$\mathrm{SO}(\mathcal{F}) \qquad \min / \max\ f(S) \ :\ S \in \mathcal{F},$$

where $f$ is a nonnegative submodular set function on a finite ground set $V$, and $\mathcal{F} \subseteq 2^V$ is a family of feasible sets. These problems have been well studied for a variety of set families $\mathcal{F}$. We explore the connections between these (single-agent) problems and their more general multivariate incarnations.
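The value-oracle setup above is easy to exercise in code. Below is a minimal sketch (ours, not from the paper) of a coverage-function oracle together with an empirical check of the defining inequality $f(S) + f(T) \ge f(S \cup T) + f(S \cap T)$; the names `coverage` and `check_submodular` are illustrative.

```python
import random

def coverage(sets_by_element, S):
    """Value oracle: f(S) = number of targets covered by the elements of S."""
    covered = set()
    for v in S:
        covered |= sets_by_element[v]
    return len(covered)

def check_submodular(f, V, trials=1000, seed=0):
    """Empirically test f(S) + f(T) >= f(S | T) + f(S & T) on random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        S = {v for v in V if rng.random() < 0.5}
        T = {v for v in V if rng.random() < 0.5}
        if f(S) + f(T) < f(S | T) + f(S & T):
            return False
    return True

if __name__ == "__main__":
    rng = random.Random(1)
    V = range(8)
    # Each element covers a random subset of 20 targets.
    sets_by_element = {v: {t for t in range(20) if rng.random() < 0.3} for v in V}
    f = lambda S: coverage(sets_by_element, S)
    print(check_submodular(f, V))  # coverage functions are submodular: True
```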
In the multivariate (MV) version, we have $k$ agents and a joint multivariate nonnegative objective $f(S_1, S_2, \ldots, S_k)$ that captures some type of submodularity (i.e., diminishing returns) property (see Section 1.1). As before, we are looking for sets $S \in \mathcal{F}$; however, we now have a two-phase task: the elements of $S$ must also be partitioned amongst the agents. Hence we have set variables $S_i$ and seek to optimize $f(S_1, S_2, \ldots, S_k)$. This leads to the multivariate submodular optimization (MVSO) versions:

$$\mathrm{MVSO}(\mathcal{F}) \qquad \min / \max\ f(S_1, S_2, \ldots, S_k) \ :\ S_1 \uplus S_2 \uplus \cdots \uplus S_k \in \mathcal{F}.$$

Our main objective is to study the approximability of the multivariate problems in terms of their single-agent versions. We refer to the MV gap as the approximation factor loss incurred by moving to the multivariate setting. To the best of our knowledge, neither the MVSO($\mathcal{F}$) framework for general families $\mathcal{F}$ nor the notion of MV gap has been considered before.

An important special case of MVSO occurs when the function $f(S_1, \ldots, S_k)$ can be separated as $f(S_1, \ldots, S_k) = \sum_{i=1}^k f_i(S_i)$, where the $f_i$ are all submodular; in this case we say that $f$ is separable. This leads to the class of multi-agent submodular optimization (MASO) problems (see the related work section):

$$\mathrm{MASO}(\mathcal{F}) \qquad \min / \max\ \sum_{i=1}^k f_i(S_i) \ :\ S_1 \uplus S_2 \uplus \cdots \uplus S_k \in \mathcal{F}.$$

## 1.1. Multivariate Submodular Optimization

We consider functions of several variables which satisfy the following type of submodularity property. A multivariate function $f : (2^V)^k \to \mathbb{R}$ is $k$-multi-submodular if for all pairs of tuples $(S_1, S_2, \ldots, S_k), (T_1, T_2, \ldots, T_k) \in (2^V)^k$ we have

$$f(S_1, \ldots, S_k) + f(T_1, \ldots, T_k) \ge f(S_1 \cup T_1, \ldots, S_k \cup T_k) + f(S_1 \cap T_1, \ldots, S_k \cap T_k).$$

We call $f$ normalized if $f(\emptyset, \ldots, \emptyset) = 0$, and monotone if $f(S_1, \ldots, S_k) \le f(T_1, \ldots, T_k)$ for all tuples $(S_1, \ldots, S_k)$ and $(T_1, \ldots, T_k)$ satisfying $S_i \subseteq T_i$ for all $i \in [k]$. For $k = 1$ a $k$-multi-submodular function is just a submodular function. In Appendix A of the paper's full version we discuss how $k$-multi-submodular functions can also be characterized (or defined) in terms of diminishing returns.

Two explicit examples of (non-separable) $k$-multi-submodular functions (see Appendix B in the full version for proofs) are the following.

**Example 1.** Consider a multilinear function $h : \mathbb{Z}_+^k \to \mathbb{R}$ given by $h(z) = \sum_{S \subseteq [k]} a_S \prod_{m \in S} z_m$. Let $f : (2^V)^k \to \mathbb{R}$ be the multivariate set function defined as $f(S_1, \ldots, S_k) = h(|S_1|, \ldots, |S_k|)$. Then $f$ is $k$-multi-submodular if and only if $a_S \le 0$ for all $S \subseteq [k]$ with $|S| \ge 2$. (Terms with $|S| \le 1$ contribute a modular part, so their coefficients are unconstrained.)

**Example 2.** Let $h : \mathbb{Z}_+^k \to \mathbb{R}$ be a quadratic function given by $h(z) = z^{\mathsf T} A z$. Let $f : (2^V)^k \to \mathbb{R}$ be the multivariate set function defined as $f(S_1, \ldots, S_k) = h(|S_1|, \ldots, |S_k|)$. Then $f$ is $k$-multi-submodular if and only if $A = (a_{ij})$ satisfies $a_{ij} + a_{ji} \le 0$ for all $i, j \in [k]$.

We believe the above examples are useful for modelling competition between agents in many domains. In Section 1.3 we discuss an application to sensor placement problems.
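As a sanity check on Example 2, the following sketch (our illustration; `f_quadratic` and `check_multi_submodular` are made-up names) builds $f(S_1, \ldots, S_k) = z^{\mathsf T} A z$ for a random $A$ with $a_{ij} + a_{ji} \le 0$ and tests the $k$-multi-submodular inequality on random pairs of tuples.

```python
import random

def f_quadratic(A, tup):
    """f(S_1,...,S_k) = z^T A z with z = (|S_1|,...,|S_k|), as in Example 2."""
    z = [len(S) for S in tup]
    k = len(z)
    return sum(A[i][j] * z[i] * z[j] for i in range(k) for j in range(k))

def check_multi_submodular(f, k, V, trials=2000, seed=0):
    """Test f(S)+f(T) >= f(componentwise union) + f(componentwise intersection)."""
    rng = random.Random(seed)
    for _ in range(trials):
        S = tuple({v for v in V if rng.random() < 0.5} for _ in range(k))
        T = tuple({v for v in V if rng.random() < 0.5} for _ in range(k))
        join = tuple(S[i] | T[i] for i in range(k))
        meet = tuple(S[i] & T[i] for i in range(k))
        if f(S) + f(T) < f(join) + f(meet) - 1e-9:  # small float tolerance
            return False
    return True

if __name__ == "__main__":
    rng = random.Random(7)
    k, V = 3, range(6)
    # Random A with a_ij + a_ji <= 0: symmetrize a random matrix and negate it.
    B = [[rng.random() for _ in range(k)] for _ in range(k)]
    A = [[-(B[i][j] + B[j][i]) / 2 for j in range(k)] for i in range(k)]
    print(check_multi_submodular(lambda t: f_quadratic(A, t), k, V))  # True
```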
## 1.2. Our Contributions

Our first contribution is to show that the MV framework can model much more general problems than the separable multi-agent (i.e., MASO) framework. This is quantitatively captured in the next information-theoretic result (see Section 3.1), where we establish a large gap between the two problems:

$$\text{(MV-Min)} \qquad \min\ f(S_1, \ldots, S_k) \ \text{ s.t. } \ S_1 \uplus \cdots \uplus S_k = V,$$
$$\text{(MA-Min)} \qquad \min\ \textstyle\sum_{i=1}^k f_i(S_i) \ \text{ s.t. } \ S_1 \uplus \cdots \uplus S_k = V.$$

**Theorem 3.** The MV-Min problem with a nonnegative monotone $k$-multi-submodular objective function cannot be approximated to a ratio $o(n / \log n)$ in the value oracle model with a polynomial number of queries, whereas its separable version MA-Min has a tight $O(\log n)$-approximation polytime algorithm for nonnegative monotone submodular functions $f_i$.

The above result shows that the MV model may also potentially face roadblocks in terms of tractability. Fortunately, we can show that the multivariate problem remains very well behaved in the maximization setting. Our main result establishes that if the single-agent problem for a family $\mathcal{F}$ admits approximation via its multilinear relaxation (see Section 2.2), then we may extend this to its multivariate version with a constant factor loss.

**Theorem 4.** If there is a (polytime) $\alpha(n)$-approximation for monotone SO($\mathcal{F}$) maximization via its multilinear relaxation, then there is a (polytime) $(1 - 1/e)\,\alpha(n)$-approximation for monotone MVSO($\mathcal{F}$) maximization. Furthermore, given a downwards closed family $\mathcal{F}$, if there is a (polytime) $\alpha(n)$-approximation for nonmonotone SO($\mathcal{F}$) maximization via its multilinear relaxation, then there is a (polytime) $0.385\,\alpha(n)$-approximation for nonmonotone MVSO($\mathcal{F}$) maximization.

We note that the multilinear relaxation can be efficiently evaluated for a large class of practical and useful submodular functions (Iyer et al., 2014), thus making these algorithms viable for many real-world machine learning problems.

We remark that the MV gap of $1 - 1/e$ for monotone objectives is tight, in the sense that there are families where it cannot be improved. For instance, $\mathcal{F} = \{V\}$ has a trivial 1-approximation for the single-agent problem, and a $1 - 1/e$ inapproximability factor for the separable multi-agent (i.e., MASO) version (Khot et al., 2005; Mirrokni et al., 2008), and hence also for the more general MVSO problem.

An immediate application of Theorem 4 is that it provides the first constant (and in fact optimal) $(1 - 1/e)$-approximation for the monotone generalized submodular welfare problem $\max\ f(S_1, S_2, \ldots, S_k) : S_1 \uplus \cdots \uplus S_k = V$. This problem generalizes the well-studied submodular welfare problem (Lehmann et al., 2006; Vondrák, 2008; Korula et al., 2018), which captures several allocation problems and has important applications in combinatorial auctions, Internet advertising, and network routing. MV objectives can capture much more general interactions among the agents/bidders: a bidder's valuation now depends not only on the set $S$ of items that she gets, but also on the items that her strategic partners and competitors get. For instance, in a bandwidth spectrum auction, this could capture a company's interest in maximizing compatibility and preventing cross-border interference.

In Section 2 we describe a simple reduction which shows that for some families¹ an (optimal) MV gap of 1 holds. We also discuss how, for those families, practical algorithms (such as accelerated greedy variants and distributed algorithms) can be used and lead to good approximation guarantees.

¹ A family of sets $\mathcal{F}$ is a $p$-system if for all $S \in \mathcal{F}$ and $v \in V$ there exists a set $T \subseteq S$ such that $|T| \le p$ and $(S \setminus T) \cup \{v\} \in \mathcal{F}$. A matroid is a 1-system. Cardinality and partition constraints are examples of matroids. We refer the reader to Schrijver (2003) and Calinescu et al. (2007; 2011) for a comprehensive discussion.

**Theorem 5.** Let $\mathcal{F}$ be a matroid, a $p$-matroid intersection, or a $p$-system. Then, if there is a (polytime) $\alpha$-approximation algorithm for monotone (resp. nonmonotone) SO($\mathcal{F}$) maximization, there is a (polytime) $\alpha$-approximation algorithm for monotone (resp. nonmonotone) MVSO($\mathcal{F}$) maximization.
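For intuition on the generalized submodular welfare problem above, the simple greedy of Fisher et al. (1978) for $\mathcal{F} = \{V\}$ (discussed in Section 1.4) applies directly to $k$-multi-submodular objectives: repeatedly make the single assignment of largest marginal gain. A minimal sketch (our rendering), assuming a value oracle `f` over tuples of sets:

```python
def greedy_welfare(f, V, k):
    """Greedy for max f(S_1,...,S_k) s.t. S_1 disjoint-union ... union S_k = V:
    repeatedly make the assignment (agent i, element v) of largest marginal
    gain. For monotone k-multi-submodular f this achieves a 1/2-approximation
    (Fisher et al., 1978, as discussed in Section 1.4)."""
    sol = [set() for _ in range(k)]
    unassigned = set(V)
    base = f(tuple(sol))
    while unassigned:
        best = None
        for v in unassigned:
            for i in range(k):
                trial = tuple(S | {v} if j == i else S for j, S in enumerate(sol))
                gain = f(trial) - base
                if best is None or gain > best[0]:
                    best = (gain, i, v)
        _, i, v = best
        sol[i].add(v)
        unassigned.discard(v)
        base = f(tuple(sol))
    return tuple(sol)

if __name__ == "__main__":
    import math
    # A toy monotone k-multi-submodular objective: sum of sqrt(|S_i|).
    f = lambda tup: sum(math.sqrt(len(S)) for S in tup)
    print(greedy_welfare(f, range(6), k=2))
```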
On the minimization side our approximation results and MV gaps are larger. This is somewhat expected, given the strong hardness results already known for single-agent constrained submodular minimization (see Section 1.4). However, we give essentially tight approximations in terms of the objective's curvature. The notion of curvature has been widely used for univariate functions (Conforti & Cornuéjols, 1984; Vondrák, 2010; Iyer et al., 2013; Bai & Bilmes, 2018), since it allows for better approximations and is linear-time computable.

Given a tuple $(S_1, \ldots, S_k) \in (2^V)^k$ and $(i, v) \in [k] \times V$, we denote by $(S_1, \ldots, S_k) + (i, v)$ the new tuple $(S_1, \ldots, S_{i-1}, S_i + v, S_{i+1}, \ldots, S_k)$. It is then natural to think of the quantity

$$f_{(S_1, \ldots, S_k)}((i, v)) := f((S_1, \ldots, S_k) + (i, v)) - f(S_1, \ldots, S_k)$$

as the marginal gain of assigning element $v$ to agent $i$ in the tuple $(S_1, \ldots, S_k)$. We also use $f((i, v))$ to denote the quantity $f(\emptyset, \ldots, \emptyset, \{v\}, \emptyset, \ldots, \emptyset)$, where $v$ appears in the $i$th component. Then, given a normalized monotone $k$-multi-submodular function $f : (2^V)^k \to \mathbb{R}$, we define its total curvature $c$ and its curvature $c(S_1, \ldots, S_k)$ with respect to a tuple $(S_1, \ldots, S_k)$ as

$$c = 1 - \min_{i \in [k],\, v \in V} \frac{f_{(V, V, \ldots, V) - (i, v)}((i, v))}{f((i, v))} \qquad (1)$$

$$c(S_1, \ldots, S_k) = 1 - \min_{i \in [k],\, v \in S_i} \frac{f_{(S_1, \ldots, S_k) - (i, v)}((i, v))}{f((i, v))}.$$

We prove the following result for $k$-multi-submodular objectives. We note the gap is stronger in the sense that it is relative to the single-agent modular problem.² See Section 3.1 of the full version for proof details.

**Theorem 6.** Let $f$ be a monotone $k$-multi-submodular function, and let $\mathcal{F}$ be a family that admits a (polytime) $\beta$-approximation over modular functions. Denote by $(S_1^*, \ldots, S_k^*)$ an optimal solution to monotone MVSO($\mathcal{F}$) minimization, and by $c(S_1^*, \ldots, S_k^*)$ the curvature of $f$ with respect to $(S_1^*, \ldots, S_k^*)$. Then there is a (polytime)

$$\beta \cdot \frac{\sum_{i \in [k]} |S_i^*|}{1 + \big(\sum_{i \in [k]} |S_i^*| - 1\big)\big(1 - c(S_1^*, \ldots, S_k^*)\big)}$$

-approximation algorithm for monotone MVSO($\mathcal{F}$) minimization.

² A set function $f : 2^V \to \mathbb{R}$ is modular if $f(A) + f(B) = f(A \cup B) + f(A \cap B)$ for all $A, B \subseteq V$. Modular functions can always be expressed in the form $f(S) = w(S) := \sum_{v \in S} w(v)$ for some weight function $w : V \to \mathbb{R}$.

In some situations the above result leads to approximation factors highly preferable to those obtained for general functions (i.e., objectives with curvature 1). Examples include families like $\mathcal{F} = \{V\}$, spanning trees, or perfect matchings, where exact algorithms are available for modular objectives (i.e., $\beta = 1$) and any optimal solution $(S_1^*, \ldots, S_k^*)$ satisfies $\sum_{i \in [k]} |S_i^*| = \Omega(n)$. Thus we go from polynomial approximation factors (for objectives with curvature 1) to constant or logarithmic factors (for constant or $1 - \frac{1}{\log n}$ curvature). Moreover, knowing the curvature $c(S_1^*, \ldots, S_k^*)$ can be much more beneficial than knowing the total curvature $c$. Consider, for instance, the problem $\min\ f(S_1, \ldots, S_k) : S_1 \uplus \cdots \uplus S_k = V$ with $f(S_1, \ldots, S_k) = \min\{n, \sum_{i=1}^k |S_i|\}$. Here the total curvature of $f$ is 1 (hence leading to an $n$-approximation in Theorem 6), while the curvature $c(S_1^*, \ldots, S_k^*)$ with respect to any partition $(S_1^*, \ldots, S_k^*)$ is 0 (thus leading to an exact, factor-1, approximation via Theorem 6).
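Since the total curvature in (1) depends only on the $2nk$ oracle values $f((i, v))$ and $f_{(V, \ldots, V) - (i, v)}((i, v))$, it is indeed computable with a linear number of queries. A sketch (our illustrative implementation, assuming a value oracle `f` over tuples):

```python
def total_curvature(f, V, k):
    """Total curvature c from Eq. (1): c = 1 - min over (i, v) of
    f_{(V,...,V)-(i,v)}((i,v)) / f((i,v)), using O(nk) oracle calls."""
    V = set(V)
    full = tuple(set(V) for _ in range(k))
    f_full = f(full)
    ratio = 1.0
    for i in range(k):
        for v in V:
            singleton = tuple({v} if j == i else set() for j in range(k))
            denom = f(singleton)                      # f((i, v))
            if denom == 0:
                continue
            drop = tuple(V - {v} if j == i else set(V) for j in range(k))
            marginal = f_full - f(drop)               # marginal gain at the top
            ratio = min(ratio, marginal / denom)
    return 1.0 - ratio

if __name__ == "__main__":
    # f(S_1,...,S_k) = min(n, sum |S_i|) has total curvature 1, as in the text.
    V, k = range(5), 2
    n = len(V)
    f = lambda tup: min(n, sum(len(S) for S in tup))
    print(total_curvature(f, V, k))  # 1.0
```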
In Section 3.1 we give evidence that Theorem 6 is essentially tight, even for $\mathcal{F} = \{V\}$, where we show the following curvature-dependent information-theoretic lower bound.

**Theorem 7.** The monotone MVSO($\mathcal{F}$) minimization problem over $\mathcal{F} = \{V\}$ and objectives $f$ with total curvature $c$ cannot be approximated to a ratio $o\!\left(\frac{n/\log n}{1 + (n/\log n - 1)(1 - c)}\right)$ in the value oracle model with a polynomial number of queries.

Finally, we give an approximation in terms of the number of agents $k$, which may be preferable in settings where $k$ is not too large. See Section 3.2 of the full version for details.

**Theorem 8.** If there is a (polytime) $\alpha(n)$-approximation for monotone SO($\mathcal{F}$) minimization based on rounding the convex relaxation, then there is a (polytime) $k\,\alpha(n)$-approximation for monotone MVSO($\mathcal{F}$) minimization.

## 1.3. The Multivariate Model and Applications

Our second objective is to extend the multivariate model and show that in some cases this larger class remains tractable. Specifically, we define the capacitated multivariate submodular optimization (CMVSO) problem as follows:

$$\mathrm{CMVSO}(\mathcal{F}) \qquad \max / \min\ f(S_1, S_2, \ldots, S_k) \ \text{ s.t. } \ S_1 \uplus \cdots \uplus S_k \in \mathcal{F}, \quad S_i \in \mathcal{F}_i \ \ \forall i \in [k],$$

where we are supplied with subfamilies $\mathcal{F}_i$. Our results imply that one maintains good approximations even while adding interesting side constraints. For example, for a monotone maximization instance of CMVSO where $\mathcal{F}$ is a $p$-matroid intersection and the $\mathcal{F}_i$ are all matroids, our results from Section 2 lead to a $(\frac{1}{p+1} - \epsilon)$-approximation algorithm via the multilinear relaxation, or a $1/(p+2)$-approximation via a simple greedy algorithm. We believe that these, combined with other results from Section 2, substantially expand the family of tractable models (both in theory and practice) for maximization.

Many existing applications fit into the CMVSO framework, and some of these can be enriched through the added flexibility of the capacitated model. For instance, one may include set bounds on the variables ($L_i \subseteq S_i \subseteq U_i$ for each $i$) or simple cardinality constraints ($|S_i| \le b_i$ for each $i$). A well-studied (Fleischer et al., 2006; Goundan & Schulz, 2007; Calinescu et al., 2011) application of CMVSO in the maximization setting is the Separable Assignment Problem (SAP), which corresponds to the setting where the objective is separable and modular, the $\mathcal{F}_i$ are downward closed (i.e., hereditary) families, and $\mathcal{F} = 2^V$. The following example illustrates CMVSO's potential as a general model.

**Example 9 (Sensor Placement with Multivariate Objectives).** Sensor placement and information gathering problems have been popular in the submodularity literature (Krause & Guestrin, 2007; Krause et al., 1973; 2008). Given a set of sensors $V$ and a set of possible locations $\{1, 2, \ldots, k\}$, the goal is to place sensors at some of the locations so as to maximize the informativeness gathered. There is usually also a budget constraint restricting the total number of sensors that can be deployed. This application is well suited to a $k$-multi-submodular objective function $f(S_1, \ldots, S_k)$ which measures the informativeness of placing sensors $S_i$ at location $i$. A natural mathematical formulation is

$$\max\ f(S_1, S_2, \ldots, S_k) \ \text{ s.t. } \ S_1 \uplus S_2 \uplus \cdots \uplus S_k \in \mathcal{F}, \quad S_i \in \mathcal{F}_i,$$

where $\mathcal{F} := \{S \subseteq V : |S| \le b\}$ imposes the budget constraint and the $\mathcal{F}_i$ give additional modelling flexibility. For instance, we could impose $\mathcal{F}_i = \{S \subseteq V_i : |S| \le b_i\}$ to constrain the types and number of sensors that can be placed at location $i$. Notice that in these cases both $\mathcal{F}$ and the $\mathcal{F}_i$ are matroids, and hence the algorithms from Section 2.4 apply.

One may form a multivariate objective by defining $f(S_1, S_2, \ldots, S_k) = \sum_i f_i(S_i) - R(S_1, S_2, \ldots, S_k)$, where the $f_i$ measure the benefit of placing sensors $S_i$ at location $i$, and $R(\cdot)$ is a redundancy function. If the $f_i$ are submodular and $R(\cdot)$ is $k$-multi-supermodular (i.e., $-R(\cdot)$ is $k$-multi-submodular), then $f$ is $k$-multi-submodular. In this setting, it is natural to take the $f_i$ to be coverage functions, where $f_i(S_i)$ measures the coverage of placing sensors $S_i$ at location $i$. We next propose a family of redundancy functions which are $k$-multi-supermodular.

**Supermodular penalty measures via quadratic functions.** We denote $\mathbf{S} := (S_1, S_2, \ldots, S_k)$ and $z_{\mathbf{S}} := (|S_1|, |S_2|, \ldots, |S_k|)$. One can show (see Lemma 28 in Appendix B of the full version) that if $A$ is a matrix satisfying $a_{ij} + a_{ji} \ge 0$, then $R(\mathbf{S}) := z_{\mathbf{S}}^{\mathsf T} A z_{\mathbf{S}}$ is $k$-multi-supermodular. For this particular example one could, for instance, take redundancy coefficients $a_{ij} = \Theta\big(\tfrac{1}{d(i,j)^2}\big)$, where $d(i, j)$ denotes the distance between locations $i$ and $j$. This can be further extended so that different sensor types contribute different weights to the vector $z_{\mathbf{S}}$, e.g., define $z_{\mathbf{S}}(i) = \sum_{j \in S_i} w(j)$ for an associated sensor weight vector $w$.
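The following sketch puts the pieces of Example 9 together: coverage benefits $f_i$ minus a quadratic redundancy penalty $R$. The location coordinates, coefficient scale, and target sets are made-up inputs for illustration.

```python
import math
import random

def redundancy(A, tup):
    """R(S) = z_S^T A z_S with z_S = (|S_1|,...,|S_k|); k-multi-supermodular
    whenever a_ij + a_ji >= 0 (Lemma 28 of the full version)."""
    z = [len(S) for S in tup]
    k = len(z)
    return sum(A[i][j] * z[i] * z[j] for i in range(k) for j in range(k))

def make_objective(coverages, A):
    """f = sum_i f_i(S_i) - R(S): submodular benefits minus a supermodular
    redundancy penalty, hence k-multi-submodular (as noted in Example 9)."""
    def f(tup):
        return sum(fi(S) for fi, S in zip(coverages, tup)) - redundancy(A, tup)
    return f

if __name__ == "__main__":
    rng = random.Random(3)
    k = 3
    loc = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]  # hypothetical location coordinates
    # Redundancy coefficients Theta(1/d(i,j)^2); zero on the diagonal here.
    A = [[0.0 if i == j else 0.1 / math.dist(loc[i], loc[j]) ** 2
          for j in range(k)] for i in range(k)]
    targets = {v: {t for t in range(15) if rng.random() < 0.3} for v in range(8)}

    def cov(S):
        covered = set()
        for v in S:
            covered |= targets[v]
        return len(covered)

    f = make_objective([cov] * k, A)
    print(f(({0, 1}, {2}, set())))
```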
## 1.4. Related Work

Submodularity naturally arises in many machine learning applications such as viral marketing (Kempe et al., 2003), information gathering (Krause & Guestrin, 2007), image segmentation (Boykov & Jolly, 2001; Kohli et al., 2009; Jegelka & Bilmes, 2011), document summarization (Lin & Bilmes, 2011), news article recommendation (El-Arini et al., 2009), active learning (Golovin & Krause, 2011), and speeding up SAT solvers (Streeter & Golovin, 2009).

**Single-agent optimization.** Minimizing a submodular function can be done in polytime (Grötschel et al., 2012; Schrijver, 2000; Iwata et al., 2001). Unconstrained maximization, on the other hand, is known to be inapproximable for general submodular functions, but admits a polytime constant-factor approximation algorithm when $f$ is nonnegative (Buchbinder et al., 2015; Feige et al., 2011). For constrained maximization, the classical work of Nemhauser et al. (1978), Nemhauser & Wolsey (1978), and Fisher et al. (1978) established an optimal $(1 - 1/e)$-approximation for nonnegative monotone maximization under a cardinality constraint, and a $1/(k+1)$-approximation under $k$ matroid constraints. The latter is almost tight, since there is an $\Omega(\log(k)/k)$ inapproximability result (Hazan et al., 2006).
For nonnegative monotone functions, Vondrák (2008) and Calinescu et al. (2011) give an optimal $(1 - 1/e)$-approximation based on the multilinear extension when $\mathcal{F}$ is a matroid, and Lee et al. (2010) give a local-search algorithm that achieves a $(1/k - \epsilon)$-approximation (for any fixed $\epsilon > 0$) when $\mathcal{F}$ is a $k$-matroid intersection. For nonnegative nonmonotone functions, a 0.385-approximation (Buchbinder & Feldman, 2016) is the best factor known for a matroid constraint. Lee et al. (2009) give a $1/(k + O(1))$-approximation for $k$ matroid constraints with $k$ fixed, and Gupta et al. (2010) give a simple multi-greedy algorithm that matches the approximation of Lee et al. but is polytime for any $k$.

For constrained minimization the news is worse (Goel et al., 2009; Svitkina & Fleischer, 2011; Iwata & Nagano, 2009). If $\mathcal{F}$ consists of spanning trees, Goel et al. (2009) show a lower bound of $\Omega(n)$, while if $\mathcal{F}$ corresponds to the cardinality constraint $\{S : |S| \ge k\}$, Svitkina & Fleischer (2011) show a lower bound of $\Omega(\sqrt{n})$. There are a few exceptions: the problem can be solved exactly when $\mathcal{F}$ is a ring family (Schrijver, 2000), a triple family (Grötschel et al., 2012), or a parity family (Goemans & Ramakrishnan, 1995). In the context of NP-hard problems, there is a 2-approximation (Goel et al., 2009; Iwata & Nagano, 2009) for submodular vertex cover, and an $O(k)$-approximation for $k$-uniform hitting set.

**Multivariate problems.** The notion of $k$-multi-submodularity already appeared (under the name of multidimensional submodularity) in the work of Fisher et al. (1978), where the multivariate monotone maximization problem with $\mathcal{F} = \{V\}$ is considered as a motivating example for submodular maximization subject to a matroid constraint. They show that for this problem a simple greedy algorithm achieves a 1/2-approximation. The work of Singh et al. (2012) considers the special case of 2-multi-submodular functions (which they call simple bisubmodular). They give constant-factor approximations for maximizing monotone 2-multi-submodular functions under cardinality and partition constraints, and provide applications to coupled sensor placement and coupled feature selection problems.

Other extensions of submodular functions to multivariate settings have been studied, including bisubmodular functions (Qi, 1988; Ando et al., 1996; Fujishige & Iwata, 2005; Bouchet & Cunningham, 1995), $k$-submodular functions (Huber & Kolmogorov, 2012; Ward & Živný, 2014; Ohsaka & Yoshida, 2015), and skew bisubmodular functions (Huber et al., 2014; Thapper & Živný, 2016; Thapper & Živný, 2012).

Finally, as mentioned in the introduction, an important class of (multi-agent submodular optimization) problems arises when $f(S_1, \ldots, S_k) = \sum_{i \in [k]} f_i(S_i)$. These problems have been widely studied when $\mathcal{F} = \{V\}$, both for minimization (Hayrapetyan et al., 2005; Svitkina & Tardos, 2010; Ene & Vondrák, 2014; Chekuri & Ene, 2011) and maximization (Fisher et al., 1978; Lehmann et al., 2006; Vondrák, 2008), and have also been considered for general families (Goel et al., 2009; Santiago & Shepherd, 2018).

## 2. Multivariate Submodular Maximization

We describe two different reductions. The first reduces the capacitated multivariate problem CMVSO to a single-agent SO problem, and is based on the simple idea of taking $k$ disjoint copies of the original ground set. We use this to establish an (optimal) MV gap of 1 for families such as spanning trees, matroids, and $p$-systems. The second reduction is based on the multilinear extension of a set function. We show that if the single-agent problem admits approximation via its multilinear relaxation (see Section 2.2), then we may extend this to its multivariate version with a constant factor loss, in both the monotone and nonmonotone settings. For the monotone case the MV gap is tight.

## 2.1. The Lifting Reduction

We describe a generic reduction of CMVSO to a single-agent problem $\max / \min\ f'(S) : S \in \mathcal{L}$. The argument is based on the idea of viewing assignments of elements $v$ to agents $i$ in a multi-agent bipartite graph. This simple idea (which is equivalent to making $k$ disjoint copies of the ground set) already appeared in the classical work of Fisher et al. (1978), and has since been widely used (Lehmann et al., 2006; Vondrák, 2008; Calinescu et al., 2011; Singh et al., 2012; Santiago & Shepherd, 2018). We review it briefly here for completeness.

Consider the complete bipartite graph $G = ([k] + V, E)$. Every subset of edges $S \subseteq E$ can be written uniquely as $S = \bigcup_{i \in [k]} (\{i\} \times S_i)$ for some sets $S_i \subseteq V$. This allows us to go from a multivariate objective (such as the one in CMVSO) to a univariate objective $f' : 2^E \to \mathbb{R}$ over the lifted space: for each set $S \subseteq E$ we define $f'(S) = f(S_1, S_2, \ldots, S_k)$. The function $f'$ is well defined because of the one-to-one correspondence between sets $S \subseteq E$ and tuples $(S_1, \ldots, S_k) \in (2^V)^k$. We consider two families of sets over $E$ that capture the original constraints:

$$\mathcal{F}' := \{S \subseteq E : S_1 \uplus \cdots \uplus S_k \in \mathcal{F}\}, \qquad \mathcal{H} := \{S \subseteq E : S_i \in \mathcal{F}_i \ \forall i \in [k]\}.$$

We now have

$$\max / \min\ f(S_1, \ldots, S_k) \ \text{ s.t. } \ S_1 \uplus \cdots \uplus S_k \in \mathcal{F},\ S_i \in \mathcal{F}_i \ \forall i
\;=\; \max / \min\ f'(S) \ \text{ s.t. } \ S \in \mathcal{F}' \cap \mathcal{H}
\;=\; \max / \min\ f'(S) \ \text{ s.t. } \ S \in \mathcal{L},$$

where in the last step we let $\mathcal{L} := \mathcal{F}' \cap \mathcal{H}$.

Clearly, this reduction is interesting if our new function $f'$ and the family of sets $\mathcal{L}$ have properties which allow us to handle them computationally. This depends on the structure of the original function $f$ and the set families $\mathcal{F}$ and $\mathcal{F}_i$. The following is straightforward.

**Claim 10.** If $f$ is a (nonnegative, respectively monotone) $k$-multi-submodular function, then $f'$ as defined above is also a (nonnegative, respectively monotone) submodular function.

In Section 2.4 we discuss several properties of the families $\mathcal{F}$ and $\mathcal{F}_i$ that are preserved under this reduction, as well as their algorithmic consequences.
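In code, the lifting is a one-liner: represent $S \subseteq E$ as a set of (agent, element) pairs and evaluate $f$ on the induced tuple. A minimal sketch (`lift` is our name for it):

```python
def lift(f, k):
    """Lift a k-multi-submodular f over tuples to a univariate f' over
    subsets of E = [k] x V: f'(S) = f(S_1,...,S_k) with S_i = {v : (i,v) in S}.
    By Claim 10, f' inherits nonnegativity, monotonicity, and submodularity."""
    def f_lifted(S):
        tup = tuple({v for (i, v) in S if i == j} for j in range(k))
        return f(tup)
    return f_lifted

if __name__ == "__main__":
    f = lambda tup: min(4, sum(len(Si) for Si in tup))  # a k-multi-submodular toy
    fp = lift(f, k=2)
    S = {(0, "a"), (0, "b"), (1, "c")}  # agent 0 gets {a, b}, agent 1 gets {c}
    print(fp(S))  # 3
```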
## 2.2. Multilinear Extensions for MV Problems

For a set function $f : 2^V \to \mathbb{R}$ we define its multilinear extension $f^M : [0, 1]^V \to \mathbb{R}$ as

$$f^M(z) = \sum_{S \subseteq V} f(S) \prod_{v \in S} z_v \prod_{v \notin S} (1 - z_v).$$

This extension (introduced in Calinescu et al. (2007)) has several nice properties. One very useful property is the following.

**Proposition 11.** Let $f : 2^V \to \mathbb{R}$ be a submodular function and $f^M : [0, 1]^n \to \mathbb{R}$ its multilinear extension. Then $f^M$ is convex along directions $d = e_{v_i} - e_{v_j}$ for $i, j \in \{1, 2, \ldots, n\}$, where $e_v$ denotes the characteristic vector of $\{v\}$.

This gives rise to natural relaxations. The single-agent multilinear extension relaxation is

$$\text{(SA-ME)} \qquad \max\ f^M(z) \ :\ z \in P(\mathcal{F}),$$

and the multivariate multilinear extension relaxation is

$$\text{(MV-ME)} \qquad \max\ f'^M(z_1, z_2, \ldots, z_k) \ :\ z_1 + z_2 + \cdots + z_k \in P(\mathcal{F}),$$

where $P(\mathcal{F})$ denotes some relaxation of the polytope $\mathrm{conv}(\{\chi_S : S \in \mathcal{F}\})$³ and $f'$ is the lifted univariate function from the reduction in Section 2.1. Note that $f'$ is defined over vectors $z = (z_1, z_2, \ldots, z_k) \in [0, 1]^E$, where we think of $z_i \in \mathbb{R}^n$ as the vector associated with agent $i$.

³ $\mathrm{conv}(X)$ denotes the convex hull of a set $X$ of vectors, and $\chi_S$ denotes the characteristic vector of the set $S$.

The relaxation SA-ME has been widely used (Calinescu et al., 2011; Lee et al., 2009; Feldman et al., 2011; Ene & Nguyen, 2016; Buchbinder & Feldman, 2016) for submodular maximization. The next result discusses its solvability. We note that MV-ME can be reduced to an instance of SA-ME, and thus be approximated to the same factor (see Section 2.2 of the full version).

**Theorem 12 (Buchbinder & Feldman (2016); Vondrák (2008)).** Let $f : 2^V \to \mathbb{R}_+$ be a nonnegative submodular function and $f^M : [0, 1]^V \to \mathbb{R}_+$ its multilinear extension. Let $P \subseteq [0, 1]^V$ be any downwards closed polytope that admits a polytime separation oracle, and denote $OPT = \max\ f^M(z) : z \in P$. Then there is a polytime algorithm (Buchbinder & Feldman, 2016) that finds $z^* \in P$ such that $f^M(z^*) \ge 0.385\, OPT$. Moreover, if $f$ is monotone, there is a polytime algorithm (Vondrák, 2008) that finds $z^* \in P$ such that $f^M(z^*) \ge (1 - 1/e)\, OPT$.
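Although $f^M$ is a sum over exponentially many sets, $f^M(z)$ is exactly the expectation of $f(S)$ when $S$ includes each $v$ independently with probability $z_v$, so it can be estimated by sampling; this is the standard workaround when no closed form is available. A sketch (our illustration):

```python
import random

def multilinear_estimate(f, V, z, samples=2000, seed=0):
    """Monte Carlo estimate of f^M(z) = E[f(S)], where S includes each v
    independently with probability z[v]."""
    rng = random.Random(seed)
    V = list(V)
    total = 0.0
    for _ in range(samples):
        S = {v for v in V if rng.random() < z[v]}
        total += f(S)
    return total / samples

if __name__ == "__main__":
    V = range(4)
    f = lambda S: min(2, len(S))          # a simple submodular function
    z = {v: 0.5 for v in V}
    print(multilinear_estimate(f, V, z))  # approx E[min(2, Binomial(4, 1/2))] = 1.625
```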
## 2.3. A Tight $1 - 1/e$ MV Gap

In this section we prove Theorem 4. The main idea is to start with an (approximate) optimal solution $z^* = z_1^* + z_2^* + \cdots + z_k^*$ to the MV-ME relaxation and build a new feasible solution $\hat{z} = \hat{z}_1 + \hat{z}_2 + \cdots + \hat{z}_k$ where the $\hat{z}_i$ have pairwise disjoint supports $V_i$. We think of $V_i$ as the set of items associated (or pre-assigned) to agent $i$. Once we have such a pre-assignment, we consider the single-agent problem $\max\ g(S) : S \in \mathcal{F}$, where

$$g(S) = f(S \cap V_1, S \cap V_2, \ldots, S \cap V_k). \qquad (2)$$

It is clear that $g$ is nonnegative monotone submodular, since $f$ is nonnegative monotone $k$-multi-submodular. Moreover, for any feasible solution $S \in \mathcal{F}$ to this single-agent problem, we obtain a multivariate solution of the same cost by setting $S_i = S \cap V_i$, since then $g(S) = f(S \cap V_1, \ldots, S \cap V_k) = f(S_1, \ldots, S_k)$.

For a set $S \subseteq V$ and a vector $z \in [0, 1]^V$, we denote by $z|_S$ the truncation of $z$ to elements of $S$; that is, $z|_S(v) = z(v)$ for each $v \in S$ and $z|_S(v) = 0$ otherwise. Then by definition of $g$ we have $g^M(z) = f'^M(z|_{V_1}, z|_{V_2}, \ldots, z|_{V_k})$, where $f'$ is the lifted function from Section 2.1. Moreover, if the sets $V_i$ are pairwise disjoint, then $f'^M(z|_{V_1}, \ldots, z|_{V_k}) = f'^M(z_1, \ldots, z_k)$. The next result formalizes this observation.

**Proposition 13.** Let $z = \sum_{i \in [k]} z_i$ be a feasible solution to MV-ME such that the vectors $z_i$ have pairwise disjoint supports $V_i$. Then $g^M(z) = f'^M(z_1, z_2, \ldots, z_k)$.

We now have all the ingredients to prove our main result for maximization. We note that a gap of $1 - 1/e$ appeared in Santiago & Shepherd (2018) for the case of separable objectives $f(S_1, \ldots, S_k) = \sum_i f_i(S_i)$. That argument uses the component-wise linearity of the multilinear extension, while our proof for non-separable objectives strongly uses the convexity property from Proposition 11.

**Theorem 4.** If there is a (polytime) $\alpha(n)$-approximation for monotone SO($\mathcal{F}$) maximization based on rounding SA-ME, then there is a (polytime) $(1 - 1/e)\,\alpha(n)$-approximation for monotone MVSO($\mathcal{F}$) maximization. Furthermore, given a downwards closed family $\mathcal{F}$, if there is a (polytime) $\alpha(n)$-approximation for nonmonotone SO($\mathcal{F}$) maximization based on rounding SA-ME, then there is a (polytime) $0.385\,\alpha(n)$-approximation for nonmonotone MVSO($\mathcal{F}$) maximization.

*Proof.* We discuss first the case of monotone objectives.

**Step 1.** Let $z^* = z_1^* + z_2^* + \cdots + z_k^*$ denote an approximate solution to MV-ME obtained via Theorem 12, and let $OPT_{frac}$ be the value of an optimal solution. We then have

$$f'^M(z_1^*, z_2^*, \ldots, z_k^*) \ \ge\ (1 - 1/e)\, OPT_{frac} \ \ge\ (1 - 1/e)\, OPT_{MV}.$$

**Step 2.** For an element $v \in V$, let $e_v$ denote the characteristic vector of $\{v\}$, i.e., the vector in $\mathbb{R}^V$ which has value 1 in the $v$th component and zero elsewhere. By Proposition 11, the function

$$h(t) = f'^M(z_1^*, \ldots, z_{i-1}^*,\, z_i^* + t e_v,\, z_{i+1}^*, \ldots, z_{i'-1}^*,\, z_{i'}^* - t e_v,\, z_{i'+1}^*, \ldots, z_k^*)$$

is convex for any $v \in V$ and $i \ne i' \in [k]$. In particular, given any $v \in V$ such that there exist $i \ne i' \in [k]$ with $z_i^*(v), z_{i'}^*(v) > 0$, there is always a choice so that increasing one component and decreasing the other by the same amount does not decrease the objective value. Let $v \in V$ be such an element. Then we either set $z_i^*(v) \leftarrow z_i^*(v) + z_{i'}^*(v)$ and $z_{i'}^*(v) \leftarrow 0$, or $z_{i'}^*(v) \leftarrow z_{i'}^*(v) + z_i^*(v)$ and $z_i^*(v) \leftarrow 0$, whichever does not decrease the objective value. We repeat until the vectors $z_i^*$ have pairwise disjoint supports. Let us denote these new vectors by $\hat{z}_i$, and let $\hat{z} = \sum_{i \in [k]} \hat{z}_i$.
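Step 2 is itself a short algorithm: while some element is held fractionally by two agents, shift all of its mass to whichever endpoint does not decrease the objective (one of the two works, by convexity of $h$). A sketch assuming an exact oracle `fM` for the lifted multilinear extension (illustrative interface; in practice one would estimate it by sampling, as in Section 2.2):

```python
def disjointify(fM, zs, V):
    """Make the supports of the z_i pairwise disjoint without decreasing
    fM(z_1,...,z_k), following Step 2. zs is a list of dicts v -> [0,1];
    the aggregate vector z = sum_i z_i is left unchanged."""
    k = len(zs)
    changed = True
    while changed:
        changed = False
        for v in V:
            holders = [i for i in range(k) if zs[i].get(v, 0.0) > 0.0]
            if len(holders) < 2:
                continue
            i, j = holders[0], holders[1]
            merged = zs[i][v] + zs[j][v]
            # Endpoint 1: agent i takes all of v's mass.
            zs[i][v], zs[j][v] = merged, 0.0
            val_i = fM(zs)
            # Endpoint 2: agent j takes all of v's mass.
            zs[i][v], zs[j][v] = 0.0, merged
            if fM(zs) < val_i:
                zs[i][v], zs[j][v] = merged, 0.0  # revert to the better endpoint
            changed = True
    return zs

if __name__ == "__main__":
    # Toy: fM is linear, so both endpoints tie and any resolution is fine.
    fM = lambda zs: sum(sum(z.values()) for z in zs)
    zs = [{"a": 0.5, "b": 0.2}, {"a": 0.5, "b": 0.8}]
    print(disjointify(fM, zs, ["a", "b"]))
```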
Notice that the vector $z^* = \sum_{i \in [k]} z_i^*$ remains invariant under each of the above updates (i.e., $\hat{z} = z^*$), and hence the new vectors $\hat{z}_i$ remain a feasible solution.

**Step 3.** In the last step we use the function $g$ defined in (2), with sets $V_i$ corresponding to the supports of the $\hat{z}_i$. Given our $\alpha$-approximation rounding assumption for SA-ME, we can round $\hat{z}$ to find a set $\hat{S}$ such that $g(\hat{S}) \ge \alpha\, g^M(\hat{z})$. Then, setting $\hat{S}_i = \hat{S} \cap V_i$, we obtain a multivariate solution satisfying

$$f(\hat{S}_1, \ldots, \hat{S}_k) = g(\hat{S}) \ \ge\ \alpha\, g^M(\hat{z}) = \alpha\, f'^M(\hat{z}_1, \ldots, \hat{z}_k) \ \ge\ \alpha\, f'^M(z_1^*, \ldots, z_k^*) \ \ge\ \alpha (1 - 1/e)\, OPT_{MV},$$

where the second equality follows from Proposition 13. This completes the monotone proof. For the nonmonotone case the argument is very similar. Here we restrict our attention to downwards closed families, since then we can get a 0.385-approximation at Step 1 via Theorem 12. We then apply Steps 2 and 3 in the same fashion as for monotone objectives. This leads to a $0.385\,\alpha(n)$-approximation for the multivariate problem. ∎

## 2.4. Invariance Under the Lifting Reduction

In Section 2.3 we established an MV gap of $1 - 1/e$ for monotone objectives and of 0.385 for nonmonotone objectives and downwards closed families, based on the multilinear formulations. In this section we describe several families with an (optimal) MV gap of 1. Examples of such family classes include spanning trees, matroids, and $p$-systems.

We saw in Section 2.1 that if the original function $f$ is $k$-multi-submodular, then the lifted function $f'$ is submodular. We now discuss some properties of the original families $\mathcal{F}_i$ and $\mathcal{F}$ that are also preserved under the lifting reduction; these were already proved in Santiago & Shepherd (2018). It is shown there, for instance, that if $\mathcal{F}$ induces a matroid (or more generally a $p$-system) over the ground set $V$, then so does the family $\mathcal{F}'$ over the lifted space $E$. We summarize some of these results in Table 1, and discuss next some of the algorithmic consequences.

Table 1. Invariant properties under the lifting reduction (Santiago & Shepherd, 2018).

|   | Multivariate problem | Single-agent (i.e., reduced) problem |
|---|---|---|
| 1 | $(V, \mathcal{F})$ a $p$-system | $(E, \mathcal{F}')$ a $p$-system |
| 2 | $\mathcal{F}$ = bases of a $p$-system | $\mathcal{F}'$ = bases of a $p$-system |
| 3 | $(V, \mathcal{F})$ a matroid | $(E, \mathcal{F}')$ a matroid |
| 4 | $(V, \mathcal{F})$ a $p$-matroid intersection | $(E, \mathcal{F}')$ a $p$-matroid intersection |
| 5 | $(V, \mathcal{F}_i)$ a matroid for all $i \in [k]$ | $(E, \mathcal{H})$ a matroid |
| 6 | $\mathcal{F}_i$ a ring family for all $i \in [k]$ | $\mathcal{H}$ a ring family |

In the setting of MVSO this invariance allows us to leverage several results from the single-agent to the multivariate setting. These are based on the following result, which uses the fact that the size of the lifted space $E$ is $nk$.

**Theorem 14.** Let $\mathcal{F}$ be a matroid, a $p$-matroid intersection, or a $p$-system. If there is a (polytime) $\alpha(n)$-approximation algorithm for monotone (resp. nonmonotone) SO($\mathcal{F}$) maximization (resp. minimization), then there is a (polytime) $\alpha(nk)$-approximation algorithm for monotone (resp. nonmonotone) MVSO($\mathcal{F}$) maximization (resp. minimization).

For both monotone and nonmonotone maximization, the approximation factors $\alpha(n)$ for the family classes described in Theorem 14 are independent of the size $n$ of the ground set. Hence we immediately get that $\alpha(nk) = \alpha(n)$ in those cases, and thus the approximation factors for the corresponding multivariate and single-agent problems are the same. In our MV gap terminology this gives an MV gap of 1 for such problems, proving Theorem 5.

In the setting of CMVSO, the results described in entries 5 and 6 of Table 1 provide additional modelling flexibility. This allows us to maintain good approximations while combining several constraints. For example, for a monotone maximization instance of CMVSO where $\mathcal{F}$ corresponds to a $p$-matroid intersection and the $\mathcal{F}_i$ are all matroids, the above invariance results lead to a $(\frac{1}{p+1} - \epsilon)$-approximation.

The results from this section also imply that algorithms that behave very well in practice (such as accelerated greedy variants (Mirzasoleiman et al., 2016a) and distributed algorithms (Mirzasoleiman et al., 2016b)) for the corresponding single-agent problems can also be used in the more general multivariate setting while preserving the same approximation guarantees. We believe this makes the CMVSO framework a good candidate for potential applications in large-scale machine learning problems.
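As one concrete instance, here is a sketch of a lazy ("accelerated") greedy over the lifted ground set $[k] \times V$ for the cardinality constraint $\mathcal{F} = \{S : |S| \le b\}$. This is our illustration in the spirit of such variants, not the algorithm of Mirzasoleiman et al.; the key point is that by Claim 10 the lifted function is submodular, so cached marginal gains remain valid upper bounds.

```python
import heapq

def lazy_greedy_mv(f, V, k, budget):
    """Lazy greedy for max f(S_1,...,S_k) with sum |S_i| <= budget, run over
    the lifted ground set [k] x V. Stale gains in the heap upper-bound the
    true gains because the lifted function is submodular (Claim 10)."""
    sol = [set() for _ in range(k)]
    base = f(tuple(sol))

    def gain(i, v):
        trial = tuple(S | {v} if j == i else S for j, S in enumerate(sol))
        return f(trial) - base

    heap = [(-gain(i, v), i, v) for i in range(k) for v in V]
    heapq.heapify(heap)
    assigned, placed = set(), 0
    while heap and placed < budget:
        neg, i, v = heapq.heappop(heap)
        if v in assigned:
            continue                               # keeps the S_i disjoint
        g = gain(i, v)                             # refresh the stale gain
        if heap and -heap[0][0] > g:
            heapq.heappush(heap, (-g, i, v))       # reinsert with the fresh key
            continue
        if g <= 0:
            break
        sol[i].add(v)
        assigned.add(v)
        placed += 1
        base = f(tuple(sol))
    return tuple(sol)

if __name__ == "__main__":
    import math
    f = lambda tup: sum(math.sqrt(len(S)) for S in tup)
    print(lazy_greedy_mv(f, range(5), k=2, budget=3))
```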
## 3. Multivariate Submodular Minimization

In this section we give (almost tight) curvature-dependent approximations for minimization. Due to space limits, we defer the discussion and proof of Theorem 6 to Section 3.1 of the full version.

## 3.1. An $o(n / \log n)$ Lower Bound for $\mathcal{F} = \{V\}$

In this section we focus on the special case $\mathcal{F} = \{V\}$ and show that the curvature-dependent approximation factors obtained in Theorem 6 are essentially tight (see Section 3.3 of the full version for proofs). We follow a technique from Goemans et al. (2009); Feige et al. (2011); Svitkina & Fleischer (2011) and build two multivariate submodular functions that are hard to distinguish, with high probability, for any (even randomized) algorithm.

Assume that $k = n$, and let $R := (R_1, R_2, \ldots, R_n) \in (2^V)^n$ be a random partition of $V$. Notice that $\sum_{i=1}^n |R_i| = n$. Let $\beta = \omega(\log n)$ be an integer. Consider the two nonnegative monotone $n$-multi-submodular functions $f_1, f_2 : (2^V)^n \to \mathbb{R}_+$ given by

$$f_1(S_1, \ldots, S_n) = \min\Big\{n,\ \sum_{i=1}^n |S_i|\Big\}, \qquad (3)$$

$$f_2(S_1, \ldots, S_n) = \min\Big\{f_1(S_1, \ldots, S_n),\ \beta + \sum_{i=1}^n |S_i \cap \overline{R_i}|\Big\},$$

where $\overline{R_i} = V \setminus R_i$. We then have the following (see Section 3.3 of the full version for proof details).

**Lemma 15.** Any algorithm that makes a polynomial number of oracle calls has probability $n^{-\omega(1)}$ of distinguishing the functions $f_1$ and $f_2$ above.

We now prove our (curvature-independent) lower bound.

**Theorem 16.** The monotone MVSO($\mathcal{F}$) minimization problem over $\mathcal{F} = \{V\}$ cannot be approximated to a ratio $o(n / \log n)$ in the value oracle model with polynomially many queries.

*Proof.* Assume there is a polytime algorithm achieving an approximation factor of $\alpha = o(n / \log n)$. Choose $\beta = \omega(\log n)$ such that $\alpha \beta < n$. Consider the output of the algorithm when $f_2$ is given as input. The optimal solution in this case is the partition $R = (R_1, \ldots, R_n)$, with $f_2(R_1, \ldots, R_n) = \beta$. So the algorithm produces a feasible solution (i.e., a partition) $(S_1^*, \ldots, S_n^*)$ satisfying $f_2(S_1^*, \ldots, S_n^*) \le \alpha \beta < n$. However, since $f_1$ takes value exactly $n$ over any partition, there is no feasible solution $(S_1, \ldots, S_n)$ such that $f_1(S_1, \ldots, S_n) < n$. This means that if the input is the function $f_1$, then the algorithm produces a different answer, thus distinguishing between $f_1$ and $f_2$ and contradicting Lemma 15. ∎

The above result contrasts with the known $O(\log n)$-approximation (Svitkina & Tardos, 2010) for the case where the multivariate objective is separable, that is, $f(S_1, \ldots, S_k) = \sum_i f_i(S_i)$ with each $f_i$ monotone submodular. These two facts combined prove Theorem 3.
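The hard pair from (3) is easy to instantiate. The sketch below (illustrative, not a proof of Lemma 15) plants a random partition $R$, builds $f_1$ and $f_2$, and shows that random queries typically cannot tell them apart even though the planted partition separates them.

```python
import random

def make_hard_pair(V, beta, seed=0):
    """Build f1, f2 from (3): f1 = min(n, sum |S_i|) and
    f2 = min(f1, beta + sum |S_i - R_i|) for a hidden random partition R."""
    rng = random.Random(seed)
    V = list(V)
    n = len(V)
    R = [set() for _ in range(n)]
    for v in V:                          # each element joins a uniform random part
        R[rng.randrange(n)].add(v)
    f1 = lambda tup: min(n, sum(len(S) for S in tup))
    f2 = lambda tup: min(f1(tup), beta + sum(len(S - R[i]) for i, S in enumerate(tup)))
    return f1, f2, R

if __name__ == "__main__":
    n, beta = 30, 10
    f1, f2, R = make_hard_pair(range(n), beta, seed=2)
    rng = random.Random(5)
    agree = 0
    for _ in range(1000):
        tup = tuple({v for v in range(n) if rng.random() < 0.5} for _ in range(n))
        agree += f1(tup) == f2(tup)
    print(agree, "of 1000 random queries agree")  # typically all or nearly all
    print(f1(tuple(R)), f2(tuple(R)))             # n vs. beta on the planted partition
```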
We use a construction from Iyer et al. (2013) to explicitly introduce the effect of curvature into the lower bound. Their work is for univariate functions, but it extends naturally to the multivariate setting. We modify the functions $f_1, f_2$ from (3) as follows:

$$f_i^c(S_1, \ldots, S_k) = c \cdot f_i(S_1, \ldots, S_k) + (1 - c) \sum_{j=1}^k |S_j|, \qquad i = 1, 2.$$

Both $f_1^c$ and $f_2^c$ have total curvature $c$. Moreover, since $f_1(S_1, \ldots, S_k) = f_2(S_1, \ldots, S_k)$ if and only if $f_1^c(S_1, \ldots, S_k) = f_2^c(S_1, \ldots, S_k)$, it follows from Lemma 15 that any algorithm making polynomially many queries cannot distinguish between $f_1^c$ and $f_2^c$ with high probability. In addition, the gap between the optimal solutions of these two functions is given by

$$\frac{OPT_1}{OPT_2} = \frac{cn + (1 - c)n}{c\beta + (1 - c)n} = \frac{n}{c\beta + (1 - c)n} = \frac{n}{\beta + (n - \beta)(1 - c)} = \frac{n/\beta}{1 + (n/\beta - 1)(1 - c)}.$$

Then, since $\beta = \omega(\log n)$, the lower bound follows.

**Theorem 7.** The monotone MVSO($\mathcal{F}$) minimization problem over $\mathcal{F} = \{V\}$ and objectives $f$ with total curvature $c$ cannot be approximated to a ratio $o\!\left(\frac{n/\log n}{1 + (n/\log n - 1)(1 - c)}\right)$ in the value oracle model with a polynomial number of queries.

## 4. Conclusions

We introduce a new class of multivariate submodular optimization problems and give information-theoretic evidence that this class encodes much more than the separable versions arising from multi-agent objectives. We provide explicit examples and potential applications. For maximization, we show that practical algorithms such as accelerated greedy variants and distributed algorithms achieve good approximation guarantees under very general constraints. Moreover, for arbitrary families we show MV gaps of $1 - 1/e$ and 0.385 for the monotone and nonmonotone problems, respectively. For minimization, we give (essentially tight) approximation factors with respect to the curvature of the multivariate objective function.

## Acknowledgements

We thank Chandra Chekuri for valuable comments and suggestions. We thank Nick Harvey and Mark Schmidt for their guidance. We also thank the reviewers for their useful comments.

## References

Ando, K., Fujishige, S., and Naitoh, T. A characterization of bisubmodular functions. Discrete Mathematics, 148(1-3):299-303, 1996.

Bai, W. and Bilmes, J. A. Greed is still good: Maximizing monotone submodular + supermodular functions. arXiv preprint arXiv:1801.07413, 2018.

Bouchet, A. and Cunningham, W. H. Delta-matroids, jump systems, and bisubmodular polyhedra. SIAM Journal on Discrete Mathematics, 8(1):17-32, 1995.

Boykov, Y. Y. and Jolly, M.-P. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In Computer Vision, 2001. ICCV 2001. Proceedings. Eighth IEEE International Conference on, volume 1, pp. 105-112. IEEE, 2001.

Buchbinder, N. and Feldman, M. Constrained submodular maximization via a non-symmetric technique. arXiv preprint arXiv:1611.03253, 2016.

Buchbinder, N., Feldman, M., Seffi, J., and Schwartz, R. A tight linear time (1/2)-approximation for unconstrained submodular maximization. SIAM Journal on Computing, 44(5):1384-1402, 2015.

Calinescu, G., Chekuri, C., Pál, M., and Vondrák, J. Maximizing a submodular set function subject to a matroid constraint. In Integer Programming and Combinatorial Optimization, pp. 182-196. Springer, 2007.

Calinescu, G., Chekuri, C., Pál, M., and Vondrák, J. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740-1766, 2011.

Chekuri, C. and Ene, A. Submodular cost allocation problem and applications. In International Colloquium on Automata, Languages, and Programming, pp. 354-366. Springer, 2011. Extended version: arXiv preprint arXiv:1105.2040.
Conforti, M. and Cornuéjols, G. Submodular set functions, matroids and the greedy algorithm: tight worst-case bounds and some generalizations of the Rado-Edmonds theorem. Discrete Applied Mathematics, 7(3):251-274, 1984.

El-Arini, K., Veda, G., Shahaf, D., and Guestrin, C. Turning down the noise in the blogosphere. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 289-298. ACM, 2009.

Ene, A. and Nguyen, H. L. Constrained submodular maximization: Beyond 1/e. In Foundations of Computer Science (FOCS), 2016 IEEE 57th Annual Symposium on, pp. 248-257. IEEE, 2016.

Ene, A. and Vondrák, J. Hardness of submodular cost allocation: Lattice matching and a simplex coloring conjecture. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2014), 28:144-159, 2014.

Feige, U., Mirrokni, V. S., and Vondrák, J. Maximizing non-monotone submodular functions. SIAM Journal on Computing, 40(4):1133-1153, 2011.

Feldman, M., Naor, J., and Schwartz, R. A unified continuous greedy algorithm for submodular maximization. In Foundations of Computer Science (FOCS), 2011 IEEE 52nd Annual Symposium on, pp. 570-579. IEEE, 2011.

Fisher, M. L., Nemhauser, G. L., and Wolsey, L. A. An analysis of approximations for maximizing submodular set functions-II. Springer, 1978.

Fleischer, L., Goemans, M. X., Mirrokni, V. S., and Sviridenko, M. Tight approximation algorithms for maximum general assignment problems. In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 611-620. Society for Industrial and Applied Mathematics, 2006.

Fujishige, S. and Iwata, S. Bisubmodular function minimization. SIAM Journal on Discrete Mathematics, 19(4):1065-1073, 2005.

Goel, G., Karande, C., Tripathi, P., and Wang, L. Approximability of combinatorial problems with multi-agent submodular cost functions. In Foundations of Computer Science, 2009. FOCS '09. 50th Annual IEEE Symposium on, pp. 755-764. IEEE, 2009.

Goemans, M. X. and Ramakrishnan, V. Minimizing submodular functions over families of sets. Combinatorica, 15(4):499-513, 1995.

Goemans, M. X., Harvey, N. J., Iwata, S., and Mirrokni, V. Approximating submodular functions everywhere. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 535-544. Society for Industrial and Applied Mathematics, 2009.

Golovin, D. and Krause, A. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artificial Intelligence Research, 42:427-486, 2011.

Goundan, P. R. and Schulz, A. S. Revisiting the greedy approach to submodular set function maximization. Optimization Online, pp. 1-25, 2007.

Grötschel, M., Lovász, L., and Schrijver, A. Geometric Algorithms and Combinatorial Optimization, volume 2. Springer Science & Business Media, 2012.

Gupta, A., Roth, A., Schoenebeck, G., and Talwar, K. Constrained non-monotone submodular maximization: Offline and secretary algorithms. In International Workshop on Internet and Network Economics, pp. 246-257. Springer, 2010.

Hayrapetyan, A., Swamy, C., and Tardos, É. Network design for information networks. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 933-942. Society for Industrial and Applied Mathematics, 2005.

Hazan, E., Safra, S., and Schwartz, O. On the complexity of approximating k-set packing. Computational Complexity, 15(1):20-39, 2006.
Huber, A. and Kolmogorov, V. Towards minimizing k-submodular functions. In International Symposium on Combinatorial Optimization, pp. 451-462. Springer, 2012.

Huber, A., Krokhin, A., and Powell, R. Skew bisubmodularity and valued CSPs. SIAM Journal on Computing, 43(3):1064-1084, 2014.

Iwata, S. and Nagano, K. Submodular function minimization under covering constraints. In Foundations of Computer Science, 2009. FOCS '09. 50th Annual IEEE Symposium on, pp. 671-680. IEEE, 2009.

Iwata, S., Fleischer, L., and Fujishige, S. A combinatorial strongly polynomial algorithm for minimizing submodular functions. Journal of the ACM (JACM), 48(4):761-777, 2001.

Iyer, R., Jegelka, S., and Bilmes, J. Monotone closure of relaxed constraints in submodular optimization: Connections between minimization and maximization: Extended version. 2014.

Iyer, R. K., Jegelka, S., and Bilmes, J. A. Curvature and optimal algorithms for learning and minimizing submodular functions. In Advances in Neural Information Processing Systems, pp. 2742-2750, 2013.

Jegelka, S. and Bilmes, J. Submodularity beyond submodular energies: coupling edges in graph cuts. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pp. 1897-1904. IEEE, 2011.

Kempe, D., Kleinberg, J., and Tardos, É. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 137-146. ACM, 2003.

Khot, S., Lipton, R. J., Markakis, E., and Mehta, A. Inapproximability results for combinatorial auctions with submodular utility functions. In International Workshop on Internet and Network Economics, pp. 92-101. Springer, 2005.

Kohli, P., Kumar, M. P., and Torr, P. H. P3 & beyond: Move making algorithms for solving higher order functions. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 31(9):1645-1656, 2009.

Korula, N., Mirrokni, V., and Zadimoghaddam, M. Online submodular welfare maximization: Greedy beats 1/2 in random order. SIAM Journal on Computing, 47(3):1056-1086, 2018.

Krause, A. and Guestrin, C. Near-optimal observation selection using submodular functions. In AAAI, volume 7, pp. 1650-1654, 2007.

Krause, A., Leskovec, J., Guestrin, C., VanBriesen, J., and Faloutsos, C. Efficient sensor placement optimization for securing large water distribution networks. Journal of Water Resources Planning and Management, 134(6):516-526, November 2008.

Krause, K., Goodwin, M., and Smith, R. Optimal software test planning through automated network analysis. TRW Systems Group, 1973.

Lee, J., Mirrokni, V. S., Nagarajan, V., and Sviridenko, M. Non-monotone submodular maximization under matroid and knapsack constraints. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing, pp. 323-332. ACM, 2009.

Lee, J., Sviridenko, M., and Vondrák, J. Submodular maximization over multiple matroids via generalized exchange properties. Mathematics of Operations Research, 35(4):795-806, 2010.

Lehmann, B., Lehmann, D., and Nisan, N. Combinatorial auctions with decreasing marginal utilities. Games and Economic Behavior, 55(2):270-296, 2006. URL http://EconPapers.repec.org/RePEc:eee:gamebe:v:55:y:2006:i:2:p:270-296.

Lin, H. and Bilmes, J. A class of submodular functions for document summarization. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, pp. 510-520. Association for Computational Linguistics, 2011.
Lovász, L. Submodular functions and convexity. In Mathematical Programming: The State of the Art, pp. 235-257. Springer, 1983.

Mirrokni, V., Schapira, M., and Vondrák, J. Tight information-theoretic lower bounds for welfare maximization in combinatorial auctions. In Proceedings of the 9th ACM Conference on Electronic Commerce, pp. 70-77. ACM, 2008.

Mirzasoleiman, B., Badanidiyuru, A., and Karbasi, A. Fast constrained submodular maximization: Personalized data summarization. In ICML, pp. 1358-1367, 2016a.

Mirzasoleiman, B., Karbasi, A., Sarkar, R., and Krause, A. Distributed submodular maximization. The Journal of Machine Learning Research, 17(1):8330-8373, 2016b.

Nemhauser, G. L. and Wolsey, L. A. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177-188, 1978.

Nemhauser, G. L., Wolsey, L. A., and Fisher, M. L. An analysis of approximations for maximizing submodular set functions-I. Mathematical Programming, 14(1):265-294, 1978.

Ohsaka, N. and Yoshida, Y. Monotone k-submodular function maximization with size constraints. In Advances in Neural Information Processing Systems, pp. 694-702, 2015.

Qi, L. Directed submodularity, ditroids and directed submodular flows. Mathematical Programming, 42(1-3):579-599, 1988.

Santiago, R. and Shepherd, F. B. Multi-agent submodular optimization. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2018), 116:23:1-23:20, 2018.

Schrijver, A. A combinatorial algorithm minimizing submodular functions in strongly polynomial time. Journal of Combinatorial Theory, Series B, 80(2):346-355, 2000.

Schrijver, A. Combinatorial Optimization: Polyhedra and Efficiency, volume 24. Springer, 2003.

Singh, A., Guillory, A., and Bilmes, J. On bisubmodular maximization. In Artificial Intelligence and Statistics, pp. 1055-1063, 2012.

Streeter, M. and Golovin, D. An online algorithm for maximizing submodular functions. In Advances in Neural Information Processing Systems, pp. 1577-1584, 2009.

Svitkina, Z. and Fleischer, L. Submodular approximation: Sampling-based algorithms and lower bounds. SIAM Journal on Computing, 40(6):1715-1737, 2011.

Svitkina, Z. and Tardos, É. Facility location with hierarchical facility costs. ACM Transactions on Algorithms (TALG), 6(2):37, 2010.

Thapper, J. and Živný, S. The power of linear programming for valued CSPs. In Foundations of Computer Science (FOCS), 2012 IEEE 53rd Annual Symposium on, pp. 669-678. IEEE, 2012.

Thapper, J. and Živný, S. The complexity of finite-valued CSPs. Journal of the ACM (JACM), 63(4):37, 2016.

Vondrák, J. Optimal approximation for the submodular welfare problem in the value oracle model. In Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing, pp. 67-74. ACM, 2008.

Vondrák, J. Submodularity and curvature: The optimal algorithm (combinatorial optimization and discrete algorithms). 2010.

Ward, J. and Živný, S. Maximizing bisubmodular and k-submodular functions. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1468-1481. SIAM, 2014.