# on_unconstrained_quasisubmodular_function_optimization__ad0721ad.pdf

On Unconstrained Quasi-Submodular Function Optimization

Jincheng Mei, Kang Zhao and Bao-Liang Lu* Center for Brain-Like Computing and Machine Intelligence Department of Computer Science and Engineering Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering Shanghai Jiao Tong University 800 Dong Chuan Road, Shanghai 200240, China {jcmei,sjtuzk,bllu}@sjtu.edu.cn

With the extensive application of submodularity, its generalizations are constantly being proposed. However, most of them are tailored for special problems. In this paper, we focus on quasi-submodularity, a universal generalization, which satisﬁes weaker properties than submodularity but still enjoys favorable performance in optimization. Similar to the diminishing return property of submodularity, we ﬁrst deﬁne a corresponding property called the single sub-crossing, then we propose two algorithms for unconstrained quasi-submodular function minimization and maximization, respectively. The proposed algorithms return the reduced lattices in O(n) iterations, and guarantee the objective function values are strictly monotonically increased or decreased after each iteration. Moreover, any local and global optima are deﬁnitely contained in the reduced lattices. Experimental results verify the effectiveness and efﬁciency of the proposed algorithms on lattice reduction.

Introduction Given a ground set N = {1, 2, , n}, a set function F : 2N 7 R is said to be submodular (Fujishige 2005) if X, Y N,

F(X) + F(Y ) F(X Y ) + F(X Y ).

An equivalent deﬁnition is given as following, i.e., A B N, i N \ B,

F(i|A) F(i|B),

where F(i|A) F(A + i) F(A) is called the marginal gain of i with respect to A. It implies that submodular functions capture the diminishing return property. To facilitate our presentation, we use F(A + i) to refer to F(A {i}), and F(A i) to refer to F(A \ {i}). Submodularity is widely applied in economics, combinatorics, and machine learning, such as welfare allocation (Vondr ak 2008), sensor placement (Krause, Singh, and Guestrin 2008), feature selection (Das and Kempe 2011), and computer vision (Liu et al. 2011), to name but a few. With the wide application of submodularity, it has many generalizations. For example, Singh et al. (Singh, Guillory,

Copyright c 2015, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

and Bilmes 2012) formulate multiple sensor placement and multimodal feature selection as bisubmodular function maximization, where the objectives have multiple set arguments. Golovin and Krause (Golovin and Krause 2011) introduce the concept of adaptive submodularity to make a sequence of adaptive decisions with uncertain responses. Feige (Feige 2009) proposes maximizing subadditive functions on welfare problems to capture the complement free property of the utility functions. However, all the mentioned generalizations of submodularity enjoy beneﬁts in special application scenarios (multiset selection, adaptive decision, and complement free allocation). In this paper, we study a universal generalization. Submodularity is often viewed as the discrete analogue of convexity (Lov asz 1983). One of the most important generalizations of convexity is quasi-convexity (Boyd and Vandenberghe 2004). Quasi-convex functions satisfy some weaker properties, but still beneﬁt much from the optimization perspective. More speciﬁcally, quasi-convex constraints can be easily transformed to convex constraints via sublevel sets, and quasi-convex optimization problems can be solved through a series of convex feasibility problems using bisection methods (Boyd and Vandenberghe 2004). Considering the celebrated analogue between submodularity and convexity, a natural question is whether submodularity has similar generalizations which satisfy weaker properties but still enjoy favorable performance in optimization? In this paper, we positively answer this question and refer to this generalization as quasi-submodularity. As aforementioned, quasi-submodularity is a weaker property than submodularity. Similar to the diminishing return property of submodular functions, we ﬁrst deﬁne a corresponding property called single sub-crossing. Then we propose two algorithms for unconstrained quasi-submodular minimization and maximization, respectively. Our theoretical analyses show that the proposed algorithms strictly increase or decrease the objective function values after each iteration. The output reduced lattices can be obtained in O(n) iterations, which contain all the local and global optima of the optimization problems. The theoretical and experimental results indicate that although quasi-submodularity is a weaker property than submodularity, it enjoys favorable performance in optimization. The rest of the paper is organized as follows. We ﬁrst

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence

introduce the concept of quasi-submodularity and deﬁne the single sub-crossing property. Then we present the efﬁcient algorithms and theoretical analyses for unconstrained quasi-submodular function minimization and maximization, respectively. After that, we provide some discussion and present some experimental results to verify the effectiveness and efﬁciency of the proposed algorithms on lattice reduction. Finally, we introduce some related work and give some conclusions about our work.

Quasi-Submodularity It is well known that the term semi-modular is taken from lattice theory (Edmonds 2003). A lattice is a partially ordered set, which contains the supremum and inﬁmum of each element pair. Here, we introduce a very useful lattice. Deﬁnition 1 (Set Interval Lattice). Given two ground sets A, B, a set interval lattice L = [A, B] is deﬁned as {U | A U B}. L is not empty if and only if A B. In the set interval lattice, the partially order relation is deﬁned as the set inclusion . A set S L iff A S B. Obviously, X, Y L, we have X Y, X Y L, thus L is a lattice. The concept of quasi-supermodularity is ﬁrst proposed in economic ﬁelds (Milgrom and Shannon 1994). Quasisupermodularity captures the monotonicity of the solutions as the problem parameters change, and has been proved useful in game theory (Leininger 2006), parametric cuts (Granot et al. 2012), and discrete convex analysis (Murota 2003). Following (Milgrom and Shannon 1994), we give the deﬁnition of quasi-submodularity. Deﬁnition 2 (QSB). A set function F : 2N 7 R is quasisubmodular function if X, Y N, both of the following conditions are satisﬁed

F(X Y ) F(X) F(Y ) F(X Y ), F(X Y ) > F(X) F(Y ) > F(X Y ). (1)

The following proposition implies that the concept of quasi-submodularity is a generalization of submodularity. Proposition 1. Any submodular function is quasisubmodular function, but not vice versa.

Proof. Suppose F : 2N 7 R is a submodular function, and F is not a quasi-submodular function. Then we have F(X Y ) F(X), F(Y ) < F(X Y ), or F(X Y ) > F(X), F(Y ) F(X Y ). Both of the two cases lead to F(X)+F(Y ) < F(X Y )+F(X Y ), which contradicts the deﬁnition of submodularity. A counterexample is given to prove a quasi-submodular function may not be a submodular function. Suppose N = {1, 2}, F( ) = 1, F({1}) = 0, F({2}) = 1.5, and F({1, 2}) = 1. It is easy to check that F satisﬁes the definition of QSB. But F is not a submodular function, since F({1}) + F({2}) < F( ) + F({1, 2}). Actually, F is a supermodular function.

Similar to the diminishing return property of submodular functions, we deﬁne a corresponding property for quasisubmodularity, and name it as single sub-crossing.

Deﬁnition 3 (SSBC). A set function F : 2N 7 R satisﬁes the single sub-crossing property if A B N, i N\B, both of the following conditions are satisﬁed

F(A) F(B) F(A + i) F(B + i), F(A) > F(B) F(A + i) > F(B + i). (2)

As mentioned before, submodularity and diminishing return property are equivalent deﬁnitions. Analogously, quasisubmodularity and single sub-crossing property are also equivalent. Proposition 2. Any quasi-submodular function satisﬁes the single sub-crossing property, and vice versa.

Proof. Suppose F : 2N 7 R is a quasi-submodular function. A B N, i N \ B, let X = B, Y = A + i in (1). It is obvious that F satisﬁes the SSBC property. On the other hand, suppose F satisﬁes the SSBC property. X, Y N, we denote Y \ X = {i1, i2, , ik}. Based on the SSBC property, if F(X Y ) (>)F(X), then we have F(X Y +i1) (>)F(X +i1). Similarly, we have F(X Y +i1 +i2) (>)F(X +i1 +i2). Repeating the operation until ik is added, we get F(Y ) (>)F(X Y ).

Note that in the proof above, if we exchange X and Y , i.e., let X = A + i, Y = B, we will get

F(A) F(A + i) F(B) F(B + i), F(A) > F(A + i) F(B) > F(B + i).

We can rewrite it using the marginal gain notation, i.e., A B N, i N \ B,

F(i|A) (<) 0 F(i|B) (<) 0. (3)

Note that although X and Y are symmetric and interchangeable in (1), we get a representation which is different with the SSBC property. Actually, (3) is a weaker condition than (1). The proposed algorithms work on the weaker notion (3), and the results also hold for quasi-submodularity.

Unconstrained Quasi-Submodular Function Minimization In this section, we are concerned with general unconstrained quasi-submodular minimization problems, where the objective functions are given in the form of value oracle. Generally, we do not make any additional assumptions (such as nonnegative, monotone, symmetric, etc) except quasisubmodularity. Very recently, Iyer et al. (Iyer, Jegelka, and Bilmes 2013) propose a discrete Majorization-Minimization like submodular function minimization algorithm. In (Iyer, Jegelka, and Bilmes 2013), for each submodular function, a tight modular upper bound is established at the current working set, then this bound is minimized as the surrogate function of the objective function. But for quasi-submodular function, there is no known superdifferential, and it can be veriﬁed that the upper bounds in (Iyer, Jegelka, and Bilmes 2013) are no longer bounds for quasi-submodular functions. Actually, without submodularity, quasi-submodularity is sufﬁcient to perform lattice reduction. Consequently, we design the following algorithm.

Algorithm 1 Unconstrained Quasi-Submodular Function Minimization (UQSFMin) Input: Quasi-submodular function F, N = {1, 2, ..., n},

X0 N, t 0. Output: Xt as a local optimum of min

1: At Iteration t, ﬁnd Ut = {u N \ Xt | F(u|Xt) < 0}.

Yt Xt Ut. 2: Find Dt = {d Xt | F(d|Yt d) > 0}. Xt+1

Yt \ Dt. 3: If Xt+1 = Xt (iff Ut = Dt = ), stop and output Xt. 4: t t + 1. Back to Step 1.

X is a local minimum means i X, F(X i) F(X), and j N \ X, F(X + j) F(X).

Algorithm 1 has several nice theoretical guarantees. First, the objective function values are strictly decreased after each iteration, as the following lemma states. Lemma 1. After each except the last iteration of Algorithm 1, the objective function value of the working set is strictly monotonically decreased, i.e., t, F(Xt+1) < F(Xt).

Proof. We prove F(Yt) < F(Xt). F(Xt+1) < F(Yt) can be proved using a similar approach. Suppose Ut = . Deﬁne U k arg min U Ut:|U|=k F(Xt U), and Y k

t = Xt U k. According to the algorithm, u Ut \ U k, F(u|Xt) < 0. Since Xt Y k

t , and u Y k

t , based on the SSBC property, we have F(u|Y k

t ) < 0. This implies F(Y k+1

t ) F(Xt (U k + u)) < F(Xt U k) = F(Y k

t ). Note that F(Y 1

t ) = minu Ut F(Xt + u) < F(Xt) = F(Y 0

t ). We then have F(Yt) = F(Y |Ut|

t ) < F(Y |Ut| 1

t ) < < F(Y 0

t ) = F(Xt).

If we start from X0 = Q0 , after one iteration, we will get X1 = Q1 = {i | F(i| ) < 0}. Similarly, if we start from X0 = S0 N, we will get X1 = S1 = {i | F(i|N i) 0}. Based on the SSBC property, we have i N, F(i| ) < 0 F(i|N i) < 0, i.e., Q1 S1. Thus the reduced lattice L = [Q1, S1] [ , N] is not empty, and we show that it contains all the global minima. Lemma 2. Any global minimum of F(X) is contained in the lattice L = [Q1, S1], i.e., X arg min X N F(X), Q1 X S1.

Proof. We prove Q1 X . X S1 can be proved in a similar way. Suppose Q1 X , i.e., u Q1, u X . According to the deﬁnition of Q1, F(u| ) < 0. Since X , based on the SSBC property, we have F(u|X ) < 0, which implies F(X + u) < F(X ). This contradicts the optimality of X .

If we start Algorithm 1 from X0 = Q0 = , suppose we get Qt after t iterations. It is easy to check that, due to the SSBC property, in each iteration, Qt only adds elements. So we get a chain = Q0 Q1 Qt Q+, where Q+ is the ﬁnal output when the algorithm terminates. Similarly, if we start from X0 = S0 = N, we can get another chain S+ St S1 S0 = N. We

then prove that the endpoint sets of the two chains form a lattice, which contains all the local minima of F.

Lemma 3. Any local minimum of F(X) is contained in the lattice L = [Q+, S+].

Proof. Let P be a local minimum. In the proof of Lemma 2, we use singleton elements to construct contradictions, so we have Q1 P S1. Suppose Qt P St, we then prove Qt+1 P St+1. First, we suppose Qt+1 P. Because Qt+1 = Qt Ut, u Ut, u P. According to the deﬁnition of Ut, F(u|Qt) < 0. Since Qt P, based on the SSBC property, we have F(u|P) < 0. This indicates F(P + u) < F(P), which contradicts the local optimality of P. Hence Qt+1 P. And P St+1 can be proved in a similar way.

Moreover, the two endpoint sets Q+ and S+ are local minima.

Lemma 4. Q+ and S+ are local minima of F(X).

Proof. We prove for Q+. The S+ case is similar. According to the algorithm, i N \ Q+, F(i|Q+) 0. If j Q+, such that F(j|Q+ j) > 0, then we can suppose j was added into Q+ at a previous iteration t. Since Qt j Q+ j, based on the SSBC property, we have F(j|Qt j) > 0. This contradicts the proof of Lemma 1.

Because a global minimum is also a local minimum, Lemma 3 results in the following theorem.

Theorem 1. Any global minimum of F(X) is contained in the lattice L = [Q+, S+], i.e., X arg min X N F(X), Q+ X S+.

Unconstrained Quasi-Submodular Function

Maximization Unconstrained submodular function minimization problems can be exactly optimized in polynomial time (Orlin 2009). Yet unconstrained submodular maximization is NP-hard (Feige, Mirrokni, and Vondrak 2011). The best approximation ratio for unconstrained nonnegative submodular maximization is 1/2 (Buchbinder et al. 2012), which matches the known hardness result (Feige, Mirrokni, and Vondrak 2011). As a strict superset of submodular case, unconstrained quasisubmodular maximization is deﬁnitely NP-hard.

Iyer et al. (Iyer, Jegelka, and Bilmes 2013) also propose a discrete Minorization-Maximization like submodular maximization algorithm. They employ the permutation based subdifferential (Fujishige 2005) to construct tight modular lower bounds, and maximize the lower bounds as surrogate functions. With different permutation strategies, their algorithm actually mimics several existing approximation algorithms, which means their algorithm does not really reduce the lattices in optimization. In addition, for quasisubmodular cases, it also can be veriﬁed that the lower bounds in (Iyer, Jegelka, and Bilmes 2013) are no longer bounds, and quasi-submodular functions have no known subdifferential. Thus, even generalizing their algorithm is impossible.

We ﬁnd Buchbinder et al. (Buchbinder et al. 2012) propose a simple linear time approximation method. The algorithm maintains two working sets, S1 and S2, and S1 S2. At the start, S1 = and S2 = N. Then at each iteration, one element i S2 \S1 is queried to compute its marginal gains over the two working sets, i.e., F(i|S1) and F(i|S2 i). If F(i|S1) + F(i|S2 i) 0, then S1 S1 + i, otherwise S2 S2 i. After n iterations, the algorithm outputs S1 = S2. This algorithm is efﬁcient and reaches an approximation ratio of 1/3. However, the approximate algorithm may mistakenly remove a certain element e X from S2, or add an element u X into S1. Here, X is referred to as a global maximum. Consequently, the working lattices of their algorithm may not contain the global optima. By contrast, we want to reduce the lattices after each iteration while avoid taking erroneous steps. Fortunately, we ﬁnd that if we simultaneously maintain two working sets at each iteration, and take steps in a crossover method, quasi-submodularity can provide theoretical guarantees that the output lattices deﬁnitely contain all the global maxima. Hence, we propose the following algorithm.

Algorithm 2 Unconstrained Quasi-Submodular Function Maximization (UQSFMax) Input: Quasi-submodular function F, N = {1, 2, ..., n}, X0 , Y0 N, t 0. Output: Lattice [Xt, Yt]. 1: At Iteration t, ﬁnd Ut = {u Yt \ Xt | F(u|Yt u) > 0}. Xt+1 Xt Ut. 2: Find Dt = {d Yt \ Xt | F(d|Xt) < 0}. Yt+1 Yt \ Dt. 3: If Xt+1 = Xt and Yt+1 = Yt, stop and output [Xt, Yt]. 4: t t + 1. Back to Step 1.

To ensure the result lattice is not empty, we prove that after each iteration Algorithm 2 maintains a nonempty lattice as the following lemma shows.

Lemma 5. At each iteration of Algorithm 2, the lattice [Xt, Yt] is not empty, i.e., t, Xt Yt.

Proof. According to the deﬁnition, we have X0 Y0. Suppose Xt Yt, we then prove Xt+1 Yt+1. Because Ut, Dt Yt \ Xt, if we prove Ut Dt = , Xt+1 Yt+1 will be satisﬁed. According to the algorithm, u Ut, F(u|Yt u) > 0. Since Xt Yt u, and u Yt u, based on the SSBC property, we have F(u|Xt) > 0, which implies u Dt.

Algorithm 2 also has several very favorable theoretical guarantees. First, the objective function values are strictly increased after each iteration, as the following lemma states.

Lemma 6. After each except the last iteration of Algorithm 2, the objective function values of endpoint sets of lattice [Xt, Yt] are strictly monotonically increased, i.e., t, F(Xt+1) > F(Xt) or F(Yt+1) > F(Yt).

Proof. We prove F(Xt+1) > F(Xt). F(Yt+1) > F(Yt) can be proved using a similar approach. Suppose Ut = .

Deﬁne U k arg max U Ut:|U|=k F(Xt U), and Xk t = Xt U k. According to the algorithm, u Ut \ U k, F(u|Yt u) > 0. Since Xk t Yt u, and u Yt u, based on the SSBC property, we have F(u|Xk t ) > 0. This indicates F(Xk+1 t ) F(Xt (U k + u)) > F(Xt U k) = F(Xk t ). Note that F(X1 t ) = maxu Ut F(Xt + u) > F(Xt) = F(X0 t ). We then have F(Xt+1) = F(X|Ut| t ) > F(X|Ut| 1 t ) > > F(X0 t ) = F(Xt).

After the ﬁrst iteration of Algorithm 2, we get X1 = {i | F(i|N i) > 0}, and Y1 = {i | F(i| ) 0}. Based on Lemma 5, we have X1 Y1. Thus the reduced lattice L = [X1, Y1] [ , N] is not empty, and we show that it contains all the global maxima.

Lemma 7. Any global maximum of F(X) is contained in the lattice L = [X1, Y1], i.e., X arg max X N F(X), X1 X Y1.

Proof. We prove X1 X . X Y1 can be proved in a similar way. Suppose X1 X , i.e., u X1, u X . According to the deﬁnition, F(u|N u) > 0. Since X N u, based on the SSBC property, we have F(u|X ) > 0, that is F(X +u) > F(X ). This contradicts the optimality of X .

At each iteration of Algorithm 2, due to the SSBC property, Xt only adds elements and Yt only removes elements. Thus we have Xt Xt+1 and Yt+1 Yt, i.e., t, [Xt+1, Yt+1] [Xt, Yt]. We denote the output lattice of Algorithm 2 as [X+, Y+]. Then [X+, Y+] is the smallest lattice in the chain which consists of the working lattices: [X+, Y+] [Xt, Yt] [X1, Y1] [X0, Y0] = [ , N]. Based on Lemma 5, [X+, Y+] is not empty, then we prove that it contains all the global maxima of F.

Theorem 2. Suppose Algorithm 2 outputs lattice [X+, Y+]. Any global maximum of F(X) is contained in the lattice L = [X+, Y+], i.e., X arg max X N F(X), X+ X Y+.

Proof. Based on Lemma 7, we have X1 X Y1. Suppose Xt X Yt, we then prove Xt+1 X Yt+1. First, we suppose Xt+1 X . Because Xt+1 = Xt Ut, so u Ut, u X . According to the deﬁnition of Ut, F(u|Yt u) > 0. Since X Yt u, based on the SSBC property, we have F(u|X ) > 0. This implies F(X + u) > F(X ), which contradicts the optimality of X . Hence Xt+1 X . And X Yt+1 can be proved in a similar way.

Note that the proofs of Lemma 7 and Theorem 2 also work for local maximum cases, since we use singleton elements to construct contradictions.

Lemma 8. Any local maximum of F(X) is contained in the lattice L = [X+, Y+].

Lemma 8 indicates that if X+ (Y+) is a local maximum, it is the local maximum which contains the least (most) number of elements. Unfortunately, ﬁnding a local maximum for submodular functions is hard (Feige, Mirrokni, and Vondrak

2011), let alone quasi-submodular cases. Nonetheless, Algorithm 2 provides an efﬁcient strategy for search interval reduction, which is helpful because the reduction is on the exponential power. In the experimental section, we show the reduction can be quite surprising. Moreover, when an objective function has a unique local maximum, which is also the global maximum X+ = Y+, our algorithm can ﬁnd it quickly.

Theorem 3. Algorithm 2 terminates in O(n) iterations. The time complexity is O(n2).

Proof. After each iteration, at least one element is removed from the current working lattice, so it takes O(n) iterations to terminate. At each iteration, all the elements in the current working lattice need to be queried once. Hence, the total complexity of Algorithm 2 is O(n2).

Discussions In Algorithm 1, Q+ and S+ are local minima. While in Algorithm 2, X+ and Y+ may not be local maxima. Is it possible to ﬁnd a lattice for quasi-submodular maximization, where the endpoint sets are local maxima? We give an example to show that such a lattice may not exist. Suppose N = {1, 2}, F( ) = (N) = 1, and F({1}) = F({2}) = 1.5. It is easy to check that F is submodular, thus quasisubmodular. The set of local maxima is {{1}, {2}}. There is no local maximum which contains or is contained by all the other local maxima, since {1} and {2} are not comparable under the set inclusion relation. As aforementioned, unconstrained quasi-submodular function maximization is NP-hard. While for unconstrained minimization, whether there exists a polynomial time algorithm or not is open now.

Experimental Results In this section, we experimentally verify the effectiveness and efﬁciency of our proposed algorithms. We implement our algorithms using the SFO toolbox (Krause 2010) and Matlab. All experiments are run on a single core Intel i5 2.8 GHz CPU with 4GB RAM. We list several widely used quasi-submodular functions and the settings of our experiments as the following:

Iwata s function F(X) = |X||N \ X| P

i X (5i 2n)

(Fujishige and Isotani 2011). The ground set cardinality is set to be n = 5000.

The COM (concave over modular) function F(X) = p

w1(X) + w2(N \ X), where w1 and w2 are randomly generated in [0, 1]n. This function is applied in speech corpora selection (Iyer, Jegelka, and Bilmes 2013). The ground set cardinality is set to be n = 5000.

The half-products function F(X) = P

i,j X,i j a(i)b(j)

c(X), where a, b, c are modular functions, and a, b are non-negative. This function is employed in formulations of many scheduling problems and energy models (Boros and Hammer 2002). Since F is quasi-supermodular, we

minimize F through equivalently maximizing the quasisubmodular function F, i.e., min F = max ( F). n is set to be 100. The linearly perturbed functions. We consider the perturbed facility location function F(X) = L(M, X) + σ(X), where L(M, X) is the facility location function. M is a n d positive matrix. σ is a n-dimensional modular function which denotes the perturbing noise of facility. We set n = 100, d = 400, and randomly generate M in [0.5, 1]n d, the perturbing noise σ in [ 0.01, 0.01]n. The determinant function F(X) = det(KX), where K is a real n n positive deﬁnite matrix indexed by the elements of N, and KX = [Kij]i,j X is the restriction of K to the indices of X. This function is used to represent the sampling probability of determinantal point processes (Kulesza and Taskar 2012). We set n = 100. The multiplicatively separable function F(X) = Πk i=1Fi(Xi). One example is the Cobb-Douglas production function F(X) = Πn i=1w(i)αi, where w 0 and αi 0. This function is applied in economic ﬁelds (Topkis 1998). We set n = 2000.

We are concerned with the approximation ratio of an optimization algorithm. We compare the approximation ratio and running time of UQSFMax with MMax (Iyer, Jegelka, and Bilmes 2013). For MMax, we consider the following variants: random permutation (RP), randomized local search (RLS), and randomized bi-directional greedy (RG). For UQSFMax, we use it as the preprocessing steps of RP, RLS and RG, and denote the corresponding combined methods as URP, URLS, and URG.

Table 1: Average lattice reduction rates.

Algorithm UQSFMax UQSFMin Iwata s function 99.9% 99.9% COM function 99.5% 100.0% half-products function 51.2% 48.8% linearly perturbed function 99.3% 99.8% determinant function 87.0% 72.6% Multi. separable function 100.0% 100.0%

Table 2 presents the approximation ratios while Table 3 shows the running time. According to the comparison results, we ﬁnd that using our UQSFMax as the preprocessing steps of other approximation methods can reach comparable or better approximation performance while improve the efﬁciency, since the UQSFMax can efﬁciently reduce the search spaces of other approximation algorithms, and the reduced lattices deﬁnitely contain all the local and global optima as shown in the previous theoretical analysis. Note that for non-submodular functions (determinant function and multiplicatively separable function), at present we have no efﬁcient method to get approximate optima. So we cannot calculate the approximation ratios and we just record the average lattice reduction rates. We also record the rates of other functions for completeness.

Table 2: Approximation ratios of different algorithms and functions.

Algorithm RP URP RLS URLS RG URG Iwata s function 0.94 1.00 0.99 1.00 0.98 1.00 COM function 0.99 1.00 0.99 1.00 0.99 1.00 half-products function 0.96 0.97 0.95 0.94 0.96 0.99 linearly perturbed function 0.99 1.00 0.99 1.00 0.99 1.00

Table 3: Running time (seconds) of different algorithms and functions.

Algorithm RP URP RLS URLS RG URG Iwata s function 96.18 2.42 240.62 2.47 194.30 2.41 COM function 43.85 7.01 194.52 6.91 366.43 7.16 half-products function 0.35 0.22 0.98 0.52 9.96 4.59 linearly perturbed function 1.37 0.06 3.12 0.06 15.92 0.06

The average lattice reduction rates are shown in Table 1. This result also matches the running time. For example, the average lattice reduction rate for half-products function is 51.2%, and the running time of URG is about a half of the running time of RG. For minimization, we have similar lattice reduction results, which are also presented in Table 1.

Related Work

In this section, we introduce some related work of quasisubmodularity. Quasi-submodularity stems from economic ﬁelds. Milgrom and Shannon (Milgrom and Shannon 1994) ﬁrst propose the deﬁnition of quasi-supermodularity. They ﬁnd that the maximizer of a quasi-supermodular function is monotone as the parameter changes. In combinatorial optimization, for quasi-submodular functions, this property means the set of minimizers has a nested structure, which is the foundation of the proposed UQSFMin algorithm. Another related direction is discrete quasi-convexity (Murota 2003; Murota and Shioura 2003), which departs further from combinatorial optimization. In this paper, we consider set functions, i.e., functions deﬁned on {0, 1}n. While in (Murota and Shioura 2003), quasi L-convex function, which is deﬁned on Zn, is proposed. In (Murota and Shioura 2003), quasi L-convex function is a kind of integer-valued function. When we restrict its domain from Zn to {0, 1}n, quasi L-convex function reduces to quasi-submodular function. Meanwhile, their results based on Zn domain extension reduces to trivial cases in combinatorial optimization. Hence, we view quasi L-convexity (Murota and Shioura 2003) as a generalization of quasisubmodularity based on domain extension, i.e., extending the domain from {0, 1}n to Zn. Unlike submodularity, quasi-submodularity is not wellknown. Nonetheless, there are several applications related to quasi-submodularity scattered in different ﬁelds. In rent seeking game, every contestant tends to maximize his probability of winning for a rent by adjusting his bidding.

The payoff function of each contestant is quasi-submodular on his bidding and the total bidding of all the contestants. Rent seeking game is aggregative quasi-submodular game, where each player s payoff function is quasi-submodular. We refer readers to (Schipper 2004) for more details and examples of aggregative quasi-submodular games. In minimum cut problems with parametric arc capacities, submodularity implies nested structural properties (Granot et al. 2012). While quasi-submodularity also leads to the same properties. But how to employ the properties to ﬁnd an efﬁcient max ﬂow update algorithm for quasi-submodular functions is open at present (Granot et al. 2012).

Conclusions

In this paper, we go beyond submodularity, and focus on a universal generalization of submodularity called quasisubmodularity. We propose two effective and efﬁcient algorithms for unconstrained quasi-submodular function optimization. The theoretical analyses and experimental results demonstrate that although quasi-submodularity is a weaker property than submodularity, it has some good properties in optimization, which lead to lattice reduction while enable us to keep local and global optima in reduced lattices. In our future work, we would like to make our algorithms exact for quasi-submodular function minimization and approximate for quasi-submodular function maximization if it is possible, and try to incorporate the constrained optimization into our framework.

Acknowledgments

This work was supported partially by the National Basic Research Program of China (No. 2013CB329401), the National Natural Science Foundation of China (No. 61272248), the Science and Technology Commission of Shanghai Municipality (No. 13511500200), and the European Union Seventh Framework Program (No. 247619). Asterisk indicates the corresponding author.

Boros, E., and Hammer, P. L. 2002. Pseudo-boolean optimization. Discrete applied mathematics 123(1):155 225. Boyd, S. P., and Vandenberghe, L. 2004. Convex optimization. Cambridge university press. Buchbinder, N.; Feldman, M.; Naor, J.; and Schwartz, R. 2012. A tight linear time (1/2)-approximation for unconstrained submodular maximization. In IEEE Annual Symposium on Foundations of Computer Science, 649 658. Das, A., and Kempe, D. 2011. Submodular meets spectral: Greedy algorithms for subset selection, sparse approximation and dictionary selection. In International Conference on Machine Learning, 1057 1064. Edmonds, J. 2003. Submodular functions, matroids, and certain polyhedra. In Combinatorial Optimization-Eureka, You Shrink! Springer. 11 26. Feige, U.; Mirrokni, V. S.; and Vondrak, J. 2011. Maximizing non-monotone submodular functions. SIAM Journal on Computing 40(4):1133 1153. Feige, U. 2009. On maximizing welfare when utility functions are subadditive. SIAM Journal on Computing 39(1):122 142. Fujishige, S., and Isotani, S. 2011. A submodular function minimization algorithm based on the minimum-norm base. Paciﬁc Journal of Optimization 7(1):3 17. Fujishige, S. 2005. Submodular functions and optimization, volume 58. Elsevier. Golovin, D., and Krause, A. 2011. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. Journal of Artiﬁcial Intelligence Research 42(1):427 486. Granot, F.; Mc Cormick, S. T.; Queyranne, M.; and Tardella, F. 2012. Structural and algorithmic properties for parametric minimum cuts. Mathematical programming 135(1-2):337 367. Iyer, R.; Jegelka, S.; and Bilmes, J. 2013. Fast semidifferential-based submodular function optimization. In International Conference on Machine Learning, 855 863. Krause, A.; Singh, A.; and Guestrin, C. 2008. Near-optimal sensor placements in gaussian processes: Theory, efﬁcient algorithms and empirical studies. The Journal of Machine Learning Research 9:235 284. Krause, A. 2010. Sfo: A toolbox for submodular function optimization. The Journal of Machine Learning Research 11:1141 1144. Kulesza, A., and Taskar, B. 2012. Determinantal point processes for machine learning. Foundations and Trends in Machine Learning 5(2-3):123 286. Leininger, W. 2006. Fending off one means fending off all: evolutionary stability in quasi-submodular aggregative games. Economic Theory 29(3):713 719. Liu, M.; Tuzel, O.; Ramalingam, S.; and Chellappa, R. 2011. Entropy rate superpixel segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, 2097 2104.

Lov asz, L. 1983. Submodular functions and convexity. In Mathematical Programming The State of the Art. Springer. 235 257. Milgrom, P., and Shannon, C. 1994. Monotone comparative statics. Econometrica: Journal of the Econometric Society 157 180. Murota, K., and Shioura, A. 2003. Quasi m-convex and lconvex functions - quasiconvexity in discrete optimization. Discrete Applied Mathematics 131(2):467 494. Murota, K. 2003. Discrete convex analysis, volume 10. SIAM. Orlin, J. B. 2009. A faster strongly polynomial time algorithm for submodular function minimization. Mathematical Programming 118(2):237 251. Schipper, B. C. 2004. Submodularity and the evolution of walrasian behavior. International Journal of Game Theory 32(4):471 477. Singh, A. P.; Guillory, A.; and Bilmes, J. 2012. On bisubmodular maximization. In International Conference on Artiﬁcial Intelligence and Statistics, 1055 1063. Topkis, D. M. 1998. Supermodularity and complementarity. Princeton University Press. Vondr ak, J. 2008. Optimal approximation for the submodular welfare problem in the value oracle model. In ACM Symposium on Theory of Computing, 67 74. ACM.