# Group-Fairness in Influence Maximization

Alan Tsang¹, Bryan Wilder², Eric Rice³, Milind Tambe² and Yair Zick¹
¹Department of Computer Science, National University of Singapore
²School of Engineering and Applied Sciences, Harvard University
³Center for AI in Society, University of Southern California
akhtsang@nus.edu.sg, bwilder@g.harvard.edu, ericr@usc.edu, milind_tambe@harvard.edu, zick@comp.nus.edu.sg
*Equal contribution.

Abstract

Influence maximization is a widely used model for information dissemination in social networks. Recent work has employed such interventions across a wide range of social problems, spanning public health, substance abuse, and international development (to name a few examples). A critical but understudied question is whether the benefits of such interventions are fairly distributed across different groups in the population; e.g., avoiding discrimination with respect to sensitive attributes such as race or gender. Drawing on legal and game-theoretic concepts, we introduce formal definitions of fairness in influence maximization. We provide an algorithmic framework to find solutions which satisfy fairness constraints, and in the process improve the state of the art for general multi-objective submodular maximization problems. Experimental results on real data from an HIV prevention intervention for homeless youth show that standard influence maximization techniques oftentimes neglect smaller groups which contribute less to overall utility, resulting in a disparity which our proposed algorithms substantially reduce.

1 Introduction

Influence maximization in social networks is a well-studied problem with applications in a broad range of domains. Consider, for example, a group of at-risk youth; outreach programs try to provide as many people as possible with useful information (e.g., HIV safety, or available health services). Since resources (e.g., social workers) are limited, it is not possible to personally reach every at-risk individual. It is thus important to target key community figures who are likely to spread vital information to others. Formally, individuals are nodes $V$ in a social network, and we would like to influence or activate as many of them as possible. This can be done by initially seeding $k$ nodes (where $k \ll |V|$). The seed nodes activate their neighbors with some probability, who activate their neighbors and so forth. Our goal is to identify $k$ seeds such that the maximal number of nodes is activated. This is the classic influence maximization problem [Kempe et al., 2003], which has received much attention in the literature. In recent years, the influence maximization framework has seen application to many social problems, such as HIV prevention for homeless youth [Yadav et al., 2018; Wilder et al., 2018b], public health awareness [Valente and Pumpuang, 2007], financial inclusion [Banerjee et al., 2013], and more. Frequently, small and marginalized groups within a larger community are those who benefit the most from attention and assistance. It is important, then, to ensure that the allocation of resources reflects and respects the diverse composition of our communities, and that each group receives a fair allocation of the community's resources. For instance, in the HIV prevention domain we may wish to ensure that members of racial minorities or of LGBTQ identity are not disproportionately excluded; this is where our work comes in.
Our Contributions. This paper introduces the problem of fair resource allocation in influence maximization. Our first contribution is to propose fairness concepts for influence maximization. We start with a maximin concept inspired by the legal notion of disparate impact; formally, it requires us to maximize the minimum fraction of nodes within each group that are influenced. While intuitive and well-motivated, this definition suffers from shortcomings that lead us to introduce a second concept, diversity constraints. Roughly, diversity constraints guarantee that every group receives influence commensurate with its "demand", i.e., what it could have generated on its own, based on a number of seeds proportional to its size. Here, to compute a group's demand, we allow it a number of seeds proportional to its size, but require that it spreads influence using only nodes in the group. Hence, a small but well-connected group may have a better claim for influence than a large but sparsely connected group.

Our second contribution is an algorithmic framework for finding solutions that satisfy either fairness concept. While the classical influence maximization problem is submodular (and hence easily solved with a greedy algorithm), fairness considerations produce strongly non-submodular objectives. This renders standard techniques inapplicable. We show that both fairness concepts can be reduced to multi-objective submodular optimization problems, which are substantially more complex. Our key algorithmic contribution is a new method for general multi-objective submodular optimization which has a substantially better approximation guarantee than the current best algorithm [Udwani, 2018], and often better runtime as well. This result may be of independent interest.

Our third contribution is an analytical exploration of the price of group fairness in influence maximization, i.e., the reduction in social welfare with respect to the unconstrained influence maximization problem due to imposing a fairness concept. We show that the price of fairness can be high in general for both concepts and under a range of settings.

Our fourth contribution is an empirical study on real-world social networks that have been used for a socially critical application: HIV prevention for homeless youth. Our results show that standard influence maximization techniques often cause substantial fairness violations by neglecting small groups. Our proposed algorithm substantially reduces such violations at relatively small cost to overall utility.

Related Work. Kempe et al. [2003] introduced influence maximization and proved that, since the objective is submodular, greedily selecting nodes gives a $(1 - \frac{1}{e})$-optimal solution. There has since been substantial interest among the AI community, both in developing more scalable algorithms (see Li et al. [2018] for a recent survey) and in addressing the challenges of deployment in public health settings [Yadav et al., 2016; Wilder et al., 2018a]. Recently, such algorithms have been used in real-world pilot tests for HIV prevention amongst homeless youth [Yadav et al., 2018; Wilder et al., 2018b], driving home the need to consider fairness as influence maximization is applied in socially sensitive domains. To our knowledge, no previous work considers fairness specifically for influence maximization.
Some literature exists on targeted influence maximization problems [Pasumarthi et al., 2015; Wen et al., 2018; Chen et al., 2019], where the objective is to reach a specific set of nodes and not others; by contrast, our goal is to ensure that every group receives a fair amount of influence spread. The techniques we introduce to optimize fairness metrics are related to research on multi-objective submodular maximization (outside the context of fairness), and we improve existing theoretical guarantees for this general problem [Chekuri et al., 2010; Udwani, 2018].

Outside of influence maximization, the general idea of diversity as an optimization constraint has received considerable attention in recent years; it has been studied in multiwinner elections (see [Bredereck et al., 2018; Faliszewski et al., 2017] for an overview), resource allocation [Benabbou et al., 2018], and matching problems [Ahmed et al., 2017; Hamada et al., 2017]. We note that some of the above works (e.g., Ahmed et al. [2017] and Schumann et al. [2017]) use a submodular objective function as a means of achieving diversity; interestingly, while the classic influence maximization target function is submodular, it is no longer so under diversity constraints. Group fairness has been studied extensively in the voting theory literature, where the objective is to identify a committee of $k$ candidates that will satisfy subsets of voters (see a comprehensive overview in [Faliszewski et al., 2017]). There have also been several works on group fairness in fair division, defining notions of group envy-freeness [Conitzer et al., 2019; Fain et al., 2018; Segal-Halevi and Suksompong, 2018; Todo et al., 2011] and a group maximin share guarantee [Barman et al., 2019; Suksompong, 2018]. One line of work in operations research uses mixed-integer programming to enforce that different groups receive equal utility (or at least that each group's utility satisfies some lower bound) [Bertsimas et al., 2013; Azizi et al., 2018]. This manner of defining fairness is relatively close to our own; e.g., our diversity constraints give one way of instantiating what this lower bound should be. Computationally, we introduce efficient algorithms tailored to the submodular optimization problem instead of using mixed-integer programming.

2 Model

Agents are embedded in a social network $G = (V, E)$. An edge $(i, j) \in E$ represents the ability of agent $v_i$ to influence or activate $v_j$. $G$ may be undirected or directed.

Diversity. Each agent in our network may identify with one or more groups within the larger population. These represent different ethnicities, genders, sexual orientations, or other groups for which fair treatment is important. Our goal is to maximize influence in a way such that each group receives at least a fair share of influence (more on this below). Let us designate these groups as $\mathcal{C} = \{C_1, \dots, C_m\}$. Each group $C_i$ is a non-empty subset of $V$: $\emptyset \neq C_i \subseteq V$. Each agent must belong to at least one group, but may belong to multiple groups; i.e., $C_1 \cup C_2 \cup \dots \cup C_m = V$. In particular, this allows for the expression of intersectionality, where an individual may be part of several minority groups.

Influence Maximization. We model influence using the independent cascade model [Kempe et al., 2003], the most common model in the literature. All nodes begin in the inactive state. The decision maker then selects $k$ seed nodes to activate.
Each node that is activated makes one attempt to activate each of its inactive neighbors; each attempt succeeds independently with probability $p$ (all of our results also hold for nonuniform probabilities). Newly activated nodes attempt to activate their neighbors and so on, with the process terminating once there are no new activations. We define the influence of nodes $A \subseteq V$, denoted $I_G(A)$, as the expected number of nodes activated by seeding $A$. Of these, let $I_{G,C_i}(A)$ be the expected number of activated vertices from $C_i$. Traditional influence maximization seeks a set $A$, $|A| \leq k$, maximizing $I_G(A)$. Using a slight abuse of notation, let $I_G(k)$ be the maximum influence that can be achieved by selecting $k$ seed nodes; that is, $I_G(k) = \max_{|A|=k} I_G(A)$. Analogously, we define $I_{G,C_i}(k)$ as the maximum expected number of vertices from $C_i$ that can be activated by $k$ seeds. We now propose two means of capturing group fairness in influence maximization.

Maximin Fairness. Maximin fairness captures the straightforward goal of improving the conditions for the least well-off groups. That is, we want to maximize the minimum influence received by any of the groups, proportional to their population. This leads to the following utility function based on seed nodes $A$:
$$U_{\text{Maximin}}(A) = \min_i \frac{I_{G,C_i}(A)}{|C_i|}.$$
Subject to this maximin constraint, we seek to maximize overall influence. Thus, we define $I^{\text{Maximin}}_G = I_G(B)$ with $B = \operatorname{argmax}_{A \subseteq V, |A| = k} U_{\text{Maximin}}(A)$. That is, $I^{\text{Maximin}}_G$ is the expected number of nodes activated by a seed configuration that maximizes the minimum proportional influence received by any group. This corresponds to the legal concept of disparate impact, which roughly states that a group has been unfairly treated if its success rate under a policy is substantially worse than that of other groups (see [Barocas and Selbst, 2016] for an overview). Therefore, maximin fairness may be significant to governmental or community organizations which are constrained to avoid this form of disparity. However, optimizing for equality of outcomes may be undesirable when some groups are simply much better suited than others to a network intervention. For instance, if one group is very poorly connected, maximin fairness would require that a large number of seeds be spent trying to reach this group, even though additional seeds have relatively small impact.

Diversity Constraints. We now propose an alternate fairness concept by extending the notion of individual rationality to group rationality. The key idea is that no group should be better off by leaving the (influence maximization) game with their proportional allocation of resources and allocating them internally. For each group $C_i$, let $k_i = \lceil k|C_i|/|V| \rceil$ be the number of seeds that would be fairly allocated to the group $C_i$ based on the group's size within the larger population (rounded up to remove any doubt that this group receives a fair share); $k_i$ is the fair allocation of seeds to the group. Let $G[C_i]$ be the subgraph induced from $G$ by the nodes $C_i$. This represents the network formed by group $C_i$ if they were to separate from the original network. Now, we define the group rational influence that each group $C_i$ can expect to receive as the number of nodes they expect to activate if they left the network, with their fair allocation of $k_i$ seeds. We denote this group rational influence for $C_i$ as $I_{G[C_i]}(k_i)$.
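To make these quantities concrete, the following is a minimal Python sketch (not the authors' released code) of estimating $I_G(A)$ and $I_{G,C_i}(A)$ by Monte Carlo simulation of the independent cascade, and of approximating a group's demand $I_{G[C_i]}(k_i)$ by running greedy seed selection on the induced subgraph. The helper names, the adjacency-dict representation, and the sample counts are our own choices; greedy only approximates the demand, consistent with the relaxation discussed below.

```python
import random

def simulate_ic(adj, seeds, p, rng):
    """One independent-cascade run; returns the set of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in active and rng.random() < p:
                    active.add(v)
                    newly.append(v)
        frontier = newly
    return active

def influence(adj, seeds, p, group=None, n_sims=500, seed=0):
    """Monte Carlo estimate of I_G(seeds); restricted to the set `group` if given."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        active = simulate_ic(adj, seeds, p, rng)
        total += len(active if group is None else active & group)
    return total / n_sims

def group_demand(adj, group, k_i, p):
    """Approximate I_{G[C_i]}(k_i): greedy seeding on the subgraph induced by `group`."""
    sub = {u: {v for v in adj.get(u, ()) if v in group} for u in group}
    chosen = set()
    for _ in range(k_i):
        best = max((v for v in group if v not in chosen),
                   key=lambda v: influence(sub, chosen | {v}, p))
        chosen.add(best)
    return influence(sub, chosen, p)
```

Usage would look like `group_demand(adj, set(C_i), k_i, 0.1)`, where `adj` maps each node to its neighbors.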
Then, we devise a set of diversity constraints that any group rational seeding configuration $A$ with $k$ seeds must satisfy:
$$I_{G,C_i}(A) \geq I_{G[C_i]}(k_i) \quad \forall i.$$
That is, the influence received by each group is at least equal to what that group could accomplish on its own when given its fair share of $k_i$ seed nodes. The diversity constraint objective is to maximize the expected number of nodes activated, subject to the above constraints. The utility for selecting seed nodes $A$ is:
$$U_{\text{Rational}}(A) = \begin{cases} I_G(A), & \text{if } I_{G,C_i}(A) \geq I_{G[C_i]}(k_i) \ \forall i,\\ 0, & \text{otherwise.} \end{cases}$$
The maximum expected influence obtained via a group rational seeding configuration is called the rational influence $I^{\text{Rational}}_G = I_G(B)$, where $B = \operatorname{argmax}_{A \subseteq V, |A| = k} U_{\text{Rational}}(A)$. Note that since even the standard influence maximization problem is already NP-hard and must be approximated, our computational guarantees will relax the above constraint, requiring that each group receive influence within some factor $\alpha$ of $I_{G[C_i]}(k_i)$.

Price of Fairness. To measure the cost of ensuring a fair outcome for the diverse population, we measure the Price of Fairness, the ratio of optimal influence to the best achievable influence under each of our two fairness criteria. Here the optimal influence is $I^{\text{OPT}}_G = I_G(k)$, the maximum expected influence that can be obtained using any choice of $k$ seed nodes. We omit the subscript where the context is clear.
$$\text{PoF}_{\text{Rational}} = \frac{I^{\text{OPT}}}{I^{\text{Rational}}}, \qquad \text{PoF}_{\text{Maximin}} = \frac{I^{\text{OPT}}}{I^{\text{Maximin}}}.$$

3 Optimization

The standard approach to influence maximization is based on submodularity. Formally, a set function $f$ on ground set $V$ is submodular if for every $A \subseteq B \subseteq V$ and $x \in V \setminus B$, $f(A \cup \{x\}) - f(A) \geq f(B \cup \{x\}) - f(B)$. This captures the intuition that additional seeds provide diminishing returns. However, both of our fairness concepts are easily shown to violate this property (proofs are deferred to the appendix):

Theorem 3.1. $U_{\text{Maximin}}$ and $U_{\text{Rational}}$ are not submodular.

We remark that each individual function $I_{G,C_i}$, i.e., the number of nodes in group $i$ who are reached, is submodular. However, this property does not hold for the combined objectives $U_{\text{Maximin}}$ and $U_{\text{Rational}}$. Hence, we cannot apply the greedy heuristic to group-fair influence maximization. Instead, we now show that optimizing either utility function reduces to multiobjective submodular maximization, a more general problem defined as follows. The input to the problem is a set of monotone submodular functions $f_1, \dots, f_m$ and corresponding target values $W_1, \dots, W_m$. We assume that the $f_i$ are normalized ($f_i(\emptyset) = 0$). The multiobjective submodular maximization problem is to find a set $S$ satisfying $|S| \leq k$ with $f_i(S) \geq W_i$ for all $i$, assuming that such an $S$ exists.

3.1 Reduction to Multiobjective Submodular Maximization

We now show that each of the fairness-aware influence maximization objectives can be reduced to solving a small number of instances of multiobjective submodular maximization with appropriately chosen functions $f_i$ and targets $W_i$. Our reductions leverage the property that the underlying influence functions $I_{G,C_i}$ are submodular even though the group-fair objectives are not. We start with $U_{\text{Maximin}}$. Here, we define $f_i = I_{G,C_i}/|C_i|$ to be group $i$'s influence spread normalized by the size of the group. All of the target values $W_i$ are equal, i.e., $W_1 = W_2 = \dots = W_m = W$. Assume that we have a subroutine for multiobjective submodular maximization. If the multiobjective problem is feasible for a given value of $W$, then the subroutine outputs a set $S$ satisfying $U_{\text{Maximin}}(S) \geq W$. Hence, we simply binary search for the highest value of $W$ for which the multiobjective problem remains feasible.
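As an illustration of this reduction, here is a minimal sketch of the binary-search driver. It assumes a multiobjective solver `solve_multiobjective(fns, targets, k)` that returns a seed set meeting every target (at least approximately) or `None` if it cannot; the solver interface, the tolerance, and the reuse of the `influence` helper from the earlier sketch are our own assumptions, not the paper's API.

```python
def maximin_via_multiobjective(adj, groups, k, p, solve_multiobjective, tol=1e-3):
    """Binary search over the common target W in the maximin reduction."""
    # f_i = group i's influence spread normalized by |C_i|, as in Section 3.1.
    fns = [lambda S, C=frozenset(C): influence(adj, S, p, group=C) / len(C)
           for C in groups]
    lo, hi, best = 0.0, 1.0, None          # normalized spread lies in [0, 1]
    while hi - lo > tol:
        W = (lo + hi) / 2.0
        S = solve_multiobjective(fns, [W] * len(fns), k)
        if S is not None:
            best, lo = S, W                # feasible: push the target higher
        else:
            hi = W                         # infeasible: lower the target
    return best
```

The diversity-constraint case is analogous: the group targets are fixed to $W_i = I_{G[C_i]}(k_i)$ and the binary search runs over the target for an added total-influence objective, as described next.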
For $U_{\text{Rational}}$, we let $f_i = I_{G,C_i}$ and set the target $W_i = I_{G[C_i]}(k_i)$. This represents the constraint that group $i$ must receive at least its group-rational share of utility. We then add another objective function $f_{\text{total}} = I_G$ representing the combined utility, and binary search for the highest value $W_{\text{total}}$ such that the targets $W_1, \dots, W_m, W_{\text{total}}$ are feasible. This represents the largest achievable total utility, subject to diversity constraints.

Having reduced both fairness concepts to multiobjective submodular maximization, we turn to algorithms for this core problem. We present an algorithm with substantially improved theoretical guarantees for the general multiobjective problem, and then show how our algorithm can be applied to fair influence maximization.

3.2 Previous Techniques

The multiobjective submodular problem was introduced by Chekuri et al. [2010], who gave an algorithm which guarantees $f_i \geq (1 - \frac{1}{e})W_i$ for all $i$, provided that the number of objectives $m$ is smaller than the budget $k$ (when $m = \Omega(k)$, the problem is provably inapproximable [Krause et al., 2008]). Unfortunately, this algorithm is of mostly theoretical interest since it runs in time $O(n^8)$. Udwani [2018] recently introduced a practically efficient algorithm; however, it obtains an asymptotic $(1 - \frac{1}{e})^2$-approximation instead of the optimal $1 - \frac{1}{e}$. We remedy this gap by providing a practical algorithm obtaining an asymptotic $(1 - \frac{1}{e})$-approximation (Algorithm 1). Its runtime is comparable to, and under many conditions faster than, the algorithm of [Udwani, 2018].

Previous algorithms [Chekuri et al., 2010; Udwani, 2018] start from a common template in submodular optimization, which we also build on. The main idea is to relax the discrete problem to a continuous space. For a given submodular function $f$, its multilinear extension $F$ is defined on $n$-dimensional vectors $x$ where $0 \leq x_j \leq 1$ for all $j \in V$; $x_j$ represents the probability that item $j$ is included in the set. Formally, let $S \sim x$ denote a set which includes each $j$ independently with probability $x_j$. Then, we define $F(x) = \mathbb{E}_{S \sim x}[f(S)]$, which can be evaluated using random samples.
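For concreteness, the multilinear extension and its partial derivatives can be estimated by sampling, using the standard identity $\partial F/\partial x_j = \mathbb{E}_{S \sim x}\big[f(S \cup \{j\}) - f(S \setminus \{j\})\big]$. The sketch below (our own helper names, with $x$ stored as a node-to-probability dict) is one simple way to do this; it is not the paper's implementation.

```python
import random

def multilinear_value(f, x, n_samples=200, seed=0):
    """Monte Carlo estimate of F(x) = E_{S~x}[f(S)] for a set function f."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        S = {j for j, xj in x.items() if rng.random() < xj}
        total += f(S)
    return total / n_samples

def multilinear_partial(f, x, j, n_samples=200, seed=0):
    """Unbiased estimate of dF/dx_j = E[f(S + j) - f(S - j)] with S ~ x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Sample all coordinates except j, then compare including vs. excluding j.
        S = {i for i, xi in x.items() if i != j and rng.random() < xi}
        total += f(S | {j}) - f(S)
    return total / n_samples
```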
3.3 Algorithm Overview

The main challenge is to solve the continuous optimization problem, which is where our technical contribution lies. Algorithm 1 describes the high-level procedure, which runs our continuous optimization subroutine (line 2) and then rounds the output to a discrete set (line 3). Line 1, which ensures that all items with value above a threshold $\tau$ are included in the solution, is a technical detail needed to ensure the rounding succeeds. The rounding process captured in lines 1 and 3 is fairly standard and is used by both previous algorithms [Chekuri et al., 2010; Udwani, 2018]. Our main novelty lies in an improved algorithm for the continuous problem, MULTIFW.

Algorithm 1: Multiobjective Optimization($\gamma, \tau, T, T', \eta$)
1: $S_1 = \{j : f_i(\{j\}) \geq \tau$ for some $i\}$
2: $x = \text{MULTIFW}(k - |S_1|, \{\gamma (W_i - f_i(S_1))\}_{i=1}^m)$
3: $S_2 = \text{SWAPROUND}(x)$  // see [Chekuri et al., 2010]
4: return $S_1 \cup S_2$

Algorithm 2: Multiobjective Frank-Wolfe($k, \{W_i\}$), MULTIFW
1: $x_0 = 0$
2: for $t = 1 \dots T$ do
3:   $v_t = \text{S-SP-MD}(x_{t-1}, \{i : W_i - F_i(x_{t-1}) \geq \epsilon\})$
4:   $x_t = x_{t-1} + \frac{1}{T} v_t$
5: return $\text{APPROXDECOMPOSITION}(x_T)$  // see [Mirrokni et al., 2017]
6: function S-SP-MD($x$, $I$)
7:   Initialize $v$ s.t. $\|v\|_1 = k$ and $y \in \Delta(I)$ arbitrarily
8:   for $\ell = 1 \dots T'$ do
9:     Sample $i \sim y$; set $\hat{v} = \frac{1}{W_i - F_i(x)} A^i_{\text{grad}}(x)$
10:    Sample $j \sim v$; set $\hat{y} = k \cdot \operatorname{diag}\!\left(\frac{1}{W - F(x)}\right) A^j_{\text{item}}(x)$
11:    $y = \dfrac{y\, e^{-\eta \hat{y}}}{\|y\, e^{-\eta \hat{y}}\|_1}$
12:    $v = \dfrac{k \min\{v\, e^{\eta \hat{v}}, 1\}}{\|\min\{v\, e^{\eta \hat{v}}, 1\}\|_1}$

MULTIFW implements a Frank-Wolfe style algorithm to simultaneously optimize the multilinear extensions $F_1, \dots, F_m$ of the discrete objectives. The algorithm proceeds over $T$ iterations. Each iteration first identifies $v_t$, a good feasible point in continuous space (Algorithm 2, line 3). Then, the current solution $x_t$ is updated to add $\frac{1}{T} v_t$ (line 4). Since each point $v_t$ is feasible, $x_t$ is a convex combination of feasible points and hence always remains feasible. The key to the algorithm is a good choice of the direction $v_t$ at each iteration. Roughly, we would like to choose $v_t$ in a way that ensures we make progress towards meeting the target $W_i$ for each $F_i$. If our current solution quality $F_i(x_{t-1})$ is very far from $W_i$, then $v_t$ should focus heavily on improving the value of $F_i$. By contrast, if $F_i(x_{t-1})$ is already close to $W_i$, then $v_t$ should focus on improving other objectives instead. This process is formally accomplished via a subroutine S-SP-MD, which we introduce in the subsection below.

The output of Algorithm 2 is the final point $x_T$ produced after $T$ iterations. There is one technical detail to take care of, reflected in line 5. Common rounding algorithms for submodular maximization require not just the fractional point $x_T$, but also a representation of $x_T$ as a convex combination of integral points, i.e., as a combination of binary vectors representing feasible sets. The rounding algorithm will then merge these binary vectors together to produce the final output set. Producing this convex combination is the well-known Carathéodory problem of decomposing a point in a polytope into a combination of vertices. While the problem can be solved exactly via convex optimization, doing so may incur unnecessarily high runtime. To reduce the time complexity of the algorithm, we find the decomposition via an approximate method recently introduced by Mirrokni et al. [2017]. The details of this method are unimportant (we use it just as a black box; any method for solving the Carathéodory problem would suffice). In our theoretical analysis, we show that the loss in solution quality due to using an approximate decomposition is negligible (formally, an arbitrarily small $\epsilon$).

3.4 Choosing the Direction

The key challenge is to efficiently find a $v_t$ that makes sufficient progress towards every objective simultaneously. We accomplish this by introducing the subroutine S-SP-MD (Algorithm 2, lines 6-12), which runs a carefully constructed version of stochastic saddle-point mirror descent [Nemirovski et al., 2009]. We first motivate and formalize the problem that S-SP-MD attempts to solve. Then, we give some background on mirror descent and explain the steps of the algorithm. As explained earlier, $v_t$ must be chosen so that it makes progress towards those objectives for which $W_i - F_i(x_{t-1})$ is large (i.e., we are far from the target). As a first step, we will ignore all $i$ for which $W_i - F_i(x_{t-1}) < \epsilon$, since for these objectives the current solution is already sufficiently good. Let $I$ denote the set of remaining objectives, where $W_i - F_i(x_{t-1}) \geq \epsilon$. For each $i$, let $\nabla F_i(x_{t-1})$ denote the gradient of $F_i$. We will use the gradients of the functions in $I$ to choose $v_t$.
Specifically, our goal is to find a feasible $v$ such that
$$\nabla F_i(x_{t-1}) \cdot v \geq W_i - F_i(x_{t-1}) \quad \forall i \in I. \tag{1}$$
It can be shown that such a $v$ always exists whenever the overall multiobjective problem is feasible. If we can find this $v$, the progress we make at each iteration is proportional to our current gap from the targets, resulting in the desired $(1 - 1/e)$-approximation after sufficiently many iterations. Note that the LHS of Problem 1 is linear in the decision variable $v$, while the RHS is constant with respect to $v$. This implies that we could (in principle) find a feasible $v$ via linear programming. Naively, however, this approach would entail $O(n^3)$ runtime per iteration. Our first step towards an efficient solution is to convert Problem 1 into a single maxmin problem. Specifically, we can solve the problem
$$\max_{\|v\|_1 \leq k} \ \min_{i \in I} \ \frac{\nabla F_i(x_{t-1}) \cdot v}{W_i - F_i(x_{t-1})} \tag{2}$$
and it is easy to see that if a solution $v$ has objective value at least 1 for the maxmin Problem 2, then it is also feasible for Problem 1. We now make a final reformulation to obtain a problem amenable to optimization. Specifically, let $\Delta(I)$ denote the set of all distributions over $I$. We will consider the saddle-point problem
$$\max_{\|v\|_1 \leq k} \ \min_{y \in \Delta(I)} \ \sum_{i \in I} y_i \frac{\nabla F_i(x_{t-1}) \cdot v}{W_i - F_i(x_{t-1})} \tag{3}$$
where the min now ranges over all distributions in $\Delta(I)$ instead of single elements. It is easy to see that the solutions of the two problems are equivalent (since the minimizing distribution will always put probability one on a single element). However, replacing the discrete min with one over a continuous set allows us to draw on continuous optimization techniques to obtain an efficient solution, as explained in the next section.

3.5 Stochastic Saddle-Point Method

We employ a method based on stochastic saddle-point mirror descent (S-SP-MD), introduced by Nemirovski et al. [2009]. Essentially, this algorithm views Problem 3 as a game between a max player and a min player. Both players update their decision variables ($v$ and $y$, respectively) using gradient updates based on the objective in Problem 3. Intuitively, the min player will put large weights where the max player is doing badly, forcing the max player to improve $v$. The algorithm uses two key ideas to make this process efficient. The first is that, instead of using standard gradient descent, mirror descent modifies the updates to better exploit the structure of the feasible set. For our case, this results in the exponentiated gradient updates given in lines 11-12 of Algorithm 2. Essentially, each player multiplies their current solution entrywise by $e^{\eta \nabla}$ (with a negated gradient for the minimizing player), where $\nabla$ is that player's current gradient and $\eta$ is a learning rate. Then, they rescale to maintain feasibility. However, this process assumes that the gradients are easily available, an assumption that does not hold in common submodular problems. For instance, for influence maximization the gradients depend on the random influence process and cannot be calculated exactly. Hence, the second key idea uses stochastic methods to efficiently estimate the gradients (e.g., using simulations of influence spread). Formally, instead of assuming that $\nabla F_i$ can be computed exactly, we will instead make the much weaker assumption that we can obtain an unbiased estimate of it, i.e., a random vector $\hat{\nabla}$ satisfying $\mathbb{E}[\hat{\nabla}] = \nabla F_i$. Efficient estimates of this form are known for many submodular problems (e.g., coverage functions or facility location [Karimi et al., 2017]), and we show below how to create one for influence maximization.
However, even using stochastic gradients may entail unnecessarily high runtime, since we would still have to compute the estimated gradients for every objective $i$ with respect to every item (node) $j$. Accordingly, our proposed updates use more restricted oracles that return stochastic estimates of only a subset of the full gradients. This is a key element of obtaining near-linear runtime. Specifically, we assume access to two gradient oracles. First, a stochastic gradient oracle $A^i_{\text{grad}}$ for each multilinear extension $F_i$: given a point $x$, $A^i_{\text{grad}}(x)$ satisfies $\mathbb{E}[A^i_{\text{grad}}(x)] = \nabla_x F_i(x)$. Second, a stochastic gradient oracle $A^j_{\text{item}}$ corresponding to each item $j \in [n]$ (in influence maximization, the items are the potential seed nodes): $A^j_{\text{item}}(x)$ satisfies $\mathbb{E}[A^j_{\text{item}}(x)] = \left(\frac{\partial F_1(x)}{\partial x_j}, \dots, \frac{\partial F_m(x)}{\partial x_j}\right)$. We assume that $\|A^i_{\text{grad}}(x)\|, \|A^j_{\text{item}}(x)\| \leq c$ for some constant $c$. Our algorithm calls $A^i_{\text{grad}}$ and $A^j_{\text{item}}$ for only a single $i$ and $j$ each iteration (instead of enumerating over all $i, j$ as would naively be required). The results are then scaled so that they remain unbiased estimates of the true gradients. The process is formally shown in lines 9-10 of Algorithm 2. Line 9 computes gradients with respect to $v$ for the maximizing player, a process which works as follows. Differentiating the objective of Problem 3 with respect to $v$, we obtain
$$\sum_{i \in I} y_i \frac{\nabla F_i(x_{t-1})}{W_i - F_i(x_{t-1})} = \mathbb{E}_{i \sim y}\left[\frac{\nabla F_i(x_{t-1})}{W_i - F_i(x_{t-1})}\right], \tag{4}$$
where $i \sim y$ denotes drawing $i$ at random according to the probability distribution $y$. From this expression, we see that an unbiased estimate of Equation 4 can be obtained by first sampling a single $i \sim y$, and then calling $A^i_{\text{grad}}$ to obtain an unbiased estimate of $\nabla F_i(x_{t-1})$. We then return $\frac{1}{W_i - F_i(x_{t-1})} A^i_{\text{grad}}(x_{t-1})$, which has expectation equal to Equation 4. The reasoning behind the gradients for the min player, calculated in line 10, is analogous: we sample a single $j$ and return an appropriately scaled call to $A^j_{\text{item}}$.
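To illustrate how these single-sample estimates plug into the exponentiated-gradient updates, here is a minimal sketch of one S-SP-MD inner iteration (cf. lines 9-12 of Algorithm 2 as reconstructed above). It is written against assumed callables `A_grad(i, x)` and `A_item(j, x)` standing in for the two oracles; the vector layout, helper names, and other details are our own simplifications, not the released implementation.

```python
import numpy as np

def sspmd_step(v, y, x, I, W, F_vals, A_grad, A_item, k, eta, rng):
    """One stochastic saddle-point mirror-descent update.

    v: current max-player point, shape (n,), with ||v||_1 = k
    y: current min-player distribution over the active objectives I
    W, F_vals: target values and current F_i(x) estimates for objectives in I
    A_grad(i, x): unbiased estimate of grad F_i(x), shape (n,)
    A_item(j, x): unbiased estimate of (dF_i/dx_j for i in I), shape (len(I),)
    """
    gaps = W - F_vals                          # W_i - F_i(x); all >= eps by construction of I
    # Max player: sample i ~ y and scale the gradient oracle by 1 / (W_i - F_i(x)).
    i = rng.choice(len(I), p=y)
    v_hat = A_grad(I[i], x) / gaps[i]
    # Min player: sample j ~ v / k and scale the per-item oracle coordinatewise.
    j = rng.choice(len(v), p=v / k)
    y_hat = k * A_item(j, x) / gaps
    # Exponentiated-gradient updates, then rescale back onto the feasible sets.
    y_new = y * np.exp(-eta * y_hat)
    y_new /= y_new.sum()                       # probability simplex over I
    v_new = np.minimum(v * np.exp(eta * v_hat), 1.0)
    v_new = k * v_new / v_new.sum()            # ||v||_1 = k, entries capped before rescaling
    return v_new, y_new
```

Repeating this step $T'$ times from an arbitrary feasible initialization yields the direction used in line 3 of Algorithm 2.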
3.6 Approximation Guarantee

With these techniques in hand, our theoretical analysis shows that S-SP-MD ensures rapid convergence to an $\epsilon$-optimal solution for Problem 2. This convergence property for the inner subroutine then, in turn, allows us to show that the overall strategy employed in Algorithm 1 attains the desired approximation guarantee. Formally, our main theoretical result is given by the following theorem. Here, $b = \max_{i,j} f_i(\{j\})$ is the maximum value of a single item.

Theorem 3.2. Given a feasible set of target values $W_1, \dots, W_m$, Algorithm 1 outputs a set $S$ such that
$$f_i(S) \geq (1-\epsilon)\left(1 - \frac{m}{k(1+\epsilon')\epsilon^3}\right)\left(1 - \frac{1}{e}\right)W_i - \epsilon$$
with probability at least $1 - \delta$. Asymptotically as $k \to \infty$, the approximation ratio can be set to approach $1 - 1/e$ so long as $m = o(k/\log^3 k)$. The algorithm requires $O(nm)$ $\epsilon'$-accurate value oracle calls, $O\!\left(\frac{mbk^2}{\delta}\right)$ $\epsilon$-accurate value oracle calls, $O\!\left(\frac{bk^4c^2}{\epsilon^5}\log n + \frac{bk}{\delta\epsilon}\right)$ calls to $A_{\text{grad}}$ and $A_{\text{item}}$, and additional work scaling as $1/\epsilon^2$.

This says that Algorithm 1 asymptotically converges to a $(1 - \frac{1}{e})$-approximation when the budget $k$ is larger than the number of objectives $m$ (i.e., the conditions under which the problem is approximable). All terms in the approximation ratio are identical to Udwani [2018], except that we improve their factor of $(1 - \frac{1}{e})^2$ to $1 - \frac{1}{e}$. The runtime is also identical apart from the time to solve the continuous problem (MULTIFW vs. their corresponding subroutine). This is difficult to compare directly since our respective algorithms use different oracles to access the functions. However, both kinds of oracles can typically be (approximately) implemented in time $O(n)$. Udwani's algorithm uses $O(n)$ oracle calls, while ours requires $O(bk^4c^2 \log n)$. For large-scale problems, $n$ typically grows much faster than $k$, $b$, and $c$ (all of which are often constants, or nearly so). Hence, trading $O(n^2)$ runtime for $O(n \log n)$ can represent a substantial improvement. We present a more detailed discussion in the appendix.

3.7 Instantiation for Influence Maximization

To instantiate Algorithm 1 for influence maximization, we just need to supply appropriate stochastic gradient oracles. To our knowledge, no such oracles were previously known for influence maximization, which is substantially more complicated than other submodular problems because of additional randomness in the objective; naive extensions of previous methods require $O(n^2)$ time. We provide efficient $O(kn \log n)$ time stochastic gradient oracles by introducing a randomized method to simultaneously estimate many entries of the gradient at once. Details may be found in the appendix. The main idea is to use simulations of the influence process to estimate the marginal contribution that seeding each node would make towards the objective. Even to produce a noisy estimate, a naive method would require two simulations per node: one where the node is chosen as a seed and one where it is not. Since each simulation takes $O(n)$ runtime, this requires $O(n^2)$ time overall. Our proposed method uses only $O(k \log n)$ simulations, but shares information across them in order to simultaneously estimate the marginal contribution made by all $n$ nodes.

4 Price of Fairness

In this section, we show that both definitions of the Price of Fairness can be unbounded; moreover, allowing nodes to join multiple groups can, counter-intuitively, worsen the PoF. The proofs in this section use undirected examples demonstrating the worst case; the results naturally serve as examples in a directed setting.

Theorem 4.1. As $n \to \infty$ and $p \to 0$, there exists a family of graphs such that $\text{PoF}_{\text{Rational}} \to \infty$.

Proof. We construct a graph $G$ with two parts. In Part $L$, we have $s - 1$ vertices, all isolated except for two vertices joined by an edge; label one of these $x_3$. In Part $S$, we have a star with $s + 1$ nodes. Label a leaf node $x_1$ and the central node $x_2$. We define two groups: $C_1$ comprises the $s$ degree-1 vertices of $S$, and $C_2$ the remaining $s$ vertices, which include the vertices of $L$ and the central vertex $x_2$ of the star. There are $k = 2$ seeds, and since $|C_1| = |C_2|$, each group has a fair allocation of $k_1 = k_2 = 1$ seeds. Since the subgraph induced by $C_1$ comprises isolated vertices, its rational allocation is $I_{G[C_1]}(1) = 1$. The subgraph induced by $C_2$ is a collection of isolated vertices and a $K_2$, so its rational allocation is $I_{G[C_2]}(1) = 1 + p$. We are interested in two seeding configurations: $A = \{x_1, x_3\}$ and $B = \{x_2, x_3\}$. We can verify that configuration $A$ is fair. $A$ activates $1 + p$ nodes in Part $L$, and $1 + p + (s-1)p^2$ in Part $S$, for a total of $I_G(A) = 2 + 2p + (s-1)p^2$. Now consider configuration $B$. $C_1$ receives $ps$ influence, and since $p < \frac{2}{s}$, $C_1$ does not receive its group rational share of influence. However, we can verify that this seeding is optimal: Part $L$ receives $1 + p$ influence, and Part $S$ receives $1 + ps$. Therefore, $I_G(B) = 2 + p + ps$. We may then calculate our Price of Fairness:
$$\text{PoF}_{\text{Rational}} = \frac{I^{\text{OPT}}_G}{I^{\text{Rational}}_G} = \frac{2 + p + ps}{2 + 2p + (s-1)p^2}.$$
If we take the limit as $n \to \infty$, $s \to \infty$, the ratio is dominated by $\frac{ps}{sp^2}$, so $\text{PoF} \to 1/p$. Finally, as $p \to 0$, $\text{PoF} \to \infty$.

Figure 1: Left: $G$ with disjoint groups. Right: $G$ with overlapping groups.
The appendix details a similar result for Maximin Fairness:

Theorem 4.2. As $n \to \infty$ and $p \to 0$, there exists a family of graphs such that $\text{PoF}_{\text{Maximin}} \to \infty$.

Frequently, an individual may identify with multiple groups. Intuitively, we might expect such multi-group membership to improve the influence received by different groups and make group-fairness easier to achieve (see the appendix for an example). However, in the following we show that this is not always true: giving even a single node membership in a second group can cause the Price of Fairness to worsen by an arbitrarily large amount.

Theorem 4.3. Let $G$ be a graph with groups $C_1$ and $C_2$, and $G'$ a graph with groups $C_1'$ and $C_2'$, where $G' = G$, $C_1' = C_1$ and $C_2'$ is obtained from $C_2$ by the addition of one vertex $x_1$ ($x_1 \in C_1$, $x_1 \notin C_2$). There exists a family of such graphs such that $\text{PoF}^{\text{Rational}}_{G'} / \text{PoF}^{\text{Rational}}_{G} \to \infty$.

Proof. Consider a graph $G$ with two components: one component $K$ contains two vertices joined by an edge; the other component $S$ is a star with $s + 1$ vertices ($s \geq 1/p$). There are two groups: $C_1$ contains all degree-1 vertices from $S$ and one vertex from $K$; $C_2$ contains the other vertex $x_1$ from $K$ and the central vertex $x_2$ from $S$. There is one seed ($k = 1$), and the fair allocation of seeds to each group is $k_1 = k_2 = 1$. Since the induced subgraphs for both groups comprise only isolated nodes, the group rational influence for each group is $I_{G[C_1]}(1) = I_{G[C_2]}(1) = 1$. Therefore, the seed set $\{x_2\}$ is both fair and optimal, giving an expected influence of $I_G(\{x_2\}) = 1 + ps$. Now, let us modify $G$ by letting $x_1$ belong to both communities to obtain $G'$, with communities $C_1'$ and $C_2'$. The group rational influence for $C_2'$ remains the same (its members have not changed), but $I_{G'[C_1']}$ has increased to $1 + p$ (by seeding $x_1$). In fact, this forces the fair allocation to seed $x_1$ instead of $x_2$, for a fair influence of $I_{G'}(\{x_1\}) = 1 + p$. As $n \to \infty$,
$$\lim_{n \to \infty} \frac{\text{PoF}^{\text{Rational}}_{G'}}{\text{PoF}^{\text{Rational}}_{G}} = \lim_{s \to \infty} \frac{1 + ps}{1 + p} = \infty.$$

A slightly weaker result can be obtained for Maximin Fairness, where the construction of the graphs depends on $p$; the proof is provided in the appendix.

Theorem 4.4. Let $G$ be a graph with groups $C_1$ and $C_2$, and $G'$ a graph with groups $C_1'$ and $C_2'$, where $G' = G$, $C_1' = C_1$ and $C_2'$ is obtained from $C_2$ by the addition of one vertex $x_1$ ($x_1 \in C_1$, $x_1 \notin C_2$). Given propagation probability $p$, we may construct a family of such graphs such that $\text{PoF}^{\text{Maximin}}_{G'} / \text{PoF}^{\text{Maximin}}_{G} \to \infty$.

| Characteristic | Net. 1 | Net. 2 | Net. 3 | Net. 4 |
| --- | --- | --- | --- | --- |
| Density | 0.012 | 0.032 | 0.022 | 0.034 |
| Modularity | 0.803 | 0.713 | 0.604 | 0.537 |
| Median group size | 13.0 | 9.5 | 16.0 | 9.5 |

Table 1: Network characteristics.

5 Experimental Results

We now investigate the empirical impact of considering fairness in influence maximization. We start with experiments on a set of four real-world social networks which have previously been used for a socially critical application: HIV prevention for homeless youth. Each network has 60-70 nodes and represents the real-world social connections between a set of homeless youth surveyed in a major US city. Each node in the network is associated with demographic information: birth sex, gender identity, race, and sexual orientation. The networks can be made available upon request; all code is available at https://github.com/bwilder0/fair_influmax_code_release. Table 1 gives some aggregate statistics for each network. Each demographic attribute gives a partition of the network into anywhere from 2 to 6 different groups.
For each partition, we compare three algorithms: the standard greedy algorithm for influence maximization, which maximizes the total expected influence (Greedy); Algorithm 1 used to enforce diversity constraints (DC); and Algorithm 1 used to find a maximin fair solution (Maximin). We set the propagation probability to $p = 0.1$ and fixed $k = 15$ seeds (varying these parameters had little impact). We average over 30 runs of the algorithms on each network (since all of the algorithms use random simulations of influence propagation), with error bars giving bootstrapped 95% confidence intervals.

Figure 2 (top) shows that the choice of solution concept has a substantial impact on the results. For the diversity constraints case, we summarize the performance of each algorithm by the mean percentage violation of the constraints over all groups. For the maximin case, we directly report the minimum fraction influenced over all groups. We see that greedy generates substantial unfairness according to either metric: it generates the highest violations of diversity constraints, and has the smallest minimum fraction influenced. Greedy actually obtains near-zero maximin value with respect to sexual orientation. This results from it assigning one seed to a minority group in a single run and zero in the others. DC performs well across the board: it reduces constraint violations by approximately 55-65% while also performing competitively with respect to the maximin metric (even without explicitly optimizing for it). As expected, the Maximin algorithm generally obtains the best maximin value. DC actually attains slightly better maximin value for one attribute (birthsex); however, the difference is within the confidence intervals and reflects slight fluctuations in the approximation quality of the algorithms. However, Maximin performs surprisingly poorly with respect to diversity constraint violations. This indicates that optimizing exclusively for equal influence spread may force the algorithm to focus on poorly connected groups which exhibit severe diminishing returns. DC is able to attain almost as much influence in such groups, but is then permitted to focus its remaining budget for higher impact.

[Figure 2: Average performance on homeless youth social networks (top) and simulated Antelope Valley networks (bottom). Panels show the mean DC violation, the minimum fraction influenced, the price of group fairness, and results for overlapping groups, for the Greedy, DC, and Maximin algorithms across the demographic attributes.]

Interestingly, the price of fairness is relatively small for both solution concepts, in the range 1.05-1.15 (though it is higher for Maximin than for DC). This indicates that while standard influence maximization techniques can introduce substantial fairness violations, mitigating such violations may be substantially less costly in real-world networks than the theoretical worst case would suggest. Finally, the rightmost plot in the top row of Figure 2 explores an example with overlapping groups. Specifically, we consider the race and birthsex attributes together, so that each node belongs to two groups.
Constraint violations are somewhat higher than for either attribute individually, but the price of fairness remains small (1.07 for DC and 1.13 for Maximin).

In Figure 2 (bottom), we examine 20 synthetic networks used by Wilder et al. [2018c] to model an obesity prevention intervention in the Antelope Valley region of California. Each node in the network has a geographic region, ethnicity, age, and gender, and nodes are more likely to connect to those with similar attributes. Each network has 500 nodes and we set $k = 25$. Overall, the results are similar to the homeless youth networks. One exception is the high price of fairness that Maximin suffers with respect to the region attribute (over 1.4), but the other PoF values are relatively low (below 1.2). We also observe that greedy obtains the (slightly) best maximin performance for gender, likely because the network is sufficiently well-mixed across genders that fairness is not a significant concern (as confirmed by the extremely low DC violations). Absent true fairness concerns, greedy may perform slightly better since it solves a simpler optimization problem. However, in the last panel, we examine overlapping groups given by region and ethnicity and observe that greedy actually obtains zero maximin value, indicating that there is one group that it never reached across any run.

6 Conclusions

In this paper, we examine the problem of selecting key figures in a population to ensure the fair spread of vital information across all groups. This problem modifies the classic influence maximization problem with additional fairness provisions based on legal and game-theoretic concepts. We examine two methods for determining these provisions, and show that the Price of Fairness for these provisions can be unbounded. We propose an improved algorithm for multiobjective maximization to examine this problem on real-world data sets. We show that standard influence maximization techniques often neglect smaller groups, and that a diversity-constraint-based algorithm can ensure these groups receive a fair allocation of resources at relatively little cost. As automated techniques become increasingly prevalent in society and governance, our technique will help ensure that small and marginalized groups are fairly treated.

Acknowledgments

This work was supported by ARO MURI W911NF1810208 and the California HIV Research Program. Wilder is supported by an NSF Graduate Research Fellowship. Tsang and Zick are supported by a Singapore MOE grant #R-252-000625-133 and the Singapore NRF Fellowship #R-252-000750-733.

References

[Ahmed et al., 2017] F. Ahmed, J. P. Dickerson, and M. Fuge. Diverse weighted bipartite b-matching. In Proc. of the 26th IJCAI, pages 35–41, 2017.
[Azizi et al., 2018] M. Azizi, P. Vayanos, B. Wilder, E. Rice, and M. Tambe. Designing fair, efficient, and interpretable policies for prioritizing homeless youth for housing resources. In CPAIOR, 2018.
[Banerjee et al., 2013] A. Banerjee, A. Chandrasekhar, E. Duflo, and M. O. Jackson. The diffusion of microfinance. Science, 341(6144), 2013.
[Barman et al., 2019] S. Barman, A. Biswas, S. K. Krishnamurthy, and Y. Narahari. Groupwise maximin fair allocation of indivisible goods. In Proc. of the 32nd AAAI, 2019.
[Barocas and Selbst, 2016] S. Barocas and A. Selbst. Big data's disparate impact. California Law Review, 104:671, 2016.
[Benabbou et al., 2018] N. Benabbou, M. Chakraborty, V. Ho, J. Sliwinski, and Y. Zick. Diversity constraints in public housing allocation. In Proc. of the 17th AAMAS, pages 973–981, 2018.
[Bertsimas et al., 2013] D. Bertsimas, V. Farias, and N. Trichakis. Fairness, efficiency, and flexibility in organ allocation for kidney transplantation. Operations Research, 61(1):73–87, 2013.
[Bredereck et al., 2018] R. Bredereck, P. Faliszewski, A. Igarashi, M. Lackner, and P. Skowron. Multiwinner elections with diversity constraints. In Proc. of the 32nd AAAI, pages 933–940, 2018.
[Chekuri et al., 2010] C. Chekuri, J. Vondrák, and R. Zenklusen. Dependent randomized rounding via exchange properties of combinatorial structures. In Proc. of the 51st FOCS, pages 575–584, 2010.
[Chen et al., 2019] H. Chen, G. Loukides, J. Fan, and H. Chan. Limiting the influence to vulnerable users in social networks: A ratio perspective. In International Conference on Advanced Information Networking and Applications, 2019.
[Conitzer et al., 2019] V. Conitzer, R. Freeman, N. Shah, and J. Wortman Vaughan. Group fairness for the allocation of indivisible goods. In Proc. of the 32nd AAAI, 2019.
[Fain et al., 2018] B. Fain, K. Munagala, and N. Shah. Fair allocation of indivisible public goods. In Proc. of the 19th EC, pages 575–592, 2018.
[Faliszewski et al., 2017] P. Faliszewski, P. Skowron, A. Slinko, and N. Talmon. Multiwinner voting: A new challenge for social choice theory. In Ulle Endriss, editor, Trends in Computational Social Choice, chapter 2. AI Access, 2017.
[Hamada et al., 2017] N. Hamada, C. Hsu, R. Kurata, T. Suzuki, S. Ueda, and M. Yokoo. Strategy-proof school choice mechanisms with minimum quotas and initial endowments. Artificial Intelligence, 249:47–71, 2017.
[Karimi et al., 2017] M. Karimi, M. Lucic, H. Hassani, and A. Krause. Stochastic submodular maximization: The case of coverage functions. In Proc. of the 31st NIPS, pages 6853–6863, 2017.
[Kempe et al., 2003] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In Proc. of the 9th KDD, pages 137–146, 2003.
[Krause et al., 2008] A. Krause, B. McMahan, C. Guestrin, and A. Gupta. Robust submodular observation selection. Journal of Machine Learning Research, 9(Dec):2761–2801, 2008.
[Li et al., 2018] Y. Li, J. Fan, Y. Wang, and K. L. Tan. Influence maximization on social graphs: A survey. IEEE Transactions on Knowledge and Data Engineering, 30(10):1852–1872, 2018.
[Mirrokni et al., 2017] V. Mirrokni, R. Paes Leme, A. Vladu, and S. C. Wong. Tight bounds for approximate Carathéodory and beyond. In Proc. of the 34th ICML, pages 2440–2448, 2017.
[Nemirovski et al., 2009] A. Nemirovski, A. Juditsky, G. Lan, and A. Shapiro. Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4):1574–1609, 2009.
[Pasumarthi et al., 2015] R. Pasumarthi, R. Narayanam, and B. Ravindran. Near optimal strategies for targeted marketing in social networks. In AAMAS, 2015.
[Schumann et al., 2017] C. Schumann, S. N. Counts, J. S. Foster, and J. P. Dickerson. The diverse cohort selection problem: Multi-armed bandits with varied pulls. CoRR, abs/1709.03441, 2017.
[Segal-Halevi and Suksompong, 2018] E. Segal-Halevi and W. Suksompong. Democratic fair allocation of indivisible goods. In Proc. of the 27th IJCAI, pages 482–488, 2018.
[Suksompong, 2018] W. Suksompong. Approximate maximin shares for groups of agents. Mathematical Social Sciences, 92:40–47, 2018.
[Todo et al., 2011] T. Todo, R. Li, X. Hu, T. Mouri, A. Iwasaki, and M. Yokoo. Generalizing envy-freeness toward groups of agents. In Proc. of the 22nd IJCAI, pages 386–392, 2011.
[Udwani, 2018] R. Udwani. Multi-objective maximization of monotone submodular functions with cardinality constraint. In Proc. of the 32nd NIPS, pages 9513–9524, 2018.
[Valente and Pumpuang, 2007] T. Valente and P. Pumpuang. Identifying opinion leaders to promote behavior change. Health Education & Behavior, 34(6):881–896, 2007.
[Wen et al., 2018] Y. Wen, W. Peng, and H. Shuai. Maximizing social influence on target users. In PAKDD, 2018.
[Wilder et al., 2018a] B. Wilder, N. Immorlica, E. Rice, and M. Tambe. Maximizing influence in an unknown social network. In Proc. of the 32nd AAAI, pages 4743–4750, 2018.
[Wilder et al., 2018b] B. Wilder, L. Onasch-Vera, J. Hudson, J. Luna, N. Wilson, R. Petering, D. Woo, M. Tambe, and E. Rice. End-to-end influence maximization in the field. In Proc. of the 17th AAMAS, pages 1414–1422, 2018.
[Wilder et al., 2018c] B. Wilder, H. Ou, K. de la Haye, and M. Tambe. Optimizing network structure for preventative health. In Proc. of the 17th AAMAS, pages 841–849, 2018.
[Yadav et al., 2016] A. Yadav, H. Chan, A. Xin Jiang, H. Xu, E. Rice, and M. Tambe. Using social networks to aid homeless shelters: Dynamic influence maximization under uncertainty. In Proc. of the 15th AAMAS, pages 740–748, 2016.
[Yadav et al., 2018] A. Yadav, B. Wilder, E. Rice, R. Petering, J. Craddock, A. Yoshioka-Maxwell, M. Hemler, L. Onasch-Vera, M. Tambe, and D. Woo. Bridging the gap between theory and practice in influence maximization: Raising awareness about HIV among homeless youth. In Proc. of the 27th IJCAI, pages 5399–5403, 2018.