# Learning Hedonic Games

Jakub Sliwinski and Yair Zick
National University of Singapore
dcsjaku,dcsyaz@nus.edu.sg

Abstract

Coalitional stability in hedonic games has usually been considered in the setting where agent preferences are fully known. We consider the setting where agent preferences are unknown; we lay the theoretical foundations for studying the interplay between coalitional stability and (PAC) learning in hedonic games. We introduce the notion of PAC stability, the analogue of core stability under uncertainty, and examine the PAC stabilizability and learnability of several popular classes of hedonic games.

1 Introduction

Hedonic games [Drèze and Greenberg, 1980] are often used to model agent interactions in collaborative environments. They describe a simple coalition formation process: agents express their preferences over coalitions; next, a central designer is interested in finding a coalition structure (i.e., a partition of the agents) that satisfies certain desiderata. Recent years have seen considerable efforts by the computational social choice community to find novel algorithms computing hedonic solution concepts, i.e., functions mapping hedonic games to (sets of) coalition structures guaranteed to satisfy certain properties.

The core is a central solution concept in hedonic games. A coalition structure is core stable (or simply stable) if no subset of players can deviate, i.e., have all its members strictly better off should it form. Coalitional stability is a highly desirable property; there is a rich literature studying the core of hedonic games, establishing the existence (or lack thereof) of core outcomes for various classes of hedonic games. It is usually assumed that the agents' preferences are fully known in order to identify a stable coalition structure; this is a major obstacle to applying this research, as in many real-world scenarios full knowledge of the agents' preferences cannot be assumed. There has been little work identifying hedonic solution concepts under uncertainty. This is where our work comes in; we study the following problem: can we find a coalition structure that is likely to be resistant to deviation, given limited knowledge about the underlying hedonic game?

We lay the theoretical foundations for studying the interplay between two fundamental concepts: coalitional stability, and PAC learning in hedonic games. PAC learning is a canonical model that studies how good probabilistic function approximations can be derived from a (polynomial) number of samples (see Section 2.3 for details).

Hedonic Games           Learnable   Stabilizable
Additively Separable    Yes         No
Fractional              Yes         No
W-games                 Yes         No
B-games                 Yes         Yes
Top Responsive          No          Yes

Table 1: A summary of this paper's learnability/stabilizability results. For stabilizability, informativity is assumed, as explained in Section 4.4.

1.1 Our Contributions

We contrast two objectives: learning the underlying hedonic game, i.e., providing an estimate of agent utilities over coalitions, and learning a stable outcome, i.e., providing a coalition structure that is likely to be resistant to coalitional deviations. We do this for a variety of classes of hedonic games (see Table 1).
Our results show that hedonic games exhibit surprisingly diverse behavior: learnability of the underlying hedonic game does not necessarily guarantee one's ability to learn a stable outcome; conversely, there exist classes of hedonic games for which a PAC stable outcome can be efficiently computed, though the underlying hedonic game cannot be approximated in polynomial time. This stands in stark contrast to recent results on coalitional games with transferable utility (TU cooperative games) [Balcan et al., 2015]: in the TU domain, any coalitional game can be PAC stabilized, irrespective of the underlying function class.

1.2 Related Work

Hedonic games have been widely studied in the AI/MAS community; most works establish existential and computational properties of solution concepts for various classes of hedonic games [Peters and Elkind, 2015; Aziz and Brandl, 2012; Brânzei and Larson, 2011; Deineko and Woeginger, 2013] (see Aziz and Savani [2016] or Woeginger [2013] for an overview). Uncertainty in TU cooperative games has been studied extensively; some works take a Bayesian approach, assuming a known prior over agent abilities and finding solution concepts under this regime (see [Chalkiadakis and Boutilier, 2004; Chalkiadakis et al., 2007; Myerson, 2007]). However, the work that most closely mirrors our approach is by Balcan et al. [2015], who use a PAC learning approach to learn both stable payoff divisions and the underlying game. The PAC approach has subsequently been applied to other game-theoretic domains, such as learning combinatorial functions from pairwise comparisons [Balcan et al., 2016b], buyer valuations in combinatorial domains [Balcan et al., 2012; 2016a], and adversary models in security games [Sinha et al., 2016].

2 Preliminaries

Let $N = \{1, \ldots, n\}$ be a set of agents or players. A coalition is a nonempty subset of $N$; let $\mathcal{N}_i = \{S \subseteq N : i \in S\}$ be the set of all coalitions containing player $i$. A coalition structure is a partition $\pi$ of $N$; we refer to the coalition containing $i$ in $\pi$ as $\pi(i)$. A hedonic game is a pair $\langle N, \succeq \rangle$, where $\succeq\, = (\succeq_1, \ldots, \succeq_n)$ is a preference list, reflecting each player $i$'s preferences over the coalitions she would like to join. In other words, $\succeq_i$ is a reflexive, complete, and transitive binary relation over $\mathcal{N}_i$; given $S, T \in \mathcal{N}_i$, we say that $S \succ_i T$ if $S \succeq_i T$ and $T \not\succeq_i S$.

2.1 The Core of a Hedonic Game

We say that a coalition $S$ core blocks a coalition structure $\pi$ if every agent $i \in S$ strictly prefers $S$ to $\pi(i)$, i.e., $S \succ_i \pi(i)$. A coalition structure $\pi$ is said to be core stable if there is no coalition that core blocks $\pi$; that is, for every $S \subseteq N$, there exists at least one player $i \in S$ such that $\pi(i) \succeq_i S$. The set of core stable coalition structures is called the core of $\langle N, \succeq \rangle$, denoted $Core(N, \succeq)$.
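The definition can be read operationally. The following minimal Python sketch checks core blocking and core stability by exhaustive enumeration; the preference oracle `prefers` and all other names are our own illustrative assumptions, not part of the paper's formal model.

```python
from itertools import chain, combinations

def core_blocks(S, partition, prefers):
    """True iff coalition S core blocks the partition: every i in S
    strictly prefers S to the coalition pi(i) it currently belongs to.
    prefers(i, A, B) is an assumed strict-preference oracle."""
    current = {i: next(C for C in partition if i in C) for i in S}
    return all(prefers(i, S, current[i]) for i in S)

def is_core_stable(partition, agents, prefers):
    """Brute-force core-stability check: enumerates all 2^n - 1
    coalitions, so it only illustrates the definition on small games."""
    coalitions = chain.from_iterable(
        combinations(agents, k) for k in range(1, len(agents) + 1))
    return not any(core_blocks(frozenset(S), partition, prefers)
                   for S in coalitions)
```

The enumeration is exponential in $n$; it mirrors the definition rather than any efficient procedure.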
2.2 Representing Hedonic Games

Hedonic games are represented by (complete) preference orders of players over subsets. In this work, we make the explicit assumption that player utilities are given numerically, rather than by ranked preferences, thereby exploiting the full strength of the classic PAC learning model; hence we use $v_i(S) \geq v_i(T)$ and $S \succeq_i T$ interchangeably. A class of hedonic games can be represented by many possible utility functions; as we later show, the choice of representation functions plays an important role in the learnability of hedonic games. One can "cheat" by encoding useful information in the representation function itself. For example, one can encode preferences by integer values, and other valuable information (e.g., a stable partition, or the actual game) in the insignificant fractional terms. In what follows, we focus on functions that naturally encode agent preferences, without providing additional information about the underlying game.

2.3 PAC Learning

We briefly describe the PAC learning model; for a detailed exposition, see [Kearns and Vazirani, 1994; Shashua, 2009]. The purpose of the Probably Approximately Correct (PAC) learning model is finding good function approximations. We are given some unknown function $v$ belonging to some hypothesis class $\mathcal{H}$; $\mathcal{H}$ is a family of functions from subsets of players to $\mathbb{R}$. We let $\mathcal{H}_n$ be the restriction of $\mathcal{H}$ to functions over $n$ players. We refer to the unknown function as the target concept, and to the proposed approximation as a hypothesis. An algorithm $\mathcal{A}$ that PAC learns $\mathcal{H}$ takes as input an error parameter $\varepsilon > 0$, a confidence parameter $\delta > 0$, and $m$ samples $(\langle S_1, v(S_1)\rangle, \ldots, \langle S_m, v(S_m)\rangle)$; $S_1, \ldots, S_m$ are assumed to be sampled i.i.d. from a distribution $\mathcal{D}$ over $2^N$. We are guaranteed that with probability at least $1 - \delta$, $\mathcal{A}$ outputs a hypothesis $v^* \in \mathcal{H}_n$ (here both the target concept and the hypothesis are assumed to be in $\mathcal{H}_n$) such that $\Pr_{S \sim \mathcal{D}}[v^*(S) \neq v(S)] < \varepsilon$. Intuitively, assuming that all observed samples were sampled i.i.d. from $\mathcal{D}$, it is very likely that the algorithm will output a hypothesis $v^*$ accurately predicting the values of other sets sampled from $\mathcal{D}$. The class $\mathcal{H}$ is efficiently PAC learnable if there is some algorithm $\mathcal{A}$ that can PAC learn any $v \in \mathcal{H}$, such that both the running time of $\mathcal{A}$ and $m$, the number of samples it requires as input, are polynomial in $n$, $\frac{1}{\varepsilon}$ and $\log \frac{1}{\delta}$.

Key results in learning theory relate the inherent complexity of the hypothesis class to its PAC learnability. The VC dimension [Kearns and Vazirani, 1994] is a canonical complexity measure for binary functions (i.e., whose outputs are in $\{0, 1\}$). For classes of real-valued functions, the analogous concept of pseudo-dimension [Anthony and Bartlett, 1999, Chapter 11] is defined: given a list of sets $S_1, \ldots, S_m \subseteq N$ and real values $r_1, \ldots, r_m \in \mathbb{R}$, a class of functions $\mathcal{H}$ can pseudo-shatter $(\langle S_1, r_1\rangle, \ldots, \langle S_m, r_m\rangle)$ if for every possible labeling $\ell_1, \ldots, \ell_m \in \{0, 1\}$ there exists some function $f \in \mathcal{H}$ such that $f(S_j) > r_j \iff \ell_j = 1$. The pseudo-dimension of $\mathcal{H}$, denoted $Pdim(\mathcal{H})$, is the maximal value $m$ for which there exists a list of $m$ set-value pairs that can be pseudo-shattered by $\mathcal{H}$. We will make use of the following theorem:

Theorem 2.1 (Anthony and Bartlett [1999]). A hypothesis class $\mathcal{H}$ is $(\varepsilon, \delta)$ PAC learnable using $m$ samples, where $m$ is polynomial in $Pdim(\mathcal{H})$, $\frac{1}{\varepsilon}$ and $\log \frac{1}{\delta}$, by outputting a hypothesis $v^*$ consistent with the sample, i.e., $v^*(S_j) = v(S_j)$ for all $j$. Furthermore, if $Pdim(\mathcal{H})$ is superpolynomial in $n$, then $\mathcal{H}$ is not PAC learnable.

We are interested in finding a stable partition for a hedonic game $G = \langle N, \succeq \rangle$; however, we do not know anything about players' preferences. Barring any other information, the task is virtually impossible. One natural compromise is to offer a partition that is likely to be stable under some assumptions. More formally, we are given a set of $m$ samples, i.e., $m$ pairs $\langle S_j, v(S_j)\rangle$; here, $S_j \subseteq N$ and $v(S_j) = (v_i(S_j))_{i \in S_j}$ are the values assigned to $S_j$ by the members of $S_j$. In what follows, we often refer to the hedonic game $G$ as $\langle N, v \rangle$, identifying agent preference orders with their utility functions.
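As a concrete reading of the guarantee $\Pr_{S \sim \mathcal{D}}[v^*(S) \neq v(S)] < \varepsilon$, a minimal Monte Carlo sketch can estimate a hypothesis's error on fresh samples; `draw_coalition`, `hypothesis`, and `target` are hypothetical stand-ins for $\mathcal{D}$, $v^*$, and $v$, not objects defined in the paper.

```python
def pac_error(hypothesis, target, draw_coalition, trials=10_000):
    """Monte Carlo estimate of Pr_{S~D}[hypothesis(S) != target(S)].

    draw_coalition() samples one coalition (e.g., a frozenset of
    players) from the unknown distribution D; hypothesis and target
    map coalitions to real values. All names are illustrative."""
    mismatches = sum(hypothesis(S) != target(S)
                     for S in (draw_coalition() for _ in range(trials)))
    return mismatches / trials
```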
Within this framework, we are interested in the following question: given a hedonic game $G = \langle N, v \rangle$ belonging to some class of hedonic games $\mathcal{H}$, can we find a partition that is likely to be stable, or output that no such partition exists?

3 PAC Stability in Hedonic Games

We now present a learning-theoretic notion of core stability, based on the PAC framework. We say that an algorithm $\mathcal{A}$ can PAC stabilize a class of hedonic games if, after seeing some examples, it is able to propose a partition that is unlikely to be core blocked by a coalition sampled from $\mathcal{D}$. More formally, given $\varepsilon > 0$, a partition $\pi$ is $\varepsilon$-PAC stable under $\mathcal{D}$ if $\Pr_{S \sim \mathcal{D}}[S \text{ core blocks } \pi] < \varepsilon$. Given an unknown hedonic game $G = \langle N, v \rangle$ belonging to some hypothesis class $\mathcal{H}$, a PAC stabilizing algorithm $\mathcal{A}$ takes as input $m$ samples $(\langle S_1, v(S_1)\rangle, \ldots, \langle S_m, v(S_m)\rangle)$, where $S_1, \ldots, S_m \subseteq N$, and parameters $\varepsilon, \delta > 0$. The algorithm $\mathcal{A}$ PAC stabilizes $\mathcal{H}$ if for any hedonic game $\langle N, v \rangle \in \mathcal{H}$, distribution $\mathcal{D}$ over $2^N$, and parameters $\varepsilon, \delta > 0$, with probability at least $1 - \delta$, $\mathcal{A}$ outputs an $\varepsilon$-PAC stable coalition structure, or reports that the core is empty. We again require that the number of samples $m$ is bounded by a polynomial in $n$, $\frac{1}{\varepsilon}$ and $\log \frac{1}{\delta}$. We similarly say that $\mathcal{H}$ is PAC stabilizable if there is some algorithm $\mathcal{A}$ that PAC stabilizes $\mathcal{H}$.

4 Learnability and Stabilizability of Hedonic Games

Learnability of a class of hedonic games alone does not imply stabilizability. One might think that the following procedure would work: PAC learn a hypothesis $v^*$ for $v$, compute a stable partition $\pi$ for $\langle N, v^* \rangle$, and hope that $\pi$ PAC stabilizes $\langle N, v \rangle$. Unfortunately, this will not necessarily work; intuitively, when learning a game, we only obtain value guarantees with respect to coalitions sampled from $\mathcal{D}$. However, stable partitions for $v^*$ may inevitably contain coalitions never sampled from $\mathcal{D}$, for which we have no information whatsoever. If one can ensure that the values of coalitions in the proposed partition are not overestimated, then core stability of that partition for the learned hypothesis implies $\varepsilon$-PAC stability under $\mathcal{D}$ for the underlying game, as formalized in Proposition 4.1 (proof omitted due to space constraints).

Proposition 4.1. Let $G = \langle N, v \rangle$ be a hedonic game and let $v^*$ be a PAC approximation of $v$. If $\pi$ is stable for $\langle N, v^* \rangle$ and, for all $i \in N$, $v^*_i(\pi(i)) \leq v_i(\pi(i))$, then $\pi$ PAC stabilizes $G$.

We now analyze specific classes of hedonic games for their PAC learnability and stabilizability.

4.1 Additively Separable Hedonic Games

Additively separable hedonic games (ASHGs) are hedonic games where an agent's utility from a coalition is the sum of the utilities she assigns to the other members of that coalition. Formally, in an ASHG, each agent $i \in N$ assigns a value $v_i(j) \in \mathbb{R}$ to any other player $j$; we let $v_i(S) = \sum_{j \in S \setminus \{i\}} v_i(j)$. We make the natural assumption that ASHGs are represented by the utility functions used in their definition. We first observe that ASHGs are PAC learnable.

Proposition 4.2. The class of additively separable hedonic games is PAC learnable.

Proof. The representation functions of additively separable hedonic games are linear functions, which are well known to be PAC learnable.

[Figure 1: An ASHG. Unshown edges have unknown weights.]
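To make Proposition 4.2 concrete, here is a minimal sketch (ours, not the paper's construction) that fits one agent's values $v_i(j)$ by least squares: since $v_i(S)$ is linear in the 0/1 membership vector of $S$, the least-squares fit reproduces every sampled value exactly when the samples are noiseless and linear.

```python
import numpy as np

def learn_ashg_agent(samples, n):
    """Fit agent i's ASHG values v_i(j) from coalition samples.

    samples: list of (S, value) pairs, where S is the set of *other*
    players in i's coalition and value = v_i(S). Because
    v_i(S) = sum_{j in S} v_i(j) is linear in S's membership vector,
    least squares yields a hypothesis consistent with the sample."""
    X = np.array([[1.0 if j in S else 0.0 for j in range(n)]
                  for S, _ in samples])
    y = np.array([value for _, value in samples])
    estimates, *_ = np.linalg.lstsq(X, y, rcond=None)
    return estimates  # estimates[j] approximates v_i(j)
```

Players never observed alongside $i$ receive the minimum-norm default of zero, which is exactly the kind of unconstrained guess that makes learnability insufficient for stabilizability, as the next proposition shows.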
The existence of a stable coalition structure in an ASHG is not guaranteed [Banerjee et al., 2001], and, as it turns out, ASHGs are not PAC stabilizable: given partial information about a game, the game may extend both to a game with a non-empty core and to a game with an empty core in which some coalition from $\mathcal{D}$ blocks any proposed partition.

Proposition 4.3. The class of additively separable hedonic games is not PAC stabilizable.

Proof. Consider an additively separable hedonic game with 5 players, depicted in Figure 1, where the weight of the edge $(i, j)$ is $v_i(j)$, and edges not shown have unknown weights. Consider a distribution $\mathcal{D}$ that assigns probability $1/5$ to each of the five sets $\{i, (i \bmod 5) + 1\}$, for $i = 1, \ldots, 5$, and $0$ elsewhere. It is possible that the game has an empty core, for example if all of the remaining values are $v_i(j) = -4$, in which case for any partition $\pi$ there is a set in the support of $\mathcal{D}$ that blocks it; hence $\Pr_{S \sim \mathcal{D}}[S \text{ core blocks } \pi] \geq \frac{1}{5}$. However, if the unseen $v_i(j)$ are non-negative, the grand coalition is core stable; hence any algorithm that would PAC stabilize the game cannot output that the core is empty. Hence, the class is not PAC stabilizable.

Fractional hedonic games [Aziz et al., 2014] differ from ASHGs in that coalition values are scaled by coalition size, i.e., $v_i(S) = \frac{1}{|S|} \sum_{j \in S \setminus \{i\}} v_i(j)$. Due to space limitations, we only note that they are similar to ASHGs with respect to learnability, and that a similar example establishes non-stabilizability.

Proposition 4.4. The class of fractional hedonic games is PAC learnable, but not PAC stabilizable.

4.2 W-games

In hedonic games with W-preferences (W-hedonic games), each player $i$ has a preference $\succeq_i$ over other players, and a coalition's value is determined by the worst player in that coalition [Cechlárová and Hajduková, 2004; Cechlárová and Romero-Medina, 2001]. Formally, $S \succeq_i T \iff \forall s \in S \ \exists t \in T : s \succeq_i t$.

Proposition 4.5. The class of all representation functions of W-hedonic games is efficiently PAC learnable.

Proof. Consider the learnability of $v_i$, the $i$-th representation function; if each $v_i$ is learnable, then trivially all $n$ of them are. We will prove that the pseudo-dimension of the class of functions $v_i$ is at most $n$, and hence that $v_i$ is learnable. Consider any set $\mathcal{S}$ of coalitions of size $n + 1$. Suppose for contradiction that the hypothesis class can pseudo-shatter $\mathcal{S}$. Denote by $\mathcal{S}'$ the subset of $\mathcal{S}$ obtained by repeatedly removing sets that contain some player exclusively:

1: $\mathcal{S}' := \mathcal{S}$
2: while $\exists T \in \mathcal{S}'$ such that $T \setminus \bigcup_{U \in \mathcal{S}' \setminus \{T\}} U \neq \emptyset$ do
3:   $\mathcal{S}' := \mathcal{S}' \setminus \{T\}$
4: end while
5: return $\mathcal{S}'$

Each execution of line 3 shrinks the set $\bigcup_{U \in \mathcal{S}'} U$ by at least one player; hence line 3 is executed at most $n$ times, and therefore the $\mathcal{S}'$ returned by the procedure is non-empty. Since $\mathcal{S}$ is pseudo-shattered, so is its subset $\mathcal{S}'$. Relabel $\mathcal{S}' = \{S_1, \ldots, S_m\}$ so that the threshold $r_1$ associated with $S_1$ is the smallest. Taking the labeling $\ell_1 = 0$ and $\ell_j = 1$ for $j \geq 2$, there exists a representation function $v_i$ such that $v_i(S_1) \leq r_1$ and $v_i(S_j) > r_j$ for $j = 2, 3, \ldots, m$. Since $v_i$ is a representation function of a W-hedonic game and $v_i(S_1) \leq r_1$, there is a player $x \in S_1$ such that every coalition $T$ with $x \in T$ satisfies $v_i(T) \leq r_1$. Since the condition in line 2 of the procedure cannot hold for the returned $\mathcal{S}'$, there is some $S_j \in \mathcal{S}'$ with $j \neq 1$ and $x \in S_j$; but then $v_i(S_j) > r_j \geq r_1$, a contradiction.

Since all representation functions have pseudo-dimension bounded by $n$, they are learnable, provided we can compute a consistent hypothesis in polynomial time. The following algorithm computes a hypothesis $v_i^*$ consistent with a given sample $(\langle S_1, v(S_1)\rangle, \ldots, \langle S_m, v(S_m)\rangle)$: for every player $k \in N$, the score that $v_i^*$ assigns to $k$ is $\max_{\{S_j : k \in S_j\}} v_i(S_j)$, or $-\infty$ if $\{S_j : k \in S_j\} = \emptyset$; the value $v_i^*(S)$ of a coalition is then the score of its worst member. The algorithm runs in time polynomial in $n$ and $m$; hence the representation functions of W-hedonic games are efficiently PAC learnable.
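A direct Python transcription of this construction follows (variable names are ours); the coalition value is taken as the worst member's score, as the W-representation prescribes.

```python
import math

def learn_w_scores(samples_for_i):
    """Consistent-hypothesis construction for W-games: each player k
    gets the best observed value among sampled coalitions containing
    k, or -inf if k was never observed.

    samples_for_i: list of (S, value) pairs with value = v_i(S)."""
    score = {}
    for S, value in samples_for_i:
        for k in S:
            score[k] = max(score.get(k, -math.inf), value)
    return score

def w_hypothesis(S, score):
    """Hypothesis value of coalition S: the worst member's score.
    Unseen players default to -inf, so the hypothesis never
    overestimates the true v_i (cf. Proposition 4.1)."""
    return min(score.get(k, -math.inf) for k in S)
```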
Interestingly, the algorithm outputs an underestimate of the $v_i$'s; hence, by Proposition 4.1, any partition in the core of the hypothesis is PAC stable. If we only required some PAC hypothesis with an empty core whenever we cannot give a PAC stable partition, this would suffice for stabilizability; however, we require that the core be empty for certain. We prove that W-hedonic games are not PAC stabilizable, even for a particularly informative representation function: every player assigns any other player its Borda score (the number of players ranked lower), with no ties assumed. This representation function conveys a lot of information: $v_i(S)$ tells us the exact rank of the worst agent in the coalition.

Proposition 4.6. W-hedonic games with the Borda score of the worst agent as the representation functions are not PAC stabilizable.

Proof. Consider a W-hedonic game with 5 players, depicted in Figure 2a, where the weight of edge $(i, j)$ is the Borda score of player $j$ in $i$'s preference. Consider a distribution $\mathcal{D}$ that returns one of the sets $\{i, (i \bmod 5) + 1\}$ uniformly at random. Depending on the values of the edges not shown, the game may extend to the game depicted in Figure 2b, or to a similar game obtained by permuting the players according to $p(i) := (i \bmod 5) + 1$.

[Figure 2: (a) A W-game; unshown edges have unknown weights. (b) A possible extension of the game from Figure 2a.]

The respective sets of partitions resistant to coalitions supported by $\mathcal{D}$ are $A = \{\{\{1\}, \{2, 5\}, \{3, 4\}\}, \{\{1, 2, 5\}, \{3, 4\}\}\}$ and $B = \{\{\{2\}, \{3, 1\}, \{4, 5\}\}, \{\{2, 3, 1\}, \{4, 5\}\}\}$. Since $A \cap B = \emptyset$, even after observing all coalitions from $\mathcal{D}$, it is impossible to indicate a partition $\pi$ for which $\Pr_{S \sim \mathcal{D}}[S \text{ core blocks } \pi] < \frac{1}{5}$ with non-zero confidence. Since in both cases the core of the game is non-empty (respectively, $\{\{1\}, \{2, 5\}, \{3, 4\}\}$ and $\{\{2\}, \{3, 1\}, \{4, 5\}\}$ are core stable), any algorithm that would PAC stabilize the class cannot output that the core is empty. Hence, the class is not PAC stabilizable.

4.3 B-games

B-games [Cechlárová and Hajduková, 2003; Cechlárová and Romero-Medina, 2001] are the counterpart of W-games: the value of a coalition depends on the favorite agent in that coalition. However, if two coalitions of different sizes share their most preferred agent, the smaller one is preferred. Formally, $S \succ_i T$ if and only if either: (a) for $s = \max_{\succeq_i} S$ and $t = \max_{\succeq_i} T$ we have $s \succ_i t$; or (b) for $s = \max_{\succeq_i} S$ and $t = \max_{\succeq_i} T$, $s = t$ and $|S| < |T|$. We assume strict agent preferences over individuals.
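As a concrete reading of this definition, a minimal comparison sketch (all names, including the `rank` encoding, are our own illustrative assumptions):

```python
def b_prefers(i, S, T, rank):
    """Strict B-preference S >_i T for two coalitions that contain i
    and at least one other player. rank[i][j] encodes i's strict
    ranking of the other players (higher is better)."""
    best_S = max(S - {i}, key=lambda j: rank[i][j])
    best_T = max(T - {i}, key=lambda j: rank[i][j])
    if best_S != best_T:
        # case (a): compare the most preferred members
        return rank[i][best_S] > rank[i][best_T]
    # case (b): same favorite member, so the smaller coalition wins
    return len(S) < len(T)
```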
Similarly to W-games, B-games are learnable.

Proposition 4.7. B-hedonic games with arbitrary representation functions are efficiently PAC learnable.

Proof Sketch. B-games differ from W-games with respect to learning in that the values of coalitions with the same most preferred agent can differ if the coalitions are of different sizes. However, if we consider only coalitions of the same size, we can show that no set of more than $n$ such coalitions can be pseudo-shattered (in a manner similar to that of W-games). Since $n$ different coalition sizes are possible, the pseudo-dimension of B-games is bounded by $n^2$. To construct a hypothesis $v_i^*$ consistent with $\mathcal{S} = \{S_1, \ldots, S_m\}$, we run the following procedure. Relabel the $S_k$'s so that $v_i(S_1) \geq \cdots \geq v_i(S_m)$; given $j \in \{1, \ldots, m\}$, write $\mathcal{T}_j = \{S_1, \ldots, S_j\}$. Set $\mathcal{T}$ to the highest-indexed $\mathcal{T}_j$ such that (a) $\forall T_1, T_2 \in \mathcal{T} : v_i(T_1) > v_i(T_2) \implies |T_1| < |T_2|$; and (b) $X = \bigcap_{T \in \mathcal{T}} T \setminus \bigcup_{S \in \mathcal{S} \setminus \mathcal{T}} S \neq \emptyset$. We can now choose $v_i^*$ for any (one) $x \in X$ in a way that ensures consistency with $v_i(T)$ for all $T \in \mathcal{T}$ (details omitted). The same procedure is then applied to $\mathcal{S} \setminus \mathcal{T}$, until a consistent labeling of the players is completed. Players that were not labeled by this procedure are given arbitrarily small values.

Representation functions for B-games turn out to play an important role in their stabilizability; we begin by showing the following negative result.

Proposition 4.8. The class of B-games with arbitrary representation functions is not PAC stabilizable.

Proof Sketch. Consider a B-game with 7 players and a uniform distribution $\mathcal{D}$ over the coalitions $S_1 = \{7, 1, 2\}$, $S_2 = \{2, 3\}$, $S_3 = \{3, 4\}$, $S_4 = \{4, 5\}$, $S_5 = \{5, 6, 7, 1\}$, where $S_1 \succ_1 S_5$, $S_1 \succ_7 S_5$, $S_2 \succ_2 S_1$, $S_3 \succ_3 S_2$, $S_4 \succ_4 S_3$, $S_5 \succ_5 S_4$. Since $|S_5| > |S_1|$, it cannot be determined whether $2 \succ_1 7$. The only coalition structure resistant to the coalitions supported by $\mathcal{D}$ that is common to all possible games, if $2 \succ_1 7$ is assumed, is the grand coalition; however, if $7 \succ_1 2$, it is possible that $S_5$ blocks the grand coalition. The core is never empty for B-games, hence the class cannot be PAC stabilized.

As hinted in the proof sketch, the non-stabilizability stems from our inability to decide whether two intersecting coalitions have the same most preferred agent. However, if we assume that the representation function allows us to decide whether a coalition $S$ is preferred to $T$ based on size or on agent identity, then B-games are PAC stabilizable; indeed, as we show in Section 4.4, we can PAC stabilize Top Responsive games, a superclass of B-games, based on a similar assumption.

4.4 Top Responsive Games

In Top Responsive games [Alcalde and Revilla, 2004], each agent's appreciation of a coalition depends on the most preferred subset within the coalition. More formally, the choice sets of agent $i$ in coalition $S \in \mathcal{N}_i$ are defined by $Ch(i, S) := \{X \subseteq S : i \in X, \ \forall Y \subseteq S \text{ with } i \in Y : X \succeq_i Y\}$. If $|Ch(i, S)| = 1$, we denote the unique element of $Ch(i, S)$ by $ch(i, S)$. A game satisfies top responsiveness if:

1. for all $i \in N$ and $S \in \mathcal{N}_i$: $|Ch(i, S)| = 1$;
2. for all $i \in N$ and $S, T \in \mathcal{N}_i$: (a) if $ch(i, S) \succ_i ch(i, T)$ then $S \succ_i T$; and (b) if $ch(i, S) = ch(i, T)$ and $S \subset T$, then $S \succ_i T$.

Note that for B-games, if $S \succ_i \{i\}$, then $|ch(i, S)| = 2$.

We assume that Top Responsive games are represented using an informative utility model; this means that for any $i \in N$ and $S, T \in \mathcal{N}_i$, it can be decided whether $ch(i, S) \succ_i ch(i, T)$ by observing the values $v_i(S)$ and $v_i(T)$. This assumption is based on the intuition that coalitions are primarily judged by the worth of their choice set. If a representation function is informative and two coalitions have different choice sets, the function values will indicate that the coalitions are in different "buckets". A simple example of an informative representation function is one for which $\lfloor v_i(S) \rfloor = \lfloor v_i(T) \rfloor \iff ch(i, S) = ch(i, T)$. Intuitively, the utility derived from the choice set $ch(i, S)$ is given in integers, and any tie-breaking by size constraints is done on lower-order fractional terms.
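The choice-set notion at the heart of top responsiveness can be computed directly from the definition; a brute-force sketch (with `weakly_prefers` as an assumed preference oracle, not part of the paper's model) follows.

```python
from itertools import chain, combinations

def choice_set(i, S, weakly_prefers):
    """Brute-force Ch(i, S): the subsets of S containing i that i
    weakly prefers to every other such subset. Exponential in |S|,
    so purely illustrative; in a top responsive game the returned
    list is the singleton [ch(i, S)]."""
    others = sorted(S - {i})
    candidates = [frozenset({i}) | frozenset(c)
                  for c in chain.from_iterable(
                      combinations(others, k)
                      for k in range(len(others) + 1))]
    return [X for X in candidates
            if all(weakly_prefers(i, X, Y) for Y in candidates)]
```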
We show that Top Responsive games are not PAC learnable, even if their representation functions are informative.

Proposition 4.9. Top Responsive hedonic games with informative representation functions are not PAC learnable.

Proof Sketch. Consider the class $\mathcal{R}$ of hedonic games where $\forall S, T \in \mathcal{N}_i : |S| > |T| \implies S \succ_i T$; it is easy to see that $\mathcal{R}$ is a subclass of Top Responsive games. One can show that the set $X = \{S \in \mathcal{N}_i : |S| = \lceil \frac{n}{2} \rceil\}$ can be pseudo-shattered by $\mathcal{R}$. Since $|X| > 2^{\frac{n-1}{2}}$, we have $Pdim(\mathcal{R}) > 2^{\frac{n-1}{2}}$; invoking Theorem 2.1, we conclude that $\mathcal{R}$ is not PAC learnable.

Theorem 4.10. Top Responsive hedonic games with informative representations are efficiently PAC stabilizable.

Proof. We claim that Algorithm 1 PAC stabilizes Top Responsive games. Algorithm 1 is similar to the Top Covering Algorithm [Alcalde and Revilla, 2004], used to find stable outcomes for Top Responsive hedonic games in the full-information setting. Intuitively, instead of using the exact choice sets, Algorithm 1 approximates them.

Algorithm 1: An algorithm finding a PAC stable outcome for Top Responsive games

Input: $\varepsilon$, $\delta$, a set $\mathcal{S}$ of $m = (2n^4 + 2n^3) \frac{1}{\varepsilon} \ln \frac{2n^3}{\delta}$ samples from $\mathcal{D}$

1: $l \leftarrow 0$, $R_l \leftarrow N$, $\pi \leftarrow \emptyset$
2: $\omega \leftarrow \frac{2n^2}{\varepsilon} \ln \frac{2n^3}{\delta}$
3: while $R_l \neq \emptyset$ do
4:   $\mathcal{S}' \leftarrow$ take and remove $\omega$ samples from $\mathcal{S}$
5:   $\mathcal{S}' \leftarrow \{T : T \in \mathcal{S}', T \subseteq R_l\}$
6:   for $i \in R_l$ do
7:     if $i \notin \bigcup_{X \in \mathcal{S}'} X$ then
8:       $B_{i,l} \leftarrow \{i\}$
9:     else
10:      $B_{i,l} \leftarrow \arg\max_{T \in \mathcal{S}' : i \in T} v_i(T)$
11:    end if
12:    $B_{i,l} \leftarrow \bigcap_{\{T \in \mathcal{S}' : ch(i,T) = ch(i,B_{i,l})\}} T$
13:   end for
14:   for $j = 1, \ldots, |R_l|$ do
15:     $\mathcal{S}' \leftarrow$ take and remove $\omega$ samples from $\mathcal{S}$
16:     for $i \in R_l$ do
17:       $B_{i,l} \leftarrow B_{i,l} \cap \bigcap_{\{T \in \mathcal{S}' : ch(i,T) = ch(i,B_{i,l})\}} T$
18:     end for
19:   end for
20:   $CC(i, R_l) \leftarrow \{i' \in R_l : \exists j_1, \ldots, j_k \in R_l : j_1 = i \wedge j_k = i' \wedge j_2 \in B_{j_1,l}, \ldots, j_k \in B_{j_{k-1},l}\}$
21:   $i^* \leftarrow \arg\min_{j \in R_l} |CC(j, R_l)|$
22:   $X_l \leftarrow CC(i^*, R_l)$
23:   $\pi \leftarrow \pi \cup \{X_l\}$
24:   $R_{l+1} \leftarrow R_l \setminus X_l$, $l \leftarrow l + 1$
25: end while
26: return $\pi$
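For concreteness, step 20's reachability computation can be sketched as a simple graph search (a simplified illustration with our own names, not the paper's code):

```python
def connected_component(i, R, B):
    """Step 20 of Algorithm 1 as directed reachability: starting from
    player i, repeatedly follow the approximate choice sets B[j]
    (i.e., B_{j,l}), restricted to the remaining players R."""
    seen, frontier = {i}, [i]
    while frontier:
        j = frontier.pop()
        for k in B[j]:
            if k in R and k not in seen:
                seen.add(k)
                frontier.append(k)
    return seen
```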
Note that after step 19, the algorithm constructs a coalition of players based on (approximations of) choice sets, adds that coalition to $\pi$, removes it from the set $R_l$ of players left to be considered, and, if $R_l$ is non-empty, starts over, similarly to the Top Covering Algorithm. Note that $R_l$ grows strictly smaller with every iteration; hence $l \leq n - 1$.

Define $\succeq_{i,R_l}$ as a modified preference of player $i$ that equates any coalition not entirely contained in $R_l$ with the worst coalition: if $T \subseteq R_l$ and $S \subseteq R_l$, then $T \succeq_{i,R_l} S \iff T \succeq_i S$; if $T \subseteq R_l$ but $S \not\subseteq R_l$, then $T \succeq_{i,R_l} S$; and if $T \not\subseteq R_l$ and $S \not\subseteq R_l$, then $T \succeq_{i,R_l} S$ (all such coalitions are tied). Given a distribution $\mathcal{D}$, we say that a coalition $T$ is $R_l$-top-$\varepsilon$ for player $i$ if $\Pr_{S \sim \mathcal{D}}[S \succ_{i,R_l} T] \leq \varepsilon$. Trivially, the probability of sampling an $R_l$-top-$\varepsilon$ coalition for player $i$ from $\mathcal{D}$ is at least $\varepsilon$. Note that if $\Pr_{S \sim \mathcal{D}}[S \subseteq R_l] \leq \varepsilon$, then any set is $R_l$-top-$\varepsilon$.

$B_{i,l}$ is the approximation of the choice set $ch(i, R_l)$. The algorithm's objective is to ensure that $\Pr_{S \sim \mathcal{D}}[v(S) \text{ is inconsistent with some } B_{i,l} = ch(i, R_l)] \leq \varepsilon$. Steps 4 to 12 approximate $ch(i, R_l)$ by $B_{i,l}$, ensuring that sampling a coalition from $\mathcal{D}$ with a better choice set than $B_{i,l}$ is unlikely; this is done by examining enough coalitions so as to see some $R_l$-top-$\frac{\varepsilon}{2n^2}$ coalition for every player. The probability that, for a given $i$, none of the $\omega$ coalitions in $\mathcal{S}'$ is $R_l$-top-$\frac{\varepsilon}{2n^2}$ is at most $\left(1 - \frac{\varepsilon}{2n^2}\right)^{\omega} < \frac{\delta}{2n^3}$. Taking a union bound, the probability that there is some $l$ and some player $i \in R_l$ such that there is no $R_l$-top-$\frac{\varepsilon}{2n^2}$ coalition for $i$ in $\mathcal{S}'$ is at most $\frac{\delta}{2}$. Note that seeing a coalition entirely contained in $R_l$ may be very unlikely, and all coalitions not contained in $R_l$ are removed from $\mathcal{S}'$ at step 5; hence, $\mathcal{S}'$ may end up containing no coalition with some player $i$ (the condition in line 7). But then, with high confidence as above, every coalition is $R_l$-top-$\frac{\varepsilon}{2n^2}$, and the algorithm can pick, for example, $\{i\}$. Note that steps 12 and 17 require an informative representation.

The intuition of steps 14 to 19 is to further refine the approximation $B_{i,l}$ of $ch(i, R_l)$, ensuring that seeing a coalition from $\mathcal{D}$ with the same choice set as $B_{i,l}$ but not containing some player of $B_{i,l}$ is unlikely. Let $E(S, l, i)$ be the event $[S \subseteq R_l \wedge ch(i, B_{i,l}) = ch(i, S) \wedge B_{i,l} \setminus S \neq \emptyset]$. Suppose (*) $\Pr_{S \sim \mathcal{D}}[E(S, l, i)] \geq \frac{\varepsilon}{2n^2}$; then the probability that no $S \in \mathcal{S}'$ satisfies $E(S, l, i)$ is at most $\left(1 - \frac{\varepsilon}{2n^2}\right)^{\omega} < \frac{\delta}{2n^3}$. By the union bound, for every iteration of $l$ and $j$ and every $i$ such that (*) holds, some $S \in \mathcal{S}'$ satisfies $E(S, l, i)$, with confidence $1 - \frac{\delta}{2}$. Note that if some $S \in \mathcal{S}'$ satisfies $E(S, l, i)$, then step 17 removes some agent from $B_{i,l}$; since $B_{i,l} \subseteq R_l$ can lose an agent at most $|R_l|$ times, the loop at step 14 would be redundant with more iterations. Concluding, after the algorithm's execution, (*) does not hold for any $l$ and $i$, with confidence $1 - \frac{\delta}{2}$.

To see that the algorithm achieves PAC stability, consider some coalition $T$ that core blocks $\pi$. Let $k = \min\{l : \exists i \in T \cap X_l\}$ and let $i \in T \cap X_k$. Since $T$ core blocks $\pi$, $T \succ_i X_k$. By top responsiveness, one of the following holds: (a) $ch(i, T) \succ_i ch(i, X_k)$, or (b) $X_k \setminus T \neq \emptyset$. Suppose that (a) holds; by construction of $X_k$, $ch(i, X_k) \succeq_i B_{i,k}$. By top responsiveness, $T \succ_i B_{i,k}$. Recall that with confidence $1 - \frac{\delta}{2}$ it can be assumed that $B_{i,k}$ is $R_k$-top-$\frac{\varepsilon}{2n^2}$. By the choice of $k$, $T \subseteq R_k$. By the union bound (over all $i$ and $k$), $\Pr_{T \sim \mathcal{D}}[T \succ_i B_{i,k}] < \frac{\varepsilon}{2}$. Next, suppose that (b) holds. By the construction of $X_k$ and since $i \in T \cap X_k$, there is a player $j \in T \cap X_k$ such that $B_{j,k} \not\subseteq T$. Since $T$ core blocks $\pi$, $T \succ_j X_k$. Then either $ch(j, T) \succ_j ch(j, B_{j,k})$ or $E(T, k, j)$ occurs. The former implies case (a) (with $j$ in place of $i$). Recall that with confidence $1 - \frac{\delta}{2}$, for all $l$ and $i$, $\Pr_{T \sim \mathcal{D}}[E(T, l, i)] < \frac{\varepsilon}{2n^2}$; by the union bound, the latter happens with probability smaller than $\frac{\varepsilon}{2}$. Taking a union bound over the two cases, $\Pr_{T \sim \mathcal{D}}[T \text{ core blocks } \pi] < \varepsilon$, with confidence $1 - \delta$.

5 Conclusions and Future Work

Our results may seem to imply that classes of hedonic games with instances having an empty core are impossible to PAC stabilize. However, things are not as simple as that: some of our examples (Propositions 4.6 and 4.8) face an algorithm with a decision between two possible games, each with a non-empty core, where any partition that stabilizes one destabilizes the other. One interesting direction for future work would be to use structural restrictions: recent works study hedonic games where agent preferences follow a graph structure [Igarashi and Elkind, 2016]. It would be useful to see whether certain graphical assumptions imply PAC stabilizability. Furthermore, while our work studies the core, one can focus on other hedonic solution concepts.

Beyond its theoretical interest, we believe that our approach sets a compelling research agenda. It is often necessary to assume uncertainty in real-world domains; indeed, by grounding stability in observed data rather than the entirety of agents' preferences, we offer a tailored, domain-specific solution. Since our algorithms (and PAC learning in general) do not use any information regarding the distribution $\mathcal{D}$, they may output different results in different domain implementations. Moving forward in this direction, it is quite possible that making certain assumptions with respect to $\mathcal{D}$ will circumvent some of our negative stability results. Our work formally relates observational data and stability in hedonic games; this relation opens the door to empirical evaluation of hedonic games, a further step towards their implementation in real-world systems.

Acknowledgements

The authors were supported by MOE Grant no. R-252-000-625-133, and by NRF Fellowship no. R-252-000-750-733.
References

[Alcalde and Revilla, 2004] J. Alcalde and P. Revilla. Researching with whom? Stability and manipulation. Journal of Mathematical Economics, 40(8):869-887, 2004.

[Anthony and Bartlett, 1999] M. Anthony and P. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, 1999.

[Aziz and Brandl, 2012] H. Aziz and F. Brandl. Existence of stability in hedonic coalition formation games. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '12, pages 763-770, 2012.

[Aziz and Savani, 2016] H. Aziz and R. Savani. Hedonic games. In F. Brandt, V. Conitzer, U. Endriss, J. Lang, and A.D. Procaccia, editors, Handbook of Computational Social Choice, chapter 15. Cambridge University Press, 2016.

[Aziz et al., 2014] H. Aziz, F. Brandt, and P. Harrenstein. Fractional hedonic games. In Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '14, pages 5-12, 2014.

[Balcan et al., 2012] M.F. Balcan, F. Constantin, S. Iwata, and L. Wang. Learning valuation functions. In Proceedings of the 25th Annual Conference on Learning Theory, COLT '12, pages 4.1-4.24, 2012.

[Balcan et al., 2015] M.F. Balcan, A.D. Procaccia, and Y. Zick. Learning cooperative games. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, IJCAI '15, pages 475-481, 2015.

[Balcan et al., 2016a] M.F. Balcan, T. Sandholm, and E. Vitercik. Sample complexity of automated mechanism design. In Advances in Neural Information Processing Systems, NIPS '16, pages 2083-2091, 2016.

[Balcan et al., 2016b] M.F. Balcan, E. Vitercik, and C. White. Learning combinatorial functions from pairwise comparisons. In Proceedings of the 29th Annual Conference on Learning Theory, COLT '16, pages 1-35, 2016.

[Banerjee et al., 2001] S. Banerjee, H. Konishi, and T. Sönmez. Core in a simple coalition formation game. Social Choice and Welfare, 18(1):135-153, 2001.

[Brânzei and Larson, 2011] S. Brânzei and K. Larson. Social distance games. In Proceedings of the 22nd International Joint Conference on Artificial Intelligence, IJCAI '11, pages 91-96, 2011.

[Cechlárová and Hajduková, 2003] K. Cechlárová and J. Hajduková. Computational complexity of stable partitions with B-preferences. International Journal of Game Theory, 31(3):353-364, 2003.

[Cechlárová and Hajduková, 2004] K. Cechlárová and J. Hajduková. Stable partitions with W-preferences. Discrete Applied Mathematics, 138(3):333-347, 2004.

[Cechlárová and Romero-Medina, 2001] K. Cechlárová and A. Romero-Medina. Stability in coalition formation games. International Journal of Game Theory, 29(4):487-494, 2001.

[Chalkiadakis and Boutilier, 2004] G. Chalkiadakis and C. Boutilier. Bayesian reinforcement learning for coalition formation under uncertainty. In Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS '04, pages 1090-1097, 2004.

[Chalkiadakis et al., 2007] G. Chalkiadakis, E. Markakis, and C. Boutilier. Coalition formation under uncertainty: Bargaining equilibria and the Bayesian core stability concept. In Proceedings of the 6th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '07, pages 412-419, 2007.

[Deineko and Woeginger, 2013] V.G. Deineko and G.J. Woeginger. Two hardness results for core stability in hedonic coalition formation games. Discrete Applied Mathematics, 161(13):1837-1842, 2013.
[Drèze and Greenberg, 1980] J.H. Drèze and J. Greenberg. Hedonic coalitions: Optimality and stability. Econometrica, 48(4):987-1003, May 1980.

[Igarashi and Elkind, 2016] A. Igarashi and E. Elkind. Hedonic games with graph-restricted communication. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '16, pages 242-250, 2016.

[Kearns and Vazirani, 1994] M.J. Kearns and U.V. Vazirani. An Introduction to Computational Learning Theory. MIT Press, 1994.

[Myerson, 2007] R.B. Myerson. Virtual utility and the core for games with incomplete information. Journal of Economic Theory, 136(1):260-285, 2007.

[Peters and Elkind, 2015] D. Peters and E. Elkind. Simple causes of complexity in hedonic games. In Proceedings of the 24th International Joint Conference on Artificial Intelligence, IJCAI '15, pages 617-623, 2015.

[Shashua, 2009] A. Shashua. Introduction to machine learning: Class notes 67577. arXiv preprint arXiv:0904.3664, 2009.

[Sinha et al., 2016] A. Sinha, D. Kar, and M. Tambe. Learning adversary behavior in security games: A PAC model perspective. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems, AAMAS '16, pages 214-222, 2016.

[Woeginger, 2013] G.J. Woeginger. Core stability in hedonic coalition formation. In 39th International Conference on Current Trends in Theory and Practice of Computer Science, SOFSEM '13, pages 33-50. Springer, 2013.