# individual_fairness_in_graph_decomposition__647835e7.pdf Individual Fairness in Graph Decomposition Kamesh Munagala * 1 Govind S. Sankar * 1 In this paper, we consider classic randomized low diameter decomposition procedures for planar graphs that obtain connected clusters which are cohesive in that close-by pairs of nodes are assigned to the same cluster with high probability. We require the additional aspect of individual fairness pairs of nodes at comparable distances should be separated with comparable probability. We show that classic decomposition procedures do not satisfy this property. We present novel algorithms that achieve various trade-offs between this property and additional desiderata of connectivity of the clusters and optimality in the number of clusters. We show that our individual fairness bounds may be difficult to improve by tying the improvement to resolving a major open question in metric embeddings. We finally show the efficacy of our algorithms on real planar networks modeling congressional redistricting. 1. Introduction In this paper, we study a clustering problem of partitioning users located in a metric space into components, each of diameter O(R) for some input parameter R. Such partitioning has been widely studied as low diameter decomposition (LDD) and has myriad applications. In addition to being a stand-alone clustering procedure, they are, for example, used as a subroutine in metric embeddings (Bartal, 1996; Fakcharoenphol et al., 2004). In this paper, we consider the question of fairness in how different pairs of nodes are separated in the decomposition. In keeping with terminology in the fairness literature, we term this individual fairness of the decomposition. 1Computer Science Department, Duke University, Durham NC 27708-0129, USA. Correspondence to: Kamesh Munagala , Govind S. Sankar . Proceedings of the 41 st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024. Copyright 2024 by the author(s). 1.1. Motivation for Individual Fairness As motivation, consider the clustering problem of opening schools or libraries and assigning users to these locations. In such assignments, it is desirable to preserve communities in the sense that nearby users should be assigned to the same cluster. For instance, in Congressional redistricting (Altman, 1998; Cohen-Addad et al., 2018), it is preferable to keep communities from being fractured (Brubach et al., 2020). Similarly, when assigning students to schools, students who live close to each other prefer to be assigned to the same school. This is termed community-cohesion (Ashlagi & Shi, 2014). Of course, no deterministic solution can guarantee community cohesion for all user pairs since some close-by user pairs will be deterministically split. In view of this impossibility, we take the approach of Brubach et al. (2020); Harris et al. (2019) and consider solutions that are a distribution over clusterings. In such a distribution, different pairs of users are separated with different probabilities, and this probability will typically increase with the distance between the pair. This naturally leads to the notion of pairwise envyfreeness pairs of users with comparable distance should be separated with comparable probability. This ensures no pair of users envies a comparable pair in terms of how cohesive the pair is in the decomposition. As another motivation, in Congressional redistricting, one pertinent question is to generate a random sample of representative maps to audit a given map for unfair gerrymandering (De Ford et al., 2021). Ideally, in this random sampling distribution, the marginal probabilities of separation of pairs of precincts of comparable distance should be comparable. This means the probability that any pair of precincts is separated should be both lower and upper bounded by a value that depends on the distance between them. We term this notion of fairness as individual fairness due to its connection to a corresponding notion in fair machine learning (Dwork et al., 2012). The main goal of our paper is to study the existence and computation of LDDs that are individually fair. 1.2. Randomized LDD in Planar Graphs Before proceeding further, we present a formal model for randomized LDD. We consider the canonical case where the users (resp. neighborhoods or other entities) are nodes Individual Fairness in Graph Decomposition of a planar graph G(V, E) and the distance function d(u, v) is the unweighted shortest path distance between nodes u and v. We now define an LDD as follows: Definition 1.1 (Low Diameter Decomposition). Given a graph G(V, E) with shortest path distance d( , ), and given a parameter R (where potentially R n), a low-diameter decomposition (LDD) is a (possibly randomized) partition of V into disjoint clusters π1, π2, . . . , πr, so that for j and u, v πj, d(u, v) = O(R). In the applications previously mentioned, planar graphs are a natural model. For instance, in political redistricting, nodes correspond to precincts, and edges connect nodes whose corresponding precincts are adjacent. Similarly, in school assignment, nodes correspond to neighborhoods that cannot be split, with edges between nodes corresponding to adjacent neighborhoods. If the LDD partitions the planar graph into connected regions, these regions will also be connected when the nodes are expanded into the corresponding precincts or neighborhoods. This property is an advantage of planar graphs over point set models, where a contiguous set of nodes need not be connected when they are expanded. Finally, the radius constraint on the LDD ensures the expanded regions are sufficiently compact. Now consider a pair (u, v) of users separated by distance d(u, v). Several randomized LDD constructions lead to cohesive communities by making the probability that (u, v) are separated an increasing function of d(u, v). This ensures close-by pairs are together with a larger probability than pairs further away. The natural benchmark for this probability is the quantity ρuv = d(u,v) R , as the following example shows. Example 1. Consider a grid graph where vertex v with coordinates (i, j) has edges to the vertices (i 1, j), (i + 1, j), (i, j 1), and (i, j + 1). A simple LDD is to choose (ℓ, m) uniformly at random from [R] [R] and cut along the x-axis at {. . . , ℓ 2R, ℓ R, ℓ, ℓ+R, . . .} and the y-axis at {. . . , m 2R, m R, m, m+R, . . .}. Then, the probability any pair of vertices (u, v) is separated is at most ρuv. Indeed, for any planar graph, the classic procedure of Klein et al. (1993) (KPR, see Algorithm 1) provides a practical LDD method for a planar graph into connected components of diameter O(R) so that the probability of separation of any pair (u, v) is O(ρuv). 1.3. Individual Fairness in LDD We now come to the main focus of this paper: individual fairness. For pairs of nodes (u1, v1) and (u2, v2) with d(u1, v1) d(u2, v2), both pairs should be separated with comparable probability, so that neither pair envies the other. Ideally, (u, v) are separated with probability Θ(ρuv). In Example 1, the probability that (u, v) are separated is at least ρuv/2, so that this probability is indeed Θ(ρuv). This may give us hope that the classical KPR procedure also recreates such an individual fairness guarantee for general planar graphs. Note that the classical analysis implies an upper bound of O(ρuv) on the probability (u, v) is separated, but no corresponding lower bound. Existing Methods can Fail. As our first contribution (Theorem 1.2), we somewhat counter-intuitively show that the KPR procedure (Algorithm 1) cannot yield an individual fairness guarantee for any distance d, some pair (u, v) with distance d(u, v) = d is never separated. Theorem 1.2 (Proved in Appendix B). For any n N and any distance 1 d n 3 , there exists a planar graph Gd and vertices u, v V (G) with d(u, v) = d such that Algorithm 1 with parameter R > 2d never separates u and v. This means that though the procedure guarantees that the probability that (u, v) is separated is O(ρuv), the actual probability of separation zero need not be comparable to this quantity. Therefore, different vertex pairs of comparable distance can be separated with vastly different probabilities, and this holds even when the vertex pair is an edge. 1.4. Achieving Individual Fairness: Our Results Given this negative result, our main contribution is a set of algorithms for randomized planar graph partitioning that achieve various levels of individual fairness. We study the trade-offs between three natural desiderata in randomized LDDs each part in the decomposition should be a connected component, these components should have nontrivial size, and the partitioning should be individually fair. We first state the above desiderata formally: (CON): Each cluster forms a single connected component. This is important in applications such as school assignment or political redistricting (Duchin, 2018). (COMP): The LDD should optimally compress the graph. For a general unweighted planar graph with n nodes, we say that a procedure satisfies (COMP) if (i) the expected number of components produced is O(n/R);1 and (ii) for any outcome of randomness, the algorithm should achieve non-trivial compression: When R = D β , where D is the diameter of the graph and β 1, there always exist Ω(β) components with diameter Ω(R). (f, g)-(IF): This condition captures individual fairness. The probability that (u, v) are separated is at most f(ρuv) and at least g(ρuv), for scalar functions f, g. For an arbitrary function h, we say that f(ρ) = 1This bound cannot be improved for a line graph. Individual Fairness in Graph Decomposition O(h(ρ)) if f(ρuv) = O(h(ρuv)) for every pair u, v; similarly for g. In the ideal case, as in Example 1, Pr[(u, v) is cut] = Θ(ρuv) for all u, v, so that f(ρ) = O(ρ) and g(ρ) = Ω(ρ) simultaneously. Note that the KPR procedure (Algorithm 1) achieves (CON), (COMP), and f(ρ) = O(ρ). However, by Theorem 1.2, it has g(ρ) = 0, and is hence not individually fair. Fairness for Edges. We first show that if we only seek to be individually fair for pairs (u, v) E, that is, edges or d(u, v) = 1, then there is a modification of the KPR procedure via random vertex weights (Algorithm 2) that preserves (COMP) and (CON), and guarantees that any edge (u, v) is separated with probability Θ 1 R . We state the theorem in a bit more generality below. We show a tighter bound for the case of d(u, v) = 1 since we present experiments on this case in Section 7. Theorem 1.3 (Proved in Section 3). There is an LDD procedure (Algorithm 2) that produces connected components of diameter at most 43 (R + 1), satisfies (COMP), and every pair (u, v) V V satisfies 1 2R Pr[u, v separated] R if d(u, v) > 1 3 R if d(u, v) = 1 . Lower Bound on Individual Fairness. Moving on to pairs (u, v) separated by general distances, we show that obtaining an individual fairness guarantee of Θ(ρuv) resolves a major conjecture in metric embeddings and is hence likely difficult. Formally, we tie individual fairness in LDD precisely to the distortion of embedding the graph metric into ℓ1. Such an embedding is a function h that maps the vertices v of G to s-dimensional points h(v) (for some s) with the guarantee that the ℓ1 distance between h(u) and h(v) is at least d(u, v), and at most α d(u, v). The parameter α 1 is termed the distortion. Theorem 1.4 (Proved in Section 4). If there is an LDD procedure for planar graph2 G that with any parameter3 R = ω(n) that achieves f(ρ)/g(ρ) γ and f(ρ) = O(ρ), then there is an ℓ1 embedding of G with distortion O(γ). The best-known distortion bound for embedding a planar graph into ℓ1 is O( log n), and it is a major open question to improve this to a constant. This implies obtaining the optimal bound f(ρ) = O(ρ) and g(ρ) = Ω(ρ) simultaneously resolves a major conjecture, since by Theorem 1.4, it implies a constant distortion embedding of G into ℓ1. 2This theorem holds for arbitrary metric spaces. 3Note that Definition 1.1 allows for the parameter R to be larger than the diameter of G. In this regime, the constraint on the diameter is vacuous, and the only requirement is to satisfy the bounds on the separation probabilities. Positive Results for General Distances. Given the above difficulty, we show algorithms that achieve increasingly stronger individual fairness guarantees as we relax the requirements of (COMP) and (CON). First, we show the following theorem that extends Theorem 1.3 to derive a lower bound on separation probability for non-adjacent vertex pairs. Theorem 1.5 (Proved in Section 5). There is an algorithm (Algorithm 3) that, for any ϵ > 0, achieves (CON), (COMP), f(ρ) = O(ρ), and g(ρ) = Ω(ϵ2 ρ2+ϵ) for any d(u, v). The above theorem is technically the most intricate. We extend the classical analysis in (Klein et al., 1993) by designing a novel modification of their algorithm. Our analysis hinges on showing the existence of a K3,3 minor (and hence a contradiction) if a vertex pair (u, v) is not separated with sufficient probability. At a high level, this is similar to the analysis in (Klein et al., 1993). However, they show such a minor for deriving the diameter of the decomposition. In contrast, in our case, the diameter follows from their argument but we need the minor construction to show the lower bound. This makes the details of our construction different. We next show in Section 6 that the above bound on g can be improved if we violate (COMP). Theorem 1.6 (Proved in Section 6). For any constant ϵ > 0, there is an LDD procedure (Algorithm 4) that achieves (CON), f(ρ) = O (ρ), and g(ρ) = Ω(ϵ ρ1+ϵ). A similar procedure also gives f(ρ) = O(ρ) and g(ρ) = Ω ρ log R . The algorithm that shows the above theorem runs the classical KPR algorithm with a random choice of radius parameter. Such a procedure will preserve connectivity, but the random choice of radius leads to sub-optimal compression, even for a line. We finally consider the implications of violating both (CON) and (COMP). We show the following theorem. Theorem 1.7 (Proved in Appendix C). For any planar graph and parameter R, there is an LDD (Algorithm 5) with f(ρ) = O(ρ) and g(ρ) = Ω ρ log R The algorithm for showing the above theorem involves embedding the planar graph into the ℓ2 metric and projecting the embedding onto a line, reminiscent of locality-sensitive hashing methods for the ℓ2 metric. Such an approach does not preserve either connectivity or (COMP). Note that by Theorem 1.4, improving the above result for R = ω(n) would make progress on the conjecture of optimally embedding planar graphs into ℓ1. The trade-offs above are summarized in Table 1. Note that the gap between f and g improves as we relax the other desiderata of (CON) and (COMP). Individual Fairness in Graph Decomposition (CON) (COMP) g(ρ) KPR (Alg. 1) Theorem 1.5 (Alg. 3) Ω(ϵ2 ρ2+ϵ), ϵ > 0 Theorem 1.6 (Alg. 4) Ω(ϵ ρ1+ϵ), ϵ > 0 Theorem 1.7 (Alg. 5 ) Ω ρ log R Table 1: Trade-offs in Desiderata. In all cases, f(ρ) = O(ρ). The notation Ωϵ hides terms dependent on ϵ. Empirical Results. We perform experiments on realworld planar graphs modeling precinct connectivity in redistricting applications in the states of North Carolina, Maryland, and Pennsylvania in the United States. We consider individual fairness on edges (Theorem 1.3 and Algorithm 2). We show an approach to establish that Algorithm 2 is more individually fair than Algorithm 1 (KPR) across these datasets. Focusing on Algorithm 2, we also show that empirically, the radii of the clusters are much smaller than the theoretical bounds (Theorem 2.1). Further, the number of components satisfies the O(n/R) bound on the number of clusters from (COMP) not just in expectation but almost always. 1.5. Related Work Low Diameter Decomposition. Randomized low diameter decomposition of metric spaces is widely used as a subroutine for embedding the metric space into a simpler space, such as trees or Euclidean metrics, with provably low distortion, meaning that no distance is stretched too much. For general metric spaces, the work of Bartal (1996); Fakcharoenphol et al. (2004) present LDD procedures that achieve a probability of separation O(ρuv log n), where n is the number of points in the metric space. For graphs excluding a Kr minor, which includes planar graphs, the algorithm of Klein et al. (1993) has been a predominant tool. Abraham et al. (2019) devise an entirely different framework using cop decompositions that yields improved separation probability bounds for large r. This, however, does not yield any improvement for planar graphs (r = 5), so we focus on the simpler KPR algorithm in this paper. We refer the reader to the excellent survey by Matouˇsek (2002) for various embedding results and their connection to LDD. In contrast, we consider low diameter decomposition as a stand-alone clustering concept and study individual fairness in the separation probabilities it generates and how this fairness trades off with other natural clustering desiderata. Fairness. Over the last decade, fairness has become an important algorithmic objective in machine learning (Choulde- chova & Roth, 2018). Fair matchings (Huang et al., 2016), rankings (Celis et al., 2017), and bandit problems (Joseph et al., 2016) have been considered, among others. Following the work of Chierichetti et al. (2017), fair clustering has seen much interest (Backurs et al., 2019; Bera et al., 2019; Chen et al., 2019; Esmaeili et al., 2020). Within the literature on fairness, machine learning literature has considered the notion of individual fairness (Dwork et al., 2012), where the goal is to find a randomized solution (classifier, regression) such that close-by data points see similar outcomes. This has inspired similar models for problems such as auctions (Chawla & Jagadeesan, 2022) and assignments (Devic et al., 2023; Shen et al., 2023). We extend this literature to consider fairness in separation probability in randomized clustering. Clustering. Our problem is closely related to classical k-center clustering, where the goal is to open at most k such locations and assign users to these locations. Though this problem is NP-HARD, several efficient algorithms are known that achieve a 2-approximation when the points lie in a metric space (Hochbaum & Shmoys, 1985; Hsu & Nemhauser, 1979). Of particular interest is the work of Brubach et al. (2020) (see also (Miller et al., 2013)) that performs randomized k-center clustering and guarantees a probability of separation of O(ρuv log k). However, their work does not guarantee individual fairness for pairs of users with comparable distances. We use this line of work as motivation for studying randomized partitioning and seeking individual fairness guarantees. In contrast with kcenter clustering, we do not enforce a limit k on the number of clusters since no non-trivial fairness guarantee can be obtained with this restriction.4 2. The KPR Partitioning Algorithm For the sake of completeness, we present the KPR algorithm (Klein et al., 1993) as Algorithm 1. This achieves a randomized LDD with the additional properties that: (a) each cluster πj is a connected component; (b) for any u, v πj, the shortest path from u to v in the subgraph induced by πj has length O(R); and (c) the probability that u, v belong to different clusters (or are separated ) is O(ρuv). Recall that ρuv = d(u,v) R , where d(u, v) is the shortest path distance (number of edges) between u and v in G. First, we present some terminology. We term the maximum shortest path length within the subgraph induced by a cluster as its diameter . For a BFS tree rooted at vertex r, we say 4For example, consider 4 points located on the real line at 0, 1, x, x + 1. If we are only allowed to open 2 clusters, then as x , the pairs of close-by points cannot be separated without drastically affecting the cluster radii. Individual Fairness in Graph Decomposition that r is at level 0, the children of r are at level 1, and so on. We say an edge (u, v) is at level i if u is in level i and v is at level i + 1 or vice versa. We call an iteration of the loop in step 1 one phase of Algorithm 1. Algorithm 1 KPR Decomposition (Klein et al., 1993) Input: Integer R, planar graph G = (V, E). 1: Set F {}. 2: for i = {1, 2, 3} do 3: for every connected component C of G \ F do 4: Perform BFS from an arbitrary root r in C. 5: Sample k uniformly at random from {0, 1, . . . R 1}. 6: for every edge e at level ℓ k mod R from r do 7: F F {e} 8: end for 9: end for 10: end for Algorithm 1 can also be viewed in the following way. At step 1, define Vδ as the vertices whose levels lie in [δR + k, (δ + 1)R + k) for integers δ. Then, the algorithm can instead remove all edges leaving Vδ in step 1, for all δ. Theorem 2.1 ((Klein et al., 1993; Goemans, 2006)). Algorithm 1 partitions the vertices of G into clusters, each of which is a disjoint connected component of diameter at most 43 R. Further, for any (u, v) V V : Pr[(u, v) are separated] 3 ρuv. The procedure satisfies (CON) by definition. Lemma 2.2 shows that it satisfies (COMP). In Appendix B, we show that it does not satisfy individual fairness, since there exist graphs where a pair of vertices are never separated. The following lemma is proved in Appendix A. Lemma 2.2 ((COMP)). For any planar graph with n vertices and diameter D, the expected number of components produced by Algorithm 1 is at most 3n/R. Further, regardless of the randomness, there always exist β/4 components with diameter at least R/8 assuming R = D 3. Individual Fairness for Edges We will now prove Theorem 1.3 via Algorithm 2. The algorithm is similar to Algorithm 1, except it adds random weights to each vertex and cuts either above or below the vertex in the BFS tree depending on the vertex weight. The diameter guarantee follows from the proof of Theorem 2.1. Each phase of Algorithm 2 has at most R + 1 levels in the BFS tree, and we reuse the bounds from Algorithm 1 with parameter R + 1 instead of R. It is easy Algorithm 2 KPR algorithm with random vertex weights Input: Integer R, planar graph G = (V, E). 1: Set F {}. 2: for i = {1, 2, 3} do 3: for every connected component C of G \ F do 4: Perform BFS from an arbitrary root r in C. 5: Sample θ uniformly at random from {0, 1, . . . R 1}. 6: for every vertex v at level ℓv θ mod R do 7: Independently choose bv {0, 1} uniformly at random. 8: ℓv ℓv + bv. 9: end for 10: For non-negative integers q, define Vq = {v | q R + θ < ℓv (q + 1)R + θ}. 11: Let Eq denote edges with one end-point in set Vq, and let E = q 0Eq. 12: F F E . 13: end for 14: end for to check that the procedure satisfies (CON), and a similar argument to Lemma 2.2 shows (COMP). To upper-bound the separation probability for an edge, consider the BFS tree just before executing Step (2) in some phase of the algorithm. If u, v are on the same level, then that level is θ mod R with probability 1 R in this phase. If this does not happen, u and v will not be separated in this phase. Suppose u and v are not on the same level. Without loss of generality, assume that u is in level ℓand v is in level ℓ+ 1. If ℓ θ mod R (which happens with probability 1 R), then bu = 0 with probability 1/2 and (u, v) are subsequently separated. This event, therefore, happens with probability 1 2R. Similarly, if ℓ+ 1 θ mod R, then (u, v) are subsequently separated if bv = 1, which again happens with probability 1 2R. In all other events, (u, v) are not separated. Therefore, in each phase, the probability that u and v are separated is at most 1 R, and the upper bound follows by taking the union bound over three phases. The bound for more general distances is analogous. We will only consider the first phase to get the lower bound on the probability of separation. Assume without loss of generality that u is at least as close to the root of the BFS tree as v. If u and v are at the same level ℓat the beginning of Step (2), then with probability 1 R, θ is chosen as ℓmod R. In this event, u and v are separated if bu = bv, which happens with probability 1 2. This gives us a probability of 1 2R that they are separated. Similarly, if ℓu < ℓv at the beginning of Step (2), then with probability 1 2R, θ ℓv mod R and bu = 0, in which case u and v are separated. This completes the proof of Theorem 1.3. Individual Fairness in Graph Decomposition 4. Lower Bound on Individual Fairness We next prove Theorem 1.4. Let n = |V |. Suppose for any parameter R = ω(n), there is an LDD procedure that produces clusters such that the probability that u, v are separated lies in h c1 d(u,v) R , c2 d(u,v) R i where c2 = O(1). Note that assuming R = ω(n) means the diameter of the resulting clusters are irrelevant, and all we care about is the separation probability. Create an embedding h of G into ℓ1 as follows. Sample η = m R such partitions {σi}η i=1, where m = ω(n2). For each cluster C in partitioning σi, choose qi C = +1/m with probability 1/2 and qi C = 1/m with probability 1/2. For each v C, set tiv = qi C. The embedding has η dimensions and the ith dimension of h(v) is tiv. We have the following lemma for any (u, v) V V : Lemma 4.1. For any (u, v), with probability 1 1 2 d(u, v) h(u) h(v) 1 4c2 d(u, v). Proof. Let hui be the ith coordinate of h(u). In each dimension, if (u, v) are separated, they get different tokens with probability 1/2, in which case their distance is 2/m in that dimension. Otherwise, their distance is zero. This implies E[|hui hvi|] = Pr[(u, v) are separated ] E[ h(u) h(v) 1] = η Pr[(u, v) are separated ] m = c d(u, v) for c [c1, c2]. Let µuv = E[ h(u) h(v) 1]. Note now that h(u) h(v) 1 is the sum of R independent random variables each in {0, 2/m} and further that µuv c1 d(u, v) d(u, v) 1. Using Chernoff bounds, we have: Pr[ h(u) h(v) 1 / µuv (1 0.5)] 2 e m/8 1 for large enough m. Since m = ω(n2), the previous lemma implies that with constant probability, for all (u, v) V V , we have 2 d(u, v) h(u) h(v) 1 4c2 d(u, v). Scaling the coordinates up by 2/c1, this yields a noncontractive embedding into ℓ1 with distortion O(c2/c1), completing the proof of Theorem 1.4. 5. Individual Fairness for General Distances We now prove Theorem 1.5 via Algorithm 3. The key idea of the algorithm is to use two cut sequences in each phase of Algorithm 1. Algorithm 3 KPR algorithm with two cuts per phase Input: Integer R, planar graph G = (V, E). 1: Run the iteration of Algorithm 1 once (for i = 1) on G from an arbitrary root r0. 2: Let F be the set of removed edges. 3: for i = {1, 2} do 4: for every connected component C of G \ F do 5: Perform BFS with root ri set to be the vertex closest to the previous root ri 1 used for creating the component, breaking ties arbitrarily. 6: Sample k1 uniformly at random from {0, 1, . . . R 1}. 7: Sample κ from density ϵ κ1 ϵRϵ ; set k2 = κ . 8: for every edge e at levels ℓ k1 mod R or ℓ k1 + k2 mod R from ri do 9: F F {e}. 10: end for 11: end for 12: end for To prove Theorem 1.5, first, note that it satisfies (CON) by definition. The diameter guarantee of 43 R follows directly from Theorem 2.1. Simply observe that if we just consider the cuts induced by k1 in Step 3, we have effectively run three phases of Algorithm 1. Because of k2, we make some additional cuts in the second and third phases, which can only decrease the diameter. Lemma 5.1. Pr[u, v are separated] = O d(u,v) Proof. We use union bounds. The three cuts made in the first step and by k1 in Step 3 add 3 ρuv to the probability of separation. We now bound the probability that u, v are separated by k2 in Step 3. Consider the path of length d = d(u, v) between u, v in the tree. They are separated only if at least one edge of this path is at some level ℓ k1 + k2 mod R. Suppose that the edges of this path lie between levels ℓ1 and ℓ2 where we know that ℓ1 ℓ2 d 1. If ℓ1 k1+x mod R, then the probability that some edge of this path is cut is the probability that κ [x, x + ℓ2 ℓ1) [x, x + d). This happens with probability ϵ κRϵ dκ = ϵ((x + d)ϵ xϵ) Since k1 is chosen uniformly at random, x is also chosen uniformly at random from {0, 1, . . . , R 1}. Thus, a cut is Individual Fairness in Graph Decomposition made with probability ϵ((x + d)ϵ xϵ) ϵ((x + d)ϵ xϵ) ϵ((R + d)1+ϵ R1+ϵ) (1 + ϵ)R1+ϵ . Choosing ϵ 1, this is at most 2d/R = 2ρuv, giving an overall probability of separation of at most 5ρuv. We prove the following lemma in Appendix A, and it is similar to Lemma 2.2. Lemma 5.2. Algorithm 3 satisfies (COMP). We will now show the following theorem, which yields Theorem 1.5 for d(u, v) 59. Theorem 5.3. For d(u, v) 59, Algorithm 3 satisfies Pr[u, v are separated] ϵ2 16 d(u, v)/59 For (u, v) with d(u, v) < 59, we first run Algorithm 2 and note that Theorem 1.3 implies Pr[u, v separated] = Ω(1/R) = Ω(ρuv) = Ω(ρ2+2ϵ uv ). We subsequently run Algorithm 3 on each resulting component, hence yielding Theorem 1.5. 5.1. Proof of Theorem 5.3 Fix some pair (u, v) of vertices and let d = d(u, v). We assume d is a multiple of 59; otherwise, we can use 59 d 59 as a lower bound for d. First, note that we can assume d 43 R, else u, v are separated with probability 1. Define ˆρ = ρuv 59 and ˆd = d 59. From above, we get ˆρ 1. Further, assume w.l.o.g. that (u, v) are not cut in Step (3). Conditioned on this event, let G1 denote the component to which (u, v) belong. In the next iteration (Step (3) for i = 1), let the root chosen for this component be r1, and the corresponding BFS tree be T1. Let d1(x, y) denote the distance between x, y in G1 and let ℓ1(u) = d1(r1, u) denote the level of the tree T1 (measured from the root) that u belongs to. Events L1 and E1: Let L1 be the event that |ℓ1(u) ℓ1(v)| < ˆd. If L1 does not happen, then (u, v) are cut in Step (3) for i = 1 with probability at least ˆρ and we are done. We proceed assuming event L1 happens and define the first sandwich event E1 as follows. Definition 5.4. In event E1, in Step (3) for i = 1, the (k1, k2) chosen satisfy: k1 ℓmod R for some ℓsatisfying min(ℓ1(u), ℓ1(v)) ℓ min(ℓ1(u), ℓ1(v)) ˆd k1 + k2 ℓmod R for some ℓsatisfying max(ℓ1(u), ℓ1(v)) ℓ max(ℓ1(u), ℓ1(v)) + ˆd Figure 1: An example of the event E1 L1 in T1 (red). Both the cuts made by k1, k2 are within 2 ˆd levels of both u, v, which are themselves within ˆd levels of each other. In other words, the first sandwich event happens when the first cut (corresponding to k1) for that iteration i = 1 is made within ˆd levels above the higher of the levels of u, v in T1 and the second cut (corresponding to k2) in that iteration is made within ˆd levels below the level of the lower of u, v in T1. This is illustrated in Figure 1. Observe that for a sandwich event to happen, k1 must take one of ˆd possible values, which happens with probability ˆρ, and κ must lie in the range [max(ℓ(u), ℓ(v)) k1, max(ℓ(u), ℓ(v)) k1+ ˆd)]. Since max(ℓ(u), ℓ(v)) k1 [ ˆd, 2 ˆd], this probability is at least (3 ˆd)ϵ (2 ˆd)ϵ 4 ˆρϵ. This gives an overall probability of Pr[E1] = ϵ 4 ˆρ1+ϵ. Conditioned on L1 E1, the pair (u, v) could still be connected. If they are not connected, then we have shown the theorem. Suppose they are still connected, then let the resulting component be G2. Define the root r2 of the BFS tree T2 on G2 analogous to above, and let ℓ2(x) denote the level of node x V (G2) in this tree. Events L2 and E2: We assume the event L2 that |ℓ2(u) ℓ2(v)| < ˆd happens. Suppose not, then (u, v) are cut in Step (3) for i = 2 with probability at least ˆρ. Combining this with the bound for Pr[E1] above, we are done. Therefore, assume L2 happens. Then, define the second sandwich event E2 analogous to Definition 5.4, with i = 2 and ℓ2 used instead of i = 1 and ℓ1. We have, Pr[E2|E1] = ϵ 4 ˆρ1+ϵ = Pr[E1 E2] ϵ2 The following lemma completes the proof of Theorem 5.3. Lemma 5.5 (Proved in Appendix D). Conditioned on E1 E2 (assuming L1 L2 happens), u and v are always split. We prove Lemma 5.5 by contradiction. Suppose not and that u, v lie in the same connected component after the two phases. Let G3 be this connected component. Our approach will be to exhibit a K3,3 minor in the graph, contradicting the planarity of G. This construction is the technical heart of the proof, and we present in Appendix D. Individual Fairness in Graph Decomposition 6. Individual Fairness without (COMP) We now prove Theorem 1.6 via Algorithm 4. The key idea is to run Algorithm 1 with the parameter drawn from a distribution depending on R instead of the parameter being R directly. Algorithm 4 KPR algorithm with random diameter Input: Integer R, α R 0, planar graph G = (V, E) 1: Sample r (0, R] according to density (1+α)rα R1+α . 2: Run Algorithm 1 with the parameter r in place of R. Lemma 6.1. Algorithm 2 produces connected components of diameter O(R) and for every u, v V (G), satisfies 43 Pr[u, v are separated] 3ρuv log 1 if α = 0 and if α > 0, satisfies ρuv 1+α Pr[u, v are separated] 3ρuv Proof. The diameter bound follows directly from Theorem 2.1 since the largest parameter with which we run Algorithm 1 is R. Similarly, the connectivity guarantee follows since Algorithm 1 finds connected components. For the lower bound, observe that if 43r < d(u, v), then they are separated with probability 1 from the statement of Theorem 2.1. This happens with probability R1+α dr = d(u, v) For the upper bound, we consider them to be separated with probability 1 if r < 3d(u, v). If r 3d(u, v), then they are separated with probability 3d(u,v) r from Theorem 2.1. This gives us an overall probability of Z 3d(u,v) R1+α dr + Z R r (1 + α)rα 1+α + 3(1 + α)d(u, v) 3d(u,v) rα 1dr. If α > 0, the above expression is (3ρ)1+α + 3ρ(1 + α) αRα (Rα (3d(u, v))α) 3ρ(1 + α) where ρ = ρuv. If α = 0, the probability is 3ρ+log 1 3ρ 3ρ log 1 ρ , completing the proof. To obtain the guarantee in Theorem 1.6, we modify Algorithm 4 as follows: 1. For constant α > 0, we run Algorithm 1 with probability 1 α and Algorithm 4 with probability α. This yields g(ρ) = Ω(αρ1+α) and f(ρ) 3ρ. 2. When α = 0, we run Algorithm 1 with probability 1 1 log R and otherwise run Algorithm 4. This again yields g(ρ) = Ω(ρ/ log R) and f(ρ) = O(ρ). We note that though Algorithm 4 produces O(n/R) components in expectation, it does not satisfy (COMP) since when r = o(R), the maximum diameter is O(r) = o(R). 7. Empirical Results on Precinct Maps In this section, we run Algorithm 2 on precinct data obtained from the MGGG Redistricting Lab (MGGG Lab). This is naturally modeled as a planar graph where the vertices correspond to precincts of a state, and edges denote adjacency of precincts. We ran experiments on the states of North Carolina (NC), Pennsylvania (PA), and Maryland5 (MD). The sizes of the graphs are listed in Table 2. State Nodes Edges Diameter North Carolina 2692 7593 68 Maryland 1809 4718 66 Pennsylvania 9255 25721 89 Table 2: Sizes of the graphs from MGGG Lab. We implemented Algorithm 2 with parameter R = 5 and compared it with Algorithm 1 with the same R. We present the results for individual fairness, as well as the number and diameter of the clusters below. We omit a detailed runtime analysis of the algorithms as they exclusively entail Breadth-First Search traversals. We focus on Algorithm 2 since the real-world graphs have relatively small distances, and Algorithms 3 and 4 provide weaker guarantees in this regime. Individual Fairness. For Algorithm 2, for any edge (u, v) E, Theorem 1.3 yields Pr[(u, v) separated] 1 R = (0.1, 0.6), which bounds individual fairness by a factor of 6 for this algorithm. Empirically, we find that Algorithm 2 outperforms Algorithm 1 significantly on this metric. We plot the histograms of the edge separation probabilities in Figure 2, showing clearly that Algorithm 2 is more individually fair than Algorithm 1. Additionally, we obtain a stronger guarantee in the lower bound at virtually no cost to the upper bound. 5The MD graph from (MGGG Lab) is disconnected. It was modified to connect precincts that shared a water border. Individual Fairness in Graph Decomposition Figure 2: Histograms of edge separation probabilities across 3000 runs of Algorithm 2 on NC, MD, and PA. In our experiments, we choose the roots for the BFS trees uniformly at random. Our results show that even with such a strengthened version of Algorithm 1, it is more unfair than Algorithm 2. This is in line with our result showing that randomness cannot help Algorithm 1 (see Appendix B.3). Figure 3: Box plot of the number of clusters across 200 runs of Algorithm 2 on NC, MD, and PA. The normalized number of clusters refers to the number of clusters as a fraction of the baseline n Number and diameter of clusters. For Algorithm 2, we can adapt Lemma 2.2 to show that the expected number of clusters is at most 6n R . We plot the number of clusters for 200 runs in Figure 3, and note that the empirically observed value is consistently smaller than n/R. We next plot the maximum diameter of a cluster in each of the 200 runs of Algorithm 2 in Figure 4. Although the theoretical guarantee on the diameter of the largest cluster is 43R = 215, as stated in Theorem 2.1, the largest diameter of any cluster is empirically much smaller. 8. Conclusion We briefly mention some extensions. Analogous to Klein et al. (1993), our results naturally extend to graphs excluding a Kr,r minor, with the bounds worsening as r increases. Similarly, for general metric spaces, we can plug existing low diameter decompositions into Algorithm 4 or Algorithm 5, losing an additional O(log n) factor in f(ρ) com- Figure 4: Box plot of the maximum diameters of clusters for the states of NC, MD, and PA across 200 clusterings. The diameters are shown as a fraction of R. pared to the bounds in Table 1; see Appendix C for such a result. It would be interesting to define analogs for (CON) and (COMP) for general metrics and study their trade-offs with individual fairness, analogous to our results for planar graphs in Sections 3 and 5. Exploring whether our bounds in Table 1 are improvable for planar graphs would also be an interesting direction. Our work makes edge separation probabilities proportional; defining some notion of Paretooptimality in conjunction with this would be interesting. Our techniques can be viewed as randomly compressing a graph while preserving fairness among compressed pairs. This can be used as a pre-processing step to solve more complex optimization problems on the entire graph more efficiently. As an example, in Congressional redistricting, we may want to partition the graph into k parts each with an equal population, while preserving individual fairness. Though no worst-case fairness guarantees are possible for fairness in this setting, one can design efficient heuristics to optimize fairness for the compressed graph, and this is an avenue for future work. Similarly, exploring individual fairness for other optimization problems beyond graph clustering would be interesting. Acknowledgment. This work is supported by NSF grant CCF-2113798. Individual Fairness in Graph Decomposition Impact Statement This paper presents work whose goal is to advance the field of Machine Learning. There are many potential societal consequences of our work, none which we feel must be specifically highlighted here. Abraham, I., Gavoille, C., Gupta, A., Neiman, O., and Talwar, K. Cops, robbers, and threatening skeletons: Padded decomposition for minor-free graphs. SIAM Journal on Computing, 48(3):1120 1145, 2019. doi: 10.1137/17M1112406. URL https://doi.org/10. 1137/17M1112406. Altman, M. Modeling the effect of mandatory district compactness on partisan gerrymanders. Political Geography, 17(8):989 1012, 1998. Ashlagi, I. and Shi, P. Improving community cohesion in school choice via correlated-lottery implementation. Operations Research, 62(6):1247 1264, 2014. Backurs, A., Indyk, P., Onak, K., Schieber, B., Vakilian, A., and Wagner, T. Scalable fair clustering. In International Conference on Machine Learning, pp. 405 413. PMLR, 2019. Bartal, Y. Probabilistic approximation of metric spaces and its algorithmic applications. In Proceedings of 37th Conference on Foundations of Computer Science, pp. 184 193, 1996. doi: 10.1109/SFCS.1996.548477. Bera, S., Chakrabarty, D., Flores, N., and Negahbani, M. Fair algorithms for clustering. In Wallach, H., Larochelle, H., Beygelzimer, A., d Alch e-Buc, F., Fox, E., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL https://proceedings. neurips.cc/paper/2019/file/ fc192b0c0d270dbf41870a63a8c76c2f-Paper. pdf. Bourgain, J. On lipschitz embedding of finite metric spaces in hilbert space. Israel Journal of Mathematics, 52:46 52, 1985. URL https://api.semanticscholar. org/Corpus ID:121649019. Brubach, B., Chakrabarti, D., Dickerson, J., Khuller, S., Srinivasan, A., and Tsepenekas, L. A pairwise fair and community-preserving approach to k-center clustering. In International Conference on Machine Learning, pp. 1178 1189. PMLR, 2020. Celis, L. E., Straszak, D., and Vishnoi, N. K. Ranking with fairness constraints. ar Xiv preprint ar Xiv:1704.06840, 2017. Chawla, S. and Jagadeesan, M. Individual fairness in advertising auctions through inverse proportionality. In Braverman, M. (ed.), 13th Innovations in Theoretical Computer Science Conference, ITCS 2022, January 31 - February 3, 2022, Berkeley, CA, USA, volume 215 of LIPIcs, pp. 42:1 42:21. Schloss Dagstuhl - Leibniz-Zentrum f ur Informatik, 2022. Chen, X., Fain, B., Lyu, L., and Munagala, K. Proportionally fair clustering. In International Conference on Machine Learning, pp. 1032 1041. PMLR, 2019. Chierichetti, F., Kumar, R., Lattanzi, S., and Vassilvitskii, S. Fair clustering through fairlets. Advances in Neural Information Processing Systems, 30, 2017. Chouldechova, A. and Roth, A. The frontiers of fairness in machine learning. Ar Xiv, abs/1810.08810, 2018. Cohen-Addad, V., Klein, P. N., and Young, N. E. Balanced centroidal power diagrams for redistricting. In Proceedings of the 26th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, SIGSPATIAL 18, pp. 389 396, New York, NY, USA, 2018. Association for Computing Machinery. De Ford, D., Duchin, M., and Solomon, J. Recombination: A family of markov chains for redistricting. Harvard Data Science Review, 3(1), 3 2021. Devic, S., Kempe, D., Sharan, V., and Korolova, A. Fairness in matching under uncertainty. In Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., and Scarlett, J. (eds.), International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA, volume 202 of Proceedings of Machine Learning Research, pp. 7775 7794. PMLR, 2023. Duchin, M. Gerrymandering metrics: How to measure? what s the baseline? ar Xiv, 1801.02064, 2018. Dwork, C., Hardt, M., Pitassi, T., Reingold, O., and Zemel, R. S. Fairness through awareness. In Innovations in Theoretical Computer Science 2012, Cambridge, MA, USA, January 8-10, 2012, pp. 214 226, 2012. Esmaeili, S., Brubach, B., Tsepenekas, L., and Dickerson, J. Probabilistic fair clustering. Advances in Neural Information Processing Systems, 33:12743 12755, 2020. Fakcharoenphol, J., Rao, S., and Talwar, K. A tight bound on approximating arbitrary metrics by tree metrics. Journal of Computer and System Sciences, 69(3):485 497, 2004. Goemans, M. X. Lecture notes. https://math.mit.edu/ goemans/18409-2006/lec9.pdf, 2006. Individual Fairness in Graph Decomposition Harris, D. G., Li, S., Pensyl, T., Srinivasan, A., and Trinh, K. Approximation algorithms for stochastic clustering. Journal of Machine Learning Research, 20(153):1 33, 2019. URL http://jmlr.org/papers/v20/18-716. html. Hochbaum, D. S. and Shmoys, D. B. A best possible heuristic for the k-center problem. Mathematics of Operations Research, 10(2):180 184, 1985. ISSN 0364765X, 15265471. URL http://www.jstor. org/stable/3689371. Hsu, W.-L. and Nemhauser, G. L. Easy and hard bottleneck location problems. Discrete Applied Mathematics, 1(3):209 215, 1979. ISSN 0166-218X. doi: https://doi.org/10.1016/0166-218X(79)90044-1. URL https://www.sciencedirect.com/ science/article/pii/0166218X79900441. Huang, C.-C., Kavitha, T., Mehlhorn, K., and Michail, D. Fair matchings and related problems. Algorithmica, 74 (3):1184 1203, 2016. Joseph, M., Kearns, M., Morgenstern, J., Neel, S., and Roth, A. Fair algorithms for infinite and contextual bandits. ar Xiv preprint ar Xiv:1610.09559, 2016. Klein, P., Plotkin, S. A., and Rao, S. Excluded minors, network decomposition, and multicommodity flow. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, STOC 93, pp. 682 690, New York, NY, USA, 1993. Association for Computing Machinery. ISBN 0897915917. doi: 10.1145/167088.167261. URL https://doi.org/10.1145/167088.167261. Matouˇsek, J. Embedding Finite Metric Spaces into Normed Spaces, pp. 355 400. Springer New York, New York, NY, 2002. MGGG Lab. Mggg state shapefiles. https://github. com/mggg-states. Accessed January 2024. Miller, G. L., Peng, R., and Xu, S. C. Parallel graph decompositions using random shifts. In Proceedings of the Twenty-Fifth Annual ACM Symposium on Parallelism in Algorithms and Architectures, SPAA 13, pp. 196 203, New York, NY, USA, 2013. Association for Computing Machinery. ISBN 9781450315722. doi: 10.1145/2486159.2486180. URL https://doi. org/10.1145/2486159.2486180. Rao, S. Small distortion and volume preserving embeddings for planar and euclidean metrics. In Proceedings of the Fifteenth Annual Symposium on Computational Geometry, SCG 99, pp. 300 306, New York, NY, USA, 1999. Association for Computing Machinery. ISBN 1581130686. doi: 10.1145/304893.304983. URL https://doi.org/10.1145/304893.304983. Shen, Z., Wang, Z., Zhu, X., Fain, B., and Munagala, K. Fairness in the assignment problem with uncertain priorities. In Agmon, N., An, B., Ricci, A., and Yeoh, W. (eds.), Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2023, London, United Kingdom, 29 May 2023 - 2 June 2023, pp. 188 196. ACM, 2023. Individual Fairness in Graph Decomposition A. Omitted Proofs A.1. Proof of Lemma 2.2 Let C1, C2, C3 be the random variables denoting the additional number of components generated in phases 1, 2, 3 respectively, and let C = C1 + C2 + C3. Consider the first phase of Algorithm 1 and the associated BFS tree. We can pretend the edges not in the tree don t exist; this only decreases the number of components generated. Let a1, a2, . . . , aq denote the number of edges in levels 1, 2, . . . , q in the tree. We pad this sequence with zeros so that the q is a multiple of R. We have Pq i=1 ai = n 1 n. If a level j is cut, the number of additional components created is precisely aj. Since any level is chosen for cut with probability 1/R, we have E[C1] n/R. Suppose we have components of size m1, m2, . . . , mt at the end of the first phase. Applying the above argument again to each of these components, we have E[C2|C1] Pt i=1 mi/R n/R, since Pt i=1 mi n. A similar argument holds for E[C3], completing the proof of the first part by linearity of expectation. For the second part, suppose R = D/β. Then, the BFS tree in the first phase has a depth of at least D/2, so that at least β/4 components have a diameter of at least R/2. For each of these components, the BFS tree now has a depth at least R/2, so at least one component from each has a diameter of at least R/4. Similarly for the third phase, where we obtain a diameter at least R/8, completing the proof. A.2. Proof of Lemma 5.2 The proof is similar to that of Lemma 2.2. The same arguments follow for the first cut created. For the cuts in Step 3, we separately argue below. Let a1, a2, . . . , aq denote the number of edges in levels 1, 2, . . . , q of the tree. The expected number of additional components generated by the cut corresponding to k1 is at most To bound the expected number of components generated by the cut corresponding to k2, observe that the unconditional probability that any edge of the BFS tree is cut is again 1/R. Formally, the expected number of components is: x=0 Pr[k1 = x] i=1 ai Pr[k2 x i mod R|k1 = x] 1 R Pr[k2 x i mod R|k1 = x] = Pq i=1 ai The second requirement of (COMP) follows similarly as the proof of Lemma 2.2, with the observation that we only have 2D R cuts per phase in total for any outcome of randomness. B. Lower Bound for KPR: Proof of Theorem 1.2 B.1. Proof of Theorem 1.2 for d = 1 We start with the case of d = 1 in Figure 5. We present a constant-sized subgraph, but the arguments remain the same if we attach the graph to an arbitrary star graph with root r0. We also consider adversarial choices of roots in Algorithm 1, but we show later that a similar argument works for random choices of roots. Let r0 be the choice of the root. In all subsequent subgraphs, if r0 is not present then r1 is chosen as the root. If neither r0 nor r1 are present, r2 is chosen as the root. Let G0 be the initial graph and let T0 be the BFS tree from r0. The first observation is that u, v, r1, r2 belong to the same level of T0. Thus, no cut can separate these vertices. The only cut that modifies the graph cuts just below r0. For the next phase, the component containing u, v is a 4-cycle with a diagonal joining u, v. Let this graph be G1. In G1, r1 is chosen as the new root. The BFS tree T1 from r1 consists of r1 at level 0, u, v at level 1, and r2 at level 2. Observe that u, v cannot be separated in G1 because they are again at the same level. Subsequently, since R > 2, we only have two possible cuts: A cut that separates level 0 from levels 1 and 2, and a cut that separates level 2 from levels 0 and 1. In either case, the component containing u, v is a 3-cycle with u, v, and one of r1, r2. Without loss of generality, we consider the latter case so that we get a graph G2 in which we choose r2 as the final root. The Individual Fairness in Graph Decomposition Figure 5: Counterexample for d = 1. BFS tree from r2 in G2 again has u, v at the same level. Hence, u and v are never separated. B.2. Proof of Theorem 1.2 for d > 1 Figure 6: Counterexample for d 2. Each blue line is a path of length d. We now prove the statement for general d 2. We describe the construction of the graph we use in Figure 6. Each blue line in the graph is a path of length d. The straight line path between u, v that does not go through r1, r2 is of length d. Each vertex in this path is connected to r1, r2 via paths of length d. u, v, r1, r2 are all at distance d from r0. The red paths between u, v and r1, r2 are paths of length d. Each vertex in this path is connected to r0 via a path of length d. This is possible to do in a planar graph, as shown in the figure. The construction contains 6d2 + d 9d2 vertices. We can arbitrarily attach a star graph with root r0 to increase the number of vertices. However, the number of vertices n must satisfy n 9d2. Let r0 be the choice of the root. In all subsequent subgraphs, if r0 is not present then r1 is chosen as the root. If neither r0 nor r1 are present, r2 is chosen as the root. Let G0 be the initial graph and let T0 be the BFS Tree from r0. Observe that by construction, all the vertices on the red paths at level d in T0. Thus, they must all be in the same subgraph as u, v for the second phase. Let G1 be the component Individual Fairness in Graph Decomposition containing u, v in the second phase. We argue that G1 is roughly of two types , depending on where the cut is made in the first phase: Case 1. The cut is made above level d. This ensures that all vertices at levels d are in G1. Let the resulting graph be Ga 1. Note that r0 V (Ga 1). Case 2. The cut is made below level d. This ensures that all vertices at levels d are in G1. Let the resulting graph be Gb 1. Note that the T0 has at most 2d levels since all vertices of the graph are within 2d levels of r0. Furthermore, no cut can remove a vertex both below level d and above level d from G1 because R > 2d. We now consider these two cases, ignoring the case where the graph is not modified because that can only decrease the probability that u, v are separated. The next two lemmas complete the proof of Theorem 1.2. (a) Graph Ga 1. (b) Graph Ga 2. Figure 7: Case 1 in the proof of Theorem 1.2. Lemma B.1. u and v cannot be separated by the final two iterations (i = 2, 3) of Algorithm 1 run on Ga 1. Proof. In this case, the relevant part of the graph is shown in Figure 7a. Any vertex not shown is a vertex above level d in G0, and does not lie in any path from r1 to any of the vertices in Ga 1. We ignore them for ease of exposition, but it can be verified that they do not affect our arguments, specifically because they do not affect the positions of u and v in the BFS trees or their connectivity. Now, r1 is chosen as the root. Let Ga 2 be the component containing u, v in the next phase. Like before, only two kinds of cuts can modify the graph. 1. If the cut is made above level d, r2 lies in Ga 2 and r1 is not in Ga 2. 2. If the cut is made below level d, r1 lies in Ga 2 and r2 is not in Ga 2. Because of symmetry, we can assume wlog that the cut is made below level d. Regardless of where the cut is, the vertices and edges in Figure 7b are preserved in Ga 2. There may be some extra vertices that can safely be ignored again. We again choose r1 as the root. Observe that every vertex in the straight-line path (colored red) from u to v is at distance d from r1. Thus, no cut can separate the vertices in this path, and u, v must necessarily remain connected after any cut. Lemma B.2. u and v cannot be separated by the final two iterations (i = 2, 3) of Algorithm 1 run on Ga 2. Proof. The relevant part of the graph for phase 2 is of the form in Figure 8a. There are several paths from r1 or r2 to u or v that may or may not be present in Gb 1 but are not pictured. For our arguments, we only need to observe that the shortest paths from r1 and r2 to u and v are preserved in Gb 1. In Gb 1, making cuts below level d does not affect any of the relevant u, v paths. Thus, we assume that the cut is made above level d, giving us the graph Gb 2. The relevant vertices are pictured in Figure 8b. Like before, we can restrict ourselves to this graph because the remaining edges do not affect the BFS trees that we construct. For the next phase, we choose r1 as the root. The key observation is that there is a path from u to v entirely above level d in this tree through r1, and there is a disjoint path from u to v entirely below level d through r2. We now reuse the observation that any cut can either remove vertices below level d or vertices above level d, but not both. Thus, u and v remain connected regardless of the cut. Individual Fairness in Graph Decomposition (a) Graph Gb 1. (b) Graph Gb 2. Figure 8: Case 2 in the proof of Theorem 1.2. B.3. Additional Impossibility Results Algorithm 1 with additional iteration. The proof of Theorem 1.2 describes a pair u, v that remain connected after 3 phases of KPR. One might wonder if it is possible to do better with additional phases. Indeed, it turns out that we can obtain a non-zero probability of separation. This forms the basis of Algorithm 3. However, we also observe that simple modifications to Algorithm 1 can only give a very weak lower bound on the probability of separation. Let KPR+ denote Algorithm 1 with one additional phase. That is, we add one more iteration to step 1. Corollary B.3. For any n N and 1 d n 3 , there exists a planar graph Gd and vertices u, v V (G) with d(u, v) = d such that KPR+ with adversarial choices of roots in each component and parameter R > 2d, separates u, v with probability at most 8ρ4 uv. Proof. By running through the proof of Theorem 1.2 again, we can see that if, in any of the phases, the cut is made beyond the first 2d levels of the BFS Tree, then u, v will not be separated even in 4 phases. For example, if in the first phase, the cut is made beyond level 2d of the BFS Tree, then u, v cannot be separated in the remaining three phases because of Theorem 1.2. Similarly, if this happens in the second or third phases, u, v will never be separated in the remaining phases. Thus, we assume that for the first three phases, the cut is within the first 2d levels of the BFS tree. This happens with probability 2d R 3 = 8ρ3 uv. The final observation is that u, v are connected by a path of length d in the fourth phase regardless of whether the graph in the third phase is Ga 2 or Gb 2. Thus, they are at most d levels apart in any BFS tree and can be separated by probability at most ρuv. This gives us an overall probability of separation of at most 8ρ4 uv. Random choices of roots. The original algorithm of (Klein et al., 1993) chooses the nodes arbitrarily. A natural question is whether considering adversarial choices of roots is too pessimistic and whether the bad case in Theorem 1.2 can be avoided by a simple scheme such as choosing the root uniformly at random. Unfortunately, randomization does not provide any strength beyond possibly providing a 1 poly(n) lower bound. Recall that in the proof of Theorem 1.2, all the adversary needs is to enforce a preference ordering between the roots: If r0 is present in a component, it must be chosen as the root. If r0 is not present but r1 is, then it must be chosen. Finally, if neither r0 nor r1 is present, r2 should be chosen as the root. We argue that such a preference order can be enforced even in the presence of uniform sampling of roots. Corollary B.4. For any n N and 1 d n 1 8 , there exists a planar graph Gd and vertices u, v V (G) with d(u, v) = d such that Algorithm 1 with uniformly random choices of roots in each component and parameter R > 2d separates u, v with probability only O n 1 Proof. We take the construction in Theorem 1.2. Suppose that it has m vertices. We add star graphs with m4, m3, and m2 nodes with roots r0, r1, r2 respectively. Let the total number of nodes now be n = O(m4). Note that choosing a leaf attached to ri is equivalent to choosing ri for the purposes of the proof since the BFS tree from that node is similar to that of the BFS tree from ri. Individual Fairness in Graph Decomposition We now follow the proof of Theorem 1.2 and argue that the roots are chosen as required with high probability. In the first phase, the probability that r0 or a leaf attached to it is not chosen is m3+m2+m m4+m3+m2+m 1 m. In the second phase, if the graph was not modified in the first phase, the same calculation above applies. Similarly, in Gb 1, the probability that r0 is not chosen is again at most 1 m. In Ga 1, the probability that r1 is not chosen is at most m2+m m3+m2 1 m. A similar calculation can be done for both graphs Ga 2 and Gb 2 in the third phase. By union bounds, the proof is complete. C. Individual Fairness without (CON): Proof of Theorem 1.7 We now prove Theorem 1.7. Recall that an embedding is a function h that maps the vertices v of G to s-dimensional points h(v) (for some s) with the guarantee that the ℓ1 distance between h(u) and h(v) is at least d(u, v), and at most α d(u, v). The parameter α 1 is termed the distortion. The algorithm for embedding planar graphs into ℓ1 (Rao, 1999) has the following guarantee: Lemma C.1 ((Rao, 1999)). Given a planar graph with diameter , there is an embedding h into Euclidean space such that for any pair of vertices (u, v): d(u, v) h(u) h(v) 2 O( p log ) d(u, v). Algorithm 5 LDD via Euclidean Embedding Input: Integer R, planar graph G = (V, E) 1: Run Algorithm 1 to obtain clusters of diameter O(R). 2: for every connected component C found do 3: Embed C into a Euclidean metric via the algorithm in (Rao, 1999). Let h(v) be the embedding of v V and let the dimension be γ. 4: Let x be a γ-dimensional vector whose coordinates are i.i.d. N(0, 1). 5: Let q(v) = h(v) x. 6: Choose θ [0, R log R] uniformly at random. 7: For ℓ Z, place all vertices v s.t. q(v) [θ + ℓ R log R, θ + (ℓ+ 1)R log R) in sub-partition C(ℓ). 8: Output the set {C(ℓ)} of sub-partitions. 9: end for The algorithm for showing Theorem 1.7 is shown in Algorithm 5. Let d (u, v) = h(u) h(v) 2. Note that E[|q(v) q(u)|] = d (u, v), so that Pr[(u, v) are separated in Algorithm 5 ] = d (u, v) Observe that (u, v) are separated in Algorithm 5 with probability O(ρuv) or by using Lemma C.1 and observing that = O(R), in Algorithm 5 with probability O(ρuv). This shows f(ρ) = O(ρ). The lower bound follows by observing that (u, v) are separated in Algorithm 5 with probability at least ρ/ log R by the above equation. The algorithm does not satisfy (CON) since any two vertices have a non-zero probability of being in the same cluster because of the unbounded support of the normal distribution. The algorithm does not satisfy (COMP) either since, for any graph, there is a non-zero probability of separating the vertices in n clusters. The proof of Theorem 1.7 now follows. General Metrics. The above algorithm also extends to general metric spaces if we replace Algorithm 5 with the low diameter decomposition procedure in (Fakcharoenphol et al., 2004; Miller et al., 2013). This guarantees probability of separation f(ρ) = O(ρ log n). Now we embed the metric into Euclidean space suffering distortion O(log n) (Bourgain, 1985), and finally choose θ [0, R] at random before performing Algorithm 5. This guarantees g(ρ) = ρ and f(ρ) = O(ρ log n). D. Completing Proof of Theorem 5.3: Proof of Lemma 5.5 At a high level, our proof outline is similar to that for showing the bounded diameter of clusters in Algorithm 1 in (Goemans, 2006), where the overall goal is to show a proof by contradiction by exhibiting a K3,3 minor. We also borrow some of their Individual Fairness in Graph Decomposition Figure 9: The relative positions of the special vertices in T1 (red) and T2 (blue inset). All of u, v, w, z, r2 are within 3 ˆd levels of each other in T1 by virtue of them being in G2. In T2 , u, v, w are within 3 ˆd levels of each other by virtue of them being in G3. notation, though the meaning of the notation is slightly different in our context, as are the details of the construction and analysis. For instance, our supernodes that form the K3,3 minor are easily defined with respect to a few special vertices, as opposed to those in (Klein et al., 1993), which involve many auxiliary nodes. We also only need to look at two BFS trees instead of three, as in (Klein et al., 1993), although we make more cuts. Another important distinction is that our analysis inherently requires the randomness of Algorithm 3. For Algorithm 1, the proof that the clusters have a small diameter only hinges on cutting the BFS tree into slices of width R. In our case, the trees we analyze possess the necessary properties only if we condition on the events E1, E2, L1 and L2. D.1. Special Vertices We first show some simple properties of the graphs G1, G2, G3. The first claim follows from the definition of the BFS tree T1, and the occurrence of event E1 E2 L1 L2. Claim D.1. For any x V (G2), |ℓ1(x) ℓ1(u)| and |ℓ1(x) ℓ1(v)| are both at most 2 ˆd. Similarly, the next claim follows since L2 happens and u, v are still connected in G3. Claim D.2. For any x V (G3), |ℓ2(x) ℓ2(u)| and |ℓ2(x) ℓ2(v)| are at most 2 ˆd. Since the events L1 and L2 happen, and since di(u, v) d, and we defined ˆd = d 59, we have: Claim D.3. For i {1, 2}, we have: ℓi(u) and ℓi(v) are at least 28 ˆd. We now find two special vertices, w and z. Let P be the shortest path between u, v in G3. Let w P be such that d2(u, w) d 2 1 > 28 ˆd and d2(v, w) d 2 1 > 28 ˆd. By Claim D.2 and Claim D.3, we have ℓ2(w) 25 ˆd. Consider the path from r2 to w in T2. Let z be a vertex at distance 9 ˆd from w along this path. Note that z does not belong to G3. Since ℓ2(w) 25 ˆd, we know that z exists. The construction of these vertices is illustrated in Figure 9. Using Claim D.2 with x = w and noting that ℓ2(z) = ℓ2(w) 9 ˆd, we obtain: Claim D.4. |ℓ2(z) ℓ2(u)| and |ℓ2(z) ℓ2(v)| are at most 11 ˆd. D.2. Construction of K3,3 Minor First, the following is a straightforward property of BFS trees that we will use repeatedly. Claim D.5. For i {1, 2}, consider a shortest path Q in Gi from ri to any vertex x. For any x Q, if both x, x Gi+1, then every vertex between x, x in Q also lies in Gi+1. Individual Fairness in Graph Decomposition In the narrative below, we assume x {u, v, z}. Define P1(x) as the shortest path from r1 to x in G1. Further, define P 1 (x) = {x P1(x) | x G1 \ G2} P + 1 (x) = {x P1(x) | x G2}. Using Claim D.5, for x {u, v, z}, P 1 (x) and P + 1 (x) are a partition of P1(x) into two subpaths. Similarly, for x {u, v, z}, we define P2(x) as the shortest path from r2 to x in G2. Let P 2 (x) = {x P2(x) | ℓ2(x ) < ℓ2(x) 4 ˆd} P + 2 (x) = {x P2(x) | ℓ2(x ) ℓ2(x) 4 ˆd}. That is, we partition the path into two subpaths based on the distance from r2. The 4 ˆd vertices closest to x along the path lie in P + 2 (x). Again, by Claim D.5, these form sub-paths of P2(x). Recall that we had assumed for contradiction the existence of a path P from u to v that survived E1 E2, and that passes through w. For y {u, v}, we define Pw(y) as the subpath of P from w to x in G2. Define Pw(z) as the path from w to z in T2. Recall that this is part of the shortest path from r2 to w in G2. We again partition the first 4 ˆd vertices of these three paths from x {u, v, z} as P + w (x) and the remaining part as P w (x). Since Pw(u) and Pw(v) are shortest paths in G3, we have P w (x) = {x Pw(x) | d3(x , w) < d3(x, w) 4 ˆd} P + w (x) = {x Pw(x) | d3(x , w) d3(x, w) 4 ˆd}. for x {u, v}. Similarly, since Pw(z) is a shortest path in G2, we have P w (z) = {x Pw(z) | d2(x , w) < d2(z, w) 4 ˆd} P + w (z) = {x Pw(z) | d2(x , w) d2(z, w) 4 ˆd}. We have the following useful properties of these partitions. Claim D.6. For x {u, v, z} and every y {1, 2, w}, every vertex in P + y (x) is within distance 4 ˆd of x in G2. Proof. This is true by definition for P + 2 (x). For P + 3 (x), this also follows from the definition because the distance between pairs can only increase from G2 to G3. Consider some vertex x P + 1 (x). Applying Claim D.1 on both x and x, we get that |ℓ1(x ) ℓ1(x)| 4 ˆd. But since P + 1 (x) is a shortest path from r1 to x, this means that the path from x to x can be at most 4 ˆd long since ℓ1(x ) increases linearly along the path from r1 to x. Since this entire path survives in G2 (from Claim D.5), it must also mean that the distance between x and x in G2 must also be at most 4 ˆd. We finally describe the K3,3 minor. We set A1 = P 1 (u) P 1 (v) P 1 (z) A2 = P 2 (u) P 2 (v) P 2 (z) A3 = P w (u) P w (v) P w (z) A4 = P + 1 (u) P + 2 (u) P + w (u) A5 = P + 1 (v) P + 2 (v) P + w (v) A6 = P + 1 (z) P + 2 (z) P + w (z) The following lemma will complete the proof of Lemma 5.5. Lemma D.7. After contracting each Ai, ({A1, A2, A3}, {A4, A5, A6}) form a K3,3 graph. Individual Fairness in Graph Decomposition D.3. Proof of Lemma D.7 The rest of the section is devoted to proving this lemma. First, it can be seen that each Ai is a connected subgraph. It is also clear that the required edges for the contracted graph to be a K3,3 are present. We can think of representing the supernodes A1 through A6 by the vertices r1, r2, w, u, v, and z respectively. The path between r1 and u is partitioned between A1 and A4, and hence, the supernodes have an edge between them. Similarly, the path between r2 and u is partitioned between A2 and A4, and the path between w and u is partitioned between A3 and A4. It remains to be argued that each Ai is disjoint from the others. The next set of claims will show that, hence completing the proof of Lemma D.7. Claim D.8. A1 is disjoint from every other Aj. Proof. This follows because A1 is the only set that lies in G1 \ G2. The others all lie in G2. Claim D.9. A4, A5, A6 are all mutually disjoint. Proof. Assume for a contradiction that there is some x A4 A5. Then d2(u, v) d2(u, x) + d2(x, v) 8 ˆd from Claim D.6 which is a contradiction since d2(u, v) d. Instead, if there is some x A4 A6, then we have d2(u, w) d2(u, x) + d2(x, z) + d2(z, w) 13 ˆd which is a contradiction since d2(u, w) d 2. Similarly if there is some x A5 A6. Claim D.10. A2 is disjoint from A4, A5, A6. Proof. We first prove this for A4. The proof is symmetric for A5. Assume for a contradiction that there is some x A2 A4. We have 3 cases depending on what part of A2 that x lies in. x P 2 (u). P 2 (u) is disjoint from P + 2 (u) by definition so x cannot belong to it. By definition of P 2 (u) and because P2(u) is a shortest path from r2 to u, we have d2(x, u) > 4 ˆd. However, this means that it cannot belong to P + 1 (u) or P + w (u) because it would contradict Claim D.6. x P 2 (v). Since x A4, Claim D.6 implies that d2(x, u) 4 ˆd which implies |ℓ2(x) ℓ2(u)| 4 ˆd. But since event L2 happened, this then implies that |ℓ2(x) ℓ2(v)| 5 ˆd. Since x lies on the shortest path from r2 to v, this implies that d2(x, v) 5 ˆd. Thus, we have d2(u, v) d2(x, v) + d2(x, u) 9 ˆd which is a contradiction. x P 2 (z). This is similar to the case of P 2 (v). Since x A4, Claim D.6 implies that d2(x, u) 4 ˆd which also implies |ℓ2(x) ℓ2(u)| 4 ˆd. But Claim D.4 then implies that |ℓ2(x) ℓ2(z)| 15 ˆd. Since x lies on the shortest path from r2 to z, this implies that d2(x, z) 15 ˆd. Thus, we have d2(u, w) d2(x, z) + d2(x, u) 19 ˆd which is a contradiction since we defined w with d2(u, w) 27 ˆd. We now prove it for A6. The proof is similar. Assume for a contradiction that x A2 A6. We again have 3 cases depending on what part of A2 that x lies in. x P 2 (z) is disjoint from P + 2 (z) by definition. By definition of P 2 (z) and because P2(z) is a shortest path from r2 to z, we have d2(x, z) > 4 ˆd. However, this means that it cannot belong to P + 1 (z) or P + w (z) because it would contradict Claim D.6. x P 2 (v). Since it also belongs to A6, Claim D.6 implies that d2(x, z) 4 ˆd which also implies |ℓ2(x) ℓ2(z)| 4 ˆd. But Claim D.4 then implies that |ℓ2(x) ℓ2(v)| 15 ˆd. Since x lies on the shortest path from r2 to v, this implies that d2(x, v) 15 ˆd. Thus, we have d2(w, v) d2(x, v) + d2(x, z) + d2(z, w) 28 ˆd which is a contradiction to d2(u, w) > 28 ˆd. x P 2 (u). This is symmetric to the case of x P 2 (v). This completes the proof. Claim D.11. A2 is disjoint from A3. Individual Fairness in Graph Decomposition Proof. This primarily follows by observing the levels of the vertices in G2. Assume for a contradiction that there is some x A2 A3. We have 3 cases depending on what part of A2 that x lies in. x P 2 (z). Note that x cannot also lie in P w (z) because P2(z) and Pw(z) are partitions of P2(w). We also have ℓ2(x) < ℓ2(z) 4 ˆd = ℓ2(w) 13 ˆd by definition. By applying Claim D.2 on w and combining this with the previous inequality, we get ℓ2(x) < ℓ2(u) 11 ˆd. This implies that x is not in G3. If x also lied in P w (u) or P w (v), this would contradict Claim D.2 because Pw(u) and Pw(v) are paths in G3. x P 2 (u). By definition, we have ℓ2(x) < ℓ2(u) 4 ˆd. If x also lied in P w (u) or P w (v), this would contradict Claim D.2 because Pw(u) and Pw(v) are paths in G3. If x P w (z), then d2(x, w) 9 ˆd because Pw(z) is only that long. Since Pw(z) is part of the shortest path from r2 to w, we also have |ℓ2(x) ℓ2(w)| 9 ˆd. Applying Claim D.2 on w, we get |ℓ2(x) ℓ2(u)| 11 ˆd. But since x also lies on the shortest path from r2 to u, we also get d2(x, u) 11 ˆd. Thus, we get d2(w, u) d2(w, x) + d2(x, u) 20 ˆd which is a contradiction. x P 2 (v) follows similarly by symmetry. Claim D.12. A3 is disjoint from A4, A5, A6. Proof. We first show that A3 is disjoint from A4. The proof that it is disjoint from A5 follows by symmetry. Assume for a contradiction that there is some x A3 A4. We have 3 cases depending on what part of A3 that x lies in. x P w (u). By definition, it is disjoint from P + w (u). Suppose that x also lied in P + 1 (u) or P + 2 (u). This means that x V (G3). By Claim D.5, this means that every vertex between x and u in P + 1 (u) (or P + 2 (u)) also lies in G3 and hence, d3(u, x) 4 ˆd by Claim D.6. This is a contradiction because x P w (u) implies that d3(x, u) > 4 ˆd by definition. x P w (v). It must be disjoint from P + w (u) because they are both disjoint subsets of the path Pw(u) Pw(v) from u to v. Suppose x also lied in P + 1 (u) or P + 2 (u). By Claim D.5, this means that every vertex between x and u in P + 1 (u) (or P + 2 (u)) also lies in G3 and hence, d3(u, x) 4 ˆd by Claim D.6. This contradicts the fact that Pw(u) Pw(v) was a shortest path in G3 because the path u x v is strictly shorter, because d3(u, x) 4 ˆd < d 2 |Pw(u)| and x v is a subpath of Pw(v). x P w (z). Suppose x also lied in A4. By Claim D.6, d2(u, x) 4 ˆd. Thus, we get d2(u, w) d2(u, x)+dw(x, w) 13 ˆd since x lies on Pw(z) which has length 9 ˆd. This is a contradiction. Now, we show that A3 is disjoint from A6. Again, suppose there is some x A3 A6. We have 3 cases. x P w (z). By definition, it is disjoint from P + w (z). We have d2(x, z) > 4 ˆd by definition. If x also lied in P + 1 (z) or P + 2 (z), then this contradicts Claim D.6. x P w (u). If x P w (u), then x V (G3) and hence Claim D.2 implies that |ℓ2(x) ℓ2(u)| 2 ˆd. But if x A6, then Claim D.6 implies that d2(x, z) 4 ˆd = |ℓ2(z) ℓ2(x)| 4 ˆd. We also know that ℓ2(z) = ℓ2(w) 9 ˆd. Thus, it must be |ℓ2(w) ℓ2(x)| 5 ˆd. Applying Claim D.2 on w gives us |ℓ2(x) ℓ2(w)| 3 ˆd. This contradicts the earlier inequality. x P w (v) follows by symmetry. This completes the proof. These claims complete the proof of Lemma D.7, hence that of Lemma 5.5, and hence Theorem 5.3.