# parameterized_approximation_schemes_for_fairrange_clustering__12f9df16.pdf

Parameterized Approximation Schemes for Fair-Range Clustering

Zhen Zhang1,2, Xiaohong Chen1,2, , Limei Liu1,2, Jie Chen1,2, Junyu Huang3, Qilong Feng3,2

1National Key Laboratory Cultivation Base for Data Intelligence and Smart Society, Hunan University of Technology and Business, Changsha 410205, China 2Xiangjiang Laboratory, Changsha 410205, China 3School of Computer Science and Engineering, Central South University, Changsha 410083, China zz@hutb.edu.cn, csu_cxh@163.com, seagullm@163.com, chemjay@hnu.edu.cn, junyuhuang@csu.edu.cn, csufeng@mail.csu.edu.cn

Fair-range clustering extends classical clustering formulations by associating each data point with one or more demographic labels. It imposes lower and upper bound constraints on the number of facilities opened for each label, ensuring fair representation of all demographic groups by the selected facilities. In this paper we focus on the fair-range k-median and k-means problems in Euclidean spaces. We give (1 + ε)-approximation algorithms with ﬁxed-parameter tractable running times for both problems, parameterized by the numbers of opened facilities and demographic labels. For Euclidean metrics, these are the ﬁrst parameterized approximation schemes for the problems, improving upon the previously known O(1)-approximation ratios given by Thejaswi et al. (KDD 2022).

1 Introduction

Clustering seeks to partition a given set of clients into disjoint, cohesive clusters. Among the many formalizations of clustering, the k-median and k-means problems are perhaps the most prevalent ones, owing to the concise nature of their descriptions. Given a set of clients and facilities in a metric space along with a positive integer k, the k-median and k-means problems aim to open at most k facilities and connect each client to the nearest opened facility, such that the sum of the client-connection costs is minimized. In the k-median problem, the connection cost of each client is its distance to the corresponding facility, while in the k-means problem, it is the squared distance. Despite their seemingly simple deﬁnitions, the k-median and k-means problems are computationally challenging, and the development of their approximation algorithms continues to be a vibrant area of research. The current best approximation guarantees are the ratios of 2.613 [Gowda et al., 2023] for the k-median problem and 9 [Ahmadian et al., 2020] for the k-means problem.

The k-median and k-means problems are designed to maximize the similarity between clients and their corresponding facilities, allowing the opened facilities to be considered representative points for the client set. This understanding underscores the important roles that the k-median and k-means problems play in data summarization [Moens et al., 1999, Girdhar and Dudek, 2012]. However, algorithms developed for these problems can often yield unfair summarization of socioeconomic data, as they prioritize minimizing the clustering costs over considering the distribution of demographic labels (e.g., gender, age, race) associated with the opened facilities [Kay et al., 2015]. Driven by this

Corresponding Authors

38th Conference on Neural Information Processing Systems (Neur IPS 2024).

rationale, there has been considerable interest in fair-range clustering. Given a set of data points associated with demographic labels, fair-range clustering extends classical clustering formulations by imposing lower and upper bound constraints on the number of opened facilities associated with each label, thereby ensuring fairness across different demographic groups.

An instance (ℓ, k, C, F, α, β, ρ, τ) of the fair-range clustering problem is speciﬁed by positive integers ℓand k, sets C of clients and F of facilities in a metric space, vectors α = (α1, , αℓ) and β = (β1, , βℓ) of ℓpositive integers satisfying αt βt for each t {1, , ℓ}, and integer ρ 1, where each f F is associated with a set τ(f) {1, , ℓ} of demographic labels. A feasible solution to the instance is speciﬁed by a subset H F of no more than k facilities satisfying |{f H : t τ(f)}| [αt, βt] for each t {1, , ℓ}, and the cost of the solution is P

c C minf H δρ(c, f), where δ is the distance function. The goal of the fair-range clustering problem is to identify a feasible solution with minimum cost.

The fair-range clustering problem is equivalent to the fair-range k-median (Fk Med) problem when ρ = 1, and to the fair-range k-means (Fk Means) problem when ρ = 2. Despite their signiﬁcance in applications requiring fair representations, the Fk Med and Fk Means problems pose signiﬁcantly greater computational challenges than classical clustering problems. As demonstrated by Thejaswi et al. [2021], designing polynomial-time algorithms with provable approximation guarantees for the Fk Med and Fk Means problems is unlikely, as determining the existence of feasible solutions to their instances is NP-hard. For a simpliﬁed scenario where each facility is associated with a single demographic label, Thejaswi et al. [2021] showed that the Fk Med and Fk Means problems can be reduced to the well known matroid clustering problem, which admits constant-factor approximation algorithms [Krishnaswamy et al., 2011, Li, 2011, Swamy, 2014, Friggstad and Zhang, 2016, Krishnaswamy et al., 2018], albeit with an O(k)ℓ 1 multiplicative overhead in algorithmic running time. Hotegni et al. [2023] latter gave an improved reduction to the matroid clustering problem that eliminates the O(k)ℓ 1 overhead. They further demonstrated that solving the Fk Med and Fk Means problems in this simpliﬁed scenario can be achieved even more efﬁciently than solving matroid clustering problems, based on smaller-size linear programs.

In practical scenarios concerning clustering problems, the number of opened facilities (i.e., k) is often considerably smaller than the input size. As such, assuming k to be small and treating it as a ﬁxed parameter is a commonly used way for simplifying these problems, as exempliﬁed in [Cohen-Addad et al., 2019, Goyal and Jaiswal, 2023, Chen et al., 2024, Jaiswal et al., 2024]. Unfortunately, the Fk Med and Fk Means problems have been demonstrated to remain challenging even with this simpliﬁcation: When both k and the number of demographic labels (i.e., ℓ) are ﬁxed parameters, Thejaswi et al. [2022] established the W[2]-hardness of the Fk Med and Fk Means problems, suggesting that exactly solving them in ﬁxed-parameter tractable time (denoted as FPT(k, ℓ) time, meaning n O(1)h(k, ℓ) for an input size of n and a positive function h) is unlikely; Cohen-Addad et al. [2019] showed that the best possible approximation ratios of FPT(k, ℓ)-time algorithms, even for the case where ℓ= 1, cannot be better than 1 + 2e 1 for the Fk Med problem and 1 + 8e 1 for the Fk Means problem. This matches the FPT(k, ℓ)-time (1 + 2e 1 + ε)-approximation algorithm for the Fk Med problem and (1 + 8e 1 + ε)-approximation algorithm for the Fk Means problem given by Thejaswi et al. [2022] for a simpler case considering only the lower bound constraint. Notably, the method for enumerating feasible constraint patterns given by Thejaswi et al. [2022] demonstrates that their algorithms can be effortlessly extended to accommodate the case involving both lower and upper bounds.

The negative result presented by Cohen-Addad et al. [2019] suggests that we cannot hope to approximate the Fk Means problem with a ratio better than 1 + 8e 1 and the Fk Med problem with a ratio better than 1 + 2e 1 in FPT(k, ℓ) time when considering general metric spaces. However, this result does not preclude the possibility of achieving better approximations for these problems in more structured settings, such as Euclidean spaces, since the analysis in Cohen-Addad et al. [2019] is limited to general metrics. In this paper, we take the ﬁrst step toward exploring the properties of Euclidean metrics for the Fk Med and Fk Means problems. Our approach yields FPT(k, ℓ)-time approximation schemes, as stated in Theorem 1 in Section 3.4.

1.1 Other Related Work

Due to the prevalence of Euclidean data in real-world applications involving clustering, signiﬁcant attention has been devoted to developing algorithms that leverage the properties of Euclidean spaces.

Exploring these properties often leads to improved approximation guarantees. One such example can be found in [Cohen-Addad et al., 2022], where a (2.406 + ε)-approximation algorithm for the k-median problem and a (5.912 + ε)-approximation algorithm for the k-means problem in Euclidean spaces are proposed, improving upon the state-of-the-art approximation ratios for these problems under general metrics [Gowda et al., 2023, Ahmadian et al., 2020]. Furthermore, it has been shown that the Euclidean k-median and k-means problems admit approximation schemes2 if k is a ﬁxed parameter and the opened facilities can be located arbitrarily [Kumar et al., 2010, Jaiswal et al., 2014, Bhattacharya et al., 2018, Ding and Xu, 2020]. These algorithms identify a subset of each client-cluster deﬁned by an optimal solution and approximate the corresponding opened facility by the centroid of this subset. However, similar ideas are not applicable to the Fk Med and Fk Means problems, as they involve ﬁnite sets of facilities and hard constraints on the labels of the opened facilities. In these cases, the centroids of the considered subsets are not guaranteed to be feasible as opened facilities.

Constraints on the number of opened facilities associated with different labels were ﬁrst introduced by Hajiaghayi et al. [2010, 2012], inspired by budget considerations for the deployment of servers in content distribution networks. From then on, related clustering problems have been widely explored. When we are provided with an upper bound constraint and each facility is associated with a single label, the problems represent special cases of the matroid clustering problems and directly motivate research into the latter [Krishnaswamy et al., 2011]. For the lower-bounded case, there are FPT(k, ℓ)- time approximation algorithms for the k-median and k-means cost functions (given by Thejaswi et al. [2022], as previously mentioned in Section 1), and a multi-swap local-search heuristic yields an O(ℓ)-approximation for the k-median cost function if each facility has a single label [Thejaswi et al., 2021, Zhang et al., 2024].

In addition to imposing constraints on the distribution of labels associated with opened facilities, fair clustering has been extensively studied under various other settings that introduce different types of constraints. For example, group fairness requires each cluster to provide a fair representation of different demographic groups [Chierichetti et al., 2017, Bera et al., 2019, Bandyapadhyay et al., 2021, Dai et al., 2022, Wu et al., 2024], proportional fairness ensures that no subset of clients, of a given size, can ﬁnd a closed facility that provides a lower connection cost to each of its members [Chen et al., 2019, Micha and Shah, 2020], individual fairness requires that the distance from each client to the nearest opened facility does not exceed a client-speciﬁed threshold [Jung et al., 2020, Mahabadi and Vakilian, 2020, Negahbani and Chakrabarty, 2021, Vakilian and Yalçiner, 2022, Ahmadi et al., 2022, Bateni et al., 2024], and social fairness aims to minimize the maximum clustering cost among groups of clients [Abbasi et al., 2021, Ghadiri et al., 2021, Makarychev and Vakilian, 2021, Goyal and Jaiswal, 2023, Abbasi et al., 2024].

1.2 Preliminaries

From now on, we consider an instance I = (ℓ, k, C, F, α, β, ρ, τ) of the fair-range clustering problem satisfying ρ {1, 2}, |C F| = n, and C F Rd, along with a constant ϵ (0, 0.5). Given an integer i 1 and a set D, deﬁne [i] = {1, , i}, and let [D]i be the Cartesian product D D | {z } i .

Given a point x and a set P of points in an Euclidean space, let δ(x, P) = minp P ||x p|| denote the distance from x to its nearest point in P, and let δi(x, P) = minp P ||x p||i for each i 1.

The following algebraic fact will be utilized in the analysis of the running times of our algorithms.

Lemma 1 Given two real numbers s and t greater than 1, we have logt s max{s, t O(t)}.

The following lemma extends triangle inequality3 to squared Euclidean metrics.

Lemma 2 Given three points x, y, and z in an Euclidean space and a real number γ (0, 1], we have ||x z||2 (1 + γ 1)||x y||2 + (1 + γ)||y z||2.

We will also consider the weighted version of the fair-range clustering problem, which can be deﬁned as follows.

2An approximation scheme is a (1 + ε)-approximation algorithm, where ε is an arbitrary small constant. 3Given three points x, y, and z in an Euclidean space, we have ||x z|| ||x y|| + ||y z||.

Figure 1: (a) The client nearest to the opened facility f is taken as the leader, around which an annular search space is constructed; (b) the center point f is opened in our solution.

Deﬁnition 1 (weighted fair-range clustering) An instance (ℓ, k, C, F, α, β, ρ, τ) of the fair-range clustering problem can be extended to its weighted version (ℓ, k, C, F, α, β, ρ, τ, w) by associating each client c C with a weight w(c) 1. This extension modiﬁes the cost of a feasible solution H F from P

c C δρ(c, H) to P

c C w(c)δρ(c, H).

2 An Overview of Our Algorithms

The FPT(k, ℓ)-time approximation algorithms for the Fk Med and Fk Means problems given by Thejaswi et al. [2022] follow the framework outlined in [Cohen-Addad et al., 2019]. This framework identiﬁes the nearest client to each facility opened in the considered optimal solution as a leader and introduces a set of annuli centered at each leader. Each annulus is deﬁned such that its outer radius is 1 + ε times its inner radius. The framework then enumerates the annuli to identify those that contain the facilities corresponding to the leaders and selects the opened facilities within these annuli, as illustrated in Figure 1-(a). Intuitively, the deﬁnition of the annuli and triangle inequality imply an upper bound on the distances from the selected facilities to the optimal ones. Building upon this insight, Thejaswi et al. [2022] utilized a submodular maximization method to select facilities to be opened within the annuli and demonstrated constant-factor approximation ratios.

We similarly base our algorithms on the framework proposed by Cohen-Addad et al. [2019]. Our approach focuses on exploring the properties of Euclidean metrics to further reﬁne the selection range of opened facilities. Speciﬁcally, we partition each annulus into a set of smaller cells; for each facility opened in the optimal solution, we select the center point of the cell containing it as the facility to be opened, as shown in Figure 1-(b). This process involves carefully balancing the number of cells, which affects the time required to identify the desired cells, against the sizes of the cells, which inﬂuence the distance from each facility opened in the optimal solution to the center point of the cell containing it, as well as our approximation ratio. We achieve this trade-off by constructing nets as deﬁned below. Deﬁnition 2 (γ-net [Gupta et al., 2003]) Given a density parameter γ > 0, a set P Rd, and a subset R P, we call R a γ-net of P if each p R satisﬁes δ(p, R\{p}) γ and each p P satisﬁes δ(p, R) γ.

We partition the annular search space into cells using a set of nets for the facilities. The trade-off between the number and sizes of the cells can be managed by adjusting the density parameters (i.e., the parameter λ in Deﬁnition 2). For each annulus centered around a leader and containing its corresponding facility (the one opened in the optimal solution), we estimate the demographic labels associated with this facility and identify the subset of facilities within the annulus that share these labels. A net is then constructed for this subset, using a density parameter carefully determined by the radius of the annulus. Given the Voronoi diagram deﬁned by the net, Deﬁnition 2 suggests that the facility opened in the optimal solution is close to the center point of its corresponding Voronoi cell. By considering each member of all constructed nets as a candidate for an opened facility, we can ensure that the candidate set includes a subset closely approximating the optimal solution.

It remains to consider how to bound the running time within FPT(k, ℓ) time. The algorithms given by Cohen-Addad et al. [2019] and Thejaswi et al. [2022] start with constructing a coreset, that is, a small weighted subset of the client set whose distribution closely approximates that of the full set.

This facilitates the efﬁcient identiﬁcation of leaders by enumerating the coreset. In this paper, we face additional challenges in bounding the running time. For instance, when partitioning the annuli into cells, the number of cells can depend exponentially on the spatial dimension. To ensure that we can deal with the Fk Med and Fk Means problems in high-dimensional Euclidean spaces within FPT(k, ℓ) time, we map the considered instance to a space of O(log k+log log n) dimensions using a combination of the method for constructing coresets given by Chen [2009] and Johnson-Lindenstrauss transform. Combining this data-reduction technique with our net-based approach for selecting opened facilities, we give FPT(k, ℓ)-time (1 + ε)-approximation algorithms for the Fk Med and Fk Means problems.

3 The Algorithms

We now present our algorithms for the Fk Med and Fk Means problems. In Section 3.1, we introduce our data-reduction method for decreasing the size of the considered instance. In Section 3.2, we construct annular search spaces for the facilities to be opened, using the leaders from the client set. Section 3.3 details the construction of nets for the facilities, based on which we provide approximation schemes in low-dimensional spaces. Finally, in Section 3.4, we show how to combine the data-reduction method with the algorithms designed for low-dimensional spaces to deal with high-dimensional instances of the Fk Med and Fk Means problems.

3.1 Data Reduction

In this section we map instance I to a smaller weighted instance in a low-dimensional space. As mentioned in Section 2, we achieve this using the coreset-construction method given by Chen [2009] and Johnson Lindenstrauss transform, which are detailed in the following two lemmata.

Lemma 3 (Chen [2009]) Given a constant ϵ (0, 0.5), a set P Rd, an integer t > 0, and an integer ρ {1, 2}, a subset P P with a weight function w : P [1, + ) satisfying |P | d(tε 1 log |P|)O(1) and P

p P w(p) = |P| can be constructed in O(|P|dt) time, such that each H Rd with |H| t satisﬁes P

p P w(p)δρ(p, H) [1 ϵ, 1 + ϵ] P

p P δρ(p, H).

Lemma 4 (Johnson and Lindenstrauss [1984], Ailon and Chazelle [2009]) Given a constant ϵ (0, 0.5) and a set P Rd, we can construct a mapping ϕ : Rd R d satisfying d = O(ϵ 2 log |P|) and ||ϕ(p1) ϕ(p2)|| [1, 1 + ϵ]||p1 p2|| for each p1, p2 P in O(d log d) + (ϵ 1 log |P|)O(1) time.

The following lemma is a stronger version of Johnson Lindenstrauss transform, which preserves distances over a broader range through terminal embedding. Speciﬁcally, it modiﬁes the condition for each p1, p2 P in Lemma 4 to for each p1 P and p2 Rd .

Lemma 5 (Narayanan and Nelson [2019]) Given a constant ϵ (0, 0.5) and a set P Rd, we can construct a mapping ϕ : Rd R d satisfying d = O(ϵ 2 log |P|) and ||ϕ(p1) ϕ(p2)|| [1, 1 + ϵ]||p1 p2|| for each p1 P and p2 Rd in (|P|dϵ 1)O(1) time.

It can be assumed that each mapping ϕ : Rd R d constructed by Lemma 4 and Lemma 5 is injective. Such an assumption is made without loss of generality: We can create duplicates of the points in R d that have multiple preimages under ϕ. This ensures that we can always differentiate ϕ(x) and ϕ(y) for any two distinct points x and y in Rd, even if ϕ(x) and ϕ(y) have identical values across all dimensions. Distinguishing the images of the points from Rd is essential in fair-range clustering problems because points with the same dimensional values can have different demographic labels.

Our data-reduction method, which combines Lemma 3, Lemma 4, and Lemma 5, is presented in Algorithm 1 and illustrated in Figure 2 (this ﬁgure outlines the processing ﬂow for the clients). This algorithm ﬁrst leverages Lemma 3 within the O(log n)-dimensional space constructed by Lemma 4, such that the client set can be replaced with a coreset of size logarithmically dependent on n and independent of d. Next, to reduce dimensions while preserving the distances between each client in the coreset and any facility, Algorithm 1 uses Lemma 5 with the coreset as input to construct an O(log k + log log n)-dimensional space. The following lemma provides the performance guarantees of Algorithm 1.

Algorithm 1: The data-reduction method

Input: A constant ϵ (0, 0.5) and an instance (ℓ, k, C, F, α, β, ρ, τ) of the fair-range clustering problem satisfying C F Rd

Output: A mapping ϕ : Rd R d and an instance (ℓ, k, C, F, α, β, ρ, τ, w) of the weighted fair-range clustering problem satisfying ρ {1, 2}, C F R d, F = {ϕ(f) : f F}, and τ(ϕ(f)) = τ(f) for each f F

1 Let ϕ1 : Rd Rd be the mapping constructed by Lemma 4 with (ϵ, C F) as the input;

2 Let C be the weighted set constructed by Lemma 3 with (ϵ, {ϕ1(c) : c C}, k, ρ) as the input, and let w : C [1, + ) be the corresponding weight function;

3 Let ϕ2 : Rd R d be the mapping constructed by Lemma 5 with (ϵ, C ) as the input;

4 ϕ ϕ2 ϕ1, C {ϕ2(c) : c C }, F {ϕ(f) : f F};

5 return ϕ : Rd R d, (ℓ, k, C, F, α, β, ρ, τ ϕ 1, w ϕ 1 2 ).

C {ϕ1(c) : c C}

C {ϕ2(c) : c C }

Figure 2: Given a set C Rd of clients, Algorithm 1 ﬁrst maps it to Rd using mapping ϕ1 : Rd Rd . It then constructs a coreset C for {ϕ1(c) : c C}. Finally, the algorithm maps C to R d using mapping ϕ2 : Rd R d.

Lemma 6 Given a constant ϵ (0, 0.5) and an instance (ℓ, k, C, F, α, β, ρ, τ) of the fair-range clustering problem with ρ {1, 2}, |C F| = n, and C F Rd, Algorithm 1 constructs a mapping ϕ : Rd R d and an instance (ℓ, k, C, F, α, β, ρ, τ, w) of the weighted fair-range clustering problem in O(d log d) + (nkϵ 1)O(1) time, which satisfy the following properties:

c C w(c) = |C|,

(ii) w(c) 1 for each c C,

(iii) | C| (kϵ 1 log n)O(1),

(iv) d = ϵ O(1)(log k + log log n), and

c C w(c)δρ(c, {ϕ(f) : f H}) [1 ϵ, (1 + ϵ)2ρ+1] P

c C δρ(c, H) for each H F with |H| k.

3.2 The Annular Search Spaces

In this section we construct k annular search spaces, each corresponding to one of the k facilities to be opened. We ﬁrst introduce some notations. Let ϕ : Rd R d be the mapping and I = (ℓ, k, C, F, α, β, ρ, τ, w) be the weighted instance constructed by Algorithm 1 with (ϵ, I) as the input, where C F R d, F = {ϕ(f) : f F}, and each f F satisﬁes τ(ϕ(f)) = τ(f). Let H = {f 1 , , f k} be an optimal solution to I, and let opt = P

c C w(c)δρ(c, H ) denote its cost. For each i [k], let Li = {f F : τ(f) = τ(f i )} denote the set of facilities that have the same set of demographic labels as f i , and let C i = {c C : arg minf H ||f c|| = f i } be the cluster of clients deﬁned by f i . Given the lower bound constraint on the number of opened facilities, it may be the case that some facilities in H correspond to empty clusters. We thus deﬁne

Algorithm 2: The algorithm for constructing annular search spaces

Input: A constant ϵ (0, 0.5), an instance I = (ℓ, k, C, F, α, β, ρ, τ, w) of the weighted fair-range clustering problem, and a positive integer n Output: A collection A

2 Let D be the power set of [ℓ];

3 for each f F do

4 Let η(f) be an integer randomly and uniformly selected from [k];

5 for each (D1, , Dk) [D\{ }]k, k [k], (c 1, , c k ) [ C]k , and δ {||c f||ρ : c C, f F} do

6 if |{i [k] : t Di}| [αt, βt] for each t [ℓ] then

7 for each i [k] do

8 L i {f F : τ(f) = Di};

9 for each i [k ] and j [ ϵ 2 log n ] do

10 A(i, j) {f L i : ||f c i||ρ (ϵ(1 + ϵ)j 1δ n 1, ϵ(1 + ϵ)jδ n 1]};

11 A(i, 0) {f L i : ||f c i||ρ ϵδ n 1};

12 for each (j1, , jk ) [[ ϵ 2 log n ] {0}]k do

13 for each i [k ] do

14 Ai {f A(i, ji) : η(f) = i};

15 for each i [k]\[k ] do

16 if {f L i : η(f) = i} = then

17 Let Ai be a singleton subset of {f L i : η(f) = i};

20 A A {{A1, , Ak}};

21 return A.

H 0 = {f H : C i = } and H 1 = H \ H 0. Let k = | H 1|. Without loss of generality, we can assume that H 1 = {f 1 , , f k }.

Following the framework given by Cohen-Addad et al. [2019], we select opened facilities from a set of annuli centered around a group of leaders from C. For each i [k ], let ci denote the client from C nearest to f i , that is, the leader corresponding to f i . Let δρ max = maxi [k ] δρ(ci, f i ). For each i [k ] and j [ ϵ 2 log n ], let A (i, j) = {f Li : ||f ci||ρ (ϵ(1 + ϵ)j 1δρ maxn 1, ϵ(1 + ϵ)jδρ maxn 1]} be the set of facilities from Li located in an annulus centered around ci, and let A (i, 0) = {f Li : ||f ci||ρ ϵδρ maxn 1}. The deﬁnitions of A (i, j) and δρ max imply the existence of an integer j {0, , ϵ 2 log n } satisfying f i A (i, j). Denote by A i such a set A (i, j) containing f i .

Our method for constructing annular search spaces is presented in Algorithm 2. Since the collection {A (1, 0), , A (k , ϵ 2 log n )} can be determined based on the values of {L1, , Lk}, k , δρ max, and {c1, , ck }, Algorithm 2 enumerates all possible values of these parameters in step 5 to ensure that the collection can be captured. Given an integer i [k ] and the sets A (i, 0), , A (i, ϵ 2 log n ), Algorithm 2 enumerates [ ϵ 2 log n ] {0} in step 12 to ﬁnd the integer j with A i = A (i, j). To avoid the case where the search spaces for the k opened facilities intersect and the set of selected facilities contains duplicate elements, Algorithm 2 employs a color-coding technique to eliminate any potential intersections. Speciﬁcally, Algorithm 2 associates each facility f F with a random integer η(f) [k] in step 4, and only selects facilities with η(f) = i when constructing the i-th search space for each i [k] in steps 14 and 17. The performance guarantees of this algorithm are presented in the following lemma.

Lemma 7 The collection A constructed by Algorithm 2 satisﬁes the following two properties:

Algorithm 3: The algorithm for low-dimensional weighted instances

Input: A constant ϵ (0, 0.5), an instance I = (ℓ, k, C, F, α, β, ρ, τ, w) of the weighted fair-range clustering problem, and a positive integer n Output: A solution H to I

2 for each s [kk] do

3 Let A be the collection constructed by Algorithm 2 with (ϵ, I, n) as the input;

5 for each {A1, , Ak} A with Ai = i [k] do

6 for each i [k] do

7 if |Ai| = 1 then

10 Let Si be the maxx,y Ai ϵ||x y||-net of Ai constructed by Lemma 8;

11 Let H be the collection constructed by transforming each tuple in S1 S2 Sk into a set;

13 return H arg min H H P

c C w(c)δρ(c, H).

(i) With probability no less than k k, there exists a collection {A1, , Ak} A satisfying f i Ai A i for each i [k ] and Ai = for each i [k]\[k ];

(ii) Given a collection {A1, , Ak} A, with Ai = for each i [k], and a tuple (f1, , fk) A1 A2 Ak, we have |{i [k] : t τ(fi)}| [αt, βt] for each t [ℓ].

3.3 The Algorithm in Low-Dimensional Spaces

As outlined in Section 2, solutions are constructed by extracting nets from the set of facilities. The following lemma presents a method for generating nets in low-dimensional Euclidean spaces. Lemma 8 (Har-Peled and Mendel [2006]) Given a density parameter γ > 0 and a set P Rd, a γ-net of P of size at most min{|P|, γ d maxp1,p2 P ||p1 p2||d} can be constructed in |P| log |P|2O(d) time.

Our approach for solving the low-dimensional weighted instance I is built upon Algorithm 2 and Lemma 8, and is outlined in Algorithm 3. Since Algorithm 2 yields the desired search spaces with probability k k (as given by the ﬁrst property stated in Lemma 7), Algorithm 3 iteratively invokes it kk times, allowing the probability of successfully constructing the desired search spaces in at least one of the iterations to be boosted to a constant. Given k sets A1, , Ak satisfying the ﬁrst property stated in Lemma 7, Algorithm 3 constructs a net for each set of size greater than 1, and adds the members of the net to the candidate set of opened facilities. Finally, the algorithm constructs a set of feasible solutions to I based on the candidates for opened facilities, and returns the one with the minimum cost among them.

The following lemma says that Algorithm 3 yields a (1 + O(ϵ 1 ρ ))-approximation solution to I with high probability. Lemma 9 The following event occurs with probability no less than 1 e 1: Algorithm 3 yields a feasible H to I satisfying P

c C w(c)δρ(c, H ) < (1+4ϵ)opt if ρ = 1 and P

c C w(c)δρ(c, H ) < (1 + 9 ϵ)opt if ρ = 2.

By analyzing the time Algorithm 3 takes to construct the set of candidate solutions, as well as the size of this set, we can establish the following upper bound on the running time of Algorithm 3.

Lemma 10 Algorithm 3 runs in no more than 2(kϵ 1)O(1)+kℓn O(1) time.

Algorithm 4: The algorithm in high-dimensional spaces

Input: A constant ϵ (0, 0.5) and an instance I = (ℓ, k, C, F, α, β, ρ, τ) of the fair-range clustering problem satisfying C F Rd Output: A solution H to I

1 Let ϕ : Rd R d be the mapping and I = (ℓ, k, C, F, α, β, ρ, τ, w) be the weighted instance constructed by Algorithm 1 with (ϵ, I) as the input;

2 Let H be the solution to I constructed by Algorithm 3 with (ϵ, I, |C F|) as the input;

3 return H {ϕ 1(f) : f H }.

3.4 Extensions to High-Dimensional Spaces

We combine the data-reduction method given in Section 3.1 with the low-dimensional algorithm given in Section 3.3 to solve the Fk Med and Fk Means problems in high-dimensional spaces, as detailed in Algorithm 4. Given a constant ϵ (0, 0.5) and an instance I = (ℓ, k, C, F, α, β, ρ, τ), the algorithm starts with constructing a mapping ϕ : Rd R d and a small weighted instance I = (ℓ, k, C, F, α, β, ρ, τ, w), where C F R d, F = {ϕ(f) : f F}, and τ(ϕ(f)) = τ(f) for each f F. It then constructs a solution to I and returns the set of preimages of the facilities opened in this solution under ϕ. The analysis of the performance guarantees of Algorithm 4 leads to the main result of this paper, as stated in Theorem 1.

Theorem 1 Given an instance (ℓ, k, C, F, α, β, ρ, τ) of fair-range clustering with C F Rd and ρ {1, 2} along with a real number ε (0, 1), there is a randomized (1 + ε)-approximation algorithm running in O(d log d) + 2(kε 1)O(1)+kℓn O(1) time, where n = |C F|.

4 Conclusions

In this paper, we consider the Fk Med and Fk Means problems for the case where the numbers of opened facilities and demographic labels are ﬁxed parameters. Based on a combination of a datareduction method and a space-partitioning approach for selecting opened facilities, we introduce (1 + ε)-approximation algorithms in Euclidean spaces, representing the ﬁrst parameterized approximation schemes for the problems. Given that coreset-construction methods were known for many constrained clustering problems incorporating additional constraints on instances, such as those related to capacities [Cohen-Addad and Li, 2019], group fairness [Bandyapadhyay et al., 2021], and robustness [Huang et al., 2023], an interesting direction for future work is to extend our techniques to deal with the Fk Med and Fk Means problems in similar constrained settings. Another promising avenue for exploration is to accelerate heuristic algorithms for the fair-range clustering problem, such as those given in [Thejaswi et al., 2022], using the data-reduction method proposed in this work.

5 Broader Impact

Our work deals with the fair-range clustering problem, providing algorithmic insights that can facilitate fair decision-making. While our algorithms have been shown to be fair according to speciﬁc deﬁnitions, it is essential to recognize that this fairness does not automatically warrant indiscriminate application. This underscores the need for careful consideration when implementing the algorithms proposed in this paper in real-world scenarios that prioritize fairness.

Acknowledgements

This work was supported by Open Project of Xiangjiang Laboratory under Grant Nos. 22XJ03013 and 24XJJCYJ01003, National Natural Science Foundation of China under Grant Nos. 62202161, 62432016, 62172446, and 62202160, Natural Science Foundation of Hunan Province under Grant No. 2023JJ40240, and Central South University Research Programme of Advanced Interdisciplinary Studies under Grant No. 2023QYJC023.

Fateme Abbasi, Sandip Banerjee, Jaroslaw Byrka, Parinya Chalermsook, Ameet Gadekar, Kamyar Khodamoradi, Dániel Marx, Roohani Sharma, and Joachim Spoerhase. Parameterized approximation for robust clustering in discrete geometric spaces. In Proceedings of the 51st International Colloquium on Automata, Languages, and Programming, volume 297, pages 6:1 6:19, 2024.

Mohsen Abbasi, Aditya Bhaskara, and Suresh Venkatasubramanian. Fair clustering via equitable group representations. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 504 514, 2021.

Saba Ahmadi, Pranjal Awasthi, Samir Khuller, Matthäus Kleindessner, Jamie Morgenstern, Pattara Sukprasert, and Ali Vakilian. Individual preference stability for clustering. In Proceedings of the 39th International Conference on Machine Learning, volume 162, pages 197 246, 2022.

Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson, and Justin Ward. Better guarantees for k-means and Euclidean k-median by primal-dual algorithms. SIAM J. Comput., 49(4), 2020.

Nir Ailon and Bernard Chazelle. The fast Johnson Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput., 39(1):302 322, 2009.

Sayan Bandyapadhyay, Fedor V. Fomin, and Kirill Simonov. On coresets for fair clustering in metric and Euclidean spaces and their applications. In Proceedings of the 48th International Colloquium on Automata, Languages, and Programming, volume 198, pages 23:1 23:15, 2021.

Mohammad Hossein Bateni, Vincent Cohen-Addad, Alessandro Epasto, and Silvio Lattanzi. A scalable algorithm for individually fair k-means clustering. In Proceedings of the 27th International Conference on Artiﬁcial Intelligence and Statistics, volume 238, pages 3151 3159, 2024.

Suman Kalyan Bera, Deeparnab Chakrabarty, Nicolas Flores, and Maryam Negahbani. Fair algorithms for clustering. In Proceedings of the 32nd Annual Conference on Neural Information Processing Systems, pages 4955 4966, 2019.

Anup Bhattacharya, Ragesh Jaiswal, and Amit Kumar. Faster algorithms for the constrained k-means problem. Theory Comput. Syst., 62(1):93 115, 2018.

Ke Chen. On coresets for k-median and k-means clustering in metric and Euclidean spaces and their applications. SIAM J. Comput., 39(3):923 947, 2009.

Xianrun Chen, Dachuan Xu, Yicheng Xu, and Yong Zhang. Parameterized approximation algorithms for sum of radii clustering and variants. In Proceedings of the 38th AAAI Conference on Artiﬁcial Intelligence, pages 20666 20673, 2024.

Xingyu Chen, Brandon Fain, Liang Lyu, and Kamesh Munagala. Proportionally fair clustering. In Proceedings of the 36th International Conference on Machine Learning, volume 97, pages 1032 1041, 2019.

Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, and Sergei Vassilvitskii. Fair clustering through fairlets. In Proceedings of the 30th Annual Conference on Neural Information Processing Systems, pages 5029 5037, 2017.

Vincent Cohen-Addad and Jason Li. On the ﬁxed-parameter tractability of capacitated clustering. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming, volume 132, pages 41:1 41:14, 2019.

Vincent Cohen-Addad, Anupam Gupta, Amit Kumar, Euiwoong Lee, and Jason Li. Tight FPT approximations for k-median and k-means. In Proceedings of the 46th International Colloquium on Automata, Languages, and Programming, volume 132, pages 42:1 42:14, 2019.

Vincent Cohen-Addad, Hossein Esfandiari, Vahab S. Mirrokni, and Shyam Narayanan. Improved approximations for Euclidean k-means and k-median, via nested quasi-independent sets. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 1621 1628, 2022.

Zhen Dai, Yury Makarychev, and Ali Vakilian. Fair representation clustering with several protected classes. In Proceedings of the 5th ACM Conference on Fairness, Accountability, and Transparency, pages 814 823, 2022.

Hu Ding and Jinhui Xu. A uniﬁed framework for clustering constrained data without locality property. Algorithmica, 82(4):808 852, 2020.

Zachary Friggstad and Yifeng Zhang. Tight analysis of a multiple-swap heurstic for budgeted redblue median. In Proceedings of the 43rd International Colloquium on Automata, Languages, and Programming, volume 55, pages 75:1 75:13, 2016.

Mehrdad Ghadiri, Samira Samadi, and Santosh S. Vempala. Socially fair k-means clustering. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, pages 438 448, 2021.

Yogesh A. Girdhar and Gregory Dudek. Efﬁcient on-line data summarization using extremum summaries. In Proceeding of the 29th IEEE International Conference on Robotics and Automation, pages 3490 3496, 2012.

Kishen N. Gowda, Thomas W. Pensyl, Aravind Srinivasan, and Khoa Trinh. Improved bi-point rounding algorithms and a golden barrier for k-median. In Proceedings of the 34th ACM-SIAM Symposium on Discrete Algorithms, pages 987 1011, 2023.

Dishant Goyal and Ragesh Jaiswal. Tight FPT approximation for socially fair clustering. Inf. Process. Lett., 182:106383, 2023.

Anupam Gupta, Robert Krauthgamer, and James R. Lee. Bounded geometries, fractals, and lowdistortion embeddings. In Proceeding of the 44th Symposium on Foundations of Computer Science, pages 534 543, 2003.

Mohammad Taghi Hajiaghayi, Rohit Khandekar, and Guy Kortsarz. Budgeted red-blue median and its generalizations. In Proceedings of the 18th Annual European Symposium on Algorithms, volume 6346, pages 314 325, 2010.

Mohammad Taghi Hajiaghayi, Rohit Khandekar, and Guy Kortsarz. Local search algorithms for the red-blue median problem. Algorithmica, 63(4):795 814, 2012.

Sariel Har-Peled and Manor Mendel. Fast construction of nets in low-dimensional metrics and their applications. SIAM J. Comput., 35(5):1148 1184, 2006.

Sèdjro S. Hotegni, Sepideh Mahabadi, and Ali Vakilian. Approximation algorithms for fair range clustering. In Proceedings of the 40th International Conference on Machine Learning, volume 202, pages 13270 13284, 2023.

Lingxiao Huang, Shaofeng H.-C. Jiang, Jianing Lou, and Xuan Wu. Near-optimal coresets for robust clustering. In Proceedings of the 11th International Conference on Learning Representations, 2023.

Ragesh Jaiswal, Amit Kumar, and Sandeep Sen. A simple D2-sampling based PTAS for k-means and other clustering problems. Algorithmica, 70(1):22 46, 2014.

Ragesh Jaiswal, Amit Kumar, and Jatin Yadav. FPT approximation for capacitated sum of radii. In Proceedings of the 15th Innovations in Theoretical Computer Science Conference, volume 287, pages 65:1 65:21, 2024.

William B. Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. Contemp. Math., 26:189 206, 1984.

Christopher Jung, Sampath Kannan, and Neil Lutz. Service in your neighborhood: Fairness in center location. In Proceedings of the 1st Symposium on Foundations of Responsible Computing, volume 156, pages 5:1 5:15, 2020.

Matthew Kay, Cynthia Matuszek, and Sean A. Munson. Unequal representation and gender stereotypes in image search results for occupations. In Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, pages 3819 3828, 2015.

Ravishankar Krishnaswamy, Amit Kumar, Viswanath Nagarajan, Yogish Sabharwal, and Barna Saha. The matroid median problem. In Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1117 1130, 2011.

Ravishankar Krishnaswamy, Shi Li, and Sai Sandeep. Constant approximation for k-median and k-means with outliers via iterative rounding. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, pages 646 659, 2018.

Amit Kumar, Yogish Sabharwal, and Sandeep Sen. Linear-time approximation schemes for clustering problems in any dimensions. J. ACM, 57(2):5:1 5:32, 2010.

Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem. In Proceedings of 38th International Colloquium on Automata, Languages and Programming, volume 6756, pages 77 88, 2011.

Sepideh Mahabadi and Ali Vakilian. Individual fairness for k-clustering. In Proceedings of the 37th International Conference on Machine Learning, volume 119, pages 6586 6596, 2020.

Yury Makarychev and Ali Vakilian. Approximation algorithms for socially fair clustering. In Proceedings of the 34th Conference on Learning Theory, volume 134, pages 3246 3264, 2021.

Evi Micha and Nisarg Shah. Proportionally fair clustering revisited. In Proceedings of the 47th International Colloquium on Automata, Languages, and Programming, volume 168, pages 85:1 85:16, 2020.

Marie-Francine Moens, Caroline Uyttendaele, and Jos Dumortier. Abstracting of legal cases: The potential of clustering based on the selection of representative objects. J. Am. Soc. Inf. Sci., 50(2): 151 161, 1999.

Shyam Narayanan and Jelani Nelson. Optimal terminal dimensionality reduction in Euclidean space. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pages 1064 1069, 2019.

Maryam Negahbani and Deeparnab Chakrabarty. Better algorithms for individually fair k-clustering. In Proceedings of the 34th Annual Conference on Neural Information Processing Systems, pages 13340 13351, 2021.

Chaitanya Swamy. Improved approximation algorithms for matroid and knapsack median problems and applications. In Proceedings of the 17th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems and the 18th International Workshop on Randomization and Computation, volume 28, pages 403 418, 2014.

Suhas Thejaswi, Bruno Ordozgoiti, and Aristides Gionis. Diversity-aware k-median: Clustering with fair center representation. In Proceedings of the 32nd European Conference on Machine Learning and the 25th European Conference on Principles and Practice of Knowledge Discovery in Databases, volume 12976, pages 765 780, 2021.

Suhas Thejaswi, Ameet Gadekar, Bruno Ordozgoiti, and Michal Osadnik. Clustering with fair-center representation: Parameterized approximation algorithms and heuristics. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 1749 1759, 2022.

Ali Vakilian and Mustafa Yalçiner. Improved approximation algorithms for individually fair clustering. In Proceedings of the 25th International Conference on Artiﬁcial Intelligence and Statistics, volume 151, pages 8758 8779, 2022.

Di Wu, Qilong Feng, and Jianxin Wang. Approximation algorithms for fair k-median problem without fairness violation. Theor. Comput. Sci., 985:114332, 2024.

Zhen Zhang, Junfeng Yang, Limei Liu, Xuesong Xu, Guozhen Rong, and Qilong Feng. Towards a theoretical understanding of why local search works for clustering with fair-center representation. In Proceedings of the 38th AAAI Conference on Artiﬁcial Intelligence, pages 16953 16960, 2024.

A Proof of Lemma 1

Lemma 1 Given two real numbers s and t greater than 1, we have logt s max{s, t O(t)}.

Proof If t log s log log s, then we have log s O(t log t), and thus logt s t O(t). If t < log s log log s, then

we have logt s < log

log s log log s s = s. Thus, Lemma 1 is true.

B Proof of Lemma 2

Lemma 2 Given three points x, y, and z in an Euclidean space and a real number γ (0, 1], we have ||x z||2 (1 + γ 1)||x y||2 + (1 + γ)||y z||2. Proof Triangle inequality implies that ||x z|| ||x y|| + ||y z||, and thus we have

||x z||2 (||x y|| + ||y z||)2

= ||x y||2 + ||y z||2 + 2 1 γ ||x y|| γ||y z||

||x y||2 + ||y z||2 + 1

γ ||x y||2 + γ||y z||2.

This completes the proof of Lemma 2.

C Proof of Lemma 6

Lemma 6 Given a constant ϵ (0, 0.5) and an instance (ℓ, k, C, F, α, β, ρ, τ) of the fair-range clustering problem with ρ {1, 2}, |C F| = n, and C F Rd, Algorithm 1 constructs a mapping ϕ : Rd R d and an instance (ℓ, k, C, F, α, β, ρ, τ, w) of the weighted fair-range clustering problem in O(d log d) + (nkϵ 1)O(1) time, which satisfy the following properties:

c C w(c) = |C|,

(ii) w(c) 1 for each c C,

(iii) | C| (kϵ 1 log n)O(1),

(iv) d = ϵ O(1)(log k + log log n), and

c C w(c)δρ(c, {ϕ(f) : f H}) [1 ϵ, (1 + ϵ)2ρ+1] P

c C δρ(c, H) for each H F with |H| k.

Proof Algorithm 1 constructs a mapping ϕ1 : Rd Rd using Lemma 4 in step 1, a coreset C along with the corresponding weight function w : C [1, + ) using Lemma 3 in step 2, and a mapping ϕ2 : Rd R d using Lemma 5 in step 3.

We begin by examining the ﬁrst property of the output of Algorithm 1 stated in Lemma 6. We have X

c C w(c) = X

c C w (c) = |{ϕ1(c) : c C}| = |C|, (1)

where the ﬁrst step follows from the fact that C = {ϕ2(c) : c C } (due to step 4 of Algorithm 1) and w is the composite mapping w ϕ 1 2 (due to step 5 of Algorithm 1), the second step follows from the fact that C is the weighted set constructed by Lemma 3 with {ϕ1(c) : c C} as the input set, and the last step is due to the assumption that the mappings constructed by Lemma 4 and Lemma 5 are injective. Equality (1) implies that the ﬁrst property stated in Lemma 6 is true.

The second property stated in Lemma 6 follows directly from the fact that w(c) = w (ϕ 1 2 (c)) for each c C (as established in step 5 of Algorithm 1) and w is a mapping to [1, + ) (due to Lemma 3).

We now consider the third property stated in Lemma 6. This can be veriﬁed by

| C| = |C | d (kϵ 1 log |C|)O(1) = (kϵ 1 log |C|)O(1) log |C F| = (kϵ 1 log n)O(1), (2)

where the ﬁrst step is due to the fact that C = {ϕ2(c) : c C } (as established in step 4 of the algorithm) and the assumption that each mapping constructed by Lemma 5 is injective, the second step follows from Lemma 3, and the third step is due to the fact that ϕ1 : Rd Rd is the mapping constructed by Lemma 4 with (ϵ, C F) as the input.

Using inequality (2) and the fact that ϕ2 : Rd R d is the mapping constructed by Lemma 5 with (ϵ, C ) as the input, we have

d = ϵ O(1) log |C | ϵ O(1) log(kϵ 1 log n) = ϵ O(1)(log k + log log n),

which implies that the fourth property stated in Lemma 6 is true.

Given a subset H F with |H| k, we have X

c C w(c)δρ(c, {ϕ(f) : f H}) [1, (1 + ϵ)ρ] X

c C w(c)δρ(c, {ϕ1(f) : f H})

[1 ϵ, (1 + ϵ)ρ+1] X

c C δρ(ϕ1(c), {ϕ1(f) : f H})

[1 ϵ, (1 + ϵ)2ρ+1] X

c C δρ(c, H),

where the ﬁrst step follows from Lemma 5 and the fact that C = {ϕ2(c) : c C } and ϕ(f) = ϕ2(ϕ1(f)) for each f F (due to step 4 of Algorithm 1), the second step follows from the fact that C is a coreset of {ϕ1(c) : c C} constructed by Lemma 3, and the last step is due to Lemma 4. This completes the proof of the last property stated in Lemma 6.

It remains to show the running time of Algorithm 1. Recall that the algorithm invokes Lemma 4 with (ϵ, C F), invokes Lemma 3 with (ϵ, {ϕ1(c) : c C}, k, ρ), and invokes Lemma 5 with (ϵ, C ), where C {ϕ1(c) : c C} Rd . Combining this with inequality (2), we can express the running time of Algorithm 1 as

O(d log d + |C|d k) + (ϵ 1 log |C F|)O(1) + (|C |d ϵ 1)O(1) O(d log d) + (nkϵ 1)O(1),

as desired.

D Proof of Lemma 7

Lemma 7 The collection A constructed by Algorithm 2 satisﬁes the following two properties:

(i) With probability no less than k k, there exists a collection {A1, , Ak} A satisfying f i Ai A i for each i [k ] and Ai = for each i [k]\[k ]; (ii) Given a collection {A1, , Ak} A, with Ai = for each i [k], and a tuple (f1, , fk) A1 A2 Ak, we have |{i [k] : t τ(fi)}| [αt, βt] for each t [ℓ]. Proof Algorithm 2 enumerates all possible values of A i for each i [k ] and all possible values of Li for each i [k]\[k ]. Consequently, the k sets A 1, , A k , Lk +1, , Lk are guaranteed to be captured by the algorithm. Observe that Algorithm 2 associates each facility f with a random integer η(f) [k]. It can be shown that equality η(f i ) = i i [k] holds with probability k k. When this equality is satisﬁed, and the k sets A 1, , A k , Lk +1, , Lk are provided, the algorithm is able to construct a set Ai satisfying f i Ai A i for each i [k ] by extracting the facilities f A i with η(f) = i in step 14, and ﬁnd a facility f Li with η(f) = i to construct a singleton set Ai Li for each i [k]\[k ] in step 17. Thus, the ﬁrst property stated in Lemma 7 is true.

Now we consider the second property. Given a collection {A1, , Ak} A with Ai = for each i [k] and a tuple (f1, , fk) A1 A2 Ak, the fact that facilities in different sets are associated with distinct integers implies that {f1, , fk} is a distinct set. Combining this with the decision condition employed in step 6 of Algorithm 2, we have

|{i [k] : t τ(fi)}| = |{i [k] : t Di}| [αt, βt]

for each t [ℓ] , where D1, , Dk are the k sets of demographic labels used for constructing {A1, , Ak}. This completes the proof of Lemma 7.

E Proof of Lemma 9

Lemma 9 The following event occurs with probability no less than 1 e 1: Algorithm 3 yields a feasible H to I satisfying P

c C w(c)δρ(c, H ) < (1+4ϵ)opt if ρ = 1 and P

c C w(c)δρ(c, H ) < (1 + 9 ϵ)opt if ρ = 2.

Proof The second property stated in Lemma 7 immediately implies that all candidate solutions constructed by Algorithm 3 are feasible solutions to I. It remains to consider the approximation ratio of the algorithm. Given that Algorithm 3 iteratively invokes Algorithm 2 kk times to construct a collection A, the probability that there exists an element {A1, , Ak} A satisfying the ﬁrst property stated in Lemma 7 is at least 1 (1 k k)kk > 1 e 1. For the purpose of our analysis, we assume that the collection A indeed contains such an element {A1, , Ak}, and let Si be the subset of Ai constructed by Algorithm 3 in step 8 or step 10 for each i [k].

We consider an integer i [k ]. Lemma 7 indicates that

Si = Ai = {f i } (3)

if |Ai| = 1. For the case where |Ai| > 1, Si is a maxx,y Ai ϵ||x y||-net of Ai. In this scenario, it can be concluded that

δρ(f i , Si) ϵρ max x,y Ai ||x y||ρ

2ρϵρ max f Ai ||f ci||ρ

2ρϵρ max{(1 + ϵ)||f i ci||ρ, ϵ

2ρϵρ max{(1 + ϵ)||f i ci||ρ, ϵ

where the ﬁrst step follows from the fact that f i Ai (due to Lemma 7) and the deﬁnition of nets, the second step follows from triangle inequality (for ρ = 1) and Lemma 2 (for ρ = 2, with γ = 1), the third step follows from the fact that an integer j {0, , ϵ 2 log n } satisﬁes Ai A (i, j) (due to Lemma 7), along with the fact that each j [ ϵ 2 log n ] and {x, y} A (i, j) satisfy ||x ci||ρ (1 + ϵ)||y ci||ρ and each f A (i, 0) satisﬁes ||f ci||ρ ϵδρ maxn 1 (due to the deﬁnitions of A (i, j) and A (i, 0)), and the last step follows from the deﬁnition of δρ max and the fact that w(c) 1 for each c C (due to Lemma 6).

Let H be the set of candidate solution constructed using the Cartesian product S1 S2 Sk in step 11 of Algorithm 3. Equality (3) and inequality (4) suggest the existence of a candidate solution {f1, , fk} H satisfying

||fi f i ||ρ 2ρϵρ max{(1 + ϵ)||f i ci||ρ, ϵ

for each i [k ]. Thus, it can be shown that

w(c)||fi f i ||ρ =

w(c)||fi f i ||ρ

w(c) max{(1 + ϵ)||f i ci||ρ, ϵ

w(c) (1 + ϵ)||f i ci||ρ + ϵ

< 2ρϵρ(1 + ϵ)

w(c)||f i ci||ρ + 2ρϵρ+1opt

2ρϵρ(1 + ϵ)

w(c)||f i c||ρ + 2ρϵρ+1opt

= 2ρϵρ(1 + ϵ)opt + 2ρϵρ+1opt < 4ρϵρopt, (6)

where the ﬁrst step is due to the fact that C i = for each i [k]\[k ], the second step is due to inequality (5), the fourth step follows from the fact that Pk i=1 P

c C i w(c) = P

c C w(c) = | C| < n (due to Lemma 6), the ﬁfth step is due to the deﬁnition of ci, the sixth step is due to the deﬁnition of C i , and the last step follows from the fact that ϵ (0, 0.5). Consequently, for the case where ρ = 2, we have

c C w(c)δρ(c, {f1, , fk}) =

w(c)δρ(c, {f1, , fk})

w(c)||c fi||ρ

w(c) (1 + ϵ)||c f i ||ρ + (1 + 1 ϵ)||f i fi||ρ

= (1 + ϵ)opt + (1 + 1 ϵ)

w(c)||f i fi||ρ

< 1 + ϵ + 4(1 + 1 ϵ)ρϵρ opt

< (1 + 9 ϵ)opt, (7)

where the third step is due to Lemma 2 (with γ = ϵ), the fourth step follows from the deﬁnition of C i , the ﬁfth step follows from inequality (6), and the last step is due to the fact that ϵ (0, 0.5).

Replacing Lemma 2 used in the third step of inequality (7) with triangle inequality, we get

c C w(c)δρ(c, {f1, , fk})

w(c) (||c f i ||ρ + ||f i fi||ρ)

w(c)||f i fi||ρ

< (1 + 4ϵ)opt (8)

for the case where ρ = 1.

Using inequality (7) and inequality (8), along with the fact that Algorithm 3 returns the candidate solution with the minimum cost, we complete the proof of Lemma 9.

F Proof of Lemma 10

Lemma 10 Algorithm 3 runs in no more than 2(kϵ 1)O(1)+kℓn O(1) time. Proof Let H be the set of candidate solutions constructed by Algorithm 3, and let A be the collection formed by iteratively invoking Algorithm 2 kk times. Observe that Algorithm 2 enumerates all possible values of k , {c1, , ck}, δρ max, and {L1, , Lk} to guess the sets {A (1, 0), , A (k , ϵ 2 log n )} and {Lk +1, , Lk}. Additionally, it enumerates [[ ϵ 2 log n ] {0}]k to determine the set {A 1, , A k }. Analyzing the possible values of these parameters yields

|A| 2kℓkk+1| F|| C|k+1(ϵ 2 log n + 1)k

2kℓ(kϵ 1 log n)O(k)n

2kℓ(kϵ 1)O(k)n O(1), (9)

where the second step follows from the fact that C = (kϵ 1 log n)O(1) (due to Lemma 6) and | F| < n, and the third step follows from Lemma 1 (with s = n and t = k).

Let {A1, , Ak} A be the collection considered in one of the iterations of steps 6-12 of Algorithm 3. Let S1, , Sk be the subsets constructed in step 8 or step 10 and H be the set of candidate solutions constructed in step 11 during this iteration. For each i [k] with |Ai| > 1, Si is a maxx,y Ai ϵ||x y||-net of Ai, which is constructed based on the maximum distance between the facilities from Ai in step 10. Lemma 8 implies that constructing these nets takes no more than

i=1 2O( d)|Ai|O(1) 2O( d)| F|O(1)k 2O( d)n O(1)k (10)

time. Moreover, the upper bound on the size of a net exhibited in Lemma 8 suggests that

1 |Si| ϵ d (11)

for each i [k]. Inequality (11) leads to

i=1 |Si| ϵ k d,

and thus we have

|H| ϵ k d|A|. (12)

Given the set H of candidate solutions, Algorithm 3 takes | C| dk|H| time to identify the solution with the minimum cost. Combining this with the time required for constructing nets in each iteration exhibited in inequality (10), we know that the running time of Algorithm 3 is upper-bounded by

2O( d)n O(1)k|A| + | C| dk|H| |A|k(2O( d)n O(1) + ϵ k d| C| d)

2kℓϵ O(k d)n O(1)(kϵ 1)O(k)

2kℓn O(1)(kϵ 1)O(k)(k log n)(kϵ 1)O(1)

= 2(kϵ 1)O(1)+kℓn O(1),

where the ﬁrst step follows from inequality (12), the second step follows from the fact that | C| (kϵ 1 log n)O(1) (due to Lemma 3) and inequality (9), the third step follows from the fact that d = ϵ O(1)(log k + log log n) (due to Lemma 3), and the last step is due to Lemma 1 (with s = n and t = (kϵ 1)O(1)). This completes the proof of Lemma 10.

G Proof of Theorem 1

Theorem 1 Given an instance (ℓ, k, C, F, α, β, ρ, τ) of fair-range clustering with C F Rd and ρ {1, 2} along with a real number ε (0, 1), there is a randomized (1 + ε)-approximation algorithm running in O(d log d) + 2(kε 1)O(1)+kℓn O(1) time, where n = |C F|.

Proof Let H be the solution to I returned by Algorithm 4, and let H be the solution to I constructed in step 2 of Algorithm 4. Since each facility in H shares the same set of demographic labels as its corresponding image in R d, it follows that H is a feasible solution to I. Let H be an optimal solution to I and H be an optimal solution to I.

The optimality of H for I and Lemma 6 imply that X

c C w(c)δρ(c, H ) X

c C w(c)δρ(c, {ϕ(f) : f H })

(1 + ϵ)2ρ+1 X

c C δρ(c, H ). (13)

For the case where ρ = 1, we can derive that inequality X

c C δρ(c, H ) 1 1 ϵ

c C w(c)δρ(c, H )

c C w(c)δρ(c, H )

1 ϵ (1 + ϵ)2ρ+1 X

c C δρ(c, H )

< (1 + 39ϵ) X

c C δρ(c, H ) (14)

holds with probability no less than 1 e 1, where the ﬁrst step is due to the fact that H = {ϕ(f) : f H } and Lemma 6, the second step follows from Lemma 9, the third step is due to inequality (13), and the last step follows from the fact that ϵ (0, 0.5). Similarly, we can conclude that inequality X

c C δρ(c, H ) 1 1 ϵ

c C w(c)δρ(c, H )

c C w(c)δρ(c, H )

1 ϵ (1 + ϵ)2ρ+1 X

c C δρ(c, H )

< (1 + 83 ϵ) X

c C δρ(c, H ) (15)

holds with the same probability when ρ = 2, where the second step is due to Lemma 9.

Using inequality (14) and inequality (15), we know that Algorithm 4 is a randomized (1 + 39ϵ)- approximation algorithm for the Fk Med problem and a (1 + 83 ϵ)-approximation algorithm for the Fk Means problem. Moreover, Lemma 6 and Lemma 10 imply that this algorithm runs in O(d log d) + 2(kϵ 1)O(1)+kℓn O(1) time.

Given a constant ε (0, 1), let ϵ = ε 39 for the Fk Med problem and ϵ = ( ε

83)2 for the Fk Means problem, then the argument above implies the existence of (1 + ε)-approximation algorithms with running time O(d log d) + 2(kε 1)O(1)+kℓn O(1) for both problems, as desired.

Neur IPS Paper Checklist

Question: Do the main claims made in the abstract and introduction accurately reﬂect the paper s contributions and scope? Answer: [Yes] Justiﬁcation: The paper s contributions and scope have been accurately claimed in the abstract and introduction. 2. Limitations

Question: Does the paper discuss the limitations of the work performed by the authors? Answer: [Yes] Justiﬁcation: The limitation of the algorithms in the paper is that they consider k and ℓas ﬁxed parameters, which has been clearly discussed in the paper. 3. Theory Assumptions and Proofs

Question: For each theoretical result, does the paper provide the full set of assumptions and a complete (and correct) proof? Answer: [Yes] Justiﬁcation: All the proofs are clearly provided. 4. Experimental Result Reproducibility

Question: Does the paper fully disclose all the information needed to reproduce the main experimental results of the paper to the extent that it affects the main claims and/or conclusions of the paper (regardless of whether the code and data are provided or not)? Answer: [NA] Justiﬁcation: The paper does not include experiments. 5. Open access to data and code

Question: Does the paper provide open access to the data and code, with sufﬁcient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [NA] Justiﬁcation: The paper does not include experiments. 6. Experimental Setting/Details

Question: Does the paper specify all the training and test details (e.g., data splits, hyperparameters, how they were chosen, type of optimizer, etc.) necessary to understand the results? Answer: [NA] Justiﬁcation: The paper does not include experiments. 7. Experiment Statistical Signiﬁcance

Question: Does the paper report error bars suitably and correctly deﬁned or other appropriate information about the statistical signiﬁcance of the experiments? Answer: [NA] Justiﬁcation: The paper does not include experiments. 8. Experiments Compute Resources

Question: For each experiment, does the paper provide sufﬁcient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [NA] Justiﬁcation: The paper does not include experiments. 9. Code Of Ethics

Question: Does the research conducted in the paper conform, in every respect, with the Neur IPS Code of Ethics https://neurips.cc/public/Ethics Guidelines? Answer: [Yes] Justiﬁcation: The research conducted in the paper conform, in every respect, with the Neur IPS Code of Ethics. 10. Broader Impacts

Question: Does the paper discuss both potential positive societal impacts and negative societal impacts of the work performed? Answer: [Yes] Justiﬁcation: The potential societal impacts and negative societal impacts are discussed in the last section of the paper. 11. Safeguards

Question: Does the paper describe safeguards that have been put in place for responsible release of data or models that have a high risk for misuse (e.g., pretrained language models, image generators, or scraped datasets)? Answer: [NA] Justiﬁcation: The paper poses no such risks. 12. Licenses for existing assets

Question: Are the creators or original owners of assets (e.g., code, data, models), used in the paper, properly credited and are the license and terms of use explicitly mentioned and properly respected? Answer: [NA] Justiﬁcation: The paper does not use existing assets. 13. New Assets

Question: Are new assets introduced in the paper well documented and is the documentation provided alongside the assets? Answer: [NA] Justiﬁcation: The paper does not release new assets. 14. Crowdsourcing and Research with Human Subjects

Question: For crowdsourcing experiments and research with human subjects, does the paper include the full text of instructions given to participants and screenshots, if applicable, as well as details about compensation (if any)? Answer: [NA] Justiﬁcation: This paper does not involve crowdsourcing nor research with human subjects. 15. Institutional Review Board (IRB) Approvals or Equivalent for Research with Human Subjects Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board (IRB) approvals (or an equivalent approval/review based on the requirements of your country or institution) were obtained? Answer: [NA] Justiﬁcation: This paper does not involve crowdsourcing nor research with human subjects.