# Granular-Ball-Induced Multiple Kernel K-Means

Shuyin Xia¹, Yifan Wang¹, Lifeng Shen¹, and Guoyin Wang²

¹ Chongqing Key Laboratory of Computational Intelligence, Key Laboratory of Cyberspace Big Data Intelligent Security, Ministry of Education, and the Key Laboratory of Big Data Intelligent Computing, College of Computer Science and Technology, Chongqing University of Posts and Telecommunications
² National Center for Applied Mathematics, Chongqing Normal University

xiasy@cqupt.edu.cn, wangyifan421@foxmail.com, shenlf@cqupt.edu.cn, wanggy@cqnu.edu.cn

## Abstract

Most existing multi-kernel clustering algorithms, such as multi-kernel K-means, struggle with computational efficiency and robustness when faced with complex data distributions. These challenges stem from their dependence on point-to-point relationships during optimization, which makes it difficult to accurately capture a dataset's inherent structure and diversity. Additionally, the intricate interplay between multiple kernels can further exacerbate these issues, ultimately impairing the ability to cluster data points in high-dimensional spaces. In this paper, we leverage granular-ball computing to improve the multi-kernel clustering framework. The core of granular-ball computing is to adaptively fit the data distribution with balls, from coarse to acceptably fine levels. Each ball encloses data points based on a density-consistency measurement. Such a ball-based data description improves both computational efficiency and robustness to unknown noise. Specifically, based on granular-ball representations, we introduce the granular-ball kernel (GBK) and its corresponding granular-ball multiple kernel K-means framework (GB-MKKM) for efficient clustering.
Using granular-ball relationships in multiple kernel spaces, the proposed GB-MKKM framework shows its superiority in efficiency and clustering performance in the empirical evaluation of various clustering tasks.

## 1 Introduction

Clustering, a fundamental task in machine learning, groups data based on inherent similarities [Aggarwal, 2018; Duran and Odell, 2013]. Classic methods include K-means [MacQueen and others, 1967], spectral clustering [Ng et al., 2001], and density-based clustering [Ester et al., 1996]. K-means, widely used for its simplicity, minimizes squared distances to cluster centroids but struggles with nonlinear and arbitrary-shaped data distributions.

To overcome this limitation, various K-means variants have been developed. K-means++ [Arthur and Vassilvitskii, 2006] improves convergence speed and clustering accuracy by optimizing the selection of initial cluster centers. [Mussabayev et al., 2023] proposed an adaptive clustering algorithm that requires neither initialization nor parameter selection. Although this algorithm somewhat improves robustness to initial values, its performance remains limited on high-dimensional and non-separable datasets.

The kernel trick is a powerful technique widely used to handle non-linearly separable data. For example, the kernel K-means (KKM) algorithm [Sinaga and Yang, 2020] maps the original data into a high-dimensional kernel space, where the data can more easily be separated linearly, and then performs K-means clustering. In practice, a kernel matrix is constructed from the inner products of data points in the high-dimensional space, thereby capturing the nonlinear features of the data [Abin and Beigy, 2015; Yan et al., 2023; Yu et al., 2011; Girolami, 2002]. However, KKM's performance is highly dependent on the choice of the kernel function, which is often challenging to determine across different datasets.
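To make the kernel-matrix construction mentioned above concrete, here is a minimal NumPy sketch of a Gaussian kernel matrix; the function name and the `bandwidth` parameter are our own, not the paper's notation:

```python
import numpy as np

def gaussian_kernel_matrix(X, bandwidth=1.0):
    """K[u, v] = exp(-||x_u - x_v||^2 / (2 * bandwidth^2)).

    A Mercer kernel: symmetric and positive semi-definite, so it is a
    valid inner-product matrix of an implicit feature space.
    """
    sq = np.sum(X ** 2, axis=1)
    # pairwise squared distances via the expansion ||a-b||^2 = a.a + b.b - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * bandwidth ** 2))
```

The clamp to zero guards against tiny negative distances from floating-point cancellation.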
To alleviate the limitation of selecting a single kernel function, the multiple kernel clustering (MKC) concept was introduced [Chao et al., 2021; Zhao et al., 2009; Zhang et al., 2020]. The core idea of MKC is to leverage multiple kernel functions to extract complementary information from different feature spaces and to optimally fuse the resulting kernel matrices to improve clustering performance. Among MKC algorithms, one of the most popular is multiple kernel K-means (MKKM) [Huang et al., 2011]. MKKM learns a weighted combination of multiple kernel functions, and its core optimization objective can be written as

$$\min_{\sigma \in \Delta} \min_{H} \operatorname{Tr}\left(K_\sigma (I - HH^{T})\right), \tag{1}$$

where $H$ and $K_\sigma$ denote the clustering partition matrix and the combined kernel matrix, respectively, and $\Delta = \{\sigma \in \mathbb{R}^{\omega} \mid \sum_{k=1}^{\omega} \sigma_k = 1,\ \sigma_k \geq 0,\ \forall k\}$, where $\omega$ is the number of kernel functions and $\sigma_k$ is the $k$-th kernel weight. MKKM adopts an alternating optimization strategy that iteratively updates the kernel weights and the clustering partition matrix. However, this alternating approach easily falls into local optima, degrading overall clustering performance [Yao et al., 2020; Liu et al., 2016].

*Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence (IJCAI-25)*

To further improve MKKM, [Liu, 2022] proposed SimpleMKKM, which reformulates the min-max formulation into a parameter-free minimization problem and designs a streamlined gradient descent algorithm to solve for the optimal solution. Although this method alleviates the local-optimum problem, it still requires joint optimization over all samples to compute the kernel matrices, significantly increasing time and space complexity.

In this paper, we propose an adaptive and robust granular-ball-induced multiple kernel K-means clustering framework. It offers advantages in scalability and clustering performance over recent multi-kernel clustering methods.
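The alternating strategy behind Eq. (1) can be sketched as follows. The closed-form weight update used here (weights inversely proportional to each kernel's residual) is one standard choice for objectives of this form; the exact rule varies across MKKM variants, so treat this as an illustrative assumption rather than the paper's implementation:

```python
import numpy as np

def mkkm_alternating(kernels, n_clusters, n_iter=10):
    """Alternating optimization for an Eq.(1)-style objective.

    Fix sigma: H is the top-eigenvector matrix of the combined kernel
    (it maximizes Tr(H^T K H)).  Fix H: sigma_k is proportional to
    1 / Tr(K_k (I - H H^T)), renormalized onto the simplex (an assumed,
    standard closed form).
    """
    n_kernels = len(kernels)
    sigma = np.full(n_kernels, 1.0 / n_kernels)
    H = None
    for _ in range(n_iter):
        K = sum(s ** 2 * Kk for s, Kk in zip(sigma, kernels))
        _, vecs = np.linalg.eigh(K)           # eigenvalues in ascending order
        H = vecs[:, -n_clusters:]
        resid = np.array([np.trace(Kk) - np.trace(H.T @ Kk @ H)
                          for Kk in kernels])
        resid = np.maximum(resid, 1e-12)      # guard against division by zero
        sigma = (1.0 / resid) / np.sum(1.0 / resid)
    return sigma, H
```

Each pass solves one subproblem exactly, which is why the scheme can stall in local optima, as noted above.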
Intuitively, granular-ball computing is a data representation technique that captures complex data structures with a set of balls. Using a density-based central consistency measurement [Xia et al., 2019], granular balls can be adaptively generated from coarse to acceptably fine granularities. Using granular balls reduces the amount of data while effectively capturing the distribution information of the data. Based on granular-ball computing, we introduce the granular-ball-induced kernel (GBK) into a multi-kernel K-means clustering framework. Notably, the proposed GBK can be plugged into existing MKKM methods in a plug-and-play manner, significantly improving their efficiency and clustering performance. In summary, our main contributions are as follows:

- We are the first to introduce a granular-ball-induced kernel (GBK) into multi-kernel K-means clustering, and it can be easily extended to existing multiple-kernel methods.
- The proposed GBK effectively reduces storage requirements and computational costs while ensuring superior clustering performance, because it reduces the kernel matrix size from the number of samples $n$ to the number of granular balls $m$ ($m \ll n$).
- Experimental results verify its effectiveness across various commonly used multi-kernel clustering datasets.

## 2 Related Work

### 2.1 Multiple Kernel Clustering

To address the insufficiency of a single kernel, multiple kernel K-means (MKKM), an algorithm combining multiple kernel learning and K-means clustering, improves clustering performance by integrating multiple kernel functions.
Based on classical MKKM, recent studies have made improvements from different perspectives, including selecting kernels based on kernel correlation [Gönen and Margolin, 2014], introducing local adaptive kernel fusion [Liu et al., 2016], optimizing kernel alignment criteria [Cortes et al., 2012], enhancing robustness [Tao et al., 2018], simplifying the optimization process [Zhang et al., 2021], and removing noisy kernels [Li et al., 2023]. For instance, LMKKM [Gönen and Margolin, 2014] was proposed to better capture sample-specific features through local adaptive kernel fusion. RMKKM [Du et al., 2015] adopted the ℓ2,1-norm to improve resistance to outliers, achieving joint optimization of clustering labels, multi-kernel combinations, and memberships through alternating optimization. Besides, ONKC [Liu et al., 2017] was introduced to enhance the representation capability of the optimal kernel. A multi-view clustering method based on SimpleMKKM, LFMVC [Li et al., 2024], was further studied, which optimizes a new objective function using an efficient two-step optimization strategy. Moreover, a multiple kernel clustering method with multi-scale partition selection [Wang et al., 2024] was developed that dynamically removes noisy kernels and significantly improves performance.

Figure 1: Illustration of granular ball generation.

One of the main limitations of the above MKKM methods is that they must construct kernel matrices over all samples and optimize on them, which results in high time and space costs. In this work, we use the distribution information of samples in multiple kernel spaces to enhance efficiency and clustering accuracy.

### 2.2 Granular-ball Computing

Granular-ball computing (GBC) [Xia et al., 2019] is a helpful tool for improving efficiency by covering data at multiple granularities.
This aligns with the human-cognition strategy of prioritizing global features before processing details proposed in [Chen, 1982] and the concept of multi-granularity cognitive computing [Wang, 2017]. Figure 1 shows how granular balls cover and represent datasets, which provides an intuitive macroscopic perspective for understanding granular computing. Granular-ball computing has been studied in various applications, including accelerating unsupervised K-means [Xia et al., 2020], enhancing density-based clustering such as DPC and DBSCAN [Cheng et al., 2023; Cheng et al., 2024], and addressing high-dimensional challenges through granular-ball-weighted K-means and manifold learning [Xie et al., 2024b; Liu et al., 2024]. In spectral clustering, granular-ball methods exploit data structure for efficiency [Xie et al., 2023]. Recent advancements include the granular-ball clustering algorithm [Xia et al., 2024], which introduces center consistency for complex data, and the integration of fuzzy theory to resolve overlapping boundaries arising from concept drift [Xie et al., 2024a]. Moreover, granular-ball computing has recently been applied to point cloud registration [Hu et al., 2025] and multi-view contrastive learning [Su et al., 2025]. These successes demonstrate granular balls' powerful representation and generalization capabilities, confirming their potential for multi-view learning.

## 3 Methodology

This section formally elaborates the proposed granular-ball-induced multiple kernel clustering framework. It mainly includes four steps: i) granular-ball generation, ii) granular-ball-induced kernel construction, iii) multiple kernel fusion, and iv) optimization of multi-kernel clustering.

Figure 2: Procedure of granular-ball-induced multiple kernel construction.

The granular-ball-induced multiple kernel
construction procedure (the first three steps) is shown in Figure 2. Through these steps, the proposed method significantly improves multi-kernel optimization efficiency (a smaller number of granular balls) and achieves better clustering performance (granular balls fit the data distribution well and exclude potential noise).

### 3.1 Granular-ball Generation

Let $D = \{x_u \in \mathbb{R}^d,\ u = 1, 2, \ldots, n\}$ be a dataset, where $n$ and $d$ denote the number of samples and the dimensionality of the data, respectively; the subscript $u$ indicates the sample index. The granular-ball set is defined as $GB = \{GB_1, GB_2, \ldots, GB_m\}$, where $m$ granular balls cover and describe the dataset $D$. We use $i$ to denote the $i$-th ball, $i \in \{1, 2, \ldots, m\}$, and $S_i$ to denote the size (number of samples) of $GB_i$.

**Definition 1 (Granular-ball Representation).** For the $i$-th ball $GB_i$, its center is $c_i = \frac{1}{S_i} \sum_{u=1}^{S_i} x_u$, and its radii (maximum radius $r_{max}$ and average radius $r_{ave}$) are

$$r_{max} = \max_{x_u \in GB_i} \lVert x_u - c_i \rVert, \qquad r_{ave} = \frac{1}{S_i} \sum_{u=1}^{S_i} \lVert x_u - c_i \rVert. \tag{2}$$

**Definition 2 (Center-Consistency Measure).** For the $i$-th granular ball $GB_i$, its center-consistency measure [Xia et al., 2024] is

$$\mathrm{CCM}_{GB_i} = \frac{|\chi_i| / r_{ave}}{S_i / r_{max}} = \frac{|\chi_i|\, r_{max}}{r_{ave}\, S_i}, \tag{3}$$

where $\chi_i = \{x_u \mid \lVert x_u - c_i \rVert \leq r_{ave}\}$. Equation (3) compares the sample density within the average radius against the overall density within the maximum radius of a granular ball. Intuitively, a smaller ratio indicates a lower center consistency of the sample distribution within the ball.

The criterion for splitting a granular ball compares its CCM value with the median CCM of all balls: if $\mathrm{CCM}_{GB_i} < \lambda \cdot \mathrm{CCM}_{median}$ ($\lambda$ is set to two empirically), the granular ball is split; otherwise, the splitting process stops. The granular-ball generation algorithm is provided in Section 1 of the Appendix.

### 3.2 Calculation of Granular-ball Kernel

Kernel methods embed the data into a new feature space via projections to uncover nonlinear relationships among the data. For the dataset $D$, we consider a set of $\omega$ projection functions $\Phi = \{\phi_1, \phi_2, \ldots, \phi_\omega\}$.
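Before turning to the kernel construction, the coarse-to-fine generation loop of Section 3.1 can be sketched as follows. The 2-means splitting routine and the stopping guards (`min_size`, `max_rounds`) are our own simplifications; the paper's exact algorithm is given in its appendix:

```python
import numpy as np

def ccm(X):
    """Center-consistency measure of one ball (cf. Eq. (3)):
    (|chi| / r_ave) / (S / r_max).  Degenerate balls score infinity."""
    c = X.mean(axis=0)
    d = np.linalg.norm(X - c, axis=1)
    r_max, r_ave = d.max(), d.mean()
    if r_max == 0 or r_ave == 0:
        return np.inf
    chi = np.sum(d <= r_ave)          # samples within the average radius
    return (chi * r_max) / (r_ave * len(X))

def split_two(X, n_iter=10):
    """Split a ball into two via a tiny 2-means (an assumed splitting rule)."""
    rng = np.random.default_rng(0)
    centers = X[rng.choice(len(X), 2, replace=False)].copy()
    for _ in range(n_iter):
        lab = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in (0, 1):
            if np.any(lab == j):
                centers[j] = X[lab == j].mean(axis=0)
    return [X[lab == 0], X[lab == 1]]

def generate_balls(X, lam=2.0, min_size=4, max_rounds=20):
    """Split any ball whose CCM falls below lam * (median CCM of all balls)."""
    balls = [X]
    for _ in range(max_rounds):
        scores = [ccm(b) for b in balls]
        finite = [s for s in scores if np.isfinite(s)]
        med = np.median(finite) if finite else 1.0
        new_balls, changed = [], False
        for b, s in zip(balls, scores):
            if len(b) > min_size and s < lam * med:
                new_balls.extend(split_two(b))
                changed = True
            else:
                new_balls.append(b)
        balls = [b for b in new_balls if len(b) > 0]
        if not changed:
            break
    return balls
```

The splits partition each ball, so the union of all balls always equals the original dataset.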
Each projection $\phi_k$ ($k = 1, 2, \ldots, \omega$) encodes the data into the feature space as a new vector $\phi_k(x)$. Let $\{K_1, K_2, \ldots, K_\omega\}$ denote the Mercer kernel matrices corresponding to these implicit projections:

$$K_k(x_u, x_v) = \phi_k(x_u)^{T} \phi_k(x_v). \tag{4}$$

We embed granular balls into the kernel space and derive the Granular-ball Kernel (GBK), defined as follows:

$$K^{GB}_k(c_i, c_j) = \phi_k(c_i)^{T} \phi_k(c_j) \approx \frac{1}{S_i S_j} \sum_{u=1}^{S_i} \sum_{v=1}^{S_j} K_k(x_u, x_v), \tag{5}$$

where $i, j \in \{1, 2, \ldots, m\}$. Kernel mappings of samples within the same granular ball are approximately equal due to their proximity in the input space. This aligns with the local smoothness assumption in kernel methods, where nearby points have similar kernel mappings. Thus, in (5), the mean of the mappings of the samples $\{x_u\}$ within the same ball can estimate the kernel mapping of the granular-ball center $c_i$:

$$\phi_k(c_i) \approx \frac{1}{S_i} \sum_{u=1}^{S_i} \phi_k(x_u). \tag{6}$$

Intuitively, each granular ball can be seen as a small local distribution or cluster, and the kernel mappings of its internal points are expected to concentrate around a common region of the kernel-induced feature space. If granular balls $GB_i$ and $GB_j$ are regarded as distributions $P$ and $Q$, respectively, then the granular-ball kernel reflects the expected kernel:

$$\mathbb{E}_{x_u \sim P,\, x_v \sim Q}\left[K_k(x_u, x_v)\right] \approx K_k\!\left(\mathbb{E}_{x_u \sim P}[x_u],\ \mathbb{E}_{x_v \sim Q}[x_v]\right) = K^{GB}_k(c_i, c_j). \tag{7}$$

Thus, in practice, we have

$$K^{GB}_k(c_i, c_j) = \frac{1}{S_i S_j} \sum_{u=1}^{S_i} \sum_{v=1}^{S_j} K_k(x_u, x_v). \tag{8}$$

**Theorem 1.** The granular-ball kernel $K^{GB}_k(c_i, c_j)$ satisfies the Mercer condition:
1. $K^{GB}_k(c_i, c_j)$ is symmetric;
2. for any $p$, $p^{T} K^{GB}_k p \geq 0$.

*Proof.* Symmetry: since $K_k(x_u, x_v) = K_k(x_v, x_u)$, it follows that

$$K^{GB}_k(c_i, c_j) = \frac{1}{S_j S_i} \sum_{v=1}^{S_j} \sum_{u=1}^{S_i} K_k(x_v, x_u) = K^{GB}_k(c_j, c_i).$$

Positive semi-definiteness: let $p = [p_1, p_2, \ldots, p_m]^{T}$ and define the sample-level vector $p' \in \mathbb{R}^n$ with $p'_u = p_i / S_i$ for every $x_u \in GB_i$. Since $K_k$ is a kernel and hence positive semi-definite, $p'^{T} K_k p' \geq 0$. Then

$$p^{T} K^{GB}_k p = \sum_{i=1}^{m} \sum_{j=1}^{m} p_i p_j K^{GB}_k(c_i, c_j) = \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{p_i}{S_i} \frac{p_j}{S_j} \sum_{x_u \in GB_i} \sum_{x_v \in GB_j} K_k(x_u, x_v) = p'^{T} K_k p' \geq 0. \tag{9}$$
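Given a precomputed base kernel matrix and a ball-membership vector, Eq. (8) amounts to block-averaging the kernel matrix. Writing it as $K^{GB} = B^{T} K B$ for a column-normalized indicator matrix $B$ also makes the positive semi-definiteness of Theorem 1 immediate. A minimal sketch (the function and variable names are ours):

```python
import numpy as np

def granular_ball_kernel(K, membership):
    """Eq. (8): entry (i, j) of the GBK is the mean of the base kernel K
    over all sample pairs drawn from balls i and j.  `membership[u]` is the
    ball index of sample x_u."""
    membership = np.asarray(membership)
    n, m = len(membership), membership.max() + 1
    B = np.zeros((n, m))
    B[np.arange(n), membership] = 1.0
    B /= B.sum(axis=0, keepdims=True)   # column i averages over ball i
    # K_GB = B^T K B is PSD whenever K is (cf. Theorem 1)
    return B.T @ K @ B
```

The cost is one pass of matrix products, after which all downstream optimization runs on the much smaller $m \times m$ matrix.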
Thus, $K^{GB}_k$ satisfies the Mercer condition. ∎

### 3.3 Multiple Kernel Fusion

Based on the granular-ball kernel $K^{GB}_k(c_i, c_j)$ in Equation (5), we construct a granular-ball multi-kernel, defined as

$$K^{GB}_\sigma = \sum_{k=1}^{\omega} \sigma_k^2\, K^{GB}_k, \tag{10}$$

where $\sigma_k^2$ is the kernel weight and $\sigma \in \Delta$, with $\Delta = \{\sigma \in \mathbb{R}^{\omega} \mid \sum_{k=1}^{\omega} \sigma_k = 1,\ \sigma_k \geq 0,\ \forall k\}$ representing the constraints on the kernel weights. Here, $K^{GB}_\sigma$ will be used in the subsequent granular-ball-induced multi-kernel clustering model.

### 3.4 GB-induced Multiple Kernel K-means

In multiple kernel clustering, the kernel matrices and the clustering partition matrix are the keys to clustering optimization. The granular-ball clustering partition matrix $H_{gb} \in \mathbb{R}^{m \times J}$ is a binary matrix recording the assignment of granular balls to clusters:

$$H_{gb}(i, s) = \begin{cases} 1, & \text{if } GB_i \text{ belongs to the } s\text{-th cluster}, \\ 0, & \text{otherwise}. \end{cases} \tag{11}$$

**Algorithm 1: GB-SMKKM**
Input: granular-ball kernel matrices $\{K^{GB}_k\}_{k=1}^{\omega}$; initial $t = 1$.
Output: optimized $\sigma$.
1. Initialize $\sigma^{(1)} = \frac{1}{\omega}\mathbf{1}$ and $flag = 1$.
2. While $flag$ holds:
   1. Compute $K^{GB}_{\sigma^{(t)}} = \sum_{k=1}^{\omega} (\sigma^{(t)}_k)^2 K^{GB}_k$ and obtain $H^{(t)}_{gb}$.
   2. Compute $\frac{\partial K_\sigma}{\partial \sigma_k}$ and obtain the descent direction $d^{(t)}$ via gradient descent in Equation (14).
   3. Update $\sigma^{(t+1)} \leftarrow \sigma^{(t)} + \alpha d^{(t)}$.
   4. If $\max |\sigma^{(t+1)} - \sigma^{(t)}| \leq 10^{-4}$, set $flag = 0$.
   5. $t \leftarrow t + 1$.
3. Return $\sigma$.

To balance the influence of different kernels and ensure reasonable sample assignments, we optimize the model by minimizing the following objective with respect to the kernel coefficients $\sigma_k$ ($k = 1, 2, \ldots, \omega$):

$$\min \sum_{k=1}^{\omega} \sigma_k^2 \sum_{i=1}^{m} \sum_{s=1}^{J} H_{gb}(i, s)\, R_{k,i,s}, \tag{12}$$

where $R_{k,i,s} = K^{GB}_k(c_i, c_i) - 2K^{GB}_k(c_i, z_s) + K^{GB}_k(z_s, z_s)$. Here, $J$ clusters are considered, and $z_s$ denotes the center of the $s$-th cluster. In the objective, the first term is the self-similarity of the granular ball $GB_i$ in the kernel space, the second term measures the similarity between $GB_i$ and the cluster center $z_s$, and the third term is the cluster center's self-similarity.
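A minimal sketch of the fusion in Eq. (10) and of the kernel-space squared distance $R_{k,i,s}$ in Eq. (12); the function names are ours:

```python
import numpy as np

def fuse_gb_kernels(gb_kernels, sigma):
    """Eq. (10): K_sigma = sum_k sigma_k^2 * K_k^GB, with sigma constrained
    to the simplex (nonnegative, summing to one)."""
    sigma = np.asarray(sigma, dtype=float)
    assert np.all(sigma >= 0) and np.isclose(sigma.sum(), 1.0)
    return sum(s ** 2 * Kk for s, Kk in zip(sigma, gb_kernels))

def kernel_sq_distance(K, i, j):
    """The squared distance underlying R_{k,i,s} in Eq. (12):
    ||phi(c_i) - phi(z_s)||^2 = K(i,i) - 2 K(i,j) + K(j,j)."""
    return K[i, i] - 2.0 * K[i, j] + K[j, j]
```

Because each $K^{GB}_k$ is PSD (Theorem 1) and the weights $\sigma_k^2$ are nonnegative, the fused matrix is PSD as well, and the distance expression is always nonnegative.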
To simplify the objective function, we omit the explicit cluster centers in Equation (12) and reformulate the objective in matrix form, thereby constructing the granular-ball-based multi-kernel clustering model (GB-SMKKM):

$$\min_{\sigma} \max_{H_{gb} \in \Omega} \operatorname{Tr}\left(K^{GB}_\sigma (H_{gb} H_{gb}^{T} - I_m)\right), \tag{13}$$

where $\Omega = \{H_{gb} \in \mathbb{R}^{m \times J} \mid H_{gb}^{T} H_{gb} = I_J\}$. Further, we have

$$\min_{\sigma} \max_{H_{gb} \in \Omega} \operatorname{Tr}\left(K^{GB}_\sigma H_{gb} H_{gb}^{T}\right) - \operatorname{Tr}\left(K^{GB}_\sigma I_m\right). \tag{14}$$

Algorithm 1 presents the GB-SMKKM optimization procedure. Section 2 of the Appendix proves that the above objective is differentiable and convex.

### 3.5 Computational Complexity

In this work, $n$ is the number of samples, $m$ the number of granular balls, $\omega$ the number of kernel functions, and $k$ the number of clusters. The main computational cost of the proposed method involves constructing the $(m \times m)$ kernel matrix $K^{GB}_\sigma$ and optimizing the assignment matrix $H_{gb}$. Specifically, traversing the $\omega$ kernel matrices incurs $O(\omega m^2)$ complexity. Then, for the maximization step and the clustering assignment computation, each update of $H_{gb} \in \Omega$ requires solving a positive semi-definite optimization problem, in which the leading eigenvectors are computed once before K-means clustering. Hence, this step incurs $O(m^3 + m^2 k t)$ complexity, where $t$ is the number of optimization iterations. The computational bottleneck lies in the matrix decomposition step, which incurs $O(m^3)$ complexity.

| Dataset | # Samples | Dimension | Clusters | Views | Type |
|---|---|---|---|---|---|
| GLIOMA | 50 | 4,434 | 4 | 1 | gene |
| Srbct | 63 | 2,308 | 4 | 1 | gene |
| YALE | 165 | 1,024 | 15 | 1 | image |
| JAFFE | 213 | 676 | 10 | 1 | image |
| ORL | 400 | 1,024 | 40 | 1 | image |
| WDBC | 569 | 30 | 2 | 1 | bio. |
| WBC | 683 | 9 | 2 | 1 | bio. |
| TR41 | 878 | 7,454 | 10 | 1 | text |
| DS8 | 1,280 | 2 | 3 | 1 | noise |
| Caltec101-7 | 1,474 | 48, 40, 254, 1984, 512, 928 | 7 | 6 | image |
| Mfeat | 2,000 | 216, 76, 64, 6, 240, 47 | 10 | 6 | image |
| Handwritten | 10,000 | 784, 256 | 10 | 2 | image |

Table 1: Dataset information.
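The min-max problem in Eq. (14) can be sketched numerically as follows. For fixed $\sigma$ the inner maximizer $H_{gb}$ is the top-$J$ eigenvector matrix of $K^{GB}_\sigma$, and by Danskin's theorem the outer gradient is evaluated at that maximizer. The sketch substitutes a plain projected-gradient step on the simplex for the paper's reduced-gradient rule (an assumption on our part); all names are ours:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto {x : x >= 0, sum(x) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.nonzero(u - css / (np.arange(len(v)) + 1) > 0)[0][-1]
    return np.maximum(v - css[idx] / (idx + 1), 0.0)

def gb_smkkm(gb_kernels, n_clusters, step=0.05, tol=1e-4, max_iter=200):
    """Projected-gradient sketch of the Eq. (14) objective
    min_sigma [ max_H Tr(K_sigma H H^T) - Tr(K_sigma) ]."""
    w = len(gb_kernels)
    sigma = np.full(w, 1.0 / w)
    H = None
    for _ in range(max_iter):
        K = sum(s ** 2 * Kk for s, Kk in zip(sigma, gb_kernels))
        _, vecs = np.linalg.eigh(K)
        H = vecs[:, -n_clusters:]             # inner maximizer of Tr(H^T K H)
        # Danskin: gradient of the outer objective at the inner optimum
        grad = np.array([2.0 * s * (np.trace(H.T @ Kk @ H) - np.trace(Kk))
                         for s, Kk in zip(sigma, gb_kernels)])
        new_sigma = project_simplex(sigma - step * grad)
        converged = np.max(np.abs(new_sigma - sigma)) <= tol
        sigma = new_sigma
        if converged:
            break
    return sigma, H
```

Since the matrices here are $m \times m$ rather than $n \times n$, each eigendecomposition costs $O(m^3)$, matching the complexity analysis above.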
In contrast, MKKM operates on the complete $n \times n$ kernel matrix, where the matrix decomposition has $O(n^3)$ complexity. This highlights the advantage of the granular-ball kernel approach, as it reduces the $O(n^3)$ term to $O(m^3)$, where $m \ll n$. Finally, when the optimization of the GB-SMKKM objective requires $T$ iterations, the total time complexity is $O\left(T[\omega m^2 + m^3 + m^2 k t]\right)$.

## 4 Experiments

In this section, we conduct the following experiments: evaluating the clustering performance of the proposed granular-ball-induced multiple kernel clustering against recent strong multiple kernel baselines on nine single-view datasets and three multi-view datasets; comparing different anchor-sampling methods in multi-kernel clustering frameworks to verify the effectiveness of granular-ball computing; and analyzing optimization convergence and computational cost. The experiments are conducted using MATLAB R2022a. Source code and supplementary materials are available at https://github.com/WangYifan4211115/GB-MKKM.

### 4.1 Setup

**Datasets.** We evaluate the proposed algorithm on 12 datasets, summarized in Table 1. These datasets differ significantly in sample size, multi-view attributes, dimensionality, and number of categories, making them highly representative and capable of comprehensively verifying the applicability and robustness of the algorithm.

**Metrics.** We adopt three widely used external evaluation metrics [Rand, 1971] to measure clustering performance: i) the Adjusted Rand Index (ARI), which evaluates the similarity between the predicted and true cluster assignments while adjusting for chance; ii) Normalized Mutual Information (NMI), which quantifies the amount of information shared between the predicted clusters and the ground truth; and iii) Clustering Accuracy (ACC), which directly measures the proportion of correctly classified samples. To ensure the reliability of the
results and reduce the impact of randomness, each method is repeated 20 times, and the average results are reported.

**Baselines.** The following recent strong baselines are included for comparison:

- Approximate kernel K-means (RKKM-a) approximates kernel K-means using a low-rank matrix decomposition, significantly reducing the computational cost ([Chitta et al., 2011], SIGKDD).
- Affinity aggregation for spectral clustering (AASC) aggregates affinity matrices to enhance spectral clustering ([Huang et al., 2012], CVPR).
- Multiple kernel K-means (MKKM) combines multiple kernels into a consensus kernel to enhance alignment and performance ([Huang et al., 2011], IEEE TFS).
- Robust multiple kernel K-means (RMKKM) leverages the ℓ2,1-norm to handle noise and outliers ([Du et al., 2015], IJCAI).
- Simple multiple kernel K-means (SimpleMKKM) simplifies optimization by introducing a streamlined min-max formulation with fewer parameters ([Liu, 2022], IEEE TPAMI).
- Multiple kernel K-means clustering with simultaneous spectral rotation (MKKM-SR) integrates spectral rotation with multiple kernel K-means to align the clustering structure across kernels ([Lu et al., 2022], ICASSP).
- Scalable multiple kernel clustering (SMKC) scales multiple kernel clustering by learning consensus clustering structures from expectation kernels ([Liang et al., 2024], ICML).

Furthermore, we introduce the proposed granular-ball-induced kernel into three of the baselines (MKKM, MKKM-SR, and SMKKM). For a fair comparison, we uniformly employ three types of kernels (with various parameters): Linear, Polynomial, and Gaussian, resulting in 6 kernel functions. Note that for the comparison with RMKKM, we adopt the optimal parameters and 12 kernel functions used in its original paper [Du et al., 2015].

### 4.2 Results on Single-view Clustering

Table 2 reports comparison results averaged over nine single-view clustering tasks in terms of the ACC, NMI, and ARI metrics.
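For reference, the ACC metric used in these comparisons requires matching predicted cluster labels to the ground-truth labels before counting agreements. A minimal sketch with brute-force matching (the function name is ours; it assumes the number of predicted clusters does not exceed the number of true classes, and a Hungarian assignment would scale better):

```python
import numpy as np
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """ACC: fraction of samples correct under the best one-to-one relabeling
    of predicted clusters.  Brute force over label permutations, which is
    fine for small cluster counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    true_labels = np.unique(y_true)
    pred_labels = np.unique(y_pred)
    best = 0.0
    for perm in permutations(true_labels, len(pred_labels)):
        remap = dict(zip(pred_labels, perm))
        acc = np.mean([remap[p] == t for p, t in zip(y_pred, y_true)])
        best = max(best, acc)
    return best
```

For example, predictions `[1, 1, 0, 0]` against ground truth `[0, 0, 1, 1]` score a perfect ACC of 1.0 after relabeling.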
As can be seen, the proposed GB-SMKKM consistently achieves the best results on all three metrics. Specifically, compared to SMKKM, the ACC, NMI, and ARI improvements achieved by the proposed method are 6.83%, 13.93%, and 5.93%, respectively. These improvements are statistically significant, as the paired t-test yields p-values of 0.038, 0.014, and 0.022 for ACC, NMI, and ARI, respectively. Moreover, the improvements remain significant when the proposed granular-ball-induced kernel is introduced into other baselines, such as MKKM and MKKM-SR. For example, the average ACC, NMI, and ARI of GB-MKKM-SR are 5.59%, 11.79%, and 5.84% higher than those of MKKM-SR, respectively, and the gains of GB-MKKM over MKKM in ACC, NMI, and ARI are 1.05%, 3.47%, and 1.33%, respectively. These results indicate that using granular balls benefits optimization in multi-kernel spaces.

Figure 3: Comparison of ACC, NMI, ARI, and average performance between three anchor selection strategies and GBC.

| Method | ACCavg (%) | NMIavg (%) | ARIavg (%) |
|---|---|---|---|
| Random Selection | 58.84 | 45.12 | 28.87 |
| K-means | 64.73 | 52.65 | 40.78 |
| BKHK | 67.21 | 58.40 | 48.10 |
| RKKM-a | 62.10 | 50.42 | 66.61 |
| AASC | 49.48 | 31.19 | 52.88 |
| RMKKM | 71.98 | 62.58 | 75.36 |
| MKKM | 72.79 | 62.91 | 78.30 |
| GB-MKKM | 73.84 | 66.38 | 79.63 |
| improvements | 1.05 | 3.47 | 1.33 |
| MKKM-SR | 66.54 | 51.58 | 71.35 |
| GB-MKKM-SR | 71.93 | 63.37 | 77.19 |
| improvements | 5.59 | 11.79 | 5.84 |
| SMKKM | 73.71 | 62.58 | 77.97 |
| GB-SMKKM | 80.54 | 76.51 | 83.90 |
| improvements | 6.83 | 13.93 | 5.93 |

Table 2: Comparison results in terms of ACC, NMI, and ARI averaged over nine datasets.

This might be explained by the fact that granular balls can capture the data distribution at multiple granularities and exclude potentially noisy data via the estimated ball boundaries. We also conducted a statistical analysis of ACC using the Friedman test; the results are reported in Figure 4.
The detailed calculation process of the Friedman test is provided in Section 3 of the Appendix.

### 4.3 Results on Multi-view Clustering

Here, three multi-view clustering tasks are conducted, comparing against a recent strong multi-view baseline, SMKC ([Liang et al., 2024], ICML). Note that SMKC is also built upon SMKKM and thus shares a similar foundation with our model. Results are summarized in Table 3.

Figure 4: Comparison on the Friedman test. Groups of methods that are not significantly different (at p = 0.05) are connected.

| Method | GB-SMKC | | | SMKC | | |
|---|---|---|---|---|---|---|
| Dataset | Hdigit | Mfeat | C101-7 | Hdigit | Mfeat | C101-7 |
| GB/Anchor counts | 396 | 60 | 30 | 400 | 62 | 31 |
| ACC (%) | 77.28 | 94.80 | 47.83 | 76.63 | 89.40 | 43.69 |
| NMI (%) | 65.48 | 88.90 | 38.33 | 64.40 | 80.92 | 36.09 |
| ARI (%) | 77.28 | 94.80 | 82.02 | 76.63 | 89.40 | 81.41 |

Table 3: Comparison of GB-SMKC and SMKC. In the table, Caltec101-7 is abbreviated as C101-7.

For the experiments using granular-ball kernels in SMKC, we introduce our variant GB-SMKC, obtained by using granular-ball computing to replace SMKC's anchor-selection method. The number of anchors in SMKC is kept close to the number of granular balls to ensure fairness. As seen in Table 3, GB-SMKC outperforms SMKC on all of the ACC, NMI, and ARI metrics. Specifically, the ACC of GB-SMKC is higher by 0.65%, 5.40%, and 4.14% on the Handwritten, Mfeat, and Caltech101-7 tasks, respectively. This validates the effectiveness and applicability of the GB-induced framework on these multi-view clustering tasks.
| Method | ORL | WDBC | WBC | TR41 | Ds8 |
|---|---|---|---|---|---|
| RKKM-a | 0.164 | 0.104 | 0.073 | 0.144 | 0.365 |
| AASC | 0.942 | 0.216 | 1.772 | 2.962 | 1.524 |
| RMKKM | 2.338 | 1.605 | 1.031 | 3.769 | 7.713 |
| MKKM | 0.067 | 0.104 | 0.116 | 0.152 | 0.644 |
| GB-MKKM | 0.021 | 0.017 | 0.008 | 0.033 | 0.154 |
| MKKM-SR | 0.322 | 0.056 | 0.048 | 0.182 | 0.711 |
| GB-MKKM-SR | 0.196 | 0.014 | 0.015 | 0.047 | 0.041 |
| SMKKM | 0.111 | 0.542 | 2.175 | 0.202 | 6.504 |
| GB-SMKKM | 0.050 | 0.536 | 0.220 | 0.073 | 1.202 |

Table 4: Running time (s) comparison of the GB-induced framework and six baselines.

### 4.4 Effects of Anchor Selection Strategies

Granular-ball computing reduces computational cost by transforming the calculation of point-to-point relationships into that of ball-to-ball relationships. As it adaptively determines subsets, granular-ball computing can be regarded as an anchor-selection method. Compared to other anchor-sampling methods, such as Balanced K-means based Hierarchical K-means (BKHK) [Zhu et al., 2017], K-means [Cai and Chen, 2014], and random anchor selection [Nie et al., 2023], granular-ball computing is better at describing data distributions with complex shapes, because the generated granular balls adaptively determine multi-granularity subsets based on the distribution characteristics of the data and the density-based consistency metric. In addition, the ball boundaries can exclude unknown noise points, further improving robustness. In this experiment, we compare the above anchor-sampling methods to provide empirical evidence for this effectiveness. Figure 3 summarizes the comparison results. As can be seen, granular-ball computing is a better choice than all the other sampling methods for multi-kernel clustering optimization, since the generated balls fit the original data distribution well and are robust to noise.
Hence, the above results further verify the effectiveness of the proposed granular-ball kernels.

### 4.5 Runtime Analysis

Table 4 presents the runtime comparison between the GB-induced framework and six baselines on five representative datasets. The results show that GB-MKKM is on average 0.170 seconds faster than MKKM, GB-MKKM-SR is on average 0.201 seconds faster than MKKM-SR, and GB-SMKKM is 1.491 seconds faster than SMKKM. Note that granular-ball construction can be performed in advance, independently of the optimization of the different models; we therefore do not include the construction time here. In Section 4 of the Appendix, we report both the ball-generation and model-optimization times, which show that the speedup grows as the dataset scale increases. Therefore, granular-ball-induced kernel matrix construction makes the MKKM algorithms more scalable. Meanwhile, Section 5 of the Appendix presents convergence graphs of the monotonically decreasing objective values on multiple datasets during the iterative process.

### 4.6 Analysis of Granular-ball Numbers

Computing a full $n \times n$ kernel matrix, as in traditional MKKM methods, is computationally expensive. The GB-MKKM framework instead uses granular balls to characterize the data distribution. By reducing the size of the kernel matrix, GB-MKKM lowers the complexity of the subsequent optimization, avoiding the point-to-point dependency modeling inherent in traditional methods.

Figure 5: Granular-ball count vs. sample size.

In Figure 5, we report the number of granular balls adaptively determined by the granular-ball computing process. As can be seen, the number of granular balls is significantly smaller than the number of original sample points on the same datasets. The results indicate that the GB-MKKM framework not only achieves high clustering performance but also effectively reduces the number of sample points.
This demonstrates its remarkable flexibility and applicability to various MKKM algorithms, fully proving the ability of granular balls to characterize data in kernel space effectively. Additional experimental results on single-kernel methods and other baselines are provided in Section 6 of the Appendix.

## 5 Conclusion

This paper introduces the GB-induced clustering framework, which enhances multi-kernel K-means (MKKM) by incorporating granular-ball computing. Traditional MKKM suffers from high computational costs due to full kernel matrix construction and point-to-point dependencies. To address this, we propose a novel granular-ball representation that models relationships between local data regions (granular balls) instead of individual points. This ball-to-ball approach significantly reduces complexity, improves robustness to noise, and achieves higher clustering accuracy. Extensive experiments show that the GB-induced framework and its variants consistently outperform traditional MKKM methods across diverse datasets, demonstrating strong generalization, robustness, and scalability. The framework is broadly applicable to existing MKKM algorithms and can be extended to deep architectures for enhanced multi-kernel clustering in neural networks.

## Acknowledgments

This research was supported by the Chongqing Graduate Research Innovation Program CYB25259 and the National Natural Science Foundation of China under Grant Nos. 62221005, 62450043, 62222601, and 62176033.

## References

[Abin and Beigy, 2015] Ahmad Ali Abin and Hamid Beigy. Active constrained fuzzy clustering: A multiple kernels learning approach. Pattern Recognition, 48(3):953–967, 2015.

[Aggarwal, 2018] Charu C. Aggarwal. An introduction to cluster analysis. In Data Clustering, pages 1–28. Chapman and Hall/CRC, 2018.

[Arthur and Vassilvitskii, 2006] David Arthur and Sergei Vassilvitskii. k-means++: The advantages of careful seeding.
Technical report, Stanford, 2006.

[Cai and Chen, 2014] Deng Cai and Xinlei Chen. Large scale spectral clustering via landmark-based sparse representation. IEEE Transactions on Cybernetics, 45(8):1669-1680, 2014.

[Chao et al., 2021] Guoqing Chao, Shiliang Sun, and Jinbo Bi. A survey on multiview clustering. IEEE Transactions on Artificial Intelligence, 2(2):146-168, 2021.

[Chen, 1982] Lin Chen. Topological structure in visual perception. Science, 218(4573):699-700, 1982.

[Cheng et al., 2023] Dongdong Cheng, Ya Li, Shuyin Xia, Guoyin Wang, Jinlong Huang, and Sulan Zhang. A fast granular-ball-based density peaks clustering algorithm for large-scale data. IEEE Transactions on Neural Networks and Learning Systems, 2023.

[Cheng et al., 2024] Dongdong Cheng, Cheng Zhang, Ya Li, Shuyin Xia, Guoyin Wang, Jinlong Huang, Sulan Zhang, and Jiang Xie. GB-DBSCAN: A fast granular-ball based DBSCAN clustering algorithm. Information Sciences, 674:120731, 2024.

[Chitta et al., 2011] Radha Chitta, Rong Jin, T. Havens, and Anil K. Jain. Approximate kernel k-means: solution to large scale kernel clustering. In Knowledge Discovery and Data Mining, 2011.

[Cortes et al., 2012] Corinna Cortes, Mehryar Mohri, and Afshin Rostamizadeh. Algorithms for learning kernels based on centered alignment. The Journal of Machine Learning Research, 13(1):795-828, 2012.

[Du et al., 2015] Liang Du, Peng Zhou, Lei Shi, Hanmo Wang, Mingyu Fan, Wenjian Wang, and Yi-Dong Shen. Robust multiple kernel k-means using l21-norm. In Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.

[Duran and Odell, 2013] Benjamin S Duran and Patrick L Odell. Cluster Analysis: A Survey, volume 100. Springer Science & Business Media, 2013.

[Ester et al., 1996] Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, et al. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD, volume 96, pages 226-231, 1996.

[Girolami, 2002] Mark Girolami.
Mercer kernel-based clustering in feature space. IEEE Transactions on Neural Networks, 13(3):780-784, 2002.

[Gönen and Margolin, 2014] Mehmet Gönen and Adam A Margolin. Localized data fusion for kernel k-means clustering with application to cancer biology. Advances in Neural Information Processing Systems, 27, 2014.

[Hu et al., 2025] Limei Hu, Feng Chen, Sen Zhao, Shukai Duan, et al. Gricp: Granular-ball iterative closest point with multikernel correntropy for point cloud fine registration. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 1710-1718, 2025.

[Huang et al., 2011] Hsin-Chien Huang, Yung-Yu Chuang, and Chu-Song Chen. Multiple kernel fuzzy clustering. IEEE Transactions on Fuzzy Systems, 20(1):120-134, 2011.

[Huang et al., 2012] Hsin-Chien Huang, Yung-Yu Chuang, and Chu-Song Chen. Affinity aggregation for spectral clustering. In 2012 IEEE Conference on Computer Vision and Pattern Recognition, pages 773-780. IEEE, 2012.

[Li et al., 2023] Miaomiao Li, Yi Zhang, Suyuan Liu, Zhe Liu, and Xinzhong Zhu. Simple multiple kernel k-means with kernel weight regularization. Information Fusion, 100:101902, 2023.

[Li et al., 2024] Miaomiao Li, Xinwang Liu, Yi Zhang, and Weixuan Liang. Late fusion multiview clustering via min-max optimization. IEEE Transactions on Neural Networks and Learning Systems, 35(7):9417-9427, 2024.

[Liang et al., 2024] Weixuan Liang, En Zhu, Shengju Yu, Huiying Xu, Xinzhong Zhu, and Xinwang Liu. Scalable multiple kernel clustering: Learning clustering structure from expectation. In Forty-First International Conference on Machine Learning, 2024.

[Liu et al., 2016] Xinwang Liu, Yong Dou, Jianping Yin, Lei Wang, and En Zhu. Multiple kernel k-means clustering with matrix-induced regularization. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 30, 2016.

[Liu et al., 2017] Xinwang Liu, Sihang Zhou, Yueqing Wang, Miaomiao Li, Yong Dou, En Zhu, and Jianping Yin.
Optimal neighborhood kernel clustering with multiple kernels. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 31, 2017.

[Liu et al., 2024] Shushu Liu, Dongdong Cheng, and Jiang Xie. Granular-ball-based fast spectral embedding clustering algorithm for large-scale data. In Proceedings of the 2024 16th International Conference on Machine Learning and Computing, pages 16-20, 2024.

[Liu, 2022] Xinwang Liu. SimpleMKKM: Simple multiple kernel k-means. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(4):5174-5186, 2022.

[Lu et al., 2022] Jitao Lu, Yihang Lu, Rong Wang, Feiping Nie, and Xuelong Li. Multiple kernel k-means clustering with simultaneous spectral rotation. In ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4143-4147. IEEE, 2022.

[Mac Queen and others, 1967] James MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, volume 1, pages 281-297. Oakland, CA, USA, 1967.

[Mussabayev et al., 2023] Rustam Mussabayev, Nenad Mladenovic, Bassem Jarboui, and Ravil Mussabayev. How to use k-means for big data clustering? Pattern Recognition, 137:109269, 2023.

[Ng et al., 2001] Andrew Ng, Michael Jordan, and Yair Weiss. On spectral clustering: Analysis and an algorithm. Advances in Neural Information Processing Systems, 14, 2001.

[Nie et al., 2023] Feiping Nie, Jingjing Xue, Weizhong Yu, and Xuelong Li. Fast clustering with anchor guidance. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.

[Rand, 1971] William M Rand. Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association, 66(336):846-850, 1971.

[Sinaga and Yang, 2020] Kristina P Sinaga and Miin-Shen Yang.
Unsupervised k-means clustering algorithm. IEEE Access, 8:80716-80727, 2020.

[Su et al., 2025] Peng Su, Shudong Huang, Weihong Ma, Deng Xiong, and Jiancheng Lv. Multi-view granular-ball contrastive clustering. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 20637-20645, 2025.

[Tao et al., 2018] Hong Tao, Chenping Hou, Xinwang Liu, Tongliang Liu, Dongyun Yi, and Jubo Zhu. Reliable multi-view clustering. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.

[Wang et al., 2024] Jun Wang, Zhenglai Li, Chang Tang, Suyuan Liu, Xinhang Wan, and Xinwang Liu. Multiple kernel clustering with adaptive multi-scale partition selection. IEEE Transactions on Knowledge and Data Engineering, 36(11):6641-6652, 2024.

[Wang, 2017] Guoyin Wang. DGCC: data-driven granular cognitive computing. Granular Computing, 2(4):343-355, 2017.

[Xia et al., 2019] Shuyin Xia, Yunsheng Liu, Xin Ding, Guoyin Wang, Hong Yu, and Yuoguo Luo. Granular ball computing classifiers for efficient, scalable and robust learning. Information Sciences, 483:136-152, 2019.

[Xia et al., 2020] Shuyin Xia, Daowan Peng, Deyu Meng, Changqing Zhang, Guoyin Wang, Elisabeth Giem, Wei Wei, and Zizhong Chen. A fast adaptive k-means with no bounds. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020.

[Xia et al., 2024] Shuyin Xia, Bolun Shi, Yifan Wang, Jiang Xie, Guoyin Wang, and Xinbo Gao. GBCT: An efficient and adaptive granular-ball clustering algorithm for complex data. arXiv preprint arXiv:2410.13917, 2024.

[Xie et al., 2023] Jiang Xie, Weiyu Kong, Shuyin Xia, Guoyin Wang, and Xinbo Gao. An efficient spectral clustering algorithm based on granular-ball. IEEE Transactions on Knowledge and Data Engineering, 35(9):9743-9753, 2023.

[Xie et al., 2024a] Jiang Xie, Minggao Dai, Shuyin Xia, Jingjing Zhang, Guoyin Wang, and Xinbo Gao. An efficient fuzzy stream clustering method based on granular-ball structure.
In 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 901-913. IEEE, 2024.

[Xie et al., 2024b] Jiang Xie, Chunfeng Hua, Shuyin Xia, Yuxin Cheng, Guoyin Wang, and Xinbo Gao. W-GBC: An adaptive weighted clustering method based on granular-ball structure. In 2024 IEEE 40th International Conference on Data Engineering (ICDE), pages 914-925. IEEE, 2024.

[Yan et al., 2023] Wenzhu Yan, Yanmeng Li, and Ming Yang. Towards deeper match for multi-view oriented multiple kernel learning. Pattern Recognition, 134:109119, 2023.

[Yao et al., 2020] Yaqiang Yao, Yang Li, Bingbing Jiang, and Huanhuan Chen. Multiple kernel k-means clustering by selecting representative kernels. IEEE Transactions on Neural Networks and Learning Systems, 32(11):4983-4996, 2020.

[Yu et al., 2011] Shi Yu, Leon Tranchevent, Xinhai Liu, Wolfgang Glänzel, Johan AK Suykens, Bart De Moor, and Yves Moreau. Optimized data fusion for kernel k-means clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(5):1031-1039, 2011.

[Zhang et al., 2020] Changqing Zhang, Yajie Cui, Zongbo Han, Joey Tianyi Zhou, Huazhu Fu, and Qinghua Hu. Deep partial multi-view learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(5):2402-2415, 2020.

[Zhang et al., 2021] Tiejian Zhang, Xinwang Liu, Lei Gong, Siwei Wang, Xin Niu, and Li Shen. Late fusion multiple kernel clustering with local kernel alignment maximization. IEEE Transactions on Multimedia, 25:993-1007, 2021.

[Zhao et al., 2009] Bin Zhao, James T Kwok, and Changshui Zhang. Multiple kernel clustering. In Proceedings of the 2009 SIAM International Conference on Data Mining, pages 638-649. SIAM, 2009.

[Zhu et al., 2017] Wei Zhu, Feiping Nie, and Xuelong Li. Fast spectral clustering with efficient large graph construction. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2492-2496. IEEE, 2017.