# Analogical Inference Enhanced Knowledge Graph Embedding

Zhen Yao1*, Wen Zhang1*, Mingyang Chen2, Yufeng Huang1, Yi Yang4, Huajun Chen2,3,5

1School of Software Technology, Zhejiang University; 2College of Computer Science and Technology, Zhejiang University; 3Donghai Laboratory, Zhoushan 316021, China; 4Huawei Technologies Co., Ltd; 5Alibaba-Zhejiang University Joint Institute of Frontier Technologies

{yz0204, zhang.wen, mingyangchen, huangyufeng, huajunsir}@zju.edu.cn, yangyi193@huawei.com

*These authors contributed equally. Corresponding Author. Copyright 2023, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

## Abstract

Knowledge graph embedding (KGE), which maps entities and relations in a knowledge graph into continuous vector spaces, has achieved great success in predicting missing links in knowledge graphs. However, knowledge graphs often contain incomplete triples that are difficult for KGEs to infer inductively. To address this challenge, we resort to analogical inference and propose a novel and general self-supervised framework AnKGE to enhance KGE models with analogical inference capability. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity, relation, and triple levels. In AnKGE, we train an analogy function for each level of analogical inference, which takes the original element embedding from a well-trained KGE model as input and outputs the analogical object embedding. To combine the inductive inference capability of the original KGE model with the analogical inference capability enhanced by AnKGE, we interpolate the analogy score with the base model score and introduce adaptive weights in the score function for prediction. Through extensive experiments on the FB15k-237 and WN18RR datasets, we show that AnKGE achieves competitive results on the link prediction task and performs analogical inference well.

## 1 Introduction

Knowledge graphs (KGs), storing a large number of triples in the form of (head entity, relation, tail entity), (h, r, t) for short, are popular data structures for representing factual knowledge. Many knowledge graph projects such as Freebase (Bollacker et al. 2008), WordNet (Miller 1994), YAGO (Suchanek, Kasneci, and Weikum 2007) and DBpedia (Lehmann et al. 2015) are significant foundations supporting artificial intelligence applications. They have been successfully used in downstream applications such as word sense disambiguation (Bevilacqua and Navigli 2020), question answering (Yasunaga et al. 2021), and information extraction (Hu et al. 2021), gaining widespread attention. However, most KGs are incomplete, so predicting the missing links between entities is a fundamental problem for KGs, called link prediction. A common approach to this problem is knowledge graph embedding (KGE), which makes predictions through a predefined triple score function with learnt entity and relation embeddings as input. Many KGE models have been proposed, such as TransE (Bordes et al. 2013), DistMult (Yang et al. 2015), RotatE (Sun et al. 2019) and HAKE (Zhang et al. 2020). These methods have achieved great success on the knowledge graph completion task. For most KGE methods, the parametric learning paradigm can be viewed as memorization, regarding the training data as a book and predicting missing links as a closed-book test (Chen et al. 2022), which belongs to inductive inference.
However, large knowledge graphs often contain incomplete triples that are difficult to infer inductively with the memorization paradigm, whereas the problem may be well solved by analogical inference. Analogical inference is a referential method that retrieves similar solutions to solve new problems, similar to an open-book examination. For example, most people have never learned, or cannot remember, which company Ron Wayne founded. However, if they know that Ron Wayne and Steve Jobs were co-founders, i.e., Steve Jobs and Ron Wayne are analogical objects in this context, and it is well known that Steve Jobs founded Apple Inc., then they can analogically infer that Ron Wayne founded Apple Inc.

To enhance KGEs with analogical inference capability, three problems should be solved: 1) How to define the analogical objects of elements given a task? 2) How to enable the model to map elements to analogical objects? 3) How to combine the original inductive inference capability and the enhanced analogical inference capability?

We propose AnKGE, a novel and general self-supervised framework that solves these problems and enhances well-trained KGEs with analogical inference capability. For problem 1, we assume that an analogical object should solve the given task well, and inspired by the nearest neighbor language model (Khandelwal et al. 2020), we propose an analogical retriever covering objects at three levels: entity, relation, and triple. Specifically, we regard the score function of a KGE as an assessment of triple quality and take the highest-scoring replacement triples as the appropriate analogical objects. For problem 2, we train a projection function, using analogical objects as supervision signals, that projects original objects onto appropriate analogical objects. For problem 3, we interpolate the analogy score with the base model score to combine the original inductive inference capability and the enhanced analogical inference capability. Moreover, we introduce adaptive weights to adjust analogical inference on the knowledge graph completion task.

Finally, through link prediction experiments on the FB15k-237 and WN18RR datasets, we demonstrate that AnKGE is broadly compatible and outperforms the other baseline models. To the best of our knowledge, AnKGE is the first framework to enhance KGEs with analogical inference ability. In summary, our contributions in this work include:

- We explore the knowledge graph completion task from the analogical inference view and propose an effective retrieval method covering three levels to obtain appropriate analogy objects.
- We propose a novel analogical-inference-enhanced framework called AnKGE, which projects original objects onto appropriate objects for analogical inference. To our knowledge, AnKGE is the first knowledge graph embedding framework enhanced with analogical inference ability.
- We conduct experimental evaluations demonstrating that AnKGE is broadly compatible and achieves competitive performance on the FB15k-237 and WN18RR datasets, promising practical applications.

## 2 Related Work

**Knowledge graph embedding.** According to previous work (Zhang et al. 2022), KGE methods can be divided into two categories based on the scoring function and whether a global graph structure is utilized.
The first category is Conventional KGEs (C-KGEs), which apply a geometric assumption in vector space for true triples and use a single triple as input for triple scoring. Conventional KGEs use a score function to measure the plausibility of a triple. TransE (Bordes et al. 2013) is a representative conventional KGE method whose score function is $-\|\mathbf{h} + \mathbf{r} - \mathbf{t}\|_2$. Furthermore, many variants improve on TransE, such as RotatE (Sun et al. 2019), DistMult (Yang et al. 2015) and HAKE (Zhang et al. 2020). The other category is GNN-based methods, which score triples using representations of entities and relations aggregated from their neighbors in the graph, capturing graph patterns explicitly. R-GCN (Schlichtkrull et al. 2018) is the first GNN framework to model relational data; it introduces relation-specific transformations for neighbor aggregation. SE-GNN (Li et al. 2022) models three levels of semantic evidence in knowledge embedding. Note that the three levels SE-GNN introduces from the semantic evidence view differ from our three levels of analogical objects.

**Enhanced KGE frameworks.** Recently, several frameworks and strategies have been proposed to improve the performance of KGE models, called enhanced KGEs, such as CAKE (Niu et al. 2022), PUDA (Tang et al. 2022) and REP (Wang et al. 2022). CAKE is a commonsense-aware knowledge embedding framework that automatically extracts commonsense from factual triples with entity concepts, generating commonsense augmentations to facilitate high-quality negative sampling. PUDA is a data augmentation strategy that addresses the false negative and data sparsity issues. REP is a post-processing technique that adapts pre-trained KG embeddings with graph context. Our method, which enhances a well-trained KGE model with analogical inference capability, belongs to the enhanced KGE frameworks.

**Analogical inference.** In classic artificial intelligence, analogical inference was an active research topic. However, early computational models of analogy-making (Gentner 1983; Turney 2008) mainly focus on structure mapping theory and its implementation in the structure mapping engine. Recently, the k-nearest-neighbor language model (kNN-LM) (Khandelwal et al. 2020), which directly queries training examples at test time, can also be considered an analogical inference model in natural language processing. While effective, such models often require retrieval from a large datastore at test time, significantly increasing inference overhead. In the knowledge graph field, the study of analogical inference for the incompleteness problem is missing. ANALOGY (Liu, Wu, and Yang 2017) is the first method to model analogical structures in multi-relational embedding, but its performance is not good. Unlike our method, which uses nearest-neighbor retrieval to perform explicit analogy, ANALOGY models analogical relations implicitly via the commutativity constraint of normal matrices.

## 3 Analogical Object Retriever

Before introducing our method, in this section we first introduce the background of knowledge graphs and analogical inference, and then propose the analogical object retrievers that retrieve appropriate analogical objects at the entity, relation, and triple levels. The retrieved analogical objects will be used as supervision signals in our method.
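All three retrievers are driven by the score function $f_{kge}$ of a well-trained KGE model. As a concrete point of reference, here is a minimal sketch, written for illustration only, of a TransE-style scorer and the tail-mapping function $g_{kge}$ used later in Section 4; the function names and batching convention are our own assumptions, not the released AnKGE code.

```python
import torch

def fkge_transe(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    # TransE plausibility score: -||h + r - t||_2 (higher = more plausible).
    # Inputs broadcast, so passing a (|E|, d) candidate table scores all rows at once.
    return -torch.norm(h + r - t, p=2, dim=-1)

def gkge_transe(h: torch.Tensor, r: torch.Tensor) -> torch.Tensor:
    # g_kge for TransE: the tail embedding implied by a (head, relation) pair is h + r.
    return h + r
```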
**Background.** A knowledge graph is denoted as $\mathcal{G} = (\mathcal{E}, \mathcal{R}, \mathcal{F})$, where $\mathcal{E}$ represents the set of entities, $\mathcal{R}$ the set of relations, and $\mathcal{F} = \{(h, r, t)\} \subseteq \mathcal{E} \times \mathcal{R} \times \mathcal{E}$ the set of triple facts. Analogical inference, long researched in artificial intelligence, maps a target problem to a known source problem so as to effectively utilize known knowledge (Hall 1989). Applying analogical inference to the link prediction task (h, r, ?) in knowledge graphs, instead of directly predicting the tail entity t, we can make predictions through similar triples that we know, i.e., triples in the train dataset. We consider similar triples to be composed of analogical objects of (h, r, t). Specifically, we assume the analogy objects may come from three levels: the analogy of the head entity h, yielding the similar triple (h′, r, t) (entity level); the analogy of the relation r, yielding (h, r′, t) (relation level); and the analogy of the combination pair (h, r), yielding (h′, r′, t) (triple level). Thus, we propose three retrievers to obtain analogical objects at the different levels.

**Entity-Level Retriever.** The retriever is designed based on the score function $f_{kge}(h, r, t)$ predefined in a well-trained KGE model, where triples with higher scores are assumed to be true with higher probability. Inspired by the nearest neighbor language model (Khandelwal et al. 2020), we replace all possible objects of the triple and regard the highest-scoring replacement triples as the appropriate analogical objects. Given a triple (h, r, t), the entity-level retriever retrieves similar true triples (h′, r, t) for entity-level analogical inference. For example, we can get the answer to (Sergey Brin, found, ?) as Google through (Larry Page, found, Google) if we know Sergey Brin and Larry Page are co-founders. Specifically, in the entity-level retriever, we first replace h with all entities, yielding $|\mathcal{E}|$ replacement triples, and then regard the triples with the highest scores measured by the KGE as similar triples. We call the head entities of these similar triples the analogical objects from the entity-level retriever. Thus the analogical object set can be represented as

$$\mathcal{E}^{hrt}_{N_e} = \{h_i \mid \mathrm{Top}(\{f_{kge}(h_i, r, t) \mid h_i \in \mathcal{E}\})_{N_e}\}, \tag{1}$$

where $\mathrm{Top}(\cdot)_k$ denotes the $k$ elements with the top $k$ values among all inputs, $f_{kge}(\cdot,\cdot,\cdot)$ is the predefined score function of the KGE model, and the superscript $hrt$ denotes a specific input triple (h, r, t). If not otherwise specified, we omit $hrt$ and write $\mathcal{E}_{N_e}$ instead of $\mathcal{E}^{hrt}_{N_e}$ for simplicity. Compared to retrieving similar triples directly from the train dataset, retrieving according to scores from KGEs helps overcome the incompleteness of KGs.

**Relation-Level Retriever.** Given (h, r, t), the relation-level retriever retrieves (h, r′, t) for relation-level analogical inference, since there are relations with similar contexts in KGs. For example, the founder of a company is usually a board member, so the relation-level analogy object of found is board member. Similar to the entity-level retriever, the analogical object set of (h, r, t) from the relation-level retriever is as follows:

$$\mathcal{R}_{N_r} = \{r_i \mid \mathrm{Top}(\{f_{kge}(h, r_i, t) \mid r_i \in \mathcal{R}\})_{N_r}\}. \tag{2}$$
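A minimal sketch of the entity- and relation-level retrievers in Equations (1) and (2), built on the scorer above; `E` and `R` stand for the full entity and relation embedding tables, and all names are illustrative assumptions rather than the paper's released code.

```python
import torch

def retrieve_entity_level(E, r, t, fkge, Ne):
    # Eq. (1): score (h_i, r, t) for every candidate head h_i in the entity
    # table E of shape (|E|, d), keep the indices of the Ne highest-scoring heads.
    scores = fkge(E, r.unsqueeze(0), t.unsqueeze(0))  # shape (|E|,)
    return torch.topk(scores, k=Ne).indices           # analogical head entity ids

def retrieve_relation_level(h, R, t, fkge, Nr):
    # Eq. (2): the same idea, replacing the relation with every row of R.
    scores = fkge(h.unsqueeze(0), R, t.unsqueeze(0))  # shape (|R|,)
    return torch.topk(scores, k=Nr).indices           # analogical relation ids
```

The triple-level retriever of Equation (3) below follows the same pattern, scoring the cross product of pre-selected candidate heads and relations.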
**Triple-Level Retriever.** Given (h, r, t), the triple-level retriever retrieves (h′, r′, t) for triple-level analogical inference, which combines the entity-level and relation-level retrievers. For instance, Sergey Brin is the founder of Google and Sundar Pichai is the CEO of Google; therefore, the triple-level analogical object of (Sergey Brin, found) is (Sundar Pichai, CEO). In practice, the number of all candidate (h′, r′) pairs runs into the millions in most knowledge graphs. To reduce the cost of retrieving candidate pairs, and inspired by the principle of locality, we select m entities and n relations with high triple scores separately and then pair them with each other. Thus the set of analogical objects, namely (h′, r′) pairs, from the triple-level retriever is

$$\mathcal{T}_{N_t} = \{(h_i, r_i) \mid \mathrm{Top}(\{f_{kge}(h_i, r_i, t) \mid h_i \in \mathcal{E}_m, r_i \in \mathcal{R}_n\})_{N_t}\}. \tag{3}$$

## 4 Methodology

In this section, we present a novel KGE-enhancing framework called Analogy Enhanced Knowledge Graph Embedding (AnKGE), which models the three levels of analogical inference introduced in Section 3. We first introduce the definition of the analogy function (Section 4.1) and how to train it using analogical objects (Sections 4.2 and 4.3). Finally, we introduce how to combine the original inductive inference capability and the enhanced analogical inference capability on the knowledge graph completion task (Section 4.4).

### 4.1 Analogy Function

Given a well-trained KGE model $\mathcal{M} = \{\mathbf{E}, \mathbf{R}, f_{kge}, \Theta\}$, where $\mathbf{E}$, $\mathbf{R}$ and $f_{kge}$ are the entity embedding table, relation embedding table, and score function of $\mathcal{M}$, and $\Theta$ is the set of other parameters, AnKGE enhances $\mathcal{M}$ with analogical inference capability through a projection function called the analogy function $f$. We train an analogy function for each level of analogical inference, which takes the original element embedding from $\mathbf{E}$ or $\mathbf{R}$ in $\mathcal{M}$ as input and outputs the analogical object embedding used for link prediction.

Specifically, the analogy function for relation-level analogical inference $f_{rel}$ maps the original embedding of a relation r in (h, r, t) to its analogical embedding through a relation projecting vector $\mathbf{v}^R_r \in \mathbb{R}^{d_r}$:

$$f_{rel}(r) = \mathbf{r}_a = \mathbf{v}^R_r \circ \mathbf{r}, \tag{4}$$

where $d_r$ is the relation hidden dimension and $\circ$ is the element-wise product. Similarly, the analogy function for entity-level analogical inference $f_{ent}$ maps the original embedding of an entity h in (h, r, t) to its analogical embedding. Considering that an entity generally tends to be associated with multiple relations, we define $f_{ent}$ as

$$f_{ent}(h, r) = \mathbf{h}_a = \mathbf{v}^E_h \circ \mathbf{h} + \lambda M_{trans}(\mathbf{v}^R_r \circ \mathbf{r}), \tag{5}$$

where $\mathbf{v}^E_h \in \mathbb{R}^{d_e}$ is the entity projecting vector and $d_e$ is the entity hidden dimension. $M_{trans} \in \mathbb{R}^{d_e \times d_r}$ denotes the transformation matrix that enables taking the relation r into consideration, and $\lambda$ is a weight hyper-parameter. The analogy function for triple-level analogical inference $f_{trp}$ outputs the analogical embedding of the entity and relation pair by combining the entity-level and relation-level embeddings according to the KGE:

$$f_{trp}(h, r) = \mathbf{z}_a = g_{kge}(\mathbf{h}_a, \mathbf{r}_a), \tag{6}$$

where $g_{kge}(\cdot,\cdot)$ is the function in the KGE that maps a head entity embedding to the tail entity embedding given the relation embedding. $g_{kge}(\cdot,\cdot)$ and $f_{kge}(\cdot,\cdot,\cdot)$ of representative KGE models are provided in Appendix A.
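A sketch of the three analogy functions in Equations (4) to (6), under our reading of the text; parameter shapes, initialisation, and class layout are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn

class AnalogyFunctions(nn.Module):
    # One projecting vector per entity (v^E_h) and per relation (v^R_r),
    # plus a shared d_e x d_r transformation matrix M_trans, as in Eqs. (4)-(6).
    def __init__(self, n_ent, n_rel, d_e, d_r, lam, gkge):
        super().__init__()
        self.v_ent = nn.Parameter(torch.ones(n_ent, d_e))  # v^E_h
        self.v_rel = nn.Parameter(torch.ones(n_rel, d_r))  # v^R_r
        self.M = nn.Parameter(torch.zeros(d_e, d_r))       # M_trans
        self.lam = lam      # weight hyper-parameter lambda in Eq. (5)
        self.gkge = gkge    # e.g. gkge_transe from the earlier sketch

    def f_rel(self, r_id, r):
        # Eq. (4): r_a = v^R_r (element-wise) r
        return self.v_rel[r_id] * r

    def f_ent(self, h_id, h, r_id, r):
        # Eq. (5): h_a = v^E_h * h + lambda * M_trans (v^R_r * r)
        return self.v_ent[h_id] * h + self.lam * (self.v_rel[r_id] * r) @ self.M.T

    def f_trp(self, h_id, h, r_id, r):
        # Eq. (6): z_a = g_kge(h_a, r_a)
        return self.gkge(self.f_ent(h_id, h, r_id, r), self.f_rel(r_id, r))
```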
### 4.2 Analogy Objects Aggregator

To improve the framework's robustness for analogical inference, we use the analogical objects retrieved following Section 3 as supervision signals for the analogy functions. Specifically, we push the analogy embedding introduced in Section 4.1 toward the weighted average of the analogical object embeddings from the KGE model $\mathcal{M}$.

Figure 1: The AnKGE structure diagram with TransE as the base model. For simplicity, the number of analogical objects at each of the three levels is set to 1. The upper half of the figure shows the base model module: the predefined score function is applied to the learnt embeddings to obtain the well-trained model. The lower half shows the AnKGE module: first, AnKGE retrieves the analogy objects for training the analogy function (solid arrows, training process); then, AnKGE remakes the prediction ranking by interpolating the analogy score (dashed arrows, testing process).

The aggregated embeddings at the entity level and relation level, $\mathbf{h}^+$ and $\mathbf{r}^+$ respectively, are calculated as

$$\mathbf{h}^+ = \sum_{h_i \in \mathcal{E}_{N_e}} \mathbf{h}_i \cdot S(f_{kge}(h_i, r, t)), \tag{7}$$

$$\mathbf{r}^+ = \sum_{r_i \in \mathcal{R}_{N_r}} \mathbf{r}_i \cdot S(f_{kge}(h, r_i, t)), \tag{8}$$

where $S(\cdot)$ is the softmax function that converts a vector of $K$ real numbers into a probability distribution over $K$ possible outcomes, formulated as $S(c_i) = e^{c_i} / \sum_{k=1}^{K} e^{c_k}$. The triple-level aggregated embedding $\mathbf{z}^+$ is obtained by first aggregating the entity and relation embeddings separately and then computing the combined embedding:

$$\mathbf{z}^+ = g_{kge}(\mathbf{z}^+_e, \mathbf{z}^+_r), \quad \mathbf{z}^+_e = \sum_{(h_i, r_i) \in \mathcal{T}_{N_t}} \mathbf{h}_i \cdot S(f_{kge}(h_i, r_i, t)), \quad \mathbf{z}^+_r = \sum_{(h_i, r_i) \in \mathcal{T}_{N_t}} \mathbf{r}_i \cdot S(f_{kge}(h_i, r_i, t)). \tag{9}$$

### 4.3 Loss Function

The training goal of the analogy function is to reduce the distance between the analogy embedding and the aggregated embedding obtained following Sections 4.1 and 4.2, respectively. In addition, considering that $f_{kge}$ provides a prior on the truth value of triples, we take the analogy triple score as another supervision signal. Therefore, given a pair of analogy embedding $X_a$ and aggregated embedding $X^+$ for a triple (h, r, t), the loss function is

$$L(X, (h, r, t)) = -\log \sigma\left(\gamma - \left\|X_a - X^+\right\|_2 + f_{kge}(h, r, t)\right), \tag{10}$$

where $\gamma$ is a hyper-parameter of the loss function, $\sigma$ is the sigmoid function, and $\|\cdot\|_2$ is the Euclidean norm. However, the three levels of analogical inference are not equally important for different triples. We add weight parameters to each of the three levels' losses, and the final training objective is¹

$$\min \mathrm{Loss} = \sum_{(h, r, t) \in \mathcal{F}} \left[\beta_E L(h, (h_a, r, t)) + \beta_R L(r, (h, r_a, t)) + \beta_T L(z, (h_a, r_a, t))\right]. \tag{11}$$

Considering the different contributions of the three levels, we introduce $\beta_E$, $\beta_R$ and $\beta_T$ to adjust the gradient descent. The distribution of the three levels' loss weights is positively correlated with the score of the corresponding analogy triple. Due to page limitations, we put the calculation details in Appendix B.

¹During the gradient update, the parameters of the original model are frozen.
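A sketch of the aggregator of Equations (7) to (9) and the per-level loss of Equation (10). Note that the sign placement inside the sigmoid in `analogy_loss` is our reconstruction of a garbled formula in the source, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def aggregate(candidates, scores):
    # Eqs. (7)-(9): softmax-weighted average of the retrieved analogical object
    # embeddings. candidates: (N, d) embeddings; scores: (N,) KGE scores.
    weights = F.softmax(scores, dim=0)             # S(f_kge(...)) over the N objects
    return (weights.unsqueeze(-1) * candidates).sum(dim=0)

def analogy_loss(x_a, x_plus, analogy_triple_score, gamma=10.0):
    # Eq. (10) as reconstructed here: pull the analogy embedding x_a toward the
    # aggregated target x_plus, using the analogy triple's KGE score as a
    # second supervision signal.
    dist = torch.norm(x_a - x_plus, p=2, dim=-1)
    return -F.logsigmoid(gamma - dist + analogy_triple_score)
```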
### 4.4 Link Prediction

For a test triple (h, r, t) in the test set $\mathcal{F}_{te}$, we follow the kNN-LM (Khandelwal et al. 2020) and interpolate the analogy score with the base model score to get the final score function:

$$\mathrm{Score}(h, r, t) = f_{kge}(h, r, t) + \lambda_E f_{kge}(h_a, r, t) + \lambda_R f_{kge}(h, r_a, t) + \lambda_T f_{kge}(h_a, r_a, t), \tag{12}$$

where each $\lambda$ is an adaptive weight parameter that dynamically adjusts the analogy weight according to the training triples: $\lambda_E$ is proportional to the number of training triples with the same (r, t), $\lambda_R$ to the number of training triples with the same (h, t), and $\lambda_T$ to the number of training triples with the same tail entity. The formula for the adaptive weight parameters is²

$$\lambda_E = \min(|\{(h_i, r, t) \in \mathcal{F}\}| / N_e,\ 1) \cdot \alpha_E, \quad \lambda_R = \min(|\{(h, r_i, t) \in \mathcal{F}\}| / N_r,\ 1) \cdot \alpha_R, \quad \lambda_T = \min(|\{(h_i, r_i, t) \in \mathcal{F}\}| / N_t,\ 1) \cdot \alpha_T, \tag{13}$$

where $\alpha_E$, $\alpha_R$, $\alpha_T$ are basic weight hyper-parameters. The adaptive weights use the train dataset to determine whether a test triple is suitable for each level of analogical inference. When no level of analogical inference is suitable, this score function degenerates to the base KGE model. In effect, AnKGE remakes the ranks of triples that are hard for the base model to predict, improving prediction performance through analogical inference.

²For link prediction, we add reverse relations to expand the dataset and predict tail entities only, which is equivalent to predicting both head and tail entities. Each prediction replaces the tail entity with all entities, so there is no risk of label leakage.
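A compact sketch of the interpolated score of Equations (12) and (13); the argument names (in particular the count variables `n_rt`, `n_ht`, `n_t`) are our own labels for the training-set statistics described above.

```python
def ankge_score(fkge, h, r, t, h_a, r_a, n_rt, n_ht, n_t, Ne, Nr, Nt, aE, aR, aT):
    # Eq. (13): adaptive weights from training-set counts of triples sharing
    # (r, t), (h, t), and the tail t, capped at 1 and scaled by the basic weights.
    lam_E = min(n_rt / Ne, 1.0) * aE
    lam_R = min(n_ht / Nr, 1.0) * aR
    lam_T = min(n_t / Nt, 1.0) * aT
    # Eq. (12): base score interpolated with the three analogy scores.
    return (fkge(h, r, t)
            + lam_E * fkge(h_a, r, t)
            + lam_R * fkge(h, r_a, t)
            + lam_T * fkge(h_a, r_a, t))
```

When all three counts are zero, every weight vanishes and the function reduces to the base KGE score, matching the degeneration behaviour described above.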
## 5 Experiments

In this section, we present and analyze the experimental results.³ We first introduce the experimental settings in detail. Then we show the effectiveness and compatibility of AnKGE with multiple base KGE models. We further analyze the effect of the three levels of analogical inference with an ablation study. Finally, we conduct a case study presenting a new view on explaining knowledge graph inference through analogical inference.

³Our code is available at https://github.com/zjukg/AnKGE

### 5.1 Experiments Setup

**Dataset.** We conduct link prediction experiments on two well-known benchmarks: WN18RR and FB15k-237, subsets of WN18 and FB15k, respectively. Previous work (Dettmers et al. 2018) indicated a test leakage flaw in WN18 and FB15k, meaning that test triples appear in the train dataset with inverse relations. WN18RR and FB15k-237 are the modified versions with inverse relations removed, so we use them as the experiment datasets. The statistics of these datasets are summarized in Appendix C.

**Evaluation protocol.** We evaluate framework performance with four common metrics: the mean reciprocal rank of the correct entities over the whole entity set (MRR) and the percentage of test triples whose correct entities rank in the top 1/3/10 (Hit@1, Hit@3, Hit@10). For a test task (h, r, ?) with answer t, we replace the tail with all entities to create corrupted triples. Following the filtered setting protocol, we exclude the other true triples appearing in the train, valid, and test datasets, and finally sort the filtered corrupted triples by their triple scores.

**Implementation details.** We train the AnKGE framework on four representative KGE models: TransE (Bordes et al. 2013), RotatE (Sun et al. 2019), HAKE (Zhang et al. 2020) and PairRE (Chao et al. 2021). We use grid search to select the hyper-parameters of our framework: the numbers of analogy objects at the three levels $N_e$, $N_r$, $N_t \in \{1, 3, 5, 10, 20\}$; the basic weights $\alpha_E$, $\alpha_R$, $\alpha_T \in \{0.01, 0.05, 0.1, 0.2, 0.3\}$; and the learning rate $\alpha \in \{10^{-3}, 10^{-4}, 10^{-5}\}$. The loss function weight $\gamma$ in Equation (10) is set to 10; the transformation matrix weight $\lambda$ in Equation (5) is set to 1 on FB15k-237 and 0 on WN18RR. Before training AnKGE, we retrieve the analogical objects of the three levels from the train dataset once. In both training and inference, AnKGE extends the scoring function of the original model, so AnKGE has the same model complexity as the original model.
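To make the evaluation protocol above concrete, here is a minimal sketch of the filtered rank computation; the function name and the `known_tails` set (all true tails for (h, r) across train/valid/test) are our own illustrative choices.

```python
import torch

def filtered_rank(scores, target, known_tails):
    # Filtered setting: mask every other tail that forms a true triple in
    # train/valid/test, then rank the correct tail by score (higher is better).
    scores = scores.clone()
    other_true = list(known_tails - {target})
    if other_true:
        scores[other_true] = float('-inf')
    return int((scores > scores[target]).sum().item()) + 1

# MRR and Hit@k then follow directly from the per-triple ranks:
#   mrr = mean(1.0 / rank);  hit_at_k = mean(rank <= k)
```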
### 5.2 Link Prediction Results

**Main results.** We use HAKE (Zhang et al. 2020) as the base model for AnKGE and compare with other baselines. Baselines are selected from three categories: Conventional KGE models, including TransE (Bordes et al. 2013), ANALOGY (Liu, Wu, and Yang 2017), RotatE (Sun et al. 2019), HAKE, Rot-Pro (Song, Luo, and Huang 2021), PairRE (Chao et al. 2021), and DualE (Cao et al. 2021); GNN-based KGE models, including R-GCN (Schlichtkrull et al. 2018), A2N (Bansal et al. 2019), CompGCN (Vashishth et al. 2020), and SE-GNN (Li et al. 2022); and Enhanced KGE frameworks, including CAKE (Niu et al. 2022), PUDA (Tang et al. 2022), and REP (Wang et al. 2022).

Table 1 summarizes the experiment results on FB15k-237 and WN18RR. The result of ANALOGY is from the OpenKE code⁴. The results of TransE, RotatE, HAKE and PairRE are from our trained models; the base model and AnKGE framework training details are provided in Appendix D. The other results are from the published papers.

⁴https://github.com/thunlp/OpenKE

Table 1: Link prediction results on FB15k-237 (left metric block) and WN18RR (right metric block). The best results are in bold.

| Model | MRR | Hit@1 | Hit@3 | Hit@10 | MRR | Hit@1 | Hit@3 | Hit@10 |
|---|---|---|---|---|---|---|---|---|
| *Conventional KGE* | | | | | | | | |
| TransE (Bordes et al. 2013) | 0.317 | 0.223 | 0.352 | 0.504 | 0.224 | 0.022 | 0.390 | 0.520 |
| ANALOGY (Liu, Wu, and Yang 2017) | 0.256 | 0.165 | 0.290 | 0.436 | 0.405 | 0.363 | 0.429 | 0.474 |
| RotatE (Sun et al. 2019) | 0.336 | 0.244 | 0.370 | 0.524 | 0.473 | 0.428 | 0.491 | 0.564 |
| HAKE (Zhang et al. 2020) | 0.349 | 0.252 | 0.385 | 0.545 | 0.496 | 0.452 | 0.513 | 0.580 |
| Rot-Pro (Song, Luo, and Huang 2021) | 0.344 | 0.246 | 0.383 | 0.540 | 0.457 | 0.397 | 0.482 | 0.577 |
| PairRE (Chao et al. 2021) | 0.348 | 0.254 | 0.384 | 0.539 | 0.455 | 0.413 | 0.469 | 0.539 |
| DualE (Cao et al. 2021) | 0.365 | 0.268 | 0.400 | 0.559 | 0.492 | 0.444 | 0.513 | 0.584 |
| *GNN-based KGE* | | | | | | | | |
| R-GCN (Schlichtkrull et al. 2018) | 0.249 | 0.151 | 0.264 | 0.417 | - | - | - | - |
| A2N (Bansal et al. 2019) | 0.317 | 0.232 | 0.348 | 0.486 | 0.450 | 0.420 | 0.460 | 0.510 |
| CompGCN (Vashishth et al. 2020) | 0.355 | 0.264 | 0.390 | 0.535 | 0.479 | 0.443 | 0.494 | 0.546 |
| SE-GNN (Li et al. 2022) | 0.365 | 0.271 | 0.399 | 0.549 | 0.484 | 0.446 | 0.509 | 0.572 |
| *Enhanced KGE* | | | | | | | | |
| CAKE (Niu et al. 2022) | 0.321 | 0.226 | 0.355 | 0.515 | - | - | - | - |
| PUDA (Tang et al. 2022) | 0.369 | 0.268 | 0.408 | **0.578** | 0.481 | 0.436 | 0.498 | 0.582 |
| REP (Wang et al. 2022) | 0.354 | 0.262 | 0.388 | 0.540 | 0.488 | 0.439 | 0.505 | **0.588** |
| AnKGE-HAKE (ours) | **0.385** | **0.288** | **0.428** | 0.572 | **0.500** | **0.454** | **0.515** | 0.587 |

We can see that AnKGE enhances the analogical inference ability of the base model HAKE and outperforms the baseline models on most evaluation metrics, except Hit@10, where the results of AnKGE are slightly below PUDA and REP and achieve the second best. Overall, AnKGE remakes the ranks of hard-to-predict triples in HAKE through analogical inference, achieving the best results on both datasets.

**Compatibility results.** AnKGE is a framework for enhancing the analogical inference ability of KGE models, retrieving analogical objects through the $f_{kge}$ predefined in the KGE model. Theoretically, our framework is applicable to most KGE models that define a score function for triples. We chose four C-KGE models, TransE, RotatE, HAKE and PairRE, as base models to validate compatibility. As Table 2 shows, AnKGE achieves a significant improvement over the base model on all metrics; the MRR metric improves by about 3% on FB15k-237. The results demonstrate that AnKGE is compatible with a wide range of KGE models.

Table 2: AnKGE upon different base models on FB15k-237 (left metric block) and WN18RR (right metric block). The better results are in bold.

| Model | MRR | Hit@1 | Hit@3 | Hit@10 | MRR | Hit@1 | Hit@3 | Hit@10 |
|---|---|---|---|---|---|---|---|---|
| TransE | 0.317 | 0.223 | 0.352 | 0.504 | 0.224 | 0.022 | 0.390 | 0.520 |
| AnKGE-TransE | **0.340** | **0.245** | **0.379** | **0.523** | **0.232** | **0.031** | **0.402** | **0.526** |
| RotatE | 0.336 | 0.244 | 0.370 | 0.524 | 0.473 | 0.428 | 0.491 | 0.564 |
| AnKGE-RotatE | **0.366** | **0.273** | **0.405** | **0.546** | **0.480** | **0.431** | **0.499** | **0.578** |
| HAKE | 0.349 | 0.252 | 0.385 | 0.545 | 0.496 | 0.452 | 0.513 | 0.580 |
| AnKGE-HAKE | **0.385** | **0.288** | **0.428** | **0.572** | **0.500** | **0.454** | **0.515** | **0.587** |
| PairRE | 0.348 | 0.254 | 0.384 | 0.539 | 0.455 | 0.413 | 0.469 | 0.539 |
| AnKGE-PairRE | **0.376** | **0.281** | **0.417** | **0.558** | **0.462** | **0.415** | **0.480** | **0.556** |

Moreover, AnKGE based on HAKE achieves a more significant improvement on the FB15k-237 dataset. HAKE makes the entities hierarchical by using the depth of an entity to model different levels of the hierarchy, which is more helpful for analogical inference. Compared with WN18RR, the improvement on FB15k-237 is more significant, which we speculate is because FB15k-237 has richer relational patterns, so it benefits more from relation-level analogical inference. In addition, AnKGE is designed to fix hard-to-predict triples; since the overall accuracy on FB15k-237 is lower than on WN18RR, the boosting effect is reflected more clearly.

### 5.3 Model Analysis

**Ranking study.** To analyze the improvement brought by AnKGE, we compare the ranking results of AnKGE-HAKE and the original HAKE on FB15k-237 in Figure 2. The horizontal coordinate represents the ranking range of the HAKE model, and the vertical coordinate represents the ranking range of AnKGE. We found that ranking changes are less apparent for ranks greater than 100, so we selected the triples ranked within 100 and divided them into six ranking ranges for analysis. The diagonal represents unchanged rankings; the lower right of the diagonal means the AnKGE ranking is better than the HAKE ranking, and the upper left means worse. We find some triples with worse rankings, but their number is much smaller than that of triples with better rankings. In addition, the change in ranking becomes less evident as the base model rank grows: the better the base model's ranking, the more likely AnKGE is to improve it.

Figure 2: Comparison of the rankings between AnKGE and the base model on FB15k-237 (axes: HAKE ranking vs. AnKGE ranking, binned at ranks 1/3/5/10/50/100).
**Ablation Study.** We conduct ablation experiments on the analogical inference parts of AnKGE. Table 3 shows the results of the ablation study for AnKGE-HAKE on the two datasets.

Table 3: Ablation study of the three analogy levels on FB15k-237 (left metric block) and WN18RR (right metric block), where w/o means removing the corresponding level from AnKGE.

| Model | MRR | Hit@1 | MRR | Hit@1 |
|---|---|---|---|---|
| AnKGE | 0.385 | 0.288 | 0.500 | 0.454 |
| w/o entity-level | 0.384 | 0.288 | 0.497 | 0.451 |
| w/o relation-level | 0.349 | 0.253 | 0.500 | 0.455 |
| w/o triple-level | 0.384 | 0.287 | 0.499 | 0.453 |
| w/o all | 0.349 | 0.252 | 0.496 | 0.452 |

We can see that removing any part makes the model less effective, except the relation level on WN18RR. Since there are only 11 relations in WN18RR, it is hard to retrieve suitable relation-level analogical objects; we explain this in more detail in the case study. In addition, WN18RR is built from a lexicon whose contextual words naturally provide entity-level analogical objects, making entity-level analogical inference more effective. The result on FB15k-237 is the opposite, possibly because its rich relational patterns make relation-level analogical inference more effective.

**Case Study.** Analogical inference can generate explanations for predicted triples, which are valuable in real-life applications; our method likewise provides an analogy view for explaining knowledge graph inference. Table 4 gives an intuitive demonstration of analogical inference. For each level, we select several example cases from the WN18RR test set and list their corresponding analogical objects and prediction results based on RotatE.

Table 4: Analogical inference case study. The better ranks are in bold.

| Incomplete triple | Answer | Analogy object | AnKGE rank | Original rank |
|---|---|---|---|---|
| (diencephalon, has part, ?) | hypothalamus | brain | **5** | 25 |
| (rest, derivationally related form, ?) | breath | drowse | **6** | 38 |
| (roof, hypernym, ?) | protective covering | cap | 39 | **20** |
| (felidae, member meronym, ?) | panthera | has part | **5** | 17 |
| (monodontidae, member meronym, ?) | delphinapterus | hypernym Reverse | **1** | 64 |
| (literary composition, hypernym, ?) | writing | has part | 88 | **18** |
| (ticino, instance hypernym, ?) | swiss canton | (switzerland, has part) | **8** | 54 |
| (south korea, has part, ?) | inchon | (port, instance hypernym Reverse) | **1** | 31 |
| (elementary geometry, hypernym, ?) | geometry | (construct, synset domain topic of) | 39 | **12** |

At the entity level, the idea is to retrieve a hypernym or hyponym as the analogy object. For example, the diencephalon is located in the core of the brain, so the fact that the hypothalamus is part of the brain improves people's trust in the predicted result. However, if a hyponym entity becomes the analogy object, it generates bad explanations and results. For instance, although a cap can be regarded as a special type of roof, it is not a protective covering; the misleading explanation (cap, hypernym, protective covering) downgrades the trustworthiness of the prediction, ranking the correct answer at 39. At the relation level, AnKGE tends to retrieve conceptually similar relations, such as member meronym and has part. Nevertheless, there are only 11 relations in WN18RR, so AnKGE sometimes retrieves inappropriate analogy relations. For example, hypernym and has part are relations of opposite concepts, which leads to a bad explanation and a worse ranking. At the triple level, AnKGE typically focuses on the (h, r) pair structure. As proof, "ticino is a canton of Switzerland" means that the triple (switzerland, has part, swiss canton) is a good explanation. However, sometimes the (h, r) pair structure varies too much, leading to misclassification.

## 6 Conclusion

In this paper, we resort to analogical inference to study the knowledge graph completion task. We propose an analogical object retriever that retrieves appropriate analogical objects at the entity, relation, and triple levels. We then design AnKGE, a novel and general self-supervised framework that enhances well-trained KGEs with analogical inference capability. Our method achieves competitive results on the knowledge graph completion task and exhibits enhanced analogical inference ability. Future directions include exploring more analogy patterns and a more general framework adapted to GNN-based KGEs.

## Acknowledgments

This work is funded by NSFCU19B2027/91846204.

## References

Bansal, T.; Juan, D.; Ravi, S.; and McCallum, A. 2019. A2N: Attending to Neighbors for Knowledge Graph Inference. In ACL (1), 4387–4392. Association for Computational Linguistics.

Bevilacqua, M.; and Navigli, R. 2020.
Breaking Through the 80% Glass Ceiling: Raising the State of the Art in Word Sense Disambiguation by Incorporating Knowledge Graph Information. In ACL, 2854–2864. Association for Computational Linguistics.

Bollacker, K. D.; Evans, C.; Paritosh, P. K.; Sturge, T.; and Taylor, J. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. In SIGMOD Conference, 1247–1250. ACM.

Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; and Yakhnenko, O. 2013. Translating Embeddings for Modeling Multi-relational Data. In NIPS, 2787–2795.

Cao, Z.; Xu, Q.; Yang, Z.; Cao, X.; and Huang, Q. 2021. Dual Quaternion Knowledge Graph Embeddings. In AAAI, 6894–6902. AAAI Press.

Chao, L.; He, J.; Wang, T.; and Chu, W. 2021. PairRE: Knowledge Graph Embeddings via Paired Relation Vectors. In ACL/IJCNLP (1), 4360–4369. Association for Computational Linguistics.

Chen, X.; Li, L.; Zhang, N.; Tan, C.; Huang, F.; Si, L.; and Chen, H. 2022. Relation Extraction as Open-book Examination: Retrieval-enhanced Prompt Tuning. In SIGIR, 2443–2448. ACM.

Dettmers, T.; Minervini, P.; Stenetorp, P.; and Riedel, S. 2018. Convolutional 2D Knowledge Graph Embeddings. In AAAI, 1811–1818. AAAI Press.

Gentner, D. 1983. Structure-Mapping: A Theoretical Framework for Analogy. Cognitive Science, 7(2): 155–170.

Hall, R. P. 1989. Computational Approaches to Analogical Reasoning: A Comparative Analysis. Artificial Intelligence, 39(1): 39–120.

Hu, Z.; Cao, Y.; Huang, L.; and Chua, T. 2021. How Knowledge Graph and Attention Help? A Qualitative Analysis into Bag-level Relation Extraction. In ACL/IJCNLP (1), 4662–4671. Association for Computational Linguistics.

Khandelwal, U.; Levy, O.; Jurafsky, D.; Zettlemoyer, L.; and Lewis, M. 2020. Generalization through Memorization: Nearest Neighbor Language Models. In ICLR. OpenReview.net.

Lehmann, J.; Isele, R.; Jakob, M.; Jentzsch, A.; Kontokostas, D.; Mendes, P. N.; Hellmann, S.; Morsey, M.; van Kleef, P.; Auer, S.; and Bizer, C. 2015. DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia. Semantic Web, 6(2): 167–195.

Li, R.; Cao, Y.; Zhu, Q.; Bi, G.; Fang, F.; Liu, Y.; and Li, Q. 2022. How Does Knowledge Graph Embedding Extrapolate to Unseen Data: A Semantic Evidence View. In AAAI, 5781–5791. AAAI Press.

Liu, H.; Wu, Y.; and Yang, Y. 2017. Analogical Inference for Multi-relational Embeddings. In ICML, volume 70 of Proceedings of Machine Learning Research, 2168–2178. PMLR.

Miller, G. A. 1994. WordNet: A Lexical Database for English. In HLT. Morgan Kaufmann.

Niu, G.; Li, B.; Zhang, Y.; and Pu, S. 2022. CAKE: A Scalable Commonsense-Aware Framework For Multi-View Knowledge Graph Completion. In ACL (1), 2867–2877. Association for Computational Linguistics.

Schlichtkrull, M. S.; Kipf, T. N.; Bloem, P.; van den Berg, R.; Titov, I.; and Welling, M. 2018. Modeling Relational Data with Graph Convolutional Networks. In ESWC, volume 10843 of Lecture Notes in Computer Science, 593–607. Springer.

Song, T.; Luo, J.; and Huang, L. 2021. Rot-Pro: Modeling Transitivity by Projection in Knowledge Graph Embedding. In NeurIPS, 24695–24706.

Suchanek, F. M.; Kasneci, G.; and Weikum, G. 2007. Yago: a core of semantic knowledge. In WWW, 697–706. ACM.

Sun, Z.; Deng, Z.; Nie, J.; and Tang, J. 2019. RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. In ICLR (Poster). OpenReview.net.

Tang, Z.; Pei, S.; Zhang, Z.; Zhu, Y.; Zhuang, F.; Hoehndorf, R.; and Zhang, X. 2022. Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion.
In IJCAI, 2248–2254. ijcai.org.

Turney, P. D. 2008. The Latent Relation Mapping Engine: Algorithm and Experiments. Journal of Artificial Intelligence Research, 33: 615–655.

Vashishth, S.; Sanyal, S.; Nitin, V.; and Talukdar, P. P. 2020. Composition-based Multi-Relational Graph Convolutional Networks. In ICLR. OpenReview.net.

Wang, H.; Dai, S.; Su, W.; Zhong, H.; Fang, Z.; Huang, Z.; Feng, S.; Chen, Z.; Sun, Y.; and Yu, D. 2022. Simple and Effective Relation-based Embedding Propagation for Knowledge Representation Learning. In IJCAI, 2755–2761. ijcai.org.

Yang, B.; Yih, W.; He, X.; Gao, J.; and Deng, L. 2015. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. In ICLR (Poster).

Yasunaga, M.; Ren, H.; Bosselut, A.; Liang, P.; and Leskovec, J. 2021. QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering. In NAACL-HLT, 535–546. Association for Computational Linguistics.

Zhang, W.; Chen, X.; Yao, Z.; Chen, M.; Zhu, Y.; Yu, H.; Huang, Y.; Xu, Y.; Zhang, N.; Xu, Z.; Yuan, Z.; Xiong, F.; and Chen, H. 2022. NeuralKG: An Open Source Library for Diverse Representation Learning of Knowledge Graphs. In SIGIR, 3323–3328. ACM.

Zhang, Z.; Cai, J.; Zhang, Y.; and Wang, J. 2020. Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction. In AAAI, 3065–3072. AAAI Press.