# Geometry Interaction Knowledge Graph Embeddings

Zongsheng Cao 1,2, Qianqian Xu 3,*, Zhiyong Yang 4, Xiaochun Cao 1,2, Qingming Huang 3,4,5,6,*

1 State Key Laboratory of Information Security, Institute of Information Engineering, CAS, Beijing, China
2 School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
3 Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing, China
4 School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China
5 Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
6 Peng Cheng Laboratory, Shenzhen, China

{caozongsheng,caoxiaochun}@iie.ac.cn, xuqianqian@ict.ac.cn, yangzhiyong21@ucas.ac.cn, qmhuang@ucas.ac.cn

*Corresponding author. Copyright 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

## Abstract

Knowledge graph (KG) embeddings have shown great power in learning representations of entities and relations for link prediction tasks. Previous work usually embeds a KG into a single geometric space, such as Euclidean space (zero curvature), hyperbolic space (negative curvature) or hyperspherical space (positive curvature), to maintain its specific geometric structures (e.g., chain, hierarchy and ring structures). However, the topological structure of a KG tends to be complicated, since it may contain multiple types of geometric structures simultaneously. Therefore, embedding a KG in a single space, whether Euclidean, hyperbolic or hyperspherical, cannot capture its complex structures accurately. To overcome this challenge, we propose Geometry Interaction knowledge graph Embeddings (GIE), which learns spatial structures interactively between the Euclidean, hyperbolic and hyperspherical spaces. Theoretically, GIE can capture a richer set of relational information, model key inference patterns, and enable expressive semantic matching across entities. Experimental results on three well-established knowledge graph completion benchmarks show that GIE achieves state-of-the-art performance with fewer parameters.

## Introduction

Knowledge Graphs (KGs) have attracted growing attention in the past years and play an important role in many semantic applications, e.g., question answering (Diefenbach, Singh, and Maret 2018; Keizer et al. 2017), semantic search (Berant et al. 2013; Berant and Liang 2014), recommender systems (Gong et al. 2020; Guo et al. 2020), and dialogue generation (Keizer et al. 2017; He et al. 2017). KGs not only represent vast amounts of information available in the world, but also enable powerful relational reasoning. However, real-world KGs are usually incomplete, which restricts the downstream tasks above. To address this issue, Knowledge Graph Embedding (KGE) has emerged as an effective solution that predicts missing links through low-dimensional representations of entities and relations.

In the past decades, researchers have proposed various KGE methods with great success. As one branch of KGE methods, learning in the Euclidean space E and the complex space C is highly effective, as demonstrated by bilinear models such as DistMult (Yang et al. 2015), ComplEx (Trouillon et al. 2016) and TuckER (Balažević, Allen, and Hospedales 2019).
These methods can infer new relational triplets and capture chain structures with scoring functions based on the intuitive Euclidean distance. However, embeddings learned in Euclidean space are weak at capturing complex structures (e.g., hierarchy and ring structures) (Balazevic, Allen, and Hospedales 2019b). As another branch of KGE methods, learning KG embeddings in non-Euclidean spaces (e.g., the hyperspherical space S and the hyperbolic space H) has received wide attention in recent years. Drawing on the geometric properties of the sphere, ManifoldE (Xiao, Huang, and Zhu 2016a) embeds entities into manifold-based spaces such as the hypersphere, which improves the precision of knowledge embedding and models the ring structure. Owing to the capability of hyperbolic geometry in modeling hierarchical structures, methods such as MuRP (Balazevic, Allen, and Hospedales 2019b) and RotH (Chami et al. 2020) outperform methods based on Euclidean and complex spaces on knowledge graphs with hierarchical structures. However, most knowledge graphs embrace complicated structures, as shown in Figure 1, which implies that these methods perform poorly when modeling knowledge graphs with hybrid structures (e.g., ring, hierarchy and chain structures) (Balazevic, Allen, and Hospedales 2019b). Recently, (Wang et al. 2021) developed a mixed-curvature multi-relational graph neural network for knowledge graph completion, constructed as a Riemannian product manifold of different spaces. However, it does not delve into the interaction of the various spaces, and the heterogeneity of the different spaces limits its pattern-reasoning ability across spaces.

From the analysis above, learning knowledge graph embeddings without geometry interaction inevitably yields sub-optimal results. The reason is that the complex structural features of the entity location space cannot be captured accurately, and the transfer of inference rules across spaces is weakened or even invalidated. Inspired by this, to capture a richer set of structural information and model key inference patterns for KGs with complex structures, we attempt to embed the knowledge graph into the Euclidean, hyperbolic and hyperspherical spaces jointly by geometry interaction, hoping to integrate the reasoning abilities of the different spaces.

Figure 1: An illustration of the structures in knowledge graphs. The first part is a knowledge graph about the films and characters directed by James Cameron. The second part shows the different spatial structures formed by different types of data: hierarchical, ring and chain structures. Orange entities of the KG form a chain structure, which can be easily modeled in Euclidean space; green entities form a ring structure, which can be depicted in hyperspherical space; blue entities exhibit hierarchical structure, which can be intuitively captured in hyperbolic space.

In this paper, we take a step beyond previous work and propose a novel method called Geometry Interaction Knowledge Graph Embeddings (GIE). GIE can not only exploit hyperbolic, hyperspherical and Euclidean geometries simultaneously to learn the reliable structure of knowledge graphs, but also model inference rules.
Specifically, we model relations as rich geometric transformations based on the special orthogonal group and the special Euclidean group to utilize key inference patterns, enabling rich and expressive semantic matching between entities. Next, to approximate the complex structures guided by heterogeneous spaces, we exploit a novel geometry interaction mechanism designed to capture the geometric messages of complex structures in KGs. In this way, GIE adaptively forms reliable spatial structures for knowledge graph embeddings by integrating the interactive geometric messages. Moreover, experiments demonstrate that GIE significantly outperforms state-of-the-art methods.

Our contributions are summarized as follows:

1. To the best of our knowledge, we are the first to exploit knowledge graph embeddings with geometry interaction, which adaptively learns reliable spatial structures and is equipped with the ability to transmit inference rules across the complex structures of KGs.
2. We present a novel method called Geometry Interaction Knowledge Graph Embeddings (GIE), which embraces rich and expressive semantic matching between entities and satisfies the key requirements of relational representation learning (i.e., modeling symmetry, anti-symmetry, inversion, composition, hierarchy, intersection and mutual exclusion patterns).
3. Our model requires about 40% fewer parameters than SOTA models, while achieving state-of-the-art performance on three well-established knowledge graph completion benchmarks (WN18RR, FB15k-237 and YAGO3-10).

## Related Work

The key issue of knowledge graph embedding is to learn low-dimensional distributed embeddings of entities and relations. Previous work has proposed many KGE methods based on the geometric properties of different embedding spaces. Most methods use Euclidean space (e.g., vector, matrix and tensor spaces), while some are based on other kinds of spaces such as the complex space and the hyperbolic space; there are also methods based on deep neural networks.

**Euclidean embeddings.** In recent years, researchers have conducted much research on Euclidean embeddings for KG representation learning. These methods include translation approaches popularized by TransE (Bordes et al. 2013), which models a relation as the translation between head and tail entities, i.e., head + relation ≈ tail. Many variants of TransE followed: TransH (Wang et al. 2014), TransR (Lin et al. 2015) and TransD (Ji et al. 2015) use different projection vectors for entities and relations. TorusE (Ebisu and Ichise 2018) adopts a distance function on a compact Lie group to avoid forcing embeddings onto a sphere. TransG (Xiao, Huang, and Zhu 2016b) and KG2E (He et al. 2015) further inject probabilistic principles into this framework by considering Bayesian nonparametric Gaussian mixture models, showing better accuracy and scalability. There are also tensor factorization methods such as RESCAL (Nickel, Tresp, and Kriegel 2011) and DistMult (Yang et al. 2015). These methods are simple and intuitive, but they cannot model key logical properties (e.g., translations cannot model symmetry).

**Complex embeddings.** Instead of using real-valued embeddings, learning KG embeddings in the complex space has attracted attention. Bilinear models such as ComplEx (Trouillon et al. 2016)
and HolE (Nickel, Rosasco, and Poggio 2016) measure plausibility by matching the latent semantics of entities and relations. RotatE (Sun et al. 2019) models a relation as a rotation in two-dimensional space, thereby modeling the symmetry/antisymmetry, inversion and composition patterns. Inspired by RotatE, OTE (Tang et al. 2020) extends the rotation from 2D to high-dimensional space via relational transformations. Moreover, QuatE (Zhang et al. 2019) extends the complex space to the quaternion space, whose two planes of rotation enable richer semantic matching. Following this, (Cao et al. 2021) extends the embedding space to the dual quaternion space, which models the interaction of relations and entities by dual quaternion multiplication. However, the complex space is essentially as flat as Euclidean space, since its curvature is zero. Therefore, for KGs with hierarchical structure, these methods often suffer from large space occupation and distorted embeddings.

**Non-Euclidean embeddings.** Noting the limitations of Euclidean space, ManifoldE (Xiao, Huang, and Zhu 2016a) expands point-wise modeling in the translation-based principle to manifold-wise modeling (e.g., hypersphere space), which benefits the modeling of ring structures. To capture the hierarchical structure of knowledge graphs, MuRP (Balazevic, Allen, and Hospedales 2019b) learns KG embeddings in hyperbolic space, targeting hierarchical data. (Gu et al. 2019b) proposes learning embeddings in a product manifold with a heuristic to estimate the sectional curvature of graph data; however, it ignores the interaction of the embedded spaces, which affects the learning of spatial structure. RotH (Chami et al. 2020) then introduces hyperbolic geometry on top of rotations, achieving strong low-dimensional knowledge graph embeddings. For non-hierarchical data, however, the performance of these methods is limited.

**Deep neural networks.** Neural network approaches such as (Jin et al. 2021b), (Yu et al. 2021), (Nathani et al. 2019) and (Jin et al. 2021a) play important roles in graph learning. Deep neural networks have also been adopted for the KGE problem: Neural Tensor Network (Socher et al. 2013) and ER-MLP (Dong et al. 2014) are two representative neural-network-based methodologies. Moreover, models such as R-GCN (Schlichtkrull et al. 2018), ConvE (Dettmers et al. 2017), ConvKB (Nguyen et al. 2018), A2N (Bansal et al. 2019) and HypER (Balazevic, Allen, and Hospedales 2019a) use neural networks to produce KG embeddings with remarkable effect. Recently, M2GNN (Wang et al. 2021) was proposed as a generic graph neural network framework that embeds multi-relational KG data for knowledge graph completion. A downside of these methods is that most of them are not interpretable.

## Background and Preliminaries

**Multi-relational link prediction.** Multi-relational link prediction is an important task on knowledge graphs. Given a set $E$ of entities and a set $R$ of relations, a knowledge graph $G \subseteq E \times R \times E$ is a set of subject-predicate-object triples. Since real-world knowledge graphs are often incomplete, the goal of multi-relational link prediction is to complete the KG, i.e., to predict true but unobserved triples based on the information in $G$ with a score function.
Typically, the score function is a learned map $\phi : E \times R \times E \to \mathbb{R}$ that assigns a score $s = \phi(h, r, t)$ to each triplet, indicating the strength of the prediction that the triplet corresponds to a true fact. A non-linearity, such as the logistic (sigmoid) function, is then often used to convert the score into a predicted probability $p = \sigma(s) \in [0, 1]$ of the triplet being true.

**Non-Euclidean geometry.** Most manifolds in geometric spaces can be regarded as Riemannian manifolds $(\mathcal{M}, g)$ of dimension $n$, which equip each point $x \in \mathcal{M}$ with an inner product on the tangent space, $g_x : T_x\mathcal{M} \times T_x\mathcal{M} \to \mathbb{R}$. The tangent space $T_x\mathcal{M}$ of dimension $n$ contains all possible directions of paths leaving from $x$. Specifically, denote by $g^E = \mathbf{I}_n$ the Euclidean metric tensor and by $c$ the curvature parameter of the space; then the manifold $\mathcal{M}^n_c = \{x \in \mathbb{R}^n : c\|x\|^2 < 1\}$ defines a non-Euclidean space $(\mathcal{M}^n_c, g^c)$ with the Riemannian metric

$$g^c_x = \lambda_x^2\, g^E, \qquad \lambda_x = \frac{2}{1 - c\|x\|^2}.$$

Note that the curvature $c$ is positive for the hyperspherical space S, negative for the hyperbolic space H, and zero for the Euclidean space E.

The Poincaré ball is a popular model in representation learning. It is commonly used for hyperbolic space, has basic mathematical operations (e.g., addition, multiplication), and provides closed-form expressions for many basic objects such as distances and angles (Ganea, Bécigneul, and Hofmann 2018). The principled generalizations of the basic operations to hyperspherical space are analogous, except that the curvature satisfies $c > 0$; we therefore only introduce the operations for hyperbolic space, from which the hyperspherical counterparts can be obtained by analogy. For each point $x$ in hyperbolic space we have a tangent space $T_x\mathbb{H}^n_c$. The exponential map $\exp^c_x : T_x\mathbb{H}^n_c \to \mathbb{H}^n_c$ and the logarithmic map $\log^c_x : \mathbb{H}^n_c \to T_x\mathbb{H}^n_c$ connect the hyperbolic space and the tangent space as follows:

$$\exp^c_x(v) = x \oplus_c \left( \tanh\!\left( \frac{\sqrt{c}\,\lambda^c_x \|v\|}{2} \right) \frac{v}{\sqrt{c}\,\|v\|} \right), \qquad \log^c_x(y) = \frac{2}{\sqrt{c}\,\lambda^c_x}\, \tanh^{-1}\!\left( \sqrt{c}\,\|{-x} \oplus_c y\| \right) \frac{-x \oplus_c y}{\|{-x} \oplus_c y\|}, \tag{1}$$

where $x, y \in \mathbb{H}^n_c$, $v \in T_x\mathbb{H}^n_c$, $\|\cdot\|$ denotes the Euclidean norm and $\oplus_c$ represents Möbius addition:

$$x \oplus_c y = \frac{\left(1 + 2c\langle x, y\rangle + c\|y\|^2\right) x + \left(1 - c\|x\|^2\right) y}{1 + 2c\langle x, y\rangle + c^2\|x\|^2\|y\|^2}. \tag{2}$$

The distance between two points $x, y \in \mathbb{H}^n_c$ is measured along a geodesic (i.e., the shortest path between the points):

$$d_c(x, y) = \frac{2}{\sqrt{c}}\, \tanh^{-1}\!\left( \sqrt{c}\,\|{-x} \oplus_c y\| \right). \tag{3}$$

Möbius subtraction is then defined as $x \ominus_c y = x \oplus_c (-y)$. Analogous to scalar and matrix multiplication in Euclidean space, multiplication in hyperbolic space can be defined by Möbius scalar multiplication and Möbius matrix multiplication for $x \in \mathbb{H}^n_c \setminus \{0\}$:

$$r \otimes_c x = \frac{1}{\sqrt{c}} \tanh\!\left( r \tanh^{-1}(\sqrt{c}\,\|x\|) \right) \frac{x}{\|x\|}, \qquad M \otimes_c x = \frac{1}{\sqrt{c}} \tanh\!\left( \frac{\|Mx\|}{\|x\|} \tanh^{-1}(\sqrt{c}\,\|x\|) \right) \frac{Mx}{\|Mx\|}, \tag{4}$$

where $r \in \mathbb{R}$ and $M \in \mathbb{R}^{m \times n}$.
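The NumPy sketch below summarizes these operations; it is a minimal illustration under the convention of Eqs. (1)-(4) with a positive curvature parameter $c$ for the hyperbolic model, and the function names are ours rather than from the released GIE code.

```python
# Minimal sketch of the Poincare-ball operations in Eqs. (1)-(4),
# restricted to the origin x = 0 where the maps simplify.
import numpy as np

def mobius_add(x, y, c):
    """Mobius addition x (+)_c y, Eq. (2)."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den

def expmap0(v, c):
    """Exponential map at the origin: tangent space -> manifold."""
    n = np.linalg.norm(v)
    if n < 1e-10:
        return v
    return np.tanh(np.sqrt(c) * n) * v / (np.sqrt(c) * n)

def logmap0(y, c):
    """Logarithmic map at the origin: manifold -> tangent space."""
    n = np.linalg.norm(y)
    if n < 1e-10:
        return y
    return np.arctanh(np.sqrt(c) * n) * y / (np.sqrt(c) * n)

def dist(x, y, c):
    """Geodesic distance d_c(x, y), Eq. (3)."""
    d = np.linalg.norm(mobius_add(-x, y, c))
    return 2 / np.sqrt(c) * np.arctanh(np.sqrt(c) * d)

# The two maps are mutually inverse at the origin:
v = np.array([0.3, -0.2])
x = expmap0(v, c=1.0)
assert np.allclose(logmap0(x, c=1.0), v)
```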
## Methodology

### Geometry Interaction Knowledge Graph Embeddings (GIE)

**Problem definition.** A given knowledge graph $G$ is usually incomplete in practice, so link prediction approaches are used to predict new links and complete the knowledge graph. Specifically, we learn the embeddings of relations and entities from the existing links among entities, and then predict new links based on these embeddings. Our ultimate goal is therefore to develop an effective KGE method for the link prediction task.

**Representations for entities and relations.** In the KGE problem, entities can be represented as vectors in a low-dimensional space, while relations can be represented as geometric transformations. In order to portray spatial transformations accurately and comprehensively, we define the special orthogonal group and the special Euclidean group as follows:

**Definition 1.** The special orthogonal group is defined as

$$\mathrm{SO}(n) = \left\{ A \in \mathrm{GL}(n, \mathbb{R}) \;\middle|\; A^{\top} A = I,\; \det A = 1 \right\}, \tag{5}$$

where $\mathrm{GL}(n, \mathbb{R})$ denotes the general linear group of degree $n$ over the real numbers $\mathbb{R}$.

**Definition 2.** The special Euclidean group is defined as

$$\mathrm{SE}(n) = \left\{ A \;\middle|\; A = \begin{bmatrix} R & v \\ 0 & 1 \end{bmatrix},\; R \in \mathrm{SO}(n),\; v \in \mathbb{R}^n \right\}. \tag{6}$$

Here $\mathrm{SE}(n)$ is the set of all rigid transformations in $n$ dimensions. It provides more degrees of freedom for geometric transformations than $\mathrm{SO}(n)$. Note that the special Euclidean group has powerful spatial modeling capabilities and plays a major role in modeling key inference patterns. From a geometric perspective, it supports multiple transformations, specifically reflection, inversion, translation, rotation, and homothety. We provide the proof in the Appendix (see https://github.com/Lion-ZS/GIE).
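As a concrete illustration of Definition 2, the hypothetical sketch below builds an SE(2) element in homogeneous coordinates and inverts it using only a transpose, the same trick used later in the paper for reverse relations; all names here are ours.

```python
# A small sketch of Definition 2: an SE(2) element as the homogeneous
# matrix [[R, v], [0, 1]] with R in SO(2), plus its cheap inverse.
import numpy as np

def se_n(theta, v):
    """Build a rigid transformation: rotation by theta, translation v."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])  # R^T R = I, det R = 1
    A = np.eye(3)
    A[:2, :2], A[:2, 2] = R, v
    return A

def se_n_inverse(A):
    """Inverse of [[R, v],[0,1]] is [[R^T, -R^T v],[0,1]]: only a transpose
    is needed, never a general matrix inverse."""
    R, v = A[:2, :2], A[:2, 2]
    B = np.eye(3)
    B[:2, :2], B[:2, 2] = R.T, -R.T @ v
    return B

A = se_n(np.pi / 4, np.array([1.0, -2.0]))
h = np.array([0.5, 0.5, 1.0])   # an entity in homogeneous coordinates
assert np.allclose(se_n_inverse(A) @ (A @ h), h)
```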
**Geometry interaction.** Geometry interaction plays an important role in capturing the reliable structure of the space, and we explain its mechanism below. First of all, geometric message propagation is a prerequisite for geometry interaction. Owing to the different spatial properties of Euclidean, hyperbolic and hyperspherical space, their spatial structures are established through different spatial metrics (e.g., the Euclidean metric or the hyperbolic metric), which affects geometric information propagation between the spaces. Fortunately, the exponential and logarithmic maps can be utilized as bridges between these spaces for geometric information propagation. Geometry interaction can then be summarized in three steps:

1. We first generate spatial embeddings for each entity $e$ in the knowledge graph: an embedding $E$ in Euclidean space, an embedding $H$ in hyperbolic space and an embedding $S$ in hyperspherical space.
2. We map the embeddings $H$ and $S$ from the hyperbolic and hyperspherical spaces to the tangent space through the logarithmic map, to facilitate geometric information propagation. We then use an attention mechanism to let the geometric messages of the Euclidean space and the tangent space interact.
3. We capture the features of the different spaces and adaptively form the reliable spatial structure according to the interactive geometric messages.

Here we discuss an important detail of geometry interaction. Suppose the relation $r \in \mathrm{SE}(n)$, and let $r'$ be the reverse relation of $r$. (Writing $r = \begin{bmatrix} R & v \\ 0 & 1 \end{bmatrix}$, we can derive $r' = r^{-1} = \begin{bmatrix} R^{-1} & -R^{-1}v \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} R^{\top} & -R^{\top}v \\ 0 & 1 \end{bmatrix}$, which is convenient to compute since we only need the transpose of the matrix instead of its inverse.) Notice that basic operations such as Möbius addition are not commutative in hyperbolic space: given two points $x$ and $y$, $x \oplus_c y = y \oplus_c x$ does not hold in general. For a triplet $(h, r, t)$ ($h$ and $t$ denote the head and tail entity, respectively) in a KG with non-Euclidean structure, we therefore have

$$h \oplus_c r \oplus_c t \neq t \oplus_c r \oplus_c h. \tag{7}$$

This means that the messages propagated in the two opposite directions (i.e., from head entity to tail entity and from tail entity to head entity) are different. To address this, we leverage the embeddings of head entities and tail entities in the different spaces separately to propagate geometric messages for geometry interaction. For the head entity, we have $E_h = r\,h$, the embedding of the transformed head entity in Euclidean space; $H_h = r \otimes_v \exp^v_0(h)$, the embedding of the transformed head entity in hyperbolic space, where $\exp^v_0(h)$ ($v < 0$) obtains the embedding of $h$ in hyperbolic space; and $S_h = r \otimes_u \exp^u_0(h)$, the embedding of the transformed head entity in hyperspherical space, where $\exp^u_0(h)$ ($u > 0$) obtains the embedding of $h$ in hyperspherical space. The transformed head entity interacting in the three spaces is then

$$\mathrm{Inter}(E_h, H_h, S_h) = \exp^c_0\!\left( \lambda_E E_h + \lambda_H \log^v_0(H_h) + \lambda_S \log^u_0(S_h) \right), \tag{8}$$

where $\lambda$ is an attention vector and $(\lambda_E, \lambda_H, \lambda_S) = \mathrm{Softmax}(\lambda^{\top} E_h, \lambda^{\top} H_h, \lambda^{\top} S_h)$. Here we apply $\log^v_0(\cdot)$ and $\log^u_0(\cdot)$ to map $H_h$ and $S_h$ into the tangent space so that geometric information can propagate, and we apply $\exp^c_0(\cdot)$ to form an approximate spatial structure for the interactive space. To preserve more information from the Euclidean, hyperspherical and hyperbolic spaces, we aggregate their messages by performing self-attention guided by the geometric distance between entities; after aggregating messages, we apply a pointwise non-linearity to measure them, and then form a suitable spatial structure based on the geometric messages.

Similarly, for the tail entity, let $E_t = r'\,t$, $H_t = r' \otimes_v \exp^v_0(t)$ and $S_t = r' \otimes_u \exp^u_0(t)$ be the transformed tail entities in Euclidean, hyperbolic and hyperspherical space, respectively. We aggregate their messages interacting in the space as

$$\mathrm{Inter}(E_t, H_t, S_t) = \exp^c_0\!\left( \alpha_E E_t + \alpha_H \log^v_0(H_t) + \alpha_S \log^u_0(S_t) \right), \tag{9}$$

where $\alpha$ is an attention vector and $(\alpha_E, \alpha_H, \alpha_S) = \mathrm{Softmax}(\alpha^{\top} E_t, \alpha^{\top} H_t, \alpha^{\top} S_t)$.

**Score function and loss function.** After integrating the information for head and tail entities in the different spaces, we design the scoring function for geometry interaction as

$$\phi(h, r, t) = -\left( d_c(\mathrm{Inter}(E_h, H_h, S_h),\, t) + d_c(\mathrm{Inter}(E_t, H_t, S_t),\, h) \right) + b, \tag{10}$$

where $d_c(\cdot, \cdot)$ is the distance function of Eq. (3) and $b$ is a bias which acts as a margin in the scoring function (Balazevic, Allen, and Hospedales 2019b). We train GIE by minimizing the cross-entropy loss with uniform negative sampling, where negative examples for a triplet $(h, r, t)$ are sampled uniformly from all possible triplets obtained by perturbing the tail entity:

$$\mathcal{L} = \sum_{(h, r, t) \in \Omega \cup \bar{\Omega}} \log\!\left( 1 + \exp\!\left( -Y_{(h,r,t)}\, \phi(h, r, t) \right) \right), \tag{11}$$

where $\Omega$ and $\bar{\Omega} \subseteq E \times R \times E \setminus \Omega$ denote the set of observed triplets and the set of sampled unobserved triplets, respectively, and $Y_{(h,r,t)} \in \{-1, 1\}$ is the label of the triplet $(h, r, t)$.
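The sketch below puts Eqs. (8)-(11) together in heavily simplified form. It is a schematic, not the released implementation: curvature magnitudes are fixed to 1, the relation acts as a plain matrix (the paper uses SE(n) actions and Möbius matrix multiplication), the reverse relation $r'$ is replaced by $r$ for brevity, and the Euclidean norm stands in for the geodesic distance $d_c$; all function names are ours.

```python
import numpy as np

def expmap0(x, kind):
    """Tangent vector at origin -> model space ("H" hyperbolic, "S" spherical)."""
    n = np.linalg.norm(x) + 1e-15
    f = np.tanh if kind == "H" else np.tan
    return f(n) * x / n

def logmap0(x, kind):
    """Inverse of expmap0: model space -> tangent space at the origin."""
    n = np.linalg.norm(x) + 1e-15
    f = np.arctanh if kind == "H" else np.arctan
    return f(n) * x / n

def interact(E, H, S, lam):
    """Eq. (8): attention over the Euclidean embedding and the tangent-space
    images of the hyperbolic/hyperspherical embeddings, then an exp map."""
    msgs = np.stack([E, logmap0(H, "H"), logmap0(S, "S")])
    w = np.exp(msgs @ lam)
    w = w / w.sum()                      # softmax attention weights
    return expmap0(w @ msgs, "H")

def score(h, t, r, lam_h, lam_t, b=0.1):
    """Eq. (10), with the Euclidean norm standing in for d_c."""
    def transformed(e):
        Ee = r @ e                       # Euclidean image, E = r e
        He = expmap0(r @ e, "H")         # stand-in for r (x)_v exp_0^v(e)
        Se = expmap0(r @ e, "S")         # stand-in for r (x)_u exp_0^u(e)
        return Ee, He, Se
    head = interact(*transformed(h), lam_h)
    tail = interact(*transformed(t), lam_t)   # paper uses r' here, not r
    return -(np.linalg.norm(head - t) + np.linalg.norm(tail - h)) + b

def loss(phi, y):
    """Eq. (11): per-triplet cross-entropy term, y in {-1, +1}."""
    return np.log1p(np.exp(-y * phi))

rng = np.random.default_rng(0)
h, t = 0.1 * rng.standard_normal(4), 0.1 * rng.standard_normal(4)
r, lam = np.eye(4), 0.1 * rng.standard_normal(4)
print(loss(score(h, t, r, lam, lam), y=1))
```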
### Key Properties of GIE

**Connection to Euclidean, hyperbolic and hyperspherical space.** The geometry interaction mechanism allows us to flexibly take advantage of Euclidean, hyperbolic and hyperspherical space. For chain structures in KGs, we can adjust the local structure of the embedding space in geometry interaction to be closer to Euclidean space. Recall that the two maps in Eq. (1) take simpler forms at $x = 0$: for $v \in T_0\mathbb{D}^n_c \setminus \{0\}$ and $y \in \mathbb{D}^n_c \setminus \{0\}$,

$$\exp^c_0(v) = \tanh(\sqrt{c}\,\|v\|)\, \frac{v}{\sqrt{c}\,\|v\|}, \qquad \log^c_0(y) = \tanh^{-1}(\sqrt{c}\,\|y\|)\, \frac{y}{\sqrt{c}\,\|y\|}.$$

Moreover, Euclidean geometry is recovered in the limit: $\lim_{c \to 0} \exp^c_x(v) = x + v$ and $\lim_{c \to 0} \log^c_x(y) = y - x$. For hierarchical structures, geometry interaction can push the local structure of the embedding space toward a hyperbolic structure (curvature $c < 0$); for ring structures, it can form a hyperspherical structure ($c > 0$) in the same way. If the embedding locations of entities exhibit multiple structural features simultaneously (e.g., hierarchy and ring structures), we can integrate the different geometric messages as in Eqs. (8) and (9) and accordingly form the reliable spatial structure by geometry interaction as above.

**Capability in modeling inference patterns.** Generally, knowledge graphs embrace multiple relations, which form the basis of pattern reasoning. It is therefore particularly important that inference patterns can be applied across the different spaces. Suppose $M$ is a relation matrix, and take an entity $e$ as an example. If $f_k(e) = \varphi_k(M_k e)$, its Möbius version is $f^{\otimes_c}_k(e) = \varphi^{\otimes_c}_k(M_k \otimes_c e)$, where $f_k$ can be any function mapping we need. Compositions can then be rewritten as

$$f^{\otimes_c}_k \circ \cdots \circ f^{\otimes_c}_1 = \exp^c_0 \circ f_k \circ \cdots \circ f_1 \circ \log^c_0,$$

which means these operations in the other spaces can essentially be performed in Euclidean space. Similar to other Möbius operations, for a given function in non-Euclidean space we can restore its Euclidean form in the limit $c \to 0$: $\lim_{c \to 0} f^{\otimes_c}(x) = f(x)$ if $f$ is continuous. This construction also satisfies desirable properties such as $(f \circ g)^{\otimes_c} = f^{\otimes_c} \circ g^{\otimes_c}$ for $f : \mathbb{R}^m \to \mathbb{R}^l$ and $g : \mathbb{R}^n \to \mathbb{R}^m$, and direction preservation: $f^{\otimes_c}(x)/\|f^{\otimes_c}(x)\| = f(x)/\|f(x)\|$ for $f(x) \neq 0$. Based on the analysis above, we can extend the logical rules established in Euclidean space to the hyperbolic and hyperspherical spaces in geometry interaction, so geometry interaction enables us to model key relation patterns in the different spaces conveniently. Table 1 lists the prominent inference patterns considered in the comparison of GIE against popular KGE models (TransE, RotatE, TuckER, DistMult, ComplEx, MuRP); GIE captures all of these patterns, which shows its advantage in modeling them. Please refer to the Appendix for the proofs, and see the numerical check after this table for the limiting behavior.

| Inference pattern | Definition |
|---|---|
| Symmetry | $r_1(x, y) \Rightarrow r_1(y, x)$ |
| Anti-symmetry | $r_1(x, y) \Rightarrow \neg r_1(y, x)$ |
| Inversion | $r_1(x, y) \Rightarrow r_2(y, x)$ |
| Composition | $r_1(x, y) \wedge r_2(y, z) \Rightarrow r_3(x, z)$ |
| Hierarchy | $r_1(x, y) \Rightarrow r_2(x, y)$ |
| Intersection | $r_1(x, y) \wedge r_2(x, y) \Rightarrow r_3(x, y)$ |
| Mutual exclusion | $r_1(x, y) \Rightarrow \neg r_2(x, y)$ |

Table 1: Inference patterns captured by GIE, used to compare against popular KGE models.
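A quick numerical check of the limiting behavior claimed above: lifting a Euclidean map $f$ to its Möbius version $\exp^c_0 \circ f \circ \log^c_0$ and letting $c \to 0$ recovers plain $f$. The helper names and the test map are ours.

```python
import numpy as np

def expmap0(v, c):
    n = np.linalg.norm(v) + 1e-15
    return np.tanh(np.sqrt(c) * n) * v / (np.sqrt(c) * n)

def logmap0(y, c):
    n = np.linalg.norm(y) + 1e-15
    return np.arctanh(np.sqrt(c) * n) * y / (np.sqrt(c) * n)

def mobius_lift(f, c):
    """Mobius version of f: exp_0^c o f o log_0^c."""
    return lambda x: expmap0(f(logmap0(x, c)), c)

f = lambda x: 0.5 * x + 0.1              # any continuous Euclidean map
x = np.array([0.2, -0.1])
for c in (1.0, 0.1, 1e-4, 1e-8):         # shrink the curvature toward 0
    print(c, mobius_lift(f, c)(x))       # -> approaches f(x) = [0.2, 0.05]
```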
## Experiments

### Experimental Setup

**Dataset.** We evaluate our approach on the link prediction task using three standard competition benchmarks, shown in Table 2: WN18RR (Dettmers et al. 2017), FB15k-237 (Dettmers et al. 2017) and YAGO3-10 (Mahdisoltani, Biega, and Suchanek 2013). WN18RR is a subset of WN18 (Bordes et al. 2013) whose main relation patterns are symmetry/antisymmetry and composition. FB15k-237 is a subset of FB15k (Bordes et al. 2013) in which the inverse relations are removed. YAGO3-10, a subset of YAGO3 (Mahdisoltani, Biega, and Suchanek 2015), mainly includes symmetry/antisymmetry and composition patterns. Each dataset is split into training, validation and test sets, following the setting of (Sun et al. 2019). For each KG, we follow the standard data augmentation protocol of adding inverse relations (Lacroix, Usunier, and Obozinski 2018) to the datasets for the baselines.

| Dataset | N | M | Training | Validation | Test | ξG | Structure heterogeneity | Scale |
|---|---|---|---|---|---|---|---|---|
| WN18RR | 41k | 11 | 87k | 3k | 3k | -2.54 | Low | Small |
| FB15k-237 | 15k | 237 | 272k | 18k | 20k | -0.65 | High | Medium |
| YAGO3-10 | 123k | 37 | 1M | 5k | 5k | -0.54 | Low | Large |

Table 2: Statistics of the datasets used in this paper. N is the number of entities and M is the number of relations; refer to the Appendix for more details about the global graph curvature ξG.

**Evaluation metrics.** For each test triplet, the masked entity is replaced by every entity in the knowledge graph and the resulting triplets are ranked. We report Hits@n (n = 1, 3, 10) and the Mean Reciprocal Rank (MRR). We follow the standard evaluation protocol in the filtered setting (Bordes et al. 2013): all true triplets in the KG are filtered out during evaluation, since predicting a low rank for these triplets should not be penalized.
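The sketch below illustrates this filtered protocol for the tail side (the head side is symmetric), assuming a `score_fn(h, r, e)` that returns larger values for more plausible triplets; it is our own minimal rendition, not the evaluation code used in the paper.

```python
import numpy as np

def evaluate(test_triples, all_true, entities, score_fn, ns=(1, 3, 10)):
    """Filtered tail-side ranking. `all_true` is the set of known-true
    triples (train + valid + test), used to filter out other answers."""
    ranks = []
    for h, r, t in test_triples:
        # Keep the gold tail plus every entity not forming a known-true triple.
        cands = [e for e in entities if e == t or (h, r, e) not in all_true]
        scores = np.array([score_fn(h, r, e) for e in cands])
        rank = 1 + int((scores > score_fn(h, r, t)).sum())
        ranks.append(rank)
    ranks = np.array(ranks, dtype=float)
    mrr = float((1.0 / ranks).mean())
    hits = {n: float((ranks <= n).mean()) for n in ns}
    return mrr, hits
```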
**Baselines.** We compare our method to SOTA models, including Euclidean approaches based on TransE (Bordes et al. 2013) and DistMult (Yang et al. 2015); complex approaches based on TuckER (Balažević, Allen, and Hospedales 2019), ComplEx (Trouillon et al. 2016) and RotatE (Sun et al. 2019); non-Euclidean approaches based on MuRP (Balazevic, Allen, and Hospedales 2019b) and ATTH (Chami et al. 2020); and neural network approaches based on ConvE (Dettmers et al. 2017), R-GCN (Schlichtkrull et al. 2018) and HypER (Balazevic, Allen, and Hospedales 2019a).

**Implementation details.** We implement GIE in PyTorch and run experiments on a single GPU. The hyper-parameters are determined by grid search, and the best models are selected by early stopping on the validation set. In general, the embedding size k is searched in {50, 100, 200, 250}, and the learning rate is tuned amongst {0.001, 0.005, 0.05, 0.1}. For some baselines, we report the results from the original papers.

### Results

We show the empirical results on the three datasets in Table 3. On WN18RR, where there are plenty of symmetry and composition relations, MRR improves from 48.1% (MuRP) to 49.1%, reflecting GIE's effective modeling of key patterns; since WN18RR contains rich hierarchical data, this also shows that GIE models the hyperbolic structure well. On FB15k-237, GIE yields an MRR of 36.2%, while previous work based on non-Euclidean space (MuRP) reaches at most 33.5%. On YAGO3-10, which has about 1,000,000 training samples, MRR improves from 56.8% to 57.9%, showing that GIE learns good embeddings for large-scale knowledge graphs. Overall, GIE performs best among the existing state-of-the-art models across all datasets above. The reason lies in the complex structures of KGs, which pose a challenge for embedding learning: GIE learns a more reliable structure through geometry interaction, while the structures captured by previous work are comparatively inaccurate and incomplete.

| M | Model | MRR | H@1 | H@3 | H@10 | MRR | H@1 | H@3 | H@10 | MRR | H@1 | H@3 | H@10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| R | RESCAL | .420 | - | - | .447 | .270 | - | - | - | - | - | - | - |
| R | TransE | .226 | - | - | .501 | .294 | - | - | .465 | - | - | - | - |
| R | DistMult | .430 | .390 | .440 | .490 | .241 | .155 | .263 | .419 | .340 | .240 | .380 | .540 |
| R | ConvE | .430 | .400 | .440 | .520 | .325 | .237 | .356 | .501 | .440 | .350 | .490 | .620 |
| R | TuckER | .470 | .443 | .482 | .526 | .358 | .266 | .394 | .544 | - | - | - | - |
| R | MuRE | .465 | .436 | .487 | .554 | .336 | .245 | .370 | .521 | .532 | .444 | .584 | .694 |
| R | CompGCN | .479 | .443 | .494 | .546 | .355 | .264 | .390 | .535 | .489 | .395 | .500 | .582 |
| R | A2N | .430 | .410 | .440 | .510 | .317 | .232 | .348 | .486 | .445 | .349 | .482 | .501 |
| C | ComplEx | .440 | .410 | .460 | .510 | .247 | .158 | .275 | .428 | .360 | .260 | .400 | .550 |
| C | RotatE | .476 | .428 | .492 | .571 | .338 | .241 | .375 | .533 | .495 | .402 | .550 | .670 |
| H | MuRP | .481 | .440 | .495 | .566 | .335 | .243 | .367 | .518 | .354 | .249 | .400 | .567 |
| H | ATTH | .486 | .443 | .499 | .573 | .348 | .252 | .384 | .540 | .568 | .493 | .612 | .702 |
| S | MuRS | .454 | .432 | .482 | .550 | .338 | .249 | .373 | .525 | .351 | .244 | .382 | .562 |
| P | MuRMP | .473 | .435 | .485 | .552 | .345 | .258 | .385 | .542 | .358 | .248 | .389 | .566 |
| G | GIE | **.491** | **.452** | **.505** | **.575** | **.362** | **.271** | **.401** | **.552** | **.579** | **.505** | **.618** | **.709** |

Table 3: Link prediction results; the metric columns group left to right into WN18RR, FB15k-237 and YAGO3-10. M denotes the embedding space: R Euclidean, C complex, H hyperbolic, S spherical, P mixed-curvature, G geometric interactive. Best results in bold.

### Experiment Analysis

**Ablations on relation type.** In Table 5 in the Appendix, we summarize Hits@10 for each relation on WN18RR to confirm the superior representation capability of GIE in modeling different types of relations. We show how the relation types of WN18RR affect the performance of the proposed method and baselines such as MuRMP-autoKT, MuRMP, MuRS, MuRP and MuRE. We report a number of metrics describing each relation, including the global graph curvature (ξG) (Gu et al. 2019a) and the Krackhardt hierarchy score (Khs) (Everett and Krackhardt 2012), which indicate whether the given data have a rich hierarchical structure. Specifically, we compare Hits@10 averaged over 10 runs for models of low dimensionality. The proposed model performs well for the different relation types, which verifies its effectiveness in dealing with complex structures in KGs. Moreover, geometry interaction proves more effective than the mixed-curvature geometry method.

**Impact of embedding dimension.** To understand the role of dimensionality, we conduct experiments on WN18RR against SOTA methods under varied dimensional settings (10, 20, 50, 100, 150, 200, 250, 500), shown in Figure 2. GIE exploits the hyperbolic, hyperspherical and Euclidean spaces jointly for knowledge graph embedding through the interaction learning mechanism, and it performs well in both low and high dimensions.

Figure 2: Impact of embedding dimension.

**Number of free parameters.** We compare the numbers of parameters of RotatE and MuRP with that of GIE in the experiments above. Geometry interaction learns a reliable spatial structure, which reduces the dependence on adding dimensions to improve performance. As shown in Table 4, GIE maintains superior performance with fewer parameters.

| Dataset | RotatE | MuRP | GIE |
|---|---|---|---|
| WN18RR | 40.95M | 32.76M | 17.65M (-46.1%) |
| FB15k-237 | 29.32M | 5.82M | 3.41M (-41.4%) |
| YAGO3-10 | 49.28M | 41.47M | 23.81M (-42.6%) |

Table 4: Number of free parameters; reductions are relative to MuRP.

**Ablation study on geometry interaction.** To study the contribution of each component of the geometry interaction score function Eq. (10), we remove the term $d_c(\mathrm{Inter}(E_t, H_t, S_t), h)$ and denote the model GIE1; we remove $d_c(\mathrm{Inter}(E_h, H_h, S_h), t)$ and denote the model GIE2; and we remove the geometry interaction process entirely and denote the model GIE3 (the three variants are written out below). As shown in Table 6 in the Appendix, GIE1, GIE2 and GIE3 perform poorly compared to GIE. This experiment shows that geometry interaction is a critical step for embedding performance, since it is extremely beneficial for learning reliable spatial structures of KGs.
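For concreteness, the three ablated scores can be written compactly as follows, under our reading that GIE3 replaces $\mathrm{Inter}(\cdot)$ with the plain Euclidean embeddings:

$$
\phi_{\mathrm{GIE1}}(h,r,t) = -\,d_c\!\left(\mathrm{Inter}(E_h,H_h,S_h),\,t\right) + b, \qquad
\phi_{\mathrm{GIE2}}(h,r,t) = -\,d_c\!\left(\mathrm{Inter}(E_t,H_t,S_t),\,h\right) + b,
$$

$$
\phi_{\mathrm{GIE3}}(h,r,t) = -\left( d_c(E_h,\,t) + d_c(E_t,\,h) \right) + b .
$$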
## Conclusion

In this paper, we design a new knowledge graph embedding model named GIE, which exploits the structures of knowledge graphs through interaction between the Euclidean, hyperbolic and hyperspherical spaces simultaneously. Theoretically, GIE is advantageous in its capability to learn reliable spatial structure features in an adaptive way. Moreover, GIE embraces rich and expressive semantic matching between entities and satisfies the key requirements of relational representation learning. Empirical evaluations on three well-established datasets show that GIE achieves overall state-of-the-art performance, outperforming multiple recent strong baselines with fewer free parameters.

## Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2018AAA0102003; in part by the National Natural Science Foundation of China under Grants U21B2038, U1936208, 61931008, 61836002, 61733007, 6212200758 and 61976202; in part by the Fundamental Research Funds for the Central Universities; in part by the National Postdoctoral Program for Innovative Talents under Grant BX2021298; in part by the Youth Innovation Promotion Association CAS; and in part by the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant XDB28000000.

## References

Balazevic, I.; Allen, C.; and Hospedales, T. M. 2019a. Hypernetwork Knowledge Graph Embeddings. In ICANN, 553-565.
Balazevic, I.; Allen, C.; and Hospedales, T. M. 2019b. Multi-relational Poincaré Graph Embeddings. In NeurIPS, 4465-4475.
Balažević, I.; Allen, C.; and Hospedales, T. M. 2019. TuckER: Tensor Factorization for Knowledge Graph Completion. In EMNLP, 5185-5194.
Bansal, T.; Juan, D.-C.; Ravi, S.; and McCallum, A. 2019. A2N: Attending to Neighbors for Knowledge Graph Inference. In ACL, 4387-4392.
Berant, J.; Chou, A.; Frostig, R.; and Liang, P. 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In ACL, 1533-1544.
Berant, J.; and Liang, P. 2014. Semantic Parsing via Paraphrasing. In ACL, 1415-1425.
Bordes, A.; Usunier, N.; Garcia-Duran, A.; Weston, J.; and Yakhnenko, O. 2013. Translating Embeddings for Modeling Multi-relational Data. In NeurIPS, 2787-2795.
Cao, Z.; Xu, Q.; Yang, Z.; Cao, X.; and Huang, Q. 2021. Dual Quaternion Knowledge Graph Embeddings. In AAAI, 6894-6902.
Chami, I.; Wolf, A.; Juan, D.-C.; Sala, F.; Ravi, S.; and Ré, C. 2020. Low-Dimensional Hyperbolic Knowledge Graph Embeddings. In ACL.
Dettmers, T.; Minervini, P.; Stenetorp, P.; and Riedel, S. 2017. Convolutional 2D Knowledge Graph Embeddings. In AAAI, 1811-1818.
Diefenbach, D.; Singh, K. D.; and Maret, P. 2018. WDAqua-core1: A Question Answering Service for RDF Knowledge Bases. In WWW, 1087-1091.
Dong, X.; Gabrilovich, E.; Heitz, G.; Horn, W.; and Zhang, W. 2014. Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion. In KDD.
Ebisu, T.; and Ichise, R. 2018. TorusE: Knowledge Graph Embedding on a Lie Group. In AAAI, 1819-1826.
Everett, M. G.; and Krackhardt, D. 2012. A Second Look at Krackhardt's Graph Theoretical Dimensions of Informal Organizations. Social Networks, 34(2): 159-163.
Ganea, O.; Bécigneul, G.; and Hofmann, T. 2018. Hyperbolic Neural Networks. In NeurIPS, 5350-5360.
Gong, J.; Wang, S.; Wang, J.; Feng, W.; Peng, H.; Tang, J.; and Yu, P. S. 2020. Attentional Graph Convolutional Networks for Knowledge Concept Recommendation in MOOCs in a Heterogeneous View. In SIGIR, 79-88.
Gu, A.; Sala, F.; Gunel, B.; and Ré, C. 2019a. Learning Mixed-Curvature Representations in Product Spaces. In ICLR.
Gu, A.; Sala, F.; Gunel, B.; and Re, C. 2019b. Learning Mixed-Curvature Representations in Products of Model Spaces. In ICLR.
Guo, Q.; Zhuang, F.; Qin, C.; Zhu, H.; Xie, X.; Xiong, H.; and He, Q. 2020. A Survey on Knowledge Graph-Based Recommender Systems. CoRR, abs/2003.00911.
He, H.; Balakrishnan, A.; Eric, M.; and Liang, P. 2017. Learning Symmetric Collaborative Dialogue Agents with Dynamic Knowledge Graph Embeddings. In ACL, 1766-1776.
He, S.; Liu, K.; Ji, G.; and Zhao, J. 2015. Learning to Represent Knowledge Graphs with Gaussian Embedding. In CIKM, 623-632.
Ji, G.; He, S.; Xu, L.; Liu, K.; and Zhao, J. 2015. Knowledge Graph Embedding via Dynamic Mapping Matrix. In ACL, 687-696.
Jin, D.; Huo, C.; Liang, C.; and Yang, L. 2021a. Heterogeneous Graph Neural Network via Attribute Completion. In WWW, 391-400.
Jin, D.; Yu, Z.; Jiao, P.; Pan, S.; Yu, P. S.; and Zhang, W. 2021b. A Survey of Community Detection Approaches: From Statistical Modeling to Deep Learning. IEEE Transactions on Knowledge and Data Engineering, DOI: 10.1109/TKDE.2021.3104155.
Keizer, S.; Guhe, M.; Cuayáhuitl, H.; Efstathiou, I.; Engelbrecht, K.; Dobre, M. S.; Lascarides, A.; and Lemon, O. 2017. Evaluating Persuasion Strategies and Deep Reinforcement Learning Methods for Negotiation Dialogue Agents. In EACL, 480-484.
Lacroix, T.; Usunier, N.; and Obozinski, G. 2018. Canonical Tensor Decomposition for Knowledge Base Completion. In ICML, 2863-2872.
Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; and Zhu, X. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. In AAAI, 2181-2187.
Mahdisoltani, F.; Biega, J.; and Suchanek, F. 2013. YAGO3: A Knowledge Base from Multilingual Wikipedias. In CIDR.
Mahdisoltani, F.; Biega, J.; and Suchanek, F. M. 2015. YAGO3: A Knowledge Base from Multilingual Wikipedias. In CIDR.
Nathani, D.; Chauhan, J.; Sharma, C.; and Kaul, M. 2019. Learning Attention-based Embeddings for Relation Prediction in Knowledge Graphs. In ACL, 4710-4723.
Nguyen, D. Q.; Nguyen, T. D.; Nguyen, D. Q.; and Phung, D. 2018. A Novel Embedding Model for Knowledge Base Completion Based on Convolutional Neural Network. In NAACL-HLT, 327-333.
Nickel, M.; Rosasco, L.; and Poggio, T. 2016. Holographic Embeddings of Knowledge Graphs. In AAAI, 1955-1961.
Nickel, M.; Tresp, V.; and Kriegel, H. P. 2011. A Three-Way Model for Collective Learning on Multi-Relational Data. In ICML.
Schlichtkrull, M.; Kipf, T. N.; Bloem, P.; Van Den Berg, R.; Titov, I.; and Welling, M. 2018. Modeling Relational Data with Graph Convolutional Networks. In ESWC, 593-607. Springer.
Socher, R.; Chen, D.; Manning, C. D.; and Ng, A. Y. 2013. Reasoning With Neural Tensor Networks for Knowledge Base Completion. In NeurIPS.
Sun, Z.; Deng, Z.; Nie, J.; and Tang, J. 2019. RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space. In ICLR, 1-18.
Tang, Y.; Huang, J.; Wang, G.; He, X.; and Zhou, B. 2020. Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding. In ACL, 2713-2722.
Trouillon, T.; Welbl, J.; Riedel, S.; Gaussier, E.; and Bouchard, G. 2016. Complex Embeddings for Simple Link Prediction. In ICML, 2071-2080.
Wang, S.; Wei, X.; dos Santos, C. N.; Wang, Z.; Nallapati, R.; Arnold, A.; Xiang, B.; Philip, S. Y.; and Cruz, I. F. 2021. Mixed-Curvature Multi-Relational Graph Neural Network for Knowledge Graph Completion. In WWW.
Wang, Z.; Zhang, J.; Feng, J.; and Chen, Z. 2014. Knowledge Graph Embedding by Translating on Hyperplanes. In AAAI, 1112-1119.
Xiao, H.; Huang, M.; and Zhu, X. 2016a. From One Point to a Manifold: Knowledge Graph Embedding for Precise Link Prediction. In IJCAI, 1315-1321.
Xiao, H.; Huang, M.; and Zhu, X. 2016b. TransG: A Generative Model for Knowledge Graph Embedding. In ACL, 2316-2325.
Yang, B.; Yih, W.; He, X.; Gao, J.; and Deng, L. 2015. Embedding Entities and Relations for Learning and Inference in Knowledge Bases. In ICLR, 1-13.
Yu, Z.; Jin, D.; Liu, Z.; He, D.; Wang, X.; Tong, H.; and Han, J. 2021. AS-GCN: Adaptive Semantic Architecture of Graph Convolutional Networks for Text-Rich Networks. In ICDM.
Zhang, S.; Tay, Y.; Yao, L.; and Liu, Q. 2019. Quaternion Knowledge Graph Embeddings. In NeurIPS, 2731-2741.