# Cone Semantics for Logics with Negation

Özgür Lütfü Özçep¹, Mena Leemhuis¹ and Diedrich Wolter²
¹University of Lübeck, Germany  ²University of Bamberg, Germany
oezcep@ifis.uni-luebeck.de, mena.leemhuis@student.uni-luebeck.de, diedrich.wolter@uni-bamberg.de

**Abstract.** This paper presents an embedding of ontologies expressed in the description logic ALC into a real-valued vector space, comprising restricted existential and universal quantifiers as well as concept negation and concept disjunction. Our main result states that an ALC ontology is satisfiable in the classical sense iff it is satisfiable by a partial, faithful geometric model based on cones. The line of work to which we contribute aims to integrate knowledge representation techniques and machine learning. The new cone model of ALC proposed in this work gives rise to conic optimization techniques for machine learning, extending previous approaches by its ability to model full ALC.

## 1 Introduction

The idea of embedding words into low-dimensional continuous vector spaces has been implemented successfully in various algorithms with various applications in the realm of information retrieval [Goldberg and Levy, 2014; Pennington et al., 2014; Levy and Goldberg, 2014]. However, these approaches are insensitive to the relational, or more generally the predicate-logical, structure of documents. The embedding idea was pushed further (see, e.g., Nickel et al. [2011]; Bordes et al. [2013] and, for an overview, Wang et al. [2017]) in order to design embeddings of knowledge graphs, i.e., of sets of triples of the form (a R b) for objects a and b and relations R, or even embeddings of knowledge graphs with additional background knowledge, say an ontology consisting of axioms in some (expressive) logic [Mehran Kazemi and Poole, 2018; Kulmanov et al., 2019].

Many classical approaches to knowledge graph embeddings such as TransE [Bordes et al., 2013] suffer from a lack of full expressivity in the sense laid down by Mehran Kazemi and Poole [2018]: given a knowledge graph and a set of triples known to be true (positive set) as well as triples that are known to be false (negative set), the embedding is said to be fully expressive if it maps the relations and constants into a space such that (a R b) holds in the embedding iff it is in the positive set, and (a R b) does not hold iff it is in the negative set. The lack of (full) expressivity of classical approaches to knowledge graph embeddings is due to a decision on how to model relations R: rather than following the classical logic approach, relations are modelled with computationally feasible data structures (e.g., in the case of TransE by vector translations) that fit into the general mathematical framework of continuous embeddings. In fact, following the translation approach of TransE one can see that the induced logic is not that of arbitrary relations but that of functional relations.

These observations motivate research on embeddings that strike a better compromise between the geometric models that can be constructed by means of learning and the demands of ontologies. This paper proposes a new embedding of ontologies expressed in the description logic ALC into a real-valued vector space, providing restricted forms of existential and universal quantifiers as well as concept negation and concept disjunction.
Our main result states that an ALC ontology is satisfiable in the classical sense iff it is satisfiable by a geometric model that interprets all concept descriptions as cones (concretely: axis-aligned cones). We derive this result by first considering the Boolean part of ALC ontologies and then generalizing it to ALC with concepts up to a fixed quantifier rank. The geometric models we use are partial and thus allow some uncertainty to be retained: if x is only known to be a member of the union of two atomic concepts, then our partial model will not commit to saying to which atomic concept x belongs. Put differently, we need partial models in order to represent the knowledge contained in the ontology faithfully: exactly those axioms derivable from the ontology should be represented in a partial model.

This paper continues by reviewing related approaches. Section 3 summarizes ALC, and Section 4 introduces al-cones and discusses their motivation. In Section 5 we present an embedding for propositional ALC and show its completeness. Section 6 then considers the non-propositional case. The paper concludes with a brief discussion of the results.

## 2 Related Work

When combining knowledge representation (KR) and machine learning (ML) techniques, ML algorithms are either used to learn an ontology, or they exploit ontologies as constraint specifications in order to optimize statistical models. One recent approach along this line that is based on a geometric interpretation of knowledge graph embeddings is presented in Kulmanov et al. [2019]. Here, the lightweight logic EL++ is considered. The method is based on a geometric interpretation of TransE [Bordes et al., 2013], which interprets objects as vectors and relations as translations of these vectors. A concept is in this case an open n-ball with a fixed radius, and relations are translations of the n-balls. The method is limited due to the restricted constructors available in EL++, which in particular has no full concept negation.

In Gutiérrez-Basulto and Schockaert [2018], geometric interpretations are defined as classical interpretations of an ontology with two specific constraints: the domain is a Euclidean space $\mathbb{R}^m$ of some dimension m, and the interpretations of all (n-ary) relations are constrained to convex sets over the n-fold Cartesian product $(\mathbb{R}^m)^n = \mathbb{R}^{nm}$ of $\mathbb{R}^m$. The main difference to classical knowledge graph approaches is the idea of interpreting n-ary relations (classically) over n-fold Cartesian products and (non-classically) requiring them to be convex sets. Their approach covers ontologies expressed as rules in datalog± [Calì et al., 2009], which admits rules with existentials in the head (alias tuple-generating dependencies) and integrity constraints, i.e., rules of the form $\forall \vec{x}\, \neg\psi(\vec{x})$ with a conjunction of atoms $\psi(\vec{x})$. Their main result is that an ontology satisfying a specific constraint (quasi-chainedness) is satisfiable classically iff it is satisfiable by a geometric interpretation in which all relations are interpreted by convex sets.

The use of convex sets for the interpretation of relations can be justified by their importance as, on the one hand, computationally feasible data structures as used in convex optimization [Boyd and Vandenberghe, 2004] and, on the other hand, a linguistically and cognitively justified structure for representing concepts [Gärdenfors, 2000].
Convexity is preserved under many operations; in particular it is preserved under intersection and projection, which are the main operations expressible in the fragment allowed by Gutiérrez-Basulto and Schockaert [2018]. Interestingly, due to the use of integrity constraints, these ontologies implicitly use some aspect of negation, namely that of disjointness. But this is not full negation: implicitly, negation is allowed to occur on the right-hand side of a rule as atomic negation, but not as negation on the left-hand side. In the latter case negation would allow us to express coverage aspects, which amounts to the use of disjunction. This restriction in the language is not surprising, as convexity is preserved neither under set complement nor under set union. In our approach, we try to have the cake (stick to convexity) and eat it too (allow for negation).

Our approach follows that of Gutiérrez-Basulto and Schockaert [2018] in interpreting relations as tuples over a domain. Our approach also presumes finite satisfiability of the ontology, but works dually by constructing concepts on the axes and then placing individuals on these. The reason is that we incorporate an additional structure, namely a scalar product, which in turn induces the negation operator; the negation operator constrains the potential places at which the negations of concepts can be located. This, in particular, prevents adapting the quasi-chainedness property, which was defined for datalog± and not for ALC.

## 3 The Description Logic ALC

We are going to work with the description logic ALC [Baader, 2003]. We assume that there is a DL vocabulary given by a set of constants $N_c$, a set of role names $N_R$, and a set of concept names $N_C$. The ALC concepts (concept descriptions) over $N_C \cup N_R$ are described by the grammar

$$C ::= A \mid \top \mid \bot \mid \neg C \mid C \sqcap C \mid C \sqcup C \mid \forall R.C \mid \exists R.C$$

where $A \in N_C$ is an atomic concept, $R \in N_R$ is a role symbol, and C stands for arbitrary concepts.

A classical ALC interpretation is a pair $(\Delta, (\cdot)^I)$ consisting of a set $\Delta$, called the domain, and an interpretation function $(\cdot)^I$ which maps constants to elements of $\Delta$, concept names to subsets of $\Delta$, and role names to subsets of $\Delta \times \Delta$. The semantics of arbitrary concept descriptions for a given interpretation I is given in Table 1.

| Name | Syntax | Semantics |
| --- | --- | --- |
| bottom | $\bot$ | $\emptyset$ |
| conjunction | $C \sqcap D$ | $C^I \cap D^I$ |
| disjunction | $C \sqcup D$ | $C^I \cup D^I$ |
| negation | $\neg C$ | $\Delta \setminus C^I$ |
| universal quantifier | $\forall R.C$ | $\{x \in \Delta \mid \text{for all } y \in \Delta: \text{if } (x, y) \in R^I \text{ then } y \in C^I\}$ |
| existential quantifier | $\exists R.C$ | $\{x \in \Delta \mid \text{there is } y \in \Delta \text{ with } (x, y) \in R^I \text{ and } y \in C^I\}$ |

Table 1: Syntax and semantics for the DL ALC.

An ontology O is defined as a pair $O = (T, A)$ of a terminological box (tbox) T and an assertional box (abox) A. A tbox consists of general concept inclusion axioms (GCIs) $C \sqsubseteq D$ ("C is subsumed by D") with concept descriptions C, D. An abox consists of a finite set of assertions, i.e., facts of the form $C(a)$ or of the form $R(a, b)$ for $a, b \in N_c$. An interpretation I models a GCI $C \sqsubseteq D$, for short $I \models C \sqsubseteq D$, iff $C^I \subseteq D^I$. It models an abox axiom $C(a)$, for short $I \models C(a)$, iff $a^I \in C^I$, and it models an abox axiom of the form $R(a, b)$ iff $(a^I, b^I) \in R^I$. An interpretation is a model of an ontology $(T, A)$ iff it models all axioms appearing in $T \cup A$. An ontology O entails a (tbox or abox) axiom ax, for short $O \models ax$, iff all models of O are also models of ax.

The quantifier rank of an arbitrary concept C is the maximal nesting depth of quantifiers in it. Formally, for an ALC concept define the quantifier rank by recursion as $qr(A) = 0$, $qr(\neg C) = qr(C)$, $qr(C_1 \sqcap C_2) = qr(C_1 \sqcup C_2) = \max\{qr(C_1), qr(C_2)\}$, and $qr(\forall R.C) = qr(\exists R.C) = qr(C) + 1$.
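To illustrate the recursion, here is a minimal sketch (our own illustration, not from the paper) that computes the quantifier rank of concept descriptions; the nested-tuple encoding of concepts and the function name `qr` are hypothetical choices made only for this example.

```python
# Hypothetical encoding of ALC concept descriptions as nested tuples:
#   atomic concept / top / bottom:  "A", "TOP", "BOT"
#   negation:     ("not", C)
#   conjunction:  ("and", C, D)      disjunction: ("or", C, D)
#   quantifiers:  ("forall", "R", C), ("exists", "R", C)

def qr(concept):
    """Quantifier rank: the maximal nesting depth of quantifiers in a concept."""
    if isinstance(concept, str):              # atomic concept, TOP or BOT
        return 0
    op = concept[0]
    if op == "not":
        return qr(concept[1])
    if op in ("and", "or"):
        return max(qr(concept[1]), qr(concept[2]))
    if op in ("forall", "exists"):
        return qr(concept[2]) + 1
    raise ValueError(f"unknown constructor: {op}")

# qr(forall R.(A and exists R.B)) = 2
print(qr(("forall", "R", ("and", "A", ("exists", "R", "B")))))
```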
Each tbox T generates a Boolean algebra, the so-called Lindenbaum-Tarski algebra, as follows. For concepts C, D let $\equiv$ be the relation defined by $C \equiv D$ iff $T \models C \sqsubseteq D$ and $T \models D \sqsubseteq C$. The relation $\equiv$ is an equivalence relation, inducing for each concept C an equivalence class $[C]$. Define operations $\sqcap, \sqcup, \neg$ on the equivalence classes by setting $[C] \sqcap [D] = [C \sqcap D]$, $[C] \sqcup [D] = [C \sqcup D]$, and $\neg[C] = [\neg C]$; these operations can be shown to fulfil the axioms of a Boolean algebra.

## 4 Al-Cone Models

Our geometric interpretations are based on axis-aligned cones (al-cones), which form a subclass of the class of convex cones. A convex cone X is a set that fulfils the following property: if $v, w \in X$, then also $\lambda v + \mu w \in X$ for all $\lambda, \mu \ge 0$. The polar cone $X^\circ$ of X is defined for Euclidean spaces with a scalar (dot) product $\langle \cdot, \cdot \rangle$ as follows:

$$X^\circ = \{v \in \mathbb{R}^n \mid \forall w \in X: \langle v, w \rangle \le 0\}$$

In the sequel, we use the term cone to refer to convex cones. Important cones are the so-called n-orthants or n-hyperoctants; for dimension $n = 2$ these are the four quadrants. Using the abbreviations $\mathbb{R}^+ = \{x \in \mathbb{R} \mid x \ge 0\}$ and $\mathbb{R}^- = \{x \in \mathbb{R} \mid x \le 0\}$, we can define the $2^n$ n-dimensional hyperoctants $H_{(b_1, \dots, b_n)}$ with $b_i \in \{+, -\}$ by $\mathbb{R}^{b_1} \times \cdots \times \mathbb{R}^{b_n}$. Cones can also be constructed as unions of neighboring hyperoctants: consider, next to $+$ and $-$, also the value u (for union) and define $\mathbb{R}^u = \mathbb{R}$. Then the axis-aligned unions of hyperoctants are the sets $\mathbb{R}^{b_1} \times \cdots \times \mathbb{R}^{b_n}$ with $b_i \in \{+, -, u\}$. By further allowing projections of all the sets mentioned above onto the axes of the space we obtain the al-cones:

$$X \text{ is an al-cone} \;:\Longleftrightarrow\; X = X_1 \times \cdots \times X_n \text{ with } X_i \in \{\mathbb{R}, \mathbb{R}^+, \mathbb{R}^-, \{0\}\}$$

So, in n dimensions there are $4^n$ possible al-cones.

Let us now see how the Boolean operations can be modeled. Aside from set-based intersection, which represents conjunction, special attention is required for negation and disjunction. All al-cones are convex cones. Convex cones are preserved under intersection, polarity, and other operations (such as projection), but not under set union. This applies to al-cones as well; however, by interpreting disjunction via de Morgan's law we can circumvent that problem.

**Proposition 1.** Al-cones X, Y in $\mathbb{R}^n$ are closed as follows:

1. The intersection $X \cap Y$ of X and Y is an al-cone.
2. The polar $X^\circ$ is an al-cone.
3. $(X^\circ)^\circ = X$.
4. $(X^\circ \cap Y^\circ)^\circ$, the convex union of X and Y, is an al-cone.

*Proof sketch.* (1) holds as the set of admissible components $X_i$ in the al-cone definition is closed under intersection. For (2), consider one component $X_i$ and suppose it is $\mathbb{R}^+$ (the other cases follow similarly). Let $v \in X$ be of the form $(0, \dots, 0, x_i, 0, \dots, 0)$ with $x_i > 0$. Since $X^\circ$ contains exactly those vectors u such that $\langle u, v \rangle \le 0$ for all $v \in X$, the i-th component $(X^\circ)_i$ of the polar of X cannot contain positive values, which disqualifies it from being $\mathbb{R}$ or $\mathbb{R}^+$. On the other hand, $(X^\circ)_i$ must contain the entire $\mathbb{R}^-$, because for any non-positive $u_i$ the defining condition holds; hence $(X^\circ)_i = \mathbb{R}^-$. (3) is known to hold for arbitrary closed convex cones [Rockafellar, 1997, pp. 121–122]. (4) immediately follows from (1) and (2).
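As a quick numerical sanity check of the polar-cone definition and of items (2) and (3) of Proposition 1, the following sketch (our own illustration, assuming NumPy is available) tests membership in the polar of a finitely generated cone; it suffices to check the defining inequality against the generators, since every element of the cone is a non-negative combination of them.

```python
import numpy as np

def in_polar(v, generators, tol=1e-9):
    """v lies in the polar cone of X iff <v, g> <= 0 for every generator g of X."""
    return all(np.dot(v, g) <= tol for g in generators)

# X = the al-cone R+ x R+ (first quadrant of R^2), generated by e1 and e2.
X_gen = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

print(in_polar(np.array([-1.0, -2.0]), X_gen))  # True:  the polar is the third quadrant
print(in_polar(np.array([1.0, -2.0]), X_gen))   # False: first coordinate is positive

# Proposition 1(3): the polar R- x R- is generated by -e1 and -e2, and its polar
# contains (2, 3) from X but not (-1, 2), which lies outside X.
Xpolar_gen = [-g for g in X_gen]
print(in_polar(np.array([2.0, 3.0]), Xpolar_gen))   # True
print(in_polar(np.array([-1.0, 2.0]), Xpolar_gen))  # False
```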
Before we further embark on technicalities, let us motivate the specifics of our approach by discussing two questions: first, why should one use the polarity operator as the interpretation of concept negation? Second, why should we restrict ourselves to axis-aligned cones?

### 4.1 Motivation of Polarity-Based Negation

The use of the polarity operation for concept negation is motivated by the idea of providing an operator that always maps a concept to a disjoint concept such that the disjoint concept is maximally so w.r.t. the underlying similarity structure. Here, the similarity structure is given by the usual scalar product of the Euclidean space. Two reasons substantiate our choice.

As for the first reason, one can observe that the idea of considering polarity as a form of negation is related to a general approach of defining negation on the basis of some binary complementarity relation com, as explained, e.g., in Dunn [1996] for propositional logics: relation com holds between two propositions iff they are complementary in the sense that there is no situation in which both propositions are true (but still both may be false). Then the negation of a proposition p can be defined as the disjunction of all those propositions q that are complementary to p. Of course, we are interested here in reading subsets C of the embedding space $\mathbb{R}^n$ as concepts and points of $\mathbb{R}^n$ as objects falling into the extension of C. But, drawing a closer connection to the approach in Dunn [1996], one could equally think of the points of $\mathbb{R}^n$ as possible worlds (or states). Then the set C becomes a proposition which is exactly true at the points contained in it.

Figure 1: Illustration of Farkas' lemma (a) and a counterexample to distributivity for arbitrary cones (b), where $(C \sqcap (A \sqcup B))^I = C^I \neq ((C \sqcap A) \sqcup (C \sqcap B))^I = \bot^I$.

As for the second reason, one can consider the following fact, known as Farkas' lemma (see Figure 1a).

**Lemma 1** (Farkas' lemma). Let C be the convex cone generated by vectors $v_1, \dots, v_n \in \mathbb{R}^n$ (i.e., C is the smallest convex cone containing all $v_i$) and let $w \in \mathbb{R}^n$. Then either $w \in C$ or there is a $z \in \mathbb{R}^n$ such that $\langle z, v_i \rangle \le 0$ for all i and $\langle z, w \rangle > 0$.

The lemma says that if one considers a vector w that is not in the convex cone, i.e., that is in the complement $\mathbb{R}^n \setminus C$ (the usual negation of C), then there is at least one verifier z that is similar to w (namely $\langle z, w \rangle > 0$) and that is contained in the polar cone of C; note that from $\langle z, v_i \rangle \le 0$ for all i it follows that $\langle z, v \rangle \le 0$ for all $v \in C$. In the other direction, if one can find a z in the polar cone that is similar to w, then w is in the complement of C. In conclusion, polarity can be considered to provide a more intuitive model of negation for scalar-product based similarity structures than set complement.

### 4.2 Motivation for Considering Al-Cones

Firstly, the usefulness of al-cone models hinges on whether they are able to represent an interesting class of ontologies. In the remainder of this paper we show that al-cone models are indeed complete in the sense that an ontology is satisfiable classically iff it can be embedded into a geometric model based on al-cones.

Secondly, al-cones are motivated by their ability to link to ML. So far we have motivated the use of convex cones as proper structures to handle negation, yet we emphasize that convex cones are also used as computationally feasible data structures in the area of conic optimization (see, e.g., Boyd and Vandenberghe [2004], Section 4.3 on linear optimization problems and Section 4.4.2 on second-order cone programming) and are therefore attractive for ML applications. Here we only regard the case where one aims to learn a statistical model for data that can be characterized by some ontology that has been specified in a logic beforehand, not the case of investigating the logics induced by the intersection and polarity operators for arbitrary cones. In fact, the resulting logic cannot be guaranteed to be Boolean (and so not full ALC), as it would not fulfil the distributivity property for $\sqcap$ and $\sqcup$. A simple example is given in Figure 1b.
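The failure of distributivity for arbitrary (non-axis-aligned) cones can also be reproduced numerically. The following sketch is our own illustration of the kind of counterexample depicted in Figure 1b, assuming NumPy and SciPy are available; membership in a finitely generated cone is decided by a non-negative least-squares fit over its generators.

```python
import numpy as np
from scipy.optimize import nnls

def in_cone(v, generators, tol=1e-9):
    """v lies in the convex cone spanned by the generators iff it is a
    non-negative combination of them (checked via non-negative least squares)."""
    G = np.column_stack(generators)
    _, residual = nnls(G, v)
    return residual < tol

# Three rays in R^2: the ray C lies strictly between A and B but meets neither.
A = [np.array([2.0, 1.0])]
B = [np.array([1.0, 2.0])]
C = [np.array([1.0, 1.0])]

# Convex union of A and B: the smallest convex cone containing both rays, i.e.
# the cone spanned by the generators of A and B together.
A_join_B = A + B

w = C[0]                             # a non-zero element of C
print(in_cone(w, A_join_B))          # True:  C ⊓ (A ⊔ B) = C, which is not ⊥
print(in_cone(w, A), in_cone(w, B))  # False False: C ⊓ A = C ⊓ B = {0},
                                     # hence (C ⊓ A) ⊔ (C ⊓ B) = ⊥
```

Restricting to al-cones avoids this problem: as shown in Section 5, the al-cone operations form a Boolean algebra, so distributivity is regained.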
Let us sketch a specific incremental learning scenario. We are given an ALC tbox. The training data are points from $\mathbb{R}^n$ with a concept label, or pairs of points with a role label. The dimension n is fixed in advance and represents features. For ease of exposition we assume that the number n equals the number of atoms in the Boolean algebra over the tbox. In this scenario the hypothesis set is chosen as the al-cones in $\mathbb{R}^n$ together with all set systems of cones that result from rotating the al-cones by the same (arbitrarily chosen) angle. On receiving some training datum, the algorithm is required to choose some skeleton of al-cones and possibly rotate it by some angle $\alpha$ in order to fit the datum. We do not specify here how to choose $\alpha$, as there may be various possible rotations that can be chosen following, say, an optimization principle. In the training phase, the algorithm may not be able to embed the data into any rotated skeleton. In this case, the algorithm would stop due to non-resolvable inconsistencies. These can be inconsistencies due to a feature dimension n chosen too small in the beginning, or due to our assumption of considering only rotations of al-cones and not other al-cone-preserving transformations such as warping, or inconsistencies w.r.t. the ontology. As the following propositions show, the inconsistency cannot be due to the fact that concepts are represented as al-cones.

## 5 Embedding for Propositional ALC

Let us start by considering ALC ontologies where the tbox language amounts to propositional logic. That is, we first restrict the definitions from Section 3 to the concept constructors $\sqcap$, $\sqcup$, $\neg$, corresponding to the Boolean operations $\wedge$, $\vee$, $\neg$. A Boolean ALC tbox (abox) consists of GCIs (abox axioms) using Boolean concepts only.

Let $(T, A)$ be a Boolean ALC ontology. We are going to define a geometric interpretation of a special kind that can be a model or an anti-model of a Boolean ontology. It is an ordinary model in the sense that it is a classical predicate-logical structure with a domain and interpretations for the atomic concept symbols. It is special in the sense that the domain is of a specific structure, namely a Euclidean space, and concept names are interpreted by al-cones and their projections to subspaces. It is non-classical in the sense that some logical constructors are not interpreted by the corresponding set operations, but by operations on al-cones. The interesting aspect is now that the logical aspects of the Boolean algebra provide global constraints on the model. These can be satisfied by choosing appropriate positions for the interpretations of all atomic concepts. As a consequence, an ontology is satisfiable classically iff it is satisfied by the geometric interpretation.

**Definition 1.** A Boolean al-cone interpretation I is a structure $(\Delta, (\cdot)^I)$ where $\Delta$ is $\mathbb{R}^n$ for some $n \in \mathbb{N}$, and where $(\cdot)^I$ maps each concept symbol A to some al-cone and each constant a to some element in $\Delta \setminus \{\vec{0}\}$. An al-cone interpretation for arbitrary Boolean ALC concepts is defined recursively as $(\top)^I = \Delta$, $(\bot)^I = \{\vec{0}\}$, $(C \sqcap D)^I = C^I \cap D^I$, $(\neg C)^I = (C^I)^\circ$, and $(C \sqcup D)^I = (\neg(\neg C \sqcap \neg D))^I$. The notions of an al-cone interpretation being a model and that of entailment are defined as in the classical case (but using al-cone interpretations).
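The recursive clauses of Definition 1 can be carried out purely symbolically on the label encoding of al-cones. The following minimal sketch is our own illustration (the label names and function names are not from the paper): an al-cone is a tuple of per-axis labels, intersection and polarity act component-wise, and disjunction is obtained via de Morgan, exactly as in Proposition 1.

```python
# Per-axis components of an al-cone, encoded as labels:
#   "0" = {0},   "+" = R+,   "-" = R-,   "R" = R (the whole axis)
ZERO, POS, NEG, FULL = "0", "+", "-", "R"

def comp_meet(x, y):
    """Intersection of two one-dimensional components."""
    if x == y:    return x
    if x == FULL: return y
    if y == FULL: return x
    return ZERO               # e.g. R+ ∩ R- = {0} and {0} ∩ R+ = {0}

def comp_polar(x):
    """Polar of a one-dimensional component w.r.t. the usual scalar product."""
    return {ZERO: FULL, FULL: ZERO, POS: NEG, NEG: POS}[x]

def meet(C, D):   # (C ⊓ D)^I = C^I ∩ D^I
    return tuple(comp_meet(c, d) for c, d in zip(C, D))

def neg(C):       # (¬C)^I = the polar cone of C^I, computed component-wise
    return tuple(comp_polar(c) for c in C)

def join(C, D):   # (C ⊔ D)^I = (¬(¬C ⊓ ¬D))^I, the convex union
    return neg(meet(neg(C), neg(D)))

# Two quadrants of R^2 as al-cones:
A = (NEG, POS)    # left upper quadrant
B = (POS, POS)    # right upper quadrant
print(neg(A))            # ('+', '-'): the polar cone of A
print(meet(B, neg(A)))   # ('+', '0'): the positive x-axis
print(join(A, B))        # ('R', '+'): the upper half-plane, the convex union
print(neg(neg(A)) == A)  # True, cf. Proposition 1(3)
```

This is exactly the configuration of concepts used in the example around Figure 2 below.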
Let us consider a simple example of a Boolean ontology, consisting of an empty tbox and an abox constructed as follows. Consider the Lindenbaum-Tarski algebra of all Boolean concepts defined over two concept names, say A and B. Then the generated Boolean algebra of classes of equivalent concepts has $2^{2^2} = 16$ elements. Choose for each element a smallest representative $C_i$, $i \in \{1, \dots, 16\}$, w.r.t. some ordering. $C_1$ is the bottom concept, and let us assume that $C_2$ is the concept B and $C_3$ is the concept $B \sqcap \neg A$. Now, for each i with $2 \le i \le 16$ take a constant $a_i$ and add the abox axiom $C_i(a_i)$. Thus we represent each of the 15 non-bottom concepts uniquely by some constant. For example, in Figure 2 the constant $a_2$ represents the concept $C_2 = B$.

Figure 2: Al-cone model for a simple abox with an empty tbox.

This ontology is satisfiable classically. There is an al-cone interpretation in $\mathbb{R}^2$ that fulfils the ontology, constructed as follows: interpret A by the left upper quadrant and B by the right upper quadrant. This uniquely induces the positions of all other al-cones corresponding to the other concepts $C_i$. The interesting point is the localization of the constants $a_i$: for each constant $a_i$ one locates the area corresponding to $C_i$ and positions $a_i$ in it such that it does not coincide with one of the points already associated with a constant. Figure 2 illustrates the positions of the concepts: $\bot$ is by definition the singleton $\{\vec{0}\}$, A is the left upper quadrant, $\top$ is the whole plane, etc. One can check that the concepts are associated with appropriate al-cones. For example, the negation $\neg A$ of A is indeed the polar cone of the quadrant of A. Similarly, consider $B \sqcap \neg A$, which is interpreted as the positive x-axis $\mathbb{R}^+ \times \{0\}$; its polar cone is the whole left half-plane.

As another example, consider the Boolean algebra induced by the atomic concepts A, B under the tbox axiom $A \sqsubseteq B$. Then, e.g., $A \sqcap B$ is equivalent to A, and so forth. This gives an embedding with axis-aligned cones as illustrated in Figure 3.

Figure 3: Al-cone model for the tbox $\{A \sqsubseteq B\}$.

Using the construction idea of the examples one can show that Boolean ALC ontologies are classically satisfiable iff they are satisfiable by a geometric model based on al-cones.

**Proposition 2.** Boolean ALC ontologies are classically satisfiable iff they are satisfiable by a geometric model on some finite $\mathbb{R}^n$ based on al-cones of the form $b_1 \times \cdots \times b_n$ with $b_i \in \{\{0\}, \mathbb{R}^+, \mathbb{R}^-, \mathbb{R}\}$ for $i \in \{1, \dots, n\}$.

*Proof sketch.* ($\Rightarrow$) For the general construction we use the encoding from the examples. The set of encodings $\{u, +, -, 0\}$ used in the example forms a diamond-shaped Boolean algebra $B_4$ with atoms $+$ and $-$. As the ontology O is Boolean and uses finitely many concept names, the induced Boolean algebra is finite, say of cardinality $2^k$, with a finite number m of atoms. We have to choose a dimension n such that $4^n \ge 2^k$ and $n \ge m/2$, the idea being that we have to represent all of the $2^k$ Boolean concepts by embedding the atoms of the Boolean algebra directly onto the half-axes of $\mathbb{R}^n$. Now one can see that $B_4^n$ is the n-wise Cartesian product of $B_4$, where $\sqcap, \sqcup, \neg$ are defined component-wise based on the $\sqcap, \sqcup, \neg$ defined on $B_4$. But in this way $\sqcap$ over $B_4^n$ is nothing else than set intersection, $\neg$ is polarity, and $\sqcup$ is defined by de Morgan. So, if O is satisfiable, then we can find a geometric model of it. ($\Leftarrow$) Given a geometric model, one can build a classical model by moving any named object that is not on an axis onto one of the axes that cover the al-cone in which it lies.
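The proof sketch of Proposition 2 rests on the component structure $B_4$ over the labels $\{0, +, -, u\}$ being a Boolean algebra whose operations are intersection, polarity and the de Morgan join. This can be confirmed by a brute-force check; the sketch below is our own illustration and not part of the paper (we write the top label as `"R"` rather than u).

```python
from itertools import product

ELEMENTS = ["0", "+", "-", "R"]      # the diamond B4: 0 below + and -, both below R

def b4_meet(x, y):                   # intersection of components
    if x == y:   return x
    if x == "R": return y
    if y == "R": return x
    return "0"

def b4_neg(x):                       # polarity of components
    return {"0": "R", "R": "0", "+": "-", "-": "+"}[x]

def b4_join(x, y):                   # join defined by de Morgan
    return b4_neg(b4_meet(b4_neg(x), b4_neg(y)))

# Exhaustively check distributivity and complementation on B4; the component-wise
# products B4^n used in the proof sketch of Prop. 2 then inherit these laws.
for x, y, z in product(ELEMENTS, repeat=3):
    assert b4_meet(x, b4_join(y, z)) == b4_join(b4_meet(x, y), b4_meet(x, z))
    assert b4_join(x, b4_meet(y, z)) == b4_meet(b4_join(x, y), b4_join(x, z))
assert all(b4_meet(x, b4_neg(x)) == "0" and b4_join(x, b4_neg(x)) == "R"
           for x in ELEMENTS)
print("B4 satisfies distributivity and complementation")
```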
Geometric models are more specific than classical models in the sense that they impose an underlying structure on the domain, given by, firstly, the dimensions, representing latent features that usually are not even mentioned in the ontology, and, secondly, a scalar product $\langle \cdot, \cdot \rangle$ over the space. Geometric interpretations are more general than classical interpretations in the sense that they are partial interpretations. Actually, when embedding an ontology into some space one expects the embedding to represent not a single interpretation, but all interpretations that make the ontology true. And this is (up to some exceptions, see below) the case in the construction of Proposition 2. This allows partial information to be encoded. Consider, e.g., the difference between $a_2$ and $a_3$ in the geometric model of Fig. 2. The individual $a_3$ is completely identified w.r.t. the given concepts A, B: it lies in the extension of B and in the extension of $\neg A$. For $a_2$ we only know that it must be a B, but we do not know whether it is also an A. Hence this geometric model correctly reflects the specific knowledge provable in the ontology: one can prove that $a_2$ is a B, but one can prove neither that it is an A nor that it is a $\neg A$.

Still, the geometric model illustrated in Fig. 2 does not reflect the whole knowledge (not) contained in an ontology in the correct way. For example, it is not possible to represent an object b which is known to be in $A \sqcup B$ but which is neither known to be an A nor known to be a B. The reason is that by our construction no place is left to reflect this partial knowledge.

The discussion above motivates the definition of faithful geometric models of a given ontology. As one has to deal with many models of an ontology, there are actually two different adaptations, a weak one and a strong one. Orthogonally to this distinction one may also consider faithfulness w.r.t. the abox only (the data only) or w.r.t. the tbox. An even finer distinction can be given by considering special subclasses of the abox (which we do here by distinguishing concept assertions from role assertions) and of the tbox (which we do here by considering concepts with a bounded quantifier rank).

**Definition 2.** Let O be a classically consistent (DL) ontology (or any other representation allowing the distinction between abox and tbox). For a geometric interpretation I we have the following notions of being a faithful model of O:

- I is a strongly concept-faithful model of O iff for each concept C and each constant b: if $b^I \in C^I$, then $O \models C(b)$;
- I is a weakly concept-faithful model of O iff for each concept C and each constant b: if $b^I \in C^I$, then $O \cup \{C(b)\}$ is satisfiable classically;
- I is a strongly (weakly) abox-faithful model of O iff it is strongly (weakly) concept-faithful and for each role R and constants a, b: if $(a^I, b^I) \in R^I$, then $O \models R(a, b)$ (resp. $O \cup \{R(a, b)\}$ is satisfiable classically);
- I is a strongly (weakly) tbox-faithful model of O iff for all tbox axioms $\tau$: if $I \models \tau$, then $O \models \tau$ (resp. $O \cup \{\tau\}$ is satisfiable).

When we consider classical geometric models such as those defined by Gutiérrez-Basulto and Schockaert [2018], it will in general not be possible to ensure strong faithfulness. So the notion of strong faithfulness makes sense only for models which allow for representing partial information. Now let us come back to our al-cones. We present two ways to gain faithfulness, the first using al-cones, the second using a subclass of al-cones.
**Proposition 3.** For classically satisfiable Boolean ALC ontologies there is a strongly concept-faithful and tbox-faithful geometric model on some finite $\mathbb{R}^{2n}$ based on al-cones of the form $b_1 \times \cdots \times b_{2n}$ with $b_{2i} \in \{\{0\}, \mathbb{R}^+, \mathbb{R}^-, \mathbb{R}\}$ and $b_{2i+1} = b_{2i}$. Here n is the number of atomic elements in the Boolean algebra generated by the tbox of the ontology.

*Proof sketch.* First a geometric model of the given tbox is generated in dimension n (using the method described in Prop. 2). In the resulting model every dimension is doubled; for instance, $A \sqcap B = \mathbb{R}^+ \times \mathbb{R}^- \times \mathbb{R}$ becomes $A \sqcap B = \mathbb{R}^+ \times \mathbb{R}^+ \times \mathbb{R}^- \times \mathbb{R}^- \times \mathbb{R} \times \mathbb{R}$. To ensure strong concept-faithfulness it must not be possible for any object in the abox to get embedded into a concept properly subsumed by its most specific concept (msc). In each concept properly subsumed by the object's msc one can find an entry at which the msc has $\mathbb{R}$ but the subsumed concept only has $\mathbb{R}^+$ or $\mathbb{R}^-$; with doubled dimensions the latter becomes $\mathbb{R}^+ \times \mathbb{R}^+$ or $\mathbb{R}^- \times \mathbb{R}^-$, while the msc has $\mathbb{R} \times \mathbb{R}$ there. Thus, if the object's coordinates at the corresponding pair of dimensions are chosen from, e.g., $\mathbb{R}^+ \times \mathbb{R}^-$, the object lies in its msc but in none of the concepts properly subsumed by it.

When disregarding $\mathbb{R}^+$ and $\mathbb{R}^-$ in al-cones we are left with coordinate subspaces, which are already sufficient to define faithful models.

**Proposition 4.** For classically satisfiable Boolean ALC ontologies there is a strongly concept-faithful and tbox-faithful geometric model on some finite $\mathbb{R}^n$ using sets of the form $b_1 \times \cdots \times b_n$ with $b_i = \{0\}$ or $b_i = \mathbb{R}$.

*Proof sketch.* Assume that an ALC tbox with n atomic elements is given. Arbitrarily assign them to the n coordinate axes, i.e., to the subspaces $\mathbb{R} \times \{0\} \times \cdots \times \{0\}$ through $\{0\} \times \cdots \times \{0\} \times \mathbb{R}$. Then assign the other concepts by applying disjunction component-wise. This result also follows from the well-known fact that orthogonal subspaces make up a Boolean algebra.

## 6 The Non-Boolean Case

Our main task is to define semantics for roles and quantifiers. We choose to define crisp extensions for roles. As ALC does not have role negation, we do not get areas of uncertainty for roles. More importantly, we do not consider special structures for roles (in particular no cones). Following the characterization of ALC according to Schild [1991], our definition of the existential quantifier has to satisfy $\exists R.\bot \equiv \bot$ and $\exists R.(C \sqcup D) \equiv \exists R.C \sqcup \exists R.D$. As we will define the existential quantifier via the universal quantifier, we have to ensure that $\forall R.\top \equiv \top$ and $\forall R.(C \sqcap D) \equiv \forall R.C \sqcap \forall R.D$.

**Definition 3.** An al-cone interpretation I is a Boolean al-cone interpretation $(\Delta, (\cdot)^I)$ that additionally maps each role R to a subset of $\Delta \times \Delta$ such that $\vec{0}$ does not appear in the first or second argument of $R^I$. The al-cone interpretation of ALC concepts is defined recursively as for Boolean concepts, interpreting concepts of the form $\forall R.C$ as the minimal al-cone containing the set $\{x \in \Delta \mid \text{for all } y \in \Delta: \text{if } (x, y) \in R^I \text{ then } y \in C^I\}$. The interpretation of the existential quantifier is given by de Morgan, i.e., $(\exists R.C)^I = (\neg\forall R.\neg C)^I$.

The constructions of the Boolean case do not carry over to non-Boolean ALC ontologies, due to the fact that the algebra of ALC concepts is no longer atomic. Consider the simple example $O_n = \{loves(narcis, narcis), Vain(narcis)\}$ from Baader and Küsters [2006]. Then narcis fulfils all concepts of the form $\exists R^n.Vain$ (for $R = loves$), which, connected conjunctively, give a chain of concepts $C_i$ that become narrower and narrower, defined by $C_0 := \exists R.Vain$ and $C_i := C_{i-1} \sqcap \exists R^{i+1}.Vain$.

We therefore take a middle way between the approach of Gutiérrez-Basulto and Schockaert [2018] and our aim of representing knowledge abox- and tbox-faithfully. The rationale for this is to support ML applications such as prediction.
Given an ML model, prediction is meant to categorize (new) objects w.r.t. the concepts represented in the model. If a geometric model has n dimensions, it can be asked queries of the form $A(a)$ for a concept label A and an n-vector a. More generally, we consider not only atomic concepts A but arbitrary ALC concepts. Of course, in a classical model the extensions of all concepts are determined by those of the atomic ones, but the point of our approach is to work with partial models. So let us assume that the queries are of the form $C(a)$ (assertion check) or of the form $C(x)$ with a variable x (concept retrieval). Now, the class of queries that are to be answered in a prediction scenario is constrained in size, where the size is formally specified by some parameter. We will assume in the following that this parameter is the quantifier rank (see Sect. 3). In order to find all potential inconsistencies in an ontology we also have to incorporate the length of the role paths in the abox. Let m be the minimum of the length of the role paths in the abox and the expected maximal quantifier rank of the queries in a prediction scenario. The set of ALC concepts of quantifier rank at most m still forms a Boolean algebra, and we still have $\forall R.\top \equiv \top$ as well as $\forall R.(C \sqcap D) \equiv \forall R.C \sqcap \forall R.D$. So, within this class we will have atomic elements, say n of them, which will be put on the axes of the finite-dimensional space $\mathbb{R}^n$.

**Proposition 5.** ALC ontologies are classically satisfiable iff they are satisfiable by a (strongly abox-faithful, rank-m-concept-faithful) geometric model on some finite $\mathbb{R}^n$ using sets of the form $b_1 \times \cdots \times b_n$ with $b_i \in \{\{0\}, \mathbb{R}^+, \mathbb{R}^-, \mathbb{R}\}$.

*Proof sketch.* ($\Rightarrow$) Assume that the ontology O is satisfiable classically. Consider the Lindenbaum-Tarski algebra of ALC concepts of quantifier rank at most m induced by the given tbox, and assume the number of its atomic concepts is n. As in the Boolean case, put these concepts onto the axes of a Euclidean space $\mathbb{R}^n$, respecting the Boolean operators. We obtain an al-cone interpretation as follows. Each atomic concept A is interpreted by the al-cone where it is placed according to the Lindenbaum-Tarski algebra embedding. It remains to specify the interpretation of the constants and the roles. For a constant a, find the most specific concept $\mathrm{msc}_{\mathrm{rank}\, m}(a)$ of rank at most m provable in the ontology and place a in some area not already occupied by another individual. In particular, if $C(a)$ follows from the ontology, then $C(a)$ is going to be true in the geometric model. For each role assertion $R(a, b)$ entailed by the ontology put $(a^I, b^I) \in R^I$. ($\Leftarrow$) Assume that O is not satisfiable classically. Then there is a complete closed tableau. This derivation can be mimicked in the geometric model, as all clashes are on the concept level, i.e., they concern only concept assertions of the form $C(a)$ and $\neg C(a)$ but not role assertions.

## 7 Conclusion and Outlook

Starting from an interpretation of negation as a polarity operator, we presented embeddings of ALC ontologies that interpret all concepts as axis-aligned cones. This result adds an interesting alternative to the embeddings considered so far. The resulting embeddings can be used in prediction scenarios where one is interested in predicting whether new samples belong to a concept. An open problem is whether roles can also be interpreted by cones (or other feasible structures).

As a side product of defining negation by polarity we obtain partial models. So there are individuals for which one does not know whether they belong to a concept or not. This is different from the approaches considered in classical embedding scenarios.
A fine-grained treatment of this uncertainty could proceed by considering partial models as in Hartonas [2016] or by providing an epistemic operator [Donini et al., 1998, for example]. Such an operator would allow us to distinguish between instances known to be in a specific concept C, i.e., those in $\mathbf{K}C$, and instances which could be in the concept or in its negation.

## References

[Baader and Küsters, 2006] Franz Baader and Ralf Küsters. Nonstandard inferences in description logics: The story so far. In D. M. Gabbay, S. S. Goncharov, and M. Zakharyaschev, editors, Mathematical Problems from Applied Logic I, volume 4 of International Mathematical Series, pages 1–75. Springer-Verlag, 2006.

[Baader, 2003] Franz Baader. Description logic terminology. In F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors, The Description Logic Handbook, pages 485–495. Cambridge University Press, 2003.

[Bordes et al., 2013] Antoine Bordes, Nicolas Usunier, Alberto García-Durán, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In Christopher J. C. Burges, Léon Bottou, Zoubin Ghahramani, and Kilian Q. Weinberger, editors, Advances in Neural Information Processing Systems 26 (NIPS 2013), pages 2787–2795, 2013.

[Boyd and Vandenberghe, 2004] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[Calì et al., 2009] Andrea Calì, Georg Gottlob, and Thomas Lukasiewicz. Datalog±: A unified approach to ontologies and integrity constraints. In Proceedings of the 12th International Conference on Database Theory, pages 14–30. ACM Press, 2009.

[Donini et al., 1998] Francesco M. Donini, Maurizio Lenzerini, Daniele Nardi, Werner Nutt, and Andrea Schaerf. An epistemic operator for description logics. Artificial Intelligence, 100(1):225–274, 1998.

[Dunn, 1996] Jon Michael Dunn. Generalized orthonegation. In Heinrich Wansing, editor, Negation – A Notion in Focus, Perspektiven der Analytischen Philosophie / Perspectives in Analytical Philosophy 7, pages 3–26. De Gruyter, 1996.

[Gärdenfors, 2000] Peter Gärdenfors. Conceptual Spaces: The Geometry of Thought. The MIT Press, Cambridge, Massachusetts, 2000.

[Goldberg and Levy, 2014] Yoav Goldberg and Omer Levy. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv e-prints, February 2014.

[Gutiérrez-Basulto and Schockaert, 2018] Víctor Gutiérrez-Basulto and Steven Schockaert. From knowledge graph embedding to ontology embedding? An analysis of the compatibility between vector space representations and rules. In Michael Thielscher, Francesca Toni, and Frank Wolter, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Sixteenth International Conference, KR 2018, pages 379–388. AAAI Press, 2018.

[Hartonas, 2016] Chrysafis Hartonas. Reasoning with incomplete information in generalized Galois logics without distribution: The case of negation and modal operators. In Katalin Bimbó, editor, J. Michael Dunn on Information Based Logics, pages 279–312. Springer International Publishing, Cham, 2016.

[Kulmanov et al., 2019] Maxat Kulmanov, Wang Liu-Wei, Yuan Yan, and Robert Hoehndorf.
EL Embeddings: Geometric construction of models for the description logic EL++. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), 2019.

[Levy and Goldberg, 2014] Omer Levy and Yoav Goldberg. Neural word embedding as implicit matrix factorization. In Zoubin Ghahramani, Max Welling, Corinna Cortes, Neil D. Lawrence, and Kilian Q. Weinberger, editors, Advances in Neural Information Processing Systems 27 (NIPS 2014), pages 2177–2185, 2014.

[Mehran Kazemi and Poole, 2018] Seyed Mehran Kazemi and David Poole. SimplE embedding for link prediction in knowledge graphs. arXiv e-prints, arXiv:1802.04868, February 2018.

[Nickel et al., 2011] Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning (ICML'11), pages 809–816. Omnipress, 2011.

[Pennington et al., 2014] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. In EMNLP, volume 14, pages 1532–1543, 2014.

[Rockafellar, 1997] Ralph Tyrrell Rockafellar. Convex Analysis. Princeton University Press, Princeton, NJ, 1997.

[Schild, 1991] Klaus Schild. A correspondence theory for terminological logics: Preliminary report. In Proceedings of IJCAI-91, pages 466–471, 1991.

[Wang et al., 2017] Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering, 29(12):2724–2743, December 2017.