# Cone Semantics for Logics with Negation

Özgür Lütfü Özçep¹, Mena Leemhuis¹ and Diedrich Wolter²
¹University of Lübeck, Germany  ²University of Bamberg, Germany
oezcep@ifis.uni-luebeck.de, mena.leemhuis@student.uni-luebeck.de, diedrich.wolter@uni-bamberg.de

**Abstract.** This paper presents an embedding of ontologies expressed in the description logic ALC into a real-valued vector space, comprising restricted existential and universal quantifiers as well as concept negation and concept disjunction. Our main result states that an ALC ontology is satisfiable in the classical sense iff it is satisfiable by a partial, faithful geometric model based on cones. The line of work to which we contribute aims to integrate knowledge representation techniques and machine learning. The new cone model of ALC proposed in this work gives rise to conic optimization techniques for machine learning, extending previous approaches by its ability to model full ALC.

## 1 Introduction

The idea of embedding words into low-dimensional continuous vector spaces has been implemented successfully in various algorithms with various applications in the realm of information retrieval [Goldberg and Levy, 2014; Pennington et al., 2014; Levy and Goldberg, 2014]. However, these approaches are insensitive to the relational, or more generally the predicate-logical, structure of documents. The embedding idea was pushed further (see, e.g., Nickel et al. [2011]; Bordes et al. [2013] and, for an overview, Wang et al. [2017]) in order to design embeddings of knowledge graphs, i.e., of sets of triples of the form (a R b) for objects a and b and relations R, or even embeddings of knowledge graphs with additional background knowledge, say an ontology consisting of axioms in some (expressive) logic [Mehran Kazemi and Poole, 2018; Kulmanov et al., 2019].

Many classical approaches to knowledge graph embeddings such as TransE [Bordes et al., 2013] suffer from a lack of full expressivity in the sense laid down by Mehran Kazemi and Poole [2018]: given a knowledge graph and a set of triples known to be true (positive set) as well as triples that are known to be false (negative set), the embedding is said to be fully expressive if it maps the relations and constants into a space such that (a R b) holds in the embedding iff it is in the positive set, and (a R b) does not hold iff it is in the negative set. The lack of (full) expressivity of classical approaches to knowledge graph embeddings is due to a decision on how to model relations R: rather than following the classical logic approach, relations are modelled with computationally feasible data structures (e.g., in the case of TransE by vector translations) that fit into the general mathematical framework of continuous embeddings. In fact, following the translation approach of TransE one can see that the induced logic is not that of arbitrary relations but that of functional relations.

These observations motivate research on embeddings that strike a better compromise between the geometric models that can be constructed by means of learning and the demands of ontologies. This paper proposes a new embedding of ontologies expressed in the description logic ALC into a real-valued vector space, providing restricted forms of existential and universal quantifiers as well as concept negation and concept disjunction.
Our main result states that an ALC ontology is satisfiable in the classical sense iff it is satisfiable by a geometric model that interprets all concept descriptions as cones (concretely: axis-aligned cones). We derive this result by first considering the Boolean part of ALC ontologies and then generalizing it to ALC with concepts up to a fixed quantifier rank. The geometric models we use are partial and thus allow some uncertainty to be retained: if x is only known to be a member of the union of two atomic concepts, then our partial model will not commit to saying to which atomic concept x belongs. Put differently, we need partial models in order to represent the knowledge contained in the ontology faithfully: exactly those axioms derivable from the ontology should be represented in a partial model.

This paper continues by reviewing related approaches. Section 3 summarizes ALC, and Section 4 introduces al-cones and discusses their motivation. In Section 5 we present an embedding for propositional ALC and show its completeness. Section 6 then considers the non-propositional case. The paper concludes with a brief discussion of the results.

## 2 Related Work

When combining knowledge representation (KR) and machine learning (ML) techniques, ML algorithms are either used to learn an ontology, or they exploit ontologies as constraint specifications in order to optimize statistical models. One recent approach along this line that is based on a geometric interpretation of knowledge graph embeddings is presented in Kulmanov et al. [2019]. Here, the lightweight logic EL++ is considered. The method is based on a geometric interpretation of TransE [Bordes et al., 2013], which interprets objects as vectors and relations as translations of these vectors. A concept is in this case an open n-ball with a fixed radius, and relations are translations of the n-balls. The method is limited due to the restricted constructors available in EL++, which in particular has no full concept negation.

In Gutiérrez-Basulto and Schockaert [2018], geometric interpretations are defined as classical interpretations of an ontology with two specific constraints: the domain is a Euclidean space $\mathbb{R}^m$ of some dimension m, and the interpretations of all (n-ary) relations are constrained to convex sets over the n-fold Cartesian product $(\mathbb{R}^m)^n = \mathbb{R}^{nm}$ of $\mathbb{R}^m$. The main difference to classical knowledge graph approaches is the idea of interpreting n-ary relations (classically) over n-fold Cartesian products and (non-classically) requiring them to be convex sets. Their approach covers ontologies expressed as rules in datalog± [Calì et al., 2009], which admits rules with existentials in the head (alias tuple-generating dependencies) and integrity constraints, i.e., rules of the form $\forall \vec{x}\, \neg\psi(\vec{x})$ with a conjunction of atoms $\psi(\vec{x})$. Their main result is that an ontology satisfying a specific constraint (quasi-chainedness) is satisfiable classically iff it is satisfiable by a geometric interpretation in which all relations are interpreted by convex sets.

The use of convex sets for the interpretation of relations can be justified by their importance as, on the one hand, computationally feasible data structures as used in convex optimization [Boyd and Vandenberghe, 2004] and, on the other hand, a linguistically and cognitively justified structure for representing concepts [Gärdenfors, 2000].
Convexity is preserved under many operations; in particular it is preserved under intersection and projection, which are the main operations expressible in the fragment allowed by Gutiérrez-Basulto and Schockaert [2018]. Interestingly, due to the use of integrity constraints, these ontologies implicitly use some aspect of negation, namely that of disjointness. But this is not full negation: implicitly, negation is allowed to occur on the right-hand side of a rule as atomic negation, but not as negation on the left-hand side. In the latter case negation would allow us to express coverage aspects, which amounts to the use of disjunction. This restriction in the language is not surprising, as convexity is preserved neither under set complement nor under set union. In our approach, we try to have the cake (stick to convexity) and eat it too (allow for negation).

Our approach follows that of Gutiérrez-Basulto and Schockaert [2018] in interpreting relations as tuples over a domain. Our approach also presumes finite satisfiability of the ontology, but works dually by constructing concepts on the axes and then placing individuals on these. The reason is that we incorporate an additional structure, namely a scalar product, which in turn induces the negation operator; the negation operator constrains the potential places at which the negations of concepts can be located. This, in particular, prevents adapting the quasi-chainedness property, which was defined for datalog± and not for ALC.

## 3 The Description Logic ALC

We are going to work with the description logic ALC [Baader, 2003]. We assume that there is a DL vocabulary given by a set of constants $N_c$, a set of role names $N_R$, and a set of concept names $N_C$. The ALC concepts (concept descriptions) over $N_C \cup N_R$ are described by the grammar

$$C ::= A \mid \top \mid \bot \mid \neg C \mid C \sqcap C \mid C \sqcup C \mid \forall R.C \mid \exists R.C$$

where $A \in N_C$ is an atomic concept, $R \in N_R$ is a role symbol, and C stands for arbitrary concepts.

A classical ALC interpretation is a pair $(\Delta, (\cdot)^I)$ consisting of a set $\Delta$, called the domain, and an interpretation function $(\cdot)^I$ which maps constants to elements of $\Delta$, concept names to subsets of $\Delta$, and role names to subsets of $\Delta \times \Delta$. The semantics of arbitrary concept descriptions for a given interpretation I is given in Table 1.

| Name | Syntax | Semantics |
| --- | --- | --- |
| bottom | $\bot$ | $\emptyset$ |
| conjunction | $C \sqcap D$ | $C^I \cap D^I$ |
| disjunction | $C \sqcup D$ | $C^I \cup D^I$ |
| negation | $\neg C$ | $\Delta \setminus C^I$ |
| universal quantifier | $\forall R.C$ | $\{x \in \Delta \mid \text{for all } y \in \Delta: \text{if } (x, y) \in R^I \text{ then } y \in C^I\}$ |
| existential quantifier | $\exists R.C$ | $\{x \in \Delta \mid \text{there is } y \in \Delta \text{ with } (x, y) \in R^I \text{ and } y \in C^I\}$ |

Table 1: Syntax and semantics for the DL ALC.

An ontology O is defined as a pair $O = (T, A)$ of a terminological box (tbox) T and an assertional box (abox) A. A tbox consists of general concept inclusion axioms (GCIs) $C \sqsubseteq D$ ("C is subsumed by D") with concept descriptions C, D. An abox consists of a finite set of assertions, i.e., facts of the form $C(a)$ or of the form $R(a, b)$ for $a, b \in N_c$. An interpretation I models a GCI $C \sqsubseteq D$, for short $I \models C \sqsubseteq D$, iff $C^I \subseteq D^I$. It models an abox axiom $C(a)$, for short $I \models C(a)$, iff $a^I \in C^I$, and it models an abox axiom of the form $R(a, b)$ iff $(a^I, b^I) \in R^I$. An interpretation is a model of an ontology $(T, A)$ iff it models all axioms appearing in $T \cup A$. An ontology O entails a (tbox or abox) axiom ax, for short $O \models ax$, iff all models of O are also models of ax.

The quantifier rank of an arbitrary concept C is the maximal nesting depth of quantifiers in it. Formally, for an ALC concept define the quantifier rank by recursion as $qr(A) = 0$, $qr(\neg C) = qr(C)$, $qr(C_1 \sqcap C_2) = qr(C_1 \sqcup C_2) = \max\{qr(C_1), qr(C_2)\}$, and $qr(\forall R.C) = qr(\exists R.C) = qr(C) + 1$.
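To illustrate the recursion, here is a minimal sketch (our own illustration, not from the paper) that computes the quantifier rank of concept descriptions; the nested-tuple encoding of concepts and the function name `qr` are hypothetical choices made only for this example.

```python
# Hypothetical encoding of ALC concept descriptions as nested tuples:
#   atomic concept / top / bottom:  "A", "TOP", "BOT"
#   negation:     ("not", C)
#   conjunction:  ("and", C, D)      disjunction: ("or", C, D)
#   quantifiers:  ("forall", "R", C), ("exists", "R", C)

def qr(concept):
    """Quantifier rank: the maximal nesting depth of quantifiers in a concept."""
    if isinstance(concept, str):              # atomic concept, TOP or BOT
        return 0
    op = concept[0]
    if op == "not":
        return qr(concept[1])
    if op in ("and", "or"):
        return max(qr(concept[1]), qr(concept[2]))
    if op in ("forall", "exists"):
        return qr(concept[2]) + 1
    raise ValueError(f"unknown constructor: {op}")

# qr(forall R.(A and exists R.B)) = 2
print(qr(("forall", "R", ("and", "A", ("exists", "R", "B")))))
```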
Each tbox T generates a Boolean algebra, the so-called Lindenbaum-Tarski algebra, as follows. For concepts C, D let $\equiv$ be the relation defined by $C \equiv D$ iff $T \models C \sqsubseteq D$ and $T \models D \sqsubseteq C$. The relation $\equiv$ is an equivalence relation, inducing for each concept C an equivalence class $[C]$. Define operations $\sqcap, \sqcup, \neg$ on the equivalence classes by setting $[C] \sqcap [D] = [C \sqcap D]$, $[C] \sqcup [D] = [C \sqcup D]$, and $\neg[C] = [\neg C]$; these operations can be shown to fulfil the axioms of a Boolean algebra.

## 4 Al-Cone Models

Our geometric interpretations are based on axis-aligned cones (al-cones), which form a subclass of the class of convex cones. A convex cone X is a set that fulfils the following property: if $v, w \in X$, then also $\lambda v + \mu w \in X$ for all $\lambda, \mu \ge 0$. The polar cone $X^\circ$ of X is defined for Euclidean spaces with a scalar (dot) product $\langle \cdot, \cdot \rangle$ as follows:

$$X^\circ = \{v \in \mathbb{R}^n \mid \forall w \in X: \langle v, w \rangle \le 0\}$$

In the sequel, we use the term cone to refer to convex cones. Important cones are the so-called n-orthants or n-hyperoctants; for dimension $n = 2$ these are the four quadrants. Using the abbreviations $\mathbb{R}^+ = \{x \in \mathbb{R} \mid x \ge 0\}$ and $\mathbb{R}^- = \{x \in \mathbb{R} \mid x \le 0\}$, we can define the $2^n$ n-dimensional hyperoctants $H_{(b_1, \dots, b_n)}$ with $b_i \in \{+, -\}$ by $\mathbb{R}^{b_1} \times \cdots \times \mathbb{R}^{b_n}$. Cones can also be constructed as unions of neighboring hyperoctants: consider, next to $+$ and $-$, also the value u (for union) and define $\mathbb{R}^u = \mathbb{R}$. Then the axis-aligned unions of hyperoctants are the sets $\mathbb{R}^{b_1} \times \cdots \times \mathbb{R}^{b_n}$ with $b_i \in \{+, -, u\}$. By further allowing projections of all the sets mentioned above onto the axes of the space we obtain the al-cones:

$$X \text{ is an al-cone} \;:\Longleftrightarrow\; X = X_1 \times \cdots \times X_n \text{ with } X_i \in \{\mathbb{R}, \mathbb{R}^+, \mathbb{R}^-, \{0\}\}$$

So, in n dimensions there are $4^n$ possible al-cones.

Let us now see how the Boolean operations can be modeled. Aside from set-based intersection, which represents conjunction, special attention is required for negation and disjunction. All al-cones are convex cones. Convex cones are preserved under intersection, polarity, and other operations (such as projection), but not under set union. This applies to al-cones as well; however, by interpreting disjunction via de Morgan's law we can circumvent that problem.

**Proposition 1.** Al-cones X, Y in $\mathbb{R}^n$ are closed as follows:

1. The intersection $X \cap Y$ of X and Y is an al-cone.
2. The polar $X^\circ$ is an al-cone.
3. $(X^\circ)^\circ = X$.
4. $(X^\circ \cap Y^\circ)^\circ$, the convex union of X and Y, is an al-cone.

*Proof sketch.* (1) holds as the set of admissible components $X_i$ in the al-cone definition is closed under intersection. For (2), consider one component $X_i$ and suppose it is $\mathbb{R}^+$ (the other cases follow similarly). Let $v \in X$ be of the form $(0, \dots, 0, x_i, 0, \dots, 0)$ with $x_i > 0$. Since $X^\circ$ contains exactly those vectors u such that $\langle u, v \rangle \le 0$ for all $v \in X$, the i-th component $(X^\circ)_i$ of the polar of X cannot contain positive values, which disqualifies it from being $\mathbb{R}$ or $\mathbb{R}^+$. On the other hand, $(X^\circ)_i$ must contain the entire $\mathbb{R}^-$, because for any non-positive $u_i$ the defining condition holds; hence $(X^\circ)_i = \mathbb{R}^-$. (3) is known to hold for arbitrary closed convex cones [Rockafellar, 1997, pp. 121–122]. (4) immediately follows from (1) and (2).
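As a quick numerical sanity check of the polar-cone definition and of items (2) and (3) of Proposition 1, the following sketch (our own illustration, assuming NumPy is available) tests membership in the polar of a finitely generated cone; it suffices to check the defining inequality against the generators, since every element of the cone is a non-negative combination of them.

```python
import numpy as np

def in_polar(v, generators, tol=1e-9):
    """v lies in the polar cone of X iff <v, g> <= 0 for every generator g of X."""
    return all(np.dot(v, g) <= tol for g in generators)

# X = the al-cone R+ x R+ (first quadrant of R^2), generated by e1 and e2.
X_gen = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

print(in_polar(np.array([-1.0, -2.0]), X_gen))  # True:  the polar is the third quadrant
print(in_polar(np.array([1.0, -2.0]), X_gen))   # False: first coordinate is positive

# Proposition 1(3): the polar R- x R- is generated by -e1 and -e2, and its polar
# contains (2, 3) from X but not (-1, 2), which lies outside X.
Xpolar_gen = [-g for g in X_gen]
print(in_polar(np.array([2.0, 3.0]), Xpolar_gen))   # True
print(in_polar(np.array([-1.0, 2.0]), Xpolar_gen))  # False
```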
Before we further embark on technicalities, let us motivate the specifics of our approach by discussing two questions: first, why should one use the polarity operator as the interpretation of concept negation? Second, why should we restrict ourselves to axis-aligned cones?

### 4.1 Motivation of Polarity-Based Negation

The use of the polarity operation for concept negation is motivated by the idea of providing an operator that always maps a concept to a disjoint concept such that the disjoint concept is maximally so w.r.t. the underlying similarity structure. Here, the similarity structure is given by the usual scalar product of the Euclidean space. Two reasons substantiate our choice.

As for the first reason, one can observe that the idea of considering polarity as a form of negation is related to a general approach of defining negation on the basis of some binary complementarity relation com, as explained, e.g., in Dunn [1996] for propositional logics: relation com holds between two propositions iff they are complementary in the sense that there is no situation in which both propositions are true (but still both may be false). Then the negation of a proposition p can be defined as the disjunction of all those propositions q that are complementary to p. Of course, we are interested here in reading subsets C of the embedding space $\mathbb{R}^n$ as concepts and points of $\mathbb{R}^n$ as objects falling into the extension of C. But, drawing a closer connection to the approach in Dunn [1996], one could equally think of the points of $\mathbb{R}^n$ as possible worlds (or states). Then the set C becomes a proposition which is exactly true at the points contained in it.

Figure 1: Illustration of Farkas' lemma (a) and a counterexample to distributivity for arbitrary cones (b), where $(C \sqcap (A \sqcup B))^I = C^I \neq ((C \sqcap A) \sqcup (C \sqcap B))^I = \bot^I$.

As for the second reason, one can consider the following fact, known as Farkas' lemma (see Figure 1a).

**Lemma 1** (Farkas' lemma). Let C be the convex cone generated by vectors $v_1, \dots, v_n \in \mathbb{R}^n$ (i.e., C is the smallest convex cone containing all $v_i$) and let $w \in \mathbb{R}^n$. Then either $w \in C$ or there is a $z \in \mathbb{R}^n$ such that $\langle z, v_i \rangle \le 0$ for all i and $\langle z, w \rangle > 0$.

The lemma says that if one considers a vector w that is not in the convex cone, i.e., that is in the complement $\mathbb{R}^n \setminus C$ (the usual negation of C), then there is at least one verifier z that is similar to w (namely $\langle z, w \rangle > 0$) and that is contained in the polar cone of C; note that from $\langle z, v_i \rangle \le 0$ for all i it follows that $\langle z, v \rangle \le 0$ for all $v \in C$. In the other direction, if one can find a z in the polar cone that is similar to w, then w is in the complement of C. In conclusion, polarity can be considered to provide a more intuitive model of negation for scalar-product based similarity structures than set complement.

### 4.2 Motivation for Considering Al-Cones

Firstly, the usefulness of al-cone models hinges on whether they are able to represent an interesting class of ontologies. In the remainder of this paper we show that al-cone models are indeed complete in the sense that an ontology is satisfiable classically iff it can be embedded into a geometric model based on al-cones.

Secondly, al-cones are motivated by their ability to link to ML. So far we have motivated the use of convex cones as proper structures to handle negation, yet we emphasize that convex cones are also used as computationally feasible data structures in the area of conic optimization (see, e.g., Boyd and Vandenberghe [2004], Section 4.3 on linear optimization problems and Section 4.4.2 on second-order cone programming) and are therefore attractive for ML applications. Here we only regard the case where one aims to learn a statistical model for data that can be characterized by some ontology that has been specified in a logic beforehand, not the case of investigating the logics induced by the intersection and polarity operators for arbitrary cones. In fact, the resulting logic cannot be guaranteed to be Boolean (and so not full ALC), as it would not fulfil the distributivity property for $\sqcap$ and $\sqcup$. A simple example is given in Figure 1b.
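The failure of distributivity for arbitrary (non-axis-aligned) cones can also be reproduced numerically. The following sketch is our own illustration of the kind of counterexample depicted in Figure 1b, assuming NumPy and SciPy are available; membership in a finitely generated cone is decided by a non-negative least-squares fit over its generators.

```python
import numpy as np
from scipy.optimize import nnls

def in_cone(v, generators, tol=1e-9):
    """v lies in the convex cone spanned by the generators iff it is a
    non-negative combination of them (checked via non-negative least squares)."""
    G = np.column_stack(generators)
    _, residual = nnls(G, v)
    return residual < tol

# Three rays in R^2: the ray C lies strictly between A and B but meets neither.
A = [np.array([2.0, 1.0])]
B = [np.array([1.0, 2.0])]
C = [np.array([1.0, 1.0])]

# Convex union of A and B: the smallest convex cone containing both rays, i.e.
# the cone spanned by the generators of A and B together.
A_join_B = A + B

w = C[0]                             # a non-zero element of C
print(in_cone(w, A_join_B))          # True:  C ⊓ (A ⊔ B) = C, which is not ⊥
print(in_cone(w, A), in_cone(w, B))  # False False: C ⊓ A = C ⊓ B = {0},
                                     # hence (C ⊓ A) ⊔ (C ⊓ B) = ⊥
```

Restricting to al-cones avoids this problem: as shown in Section 5, the al-cone operations form a Boolean algebra, so distributivity is regained.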
Let us sketch a specific incremental learning scenario. We are given an ALC tbox. The training data are points from $\mathbb{R}^n$ with a concept label, or pairs of points with a role label. The dimension n is fixed in advance and represents features. For ease of exposition we assume that the number n equals the number of atoms in the Boolean algebra over the tbox. In this scenario the hypothesis set is chosen as the al-cones in $\mathbb{R}^n$ together with all set systems of cones that result from rotating the al-cones by the same (arbitrarily chosen) angle. On receiving some training datum, the algorithm is required to choose some skeleton of al-cones and possibly rotate it by some angle $\alpha$ in order to fit the datum. We do not specify here how to choose $\alpha$, as there may be various possible rotations that can be chosen following, say, an optimization principle. In the training phase, the algorithm may not be able to embed the data into any rotated skeleton. In this case, the algorithm would stop due to non-resolvable inconsistencies. These can be inconsistencies due to a feature dimension n chosen too small in the beginning, or due to our assumption of considering only rotations of al-cones and not other al-cone-preserving transformations such as warping, or inconsistencies w.r.t. the ontology. As the following propositions show, the inconsistency cannot be due to the fact that concepts are represented as al-cones.

## 5 Embedding for Propositional ALC

Let us start by considering ALC ontologies where the tbox language amounts to propositional logic. That is, we first restrict the definitions from Section 3 to the concept constructors $\sqcap$, $\sqcup$, $\neg$, corresponding to the Boolean operations $\wedge$, $\vee$, $\neg$. A Boolean ALC tbox (abox) consists of GCIs (abox axioms) using Boolean concepts only.

Let $(T, A)$ be a Boolean ALC ontology. We are going to define a geometric interpretation of a special kind that can be a model or an anti-model of a Boolean ontology. It is an ordinary model in the sense that it is a classical predicate-logical structure with a domain and interpretations for the atomic concept symbols. It is special in the sense that the domain is of a specific structure, namely a Euclidean space, and concept names are interpreted by al-cones and their projections to subspaces. It is non-classical in the sense that some logical constructors are not interpreted by the corresponding set operations, but by operations on al-cones. The interesting aspect is now that the logical aspects of the Boolean algebra provide global constraints on the model. These can be satisfied by choosing appropriate positions for the interpretations of all atomic concepts. As a consequence, an ontology is satisfiable classically iff it is satisfied by the geometric interpretation.

**Definition 1.** A Boolean al-cone interpretation I is a structure $(\Delta, (\cdot)^I)$ where $\Delta$ is $\mathbb{R}^n$ for some $n \in \mathbb{N}$, and where $(\cdot)^I$ maps each concept symbol A to some al-cone and each constant a to some element in $\Delta \setminus \{\vec{0}\}$. An al-cone interpretation for arbitrary Boolean ALC concepts is defined recursively as $(\top)^I = \Delta$, $(\bot)^I = \{\vec{0}\}$, $(C \sqcap D)^I = C^I \cap D^I$, $(\neg C)^I = (C^I)^\circ$, and $(C \sqcup D)^I = (\neg(\neg C \sqcap \neg D))^I$. The notions of an al-cone interpretation being a model and that of entailment are defined as in the classical case (but using al-cone interpretations).
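The recursive clauses of Definition 1 can be carried out purely symbolically on the label encoding of al-cones. The following minimal sketch is our own illustration (the label names and function names are not from the paper): an al-cone is a tuple of per-axis labels, intersection and polarity act component-wise, and disjunction is obtained via de Morgan, exactly as in Proposition 1.

```python
# Per-axis components of an al-cone, encoded as labels:
#   "0" = {0},   "+" = R+,   "-" = R-,   "R" = R (the whole axis)
ZERO, POS, NEG, FULL = "0", "+", "-", "R"

def comp_meet(x, y):
    """Intersection of two one-dimensional components."""
    if x == y:    return x
    if x == FULL: return y
    if y == FULL: return x
    return ZERO               # e.g. R+ ∩ R- = {0} and {0} ∩ R+ = {0}

def comp_polar(x):
    """Polar of a one-dimensional component w.r.t. the usual scalar product."""
    return {ZERO: FULL, FULL: ZERO, POS: NEG, NEG: POS}[x]

def meet(C, D):   # (C ⊓ D)^I = C^I ∩ D^I
    return tuple(comp_meet(c, d) for c, d in zip(C, D))

def neg(C):       # (¬C)^I = the polar cone of C^I, computed component-wise
    return tuple(comp_polar(c) for c in C)

def join(C, D):   # (C ⊔ D)^I = (¬(¬C ⊓ ¬D))^I, the convex union
    return neg(meet(neg(C), neg(D)))

# Two quadrants of R^2 as al-cones:
A = (NEG, POS)    # left upper quadrant
B = (POS, POS)    # right upper quadrant
print(neg(A))            # ('+', '-'): the polar cone of A
print(meet(B, neg(A)))   # ('+', '0'): the positive x-axis
print(join(A, B))        # ('R', '+'): the upper half-plane, the convex union
print(neg(neg(A)) == A)  # True, cf. Proposition 1(3)
```

This is exactly the configuration of concepts used in the example around Figure 2 below.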
Let us consider a simple example of a Boolean ontology, consisting of an empty tbox and an abox constructed as follows. Consider the Lindenbaum-Tarski algebra of all Boolean concepts defined over two concept names, say A and B. Then the generated Boolean algebra of classes of equivalent concepts has $2^{2^2} = 16$ elements. Choose for each element a smallest representative $C_i$, $i \in \{1, \dots, 16\}$, w.r.t. some ordering. $C_1$ is the bottom concept, and let us assume that $C_2$ is the concept B and $C_3$ is the concept $B \sqcap \neg A$. Now, for each i with $2 \le i \le 16$ take a constant $a_i$ and add the abox axiom $C_i(a_i)$. Thus we represent each of the 15 non-bottom concepts uniquely by some constant. For example, in Figure 2 the constant $a_2$ represents the concept $C_2 = B$.

Figure 2: Al-cone model for a simple abox with an empty tbox.

This ontology is satisfiable classically. There is an al-cone interpretation in $\mathbb{R}^2$ that fulfils the ontology, constructed as follows: interpret A by the left upper quadrant and B by the right upper quadrant. This uniquely induces the positions of all other al-cones corresponding to the other concepts $C_i$. The interesting point is the localization of the constants $a_i$: for each constant $a_i$ one locates the area corresponding to $C_i$ and positions $a_i$ in it such that it does not coincide with one of the points already associated with a constant. Figure 2 illustrates the positions of the concepts: $\bot$ is by definition the singleton $\{\vec{0}\}$, A is the left upper quadrant, $\top$ is the whole plane, etc. One can check that the concepts are associated with appropriate al-cones. For example, the negation $\neg A$ of A is indeed the polar cone of the quadrant of A. Similarly, consider $B \sqcap \neg A$, which is interpreted as the positive x-axis $\mathbb{R}^+ \times \{0\}$; its polar cone is the whole left half-plane.

As another example, consider the Boolean algebra induced by the atomic concepts A, B under the tbox axiom $A \sqsubseteq B$. Then, e.g., $A \sqcap B$ is equivalent to A, and so forth. This gives an embedding with axis-aligned cones as illustrated in Figure 3.

Figure 3: Al-cone model for the tbox $\{A \sqsubseteq B\}$.

Using the construction idea of the examples one can show that Boolean ALC ontologies are classically satisfiable iff they are satisfiable by a geometric model based on al-cones.

**Proposition 2.** Boolean ALC ontologies are classically satisfiable iff they are satisfiable by a geometric model on some finite $\mathbb{R}^n$ based on al-cones of the form $b_1 \times \cdots \times b_n$ with $b_i \in \{\{0\}, \mathbb{R}^+, \mathbb{R}^-, \mathbb{R}\}$ for $i \in \{1, \dots, n\}$.

*Proof sketch.* ($\Rightarrow$) For the general construction we use the encoding from the examples. The set of encodings $\{u, +, -, 0\}$ used in the example forms a diamond-shaped Boolean algebra $B_4$ with atoms $+$ and $-$. As the ontology O is Boolean and uses finitely many concept names, the induced Boolean algebra is finite, say of cardinality $2^k$, with a finite number m of atoms. We have to choose a dimension n such that $4^n \ge 2^k$ and $n \ge m/2$, the idea being that we have to represent all of the $2^k$ Boolean concepts by embedding the atoms of the Boolean algebra directly onto the half-axes of $\mathbb{R}^n$. Now one can see that $B_4^n$ is the n-wise Cartesian product of $B_4$, where $\sqcap, \sqcup, \neg$ are defined component-wise based on the $\sqcap, \sqcup, \neg$ defined on $B_4$. But in this way $\sqcap$ over $B_4^n$ is nothing else than set intersection, $\neg$ is polarity, and $\sqcup$ is defined by de Morgan. So, if O is satisfiable, then we can find a geometric model of it. ($\Leftarrow$) Given a geometric model, one can build a classical model by moving any named object that is not on an axis onto one of the axes that cover the al-cone in which it lies.
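The proof sketch of Proposition 2 rests on the component structure $B_4$ over the labels $\{0, +, -, u\}$ being a Boolean algebra whose operations are intersection, polarity and the de Morgan join. This can be confirmed by a brute-force check; the sketch below is our own illustration and not part of the paper (we write the top label as `"R"` rather than u).

```python
from itertools import product

ELEMENTS = ["0", "+", "-", "R"]      # the diamond B4: 0 below + and -, both below R

def b4_meet(x, y):                   # intersection of components
    if x == y:   return x
    if x == "R": return y
    if y == "R": return x
    return "0"

def b4_neg(x):                       # polarity of components
    return {"0": "R", "R": "0", "+": "-", "-": "+"}[x]

def b4_join(x, y):                   # join defined by de Morgan
    return b4_neg(b4_meet(b4_neg(x), b4_neg(y)))

# Exhaustively check distributivity and complementation on B4; the component-wise
# products B4^n used in the proof sketch of Prop. 2 then inherit these laws.
for x, y, z in product(ELEMENTS, repeat=3):
    assert b4_meet(x, b4_join(y, z)) == b4_join(b4_meet(x, y), b4_meet(x, z))
    assert b4_join(x, b4_meet(y, z)) == b4_meet(b4_join(x, y), b4_join(x, z))
assert all(b4_meet(x, b4_neg(x)) == "0" and b4_join(x, b4_neg(x)) == "R"
           for x in ELEMENTS)
print("B4 satisfies distributivity and complementation")
```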
Geometric models are more specific than classical models in the sense that they impose an underlying structure on the domain, given by, firstly, the dimensions, representing latent features that usually are not even mentioned in the ontology, and, secondly, a scalar product $\langle \cdot, \cdot \rangle$ over the space. Geometric interpretations are more general than classical interpretations in the sense that they are partial interpretations. Actually, when embedding an ontology into some space one expects the embedding to represent not a single interpretation, but all interpretations that make the ontology true. And this is (up to some exceptions, see below) the case in the construction of Proposition 2. This allows partial information to be encoded. Consider, e.g., the difference between $a_2$ and $a_3$ in the geometric model of Fig. 2. The individual $a_3$ is completely identified w.r.t. the given concepts A, B: it lies in the extension of B and in the extension of $\neg A$. For $a_2$ we only know that it must be a B, but we do not know whether it is also an A. Hence this geometric model correctly reflects the specific knowledge provable in the ontology: one can prove that $a_2$ is a B, but one can prove neither that it is an A nor that it is a $\neg A$.

Still, the geometric model illustrated in Fig. 2 does not reflect the whole knowledge (not) contained in an ontology in the correct way. For example, it is not possible to represent an object b which is known to be in $A \sqcup B$ but which is neither known to be an A nor known to be a B. The reason is that by our construction no place is left to reflect this partial knowledge.

The discussion above motivates the definition of faithful geometric models of a given ontology. As one has to deal with many models of an ontology, there are actually two different adaptations, a weak one and a strong one. Orthogonally to this distinction one may also consider faithfulness w.r.t. the abox only (the data only) or w.r.t. the tbox. An even finer distinction can be given by considering special subclasses of the abox (which we do here by distinguishing concept assertions from role assertions) and of the tbox (which we do here by considering concepts with a bounded quantifier rank).

**Definition 2.** Let O be a classically consistent (DL) ontology (or any other representation allowing the distinction between abox and tbox). For a geometric interpretation I we have the following notions of being a faithful model of O:

- I is a strongly concept-faithful model of O iff for each concept C and each constant b: if $b^I \in C^I$, then $O \models C(b)$;
- I is a weakly concept-faithful model of O iff for each concept C and each constant b: if $b^I \in C^I$, then $O \cup \{C(b)\}$ is satisfiable classically;
- I is a strongly (weakly) abox-faithful model of O iff it is strongly (weakly) concept-faithful and for each role R and constants a, b: if $(a^I, b^I) \in R^I$, then $O \models R(a, b)$ (resp. $O \cup \{R(a, b)\}$ is satisfiable classically);
- I is a strongly (weakly) tbox-faithful model of O iff for all tbox axioms $\tau$: if $I \models \tau$, then $O \models \tau$ (resp. $O \cup \{\tau\}$ is satisfiable).

When we consider classical geometric models such as those defined by Gutiérrez-Basulto and Schockaert [2018], it will in general not be possible to ensure strong faithfulness. So the notion of strong faithfulness makes sense only for models which allow for representing partial information. Now let us come back to our al-cones. We present two ways to gain faithfulness, the first using al-cones, the second using a subclass of al-cones.
**Proposition 3.** For classically satisfiable Boolean ALC ontologies there is a strongly concept-faithful and tbox-faithful geometric model on some finite $\mathbb{R}^{2n}$ based on al-cones of the form $b_1 \times \cdots \times b_{2n}$ with $b_{2i} \in \{\{0\}, \mathbb{R}^+, \mathbb{R}^-, \mathbb{R}\}$ and $b_{2i+1} = b_{2i}$. Here n is the number of atomic elements in the Boolean algebra generated by the tbox of the ontology.

*Proof sketch.* First a geometric model of the given tbox is generated in dimension n (using the method described in Prop. 2). In the resulting model every dimension is doubled; for instance, $A \sqcap B = \mathbb{R}^+ \times \mathbb{R}^- \times \mathbb{R}$ becomes $A \sqcap B = \mathbb{R}^+ \times \mathbb{R}^+ \times \mathbb{R}^- \times \mathbb{R}^- \times \mathbb{R} \times \mathbb{R}$. To ensure strong concept-faithfulness it must not be possible for any object in the abox to get embedded into a concept properly subsumed by its most specific concept (msc). In each concept properly subsumed by the object's msc one can find an entry at which the msc has $\mathbb{R}$ but the subsumed concept only has $\mathbb{R}^+$ or $\mathbb{R}^-$; with doubled dimensions the latter becomes $\mathbb{R}^+ \times \mathbb{R}^+$ or $\mathbb{R}^- \times \mathbb{R}^-$, while the msc has $\mathbb{R} \times \mathbb{R}$ there. Thus, if the object's coordinates at the corresponding pair of dimensions are chosen from, e.g., $\mathbb{R}^+ \times \mathbb{R}^-$, the object lies in its msc but in none of the concepts properly subsumed by it.

When disregarding $\mathbb{R}^+$ and $\mathbb{R}^-$ in al-cones we are left with coordinate subspaces, which are already sufficient to define faithful models.

**Proposition 4.** For classically satisfiable Boolean ALC ontologies there is a strongly concept-faithful and tbox-faithful geometric model on some finite $\mathbb{R}^n$ using sets of the form $b_1 \times \cdots \times b_n$ with $b_i = \{0\}$ or $b_i = \mathbb{R}$.

*Proof sketch.* Assume that an ALC tbox with n atomic elements is given. Arbitrarily assign them to the n coordinate axes, i.e., to the subspaces $\mathbb{R} \times \{0\} \times \cdots \times \{0\}$ through $\{0\} \times \cdots \times \{0\} \times \mathbb{R}$. Then assign the other concepts by applying disjunction component-wise. This result also follows from the well-known fact that orthogonal subspaces make up a Boolean algebra.

## 6 The Non-Boolean Case

Our main task is to define semantics for roles and quantifiers. We choose to define crisp extensions for roles. As ALC does not have role negation, we do not get areas of uncertainty for roles. More importantly, we do not consider special structures for roles (in particular no cones). Following the characterization of ALC according to Schild [1991], our definition of the existential quantifier has to satisfy $\exists R.\bot \equiv \bot$ and $\exists R.(C \sqcup D) \equiv \exists R.C \sqcup \exists R.D$. As we will define the existential quantifier via the universal quantifier, we have to ensure that $\forall R.\top \equiv \top$ and $\forall R.(C \sqcap D) \equiv \forall R.C \sqcap \forall R.D$.

**Definition 3.** An al-cone interpretation I is a Boolean al-cone interpretation $(\Delta, (\cdot)^I)$ that additionally maps each role R to a subset of $\Delta \times \Delta$ such that $\vec{0}$ does not appear in the first or second argument of $R^I$. The al-cone interpretation of ALC concepts is defined recursively as for Boolean concepts, interpreting concepts of the form $\forall R.C$ as the minimal al-cone containing the set $\{x \in \Delta \mid \text{for all } y \in \Delta: \text{if } (x, y) \in R^I \text{ then } y \in C^I\}$. The interpretation of the existential quantifier is given by de Morgan, i.e., $(\exists R.C)^I = (\neg\forall R.\neg C)^I$.

The constructions of the Boolean case do not carry over to non-Boolean ALC ontologies, due to the fact that the algebra of ALC concepts is no longer atomic. Consider the simple example $O_n = \{loves(narcis, narcis), Vain(narcis)\}$ from Baader and Küsters [2006]. Then narcis fulfils all concepts of the form $\exists R^n.Vain$ (for $R = loves$), which, connected conjunctively, give a chain of concepts $C_i$ that become narrower and narrower, defined by $C_0 := \exists R.Vain$ and $C_i := C_{i-1} \sqcap \exists R^{i+1}.Vain$.

We therefore take a middle way between the approach of Gutiérrez-Basulto and Schockaert [2018] and our aim of representing knowledge abox- and tbox-faithfully. The rationale for this is to support ML applications such as prediction.
Given an ML model, prediction is meant to categorize (new) objects w.r.t. the concepts represented in the model. If a geometric model has n dimensions, it can be asked queries of the form $A(a)$ for a concept label A and an n-vector a. More generally, we consider not only atomic concepts A but arbitrary ALC concepts. Of course, in a classical model the extensions of all concepts are determined by those of the atomic ones, but the point of our approach is to work with partial models. So let us assume that the queries are of the form $C(a)$ (assertion check) or of the form $C(x)$ with a variable x (concept retrieval). Now, the class of queries that are to be answered in a prediction scenario is constrained in size, where the size is formally specified by some parameter. We will assume in the following that this parameter is the quantifier rank (see Sect. 3). In order to find all potential inconsistencies in an ontology we also have to incorporate the length of the role paths in the abox. Let m be the minimum of the length of the role paths in the abox and the expected maximal quantifier rank of the queries in a prediction scenario. The set of ALC concepts of quantifier rank at most m still forms a Boolean algebra, and we still have $\forall R.\top \equiv \top$ as well as $\forall R.(C \sqcap D) \equiv \forall R.C \sqcap \forall R.D$. So, within this class we will have atomic elements, say n of them, which will be put on the axes of the finite-dimensional space $\mathbb{R}^n$.

**Proposition 5.** ALC ontologies are classically satisfiable iff they are satisfiable by a (strongly abox-faithful, rank-m-concept-faithful) geometric model on some finite $\mathbb{R}^n$ using sets of the form $b_1 \times \cdots \times b_n$ with $b_i \in \{\{0\}, \mathbb{R}^+, \mathbb{R}^-, \mathbb{R}\}$.

*Proof sketch.* ($\Rightarrow$) Assume that the ontology O is satisfiable classically. Consider the Lindenbaum-Tarski algebra of ALC concepts of quantifier rank at most m induced by the given tbox, and assume the number of its atomic concepts is n. As in the Boolean case, put these concepts onto the axes of a Euclidean space $\mathbb{R}^n$, respecting the Boolean operators. We obtain an al-cone interpretation as follows. Each atomic concept A is interpreted by the al-cone where it is placed according to the Lindenbaum-Tarski algebra embedding. It remains to specify the interpretation of the constants and the roles. For a constant a, find the most specific concept $\mathrm{msc}_{\mathrm{rank}\, m}(a)$ of rank at most m provable in the ontology and place a in some area not already occupied by another individual. In particular, if $C(a)$ follows from the ontology, then $C(a)$ is going to be true in the geometric model. For each role assertion $R(a, b)$ entailed by the ontology put $(a^I, b^I) \in R^I$. ($\Leftarrow$) Assume that O is not satisfiable classically. Then there is a complete closed tableau. This derivation can be mimicked in the geometric model, as all clashes are on the concept level, i.e., they concern only concept assertions of the form $C(a)$ and $\neg C(a)$ but not role assertions.

## 7 Conclusion and Outlook

Starting from an interpretation of negation as a polarity operator, we presented embeddings of ALC ontologies that interpret all concepts as axis-aligned cones. This result adds an interesting alternative to the embeddings considered so far. The resulting embeddings can be used in prediction scenarios where one is interested in predicting whether new samples belong to a concept. An open problem is whether roles can also be interpreted by cones (or other feasible structures).

As a side product of defining negation by polarity we obtain partial models. So there are individuals for which one does not know whether they belong to a concept or not. This is different from the approaches considered in classical embedding scenarios.
A fine-grained treatment of this uncertainty could proceed by considering partial models as in Hartonas [2016] or by providing an epistemic operator [Donini et al., 1998, for example]. Such an operator would allow us to distinguish between instances known to be in a specific concept C, i.e., those in $\mathbf{K}C$, and instances which could be in the concept or in its negation.

## References

[Baader and Küsters, 2006] Franz Baader and Ralf Küsters. Nonstandard inferences in description logics: The story so far. In D. M. Gabbay, S. S. Goncharov, and M. Zakharyaschev, editors, Mathematical Problems from Applied Logic I, volume 4 of International Mathematical Series, pages 1–75. Springer-Verlag, 2006.

[Baader, 2003] Franz Baader. Description logic terminology. In F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, and P. F. Patel-Schneider, editors, The Description Logic Handbook, pages 485–495. Cambridge University Press, 2003.

[Bordes et al., 2013] Antoine Bordes, Nicolas Usunier, Alberto García-Durán, Jason Weston, and Oksana Yakhnenko. Translating embeddings for modeling multi-relational data. In Christopher J. C. Burges, Léon Bottou, Zoubin Ghahramani, and Kilian Q. Weinberger, editors, Advances in Neural Information Processing Systems 26 (NIPS 2013), pages 2787–2795, 2013.

[Boyd and Vandenberghe, 2004] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.

[Calì et al., 2009] Andrea Calì, Georg Gottlob, and Thomas Lukasiewicz. Datalog±: A unified approach to ontologies and integrity constraints. In Proceedings of the 12th International Conference on Database Theory, pages 14–30. ACM Press, 2009.

[Donini et al., 1998] Francesco M. Donini, Maurizio Lenzerini, Daniele Nardi, Werner Nutt, and Andrea Schaerf. An epistemic operator for description logics. Artificial Intelligence, 100(1):225–274, 1998.

[Dunn, 1996] Jon Michael Dunn. Generalized orthonegation. In Heinrich Wansing, editor, Negation – A Notion in Focus, Perspektiven der Analytischen Philosophie / Perspectives in Analytical Philosophy 7, pages 3–26. De Gruyter, 1996.

[Gärdenfors, 2000] Peter Gärdenfors. Conceptual Spaces: The Geometry of Thought. The MIT Press, Cambridge, Massachusetts, 2000.

[Goldberg and Levy, 2014] Yoav Goldberg and Omer Levy. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv e-prints, February 2014.

[Gutiérrez-Basulto and Schockaert, 2018] Víctor Gutiérrez-Basulto and Steven Schockaert. From knowledge graph embedding to ontology embedding? An analysis of the compatibility between vector space representations and rules. In Michael Thielscher, Francesca Toni, and Frank Wolter, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Sixteenth International Conference, KR 2018, pages 379–388. AAAI Press, 2018.

[Hartonas, 2016] Chrysafis Hartonas. Reasoning with incomplete information in generalized Galois logics without distribution: The case of negation and modal operators. In Katalin Bimbó, editor, J. Michael Dunn on Information Based Logics, pages 279–312. Springer International Publishing, Cham, 2016.

[Kulmanov et al., 2019] Maxat Kulmanov, Wang Liu-Wei, Yuan Yan, and Robert Hoehndorf.
EL Embeddings: Geometric construction of models for the description logic EL++. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), 2019.

[Levy and Goldberg, 2014] Omer Levy and Yoav Goldberg. Neural word embedding as implicit matrix factorization. In Zoubin Ghahramani, Max Welling, Corinna Cortes, Neil D. Lawrence, and Kilian Q. Weinberger, editors, Advances in Neural Information Processing Systems 27 (NIPS 2014), pages 2177–2185, 2014.

[Mehran Kazemi and Poole, 2018] Seyed Mehran Kazemi and David Poole. SimplE embedding for link prediction in knowledge graphs. arXiv e-prints, arXiv:1802.04868, February 2018.

[Nickel et al., 2011] Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning (ICML'11), pages 809–816. Omnipress, 2011.

[Pennington et al., 2014] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. In EMNLP, volume 14, pages 1532–1543, 2014.

[Rockafellar, 1997] Ralph Tyrrell Rockafellar. Convex Analysis. Princeton University Press, Princeton, NJ, 1997.

[Schild, 1991] Klaus Schild. A correspondence theory for terminological logics: Preliminary report. In Proceedings of IJCAI-91, pages 466–471, 1991.

[Wang et al., 2017] Quan Wang, Zhendong Mao, Bin Wang, and Li Guo. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering, 29(12):2724–2743, December 2017.