# answering_queries_with_negation_over_existential_rules__c4951122.pdf

Answering Queries with Negation over Existential Rules

Stefan Ellmauthaler, Markus Kr otzsch, Stephan Mennicke

Knowledge-Based Systems Group, TU Dresden, Dresden, Germany stefan.ellmauthaler@tu-dresden.de, markus.kroetzsch@tu-dresden.de, stephan.mennicke@tu-dresden.de

Ontology-based query answering with existential rules is well understood and implemented for positive queries, in particular conjunctive queries. For queries with negation, however, there is no agreed-upon semantics or standard implementation. This problem is unknown for simpler rule languages, such as Datalog, where it is intuitive and practical to evaluate negative queries over the least model. This fails for existential rules, which instead of a single least model have multiple universal models that may not lead to the same results for negative queries. We therefore propose universal core models as a basis for a meaningful (non-monotonic) semantics for queries with negation. Since cores are hard to compute, we identify syntactic conditions (on rules and queries) under which our core-based semantics can equivalently be obtained for other universal models, such as those produced by practical chase algorithms. Finally, we use our ﬁndings to propose a semantics for a broad class of existential rules with negation.

Introduction Existential rules are a prominent approach in knowledge representation, due to theoretical and practical advances in ontology-based query answering (Baget et al. 2009; Cal ı, Gottlob, and Lukasiewicz 2009), but also because of applications in many other domains, such as data exchange and integration (Fagin et al. 2005). Answering conjunctive queries (CQs) over sets of existential rules is often the main goal, where a CQ is entailed if it is satisﬁed by all models of the given rules. This is often implemented using universal models, which is a class of models each representing all positive query answers, such that one such model sufﬁces to answer CQs (Deutsch, Nash, and Remmel 2008). However, for queries that ask for the absence of facts (i.e., queries incorporating negated atoms), universal models cannot be used because different universal models may yield different query answers. This is a major problem, since negation is an important feature in real-world queries, and a semantic prerequisite for supporting negation in rule bodies, which can be viewed as a recursive generalisation of negative queries. The severity of this limitation is further aggravated by the fact that no such problems exist for Datalog, where negative queries can safely be evaluated over

Copyright 2022, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

Core Models Restricted Chases Chases Universal Models

(Effectively) Core-Safe BNCQ

Thm. 3 (Thm. 4)

Affection-Safe BNCQ

Figure 1: Summary of Results

the unique least model. This can be further generalised by considering stratiﬁed negation in rule bodies, which yields an intuitive and implementable non-monotonic semantics (Abiteboul, Hull, and Vianu 1994). But even this basic form of negation is not safe to use with existential rules. What makes this problem challenging is that we are looking for a non-monotonic form of negation, where the absence of positive evidence is sufﬁcient to entail negative information. Indeed, under the classical ﬁrst-order semantics of negation, no negated queries are ever entailed by a set of existential rules. However, it is not immediate when a non-monotonic semantics should allow for additional, nonclassical consequences, and several distinct approaches have been proposed for existential rules (Magka, Kr otzsch, and Horrocks 2013; Baget et al. 2014; Gottlob et al. 2014; Alviano, Morak, and Pieris 2017; Kr otzsch 2020). Our goal for this paper is to provide a semantics that agrees with the widely used and generally accepted semantics of Datalog with stratiﬁed negation for rules without existentials, but which also respects the classical reading of existential quantiﬁers as mere statements of existence of certain elements without any commitment to their identity (in contrast to logic programming approaches that use function terms for referring to distinguished objects). We therefore propose to evaluate negations with respect to universal models that are cores, an algebraic property that, intuitively speaking, ensures that they contain no redundant structures. Many good practical and theoretical results have been obtained for core models (Fagin, Kolaitis, and Popa 2005; Deutsch, Nash, and Remmel 2008; Carral et al. 2018; Kr otzsch 2020), but computing cores from arbitrary structures is expensive (Hell and Neˇsetˇril 1992). We therefore ask if our core-based semantics for negation can also be computed using other kinds of universal models, which are easier to compute in practice. It turns out

The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

that this is possible if we restrict the shape of the queries that we want to answer, and we deﬁne several classes of safe queries that can be evaluated over broader classes of models. Our results are summarised in Figure 1, which illustrates the inverse relationship between the generality of the query language and the speciﬁcity of the models that one can use to compute them. The (restricted) chase refers to a widely implemented type of reasoning algorithm, such that the respective query classes could be evaluated in practice (Nenov et al. 2015; Benedikt et al. 2017; Bellomarini, Sallinger, and Gottlob 2018; Carral et al. 2019). The symbol expresses that core models can be embedded into every restricted chase, although no such chase is necessarily equal to the core (hence no ). For any of the given query fragments, we provide concrete syntactic deﬁnitions that can be decided in practice. Our ﬁnal contribution is the extension of our approach to existential rules with negation in their bodies. This is technically more challenging than mere query answering, and even the eager computation of cores during reasoning cannot guarantee a unique semantics (Kr otzsch 2020). We focus on cases where rules can be stratiﬁed in a certain sense, but our conditions signiﬁcantly generalise classical stratiﬁcation (Abiteboul, Hull, and Vianu 1994) and the recent full stratiﬁcation for existential rules (Kr otzsch 2020). Nevertheless, we can still ﬁnd a unique perfect model that can be used in query answering, making our approach a valid generalisation of the perfect core model semantics (Kr otzsch 2020). All proofs are included in the technical report of the paper (Ellmauthaler, Kr otzsch, and Mennicke 2021).

Preliminaries We consider a ﬁrst-order signature with disjoint sets of constants C, (labelled) nulls N, variables V, and predicates P. A term t is an element of C N V. Lists of terms are denoted t = t1, t2, . . . , tn with n = |t|, and treated as sets when order is irrelevant. Each predicate p P has an arity ar(p) N. An atom is an expression p(t) with p P and ar(p) = |t|. An atom p(t) is ground if t C. An interpretation I is a set of atoms without variables. A database D is a ﬁnite set of ground atoms.

Rules and Queries An (existential) rule r is a formula

r = x, y. ϕ[x, y] z. ψ[y, z], (1)

where ϕ and ψ are conjunctions of atoms using only terms from C or from the mutually disjoint lists of variables x, y, z V. We call ϕ the body (denoted body(r)), ψ the head (denoted head(r)), and y the frontier of r. For ease of notation we may treat conjunctions of atoms as sets, and we omit universal quantiﬁers in rules. We require that all variables in y actually occur in ϕ (safety).1 A rule is Datalog if it has no existential quantiﬁers. A normal Boolean conjunctive query (BNCQ) is a formula q = x.ϕ ψ, where ϕ is a conjunction of atoms with variables from x, and ψ is a conjunction of negated atoms not p(t) using only variables that occur in ϕ (safety). We

1This requirement can be relaxed, but it simpliﬁes presentation.

write q+ (q ) for the set of all non-negated atoms in ϕ (ψ). If q = , then q is a boolean conjunctive query (BCQ).

Models and Entailment Given a set of atoms A and an interpretation I, a homomorphism h : A I is a function that maps the terms occurring in A to (the variable-free) terms occurring in I, such that: (i) for all c C, h(c) = c; (ii) for all p P, p(t) A only if p(h(t)) I, where h(t) is the list of h-images of the terms t. If h satisﬁes (ii) with only if strengthened to iff , h is a strong homomorphism. An embedding is an injective strong homomorphism, and an isomorphism is a bijective strong homomorphism (i.e., surjective embedding). We apply homomorphisms to a formula by applying them individually to all of its atoms. A match of a rule r in an interpretation I is a homomorphism h : body(r) I. A match h of r in I is satisﬁed if there is a homomorphism h : head(r) I, such that h h . We call h an extension of h. Rule r is satisﬁed by I, written I |= r, if every match of r in I is satisﬁed. A set of rules Σ is satisﬁed by I, written I |= Σ, if I |= r for all r Σ. We call I a model of Σ and database D if I |= Σ and D I. A BNCQ q is satisﬁed by I, written I |= q, if there is a homomorphism h : q+ I with h(q ) I = . Note, a BNCQ q with q+ q = can never be satisﬁed. Herein we consider only BNCQs q that are non-trivial in this sense.

Universal Models and the Chase A model U of D and Σ is universal if there is a homomorphism U M for every model M of D and Σ. Universal models can be computed with the chase (Deutsch, Nash, and Remmel 2008). In this paper, we will mainly deal with the restricted chase (also known as standard chase). Deﬁnition 1. Let D be a database and Σ a set of rules. A sequence D0, D1, . . . is called a (restricted) chase sequence of D and Σ iff 1. D0 = D, 2. for every Di+1 (i 0) there is a rule r Σ of the form (1) and a match h for r in Di with (a) h is an unsatisﬁed match for r in Di and (b) there is an extension h : head(r) Di+1 of h, such that h (z) is a fresh null for each z z, and 3. if h is a match for some rule r Σ in Di (i 0), then h is satisﬁed in some Dj with j i (fairness). r is applicable in Di if it has an unsatisﬁed match in Di. Chase sequence D0, D1, . . . , Dk is called terminating if Dk is a model of Σ and D. We call D := S i 0 Di a chase of Σ and D. Other chase variants, like the Skolem/semi-oblivious or the oblivious chase, mainly differ in the respective deﬁnitions of rule applicability (2. (a) in Deﬁnition 1).

Inﬂuence of Existential Quantiﬁers For some of the upcoming notions, it is useful to have rule sets Σ where each rule makes use of a distinct set of variables from all the other rules. We say that Σ is renamed-apart. Every rule set can be rewritten to an equivalent renamed-apart rule set. The positions in a predicate p P are the pairs p, 1 , . . . , p, ar(p) , and we will refer to terms at a certain position in an atom, set of atoms, or other formula.

Var(X) (Var (X)/Var (X)) denotes the set of all (existentially quantiﬁed/universally quantiﬁed) variables in a rule or rule set X. We reproduce the following deﬁnition from Kr otzsch and Rudolph (2011) to obtain a structure allowing for syntactic reasoning of the inﬂuence of existential quantiﬁers within the chase.

Deﬁnition 2. Let Σ be a renamed-apart rule set. For x Var(Σ), let ΠB x (ΠH x ) be the sets of all positions at which x occurs in the body (head) of a (necessarily unique) rule in Σ. If x Var (Σ), then Ωx is the smallest set of positions such that (1) ΠH x Ωx and (2) for all y Var (Σ), ΠB y Ωx implies ΠH y Ωx. The set of jointly affected positions is S x Var (Σ) Ωx. For x, y Var (Σ), we write x y if the (unique) rule of y has a frontier variable z with ΠB z Ωx.

The relation x y states that nulls created for x potentially enable rule applications that create nulls for y. Of particular interest is the transitive and reﬂexive closure . Starting from a set of existentially quantiﬁed variables, we seek all positions at which a null might occur that is directly or indirectly inﬂuenced by a variable in this set.

Deﬁnition 3. Let Σ be a renamed-apart rule set and let V Var (Σ). A position π is V -inﬂuenced if there are x V and y with x y such that π Ωy.

Note that jointly affected is the same as Var (Σ)- inﬂuenced. Virtually any chase variant (restricted, Skolem, and oblivious) is compatible with jointly affected positions, meaning that nulls in chases D only occur in Var (Σ)- inﬂuenced positions.

Proposition 1. For renamed-apart rule set Σ and database D with (Skolem/oblivious/restricted) chase D , if p(t1, . . . , tar(p)) D and ti N (1 i ar(p)), then p, i is Var (Σ)-inﬂuenced.

The proof uses an over-approximation of the chase procedure (Ellmauthaler, Kr otzsch, and Mennicke 2021).

Answering BNCQs on Core Models A BCQ is entailed by rule set Σ and database D if it is satisﬁed by all models of Σ and D. Moreover, every universal model of Σ and D satisﬁes exactly the BCQs that are entailed in this sense. Neither condition is appropriate to deﬁne entailment of BNCQs that may use negation. On the one hand, no such query is satisﬁed in all models since there is always a model where every BCQ is true. On the other hand, different universal models do not satisfy the same BNCQs.

Example 1. Take for instance the database D = {a(1, 2), b(2, 2)} together with an empty rule set. Of course, D is a universal model, but so is U := D {a(1, n)} (where n N). As mentioned above, all BCQs evaluate with the same result on D or U. However, for BNCQ q = x, y. a(x, y) b(y, y) we get U |= q while D |= q.

Models that are cores have been suggested as the appropriate model for deﬁning the semantics of queries that may depend on negative information (Deutsch, Nash, and Remmel 2008; Baget et al. 2014). This suggestion is substantiated by the fact that cores are unique up to isomorphism and

that isomorphic structures cannot be distinguished by ﬁrstorder (FO) queries (see, e.g., the Isomorphism Lemma by Ebbinghaus, Flum, and Thomas (1994)). Indeed, if a universal model is a core, then it satisﬁes fewer BNCQs than any other model (see also Theorem 1 below), resulting in the most cautious notion of non-monotonic entailment.

Deﬁnition 4. A ﬁnite interpretation I is a core if every homomorphism h : I I is an isomorphism. A core model of a rule set Σ and a database D is a ﬁnite universal model of Σ and D that is a core.

If Σ and D have any ﬁnite universal model, then they admit a ﬁnite core, which is unique up to isomorphism (Deutsch, Nash, and Remmel 2008). The situation is more complicated on inﬁnite structures, which admit several non-equivalent deﬁnitions of core (Bauslaugh 1995) and where core models might fail to exist altogether (Carral et al. 2018). We therefore focus on cases with ﬁnite models M and their unique cores, which we denote core(M). Many conditions have been studied to recognise cases where ﬁnite universal models are guaranteed to exist (Cuenca Grau et al. 2013). In the rest of the paper we study the notion of core entailments for BNCQs, being those entailments we obtain when evaluating the query on the core model.

Deﬁnition 5. A rule set Σ and database D core-entail a BNCQ q, written Σ, D |=c q, if Σ and D have a core model C satisfying q.

Important Assumption Throughout this paper, we only consider pairs of rule sets Σ and databases D that have a (ﬁnite) core model. Deﬁnition 5 does not apply to other cases.

Reconsidering Example 1, D is not just any universal model, but the core model. Hence, Σ, D |=c q for the query q in the example, as one might intuitively expect. As shown in Example 1, such non-entailments are not preserved in arbitrary (universal) models, but it turns out that entailments are. This is a consequence of the fact that core models are embedded in all universal models in the following sense.

Lemma 1. Let Σ be a rule set and D a database with core model C and arbitrary universal model U. (1) If h : C U is a homomorphism, it is an embedding. We call h(C) the core instance of U and denote it by Uc. (2) Every homomorphism h : Uc Uc is an isomorphism.

Proof. On (1), since U is a universal model and C a (core) model of Σ and D, there is a homomorphism h : U C. Note, h h : C C is an isomorphism as C is a core. We show that h is an embedding, i.e., h is injective and strong. For injectivity, let u and t be distinct terms occurring in C. If h(u) = h(t), it follows that h h(u) = h h(t), which contradicts the assumption that h h is an isomorphism. For showing that h is strong, let u be a list of terms in C and p(h(u)) U. We need to show that p(u) C. As h h is an isomorphism on C, its inverse is also an isomorphism. Hence, p(h h(u)) C and p((h h) 1 (h h)(u)) C allow for the conclusion p(u) C. Towards (2), there is an isomorphism i : C Uc by the argumentation above. Let h : Uc Uc be a homomorphism.

Deﬁne j := i 1 (h i) (i.e., j : C C). Since C is a core, j is an isomorphism. Hence, h must be an isomorphism.

Based on the core instance, we can now show that core entailments of BNCQs carry over to all universal models.

Theorem 1. For every rule set Σ, database D, and BNCQ q, if Σ, D |=c q, then U |= q for all universal models U of Σ and D.

Proof. Let C be a core model of Σ and D. The core instance Uc U preserves all FO-queries, including BNCQs. We conclude that for all BNCQ q, C |= q implies U |= q.

In fact, up to minor variations, the core model is the only model with this property: the only other models for which BNCQ satisfaction is preserved in all universal models are those that can be covered by embeddings from the core model (i.e., all facts occur in the image of some embedding from the core). However, it turns out that some BNCQs allow us to consider broader classes of models for checking core entailment, as we will see in the following sections.

Affection-Safe BNCQs In spite of its appealing semantics, core entailment has the practical disadvantage that core models are difﬁcult to compute. This difﬁculty seems unavoidable if we want minimality in the sense of Theorem 1. However, this does not mean that core entailment of BNCQs does always require us to compute the core model. For example, BNCQs without negation can equivalently be answered in any universal model. We show that similar results can also be obtained for more interesting classes of BNCQs, and how these classes can be effectively recognised. This requires us to look at more speciﬁc classes of models. Indeed, we can readily see that entailment over all universal models does often not coincide with core entailment since core-non-entailments are not preserved across universal models: every BNCQ q whose positive part is entailed is also true in some universal model, unless its negative part is a logical consequence (in this case, the non-entailment of q is a ﬁrst-order logical consequence). This can be shown by adding redundant structures where q is satisﬁed, as in Example 1, but taking arbitrary rule sets into account.

Proposition 2. Consider a rule set Σ and database D. For every BNCQ q such that (a) Σ, D |= q+ and (b) Σ, D |= q+ a(t) for all a(t) q , there is a universal model U of Σ and D with U |= q.

Proof. Deﬁne D+ := D {p(ν(u)) | p(u) q+} with ν(t) = t for all t C and ν(t) a fresh null for all t V. Now let U be a chase over Σ and D+ (e.g., some restricted chase). By construction, U is a model of Σ and D, and ν is a match ν : q+ U. By safety, ν is deﬁned for Var(q ). Now for any α = a(t) q , let Uα be a universal model of Σ and D such that Uα |= q+ α, which exists due to (b). Due to (a), there is a match ν : q+ Uα such that ν (α) / Uα (using (b)). Since, moreover, D Uα, we ﬁnd that there is a homomorphism h : D+ ν (q+) D. By soundness of the chase, h extends to a homomorphism

h : U Uα. Since Uα is universal, this shows that U is universal. By construction, h (ν(α)) = ν (α) / Uα, and hence ν(α) / U. Applying the same reasoning to all α q , we ﬁnd that ν shows U |= q.

Hence, only limited classes of BNCQs can be evaluated over arbitrary universal models. This, however, is due to some universal models containing unmotivated clutter that users might intuitively not expect. In particular, the universal models that are computed by chase procedures are not of this form, and can only produce nulls in jointly affected positions (Proposition 1). Therefore, if all variables in q are bound to some position in q+ that is not jointly affected, then these variables can only match constants. Deﬁnition 6. Let Σ be a rule set. A BNCQ q is affectionsafe w.r.t. Σ if every variable in q occurs at a position in q+ that is not jointly affected. Since all common (e.g., restricted, Skolem, oblivious) chases are compatible with joint affection, we can use their outputs to answer affection-safe BNCQs. Theorem 2. For every rule set Σ, database D, and affection-safe BNCQ q, we have Σ, D |=c q iff M |= q for some2 chase M computed for Σ and D by a chase procedure compatible with jointly affected positions (Proposition 1).

Proof. The only if direction follows from Theorem 1. For the if direction, let M be as in the claim. By Proposition 1, terms t at positions a, i in M that are not jointly affected are constants. Let C be the core model of Σ and D. By universality, there are homomorphisms h1 : M C and h2 : C M. Suppose M |= q by a match h : q+ M with h(q ) M = , but Σ, D |=c q. Since h1 h(q+) C, this means h1 h(q ) = . Hence there is p(t) q with p(h1 h(t)) C \M, and therefore p(h2 h1 h(t)) M. As h(t) contains only constants, this implies p(h(t)) M, contradicting our assumption that M |= q.

Core-Safe BNCQs Affection safety ensures that chase-based models only entail negative BNCQ atoms if they match null-free facts, which always agree across all universal models. We now relax this requirement based on the recent notion of restraints (Kr otzsch 2020), which allow us to identify a larger set of safe positions to bind to. The next example, adopted from Alviano, Morak, and Pieris (2017), illustrates the problem: Example 2. Take as database D = {p(A), f(B, A)} (mnemonics: parent, father-of, equal) and rules

f(x, y) e(x, x) (2) p(x) y. f(y, x) e(y, y) (3)

We can obtain two restricted chases on Σ = {(2), (3)} and D: U1 = D {e(B, B)} and U2 = D {f(n, A), e(n, n), e(B, B)}. For U1, we only apply (2) for match h1 = {x 7 B, y 7 A}; then (3) is satisﬁed. For U2, we apply (3) for match h2 = {x 7 A}, and then rule (2) for

2The restricted chase, e.g., can yield distinct chase results depending on rule application order. Any of them can be used here.

match h1. The BNCQ q = x1, x2, y. f(x1, y) f(x2, y) e(x1, x2) is such that U1 |= q and U2 |= q. Since U1 is the core model of Σ and D, we obtain Σ, D |=c q.

Arguably, the construction of U2 should not have applied (3) with the extended match h 2 = {x 7 A, y 7 n}, because U2 eventually does not need n to satisfy (3). Indeed, h+ 2 = {x 7 A, y 7 B} would be another way to extend h2 to satisfy (3). Functions that remap heads of prior rule applications to alternative structures in a chase are called alternative matches, and they can be shown to occur whenever a restricted chase fails to produce a core (Kr otzsch 2020).

Deﬁnition 7. Let Ia Ib be interpretations such that Ia is obtained from applying rule r for extended match h . A homomorphism h : h (head(r)) Ib is an alternative match of h if

h (t) = t for all terms t in h (body(r)), and there is a null n in h (head(r)) that does not occur in h (h (head(r))).

The occurrence of alternative matches in a restricted chase is associated with structures that do not occur in the core (Kr otzsch 2020). Such structures could lead to additional BNCQ matches. Since alternative matches involve nulls, affection safety can mitigate this by forcing variables in negative atoms to match constants. However, it is more general and still safe if we merely restrict to elements that are not directly or indirectly related to potential alternative matches. The existence of alternative matches in a real chase is undecidable, but we can (over)approximate such matches by considering chase-like interactions of pairs of rules. Based on this approach of deﬁning restraints between rules (Kr otzsch 2020), we can identify existential variables that are at risk of producing redundant structures:

Deﬁnition 8. A rule r1 restrains rule r2, written r1 r2, if there are interpretations Ia Ib and a function h2 where

1. Ib is obtained by applying r1 for match h1, 2. Ia is obtained by applying r2 for match h2, 3. h2 has an alternative match h : h 2(head(r2)) Ib, and 4. h2 has no alternative match h 2(head(r2)) Ib \ h 1(head(r1)).

In this situation, a variable x Var (r2) is restrained if h 2(x) does not occur in h (h 2(head(r2))). We write RΣ for the set of all restrained variables of a rule set Σ.

Variables in RΣ may produce nulls that are not represented in the core model. Moreover, such nulls may be involved in further rule applications that derive additional structures that deviate from the core. To ﬁnd positions of elements that are certain to agree with the core, we therefore consider all positions that are inﬂuenced by restrained variables in the sense of Deﬁnition 3.

Deﬁnition 9. Let Σ be a rule set. A position π is core-safe (w.r.t. Σ) if it is not RΣ-inﬂuenced. A BNCQ q is core-safe w.r.t. Σ if every variable in q occurs at a core-safe position in q+.

Reconsidering query q of Example 2, we ﬁnd that q is not core-safe w.r.t. the given rule set. Positions

e, 1 , e, 2 , f, 1 are not core-safe and variables x1 and x2 both occur at f, 1 in q+. This explains why this example admits restricted chase sequences on which the entailment of q disagrees with the core model. Indeed, for core-safe BNCQs, this problem can never occur:

Theorem 3. For every rule set Σ, database D, core-safe BNCQ q, and restricted chase M of Σ and D, we ﬁnd that Σ, D |=c q iff M |= q.

The only if direction is again clear from Theorem 1. The proof for the if direction is given below. The key insight is that terms at core-safe positions in a chase M must belong to the core instance Mc of M (cf. Lemma 1). This is a similar situation as for affection-safety, where we showed that variables at affection-safe positions must be instantiated with constants, which therefore occur in the core.

Lemma 2. For a restricted chase M of rule set Σ and database D (with ﬁnite core model), and core instance Mc of M, if a term t occurs at a core-safe position in M, then t occurs in Mc.

Proof Sketch. Let C be the core model of Σ and D, h1 : M C a homomorphism (existence by M being universal), and h2 : C Mc the isomorphism for which h 1 2 agrees with h1, i.e., h 1 2 (t) = u if h1(t) = u. The existence of an isomorphism i is ensured by Lemma 1. Furthermore, consider the restriction h 1 : Mc C of h1 to Mc. By Lemma 1 (2), i h 1 : Mc Mc is an isomorphism. Set h2 := (h 1) 1, which is an isomorphism since (i 1 i h 1) = h 1 is an isomorphism. We subsequently analyse the homomorphism h := h2 h1. With respect to the choice of h2, we show that if t occurs at a core-safe position in M, then h (t) = t. The rest of the proof is an induction on the chase sequence of M and is included in our technical report (Ellmauthaler, Kr otzsch, and Mennicke 2021).

Proof of Theorem 3. For the remaining if direction, suppose M |= q. Then there is a homomorphism h : q+ M with h(q ) M = . For core model C of Σ and D and core instance Mc of M, let h1 : M C / h2 : C Mc be the respective homomorphism/isomorphism (by Lemma 1). Let r(t) q . We show that r(h(t)) / M implies r(h1(h(t))) / C. Every variable in t occurs in at least one core-safe position in q+ (by core-safety of q). Thus, every term in u = h(t) occurs in Mc by Lemma 2. Since r(u) / M (and r(u) / Mc), r(h1(u)) / C because, otherwise, r(h2(h1(u))) Mc, which contradicts r(u) / Mc (therefore, / M) because h2 h1 is an isomorphism on Mc (by Lemma 1 (2)).

Core-safe BNCQs therefore are a signiﬁcant generalisation of affection-safe BNCQs, at the cost of requiring the use of the restricted chase or any other correct chase procedure that ensures that its results are subsets of some restricted chase. In practice, this includes chase implementations that use speciﬁc strategies to decide the order in which rules are applied, e.g., by prioritising Datalog rules (which never introduce unnecessary nulls) (Urbani et al. 2018). Since Deﬁnition 8 identiﬁes restrained variables under the assumption

that rule r1 might be applied before rule r2, the fact that some rule application orders are generally impossible (or simply did not happen in a speciﬁc run) allows us to consider more variables effectively core-safe.

Deﬁnition 10. Let Σ be a rule set, D a database, and S = D0, D1, D2, . . . a restricted chase sequence with

Di 1 ri,hi Di for i 1, ri Σ and (unsatisﬁed) match hi. A variable x RΣ is effectively restrained in S if there are j < k, such that x occurs in rule rj and rk rj. The set of all effectively restrained variables in S is denoted RS. A position a, i is effectively core-safe w.r.t. S if it is not RS-inﬂuenced. A BNCQ q is effectively core-safe w.r.t. S if every variable x in q occurs at an effectively core-safe position in q+.

Effective core safety marks even more positions as safe for BNCQs to query for.

Theorem 4. For every rule set Σ, database D, restricted chase sequence S = D0, D1, D2, . . . over Σ and D with chase M = S i 0 Di, and BNCQ q that is effectively coresafe for S, it holds that Σ, D |=c q iff M |= q.

Proof sketch. Similar argumentation as for the proof of Theorem 3. In particular, terms at effectively core-safe positions do occur in the core instance Mc of M (cf. Lemma 2).

Our results put BNCQ answering under core entailment semantics into reach for practical implementations. Indeed, chase procedures, including the restricted chase, are supported by efﬁcient implementations, and the computation of restraints is of the same complexity as the application of a single rule, namely ΣP 2-complete (Kr otzsch 2020). This may at ﬁrst seem harder than computing the core, which is known to be DP-complete (Fagin, Kolaitis, and Popa 2005), but the crucial difference is that the worst-case complexity of core computation refers to the size of the whole chase (often millions of facts), whereas for rule applications and restraint checking, it is about the size of a single rule (typically dozens of facts). This may explain why no general implementation of the core chase is available today.

Rules with Negation We turn our attention to normal rules, which admit negated atoms in their bodies and can naturally be viewed as a recursive generalisation of BNCQ answering. Using our previous insights, we ﬁrst deﬁne a chase-based semantics for such rules in cases where rule bodies are core-safe. We then generalise this by stratifying rule sets in a way that is compatible with restraints, generalising the notion of full stratiﬁcation for normal existential rules (Kr otzsch 2020). This yields a unique and well-deﬁned semantics that we call perfect core semantics. A normal existential rule is an expression r = x, y. ϕ[x, y] χ[x, y] z. ψ[y, z], such that x, y. ϕ[x, y] χ[x, y] is a BNCQ, and ψ[y, z] is a conjunction of atoms. We use body+(r) and body (r) for the sets of all atoms in ϕ and χ. A match of r in an interpretation I is a homomorphism h : body+(r) I with

h(body (r)) I = . Other notions are as deﬁned for rules without negation. The restricted chase procedure of Deﬁnition 1 can then directly be applied to normal rules. This does not in general lead to a sound reasoning algorithm, since a rule might be applicable due to the absence of a negated atom that is inferred later on in the chase. Chase sequences where this does not happen have been called generating and can be used to deﬁne a kind of stable model semantics for normal existential rules (Baget et al. 2014). A well-known approach to obtain generating chase sequences is stratiﬁcation, where rules are partitioned into a sequence of sets ( strata ) such that rules in higher strata cannot derive facts that occur negatively in rules of lower strata. For (normal) existential rules, stratiﬁcations have been deﬁned using three relations between rules: positive reliances r1 + r2 express that r1 might derive facts that allow r2 to be applied, negative reliances r1 r2 express that r1 might derive facts that prevent a possible application of r2 since they occur in body (r2), and restraints r1 r2 are as in Deﬁnition 8 with the additional condition that h1 and h2 are also matches (for normal rules) with respect to Ib. Full formal deﬁnitions of these notions are given in our report (Ellmauthaler, Kr otzsch, and Mennicke 2021).

Example 3. Consider the rules (mnemonics: parent, fatherof, male, child-of, adult, older-than-3, tired)

p(x) v. f(x, v) m(v) (4) f(x, y) m(y) c(y, x) (5) f(x, y) not a(x) not o(x) t(y) (6) a(x) o(x) (7)

We have (5) (4), (4) + (5), and (4) + (6). There are no other relations, in particular (7) (6) since (7) is only applicable in cases where (6) is not applicable anyway.

For a set of normal rules Σ, let RΣ denote the set of restrained variables as of Deﬁnition 8, modiﬁed for normal rules as mentioned above. The set of RΣ-inﬂuenced and core-safe positions is deﬁned as before, ignoring negated atoms in rules. Then a rule r Σ is core-safe if every variable in body (r) occurs on a core-safe position in body+(r). A ﬁrst simple observation highlights a case where the restricted chase can safely be applied to normal rules:

Proposition 3. Let Σ be a set of normal rules such that (1) all rules in Σ are core-safe and (2) there is no negative reliance r1 r2 between any rules r1, r2 Σ. Then every restricted chase sequence over Σ and any database D is generating and yields a model of Σ and D. Moreover, all of these models are homomorphically equivalent.

The full proof is included in our technical report (Ellmauthaler, Kr otzsch, and Mennicke 2021).

Example 4. Let Σ denote the set of rules in Example 3. We ﬁnd that RΣ = {v} because of (5) (4), such that RΣinﬂuenced positions are f, 2 , m, 1 , c, 1 , t, 1 . Hence, (6) is core-safe, and Proposition 3 applies. For a database D = {p(A), f(A, B)}, we can obtain the restricted chases U1 = D {m(B), c(B, A), t(B)}

(by applying (5) before (4)) and U2 = U1 {f(A, n), m(n), c(n, A), t(n)} (by applying (4) before (5)). Though distinct, they are homomorphically equivalent and U1 is their unique core. Example 4 also illustrates a case that is covered by Proposition 3 but is not in scope of the previously deﬁned full stratiﬁcation, which we recall and adapt next. As opposed to traditional stratiﬁcations, the deﬁnition of Kr otzsch (2020) effectively allows some rules (esp. those that are not the target of any or ) to appear in multiple strata. Deﬁnition 11. For a set Σ of normal rules, a list S = Σ1, . . . , Σn with Σ = Sn i=1 Σn is a quasi stratiﬁcation if, for all rules r1 Σi and r2 Σj, 1. if r1 + r2 then i j, 2. if r1 r2 then i < j, 3. if r1 r2 then i j. A quasi stratiﬁcation S is a full stratiﬁcation if i < j holds in case (3); it is a core-safe stratiﬁcation if all rules in Σk are core-safe for Σk for all k {1, . . . , n}. Example 5. The rules of Example 3 do not admit a full stratiﬁcation due to the cycle (5) (4) + (5), but they can be a stratum in a core-safe stratiﬁcation. We add further rules

f(x, y) e(y, y) (8) f(x, y1) f(x, y2) not e(y1, y2) d(y1, y2) (9)

Then (4) + (8) and (8) (9). Rule (9) is not core-safe in the set of all rules. A possible core-safe stratiﬁcation is S = {(4), (5), (6), (7)}, {(8)}, {(9)} . Full stratiﬁcations have been used to deﬁne the perfect core model, as the unique model obtained by conducting a (necessarily generating) chase that proceeds stratum by stratum. Core-safe stratiﬁcation is strictly more general, since the stricter condition i < j in case (3) implies that RΣk is empty for every stratum Σk, so that its rules are core-safe. A suitable chase procedure for rule sets that are core-safe stratiﬁed is given next. Deﬁnition 12. Let Σ be a normal rule set with coresafe stratiﬁcation S = Σ1, . . . , Σn , and let D be a database. The core-safe chase sequence for S and D is a list C0, C1, . . . , Cn such that C0 = D, and for every i {1, . . . , n}, Ci is the core of a restricted chase over Σi and Ci 1, provided that this core is ﬁnite. If such a sequence exists, Cn is called the core-safe chase of Σ and D w.r.t. S, and we denote it by S(D). Note that each stratum Σi satisﬁes the conditions of Proposition 3 since S is a core-safe stratiﬁcation. Hence, the required restricted chase exists and, since it is ﬁnite, has a unique core. In practice, one can ensure the necessary ﬁniteness by using acyclicity conditions that guarantee chase termination (Cuenca Grau et al. 2013). Example 6. Consider the stratiﬁcation S of Example 5 and the database D = {p(A), f(A, B)}. The core C1 of the restricted chase over the ﬁrst stratum was computed as U1 in Example 4. C2 then is simply C0 {e(B, B)}, and C3 = C2 is the resulting core-safe chase.

Example 6 admits other core-safe stratiﬁcations, leading to different core-safe chase sequences, but the ﬁnal result is the same for all of them. The main result of this section is that this is a general property of the core-safe chase, so that we obtain a unique model that provides a semantics of the underlying rule sets. Theorem 5. For rule set Σ and database D with core-safe stratiﬁcations S and S , S(D) is isomorphic to S (D). In the proof in our technical report (Ellmauthaler, Kr otzsch, and Mennicke 2021), we compare core-safe stratiﬁcations up to simple transformations. In particular, we consider splitting and merging of strata in a stratiﬁcation to show that the resulting core-safe chases are isomorphic. The rest of the proof is concerned with showing that all pairs of core-safe stratiﬁcations can be transformed into one another by only sequences of splitting and merging operations. Moreover, we ﬁnd that core-safe models generalise the perfect core models that are based on full stratiﬁcations: Proposition 4. If a rule set is fully stratiﬁed and has a ﬁnite perfect core model, then the latter is isomorphic to its coresafe model. Together with Theorem 5, the previous result justiﬁes that we call our semantics for core-safe stratiﬁed rule sets the perfect core semantics, since it generalises the eponymous semantics of Kr otzsch (2020) without giving up on the uniqueness of the model.

Discussion and Conclusion We have investigated how to answer normal Boolean conjunctive queries (BNCQs) on sets of existential rules, and we proposed the use of core models as a semantic reference point for this task. Arguably, cores are both intuitive and mathematically appealing for deﬁning a non-monotonic negation as failure semantics, since they satisfy existential rules but at the same time minimise the amount of inferences and avoid redundancies. Approaches of truth minimisation are the basis for most non-monotonic semantics, but are much less obvious when giving up the syntactic Herbrand semantics of logic programming. Nevertheless, our approach also has limitations. One of them is the difﬁculty of computing cores in practice, which we have addressed by identifying cases where this can be avoided. This leads to practical procedures, which in fact have already been implemented, though implementers are often not aware that the method is not sound for arbitrary negative queries or stratiﬁed negation in existential rules (Carral et al. 2019). Our core-safe chase still requires certain core constructions (after each stratum), which might not be practical. A possible approach to address this would be to investigate in more detail whether the intermediate non-core structures are truly problematic for evaluating the following rules and queries. A more general limitation is that cores only behave well if universal models are ﬁnite, whereas rule sets with inﬁnite models may have several distinct universal cores or no core that is a universal model at all (Carral et al. 2018). In this sense, the question which semantic reference point to choose in general remains open.

Acknowledgments This work is partly supported by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) in project number 389792660 (TRR 248, Center for Perspicuous Systems), by the Bundesministerium f ur Bildung und Forschung (BMBF, Federal Ministry of Education and Research) in the Center for Scalable Data Analytics and Artiﬁcial Intelligence (Sca DS.AI), and by the Center for Advancing Electronics Dresden (cfaed).

References Abiteboul, S.; Hull, R.; and Vianu, V. 1994. Foundations of Databases. Addison Wesley. Alviano, M.; Morak, M.; and Pieris, A. 2017. Stable Model Semantics for Tuple-Generating Dependencies Revisited. In (Sallinger, den Bussche, and Geerts 2017), 377 388. Baget, J.-F.; Garreau, F.; Mugnier, M.-L.; and Rocher, S. 2014. Revisiting Chase Termination for Existential Rules and their Extension to Nonmonotonic Negation. In Konieczny, S.; and Tompits, H., eds., Proceedings of the 15th International Workshop on Non-Monotonic Reasoning (NMR 2014), volume 1843-14-01 of INFSYS Research Report, 176 184. TU Wien. Baget, J.-F.; Lecl ere, M.; Mugnier, M.-L.; and Salvat, E. 2009. Extending Decidable Cases for Rules with Existential Variables. In Boutilier, C., ed., Proceedings of the 21st International Joint Conference on Artiﬁcial Intelligence (IJCAI 09), 677 682. IJCAI. Bauslaugh, B. L. 1995. Core-like properties of inﬁnite graphs and structures. Discrete Math., 138(1): 101 111. Bellomarini, L.; Sallinger, E.; and Gottlob, G. 2018. The Vadalog System: Datalog-based Reasoning for Knowledge Graphs. Proceedings of the VLDB Endowment, 11(9): 975 987. Benedikt, M.; Konstantinidis, G.; Mecca, G.; Motik, B.; Papotti, P.; Santoro, D.; and Tsamoura, E. 2017. Benchmarking the Chase. In (Sallinger, den Bussche, and Geerts 2017), 37 52. Cal ı, A.; Gottlob, G.; and Lukasiewicz, T. 2009. A general datalog-based framework for tractable query answering over ontologies. In Paredaens, J.; and Su, J., eds., Proceedings of the 28th Symposium on Principles of Database Systems (PODS 09), 77 86. ACM. Carral, D.; Dragoste, I.; Gonz alez, L.; Jacobs, C.; Kr otzsch, M.; and Urbani, J. 2019. VLog: A Rule Engine for Knowledge Graphs. In Ghidini et al., C., ed., Proceedings of the 18th International Semantic Web Conference (ISWC 19, Part II), volume 11779 of LNCS, 19 35. Springer. Carral, D.; Kr otzsch, M.; Marx, M.; Ozaki, A.; and Rudolph, S. 2018. Preserving Constraints with the Stable Chase. In Kimelfeld, B.; and Amsterdamer, Y., eds., Proceedings of the 21st International Conference on Database Theory (ICDT 18), volume 98 of LIPIcs, 12:1 12:19. Schloss Dagstuhl Leibniz-Zentrum fuer Informatik. Cuenca Grau, B.; Horrocks, I.; Kr otzsch, M.; Kupke, C.; Magka, D.; Motik, B.; and Wang, Z. 2013. Acyclicity Notions for Existential Rules and Their Application to Query

Answering in Ontologies. Journal of Artiﬁcial Intelligence Research, 47: 741 808. Deutsch, A.; Nash, A.; and Remmel, J. B. 2008. The Chase Revisited. In Lenzerini, M.; and Lembo, D., eds., Proceedings of the 27th Symposium on Principles of Database Systems (PODS 08), 149 158. ACM. Ebbinghaus, H.-D.; Flum, J.; and Thomas, W. 1994. Semantics of First-Order Languages, 27 57. New York, NY: Springer New York. ISBN 978-1-4757-2355-7. Ellmauthaler, S.; Kr otzsch, M.; and Mennicke, S. 2021. Answering Queries with Negation over Existential Rules. ar Xiv:2112.07376. Fagin, R.; Kolaitis, P. G.; Miller, R. J.; and Popa, L. 2005. Data exchange: semantics and query answering. Theoretical Computer Science, 336(1): 89 124. Fagin, R.; Kolaitis, P. G.; and Popa, L. 2005. Data exchange: Getting to the core. ACM Trans. Database Syst., 30(1): 174 210. Gottlob, G.; Hernich, A.; Kupke, C.; and Lukasiewicz, T. 2014. Stable Model Semantics for Guarded Existential Rules and Description Logics. In Baral, C.; De Giacomo, G.; and Eiter, T., eds., Proceedings of the 14th International Conference on Principles of Knowledge Representation and Reasoning (KR 14), 258 267. AAAI Press. Hell, P.; and Neˇsetˇril, J. 1992. The core of a graph. Discrete Math., 109: 117 126. Kr otzsch, M. 2020. Computing Cores for Existential Rules with the Standard Chase and ASP. In Calvanese, D.; Erdem, E.; and Thielscher, M., eds., Proceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning (KR 2020), 603 613. IJCAI. Kr otzsch, M.; and Rudolph, S. 2011. Extending Decidable Existential Rules by Joining Acyclicity and Guardedness. In Walsh, T., ed., Proceedings of the 22nd International Joint Conference on Artiﬁcial Intelligence (IJCAI 11), 963 968. AAAI Press/IJCAI. Magka, D.; Kr otzsch, M.; and Horrocks, I. 2013. Computing Stable Models for Nonmonotonic Existential Rules. In Rossi, F., ed., Proceedings of the 23rd International Joint Conference on Artiﬁcial Intelligence (IJCAI 13), 1031 1038. AAAI Press/IJCAI. Nenov, Y.; Piro, R.; Motik, B.; Horrocks, I.; Wu, Z.; and Banerjee, J. 2015. RDFox: A Highly-Scalable RDF Store. In Arenas, M.; Corcho, O.; Simperl, E.; Strohmaier, M.; d Aquin, M.; Srinivas, K.; Groth, P. T.; Dumontier, M.; Heﬂin, J.; Thirunarayan, K.; and Staab, S., eds., Proceedings of the 14th International Semantic Web Conference (ISWC 15), Part II, volume 9367 of LNCS, 3 20. Springer. Sallinger, E.; den Bussche, J. V.; and Geerts, F., eds. 2017. Proceedings of the 36th Symposium on Principles of Database Systems (PODS 17). ACM. Urbani, J.; Kr otzsch, M.; Jacobs, C. J. H.; Dragoste, I.; and Carral, D. 2018. Efﬁcient Model Construction for Horn Logic with VLog: System description. In Galmiche, D.; Schulz, S.; and Sebastiani, R., eds., Proceedings of the 9th International Joint Conference on Automated Reasoning (IJCAR 18), volume 10900 of LNCS, 680 688. Springer.