# datalogmtl_computational_complexity_and_expressive_power__1474f60f.pdf

Datalog MTL: Computational Complexity and Expressive Power

Przemysław A. Wał ega1,2 , Bernardo Cuenca Grau1 , Mark Kaminski1 and Egor V. Kostylev1

1Department of Computer Science, University of Oxford, UK 2Institute of Philosophy, University of Warsaw, Poland {przemyslaw.walega, bernardo.cuenca.grau, mark.kaminski, egor.kostylev}@cs.ox.ac.uk

We study the complexity and expressive power of Datalog MTL a knowledge representation language that extends Datalog with operators from metric temporal logic (MTL) and which has found applications in ontology-based data access and stream reasoning. We establish tight PSpace data complexity bounds and also show that Datalog MTL extended with negation on input predicates can express all queries in PSpace; this implies that MTL operators add signiﬁcant expressive power to Datalog. Furthermore, we provide tight combined complexity bounds for the forwardpropagating fragment of Datalog MTL, which was proposed in the context of stream reasoning, and show that it is possible to express all PSpace queries in the fragment extended with the falsum predicate.

1 Introduction Datalog MTL is an extension of Datalog with metric temporal operators that has been studied in the context of querying temporal data [Brandt et al., 2017; Brandt et al., 2018] and reasoning over data streams [Wał ega et al., 2019]. Datalog MTL allows for MTL expressions in rules such as [k1,k2]ϕ and x[k1,k2]ϕ, with k1 and k2 rational numbers, which hold at time t if ϕ holds at each and some, respectively, moment in the time interval [t k2, t k1]. Rules with such expressions can be used to naturally capture interesting events in temporal data. For instance, an analyst monitoring equipment data might be interested in ﬂagging risky overheating events that is, complex events where the temperature of a piece of equipment (1) has reached 60 Celsius, (2) has continuously exceeded 40 Celsius for a period of 10 seconds, or (3) has continuously exceeded 40 Celsius for two consecutive periods of 5 seconds with at most 2 seconds in-between. Such events are naturally captured by Datalog MTL rules (1) (3), where numbers represent seconds:

Overheat(x) x[0,0]Temp Above60(x), (1)

Overheat(x) [0,10]Temp Above40(x), (2)

Overheat(x) [0,5]Temp Above40(x)

x[5,7] [0,5]Temp Above40(x). (3)

Fact entailment in Datalog MTL has been recently studied by Brandt et al. (2018), who showed that it is EXPSPACEcomplete in combined and P-hard in data complexity, assuming binary encoding of numbers. Brandt et al. (2018) also studied the non-recursive fragment of Datalog MTL and showed that fact entailment becomes PSPACE-complete and in P, respectively. Wał ega et al. (2019) deﬁned forwardpropagating Datalog MTL, which disallows punctual intervals of the form [t, t] and imposes further syntactic restrictions to ensure that rule application cannot propagate derived information to the past; they also showed that fact entailment for this fragment is PSPACE-complete in data complexity assuming binary encoding of numbers and Pcomplete assuming unary encoding. The results in [Brandt et al., 2018] and [Wał ega et al., 2019] place the data complexity of Datalog MTL in-between PSPACE and EXPSPACE; to the best of our knowledge, however, the exact data complexity of Datalog MTL has remained an open problem until now. In this paper, we bridge this gap and show that the data complexity of fact entailment in Datalog MTL is in PSPACE (and hence PSPACE-complete). Similarly to [Brandt et al., 2018], we assume binary encoding of numbers, and consider rules equipped with the MTL operators xϱ, ϱ, and Sϱ, which refer to the past, and |ϱ, ϱ, and Uϱ, which refer to the future. To obtain the upper bound, we deﬁne in Section 3 two non-deterministic generalised Büchi automata relative to a Datalog MTL program and a temporal dataset, such that the languages of these automata are both non-empty if and only if the program and the dataset admit a common model. We then show that the emptyness condition for such automata can be checked using only polynomial space. In Section 4 we revisit fact entailment in the forwardpropagating fragment of Datalog MTL. Our results in Section 3 immediately extend the PSPACE upper bound in data complexity by Wał ega et al. (2019) to programs equipped with the falsum predicate and punctual intervals of the form [t, t]. Concerning combined complexity, we establish a novel EXPSPACE lower bound applicable to programs equipped with and an EXPTIME upper bound (matching the complexity of plain Datalog) for programs without . Finally, we discuss the expressive power of Datalog MTL in the sense of descriptive complexity [Immerman, 1999]. We show that, under standard minor modiﬁcations to the language [Dantsin et al., 2001], the forward-propagating frag-

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

M, t |= for all t Q M, t |= for all t Q M, t |= xϱA if M, s |= A for some s with t s ϱ M, t |= |ϱA if M, s |= A for some s with s t ϱ M, t |= ϱA if M, s |= A for all s with t s ϱ M, t |= ϱA if M, s |= A for all s with s t ϱ

M, t |= ASϱA if M, s |= A for some s with t s ϱ and M, r |= A for all r (s, t)

M, t |= AUϱA if M, s |= A for some s with s t ϱ and M, r |= A for all r (t, s)

Table 1: Semantics of Datalog MTL ground literals

ment equipped with captures (i.e., can express all queries in) PSPACE. Together with the data complexity results in Section 3, this implies that Datalog MTL also captures PSPACE.

2 Preliminaries

Intervals. We consider intervals over Q, denoted by x, y , where x, y Q { , + }, is [ or (, is ] or ), and

x, y = {t Q | x t y and t = x if is (, whereas t = y if is ) }.

The left and right endpoints of an interval ϱ are denoted by ϱ and ϱ+, respectively. An interval ϱ with ϱ 0 is positive and an interval of the form [t, t] is punctual. The length of an interval ϱ is denoted by |ϱ|. For non-empty intervals ϱ1, ϱ2, and a non-empty positive interval ϱ3 we consider operations ϱ1 ϱ2, ϱ1 ϱ2, ϱc 1, ϱ1 ϱ3, ϱ1 ϱ3, ϱ1 + ϱ3, and ϱ1 ϱ3, the result of each of which is an interval; the ﬁrst two are the usual union and intersection and the others are deﬁned in Table 2 (note that and can result in an empty interval, e.g., [0, 1] [0, 2] = [0, 1]). We assume that each t Q is represented by an integer numerator and a positive integer denominator, both encoded in binary.

Syntax of Datalog MTL. Assume a function-free ﬁrstorder vocabulary with constants and predicates with nonnegative integer arities. An atom is of the form P(τ) where P is a predicate and τ is a matching arity tuple of constants and variables. A literal is an expression from the following grammar, for α an atom and ϱ a non-empty positive interval:

A:=α | | | xϱA | |ϱA | ϱA | ϱA | ASϱA | AUϱA.

A (Datalog MTL) rule is an expression of the form:

B A1 An, (4)

where n 0, each Ai is a literal, and B is a literal not mentioning operators x, |, S, and U. A rule of the form (4) is safe if each variable in B occurs in some Ai. A (Datalog MTL) program is a ﬁnite set of safe rules [Brandt et al., 2018]. The greatest common divisor gcd(Π) of a program Π is the greatest rational number which divides all rational

numbers which are endpoints of intervals in Π to integer values (if Π has no numbers, we take gcd(Π) = 1 for deﬁniteness). An expression e (e.g., an atom, literal, or rule) is ground if it mentions no variables. An assignment ν of variables to constants is a grounding of e if eν is ground. A (temporal) fact is an expression of the form α@ϱ with α a ground atom and ϱ a non-empty interval; a generalised fact is of the form A@ϱ, where A is a ground literal and ϱ a non-empty interval. A dataset is a ﬁnite set of facts. For a program Π and a dataset D we denote with glit(Π, D) the set of ground literals consisting of , all groundings of literals in Π by constants in Π and D, and literals [0,+ )α, (0,+ )α, [0,+ )α, and (0,+ )α, for α an atom in D.

Semantics of Datalog MTL. An interpretation M speciﬁes, for each ground atom α and each time point t Q, whether α is true at t, in which case we write M, t |= α. This notion extends to ground literals different from atoms as in Table 1. Interpretation M is a model of a rule of the form (4) if, for each grounding ν and each t Q, we have M, t |= Bν whenever M, t |= Aiν for every i = 1, . . . , n. Interpretation M is a model of a program Π, written M |= Π, if M is a model of all rules in Π. Interpretation M is a model of a generalised fact A@ϱ if M, t |= A for all t ϱ. Interpretation M is a model of a possibly inﬁnite set E of generalised facts (e.g., a dataset), written M |= E, if M is a model of all the elements of E. Note that each set of facts has the unique minimal model (i.e., a model assigning true as little as possible). A program Π and a possibly inﬁnite set of generalised facts E are consistent if Π and E have a common model. A program Π and a set E entail a set of generalised facts E , written (Π, E) |= E , if each model of Π and E is a model of E . If E is a singleton {A@ϱ}, then we denote it simply by A@ϱ. Brandt et al. (2018) showed that fact entailment and consistency checking in Datalog MTL polynomially reduce to the complements of each other. Hence, in what follows we mostly concentrate on consistency and consider combined complexity of this problem when both program and dataset form the input, and data complexity when the program is ﬁxed.

Normal Form. Similarly to Brandt et al. (2018), we usually assume that programs are in a normal form where there is no nesting of metric operators and where the form of intervals is restricted. For convenience in the automata construction later on, however, we reﬁne their deﬁnition by further restricting the form of unbounded intervals. In particular, a program Π is in normal form if it contains only rules of the forms:

B α1 αn, B ϱα, B ASϱα, B ϱα, B AUϱα,

where each of α, α1, . . . , αn is an atom, A is or an atom, B is or an atom, and ϱ is a non-empty positive interval such that either it is bounded or it satisﬁes ϱ = 0.

Proposition 1. Each program Π can be transformed in polynomial time into a program Π in normal form such that (Π, D) |= α@ϱ if and only if (Π , D) |= α@ϱ for each dataset D and each fact α@ϱ over the vocabulary of Π.

In what follows we assume that all programs are normal.

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

operation left endpoint right endpoint left-open iff left endpoint is or right-open iff right endpoint is + or

ϱc 1 ϱ 1 ϱ+ 1 never never ϱ1 ϱ2 ϱ 1 ϱ+ 2 ϱ+ 1 ϱ 2 ϱ1 is left-open or ϱ2 is right-open ϱ1 is right-open or ϱ2 is left-open ϱ1 ϱ2 ϱ 1 ϱ 2 ϱ+ 1 ϱ+ 2 ϱ1 is left-open and ϱ2 is left-closed ϱ1 is right-open and ϱ2 is right-closed ϱ1 + ϱ2 ϱ 1 + ϱ 2 ϱ+ 1 + ϱ+ 2 ϱ1 or ϱ2 is left-open ϱ1 or ϱ2 is right-open ϱ1 ϱ2 ϱ 1 + ϱ+ 2 ϱ+ 1 + ϱ 2 ϱ1 is left-open and ϱ2 is right-closed ϱ1 is right-open and ϱ2 is left-closed

Table 2: Operations on intervals, where we assume that + x = and + + x = + for any value x (so + is not commutative)

Canonical Interpretations. Consistency and fact entailment can be checked using the canonical interpretation of a program and a dataset, as deﬁned next [Brandt et al., 2018]. The immediate consequence operator TΠ of a program Π applied to a set E of generalised facts is the minimal set TΠ(E) containing E and satisfying the following conditions, where Π is the set of (results of) all groundings of rules in Π, and E extends E with @ϱ for each non-empty interval ϱ: if B α1 αn is in Π , αi@ϱi E for i = 1, . . . , n, and ϱ = Tn i=1 ϱi is non-empty, then B@ϱ TΠ(E); if B ϱ2α is in Π , α@ϱ1 E, and ϱ = ϱ1 ϱ2 is non-empty, then B@ϱ TΠ(E); if B ϱ2α is in Π , α@ϱ1 E, and ϱ = ϱ1 ϱ2 is non-empty, then B@ϱ TΠ(E); if B A1Sϱ3A2 is in Π , Ai@ϱi E for i = 1, 2, and ϱ = ((ϱc 1 ϱ2)+ϱ3) ϱc 1 is non-empty, then B@ϱ TΠ(E); if B A1Uϱ3A2 is in Π , Ai@ϱi E for i = 1, 2, and ϱ = ((ϱc 1 ϱ2) ϱ3) ϱc 1 is non-empty, then B@ϱ TΠ(E); if α@ϱ1 E, α@ϱ2 E, and ϱ = ϱ1 ϱ2 is non-empty, then α@ϱ TΠ(E). The canonical interpretation Π(D) of Π and D is the minimal model of all facts in T ω1 Π (D), with ω1 the ﬁrst uncountable ordinal, which is deﬁned by application of TΠ as follows, for κ a successor ordinal and γ a limit ordinal:

T 0 Π(D) = D, T κ+1 Π (D) = TΠ(T κ Π(D)), T γ Π(D) = [

κ<γ T κ Π(D).

Brandt et al. (2018) showed that if a program Π and a dataset D are consistent, then Π(D) is their common model, and Π(D) is a model of Π and D if and only if @ϱ / T ω1 Π (D) for all ϱ; moreover, for every fact α@ϱ, (Π, D) |= α@ϱ if and only if Π(D) |= α@ϱ or @ϱ T ω1 Π (D) for some ϱ .

3 Data Complexity of Datalog MTL In this section we establish our main result, namely that checking consistency and fact entailment in Datalog MTL is in PSPACE in data complexity. In the remainder of this section, we ﬁx a program Π in normal form and a dataset D. In Section 3.1 we show that the canonical interpretation Π(D) is structured according to regularly distributed consecutive intervals such that all points within an interval satisfy the same ground atoms. We refer to such intervals as rulerintervals and to the set of their endpoints as the ruler. We also refer to any interpretation satisfying the same ground atoms on each ruler-interval as a ruler-interpretation.

In Section 3.2 we deﬁne two (inﬁnite) generalised Büchi automata and show that their languages are both non-empty if and only if Π and D are consistent. Each state of the automata is a window over an interval, which corresponds to the projection of a ruler-interpretation into this interval. Each symbol in the alphabet is a set of ground literals in glit(Π, D). The initial states of both automata represent a window capturing the input dataset. Transitions between windows W1 and W2 are such that W2 is the result of sliding W1 one step in the ruler. The sets of ﬁnal states ensure that the ruler-interpretation obtained from the facts in the states of an accepting run satisfy also the generalised facts in those states. In Section 3.3 we show that non-emptyness of these automata can be checked using only polynomial space. This is non-trivial since the initial state can be of exponential size in the size of the dataset, and the number of states is unbounded. We address the ﬁrst difﬁculty by guessing the initial state step-by-step. To address the second difﬁculty, we deﬁne equivalent automata with only exponentially many states, each of which is of polynomial size; these automata can be run on-the-ﬂy using only polynomial space.

3.1 Structure of the Canonical Interpretation We next deﬁne the notions of a ruler, a ruler-interval, and a ruler-interpretation for Π and D. Crucially, the time points in the ruler are regularly distributed and depend only on the time points in D and the greatest common divisor of Π. Deﬁnition 2. The ruler (for Π and D) is the set of time points of the form x+n gcd(Π) for x a rational in D (or 0 if D = ) and n an integer. A ruler-interval is either a punctual interval over a time point on the ruler or an interval (t1, t2) with t1, t2 consecutive time points on the ruler. A ruler-interval ϱ precedes or succeeds an interval ϱ with endpoints on the ruler if ϱ ϱ is empty, ϱ ϱ is a non-empty interval, and (ϱ )+ = ϱ or (ϱ ) = ϱ+, respectively. An interpretation M is a ruler-interpretation if, for each ground atom α, we have that M, t |= α implies M, t |= α for every time point t in the same ruler-interval as t. Example 3. A fragment of the ruler and the corresponding ruler-intervals for a program Πex with gcd(Πex) = 2.5 and a dataset Dex = {α@[0, 0], β@[4.2, 4.2]} is depicted below.

0.8 0 1.7 2.5 4.2 5 Ruler:

Ruler-intervals:

Observe that the ruler contains all multiples of 2.5 as well as all rationals of the form 4.2 + 2.5 n; moreover, (4.2, 5) succeeds [0, 4.2] but does not succeed [0, 4.2).

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

The following lemma shows that the canonical interpretation is a ruler-interpretation, and hence we can concentrate only on ruler-interpretations when checking for consistency.

Lemma 4. Interpretation Π(D) is a ruler-interpretation.

Proof sketch. It sufﬁces to show by induction on the construction steps of T ω1 Π (D) that, for all α@ϱ T ω1 Π (D), the endpoints ϱ and ϱ+ are on the ruler. The base step of the induction trivially holds since T 0 Π(D) = D. For the inductive step, we can show that if α@ϱ is added by any of the rules in the deﬁnition of the consequence operator TΠ (see Section 2), then endpoints ϱ and ϱ+ are of the required form.

3.2 Büchi Automata for Consistency Checking We start this section with the deﬁnition of windows, which play the role of states in the automata construction below.

Deﬁnition 5. Let ϱ be an interval with endpoints on the ruler. A window over ϱ is a set W of generalised facts A@ϱ with A glit(Π, D) and ruler-intervals ϱ ϱ for which there exists a ruler-interpretation M, called corresponding to W, such that for all ruler-intervals ϱ ϱ and A glit(Π, D) we have M |= A@ϱ if and only if A@ϱ W.

Intuitively, a window W over ϱ provides a full description of a ruler-interpretation projected into ϱ; in particular, if M is a ruler-interpretation corresponding to W, then M |= W.

Deﬁnition 6. A window W over an interval ϱ satisﬁes Π whenever, for each grounding B A1 An of a rule in Π and each ruler-interval ϱ ϱ, we have B@ϱ W if B is an atom and Ai@ϱ W for every i = 1, . . . , n, and Ai@ϱ / W for some i = 1, . . . , n if B = .

The following deﬁnitions about windows will also be useful. The length of a window over an interval ϱ is the number of ruler-intervals contained in ϱ (note that the length of the window is always a natural number, contrary to the length of ϱ). We denote by ℓmin the minimal length of a window over a closed interval ϱ with |ϱ| = 2z for z the maximal rational in Π. Note that, since z is a multiple of gcd(Π), the window over [x, x + 2z] has length ℓmin for each rational x on the ruler. A window W2 over an interval ϱ2 follows a window W1 over an interval ϱ1 if the following holds: W1 and W2 have the same length; ϱ1 ϱ2 = ϱ 1 (ϱ1 ϱ2) ϱ 2 for the ruler-intervals ϱ 1 and ϱ 2 preceding ϱ2 and succeeding ϱ1, respectively; A@ϱ W1 if and only if A@ϱ W2 for each rulerinterval ϱ ϱ1 ϱ2. Intuitively, W2 follows W1 over ϱ1 if W2 is obtained from W1 by deleting all generalised facts over the ﬁrst ruler-interval in ϱ1 and adding generalised facts over the ruler-interval succeeding ϱ1. The next lemma establishes a useful closure property of windows that follow one another.

Lemma 7. Let W1 and W2 be windows such that W2 follows W1. If W1 and W2 satisfy Π and are of length at least ℓmin, then W1 W2 is a window satisfying Π.

We are ready to deﬁne the two automata that we use for consistency checking. The automata are parametrised by a window W, which determines the initial state.

Deﬁnition 8. For W a window of length ℓsatisfying Π, let G W be the generalised Büchi automaton (Q, Σ, δ, W, F) with the following components: the states Q are all windows of length ℓsatisfying Π; the alphabet Σ is the powerset of glit(Π, D); the transition function δ is a partial function from Q Σ to Q such that δ(W1, σ) = W2 whenever W2 is followed by W1 and W2 \ W1 = {A@ϱ | A σ}; W is the initial state; accepting condition F (i.e., a set of sets of states) contains

{W Q | ϱα@ϱ W or α@ϱ / W } and

{W Q | αSϱβ@ϱ / W or β@ϱ W }

for each ϱα and each αSϱβ, respectively, in glit(Π, D) with unbounded ϱ, where ϱ is any ruler-interval. The automaton G W is the same as G W except that W2 follows W1 in the deﬁnition of δ, and and S are used instead of and U, respectively, in F. The automata accept an inﬁnite Σ-word σ1σ2 . . . if there is a sequence W1, W2, . . . of states, called an accepting run, such that W1 = W, Wi+1 = δ(Wi, σi) for each i 1, and the run has inﬁnitely many states belonging to each set in F.

Note that both automata in Deﬁnition 8 have inﬁnitely many states; in the next section, we show that they are equivalent to ﬁnite generalised Büchi automata. Also, observe that the automata are essentially deterministic, except that certain pairs of a state and alphabet symbol are lacking a transition. It would be useful to have a generalisation of Lemma 7 to an inﬁnite number of windows. Although such a generalisation is not true in general, we next show that it holds for windows along accepting runs of the automata. We deﬁne an inﬁnite window as in Deﬁnition 5 except that ϱ = ( , + ); all relevant notions transfer to inﬁnite windows.

Lemma 9. If W is a window of length at least ℓmin satisfying Π, then the union of all states in accepting runs of G W and of G W is an inﬁnite window satisfying Π.

Proof. Let W be the union of all states occurring in these accepting runs. Each state is a window satisfying Π, so W also satisﬁes Π. Let M be the minimal model of the set of all facts in W . For each A glit(Π, D) with no unbounded interval and for each ruler-interval ϱ we have M |= A@ϱ if and only if A@ϱ W ; otherwise we would have a state in the accepting run without a corresponding ruler-interpretation (i.e., a state that is not a window). Since accepting runs visit each set in F inﬁnitely often we can show that the above equivalence also holds for A with an unbounded interval. Thus, M corresponds to W , and W is an inﬁnite window.

We are ready to show that automata G W and G W can be used to check whether Π and D are consistent. Let ℓD be the length of a window over [ x z, x + z] for x the maximal absolute value of rationals in D and z the maximal rational in Π. Intuitively, [ x z, x + z] is big enough to contain all facts over bounded intervals from a dataset.

Theorem 10. Program Π and dataset D are consistent iff there is a window W of length ℓD satisfying Π such that the languages of both G W and G W are non-empty and W |= D.

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

Proof sketch. If Π and D are consistent, then they are satisﬁed by Π(D), which is a ruler-interpretation by Lemma 4. For x the maximal absolute value of rationals in D and z the maximal rational in Π, the forward direction of the theorem is witnessed by the window W over ϱ = [ x z, x + z] such that, for each ground literal A glit(Π, D) and ruler-interval ϱ ϱ, A@ϱ W if and only if Π(D) |= A@ϱ . Conversely, if W satisﬁes the conditions, then G W and G W have accepting runs. Let W be the union of all states in accepting runs of the automata, and M the minimal model of facts in W . By the proof of Lemma 9, W is a window satisfying Π, and M corresponds to W . Since W |= D we can show M |= D; so, M is a model of Π and D.

Although Theorem 10 suggests a consistency checking algorithm, windows of length ℓD may contain an exponential (in the size of a dataset) number of elements; moreover, the number of states in the automata is unbounded. In the following section we show how to overcome these difﬁculties.

3.3 Complexity of Consistency Checking We start by showing that we can decide whether a set of generalised facts is a window by guessing a polynomial representation of a corresponding interpretation. Lemma 11. Checking whether a ﬁnite set W of generalised facts is a window for program Π and dataset D is in NP in the size of (the representation of) D.

Proof sketch. If W is a window over ϱ, then it has a corresponding interpretation M (minimally) witnessing the generalised facts in W. Such M can be represented concisely as a set of facts such that if α@ϱ1 W then α@ϱ1 M, if ϱ2α@ϱ1 W then α@(ϱ1 ϱ2) M, and if ϱ2α@ϱ1 W then α@(ϱ1 + ϱ2) M. Moreover, for a generalised fact in W with S or U we need to guess at most two witnessing facts and add them to M. Importantly, for witnesses outside [ϱ z, ϱ+ + z], for z the maximal rational in Π, the exact ruler-intervals in which they hold are irrelevant and only their order on the timeline needs to be kept, so the representation of M can be guessed in NP. Checking whether M is an interpretation corresponding to W can be clearly done in P.

As a result, checking W |= D is in CONP for a window W, since we can guess a representation of a ruler-interpretation corresponding to W that does not satisfy D. Note also that checking whether a window satisﬁes Π can be done in P. A window W over ϱ is a preﬁx of W over ϱ if ϱ ϱ, the ﬁrst ruler-intervals contained in ϱ and ϱ are the same, and W coincides with W on ϱ ; W is a sufﬁx of W if the same holds when considering the last contained ruler-intervals. Lemma 12. Let W be a window over interval ϱ of length ℓ ℓmin satisfying Π. Then, the preﬁx W and the sufﬁx W of W of length ℓmin are windows satisfying Π such that the languages of G W and G W coincide, and the languages of G W and G W coincide.

Proof sketch. Words accepted by G W are clearly accepted by G W . For the converse, let w by accepted by G W and let W be the union of states in the run of G W on w. Then, as

in Lemma 9, we can show that the ruler-interpretation corresponding to W is a model of Π, and it is possible to split W into a sequence of windows of length ℓthat form an accepting run of G W on w. The case of W is symmetric.

We are ready to show our main result.

Theorem 13. Consistency checking and fact entailment in Datalog MTL are PSPACE-complete in data complexity.

Proof sketch. The lower bound is inherited from the forwardpropagating fragment [Wał ega et al., 2019] (see also Section 4). By Theorem 10, to check if Π and D are consistent we can guess a set W of generalised facts and verify that W is a window of length ℓD satisfying Π such that W |= D and the languages of G W and G W are non-empty. Although W may contain exponentially many elements, we can access it by guessing fragments of length ℓmin one-by-one. Then, by Lemma 7, it sufﬁces to check whether each such fragment is a window satisfying Π; this can be done in NP by Lemma 11, which also implies that W |= D is in CONP. For the non-emptiness checks, by Lemma 12 it sufﬁces to verify, for the preﬁx W and sufﬁx W of W of length ℓmin, whether the languages of G W and G W are non-empty. To overcome the problem with unbounded number of states, we obtain ﬁnite automata by merging states (i.e., windows) that are equivalent modulo a shift of the intervals over which the states are deﬁned. Each of these ﬁnite automata is equivalent to its inﬁnite version and has exponential (in the size of D) number of states. Hence we can check non-emptiness by the standard on-the-ﬂy PSPACE procedure.

4 Forward-Propagating Fragments

In this section we revisit the forward-propagating fragment of Datalog MTL proposed by Wał ega et al. (2019). In this fragment, rules are syntactically restricted to allow for derivation of facts into present and future time points, but precluding propagation towards past (in fact, we consider a slight extension of the language of Wał ega et al., since we allow for ).

Deﬁnition 14. A Datalog MTL rule of the form (4) is forward-propagating if literal B does not mention , and literals Ai do not mention |, , and U. A program is forwardpropagating if so is each of its rules.

As follows from Theorem 13, fact entailment (and hence consistency checking) for forward-propagating programs is in PSPACE in data complexity. Moreover, Wał ega et al. (2019) showed a matching lower bound for fact entailment that holds even if and punctual intervals are disallowed (note that programs without are always consistent).

Corollary 15. Fact entailment for forward-propagating programs is PSPACE-complete in data complexity and stays PSPACE-hard if and punctual intervals are disallowed.

Brandt et al. (2018) showed that fact entailment in the full Datalog MTL language is EXPSPACE-complete in combined complexity. The construction in their lower bound proof, however, does not apply to forward-propagating programs as it uses both past and future temporal operators. We next show that the problem for forward-propagating Datalog MTL

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

is also EXPSPACE-hard in combined complexity (and hence EXPSPACE-complete). Our proof, however, crucially relies on the use of in rule heads; in fact, we can show that fact entailment becomes EXPTIME-complete (and hence no harder than for plain Datalog) for the language without .

Theorem 16. Fact entailment for forward-propagating programs is EXPSPACE-complete in combined complexity and stays EXPSPACE-hard if punctual intervals are disallowed. The problem is EXPTIME-complete if is disallowed.

Proof sketch. The EXPSPACE lower bound is obtained by reduction of the halting problem for Turing machines using exponential space. Our construction is a modiﬁcation of the PSPACE-hardness construction by Wał ega et al. (2019), which reduces halting of polynomial-space bounded Turing machines to fact entailment for ﬁxed forward-propagating programs without in the heads. Since the program is no longer ﬁxed, we can use varying arity to encode binary representations of the tape position numbers, and use to derive inconsistencies that correspond to input rejection on doubly exponential steps of computation. The EXPTIME lower bound for programs without is a consequence of EXPTIME-hardness of fact entailment in plain Datalog; and the EXPTIME upper bound is obtained by analysis of the algorithm by Wał ega et al. (2019). Their algorithm for checking fact entailment derives facts only up to time t in the checked fact (given in binary). We can then show that the number of time points at which the algorithm derives facts is exponential in the value of t. In each of these time points at most exponentially many facts can be derived, and the derivation of all facts at a given time point is feasible in exponential time.

5 Expressive Power

We ﬁnally discuss the expressivity of (the adapted in an appropriate way) Datalog MTL in the sense of descriptive complexity [Immerman, 1999] and start with relevant deﬁnitions. A logical language is a set of sentences that evaluate on a ﬁnite set of ground atoms over input predicates to either true or false; as usual, we also assume that the input is ordered that is, the set always contains atoms first(c1), next(c1, c2), . . . , next(cn 1, cn), last(cn) for an enumeration c1, . . . , cn of all constants in the set. Such a language captures a complexity class C if evaluating every ﬁxed sentence in the language is in C, and for each problem in C there is a sentence in the language that is true if and only if its input encodes an instance of the problem. Under these deﬁnitions, Datalog MTL and its fragments are not logical languages, because atoms in their inputs have temporal labels and their programs do not evaluate to Boolean values. However, we can overcome these mismatches by adapting Datalog MTL to a logical language Datalog MTLll as follows: 1. each Ai in a rule (4) may also be an atom over an input predicate or the negation of such an atom; 2. B in a rule (4) may also be the special nullary atom Goal, representing the positive answer. Note that input predicates are not allowed in B and in complex literals in Ai. The semantics is extended to Datalog MTLll programs on ordered inputs (i.e., sets of

ground atoms over input predicates including the ordering atoms) in the standard way [Dantsin et al., 2001], assuming that the input atoms hold at time point 0. It is not difﬁcult to modify the PSPACE-completeness (i.e., both directions of Corollary 15) proof to show the following theorem. Theorem 17. Datalog MTLll and forward-propagating fragment of Datalog MTLll capture PSPACE over ordered inputs.

6 Related Work MTL is a prominent formalism for specifying and reasoning over complex events in real-time systems. MTL is equipped with two alternative semantics: pointwise and continuous, where the latter is adopted in this paper. Under the pointwise semantics, an interpretation is seen as a time-increasing sequence of pairs consisting of a time point and a set of propositional variables holding at that time point. In contrast, interpretations under the continuous semantics are functions mapping each element of the (dense) temporal domain to a set of propostional variables. Satisﬁability and model checking of unrestricted MTL formulas are undecidable under the continuous semantics [Alur and Henzinger, 1993], but become decidable (EXPSPACE-complete) if punctual intervals are disallowed [Alur et al., 1996]. Datalog MTL is a Horn fragment of MTL that extends the fundamental rule-based Datalog language [Abiteboul et al., 1995; Dantsin et al., 2001]. It was ﬁrst studied under the continuous semantics in [Brandt et al., 2017; Brandt et al., 2018]. The non-recursive and forward-propagating fragments were studied in [Brandt et al., 2018] and [Wał ega et al., 2019] respectively. Low complexity fragments under the alternative pointwise semantics were investigated in [Kikot et al., 2018]. Datalog MTL has been proposed as a suitable alternative to existing temporal languages for ontology-based data access [Artale et al., 2017]. It has also found applications in the area of stream reasoning [Wał ega et al., 2019], building on a long tradition of applications of MTL model checking to monitoring problems over data streams [Basin et al., 2018; Baldor and Niu, 2012; Thati and Ro su, 2005; Doherty et al., 2009; Niˇckovi c and Piterman, 2010; Ho et al., 2014].

7 Conclusions In this paper we have closed the existing gaps in the complexity of Datalog MTL and its forward-propagating fragment under the continuous semantics. We see several avenues for future work. First, it would be interesting to investigate fragments of Datalog MTL with tractable fact entailment an important requirement for dataintensive applications. Although Brandt et al. (2018) showed that reasoning in the non-recursive fragment is in AC0, recursion remains an important feature in many applications. Second, it would be interesting to investigate the complexity of Datalog MTL under the alternative point-based semantics.

Acknowledgments Research supported by the SIRIUS Centre for Scalable Data Access, the EPSRC projects DBOnto, Ma SI3, and ED3, the NCN grant 2016/23/N/HS1/02168, and the Foundation for Polish Science (FNP).

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)

References [Abiteboul et al., 1995] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. [Alur and Henzinger, 1993] Rajeev Alur and Thomas A. Henzinger. Real-time logics: Complexity and expressiveness. Inf. Comput., 104(1):35 77, 1993. [Alur et al., 1996] Rajeev Alur, Tomás Feder, and Thomas A Henzinger. The beneﬁts of relaxing punctuality. J. ACM, 43(1):116 146, 1996. [Artale et al., 2017] Alessandro Artale, Roman Kontchakov, Alisa Kovtunova, Vladislav Ryzhikov, Frank Wolter, and Michael Zakharyaschev. Ontology-mediated query answering over temporal data: A survey (invited talk). In LIPIcs-Leibniz International Proceedings in Informatics, volume 90, 2017. [Baldor and Niu, 2012] Kevin Baldor and Jianwei Niu. Monitoring dense-time, continuous-semantics, metric temporal logic. In RV, pages 245 259, 2012. [Basin et al., 2018] David Basin, Felix Klaedtke, and Eugen Z alinescu. Algorithms for monitoring real-time properties. Acta Inform., 55(4):309 338, 2018. [Brandt et al., 2017] Sebastian Brandt, R Kontchakov, V Ryzhikov, G Xiao, and M Zakharyaschev. Ontologybased data access with a Horn fragment of metric temporal logic. In AAAI, pages 1070 1076. AAAI Press, 2017. [Brandt et al., 2018] Sebastian Brandt, Elem Güzel Kalaycı, Vladislav Ryzhikov, Guohui Xiao, and Michael Zakharyaschev. Querying log data with metric temporal logic. J. Artif. Intell. Res., 62:829 877, 2018. [Dantsin et al., 2001] Evgeny Dantsin, Thomas Eiter, Georg Gottlob, and Andrei Voronkov. Complexity and expressive power of logic programming. ACM Comput. Surv., 33(3):374 425, 2001. [Doherty et al., 2009] Patrick Doherty, Jonas Kvarnström, and Fredrik Heintz. A temporal logic-based planning and execution monitoring framework for unmanned aircraft systems. Auton. Agents Multi-Agent Syst., 19(3):332 377, 2009. [Ho et al., 2014] Hsi-Ming Ho, Joël Ouaknine, and James Worrell. Online monitoring of metric temporal logic. In RV, pages 178 192, 2014. [Immerman, 1999] Neil Immerman. Descriptive Complexity. Springer, 1999. [Kikot et al., 2018] Stanislav Kikot, Vladislav Ryzhikov, Przemysław Andrzej Wał ega, and Michael Zakharyaschev. On the data complexity of ontology-mediated queries with MTL operators over timed words. In DL, 2018. [Niˇckovi c and Piterman, 2010] Dejan Niˇckovi c and Nir Piterman. From MTL to deterministic timed automata. In FORMATS, pages 152 167, 2010. [Thati and Ro su, 2005] Prasanna Thati and Grigore Ro su. Monitoring algorithms for metric temporal logic speciﬁcations. Electron. Notes Theor. Comput. Sci., 113:145 162, 2005.

[Wał ega et al., 2019] Przemysław Andrzej Wał ega, Bernardo Cuenca Grau, and Mark Kaminski. Reasoning over streaming data in metric temporal datalog. In AAAI, pages 1941 1948, 2019.

Proceedings of the Twenty-Eighth International Joint Conference on Artiﬁcial Intelligence (IJCAI-19)